Use LATERAL join for commit_shas_from_metadata

What does this MR do and why?

Forces a nested-loop / primary-key lookup for MergeRequestDiffCommit.commit_shas_from_metadata by joining merge_request_commits_metadata with a fenced LEFT JOIN LATERAL (... LIMIT 1) instead of a plain LEFT JOIN.

During the commits-metadata backfill this query reads SHAs from both the old merge_request_diff_commits.sha and the new merge_request_commits_metadata.sha (via COALESCE). The planner badly misestimates the number of rows for a given merge_request_diff_id, so it picks a hash join that scans every metadata row for the project. The LATERAL + LIMIT 1 fence blocks subquery pull-up and forces a per-row lookup on merge_request_commits_metadata_pkey (id, project_id).

The change is wrapped in the commit_shas_metadata_lateral_join derisk flag (actor: Feature.current_request). Both paths return identical data — the flag only swaps the join strategy.

References

Database

Query: MergeRequestDiffCommit.commit_shas_from_metadata

How to set up and validate locally

  1. Feature.enable(:commit_shas_metadata_lateral_join)
  2. Call MergeRequestDiffCommit.for_merge_request_diff(diff_id, project_id).commit_shas_from_metadata(project_id: project_id, limit: nil).to_sql
  3. Confirm query is using lateral join.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Krasimir Angelov

Merge request reports

Loading