Use LATERAL join for commit_shas_from_metadata
What does this MR do and why?
Forces a nested-loop / primary-key lookup for MergeRequestDiffCommit.commit_shas_from_metadata
by joining merge_request_commits_metadata with a fenced LEFT JOIN LATERAL (... LIMIT 1)
instead of a plain LEFT JOIN.
During the commits-metadata backfill this query reads SHAs from both the old merge_request_diff_commits.sha and the new merge_request_commits_metadata.sha (via COALESCE). The planner badly misestimates the number of rows for a given
merge_request_diff_id, so it picks a hash join that scans every metadata row for the project. The LATERAL + LIMIT 1 fence blocks
subquery pull-up and forces a per-row lookup on merge_request_commits_metadata_pkey (id, project_id).
The change is wrapped in the commit_shas_metadata_lateral_join derisk flag (actor: Feature.current_request). Both paths return identical data — the flag only swaps the join strategy.
References
- https://gitlab.com/gitlab-com/gl-infra/capacity-planning-trackers/gitlab-com/-/work_items/2925#note_3420269544
- FF rollout issue - http://gitlab.com/gitlab-org/gitlab/-/work_items/602010
Database
Query: MergeRequestDiffCommit.commit_shas_from_metadata
- Before (plain LEFT JOIN — Hash Left Join, ~13.3s, scans 63,517 metadata rows): https://console.postgres.ai/gitlab/gitlab-production-main/sessions/52182/commands/153696
- After (LATERAL + LIMIT 1 — Nested Loop, per-row PK probe): https://console.postgres.ai/gitlab/gitlab-production-main/sessions/52182/commands/153697
How to set up and validate locally
Feature.enable(:commit_shas_metadata_lateral_join)- Call
MergeRequestDiffCommit.for_merge_request_diff(diff_id, project_id).commit_shas_from_metadata(project_id: project_id, limit: nil).to_sql - Confirm query is using lateral join.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.