Move a field check into the query instead of the loop body
What does this MR do and why?
The with_auto_merge_enabled scope on MergeRequest adds state_id and merge_when_pipeline_succeeds to the merge requests query. This is nontrivially more efficient than loading all merge requests, opened or closed, and checking the auto_merge_enabled? attribute method.
I've copied the following query analysis directly out of !167095 (merged) where we first proposed making this change, as part of a larger MR.
Why didn't we just leave these changes in that other MR?
This is a straightforward change we can ship immediately, with no operational considerations, right now. It is not completely clear to me how much benefit there is in doing this, but academically speaking it will be a benefit of some amount. We should still continue with !167095 (merged) to leverage our read-replicas, but there is no reason to not ship this specific improvement first.
Query Changes!
By moving auto_merge_enabled? out of the loop and adding logic to the query, we change the query sent to the database and so change the execution plan:
SELECT "merge_requests".* FROM "merge_requests"
WHERE "merge_requests"."source_project_id" = 278964 AND "merge_requests"."source_branch" = 'set-auto-merge-process-worker-to-sticky' AND (
EXISTS (
SELECT 1 FROM "merge_request_diffs" INNER JOIN "merge_request_diff_commits" ON "merge_request_diff_commits"."merge_request_diff_id" = "merge_request_diffs"."id"
WHERE (merge_requests.latest_merge_request_diff_id = merge_request_diffs.id) AND "merge_request_diff_commits"."sha" = '604b2722cfa685354d03ce43aca49f77c3cfb37c'
)
)
+ AND "merge_requests"."state_id" = 1
+ AND "merge_requests"."merge_when_pipeline_succeeds" = TRUE
Before: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/32112/commands/99182
After: https://console.postgres.ai/gitlab/gitlab-production-main/sessions/32112/commands/99183
My read of this is that it's good. My understanding of these plans is that the conceptual design of the query doesn't really change; the difference is that adding state = opened to the query changes our base merge_requests query plan to use the smaller partial index idx_merge_requests_on_source_project_and_branch_state_opened instead of the current full-table index index_merge_requests_on_source_project_id_and_source_branch.
Since these are both index scans, I would imagine the scan performance doesn't change very much, but the net result is that we ultimately return a smaller subset of records to join the rest of the query onto. That seems like a win to me.
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
How to set up and validate locally
No behavioral changes!