Insert MRs by head_commit_sha separately
Issue: Follow-up from "Link fast-forward MRs to deploy... (#443482 - closed)
What does this MR do and why?
This simplifies the query introduced in Link fast-forward MRs to deployment (!145211 - merged) so that merge requests by_merge_commit_sha
and by_head_commit_sha
are inserted separately instead of as a UNION query to the deployment_merge_requests
table.
The by_head_commit_sha
has been tested in Database Labs, with the required indexes added. See the Query Plans section below.
Further Details
In Link fast-forward MRs to deployment (!145211 - merged), we updated the query for the MRs that will be connected to a deployment. Instead of using only by_merge_commit_sha
we:
- introduced a new
by_head_commit_sha
scope: - introduced
by_merge_commit_sha_or_head_commit_sha
which is a UNION of the 2 scopes - then used this scope for inserting new
deployment_merge_requests
records (see https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/deployments/link_merge_requests_service.rb#L84)
The UNION query looked rather complex, but the query plan suggested that it should be okay. However, we have now seen an increase of PG::QueryCanceled: ERROR: canceling statement due to statement timeout
errors in the Deployments::LinkMergeRequestWorker
(see sentry event logs).
This MR:
-
updates the logic so that merge requests
by_merge_commit_sha
andby_head_commit_sha
are inserted separately instead of as a UNION query to thedeployment_merge_requests
table.We have also changed the "by_head_commit_sha" query so that instead of using a JOIN, which we found impossible to create an index for, it uses a nested SELECT statement which proved to perform well with the right indexes (see comments here and here).
-
makes sure querying by
head_commit_sha
is only done when the project's merge_method isfast_forward_merge
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
See: !145211 (merged)
How to set up and validate locally
See: !145211 (merged)
Query Plans
Initial UNION query
This was from the initial fix Link fast-forward MRs to deployment (!145211 - merged)
Separated queries
by_merge_commit_sha
- Database Labs EXPLAIN link
This was the original query before the fix Link fast-forward MRs to deployment (!145211 - merged) was made. We are not looking to change this, but this is just here for reference.
by_head_commit_sha
- Database Labs EXPLAIN link
This is the additional query that is used when a project has merge_method = fast_forward_merge
.
Index details
-
index_merge_requests_for_latest_diffs_with_state_merged
- creation time on DB labs:
~10 minutes
- index size on DB labs:
4600 MB
- creation time on DB labs:
-
index_on_merge_request_diffs_head_commit_sha
- creation time on DB labs:
~50 minutes
- index size on DB labs:
36 GB
- creation time on DB labs:
These indexes will be created asynchronously. Since the part of code that uses the query is behind a globally-disabled feature flag, I thought this would be okay to add the index in the same MR.
Please refer to this thread for details on how we settled on this query and its indexes.
I've also verified the async index creation locally:
In postgres_async_indexes
after running bundle exec rails db:migrate
In the database after running bundle exec rails gitlab:db:execute_async_index_operations:all