Geo - Improve the query performance to find unsynced job artifacts
What does this MR do?
Add a missing index to artifact_id
in the job_artifact_registry
table.
- Query
Geo::JobArtifactRegistryFinder.new.find_unsynced_job_artifacts(batch_size: 1000)
- Before
=> EXPLAIN for: SELECT "gitlab_secondary"."ci_job_artifacts".* FROM "gitlab_secondary"."ci_job_artifacts" LEFT OUTER JOIN job_artifact_registry
ON job_artifact_registry.artifact_id = gitlab_secondary.ci_job_artifacts.id WHERE ("gitlab_secondary"."ci_job_artifacts"."file_store" = 1 OR "gitlab_secondary"."ci_job_artifacts"."file_store" IS NULL) AND "job_artifact_registry"."artifact_id" IS NULL AND (1=1) LIMIT 1000
QUERY PLAN
----------------------------------------------------------------------------------------------
Limit (cost=100.00..17524.61 rows=3 width=116)
-> Nested Loop Anti Join (cost=100.00..17524.61 rows=3 width=116)
Join Filter: (job_artifact_registry.artifact_id = ci_job_artifacts.id)
-> Foreign Scan on ci_job_artifacts (cost=100.00..117.43 rows=6 width=116)
-> Materialize (cost=0.00..17277.52 rows=1768 width=4)
-> Seq Scan on job_artifact_registry (cost=0.00..17268.68 rows=1768 width=4)
- After
=> EXPLAIN for: SELECT "gitlab_secondary"."ci_job_artifacts".* FROM "gitlab_secondary"."ci_job_artifacts" LEFT OUTER JOIN job_artifact_registry
ON job_artifact_registry.artifact_id = gitlab_secondary.ci_job_artifacts.id WHERE ("gitlab_secondary"."ci_job_artifacts"."file_store" = 1 OR "gitlab_secondary"."ci_job_artifacts"."file_store" IS NULL) AND "job_artifact_registry"."artifact_id" IS NULL AND (1=1) LIMIT 1000
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=100.28..135.87 rows=3 width=116)
-> Nested Loop Anti Join (cost=100.28..135.87 rows=3 width=116)
-> Foreign Scan on ci_job_artifacts (cost=100.00..117.43 rows=6 width=116)
-> Index Only Scan using index_job_artifact_registry_on_artifact_id on job_artifact_registry (cost=0.28..3.63 rows=1 width=4)
Index Cond: (artifact_id = ci_job_artifacts.id)
Does this MR meet the acceptance criteria?
-
Changelog entry added, if necessary -
Documentation created/updated -
API support added -
Tests added for this feature/bug - Review
-
Has been reviewed by UX -
Has been reviewed by Frontend -
Has been reviewed by Backend -
Has been reviewed by Database
-
-
EE specific content should be in the top level /ee
folder -
Conform by the merge request performance guides -
Conform by the style guides -
Squashed related commits together -
Internationalization required/considered -
If paid feature, have we considered GitLab.com plan and how it works for groups and is there a design for promoting it to users who aren't on the correct plan -
End-to-end tests pass ( package-qa
manual pipeline job)
What are the relevant issue numbers?
Related to https://gitlab.com/gitlab-org/gitlab-ee/issues/5513
Edited by Nick Thomas