Geo - Optimize the query to return reverifiable projects on Geo primary node
What does this MR do?
Use a CTE to speed up the query that finds reverifiable projects on the Geo primary node. The CTE applies the filter and the LIMIT first, and the outer query then sorts only the rows the CTE returns. The current implementation of this query appears to be the slowest Geo query in staging and hits the statement timeout; see https://sentry.gitlab.net/gitlab/staginggitlabcom/issues/1736501/?query=is%3Aunresolved.
SQL queries
- Before
  SELECT projects.id
  FROM projects
  INNER JOIN project_repository_states
    ON project_repository_states.project_id = projects.id
  WHERE project_repository_states.repository_verification_checksum IS NOT NULL
    AND project_repository_states.last_repository_verification_failure IS NULL
    AND (project_repository_states.last_repository_verification_ran_at IS NULL
         OR project_repository_states.last_repository_verification_ran_at <= '2020-07-30 18:27:40.847082')
    AND projects.repository_storage = 'nfs-file07'
  ORDER BY projects.last_repository_updated_at ASC NULLS LAST
  LIMIT 1000;
- Query plan: https://explain.depesz.com/s/dzk3
- After
  WITH "reverifiable_repositories" AS (
    SELECT "projects"."id", "projects"."last_repository_updated_at"
    FROM "projects"
    INNER JOIN "project_repository_states"
      ON "projects"."id" = "project_repository_states"."project_id"
    WHERE "project_repository_states"."repository_verification_checksum" IS NOT NULL
      AND "project_repository_states"."last_repository_verification_failure" IS NULL
      AND ("project_repository_states"."last_repository_verification_ran_at" IS NULL
           OR "project_repository_states"."last_repository_verification_ran_at" <= '2020-07-30 18:27:40.847082')
      AND "projects"."repository_storage" = 'nfs-file07'
    LIMIT 1000
  )
  SELECT "id"
  FROM "reverifiable_repositories" AS "projects"
  ORDER BY projects.last_repository_updated_at ASC NULLS LAST;
- Query plan: https://explain.depesz.com/s/vh9W
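The difference between the two query shapes can be tried out locally. The following is a minimal sketch, not the code in this MR: it assumes an in-memory SQLite database instead of PostgreSQL, a reduced two-table schema with made-up rows, and hypothetical helper names (`build_demo_db`, `reverifiable_ids`). Only the table names, column names, and filter conditions are taken from the queries above. It illustrates that the "After" shape sorts only the rows selected by the CTE, so the batch it returns is ordered but is not guaranteed to be the least recently updated projects overall.

```python
import sqlite3

# Filter and join copied from the MR's queries; the cutoff is a bound parameter.
FILTER = """
    project_repository_states.repository_verification_checksum IS NOT NULL
    AND project_repository_states.last_repository_verification_failure IS NULL
    AND (project_repository_states.last_repository_verification_ran_at IS NULL
         OR project_repository_states.last_repository_verification_ran_at <= ?)
    AND projects.repository_storage = 'nfs-file07'
"""
JOIN = """
    FROM projects
    INNER JOIN project_repository_states
        ON project_repository_states.project_id = projects.id
"""
# "x IS NULL, x ASC" emulates NULLS LAST on SQLite versions that lack it.
ORDER = """
    ORDER BY projects.last_repository_updated_at IS NULL,
             projects.last_repository_updated_at ASC
"""

# "Before" shape: sort every matching row, then keep the first N.
BEFORE_SQL = f"SELECT projects.id {JOIN} WHERE {FILTER} {ORDER} LIMIT ?"

# "After" shape: keep N matching rows inside the CTE, then sort only those.
AFTER_SQL = f"""
    WITH reverifiable_repositories AS (
        SELECT projects.id, projects.last_repository_updated_at
        {JOIN} WHERE {FILTER}
        LIMIT ?
    )
    SELECT id FROM reverifiable_repositories AS projects {ORDER}
"""


def build_demo_db():
    """Create a toy database (assumed schema, invented rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE projects (
            id INTEGER PRIMARY KEY,
            repository_storage TEXT,
            last_repository_updated_at TEXT
        );
        CREATE TABLE project_repository_states (
            project_id INTEGER,
            repository_verification_checksum TEXT,
            last_repository_verification_failure TEXT,
            last_repository_verification_ran_at TEXT
        );
    """)
    # Projects 1..6 are eligible; a higher id means an older update time.
    for i in range(1, 7):
        conn.execute(
            "INSERT INTO projects VALUES (?, 'nfs-file07', ?)",
            (i, f"2020-07-{30 - i:02d} 00:00:00"),
        )
        conn.execute(
            "INSERT INTO project_repository_states VALUES (?, 'abc', NULL, NULL)",
            (i,),
        )
    # Project 7 has a recorded verification failure, so the filter excludes it.
    conn.execute("INSERT INTO projects VALUES (7, 'nfs-file07', '2020-07-01 00:00:00')")
    conn.execute("INSERT INTO project_repository_states VALUES (7, 'abc', 'boom', NULL)")
    return conn


def reverifiable_ids(conn, sql, cutoff, limit):
    """Run one of the two query shapes and return the project ids in order."""
    return [row[0] for row in conn.execute(sql, (cutoff, limit))]


if __name__ == "__main__":
    conn = build_demo_db()
    cutoff = "2020-07-30 18:27:40.847082"
    print("before:", reverifiable_ids(conn, BEFORE_SQL, cutoff, 3))
    print("after: ", reverifiable_ids(conn, AFTER_SQL, cutoff, 3))
```

In this sketch the "Before" shape always returns the three oldest eligible projects, while the "After" shape returns whichever three eligible rows the engine produces first, sorted among themselves. Whether that trade-off is acceptable depends on how the caller consumes the batch, which this description does not spell out.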
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team
Related issue
Edited by Valery Sizov