Skip to content

Draft: Filter projects by date created for the registry repair worker

Adie (she/her) requested to merge 390842-scope-repair-worker-dates into master

What does this MR do and why?

In this MR, we add an additional where clause to our query that fetches projects that are still pending repair analysis.

Previously, we only do a left join from projects to container_registry_data_repair_details. But now as the container_registry_data_repair_details table gets filled up with rows, our query is getting bigger and slower. In this MR, we aim to scope the data repair analysis per year so:

  1. Our query can run faster with a smaller scope
  2. We can track progress of our data repair detail worker per year

This MR sets the scope for the year 2022. Currently, we set the dates as constants in the code. An initial thought was to add this as an application setting, but since this is a one-off repair thing, it might be too much to add a migration for the application setting just for the dates.

Database Analysis:

Before:

We do two queries, one to get the remaining work count and another one to fetch the next project:

Project.pending_data_repair_analysis.limit(max_running_jobs + 1).count
Raw SQL ```sql SELECT COUNT(*) FROM ( SELECT 1 AS one FROM "projects" LEFT OUTER JOIN "container_registry_data_repair_details" ON "container_registry_data_repair_details"."project_id" = "projects"."id" WHERE "container_registry_data_repair_details"."project_id" IS NULL ORDER BY "projects"."id" DESC LIMIT 5) subquery_for_count ```
Details and Visualization

https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/20063/commands/67255

Project.pending_data_repair_analysis.limit(max_running_jobs * 2).sample
SELECT
    "projects".*
FROM
    "projects"
    LEFT OUTER JOIN "container_registry_data_repair_details" ON "container_registry_data_repair_details"."project_id" = "projects"."id"
WHERE
    "container_registry_data_repair_details"."project_id" IS NULL
ORDER BY
    "projects"."id" DESC
LIMIT 8

After:

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #390842

Edited by Adie (she/her)

Merge request reports