Reset skipped container repositories
🔍 What does this MR do and why?
We are in the process of migrating/importing all container repositories to the new container registry. To start, we only imported container repositories that had less than 100 tags. Container repositories with more than 100 were skipped. In gitlab-com/gl-infra/production#7022 (closed) we have increased the tag limit, so we now would like to go back and "reset" the container repositories that were previously skipped so they can qualify for import.
This MR introduces a background migration that will change all container repositories that have a migration state of import_skipped
with the reason being too_many_tags
(value 2 in an enum) back to a migration state of default
and removing the skipped reason.
🐘 Database
I have opted to use the older style of background migration over the new batched background migration because this migration will be much faster with the ability to supply a filtered relation in the initial post-deployment migration. The batched background migrations do not yet easily support this functionality.
Post-deployment migration:
EachBatch
for scheduling jobs:
Initial record
select id from container_repositories where migration_state = 'import_skipped' and migration_skipped_reason = 2 order by id asc limit 1;
Explain 63ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35497
Max of next batch
select id from container_repositories where migration_state = 'import_skipped' and migration_skipped_reason = 2 and id >= 1 order by id asc limit 1 offset 10000;
Explain 1.3ms
: https://postgres.ai/console/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35498
Batch calculations
There are ~27700 records matching migration_state = 'import_skipped' and migration_skipped_reason = 2
.
Batch size of 10_000, 3 jobs will be queued, total migration time of 6 minutes.
Background Migration
Min query for EachBatch
select id from container_repositories where migration_state = 'import_skipped' and migration_skipped_reason = 2 and id >= 1 and id < 24263 order by id asc limit 1;
Explain 171ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35515
Max of next batch
select id from container_repositories where migration_state = 'import_skipped' and migration_skipped_reason = 2 and id >= 1 and id < 24263 order by id asc limit 1 offset 100;
- Explain 1st iteration
204ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35504 - Explain next iteration
1ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35505
We will have 100 iterations, but it is likely only the 1st or very few will be cold-cache queries, so we can expect most iterations to be as fast as the next iteration
above.
UPDATE query
UPDATE container_repositories SET migration_state = 'default', migration_skipped_reason = NULL where migration_state = 'import_skipped' and migration_skipped_reason = 2 and id >= 2868 and id < 5939;
Explain 82.3ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35509
We will have 100 UPDATES in each job
Job Timing calculations
Each job has 10_000 records
Each sub-batch processes 100 records.
This means there will be 100 update queries.
Total time for EachBatch Queries
100 * 204 = 20400ms => 20.4s # note that most of these queries should be much lower than 204ms
Total time for Updates
100 * 82.3 = 8230ms => 8.23s
Total job time = .083 + 20.4 + 8.23 = ~29s
This timing should give plenty of room for queries that might take longer to still fit within the 2 minute target time for the job.
A note on additional indexes
We could add a btree index on (migration_state, migration_skipped_at, id)
to improve the timing on essentially all of the above queries by allowing for an Index Only Scan
rather than the various index scans used throughout the existing query plans. For example, here is the query that took 204ms above with such an index 3.7ms
: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10023/commands/35519
Given the existing times are within our 1s
timing guidelines for background migrations, I don't think it is necessary to add the temporary index for this migration.
📷 Screenshots or screen recordings
N/A
How to set up and validate locally
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related: #359255 (closed)