Reschedule background migration to copy projects.container_registry_enabled to project_features.container_registry_access_level

Reuben Pereira requested to merge 18792-reschedule-background-migration into master

What does this MR do?

This MR reschedules the background migration added in !55327 (merged). The previous background migration missed 10 jobs (!55327 (comment 543195183)) due to exclusive lease contention, which left about 480,000 rows unmigrated. Instead of cleaning up all those rows in a cleanup migration, which could take much longer than 10 minutes, we're rescheduling the background migration. This time, the jobs are tracked by passing the track_jobs: true parameter to queue_background_migration_jobs_by_range_at_intervals.

The track_jobs parameter ensures that jobs are tracked in the background_migration_jobs table. In the cleanup migration, we can then check the rows in background_migration_jobs and run any jobs that didn't complete, instead of having to go through the entire projects table again.
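The idea behind job tracking can be illustrated with a minimal in-memory sketch. This is not GitLab's implementation: JobTracker, TrackedJob, the :pending and :succeeded status names, and the id ranges are all hypothetical stand-ins for the background_migration_jobs table and its rows.

```ruby
# Hypothetical in-memory stand-in for one row of background_migration_jobs.
TrackedJob = Struct.new(:class_name, :arguments, :status)

# Hypothetical tracker; the real table lives in the database.
class JobTracker
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  # Record one pending row per scheduled id range.
  def schedule(class_name, id_ranges)
    id_ranges.each do |start_id, end_id|
      @jobs << TrackedJob.new(class_name, [start_id, end_id], :pending)
    end
  end

  # A job marks its own row as succeeded when it finishes.
  def mark_succeeded(class_name, arguments)
    job = @jobs.find { |j| j.class_name == class_name && j.arguments == arguments }
    job.status = :succeeded if job
  end

  # The cleanup step only needs the rows that never reached :succeeded.
  def pending(class_name)
    @jobs.select { |j| j.class_name == class_name && j.status == :pending }
  end
end

tracker = JobTracker.new
job_class = 'MoveContainerRegistryEnabledToProjectFeature'
tracker.schedule(job_class, [[1, 30_000], [30_001, 60_000], [60_001, 90_000]])

# Simulate two jobs finishing and one being lost (for example, to lease contention).
tracker.mark_succeeded(job_class, [1, 30_000])
tracker.mark_succeeded(job_class, [60_001, 90_000])

# Only the lost id range is left for the cleanup step to re-run.
leftover = tracker.pending(job_class).map(&:arguments)
puts leftover.inspect
```

With tracking in place, the cleanup work is proportional to the number of lost jobs rather than to the size of the whole table.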

I'm not sure why lease contention occurred last time, since most of the jobs completed in about a second. The maximum duration of a job was 10 seconds.

This time, I've reduced the batch size from 50_000 to 30_000.

Background migration estimates

projects table has 18,189,661 rows

All rows to be migrated.

30,000 rows per sidekiq job

18,189,661 / 21,000 = 867 jobs

300 rows per batch in job

21,000 / 300 = 70 batches per job

18,189,661 / 300 = 60,633 total batches

Estimated times per batch:

907ms for update statement with 300 items (from https://console.postgres.ai/gitlab/gitlab-production-tunnel/sessions/3366/commands/11161)

Execution time per sidekiq job:
907 ms * 70 batches = 63,490 ms = 63.49 seconds

Sidekiq jobs are scheduled 2 minutes apart.

867 jobs * 2 minutes = 1,734 minutes = ~28.9 hours
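The arithmetic above can be reproduced with a short sketch. The estimate method and its parameter names are hypothetical. The inputs are the figures quoted in this section, including the 21,000 rows per job that the divisions use. Job and batch counts are rounded up, since a partial job still has to be scheduled.

```ruby
# Rough schedule estimate; a sketch for checking the figures, not part of the migration.
def estimate(total_rows:, rows_per_job:, rows_per_batch:, ms_per_batch:, interval_minutes:)
  jobs = total_rows.fdiv(rows_per_job).ceil           # partial last job still counts
  batches_per_job = rows_per_job / rows_per_batch     # sub-batches inside one job
  total_batches = total_rows.fdiv(rows_per_batch).ceil
  seconds_per_job = ms_per_batch * batches_per_job / 1000.0
  total_minutes = jobs * interval_minutes             # jobs are spaced by the interval

  {
    jobs: jobs,
    batches_per_job: batches_per_job,
    total_batches: total_batches,
    seconds_per_job: seconds_per_job,
    total_minutes: total_minutes,
    total_hours: (total_minutes / 60.0).round(1)
  }
end

result = estimate(
  total_rows: 18_189_661,
  rows_per_job: 21_000,
  rows_per_batch: 300,
  ms_per_batch: 907,
  interval_minutes: 2
)

result.each { |name, value| puts "#{name}: #{value}" }
```

Because each job is estimated to run for about a minute and jobs are scheduled 2 minutes apart, the total duration is driven by the scheduling interval rather than by the execution time.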

Migration output

== 20210401131948 MoveContainerRegistryEnabledToProjectFeatures2: migrating ===
-- Scheduled 1 MoveContainerRegistryEnabledToProjectFeature jobs with a maximum of 30000 records per batch and an interval of 120 seconds.

The migration is expected to take at least 120 seconds. Expect all jobs to have completed after 2021-04-01 14:46:23 UTC.
== 20210401131948 MoveContainerRegistryEnabledToProjectFeatures2: migrated (0.1428s)

Revert migration output:

== 20210401131948 MoveContainerRegistryEnabledToProjectFeatures2: reverting ===
== 20210401131948 MoveContainerRegistryEnabledToProjectFeatures2: reverted (0.0000s)

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #18792 (closed)

Edited by Reuben Pereira
