Phase 2 enqueuer
Context

We are preparing for Phase 2 of the Container Registry migration which involves importing all existing container repositories to the new platform (Phase 1 involved routing all new container repositories to the new platform). See &7316 (closed) for full details of how the import will work.
Rails is responsible for starting each import. This introduces the `EnqueuerWorker`, which will query the `container_repositories` table, find the next repository that qualifies for import, and make a request to the registry to start the pre-import.

What does this MR do and why?

This MR introduces the `EnqueuerWorker`. It is responsible for finding the next container repository that qualifies for import and kicking off that import. It follows a sequence of checks:
- Return unless the main import feature flag `:container_registry_migration_phase2_enabled` is enabled.
- Return if there are too many container repositories currently being imported.
- Return if there has not been a long enough delay between imports (eventually this delay will drop to 0, but while we are starting off we want to go one at a time).
- Check if there are any imports that were aborted. If one is found, restart it and return.
- Find the next container repository that qualifies for import.
  - We are following a rollout plan where we import one pricing tier at a time, with a few other rules.
- If the qualified repository has too many tags, skip it and return.
- Start the import for the qualified repository.
- If starting or retrying an import fails, abort the import so it can be tried again later.

A more detailed description of these steps can be found in the issue description.
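The guard sequence above can be sketched in plain Ruby. This is only an illustration: the class, parameter names, and return symbols below are invented stand-ins for the real worker, feature flags, and ActiveRecord scopes.

```ruby
# Minimal sketch of the enqueuer's guard sequence, using plain-Ruby
# stand-ins for the real feature flags and scopes (names are illustrative).
class EnqueuerSketch
  def initialize(enabled:, active_imports:, capacity:, waited_long_enough:,
                 aborted_import: nil, next_repository: nil, max_tags_count: 100)
    @enabled = enabled
    @active_imports = active_imports
    @capacity = capacity
    @waited_long_enough = waited_long_enough
    @aborted_import = aborted_import
    @next_repository = next_repository
    @max_tags_count = max_tags_count
  end

  # Returns a symbol describing what the worker would do on this run.
  def next_action
    return :noop unless @enabled                    # main feature flag off
    return :noop if @active_imports >= @capacity    # too many imports in flight
    return :noop unless @waited_long_enough         # enforce delay between imports
    return :retry_aborted_import if @aborted_import # restart an aborted import first
    return :noop unless @next_repository            # nothing qualifies for import
    return :skip_repository if @next_repository[:tags_count] > @max_tags_count
    :start_import
  end
end
```

Each guard returns early, so the cheapest checks (feature flag, capacity, delay) run before any lookup for an aborted or qualifying repository would be issued.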
The `EnqueuerWorker` will be kicked off in two ways:
- A cron running every hour will run the worker, starting an import.
- Whenever an import is completed, the worker will be kicked off to start a new import.

The cron ensures that imports will keep being attempted, especially while we are starting out and have everything throttled down using the various feature flags and application settings in `ContainerRegistry::Migration`.
There are many calls to methods in `::ContainerRegistry::Migration`. These all check feature flag and application setting values. Since we are using a fairly large number of settings and feature flags to control the import rollout, they have been centralized in a single class to keep things organized.
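As a sketch of that centralization idea (the flag names follow the MR's naming pattern, but the class and method shapes here are illustrative, not the real `ContainerRegistry::Migration` API):

```ruby
# Illustrative stand-in for a centralized rollout-settings class: every
# rollout question is answered here, so callers never read flags directly.
class MigrationRolloutSketch
  def initialize(flags)
    @flags = flags # hash of feature-flag name => boolean
  end

  def enabled?
    @flags.fetch(:container_registry_migration_phase2_enabled, false)
  end

  # Hypothetical capacity resolution: flags are checked from largest to
  # smallest, the first enabled one wins, and all-off means a capacity of
  # 0 (no imports start).
  def capacity
    return 25 if @flags[:container_registry_migration_phase2_capacity_25]
    return 1 if @flags[:container_registry_migration_phase2_capacity_1]
    0
  end
end
```

The worker can then ask `settings.capacity` or `settings.enabled?` without knowing which flag backs each answer, which keeps the rollout knobs in one reviewable place.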

Database

Queries
This MR introduces 4 new scopes that in turn make up 4 new queries:

1. `ContainerRepository.with_migration_states(%w[pre_importing pre_import_done importing]).count`

Query:

SELECT COUNT(*)
FROM "container_repositories"
WHERE "container_repositories"."migration_state"
IN ('pre_importing', 'pre_import_done', 'importing');

Explain:
- Without the new index (Seq Scan): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8309/commands/29359
- With the new index (Index Only Scan): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8306/commands/29357

Note: currently all container repositories have a `'default'` `migration_state`, so, having added the index and updated some values on postgres.ai, we cannot achieve a cold-cache query. Besides the better explain plan with the new index, the thing to notice for this and all of the queries using the new index is that the total number of buffers (hits + reads) is much lower.
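The capacity check this count backs can be mimicked over an in-memory array in plain Ruby (sample data only, to show what the scope counts):

```ruby
# Plain-Ruby mimic of the COUNT(*) query above: count repositories whose
# migration_state marks an import as currently in flight.
ACTIVE_STATES = %w[pre_importing pre_import_done importing].freeze

repositories = [
  { id: 1, migration_state: 'pre_importing' },
  { id: 2, migration_state: 'default' },
  { id: 3, migration_state: 'importing' },
  { id: 4, migration_state: 'import_done' }
]

active_count = repositories.count { |r| ACTIVE_STATES.include?(r[:migration_state]) }
puts active_count # the worker compares this count against the configured capacity
```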
2. `ContainerRepository.recently_done_migration_step.first`

This query uses a new index, `index_container_repositories_on_greatest_done_at`.

Query:
SELECT "container_repositories".*
FROM "container_repositories"
WHERE "container_repositories"."migration_state" IN ('import_done', 'pre_import_done', 'import_aborted')
ORDER BY GREATEST(migration_pre_import_done_at, migration_import_done_at, migration_aborted_at) DESC
LIMIT 1;
To set up some data for this query in postgres.ai:
UPDATE container_repositories SET migration_state = 'import_done',
migration_import_done_at = (
select timestamp '2020-01-10 00:00:00' + random() * (timestamp '2022-01-01 00:00:00' - timestamp '2020-01-01 00:00:00')
) WHERE id % 100 = 0;
UPDATE container_repositories SET migration_state = 'pre_import_done',
migration_pre_import_done_at = (
select timestamp '2020-01-10 00:00:00' + random() * (timestamp '2022-01-01 00:00:00' - timestamp '2020-01-01 00:00:00')
) WHERE id % 425 = 0;
UPDATE container_repositories SET migration_state = 'import_aborted',
migration_aborted_at = (
select timestamp '2020-01-10 00:00:00' + random() * (timestamp '2022-01-01 00:00:00' - timestamp '2020-01-01 00:00:00')
) WHERE id % 900 = 0;
Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8442/commands/29887
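The `GREATEST(...)` ordering can likewise be mimicked in plain Ruby by taking the maximum of the three "done" timestamps per repository (sample data only):

```ruby
require 'time'

# Plain-Ruby mimic of ORDER BY GREATEST(...) DESC LIMIT 1: among the
# repositories in a "done" state, pick the one whose most recent done
# timestamp is the latest.
DONE_STATES = %w[import_done pre_import_done import_aborted].freeze

repositories = [
  { id: 1, migration_state: 'import_done',
    migration_import_done_at: Time.parse('2022-01-03 00:00:00') },
  { id: 2, migration_state: 'pre_import_done',
    migration_pre_import_done_at: Time.parse('2022-01-05 00:00:00') },
  { id: 3, migration_state: 'default' } # never selected: not a done state
]

latest = repositories
  .select { |r| DONE_STATES.include?(r[:migration_state]) }
  .max_by do |r|
    [r[:migration_pre_import_done_at],
     r[:migration_import_done_at],
     r[:migration_aborted_at]].compact.max
  end

puts latest[:id]
```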
3. `ContainerRepository.with_migration_state('import_aborted').take`

Query:

SELECT "container_repositories".*
FROM "container_repositories"
WHERE "container_repositories"."migration_state" = 'import_aborted'
LIMIT 1;
4. `ContainerRepository.ready_for_import.take`

Explain:
- Without the new index (Seq Scan): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8309/commands/29366
- With the new index (Index Scan): https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8309/commands/29370

The `.ready_for_import` scope contains `.with_target_import_tier`, which is overridden in EE, and there is additionally a feature flag that can affect how the query is formed. Performance benefits greatly since we have no `ORDER BY` and only need one record (`LIMIT 1`).

Note that the EE permutations have a guard so they will only execute on GitLab.com.

Here is each permutation, with notes about when and how often it will be used:
EE - `.with_target_import_tier` filters by plan name

This query occurs when the feature flag `:container_registry_migration_limit_gitlab_org` is disabled. It is the most complicated query (the most joins and filters) and is the query that will be used the majority of the time for the GitLab.com migration.

This query might benefit from an index since, over time, the `migration_state` of the container repositories will move from `default` to `import_done`, but I didn't want to prematurely add the index. I'm open to looking further into it if there are any specific ideas.
SELECT "container_repositories".*
FROM "container_repositories"
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id"
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
INNER JOIN "gitlab_subscriptions" ON "gitlab_subscriptions"."namespace_id" = "namespaces"."id"
INNER JOIN "plans" ON "plans"."id" = "gitlab_subscriptions"."hosted_plan_id"
WHERE "container_repositories"."migration_state" = 'default'
AND "container_repositories"."created_at" < '2022-01-01 00:00:00'
AND "plans"."name" = 'free'
AND (
NOT EXISTS (
SELECT 1
FROM feature_gates
WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
AND feature_gates.key = 'actors'
AND feature_gates.value = concat('Group:', projects.namespace_id)
)
) LIMIT 1;
Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8310/commands/29375
EE - `.with_target_import_tier` filters repositories for the `gitlab-org` group

This query occurs when the feature flag `:container_registry_migration_limit_gitlab_org` is enabled. It will be used to allow us to start by importing only container repositories belonging to the `gitlab-org` group.
SELECT "container_repositories".*
FROM "container_repositories"
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id"
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
WHERE "container_repositories"."migration_state" = 'default'
AND "container_repositories"."created_at" < '2022-01-01 00:00:00'
AND "namespaces"."path" = 'gitlab-org'
AND (
NOT EXISTS (
SELECT 1
FROM feature_gates
WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
AND feature_gates.key = 'actors'
AND feature_gates.value = concat('Group:', projects.namespace_id)
)
) LIMIT 1
Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8411/commands/29674
FOSS - `.with_target_import_tier` returns `all`

This is the least complicated query (the fewest joins and filters). This is what will run on self-managed instances that use the import process.
SELECT "container_repositories".*
FROM "container_repositories"
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id"
WHERE "container_repositories"."migration_state" = 'default'
AND "container_repositories"."created_at" < '2022-01-23 00:00:00'
AND (
NOT EXISTS (
SELECT 1
FROM feature_gates
WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
AND feature_gates.key = 'actors'
AND feature_gates.value = concat('Group:', projects.namespace_id)
)
) LIMIT 1;
Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8306/commands/29333
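A minimal sketch of how the three permutations above are chosen (the method name and return symbols are invented for illustration; the real logic lives in `.with_target_import_tier` and its EE override):

```ruby
# Illustrative dispatch over the three query permutations described above.
def target_import_filter(gitlab_com:, limit_gitlab_org_enabled:)
  return :none unless gitlab_com                  # FOSS: returns all, no tier filtering
  return :gitlab_org_group if limit_gitlab_org_enabled
  :free_tier_plans                                # default GitLab.com rollout path
end
```

The GitLab.com guard comes first, which is why self-managed instances always get the simplest query regardless of the feature flag.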
Migrations
Migration output
→ bundle exec rails db:migrate:redo
== 20220128194722 AddIndexOnMigrationStateAndImportDoneAtToContainerRepositories: reverting
-- transaction_open?()
-> 0.0000s
-- indexes(:container_repositories)
-> 0.0052s
-- execute("SET statement_timeout TO 0")
-> 0.0007s
-- remove_index(:container_repositories, {:algorithm=>:concurrently, :name=>"index_container_repositories_on_migration_state_import_done_at"})
-> 0.0064s
-- execute("RESET statement_timeout")
-> 0.0007s
== 20220128194722 AddIndexOnMigrationStateAndImportDoneAtToContainerRepositories: reverted (0.0154s)
== 20220128194722 AddIndexOnMigrationStateAndImportDoneAtToContainerRepositories: migrating
-- transaction_open?()
-> 0.0000s
-- index_exists?(:container_repositories, [:migration_state, :migration_import_done_at], {:name=>"index_container_repositories_on_migration_state_import_done_at", :algorithm=>:concurrently})
-> 0.0068s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- add_index(:container_repositories, [:migration_state, :migration_import_done_at], {:name=>"index_container_repositories_on_migration_state_import_done_at", :algorithm=>:concurrently})
-> 0.0082s
-- execute("RESET statement_timeout")
-> 0.0011s
== 20220128194722 AddIndexOnMigrationStateAndImportDoneAtToContainerRepositories: migrated (0.0211s)
→ bundle exec rake db:redo
== 20220204154220 AddIndexOnGreatestDoneAtToContainerRepositories: reverting ==
-- transaction_open?()
-> 0.0000s
-- indexes(:container_repositories)
-> 0.0057s
-- execute("SET statement_timeout TO 0")
-> 0.0009s
-- remove_index(:container_repositories, {:algorithm=>:concurrently, :name=>"index_container_repositories_on_greatest_done_at"})
-> 0.0066s
-- execute("RESET statement_timeout")
-> 0.0006s
== 20220204154220 AddIndexOnGreatestDoneAtToContainerRepositories: reverted (0.0185s)
== 20220204154220 AddIndexOnGreatestDoneAtToContainerRepositories: migrating ==
-- transaction_open?()
-> 0.0000s
-- index_exists?(:container_repositories, "GREATEST(migration_pre_import_done_at, migration_import_done_at, migration_aborted_at)", {:where=>"migration_state IN ('import_done', 'pre_import_done', 'import_aborted')", :name=>"index_container_repositories_on_greatest_done_at", :algorithm=>:concurrently})
-> 0.0061s
-- execute("SET statement_timeout TO 0")
-> 0.0006s
-- add_index(:container_repositories, "GREATEST(migration_pre_import_done_at, migration_import_done_at, migration_aborted_at)", {:where=>"migration_state IN ('import_done', 'pre_import_done', 'import_aborted')", :name=>"index_container_repositories_on_greatest_done_at", :algorithm=>:concurrently})
-> 0.0153s
-- execute("RESET statement_timeout")
-> 0.0007s
== 20220204154220 AddIndexOnGreatestDoneAtToContainerRepositories: migrated (0.0321s)

Screenshots or screen recordings

See below

How to set up and validate locally

We cannot fully test the functionality because the import functionality is still being developed in the Container Registry, so any request to import a repository will result in an error. It does, however, let us test that imports are aborted properly and that the worker follows the application settings and feature flags in place.
- Set up the feature flags:

  Feature.enable(:container_registry_migration_phase2_enabled)
  Feature.enable(:container_registry_migration_phase2_capacity_1)
  Feature.disable(:container_registry_migration_phase2_enqueue_speed_fast)
  Feature.disable(:container_registry_migration_phase2_enqueue_speed_slow)

- Create some container repositories in the console and set them to be created a few months ago so they qualify for import:

  10.times { FactoryBot.create(:container_repository, project: Project.first) }
  ContainerRepository.update_all(created_at: 3.months.ago)
  ContainerRepository.where(migration_state: 'default').count # => 10

- Run the worker:

  ContainerRegistry::Migration::EnqueuerWorker.set(queue: 'cronjob:container_registry_migration_enqueuer').perform_in(1.second)

- Check the container repositories; the first one should have been aborted:

  ContainerRepository.where(migration_state: 'default').count # => 9
  ContainerRepository.where.not(migration_state: 'default').first.migration_state # => "import_aborted"
  # Since the registry cannot be connected to in these tests, we receive an error and the import is aborted

- Set the first repository as recently imported:

  ContainerRepository.first.update(migration_state: 'import_done', migration_import_done_at: 5.minutes.ago)

- Rerun the worker and see that no repositories are updated:

  ContainerRegistry::Migration::EnqueuerWorker.set(queue: 'cronjob:container_registry_migration_enqueuer').perform_in(1.second)
  ContainerRepository.where(migration_state: 'default').count # => 9

- Update the waiting-time feature flag:

  Feature.enable(:container_registry_migration_phase2_enqueue_speed_fast)

- Rerun the worker and see that another repository has been updated:

  ContainerRegistry::Migration::EnqueuerWorker.set(queue: 'cronjob:container_registry_migration_enqueuer').perform_in(1.second)
  ContainerRepository.where(migration_state: 'default').count # => 8

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.
Related to #349744 (closed)