Skip to content

Enqueuer query uses root namespace for imports

Steve Abrams requested to merge 354800-enqueuery-query-update into master

What does this MR do and why?

In !78613 (merged) we added the ContainerRepository::Migration::EnqueuerWorker that will be responsible for finding container repositories to be imported into the new container registry database.

This worker is not yet in use and sits behind a feature flag.

The query used to find the next container repository to import takes into account the group's plan, and also whether or not a given group has been added to a deny-list to prevent it from importing. These aspects are how we will be controlling some of the rollout of the imports.

It was noted that we check the namespace that the container repository belongs to directly, but when it comes to the plan, that is specific to the root namespace. Similarly, when we add namespaces to the deny-list, we will be adding root namespaces.

This MR updates the query to target the container repository's root namespace rather than the direct namespace.

Database

Old queries with namespaces.id

EE - .with_target_import_tier filters by plan name

This query occurs when the feature flag :container_registry_migration_limit_gitlab_org is disabled. This is the most complicated query (most joins and filters). This is the query that will be used the majority of the time for the GitLab.com migration.

SELECT "container_repositories".* 
FROM "container_repositories" 
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id" 
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id" 
INNER JOIN "gitlab_subscriptions" ON "gitlab_subscriptions"."namespace_id" = "namespaces"."id" 
INNER JOIN "plans" ON "plans"."id" = "gitlab_subscriptions"."hosted_plan_id" 
WHERE "container_repositories"."migration_state" = 'default' 
AND "container_repositories"."created_at" < '2022-01-01 00:00:00' 
AND "plans"."id" = 34 
AND (
  NOT EXISTS (
        SELECT 1
        FROM feature_gates
        WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
        AND feature_gates.key = 'actors'
        AND feature_gates.value = concat('Group:', projects.namespace_id)
  )
) LIMIT 1;

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8310/commands/29375

EE - .with_target_import_tier filters repositories for `gitlab-org` group

This query occurs when the feature flag :container_registry_migration_limit_gitlab_org is enabled. This will be used to allow us to start by only importing container repositories belonging to the gitlab-org group.

SELECT "container_repositories".* 
FROM "container_repositories" 
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id" 
INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id" 
WHERE "container_repositories"."migration_state" = 'default' 
AND "container_repositories"."created_at" < '2022-01-01 00:00:00' 
AND "namespaces"."path" = 'gitlab-org' 
AND (
  NOT EXISTS (
        SELECT 1
        FROM feature_gates
        WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
        AND feature_gates.key = 'actors'
        AND feature_gates.value = concat('Group:', projects.namespace_id)
  )
) LIMIT 1

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8411/commands/29674

FOSS - .with_target_import_tier` returns `all`

This is the least complicated query (least joins and filters). This is what will be run on self-managed instances that use the import process.

SELECT "container_repositories".* 
FROM "container_repositories" 
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id" 
WHERE "container_repositories"."migration_state" = 'default' 
AND "container_repositories"."created_at" < '2022-01-23 00:00:00' 
AND (
  NOT EXISTS (
        SELECT 1
        FROM feature_gates
        WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
        AND feature_gates.key = 'actors'
        AND feature_gates.value = concat('Group:', projects.namespace_id)
  )
) LIMIT 1;

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8306/commands/29333

New queries with namespaces.traversal_ids[1]

The differences to note are that we now JOIN on:

INNER JOIN "gitlab_subscriptions" ON gitlab_subscriptions."namespace_id" = "namespaces"."traversal_ids"[1]

And the last part of the query that checks the deny-list (which is a feature flag) is updated to also use traversal_ids[1]

AND feature_gates.value = concat('Group:', namespaces.traversal_ids[1])
EE - .with_target_import_tier filters by plan name

This query occurs when the feature flag :container_registry_migration_limit_gitlab_org is disabled. This is the most complicated query (most joins and filters). This is the query that will be used the majority of the time for the GitLab.com migration.

SELECT
    "container_repositories".*
FROM
    "container_repositories"
    INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id"
    INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
    INNER JOIN "gitlab_subscriptions" ON gitlab_subscriptions."namespace_id" = "namespaces"."traversal_ids"[1]
    INNER JOIN "plans" ON "plans"."id" = "gitlab_subscriptions"."hosted_plan_id"
WHERE
    "container_repositories"."migration_state" = 'default'
    AND "container_repositories"."created_at" < '2022-01-23 00:00:00'
    AND "plans"."id" = 34
    AND (NOT EXISTS (
            SELECT
                1
            FROM
                feature_gates
            WHERE
                feature_gates.feature_key = 'container_registry_phase_2_deny_list'
                AND feature_gates.key = 'actors'
                AND feature_gates.value = concat('Group:', namespaces.traversal_ids[1])
        )
    ) LIMIT 1;

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8998/commands/31845

EE - .with_target_import_tier filters repositories for `gitlab-org` group

This query occurs when the feature flag :container_registry_migration_limit_gitlab_org is enabled. This will be used to allow us to start by only importing container repositories belonging to the gitlab-org group.

SELECT
    "container_repositories".*
FROM
    "container_repositories"
    INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id"
    INNER JOIN "namespaces" ON "namespaces"."id" = "projects"."namespace_id"
WHERE
    "container_repositories"."migration_state" = 'default'
    AND "container_repositories"."created_at" < '2022-01-23 00:00:00'
    AND "namespaces"."path" = 'gitlab-org'
    AND (NOT EXISTS (
            SELECT
                1
            FROM
                feature_gates
            WHERE
                feature_gates.feature_key = 'container_registry_phase_2_deny_list'
                AND feature_gates.key = 'actors'
                AND feature_gates.value = concat('Group:', namespaces.traversal_ids[1])))
LIMIT 1;

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8998/commands/31846

FOSS - .with_target_import_tier` returns `all`

This is the least complicated query (least joins and filters). This is what will be run on self-managed instances that use the import process.

SELECT "container_repositories".* 
FROM "container_repositories" 
INNER JOIN "projects" ON "projects"."id" = "container_repositories"."project_id" 
WHERE "container_repositories"."migration_state" = 'default' 
AND "container_repositories"."created_at" < '2022-01-23 00:00:00' 
AND (
  NOT EXISTS (
        SELECT 1
        FROM feature_gates
        WHERE feature_gates.feature_key = 'container_registry_phase_2_deny_list'
        AND feature_gates.key = 'actors'
        AND feature_gates.value = concat('Group:', projects.namespace_id)
  )
) LIMIT 1;

Explain: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/8998/commands/31847

Screenshots or screen recordings

N/A

How to set up and validate locally

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #354800 (closed)

Edited by Steve Abrams

Merge request reports