Skip to content

Backfill catalog_resource_versions.released_at with releases.released_at

Leaminn Ma requested to merge backfill-catalog-versions-released-at into master

What does this MR do and why?

In !141817 (merged), we established a process to keep the denormalized column catalog_resource_versions.released_at in sync with releases.released_at.

In this MR, we will backfill the released_at value of existing catalog resource versions.

Currently, the catalog_resource_versions table is relatively small on GitLab.com (~900 rows). But we are using a batched background migration (instead of a post-deploy migration) because:

  • The number of rows exceeds the batch size of 500.
  • The CI Catalog is Beta released, so there is now a lot more active use of this feature. So it's a bit more likely that a self-managed instance could have a larger table by the time this migration runs.

Resolves Step 3 of #430117 (closed).

Query plan

          sub_batch
            .where('release_id = releases.id')
            .update_all('released_at = releases.released_at FROM releases')

(This query approximates what is being executed with sub_batch and acts on ~100 rows:)

UPDATE "catalog_resource_versions"
SET released_at = releases.released_at
FROM releases
WHERE "catalog_resource_versions"."id" BETWEEN 1 AND 10000000
AND "catalog_resource_versions"."id" >= 1 AND "catalog_resource_versions"."id" < 1000100
AND (release_id = releases.id)

Query plan link: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/25452/commands/80613

Migration

Up

main: == [advisory_lock_connection] object_id: 183360, pg_backend_pid: 94893
main: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: migrating ===
main: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: migrated (0.0503s) 

main: == [advisory_lock_connection] object_id: 183360, pg_backend_pid: 94893
ci: == [advisory_lock_connection] object_id: 183880, pg_backend_pid: 94895
ci: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: migrating ===
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_internal, :gitlab_shared].
ci: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: migrated (0.0080s) 

ci: == [advisory_lock_connection] object_id: 183880, pg_backend_pid: 94895

Down

main: == [advisory_lock_connection] object_id: 183060, pg_backend_pid: 95399
main: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: reverting ===
main: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: reverted (0.0321s) 

main: == [advisory_lock_connection] object_id: 183060, pg_backend_pid: 95399
ci: == [advisory_lock_connection] object_id: 183000, pg_backend_pid: 95829
ci: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: reverting ===
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_internal, :gitlab_shared].
ci: == 20240118225727 QueueBackfillCatalogResourceVersionsReleasedAt: reverted (0.0086s) 

ci: == [advisory_lock_connection] object_id: 183000, pg_backend_pid: 95829

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #430117 (closed)

Edited by Leaminn Ma

Merge request reports