Backfill semver columns of existing catalog resource versions

Leaminn Ma requested to merge migrate-ci-catalog-version-semvar into master

Update [2024-03-07]

As described below, we enforced the use of semantic versioning on Catalog Resource Versions in #427286 (closed). However, because that change was released before the data for existing Versions had been backfilled, it caused a breaking change and/or UI bug for many customers (see gitlab-com/gl-infra/delivery#19995 (comment 1806053147)). The backfill migration in this MR is therefore necessary to fix the issue for our customers.

What does this MR do and why?

In #427286 (closed), we enforced the use of semantic versioning on Release tags of catalog resources. When a new release is created, the version numbers from the tag are parsed and saved in the semver_* columns of the corresponding catalog_resource_versions record.
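To illustrate the parsing step, here is a minimal sketch in plain Ruby. It assumes a simplified pattern in place of Gitlab::Regex.semver_regex (the real regex may be stricter), and the helper name `semver_columns` is hypothetical. Only the column names come from this MR.

```ruby
# Simplified stand-in for Gitlab::Regex.semver_regex (assumption: the real
# pattern may be stricter about prerelease and build identifiers).
SEMVER_PATTERN = /\A(?<major>0|[1-9]\d*)\.(?<minor>0|[1-9]\d*)\.(?<patch>0|[1-9]\d*)(?:-(?<prerelease>[0-9A-Za-z.-]+))?(?:\+(?<build>[0-9A-Za-z.-]+))?\z/

# Hypothetical helper: returns a hash keyed by the semver_* column names,
# or nil when the tag is not a full semantic version.
def semver_columns(tag)
  match = SEMVER_PATTERN.match(tag)
  return nil unless match

  {
    semver_major: match[:major].to_i,
    semver_minor: match[:minor].to_i,
    semver_patch: match[:patch].to_i,
    semver_prerelease: match[:prerelease] # nil when there is no prerelease part
  }
end

semver_columns("4.5.6-alpha")
# => {semver_major: 4, semver_minor: 5, semver_patch: 6, semver_prerelease: "alpha"}
semver_columns("1.2")
# => nil (partial versions do not match)
```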

In this MR, we are backfilling existing catalog resource versions that don't yet have their semver_* columns updated. Versions whose name (release.tag) matches the format for a valid, partial, or extended semantic version are updated accordingly.

Partial/extended semantic versions must first be normalized into the conventional format because otherwise the regex Gitlab::Regex.semver_regex does not consider them a match.

Example conversions:

1         => 1.0.0
v1.2      => 1.2.0
1.2-alpha => 1.2.0-alpha
1.0+123   => 1.0.0+123   => 1.2.3
v4. => 4.5.6-beta

Tag values that do not start with digits separated by the . delimiter (optionally prefixed with v, as in the examples above) are not converted.
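The normalization can be sketched in plain Ruby as follows. This is not the migration's actual code: the helper name `normalize_version`, the regex, and the choice to drop numeric components beyond the patch number for extended versions are assumptions inferred from the example conversions.

```ruby
# Hypothetical pattern: optional "v", dot-separated digits, then an optional
# prerelease ("-...") or build ("+...") suffix that is carried over unchanged.
PARTIAL_VERSION = /\Av?(?<numbers>\d+(?:\.\d+)*)(?<suffix>[-+].*)?\z/

# Hypothetical helper: returns the tag in MAJOR.MINOR.PATCH[suffix] form,
# or nil when the tag does not start with dot-separated digits.
def normalize_version(tag)
  match = PARTIAL_VERSION.match(tag)
  return nil unless match

  parts = match[:numbers].split('.').first(3) # assumption: drop components beyond patch
  parts << '0' while parts.size < 3           # pad a missing minor and/or patch with 0
  "#{parts.join('.')}#{match[:suffix]}"
end

normalize_version("1")             # => "1.0.0"
normalize_version("v1.2")          # => "1.2.0"
normalize_version("1.2-alpha")     # => "1.2.0-alpha"
normalize_version("1.0+123")       # => "1.0.0+123"
normalize_version("not-a-version") # => nil
```

The normalized string would still need to be checked against Gitlab::Regex.semver_regex before its parts are written to the semver_* columns.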

Migration notes:

  • A follow-up MR to finalize the backfill migration will be done in the following milestone.
  • There are currently 1942 rows in the catalog_resource_versions table [as of 2024-03-06].

Resolves #444303 (closed).

MR acceptance checklist

  1. with_release_tag(sub_batch)
SELECT catalog_resource_versions.*, releases.tag
FROM "catalog_resource_versions"
INNER JOIN releases ON releases.id = catalog_resource_versions.release_id
WHERE "catalog_resource_versions"."id" BETWEEN 1 AND 15
  AND "catalog_resource_versions"."semver_major" IS NULL 
  AND "catalog_resource_versions"."id" >= 1

Query plan link:

  2. Update
UPDATE "catalog_resource_versions"
SET "semver_major" = 4, "semver_minor" = 5, "semver_patch" = 6, "semver_prerelease" = 'alpha' 
WHERE "catalog_resource_versions"."id" = 30

Query plan link:



main: == [advisory_lock_connection] object_id: 119420, pg_backend_pid: 27853
main: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: migrating ========
main: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: migrated (0.0489s) 

main: == [advisory_lock_connection] object_id: 119420, pg_backend_pid: 27853
ci: == [advisory_lock_connection] object_id: 119980, pg_backend_pid: 27855
ci: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: migrating ========
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_internal, :gitlab_shared].
ci: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: migrated (0.0074s) 

ci: == [advisory_lock_connection] object_id: 119980, pg_backend_pid: 27855


main: == [advisory_lock_connection] object_id: 119060, pg_backend_pid: 36115
main: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: reverting ========
main: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: reverted (0.0263s) 

main: == [advisory_lock_connection] object_id: 119060, pg_backend_pid: 36115
ci: == [advisory_lock_connection] object_id: 119060, pg_backend_pid: 36572
ci: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: reverting ========
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_internal, :gitlab_shared].
ci: == 20240305182005 QueueBackfillCatalogResourceVersionSemVer: reverted (0.0084s) 

ci: == [advisory_lock_connection] object_id: 119060, pg_backend_pid: 36572

How to set up and validate locally

  1. Before checking out this branch and running the migration, first set up a catalog resource. You can create a new catalog resource (aka "component project") via the UI or you can seed one with the following command:
bundle exec rake "gitlab:seed:ci_catalog_resources[<YOUR-GROUP-PATH>, 1, true]"
  2. In the Rails console, add releases and corresponding versions to the catalog resource project with a variety of tag name values:
project = Project.find(<PROJECT_ID>) # ID of the project object created from Step 1 (not of the catalog resource object).
author = User.first # Modify this as needed if your first User isn't valid.
tags = ["0.1", "", "1.1-alpha", "1.1.3-beta", "v1.1.3.5-beta", "1.2.3", "4.5+123", "5", "not-a-version", "test++..", "v3"]

tags.each do |tag|
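  # Assumption: any timestamp is acceptable for released_at here.
  release = Release.create!(tag: tag, project: project, released_at: Time.zone.now, author: author)
  # Assumption: the version model is Ci::Catalog::Resources::Version. Validations are skipped so that non-semver tags can be saved.
  version = Ci::Catalog::Resources::Version.new(release: release, catalog_resource: project.catalog_resource, project: project)
  version.save!(validate: false)
end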
  3. Confirm with the following query that the catalog_resource_versions.semver_* columns are all nil.
project.reload.catalog_resource.versions.includes(:release).each_with_object({}) do |row, obj|
  # Key each result by the release tag (assumed key; the original was lost).
  obj[row.release.tag] = [row.semver_major, row.semver_minor, row.semver_patch, row.semver_prerelease]
end


  4. Now check out this branch and run the backfill migration.
bundle exec rails db:migrate
  5. Re-run the query from Step (3), and observe that the semver columns have been backfilled as expected.


  6. Clean up: Destroy the releases created in Step (2) because they're not actually associated with real tags and may cause unexpected behaviour with your GDK later.

