Do not require stopping the registry to apply up DB migrations

Merged. João Pereira requested to merge registry-pdm into master.

What does this MR do?

The original change was implemented in !7140 (merged).

Currently, users must stop the registry to run gitlab-ctl registry-database migrate up. As we prepare to release the first post-deployment migrations to self-managed instances, we've realized this requirement prevents admins from running these migrations properly after deployment.

While admins can skip post-deployment migrations using the --skip-post-deployment flag and apply only regular migrations before starting an upgraded registry, they would then need to stop the service again to run the post-deployment migrations, which contradicts their "post-deployment" purpose.

The registry application refuses to start (serve command) if there are unapplied regular migrations, that is, migrations present in the registry binary but not recorded in the schema_migrations table. Requiring admins to stop the registry before applying regular migrations is therefore unnecessary: the registry would not be running in the first place if any were pending.

By removing this requirement, we enable the following workflow:

  1. Stop the registry and upgrade the binary
  2. Apply pending regular migrations (optionally skip post-deployment ones to reduce downtime): gitlab-ctl registry-database migrate up --skip-post-deployment
  3. Start the registry
  4. If post-deployment migrations were skipped in step 2, apply them while the app is running: gitlab-ctl registry-database migrate up
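
The four steps above can be sketched as a small wrapper script. This is only an illustration, not part of the change itself: the `run` and `upgrade_registry_workflow` helpers and the `DRY_RUN` and `SKIP_PDM` variables are hypothetical names, and the package upgrade is left as a placeholder. Only the `gitlab-ctl` commands are taken from this description.

```shell
#!/bin/sh
# Hypothetical wrapper around the workflow described in this MR.
# DRY_RUN defaults to 1, so by default the commands are only printed.
# SKIP_PDM=1 defers post-deployment migrations until the registry is running.
DRY_RUN="${DRY_RUN:-1}"
SKIP_PDM="${SKIP_PDM:-1}"

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

upgrade_registry_workflow() {
  # 1. Stop the registry (the binary/package upgrade itself is not shown).
  run gitlab-ctl stop registry || return 1

  # 2. Apply pending regular migrations, optionally skipping post-deployment ones.
  if [ "$SKIP_PDM" = "1" ]; then
    run gitlab-ctl registry-database migrate up --skip-post-deployment || return 1
  else
    run gitlab-ctl registry-database migrate up || return 1
  fi

  # 3. Start the registry.
  run gitlab-ctl start registry || return 1

  # 4. Apply the skipped post-deployment migrations while the app is running.
  if [ "$SKIP_PDM" = "1" ]; then
    run gitlab-ctl registry-database migrate up || return 1
  fi
}

upgrade_registry_workflow
```

Running the script as is only prints the planned commands; setting `DRY_RUN=0` would execute them. Each step stops the sequence on failure so that the registry is never started after a failed migration run.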

Aside from the changes here, we're working on a separate MR to update the documentation with more details: gitlab!181680 (merged).

Related issues

Related to Ensuring Safe Execution of Post-Deployment Migr... (container-registry#1516 - closed).

Local testing

Setup

  1. Set up a development Omnibus environment

  2. Provision a separate Postgres instance and create a logical database, for example:

    CREATE USER registry WITH PASSWORD 'registrypassword';
    CREATE DATABASE registry_omnibus WITH OWNER registry;
  3. Alternatively, you can use the provisioned Postgres instance, but you will need to allow the registry to connect to it via Unix socket, for example by following https://gitlab.com/-/snippets/2572204#prepare-the-metadata-database-for-the-container-registry.

  4. Configure the database section under the registry configuration, with the enabled flag set to false

    registry['database'] = {
      'enabled' => false,
      'host' => '172.17.0.1', # or path to unix socket
      'port' => 5432,
      'user' => 'registry',
      'password' => 'registrypassword',
      'dbname' => 'registry_omnibus',
      'sslmode' => 'disable'
    }
  5. Reconfigure: gitlab-ctl reconfigure

  6. Confirm that the registry is running:

    $ gitlab-ctl tail registry
    ...
    2025-02-27_18:13:25.03700 time="2025-02-27T18:13:25.036Z" level=info msg="listening on [::]:41113" environment=production go_version=go1.23.6 instance_id=263a70b4-9a23-44df-a45a-5f28fd8cbf58 service=registry version=v4.15.0-gitlab
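
As a rough aid for this check, the log can be classified with a small shell function. This is a minimal sketch: `registry_log_state` is a hypothetical helper, and it only looks for two messages quoted in this description, the "listening on" line above and the "there are pending database migrations" error that the registry logs when it refuses to start.

```shell
# registry_log_state (hypothetical helper): read registry log lines on stdin
# and print one of:
#   listening           a 'msg="listening on ...' line was seen
#   pending-migrations  the pending database migrations error was seen
#   unknown             neither message was seen
# If both messages appear, the one seen last wins.
registry_log_state() {
  state="unknown"
  while IFS= read -r line; do
    case "$line" in
      *'msg="listening on '*) state="listening" ;;
      *'there are pending database migrations'*) state="pending-migrations" ;;
    esac
  done
  echo "$state"
}
```

The function needs finite input, so it should be fed a saved log excerpt rather than the continuous output of gitlab-ctl tail registry.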

Procedure

  1. Enable the DB while no migrations have been applied, to confirm that the registry refuses to start as expected:

    registry['database'] = {
      'enabled' => true,
      # ...
    }
    $ gitlab-ctl reconfigure
    $ gitlab-ctl tail registry
    ...
    2025-02-27_18:16:25.62267 creating new registry instance: configuring application: there are pending database migrations, use the 'registry database migrate' CLI command to check and apply them
  2. Check that all migrations are pending (APPLIED column is empty for all rows in the output table):

    $ gitlab-ctl registry-database migrate status
    +--------------------------------------------------------------------------------------+---------+
    |                                      MIGRATION                                       | APPLIED |
    +--------------------------------------------------------------------------------------+---------+
    | 20210503145024_create_top_level_namespaces_table                                     |         |
    ...
    | 20241031081325_add_background_migration_timing_columns                               |         |
    +--------------------------------------------------------------------------------------+---------+
  3. Apply all migrations (this should work fine even though the registry is running, albeit with the DB disabled):

    $ gitlab-ctl registry-database migrate up
    Running migrate up
    Executing command:
    /opt/gitlab/embedded/bin/registry database migrate up /var/opt/gitlab/registry/config.yml
    20210503145024_create_top_level_namespaces_table
    ...
    20241031081325_add_background_migration_timing_columns
    OK: applied 166 migrations and 0 background migrations in 10.871s
  4. Now we need to simulate an upgrade to a version that includes a new post-deployment migration. To do this, build the registry binary from a source tree that includes a dummy migration; I've prepared the sample-pdm-test branch for this. There may be an easier way to replace the version in Omnibus (please share if so), but this is how I did it. You'll need Go 1.23+ and make:

    $ git clone -b sample-pdm-test --single-branch https://gitlab.com/gitlab-org/container-registry.git
    $ cd container-registry
    $ make binaries
    # now we can replace the registry binary with the new one
    $ gitlab-ctl stop registry
    $ mv /opt/gitlab/embedded/bin/registry /opt/gitlab/embedded/bin/registry_bkp
    $ cp bin/registry /opt/gitlab/embedded/bin/
    $ gitlab-ctl start registry
  5. Confirm that the registry started fine:

    $ gitlab-ctl tail registry

    Note: If you see an error like creating new registry instance: configuring application: registry filesystem metadata in use, please import data before enabling the database..., this is unrelated to this change and we're working on a fix for it (container-registry#1523). To resolve it, run rm -rf /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/lockfiles.

  6. Check pending migrations:

    $ gitlab-ctl registry-database migrate status
    Running migrate status
    Executing command:
    /opt/gitlab/embedded/bin/registry database migrate status /var/opt/gitlab/registry/config.yml
    +--------------------------------------------------------------------------------------+--------------------------------------+
    |                                      MIGRATION                                       |               APPLIED                |
    +--------------------------------------------------------------------------------------+--------------------------------------+
    ...
    | 20250225164307_post_foo (post deployment)                                            |                                      |
    +--------------------------------------------------------------------------------------+--------------------------------------+
  7. Apply migrations without stopping:

    $ gitlab-ctl registry-database migrate up
    Running migrate up
    Executing command:
    /opt/gitlab/embedded/bin/registry database migrate up /var/opt/gitlab/registry/config.yml
    The following migrations will be applied:
    20250225164307_post_foo
    OK: applied 1 migrations and 0 background migrations in 0.025s
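
To double-check the result of the procedure, the status table can be filtered for rows whose APPLIED column is still empty. The function below is a sketch that assumes the table layout shown in the status output above; `pending_migrations` is a hypothetical helper, not a registry or gitlab-ctl command.

```shell
# pending_migrations (hypothetical helper): read `migrate status` output on
# stdin and print the name of every migration whose APPLIED column is empty.
# Border rows, the header row, and non-table lines are ignored.
pending_migrations() {
  awk -F'|' '
    NF >= 4 && $2 !~ /MIGRATION/ {
      name = $2
      applied = $3
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the migration name
      gsub(/[ \t]/, "", applied)          # empty after stripping = not applied
      if (name != "" && applied == "") print name
    }'
}
```

For example, gitlab-ctl registry-database migrate status | pending_migrations should print nothing once step 7 has completed. Because post-deployment rows carry the "(post deployment)" suffix in the table, piping the result through grep -v '(post deployment)' would leave only pending regular migrations.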

Checklist

See Definition of done.

For anything in this list which will not be completed, please provide a reason in the MR discussion.

Required

  • MR title and description are up to date, accurate, and descriptive.
  • MR targeting the appropriate branch.
  • Latest Merge Result pipeline is green.
  • When ready for review, MR is labeled workflow::ready for review per the Distribution MR workflow.

For GitLab team members

If you don't have access to this, the reviewer should trigger these jobs for you during the review process.

  • The manual Trigger:ee-package jobs have a green pipeline running against latest commit.
  • If config/software or config/patches directories are changed, make sure the build-package-on-all-os job within the Trigger:ee-package downstream pipeline succeeded.
  • If you are changing anything SSL related, then the Trigger:package:fips manual job within the Trigger:ee-package downstream pipeline must succeed.
  • If CI configuration is changed, the branch must be pushed to dev.gitlab.org to confirm regular branch builds aren't broken.

Expected (please provide an explanation if not completing)

  • Test plan indicating conditions for success has been posted and passes.
  • Documentation created/updated.
  • Tests added.
  • Integration tests added to GitLab QA.
  • Equivalent MR/issue for the GitLab Chart opened.
  • Validate potential values for new configuration settings. Formats such as integer 10, duration 10s, URI scheme://user:passwd@host:port may require quotation or other special handling when rendered in a template and written to a configuration file.
Edited by João Pereira

Merged by Robert Marshall 3 weeks ago (Mar 5, 2025 4:03pm UTC)

Pipeline #1701756241 passed

Pipeline passed for 17ac1143 on master

7 environments impacted.
