Registry database import messes up the createdAt/publishedAt values of the image tags

Why This Issue Remains Unfixed

Tag timestamps get set to import time because extracting the original creation date would require parsing manifest configurations during Step 2 of the import—the only phase requiring read-only mode and the most critical part of the migration process.

The complexity: Different manifest types have different structures. OCI/Docker v2 manifests may have a created field in their config. Manifest lists don't have configs—we'd need additional database queries to follow references. Some manifests (like kaniko --reproducible builds) have no creation timestamp at all. Each type needs different parsing logic.
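To make the parsing concrete, here is a minimal sketch of where that created field lives for a plain OCI/Docker v2 image, read straight from the registry HTTP API. $REGISTRY, $REPO, $TAG and $TOKEN are placeholder assumptions; a manifest list or a kaniko --reproducible build would not yield a usable value this way.

      # Hypothetical lookup of the original image creation date via the registry API.
      # 1. Fetch the manifest and extract the config blob digest.
      CONFIG_DIGEST=$(curl -s -H "Authorization: Bearer $TOKEN" \
        -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
        "https://$REGISTRY/v2/$REPO/manifests/$TAG" | jq -r '.config.digest')

      # 2. Fetch the config blob and read its "created" field
      #    (can be absent, or a fixed epoch for reproducible builds).
      curl -sL -H "Authorization: Bearer $TOKEN" \
        "https://$REGISTRY/v2/$REPO/blobs/$CONFIG_DIGEST" | jq -r '.created'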

Why not filesystem timestamps? Retrieving modification times requires Stat calls to the storage backend for every tag. For object storage (S3, GCS, Azure), these are network operations that would extend read-only time and introduce failure points for registries with hundreds of thousands of tags.
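For comparison, this is roughly what the rejected per-tag Stat approach would look like with the filesystem driver (paths assume the standard docker/distribution layout under rootdirectory); with S3, GCS or Azure, each of these lookups becomes an HTTP round trip.

      # Hypothetical per-tag modification-time lookup on the filesystem backend.
      ROOT=/mnt/registry/docker/registry/v2/repositories
      find "$ROOT" -path '*/_manifests/tags/*/current/link' -print0 |
        while IFS= read -r -d '' link; do
          # One Stat per tag; on object storage each of these is a network call
          # issued while the registry is still in read-only mode.
          stat -c '%y %n' "$link"
        done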

The real issue: Step 2 was deliberately kept simple because it's the only non-idempotent part of the import—if it fails, users must manually truncate the tags table before retrying. Adding manifest parsing introduces complexity and failure modes to the one step that can't fail gracefully. Even with this risk, we'd only get accurate timestamps for some manifest types, not a complete solution.
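For context, the manual recovery after a failed Step 2 is roughly the following (a sketch, assuming the gitlabhq_registry database name used later in this report and that gitlab-psql passes the -d/-c flags through to psql):

      # Empty the partially filled tags table by hand, then retry the import.
      sudo gitlab-psql -d gitlabhq_registry -c 'TRUNCATE TABLE tags;'
      sudo gitlab-ctl registry-database import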

The problem also diminishes over time as cleanup policies remove older images and new pushes have accurate timestamps.

We have documented this as a known limitation so users are aware of this behavior when planning migrations.

Summary

The process of enabling the Container Registry Metadata Database (CRMD) and migrating the existing images/tags causes a "reset" of the createdAt value of all tags on the GitLab instance; the image tags receive the system date-time at the moment of import as their new published value.

Example screenshots
Before migration: (screenshot "Screenshot 2024-10-17 at 23.10.18.png")
After migration: (screenshot "Screenshot 2024-10-17 at 23.10.40.png")

Background

Environment

  • GitLab v17.3.2-ee self-managed GitLab Ultimate
  • Installer: Linux Omnibus
  • OS: Debian 11
  • Resources: 1 CPU socket & 4 cores, 24 GB RAM, 200 GB SSD as root disk
  • CR: bundled with the GitLab installation, with a dedicated mounted volume of 1500 GB
  • CRMD: A separate database on the same PostgreSQL instance bundled with the installation, running alongside the GL instance on the same VM

Steps to reproduce

  1. GitLab instance v17.2 with at least 1 project that has Docker images published in the Container Registry more than 24 hours ago
  2. Upgrade to v17.3 with apt install gitlab-ee=17.3*
  3. Create a new DB for CRMD with the help of gitlab-psql: CREATE DATABASE gitlabhq_registry OWNER gitlab
  4. Follow the "One-step Migration" from the documentation:
    • Adjust /etc/gitlab/gitlab.rb to include the required sections:

      Example configuration
      registry['database'] = {
        'enabled' => false,
        'host' => '0.0.0.0',
        'port' => 5432,
        'user' => 'gitlab',
        'password' => '<super-secret>',
        'dbname' => 'gitlabhq_registry',
        'sslmode' => 'disable'
      }
      registry['storage'] = {
        'filesystem' => {
          'rootdirectory' => "/mnt/registry/"
        },
        'maintenance' => {
          'readonly' => {
            'enabled' => true
          }
        }
      }
      
    • Run gitlab-ctl registry-database migrate up

    • Run gitlab-ctl registry-database import

  5. Upon completion, the image tags show "Published an hour ago" in the GitLab UI on the Container Registry page
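For reference, the full command sequence from the steps above as it would typically be run (a sketch, assuming the standard Omnibus flow; a gitlab-ctl reconfigure is needed to apply the gitlab.rb changes before the migrate/import commands):

      # Reproduction sequence on the Omnibus package (version pin and database
      # name taken from the steps above).
      sudo apt install gitlab-ee=17.3*
      sudo gitlab-psql -c "CREATE DATABASE gitlabhq_registry OWNER gitlab;"
      sudo editor /etc/gitlab/gitlab.rb    # add the registry['database'] and readonly sections
      sudo gitlab-ctl reconfigure          # apply the configuration
      sudo gitlab-ctl registry-database migrate up
      sudo gitlab-ctl registry-database import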
API Response Example

These are the JSON payloads from a static container registry repository with just 3 images:

Before: (attached JSON payload)
After: (attached JSON payload)
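For reference, this is the kind of payload being compared, assuming the standard GitLab REST API endpoint for registry tag details (the project ID, repository ID and tag name below are placeholders); created_at is the field that flips from the original push date to the import date:

      # Hypothetical check of a single tag's created_at via the GitLab API.
      curl -s --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "https://gitlab.example.com/api/v4/projects/42/registry/repositories/17/tags/v1.0.0" \
        | jq '{name, created_at, total_size}'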

Problem

  1. The above-mentioned documentation "Enable the metadata database for Linux package installations" does not mention the described "side effect", hence it's considered a bug
  2. From a DevOps point of view, this behavior is not acceptable, since it gives misleading information about the state of the CR and its contents
  3. The worst aspect: the cleanup policies that delete image tags become temporarily useless, by:
    • clogging the CR with image tags that can't be deleted until older_than: "Xd" is reached again after some time;
    • spoiling the garbage collection jobs, since untagged images are no longer produced at the point when the tags should have been expired and recycled;
    • making step 6 of the documented cleanup policy algorithm unpredictable, since the created_date is no longer valid for tags that existed before the migration was performed;
    • ...and causing huge overhead on the storage, especially from intensively developed projects with big Docker images.

Impact example

With the following CR cleanup policy enabled for all projects

      container_expiration_policy_attributes:
        cadence: "7d"
        keep_n: 25
        older_than: "90d"
        name_regex_delete: ".*"

...the consequences might be estimated in terabytes of bloated disk space 😒. Since every migrated tag's created_at is reset to the import time, none of the pre-existing tags can match older_than: "90d" until 90 days after the migration.

Expected behavior

After enabling and migrating to the CRMD, the following values on image tags should be preserved & consistent, and correctly written into the new DB table tags:

  • API (created_at)
  • GraphQL (createdAt / publishedAt)
  • DB (created_at / updated_at)
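A possible way to spot-check this directly in the new database (a sketch, assuming the gitlabhq_registry database created above and that the registry schema exposes created_at/updated_at on the tags table):

      # Hypothetical verification of migrated timestamps in the registry metadata DB.
      sudo gitlab-psql -d gitlabhq_registry \
        -c "SELECT name, created_at, updated_at FROM tags ORDER BY created_at LIMIT 10;"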
Observed table schema after migration to CRMD

(screenshot of the registry tags table schema)
