Implement Container Virtual Registry cache hit scenario

🌱 Context

This is one of a series of MRs to implement the Container image virtual registry: push/pull end... (#549131)

What does this MR do and why?

Implement the Container Virtual Registry pull endpoint to support docker pull requests when the image is already in the cache.

References

#549131

Database Review

This MR does not introduce new queries. It reuses the queries introduced in an earlier MR: !210719

Screenshots or screen recordings

NA

🔬 How to set up and validate locally

1. 🛠️ Setup

Enable the Dependency Proxy if it is not already enabled.

Enable the feature flag:

Feature.enable(:container_virtual_registries)

Prepare a user, group, and create a registry with an upstream:

current_user = User.first # root or any user with read_virtual_registry permission
group = Group.find_by_full_path('your-group') # or create one

# Create a virtual registry
registry = VirtualRegistries::Container::Registry.create!(
  group: group,
  name: 'My Virtual Registry',
  description: 'Test registry'
)

# Create an upstream pointing to Docker Hub
upstream = registry.upstreams.create!(group: group, name: 'Docker Upstream', url: 'https://registry-1.docker.io')

2. 🧑‍🍳 Create cache entries for testing

#### 2A: 🏄 The Easy Way

I already downloaded the contents of the hello-world:latest image, so that you don't have to! The contents are in this snippet.

These are the Rails console commands to download the snippet contents, create temporary files, and create cache entries from those temporary files:

require 'open-uri'
require 'base64'
require 'json'

# Download cache entries data from snippet
entries_data = JSON.parse(URI.open('https://gitlab.com/-/snippets/4905050/raw').read)

entries_data.each do |data|
  # Decode once and reuse the bytes for both the file body and the checksum
  content = Base64.strict_decode64(data['content'])

  # Write the decoded content to a tempfile
  file = Tempfile.new('cache_entry')
  file.binmode
  file.write(content)
  file.rewind

  uploaded = UploadedFile.new(
    file.path,
    filename: File.basename(data['relative_path']),
    sha1: Digest::SHA1.hexdigest(content)
  )
  
  VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
    upstream: upstream,
    current_user: current_user,
    params: {
      path: data['relative_path'],
      file: uploaded,
      etag: data['upstream_etag'],
      content_type: data['content_type']
    }
  ).execute
  
  file.close
  file.unlink
  
  puts "✓ Created cache entry: #{data['relative_path']}"
end

puts "\nDone! Created #{entries_data.count} cache entries."

One more thing: We need to force file_store = 2 (object storage), otherwise, the download does not work.

upstream.cache_entries.update_all(file_store: 2)

We'll fix this in the next MR, along with the implementation of the push endpoints.
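The loop above expects each snippet entry to carry four keys: relative_path, content (Base64-encoded), upstream_etag, and content_type. For illustration only, here is a sketch of how an entry in that shape could be built from raw bytes. build_entry is a hypothetical helper, not part of GitLab, and how the snippet was actually generated is not stated in this MR.

```ruby
require 'base64'
require 'json'

# Hypothetical helper: wrap raw bytes in a hash with the same keys
# that the import loop reads from each snippet entry.
def build_entry(relative_path:, bytes:, etag:, content_type:)
  {
    'relative_path' => relative_path,
    'content' => Base64.strict_encode64(bytes), # binary-safe, no line breaks
    'upstream_etag' => etag,
    'content_type' => content_type
  }
end
```

An array of such hashes passed through JSON.generate would then parse back into the structure that entries_data.each iterates over.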

#### 2B: 🏋️ The Hard Way

This step is a bit more involved because we do not yet have a push endpoint. We'll manually download a docker image and stuff its contents into cache entries.

Pull the hello-world image using Docker, then extract and upload the files to the cache:

# Create our test directory
mkdir -p /tmp/vr-test

# Pull the hello-world image
docker pull hello-world:latest
docker save hello-world:latest -o /tmp/vr-test/hello-world.tar

# Extract the manifest and blobs
cd /tmp/vr-test
tar -xf hello-world.tar
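
Before creating cache entries, it may be worth confirming that the extraction produced the layout the next steps rely on, where each file under blobs/sha256 is named after the SHA-256 of its own contents. A minimal sketch, assuming that layout; mismatched_blobs is a hypothetical helper, not part of GitLab:

```ruby
require 'digest'

# Hypothetical helper: returns the names of blob files whose SHA-256
# does not match their filename. Assumes blobs are stored as
# <dir>/blobs/sha256/<hex digest>.
def mismatched_blobs(dir)
  bad = Dir.glob(File.join(dir, 'blobs', 'sha256', '*')).reject do |path|
    Digest::SHA256.file(path).hexdigest == File.basename(path)
  end
  bad.map { |path| File.basename(path) }
end
```

Calling mismatched_blobs('/tmp/vr-test') should return an empty array for an intact extraction.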

Now create the cache entries in Rails console:

require 'digest'
require 'json'

# Step 1: Parse index.json to identify which among the files in blobs/sha256 is the manifest list
manifest_list_digest = JSON.parse(File.read('/tmp/vr-test/index.json'))['manifests'][0]['digest']
manifest_list = JSON.parse(File.read("/tmp/vr-test/blobs/sha256/#{manifest_list_digest.split(':').last}"))

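Open the manifest list JSON file, /tmp/vr-test/blobs/sha256/<manifest_list_digest without the "sha256:" prefix>, or inspect the manifest_list variable parsed above.

Look for the entry for your hardware platform.

If you're on an Apple Silicon Mac, use the "arm64v8" entry. On an x86-64 machine (for example, most Linux hosts or an Intel Mac), use "amd64".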

The ARM entry as of this writing looks like this:

        {
            "annotations": {
                "com.docker.official-images.bashbrew.arch": "arm64v8",
                "org.opencontainers.image.base.name": "scratch",
                "org.opencontainers.image.created": "2025-08-13T22:59:08Z",
                "org.opencontainers.image.revision": "6930d60e10e81283a57be3ee3a2b5ca328a40304",
                "org.opencontainers.image.source": "https:\/\/github.com\/docker-library\/hello-world.git#6930d60e10e81283a57be3ee3a2b5ca328a40304:arm64v8\/hello-world",
                "org.opencontainers.image.url": "https:\/\/hub.docker.com\/_\/hello-world",
                "org.opencontainers.image.version": "linux"
            },
            "digest": "sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a",
            "mediaType": "application\/vnd.oci.image.manifest.v1+json",
            "platform": {
                "architecture": "arm64",
                "os": "linux",
                "variant": "v8"
            },
            "size": 1039
        },

Copy the full value of the "digest" key, including the sha256: prefix. For the example above, it's sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a.
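
Instead of reading the JSON by eye, the lookup can also be sketched in the Rails console. This assumes manifest_list is the parsed hash from Step 1 and that each entry has the "platform" and "digest" keys shown in the example; platform_manifest_digest is a hypothetical helper, not part of GitLab:

```ruby
# Hypothetical helper: return the digest of the first manifest list
# entry that matches the given platform, or nil when none matches.
def platform_manifest_digest(manifest_list, architecture:, os: 'linux')
  entry = manifest_list.fetch('manifests', []).find do |m|
    platform = m['platform'] || {}
    platform['architecture'] == architecture && platform['os'] == os
  end
  entry && entry['digest']
end
```

For example, platform_manifest_digest(manifest_list, architecture: 'arm64') would select the ARM entry shown above, and architecture: 'amd64' the x86-64 one.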

Back to the Rails console.

# Assign the digest you copied in the previous step, including the sha256: prefix.
# The value below is the ARM example from above; replace it with your own.
manifest_digest = 'sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a'

# Step 2: Read the manifest file
manifest_content = File.read("/tmp/vr-test/blobs/sha256/#{manifest_digest.sub('sha256:', '')}")
manifest_json = JSON.parse(manifest_content)

# Step 3: Extract config and layer digests from the manifest
config_digest = manifest_json['config']['digest']
layer_digests = manifest_json['layers'].map { |layer| layer['digest'] }

puts "Found config digest: #{config_digest}"
puts "Found #{layer_digests.count} layer(s):"
layer_digests.each_with_index { |digest, i| puts "  Layer #{i + 1}: #{digest}" }

# Step 4: Read the actual file contents
config_content = File.read("/tmp/vr-test/blobs/sha256/#{config_digest.sub('sha256:', '')}")
# Layers are binary archives, so read them as raw bytes
layer_contents = layer_digests.map do |digest|
  File.binread("/tmp/vr-test/blobs/sha256/#{digest.sub('sha256:', '')}")
end

# Step 5: Create manifest cache entry
manifest_file = Tempfile.new(['manifest', '.json'])
manifest_file.write(manifest_content)
manifest_file.rewind

manifest_uploaded = UploadedFile.new(
  manifest_file.path,
  filename: 'latest',
  sha1: Digest::SHA1.hexdigest(manifest_content)
)

# NOTE: Change the path string if you're not using the hello-world image
VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
  upstream: upstream,
  current_user: current_user,
  params: {
    path: 'hello-world/manifests/latest',
    file: manifest_uploaded,
    etag: manifest_digest,
    content_type: 'application/vnd.oci.image.manifest.v1+json'
  }
).execute

manifest_file.close
manifest_file.unlink

puts "✓ Created manifest cache entry with digest: #{manifest_digest}"

# Step 6: Create config blob cache entry
config_file = Tempfile.new(['config', '.json'])
config_file.write(config_content)
config_file.rewind

config_uploaded = UploadedFile.new(
  config_file.path,
  filename: config_digest.split(':').last,
  sha1: Digest::SHA1.hexdigest(config_content)
)

# NOTE: Change the path string if you're not using the hello-world image
VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
  upstream: upstream,
  current_user: current_user,
  params: {
    path: "hello-world/blobs/#{config_digest}",
    file: config_uploaded,
    etag: config_digest,
    content_type: 'application/vnd.docker.container.image.v1+json'
  }
).execute

config_file.close
config_file.unlink

puts "✓ Created config blob cache entry: #{config_digest}"

# Step 7: Create layer blob cache entries
layer_digests.each_with_index do |layer_digest, index|
  layer_content = layer_contents[index]
  
  layer_file = Tempfile.new(['layer', '.tar'])
  layer_file.binmode
  layer_file.write(layer_content)
  layer_file.rewind
  
  layer_uploaded = UploadedFile.new(
    layer_file.path,
    filename: layer_digest.split(':').last,
    sha1: Digest::SHA1.hexdigest(layer_content)
  )
  
  # NOTE: Change the path string if you're not using the hello-world image
  VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
    upstream: upstream,
    current_user: current_user,
    params: {
      path: "hello-world/blobs/#{layer_digest}",
      file: layer_uploaded,
      etag: layer_digest,
      content_type: 'application/octet-stream'
    }
  ).execute
  
  layer_file.close
  layer_file.unlink
  
  puts "✓ Created layer blob cache entry #{index + 1}/#{layer_digests.count}: #{layer_digest}"
end

# Step 8: Verify cache entries
puts "\n" + "=" * 60
puts "Summary:"
puts "  Total cache entries: #{upstream.reload.cache_entries.count}"
puts "  Manifests: #{upstream.cache_entries.where('relative_path LIKE ?', '%/manifests/%').count}"
puts "  Blobs: #{upstream.cache_entries.where('relative_path LIKE ?', '%/blobs/%').count}"
puts "=" * 60

One more thing: We need to force file_store = 2 (object storage), otherwise, the download does not work.

upstream.cache_entries.update_all(file_store: 2)

We'll fix this in the next MR, along with the implementation of the push endpoints.

3. 🔓 Docker login to the virtual registry

docker login gdk.test:3000/virtual_registries/container/<registry_id>

When the Docker client prompts for a password, paste a personal access token that belongs to a user with the read_virtual_registry permission.

4. 🔽 Docker pull from the virtual registry

Pull using the tag:

docker pull gdk.test:3000/virtual_registries/container/<registry_id>/hello-world:latest

Pull using the digest:

docker pull gdk.test:3000/virtual_registries/container/<registry_id>/hello-world@sha256:0c0473b2781ff136160d27c53706e6e593b0a7ded422170058d17101a5b92ff5

For both tests, you should see the image being pulled from the cache! 🎉
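
The value after the @ in a pull-by-digest reference is the SHA-256 of a manifest's raw bytes. If you built the cache entries yourself, a sketch like the following can compute that value from a file on disk. content_digest is a hypothetical helper, and whether the virtual registry resolves a given digest depends on which cache entries exist.

```ruby
require 'digest'

# Hypothetical helper: format the SHA-256 of some bytes in the
# 'sha256:<hex>' form used in image references.
def content_digest(bytes)
  "sha256:#{Digest::SHA256.hexdigest(bytes)}"
end
```

For example, content_digest(File.binread(path)) for a manifest file under /tmp/vr-test/blobs/sha256 should return "sha256:" followed by that file's name.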

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #549131

Edited by Radamanthus Batnag
