Implement Container Virtual Registry cache hit scenario
🌱 Context
This is a series of MRs to implement Container image virtual registry: push/pull end... (#549131)
- Add Container Virtual Registry routes and stub ... (!210622 - merged)
- Modify Container Virtual Registry HandleFileReq... (!210719 - merged)
- Implement cache hit scenario 👈 WE ARE HERE
- Implement cache miss scenario
- Cleanup: DRY up duplicated code between VirtualRegistries::ContainersController and Groups::DependencyProxy::ApplicationController: #579774
What does this MR do and why?
Implement the Container Virtual Registry pull endpoint to support docker pull requests when the image is already in the cache.
References
Database Review
This MR does not introduce new queries. It reuses the queries introduced in an earlier MR: !210719
Screenshots or screen recordings
NA
🔬 How to set up and validate locally
🛠️ 1. Setup
Enable Dependency Proxy, if not enabled.
Enable the feature flag:
Feature.enable(:container_virtual_registries)
Prepare a user, group, and create a registry with an upstream:
current_user = User.first # root or any user with read_virtual_registry permission
group = Group.find_by_full_path('your-group') # or create one
# Create a virtual registry
registry = VirtualRegistries::Container::Registry.create!(
group: group,
name: 'My Virtual Registry',
description: 'Test registry'
)
# Create an upstream pointing to Docker Hub
upstream = registry.upstreams.create!(group: group, name: 'Docker Upstream', url: 'https://registry-1.docker.io')
2. 🧑‍🍳 Create cache entries for testing
#### 2A: 🏄 The Easy Way
I already downloaded the contents of the hello-world:latest image, so that you don't have to! The contents are in this snippet.
These are the Rails console commands to download the snippet contents, create temporary files, and create cache entries from those temporary files:
require 'open-uri'
require 'base64'
require 'json'
# Download cache entries data from snippet
entries_data = JSON.parse(URI.open('https://gitlab.com/-/snippets/4905050/raw').read)
entries_data.each do |data|
  # Decode once and reuse the bytes for both the tempfile and the checksum
  content = Base64.strict_decode64(data['content'])
  # Write the decoded content to a tempfile
  file = Tempfile.new('cache_entry')
  file.binmode
  file.write(content)
  file.rewind
  uploaded = UploadedFile.new(
    file.path,
    filename: File.basename(data['relative_path']),
    sha1: Digest::SHA1.hexdigest(content)
  )
  VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
    upstream: upstream,
    current_user: current_user,
    params: {
      path: data['relative_path'],
      file: uploaded,
      etag: data['upstream_etag'],
      content_type: data['content_type']
    }
  ).execute
  file.close
  file.unlink
  puts "✓ Created cache entry: #{data['relative_path']}"
end
puts "\nDone! Created #{entries_data.count} cache entries."
One more thing: We need to force file_store = 2 (object storage), otherwise, the download does not work.
upstream.cache_entries.update_all(file_store: 2)
We'll fix this in the next MR, along with the implementation of the push endpoints.
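If the loop fails partway through, it can help to check the downloaded payload first. The following is a minimal sketch, assuming each element of the snippet JSON is a hash with the four keys the loop reads (relative_path, content, upstream_etag, content_type). CACHE_ENTRY_KEYS and entry_problems are hypothetical names introduced here for illustration, not part of GitLab.

```ruby
require 'base64'

# Keys the loop above reads from each element (taken from that loop).
CACHE_ENTRY_KEYS = %w[relative_path content upstream_etag content_type].freeze

# Hypothetical helper: returns a list of problems for one entry hash.
# An empty list means the entry looks usable.
def entry_problems(data)
  problems = CACHE_ENTRY_KEYS
    .reject { |key| data[key].is_a?(String) && !data[key].empty? }
    .map { |key| "missing #{key}" }

  begin
    # strict_decode64 raises ArgumentError on anything that is not strict Base64
    Base64.strict_decode64(data['content'].to_s)
  rescue ArgumentError
    problems << 'content is not strict Base64'
  end

  problems
end
```

In the Rails console, `entries_data.each_with_index { |data, i| puts "#{i}: #{entry_problems(data).inspect}" }` would list any malformed elements before cache entries are created.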
#### 2B: 🏋️ The Hard Way
This step is a bit more involved because we do not yet have a push endpoint. We'll manually download a docker image and stuff its contents into cache entries.
Pull the hello-world image using Docker, then extract and upload the files to the cache:
# Create our test directory
mkdir -p /tmp/vr-test
# Pull the hello-world image
docker pull hello-world:latest
docker save hello-world:latest -o /tmp/vr-test/hello-world.tar
# Extract the manifest and blobs
cd /tmp/vr-test
tar -xf hello-world.tar
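The Ruby steps that follow read files from blobs/sha256 by digest, which assumes each file there is named after the SHA-256 of its own content. A quick sanity check of the extraction can be sketched as below; mismatched_blobs is a hypothetical helper used only for illustration.

```ruby
require 'digest'

# Hypothetical helper: returns the names of entries in blob_dir whose
# SHA-256 does not match their filename (non-files are reported too).
def mismatched_blobs(blob_dir)
  Dir.children(blob_dir).sort.reject do |name|
    path = File.join(blob_dir, name)
    File.file?(path) && Digest::SHA256.file(path).hexdigest == name
  end
end

blob_dir = '/tmp/vr-test/blobs/sha256'
if Dir.exist?(blob_dir)
  bad = mismatched_blobs(blob_dir)
  puts bad.empty? ? 'All blobs match their digests' : "Mismatched blobs: #{bad.join(', ')}"
end
```

An empty result suggests the tarball was extracted intact and the digest-based lookups in the next steps should find their files.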
Now create the cache entries in Rails console:
require 'digest'
require 'json'
# Step 1: Parse index.json to identify which among the files in blobs/sha256 is the manifest list
manifest_list_digest = JSON.parse(File.read('/tmp/vr-test/index.json'))['manifests'][0]['digest']
manifest_list = JSON.parse(File.read("/tmp/vr-test/blobs/sha256/#{manifest_list_digest.split(':').last}"))
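Inspect manifest_list, which holds the parsed contents of the file in /tmp/vr-test/blobs/sha256/ named after manifest_list_digest (without the sha256: prefix).
Look for the entry for your hardware platform.
If you're on macOS with Apple Silicon, use the "arm64v8" platform. For Linux on x86-64, use "amd64".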
The ARM entry as of this writing looks like this:
{
"annotations": {
"com.docker.official-images.bashbrew.arch": "arm64v8",
"org.opencontainers.image.base.name": "scratch",
"org.opencontainers.image.created": "2025-08-13T22:59:08Z",
"org.opencontainers.image.revision": "6930d60e10e81283a57be3ee3a2b5ca328a40304",
"org.opencontainers.image.source": "https:\/\/github.com\/docker-library\/hello-world.git#6930d60e10e81283a57be3ee3a2b5ca328a40304:arm64v8\/hello-world",
"org.opencontainers.image.url": "https:\/\/hub.docker.com\/_\/hello-world",
"org.opencontainers.image.version": "linux"
},
"digest": "sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a",
"mediaType": "application\/vnd.oci.image.manifest.v1+json",
"platform": {
"architecture": "arm64",
"os": "linux",
"variant": "v8"
},
"size": 1039
},
Get the value for the "digest" key, including the sha256: prefix. For the example above, it's sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a.
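The manual lookup above could also be done in the console. This is a minimal sketch, assuming manifest_list has the shape of the excerpt shown (a manifests array whose entries carry a platform hash). select_platform_manifest is a hypothetical helper, and the digests in the sample are placeholders.

```ruby
require 'json'

# Hypothetical helper: find the manifest-list entry for a platform.
# Returns nil when no entry matches.
def select_platform_manifest(manifest_list, architecture:, os: 'linux')
  manifest_list.fetch('manifests').find do |entry|
    platform = entry['platform'] || {}
    platform['architecture'] == architecture && platform['os'] == os
  end
end

# Trimmed-down manifest list with placeholder digests, for illustration only
sample = JSON.parse(<<~JSON)
  {
    "manifests": [
      { "digest": "sha256:aaa", "platform": { "architecture": "amd64", "os": "linux" } },
      { "digest": "sha256:bbb", "platform": { "architecture": "arm64", "os": "linux", "variant": "v8" } }
    ]
  }
JSON

entry = select_platform_manifest(sample, architecture: 'arm64')
puts entry['digest']
```

With the real data, `manifest_digest = select_platform_manifest(manifest_list, architecture: 'arm64')['digest']` would set the variable used in the next steps. Note that the platform hash uses "arm64", while the bashbrew annotation says "arm64v8".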
Back to the Rails console.
# Set manifest_digest to the digest you found in the previous step, for example:
# manifest_digest = 'sha256:00abdbfd095cf666ff8523d0ac0c5776c617a50907b0c32db3225847b622ec5a'
# Step 2: Read the manifest file
manifest_content = File.read("/tmp/vr-test/blobs/sha256/#{manifest_digest.sub('sha256:', '')}")
manifest_json = JSON.parse(manifest_content)
# Step 3: Extract config and layer digests from the manifest
config_digest = manifest_json['config']['digest']
layer_digests = manifest_json['layers'].map { |layer| layer['digest'] }
puts "Found config digest: #{config_digest}"
puts "Found #{layer_digests.count} layer(s):"
layer_digests.each_with_index { |digest, i| puts " Layer #{i + 1}: #{digest}" }
# Step 4: Read the actual file contents
config_content = File.read("/tmp/vr-test/blobs/sha256/#{config_digest.sub('sha256:', '')}")
layer_contents = layer_digests.map do |digest|
File.read("/tmp/vr-test/blobs/sha256/#{digest.sub('sha256:', '')}")
end
# Step 5: Create manifest cache entry
manifest_file = Tempfile.new(['manifest', '.json'])
manifest_file.write(manifest_content)
manifest_file.rewind
manifest_uploaded = UploadedFile.new(
manifest_file.path,
filename: 'latest',
sha1: Digest::SHA1.hexdigest(manifest_content)
)
# NOTE: Change the path string if you're not using the hello-world image
VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
upstream: upstream,
current_user: current_user,
params: {
path: 'hello-world/manifests/latest',
file: manifest_uploaded,
etag: manifest_digest,
content_type: 'application/vnd.oci.image.manifest.v1+json'
}
).execute
manifest_file.close
manifest_file.unlink
puts "✓ Created manifest cache entry with digest: #{manifest_digest}"
# Step 6: Create config blob cache entry
config_file = Tempfile.new(['config', '.json'])
config_file.write(config_content)
config_file.rewind
config_uploaded = UploadedFile.new(
config_file.path,
filename: config_digest.split(':').last,
sha1: Digest::SHA1.hexdigest(config_content)
)
# NOTE: Change the path string if you're not using the hello-world image
VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
upstream: upstream,
current_user: current_user,
params: {
path: "hello-world/blobs/#{config_digest}",
file: config_uploaded,
etag: config_digest,
content_type: 'application/vnd.docker.container.image.v1+json'
}
).execute
config_file.close
config_file.unlink
puts "✓ Created config blob cache entry: #{config_digest}"
# Step 7: Create layer blob cache entries
layer_digests.each_with_index do |layer_digest, index|
layer_content = layer_contents[index]
layer_file = Tempfile.new(['layer', '.tar'])
layer_file.binmode
layer_file.write(layer_content)
layer_file.rewind
layer_uploaded = UploadedFile.new(
layer_file.path,
filename: layer_digest.split(':').last,
sha1: Digest::SHA1.hexdigest(layer_content)
)
# NOTE: Change the path string if you're not using the hello-world image
VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
upstream: upstream,
current_user: current_user,
params: {
path: "hello-world/blobs/#{layer_digest}",
file: layer_uploaded,
etag: layer_digest,
content_type: 'application/octet-stream'
}
).execute
layer_file.close
layer_file.unlink
puts "✓ Created layer blob cache entry #{index + 1}/#{layer_digests.count}: #{layer_digest}"
end
# Step 8: Verify cache entries
puts "\n" + "=" * 60
puts "Summary:"
puts " Total cache entries: #{upstream.reload.cache_entries.count}"
puts " Manifests: #{upstream.cache_entries.where('relative_path LIKE ?', '%/manifests/%').count}"
puts " Blobs: #{upstream.cache_entries.where('relative_path LIKE ?', '%/blobs/%').count}"
puts "=" * 60
One more thing: We need to force file_store = 2 (object storage), otherwise, the download does not work.
upstream.cache_entries.update_all(file_store: 2)
We'll fix this in the next MR, along with the implementation of the push endpoints.
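Steps 5 to 7 hard-code the hello-world image name in each path. For a different image, the paths could be derived from the image name and digest, as in this sketch. The helper names are hypothetical; the path formats are copied from the steps above.

```ruby
# Local file for a digest inside the extracted image,
# e.g. "sha256:abc123" -> "/tmp/vr-test/blobs/sha256/abc123"
def blob_file_path(digest, root: '/tmp/vr-test')
  algorithm, hex = digest.split(':', 2)
  File.join(root, 'blobs', algorithm, hex)
end

# Cache entry path for a config or layer blob
def blob_cache_path(image, digest)
  "#{image}/blobs/#{digest}"
end

# Cache entry path for a manifest, addressed by tag or digest
def manifest_cache_path(image, reference)
  "#{image}/manifests/#{reference}"
end

puts blob_file_path('sha256:abc123')
puts blob_cache_path('hello-world', 'sha256:abc123')
puts manifest_cache_path('hello-world', 'latest')
```

Passing these values as the path parameter, and reading file contents via blob_file_path, keeps the image name in one place.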
3. 🔓 Docker login to the virtual registry
docker login gdk.test:3000/virtual_registries/containers/<registry_id>
When the Docker client asks for the password, paste a personal access token for a user with the read_virtual_registry permission.
4. 🔽 Docker pull from the virtual registry
Pull using the tag:
docker pull gdk.test:3000/virtual_registries/containers/<registry_id>/hello-world:latest
Pull using the digest:
docker pull gdk.test:3000/virtual_registries/container/<registry_id>/hello-world@sha256:0c0473b2781ff136160d27c53706e6e593b0a7ded422170058d17101a5b92ff5
For both tests, you should see the image being pulled from the cache!
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #549131