Add service for creating cache entries in container virtual registry

What does this MR do and why?

This adds a service class for creating Container Virtual Registry cache entries (model: ::VirtualRegistries::Container::Cache::Entry).

The service mirrors the equivalent service for the Maven Virtual Registry.

Implementation Notes

  • There's one small change from the Maven Virtual Registry code: following a GitLab Duo review recommendation, the retry logic in ::VirtualRegistries::Container::Cache::Entry.create_or_update_by! now has a retry limit.
  • There's a lot of code duplication:
    • in the model code: ::VirtualRegistries::Packages::Maven::Cache::Entry and ::VirtualRegistries::Container::Cache::Entry
    • in the service code: ::VirtualRegistries::Packages::Maven::Cache::Entries::CreateOrUpdateService and ::VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService
    • in the specs for these models and services
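
To illustrate the retry limit mentioned in the first note, here is a minimal sketch of bounded retry logic in plain Ruby. It is not the model's actual code: ConflictError, MAX_RETRIES, and with_retry_limit are hypothetical names, and the limit of 3 is an assumed value that this description does not state.

```ruby
# Hypothetical stand-in for the error raised when two writers race on the
# same cache entry; the real model rescues a Rails exception instead.
class ConflictError < StandardError; end

# Assumed limit for illustration; the MR description does not state the value.
MAX_RETRIES = 3

# Runs the block and retries it on ConflictError, at most max_retries times.
# Once the limit is exhausted, the last error is re-raised to the caller.
def with_retry_limit(max_retries: MAX_RETRIES)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue ConflictError
    retry if attempts <= max_retries
    raise
  end
end

# Usage: the first two attempts conflict, the third succeeds.
result = with_retry_limit do |attempt|
  raise ConflictError, 'entry already taken' if attempt < 3

  "saved on attempt #{attempt}"
end
puts result
```

Without the limit, a conflict that never clears would loop forever; with it, the caller sees the error after a fixed number of attempts.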

We're deliberately allowing this duplication for now. We do not want to conflate these two goals within one MR:

  • 1️⃣ Implementing Container Virtual Registry, and
  • 2️⃣ Deduplicating code and building the foundation for implementing NPM Virtual Registry and other virtual registries.

For goal 2️⃣, we'll open follow-up refactoring MRs like this one.

References

#549103 (closed)

Screenshots or screen recordings

NA

🧰 How to set up and validate locally


The service is not yet used by any API endpoint. We can test it in the Rails console with code similar to how the service will be called once it is integrated into an API endpoint.

Here's how the Maven service is used in the API:

service_response = ::VirtualRegistries::Packages::Maven::Cache::Entries::CreateOrUpdateService.new(
  upstream: target_upstream,
  current_user: current_user,
  params: declared_params.merge(etag: etag, content_type: content_type)
).execute

source

Based on the above, here are the steps to test the service from the Rails console:

  1. Download the Alpine container image manifest. We'll use this as the test file to be processed by the service.
docker manifest inspect alpine:latest > alpine-manifest.json
  2. Get a handle to an existing ::VirtualRegistries::Container::Registry object, or create one.
registry = ::VirtualRegistries::Container::Registry.last

or

root_group = Group.first # This should be a top-level group
registry = ::VirtualRegistries::Container::Registry.create!(group: root_group, name: root_group.name)
  3. Get a handle to an existing ::VirtualRegistries::Container::Upstream object, or create one.
upstream = ::VirtualRegistries::Container::Upstream.last

or

root_group = Group.first # This should be a top-level group
url = "https://us-central1-docker.pkg.dev/my-project-id/my-repo/my-app:latest"
upstream = ::VirtualRegistries::Container::Upstream.create!(group: root_group, url: url)
  4. Prepare the params hash.
uploaded_file = UploadedFile.new('alpine-manifest.json', sha1: '92cfceb39d57d914ed8b14d0e37643de0797ae56')

params = {
  id: registry.id,
  path: 'alpine-manifest.json',
  file: uploaded_file,
  etag: nil,
  content_type: 'text/json'
}
  5. Call the service: happy path.
service_response = ::VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
  upstream: upstream,
  current_user: User.first,
  params: params
).execute

Success Response:

=> #<ServiceResponse:0x000000017a5f7460
 @http_status=:ok,
 @message=nil,
 @payload=
  {:cache_entry=>
    #<VirtualRegistries::Container::Cache::Entry:0x00000001686194c8
     group_id: 22,
     upstream_id: 14,
     downloads_count: 0,
     upstream_checked_at: Wed, 24 Sep 2025 12:18:09.168020000 UTC +00:00,
     downloaded_at: Wed, 24 Sep 2025 12:15:39.451813000 UTC +00:00,
     created_at: Wed, 24 Sep 2025 12:15:39.454291000 UTC +00:00,
     updated_at: Wed, 24 Sep 2025 12:18:09.176758000 UTC +00:00,
     file_store: 1,
     size: 5001,
     status: "default",
     file_md5: nil,
     file_sha1: "92cfceb39d57d914ed8b14d0e37643de0797ae56",
     upstream_etag: nil,
     content_type: "[FILTERED]",
     relative_path: "/alpine-manifest.json",
     file: "#<UploadedFile:0x000000031efb9110>",
     object_storage_key: "[FILTERED]">},
 @reason=nil,
 @status=:success>
  6. Call the service: failure because the file has no sha1.
uploaded_file = UploadedFile.new('README.md')

params = {
  id: registry.id,
  path: 'README.md',
  file: uploaded_file,
  etag: nil,
  content_type: 'text/plain'
}

service_response = ::VirtualRegistries::Container::Cache::Entries::CreateOrUpdateService.new(
  upstream: upstream,
  current_user: User.first,
  params: params
).execute

Error Response:

=> #<ServiceResponse:0x000000031ccf2550
 @http_status=nil,
 @message="Validation failed: File sha1 can't be blank, File sha1 is the wrong length (should be 40 characters)",
 @payload={},
 @reason=:persistence_error,
 @status=:error>
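
The two validation messages in the error response say that the file's sha1 must be present and 40 characters long. As a sketch of how to get such a value for a local test file, the snippet below computes a SHA-1 hex digest with Ruby's standard Digest library and checks its shape. The helper names file_sha1 and valid_sha1? are hypothetical, and the check is stricter than the model's (it also requires hex characters). How the API endpoint will obtain the digest is not stated in this MR.

```ruby
require 'digest'
require 'tempfile'

SHA1_HEX_LENGTH = 40 # a SHA-1 digest is 20 bytes, i.e. 40 hex characters

# Returns the hex-encoded SHA-1 digest of the file at the given path.
def file_sha1(path)
  Digest::SHA1.file(path).hexdigest
end

# True when the value is present and consists of exactly 40 hex characters.
# (The model's messages only mention presence and length; the hex check is
# an extra assumption of this sketch.)
def valid_sha1?(value)
  !value.nil? && value.match?(/\A\h{#{SHA1_HEX_LENGTH}}\z/)
end

# Usage with a throwaway file standing in for alpine-manifest.json.
Tempfile.create('manifest') do |f|
  f.write('{"schemaVersion": 2}')
  f.flush
  sha1 = file_sha1(f.path)
  puts sha1
  puts valid_sha1?(sha1)
  puts valid_sha1?(nil)
end
```

In the Rails console, the digest computed this way could be passed as the sha1: argument shown in step 4.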

💾 Database Review

Inserting a new cache entry

INSERT INTO
    "virtual_registries_container_cache_entries" (
        "group_id",
        "upstream_id",
        "upstream_checked_at",
        "created_at",
        "updated_at",
        "size",
        "file_sha1",
        "content_type",
        "relative_path",
        "file",
        "object_storage_key"
    )
VALUES
    (
        22,
        14,
        '2025-09-24 13:18:39.856780',
        '2025-09-24 13:18:39.948658',
        '2025-09-24 13:18:39.948658',
        10,
        '\x92cfceb39d57d914ed8b14d0e37643de0797ae56',
        'text/plain',
        '/VERSION',
        '#<UploadedFile:0x0000000319978670>',
        '78/5f/785f3ec7eb32f30b90cd0fcf3657d388b5ff4297f2f9716ff66e9b69c05ddd09/virtual_registries/container/22/upstream/14/cache/entry/4c/15/04dba58b00e89da6256cb17f84ccb7b7195f18f11329d0b9948cbb9592e9'
    ) RETURNING "upstream_checked_at",
    "downloaded_at"

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/43730/commands/133600

NOTE: virtual_registries_container_cache_entries is a new table with no records.

Here's the query plan analysis on my local GDK:

 Insert on virtual_registries_container_cache_entries  (cost=0.00..0.01 rows=1 width=290) (actual time=2.517..2.518 rows=1 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=290) (actual time=0.002..0.002 rows=1 loops=1)
 Planning Time: 0.032 ms
 Trigger for constraint fk_rails_5c3a01ae96 on virtual_registries_container_cache_entries_15: time=6.358 calls=1
 Execution Time: 8.905 ms
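
The file_sha1 value in the INSERT above is written as '\x' followed by hex digits, which is PostgreSQL's hex format for binary (bytea) values. The sketch below shows the conversion between a 40-character hex digest and its 20 raw bytes in plain Ruby. The helper names are hypothetical, and this is not necessarily how the model performs the conversion.

```ruby
# Converts a hex SHA-1 digest to its raw byte form.
def hex_to_bytes(hex)
  [hex].pack('H*')
end

# Formats raw bytes as a PostgreSQL bytea hex literal (\x followed by hex).
def bytea_hex_literal(bytes)
  "\\x#{bytes.unpack1('H*')}"
end

hex = '92cfceb39d57d914ed8b14d0e37643de0797ae56'
raw = hex_to_bytes(hex)
puts raw.bytesize           # 20
puts bytea_hex_literal(raw) # \x92cfceb39d57d914ed8b14d0e37643de0797ae56
```

Storing the digest as 20 raw bytes rather than 40 text characters halves the space the value needs.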

Updating an existing cache entry

UPDATE
    "virtual_registries_container_cache_entries"
SET
    "upstream_checked_at" = '2025-09-24 13:16:52.151318',
    "updated_at" = '2025-09-24 13:16:52.178654',
    "size" = 5001,
    "file" = '#<UploadedFile:0x00000003182d3de8>'
WHERE
    "virtual_registries_container_cache_entries"."upstream_id" = 14
    AND "virtual_registries_container_cache_entries"."relative_path" = '/alpine-manifest.json'
    AND "virtual_registries_container_cache_entries"."status" = 0

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/43730/commands/133604

NOTE: virtual_registries_container_cache_entries is a new table with no records.

Here's the query plan analysis on my local GDK:

 Update on virtual_registries_container_cache_entries  (cost=0.14..2.17 rows=0 width=0) (actual time=1.682..1.682 rows=0 loops=1)
   Update on virtual_registries_container_cache_entries_13 virtual_registries_container_cache_entries_1
   ->  Index Scan using virtual_registries_container_relative_path_object_storage_idx13 on virtual_registries_container_cache_entries_13 virtual_registries_container_cache_entries_1  (cost=0.14..2.17 rows=1 width=62) (actual time=0.724..0.725 rows=1 loops=1)
         Index Cond: (relative_path = '/alpine-manifest.json'::text)
         Filter: ((upstream_id = 14) AND (status = 0))
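
Taken together, the INSERT and UPDATE above suggest a create-or-update keyed on upstream_id, relative_path, and the default status (0). The sketch below models that behavior with an in-memory hash instead of the database. Entry and EntryStore are hypothetical stand-ins, and the leading slash on relative_path is inferred from the console output earlier (path 'alpine-manifest.json' became "/alpine-manifest.json"); none of this is the service's actual code.

```ruby
# Hypothetical in-memory stand-in for a cache entry row.
Entry = Struct.new(:upstream_id, :relative_path, :status, :size, keyword_init: true)

class EntryStore
  DEFAULT_STATUS = 0 # matches "status" = 0 in the UPDATE's WHERE clause

  def initialize
    @entries = {}
  end

  # Updates the default-status entry for (upstream_id, relative_path) if one
  # exists, otherwise inserts a new one. Returns [entry, :created or :updated].
  def create_or_update(upstream_id:, path:, size:)
    relative_path = "/#{path}" # assumption inferred from the console output
    key = [upstream_id, relative_path, DEFAULT_STATUS]

    if (entry = @entries[key])
      entry.size = size
      [entry, :updated]
    else
      entry = Entry.new(upstream_id: upstream_id, relative_path: relative_path,
                        status: DEFAULT_STATUS, size: size)
      @entries[key] = entry
      [entry, :created]
    end
  end
end

# Usage: the first call inserts, the second call updates the same entry.
store = EntryStore.new
first, first_action = store.create_or_update(upstream_id: 14, path: 'alpine-manifest.json', size: 4000)
second, second_action = store.create_or_update(upstream_id: 14, path: 'alpine-manifest.json', size: 5001)
puts first_action
puts second_action
puts second.size
```

The real table relies on the database to enforce uniqueness of this key, which is why a concurrent writer can cause the conflict that the retry logic handles.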

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #549103 (closed)

Edited by Radamanthus Batnag
