Refine cache entries tables partitioning

Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.

In !174985 (comment 2916914136), we re-assessed the current partitioning of the cache entries table in virtual registries.

The summary of that discussion is:

partitioning is valuable for the cache entries table.
when initially set up (this MR), a few access patterns were missed because not implemented at that time, namely (batch iteration, pagination and geo support). Also, the composite primary key (with the status column) is bringing some pains (marking a row for destruction is not as simple as status = :pending_destruction).
we discussed the following solution. Hash partitioning on group_id and primary key set to [group_id, id] with id being a bigint column auto incremented by a shared sequence for all partitions.
- group_id is a condition that we can provide in the vast majority of the current access patterns, including when doing batch iterations (cleanup jobs) or pagination. The main query (eg. get a cache entry for a given relative path and a given set of upstream ids) will still properly access a single partition if we throw in the group_id condition.
- group_id can lead to an unbalanced distribution of data. That is an acceptable tradeoff considering the other aspects.
- id is the tie breaker instead of relative_path. This way, we allow multiple records with identical group_id, upstream_id, relative_path and status.
- id is also guaranteed to be unique across the partitions and this will largely help with pagination (unlock the use of keyset pagination) and Geo support (we can use the current logic as-is, see #473033 (comment 2712560914)).

For data volumes, we're considering the files in the related package registry and dependency proxy on gitlab.com :

Maven: 150 millions of files
Npm: 6 millions of files
Container: 2.2 millions of blobs + 320 thousands of manifests = 2.52 millions

As we can see, on the first 3 formats targeted in the virtual registries, maven is the one handling the most amount of files so cache entries of the maven virtual registry should result in a large size than the other 2 formats.

🔬 Query impact analysis

The Maven virtual registry being the more advanced one in the implementation road, that's the one we're going to consider here but keep in mind that similar queries will be triggered in the container and npm virtual registries. ✅ means that the group_id condition can be added (which will leverage the new partitioning), ❌ means we can't add that condition.

(core logic, read cache) Main read query ✅ (already a group_id condition).

SELECT "virtual_registries_packages_maven_cache_entries".* 
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 0 
AND "virtual_registries_packages_maven_cache_entries"."group_id" = 33 
AND "virtual_registries_packages_maven_cache_entries"."upstream_id" IN (
  SELECT "virtual_registries_packages_maven_upstreams"."id" 
  FROM "virtual_registries_packages_maven_upstreams" 
  INNER JOIN "virtual_registries_packages_maven_registry_upstreams" ON "virtual_registries_packages_maven_upstreams"."id" = "virtual_registries_packages_maven_registry_upstreams"."upstream_id" 
  WHERE "virtual_registries_packages_maven_registry_upstreams"."registry_id" = 80 
  ORDER BY "virtual_registries_packages_maven_registry_upstreams"."position" ASC
) 
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/relative/path/to/file' 
LIMIT 1

(core logic, write cache) Find or initialize by ✅ (already using a group_id condition)

SELECT "virtual_registries_packages_maven_cache_entries".* 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."group_id" = 33 
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/relative/path/to/file' 
LIMIT 1

(API endpoint) List cache entries for a given upstream ✅

SELECT "virtual_registries_packages_maven_cache_entries".* 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."status" = 0 
ORDER BY "virtual_registries_packages_maven_cache_entries"."created_at" DESC 
LIMIT 20 OFFSET 0

(API endpoint) List cache entries for a given upstream with a search pattern ✅

SELECT "virtual_registries_packages_maven_cache_entries".* 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."status" = 0 
AND "virtual_registries_packages_maven_cache_entries"."relative_path" ILIKE '%foobar%' 
ORDER BY "virtual_registries_packages_maven_cache_entries"."created_at" DESC 
LIMIT 20 OFFSET 0

(pending destruction cleanup job) Existence check of pending destruction cache entries. 🟠 (this could be updated to loop over group_id s present in the upstream table = longer background worker but the group_id condition would be present)

SELECT 1 AS one 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2 
LIMIT 1

(pending destruction cleanup job) Limited count of pending destruction cache entries. 🟠 (this could be updated to loop over group_id s present in the upstream table = longer background worker but the group_id condition would be present)

SELECT COUNT(*) 
FROM (
  SELECT 1 AS one 
  FROM "virtual_registries_packages_maven_cache_entries" 
  WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2 
  LIMIT 3
) subquery_for_count

(pending destruction cleanup job) Get the next pending destruction cache entries. 🟠 (this could be updated to loop over group_id s present in the upstream table = longer background worker but the group_id condition would be present)

SELECT "virtual_registries_packages_maven_cache_entries".* 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2 
LIMIT 1 
FOR UPDATE SKIP LOCKED

(LFK) upstream destruction which updates the status column ❌

-- query selecting cache entries by upstream_id and status != pending destruction

(LFK) group destruction which updates the status column ✅ (group_id condition already present)

-- query selecting cache entries by group_id and status != pending destruction

(purge upstream worker) each_batch loop starting from upstream.default_cache_entries ✅

SELECT "virtual_registries_packages_maven_cache_entries"."relative_path" 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
<other conditions>

(cleanup policies) each_batch loop starting from upstream.default_cache_entries.requiring_cleanup(5) ✅

SELECT "virtual_registries_packages_maven_cache_entries"."relative_path" 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."status" = 0 
AND "virtual_registries_packages_maven_cache_entries"."downloaded_at" < '2025-11-29 15:04:37.646310'
<other conditions>

(model validation) uniqueness validation on [relative_path, upstream_id, status] ✅ (model validation will be reworked since we will have a different primary key.)

SELECT 1 AS one 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."relative_path" = '/foo/bar' 
AND "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305 
AND "virtual_registries_packages_maven_cache_entries"."status" = 0 
LIMIT 1

(model validation) uniqueness validation on [object_storage_key, relative_path, group_id] ✅ (model validation will be reworked since we will have a different primary key.)

SELECT 1 AS one 
FROM "virtual_registries_packages_maven_cache_entries" 
WHERE "virtual_registries_packages_maven_cache_entries"."object_storage_key" = 'c6/f3/c6f3ac57944a531490cd39902d0f777715fd005efac9a30622d5f5205e7f6894/virtual_registries/packages/maven/33/upstream/305/cache/entry/0b/76/8a6d6bee6111817dba3fa20c9b4cc2331d29b0b929b341ec2432f64c3e6c' 
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/foo/bar' 
AND "virtual_registries_packages_maven_cache_entries"."group_id" IS NULL LIMIT 1

In totals, we would have:

✅ 9 supported queries. "Supported" here means that the query would have the group_id condition = single partition accessed. Either the existing query already has the condition or it can easily be added.
🟠 3 non supported queries as-is. This will need a refactor of the query. These are all located on a background job (= non user facing feature).
❌ 1 non supported queries. The LFK query when an upstream is destroyed can't easily be updated to have a group_id condition.
- The bright side here is that this is, again, on a background process (= non user facing feature).

🔀 Implementation plan

NPM

Since the cache entries table doesn't exist yet, the plan here is straightforward:

create the table with the correct structure/definition

Container

The table exists but it will require a rename as part of the local upstreams support. virtual_registries_container_cache_entries to virtual_registries_container_cache_remote_entries.

Given that here we're still implementing this feature, we can leverage the rename need:

create the new table with the new structure/definition.
switch all backend logic to the new table + update queries.
⚠️ old table handling. We can't drop the table as-is as we have files on object storage. We can leverage the cleanup worker that look for pending destruction entries. That worker should handle both tables.
- update all the records of the old table to status: pending_destruction.
- wait a few milestones so that the worker empty the old table.
- drop the old table.

Note that this path purges all upstream cache entries for the container virtual registry.

Maven

This is the most challenging format since we are in beta here and we do have users actively using the feature.

We have two paths available.

Exactly the same as the container virtual registry.
- Benefits: pretty straightforward and fastest path to the new structure.
- Downsides: we purge all maven upstreams which means that for the first few requests on the maven virtual registry, they will be slower as the cache is built back again. We will probably also need a correct on usage statistics at the group level.
Migrate the existing data. We need to:
1. set up a trigger so that inserting in the old table also insert in the new table.
2. have a background job that will migrate the data from the old table to the new one.
3. update the backend logic to use the new table.
4. drop the table as-is.
- Benefits: we keep the existing data and statistics. It should be transparent to users.
- Downsides: longer path than the first one. It might require a multi milestone process.

Aside of the virtual_registries_packages_maven_cache_entries, we do have virtual_registries_packages_maven_cache_local_entries which is also a partitioned table. It is basically a simpler model than virtual_registries_packages_maven_cache_entries but follows several similarities (has an upstream_id and group_id). We also applied the same partitioning configuration here so it would be coherent to also update this table. Fortunately, this table is for the local upstreams support which is still in progress. Currently, that table is not connected to any business logic and is empty. Thus, we're free to simply drop and recreate the table with the correct partition definition.

Edited Dec 10, 2025 by David Fernandez