Refine cache entries tables partitioning
Everyone can contribute. Help move this issue forward while earning points, leveling up and collecting rewards.
In !174985 (comment 2916914136), we re-assessed the current partitioning of the cache entries table in virtual registries.
The summary of that discussion is:
- partitioning is valuable for the cache entries table.
- when initially set up (this MR), a few access patterns were missed because not implemented at that time, namely (batch iteration, pagination and geo support). Also, the composite primary key (with the
statuscolumn) is bringing some pains (marking a row for destruction is not as simple asstatus = :pending_destruction). - we discussed the following solution. Hash partitioning on
group_idand primary key set to[group_id, id]withidbeing abigintcolumn auto incremented by a shared sequence for all partitions.-
group_idis a condition that we can provide in the vast majority of the current access patterns, including when doing batch iterations (cleanup jobs) or pagination. The main query (eg. get a cache entry for a given relative path and a given set of upstream ids) will still properly access a single partition if we throw in thegroup_idcondition. -
group_idcan lead to an unbalanced distribution of data. That is an acceptable tradeoff considering the other aspects. -
idis the tie breaker instead ofrelative_path. This way, we allow multiple records with identicalgroup_id,upstream_id,relative_pathandstatus. -
idis also guaranteed to be unique across the partitions and this will largely help with pagination (unlock the use of keyset pagination) and Geo support (we can use the current logic as-is, see #473033 (comment 2712560914)).
-
For data volumes, we're considering the files in the related package registry and dependency proxy on gitlab.com :
- Maven: 150 millions of files
- Npm: 6 millions of files
- Container: 2.2 millions of blobs + 320 thousands of manifests = 2.52 millions
As we can see, on the first 3 formats targeted in the virtual registries, maven is the one handling the most amount of files so cache entries of the maven virtual registry should result in a large size than the other 2 formats.
🔬 Query impact analysis
The Maven virtual registry being the more advanced one in the implementation road, that's the one we're going to consider here but keep in mind that similar queries will be triggered in the container and npm virtual registries. group_id condition can be added (which will leverage the new partitioning),
(core logic, read cache) Main read query group_id condition).
SELECT "virtual_registries_packages_maven_cache_entries".*
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 0
AND "virtual_registries_packages_maven_cache_entries"."group_id" = 33
AND "virtual_registries_packages_maven_cache_entries"."upstream_id" IN (
SELECT "virtual_registries_packages_maven_upstreams"."id"
FROM "virtual_registries_packages_maven_upstreams"
INNER JOIN "virtual_registries_packages_maven_registry_upstreams" ON "virtual_registries_packages_maven_upstreams"."id" = "virtual_registries_packages_maven_registry_upstreams"."upstream_id"
WHERE "virtual_registries_packages_maven_registry_upstreams"."registry_id" = 80
ORDER BY "virtual_registries_packages_maven_registry_upstreams"."position" ASC
)
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/relative/path/to/file'
LIMIT 1
(core logic, write cache) Find or initialize by group_id condition)
SELECT "virtual_registries_packages_maven_cache_entries".*
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."group_id" = 33
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/relative/path/to/file'
LIMIT 1
(API endpoint) List cache entries for a given upstream
SELECT "virtual_registries_packages_maven_cache_entries".*
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
ORDER BY "virtual_registries_packages_maven_cache_entries"."created_at" DESC
LIMIT 20 OFFSET 0
(API endpoint) List cache entries for a given upstream with a search pattern
SELECT "virtual_registries_packages_maven_cache_entries".*
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
AND "virtual_registries_packages_maven_cache_entries"."relative_path" ILIKE '%foobar%'
ORDER BY "virtual_registries_packages_maven_cache_entries"."created_at" DESC
LIMIT 20 OFFSET 0
(pending destruction cleanup job) Existence check of pending destruction cache entries. group_id s present in the upstream table = longer background worker but the group_id condition would be present)
SELECT 1 AS one
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2
LIMIT 1
(pending destruction cleanup job) Limited count of pending destruction cache entries. group_id s present in the upstream table = longer background worker but the group_id condition would be present)
SELECT COUNT(*)
FROM (
SELECT 1 AS one
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2
LIMIT 3
) subquery_for_count
(pending destruction cleanup job) Get the next pending destruction cache entries. group_id s present in the upstream table = longer background worker but the group_id condition would be present)
SELECT "virtual_registries_packages_maven_cache_entries".*
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."status" = 2
LIMIT 1
FOR UPDATE SKIP LOCKED
(LFK) upstream destruction which updates the status column
-- query selecting cache entries by upstream_id and status != pending destruction
(LFK) group destruction which updates the status column group_id condition already present)
-- query selecting cache entries by group_id and status != pending destruction
(purge upstream worker) each_batch loop starting from upstream.default_cache_entries
SELECT "virtual_registries_packages_maven_cache_entries"."relative_path"
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
<other conditions>
(cleanup policies) each_batch loop starting from upstream.default_cache_entries.requiring_cleanup(5)
SELECT "virtual_registries_packages_maven_cache_entries"."relative_path"
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
AND "virtual_registries_packages_maven_cache_entries"."downloaded_at" < '2025-11-29 15:04:37.646310'
<other conditions>
(model validation) uniqueness validation on [relative_path, upstream_id, status]
SELECT 1 AS one
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."relative_path" = '/foo/bar'
AND "virtual_registries_packages_maven_cache_entries"."upstream_id" = 305
AND "virtual_registries_packages_maven_cache_entries"."status" = 0
LIMIT 1
(model validation) uniqueness validation on [object_storage_key, relative_path, group_id]
SELECT 1 AS one
FROM "virtual_registries_packages_maven_cache_entries"
WHERE "virtual_registries_packages_maven_cache_entries"."object_storage_key" = 'c6/f3/c6f3ac57944a531490cd39902d0f777715fd005efac9a30622d5f5205e7f6894/virtual_registries/packages/maven/33/upstream/305/cache/entry/0b/76/8a6d6bee6111817dba3fa20c9b4cc2331d29b0b929b341ec2432f64c3e6c'
AND "virtual_registries_packages_maven_cache_entries"."relative_path" = '/foo/bar'
AND "virtual_registries_packages_maven_cache_entries"."group_id" IS NULL LIMIT 1
In totals, we would have:
-
✅ 9supported queries. "Supported" here means that the query would have thegroup_idcondition = single partition accessed. Either the existing query already has the condition or it can easily be added. -
🟠 3non supported queries as-is. This will need a refactor of the query. These are all located on a background job (= non user facing feature). -
❌ 1non supported queries. The LFK query when an upstream is destroyed can't easily be updated to have agroup_idcondition.- The bright side here is that this is, again, on a background process (= non user facing feature).
🔀 Implementation plan
NPM
Since the cache entries table doesn't exist yet, the plan here is straightforward:
- create the table with the correct structure/definition
Container
The table exists but it will require a rename as part of the local upstreams support. virtual_registries_container_cache_entries to virtual_registries_container_cache_remote_entries.
Given that here we're still implementing this feature, we can leverage the rename need:
- create the new table with the new structure/definition.
- switch all backend logic to the new table + update queries.
-
⚠️ old table handling. We can't drop the table as-is as we have files on object storage. We can leverage the cleanup worker that look for pending destruction entries. That worker should handle both tables.- update all the records of the old table to
status: pending_destruction. - wait a few milestones so that the worker empty the old table.
- drop the old table.
- update all the records of the old table to
Note that this path purges all upstream cache entries for the container virtual registry.
Maven
This is the most challenging format since we are in beta here and we do have users actively using the feature.
We have two paths available.
- Exactly the same as the container virtual registry.
- Benefits: pretty straightforward and fastest path to the new structure.
- Downsides: we purge all maven upstreams which means that for the first few requests on the maven virtual registry, they will be slower as the cache is built back again. We will probably also need a correct on usage statistics at the group level.
- Migrate the existing data. We need to:
- set up a trigger so that inserting in the old table also insert in the new table.
- have a background job that will migrate the data from the old table to the new one.
- update the backend logic to use the new table.
- drop the table as-is.
- Benefits: we keep the existing data and statistics. It should be transparent to users.
- Downsides: longer path than the first one. It might require a multi milestone process.
Aside of the virtual_registries_packages_maven_cache_entries, we do have virtual_registries_packages_maven_cache_local_entries which is also a partitioned table. It is basically a simpler model than virtual_registries_packages_maven_cache_entries but follows several similarities (has an upstream_id and group_id). We also applied the same partitioning configuration here so it would be coherent to also update this table. Fortunately, this table is for the local upstreams support which is still in progress. Currently, that table is not connected to any business logic and is empty. Thus, we're free to simply drop and recreate the table with the correct partition definition.