Skip to content

chore(datastore): expose CreateOrFind to the manifest store

Jaime Martinez requested to merge 640-add-find-by-digest-to-manifests into master

Context

An issue fix(datastore): race condition pushing two or m... (#640 - closed) was reported where users were getting 500 responses trying to push images to the new registry with the metadata database enabled.

Problem

A client such as jib may push 2 or more tagged manifests that reference the same digest at the same time. However, the query that inserts records into the manifests table fails with the constraint CONSTRAINT unique_manifests_top_lvl_nmspc_id_and_repository_id_and_digest UNIQUE (top_level_namespace_id, repository_id, digest) because is not prepared for race conditions such as this. More information can be found in this comment.

Solution

We need to implement a new CreateOrFind method in the manifestStore that can safely attempt to create a record, or find it if the race condition happens. We already use a similar approach for inserting blobs, so the solution in this MR is an adapted version for manifests of the same approach used for blobs.

The new CreateOrFind method will only be used in the manifests handler as implemented in !974 (merged) only when we fail to find the manifest the first time inside that handler.


CreateOrFind

The new query handles the ON CONFLICT clause:

INSERT INTO manifests (top_level_namespace_id, repository_id, total_size, schema_version, media_type_id, digest, payload,
				configuration_media_type_id, configuration_blob_digest, configuration_payload, non_conformant, non_distributable_layers)
			VALUES ($1, $2, $3, $4, $5, decode($6, 'hex'), $7, $8, decode($9, 'hex'), $10, $11, $12)
			ON CONFLICT (top_level_namespace_id, repository_id, digest) DO NOTHING
		RETURNING
			id, created_at

Explain

https://console.postgres.ai/gitlab/gitlab-production-registry/sessions/9734/commands/34493#visualize-depesz

Query

EXPLAIN INSERT INTO 
    manifests (top_level_namespace_id, repository_id, total_size, schema_version, media_type_id, digest, payload,     configuration_media_type_id, configuration_blob_digest, configuration_payload, non_conformant, non_distributable_layers) 
VALUES 
    (1, 1, 0, 2, 3, decode('0146b163863b462eadc1b17dca382ccbfb08a853cffc79e2049607f95455cc44fa', 'hex'), '[123 34 115 99 104 101 109 97 86 101 114 115 105 111 110 34 58 50 44 34 109 101 100 105 97 84 121 112 101 34 58 34 46 46 46 34 44 34 99 111 110 102 105 103 34 58 123 125 125]', null, null, '<nil>', false, false)

ON CONFLICT (top_level_namespace_id, repository_id, digest) DO NOTHING
RETURNING id, created_at;

Plan:

 ModifyTable on public.manifests  (cost=0.00..0.01 rows=1 width=176) (actual time=50.316..50.319 rows=1 loops=1)
   Buffers: shared hit=112 read=26 dirtied=10
   I/O Timings: read=48.940 write=0.000
   ->  Result  (cost=0.00..0.01 rows=1 width=176) (actual time=0.023..0.024 rows=1 loops=1)
         Buffers: shared hit=1 dirtied=1
         I/O Timings: read=0.000 write=0.000

Recommendations:

Looks good

Statistics:

Time: 156.647 ms
  - planning: 0.092 ms
  - execution: 156.555 ms
    - I/O read: 48.940 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 112 (~896.00 KiB) from the buffer pool
  - reads: 26 (~208.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 10 (~80.00 KiB)
  - writes: 0
Edited by Jaime Martinez

Merge request reports

Loading