Draft: feat: add referrers data to internal List Tags API (!1466) · Merge requests · GitLab.org / container-registry

What does this MR do?

Extends the List Tags API to include referrer information for each tag, i.e. each manifest whose subject references the tagged manifest.

Adds a referral=true query option to the List Tags endpoint. The value must be true for referrers to be included.
For each tag in the response, the referrer info (if present) is appended in this format:

    ...
    "referrers": [
      {
        "artifactType": "application/vnd.dev.cosign.artifact.sig.v1+json",
        "digest": "sha256:eb110b254f038cd0d464b20de5070099a6f675b9d8eb7e5035d929540a041ab1"
      },
      ...
    ]
    ...

Notes

models.TagDetail was extended with the Referrers field.
The use of artifactType versus mediaType is being discussed in #1133 (closed), which follows from this issue (#1009 (closed)). This field in the internal API can easily be changed to fit our needs. We should use artifactType here.
The manifest subject is a very new feature, so there is presumably little data to analyze against. Below is an example query that ran quickly in DBL (about 0.07 ms in the plan below, returning no rows):

Query

SELECT
	encode(m.digest, 'hex') AS digest,
	mt.media_type,
	encode(ms.digest, 'hex') AS subject_digest
FROM manifests AS m
JOIN media_types AS mt ON mt.id = m.media_type_id
JOIN manifests AS ms ON m.top_level_namespace_id = ms.top_level_namespace_id
	AND m.repository_id = ms.repository_id
	AND m.subject_id = ms.id
WHERE
	m.top_level_namespace_id = 1
	AND m.repository_id = 1
	AND m.subject_id IN (
		SELECT id
		FROM manifests
		WHERE
			top_level_namespace_id = 1
			AND repository_id = 1
			AND digest IN (
				SELECT decode(n, 'hex')
				FROM unnest(ARRAY[
					'01aa681fceae02d6f4bde6ca81bc09af8a95af71f513de8df709db0543fd4db382',
					'016891eb23653d10cb8bb1cb5487565e31c4d39ce36db60584d98293fca75b0912',
					'01176ddd66b2208eb1295989898e3264a5fb24b3a50584e120edd64e54d50feffa'
				]) AS n
			)
	)
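
Rows returned by a query like the one above still need to be attached to the tags they refer to. The following Go sketch shows one way to do that by grouping rows on the subject digest. It is an assumption about how the result could be consumed, not the code in this MR: referrerRow and groupBySubject are hypothetical names, and the digests in main are placeholders.

```go
package main

import "fmt"

// referrerRow is a hypothetical struct holding one row of the query
// above: the referring manifest, its media type, and its subject.
type referrerRow struct {
	Digest        string // hex digest of the referring manifest
	MediaType     string // media type of the referring manifest
	SubjectDigest string // hex digest of the manifest it refers to
}

// groupBySubject indexes rows by subject digest, keeping the original
// row order inside each group.
func groupBySubject(rows []referrerRow) map[string][]referrerRow {
	grouped := make(map[string][]referrerRow)
	for _, r := range rows {
		grouped[r.SubjectDigest] = append(grouped[r.SubjectDigest], r)
	}
	return grouped
}

func main() {
	// Placeholder values for illustration only.
	rows := []referrerRow{
		{Digest: "sig-1", MediaType: "type-a", SubjectDigest: "subject-a"},
		{Digest: "sig-2", MediaType: "type-a", SubjectDigest: "subject-b"},
		{Digest: "sig-3", MediaType: "type-b", SubjectDigest: "subject-a"},
	}

	grouped := groupBySubject(rows)
	for _, r := range grouped["subject-a"] {
		fmt.Println(r.Digest, r.MediaType)
	}
}
```

A tag whose manifest digest has no entry in the map simply has no referrers, which matches the "if present" wording in the description.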

Plan

Nested Loop  (cost=4.31..12.63 rows=1 width=107) (actual time=0.066..0.068 rows=0 loops=1)
 Buffers: shared hit=7
 I/O Timings: read=0.000 write=0.000
 ->  Nested Loop Semi Join  (cost=3.88..9.17 rows=1 width=109) (actual time=0.066..0.068 rows=0 loops=1)
       Buffers: shared hit=7
       I/O Timings: read=0.000 write=0.000
       ->  Hash Join  (cost=3.46..5.64 rows=1 width=101) (actual time=0.065..0.067 rows=0 loops=1)
             Hash Cond: (mt.id = m.media_type_id)
             Buffers: shared hit=7
             I/O Timings: read=0.000 write=0.000
             ->  Seq Scan on public.media_types mt  (cost=0.00..1.85 rows=85 width=45) (actual time=0.007..0.007 rows=1 loops=1)
                   Buffers: shared hit=1
                   I/O Timings: read=0.000 write=0.000
             ->  Hash  (cost=3.44..3.44 rows=1 width=60) (actual time=0.036..0.036 rows=0 loops=1)
                   Buckets: 1024  Batches: 1  Memory Usage: 8kB
                   Buffers: shared hit=6
                   I/O Timings: read=0.000 write=0.000
                   ->  Index Scan using manifests_p_56_top_level_namespace_id_repository_id_id_conf_key on partitions.manifests_p_56 m  (cost=0.42..3.44 rows=1 width=60) (actual time=0.036..0.036 rows=0 loops=1)
                         Index Cond: ((m.top_level_namespace_id = 1) AND (m.repository_id = 1))
                         Buffers: shared hit=6
                         I/O Timings: read=0.000 write=0.000
       ->  Nested Loop Semi Join  (cost=0.43..3.52 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=0)
             I/O Timings: read=0.000 write=0.000
             ->  Index Scan using manifests_p_56_top_level_namespace_id_repository_id_id_conf_key on partitions.manifests_p_56 manifests  (cost=0.42..3.44 rows=1 width=42) (actual time=0.000..0.000 rows=0 loops=0)
                   Index Cond: ((manifests.top_level_namespace_id = 1) AND (manifests.repository_id = 1))
                   I/O Timings: read=0.000 write=0.000
             ->  Function Scan on unnest n  (cost=0.00..0.03 rows=3 width=32) (actual time=0.000..0.000 rows=0 loops=0)
                   I/O Timings: read=0.000 write=0.000
 ->  Index Scan using manifests_p_56_top_level_namespace_id_repository_id_id_conf_key on partitions.manifests_p_56 ms  (cost=0.42..3.44 rows=1 width=58) (actual time=0.000..0.000 rows=0 loops=0)
       Index Cond: ((ms.top_level_namespace_id = 1) AND (ms.repository_id = 1))
       I/O Timings: read=0.000 write=0.000

Author checklist

Reviewer checklist

Ensure the commit and MR title are still accurate.
If the change contains a breaking change, apply the breaking change label.
If the change is considered high risk, apply the label high-risk-change.
Identify if the change can be rolled back safely. (note: all other reasons for not being able to rollback will be sufficiently captured by major version changes).

If the MR introduces database schema migrations:

Ensure the commit and MR title start with fix:, feat:, or perf: so that the change appears on the Changelog.

If the changes cannot be rolled back, follow these steps:

If not, apply the label cannot-rollback.
Add a section to the MR description that includes the following details:
- The reasoning behind why a release containing the presented MR cannot be rolled back (e.g. schema migrations or changes to the FS structure)
- Detailed steps to revert/disable a feature introduced by the same change where a migration cannot be rolled back. (note: ideally MRs containing schema migrations should not contain feature changes.)
- Ensure this MR does not add code that depends on these changes that cannot be rolled back.

Related to #1009 (closed)

Edited Oct 16, 2023 by Aaron Huntsman
