Annotate container images with GitLab metadata
Context
You can use the GitLab container registry to publish and share container images, either from the command line or, more likely, from GitLab CI/CD.
Problem to solve
The problem is that you'd like to be able to correlate which pipeline or job is responsible for building a given image/tag, so that if you ever need to verify or troubleshoot a given tag, you can quickly jump right into the job/code that built it.
Selfishly, as the PM, it would also help me to understand product usage across a variety of different tools. This would help me to better prioritize and design future features.
Challenges
Unfortunately, the Docker client does not support adding arbitrary annotations (key/value pairs) to image manifests.
Proposal
If an image is built using a GitLab pipeline, display build details, such as pipeline_id, branch, commit and commit_sha, as part of the Container Registry UI.
- pipeline_id should link to the pipeline details page to help users troubleshoot when something has gone wrong.
- branch and commit should link to their respective repositories to help the user find/verify the code that built a specific image.
- commit_sha should be easily copyable to ensure the user can leverage this information elsewhere.
Further details
Use a third-party tool to add the annotations to images after building them with Docker or other clients. This would require you, the user, to add something like the below to your .gitlab-ci.yml (using the crane tool from Google in this example):
some_gitlab_ci_job:
  script:
    # SAME: Build and push image to registry as usual
    - docker build -t registry.gitlab.com/mygroup/myproject:latest .
    - docker push registry.gitlab.com/mygroup/myproject:latest
    # NEW: Add annotations to image (this will download, mutate and re-push image to registry)
    # A literal block scalar (|) keeps the backslash line continuations intact for the shell
    - |
      crane mutate registry.gitlab.com/mygroup/myproject:latest \
        --annotation com.gitlab.ci.pipeline_id=$CI_PIPELINE_ID \
        --annotation com.gitlab.ci.job_id=$CI_JOB_ID \
        --annotation com.gitlab.ci.user_id=$GITLAB_USER_ID
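The job above only annotates the pipeline, job and user IDs, while the proposal also mentions the branch and commit SHA. As a rough sketch of how the full argument list might be assembled, the helper below maps annotation keys to predefined GitLab CI/CD variables and skips any that are unset. The function name `build_crane_command` and the keys `com.gitlab.ci.branch` and `com.gitlab.ci.commit_sha` are assumptions made for illustration, not an agreed naming scheme.

```python
import os
import shlex

# Annotation key -> predefined GitLab CI/CD variable that supplies its value.
ANNOTATION_SOURCES = {
    "com.gitlab.ci.pipeline_id": "CI_PIPELINE_ID",
    "com.gitlab.ci.job_id": "CI_JOB_ID",
    "com.gitlab.ci.user_id": "GITLAB_USER_ID",
    # Hypothetical keys covering the remaining fields named in the proposal.
    "com.gitlab.ci.branch": "CI_COMMIT_REF_NAME",  # branch or tag name
    "com.gitlab.ci.commit_sha": "CI_COMMIT_SHA",
}


def build_crane_command(image, env=None):
    """Return the `crane mutate` argument list for the given image reference."""
    env = os.environ if env is None else env
    command = ["crane", "mutate", image]
    for key, variable in ANNOTATION_SOURCES.items():
        value = env.get(variable)
        if value:  # skip unset or empty variables instead of writing "key="
            command += ["--annotation", f"{key}={value}"]
    return command


if __name__ == "__main__":
    # Print a shell-quoted command line; a CI script could also run the list directly.
    print(shlex.join(build_crane_command("registry.gitlab.com/mygroup/myproject:latest")))
```

Building the command as a list keeps values with spaces or shell metacharacters (a branch name, for instance) from being split or interpreted by the shell.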
Pros
- This is fully compatible with the old registry and forward-compatible with the new one, with no changes required. Annotations are embedded in the manifest payload.
- On the Rails side, after getting the image manifest, the relevant annotations could be detected and their values surfaced in the image details UI, linking to CI pipelines etc.
- We can add as many annotations as we want. These are arbitrary key/value pairs whose only restriction is that both keys and values must be strings. So lots of flexibility.
- We could define our own standard annotations (com.gitlab...)
- In the future, we could extend the new registry API to allow searching for images by annotations as well.
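To make the "detect the relevant annotations and surface them" point concrete, here is a minimal sketch of that step. It is written in Python for brevity rather than as Rails code, and it assumes the `com.gitlab.ci.` prefix from the crane example plus a hypothetical `project_url` parameter; the `/-/pipelines/<id>` and `/-/jobs/<id>` paths follow the usual GitLab URL layout, but the real implementation would use the application's own route helpers.

```python
import json

# Assumed prefix, taken from the keys used in the crane example.
GITLAB_CI_PREFIX = "com.gitlab.ci."


def extract_build_details(manifest_json, project_url):
    """Return GitLab CI annotations found in a manifest, plus links where one applies."""
    manifest = json.loads(manifest_json)
    # The "annotations" field is optional in an image manifest.
    annotations = manifest.get("annotations") or {}
    details = {
        key[len(GITLAB_CI_PREFIX):]: value
        for key, value in annotations.items()
        if key.startswith(GITLAB_CI_PREFIX)
    }
    links = {}
    if "pipeline_id" in details:
        links["pipeline"] = f"{project_url}/-/pipelines/{details['pipeline_id']}"
    if "job_id" in details:
        links["job"] = f"{project_url}/-/jobs/{details['job_id']}"
    return {"details": details, "links": links}


if __name__ == "__main__":
    sample = json.dumps({
        "schemaVersion": 2,
        "annotations": {
            "com.gitlab.ci.pipeline_id": "123",
            "com.gitlab.ci.job_id": "456",
        },
    })
    print(extract_build_details(sample, "https://gitlab.com/mygroup/myproject"))
```

Images pushed without annotations simply yield empty `details` and `links`, so the UI can fall back to showing nothing extra.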
Cons
- This feature would depend on users updating their CI pipelines to use a third-party tool. This is significant. But perhaps, we could create an opt-in CI template that could be included in CI pipelines, such as the ones we already have for SAST, container scanning etc. I'm not sure if this is technically possible, but it would be neat.
Why annotate the manifest
For context, we should strive for annotations on the manifest for several reasons:
- No need to retrieve the image config on the Rails side. Today (without the DB) fetching the manifest is enough, and in the future a new registry API could let us query image metadata directly, with no manifest fetch at all. Either way, we save one network request per image;
- We don't parse/interpret image configs on the registry side, as they are uploaded as regular blobs. So we can't "explode" annotations in the config and save them on the DB for easy access/querying like we can when receiving a manifest upload (or at least not as quickly/efficiently);
- The image config is meant to contain metadata for the container runtime (platform, architecture, etc.), so adding build or user-level metadata there is far from ideal;
- OCI is pushing for standardized annotations on manifests/indexes (https://github.com/opencontainers/image-spec/blob/main/annotations.md#pre-defined-annotation-keys), so we can use the standard annotations or create our own (com.gitlab...), and this will (eventually) be seamless across providers. There are multiple exciting ideas on how to standardize the use of these, such as for tooling that detects vulnerabilities on base images by looking at these standardized annotations on manifests.
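The idea of "exploding" manifest annotations into the database on upload, so they can later be queried, can be sketched roughly as follows. This is only an illustration using an in-memory SQLite database; the table name `manifest_annotations`, its columns, and the function names are all hypothetical and say nothing about the registry's actual schema or implementation language.

```python
import json
import sqlite3


def open_index():
    """Create an in-memory table holding one row per (manifest, annotation key)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE manifest_annotations ("
        "digest TEXT NOT NULL, ann_key TEXT NOT NULL, ann_value TEXT NOT NULL, "
        "PRIMARY KEY (digest, ann_key))"
    )
    return conn


def index_manifest_annotations(conn, digest, manifest_json):
    """Store each annotation of an uploaded manifest as its own row."""
    annotations = json.loads(manifest_json).get("annotations") or {}
    conn.executemany(
        "INSERT OR REPLACE INTO manifest_annotations (digest, ann_key, ann_value) "
        "VALUES (?, ?, ?)",
        [(digest, key, value) for key, value in annotations.items()],
    )
    return len(annotations)


def find_manifests(conn, key, value):
    """Return digests of manifests carrying the given annotation key/value."""
    rows = conn.execute(
        "SELECT digest FROM manifest_annotations "
        "WHERE ann_key = ? AND ann_value = ? ORDER BY digest",
        (key, value),
    )
    return [row[0] for row in rows]
```

Because the manifest is parsed at upload time anyway, this extra step needs no additional blob fetch, which is the contrast with annotations buried in the image config.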