CNG: Container Registry images are not being built correctly
Summary
After gitlab-org/build/CNG!382 was merged, I tested the image from the CNG registry for v2.8.0-gitlab and found that the new feature we introduced in this release (a new tag delete API route) wasn't available. I then tested the image for the previous release, v2.7.8-gitlab, and the change we introduced there (in garbage collection) was also missing.
After some debugging (thanks @hswimelar for your help), we found that the CNG images are not being built correctly. They're being built based on the upstream repository source code instead of our fork.
We only started introducing visible changes in v2.7.8-gitlab, so this subtle issue went unnoticed until now.
Steps to reproduce
In v2.8.0-gitlab we introduced a new tag delete route on the API. We can reproduce this issue by invoking the registry API on a container based on the CNG image and seeing that the new route is not available:
- Create registry config file:
cat <<EOT > config.yml
version: 0.1
storage:
  filesystem:
    rootdirectory: /tmp
http:
  addr: 0.0.0.0:5000
EOT
- Create registry container based on the CNG image:
docker run -d \
  -p 5000:5000 \
  -v $PWD/config.yml:/etc/docker/registry/config.yml \
  registry.gitlab.com/gitlab-org/build/cng/gitlab-container-registry:v2.8.0-gitlab
- Invoke the registry API:
curl -i -X OPTIONS http://0.0.0.0:5000/v2/name/tags/reference/tag
HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
X-Content-Type-Options: nosniff
Date: Wed, 12 Feb 2020 16:47:51 GMT
Content-Length: 19
404 page not found
The request above should return a 200 OK response with the change we introduced in v2.8.0-gitlab.
We get the expected response if we repeat the test after cloning the registry repository and building the image from its local Dockerfile:
- Clone repository:
mkdir -p $GOPATH/src/github.com/docker
cd $GOPATH/src/github.com/docker
git clone git@gitlab.com:gitlab-org/container-registry.git --branch v2.8.0-gitlab --single-branch distribution
cd distribution
git rev-parse HEAD
20afb5a933381ee731c3fdd9b50afde951188a92
- Build Docker image locally:
docker build -t registry:v2.8.0-gitlab-local .
- Create registry container based on the local image:
docker run -d \
  -p 5000:5000 \
  -v $PWD/config.yml:/etc/docker/registry/config.yml \
  registry:v2.8.0-gitlab-local
- Invoke the registry API:
curl -i -X OPTIONS http://0.0.0.0:5000/v2/name/tags/reference/tag
HTTP/1.1 200 OK
Allow: DELETE
Docker-Distribution-Api-Version: registry/2.0
Date: Wed, 12 Feb 2020 16:56:13 GMT
Content-Length: 0
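The two curl checks above can be scripted so that any image is easy to verify. This is a minimal sketch: route_verdict and route_status are hypothetical helper names, not part of the registry or CNG tooling, and the verdict strings are made up for illustration. It assumes a registry container is already listening at the given base URL.

```shell
#!/bin/sh
# Hypothetical helper: turns the HTTP status of the OPTIONS request
# into a verdict. 200 is what the locally built image returned above,
# 404 is what the CNG image returned.
route_verdict() {
  case "$1" in
    200) echo "tag delete route available" ;;
    404) echo "tag delete route missing" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Hypothetical helper: sends the same OPTIONS request as above and
# prints only the status code. Needs a registry listening at $1.
route_status() {
  curl -s -o /dev/null -w '%{http_code}' -X OPTIONS \
    "$1/v2/name/tags/reference/tag"
}

# Usage, with a registry container running on port 5000:
#   route_verdict "$(route_status http://0.0.0.0:5000)"
```

If the registry is not reachable, curl reports the status as 000, which falls into the "unexpected status" branch.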
Analysis
This is a subtle issue because the build flags being used are correct, so invoking registry -v shows the correct version, but the source code used to build the registry binary is the one from the upstream repository.
This happens because we forked the upstream repository and, as such, have to respect its import path due to how GOPATH works. Therefore, to build the registry we have to either symlink or clone our repository into $GOPATH/src/github.com/docker/distribution.
However, in the CNG Dockerfiles we’re building the binaries from $GOPATH/src/gitlab.com/gitlab-org/container-registry (source). Because the code still imports its packages under the upstream path, the go build tool resolves them from $GOPATH/src/github.com/docker/distribution (which contains the upstream source code) and builds based on that.
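To illustrate the path resolution described above: in GOPATH mode, an import path maps to a directory under $GOPATH/src. The sketch below uses a hypothetical helper, gopath_dir, which is not part of the Go toolchain. It is a simplification that assumes a single-entry GOPATH and ignores vendor directories.

```shell
#!/bin/sh
# Hypothetical helper: prints the directory where GOPATH mode looks
# for an import path.
#   $1: import path
#   $2: GOPATH to use (optional; falls back to $GOPATH, then to
#       $HOME/go, which is the Go default when GOPATH is unset)
gopath_dir() {
  printf '%s/src/%s\n' "${2:-${GOPATH:-$HOME/go}}" "$1"
}

# Usage: see where imports of the upstream path are resolved from,
# then check which repository is checked out there (this only works
# if that directory is a git checkout):
#   gopath_dir github.com/docker/distribution
#   git -C "$(gopath_dir github.com/docker/distribution)" remote get-url origin
```

Running the git command inside the build environment is one way to confirm whether that directory holds the fork or the upstream code.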
Solution
We’ll start a discussion about detaching from upstream and renaming the path of our registry to avoid problems like this in the future, but to fix this quickly we need to clone our registry repository into $GOPATH/src/github.com/docker/distribution and build from there.
I have an MR for this: gitlab-org/build/CNG!385 (merged)
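As a possible safeguard until the path is renamed, a build script could refuse to run from any directory other than the upstream path. This is only a sketch under assumptions: require_upstream_path is a hypothetical helper and is not taken from the MR above, whose actual changes are not shown here.

```shell
#!/bin/sh
# Hypothetical guard: succeeds only if the build directory is the
# upstream import path under GOPATH.
#   $1: directory the build is about to run from
#   $2: GOPATH to use (optional; falls back to $GOPATH, then $HOME/go)
require_upstream_path() {
  expected="${2:-${GOPATH:-$HOME/go}}/src/github.com/docker/distribution"
  if [ "$1" != "$expected" ]; then
    echo "refusing to build from $1, expected $expected" >&2
    return 1
  fi
}

# Usage, before the existing build command:
#   require_upstream_path "$PWD" || exit 1
```

The check is a plain string comparison, so it accepts a symlinked checkout as long as the shell reports the logical path in $PWD.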
Questions
We haven’t tested the images of all our vX.Y.Z-gitlab releases, but I’m afraid this issue has affected all of them, so we should rebuild all images using the fixed Dockerfiles.
How should we do this?
The most urgent releases are v2.7.8-gitlab and v2.8.0-gitlab, as these are scheduled for 12.8, so we need to fix the corresponding images.
Also, do we need to do something about the corresponding registry version bumps for Charts, Omnibus and K8s Workloads?