Container Registry: containerd version 1.6.28-2 and higher cannot pull from a public repository

Summary

This occurs on an on-premise GitLab instance, version v17.0.1-ee. Note: the container registry is configured to use an on-premise instance of MinIO. The problem has been replicated on an up-to-date version of MinIO ("RELEASE.2024-05-10T01-41-38Z").

Create a public repository (eliminating any problems with credentials).

Tag and push a container to the public repository.

On an Ubuntu 22.04 server, containerd 1.6.28-1 can pull the container, whereas containerd 1.6.28-2 fails with a "400 Bad Request".

Steps to reproduce

  1. Start with an on-premise instance of GitLab v17.0.1-ee using MinIO for object storage ("RELEASE.2024-05-10T01-41-38Z").
  2. I use https://github.com/geerlingguy/ansible-role-gitlab for GitLab
  3. I use https://github.com/minio/ansible-minio for MinIO
  4. Create a public repository. (In this example, the repo is k8s-gl-container-registry in root.)
  5. docker pull alpine
  6. docker tag alpine gitlab-instance.com:5050/root/k8s-gl-container-registry
  7. docker push gitlab-instance.com:5050/root/k8s-gl-container-registry
  8. Provision an Ubuntu 22.04 server (as it has more containerd versions available than Ubuntu 24.04).
  9. I use https://github.com/geerlingguy/ansible-role-containerd for containerd. Most notably controlling the version via containerd_package: containerd.io=1.6.28-2.
  10. Use an up-to-date crictl binary ("v1.30.0") to pull:

```
# crictl pull gitlab-instance.com:5050/root/k8s-gl-container-registry
E0529 15:45:52.938438     892 remote_image.go:180] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"gitlab-instance.com:5050/root/k8s-gl-container-registry:latest\": failed to copy: httpReadSeeker: failed open: unexpected status code https://gitlab-instance.com:5050/v2/root/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd: 400 Bad Request" image="gitlab-instance.com:5050/root/k8s-gl-container-registry"
FATA[0000] pulling image: failed to pull and unpack image "gitlab-instance.com:5050/root/k8s-gl-container-registry:latest": failed to copy: httpReadSeeker: failed open: unexpected status code https://gitlab-instance.com:5050/v2/root/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd: 400 Bad Request
```

As this issue first came to my attention by way of Kubernetes clusters not being able to pull from GitLab container registry, I further bisected the problem with kind and k3d clusters.

kind clusters

| works | fails |
| --- | --- |
| v1.26.14 | v1.26.15 |
| v1.27.11 | v1.27.13 |
| v1.28.7 | v1.28.9 |
| v1.29.2 | v1.29.4 |
| | v1.30.0 |

k3d clusters

| works | fails |
| --- | --- |
| v1.26.13+k3s1 | v1.26.14+k3s1 |
| v1.27.10-k3s2 | v1.27.11-k3s1 |
| v1.28.6-k3s2 | v1.28.7-k3s1 |
| | v1.29.0-k3s1 |
| | v1.30.0-k3s1 |

Example Project

I was not able to reproduce this on gitlab.com, which is currently running a pre-release version (17.1.0-pre 961cbbb1). I am certain gitlab.com uses something other than MinIO for object storage, so MinIO has not been ruled out: I was unable to get the container registry working with file storage, and therefore could not test without MinIO. (That last point is not part of the bug report; I am admitting I did not succeed without MinIO and ran out of time.)

What is the current bug behavior?

From the perspective of the GitLab registry access logs (two containerd versions; the first works, the second does not).

In /var/opt/gitlab/nginx/logs/gitlab_registry_access.log on the GitLab host (root@gitlab-instance):

containerd.io: 1.6.28-1

```
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "HEAD /v2/devops/k8s-gl-container-registry/manifests/latest HTTP/1.1" 401 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "HEAD /v2/devops/k8s-gl-container-registry/manifests/latest HTTP/1.1" 200 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/manifests/sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0 HTTP/1.1" 401 192 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/manifests/sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0 HTTP/1.1" 200 528 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd HTTP/1.1" 307 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8 HTTP/1.1" 307 0 "" "containerd/1.6.28" -
```

containerd.io: 1.6.28-2

```
10.0.32.80 - - [21/May/2024:20:23:13 +0000] "HEAD /v2/devops/k8s-gl-container-registry/manifests/latest HTTP/1.1" 401 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:23:14 +0000] "HEAD /v2/devops/k8s-gl-container-registry/manifests/latest HTTP/1.1" 200 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:23:14 +0000] "GET /v2/devops/k8s-gl-container-registry/manifests/sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0 HTTP/1.1" 401 192 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:23:14 +0000] "GET /v2/devops/k8s-gl-container-registry/manifests/sha256:6457d53fb065d6f250e1504b9bc42d5b6c65941d57532c072d929dd0628977d0 HTTP/1.1" 200 528 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:23:14 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd HTTP/1.1" 307 0 "" "containerd/1.6.28" -
```
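The difference between the two runs is easier to see when each access-log line is reduced to method, blob digest, and status. A minimal sketch (the blob-GET lines are embedded from the excerpts above; hostnames and paths are from this report) that highlights how the failing run never requests the second blob:

```shell
# Reduce each access-log line to "METHOD <last path segment> STATUS".
summarize() {
  awk '{ n = split($7, p, "/"); print $6 " " p[n] " " $9 }' | tr -d '"'
}

# Working run (containerd.io 1.6.28-1): both blob GETs are redirected (307).
ok=$(summarize <<'EOF'
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd HTTP/1.1" 307 0 "" "containerd/1.6.28" -
10.0.32.80 - - [21/May/2024:20:26:02 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8 HTTP/1.1" 307 0 "" "containerd/1.6.28" -
EOF
)

# Failing run (containerd.io 1.6.28-2): only the first blob GET appears;
# the pull aborts before the second blob is ever requested.
bad=$(summarize <<'EOF'
10.0.32.80 - - [21/May/2024:20:23:14 +0000] "GET /v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd HTTP/1.1" 307 0 "" "containerd/1.6.28" -
EOF
)

echo "working run:"; echo "$ok"
echo "failing run:"; echo "$bad"
```

In both cases the registry answers the blob GET with a 307 redirect to object storage, so the pull is dying on the redirected request to MinIO, which never appears in this log.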

From the perspective of containerd, with the log level set to debug in /etc/containerd/config.toml:

1.6.28-1

do request: get the blob

```
May 21 19:58:38 pax-lab-u22 containerd[1151]: time="2024-05-21T19:58:38.132206773Z" level=debug msg="do request" digest="sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd" mediatype=application/vnd.docker.container.image.v1+json request.header.accept="application/vnd.docker.container.image.v1+json, */*" request.header.user-agent=containerd/1.6.28 request.method=GET size=1472 url="https://gitlab-alpha.awe.eco.cpanel.net:5050/v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd"
```

fetch response: fine

```
May 21 19:58:38 pax-lab-u22 containerd[1151]: time="2024-05-21T19:58:38.149637454Z" level=debug msg="fetch response received" digest="sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd" mediatype=application/vnd.docker.container.image.v1+json response.header.accept-ranges=bytes response.header.content-length=1472 response.header.content-security-policy=block-all-mixed-content response.header.content-type=application/octet-stream response.header.date="Tue, 21 May 2024 19:58:38 GMT" response.header.etag="\"1f1c9cf1aeae3023fbeaba38df37b5e3\"" response.header.last-modified="Mon, 20 May 2024 15:26:19 GMT" response.header.server=MinIO response.header.strict-transport-security="max-age=31536000; includeSubDomains" response.header.vary=Origin response.header.vary.1=Accept-Encoding response.header.x-amz-id-2=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 response.header.x-amz-request-id=17D19953C98B89F2 response.header.x-content-type-options=nosniff response.header.x-xss-protection="1; mode=block" response.status="200 OK" size=1472 url="https://gitlab-alpha.awe.eco.cpanel.net:5050/v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd"
```

1.6.28-2

do request: get the blob

```
May 21 19:54:06 pax-lab-u22 containerd[6811]: time="2024-05-21T19:54:06.543460481Z" level=debug msg="do request" digest="sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd" mediatype=application/vnd.docker.container.image.v1+json request.header.accept="application/vnd.docker.container.image.v1+json, */*" request.header.user-agent=containerd/1.6.28 request.method=GET size=1472 url="https://gitlab-alpha.awe.eco.cpanel.net:5050/v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd"
```

fetch response: response.status="400 Bad Request"

```
May 21 19:54:06 pax-lab-u22 containerd[6811]: time="2024-05-21T19:54:06.559559565Z" level=debug msg="fetch response received" digest="sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd" mediatype=application/vnd.docker.container.image.v1+json response.header.accept-ranges=bytes response.header.content-length=365 response.header.content-type=application/xml response.header.date="Tue, 21 May 2024 19:54:06 GMT" response.header.server=MinIO response.header.vary=Origin response.status="400 Bad Request" size=1472 url="https://gitlab-alpha.awe.eco.cpanel.net:5050/v2/devops/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd"
```
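Reducing the two "fetch response received" lines to the fields that differ makes the failure mode clearer: in the failing run MinIO answers the blob request with a 400 and a 365-byte application/xml error body, which containerd discards. A small sketch (the debug lines below are trimmed to the relevant fields; capturing the XML body itself, e.g. by fetching the presigned URL manually, would show MinIO's actual error code):

```shell
# Pull response.status and response.header.content-type out of the two
# "fetch response received" debug lines quoted above.
extract() { grep -Eo 'response\.header\.content-type=[^ ]+|response\.status="[^"]*"'; }

# containerd.io 1.6.28-1: MinIO serves the blob.
ok=$(extract <<'EOF'
level=debug msg="fetch response received" response.header.content-length=1472 response.header.content-type=application/octet-stream response.header.server=MinIO response.status="200 OK"
EOF
)

# containerd.io 1.6.28-2: MinIO rejects the same blob request; the
# application/xml body is an S3-style error document containerd drops.
bad=$(extract <<'EOF'
level=debug msg="fetch response received" response.header.content-length=365 response.header.content-type=application/xml response.header.server=MinIO response.status="400 Bad Request"
EOF
)

echo "1.6.28-1: $ok"
echo "1.6.28-2: $bad"
```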

Describe what actually happens.

```
root@u22:~# crictl pull gitlab-instance.com:5050/root/k8s-gl-container-registry
E0529 16:14:00.126339     897 remote_image.go:180] "PullImage from image service failed" err="rpc error: code = Unknown desc = failed to pull and unpack image \"gitlab-instance.com:5050/root/k8s-gl-container-registry:latest\": failed to copy: httpReadSeeker: failed open: unexpected status code https://gitlab-instance.com:5050/v2/root/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd: 400 Bad Request" image="gitlab-instance.com:5050/root/k8s-gl-container-registry"
FATA[0000] pulling image: failed to pull and unpack image "gitlab-instance.com:5050/root/k8s-gl-container-registry:latest": failed to copy: httpReadSeeker: failed open: unexpected status code https://gitlab-instance.com:5050/v2/root/k8s-gl-container-registry/blobs/sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd: 400 Bad Request
```

Describe what you should see instead.

```
root@u22:~# crictl pull gitlab-instance.com:5050/root/k8s-gl-container-registry
Image is up to date for sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd
```

Output of checks

Results of GitLab environment info

Expand for output related to GitLab environment info
sudo gitlab-rake gitlab:env:info
```
System information
System:         Ubuntu 22.04
Proxy:          no
Current User:   git
Using RVM:      no
Ruby Version:   3.1.5p253
Gem Version:    3.5.9
Bundler Version:2.5.9
Rake Version:   13.0.6
Redis Version:  7.0.15
Sidekiq Version:7.1.6
Go Version:     unknown

GitLab information
Version:        17.0.1-ee
Revision:       cf71f280df3
Directory:      /opt/gitlab/embedded/service/gitlab-rails
DB Adapter:     PostgreSQL
DB Version:     14.11
URL:            https://pax-lab-u22.awe.eco.cpanel.net
HTTP Clone URL: https://pax-lab-u22.awe.eco.cpanel.net/some-group/some-project.git
SSH Clone URL:  git@pax-lab-u22.awe.eco.cpanel.net:some-group/some-project.git
Elasticsearch:  no
Geo:            no
Using LDAP:     yes
Using Omniauth: yes
Omniauth Providers: saml

GitLab Shell
Version:        14.35.0
Repository storages:
- default:      unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path:              /opt/gitlab/embedded/service/gitlab-shell

Gitaly
- default Address:      unix:/var/opt/gitlab/gitaly/gitaly.socket
- default Version:      17.0.1
- default Git Version:  2.44.1.gl1
```

Results of GitLab application Check

Expand for output related to the GitLab application check

sudo gitlab-rake gitlab:check SANITIZE=true

```
[WARNING] Object storage for pages must have a bucket specified
[WARNING] Object storage for ci_secure_files must have a bucket specified
Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ... GitLab Shell version >= 14.35.0 ? ... OK (14.35.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ... Running? ... yes
Number of Sidekiq processes (cluster/worker) ... 1/1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... Server: ldapmain
LDAP authentication... Anonymous. No `bind_dn` or `password` configured
LDAP users with access to your GitLab server (only showing the first 100 results)
        User output sanitized. Found 100 users of 100 limit.

Checking LDAP ... Finished

Checking GitLab App ...

Database config exists? ... yes
Tables are truncated? ... skipped
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Cable config exists? ... yes
Resque config exists? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Systemd unit files or init script exist? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Systemd unit files or init script up-to-date? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Projects have namespace: ...
1/1 ... yes
Redis version >= 6.2.14? ... yes
Ruby version >= 3.0.6 ? ... yes (3.1.5)
Git user has default SSH configuration? ... yes
Active users: ... 1
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes
Elasticsearch version 7.x-8.x or OpenSearch version 1.x ... skipped (Advanced Search is disabled)
All migrations must be finished before doing a major upgrade ... skipped (Advanced Search is disabled)

Checking GitLab App ... Finished


Checking GitLab subtasks ... Finished
```

Possible fixes

GitLab.com is one minor version ahead (pre-release) and does not demonstrate the problem. I also note there is a new container registry implementation in beta. So, one hopes the problem goes away with time.

That said, given how reproducible this is, I am surprised it has gone unreported. There is evidence it has been a live issue in my environment for three to four months.

Edited by Phil King