
Geo: Multi-arch containers not properly replicating non-primary architectures to secondary Geo nodes, UI shows replication successful

Summary

Multi-architecture images show as replicated in the UI, but non-primary architectures are not available from the secondary node when trying to stat or inspect the images.

Steps to reproduce

This behavior was reported by a US federal customer in federal ticket 1050 (GitLab internal, US citizenship required), and I have been able to reproduce it.

Initial setup

  1. Have Geo instances with container registry replication enabled. I have two instances in GCP; the customer has more.
  2. Build and push a multi-architecture image to the primary node (I used docker buildx to build a BusyBox image for amd64 and arm64).
  3. Wait for sync and verification to complete.
  4. Compare the GUI output on the primary and secondary nodes. Both of my nodes report as synchronized, and the container size and hashes shown are the same on both nodes.

Comparison and troubleshooting

  1. Inspect the remote images with skopeo. My primary node identifies both architectures in the container, but the secondary does not identify any architecture at all. Note also that the hashes do not match:
brad@DebianRulez:~$ skopeo inspect --raw docker://geo1.bradsevy.online:5050/root/busybox-multi
{
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "schemaVersion": 2,
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "digest": "sha256:ac3408ba45f5038129cefd401d3828bca2a32e54dc0bf6ff44056936457bf1c5",
         "size": 740,
         "platform": {
            "architecture": "amd64",    <-----------------------------------------------------------
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "digest": "sha256:1e21fbd67772efeb971f0b97be99572219823216b6b1c47a1308fc27e5076335",
         "size": 740,
         "platform": {
            "architecture": "arm64",   <-----------------------------------------------------------
            "os": "linux"
         }
      }
   ]
}

---

brad@DebianRulez:~$ skopeo inspect --raw docker://geo2.bradsevy.online:5050/root/busybox-multi
{
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "schemaVersion": 2,
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "digest": "sha256:27f909e5658cb519e5175bc681d5c605f01b613503ce8dcf3fe3c1847d37f8c7",
      "size": 844
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "digest": "sha256:0bc3020d05f1e08b41f1c5d54650a157b1690cde7fedb1fafbc9cda70ee2ec5c",
         "size": 50435617
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "digest": "sha256:f875f728594f35a040e0e4b122c67fe6b05592c71912a6ad4a3136a907fe3eaa",
         "size": 14124231
      }
   ]
}
  2. File sizes and checksums do not match:

Primary:

root@brad-geo1:~# du -sh /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi
148K    /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi

root@brad-geo1:~# find /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi -type f -exec md5sum {} \; | sort -k 2 | md5sum
ca6f7edcd95ade00548dc258e2b40af1  -

Secondary:

root@brad-geo2:~# du -sh /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi
92K /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi

root@brad-geo2:~# find /var/opt/gitlab/gitlab-rails/shared/registry/docker/registry/v2/repositories/root/busybox-multi -type f -exec md5sum {} \; | sort -k 2 | md5sum
e4d0a98fd71c3a6b7ec3f2b62e3bc64d  -
  3. Pulling with --platform=arm64 and then inspecting the pulled image with docker image inspect <id> shows that the arm64 image is pulled successfully from the primary node, but the secondary node still serves amd64:

Primary node:

brad@DebianRulez:~$ docker pull geo1.bradsevy.online:5050/root/busybox-multi --platform=arm64
Using default tag: latest
latest: Pulling from root/busybox-multi
310b368da982: Pull complete 
dc96c5f90a6f: Pull complete 
Digest: sha256:eaf1fdf80669e7338ab1edfeabd8b96f2fac673eaa971f8480d4006e29ec7a72
Status: Downloaded newer image for geo1.bradsevy.online:5050/root/busybox-multi:latest
geo1.bradsevy.online:5050/root/busybox-multi:latest

brad@DebianRulez:~$ docker images -a
REPOSITORY                                     TAG       IMAGE ID       CREATED       SIZE
geo1.bradsevy.online:5050/root/busybox-multi   latest    798012f55906   12 days ago   126MB

brad@DebianRulez:~$ docker image inspect 798012f55906
[
    {
        "Id": "sha256:798012f55906d247c79ea2e9acfbc6d53593b7751c5d851bf1f41eaff4237f52",
        "RepoTags": [
            "geo1.bradsevy.online:5050/root/busybox-multi:latest"
        ],
        "RepoDigests": [
            "geo1.bradsevy.online:5050/root/busybox-multi@sha256:eaf1fdf80669e7338ab1edfeabd8b96f2fac673eaa971f8480d4006e29ec7a72"
        ],
        "Parent": "",
        "Comment": "buildkit.dockerfile.v0",
        "Created": "2021-06-30T18:46:37.189300648Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "DockerVersion": "",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "bash"
            ],
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "Architecture": "arm64",          <------------------------------------------------------------------------------
        "Os": "linux",
        "Size": 126480503,
        "VirtualSize": 126480503,
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/bcfca43187e0079723d6fdc91b17d003bbd2789ebfc1838909a79f30b9aa99ef/diff",
                "MergedDir": "/var/lib/docker/overlay2/fe35f24b09d6f2af2fb39807d7f11531b24f5e445edb9452370a5fcfd32e58de/merged",
                "UpperDir": "/var/lib/docker/overlay2/fe35f24b09d6f2af2fb39807d7f11531b24f5e445edb9452370a5fcfd32e58de/diff",
                "WorkDir": "/var/lib/docker/overlay2/fe35f24b09d6f2af2fb39807d7f11531b24f5e445edb9452370a5fcfd32e58de/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:bee1275ae7ac87065d84e2e06aec6254579ac19d9b84e325cbbe03d46e8730e7",
                "sha256:f48735d31fdcbfb2125502fd4530a17b53d373e61bb8683cb6be9a1c8e1edea3"
            ]
        },
        "Metadata": {
            "LastTagTime": "0001-01-01T00:00:00Z"
        }
    }
]

Secondary node:

brad@DebianRulez:~$ docker pull geo2.bradsevy.online:5050/root/busybox-multi --platform=arm64
Using default tag: latest
latest: Pulling from root/busybox-multi
0bc3020d05f1: Pull complete 
f875f728594f: Pull complete 
Digest: sha256:ac3408ba45f5038129cefd401d3828bca2a32e54dc0bf6ff44056936457bf1c5
Status: Downloaded newer image for geo2.bradsevy.online:5050/root/busybox-multi:latest
geo2.bradsevy.online:5050/root/busybox-multi:latest

brad@DebianRulez:~$ docker images -a
REPOSITORY                                     TAG       IMAGE ID       CREATED       SIZE
geo2.bradsevy.online:5050/root/busybox-multi   latest    27f909e5658c   12 days ago   133MB

brad@DebianRulez:~$ docker image inspect 27f909e5658c
[
    {
        "Id": "sha256:27f909e5658cb519e5175bc681d5c605f01b613503ce8dcf3fe3c1847d37f8c7",
        "RepoTags": [
            "geo2.bradsevy.online:5050/root/busybox-multi:latest"
        ],
        "RepoDigests": [
            "geo2.bradsevy.online:5050/root/busybox-multi@sha256:ac3408ba45f5038129cefd401d3828bca2a32e54dc0bf6ff44056936457bf1c5"
        ],
        "Parent": "",
        "Comment": "buildkit.dockerfile.v0",
        "Created": "2021-06-30T18:46:21.646255956Z",
        "Container": "",
        "ContainerConfig": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": null,
            "Cmd": null,
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "DockerVersion": "",
        "Author": "",
        "Config": {
            "Hostname": "",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "bash"
            ],
            "Image": "",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": null
        },
        "Architecture": "amd64",        <----------------------------------------------------------------------------------
        "Os": "linux",
        "Size": 132681348,
        "VirtualSize": 132681348,
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/07930508f682b867663201ea759fc6e2d01ed9283ce0f07e3068397aff530388/diff",
                "MergedDir": "/var/lib/docker/overlay2/5b0d1d456fd34ecfbee0096491eb81c0f01f67f4e5564bf23e2b1a5847c036fa/merged",
                "UpperDir": "/var/lib/docker/overlay2/5b0d1d456fd34ecfbee0096491eb81c0f01f67f4e5564bf23e2b1a5847c036fa/diff",
                "WorkDir": "/var/lib/docker/overlay2/5b0d1d456fd34ecfbee0096491eb81c0f01f67f4e5564bf23e2b1a5847c036fa/work"
            },
            "Name": "overlay2"
        },
        "RootFS": {
            "Type": "layers",
            "Layers": [
                "sha256:4e006334a6fdea37622f72b21eb75fe1484fc4f20ce8b8526187d6f7bd90a6fe",
                "sha256:51ea4e37f486d3064055a010939db3384b70e33240ff478cb09cf4d3858ca709"
            ]
        },
        "Metadata": {
            "LastTagTime": "0001-01-01T00:00:00Z"
        }
    }
]
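
The manifest comparison above can also be scripted. The following is a minimal sketch, not part of the original investigation: it assumes the output of skopeo inspect --raw from each node has been captured as a string, and the function names platforms and missing_on_secondary are hypothetical. The two sample documents are abbreviated copies of the manifests shown above.

```python
import json

# Media types that denote a multi-arch manifest list / OCI image index.
MANIFEST_LIST_TYPES = {
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
}


def platforms(raw_manifest: str) -> set:
    """Return the "os/architecture" pairs advertised by a raw manifest.

    A single-image manifest carries no platform entries, so the result
    is an empty set for it.
    """
    doc = json.loads(raw_manifest)
    if doc.get("mediaType") not in MANIFEST_LIST_TYPES:
        return set()
    return {
        f'{m["platform"]["os"]}/{m["platform"]["architecture"]}'
        for m in doc.get("manifests", [])
        if "platform" in m
    }


def missing_on_secondary(primary_raw: str, secondary_raw: str) -> set:
    """Platforms the primary advertises that the secondary does not."""
    return platforms(primary_raw) - platforms(secondary_raw)


# Abbreviated copy of the primary's manifest list (geo1).
PRIMARY = json.dumps({
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "schemaVersion": 2,
    "manifests": [
        {"digest": "sha256:ac3408ba45f5038129cefd401d3828bca2a32e54dc0bf6ff44056936457bf1c5",
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:1e21fbd67772efeb971f0b97be99572219823216b6b1c47a1308fc27e5076335",
         "platform": {"architecture": "arm64", "os": "linux"}},
    ],
})

# Abbreviated copy of what the secondary returns (geo2): a single-image manifest.
SECONDARY = json.dumps({
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "schemaVersion": 2,
    "config": {"digest": "sha256:27f909e5658cb519e5175bc681d5c605f01b613503ce8dcf3fe3c1847d37f8c7"},
    "layers": [],
})

if __name__ == "__main__":
    print(sorted(missing_on_secondary(PRIMARY, SECONDARY)))
```

Because the secondary returns a single-image manifest rather than a manifest list, it advertises no platforms at all, so the sketch reports both linux/amd64 and linux/arm64 as missing even though the amd64 image itself can still be pulled.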

Internal discussions

I engaged the Registry and Geo teams in their respective internal Slack channels. The messages remain available until approximately 10 October 2021, and the relevant ones are copied into the internal ticket for posterity.

Registry: https://gitlab.slack.com/archives/CRD4A8HG8/p1625686073105400
Geo: https://gitlab.slack.com/archives/CRD4A8HG8/p1625686073105400

Example Project

What is the current bug behavior?

Only amd64 is made available on the secondary node.

What is the expected correct behavior?

Secondary architectures (arm64 in this case) should be available on all secondary nodes.

Relevant logs and/or screenshots

404 errors appear every few seconds in the output of gitlab-ctl tail registry on the primary node:

2021-07-12_21:03:13.63310 time="2021-07-12T21:03:13Z" level=warning msg="httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events} encountered too many errors, backing off"
2021-07-12_21:03:14.66027 time="2021-07-12T21:03:14Z" level=error msg="retryingsink: error writing events: httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events}: response status 404 Not Found unaccepted, retrying"
2021-07-12_21:03:14.66032 time="2021-07-12T21:03:14Z" level=warning msg="httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events} encountered too many errors, backing off"
2021-07-12_21:03:15.68155 time="2021-07-12T21:03:15Z" level=error msg="retryingsink: error writing events: httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events}: response status 404 Not Found unaccepted, retrying"
2021-07-12_21:03:15.68158 time="2021-07-12T21:03:15Z" level=warning msg="httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events} encountered too many errors, backing off"
2021-07-12_21:03:16.70599 time="2021-07-12T21:03:16Z" level=error msg="retryingsink: error writing events: httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events}: response status 404 Not Found unaccepted, retrying"
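
To quantify how often these notification errors occur, the log lines can be tallied by level and HTTP status. This is a rough sketch that assumes the key=value line format shown above; parse_line and summarize are hypothetical helper names, and the sample is the first three lines of the excerpt.

```python
import re
from collections import Counter

# Matches key="quoted value" or key=bare_value pairs in a log line.
FIELD_RE = re.compile(r'(\w+)=(?:"((?:[^"\\]|\\.)*)"|(\S+))')
# Extracts the HTTP status code from messages like "response status 404 Not Found".
STATUS_RE = re.compile(r"response status (\d{3})")


def parse_line(line: str) -> dict:
    """Split one log line into its key/value fields (time, level, msg)."""
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def summarize(lines) -> Counter:
    """Count lines per (level, http_status); status is None when absent."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        match = STATUS_RE.search(fields.get("msg", ""))
        status = match.group(1) if match else None
        counts[(fields.get("level", "unknown"), status)] += 1
    return counts


# First three lines of the excerpt above.
SAMPLE = '''\
2021-07-12_21:03:13.63310 time="2021-07-12T21:03:13Z" level=warning msg="httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events} encountered too many errors, backing off"
2021-07-12_21:03:14.66027 time="2021-07-12T21:03:14Z" level=error msg="retryingsink: error writing events: httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events}: response status 404 Not Found unaccepted, retrying"
2021-07-12_21:03:14.66032 time="2021-07-12T21:03:14Z" level=warning msg="httpSink{http://geo1.bradsevy.online/api/v4/container_registry_event/events} encountered too many errors, backing off"
'''.splitlines()

if __name__ == "__main__":
    for (level, status), n in sorted(summarize(SAMPLE).items(), key=str):
        print(level, status, n)
```

In practice the lines would come from a saved copy of the registry log rather than an inline sample.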

Output of checks

Results of GitLab environment info


(For installations with omnibus-gitlab package run and paste the output of:
`sudo gitlab-rake gitlab:env:info`)

(For installations from source run and paste the output of:
`sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production`)

Results of GitLab application Check


(For installations with omnibus-gitlab package run and paste the output of: sudo gitlab-rake gitlab:check SANITIZE=true)

(For installations from source run and paste the output of: sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true)

(we will only investigate if the tests are passing)

Possible fixes

Edited by Brad Sevy