Infinite loop using Minio S3 as registry backend when removing manifests

Summary

I searched the issue history and Google trying to find an issue like this but was unable to find any relevant results.

While I understand that Minio is not supported, it is listed in the documentation and the result of this issue is pretty severe.

An Omnibus MWE is included below, along with the procedure to reproduce and the log lines that repeat in a loop.

I came across this issue while recently testing a GitLab CE instance with all-S3 storage (configured as per the documentation) and found that my S3 provider racked up a bill of over $200 on the first day due to a very high number of requests. I didn't understand why, so I set up my own S3 storage using Minio to investigate.

I created a project, pushed a container image, tried to delete it, and encountered the issue after about 20 minutes.

Using Minio S3 storage (and possibly other providers?) for the container registry appears to result in an infinite loop when trying to remove a container repository. The log lines below are repeated, the Sidekiq queue jumps up, CPU is maxed out, and S3 is flooded with requests.

The container is never removed and the loop goes on for at least 10 hours (until I stopped all processes).

I tested on one production system (a dedicated server, where I first saw the problem), a VM from a popular cloud provider (not AWS), and my desktop machine, which I used to test and create an MWE.

Steps to reproduce

In my example I am keeping things as simple as possible.

Summary:

  • We will be using a basic Minio setup with root credentials; this rules out any issues with permissions or policies
  • We will be using a GitLab service on localhost to make testing easy
  • On a side note, my Docker is IPv6 enabled (adjust the config if needed)

Start off by creating a ~/gitlab-test directory which we'll be working in.

STEP 1 - Minio

Create the data directory we'll be using: mkdir -p data/minio.

Set up Minio and create the buckets. I used the docker-compose file below:

version: '3'


services:
  minio:
    image: quay.io/minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: testtest
      MINIO_ROOT_PASSWORD: testtest
    # "ports" publishes host:container mappings; "expose" takes container ports only
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - './data/minio:/data'
    networks:
      - gitlab

networks:
  gitlab:
    driver: bridge
    enable_ipv6: true

I ran ...

docker-compose up --remove-orphans

I logged into Minio on http://localhost:9001 using testtest/testtest and created the following buckets (I enabled versioning for all of them, as this is mandatory/default for almost any cluster setup):

  • artifacts
  • external-diffs
  • lfs-objects
  • uploads
  • packages
  • dependency-proxy
  • terraform-state
  • pages
  • registry
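
The same buckets could also be created from the command line. This is a minimal sketch, assuming the MinIO client (mc) is installed on the host and the Minio API port 9000 is reachable at localhost. The alias name gitlabtest and the create_buckets function are made up for illustration.

```shell
# Sketch only: create the buckets listed above and enable versioning on each.
# Assumptions: mc is installed, and http://localhost:9000 reaches the Minio API.
MC="${MC:-mc}"          # set MC=echo to print the commands instead of running them
ALIAS="gitlabtest"      # hypothetical alias name
BUCKETS="artifacts external-diffs lfs-objects uploads packages dependency-proxy terraform-state pages registry"

create_buckets() {
  # Register the server under the alias, using the root credentials from the compose file.
  "$MC" alias set "$ALIAS" http://localhost:9000 testtest testtest || return 1
  for bucket in $BUCKETS; do
    "$MC" mb --ignore-existing "$ALIAS/$bucket" || return 1   # create the bucket
    "$MC" version enable "$ALIAS/$bucket" || return 1         # turn on versioning
  done
}
```

Run create_buckets while the Minio container is up. Calling it as MC=echo create_buckets only prints the commands, which is handy for checking them first.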

I then stopped everything by pressing Ctrl-C.

STEP 2 - Setup GitLab

Create the data directories we'll be using...

mkdir data/{config,logs,gitlab}

Update the docker-compose.yml file with the following...

version: '3'


services:
  gitlab:
    image: 'gitlab/gitlab-ce:latest'
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'http://localhost:9080'
        nginx['listen_addresses'] = ['*', '[::]']
        # Registry
        registry['enable'] = true
        registry_external_url 'http://localhost:5050'
        registry_nginx['listen_addresses'] = ['*', '[::]']
        registry_nginx['listen_https'] = false
        registry_nginx['listen_port'] = 5050
        # Consolidated object storage configuration
        gitlab_rails['object_store']['enabled'] = true
        gitlab_rails['object_store']['proxy_download'] = true
        gitlab_rails['object_store']['connection'] = {
          'provider' => 'AWS',
          'aws_access_key_id' => 'testtest',
          'aws_secret_access_key' => 'testtest',
          'endpoint' => 'http://minio:9000',
          'path_style' => true,
        }
        gitlab_rails['object_store']['objects']['artifacts']['bucket'] = 'artifacts'
        gitlab_rails['object_store']['objects']['external_diffs']['bucket'] = 'external-diffs'
        gitlab_rails['object_store']['objects']['lfs']['bucket'] = 'lfs-objects'
        gitlab_rails['object_store']['objects']['uploads']['bucket'] = 'uploads'
        gitlab_rails['object_store']['objects']['packages']['bucket'] = 'packages'
        gitlab_rails['object_store']['objects']['dependency_proxy']['bucket'] = 'dependency-proxy'
        gitlab_rails['object_store']['objects']['terraform_state']['bucket'] = 'terraform-state'
        gitlab_rails['object_store']['objects']['pages']['bucket'] = 'pages'
        registry['storage'] = {
          's3' => {
            'accesskey' => 'testtest',
            'secretkey' => 'testtest',
            'bucket' => 'registry',
            'region' => 'none',
            'regionendpoint' => 'http://minio:9000',
            'pathstyle' =>  true
          },
          'redirect' => {
            'disable' => true
          }
        }
    ports:
      - '9080:9080'
      - '5050:5050'
    volumes:
      - './data/config:/etc/gitlab'
      - './data/logs:/var/log/gitlab'
      - './data/gitlab:/var/opt/gitlab'
    networks:
      - gitlab

  minio:
    image: quay.io/minio/minio:latest
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: testtest
      MINIO_ROOT_PASSWORD: testtest
    # "ports" publishes host:container mappings; "expose" takes container ports only
    ports:
      - "9000:9000"
      - "9001:9001"
    volumes:
      - './data/minio:/data'
    networks:
      - gitlab

networks:
  gitlab:
    driver: bridge
    enable_ipv6: true

Start everything up using docker-compose up --remove-orphans.
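
GitLab can take several minutes to come up after this. Below is a small sketch for waiting until the sign-in page answers before moving on. The wait_for_url helper, its retry count, and its delay are illustrative assumptions, not part of the original setup.

```shell
# Sketch only: poll a URL until it responds successfully or the attempts run out.
CURL="${CURL:-curl}"    # the HTTP client to call; overridable for dry runs

wait_for_url() {
  url="$1"
  tries="${2:-60}"      # number of attempts (assumed default)
  delay="${3:-10}"      # seconds between attempts (assumed default)
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors; -sS keeps it quiet apart from real errors.
    if "$CURL" -fsS -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example: wait_for_url http://localhost:9080/users/sign_in 60 10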

STEP 3 - Reproducing

Log into http://localhost:9080 (sorry, I had port 8080 being used so I picked 9080).

Create a project under the root account. I named it testrepo and selected "Create README.md file".

On your PC, create a simple Dockerfile in some directory. I used this...

FROM alpine:edge
RUN apk add --no-cache bash vim

Next, build the image with a tag...

docker build -t localhost:5050/root/testrepo .

Next, create a token for your root account with the read_registry and write_registry scopes.

Add the login info to docker... (I used username root-registry as the token username)

docker login -u root-registry -p "xxxxxxxxxx" localhost:5050

Next we are going to push the container...

docker push localhost:5050/root/testrepo

Let's create another one...

docker build -t localhost:5050/root/testrepo/test2 .
docker push localhost:5050/root/testrepo/test2

Everything is going to plan: pushing works, pulling works, no problems.

The problem comes with REMOVING!

Go to the project's container registry page. I went to http://localhost:9080/root/testrepo/container_registry.

Now delete the repositories using the trashcan icon on the right, and wait.

You may have to wait a few minutes, or even tens of minutes, until the job triggers the removal. It will then flood the logs in the terminal where docker-compose up is running, and the S3 storage will be hit with hundreds of requests per second.

Example Project

I am unsure whether creating one is a good idea, though this probably doesn't affect your GitLab.com platform.

What is the current bug behavior?

Logs and S3 storage flooded with requests.

What is the expected correct behavior?

Container repository to be removed.

Relevant logs and/or screenshots

These appear to be the repeated log lines...

gitlab-test-gitlab-1  | ==> /var/log/gitlab/gitlab-rails/application_json.log <==
gitlab-test-gitlab-1  | {"severity":"INFO","time":"2023-01-02T23:31:48.219Z","correlation_id":"59aa3bf18a84b00a69e11eb9055fdcec","service_class":"Projects::ContainerRepository::DeleteTagsService","container_repository_id":2,"project_id":2,"message":"deleted tags","deleted_tags_count":1}
gitlab-test-gitlab-1  |
gitlab-test-gitlab-1  | ==> /var/log/gitlab/sidekiq/current <==
gitlab-test-gitlab-1  | {"severity":"INFO","time":"2023-01-02T23:31:48.219Z","project_id":2,"container_repository_id":2,"container_repository_path":"root/testrepo/test2","tags_size_before_delete":1,"deleted_tags_size":1,"meta.caller_id":"ContainerRegistry::DeleteContainerRepositoryWorker","correlation_id":"59aa3bf18a84b00a69e11eb9055fdcec","meta.root_caller_id":"Cronjob","meta.feature_category":"container_registry","meta.client_id":"ip/","class":"ContainerRegistry::DeleteContainerRepositoryWorker","job_status":"running","queue":"container_repository_delete:container_registry_delete_container_repository","jid":"ae31e50aadc5b6e162d9ab09","retry":0}
gitlab-test-gitlab-1  | {"severity":"INFO","time":"2023-01-02T23:31:48.223Z","retry":0,"queue":"container_repository_delete:container_registry_delete_container_repository","version":0,"status_expiration":1800,"queue_namespace":"container_repository_delete","args":[],"class":"ContainerRegistry::DeleteContainerRepositoryWorker","jid":"ae31e50aadc5b6e162d9ab09","created_at":"2023-01-02T23:31:48.174Z","meta.caller_id":"ContainerRegistry::DeleteContainerRepositoryWorker","correlation_id":"59aa3bf18a84b00a69e11eb9055fdcec","meta.root_caller_id":"Cronjob","meta.feature_category":"container_registry","meta.client_id":"ip/","worker_data_consistency":"always","size_limiter":"validated","enqueued_at":"2023-01-02T23:31:48.174Z","job_size_bytes":2,"pid":530,"message":"ContainerRegistry::DeleteContainerRepositoryWorker JID-ae31e50aadc5b6e162d9ab09: done: 0.047723 sec","job_status":"done","scheduling_latency_s":0.000903,"redis_calls":8,"redis_duration_s":0.000792,"redis_read_bytes":9,"redis_write_bytes":1502,"redis_queues_calls":4,"redis_queues_duration_s":0.000447,"redis_queues_read_bytes":5,"redis_queues_write_bytes":963,"redis_shared_state_calls":4,"redis_shared_state_duration_s":0.000345,"redis_shared_state_read_bytes":4,"redis_shared_state_write_bytes":539,"db_count":9,"db_write_count":5,"db_cached_count":1,"db_replica_count":0,"db_primary_count":9,"db_main_count":9,"db_main_replica_count":0,"db_replica_cached_count":0,"db_primary_cached_count":1,"db_main_cached_count":1,"db_main_replica_cached_count":0,"db_replica_wal_count":0,"db_primary_wal_count":0,"db_main_wal_count":0,"db_main_replica_wal_count":0,"db_replica_wal_cached_count":0,"db_primary_wal_cached_count":0,"db_main_wal_cached_count":0,"db_main_replica_wal_cached_count":0,"db_replica_duration_s":0.0,"db_primary_duration_s":0.002,"db_main_duration_s":0.002,"db_main_replica_duration_s":0.0,"external_http_count":4,"external_http_duration_s":0.009737825001138845,"cpu_s":0.033362,"mem_objects":9935,"mem_bytes":808704,"mem_mallocs":2174,"mem_total_bytes":1206104,"worker_id":"sidekiq_0","rate_limiting_gates":[],"duration_s":0.047723,"completed_at":"2023-01-02T23:31:48.223Z","load_balancing_strategy":"primary","db_duration_s":0.00234}
gitlab-test-gitlab-1  | {"severity":"INFO","time":"2023-01-02T23:31:48.223Z","retry":0,"queue":"container_repository_delete:container_registry_delete_container_repository","version":0,"status_expiration":1800,"queue_namespace":"container_repository_delete","args":[],"class":"ContainerRegistry::DeleteContainerRepositoryWorker","jid":"ecaa8d32d003ef2d99266df3","created_at":"2023-01-02T23:31:48.221Z","meta.caller_id":"ContainerRegistry::DeleteContainerRepositoryWorker","correlation_id":"59aa3bf18a84b00a69e11eb9055fdcec","meta.root_caller_id":"Cronjob","meta.feature_category":"container_registry","meta.client_id":"ip/","worker_data_consistency":"always","size_limiter":"validated","enqueued_at":"2023-01-02T23:31:48.222Z","job_size_bytes":2,"pid":530,"message":"ContainerRegistry::DeleteContainerRepositoryWorker JID-ecaa8d32d003ef2d99266df3: start","job_status":"start","scheduling_latency_s":0.001321}
gitlab-test-gitlab-1  |
gitlab-test-gitlab-1  | ==> /var/log/gitlab/registry/current <==
gitlab-test-gitlab-1  | 2023-01-02_23:31:48.24189 time="2023-01-02T23:31:48.241Z" level=info msg="authorized request" auth_user_name= auth_user_type= correlation_id=01GNTD7BWHQAAPMV5HES9PXF65 go_version=go1.18.7 version=v3.61.0-gitlab
gitlab-test-gitlab-1  | 2023-01-02_23:31:48.24191 {"content_type":"","correlation_id":"01GNTD7BWHQAAPMV5HES9PXF65","duration_ms":0,"host":"127.0.0.1:5000","level":"info","method":"GET","msg":"access","proto":"HTTP/1.1","referrer":"","remote_addr":"127.0.0.1:36784","remote_ip":"127.0.0.1","status":404,"system":"http","time":"2023-01-02T23:31:48.241Z","ttfb_ms":0,"uri":"/gitlab/v1/","user_agent":"GitLab/15.7.0","written_bytes":0}
gitlab-test-gitlab-1  |
gitlab-test-gitlab-1  | ==> /var/log/gitlab/gitlab-rails/application.log <==
gitlab-test-gitlab-1  | 2023-01-02T23:31:48.242Z: {:container_repository_id=>2, :container_repository_path=>"root/testrepo/test2", :project_id=>2, :third_party_cleanup_tags_service=>true}
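
To put a number on the loop, the worker start events in the Sidekiq log can be counted per minute. This is a rough sketch that assumes the JSON log format shown above. With the volume mapping from the compose file, the log should be at ./data/logs/sidekiq/current on the host. The count_worker_starts name is made up for illustration.

```shell
# Sketch only: count DeleteContainerRepositoryWorker "start" events per minute.
# Prints one line per minute: "<count> <YYYY-MM-DDTHH:MM>".
count_worker_starts() {
  grep -F '"class":"ContainerRegistry::DeleteContainerRepositoryWorker"' "$1" \
    | grep -F '"job_status":"start"' \
    | sed -n 's/.*"time":"\([0-9-]*T[0-9][0-9]:[0-9][0-9]\).*/\1/p' \
    | sort | uniq -c
}
```

For example: count_worker_starts ./data/logs/sidekiq/current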

Output of checks

Results of GitLab environment info

System information
System:
Current User:   git
Using RVM:      no
Ruby Version:   2.7.7p221
Gem Version:    3.1.6
Bundler Version:2.3.15
Rake Version:   13.0.6
Redis Version:  6.2.7
Sidekiq Version:6.5.7
Go Version:     unknown

GitLab information
Version:        15.7.0
Revision:       b2a2fb69e66
Directory:      /opt/gitlab/embedded/service/gitlab-rails
DB Adapter:     PostgreSQL
DB Version:     13.8
URL:            http://localhost:9080
HTTP Clone URL: http://localhost:9080/some-group/some-project.git
SSH Clone URL:  git@localhost:some-group/some-project.git
Using LDAP:     no
Using Omniauth: yes
Omniauth Providers:

GitLab Shell
Version:        14.14.0
Repository storages:
- default:      unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path:              /opt/gitlab/embedded/service/gitlab-shell

Results of GitLab application Check

Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ... GitLab Shell version >= 14.14.0 ? ... OK (14.14.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ... Running? ... yes
Number of Sidekiq processes (cluster/worker) ... 1/1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab App ...

Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Cable config exists? ... yes
Resque config exists? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Systemd unit files or init script exist? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Systemd unit files or init script up-to-date? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Projects have namespace: ...
2/1 ... yes
1/2 ... yes
Redis version >= 6.0.0? ... yes
Ruby version >= 2.7.2 ? ... yes (2.7.7)
Git user has default SSH configuration? ... yes
Active users: ... 1
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes

Checking GitLab App ... Finished

Checking GitLab subtasks ... Finished

Possible fixes

Edited by Nigel Kukard