Race condition finding or creating a container repository by path

Summary

A race condition occurs when finding or creating a container repository by path if the repository does not exist yet.

Some ideas:

  • On small database instances, ensure_container_repository! may fall through to find_by_path! too quickly.

Steps to reproduce

This can be reproduced locally using Podman and the container registry.

  1. Install podman https://podman.io/docs/installation
  2. Enable the registry with authentication.
  3. Optional. Configure an insecure registry with podman https://podman.io/docs/installation#registriesconf
podman machine ssh
sudo vi  /etc/containers/registries.conf 
## add the following lines
[[registry]]
location="registry.test:5000"
insecure=true
  4. Build an image with a few fairly large layers, for example with the sample Dockerfile below. Tag the image with a different path for every request; the aim is to trigger find_by_path! inside find_or_create_from_path.
FROM alpine:latest

RUN dd if=/dev/urandom of=1M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=2M bs=300M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=3M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=4M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=5M bs=200M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=6M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=7M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=8M bs=100M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=9M bs=700M count=1 "iflag=fullblock"
RUN dd if=/dev/urandom of=10M bs=100M count=1 "iflag=fullblock"
  5. Push the image to the registry. The push is expected to fail with the following error:
podman push registry.test:5000/root/project/unknown2:latest
Getting image source signatures
Copying blob a3c15c74a5f3 done   |
Copying blob 2d29c494279d done   |
Copying blob 80150619a846 done   |
Copying blob b2669a77b6d4 done   |
Copying blob dbfa4f640bf1 done   |
Copying blob 507bc4814517 done   |
Copying blob 072eb7954b0f done   |
Copying blob 6c720da2a9cd done   |
Copying blob 7b8b5191c1b9 done   |
Error: trying to reuse blob sha256:5f4d9fc4d98de91820d2a9c81e501c8cc6429bc8758b43fcb2cd50f4cab9a324 at destination: Requesting bearer token: invalid status code from registry 404 (Not Found)
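
The steps above ask for a different image path on every push. A minimal Ruby sketch of one way to generate such paths and the matching podman commands follows. The helper names, the `race-` prefix, and the local source image `localhost/race-test:latest` are assumptions made for illustration; only the registry host and the `root/project` path come from the steps above.

```ruby
require "securerandom"

# Registry host and project path taken from the reproduction steps.
REGISTRY = "registry.test:5000"
PROJECT_PATH = "root/project"

# Hypothetical name of the locally built image from the sample Dockerfile.
SOURCE_IMAGE = "localhost/race-test:latest"

# Returns `count` image references, each with a random repository name, so
# that every push targets a container repository that does not exist yet.
def unique_image_refs(count)
  Array.new(count) do
    "#{REGISTRY}/#{PROJECT_PATH}/race-#{SecureRandom.hex(8)}:latest"
  end
end

# Builds the argument lists for tagging the source image and pushing it.
def podman_commands(ref)
  [
    ["podman", "tag", SOURCE_IMAGE, ref],
    ["podman", "push", ref]
  ]
end
```

Each argument list can be passed to `system(*cmd)` to run it. The sketch only prepares the commands, so it can be inspected without a registry available.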

Example Project

Reported in #404326 (closed) on self-managed installations of GitLab 14.10.5 and 16.3.

I have not been able to reproduce on GitLab.com yet.

What is the current bug behavior?

Pushing to a new container repository fails when the corresponding ContainerRepository record does not exist yet.

Attempting to push the repository again succeeds, because the ContainerRepository has been created.

What is the expected correct behavior?

No race condition occurs, regardless of how many times ensure_container_repository! is called concurrently.
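
The expected property can be illustrated outside Rails with a small sketch. Everything here is hypothetical: `FakeRepositoryStore` is an in-memory stand-in for the container_repositories table, its Mutex plays the role of a unique index on (project_id, name), and `find_or_create` is not GitLab's implementation. The point is only that every concurrent caller should end up with the same record.

```ruby
# Hypothetical in-memory stand-in for the container_repositories table.
class FakeRepositoryStore
  DuplicateKey = Class.new(StandardError)

  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Returns the row for the key, or nil when it does not exist.
  def find(project_id, name)
    @lock.synchronize { @rows[[project_id, name]] }
  end

  # Inserts a row, raising DuplicateKey when the key already exists,
  # much like a unique index violation.
  def insert!(project_id, name)
    @lock.synchronize do
      key = [project_id, name]
      raise DuplicateKey, key.inspect if @rows.key?(key)

      @rows[key] = { id: @rows.size + 1, project_id: project_id, name: name }
    end
  end
end

# Finds the row or creates it; if another thread wins the insert,
# falls back to reading the row that thread created.
def find_or_create(store, project_id, name)
  store.find(project_id, name) || store.insert!(project_id, name)
rescue FakeRepositoryStore::DuplicateKey
  store.find(project_id, name)
end

# Calls find_or_create from `callers` threads and collects the results.
def concurrent_results(store, callers: 20)
  threads = Array.new(callers) do
    Thread.new { find_or_create(store, 1, "unknown2") }
  end
  threads.map(&:value)
end
```

In this sketch the fallback read always succeeds, because the insert and the read are serialized by the same lock. The bug described above corresponds to the case where the fallback lookup still cannot find the record.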

Relevant logs and/or screenshots

podman push output

Error: trying to reuse blob sha256:5f4d9fc4d98de91820d2a9c81e501c8cc6429bc8758b43fcb2cd50f4cab9a324 at destination: Requesting bearer token: invalid status code from registry 404 (Not Found)

exception log

ActiveRecord::RecordNotFound (Couldn't find ContainerRepository with [WHERE "container_repositories"."project_id" = $1 AND "container_repositories"."name" = $2]):
  app/models/container_repository.rb:622:in `find_by_path!'
  app/models/container_repository.rb:614:in `find_or_create_from_path'
  app/services/auth/container_registry_authentication_service.rb:229:in `ensure_container_repository!'
  app/services/auth/container_registry_authentication_service.rb:201:in `process_repository_access'
  app/services/auth/container_registry_authentication_service.rb:170:in `process_scope'
  app/services/auth/container_registry_authentication_service.rb:157:in `block in scopes'
  app/services/auth/container_registry_authentication_service.rb:156:in `map'
  app/services/auth/container_registry_authentication_service.rb:156:in `scopes'
  app/services/auth/container_registry_authentication_service.rb:28:in `execute'
  ee/app/services/ee/auth/container_registry_authentication_service.rb:12:in `execute'

Possible fixes

See #428115 (comment 1609807228)

TL;DR

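  def self.find_or_create_from_path(path)
    # Find the record, or create it if it does not exist.
    record = safe_find_or_create_by!(
      project: path.repository_project,
      name: path.repository_name
    )

    return record if record.persisted?

    # A concurrent request may still be creating the record:
    # poll for up to one second until it becomes visible.
    deadline = Time.zone.now + 1.second
    while Time.zone.now < deadline
      container = find_by_path(path)

      # `break` with a value makes the loop, and so the method, return it.
      break container if container
    end
  rescue ActiveRecord::RecordNotUnique
    # Another request inserted the same row first; read it back.
    find_by_path(path)
  end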
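
The loop in the TL;DR polls without pausing between lookups. One possible variation, sketched below in plain Ruby, sleeps briefly between attempts and uses a monotonic clock for the deadline. `poll_until` is a hypothetical helper, not an existing GitLab or Rails method, and the default timeout and interval are assumptions.

```ruby
# Calls the block repeatedly until it returns a truthy value or `timeout`
# seconds have elapsed, sleeping `interval` seconds between attempts.
# Returns the truthy value, or nil on timeout.
def poll_until(timeout: 1.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(interval)
  end
end
```

Under these assumptions, the polling part of the proposed fix could read `poll_until { find_by_path(path) }`. Whether sleeping inside the authentication request is acceptable is a separate design question.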
Edited by Jaime Martinez