Dependency Proxy fails to pull images from object storage
### Summary
There is an issue with the Dependency Proxy in which images are pulled correctly from Docker Hub, but are then not available when attempting to pull them from the cache. The problem is that GitLab looks for the blob in local storage when it actually exists in object storage.
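A minimal sketch of the suspected mechanism, assuming the store lookup falls back to local when `file_store` is `nil` (the constants mirror GitLab's `ObjectStorage::Store` values; the fallback itself is my reading of the error, not something confirmed in this report):

```ruby
# Sketch of the suspected failure mode (assumption: a nil file_store falls back to LOCAL).
LOCAL  = 1 # ObjectStorage::Store::LOCAL
REMOTE = 2 # ObjectStorage::Store::REMOTE

def effective_store(file_store)
  file_store || LOCAL # nil => LOCAL, so the blob is served from disk instead of S3
end

effective_store(nil)    # => 1: read attempted from /var/opt/gitlab/... and fails
effective_store(REMOTE) # => 2: read would go to object storage
```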
### Steps to reproduce
Thanks to @anton for documenting two methods of reproducing this bug:
#### Method 1: Via `.gitlab-ci.yml`
- Enable object storage for the Dependency Proxy using the settings below:

  ```ruby
  gitlab_rails['dependency_proxy_enabled'] = true
  gitlab_rails['dependency_proxy_object_store_enabled'] = true
  gitlab_rails['dependency_proxy_object_store_direct_upload'] = true
  gitlab_rails['dependency_proxy_object_store_background_upload'] = false
  gitlab_rails['dependency_proxy_object_store_remote_directory'] = "xxxxxxx"
  gitlab_rails['dependency_proxy_object_store_connection'] = {
    'provider' => 'AWS',
    'region' => 'ap-southeast-2',
    'aws_access_key_id' => 'xxxxxxxx',
    'aws_secret_access_key' => 'xxxxxxx',
  }
  ```
- Create a project with the following inside the `.gitlab-ci.yml` file:

  ```yaml
  build:
    image: gitlab.tld/group/dependency_proxy/containers/alpine:latest
    stage: build
    script:
      - echo "Deploy complete"
  ```
- Observe that the job completes successfully. At this point, checking S3 shows the blobs exist there, and they are not in local storage:

  ```plaintext
  Running with gitlab-runner 13.0.0 (c127439c)
    on antons-runner-docker vDxnAt5P
  Preparing the "docker" executor
  Using Docker executor with image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Pulling docker image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Using docker image sha256:a24bb4013296f61e89ba57005a7b3e52274d8edd3ae2077d04395f806b63d83e for gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Preparing environment
  Running on runner-vdxnat5p-project-246-concurrent-0 via antons-runner01...
  Getting source from Git repository
  Fetching changes with git depth set to 50...
  Reinitialized existing Git repository in /builds/group/alpine/.git/
  From https://gitlab.tld/group/alpine
   * [new ref]         refs/pipelines/213 -> refs/pipelines/213
     43e4457..ae7caf0  master     -> origin/master
  Checking out ae7caf0d as master...
  Skipping Git submodules setup
  Restoring cache
  Downloading artifacts
  Running before_script and script
  $ echo "Deploy complete"
  Deploy complete
  Running after_script
  Saving cache
  Uploading artifacts for successful job
  Job succeeded
  ```
- SSH into the runner, and remove the image:

  ```shell
  # Identify the image ID of gitlab.tld/group/dependency_proxy/containers/alpine:latest
  docker images -a

  # Delete the gitlab.tld/group/dependency_proxy/containers/alpine:latest image
  docker rmi xxxxxxxx
  ```
- Run another pipeline. This time, the Docker pull fails:

  ```plaintext
  Running with gitlab-runner 13.0.0 (c127439c)
    on antons-runner-docker vDxnAt5P
  Preparing the "docker" executor
  Using Docker executor with image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Pulling docker image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  ERROR: Preparation failed: Error: No such image: gitlab.tld/group/dependency_proxy/containers/alpine:latest (docker.go:205:0s)
  Will be retried in 3s ...
  Using Docker executor with image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Pulling docker image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  ERROR: Preparation failed: Error: No such image: gitlab.tld/group/dependency_proxy/containers/alpine:latest (docker.go:205:0s)
  Will be retried in 3s ...
  Using Docker executor with image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  Pulling docker image gitlab.tld/group/dependency_proxy/containers/alpine:latest ...
  ERROR: Preparation failed: Error: No such image: gitlab.tld/group/dependency_proxy/containers/alpine:latest (docker.go:205:0s)
  Will be retried in 3s ...
  ERROR: Job failed (system failure): Error: No such image: gitlab.tld/group/dependency_proxy/containers/alpine:latest (docker.go:205:0s)
  ```
The `production_json.log` contains multiple errors like the one below, indicating that the blob does not exist in local storage. The bug is that GitLab looks for the blob in local storage, when it actually exists on S3:

```json
{
  "method": "GET",
  "path": "/v2/group/dependency_proxy/containers/alpine/blobs/sha256:df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c",
  "format": "html",
  "controller": "Groups::DependencyProxyForContainersController",
  "action": "blob",
  "status": 500,
  "time": "2020-06-03T04:09:49.987Z",
  "params": [
    { "key": "group_id", "value": "group" },
    { "key": "image", "value": "alpine" },
    { "key": "sha", "value": "sha256:df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c" }
  ],
  "remote_ip": "68.183.186.26",
  "user_id": null,
  "username": null,
  "ua": "docker/19.03.9 go/go1.13.10 git-commit/9d988398e7 kernel/4.15.0-58-generic os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)",
  "queue_duration_s": 0.0032,
  "correlation_id": "nZNw8ZW7PWa",
  "meta.root_namespace": "group",
  "meta.caller_id": "Groups::DependencyProxyForContainersController#blob",
  "redis_calls": 5,
  "redis_duration_s": 0.000756,
  "cpu_s": 0.04,
  "exception.class": "ActionController::MissingFile",
  "exception.message": "Cannot read file /var/opt/gitlab/gitlab-rails/shared/dependency_proxy/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz",
  "exception.backtrace": [
    "app/controllers/concerns/send_file_upload.rb:22:in `send_upload'",
    "ee/app/controllers/groups/dependency_proxy_for_containers_controller.rb:27:in `blob'",
    "ee/lib/gitlab/ip_address_state.rb:10:in `with'",
    "ee/app/controllers/ee/application_controller.rb:44:in `set_current_ip_address'",
    "app/controllers/application_controller.rb:496:in `set_current_admin'",
    "lib/gitlab/session.rb:11:in `with_session'",
    "app/controllers/application_controller.rb:487:in `set_session_storage'",
    "app/controllers/application_controller.rb:481:in `set_locale'",
    "lib/gitlab/error_tracking.rb:48:in `with_context'",
    "app/controllers/application_controller.rb:546:in `sentry_context'",
    "app/controllers/application_controller.rb:474:in `block in set_current_context'",
    "lib/gitlab/application_context.rb:52:in `block in use'",
    "lib/gitlab/application_context.rb:52:in `use'",
    "lib/gitlab/application_context.rb:20:in `with_context'",
    "app/controllers/application_controller.rb:467:in `set_current_context'",
    "ee/lib/gitlab/jira/middleware.rb:19:in `call'"
  ],
  "db_duration_s": 0.00541,
  "view_duration_s": 0.0,
  "duration_s": 1.0179
}
```
Notice that GitLab thinks the blob is stored at `/var/opt/gitlab/gitlab-rails/shared/dependency_proxy/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz`.
In the Rails console, the blob has a `file_store` of `nil`, and `blob.file` seems to point to local storage:

```ruby
irb(main):009:0> blob = DependencyProxy::Blob.find(44)
=> #<DependencyProxy::Blob id: 44, group_id: 104, created_at: "2020-06-03 04:03:38", updated_at: "2020-06-03 04:03:38", size: 2797541, file_store: nil, file_name: "df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c40...", file: "df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c40...">

irb(main):010:0> blob.attributes
=> {"id"=>44, "group_id"=>104, "created_at"=>Wed, 03 Jun 2020 04:03:38 UTC +00:00, "updated_at"=>Wed, 03 Jun 2020 04:03:38 UTC +00:00, "size"=>2797541, "file_store"=>nil, "file_name"=>"df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz", "file"=>"df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz"}

irb(main):011:0> blob.file
=> #<DependencyProxy::FileUploader:0x00007f9985d1c678 @model=#<DependencyProxy::Blob id: 44, group_id: 104, created_at: "2020-06-03 04:03:38", updated_at: "2020-06-03 04:03:38", size: 2797541, file_store: nil, file_name: "df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c40...", file: "df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c40...">, @mounted_as=:file, @file=#<CarrierWave::SanitizedFile:0x00007f9985d2b560 @file="/var/opt/gitlab/gitlab-rails/shared/dependency_proxy/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz", @original_filename=nil, @content_type=nil, @content=nil>, @filename=nil, @cache_id=nil, @versions={}, @object_store=1, @storage=#<CarrierWave::Storage::File:0x00007f9985d1c2e0 @uploader=#<DependencyProxy::FileUploader:0x00007f9985d1c678 ...
```
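For reference, affected records can be listed from the Rails console. This is a diagnostic sketch using the model and column shown above, not a step from the original reproduction:

```ruby
# Diagnostic sketch: list Dependency Proxy blobs whose file_store column is nil.
# DependencyProxy::Blob and its columns appear in the console output above;
# the query itself is only a suggested way to spot affected records.
DependencyProxy::Blob.where(file_store: nil).find_each do |blob|
  puts "blob #{blob.id}: #{blob.file_name} (file_store=#{blob.file_store.inspect})"
end
```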
Notice that the file does not exist in local storage:

```shell
$ ls /var/opt/gitlab/gitlab-rails/shared/dependency_proxy/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz
ls: cannot access '/var/opt/gitlab/gitlab-rails/shared/dependency_proxy/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz': No such file or directory
```
But it does exist on S3:

```shell
$ aws s3 ls s3://my-s3-bucket/5e/f6/5ef6fdf32513aa7cd11f72beccf132b9224d33f271471fff402742887a171edf/dependency_proxy/104/files/44/df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz
2020-06-03 16:03:40    2797541 df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c.gz
```
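Given that the file is in S3 while the record says `file_store: nil`, one possible, unverified workaround sketch is to backfill the column with the remote store value. This is only a suggestion, not something tested as part of this report:

```ruby
# Hypothetical, unverified workaround sketch (not from the original report):
# the blob above is present in S3 but carries file_store: nil, so backfilling the
# column with the remote store value should make GitLab read it from object storage.
blob = DependencyProxy::Blob.find(44)
blob.update!(file_store: ObjectStorage::Store::REMOTE) # REMOTE == 2
blob.reload.file_store                                 # => 2
```

Even if this works, it only repairs records already written; new pulls through the proxy would presumably still be saved with a `nil` `file_store` until the underlying bug is fixed.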
#### Method 2: Via `docker pull`
You can also use the following sequence of commands to produce the same error in the production logs:
```shell
# Pull an image into the dependency proxy for the first time; it should succeed
docker pull gitlab.tld/group/dependency_proxy/containers/alpine:latest

# Identify the image ID of gitlab.tld/group/dependency_proxy/containers/alpine:latest
docker images -a

# Delete the gitlab.tld/group/dependency_proxy/containers/alpine:latest image
docker rmi xxxxxxxx

# Pull the image a second time
docker pull gitlab.tld/group/dependency_proxy/containers/alpine:latest
```
When you pull the image the second time, you will see the output below. The same `ActionController::MissingFile` error is shown in the production logs as well:
```shell
$ docker pull gitlab.tld/group/dependency_proxy/containers/alpine:latest
latest: Pulling from group/dependency_proxy/containers/alpine
df20fa9351a1: Downloading
received unexpected HTTP status: 500 Internal Server Error
```
### Example Project
https://gitlab.com/gitlab-org/ci-cd/package-stage/feature-testing/dependency-proxy
### What is the current bug behavior?

- The image cannot be pulled from the Dependency Proxy cache: the blob request returns a 500 (`ActionController::MissingFile`) because GitLab looks for the blob in local storage even though it was uploaded to object storage.

### What is the expected correct behavior?
- The image is downloaded successfully