Fix the dependency worker purge feature

David Fernandez requested to merge 277161-fix-dependency-proxy-purge-job into master

🌝 Context

The dependency proxy is a feature that allows users to use GitLab as a proxy between their $ docker commands and DockerHub.

This allows GitLab to cache manifests and blobs and avoid pinging DockerHub all the time.

Over time, that pull-through cache accumulates many blobs and manifests. Those entries are linked to physical files on object storage. To help administrators keep the object storage space usage in check, we implemented a very simple API endpoint.

When called, that API endpoint enqueues a background job that is responsible for deleting the blobs and manifests. The destruction is not done inline within the API request for one reason: deleting many files on object storage can take time, so it's very easy to hit the web request timeout.

In #277161 (closed), users noticed that the background job didn't work as expected: when the API URL was called, blobs and manifests were not destroyed at all.

Our small analysis revealed that the worker uses ::Gitlab.config.dependency_proxy to know whether the dependency proxy feature is enabled. The problem with that call is that we have a configuration override in effect. That override "forces" the dependency proxy to be disabled if the ::Puma module is not loaded. This is explained by the docs:

Dependency proxy requires the Puma web server to be enabled.

Guess what happens on the background job side? Yes, ::Puma is not loaded there, so the job believes that the dependency proxy feature is not enabled. As such, the background job skips the destruction of blobs and manifests.
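The failure mode can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not the actual GitLab code: `SettingsSketch`, `dependency_proxy_enabled?` and `PurgeWorkerSketch` are hypothetical names, and the configuration override is reduced to a check on whether the `Puma` constant is defined.

```ruby
# Hypothetical, simplified model of the configuration override.
# None of these names are the real GitLab classes.
module SettingsSketch
  # Assume the administrator enabled the feature in the config file.
  CONFIGURED_ENABLED = true

  # The override: the feature is reported as disabled whenever
  # the top-level Puma constant is not loaded in the current process.
  def self.dependency_proxy_enabled?
    CONFIGURED_ENABLED && Object.const_defined?(:Puma)
  end
end

# Hypothetical worker that trusts the overridden toggle.
class PurgeWorkerSketch
  def perform(blobs)
    return :skipped unless SettingsSketch.dependency_proxy_enabled?

    blobs.clear
    :purged
  end
end

blobs = %w[blob-1 blob-2]

# Background job process: Puma is not loaded, so nothing is deleted.
result_without_puma = PurgeWorkerSketch.new.perform(blobs)
blobs_left_without_puma = blobs.size

# Web process: Puma is defined (simulated here), so the same call purges.
Object.const_set(:Puma, Module.new)
result_with_puma = PurgeWorkerSketch.new.perform(blobs)
```

The same worker code behaves differently depending on which process runs it, which is why the job appeared in the logs as executed while the cache stayed untouched.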

This MR aims to:

  • Fix the background job
  • Update the API endpoint with minor improvements

🔬 What does this MR do?

  • The background job will avoid looking at the config toggle.
    • Update the related spec
  • The API will now respond 202 Accepted instead of 200 OK
    • This is in line with other similar API endpoints.
    • In addition, it prevents the background job ID from being returned
  • Update the API related spec
  • Update the related documentation
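The intended behaviour after these changes can be sketched as follows. This is a minimal illustration, not the code in this MR: `FixedPurgeWorkerSketch` and `purge_endpoint_sketch` are hypothetical names, and the job queue is modelled as a plain array.

```ruby
# Hypothetical worker after the fix: it no longer consults the config
# toggle, so it purges regardless of which process it runs in.
class FixedPurgeWorkerSketch
  def perform(cache)
    cache[:blobs].clear
    cache[:manifests].clear
  end
end

# Hypothetical stand-in for the API endpoint: it enqueues the job and
# answers with the 202 status only, without exposing the job ID.
def purge_endpoint_sketch(queue, group_id)
  queue << group_id
  202
end

queue = []
cache = { blobs: %w[blob-1 blob-2], manifests: %w[manifest-1] }

response = purge_endpoint_sketch(queue, 42)

# Simulate the background process picking up the enqueued job.
queue.each { FixedPurgeWorkerSketch.new.perform(cache) }
```

Responding with 202 signals that the request was accepted for asynchronous processing, which matches what the endpoint actually does.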

🎥 Screenshots or Screencasts (strongly suggested)

See next section

How to set up and validate locally (strongly suggested)

Requirements:

  • Have the $ docker client installed
  • Have a personal access token ready with the scopes api, read_registry and write_registry
  1. Enable the container registry
  2. Enable the dependency proxy (it should be enabled by default)
    • Make sure that your <gitlab_base_url with web port> is included in the "insecure registries" of the docker config.
  3. Create a group
    • Your user must be at least GUEST on that group
  4. Ensure that the dependency proxy is enabled in the group settings: <gitlab_base_url>/groups/<group_path>/-/dependency_proxy
  5. Log in with docker to the GitLab instance.
    docker login <gitlab_base_url>:<web_port>
    Authenticating with existing credentials...
    Login Succeeded
  6. Let's pull an image through the dependency proxy:
    docker pull <gitlab_base_url with web port>/<group_path>/dependency_proxy/containers/alpine:latest
    latest: Pulling from bananas1/dependency_proxy/containers/alpine
    a0d0a0d46f8b: Pull complete 
    Digest: sha256:69704ef328d05a9f806b6b8502915e6a0a4faa4d72018dc42343f511490daf8a
    Status: Downloaded newer image for <gitlab_base_url with web port>/<group_path>/dependency_proxy/containers/alpine:latest
    <gitlab_base_url with web port>/<group_path>/dependency_proxy/containers/alpine:latest
  7. Have a look at the group settings again: <gitlab_base_url>/groups/<group_path>/-/dependency_proxy. We see this: Screenshot_2021-09-07_at_11.09.14

Now that we have some entries in the cache, we can proceed with our tests.

On master

  1. Tail the background workers logs
    $ gdk tail rails-background-jobs
  2. Let's curl the purge endpoint to enqueue a job:
    $ curl --request DELETE --header "PRIVATE-TOKEN: <personal access token>" "<gitlab_base_url>/api/v4/groups/<group_id>/dependency_proxy/cache"
    "69e6348477bcedd82c04ecc7"
  3. Notice in the background worker logs a job with class PurgeDependencyProxyCacheWorker. The job has been executed.
  4. Check the group settings again (<gitlab_base_url>/groups/<group_path>/-/dependency_proxy). Blobs are still there. 😭

With this MR

  1. Tail the background workers logs
    $ gdk tail rails-background-jobs
  2. Let's curl the purge endpoint to enqueue a job:
    $ curl --request DELETE --header "PRIVATE-TOKEN: <personal access token>" "<gitlab_base_url>/api/v4/groups/<group_id>/dependency_proxy/cache"
    "202"
  3. Notice in the background worker logs a job with class PurgeDependencyProxyCacheWorker. The job has been executed.
  4. Check the group settings again (<gitlab_base_url>/groups/<group_path>/-/dependency_proxy). This time around, the blobs have been destroyed 🚀

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by David Fernandez