Dependency proxy caches the manifest

Steve Abrams requested to merge 241639-dependency-proxy-manifests into master

🌲 Context

The Dependency Proxy allows users to cache the container images they pull and use from DockerHub. This is useful for things like running pipelines: instead of pulling from DockerHub on every run, once an image is cached we can pull it from the cache and avoid re-downloading a variety of files, which has the potential to speed up the pipeline.

When an image is pulled from DockerHub, two types of files are downloaded:

  • A manifest: Think of this like the table of contents or index, describing what the image is made of, and what other pieces are needed to build it.
  • Blobs: These are the individual pieces that make up the image.
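For reference, a simplified manifest in the Docker distribution manifest v2 format looks like the fragment below (the sizes and digests here are illustrative placeholders, not real values) — the `digest` fields are what point at the config and layer blobs:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1469,
    "digest": "sha256:<config-digest>"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2811478,
      "digest": "sha256:<layer-digest>"
    }
  ]
}
```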

Right now, the Dependency Proxy stores the blob files but downloads the manifest every time a user pulls an image. The reason for this is that the manifest tells us whether a given image is out of date (that is, whether some of its blobs have changed), so we know when to download a new version.

Recently, Docker has started to enforce rate limiting on image pulls, counting two GET requests for a manifest as one pull. This means that if we cache the manifest in addition to the blobs, we can help prevent users from being rate limited when they run many pipelines that always fetch the same image (think of a pipeline that starts with `image: node`).

The trouble is: if we cache the manifest, how do we know when it is out of date? Docker allows HEAD requests for the manifest that do not count towards the rate limit. The HEAD response includes a sha256 digest value that we can compare against the cached digest to determine whether a given manifest needs to be updated.
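As a rough sketch (not the actual service code in this MR — the URL shape, `Docker-Content-Digest` header, and `Accept` media type follow the standard Docker Registry v2 API conventions), the freshness check amounts to a HEAD request plus a digest comparison:

```ruby
require "net/http"
require "uri"

# Build a HEAD request for an image manifest against a Docker Registry v2
# endpoint. The registry answers with the manifest digest in the
# Docker-Content-Digest response header, without counting it as a pull.
def manifest_head_request(registry, image, tag, token)
  uri = URI("#{registry}/v2/#{image}/manifests/#{tag}")
  request = Net::HTTP::Head.new(uri)
  request["Authorization"] = "Bearer #{token}"
  request["Accept"] = "application/vnd.docker.distribution.manifest.v2+json"
  request
end

# Given the digest from the HEAD response and the digest stored in the
# database, decide whether the cached manifest is still fresh. A nil
# head_digest (e.g. network failure) is treated as "not verifiably fresh".
def cached_manifest_fresh?(head_digest, cached_digest)
  !head_digest.nil? && head_digest == cached_digest
end
```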

🔎 What does this MR do?

  • Creates and updates some of the dependency proxy services to cache the manifest file, and builds the logic depicted below to determine whether a new manifest should be pulled:
```mermaid
graph TD
    A[Receive manifest request] --> | We have the manifest cached. | B{Docker manifest HEAD request}
    A --> | We do not have the manifest cached. | C{Docker manifest GET request}
    B --> | Digest matches the one in the DB | D[Fetch manifest from cache]
    B --> | Network failure, cannot reach DockerHub | D
    B --> | Digest does not match the one in DB | C
    C --> E[Save manifest to cache, save digest to database]
    D --> F[Return manifest]
    E --> F
```
  • We also cover the case where we have a cached manifest but the HEAD request fails (DockerHub is completely unreachable): we return the cached manifest so users can keep working independently of DockerHub.
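The decision flow above can be sketched in plain Ruby (the method and symbol names are illustrative, not the actual service classes in this MR):

```ruby
# Decide how to serve a manifest request, mirroring the flow chart above.
#
# cached_digest: digest stored in the database, or nil if nothing is cached
# head_digest:   digest returned by the HEAD request, or nil on network failure
#
# Returns :pull_and_cache (GET from DockerHub, save manifest and digest)
# or :fetch_from_cache (serve the cached manifest).
def manifest_action(cached_digest:, head_digest:)
  return :pull_and_cache if cached_digest.nil?  # nothing cached yet
  return :fetch_from_cache if head_digest.nil?  # DockerHub unreachable: serve cache
  head_digest == cached_digest ? :fetch_from_cache : :pull_and_cache
end
```

Note that a HEAD failure only falls back to the cache when something is cached; with an empty cache the GET still has to happen (and will fail loudly if DockerHub is down).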

🐘 Database

Unfortunately, the way I originally expected to query dependency_proxy_manifests is not how the feature ended up working. We search for dependency_proxy_manifests within a group by a given file_name rather than by a given digest. This is why I have removed the existing index and added a new unique index on [:group_id, :file_name].

Example query:

-- Query generated by group.dependency_proxy_manifests.find_by_file_name(file_name)
-- Note: this table is empty, so there is no data to run against on production/database-lab

SELECT "dependency_proxy_manifests".* 
FROM "dependency_proxy_manifests" 
WHERE "dependency_proxy_manifests"."group_id" = 1 
AND "dependency_proxy_manifests"."file_name" = 'alpine:latest.json' 
LIMIT 1;

                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.15..4.69 rows=1 width=138)
   ->  Index Scan using index_dependency_proxy_manifests_on_group_id_and_file_name on dependency_proxy_manifests  (cost=0.15..4.69 rows=1 width=138)
         Index Cond: (group_id = 1)
         Filter: (file_name = 'alpine:latest.json'::text)
(4 rows)

Note: This table is empty; it will not be in use until the MR is merged.

Up migration:

== 20201202160105 AddGroupFileNameIndexToDependencyProxyManifests: migrating ==
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:dependency_proxy_manifests, [:group_id, :file_name], {:name=>"index_dependency_proxy_manifests_on_group_id_and_file_name", :unique=>true, :algorithm=>:concurrently})
   -> 0.0066s
-- execute("SET statement_timeout TO 0")
   -> 0.0003s
-- add_index(:dependency_proxy_manifests, [:group_id, :file_name], {:name=>"index_dependency_proxy_manifests_on_group_id_and_file_name", :unique=>true, :algorithm=>:concurrently})
   -> 0.0108s
-- execute("RESET ALL")
   -> 0.0006s
-- transaction_open?()
   -> 0.0000s
-- indexes(:dependency_proxy_manifests)
   -> 0.0048s
-- remove_index(:dependency_proxy_manifests, {:algorithm=>:concurrently, :name=>"index_dependency_proxy_manifests_on_group_id_and_digest"})
   -> 0.0060s
== 20201202160105 AddGroupFileNameIndexToDependencyProxyManifests: migrated (0.0310s)

Down migration:

== 20201202160105 AddGroupFileNameIndexToDependencyProxyManifests: reverting ==
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:dependency_proxy_manifests, [:group_id, :digest], {:name=>"index_dependency_proxy_manifests_on_group_id_and_digest", :algorithm=>:concurrently})
   -> 0.0069s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- add_index(:dependency_proxy_manifests, [:group_id, :digest], {:name=>"index_dependency_proxy_manifests_on_group_id_and_digest", :algorithm=>:concurrently})
   -> 0.0079s
-- execute("RESET ALL")
   -> 0.0004s
-- transaction_open?()
   -> 0.0000s
-- indexes(:dependency_proxy_manifests)
   -> 0.0043s
-- remove_index(:dependency_proxy_manifests, {:algorithm=>:concurrently, :name=>"index_dependency_proxy_manifests_on_group_id_and_file_name"})
   -> 0.0048s
== 20201202160105 AddGroupFileNameIndexToDependencyProxyManifests: reverted (0.0267s)

Screenshots (strongly suggested)

Terminal output demonstrating that caching prevents the rate limit counter from decreasing
# some output from the terminal showing the rate limit counter does not change once the image is cached

$ ratelimitcheck # this is an alias I have that makes a few API calls to check the rate limit
RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 98;w=21600

$ docker pull gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
latest: Pulling from pub-group/dependency_proxy/containers/alpine
188c0c94c7c5: Already exists
Digest: sha256:5ab5a6872b264fe4fd35d63991b9b7d8425f4bc79e7cf4d563c10956581170c9
Status: Image is up to date for gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest

$ ratelimitcheck
RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 97;w=21600

$ docker pull gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
latest: Pulling from pub-group/dependency_proxy/containers/alpine
188c0c94c7c5: Already exists
Digest: sha256:5ab5a6872b264fe4fd35d63991b9b7d8425f4bc79e7cf4d563c10956581170c9
Status: Image is up to date for gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest

$ ratelimitcheck
RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 97;w=21600

$ docker pull gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
latest: Pulling from pub-group/dependency_proxy/containers/alpine
188c0c94c7c5: Already exists
Digest: sha256:5ab5a6872b264fe4fd35d63991b9b7d8425f4bc79e7cf4d563c10956581170c9
Status: Image is up to date for gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest

$ ratelimitcheck
RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 97;w=21600

In a separate terminal with a Rails console:

[1] pry(main)> Group.find(107).dependency_proxy_manifests.destroy_all
  Group Load (0.5ms)  SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = $1 AND "namespaces"."id" = $2 LIMIT $3  [["type", "Group"], ["id", 107], ["LIMIT", 1]]
  DependencyProxy::Manifest Load (1.9ms)  SELECT "dependency_proxy_manifests".* FROM "dependency_proxy_manifests" WHERE "dependency_proxy_manifests"."group_id" = $1  [["group_id", 107]]
   (0.2ms)  BEGIN
  DependencyProxy::Manifest Destroy (0.3ms)  DELETE FROM "dependency_proxy_manifests" WHERE "dependency_proxy_manifests"."id" = $1  [["id", 3]]
=> [#<DependencyProxy::Manifest:0x00007fd0c959f9b0
     id: 3,
     created_at: Wed, 02 Dec 2020 16:38:42 UTC +00:00,
     updated_at: Wed, 02 Dec 2020 16:39:56 UTC +00:00,
     group_id: 107,
     size: 2784,
     file_store: 1,
     file_name: "alpine:latest.json",
     file: "alpine:latest.json",
     digest: "sha256:5ab5a6872b264fe4fd35d63991b9b7d8425f4bc79e7cf4d563c10956581170c9">]

Back in the other terminal:

$ docker pull gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
latest: Pulling from pub-group/dependency_proxy/containers/alpine
188c0c94c7c5: Already exists
Digest: sha256:5ab5a6872b264fe4fd35d63991b9b7d8425f4bc79e7cf4d563c10956581170c9
Status: Image is up to date for gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest
gdk.test:3001/pub-group/dependency_proxy/containers/alpine:latest

$ ratelimitcheck
RateLimit-Limit: 100;w=21600
RateLimit-Remaining: 96;w=21600

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

Related to #241639 (closed)
