
Draft: Add models for Virtual Registries, part 2/2

Targeting a non-master branch. Do NOT merge this!

🔭 Context

With Maven virtual registry (&14137), we're starting the work on Virtual Registries. Virtual Registries is a feature that could be described as the evolution of the dependency proxy idea: having the GitLab instance play man in the middle between clients and artifact registries. Artifacts can be of any kind, but we're going to focus on packages and container images, starting with Maven packages specifically.

In other words, the GitLab instance can be configured to contact a set of upstreams and expose a specific virtual registry url that "talks" the artifact type API, in this case the Maven API. When a request hits this API, we'll check with the set of upstreams and the first one to answer successfully "wins". We will pull the response from that upstream, cache it in the GitLab instance and return it to the client.
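The request flow described above can be sketched in plain Ruby. This is only an illustration and not the code from this MR: the class name `VirtualRegistrySketch` is hypothetical, each upstream is modelled as a callable that returns a body or `nil`, and consulting the cache before the upstreams is an assumption.

```ruby
# Hypothetical sketch, not GitLab's implementation.
# Each upstream is a callable: it returns the response body on success
# and nil on failure (not found, upstream down, ...).
class VirtualRegistrySketch
  def initialize(upstreams)
    @upstreams = upstreams # ordered list of callables
    @cache = {}            # request path => cached response body
  end

  def fetch(path)
    # Assumption: an identical earlier request is served from the cache.
    return @cache[path] if @cache.key?(path)

    @upstreams.each do |upstream|
      body = upstream.call(path)
      next if body.nil?

      # The first upstream to answer successfully "wins":
      # cache its response and return it to the client.
      @cache[path] = body
      return body
    end

    nil # no upstream could answer
  end
end
```

With this shape, a second request for the same path does not contact any upstream, which is what lets clients keep working while an upstream is down.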

The benefits are:

  • multiple upstreams are aggregated behind a single url = simpler configuration on the clients.
  • by caching requests and using those caches in subsequent (identical) requests, we improve the reliability of the system. If the related upstream is down but we have all the correct caches in GitLab, then a client pulling dependencies for a project will work.
  • dependency firewall features. The GitLab instance can do more than just caching. We could run a vulnerability check so that we don't allow vulnerable dependencies to enter the system.

👣 First iteration's scope

Because the scope of this feature is quite large, we reduced it for the first iteration. The main aspects are:

  • Will work at (root) Group level.
  • Maven packages only.
  • Restrictions on the association counts:
    • A (root) Group can only have 1 registry (of type Maven).
    • A (maven) registry can only have 1 upstream.
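One way to picture these count restrictions is as limits that can be raised later without changing the shape of the data. The following is a minimal plain-Ruby sketch under that assumption; the constants and the `GroupSketch`, `RegistrySketch`, and `LimitExceeded` names are hypothetical and do not come from this MR.

```ruby
# Hypothetical sketch; names and constants are illustrative only.
MAX_REGISTRIES_PER_GROUP = 1   # first iteration: 1 Maven registry per root group
MAX_UPSTREAMS_PER_REGISTRY = 1 # first iteration: 1 upstream per registry

class LimitExceeded < StandardError; end

class RegistrySketch
  attr_reader :upstreams

  def initialize
    @upstreams = []
  end

  def add_upstream(url)
    if @upstreams.size >= MAX_UPSTREAMS_PER_REGISTRY
      raise LimitExceeded, "a registry can only have #{MAX_UPSTREAMS_PER_REGISTRY} upstream(s)"
    end

    @upstreams << url
  end
end

class GroupSketch
  attr_reader :registries

  def initialize
    @registries = []
  end

  def add_registry(registry)
    if @registries.size >= MAX_REGISTRIES_PER_GROUP
      raise LimitExceeded, "a group can only have #{MAX_REGISTRIES_PER_GROUP} registry(ies)"
    end

    @registries << registry
  end
end
```

Because the collections are already lists, lifting a restriction in this sketch would only mean changing a constant.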

The implementation that we start here should be able to accommodate later evolutions that lift those restrictions:

  • Support to have the Virtual Registry at a different level (such as Organisation).
  • Support for other package formats.
  • Support for artifact types other than packages, namely container images.
  • Support for multiple registries.
  • Support for multiple upstreams.
    • Support for different upstream types: local vs remote.

See the detailed analysis in #457503 (comment 1949349752).

💽 Database tables and models

This MR is part of Maven Virtual Registry: Database models (#467972) which tackles the database tables and models that we will need.

classDiagram
    class Reg["VirtualRegistries::Packages::Maven::Registry"]
    class RegU["VirtualRegistries::Packages::Maven::RegistryUpstream"]
    class U["VirtualRegistries::Packages::Maven::Upstream"]
    class CR["VirtualRegistries::Packages::Maven::CachedResponse"]

    Reg "1" --> "1" RegU
    RegU "1" --> "1" U
    U "1" --> "0..*" CR

As discussed above, several associations are 1:1 for now but will be changed into 1:n in the future.
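To make the diagram concrete, here is a plain-Ruby sketch of the same shape using `Struct`. It is not the ActiveRecord code from this MR: the `*Sketch` names and the fields (`url`, `relative_path`, `file_reference`) are hypothetical, chosen only to show how a registry reaches its cached responses through the join record.

```ruby
# Hypothetical sketch mirroring the class diagram; field names are assumptions.
CachedResponseSketch = Struct.new(:relative_path, :file_reference)
UpstreamSketch = Struct.new(:url, :cached_responses)
RegistryUpstreamSketch = Struct.new(:upstream)

MavenRegistrySketch = Struct.new(:registry_upstream) do
  # Walk Registry -> RegistryUpstream -> Upstream -> CachedResponse.
  def cached_paths
    registry_upstream.upstream.cached_responses.map(&:relative_path)
  end
end
```

The separate join record presumably exists so that the 1:1 links can become 1:n later; in this sketch that would mean replacing `registry_upstream` with a list.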

One thing to note is that we specialize the tables by artifact type and subtype, in this case packages and maven. This is because we want to avoid the situation that we have in the group package registry, where the tables packages_packages and packages_package_files hold data for package registries of all package formats. In effect, we are splitting the data by artifact type and subtype.

Moreover, some package formats can have specific settings, such as how to handle caching for specific requests (for example, metadata requests). If we were using one table for all formats, these settings would also be exposed to package formats that don't need them, which wouldn't make sense.

This MR introduces the last table. All the others were introduced in Add models for Virtual Registries, part 1/2 (!156930).

What does this MR do and why?

  • Add VirtualRegistries::Packages::Maven::CachedResponse.
    • Link it with VirtualRegistries::Packages::Maven::Upstream.
    • Set up object storage links/references
  • Add the cached response uploader
  • Add the related specs.

Obviously, the entire feature is behind a feature flag but since the models are not connected to any logic (yet), the feature flag has not been introduced in this MR.

🏎 MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

🦄 Screenshots or screen recordings

🤷

How to set up and validate locally

TBD

Edited by David Fernandez
