
Maven virtual registries: local upstreams

🔥 Problem statement

Until now, users of the Maven virtual registry could only build a list of remote upstreams. A remote upstream is a target URL (with optional credentials) that points to a system outside the GitLab instance.

There can be a need to support local upstreams. By local, we mean targeting a project or group that lives in the same GitLab instance as the Maven virtual registry.

Currently, this is not blocked: users can point to the project-level or group-level endpoint of the GitLab Maven package registry as a remote upstream in a virtual registry, and it will work. However, this is suboptimal because the virtual registry caches the requested files. Files coming from GitLab projects of the same instance are therefore stored twice: once by the package registry and once by the virtual registry. This duplication consumes object storage, which is not free.

🚒 Solution

The solution here is to categorize the upstreams so that the backend knows if it is dealing with a remote or local upstream.

Since polymorphic associations are not recommended, the simplest solution here would be to work at the upstream level and add an optional foreign key to a project or group.

Thus we would have (overview):

  • Update VirtualRegistries::Packages::Maven::Upstream to have a project_id or group_id column (optional foreign key to the corresponding project or group).
    • We need a validation so that this project_id or group_id is one of the projects or groups contained in the (top-level) parent group.
    • The project_id or group_id column should be set when url is not set, and vice versa (mutually exclusive).
    • username and password can be set only when url is set.
    • It is not clear at this point whether we need a kind column to quickly select local and remote upstreams, since we can already filter on url IS NULL or project_id IS NULL. Specific indexes might be required.
  • Update the handle file request service to support the case where the file comes from a local upstream.
    • This file doesn't need to be cached at all and should be returned directly from the related package file.
  • Update the check upstream service to support local upstreams. One valuable approach could be:
    • Check all local upstreams.
      • If the package is found on a local upstream: decide whether a remote upstream still needs to be checked. If yes, check it. If no, return that local upstream.
      • If the package is not found on any local upstream: ignore all local upstreams in the upstream list and walk the remote upstreams (similar to what we do today).
    • The idea is that we can check file existence on multiple local upstreams in a single database query, so we should leverage that to reduce the number of network calls we make to remote upstreams.
      • Accessing local upstreams will always be faster than remote upstreams.
  • Update the upstream APIs to allow:
    • creating an upstream with the project_id or group_id.
    • exposing the project_id or group_id field when returning an upstream.
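
The column rules above (url versus project_id or group_id, and credentials only with url) can be sketched in plain Ruby. This is only an illustration, not the actual model: UpstreamSketch, errors and local? are hypothetical names, and the sketch assumes that exactly one of url, project_id or group_id is set, which the proposal does not state explicitly for project_id versus group_id. The check that the project or group belongs to the top-level parent group is left out because it depends on the group hierarchy.

```ruby
# Plain-Ruby sketch of the proposed validation rules (no Rails involved).
# UpstreamSketch is a hypothetical stand-in for the real Upstream model.
UpstreamSketch = Struct.new(:url, :project_id, :group_id, :username, :password,
                            keyword_init: true) do
  # Returns a list of validation messages; the list is empty when valid.
  def errors
    errs = []

    # Assumption: exactly one target (url, project_id or group_id) is allowed.
    targets = [url, project_id, group_id].compact
    unless targets.size == 1
      errs << "exactly one of url, project_id or group_id must be set"
    end

    # Credentials only make sense for remote (url based) upstreams.
    if url.nil? && (username || password)
      errs << "username and password require url"
    end

    errs
  end

  def valid?
    errors.empty?
  end

  # A local upstream has no url and targets a project or a group.
  # This mirrors the "url IS NULL" filter mentioned above.
  def local?
    url.nil? && !(project_id.nil? && group_id.nil?)
  end
end

remote = UpstreamSketch.new(url: "https://repo.example.com/maven2",
                            username: "user", password: "secret")
local = UpstreamSketch.new(project_id: 42)
puts remote.valid?, local.local?
```

With such a predicate, a dedicated kind column is not strictly needed to tell the two categories apart, although an index may still be required for efficient filtering.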

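The check order proposed for the check upstream service can be sketched as follows. This is a hedged illustration, not the service itself: UpstreamEntry, find_upstream, local_hits and remote_check are hypothetical names. The sketch assumes that upstreams are kept in an ordered list and that a remote upstream still needs to be checked only when it is listed before the first local upstream that holds the file. The proposal leaves that exact rule open.

```ruby
# Hypothetical sketch of the lookup order for mixed local/remote upstreams.
UpstreamEntry = Struct.new(:name, :local, keyword_init: true)

# upstreams:    ordered list of UpstreamEntry.
# local_hits:   names of the local upstreams that hold the file. It stands in
#               for the result of the single database query.
# remote_check: callable returning true when a remote upstream has the file.
#               It stands in for a network call.
def find_upstream(upstreams, local_hits, remote_check)
  first_local = upstreams.find { |u| u.local && local_hits.include?(u.name) }

  remotes =
    if first_local
      # Assumption: only remotes listed before the local match take precedence.
      upstreams.first(upstreams.index(first_local)).reject(&:local)
    else
      # No local match: ignore local upstreams and walk every remote one.
      upstreams.reject(&:local)
    end

  # The first remote that has the file wins; otherwise fall back to the
  # local match (nil when the file is not found anywhere).
  remotes.find { |u| remote_check.call(u) } || first_local
end

upstreams = [
  UpstreamEntry.new(name: "central", local: false),
  UpstreamEntry.new(name: "project-a", local: true),
  UpstreamEntry.new(name: "other-remote", local: false)
]
never_found = ->(_upstream) { false }
result = find_upstream(upstreams, ["project-a"], never_found)
puts result.name
```

In this sketch, a local match limits the remote checks to the upstreams listed before it, which is where the savings in network calls would come from.
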
Remaining points to define:

  • implementation plan: MRs, aspects.
  • permissions.

Design Requirements

UI/UX Design Needs

  • Design an intuitive interface for selecting GitLab projects as local upstreams
  • Create clear visual differentiation between local and remote upstreams in the list view
  • Design status indicators showing the health/availability of local upstream projects
  • Develop a UI flow for testing connections to local upstreams

Mockups Needed

  • Project/group selector interface with search/filter functionality
  • Updated upstream list view showing both remote and local upstreams
  • Detail view for local upstream configuration
  • Connection test interface with appropriate success/failure states

User Experience Considerations

  • Platform engineers should be able to easily switch between configuring local and remote upstreams
  • Users should understand the storage benefits of local vs. remote upstreams
  • The relationship between the virtual registry and local project/group should be clearly visualized
  • Navigation between the virtual registry and referenced local projects/groups should be seamless

Analytics/Monitoring View

  • Create a visualization for local upstream usage patterns