Discussion: What has to be true to start building a Maven dependency proxy
Context
One of the most frequently requested features for the Package Registry is a virtual registry for Maven. This key missing feature will allow many enterprise customers to migrate from Artifactory to GitLab.
Problem to solve
I'd like to understand what has to be true for us to start working on this feature as we have many customers that are patiently (or impatiently) waiting for the feature.
Proposal
Identify any blockers that prevent us from building the feature and how we can unblock them. For example, are there any open bugs, technical debt, or other constraints stopping us from starting?
What we'd like to build
We will give users the ability to add/configure one external Java repository. Once added, when a user tries to install a Java package using their project-level endpoint, GitLab will first look for the package in the project and if it's not found, will attempt to pull the package from the external repository.
- Project owners will be able to configure this via a project's settings (API or UI)
- We will support external repositories that require authentication, such as Artifactory or Sonatype
- ❓ Can the project owner change their preferred order so that the package is pulled first from the external repository and second from GitLab?
When a package is pulled from the external repository it will be imported into the GitLab project so that the next time that particular package/version is pulled it's pulled from GitLab and not the external repository.
- The benefit of this is that, as time goes on, fewer packages will have to be pulled externally.
- This will be a fast follow, once we've added initial support for the dependency proxy.
If the package is not found in their GitLab project or the external repository we will return an error (404?).
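The lookup order described above (project first, then the external repository, then an error) can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: `ProjectRegistry`, `resolve`, `fetch_external`, and the `cache_external` flag are hypothetical names, and the 404 status is the tentative choice mentioned above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# (package name, version), e.g. ("com.example:lib", "1.0.0")
PackageKey = Tuple[str, str]


@dataclass
class ProjectRegistry:
    """Hypothetical stand-in for a project's package registry."""
    packages: Dict[PackageKey, bytes] = field(default_factory=dict)


def resolve(
    registry: ProjectRegistry,
    key: PackageKey,
    fetch_external: Callable[[PackageKey], Optional[bytes]],
    cache_external: bool = True,
) -> Tuple[int, Optional[bytes], Optional[str]]:
    """Return (status, body, source) for a package request.

    fetch_external is an assumed callable that returns the package
    bytes, or None if the external repository does not have it.
    """
    # 1. Look for the package in the GitLab project first.
    body = registry.packages.get(key)
    if body is not None:
        return 200, body, "gitlab"

    # 2. Fall back to the configured external repository.
    body = fetch_external(key)
    if body is None:
        # 3. Not found in either place.
        return 404, None, None

    # Optionally import the package so that the next pull of this
    # name/version is served from GitLab.
    if cache_external:
        registry.packages[key] = body
    return 200, body, "external"
```

With `cache_external=True`, a second request for the same name/version is answered from the project without calling `fetch_external` again, which is the "fewer packages pulled externally over time" effect. Setting it to `False` models a proxy that does not import packages.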
Users can configure pipelines to scan for vulnerabilities or compliance checks and can set rules that the package cannot be pulled unless the pipeline is
- This will tie in with our planned development of GitLab CI Events for Package
Questions
- If external packages are added to the GitLab project, what happens if the same package is pulled into multiple projects and my team uses a group endpoint to pull that package name/version?
  - The latest package will be pulled.
  - This approach does mean that packages will be duplicated across projects, which can impact storage.
- What if I want to add multiple external remotes?
  - That won't be included in the MVC, but it is something that we will add support for in the future.
- What if I want to configure this for many (hundreds/thousands) of projects?
  - We will offer an API to update the project settings, but we will build this feature at the project level to start.
- What tier will this feature be included in?
  - We will launch this feature in GitLab Ultimate.
- What metrics will we track?
  - As the PM, I'd really like to understand how many projects have configured an external remote, how many packages are pulled from external remotes, and, for projects with an external remote, the ratio of packages pulled from GitLab to packages pulled externally.
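
To make the last metric concrete, here is a minimal sketch of how the GitLab-to-external pull ratio could be computed for a single project. It assumes each pull is recorded with a source label of `"gitlab"` or `"external"`. Those labels and the function name are hypothetical and are not part of any existing tracking.

```python
from collections import Counter
from typing import Iterable, Optional


def gitlab_to_external_ratio(sources: Iterable[str]) -> Optional[float]:
    """Ratio of pulls served from GitLab to pulls served externally.

    sources is an assumed sequence of per-pull labels, each either
    "gitlab" or "external". Returns None when there were no external
    pulls, because the ratio is undefined in that case.
    """
    counts = Counter(sources)
    if counts["external"] == 0:
        return None
    return counts["gitlab"] / counts["external"]
```

For example, a project with two pulls served from GitLab and one served externally has a ratio of 2.0. If externally pulled packages are imported into the project, this ratio would be expected to grow over time.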