Determine a path towards 'complete' for the Container Registry
Problem to solve
Based on usage and adoption, the container registry is our most valuable feature. It allows users to build and push Docker images to GitLab, update those images from GitLab CI, and share them with their teammates.
The feature is currently at the 'viable' stage and we plan to take it to 'complete' by January 2020. To achieve that, we must deliver:
We are currently facing two major obstacles in tackling the above features. The first is that we are not currently storing any data from Docker. The second is the inherent limitations of the Docker Registry API. Limits in authentication methods, naming conventions and tag pruning severely inhibit our ability to deliver a lovable product.
A lovable product must include:
- Vulnerability Scanning
- Multiple container formats or registries (OCI/Quay)
- Auditing: All operations to the repositories are tracked and data can easily feed into other systems
- Notary: Image authenticity can be ensured.
- Policy-based image replication
We must identify the best way to invest our resources to iteratively deliver short and long term value for our users.
Further details
Questions
- Does the free version of the Docker Registry allow us to build a competitive product that meets the growing demands of our customers?
- Does an enterprise product like Docker Trusted Registry or GoHarbor give us that ability?
- Does it make sense to build our own container registry? If so, how can we leverage the existing dependency proxy work done by DZ?
How we could address this:
(Disclaimer: This section needs engineering guidance and discussion. I'm only aggregating thoughts captured in existing issues, not prescribing solutions.)
- Utilize the Docker Notifications API to start collecting data and extend the Docker Registry API.
- Kamil has put a lot of thought into this and captured the requisite steps here: https://gitlab.com/gitlab-org/gitlab-ce/issues/29639
- Builds on the existing Docker Distribution Registry based solution
- Uses an open source project that we can contribute to upstream
- Docker Distribution Registry is written in Go, so there is plenty of performance headroom
- May be able to support short/mid term product goals: sorting, filtering, logging
- Not able to address current storage concerns: retention, purging
- Docker Distribution Registry is written in Go, and we have fewer people with that skill set
- Inherent issues of using asynchronous notifications to interact with file systems: race conditions, inaccuracy, latency, etc.
- Fork and extend Docker Distribution Registry
- All the benefits of implementing the notifications API
- Because Docker Distribution Registry is written in Go, maintainability and support are affected
- Keeping in sync with the open source project can be difficult
- We could add useful functionality like online garbage collection
- We could integrate more closely with GitLab without using Notifications API
- Integrate Docker Trusted Registry or GoHarbor.io, which have many of the features we need. Original ticket from Mark about GoHarbor
- Docker Trusted Registry and GoHarbor include many features that GitLab already supports or does not require, such as a database, user permissions and security scanners.
- Docker Trusted Registry supports online garbage collection (v2.6.0)
- Docker Trusted Registry is scalable
- Docker Trusted Registry and GoHarbor require extensive integration efforts with GitLab
- Long timeline, pricing and licensing concerns
- Without a clear upgrade path, addressing existing data usage may not be possible
- Invest in building our own container registry which will give us maximum control.
- We can build what we need
- We are an open source project
- Some work already done (Dependency Proxy, UX, data stores) may be able to be reused
- Makes use of our core competency: Rails
- Would not address existing storage concerns
- Would require migration of customers
- We would need to work out how to handle existing data
- Would possibly require 4 to 7 months to get to a viable replacement for Docker Distribution Registry
- Would require Go coding to integrate with Workhorse (scale)
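To make the first option above more concrete, here is a minimal sketch of how data could be collected from registry notifications. It assumes the registry is configured to POST its JSON event envelope to an endpoint we control, and that events carry the fields used here (`id`, `action`, `timestamp`, and a `target` with `repository`, `tag`, `digest`, `size`, `mediaType`). `TagStore`, `TagRecord`, `apply_envelope` and the sample data are hypothetical names for illustration only; a real implementation would persist to a database rather than memory.

```python
import json
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class TagRecord:
    repository: str
    tag: str
    digest: str
    size: int
    pushed_at: str  # timestamp string copied from the event as-is


class TagStore:
    """Hypothetical in-memory stand-in for a table of pushed tags."""

    def __init__(self) -> None:
        self.records: Dict[Tuple[str, str], TagRecord] = {}
        self._seen_event_ids: Set[str] = set()

    def apply_envelope(self, body: str) -> int:
        """Record tagged manifest pushes from one envelope; return how many were applied."""
        applied = 0
        for event in json.loads(body).get("events", []):
            event_id = event.get("id")
            # Notifications may be retried, so skip events already processed.
            if event_id is None or event_id in self._seen_event_ids:
                continue
            self._seen_event_ids.add(event_id)

            target = event.get("target", {})
            is_manifest = "manifest" in target.get("mediaType", "")
            tag = target.get("tag")
            # Ignore pulls, blob (layer) pushes and untagged pushes in this sketch.
            if event.get("action") != "push" or not is_manifest or not tag:
                continue

            record = TagRecord(
                repository=target["repository"],
                tag=tag,
                digest=target["digest"],
                size=target.get("size", 0),
                pushed_at=event.get("timestamp", ""),
            )
            self.records[(record.repository, record.tag)] = record
            applied += 1
        return applied


# Made-up sample envelope: one manifest push and one blob push.
SAMPLE_ENVELOPE = json.dumps({
    "events": [
        {
            "id": "event-1",
            "timestamp": "2019-07-01T12:00:00Z",
            "action": "push",
            "target": {
                "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                "size": 708,
                "digest": "sha256:" + "a" * 64,
                "repository": "group/project",
                "tag": "latest",
            },
        },
        {
            "id": "event-2",
            "timestamp": "2019-07-01T12:00:01Z",
            "action": "push",
            "target": {
                "mediaType": "application/octet-stream",
                "size": 1024,
                "digest": "sha256:" + "b" * 64,
                "repository": "group/project",
            },
        },
    ]
})

store = TagStore()
store.apply_envelope(SAMPLE_ENVELOPE)
```

The sketch only deduplicates by event id. It does not handle out-of-order delivery, deletes, or lost notifications, which are the race, inaccuracy and latency concerns listed for this option.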
Proposal
Decide whether to extend support for our existing container registry or to leverage the work done with the dependency proxy to build our own.
- Understand the level of effort, trade-offs and risks to deliver the core set of features required to make the GitLab Container Registry a 'Complete' product.
- Make a decision before planning 12.3
- Plan the next body of work based on the above decision
Alternate Paths
- We could choose to leave the container registry and dependency proxy alone and only work on improving npm, Maven and Conan.
- We could focus on adding net new package manager integrations and expanding our footprint.