Optimize the container registry garbage collect command
Problem to solve
The GitLab Container Registry allows developers to build, push and share Docker images/tags using the Docker client and/or GitLab CI/CD.
For organizations that build many images across many projects, it is important to regularly remove old, unused images and tags. However, the container registry garbage collection process is slow and inefficient. This makes it effectively impossible for instances with large amounts of storage to run the process, and results in expensive, wasteful use of storage.
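For context, garbage collection is run today from the command line while the registry is read-only or stopped, which is why long run times translate directly into downtime. The invocations below are from the upstream docker distribution and GitLab Omnibus documentation:

```shell
# Upstream docker distribution: preview what would be deleted without
# removing anything, then run the real collection.
registry garbage-collect --dry-run /etc/docker/registry/config.yml
registry garbage-collect /etc/docker/registry/config.yml

# On Omnibus GitLab installations, the wrapper command; -m also removes
# manifests that are no longer referenced by any tag.
sudo gitlab-ctl registry-garbage-collect -m
```

Because the registry must not accept writes while this runs, every minute of garbage collection is a minute of registry downtime.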
Intended users
- Self-managed customers that can schedule downtime for the registry to run garbage collection, but for no more than four hours.
- As an administrator running garbage collection to clean out old images and tags from a container registry containing more than 500 GB of storage, I need garbage collection to complete in less than four hours, so that I can optimize storage without sacrificing the productivity of my organization's engineering teams.
Further details
- Optimizing the code will serve as a stepping stone to delivering in-line garbage collection and will allow us to unblock some of our customers.
- Our large self-managed customers will be able to lower their cost of storage and improve the discoverability of their container registry.
- Small to mid-sized self-managed customers will be able to run garbage collection with less downtime and a less disruptive process.
Proposal
Optimize the container registry garbage collection process to run faster, enabling our customers to clean up their registries and lower the cost of storage. Possible approaches:
- Iterate on the docker distribution pruner, which is currently experimental.
- Update docker distribution code with a speed optimization (similar to this)
- Work with one of our customers to fork and optimize the code.
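To make the optimization target concrete: the registry garbage collector is a mark-and-sweep over blob storage. The real implementation lives in docker/distribution and is written in Go; the following is only a minimal Python sketch, with invented names, showing the two phases and the kind of change (concurrent deletes against object storage) a speed optimization might make:

```python
# Illustrative mark-and-sweep sketch; NOT the docker/distribution code.
# `manifests` maps manifest digests to the blob digests they reference;
# `blobs` is the set of all blob digests in the storage backend.
from concurrent.futures import ThreadPoolExecutor


def mark(manifests):
    # Mark phase: any blob referenced by at least one manifest is live.
    live = set()
    for referenced in manifests.values():
        live.update(referenced)
    return live


def sweep(blobs, live, delete, workers=8):
    # Sweep phase: remove unreferenced blobs. Issuing deletes one at a
    # time is the dominant cost on object storage, so running them
    # concurrently is one plausible optimization.
    dead = [b for b in blobs if b not in live]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(delete, dead))
    return dead


# Toy data: two manifests referencing three blobs, plus one orphan blob.
manifests = {"m1": {"b1", "b2"}, "m2": {"b2", "b3"}}
blobs = {"b1", "b2", "b3", "b4"}
deleted = []
dead = sweep(blobs, mark(manifests), deleted.append)
print(sorted(dead))  # ['b4']
```

The mark phase must enumerate every manifest before any blob can be deleted, which is why run time grows with total registry size rather than with the amount of garbage; that is the behavior the options above aim to improve.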
Permissions and Security
- As with the existing garbage collection command, this should be limited to administrators only.
What does success look like, and how can we measure that?
- Success looks like a 50% reduction in how long the process takes for our customers. We can measure this by working with the several organizations that have requested this optimization and evaluating whether it meets their needs.
- Track the number of times the command is run and how long each run takes.
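The metric above can be captured with a simple wrapper around the garbage collection invocation. This is a hypothetical sketch, not an existing GitLab interface; `GCRunTracker` and `timed_run` are invented names:

```python
# Hypothetical sketch of the proposed instrumentation: count runs and
# record wall-clock duration for each garbage collection invocation.
import time


class GCRunTracker:
    def __init__(self):
        self.durations = []  # seconds per completed run

    def timed_run(self, gc_func):
        # Wrap the real garbage collection call and record its duration.
        start = time.monotonic()
        gc_func()
        self.durations.append(time.monotonic() - start)

    @property
    def run_count(self):
        return len(self.durations)


tracker = GCRunTracker()
tracker.timed_run(lambda: time.sleep(0.01))  # stand-in for the real GC command
print(tracker.run_count)  # 1
```

Comparing recorded durations before and after the optimization would give a direct measurement of the 50% improvement target.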