
Delete corrupted Docker manifests during garbage collection

Problem to solve

The GitLab Container Registry allows users to build, publish, and share Docker images alongside their source code and pipelines. To delete images from storage, users must first untag them, using the Container Registry API or the user interface. Once that is done, they can run garbage collection, which removes all unreferenced images.

However, a common problem we face is that a Docker manifest becomes corrupted and is left as a 0-byte file. When the garbage collector encounters one of these corrupted files, it fails with the error:

failed to garbage collect: failed to mark: filesystem: filesystem: invalid checksum digest format

This typically results in the user creating an issue and reaching out to GitLab support. We have created a runbook to help users remediate the issue, but we have not solved the core problem: the garbage collector fails when it could instead ignore or prune the corrupted manifests.
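To illustrate why an empty file leads to this error, here is a minimal sketch. It assumes the corrupted file is expected to hold a digest of the form `<algorithm>:<hex>` (for example `sha256:` followed by 64 hex characters). The pattern and the `parse_digest` helper are hypothetical and are not the registry's actual code.

```python
import re

# Assumed digest shape: "<algorithm>:<hex>", e.g. "sha256:<64 hex chars>".
# This pattern is an illustration only, not the registry's real validation.
DIGEST_PATTERN = re.compile(r"^[a-z0-9]+(?:[.+_-][a-z0-9]+)*:[a-fA-F0-9]{32,}$")


def parse_digest(raw: bytes) -> str:
    """Return the digest stored in a file's contents, or raise ValueError."""
    text = raw.decode("utf-8").strip()
    if not DIGEST_PATTERN.match(text):
        # A 0-byte file decodes to an empty string, which can never match.
        raise ValueError("invalid checksum digest format")
    return text
```

Under these assumptions, any well-formed digest parses, while the contents of a 0-byte file (`b""`) always raise, so a mark phase that does not catch the error aborts the whole run.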

Intended users

Further details

Ignore vs. delete

  • Ignoring the corrupted files could also work, but it doesn't fix the underlying problem: the files would still have to be deleted if the user tried to transfer or delete the project.

Proposal

  • Update the Container Registry garbage collector to give administrators an option to prune 0-byte files as part of the garbage collection process. These files are corrupted and should be deleted.
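As a rough sketch of what an opt-in prune step could look like, the following walks a storage directory, reports 0-byte files, and deletes them only when explicitly asked. The function names and the `delete` flag are hypothetical; the real garbage collector is not implemented this way, and a production version would likely limit the scan to manifest files rather than every file under the root.

```python
import os
from typing import List


def find_zero_byte_files(root: str) -> List[str]:
    """Return the sorted paths of regular files under root that are 0 bytes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so only real, empty files are reported.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                found.append(path)
    return sorted(found)


def prune_zero_byte_files(root: str, delete: bool = False) -> List[str]:
    """List 0-byte files under root; remove them only when delete is True."""
    candidates = find_zero_byte_files(root)
    if delete:
        for path in candidates:
            os.remove(path)
    return candidates
```

Defaulting `delete` to `False` keeps the step a dry run unless an administrator opts in, which matches the idea of offering pruning as an option rather than changing the default behavior.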

Permissions and Security

  • No permissions changes are required. Only administrators can run garbage collection.

Documentation

Testing

What does success look like, and how can we measure that?

  • Success means users can run garbage collection without encountering this common error.
Edited by Tim Rizzi