Make the repository read-only while running cleanup

Nick Thomas requested to merge 220104-read-only-repository-cleanup into master

What does this MR do?

The "repository cleanup" operation described in https://docs.gitlab.com/ee/user/project/repository/reducing_the_repo_size_using_git.html#repository-cleanup involves rewriting history and clearing garbage (old packfiles, loose objects, etc.) out of the repository. To get maximum space savings, the garbage collection needs to be much more aggressive. That carries some risk of destroying user data if any Git write operations are in progress at the time.

This MR guards the current operation with the repository_read_only: true mechanism. This was originally designed for moving repositories between shards, but it can be repurposed here.

It's not a perfect mechanism: in particular, some user operations can still write to Git while the repository is ostensibly read-only. However, each of those cases is already a problem today for shard moves, a procedure that is outside the user's control, so this MR doesn't make things worse in that respect.

Technically we only need the repository to be read-only during the enhanced garbage-collection, but making it read-only for the whole job is a more understandable user experience.
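
To illustrate the guard described above, here is a minimal Python sketch rather than the MR's actual code. The names `Project`, `read_only_repository`, `run_cleanup`, and `RepositoryReadOnlyError` are hypothetical; only the `repository_read_only` flag comes from the description. The sketch assumes the job refuses to start if the flag is already set (for example, during a shard move), and that the flag is cleared again even when a cleanup step fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass


class RepositoryReadOnlyError(RuntimeError):
    """Raised when the repository is already read-only (assumed behavior)."""


@dataclass
class Project:
    # Hypothetical stand-in for the real model; only the flag matters here.
    name: str
    repository_read_only: bool = False


@contextmanager
def read_only_repository(project):
    """Hold the repository read-only for the duration of the block."""
    if project.repository_read_only:
        # Another operation (such as a shard move) already owns the flag,
        # so do not touch it. A real system would make this check-and-set
        # atomic, for example with a conditional database update.
        raise RepositoryReadOnlyError(f"{project.name} is already read-only")
    project.repository_read_only = True
    try:
        yield project
    finally:
        # Always release the flag, even if a cleanup step raised.
        project.repository_read_only = False


def run_cleanup(project, steps):
    """Run every cleanup step while the repository is read-only."""
    with read_only_repository(project):
        for step in steps:
            step(project)
```

Wrapping the whole job, rather than only the garbage-collection step, mirrors the choice made in this MR: the read-only window is longer than strictly necessary, but it starts and ends with the job.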

Does this MR meet the acceptance criteria?

Security

If this MR contains changes to the processing or storage of credentials or tokens, to authorization and authentication methods, or to other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Related to #220104 (closed)

Edited by Saikat Sarkar
