Repository cleanup should prune all loose objects within 30 minutes
Problem to solve
Using repository cleanup (https://docs.gitlab.com/ee/user/project/repository/reducing_the_repo_size_using_git.html#using-the-bfg-repo-cleaner) triggers housekeeping, but git gc doesn't prune all loose Git objects. This means they can still be accessed through the web interface if the object ID is known, and the repository size isn't reduced immediately.
Further details
Proposal
When triggering housekeeping, use git gc --prune=30.minutes.ago
https://git-scm.com/docs/git-gc#Documentation/git-gc.txt---pruneltdategt
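A minimal sketch of the invocation described above, assuming housekeeping shells out to git. The function names gc_command and run_housekeeping are hypothetical and are not GitLab or Gitaly APIs; only the --prune=<date> option comes from the git gc documentation linked above.

```python
import subprocess


def gc_command(prune_age="30.minutes.ago"):
    """Build the git gc invocation with an explicit prune cutoff."""
    # --prune=<date> prunes loose objects older than the given date.
    return ["git", "gc", f"--prune={prune_age}"]


def run_housekeeping(repo_path, prune_age="30.minutes.ago"):
    """Run git gc inside repo_path (hypothetical helper).

    Raises subprocess.CalledProcessError if git exits with a non-zero status.
    """
    subprocess.run(gc_command(prune_age), cwd=repo_path, check=True)
```

Keeping the command construction separate from its execution makes the cutoff easy to check without touching a real repository.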
The original proposal can cause data corruption and is therefore not feasible (see this thread for details).
To avoid race conditions in Git and possible corruption:
- Move unreferenced objects into a different packfile
- Exclude them from the size calculation done by du -sk
- Communicate to the user which objects have been isolated and are set to be deleted in X days
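The size exclusion in the list above could look roughly like the following sketch. It assumes the isolated packfile lives in its own directory; the objects/isolated path and the repo_size_kib function are hypothetical names chosen for illustration. Note that it sums apparent file sizes, whereas du -sk reports disk blocks in use, so the two numbers can differ slightly.

```python
import os

# Hypothetical location for isolated packfiles; not an actual Git or GitLab path.
ISOLATED_DIR = os.path.join("objects", "isolated")


def repo_size_kib(repo_path, excluded_dir=ISOLATED_DIR):
    """Sum file sizes under repo_path in KiB, skipping excluded_dir."""
    excluded = os.path.normpath(os.path.join(repo_path, excluded_dir))
    total_bytes = 0
    for root, dirs, files in os.walk(repo_path):
        # Prune the walk in place so nothing below the excluded directory is counted.
        dirs[:] = [
            d for d in dirs
            if os.path.normpath(os.path.join(root, d)) != excluded
        ]
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so linked targets are not counted.
            if not os.path.islink(path):
                total_bytes += os.path.getsize(path)
    return total_bytes // 1024
```

Passing a different excluded_dir, or one that does not exist, yields the full size, which is a quick way to show the user how much space the isolated objects still occupy.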
Edited by Nick Thomas