Repository cleanup should prune all loose objects within 30 minutes

Problem to solve

Using repository cleanup (https://docs.gitlab.com/ee/user/project/repository/reducing_the_repo_size_using_git.html#using-the-bfg-repo-cleaner) triggers housekeeping, but git gc doesn't prune all loose Git objects: by default it only prunes loose objects older than two weeks. This means recently unreferenced objects can still be accessed through the web interface if the object ID is known, and the repository size isn't reduced immediately.

Further details

Proposal

When triggering housekeeping, use git gc --prune=30.minutes.ago

https://git-scm.com/docs/git-gc#Documentation/git-gc.txt---pruneltdategt
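
For illustration, here is a minimal sketch of how a housekeeping job might invoke that command. The function names and the repository path are hypothetical and are not taken from the GitLab codebase; only `git gc --prune=<date>` and `git -C <path>` are real Git options.

```python
import subprocess


def build_gc_command(repo_path, grace="30.minutes.ago"):
    """Return the argv for a git gc run that prunes loose objects older than `grace`."""
    # `git -C <path>` runs the command as if git had been started in <path>.
    return ["git", "-C", repo_path, "gc", f"--prune={grace}"]


def run_housekeeping(repo_path, grace="30.minutes.ago"):
    """Run git gc in the given repository (hypothetical helper)."""
    # check=True raises CalledProcessError if git exits with a non-zero status.
    subprocess.run(build_gc_command(repo_path, grace), check=True)
```

Keeping the command construction separate from the call makes the grace period easy to change or test without touching a real repository.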

The original proposal can cause data corruption and is therefore not feasible (see this thread for details).

To avoid race conditions in Git and possible corruption:

  1. Move unreferenced objects into a different packfile
  2. Exclude them from the size calculation done by du -sk
  3. Communicate to the user which objects have been isolated and that they are set to be deleted in X days
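
As a rough sketch of step 3 only, the unreferenced objects could be listed with `git fsck --unreachable --no-reflogs` and turned into a notice for the user. The helper names, the message wording, and the `days` parameter are assumptions made for illustration; this does not move objects into a separate packfile or change the size calculation.

```python
import subprocess


def parse_unreachable(fsck_output):
    """Parse `git fsck --unreachable` output into (type, object_id) pairs."""
    objects = []
    for line in fsck_output.splitlines():
        parts = line.split()
        # Lines of interest look like: "unreachable blob <object id>"
        if len(parts) == 3 and parts[0] == "unreachable":
            objects.append((parts[1], parts[2]))
    return objects


def list_unreachable(repo_path):
    """Ask git which objects are unreachable from any ref (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "fsck", "--unreachable", "--no-reflogs"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if fsck exits non-zero
    )
    return parse_unreachable(result.stdout)


def deletion_notice(objects, days):
    """Build a user-facing message; the wording is a placeholder."""
    return (
        f"{len(objects)} unreferenced objects have been isolated "
        f"and are set to be deleted in {days} days."
    )
```

Parsing is kept separate from the git call so the notice can be generated from stored fsck output as well as from a live repository.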
Edited by Nick Thomas