Discussion: Deleting merge request refs after being merged/closed
Background
In #223156 (closed), we started to remove stale merge request refs (14 days after merged/closed) so they don't get advertised on git push/pull.
When it was implemented and deployed to production, some incidents (e.g. gitlab-com/gl-infra/production#3288 (closed)) occurred due to number of DeleteRefs calls. Due to that, scheduling was improved (#296874 (closed)) and the changes are put behind a feature flag (#336070 (closed)) that still needs to be rolled out.
Current Approach
NOTE: This is still behind a disabled feature flag which is not scoped per project.
We utilize sidekiq-cron to schedule ScheduleMergeRequestCleanupRefsWorker. This worker is responsible for scheduling MergeRequestCleanupRefsWorker every minute.
The MergeRequestCleanupRefsWorker is responsible for executing the MergeRequests::CleanupRefsService. There's a limit of maximum 4 jobs running at the same time.
The MergeRequests::CleanupRefsService checks for recent MR that is eligible to be cleaned up (14 days since it was merged/closed). This service utilizes DeleteRefs RPC to delete 2 refs (refs/merge-requests/#{iid}/head and refs/merge-requests/#{iid}/merge) and keep them around if they aren't kept around yet.
Limitation
Since there are lots of projects on GitLab.com, there's no guarantee which project/repository gets cleaned up first. Also, we are limiting the number of DeleteRefs call so it'll also take a while (or the job may never catch up) to clean up all stale merge request refs.
Questions
- Given that we're utilizing
DeleteRefsRPC in the worker to delete multiple refs, would it be right to say that what we're experiencing in #27347 (closed) is a possibility? - Do we really need keep around refs if we don't want refs to be deleted on garbage collect? Is there a way around it?
- Is there a more efficient way to delete old refs without calling
DeleteRefsfrom gitlab-rails (maybe on Git/Gitaly level)?
Started thread for each question.