Do git gc, not repack, after moving a repository between shards

What does this MR do?

In !20863 (merged), we introduced an additional git garbage-collection operation that runs immediately after a repository has been moved between shards. In review, the gc was reduced to a full_repack because we were concerned about running a gc for every repository on a shard in the event of a mass move.

However, it transpired that the RPCs we're concerned about need the commit-graph to be written, not just pack file bitmaps, if they are to perform acceptably at scale. So this MR puts us back to running a gc.
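To make the difference concrete, here is a minimal sketch of the two housekeeping options and of how one might check that a commit-graph was actually written. The task names, function names, and exact git arguments are assumptions for illustration only; they are not the commands Gitaly runs. The one thing taken from git itself is the location of the commit-graph files under `objects/info`.

```python
from pathlib import Path
from typing import List, Optional

# Hypothetical task names for illustration; not the real worker or RPC names.
FULL_REPACK = "full_repack"
GC = "gc"


def housekeeping_argv(task: str) -> List[str]:
    """Return an approximate git command line for a housekeeping task."""
    if task == FULL_REPACK:
        # Repacks reachable objects into a single pack and writes a bitmap
        # index. git repack does not write a commit-graph.
        return ["git", "repack", "-A", "-d", "--write-bitmap-index"]
    if task == GC:
        # git gc also repacks; with gc.writeCommitGraph enabled it
        # additionally writes the commit-graph file.
        return ["git", "-c", "gc.writeCommitGraph=true", "gc"]
    raise ValueError(f"unknown housekeeping task: {task}")


def post_move_task(flag_enabled: bool) -> Optional[str]:
    """Pick the task to run after a shard move (None when the flag is off)."""
    return GC if flag_enabled else None


def has_commit_graph(git_dir: Path) -> bool:
    """Check for a single-file or split commit-graph in a git directory."""
    info = git_dir / "objects" / "info"
    single = info / "commit-graph"
    chain = info / "commit-graphs" / "commit-graph-chain"
    return single.is_file() or chain.is_file()
```

For a bare repository, `has_commit_graph` would be pointed at the repository directory itself. After a full repack alone it is expected to return False, which is the gap this MR closes.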

The feature is still behind a disabled-by-default feature flag. After merge, I intend for us to test this by moving a whole shard of repositories, to see if the gc load is acceptable. If it isn't, then we'll need to revisit this entirely. The Gitaly group is working on a ReplicateRepository RPC that we could use instead of FetchInternalRemote, for instance, but it's not ready yet: https://gitlab.com/gitlab-org/gitaly/issues?scope=all&utf8=%E2%9C%93&state=opened&search=ReplicateRepository

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
