Do git gc, not repack, after moving a repository between shards
What does this MR do?
In !20863 (merged), we introduced an additional git garbage-collection operation that runs immediately after a repository has been moved between shards. In review, the `gc` was reduced to `full_repack` because we were a little concerned by the idea of doing a `gc` for every repository on a shard in the event of a mass move.
However, it transpired that the RPCs we're concerned about need the `commit-graph`, not just pack file bitmaps, to be written if they are to perform acceptably at scale. So this MR puts us back to running a `gc`.
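To illustrate the difference, here is a minimal sketch using the plain git CLI in a throwaway repository. It is not what Gitaly actually runs: the exact flags, the temporary repository, and the explicit `gc.writeCommitGraph=true` setting are assumptions made for the example (older git versions do not write the commit-graph during `gc` unless that setting is enabled).

```shell
# Throwaway repository with a single commit (hypothetical example data).
repo="$(mktemp -d)"
git init -q "$repo"
echo "hello" > "$repo/README"
git -C "$repo" add README
git -C "$repo" -c user.name="Example" -c user.email="example@example.com" \
  commit -q -m "initial commit"

graph="$repo/.git/objects/info/commit-graph"

# A full repack with bitmaps: rewrites everything into one pack and writes
# a .bitmap file next to it, but does not touch the commit-graph.
git -C "$repo" repack -A -d -q --write-bitmap-index
if [ -e "$graph" ]; then after_repack=present; else after_repack=absent; fi

# A gc with gc.writeCommitGraph enabled: repacks and also writes the
# commit-graph file under objects/info/.
git -C "$repo" -c gc.writeCommitGraph=true gc -q
if [ -e "$graph" ]; then after_gc=present; else after_gc=absent; fi

echo "commit-graph after repack: $after_repack"
echo "commit-graph after gc:     $after_gc"
```

Running `git commit-graph write --reachable` after a repack would be another way to get the same file, but this MR takes the simpler route of switching the post-move task back to `gc`.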
The feature is still behind a disabled-by-default feature flag. After merge, I intend for us to test this by moving a whole shard of repositories, to see if the `gc` load is acceptable. If it isn't, then we'll need to revisit this entirely. For instance, group::gitaly are working on a `ReplicateRepository` RPC that we could use instead of `FetchInternalRemote`, but it's not ready yet: https://gitlab.com/gitlab-org/gitaly/issues?scope=all&utf8=%E2%9C%93&state=opened&search=ReplicateRepository
Screenshots
Does this MR meet the acceptance criteria?
Conformity
- [-] Changelog entry
- [-] Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention `@gitlab-com/gl-security/appsec`
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team