Investigating disk write throughput saturation on Gitaly nodes - unnecessary GC operations

file-27 is one of several Gitaly nodes experiencing frequent disk write throughput saturation.

Looking at iotop, the cause seems to be processes like this:

git      26488 25030  0 20:17 ?        00:00:00     /opt/gitlab/embedded/bin/git --git-dir /var/opt/gitlab/git-data/repositories/@hashed/69/2e/692eef7069554faadcbd97f83bbf5059a8f1eff4b24e36f04da0554b22a97c10.git -c pack.island=refs/heads -c pack.island=refs/tags -c repack.useDeltaIslands=true -c repack.writeBitmaps=true -c pack.writeBitmapHashCache=true -c gc.writeCommitGraph=true gc
git      26491 26488  0 20:17 ?        00:00:00       /opt/gitlab/embedded/libexec/git-core/git repack -d -l -A --unpack-unreachable=2.weeks.ago
git      26492 26491 87 20:17 ?        00:00:37         /opt/gitlab/embedded/libexec/git-core/git pack-objects --local --delta-base-offset /var/opt/gitlab/git-data/repositories/@hashed/69/2e/692eef7069554faadcbd97f83bbf5059a8f1eff4b24e36f04da0554b22a97c10.git/objects/pack/.tmp-26491-pack --keep-true-parents --non-empty --all --reflog --indexed-objects --write-bitmap-index --delta-islands --unpack-unreachable=2.weeks.ago
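
To check whether other repositories on the node are affected, one option is to tally top-level `gc` invocations per repository from repeated `ps -ef` samples. The following is only a rough sketch: the function name `gc_repositories` and the regular expression are hypothetical, and it assumes the command line looks like the one in the listing above (`git --git-dir <path> ... gc`).

```python
import re
from collections import Counter

# Assumption: a top-level gc shows up as ".../git --git-dir <path> [-c ...] gc",
# as in the listing above. The child repack/pack-objects lines do not match.
GC_LINE = re.compile(r"/git --git-dir (?P<git_dir>\S+) .*\bgc\s*$")


def gc_repositories(ps_lines):
    """Count gc invocations per repository path in `ps -ef`-style lines."""
    counts = Counter()
    for line in ps_lines:
        match = GC_LINE.search(line)
        if match:
            counts[match.group("git_dir")] += 1
    return counts
```

Feeding it the lines collected over an hour would show which repositories are being collected repeatedly, and how often.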

This project in particular is a clone of Unreal Tournament. There has been no activity on the project in two years, yet a GC runs on this 10 GB repository between 2 and 4 times an hour.

If we assume that the same GC operations are occurring on other repositories too, this could add up to a huge number of unnecessary write operations.
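
For a sense of scale, here is a back-of-the-envelope estimate for this one repository. It assumes each GC rewrites roughly the full repository size, since `repack -A -d` writes all objects into a new pack. That is an upper-bound assumption, not a measurement, and the function name is hypothetical.

```python
def gc_write_volume_gb(repo_size_gb, runs_per_hour, hours=24):
    """Upper-bound estimate of data rewritten by repeated full repacks.

    Assumption: every run rewrites about `repo_size_gb` of pack data.
    """
    return repo_size_gb * runs_per_hour * hours


low = gc_write_volume_gb(10, 2)   # 2 runs per hour
high = gc_write_volume_gb(10, 4)  # 4 runs per hour
print(f"{low}-{high} GB/day")     # prints "480-960 GB/day"
```

Under that assumption, an inactive repository alone could account for several hundred gigabytes of writes per day.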

We should investigate the cause of this.

cc @zj-gitlab @jacobvosmaer-gitlab