Investigate Gitaly disk space used by receive-pack sometimes leaving residual tmp_objdir-incoming directories
Goal
Find what causes the git receive-pack process to sometimes leave a residual scratch dir tmp_objdir-incoming-XXXXXX.
Follow-up: Clean up procedure
Once we have learned what we need to from this example project, we can manually clean up the existing residual temporary pack files.
Manual clean-up procedure: #2547 (comment 1587173372)
Problem summary
Capacity planning analysis uncovered a pathology where git receive-pack (spawned by Gitaly's SSHReceivePack or PostReceivePack gRPC methods) can sometimes leave a residual temp directory. That temp directory (tmp_objdir-incoming) contains the pack file sent by the client running git push.
Initial discovery notes:
- https://gitlab.com/gitlab-com/gl-infra/capacity-planning/-/issues/305#note_1584642151 - Project 33697364 is using 0.55 TB of disk space, even though its accounted repository_size statistic only reports 10 GB of quota used. Supplemental internal-only details are in https://gitlab.com/gitlab-com/gl-infra/capacity-planning/-/issues/305#note_1584646112.
- https://gitlab.com/gitlab-com/gl-infra/capacity-planning/-/issues/305#note_1585461238 - The git receive-pack process fails to clean up its scratch directory (tmp_objdir-incoming-XXXXXX).
The implementation of git receive-pack makes use of the tmp_objdir API, registering an atexit() clean-up routine to remove the directory. Presumably that clean-up routine is failing either to register or to run under some set of circumstances that this example project is regularly triggering.
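To illustrate how an exit-time clean-up routine can be skipped, here is a minimal shell sketch. It is an analogy only: a shell EXIT trap stands in for the atexit() handler, and the function name run_scratch_demo is hypothetical. It shows one circumstance (an uncatchable SIGKILL) in which no exit handler of any kind gets to run. Whether that is what happens to git receive-pack for this project has not been established.

```shell
# Analogy only: the EXIT trap stands in for an atexit() clean-up handler.
# run_scratch_demo MODE PARENT_DIR   (hypothetical helper)
#   MODE=normal : the child exits normally, so the trap removes the scratch dir
#   MODE=kill   : the child is killed with SIGKILL, so the trap never runs
# Prints the number of scratch dirs left behind in PARENT_DIR.
run_scratch_demo() {
  sh -c '
    scratch=$(mktemp -d "$2/tmp_objdir-incoming-XXXXXX") || exit 1
    trap "rm -rf \"$scratch\"" EXIT
    if [ "$1" = "kill" ]; then
      kill -s KILL "$$"   # SIGKILL cannot be caught; no handler runs
    fi
    exit 0
  ' _ "$1" "$2" || true
  find "$2" -maxdepth 1 -type d -name "tmp_objdir-incoming-*" | wc -l | tr -d " "
}
```

Running run_scratch_demo normal against an empty directory leaves no scratch dir behind, while run_scratch_demo kill leaves one, mirroring the residue seen in this repository.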
Example
This example (project 33697364) is a git repo that accumulates new residual tmp_objdir-incoming-XXXXXX directories each day. So it is both recurring (readily observable) and impactful (a large amount of wasted, unaccounted disk space).
Set the GIT_DIR for use in subsequent commands.
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ GIT_DIR="/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git"
Show the count of packed versus unpacked git objects, as accounted by git count-objects. This perspective does not report the accumulation of nearly 200 large residual temporary pack files.
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo -i -u git /opt/gitlab/embedded/bin/git --git-dir "$GIT_DIR" count-objects --verbose --human-readable
count: 220
size: 268.49 MiB
in-pack: 303118
packs: 33
size-pack: 9.71 GiB
prune-packable: 21
garbage: 0
size-garbage: 0 bytes
Count the residual files under the tmp_objdir directories used by git receive-pack.
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | wc -l
183
Nearly all of these directories contain a single file: tmp_pack_XXXXXX. They sum to over 0.5 TB.
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo du -shc | tail -n5
2.6G /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-UakUDy/pack/tmp_pack_dV5E2i
4.2G /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-f1q2nq/pack/tmp_pack_os3j0z
4.1G /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-8I4lOA/pack/tmp_pack_5VkXs3
4.2G /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-q1CQ6t/pack/tmp_pack_DWvPTt
557G total
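The count and the size total above can be combined into one pass. The following is a sketch, not part of the original investigation: the function name summarize_residual_packs is hypothetical, and it sums per-file du -k figures with awk, so the result stays a single total even when the file list is long enough that xargs would split it across several du invocations. On the Gitaly node it would need the same sudo privileges as the commands above.

```shell
# summarize_residual_packs OBJECTS_DIR   (hypothetical helper)
# Prints the number of files under tmp_objdir-incoming-* scratch dirs and
# their combined disk usage in KiB, as reported by du -k.
summarize_residual_packs() {
  find "$1" -type f -path '*/tmp_objdir-incoming-*/*' -exec du -k {} + |
    awk '{ n++; kib += $1 } END { printf "files=%d kib=%d\n", n, kib }'
}
```

For example, summarize_residual_packs "$GIT_DIR/objects" prints one line of the form files=N kib=M.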
List the oldest and newest residual dirs.
The oldest residual file dates from December 22, 2022, so this has been occurring for more than 9 months and is not a new problem.
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo ls -lhtr | head -n5
-r--r--r-- 1 git git 174M Dec 22 2022 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-xSY70S/pack/tmp_pack_I8nvBh
-r--r--r-- 1 git git 150M Jan 25 2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-BbokXo/pack/tmp_pack_4Bndkh
-r--r--r-- 1 git git 575M Jan 25 2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-txllDi/pack/tmp_pack_QWE8pc
-r--r--r-- 1 git git 201M Jan 27 2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-mX1xfH/pack/tmp_pack_Kmt8ez
-r--r--r-- 1 git git 933M Jan 30 2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-RCJZrJ/pack/tmp_pack_dC9GE3
msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo ls -lhtr | tail -n5
-r--r--r-- 1 git git 4.6G Sep 28 09:18 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-ETEksb/pack/tmp_pack_U17EmL
-r--r--r-- 1 git git 4.7G Sep 28 10:16 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-7ExHy2/pack/tmp_pack_xspZRJ
-r--r--r-- 1 git git 4.7G Sep 28 21:50 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-iEbEEI/pack/tmp_pack_8wf5AD
-r--r--r-- 1 git git 4.6G Sep 29 10:30 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-S1FOC0/pack/tmp_pack_aSfE5X
-r--r--r-- 1 git git 4.6G Oct 2 09:17 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-YC2RNy/pack/tmp_pack_brNjVu
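Since the newest residual dirs are only days old, it can help to separate long-abandoned scratch dirs from ones that might belong to a push still in flight. The sketch below only lists directories and deletes nothing. The function name list_stale_scratch_dirs and the age threshold are assumptions for illustration, and a directory's mtime is only an approximation of when the push that created it ended.

```shell
# list_stale_scratch_dirs OBJECTS_DIR MIN_AGE_DAYS   (hypothetical helper)
# Lists tmp_objdir-incoming-* directories directly under OBJECTS_DIR whose
# mtime is more than MIN_AGE_DAYS days old. Read-only: nothing is removed.
list_stale_scratch_dirs() {
  find "$1" -maxdepth 1 -type d -name "tmp_objdir-incoming-*" -mtime +"$2" | sort
}
```

For example, list_stale_scratch_dirs "$GIT_DIR/objects" 1 would skip any scratch dir modified within roughly the last day or two.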