
Investigate gitaly disk space used by receive-pack sometimes leaving residual tmp_objdir-incoming

Goal

Find what causes the git receive-pack process to sometimes leave a residual scratch dir tmp_objdir-incoming-XXXXXX.

Follow-up: Clean up procedure

Once we have learned what we need to from this example project, we can manually clean up the existing residual temporary pack files.

Manual clean-up procedure: #2547 (comment 1587173372)

Problem summary

Capacity planning analysis uncovered a pathology in which git receive-pack (spawned by Gitaly's SSHReceivePack or PostReceivePack gRPC methods) sometimes leaves behind a residual temp directory. That temp directory (tmp_objdir-incoming-XXXXXX) contains the pack file sent by the client running git push.

Initial discovery notes:

The implementation of git receive-pack makes use of the tmp_objdir API, registering an atexit() clean-up routine to remove the directory. Presumably that clean-up routine is failing either to register or to run under some set of circumstances that this example project regularly triggers.
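To illustrate how exit-time clean-up can be skipped, here is a minimal sketch by analogy. It is not git's code and not a confirmed root cause for this project. It assumes one candidate circumstance, termination by SIGKILL, which no process can catch. A bash EXIT trap stands in for the atexit() routine, and the function name run_child and the tmp_objdir-demo-XXXXXX template are hypothetical.

```shell
#!/usr/bin/env bash
# Analogy only: the EXIT trap stands in for an atexit() handler.
# This is not git's implementation and not a confirmed root cause.

# run_child MODE
#   MODE=normal : the child exits on its own, so its exit-time clean-up runs.
#   MODE=kill   : the child is terminated with SIGKILL, which cannot be caught.
# Prints "removed" or "residual" depending on whether the scratch dir survived.
run_child() {
    local mode="$1" marker dir pid result
    marker="$(mktemp)"

    bash -c '
        dir="$(mktemp -d "${TMPDIR:-/tmp}/tmp_objdir-demo-XXXXXX")"
        cleanup() { rm -rf "$dir"; }
        trap cleanup EXIT
        printf "%s\n" "$dir" > "$1"
        if [ "$2" = "kill" ]; then sleep 5; fi
    ' _ "$marker" "$mode" >/dev/null 2>&1 &
    pid=$!

    # Wait until the child has reported the path of its scratch dir.
    until [ -s "$marker" ]; do sleep 0.1; done
    dir="$(cat "$marker")"

    if [ "$mode" = "kill" ]; then
        kill -KILL "$pid"
    fi
    wait "$pid" 2>/dev/null || true

    if [ -d "$dir" ]; then result="residual"; else result="removed"; fi
    rm -rf "$dir" "$marker"
    echo "$result"
}

run_child normal
run_child kill
```

The normal run should report that the scratch dir was removed, while the killed run should leave it behind. Whether anything comparable happens to git receive-pack on this host is exactly what the investigation needs to establish.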

Example

This example (project 33697364) is a git repo that accumulates new residual tmp_objdir-incoming-XXXXXX directories each day. That makes it both recurring (readily observable) and impactful (a large amount of wasted disk space that git's own accounting does not report).

Set the GIT_DIR for use in subsequent commands.

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ GIT_DIR="/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git"

Show the count of packed versus unpacked git objects, as accounted by git count-objects. This view does not include the nearly 200 large residual temporary pack files.

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo -i -u git /opt/gitlab/embedded/bin/git --git-dir "$GIT_DIR" count-objects --verbose --human-readable
count: 220
size: 268.49 MiB
in-pack: 303118
packs: 33
size-pack: 9.71 GiB
prune-packable: 21
garbage: 0
size-garbage: 0 bytes
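The gap between what git accounts for and what is on disk can be estimated roughly as follows. This is a sketch under assumptions: the helper names accounted_kib and unaccounted_kib are hypothetical, git count-objects must be run without --human-readable so that sizes are plain KiB, and du measures allocated blocks, so the result is an approximation rather than an exact figure.

```shell
#!/usr/bin/env bash
# Sum the KiB that `git count-objects --verbose` accounts for.
# Reads the command's output on stdin. Sizes must be plain KiB, so the
# command has to be run WITHOUT --human-readable.
accounted_kib() {
    awk -F': ' '
        $1 == "size" || $1 == "size-pack" || $1 == "size-garbage" { sum += $2 }
        END { print sum + 0 }'
}

# Print the approximate KiB under objects/ that count-objects does not report.
# Assumes `git` is on PATH and the caller can read the repository.
unaccounted_kib() {
    local git_dir="$1" on_disk accounted
    on_disk="$(du -sk "$git_dir/objects" | cut -f1)"
    accounted="$(git --git-dir "$git_dir" count-objects --verbose | accounted_kib)"
    echo $(( on_disk - accounted ))
}
```

On this host the function would need to run as a user that can read the repository, and with the embedded git binary used in the session above rather than a git found on PATH.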

Count the residual files under the tmp_objdir directories used by git receive-pack.

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | wc -l
183

Nearly all of these directories contain a single file: tmp_pack_XXXXXX. They sum to over 0.5 TB.

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo du -shc | tail -n5
2.6G	/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-UakUDy/pack/tmp_pack_dV5E2i
4.2G	/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-f1q2nq/pack/tmp_pack_os3j0z
4.1G	/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-8I4lOA/pack/tmp_pack_5VkXs3
4.2G	/var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-q1CQ6t/pack/tmp_pack_DWvPTt
557G	total
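The claim that nearly all of these directories hold a single file can be checked with a per-directory file count. A minimal sketch, assuming GNU or BSD find (for -maxdepth); the function name files_per_incoming_dir is hypothetical.

```shell
#!/usr/bin/env bash
# Print "<file count> <directory>" for each tmp_objdir-incoming-* directory
# directly under objects/, sorted by directory name.
files_per_incoming_dir() {
    local git_dir="$1" dir
    find "$git_dir/objects" -maxdepth 1 -type d -name 'tmp_objdir-incoming-*' -print |
    sort |
    while IFS= read -r dir; do
        printf '%s %s\n' "$(find "$dir" -type f | wc -l | tr -d ' ')" "$dir"
    done
}
```

Piping the output through awk '{print $1}' | sort -n | uniq -c gives the distribution of file counts across directories. As with the find commands above, this needs enough privileges to read the repository.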

List the oldest and newest residual dirs.

This has been occurring for at least 10 months, so it is not a new problem.

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo ls -lhtr | head -n5
-r--r--r-- 1 git git 174M Dec 22  2022 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-xSY70S/pack/tmp_pack_I8nvBh
-r--r--r-- 1 git git 150M Jan 25  2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-BbokXo/pack/tmp_pack_4Bndkh
-r--r--r-- 1 git git 575M Jan 25  2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-txllDi/pack/tmp_pack_QWE8pc
-r--r--r-- 1 git git 201M Jan 27  2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-mX1xfH/pack/tmp_pack_Kmt8ez
-r--r--r-- 1 git git 933M Jan 30  2023 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-RCJZrJ/pack/tmp_pack_dC9GE3

msmiley@file-65-stor-gprd.c.gitlab-production.internal:~$ sudo find "$GIT_DIR/objects" -type f -path "*/tmp_objdir-incoming-*/*" | xargs -r sudo ls -lhtr | tail -n5
-r--r--r-- 1 git git 4.6G Sep 28 09:18 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-ETEksb/pack/tmp_pack_U17EmL
-r--r--r-- 1 git git 4.7G Sep 28 10:16 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-7ExHy2/pack/tmp_pack_xspZRJ
-r--r--r-- 1 git git 4.7G Sep 28 21:50 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-iEbEEI/pack/tmp_pack_8wf5AD
-r--r--r-- 1 git git 4.6G Sep 29 10:30 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-S1FOC0/pack/tmp_pack_aSfE5X
-r--r--r-- 1 git git 4.6G Oct  2 09:17 /var/opt/gitlab/git-data/repositories/@hashed/7d/91/7d91fb2bd00acc28a7e9c4529baf1742702d3bd00283b7a0085b5c41ac661294.git/objects/tmp_objdir-incoming-YC2RNy/pack/tmp_pack_brNjVu
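To see how the accumulation is spread over time, rather than only its oldest and newest entries, the residual files can be grouped by the month of their modification time. A minimal sketch, assuming GNU find (for -printf); the function name residual_bytes_by_month is hypothetical.

```shell
#!/usr/bin/env bash
# Print "<YYYY-MM> <file count> <total bytes>" for the residual files under
# tmp_objdir-incoming-* directories, grouped by month of modification time.
residual_bytes_by_month() {
    local git_dir="$1"
    find "$git_dir/objects" -type f -path '*/tmp_objdir-incoming-*/*' \
        -printf '%TY-%Tm %s\n' |
    awk '
        { bytes[$1] += $2; files[$1]++ }
        # %.0f avoids integer clamping of large totals in some awk versions.
        END { for (m in bytes) printf "%s %d %.0f\n", m, files[m], bytes[m] }' |
    sort
}
```

A steady count per month would fit the daily recurrence described above, while a sudden change in count or size might point at a change in the client's push pattern or in the server version.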
Edited by Matt Smiley