Pushes may fail due to timeouts on the 'internal/allowed' API when Cleanup invokes Housekeeping on poorly packed repos
In v13.9 we started invoking Housekeeping as part of the Cleanup RPC. Cleanup is called synchronously by the internal/allowed endpoint when a push is received.
On a well-packed repo this is fine, but on poorly packed repos hosted on NFS, Housekeeping may take several minutes as it walks all files in the objects directory. This is longer than the 60 seconds Gitaly allows for the API, causing the push to fail.
In Zendesk ticket #207042, pushes to a customer's monorepo have started to fail ~80% of the time because of how long Housekeeping is taking. This repo has ~30,000 unpacked objects; we are still investigating why RepackFull or GarbageCollect has not packed them.