catfile cache spawns too many processes
Many recent incidents have been caused by Gitaly spawning too many processes. The running processes contend with each other, and there is additional contention at process start when we try to grab a spawn token: we gradually grind to a halt as we become unable to spawn new processes at all.
One of the issues is our catfile cache. For each cached catfile process, we always spawn two processes: `git cat-file --batch` and `git cat-file --batch-check`. This wastes resources, given that in many cases we only need one of the two. It is exacerbated by the fact that the catfile cache is used heavily, to the point where we often have thousands of git-cat-file(1) processes around.
We should fix this and disentangle the creation of both processes so that we can selectively spawn only what we need. This halves the number of spawned processes in cases where only one of the two is needed. Most importantly, it should reduce contention on the spawn token.
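
One way to picture the proposal is a cache entry that starts each git-cat-file(1) process lazily, on first use, instead of starting both up front. The following is a minimal sketch under stated assumptions, not Gitaly's actual code: the names `process`, `spawnFunc`, `lazyProcess`, `cachedCatfile`, `ObjectReader` and `InfoReader` are all hypothetical, the spawn hook stands in for whatever acquires a spawn token and executes Git, and the object-info process is assumed to be `git cat-file --batch-check`.

```go
package main

import "sync"

// process stands in for a running git-cat-file(1) process.
// Hypothetical type: a real implementation would wrap the command's pipes.
type process struct {
	args []string
}

// spawnFunc is a hypothetical hook that acquires a spawn token and starts
// Git with the given arguments.
type spawnFunc func(args ...string) (*process, error)

// lazyProcess starts its process on first use only.
type lazyProcess struct {
	once  sync.Once
	spawn spawnFunc
	args  []string
	proc  *process
	err   error
}

// get spawns the process the first time it is called and returns the same
// result on every later call. Note that a spawn error is remembered too;
// a real cache would likely evict the entry instead.
func (l *lazyProcess) get() (*process, error) {
	l.once.Do(func() {
		l.proc, l.err = l.spawn(l.args...)
	})
	return l.proc, l.err
}

// cachedCatfile is a cache entry holding both readers, neither of which is
// spawned until a caller asks for it.
type cachedCatfile struct {
	objectReader lazyProcess // git cat-file --batch
	infoReader   lazyProcess // git cat-file --batch-check (assumed)
}

// newCachedCatfile creates an entry without spawning any process.
func newCachedCatfile(spawn spawnFunc) *cachedCatfile {
	return &cachedCatfile{
		objectReader: lazyProcess{spawn: spawn, args: []string{"cat-file", "--batch"}},
		infoReader:   lazyProcess{spawn: spawn, args: []string{"cat-file", "--batch-check"}},
	}
}

// ObjectReader returns the process used to read object contents.
func (c *cachedCatfile) ObjectReader() (*process, error) {
	return c.objectReader.get()
}

// InfoReader returns the process used to read object metadata.
func (c *cachedCatfile) InfoReader() (*process, error) {
	return c.infoReader.get()
}
```

With this shape, a caller that only needs object metadata calls `InfoReader()` and pays for a single spawn token; the `--batch` process is never started for that cache entry unless something later asks for `ObjectReader()`.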