Standby partition cache configured too low
Gitaly keeps a configurable number of inactive partitions on standby. Tearing down a partition and starting it up again every time it becomes inactive would lead to thrashing under sequential request load.
The mechanism exists, but it is misconfigured. We originally intended to keep 100 partitions on standby before shutting down the least recently used inactive partitions, but we are currently keeping only two.
Keeping so few partitions on standby can lead to unnecessary churn on partitions and has likely contributed to the production incident in Enable Gitaly Transactions on 2 regular Gitaly ... (gitlab-com/gl-infra/production#19748 - closed). We should fix this and keep more partitions on standby.
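To illustrate why the standby limit matters, here is a minimal sketch of a least-recently-used standby list. It is not Gitaly's actual implementation: `standbyCache`, `access`, and `countStarts` are hypothetical names, and the model assumes each request activates one partition, which then immediately becomes inactive again.

```go
package main

import (
	"container/list"
	"fmt"
)

// standbyCache is a hypothetical stand-in for a partition manager.
// It tracks inactive partitions in LRU order and tears down the
// least recently used ones once more than maxStandby are idle.
type standbyCache struct {
	maxStandby int
	order      *list.List               // front = most recently used
	entries    map[string]*list.Element // partition ID -> element in order
}

func newStandbyCache(maxStandby int) *standbyCache {
	return &standbyCache{
		maxStandby: maxStandby,
		order:      list.New(),
		entries:    map[string]*list.Element{},
	}
}

// access simulates one request against a partition. It reports whether
// the partition had to be started because it was not on standby.
func (c *standbyCache) access(id string) (started bool) {
	if el, ok := c.entries[id]; ok {
		// The partition was on standby: reuse it without a restart.
		c.order.MoveToFront(el)
		return false
	}

	// The partition was not on standby: start it, then let it go idle.
	c.entries[id] = c.order.PushFront(id)

	// Tear down the least recently used partitions above the limit.
	for c.order.Len() > c.maxStandby {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.entries, oldest.Value.(string))
	}
	return true
}

// countStarts replays the requests and counts partition startups.
func countStarts(maxStandby int, requests []string) int {
	c := newStandbyCache(maxStandby)
	starts := 0
	for _, id := range requests {
		if c.access(id) {
			starts++
		}
	}
	return starts
}

func main() {
	// Three partitions accessed round-robin for four rounds.
	var requests []string
	for round := 0; round < 4; round++ {
		requests = append(requests, "p1", "p2", "p3")
	}

	fmt.Println("limit 2:", countStarts(2, requests))
	fmt.Println("limit 100:", countStarts(100, requests))
}
```

In this simplified model, cycling through three partitions with a limit of two restarts a partition on every one of the 12 requests. With a limit of 100, each partition is started only once.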