
Consolidate and improve Gitaly cgroups documentation

Context

Documentation on how Gitaly cgroups should be configured is currently spread across several locations, including:

Cgroups should be enabled on Gitaly nodes under two circumstances:

  1. When a Gitaly node is serving heavy, non-uniformly distributed workloads.
  2. When Gitaly is running on Kubernetes, where cgroups also act as a protection mechanism against pod eviction, in addition to the reason above.

The current cgroups documentation, particularly the control groups section in the Configure Gitaly document, acts as a reference rather than a tutorial. Since cgroups tuning can be quite complex, we should offer additional guidance in line with best practices and empirical observations from running our own infrastructure.

Proposal

Review the documents in the list above, and compile key points into a central document. The resulting document should follow a tutorial style that guides the administrator on how to:

  1. Measure the current baseline Git workload demands
  2. Configure appropriate cgroups parent and repository limits
  3. Monitor cgroups-related metrics to determine if additional tuning is required
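As a rough illustration of how steps 1 and 2 could connect, the following Python sketch derives candidate memory limits from a set of baseline measurements. Everything in it is an assumption made for illustration: the function names, the 90% parent fraction, the 1.5x headroom factor, and the sample data are hypothetical and are not values recommended by GitLab or taken from the existing documentation. The tutorial itself should replace them with figures drawn from real measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_memory_limits(peak_memory_samples, host_memory_bytes,
                          parent_fraction=0.9, headroom=1.5):
    """Return hypothetical (parent_limit, repository_limit) in bytes.

    peak_memory_samples: measured peak memory (bytes) used by Git
        processes per repository during the baseline period (step 1).
    parent_fraction and headroom are illustrative assumptions only.
    """
    # Leave part of the host memory outside the parent cgroup.
    parent_limit = int(host_memory_bytes * parent_fraction)
    # Size each repository cgroup from the 99th percentile plus headroom.
    repo_limit = int(percentile(peak_memory_samples, 99) * headroom)
    # A repository limit larger than its parent has no additional effect.
    return parent_limit, min(repo_limit, parent_limit)


GIB = 1024 ** 3
# Hypothetical baseline: peak memory per repository, in bytes.
samples = [1 * GIB, 2 * GIB, 2 * GIB, 3 * GIB, 4 * GIB]
parent, repo = suggest_memory_limits(samples, host_memory_bytes=64 * GIB)
print(f"parent limit: {parent} bytes, repository limit: {repo} bytes")
```

The resulting numbers would only be a starting point. Step 3 (monitoring cgroups-related metrics) is what determines whether the limits need to be raised or lowered.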

We should draw from our own experience tuning cgroups values for GitLab.com.
