Cgroup: Add an option to include Gitaly process in Cgroup hierarchy

Problem statement

For #5535 (closed)

When the cgroup feature is enabled in Gitaly, all spawned Git commands are assigned to a corresponding per-repository cgroup. All of those cgroups sit under the control of a parent cgroup. This hierarchy limits the resource usage of those commands and prevents them from dominating the node's resources.

In an incident, we discovered that the main Gitaly process is not part of that cgroup hierarchy. This is fine most of the time because the Gitaly process itself is not resource-hungry. Unfortunately, in the incident, the Gitaly process leaked memory and its memory usage grew gradually. After a certain point, the process became unresponsive. This situation cannot resolve itself without human intervention.

This MR adds an "IncludeGitalyProcess" config option that includes the Gitaly process in the parent cgroup.

When the IncludeGitalyProcess configuration is enabled, the main Gitaly process's PID is added to Gitaly's cgroup hierarchy. The process doesn't get its own exclusive limit. Rather, it shares the limit with all spawned Git processes, which are also managed by per-repository cgroups, under the umbrella of the parent cgroup.
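
Mechanically, including the process boils down to writing its PID into the target cgroup's cgroup.procs file. The following is a minimal sketch of that mechanism, not Gitaly's actual implementation; the cgroup path shown is hypothetical, modeled on the dedicated "main" leaf visible in the verification output further down.

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// addSelfToCgroup migrates the calling process into the given cgroup by
// writing its PID to that cgroup's cgroup.procs file.
func addSelfToCgroup(cgroupPath string) error {
	procs := filepath.Join(cgroupPath, "cgroup.procs")
	return os.WriteFile(procs, []byte(fmt.Sprintf("%d", os.Getpid())), 0o644)
}

func main() {
	// Hypothetical path: a dedicated "main" leaf under the parent cgroup,
	// next to the per-repository cgroups (see the v2 layout later in this MR).
	if err := addSelfToCgroup("/sys/fs/cgroup/gitaly/gitaly-12345/main"); err != nil {
		fmt.Fprintf(os.Stderr, "failed to join cgroup: %v\n", err)
		os.Exit(1)
	}
}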

This configuration offers superior protection and flexibility compared to setting a fixed limit specifically for the Gitaly process. The Gitaly process typically has more resource leeway than the Git processes and is likely to be terminated last in the event of a problem.

This configuration could eventually be enabled by default, but we need to verify it on GitLab.com first. Because the cgroup setup happens when the process starts, a feature flag cannot toggle it. Hence, a new configuration option is introduced.

Alternatives

  • GOMEMLIMIT is a good option (a short sketch follows this list). It sets the limit at the application layer and allows the GC to work more efficiently. However, for a memory leak, which is the original motivation of this MR, the leaked memory is still referenced and therefore unlikely to be GC-ed. GOMEMLIMIT also has no self-kill mechanism to reset the memory usage. In addition, the cgroup approach covers CPU limits, which GOMEMLIMIT obviously does not.
  • Adding the PID directly to /sys/fs/cgroup/gitaly/gitaly-<pid>/cgroup.procs doesn't work either. It raises a "device or resource busy" error: under cgroup v2's "no internal processes" rule, a process cannot be attached to a cgroup that has child cgroups with controllers enabled, so the PID has to live in a dedicated leaf cgroup instead. Another reason not to follow this approach is that a dedicated "main" leaf is more explicit about the limit (memory.max=max, cpu.max=max 100000).
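
For reference, here is a minimal sketch of the GOMEMLIMIT alternative mentioned above, assuming the limit is set programmatically via runtime/debug rather than through the environment variable; the 400MB value is illustrative and matches the cgroup config used in the verification below.

package main

import "runtime/debug"

func main() {
	// Equivalent to exporting GOMEMLIMIT=400000000. This is a soft limit:
	// the GC runs more aggressively as the live heap approaches it, but the
	// Go runtime never kills the process, and leaked memory that is still
	// referenced can never be reclaimed, so usage keeps growing regardless.
	debug.SetMemoryLimit(400_000_000)
	// ... start the server as usual ...
}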

Verify the solution

To verify the solution, I configured the Gitaly cgroup and added the following code, a goroutine that keeps allocating ever-larger strings into a live map so the memory can never be freed, to simulate a memory leak. After starting the Gitaly server, I captured the memory usage and compared it before and after IncludeGitalyProcess is enabled.

diff --git a/internal/cli/gitaly/serve.go b/internal/cli/gitaly/serve.go
index fd4470241..9f14ba92b 100644
--- a/internal/cli/gitaly/serve.go
+++ b/internal/cli/gitaly/serve.go
@@ -5,6 +5,7 @@ import (
        "fmt"
        "os"
        "runtime/debug"
+       "strings"
        "time"

        "github.com/go-enry/go-license-detector/v4/licensedb"
@@ -177,6 +178,16 @@ func run(cfg config.Cfg, logger logrus.FieldLogger) error {
                return fmt.Errorf("failed setting up cgroups: %w", err)
        }

+       trash := map[int]string{}
+       i := 0
+       go func() {
+               for {
+                       i += 1
+                       trash[i] = strings.Repeat("a", i)
+                       time.Sleep(10 * time.Millisecond)
+               }
+       }()
+
        defer func() {
                if err := cgroupMgr.Cleanup(); err != nil {
                        logger.WithError(err).Warn("error cleaning up cgroups")

Cgroup V2

Before

[cgroups]
mountpoint = "/sys/fs/cgroup/"
hierarchy_root = "gitaly"
memory_bytes = 400000000
[cgroups.repositories]
count = 10
memory_bytes = 50000000

[screenshot: not_include_gitaly_process]

  • The memory usage exceeds the configured memory limit (400MB) and continues to grow.
  • The Gitaly process's PID is not in any cgroup.
  • No OOM events in kernel logs or cgroup memory events.

After

[cgroups]
mountpoint = "/sys/fs/cgroup/"
hierarchy_root = "gitaly"
memory_bytes = 400000000
include_gitaly_process = true
[cgroups.repositories]
count = 10
memory_bytes = 50000000

[screenshot: include_gitaly_process]

cat /sys/fs/cgroup/gitaly/memory.events
low 0
high 0
max 203 // the number of times memory usage hit the memory.max limit
oom 1
oom_kill 1
oom_group_kill 0
journalctl _TRANSPORT=kernel _HOSTNAME=$(hostname) | grep -i "Out of memory"
Killed process 117719 (gitaly) total-vm:1997636kB, anon-rss:258028kB, file-rss:174208kB, shmem-rss:0kB, UID:0 pgtables:1072kB oom_score_adj:0
ls -la /sys/fs/cgroup/gitaly/gitaly-121843
total 0
drwxr-xr-x 13 root root 0 Aug 24 04:12 .
drwxr-xr-x  4 root root 0 Aug 23 16:17 ..
drwxr-xr-x  2 root root 0 Aug 24 04:12 main
drwxr-xr-x  2 root root 0 Aug 24 04:12 repos-0
drwxr-xr-x  2 root root 0 Aug 24 04:12 repos-1
...
-r--r--r--  1 root root 0 Aug 24 04:13 memory.current
-r--r--r--  1 root root 0 Aug 24 04:13 memory.events
-r--r--r--  1 root root 0 Aug 24 04:13 memory.events.local
-rw-r--r--  1 root root 0 Aug 24 04:13 memory.high
  • The memory usage slightly exceeds the configured memory limit (400MB), and the process is then killed.
  • The Gitaly process's PID is added to /sys/fs/cgroup/gitaly/gitaly-121843/main/cgroup.procs.
  • OOM events appear in both the kernel logs and the cgroup memory events.

Cgroup V1

[screenshot: include_gitaly_access_v1]

The test on Cgroup V1 yields a similar result, though the cgroupfs structure is a bit different. There is a period where the memory usage goes up and down; my guess is that during that period the process is under reclaim pressure. We don't set memory.high in the Cgroup V2 scenario, hence the behavior is a bit different.

cat /sys/fs/cgroup/cpu/gitaly/gitaly-39791/main/cgroup.procs
39791
cat /sys/fs/cgroup/memory/gitaly/gitaly-39791/main/cgroup.procs
39791
cat /sys/fs/cgroup/memory/gitaly/gitaly-39791/main/memory.oom_control
oom_kill_disable 0
under_oom 0
oom_kill 1
cat /sys/fs/cgroup/memory/gitaly/gitaly-39791/memory.failcnt
1393