gitlab-runsvdir.service systemd TasksMax being reached on large customer systems

Summary

A component of GitLab may have changed its process-forking behaviour: at least two customers who deployed Omnibus onto a single large server (rather than scaling out to multiple servers) have reported gitlab-runsvdir.service reaching its systemd TasksMax limit. The reports include the following log entries:

  • from dmesg

    cgroup: fork rejected by pids controller in /system.slice/gitlab-runsvdir.service

  • from /var/log/gitlab/postgresql/current

    LOG:  could not fork autovacuum worker process: Resource temporarily unavailable
    LOG:  could not fork new process for connection: Resource temporarily unavailable
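A quick way to look for these symptoms is to count matching lines in the kernel log and the PostgreSQL log. The following is a minimal sketch: the helper name count_fork_failures is hypothetical, and the patterns are taken only from the two log excerpts above, so other components may log fork failures with different wording.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that show the pids controller
# rejecting a fork, or PostgreSQL failing to fork a process.
count_fork_failures() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c -E 'fork rejected by pids controller|could not fork' || true
}

# Usage, with the sources named in this report:
#   dmesg | count_fork_failures
#   count_fork_failures < /var/log/gitlab/postgresql/current
```

A non-zero count from either source suggests the unit has hit its task limit at least once.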

To confirm that TasksMax has been reached, run systemctl status gitlab-runsvdir.service. The Tasks line shows the current task count and the limit:

● gitlab-runsvdir.service - GitLab Runit supervision process
   Loaded: loaded (/usr/lib/systemd/system/gitlab-runsvdir.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2023-09-14 11:23:55 BST; 2min 26s ago
 Main PID: 870 (runsvdir)
    Tasks: 217 (limit: 4915)

The limit of 4915 is the value that Omnibus sets for this unit.
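For scripted checks, the same two numbers can be read from the TasksCurrent and TasksMax properties that systemctl show prints as Key=Value lines. The following is a minimal sketch: the helper name tasks_usage and its output format are assumptions for illustration, not part of GitLab or systemd.

```shell
#!/bin/sh
# Hypothetical helper: read "Key=Value" lines (as printed by `systemctl show`)
# from stdin and report how close the unit is to its task limit.
tasks_usage() {
    awk -F= '
        $1 == "TasksCurrent" { cur = $2 }
        $1 == "TasksMax"     { max = $2 }
        END {
            # Non-numeric values (for example TasksMax=infinity) are printed as-is.
            if (cur !~ /^[0-9]+$/ || max !~ /^[0-9]+$/ || max == 0) {
                print "tasks: " cur " limit: " max
            } else {
                printf "tasks: %d limit: %d (%d%%)\n", cur, max, int(cur * 100 / max)
            }
        }'
}

# Usage:
#   systemctl show -p TasksCurrent -p TasksMax gitlab-runsvdir.service | tasks_usage
```

With the values from the status output above (217 tasks, limit 4915), this prints `tasks: 217 limit: 4915 (4%)`.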

Workaround

  1. In gitlab.rb, set package['systemd_tasks_max'] to a higher value (for example, 14915), then apply it with gitlab-ctl reconfigure.
  2. Check the new limit with systemctl status gitlab-runsvdir.service.
  3. If the limit has not changed, run /bin/systemctl daemon-reload.
  4. If the limit is still unchanged, reboot the server.
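The steps above can be sketched as a small shell check. This is an illustration under assumptions: WANTED holds the example value 14915 from step 1, the helper names current_limit and next_step are hypothetical, and /etc/gitlab/gitlab.rb is assumed to be the location of gitlab.rb on an Omnibus install.

```shell
#!/bin/sh
# Example value from step 1; adjust to the limit you actually configure.
WANTED=14915

# Print the TasksMax value systemd currently applies to the unit.
current_limit() {
    systemctl show -p TasksMax --value gitlab-runsvdir.service
}

# Hypothetical helper: given the limit systemd reports, say what to do next.
next_step() {
    if [ "$1" = "$WANTED" ]; then
        echo "ok"
    else
        echo "limit is $1: run systemctl daemon-reload and re-check; reboot if still unchanged"
    fi
}

# After setting package['systemd_tasks_max'] = 14915 in /etc/gitlab/gitlab.rb:
#   sudo gitlab-ctl reconfigure
#   next_step "$(current_limit)"
```

The commands that change system state are left as comments so the sketch can be sourced safely; only next_step contains logic, and it mirrors steps 2 to 4.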

Steps to reproduce

What is the current bug behavior?

gitlab-runsvdir.service runs out of PIDs and components of GitLab stop functioning correctly.

What is the expected correct behavior?

GitLab runs as usual.

Relevant logs

To follow.

Details of package version

Provide the package version installation details

Edited by olivier némoz saint-dizier