gitlab-runsvdir.service systemd TasksMax being reached on large customer systems
Summary
A component of GitLab may have changed its process forking behaviour: we have at least two reports of issues from customers who have deployed Omnibus onto a single large server (rather than scaling out to multiple servers). The reported symptoms:
- from dmesg:
  cgroup: fork rejected by pids controller in /system.slice/gitlab-runsvdir.service
- from /var/log/gitlab/postgresql/current:
  LOG: could not fork autovacuum worker process: Resource temporarily unavailable
  LOG: could not fork new process for connection: Resource temporarily unavailable
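A quick way to check a system for these symptoms is to count matching lines in the two sources quoted above. The following is a minimal sketch only: the function name count_fork_failures is hypothetical, and the match strings are copied from the log excerpts in this report rather than from any exhaustive list of failure messages.

```shell
#!/bin/sh
# Hypothetical helper: count lines on stdin that match either symptom
# quoted in this report. Prints the number of matching lines.
count_fork_failures() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -c -e 'fork rejected by pids controller' -e 'could not fork' || true
}

# Example usage on a live system (not run here):
#   dmesg | count_fork_failures
#   count_fork_failures < /var/log/gitlab/postgresql/current
```

A non-zero count only suggests that the limit has been hit at some point; the systemctl status check described next shows the current counter and limit.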
To confirm that TasksMax has been reached, run systemctl status gitlab-runsvdir.service. The current task count and the limit are displayed:
● gitlab-runsvdir.service - GitLab Runit supervision process
Loaded: loaded (/usr/lib/systemd/system/gitlab-runsvdir.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2023-09-14 11:23:55 BST; 2min 26s ago
Main PID: 870 (runsvdir)
Tasks: 217 (limit: 4915)
4915 is what Omnibus sets.
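To track how close the service is to its limit, the Tasks line can be parsed and turned into a percentage. This is a rough sketch: the helper names parse_tasks_line and tasks_usage_percent are hypothetical, and the parsing assumes the "Tasks: N (limit: M)" layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: extract "current limit" from a systemd status line
# such as "Tasks: 217 (limit: 4915)". Prints e.g. "217 4915".
parse_tasks_line() {
    printf '%s\n' "$1" |
        sed -n 's/^[[:space:]]*Tasks:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*(limit:[[:space:]]*\([0-9][0-9]*\)).*$/\1 \2/p'
}

# Hypothetical helper: integer percentage of the limit currently in use.
tasks_usage_percent() {
    current=$1
    limit=$2
    echo $(( current * 100 / limit ))
}

# Example usage on a live system (not run here):
#   line=$(systemctl status gitlab-runsvdir.service | grep 'Tasks:')
#   set -- $(parse_tasks_line "$line")
#   echo "usage: $(tasks_usage_percent "$1" "$2")%"
```

The same two values should also be available without text parsing via systemctl show -p TasksCurrent -p TasksMax gitlab-runsvdir.service.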
Workaround
- Set package['systemd_tasks_max'] in gitlab.rb, for example to 14915, and apply with gitlab-ctl reconfigure.
- Check with systemctl status gitlab-runsvdir.service
- Run /bin/systemctl daemon-reload if the limit doesn't change.
- If the limit is still unchanged, a reboot is required.
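The workaround steps could be scripted roughly as follows. This is a sketch under assumptions, not a supported procedure: the function name set_tasks_max is hypothetical, /etc/gitlab/gitlab.rb is assumed to be the configuration path (the default Omnibus location), and 14915 is only the example value from the steps above. The function replaces any existing uncommented package['systemd_tasks_max'] line and leaves the rest of the file alone.

```shell
#!/bin/sh
# Hypothetical helper: set package['systemd_tasks_max'] in a gitlab.rb file.
#   $1 = path to gitlab.rb, $2 = new limit
set_tasks_max() {
    rb=$1
    limit=$2
    tmp=$(mktemp) || return 1
    # Keep every line except an existing (uncommented) setting.
    grep -v "^[[:space:]]*package\['systemd_tasks_max']" "$rb" > "$tmp" || true
    # Append the new value.
    printf "package['systemd_tasks_max'] = %s\n" "$limit" >> "$tmp"
    # Write back in place so ownership and permissions are preserved.
    cat "$tmp" > "$rb"
    rm -f "$tmp"
}

# Remaining steps from the workaround, to be run as root (not run here):
#   set_tasks_max /etc/gitlab/gitlab.rb 14915
#   gitlab-ctl reconfigure
#   systemctl status gitlab-runsvdir.service   # check the "limit:" value
#   /bin/systemctl daemon-reload                # only if the limit did not change
```

Taking a backup copy of gitlab.rb before running anything like this would be prudent, since the file is rewritten.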
Steps to reproduce
What is the current bug behavior?
gitlab-runsvdir.service runs out of PIDs and components of GitLab stop functioning correctly.
What is the expected correct behavior?
GitLab runs as usual.
Relevant logs
To follow.
Details of package version
Provide the package version installation details
Edited by olivier némoz saint-dizier