gitlab-runsvdir script enforces a hard-coded /proc/sys/fs/file-max of 1 million
Summary
The startup script for the GitLab systemd service enforces a hard-coded file-max limit:
echo "1000000" > /proc/sys/fs/file-max
A customer with a large (96 core) single node GitLab is running into this limit. Internal ticket link.
- syslog
Dec 28 07:56:24 ip-172-31-25-166 kernel: [1309002.690615] VFS: file-max limit 1000000 reached
Dec 28 07:56:24 ip-172-31-25-166 rsyslogd: file '/var/log/kern.log': open error: Too many open files in system [v8.32.0 try http://www.rsyslog.com/e/2433 ]
- postgresql
2023-12-28_07:56:26.98730 LOG: out of file descriptors: Too many open files in system; release and retry
It appears this limit caused throughput to stall on the server, resulting in a thundering herd of client activity. This triggered a kernel OOM event which killed PostgreSQL (known issue, see: gitlab#365947).
Aside: this is similar to: #8210 where the task limit in systemd is reached.
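To see how close a host is to the limit reported in the kernel log above, the counters in /proc/sys/fs/file-nr can be compared against fs.file-max. The following is a minimal sketch, not part of the package; the function name `file_handle_usage` is hypothetical, and it assumes the standard three-field layout of file-nr (allocated, unused, maximum).

```shell
#!/bin/sh
# Summarise system-wide file-handle usage from the three fields of
# /proc/sys/fs/file-nr: allocated handles, unused handles, maximum.
# file_handle_usage is a hypothetical helper name used for illustration.
file_handle_usage() {
    # $1 = allocated, $2 = allocated but unused (ignored), $3 = fs.file-max
    awk -v alloc="$1" -v max="$3" 'BEGIN {
        printf "%s of %s handles allocated (%.1f%%)\n", alloc, max, 100 * alloc / max
    }'
}

# On a Linux host, feed it the live counters.
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused maximum < /proc/sys/fs/file-nr
    file_handle_usage "$allocated" "$unused" "$maximum"
fi
```

Sampling this periodically would show whether usage is creeping towards the 1000000 ceiling before the "file-max limit reached" message appears.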
Workaround
To ensure a raised limit survives upgrades, use a systemd drop-in file and a copy of the start script. Full details
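As a rough illustration of the drop-in approach, the sketch below prints a drop-in that redirects the service to a copied start script. It is an assumption-laden outline rather than the documented workaround: the helper name `render_dropin` and the path of the copied script are hypothetical, and the copy itself would still need its file-max line edited by hand.

```shell
#!/bin/sh
# Print a systemd drop-in that points a unit at a locally maintained copy of
# its start script. The empty ExecStart= line clears the packaged command so
# that the second line replaces it instead of adding a second command.
render_dropin() {
    # $1 = path to the copied start script (hypothetical location)
    printf '[Service]\nExecStart=\nExecStart=%s\n' "$1"
}

# Assumed, illustrative path for the copy; not taken from the issue.
render_dropin /usr/local/sbin/runsvdir-start-custom
```

On a real host the output would typically be saved under the unit's drop-in directory (for example /etc/systemd/system/gitlab-runsvdir.service.d/override.conf, assuming the unit is named gitlab-runsvdir.service), followed by `systemctl daemon-reload` and a restart of the service.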
What is the current bug behavior?
There is a hard-coded file-max limit, which makes it difficult to raise the limit once it has been reached.
What is the expected correct behavior?
Either:
- This limit is left to the sysadmin to manage. Times have probably moved on since January 2017, and on Debian variants this value is effectively unlimited:
  - Debian Bullseye
    # sysctl -a | grep file-m
    fs.file-max = 9223372036854775807
  - Ubuntu 20.04
    $ cat /proc/sys/fs/file-max
    9223372036854775807
- Set it the same way as the other kernel tunables, and provide a lever to change it:
  # ls -al /etc/sysctl.d/90*
  lrwxrwxrwx 1 root root 58 May 27 2021 /etc/sysctl.d/90-omnibus-gitlab-kernel.sem.conf -> /opt/gitlab/embedded/etc/90-omnibus-gitlab-kernel.sem.conf
  lrwxrwxrwx 1 root root 61 May 27 2021 /etc/sysctl.d/90-omnibus-gitlab-kernel.shmall.conf -> /opt/gitlab/embedded/etc/90-omnibus-gitlab-kernel.shmall.conf
  lrwxrwxrwx 1 root root 61 May 27 2021 /etc/sysctl.d/90-omnibus-gitlab-kernel.shmmax.conf -> /opt/gitlab/embedded/etc/90-omnibus-gitlab-kernel.shmmax.conf
  lrwxrwxrwx 1 root root 66 May 27 2021 /etc/sysctl.d/90-omnibus-gitlab-net.core.somaxconn.conf -> /opt/gitlab/embedded/etc/90-omnibus-gitlab-net.core.somaxconn.conf
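If fs.file-max were managed like the tunables in the listing above, the result might be one more sysctl.d fragment. The sketch below only prints what such a fragment could be called and contain; the helper names are hypothetical, the file name is inferred from the naming pattern of the existing files, and the value 2000000 is an arbitrary example rather than a recommendation.

```shell
#!/bin/sh
# Build a fragment name following the pattern of the existing files:
# 90-omnibus-gitlab-<key>.conf (inferred from the listing, not confirmed).
sysctl_fragment_name() {
    printf '90-omnibus-gitlab-%s.conf\n' "$1"
}

# Build the fragment body in the usual "key = value" sysctl.d syntax.
sysctl_fragment_body() {
    printf '%s = %s\n' "$1" "$2"
}

# Example: a hypothetical fragment for fs.file-max with an arbitrary value.
sysctl_fragment_name fs.file-max
sysctl_fragment_body fs.file-max 2000000
```

A fragment like this would be picked up by `sysctl --system` along with the other files in /etc/sysctl.d, which gives the sysadmin a single, conventional place to change the value.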
Relevant logs
Details of package version
v14.4.3-ee
However, the code was introduced in 2017 and is unchanged in %16.7, so it wouldn't matter if the customer were on a more current version.
Environment details
- Operating System: Ubuntu 18.04
- Installation Target, remove incorrect values:
  - VM: AWS
- Installation Type, remove incorrect values:
  - Upgrade from version: unknown; not relevant to issue
- Is there any other software running on the machine: nothing of note.
- single node installation?
- Resources
  - CPU: 96
  - Memory total: 378639M