Custom executor with high concurrent and low limit exceeds runner limit
Hi,
I just ran into a quite weird and rather unexpected error. I set up a GitLab Runner with this config:
concurrent = 4294967295
check_interval = 1
listen_address = "0.0.0.0:9252"

[session_server]
  session_timeout = 1800

[[runners]]
  name = "some-prefix-v03-incus"
  url = "https://git.example.com"
  id = 131
  token = "<removed>"
  limit = 4
  executor = "custom"
  builds_dir = "/builds"
  cache_dir = "/cache"
  [runners.custom_build_dir]
    enabled = true
  [runners.custom]
    prepare_exec = "/opt/incus-driver/prepare.sh"
    run_exec = "/opt/incus-driver/run.sh"
    cleanup_exec = "/opt/incus-driver/cleanup.sh"
    config_exec = "/opt/incus-driver/config.sh"

The idea is: make concurrent very high, and make limit the "real" limit for the runner. In reality, this config is pieced together from two different sources in the Ansible inventory, so I wanted to avoid having to edit concurrent whenever I modify the limit for an individual runner.
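To make the expectation concrete, here is a minimal Python sketch of how I read the documented settings: concurrent caps jobs across all runners, limit caps jobs for one [[runners]] entry, and limit = 0 means no per-runner cap. The function name expected_job_cap is made up for illustration; it models my expectation only, not what gitlab-runner does internally.

```python
def expected_job_cap(concurrent: int, limit: int) -> int:
    """Upper bound on simultaneous jobs for a single [[runners]] entry.

    This encodes my reading of the documented semantics (an assumption),
    not the actual gitlab-runner scheduling code.
    """
    if limit == 0:  # limit = 0 is documented as "do not limit"
        return concurrent
    return min(concurrent, limit)


CONCURRENT = 4294967295  # value from my config; it equals 2**32 - 1
LIMIT = 4                # per-runner limit from my config

print(expected_job_cap(CONCURRENT, LIMIT))  # prints 4
```

With these values I would expect at most 4 simultaneous jobs, no matter how large concurrent is.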
Interestingly enough, this triggered an OOM error on the machine in question. I first thought I had given it too little RAM to handle 4 simultaneous jobs, but it turned out it was trying to run no fewer than 12-16 simultaneous jobs:
hibox@some-ci-server:~$ incus list
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-0-1873864 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:feaa:36b0 (eth0) | CONTAINER | 0 |
| | | 10.110.107.5 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-0-1873884 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:fe29:16e2 (eth0) | CONTAINER | 0 |
| | | 10.110.107.47 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-0-1873909 | RUNNING | 10.110.107.243 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe35:40ab (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-0-1873921 | RUNNING | 10.110.107.75 (eth0) | fd42:a53e:81a8:e177:216:3eff:feac:bc7e (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-1-1873865 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:fe7e:7e79 (eth0) | CONTAINER | 0 |
| | | 10.110.107.206 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-1-1873885 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:feba:7e9b (eth0) | CONTAINER | 0 |
| | | 10.110.107.37 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-1-1873910 | RUNNING | 10.110.107.126 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe44:7c95 (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-1-1873922 | RUNNING | 10.110.107.129 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe68:66bc (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-2-1873874 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:feac:1b9c (eth0) | CONTAINER | 0 |
| | | 10.110.107.151 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-2-1873895 | RUNNING | 10.110.107.147 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe4a:cd98 (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-2-1873903 | RUNNING | 10.110.107.250 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe9b:1651 (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-2-1873917 | RUNNING | 10.110.107.134 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe74:917b (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-3-1873875 | RUNNING | 10.92.60.1 (incusbr0) | fd42:a53e:81a8:e177:216:3eff:fecd:701b (eth0) | CONTAINER | 0 |
| | | 10.110.107.224 (eth0) | | | |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-3-1873896 | RUNNING | 10.110.107.103 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe60:343a (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-3-1873904 | RUNNING | 10.110.107.237 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe57:4305 (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
| runner-131-project-68-concurrent-3-1873918 | RUNNING | 10.110.107.248 (eth0) | fd42:a53e:81a8:e177:216:3eff:fe9b:96e0 (eth0) | CONTAINER | 0 |
+--------------------------------------------+---------+-----------------------+-----------------------------------------------+-----------+-----------+
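Tallying the container names from the listing above makes the pattern easier to see. This is a minimal Python sketch; it assumes the names follow the layout runner-<runner id>-project-<project id>-concurrent-<slot>-<job id>, which is my reading of the names rather than something I have confirmed.

```python
import re
from collections import Counter

# Container names copied from the `incus list` output above.
NAMES = """
runner-131-project-68-concurrent-0-1873864
runner-131-project-68-concurrent-0-1873884
runner-131-project-68-concurrent-0-1873909
runner-131-project-68-concurrent-0-1873921
runner-131-project-68-concurrent-1-1873865
runner-131-project-68-concurrent-1-1873885
runner-131-project-68-concurrent-1-1873910
runner-131-project-68-concurrent-1-1873922
runner-131-project-68-concurrent-2-1873874
runner-131-project-68-concurrent-2-1873895
runner-131-project-68-concurrent-2-1873903
runner-131-project-68-concurrent-2-1873917
runner-131-project-68-concurrent-3-1873875
runner-131-project-68-concurrent-3-1873896
runner-131-project-68-concurrent-3-1873904
runner-131-project-68-concurrent-3-1873918
""".split()

# Assumed layout: runner-<runner>-project-<project>-concurrent-<slot>-<job>.
PATTERN = re.compile(r"^runner-(\d+)-project-(\d+)-concurrent-(\d+)-(\d+)$")


def tally(names):
    """Count containers per (runner id, concurrent slot)."""
    counts = Counter()
    for name in names:
        match = PATTERN.match(name)
        if match is None:
            continue  # skip anything that does not look like a job container
        runner_id, _project, slot, _job = match.groups()
        counts[(int(runner_id), int(slot))] += 1
    return counts


counts = tally(NAMES)
for (runner_id, slot), n in sorted(counts.items()):
    print(f"runner {runner_id}, concurrent slot {slot}: {n} containers")
print(f"total: {sum(counts.values())}")
```

For this listing it reports four containers in each of the slots 0 to 3 of runner 131, 16 in total. So the slot index never goes above 3, but each slot is in use four times at once.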
How on earth is this possible? limit should limit the number of instances for a given runner. Could it be that this doesn't work with the "custom" executor or something?
The version of the gitlab-runner package is 16.8.0 in this case, running on Ubuntu 22.04.