Make commander start process group for each process
extract-comamnder-from-custom-executor
untill !1654 (merged) is merged
What does this MR do?
Have the commander create a process group for every command it starts. This is so that each subprocess has the PPID set to the process that the custom executor start which will help with process termination. More information about process groups can be found in http://www.informit.com/articles/article.aspx?p=397655&seqNum=6.
Update the unix killer to be aware of process groups and send the singal to each process instead of just the main process to prevent any orphan processes.
How to test
Simple go project that listens to SIGTERM/SIGKILL
and spawns child process
.gitlab-ci.yaml
job:
script:
- /Users/steve/Code/gitlab.com/steveazz/siglistener/siglistener
config.toml
[[runners]]
name = "custom"
url = "http://192.168.1.79:3000"
token = "PtUw_SJzn2y--tEKUvJw"
executor = "custom"
builds_dir = "/tmp/builds"
cache_dir = "/tmp/cache"
[runners.custom]
run_exec = "/Users/steve/Code/gitlab.com/custom-executor-drivers/debug/run.sh"
graceful_kill_timeout = 20
run.sh
#!/usr/bin/env bash
/bin/bash $1
Shell executor
Running the Runner binary locally with go run main.go
ps -opid,pgid,command
$ ps -opid,pgid,command
24415 24415 go run main.go run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml <--- Runner process
24443 24415 /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/go-build583985105/b001/exe/main run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml
27195 27195 bash --login <--- Script from shell executor has it's own process group
27199 27195 bash --login
27200 27195 /Users/steve/Code/gitlab.com/steveazz/siglistener/siglistener
27204 27195 (sleep)
27205 27195 (sleep)
27206 27195 (sleep)
27207 27195 (sleep)
27208 27195 (sleep)
27209 27195 (sleep)
27210 27195 (sleep)
27211 27195 (sleep)
27212 27195 (sleep)
27213 27195 (sleep)
Steps
- Start job
- Run
ps -opid,pgid,command
same user as Runner to see the process - Cancel job
- Just
SIGKILl
is sent to thesiglistener
and sub-processes.
Custom executor (master)
Notice how the process group of the siglistener
and sleep are all for the go run main.go
the process not /Users/steve/Code/gitlab.com/steveazz/siglistener/siglistener
as how it would be for the shell executor.
ps -opid,pgid,command
$ ps -opid,pgid,command
16934 16934 go run main.go run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml <--- Runner process
16962 16934 /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/go-build263333944/b001/exe/main run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml
17164 16934 bash /Users/steve/Code/gitlab.com/custom-executor-drivers/debug/run.sh /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor808275188/script062524327/script. build_script <--- Exec of build script still attached to the main runner process
17165 16934 /bin/bash /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor808275188/script062524327/script.
17167 16934 /bin/bash /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor808275188/script062524327/script.
17168 16934 /Users/steve/Code/gitlab.com/steveazz/siglistener/siglistener
17169 16934 sleep 60
17170 16934 sleep 60
17171 16934 sleep 60
17172 16934 sleep 60
17173 16934 sleep 60
17174 16934 sleep 60
17175 16934 sleep 60
17176 16934 sleep 60
17177 16934 sleep 60
17178 16934 sleep 60
- Start job
- Run
ps -opid,pgid,command
same user as Runner to see the process - Cancel job
-
SIGTERM
is sent to therun.sh
process and it's terminated properly but all other processes are still around since the GPID is set to the Runner process
Custom executor (this MR)
Notice how the process group is now set to the siglistener
for the child processes.
ps -opid,pgid,command
$ ps -opid,pgid,command
66515 66515 go run main.go run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml <--- Runner process
66991 66515 /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/go-build545902909/b001/exe/main run -c /Users/steve/Code/gitlab.com/gitlab-org/gitlab/gitlab-runner-config.toml
67086 67086 bash /Users/steve/Code/gitlab.com/custom-executor-drivers/debug/run.sh /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor612101401/script079237912/script. build_script <--- run.sh set as progress group
67087 67086 /bin/bash /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor612101401/script079237912/script.
67089 67086 /bin/bash /var/folders/h4/45_ps9bn4jb6w51g1ngy_n5c0000gn/T/custom-executor612101401/script079237912/script.
67090 67086 /Users/steve/Code/gitlab.com/steveazz/siglistener/siglistener
67091 67086 (sleep)
67092 67086 (sleep)
67093 67086 (sleep)
67094 67086 (sleep)
67095 67086 (sleep)
67096 67086 (sleep)
67097 67086 (sleep)
67098 67086 (sleep)
67099 67086 (sleep)
67100 67086 (sleep)
Steps
- Start job
- Run
ps -opid,pgid,command
same user as Runner to see the process - Cancel job
- Run
tail -f /tmp/siglistener.log
to see logs of the siglistener. In the logs there should be a message saying that SIGTERM was recived, and after 20tick
the process is killed. See we received a SIGTERM, and after 20 ticks the process was killed - Check if process tree is cleaned up with
ps -opid,pgid,command
Why was this MR needed?
- Prevent subprocesses of the custom executor to live forever/until the main Runner process exits
- Consolidated the shell executor and custom executor process logic together
Does this MR meet the acceptance criteria?
-
Documentation created/updated -
Added tests for this feature/bug -
In case of conflicts with master
- branch was rebased
What are the relevant issue numbers?
- Broken out of !1551 (closed)
- Part of #3376 (closed)
- Part of #4438 (closed)