Long running jobs canceled in GitLab UI, but runner continues process
I have a long-running compile job (~40 minutes). I made two pushes one after another and stopped the first running job from the UI. GitLab reports that job as canceled, but the second job stays pending.
I suspect that the runner keeps working on the first job until it finishes, because the process is not properly terminated.
Is there a way to test my hypothesis?
I'm using the shell executor for the runner (GitLab and the runner are both on Ubuntu 16.04).
Edit: as written in a comment below, here are the steps to reproduce the problem:

1. Create a project with a simple job definition:

    build:
      stage: build
      tags:
        - ubuntu_amd64
      script:
        - ping localhost

2. Start a pipeline and cancel it. This should also terminate the ping command (but it doesn't).
3. On the runner, check whether the process is still running:

    ps aux | grep ping
    gitlab-+ 19828 0.0 0.0 8656 1724 ? S 07:59 0:00 ping localhost

   or just kill it with killall ping (use sudo if the runner runs under another user).
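To check the hypothesis repeatedly without reading the ps output by eye, the manual check above could be scripted. This is only a minimal sketch: the helper names (parse_ps, find_leftovers, leftover_processes) are made up for illustration, and it assumes a ps that accepts the POSIX -e and -o options.

```python
import subprocess


def parse_ps(output):
    """Parse the output of `ps -e -o pid= -o user= -o args=` into tuples."""
    rows = []
    for line in output.splitlines():
        # Split into pid, user and the full command line (which may contain spaces).
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append((int(parts[0]), parts[1], parts[2]))
    return rows


def find_leftovers(rows, needle):
    """Return the (pid, user, args) rows whose command line contains `needle`."""
    return [row for row in rows if needle in row[2]]


def leftover_processes(needle):
    """List currently running processes whose command line contains `needle`."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "user=", "-o", "args="],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_leftovers(parse_ps(result.stdout), needle)
```

Calling leftover_processes("ping localhost") on the runner host after canceling the job should return an empty list if the process was really terminated. Any row that remains points at a process that survived the cancel.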
At the moment we simply kill the process group with SIGKILL and ignore the result. Instead, we should allow the process to shut down gracefully by first sending SIGTERM and then, after a specific timeout, sending SIGKILL to the process. This will help ensure that processes are terminated properly. We already have this implemented for the custom executor and should try to reuse that code to implement the same feature.
- Extract process killer from custom executor
- Extract commander interface from custom executor
Add Process groups to
- Use the same termination commands on Windows
- Rename test file
SIGKILL for shell executor
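The graceful termination described above (SIGTERM to the process group first, SIGKILL only after a timeout) can be sketched as follows. This is not the runner's actual code, which is written in Go. It is an illustration in Python for a POSIX system, with a hypothetical function name terminate_group and an assumed default grace period of 10 seconds. It also assumes the child was started in its own session, so that its process group ID equals its PID.

```python
import os
import signal
import subprocess


def terminate_group(proc, grace_seconds=10.0):
    """Ask proc's process group to stop, and force-kill it after a timeout.

    Assumes proc was started with start_new_session=True, so its process
    group ID equals proc.pid. Returns the exit code of the group leader.
    """
    # Simplification: if the leader has already exited, do nothing more.
    if proc.poll() is not None:
        return proc.returncode

    try:
        # First, the polite request: SIGTERM to every process in the group.
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        return proc.wait()

    try:
        # Give the leader a chance to clean up and exit on its own.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        pass

    try:
        # Grace period is over: SIGKILL cannot be caught or ignored.
        os.killpg(proc.pid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    return proc.wait()
```

For example, a process started with subprocess.Popen(["ping", "localhost"], start_new_session=True) and passed to terminate_group(proc, grace_seconds=5) should exit on the SIGTERM well before the timeout. Note that this sketch only waits for the group leader, so children that outlive the leader after SIGTERM would need an extra check.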