GitLab Runner graceful shutdown: unexpected behaviour
Summary
Steps to reproduce
Set up a simple project whose pipeline contains a job with a long sleep (see .gitlab-ci.yml below).
Create a shell executor runner on an Ubuntu machine (mine is on version 22), register it, and start the pipeline.
SSH into the runner's machine.
Create /etc/systemd/system/gitlab-runner.d/kill.conf with the following content. Once the job is running (sleeping), execute gitlab-runner stop.
[Service]
KillSignal=SIGQUIT
TimeoutStopSec=100
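To rule out a typo in the drop-in itself, its content can be checked before stopping the service. The following is a minimal sketch, assuming Python 3 is available on the runner host. `check_dropin` and `EXPECTED` are hypothetical names used only for illustration, and the check covers the text of the file only, not whether systemd has actually loaded it.

```python
import configparser

# Settings this report expects in the [Service] section of kill.conf.
EXPECTED = {"KillSignal": "SIGQUIT", "TimeoutStopSec": "100"}


def check_dropin(text: str) -> dict:
    """Return {key: (expected, found)} for every setting that differs."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read_string(text)
    service = parser["Service"] if parser.has_section("Service") else {}
    return {
        key: (want, service.get(key))
        for key, want in EXPECTED.items()
        if service.get(key) != want
    }


# Drop-in content copied from this report.
dropin = """\
[Service]
KillSignal=SIGQUIT
TimeoutStopSec=100
"""

print(check_dropin(dropin))  # an empty dict means both settings match
```

A non-empty result lists each mismatching key with the expected and the found value, where a found value of `None` means the key is missing.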
.gitlab-ci.yml
stages: # List of stages for jobs, and their order of execution
  - build
  - test
  - deploy

build-job: # This job runs in the build stage, which runs first.
  stage: build
  script:
    - echo "Compiling the code..."
    - echo "Compile complete."

unit-test-job: # This job runs in the test stage.
  stage: test # It only starts when the job in the build stage completes successfully.
  script:
    - echo "Running unit tests... This will take about 60 seconds."
    - sleep 600
    - echo "Code coverage is 90%"

lint-test-job: # This job also runs in the test stage.
  stage: test # It can run at the same time as unit-test-job (in parallel).
  script:
    - echo "Linting code... This will take about 10 seconds."
    - sleep 10
    - echo "No lint issues found."

deploy-job: # This job runs in the deploy stage.
  stage: deploy # It only runs when *both* jobs in the test stage complete successfully.
  environment: production
  script:
    - echo "Deploying application..."
    - echo "Application successfully deployed."
Actual behavior
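The gitlab-runner service stops with an error: per the journal below, it receives a "terminated" stop signal, starts a forceful shutdown, times out after 30s, and exits with status 1/FAILURE. The job on gitlab.com stays in the running state, waiting to complete.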
Expected behavior
The gitlab-runner service stops gracefully, the job on gitlab.com is shown as failed, and jobs are not retried on this runner.
Relevant logs and/or screenshots
job log
Running with gitlab-runner 16.3.0 (8ec04662)
  on ip-172-31-22-137 g-BKYv8vd, system ID: s_ae87f0fd4b62
section_start:1692647540:resolve_secrets
Resolving secrets
section_end:1692647540:resolve_secrets
section_start:1692647540:prepare_executor
Preparing the "shell" executor
Using Shell (bash) executor...
section_end:1692647540:prepare_executor
section_start:1692647540:prepare_script
Preparing environment
Running on ip-172-31-22-137...
section_end:1692647540:prepare_script
section_start:1692647540:get_sources
Getting source from Git repository
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /home/gitlab-runner/builds/g-BKYv8vd/0/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test/.git/
Checking out 8fc5d14f as detached HEAD (ref is main)...
Skipping Git submodules setup
section_end:1692647541:get_sources
section_start:1692647541:step_script
Executing "step_script" stage of the job script
$ echo "Running unit tests... This will take about 60 seconds."
Running unit tests... This will take about 60 seconds.
$ sleep 600
Environment description
config.toml contents
concurrent = 1
check_interval = 0
shutdown_timeout = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "ip-172-31-22-137"
url = "https://gitlab.com"
id = 27119341
token = "{redacted}"
token_obtained_at = 2023-08-21T15:07:32Z
token_expires_at = 0001-01-01T00:00:00Z
executor = "shell"
[runners.cache]
MaxUploadedArchiveSize =
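Note that config.toml sets shutdown_timeout = 0, while the journal in this report shows shutdown-timeout=30s. One plausible reading, stated here as an assumption rather than confirmed runner behaviour, is that a value of 0 or less falls back to a 30-second default. A minimal sketch of that rule, with hypothetical names:

```python
# Assumed default, chosen to match shutdown-timeout=30s in the journal.
DEFAULT_SHUTDOWN_TIMEOUT_S = 30


def effective_shutdown_timeout(configured_s: int) -> int:
    """Assumed rule: a value of 0 or less falls back to the default."""
    return configured_s if configured_s > 0 else DEFAULT_SHUTDOWN_TIMEOUT_S


print(effective_shutdown_timeout(0))    # 30, the value seen in the journal
print(effective_shutdown_timeout(100))  # 100
```

Under this assumption, the forceful shutdown window for a 600-second job would need shutdown_timeout set explicitly to a positive value.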
journalctl -u gitlab-runner
Aug 21 19:51:06 ip-172-31-22-137 systemd[1]: Stopped GitLab Runner.
Aug 21 19:51:11 ip-172-31-22-137 systemd[1]: Started GitLab Runner.
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: Runtime platform arch=amd64 os=linux pid=3381 revision=8ec04662 version=16.3.0
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: Starting multi-runner from /etc/gitlab-runner/config.toml... builds=0 max_builds=0
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: Running in system-mode.
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]:
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: Configuration loaded builds=0 max_builds=1
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: listen_address not defined, metrics & debug endpoints disabled builds=0 max_builds=1
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: [session_server].listen_address not defined, session endpoints disabled builds=0 max_builds=1
Aug 21 19:51:11 ip-172-31-22-137 gitlab-runner[3381]: Initializing executor providers builds=0 max_builds=1
Aug 21 19:52:15 ip-172-31-22-137 gitlab-runner[3381]: Checking for jobs... received job=4915814954 repo_url=https://gitlab.com/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test.git runner=g-BKYv8vd
Aug 21 19:52:15 ip-172-31-22-137 gitlab-runner[3381]: Added job to processing list builds=1 job=4915814954 max_builds=1 project=48672264 repo_url=https://gitlab.com/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test.git
Aug 21 19:52:15 ip-172-31-22-137 su[3392]: (to gitlab-runner) root on none
Aug 21 19:52:15 ip-172-31-22-137 su[3392]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:15 ip-172-31-22-137 su[3392]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:15 ip-172-31-22-137 su[3402]: (to gitlab-runner) root on none
Aug 21 19:52:15 ip-172-31-22-137 su[3402]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:16 ip-172-31-22-137 su[3402]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:16 ip-172-31-22-137 su[3436]: (to gitlab-runner) root on none
Aug 21 19:52:16 ip-172-31-22-137 su[3436]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:16 ip-172-31-22-137 su[3436]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:16 ip-172-31-22-137 su[3448]: (to gitlab-runner) root on none
Aug 21 19:52:16 ip-172-31-22-137 su[3448]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:16 ip-172-31-22-137 su[3448]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:16 ip-172-31-22-137 gitlab-runner[3381]: Job succeeded duration_s=1.013690672 job=4915814954 project=48672264 runner=g-BKYv8vd
Aug 21 19:52:17 ip-172-31-22-137 gitlab-runner[3381]: Appending trace to coordinator...ok code=202 job=4915814954 job-log=0-1531 job-status=running runner=g-BKYv8vd sent-log=0-1530 status=202 Accepted update-interval=1m0s
Aug 21 19:52:17 ip-172-31-22-137 gitlab-runner[3381]: Updating job... bytesize=1531 checksum=crc32:994dbcae job=4915814954 runner=g-BKYv8vd
Aug 21 19:52:17 ip-172-31-22-137 gitlab-runner[3381]: Submitting job to coordinator...accepted, but not yet completed bytesize=1531 checksum=crc32:994dbcae code=202 job=4915814954 job-status= runner=g-BKYv8vd update-interval=1s
Aug 21 19:52:18 ip-172-31-22-137 gitlab-runner[3381]: Updating job... bytesize=1531 checksum=crc32:994dbcae job=4915814954 runner=g-BKYv8vd
Aug 21 19:52:19 ip-172-31-22-137 gitlab-runner[3381]: Submitting job to coordinator...ok bytesize=1531 checksum=crc32:994dbcae code=200 job=4915814954 job-status= runner=g-BKYv8vd update-interval=0s
Aug 21 19:52:19 ip-172-31-22-137 gitlab-runner[3381]: Removed job from processing list builds=0 job=4915814954 max_builds=1 project=48672264 repo_url=https://gitlab.com/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test.git
Aug 21 19:52:20 ip-172-31-22-137 gitlab-runner[3381]: Checking for jobs... received job=4915814958 repo_url=https://gitlab.com/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test.git runner=g-BKYv8vd
Aug 21 19:52:20 ip-172-31-22-137 gitlab-runner[3381]: Added job to processing list builds=1 job=4915814958 max_builds=1 project=48672264 repo_url=https://gitlab.com/gl-demo-ultimate-msporchia1/scrappy-projects/spot-instance-termination-test.git
Aug 21 19:52:20 ip-172-31-22-137 su[3459]: (to gitlab-runner) root on none
Aug 21 19:52:20 ip-172-31-22-137 su[3459]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:20 ip-172-31-22-137 su[3459]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:20 ip-172-31-22-137 su[3469]: (to gitlab-runner) root on none
Aug 21 19:52:20 ip-172-31-22-137 su[3469]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:21 ip-172-31-22-137 su[3469]: pam_unix(su:session): session closed for user gitlab-runner
Aug 21 19:52:21 ip-172-31-22-137 su[3503]: (to gitlab-runner) root on none
Aug 21 19:52:21 ip-172-31-22-137 su[3503]: pam_unix(su:session): session opened for user gitlab-runner(uid=1001) by (uid=0)
Aug 21 19:52:23 ip-172-31-22-137 gitlab-runner[3381]: Appending trace to coordinator...ok code=202 job=4915814958 job-log=0-1321 job-status=running runner=g-BKYv8vd sent-log=0-1320 status=202 Accepted update-interval=1m0s
Aug 21 19:52:30 ip-172-31-22-137 gitlab-runner[3381]: WARNING: [runWait] received stop signal builds=1 max_builds=1 stop-signal=terminated
Aug 21 19:52:30 ip-172-31-22-137 gitlab-runner[3381]: WARNING: Graceful shutdown not finished properly. To gracefully clean up running plugins please use SIGQUIT (ctrl-\) instead of SIGINT (ctrl-c) builds=1 error=received stop signal: terminated max_builds=1
Aug 21 19:52:30 ip-172-31-22-137 gitlab-runner[3381]: WARNING: Starting forceful shutdown StopSignal=terminated builds=1 max_builds=1 shutdown-timeout=30s
Aug 21 19:52:30 ip-172-31-22-137 systemd[1]: Stopping GitLab Runner...
Aug 21 19:53:00 ip-172-31-22-137 gitlab-runner[3381]: WARNING: Forceful shutdown not finished properly builds=1 error=shutdown timed out max_builds=1
Aug 21 19:53:00 ip-172-31-22-137 gitlab-runner[3381]: FATAL: Service run failed error=shutdown timed out
Aug 21 19:53:00 ip-172-31-22-137 systemd[1]: gitlab-runner.service: Main process exited, code=exited, status=1/FAILURE
Aug 21 19:53:00 ip-172-31-22-137 systemd[1]: gitlab-runner.service: Failed with result 'exit-code'.
Aug 21 19:53:00 ip-172-31-22-137 systemd[1]: Stopped GitLab Runner.
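The two facts that matter in this journal are which signal the runner reports and how long it waited before giving up. The sketch below pulls both out of three lines copied from the excerpt above. It assumes Python 3; the function names are hypothetical, and the year 2023 is taken from the token_obtained_at value in config.toml because journal timestamps omit it.

```python
import re
from datetime import datetime

# Three runner lines copied verbatim from the journal excerpt in this report.
JOURNAL = """\
Aug 21 19:52:30 ip-172-31-22-137 gitlab-runner[3381]: WARNING: [runWait] received stop signal builds=1 max_builds=1 stop-signal=terminated
Aug 21 19:52:30 ip-172-31-22-137 gitlab-runner[3381]: WARNING: Starting forceful shutdown StopSignal=terminated builds=1 max_builds=1 shutdown-timeout=30s
Aug 21 19:53:00 ip-172-31-22-137 gitlab-runner[3381]: FATAL: Service run failed error=shutdown timed out
"""


def timestamp(line: str) -> datetime:
    """Parse the leading 'Aug 21 19:52:30' stamp; the year is an assumption."""
    return datetime.strptime("2023 " + line[:15], "%Y %b %d %H:%M:%S")


def stop_signal(journal: str):
    """Return the value of the first stop-signal=... field, or None."""
    match = re.search(r"\bstop-signal=(\S+)", journal)
    return match.group(1) if match else None


def seconds_until_fatal(journal: str):
    """Seconds between the first stop-signal line and the first FATAL line."""
    lines = journal.splitlines()
    start = next((l for l in lines if "stop-signal=" in l), None)
    fatal = next((l for l in lines if "FATAL:" in l), None)
    if start is None or fatal is None:
        return None
    return (timestamp(fatal) - timestamp(start)).total_seconds()


print(stop_signal(JOURNAL))          # terminated
print(seconds_until_fatal(JOURNAL))  # 30.0
```

"terminated" is the name the Go runtime gives SIGTERM, whereas the drop-in in this report asks for SIGQUIT. That mismatch suggests the KillSignal setting was not in effect when the service was stopped, although this is an inference from the log and not something the report confirms.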
Used GitLab Runner version
revision=8ec04662 version=16.3.0
Possible fixes
Edited by Tomasz Maczukin