Upon Cancelling a Running Job, server running gitlab-runner gets "ERROR: Checking for jobs... forbidden"
Summary
I have successfully registered runners, and jobs are being built without problems.
However, if a build gets cancelled, the runner stops accepting jobs.
While debugging, I found the following:
gitlab-runner --debug verify
sudo gitlab-runner --debug verify
Runtime platform arch=386 os=linux revision=5396d320 version=11.0.0
Checking runtime mode GOOS=linux uid=0
Running in system-mode.
Trying to load /etc/gitlab-runner/certs/gitlab.company.com.crt ...
Dialing: tcp gitlab.cumul8.com:443 ...
ERROR: Verifying runner... is removed runner=907ca4f2
FATAL: Failed to verify runners
gitlab-runner --debug run
sudo gitlab-runner --debug run
Runtime platform arch=386 os=linux revision=5396d320 version=11.0.0
Starting multi-runner from /etc/gitlab-runner/config.toml ... builds=0
Checking runtime mode GOOS=linux uid=0
Running in system-mode.
Configuration loaded builds=0
metricsserveraddress: ""
listenaddress: ""
concurrent: 4
checkinterval: 0
loglevel: null
user: ""
runners:
- name: cirunner03a-ozone
limit: 1
outputlimit: 20000
requestconcurrency: 0
runnercredentials:
url: https://gitlab.company.com/
token: 907ca4f265a53497887506fd24ad08
tlscafile: ""
tlscertfile: ""
tlskeyfile: ""
runnersettings:
executor: shell
buildsdir: /home/gitlab-runner/builds/runner_a/builds_dir
cachedir: /home/gitlab-runner/builds/runner_a/cache_dir
cloneurl: ""
environment: []
preclonescript: ""
prebuildscript: ""
postbuildscript: ""
shell: ""
ssh: null
docker: null
parallels: null
virtualbox: null
cache:
type: ""
serveraddress: ""
accesskey: ""
secretkey: ""
bucketname: ""
bucketlocation: ""
insecure: false
path: ""
shared: false
machine: null
kubernetes: null
sentrydsn: null
modtime: 2018-05-09T14:54:51.489977488-07:00
loaded: true
builds=0
Waiting for stop signal builds=0
WARNING: 'metrics_server' configuration entry is deprecated and will be removed in one of future releases; please use 'listen_address' instead
Metrics server disabled
Feeding runners to channel builds=0
Starting worker builds=0 worker=0
Starting worker builds=0 worker=1
Starting worker builds=0 worker=2
Starting worker builds=0 worker=3
Trying to load /etc/gitlab-runner/certs/gitlab.company.com.crt ...
Dialing: tcp gitlab.cumul8.com:443 ...
ERROR: Checking for jobs... forbidden runner=907ca4f2
Feeding runners to channel builds=0
ERROR: Checking for jobs... forbidden runner=907ca4f2
Feeding runners to channel builds=0
ERROR: Checking for jobs... forbidden runner=907ca4f2
ERROR: Runner https://gitlab.cumul8.com/907ca4f265a53497887506fd24ad08 is not healthy and will be disabled!
Feeding runners to channel builds=0
Feeding runners to channel builds=0
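To check from a script whether a runner has reached this state, the "forbidden" lines in the output above can be counted. This is only a minimal sketch: it assumes the runner output has been captured and is fed to the function on stdin, and the name `count_forbidden` is hypothetical, not part of gitlab-runner.

```shell
# count_forbidden: hypothetical helper (not a gitlab-runner command).
# Reads runner log lines on stdin and prints how many of them are
# "Checking for jobs... forbidden" errors.
count_forbidden() {
    # grep -c exits non-zero when it finds no match; "|| true" keeps the
    # helper's exit status at 0 so a count of 0 is not treated as a failure.
    grep -c 'ERROR: Checking for jobs\.\.\. forbidden' || true
}
```

For example, with the output saved to a file (here a hypothetical `runner.log`), `count_forbidden < runner.log` prints the number of matching lines.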
gitlab-runner --version
sudo gitlab-runner --version
Version: 11.0.0
Git revision: 5396d320
Git branch: 11-0-stable
GO version: go1.8.7
Built: 2018-06-22T11:03:37+00:00
OS/Arch: linux/386
I have tried stopping, starting, and restarting gitlab-runner, as well as restarting the VM/server. None of this works.
Sometimes updating the version works.
Steps to reproduce
- Start a job
- Cancel the job
- Wait for the runner to start the next job
- Nothing happens...
- After about 1 hour, the runner starts accepting jobs again
Running:
gitlab-runner verify --delete
gitlab-runner run
This seems to make it start again.
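The two commands from this workaround can be wrapped in a small function so they always run in the same order. This is a sketch, not a fix: the function name `recover_runner` and the `RUNNER_BIN` variable are hypothetical, and note that `verify --delete` removes from the configuration any runner the GitLab server reports as removed, so such a runner may need to be registered again afterwards.

```shell
# RUNNER_BIN is a hypothetical override for the binary name; it defaults
# to the real gitlab-runner command.
RUNNER_BIN="${RUNNER_BIN:-gitlab-runner}"

# recover_runner: hypothetical wrapper around the workaround commands.
recover_runner() {
    # Verify registered runners and delete the ones GitLab reports as removed.
    "$RUNNER_BIN" verify --delete || return 1
    # Start processing jobs again (stays in the foreground).
    "$RUNNER_BIN" run
}
```

In system-mode, as in the logs in this report, the function would need to run as root so that it reads /etc/gitlab-runner/config.toml.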
What is the current bug behavior?
Cancelling a job stops the runner from picking up new jobs
What is the expected correct behavior?
After a job is cancelled, the runner should pick up the next available job
Relevant logs and/or screenshots
Output of checks
This is happening on our private GitLab instance.
Results of GitLab environment info
GitLab 10.8.0 (55e4a0b3)
GitLab Shell 7.1.2
GitLab Workhorse v4.2.0
GitLab API
Possible fixes