Retry Jobs Stuck in Pending
Summary
When a failed job retries via the retry keyword available in .gitlab-ci.yml, the retry job gets stuck in a Pending state despite the runner having capacity.
Steps to reproduce
- Run a pipeline for, in this case, a specific tagged runner, with a value of at least 1 for the retry keyword in .gitlab-ci.yml
- Set up the job so that it will fail and cause a replacement retry job to be created
- You should see this retry job stuck as Pending
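For reference, the steps above can be sketched as a minimal .gitlab-ci.yml fragment. This is only an illustration, not the configuration from the affected project: the job name `failing-job` and the tag `my-tagged-runner` are hypothetical placeholders, and the script simply forces a failure so that a retry job is created.

```yaml
# Minimal reproduction sketch (job name and tag are hypothetical placeholders).
failing-job:
  tags:
    - my-tagged-runner   # assumption: the tag of the specific runner being targeted
  retry: 1               # at least 1, so a replacement retry job is created on failure
  script:
    - exit 1             # force the job to fail
```

With a configuration along these lines, the first attempt fails and the replacement retry job is the one expected to sit in Pending.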
What is the current bug behavior?
If left untouched, the retry job almost always remains in Pending for at least 45 minutes. This can be seen afterwards in the job summary under the Queued heading.
What is the expected correct behavior?
The retry job should begin straight away if a runner is available and has capacity.
Relevant logs and/or screenshots
Evidence of an excessively long queue where the job was Pending

Results of GitLab environment info
- Upgraded gitlab-runner on the runner instance to 15.10.0 and the problem still persists.
- Began occurring approximately a week ago.
- It can lead to considerable increases in pipeline runtimes.
Possible fixes
- Going into the project settings in GitLab > CI/CD > Runners and then pausing and resuming the appropriate runner makes the Pending job begin immediately. This is a manual workaround, but it is neither ideal nor always possible
- Starting a new pipeline that is sent to the same runner also causes the Pending job to start immediately
devopsverify
severity3
priority1
pipeline
pipeline processing
testingcode testing
typebug
Edited by Shane Turley