Runner does not retry CI/CD jobs that should be automatically retried upon timeout
Summary
CI/CD jobs that time out are not automatically retried when the job's retry:when list names stuck_or_timeout_failure instead of always.
Retry documentation: https://docs.gitlab.com/ee/ci/yaml/#retry
MR that introduced this feature: gitlab-foss!21758 (merged)
Steps to reproduce
Use the following .gitlab-ci.yml file and ensure that job timeouts are set low enough so that sleep 3600 will never finish.
retry-just-some:
  script: sleep 3600
  retry:
    max: 2
    when:
      - unknown_failure
      - api_failure
      - stuck_or_timeout_failure
      - runner_system_failure

retry-explicit-always:
  script: sleep 3600
  retry:
    max: 2
    when:
      - always

retry-implicit-always:
  script: sleep 3600
  retry:
    max: 2

old-retry-syntax:
  script: sleep 3600
  retry: 2
You'll see that all but the first job (retry-just-some) are retried automatically upon timeout.
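If changing the project-level timeout is inconvenient, the failing case might also be reproducible with a job-level timeout. The following is only a sketch: it assumes the instance supports the job-level `timeout` keyword, the 10-minute value is an arbitrary choice, and it has not been confirmed here that a job-level timeout produces the same failure reason as the project-level setting.

```yaml
# Reduced reproduction of the failing job only.
# Assumption: the job-level `timeout` keyword is available on this instance.
retry-just-some:
  script: sleep 3600          # runs far longer than the timeout below
  timeout: 10 minutes         # assumed value; any duration well under one hour should do
  retry:
    max: 2
    when:
      - stuck_or_timeout_failure
```

If the bug is present, this job should fail once on timeout and show no retry attempts in the pipeline view.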
Example Project
https://gitlab.com/wescossick/demonstrate-retry-bug
What is the current bug behavior?
A job whose retry:when list includes stuck_or_timeout_failure (but not always) times out and is not retried.
What is the expected correct behavior?
Job times out and is automatically retried.
Relevant logs and/or screenshots
Only three of the four jobs are retried (all except retry-just-some).
Output of checks
This bug happens on GitLab.com
