Auto retry stuck or timed out jobs

Reuben Pereira requested to merge rp/add-auto-retry-for-stuck-jobs into master

What does this MR do and why?

Content

  • Auto retry stuck or timed out jobs

Retry a CI job when it fails with stuck_or_timeout_failure as the failure reason (docs), so that jobs that never started because they were stuck or timed out are retried automatically. For example: https://ops.gitlab.net/gitlab-org/release/tools/-/jobs/13657298.

Jobs that start executing and then time out fall under the job_execution_timeout failure category, so this change does not retry them. For example: https://ops.gitlab.net/gitlab-org/quality/staging/-/jobs/13537388. Whether such jobs should be retried automatically is debatable, because we might need to investigate what the job did before it timed out.
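
For illustration, the behaviour described above could be expressed in a .gitlab-ci.yml roughly as follows. This is a minimal sketch, not necessarily the exact change in this MR: placing the setting under `default:` and using `max: 2` are assumptions, while `retry:when` and the `stuck_or_timeout_failure` value are standard GitLab CI keywords.

```yaml
# Hypothetical sketch; the actual diff in this MR may differ.
default:
  retry:
    # Assumed value; GitLab CI accepts 0, 1, or 2 here.
    max: 2
    when:
      # Retry only jobs that never started because they were stuck or timed out.
      - stuck_or_timeout_failure
      # job_execution_timeout is intentionally not listed, so jobs that
      # started running and then timed out are left for investigation.
```

Listing a single failure reason under `when` keeps the retry narrow: other failures, such as script errors, still fail the job on the first attempt.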

Author Check-list

  • Has documentation been updated?
Edited by Reuben Pereira