Treat split failure reasons as retry:when aliases

What does this MR do and why?

In GitLab 19.0, the generic stuck_or_timeout_failure and job_execution_timeout job failure reasons were split into a set of more specific reasons emitted by the Ci::StuckBuilds::* and Ci::TimedOutBuilds::* services:

| Legacy reason | Reasons that replaced it |
| --- | --- |
| stuck_or_timeout_failure | stuck_pending_with_matching_runners, stuck_pending_no_matching_runners, no_updates_running, no_updates_canceling |
| job_execution_timeout | server_timeout_running, server_timeout_canceling |

The legacy reasons remain valid retry:when values because the enum keys are kept for historical data. A config like the following therefore still validates after the upgrade but silently stops matching any real failure, since new builds are only ever written with the specific reasons:

job:
  script: ./run.sh
  retry:
    max: 2
    when:
      - stuck_or_timeout_failure
      - job_execution_timeout

This MR makes the legacy reasons behave as aliases in auto-retry matching. When a user lists either of them under retry:when, the retry logic matches against the full set of specific reasons that replaced it, preserving the original intent ("retry me if I got stuck or timed out") without re-emitting the legacy reasons on new builds.

Implementation

  • A central alias map (Enums::Ci::CommitStatus.failure_reason_aliases) records which specific reasons replaced each legacy one. New splits should add an entry here.
  • Gitlab::Ci::Build::AutoRetry expands the configured retry:when list at match time, keeping each configured reason and the reasons that superseded it. This preserves matching for historical builds that still carry the legacy reason, and leaves per-reason behavior unchanged (a specific reason still matches only itself).
  • The YAML validator is untouched: the legacy keys remain in the enum, so existing configs keep validating.
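
The expansion described in the list above can be sketched in plain Ruby. This is a minimal illustration, not the code in this MR: the constant FAILURE_REASON_ALIASES and the helpers expand_retry_when and retry_matches? are hypothetical names standing in for Enums::Ci::CommitStatus.failure_reason_aliases and the matching logic in Gitlab::Ci::Build::AutoRetry, whose real signatures are not shown here. Only the reason names are taken from the table above.

```ruby
# Hypothetical stand-in for Enums::Ci::CommitStatus.failure_reason_aliases.
# Maps each legacy reason to the specific reasons that replaced it.
FAILURE_REASON_ALIASES = {
  'stuck_or_timeout_failure' => %w[
    stuck_pending_with_matching_runners
    stuck_pending_no_matching_runners
    no_updates_running
    no_updates_canceling
  ],
  'job_execution_timeout' => %w[
    server_timeout_running
    server_timeout_canceling
  ]
}.freeze

# Hypothetical helper: keeps every configured reason and appends the
# reasons that superseded it. Reasons without an alias entry expand
# only to themselves.
def expand_retry_when(configured)
  configured.flat_map do |reason|
    [reason, *FAILURE_REASON_ALIASES.fetch(reason, [])]
  end.uniq
end

# Hypothetical helper: true when the build's failure reason is covered
# by the expanded retry:when list.
def retry_matches?(configured, failure_reason)
  expand_retry_when(configured).include?(failure_reason.to_s)
end
```

Under these assumptions, retry_matches?(%w[stuck_or_timeout_failure], 'no_updates_running') is true, and a historical build that still carries stuck_or_timeout_failure also matches. The expansion runs in one direction only: retry_matches?(%w[no_updates_running], 'stuck_or_timeout_failure') is false, so a specific reason still matches only itself.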

References

Edited by Oleg Yakovenko
