Backport of "Treat split failure reasons as retry:when aliases"

What does this MR do and why?

Backport of !239415 (merged) to the 19-0-stable-ee branch.

In GitLab 19.0, the generic stuck_or_timeout_failure and job_execution_timeout job failure reasons were split into a set of more specific reasons emitted by the Ci::StuckBuilds::* and Ci::TimedOutBuilds::* services:

| Legacy reason | Reasons that replaced it |
|---|---|
| stuck_or_timeout_failure | stuck_pending_with_matching_runners, stuck_pending_no_matching_runners, no_updates_running, no_updates_canceling |
| job_execution_timeout | server_timeout_running, server_timeout_canceling |

The legacy reasons remain valid retry:when values because their enum keys are kept for historical data. A config that lists them therefore still validates after the upgrade, but it silently stops matching any real failure: new builds are only ever written with the specific reasons.
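
To make the regression concrete, here is a minimal Ruby sketch. It assumes the pre-fix matching amounts to a plain membership check of the build's failure reason against the configured retry:when list. The method name naive_retry_match? is hypothetical and is not GitLab's actual code.

```ruby
# Hypothetical stand-in for the pre-fix behavior: the build's failure reason
# must appear literally in the configured retry:when list.
def naive_retry_match?(retry_when, failure_reason)
  retry_when.include?(failure_reason)
end

# A config written before the upgrade; it still passes validation in 19.0.
retry_when = ['stuck_or_timeout_failure']

# A historical build that carries the legacy reason still matches.
legacy_match = naive_retry_match?(retry_when, 'stuck_or_timeout_failure')

# A new build is written with one of the specific reasons, so it never matches.
new_match = naive_retry_match?(retry_when, 'no_updates_running')
```

Under these assumptions legacy_match is true and new_match is false, which is the silent loss of auto-retry described above.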

This MR makes the legacy reasons behave as aliases in auto-retry matching. When a user lists either of them under retry:when, the retry logic matches against the full set of specific reasons that replaced it, preserving the original intent ("retry me if I got stuck or timed out") without re-emitting the legacy reasons on new builds.

Implementation

  • A central alias map (Enums::Ci::CommitStatus.failure_reason_aliases) records which specific reasons replaced each legacy one.
  • Gitlab::Ci::Build::AutoRetry expands the configured retry:when list at match time, keeping each configured reason and the reasons that superseded it. This preserves matching for historical builds that still carry the legacy reason, and leaves per-reason behavior unchanged.
  • The YAML validator is untouched: the legacy keys remain in the enum, so existing configs keep validating.
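
The alias expansion described in this list can be sketched in plain Ruby as follows. This is an illustration under stated assumptions, not the merged code: the constant FAILURE_REASON_ALIASES and the methods expand_retry_when and retry_match? are hypothetical names, and the real shape of Enums::Ci::CommitStatus.failure_reason_aliases may differ. Only the reason names come from the table above.

```ruby
# Illustrative alias map: legacy reason => specific reasons that replaced it.
# Assumed shape; the real map is Enums::Ci::CommitStatus.failure_reason_aliases.
FAILURE_REASON_ALIASES = {
  'stuck_or_timeout_failure' => %w[
    stuck_pending_with_matching_runners
    stuck_pending_no_matching_runners
    no_updates_running
    no_updates_canceling
  ],
  'job_execution_timeout' => %w[
    server_timeout_running
    server_timeout_canceling
  ]
}.freeze

# Expand a configured retry:when list at match time. Each configured reason
# is kept, and the reasons that superseded it (if any) are appended.
def expand_retry_when(retry_when)
  retry_when.flat_map { |reason| [reason, *FAILURE_REASON_ALIASES.fetch(reason, [])] }.uniq
end

# A build is retried when its failure reason is in the expanded list.
def retry_match?(retry_when, failure_reason)
  expand_retry_when(retry_when).include?(failure_reason)
end
```

Because the configured reason itself stays in the expanded list, a historical build that still carries stuck_or_timeout_failure keeps matching, and a reason with no alias entry expands to just itself.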

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

  • This MR is backporting a bug fix, documentation update, or spec fix, previously merged in the default branch.
  • The MR that fixed the bug on the default branch has been deployed to GitLab.com (not applicable for documentation or spec changes).
  • The MR title is descriptive (e.g. "Backport of 'title of default branch MR'"). This is important, since the title will be copied to the patch blog post.
  • Required labels have been applied to this merge request.
  • This MR has been approved by a maintainer (only one approval is required).
  • Ensure the e2e:test-on-omnibus-ee job has succeeded, or if it has failed, investigate the failures. If you determine the failures are unrelated, you may proceed. If you need assistance investigating, reach out to a Software Engineer in Test in #s_developer_experience.

Note to the merge request author and maintainer

If you have questions about the patch release process, please:

Edited by Hordur Freyr Yngvason
