Automatically trigger Fix a Failing Pipeline Flow on pipeline failures
## Problem to Solve
When a pipeline fails, engineers must manually notice the failure, diagnose the cause, and decide how to act — creating an expensive context-switching loop that slows down development cycles. The **Fix Pipeline flow** already exists to automate this response, but today it only runs when a user manually triggers it.
This epic makes Fix Pipeline automatic: when a pipeline fails, the flow kicks off without any human prompt.
However, simply wiring up automatic triggers today would create more problems than it solves:
**Trigger noise.** Pipeline events fire on every status change (created, in progress, success, failed, canceled) — 3 or more times per pipeline. Without status filtering, the flow fires on irrelevant events, causing system saturation at scale.
**Runaway MR creation.** The current flow always opens a new MR, regardless of context. If a pipeline fails on an existing MR branch, the flow opens a second MR targeting the first — which can then fail and spawn a third. This recursive pattern creates review debt and confusion.
**Duplicate actions.** Without idempotency awareness, the flow may repeatedly open issues or MRs for the same underlying failure, making the signal-to-noise ratio worse, not better.
**Infrastructure impact.** At enterprise scale, auto-triggering without controls would significantly increase load on CI runners and underlying infrastructure. Enterprise customers running flows on self-managed runners need the ability to isolate that load from their core CI/CD workloads.
At GitLab.com scale — and for enterprise customers managing thousands of daily pipeline failures — these problems would cause incidents, wasted compute, and eroded trust in the feature before it has a chance to prove its value.
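As a rough illustration of how the runaway-MR and duplicate-action problems could be guarded against, the sketch below derives an idempotency key from a failed pipeline and picks one action per failure. Everything here is hypothetical: `FailedPipeline`, `idempotency_key`, `plan_action`, the action names, and the in-memory `seen` set are illustrative assumptions, not the flow's actual design. In particular, updating the existing MR instead of opening a new one is just one possible policy.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class FailedPipeline:
    """Hypothetical, minimal view of a failed pipeline (not a GitLab type)."""
    project_id: int
    sha: str                      # commit the pipeline ran against
    failed_jobs: Tuple[str, ...]  # names of the jobs that failed
    open_mr_iid: Optional[int]    # MR the branch already belongs to, if any


def idempotency_key(p: FailedPipeline) -> str:
    """Same project, commit, and failed-job set always produce the same key."""
    raw = f"{p.project_id}:{p.sha}:{','.join(sorted(p.failed_jobs))}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def plan_action(p: FailedPipeline, seen: Set[str]) -> str:
    """Return one action for this failure, skipping failures already handled."""
    key = idempotency_key(p)
    if key in seen:
        return "skip_duplicate"       # already acted on this exact failure
    seen.add(key)
    if p.open_mr_iid is not None:
        return "update_existing_mr"   # assumed policy: never stack a new MR on an MR
    return "open_new_mr"
```

A real implementation would need a shared, persistent store for the keys rather than a Python set, and would still need a cap on repeated fix attempts against the same MR.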
### Current Functionality
- The current implementation cannot filter by pipeline status.
- AI flows trigger on every pipeline status change (created, in progress, success, failed, canceled), firing 3+ times per pipeline. Flows must implement their own filtering logic to ignore irrelevant events.
- Triggers fire on all pipeline events regardless of status, branch, or source.
- At GitLab.com scale, this could saturate systems and cause incidents due to the high volume of pipeline events. **We will need to roll out slowly and dogfood on a smaller service before moving on to `gitlab-org/gitlab`.**
- There is no ability to scope triggers to specific branches or sources. This risks inconsistent fixes for flaky tests across MRs when flows cannot distinguish master failures from MR failures (merged result pipelines).
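To make the missing capability concrete, here is a minimal sketch of what a trigger-side filter on status, branch, and source might look like. The `PipelineEvent` and `TriggerFilter` types, their field names, and the status and source strings are assumptions for illustration only; they are not the existing trigger API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class PipelineEvent:
    """Hypothetical pipeline event; field names and values are illustrative."""
    status: str   # e.g. "created", "success", "failed", "canceled"
    ref: str      # branch or tag the pipeline ran for
    source: str   # e.g. "push", "schedule", "merge_request_event"


@dataclass(frozen=True)
class TriggerFilter:
    """Accept an event only if it passes every configured dimension."""
    statuses: FrozenSet[str] = frozenset({"failed"})
    refs: Optional[FrozenSet[str]] = None      # None means "any ref"
    sources: Optional[FrozenSet[str]] = None   # None means "any source"

    def matches(self, event: PipelineEvent) -> bool:
        if event.status not in self.statuses:
            return False
        if self.refs is not None and event.ref not in self.refs:
            return False
        if self.sources is not None and event.source not in self.sources:
            return False
        return True


# Usage: only failed pipelines on "master" reach the flow.
master_failures = TriggerFilter(refs=frozenset({"master"}))
events = [
    PipelineEvent("created", "master", "push"),
    PipelineEvent("failed", "master", "push"),
    PipelineEvent("failed", "feature-x", "merge_request_event"),
]
accepted = [e for e in events if master_failures.matches(e)]
```

Filtering before the flow is invoked, rather than inside each flow, is what would cut the per-pipeline trigger volume described above.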
## MVC Phase 1: Dogfooding proposal (Enable for GitLab internal projects)
https://gitlab.com/groups/gitlab-org/-/work_items/21181+s
## MVC Phase 2 (Enable for customer projects)
https://gitlab.com/groups/gitlab-org/-/work_items/21225+s
## Post MVC
**Custom Prompt Per Trigger** - See https://gitlab.com/gitlab-org/gitlab/-/issues/588753#note_3109366453
**Scope Filtering**
- Enable filtering by pipeline source (merge request, scheduled, manual, etc.)
**UI configuration** for trigger filters
**Observability -** Any auto-triggering at scale must come with a real-time dashboard showing trigger and retry volume (e.g. "X retries in the last 5 minutes")
**Cost / Credit Safeguards -** Auto-fixing and retry flows must include controls to prevent runaway credit consumption, both at the individual invocation level and across concurrent users triggering the same flow.
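One way the observability and safeguard items above could share a mechanism is a sliding-window counter: the same structure that answers "X retries in the last 5 minutes" can also refuse new invocations once a cap is reached. The sketch below is an in-memory illustration under stated assumptions: `SlidingWindowCounter` and its methods are hypothetical names, timestamps are passed in explicitly and assumed non-decreasing, and a production version would need shared storage across workers.

```python
from collections import deque
from typing import Deque


class SlidingWindowCounter:
    """Counts events that occurred within the last `window_seconds`."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record(self, now: float) -> None:
        self._evict(now)
        self._timestamps.append(now)

    def count(self, now: float) -> int:
        """Value a dashboard could display for the current window."""
        self._evict(now)
        return len(self._timestamps)

    def try_acquire(self, now: float, limit: int) -> bool:
        """Record the event only if the window is still under `limit`."""
        if self.count(now) >= limit:
            return False
        self.record(now)
        return True


# Usage: at most 2 flow invocations per 5-minute window (limit is illustrative).
retries = SlidingWindowCounter(window_seconds=300)
first = retries.try_acquire(now=0, limit=2)
second = retries.try_acquire(now=10, limit=2)
third = retries.try_acquire(now=20, limit=2)   # rejected: window is full
```

Keeping one counter per flow, and optionally one per project or user, would cover both the per-invocation and the cross-user cases mentioned above.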