Automatically trigger Fix a Failing Pipeline Flow on pipeline failures
## Problem to Solve

When a pipeline fails, engineers must manually notice the failure, diagnose the cause, and decide how to act. This creates an expensive context-switching loop that slows down development cycles.

The **Fix Pipeline flow** already exists to automate this response, but today it only runs when a user manually triggers it. This epic makes Fix Pipeline automatic: when a pipeline fails, the flow kicks off without any human prompt.

However, simply wiring up automatic triggers today would create more problems than it solves:

**Trigger noise.** Pipeline events fire on every status change (created, in progress, success, failed, canceled), 3 or more times per pipeline. Without status filtering, the flow fires on irrelevant events, causing system saturation at scale.

**Runaway MR creation.** The current flow always opens a new MR, regardless of context. If a pipeline fails on an existing MR branch, the flow opens a second MR targeting the first, which can then fail and spawn a third. This recursive pattern creates review debt and confusion.

**Duplicate actions.** Without idempotency awareness, the flow may repeatedly open issues or MRs for the same underlying failure, making the signal-to-noise ratio worse, not better.

**Infrastructure impact.** At enterprise scale, auto-triggering without controls would significantly increase load on CI runners and underlying infrastructure. Enterprise customers running flows on self-managed runners need the ability to isolate that load from their core CI/CD workloads.

At GitLab.com scale, and for enterprise customers managing thousands of daily pipeline failures, these problems would cause incidents, wasted compute, and eroded trust in the feature before it has a chance to prove its value.

### Current Functionality

- The current implementation cannot filter by pipeline status.
- AI flows trigger on every pipeline status change (created, in progress, success, failed, canceled), firing 3+ times per pipeline. Flows must implement their own filtering logic to ignore irrelevant events.
- Triggers fire on ALL pipeline status changes regardless of status, branch, or source.
- At GitLab.com scale, this could saturate systems and cause incidents due to the high volume of pipeline events. **We will need to roll out slowly and dogfood on a smaller service before moving on to `gitlab-org/gitlab`.**
- There is no ability to scope triggers to specific branches or sources. This risks inconsistent fixes for flaky tests across MRs when flows can't distinguish master failures from MR failures (merged result pipelines).

## MVC Phase 1: Dogfooding proposal (Enable for GitLab internal projects)

https://gitlab.com/groups/gitlab-org/-/work_items/21181+s

## MVC Phase 2 (Enable for customer projects)

https://gitlab.com/groups/gitlab-org/-/work_items/21225+s

## Post MVC

- **Custom Prompt Per Trigger** - See https://gitlab.com/gitlab-org/gitlab/-/issues/588753#note_3109366453
- **Scope Filtering** - Enable filtering by pipeline source (merge request, scheduled, manual, etc.)
- **UI configuration** for trigger filters
- **Observability** - Any auto-triggering at scale must come with a real-time dashboard showing trigger and retry volume (e.g. "X retries in the last 5 minutes")
- **Cost / Credit Safeguards** - Auto-fixing and retry flows must include controls to prevent runaway credit consumption, both at the individual invocation level and across concurrent users triggering the same flow.
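
To make the trigger-noise, runaway-MR, and duplicate-action concerns from Problem to Solve concrete, here is a minimal Python sketch of a gate that a trigger could pass each pipeline event through. It is illustrative only and does not reflect the actual implementation. `PipelineEvent`, `TriggerGate`, the field names, the status strings, the `(ref, sha)` dedup key, and the `reuse_existing_mr` action are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class PipelineEvent:
    """Hypothetical, simplified view of a pipeline event (not a real payload schema)."""
    pipeline_id: int
    status: str                        # illustrative values: "created", "success", "failed", ...
    ref: str                           # branch the pipeline ran on
    sha: str                           # commit the pipeline ran against
    open_mr_iid: Optional[int] = None  # set when the branch already has an open MR


class TriggerGate:
    """Decides whether a pipeline event should start the Fix Pipeline flow."""

    def __init__(self) -> None:
        # In-memory only for this sketch; a real deployment would need shared storage.
        self._seen: Set[Tuple[str, str]] = set()

    def decide(self, event: PipelineEvent) -> str:
        # 1. Status filter: ignore every status change except a failure.
        if event.status != "failed":
            return "skip"

        # 2. Idempotency: act at most once per (branch, commit).
        #    Assumption: this key approximates "the same underlying failure".
        key = (event.ref, event.sha)
        if key in self._seen:
            return "skip"
        self._seen.add(key)

        # 3. MR context: never stack a new MR on top of an existing MR branch.
        #    Assumption: the fix is attached to the existing MR instead.
        if event.open_mr_iid is not None:
            return "reuse_existing_mr"
        return "open_new_mr"


if __name__ == "__main__":
    gate = TriggerGate()
    events = [
        PipelineEvent(1, "created", "main", "abc"),
        PipelineEvent(1, "failed", "main", "abc"),
        PipelineEvent(2, "failed", "main", "abc"),  # retry of the same commit
        PipelineEvent(3, "failed", "feature", "def", open_mr_iid=7),
    ]
    for e in events:
        print(e.pipeline_id, e.status, "->", gate.decide(e))
```

The order of the checks is deliberate: the status filter runs first so that the 3+ non-failure events per pipeline are discarded before any state is touched, and the MR-context check runs last so that only deduplicated failures can result in MR activity.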