What does this MR do and why?

Related to #382065 (closed) - Closing a merge train MR doesn't cancel pipelines that contain child pipelines

This code improves the reliability of canceling CI/CD pipelines by adding better error handling and retry logic.

The main changes include:

Before this MR: We retry_lock around each pipeline within the service. The retry_lock wraps a transaction around each pipeline's job update work, so if 3 jobs fail to update, it rolls back the whole pipeline and stops executing the service. This could leave child pipelines un-canceled.
After this MR: We retry_lock around each job batch. This shortens the transaction and lets each batch retry on its own if another job update is in progress.
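
To illustrate the per-batch retry idea described above, here is a minimal, self-contained sketch. It is not the code in this MR: `Job`, `with_retries`, `cancel_in_batches`, and the local `StaleObjectError` class are hypothetical stand-ins, the retry count and batch size are arbitrary, and there is no real database transaction or rollback involved.

```ruby
# Hypothetical stand-in for an optimistic-locking conflict error.
class StaleObjectError < StandardError; end

# Hypothetical job record. `failures_left` simulates how many times a
# concurrent update will make the cancel attempt fail before it succeeds.
Job = Struct.new(:id, :status, :failures_left) do
  def cancel!
    if failures_left > 0
      self.failures_left -= 1
      raise StaleObjectError, "job #{id} was updated concurrently"
    end
    self.status = :canceled
  end
end

# Hypothetical retry helper (an assumption, not GitLab's retry_lock):
# re-runs the block on a stale-object conflict, up to `retries` attempts,
# then re-raises the error.
def with_retries(retries: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StaleObjectError
    retry if attempts < retries
    raise
  end
end

# Wraps the retry around each batch rather than the whole pipeline, so a
# conflict only causes the affected batch to be retried.
def cancel_in_batches(jobs, batch_size: 2)
  jobs.each_slice(batch_size) do |batch|
    with_retries do
      batch.each { |job| job.cancel! unless job.status == :canceled }
    end
  end
end

jobs = [
  Job.new(1, :running, 0),
  Job.new(2, :running, 1), # conflicts once, then succeeds on retry
  Job.new(3, :running, 0)
]
cancel_in_batches(jobs)
```

In this sketch the conflict on job 2 only causes the first batch to be retried; the second batch runs independently of it.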

These changes make the pipeline cancellation process more robust by handling race conditions and conflicts that can occur when multiple users or automated systems try to update the same jobs simultaneously. The result is fewer failed cancellation attempts and more reliable cleanup of running build processes.

Logs

We see that Ci::CancelPipelineService can trigger a StaleObjectError.

Edited Nov 21, 2025 by Allison Browne

Improve retry mechanism when objects are stale for pipeline cancellation
