Re-enqueue auto-merge worker for unchecked MRs
What does this MR do and why?
Related to #594868 — auto-merge fails for concurrent MRs targeting the same branch.
When multiple MRs with auto-merge enabled target the same branch and their pipelines succeed concurrently, merging one MR triggers mark_as_unchecked on all sibling MRs. Nothing re-triggers the mergeability check for those siblings, so they remain stuck in the unchecked state with auto_merge_enabled=true.
Changes
- Added enqueue_auto_merge_for_unchecked in RefreshService that filters already-loaded MRs in Ruby (no extra DB query) for those with auto_merge_enabled=true and enqueues AutoMergeProcessWorker
- Uses fixed 3-second staggered delays (0s–57s) so workers from the same or overlapping push events don't synchronize; each MR gets a delay of index * 3 seconds
- Capped at 20 MRs per push event to bound queue volume; remaining MRs are picked up in subsequent rounds as merges trigger new pushes
- Gated behind the auto_merge_on_mark_as_unchecked feature flag with a project actor
How the fix works
When a push to a target branch (e.g. master) occurs, UpdateMergeRequestsWorker calls RefreshService, which runs batch_mark_as_unchecked on all open MRs targeting that branch. After this, enqueue_auto_merge_for_unchecked identifies MRs with auto_merge_enabled=true from the already-loaded set and enqueues AutoMergeProcessWorker with staggered delays.
The worker calls mergeable? which triggers MergeabilityCheckService, transitioning the MR through checking → can_be_merged, at which point the existing can_be_merged callback processes the auto-merge.
Worker deduplication (deduplicate :until_executed, if_deduplicated: :reschedule_once) prevents duplicate processing per MR.
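The enqueue step described in this section can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this MR: the MergeRequest struct, FakeAutoMergeProcessWorker, the flag_enabled argument, and the constant names are all hypothetical stand-ins so the snippet runs outside GitLab. Only the 3-second interval, the 20-MR cap, and the in-memory filter on auto_merge_enabled come from the description.

```ruby
# Hypothetical stand-in for the real model; only the fields the sketch needs.
MergeRequest = Struct.new(:id, :auto_merge_enabled, keyword_init: true)

# Hypothetical recorder standing in for AutoMergeProcessWorker.
# It mimics the shape of Sidekiq's perform_in(interval, *args).
class FakeAutoMergeProcessWorker
  @jobs = []

  class << self
    attr_reader :jobs

    def perform_in(delay_seconds, merge_request_id)
      @jobs << { delay: delay_seconds, merge_request_id: merge_request_id }
    end
  end
end

STAGGER_INTERVAL_SECONDS = 3 # fixed interval from the description
MAX_MRS_PER_PUSH = 20        # cap from the description

# flag_enabled stands in for the feature flag check with a project actor.
def enqueue_auto_merge_for_unchecked(merge_requests, flag_enabled:, worker: FakeAutoMergeProcessWorker)
  return unless flag_enabled

  merge_requests
    .select(&:auto_merge_enabled) # filter the already-loaded set in memory
    .first(MAX_MRS_PER_PUSH)      # bound queue volume per push event
    .each_with_index do |mr, index|
      worker.perform_in(index * STAGGER_INTERVAL_SECONDS, mr.id)
    end
end

# Usage: 30 open MRs, every fifth one without auto-merge enabled.
mrs = (1..30).map { |id| MergeRequest.new(id: id, auto_merge_enabled: id % 5 != 0) }
enqueue_auto_merge_for_unchecked(mrs, flag_enabled: true)
puts FakeAutoMergeProcessWorker.jobs.size
```

With this sample data, 24 MRs have auto-merge enabled, so the cap applies and 20 jobs are recorded; the other four would be picked up on a later push.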
Why staggered delays?
Each merge into a target branch triggers mark_as_unchecked for all sibling MRs, which re-enqueues a worker for each of them. Without staggering, each of N sequential merges fans out to up to N siblings, producing O(N²) workers that start almost simultaneously. A fixed 3-second interval spaces the workers evenly and makes their timing predictable. The 20-MR cap limits queue volume per push event, while the remaining MRs drain in subsequent rounds.
| MR position in batch | Delay range |
|---|---|
| 1–5 | 0s–12s |
| 6–10 | 15s–27s |
| 11–15 | 30s–42s |
| 16–20 | 45s–57s |
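The delay ranges in the table follow directly from index * 3 with a cap of 20. The short sketch below recomputes them; the helper names stagger_delays and delay_ranges are illustrative and do not appear in the MR.

```ruby
# Delay in seconds for each MR in a capped batch (index starts at 0).
def stagger_delays(count, interval: 3, cap: 20)
  [count, cap].min.times.map { |index| index * interval }
end

# Group delays into rows of five positions (1-based), mirroring the table.
def delay_ranges(delays, group_size: 5)
  delays.each_slice(group_size).with_index.map do |slice, group|
    first_position = group * group_size + 1
    last_position = first_position + slice.size - 1
    ["#{first_position}-#{last_position}", "#{slice.first}s-#{slice.last}s"]
  end
end

delay_ranges(stagger_delays(20)).each do |positions, range|
  puts "#{positions}: #{range}"
end
```

Passing a count above 20 yields the same schedule, since anything past the cap is left for a later push event.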
Why service-layer only (no model callback)
Side effects are kept in RefreshService rather than an after_transition model callback because:
- The service layer has the full batch of affected MRs already loaded in memory, avoiding extra DB queries
- Avoids surprising background job fan-out from state machine transitions
- RefreshService is the code path that handles the concurrent merge scenario (triggered by UpdateMergeRequestsWorker on every push)
Feature flag
- Name: auto_merge_on_mark_as_unchecked
- Type: gitlab_com_derisk
- Rollout issue: #594893
MR acceptance checklist
- Tests added for staggered delays and batch limit
- Feature flag YAML created
- Rollout issue created: #594893