Draft: Wire ApprovalRulesSyncCoordinator into create/push/reset flows
Wires MergeRequests::ApprovalRulesSyncCoordinator (from !232648 (merged)) into the call sites that need sync tracking: create, push refresh, reset approvals. This is the MR that actually fixes the race condition; !232648 (merged) only added the infrastructure.
MR 2 of 2 split from !229090 (closed) per reviewer feedback. Depends on !232648 (merged).
Gated behind the deterministic_approval_rules_sync feature flag. When disabled, all code paths behave exactly as before (TTL-based temporarily_unapprove!, perform_in(10.seconds), no sync checks).
Problem
When two rapid pushes happen on the same branch, Push B's SyncCodeOwnerApprovalRules deletes and recreates approval rules. If Push A's MergeRequestResetApprovalsWorker runs during this sync, it sees a partial rule state and misses approval resets. The existing mitigations were probabilistic:
temporarily_unapprove!used a 15-second TTL Redis key — unreliable under loadperform_in(10.seconds)delay on the reset worker — a blind guess that the sync finishes in time
Solution
CreateService#after_create: callsstart_code_owner_sync!before the worker is enqueued so the MR is marked as temporarily unapproved without a gap between create and the async worker run.Refresh::ApprovalService#update_approvers_for_source_branch_merge_requests: callscoordinator.sync_code_owners!(synchronous start → execute → finish → enqueue any pending resets).Refresh::ApprovalService#reset_approvals_for_merge_requests: callscoordinator.schedule_reset_approvals!, which marks the reset as pending internally and enqueues the worker immediately if no code owner sync is in progress, or defers it to be drained when the sync completes.ResetApprovalsService: re-checks per-MR on entry. If a code-owner sync started between enqueue and execution, defers viacoordinator.schedule_reset_approvals!and skips this MR (it'll be drained when the sync finishes). Callsfinish_reset_approvals!after normal resets.SyncCodeOwnerApprovalRulesWorker: routes via FF check alone — coordinator path when FF is on, legacySyncCodeOwnerApprovalRulesservice when FF is off.
No artificial 10-second delay on the reset worker — the coordinator guarantees ordering.
Entry-check in the reset service
Between when a reset worker is enqueued (after the coordinator confirms no sync is in progress) and when Sidekiq actually executes it, a new code-owner sync can start. Without a check, the worker would run against the MR's half-built rules.
The service now re-checks coordinator.code_owner_sync_in_progress? per-MR before resetting. If a sync is in flight, the worker defers its own reset via coordinator.schedule_reset_approvals! and moves on to the next MR. The re-queued params get drained when the sync completes. Narrows the race window from "Sidekiq pickup delay" to "microseconds inside perform".
Reviewer feedback applied
- Worker routes via FF check alone (no
finish_syncparam needed). Addresses !229090 (closed)#note_2823013495. mark_reset_pending!folded intoschedule_reset_approvals!as an implementation detail — callers no longer need to call it separately.- Legacy
temporarily_unapprove!always set alongside coordinator keys for rollback safety (FF disable mid-flight still blocks merges via the legacy key).
Known limitations
The entry-check narrows but does not fully eliminate the race — a sync can still start between the check and the .delete_all that removes approvals. In practice this window is microseconds and the impact is bounded (rules only meaningfully change when CODEOWNERS itself is edited). Full mutual exclusion requires a per-MR serial-queue primitive; tracked as a follow-up.
Split from !229090 (closed). Relates to #373846.
Feature flag
deterministic_approval_rules_sync (introduced in !232648 (merged)).
Test plan
- Updated specs for
CreateService,ApprovalService,ResetApprovalsService,SyncCodeOwnerApprovalRules. - New spec case in
ResetApprovalsService: when a code-owner sync is in progress, the reset is deferred via the coordinator and approvals are not modified. - Updated spec for
SyncCodeOwnerApprovalRulesWorkerverifying the FF-on path delegates to the coordinator. - Updated spec for
MergeRequestResetApprovalsWorkerverifying it executes regardless of sync state (coordinator handles ordering). - Manually verified on GDK: FF off reproduces the race condition (worker ignores concurrent sync); FF on blocks the reset worker until sync completes via the coordinator.
- Live rails runner e2e tests against local GDK: schedule defers correctly when sync in progress; pending list drains on sync completion;
sync_code_owners!ensure block drains automatically.
EE: true