Fix code owner approval reset race condition

Summary

Fixes a race condition where concurrent diff creation causes previous_diff to shift, making entries_since_merge_request_commit compare the wrong commit range and return empty results — preventing selective code owner approval removal.

Problem

When a push happens to an MR's source branch, previous_diff (a positional query: ORDER BY id DESC OFFSET 1 scoped to diff_type: :regular) is used to determine which code owner files changed since the last push. A concurrent process that creates an additional regular MergeRequestDiff record (e.g., overlapping RefreshService.reload_diff() calls from concurrent UpdateMergeRequestsWorker jobs) can shift what previous_diff returns.

Note: ReloadMergeHeadDiffService creates merge_head diffs (diff_type = 2), which are excluded by the -> { regular } scope on the merge_request_diffs association. It cannot cause this race condition.

Example: 3 diffs — A, B, C on the same MR:

  1. We are on Diff A.
  2. Diff B gets generated (push).
  3. Reset approval worker gets enqueued.
  4. Diff C gets generated (concurrent process).
  5. Reset approval worker executes: previous_diff returns B (not A) → compares C→B instead of C→A → misses CODEOWNERS changes → approval survives when it should have been removed.

Solution

Capture push.oldrev at enqueue time and pass it through to ResetApprovalsService. The service uses this stable SHA instead of querying previous_diff at runtime, making it immune to concurrent diff state changes.

A resolve_previous_diff_head_sha method validates the oldrev against the MR's known merge_request_diffs.head_commit_sha values before using it. If the oldrev doesn't match any known diff (e.g., unexpected state), a warning is logged and the service falls back to the existing previous_diff.head_commit_sha behavior. Both the enqueue path and the resolve path are gated behind the feature flag for safe rollback.

Key changes:

  • ApprovalService passes push.oldrev through the worker to ResetApprovalsService (feature-flag gated)
  • ResetApprovalsService#resolve_previous_diff_head_sha validates oldrev against merge_request_diffs.head_commit_sha, with warning log and fallback to previous_diff.head_commit_sha (feature-flag gated)
  • The 10-second perform_in delay is kept — it mitigates a separate race condition where SyncCodeOwnerApprovalRules from a concurrent push may not have finished syncing approval rules yet. Removing the delay is planned as a follow-up once we add a proper sync-completion check.

Feature Flag

use_oldrev_for_approval_resetgitlab_com_derisk type, group::code review.

  • When enabled: uses push.oldrev for stable SHA comparison with perform_in(10.seconds) delay
  • When disabled: falls back to existing previous_diff behavior with perform_in(10.seconds) delay

Disabling the flag fully reverts to old behavior, even for already-enqueued jobs that carry an oldrev argument.

Rollout issue: #594940

  • !229090 — Split-out refactor: replaces TTL-based temporarily_unapprove! with deterministic Redis set-based sync tracking (independent improvement)
  • !226061 — Fixes a separate race condition on MR creation where SyncCodeOwnerApprovalRulesWorker runs before the diff exists

Test plan

  • reset_approvals_service_spec.rb: Tests concurrent scenario — without oldrev approvals survive (the bug), with oldrev approvals are correctly removed (the fix). Also tests warning log when oldrev doesn't match any known diff.
  • approval_service_spec.rb: Feature flag specs test both enabled/disabled paths.
  • merge_request_reset_approvals_worker_spec.rb: Verifies the worker passes the parameter through correctly.

Relates to #373846

Edited by Marc Shaw

Merge request reports

Loading