Optimize entries_since_commit to use lightweight FindChangedPaths RPC

Summary

Optimizes the MergeRequestResetApprovalsWorker by replacing the expensive CommitDiff Gitaly RPC with the lightweight FindChangedPaths Gitaly RPC when finding code owner entries that changed between consecutive MR diff versions.

Problem

The code_owner_approver_ids_entries_since_commit step is the slowest step in the worker:

| Percentile | Duration |
| --- | --- |
| P90 | 1.802s |
| P95 | 4.636s |
| P99 | 12.017s |

This is caused by Gitlab::CodeOwners.entries_since_merge_request_commit calling CompareService, which makes a Gitaly CommitDiff RPC to compute the full diff (including patch content) between two commits. The cost of that RPC scales with repository size and diff complexity.

Solution

Use Compare#changed_paths (backed by the FindChangedPaths Gitaly RPC) instead of Compare#modified_paths (backed by the CommitDiff Gitaly RPC). The key differences:

| | CommitDiff (old) | FindChangedPaths (new) |
| --- | --- | --- |
| Data returned | Full patch content + metadata | Path metadata only |
| Wire cost | Heavy; scales with diff size | Lightweight; fixed per file |
| Rename detection | Implicit via from/to paths | Explicit (find_renames: true) |

Both RPCs produce identical path sets. This was verified locally across 159 comparisons (rename, delete, modify, and no-change scenarios) with 0 mismatches.
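The parity check described above can be sketched roughly as follows. This is a minimal illustration, not the script that was actually run: `path_mismatches` is a hypothetical helper, and it assumes both path sources have already been reduced to plain arrays of path strings.

```ruby
require 'set'

# Hypothetical helper: returns the paths on which the two sources disagree.
# An empty result means both sources reported the same set of paths.
# Assumption: each argument is a plain array of path strings.
def path_mismatches(commit_diff_paths, find_changed_paths)
  old_set = commit_diff_paths.to_set
  new_set = find_changed_paths.to_set

  # Symmetric difference: paths present in exactly one of the two sets.
  (old_set ^ new_set).to_a.sort
end

# Illustrative inputs only; these are not the real comparison data.
scenarios = {
  rename: [%w[old_name.rb new_name.rb], %w[new_name.rb old_name.rb]],
  delete: [%w[removed.rb], %w[removed.rb]],
  modify: [%w[app/models/user.rb], %w[app/models/user.rb]],
  no_change: [[], []]
}

scenarios.each do |name, (old_paths, new_paths)|
  puts "#{name}: #{path_mismatches(old_paths, new_paths).length} mismatches"
end
```

Comparing sets rather than arrays keeps the check insensitive to ordering and duplicates, which matters if the two RPCs return paths in a different order.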

The change falls back to the CommitDiff-based comparison when the feature flag is disabled.

Local benchmarks (GDK, MR with 9 modified paths)

| | Avg | P50 | Min | Max |
| --- | --- | --- | --- | --- |
| CommitDiff (old) | 67.47ms | 57.74ms | 50.94ms | 156.01ms |
| FindChangedPaths (new) | 24.46ms | 21.95ms | 14.01ms | 50.46ms |
| Speedup | 2.76x | 2.63x | | |

The production impact is expected to be significantly larger, since the cost of the CommitDiff RPC scales with repo/diff size while FindChangedPaths returns only metadata.
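For reference, the speedup figures in the benchmark table are simply the old duration divided by the new one. A small sketch that reproduces the 2.76x and 2.63x values from the Avg and P50 columns (the constant and method names here are illustrative only):

```ruby
# Timings in milliseconds, copied from the Avg and P50 columns of the local benchmark.
COMMIT_DIFF_MS = { avg: 67.47, p50: 57.74 }.freeze
FIND_CHANGED_PATHS_MS = { avg: 24.46, p50: 21.95 }.freeze

# Speedup is the old duration divided by the new one, rounded to two decimals.
def speedup(old_ms, new_ms)
  (old_ms / new_ms).round(2)
end

COMMIT_DIFF_MS.each_key do |stat|
  ratio = speedup(COMMIT_DIFF_MS[stat], FIND_CHANGED_PATHS_MS[stat])
  puts format('%-4s %.2fx', stat, ratio)
end
```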

Feature flag

This change is gated behind the optimized_code_owner_entries_since_commit feature flag (gitlab_com_derisk). When disabled, the existing CommitDiff-based comparison is used.
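The gating described above can be sketched as below. This is a simplified, self-contained illustration rather than the code in this merge request: `FakeFeature`, `FakeCompare`, and `paths_since_commit` are hypothetical stand-ins so the example runs outside of GitLab, and only the flag name and the two `Compare` method names come from this description.

```ruby
# Hypothetical stand-in for a feature flag store, just enough to run the sketch.
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name] = true
  end

  def self.disable(name)
    @enabled[name] = false
  end

  def self.enabled?(name)
    @enabled.fetch(name, false)
  end
end

# Hypothetical stand-in for a Compare object that exposes both path sources.
FakeCompare = Struct.new(:modified_paths, :changed_paths)

# Picks the lightweight path source when the flag is on, otherwise the existing one.
def paths_since_commit(compare)
  if FakeFeature.enabled?(:optimized_code_owner_entries_since_commit)
    compare.changed_paths   # FindChangedPaths-backed (new)
  else
    compare.modified_paths  # CommitDiff-backed (existing behaviour)
  end
end

compare = FakeCompare.new(%w[from_commit_diff.rb], %w[from_find_changed_paths.rb])
puts paths_since_commit(compare).inspect # flag off by default: existing path source

FakeFeature.enable(:optimized_code_owner_entries_since_commit)
puts paths_since_commit(compare).inspect # flag on: lightweight path source
```

Keeping the branch in one small method means the flag can be removed later by deleting the `else` arm without touching callers.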

Performance data source

#579591 (comment 3111867536)

Edited by Marc Shaw
