Use state machine in Merge Train to avoid race conditions
What does this MR do?
This MR fixes a race condition on merge trains. Here is the details:
For example, there is a merge request on a merge train. A pipeline for merge train has finished just now and it's about to be merged. Here is what's happening in an internal process.
- When a pipeline succeeded, it notifies the event to auto merge facility, which triggers
AutoMergeProcessWorkersidekiq job. It runs a bunch of validation processes before executing merge inMergeTrains::RefreshMergeRequestService. - When the service merges the merge request (which executes
MergeRequests::MergeService), the following processes happen:
- 2-1. It runs a bunch of validations and triggers
git-mergevia gitaly - 2-2. Marks the
merge_requestobject asmerged, which marksmerge_trainobject asmergedstate as well. - 2-3. When
merge_trainobject is marked asmerged, it deletes a git-reference of merge train (refs/merge-requests/:iid/train)
Meanwhile, git-merge invokes PostReceive sidekiq job, which executes the following processes asynchronously:
- 2'-1. Update merge requests in
UpdateMergeRequestsWorker - 2'-2. It checks the status of merge train in
MergeTrains::CheckStatusService. Specifically, it checks if the new revision was introduced by merge train. If it doesn't exist, the service marks the merge train asstalestate as there was an unexpected commit was pushed from out of merge train cycle.
The problem is that 2'-2's result could be different if 2-2 has not finished yet. We expect that 2-2 always finishes before 2'-2 starts but it's not guaranteed due to the asynchronous processes. This causes an unexpected behavior such as #196133 (closed).
Closes #196133 (closed)
Solution & What's changed here
- We introduce a
mergingstatus to indicate that the merge train is nearly complete state. It's marked before the system triggersMergeRequests::MergeService. This ensures that 2'-2. returns always the same result. - We use
state_machineto ensure that each event is handled properly with a proper state validation. - Persisting
merged_atanddurationfor the future iterations #36146 (closed). - Remove
merge_trains_efficient_refreshfeature flag that we've confirmed that the feature id deemed stable. - Remove
merge_train_new_stale_checkfeature flag that we've confirmed that the feature id deemed stable.
Does this MR meet the acceptance criteria?
Conformity
-
Changelog entry -
Documentation (if required) -
Code review guidelines -
Merge request performance guidelines -
Style guides -
Database guides -
Separation of EE specific content
Availability and Testing
-
Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process. - [-] Tested in all supported browsers
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- [-] Label as security and @ mention
@gitlab-com/gl-security/appsec - [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- [-] Security reports checked/validated by a reviewer from the AppSec team