Skip to content

Use state machine in Merge Train to avoid race conditions

Shinya Maeda requested to merge fix-merge-train-race-condition into master

What does this MR do?

This MR fixes a race condition on merge trains. Here is the details:

For example, there is a merge request on a merge train. A pipeline for merge train has finished just now and it's about to be merged. Here is what's happening in an internal process.

  1. When a pipeline succeeded, it notifies the event to auto merge facility, which triggers AutoMergeProcessWorker sidekiq job. It runs a bunch of validation processes before executing merge in MergeTrains::RefreshMergeRequestService.
  2. When the service merges the merge request (which executes MergeRequests::MergeService), the following processes happen:

Meanwhile, git-merge invokes PostReceive sidekiq job, which executes the following processes asynchronously:

  • 2'-1. Update merge requests in UpdateMergeRequestsWorker
  • 2'-2. It checks the status of merge train in MergeTrains::CheckStatusService. Specifically, it checks if the new revision was introduced by merge train. If it doesn't exist, the service marks the merge train as stale state as there was an unexpected commit was pushed from out of merge train cycle.

The problem is that 2'-2's result could be different if 2-2 has not finished yet. We expect that 2-2 always finishes before 2'-2 starts but it's not guaranteed due to the asynchronous processes. This causes an unexpected behavior such as #196133 (closed).

Closes #196133 (closed)

Solution & What's changed here

  • We introduce a merging status to indicate that the merge train is nearly complete state. It's marked before the system triggers MergeRequests::MergeService. This ensures that 2'-2. returns always the same result.
  • We use state_machine to ensure that each event is handled properly with a proper state validation.
  • Persisting merged_at and duration for the future iterations #36146 (closed).
  • Remove merge_trains_efficient_refresh feature flag that we've confirmed that the feature id deemed stable.
  • Remove merge_train_new_stale_check feature flag that we've confirmed that the feature id deemed stable.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Shinya Maeda

Merge request reports