Skip to content

Fix race condition on merge train with better exclusive lock

Shinya Maeda requested to merge better-merge-train-exlusive-lock into master

What does this MR do?

During the merge train is being processed, the process is wrapped by exclusive lock. This exclusive lock is effective on the entire merge train, however, this could prevent from some MRs (especially earlier ones) to be processed properly and it could result in unmerged merge requests.

We fix this problem by making the exclusive lock more granular. Each merge request is wrapped by exclusive lock instead of entire merge train. This allows the system to let multiple processes to work on a single train, but still prevents concurrent process at the same merge request to avoid unexpected behavior.

Close https://gitlab.com/gitlab-org/gitlab-ee/issues/12423

Problem to solve

Currently, train processor locks the thread until it's finished refreshing all the following merge requests in a train.

This is safe, however, could cause a race condition in the following case:

RefreshMergeRequestsService: PID-1
Case: MR3 pipeline finished

               v
| MR1 | MR2 | MR3 | MR4 | MR5 |

RefreshMergeRequestsService: PID-2
Case: MR1 pipeline finished

   v <= should be processed
| MR1 | MR2 | MR3 | MR4 | MR5 |

We should process MR1 in PID-2 in the above case, otherwise, MR1 won't be merged.

Approach

Based on https://gitlab.slack.com/archives/D5RJFGJSD/p1561990650021300 by @ayufan

def execute(merge_request)
  Redis.push_to_queue(queue_key, merge_request.id)

  process_queue
end

def process_queue
  in_lock(lock_key) do
    process_all
  end

  # if we have anything added new to the queue
  # force refresh of the queue asynchronously
  # it will only be executed after in_lock succeeds,
  # otherwise it should raise exception
  RefreshMergeRequestTrainWorker.perform_async in anything_in_queue?
end

def process_all
  merge_request_ids = Redis.pop_all_from_queue(queue_key)
  merge_request_id = merge_request_ids.find_the_first_in_train

  proces_all_following(merge_request_id)
end

def anything_in_queue?
  Redis.len(queue_key) > 0
end

def queue_key
  'merge_train:project_id:target_branch'
end

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Shinya Maeda

Merge request reports