Stuck Merge Train because of GitError
Background
In the previous issue, we found a bug that caused merge trains to get stuck. It was triggered by an internal Gitlab::Git::CommandError raised in Branches::DeleteService, and we fixed it by rescuing that error.
Summary
On 2020-09-23, the merge train got stuck again. (internal link)
While investigating the logs and Sentry errors, I found the following:
- Example merge request iid/id: 63122/71774644
- Example log: https://log.gprd.gitlab.net/goto/2034c97731c1b0b7803e72b7408f8186
- Example Sentry error: https://sentry.gitlab.net/gitlab/gitlabcom/issues/1789286/events/34044239/
- 10:58:15.631 - AutoMergeProcessWorker / 71774644 was added to merge train
- 10:58:19.828 - AutoMergeProcessWorker was done and the pipeline is running
- 11:06:28.235 - The pipeline succeeded, and enqueued AutoMergeProcessWorker. AutoMergeProcessWorker tried to merge 71774644
- 11:07:08.019 -
  - AutoMergeProcessWorker called MergeTrains::RefreshMergeRequestsService -> MergeTrains::RefreshMergeRequestService
  - In MergeTrains::RefreshMergeRequestService#merge!, MergeRequests::MergeService succeeded, but merge_train.finish_merge! did not.
  - 71774644 was successfully merged, but cleanup_ref failed.
  - Then, MergeTrains::RefreshMergeRequestsService did not continue its job and subsequent merge requests were stuck.
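The failure mode in the timeline can be illustrated with a minimal, self-contained Ruby sketch. Every name in it (FakeGitError, FakeMergeRequest, refresh_train) is a hypothetical stand-in and not the actual GitLab code. It only assumes what the timeline states: the merge succeeds, the ref cleanup raises afterwards, and nothing rescues the error while the train is being processed.

```ruby
# Hypothetical stand-in for Gitlab::Git::Repository::GitError.
class FakeGitError < StandardError; end

# Hypothetical model of a merge request on the train. The merge itself
# succeeds, but the follow-up ref cleanup may raise.
FakeMergeRequest = Struct.new(:id, :merged, :cleanup_fails) do
  def merge!
    self.merged = true
    raise FakeGitError, "cleanup_ref failed for #{id}" if cleanup_fails
  end
end

# Processes the train in order with no error handling, so the first
# exception aborts the loop and later merge requests are never visited.
def refresh_train(train)
  train.each(&:merge!)
end

train = [
  FakeMergeRequest.new(1, false, true),  # merged, but cleanup fails
  FakeMergeRequest.new(2, false, false)  # would merge cleanly
]

begin
  refresh_train(train)
rescue FakeGitError => e
  warn "train stopped: #{e.message}"
end

puts train.map { |mr| [mr.id, mr.merged] }.inspect
# prints [[1, true], [2, false]]
```

The first merge request ends up merged even though the error was raised, while the second one is never processed, which matches the stuck state described above.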
Other errors:
- 71868572 - https://sentry.gitlab.net/gitlab/gitlabcom/issues/1789286/events/34050848/
- 71641115 - https://sentry.gitlab.net/gitlab/gitlabcom/issues/1789286/events/34056676/
- 71871151 - https://sentry.gitlab.net/gitlab/gitlabcom/issues/1789286/events/34056858/
Solution Proposal
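We can rescue Gitlab::Git::Repository::GitError in MergeTrains::RefreshMergeRequestsService, or somewhere else suitable, so that a failed ref cleanup on one merge request does not stop the train from processing the subsequent ones.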
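A minimal sketch of this proposal, under stated assumptions: SketchGitError, SketchMergeRequest and refresh_train_with_rescue are hypothetical names standing in for the real classes, and the rescue is placed per merge request inside the refresh loop, which is only one of the possible placements.

```ruby
# Hypothetical stand-in for Gitlab::Git::Repository::GitError.
class SketchGitError < StandardError; end

# Hypothetical merge request: merging always succeeds, ref cleanup may fail.
SketchMergeRequest = Struct.new(:id, :merged, :cleanup_fails) do
  def merge!
    self.merged = true
    raise SketchGitError, "cleanup_ref failed for #{id}" if cleanup_fails
  end
end

# Refreshes each merge request in order. A Git-level error is rescued per
# merge request and recorded, so the loop continues with the next one.
def refresh_train_with_rescue(train)
  failures = []
  train.each do |mr|
    begin
      mr.merge!
    rescue SketchGitError => e
      failures << { id: mr.id, error: e.message }
    end
  end
  failures
end

train = [
  SketchMergeRequest.new(1, false, true),
  SketchMergeRequest.new(2, false, false)
]

failures = refresh_train_with_rescue(train)
puts train.map(&:merged).inspect          # prints [true, true]
puts failures.map { |f| f[:id] }.inspect  # prints [1]
```

In the real service the rescued error would presumably also be reported, for example to Sentry, instead of only being collected in a list. That part is left out of the sketch.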