Skip to content

Add MergeTrains::FixStuckCarService and add check to RefreshWorker

drew stachon requested to merge alternative-stuck-merge-train-service into master

What does this MR do and why?

This MR adds a service for fixing broken state combinations of MergeTrains::Car and MergeRequest records. Rarely, but unfortunately, the background processes die during a merge and we're left with states that don't make sense. For instance, a Car sitting in merging while the Mr is already merged. All the combinations of these states are laid out in the service class spec.

A deep technical discussion of this problem and how it seems to occur can be found in #389044

How to set up and validate locally

Unfortunately, validating this locally is fairly cumbersome, and involves dropping Process.kill(KILL) into the service code and hoping to leave the MergeRequests in the same state as some we've seen in production.

Instead, what I've done is go into our actual production database for GitLab SaaS and look at all the MergeTrain::Car records with state merging, and write test cases here for what I found. I've written extensive comments documenting each handled case, and why we're handling it in exactly that way.

Not all the cases are currently handled, just the ones we're confident about. Also it's worth nothing that the overwhelming majority of cases on GitLab SaaS are Car#merging? and MergeRequest#merged?, which is handled here.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #389044

Edited by drew stachon

Merge request reports