Break UpdateMergeRequestsWorker / MergeRequests::RefreshService into separate workers and services
The UpdateMergeRequestsWorker calls MergeRequests::RefreshService#execute, which sounds fine. But MergeRequests::RefreshService#execute does a heck of a lot:
1. Close any MRs under a condition I honestly don't understand 🤔
2. Mark MRs as manually merged.
3. Update existing merge request diffs.
4. Reset MWPS on any updated MRs.
5. Mark pending 'pipeline failed' todos as done.
6. Cache the MR closing issues relationship (from the issue closing pattern) to the DB.
7. Add a comment when the branch is restored or deleted.
8. Send notifications about the push.
9. Mark the MR as WIP if the commit messages match.
10. Execute any hooks.
That's too much! The current worker is very hard to reason about when it causes trouble: see https://gitlab.com/gitlab-org/gitlab-ce/issues/53153 for an example.
We can use the same queue (or queue namespaces) for the separate workers.
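As a rough illustration of the split, here is a minimal sketch in plain Ruby. Everything in it is hypothetical: the module, the worker class names, the queue name, and the `jobs_for` helper are stand-ins invented for this example, not existing GitLab classes, and real workers would go through Sidekiq rather than return strings.

```ruby
# Hypothetical sketch: plain Ruby stand-ins, not GitLab's actual workers.
module RefreshSteps
  QUEUE = 'merge_request_refresh' # assumed shared queue name

  # Base class: every step worker reports the same queue.
  class StepWorker
    def self.queue
      QUEUE
    end
  end

  # One small worker per responsibility, each easy to reason about alone.
  class ResetMergeWhenPipelineSucceedsWorker < StepWorker
    def perform(merge_request_id)
      "reset MWPS for MR #{merge_request_id}"
    end
  end

  class MarkPipelineTodosDoneWorker < StepWorker
    def perform(merge_request_id)
      "marked pipeline todos done for MR #{merge_request_id}"
    end
  end

  WORKERS = [
    ResetMergeWhenPipelineSucceedsWorker,
    MarkPipelineTodosDoneWorker
  ].freeze

  # Stand-in for enqueuing: returns one job description per step.
  def self.jobs_for(merge_request_id)
    WORKERS.map do |worker|
      { queue: worker.queue, worker: worker.name, args: [merge_request_id] }
    end
  end
end
```

With this shape, a failure in one step shows up under that step's worker name instead of inside one large job, while all of the jobs still land on the same queue.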
Note that the big caveat here is performance. At the moment, this service does a lot of caching and memoisation. In https://gitlab.com/gitlab-org/gitlab-ce/issues/53213 I highlighted one way this is actually hurting us - we could just use SQL. However, from a Gitaly perspective, we may need to fetch all the commits each time. We could possibly solve this by only loading the commits once, and passing the concrete information to each new worker (for instance, step 9 needs to know SHAs and messages; steps 2 and 3 only need SHAs; many other steps don't need any commit information).
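The "load the commits once, pass concrete data along" idea could look something like the sketch below. The `Commit` struct, the step names, and `build_payloads` are assumptions made up for illustration; only the field requirements (steps 2 and 3 need SHAs, step 9 needs SHAs and messages, many others need nothing) come from the description above. String keys are used on the assumption that job arguments get serialised as JSON.

```ruby
# Hypothetical sketch: names below are illustrative, not GitLab's API.
Commit = Struct.new(:sha, :message)

# Which commit fields each step needs, per the description above.
STEP_COMMIT_FIELDS = {
  mark_manually_merged: [:sha],      # step 2
  update_diffs: [:sha],              # step 3
  mark_wip: [:sha, :message],        # step 9
  execute_hooks: []                  # needs no commit information
}.freeze

# Builds one plain-data payload per step from commits that were fetched once.
def build_payloads(commits)
  STEP_COMMIT_FIELDS.to_h do |step, fields|
    payload =
      if fields.empty?
        []
      else
        commits.map { |c| fields.to_h { |f| [f.to_s, c.public_send(f)] } }
      end
    [step, payload]
  end
end

commits = [Commit.new('abc123', 'WIP: tweak'), Commit.new('def456', 'Fix bug')]
payloads = build_payloads(commits)
payloads[:update_diffs] # => [{"sha"=>"abc123"}, {"sha"=>"def456"}]
```

Each new worker would then receive only its own payload, so none of them has to go back to Gitaly for the commit list.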