Draft: Add MergeTrains::RefreshWorker timeout/requeue logic

What does this MR do and why?

This MR adds a 3-minute loop timeout to MergeTrains::RefreshService. Most executions finish in under two minutes, so the limit should rarely be hit. What we gain from bounding the service's run time is that we can introduce a 4-minute TTL on the RefreshWorker that runs the service.

With both of these limits in place, jobs can be executed more frequently. When we add logic for fixing stuck MergeTrains, we'll want to enqueue jobs more often and let the redundant ones be deduplicated.
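
For reviewers who want a picture of what a loop timeout looks like, here is a minimal sketch. It is not the code in this MR: `RefreshLoopSketch`, `LOOP_TIMEOUT_SECONDS`, the injectable `clock`, and the `[processed, remaining]` return value are all hypothetical names chosen for illustration. The idea is that the loop checks a deadline before each unit of work and stops early, leaving the remaining work for a later job.

```ruby
# Hypothetical stand-in for a refresh loop with a time limit.
# None of these names come from the GitLab codebase.
class RefreshLoopSketch
  LOOP_TIMEOUT_SECONDS = 3 * 60 # the 3-minute loop limit from the description

  # items:   work units to process (assumed to be independent of each other)
  # timeout: loop limit in seconds
  # clock:   callable returning monotonic seconds; injectable so it can be faked
  def initialize(items, timeout: LOOP_TIMEOUT_SECONDS,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @items = items
    @timeout = timeout
    @clock = clock
  end

  # Processes items until they run out or the deadline passes.
  # Returns [processed_results, remaining_items]; a caller could requeue
  # a job when remaining_items is not empty.
  def execute
    deadline = @clock.call + @timeout
    processed = []
    queue = @items.dup

    until queue.empty?
      # Check the deadline before starting each unit of work.
      break if @clock.call >= deadline

      processed << yield(queue.shift)
    end

    [processed, queue]
  end
end
```

Because the deadline is only checked between units of work, a single slow unit can still run past the limit. That is one reason to keep a worker-level TTL somewhat longer than the loop timeout, as the 3-minute and 4-minute values here do.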

How to set up and validate locally

  • Check Kibana, where the maximum job durations come in at around 2 minutes
  • Other stuff

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by drew stachon
