Add FF to eagerly resume jobs
What does this MR do and why?
As described in #579350, there is an edge case: when the Sidekiq queue is massively backlogged, the number of concurrently running jobs for a worker can far exceed its configured concurrency limit.
With this change, ResumeWorker resumes one batch of jobs at a time by default. Each execution of ResumeWorker resumes at most as many jobs as the concurrency limit.
The main purpose of this MR is to protect self-managed and Dedicated instances from the edge case of "resumed jobs can exceed concurrency limit when Sidekiq queue is backlogged", which could put a lot of pressure on the database.
With the FF concurrency_limit_eager_resume_processing enabled, ResumeWorker tries to resume as many batches of jobs as possible within 5 minutes. This FF will be enabled for GitLab.com, where the performance of resuming jobs is especially important.
Self-managed and Dedicated instances go back to resuming one batch of jobs per ResumeWorker execution, which was the behavior before !206836 (merged).
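The difference between the two modes can be sketched as follows. This is a simplified, hypothetical model and not the actual ResumeWorker code: `resume_jobs`, the in-memory `queue` array, the `eager` argument (standing in for the FF check) and the injectable `clock` are all assumptions made for illustration.

```ruby
# Hypothetical sketch of the two resume modes; not GitLab's implementation.
# `queue` is a plain Array standing in for the worker's deferred jobs.

TIME_BUDGET_SECONDS = 5 * 60 # the "5 minutes" budget described above

MONOTONIC_CLOCK = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

# Resumes jobs in batches of at most `limit` and returns how many were resumed.
# eager: false -> exactly one batch per call (the default behavior).
# eager: true  -> keep resuming batches until the queue is empty or the
#                 time budget is used up (the FF-enabled behavior).
def resume_jobs(queue, limit:, eager:, budget: TIME_BUDGET_SECONDS, clock: MONOTONIC_CLOCK)
  started_at = clock.call
  resumed = 0

  loop do
    batch = queue.shift(limit) # removes and returns at most `limit` jobs
    resumed += batch.size

    break unless eager                         # default: stop after one batch
    break if queue.empty?                      # nothing left to resume
    break if clock.call - started_at >= budget # time budget exhausted
  end

  resumed
end

resume_jobs(Array.new(35), limit: 10, eager: false) # => 10
resume_jobs(Array.new(35), limit: 10, eager: true)  # => 35
```

In this model, a single non-eager execution never resumes more jobs than the limit, while the eager mode trades that bound for throughput.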
References
How to set up and validate locally
1. Apply this diff:

       diff --git a/app/workers/chaos/sleep_worker.rb b/app/workers/chaos/sleep_worker.rb
       index 43b851a9f264..41403388ae2e 100644
       --- a/app/workers/chaos/sleep_worker.rb
       +++ b/app/workers/chaos/sleep_worker.rb
       @@ -9,6 +9,8 @@ class SleepWorker # rubocop:disable Scalability/IdempotentWorker
            sidekiq_options retry: 3
            include ChaosQueue
        
       +    concurrency_limit -> { 10 }
       +
            def perform(duration_s)
              Gitlab::Chaos.sleep(duration_s)
            end

2. Schedule a lot of jobs:

       while true
         Chaos::SleepWorker.perform_async(1)
       end

3. On a separate console, keep checking the queue size:

       Gitlab::SidekiqMiddleware::ConcurrencyLimit::ConcurrencyLimitService.new("Chaos::SleepWorker").queue_size

4. Once there are enough jobs in the queue, stop the loop in step 2.

5. Check that the queue size decreases slowly after a while (this may take up to 1 minute because ResumeWorker is a per-minute cron job).
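To make the final check less manual, the queue size can be sampled at a fixed interval and tested for a downward trend. The helpers below are a hypothetical convenience sketch: `sample_queue_size` and `draining?` are illustrative names, not part of GitLab. Only the `ConcurrencyLimitService#queue_size` call in the trailing comments comes from the steps above.

```ruby
# Hypothetical helpers for watching the queue drain; not part of GitLab.

# Calls the given block `samples` times, sleeping `interval` seconds
# between calls, and returns the collected values.
def sample_queue_size(samples:, interval:)
  Array.new(samples) do |i|
    sleep(interval) if i.positive?
    yield
  end
end

# True when the sizes never increase and end lower than they started.
def draining?(sizes)
  sizes.each_cons(2).all? { |a, b| b <= a } && sizes.last < sizes.first
end

# In a Rails console, after stopping the scheduling loop:
#
#   service = Gitlab::SidekiqMiddleware::ConcurrencyLimit::ConcurrencyLimitService.new("Chaos::SleepWorker")
#   sizes = sample_queue_size(samples: 5, interval: 60) { service.queue_size }
#   draining?(sizes)
```

To compare against the eager behavior, the flag can presumably be toggled in the Rails console with the usual `Feature.enable(:concurrency_limit_eager_resume_processing)` and `Feature.disable(...)` calls before sampling again.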
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.