Limit Sidekiq `push_bulk` to a maximum of 1000 jobs in one go
## What does this MR do and why?
This MR does two things:

- Limits the number of jobs enqueued via the `bulk_perform_async` and `bulk_perform_in` methods to a maximum of 1000 jobs in one go. If there are more than 1000 jobs to be enqueued, they are split into batches of 1000 and enqueued batch by batch.
- Now that Sidekiq has been upgraded to version 6.2.2, we can use Sidekiq's support for executing `push_bulk` with different delays in one go, which reduces the number of Redis calls. This feature was added to Sidekiq in https://github.com/mperham/sidekiq/pull/4243.
### Limiting jobs to 1000 in one go
This limit is recommended by the Sidekiq documentation (https://www.rubydoc.info/github/mperham/sidekiq/Sidekiq%2FClient:push_bulk):

> I wouldn't recommend pushing more than 1000 per call but YMMV based on network quality, size of job args, etc. A large number of jobs can cause a bit of Redis command processing latency.
To incorporate this, we wrap the current calls to `Sidekiq::Client.push_bulk` within `in_safe_limit_batches`, which performs `push_bulk` in batches of 1000 each. It returns the `job_ids` of all enqueued jobs in a single array, so to the caller it appears no different from executing a single `Sidekiq::Client.push_bulk`, and the batching remains an internal implementation detail.
### `push_bulk` with different delays in one go
This was introduced to Sidekiq in https://github.com/mperham/sidekiq/pull/4243, and we can make use of it in the `bulk_perform_in` method when `batch_size` and `batch_delay` are specified.

Using this approach reduces the number of calls to Redis: instead of calling `Sidekiq::Client.push_bulk` once per batch, we can schedule all jobs, each with its own delay, in a single call.
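The per-job delays can be computed up front. Below is a hedged sketch (the method name `schedule_at_per_job` is illustrative, not the actual GitLab code) of building one `at` timestamp per job, where each batch of `batch_size` jobs is pushed `batch_delay` seconds after the previous one:

```ruby
# Compute one scheduled-at timestamp per job: jobs in the same batch
# share a timestamp, and each subsequent batch is delayed further.
def schedule_at_per_job(job_count, base_delay:, batch_size:, batch_delay:, now: Time.now.to_f)
  (0...job_count).map do |index|
    batch_index = index / batch_size # integer division groups jobs into batches
    now + base_delay + (batch_index * batch_delay)
  end
end

# 7 jobs, batches of 3, 10s between batches, first batch after 60s:
ats = schedule_at_per_job(7, base_delay: 60, batch_size: 3, batch_delay: 10, now: 0.0)
# => [60.0, 60.0, 60.0, 70.0, 70.0, 70.0, 80.0]
```

In the real call, an array like this can be passed as the `'at'` value to `Sidekiq::Client.push_bulk`, so all jobs go out in one bulk push.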
This approach was first discussed at !32974 (comment 348966941)
## Screenshots or screen recordings
## How to set up and validate locally
We can use the following to schedule these jobs from the development Rails console and check the results. The methods are currently called from within the `bulk_perform_async` and `bulk_perform_in` methods in `application_worker.rb`.
To test the change in the local development environment:

- Set `SAFE_PUSH_BULK_LIMIT` to `10` in `application_worker.rb`.
- Restart background jobs via GDK.
```ruby
# For bulk_perform_async

# When exceeding safe limit
args = User.pluck(:id).sample(15).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_async(args)

# When NOT exceeding safe limit
args = User.pluck(:id).sample(5).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_async(args)

# For bulk_perform_in - without batches and batch delay

# When exceeding safe limit
args = User.pluck(:id).sample(15).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(1.minute, args)

# When NOT exceeding safe limit
args = User.pluck(:id).sample(5).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(1.minute, args)

# For bulk_perform_in - with batches and batch delay

# When exceeding safe limit
args = User.pluck(:id).sample(15).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(2.minutes, args, batch_size: 3, batch_delay: 30.seconds)

# When NOT exceeding safe limit
args = User.pluck(:id).sample(5).map { |id| [id] }
AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(2.minutes, args, batch_size: 3, batch_delay: 30.seconds)
```
Then, at http://localhost:3000/admin/sidekiq/scheduled, we can observe that the jobs are:

- scheduled at the expected time (in the case of `bulk_perform_in`)
- picked up correctly
- completed without errors
## MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- [ ] I have evaluated the MR acceptance checklist for this MR.