Limit Sidekiq `push_bulk` to a maximum of 1000 jobs in one go

Manoj M J requested to merge mmj-bulk-delays-for-bulk-perform-in-sidekiq into master

What does this MR do and why?

This MR does 2 things:

  • Limit the number of jobs enqueued via the bulk_perform_async & bulk_perform_in methods to a maximum of 1000 in one go. If more than 1000 jobs are to be enqueued, they are split into batches of 1000 and each batch is enqueued separately.
  • Since we recently upgraded Sidekiq to version 6.2.2, we can use its support for executing push_bulk with a different delay per job in one go, which reduces the number of Redis calls. This feature was added in Sidekiq via https://github.com/mperham/sidekiq/pull/4243.

On limiting jobs to 1000 in one go

This is recommended by Sidekiq in the documentation: https://www.rubydoc.info/github/mperham/sidekiq/Sidekiq%2FClient:push_bulk

"I wouldn't recommend pushing more than 1000 per call but YMMV based on network quality, size of job args, etc. A large number of jobs can cause a bit of Redis command processing latency."

To incorporate this, we wrap the current calls to Sidekiq::Client.push_bulk in in_safe_limit_batches, which calls push_bulk in batches of at most 1000 jobs each. It returns the job_ids of all enqueued jobs in one array, so to the caller it looks the same as a single Sidekiq::Client.push_bulk call, and the batching remains an internal implementation detail.
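The batching described above can be sketched as follows. This is a minimal illustration, not the code in this MR: the `limit:` keyword argument and the `fake_push_bulk` lambda are hypothetical, added so the sketch runs without Sidekiq or Redis, and the real method's signature may differ.

```ruby
# Stand-in for the constant defined in application_worker.rb.
SAFE_PUSH_BULK_LIMIT = 1000

# Yields args_list in slices of at most `limit` jobs and concatenates the
# job IDs returned by each call into a single flat array.
# `limit:` is a hypothetical parameter, used here only so the sketch can be
# tried with small numbers.
def in_safe_limit_batches(args_list, limit: SAFE_PUSH_BULK_LIMIT)
  args_list.each_slice(limit).flat_map do |args_batch|
    yield(args_batch)
  end
end

# Hypothetical fake standing in for Sidekiq::Client.push_bulk: it records
# the size of each call and returns one made-up job ID per job.
push_sizes = []
fake_push_bulk = lambda do |args_batch|
  push_sizes << args_batch.size
  args_batch.map { |args| "jid-#{args.first}" }
end

args = (1..15).map { |id| [id] }
jids = in_safe_limit_batches(args, limit: 10) { |batch| fake_push_bulk.call(batch) }

puts push_sizes.inspect # sizes of the individual pushes
puts jids.size          # total number of job IDs returned
```

With a limit of 10 and 15 jobs, the fake pusher is called twice (10 jobs, then 5), yet the caller still receives all 15 job IDs in one array.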

On push_bulk with different delays in one go

This was introduced in Sidekiq via https://github.com/mperham/sidekiq/pull/4243, and we can make use of this feature in the bulk_perform_in method, in the case where batch_size and batch_delay are specified.

This approach reduces the number of calls to Redis: at present we call Sidekiq::Client.push_bulk once for every batch, whereas with per-job delays the batches can be pushed together in one call.

This approach was first discussed at !32974 (comment 348966941)
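To make the per-job delay idea concrete, here is a small sketch of how one scheduled-at timestamp per job could be derived from delay, batch_size and batch_delay. The helper name `schedule_at_per_job` and its keyword arguments are hypothetical, and the sketch assumes the first batch runs after `delay` with each later batch pushed back by a further `batch_delay`; the MR's actual implementation may differ.

```ruby
# Returns one scheduled-at timestamp (epoch seconds) per job.
# Assumption: jobs in the first batch run at now + delay, and every later
# batch is delayed by an additional batch_delay seconds.
def schedule_at_per_job(job_count, now:, delay:, batch_size:, batch_delay:)
  Array.new(job_count) do |index|
    batch_index = index / batch_size # integer division: 0 for the first batch
    now + delay + batch_index * batch_delay
  end
end

now = 1_000_000 # fixed fake "current time" so the offsets are easy to read
schedule = schedule_at_per_job(15, now: now, delay: 120, batch_size: 3, batch_delay: 30)
offsets = schedule.map { |at| at - now }

puts offsets.uniq.inspect # distinct delays, one per batch
```

For 15 jobs with a 2 minute delay, batch_size 3 and batch_delay 30 seconds, this gives five distinct offsets (120, 150, 180, 210 and 240 seconds), three jobs each. An array like `schedule` is the kind of value that could be handed to a single push_bulk call as the per-job schedule, instead of issuing one call per batch.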

How to set up and validate locally

We can use the following to schedule these jobs from the development Rails console and check the results.

The methods are currently called from within the bulk_perform_async and bulk_perform_in methods in application_worker.rb.

To test the change in a local development environment:

  • Set the SAFE_PUSH_BULK_LIMIT to 10 in application_worker.rb
  • Restart background jobs via GDK.
# For bulk_perform_async

  # When exceeding safe limit

  args = User.pluck(:id).sample(15).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_async(args)

  # When NOT exceeding safe limit

  args = User.pluck(:id).sample(5).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_async(args)

# For bulk_perform_in - without batches and batch delay

  # When exceeding safe limit

  args = User.pluck(:id).sample(15).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(1.minute, args)

  # When NOT exceeding safe limit

  args = User.pluck(:id).sample(5).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(1.minute, args)


# For bulk_perform_in - with batches and batch delay

  # When exceeding safe limit

  args = User.pluck(:id).sample(15).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(2.minutes, args, batch_size: 3, batch_delay: 30.seconds)

  # When NOT exceeding safe limit

  args = User.pluck(:id).sample(5).map { |id| [id] }
  AuthorizedProjectUpdate::UserRefreshFromReplicaWorker.bulk_perform_in(2.minutes, args, batch_size: 3, batch_delay: 30.seconds)

And in http://localhost:3000/admin/sidekiq/scheduled, we can observe that the jobs are being

  • scheduled at the expected time (in case of bulk_perform_in)
  • picked up correctly
  • completed without errors.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Manoj M J