Sidekiq Redis experiment: have a single queue per shard

Background

This is an experiment extracted from #956 (closed). Several factors contribute to our problems with CPU usage on the Redis instance for Sidekiq, but we don't know how much weight each one carries:

  1. Number of clients performing BRPOP with ...
  2. ... a very long argument list (for the catchall shard) where ...
  3. ... some of those arguments represent frequently-used lists (Sidekiq queues).
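
To make factors 1 and 2 concrete, here is a minimal sketch of how the BRPOP argument list grows with the number of queues a shard listens to. It only builds the command and does not talk to Redis. The `queue:` key prefix and the 2-second timeout reflect our understanding of Sidekiq's default fetcher and should be treated as assumptions; the queue names and the count of 300 are made up for illustration.

```python
def brpop_command(queues, timeout=2):
    """Build the argument list for one BRPOP call over the given queues.

    Redis syntax is: BRPOP key [key ...] timeout
    The "queue:" prefix and the default timeout are assumptions about
    how Sidekiq names its lists and polls them.
    """
    keys = [f"queue:{name}" for name in queues]
    return ["BRPOP", *keys, str(timeout)]


# Hypothetical catchall shard listening to one queue per worker class.
many = brpop_command([f"worker_{i}" for i in range(300)])

# The same shard after routing every worker to a single queue.
one = brpop_command(["default"])

# Subtract the command name and the timeout to count only the keys.
print(len(many) - 2, "keys vs", len(one) - 2, "key")
```

Every idle Sidekiq thread issues a command like this repeatedly, so the number of keys is multiplied by the number of clients.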

Experiment

The idea here is to simulate the effect of &194 where we'd have far fewer queues to listen to. This is mostly related to factor 2, and it's what the Sidekiq docs themselves recommend.

We tested 2 variations:

  1. All shards using one queue per shard
  2. Only the 'catchall-on-k8s' workloads using one queue per shard (the one with the most queues)
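
The two variations can be sketched as a routing rule from worker to queue. This is an illustration only, not the routing code used in the experiment: the function `queue_for`, the worker names, the `other-shard` name, and the choice to name the single queue after its shard are all hypothetical.

```python
def queue_for(worker, shard, single_queue_shards):
    """Return the queue that a worker's jobs are pushed to.

    Shards listed in `single_queue_shards` collapse all of their workers
    into one queue (named after the shard here, as an assumption); every
    other shard keeps one queue per worker.
    """
    return shard if shard in single_queue_shards else worker


# Hypothetical (worker, shard) pairs.
workers = [
    ("worker_a", "catchall-on-k8s"),
    ("worker_b", "catchall-on-k8s"),
    ("worker_c", "other-shard"),
]
all_shards = {shard for _, shard in workers}

# Variation 1: every shard uses a single queue.
variation_1 = {w: queue_for(w, s, all_shards) for w, s in workers}

# Variation 2: only catchall-on-k8s uses a single queue.
variation_2 = {w: queue_for(w, s, {"catchall-on-k8s"}) for w, s in workers}

print(variation_1)
print(variation_2)
```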

Key Results

  1. Redis 6.0.10 with the BRPOP patch is largely equivalent to Redis 5.0.9. Redis 6.0.10 without the BRPOP patch is always worse, and we probably don't need to test it much further.
  2. A single queue per shard gives us a roughly 40% absolute drop (about 40 percentage points) in Redis CPU usage.
  3. A single queue for catchall-on-k8s, leaving the rest unchanged, gives us a roughly 30% absolute drop (about 30 percentage points) in Redis CPU usage.
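
The word "absolute" matters when reading these numbers. The short sketch below contrasts an absolute drop (percentage points subtracted from the CPU figure) with a relative drop (a fraction of the baseline). The 80% baseline is a hypothetical value chosen only to show the difference; it is not a measurement from this experiment.

```python
# Hypothetical baseline Redis CPU usage, in percent. NOT a measured value.
baseline = 80.0
drop = 40.0  # the "roughly 40%" figure from result 2

# Absolute drop: subtract percentage points from the baseline.
after_absolute = baseline - drop

# Relative drop: remove that fraction of the baseline instead.
after_relative = baseline * (1 - drop / 100)

print(f"absolute: {baseline:.0f}% -> {after_absolute:.0f}%")
print(f"relative: {baseline:.0f}% -> {after_relative:.0f}%")
```

With any baseline below 100%, an absolute drop of a given size is the larger of the two reductions.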

Conclusion

A single queue per shard is a highly effective way to reduce Redis CPU usage, and even doing it only on the shard with the largest number of queues makes a substantial improvement.

Edited by Craig Miskell