Sidekiq Redis experiment: reduce number of clients by X%

Background

This is an experiment extracted from #956 (closed). The following factors all contribute to the CPU usage problems on our Redis instance for Sidekiq, but we don't know how much weight each one carries:

  1. Number of clients performing BRPOP with ...
  2. ... a very long argument list (for the catchall shard) where ...
  3. ... some of those arguments represent frequently-used lists (Sidekiq queues).
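To make the three factors concrete, here is a minimal sketch of the shape of a single BRPOP call from one client on a shard that listens to many queues. It only builds the argument list and does not talk to Redis. The `queue:` key prefix and the 2-second timeout are assumptions about Sidekiq's fetch behaviour, and the queue names and the count of 300 are hypothetical.

```python
def build_brpop_command(queue_names, timeout_seconds=2):
    """Return a BRPOP command as a tuple of arguments, one key per queue.

    The "queue:" prefix and the default timeout are assumptions made for
    illustration, not values taken from the experiment.
    """
    keys = [f"queue:{name}" for name in queue_names]
    return ("BRPOP", *keys, str(timeout_seconds))


# Hypothetical catchall shard listening on many queues (names are made up).
catchall_queues = [f"example_worker_{i}" for i in range(300)]
command = build_brpop_command(catchall_queues)

# Every client on the shard sends a command of this size each time it polls.
print(len(command) - 2, "keys in one BRPOP call")
```

Redis checks the keys of a BRPOP in the order given, so the cost of each call can grow with the length of the key list, and the total grows again with the number of clients issuing it.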

Experiment

This is to simulate &423 by acting as if X% of our Sidekiq workload was happening in a zonal cluster and so hitting a different Redis instance.

If item 1 is a significant factor, then we should see improvements here.

Summary Results

Redis CPU usage (%) at each worker count, where 100% is a 'full' workload based on production; the 100% row uses data from #956 (closed).

Worker %   Base/Idle   1 Generator   2 Generators   3 Generators   Notes
100%       11%         67%           87%            95%            From #956 (closed)
66%        8%          60%           86%            92%
50%        6%          61%           63%            70%
40%        5%          60%           55%            N/A
33%        4%          50%           52%            N/A
25%        3%          36%           N/A            N/A

Conclusions

The reduction in usage is not linear in the number of workers.
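As a rough check of the non-linearity, the sketch below compares the measured "1 Generator" figures from the summary table with what a purely proportional model would predict. The proportional model (CPU scales directly with worker count, ignoring the idle baseline) is an assumption chosen for illustration, not something the experiment defines.

```python
# Redis CPU usage (%) with 1 generator, keyed by worker % of production.
# Values are copied from the summary table.
cpu_one_generator = {100: 67, 66: 60, 50: 61, 40: 60, 33: 50, 25: 36}


def linear_prediction(worker_pct, full_cpu):
    """CPU usage if it scaled in direct proportion to the worker count."""
    return full_cpu * worker_pct / 100


def compare_to_linear(measured):
    """Return (worker %, measured CPU, proportional prediction) rows."""
    full_cpu = measured[100]
    rows = []
    for worker_pct, cpu in sorted(measured.items(), reverse=True):
        predicted = linear_prediction(worker_pct, full_cpu)
        rows.append((worker_pct, cpu, round(predicted, 1)))
    return rows


rows = compare_to_linear(cpu_one_generator)
for worker_pct, cpu, predicted in rows:
    print(f"{worker_pct:>3}% workers: measured {cpu}%, proportional would be {predicted}%")
```

At every reduced worker count the measured usage sits well above the proportional prediction, which is one way of seeing that removing workers buys less than a linear model would suggest.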

If we make some stretchy assumptions, namely that we split our workload into 3 clusters (&423) and that for safety we initially size each at 50% of the original cluster (by number of workers), then we will likely see Redis CPU usage drop to perhaps 60-65% (absolute). At that scale the number of workers appears to have more effect than the rate of jobs being pushed into the queues, which is an interesting result.
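One way to read the table in support of this is to compare how much CPU usage moves when the number of generators changes against how much it moves when the worker count changes. The sketch below does that arithmetic on the summary table. It assumes that each additional generator corresponds to a higher rate of jobs being pushed; the helper names are hypothetical.

```python
# Redis CPU usage (%) from the summary table:
# worker % -> (1 generator, 2 generators, 3 generators). None marks N/A cells.
cpu_usage = {
    100: (67, 87, 95),
    66: (60, 86, 92),
    50: (61, 63, 70),
    40: (60, 55, None),
    33: (50, 52, None),
    25: (36, None, None),
}


def generator_spread(worker_pct):
    """Gap between the highest and lowest measured CPU across generator counts."""
    values = [v for v in cpu_usage[worker_pct] if v is not None]
    return max(values) - min(values)


def worker_drop(from_pct, to_pct, generators):
    """CPU points saved by reducing workers at a fixed generator count.

    Only valid for cells that were actually measured (not N/A).
    """
    before = cpu_usage[from_pct][generators - 1]
    after = cpu_usage[to_pct][generators - 1]
    return before - after


print("Spread across generators at 100% workers:", generator_spread(100))
print("Spread across generators at 50% workers:", generator_spread(50))
print("Drop from 100% to 50% workers at 3 generators:", worker_drop(100, 50, 3))
```

With the table's figures, the spread across generator counts is 28 points at 100% workers but only 9 points at 50% workers, while halving the workers at 3 generators saves 25 points.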

Edited by Craig Miskell