Adjust Sidekiq poll interval to improve scheduled jobs enqueue latency
From data on GitLab.com:
https://log.gprd.gitlab.net/goto/0a11409b77a621901ec8a39bb82b5897
We can see that only 0.65% of polls are empty. This suggests that we almost always have scheduled jobs in the backlog. We can also see that the number of backlog jobs is high during busy periods.
Now that we are using the atomic scheduler, we can lower Sidekiq's poll interval so that the work of enqueuing scheduled jobs is spread across more processes. This in turn can improve our enqueue latency without really affecting Redis CPU, because the same commands are just spread out across the processes. The risk of going too low is an increase in the number of empty polls, so we should keep track of that as we change the values. Extra empty polls aren't really that bad, considering we have been running with redundant ZREMs for a long time, but we should keep an eye on Redis CPU as well.
We are currently using the Sidekiq default of process_count * Sidekiq.options[:average_scheduled_poll_interval]. The default average_scheduled_poll_interval is 5 seconds. We could find the sweet spot by lowering it in small steps until we see a noticeable rise in Redis CPU or empty polls, then raising it back up one step.
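As a rough illustration of the arithmetic, the sketch below mirrors the default formula in plain Ruby without loading Sidekiq. The helper names per_process_interval and cluster_polls_per_second, and the process count of 200, are hypothetical and chosen only for illustration. They are not Sidekiq APIs or real GitLab.com figures.

```ruby
# Average seconds between polls for a single process, following
# process_count * average_scheduled_poll_interval.
def per_process_interval(process_count, average_scheduled_poll_interval)
  process_count * average_scheduled_poll_interval
end

# Approximate polls per second across all processes, assuming each process
# polls once per per-process interval on average.
def cluster_polls_per_second(process_count, average_scheduled_poll_interval)
  process_count.to_f / per_process_interval(process_count, average_scheduled_poll_interval)
end

# Hypothetical fleet of 200 processes, stepping the setting down from 5 to 1.
[5, 4, 3, 2, 1].each do |avg|
  interval = per_process_interval(200, avg)
  rate = cluster_polls_per_second(200, avg)
  puts format("avg=%ds per-process interval=%ds cluster polls/s=%.2f", avg, interval, rate)
end
```

Under this formula the fleet-wide poll rate depends only on average_scheduled_poll_interval, not on the number of processes, so each step down raises the total number of polls proportionally. That is why the empty-poll percentage is the number to watch after each change.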
Another option would be to configure Sidekiq.options[:poll_interval_average] if we don't want the poll interval to be affected by process count.
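The difference between the two settings can be sketched in plain Ruby as follows. This is a simplified model, not Sidekiq's own code: effective_poll_interval is a hypothetical helper, the options are plain hashes, and the value of 30 is an arbitrary example. It assumes an explicit poll_interval_average takes precedence over the scaled default.

```ruby
DEFAULT_AVERAGE_SCHEDULED_POLL_INTERVAL = 5 # seconds, the default cited above

# Returns the average per-process poll interval in seconds.
# Assumption: a configured poll_interval_average wins; otherwise the
# interval scales with the number of processes.
def effective_poll_interval(options, process_count)
  fixed = options[:poll_interval_average]
  return fixed if fixed

  process_count * options.fetch(:average_scheduled_poll_interval,
                                DEFAULT_AVERAGE_SCHEDULED_POLL_INTERVAL)
end

scaled = { average_scheduled_poll_interval: 5 }
pinned = { poll_interval_average: 30 } # arbitrary example value

[50, 200].each do |count|
  puts "processes=#{count} " \
       "scaled=#{effective_poll_interval(scaled, count)}s " \
       "pinned=#{effective_poll_interval(pinned, count)}s"
end
```

With a pinned value, adding processes increases the fleet-wide poll rate, whereas the scaled default keeps that rate roughly constant as the fleet grows.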