New Sidekiq Queue detail panel: scheduled vs immediate queue events

Matthias Käppler requested to merge mk-sidekiq-scheduled-set-events into master

Related to: gitlab-org/gitlab#333671 (closed)

We have expanded the set of workers that can leverage database load-balancing, and these workers now see a short execution delay. As a result, their enqueue events as measured by sidekiq_enqueued_jobs_total have effectively doubled. This is because future scheduling in Sidekiq is realized by first posting a job to the queue via perform_in, which Sidekiq then moves into its ScheduledSet. When the time has come to execute the job, it is re-posted to the queue for immediate execution. In other words, the job is enqueued twice, and it is also counted twice.
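To make the double counting concrete, here is a minimal sketch in plain Ruby. It is a toy in-memory model, not Sidekiq's actual implementation: `ToyQueue`, `post`, and `poll` are hypothetical names, and `enqueued_total` merely stands in for sidekiq_enqueued_jobs_total. It assumes that every post, delayed or not, passes through the same counting step.

```ruby
# Hypothetical in-memory model of the flow described above; not Sidekiq code.
class ToyQueue
  attr_reader :enqueued_total, :scheduled_set, :ready

  def initialize
    @enqueued_total = 0 # stands in for sidekiq_enqueued_jobs_total
    @scheduled_set = [] # stands in for Sidekiq's ScheduledSet
    @ready = []         # jobs available for immediate execution
  end

  # Assumption: every post goes through the same counting step.
  def post(job)
    @enqueued_total += 1
    if job[:at]
      @scheduled_set << job
    else
      @ready << job
    end
  end

  # Re-post every job whose scheduled time has arrived, without its :at key.
  def poll(now)
    due, @scheduled_set = @scheduled_set.partition { |job| job[:at] <= now }
    due.each { |job| post(job.reject { |key, _| key == :at }) }
  end
end

queue = ToyQueue.new
queue.post({ class: "WebHookWorker", at: 10 }) # like perform_in
queue.poll(10)                                  # the scheduled time arrives

puts queue.enqueued_total # enqueue events recorded for a single job
puts queue.ready.length   # jobs actually waiting to execute
```

In this model a single delayed job produces two enqueue events but only one execution, which is the doubling described above.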

For overall queue length alerting this does not matter (we don't care why something is in the queue, just that it is). But when counting "intended executions", or when trying to see which jobs were scheduled for future execution and to what extent, it helps to break this metric down further. We can now do this with the new scheduling label attached to this metric, which is either immediate or delayed, introduced in gitlab-org/gitlab!64322 (merged).
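As a rough illustration of how such a label separates the two kinds of enqueue events, consider the following sketch. It is not the code from gitlab-org/gitlab!64322: `ENQUEUED`, `scheduling_label`, `record_enqueue`, and `count_for` are hypothetical names, the hash is a stand-in for a labelled counter, and the sketch assumes a delayed post can be recognized by an "at" timestamp in the job payload.

```ruby
# Hypothetical stand-in for a labelled counter; not the real metrics code.
ENQUEUED = Hash.new(0)

# Assumption: a delayed post carries an "at" timestamp, an immediate one does not.
def scheduling_label(job)
  job.key?("at") ? "delayed" : "immediate"
end

# Count one enqueue event under the worker and scheduling labels.
def record_enqueue(job)
  ENQUEUED[{ worker: job["class"], scheduling: scheduling_label(job) }] += 1
end

# Sum all enqueue events that carry the given scheduling label.
def count_for(scheduling)
  ENQUEUED.sum { |labels, count| labels[:scheduling] == scheduling ? count : 0 }
end

record_enqueue({ "class" => "WebHookWorker", "at" => 1_625_137_875.0 }) # posted for the future
record_enqueue({ "class" => "WebHookWorker" })                          # re-posted once due
record_enqueue({ "class" => "WebHookWorker" })                          # a plain, undelayed enqueue

puts count_for("delayed")   # events that went through future scheduling
puts count_for("immediate") # events posted for immediate execution
```

Under these assumptions every job is posted for immediate execution exactly once, so the immediate count tracks "intended executions", while the delayed count shows how many of those jobs went through future scheduling first.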

Here is an example for WebHookWorker, where you can see how its Enqueued Jobs rate is twice as high as actual executions:

Screenshot_from_2021-07-01_13-11-15

https://dashboards.gitlab.net/dashboard/snapshot/faGn8ZGQe7r4ypfhWlaijjbAqjngGkce?orgId=1&var-PROMETHEUS_DS=Global&var-environment=gprd&var-stage=main&var-queue=web_hook

It is also useful for identifying gaps where queuing jobs and actually executing them drift apart, as can also be seen above.

Edited by Matthias Käppler
