Review observability/monitoring requirements for horizontally scaling Sidekiq

This issue tracks application and infrastructure changes required for monitoring/observability to function in a horizontally scaled Sidekiq setup.

For the purposes of this issue, let's call the new Redis instance redis-sidekiq-shard-1.

Application

An instrumentation class for redis-sidekiq-shard-1 needs to be present so that our instrumentation middleware can populate the storage label for it.

A label can be added in https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/sidekiq_middleware/metrics_helper.rb#L22 to indicate the Redis instance that the job will be sent to. This should not increase cardinality, since jobs from the same queue will go to the same store.
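To illustrate the cardinality argument, here is a minimal sketch. It does not reproduce the actual code in metrics_helper.rb: the routing table, the `labels_for` and `series_count` helpers, and the queue and worker names are all hypothetical. It only shows that when the store is a pure function of the queue, adding a store label does not create additional label combinations.

```ruby
# Hypothetical queue-to-store routing table (illustrative names only).
QUEUE_TO_STORE = {
  'default' => 'redis-sidekiq',
  'mailers' => 'redis-sidekiq',
  'catchall_a' => 'redis-sidekiq-shard-1'
}.freeze

# Hypothetical fallback store for queues without an explicit route.
DEFAULT_STORE = 'redis-sidekiq'

# Build a label set for a job. The store label is derived from the queue,
# so it never varies independently of the queue label.
def labels_for(queue, worker, include_store: true)
  labels = { queue: queue, worker: worker }
  labels[:store] = QUEUE_TO_STORE.fetch(queue, DEFAULT_STORE) if include_store
  labels
end

# Count the distinct label sets (time series) produced by a list of
# [queue, worker] pairs.
def series_count(jobs, include_store:)
  jobs.map { |queue, worker| labels_for(queue, worker, include_store: include_store) }
      .uniq
      .size
end

jobs = [
  %w[default PostReceive],
  %w[mailers ActionMailer],
  %w[catchall_a SomeWorker],
  %w[default PostReceive]
]

# Both counts are equal because store is fully determined by queue.
puts series_count(jobs, include_store: false)
puts series_count(jobs, include_store: true)
```

If a queue could be routed to more than one store at the same time (for example, during a migration), the two counts could diverge, so the assumption is worth re-checking for any transition period.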

Infrastructure

redis-sidekiq-catchall-a will be represented as a new service in runbooks. This would require the following to be set up:

  • entry into metrics catalog
  • dashboards
  • alerting rules

The new Redis used for Sidekiq would also require a gitlab-exporter to probe it for Sidekiq-related metrics (https://gitlab.com/gitlab-org/ruby/gems/gitlab-exporter/-/blob/master/lib/gitlab_exporter/sidekiq.rb). How this is done depends on the deployment mode of the new Redis (Chef vs GKE): if it is deployed on GKE, we will need to add a gitlab-exporter container to the pod.

gitlab-exporter metrics come with the type label, which allows us to aggregate/slice Sidekiq metrics by the specific Redis instance they come from.


In summary

We monitor each new redis-sidekiq-* instance as a shard of ServiceRedisSidekiq. The methodology is outlined in this thread (#2818 (comment 1828734089)).

Edited by Sylvester Chin