Review observability/monitoring requirements for horizontally scaling Sidekiq
This issue tracks application and infrastructure changes required for monitoring/observability to function in a horizontally scaled Sidekiq setup.
For the purposes of this issue, let's call the new Redis instance redis-sidekiq-shard-1.
Application
The instrumentation class for the new redis-sidekiq-shard-1 needs to be present for the storage label in our instrumentation middleware.
Labels can be added in https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/sidekiq_middleware/metrics_helper.rb#L22 to indicate the Redis instance that the message will be sent to. This should not increase cardinality, since jobs from the same queue always go to the same store.
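To illustrate the cardinality argument, here is a minimal Ruby sketch. It is not the actual code in metrics_helper.rb: the routing table, the `store` label name, the default store, and the `labels_for` helper are all hypothetical and exist only for this example. It shows that when each queue maps to exactly one store, adding the store to the label set does not create any new label combinations.

```ruby
# Hypothetical queue-to-store routing; names are illustrative only.
QUEUE_TO_STORE = {
  'default' => 'redis-sidekiq',
  'mailers' => 'redis-sidekiq-shard-1'
}.freeze

# Hypothetical fallback for queues without an explicit route.
DEFAULT_STORE = 'redis-sidekiq'

# Builds the label set for a job, adding the destination store.
# The `store` label name is an assumption for this sketch.
def labels_for(queue:, worker:)
  {
    queue: queue,
    worker: worker,
    store: QUEUE_TO_STORE.fetch(queue, DEFAULT_STORE)
  }
end

jobs = [
  { queue: 'default', worker: 'WorkerA' },
  { queue: 'mailers', worker: 'WorkerB' },
  { queue: 'mailers', worker: 'WorkerB' }
]

# Count distinct label sets before and after adding the store label.
without_store = jobs.map { |job| job.slice(:queue, :worker) }.uniq.size
with_store = jobs.map { |job| labels_for(**job) }.uniq.size
```

Because the store is a pure function of the queue, `with_store` equals `without_store`. Cardinality would only grow if jobs from one queue could land on more than one store.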
Infrastructure
redis-sidekiq-catchall-a will be represented as a new service in runbooks. This would require the following to be set up:
- entry into metrics catalog
- dashboards
- alerting rules
The new Redis used for Sidekiq would also require a gitlab-exporter to probe it for Sidekiq-related metrics (https://gitlab.com/gitlab-org/ruby/gems/gitlab-exporter/-/blob/master/lib/gitlab_exporter/sidekiq.rb). How this is set up depends on the deployment mode of the new Redis (Chef vs GKE): if it is deployed on GKE, we will need to add a gitlab-exporter container to the pod.
gitlab-exporter metrics come with the type label, which allows us to aggregate/slice Sidekiq metrics by specific Redis instance.
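As a rough illustration of slicing by that label, the Ruby sketch below groups a few made-up samples on `type` and sums them per Redis instance. The metric name, label values, and numbers are hypothetical, and it is an assumption here that the `type` value carries the Redis instance name. In practice this aggregation would be done in the monitoring stack rather than in Ruby.

```ruby
# Hypothetical samples; metric name, label values, and numbers are made up.
Sample = Struct.new(:name, :labels, :value)

samples = [
  Sample.new('sidekiq_queue_size', { type: 'redis-sidekiq', name: 'default' }, 12),
  Sample.new('sidekiq_queue_size', { type: 'redis-sidekiq-shard-1', name: 'mailers' }, 5),
  Sample.new('sidekiq_queue_size', { type: 'redis-sidekiq-shard-1', name: 'pipelines' }, 7)
]

# Sum queue sizes per Redis instance by grouping on the `type` label.
size_by_type = samples
  .group_by { |sample| sample.labels[:type] }
  .transform_values { |group| group.sum(&:value) }
```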
In summary
We will monitor new redis-sidekiq-* instances as shards of ServiceRedisSidekiq. The methodology is outlined in this thread (#2818 (comment 1828734089)).