feat: support shard level monitoring for GET-hybrid

This is a follow-up to !8755 (comment 2452646369) that allows the GET-hybrid metrics catalog to use the shardLevelMonitoring feature.

For that to work, we needed to add the following:

  1. Pass through the shardComponentSLIs aggregation set that is already in use for the shared runners to generate rules for services not defined through mixins.

  2. Add the alert descriptor to generate the alerts for the shardComponentSLIs aggregation.

  3. Add the shard-template to the service dashboard for services that enable shard monitoring by default. This will select all shards on the dashboard by default, but for GitLab.com we want to see catchall by default.
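As a rough illustration of the third point, the sketch below builds a Grafana-style template variable for shards in Python. It is not the code from this merge request: the function name `shard_template`, the `default_shard` parameter, and the `label_values(shard)` query are assumptions made for the example. Only the behaviour (all shards selected by default, catchall for GitLab.com) comes from the description above.

```python
def shard_template(default_shard=None):
    """Return a dict shaped like a Grafana dashboard template variable.

    Hypothetical helper: with no default_shard, every shard is selected;
    otherwise only the named shard is preselected.
    """
    if default_shard is None:
        # "$__all" is the value Grafana uses for the "All" option.
        current = {"text": "All", "value": "$__all"}
    else:
        current = {"text": default_shard, "value": default_shard}
    return {
        "name": "shard",
        "type": "query",
        "query": "label_values(shard)",  # placeholder query, an assumption
        "multi": True,
        "includeAll": True,
        "current": current,
    }


# Default behaviour: all shards selected.
get_hybrid = shard_template()
# GitLab.com override: show only the catchall shard initially.
gitlab_com = shard_template(default_shard="catchall")
```

Keeping the default as a single optional parameter means a service dashboard only has to state its preferred shard when it differs from "all".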

This commit currently does not change anything for either environment:

The GET-hybrid currently does not have any services for which this feature is enabled, but we could follow up with that for Sidekiq if we so choose.

For Sidekiq, the dashboard is slightly adjusted to account for the different default, so that it still shows the catchall shard by default.

This also adds a fix that stops pinning a fixed color on dashboards that monitor multiple shards, so that the series aren't all just a yellow line and Grafana selects a different color per shard.

This also enables the shard-monitoring feature for the ci-runners service, which was already using shard monitoring. It was only in use for a single SLI, ci_runner_jobs, so shard monitoring is explicitly disabled for the others.
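The relationship between the service-level switch and the per-SLI opt-out can be sketched as follows. This is a simplified Python model, not the metrics catalog's actual definition: the class names, the `shard_level_monitoring` field names, and the two placeholder SLI names are assumptions. Only `ci_runner_jobs` and the ci-runners service are taken from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SLI:
    name: str
    # None means "inherit the service default" (hypothetical convention).
    shard_level_monitoring: Optional[bool] = None


@dataclass
class Service:
    name: str
    shard_level_monitoring: bool = False
    slis: List[SLI] = field(default_factory=list)

    def shard_monitored_slis(self) -> List[str]:
        """Names of SLIs for which per-shard rules would be generated."""
        result = []
        for sli in self.slis:
            enabled = sli.shard_level_monitoring
            if enabled is None:
                enabled = self.shard_level_monitoring
            if enabled:
                result.append(sli.name)
        return result


ci_runners = Service(
    name="ci-runners",
    shard_level_monitoring=True,
    slis=[
        SLI("ci_runner_jobs"),  # inherits the service default
        SLI("placeholder_sli_a", shard_level_monitoring=False),  # hypothetical
        SLI("placeholder_sli_b", shard_level_monitoring=False),  # hypothetical
    ],
)

monitored = ci_runners.shard_monitored_slis()
```

With the service flag on and the other SLIs opted out, only `ci_runner_jobs` remains shard-monitored, which matches the behaviour before this change.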

Edited by Bob Van Landuyt
