chore: use unified sli registry in dashboards
In gitlab-com/gl-infra/scalability#2602 we added a recording rule registry that includes all of the metrics used in SLIs, with all of their significant labels.
This included a rename of a metric, which handily did not clash with the Thanos rules.
Now that we've switched Grafana to use Mimir as the default datasource, we can use this registry to resolve which metrics to display.
chore: allow overriding gitlab-metrics-config for application SLIs
This allows overriding the recording rule registry when generating rules for application SLIs.
This means we can keep the same recording rules in our Prometheus+Thanos implementation while we switch the default registry used by our dashboards.
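As a rough illustration of the override described above, here is a minimal Python sketch. It is not the jsonnet used in this repository: `MetricsConfig`, `UnifiedRegistry`, `source_for`, and the recording rule name are hypothetical names chosen for the example, and only `NullRegistry` borrows its name from this description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class NullRegistry:
    """Registry that knows about no recording rules (name borrowed from the MR)."""

    def resolve(self, metric: str) -> Optional[str]:
        return None


@dataclass
class UnifiedRegistry:
    """Hypothetical registry mapping source metrics to recording rule names."""

    recorded: Dict[str, str] = field(default_factory=dict)

    def resolve(self, metric: str) -> Optional[str]:
        return self.recorded.get(metric)


@dataclass
class MetricsConfig:
    """Hypothetical stand-in for gitlab-metrics-config: it only carries a registry."""

    registry: object


# Assumed default: dashboards resolve metrics through the unified registry.
DEFAULT_CONFIG = MetricsConfig(
    registry=UnifiedRegistry(
        {"http_requests_total": "sli_aggregations:http_requests_total:rate_5m"}
    )
)


def source_for(metric: str, config: Optional[MetricsConfig] = None) -> str:
    """Return the recording rule for a metric, or the raw metric if none exists.

    Callers that must not follow the default (for example rule generation for
    another environment) pass their own config to override the registry.
    """
    cfg = DEFAULT_CONFIG if config is None else config
    recorded = cfg.registry.resolve(metric)
    return recorded if recorded is not None else metric
```

With this shape, changing `DEFAULT_CONFIG` affects only callers that do not pass a config, which is the property the override is meant to provide.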
chore: configure selective registry for regional key metrics
This lets us continue using the selective registry for globally evaluated regional recording rules.
chore: make prometheus rule group generator respect config
This makes the prometheus-service-group-generator respect the gitlab-metrics-config.
When doing this, we make sure that the GET-hybrid continues to use the NullRegistry, even when we switch the default over to use the unified registry.
This also switches the service-key-metric-rule-files that we use for Prometheus+Thanos to the selective registry, so the recording rules there don't change when we switch the default.
fix: make combined metrics respect configured config
Before this, combined metrics would always rely on the default configuration.
In our Mimir setup, this meant that these SLIs relied on high-cardinality source metrics even though recording rules were available.
For the GET-hybrid, this meant that the alert queries we generated used recording rules that didn't exist.
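The fix can be sketched in Python as follows. This is an illustration under assumptions, not the actual jsonnet implementation: `RateMetric`, `CombinedMetric`, the dictionary-based registry, and the metric names are all hypothetical. The point it shows is that the combined metric forwards the caller's registry to each child instead of letting the children fall back to the default.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical registry shape: source metric name -> recording rule name.
Registry = Dict[str, str]

# Assumed default registry, with one recorded metric for the example.
DEFAULT_REGISTRY: Registry = {"requests_total": "recorded:requests_total:rate_5m"}


@dataclass
class RateMetric:
    counter: str

    def query(self, registry: Optional[Registry] = None) -> str:
        # Fall back to the default registry only when none was passed.
        reg = DEFAULT_REGISTRY if registry is None else registry
        recorded = reg.get(self.counter)
        if recorded is not None:
            return recorded
        return f"rate({self.counter}[5m])"


@dataclass
class CombinedMetric:
    metrics: List[RateMetric]

    def query(self, registry: Optional[Registry] = None) -> str:
        # Forward the caller's registry to every child. Calling m.query()
        # without it would reproduce the bug: children would silently use
        # the default registry regardless of the configured one.
        return " + ".join(m.query(registry) for m in self.metrics)


combined = CombinedMetric([RateMetric("requests_total"), RateMetric("jobs_total")])
```

Passing an empty registry plays the role of the NullRegistry here: `combined.query({})` yields only raw `rate(...)` expressions, so no query references a recording rule that is absent.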