Use the new metrics as service level indicators in the service catalog
Now that the new metrics are defined, we want to use them as an
SLI in our service catalog. The new metric should replace the apdex
for the puma component of the web, git, and api services.

Currently, an SLI definition has no concept of a success rate: only
`apdex`, `errorRate`, and `requestRate` are supported. The `apdex` key
expects a `histogramApdex` kind of object to generate the
`gitlab_component_apdex:success:rate` and
`gitlab_component_apdex:weight:score` recordings:
https://gitlab.com/gitlab-com/runbooks/blob/dde7b5ef3270e8ff3ebb2dbd9cf6ca72b59239c4/rules/autogenerated-key-metrics-web.yml#L103.
So we need to make sure the SLI definitions can handle the new metrics.
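
As a rough illustration of what handling the new metrics could mean, the sketch below feeds the two existing recordings from a success counter and a total counter instead of histogram buckets. The counter names, the `rateExpr` helper, the label set, and the group name are all assumptions made for the example, not the runbooks implementation.

```jsonnet
// Sketch only: counter names, labels, and helpers are hypothetical.
local successCounter = 'rails_request_apdex_success_total';
local totalCounter = 'rails_request_apdex_total';

// Build a PromQL rate expression over a counter, aggregated per environment.
local rateExpr(counter, selector, range='1m') =
  'sum by (environment) (rate(%s{%s}[%s]))' % [counter, selector, range];

// Static labels attached to both recordings.
local staticLabels = { type: 'web', component: 'puma' };

{
  groups: [{
    name: 'Component-Level SLIs: web',
    rules: [
      {
        // Numerator of the apdex ratio: requests that met the target.
        record: 'gitlab_component_apdex:success:rate',
        labels: staticLabels,
        expr: rateExpr(successCounter, 'type="web"'),
      },
      {
        // Denominator of the apdex ratio: all measured requests.
        record: 'gitlab_component_apdex:weight:score',
        labels: staticLabels,
        expr: rateExpr(totalCounter, 'type="web"'),
      },
    ],
  }],
}
```

Under these assumptions the apdex ratio is still the success recording divided by the weight recording, so anything that consumes those two recordings would not need to change.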
Proposal: Make the serviceLevelIndicator use a successRate
```jsonnet
// service.jsonnet
{
  // ... service definition
  monitoringThresholds: {
    apdexScore: 0.998,
    errorRatio: 0.9999,
  },
  serviceLevelIndicators: {
    rails: {
      sliKind: 'apdex',
      successRate: rateMetric(successCounter),
      requestRate: rateMetric(totalCounter),
    },
  },
}
```
To know which SLOs apply to this success rate, we should add an `sliKind` specification to the SLI. Its value should be limited to `error` or `apdex` for now, and it should also be added as a new static label on the resulting recordings.
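
A minimal sketch of how the `sliKind` restriction and the static label could be enforced when an SLI is defined. The `serviceLevelIndicator` wrapper, the `staticLabels` field, and the `sli_kind` label name are hypothetical, and plain strings stand in for the `rateMetric(...)` objects.

```jsonnet
// Sketch only: the helper, the staticLabels field, and the sli_kind label are hypothetical.
local validSliKinds = ['apdex', 'error'];

// Validate sliKind and copy it into the static labels of the SLI.
local serviceLevelIndicator(definition) =
  assert std.member(validSliKinds, definition.sliKind) :
         'sliKind must be one of %s, got %s' % [std.toString(validSliKinds), definition.sliKind];
  definition {
    // `+:` merges with any static labels the definition already carries.
    staticLabels+: { sli_kind: definition.sliKind },
  };

{
  rails: serviceLevelIndicator({
    sliKind: 'apdex',
    successRate: 'rate(success_total[1m])',  // placeholder for rateMetric(successCounter)
    requestRate: 'rate(total[1m])',  // placeholder for rateMetric(totalCounter)
  }),
}
```

With a check like this, an unsupported kind fails when the catalog is evaluated rather than producing recordings with an unexpected label value.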
Discarded ideas
### Idea 1: Make the `apdex:` key understand a `successRateApdex()` object

```jsonnet
// service.jsonnet
{
  // ... service definition
  monitoringThresholds: {
    apdexScore: 0.998,
    errorRatio: 0.9999,
  },
  serviceLevelIndicators: {
    rails: {
      apdex: successRateApdex(successCounter),
      requestRate: rateMetric(totalCounter),
    },
  },
}
```
This proposal allows us to use all of the thresholds defined in
`monitoringThresholds` and `otherThresholds`, and to create recording
rules based on those thresholds.