Use the new metrics as service level indicators in the service catalog

Now that we have the new metrics defined, we'll want to use them as an SLI in our service catalog. This new metric should replace the apdex for the puma component of the web, git, and api services.

Currently, an SLI definition has no concept of a success rate: only apdex, errorRate, and requestRate are supported. The apdex key expects a histogramApdex kind of object, from which the gitlab_component_apdex:success:rate and gitlab_component_apdex:weight:score recordings are generated: https://gitlab.com/gitlab-com/runbooks/blob/dde7b5ef3270e8ff3ebb2dbd9cf6ca72b59239c4/rules/autogenerated-key-metrics-web.yml#L103. So we'll need to make sure the SLI definitions can handle success rates as well.

Proposal: Make the serviceLevelIndicator use a successRate

// service.jsonnet
{
  // ... service definition
  monitoringThresholds: {
    apdexScore: 0.998,
    errorRatio: 0.9999,
  },

  serviceLevelIndicators: {
    rails: {
      sliKind: 'apdex',
      successRate: rateMetric(successCounter),
      requestRate: rateMetric(totalCounter),
    }
  }
}

To know which SLO to apply to this success rate, we should add an sliKind specification to the SLI. Its value should also be added as a new static label on the resulting recordings, and be limited to error or apdex for now.
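A minimal sketch of how the sliKind could drive both the threshold lookup and the static label. Everything here is an assumption made for illustration: the helper names (serviceLevelIndicator, thresholdKeys), the sli_kind label name, the counter names, and the simplified rateMetric stand-in are hypothetical and are not taken from the runbooks repository.

```jsonnet
// Hypothetical sketch; none of these helpers come from the runbooks repository.
local validSliKinds = ['apdex', 'error'];

// Assumed mapping from sliKind to the monitoringThresholds key that holds its SLO.
local thresholdKeys = { apdex: 'apdexScore', error: 'errorRatio' };

// Assumed stand-in for rateMetric(): renders a PromQL rate over a counter.
local rateMetric(counter) = {
  expr: 'sum(rate(%s[1m]))' % [counter],
};

local serviceLevelIndicator(name, sli, thresholds) =
  // Reject any kind other than the two allowed for now.
  assert std.member(validSliKinds, sli.sliKind) :
         'sliKind must be one of: ' + std.join(', ', validSliKinds);
  // The kind becomes a static label on every recording (label name is assumed).
  local labels = { component: name, sli_kind: sli.sliKind };
  {
    slo: thresholds[thresholdKeys[sli.sliKind]],
    recordings: {
      successRate: { labels: labels, expr: sli.successRate.expr },
      requestRate: { labels: labels, expr: sli.requestRate.expr },
    },
  };

// Usage with placeholder counter names.
serviceLevelIndicator(
  'rails',
  {
    sliKind: 'apdex',
    successRate: rateMetric('example_success_total'),
    requestRate: rateMetric('example_requests_total'),
  },
  { apdexScore: 0.998, errorRatio: 0.9999 },
)
```

With sliKind set to 'apdex', the sketch picks apdexScore as the SLO; an unsupported kind fails the assertion at evaluation time instead of silently producing recordings without a matching threshold.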

Discarded ideas

Idea 1: Make the `apdex:` key understand a successRateApdex() object

// service.jsonnet
{
  // ... service definition
  monitoringThresholds: {
    apdexScore: 0.998,
    errorRatio: 0.9999,
  },

  serviceLevelIndicators: {
    rails: {
      apdex: successRateApdex(successCounter),
      requestRate: rateMetric(totalCounter),
    }
  }
}

This proposal allows us to use all of the thresholds defined in monitoringThresholds and otherThresholds, and to create recording rules based on those thresholds.

Edited by Bob Van Landuyt