Add saturation as a general metric (!1188) · Merge requests · GitLab.com / Runbooks

Over the past few days, for example in the following incidents, we have reached saturation on a resource.

Currently we don't monitor this as a key metric. The past few weeks have shown that we should.

Saturation is modelled against a known, finite upper limit for a given resource. Each service can have multiple saturation components.

For example, saturation components can include memory, CPU, and single cores (for single-threaded services such as Redis).
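As a rough illustration of measuring a component against its limit, the sketch below expresses saturation as the fraction of a finite limit currently in use. The function name `saturation_ratio`, the clamping to 1.0, and the sample numbers are assumptions made for this example and are not taken from the merge request.

```python
def saturation_ratio(used: float, limit: float) -> float:
    """Return usage as a fraction of a known, finite upper limit.

    Hypothetical helper: the result is clamped to 1.0 so that a reading
    above the nominal limit still reports as fully saturated.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive, finite value")
    if used < 0:
        raise ValueError("used must not be negative")
    return min(used / limit, 1.0)


# Illustrative values only: 6 GiB used out of an 8 GiB memory limit.
memory_saturation = saturation_ratio(used=6.0, limit=8.0)
print(f"memory saturation: {memory_saturation:.0%}")
```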

The saturation metric for a service is aggregated as the maximum saturation point of any of the components of that service.

For example, if the widget service has the following saturation metrics

Then, the saturation of the service is 95%.

This is because saturation can be thought of as a bottleneck. A service is as saturated as its most saturated component.
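The maximum-based aggregation described above can be sketched in a few lines. This is a minimal illustration, not the implementation in the runbooks: the function names, the component names, and the sample ratios are all hypothetical and are not the figures from the widget service example.

```python
from __future__ import annotations


def service_saturation(components: dict[str, float]) -> float:
    """Aggregate a service's saturation as the maximum of its components.

    Each value is a saturation ratio in the range 0.0 to 1.0.
    """
    if not components:
        raise ValueError("at least one saturation component is required")
    for name, ratio in components.items():
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"{name}: ratio {ratio} is outside 0.0..1.0")
    return max(components.values())


def bottleneck(components: dict[str, float]) -> str:
    """Return the name of the most saturated component."""
    if not components:
        raise ValueError("at least one saturation component is required")
    return max(components, key=lambda name: components[name])


# Hypothetical component readings for an illustrative service.
example_components = {"memory": 0.40, "cpu": 0.65, "single_core": 0.90}

print(service_saturation(example_components))
print(bottleneck(example_components))
```

Taking the maximum rather than the mean keeps a single exhausted resource from being hidden by other components that still have headroom.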

Edited Jul 10, 2019 by Andrew Newdigate
