
feat: add counter apdexes for reference arch monitoring

Andrew Newdigate requested to merge counter-apdex-for-reference-webservice into master

Closes #109 (closed).

This updates reference architecture monitoring to compute apdex from counters that use custom, per-endpoint latency thresholds instead of a single fixed latency threshold.

This may help with spurious SLO alerts in single-tenant environments, such as the one reported in https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/1725.
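As a rough illustration of the difference (not the code in this merge request), the sketch below compares an apdex computed against one fixed threshold with an apdex computed from success/total counters, where each request is judged against its own endpoint's threshold. The endpoint names, threshold values, and sample durations are all made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    endpoint: str
    duration_s: float


# Hypothetical values, chosen only to illustrate the idea.
FIXED_THRESHOLD_S = 1.0
ENDPOINT_THRESHOLDS_S = {
    "fast_endpoint": 1.0,
    "slow_endpoint": 5.0,  # the owning team expects this one to be slower
}


def apdex_fixed(requests: list[Request],
                threshold_s: float = FIXED_THRESHOLD_S) -> Optional[float]:
    """Apdex where every request is judged against one fixed threshold."""
    if not requests:
        return None
    satisfied = sum(1 for r in requests if r.duration_s <= threshold_s)
    return satisfied / len(requests)


def apdex_counters(requests: list[Request],
                   thresholds: dict[str, float],
                   default_s: float = FIXED_THRESHOLD_S) -> Optional[float]:
    """Apdex as success_count / total_count, using per-endpoint thresholds."""
    total = 0
    success = 0
    for r in requests:
        total += 1  # every request increments the total counter
        if r.duration_s <= thresholds.get(r.endpoint, default_s):
            success += 1  # only satisfactory requests increment success
    return success / total if total else None


# Made-up traffic for a quiet instance: a few slow-but-expected requests.
sample = (
    [Request("fast_endpoint", 0.2)] * 98
    + [Request("slow_endpoint", 3.0)] * 2
)
fixed = apdex_fixed(sample)
custom = apdex_counters(sample, ENDPOINT_THRESHOLDS_S)
```

On a quiet instance the total is small, so a handful of requests that exceed a fixed threshold but are within their endpoint's own threshold can visibly lower the fixed-threshold apdex while leaving the counter-based apdex unchanged.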

Comparing Results

In incident https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/1725, we saw apdex drop to 98% for well over 2 hours as queries hit the otherwise quiet instance.

screenshot-andrewn-2023-02-28T10h50Z_2x

With this change, almost all requests fall within the satisfactory latency threshold specified by the stage group teams that own these endpoints:

screenshot-andrewn-2023-02-28T10h49Z_2x

Edited by Andrew Newdigate
