Split apdex recording rule into component parts

The way we measure apdex can be (roughly) stated as:

apdex ratio = (requests which took a satisfactory amount of time)/(total number of requests)
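As a rough illustration, the ratio above can be computed from raw request durations. This is a minimal sketch only: the function name, the sample durations and the 1-second threshold are hypothetical, and the real recording rules work on Prometheus rate series rather than on individual requests.

```python
def apdex_ratio(durations_s, satisfied_threshold_s):
    """Return (satisfactory requests) / (total requests), or None with no traffic."""
    total = len(durations_s)
    if total == 0:
        # No requests means the ratio is undefined, not zero.
        return None
    # Numerator: requests that completed within the (assumed) threshold.
    satisfied = sum(1 for d in durations_s if d <= satisfied_threshold_s)
    return satisfied / total


# Hypothetical sample: 3 of 5 requests finish within 1 second.
durations = [0.2, 0.4, 0.9, 1.5, 3.0]
print(apdex_ratio(durations, satisfied_threshold_s=1.0))  # 0.6
```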

This change splits out a new recording rule, gitlab_component_apdex:success:rate, containing the numerator (top) of this ratio.

Having this as its own recording rule will allow us to generate long-term (rolling 30d) availability measurements for each SLI.
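One way to see why a separate numerator helps: a long-term ratio is best computed as the sum of numerators over the sum of denominators, which weights each window by its traffic, whereas averaging precomputed ratios gives quiet windows the same weight as busy ones. The sketch below assumes this is the intended aggregation; the function name and sample values are hypothetical, and the actual rule expressions are not shown in this description.

```python
def rolling_apdex(success_rates, total_rates):
    """Aggregate per-window rates as sum(numerators) / sum(denominators)."""
    total = sum(total_rates)
    if total == 0:
        # No traffic across the whole period: the ratio is undefined.
        return None
    return sum(success_rates) / total


# Hypothetical per-window request rates: one quiet window, one busy window.
success = [1.0, 990.0]   # numerator samples (satisfactory requests)
total = [2.0, 1000.0]    # denominator samples (all requests)

weighted = rolling_apdex(success, total)
mean_of_ratios = sum(s / t for s, t in zip(success, total)) / len(total)

print(round(weighted, 3))        # 0.989
print(round(mean_of_ratios, 3))  # 0.745
```

The quiet window drags the naive average down to 0.745, while the traffic-weighted figure stays close to what most requests actually experienced.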

Gradual Rollout

For risk mitigation, this change only adds gitlab_component_apdex:success:rate to the 1m rate.

The 1m rate is not used in SLI calculation, so it allows us to confirm that everything is working while avoiding the risk of breaking our entire SLI/SLO infrastructure.

Next Steps

The next step will be to aggregate gitlab_component_apdex:success:rate into Thanos, and roll it out for the 5m, 1h, 30m and 6h values.

Edited by Andrew Newdigate
