Displaying error budget for a stage group

For this issue, we want to show the error budgets to the stage group in the following form:

Four panels in a row titled "Error Budgets (past 28 days)":

  1. Availability: The overall success ratio, shown as a percentage
  2. Minutes spent: The number of minutes spent out of the ~20 minutes available
  3. Minutes remaining: The number of minutes remaining out of the ~20 minutes available (based on the 99.95% availability target)
  4. Info panel with links to the handbook and the calculation.

The simplified calculation is:

Availability

(the number of operations with a satisfactory apdex + the number of operations without errors)
/
(the total number of apdex measurements + the total number of operations)

This expresses availability as a percentage, using values from the previous 28 days in the calculation. The error budget panels are derived from this value.
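As a rough illustration of the simplified availability calculation, here is a minimal Python sketch. The function name, parameter names, and the counts in the example are hypothetical and chosen only for illustration; the real dashboard would compute this from its metrics source rather than from plain integers.

```python
def availability(apdex_satisfactory: int, apdex_total: int,
                 operations_without_errors: int, operations_total: int) -> float:
    """Return overall availability as a ratio between 0 and 1.

    Mirrors the simplified formula: successes from both the apdex and the
    error components, divided by the total measurements of both components.
    """
    denominator = apdex_total + operations_total
    if denominator == 0:
        # Assumption: with no measurements in the window there is nothing to report.
        raise ValueError("no measurements in the window")
    return (apdex_satisfactory + operations_without_errors) / denominator


# Hypothetical 28-day counts, for illustration only.
ratio = availability(9_990, 10_000, 9_998, 10_000)
print(f"{ratio:.2%}")  # 99.94%
```

Summing both components before dividing means the component with more measurements carries more weight in the result.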

Minutes remaining

We could calculate the remaining budget in seconds as follows:

((1 - <target>) * 28 * 86400) - ((1 - <availability>) * 28 * 86400)

Grafana will then turn this into minutes/hours as appropriate.
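The formula can be sketched in Python as below. The names are hypothetical, and treating the second term of the formula as "seconds spent" is an assumption about how the Minutes spent panel would be derived.

```python
WINDOW_SECONDS = 28 * 86400  # 28-day window, in seconds


def seconds_spent(availability: float) -> float:
    """Budget consumed: the share of the window that was not available."""
    return (1 - availability) * WINDOW_SECONDS


def seconds_remaining(target: float, availability: float) -> float:
    """Total budget allowed by the target minus the budget already spent."""
    budget = (1 - target) * WINDOW_SECONDS
    return budget - seconds_spent(availability)


# With the 99.95% target and 99.97% availability:
remaining = seconds_remaining(0.9995, 0.9997)
print(round(remaining / 60))  # about 8 minutes remaining
```

A negative result means the budget has been overspent, which is how an availability below the target would show up.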


With a target of 99.95%, this would give the following values for different availabilities:

Availability   Minutes Spent   Minutes Remaining
100%           0               20
99.97%         12              8
99.95%         20              0
99.90%         40              -20

The purpose of the error budget is to inform prioritization decisions and to drive action.

Edited by Rachel Nienaber