Single KPI for system load
Ask
Ask out of the GitLab.com stability escalation:
"KPI to simplify system load vs. capacity"
- Create a table from https://dbt.gitlabdata.com/#!/source/source.gitlab_snowflake.thanos.periodic_queries to calculate the percentage of horizontally and non-horizontally scalable metrics forecast to be at risk of hitting their capacity thresholds in the next 90 days. Thanos Link
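A minimal sketch of how such a percentage could be calculated, assuming each metric has a single forecast peak for the next 90 days and a single capacity threshold. The `Forecast` fields, the `percent_at_risk` helper, and the sample rows are all hypothetical and made up for illustration; they do not reflect the real schema or data of the `thanos.periodic_queries` source.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    # Hypothetical fields, not the real thanos.periodic_queries columns.
    metric: str
    horizontally_scalable: bool
    peak_forecast_90d: float  # highest forecast saturation ratio in the next 90 days
    threshold: float          # capacity threshold for this metric, as a ratio


def percent_at_risk(forecasts: list[Forecast]) -> dict[str, float]:
    """Percentage of metrics forecast to reach their threshold, per scalability group."""
    totals = {"horizontal": 0, "non_horizontal": 0}
    at_risk = {"horizontal": 0, "non_horizontal": 0}
    for f in forecasts:
        group = "horizontal" if f.horizontally_scalable else "non_horizontal"
        totals[group] += 1
        # A metric counts as at risk when its forecast peak reaches the threshold.
        if f.peak_forecast_90d >= f.threshold:
            at_risk[group] += 1
    # Report 0.0 for an empty group rather than dividing by zero.
    return {g: (100.0 * at_risk[g] / totals[g]) if totals[g] else 0.0 for g in totals}


# Made-up sample rows for illustration only.
sample = [
    Forecast("cpu", True, 0.85, 0.80),
    Forecast("memory", True, 0.60, 0.80),
    Forecast("workers", True, 0.50, 0.80),
    Forecast("open_fds", True, 0.30, 0.80),
    Forecast("disk_space", False, 0.93, 0.90),
    Forecast("pg_connections", False, 0.70, 0.90),
]
kpi = percent_at_risk(sample)
print(kpi)
```

Reporting the two groups separately keeps the distinction the ask draws: a horizontally scalable metric at risk can usually be addressed by adding nodes, while a non-horizontally scalable one may need deeper work.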
Existing Measures / Systems (not exhaustive)
- Overall instrumentation and observability via the many dashboards and alerts
- tamland weekly saturation analysis
- Monthly database saturation analysis
- database saturation metrics
- daily Postgres checks, tracking items such as primary key (PK) exhaustion
- specific project measures, such as the criterion from the current DB Rapid Action to reduce queries to the primary to below 4%, as measured in Kibana and also tracked in this sheet (one of several metrics)
cc: @andrewn @glopezfernandez @brentnewton @marin @amyphillips @rnienaber @davis_townsend
Edited by Davis Townsend