Regional SLO alerts
This change completes the set of MRs that add regional monitoring for some services at GitLab.
We already have regional dashboards, for example https://dashboards.gitlab.net/d/git-regional/git-regional-detail and https://dashboards.gitlab.net/d/registry-regional/registry-regional-detail?orgId=1.
This MR adds regional SLO alerts for those services.
I've included some comments in the code for reviewers...
Dashboards
Each aggregation set now has Apdex and Error Rate SLO analysis dashboards.
Snapshots of these dashboards are here:
- https://dashboards.gitlab.net/dashboard/snapshot/v06Ld6j0T0TTH0vh1uynaHptTBvw9e8D - alerts: Global Node-Aggregated SLI Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/S33I0QGoZYYd1FX6KE0gzt3041mGFJYe - alerts: Global Node-Aggregated SLI Metrics Error SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/W2Xd7HJEZqgAA65lcycw85CXvf49RV0j - alerts: Global SLI Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/Xb4fy8EOtPKAft8ugdaPCOrKjyFx1QyO - alerts: Global SLI Metrics Error SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/gARUuT3uvEQsuzwdRsQ7oWLkYhYgL1t2 - alerts: Regional SLI Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/ARwfTELamTGVX2SaPRB09KOvxymNYDm8 - alerts: Regional SLI Metrics Error SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/YW9NgW1C5eY4GApTv1T54dX5PRZE2oru - alerts: Global Service-Node-Aggregated Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/r9u6uKNDL9MHhJq24GrOROEVY0kdk97w - alerts: Global Service-Node-Aggregated Metrics Error SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/CBapBtgmzFZyNVPnfVscfiSvGFwN3pRo - alerts: Global Service-Regional-Aggregated Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/id9SPnrSXxX7dzAk0NuDTNMUWQJz94sb - alerts: Global Service-Regional-Aggregated Metrics Error SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/VpN24ABPBY3o70pkSO6PjowrG1axMVtm - alerts: Global Service-Aggregated Metrics Apdex SLO Analysis
- https://dashboards.gitlab.net/dashboard/snapshot/YZzr0b6Dp00gBbVvi8UPgbSW3qMqK2kd - alerts: Global Service-Aggregated Metrics Error SLO Analysis
This is helpful when, say, a regional alert fires: the SLO Analysis dashboard can be used to generate a snapshot for the alert. For example, for puma in us-east1-b: https://dashboards.gitlab.net/dashboard/snapshot/gARUuT3uvEQsuzwdRsQ7oWLkYhYgL1t2?orgId=1&var-PROMETHEUS_DS=Global&var-environment=gprd&var-type=git&var-stage=main&var-region=us-east1-b&var-component=puma&var-proposed_slo=NaN
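For reviewers who want to see how such a link is composed, here is a minimal Python sketch that fills the dashboard's template variables from an alert's labels. It is illustrative only and not part of this MR: the function name `slo_analysis_url` and the assumption that an alert carries `environment`, `type`, `stage`, `region` and `component` labels are hypothetical. The `var-*` parameter names are copied from the example URL above.

```python
from urllib.parse import urlencode

# Snapshot URL of the Regional SLI Metrics Apdex SLO Analysis dashboard (from the list above).
BASE_URL = "https://dashboards.gitlab.net/dashboard/snapshot/gARUuT3uvEQsuzwdRsQ7oWLkYhYgL1t2"

# Assumption: these alert labels map one-to-one onto the dashboard's var-* template variables.
LABEL_NAMES = ("environment", "type", "stage", "region", "component")


def slo_analysis_url(labels, base_url=BASE_URL, datasource="Global"):
    """Build an SLO Analysis dashboard URL with template variables taken from alert labels."""
    params = {"orgId": "1", "var-PROMETHEUS_DS": datasource}
    for name in LABEL_NAMES:
        params[f"var-{name}"] = labels[name]
    # The example URL leaves the proposed SLO unset, which appears as NaN.
    params["var-proposed_slo"] = "NaN"
    return f"{base_url}?{urlencode(params)}"


# Hypothetical labels for the puma example in us-east1-b.
example_labels = {
    "environment": "gprd",
    "type": "git",
    "stage": "main",
    "region": "us-east1-b",
    "component": "puma",
}
example_url = slo_analysis_url(example_labels)
print(example_url)
```

With the labels shown, the sketch reproduces the puma example URL above. Swapping `base_url` for one of the Error SLO Analysis snapshots would give the corresponding error-rate view, assuming that dashboard uses the same variable names.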