Autogenerate routing tree based on feature categories
Introduction
Teams need to receive service-level alerts for the services that they are responsible for.
For example:
- An EM, @zj-gitlab, approached me saying "yesterday there was an incident involving one of our services. The Gitaly team would have liked to help out, but were unaware of the incident".
- The Pages team would like to be notified of service-level degradations of the Pages service: #344 (closed)
Problem
Currently, our alert routing tree is quite ad hoc. Additionally, it's difficult to update the tree to route alerts to a specific team.
Proposal
Use the feature_category attribute and the service catalog to generate the routing tree, so that alerts carrying the appropriate feature_category label are routed to the responsible engineering teams and those teams are aware of ongoing incidents.
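As a rough illustration of the idea, the sketch below derives Alertmanager-style routes from a feature_category-to-team mapping. It is a minimal sketch, not the actual implementation: the data structures, the `slack_channel` field, the channel names, the `team_<name>` receiver naming, and the function `build_team_routes` are all assumptions made for this example, and the real service catalog format may differ.

```python
from collections import defaultdict

# Hypothetical service catalog excerpt: team name -> where alerts should go.
# The field name and channel names are assumptions for illustration only.
TEAMS = {
    "gitaly": {"slack_channel": "#example_gitaly_alerts"},
    "pages": {"slack_channel": "#example_pages_alerts"},
}

# Hypothetical mapping of feature_category -> owning team.
FEATURE_CATEGORIES = {
    "gitaly": "gitaly",
    "pages": "pages",
}


def build_team_routes(feature_categories, teams):
    """Return one Alertmanager-style route per team that has a channel."""
    # Group feature categories by the team that owns them.
    categories_by_team = defaultdict(list)
    for category, team in feature_categories.items():
        categories_by_team[team].append(category)

    routes = []
    for team in sorted(categories_by_team):
        # Skip teams that have not opted in by declaring a channel.
        if not teams.get(team, {}).get("slack_channel"):
            continue
        pattern = "|".join(sorted(categories_by_team[team]))
        routes.append({
            "receiver": f"team_{team}",
            "matchers": [f'feature_category=~"{pattern}"'],
            # Keep evaluating later routes so existing routing still applies.
            "continue": True,
        })
    return routes


if __name__ == "__main__":
    for route in build_team_routes(FEATURE_CATEGORIES, TEAMS):
        print(route)
```

With a generator along these lines, a team would opt in by adding its channel to the catalog in an MR, and the routes would be regenerated rather than edited by hand.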
Benefit/Impact
- More self-service capability for teams to receive the alerts they're interested in (via MRs on the runbooks repo, where the required change should be easy to identify)
- Broader understanding of the operational system by dev teams
Tasks
TBD
Edited by Craig Miskell