Centralized Alerting Framework
As GitLab adds more monitoring features and capabilities, a key foundation will be the ability to notify users and administrators of events that need attention. These events could come from a variety of sources:
- Defined alerting thresholds for metrics (#4451)
- Automated anomaly detection (#3610)
- Behavior change after release (#3555 (closed))
- Auto Log Alerts (#3626)
- GitLab Service Alerts
Rather than building alerting and its supporting functionality into each of these areas separately, we can build a single, centralized alerting framework. This would reduce the total amount of work and offer a single UI for managing notifications across all of these event types. The framework should meet these requirements:
- Alerts should be deliverable through configured chat services, such as Slack or Mattermost
- Alerts should be deliverable via SMS
- Users should be able to respond to and acknowledge alerts directly from the notification method
- Alerts should also feed into the Service Status Dashboard and Internal Ops Dashboard (#3541), if acknowledged as a problem
- Alerts should also be logged in a centralized alert log, so users can review what was firing, when, and for how long. Ideally, the log would also record who acknowledged each alert.
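
To make the proposed shape more concrete, here is a minimal sketch of how a centralized alert hub could tie these requirements together. It is an illustration only, not GitLab's implementation: the names `Alert`, `AlertHub`, `register_channel`, `fire`, `acknowledge`, `resolve`, and `dashboard_alerts` are all hypothetical, channels are modeled as plain callables rather than real Slack, Mattermost, or SMS integrations, and the rule that only acknowledged, unresolved alerts reach the dashboards is an assumption drawn from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass
class Alert:
    """One alert from any source (metric threshold, anomaly detection, ...)."""
    source: str
    message: str
    fired_at: datetime
    acknowledged_by: Optional[str] = None
    resolved_at: Optional[datetime] = None


class AlertHub:
    """Hypothetical single entry point shared by every alert source."""

    def __init__(self) -> None:
        # Channel name -> send function. Real chat or SMS delivery would be
        # plugged in here; this sketch only assumes a callable per channel.
        self.channels: Dict[str, Callable[[Alert], None]] = {}
        # Centralized alert log: every alert that ever fired, in order.
        self.log: List[Alert] = []

    def register_channel(self, name: str, send: Callable[[Alert], None]) -> None:
        self.channels[name] = send

    def fire(self, source: str, message: str) -> Alert:
        """Record the alert in the log and fan it out to every channel."""
        alert = Alert(source, message, datetime.now(timezone.utc))
        self.log.append(alert)
        for send in self.channels.values():
            send(alert)
        return alert

    def acknowledge(self, alert: Alert, user: str) -> None:
        """Record who acknowledged the alert."""
        alert.acknowledged_by = user

    def resolve(self, alert: Alert) -> None:
        """Mark the alert as no longer firing, which fixes its duration."""
        alert.resolved_at = datetime.now(timezone.utc)

    def dashboard_alerts(self) -> List[Alert]:
        """Assumed rule: dashboards show acknowledged, still-open alerts."""
        return [a for a in self.log if a.acknowledged_by and a.resolved_at is None]
```

In this sketch, each source (metric thresholds, anomaly detection, log alerts, and so on) would only need to call `fire`, while delivery, acknowledgement, and logging stay in one place. That is the reduction in duplicated work the proposal is aiming for.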