
Consider opening incident issues for monitoring alerts

Corrective action from production#3308 (comment 483316459)

There are two sources of alerting rules: "generic" rules generated from SLIs in the metrics-catalog, and custom rules in the "rules" directory in runbooks. In the metrics-catalog, our team has an incident project set, so every SLI marked as owned by our team gets incidents opened in the production issue tracker for its alerts.

We could add the incident_project attribute to the custom monitoring-related rules as well. If we use the production project, Woodhouse will post updates to #incident-management, which effectively makes these alerts slightly noisier, because people pay more attention to that channel than to #alerts.

This would make us more likely to notice problems like the Thanos compactor being broken for a month.
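
To make the proposal concrete, here is a rough sketch of what tagging a custom rule might look like. It is written in Python purely for illustration and is not how the runbooks repo actually defines rules. It assumes incident_project is carried as a label on a Prometheus-style rule; the helper `with_incident_project`, the project path, and the example alert are all hypothetical placeholders.

```python
import copy

# Assumed placeholder: the issue only says "the production project",
# so this path is illustrative, not confirmed.
PRODUCTION_PROJECT = "gitlab.com/gitlab-com/gl-infra/production"


def with_incident_project(rule, project=PRODUCTION_PROJECT):
    """Return a copy of an alerting rule with incident_project set as a label.

    Assumes a Prometheus-style rule dict with an optional "labels" mapping.
    """
    updated = copy.deepcopy(rule)
    updated.setdefault("labels", {})["incident_project"] = project
    return updated


# Hypothetical custom monitoring rule; name, expression and severity are made up.
compactor_rule = {
    "alert": "ThanosCompactorHalted",
    "expr": "thanos_compact_halted == 1",
    "for": "5m",
    "labels": {"severity": "s3"},
}

routed_rule = with_incident_project(compactor_rule)
```

The original rule is left untouched, so the same helper could be applied selectively to only the rules we want to open incidents for.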

@gitlab-com/gl-infra/sre-observability wdyt?

Edited by Craig Furman