GitLab.com monitoring labels seem to be missing
During this past Americas shift of EoC, I noticed that several notifications are missing stage
and env
labels. There may be more missing, but these are the ones I noticed.
GCPScheduledSnapshotsFailed alert paged in #production for a staging problem. This made it difficult to know where the problem was, and once found, difficult to silence without silencing all alerts of this type for all environments. Here is a related comment in the EoC newsletter about this.
FrontendServiceWebsocketsServicesErrorSLOViolation alert has no stage label and alerted for an empty stage. I documented some more information from this in the related incident issue.
In both situations, some links to dashboards and alert promql queries did not work correctly due to missing labels, as well.