Do not swallow resolving alerts if setting to auto-close incidents is disabled

The following discussion from !52374 (merged) should be addressed:

  • @syasonik started a discussion:

    The auto_close_incident setting is for closing the issue associated with an alert when an alert "resolves." However, we're currently preventing the alert itself from resolving if the setting isn't enabled. Shouldn't we still resolve the alert, even if the issue shouldn't be?

    If we don't want to resolve the alert, then should we increment the alert counter?


The issues:

  1. The docs on this feature are scattered. It's alluded to in the Metrics Dashboard docs and the Integrations docs, but neither mentions the setting or how it actually works. I can add some; I just want to make sure I'm adding the right thing.
  2. The setting's description says 'Prometheus', but it's for HTTP too.
  3. If we remove 'Prometheus' from the text, it implies the alert is always resolved, but that's not currently the case.

The game plan:

  1. Add docs to Incidents describing the feature/setting, and to the Operations Settings docs, referencing the new documentation and the other related settings (Email Notifications & Create incidents manually).
  2. Change the handling of recovery alerts to always resolve the alert in GitLab. The setting will exclusively control whether the incident closes automatically.
  3. Change the setting description to: "Automatically close associated Incident when an Alert is resolved via recovery alert notification"
  4. The text for the system note from recovery alerts says "@alert-bot logged a resolving alert from HTTP endpoint Test". Update "resolving alert" to "recovery alert".
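
The behavior proposed in step 2 could be sketched roughly as follows. This is only an illustration in Python, not GitLab's actual implementation: `Alert`, `Incident`, `process_recovery_alert`, and the status strings are hypothetical names chosen for the example. Only the `auto_close_incident` setting name comes from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    # Hypothetical stand-in for the issue linked to an alert.
    state: str = "opened"


@dataclass
class Alert:
    # Hypothetical stand-in for an alert record.
    status: str = "triggered"
    incident: Optional[Incident] = None


def process_recovery_alert(alert: Alert, auto_close_incident: bool) -> Alert:
    """Handle a recovery alert notification for an existing alert."""
    # The alert is always resolved, regardless of the setting.
    alert.status = "resolved"

    # The setting only controls whether the linked incident is closed.
    if auto_close_incident and alert.incident is not None:
        alert.incident.state = "closed"

    return alert
```

With the setting disabled, the alert still ends up resolved and the incident stays open; with it enabled, both change state.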
Edited by Sarah Yasonik