Close incident on recovery alert from generic alert endpoint

Problem to solve

Incidents resolve for one of two reasons:

  1. Someone fixed the problem
  2. The problem fixed itself

In both scenarios, the monitoring tool often emits a recovery alert. When it does, we want that alert to automatically close the associated incident in GitLab for the following reasons:

  1. If a system fixes itself, the user may not know about it, so the recovery alert should close the incident issue to show responders that the incident is resolved.
  2. If someone fixes a problem and the monitoring tool quickly recognizes the fix, closing the incident on the recovery alert saves the responder time.

Additionally, with this automation in place, users can trust that any open incident is still active, because their tools automatically close incident issues once the problem has been solved.

This behavior will be configurable via a checkbox that lets the user turn the automation on or off. Depending on the workflow of the customer's response team, they may or may not want incidents to close automatically.

Intended users

Further details

This work contributes to the Incident Management Vision.

Proposal

Add functionality to the new checkbox so that, when it is enabled, a recovery alert automatically closes the associated incident.
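
The proposed behavior could be sketched roughly as follows. This is only an illustration under stated assumptions, not GitLab's implementation: the names `Incident`, `ProjectSettings`, `AlertStore`, `auto_close_incident`, and `handle_alert` are hypothetical, and the sketch assumes that a recovery alert is a payload carrying an `end_time` value and that alerts are matched to incidents by a `fingerprint` field.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Incident:
    title: str
    state: str = "opened"  # either "opened" or "closed"


@dataclass
class ProjectSettings:
    # Hypothetical name for the checkbox described in this issue.
    auto_close_incident: bool = False


@dataclass
class AlertStore:
    # Assumption: incidents are looked up by the alert's fingerprint.
    incidents: Dict[str, Incident] = field(default_factory=dict)


def handle_alert(
    payload: dict, settings: ProjectSettings, store: AlertStore
) -> Optional[Incident]:
    """Close the associated incident if this is a recovery alert and the setting is on."""
    incident = store.incidents.get(payload.get("fingerprint"))
    # Assumption: a payload with an end_time marks the alert as recovered.
    is_recovery = payload.get("end_time") is not None
    if incident is None or not is_recovery:
        return incident
    if settings.auto_close_incident and incident.state == "opened":
        incident.state = "closed"
    return incident
```

With the checkbox off, a recovery alert leaves the incident open, which preserves the current workflow for teams that prefer to close incidents manually.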

Permissions and Security

Documentation

Testing

What does success look like, and how can we measure that?

What is the type of buyer?

Links / references

Edited by Sarah Waldner