Incident Automation
### Problem to solve

A user will begin triaging a problem by reviewing an incident that's been created by a critical system alert. They should have immediate access to the most pertinent details in the alert payload, metrics that caused the alert, and links to relevant logs or traces. As the user dives into investigation and navigates to metrics, logs, traces for a given pod, they should not lose context to the original incident.

Generating robust, relevant, and well-organized incidents will require configuration with some conditional logic. An operator is going to want to customize an incident based on the critical alert that generated it. Incident customizations may include:

* **Format** - How data appears in the body of the incident
* **Annotations**
* **Embedded metrics** - Charts should be relevant to the alert and show the threshold that was exceeded
* **Links to logs/traces** - Clicking this link should take the user to an aggregated pre-filtered view for the time the incident started
* **Runbooks** - Relevant runbooks should be linked to an incident based on what the problem is
* **Assignment** - Incidents should be automatically assigned to the right people/teams
* **Labelling** - Users should be able to apply specific labels based on the affected systems

### Intended Users

* [Sasha the Software Developer](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
* [Devon the DevOps Engineer](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#devon-devops-engineer)
* [Sidney the Systems Administrator](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sidney-systems-administrator)

### Further Details

This epic supports the [Incident Management](https://about.gitlab.com/direction/monitor/debugging_and_health/incident_management/) and [Triage](https://about.gitlab.com/direction/monitor/workflows/triage/) Vision.

### Proposal

* Enable parameterization of incoming alert data in incidents by supporting a markup language in issue templates. This would allow:
  * Custom formatting
  * Assignment and labelling based on alert attributes
  * Annotations based on alert attributes (e.g. including different runbooks in incidents based on the problem)
* Automatically link and render relevant metrics charts
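
To make the proposal more concrete, here is a rough sketch of how alert data might be parameterized into an incident. It is illustrative only and does not describe an existing GitLab feature: the `$placeholder` syntax (borrowed from Python's `string.Template`), the `Rule` structure, the alert field names, the label and assignee names, and the runbook URL are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from string import Template
from typing import Optional

# Hypothetical issue template. The $placeholders stand in for whatever
# markup language is eventually chosen; this is not an existing syntax.
INCIDENT_TEMPLATE = Template(
    "## $title\n\n"
    "* Service: $service\n"
    "* Severity: $severity\n"
    "* Started at: $started_at\n"
    "* Runbook: $runbook\n"
)


@dataclass
class Rule:
    """Hypothetical rule: if every key in `match` equals the alert's value,
    apply the labels, assignees, and runbook to the incident."""
    match: dict
    labels: list = field(default_factory=list)
    assignees: list = field(default_factory=list)
    runbook: Optional[str] = None


# Example rules; every name and URL here is made up for illustration.
RULES = [
    Rule(
        match={"service": "database"},
        labels=["database"],
        assignees=["db-oncall"],
        runbook="https://runbooks.example.com/database",
    ),
    Rule(match={"severity": "critical"}, labels=["severity::1"]),
]


def build_incident(alert: dict, rules: list = RULES) -> dict:
    """Render the incident body and collect labels/assignees from matching rules."""
    labels, assignees, runbook = [], [], None
    for rule in rules:
        if all(alert.get(key) == value for key, value in rule.match.items()):
            labels.extend(rule.labels)
            assignees.extend(rule.assignees)
            # The first matching rule that names a runbook wins.
            if runbook is None and rule.runbook:
                runbook = rule.runbook
    # safe_substitute leaves unknown placeholders in place instead of raising,
    # so an alert with missing fields still produces an incident.
    description = INCIDENT_TEMPLATE.safe_substitute(
        alert, runbook=runbook or "none linked"
    )
    return {"description": description, "labels": labels, "assignees": assignees}


alert = {
    "title": "High CPU",
    "service": "database",
    "severity": "critical",
    "started_at": "2020-01-01T00:00:00Z",
}
incident = build_incident(alert)
print(incident["description"])
print(incident["labels"], incident["assignees"])
```

Keeping the rules as data rather than code is one possible design choice: it would let an operator adjust formatting, assignment, labelling, and runbook links per alert without changing the incident-creation logic itself.
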
epic