Automatically embed metrics in issues for all GitLab-configured alerts
Problem to solve
Charts help users visualize what went wrong when triaging an incident. If a particular threshold was exceeded, we can reduce the time spent on investigation by automatically embedding the relevant metrics chart in the issue.
Intended users
Sasha the Software Developer
Devon the DevOps Engineer
Sidney the Systems Administrator
Further details
This work contributes to the Incident Management Vision
Proposal
The first iteration will involve metrics for GitLab-configured alerts (where a chart already exists on our metrics dashboard). We would:
- Find the metric and use the corresponding dashboard/embed configuration including chart title, units, etc.
- Automatically embed a chart visualization for the metric that triggered the alert. Set the chart's time frame to the event time +/- 30 minutes.
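As a rough illustration of the time frame rule above, the window could be derived from the alert's event time as in the sketch below. The names `chart_window` and `WINDOW_MARGIN` are hypothetical and do not refer to existing GitLab code; the example only shows the +/- 30 minute arithmetic.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Hypothetical constant: margin applied on each side of the event time.
WINDOW_MARGIN = timedelta(minutes=30)


def chart_window(event_time: datetime,
                 margin: timedelta = WINDOW_MARGIN) -> Tuple[datetime, datetime]:
    """Return the (start, end) of the chart time frame around an alert event."""
    return event_time - margin, event_time + margin


# Example: an alert that fired at 12:00 UTC yields an 11:30 to 12:30 window.
event = datetime(2020, 1, 15, 12, 0, tzinfo=timezone.utc)
start, end = chart_window(event)
```

Keeping the margin as a parameter would make it straightforward to revisit the 30 minute value in a later iteration.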
The chart should display in the issue description, after the summary and before the alert details (i.e., the rest of the alert payload).
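The intended ordering of the issue description could be sketched as follows. This is only an assumption-laden illustration: `build_description` is a hypothetical helper, and the chart embed is treated as an opaque string because the exact embed syntax is not specified here.

```python
def build_description(summary: str, chart_embed: str, alert_details: str) -> str:
    """Assemble the issue description: summary, then chart, then alert details.

    Empty sections are skipped, so an alert without a chart still produces
    a well-formed description.
    """
    sections = [summary, chart_embed, alert_details]
    return "\n\n".join(s.strip() for s in sections if s and s.strip())


# Example with placeholder content (all values are illustrative only).
description = build_description(
    summary="Summary: threshold exceeded",
    chart_embed="<chart embed placeholder>",
    alert_details="Alert details: rest of the alert payload",
)
```

Skipping empty sections means the same helper would also cover the out-of-scope alert types, which simply have no chart to embed.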
Out of scope for this iteration
- Embedding metrics in issues created from Prometheus alerts that were not created in GitLab. This will be tackled in #195739 (closed)
- Embedding metrics in issues created from alerts triggered by the generic alert endpoint
Permissions and Security
Documentation
Testing
What does success look like, and how can we measure that?
What is the type of buyer?
Links / references
Edited by Amelia Bauerly