Automatically embed metrics in issue for all Prometheus alerts
Problem to solve
Charts help users visualize what went wrong when triaging an incident. If a particular threshold was exceeded, we can reduce time spent during investigation by automatically embedding the relevant metrics chart in the issue.
In #119016 (closed), we introduced automatically adding relevant charts for GitLab-configured alerts. In that case, a chart is already defined on the metrics dashboard, so we can pull and display it automatically when an issue is created from an alert.
In this issue, we'll focus on metrics where we don't have an existing chart. What defaults do we need to set from the alert so that we can display a chart in the related issue even in those cases when a chart hasn't been previously defined?
Intended users
Sasha the Software Developer
Devon the DevOps Engineer
Sidney the Systems Administrator
Further details
This work contributes to the Incident Management Vision
Proposal
Automatically embed a chart visualization for the metric that triggered an alert when both title and y_label are present. Set the time frame to be event time +/- 30 minutes. As on the related issue, the embedded chart will display between the Summary and the rest of the payload.
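The time frame described above can be sketched as follows. This is a minimal illustration in Python, not the actual GitLab implementation; the function name `chart_time_window` and the constant `WINDOW_HALF_WIDTH` are hypothetical names chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# Half-width of the chart window, taken from the proposal
# (event time +/- 30 minutes). The constant name is hypothetical.
WINDOW_HALF_WIDTH = timedelta(minutes=30)


def chart_time_window(event_time: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) range for the embedded chart's x-axis."""
    return event_time - WINDOW_HALF_WIDTH, event_time + WINDOW_HALF_WIDTH


# Example: an alert that fired at 12:00 UTC yields an 11:30-12:30 window.
event = datetime(2020, 3, 1, 12, 0, tzinfo=timezone.utc)
start, end = chart_time_window(event)
print(start.isoformat(), end.isoformat())
```

Centering the window on the event keeps both the lead-up to the threshold breach and its immediate aftermath visible in the same one hour chart.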
Additional details
- y_label is required for automatically embedding metrics. When that information is present, we will automatically add a metric to the issue created from the alert. The embed will fail silently if y_label is missing.
- If title is missing from the alert, we will attempt to substitute in the metric name for the title. If the metric name is also missing, the embed will fail silently.
- The default x-axis will be time, with the range being a one hour span around the event (event time ± 30 mins).
- When we do not have the metric defined in GitLab, the embed URL will remain hidden from the description.
- We'll want to disable the "Generate link to chart" dropdown option for 'metric-less' alerts, since we don't have a chart to link to.
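The y_label and title rules in the list above could be expressed roughly as below. This is a sketch under stated assumptions rather than the real implementation: the alert is modeled as a plain dictionary, `metric_name` is a hypothetical key standing in for wherever the metric name lives in the alert, and `ChartSpec` and `build_chart_spec` are names invented for the example. Returning `None` stands in for "fail silently".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChartSpec:
    """Hypothetical container for the defaults needed to render a chart."""
    title: str
    y_label: str
    x_label: str = "time"  # the default x-axis is time


def build_chart_spec(alert: dict) -> Optional[ChartSpec]:
    """Return chart defaults for an alert, or None when no embed is possible."""
    y_label = alert.get("y_label")
    if not y_label:
        # y_label is required; without it the embed fails silently.
        return None

    # Fall back to the metric name when the alert has no title.
    # "metric_name" is an assumed key, not a confirmed payload field.
    title = alert.get("title") or alert.get("metric_name")
    if not title:
        # Neither a title nor a metric name: fail silently.
        return None

    return ChartSpec(title=title, y_label=y_label)


# An alert without a title falls back to its metric name.
spec = build_chart_spec({"metric_name": "cpu_usage", "y_label": "cores"})
print(spec)
```

Returning `None` instead of raising keeps issue creation unaffected when an alert lacks the fields needed for a chart; the caller simply skips the embed.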
Permissions and Security
Documentation
We will need to add a note that y_label is required for automatically embedding metrics in issues created from incidents.