Automatically embed metrics in issues for all Prometheus alerts

Problem to solve

Charts help users visualize what went wrong when triaging an incident. If a particular threshold was exceeded, we can reduce the time spent on investigation by automatically embedding the relevant metrics chart in the issue.

In #119016 (closed), we introduced automatically adding relevant charts for GitLab configured alerts. In that instance, we have an existing chart defined on the metrics dashboard that we can pull and display automatically when an issue is created from an alert.

In this issue, we'll focus on metrics where we don't have an existing chart. What defaults do we need to set from the alert so that we can display a chart in the related issue even in those cases when a chart hasn't been previously defined?

Intended users

Sasha the Software Developer
Devon the DevOps Engineer
Sidney the Systems Administrator

Further details

This work contributes to the Incident Management Vision

Proposal

Automatically embed a chart visualization for the metric that triggered an alert when both title and y_label are present. Set the time frame to event time ± 30 minutes. As in the related issue, the embedded chart will display between the Summary and the rest of the payload.

Additional details

  • y_label is required for automatically embedding metrics. When it is present, we will automatically embed the metric chart in the issue created from the alert. The embed will fail silently if y_label is missing.
  • If title is missing from the alert, we will attempt to substitute in the metric name for the title. If the metric name is also missing, the embed will fail silently.
  • The default x-axis will be time, with the range being a one-hour span around the event (event time ± 30 minutes).
  • When we do not have the metric defined in GitLab, the embed URL will remain hidden from the description.
  • We'll want to disable the "Generate link to chart" dropdown option for 'metric-less' alerts, since we don't have a chart to link to.
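
The defaulting rules above could be sketched roughly as follows. This is an illustration only, not the GitLab implementation: the function name, the ChartParams type, the dictionary shape of the alert, and the "metric_name" key are all assumptions made for the example. Only title, y_label, and the ± 30 minute window come from this issue.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Half-width of the chart window around the event, per the proposal.
WINDOW = timedelta(minutes=30)


@dataclass
class ChartParams:
    """Hypothetical container for the defaults needed to render a chart."""
    title: str
    y_label: str
    start: datetime
    end: datetime


def chart_params_for_alert(alert: dict, event_time: datetime) -> Optional[ChartParams]:
    """Return chart defaults for an alert, or None when the embed is skipped."""
    y_label = alert.get("y_label")
    if not y_label:
        # y_label is required; without it the embed fails silently.
        return None

    # Substitute the metric name when the title is missing.
    # "metric_name" is an assumed key, not a confirmed payload field.
    title = alert.get("title") or alert.get("metric_name")
    if not title:
        # Neither title nor metric name is available: fail silently.
        return None

    return ChartParams(
        title=title,
        y_label=y_label,
        start=event_time - WINDOW,
        end=event_time + WINDOW,
    )
```

Returning None rather than raising mirrors the "fail silently" behaviour described above: the caller creates the issue as usual and simply omits the chart.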

Permissions and Security

Documentation

We will need to document that y_label is required for automatically embedding metrics in issues created from incidents.

Testing

What does success look like, and how can we measure that?

What is the type of buyer?

Links / references

Edited by Amelia Bauerly