Custom metrics are still half-baked long after going GA

Summary

Many issues in the Prometheus integration that existed a year ago are still present.

Those issues are mostly concentrated in the area of custom Prometheus metrics for projects deployed in Kubernetes.

Using GitLab 12.1.6-ee (d05ee0a9)

The issues are broken down below:

Custom metric query validation

Issue: #13135 (closed)

What is the current bug behavior?

Custom metric query validation keeps breaking. Most of the time the query fails to validate, even when I copy a valid query straight from Prometheus or Grafana. Deleting a random character and typing it back usually makes the query validate.

What is the expected correct behavior?

When adding a custom metric, I expect any valid Prometheus query to pass validation in GitLab as well.

User is sent to the Prometheus custom integration page

Issue: #11312

What is the current bug behavior?

After adding a custom metric for a GitLab-managed Prometheus, the user is redirected to the Prometheus custom integration page for no apparent reason. There, the UI says "configure Prometheus integration to add custom metrics", which is very confusing.

What is the expected correct behavior?

When I add a custom metric, I expect to stay on the monitoring page and see the new metric among all other metrics.

Patterns in Legend

Issue: #32705

What is the current bug behavior?

There's no (documented) way to use patterns in the legend, like {{hostname}} in Grafana.

What is the expected correct behavior?

I want to be able to use the grouping labels of an aggregated query result in the graph legend, something Grafana has supported for years.
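
To illustrate the behavior I mean, here is a minimal Python sketch of Grafana-style legend substitution. It is not GitLab's or Grafana's actual implementation. The function name render_legend and the sample labels are hypothetical, and leaving unknown placeholders untouched is an assumption made only for this sketch.

```python
import re

# Matches {{label}} or {{ label }}. Label names follow the Prometheus
# naming rule: a letter or underscore, then letters, digits or underscores.
_PLACEHOLDER = re.compile(r"\{\{\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*\}\}")


def render_legend(template: str, labels: dict) -> str:
    """Replace each {{label}} in the template with the series' label value.

    Placeholders that name a label the series does not have are left
    as-is (an assumption for this sketch, not documented behavior).
    """
    def substitute(match):
        name = match.group(1)
        return labels.get(name, match.group(0))

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical labels of one series returned by a query such as
# sum by (hostname) (rate(http_requests_total[5m]))
series_labels = {"hostname": "web-1"}
print(render_legend("Requests on {{hostname}}", series_labels))
```

With a pattern like this, each series returned by an aggregated query would get its own legend entry built from its labels.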

Change or Delete Custom Metrics

Issue: #13592

What is the current bug behavior?

Custom metrics for GitLab-managed Prometheus are impossible to change or delete.

What is the expected correct behavior?

It should be possible to edit and delete custom metrics for GitLab-managed Prometheus as well.

Error Retrieving Metrics

Issues:

  • #28855 (closed)
  • #30134
  • #10615 (closed)
  • #13685 (closed)

What is the current bug behavior?

After adding a custom metric, GitLab reports "There was an error while retrieving metrics". The page works only about 10% of the time. The query may be returning zero results, but that is not an error: Grafana works perfectly fine with the same query and shows an empty graph.

What is the expected correct behavior?

The Monitoring page should work. If one metric returns zero results or fails to render, all other graphs should still be shown. The error message should clearly indicate what kind of error occurred when GitLab tried to retrieve metrics.
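
As a rough illustration of the expected behavior, here is a minimal Python sketch that separates "empty result" from "real error" for each graph independently. It relies on the response envelope of the Prometheus HTTP API (a "status" field, "data.result" on success, "errorType" and "error" on failure). The function names, graph titles, and sample payloads are hypothetical, and this is not how GitLab actually implements the page.

```python
def classify_response(payload: dict) -> tuple:
    """Classify one Prometheus query response as 'ok', 'empty' or 'error'."""
    if payload.get("status") != "success":
        # Surface the error type and message instead of a generic failure.
        error_type = payload.get("errorType", "unknown")
        message = payload.get("error", "no message")
        return ("error", f"{error_type}: {message}")
    result = payload.get("data", {}).get("result", [])
    if not result:
        # A successful query with zero series is not an error.
        return ("empty", None)
    return ("ok", result)


def classify_dashboard(responses: dict) -> dict:
    """Classify every graph on its own, so one failure does not hide the rest."""
    return {title: classify_response(payload) for title, payload in responses.items()}


# Hypothetical responses for three graphs on one monitoring page.
responses = {
    "Requests": {"status": "success",
                 "data": {"resultType": "matrix",
                          "result": [{"metric": {"job": "web"},
                                      "values": [[1, "2"]]}]}},
    "Queue depth": {"status": "success",
                    "data": {"resultType": "matrix", "result": []}},
    "Broken": {"status": "error", "errorType": "bad_data",
               "error": "parse error"},
}
for title, (state, detail) in classify_dashboard(responses).items():
    print(title, state, detail if state == "error" else "")
```

In this sketch the "Queue depth" graph would be drawn empty, the "Broken" graph would show its specific error, and the "Requests" graph would render normally.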

Edited Nov 16, 2020 by 🤖 GitLab Bot 🤖