Custom Metrics failing for some projects

I'm getting conflicting results when testing Custom Metrics on different projects, even though the queries are identical.

For example:

  • It works great on our cloud native charts project: https://gitlab.com/charts/helm.gitlab.io/environments/190276/metrics
  • The same queries are causing Prometheus connection errors on a personal test project: https://gitlab.com/joshlambert/performance-devops/environments/244034/metrics

The error shown is "Unable to connect to Prometheus server", but the connection itself appears to be fine, because:

  1. Cluster metrics still work
  2. If you delete the custom metric, the page successfully loads

I'm confused as to why this is the case, since the same queries are used in both projects. One example that exhibits the failure is:

  avg(sum(container_memory_usage_bytes{container_name!="POD",pod_name=~"^%{ci_environment_slug}-(.*)",namespace="%{kube_namespace}"}) by (job)) without (job) /1024/1024/1024
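To narrow this down, it might help to run the failing query against Prometheus directly, bypassing GitLab, and see what error the server actually returns. Below is a minimal sketch of that idea, not a description of how GitLab expands variables internally. The helper names (expand_query, build_query_url), the slug and namespace values, and the http://localhost:9090 address (which assumes something like a port-forward to the Prometheus service) are all hypothetical; only the /api/v1/query endpoint and the query text come from Prometheus and this report.

```python
import re
from urllib.parse import urlencode

# The failing query from this report, with its %{...} placeholders left intact.
QUERY_TEMPLATE = (
    'avg(sum(container_memory_usage_bytes{container_name!="POD",'
    'pod_name=~"^%{ci_environment_slug}-(.*)",namespace="%{kube_namespace}"}) '
    'by (job)) without (job) /1024/1024/1024'
)


def expand_query(template, variables):
    """Replace each %{name} placeholder with its value; raise on unknown names."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for %{{{name}}}")
        return variables[name]

    return re.sub(r"%\{(\w+)\}", substitute, template)


def build_query_url(base_url, promql):
    """Build a URL for the Prometheus instant-query endpoint (/api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Hypothetical values: substitute the real slug and namespace of the
    # environment that fails.
    promql = expand_query(
        QUERY_TEMPLATE,
        {"ci_environment_slug": "production", "kube_namespace": "my-namespace"},
    )
    print(promql)
    # Assumed address; fetch this URL with curl or a browser to see the raw response.
    print(build_query_url("http://localhost:9090", promql))
```

Comparing the expanded query and the raw Prometheus response for the working project and the failing one should show whether the difference lies in the substituted values or in the query itself.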
