
Prometheus server fails to get metrics from deployed environments

Summary

The Metrics dashboard shows the message "No data found".

[Screenshot: Screen_Shot_2019-06-26_at_6.19.10_PM]

Steps to reproduce

Go to the Metrics section under Operations.

Conditions:

Kubernetes set up correctly
Pipeline green
Runners set up correctly

What is the current bug behavior?

There are no metrics shown.

Possible fixes

This can happen for a variety of reasons, but the main one is that there are no PrometheusMetric records in your database.

These can be re-added by running ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute in your Rails console.
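
As a rough sketch of that fix, the helper below runs the importer only when no metric records exist. The helper name and its return value are hypothetical; it only assumes that the model responds to `count` (as an ActiveRecord model does) and that the importer responds to `execute`.

```ruby
# Hypothetical helper (not part of GitLab): re-import the common metrics
# only when the metrics table is empty.
#
# metric_model - anything responding to `count`, e.g. PrometheusMetric
# importer     - anything responding to `execute`
def import_common_metrics_if_missing(metric_model, importer)
  existing = metric_model.count
  # Nothing to do if metrics are already present.
  return { imported: false, count: existing } if existing.positive?

  importer.execute
  { imported: true, count: metric_model.count }
end

# In a Rails console this could be called as:
#   import_common_metrics_if_missing(
#     PrometheusMetric,
#     ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new
#   )
```

If the result reports `imported: false` and the dashboard is still empty, missing PrometheusMetric records are probably not the cause.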

Tasks to complete

  • Add a rake task for ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute so that it is easier to execute
  • In the documentation, note that the rake task above can be run if this issue occurs for developers or on a user's GitLab instance.
  • Perform a short investigation into why this may occur and any further causes & preventative steps that could be taken.
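
A minimal sketch of what the first task could look like. The namespace and task name (`metrics:import_common`) are placeholders, not an existing GitLab task, and the `:environment` prerequisite assumes the usual Rails convention for loading the application before the task body runs.

```ruby
# Needed only when this file is loaded outside a Rakefile; inside a Rails
# app, a file under lib/tasks/ already has the Rake DSL available.
require 'rake'

namespace :metrics do
  desc 'Re-import the common Prometheus metrics (hypothetical task name)'
  # :environment is the standard Rails task that boots the application.
  task import_common: :environment do
    ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute
  end
end
```

With a task like this in place, the fix could be run from a shell as `bundle exec rake metrics:import_common` instead of from a console.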

Previous description:

The only way of getting it to work is to completely reinstall GDK.

These are the things that have been tried without success:

  • Reconnecting an existing cluster with a project
  • Recreating the project and connecting it to the existing cluster
  • Recreating both the cluster and the project and doing everything from scratch

Some clues 🔎 that may help:

  • This message appeared in the logs for metrics-server pod: unable to fully collect metrics: unable to fully scrape metrics from source
  • /additional_metrics.json returns a 200 response, but the data array is empty, e.g.:
    {"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}
  • All responses are empty when hitting the Prometheus proxy API directly
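
To spot the "successful but empty" case from the clues above, a small check like the following could be used. The method name is made up for illustration, and the sample body is the response quoted above.

```ruby
require 'json'

# Returns true when the response reports success but carries no metric data.
def empty_metrics_response?(body)
  payload = JSON.parse(body)
  payload['success'] == true && Array(payload['data']).empty?
end

body = '{"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}'
puts empty_metrics_response?(body)
```

A `true` result means the endpoint itself responded normally, so the problem lies in whatever supplies the data, not in the HTTP layer.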

More information if you want to dig deeper into this issue:

  • In the Rails console, run Deployment.find(<id>).additional_metrics and step through the method. This may help narrow down the cause.
  • From a Slack discussion about the issue: "The most likely theory I think is that the reactive cache is still waiting for results so GitLab is returning nothing."

If you encounter this issue, please contact @tkuah

Edited by Sean Arnold