Prometheus server fails to get metrics from deployed environments
Summary
The Metrics dashboard shows the message "No data found".
Steps to reproduce
Go to the Metrics section under Operations.
Conditions:
What is the current bug behavior?
There are no metrics shown.
Possible fixes
This can happen for a variety of reasons, but the main one is that there are no PrometheusMetric records in your database.
These can be re-added by running ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute in your Rails console.
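The check-then-import step described here could be wrapped in a small helper for the Rails console. This is only a sketch: the method name reimport_common_metrics_if_missing and its return symbols are hypothetical, and it assumes PrometheusMetric responds to count the way an ActiveRecord model would.

```ruby
# Hypothetical helper (not part of GitLab): runs the importer only when
# no metric records exist.
#
# metric_model - anything responding to #count (assumed: PrometheusMetric)
# importer     - anything responding to #execute
def reimport_common_metrics_if_missing(metric_model, importer)
  return :already_present if metric_model.count.positive?

  importer.execute
  :imported
end

# In a GitLab Rails console this might be called as:
#   reimport_common_metrics_if_missing(
#     PrometheusMetric,
#     ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new
#   )
```

Passing the model and importer in as arguments keeps the helper easy to try outside a GitLab instance with stand-in objects.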
Tasks to complete
- Add a rake task for ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute so that it is easier to execute.
- In the documentation, note that the rake task above can be run if this issue occurs on a developer's or a user's GitLab instance.
- Perform a short investigation into why this may occur, along with any further causes and preventative steps that could be taken.
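For the first task, a rake task wrapping the importer might look roughly like the sketch below. The task name gitlab:metrics:import_common is made up for illustration and is not an existing GitLab task; the sketch also assumes the standard Rails :environment task is available to load the application before the importer runs.

```ruby
require 'rake' # only needed when this runs outside a Rakefile or .rake file

namespace :gitlab do
  namespace :metrics do
    desc 'Re-import the common Prometheus metrics (hypothetical task name)'
    # :environment is the usual Rails prerequisite that boots the app,
    # so the Importer constant is loaded when the task body runs.
    task import_common: :environment do
      ::Gitlab::DatabaseImporters::CommonMetrics::Importer.new.execute
    end
  end
end
```

With something like this in place, the fix could be run as bundle exec rake gitlab:metrics:import_common instead of opening a Rails console.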
Previous description:
These are the things that have been tried without success:
- Reconnecting an existing cluster with a project
- Recreating the project and connecting it to the existing cluster
- Recreating both the cluster and the project and doing everything from scratch
Some clues
- This message appeared in the logs for the metrics-server pod: unable to fully collect metrics: unable to fully scrape metrics from source
- /additional_metrics.json returns a 200 response, but data is empty, e.g.: {"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}
- All responses are empty when hitting the Prometheus proxy API directly
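The "200 but empty" symptom from the second clue can be detected mechanically. The sketch below parses a response body like the one quoted above and reports whether it succeeded without returning any data; the helper name metrics_response_empty? is hypothetical, and fetching the body from /additional_metrics.json is left out.

```ruby
require 'json'

# Hypothetical helper: true when the response reports success but
# carries no metric data, as in the example from this issue.
def metrics_response_empty?(body)
  payload = JSON.parse(body)
  payload['success'] == true && Array(payload['data']).empty?
end

# Sample body copied from the clue above.
sample = '{"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}'
puts metrics_response_empty?(sample)
```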
More information if you want to dig deeper into this issue:
- In the Rails console, run Deployment.find(<id>).additional_metrics and work your way through the method. It may help narrow it down.
- Said on Slack about the issue:
"The most likely theory I think is that the reactive cache is still waiting for results so GitLab is returning nothing."
If you encounter this issue, please contact @tkuah
Edited by Sean Arnold
