Prometheus server fails to get metrics from deployed environments
Summary
The Metrics dashboard shows the message "No data found".
Steps to reproduce
In the Metrics section inside of Operations
Conditions:
What is the current bug behavior?
There are no metrics shown.
Possible fixes
The only way of getting it to work is to completely reinstall GDK.
These are the things that have been tried without success:
- Reconnecting an existing cluster with a project
- Recreating the project and connecting it to the existing cluster
- Recreating both the cluster and the project and doing everything from scratch
Some clues
- This message appeared in the logs for the metrics-server pod: unable to fully collect metrics: unable to fully scrape metrics from source
- /additional_metrics.json returns a 200 response, but data is empty, e.g.: {"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}
- All responses are empty when hitting the Prometheus proxy API directly
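To tell this failure mode apart from an outright error, it may help to check for a response that reports success but carries no data. The following is only a minimal sketch: the helper name empty_metrics_response? is hypothetical and not part of GitLab, and the body is the example response quoted in the clue above.

```ruby
require 'json'

# Hypothetical helper (not a GitLab API): true when the endpoint reported
# success but returned no metric data at all.
def empty_metrics_response?(body)
  parsed = JSON.parse(body)
  parsed['success'] == true && Array(parsed['data']).empty?
end

# Response body copied from the clue above.
body = '{"success":true,"data":[],"last_update":"2019-05-27T21:06:10:640Z"}'

puts empty_metrics_response?(body)
```

A 200 status alone is therefore not enough to conclude that metrics are flowing; the data array has to be inspected as well.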
More information if you want to dig deeper into this issue:
- In the Rails console, run Deployment.find(<id>).additional_metrics and work your way into the method. It may help narrow down the cause.
- Said on Slack about the issue:
"The most likely theory I think is that the reactive cache is still waiting for results so GitLab is returning nothing."
If you encounter this issue, please contact @tkuah
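One way to probe the reactive cache theory quoted above is to repeat the call a few times and see whether a result eventually appears. The sketch below is plain Ruby and makes several assumptions: the helper poll_until_present is hypothetical, the attempt count and interval are arbitrary, and it assumes that a cache which is still waiting yields nil or an empty value. In the Rails console the block would wrap the Deployment.find(<id>).additional_metrics call from the list above; here a stand-in block is used so the example runs on its own.

```ruby
# Hypothetical helper: re-run the block until it yields a non-empty result
# or the attempts run out. Returns the last result either way.
def poll_until_present(attempts: 5, interval: 2)
  result = nil
  attempts.times do |i|
    result = yield
    blank = result.nil? || (result.respond_to?(:empty?) && result.empty?)
    return result unless blank
    # Wait before the next attempt, but not after the final one.
    sleep(interval) if i < attempts - 1
  end
  result
end

# Stand-in for a cached call that only has data on the third try.
calls = 0
result = poll_until_present(attempts: 5, interval: 0) do
  calls += 1
  calls < 3 ? nil : { data: [1] }
end

puts "got a result after #{calls} calls: #{result.inspect}"
```

If the result stays empty across all attempts, the cache is probably not just slow to warm up, and the problem is more likely further upstream.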
Edited by Tristan Read