Metrics dashboard displays incorrect state when max retries limit reached

Summary

Metrics dashboard displays an incorrect state after the prometheus metrics query max retries limit is reached.

Steps to reproduce

Prerequisite: ensure that the environment_metrics_use_prometheus_endpoint flag is switched on.

  1. Set up an environment with prometheus metrics.
  2. Navigate to the environment metrics page, e.g. http://localhost:3000/root/my-cool-project/environments/1/metrics.
  3. Exceed the maximum number of prometheus metrics request retries (currently 3).

It may be difficult to reproduce the last step - this issue only occurs intermittently. It seems to be related to the time taken for the reactive cache to trigger. More details here: https://gitlab.com/gitlab-org/gitlab-ce/issues/63789#note_194751893.

What is the current bug behavior?

The metrics dashboard continues to show the message "Waiting for Performance Data".

What is the expected correct behavior?

The page should show some indication that the number of request retries has been exceeded and it is no longer attempting to fetch metrics. Perhaps an error with a prompt to refresh.

Relevant logs and/or screenshots

Screenshot_2019-07-23_at_13.03.28

Possible fixes

Show an error when the prometheus metrics request times out. Possibly increase the max_retries value.

Assignee Loading
Time tracking Loading