2020-03-25: Prometheus is restarting frequently & Prometheus is unreachable

Summary

Prometheus on the instance below is restarting frequently and has also alerted as unreachable.

  • FQDN: prometheus-01.us-east1-d.gce.gitlab-runners.gitlab.net
  • Instance: prometheus-01.us-east1-d.gce.gitlab-runners.gitlab.net:9090
  • GCP console (monitoring, last 12 hours): https://console.cloud.google.com/compute/instancesDetail/zones/us-east1-d/instances/prometheus-01-us-east1-d?project=gitlab-ci-155816&folder&organizationId&tab=monitoring&duration=PT12H

More information will be added as we investigate the issue.

Timeline

All times UTC.

2020-03-25

  • 15:23 - Received incident alert: Prometheus is restarting frequently
    • https://gitlab.pagerduty.com/incidents/PE5U91M
  • 15:45 - Received incident alert: Prometheus is unreachable
    • https://gitlab.pagerduty.com/incidents/P1NWGWO
  • 16:28 - Resolved #19130: Prometheus is restarting frequently
    • https://gitlab.pagerduty.com/incidents/PE5U91M
  • 16:40 - Resolved #19131: Prometheus is unreachable
    • https://gitlab.pagerduty.com/incidents/P1NWGWO
  • 19:39 - Triggered #19138: Firing 1 - Prometheus is restarting frequently
    • https://gitlab.pagerduty.com/incidents/PKQHFDX
    • Resolved manually
  • 19:48 - Triggered #19138: Firing 1 - Prometheus is restarting frequently
    • https://gitlab.pagerduty.com/incidents/PKZ2CAK
    • Resolved manually
Edited Mar 25, 2020 by Nels Nelson