Cluster Monitoring++

With a foundation of Kubernetes cluster monitoring available in https://gitlab.com/gitlab-org/gitlab-ce/issues/38783, we can begin to take this further.

A few key enhancements we should consider:

  • Display Node health, capacity
  • Display Pod status, capacity
  • Include additional series for Memory/CPU Limits (in addition to Utilization and Requested)
  • Alerting in the event of errors:
    • Pods/Nodes at capacity
    • Flapping pods (can't authenticate to registry, bad container image, etc.)
    • Node errors
    • etc.
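
As a rough illustration of the alerting conditions listed above, here is a minimal Python sketch. It is not a proposed implementation: the `PodSample` and `NodeSample` types, their field names, and the thresholds (more than 3 restarts, 90% of allocatable CPU requested) are all assumptions made for the example. The only names taken from Kubernetes itself are the container waiting reasons.

```python
from dataclasses import dataclass

# Waiting reasons Kubernetes reports for image-pull and crash-loop failures.
FLAPPING_REASONS = {"ImagePullBackOff", "ErrImagePull", "CrashLoopBackOff"}


@dataclass
class PodSample:
    """Hypothetical snapshot of one pod; field names are illustrative."""
    name: str
    restarts: int             # container restarts within the sampling window
    waiting_reason: str = ""  # empty when the container is running


@dataclass
class NodeSample:
    """Hypothetical snapshot of one node's CPU, in millicores."""
    name: str
    cpu_requested_m: int
    cpu_allocatable_m: int


def is_flapping(pod, max_restarts=3):
    # Assumed rule: a known failure reason or too many restarts counts as flapping.
    return pod.waiting_reason in FLAPPING_REASONS or pod.restarts > max_restarts


def at_capacity(node, threshold=0.9):
    # Assumed rule: requested CPU at or above 90% of allocatable counts as full.
    if node.cpu_allocatable_m <= 0:
        return True
    return node.cpu_requested_m / node.cpu_allocatable_m >= threshold


def alerts(pods, nodes):
    """Return one human-readable message per triggered condition."""
    messages = [f"pod {p.name} is flapping" for p in pods if is_flapping(p)]
    messages += [f"node {n.name} is at capacity" for n in nodes if at_capacity(n)]
    return messages
```

In practice these checks would more likely be expressed as alerting rules in the monitoring backend; the sketch only shows which signals each alert would need.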