Cluster Monitoring++
With a foundation of Kubernetes cluster monitoring available in https://gitlab.com/gitlab-org/gitlab-ce/issues/38783, we can begin to take this further.
A few key enhancements we should consider:
- Display Node health, capacity
- Display Pod status, capacity
- Include additional series for Memory/CPU limits (in addition to utilization and requested)
- Alerting in the event of errors:
  - Pods/Nodes at capacity
  - Flapping pods (can't authenticate to registry, bad container image, etc.)
  - Node errors
  - etc.
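
To make the alerting idea concrete, here is a rough sketch of how "at capacity" and "flapping" conditions might be evaluated for a single pod. Everything in it is an assumption for discussion purposes: the `PodSample` fields, the `pod_alerts` helper, and both threshold values are hypothetical and not taken from any existing GitLab or Prometheus API.

```python
from dataclasses import dataclass

# Hypothetical thresholds; this issue does not specify any values.
CAPACITY_THRESHOLD = 0.9  # fraction of the limit treated as "at capacity"
FLAPPING_RESTARTS = 3     # restarts within the window treated as "flapping"


@dataclass
class PodSample:
    """One observation of a pod (hypothetical shape, for illustration only)."""
    name: str
    memory_used: float       # bytes currently in use
    memory_limit: float      # configured memory limit in bytes (0 means no limit)
    restarts_in_window: int  # container restarts seen in the observation window


def pod_alerts(sample):
    """Return the list of alert names that apply to one pod sample."""
    alerts = []
    # Only compare against the limit when one is actually set.
    if sample.memory_limit > 0:
        if sample.memory_used / sample.memory_limit >= CAPACITY_THRESHOLD:
            alerts.append("at_capacity")
    # Repeated restarts in a short window suggest a flapping pod.
    if sample.restarts_in_window >= FLAPPING_RESTARTS:
        alerts.append("flapping")
    return alerts
```

In practice these checks would more likely live in the monitoring backend's alert rules than in application code; the sketch only illustrates the kind of conditions we would want to alert on.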