APM: Infrastructure dashboard - MVC
Problem to solve
Our metrics dashboard provides a detailed view of individual metrics for a single service. This view does not scale: operators need a bird's-eye view across all of our services and IT infrastructure.
Intended users
Further details
Proposal
Provide a curated experience for operators to view the health of their infrastructure
- Compact view which visualizes the health of our infrastructure
- Scales to infrastructure with 1K+ components.
- Native support for container orchestration frameworks.
- The user should be able to select a metric to view across all pods.
User flow
As a user, I would like to view the health across all of my pods
- Infrastructure dashboard shows a view of pods grouped by namespace
- User can select the environment
- User can select the metric which spans across all pods (e.g., CPU, memory usage)
- User can easily and quickly see the selected metric on a specific pod
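The flow above could be backed by a grouping step along these lines. This is only a minimal sketch to illustrate the idea: the `Pod` shape, the `pods_by_namespace` helper, the metric names and the sample values are all hypothetical and do not describe an existing API or real data.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Pod:
    # Hypothetical shape; field names are illustrative, not an existing API.
    name: str
    namespace: str
    environment: str
    metrics: dict[str, float] = field(default_factory=dict)  # e.g. {"cpu": 0.42}


def pods_by_namespace(pods, environment, metric):
    """Group the pods of one environment by namespace, keeping only the selected metric."""
    grouped = defaultdict(list)
    for pod in pods:
        if pod.environment != environment:
            continue
        # None marks a pod that has not reported the selected metric.
        grouped[pod.namespace].append((pod.name, pod.metrics.get(metric)))
    return dict(grouped)


# Made-up sample data, for illustration only.
pods = [
    Pod("web-1", "frontend", "production", {"cpu": 0.42, "memory": 0.61}),
    Pod("web-2", "frontend", "production", {"cpu": 0.87}),
    Pod("db-1", "data", "production", {"cpu": 0.15, "memory": 0.90}),
    Pod("web-1", "frontend", "staging", {"cpu": 0.05}),
]

view = pods_by_namespace(pods, "production", "memory")
```

Here `view` maps each namespace to `(pod name, metric value)` pairs for the selected environment, which is roughly the data a compact per-namespace visualization would need. Pods that have not reported the selected metric are kept with a `None` value so they can still be drawn.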
Design
In this issue, we propose adding two views to the current Operations > Environments page. In addition to the existing deployments view, we propose a pod health view and a metrics view:
Deployments view | Pod health view | "Metrics" view |
---|---|---|
![]() | ![]() | ![]() |
Clicking on a pod in any of the visualizations above allows users to view either the logs or the metrics for the selected pod. Here's an example of how that could look:
User clicks on pod and decides what to view | Drilling down to a pod's log | Drilling down to a pod's metrics |
---|---|---|
![]() | ![]() | ![]() |
The proposal is to show the log view by default.
Note: We are also proposing to add a legend to the current deployments view. Apparently six or seven statuses need to be represented in the legend, but it is not clear how those statuses are represented currently.