Monitor SSR Related Metrics for Frontend Pods
Goal
We currently monitor hardware-related metrics but have no monitoring for SSR-related metrics. We should monitor these and define alert conditions that fire if SSR ever stops working.
We need to investigate whether a Prometheus exporter already exists for this use case; if not, we can implement a metrics endpoint on the frontend pods.
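If no suitable exporter turns up, the metrics endpoint could be quite small. The sketch below assumes the frontend SSR server runs on Node.js and hand-rolls the Prometheus text exposition format. The class `SsrMetrics` and the metric names `ssr_renders_total` and `ssr_render_duration_seconds_total` are hypothetical, chosen for illustration only. An existing client library such as `prom-client` may well be a better fit and should be evaluated as part of the investigation.

```typescript
// Hypothetical sketch: names below are illustrative, not from an existing exporter.
type Outcome = "success" | "failure";

class SsrMetrics {
  // Number of SSR attempts, split by outcome.
  private renders: Record<Outcome, number> = { success: 0, failure: 0 };
  // Cumulative time spent in SSR, in seconds.
  private durationSecondsTotal = 0;

  // Call once per SSR attempt from the server's render handler.
  recordRender(outcome: Outcome, durationSeconds: number): void {
    this.renders[outcome] += 1;
    this.durationSecondsTotal += durationSeconds;
  }

  // Build the body of a /metrics response in Prometheus text exposition format.
  render(): string {
    return [
      "# HELP ssr_renders_total Server-side render attempts by outcome.",
      "# TYPE ssr_renders_total counter",
      `ssr_renders_total{outcome="success"} ${this.renders.success}`,
      `ssr_renders_total{outcome="failure"} ${this.renders.failure}`,
      "# HELP ssr_render_duration_seconds_total Cumulative SSR time in seconds.",
      "# TYPE ssr_render_duration_seconds_total counter",
      `ssr_render_duration_seconds_total ${this.durationSecondsTotal}`,
      "", // the exposition format expects a trailing newline
    ].join("\n");
  }
}
```

The string returned by `render()` would be served on a `/metrics` route with content type `text/plain; version=0.0.4`. Counting failures separately from successes is what would let an alert distinguish "SSR is failing" from "the pod is receiving no traffic".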
QA
We can test this by deploying a monitoring stack to Sandbox.
Acceptance Criteria
- Alerts are fired if SSR stops functioning
- SSR-related metrics are visible in Grafana
Definition of Ready Checklist
- Definition Of Done (DoD)
- Acceptance criteria
- Weighted
- QA