2025-10-28: Mimir
Mimir (Severity 2, High)
Problem: A failure in Mimir ingesters caused a loss of quorum, resulting in a full service outage and missing metrics.
Impact: From 14:52 to 15:27 UTC (35 minutes), Mimir was unavailable and all queries and alerting failed. 18 of 210 ingesters were affected, which led to write failures and increased latency. During the outage window, monitoring visibility was lost, metrics were incomplete, and Grafana dashboards were unreliable.
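Illustration (not a root-cause finding): the sketch below shows how losing 18 of 210 ingesters can translate into quorum loss for part of the write path. It assumes a replication factor of 3 (the Mimir default; the value used in this cluster is not recorded in this ticket) and, as a simplification, that each series' replicas land on ingesters chosen uniformly at random, which ignores ring token layout and zone-aware replication. The function names are hypothetical and exist only for this example.

```python
from math import comb


def quorum(replication_factor: int) -> int:
    """Minimum number of replicas that must succeed (simple majority)."""
    return replication_factor // 2 + 1


def fraction_without_quorum(total: int, failed: int, replication_factor: int) -> float:
    """Fraction of replica sets that cannot reach quorum.

    Assumes replicas are placed on distinct ingesters chosen uniformly at
    random (a simplification of real ring placement).
    """
    # A replica set still has quorum if at most this many replicas are down.
    max_down = replication_factor - quorum(replication_factor)
    healthy = total - failed
    # Count placements with 0..max_down replicas on failed ingesters.
    still_ok = sum(
        comb(failed, down) * comb(healthy, replication_factor - down)
        for down in range(max_down + 1)
    )
    return 1 - still_ok / comb(total, replication_factor)


if __name__ == "__main__":
    # Figures from this incident; the replication factor of 3 is an assumption.
    share = fraction_without_quorum(total=210, failed=18, replication_factor=3)
    print(f"quorum size: {quorum(3)}")
    print(f"share of replica sets without quorum: {share:.2%}")
```

Under these assumptions, roughly 2% of replica sets have two or more replicas on failed ingesters and so cannot reach quorum, which is consistent with partial write failures. The sketch models only write-side replica sets and does not explain the query and alerting failures.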
Causes: Unknown
Response strategy: The system recovered on its own. All alerts have since resolved, and services have returned to normal.
This ticket was created by incident.io to track INC-5287.