2025-08-04: Prometheus scrape failures on Patroni node in gprd environment
Prometheus scrape failures on Patroni node in gprd environment (Severity 4)
Problem: Prometheus scrapes of exporters on multiple Patroni nodes in the gprd environment are failing. The failures are attributed to outdated statistics and severe table fragmentation.
Impact: There continues to be no user-visible impact.
Causes: Inefficient query plans, triggered by outdated statistics and severe fragmentation in the pg_statistic table, have been identified as the root cause.
Response strategy: The team plans to run VACUUM FULL pg_statistic during off-peak hours to resolve the issue; this fix has been confirmed as effective in the test environment. A change request will be submitted for approval. The alert silence has been extended to prevent unnecessary notifications.
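As a rough illustration of the planned maintenance step, the sketch below shows one way a pre-flight guard for the change request might look. It is not the team's actual procedure. The off-peak window (02:00 to 05:00), the `lock_timeout` value, and the helper names `is_off_peak` and `maintenance_statements` are all hypothetical. The size queries are only there to compare pg_statistic before and after. Note that VACUUM FULL takes an ACCESS EXCLUSIVE lock on the table and cannot run inside a transaction block, which is why a short lock timeout and an off-peak window are assumed here.

```python
from datetime import datetime, time
from typing import List

# Hypothetical off-peak window; the ticket does not state the real one.
OFF_PEAK_START = time(2, 0)
OFF_PEAK_END = time(5, 0)


def is_off_peak(now: datetime,
                start: time = OFF_PEAK_START,
                end: time = OFF_PEAK_END) -> bool:
    """Return True if `now` falls inside [start, end).

    Windows that wrap past midnight (e.g. 22:00 to 02:00) are handled.
    """
    t = now.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def maintenance_statements(lock_timeout: str = "5s") -> List[str]:
    """Return the SQL statements to run, in order.

    Each statement must be sent outside a transaction block,
    because VACUUM cannot run inside one.
    """
    size_query = (
        "SELECT pg_size_pretty("
        "pg_total_relation_size('pg_catalog.pg_statistic'))"
    )
    return [
        # Fail fast instead of queueing behind other sessions for the lock.
        f"SET lock_timeout = '{lock_timeout}'",
        size_query,  # table size before the rewrite
        "VACUUM FULL pg_catalog.pg_statistic",
        size_query,  # table size after the rewrite
    ]


if __name__ == "__main__":
    if is_off_peak(datetime.now()):
        for statement in maintenance_statements():
            print(statement + ";")
    else:
        print("Outside the assumed off-peak window; not emitting statements.")
```

The script only prints the statements rather than executing them, so the output can be attached to the change request and reviewed before anything touches a Patroni node.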
This ticket was created by incident.io to track INC-3078.