Dashboard 7 redesign: Autovacuum and xmin horizon RCA
Make Dashboard 7 a single triage view for three questions: *"is autovacuum keeping up, who is blocking the xmin horizon, and how close are we to wraparound?"* Today these signals are spread across several dashboards, which forces operators to switch between them.

## In scope

**Wraparound risk (top of dashboard)**
- Top-N tables by XID age (`age(pg_class.relfrozenxid)`): full-width timeseries.
- Top-N tables by MultiXID age (`mxid_age(pg_class.relminmxid)`): full-width timeseries.
- Reference threshold lines on both: soft (`autovacuum_freeze_max_age` / `autovacuum_multixact_freeze_max_age`), plotted dynamically from `pg_settings`; failsafe (`vacuum_failsafe_age` / `vacuum_multixact_failsafe_age`), same source; hard (~`2^31 − 10M`), static.

**xmin horizon overview + blockers**
- Per-source age timeseries (`pg_stat_activity`, `pg_replication_slots` xmin/catalog_xmin, `pg_stat_replication`, `pg_prepared_xacts`) with current blocker identity in the legend (queryid, slot_name, standby_name, prepared_gid).
- Long-running transaction age + current blocker counts.

**Autovacuum mechanics**
- Per-table dead-tuple debt vs computed trigger threshold (`autovacuum_overdue_factor`): top-N timeseries with a reference line at 1.0; tables above the line are overdue.
- Active autovacuum workers vs `autovacuum_max_workers` (saturation / pool pinning).
- Autovacuum workers blocked on a lock: table panel showing the holder pid, queryid, and wait time.
- Existing vacuum timeline panel kept.

**Dashboard 1 (top-level overview): small cleanups**
- Rename the two database-level wraparound panels for clarity (`Database age (datfrozenxid) — wraparound risk`, `Database multixid age (datminmxid) — wraparound risk`).
- Place "DB logical size distribution" and "pg_wal directory size" side by side.

**Cross-cutting**
- New `$top_n` template variable on Dashboard 7; retrofit existing top-N panels.
- Uniform "Dashboards" dropdown link (keepTime, includeVars) on every dashboard, plus a `postgres-ai` tag on each, so cross-dashboard navigation preserves time range and variables.

## Out of scope
- Bloat panels (separate work; dashboard rename drops "bloat" since no bloat panels are kept here).
- Autovacuum cost-based throttling deep-dives.

## References
- https://postgres.ai/docs/postgres-howtos/performance-optimization/monitoring/how-to-monitor-xmin-horizon
- https://postgres.ai/docs/postgres-howtos/database-administration/maintenance/autovacuum-queue-and-progress
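
To make the "dead-tuple debt vs computed trigger threshold" panel concrete, here is a minimal Python sketch of how the ratio could be computed. It assumes `autovacuum_overdue_factor` means `n_dead_tup` divided by PostgreSQL's standard vacuum trigger threshold (`autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`). That definition is an assumption, not something this issue specifies. `TableStats`, `overdue_factor`, and `top_n_overdue` are hypothetical names used only for illustration, and per-table storage-parameter overrides are not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableStats:
    """Hypothetical per-table sample, e.g. collected by a metrics exporter."""
    name: str
    n_dead_tup: int    # pg_stat_user_tables.n_dead_tup
    reltuples: float   # pg_class.reltuples (planner estimate, may be -1)


def vacuum_trigger_threshold(reltuples: float,
                             base_threshold: int = 50,
                             scale_factor: float = 0.2) -> float:
    """Dead-tuple count at which autovacuum considers a table for VACUUM.

    Defaults mirror the stock values of autovacuum_vacuum_threshold (50)
    and autovacuum_vacuum_scale_factor (0.2).
    """
    # reltuples can be -1 for a never-analyzed table; treat that as 0 rows.
    return base_threshold + scale_factor * max(reltuples, 0.0)


def overdue_factor(stats: TableStats,
                   base_threshold: int = 50,
                   scale_factor: float = 0.2) -> float:
    """Ratio of dead tuples to the trigger threshold (assumed definition).

    A value above 1.0 means the table has crossed its trigger threshold.
    """
    threshold = vacuum_trigger_threshold(stats.reltuples,
                                         base_threshold, scale_factor)
    if threshold <= 0:
        # Degenerate settings (threshold 0 on an empty table).
        return float("inf") if stats.n_dead_tup > 0 else 0.0
    return stats.n_dead_tup / threshold


def top_n_overdue(tables: List[TableStats], n: int) -> List[Tuple[str, float]]:
    """Return the n tables with the highest overdue factor, highest first."""
    scored = [(t.name, overdue_factor(t)) for t in tables]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


if __name__ == "__main__":
    sample = [
        TableStats("orders", n_dead_tup=30_000, reltuples=100_000.0),
        TableStats("users", n_dead_tup=1_000, reltuples=100_000.0),
    ]
    for name, factor in top_n_overdue(sample, n=2):
        print(f"{name}: {factor:.2f}")
```

In this sketch, `n` plays the role of the `$top_n` template variable, and the 1.0 reference line corresponds to a table sitting exactly at its trigger threshold.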