pg_stat_statements metric exports unlimited queryids, causing VictoriaMetrics OOM on high-cardinality databases

Problem

The pg_stat_statements metric in config/pgwatch-prometheus/metrics.yml exports every queryid from pg_stat_statements with no LIMIT or top-N filter.

On databases with high query diversity (e.g. managed Postgres services, ORMs generating many unique queries), this can produce 10,000+ unique time series. Each series has ~15 counter columns, so the total label combinations explode.

VictoriaMetrics then fails when evaluating PromQL functions like irate() across all these series:

cannot execute query: not enough memory for processing
irate(pgwatch_pg_stat_statements_calls...)
which returns 12726613 data points across 11773 time series with 1081 points
total available memory: 241591910 bytes
requested memory: 215398808 bytes

This effectively breaks Grafana dashboards that query pg_stat_statements metrics.

Additionally, the Flask backend (monitoring_flask_backend / flask-pgss-api) also OOMs when loading all queryid rows into Python memory. The container is limited to 192m and gets SIGKILL'd repeatedly:

[ERROR] Worker (pid:18133) was sent SIGKILL! Perhaps out of memory?

Root cause

Both the PG 11 and PG 17 SQL variants in the pg_stat_statements metric select from aggregated_statements without any row limit:

select
  datname as tag_datname,
  queryid as tag_queryid,
  calls::int8 as calls,
  ...
from aggregated_statements

No ORDER BY + LIMIT, no top-N filter. A Supabase database was observed with 4,910 unique queryids — typical databases have 100-500.

Proposed fix

  1. Limit the Prometheus-sink metric to the top N queryids (e.g. top 200 by total exec time):
select
  datname as tag_datname,
  queryid as tag_queryid,
  calls::int8 as calls,
  ...
from aggregated_statements
order by exec_time_total desc
limit 200

This keeps cardinality bounded regardless of how many unique queries the database has. The PostgreSQL sink (pgwatch-postgres) can continue to store all queryids since it doesn't have the same cardinality constraints.

  1. Increase monitoring_flask_backend mem_limit from 192m to 512m in docker-compose.yml as a safety net for the Flask backend which also processes pg_stat_statements data.

Notes

  • The VictoriaMetrics mem_limit is 1536m in docker-compose.yml. Increasing it is a band-aid, not a fix — cardinality will keep growing.
  • The pgss_queryid_queries metric in pgwatch-postgres/metrics.yml (which stores query text) already operates on a separate sink and is not affected.
  • This is specifically a Prometheus-sink cardinality problem. The PostgreSQL sink handles high cardinality fine.

Acceptance Criteria

  • pg_stat_statements Prometheus-sink metric limited to top N queryids (e.g., top 200 by total exec time) using ORDER BY exec_time_total DESC LIMIT 200
  • VictoriaMetrics no longer OOMs when processing irate() over pg_stat_statements series
  • Grafana dashboards querying pg_stat_statements metrics load successfully without memory errors
  • monitoring_flask_backend memory limit increased from 192m to 512m in docker-compose.yml as safety net
  • PostgreSQL sink (pgwatch-postgres) continues to store all queryids without any limit
  • The top-N filter applies to both PG 11 and PG 17 SQL variants in the metric definition

Definition of Done

  • Metric SQL updated in config/pgwatch-prometheus/metrics.yml with row limit
  • Flask backend memory limit updated in docker-compose.yml
  • Tested on a high-cardinality database (4000+ unique queryids) confirming VictoriaMetrics stability
  • Grafana pg_stat_statements panels render without OOM or 500 errors
  • No loss of important query metrics (top queries by execution time are always captured)
Edited by Nikolay Samokhvalov