Set vacuum_defer_cleanup_age on primary databases

Geo has queries that take 40-60s to run on the Geo replica. Because this is longer than any transaction running on the production databases, vacuum cleanup records generated on the primary arrive at the replica while these queries are still running, which causes conflicts with recovery and therefore replication lag. As the replica is not using streaming replication we do not have hot standby "feedback" available to mitigate these conflicts -- which is probably a good thing, as feedback could cause bloat on production tables.

Work is ongoing to speed up those queries, of course. However, it would be useful to be able to run archive-log-based replicas and run queries longer than the 15s allowed on production queries without replication lagging.

From Prometheus (transaction rate and replication slot xmin age): https://prometheus.gitlab.com/graph?g0.range_input=2w&g0.end_input=2018-03-14%2022%3A53&g0.expr=postgres%3Apg_txid_per_second%3Arate5m%7Benvironment%3D%22prd%22%7D&g0.tab=0&g1.range_input=2d&g1.end_input=2018-03-21%2021%3A54&g1.expr=pg_replication_slots_xmin_age%7Benvironment%3D%22prd%22%7D&g1.tab=0

We can see that our transaction rate peaks at approximately 100 txn/s, so a vacuum_defer_cleanup_age value allowing 60s queries to run without conflicts would be about 100 txn/s * 60 s = 6000 transactions. We can also see occasional peaks of xmin feedback from the replicas above 10000 with no noticeable impact on production. Even higher peaks, over 30000 (presumably during replication lag events), didn't cause noticeable impact either.
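For reference, the value behind the pg_replication_slots_xmin_age metric can also be checked directly on the primary. A minimal sketch, assuming the relevant slots are the physical replication slots for the replicas:

```sql
-- How far behind the current xid each slot's xmin is holding cleanup back
-- (roughly what pg_replication_slots_xmin_age reports).
SELECT slot_name,
       age(xmin)         AS xmin_age,
       age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots;
```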

I would suggest we set vacuum_defer_cleanup_age to 10000, which is a fairly conservative value and should be sufficient for the purposes of Geo. If we need to support systems with longer-running queries we could raise it to 20k or 50k, but we would have to monitor the bloat.
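Note that vacuum_defer_cleanup_age only requires a configuration reload, not a restart. A sketch of what the change would look like applied by hand (in practice it would presumably go through our managed postgresql.conf rather than ALTER SYSTEM):

```sql
-- On the primary: defer cleanup of dead rows by ~10000 transactions.
ALTER SYSTEM SET vacuum_defer_cleanup_age = 10000;
SELECT pg_reload_conf();

-- Verify the new value is in effect.
SHOW vacuum_defer_cleanup_age;
```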

So barring objections I plan to set it to 10k tomorrow morning (say 2018-03-22 at 11:00 UTC) and monitor the bloat immediately afterwards and periodically during the day. In particular, we currently see about 0.85 - 0.95 vacuums per minute on the project_import_data table and a peak of about 20k dead tuples on projects. If we see either of those increase dramatically or persistently I'll revert the change.
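A sketch of the kind of check I'd use for those two indicators (the vacuum rate would come from sampling the vacuum counters over time, dead tuples straight from the snapshot):

```sql
-- Dead tuples and vacuum counters for the two tables called out above.
SELECT relname,
       n_dead_tup,
       vacuum_count,
       autovacuum_count,
       last_autovacuum
FROM pg_stat_user_tables
WHERE relname IN ('projects', 'project_import_data');
```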
