Postgres: reconsider xmin horizon monitoring
Currently, in https://gitlab.com/gitlab-cookbooks/gitlab-exporters/-/blob/master/templates/postgres_exporter/queries.yaml.erb#L483, we use this to determining txid_age and (then max of it):
SELECT
CASE WHEN pg_is_in_recovery() THEN 'NaN'::float ELSE txid_current() % (2^52)::bigint END AS current,
CASE WHEN pg_is_in_recovery() THEN 'NaN'::float ELSE txid_snapshot_xmin(txid_current_snapshot()) % (2^52)::bigint END AS xmin,
CASE WHEN pg_is_in_recovery() THEN 'NaN'::float ELSE txid_current() - txid_snapshot_xmin(txid_current_snapshot()) END AS xmin_age
On a primary, if we do a simple test – in one session run select pg_sleep(30)
, and in another use this query, it doesn't detect xmin horizon growth due to local read-only transaction.
We probably should reconsider how we observe it – here is an alternative:
with bits as (
select
(
select age(backend_xmin) as xmin_age_local
from pg_stat_activity
order by xmin_age_local desc nulls last
limit 1),
(
select age(xmin) as xmin_age_slots
from pg_replication_slots
order by xmin_age_slots desc nulls last
limit 1
),
(
select age(transaction) as xmin_age_prepared_xacts
from pg_prepared_xacts
order by xmin_age_prepared_xacts desc nulls last
limit 1
)
)
select
*,
case
when pg_is_in_recovery() then null
else greatest(xmin_age_local, xmin_age_slots, xmin_age_prepared_xacts)
end as xmin_age
from bits;
A good post: https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/
Edited by Nikolay Samokhvalov