
Understand MultiXact Offset LWLock database contention

Description

Today, on 7 September, we experienced a severity 2 incident related to PostgreSQL statement timeouts.

Production incident gitlab-com/gl-infra/production#5493 (closed)

One symptom (or possibly a cause) was an elevated number of multixact_offset LWLock wait events sampled from pg_stat_activity during the incident:

multixact_offset_wait_event
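
For anyone who wants to reproduce this kind of sampling by hand, here is a minimal sketch. It is not the exporter query behind the graph above, only an assumption about what an equivalent query could look like. The helper names (`count_wait_events`, `multixact_offset_waiters`) and the sample rows are hypothetical. The event name `multixact_offset` is the one observed in this incident; other PostgreSQL versions may report this lock under different names, so the names to match are a parameter.

```python
from collections import Counter

# Assumed sampling query (not taken from the actual monitoring setup):
# one row per active backend that is currently waiting on something.
WAIT_EVENT_SQL = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE state = 'active'
  AND wait_event IS NOT NULL
"""


def count_wait_events(rows):
    """Count (wait_event_type, wait_event) pairs in one sample of rows."""
    return Counter((event_type, event) for event_type, event in rows)


def multixact_offset_waiters(rows, event_names=("multixact_offset",)):
    """Number of backends in the sample waiting on a MultiXact offset LWLock.

    event_names is configurable because the wait event name depends on the
    PostgreSQL version.
    """
    counts = count_wait_events(rows)
    return sum(
        n
        for (event_type, event), n in counts.items()
        if event_type == "LWLock" and event in event_names
    )


# Hypothetical sample, shaped like the result of WAIT_EVENT_SQL.
sample = [
    ("LWLock", "multixact_offset"),
    ("LWLock", "multixact_offset"),
    ("IO", "DataFileRead"),
]
print(multixact_offset_waiters(sample))
```

Running the query at a fixed interval and plotting the count over time would give a series comparable in shape to the graph above, assuming the same sampling filter.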

We also saw a reduced transaction rate and a growing txid xmin age.

txid_rate

txid_min_age
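
The two metrics above can be approximated by hand as well. This is a rough sketch under stated assumptions, not the definition used by the dashboards: the xmin age query is one plausible way to read the oldest backend xmin, and `txid_rate` is a hypothetical helper that derives a rate from two transaction ID readings taken some seconds apart (for example from `txid_snapshot_xmax(txid_current_snapshot())`, which returns a 64-bit value and does not assign a new transaction ID).

```python
# Assumed query: age, in transactions, of the oldest xmin held by any backend.
XMIN_AGE_SQL = """
SELECT max(age(backend_xmin))
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
"""


def txid_rate(xid_start, xid_end, seconds):
    """Transaction IDs consumed per second between two samples.

    xid_start and xid_end are assumed to be epoch-extended 64-bit values,
    so wraparound is not handled here.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (xid_end - xid_start) / seconds


# Hypothetical readings taken 60 seconds apart.
print(txid_rate(1000, 1600, 60))
```

A falling rate together with a rising xmin age would match the pattern described above: fewer transactions completing while some long-lived snapshot holds back the xmin horizon.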

Proposal

Understand the mechanism by which MultiXact offset locking leads to database contention.

/cc @NikolayS @stanhu @andrewn @jarv @sgoldstein

Edited by Grzegorz Bizon