Timeout idle transactions
Context
- Incident: #895 (closed)
- Partly related to internal id lock congestion discussion: #901 (moved)
Observation
We observe a large number of idling database transactions which seems correlated with the apdex dips:
We don't have a good way to monitor general transaction length. However, when watching this output (every 0.5s or similar), we can make more observations:
SELECT
regexp_matches(application_name, '(\w+).*'),
count(*),
max(now() - xact_start),
avg(now() - xact_start),
percentile_cont(0.5) WITHIN GROUP (ORDER BY now() - xact_start) AS median,
percentile_cont(0.95) WITHIN GROUP (ORDER BY now() - xact_start) AS p95
FROM
pg_stat_activity
WHERE
state * 'idle in transaction'
GROUP BY
1
ORDER BY
1
At times, we see up to 150 sidekiq connections sitting in idle state for a few seconds (<10 most of the time).
For unicorn, it is less pronounced: Less idle transactions and the idle time only went up to like 2s, when I looked at it.
