Sidekiq: Prevent `ArchiveTraceWorker` from holding client transactions open
Related: Sidekiq jobs don't currently include client transaction timing metrics; this is documented in gitlab-com/gl-infra/scalability#894 (moved).
EDIT: Also `BuildTraceChunkFlushWorker`, see #336048 (closed).
There is some evidence that `ArchiveTraceWorker` is holding client transactions open.
When reviewing Postgres activity in `pg_stat_activity` and grouping by marginalia comments and state, we frequently see a high number of open `ArchiveTraceWorker` connections on the primary instance in the `idle in transaction` state (inside a transaction, but not currently executing a query).
Obtained with:
$ sudo gitlab-psql -c "SELECT a.matches[1] AS application, coalesce(a.matches[6], concat_ws('#', a.matches[2], a.matches[3])) as endpoint, a.state, count(*) c FROM ( SELECT regexp_matches(query, '\/\*(?:application:(\w+),?)(?:controller:(\w+),?)?(?:action:(\w+),?)?(?:correlation_id:(\w+),?)(?:jid:(\w+),?)?(?:job_class:([\w:]+),?)?', 'g') AS matches, state FROM pg_stat_activity) a GROUP BY 1,2,3 ORDER BY 4 DESC, 1,2,3;"
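To make the grouping easier to follow, here is a minimal Ruby sketch that applies the same regular expression to a few rows and counts them by application, endpoint, and state. The `ROWS` data, the `MARGINALIA` constant, and the `summarize` helper are made up for illustration; they are not real `pg_stat_activity` output or GitLab code.

```ruby
# Same pattern as in the SQL query, written as a Ruby regexp.
# Capture groups: 1 application, 2 controller, 3 action,
# 4 correlation_id, 5 jid, 6 job_class.
MARGINALIA = %r{/\*(?:application:(\w+),?)(?:controller:(\w+),?)?(?:action:(\w+),?)?(?:correlation_id:(\w+),?)(?:jid:(\w+),?)?(?:job_class:([\w:]+),?)?}

# Hypothetical [query, state] pairs standing in for pg_stat_activity rows.
ROWS = [
  ["/*application:sidekiq,correlation_id:aaa111,jid:j1,job_class:ArchiveTraceWorker*/ SELECT 1", "idle in transaction"],
  ["/*application:sidekiq,correlation_id:bbb222,jid:j2,job_class:ArchiveTraceWorker*/ SELECT 1", "idle in transaction"],
  ["/*application:web,controller:projects,action:show,correlation_id:ccc333*/ SELECT 2", "active"],
  ["SELECT 3", "idle"] # no marginalia comment, so it is dropped
].freeze

def summarize(rows)
  counts = Hash.new(0)
  rows.each do |query, state|
    m = MARGINALIA.match(query)
    next unless m # mirrors regexp_matches returning no row

    # coalesce(job_class, concat_ws('#', controller, action))
    endpoint = m[6] || [m[2], m[3]].compact.join("#")
    counts[[m[1], endpoint, state]] += 1
  end
  # ORDER BY count DESC, application, endpoint, state
  counts.sort_by { |(app, endpoint, state), c| [-c, app, endpoint, state] }
end

summarize(ROWS).each do |(app, endpoint, state), c|
  puts [app, endpoint, state, c].join(" | ")
end
```

Sidekiq jobs carry a `job_class` and so are grouped by worker name, while web requests fall back to `controller#action`.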
This probably indicates that these transactions are being opened through `ActiveRecord::Base#transaction`.
Since `ArchiveTraceWorker` is a high-throughput endpoint, we should ensure that it doesn't use client transactions unless there is a very good reason, and that any transactions it does need are held open for the minimum time possible.
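As a rough illustration of "held open for the minimum time possible", the sketch below contrasts doing slow work inside a transaction block with doing it beforehand. It does not use ActiveRecord or any GitLab code: `FakeDatabase` is a hypothetical stand-in that only logs events, and the steps (reading chunks, uploading, saving metadata) are assumptions about what such a worker might do, not a description of the actual `ArchiveTraceWorker`.

```ruby
# Hypothetical stand-in for a database connection. It only records
# whether each step happened inside or outside a transaction.
class FakeDatabase
  attr_reader :log

  def initialize
    @log = []
    @in_transaction = false
  end

  # Mimics the shape of a block-based transaction helper.
  def transaction
    @log << "BEGIN"
    @in_transaction = true
    yield
    @log << "COMMIT"
  rescue StandardError
    @log << "ROLLBACK"
    raise
  ensure
    @in_transaction = false
  end

  def record(event)
    prefix = @in_transaction ? "in txn" : "outside txn"
    @log << "#{prefix}: #{event}"
  end
end

# Anti-pattern: slow I/O runs while the transaction is open, so the
# connection would sit in "idle in transaction" for the whole upload.
def archive_holding_transaction(db)
  db.transaction do
    db.record("read trace chunks")
    db.record("upload trace (slow I/O)")
    db.record("save archive metadata")
  end
end

# Preferred shape: do the slow work first, then open a transaction
# only around the writes that must be atomic.
def archive_with_short_transaction(db)
  db.record("read trace chunks")
  db.record("upload trace (slow I/O)")
  db.transaction do
    db.record("save archive metadata")
  end
end

db = FakeDatabase.new
archive_with_short_transaction(db)
puts db.log
```

Whether the slow step can really move outside the transaction depends on what the worker needs to be atomic, which would have to be checked against the actual code.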