Skip to content

Add instrumentation for recording subtransaction depth

Stan Hu requested to merge sh-observe-subtransaction-depth into master

A number of recent, performance slowdowns in the PostgreSQL database due (e.g. gitlab-com/gl-infra/production#5333 (closed)) that cause a high rate of statement timeouts. It looks like most queries are waiting on SubtransControlLock, which indicates something is going on with how PostgreSQL is handling subtransactions.

We know that if the number of nested subtransactions exceeds 64, there is a significant performance penalty (see https://www.cybertec-postgresql.com/en/subtransactions-and-performance-in-postgresql/), but right now we don't know if any migrations or Sidekiq jobs are actually using that many nested subtransactions.

This merge request instruments ActiveRecord in order to find the source of nested subtransactions. This change reports a log message into application_json.log when the depth exceeds 32 subtransactions or the overall transaction exceeds 5 minutes.

A sample log message with thresholds removed:

{
  "severity": "INFO",
  "time": "2021-08-12T21:18:07.628Z",
  "correlation_id": "210c961bee1feb2e68a22fec77ef71fe",
  "class": "Gitlab::Database::Transaction::Context",
  "result": "commit",
  "duration_s": 0.009594183415174484,
  "depth": 0,
  "savepoints_count": 0,
  "rollbacks_count": 0,
  "releases_count": 0,
  "sql": "/*application:sidekiq,correlation_id:210c961bee1feb2e68a22fec77ef71fe,jid:c0449496fd536dccd23889fc,endpoint_id:Geo::VerificationBatchWorker*/ BEGIN"
}
Edited by Stan Hu

Merge request reports