Remove subtransactions in Note
In #338346 (comment 652623314), we identified the use of subtransactions can cause significant performance issues on PostgreSQL replicas at scale. If a long running transaction occurs, the subtransaction cache may not fit into the working set of memory, which can lead to a context switch storm when PostgreSQL needs to load subtransaction data from disk. While this arguably is a PostgreSQL bug, we should remove subtransactions entirely because they aren't performant at GitLab.com scale, nor are they strictly necessary.
From https://thanos.gitlab.net/classic/graph?g0.range_input=2h&g0.stacked=0&g0.max_source_resolution=0s&g0.expr=sum(rate(gitlab_active_record_subtransactions_total%7Benv%3D%22gprd%22%7D%5B1m%5D))%20by%20(model)&g0.tab=0 you can see Note
generates a significant number of subtransactions:
This may be happening inside cache_markdown_field.rb
. Whenever a note is created, store_mentions!
is called, which parses all the Markdown references and stores the mentioned users:
- https://gitlab.com/gitlab-org/gitlab/blob/03ccf7954f9f50709539bf08c95c53ead97a69a9/app/models/concerns/cache_markdown_field.rb#L172-173
- https://gitlab.com/gitlab-org/gitlab/blob/11f023f2f6f5cc48100346c42187be0a87b52034/app/models/note.rb#L593
There may be other places where subtransactions are being created, but this one has stood out when importing project notes in https://gitlab.com/gitlab-org/gitlab/-/issues/336788.
We use safe_find_or_create_by
to ensure we create a new model atomically.
We could just use a PostgreSQL insert
or upsert
to ensure the user_mentions
record exists before incrementing the data.
If we wanted to rewrite safe_find_or_create_by
to use insert
or upsert
(making sure we handle the ON CONFLICT
case), we would also need to account for the fact that these methods don't make ActiveRecord callbacks. From https://stackoverflow.com/questions/39058213/postgresql-upsert-differentiate-inserted-and-updated-rows-using-system-columns-x/39204667, we can potentially solve this by returning xmax
and checking whether the value is 0. If it is 0, it is newly-inserted, and we can fire the callbacks ourselves. There may be some callbacks (e.g. before_save
) that may not work, however.