Perform pull request IO work outside a transaction
What does this MR do?
This changes the GitHub Pull Request importer so that it performs its heavy IO operations outside of a database transaction. This should reduce the number of "idle in transaction" connections when importing a lot of pull requests.
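The pattern can be illustrated with a minimal sketch. This is not the importer's actual code: `FakeConnection`, `fetch_pull_request_data`, and `import_pull_request` are hypothetical names, and the connection class is a plain Ruby stand-in that only records the order of events. The point is that the slow IO step runs first, so the transaction is only open for the quick write.

```ruby
# Hypothetical stand-in for a database connection. It does not talk to a
# database; it only records events so the ordering can be inspected.
class FakeConnection
  attr_reader :log

  def initialize
    @log = []
    @in_transaction = false
  end

  # Marks a transaction as open for the duration of the block.
  def transaction
    @in_transaction = true
    @log << :begin
    yield
  ensure
    @log << :end
    @in_transaction = false
  end

  def in_transaction?
    @in_transaction
  end
end

# Hypothetical slow IO step (for example an API call or a Git operation).
# It records whether it ran while a transaction was open.
def fetch_pull_request_data(conn, number)
  conn.log << (conn.in_transaction? ? :io_inside_transaction : :io_outside_transaction)
  { number: number, title: "PR #{number}" }
end

# Do the IO first, then open the transaction only for the write.
def import_pull_request(conn, number)
  data = fetch_pull_request_data(conn, number)

  conn.transaction do
    conn.log << [:insert, data[:number]]
  end
end

conn = FakeConnection.new
import_pull_request(conn, 1)
p conn.log
```

With this ordering, a connection is never left idle in a transaction while waiting on IO; the trade-off is that the IO result is no longer rolled back together with the database write, so the IO step should be safe to repeat.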
Why was this MR needed?
GitLab.com has trouble importing many pull requests at once: many of the import jobs sit waiting for IO operations to complete while holding on to database connections.
To illustrate, take the following graph:
This graph shows the number of "idle in transaction" connections. The big drop towards the end is the result of applying this MR to GitLab.com.
We see a similar drop in the number of pgbouncer connections waiting for a PostgreSQL connection to become available:
Does this MR meet the acceptance criteria?
- Changelog entry added, if necessary
- Tests added for this feature/bug
- Conforms to the code review guidelines
- Has been reviewed by a Backend maintainer
- Has been reviewed by a Database specialist
- Conforms to the merge request performance guides
- Conforms to the style guides
- If you have multiple commits, please combine them into a few logically organized commits by squashing them
- End-to-end tests pass (package-and-qa manual pipeline job)
Edited by Nick Thomas