Perform pull request IO work outside a transaction

Yorick Peterse requested to merge gh-importer-transactions into master

What does this MR do?

This changes the GitHub pull request importer so that it performs its heavy IO operations outside of a database transaction. This should reduce the number of "idle in transaction" connections when importing a lot of pull requests.
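
The general pattern can be sketched as follows. This is only an illustration under stated assumptions, not the importer's actual code: it uses Python's `sqlite3` module in place of the real database layer, and `fetch_pull_request`, `import_pull_request`, `make_db`, and the table layout are all hypothetical names made up for the example. The point it shows is the ordering: the slow fetch runs first with no transaction open, and the transaction wraps only the writes.

```python
import sqlite3
import time


def fetch_pull_request(number):
    """Stand-in for slow network IO (hypothetical; no real API call is made)."""
    time.sleep(0.01)
    return {"number": number, "title": f"PR {number}", "notes": ["first", "second"]}


def import_pull_request(conn, number, fetch=fetch_pull_request):
    # 1. The slow IO happens first, while no transaction is open, so the
    #    connection is not left "idle in transaction" during the wait.
    data = fetch(number)

    # 2. Only the writes run inside the transaction, which keeps it short.
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO pull_requests (number, title) VALUES (?, ?)",
            (data["number"], data["title"]),
        )
        conn.executemany(
            "INSERT INTO notes (pr_number, body) VALUES (?, ?)",
            [(data["number"], body) for body in data["notes"]],
        )


def make_db():
    """Create an in-memory database with the hypothetical example schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pull_requests (number INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE notes (pr_number INTEGER, body TEXT)")
    return conn
```

Calling `import_pull_request(make_db(), 1)` fetches the data and then writes one pull request row and two note rows in a single short transaction. The trade-off of this ordering is that everything the writes need has to be fetched and held in memory before the transaction starts.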

Why was this MR needed?

GitLab.com is having trouble importing many pull requests at once: many of the imports sit waiting for IO operations to complete while holding on to database connections.

To illustrate, take the following graph:

Screenshot_from_2018-06-04_18-54-08

This graph shows the number of "idle in transaction" connections. The big drop towards the end is the result of applying this MR to GitLab.com.

We see a similar drop in the number of pgbouncer connections waiting for a PostgreSQL connection to become available:

Screenshot_from_2018-06-04_18-57-22

Does this MR meet the acceptance criteria?

Edited by Nick Thomas