Skip to content

Parallelise the gitlab:elastic:index_database Rake task

For 9.0, move indexing of database rows from serialised, to one thread per table.

Since GitLab.com has a large number of rows in each table, this reduces the indexing time from the sum of all rows to the slowest table (probably notes, which has 2x as many indexable rows as the others). So this reduces the time from perhaps 24 hours, down to 8 hours.

This does make the rake task open 6 database connections, which interacts badly with the gdk default configuration (which has a connection pool size of 5). I'll open a separate MR to fix up the GDK configuration.

Related to #1839 (closed)

Edited by Sean Carroll

Merge request reports