Reduce the number of Elasticsearch client instances that are created (!3432)

The source project of this merge request has been removed.

Nick Thomas requested to merge (removed):3650-persist-elasticsearch-client into master Nov 16, 2017

What does this MR do?

A typical GitLab deployment will have many processes running, and each of those processes needs only one Elasticsearch client instance. The client instance is thread-safe and handles concurrent requests very well, using HTTP keep-alive connections.

Prior to this MR, each process using Elasticsearch would instantiate one client per class that used Elasticsearch::Model::Client. So a multi-node setup might look like:

* Server A
  * Unicorn parent
    * Unicorn child A
      * Client for Project class
      * Client for Repository class
      * Client for Issue class
      * ...
    * Unicorn child B
      * Client for Project class
      * Client for Repository class
      * Client for Issue class
      * ...
  * Sidekiq
    * Client for Project class
    * Client for Repository class
    * Client for Issue class
    * ...
* Server B
  * Unicorn parent
    * ... (same as above)
  * Sidekiq
    * ... (same as above)

(total: N clients for Sidekiq, plus N per Unicorn child, multiplied by the number of servers, where N is the number of classes that use the client)

Following this MR, we have the following clients instead:

* Server A
  * Sidekiq (1 client)
  * Unicorn parent
    * Unicorn child A (1 client)
    * Unicorn child B (1 client)
* Server B
  * ... (same as above)

(total: 1 client for Sidekiq, plus 1 per Unicorn child, multiplied by the number of servers)
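
To make the two totals concrete, here is a small sketch that counts clients before and after the change. The method names and the example numbers (3 classes, 2 Unicorn children, 2 servers) are illustrative assumptions, not figures from a real deployment, and it assumes one Sidekiq process per server as in the trees above.

```ruby
# Hypothetical helpers for illustration only; not part of GitLab.

# Before: every client-holding process (Sidekiq plus each Unicorn child)
# holds one client per class that uses the Elasticsearch client.
def clients_before(classes:, unicorn_children:, servers:)
  processes_per_server = 1 + unicorn_children # 1 Sidekiq + the children
  classes * processes_per_server * servers
end

# After: every client-holding process holds exactly one shared client.
def clients_after(unicorn_children:, servers:)
  (1 + unicorn_children) * servers
end

before = clients_before(classes: 3, unicorn_children: 2, servers: 2)
after = clients_after(unicorn_children: 2, servers: 2)
puts "before: #{before}, after: #{after}"
```

The saving scales with the number of classes, since the per-class factor drops out entirely.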

This drastically reduces the number of HTTP connections we make to both the Elasticsearch servers and the AWS instance profile credentials service, and should come with a small increase in performance due to better utilisation of those connections.

Are there points in the code the reviewer needs to double check?

I'm not 100% certain that the mutex is required. Even if it isn't, it's very cheap compared to the operations we perform with the client, so I'm comfortable leaving it there.
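
For readers unfamiliar with the pattern, the following is a minimal sketch of a per-process client memoized behind a mutex. It is not the code from this MR: `FakeClient` and `SharedClient` are hypothetical names, and `FakeClient` stands in for a real Elasticsearch client so the example runs without any gems.

```ruby
# Stand-in for a real client; counts how many instances were ever built.
class FakeClient
  @instances = 0

  class << self
    attr_reader :instances

    def build
      @instances += 1
      new
    end
  end
end

module SharedClient
  MUTEX = Mutex.new

  # Returns the single client for this process, building it on first use.
  # The mutex ensures that threads racing on the first call cannot each
  # build their own client.
  def self.client
    MUTEX.synchronize do
      @client ||= FakeClient.build
    end
  end
end

# Ten threads ask for the client at roughly the same time.
threads = Array.new(10) { Thread.new { SharedClient.client } }
clients = threads.map(&:value)

puts "clients built: #{FakeClient.instances}"
puts "distinct clients seen: #{clients.uniq.size}"
```

Without the mutex, `@client ||= ...` is not atomic, so in a threaded process such as Sidekiq two threads could each build a client on first use. The likely cost would be an extra client instantiation rather than incorrect results, which is consistent with treating the mutex as cheap insurance.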

Why was this MR needed?

When collecting instance profile credentials in AWS, each client instantiation is an HTTP request to an external web service. This service may rate-limit us if we perform too many requests in a given time period.

Screenshots (if relevant)

Does this MR meet the acceptance criteria?

- Changelog entry added, if necessary
- Tests added for this feature/bug
- Review
  - Has been reviewed by Backend
- Conform by the merge request performance guides
- Conform by the style guides
- Squashed related commits together

What are the relevant issue numbers?

Closes #3650 (closed)

Edited Nov 17, 2017 by Nick Thomas
