
Increase MaxIdleConnsPerHost in http.Transport from 2 to 100

Igor requested to merge http-client-pool-default into master

While investigating gitlab-com/gl-infra/production#1999 (closed), we found that Go's http.Client pools connections internally through its http.Transport.

Summary:

If we look at production, gitlab-pages is opening tens of connections to gitlab-api per second.

The hypothesis for why this happens is that we have lots of concurrent requests going through the transport. This transport has a few parameters that control its behaviour. The one we are concerned with in this case is MaxIdleConnsPerHost.

MaxIdleConnsPerHost defines the threshold for burst capacity: it is the maximum number of idle (keep-alive) connections the transport keeps per host. Concurrent requests up to MaxIdleConnsPerHost will get connections that are pooled and reused.

If the number of concurrent requests exceeds it, each of those "extra" requests still gets its own connection, established on demand ("burst capacity"). That connection does not go back into the pool once the request is done, so it is never reused. Instead, it is closed.

In other words: If we have more than MaxIdleConnsPerHost concurrent requests, we will constantly be opening and closing connections.

The default value for MaxIdleConnsPerHost is 2. This patch increases the parameter to 100.

We can make it user-tweakable in the future, but this should be high enough for quite a while.

Impact:

This connection churn creates other issues. In our case, it exhausted ports on the NAT, leading to SYN drops and eventually an outage of the service (gitlab-com/gl-infra/production#1999 (closed)).

cc @grzesiek

Edited by Igor
