Decommission ES cluster and create a new one with more nodes but smaller specs
This is part of the META issue https://gitlab.com/gitlab-com/infrastructure/issues/1597
As a continuation of https://gitlab.com/gitlab-com/infrastructure/issues/2157, we saw that we couldn't cope with the indexing load for a couple of reasons:
- Not all nodes are receiving the same number of requests (this is due to the use of the parent-child document relationship, which is causing an imbalance across the shard sizes of the cluster). We didn't even index 1% of what's on GitLab.com, but @nick.thomas calculated that the space distribution should be even:
  ```
  [{"count"=>"102871", "size_mib"=>"3800023.516858100891", "shard"=>"0"},
   {"count"=>"103217", "size_mib"=>"3667799.555052757263", "shard"=>"1"},
   {"count"=>"103195", "size_mib"=>"4278779.600588798523", "shard"=>"2"},
   {"count"=>"103235", "size_mib"=>"3400449.739967346191", "shard"=>"3"},
   {"count"=>"103298", "size_mib"=>"3745645.905962944031", "shard"=>"4"},
   {"count"=>"103524", "size_mib"=>"3708715.532225608826", "shard"=>"5"},
   {"count"=>"103180", "size_mib"=>"4372124.374240875244", "shard"=>"6"},
   {"count"=>"103488", "size_mib"=>"3521339.539809226990", "shard"=>"7"},
   {"count"=>"103548", "size_mib"=>"4258394.189184188843", "shard"=>"8"},
   {"count"=>"103335", "size_mib"=>"3896756.068240165710", "shard"=>"9"},
   {"count"=>"103335", "size_mib"=>"4073603.255154609680", "shard"=>"10"},
   {"count"=>"103247", "size_mib"=>"3673936.535663604736", "shard"=>"11"},
   {"count"=>"103435", "size_mib"=>"3441760.125455856323", "shard"=>"12"},
   {"count"=>"103570", "size_mib"=>"3957315.223025321960", "shard"=>"13"},
   {"count"=>"103123", "size_mib"=>"3680589.868497848511", "shard"=>"14"},
   {"count"=>"103255", "size_mib"=>"3916405.046738624573", "shard"=>"15"},
   {"count"=>"103545", "size_mib"=>"3748733.100311279297", "shard"=>"16"},
   {"count"=>"103267", "size_mib"=>"3725391.259220123291", "shard"=>"17"}]
  ```
- Not all the nodes are receiving `PUT` requests over the `index` API due to a problem in the Ruby client, which is being tracked here.
Also, 18 shards is not enough, so we need to increase the number. We can have a cheaper cluster with more (but less powerful) nodes while also cutting available disk space, as we don't need 4TB/node of available space.
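Since Elasticsearch fixes the shard count at index creation time, increasing it means creating a new index and reindexing into it. A minimal sketch of the settings body such an index could be created with; the shard count (36), replica count, and index name are assumptions for illustration, not decisions from this issue:

```ruby
require "json"

# Hypothetical settings for a replacement index. 36 shards (up from the
# current 18) and 1 replica are illustrative values only.
new_index_settings = {
  "settings" => {
    "index" => {
      "number_of_shards"   => 36,
      "number_of_replicas" => 1,
    },
  },
}

# With the Ruby client this would be applied at creation time, e.g.:
#   client.indices.create(index: "gitlab-production-v2", body: new_index_settings)
# (index name is hypothetical), followed by reindexing from the old index.
puts JSON.pretty_generate(new_index_settings)
```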
The current list of tasks is:

- Get rid of elasticsearch0[123].db.gitlab.com
- Change terraform resources:
  - From DS12_v2 -> DS11_v2
  - From 4x 1TB disks -> 4x 512GB disks
-