Elasticsearch total_shards_per_node prevents rollover to data_warm nodes
The screenshot shows available disk space going up on the data_warm nodes but going down on the data_hot nodes. At the time of writing, there is very little disk space left on the data nodes.
After removing the total_shards_per_node restriction, you can see things start to come back to normal:
We have a script that runs every 10 minutes and sets total_shards_per_node: 1 here: https://gitlab.com/gitlab-com/runbooks/-/blob/master/elastic/scheduled/hot_index_shards_per_node.sh
- First, we probably need to disable this script to prevent this problem from happening every 10 minutes.
- Afterwards, we should investigate why this is a problem. This problem came up during the upgrade of the logging cluster here: #6001 (comment 755191777)
A temporary workaround (it only lasts ~10 minutes, until the script runs again) is to remove the total_shards_per_node setting across the cluster:
PUT /*/_settings
{
  "index.routing.allocation.total_shards_per_node": null
}
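If the console is not handy, the same workaround request could be issued from a small script. The following is only a minimal sketch: the function name `build_clear_request` and the base URL `http://localhost:9200` are assumptions for illustration, and authentication and TLS settings for the logging cluster are left out entirely.

```python
import json
import urllib.request

# Setting name taken from the workaround above.
SETTING = "index.routing.allocation.total_shards_per_node"


def build_clear_request(base_url: str, index_pattern: str = "*") -> urllib.request.Request:
    """Build (but do not send) the PUT that resets the setting to its default.

    Sending null for an index setting removes the explicit value, which is
    what the console request above does for every index.
    """
    body = json.dumps({SETTING: None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_pattern}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage (base URL is an assumption; add auth as your cluster requires):
#   req = build_clear_request("http://localhost:9200")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read().decode())
```

As with the console request, the effect is undone the next time the scheduled script runs, so this only buys time until the script is disabled.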
Edited by John Mason