Schedule forced merge for Elasticsearch
Due to extended and recurring latency issues, Elastic.co has recommended that we periodically run a forced merge. #292439 (closed)
Running a forced merge comes with a trade-off: it inflates segment sizes, and segments that grow too large would create problems of their own. There is no way to estimate or calculate the change in segment size in advance.
The frequency should therefore be carefully evaluated.
As an additional reference point, splitting shards lowered the size of each shard. gitlab-com/gl-infra/production#2872 (comment 435703736)
Recommended Steps
- Find a non-peak time. Search will run slowly during this process.
- Notify Ops of the change.
- Record latency and document performance from the last active day prior to this change.
- Record segment sizes prior to running.
- Run POST /<your-index>/_forcemerge?only_expunge_deletes=true. I think this will need to be done for all indices.
- Record in this issue the latency during the process and how long it took to run.
- Record segment sizes after running.
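The record, merge, record part of the steps above could be scripted roughly as follows. This is a minimal sketch, not a vetted runbook: the cluster address `ES_URL` and the helper names are placeholders invented for illustration, authentication is omitted, and it assumes the `_cat/segments` API is called with `format=json&bytes=b` so that sizes come back as plain byte counts.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: placeholder cluster address


def forcemerge_url(base_url, index):
    """Build the force merge URL that only expunges deleted documents."""
    return f"{base_url.rstrip('/')}/{index}/_forcemerge?only_expunge_deletes=true"


def segments_url(base_url, index):
    """Build a _cat/segments URL that returns sizes in bytes as JSON."""
    return f"{base_url.rstrip('/')}/_cat/segments/{index}?format=json&bytes=b"


def summarize_segments(rows):
    """Reduce _cat/segments rows to a count, total size, and largest segment."""
    sizes = [int(row["size"]) for row in rows]
    return {
        "segments": len(sizes),
        "total_bytes": sum(sizes),
        "largest_bytes": max(sizes, default=0),
    }


def request_json(url, method="GET"):
    """Send a request and decode the JSON response body."""
    req = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def merge_and_record(index, base_url=ES_URL):
    """Record segment sizes, run the force merge, then record them again."""
    before = summarize_segments(request_json(segments_url(base_url, index)))
    started = time.monotonic()
    # The request is expected to block until the merge finishes.
    request_json(forcemerge_url(base_url, index), method="POST")
    duration_s = time.monotonic() - started
    after = summarize_segments(request_json(segments_url(base_url, index)))
    return {"index": index, "before": before, "after": after, "duration_s": duration_s}
```

Calling `merge_and_record("<your-index>")` once per index would give the before and after segment numbers plus the duration to paste into this issue. Search latency still has to be taken from the existing monitoring, since the script does not measure it.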
Based on these results, we should determine how frequently we can safely run this process.
The second Saturday of each month is the default option.
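If the default schedule is adopted, the run dates can be computed ahead of time so Ops can be notified. A small sketch using only the standard library; the function name `second_saturday` is made up for illustration.

```python
import calendar
import datetime


def second_saturday(year, month):
    """Return the date of the second Saturday in the given month."""
    first_weekday = datetime.date(year, month, 1).weekday()  # Monday == 0
    # Days from the 1st of the month to the first Saturday (0 if the 1st is one).
    days_to_saturday = (calendar.SATURDAY - first_weekday) % 7
    return datetime.date(year, month, 1 + days_to_saturday + 7)
```

For example, `second_saturday(2020, 8)` returns 2020-08-08, because 1 August 2020 was itself a Saturday.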