Elasticsearch refresh interval strategy
Description
The refresh interval is a per-index setting that trades indexing performance against search freshness.
From the Elasticsearch docs:
How often to perform a refresh operation, which makes recent changes to the index visible to search. Defaults to 1s. Can be set to -1 to disable refresh. If this setting is not explicitly set, shards that haven’t seen search traffic for at least index.search.idle.after seconds will not receive background refreshes until they receive a search request. Searches that hit an idle shard where a refresh is pending will wait for the next background refresh (within 1s). This behavior aims to automatically optimize bulk indexing in the default case when no searches are performed. In order to opt out of this behavior an explicit value of 1s should set as the refresh interval.
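To make the options in the quoted passage concrete, here is a minimal sketch of the update-settings request for each case. It only builds the request and does not send it. The cluster address and the index name `issues` are placeholders (the real index names may carry a prefix), and `refresh_interval_request` is a hypothetical helper, not part of our codebase.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: placeholder cluster address


def refresh_interval_request(index, interval):
    """Build (but do not send) a PUT /<index>/_settings request.

    interval is a time value such as "1s" or "60s", "-1" to disable
    periodic refresh, or None to remove the explicit value so the index
    falls back to the default (including the search-idle behaviour).
    """
    body = {"index": {"refresh_interval": interval}}
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Explicit 1s: same cadence as the default, but opts out of search-idle skipping.
explicit = refresh_interval_request("issues", "1s")
# None serialises to JSON null, which resets the setting to its default.
reset = refresh_interval_request("issues", None)
```

Passing a request to `urllib.request.urlopen` would apply it; authentication and TLS are left out of this sketch.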
Currently our indices have the following settings:
| index | refresh_interval on production | refresh_interval on staging |
|---|---|---|
| issues | | |
| users | 60s | default |
| main | default | default |
| commits | default | default |
| wikis | default | default |
| notes | default | default |
| projects | default | default |
| epics | default | default |
| merge_requests | default | default |
| migrations | default | default |
A 60s refresh interval on users is acceptable because user data is updated infrequently, but for issues a 60s delay before changes become searchable is probably too long.
There is also a suggestion to set the refresh interval to -1 (disabling periodic refresh) and let the bulk cron worker trigger a refresh after each indexing run.
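The "disable and refresh from the worker" idea could look roughly like the sketch below. It is written in Python purely for illustration and is not the actual bulk cron worker. `send(method, path, body)` is a hypothetical transport callable standing in for whatever client the worker uses, and the index name `issues` is a placeholder. The endpoints themselves (`PUT /<index>/_settings`, `POST /_bulk`, `POST /<index>/_refresh`) are standard Elasticsearch REST APIs.

```python
import json


def disable_periodic_refresh(send, index):
    """Turn off background refreshes for the index (refresh_interval = -1)."""
    body = json.dumps({"index": {"refresh_interval": "-1"}})
    send("PUT", f"/{index}/_settings", body)


def bulk_then_refresh(send, index, docs):
    """Index docs via the _bulk API, then refresh once so they become searchable.

    docs is a list of (doc_id, document) pairs.
    """
    if not docs:
        return  # nothing to index, so no refresh is needed
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The _bulk API takes newline-delimited JSON and requires a trailing newline.
    send("POST", "/_bulk", "\n".join(lines) + "\n")
    # One explicit refresh per batch replaces the periodic background refresh.
    send("POST", f"/{index}/_refresh", None)


# Usage with a recording stub in place of a real HTTP client.
calls = []


def record(method, path, body):
    calls.append((method, path, body))


disable_periodic_refresh(record, "issues")
bulk_then_refresh(record, "issues", [("1", {"title": "Example"})])
```

With this approach search freshness is bounded by how often the worker runs rather than by `refresh_interval`, so the cron schedule becomes the setting to tune. Passing `refresh=true` on the bulk request itself is an alternative to the separate refresh call.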
We need to establish our refresh interval strategy and act on it.