Consider using less expensive index_options to save disk space
Summary
Currently the Elasticsearch mapping stores a lot of extra information for each term it indexes. This can be replaced with less expensive index_options (such as positions instead of offsets, or even docs instead of offsets when highlighting is not needed and scoring is less of an issue).
Improvements
- Use positions for index_options on fields that require highlighting. This saves about 33% of the index size. See positions_index_options.json.
- Use docs for index_options on fields that don't require highlighting and whose scoring does not depend on the number of occurrences in the field. This saves about 2% of the index size. See docs_index_options.json.
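The two options above could be expressed in a mapping roughly as follows. This is a minimal sketch, not the content of the attached JSON files: the field names (title, body, tags) and the build_mapping helper are hypothetical, and the typeless mappings/properties layout assumes Elasticsearch 7 or later.

```python
import json

# Hypothetical field names; the fields of the real mapping are not listed in this issue.
HIGHLIGHTED_FIELDS = ["title", "body"]
NON_HIGHLIGHTED_FIELDS = ["tags"]


def build_mapping(highlighted, non_highlighted):
    """Return an index-creation body that sets index_options per text field."""
    properties = {}
    for name in highlighted:
        # positions: doc numbers, term frequencies and term positions,
        # but no character offsets.
        properties[name] = {"type": "text", "index_options": "positions"}
    for name in non_highlighted:
        # docs: only doc numbers are indexed, so term frequency
        # cannot influence scoring on these fields.
        properties[name] = {"type": "text", "index_options": "docs"}
    return {"mappings": {"properties": properties}}


mapping = build_mapping(HIGHLIGHTED_FIELDS, NON_HIGHLIGHTED_FIELDS)
print(json.dumps(mapping, indent=2))
```

The printed body would be sent when creating the index. Changing index_options on existing fields generally requires a reindex. Fields indexed with docs also carry no position data, so phrase queries against them will not work.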
Risks
No code changes should be necessary. However, the Elasticsearch documentation is clear that offsets help to speed up the highlighter, so highlighting may become slower without them.
Relates to #3327 (closed)
Edited by Michael Jakl