Consider using less expensive index_options to save disk space
Summary
Currently the Elasticsearch mapping stores a lot of extra information for each indexed term. This can be replaced with less expensive index_options: positions instead of offsets, or even docs instead of offsets where highlighting is not needed and scoring is less of an issue.
Improvements
- Use positions for index_options on fields that require highlighting. This saves about 33% of index size. (positions_index_options.json)
- Use docs for index_options on fields that don't require highlighting and whose scoring does not depend on the number of occurrences in the field. This saves about 2% of index size. (docs_index_options.json)
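As a rough illustration of the two options above, the following Python sketch builds a mapping fragment that assigns positions to highlighted fields and docs to the rest. The field names (title, description, reference) and the helper names are hypothetical and are not taken from the attached JSON files; only the index_options values (docs, freqs, positions, offsets) come from Elasticsearch itself.

```python
import json
from typing import Dict, Iterable

# The four values Elasticsearch accepts for index_options, cheapest first.
VALID_INDEX_OPTIONS = ("docs", "freqs", "positions", "offsets")


def text_field(index_options: str) -> Dict[str, str]:
    """Return a text field definition with the given index_options."""
    if index_options not in VALID_INDEX_OPTIONS:
        raise ValueError(f"unknown index_options: {index_options!r}")
    return {"type": "text", "index_options": index_options}


def build_mapping(highlighted: Iterable[str], unscored: Iterable[str]) -> Dict[str, dict]:
    """Build a mapping fragment (hypothetical helper, for illustration only).

    highlighted: fields that need highlighting -> positions
    unscored:    fields with no highlighting and no term-frequency scoring -> docs
    """
    properties: Dict[str, dict] = {}
    for name in highlighted:
        properties[name] = text_field("positions")
    for name in unscored:
        properties[name] = text_field("docs")
    return {"properties": properties}


if __name__ == "__main__":
    # Hypothetical field names; replace with the real ones from the mapping.
    mapping = build_mapping(["title", "description"], ["reference"])
    print(json.dumps(mapping, indent=2))
```

The printed fragment could then be compared against the attached JSON files before it is applied to an index.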
Risks
No code changes should be necessary. However, the Elasticsearch documentation is clear that offsets help to speed up the highlighter, so highlighting may become slower on fields that switch from offsets to positions.
Relates to #3327 (closed)
Edited by Michael Jakl