Currently the Elasticsearch mapping stores a lot of extra information for each term it indexes. This is replaced with less expensive options: positions instead of offsets, or even docs instead of offsets where highlighting is not needed and scoring is less of an issue.
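
To make the trade-off concrete, here is a minimal sketch of how the three variants could be expressed through Elasticsearch's `index_options` mapping parameter (valid values: `docs`, `freqs`, `positions`, `offsets`). The field names `title`, `description` and `path` are hypothetical, and so is the per-field split used for "mixed"; the real mapping may differ. The request body shape assumes a typeless mapping (Elasticsearch 7 or later).

```python
import json

# Valid values of Elasticsearch's `index_options` mapping parameter,
# ordered from least to most data stored per term.
INDEX_OPTIONS = ("docs", "freqs", "positions", "offsets")

# Hypothetical field names and per-field choices for each variant.
# "mixed" is assumed to mean: positions where phrase queries matter,
# docs where neither highlighting nor fine-grained scoring is needed.
PROFILES = {
    "offsets": {"title": "offsets", "description": "offsets", "path": "offsets"},
    "positions": {"title": "positions", "description": "positions", "path": "positions"},
    "mixed": {"title": "positions", "description": "positions", "path": "docs"},
}


def build_mapping(profile):
    """Return an index-creation body for the given profile name."""
    properties = {}
    for name, option in PROFILES[profile].items():
        if option not in INDEX_OPTIONS:
            raise ValueError(f"unknown index_options value: {option}")
        properties[name] = {"type": "text", "index_options": option}
    return {"mappings": {"properties": properties}}


if __name__ == "__main__":
    print(json.dumps(build_mapping("mixed"), indent=2))
```

The resulting dictionary can be sent as the body of an index-creation request. Since `index_options` generally cannot be changed on an existing field, switching variants would normally require creating a new index and reindexing.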

Results of testing the different options on a project from gitlabhq_export.tar.gz:

Options            Size, MB   % of current   Change
offsets (current)  808.2      100.00%
positions          561.53     69.48%         -30.52%
mixed              512.38     63.40%         -36.60%
  • There is no noticeable difference in search speed between these options.
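
The percentage columns follow directly from the measured sizes. As a quick check, the following sketch recomputes them from the three size values reported above; the function name `relative_size` is only illustrative.

```python
# Index sizes in MB, as reported in the measurements above.
SIZES_MB = {"offsets": 808.2, "positions": 561.53, "mixed": 512.38}


def relative_size(sizes, baseline="offsets"):
    """Map each option to (percent of baseline, percent change), rounded to 2 places."""
    base = sizes[baseline]
    result = {}
    for name, size in sizes.items():
        percent = 100 * size / base
        result[name] = (round(percent, 2), round(percent - 100, 2))
    return result


if __name__ == "__main__":
    for name, (percent, change) in relative_size(SIZES_MB).items():
        print(f"{name:10s} {percent:7.2f}% {change:+7.2f}%")
```

Run as a script, this prints one line per option with its size relative to the current `offsets` index and the corresponding change.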



