Use less expensive index_options

Dmitry Gruzd requested to merge 28085-index-options-tuning into master

What does this MR do?

Currently, the Elasticsearch mapping uses the most expensive index_options setting (offsets), which stores a lot of extra information for each indexed term. This MR replaces it with less expensive options: positions instead of offsets, or even docs instead of offsets for fields where highlighting is not needed and scoring is less of an issue.
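As a rough illustration of the idea, the sketch below builds a mapping fragment with a per-field index_options value. The four option names (docs, freqs, positions, offsets) are the values Elasticsearch accepts for this parameter; the field names (title, description, blob_path) and the helper functions are hypothetical, and the choice of option per field is an assumption, not the exact mapping changed in this MR.

```python
import json

# Values accepted by the Elasticsearch index_options mapping parameter,
# ordered from least to most information stored per term.
INDEX_OPTIONS = ("docs", "freqs", "positions", "offsets")


def text_field(index_options: str) -> dict:
    """Return a text field definition with the given index_options value."""
    if index_options not in INDEX_OPTIONS:
        raise ValueError(f"unknown index_options value: {index_options!r}")
    return {"type": "text", "index_options": index_options}


def build_mapping() -> dict:
    """Build a mapping fragment mixing cheaper options per field.

    Field names are hypothetical and used for illustration only.
    """
    return {
        "properties": {
            # Phrase queries need positions, but not character offsets.
            "title": text_field("positions"),
            "description": text_field("positions"),
            # Assumed to need neither highlighting nor fine-grained scoring.
            "blob_path": text_field("docs"),
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
```

The trade-off is that each step down the list drops information: without offsets, highlighting has to re-analyze the text, and without positions, phrase queries are not available on that field.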

Original issue

Testing different options on a project from gitlabhq_export.tar.gz

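| Options           | Size, MB | % of current | Change  |
|-------------------|----------|--------------|---------|
| offsets (current) | 808.2    | 100.00%      |         |
| positions         | 561.53   | 69.48%       | -30.52% |
| mixed             | 512.38   | 63.40%       | -36.60% |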
  • There is no noticeable difference in search speed between these options.
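The percentages in the table can be reproduced from the raw index sizes. The snippet below is a minimal sketch of that arithmetic, using the sizes reported above; the function and variable names are illustrative only.

```python
# Index sizes in MB, as reported in the table above.
SIZES_MB = {"offsets": 808.2, "positions": 561.53, "mixed": 512.38}


def relative_size(size_mb: float, baseline_mb: float) -> float:
    """Size as a percentage of the baseline, rounded to two decimals."""
    return round(size_mb / baseline_mb * 100, 2)


def summarize(sizes: dict, baseline: str = "offsets") -> dict:
    """Map each option to (percent of baseline, percent change)."""
    base = sizes[baseline]
    result = {}
    for name, size in sizes.items():
        pct = relative_size(size, base)
        result[name] = (pct, round(pct - 100, 2))
    return result


if __name__ == "__main__":
    for name, (pct, change) in summarize(SIZES_MB).items():
        print(f"{name:<10} {pct:6.2f}% {change:+7.2f}%")
```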

Screenshots

Screenshot_2020-02-27_at_18.08.18

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

Closes #28085 (closed)
