
Resolve "ElasticSearch "Exact Term" broken when used with special characters."

Mario de la Ossa requested to merge 4085-elasticsearch-exact-term into master

What does this MR do?

  • Uses the whitespace tokenizer instead of the standard tokenizer in the code_analyzer and code_search_analyzer Elasticsearch custom analyzers, so that special characters are no longer ignored during search.

  • Removes the code_mapping char_filter, since it mangled the original text and caused issues. Its behavior is reimplemented as one of the pattern_capture filters listed in the next item.

  • Adds three pattern_capture filters:

    • one to capture terms inside quotes while ignoring the quotes (otherwise include "foo-x.y" would not show up when searching for foo-x.y)

    • one to split terms on periods, like.this.one, which reimplements what the code_mapping char_filter did while retaining the original term so that searches can still find it

    • one to split terms on slashes, like/this/one, because the standard tokenizer did this and the whitespace tokenizer does not, so some results stopped being returned
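
To make the tokenization change concrete, here is a rough Python sketch of the idea. It is not the code from this MR: the filter names (quoted_terms, dotted_terms, slashed_terms) and the regex patterns are assumptions chosen for illustration, and the real analyzers may include other filters that are omitted here. The settings dictionary uses Elasticsearch's whitespace tokenizer and pattern_capture token filter with preserve_original; the functions below it only approximate their behavior locally so the resulting terms can be inspected without a cluster.

```python
import re

# Hypothetical patterns; the actual patterns in the MR may differ.
QUOTED = [r'"([^"]+)"', r"'([^']+)'"]   # text inside quotes, without the quotes
DOTTED = [r"([^.]+)"]                   # pieces between periods
SLASHED = [r"([^/]+)"]                  # pieces between slashes


def capture_filter(patterns):
    """Build a pattern_capture filter definition that keeps the original token."""
    return {"type": "pattern_capture", "preserve_original": True, "patterns": patterns}


# Sketch of index settings. Analyzer names come from the MR description;
# filter names are invented for this example.
FILTER_NAMES = ["quoted_terms", "dotted_terms", "slashed_terms"]
SETTINGS = {
    "analysis": {
        "filter": {
            "quoted_terms": capture_filter(QUOTED),
            "dotted_terms": capture_filter(DOTTED),
            "slashed_terms": capture_filter(SLASHED),
        },
        "analyzer": {
            name: {"type": "custom", "tokenizer": "whitespace", "filter": FILTER_NAMES}
            for name in ("code_analyzer", "code_search_analyzer")
        },
    }
}


def pattern_capture(tokens, patterns):
    """Approximate pattern_capture: keep each token and add its capture groups."""
    out = []
    for token in tokens:
        emitted = [token]  # preserve_original: the untouched token is kept
        for pattern in patterns:
            for match in re.finditer(pattern, token):
                for group in match.groups():
                    if group and group not in emitted:
                        emitted.append(group)
        out.extend(emitted)
    return out


def analyze(text):
    """Split on whitespace, then apply the three capture filters in order."""
    tokens = text.split()
    for patterns in (QUOTED, DOTTED, SLASHED):
        tokens = pattern_capture(tokens, patterns)
    return tokens


if __name__ == "__main__":
    for sample in ('include "foo-x.y"', "like.this.one", "like/this/one"):
        print(sample, "->", analyze(sample))
```

Because every filter preserves the original token, an exact search for foo-x.y can match the unquoted capture, while searches for the individual pieces (like, this, one) still match the split terms.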

Screenshots (if relevant)

[Two pairs of before/after screenshots; images not available]

Does this MR meet the acceptance criteria?

  • Changelog entry added, if necessary
  • Tests added for this feature/bug
  • Review
    • Has been reviewed by Backend

What are the relevant issue numbers?

Closes #4085 (closed) Closes #4291 (closed)

Edited by Coung Ngo
