Search results inconsistent for camel case text
Summary
Code search (using Advanced search) returns different results for camel case text depending on the letter case of the search term.
Customer support ticket (internal): https://gitlab.zendesk.com/agent/tickets/487811
Steps to reproduce
- enable Advanced search in gdk for indexing and search
- index everything: bundle exec rake gitlab:elastic:index
- find a project to add files to
- add five files:
  - filename: test1.java, content: public void helloWorldFooBar(Boolean test) throws Exception;
  - filename: test2.java, content: testService.helloWorldFooBar(True);
  - filename: test1.md, content: helloWorldFooBar
  - filename: test2.md, content: helloworldfoobar
  - filename: test3.md, content: helloWorldFooBar(False)
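To make the reproduction repeatable, the five files can be generated with a small script. This is a minimal sketch: the file names and contents are copied from the steps above, while `write_fixtures` and the target directory are hypothetical names chosen for illustration. The files still need to be committed, pushed, and indexed afterwards.

```python
from pathlib import Path

# File names and contents copied from the reproduction steps above.
FIXTURES = {
    "test1.java": "public void helloWorldFooBar(Boolean test) throws Exception;",
    "test2.java": "testService.helloWorldFooBar(True);",
    "test1.md": "helloWorldFooBar",
    "test2.md": "helloworldfoobar",
    "test3.md": "helloWorldFooBar(False)",
}


def write_fixtures(target_dir):
    """Write the five fixture files into target_dir and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, content in FIXTURES.items():
        path = target / name
        path.write_text(content + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

For example, `write_fixtures("path/to/local/clone")` writes all five files into a local clone of the test project.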
What is the current bug behavior?
- Perform a project code search (verify that Advanced search is enabled)
- Search with helloWorldFooBar
  - Get back 5 results
- Search with helloworldfoobar
  - Get back 2 results
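The two searches can also be compared outside the UI. The sketch below uses the GitLab project search API with the `blobs` scope, assuming that this endpoint is served by the same Advanced search index as the UI search described above. The helper names, the base URL, and the access token are placeholders, not values from this issue.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode


def blob_search_url(base_url, project, term):
    """Build the project-level code (blobs) search URL for a search term."""
    project_id = quote(str(project), safe="")  # URL-encode "group/project" paths
    query = urlencode({"scope": "blobs", "search": term})
    return f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/search?{query}"


def search_paths(base_url, project, term, token):
    """Return the sorted file paths on the first page of results for term."""
    req = urllib.request.Request(
        blob_search_url(base_url, project, term),
        headers={"PRIVATE-TOKEN": token},
    )
    with urllib.request.urlopen(req) as resp:
        return sorted({hit["path"] for hit in json.load(resp)})


def compare_terms(base_url, project, token, terms=("helloWorldFooBar", "helloworldfoobar")):
    """Map each search term to the file paths it returns, for a side-by-side check."""
    return {term: search_paths(base_url, project, term, token) for term in terms}
```

If the bug is present, the path lists returned by `compare_terms` for the two terms should differ in the same way as the result counts above.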
What is the expected correct behavior?
Search should return the same results for camel case text regardless of the letter case used in the search term.
Possible fixes
Code content is indexed using the Elasticsearch word delimiter graph token filter. The problem appears to occur only when the camel case text is followed by a non-alphanumeric character, such as the ( in the examples above.
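One way to investigate is to ask Elasticsearch which tokens it emits for the sample strings, using the `_analyze` API with an inline `word_delimiter_graph` filter. The sketch below is illustrative only: the whitespace tokenizer and the filter settings are assumptions and are not taken from GitLab's actual index configuration, and the function names are hypothetical.

```python
import json
import urllib.request


def build_analyze_request(text):
    """Build an _analyze request body with an inline word_delimiter_graph filter."""
    return {
        "tokenizer": "whitespace",
        "filter": [
            {
                "type": "word_delimiter_graph",
                # Assumed settings for illustration, not GitLab's actual config.
                "preserve_original": True,
                "split_on_case_change": True,
            },
            "lowercase",
        ],
        "text": text,
    }


def analyze(es_url, text):
    """POST the request to <es_url>/_analyze and return the emitted token strings."""
    req = urllib.request.Request(
        es_url.rstrip("/") + "/_analyze",
        data=json.dumps(build_analyze_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return [t["token"] for t in json.load(resp)["tokens"]]
```

Comparing the output of `analyze(es_url, "helloWorldFooBar")` with `analyze(es_url, "helloWorldFooBar(False)")` against a local Elasticsearch instance should show whether the trailing punctuation changes which tokens are indexed.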

