Remove partial word matching from code search
What does this MR do?
Currently the code search uses ngrams to allow searching for prefixes as well as full matches. This takes up a lot of storage and can be replaced with a prefix search.
This MR removes the usage of `edgeNGram_filter` from our index mappings.
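As a rough illustration of why edge n-grams are expensive to store, here is a minimal Python sketch. It is not the Elasticsearch implementation: the `min_gram`/`max_gram` defaults, the helper names, and the sample tokens are assumptions made for the example, since the actual filter settings are not stated in this MR.

```python
def edge_ngrams(token, min_gram=2, max_gram=40):
    """Return the leading substrings of `token` with lengths min_gram..max_gram.

    The default bounds are assumed values for illustration only.
    """
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def matches_prefix(indexed_terms, prefix):
    """Query-time prefix check against full terms (no n-grams stored)."""
    return any(term.startswith(prefix) for term in indexed_terms)


# Hypothetical tokens produced by analyzing some source code.
tokens = ["search", "searchable", "service"]

# Index-time expansion: every leading substring becomes its own indexed term.
ngram_index = {gram for token in tokens for gram in edge_ngrams(token)}

# Without the filter, only the full tokens are indexed.
plain_index = set(tokens)

print(len(plain_index), "terms without edge n-grams")
print(len(ngram_index), "terms with edge n-grams")
print("prefix 'sear' found via n-gram term lookup:", "sear" in ngram_index)
print("prefix 'sear' found via prefix search:", matches_prefix(plain_index, "sear"))
```

Both approaches find `sear`, but the n-gram index pays for it at index time by storing every leading substring, while the prefix search does the work at query time against the much smaller set of full terms.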
Index size
Mappings | Size, MB | % of current | Change |
---|---|---|---|
current | 524.77 | 100.00% | |
without edgeNGram_filter | 176.55 | 33.64% | -66.36% |
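The percentages in the table follow directly from the two measured sizes. A short Python check, using only the figures reported above (the variable names are illustrative):

```python
current_mb = 524.77        # index size with edgeNGram_filter
without_ngram_mb = 176.55  # index size without edgeNGram_filter

# Size of the new index relative to the current one, in percent.
remaining_pct = round(without_ngram_mb / current_mb * 100, 2)

# Relative change in index size, in percent (negative means smaller).
change_pct = round((without_ngram_mb - current_mb) / current_mb * 100, 2)

# Absolute storage saved, in MB.
saved_mb = round(current_mb - without_ngram_mb, 2)

print(f"{remaining_pct}% of current size, {change_pct}% change, {saved_mb} MB saved")
```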
Performance toolkit
Testing with the GitLab Performance Tool (GPT) shows a 7.3% performance improvement in p90 for the new mappings.
Before
* Environment: Localhost
* Environment Version: 13.1.0-pre `5e066ffdf76`
* Option: 60s_20rps
* Date: 2020-05-27
* Run Time: 1m 1.5s (Start: 07:02:03 UTC, End: 07:03:05 UTC)
* GPT Version: v1.3.1
NAME | RPS | RPS RESULT | TTFB AVG | TTFB P90 | REQ STATUS | RESULT
-------------------------------------|------|-------------------|----------|---------------------|----------------|----------------
api_v4_projects_project_search_blobs | 20/s | 18.67/s (>1.60/s) | 226.78ms | 233.48ms (<15000ms) | 100.00% (>95%) | Passed
After
* Environment: Localhost
* Environment Version: 13.1.0-pre `5e066ffdf76`
* Option: 60s_20rps
* Date: 2020-05-27
* Run Time: 1m 1.37s (Start: 07:06:20 UTC, End: 07:07:21 UTC)
* GPT Version: v1.3.1
NAME | RPS | RPS RESULT | TTFB AVG | TTFB P90 | REQ STATUS | RESULT
-------------------------------------|------|------------------|----------|---------------------|----------------|----------------
api_v4_projects_project_search_blobs | 20/s | 19.5/s (>1.60/s) | 157.64ms | 201.86ms (<15000ms) | 100.00% (>95%) | Passed
Screenshots
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
- Tested in all supported browsers
- Informed Infrastructure department of a default or new setting change, if applicable per definition of done
Security
If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team