Remove wildcard prefixes from Elasticsearch queries

John Mason requested to merge remove-wildcard-prefix-searches into master

What does this MR do and why?

Removes wildcard prefixes from Elasticsearch queries.

A wildcard prefix forces Elasticsearch to check every term in the index for a match instead of narrowing the search by the start of the term, which makes these searches very slow.

From Elastic's documentation on wildcard queries:

Avoid beginning patterns with * or ?. This can increase the iterations needed to find matching terms and slow search performance.
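
As a rough illustration of the idea (not the code in this MR), stripping the prefix could look like the Python sketch below. The helper names `strip_wildcard_prefix` and `build_wildcard_query` and the `path` field name are hypothetical; only the `wildcard` query shape with a `value` parameter comes from the Elasticsearch query DSL.

```python
def strip_wildcard_prefix(pattern: str) -> str:
    """Drop any leading '*' or '?' characters from a wildcard pattern.

    Wildcards that appear later in the pattern are left untouched.
    """
    return pattern.lstrip("*?")


def build_wildcard_query(field: str, pattern: str) -> dict:
    """Build an Elasticsearch wildcard clause with the prefix removed.

    `field` is a hypothetical field name chosen by the caller.
    """
    return {"wildcard": {field: {"value": strip_wildcard_prefix(pattern)}}}


if __name__ == "__main__":
    # A search such as path:*foo.rb would produce a clause without the leading '*'.
    print(build_wildcard_query("path", "*foo.rb"))
```

Only leading wildcards are removed in this sketch, so a pattern such as `foo*.rb` passes through unchanged.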

This came out of investigating our current error budget spend in https://gitlab.com/gitlab-org/search-team/team-tasks/-/issues/65#note_731539646, where we found that bursts of wildcard prefix queries are driving our latency spikes.

How to set up and validate locally

  1. Perform a search with a wildcard prefix, such as path:*foo.rb
  2. Examine the Elasticsearch query and the performance bar
  3. Verify that there is no wildcard prefix in the Elasticsearch wildcard query
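
The check in step 3 could also be scripted. The following is a minimal sketch, assuming the query body has been copied out as a Python dict; `leading_wildcard_patterns` is a hypothetical helper and is not part of GitLab or the Elasticsearch client.

```python
from typing import Any, Iterator


def leading_wildcard_patterns(node: Any) -> Iterator[str]:
    """Yield every wildcard pattern in a query body that starts with '*' or '?'.

    Handles both the long form {"wildcard": {field: {"value": pattern}}}
    and the short form {"wildcard": {field: pattern}}.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            if key == "wildcard" and isinstance(child, dict):
                for spec in child.values():
                    pattern = spec.get("value", "") if isinstance(spec, dict) else spec
                    if isinstance(pattern, str) and pattern[:1] in ("*", "?"):
                        yield pattern
            else:
                # Recurse into nested clauses such as bool/filter/must.
                yield from leading_wildcard_patterns(child)
    elif isinstance(node, list):
        for item in node:
            yield from leading_wildcard_patterns(item)


if __name__ == "__main__":
    # Hypothetical query body; the "path" field name is illustrative only.
    query = {"query": {"bool": {"filter": [{"wildcard": {"path": {"value": "foo.rb"}}}]}}}
    offenders = list(leading_wildcard_patterns(query))
    print("no wildcard prefix" if not offenders else f"wildcard prefix found: {offenders}")
```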

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
