
Use traversal_ids for advanced project search

What does this MR do and why?

This MR aims to reduce the duration of project searches made when a group is selected. Currently 0.834% of requests exceed the duration threshold of 2.452 seconds (source), and the target is to bring that below 0.05%.

These requests are slow because both the database and Elasticsearch queries take a long time. One cause is that for groups with a large number of projects, we query all project ids and pass them to Elasticsearch. As an example, we saw cases where a group has 30k projects.

To make these queries perform better, this MR switches from a project id inclusion query to a project id exclusion query based on traversal_ids: find all projects in the group namespace by ancestor id, then remove the projects that the user does not have access to. Issue search, for example, follows a similar approach.
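The difference between the two approaches can be sketched with plain arrays. This is only an illustration: the method names are hypothetical and the ids are made up to mirror the example queries further down, where it is assumed that the group contains projects 669, 670 and 671 and the user can only see 670.

```ruby
# Inclusion: every id the user may see is sent to Elasticsearch.
def inclusion_filter_ids(group_project_ids, accessible_project_ids)
  group_project_ids & accessible_project_ids
end

# Exclusion: the whole group is matched by ancestry, so only the ids
# the user may NOT see have to be sent.
def exclusion_filter_ids(group_project_ids, accessible_project_ids)
  group_project_ids - accessible_project_ids
end

# What remains visible once the excluded ids are removed from the group.
def visible_after_exclusion(group_project_ids, accessible_project_ids)
  group_project_ids - exclusion_filter_ids(group_project_ids, accessible_project_ids)
end
```

Both approaches should return the same projects. The exclusion list only pays off when it is shorter than the inclusion list, which is presumably the common case for users who can access most of a large group.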

How it works

First we check whether the feature flag is enabled and whether the user can access the selected group, by checking that it is part of user.authorized_groups. If either check fails, we apply the current filter.
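As a rough sketch of this guard, assuming the flag state and the user's authorized group ids have already been loaded (the method and parameter names are hypothetical, not the ones used in the MR):

```ruby
# Returns true when the traversal_ids filter can be used; otherwise the
# caller is expected to fall back to the current project id filter.
def traversal_ids_filter_applicable?(flag_enabled:, group_id:, authorized_group_ids:)
  flag_enabled && authorized_group_ids.include?(group_id)
end
```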

Next we add a prefix filter which finds all projects with traversal ids starting with the group's traversal ids. This finds all projects in the group and its subgroups.
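To illustrate the prefix match, here is a small sketch. It assumes that the indexed traversal_ids value is the namespace ancestry joined with "-" plus a trailing "-", which is what the "919-" value in the example query suggests. The helper names are hypothetical.

```ruby
# Assumed serialisation: [919, 1024] becomes "919-1024-".
def ancestry_string(traversal_ids)
  "#{traversal_ids.join('-')}-"
end

# A project belongs to the group or one of its subgroups when its
# ancestry string starts with the group's ancestry string.
def descendant_of_group?(project_traversal_ids, group_traversal_ids)
  ancestry_string(project_traversal_ids).start_with?(ancestry_string(group_traversal_ids))
end
```

Under this assumption the trailing "-" matters: without it, the prefix "919" would also match an unrelated namespace such as 9190.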

Finally we exclude projects the user should not have access to. The logic is based off issue, blob, etc. search except that because project search is not a feature, we remove the feature logic and keep the selection of projects here.
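Putting the prefix filter and the exclusion together, the relevant part of the query could be built roughly as follows. This is a sketch under assumptions: the method name and its arguments are hypothetical, and only the hash shape is taken from the "Query after" example in this description.

```ruby
# Builds the ancestry filter and the must_not clause as plain hashes.
# group_traversal_ids:      ancestry of the selected group, e.g. [919]
# inaccessible_project_ids: ids within the group the user may not see
def traversal_ids_project_clauses(group_traversal_ids, inaccessible_project_ids)
  prefix = "#{group_traversal_ids.join('-')}-"

  clauses = {
    filter: [
      {
        bool: {
          should: [
            {
              prefix: {
                traversal_ids: {
                  _name: 'project:ancestry_filter:descendants',
                  value: prefix
                }
              }
            }
          ]
        }
      }
    ]
  }

  # Only add must_not when there is something to exclude.
  unless inaccessible_project_ids.empty?
    clauses[:must_not] = { terms: { id: inaccessible_project_ids } }
  end

  clauses
end
```

Calling `traversal_ids_project_clauses([919], [669, 671])` produces the prefix value "919-" and a must_not on ids 669 and 671, matching the example query.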

NOTE: this only affects project-scoped group-level searches; everything else stays the same.

Query before

{
  "query": {
    "bool": {
      "must": [
        {
          "simple_query_string": {
            "_name": "project:match:search_terms",
            "fields": ["name^10", "name_with_namespace^2", "path_with_namespace", "path^9", "description"],
            "query": "*",
            "lenient": true,
            "default_operator": "and"
          }
        }
      ],
      "filter": [
        { "terms": { "_name": "doc:is_a:project", "type": ["project"] } },
        { "terms": { "_name": "project:archived:false", "archived": [false] } },
        {
          "bool": {
            "should": [
              { "terms": { "_name": "project:membership:id", "id": [670] } }
            ]
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "name": {},
      "name_with_namespace": {},
      "path_with_namespace": {},
      "path": {},
      "description": {}
    },
    "number_of_fragments": 0,
    "pre_tags": ["gitlabelasticsearch→"],
    "post_tags": ["←gitlabelasticsearch"]
  }
}

Query after

{
  "query": {
    "bool": {
      "must": [
        {
          "simple_query_string": {
            "_name": "project:match:search_terms",
            "fields": ["name^10", "name_with_namespace^2", "path_with_namespace", "path^9", "description"],
            "query": "*",
            "lenient": true,
            "default_operator": "and"
          }
        }
      ],
      "filter": [
        { "terms": { "_name": "doc:is_a:project", "type": ["project"] } },
        { "terms": { "_name": "project:archived:false", "archived": [false] } },
        {
          "bool": {
            "should": [
              {
                "prefix": {
                  "traversal_ids": {
                    "_name": "project:ancestry_filter:descendants",
                    "value": "919-"
                  }
                }
              }
            ]
          }
        }
      ],
      "must_not": { "terms": { "id": [669, 671] } }
    }
  },
  "highlight": {
    "fields": {
      "name": {},
      "name_with_namespace": {},
      "path_with_namespace": {},
      "path": {},
      "description": {}
    },
    "number_of_fragments": 0,
    "pre_tags": ["gitlabelasticsearch→"],
    "post_tags": ["←gitlabelasticsearch"]
  }
}

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. Ensure Elasticsearch is running
  2. Disable the feature flag: Feature.disable(:advanced_search_project_traversal_ids_query)
  3. Visit /search and search for projects from within a group
  4. Note that the query shown in the ES calls section of the performance bar contains a project:membership:id part
  5. Enable the feature flag: Feature.enable(:advanced_search_project_traversal_ids_query)
  6. Perform the same search and verify that the results are the same
  7. Note that the query shown in the ES calls section of the performance bar now contains a project:ancestry_filter:descendants part

Related to #438704 (closed)

Edited by Madelein van Niekerk
