Fix GLQL ES performance regression: optimize `project_ids` pre-query in Work Items ES finder (#592420)

## Problem to solve

GLQL views on large namespaces (e.g. `gitlab-org` with ~7,000 projects) take ~10 seconds to load when Advanced Search is enabled via `glql_es_integration`.

The regression was introduced in commit `3a84ec8d` (2026-02-27, "Remove FF `search_project_list_lookup`"), which changed the `project_ids` method in `ee/lib/ee/search/advanced_finders/work_items_finder.rb` from a lightweight path-based lookup to a heavy join-based query using `ProjectsFinder`:

```ruby
# Current (slow), introduced by 3a84ec8d
def project_ids
  projects = ::ProjectsFinder.new(current_user: current_user)
    .execute
    .for_group_and_its_subgroups(resource_parent)
    .with_route
    .include_topics
    .without_order

  projects.pluck_primary_key
end
```

This creates two compounding performance problems for large groups:

1. **Expensive PG pre-query**: `ProjectsFinder` with `for_group_and_its_subgroups` + `with_route` + `include_topics` joins is significantly slower than the previous `inside_path_preloaded` ILIKE-based lookup, especially for groups with thousands of projects.
2. **Large ES `terms` filter**: The resulting project IDs (e.g. 7,034 for `gitlab-org`) are sent as an ES `terms` filter on every GLQL query, making the ES query itself expensive.

The file already has a comment acknowledging this is known technical debt:

```ruby
# NOTE: project_ids should be removed when the traversal_ids
# optimization is implemented for confidentiality filters
# https://gitlab.com/gitlab-org/gitlab/-/issues/558781
```

**Mitigation applied:** `glql_es_integration` was disabled on GitLab.com on 2026-03-05, routing GLQL queries to PostgreSQL (~0.3s). This needs to be re-enabled once the fix is in place.

Postmortem: https://gitlab.com/gitlab-org/gitlab/-/work_items/592405#note_3134535593

## Proposal

Two complementary fixes, in order of urgency:

### Fix 1: Add caching to `project_ids` (short-term, days)

Add Rails cache with a short TTL (e.g. 5 minutes) to `project_ids`, keyed by group ID and user ID.
This amortizes the PG cost across multiple concurrent GLQL requests for the same group, which is the common case (many users viewing the same wiki page):

```ruby
def project_ids
  Rails.cache.fetch(['glql_es_project_ids', resource_parent.id, current_user.id], expires_in: 5.minutes) do
    ::ProjectsFinder.new(current_user: current_user)
      .execute
      .for_group_and_its_subgroups(resource_parent)
      .with_route
      .include_topics
      .without_order
      .pluck_primary_key
  end
end
```

### Fix 2: Implement `traversal_ids` optimization (long-term, per #558781)

The correct architectural fix is to eliminate the `project_ids` pre-query entirely by using ES `traversal_ids` range filtering to scope queries to a namespace subtree: a single efficient ES range filter instead of thousands of explicit project IDs.

This is already tracked in https://gitlab.com/gitlab-org/gitlab/-/issues/558781 (now closed for confidentiality filters). The work items ES finder should be updated to use the same `traversal_ids` approach, removing the `project_ids` method and its callers entirely once the index supports it.

## Further details

- **Regression commit:** `3a84ec8d`, "Remove FF `search_project_list_lookup`" (2026-02-27)
- **Affected file:** `ee/lib/ee/search/advanced_finders/work_items_finder.rb`, `project_ids` method (~line 215)
- **Related traversal_ids work:** https://gitlab.com/gitlab-org/gitlab/-/issues/558781
- **Incident:** https://gitlab.com/gitlab-org/gitlab/-/work_items/592405

## Links / references

- Incident (postmortem in comments): https://gitlab.com/gitlab-org/gitlab/-/work_items/592405
- `traversal_ids` optimization: https://gitlab.com/gitlab-org/gitlab/-/issues/558781
- Admin setting to control `glql_es_integration`: https://gitlab.com/gitlab-org/gitlab/-/issues/592415
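
To make the caching behaviour proposed in Fix 1 concrete, here is a minimal sketch that runs without Rails. `TinyTtlCache` and `FakeProjectLookup` are hypothetical stand-ins (for `Rails.cache` and the `ProjectsFinder` chain respectively), and the group and user IDs are made up. It is only meant to show when the expensive lookup would run under a key of `[prefix, group ID, user ID]` with a 5-minute TTL, not how GitLab implements it.

```ruby
# Hypothetical in-memory stand-in for Rails.cache.fetch (not GitLab code).
class TinyTtlCache
  def initialize(clock:)
    @clock = clock # callable returning the current time in seconds
    @store = {}
  end

  # Returns the cached value for `key` while it is still fresh;
  # otherwise runs the block, stores the result, and returns it.
  def fetch(key, expires_in:)
    entry = @store[key]
    return entry[:value] if entry && @clock.call < entry[:expires_at]

    value = yield
    @store[key] = { value: value, expires_at: @clock.call + expires_in }
    value
  end
end

# Hypothetical stand-in for the ProjectsFinder chain; counts its invocations.
class FakeProjectLookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def project_ids_for(_group_id, _user_id)
    @calls += 1
    [101, 102, 103] # fixed, made-up project IDs
  end
end

now = 0.0 # fake clock so the example needs no sleeping
cache = TinyTtlCache.new(clock: -> { now })
lookup = FakeProjectLookup.new

cached_project_ids = lambda do |group_id, user_id|
  cache.fetch(['glql_es_project_ids', group_id, user_id], expires_in: 300) do
    lookup.project_ids_for(group_id, user_id)
  end
end

cached_project_ids.call(42, 1) # miss: runs the lookup
cached_project_ids.call(42, 1) # hit: same group and same user
cached_project_ids.call(42, 2) # miss: different user, so a different key
now += 301
cached_project_ids.call(42, 1) # miss: the entry expired after 300 seconds

puts lookup.calls # prints 3
```

Because the key includes the user ID, entries in this sketch are reused only by repeated requests from the same user (for example, several GLQL blocks on one page). Requests from different users for the same group each run the lookup once per TTL window.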
