Zoekt: Cap max_file_match_results in standard search to reduce response size
Background
For standard (non-multi-match) search, Search::Zoekt::Params#max_file_match_results returns UNLIMITED (0). This means the coordinating node returns all matching files in its JSON response (up to ~5,000, bounded by the line match window), even though Rails only iterates through at most 10 pages × 20 results = 200 results before stopping (Cache::MAX_PAGES).
The 5,000 line match window (max_line_match_window) is intentional — it lets the coordinating node collect a large candidate set for relevance ranking. The problem is that the full ranked set is returned in the JSON response, even though Rails only needs the top N.
Note: Multi-match search also sends max_file_match_results = 5,000 (not per_page as originally assumed) — it has the same oversized payload problem.
Element count math
A file with one line match contributes ~52 JSON elements to the safe_parse counter (22 per file + 30 per line match); a file with m line matches contributes roughly 22 + 30m. The 100k element limit is hit at:
| Scenario | Files | Matches/file | Total elements |
|---|---|---|---|
| Many files, 1 match | ~1,900 | 1 | ~100k |
| Moderate files, 5 matches | ~580 | 5 | ~100k |
| Fewer files, 10 matches | ~310 | 10 | ~100k |
This is more aggressive than originally estimated (~20 elements per result was too low).
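The table rows can be re-derived with a small sketch. The weights (22 per file, 30 per line match) and the 100k limit come from the estimate above; the constant and method names are hypothetical and exist only for this illustration.

```ruby
# Approximate weights from the estimate above; not values read from the codebase.
ELEMENTS_PER_FILE = 22
ELEMENTS_PER_LINE_MATCH = 30
ELEMENT_LIMIT = 100_000

# Hypothetical helper: approximate JSON element count for a response.
def total_elements(files, matches_per_file)
  files * (ELEMENTS_PER_FILE + ELEMENTS_PER_LINE_MATCH * matches_per_file)
end

# Hypothetical helper: largest file count that stays within the limit.
def max_files_within_limit(matches_per_file)
  ELEMENT_LIMIT / (ELEMENTS_PER_FILE + ELEMENTS_PER_LINE_MATCH * matches_per_file)
end

[1, 5, 10].each do |matches|
  files = max_files_within_limit(matches)
  puts "#{matches} match(es)/file: #{files} files, #{total_elements(files, matches)} elements"
end
```

Under these same weights, a response trimmed to 200 files would stay below the limit even at 10 matches per file (200 × 322 = 64,400 elements).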
Proposal
Cap max_file_match_results to max(Cache::MAX_PAGES, current_page) * per_page behind a feature flag. This mirrors the existing Cache#page_limit logic and ensures Rails always requests exactly as many files as it will actually consume:
- Page 1: max(10, 1) * 20 = 200 files
- Page 5: max(10, 5) * 20 = 200 files (cache covers pages 1-10)
- Page 15: max(10, 15) * 20 = 300 files
- Page 50: max(10, 50) * 20 = 1,000 files
The coordinating node still collects and ranks across the full 5,000 window — the cap only affects the post-sort trim of the returned JSON. Deep pagination is preserved.
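A minimal sketch of the cap formula, assuming per_page defaults to 20 and that MAX_PAGES matches Cache::MAX_PAGES (10). The method name capped_file_match_results is hypothetical, not the actual Params method.

```ruby
MAX_PAGES = 10 # assumed to mirror Cache::MAX_PAGES

# Hypothetical helper: number of files to request for a given page.
# Never drops below MAX_PAGES * per_page, so the cached pages 1-10 are always covered.
def capped_file_match_results(page:, per_page: 20)
  [MAX_PAGES, page].max * per_page
end

[1, 5, 15, 50].each do |page|
  puts "page #{page}: #{capped_file_match_results(page: page)} files"
end
```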
Implementation notes
Params doesn't currently have access to page or per_page — these need to be threaded through from SearchResults#zoekt_search → Client.search → Params. The page_limit value from Cache (which already computes [current_page, MAX_PAGES].max) is a natural fit.
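One way the threading could look, sketched with stand-in classes. Everything inside the ZoektSketch module is hypothetical: the real SearchResults, Client, Params and Cache classes have different constructors and signatures, and the payload hash is invented for illustration.

```ruby
# Illustrative stand-ins only; not the real classes under ee/lib.
module ZoektSketch
  MAX_PAGES = 10

  # Mirrors the page_limit logic described for Cache.
  class Cache
    def initialize(current_page:)
      @current_page = current_page
    end

    def page_limit
      [@current_page, MAX_PAGES].max
    end
  end

  # Params receives paging info from its caller instead of computing it.
  class Params
    def initialize(page_limit:, per_page:)
      @page_limit = page_limit
      @per_page = per_page
    end

    def max_file_match_results
      @page_limit * @per_page
    end
  end

  # Client forwards paging info to Params and builds a made-up payload.
  class Client
    def self.search(query, page_limit:, per_page:)
      params = Params.new(page_limit: page_limit, per_page: per_page)
      { query: query, max_file_match_results: params.max_file_match_results }
    end
  end

  # Stand-in for SearchResults#zoekt_search, the entry point that knows the page.
  def self.zoekt_search(query, page:, per_page: 20)
    page_limit = Cache.new(current_page: page).page_limit
    Client.search(query, page_limit: page_limit, per_page: per_page)
  end
end

p ZoektSketch.zoekt_search("foo", page: 15)
```

Passing the already-computed page_limit rather than the raw page keeps the max(...) logic in one place.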
Key files
- ee/lib/search/zoekt/params.rb: max_file_match_results computation
- ee/lib/search/zoekt/search_results.rb: zoekt_search (needs to pass page info)
- ee/lib/gitlab/search/zoekt/client.rb: search (needs to forward page info to Params)
- ee/lib/search/zoekt/cache.rb: page_limit (existing logic to reuse)
Rollout
This change should be rolled out behind a feature flag (e.g., zoekt_cap_file_match_results) to allow gradual rollout and quick rollback.
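A sketch of how the flag could gate the cap, falling back to the current UNLIMITED (0) behaviour when disabled. FlagStub is a hypothetical stand-in so the example runs on its own; it is not GitLab's feature flag API, and the flag name is the example name suggested above.

```ruby
require 'set'

UNLIMITED = 0 # current behaviour for standard search, as described in Background

# Hypothetical in-memory flag store used only to make this example runnable.
class FlagStub
  def initialize
    @enabled = Set.new
  end

  def enable(name)
    @enabled << name
  end

  def disable(name)
    @enabled.delete(name)
  end

  def enabled?(name)
    @enabled.include?(name)
  end
end

# Hypothetical helper: capped value when the flag is on, UNLIMITED otherwise.
def max_file_match_results(flags:, page_limit:, per_page:)
  return UNLIMITED unless flags.enabled?(:zoekt_cap_file_match_results)

  page_limit * per_page
end

flags = FlagStub.new
puts max_file_match_results(flags: flags, page_limit: 10, per_page: 20)
flags.enable(:zoekt_cap_file_match_results)
puts max_file_match_results(flags: flags, page_limit: 10, per_page: 20)
```

Disabling the flag restores the uncapped request without a deploy, which is the quick rollback path.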
Note: This change requires #591911 (fix count fields) to land first; otherwise displayed counts would drop from "5,000+" to the cap value.