Gitaly N+1 in /api/:version/projects/:id/(-/)search?scope=blob endpoint

The /api/:version/projects/:id/(-/)search endpoint appears to iterate through each blob in a repository when scope=blob is used in the search.

This leads to mechanical sympathy alerts such as:

[image: alert screenshot]

There appears to be some evidence in gitlab-com/gl-infra/scalability#64 (comment 261727093) that this is causing latency spikes across the fleet.

search logs


Cause

  1. A user runs a blob search with a path filter (e.g. foo filename:bar), and there are thousands of potential matches.
  2. Those thousands of results are returned and each one is wrapped as a FoundBlob.
  3. Since filename:bar is present, the Ruby-side filtering is initiated.
  4. The Ruby-side filtering currently compares the filter term bar with FoundBlob#binary_path, which is nil but can be lazy loaded.
  5. Each FoundBlob therefore loads its blob data from Gitaly, causing N+1 queries.
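
The steps above can be sketched in plain Ruby. This is a simplified model, not the actual GitLab code: FakeGitaly, LazyFoundBlob, fetch_blob and filter_by_filename are hypothetical names made up for illustration, and the real FoundBlob and Gitaly client differ. It only shows how filtering on a lazily loaded attribute turns one search into one backend call per result.

```ruby
# Hypothetical stand-in for a Gitaly client; it counts every call it receives.
class FakeGitaly
  attr_reader :calls

  def initialize(blobs)
    @blobs = blobs # { id => { path:, data: } }
    @calls = 0
  end

  def fetch_blob(id)
    @calls += 1
    @blobs.fetch(id)
  end
end

# Simplified model of a search result whose path starts out nil
# and is lazily loaded from the backend on first access.
class LazyFoundBlob
  def initialize(id, gitaly)
    @id = id
    @gitaly = gitaly
    @binary_path = nil
  end

  def binary_path
    @binary_path ||= @gitaly.fetch_blob(@id)[:path]
  end
end

# Ruby-side filename filter: touching binary_path on every result
# triggers one backend call per result (the N+1 pattern).
def filter_by_filename(blobs, filename)
  blobs.select { |blob| blob.binary_path.include?(filename) }
end

store = (1..1000).map { |i| [i, { path: "dir/file_#{i}.rb", data: "foo" }] }.to_h
gitaly = FakeGitaly.new(store)
results = store.keys.map { |id| LazyFoundBlob.new(id, gitaly) }

matches = filter_by_filename(results, "file_42.rb")
puts "matches: #{matches.size}, backend calls: #{gitaly.calls}"
```

In this model, a single match out of 1000 candidates still costs 1000 backend calls, because the filter has to load every candidate before it can compare paths.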
Edited by Mark Chao