
Add limit and offset to SearchFilesByName RPC

For #4449 (closed)

As part of the effort to remove the Ruby sidecar from Gitaly, I'm porting all wiki-related RPCs to normal repository RPCs. SearchFilesByName is the key RPC for the whole migration. Whenever Rails needs to access a wiki file or fetch multiple wiki pages, it issues one SearchFilesByName RPC with the corresponding filter. During discovery, I noticed that the Rails side paginates the list of pages in the most common operations. This means the SearchFilesByName RPC returns all matching files in the repository, and Rails then trims the result down to the requested page. There are two problems with this:

  • This is an attack vector. A user can easily take down a Gitaly node with an enormous repository containing hundreds of thousands of files.
  • Most of the returned entries are redundant. Gitaly spends computing resources composing the payload, and the Rails side spends time unpacking it only to discard most of it. There is room for improvement here.

For the reasons above, I think adding limit and offset to the SearchFilesByName RPC makes sense. When picking the pagination approach, I noticed that we have both limit/offset and page-token systems. In this case, the number of files doesn't change as often as other resources such as commits or refs. In addition, the RPC depends on the order of git-ls-tree output. Therefore, I decided to go with limit/offset instead of page tokens. I hope this decision makes sense.

To keep backward compatibility, I don't enforce a default limit/offset on the Gitaly side; the Rails side should control these parameters. In fact, the Wiki APIs currently return all pages. We are tracking the addition of pagination to those APIs in gitlab#371859. Afterward, we can set default values.

Edited by Quang-Minh Nguyen
