Zoekt tuning max file size and trigrams

Problem to solve

context (internal): https://gitlab.com/gitlab-com/enablement-sub-department/section-enable-request-for-help/-/issues/64#note_1843615608

Zoekt skips files larger than 1 MiB entirely, whereas Advanced Search indexes the first 1 MiB of a file and skips the rest.

Zoekt also has a per-file trigram limit (default 20_000): a file that exceeds it is not indexed, even if it is under the file size limit.

Proposal

Zoekt offers three options to tune this behavior: SizeMax, TrigramMax and LargeFiles.
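
To make the interaction between the three options concrete, here is a minimal Go sketch of the skip decision. It is not Zoekt's code: the indexLimits struct, the shouldIndex, countTrigrams and isLargeFileAllowed helpers, the byte-based trigram counting, and the use of filepath.Match for LargeFiles patterns are all assumptions made for illustration. It also assumes that a file matching a LargeFiles pattern bypasses both limits, which should be confirmed against the Zoekt version in use.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// indexLimits mirrors the three option names from the proposal.
// The struct itself is hypothetical and is not Zoekt's API.
type indexLimits struct {
	SizeMax    int      // maximum file size in bytes
	TrigramMax int      // maximum number of distinct trigrams per file
	LargeFiles []string // glob patterns assumed to be exempt from both limits
}

// countTrigrams returns the number of distinct 3-byte sequences in content.
// Counting bytes rather than runes is a simplification for this sketch.
func countTrigrams(content []byte) int {
	seen := make(map[[3]byte]struct{})
	for i := 0; i+3 <= len(content); i++ {
		seen[[3]byte{content[i], content[i+1], content[i+2]}] = struct{}{}
	}
	return len(seen)
}

// isLargeFileAllowed reports whether name matches any exemption pattern.
// Invalid patterns are ignored.
func isLargeFileAllowed(name string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := filepath.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// shouldIndex reports whether a file would be indexed under the given
// limits, together with a short reason for the decision.
func shouldIndex(name string, content []byte, l indexLimits) (bool, string) {
	if isLargeFileAllowed(name, l.LargeFiles) {
		return true, "matched LargeFiles pattern"
	}
	if len(content) > l.SizeMax {
		return false, "exceeds SizeMax"
	}
	if countTrigrams(content) > l.TrigramMax {
		return false, "exceeds TrigramMax"
	}
	return true, "within limits"
}

func main() {
	limits := indexLimits{
		SizeMax:    1 << 20, // 1 MiB, the size limit described in this issue
		TrigramMax: 20000,   // the default trigram limit described in this issue
		LargeFiles: []string{"*.lock"},
	}
	oversized := make([]byte, (1<<20)+1)

	for _, name := range []string{"main.go", "dump.txt", "yarn.lock"} {
		content := []byte("package main")
		if name != "main.go" {
			content = oversized
		}
		ok, reason := shouldIndex(name, content, limits)
		fmt.Printf("%s: indexed=%v (%s)\n", name, ok, reason)
	}
}
```

The sketch shows why raising SizeMax alone may not be enough: a file that fits under the size limit can still be skipped by the trigram check, so the two values likely need to be tuned together, with LargeFiles as a per-pattern escape hatch.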

We should investigate the disk usage and memory implications of changing these values. The options could also be exposed in the Admin UI so that self-managed (SM) customers can tune them.
