Zoekt tuning max file size and trigrams
Problem to solve
context (internal): https://gitlab.com/gitlab-com/enablement-sub-department/section-enable-request-for-help/-/issues/64#note_1843615608
Zoekt skips files larger than 1 MiB entirely, whereas Advanced Search indexes the first 1 MiB of a file and skips the rest.
Zoekt also has a trigram limit (default 20_000). A file that exceeds this limit is not indexed, even if it is under the file size limit.
Proposal
Zoekt offers options to tune this behavior: SizeMax, TrigramMax, and LargeFiles.
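As a rough sketch of how these three options might interact, the Go snippet below models the skip decision. The `indexOptions` struct and `skipReason` function are hypothetical and only mirror the option names above. It assumes that a file matching a LargeFiles pattern bypasses both limits, and it uses the standard library's `path.Match` as a simplification of whatever glob matching Zoekt applies. The 1 MiB and 20_000 values are the ones quoted in this issue.

```go
package main

import (
	"fmt"
	"path"
)

// indexOptions is a hypothetical struct that mirrors the option names
// mentioned in this issue. It is not Zoekt's real options type.
type indexOptions struct {
	SizeMax    int      // maximum file size in bytes
	TrigramMax int      // maximum number of distinct trigrams per file
	LargeFiles []string // glob patterns for files exempt from the limits
}

// isLargeFile reports whether name matches any LargeFiles pattern.
// path.Match is a simplification used here for illustration.
func (o indexOptions) isLargeFile(name string) bool {
	for _, pattern := range o.LargeFiles {
		if ok, err := path.Match(pattern, name); err == nil && ok {
			return true
		}
	}
	return false
}

// skipReason returns why a file would be skipped, or "" if it would be indexed.
func skipReason(o indexOptions, name string, size, trigrams int) string {
	if o.isLargeFile(name) {
		return "" // assumed: an allow-listed file bypasses both limits
	}
	if size > o.SizeMax {
		return "exceeds SizeMax"
	}
	if trigrams > o.TrigramMax {
		return "exceeds TrigramMax"
	}
	return ""
}

func main() {
	opts := indexOptions{SizeMax: 1 << 20, TrigramMax: 20000, LargeFiles: []string{"*.json"}}
	fmt.Printf("%q\n", skipReason(opts, "main.go", 2<<20, 10))
	fmt.Printf("%q\n", skipReason(opts, "data.csv", 1000, 25000))
	fmt.Printf("%q\n", skipReason(opts, "big.json", 2<<20, 50000))
}
```

A model like this makes the trade-off explicit: raising SizeMax or TrigramMax affects every file, whereas LargeFiles targets specific paths, which may keep index growth more contained.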
We should investigate the disk usage and memory implications of changing any of these options. They could also be exposed in the Admin UI for self-managed (SM) customers to tune.