Enhance search capability for Chinese in the Wiki
What does this MR do and why?
Related Issue: #511060 (comment 2286182888)
JiHu Issue: #4537
What
Update the analyzer configuration on Wiki content for advanced search:
- Add an analyzer named text_analyzer.
- Change the analyzer for Wiki content from code_analyzer to text_analyzer, because Wiki content is natural language, not code.
Why
Advanced search currently does not support Chinese in the Wiki, even when using third-party plugins such as smartcn. (I believe there is some kind of bug here; this MR does not fix it.)
Wiki content uses code_analyzer (tokenizer=whitespace) for word segmentation. This strategy cannot segment Chinese effectively, because Chinese text does not separate words with spaces.
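To illustrate the problem, here is a minimal Python sketch of whitespace-based tokenization. It is not the Elasticsearch implementation; `whitespace_tokenize` and the sample strings are hypothetical and only approximate what a whitespace tokenizer does.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Split on runs of whitespace (a rough stand-in for a whitespace tokenizer)."""
    return text.split()


english = "search the wiki content"
chinese = "搜索维基内容"  # roughly "search wiki content", written without spaces

english_tokens = whitespace_tokenize(english)
chinese_tokens = whitespace_tokenize(chinese)

print(english_tokens)  # ['search', 'the', 'wiki', 'content']
print(chinese_tokens)  # ['搜索维基内容']

# A term-level lookup for "维基" (wiki) fails, because the only token
# produced for the Chinese sentence is the whole sentence itself.
print("wiki" in english_tokens)  # True
print("维基" in chinese_tokens)  # False
```

The English sentence yields one token per word, while the Chinese sentence stays as a single token, so a query for a word inside it has nothing to match against.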
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
| Before (code_analyzer) | After (text_analyzer) |
|---|---|
| (screenshot not available) | (screenshot not available) |
How to set up and validate locally
See #511060 (closed).

