RAG Validation for Global Search

Overview

Global Search supports several search and retrieval methods across different domains. It is currently unclear which retrieval method, paired with which foundation model, performs best for each feature.

There are several potential areas for validation:

end-to-end evaluation, in which performance for each feature is baselined and compared across each retrieval method and each foundation model
granular assessment of retrieval methodologies
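
As a rough illustration of the two areas above, the sketch below runs every feature / retrieval method / model combination over a set of test cases and reports each combination's mean score relative to a chosen baseline. Everything here is hypothetical: `evaluate_grid`, `compare_to_baseline`, `run_pipeline`, and `score` are illustrative names, not existing Global Search APIs, and the test-case format (a query paired with an expected result) is an assumption. `recall_at_k` is included as one possible scorer for the granular retrieval assessment, assuming the pipeline returns a ranked list of document IDs.

```python
from itertools import product
from statistics import mean


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def evaluate_grid(features, retrievers, models, cases, run_pipeline, score):
    """Mean score for every (feature, retriever, model) combination.

    cases:        dict mapping feature -> list of (query, expected) pairs (assumed format)
    run_pipeline: hypothetical callable (feature, retriever, model, query) -> output
    score:        hypothetical callable (output, expected) -> float
    """
    results = {}
    for feature, retriever, model in product(features, retrievers, models):
        scores = [
            score(run_pipeline(feature, retriever, model, query), expected)
            for query, expected in cases.get(feature, [])
        ]
        if scores:  # skip features that have no test cases
            results[(feature, retriever, model)] = mean(scores)
    return results


def compare_to_baseline(results, baseline):
    """Score difference of each combination versus the baseline for the same feature.

    baseline: (retriever, model) pair treated as the reference configuration.
    """
    deltas = {}
    for (feature, retriever, model), value in results.items():
        base_key = (feature, *baseline)
        if base_key in results:
            deltas[(feature, retriever, model)] = value - results[base_key]
    return deltas
```

Keeping the scorer pluggable means the same loop could serve both areas: an answer-quality scorer for the end-to-end comparison, or a retrieval metric such as `recall_at_k` for the granular one.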

Areas for Validation

retrieval implementations: BM25, Zoekt, semantic embedding, Ctags, hybrid search

chunking strategies
extent to which the dimensionality of vector embeddings affects retrieval results (quality versus cost and latency)
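
For the hybrid search item, one common way to combine a keyword ranking (for example from BM25) with a semantic-embedding ranking is reciprocal rank fusion (RRF). This note does not say which fusion method Global Search uses, so the sketch below is only an assumption about what a hybrid candidate under validation might look like. The function name and the document IDs are hypothetical, and `k=60` is the constant conventionally used with RRF rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Ties are broken by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a keyword retriever and a semantic retriever.
keyword_ranking = ["a", "b", "c"]
semantic_ranking = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
```

Because RRF uses only ranks and never raw scores, it avoids having to normalize BM25 scores against embedding similarities, which makes it a convenient baseline when comparing hybrid search with the individual retrieval implementations.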

References

https://docs.gitlab.com/ee/user/search/

https://docs.gitlab.com/ee/user/search/advanced_search.html

https://about.gitlab.com/direction/global-search/

Edited Jun 11, 2024 by Susie Bitters