DRAFT - UX for RAG Configurations
This issue outlines UX considerations for implementing RAG-based customization across Duo features, self-hosted or otherwise.
Background Vision
Human in the Loop Processes - Building Trust in AI Features
Trust in AI should not be misconstrued as a belief that AI will always produce correct results. Instead, it should be understood as confidence that the AI's outcomes will be reliable and its decision-making process transparent and well-understood. [1] This kind of trust requires that end users have transparency into the conditions under which an AI feature operates, as well as the ability to define those conditions to optimize reliability. This transparency allows results to be interpreted, verified, and ultimately used by end users, leading to increased adoption of AI features. [2]
We envision enabling transparent and configurable RAG processes to underpin customization of GitLab features. As GitLab advances RAG context injection for Duo features, the Custom Models team will provide the following to Enterprise customers on all surfaces:
- Customers (both instance administrators and end users) will have transparency into, and control over, what content is injected.
  - Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the DaVinci project).
  - Instance administrators will be able to allow or ban specific sources of context for end users, as well as index/embed and configure non-default sources for their GitLab Duo features.
  - End users will be able to further adjust the context allowed by their instance administrators within their UI or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.
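The layering described above (GitLab defaults, then administrator allow/ban decisions, then end-user toggles) can be sketched as simple set operations. This is a minimal illustration only: the `ContextPolicy` class, its field names, and the source names in the usage note are hypothetical and do not reflect an existing GitLab API.

```python
from dataclasses import dataclass, field


@dataclass
class ContextPolicy:
    """Hypothetical model of the layered context controls (not a GitLab API)."""

    default_sources: set[str]                              # sources available by default
    admin_added: set[str] = field(default_factory=set)     # non-default sources indexed by the administrator
    admin_banned: set[str] = field(default_factory=set)    # sources banned by the administrator
    user_disabled: set[str] = field(default_factory=set)   # sources the end user has switched off

    def allowed_by_admin(self) -> set[str]:
        # Administrators start from the defaults, add their own sources, then apply bans.
        return (self.default_sources | self.admin_added) - self.admin_banned

    def effective_for_user(self) -> set[str]:
        # End users can only narrow what the administrator allows, never widen it.
        return self.allowed_by_admin() - self.user_disabled
```

For example, with hypothetical defaults `{"open_tabs", "repository_files"}`, an administrator who bans `"open_tabs"` and adds `"internal_docs"`, and a user who disables `"internal_docs"`, the effective set for that user is `{"repository_files"}`. A user attempting to re-enable a banned source has no effect, because user settings are only ever subtracted.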
UX Requirements
Setting Up Customer Data for RAG
There are two iterative UX approaches that we could take to supply external customer data for RAG prompt injection.
Iteration One - Bring Your Own Index/Vector Store
- customers bring their documentation already indexed with BM25, OR
- customers bring their documentation already embedded in a vector store (for semantic search)
  - administrators would then also need to declare their self-hosted embedding model and its dimension size
  - administrators would then also need to configure their self-hosted embedding model to enable semantic search
- administrators assign their documentation a unique and descriptive name to enable RAG configurations
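As a rough sketch of what the Iteration One declaration might need to capture, the example below models a bring-your-own source and checks the requirements listed above: a unique name, an index kind, and (for vector stores) a declared embedding model and dimension size. The `ExternalSource` type, its field names, and the `"bm25"`/`"vector"` labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExternalSource:
    """Hypothetical declaration of a customer-provided index or vector store."""

    name: str                              # unique, descriptive name used in RAG configurations
    kind: str                              # "bm25" or "vector" (assumed labels)
    embedding_model: Optional[str] = None  # required for vector stores
    embedding_dim: Optional[int] = None    # required for vector stores


def validate_source(source: ExternalSource, existing_names: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the declaration is valid."""
    errors = []
    if not source.name.strip():
        errors.append("name must not be empty")
    elif source.name in existing_names:
        errors.append(f"name '{source.name}' is already in use")
    if source.kind not in ("bm25", "vector"):
        errors.append("kind must be 'bm25' or 'vector'")
    if source.kind == "vector":
        # Semantic search needs the query to be embedded with the same model and dimension.
        if not source.embedding_model:
            errors.append("vector stores must declare an embedding model")
        if not source.embedding_dim or source.embedding_dim <= 0:
            errors.append("vector stores must declare a positive dimension size")
    return errors
```

Returning all problems at once, rather than failing on the first, would let a settings form show every missing field in a single pass.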
Iteration Two - Index Within GL UI Using Self-Hosted Embedding Model
- administrators bring their documentation and declare it to GitLab
- administrators choose one of:
  - BM25, OR
  - Zoekt, OR
  - their self-hosted embedding model (for self-managed), configured for embedding, OR
  - the default GitLab embedding model
- administrators configure parameters within the GitLab UI:
  - chunking strategy
  - vector size
- administrators create their documentation index/vector store within the GitLab UI
- administrators assign their documentation index/vector store a unique and descriptive name to enable RAG configurations
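To make the "chunking strategy" parameter concrete, here is a minimal sketch of one possible strategy: fixed-size word windows with overlap. The function name and the `chunk_size` and `overlap` parameters are hypothetical, and this issue does not prescribe any particular strategy. Vector size is not shown, since it would typically follow from the chosen embedding model.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of up to `chunk_size` words, each sharing `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a window has reached the end, to avoid emitting a trailing fragment
        # that is already fully contained in the previous chunk.
        if start + chunk_size >= len(words):
            break
    return chunks
```

For instance, `chunk_words("a b c d e f g", chunk_size=3, overlap=1)` yields `["a b c", "c d e", "e f g"]`. Exposing `chunk_size` and `overlap` in the UI would be one way to let administrators trade retrieval precision against context continuity.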
Retrieving Data for RAG
- administrators can configure preferred RAG retrieval strategies for their own data, based on their indexing/vector strategy, with parameters such as:
  - top k cutoff
  - reranking
  - summarization
  - adjacent chunks
- Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the DaVinci project).
- Instance administrators will be able to allow or ban specific sources of context for end users, as well as index and configure non-default sources for their GitLab Duo features.
- End users will be able to further adjust the context allowed by their instance administrators within their UI or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.
- End users will also be able to leverage hotkey configurations for user-in-the-loop context retrieval (ref). Users can:
  - opt for automated context selection, OR
  - configure personalized default context selection, OR
  - manually trigger specific repos (human-in-the-loop context retrieval) using hotkey triggers (such as @context.xyz)
- Instance administrators have the ability to evaluate RAG strategies within the GL UI
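Two of the retrieval parameters listed above, the top k cutoff and adjacent chunks, can be illustrated with a short sketch. It assumes that chunks are identified by their position in the source document and that some retriever (BM25 or vector search) has already produced a relevance score per chunk. The function and parameter names are hypothetical, and reranking and summarization are out of scope here.

```python
def select_chunks(scores: dict[int, float], top_k: int, adjacent: int, num_chunks: int) -> list[int]:
    """Pick the `top_k` best-scoring chunk positions, then widen each hit by `adjacent` neighbours.

    `scores` maps chunk position -> relevance score; `num_chunks` is the document length in chunks.
    Returns chunk positions in document order, without duplicates.
    """
    # Top-k cutoff: keep only the k highest-scoring chunk positions.
    ranked = sorted(scores, key=lambda pos: scores[pos], reverse=True)[:top_k]

    selected: set[int] = set()
    for pos in ranked:
        # Adjacent chunks: include neighbours on both sides, clamped to the document bounds.
        first = max(0, pos - adjacent)
        last = min(num_chunks - 1, pos + adjacent)
        selected.update(range(first, last + 1))
    return sorted(selected)
```

With `scores = {0: 0.1, 3: 0.9, 7: 0.5}`, `top_k=2`, `adjacent=1`, and `num_chunks=8`, the result is `[2, 3, 4, 6, 7]`: chunks 3 and 7 survive the cutoff and each pulls in its in-bounds neighbours. Returning positions in document order keeps the injected context readable, which could also make side-by-side evaluation of strategies easier.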