DRAFT - UX Experience for RAG Configurations

This issue outlines UX considerations for implementing RAG-based approaches to customization across Duo features, self-hosted or otherwise.

Background Vision

Human in the Loop Processes - Building Trust in AI Features

Trust in AI should not be misconstrued as a belief that AI will always produce correct results. Instead, it should be understood as confidence that the AI's outcomes will be reliable and its decision-making process transparent and well-understood. 1 This kind of trust involves transparency for an end user into the conditions under which an AI feature is operating, as well as the ability to define those conditions to optimize reliability. This transparency allows results to be interpreted, verified, and ultimately utilized by end users - leading to increased adoption of AI features. 2

We envision enabling transparent and configurable RAG processes to underpin customization of GitLab features. As GitLab advances RAG context injection for Duo features, the Custom Models team will work, across all surfaces, to:

give Enterprise customers (both instance administrators and end users) transparency into, and control over, what content is injected
- Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the DaVinci project).
- Instance administrators will be able to allow/ban specific sources of context for end users, as well as index/embed and configure non-default sources for their GitLab Duo features.
- End users will be able to further adjust the context allowed by their instance administrators within their UI or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.

UX Requirements

Setting Up Customer Data for RAG

There are two iterative UX approaches that we could take to supply external customer data for RAG prompt injection. Customers could either:

Iteration One - Bring Your Own Index/Vector Store

bring their documentation already indexed with BM25, OR
bring their documentation already embedded in a vector store (for semantic search)
- administrators would then also need to declare their self-hosted embedding model and dimension size
- they would also then need to configure their self-hosted embedding model to enable semantic search
administrators assign their documentation a unique and descriptive name to enable RAG configurations
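
As a rough illustration of what an Iteration One declaration might need to capture, the sketch below models a pre-built index as a small record and checks the requirements listed above: a unique name, and an embedding model plus dimension size when the source is a vector store. The class name, field names, and the "bm25"/"vector" labels are hypothetical and not an existing GitLab API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExternalSourceDeclaration:
    """Hypothetical record an administrator submits for a pre-built index."""
    name: str                              # unique, descriptive name
    kind: str                              # "bm25" or "vector" (assumed labels)
    embedding_model: Optional[str] = None  # required when kind == "vector"
    dimensions: Optional[int] = None       # required when kind == "vector"


def validate(decl, existing_names):
    """Return a list of problems; an empty list means the declaration is valid."""
    problems = []
    if not decl.name.strip():
        problems.append("name must not be empty")
    elif decl.name in existing_names:
        problems.append(f"name '{decl.name}' is already in use")
    if decl.kind not in ("bm25", "vector"):
        problems.append(f"unknown kind '{decl.kind}'")
    elif decl.kind == "vector":
        # Semantic search only works if the query is embedded with the same
        # model and dimension size that produced the stored vectors.
        if not decl.embedding_model:
            problems.append("vector stores must declare an embedding model")
        if not decl.dimensions or decl.dimensions <= 0:
            problems.append("vector stores must declare a positive dimension size")
    return problems
```

A BM25 source passes with only a name, while a vector store declared without a model and dimension size is rejected with two problems, which is the kind of feedback the setup form could surface inline.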

Iteration Two - Index Within GL UI Using Self-Hosted Embedding Model

administrators bring their documentation and declare it to GitLab
administrators choose:
- BM25 OR
- Zoekt OR
- configure their self-hosted embedding model (for self-managed) for embedding OR
- use default GitLab embedding model
administrators configure parameters within GitLab UI
- chunking strategy
- vector size
administrators create their documentation index/vector store within GitLab UI
administrators assign their documentation index/vector store a unique and descriptive name to enable RAG configurations
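
To make the "chunking strategy" parameter more concrete, here is a minimal sketch of one possible strategy: fixed-size character windows with overlap. The strategy itself, the function name, and the choice of characters as the unit are assumptions for illustration. The issue does not specify which chunking strategies the UI would offer.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size character windows that overlap.

    chunk_size and chunk_overlap are hypothetical UI parameters, measured
    in characters here purely for simplicity.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, so no trailing chunk is
        # emitted that only repeats the overlap region.
        if start + chunk_size >= len(text):
            break
    return chunks
```

For example, `chunk_text("abcdefghij", 4, 2)` yields four windows that each share two characters with their neighbor. A larger overlap preserves more context across chunk boundaries at the cost of a larger index, which is the trade-off an administrator would be tuning.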

Retrieving Data for RAG

administrators can configure preferred retrieval strategies for their own data
administrators can tune these retrieval strategies to match their indexing/vector strategy, using parameters such as:
- top k cutoff
- reranking
- summarization
- adjacent chunks
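
Two of these parameters, the top k cutoff and adjacent chunks, can be sketched as a small post-retrieval step. The function and parameter names below are hypothetical. The sketch assumes the retriever has already scored chunks and that chunks are identified by their position in the source document.

```python
def select_chunks(scored, top_k, adjacent, total_chunks):
    """Pick chunk indices to inject into the prompt.

    scored:       list of (chunk_index, relevance_score) pairs from a retriever
    top_k:        keep only the top_k highest-scoring chunks
    adjacent:     also include this many neighbors on each side of a hit
    total_chunks: number of chunks in the document, used to clamp neighbors
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
    selected = set()
    for index, _score in ranked:
        for neighbor in range(index - adjacent, index + adjacent + 1):
            if 0 <= neighbor < total_chunks:
                selected.add(neighbor)
    # Return in document order so injected context reads naturally.
    return sorted(selected)
```

With `adjacent=0` this is a plain top k cutoff. Raising it widens each hit into its surrounding passage, which can help when a relevant answer spans a chunk boundary. Reranking and summarization would be further stages applied to the result and are not shown here.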
Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the DaVinci project).
Instance administrators will be able to allow/ban specific sources of context for end users, as well as index and configure non-default sources for their GitLab Duo features.
End users will be able to further adjust the context allowed by their instance administrators within their UI or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.
End users will also be able to leverage hotkey configurations for user-in-the-loop context retrieval - ref
- user can opt for automated context selection OR
- configure personalized default context selection OR
- user can manually trigger specific repos (human in the loop context retrieval) using hotkey triggers (such as @context.xyz)
Instance administrators have the ability to evaluate RAG strategies within the GL UI
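
The layering described above (administrator allow/ban, end-user toggles, and manual hotkey triggers) can be sketched as a single resolution step. Everything here is an assumption for illustration: the function names, the idea that a manual trigger replaces the user's defaults for that request, and the exact trigger grammar, which is modeled on the `@context.xyz` example.

```python
import re

# Hypothetical trigger grammar based on the "@context.xyz" example.
CONTEXT_TRIGGER = re.compile(r"@context\.([A-Za-z0-9_\-]+)")


def parse_triggers(prompt):
    """Return source names the user triggered manually, in order, deduplicated."""
    return list(dict.fromkeys(CONTEXT_TRIGGER.findall(prompt)))


def resolve_sources(admin_allowed, user_enabled, prompt):
    """Decide which context sources to query for one request.

    admin_allowed: set of sources the instance administrator permits
    user_enabled:  set of sources the end user has toggled on by default
    prompt:        the user's input, possibly containing manual triggers
    """
    requested = parse_triggers(prompt)
    # Assumption: manual triggers replace the user's defaults for this request.
    candidates = requested if requested else sorted(user_enabled)
    # Administrator policy always wins: banned sources are never queried.
    return [source for source in candidates if source in admin_allowed]
```

In this sketch a user can narrow or redirect retrieval but never widen it beyond what the administrator allows, so a trigger for a banned source is silently dropped. Whether the UI should instead surface a message in that case is an open UX question.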

Edited Jul 11, 2024 by Susie Bitters