[UX] RAG Configurations
This issue outlines UX considerations for implementing RAG-based approaches to customization across Duo features, self-hosted or otherwise.
# Background Vision
### Human in the Loop Processes - Building Trust in AI Features
Trust in AI should not be misconstrued as a belief that AI will always produce correct results. Instead, it should be understood as confidence that the AI's outcomes will be reliable and its decision-making process transparent and well-understood. [1](https://link.springer.com/chapter/10.1007/978-3-030-50334-5_4) This kind of trust involves transparency for an end user into the conditions under which an AI feature is operating, as well as the ability to define those conditions to optimize reliability. This transparency allows results to be interpreted, verified, and ultimately utilized by end users - leading to increased adoption of AI features. [2](https://arxiv.org/abs/2203.12687)
We envision enabling transparent and configurable RAG processes to underpin customization of GitLab features. As GitLab advances RAG context injection for Duo features, the Custom Models team will give Enterprise customers at all surfaces (both instance administrators and end users) transparency into, and control over, what content is injected:
* Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the [DaVinci](https://gitlab.com/groups/gitlab-org/editor-extensions/-/epics/55 "Advanced Code Context Resolver Architecture - (Project DaVinci)") project, or [context for Duo epic](https://gitlab.com/gitlab-org/gitlab/-/issues/469532#note_1983361749 "Define and scope context for GitLab Duo")).
* Instance administrators will be able to allow/ban specific sources of context for end users, as well as index/embed and configure non-default sources for their GitLab Duo features.
* End users will be able to further manipulate context allowed by their instance administrators within their UX or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.
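The two-level control described above (administrators allow or ban sources, end users toggle within what is allowed) can be sketched as simple set arithmetic. This is a minimal illustration only: `InstancePolicy`, `effective_sources`, and the source names are hypothetical and do not correspond to any existing GitLab API.

```python
from dataclasses import dataclass, field


@dataclass
class InstancePolicy:
    """Hypothetical admin-level view of context sources for an instance."""
    default_sources: set[str]                               # shipped with GitLab
    custom_sources: set[str] = field(default_factory=set)   # indexed by the admin
    banned_sources: set[str] = field(default_factory=set)   # disallowed by the admin

    def allowed(self) -> set[str]:
        # Everything known to the instance, minus what the admin banned.
        return (self.default_sources | self.custom_sources) - self.banned_sources


def effective_sources(policy: InstancePolicy, user_disabled: set[str]) -> set[str]:
    """Sources injected for one user: admin-allowed minus the user's opt-outs."""
    return policy.allowed() - user_disabled


policy = InstancePolicy(
    default_sources={"open_tabs", "imports"},
    custom_sources={"internal_docs"},
    banned_sources={"imports"},
)
print(sorted(effective_sources(policy, user_disabled={"open_tabs"})))
```

Because end users can only subtract from the allowed set, a user preference can never re-enable a source that an administrator has banned.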
## UX Requirements
### Setting Up Customer Data for RAG
Customer data for RAG must adhere to the [AI Context Management blueprint](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/ai_context_management/):
* see [AI Context Policy Management](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/ai_context_management/#\_ai-context-policy-management\_-proposal)
* see [Supplementary User Context](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/ai_context_management/#\_supplementary-user-context\_-proposal)
The main components for RAG systems are:
1. data identification
2. data indexing
3. retrieval strategies
4. context injection
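To make the four components concrete, here is a toy end-to-end sketch. It assumes a naive term-overlap index in place of BM25 or embeddings, and every function name is hypothetical; it only illustrates how the stages hand data to one another.

```python
from collections import defaultdict


def identify(sources: dict[str, str]) -> dict[str, str]:
    """1. Data identification: keep only non-empty documents."""
    return {name: text for name, text in sources.items() if text.strip()}


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """2. Data indexing: map each lowercase term to the documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for term in text.lower().split():
            index[term].add(name)
    return index


def retrieve(index: dict[str, set[str]], query: str, top_k: int = 2) -> list[str]:
    """3. Retrieval: rank documents by the number of matching query terms."""
    counts: dict[str, int] = defaultdict(int)
    for term in set(query.lower().split()):
        for name in index.get(term, ()):
            counts[name] += 1
    # Highest count first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda n: (-counts[n], n))[:top_k]


def inject(query: str, docs: dict[str, str], names: list[str]) -> str:
    """4. Context injection: prepend the retrieved text to the prompt."""
    context = "\n".join(f"[{n}] {docs[n]}" for n in names)
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = identify({"deploy": "deploy with the runner", "style": "use two spaces", "empty": " "})
index = build_index(docs)
query = "how to deploy the runner"
print(inject(query, docs, retrieve(index, query)))
```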
# Proposal
1\. **Data Identification** -- Customers would be responsible for identifying the data with which they want to enrich Duo features
2\. **Data Indexing** -- There are two iterative UX approaches that we could take to supply external customer data for RAG prompt injection. Customers could either:
* **Iteration One** - [Bring Your Own Index/Vector Store](https://gitlab.com/gitlab-org/ai-powered/custom-models/custom-models/-/issues/46 "Instance-Level Configuration for Bring Your Own Index/Vector Store")
1. bring their documentation already indexed with BM25, OR
2. bring their documentation already embedded in a vector store (for semantic search)
* administrators would then also need to declare their self-hosted embedding model and dimension size
* they would also then need to configure their self-hosted embedding model to enable semantic search
3. administrators assign their documentation a unique and descriptive name to enable RAG configurations
4. administrators can create authorization requirements for accessing data within the index
* **Iteration Two** - [Index via GL UI](https://gitlab.com/groups/gitlab-org/-/epics/14773)
1. [administrators bring their documentation and declare it to GitLab](https://gitlab.com/gitlab-org/gitlab/-/issues/464630 "Instance-Level Configuration for External Documentation")
2. administrators choose:
* BM25 **OR**
* Zoekt **OR**
* configure their self-hosted embedding model (for self-managed) for embedding **OR**
* use default GitLab embedding model (for cloud-connected instances) - currently [Vertex AI Search](https://cloud.google.com/enterprise-search)
3. administrators configure parameters within GitLab UI
* chunking strategy
* vector size
4. administrators create their documentation index/vector store within GitLab UI
5. administrators assign their documentation index/vector store a unique and descriptive name to enable RAG configurations
6. administrators can create authorization requirements for accessing data within the index
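The configuration rules implied by both iterations (a unique name, and an embedding model plus dimension size only when a vector store is used) could be validated along these lines. This is a sketch under stated assumptions: `IndexConfig`, `validate`, the backend identifiers, and the error messages are all hypothetical, not an existing GitLab schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical backend identifiers for the options listed above.
KEYWORD_BACKENDS = {"bm25", "zoekt"}
VECTOR_BACKEND = "vector"


@dataclass(frozen=True)
class IndexConfig:
    name: str                              # unique, descriptive name
    backend: str                           # "bm25", "zoekt", or "vector"
    embedding_model: Optional[str] = None  # required for vector stores
    vector_size: Optional[int] = None      # embedding dimension


def validate(config: IndexConfig, existing_names: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not config.name.strip():
        errors.append("name must not be empty")
    elif config.name in existing_names:
        errors.append(f"name '{config.name}' is already in use")

    if config.backend in KEYWORD_BACKENDS:
        if config.embedding_model or config.vector_size:
            errors.append("keyword indexes do not take embedding settings")
    elif config.backend == VECTOR_BACKEND:
        if not config.embedding_model:
            errors.append("vector stores require an embedding model")
        if not config.vector_size or config.vector_size <= 0:
            errors.append("vector stores require a positive vector size")
    else:
        errors.append(f"unknown backend '{config.backend}'")
    return errors


print(validate(IndexConfig("handbook", "bm25"), existing_names=set()))
print(validate(IndexConfig("handbook", "vector"), existing_names={"handbook"}))
```

Returning every problem at once, rather than raising on the first, would let a settings form show all issues to the administrator in a single pass.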
3\. **Retrieval Strategies**
* **Iteration One** - default retrievals
* customers use the default retrieval strategies for each feature
* **Iteration Two** - custom retrievals
1. administrators can configure preferred retrieval strategies for their own data
* RAG retrieval strategies are configurable based on the chosen indexing/vector strategy, with parameters such as:
* top k cutoff
* reranking
* summarization
* adjacent chunks
2. Instance administrators will be able to see default GitLab context sources (for details of potential context source examples, see the [DaVinci](https://gitlab.com/groups/gitlab-org/editor-extensions/-/epics/55#note_1909631577 "Advanced Code Context Resolver Architecture - (Project DaVinci)") project).
3. Instance administrators will be able to allow/ban specific sources of context for end users, as well as index and configure non-default sources for their GitLab Duo features.
4. End users will be able to further manipulate context allowed by their instance administrators within their UX or IDE (for Code Suggestions), turning context injection on and off for sources allowed by their instance administrators.
5. End users will also be able to leverage hotkey configurations for user-in-the-loop context retrieval - [ref](https://gitlab.com/groups/gitlab-org/-/epics/13947 "Slash commands in GitLab Duo Chat")
* user can opt for automated context selection **OR**
* configure personalized default context selection **OR**
* user can manually trigger specific repos (human in the loop context retrieval) using hotkey triggers (such as @context.xyz)
6. Instance administrators have [the ability to evaluate RAG strategies within the GL UI](https://gitlab.com/gitlab-org/gitlab/-/issues/461250 "UX Experience for Customer Facing Validations")
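Two of the retrieval parameters listed under Iteration Two, the top k cutoff and adjacent chunks, can be illustrated with a short sketch. It assumes relevance scores have already been produced by whichever index the administrator configured; `RetrievalConfig` and `select_chunks` are hypothetical names, and reranking and summarization are out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalConfig:
    top_k: int = 3            # keep only the k highest-scoring chunks
    adjacent_chunks: int = 0  # neighbours to include on each side of a hit


def select_chunks(scores: list[float], config: RetrievalConfig) -> list[int]:
    """Return the indices of chunks to inject, in document order.

    scores[i] is the relevance score of chunk i for the current query.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = ranked[: config.top_k]

    selected: set[int] = set()
    for i in hits:
        lo = max(0, i - config.adjacent_chunks)
        hi = min(len(scores) - 1, i + config.adjacent_chunks)
        selected.update(range(lo, hi + 1))  # the hit plus its neighbours
    return sorted(selected)


scores = [0.1, 0.9, 0.2, 0.05, 0.8]
print(select_chunks(scores, RetrievalConfig(top_k=2)))
print(select_chunks(scores, RetrievalConfig(top_k=1, adjacent_chunks=1)))
```

Including adjacent chunks trades prompt length for continuity: a hit in the middle of a procedure brings along the steps immediately before and after it.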