Customer-Hosted Logging in Support of Self-Hosting and Customization
The Problem
As Custom Model moves toward self-hosted model support and customization approaches, we need to give customers visibility into their own LLM flows for debugging, auditing, validation, and potentially accumulating datasets for supervised fine-tuning. Currently, GitLab does not enable customers to capture the inputs and outputs of LLM interactions; customers have no visibility into the flow of GenAI features.
Use Cases
- auditing
- purge requirements: customers in the self-managed, air-gapped space tend to be the most focused on data privacy and security. A natural extension of that concern is layered data access controls, including classification levels. As we implement RAG processes for customization, these customers require the ability to trace the origin points of injected content. This enables them to trace the data lineage of LLM-produced content and satisfy ‘purge’ requirements should errors be detected: in high-data-sensitivity settings, tainted records must be traced back to their source, and all cascading or affected records must be examined and re-validated, or otherwise purged. To enable this tracing and purging, our logging must include the source records and text of all injected content (a sketch follows this list).
- debugging
- validation
- datasets for supervised fine-tuning: customers will likely need between 6,000 and 8,000 examples of 'good' prompts and responses to optimally shift the weights of pre-trained models for their own use cases. Our hypothesis is that customers do not have these datasets at hand, but access to their own logs as they begin to operate LLMs in their own environments could help them build those datasets for later curation and use in SFT (an extraction sketch follows this list).
- audit events: anything that could materially change privacy or security settings, or enable or disable features, should generate audit log events.
- model routing: the elements we choose to capture can affect downstream functions such as model routing.
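To make the purge requirement concrete, here is a minimal sketch of a log record carrying RAG source attribution, and of how a tainted source record could be traced to every interaction it contaminated. The JSON-lines layout and all field names (`rag_sources`, `source_id`, `interaction_id`, and so on) are illustrative assumptions, not a committed schema.

```python
import json
from pathlib import Path

# Illustrative only: the field names and JSON-lines layout below are
# assumptions for this sketch, not a committed GitLab schema.
EXAMPLE_RECORD = {
    "interaction_id": "c0ffee-01",
    "timestamp": "2024-06-01T12:00:00Z",
    "prompt": "Summarize the deployment runbook.",
    "response": "...",
    # Every piece of RAG-injected content carries its origin record and
    # its text, so data lineage can be reconstructed later.
    "rag_sources": [
        {"source_id": "wiki/runbook.md", "injected_text": "..."},
    ],
}

def interactions_affected_by(log_path: Path, tainted_source_id: str) -> list[str]:
    """Return IDs of interactions whose injected content came from a tainted source.

    In a purge workflow, these interactions (and anything derived from them)
    would then be re-validated or deleted.
    """
    affected = []
    with log_path.open() as f:
        for line in f:
            record = json.loads(line)
            if any(s["source_id"] == tainted_source_id
                   for s in record.get("rag_sources", [])):
                affected.append(record["interaction_id"])
    return affected
```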
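For the SFT use case, a customer could similarly filter their own logs down to curated prompt/response pairs. This sketch assumes the same JSON-lines layout as above, plus a hypothetical `rating` field that a reviewer sets during curation; GitLab does not currently capture such a field.

```python
import json
from pathlib import Path

def extract_sft_pairs(log_path: Path) -> list[dict]:
    """Collect prompt/response pairs suitable for supervised fine-tuning.

    Assumes a hypothetical `rating` field set by a human reviewer during
    curation; only pairs marked 'good' are kept.
    """
    pairs = []
    with log_path.open() as f:
        for line in f:
            record = json.loads(line)
            if record.get("rating") == "good":
                pairs.append({
                    "prompt": record["prompt"],
                    "completion": record["response"],
                })
    return pairs
```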
The Proposal
- We will expose to users their own interactions with LLM-based features, captured on their Rails instance. This data will not be visible to GitLab.
- Customers must opt into this data capture and can opt into particular facets individually.
- We will offer different levels of logging, with more detailed logging available on an opt-in basis.
- Customers can configure their own retention period with the options 30 days, 90 days, 120 days, and 'no age-off' (a configuration sketch follows).
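A minimal sketch of what per-facet opt-in plus retention configuration could look like; the setting names, facet keys, and the `NO_AGE_OFF` sentinel are assumptions for illustration, not actual GitLab configuration.

```python
from dataclasses import dataclass, field
from enum import Enum

class RetentionPeriod(Enum):
    # The four retention options proposed above.
    DAYS_30 = 30
    DAYS_90 = 90
    DAYS_120 = 120
    NO_AGE_OFF = None  # records are never aged off

@dataclass
class LLMLoggingSettings:
    """Illustrative settings object; names are assumptions, not GitLab config."""
    enabled: bool = False  # logging is strictly opt-in
    retention: RetentionPeriod = RetentionPeriod.DAYS_30
    # Each facet maps to one of the logging options listed below;
    # every facet defaults to off and must be opted into individually.
    facets: dict[str, bool] = field(default_factory=lambda: {
        "token_usage": False,
        "prompt_response": False,
        "rag_sources": False,
        "latency": False,
    })
```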
Logging options will include:
- token usage (input / output)
- user prompt / LLM response
- source / origin of RAG injected content (essential for purge requirements)
- latency (timestamps for user prompt / LLM response)
- user group
- user project
- user supergroups
- user role
- code owners
- data security classification level of the user’s current file, repo, group
- GitLab Control Framework
- manual selection of input routing (see end-user-controlled model routing)
- context-based parameters such as PII detection or zero-shot determination (see context-based model routing)
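Taken together, the options above imply a structured log record along the following lines. This is a sketch only: every field name is an assumption, and each field would be populated only for the facets a customer has opted into.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LLMInteractionRecord:
    """Sketch of one logged LLM interaction; all field names are illustrative."""
    interaction_id: str
    prompt_sent_at: str                  # latency: timestamp of the user prompt
    response_received_at: str            # latency: timestamp of the LLM response
    input_tokens: Optional[int] = None   # token usage (input)
    output_tokens: Optional[int] = None  # token usage (output)
    prompt: Optional[str] = None         # user prompt
    response: Optional[str] = None       # LLM response
    rag_sources: Optional[list[dict]] = None  # origin and text of injected content
    group: Optional[str] = None
    project: Optional[str] = None
    role: Optional[str] = None
    classification_level: Optional[str] = None  # of the current file, repo, group
    routing_choice: Optional[str] = None  # manual or context-based model routing
```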
References
- UX Experience for Self-Hosted Auditing / Logging (#467446, closed)
- https://gitlab.com/gitlab-org/gitlab/-/issues/454562
- https://gitlab.com/gitlab-org/gitlab/-/issues/463936
There is additional context around latency here: