Planning: Duo Context Configuration & Exclusion
Background
As we expand the context and content available to Duo, we need to provide customer controls for excluding sensitive files/content from Duo features and supporting models. Customers may have sensitive files that should not be processed by, or passed as input to, LLMs and embeddings models.
References
Timeline considerations
- Helpful follow-on to the general availability launch of /include
- Helpful security control for Semantic search: chat with your codebase (&16910)
Main goal
Allow customers to enforce their security/privacy policy by controlling the content that is used within Duo. The messaging intent is to assure customers that excluded files/context are not processed by any Duo LLM or supporting model.
This provides a very strong data privacy and data retention position:
- Each customer can preclude content from ever being processed by Duo
- For content that is processed by Duo, we maintain zero-day data retention
Assumptions
- The implementation approach will be consistent with iteration 1 of the AI Context Management proposal
- The ai_context_policy file is auto-created in new repositories where Duo is enabled
- The ai_context_policy file is auto-created in existing repositories where Duo is enabled
- We do not automatically exclude paths specified in .gitignore
- We can use an implementation pattern that makes it trivial for each Duo feature to integrate & enforce the AI context check. Ideally, it would take each team less than a day or two to enforce the AI context policy within their Duo feature.
MVC Proposal
Functional summary
- Files are available for AI context by default, unless otherwise specified.
- An administrator can:
- Configure paths to exclude from AI context
- This could include a specific file, a directory, a file extension, etc.
- Configure paths to include for AI context
- e.g. Exclude a folder, but include 2 specific files in that folder.
- This could include a specific file, a directory, a file extension, etc.
- The controls config file must be stored in the root of the repository.
- All files are excluded when a project has Duo turned off.
- Updating files when the exclusion configuration is updated:
- If files are embedded/stored, and Duo is turned off for the project, then we should remove the files from the Duo data store
- If files are embedded/stored, and those files are added to the exclude configuration, then we should remove the files from the Duo data store.
- The removal doesn't need to be instantaneous but we should aim for no more than 30 minutes to apply the change.
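The rules in this functional summary can be sketched as a single policy check plus a purge step. This is a minimal illustration only: the `AIContextPolicy` class, its field names, and the use of `fnmatch`-style glob patterns are assumptions, since the actual format of the policy file is not specified here. It also assumes that an explicit include takes precedence over an exclude, which matches the "exclude a folder, but include 2 specific files" case.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AIContextPolicy:
    """Hypothetical in-memory form of the policy (field names are assumptions)."""
    duo_enabled: bool = True
    exclude: list[str] = field(default_factory=list)
    include: list[str] = field(default_factory=list)


def _matches(path: str, patterns: list[str]) -> bool:
    # fnmatch-style globs: "*" also matches "/" here, unlike gitignore rules.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def is_allowed(policy: AIContextPolicy, path: str) -> bool:
    """Return True if the file at `path` may be used as AI context."""
    if not policy.duo_enabled:
        return False  # all files are excluded when Duo is turned off
    if _matches(path, policy.include):
        return True  # assumed precedence: an explicit include wins
    return not _matches(path, policy.exclude)  # available by default


def paths_to_purge(policy: AIContextPolicy, stored_paths: list[str]) -> list[str]:
    """Embedded/stored paths that the current policy no longer allows."""
    return [path for path in stored_paths if not is_allowed(policy, path)]


policy = AIContextPolicy(
    exclude=["secrets/*", "*.pem"],
    include=["secrets/README.md", "secrets/schema.sql"],
)
stored = ["src/app.py", "secrets/prod.env", "secrets/README.md"]
stale = paths_to_purge(policy, stored)  # candidates for removal from the data store
```

A background job could run `paths_to_purge` whenever the policy file or the project's Duo setting changes, which would be one way to stay within the 30-minute target.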
Excluded files behavior
- Content from excluded files is not sent to an LLM or embeddings model.
- Duo Chat is not supported for excluded files.
- Code Suggestions are not supported within excluded files.
- Content in excluded files won't be used to inform code completion suggestions in other files.
- This includes both open tabs context, and imports context.
- Content from excluded files is not embedded and stored.
- Generally, no Duo feature should use content from excluded files.
- Edge case: Duo is enabled but all or most files are excluded - Duo will be ineffective. No specific requirement here but we could consider in-product messaging if this is common.
UX treatments
- IDE extension should indicate when the open and active file is excluded.
- Proposal: show the disabled Tanuki icon
- This ensures the user understands that code suggestions will not work in this file, and chat won't know anything about the file.
- Excluded files are not available within the /include selection menu
- This is nice to have but not critical. Chat should decide whether it's easier to display an excluded file in the file selector, and respond with an exclusion message; e.g. "One or more relevant files are excluded from Duo features based on your project's AI Context Configuration policy."
- If some but not all files are excluded, Duo should respond based on the allowed content, and indicate that some files were excluded.
- Each area of the product returns a descriptive message when a relevant file was excluded.
- e.g. User attempts to initiate Duo Code Review on an excluded file, and Duo responds: "One or more relevant files are excluded from Duo features based on your project's AI Context Configuration policy."
- e.g. User attempts to remediate a security vulnerability (vulnerability resolution), and receives the exclusion message
- e.g. User attempts to troubleshoot a CI/CD pipeline, and receives the exclusion message
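One possible shape for a shared check that each Duo feature could call before building its prompt is sketched below. It splits candidate files into allowed and excluded sets and attaches the exclusion message when anything was dropped, so a feature can still respond based on the allowed content. The names `check_context` and `ContextCheck` are hypothetical, and the policy lookup is passed in as a plain callable because its real interface is not defined in this proposal.

```python
from typing import Callable, Iterable, NamedTuple, Optional

EXCLUSION_MESSAGE = (
    "One or more relevant files are excluded from Duo features based on "
    "your project's AI Context Configuration policy."
)


class ContextCheck(NamedTuple):
    allowed: list[str]      # paths that may be sent to a model
    excluded: list[str]     # paths withheld by the policy
    notice: Optional[str]   # message to show the user, if any


def check_context(
    paths: Iterable[str], is_allowed: Callable[[str], bool]
) -> ContextCheck:
    """Partition candidate context files and build the user-facing notice."""
    allowed: list[str] = []
    excluded: list[str] = []
    for path in paths:
        (allowed if is_allowed(path) else excluded).append(path)
    notice = EXCLUSION_MESSAGE if excluded else None
    return ContextCheck(allowed, excluded, notice)


# Example with a stand-in policy that excludes everything under secrets/.
result = check_context(
    ["src/app.py", "secrets/key.pem"],
    lambda path: not path.startswith("secrets/"),
)
```

If `result.allowed` is empty, the feature would return only the notice; otherwise it would proceed with the allowed files and append the notice to its response.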
Edge case
- We can't reasonably stop a user from copy/pasting the entire contents of a restricted file into chat
- e.g. Open file, copy all code, paste into Chat along with question/task
Tier availability and deployment options
Supported Duo add-ons
- Duo Pro ✅
- Duo Enterprise ✅
Supported deployment options
- .com ✅
- Dedicated ✅
- Self Managed ✅
- Self-hosted models ✅
Telemetry
- We can measure the number of customers using a non-default AI context policy
- We can measure the number of projects using a non-default AI context policy
Potential future iterations
- Automated validation of correct policy configuration
- Admin UI to manage policy
- Manage policy at group level
Metrics
The metrics are focused on adoption, and on measuring a shift in projects moving from Duo-disabled to Duo-enabled with some files excluded. We believe that there will be fewer projects with Duo turned completely off and more projects where specific files or file extensions are excluded. As a prerequisite to rollout, we can baseline the number of projects where Duo is supported but disabled.
Adoption
- % of customers using AI context inclusion/exclusion
- % of projects using AI context inclusion/exclusion
Behavior change
- Reduced % of Duo-disabled projects
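As a rough illustration of how the project-level metrics could be computed from a snapshot, the sketch below uses a hypothetical `ProjectSnapshot` record. The choice of denominator (projects where Duo is supported) is an assumption taken from the baseline described above, not a confirmed definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectSnapshot:
    """Hypothetical per-project record; field names are assumptions."""
    duo_supported: bool
    duo_enabled: bool
    has_custom_policy: bool  # non-default AI context policy


def pct(part: int, total: int) -> float:
    """Percentage of `part` in `total`, or 0.0 when there is nothing to measure."""
    return 0.0 if total == 0 else 100.0 * part / total


def project_metrics(projects: list[ProjectSnapshot]) -> dict[str, float]:
    supported = [p for p in projects if p.duo_supported]
    using_policy = sum(1 for p in supported if p.duo_enabled and p.has_custom_policy)
    duo_disabled = sum(1 for p in supported if not p.duo_enabled)
    return {
        "pct_projects_using_policy": pct(using_policy, len(supported)),
        "pct_projects_duo_disabled": pct(duo_disabled, len(supported)),
    }


snapshot = [
    ProjectSnapshot(duo_supported=True, duo_enabled=True, has_custom_policy=True),
    ProjectSnapshot(duo_supported=True, duo_enabled=True, has_custom_policy=False),
    ProjectSnapshot(duo_supported=True, duo_enabled=False, has_custom_policy=False),
    ProjectSnapshot(duo_supported=True, duo_enabled=False, has_custom_policy=False),
    ProjectSnapshot(duo_supported=False, duo_enabled=False, has_custom_policy=False),
]
metrics = project_metrics(snapshot)
```

Comparing `pct_projects_duo_disabled` before and after rollout would show the behavior change, provided both snapshots use the same denominator.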