Planning: Duo Context Configuration & Exclusion

Background

As we expand the context and content available to Duo, we need to provide customer controls for excluding sensitive files/content from Duo features and supporting models. Customers may have sensitive files that should not be processed by, or provided as input to, LLMs and embeddings models.

References

Timeline considerations

Main goal

Allow customers to enforce their security/privacy policy by controlling the content that is used within Duo. The messaging intent is to assure customers that excluded files/context are not processed by any Duo LLM or supporting model.

This provides a very strong data privacy and data retention position:

  • Each customer can preclude content from ever being processed by Duo
  • For content that is processed by Duo, we maintain zero-day data retention

Assumptions

  • The implementation approach will be consistent with iteration 1 of the AI Context Management proposal
    • The ai_context_policy file is auto-created in new repositories where Duo is enabled
    • The ai_context_policy file is auto-created in existing repositories where Duo is enabled
  • We do not automatically exclude paths specified in gitignore
  • We can use an implementation pattern that makes it trivial for each Duo feature to integrate & enforce the AI context check. Ideally, it would take each team no more than a day or two to enforce the AI context policy within their Duo feature.

MVC Proposal

Functional summary

  • Files are available for AI context by default, unless otherwise specified.
  • An administrator can:
    • Configure paths to exclude from AI context
      • This could include a specific file, a directory, a file extension, etc.
    • Configure paths to include for AI context
      • e.g. Exclude a folder, but include 2 specific files in that folder.
      • This could include a specific file, a directory, a file extension, etc.
  • The controls config file must be stored in the root of the repository.
  • All files are excluded when a project has Duo turned off.
  • Updating files when the exclusion configuration is updated:
    • If files are embedded/stored, and Duo is turned off for the project, then we should remove the files from the Duo data store
    • If files are embedded/stored, and those files are added to the exclude configuration, then we should remove the files from the Duo data store.
    • The removal doesn't need to be instantaneous but we should aim for no more than 30 minutes to apply the change.
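
The rules above can be sketched as a small evaluation function. This is a minimal illustration, not the proposed implementation: the AIContextPolicy class, its exclude/include fields, and both function names are hypothetical, and the pattern syntax is assumed to follow Python's fnmatch (where * also matches across directory separators), since the proposal does not specify one. It also assumes that an include entry takes precedence over an exclude entry, which is how the "exclude a folder, but include 2 specific files" example reads.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AIContextPolicy:
    """Hypothetical in-memory form of the repository's AI context policy."""
    exclude: list[str] = field(default_factory=list)
    include: list[str] = field(default_factory=list)


def is_available(path: str, policy: AIContextPolicy, duo_enabled: bool = True) -> bool:
    """Return True if `path` may be used as AI context."""
    if not duo_enabled:
        return False  # all files are excluded when Duo is turned off
    if any(fnmatchcase(path, pattern) for pattern in policy.include):
        return True  # assumed precedence: an include overrides an exclude
    if any(fnmatchcase(path, pattern) for pattern in policy.exclude):
        return False
    return True  # files are available by default


def files_to_remove(stored: set[str], policy: AIContextPolicy,
                    duo_enabled: bool = True) -> set[str]:
    """Return stored/embedded paths that the current policy no longer allows."""
    return {p for p in stored if not is_available(p, policy, duo_enabled)}


policy = AIContextPolicy(
    exclude=["secrets/*", "*.pem"],   # a directory and a file extension
    include=["secrets/README.md"],    # one file re-included from that directory
)
```

A periodic job could call files_to_remove with the set of currently embedded paths whenever the policy or the Duo setting changes, which would fit the 30-minute target without requiring instantaneous removal.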

Excluded files behavior

  • Content from excluded files is not sent to an LLM or embeddings model.
  • Duo Chat is not supported for excluded files.
  • Code Suggestions are not supported within excluded files.
    • Content in excluded files won't be used to inform code completion suggestions in other files.
    • This includes both open tabs context, and imports context.
  • Content from excluded files is not embedded and stored.
  • Generally, no Duo feature should use content from excluded files.
  • Edge case: Duo is enabled but all or most files are excluded - Duo will be ineffective. No specific requirement here but we could consider in-product messaging if this is common.
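
For Code Suggestions, the behavior above amounts to filtering every context source before anything is sent to a model. The sketch below is illustrative only: build_completion_context and its parameters are hypothetical names, and is_excluded stands in for whatever policy check a Duo feature would actually call.

```python
from typing import Callable, Optional


def build_completion_context(
    active_file: str,
    open_tabs: list[str],
    imported_files: list[str],
    is_excluded: Callable[[str], bool],
) -> Optional[dict]:
    """Assemble code completion context, dropping excluded files.

    Returns None when the active file itself is excluded, because
    Code Suggestions are not supported within excluded files.
    """
    if is_excluded(active_file):
        return None

    def allowed(paths: list[str]) -> list[str]:
        # Excluded files must not inform suggestions in other files.
        return [p for p in paths if not is_excluded(p)]

    return {
        "active_file": active_file,
        "open_tabs": allowed(open_tabs),
        "imports": allowed(imported_files),
    }


def example_check(path: str) -> bool:
    """Stand-in policy check (an assumption for this example)."""
    return path.startswith("secrets/")
```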

UX treatments

  • IDE extension should indicate when the open and active file is excluded.
    • Proposal: show the disabled Tanuki icon
    • This ensures the user understands that code suggestions will not work in this file, and chat won't know anything about the file.
  • Excluded files are not available within the /include selection menu
    • This is nice to have but not critical - Chat should decide whether it's easier to display an excluded file in the file selector, and respond with an exclusion message; e.g. One or more relevant files are excluded from Duo features based on your project's AI Context Configuration policy.
    • If some but not all files are excluded, Duo should respond based on the allowed content, and indicate that some files were excluded.
  • Each area of the product returns a descriptive message when a relevant file was excluded.
    • e.g. User attempts to initiate Duo Code Review on an excluded file, and Duo responds: One or more relevant files are excluded from Duo features based on your project's AI Context Configuration policy.
    • e.g. User attempts to resolve a security vulnerability, and receives the exclusion message
    • e.g. User attempts to troubleshoot a CI/CD pipeline, and receives the exclusion message
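
The messaging rules above (respond from allowed content, and say so when something was left out) could be shared across features with a small helper. This is a sketch under assumptions: respond, answer_fn, and is_excluded are hypothetical names, and only the message text comes from this proposal.

```python
from typing import Callable

EXCLUSION_MESSAGE = (
    "One or more relevant files are excluded from Duo features "
    "based on your project's AI Context Configuration policy."
)


def respond(
    requested: list[str],
    is_excluded: Callable[[str], bool],
    answer_fn: Callable[[list[str]], str],
) -> str:
    """Answer using only allowed files and flag any exclusions."""
    allowed = [p for p in requested if not is_excluded(p)]
    excluded = [p for p in requested if is_excluded(p)]

    if excluded and not allowed:
        # Every relevant file is excluded: return only the message.
        return EXCLUSION_MESSAGE

    answer = answer_fn(allowed)  # the feature only ever sees allowed files
    if excluded:
        # Partial exclusion: answer, then indicate that files were left out.
        return f"{answer}\n\n{EXCLUSION_MESSAGE}"
    return answer


def example_check(path: str) -> bool:
    """Stand-in policy check (an assumption for this example)."""
    return path.startswith("secrets/")


def example_review(paths: list[str]) -> str:
    """Stand-in for a Duo feature such as Code Review."""
    return "Reviewed " + ", ".join(paths)
```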

Edge case

  • We can't reasonably stop a user from copy/pasting the entire contents of a restricted file into chat
    • e.g. Open file, copy all code, paste into Chat along with question/task

Tier availability and deployment options

Supported Duo add-ons

  • Duo Pro
  • Duo Enterprise

Supported deployment options

  • .com
  • Dedicated
  • Self Managed
    • Self-hosted models

Telemetry

  • We can measure the number of customers using a non-default AI context policy
  • We can measure the number of projects using a non-default AI context policy

Potential future iterations

  • Automated validation of correct policy configuration
  • Admin UI to manage policy
  • Manage policy at group level

Metrics

The metrics focus on adoption, and on measuring a shift in projects moving from Duo-disabled to Duo-enabled with some files excluded. We believe that there will be fewer projects with Duo turned completely off and more projects where specific files or file extensions are excluded. As a prerequisite to rollout, we can baseline the number of projects where Duo is supported but disabled.

Adoption

  • % of customers using AI context inclusion/exclusion
  • % of projects using AI context inclusion/exclusion

Behavior change

  • Reduced % of Duo-disabled projects
Edited by Jordan Janes