Opt-In Setting for AI Interactions Data Collection

Problem to Solve

GitLab cannot systematically improve DAP quality without access to real-world AI interaction data. While we're adding customer feedback capture mechanisms (thumbs up/down, qualitative comments), these alone don't provide the diagnostic depth needed to reproduce issues, evaluate model performance, or build regression tests.

Scope to Deliver

Introduce an admin-controlled opt-in toggle on the Duo settings page that converts the existing Extended Logging feature flag into a discoverable, customer-facing control. When enabled, it collects AI interaction data (e.g. prompt and response text, session context).

Setting specifications:

  • Location: Top-level namespace/instance admin settings
  • Control level: Instance administrators or top-level group owners
  • Default state: Disabled (requires explicit opt-in)
  • Target audience: Paid tier customers only (all DAP customers at GA are paid)
  • Privacy protection: User identifiers (user_ids, usernames) are not stored with AI interaction data
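
As a rough illustration of how these specifications could fit together, the sketch below gates collection on the opt-in setting and strips user identifiers before anything is stored. It is a minimal sketch under stated assumptions, not GitLab's implementation: the `DuoSettings` class, the role names, the field names, and the `prepare_for_storage` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Hypothetical field names; the proposal only says that user_ids and
# usernames are not stored with AI interaction data.
IDENTIFIER_FIELDS = {"user_id", "username"}

# Hypothetical role names for the two control levels in the spec.
ALLOWED_ROLES = {"instance_admin", "top_level_group_owner"}


@dataclass
class DuoSettings:
    """Hypothetical settings record for an instance or top-level group."""
    ai_data_collection_enabled: bool = False  # disabled until explicit opt-in
    paid_tier: bool = False


def can_change_setting(role: str) -> bool:
    """Only instance administrators or top-level group owners may toggle."""
    return role in ALLOWED_ROLES


def prepare_for_storage(
    settings: DuoSettings, interaction: dict[str, Any]
) -> Optional[dict[str, Any]]:
    """Return a copy of the interaction without user identifiers,
    or None when collection is not enabled for this customer."""
    if not (settings.paid_tier and settings.ai_data_collection_enabled):
        return None
    return {k: v for k, v in interaction.items() if k not in IDENTIFIER_FIELDS}
```

With the default settings `prepare_for_storage` returns `None`, so nothing is collected unless an administrator has opted in.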

See full proposal here.

Legal & Compliance Validation

Legal confirmed (Legal Issue #3106) that admin-level control is sufficient; no per-user opt-in is required. No further legal approval is required for the paid tier (this scope).

Out of scope for this iteration

  • Free tier customers (deferred until the DAP free tier launch; requires a separate legal/comms strategy because it introduces new language to the Terms and Agreement)

Design

image.png

Opt-In Rate Projection & Its Implications for Data Storage and Volume

Based on industry benchmarks, we estimate that about 15%-25% of paid customers will opt in as they onboard to DAP. We aim to increase the opt-in rate by:

  • leading with value - partner with field to illustrate the value proposition - after the 18.9 delivery
  • integrating into DAP admin onboarding flow - to be planned
  • providing opt-in incentives - to be planned

Depending on data volume, traces may be large and could increase storage costs, operational load, and retrieval/analysis complexity. We will mitigate this with retention limits, sampling, and caps, handled through LangSmith.
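
To make the three mitigations concrete, here is a minimal, tool-agnostic sketch of sampling, a per-trace size cap, and a retention check. It does not use or represent LangSmith's API, and every threshold in it (`SAMPLE_RATE`, `MAX_TRACE_BYTES`, `RETENTION`) is an illustrative assumption rather than a figure from this proposal.

```python
import random
from datetime import datetime, timedelta, timezone

# Illustrative values only; the proposal does not specify any of these.
SAMPLE_RATE = 0.25               # fraction of traces to keep
MAX_TRACE_BYTES = 64 * 1024      # per-trace size cap
RETENTION = timedelta(days=30)   # how long a stored trace is kept


def should_sample(rng: random.Random, rate: float = SAMPLE_RATE) -> bool:
    """Keep roughly `rate` of all traces."""
    return rng.random() < rate


def cap_trace(text: str, max_bytes: int = MAX_TRACE_BYTES) -> str:
    """Truncate a trace to at most `max_bytes` of UTF-8."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a multi-byte character split by the cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def is_expired(
    created_at: datetime, now: datetime, retention: timedelta = RETENTION
) -> bool:
    """True when a stored trace is older than the retention window."""
    return now - created_at > retention
```

Sampling bounds the number of traces, the cap bounds the size of each one, and the retention check bounds how long they accumulate, so together they limit total stored volume.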

cc: @bastirehm @abacon-gitlab @ashrafkhamis

Edited Feb 06, 2026 by 🤖 GitLab Bot 🤖