DAP Flow Session Traceability & Audit Trail for Enterprise Compliance
Problem Statement
Enterprise customers in regulated industries (financial services, healthcare, government) cannot demonstrate governance over AI-assisted changes because there is no comprehensive audit trail linking Duo Agent Platform (DAP) flow executions to the GitLab objects they create or modify.
While some audit infrastructure exists, significant gaps prevent regulated enterprises from meeting SOC2, ISO27001, and FedRAMP compliance requirements for AI-assisted development workflows.
Current State
What Exists Today
Based on a review of the documentation and existing issues:
1. Internal Event Instrumentation
The DAP Instrumentation Guide documents token usage tracking via track_internal_event with AI Context fields:
- session_id, flow_type, agent_name
- input_tokens, output_tokens
- model_provider, model_engine
- correlation_id, billing_event_id
2. API-Level Audit Events
The Software Development Flow documentation states:
"An audit event is created for each API request done by the Software Development Flow"
These events use the api_request_access_with_scope audit event type (introduced in 17.7; see Issue #499461, closed).
3. Commit Attribution (Resolved)
Issue #557042 (closed) addressed Composite Identity token attribution - resolved in milestone 18.7. Related Epic #20119 "Properly attribute authorship to the Composite Identity SA for DAP".
4. GraphQL Usage Data
The Duo and SDLC Trends API provides:
- AGENT_PLATFORM_SESSION_STARTED events
- Feature-level aggregation via AiUserMetrics
- Requires ClickHouse (instance-level data experimental in 18.7)
5. Limited AI Framework Audit Events
Audit Event Types documentation shows only 2 AI-specific events:
- duo_features_enabled_updated (16.10) - Settings changes
- api_request_access_with_scope (17.7) - API requests with audited scopes
What Is Missing
1. Flow Session History UI
Gap: No user-facing interface to review past flow sessions
- Users cannot see what flows they've executed
- No way to review what a specific session did
- Cannot correlate outcomes to inputs
2. Session-to-GitLab-Object Linkage
Gap: No metadata connecting session_id to created/modified resources
- Commits created by DAP don't reference the originating flow session
- MRs don't link back to the session that created them
- Comments/notes have no session attribution
3. Flow Lifecycle Audit Events
Gap: Missing critical audit event types:
- flow_started - When user initiates a flow
- flow_completed - Successful completion with outcome summary
- flow_failed - Failure with error context
- flow_paused / flow_resumed - User intervention events
4. Agent Reasoning/Decision Logs
Gap: No capture of agent decision-making process
- What plan did the agent create?
- Why did it choose specific approaches?
- What alternatives were considered and rejected?
5. Admin Query Capabilities
Gap: No dedicated DAP activity filtering
- Cannot filter audit logs by flow type
- No organization-wide DAP usage visibility
- No compliance reporting dashboards
6. Compliance Integration
Gap: No pre-built compliance tooling
- No SOC2/ISO27001/FedRAMP report templates
- No SIEM integration patterns documented
- No evidence collection automation
Customer Impact
Identified Blockers
| Customer | ARR Impact | Compliance Requirement | Status |
|---|---|---|---|
| Large European bank | $3.5M | Governance requirements for AI-assisted changes | Blocking adoption |
| Major UK financial institution | TBD | Compliance team audit trail requirement | Cannot approve without audit |
| Regulated Industries (General) | Significant | SOC2/ISO27001 change traceability | GA blocker |
Field Feedback
From Solutions Architecture field perspective:
- "Flow session traceability missing — No linkage: where invoked → what GitLab changes made → audit trail"
- "Customers in regulated industries cannot adopt DAP without demonstrating AI governance"
- "Compliance teams asking questions we cannot answer about AI-assisted changes"
Compliance Framework Requirements
SOC2 (CC6.1, CC7.2):
- All system changes must be authorized and traceable
- Audit logs must capture who, what, when for changes
- AI-assisted changes require same rigor as manual changes
- Authorization must be verified at flow initiation (CC6.1)
ISO 27001 (A.12.4):
- Event logging of user activities
- Protection of log information
- Administrator and operator logs
FedRAMP (AU-2, AU-3):
- Auditable events defined and documented
- Content of audit records (user, event, outcome, timestamp)
- Sufficient detail for forensic analysis
- Unambiguous user attribution required (AU-3)
User Stories
Compliance Officer
As a compliance officer, I need to audit all AI-assisted code changes to demonstrate governance for regulatory requirements. I need to answer: "What AI agent made this change, why, and who authorized it?"
Developer
As a developer, I want to see what changes a past flow session made so I can understand and review the agent's work. I need to correlate the output to my input to improve my prompts.
Instance Administrator
As an admin, I need to query all DAP activity across my instance for security and compliance reporting. I need to identify anomalous AI usage patterns and generate reports for auditors.
Security Team Member
As a security team member, I need to investigate what a specific flow session did if issues arise. I need a complete timeline of the agent's actions and decisions.
Engineering Manager
As an engineering manager, I need visibility into how my team uses DAP to understand productivity gains and ensure appropriate usage patterns.
Proposed Solution
1. Flow Session Entity Model
Create a first-class FlowSession entity capturing:
# Proposed schema
class FlowSession
  # Identity
  session_id: UUID
  user_id: Integer
  project_id: Integer

  # Context
  flow_type: String          # "software_development", "fix_pipeline", etc.
  input_prompt: Text         # Must be sanitized for PII/secrets before storage
  input_context: JSONB       # files, issues, MRs referenced

  # Execution
  started_at: DateTime
  completed_at: DateTime
  status: Enum               # started, running, paused, completed, failed

  # Authorization (CC6.1 compliance)
  authorization_verified_at: DateTime
  authorization_method: String
  authorized_scopes: Array[String]

  # Outcomes
  plan_summary: Text         # Agent's generated plan
  decisions_log: JSONB       # Reasoning steps - must be sanitized for PII/secrets

  # Relationships
  has_many :flow_session_resources  # Links to commits, MRs, comments
end

class FlowSessionResource
  flow_session_id: UUID
  resource_type: String      # "Commit", "MergeRequest", "Note"
  resource_id: String        # SHA or ID
  action: String             # "created", "modified"
  created_at: DateTime
end
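The session-to-resource relationship in the proposed schema can be illustrated with a minimal plain-Ruby sketch. It is not the intended implementation, which would presumably be ActiveRecord models backed by PostgreSQL. The method names `transition_to` and `record_resource` are hypothetical and exist only for this illustration; the attribute names follow the schema above.

```ruby
require "securerandom"

# Plain-Ruby stand-in for the proposed FlowSessionResource entity.
FlowSessionResource = Struct.new(
  :flow_session_id, :resource_type, :resource_id, :action, :created_at,
  keyword_init: true
)

# Plain-Ruby stand-in for the proposed FlowSession entity (subset of fields).
class FlowSession
  STATUSES = %i[started running paused completed failed].freeze

  attr_reader :session_id, :user_id, :project_id, :flow_type, :status, :resources

  def initialize(user_id:, project_id:, flow_type:)
    @session_id = SecureRandom.uuid
    @user_id = user_id
    @project_id = project_id
    @flow_type = flow_type
    @status = :started
    @resources = []
  end

  # Hypothetical helper: only statuses listed in the proposed enum are accepted.
  def transition_to(new_status)
    raise ArgumentError, "unknown status: #{new_status}" unless STATUSES.include?(new_status)

    @status = new_status
  end

  # Hypothetical helper: links a created or modified GitLab object to this session.
  def record_resource(resource_type:, resource_id:, action:)
    resource = FlowSessionResource.new(
      flow_session_id: session_id,
      resource_type: resource_type,
      resource_id: resource_id.to_s, # SHA or numeric ID, stored as a string
      action: action,
      created_at: Time.now.utc
    )
    @resources << resource
    resource
  end
end
```

Storing `resource_id` as a string, as the schema proposes, lets one table hold both commit SHAs and numeric IDs.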
2. Flow Lifecycle Audit Events
Add new audit event types:
| Event Type | Trigger | Data Captured |
|---|---|---|
| duo_flow_started | User initiates flow | session_id, flow_type, user, project, input_summary, authorization_verified_at, authorization_method, authorized_scopes |
| duo_flow_plan_generated | Agent creates plan | session_id, plan_steps, estimated_actions |
| duo_flow_paused | User pauses flow | session_id, paused_at, paused_at_step |
| duo_flow_resumed | User resumes flow | session_id, resumed_at |
| duo_flow_completed | Flow finishes successfully | session_id, resources_created, duration |
| duo_flow_failed | Flow encounters error | session_id, error_category, error_code, error_message, failed_at_step, resources_created_before_failure |
| duo_flow_resource_created | Agent creates GitLab resource | session_id, resource_type, resource_id |
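One way to keep these events consistent is to validate each payload against the fields listed for its event type before it is emitted. The following is a rough sketch under that assumption. `build_duo_flow_event` and the envelope keys (`event_type`, `occurred_at`, `details`) are hypothetical names for illustration and are not part of GitLab's existing audit event API.

```ruby
require "time"

# Fields per proposed event type, taken from the "Data Captured" column.
DUO_FLOW_EVENT_FIELDS = {
  "duo_flow_started" => %i[session_id flow_type user project input_summary
                           authorization_verified_at authorization_method authorized_scopes],
  "duo_flow_plan_generated" => %i[session_id plan_steps estimated_actions],
  "duo_flow_paused" => %i[session_id paused_at paused_at_step],
  "duo_flow_resumed" => %i[session_id resumed_at],
  "duo_flow_completed" => %i[session_id resources_created duration],
  "duo_flow_failed" => %i[session_id error_category error_code error_message
                          failed_at_step resources_created_before_failure],
  "duo_flow_resource_created" => %i[session_id resource_type resource_id]
}.freeze

# Hypothetical helper: builds an event envelope, rejecting unknown event
# types and payloads that omit a listed field. Unlisted keys are dropped.
def build_duo_flow_event(event_type, details, occurred_at: Time.now.utc)
  required = DUO_FLOW_EVENT_FIELDS.fetch(event_type) do
    raise ArgumentError, "unknown event type: #{event_type}"
  end
  missing = required - details.keys
  unless missing.empty?
    raise ArgumentError, "missing fields for #{event_type}: #{missing.join(', ')}"
  end

  { event_type: event_type, occurred_at: occurred_at.iso8601, details: details.slice(*required) }
end
```

Dropping unlisted keys is a deliberate choice in this sketch: it keeps unsanitized free-form data from leaking into the audit trail by accident.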
3. Session Linkage to GitLab Objects
Attach session metadata to created resources:
Commits:
X-GitLab-Duo-Session: <session_id>
Stored in commit metadata or notes (per Git trailer convention).
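As an illustration of the trailer approach, the sketch below appends and reads back the proposed `X-GitLab-Duo-Session` trailer on a commit message. It is a simplified approximation of Git's trailer rules (a trailer block is the last paragraph of the message), not a replacement for `git interpret-trailers`. The helper names are hypothetical.

```ruby
DUO_SESSION_TRAILER = "X-GitLab-Duo-Session"

# Simplified shape of a trailer line: "Key: value".
TRAILER_LINE = /\A[A-Za-z0-9-]+: \S.*\z/

# Hypothetical helper: returns the session ID from the trailer, or nil.
def extract_session_id(message)
  match = message.match(/^#{Regexp.escape(DUO_SESSION_TRAILER)}:\s*(\S+)\s*$/i)
  match && match[1]
end

# Hypothetical helper: appends the trailer unless one is already present.
# If the message already ends in a trailer block (for example Signed-off-by),
# the new trailer joins that block; otherwise a new paragraph is started.
def append_session_trailer(message, session_id)
  return message if extract_session_id(message)

  body = message.rstrip
  last_paragraph = body.split(/\n{2,}/).last.to_s
  in_trailer_block = body.include?("\n\n") &&
                     last_paragraph.lines.all? { |line| line.chomp.match?(TRAILER_LINE) }
  separator = in_trailer_block ? "\n" : "\n\n"
  "#{body}#{separator}#{DUO_SESSION_TRAILER}: #{session_id}\n"
end
```

Because the trailer lives in the commit message itself, the linkage survives mirroring and export to systems that never see GitLab's database.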
Merge Requests:
- System note: "Created by Duo Agent Platform flow session <session_id>"
- MR description footer with session link
Comments/Notes:
- Store duo_flow_session_id in notes table
4. User Session History UI
New page: /-/duo/sessions (or within Activity)
Features:
- List of user's flow sessions with status
- Filter by flow type, date range, project
- Filter by user (for admin/compliance use - supports FedRAMP AU-3)
- Session detail view showing:
  - Input prompt and context
  - Plan generated
  - Resources created with links
  - Timeline of agent actions
  - Duration and token usage
- Clear linkage from session to authenticated GitLab user (Composite Identity attribution chain documented)
5. Admin DAP Audit Dashboard
New admin area: /admin/duo_agent_platform/audit
Features:
- Organization-wide flow session listing
- Filter by user, project, flow type, date range
- Export capabilities for compliance reports
- Anomaly detection (unusual usage patterns)
- Integration with existing audit log infrastructure
6. API for Compliance Tooling
GraphQL and REST endpoints:
query {
  duoFlowSessions(
    projectId: "gid://gitlab/Project/123"
    userId: "gid://gitlab/User/456"  # FedRAMP user attribution support
    startDate: "2025-01-01"
    endDate: "2025-01-31"
  ) {
    nodes {
      sessionId
      user { username }
      flowType
      status
      startedAt
      completedAt
      authorizationVerifiedAt
      authorizationMethod
      resourcesCreated {
        resourceType
        resourceId
        webUrl
      }
      planSummary
      decisionsLog
    }
  }
}
7. Sensitive Data Handling
Critical requirement: input_prompt and decisions_log must be sanitized before storage to prevent exposure of:
- Personally Identifiable Information (PII)
- Secrets, tokens, or credentials
- Other sensitive data that may be present in user prompts or agent reasoning
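A sanitization pass could look roughly like the sketch below, which redacts matches in both a plain string (such as input_prompt) and a nested structure (such as decisions_log). The rule list is illustrative and far from exhaustive, and a production system would likely reuse an existing secret detection ruleset. `sanitize_text`, `sanitize_value`, and the placeholder strings are hypothetical names.

```ruby
# Illustrative redaction rules only; real coverage would need many more patterns.
REDACTION_RULES = [
  # GitLab personal access tokens use the "glpat-" prefix.
  [/glpat-[A-Za-z0-9_-]{20,}/, "[REDACTED_TOKEN]"],
  # Generic key=value or key: value secrets; the key and separator are kept.
  [/\b(password|secret|token|api_key)(\s*[=:]\s*)\S+/i, '\1\2[REDACTED_SECRET]'],
  # Email addresses as a simple example of PII.
  [/\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/, "[REDACTED_EMAIL]"]
].freeze

# Applies every rule in order to a single string.
def sanitize_text(text)
  REDACTION_RULES.reduce(text) do |acc, (pattern, replacement)|
    acc.gsub(pattern, replacement)
  end
end

# Recursively sanitizes strings inside arrays and hashes (JSONB-like data);
# non-string leaves such as numbers are returned unchanged.
def sanitize_value(value)
  case value
  when String then sanitize_text(value)
  when Array then value.map { |item| sanitize_value(item) }
  when Hash then value.transform_values { |item| sanitize_value(item) }
  else value
  end
end
```

Sanitizing before the write, rather than at read time, means that exports and streamed audit events never contain the raw values.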
Acceptance Criteria
MVP (P1)
Core principle: Who initiated | What was requested | What was created | When
- Flow sessions logged with: user, timestamp, flow_type, project, input_summary
- duo_flow_started audit event with authorization context (authorization_verified_at, authorization_method, authorized_scopes)
- duo_flow_completed audit event with resources_created, duration
- duo_flow_failed audit event with error_category, error_code, error_message, failed_at_step, resources_created_before_failure
- duo_flow_paused and duo_flow_resumed audit events
- Session ID attached to commits/MRs created by DAP (visible in UI)
- Basic session list available to users
- Default retention period of 1 year for session data
- Unambiguous user attribution - user_id in duo_flow_started linked to authenticated GitLab user
- Sensitive data sanitization - input_prompt and decisions_log sanitized for PII/secrets before storage
- Documentation for compliance teams on available audit data
- Documentation on Composite Identity attribution chain (for FedRAMP)
Enhanced (P2)
- Full agent plan and reasoning captured per session
- User session history UI with detail view
- Admin audit dashboard with filtering
- GraphQL API for session queries (including user filtering)
- Audit event streaming support for DAP events
Enterprise (P3)
- SIEM integration patterns documented
- Pre-built compliance report templates (SOC2, ISO27001)
- Session comparison and diff views
- Alerting on unusual DAP usage patterns
- Configurable retention policy (beyond 1-year default)
Technical Considerations
Architecture Ownership
| Component | Team | Implementation |
|---|---|---|
| FlowSession model | Agent Foundations | Rails |
| Audit events | Compliance (Govern) | Rails |
| Session UI | AI Framework | Vue.js |
| API endpoints | Agent Foundations | GraphQL/REST |
| AI Gateway logging | AI Framework | AI Gateway service |
Data Storage
- Session metadata: PostgreSQL (GitLab database)
- Agent reasoning/plans: May require dedicated storage due to size
- Analytics/aggregation: ClickHouse (existing AI metrics pattern)
- Long-term retention: Consider archive strategy
Performance Considerations
- Session creation should not impact flow latency
- Audit events should be asynchronous (Sidekiq)
- Admin queries may need pagination and caching
- Consider GDPR implications for session data retention
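The asynchronous requirement can be pictured with the standard-library sketch below, in which the flow thread only enqueues an event and a background worker does the write. It stands in for the Sidekiq-based approach named above and is not a Sidekiq example. `AsyncAuditLogger` and its methods are hypothetical names.

```ruby
# Minimal in-process stand-in for an asynchronous audit writer.
# The sink is anything that responds to <<, such as an Array or a File.
class AsyncAuditLogger
  def initialize(sink)
    @queue = Queue.new
    @sink = sink
    @worker = Thread.new do
      # Queue#pop returns nil once the queue is closed and drained,
      # which ends the loop.
      while (event = @queue.pop)
        @sink << event
      end
    end
  end

  # Enqueues the event and returns immediately, so the caller
  # (the flow) does not wait on the write.
  def log(event)
    @queue << event
    nil
  end

  # Stops accepting events, drains what is queued, and waits for the worker.
  def shutdown
    @queue.close
    @worker.join
  end
end
```

Unlike this in-memory queue, a persistent job queue also survives process restarts, which matters for audit data.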
Self-Managed Requirements
- All audit data must be stored locally
- No external dependencies for compliance features
- Admin configuration for retention periods
- Export formats compatible with common SIEM tools
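The admin-configurable retention period, together with the 1-year default from the MVP criteria, might be applied along the lines of the sketch below. It only partitions sessions into expired and retained; archiving or deletion is out of scope. `partition_by_retention` and the hash keys are hypothetical, and one year is approximated as 365 days.

```ruby
# Default retention of 1 year, approximated as 365 days in seconds.
DEFAULT_RETENTION_SECONDS = 365 * 24 * 60 * 60

# Hypothetical helper: splits sessions by whether they completed before
# the retention cutoff. Sessions without completed_at (still running or
# paused) are always retained.
def partition_by_retention(sessions, now: Time.now.utc, retention_seconds: DEFAULT_RETENTION_SECONDS)
  cutoff = now - retention_seconds
  expired, retained = sessions.partition do |session|
    session[:completed_at] && session[:completed_at] < cutoff
  end
  { expired: expired, retained: retained }
end
```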
Related Issues & Epics
- Issue #557042 - Commit attribution (closed, 18.7)
- Epic #20119 - Composite Identity authorship attribution
- Issue #553573 - Duo as team member vision (discusses tracking)
- Issue #549846 - Usage billing reporting (closed)
- Issue #499461 - API request audit events (closed; referenced in docs)
- Issue #431738 - Bot token audit concerns (closed)
Documentation References
- DAP Instrumentation Guide
- AI Event Instrumentation
- Software Development Flow
- Audit Event Types
- Duo and SDLC Trends API
Competitive Context
GitHub Copilot currently lacks comprehensive audit trail capabilities for AI-assisted changes, representing an opportunity for GitLab differentiation in regulated enterprise markets. Customer feedback indicates audit capabilities are a key selection criterion for financial services and healthcare organizations evaluating AI coding assistants.