DAP Flow Session Traceability & Audit Trail for Enterprise Compliance

Problem Statement

Enterprise customers in regulated industries (financial services, healthcare, government) cannot demonstrate governance over AI-assisted changes because there is no comprehensive audit trail linking Duo Agent Platform (DAP) flow executions to the GitLab objects they create or modify.

While some audit infrastructure exists, significant gaps prevent regulated enterprises from meeting SOC2, ISO27001, and FedRAMP compliance requirements for AI-assisted development workflows.

Current State

What Exists Today

Based on a review of the documentation and existing issues:

1. Internal Event Instrumentation

The DAP Instrumentation Guide documents token usage tracking via track_internal_event with AI Context fields:

session_id, flow_type, agent_name
input_tokens, output_tokens
model_provider, model_engine
correlation_id, billing_event_id

2. API-Level Audit Events

The Software Development Flow documentation states:

"An audit event is created for each API request done by the Software Development Flow"

These events use the api_request_access_with_scope audit event type (introduced in 17.7; see Issue #499461, closed).

3. Commit Attribution (Resolved)

Issue #557042 (closed) addressed Composite Identity token attribution and was resolved in milestone 18.7. Related: Epic #20119, "Properly attribute authorship to the Composite Identity SA for DAP".

4. GraphQL Usage Data

The Duo and SDLC Trends API provides:

AGENT_PLATFORM_SESSION_STARTED events
Feature-level aggregation via AiUserMetrics
Requires ClickHouse (instance-level data experimental in 18.7)

5. Limited AI Framework Audit Events

The Audit Event Types documentation lists only two AI-specific events:

duo_features_enabled_updated (16.10) - Settings changes
api_request_access_with_scope (17.7) - API requests with audited scopes

What Is Missing

1. Flow Session History UI

Gap: No user-facing interface to review past flow sessions

Users cannot see what flows they've executed
No way to review what a specific session did
Cannot correlate outcomes to inputs

2. Session-to-GitLab-Object Linkage

Gap: No metadata connecting session_id to created/modified resources

Commits created by DAP don't reference the originating flow session
MRs don't link back to the session that created them
Comments/notes have no session attribution

3. Flow Lifecycle Audit Events

Gap: Missing critical audit event types:

flow_started - When user initiates a flow
flow_completed - Successful completion with outcome summary
flow_failed - Failure with error context
flow_paused / flow_resumed - User intervention events

4. Agent Reasoning/Decision Logs

Gap: No capture of agent decision-making process

What plan did the agent create?
Why did it choose specific approaches?
What alternatives were considered and rejected?

5. Admin Query Capabilities

Gap: No dedicated DAP activity filtering

Cannot filter audit logs by flow type
No organization-wide DAP usage visibility
No compliance reporting dashboards

6. Compliance Integration

Gap: No pre-built compliance tooling

No SOC2/ISO27001/FedRAMP report templates
No SIEM integration patterns documented
No evidence collection automation

Customer Impact

Identified Blockers

Customer	ARR Impact	Compliance Requirement	Status
Large European bank	$3.5M	Governance requirements for AI-assisted changes	Blocking adoption
Major UK financial institution	TBD	Compliance team audit trail requirement	Cannot approve without audit
Regulated Industries (General)	Significant	SOC2/ISO27001 change traceability	GA blocker

Field Feedback

From the Solutions Architecture field perspective:

"Flow session traceability missing — No linkage: where invoked → what GitLab changes made → audit trail"
"Customers in regulated industries cannot adopt DAP without demonstrating AI governance"
"Compliance teams asking questions we cannot answer about AI-assisted changes"

Compliance Framework Requirements

SOC2 (CC6.1, CC7.2):

All system changes must be authorized and traceable
Audit logs must capture who, what, when for changes
AI-assisted changes require the same rigor as manual changes
Authorization must be verified at flow initiation (CC6.1)

ISO 27001 (A.12.4):

Event logging of user activities
Protection of log information
Administrator and operator logs

FedRAMP (AU-2, AU-3):

Auditable events defined and documented
Content of audit records (user, event, outcome, timestamp)
Sufficient detail for forensic analysis
Unambiguous user attribution required (AU-3)

User Stories

Compliance Officer

As a compliance officer, I need to audit all AI-assisted code changes to demonstrate governance for regulatory requirements. I need to answer: "What AI agent made this change, why, and who authorized it?"

Developer

As a developer, I want to see what changes a past flow session made so I can understand and review the agent's work. I need to correlate the output to my input to improve my prompts.

Instance Administrator

As an admin, I need to query all DAP activity across my instance for security and compliance reporting. I need to identify anomalous AI usage patterns and generate reports for auditors.

Security Team Member

As a security team member, I need to investigate what a specific flow session did if issues arise. I need a complete timeline of the agent's actions and decisions.

Engineering Manager

As an engineering manager, I need visibility into how my team uses DAP to understand productivity gains and ensure appropriate usage patterns.

Proposed Solution

1. Flow Session Entity Model

Create a first-class FlowSession entity capturing:

# Proposed schema
class FlowSession
  # Identity
  session_id: UUID
  user_id: Integer
  project_id: Integer
  
  # Context
  flow_type: String # "software_development", "fix_pipeline", etc.
  input_prompt: Text # Must be sanitized for PII/secrets before storage
  input_context: JSONB # files, issues, MRs referenced
  
  # Execution
  started_at: DateTime
  completed_at: DateTime
  status: Enum # started, running, paused, completed, failed
  
  # Authorization (CC6.1 compliance)
  authorization_verified_at: DateTime
  authorization_method: String
  authorized_scopes: Array[String]
  
  # Outcomes
  plan_summary: Text # Agent's generated plan
  decisions_log: JSONB # Reasoning steps - must be sanitized for PII/secrets
  
  # Relationships
  has_many :flow_session_resources # Links to commits, MRs, comments
end

class FlowSessionResource
  flow_session_id: UUID
  resource_type: String # "Commit", "MergeRequest", "Note"
  resource_id: String # SHA or ID
  action: String # "created", "modified"
  created_at: DateTime
end
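
To make the intended behavior of the schema easier to discuss, here is a minimal in-memory sketch in plain Ruby. It is not a Rails model and not a proposed implementation. The transition table, the `record_resource` helper, and the choice to set `completed_at` on both terminal states are assumptions for illustration only.

```ruby
require "securerandom"

# Hypothetical in-memory stand-ins for the proposed entities (not Rails models).
FlowSessionResource = Struct.new(:flow_session_id, :resource_type, :resource_id,
                                 :action, :created_at, keyword_init: true)

class FlowSession
  # Allowed status transitions. This state machine is an assumption,
  # derived from the status values listed in the proposed schema.
  TRANSITIONS = {
    "started"   => %w[running failed],
    "running"   => %w[paused completed failed],
    "paused"    => %w[running failed],
    "completed" => [],
    "failed"    => []
  }.freeze

  attr_reader :session_id, :user_id, :project_id, :flow_type, :status,
              :started_at, :completed_at, :resources

  def initialize(user_id:, project_id:, flow_type:, now: Time.now.utc)
    @session_id = SecureRandom.uuid
    @user_id = user_id
    @project_id = project_id
    @flow_type = flow_type
    @status = "started"
    @started_at = now
    @resources = []
  end

  # Moves the session to a new status, rejecting transitions not in the table.
  def transition_to(new_status, now: Time.now.utc)
    unless TRANSITIONS.fetch(@status).include?(new_status)
      raise ArgumentError, "cannot move from #{@status} to #{new_status}"
    end
    @status = new_status
    # Assumption: completed_at is set for both terminal states.
    @completed_at = now if %w[completed failed].include?(new_status)
    self
  end

  # Links a created or modified GitLab object back to this session.
  def record_resource(resource_type:, resource_id:, action:, now: Time.now.utc)
    resource = FlowSessionResource.new(
      flow_session_id: @session_id, resource_type: resource_type,
      resource_id: resource_id.to_s, action: action, created_at: now
    )
    @resources << resource
    resource
  end
end
```

Every resource row carries the session ID, so "what did this session change?" becomes a lookup by `flow_session_id`.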

2. Flow Lifecycle Audit Events

Add new audit event types:

Event Type	Trigger	Data Captured
`duo_flow_started`	User initiates flow	session_id, flow_type, user, project, input_summary, authorization_verified_at, authorization_method, authorized_scopes
`duo_flow_plan_generated`	Agent creates plan	session_id, plan_steps, estimated_actions
`duo_flow_paused`	User pauses flow	session_id, paused_at, paused_at_step
`duo_flow_resumed`	User resumes flow	session_id, resumed_at
`duo_flow_completed`	Flow finishes successfully	session_id, resources_created, duration
`duo_flow_failed`	Flow encounters error	session_id, error_category, error_code, error_message, failed_at_step, resources_created_before_failure
`duo_flow_resource_created`	Agent creates GitLab resource	session_id, resource_type, resource_id
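
As a rough illustration of how the "Data Captured" column could be enforced, the sketch below validates that an event carries its required fields before it is emitted. The event names and field names come from the table above; `build_flow_audit_event`, the payload shape, and the validation approach are hypothetical and do not reflect GitLab's existing audit event API.

```ruby
require "time"

# Required fields per proposed event type, taken from the table above.
# Only some event types are shown. The validation approach is an assumption.
REQUIRED_FIELDS = {
  "duo_flow_started" => %i[session_id flow_type user project input_summary
                           authorization_verified_at authorization_method
                           authorized_scopes],
  "duo_flow_completed" => %i[session_id resources_created duration],
  "duo_flow_failed" => %i[session_id error_category error_code error_message
                          failed_at_step resources_created_before_failure],
  "duo_flow_resource_created" => %i[session_id resource_type resource_id]
}.freeze

# Builds a hypothetical event payload, rejecting unknown types and
# payloads that are missing any required field.
def build_flow_audit_event(event_type, details, now: Time.now.utc)
  required = REQUIRED_FIELDS.fetch(event_type) do
    raise ArgumentError, "unknown event type: #{event_type}"
  end
  missing = required - details.keys
  raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?

  { event_type: event_type, created_at: now.iso8601, details: details }
end
```

Rejecting incomplete events at build time is one way to ensure each record answers who, what, and when, as AU-3 expects.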

3. Session Linkage to GitLab Objects

Attach session metadata to created resources:

Commits:

X-GitLab-Duo-Session: <session_id>

Stored as a trailer in the commit message (per Git trailer convention), or alternatively in commit metadata or notes.
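
A minimal sketch of the trailer approach, assuming the session ID is written into the commit message itself. The helper names `add_session_trailer` and `extract_session_id` are hypothetical, and the trailer detection is deliberately simpler than what `git interpret-trailers` does.

```ruby
TRAILER_KEY = "X-GitLab-Duo-Session"

# Returns the session ID from the last paragraph of a commit message, or nil.
def extract_session_id(message)
  last_paragraph = message.rstrip.split(/\n{2,}/).last.to_s
  last_paragraph.each_line do |line|
    match = line.match(/\A#{Regexp.escape(TRAILER_KEY)}:\s*(\S+)\s*\z/)
    return match[1] if match
  end
  nil
end

# Appends the session trailer to a commit message.
def add_session_trailer(message, session_id)
  return message if extract_session_id(message) == session_id # already tagged

  body = message.rstrip
  last_paragraph = body.split(/\n{2,}/).last.to_s
  # If the message already ends in a block of "Key: value" lines, extend it.
  # Otherwise start a new trailer block after a blank line.
  in_trailer_block = body.include?("\n") &&
                     last_paragraph.lines.all? { |l| l.match?(/\A[\w-]+: \S/) }
  separator = in_trailer_block ? "\n" : "\n\n"
  "#{body}#{separator}#{TRAILER_KEY}: #{session_id}\n"
end
```

With the ID in the message, `git log --grep` can find the commits for a session without a database lookup. The tradeoff is that the link is lost if the message is rewritten.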

Merge Requests:

System note: "Created by Duo Agent Platform flow session <session_id>"
MR description footer with session link

Comments/Notes:

Store duo_flow_session_id in notes table

4. User Session History UI

New page: /-/duo/sessions (or within Activity)

Features:

List of user's flow sessions with status
Filter by flow type, date range, project
Filter by user (for admin/compliance use - supports FedRAMP AU-3)
Session detail view showing:
- Input prompt and context
- Plan generated
- Resources created with links
- Timeline of agent actions
- Duration and token usage
Clear linkage from session to authenticated GitLab user (Composite Identity attribution chain documented)

5. Admin DAP Audit Dashboard

New admin area: /admin/duo_agent_platform/audit

Features:

Organization-wide flow session listing
Filter by user, project, flow type, date range
Export capabilities for compliance reports
Anomaly detection (unusual usage patterns)
Integration with existing audit log infrastructure
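
The filtering and export features listed above could behave roughly as follows. This is a sketch over plain hashes, not a query against a real table. `filter_sessions`, `export_jsonl`, and the hash keys are hypothetical names that mirror the proposed schema, and JSON Lines is only one possible export format.

```ruby
require "json"
require "date"

# Filters hypothetical session records (plain hashes). A nil filter is ignored.
def filter_sessions(sessions, user_id: nil, project_id: nil, flow_type: nil,
                    from: nil, to: nil)
  sessions.select do |s|
    started_on = Date.parse(s[:started_at])
    (user_id.nil? || s[:user_id] == user_id) &&
      (project_id.nil? || s[:project_id] == project_id) &&
      (flow_type.nil? || s[:flow_type] == flow_type) &&
      (from.nil? || started_on >= from) &&
      (to.nil? || started_on <= to)
  end
end

# Serializes sessions as JSON Lines: one JSON object per line.
def export_jsonl(sessions)
  sessions.map { |s| JSON.generate(s) }.join("\n")
end
```

A call such as `filter_sessions(all, flow_type: "fix_pipeline", from: Date.new(2025, 1, 1))` would return the matching sessions, which `export_jsonl` then turns into a file an auditor or log pipeline can consume.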

6. API for Compliance Tooling

GraphQL and REST endpoints:

query {
  duoFlowSessions(
    projectId: "gid://gitlab/Project/123"
    userId: "gid://gitlab/User/456"  # FedRAMP user attribution support
    startDate: "2025-01-01"
    endDate: "2025-01-31"
  ) {
    nodes {
      sessionId
      user { username }
      flowType
      status
      startedAt
      completedAt
      authorizationVerifiedAt
      authorizationMethod
      resourcesCreated {
        resourceType
        resourceId
        webUrl
      }
      planSummary
      decisionsLog
    }
  }
}
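
For compliance tooling that would call this API from a script, the request could be assembled as sketched below. `/api/graphql` is GitLab's GraphQL endpoint, but `duoFlowSessions` is only the field proposed in this issue and does not exist today. The GraphQL variable types and the `build_sessions_request` helper are assumptions.

```ruby
require "json"
require "net/http"
require "uri"

# NOTE: duoFlowSessions is the field proposed in this issue; it does not exist yet.
# The variable types (ID!, String!) are placeholders, not a confirmed schema.
SESSIONS_QUERY = <<~GRAPHQL
  query($projectId: ID!, $startDate: String!, $endDate: String!) {
    duoFlowSessions(projectId: $projectId, startDate: $startDate, endDate: $endDate) {
      nodes { sessionId flowType status startedAt completedAt }
    }
  }
GRAPHQL

# Builds (but does not send) a POST request for the proposed query.
def build_sessions_request(base_url, token, variables)
  uri = URI.join(base_url, "/api/graphql")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Authorization"] = "Bearer #{token}"
  request.body = JSON.generate({ query: SESSIONS_QUERY, variables: variables })
  [uri, request]
end
```

Sending it would look like `Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }`. That step is left out here because the field is not yet available.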

7. Sensitive Data Handling

Critical requirement: input_prompt and decisions_log must be sanitized before storage to prevent exposure of:

Personally Identifiable Information (PII)
Secrets, tokens, or credentials
Other sensitive data that may be present in user prompts or agent reasoning
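
To give a sense of what "sanitized before storage" might involve, here is a deliberately simple redaction sketch. The patterns are illustrative assumptions and far from exhaustive, so a real implementation would need a vetted secret and PII detector. `sanitize_text` and `sanitize_log` are hypothetical helper names.

```ruby
# Illustrative patterns only. These are assumptions, not a complete detector.
REDACTION_PATTERNS = {
  "[REDACTED_TOKEN]" => /\bglpat-[A-Za-z0-9_-]{20,}/,
  "[REDACTED_EMAIL]" => /\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b/,
  "[REDACTED_SECRET]" => /\b(?:password|secret|api_key)\s*[:=]\s*\S+/i
}.freeze

# Replaces every match of each pattern with its placeholder.
def sanitize_text(text)
  REDACTION_PATTERNS.reduce(text) do |result, (placeholder, pattern)|
    result.gsub(pattern, placeholder)
  end
end

# Walks a nested structure (such as decisions_log) and sanitizes every string.
def sanitize_log(value)
  case value
  when String then sanitize_text(value)
  when Array then value.map { |v| sanitize_log(v) }
  when Hash then value.transform_values { |v| sanitize_log(v) }
  else value
  end
end
```

Running the same sanitizer over both `input_prompt` (a string) and `decisions_log` (nested JSON) keeps the two fields under one policy.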

Acceptance Criteria

MVP (P1)

Core principle: Who initiated | What was requested | What was created | When

Enhanced (P2)

Full agent plan and reasoning captured per session
User session history UI with detail view
Admin audit dashboard with filtering
GraphQL API for session queries (including user filtering)
Audit event streaming support for DAP events

Enterprise (P3)

SIEM integration patterns documented
Pre-built compliance report templates (SOC2, ISO27001)
Session comparison and diff views
Alerting on unusual DAP usage patterns
Configurable retention policy (beyond 1-year default)

Technical Considerations

Architecture Ownership

Component	Team	Implementation
FlowSession model	Agent Foundations	Rails
Audit events	Compliance (Govern)	Rails
Session UI	AI Framework	Vue.js
API endpoints	Agent Foundations	GraphQL/REST
AI Gateway logging	AI Framework	AI Gateway service

Data Storage

Session metadata: PostgreSQL (GitLab database)
Agent reasoning/plans: May require dedicated storage due to size
Analytics/aggregation: ClickHouse (existing AI metrics pattern)
Long-term retention: Consider archive strategy

Performance Considerations

Session creation should not impact flow latency
Audit events should be asynchronous (Sidekiq)
Admin queries may need pagination and caching
Consider GDPR implications for session data retention
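
The point about asynchronous audit events can be illustrated with a small producer/consumer sketch. Ruby's built-in `Queue` and a background thread stand in for Sidekiq here, so this shows the pattern rather than actual Sidekiq usage. `AsyncAuditLogger` is a hypothetical class.

```ruby
# Queue plus a worker thread stand in for Sidekiq in this sketch.
class AsyncAuditLogger
  attr_reader :written

  def initialize
    @queue = Queue.new
    @written = []
    @worker = Thread.new do
      # pop blocks until an event arrives; a nil sentinel ends the loop.
      while (event = @queue.pop)
        @written << event # stand-in for a database insert
      end
    end
  end

  # Enqueues the event and returns immediately, so the flow does not
  # wait on the audit write.
  def log(event)
    @queue << event
    nil
  end

  # Drains any queued events, then stops the worker.
  def shutdown
    @queue << nil
    @worker.join
  end
end
```

The flow pays only the cost of an enqueue. The tradeoff is that events still in the queue can be lost on a crash, which matters for an audit trail and would need a durable queue in practice.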

Self-Managed Requirements

All audit data must be stored locally
No external dependencies for compliance features
Admin configuration for retention periods
Export formats compatible with common SIEM tools
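
An admin-configurable retention period could be applied along these lines. This is a sketch under stated assumptions: `partition_by_retention` is a hypothetical helper, sessions are plain hashes with an ISO 8601 `completed_at`, and what happens to expired sessions (archive or delete) is left open.

```ruby
require "time"

SECONDS_PER_DAY = 86_400

# Splits sessions into [kept, expired] based on a retention window in days.
def partition_by_retention(sessions, retention_days:, now: Time.now.utc)
  cutoff = now - (retention_days * SECONDS_PER_DAY)
  sessions.partition { |s| Time.iso8601(s[:completed_at]) >= cutoff }
end
```

For example, `kept, expired = partition_by_retention(sessions, retention_days: 365)` would separate sessions inside a one-year window from those eligible for archiving.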

Related Issues & Epics

Issue #557042 (closed, 18.7) - Commit attribution
Epic #20119 - Composite Identity authorship attribution
Issue #553573 (closed) - Duo as team member vision (discusses tracking)
Issue #549846 (closed) - Usage billing reporting
Issue #499461 (closed) - API request audit events (referenced in docs)
Issue #431738 (closed) - Bot token audit concerns

Documentation References

Competitive Context

GitHub Copilot currently lacks comprehensive audit trail capabilities for AI-assisted changes, representing an opportunity for GitLab differentiation in regulated enterprise markets. Customer feedback indicates audit capabilities are a key selection criterion for financial services and healthcare organizations evaluating AI coding assistants.

Edited Jan 04, 2026 by 🤖 GitLab Bot 🤖
