Implement CodeSuggestions analytics source
Summary
This MR implements the CodeSuggestions analytics source for GLQL, enabling users to query GitLab Duo Code Suggestions usage data using natural language syntax. This builds on the analytics infrastructure from !347 (merged) by adding the first production analytics source.
What Changed
1. CodeSuggestions Source Implementation
Added new analytics-only source for Code Suggestions data:
- New source: CodeSuggestionsSourceAnalyzer in src/analyzer/sources/code_suggestions.rs
  - Analytics-only mode (no standard queries)
  - GraphQL path: analytics.duoCodeSuggestions
  - Requires glql_code_suggestion_analytics_aggregation feature flag
- SourceAnalyzer trait implementation:
  - field_category() - categorizes 4 dimensions and 7 metrics
  - analytics_dimension_key() - provides GraphQL field names for dimensions
  - supported_modes() - declares Analytics mode only
  - parent_scope() - returns "analytics" for nested GraphQL path
2. Field Definitions
4 Dimensions (groupable fields):
- language - Programming language
- ideName - IDE/editor name
- timestamp - Timestamp of the suggestion event
- user - User who received the suggestion
7 Metrics (aggregatable fields):
- totalCount - Total number of suggestions
- usersCount - Number of unique users
- acceptanceRate - Percentage of accepted suggestions
- suggestionSizeSum - Total size of all suggestions
- acceptedCount - Number of accepted suggestions
- rejectedCount - Number of rejected suggestions
- shownCount - Number of suggestions shown
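The categorization above can be pictured as a simple lookup. The following is a minimal Ruby sketch, not the actual implementation: the real logic lives in the Rust field_category() method, which is not shown in this description, and the constant names and the :unknown fallback here are assumptions made for illustration.

```ruby
# Hypothetical Ruby mirror of the field lists above. The real categorization
# is done by the Rust field_category() method; names here are illustrative.
DIMENSIONS = %w[language ideName timestamp user].freeze
METRICS = %w[
  totalCount usersCount acceptanceRate suggestionSizeSum
  acceptedCount rejectedCount shownCount
].freeze

# Returns :dimension, :metric, or :unknown (the fallback is an assumption).
def field_category(name)
  return :dimension if DIMENSIONS.include?(name)
  return :metric if METRICS.include?(name)

  :unknown
end

puts field_category('language')
puts field_category('acceptanceRate')
puts field_category('title')
```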
3. Filter Mappings
Custom GraphQL parameter mappings for backend compatibility:
- user → userId (with array wrapping: user = 123 → userId: [123])
- timestamp >= → timestampFrom
- timestamp <= → timestampTo
- Other filters use field name directly (language, ideName)
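To make the mapping rules above concrete, here is a small Ruby sketch. It assumes a filter can be represented as a (field, operator, value) triple; the graphql_argument helper is hypothetical and only illustrates the rules, it is not the code generator from this MR.

```ruby
# Hypothetical helper: maps one GLQL filter to a [graphql_name, value] pair,
# following the mapping rules listed above.
def graphql_argument(field, operator, value)
  case [field, operator]
  when ['user', '=']
    ['userId', [value]] # a single user is wrapped in an array
  when ['timestamp', '>=']
    ['timestampFrom', value]
  when ['timestamp', '<=']
    ['timestampTo', value]
  else
    [field, value] # e.g. language, ideName keep their own name
  end
end

filters = [
  ['language', '=', 'ruby'],
  ['user', '=', 123],
  ['timestamp', '>=', '2024-01-01']
]

arguments = filters.map { |field, op, value| graphql_argument(field, op, value) }.to_h
arguments.each { |name, value| puts "#{name}: #{value.inspect}" }
```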
Architecture
Query Flow
- Parser accepts CodeSuggestion type with analytics mode
- Analyzer validates feature flag and categorizes fields
- Validator ensures dimensions are valid for this source
- Code Generator produces nested GraphQL query with orderBy
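The checks in the flow above could be sketched roughly as follows. This is an illustrative Ruby approximation under stated assumptions: the function name, keyword arguments, and error messages are all hypothetical, and the real checks are split across the Rust parser, analyzer, and validator.

```ruby
# Hypothetical sketch of the checks a CodeSuggestion query passes through.
# Names and messages are illustrative, not the actual GLQL errors.
SUPPORTED_MODES = [:analytics].freeze
VALID_DIMENSIONS = %w[language ideName timestamp user].freeze

def validate_code_suggestions_query(mode:, dimensions:, feature_enabled:)
  errors = []
  errors << 'feature flag is disabled' unless feature_enabled
  errors << "mode #{mode} is not supported" unless SUPPORTED_MODES.include?(mode)

  invalid = dimensions - VALID_DIMENSIONS
  errors << "invalid dimensions: #{invalid.join(', ')}" unless invalid.empty?
  errors
end

puts validate_code_suggestions_query(
  mode: :standard, dimensions: %w[language title], feature_enabled: true
)
```

An empty array means the query would proceed to code generation in this sketch.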
Generated GraphQL Structure
query GLQL($before: String, $after: String, $limit: Int) {
  project(fullPath: "gitlab-org/gitlab") {
    analytics {
      duoCodeSuggestions(
        language: "ruby",
        userId: ["gid://gitlab/User/123"],
        timestampFrom: "2024-01-01 00:00"
      ) {
        aggregated(
          before: $before,
          after: $after,
          first: $limit,
          orderBy: [{ identifier: "acceptanceRate", direction: DESC }]
        ) {
          nodes {
            dimensions {
              language
              ideName
            }
            totalCount
            acceptanceRate
            acceptedCount
          }
          pageInfo {
            hasNextPage
            hasPreviousPage
            startCursor
            endCursor
          }
        }
      }
    }
  }
}
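As an illustration of how the orderBy argument in the query above relates to a GLQL sort clause such as acceptanceRate desc, here is a minimal Ruby sketch. The order_by_argument helper is hypothetical, and defaulting to ascending order when no direction is given is an assumption, not documented behavior.

```ruby
# Hypothetical helper: renders a GLQL sort clause ("field direction") as the
# orderBy argument shown in the generated query above.
def order_by_argument(sort)
  field, direction = sort.split
  direction = (direction || 'asc').upcase # default direction is an assumption
  raise ArgumentError, "unknown direction: #{direction}" unless %w[ASC DESC].include?(direction)

  %(orderBy: [{ identifier: "#{field}", direction: #{direction} }])
end

puts order_by_argument('acceptanceRate desc')
```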
Example Usage
Basic aggregation by language
mode: analytics
query: type = CodeSuggestion and timestamp >= -30d
dimensions: language
metrics: totalCount, acceptanceRate
sort: totalCount desc
User-specific analysis
mode: analytics
query: type = CodeSuggestion and user = @rob.hunt and timestamp >= -7d
dimensions: language, ideName
metrics: totalCount, acceptedCount, rejectedCount, acceptanceRate
sort: acceptanceRate desc
limit: 10
Multi-language comparison
mode: analytics
query: type = CodeSuggestion and language in ("ruby", "javascript", "python") and timestamp >= "2024-01-01"
dimensions: language, ideName
metrics: totalCount, usersCount, acceptanceRate
sort: acceptanceRate desc
Testing
This MR can be tested in two ways:
- Standalone GLQL Testing - Verify query compilation without backend integration
- End-to-End Testing - Full integration testing with MR gitlab!228129 (closed)
Most reviewers should use the first approach (standalone testing) for code review. The second approach (end-to-end testing) provides comprehensive validation but requires additional setup.
Standalone GLQL testing
Note: For my own testing I have been using a script to monitor compilation and transformation across the different modes. Feel free to use it too: test_analytics.rb. Copy it to the ./glql_rb directory and run it with bundle exec ruby test_analytics.rb.
- Build the ruby extension:
    cd glql_rb
    bundle install
    bundle exec rake compile
- Use a test script to test the output:
    ruby -I lib -r gitlab_query_language -e "
      query = 'type = CodeSuggestion and language = \"ruby\"'
      config = {
        project: 'gitlab-org/gitlab',
        mode: 'analytics',
        dimensions: 'language, ideName',
        metrics: 'totalCount, acceptanceRate',
        featureFlags: { glqlCodeSuggestions: true }
      }
      result = Glql.compile(query, config)
      puts '=== Compilation Success: ' + result['success'].to_s
      puts '=== Field Count: ' + result['fields'].length.to_s
      puts ''
      puts '=== Generated GraphQL (excerpt):'
      puts result['output'].lines[0..15].join
    "
- Test different query patterns:
  Multi-language comparison:
    ruby -I lib -r gitlab_query_language -e "
      result = Glql.compile(
        'type = CodeSuggestion and language in (\"ruby\", \"javascript\", \"python\")',
        {
          project: 'gitlab-org/gitlab',
          mode: 'analytics',
          dimensions: 'language, ideName',
          metrics: 'totalCount, acceptanceRate, acceptedCount',
          featureFlags: { glqlCodeSuggestions: true }
        }
      )
      puts 'Success: ' + result['success'].to_s
      puts 'Contains language filter: ' + result['output'].include?('language:').to_s
    "
  User-specific analysis:
    ruby -I lib -r gitlab_query_language -e "
      result = Glql.compile(
        'type = CodeSuggestion and user = 123',
        {
          project: 'gitlab-org/gitlab',
          mode: 'analytics',
          dimensions: 'language',
          metrics: 'totalCount, usersCount',
          featureFlags: { glqlCodeSuggestions: true }
        }
      )
      puts 'Success: ' + result['success'].to_s
      puts 'Contains userId: ' + result['output'].include?('userId:').to_s
    "
  Time range filter:
    ruby -I lib -r gitlab_query_language -e "
      result = Glql.compile(
        'type = CodeSuggestion and timestamp >= \"2024-01-01\" and timestamp <= \"2024-12-31\"',
        {
          project: 'gitlab-org/gitlab',
          mode: 'analytics',
          dimensions: 'language',
          metrics: 'totalCount',
          featureFlags: { glqlCodeSuggestions: true }
        }
      )
      puts 'Success: ' + result['success'].to_s
      puts 'Contains timestampFrom: ' + result['output'].include?('timestampFrom:').to_s
      puts 'Contains timestampTo: ' + result['output'].include?('timestampTo:').to_s
    "
- Test standard mode is rejected by CodeSuggestion:
    ruby -I lib -r gitlab_query_language -e "
      result = Glql.compile(
        'type = CodeSuggestion and language = \"ruby\"',
        {
          project: 'gitlab-org/gitlab',
          fields: 'language, totalCount',
          featureFlags: { glqlCodeSuggestions: true }
        }
      )
      if result['success']
        puts 'ERROR: Should have failed but succeeded'
      else
        puts 'Correctly rejected standard mode'
        puts 'Error message: ' + result['output']
      end
    "
- Test feature flag disables CodeSuggestion:
    ruby -I lib -r gitlab_query_language -e "
      result = Glql.compile(
        'type = CodeSuggestion and language in (\"ruby\", \"javascript\", \"python\")',
        {
          project: 'gitlab-org/gitlab',
          mode: 'analytics',
          dimensions: 'language, ideName',
          metrics: 'totalCount, acceptanceRate, acceptedCount',
          featureFlags: { glqlCodeSuggestions: false }
        }
      )
      if result['success']
        puts 'ERROR: Should have failed but succeeded'
      else
        puts 'Correctly rejected due to feature flag'
        puts 'Error message: ' + result['output']
      end
    "
- Test transformation step:
    ruby -I lib -r gitlab_query_language -e "
      # Mock GraphQL response from CodeSuggestions analytics API
      response = {
        project: {
          analytics: {
            duoCodeSuggestions: {
              aggregated: {
                nodes: [
                  {
                    dimensions: { language: 'ruby', ideName: 'vscode' },
                    totalCount: 150,
                    acceptanceRate: 0.75
                  }
                ],
                pageInfo: { hasNextPage: true, hasPreviousPage: false }
              }
            }
          }
        }
      }
      result = Glql.transform(
        response,
        {
          mode: 'analytics',
          fields: [
            { name: 'language', type: 'dimension' },
            { name: 'ideName', type: 'dimension' },
            { name: 'totalCount', type: 'metric' },
            { name: 'acceptanceRate', type: 'metric' }
          ]
        }
      )
      node = result['data']['nodes'][0]
      puts 'Transform Success: ' + result['success'].to_s
      puts 'Dimensions Flattened: ' + (!node.key?('dimensions')).to_s
      puts 'Language: ' + node['language']
      puts 'IDE: ' + node['ideName']
      puts 'Total Count: ' + node['totalCount'].to_s
      puts 'Acceptance Rate: ' + node['acceptanceRate'].to_s
    "
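The transformation test above checks that the nested dimensions object is flattened into each node. The internals of Glql.transform are not shown in this description, so the following is only a plain-Ruby sketch of that flattening idea; flatten_node is a hypothetical helper and does not call the GLQL library.

```ruby
# Hypothetical sketch of the flattening the transformation test checks for:
# keys under 'dimensions' are lifted to the top level of the node.
def flatten_node(node)
  dimensions = node.fetch('dimensions', {})
  metrics = node.reject { |key, _| key == 'dimensions' }
  dimensions.merge(metrics)
end

node = {
  'dimensions' => { 'language' => 'ruby', 'ideName' => 'vscode' },
  'totalCount' => 150,
  'acceptanceRate' => 0.75
}

flat = flatten_node(node)
puts flat.key?('dimensions')
puts flat['language']
puts flat['totalCount']
```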
End-to-end GLQL testing
Follow the testing instructions in Add CodeSuggestions support to GLQL in GitLab UI (gitlab!228129 - closed)
Related
- Depends on: !347 (merged) (analytics infrastructure)
- Backend GraphQL API: gitlab!226274 (merged) (expose-code-suggestions-ae-to-graphql)
- Feature flag: glql_code_suggestion_analytics_aggregation (backend-controlled)
- Next steps: !349 (closed) (timeSegment function support)
Related to #95 (closed)