Add backend and GLQL pipeline support for CodeSuggestion analytics
What does this MR do and why?
Enables CodeSuggestion analytics queries through the GLQL pipeline. This is the first MR in a stack that adds CodeSuggestion analytics support to the GitLab UI. Note that this MR doesn't update the UI to use the correct presenters; it only sets up the data flow.
The feature flag and API logic live in FOSS, gated behind the glql_code_suggestion_analytics_aggregation feature flag. The request spec suite is split: the CE spec verifies code suggestions are blocked when the flag is off (default), and the EE spec verifies the full analytics flow when enabled. The ClickHouse-backed system test for the embedded GLQL view lives in an EE shared example.
What Changed
GLQL API
- Added `code_suggestions_enabled?` helper to `lib/api/glql.rb` with nil-safe project/group path lookups for feature flag evaluation
- Updated `get_compile_context` to pass analytics parameters (mode, dimensions, metrics) to the GLQL compiler
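As a rough illustration of what a nil-safe path lookup for flag evaluation can look like, here is a minimal Ruby sketch. It is not the code from `lib/api/glql.rb`: `SketchProject`, `SketchGroup`, `ToyFeature`, and `analytics_flag_enabled?` are hypothetical stand-ins invented for this example, and the real helper may resolve actors differently.

```ruby
# Hypothetical stand-ins for a project and a group; only the path matters here.
SketchProject = Struct.new(:full_path)
SketchGroup = Struct.new(:full_path)

# Toy flag store (an assumption for this sketch, not GitLab's Feature API).
class ToyFeature
  def initialize(enabled_paths)
    @enabled_paths = enabled_paths
  end

  # A nil actor path is treated as "flag off" rather than raising.
  def enabled?(actor_path)
    !actor_path.nil? && @enabled_paths.include?(actor_path)
  end
end

# Safe navigation (&.) yields nil when no project or group was resolved,
# so the lookup never calls a method on nil.
def analytics_flag_enabled?(feature, project: nil, group: nil)
  path = project&.full_path || group&.full_path
  feature.enabled?(path)
end

feature = ToyFeature.new(["toolbox"])
puts analytics_flag_enabled?(feature, group: SketchGroup.new("toolbox")) # prints true
puts analytics_flag_enabled?(feature)                                    # prints false
```

The point of the sketch is only that a request with neither a project nor a group falls back to "disabled" instead of failing with a `NoMethodError`.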
Feature flag
- Created `glql_code_suggestion_analytics_aggregation` feature flag gating access to CodeSuggestion analytics queries
- Pushed feature flag to frontend via `gon_helper`
Frontend pipeline changes
- Updated GLQL parser to pass mode and fields through compilation
- Updated transformer to pass mode through to GLQL library
- Removed `DEFAULT_DISPLAY_FIELDS` constant (analytics queries have no default fields)
- Passed mode through resolver to DataPresenter for display routing
Test structure
- CE request spec (`spec/requests/api/glql_spec.rb`) includes a test verifying code suggestions returns an error when the feature flag is off
- EE request spec (`ee/spec/requests/api/glql_spec.rb`) tests the full code suggestions analytics flow: feature flag enabled/disabled, project-scoped, and group-scoped
- CE shared example (`spec/support/shared_examples/features/glql/`) tests standard GLQL embedded views
- EE shared example (`ee/spec/support/shared_examples/features/glql/`) tests CodeSuggestions analytics with ClickHouse, included via `it_behaves_like 'embedded views (GLQL) EE'`
- Added frontend specs for parser, transformer, and resolver changes
CI changes: ClickHouse system test support
Previously, the ClickHouse CI infrastructure only supported unit and integration tests (via --tag click_house). System tests tagged with :click_house would fall through the cracks -- excluded from regular system jobs (which use --tag ~click_house) and not discovered by ClickHouse jobs (which only searched unit/integration directories).
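To make the gap concrete, here is a small Ruby sketch that models the two selection rules described above. The spec paths and the `SketchSpec` struct are made up for illustration; only the tag and directory rules come from the description.

```ruby
# Each spec has a path, a test level, and a list of RSpec tags (all hypothetical).
SketchSpec = Struct.new(:path, :level, :tags)

specs = [
  SketchSpec.new("spec/models/a_spec.rb", :unit, []),
  SketchSpec.new("spec/requests/b_spec.rb", :integration, [:click_house]),
  SketchSpec.new("spec/features/c_spec.rb", :system, [:click_house]),
  SketchSpec.new("spec/features/d_spec.rb", :system, [])
]

# Regular system jobs: system-level specs, run with `--tag ~click_house`.
regular_system = specs.select do |s|
  s.level == :system && !s.tags.include?(:click_house)
end

# ClickHouse jobs before this MR: `--tag click_house`, but only
# unit/integration directories were searched.
clickhouse_jobs = specs.select do |s|
  %i[unit integration].include?(s.level) && s.tags.include?(:click_house)
end

# Specs that are system-level or ClickHouse-tagged but picked up by neither job.
candidates = specs.select { |s| s.level == :system || s.tags.include?(:click_house) }
uncovered = candidates - regular_system - clickhouse_jobs

puts uncovered.map(&:path).inspect # prints ["spec/features/c_spec.rb"]
```

In this toy model, the only spec left over is the system test tagged `:click_house`.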
This MR adds CI support for ClickHouse system tests:
- New jobs: `rspec system clickhouse25` (FOSS) and `rspec-ee system clickhouse25` (EE), using medium runners and the ClickHouse 25 service container
- New rules: `.rails:rules:clickhouse-system-changes` and `.rails:rules:ee-only-clickhouse-system-changes`, gated at tier-3+ (consistent with existing ClickHouse unit tests), with medium runner availability checks
- New artifact collector: `rspec:artifact-collector part-c` for FOSS results; EE results collected by the existing `rspec:artifact-collector ee remainder`
- Trigger: Runs on backend changes at tier-3+, or via the `pipeline:run-all-rspec` label / `$ENABLE_RSPEC_SYSTEM` CI variable
Example job run: https://gitlab.com/gitlab-org/gitlab/-/jobs/14145080222#L631
Reference MRs for existing ClickHouse CI:
- !124706 (merged) -- original ClickHouse CI setup
- !140792 (merged) -- streamlined ClickHouse rspec jobs
- !141248 (merged) -- classified ClickHouse tests as unit-level for test discovery
- !191454 (merged) -- added multi-version ClickHouse testing
MR Stack
- This MR — Backend + GLQL pipeline
- !233397 (merged) — Analytics presenters
- !233398 (merged) — ListBasePresenter extraction + ListDimensionsPresenter
References
- #592262
- !228129 (closed)
- GLQL MR !347 (Analytics Infrastructure)
- GLQL MR !348 (CodeSuggestions Source)
- Work Item 592262
- ClickHouse Setup Documentation
- GLQL User Documentation
Screenshots or screen recordings
| Scenario | GLQL query | Output |
|---|---|---|
| Every metric and dimension | (screenshot) | (screenshot) |
| In the Data Analyst Agent | Requires a lot of cajoling until Data Analyst agent integration for CodeSuggesti... (#592264) is complete | (screenshot) |
| With feature flag disabled | (screenshot) | (screenshot) |
| Trying to use it outside analytics mode | (screenshot) | (screenshot) |
| Trying to use fields | (screenshot) | (screenshot) |
| Trying to use dimensions or metrics in standard mode | (screenshot) | (screenshot) |
How to set up and validate locally
Prerequisites
- Set up ClickHouse
- Make sure to run migrations: `bundle exec rake gitlab:clickhouse:migrate`
- Enable ClickHouse analytics: `echo "Gitlab::CurrentSettings.current_application_settings.update(use_clickhouse_for_analytics: true)" | rails c`
- Seed test data: `FILTER=ai_usage_stats bundle exec rake db:seed_fu`
UI Testing
- Navigate to an issue or merge request in a seeded project (by default, these are in the `toolbox` group), and go to a Markdown-enabled field (a comment or description, for instance).
- Test that without the feature flag enabled, the `CodeSuggestion` type returns an error:

      mode: analytics
      query: type = CodeSuggestion and timestamp >= -30d
      dimensions: language
      metrics: totalCount, acceptanceRate
      sort: totalCount desc
      limit: 10

- Enable the feature flag: `echo "Feature.enable(:glql_code_suggestion_analytics_aggregation)" | rails c`
- Refresh the page and test that the query no longer errors and returns results.
- Now test other variants, such as:

  IDE comparison by language:

      mode: analytics
      query: type = CodeSuggestion and language in ("ruby", "javascript", "python")
      dimensions: language, ideName
      metrics: totalCount, acceptedCount, rejectedCount, acceptanceRate
      sort: acceptanceRate desc

  Recent activity:

      mode: analytics
      query: type = CodeSuggestion and timestamp >= -7d
      dimensions: language
      metrics: totalCount, usersCount, suggestionSizeSum
      sort: usersCount desc
      limit: 5

  User-specific:

      mode: analytics
      query: type = CodeSuggestion and user = 2 and timestamp >= -14d
      dimensions: language, ideName
      metrics: totalCount, acceptanceRate, acceptedCount
      sort: totalCount desc
REST API endpoint
Test the /api/v4/glql REST API endpoint:
# Get a personal access token from GitLab UI: User Settings > Access Tokens
curl -H "PRIVATE-TOKEN: your_token" \
-H "Content-Type: application/json" \
-d '{
"glql_yaml": "query: type = CodeSuggestion and language = \"ruby\"\nmode: \"analytics\"\ndimensions: \"language\"\nmetrics: \"totalCount, acceptanceRate\"\nproject: \"toolbox/gitlab-smoke-tests\""
}' \
http://gdk.test:8080/api/v4/glql 
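Hand-escaping the quotes and `\n` sequences in `glql_yaml` is easy to get wrong. As a convenience, the request body can be generated instead. The following Ruby sketch builds the same payload as the curl example; the helper name `glql_request_body` is hypothetical, and the script only prints the JSON, it does not send the request.

```ruby
require "json"

# Joins "key: value" pairs with newlines and wraps them in the JSON body
# expected by the endpoint. Values are passed through verbatim, so any
# quoting that should appear in the YAML must be included in the value.
def glql_request_body(fields)
  yaml = fields.map { |key, value| "#{key}: #{value}" }.join("\n")
  JSON.generate("glql_yaml" => yaml)
end

# Same fields as the curl example.
body = glql_request_body(
  "query" => 'type = CodeSuggestion and language = "ruby"',
  "mode" => '"analytics"',
  "dimensions" => '"language"',
  "metrics" => '"totalCount, acceptanceRate"',
  "project" => '"toolbox/gitlab-smoke-tests"'
)

puts body
```

The printed string can be saved to a file and passed to curl with `-d @file.json`, which avoids shell quoting issues.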










