[Fleet Visibility] Create ClickHouse materialized view for CI job performance metrics
Summary
Create a ClickHouse materialized view to aggregate CI job performance metrics including P50/P95 durations and failure rates.
Background
Part of the implementation for &18548 to add CI/CD job metrics to the CI/CD Analytics View.
Implementation Details
Create the following materialized view:
CREATE MATERIALIZED VIEW gitlab_clickhouse_development.ci_job_performance_daily_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (project_id, source, ref, name, stage_id, date)
AS
SELECT
toDate(b.finished_at) AS date,
b.project_id,
b.stage_id,
b.name,
p.source,
p.ref,
quantilesState(0.5, 0.95)(b.duration) AS duration_quantiles,
countState() AS total_builds,
countStateIf(b.status = 'failed') AS failed_builds
FROM gitlab_clickhouse_development.ci_finished_builds b
INNER JOIN gitlab_clickhouse_development.ci_finished_pipelines p
ON b.pipeline_id = p.id
WHERE b.finished_at > 0 -- Ensure we have valid finished times
GROUP BY date, b.project_id, b.stage_id, b.name, p.source, p.ref;
Key Design Decisions
- Daily granularity for flexible time window queries (7 days, 30 days, 90 days, 180 days)
- AggregatingMergeTree engine to properly handle State functions
- Partitioned by month for efficient data management and potential TTL
- ORDER BY optimized for filtering by project, source, ref, and searching by job name
-
Pre-aggregated percentiles using ClickHouse's
quantilesStatefor accuracy -
Includes source and ref from
ci_finished_pipelinestable via JOIN for filtering
Acceptance Criteria
-
Materialized view is created in production ClickHouse -
View supports queries for different time windows -
Performance is acceptable for typical query patterns -
Migration is reversible
Testing
Test query to validate the MV:
SELECT
project_id,
stage_id,
name,
source,
ref,
countMerge(total_builds) AS total_builds,
countMerge(failed_builds) AS failed_builds,
quantilesMerge(0.5, 0.95)(duration_quantiles) AS duration_percentiles,
duration_percentiles[1] AS p50_duration,
duration_percentiles[2] AS p95_duration,
if(total_builds > 0, failed_builds / total_builds, 0) AS failure_rate
FROM gitlab_clickhouse_development.ci_job_performance_daily_mv
WHERE date >= today() - INTERVAL 30 DAY
AND project_id = 19
GROUP BY project_id, stage_id, name, source, ref
ORDER BY p50_duration DESC;
Note: This issue was created by Claude based on implementation discussion in epic &18548
Edited by 🤖 GitLab Bot 🤖