CI/CD analytics: Failure rate and Success rate denominator incorrectly includes canceled/skipped pipelines and jobs
Summary
On the project CI/CD analytics page (<project>/-/pipelines/charts), the Failure rate and Success rate in both the Pipelines KPI strip and the Jobs panel use a denominator that includes canceled and skipped rows. This silently dilutes both rates whenever a project has non-trivial canceled/skipped volume.
A real-world example from gitlab-org/gitlab: the KPI strip displays Failure rate 7% and Success rate 91%. The two values should sum to ~100% (±1% due to integer-percent rounding) but instead sum to 98%. The missing 2% is the canceled/skipped slice being absorbed into the denominator rather than counted as its own bucket. The Status chart on the same page already exposes this slice as a separate "Other (Cancelled, Skipped)" series, so the inconsistency is user-visible.
What
Both rate calculations on <project>/-/pipelines/charts currently use:

    rate = count(status) / total_count

where total_count includes success + failed + canceled + skipped. The intended behavior is:

    rate = count(status) / (count(success) + count(failed))

Canceled is a deliberate user action and skipped is a config/dependency artifact. Neither indicates job/pipeline health, so neither should depress the rates.
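The difference between the two denominators can be checked with a few lines of arithmetic. The counts below are hypothetical, chosen only so that the current formula reproduces the 7% / 91% display described in the summary; they are not real gitlab-org/gitlab data, and `Math.round` is an assumption about how the UI rounds.

```javascript
// Hypothetical counts (not real project data), picked to reproduce 7% / 91%.
const counts = { success: 910, failed: 70, canceled: 15, skipped: 5 };

// Current denominator: every finished row, including canceled and skipped.
const totalCount = counts.success + counts.failed + counts.canceled + counts.skipped;

// Proposed denominator: only rows with a success or failed outcome.
const outcomeCount = counts.success + counts.failed;

// Integer-percent rounding; the real UI may round differently (assumption).
const pct = (numerator, denominator) => Math.round((numerator / denominator) * 100);

const currentRates = {
  failure: pct(counts.failed, totalCount),
  success: pct(counts.success, totalCount),
};

const proposedRates = {
  failure: pct(counts.failed, outcomeCount),
  success: pct(counts.success, outcomeCount),
};

console.log(currentRates, proposedRates);
```

With these counts the current formula gives rates that sum to 98, while the proposed formula gives rates that sum to 100.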
Affected surfaces:
| Surface | Component | Calculation layer |
|---|---|---|
| Pipelines KPI strip: Failure rate / Success rate | app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/pipelines_stats.vue:57,63 | Frontend (failedCount/count, successCount/count) |
| Jobs panel table (EE): failedRate / successRate columns + 2 / N tooltip | ee/app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/job_analytics_table.vue:241-246 and the server-side aggregation in lib/click_house/finders/ci/concerns/finished_builds_aggregations.rb:121-133 | Backend (ClickHouse aggregation), query_builder.count denominator |

Not affected:
- Duration chart (p50/p95 only — no rate math).
- Status chart (raw stacked counts with "Other" as its own bar).
How (proposed fix)
1. Backend — Jobs panel rate aggregation
File: lib/click_house/finders/ci/concerns/finished_builds_aggregations.rb (used by both FinishedBuildsFinder and FinishedBuildsDeduplicatedFinder).
Change build_rate_aggregate so the denominator is countIf(success) + countIf(failed):

    def build_rate_aggregate(status)
      numerator = build_count_aggregate(status)
      denominator = query_builder.add( # confirm helper name
        build_count_aggregate('success'),
        build_count_aggregate('failed')
      )
      safe_denominator = query_builder.named_func('nullIf', [denominator, 0])
      percentage = query_builder.division(numerator, safe_denominator)
      percentage_value = query_builder.multiply(percentage, 100)
      round(percentage_value).as("rate_of_#{status}")
    end

Notes:
- Confirm the QueryBuilder addition helper name in lib/click_house/client/query_builder.rb.
- nullIf(denominator, 0) guards against 0/0 when a job's only finished outcomes are canceled/skipped; return null so the UI renders -.
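As a reference for test expectations, the proposed aggregation can be modelled in plain JavaScript. This is only a sketch of the intended semantics, not the ClickHouse query: `rateOfStatus` and `renderRate` are hypothetical names, the row shape is assumed, and `Math.round` may round exact halves differently from ClickHouse's `round`.

```javascript
// Plain-JS model of the proposed per-job rate; the real work happens in ClickHouse.
// `rows` is an assumed shape: one object per finished build with a `status` field.
function rateOfStatus(rows, status) {
  const countIf = (wanted) => rows.filter((row) => row.status === wanted).length;

  // Proposed denominator: countIf(success) + countIf(failed).
  const denominator = countIf('success') + countIf('failed');

  // Mirrors nullIf(denominator, 0): with no success/failed rows there is no rate.
  if (denominator === 0) return null;

  return Math.round((countIf(status) / denominator) * 100);
}

// Hypothetical display helper: a null rate is shown as "-".
const renderRate = (rate) => (rate === null ? '-' : `${rate}%`);

// Hypothetical job history: 6 success, 2 failed, 2 canceled.
const sampleRows = [
  ...Array(6).fill({ status: 'success' }),
  ...Array(2).fill({ status: 'failed' }),
  ...Array(2).fill({ status: 'canceled' }),
];

console.log(renderRate(rateOfStatus(sampleRows, 'failed')));
console.log(renderRate(rateOfStatus([{ status: 'skipped' }], 'failed')));
```

For the sample job, the failed rate is computed over 8 rows rather than 10, and a job with only canceled/skipped rows yields no rate instead of a division by zero.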
2. Frontend — Pipelines KPI strip
File: app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/pipelines_stats.vue. The GraphQL query already returns successCount and failedCount. Compute the denominator client-side:

    const rateDenominator = (BigInt(successCount ?? 0) + BigInt(failedCount ?? 0)).toString();
    // ...
    value: formatPipelineCountPercentage(failedCount, rateDenominator),
    value: formatPipelineCountPercentage(successCount, rateDenominator),

The Total pipeline runs KPI continues to use count (the true unfiltered total, which is correct as-is).
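A self-contained sketch of the client-side computation follows. It assumes the counts may arrive as strings or null (which is why BigInt is used), and `percentOf` is a hypothetical stand-in for formatPipelineCountPercentage, whose real signature, rounding and zero handling may differ.

```javascript
// Hypothetical stand-in for formatPipelineCountPercentage; not the real helper.
function percentOf(count, denominator) {
  const num = BigInt(count ?? 0);
  const den = BigInt(denominator ?? 0);

  // No success/failed pipelines: there is no meaningful rate to show.
  if (den === 0n) return '-';

  // Integer arithmetic with round-half-up, avoiding float precision loss.
  const percent = (num * 100n + den / 2n) / den;
  return `${percent}%`;
}

// Builds both KPI values from the counts the GraphQL query already returns.
function buildRateStats({ successCount, failedCount }) {
  const rateDenominator = (BigInt(successCount ?? 0) + BigInt(failedCount ?? 0)).toString();

  return {
    failureRate: percentOf(failedCount, rateDenominator),
    successRate: percentOf(successCount, rateDenominator),
  };
}

// Hypothetical counts, passed as strings as an assumption about the API payload.
console.log(buildRateStats({ successCount: '910', failedCount: '70' }));
```

Keeping the denominator as a string matches the snippet above, so the same formatter call sites can be reused unchanged.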
3. Frontend — Jobs panel tooltip
File: ee/app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/job_analytics_table.vue:241-246. Change the tooltip from formatCount(item.failedCount, item.count) to use BigInt(item.successCount) + BigInt(item.failedCount) as the denominator, so that the displayed fraction N / M matches the displayed rate.
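One possible shape for the tooltip text is sketched below. `failedFractionTooltip` is a hypothetical helper, the "N / M" output format is assumed from the description above, and the real component would presumably keep going through formatCount.

```javascript
// Hypothetical helper; field names follow the issue text, the function does not
// exist in the codebase.
function failedFractionTooltip(item) {
  const failed = BigInt(item.failedCount ?? 0);

  // Denominator excludes canceled/skipped, matching the proposed rate.
  const denominator = BigInt(item.successCount ?? 0) + failed;

  return `${failed} / ${denominator}`;
}

// Hypothetical row: 10 finished builds, of which 2 were canceled or skipped.
console.log(failedFractionTooltip({ failedCount: '2', successCount: '6', count: '10' }));
```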