CI/CD analytics: Failure rate and Success rate denominator incorrectly includes canceled/skipped pipelines and jobs

Summary

On the project CI/CD analytics page (<project>/-/pipelines/charts), the Failure rate and Success rate in both the Pipelines KPI strip and the Jobs panel use a denominator that includes canceled and skipped rows. This silently dilutes both rates whenever a project has non-trivial canceled/skipped volume.

A real-world example from gitlab-org/gitlab: the KPI strip displays Failure rate 7% and Success rate 91%. The two values should sum to ~100% (±1% due to integer-percent rounding) but instead sum to 98%. The missing 2% is the canceled/skipped slice being absorbed into the denominator rather than counted as its own bucket. The Status chart on the same page already exposes this slice as a separate "Other (Cancelled, Skipped)" series, so the inconsistency is user-visible.

What

Both rate calculations on <project>/-/pipelines/charts currently use:

rate = count(status) / total_count

where total_count includes success + failed + canceled + skipped. The intended behavior is:

rate = count(status) / (count(success) + count(failed))

Canceled is a deliberate user action and skipped is a config/dependency artifact. Neither indicates job or pipeline health, so neither should depress the rates.
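To make the difference concrete, here is a small sketch comparing the two denominators. The counts are hypothetical, chosen only so that the current formula reproduces the 7% / 91% pair from the Summary; they are not real gitlab-org/gitlab data, and the function names are illustrative.

```javascript
// Hypothetical counts (not real project data), picked to reproduce 7% / 91%.
const counts = { success: 910, failed: 70, canceled: 15, skipped: 5 };

// Current behavior: the denominator is every finished pipeline.
function currentRate(status, c) {
  const total = c.success + c.failed + c.canceled + c.skipped;
  return Math.round((c[status] / total) * 100);
}

// Proposed behavior: the denominator is success + failed only.
// Returns null when there is nothing to divide by.
function proposedRate(status, c) {
  const decided = c.success + c.failed;
  return decided === 0 ? null : Math.round((c[status] / decided) * 100);
}

console.log(currentRate('failed', counts), currentRate('success', counts));
console.log(proposedRate('failed', counts), proposedRate('success', counts));
```

With these assumed counts the current formula yields rates that sum to 98, while the proposed formula yields rates that sum to 100.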

Affected surfaces:

| Surface | Component | Calculation layer |
| --- | --- | --- |
| Pipelines KPI strip: Failure rate / Success rate | app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/pipelines_stats.vue:57,63 | Frontend (failedCount/count, successCount/count) |
| Jobs panel table (EE): failedRate / successRate columns + 2 / N tooltip | ee/app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/job_analytics_table.vue:241-246 and the server-side aggregation in lib/click_house/finders/ci/concerns/finished_builds_aggregations.rb:121-133 | Backend (ClickHouse aggregation), query_builder.count denominator |

Not affected:

  • Duration chart (p50/p95 only — no rate math).
  • Status chart (raw stacked counts with "Other" as its own bar).

How (proposed fix)

1. Backend — Jobs panel rate aggregation

File: lib/click_house/finders/ci/concerns/finished_builds_aggregations.rb (used by both FinishedBuildsFinder and FinishedBuildsDeduplicatedFinder).

Change build_rate_aggregate so the denominator is countIf(success) + countIf(failed):

def build_rate_aggregate(status)
  numerator   = build_count_aggregate(status)
  denominator = query_builder.add(           # confirm helper name
    build_count_aggregate('success'),
    build_count_aggregate('failed')
  )
  safe_denominator = query_builder.named_func('nullIf', [denominator, 0])
  percentage = query_builder.division(numerator, safe_denominator)
  percentage_value = query_builder.multiply(percentage, 100)
  round(percentage_value).as("rate_of_#{status}")
end

Notes:

  • Confirm the QueryBuilder addition helper name in lib/click_house/client/query_builder.rb.
  • nullIf(denominator, 0) guards against 0/0 when a job's only finished outcomes are canceled/skipped — return null so the UI renders -.
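The effect of the nullIf guard can be illustrated outside ClickHouse. The sketch below mirrors the intended semantics in JavaScript; rateOrNull and renderRate are hypothetical names, and how the real UI handles a null rate is an assumption to verify.

```javascript
// Mirrors nullIf(denominator, 0): a zero denominator yields null
// instead of NaN or Infinity.
function rateOrNull(statusCount, successCount, failedCount) {
  const denominator = successCount + failedCount;
  if (denominator === 0) return null; // only canceled/skipped outcomes
  return Math.round((statusCount / denominator) * 100);
}

// Hypothetical renderer: shows "-" for a null rate.
function renderRate(rate) {
  return rate === null ? '-' : `${rate}%`;
}

console.log(renderRate(rateOrNull(0, 0, 0)));
console.log(renderRate(rateOrNull(1, 3, 1)));
```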

2. Frontend — Pipelines KPI strip

File: app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/pipelines_stats.vue. The GraphQL query already returns successCount and failedCount. Compute the denominator client-side:

const rateDenominator = (BigInt(successCount ?? 0) + BigInt(failedCount ?? 0)).toString();
// ...
value: formatPipelineCountPercentage(failedCount, rateDenominator),
value: formatPipelineCountPercentage(successCount, rateDenominator),

The Total pipeline runs KPI continues to use count, the true unfiltered total, which is correct as-is.
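A minimal sketch of the client-side calculation, assuming the counts arrive as BigInt-compatible strings (or null). percentOf is a hypothetical stand-in for formatPipelineCountPercentage, whose actual signature and rounding are not shown here and should be checked against the real helper.

```javascript
// Denominator for both rates: success + failed, returned as a string.
function rateDenominator(successCount, failedCount) {
  return (BigInt(successCount ?? 0) + BigInt(failedCount ?? 0)).toString();
}

// Hypothetical stand-in for formatPipelineCountPercentage.
// Rounds to the nearest integer percent using BigInt arithmetic only.
function percentOf(count, denominator) {
  const d = BigInt(denominator ?? 0);
  if (d === 0n) return '-'; // nothing succeeded or failed
  const n = BigInt(count ?? 0);
  // Adding d / 2 before the integer division rounds to nearest.
  return `${(n * 100n + d / 2n) / d}%`;
}

// Hypothetical counts, not real project data.
const denominator = rateDenominator('910', '70');
console.log(percentOf('70', denominator), percentOf('910', denominator));
```

Keeping the arithmetic in BigInt avoids precision loss if the counts ever exceed Number.MAX_SAFE_INTEGER, which is presumably why the proposed change converts through BigInt in the first place.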

3. Frontend — Jobs panel tooltip

File: ee/app/assets/javascripts/ci/analytics/project_ci_cd_analytics/components/job_analytics_table.vue:241-246. Change the tooltip from formatCount(item.failedCount, item.count) to use BigInt(item.successCount) + BigInt(item.failedCount) as the denominator, so that the displayed fraction N / M matches the displayed rate.
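As a sketch, the tooltip fraction could be built like this. tooltipFraction is a hypothetical helper, and the item shape (string counts named successCount and failedCount) is assumed from the field names above; the existing formatCount helper may format the values differently.

```javascript
// Builds the "N / M" tooltip text where M = success + failed,
// so the fraction agrees with the displayed rate.
function tooltipFraction(statusCount, item) {
  const decided = BigInt(item.successCount ?? 0) + BigInt(item.failedCount ?? 0);
  return `${BigInt(statusCount ?? 0)} / ${decided}`;
}

// Hypothetical row: 8 succeeded, 2 failed, 2 canceled or skipped.
const item = { successCount: '8', failedCount: '2', count: '12' };
console.log(tooltipFraction(item.failedCount, item)); // "2 / 10"
```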
