Adding runner job failed/success metrics with more details
What does this MR do?
This MR adds a new Prometheus metric for jobs that the runner completed successfully.
Example of the new success metric:
# HELP gitlab_runner_succeeded_jobs_total Total number of succeeded jobs
# TYPE gitlab_runner_succeeded_jobs_total counter
gitlab_runner_succeeded_jobs_total{job_result="success",runner="7ywyWRnr"} 2
It also changes the failed job metric to:
# HELP gitlab_runner_failed_jobs_total Total number of failed jobs
# TYPE gitlab_runner_failed_jobs_total counter
gitlab_runner_failed_jobs_total{job_result="script_failure",runner="7ywyWRnr"} 2
Why was this MR needed?
The existing metric only records failed jobs per runner, for example:
ci_runner_failed_jobs_total{failure_reason="script_failure",runner="9e42ca"} 1
It would also be helpful to have a metric for succeeded jobs, so that the job success/failure rate can be tracked.
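As a rough illustration of the rate this would make possible, the sketch below computes a success rate from the two counter values. The function name successRate is hypothetical and not part of this MR; the inputs are the sample values from the example metrics above.

```go
package main

import "fmt"

// successRate returns succeeded / (succeeded + failed).
// It returns 0 when no jobs have finished, to avoid dividing by zero.
// Hypothetical helper for illustration only; not part of the runner code.
func successRate(succeeded, failed float64) float64 {
	total := succeeded + failed
	if total == 0 {
		return 0
	}
	return succeeded / total
}

func main() {
	// Sample values from the example metrics: 2 succeeded jobs, 2 failed jobs.
	fmt.Printf("success rate: %.2f\n", successRate(2, 2))
}
```

In practice this ratio would more likely be computed in a Prometheus query over the two counters than in Go; the code only shows the arithmetic.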
Are there points in the code the reviewer needs to double check?
In the helpers/prometheus/job_status_collector.go file, line 12: []string{"runner", "job_result"}. The label of gitlab_runner_failed_jobs_total is changed from failure_reason to job_result to match gitlab_runner_succeeded_jobs_total. We want to point this out because we are not sure whether this change will affect or break anything.
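To make the shared label set concrete, here is a minimal standard-library sketch of two counters keyed by (runner, job_result). It is a stand-in for illustration, not the code in job_status_collector.go and not the Prometheus client API; jobCounter, newJobCounter, inc, and lines are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// labelNames mirrors the label order quoted above from job_status_collector.go.
var labelNames = []string{"runner", "job_result"}

// jobCounter is a hypothetical stand-in for a labelled counter.
// It counts increments per (runner, job_result) pair.
type jobCounter struct {
	name   string
	counts map[[2]string]float64
}

func newJobCounter(name string) *jobCounter {
	return &jobCounter{name: name, counts: make(map[[2]string]float64)}
}

// inc adds one to the series identified by runner and jobResult.
func (c *jobCounter) inc(runner, jobResult string) {
	c.counts[[2]string{runner, jobResult}]++
}

// lines renders every series as one text line, sorted for stable output.
func (c *jobCounter) lines() []string {
	out := make([]string, 0, len(c.counts))
	for key, value := range c.counts {
		out = append(out, fmt.Sprintf("%s{%s=%q,%s=%q} %g",
			c.name, labelNames[1], key[1], labelNames[0], key[0], value))
	}
	sort.Strings(out)
	return out
}

func main() {
	succeeded := newJobCounter("gitlab_runner_succeeded_jobs_total")
	failed := newJobCounter("gitlab_runner_failed_jobs_total")

	// Route each finished job to the matching counter; both use job_result.
	for _, result := range []string{"success", "script_failure", "success", "script_failure"} {
		if result == "success" {
			succeeded.inc("7ywyWRnr", result)
		} else {
			failed.inc("7ywyWRnr", result)
		}
	}

	for _, line := range append(succeeded.lines(), failed.lines()...) {
		fmt.Println(line)
	}
}
```

One thing a reviewer may want to check: any existing dashboard or alert that selects on failure_reason would stop matching these series after the rename and would need to use job_result instead.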
Does this MR meet the acceptance criteria?
- Documentation created/updated
- Tests
  - Added for this feature/bug
  - All builds are passing
- Branch has no merge conflicts with master (if you do - rebase it please)
What are the relevant issue numbers?