Expose a duration histogram of the runner prepare stage
Overview
This feature adds a new Prometheus histogram metric that records how long it takes to prepare the CI/CD job environment (the prepare stage).
Problem(s) to solve
- A customer that uses Kubernetes to host the CI/CD build environment (runners) and runs ~200k CI/CD jobs per day has found that the pod provisioning step (prepare environment) can take more than 3 minutes. The estimate is that this affects ~10% of the daily CI/CD jobs. The customer therefore needs visibility into duration trends for the prepare stage to decide how to adjust the compute resources allocated to the Kubernetes cluster(s).
Proposal
- Add a histogram metric that records the duration of preparing the CI/CD job environment.
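To illustrate what such a histogram would expose, here is a minimal sketch in plain Python of Prometheus-style cumulative buckets applied to prepare-stage durations. It is not the runner's implementation: the class name `DurationHistogram`, the `BUCKETS` boundaries, and the `fraction_over` helper are all hypothetical and chosen only so that the 3-minute mark from the problem statement falls on a bucket boundary.

```python
import bisect
import time
from contextlib import contextmanager

# Hypothetical bucket upper bounds in seconds (an assumption, not the
# runner's real configuration). 180 s is included so that "longer than
# 3 minutes" can be read directly from the buckets.
BUCKETS = [5, 15, 30, 60, 120, 180, 300, 600]


class DurationHistogram:
    """Counts observations into 'less than or equal' buckets, Prometheus style."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One slot per finite bucket plus a final slot for +Inf.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0  # sum of all observed durations
        self.count = 0    # number of observations

    def observe(self, seconds):
        # bisect_left returns the index of the first bound >= seconds,
        # i.e. the smallest bucket whose upper bound includes the value.
        idx = bisect.bisect_left(self.bounds, seconds)
        self.counts[idx] += 1
        self.total += seconds
        self.count += 1

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, ending with +Inf."""
        pairs = []
        running = 0
        for bound, n in zip(self.bounds + [float("inf")], self.counts):
            running += n
            pairs.append((bound, running))
        return pairs

    def fraction_over(self, bound):
        """Share of observations above `bound`, which must be a bucket bound."""
        if bound not in self.bounds:
            raise ValueError("bound must be one of the configured buckets")
        if self.count == 0:
            return 0.0
        at_or_below = dict(self.cumulative())[bound]
        return (self.count - at_or_below) / self.count

    @contextmanager
    def time(self):
        """Time the wrapped block and record its duration in seconds."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.observe(time.monotonic() - start)
```

As a usage example, observing the durations `12, 45, 95, 170, 200, 20, 30, 61, 110, 150` (seconds) puts one of the ten jobs above the 180 s bucket, so `fraction_over(180)` returns `0.1`. That is the kind of "share of jobs slower than 3 minutes" figure the customer wants to track over time.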
Note:
- We already partition the number of jobs by execution step and executor stage, and the time differences between the pre-defined steps can be roughly extrapolated from that.
Edited by Darren Eastman