Add metrics with concurrent and limit values
What does this MR do?
Adds two new metrics:
Why was this MR needed?
It greatly improves alerting possibilities. Currently a user can track the number of handled jobs, partitioned by Runner worker. However, there is no automatic way to track the configured
limit. A user who wants to create alerts needs to hardcode estimated thresholds.
With these two new metrics, an alert definition can dynamically track the configured limits, e.g.:
    alert: TooManyJobsOnRunner
    expr: |
      sum(gitlab_runner_jobs / gitlab_runner_limit) by (runner) >= 0.8
    for: 5m
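To make the arithmetic behind the rule concrete, here is a minimal sketch of the same comparison in Python. It assumes one `gitlab_runner_jobs` and one `gitlab_runner_limit` sample per `runner` label; the function name, runner names, and sample values are hypothetical, and the `for: 5m` hold period is not modeled.

```python
# Illustrative only: mirrors the ratio check in the alert expression above.
# Runner names and sample values are made up for the example.

THRESHOLD = 0.8  # same ratio as in the alert expression


def overloaded_runners(jobs, limits, threshold=THRESHOLD):
    """Return {runner: utilization} for runners at or above the threshold."""
    result = {}
    for runner, running in jobs.items():
        limit = limits.get(runner)
        # Skip runners with no limit sample or a limit of 0.
        # This is a choice made for the sketch, not PromQL behaviour.
        if not limit:
            continue
        utilization = running / limit
        if utilization >= threshold:
            result[runner] = utilization
    return result


if __name__ == "__main__":
    jobs = {"runner-a": 8, "runner-b": 3}      # hypothetical job counts
    limits = {"runner-a": 10, "runner-b": 10}  # hypothetical configured limits
    print(overloaded_runners(jobs, limits))
```

With these sample values only `runner-a` reaches the 80% threshold, so it is the only runner that would be flagged. Changing the configured limit changes the result without touching the threshold, which is the point of exposing the limit as a metric.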
Are there points in the code the reviewer needs to double check?
Does this MR meet the acceptance criteria?
- Documentation created/updated
- Added tests for this feature/bug
- In case of conflicts with master, branch was rebased