Use Prometheus to Query Runner Metrics Linked to Each Job
2- Alex Groleau authored
@@ -322,14 +322,14 @@ func (b *Build) collectAndUploadMetrics(ctx context.Context, startTime time.Time
This MR causes the gitlab-runner, after each job/build, to pull a time range of metrics from an available Prometheus server that is configured to scrape metrics from runner instances. Metrics over this time range are JSON-ified and sent to GitLab as a raw artifact associated with the job.
Metrics are a distilled version of logs. Much like traces, metrics play an essential role in determining how a particular CI/CD job performed. This MR is needed to link a job to the metrics generated by runner nodes, using Prometheus servers in the same production environment. These saved metrics can be used to display performance graphs to end users, as covered in https://gitlab.com/gitlab-org/gitlab-ce/issues/58921. They can also be used by the GitLab security team to detect various forms of abuse, as covered in https://gitlab.com/gitlab-com/gl-security/abuse/issues/83.
gitlab-runner is currently responsible for running jobs, updating job status, and collecting traces from each run. This MR adds metrics collection alongside log collection to provide a complete picture of what happened on the runner instance. Implementing this feature in the gitlab-runner Go codebase, with queries to Prometheus infrastructure, made sense for the following reasons:
This MR currently supports metrics querying for the docker-machine executor only, with an easy path forward to support other executors. To add support for another executor, add the GetMetricsLabelName() and GetMetricsLabelValue() functions to it.
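A minimal sketch of what adding these two functions to an executor could look like follows. The MR text names only the functions, so the `string` return types, the `MetricsLabeler` interface, the `exampleExecutor` type, and the `labelSelector` helper are all assumptions made for illustration, not the actual gitlab-runner definitions.

```go
package main

import "fmt"

// MetricsLabeler is a hypothetical interface grouping the two functions
// named in this MR. The string return types are an assumption.
type MetricsLabeler interface {
	GetMetricsLabelName() string
	GetMetricsLabelValue() string
}

// exampleExecutor is a made-up executor used only for this sketch.
type exampleExecutor struct {
	hostname string
}

// GetMetricsLabelName returns the Prometheus label that identifies the
// machine the job ran on ("instance" is an assumed label name).
func (e *exampleExecutor) GetMetricsLabelName() string { return "instance" }

// GetMetricsLabelValue returns the value of that label for this executor.
func (e *exampleExecutor) GetMetricsLabelValue() string { return e.hostname }

// labelSelector builds a PromQL label selector for executors that
// implement MetricsLabeler, and reports false for those that do not.
func labelSelector(executor interface{}) (string, bool) {
	labeler, ok := executor.(MetricsLabeler)
	if !ok {
		return "", false
	}
	return fmt.Sprintf("{%s=%q}", labeler.GetMetricsLabelName(), labeler.GetMetricsLabelValue()), true
}

func main() {
	selector, ok := labelSelector(&exampleExecutor{hostname: "runner-1:9100"})
	fmt.Println(selector, ok)
}
```

Checking for the interface at runtime means executors without the two functions are skipped rather than breaking the build, which is one plausible way to keep the feature opt-in per executor.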