Add gitlab_runner_job_prepare_stage_duration_seconds histogram

What does this MR do?

Exposes the gitlab_runner_job_prepare_stage_duration_seconds histogram with a stage label whose value can be any of the job stages.

Why was this MR needed?

To allow better monitoring of job stages.

The metric is hidden behind a feature flag (FF_EXPORT_HIGH_CARDINALITY_METRICS) so that its high-cardinality series do not overload users' Prometheus instances by default.

What's the best way to test this MR?

Add the following to your runner config:

listen_address = ":9252"
[[runners]]
name = "k8s-local"
executor = "kubernetes"
environment = ["FF_EXPORT_HIGH_CARDINALITY_METRICS=1"]
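Before wiring up Prometheus, it can help to confirm that the runner's metrics endpoint actually exposes the histogram. The following is a minimal sketch, not part of this MR. It assumes the endpoint is served at http://localhost:9252/metrics (the /metrics path is the usual Prometheus convention and is an assumption here), and it uses the metric name from the queries further down. The helper names stages_in_exposition and fetch_stages are hypothetical.

```python
import re
import urllib.request

# Assumed endpoint: listen_address from the runner config plus the
# conventional /metrics path (the path is an assumption, not from this MR).
METRICS_URL = "http://localhost:9252/metrics"


def stages_in_exposition(text, metric="gitlab_runner_job_stage_duration_seconds"):
    """Return the sorted stage label values found on <metric>_count lines."""
    # Match only lines that start with the histogram's _count series.
    line_pattern = re.compile(
        r"^" + re.escape(metric) + r"_count\{([^}]*)\}", re.MULTILINE
    )
    stages = set()
    for labels in line_pattern.findall(text):
        # Anchor on start-of-labels or a comma so that a label such as
        # job_stage="..." is not mistaken for stage="...".
        match = re.search(r'(?:^|,)\s*stage="([^"]*)"', labels)
        if match:
            stages.add(match.group(1))
    return sorted(stages)


def fetch_stages(url=METRICS_URL):
    """Fetch the exposition text from the runner and list the stages seen."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return stages_in_exposition(resp.read().decode("utf-8"))
```

Calling fetch_stages() against a running runner should return a non-empty list once at least one job has run with the feature flag enabled. An empty list suggests the flag is not set or no job has finished yet.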

Save the following Prometheus config as prometheus.yaml:

scrape_configs:
  - job_name: 'gitlab_runner_metrics'
    static_configs:
      - targets: ['localhost:9252']

Start Prometheus with that config:

prometheus --config.file=./prometheus.yaml

Run Grafana (on macOS):

grafana-server --config=/usr/local/etc/grafana/grafana.ini --homepath /usr/local/share/grafana --packaging=brew cfg:default.paths.logs=/usr/local/var/log/grafana cfg:default.paths.data=/usr/local/var/lib/grafana cfg:default.paths.plugins=/usr/local/var/lib/grafana/plugins

In Grafana, add the Prometheus instance running on :9090 as a data source.

In Grafana, create a dashboard with a time series panel. The following queries should be sufficient:

sum(rate(gitlab_runner_job_stage_duration_seconds_sum[30s])) by (stage)
/
sum(rate(gitlab_runner_job_stage_duration_seconds_count[30s])) by (stage)

sum(increase(gitlab_runner_job_stage_duration_seconds_sum[30s])) > 0
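The first query divides the rate of the histogram's _sum by the rate of its _count, which gives the mean duration per observation for each stage. The arithmetic can be sketched in Python as follows. The function name average_duration is hypothetical, and the sketch ignores counter resets and the extrapolation that PromQL's rate() performs.

```python
def average_duration(sum_start, sum_end, count_start, count_end):
    """Mean observed duration between two scrapes of _sum and _count.

    The window length appears in both rates and cancels out, so only the
    deltas matter.
    """
    observations = count_end - count_start
    if observations <= 0:
        # No stage finished in the window; PromQL would yield NaN here.
        return None
    return (sum_end - sum_start) / observations


# Hypothetical scrape values: 3 new observations totalling 15 seconds.
mean_seconds = average_duration(10.0, 25.0, 2, 5)
```

Note that rate() and increase() need at least two samples inside the range, so a 30s window assumes a scrape interval comfortably below 30s. The scrape config above does not set one, and Prometheus defaults to 1m.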

They should render a graph like this:

image

What are the relevant issue numbers?

Edited by Georgi N. Georgiev | GitLab
