Update runner manager scrape configuration
Related to https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13886
We're setting the external labels environment, stage, tier, type and ci_environment. The first four are common for all resources tracked by GitLab infrastructure monitoring.
Unfortunately, some of them (for example, stage) conflict with labels that we already have in Runner metrics.
With the legacy configuration this was handled automatically because we were using static_configs. It was configured like this:
job_name: runner-managers
honor_labels: false
static_configs:
  - targets:
      - runner-1:1234
      - runner-2:1234
    labels:
      environment: gprd
      stage: main
This static configuration allows adding custom labels to scraped metrics. With honor_labels: false, any scraped label that conflicts with a label attached by the server (for example, instance) is renamed by prepending exported_ to its name.
In short, this configuration allows a metric like:
gitlab_runner_jobs{runner="abcd", state="running", stage="step_script", executor_stage="docker_run"} 10
to be saved as
gitlab_runner_jobs{runner="abcd", state="running", exported_stage="step_script", executor_stage="docker_run", environment="gprd", stage="main"} 10
The stage="step_script" was changed to exported_stage="step_script" and then the stage="main" from the static configuration labels was added.
This unfortunately doesn't work with:
- external labels, which according to the Prometheus configuration documentation do not trigger the honor_labels behavior:
  Note that any globally configured "external_labels" are unaffected by this setting. In communication with external systems, they are always applied only when a time series does not have a given label yet and are ignored otherwise.
- any scrape target type other than static_configs.
Our new configuration, maintained here, falls into both cases: it sets the GitLab Monitoring specific labels globally with external_labels, and it uses gce_sd_configs. As a result, the data here has started to differ from the data in the legacy data sources.
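For illustration, the shape of configuration described above might look roughly like the sketch below. It is not copied from the actual configuration: the project, zone, tier, type and ci_environment values are hypothetical placeholders, and only the gprd and main values and port 1234 are taken from the legacy example.

```yaml
global:
  external_labels:
    environment: gprd
    stage: main
    tier: example-tier          # hypothetical value
    type: example-type          # hypothetical value
    ci_environment: example-ci  # hypothetical value

scrape_configs:
  - job_name: runner-managers
    gce_sd_configs:
      - project: example-project  # hypothetical GCP project
        zone: us-east1-b          # hypothetical zone
        port: 1234
```

With this shape nothing renames the scraped stage label, so the stage from external_labels is ignored for any series that already carries its own stage.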
This change will use metric_relabel_configs to:
- first, copy each label whose name matches one of the external_labels to an exported_... version,
- second, remove the original label.
This will make the labels match the legacy data source (which we use in alerts and dashboards) and allow external_labels to be applied to these metrics.
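A minimal sketch of the two relabeling steps, using the Prometheus metric_relabel_configs key, could look like the following. It assumes the five label names listed at the top of this description and may differ from the rules actually added in this change.

```yaml
scrape_configs:
  - job_name: runner-managers
    metric_relabel_configs:
      # Step 1: copy every scraped label whose name collides with an
      # external label into an exported_<name> label.
      - action: labelmap
        regex: (environment|stage|tier|type|ci_environment)
        replacement: exported_${1}
      # Step 2: drop the original labels. The regex is fully anchored,
      # so the exported_* copies created above are kept.
      - action: labeldrop
        regex: environment|stage|tier|type|ci_environment
```

Under these rules, a scraped stage="step_script" would become exported_stage="step_script", leaving stage free to be filled from external_labels.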