Configuring Prometheus as described in the reference architecture causes a scrape config conflict
Summary
When configuring Prometheus as described in https://docs.gitlab.com/ee/administration/reference_architectures/2k_users.html#configure-prometheus, the configuration for the node
job conflicts with the automatically generated scrape configs, and Prometheus doesn't start.
This is somewhat related to #4294 (closed), but running without service discovery surfaces this bug again.
Code reference:
- https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/monitoring/libraries/prometheus.rb#L236
- https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/monitoring/libraries/prometheus.rb#L467
Steps to reproduce
- On a node with roles(['monitoring_role']), add an entry with 'job_name': 'node' to prometheus['scrape_configs']
- Run gitlab-ctl reconfigure
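For reference, a minimal /etc/gitlab/gitlab.rb fragment that should trigger the conflict might look like the following. The target address is a placeholder from the documentation IP range, not a value from our setup; 9100 is the default node_exporter port.

```ruby
# /etc/gitlab/gitlab.rb on the monitoring node (sketch, placeholder target)
roles(['monitoring_role'])

prometheus['scrape_configs'] = [
  {
    # Same job name as the automatically generated "node" scrape config
    'job_name': 'node',
    'static_configs' => [
      # Placeholder address; replace with a real node_exporter endpoint
      'targets' => ['192.0.2.10:9100'],
    ],
  },
]
```

After gitlab-ctl reconfigure, the rendered Prometheus configuration then contains two scrape configs named node, and Prometheus rejects configurations with duplicate job names.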
What is the current bug behavior?
Prometheus doesn't start.
What is the expected correct behavior?
Prometheus should start.
Workaround
Our current workaround is setting node_exporter['enable'] = false, which prevents the generation of the "default" scrape config but also stops node-exporter from running on the monitoring node.
Fix suggestion
Would it be possible to skip the generation of the "default" scrape configs if a scrape config with the same name exists in prometheus['scrape_configs']?
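To illustrate the suggestion, here is a rough sketch of the merge logic. It is not the actual omnibus-gitlab code: merge_scrape_configs and the sample data are hypothetical, and the real change would need to fit into how prometheus.rb builds its defaults.

```ruby
# Hypothetical helper, NOT the actual omnibus-gitlab implementation.
# Drops any generated default scrape config whose job name is already
# defined by the user, then appends the user-supplied configs.
def merge_scrape_configs(default_configs, user_configs)
  # gitlab.rb examples use both 'job_name' => ... (string key) and
  # 'job_name': ... (symbol key), so check both forms.
  job_name = ->(config) { (config['job_name'] || config[:job_name]).to_s }

  user_job_names = user_configs.map(&job_name)
  kept_defaults = default_configs.reject do |config|
    user_job_names.include?(job_name.call(config))
  end

  kept_defaults + user_configs
end

# Illustrative data only; targets are placeholders.
defaults = [
  { 'job_name' => 'prometheus', 'static_configs' => [{ 'targets' => ['localhost:9090'] }] },
  { 'job_name' => 'node', 'static_configs' => [{ 'targets' => ['localhost:9100'] }] },
]
user = [
  { 'job_name': 'node', 'static_configs' => [{ 'targets' => ['192.0.2.10:9100'] }] },
]

merged = merge_scrape_configs(defaults, user)
puts merged.length
```

With this approach the user-defined node job replaces the generated one, while node-exporter can stay enabled on the monitoring node.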