Prometheus does not list any AlertManager endpoint
Summary
I deployed a toy dockerized GitLab 11.4.0-ce.0 (I chose this version because it ships Prometheus 2.4.x instead of the older 1.8.x used by default up to 11.3.x).
My objective is to get the whole GitLab/Prometheus/Alertmanager integration working, to the point that if I trigger an alert (for example with gitlab-ctl stop sidekiq, assuming an alert rule that fires when up == 0), I receive an email with the alert.
I noticed that even though I can enable Prometheus and Alertmanager in /etc/gitlab/gitlab.rb (or in the GITLAB_OMNIBUS_CONFIG environment variable), no "Alertmanager endpoints" are listed in the Prometheus UI. After checking through the code, I found that there is no way (documented or undocumented) to fill the alertmanager_config section of the GitLab Prometheus configuration.
How is Prometheus supposed to forward alerts to Alertmanager if there is no way to tell Prometheus where to find the Alertmanager endpoints?
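For illustration, a gitlab.rb setting along the following lines would be enough to generate the missing section. Note that prometheus['alertmanagers'] is a hypothetical name I made up for this sketch; as far as I can tell no such setting exists in omnibus-gitlab 11.4.0-ce.0:
prometheus['alertmanagers'] = [
  # hypothetical setting: would be rendered into the alerting/alertmanagers
  # section of /var/opt/gitlab/prometheus/prometheus.yml
  { 'scheme' => 'http', 'static_configs' => [{ 'targets' => ['localhost:9093'] }] }
]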
Steps to reproduce
Assuming I have a local network interface on my workstation with IP address 192.168.122.1 (edit the sources below if this is not the case), I run a toy mailcatcher instance listening on 192.168.122.1:1025 (SMTP) and 192.168.122.1:1080 (HTTP).
docker run -d --name mailcatcher --restart always \
-p 192.168.122.1:1025:1025 -p 192.168.122.1:1080:1080 \
sj26/mailcatcher
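As a sanity check before going further, the catcher can be probed through its HTTP API (assuming mailcatcher's /messages endpoint, which returns the caught mails as JSON):
# should return [] (an empty JSON array) on a fresh instance
curl -s http://192.168.122.1:1080/messages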
Then I configure the dockerized GitLab to enable the whole monitoring stack and send alerts to my toy mailcatcher instance.
gitlab_omnibus_config() {
cat <<-EOF
external_url 'http://gitlab.192.168.122.1.xip.io:8080'
nginx['listen_port'] = 80
# https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/doc/docker/README.md#expose-gitlab-on-different-ports
gitlab_rails['gitlab_shell_ssh_port'] = 8022
# prometheus
prometheus_monitoring['enable'] = true
prometheus['listen_address'] = '0.0.0.0:9090'
prometheus['monitor_kubernetes'] = false
# alertmanager
alertmanager['enable'] = true
alertmanager['listen_address'] = '0.0.0.0:9093'
alertmanager['receivers'] = [{'name' => 'custom-receiver', 'email_configs' => [{'to' => 'gitlab-admin@example.com', 'require_tls' => false}]}]
# Note: declare a custom-receiver because default-receiver forces require_tls (incompatible with mailcatcher)
alertmanager['default_receiver'] = 'custom-receiver'
# gitlab_rails email configuration is required to implicitly configure alertmanager for smtp
# see https://gitlab.com/gitlab-org/omnibus-gitlab/blob/11.4.0-ce.0/files/gitlab-cookbooks/gitlab/libraries/prometheus.rb#L166-197
gitlab_rails['gitlab_email_enabled'] = true
gitlab_rails['gitlab_email_from'] = 'gitlab@example.com'
# configure smtp to send emails to mailcatcher instance at 192.168.122.1:1025
gitlab_rails['smtp_enable'] = true
gitlab_rails['smtp_address'] = '192.168.122.1'
gitlab_rails['smtp_port'] = '1025'
gitlab_rails['smtp_tls'] = false
gitlab_rails['smtp_openssl_verify_mode'] = 'none'
gitlab_rails['smtp_enable_starttls_auto'] = false
gitlab_rails['smtp_ssl'] = false
gitlab_rails['smtp_force_ssl'] = false
EOF
}
docker run --detach \
--hostname gitlab.example.com \
--env GITLAB_OMNIBUS_CONFIG="$(gitlab_omnibus_config)" \
--publish 192.168.122.1:8080:80 \
--publish 192.168.122.1:8022:22 \
--publish 192.168.122.1:9090:9090 \
--publish 192.168.122.1:9093:9093 \
--name test-gitlab \
--restart always \
--volume /data/test-gitlab/config:/etc/gitlab \
--volume /data/test-gitlab/logs:/var/log/gitlab \
--volume /data/test-gitlab/data:/var/opt/gitlab \
gitlab/gitlab-ce:11.4.0-ce.0
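Booting takes several minutes; a small wait loop on the container's built-in healthcheck tells me when it is ready:
# poll the docker healthcheck until gitlab reports healthy
until [ "$(docker inspect --format '{{.State.Health.Status}}' test-gitlab)" = healthy ]; do
  sleep 10
done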
Once GitLab has booted (docker logs -f test-gitlab), I open two browser tabs:
- one with http://192.168.122.1:9090/status (the Prometheus status page)
- one with http://192.168.122.1:9090/config (the Prometheus configuration page)
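The same information is also available from the Prometheus HTTP API (Prometheus 2.x), which makes the missing endpoints easy to spot from a shell:
# expected: data.activeAlertmanagers containing localhost:9093;
# currently both activeAlertmanagers and droppedAlertmanagers come back empty
curl -s http://192.168.122.1:9090/api/v1/alertmanagers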
What is the current bug behavior?
I see that no Alertmanager endpoints are listed in the status page.
I see that no alertmanager_config section is present in the Prometheus configuration page.
What is the expected correct behavior?
I should see at least one Alertmanager endpoint listed (probably at address localhost:9093, since that's the address I configured in the GITLAB_OMNIBUS_CONFIG environment variable).
I should see an alertmanager_config section present in the Prometheus configuration page.
Note: the alertmanager_config section of the Prometheus configuration file is documented here: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config
If the Alertmanager endpoints were found by Prometheus (see the Configuration details below for how to make it work with a hack), I could trigger an alert and have it pass through the whole stack until an alert email arrives in my toy mailcatcher UI at http://192.168.122.1:1080.
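For reference, this is how I would trigger and trace the alert once the endpoints are wired up (using the up.rules file shown in the Configuration details below):
# stop a monitored service so that up == 0 fires after the 1m 'for' window
docker exec test-gitlab gitlab-ctl stop sidekiq
# the alert should show up as pending/firing in prometheus ...
curl -s http://192.168.122.1:9090/api/v1/alerts
# ... then be forwarded to alertmanager, which mails the custom-receiver
curl -s http://192.168.122.1:9093/api/v1/alerts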
Details of package version
$ docker ps | grep gitlab
cd4aff7a92a0 gitlab/gitlab-ce:11.4.0-ce.0 "/assets/wrapper" 49 minutes ago Up 49 minutes (healthy) 192.168.122.1:9090->9090/tcp, 192.168.122.1:9093->9093/tcp, 192.168.122.1:8022->22/tcp, 192.168.122.1:8080->80/tcp, 192.168.122.1:8443->443/tcp test-gitlab
Environment details
- Operating System: Ubuntu
- Installation Target: Other: Docker
- Installation Type: New Installation
- Is there any other software running on the machine: Not relevant
- Is this a single or multiple node installation? Single Node
- Resources:
  - CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz (4 cores / 8 threads)
  - Memory total: 16G
Configuration details
See the source of the gitlab_omnibus_config bash function above.
In addition, I edited the following files inside the docker container to make the whole workflow work:
In /var/opt/gitlab/prometheus/prometheus.yml, I added the following snippet:
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - "localhost:9093"
I created the /var/opt/gitlab/prometheus/rules/up.rules file with the following content:
groups:
- name: up.rules
  rules:
  - alert: PrometheusInstanceDown
    expr: up == 0
    for: 1m
    annotations:
      description: '{{$labels.instance}} of job {{$labels.job}} has been down for more than 1 minute.'
      summary: '{{$labels.instance}}: Instance down'
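Prometheus only picks up these edits after a reload. Assuming the default omnibus rule_files glob (/var/opt/gitlab/prometheus/rules/*.rules) is in place, restarting just the prometheus service is enough; running gitlab-ctl reconfigure instead would regenerate prometheus.yml and wipe the hand-edited alerting section:
# restart only prometheus so the hand-edited config survives
docker exec test-gitlab gitlab-ctl restart prometheus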