Ship additional GitLab Prometheus alerts
We will be shipping Alertmanager with GitLab 10.8 (omnibus-gitlab#2999). I think we should begin shipping default alerts for GitLab administrators. Which metrics would be most useful to add as alerts, as soon as possible, for most GitLab users and customers?
Some ideas:
Component | Exporter Endpoint | Prometheus metric |
---|---|---|
Unicorn | http://localhost:8080/-/metrics | unicorn_active_connections |
Unicorn | http://localhost:8080/-/metrics | unicorn_queued_connections |
Unicorn | http://localhost:8080/-/metrics | job_register_attempts_failed_total |
Sidekiq | http://localhost:9168/sidekiq | sidekiq_queue_size |
Gitaly | http://localhost:9236/metrics | grpc_server_handled_total{grpc_code="ResourceExhausted"} |
PostgreSQL | http://localhost:9187/metrics | pg_stat_database_deadlocks |
PostgreSQL | http://localhost:9187/metrics | pg_stat_database_conflicts_confl_deadlock |
Redis | http://localhost:9121/metrics | redis_up |
Workhorse | http://localhost:9229/metrics | ? |
Pages | http://localhost:9101/metrics | gitlab_pages_domains_updated_total |
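
To make the discussion concrete, a few of the metrics above could be turned into rules along these lines. This is only a rough sketch, assuming the Prometheus 2.x YAML rule-file format; the group name, alert names, thresholds, `for` durations, and severity labels are all placeholders I made up, not proposed defaults.

```yaml
# Sketch only: assumes the Prometheus 2.x YAML rule-file format.
# Group name, alert names, thresholds, durations, and labels are
# illustrative assumptions, not agreed defaults.
groups:
  - name: gitlab-default-alerts  # hypothetical group name
    rules:
      # Redis exporter reports that it cannot reach Redis.
      - alert: RedisDown
        expr: redis_up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Redis is reported as down by the exporter"

      # Requests are waiting for a free Unicorn worker for a sustained period.
      - alert: UnicornQueueing
        expr: unicorn_queued_connections > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Unicorn has had queued connections for 10 minutes"

      # Gitaly is rejecting calls with the ResourceExhausted gRPC code.
      - alert: GitalyResourceExhausted
        expr: rate(grpc_server_handled_total{grpc_code="ResourceExhausted"}[5m]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Gitaly is returning ResourceExhausted errors"

      # The deadlock counter increased during the last 10 minutes.
      - alert: PostgresDeadlocks
        expr: increase(pg_stat_database_deadlocks[10m]) > 0
        labels:
          severity: warning
        annotations:
          summary: "PostgreSQL reported new deadlocks"
```

If we go with this format, a file like this can be validated with `promtool check rules <file>` before it is shipped. The actual thresholds would need input from people who run these components in production.
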
/cc: @ayufan, @bjk-gitlab, @nick.thomas, @_stark, @dblessing, @jacobvosmaer-gitlab, @zj
Edited by Ben Kochie