Set up alert on HTTP Queue Timing
Carried over from https://gitlab.com/gitlab-com/infrastructure/issues/2379
While we anticipate better ways of measuring web request response timing as Prometheus integration matures, we should not wait for that before setting up an alert on HTTP Queue Timing. When the timing exceeds the SLO proposed in this issue, the runbook tied to the alert should describe how to add more unicorn hosts as a first step. High HTTP Queue Timing can have other causes, but this is a start.
- Proposed SLO: HTTP Queue Timing p99 < 15 ms.
- Proposed alert: Alert when p99 has been > 15 ms for more than 30 minutes.
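
To make the proposed alert condition concrete, here is a minimal Python sketch of the logic, not the actual alerting rule. It assumes one p99 value per one-minute window and reads "more than 30 minutes" as more than 30 consecutive breaching windows. The names `p99`, `should_alert`, `SLO_P99_MS`, and `SUSTAINED_MINUTES` are hypothetical and do not come from any existing tooling.

```python
from typing import Sequence

# Values taken from the proposal above.
SLO_P99_MS = 15.0
SUSTAINED_MINUTES = 30


def p99(samples_ms: Sequence[float]) -> float:
    """Return the nearest-rank 99th percentile of queue timings in ms."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Nearest-rank method: rank = ceil(0.99 * n), done in integer arithmetic.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def should_alert(
    p99_per_minute: Sequence[float],
    threshold_ms: float = SLO_P99_MS,
    sustained_minutes: int = SUSTAINED_MINUTES,
) -> bool:
    """Return True if the most recent windows have breached long enough.

    Assumes p99_per_minute holds one p99 value per one-minute window,
    oldest first. Counts the trailing run of windows above the threshold
    and fires only when that run is longer than sustained_minutes.
    """
    breached = 0
    for value in reversed(p99_per_minute):
        if value > threshold_ms:
            breached += 1
        else:
            break
    return breached > sustained_minutes
```

With this reading, a single window back under 15 ms resets the count, so a brief spike does not page anyone, while a sustained breach does.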