Draft: chore: Add alert for Gitaly and create production incident for 3-day Apdex violation
requested to merge chore/add-alert-for-gitaly-to-create-incident-for-3-day-apdex-violation into master
What
Add a new alert for Gitaly single-node 3-day burn rate violations, which creates an incident issue instead of paging the EOC.
Why
We saw a lot of SingleNode Gitaly events that were unactionable and short-lived.
Based on that, we lowered the Apdex score for component_node and, at the same time,
started alerting on the 3-day burn rate. Rather than paging the EOC for that, an incident
issue is created: a slow burn is not immediately threatening, but it
does require someone to look at it.
Relevant discussion in this thread: https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/23576#note_1379500501
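
For reviewers less familiar with burn-rate alerting, the routing idea described above can be sketched roughly as follows. This is only an illustration, not the rule added in this MR: the SLO value, the window names, the thresholds, and the action labels (`page`, `incident_issue`) are all hypothetical, and the burn rate is assumed to be the usual ratio of observed Apdex shortfall to the error budget.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Window:
    """One evaluation window. All values used below are hypothetical."""
    name: str                    # e.g. "1h" or "3d"
    burn_rate_threshold: float   # alert when the burn rate exceeds this
    action: str                  # what to do when the threshold is exceeded


def burn_rate(apdex: float, slo: float) -> float:
    """Assumed definition: observed Apdex shortfall divided by the error budget."""
    if not 0.0 <= apdex <= 1.0:
        raise ValueError("apdex must be within [0, 1]")
    if not 0.0 <= slo < 1.0:
        raise ValueError("slo must be within [0, 1)")
    return (1.0 - apdex) / (1.0 - slo)


def route(apdex_by_window: Dict[str, float], slo: float, windows: List[Window]) -> str:
    """Return the action of the first violated window, or "none".

    `windows` is expected to be ordered from most to least urgent.
    """
    for window in windows:
        rate = burn_rate(apdex_by_window[window.name], slo)
        if rate > window.burn_rate_threshold:
            return window.action
    return "none"


# Hypothetical setup: a fast burn pages, a slow 3-day burn only opens an issue.
WINDOWS = [
    Window(name="1h", burn_rate_threshold=14.4, action="page"),
    Window(name="3d", burn_rate_threshold=1.0, action="incident_issue"),
]

if __name__ == "__main__":
    # The node is mildly degraded: not enough to page, but burning budget over 3 days.
    print(route({"1h": 0.95, "3d": 0.98}, slo=0.99, windows=WINDOWS))
```

With these made-up numbers, the 1-hour window stays below its paging threshold while the 3-day window exceeds its own, so the result is an incident issue rather than a page.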