Investigate alerting thresholds for WebPagesServiceWebPagesServerApdexSLOViolationRegional

Summary

Following a recent incident, it was noted that we alert on drops in apdex after 2 minutes.
A drop like this could indicate a problem for us if the trend continues without recovery. However, should we be alerting on-call engineers when apdex recovers very quickly, indicating little or, most likely, no impact to customers?

Alert in question: https://gitlab.com/gitlab-com/runbooks/-/blob/master/thanos-rules/autogenerated-service-level-alerts-web-pages-gprd.yml#L734

Related Incident(s)

production#14359

Originating issue(s): production#14359

Desired Outcome/Acceptance Criteria

Evaluate the effectiveness of the current alert and determine new values if applicable.
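
One possible way to approach the evaluation is to replay historical apdex samples against candidate durations and count how many alerts each would have produced. The sketch below is a simplified model, not the actual rule evaluation: it assumes one apdex sample per minute and treats the alert as firing once the value has stayed below the threshold for a given number of consecutive samples. The function name `alert_firings`, the sample series, and the `0.995` threshold are all hypothetical and only illustrate the comparison.

```python
from typing import List, Sequence


def alert_firings(apdex: Sequence[float], threshold: float, for_samples: int) -> List[int]:
    """Return the sample indices at which an alert would start firing.

    Simplified model: the alert fires once the value has been below
    `threshold` for `for_samples` consecutive samples, and fires at most
    once per continuous breach.
    """
    firings = []
    run = 0  # length of the current run of consecutive breaching samples
    for i, value in enumerate(apdex):
        if value < threshold:
            run += 1
            if run == for_samples:
                firings.append(i)
        else:
            run = 0  # recovery resets the pending state
    return firings


# Hypothetical series, one sample per minute: a 3-minute dip that
# recovers on its own, followed by a sustained drop.
series = [0.999, 0.95, 0.94, 0.96, 0.999, 0.999,
          0.93, 0.92, 0.91, 0.90, 0.90, 0.91]
threshold = 0.995  # hypothetical value, not the real SLO threshold

for minutes in (2, 5):
    print(minutes, alert_firings(series, threshold, minutes))
```

Under these assumptions, a 2-sample duration fires on both the brief dip and the sustained drop, while a 5-sample duration fires only on the sustained drop, at the cost of detecting it three samples later. Running the same comparison over real apdex history would show how many past alerts were quick recoveries and how much detection delay a longer duration adds.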

Associated Services

Corrective Action Issue Checklist

Link the incident(s) this corrective action arose out of
Give context for what problem this corrective action is trying to prevent from re-occurring
Assign a severity label (this is the highest sev of related incidents, defaults to 'severity::4')
Assign a priority (this will default to 'Reliability::P4')