Infradev: WebsocketsServiceRailsRequestApdexSLOViolationRegional
Summary
Apdex for the websockets service is consistently dropping below SLO.
This coincides with a large number of 500 errors for ActionCable::Connection with the message 'can't alloc thread'.
Impact
This resulted in multiple incidents:
- gitlab-com/gl-infra/production#15977 (closed)
- gitlab-com/gl-infra/production#15975 (closed)
- gitlab-com/gl-infra/production#16060 (closed)
- gitlab-com/gl-infra/production#16103 (closed)
- gitlab-com/gl-infra/production#16130 (closed)
- gitlab-com/gl-infra/production#16128 (closed)
- gitlab-com/gl-infra/production#16144 (closed)
- gitlab-com/gl-infra/production#16146 (closed)
- gitlab-com/gl-infra/production#16168 (closed)
Recommendation
Verification
Confirm that the 500 errors have stopped and that Apdex no longer drops below SLO.
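The check described above could be sketched as follows. This is only an illustration: it assumes the standard Apdex formula (satisfied requests plus half of tolerating requests, divided by total requests), and the function names, the request counts, and the SLO threshold are all hypothetical values rather than the ones used for the websockets service.

```python
def apdex(satisfied: int, tolerating: int, total: int) -> float:
    """Standard Apdex score: (satisfied + tolerating / 2) / total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (satisfied + tolerating / 2) / total


def verification_passes(
    satisfied: int,
    tolerating: int,
    total: int,
    http_500_count: int,
    slo_threshold: float,
) -> bool:
    """Return True when no 500s were seen and Apdex is at or above the SLO.

    All inputs are assumed to come from the same observation window.
    """
    return http_500_count == 0 and apdex(satisfied, tolerating, total) >= slo_threshold


if __name__ == "__main__":
    # Hypothetical numbers for one observation window.
    print(verification_passes(990, 5, 1000, http_500_count=0, slo_threshold=0.99))
```

In practice the counts would come from the service's metrics over a window long enough to cover the periods in which the earlier incidents fired.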
Edited by Steve Xuereb