2020-03-11 saturation on redis-cache, latency spike on web
Summary
The canary deployment included an MR that caused cpu saturation on redis-cache, which in turn caused a latency spike on the web service.
Timeline
All times UTC.
2020-03-11
- 07:31 canary deployment starts
- 08:21 redis-cache cpu starts going up
- 08:26 alerts fire for cpu saturation on redis-cache and latency on web (EOC paged)
- xx:xx multiple engineers join the incident room; it is unclear whether the problem is related to the canary deployment, but we decide to drain canary anyway to rule it out
- 08:30 canary drained
- 08:31 redis-cache cpu goes down (and along with it web latency)
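
The intervals implied by the timeline above can be worked out with a short script. This is only a sketch: the event keys and interval names (`EVENTS`, `minutes_between`, `deploy_to_cpu_rise`, and so on) are labels made up for illustration, not terms from the incident process, and the `xx:xx` entry is left out because its time was not recorded.

```python
from datetime import datetime

# Timestamps copied from the timeline (all UTC, 2020-03-11).
# The "xx:xx" entry is omitted because its time is unknown.
# Key names are hypothetical labels chosen for this sketch.
EVENTS = {
    "canary_deploy_start": "07:31",
    "cpu_rise": "08:21",
    "alerts_fire": "08:26",
    "canary_drained": "08:30",
    "cpu_recovers": "08:31",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes elapsed between two named timeline events."""
    fmt = "%H:%M"
    delta = datetime.strptime(EVENTS[end], fmt) - datetime.strptime(EVENTS[start], fmt)
    return int(delta.total_seconds() // 60)


# Interval names are illustrative, not official incident metrics.
intervals = {
    "deploy_to_cpu_rise": minutes_between("canary_deploy_start", "cpu_rise"),
    "cpu_rise_to_alert": minutes_between("cpu_rise", "alerts_fire"),
    "alert_to_drain": minutes_between("alerts_fire", "canary_drained"),
    "drain_to_recovery": minutes_between("canary_drained", "cpu_recovers"),
    "cpu_rise_to_recovery": minutes_between("cpu_rise", "cpu_recovers"),
}

for name, minutes in intervals.items():
    print(f"{name}: {minutes} min")
```

With the recorded times, this gives roughly 50 minutes from the start of the canary deployment to the cpu rise, 5 minutes from the cpu rise to the alerts, 4 minutes from the alerts to the drain, and about 10 minutes of elevated cpu overall. The timeline has minute granularity, so each figure is accurate only to within a minute.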
Resources
- If the Situation Zoom room was utilised, the recording will be automatically uploaded to the Incident room Google Drive folder (private)
Edited by Michal Wasilewski