2020-03-09: Degraded performance for web, frontend, api
Summary
A kernel issue and reboot of one of our secondary (read-only) Postgres hosts caused a brief degraded state, with a spike in latencies and errors on GitLab.com. Once the host had rebooted and traffic had shifted away from it, the issue was resolved.
More information will be added as we investigate the issue.
Timeline
All times UTC.
2020-03-09
- 16:52 - First pages about degraded state for GitLab.com came in.
- 16:53 - patroni-07 (read-only Postgres replica) rebooted
- 16:56 - Web SLO alert escalated to incident manager
- 16:59 - GitLab.com appeared to recover
- 17:04 - Posted to status.io
- 17:20 - GitLab.com operating normally
Resources
- If the Situation Zoom room was utilised, the recording will be automatically uploaded to the Incident room Google Drive folder (private)
Edited by Dave Smith