Unexplained sudden growth of Postgres
Summary
Sudden, unexplained growth of disk usage on our Postgres database, accompanied by elevated error rates and a delayed-backup alert.
Timeline
All times UTC.
2020-05-13
- 10:45 - Database begins growing faster than usual; error rate starts rising
- 10:55 - Database growth back to normal; errors peaked and are slowly falling
- 11:00 - Backup delay alert fired
- 11:18 - Backup delay resolved itself
- 11:20 - Errors peaked a second time, at roughly 2x the earlier peak
- 11:28 - Errors back down
- 12:03 - Incident declared via Slack
Details
We saw a sudden increase in used disk space on our Postgres DB, accompanied by a trail of errors and high load. This increase caused our backup alert to fire: the backup was still in progress, but it took too long to finish, which triggered the alert.
No further slowness was observed. Marking as severity S3 for now.
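For reference, below is a minimal Python sketch of one way to triage this kind of growth, assuming direct read access to the affected database. The connection string, the read-only role, and the approach itself are assumptions for illustration, not part of our actual tooling.

    # Triage sketch: list the relations that account for the most on-disk
    # space, to narrow down what grew. Connection parameters are
    # placeholders and must be adapted to the real environment.
    import psycopg2

    conn = psycopg2.connect("host=db.internal dbname=app user=readonly")
    with conn, conn.cursor() as cur:
        # Total relation size includes the table itself, its indexes,
        # and any TOAST data.
        cur.execute("""
            SELECT relname,
                   pg_size_pretty(pg_total_relation_size(oid))
            FROM pg_class
            WHERE relkind = 'r'
            ORDER BY pg_total_relation_size(oid) DESC
            LIMIT 10
        """)
        for relname, size in cur.fetchall():
            print(f"{relname}: {size}")
    conn.close()

Running a query like this during and after the spike, and comparing the two snapshots, would show which relation the growth came from.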
Source
Incident declared by t4cc0re in Slack via the /incident declare command.
Resources
- If the Situation Zoom room was utilised, the recording will be uploaded automatically to the Incident room Google Drive folder (private)
Edited by Hendrik Meyer (xLabber)