[DEVOPS INCIDENT] Low disk space
Meta
Date of appearance: <2018-02-20 Wed>
Date of resolution: <2018-02-20 Wed>
Total downtime caused: ~18 mins
Participating members: Lasse, Yuki, Fabian
Symptoms
- Server is running low on disk space
- Server memory usage is high
- No low disk space alert happened
- We couldn't prune old docker images for some reason
Possible Causes
- There were many stale docker images not cleaned up
Attempts
- @fneu removed a container to free up a little space so we could keep working (worked, freed 800MB)
- @yuki_is_bored tried pruning all images, which for some reason didn't do anything. Somehow the docker daemon wasn't reacting?
- At some point free disk space dropped fast (nobody knows why :/) and we had to do an emergency reboot. After the reboot, pruning worked fine and we're down to 50% disk usage.
Stuff that needs to be done
- Fix low space alerts
- Add auto pruning cronjob with error alerts
- Possibly scheduled reboots? CC @yuki_is_bored
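
The first two items could be sketched roughly as below. This is only a starting point under assumptions, not what is deployed: the 80% threshold, the 300 second timeout and the `send_alert` helper are all hypothetical placeholders, and the script assumes the `docker` CLI is on the PATH of the user running the cronjob. The timeout is there because the daemon did not react to pruning during this incident, so a hanging prune should count as an error too.

```python
import shutil
import subprocess
import sys

THRESHOLD_PERCENT = 80.0   # assumed alert threshold, tune as needed
PRUNE_TIMEOUT_SECONDS = 300  # assumed; a hung daemon should count as a failure


def disk_usage_percent(path="/"):
    """Return the used share of the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def needs_alert(percent_used, threshold=THRESHOLD_PERCENT):
    """True when disk usage has reached the alert threshold."""
    return percent_used >= threshold


def prune_images(run=subprocess.run):
    """Remove unused docker images; return (ok, message).

    `run` is injectable so the function can be tested without docker.
    """
    try:
        result = run(
            ["docker", "image", "prune", "--all", "--force"],
            capture_output=True,
            text=True,
            timeout=PRUNE_TIMEOUT_SECONDS,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # docker missing, not executable, or the daemon did not answer in time
        return False, f"prune could not run: {exc}"
    if result.returncode != 0:
        return False, f"prune failed: {result.stderr.strip()}"
    return True, result.stdout.strip()


def send_alert(message):
    """Hypothetical alert hook: replace with the team's chat or mail alerting."""
    print(f"ALERT: {message}", file=sys.stderr)


def main(path="/", threshold=THRESHOLD_PERCENT):
    """Check disk usage, prune images, and alert on low space or prune errors."""
    used = disk_usage_percent(path)
    if needs_alert(used, threshold):
        send_alert(f"disk usage on {path} is at {used:.0f}%")
    ok, message = prune_images()
    if not ok:
        send_alert(message)
    return ok
```

A cronjob would call `main()` from a small wrapper script. Checking the return code and the timeout separately means both a failing prune and a non-reacting daemon end up in `send_alert` instead of going unnoticed.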
Edited by Lasse Schuirmann