Add alerting for runners cache machines
runners-cache-2 reached 96% disk usage, at which point all PUT requests to the runners cache failed with 500 errors.
We need to:
- Add alerting for disk space
- Add alerting for an unusually high number of 50x errors
- Consider an automatic recovery mechanism (e.g. clearing the cache)
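
As a starting point for discussion, the two checks above could look roughly like the sketch below. It is illustrative only and not tied to our actual monitoring stack: the 90% disk threshold, the 5% error-rate threshold, and all function names are assumptions, and the status codes would have to come from wherever the cache's access logs or metrics are collected.

```python
import shutil
from typing import Iterable

# Assumed thresholds for illustration; real values need to be agreed on.
DISK_ALERT_THRESHOLD = 0.90   # alert well before the 96% seen in the incident
ERROR_RATE_THRESHOLD = 0.05   # alert if more than 5% of requests are 50x


def disk_usage_fraction(path: str = "/") -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_alert_disk(used_fraction: float,
                      threshold: float = DISK_ALERT_THRESHOLD) -> bool:
    """True when disk usage has reached or exceeded the threshold."""
    return used_fraction >= threshold


def server_error_rate(status_codes: Iterable[int]) -> float:
    """Fraction of responses in the sample that are server errors (500-599)."""
    codes = list(status_codes)
    if not codes:
        return 0.0
    errors = sum(1 for code in codes if 500 <= code <= 599)
    return errors / len(codes)


def should_alert_errors(status_codes: Iterable[int],
                        max_rate: float = ERROR_RATE_THRESHOLD) -> bool:
    """True when the server error rate in the sample exceeds `max_rate`."""
    return server_error_rate(status_codes) > max_rate
```

For example, `should_alert_disk(disk_usage_fraction("/"))` could run periodically on each cache machine, while `should_alert_errors` would be fed the status codes from a recent window of requests. An automatic recovery step could hook into the disk check, but what is safe to delete still needs to be decided.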