Add monitoring and alerting for GitHost.io
We have Slack alerting for GitHost.io, but it's not being used effectively. https://gitlab.com/gitlab-com/support/issues/474 discusses what we need to do to improve GitHost, but with @dblessing out this has been delayed.

We really need the most basic alerting to do the following:

1. Ensure that a host responds with 200 OK
2. Page a support person if it does not respond within X minutes
3. Monitor disk space: send an e-mail (preferably to the customer too) when a disk nears full usage, and page a support person when it becomes critical

I think we're close to having this, but we need someone to focus on this ASAP. I know @ahanselka was involved with this effort earlier with Prometheus. We'll need some reinforcements while the support team is shorthanded next week.

/cc: @lbot, @pcarranza, @ernstvn
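To make the three requirements above concrete, here is a rough sketch of the decision logic in plain Python (standard library only). It is not the Prometheus setup mentioned above and not an existing GitHost script. The thresholds, the 5-minute value standing in for "X minutes", and all function names are assumptions for illustration, and the actual paging and e-mail delivery are left out.

```python
import http.client
import shutil
import urllib.error
import urllib.request

# All three values are assumptions; the issue does not specify them.
WARN_FRACTION = 0.80    # e-mail when the disk is this full
CRIT_FRACTION = 0.95    # page support when the disk is this full
GRACE_SECONDS = 5 * 60  # stands in for "X minutes" of downtime


def host_is_up(url, timeout=10.0):
    """Return True only if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Non-2xx responses, timeouts, DNS failures and malformed
        # URLs are all treated as "down".
        return False


def should_page(down_since, now, grace_seconds=GRACE_SECONDS):
    """Page once a host has been down for at least the grace period.

    down_since is the timestamp of the first failed check,
    or None while the host is healthy.
    """
    return down_since is not None and (now - down_since) >= grace_seconds


def disk_used_fraction(path="/"):
    """Fraction of the filesystem at `path` in use (local host only)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def disk_level(used_fraction, warn=WARN_FRACTION, crit=CRIT_FRACTION):
    """Map disk usage to 'ok', 'warning' (e-mail) or 'critical' (page)."""
    if used_fraction >= crit:
        return "critical"
    if used_fraction >= warn:
        return "warning"
    return "ok"
```

A scheduler would call `host_is_up` from outside the instance and `disk_used_fraction` on the instance itself, then hand `"warning"` results to the e-mail path and `"critical"` results or `should_page(...) == True` to whatever paging service is in use.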