CI Runner disk space utilization monitoring
A spike in runner disk space utilization was introduced with gitlab-org/gitlab!103083 (merged), but we had no advance warning and only noticed once jobs started failing with the "no disk space"
error described in #117 (closed). This resulted in a large number of broken master incidents, including gitlab-org/gitlab#383826 (closed), which prompted a last-minute revert following the broken master resolution process.
I'd like to start a conversation about how to set up notifications that alert us when runner disk space is running low. @f_santos kindly pointed out:
Runner VMs are completely isolated from all infra, they don’t have any metric exporter running
So we cannot leverage any built-in metrics. However, I wonder whether there is a script we can add at the end of job execution to check disk utilization. As a first step, we could simply print the usage to the console. @f_santos Can you provide any guidance on what we can explore?
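
To make the "print the usage to console" idea concrete, here is a minimal sketch of what such an end-of-job check could look like. It is only an illustration, not a proposal for the final implementation: the function names, the 80% threshold, and the choice of Python (which assumes an interpreter is available in the job image) are all assumptions on my part.

```python
import shutil
import sys

# Hypothetical threshold; the right value would need to be agreed on.
WARN_THRESHOLD_PERCENT = 80.0


def disk_usage_percent(path: str = "/") -> float:
    """Return the used share (0-100) of the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Guard against division by zero on unusual filesystems.
        return 0.0
    return usage.used / usage.total * 100


def report(path: str = "/", threshold: float = WARN_THRESHOLD_PERCENT) -> bool:
    """Print the usage to the job log and return True if it is at or above `threshold`."""
    percent = disk_usage_percent(path)
    print(f"Disk usage on {path}: {percent:.1f}%")
    if percent >= threshold:
        print(
            f"WARNING: disk usage on {path} is at or above {threshold:.0f}%",
            file=sys.stderr,
        )
        return True
    return False


if __name__ == "__main__":
    # Optional first argument: the path to check (defaults to the root filesystem).
    report(sys.argv[1] if len(sys.argv) > 1 else "/")
```

One possible way to run this would be from an `after_script` step, so the usage is printed even when the job itself fails. The script deliberately only prints and never fails the job. Turning the warning into an actual notification would be a separate step.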