
CI Runner disk space utilization monitoring

A spike in runner disk space utilization was introduced with gitlab-org/gitlab!103083 (merged), but we had no advance warning: we only noticed once jobs started failing with the no disk space error described in #117 (closed). This resulted in a large number of broken master incidents, including gitlab-org/gitlab#383826 (closed), which prompted a last-minute revert following the broken master resolution process.

I'd like to start a conversation about how to set up notifications that alert us when runner disk space is running low. @f_santos kindly pointed out:

"Runner VMs are completely isolated from all infra, they don’t have any metric exporter running"

So we cannot leverage any built-in metrics. However, I wonder whether there is a script we could add at the end of job execution to check disk utilization. As a first step, we could simply print the usage to the console. @f_santos Can you provide any guidance on what we can explore?
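
To make the "print the usage to console" idea concrete, here is a minimal sketch of what such an end-of-job check might look like. It is only an illustration, not an existing script in our pipelines: the function names and the 80% threshold are hypothetical, and it assumes a Python 3 interpreter is available in the job image.

```python
import shutil
import sys

# Hypothetical threshold chosen for illustration; not taken from this issue.
WARN_THRESHOLD_PERCENT = 80.0


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo filesystems report a zero total; treat them as empty.
        return 0.0
    return usage.used / usage.total * 100


def format_report(path: str, percent: float,
                  threshold: float = WARN_THRESHOLD_PERCENT) -> str:
    """Build a single log line, marked WARNING at or above the threshold."""
    level = "WARNING" if percent >= threshold else "INFO"
    return f"[{level}] disk usage on {path}: {percent:.1f}% (threshold {threshold:.0f}%)"


if __name__ == "__main__":
    # Optional first argument: the path whose filesystem should be checked.
    target = sys.argv[1] if len(sys.argv) > 1 else "/"
    print(format_report(target, disk_usage_percent(target)))
```

One possible way to run this at the end of every job would be from an `after_script` section, so the usage line shows up in the job log even when the main script fails. This only prints to the console; turning the warning into an actual notification would still need a separate mechanism.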

Edited by Jennifer Li