Prune unused docker images on Ops ci-runner hosts
### Task

Set up automatic pruning of old, unused container images on the Ops environment's runner VM that handles chatops.

Host: `runner-chatops-01-inf-ops.c.gitlab-ops.internal`

### Background

A [PagerDuty alert](https://gitlab.pagerduty.com/incidents/PWOVR0F) reported that the root filesystem on host `runner-chatops-01-inf-ops` had exceeded 90% full. This turned out to be due to slow growth (less than 1% per day).

#### Where and how quickly is disk space being used?

The large majority of disk space was used by container image layers stored under `/var/lib/docker/aufs/diff`:

```
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ sudo du -hxc / | sort -hr > /tmp/du-hxc.sorted.out
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ head /tmp/du-hxc.sorted.out
86G     total
86G     /
81G     /var
80G     /var/lib/docker/aufs/diff
80G     /var/lib/docker/aufs
80G     /var/lib/docker
...
```

The trend in free disk space shows a slow, steady drop.

PromQL:

```
node_filesystem_free_bytes{fqdn="runner-chatops-01-inf-ops.c.gitlab-ops.internal", device="/dev/sda1"}
```

![Screenshot_from_2021-01-25_18-10-23](/uploads/9a19ad190aa9f4f0e618de167467ad28/Screenshot_from_2021-01-25_18-10-23.png)

### What container images are taking up the disk space?

Before manually cleaning up the disk space, I captured the list of images. Most of the disk space seems to be used by images from 2 repos:

* Images from repo `registry.ops.gitlab.net/gitlab-com/chatops` were typically 550 MB, and a fresh image seems to be pulled daily. 79 such images used a total of 43 GB.
* Images from repo `registry.ops.gitlab.net/gitlab-com/gl-infra/tamland` were typically 3.8 GB. 7 such images used a total of 27 GB.
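The per-repository counts and totals in the list above could be gathered in one pass instead of one `grep` per repo. The following is only a sketch: `summarize_images` is a hypothetical helper name, and it assumes sizes are printed in the `541MB` / `3.88GB` style shown by `docker image ls`. Because the size reported for each image includes layers shared with other images, the summed totals are an upper bound on the disk space actually used.

```shell
# Sketch: summarize image count and total size per repository.
# Reads lines of "<repository> <size>" on stdin, e.g. "some/repo 541MB".
# summarize_images is a hypothetical helper, not part of Docker.
summarize_images() {
  awk '
    {
      size = $2; unit = $2
      sub(/[0-9.]+/, "", unit)       # keep only the unit suffix (B, kB, MB, GB)
      sub(/[A-Za-z]+$/, "", size)    # keep only the numeric part
      if (unit == "GB")      mb = size * 1000
      else if (unit == "MB") mb = size
      else if (unit == "kB") mb = size / 1000
      else                   mb = size / 1000000   # plain bytes
      count[$1]++
      total[$1] += mb
    }
    END {
      for (repo in count)
        printf "%s %d %.1fGB\n", repo, count[repo], total[repo] / 1000
    }
  ' | sort
}
```

On the host it could be fed from Docker's own formatter, for example `sudo docker image ls --format '{{.Repository}} {{.Size}}' | summarize_images`.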
Counts:

```
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ sudo docker image ls | grep 'registry.ops.gitlab.net/gitlab-com/gl-infra/tamland' | wc -l
7
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ sudo docker image ls | grep 'registry.ops.gitlab.net/gitlab-com/chatops' | wc -l
79
```

Examples:

```
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ sudo docker image ls | grep 'registry.ops.gitlab.net/gitlab-com/gl-infra/tamland' | head -n 5
registry.ops.gitlab.net/gitlab-com/gl-infra/tamland   latest   61434c234ffb   2 weeks ago    3.88GB
registry.ops.gitlab.net/gitlab-com/gl-infra/tamland   <none>   1af0c85e7ea0   6 weeks ago    3.53GB
registry.ops.gitlab.net/gitlab-com/gl-infra/tamland   <none>   9ac364f52a0b   7 weeks ago    3.52GB
registry.ops.gitlab.net/gitlab-com/gl-infra/tamland   <none>   649d1423884e   2 months ago   3.84GB
registry.ops.gitlab.net/gitlab-com/gl-infra/tamland   <none>   3e12a7ad2679   2 months ago   3.84GB
msmiley@runner-chatops-01-inf-ops.c.gitlab-ops.internal:~$ sudo docker image ls | grep 'registry.ops.gitlab.net/gitlab-com/chatops' | head -n 5
registry.ops.gitlab.net/gitlab-com/chatops   latest   6d6aa6095e68   About an hour ago   541MB
registry.ops.gitlab.net/gitlab-com/chatops   <none>   968e0d60ec89   25 hours ago        541MB
registry.ops.gitlab.net/gitlab-com/chatops   <none>   e3fcfc3e0a28   2 days ago          541MB
registry.ops.gitlab.net/gitlab-com/chatops   <none>   1b10f13ee288   3 days ago          541MB
registry.ops.gitlab.net/gitlab-com/chatops   <none>   92661a8baf9d   3 days ago          541MB
```
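One possible shape for the automatic pruning is a small script run on a schedule (for example from cron). This is a minimal sketch, not the chosen implementation: `RETENTION_HOURS`, `DRY_RUN`, `prune_cmd`, and `run_prune` are hypothetical names, and the one-week retention window is an assumption rather than a value from this issue. It relies on `docker image prune` with `--all` (remove every image not used by a container, not only dangling ones) and the `until` filter (limit to images created before the cutoff).

```shell
#!/bin/sh
# Sketch: prune unused Docker images older than a retention window.
# RETENTION_HOURS and DRY_RUN are hypothetical knobs, not from the issue.
set -eu

RETENTION_HOURS="${RETENTION_HOURS:-168}"  # assumed default: one week
DRY_RUN="${DRY_RUN:-1}"                    # default: only print the command

prune_cmd() {
  # Build the prune command as a string so it can be logged or executed.
  echo "docker image prune --all --force --filter until=${RETENTION_HOURS}h"
}

run_prune() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $(prune_cmd)"
  else
    # Word splitting of the command string is intentional here.
    $(prune_cmd)
  fi
}

run_prune
```

Running it with `DRY_RUN=0` performs the actual prune. Images still referenced by a container are not removed, so a `latest` image in active use should survive regardless of age.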