Add a unit to clean up multus cache files on nodes
What does this MR do and why?
Multus does not clean up /var/lib/cni/multus automatically, which leads to inode exhaustion on the host file system. We need to handle the cleanup ourselves for now.
Add a unit to regularly clean multus cache located in /var/lib/cni/multus.
The cleanup is done in two phases:
- first, for each cache file in /var/lib/cni/multus/results, we look for the related pod and remove the file if that pod no longer exists.
- then, for each configuration file in /var/lib/cni/multus, if we cannot find a cache file with the matching uid, we remove the configuration file.
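For readers who want the logic at a glance, the two phases can be sketched as below. This is not the script shipped in this MR, only a minimal illustration under stated assumptions: cache files are assumed to be JSON carrying a `cniArgs` list with `K8S_POD_NAMESPACE` and `K8S_POD_NAME` entries, a configuration file is assumed to be named after the id embedded in the cache file names (as the log output suggests), and `clean_multus_cache`, `pod_of` and `pod_exists` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, List, Optional, Tuple


def pod_of(cache_file: Path) -> Optional[Tuple[str, str]]:
    """Return the (namespace, name) recorded in a cache file, or None.

    Assumption: the file is JSON with a "cniArgs" list of [key, value] pairs.
    """
    try:
        data = json.loads(cache_file.read_text())
        args = dict(data.get("cniArgs") or [])
    except (OSError, ValueError, TypeError, AttributeError):
        return None
    namespace, name = args.get("K8S_POD_NAMESPACE"), args.get("K8S_POD_NAME")
    return (namespace, name) if namespace and name else None


def clean_multus_cache(
    multus_dir: Path, pod_exists: Callable[[str, str], bool]
) -> List[Path]:
    """Run both cleanup phases and return the paths that were removed."""
    removed: List[Path] = []
    results = multus_dir / "results"

    # Phase 1: drop cache files whose pod is gone.
    # Files we cannot parse are kept, to stay on the safe side.
    if results.is_dir():
        for cache_file in sorted(results.iterdir()):
            if not cache_file.is_file():
                continue
            pod = pod_of(cache_file)
            if pod is not None and not pod_exists(*pod):
                cache_file.unlink()
                removed.append(cache_file)

    # Phase 2: drop configuration files that no remaining cache file refers to.
    # Assumption: the configuration file name appears inside the cache file name.
    remaining = (
        [f.name for f in results.iterdir() if f.is_file()] if results.is_dir() else []
    )
    for config_file in sorted(multus_dir.iterdir()):
        if config_file.is_file() and not any(config_file.name in n for n in remaining):
            config_file.unlink()
            removed.append(config_file)

    return removed
```

The pod lookup is passed in as a callable so that the sketch does not depend on how the API server is queried. Whatever is plugged in should return `False` only on a definite "not found" answer, so that a transient API error never causes a file that is still in use to be deleted.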
I tried to configure it as a CronJob duplicated for each node but could not get Kyverno to play nicely, so we're stuck with a for loop and a 1h timer.
Logs from the daemonset:
[INFO] Cleaning /host/var/lib/cni/multus/results.
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-c4f718fb509e4c0d49471a86ccdefdbdf4ac309336f9e99c9d9ae4dddac17c54-eth0 is still relevant
Error from server (NotFound): pods "trivy-scan-5528d7d5e5bb3b48f674815e322af7c2da85cd392f6bc5cd7djw" not found
[INFO] pod continuous-scanning/trivy-scan-5528d7d5e5bb3b48f674815e322af7c2da85cd392f6bc5cd7djw for /host/var/lib/cni/multus/results/portmap-c4f718fb509e4c0d49471a86ccdefdbdf4ac309336f9e99c9d9ae4dddac17c54-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-06e24403c95595a66fccbe3280f0386f2f5b4b6897f77318198045070c80b3c2-eth0 is still relevant
Error from server (NotFound): pods "reverse-proxy-odic-oklozi-64f79b8c7b-t5zmg" not found
[INFO] pod customer-vpc-test/reverse-proxy-odic-oklozi-64f79b8c7b-t5zmg for /host/var/lib/cni/multus/results/portmap-06e24403c95595a66fccbe3280f0386f2f5b4b6897f77318198045070c80b3c2-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-d8ca1144e070029de989de5a1c03bbb50dbf39d1ed3afbd9f78ddb1755a173d0-eth0 is still relevant
Error from server (NotFound): pods "restart-whereabouts-29167260-tpnj7" not found
[INFO] pod victoria-maintenance/restart-whereabouts-29167260-tpnj7 for /host/var/lib/cni/multus/results/portmap-d8ca1144e070029de989de5a1c03bbb50dbf39d1ed3afbd9f78ddb1755a173d0-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-3b15547bcfee90be7c088aa24847da5d50800f4c831ada110b3f0b1e370ec39d-eth0 is still relevant
Error from server (NotFound): pods "trivy-scan-44c0a3de0bb0a687f8e0c1845a9d1dd3835ec17dc407e2f2cwwk" not found
[INFO] pod continuous-scanning/trivy-scan-44c0a3de0bb0a687f8e0c1845a9d1dd3835ec17dc407e2f2cwwk for /host/var/lib/cni/multus/results/portmap-3b15547bcfee90be7c088aa24847da5d50800f4c831ada110b3f0b1e370ec39d-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-71f9a86e64d311a2ecd85edfb1ac92badce7a6bf5366c4222354d4615bce1202-eth0 is still relevant
Error from server (NotFound): pods "trivy-scan-1c5cdfc2f5fe97ebaa91e750fae62f0915abaa79c85acb3rh9vs" not found
[INFO] pod continuous-scanning/trivy-scan-1c5cdfc2f5fe97ebaa91e750fae62f0915abaa79c85acb3rh9vs for /host/var/lib/cni/multus/results/portmap-71f9a86e64d311a2ecd85edfb1ac92badce7a6bf5366c4222354d4615bce1202-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-cdd93a753a38d04c1b1e00e3ff1f37030e3e6e082979c08d17613228e236fb09-eth0 is still relevant
Error from server (NotFound): pods "logging-fluentbit-7zz9b" not found
[INFO] pod logging/logging-fluentbit-7zz9b for /host/var/lib/cni/multus/results/portmap-cdd93a753a38d04c1b1e00e3ff1f37030e3e6e082979c08d17613228e236fb09-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-aeab98d9bfa7a8cd342bd0fbc701ac17f3a95329cdec90cc53414f6b8e1c5226-eth0 is still relevant
Error from server (NotFound): pods "proxy-gin-7b6df58c87-mmp2h" not found
[INFO] pod user-proxy-gin/proxy-gin-7b6df58c87-mmp2h for /host/var/lib/cni/multus/results/portmap-aeab98d9bfa7a8cd342bd0fbc701ac17f3a95329cdec90cc53414f6b8e1c5226-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-d14e199cd876d7eff4e42562d076675bf99a2b4f1225371b5d9cdcc40ef08b4d-eth0 is still relevant
Error from server (NotFound): pods "cleanup-pods-missing-vrf-network-helper-29173155-ddkll" not found
[INFO] pod victoria-maintenance/cleanup-pods-missing-vrf-network-helper-29173155-ddkll for /host/var/lib/cni/multus/results/portmap-d14e199cd876d7eff4e42562d076675bf99a2b4f1225371b5d9cdcc40ef08b4d-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-f5928d3ac8d4f6537711b1c8bdbcba179a6dbf70e0f4e7e917497319f016c93c-eth0 is still relevant
Error from server (NotFound): pods "proxy-gin-569b4b998b-wmdcm" not found
[INFO] pod user-proxy-gin/proxy-gin-569b4b998b-wmdcm for /host/var/lib/cni/multus/results/portmap-f5928d3ac8d4f6537711b1c8bdbcba179a6dbf70e0f4e7e917497319f016c93c-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-b14cdf2dc7ef1b1b867418380596ced0b63e8e1dae564940f409585744ef1420-eth0 is still relevant
Error from server (NotFound): pods "rke2-coredns-rke2-coredns-6549f9b75-b624x" not found
[INFO] pod kube-system/rke2-coredns-rke2-coredns-6549f9b75-b624x for /host/var/lib/cni/multus/results/portmap-b14cdf2dc7ef1b1b867418380596ced0b63e8e1dae564940f409585744ef1420-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-6cd15b550ea703018d22caac926cea22153f864b7a89bbc4f849fdf229307a20-eth0 is still relevant
Error from server (NotFound): pods "trivy-scan-220da6995a919b9ee6e0d3da7ca5f09802f3088007af56bk8lxd" not found
[INFO] pod continuous-scanning/trivy-scan-220da6995a919b9ee6e0d3da7ca5f09802f3088007af56bk8lxd for /host/var/lib/cni/multus/results/portmap-6cd15b550ea703018d22caac926cea22153f864b7a89bbc4f849fdf229307a20-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-e8be26214f7969aaf12486e9cc3b71b0eda23a7a89a6280f035991253008baac-eth0 is still relevant
Error from server (NotFound): pods "cleanup-nonready-pods-29173540-nrkbl" not found
[INFO] pod victoria-maintenance/cleanup-nonready-pods-29173540-nrkbl for /host/var/lib/cni/multus/results/portmap-e8be26214f7969aaf12486e9cc3b71b0eda23a7a89a6280f035991253008baac-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-866c9f0a92f4970d4d0fccf7cc4fcc13915b67dafb56145d54febd734a9b4f1b-eth0 is still relevant
Error from server (NotFound): pods "proxy-gin-b9f8c846b-w98c5" not found
[INFO] pod user-proxy-gin/proxy-gin-b9f8c846b-w98c5 for /host/var/lib/cni/multus/results/portmap-866c9f0a92f4970d4d0fccf7cc4fcc13915b67dafb56145d54febd734a9b4f1b-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-24f87c6c280d26d7a0b82cbc21edef47b363e2f0ba151e61672996b5b793ceee-eth0 is still relevant
Error from server (NotFound): pods "cleanup-nonready-pods-29168645-p96m8" not found
[INFO] pod victoria-maintenance/cleanup-nonready-pods-29168645-p96m8 for /host/var/lib/cni/multus/results/portmap-24f87c6c280d26d7a0b82cbc21edef47b363e2f0ba151e61672996b5b793ceee-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-76d0b90e0a010182960b4de0ff05187c2f3bdc99e3fd38fa16c0b5e1308feae0-eth0 is still relevant
Error from server (NotFound): pods "ssh-jumphost-auth-758d4b9645-gvjqk" not found
[INFO] pod customer-vpc-test/ssh-jumphost-auth-758d4b9645-gvjqk for /host/var/lib/cni/multus/results/portmap-76d0b90e0a010182960b4de0ff05187c2f3bdc99e3fd38fa16c0b5e1308feae0-eth0 no longer exists, removing it
[INFO] Checking if /host/var/lib/cni/multus/results/portmap-3bef683ff550da43e3a9bde6cda2fd2b9e3bbdec61876ca0f3e5024c7b784ae8-eth0 is still relevant
Error from server (NotFound): pods "trivy-scan-0213d10db6bb811e78806c8c18c0bdc331810e16d63f531lh5rd" not found
[INFO] pod continuous-scanning/trivy-scan-0213d10db6bb811e78806c8c18c0bdc331810e16d63f531lh5rd for /host/var/lib/cni/multus/results/portmap-3bef683ff550da43e3a9bde6cda2fd2b9e3bbdec61876ca0f3e5024c7b784ae8-eth0 no longer exists, removing it
...
[INFO] Checking if /host/var/lib/cni/multus/b4d204afa495dc92c336abc33f99cffb3fbeab8b6fbaa9e5d43a7cef851862d7 is still relevant
[INFO] /host/var/lib/cni/multus/b4d204afa495dc92c336abc33f99cffb3fbeab8b6fbaa9e5d43a7cef851862d7 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/f1b7126a1897063fd7c06a919f956e1616196e77bb57560c38b0d47f3dac17d7 is still relevant
[INFO] /host/var/lib/cni/multus/f1b7126a1897063fd7c06a919f956e1616196e77bb57560c38b0d47f3dac17d7 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/956fe1544c729c27fc0c35dc9994d065bd55dd7aabb3a93ded0e942818eea369 is still relevant
[INFO] /host/var/lib/cni/multus/956fe1544c729c27fc0c35dc9994d065bd55dd7aabb3a93ded0e942818eea369 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/822161725c60de28de1a3538877bb2f48a0850135842bc1156b7ec9679c40fb8 is still relevant
[INFO] /host/var/lib/cni/multus/822161725c60de28de1a3538877bb2f48a0850135842bc1156b7ec9679c40fb8 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/00f0bf103b95f35ca2c24b07502b64458212dd44078f198172010283a3dad139 is still relevant
[INFO] /host/var/lib/cni/multus/00f0bf103b95f35ca2c24b07502b64458212dd44078f198172010283a3dad139 is still used
[INFO] Checking if /host/var/lib/cni/multus/bf6c0aa815c04546d86b611de8c99b6c7cc57105418c982f06e506484059bc9b is still relevant
[INFO] /host/var/lib/cni/multus/bf6c0aa815c04546d86b611de8c99b6c7cc57105418c982f06e506484059bc9b is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/4627649182b5c07165849627ed7dacd9ffafe63608d6485d5b74bceacf99f3be is still relevant
[INFO] /host/var/lib/cni/multus/4627649182b5c07165849627ed7dacd9ffafe63608d6485d5b74bceacf99f3be is still used
[INFO] Checking if /host/var/lib/cni/multus/9ddef955239910958581d7b6a0dfc414f9894d3be11abf7728cc7036ab81b0ab is still relevant
[INFO] /host/var/lib/cni/multus/9ddef955239910958581d7b6a0dfc414f9894d3be11abf7728cc7036ab81b0ab is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/4182f042dbbd315221484277d2651fdd41fb46181d532e3bed380d5ef4bbcbe4 is still relevant
[INFO] /host/var/lib/cni/multus/4182f042dbbd315221484277d2651fdd41fb46181d532e3bed380d5ef4bbcbe4 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/878c7b13e3342db34b152f1f51af4c5dea36c7a56ea81ad443be1604f9572c7a is still relevant
[INFO] /host/var/lib/cni/multus/878c7b13e3342db34b152f1f51af4c5dea36c7a56ea81ad443be1604f9572c7a is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/ba5914ebc28dfdc4bbf89cbb5878ed3748221bcf91397b7dd342f76ac092232e is still relevant
[INFO] /host/var/lib/cni/multus/ba5914ebc28dfdc4bbf89cbb5878ed3748221bcf91397b7dd342f76ac092232e is still used
[INFO] Checking if /host/var/lib/cni/multus/ecc0af5ca9260c7a317379dea8d4d64ec2000ca522b30fa931f54162099f020f is still relevant
[INFO] /host/var/lib/cni/multus/ecc0af5ca9260c7a317379dea8d4d64ec2000ca522b30fa931f54162099f020f is still used
[INFO] Checking if /host/var/lib/cni/multus/748602cb29e7d3a512fd89f213b0a1da83d4ecce2d8fd128263bc5edf68a4a05 is still relevant
[INFO] /host/var/lib/cni/multus/748602cb29e7d3a512fd89f213b0a1da83d4ecce2d8fd128263bc5edf68a4a05 is still used
[INFO] Checking if /host/var/lib/cni/multus/c08a06481888f98a7f301f1288247befc6fb7692a66c1f04e909951ee735f8f7 is still relevant
[INFO] /host/var/lib/cni/multus/c08a06481888f98a7f301f1288247befc6fb7692a66c1f04e909951ee735f8f7 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/a3a855d9d27ebaed532441ac0c0a058aeceb0bf19a1e571540822f342d18c44e is still relevant
[INFO] /host/var/lib/cni/multus/a3a855d9d27ebaed532441ac0c0a058aeceb0bf19a1e571540822f342d18c44e is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/5ff48b5d85fe09d51e4b5892361caab0f9e9a59ddcd0a5e0cb289289e72bb5f7 is still relevant
[INFO] /host/var/lib/cni/multus/5ff48b5d85fe09d51e4b5892361caab0f9e9a59ddcd0a5e0cb289289e72bb5f7 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/4a033cbe3c3aa5da49ac556b153204e79ebcee6c6c0d899ec28389a44fa75cfa is still relevant
[INFO] /host/var/lib/cni/multus/4a033cbe3c3aa5da49ac556b153204e79ebcee6c6c0d899ec28389a44fa75cfa is still used
[INFO] Checking if /host/var/lib/cni/multus/03c41827353513cf7b92e3a78d689721bfd96d9034d9c468c75b666258ba907d is still relevant
[INFO] /host/var/lib/cni/multus/03c41827353513cf7b92e3a78d689721bfd96d9034d9c468c75b666258ba907d is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/563c3c5d17fe3b22d05adeb27649f95a2c19389f9de247eea91995590873d791 is still relevant
[INFO] /host/var/lib/cni/multus/563c3c5d17fe3b22d05adeb27649f95a2c19389f9de247eea91995590873d791 is still used
[INFO] Checking if /host/var/lib/cni/multus/6620c498e027d5062598fb3e6ee4d78db6d9e0ed54f673498095ef87893f54a7 is still relevant
[INFO] /host/var/lib/cni/multus/6620c498e027d5062598fb3e6ee4d78db6d9e0ed54f673498095ef87893f54a7 is still used
[INFO] Checking if /host/var/lib/cni/multus/1f16a2d8c7f39a5eabce44ecaaf86255553b329a57c4c5486f51b0b4db94501a is still relevant
[INFO] /host/var/lib/cni/multus/1f16a2d8c7f39a5eabce44ecaaf86255553b329a57c4c5486f51b0b4db94501a is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/c3e078b931df64a3f4dc47f52459f15c5746281d1607584c4c95f85cdb7f8ee3 is still relevant
[INFO] /host/var/lib/cni/multus/c3e078b931df64a3f4dc47f52459f15c5746281d1607584c4c95f85cdb7f8ee3 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/6af75d0148c2a80d9c7500c8688babcb2f5380bd4f18fa84a8b9a282da61e6cb is still relevant
[INFO] /host/var/lib/cni/multus/6af75d0148c2a80d9c7500c8688babcb2f5380bd4f18fa84a8b9a282da61e6cb is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/9fea7132f3925c51c907e52dc6883fbd0488cd25db51356e344182be9c62d631 is still relevant
[INFO] /host/var/lib/cni/multus/9fea7132f3925c51c907e52dc6883fbd0488cd25db51356e344182be9c62d631 is no longer used, removing it
[INFO] Checking if /host/var/lib/cni/multus/c0558409a4e8d4faf448e98d090b9038e415e422ed517aa057f19ce40ec9c153 is still relevant
[INFO] /host/var/lib/cni/multus/c0558409a4e8d4faf448e98d090b9038e415e422ed517aa057f19ce40ec9c153 is no longer used, removing it
Related reference(s)
Test coverage
I tested this locally on a cluster where multus had been running for ~6 months; the script cleared thousands of files. I then made sure multus still managed to assign interfaces and IPs to pods.
CI configuration
Below you can choose test deployment variants to run in this MR's CI.
Legend:
| Icon | Meaning | Available values |
|---|---|---|
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2 |
| 🐧 | Node OS | ubuntu, suse |
| 🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging, no-logging |
| 🎬 | Pipeline Scenarios | Available scenario list and description |
- 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu
- 🎬 preview ☁️ capo 🚀 rke2 🐧 suse
- 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu
- ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu
- ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse
- ☁️ capo 🚀 rke2 🐧 suse
- ☁️ capo 🚀 kadm 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.4.x 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.4.x 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🐧 suse
- ☁️ capm3 🚀 kadm 🐧 ubuntu
- ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.4.x 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.4.x 🛠️ ha,misc 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 ck8s 🎬 no-wkld 🛠️ light-deploy 🐧 ubuntu
Global config for deployment pipelines
- autorun pipelines
- allow failure on pipelines
- record sylvactl events
Notes:
- Enabling `autorun` makes deployment pipelines run automatically, without human interaction.
- Disabling `allow failure` makes deployment pipelines mandatory for pipeline success.
- If both `autorun` and `allow failure` are disabled, deployment pipelines need manual triggering but still block the pipeline.
Be aware: after a configuration change, the pipeline is not triggered automatically.
Please run it manually (by clicking the run pipeline button in the Pipelines tab) or push new code.