Evicted pods in EKS, DiskPressure on nodes
We periodically see evicted pods in EKS which require manual cleanup using something like:
kubectl get pods | awk '/Evicted/{ print $1}' | xargs kubectl delete pod
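A slightly more defensive variant of that one-liner might look like the sketch below. It assumes the default six-column output of `kubectl get pods --all-namespaces --no-headers` (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); the function names `evicted_pods` and `delete_evicted` are hypothetical helpers, not existing tooling.

```shell
# Print "namespace name" for every pod whose STATUS column is exactly "Evicted".
# Input (stdin): output of `kubectl get pods --all-namespaces --no-headers`.
evicted_pods() {
  awk '$4 == "Evicted" { print $1, $2 }'
}

# Delete each evicted pod in its own namespace.
# The loop body never runs when there is nothing to delete,
# so kubectl is not invoked with an empty argument list.
delete_evicted() {
  evicted_pods | while read -r ns name; do
    kubectl delete pod --namespace "$ns" "$name"
  done
}

# Assumed usage against a live cluster:
#   kubectl get pods --all-namespaces --no-headers | delete_evicted
```

Matching on the STATUS column rather than on the whole line avoids deleting a pod that merely has "Evicted" somewhere in its name.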
To improve overall environmental hygiene and minimize resource consumption, we should categorize these evictions to better understand failures and automate cleanup.
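As a starting point for categorization, evictions could be grouped by the message recorded in each pod's status. This is a minimal sketch: `summarize_evictions` is a hypothetical helper, and it assumes tab-separated input of namespace, name, `.status.reason`, and `.status.message`, produced by a jsonpath query like the one in the usage comment.

```shell
# Count evicted pods per eviction message, most frequent first.
# Input (stdin): "namespace<TAB>name<TAB>reason<TAB>message", one pod per line.
summarize_evictions() {
  awk -F'\t' '
    $3 == "Evicted" { count[$4]++ }
    END { for (m in count) printf "%d\t%s\n", count[m], m }
  ' | sort -rn
}

# Assumed usage against a live cluster:
#   kubectl get pods --all-namespaces \
#     -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.reason}{"\t"}{.status.message}{"\n"}{end}' \
#     | summarize_evictions
```

Depending on the cluster version, the message may include per-container details, in which case grouping on the bracketed resource name alone would probably be more useful.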
Recently, this has often been caused by DiskPressure on the nodes in the cluster. We should confirm the size of the backing disks, identify the cause of the pressure, and probably update the cluster version as well.
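To see which nodes currently report the condition, something like the following sketch could help. `nodes_under_disk_pressure` is a hypothetical helper; it assumes tab-separated input of node name and the status of the `DiskPressure` node condition, as produced by the jsonpath query in the usage comment.

```shell
# Print the names of nodes whose DiskPressure condition is "True".
# Input (stdin): "nodeName<TAB>conditionStatus", one node per line.
nodes_under_disk_pressure() {
  awk -F'\t' '$2 == "True" { print $1 }'
}

# Assumed usage against a live cluster:
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}' \
#     | nodes_under_disk_pressure
```

This only shows nodes under pressure at the moment of the query. Evictions that happened earlier would need to be correlated through events or pod status instead.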
Suspects for disk fill:
- logs
- our images, and the many revisions of them.
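For the image suspect, a rough estimate is available without logging in to a node, because each Node object lists the images the kubelet knows about together with their sizes. The sketch below sums those sizes. `image_bytes_total` is a hypothetical helper and `NODE_NAME` is a placeholder. The kubelet reports only a bounded number of images per node, so the result is best read as a lower bound. Log usage is not visible this way and would need to be checked on the node itself.

```shell
# Sum image sizes and print the total in GiB.
# Input (stdin): "sizeBytes<TAB>imageName", one image per line.
image_bytes_total() {
  awk -F'\t' '
    { total += $1 }
    END { printf "%.2f GiB across %d images\n", total / (1024 * 1024 * 1024), NR }
  '
}

# Assumed usage against a live cluster (NODE_NAME is a placeholder):
#   kubectl get node "$NODE_NAME" \
#     -o jsonpath='{range .status.images[*]}{.sizeBytes}{"\t"}{.names[0]}{"\n"}{end}' \
#     | image_bytes_total
```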
The pod termination message is:
Pod The node was low on resource: [DiskPressure].
/cc @WarheadsSE
Edited by Jason Plum