Create saturation alerts for node networking space
As seen in &571, there are limits on how many nodes we are allowed to spin up in a GKE cluster. Use this issue to track the creation of saturation alerts that warn us and page us when we are approaching, or have reached, those limits.
Milestones
- Determine the true maximum node limit. We have a /22 subnet for our regional cluster and a /24 subnet for our zonal clusters. Are we limited to 254 nodes, or are some IPs reserved for other functionality? Knowing this will drive how the alert is managed.
- Create a saturation alert for our clusters that fires when we are running low on, or are out of, available IPs.
- Create a saturation alert that tells us when a single node (limited to 110 Pods) is running out of room to host Pods.
- Create appropriate runbooks for handling the above alerts.
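
To make the first milestone concrete, here is a rough Python sketch of the arithmetic behind both alerts. It assumes Google Cloud reserves four addresses in each primary subnet range (network, gateway, second-to-last, and broadcast), which would put the ceiling below 254 for a /24. The CIDRs, function names, and the 80%/95% thresholds are placeholders for illustration, not our real values, and the sketch does not model the Pod secondary range, which may impose a lower node limit than the node subnet does.

```python
import ipaddress

# Assumption: GCP reserves 4 addresses per primary subnet range.
GCP_RESERVED_PER_SUBNET = 4
# From this issue: a single node is limited to 110 Pods.
MAX_PODS_PER_NODE = 110


def usable_node_ips(cidr: str, reserved: int = GCP_RESERVED_PER_SUBNET) -> int:
    """Return how many node IPs a subnet can hand out after reservations."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - reserved, 0)


def saturation(used: int, capacity: int) -> float:
    """Return the fraction of capacity in use (0.0 = empty, 1.0 = full)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return used / capacity


def alert_level(ratio: float, warn: float = 0.80, page: float = 0.95) -> str:
    """Map a saturation ratio to a hypothetical alert severity."""
    if ratio >= page:
        return "page"
    if ratio >= warn:
        return "warn"
    return "ok"


if __name__ == "__main__":
    # Placeholder CIDRs, not the real cluster ranges.
    for name, cidr in [("regional", "10.0.0.0/22"), ("zonal", "10.0.4.0/24")]:
        print(name, cidr, usable_node_ips(cidr))
    # Example: a node currently hosting 99 of its 110 Pods.
    print(alert_level(saturation(99, MAX_PODS_PER_NODE)))
```

If the four-address assumption holds, the real alerts would use the same ratio, with the used counts coming from cluster metrics rather than hard-coded numbers.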
Edited by John Skarbek