CDS not rebuilding broken pods
If all Pods of a CDS are stuck in a permanently unavailable state (e.g. ImagePullBackOff because a wrong image was specified), the CDS will never fix this on its own.
These pods count against missing_pods,
which ensures that we do not break any more pods with a new configuration.
However, the only way out of this currently is to kill enough pods so that at most max_unavailable pods
are still broken. The operator will then pick up its work again.
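To make the arithmetic of this workaround concrete, here is a minimal sketch in Python. It is not the operator's actual code: the function names are hypothetical, and it assumes the operator resumes as soon as the number of missing pods is no larger than max_unavailable.

```python
def operator_may_proceed(missing_pods: int, max_unavailable: int) -> bool:
    """Assumed gate: the operator only continues while the number of
    missing (unavailable) pods does not exceed max_unavailable."""
    return missing_pods <= max_unavailable


def pods_to_kill(broken_pods: int, max_unavailable: int) -> int:
    """Number of broken pods that have to be deleted by hand so that at
    most max_unavailable broken pods remain."""
    return max(0, broken_pods - max_unavailable)


if __name__ == "__main__":
    # Example: 5 pods stuck in ImagePullBackOff, max_unavailable = 2.
    print(operator_may_proceed(5, 2))
    print(pods_to_kill(5, 2))
```

With 5 broken pods and max_unavailable set to 2, three pods would have to be killed manually before the gate opens again.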
Currently my only idea is to allow killing pods if they are stale and are not ready because of a reason with BackOff
in its name.
However, this would cause the CDS operator to delete such pods all the time, which might make it harder to look at their logs.
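The proposed check could look roughly like the following sketch. It works on a pod given as a plain dict in the shape of the Kubernetes Pod status (status.conditions and status.containerStatuses[].state.waiting.reason). The function names are hypothetical, and how the operator decides that a pod is stale is not covered here, so staleness is simply passed in as a flag.

```python
def is_ready(pod: dict) -> bool:
    """True if the pod reports the Ready condition with status "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def backoff_reasons(pod: dict) -> list:
    """Collect waiting reasons that contain "BackOff",
    e.g. ImagePullBackOff or CrashLoopBackOff."""
    reasons = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting") or {}
        reason = waiting.get("reason") or ""
        if "BackOff" in reason:
            reasons.append(reason)
    return reasons


def may_kill(pod: dict, is_stale: bool) -> bool:
    """Assumed rule: only stale pods that are not ready because of a
    BackOff reason may be deleted by the operator."""
    return is_stale and not is_ready(pod) and bool(backoff_reasons(pod))


# Hypothetical example pod stuck on a wrong image.
broken_pod = {
    "status": {
        "conditions": [{"type": "Ready", "status": "False"}],
        "containerStatuses": [
            {"state": {"waiting": {"reason": "ImagePullBackOff"}}}
        ],
    }
}
```

Here `may_kill(broken_pod, is_stale=True)` evaluates to true, while a pod that is still stale but ready, or not ready for a reason without BackOff in its name, would be left alone.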