Cluster may become unresponsive when the primary Pod is deleted
Summary
If the primary Pod is deleted while it is not the Pod at index 0 of the StatefulSet, and a Pod is missing at any lower index, the primary Pod is not restarted.
Current Behaviour
The Pod at the missing index that precedes the deleted primary Pod's index is started instead.
This is caused by the implementation from #89 (closed): to avoid disrupting the primary on scale down, it changes the labels of the primary Pod so that the Pod is considered outside the control of the StatefulSet when its index is greater than 0 and less than the number of replicas of the StatefulSet minus 1.
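The behaviour can be illustrated with a simplified model of how a StatefulSet using the OrderedReady policy fills missing indexes, lowest first. This is only a sketch, not StackGres or Kubernetes code: the function name next_pod_to_create is hypothetical, and the replica count of 1 used in the example is an assumption based on the reproduction scenario.

```python
from typing import Optional, Set


def next_pod_to_create(replicas: int, existing: Set[int]) -> Optional[int]:
    """Return the lowest missing index below `replicas`, or None if none is missing.

    Simplified model: an OrderedReady StatefulSet creates Pods in index order,
    so the lowest missing index is always created first.
    """
    for index in range(replicas):
        if index not in existing:
            return index
    return None


# Scenario from this issue (replica count assumed to be 1): the Pod at
# index 0 was removed by the scale down and the primary at index 1 was deleted.
print(next_pod_to_create(replicas=1, existing=set()))  # 0
# Once index 0 exists nothing else is created, so index 1 never comes back.
print(next_pod_to_create(replicas=1, existing={0}))  # None
```

Under this model, index 0 is the only candidate for creation, which matches the observation that the former primary at index 1 is not restarted.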
Steps to reproduce
- Create a cluster with 2 instances
- Perform a switchover from the instance at index 0 to the instance at index 1
- Change the cluster to have 1 instance
- Wait for the instance at index 0 to be terminated
- Insert some data
- Perform a checkpoint
- Delete the instance at index 1
Expected Behaviour
The previous primary Pod is started.
Possible Solution
The following procedure has been tested successfully:
- The primary Pod is not the first Pod of the StatefulSet (it should appear as the last element of the array in the .metadata.annotations.history field).
- The primary Pod is deleted.
- The StatefulSet is re-created with .spec.podManagementPolicy set to Parallel, adding an entry to .spec.template.spec.nodeSelector so that the newly created Pods are not scheduled (for example {"unschedule":"true"}), and with .spec.replicas set to allow creation of the Pods before the last primary Pod in history.
- The StatefulSet is re-created with .spec.podManagementPolicy set to Parallel, removing the previously added entry from .spec.template.spec.nodeSelector (for example {"unschedule":"true"}) so that the newly created Pod is scheduled, and with .spec.replicas set to allow creation of the last primary Pod in history.
- Set the label .metadata.labels.disruptible to false for the last primary Pod in history.
- The StatefulSet is re-created with .spec.podManagementPolicy set to OrderedReady and .spec.replicas set as in step 1.
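The sequence above could be expressed as a list of StatefulSet spec overrides, roughly as in the sketch below. The helper names sts_spec and recovery_plan are hypothetical, and the replica values are assumptions: primary_index for the placeholder Pods and primary_index + 1 for the primary Pod are one reading of "allow creation of", while the final replica count is left to the caller because the issue does not state it. Only the overridden fields are shown, and the empty nodeSelector assumes the template had no other selector entries.

```python
from typing import Any, Dict, List


def sts_spec(policy: str, replicas: int, node_selector: Dict[str, str]) -> Dict[str, Any]:
    """Build only the StatefulSet .spec fields that the procedure changes."""
    return {
        "podManagementPolicy": policy,
        "replicas": replicas,
        "template": {"spec": {"nodeSelector": node_selector}},
    }


def recovery_plan(primary_index: int, final_replicas: int) -> List[Dict[str, Any]]:
    """List the actions of the procedure, in order, for the given primary index."""
    blocker = {"unschedule": "true"}  # example selector taken from the issue
    return [
        # Placeholder Pods for the lower indexes; the selector keeps them unscheduled.
        {"action": "recreate", "spec": sts_spec("Parallel", primary_index, blocker)},
        # Selector removed and one more replica, so the primary Pod can be created.
        {"action": "recreate", "spec": sts_spec("Parallel", primary_index + 1, {})},
        # Mark the last primary Pod in history as not disruptible.
        {"action": "label", "pod_index": primary_index, "labels": {"disruptible": "false"}},
        # Return to ordered management with the caller-supplied replica count.
        {"action": "recreate", "spec": sts_spec("OrderedReady", final_replicas, {})},
    ]


for step in recovery_plan(primary_index=1, final_replicas=1):
    print(step)
```

The plan only describes the intended changes; applying them to a cluster and waiting for each step to complete is outside the scope of this sketch.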
Environment
- StackGres version: 1.0.0
- Kubernetes version: ?
- Cloud provider or hardware configuration: ?
Edited by Matteo Melli