SGDbOps are continuously re-created
Summary
If an SGDbOps is re-created with kubectl get, kubectl delete and kubectl create (including the status) 10 seconds or less after the SGDbOps started running, it is re-created by the operator.
Current Behaviour
After the first SGDbOps Pod is created with name <sgdbops name>-<op>-<hex msb sgdbops uid>-1-<job suffix>, the operation runs normally, and after a few seconds the Pod is removed and another Pod is created with name <sgdbops name>-<op>-<hex msb sgdbops uid>-2-<job suffix>. This is repeated in a cycle until the operation is able to complete.
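To make the naming pattern concrete, here is a minimal sketch that derives the Job name prefix from the SGDbOps name, the operation, the UID and the run counter. It is inferred only from the logs and YAML in this report (the first 16 hex digits of the UID appear in the Job names), not from the operator source; the function name sgdbops_job_name is hypothetical. The trailing <job suffix> of the Pod name is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: rebuilds the Job name seen in the logs of this report.
# Assumption: "hex msb sgdbops uid" is the first 16 hex digits of the UID
# (its most significant 64 bits), with the dashes removed.
sgdbops_job_name() {
  name="$1"   # SGDbOps name, e.g. failed-restart
  op="$2"     # operation, e.g. restart
  uid="$3"    # metadata.uid of the SGDbOps
  run="$4"    # run counter (0, 1, 2, ... in the logs)

  hex_msb="$(printf '%s' "$uid" | tr -d '-' | cut -c1-16)"
  printf '%s-%s-%s-%s\n' "$name" "$op" "$hex_msb" "$run"
}
```

For example, sgdbops_job_name failed-restart restart 6812166f-60d8-4ef8-96cf-dc9a10136b42 1 yields failed-restart-restart-6812166f60d84ef8-1, which matches the Job name in the logs of this report.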
Steps to reproduce
- Create an SGCluster
- Create a restart SGDbOps targeting the SGCluster
- Dump the SGDbOps YAML to a file
- Remove the SGDbOps
- Create the SGDbOps again using the YAML file
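The last three steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the SGCluster and the restart SGDbOps already exist, the default dump file name is hypothetical, and the KUBECTL variable is an illustrative override (not a kubectl feature) so the commands can be stubbed. Depending on the cluster, server-populated metadata such as resourceVersion may have to be removed from the dump before kubectl create accepts it.

```shell
#!/bin/sh
# Sketch of the dump / remove / re-create steps.
# KUBECTL is an illustrative override; it defaults to the real kubectl.
KUBECTL="${KUBECTL:-kubectl}"

recreate_sgdbops() {
  namespace="$1"
  name="$2"
  dump_file="${3:-sgdbops-dump.yaml}"   # hypothetical default file name

  # Dump the SGDbOps YAML, including its .status, to a file.
  "$KUBECTL" get sgdbops.stackgres.io -n "$namespace" "$name" -o yaml > "$dump_file" || return 1

  # Remove the SGDbOps.
  "$KUBECTL" delete sgdbops.stackgres.io -n "$namespace" "$name" || return 1

  # Create the SGDbOps again using the dumped YAML file.
  "$KUBECTL" create -f "$dump_file"
}
```

With the names from this report, the call would be recreate_sgdbops dbops-restart-failed-618e2e96 failed-restart.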
Expected Behaviour
The operator does not delete the running Pod of the SGDbOps and create a new one after a few seconds.
Possible Solution
Reset the .status when the SGDbOps is created.
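Until the operator does this itself, a user-side equivalent might be to drop the status block from the dumped YAML before creating the SGDbOps again. The sketch below is a workaround idea, not the proposed operator fix, and whether it avoids the behaviour has not been verified here. It assumes YAML as printed by kubectl get -o yaml, with top-level keys at column zero; the function name strip_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: removes the top-level "status:" block from a YAML
# document read on stdin and writes the rest to stdout.
# Assumption: top-level keys start at column zero and nested lines are indented.
strip_status() {
  awk '
    /^status:/ { skip = 1; next }   # start of the top-level status block
    /^[^ ]/    { skip = 0 }         # any other top-level line ends the block
    !skip                           # print only lines outside the block
  '
}
```

A possible use is strip_status < sgdbops-dump.yaml > sgdbops-clean.yaml, followed by kubectl create -f sgdbops-clean.yaml (both file names are illustrative).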
Environment
- StackGres version: 1.0.0
- Kubernetes version: ?
- Cloud provider or hardware configuration: ?
Relevant logs and/or screenshots
2021-11-12 09:15:27,488 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) Checking reconciliation status of SGDbOps dbops-restart-failed-618e2e96/failed-restart
2021-11-12 09:15:27,517 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) SGDbOps dbops-restart-failed-618e2e96/failed-restart it's not up to date. Reconciling
2021-11-12 09:15:27,518 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) Creating resource failed-restart-restart-6812166f60d84ef8-1 of kind: Job
2021-11-12 09:15:27,545 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) Deleting resource failed-restart-restart-6812166f60d84ef8-0 of kind: Job
...
2021-11-12 09:16:36,826 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) SGDbOps dbops-restart-failed-618e2e96/failed-restart it's not up to date. Reconciling
2021-11-12 09:16:36,826 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) Creating resource failed-restart-restart-6812166f60d84ef8-2 of kind: Job
2021-11-12 09:16:36,850 INFO [io.st.op.conciliation] (SGDbOps-ReconciliationLoop) Deleting resource failed-restart-restart-6812166f60d84ef8-1 of kind: Job
2021-11-12 09:16:36,978 INFO [io.st.op.ad.mu.MutationResource] (executor-thread-1) Mutating admission review e18e99d7-9475-4bb1-8660-ba468622f558 of kind Group
...
$ kubectl get sgdbops.stackgres.io -n dbops-restart-failed-618e2e96 failed-restart -o yaml
apiVersion: stackgres.io/v1
kind: SGDbOps
metadata:
  annotations:
    stackgres.io/operatorVersion: 1.1.0-SNAPSHOT
  creationTimestamp: "2021-11-12T09:13:51Z"
  generation: 9
  name: failed-restart
  namespace: dbops-restart-failed-618e2e96
  resourceVersion: "29814"
  selfLink: /apis/stackgres.io/v1/namespaces/dbops-restart-failed-618e2e96/sgdbops/failed-restart
  uid: 6812166f-60d8-4ef8-96cf-dc9a10136b42
spec:
  op: restart
  restart:
    method: InPlace
  sgCluster: dbops-restart-failed
status:
  conditions:
  - lastTransitionTime: "2021-11-12T09:22:30.783219Z"
    reason: OperationRunning
    status: "True"
    type: Running
  - lastTransitionTime: "2021-11-12T09:22:30.783219Z"
    reason: OperationNotCompleted
    status: "False"
    type: Completed
  - lastTransitionTime: "2021-11-12T09:22:30.783219Z"
    reason: OperationNotFailed
    status: "False"
    type: Failed
  opRetries: 8
  opStarted: "2021-11-12T09:22:30.728709Z"
  restart:
    failure: Postgres of instance dbops-restart-failed-0 failed
    initialInstances:
    - dbops-restart-failed-0
    pendingToRestartInstances:
    - dbops-restart-failed-0
    primaryInstance: dbops-restart-failed-0
Edited by Matteo Melli