Patroni cluster keeps failing over due to Endpoints annotations being removed
Summary
Given a 2-node cluster, the nodes keep failing over; attempting a manual failover also fails.
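For reference, the manual failover was attempted along these lines (the exact invocation is an assumption, following the same patronictl pattern as the commands below):

kubectl exec -it my-db-cluster-0 -c patroni -- patronictl failover my-db-cluster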
Current Behaviour
Once the cluster is ready, the Patroni state (from patronictl list) is wrong:
+ Cluster: my-db-cluster (6979305705338785925) -+-----------+
|      Member     | Host |   Role  | State | TL | Lag in MB |
+-----------------+------+---------+-------+----+-----------+
| my-db-cluster-0 |      | Replica |       |    |  unknown  |
| my-db-cluster-1 |      |  Leader |       |    |           |
| my-db-cluster-2 |      | Replica |       |    |  unknown  |
+-----------------+------+---------+-------+----+-----------+
The Postgres timeline also keeps increasing:
➜ kubectl exec -it my-db-cluster-0 -c patroni -- patronictl topology
+ Cluster: my-db-cluster (uninitialized) -+---------+---------+----+-----------+
|       Member      |         Host        |   Role  |  State  | TL | Lag in MB |
+-------------------+---------------------+---------+---------+----+-----------+
| my-db-cluster-0   |                     |  Leader |         |    |           |
| + my-db-cluster-1 | 192.168.18.65:7433  | Replica | running | 12 |         0 |
| + my-db-cluster-2 | 192.168.43.218:7433 | Replica | running | 12 |         0 |
+-------------------+---------------------+---------+---------+----+-----------+
and after a few minutes (or less):
➜ kubectl exec -it my-db-cluster-0 -c patroni -- patronictl topology
+ Cluster: my-db-cluster (uninitialized) -+---------+---------+----+-----------+
|       Member      |         Host        |   Role  |  State  | TL | Lag in MB |
+-------------------+---------------------+---------+---------+----+-----------+
| my-db-cluster-2   |                     |  Leader |         |    |           |
| + my-db-cluster-0 | 192.168.88.123:7433 | Replica | running | 15 |         0 |
| + my-db-cluster-1 | 192.168.18.65:7433  | Replica | running | 15 |         0 |
+-------------------+---------------------+---------+---------+----+-----------+
There is no relevant error in the logs, but these events show up frequently:
default 0s Warning Unhealthy pod/my-db-cluster-2 Readiness probe failed: HTTP probe failed with statuscode: 503
default 0s Normal DistributedLogsUpdated sgdistributedlogs/my-distributed-logs StackGres Centralized Logging default.my-distributed-logs updated
default 0s Normal ClusterUpdated sgcluster/my-db-cluster StackGres Cluster default.my-db-cluster updated
default 0s Normal DistributedLogsUpdated sgdistributedlogs/my-distributed-logs StackGres Centralized Logging default.my-distributed-logs updated
default 0s Normal ClusterUpdated sgcluster/my-db-cluster StackGres Cluster default.my-db-cluster updated
(the DistributedLogsUpdated/ClusterUpdated pair keeps repeating)
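The events above were captured with something like the following (the exact flags are an assumption; the namespace column suggests an all-namespaces watch):

kubectl get events --all-namespaces --watch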
Patroni keeps failing over, leaving the following messages in the log:
2021-06-30 01:19:13,578 INFO: Could not take out TTL lock
2021-06-30 01:19:13,712 INFO: demoted self after trying and failing to obtain lock
2021-06-30 01:19:13,713 INFO: Lock owner: my-db-cluster-2; I am my-db-cluster-0
2021-06-30 01:19:13,713 INFO: Lock owner: my-db-cluster-2; I am my-db-cluster-0
2021-06-30 01:19:13,713 INFO: starting after demotion in progress
2021-06-30 01:19:13,714 INFO: closed patroni connection to the postgresql cluster
The StackGres operator is patching the Endpoints resource that Patroni uses for leader election, removing its annotations so that the election state is lost. Below is the Endpoints object before the operator's patch (with Patroni's election annotations), followed by the same object after the patch (annotations gone):
---
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    acquireTime: "2021-06-30T14:55:11.668285+00:00"
    leader: my-db-cluster-2
    optime: "1208698360"
    renewTime: "2021-06-30T14:55:11.722641+00:00"
    transitions: "0"
    ttl: "30"
  creationTimestamp: "2021-06-30T14:18:41Z"
  labels:
    app: StackGresCluster
    cluster: "true"
    cluster-name: my-db-cluster
    cluster-uid: c61a5a37-dee1-4584-a0d4-b5907dddb691
  name: my-db-cluster
  namespace: default
  ownerReferences:
  - apiVersion: stackgres.io/v1
    controller: true
    kind: SGCluster
    name: my-db-cluster
    uid: c61a5a37-dee1-4584-a0d4-b5907dddb691
  resourceVersion: "21459"
  selfLink: /api/v1/namespaces/default/endpoints/my-db-cluster
  uid: 9f5c1372-6003-4e0d-8179-9d0974519481
subsets:
- addresses:
  - hostname: my-db-cluster-2
    ip: 192.168.6.62
    nodeName: ip-192-168-28-183.us-east-2.compute.internal
    targetRef:
      kind: Pod
      name: my-db-cluster-2
      namespace: default
      resourceVersion: "21414"
      uid: 300609ef-9474-4434-a90a-48e921780936
  ports:
  - name: pgport
    port: 7432
    protocol: TCP
  - name: pgreplication
    port: 7433
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: "2021-06-30T14:18:41Z"
  labels:
    app: StackGresCluster
    cluster: "true"
    cluster-name: my-db-cluster
    cluster-uid: c61a5a37-dee1-4584-a0d4-b5907dddb691
  name: my-db-cluster
  namespace: default
  ownerReferences:
  - apiVersion: stackgres.io/v1
    controller: true
    kind: SGCluster
    name: my-db-cluster
    uid: c61a5a37-dee1-4584-a0d4-b5907dddb691
  resourceVersion: "21517"
  selfLink: /api/v1/namespaces/default/endpoints/my-db-cluster
  uid: 9f5c1372-6003-4e0d-8179-9d0974519481
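The patching can be observed live while the operator reconciles. The following command (one way to watch it, not taken from the report) shows the acquireTime, leader and renewTime annotations disappearing every time the operator applies its update:

kubectl get endpoints my-db-cluster -o yaml --watch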
Steps to reproduce
- Create a Kubernetes cluster with version 1.19
- Create a StackGres cluster with 2 instances
Expected Behaviour
Patroni nodes do not keep failing over, and manual failover works.
Possible Solution
Annotations should not be overwritten but merged (see the sketch after this list):
- Required annotations will overwrite existing annotations with the same key or, if no existing annotation matches the key, will be added.
- Existing annotations will not be removed when no required annotation matches their key.
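A minimal sketch of these semantics against the API server, assuming the operator can express its update as a JSON merge patch (the annotation key and value below are made up for illustration):

kubectl patch endpoints my-db-cluster --type merge \
  -p '{"metadata":{"annotations":{"required-annotation":"required-value"}}}'

A merge patch adds or overwrites only the keys it names and leaves every other annotation, including Patroni's leader, acquireTime and renewTime, untouched; a full update that sends a complete annotations map wipes them.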
Environment
- StackGres version:
❯ kubectl get deployments -n stackgres stackgres-operator --template '{{ printf "%s\n" (index .spec.template.spec.containers 0).image }}'
stackgres/operator:1.0.0-beta1
- Kubernetes version:
❯ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"archive", BuildDate:"2021-05-14T14:09:09Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.8-eks-96780e", GitCommit:"96780e1b30acbf0a52c38b6030d7853e575bcdf3", GitTreeState:"clean", BuildDate:"2021-03-10T21:32:29Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.21) and server (1.19) exceeds the supported minor version skew of +/-1
- Cloud provider or hardware configuration:
created with eksctl:
--node-type m5a.2xlarge --node-volume-size 100 --nodes 3
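Assembled into a full command this is roughly the following (cluster name and region flags omitted; --version taken from the steps to reproduce):

eksctl create cluster --version 1.19 --node-type m5a.2xlarge --node-volume-size 100 --nodes 3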