CrashLoopBackOff: io.fabric8.kubernetes.client.KubernetesClientException: too old resource version
Summary
The StackGres operator restarts continuously (CrashLoopBackOff). The operator logs suggest that it has a problem updating some CRD resources; is that correct? The full logs are included below. Can anyone tell me what is going on and how to fix it?
Environment
- StackGres version: commit: 25f423e7 Release 0.8
- Kubernetes version (use kubectl version): Server Version: v1.15.9-gke.24
- Cloud provider or hardware configuration: GKE
Steps to reproduce
- Install the StackGres operator with default values:
helm install --namespace stackgres --name stackgres-operator ./stackgres-k8s/install/helm/stackgres-cluster
- Install the cluster from the following manifests:
apiVersion: v1
kind: Namespace
metadata:
  name: databases
---
apiVersion: stackgres.io/v1alpha1
kind: StackGresPostgresConfig
metadata:
  name: pg-conf-l
  namespace: databases
spec:
  pgVersion: '11'
  postgresql.conf:
    max_connections: '800'
    shared_buffers: '4GB'
    work_mem: 4MB
    maintenance_work_mem: 1GB
    wal_compression: 'on'
    wal_sender_timeout: '60s'
    password_encryption: 'scram-sha-256'
    random_page_cost: '1.5'
    shared_preload_libraries: 'pg_stat_statements'
    checkpoint_completion_target: '0.9'
    checkpoint_timeout: '5min'
---
apiVersion: stackgres.io/v1alpha1
kind: StackGresConnectionPoolingConfig
metadata:
  name: bouncer-conf
  namespace: databases
spec:
  pgbouncer.ini:
    default_pool_size: '800'
    max_client_conn: '800'
    pool_mode: 'transaction'
---
apiVersion: stackgres.io/v1alpha1
kind: StackGresProfile
metadata:
  name: size-l
  namespace: databases
spec:
  cpu: "4"
  memory: 8Gi
---
apiVersion: stackgres.io/v1alpha1
kind: StackGresCluster
metadata:
  name: dev-cluster
  namespace: databases
spec:
  instances: 2
  pgVersion: '11.6'
  volumeSize: '150Gi'
  pgConfig: 'pg-conf-l'
  connectionPoolingConfig: 'bouncer-conf'
  resourceProfile: 'size-l'
  nonProduction:
    disableClusterPodAntiAffinity: true
Relevant logs and/or screenshots
2020-04-01 14:40:42,179 ERROR [io.st.op.ValidationResource] (vert.x-worker-thread-18) cannot proceed with request 3d872845-5e7d-46fb-89de-10062b9f4fa2 cause: Cannot update default CRdefaultpgconfig
2020-04-01 14:40:42,315 INFO [io.st.op.ValidationResource] (vert.x-worker-thread-19) Validating admission review 63cd7026-76a4-4446-989f-c9069def57a2 of kind GroupVersionKind(group=stackgres.io, kind=StackGresPostgresConfig, version=v1alpha1, additionalProperties={})
2020-04-01 14:40:42,315 ERROR [io.st.op.ValidationResource] (vert.x-worker-thread-19) cannot proceed with request 63cd7026-76a4-4446-989f-c9069def57a2 cause: Cannot update default CRdefaultpgconfig
2020-04-01 14:40:42,336 INFO [io.st.op.ValidationResource] (vert.x-worker-thread-15) Validating admission review 36693ad3-5b7e-408c-9f9e-195f35b15399 of kind GroupVersionKind(group=stackgres.io, kind=StackGresPostgresConfig, version=v1alpha1, additionalProperties={})
2020-04-01 14:40:42,926 ERROR [io.st.op.ValidationResource] (vert.x-worker-thread-7) cannot proceed with request fb82303c-8c46-42ce-8bf1-03fb1c8c0913 cause: Cannot update default CRdefaultpgconfig
2020-04-01 14:40:45,380 TRACE [io.st.op.re.AbstractReconciliationCycle] (Cluster-ReconciliationCycle) Starting Reconciliation Cycle 457
2020-04-01 14:40:45,383 TRACE [io.st.op.re.AbstractReconciliationCycle] (Cluster-ReconciliationCycle) Reconciliation Cycle 457 getting existing cluster list
2020-04-01 14:40:45,386 ERROR [io.st.op.re.AbstractReconciliationCycle] (Cluster-ReconciliationCycle) Cluster reconciliation cycle failed: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [CustomResourceDefinition] with name: [sgclusters.stackgres.io] in namespace: [stackgres] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:237)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:170)
at io.stackgres.operator.resource.ResourceUtil.getCustomResource(ResourceUtil.java:204)
at io.stackgres.operator.controller.ClusterReconciliationCycle.getExistingConfigs(ClusterReconciliationCycle.java:169)
at io.stackgres.operatorframework.reconciliation.AbstractReconciliationCycle.reconciliationCycle(AbstractReconciliationCycle.java:103)
at io.stackgres.operatorframework.reconciliation.AbstractReconciliationCycle.reconciliationCycleLoop(AbstractReconciliationCycle.java:87)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.ConnectException: Failed to connect to /10.0.0.1:443
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:248)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:166)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:257)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:111)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
at okhttp3.RealCall.execute(RealCall.java:92)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:411)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:337)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:318)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:833)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:226)
... 8 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at okhttp3.internal.platform.Platform.connectSocket(Platform.java:129)
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:246)
... 40 more
2020-04-01 14:44:40,719 ERROR [io.st.op.co.ClusterResourceWatcherFactory] (OkHttp https://10.0.0.1/...) onClose was called, : io.fabric8.kubernetes.client.KubernetesClientException: too old resource version: 29897439 (29900565)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:263)
at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-04-01 14:44:40,781 INFO [io.st.op.ap.StackGresOperatorApp] (Thread-4) The application is stopping...
2020-04-01 14:44:40,784 INFO [io.st.op.co.ClusterResourceWatcherFactory] (Thread-4) onClose was called
2020-04-01 14:44:41,784 WARN [io.fa.ku.cl.ds.in.WatchConnectionManager] (Thread-4) Executor didn't terminate in time after shutdown in close(), killing it in: io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@10010c2d
2020-04-01 14:44:41,785 INFO [io.st.op.co.ClusterResourceWatcherFactory] (Thread-4) onClose was called
2020-04-01 14:44:42,164 INFO [io.st.op.co.ClusterResourceWatcherFactory] (Thread-4) onClose was called
2020-04-01 14:44:42,164 INFO [io.st.op.co.ClusterResourceWatcherFactory] (Thread-4) onClose was called
2020-04-01 14:44:43,165 WARN [io.fa.ku.cl.ds.in.WatchConnectionManager] (Thread-4) Executor didn't terminate in time after shutdown in close(), killing it in: io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@442ac289
2020-04-01 14:44:43,165 INFO [io.st.op.co.ClusterResourceWatcherFactory] (Thread-4) onClose was called
2020-04-01 14:44:43,166 INFO [io.st.op.re.AbstractReconciliationCycle] (Cluster-ReconciliationCycle) Cluster reconciliation cycle loop stopped
2020-04-01 14:44:43,211 DEBUG [io.qu.ar.impl] (Thread-4) ArC DI container shut down
2020-04-01 14:44:43,211 INFO [io.quarkus] (Thread-4) stackgres-operator stopped in 2.431s
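Note on the "too old resource version" entry at 14:44:40 above: this is the error the Kubernetes API returns (HTTP 410 Gone) when a watch tries to resume from a resource version whose history the server no longer retains. In the log, the application starts stopping right after the watcher's onClose is called with that error. A common way for clients to cope is to re-list and start a fresh watch rather than shut down. The following is only a minimal sketch of that pattern, not the operator's code: `FakeApiServer`, `ResilientWatcher`, and `ResourceVersionTooOld` are hypothetical names invented here, and the server is an in-memory stand-in rather than a real Kubernetes client.

```python
from dataclasses import dataclass, field


class ResourceVersionTooOld(Exception):
    """Hypothetical stand-in for the 410 Gone 'too old resource version' error."""


@dataclass
class FakeApiServer:
    """Hypothetical in-memory server that retains only the newest events."""
    window: int = 3
    version: int = 0
    events: list = field(default_factory=list)  # (resource_version, payload)

    def apply(self, payload):
        """Record a change and drop history that falls outside the window."""
        self.version += 1
        self.events.append((self.version, payload))
        self.events = self.events[-self.window:]

    def list(self):
        """Return the current resource version, as a LIST call would."""
        return self.version

    def watch(self, since):
        """Return events newer than `since`, or fail if some were dropped."""
        oldest_kept = self.events[0][0] if self.events else self.version + 1
        if since < oldest_kept - 1:
            raise ResourceVersionTooOld(
                f"too old resource version: {since} ({oldest_kept})")
        return [event for event in self.events if event[0] > since]


class ResilientWatcher:
    """Watcher that re-lists on a stale version instead of stopping."""

    def __init__(self, server):
        self.server = server
        self.resource_version = server.list()
        self.relists = 0
        self.seen = []

    def poll(self):
        try:
            events = self.server.watch(self.resource_version)
        except ResourceVersionTooOld:
            # Recover: fetch a fresh resource version and resume from there.
            # A real controller would also reconcile the full listed state.
            self.resource_version = self.server.list()
            self.relists += 1
            return
        for version, payload in events:
            self.seen.append(payload)
            self.resource_version = version


server = FakeApiServer(window=3)
watcher = ResilientWatcher(server)
server.apply("a")
watcher.poll()                # receives "a"
for payload in "bcdef":       # watcher falls behind; history is compacted
    server.apply(payload)
watcher.poll()                # stale version: re-list instead of exiting
server.apply("g")
watcher.poll()                # watch works again from the fresh version
print(watcher.relists, watcher.seen)
```

The events missed while the watcher was behind are not replayed, which is why a real controller pairs the re-list with a full reconciliation of the listed objects. Whether this matches what the operator should do here is an assumption; the connection-refused errors to 10.0.0.1:443 earlier in the log may be a separate problem.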
Edited by Matteo Melli