GitLab Pods Crash | Kubernetes 1.22 + Istio 1.8.1
Summary
GitLab does not come up as part of a fresh installation using Kubernetes 1.22 + Istio 1.8.1 + Rook-Ceph.
Steps to reproduce
The setup is identical to #2475 (closed).
helm repo add gitlab https://charts.gitlab.io/
helm repo update
helm uninstall gitlab --namespace gitlab
kubectl apply -f ./iac/secrets.yaml
helm upgrade -f ./iac/patch-1774.yaml --namespace gitlab --install gitlab gitlab/gitlab \
--timeout 600s \
--set global.hosts.domain=my-example.de \
--set certmanager-issuer.email=gitlab@my-example.de \
--set certmanager.install=false \
--set global.ingress.configureCertmanager=false \
--set gitlab-runner.install=false \
--set postgresql.install=false \
--set global.psql.host=postgres.postgresql.svc.cluster.local \
--set global.psql.database=gitlab \
--set global.psql.username=system_gitlab \
--set global.psql.password.secret=gitlab-secrets \
--set global.psql.password.key=psql_pwd
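The contents of ./iac/secrets.yaml are not shown in the report; based on the --set global.psql.password.secret/key flags above, it presumably contains a Secret along these lines (a reconstruction, with a placeholder value):

```yaml
# Hypothetical reconstruction of ./iac/secrets.yaml, inferred from the
# --set global.psql.password.* flags; the actual file may differ.
apiVersion: v1
kind: Secret
metadata:
  name: gitlab-secrets
  namespace: gitlab
type: Opaque
stringData:
  psql_pwd: "<database password>"  # placeholder, not the real value
```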
patch-1774.yaml (Istio compatibility, see #2475 (closed) for details):
shared-secrets:
  annotations:
    sidecar.istio.io/inject: "false"
Configuration used
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T17:57:25Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
The helm upgrade command is identical to the one shown under "Steps to reproduce" above.
Current behavior
Pods are hanging in a crash loop.
- Events reference CPU (maybe a limit set by the GitLab chart?), while CPU is >50% idle (see logs below for VM utilization).
- The Ingress controller logs reference a routing issue (Istio?).
kubectl get all -o wide -n gitlab
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/gitlab-gitaly-0 2/2 Running 0 7h45m 192.168.0.77 kubernetes-worker02 <none> <none>
pod/gitlab-gitlab-exporter-8489f6dcd5-672nt 2/2 Running 0 7h45m 192.168.1.150 kubernetes-worker01 <none> <none>
pod/gitlab-gitlab-shell-7444fdbc8f-ttc88 2/2 Running 0 7h45m 192.168.1.149 kubernetes-worker01 <none> <none>
pod/gitlab-gitlab-shell-7444fdbc8f-v9fgb 2/2 Running 0 7h44m 192.168.0.79 kubernetes-worker02 <none> <none>
pod/gitlab-minio-658bd7d4fd-m5x9f 2/2 Running 0 7h45m 192.168.0.93 kubernetes-worker02 <none> <none>
pod/gitlab-nginx-ingress-controller-6fc99758c8-bvrnj 1/2 Running 126 (5m10s ago) 7h45m 192.168.1.155 kubernetes-worker01 <none> <none>
pod/gitlab-nginx-ingress-controller-6fc99758c8-l4lvs 1/2 CrashLoopBackOff 128 (8s ago) 7h45m 192.168.1.124 kubernetes-worker04 <none> <none>
pod/gitlab-nginx-ingress-default-backend-6964c77bb9-l652p 2/2 Running 0 7h45m 192.168.13.205 kubernetes-worker03 <none> <none>
pod/gitlab-prometheus-server-6444c7bd76-xhbz2 3/3 Running 1 7h45m 192.168.1.125 kubernetes-worker04 <none> <none>
pod/gitlab-redis-master-0 3/3 Running 0 7h45m 192.168.1.126 kubernetes-worker04 <none> <none>
pod/gitlab-registry-684955b74d-2fxvc 2/2 Running 1 7h45m 192.168.1.123 kubernetes-worker04 <none> <none>
pod/gitlab-registry-684955b74d-w929x 2/2 Running 0 7h44m 192.168.4.255 kubernetes-slave <none> <none>
pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-fj8w2 0/2 Init:CrashLoopBackOff 91 (5m ago) 7h45m 192.168.4.198 kubernetes-slave <none> <none>
pod/gitlab-task-runner-58f64bf85d-p8lf9 2/2 Running 0 7h45m 192.168.1.154 kubernetes-worker01 <none> <none>
pod/gitlab-webservice-default-76dbd8b9dd-5c6xq 0/3 Init:CrashLoopBackOff 90 (4m9s ago) 7h45m 192.168.1.122 kubernetes-worker04 <none> <none>
pod/gitlab-webservice-default-76dbd8b9dd-ck2zn 0/3 Init:CrashLoopBackOff 90 (82s ago) 7h44m 192.168.0.67 kubernetes-worker02 <none> <none>
Expected behavior
The GitLab chart installs and boots GitLab.
Versions
- Chart: latest
- Platform: Self-hosted (sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main")
- Kubernetes (kubectl version):
  - Client: Major:"1", Minor:"22", GitVersion:"v1.22.0"
  - Server: Major:"1", Minor:"22", GitVersion:"v1.22.0"
- Helm (helm version): 3.6.3, via Jenkins (pod: lachlanevenson/k8s-helm:latest)
Relevant logs
Ingress-Controller
kubectl -n gitlab logs pod/gitlab-nginx-ingress-controller-6fc99758c8-l4lvs
I0817 04:33:41.686538 7 nginx.go:249] "Starting NGINX Ingress controller"
I0817 04:33:41.707678 7 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"gitlab", Name:"gitlab-nginx-ingress-tcp", UID:"1b46e731-db2b-4417-8692-06a99c2985f6", APIVersion:"v1", ResourceVersion:"107053867", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap gitlab/gitlab-nginx-ingress-tcp
I0817 04:33:41.707914 7 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"gitlab", Name:"gitlab-nginx-ingress-controller", UID:"62dc95c9-aedd-4dfe-bbc8-33903396576b", APIVersion:"v1", ResourceVersion:"107053882", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap gitlab/gitlab-nginx-ingress-controller
E0817 04:33:42.789311 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0817 04:33:44.091445 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0817 04:33:45.830793 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
2021/08/17 04:33:48 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0817 04:33:48.756856 7 nginx_status.go:172] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
E0817 04:33:49.165888 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0817 04:33:58.742783 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
2021/08/17 04:34:03 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0817 04:34:03.761409 7 nginx_status.go:172] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
2021/08/17 04:34:12 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0817 04:34:12.302015 7 nginx_status.go:172] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
2021/08/17 04:34:18 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0817 04:34:18.756783 7 nginx_status.go:172] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
E0817 04:34:23.050595 7 reflector.go:127] k8s.io/client-go@v0.19.3/tools/cache/reflector.go:156: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
2021/08/17 04:34:33 Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
W0817 04:34:33.754938 7 nginx_status.go:172] unexpected error obtaining nginx status info: Get "http://127.0.0.1:10246/nginx_status": dial tcp 127.0.0.1:10246: connect: connection refused
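The repeated watch failures above are consistent with the networking.k8s.io/v1beta1 Ingress API having been removed in Kubernetes 1.22, while the bundled controller is still built against client-go v0.19 (visible in the log paths) and therefore lists the v1beta1 resource. As a minimal sketch (the helper and its name are ours, not part of the chart or controller), the failure signature can be detected in controller logs like this:

```python
# Minimal sketch: flag controller log lines that show the controller
# watching the v1beta1 Ingress API, which Kubernetes 1.22 no longer serves.
REMOVED_API_SIGNATURE = "Failed to watch *v1beta1.Ingress"

def watches_removed_ingress_api(log_lines):
    """Return True if any log line shows the removed-API watch failure."""
    return any(REMOVED_API_SIGNATURE in line for line in log_lines)

logs = [
    'E0817 04:33:42.789311 7 reflector.go:127] Failed to watch '
    '*v1beta1.Ingress: failed to list *v1beta1.Ingress: the server '
    'could not find the requested resource',
]
print(watches_removed_ingress_api(logs))  # True
```

If this is the cause, a chart version whose ingress controller targets networking.k8s.io/v1 would be needed on a 1.22 cluster.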
Sidekiq
Please note the CPU constraint: I have 4 nodes, each with 2 CPUs and 8 GB RAM.
kubectl get events -n gitlab
77s Warning FailedScheduling pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 0/6 nodes are available: 1 node(s) were unschedulable, 5 Insufficient cpu.
73s Normal Scheduled pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Successfully assigned gitlab/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 to kubernetes-slave
71s Normal Pulled pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Container image "registry.gitlab.com/gitlab-org/build/cng/alpine-certificates:20191127-r2" already present on machine
70s Normal Created pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Created container certificates
69s Normal Started pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Started container certificates
68s Normal Pulling pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Pulling image "busybox:latest"
67s Normal Pulled pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Successfully pulled image "busybox:latest" in 1.29618704s
66s Normal Created pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Created container configure
66s Normal Started pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Started container configure
12s Normal Pulled pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Container image "registry.gitlab.com/gitlab-org/build/cng/gitlab-sidekiq-ee:v14.1.2" already present on machine
11s Normal Created pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Created container dependencies
11s Normal Started pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Started container dependencies
4s Warning BackOff pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8 Back-off restarting failed container
8m1s Normal Pulled pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-fj8w2 Container image "registry.gitlab.com/gitlab-org/build/cng/gitlab-sidekiq-ee:v14.1.2" already present on machine
3m3s Warning BackOff pod/gitlab-sidekiq-all-in-1-v1-7c4584bc77-fj8w2 Back-off restarting failed container
77s Normal SuccessfulCreate replicaset/gitlab-sidekiq-all-in-1-v1-7c4584bc77 Created pod: gitlab-sidekiq-all-in-1-v1-7c4584bc77-4rvw8
2m50s Warning FailedGetResourceMetric horizontalpodautoscaler/gitlab-sidekiq-all-in-1-v1 failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
23m Normal Pulled pod/gitlab-webservice-default-76dbd8b9dd-5c6xq Container image "registry.gitlab.com/gitlab-org/build/cng/gitlab-webservice-ee:v14.1.2" already present on machine
3m4s Warning BackOff pod/gitlab-webservice-default-76dbd8b9dd-5c6xq Back-off restarting failed container
3m5s Warning BackOff pod/gitlab-webservice-default-76dbd8b9dd-ck2zn Back-off restarting failed container
2m50s Warning FailedGetResourceMetric horizontalpodautoscaler/gitlab-webservice-default failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
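The FailedScheduling message ("Insufficient cpu") despite >50% idle CPU is expected scheduler behavior: placement is decided by summing container CPU requests against node allocatable capacity, not by live utilization. A toy illustration of that fit check (all numbers are made up, not taken from this cluster):

```python
# Toy sketch of the scheduler's fit check: it compares *requested* CPU
# (in millicores) against node allocatable capacity, ignoring actual usage.
def pod_fits_node(allocatable_mcpu, already_requested_mcpu, pod_request_mcpu):
    """Return True if the pod's CPU request fits in the node's remaining capacity."""
    return already_requested_mcpu + pod_request_mcpu <= allocatable_mcpu

# A 2-CPU node (2000m) with 1800m already requested rejects a 900m pod,
# even if the node is nearly idle.
print(pod_fits_node(2000, 1800, 900))  # False
print(pod_fits_node(2000, 1000, 900))  # True
```

This would explain why pods fail to schedule on 2-CPU nodes while the VMs themselves look idle; the chart's default CPU requests may simply exceed what the nodes have left.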
Edited by Fabian Sc

