GitLab Agent CI/CD tunnel breaks `kubectl rollout status` waits
Summary
When a CI job connects to a Kubernetes cluster through the GitLab agent, tools that wait for a condition in the cluster fail during the wait.
Steps to reproduce
- Create a CI job that can connect to the Kubernetes cluster through an agent tunnel. The job should do the following:
  - Switch to the right context.
  - Create a deployment in the cluster.
  - Optionally, sleep for some time to let the rollout finish, so that any remaining delay is clearly caused by the tunnel. (The rollout usually completes in a few seconds.)
  - Wait for the rollout of the deployment to finish.
- Observe that `kubectl rollout status` hangs for an unreasonably long time and prints an error message.
Note that this also seems to affect other waits, such as `kubectl wait`, and other CLI tools.
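For illustration, an equivalent check with `kubectl wait` might look like the sketch below. The helper name `wait_available` is hypothetical, and this variant was not run for this report; it only shows the kind of invocation the note refers to.

```shell
# Hypothetical helper, not part of the original job script.
# Blocks until the named deployment reports the Available condition,
# or until the 5 second timeout expires.
wait_available() {
  kubectl wait --for=condition=Available --timeout=5s "deployment/$1"
}

# Usage (assumes the current context already points at the agent tunnel):
# wait_available nginx
```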
Example Project
A partial job script could look like this:
- |-
  kubectl apply -f - <<EOF
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: nginx
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: nginx
    template:
      metadata:
        labels:
          app: nginx
      spec:
        containers:
        - name: nginx
          image: nginx:latest
  EOF
- sleep 180
- date # start waiting
- kubectl rollout status --timeout 5s deployment nginx
- date # finished waiting
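The two `date` calls bracket the wait so its duration can be read from the job log. As a possible alternative, a small wrapper could print the elapsed time directly. This is only a sketch: the function name `timed` is hypothetical, and it assumes a `date` that supports `+%s` (GNU, BSD, and BusyBox do).

```shell
# Hypothetical helper, not part of the original job script.
# Runs the given command, prints the elapsed whole seconds to stderr,
# and returns the command's own exit status.
timed() {
  start=$(date +%s)
  # The && / || form keeps a failing command from aborting a `set -e` shell
  # before the elapsed time is reported.
  "$@" && status=0 || status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit status $status)" >&2
  return "$status"
}

# Usage in the job script, replacing the two `date` lines:
# timed kubectl rollout status --timeout 5s deployment nginx
```

Reporting the exit status alongside the duration also shows whether the command ultimately succeeded, which the `date` output alone does not reveal.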
What is the current bug behavior?
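The wait fails with error messages from `kubectl`. In the log below, `kubectl` reports the rollout as successful but returns only about 33 minutes later (11:15:51 to 11:49:00), despite `--timeout 5s`, after logging a failed watch caused by a 502 response.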
What is the expected correct behavior?
The wait should work the same as with a direct connection to the cluster.
Relevant logs and/or screenshots
$ kubectl apply -f - <<EOF # collapsed multi-line command
deployment.apps/nginx created
$ sleep 180
$ date
Thu Oct 14 11:15:51 UTC 2021
$ kubectl rollout status --timeout 5s deployment nginx
deployment "nginx" successfully rolled out
E1014 11:49:00.400931 48 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: an error on the server ("<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>502 Server Error</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Server Error</h1>\n<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>\n<h2></h2>\n</body></html>") has prevented the request from succeeding
$ date
Thu Oct 14 11:49:00 UTC 2021
Output of checks
This bug happens on GitLab.com
Edited by Jan Boehm