
GitLab Agent CI/CD tunnel breaks `kubectl rollout status` waits

Summary

When a CI job connects to a Kubernetes cluster through the GitLab agent's CI/CD tunnel, tools that wait for a condition in the cluster (for example `kubectl rollout status`) hang and eventually fail.

Steps to reproduce

Create a CI job that can connect to Kubernetes through an Agent tunnel. The job should do the following:

  1. Switch to the right context
  2. Create a deployment in the cluster
  3. Optionally sleep for some time so that the rollout has already finished before the wait starts; this shows how broken the tunnel is. (The rollout usually completes in a few seconds.)
  4. Wait for the rollout of the deployment to finish
  5. Observe that `kubectl rollout status` hangs for an unreasonable time and prints an error message
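
The context switch is not part of the partial job script in this report. A minimal sketch of it, assuming the agent's context follows the `<path/to/agent/project>:<agent-name>` naming; the project path, the agent name, and the `agent_context` helper are hypothetical placeholders, not values from this issue:

```shell
# Hypothetical helper: build the kubectl context name for an agent
# from the agent's project path and the agent name.
agent_context() {
  printf '%s:%s\n' "$1" "$2"
}

# Placeholder values; replace with the real project path and agent name.
KUBE_CONTEXT="$(agent_context "path/to/agent/project" "my-agent")"
echo "using context: ${KUBE_CONTEXT}"

# In the CI job (requires kubectl and the kubeconfig provided for the agent):
#   kubectl config get-contexts
#   kubectl config use-context "$KUBE_CONTEXT"
```

Listing the contexts first makes it easier to spot a mistyped project path or agent name in the job log.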

Note that this seems to affect other waits as well, such as `kubectl wait` or other CLI tools.

Example Project

A partial job script could look like this:

    - |-
      kubectl apply -f - <<EOF
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: nginx
      spec:
        replicas: 3
        selector:
          matchLabels:
            app: nginx
        template:
          metadata:
            labels:
              app: nginx
          spec:
            containers:
            - name: nginx
              image: nginx:latest
      EOF
    - sleep 180
    - date # start waiting
    - kubectl rollout status --timeout 5s deployment nginx
    - date # finished waiting
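
The two `date` calls bracket the wait so that the job log shows how long it actually ran. The same measurement could be sketched as a small wrapper that prints the elapsed seconds directly; `timed` is a hypothetical helper name, not part of kubectl or GitLab, and it assumes a `date` that supports `+%s`:

```shell
# Hypothetical helper: run a command, then report how long it took
# and what it returned. The command's exit status is passed through.
timed() {
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start))s (exit status ${status})" >&2
  return "$status"
}

# In the job this would wrap the wait:
#   timed kubectl rollout status --timeout 5s deployment nginx
```

With a working connection and `--timeout 5s`, the reported time should stay close to the timeout; a much larger value would point at the hang described in this report.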

What is the current bug behavior?

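The wait does not return within its timeout and ends with an error message from kubectl. In the log below, `kubectl rollout status --timeout 5s` is started at about 11:15:51 and the job only continues at 11:49:00, roughly 33 minutes later, after printing a failed watch caused by a 502 Server Error response.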

What is the expected correct behavior?

The wait should work the same as with a direct connection to the cluster.

Relevant logs and/or screenshots

$ kubectl apply -f - <<EOF # collapsed multi-line command
deployment.apps/nginx created
$ sleep 180
$ date
Thu Oct 14 11:15:51 UTC 2021
$ kubectl rollout status --timeout 5s deployment nginx
deployment "nginx" successfully rolled out
E1014 11:49:00.400931      48 reflector.go:138] k8s.io/client-go/tools/watch/informerwatcher.go:146: Failed to watch *unstructured.Unstructured: an error on the server ("<html><head>\n<meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<title>502 Server Error</title>\n</head>\n<body text=#000000 bgcolor=#ffffff>\n<h1>Error: Server Error</h1>\n<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>\n<h2></h2>\n</body></html>") has prevented the request from succeeding
$ date
Thu Oct 14 11:49:00 UTC 2021

Output of checks

This bug happens on GitLab.com

Edited by Jan Boehm