Defining multiple environment variables causes the GitLab Runner pod to get stuck in a restart loop.

We have an internal customer who followed the "configure a proxy environment" method to define HTTP_PROXY, HTTPS_PROXY and no_proxy.

However, this caused the generated GitLab Runner pod to get stuck in a boot loop. The output of oc describe deployment shows:

Events:
  Type    Reason             Age                    From                   Message
  ----    ------             ----                   ----                   -------
  Normal  ScalingReplicaSet  7h41m                  deployment-controller  Scaled up replica set gitlab-runner-runner-b446b5d8c to 1
  Normal  ScalingReplicaSet  7h40m (x7 over 7h41m)  deployment-controller  Scaled up replica set gitlab-runner-runner-7dd894db59 to 1
  Normal  ScalingReplicaSet  7h40m (x7 over 7h41m)  deployment-controller  Scaled down replica set gitlab-runner-runner-7dd894db59 to 0
  Normal  ScalingReplicaSet  7h40m (x5 over 7h41m)  deployment-controller  Scaled up replica set gitlab-runner-runner-756c9f46b7 to 1
  Normal  ScalingReplicaSet  7h40m (x5 over 7h41m)  deployment-controller  Scaled down replica set gitlab-runner-runner-756c9f46b7 to 0

Something appears to be scaling the replica sets down and up continuously. The output of oc rollout history deployment.apps shows a large number of revisions for the deployed runner:

deployment.apps/gitlab-runner-runner
REVISION  CHANGE-CAUSE
85        <none>
87        <none>
88        <none>
89        <none>

In my example, there were at least 89 revisions within a few minutes of deploying the runner.

Comparing two consecutive revisions shows that, apart from the pod-template-hash label (which is derived from the pod template), the only thing that changed is the order of the defined environment variables:

diff -u <(oc rollout history deployment.apps/gitlab-runner-runner --revision=87) <(oc rollout history deployment.apps/gitlab-runner-runner --revision=88)

--- /dev/fd/11	2021-11-17 15:00:44.000000000 +0800
+++ /dev/fd/12	2021-11-17 15:00:44.000000000 +0800
@@ -1,11 +1,11 @@
-deployment.apps/gitlab-runner-runner with revision #87
+deployment.apps/gitlab-runner-runner with revision #88
 Pod Template:
   Labels:	app.kubernetes.io/component=runner
 	app.kubernetes.io/instance=gitlab-runner-runner
 	app.kubernetes.io/managed-by=gitlab-runner-operator
 	app.kubernetes.io/name=gitlab-runner
 	app.kubernetes.io/part-of=runner
-	pod-template-hash=b446b5d8c
+	pod-template-hash=7dd894db59
   Annotations:	gitlab-runner-runner-config: ae891d5510f3e231e27b23d88db5e984dc9278fc6e61ed80fc5d65c925af9995
   Service Account:	gitlab-runner-sa
   Init Containers:
@@ -30,9 +30,9 @@
       KUBERNETES_HELPER_IMAGE:	registry.connect.redhat.com/gitlab/gitlab-runner-helper@sha256:272c50ca9ef77c92deac0ca302df9e0127d5c54609c35230f96cf1d91de5fe97
       RUNNER_TAG_LIST:	openshift
       KUBERNETES_IMAGE:	alpine
-      THIRD_ENV:	3
       FIRST_ENV:	1
       SECOND_ENV:	2
+      THIRD_ENV:	3
     Mounts:
       /config from scripts (ro)
       /init-secrets from init-runner-secrets (ro)
@@ -61,9 +61,9 @@
       KUBERNETES_HELPER_IMAGE:	registry.connect.redhat.com/gitlab/gitlab-runner-helper@sha256:272c50ca9ef77c92deac0ca302df9e0127d5c54609c35230f96cf1d91de5fe97
       RUNNER_TAG_LIST:	openshift
       KUBERNETES_IMAGE:	alpine
-      THIRD_ENV:	3
       FIRST_ENV:	1
       SECOND_ENV:	2
+      THIRD_ENV:	3
     Mounts:
       /scripts from scripts (rw)
       /secrets from runner-secrets (rw)
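
A reordering alone is enough to count as a change, because a container's environment is an ordered list and any difference in the pod template triggers a new rollout. The following is a minimal sketch of that comparison, not code from the Operator or from Kubernetes; the EnvVar type and the sameTemplateEnv helper are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// EnvVar is a hypothetical stand-in for one container environment entry.
type EnvVar struct {
	Name  string
	Value string
}

// sameTemplateEnv reports whether two env lists are identical,
// including the order of their entries.
func sameTemplateEnv(a, b []EnvVar) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	// Same entries as in revisions #87 and #88 above, in the order shown there.
	rev87 := []EnvVar{{"THIRD_ENV", "3"}, {"FIRST_ENV", "1"}, {"SECOND_ENV", "2"}}
	rev88 := []EnvVar{{"FIRST_ENV", "1"}, {"SECOND_ENV", "2"}, {"THIRD_ENV", "3"}}

	fmt.Println(sameTemplateEnv(rev87, rev88)) // false: same variables, different order
	fmt.Println(sameTemplateEnv(rev88, rev88)) // true
}
```

Each time the list comes out in a different order, the template differs from the previous one, which would match the steady stream of new revisions seen above.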

How to reproduce:

Create a ConfigMap with multiple environment variables defined:

apiVersion: v1
data:
  FIRST_ENV: "1"
  SECOND_ENV: "2"
  THIRD_ENV: "3"
kind: ConfigMap
metadata:
  name: custom-env
  namespace: openshift-operators

Configure the Runner to use this ConfigMap:

apiVersion: apps.gitlab.com/v1beta2
kind: Runner
metadata:
  name: gitlab-runner
spec:
  gitlabUrl: https://gitlab.gitlab-kubernetes.jpid.xyz
  buildImage: alpine
  token: gitlab-runner-secret
  tags: openshift
  env: custom-env

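I think what's happening here is that when multiple environment variables are defined in the ConfigMap, the order in which the Operator reads them is not guaranteed. Each reconciliation can therefore produce a differently ordered list, which changes the pod template and triggers a new rollout.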
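
One possible way to make the order stable, sketched under the assumption that the Operator is written in Go and builds the environment list by ranging over the ConfigMap's data map (this report does not confirm either point): Go does not specify map iteration order, so sorting the keys first yields the same list on every pass. The EnvVar type and envFromConfigMap function are hypothetical names, not the Operator's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// EnvVar is a hypothetical stand-in for one container environment entry.
type EnvVar struct {
	Name  string
	Value string
}

// envFromConfigMap turns ConfigMap data into an env list whose order
// does not depend on map iteration order.
func envFromConfigMap(data map[string]string) []EnvVar {
	names := make([]string, 0, len(data))
	for name := range data { // iteration order here is unspecified in Go
		names = append(names, name)
	}
	sort.Strings(names) // fix the order before building the list

	env := make([]EnvVar, 0, len(names))
	for _, name := range names {
		env = append(env, EnvVar{Name: name, Value: data[name]})
	}
	return env
}

func main() {
	// Same data as the custom-env ConfigMap in the reproduction steps.
	data := map[string]string{"FIRST_ENV": "1", "SECOND_ENV": "2", "THIRD_ENV": "3"}
	for _, e := range envFromConfigMap(data) {
		fmt.Printf("%s=%s\n", e.Name, e.Value)
	}
}
```

With a deterministic order, repeated reconciliations of an unchanged ConfigMap would produce an identical pod template, so no new revision should be created.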

Edited Nov 18, 2021 by Julian Paul Dasmarinas