Fix liveness probe for Runner Pod

Romuald Atchadé requested to merge fix-liveness-probe into main

What does this MR do?

The liveness probe seems to be failing. After investigation, it looks like ${HOME%/root} sometimes expanded to an invalid path, which made the awk instruction fail and caused the Runner Pod to restart.
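
To illustrate how the ${HOME%/root} expansion can produce an unexpected path, here is a minimal sketch assuming a POSIX shell. strip_root_suffix is a hypothetical helper, not part of the chart's check-live script, and the home directory values are only examples.

```shell
#!/bin/sh
# Hypothetical helper: applies the same suffix-removal expansion to its argument.
strip_root_suffix() {
  home_dir=$1
  # ${var%pattern} removes the shortest suffix of var that matches pattern.
  printf '%s\n' "${home_dir%/root}"
}

strip_root_suffix "/root"                # prints an empty line
strip_root_suffix "/home/gitlab-runner"  # prints /home/gitlab-runner
```

The result therefore depends on the HOME of the user the container runs as, so any path built from this expansion changes with it.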

Why was this MR needed?

To prevent the Runner Pod from restarting unnecessarily.

What's the best way to test this MR?

Config for Ubuntu:
image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner
  tag: ubuntu-v16.8.0
useTini: false
imagePullPolicy: IfNotPresent
replicas: 1
gitlabUrl: https://gitlab.com/
runnerToken: "glrt-REDACTED"

useJobNamespace: true
terminationGracePeriodSeconds: 0
concurrent: 1
checkInterval: 1
logLevel: "debug"
sessionServer:
  enabled: false
  # publicIP: ""
  annotations: {}
  timeout: 1800
  internalPort: 8093
  externalPort: 9000

## For RBAC support:
rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["configmaps", "events", "pods", "pods/attach", "pods/log", "secrets", "services",  "serviceAccounts"]
      verbs: ["get", "list", "watch", "create", "patch", "update", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create", "patch", "delete"]
  clusterWideAccess: false
  podSecurityPolicy:
    enabled: false
    resourceNames:
    - gitlab-runner
metrics:
  enabled: true
  portName: metrics
  port: 9252
  serviceMonitor:
    enabled: false
service:
  enabled: false
  type: ClusterIP
runners:
  config: |
    shutdown_timeout = 100
    [[runners]]
      [runners.kubernetes]
        image = "alpine"
        [runners.kubernetes.node_selector]
          "kubernetes.io/arch" = "amd64"
  runUntagged: true
  protected: true
  tags: "tests, ra-tests"
  builds: {}
  services: {}
  helpers: {}
securityContext:
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  runAsNonRoot: true
podSecurityContext:
  runAsUser: 999
  fsGroup: 999
affinity: {}
nodeSelector:
  "kubernetes.io/arch": "amd64"
  "kubernetes.io/os": "linux"
tolerations: []
hostAliases: []
podAnnotations: {}
podLabels: {}
hpa: {}
secrets: []
configMaps: {}
volumeMounts: []
volumes: []

Config for Alpine:
image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner
  tag: alpine-v16.8.0
useTini: false
imagePullPolicy: IfNotPresent
replicas: 1
gitlabUrl: https://gitlab.com/
runnerToken: "glrt-REDACTED"

useJobNamespace: true
terminationGracePeriodSeconds: 0
concurrent: 1
checkInterval: 1
logLevel: "debug"
sessionServer:
  enabled: false
  # publicIP: ""
  annotations: {}
  timeout: 1800
  internalPort: 8093
  externalPort: 9000

## For RBAC support:
rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["configmaps", "events", "pods", "pods/attach", "pods/log", "secrets", "services",  "serviceAccounts"]
      verbs: ["get", "list", "watch", "create", "patch", "update", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create", "patch", "delete"]
  clusterWideAccess: false
  podSecurityPolicy:
    enabled: false
    resourceNames:
    - gitlab-runner
metrics:
  enabled: true
  portName: metrics
  port: 9252
  serviceMonitor:
    enabled: false
service:
  enabled: false
  type: ClusterIP
runners:
  config: |
    shutdown_timeout = 100
    [[runners]]
      [runners.kubernetes]
        image = "alpine"
        [runners.kubernetes.node_selector]
          "kubernetes.io/arch" = "amd64"
  runUntagged: true
  protected: true
  tags: "tests, ra-tests"
  builds: {}
  services: {}
  helpers: {}
securityContext:
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  runAsNonRoot: true
podSecurityContext:
  runAsUser: 100
  fsGroup: 65533
affinity: {}
nodeSelector:
  "kubernetes.io/arch": "amd64"
  "kubernetes.io/os": "linux"
tolerations: []
hostAliases: []
podAnnotations: {}
podLabels: {}
hpa: {}
secrets: []
configMaps: {}
volumeMounts: []
volumes: []

  1. Install the Runner with the Helm Chart using the MR branch and one of the values.yaml configurations provided above

  2. Open a shell on the Runner Manager Pod

kubectl exec -it gitlab-runner-HASH-SHORT_HASH --namespace YOUR_NAMESPACE -- /bin/sh

  3. Run the following commands

$ bash configmaps/check-live
Verifying runner... is valid                        runner=REDACTED
$ echo $?
0

The output should be 0
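
The same check can be scripted rather than run interactively. The following is a sketch only: report_probe is a hypothetical helper, and the Pod name and namespace are the placeholders used in the steps above. It relies on the fact that the kubelet treats exit status 0 from an exec probe as success and any other status as failure.

```shell
#!/bin/sh
# Hypothetical helper: runs the command passed as arguments, discards its
# output, and reports whether its exit status would count as a passing probe.
report_probe() {
  status=0
  "$@" >/dev/null 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "probe passed"
  else
    echo "probe failed (exit $status)"
  fi
  return "$status"
}

# Example with the placeholders from the steps above (requires cluster access):
# report_probe kubectl exec gitlab-runner-HASH-SHORT_HASH \
#   --namespace YOUR_NAMESPACE -- bash configmaps/check-live
```

Watching the RESTARTS column of kubectl get pods over a few probe periods is another way to confirm that the Runner Pod no longer restarts.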

What are the relevant issue numbers?

#526 #531 gitlab-org/gitlab#438357 gitlab-org/gitlab-runner#37242

I am not closing the related issues for now.

Edited by Romuald Atchadé