Fix liveness probe for Runner Pod
What does this MR do?
The liveness probe was failing intermittently. Investigation showed that ${HOME%/root}
sometimes expanded to an invalid path, which made the awk
instruction fail and caused the Runner Pod to restart.
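To illustrate the failure mode, here is a minimal sketch of how the ${HOME%/root} expansion behaves. It is not the probe script itself: the helper name strip_root_suffix and the sample HOME values are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the chart): print what "${HOME%/root}"
# would expand to for a given HOME value.
strip_root_suffix() {
    home_value="$1"
    # "%/root" removes the shortest trailing match of "/root", if there is one.
    printf '%s\n' "${home_value%/root}"
}

# HOME=/root: the whole value is stripped, leaving an empty prefix.
strip_root_suffix "/root"

# HOME=/home/gitlab-runner (assumed non-root home): nothing is stripped.
strip_root_suffix "/home/gitlab-runner"

# HOME unset or empty: the prefix is empty as well.
strip_root_suffix ""
```

Because the result depends entirely on the value of HOME in the container, any path built from this prefix can point at a file that does not exist, and a command reading that file then fails.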
Why was this MR needed?
To prevent the Runner Pod from restarting unnecessarily.
What's the best way to test this MR?
config for ubuntu
image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner
  tag: ubuntu-v16.8.0
useTini: false
imagePullPolicy: IfNotPresent
replicas: 1
gitlabUrl: https://gitlab.com/
runnerToken: "glrt-REDACTED"
useJobNamespace: true
terminationGracePeriodSeconds: 0
concurrent: 1
checkInterval: 1
logLevel: "debug"
sessionServer:
  enabled: false
  # publicIP: ""
  annotations: {}
  timeout: 1800
  internalPort: 8093
  externalPort: 9000
## For RBAC support:
rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["configmaps", "events", "pods", "pods/attach", "pods/log", "secrets", "services", "serviceAccounts"]
      verbs: ["get", "list", "watch", "create", "patch", "update", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create", "patch", "delete"]
  clusterWideAccess: false
  podSecurityPolicy:
    enabled: false
    resourceNames:
      - gitlab-runner
metrics:
  enabled: true
  portName: metrics
  port: 9252
  serviceMonitor:
    enabled: false
service:
  enabled: false
  type: ClusterIP
runners:
  config: |
    shutdown_timeout = 100
    [[runners]]
      [runners.kubernetes]
        image = "alpine"
      [runners.kubernetes.node_selector]
        "kubernetes.io/arch" = "amd64"
  runUntagged: true
  protected: true
  tags: "tests, ra-tests"
  builds: {}
  services: {}
  helpers: {}
securityContext:
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  runAsNonRoot: true
podSecurityContext:
  runAsUser: 999
  fsGroup: 999
affinity: {}
nodeSelector:
  "kubernetes.io/arch": "amd64"
  "kubernetes.io/os": "linux"
tolerations: []
hostAliases: []
podAnnotations: {}
podLabels: {}
hpa: {}
secrets: []
configMaps: {}
volumeMounts: []
volumes: []
config for alpine
image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner
  tag: alpine-v16.8.0
useTini: false
imagePullPolicy: IfNotPresent
replicas: 1
gitlabUrl: https://gitlab.com/
runnerToken: "glrt-REDACTED"
useJobNamespace: true
terminationGracePeriodSeconds: 0
concurrent: 1
checkInterval: 1
logLevel: "debug"
sessionServer:
  enabled: false
  # publicIP: ""
  annotations: {}
  timeout: 1800
  internalPort: 8093
  externalPort: 9000
## For RBAC support:
rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["configmaps", "events", "pods", "pods/attach", "pods/log", "secrets", "services", "serviceAccounts"]
      verbs: ["get", "list", "watch", "create", "patch", "update", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create", "patch", "delete"]
  clusterWideAccess: false
  podSecurityPolicy:
    enabled: false
    resourceNames:
      - gitlab-runner
metrics:
  enabled: true
  portName: metrics
  port: 9252
  serviceMonitor:
    enabled: false
service:
  enabled: false
  type: ClusterIP
runners:
  config: |
    shutdown_timeout = 100
    [[runners]]
      [runners.kubernetes]
        image = "alpine"
      [runners.kubernetes.node_selector]
        "kubernetes.io/arch" = "amd64"
  runUntagged: true
  protected: true
  tags: "tests, ra-tests"
  builds: {}
  services: {}
  helpers: {}
securityContext:
  allowPrivilegeEscalation: true
  readOnlyRootFilesystem: false
  runAsNonRoot: true
podSecurityContext:
  runAsUser: 100
  fsGroup: 65533
affinity: {}
nodeSelector:
  "kubernetes.io/arch": "amd64"
  "kubernetes.io/os": "linux"
tolerations: []
hostAliases: []
podAnnotations: {}
podLabels: {}
hpa: {}
secrets: []
configMaps: {}
volumeMounts: []
volumes: []
- Install the Runner with the Helm Chart, using the MR branch and one of the values.yaml configurations provided above
- Open a shell on the Runner Manager Pod
kubectl exec -it gitlab-runner-HASH-SHORT_HASH --namespace YOUR_NAMESPACE -- /bin/sh
- Run the following commands
$ bash configmaps/check-live
Verifying runner... is valid runner=REDACTED
$ echo $?
0
The exit code printed by echo $? should be 0.
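As an additional check that the Pod no longer restarts, the container restart count can be read before and after a waiting period. The following is only a sketch: the helper names restart_count and restarts_unchanged are hypothetical, and it assumes the Runner container is the first container in the Pod.

```shell
#!/bin/sh
# Hypothetical helper: print the restart count of the first container in a Pod.
# Assumes kubectl is configured for the target cluster.
restart_count() {
    pod="$1"
    namespace="$2"
    kubectl get pod "$pod" --namespace "$namespace" \
        -o jsonpath='{.status.containerStatuses[0].restartCount}'
}

# Hypothetical helper: succeed only if two readings of the count are equal.
restarts_unchanged() {
    before="$1"
    after="$2"
    [ "$before" -eq "$after" ]
}
```

For example, store restart_count gitlab-runner-HASH-SHORT_HASH YOUR_NAMESPACE in a variable, wait for several probe periods, read it again, and pass both values to restarts_unchanged.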
What are the relevant issue numbers?
#526 #531 gitlab-org/gitlab#438357 gitlab-org/gitlab-runner#37242
I am not closing the related issues for now.