Gitaly cgroup init containers error due to helm templating error
Summary
Gitaly cgroup init containers lack the permissions they need to work. These permissions cannot be added because of the way the Helm chart templating logic is written. As a result, I'm unable to configure Gitaly as recommended in the GitLab docs.
Steps to reproduce
- Set your GitLab values file to enable the gitaly cgroups init container. You can view the relevant gitaly values file section for reference.

      gitlab:
        gitaly:
          cgroups:
            enabled: true
            initContainer:
              securityContext:
                runAsUser: 0
                runAsGroup: 0
                privileged: true

- Deploy GitLab and observe that your gitaly nodes will fail to come up. The cgroups init container will error out due to a lack of permissions.

      kubectl logs -n gitlab gitlab-gitaly-0 --container init-cgroups
      {"time":"2024-12-04T20:16:11.513566474Z","level":"INFO","msg":"cgroup setup configuration","GITALY_POD_UID":"a166234b-7fee-4613-97e3-61034b619052","CGROUP_PATH":"/run/gitaly/cgroup/kubepods.slice/","OUTPUT_PATH":"/init-secrets/gitaly-pod-cgroup"}
      {"time":"2024-12-04T20:16:11.515680262Z","level":"INFO","msg":"found cgroup path for Gitaly pod","cgroup_path":"/run/gitaly/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda166234b_7fee_4613_97e3_61034b619052.slice"}
      {"time":"2024-12-04T20:16:11.515780133Z","level":"ERROR","msg":"changing cgroup permissions for Gitaly pod","pod_uid":"a166234b-7fee-4613-97e3-61034b619052","cgroup_path":"/run/gitaly/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda166234b_7fee_4613_97e3_61034b619052.slice","error":"chown cgroup path \"\": chown /run/gitaly/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-poda166234b_7fee_4613_97e3_61034b619052.slice: permission denied"}

- Observe in the init container YAML that the privileged: true piece is missing.

      kubectl get statefulsets.apps -n gitlab gitlab-gitaly -o yaml | yq .spec.template.spec
      initContainers:
        - env:
            - name: GITALY_POD_UID
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.uid
            - name: CGROUP_PATH
              value: /run/gitaly/cgroup/kubepods.slice/
            - name: OUTPUT_PATH
              value: /init-secrets/gitaly-pod-cgroup
          image: registry.gitlab.com/gitlab-org/build/cng/gitaly-init-cgroups:v17.6.1
          imagePullPolicy: IfNotPresent
          name: init-cgroups
          resources:
            requests:
              cpu: 50m
          securityContext:
            runAsGroup: 0
            runAsUser: 0
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /init-secrets
              name: gitaly-secrets
            - mountPath: /run/gitaly/cgroup
              name: cgroup
What is the current bug behavior?
The Helm templating drops the needed security context values (here, privileged: true), which breaks the install whenever the cgroup init container is enabled.
What is the expected correct behavior?
I can enable the cgroup init container, and it works. Ideally without me having to add security context values, but if I need to add them, it shouldn't drop them.
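For comparison with the actual StatefulSet output in the reproduction steps, this is roughly what I would expect the init-cgroups container's securityContext to render as, given the values I set. It is a sketch of the expected result, not output from a real deployment, and key order may differ.

```yaml
# Expected (not actual) securityContext on the init-cgroups container,
# assuming the values from the reproduction steps are passed through intact.
securityContext:
  privileged: true
  runAsGroup: 0
  runAsUser: 0
```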
Possible fixes
Tracing the issue back to its source: the gitaly statefulset template uses a gitlab helper function to process the securityContext value, and that helper function only renders a limited set of securityContext keys, which does not include privileged. The helper function I'm talking about is copied below:
    {{/*
    Return a PodSecurityContext definition.
    Usage:
      {{ include "gitlab.podSecurityContext" .Values.securityContext }}
    */}}
    {{- define "gitlab.podSecurityContext" -}}
    {{- $psc := . }}
    {{- if $psc }}
    securityContext:
    {{- if not (empty $psc.runAsUser) }}
      runAsUser: {{ $psc.runAsUser }}
    {{- end }}
    {{- if not (empty $psc.runAsGroup) }}
      runAsGroup: {{ $psc.runAsGroup }}
    {{- end }}
    {{- if not (empty $psc.fsGroup) }}
      fsGroup: {{ $psc.fsGroup }}
    {{- end }}
    {{- if not (empty $psc.fsGroupChangePolicy) }}
      fsGroupChangePolicy: {{ $psc.fsGroupChangePolicy }}
    {{- end }}
    {{- if $psc.seccompProfile }}
      seccompProfile:
        {{- toYaml $psc.seccompProfile | nindent 4 }}
    {{- end }}
    {{- end }}
    {{- end -}}
Recommendation: pre-populate the security context for the init cgroups container with user 0, group 0, and privileged: true, because it will always need those to work. Either don't let me modify it, or, if I can modify it, pass my YAML through as-is, the way the init container helper function about 15 lines above this one in _helpers.tpl does.
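To make the "pass the YAML through as-is" option concrete, here is a minimal sketch of what such a helper could look like. The name example.containerSecurityContext is hypothetical and this is not the chart's actual code; it only uses the standard Helm functions toYaml and nindent, and I have not tested it against the chart.

```yaml
{{/*
Hypothetical sketch, not the chart's real helper.
Renders a container securityContext by passing every key the user
supplied through unchanged, so keys such as privileged are not dropped.
*/}}
{{- define "example.containerSecurityContext" -}}
{{- if . }}
securityContext:
  {{- toYaml . | nindent 2 }}
{{- end }}
{{- end -}}
```

The trade-off is that a pass-through helper performs no filtering at all, so any key placed under securityContext ends up in the rendered manifest, valid or not.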
It may also work to add the capability CAP_SYS_ADMIN.
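If someone wants to try the capability route, the values might look like the sketch below. Kubernetes manifests list capabilities without the CAP_ prefix, so CAP_SYS_ADMIN is written as SYS_ADMIN. I have not verified that this capability is sufficient for the chown on the cgroup path, and I'm assuming the current templating would drop the capabilities key the same way it drops privileged.

```yaml
# Untested sketch: request SYS_ADMIN instead of privileged: true.
# Assumes the chart passes capabilities through, which it may not today.
gitlab:
  gitaly:
    cgroups:
      enabled: true
      initContainer:
        securityContext:
          runAsUser: 0
          runAsGroup: 0
          capabilities:
            add:
              - SYS_ADMIN
```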