Use helper image to change K8s log dir permissions (!2573) · Merge requests · GitLab.org / gitlab-runner

What does this MR do?

Use helper image to change K8s log dir permissions

Why was this MR needed?

Background: With the new Docker Hub rate limits, we shouldn't use the busybox image, because pulling it from Docker Hub means users can get rate limited when running jobs.

Solution: Use the helper image to change the permissions. The helper image is still hosted on Docker Hub; however, in #27196 (closed) we are moving it to registry.gitlab.com behind the feature flag FF_GITLAB_REGISTRY_HELPER_IMAGE.

Notes: This also solves the problem for users in air-gapped environments, because they can override the helper image.

There is also a big difference in image size: the busybox image is only around 1.2MB, whilst the helper image is around 65MB. The image is a lot larger, but this brings consistency to Kubernetes execution, where we only use the user image and our helper image, which is going to be decoupled from Docker Hub. We could host a busybox image inside registry.gitlab.com, but that might be premature optimization right now.

What's the best way to test this MR?

Make sure jobs with nonroot images still run

Use the following .gitlab-ci.yml for the Kubernetes executor

.gitlab-ci.yml

variables:
  SLEEP: 0
  FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: "false"

stages:
- root
- alpine
- debian

root:
  image: alpine
  stage: root
  before_script:
  - id
  script:
  - mkdir -p test-root
  - echo "test" > test-root/test-root.txt
  - echo "test" > test-root.txt
  - sleep ${SLEEP}

alpine:
  image: steveazz/noroot-alpine
  stage: alpine
  before_script:
  - id
  script:
  - mkdir -p test-alpine
  - echo "test" > test-alpine/test-alpine.txt
  - echo "test" > test-alpine.txt
  - sleep ${SLEEP}

debian:
  image: steveazz/noroot-debian
  stage: debian
  before_script:
  - id
  script:
  - mkdir -p test-debian
  - echo "test" > test-debian/test-debian.txt
  - echo "test" > test-debian.txt
  - sleep ${SLEEP}

Make sure the pipeline is green and that jobs such as alpine succeed

Remove the initContainer to reproduce the failure that the initContainer solves

git diff

diff --git a/executors/kubernetes/kubernetes.go b/executors/kubernetes/kubernetes.go
index 5e5b69598..9a2e9527c 100644
--- a/executors/kubernetes/kubernetes.go
+++ b/executors/kubernetes/kubernetes.go
@@ -295,7 +295,7 @@ func (s *executor) ensurePodsConfigured(ctx context.Context) error {
                return fmt.Errorf("setting up scripts configMap: %w", err)
        }

-       err = s.setupBuildPod([]api.Container{s.buildLogPermissionsInitContainer()})
+       err = s.setupBuildPod([]api.Container{})
        if err != nil {
                return fmt.Errorf("setting up build pod: %w", err)
        }

Run the same job from step 1. The job gets stuck on the Executing "step_script" stage of the job script until it times out

What are the relevant issue numbers?

Closes #27098 (closed)