Fix Kubernetes attach strategy for non-root environments (!2749) · Merge requests · GitLab.org / gitlab-runner

What does this MR do?

This MR fixes the problems explained in #26540 (closed).

Why was this MR needed?

By design, most Pods and their containers in OpenShift don't have root access. Every pod runs with a different user ID that belongs to the root group but is not the root user itself. This imposes some limitations that we didn't take into consideration when implementing the attach strategy. The two problems that prevented it from working are explained in detail in comments in the changes, but here's the overview:

We were trying to chmod a whole volume directory. That doesn't work for non-root users. However, non-root users can create files inside these volumes, and once a file is created, every container running with the same user ID can read from and write to it.
We were mounting a volume inside another volume. This caused the parent volume to become owned by root, regardless of the actual user ID of the containers. After that, we naturally couldn't write anything to the parent volume, which in our case was the volume mounted at /builds, which contained the repo.

What's the best way to test this MR?

Easy way

The easiest way is to verify that the testKubernetesNoRootImageFeatureFlag test and the new testKubernetesWithNonRootSecurityContext test pass with both attach and exec.

To check whether the changes in kubernetes.go have any effect, check out the master version of the file and run the testKubernetesWithNonRootSecurityContext test. The test should pass for exec but fail for attach.

Hard way

We could test this in Kubernetes with a custom SecurityContext, but that's exactly what the testKubernetesWithNonRootSecurityContext test does. The real test is with OpenShift. Note: if you (the reviewer) prefer, we could test this over a Zoom call, since setting up OpenShift might be time-consuming.

Install crc and create a local OpenShift cluster - https://developers.redhat.com/products/codeready-containers/overview
Install the GitLab Runner Operator by following the instructions in the docs - https://docs.gitlab.com/runner/install/openshift.html
You should see a runner pod. Build a new runner binary and copy it into the container: GOOS=linux go build -o gitlab-runner-linux . && oc cp gitlab-runner-linux gitlab-runner-runner-58dd79d6b8-wvpfc:/tmp
Open a shell in the container: oc rsh gitlab-runner-runner-58dd79d6b8-wvpfc
Go to /tmp: cd /tmp
Create a new runner registration by running ./gitlab-runner-linux register --config config.toml. Specify a tag other than openshift, e.g. oc-test.
Start the runner in the same shell: ./gitlab-runner-linux run --config config.toml. It will run alongside the runner managed by the Operator, but still in an OpenShift environment.
Start a job tagged for this runner and set the variable FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY to false in order to use the attach strategy.
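
For the last step, a job definition along these lines might be enough. This is only a sketch: the job name and script lines are placeholders, and the tag assumes the runner was registered with oc-test as suggested above.

```yaml
# .gitlab-ci.yml (illustrative; job name and script are placeholders)
attach-strategy-check:
  tags:
    - oc-test  # the tag given to the manually registered runner
  variables:
    # "false" disables the legacy exec strategy, so attach is used
    FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: "false"
  script:
    - id  # print the user and group IDs the build container runs as
    - touch "$CI_PROJECT_DIR/write-check"  # confirm the build directory is writable
```

If the job reaches the script and the touch succeeds, the build container was able to write under the builds directory as a non-root user.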

What are the relevant issue numbers?

Closes #26540 (closed), #27107 (closed)

Edited Feb 15, 2021 by Georgi N. Georgiev
