Custom entrypoints of containers are ignored resulting in job failing or hanging
Summary
If a Docker image has a custom entrypoint specified, or the entrypoint is overridden via the pipeline job configuration, but neither /bin/sh nor /bin/busybox is present in the image, then the job fails to start or hangs.
Steps to reproduce
See example project.
Example Project
Example images are based on Alpine Linux, with the /bin directory moved for the sake of the experiment.
Sample .gitlab-ci.yml:
.job:
  rules:
    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH || $CI_PIPELINE_SOURCE == "merge_request_event"
.build-job:
  extends: .job
  image:
    name: gcr.io/kaniko-project/executor:v1.21.1-debug
    entrypoint: [ "" ]
  script:
    - mkdir -p /kaniko/.docker
    - echo "$DOCKERFILE" > Dockerfile
    - >
      echo '{"auths":{"'$CI_REGISTRY'":{"auth":
      "'$(printf "%s:%s" "$CI_REGISTRY_USER" "$CI_REGISTRY_PASSWORD" | base64 | tr -d '\n')'"
      }}}' > /kaniko/.docker/config.json
    - /kaniko/executor
      --context "$CI_PROJECT_DIR"
      --destination "${CI_REGISTRY_IMAGE}/${IMAGE_NAME}:latest"
      --dockerfile "Dockerfile"
.run-job:
  extends: .job
  script:
    - echo $0
    - ps
"[no-bin] Build image":
  extends: .build-job
  variables:
    DOCKERFILE: |
      FROM alpine:3.19
      RUN cp -rf /bin /foo \
        && find /foo -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && find /usr/bin -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && rm -rf /bin
      ENV PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/foo"
      SHELL ["/foo/busybox", "sh", "-c"]
      ENTRYPOINT ["/foo/busybox", "sh"]
    IMAGE_NAME: no-bin
"[bin-with-sh-only] Build image":
  extends: .build-job
  variables:
    DOCKERFILE: |
      FROM alpine:3.19
      RUN cp -rf /bin /foo \
        && find /foo -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && find /usr/bin -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && /foo/rm -rf /bin \
        && /foo/mkdir /bin \
        && /foo/ln -s /foo/sh /bin/sh
      ENV PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/foo"
      SHELL ["/foo/busybox", "sh", "-c"]
      ENTRYPOINT ["/foo/busybox", "sh"]
    IMAGE_NAME: bin-with-sh-only
"[bin-with-busybox-only] Build image":
  extends: .build-job
  variables:
    DOCKERFILE: |
      FROM alpine:3.19
      RUN cp -rf /bin /foo \
        && find /foo -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && find /usr/bin -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && /foo/rm -rf /bin \
        && /foo/mkdir /bin \
        && /foo/ln -s /foo/busybox /bin/busybox
      ENV PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/foo"
      SHELL ["/foo/busybox", "sh", "-c"]
      ENTRYPOINT ["/foo/busybox", "sh"]
    IMAGE_NAME: bin-with-busybox-only
"[bin-with-sh-and-busybox] Build image":
  extends: .build-job
  variables:
    DOCKERFILE: |
      FROM alpine:3.19
      RUN cp -rf /bin /foo \
        && find /foo -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && find /usr/bin -type l -exec sh -c "ln -s /foo/busybox {}_1; mv {}_1 {};" \; \
        && /foo/rm -rf /bin \
        && /foo/mkdir /bin \
        && /foo/ln -s /foo/busybox /bin/busybox \
        && /foo/ln -s /foo/sh /bin/sh
      ENV PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/foo"
      SHELL ["/foo/busybox", "sh", "-c"]
      ENTRYPOINT ["/foo/busybox", "sh"]
    IMAGE_NAME: bin-with-sh-and-busybox
"[no-bin] Image entrypoint":
  extends: .run-job
  image: ${CI_REGISTRY_IMAGE}/no-bin:latest
  needs:
    - "[no-bin] Build image"
"[no-bin] Empty entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/no-bin:latest
    entrypoint: [ "" ]
  needs:
    - "[no-bin] Build image"
"[no-bin] Override entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/no-bin:latest
    entrypoint: [ "/foo/sh" ]
  needs:
    - "[no-bin] Build image"
"[bin-with-sh-only] Image entrypoint":
  extends: .run-job
  image: ${CI_REGISTRY_IMAGE}/bin-with-sh-only:latest
  needs:
    - "[bin-with-sh-only] Build image"
"[bin-with-sh-only] Empty entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-sh-only:latest
    entrypoint: [ "" ]
  needs:
    - "[bin-with-sh-only] Build image"
"[bin-with-sh-only] Override entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-sh-only:latest
    entrypoint: [ "/foo/sh" ]
  needs:
    - "[bin-with-sh-only] Build image"
"[bin-with-busybox-only] Image entrypoint":
  extends: .run-job
  image: ${CI_REGISTRY_IMAGE}/bin-with-busybox-only:latest
  needs:
    - "[bin-with-busybox-only] Build image"
"[bin-with-busybox-only] Empty entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-busybox-only:latest
    entrypoint: [ "" ]
  needs:
    - "[bin-with-busybox-only] Build image"
"[bin-with-busybox-only] Override entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-busybox-only:latest
    entrypoint: [ "/foo/busybox", "sh" ]
  needs:
    - "[bin-with-busybox-only] Build image"
"[bin-with-sh-and-busybox] Image entrypoint":
  extends: .run-job
  image: ${CI_REGISTRY_IMAGE}/bin-with-sh-and-busybox:latest
  needs:
    - "[bin-with-sh-and-busybox] Build image"
"[bin-with-sh-and-busybox] Empty entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-sh-and-busybox:latest
    entrypoint: [ "" ]
  needs:
    - "[bin-with-sh-and-busybox] Build image"
"[bin-with-sh-and-busybox] Override entrypoint":
  extends: .run-job
  image:
    name: ${CI_REGISTRY_IMAGE}/bin-with-sh-and-busybox:latest
    entrypoint: [ "/foo/sh" ]
  needs:
    - "[bin-with-sh-and-busybox] Build image"
What is the current bug behavior?
See the table below for outcomes:
Scenario | Default (image) entrypoint | Empty entrypoint | Job-specified entrypoint |
---|---|---|---|
No /bin directory | | | |
/bin contains only sh | | | |
/bin contains only busybox | | | |
/bin has sh and busybox | | | |
❌ The job fails to start
Running with gitlab-runner 16.10.0 (81ab07f6)
on spike-new-gitlab-runner-5f8c75dcbf-fk6v4 yxC4pz6Zv, system ID: r_9AC0GzBWtX5t
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab-managed-apps
Using Kubernetes executor with image registry.gitlab.com/spike_api/infra/ci-cd/no-bin:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:06
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-4-m5e0ndd3 to be running, status is Pending
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-4-m5e0ndd3 to be running, status is Pending
ContainersNotReady: "containers with unready status: [build helper]"
ContainersNotReady: "containers with unready status: [build helper]"
ERROR: Job failed (system failure): prepare environment: setting up trapping scripts on emptyDir: unable to upgrade connection: container not found ("build"). Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
💤 The job hangs
Running with gitlab-runner 16.10.0 (81ab07f6)
on spike-new-gitlab-runner-5f8c75dcbf-fk6v4 yxC4pz6Zv, system ID: r_9AC0GzBWtX5t
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab-managed-apps
Using Kubernetes executor with image registry.gitlab.com/spike_api/infra/ci-cd/bin-with-sh-only:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:10
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-5-fg496ffm to be running, status is Pending
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-5-fg496ffm to be running, status is Pending
ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
ContainersNotReady: "containers with unready status: [build helper]"
ContainersNotReady: "containers with unready status: [build helper]"
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-5-fg496ffm to be running, status is Pending
ContainersNotReady: "containers with unready status: [build helper]"
ContainersNotReady: "containers with unready status: [build helper]"
Running on runner-yxc4pz6zv-project-44400719-concurrent-5-fg496ffm via spike-new-gitlab-runner-5f8c75dcbf-fk6v4...
Getting source from Git repository
00:03
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/spike_api/infra/ci-cd/.git/
Created fresh repository.
Checking out 5d31ab72 as detached HEAD (ref is refs/merge-requests/53/head)...
Skipping Git submodules setup
Executing "step_script" stage of the job script
sh: /scripts-44400719-6511950017/detect_shell_script: not found
After this, the job hangs.
😞 /bin/sh is always used
Running with gitlab-runner 16.10.0 (81ab07f6)
on spike-new-gitlab-runner-5f8c75dcbf-fk6v4 yxC4pz6Zv, system ID: r_9AC0GzBWtX5t
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: gitlab-managed-apps
Using Kubernetes executor with image registry.gitlab.com/spike_api/infra/ci-cd/bin-with-sh-and-busybox:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:07
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-11-zn89wfv5 to be running, status is Pending
Waiting for pod gitlab-managed-apps/runner-yxc4pz6zv-project-44400719-concurrent-11-zn89wfv5 to be running, status is Pending
ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
ContainersNotReady: "containers with unready status: [build helper]"
ContainersNotReady: "containers with unready status: [build helper]"
Running on runner-yxc4pz6zv-project-44400719-concurrent-11-zn89wfv5 via spike-new-gitlab-runner-5f8c75dcbf-fk6v4...
Getting source from Git repository
00:03
Fetching changes with git depth set to 20...
Initialized empty Git repository in /builds/spike_api/infra/ci-cd/.git/
Created fresh repository.
Checking out 297bd20c as detached HEAD (ref is refs/merge-requests/53/head)...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:01
$ echo $0
/scripts-44400719-6512159886/step_script
$ ps
PID USER TIME COMMAND
1 root 0:00 /bin/sh
24 root 0:00 sh -c (/scripts-44400719-6512159886/detect_shell_script /scripts-44400719-6512159886/step_script 2>&1 | tee -a /logs-44400719-6512159886/output.log) &
25 root 0:00 /bin/sh /scripts-44400719-6512159886/step_script
26 root 0:00 tee -a /logs-44400719-6512159886/output.log
30 root 0:00 /bin/sh /scripts-44400719-6512159886/step_script
37 root 0:00 ps
$ pstree
sh---sh-+-sh---sh---pstree
        `-tee
Cleaning up project directory and file based variables
00:00
Job succeeded
What is the expected correct behavior?
- The entrypoint of the image, or the entrypoint specified in the job configuration, should be used regardless of whether /bin/sh or /bin/busybox is present.
- The job should fail fast (not hang) if the image contents or environment are not what GitLab Runner expects.
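The fail-fast behavior in the second point could look roughly like the preflight check below. This is only a sketch and not GitLab Runner's actual code: the function name `require_shell`, the root-prefix argument, and the list of candidate paths are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical preflight check (NOT GitLab Runner code).
# Prints the first candidate path that is an executable file under the
# given root, or returns non-zero right away when none is found, so the
# caller can abort instead of waiting on a shell that never starts.
require_shell() {
    root="$1"   # filesystem root to inspect; "" means the live filesystem
    shift       # remaining arguments are candidate shell paths
    for candidate in "$@"; do
        if [ -f "${root}${candidate}" ] && [ -x "${root}${candidate}" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no usable shell found (tried: $*)" >&2
    return 1
}
```

A caller might use it as `require_shell "" /bin/sh /bin/busybox /foo/sh || exit 1`, where `/foo/sh` stands in for a shell taken from the configured entrypoint.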
Relevant logs and/or screenshots
Logs attached under the "What is the current bug behavior?" section.
All images are valid and can be run via the standard docker run -it command, which opens up a fully functional shell.
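For reference, that local check can be scripted roughly as follows. This is a sketch under assumptions: the helper name `run_image` is hypothetical, the image reference is supplied by the caller, and the optional second argument maps to Docker's `--entrypoint` flag to approximate a job-level entrypoint override (the runner itself may start the container differently).

```shell
#!/bin/sh
# Hypothetical helper: start an interactive container from an image,
# optionally overriding its entrypoint.
run_image() {
    image="$1"            # image reference, e.g. <registry>/no-bin:latest
    entrypoint="${2:-}"   # optional entrypoint override, e.g. /foo/sh
    if [ -n "$entrypoint" ]; then
        docker run -it --rm --entrypoint "$entrypoint" "$image"
    else
        docker run -it --rm "$image"
    fi
}
```

For example, `run_image "${CI_REGISTRY_IMAGE}/no-bin:latest"` uses the image's own entrypoint, while `run_image "${CI_REGISTRY_IMAGE}/no-bin:latest" /foo/sh` overrides it.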