"shell not found" when trying to use Ubuntu or Fedora image
Summary
GitLab-Runner is failing jobs with a "shell not found" error.
image: fedora:rawhide yields
Executing "step_script" stage of the job script
00:01
Using docker image sha256:0415d12bda880ea27c43cfda07e7dfc0f1c18d3a48bd97bccb20742392db0689 for fedora:rawhide with digest fedora@sha256:e1d3599d0019925a65c78df763bfa73525cd0f7250290ecf033d9f6901fee759 ...
shell not found
image: dahanna/ubuntu-libc yields
Executing "step_script" stage of the job script
00:01
Using docker image sha256:eab1a04c7cf9b1a7f6a54b8fc6f9587b38afc19dbfb97706f14683770d980c70 for dahanna/ubuntu-libc with digest dahanna/ubuntu-libc@sha256:e96689d9ffe39583c4182ef7982431fd8af74c45320a81c5e06efcf5ef9541f8 ...
shell not found
You can see ubuntu-libc at https://hub.docker.com/r/dahanna/ubuntu-libc. It is just ubuntu:devel with libc upgraded. The Dockerfile is at https://github.com/dHannasch/ubuntu-libc-dockerfile/blob/master/Dockerfile
FROM ubuntu:devel
# Installing common packages such as wget, curl and git will incidentally upgrade libc.
# This Dockerfile just upgrades libc to avoid the distraction of the specific package wget/curl/git.
# It makes no difference whether you upgrade libc6 or libc-bin. Either will upgrade the other.
RUN apt-get update && apt-get install --assume-yes --no-install-recommends libc-bin
The latest nightly build of GitLab-Runner, https://s3.amazonaws.com/gitlab-runner-downloads/master/binaries/gitlab-runner-linux-386, yields the same result.
Steps to reproduce
Currently, this cannot be reproduced on GitLab.com. (I am not sure which version of GitLab-Runner GitLab.com is running.) However, it can be reproduced with the latest nightly build, https://s3.amazonaws.com/gitlab-runner-downloads/master/binaries/gitlab-runner-linux-386, using the following .gitlab-ci.yml:
test-fedora:
  image: fedora:rawhide
  tags:
    - docker
  script:
    - echo success
test-ubuntu-libc:
  image: dahanna/ubuntu-libc
  tags:
    - docker
  script:
    - echo success
Actual behavior
image: fedora:rawhide yields
Executing "step_script" stage of the job script
00:01
Using docker image sha256:0415d12bda880ea27c43cfda07e7dfc0f1c18d3a48bd97bccb20742392db0689 for fedora:rawhide with digest fedora@sha256:e1d3599d0019925a65c78df763bfa73525cd0f7250290ecf033d9f6901fee759 ...
shell not found
image: dahanna/ubuntu-libc yields
Executing "step_script" stage of the job script
00:01
Using docker image sha256:eab1a04c7cf9b1a7f6a54b8fc6f9587b38afc19dbfb97706f14683770d980c70 for dahanna/ubuntu-libc with digest dahanna/ubuntu-libc@sha256:e96689d9ffe39583c4182ef7982431fd8af74c45320a81c5e06efcf5ef9541f8 ...
shell not found
Expected behavior
It is technically possible that both fedora:rawhide and ubuntu:devel broke at the same time in ways that left them with no usable shell, despite belonging to different distribution families (Red Hat and Debian respectively). But that does not seem very probable. It seems especially improbable that ubuntu:devel has a working shell but loses it once libc is upgraded.
Obviously there is something about the latest images: the stable versions don't have this problem. So I'm bringing this up with the Docker Library as well. (Though it's hard to make the case that there's something wrong with e.g. fedora:rawhide when all the jobs still pass on GitLab.com.) But it seems very likely that whatever GitLab-Runner is choking on has nothing to do with actually being unable to find a shell.
While it is technically possible for there to be something so wrong with the Docker image that the GitLab-Runner has to fail the job, the expected behavior would be for the job to succeed normally, just as it does on whatever version of GitLab-Runner GitLab.com is using.
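One way to narrow this down might be to run a shell probe by hand inside the affected image and compare its result with what the job reports. The sketch below is only an illustration: `find_shell` is a hypothetical helper, and both the list of candidate paths and the use of `[ -x ]` are assumptions about how such a probe could work, not GitLab-Runner's actual detection code.

```shell
#!/bin/sh
# Hypothetical probe (not GitLab-Runner's code): print the first candidate
# path that exists and is executable, or "shell not found" if none is.
find_shell() {
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "shell not found"
  return 1
}

# The candidate list is an illustrative assumption.
find_shell /usr/local/bin/bash /usr/bin/bash /bin/bash \
           /usr/local/bin/sh /usr/bin/sh /bin/sh
```

If a probe like this prints a path when run manually in the image, but the job still ends with "shell not found", that would suggest the problem lies in how the check is executed on this host rather than in the image contents.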
Relevant logs and/or screenshots
ubuntu-libc job log
Preparing the "docker" executor
00:03
Using Docker executor with image dahanna/ubuntu-libc ...
Pulling docker image dahanna/ubuntu-libc ...
Using docker image sha256:eab1a04c7cf9b1a7f6a54b8fc6f9587b38afc19dbfb97706f14683770d980c70 for dahanna/ubuntu-libc with digest dahanna/ubuntu-libc@sha256:e96689d9ffe39583c4182ef7982431fd8af74c45320a81c5e06efcf5ef9541f8 ...
Preparing environment
00:01
Getting source from Git repository
00:02
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/dahanna/test/.git/
Checking out ae5c4204 as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:01
Using docker image sha256:eab1a04c7cf9b1a7f6a54b8fc6f9587b38afc19dbfb97706f14683770d980c70 for dahanna/ubuntu-libc with digest dahanna/ubuntu-libc@sha256:e96689d9ffe39583c4182ef7982431fd8af74c45320a81c5e06efcf5ef9541f8 ...
shell not found
Cleaning up file based variables
00:01
ERROR: Job failed: exit code 1
fedora job log
Preparing the "docker" executor
00:03
Using Docker executor with image fedora:rawhide ...
Pulling docker image fedora:rawhide ...
Using docker image sha256:0415d12bda880ea27c43cfda07e7dfc0f1c18d3a48bd97bccb20742392db0689 for fedora:rawhide with digest fedora@sha256:e1d3599d0019925a65c78df763bfa73525cd0f7250290ecf033d9f6901fee759 ...
Preparing environment
00:01
Getting source from Git repository
00:02
Fetching changes with git depth set to 50...
Reinitialized existing Git repository in /builds/dahanna/test/.git/
Checking out ae5c4204 as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:01
Using docker image sha256:0415d12bda880ea27c43cfda07e7dfc0f1c18d3a48bd97bccb20742392db0689 for fedora:rawhide with digest fedora@sha256:e1d3599d0019925a65c78df763bfa73525cd0f7250290ecf033d9f6901fee759 ...
shell not found
Cleaning up file based variables
00:01
ERROR: Job failed: exit code 1
Environment description
This can be reproduced with the latest nightly-build standalone binary, https://s3.amazonaws.com/gitlab-runner-downloads/master/binaries/gitlab-runner-linux-386. (It was originally observed with GitLab-Runner compiled from source.)
$ sudo docker info
Client:
 Context: default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 39
 Server Version: 20.10.2
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.4.0-138-generic
 Operating System: Ubuntu 16.04.7 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.67GiB
 Name: new-gitlab-runner
 ID: VPSP:G7TL:3PVC:QEML:CUVX:R4JZ:3OJ5:UNLJ:F27S:2LNG:FRFB:JNV2
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 No Proxy: 127.0.0.1,localhost,localnets,.sandia.gov,.lan,.local,.home,/var/run/docker.sock,169.254.169.254,10.202.0.0/16,10.203.0.0/16
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
WARNING: No swap limit support
config.toml contents
$ sudo cat /etc/gitlab-runner/config.toml
concurrent = 2
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
executor = "docker"
[runners.custom_build_dir]
[runners.cache]
[runners.cache.s3]
[runners.cache.gcs]
[runners.docker]
tls_verify = false
image = "alpine:edge"
privileged = false
disable_entrypoint_overwrite = false
oom_kill_disable = false
disable_cache = false
volumes = ["/cache"]
shm_size = 0
Used GitLab Runner version
$ ./gitlab-runner-linux-386 --version
Version: 13.10.0~beta.44.g5905c876
Git revision: 5905c876
Git branch: master
GO version: go1.13.8
Built: 2021-02-24T10:31:17+0000
OS/Arch: linux/386