ERROR: Job failed (system failure): prepare environment: failed to start process: exec: "bash": executable file not found in $PATH. Using 16.2+ on RedHawk Linux server

Summary

Since updating our runner to 16.2.3, this error is thrown in every shell executor job. Rolling back to 16.1.x or earlier fixes it. The problem occurs only on our RedHawk 7.5 server; our RHEL7 and RHEL8 servers are not affected.

On the same server, a custom executor works when configured with "/usr/bin/bash" as run_exec and "-l" as run_args.
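One possible reading of the custom executor result is that an absolute path sidesteps the $PATH search that a bare "bash" needs. The sketch below only illustrates that difference in isolation; it is not how the runner launches jobs. try_launch is a hypothetical helper, and the empty temporary directory stands in for a PATH that does not contain bash.

```shell
#!/bin/sh
# Hypothetical helper (not part of gitlab-runner): run a command in an
# environment whose PATH points only at an empty directory, then report
# whether the command could be started and exited successfully.
# PATH is set to an empty directory rather than unset, because some libc
# implementations fall back to a default search path when PATH is unset.
try_launch() {
    empty_dir=$(mktemp -d) || return 2
    if env -i PATH="$empty_dir" "$@" >/dev/null 2>&1; then
        result=started
    else
        result=failed
    fi
    rmdir "$empty_dir"
    printf '%s\n' "$result"
}

# Resolve bash once in the current (working) environment.
bash_abs=$(command -v bash)

try_launch bash -c 'exit 0'          # bare name: requires a PATH search
try_launch "$bash_abs" -c 'exit 0'   # absolute path: no PATH search needed
```

If the two calls report different results on the affected server, that would point at the PATH the process is started with rather than at bash itself.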

Steps to reproduce

.gitlab-ci.yml
test:
  variables:
    GIT_STRATEGY: "none"
  dependencies: []
  needs: []
  tags: [redhawk]
  script:
    - echo "TEST"

Actual behavior

Running with gitlab-runner 16.2.0 (782e15da)
on redhawk <redacted>
Resolving secrets
Preparing the "shell" executor
Using Shell (bash) executor...
Preparing environment
ERROR: Job failed (system failure): prepare environment: failed to start process: exec: "bash": executable file not found in $PATH. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

This error occurs, with no changes to the environment, on runner version 16.2.0 and later.
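The error text says that a bare "bash" was looked up on $PATH and not found. To see what such a lookup would find for a given PATH value, a small helper along these lines can be used. find_in_path is a hypothetical diagnostic, not part of gitlab-runner, and it only approximates a PATH search.

```shell
#!/bin/sh
# Hypothetical diagnostic: walk a colon-separated PATH value and print the
# first entry that contains an executable regular file with the given name.
# The body runs in a subshell, so the IFS and globbing changes stay local.
find_in_path() (
    name=$1
    set -f      # disable globbing while the PATH value is split
    IFS=:
    for dir in $2; do
        [ -n "$dir" ] || continue   # skip empty entries
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0
        fi
    done
    exit 1
)

# Example: check the PATH of the current shell.
find_in_path bash "$PATH" || echo "bash not found on PATH" >&2
```

The more interesting comparison is the PATH seen by the runner process rather than by an interactive login shell. On Linux that value can be read from /proc/<pid>/environ of the runner process and passed as the second argument.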

Expected behavior

Running with gitlab-runner 16.1.0 (b72e108d)
  on redhawk <redacted>
Resolving secrets
Preparing the "shell" executor
Using Shell (bash) executor...
Preparing environment
Running on redhawk...
Getting source from Git repository
Skipping Git repository setup
Skipping Git checkout
Skipping Git submodules setup
Executing "step_script" stage of the job script
$ echo "TEST"
TEST
Cleaning up project directory and file based variables
Job succeeded

The job succeeds, with no changes to the environment, on runner version 16.1.1 and earlier.

Relevant logs and/or screenshots

job log
Running with gitlab-runner 16.2.0 (782e15da)
on redhawk <redacted>
Resolving secrets
Preparing the "shell" executor
Using Shell (bash) executor...
Preparing environment
ERROR: Job failed (system failure): prepare environment: failed to start process: exec: "bash": executable file not found in $PATH. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

runner debug log
DEBU[0006] Preparing the "shell" executor    job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Shell configuration: command: bash
arguments:
- -l
cmdline: bash -l
dockercommand:
- sh
- -c
- "if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash -l\nelif [ -x /usr/bin/bash
  ]; then\n\texec /usr/bin/bash -l\nelif [ -x /bin/bash ]; then\n\texec /bin/bash
  -l\nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh -l\nelif [ -x
  /usr/bin/sh ]; then\n\texec /usr/bin/sh -l\nelif [ -x /bin/sh ]; then\n\texec /bin/sh
  -l\nelif [ -x /busybox/sh ]; then\n\texec /busybox/sh -l\nelse\n\techo shell not
  found\n\texit 1\nfi\n\n"
passfile: false
extension: ""  job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Using Shell (bash) executor...                job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Waiting for signals...                        job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] No referees configured                        job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Executing build stage                         build_stage=prepare_script job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Preparing environment             job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Using new shell command execution             job=<redacted> project=<redacted> runner=<redacted> 
ERRO[0006] Job failed (system failure): prepare environment: failed to start process: exec: "bash": executable file not found in $PATH. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information  duration_s=0.003403969 job=<redacted> project=<redacted> runner=<redacted>
DEBU[0006] Checking for jobs...nothing                   runner=<redacted>

Environment description

This occurs only on our RedHawk Linux 7.5 server (https://concurrent-rt.com/products/software/redhawk-linux/), and only with runner version 16.2.0 or later. Our RHEL7 and RHEL8 servers work correctly with 16.2.3.

config.toml contents
listen_address = "[::]:8092"
concurrent = 2
check_interval = 0
log_level = "info"
log_format = "text"
shutdown_timeout = 0

[session_server]
  listen_address = "[::]:8093"
  session_timeout = 1800

[[runners]]
  name = "redhawk"
  url = "<redacted>"
  id = 2
  token = "<redacted>"
  token_obtained_at = 2023-07-01T01:45:45Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "shell"
  cache_dir = "<redacted>"
  [runners.custom_build_dir]
    enabled = true

config_working.toml contents
concurrent = 1
check_interval = 0
shutdown_timeout = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "redhawk"
  url = "<redacted>"
  id = 3
  token = "<redacted>"
  token_obtained_at = 2024-01-10T19:34:43Z
  token_expires_at = 0001-01-01T00:00:00Z
  executor = "custom"
  builds_dir = "<redacted>"
  cache_dir = "<redacted>"
  [runners.custom_build_dir]
    enabled = true
  [runners.cache]
    MaxUploadedArchiveSize = 0
  [runners.custom]
    run_exec = "/usr/bin/bash"
    run_args = [ "-l" ]

Used GitLab Runner version

Not working:

gitlab-runner --version
Version:      16.2.0
Git revision: 782e15da
Git branch:   16-2-stable
GO version:   go1.20.5
Built:        2023-07-21T22:52:35+0000
OS/Arch:      linux/amd64

Working:

gitlab-runner --version
Version:      16.1.0
Git revision: b72e108d
Git branch:   16-1-stable
GO version:   go1.19.9
Built:        2023-06-21T21:52:30+0000
OS/Arch:      linux/amd64

Tested and not working: 16.2.0 / 16.2.3 / 16.3.0 / 16.6.2

Possible fixes

Workaround: use the custom executor, as shown above in config_working.toml.

I checked the v16.1.0...v16.2.0 comparison (https://gitlab.com/gitlab-org/gitlab-runner/-/compare/v16.1.0...v16.2.0?from_project_id=250833&page=10&straight=false#738e384d2ee845352fdfc52719fb3c79d361952a) and saw nothing that could obviously cause this. I also tested with the FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY feature flag both enabled and disabled, and the behavior did not change.

Edited by J Z