gitlab-runner environment variables causing build environment to exceed system limits

Summary

gitlab-runner's environment variables, combined with those from Windows, MSYS, and the build itself, are pushing the total environment up against system limits and causing tools (find & xargs) to fail.

Since upgrading to 16.7 (both runner & instance), some of our Windows builds have been failing intermittently when running find / xargs commands, apparently because the environment, or the tool's own internal command line, exceeds system limits.

This has been happening on both Docker & VirtualBox instances, and the image used has not changed, so there are no changes to the MSYS runtime or other tools. Reverting to 16.6 (well, 16.6 with "Add support for Windows 11 23H2" (!4504, merged) included) appears to fix the issue.

This is a bit of a weird one in terms of bug reports, as I'm not sure there's a solution. It might just be a "this is something to be aware of" documentation-type thing. But perhaps a review of gitlab-runner's own environment variable usage might be necessary? Assuming I'm right about the cause of this (and that's no guarantee), I can see this becoming a more frequent issue in future.

Steps to reproduce

-

Actual behavior

find: The environment is too large for exec().

Expected behavior

The commands should work.

Relevant logs and/or screenshots

-

Environment description

The build environment is fairly complex, with Windows, MSYS, make & a perl test framework all contributing to the environment.

I totalled up the environment as printed by env just before the find call failed:

Source                     Bytes   Number of variables
Windows                     2709    40
MSYS                        2019    30
gitlab-runner/CI           11933   127
CI vars (pubkey removed)     693    13
repo                        8960    92
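
For anyone wanting to produce a similar breakdown, here is a rough sketch of one way to total exported variables by name prefix in bash. The function name `env_size` and the prefix patterns are illustrative assumptions, not the exact method used for the table above, and the count is in characters, which only equals bytes for ASCII values.

```shell
#!/usr/bin/env bash
# env_size REGEX
# Sums the size of every exported variable whose NAME matches REGEX.
# Prints "<size> <count>", where size is the length of each NAME=VALUE
# entry plus one terminator per entry.
# Note: ${#...} counts characters, so multi-byte values are undercounted.
env_size() {
    local pattern=$1 name entry bytes=0 count=0
    for name in $(compgen -e); do
        [[ $name =~ $pattern ]] || continue
        entry="$name=${!name}"              # indirect expansion: value of $name
        bytes=$(( bytes + ${#entry} + 1 ))
        count=$(( count + 1 ))
    done
    echo "$bytes $count"
}

# Example grouping (the prefixes are an assumption about which variables
# belong to the runner).
printf '%-12s %s\n' "runner/CI:" "$(env_size '^(CI_|FF_|GITLAB_)')"
printf '%-12s %s\n' "everything:" "$(env_size '.')"
```

Running this just before the failing command gives a size and a variable count per group, which makes it easier to compare runner versions.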

Apparently the limit on Windows is 32kB and this is only about 28kB (and that includes a 3kB variable I added to force the issue). Diffing the outputs between 16.6 & 16.7 there's only a single extra environment variable (FF_USE_DOCKER_AUTOSCALER_DIAL_STDIO=true), but I can only say what I'm seeing.
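
A quick way to see how close a job is to the limit is to compare the environment size against what the system reports. This is only a sketch: it assumes the find error is tied to the `ARG_MAX` value reported by `getconf`, which may not be the same number as the Windows limit mentioned above, and the last line assumes GNU findutils.

```shell
#!/usr/bin/env bash
# Limit the C library reports for arguments plus environment passed to exec().
arg_max=$(getconf ARG_MAX)

# Approximate size of the current environment: one NAME=VALUE line per
# variable as printed by env (multi-line values add extra newlines).
env_bytes=$(env | wc -c | tr -d ' ')

echo "ARG_MAX:     $arg_max"
echo "environment: $env_bytes"
echo "headroom:    $(( arg_max - env_bytes ))"

# GNU xargs can also print the limits it derives for itself.
# Ignored if this xargs does not support the option.
xargs --show-limits --no-run-if-empty </dev/null 2>&1 || true
```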

There are plenty of environment variables within our own build that can and will be removed (Make appears to be exporting much more than expected, for one), but 12kB from gitlab-runner alone feels excessive.

Used GitLab Runner version

16.7

Possible fixes

Well, a workaround:

# After configuring we have no further need for any of the builtin gitlab-ci variables (except for some of the GITLAB_*
# variables), and they have such length that they tend to get in the way of our own additions, bumping up against
# various system limits (especially on Windows).
# Strip these out as early as possible to avoid this.
# This export -n & empty string combo appears to be the only one that works reliably on Windows/MSYS
for env in $(compgen -e | grep "^\(FF_\|CI_\|GITLAB_FEATURES\)"); do
    export -n "$env="
done
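
To check that the stripping actually shrinks what child processes receive, the size of `env` output can be compared before and after the loop. A minimal sketch, assuming bash; `CI_EXAMPLE_PADDING` is a hypothetical stand-in variable, not something gitlab-runner sets.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a large runner-provided variable (3000 bytes of filler).
export CI_EXAMPLE_PADDING
CI_EXAMPLE_PADDING=$(printf 'x%.0s' {1..3000})

# Size of the environment a child process sees, as printed by env.
child_env_bytes() { env | wc -c | tr -d ' '; }

before=$(child_env_bytes)

# Same stripping approach as the workaround above.
for name in $(compgen -e | grep -E '^(FF_|CI_|GITLAB_FEATURES)'); do
    export -n "$name="
done

after=$(child_env_bytes)
echo "child environment: $before -> $after bytes"
```

Note that the variables are only un-exported and emptied in the current shell, so the loop has to run in the same shell that later invokes find / xargs.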