Specially crafted docker images can exhaust resources on managers
## Summary
For the Docker executor, when we run a job's main container, we copy the container's output directly to the trace log implementation. This implementation has a limit on how much data is stored.
!2534 (merged) introduced a `user` package that executes `id` inside of the user's provided container. This is used to fetch the container's `uid` and `gid`, and allows us to `chown` files created by our helper image, whose files will otherwise be created with `uid` 0.
Unfortunately, no such limit is placed on the output of `id`, so a specially crafted Docker image that replaces the `id` binary can return an excessive amount of data, filling and exhausting a limitless buffer on the Runner manager.
This problem only exists when `FF_DISABLE_UMASK_FOR_DOCKER_EXECUTOR` is enabled; however, it can be enabled at the job level. A simple protection is to ensure `FF_DISABLE_UMASK_FOR_DOCKER_EXECUTOR` cannot be enabled until the problem is resolved.
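For illustration, a hypothetical `.gitlab-ci.yml` fragment showing why a fleet-wide default alone is insufficient: Runner feature flags can be toggled per job via variables.

```yaml
# Hypothetical job definition: the feature flag is enabled for this
# job only, overriding whatever the Runner fleet's default is.
some-job:
  image: alpine:latest
  variables:
    FF_DISABLE_UMASK_FOR_DOCKER_EXECUTOR: "true"
  script:
    - echo "runs with the umask behaviour disabled"
```

This is why the mitigation must block the flag from being honoured at all, not merely leave it off by default.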
A similar problem exists for the output of a Docker service that fails its 30s health check. However, whilst there's no explicit limit here either (and there probably should be), this path reads the container's log file via a different API endpoint, which appears to have an internal limit of 1MB because we don't `follow` the output.
## Solution
- Disable `FF_DISABLE_UMASK_FOR_DOCKER_EXECUTOR` on the Runner fleet to protect GitLab-hosted Runner Managers: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/merge_requests/805/
- Explicitly set sane limits on any command/data returned from containers that is not directly fed into the trace log.
Closes #28630 (closed)