Error with recursive submodule checkout (fatal: transport 'file' not allowed)

TL;DR: As a workaround, force a full clone by setting GIT_STRATEGY: clone (possibly in the UI); see #38908 (comment 2696065209).

Summary

With the git fetch strategy, a CI job with GIT_SUBMODULE_STRATEGY: recursive fails on nested submodules with fatal: transport 'file' not allowed if the commit for the submodule was not fetched before.

Steps to reproduce

Create 3 repositories:

  • subsubrepo, with 2 branches:
    • main
    • other, with a commit added w.r.t. main
  • subrepo, with the same 2 branches and submodule subsubrepo on the tip commit of the corresponding branch
  • toprepo, again with the same 2 branches and submodule subrepo on the tip commit of the corresponding branch
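
The layout above can be sketched locally before pushing anything to GitLab. This is only an illustration under assumptions: the helper names g and make_parent are made up, the repositories live in a temporary directory, and the submodule URLs are local paths, so they would have to point at the hosted repositories for an actual CI reproduction. It needs a Git version that supports git init -b.

```shell
#!/usr/bin/env bash
set -eu
work="$(mktemp -d)"

# Hypothetical helper: allow local-path submodules and use a throwaway identity.
g() { git -c protocol.file.allow=always -c user.name=demo -c user.email=demo@example.com "$@"; }

# subsubrepo: branch main, plus branch other with one extra commit.
g init -q -b main "$work/subsubrepo"
g -C "$work/subsubrepo" commit -q --allow-empty -m 'subsubrepo: main'
g -C "$work/subsubrepo" checkout -q -b other
g -C "$work/subsubrepo" commit -q --allow-empty -m 'subsubrepo: other'
g -C "$work/subsubrepo" checkout -q main

# Hypothetical helper: create repo $1 whose submodule $2 sits on the tip of
# the branch with the same name (main -> main, other -> other).
make_parent() {
  local name="$1" child="$2"
  g init -q -b main "$work/$name"
  g -C "$work/$name" submodule --quiet add "$work/$child" "$child"
  g -C "$work/$name" commit -q -m "$name: add $child at main"
  g -C "$work/$name" checkout -q -b other
  g -C "$work/$name/$child" checkout -q --detach origin/other
  g -C "$work/$name" add "$child"
  g -C "$work/$name" commit -q -m "$name: move $child to other"
  g -C "$work/$name" checkout -q main
}

make_parent subrepo subsubrepo
make_parent toprepo subrepo

# Show which subrepo commit each toprepo branch records.
git -C "$work/toprepo" rev-parse main:subrepo other:subrepo
```

The final command prints the two subrepo commits recorded by toprepo, one per branch, which should differ.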

Make sure to disable separate caches for protected & non-protected branches in the CI/CD settings if branches in toprepo have different protection levels. Also make sure the Git strategy in the CI/CD settings is set to git fetch (not git clone, which can be used as a workaround for affected users, btw).

Now first run a GIT_SUBMODULE_STRATEGY: recursive job on a Linux Docker runner in toprepo on the main branch. After that job has finished, run another job in toprepo on the other branch. If you make a mistake, stop the gitlab-runner service and remove the Docker caching volumes.

.gitlab-ci.yml
variables:
  GIT_SUBMODULE_STRATEGY: recursive

job1:
  tags:
    - linux-docker
  script:
    - echo hi

Actual behavior

Fetching subsubrepo fails with fatal: transport 'file' not allowed.

Expected behavior

Fetching nested submodules should work.

Relevant logs and/or screenshots

job log
Getting source from Git repository
Gitaly correlation ID: [...]
Fetching changes...
Reinitialized existing Git repository in /builds/mygroup/toprepo/.git/
Created fresh repository.
Checking out [...] as detached HEAD (ref is other)...
Updating/initializing submodules recursively...
Submodule 'subrepo' (https://gitlab-ci-token:[MASKED]@[...]/mygroup/subrepo.git) registered for path 'subrepo'
Synchronizing submodule url for 'subrepo'
Entering 'subrepo'
Entering 'subrepo/subsubrepo'
Entering 'subrepo'
HEAD is now at [...]
Entering 'subrepo/subsubrepo'
HEAD is now at [...]
From https://[...]/mygroup/subrepo
 * branch                HEAD       -> FETCH_HEAD
From https://[...]/mygroup/subrepo
 * branch                [...] -> FETCH_HEAD
Fetching submodule subsubrepo
fatal: transport 'file' not allowed
Errors during submodule fetch:
	subsubrepo
fatal: Fetched in submodule path 'subrepo', but it did not contain [...]. Direct fetching of that commit failed.
Updating submodules failed. Retrying...
Synchronizing submodule url for 'subrepo'
From https://[...]/mygroup/subrepo
 * branch                HEAD       -> FETCH_HEAD
From https://[...]/mygroup/subrepo
 * branch                [...] -> FETCH_HEAD
Fetching submodule subsubrepo
fatal: transport 'file' not allowed
Errors during submodule fetch:
	subsubrepo
fatal: Fetched in submodule path 'subrepo', but it did not contain [...]. Direct fetching of that commit failed.
Retrying in 5s
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1

Environment description

Self-hosted Linux Docker runner.

config.toml contents (irrelevant parts removed)
[[runners]]
  executor = "linux-docker"
  [runners.docker]
    privileged = true
    volumes = [
      "/cache",
      "/certs/client",
      "/etc/localtime:/etc/localtime:ro"
    ]

Used GitLab Runner version

Version:      18.1.1
Git revision: 2b813ade
Git branch:   18-1-stable
GO version:   go1.24.4 X:cacheprog
Built:        2025-06-26T16:25:31Z
OS/Arch:      linux/amd64

Possible fixes

I spent a lot of time researching the exact problem. Here's what I found:

  • The GitLab runner caches the repo in a Docker volume (one per repo, per level of concurrency). The runner removes the Git config files from this volume.
  • When a Git repository has no remote configured, as is the case once the config is deleted, Git falls back to the name origin and tries to fetch from it. The fetch is a git fetch origin <commit> call that git submodule update executes internally. Since origin is neither a configured remote nor an HTTP or SSH URL, Git treats it as a local path, i.e. the file transport. Because the fetch was not started directly by the user (git submodule sets GIT_PROTOCOL_FROM_USER=0, and recent Git versions apply the user-only policy, PROTOCOL_ALLOW_USER_ONLY in Git's source, to the file transport by default), the file transport is refused, which leads to the esoteric error message.
  • GitLab runner does add the remote for toprepo using git remote add origin https://gitlab-ci-token:[MASKED]@[...]/mygroup/toprepo.git, and adds the URL for subrepo in .git/config using git submodule init, and then propagates it to .git/modules/subrepo/config via git submodule sync --recursive. However, git submodule init is not recursive, so subsubrepo does not get initialized, meaning that it won't get a remote and won't be affected by git submodule sync --recursive.
  • The runner then retries the submodule update, preceded by a second git submodule sync --recursive, but this usually only works when the first submodule update used --depth=1, not with a larger depth. (It may also work when -c submodule.recurse=false is passed to git submodule update, which is not to be confused with the --recurse flag.) Note that in these cases the original update command does still report failure.
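
The error message itself can be triggered in isolation, without any submodules. This is a minimal sketch, assuming a scratch repository with no remotes; it sets protocol.file.allow=user explicitly so the result does not depend on the installed Git's default, and the variable names as_user and as_child are only illustrative.

```shell
#!/usr/bin/env bash
set -eu
export LC_ALL=C   # keep Git's messages untranslated
tmp="$(mktemp -d)"
git init -q "$tmp/repo"

# Started by the user: Git falls back to treating 'origin' as a local path
# and then complains that it is not a repository.
as_user="$(GIT_PROTOCOL_FROM_USER=1 git -C "$tmp/repo" -c protocol.file.allow=user fetch origin 2>&1 || true)"

# Started the way git submodule starts its child commands: the file
# transport is refused before the path is even looked at.
as_child="$(GIT_PROTOCOL_FROM_USER=0 git -C "$tmp/repo" -c protocol.file.allow=user fetch origin 2>&1 || true)"

printf 'user:  %s\n' "$as_user"
printf 'child: %s\n' "$as_child"
```

Only the second call should report fatal: transport 'file' not allowed, which matches the line in the job log.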

The issue can be reproduced locally using the following script, repro.sh:

#!/usr/bin/env bash
set -eu -o pipefail
shopt -s globstar

remote="${1:?Please pass remote for toprepo}"
other_ref="${2:-other}" # branch/commit
update_depth="${3:-2}"

set -x  # Trace commands
rm -rf ./toprepo/
echo '>>> Making initial clone'
# Clone `main`
git clone --depth 1 --shallow-submodules --recurse-submodules -- "$remote" ./toprepo
cd ./toprepo/
# Simulate GitLab runner cleaning config
echo '>>> Removing config'
rm ./.git/**/config

echo '>>> Re-adding remote'
# Simulate GitLab runner re-adding remote
git remote add origin -- "$remote"

echo '>>> Fetching & checking out other ref'
# Now we try to checkout `other`
other_commit="$(git ls-remote --exit-code origin -- "$other_ref" | cut -f1 || echo "$other_ref")"
git fetch --depth 1 --no-recurse-submodules origin -- "$other_commit"
git checkout --no-recurse-submodules FETCH_HEAD

git submodule init
git submodule sync --recursive

# Fails
git submodule update --depth "$update_depth" --recursive --init &&
	echo 'Initial update unexpectedly succeeded?!' >&2 && exit 1

echo '>>> Trying to fix with auto-retry from GitLab runner...'
git submodule sync --recursive
git submodule update --depth "$update_depth" --recursive --init &&
	echo 'Retried update succeeded!' && exit

# The fix
read -p 'Observe the error above. Then hit enter to fix it... (or interrupt to examine the repo)'
# These commands should be between checkout & update
git submodule init
git submodule foreach --recursive 'git submodule init'
git submodule sync --recursive

git submodule update --depth "$update_depth" --recursive --init ||
	(echo 'Fixed update unexpectedly failed?!' >&2 && exit 1)

This also ends with the fix that I would propose: git submodule foreach --recursive 'git submodule init'.
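
The effect of that command can be checked on throwaway local repositories. The sketch below is an illustration under assumptions, not the runner's actual behavior: it uses local paths as submodule URLs and a made-up helper g, and it only shows that a top-level git submodule init does not register the nested submodule's URL, while the foreach variant does.

```shell
#!/usr/bin/env bash
set -eu
tmp="$(mktemp -d)"

# Hypothetical helper: allow local-path submodules and use a throwaway identity.
g() { git -c protocol.file.allow=always -c user.name=demo -c user.email=demo@example.com "$@"; }

# Build subsubrepo <- subrepo <- toprepo as local repositories.
g init -q "$tmp/subsubrepo"
g -C "$tmp/subsubrepo" commit -q --allow-empty -m 'init'
g init -q "$tmp/subrepo"
g -C "$tmp/subrepo" submodule --quiet add "$tmp/subsubrepo" subsubrepo
g -C "$tmp/subrepo" commit -q -m 'add subsubrepo'
g init -q "$tmp/toprepo"
g -C "$tmp/toprepo" submodule --quiet add "$tmp/subrepo" subrepo
g -C "$tmp/toprepo" commit -q -m 'add subrepo'

# Fresh clone; init plus a non-recursive update populates subrepo only.
g clone -q "$tmp/toprepo" "$tmp/work"
g -C "$tmp/work" submodule --quiet init
g -C "$tmp/work" submodule --quiet update

# URL registered for subsubrepo inside subrepo, before and after the fix.
before="$(git -C "$tmp/work/subrepo" config --get submodule.subsubrepo.url || true)"
g -C "$tmp/work" submodule --quiet foreach --recursive 'git submodule --quiet init'
after="$(git -C "$tmp/work/subrepo" config --get submodule.subsubrepo.url || true)"

printf 'before: %s\nafter:  %s\n' "${before:-<unset>}" "${after:-<unset>}"
```

Before the foreach call the nested URL is unset; afterwards it is registered, so a following git submodule sync --recursive has something to propagate.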

Call it like so: ./repro.sh git@[...]:mygroup/toprepo. Passing depth 1 as third argument should make the retry succeed as well.

Note that I really spent a lot of time trying to figure out all the details, but still there are cases where the issue does not pop up even when I would expect it to, and there are cases where the retry sync does work, even when I wouldn't expect it to. I hope the repro works for you.

I don't think this is necessarily a git bug? Or do you think it is and should it be reported?

I hope the fix also works when we nest yet another level deeper, but I find it hard to test this.

Edited by stevenwdv