Unable to clean removed sub-submodules when using the GIT_STRATEGY: fetch
Summary
In https://gitlab.com/gitlab-org/gitlab-runner/-/blob/4d48abb0891720bd1350394654b1c5062f6c51d2/shells/abstract.go#L442-444 the function effectively runs the following sequence:
...
git submodule sync --recursive
git submodule foreach --recursive git clean -ffxd
git submodule foreach --recursive git reset --hard
git submodule update --init --recursive
...
What this does is:
1. Update all the submodules' URLs, recursively
2. Recursively remove any untracked files in the submodules
3. Recursively put back any removed, tracked files in the submodules
4. Recursively check out the correct commit in all submodules
The problem is that step 2 (the clean) runs before step 4 (the update). Suppose a submodule itself has a submodule, and the new commit of that submodule removes it. When step 2 runs there is no problem: the submodule is still on the earlier commit, so the sub-submodule is tracked and retained. Once step 4 checks out the new commit, the directory that used to be the sub-submodule becomes an untracked directory that is left behind. If the clean had run after the update, the directory would have been removed correctly.
Note: MR gitlab-runner!2883 (merged) adds a second git clean -ffxd after step 4 of the original sequence to fix the issue, according to gitlab-runner!2351 (comment 403269219) by @pedropombeiro on gitlab-runner!2351 (closed).
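The sequence with the extra clean described in the note can be sketched as a small shell function. This is an illustration only, not the gitlab-runner implementation: the function name `update_submodules_with_final_clean` is hypothetical, and it assumes it is run from the top-level working tree after the new commit has been checked out.

```shell
#!/bin/sh
# Hypothetical helper (not gitlab-runner code): the original four steps,
# followed by a second clean. Directories of sub-submodules that the update
# removed from the index become untracked only after step 4, so the second
# pass is the first clean that is able to delete them.
update_submodules_with_final_clean() {
    git submodule sync --recursive &&
    git submodule foreach --recursive git clean -ffxd &&
    git submodule foreach --recursive git reset --hard &&
    git submodule update --init --recursive &&
    git submodule foreach --recursive git clean -ffxd
}
```

The double `-f` matters here: the leftover directory still contains a `.git` file, and `git clean` skips nested repositories unless `-f` is given twice.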
Steps to reproduce
git clone https://gitlab.com/dkozlov/merge_request-2883
cd merge_request-2883
git checkout 7f9b73ce33942c117478212c98bd6c1e8a021d1a
git status
HEAD detached at 7f9b73c
nothing to commit, working tree clean
git submodule update --init --recursive
Cloning into '/home/user/git/test/merge_request-2883/merge_request-2883-submodule-a'...
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-submodule-a.git/
Submodule path 'merge_request-2883-submodule-a': checked out 'aafce850fc665d23e7c2351edf3651df1dfd396e'
Submodule 'merge_request-2883-sub-submodule-b-1' (https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1) registered for path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'
Submodule 'merge_request-2883-sub-submodule-b-2' (https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2) registered for path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'
Cloning into '/home/user/git/test/merge_request-2883/merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'...
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1.git/
Cloning into '/home/user/git/test/merge_request-2883/merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'...
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2.git/
Submodule path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1': checked out '49fd0517d8bc4649aa51c6222390b5f9cd560964'
Submodule path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2': checked out 'b78c98693b313f10f26821a0f5d28284c3f0474b'
ls merge_request-2883-submodule-a/
merge_request-2883-sub-submodule-b-1 merge_request-2883-sub-submodule-b-2 README.md
git checkout d476cb4bdf877d764392a7f013bfc75dbfb1584f
M merge_request-2883-submodule-a
Previous HEAD position was 7f9b73c add submodule-a
HEAD is now at d476cb4 update merge_request-2883-submodule-a
git submodule sync --recursive
Synchronizing submodule url for 'merge_request-2883-submodule-a'
Synchronizing submodule url for 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'
Synchronizing submodule url for 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'
git submodule foreach --recursive git clean -ffxd
Entering 'merge_request-2883-submodule-a'
Entering 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'
Entering 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'
git submodule foreach --recursive git reset --hard
Entering 'merge_request-2883-submodule-a'
HEAD is now at aafce85 add submodules b1 and b2
Entering 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'
HEAD is now at 49fd051 Initial commit
Entering 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'
HEAD is now at b78c986 Initial commit
git submodule update --init --recursive
**warning: unable to rmdir 'merge_request-2883-sub-submodule-b-2': Directory not empty**
Submodule path 'merge_request-2883-submodule-a': checked out '7eb4be081612cf7476dd44bf905a4348e050b669'
cd merge_request-2883-submodule-a
git status
HEAD detached at 7eb4be0
Untracked files:
(use "git add <file>..." to include in what will be committed)
**merge_request-2883-sub-submodule-b-2**/
Example Project
https://gitlab.com/dkozlov/merge_request-2883
https://gitlab.com/dkozlov/merge_request-2883-submodule-a
https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1
https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2
What is the current bug behavior?
The directory of the removed sub-submodule "merge_request-2883-sub-submodule-b-2" is left behind in the working tree as an untracked directory when using GIT_STRATEGY: fetch
What is the expected correct behavior?
The directory of the removed sub-submodule "merge_request-2883-sub-submodule-b-2" should be cleaned from the working tree when using GIT_STRATEGY: fetch
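One way to check for this state in a job is to list untracked entries inside every submodule after the update. The following is only a sketch: the function name `list_untracked_in_submodules` is hypothetical, and it assumes a git version whose `git submodule foreach` exposes the `$displaypath` variable.

```shell
#!/bin/sh
# Hypothetical check: print every untracked entry found inside any submodule,
# prefixed with the submodule's path relative to the current directory.
# "git status --porcelain" marks untracked entries with "??"; ignored files
# are not listed, so this is narrower than what "git clean -ffxd" removes.
# Assumption: submodule paths do not contain "|" or "&", which would confuse
# the sed expression.
list_untracked_in_submodules() {
    git submodule foreach --quiet --recursive \
        'git status --porcelain | sed -n "s|^?? |$displaypath/|p"'
}
```

Run from the top-level working tree after the four-step sequence, this would be expected to report the leftover `merge_request-2883-sub-submodule-b-2/` directory under `merge_request-2883-submodule-a/`, and to print nothing once the directory has been cleaned.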
Relevant logs and/or screenshots
// How to recreate the example project https://gitlab.com/dkozlov/merge_request-2883:
// Create 4 repositories, e.g.:
merge_request-2883
merge_request-2883-submodule-a
merge_request-2883-sub-submodule-b-1
merge_request-2883-sub-submodule-b-2
Perform the following actions:
git clone https://gitlab.com/dkozlov/merge_request-2883-submodule-a
cd merge_request-2883-submodule-a
git submodule add https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1
git submodule add https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2
git commit -m "add submodules b1 and b2"
git push
cd ..
git clone https://gitlab.com/dkozlov/merge_request-2883
cd merge_request-2883
git submodule add https://gitlab.com/dkozlov/merge_request-2883-submodule-a
git commit -m "add submodule-a"
git push
// Update the git submodules recursively
git submodule update --init --recursive
Submodule 'merge_request-2883-sub-submodule-b-1' (https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1) registered for path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'
Submodule 'merge_request-2883-sub-submodule-b-2' (https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2) registered for path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'
Cloning into '/home/user/git/merge_request-2883/merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1'...
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-1.git/
Cloning into '/home/user/git/merge_request-2883/merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2'...
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2.git/
Submodule path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-1': checked out '49fd0517d8bc4649aa51c6222390b5f9cd560964'
Submodule path 'merge_request-2883-submodule-a/merge_request-2883-sub-submodule-b-2': checked out 'b78c98693b313f10f26821a0f5d28284c3f0474b'
cd ../merge_request-2883-submodule-a
// Remove submodule "merge_request-2883-sub-submodule-b-2" using the following guide https://stackoverflow.com/questions/1260748/how-do-i-remove-a-submodule
git submodule deinit -f -- merge_request-2883-sub-submodule-b-2
Cleared directory 'merge_request-2883-sub-submodule-b-2'
Submodule 'merge_request-2883-sub-submodule-b-2' (https://gitlab.com/dkozlov/merge_request-2883-sub-submodule-b-2) unregistered for path 'merge_request-2883-sub-submodule-b-2'
rm -rf .git/modules/merge_request-2883-sub-submodule-b-2
git rm -f merge_request-2883-sub-submodule-b-2
rm 'merge_request-2883-sub-submodule-b-2'
git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: .gitmodules
deleted: merge_request-2883-sub-submodule-b-2
git commit -m "remove b-2"
git push
cd ../merge_request-2883
git log
commit 7f9b73ce33942c117478212c98bd6c1e8a021d1a (HEAD -> master, origin/master, origin/HEAD)
Author: Dmitry Kozlov <dmitry.f.kozlov@gmail.com>
Date: Sun May 16 04:42:54 2021 +0300
add submodule-a
commit 16110481e4a4b13ef4aaf4600ee1cb075029c4bb
Author: DmtiryK <dmitry.f.kozlov@gmail.com>
Date: Sun May 16 01:08:55 2021 +0000
Initial commit
ls merge_request-2883-submodule-a/
merge_request-2883-sub-submodule-b-1 merge_request-2883-sub-submodule-b-2 README.md
git submodule update --rebase --remote
warning: redirecting to https://gitlab.com/dkozlov/merge_request-2883-submodule-a.git/
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), 347 bytes | 347.00 KiB/s, done.
From https://gitlab.com/dkozlov/merge_request-2883-submodule-a
aafce85..7eb4be0 master -> origin/master
First, rewinding head to replay your work on top of it...
warning: unable to rmdir 'merge_request-2883-sub-submodule-b-2': Directory not empty
Fast-forwarded master to 7eb4be081612cf7476dd44bf905a4348e050b669.
Submodule path 'merge_request-2883-submodule-a': rebased into '7eb4be081612cf7476dd44bf905a4348e050b669'
git commit -m "update merge_request-2883-submodule-a"
git push
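The manual steps above depend on four GitLab projects. A rough offline equivalent can be sketched with local repositories in a temporary directory. Everything named here is an assumption made for illustration: the directory names (`b2`, `a`, `top`, `clone`), the demo identity, the wrapper function `g`, and the use of a single sub-submodule instead of two. It also assumes a git version that accepts `-c protocol.file.allow=always` for cloning submodules from local paths.

```shell
#!/bin/sh
# Offline reproduction sketch; names below are hypothetical stand-ins for the
# GitLab projects used in the report.
set -eu

work=$(mktemp -d)

# protocol.file.allow=always permits submodule clones from local paths;
# user.name/user.email let commits succeed without any global configuration.
g() {
    git -c protocol.file.allow=always \
        -c user.name=demo -c user.email=demo@example.com "$@"
}

# Sub-submodule with a single commit.
g init -q "$work/b2"
echo b2 > "$work/b2/README.md"
g -C "$work/b2" add README.md
g -C "$work/b2" commit -q -m "Initial commit"

# Submodule "a": its first commit adds b2.
g init -q "$work/a"
g -C "$work/a" submodule --quiet add "$work/b2" b2
g -C "$work/a" commit -q -m "add submodule b2"

# Top-level repository pinned to a's first commit.
g init -q "$work/top"
g -C "$work/top" submodule --quiet add "$work/a" a
g -C "$work/top" commit -q -m "add submodule-a"
old_top=$(g -C "$work/top" rev-parse HEAD)

# Remove b2 from "a", then move the top-level pointer to that commit.
g -C "$work/a" rm -q -f b2
g -C "$work/a" commit -q -m "remove b2"
g -C "$work/top/a" pull -q --ff-only
g -C "$work/top" commit -q -a -m "update submodule-a"
new_top=$(g -C "$work/top" rev-parse HEAD)

# Working tree left over from a previous job at the old commit.
g clone -q "$work/top" "$work/clone"
cd "$work/clone"
g checkout -q "$old_top"
g submodule --quiet update --init --recursive

# New job: check out the new commit and run the original four steps.
g checkout -q "$new_top"
g submodule --quiet sync --recursive
g submodule --quiet foreach --recursive git clean -ffxd
g submodule --quiet foreach --recursive git reset --hard
g submodule --quiet update --init --recursive
if [ -d a/b2 ]; then leftover_after_update=yes; else leftover_after_update=no; fi

# Second clean, run after the update.
g submodule --quiet foreach --recursive git clean -ffxd
if [ -d a/b2 ]; then leftover_after_second_clean=yes; else leftover_after_second_clean=no; fi

echo "leftover after update: $leftover_after_update"
echo "leftover after second clean: $leftover_after_second_clean"
```

If git behaves as in the logs above, the update step should emit the same "unable to rmdir" warning, the directory `a/b2` should still exist after the four original steps, and it should be gone after the second clean.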