Geo: gitmodulesUrl: disallowed submodule url error causes repository sync failures
GitLab Geo repository synchronization fails with the error `Error syncing repository: 13:creating repository: cloning repository: exit status 128` when repositories contain invalid submodule URLs in their `.gitmodules` files.
More details from this comment - https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/8576#note_2565818604
For 1: This new `git fsck` behavior comes from a change in upstream Git, whereby this check was added. It is therefore not a Gitaly issue specifically. See gitaly#5641. This will probably impact other Geo customers. If my understanding is correct, the current workarounds are:
- Ignore the `gitmodulesUrl` `git fsck` check error as mentioned in #462567 (comment 2086468377).
- Fix the invalid URL with the `git-filter-repo` tool as mentioned in #462567 (comment 2534852005).
- Manually copy the project repositories as mentioned in https://gitlab.com/gitlab-com/request-for-help/-/issues/2151#note_2273834720.
Related Issue
This issue was encountered during a GitLab Dedicated migration: https://gitlab.com/gitlab-com/gl-infra/gitlab-dedicated/team/-/issues/8576#note_2568797060
Workaround
1. Back up projects
- Before proceeding, make sure the customer backs up the projects using the project export option - https://docs.gitlab.com/user/project/settings/import_export/
2. Remove blobs
- Follow the process to remove the blobs here - https://docs.gitlab.com/user/project/repository/repository_size/#remove-blobs
- For each affected project, identify the problematic blob IDs from the Gitaly logs and remove them using GitLab's blob removal interface
Note: Please also make it clear that developers who work on these projects must delete their local copies and clone the fixed repository after the steps above. Otherwise, they can reintroduce the offending blobs.
Important limitation: If any of these repositories are part of a fork network, the blob removal method may not work (blobs contained in object pools cannot be removed this way).
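As a rough illustration of the "identify the problematic blob IDs" step, the Python sketch below pulls blob IDs out of log text. It assumes the relevant lines contain `gitmodulesUrl` and name the object as `blob <hex id>`, which is how `git fsck` usually words this error. The exact Gitaly log layout is not shown in this issue, so the pattern and the function name `gitmodules_blob_ids` are hypothetical and may need adjusting.

```python
import re

# Assumption: the object is mentioned as "blob <id>", where <id> is a
# 40-character (SHA-1) or 64-character (SHA-256) lowercase hex string.
BLOB_ID = re.compile(r"blob ([0-9a-f]{40}(?:[0-9a-f]{24})?)\b")


def gitmodules_blob_ids(log_text):
    """Return unique blob IDs from lines that mention gitmodulesUrl.

    IDs are returned in the order they first appear in the log text.
    """
    ids = []
    for line in log_text.splitlines():
        # Skip lines unrelated to the submodule URL check.
        if "gitmodulesUrl" not in line:
            continue
        for blob_id in BLOB_ID.findall(line):
            if blob_id not in ids:
                ids.append(blob_id)
    return ids
```

The resulting list can then be used as input for the blob removal step, one affected project at a time.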
3. Fix invalid `.gitmodules` URLs if required
- Check the state of `.gitmodules` files in each affected repository
- If the `.gitmodules` still contains invalid URLs like `https://example.gitlab.com:foo/bar.git` instead of `https://example.gitlab.com/foo/bar.git`, the customer needs to:
  - Fix the URLs in the `.gitmodules` file
  - Push a commit with valid URLs
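The URL fix in step 3 can be sketched in Python as follows. This is a minimal example that only handles the pattern quoted above, where a colon after the host of an `http(s)` URL is followed by a path segment instead of a numeric port. The helper names `fix_url` and `fix_gitmodules` are hypothetical, and other kinds of invalid submodule URLs would need their own handling.

```python
import re

# A "url = <value>" line as it appears in a .gitmodules file.
URL_LINE = re.compile(r"^(\s*url\s*=\s*)(\S+)\s*$")

# scheme://host: followed by something that is not a numeric port.
# Assumption: only http(s) URLs without user info need fixing.
BAD_PORT = re.compile(r"^(https?://[^/:@]+):(?!\d+(?:/|$))")


def fix_url(url):
    """Replace the stray colon after the host with a slash."""
    return BAD_PORT.sub(r"\1/", url)


def fix_gitmodules(text):
    """Return .gitmodules content with matching url values corrected."""
    fixed_lines = []
    for line in text.splitlines():
        match = URL_LINE.match(line)
        if match:
            line = match.group(1) + fix_url(match.group(2))
        fixed_lines.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(fixed_lines) + ("\n" if text.endswith("\n") else "")


if __name__ == "__main__":
    # Rewrites .gitmodules in the current working directory.
    with open(".gitmodules", encoding="utf-8") as handle:
        original = handle.read()
    updated = fix_gitmodules(original)
    if updated != original:
        with open(".gitmodules", "w", encoding="utf-8") as handle:
            handle.write(updated)
```

URLs with a valid numeric port, such as `https://example.gitlab.com:8443/foo/bar.git`, are left unchanged. The corrected file still needs to be reviewed, committed, and pushed by the customer.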