LFS mirroring timeout issue
What's the problem?
Currently, GitLab passes --all as the revisions argument on every mirror update, even when only a few refs changed. This causes timeout errors in repositories with many LFS objects, because Gitaly processes every LFS pointer in the repository instead of just the changed ones. We can optimize this by calculating which refs have changed and passing only those to Gitaly.
Related to: [Pull mirroring] LFS objects fetch timeouts (#415229 - closed)
What does this MR do?
Smart revision detection:
- Compares local vs remote branch states to identify specific changes
- For updated branches: sends commit ranges (old_sha..new_sha) to process only new commits
- For new branches: sends full branch references (refs/remotes/upstream/branch)
- Falls back to --all for initial mirroring or when changes exceed threshold
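The selection rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this MR: the method name, the hash-based inputs, and the threshold value of 100 are all assumptions made for the example.

```ruby
# Hypothetical helper: names and the default threshold are illustrative only.
ALL_REVISIONS = ['--all'].freeze

# local_refs / remote_refs: Hash of branch name => commit SHA.
def changed_revisions(local_refs, remote_refs, threshold: 100)
  # Initial mirroring: there is no local state to compare against.
  return ALL_REVISIONS if local_refs.empty?

  revisions = remote_refs.filter_map do |branch, new_sha|
    old_sha = local_refs[branch]

    if old_sha.nil?
      # New branch: send the full branch reference.
      "refs/remotes/upstream/#{branch}"
    elsif old_sha != new_sha
      # Updated branch: send only the range of new commits.
      "#{old_sha}..#{new_sha}"
    end
    # Unchanged branches yield nil and are dropped by filter_map.
  end

  # Too many changes: fall back to scanning everything.
  # An empty array means nothing changed; handling that is left to the caller.
  revisions.size > threshold ? ALL_REVISIONS : revisions
end

local  = { 'main' => 'aaa111', 'dev' => 'bbb222' }
remote = { 'main' => 'aaa111', 'dev' => 'ccc333', 'feature' => 'ddd444' }
p changed_revisions(local, remote)
# => ["bbb222..ccc333", "refs/remotes/upstream/feature"]
```

The short SHAs are placeholders. Deleted branches are ignored here, since the description only covers new and updated ones.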
The Workflow
[Figure: Before vs After workflow diagram]
Alternatives considered
I'm not very happy with passing the changed_revisions to all downstream services because they don't use them directly - it's classic parameter drilling. As an alternative, I considered setting changed_revisions in git/repository since both UpdateMirrorService and BlobService can access it:
In lib/gitlab/git/repository.rb:
attr_accessor :changed_revisions
In services/projects/update_mirror_service.rb:
project.repository.raw_repository.changed_revisions = changed_revisions
In lib/gitlab/gitaly_client/blob_service.rb:
revisions = @repository.changed_revisions || [encode_binary("--all")]
- Pros: No parameter drilling needed
- Cons: gitlab/git/repository.rb doesn't seem like the right place to hold changed_revisions - it has nothing to do with Git functionality.
I also considered the parameter size impact. What if there are lots of branches with changes? However, mirroring runs at most once every five minutes on GitLab.com, so within that time window there shouldn't be huge changes, and changed_revisions is just an array of strings, so it should be manageable.
Overall, I think passing the param should be fine.
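To put a rough number on the parameter size concern, here is a back-of-the-envelope estimate. It assumes SHA-1 object IDs (40 hex characters) and an illustrative count of 1,000 updated branches; neither figure comes from measurements on GitLab.com.

```ruby
# Rough payload estimate; the branch count below is an illustrative assumption.
SHA1_HEX_LENGTH = 40 # hex characters in a SHA-1 object ID

def range_revision_bytes(sha_length = SHA1_HEX_LENGTH)
  # "old_sha..new_sha": two SHAs joined by ".."
  (sha_length * 2) + 2
end

changed_branches = 1_000
total_bytes = changed_branches * range_revision_bytes
puts "#{changed_branches} updated branches ~ #{total_bytes} bytes"
# prints "1000 updated branches ~ 82000 bytes"
```

Entries for new branches depend on the branch name length, so they are not covered by this estimate.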
Local test scenarios
With the feature flag enabled / disabled
- Initial mirroring - PASS (uses --all fallback)
- New branch creation - PASS (processes specific revisions)
- Existing branch with new changes - PASS (processes specific revisions)
- Manual mirroring - PASS
- Auto mirroring - PASS
How to test locally:
- Create a remote project (https://gitlab.com/emmaspark/mirroring-lfs)
- Create an empty local project without a README file
- Go to Settings → Repository → Mirroring
- Add the remote project repository URL (e.g., https://gitlab.com/emmaspark/mirroring-lfs.git)
- Click "Mirror repository"
- Check if the initial mirroring is successful
- Add a new branch with a new LFS file (file_name.txt) to the remote repo. Go to Settings → Repository → Mirroring. Click the run icon next to the bin icon. Check that your local repo has the branch and that you can see the LFS content.
- Edit two of the existing branches of the remote project. Go to Settings → Repository → Mirroring. Click the run icon next to the bin icon. Check that your local repo has the updated branches and that you can see the LFS contents.
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
