LFS mirroring timeout issue

What's the problem?

Currently, GitLab passes --all as the revisions argument on every mirror update, even when only a few commits have changed. As a result, Gitaly scans the entire repository for LFS pointers instead of only the changed revisions, which causes timeout errors in repositories with many LFS objects. We can avoid this by calculating which refs have changed and passing only those to Gitaly.

Related to: [Pull mirroring] LFS objects fetch timeouts (#415229 - closed)

What does this MR do?

Smart revision detection:

  • Compares local vs remote branch states to identify specific changes
  • For updated branches: sends commit ranges (old_sha..new_sha) to process only new commits
  • For new branches: sends full branch references (refs/remotes/upstream/branch)
  • Falls back to --all for initial mirroring or when the number of changes exceeds a threshold
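
The detection rules above can be sketched in plain Ruby. This is only an illustration, not the code in this MR: the method name changed_revisions, the hash-based inputs (branch name => SHA), and the threshold value of 100 are all assumptions made for the example. An empty result (nothing changed) is returned as-is, and how the caller treats it is left open here.

```ruby
# Hypothetical helper illustrating the detection rules; not the MR's actual code.
MAX_CHANGED_REVISIONS = 100 # assumed value, the real threshold is not stated here

# local_refs / remote_refs: Hash of branch name => commit SHA
def changed_revisions(local_refs, remote_refs, max: MAX_CHANGED_REVISIONS)
  # Initial mirroring: there is no local state to compare against.
  return ["--all"] if local_refs.empty?

  revisions = remote_refs.filter_map do |branch, new_sha|
    old_sha = local_refs[branch]
    if old_sha.nil?
      "refs/remotes/upstream/#{branch}" # new branch: full branch reference
    elsif old_sha != new_sha
      "#{old_sha}..#{new_sha}"          # updated branch: only the new commits
    end
    # unchanged branches yield nil and are dropped by filter_map
  end

  # Too many changes: fall back to scanning everything.
  revisions.size > max ? ["--all"] : revisions
end

local  = { "main" => "aaa111", "dev" => "bbb222" }
remote = { "main" => "aaa111", "dev" => "ccc333", "feature" => "ddd444" }
p changed_revisions(local, remote)
# => ["bbb222..ccc333", "refs/remotes/upstream/feature"]
```

Here main is unchanged and is skipped, dev becomes a commit range, and feature is sent as a full reference because it does not exist locally yet.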

The Workflow

Before vs After
[Screenshot: Screenshot_2025-08-29_at_12.36.16_pm]

Alternatives considered

I'm not very happy with passing the changed_revisions to all downstream services because they don't use them directly - it's classic parameter drilling. As an alternative, I considered setting changed_revisions in git/repository since both UpdateMirrorService and BlobService can access it:

In lib/gitlab/git/repository.rb:

attr_accessor :changed_revisions

In services/projects/update_mirror_service.rb:

project.repository.raw_repository.changed_revisions = changed_revisions

In lib/gitlab/gitaly_client/blob_service.rb:

revisions = @repository.changed_revisions || [encode_binary("--all")]

  • Pros: No parameter drilling needed
  • Cons: gitlab/git/repository.rb doesn't seem like the right place to hold changed_revisions - it has nothing to do with Git functionality.

I also considered the impact on parameter size: what if many branches have changes? Mirroring runs at most once every five minutes on GitLab.com, so the number of changes within that window should not be huge. changed_revisions is also just an array of strings, so its size should be manageable.

Overall, I think passing the param should be fine.
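
For comparison, the parameter-passing approach looks roughly like the sketch below. All class and method names (MirrorUpdater, LfsImporter, BlobClient, lfs_pointers) are hypothetical stand-ins rather than the real GitLab classes. The point is only that the intermediate layers forward changed_revisions without using it, and the last layer applies the same --all fallback as the alternative above.

```ruby
# Hypothetical stand-ins for the service chain; not the real GitLab classes.
class BlobClient
  # The only layer that actually consumes the revisions.
  def lfs_pointers(revisions: nil)
    # Same fallback as the alternative: scan everything when nothing was passed.
    revisions || ["--all"]
  end
end

class LfsImporter
  def initialize(client)
    @client = client
  end

  # Forwards changed_revisions without using it (the "drilling" part).
  def execute(changed_revisions: nil)
    @client.lfs_pointers(revisions: changed_revisions)
  end
end

class MirrorUpdater
  def initialize(importer)
    @importer = importer
  end

  # Entry point: computes or receives the revisions and passes them down.
  def execute(changed_revisions: nil)
    @importer.execute(changed_revisions: changed_revisions)
  end
end

updater = MirrorUpdater.new(LfsImporter.new(BlobClient.new))
p updater.execute(changed_revisions: ["bbb222..ccc333"]) # => ["bbb222..ccc333"]
p updater.execute                                        # => ["--all"]
```

The cost is one extra keyword argument per layer. The benefit is that no shared mutable state is placed on the repository object.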

Local test scenarios

With the feature flag (FF) enabled / disabled

  1. Initial mirroring - PASS (uses --all fallback)
  2. New branch creation - PASS (processes specific revisions)
  3. Existing branch with new changes - PASS (processes specific revisions)
  4. Manual mirroring - PASS
  5. Auto mirroring - PASS

How to test locally:

  1. Create a remote project (https://gitlab.com/emmaspark/mirroring-lfs)
  2. Create an empty local project without a README file
  3. Go to Settings → Repository → Mirroring
  4. Add the remote project repository URL (e.g., https://gitlab.com/emmaspark/mirroring-lfs.git)
  5. Click "Mirror repository"
  6. Check if the initial mirroring is successful
  7. Add a new branch with a new LFS file (file_name.txt) to the remote repo. Go to Settings → Repository → Mirroring and click the run icon next to the bin icon. Check that your local repo has the branch and that you can see the LFS content.
  8. Edit two of the existing branches of the remote project. Go to Settings → Repository → Mirroring and click the run icon next to the bin icon. Check that your local repo has the updated branches and that you can see the LFS contents.

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Emma Park
