Geo: Use `git clone` for first sync instead of `git fetch`

What does this MR do and why?

Geo used to work with the following logic:

  • If a repository is present on a secondary site
    • Run `git fetch` with Geo-specific credentials and flags
  • If a repository is not present on a secondary site
    • Create an empty repository
    • Run `git fetch` with Geo-specific credentials and flags

We want to skip the "Create an empty repository" step. When no repository is found on the secondary site, the initial bootstrap should use `git clone` instead of `git fetch`.
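The old and new flows described above can be sketched as a small planning function. This is only an illustration under stated assumptions: Geo performs these operations through Gitaly calls rather than by invoking the git CLI directly, the function names `plan_legacy_sync` and `plan_sync` are hypothetical, and the `--mirror`, `--prune`, and refspec arguments stand in for the unspecified Geo credentials and flags.

```python
from typing import List

# Assumption: a catch-all refspec stands in for Geo's actual fetch flags.
ALL_REFS = "+refs/*:refs/*"


def _fetch(repo_path: str, remote_url: str) -> List[str]:
    """Build an incremental fetch command for an existing repository."""
    return ["git", "-C", repo_path, "fetch", "--prune", remote_url, ALL_REFS]


def plan_legacy_sync(repo_path: str, remote_url: str, repo_exists: bool) -> List[List[str]]:
    """Previous behavior: create an empty repository, then fetch into it."""
    if repo_exists:
        return [_fetch(repo_path, remote_url)]
    return [
        ["git", "init", "--bare", repo_path],
        _fetch(repo_path, remote_url),
    ]


def plan_sync(repo_path: str, remote_url: str, repo_exists: bool) -> List[List[str]]:
    """Proposed behavior: bootstrap a missing repository with one clone."""
    if repo_exists:
        return [_fetch(repo_path, remote_url)]
    # No empty-repository step: the clone creates and populates the repository.
    return [["git", "clone", "--mirror", remote_url, repo_path]]
```

Only the missing-repository branch differs between the two functions; the path for a repository that already exists is unchanged.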

Although this is not documented behavior, git behaves differently when cloning than when creating an empty repository and then fetching from a new remote. With clone, all objects land packed. With fetch, they all land unpacked, which requires a gc run immediately afterwards. This is extremely inefficient and has a huge impact on big repositories. In the best case it doubles the time the sync would normally take; in the worst case it adds unnecessary load to an already saturated server for several minutes.
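One way to observe the packed versus unpacked difference locally is to compare the output of `git count-objects -v` after each approach. The sketch below is a minimal helper for that comparison, not part of this MR; the function names are hypothetical, and it assumes the `git` CLI is available on the machine where it runs.

```python
import subprocess
from typing import Dict


def parse_count_objects(output: str) -> Dict[str, int]:
    """Parse the `key: value` lines printed by `git count-objects -v`."""
    stats: Dict[str, int] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip any line that is not a simple numeric statistic.
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def count_objects(repo_path: str) -> Dict[str, int]:
    """Run `git count-objects -v` inside repo_path and parse the result."""
    result = subprocess.run(
        ["git", "-C", repo_path, "count-objects", "-v"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_count_objects(result.stdout)


def is_fully_packed(stats: Dict[str, int]) -> bool:
    """True when there are no loose objects and at least one pack exists."""
    # In this output, `count` is the number of loose objects.
    return stats.get("count", 0) == 0 and stats.get("packs", 0) >= 1
```

Calling `is_fully_packed(count_objects(path))` on a freshly synced repository, before any gc has run, indicates whether the objects arrived packed.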

This change requires the related Gitaly changes to be merged so that the Gitaly calls work.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #5447 (closed)

Edited by Gabriel Mazetto