Support for monolithic repositories (minimal maturity)
**Maturity:** **○ Planned** | [◔ Minimal (In progress)](https://gitlab.com/groups/gitlab-org/-/epics/915) | [◐ Viable](https://gitlab.com/groups/gitlab-org/-/epics/1483) | [👍 Complete](https://gitlab.com/groups/gitlab-org/-/epics/1484) | [❤️ Loveable](https://gitlab.com/groups/gitlab-org/-/epics/1485)
> [Minimal](https://about.gitlab.com/direction/maturity/): A minimal foundation so people can see where we're going and to validate customer need.
---
Working on a project in a huge Git repository (e.g. 100 GB) is difficult: the whole repository must be cloned, which is slow to transfer, and operations that read the index are slow because of the number of tracked files.
From the [Git Documentation](https://github.com/git/git/blob/master/Documentation/technical/partial-clone.txt):
> The "Partial Clone" feature is a performance optimization for Git that
allows Git to function without having a complete copy of the repository.
The goal of this work is to allow Git better handle extremely large
repositories.
This is a native Git alternative to [VFS for Git](https://gitlab.com/groups/gitlab-org/-/epics/93), a project that Microsoft built to manage the Windows codebase in Git.
## Vision
Git and GitLab should be able to natively support extremely large repositories, to allow projects using other SCM tools to migrate to GitLab.
## Further details
Functionally, **partial clone** needs to be used in combination with **sparse checkout**; see the investigation in https://gitlab.com/gitlab-org/gitaly/issues/1581.
The following filter options are supported. They generally need to be combined with `--no-checkout` to prevent everything from being lazily downloaded during the initial checkout.
```
git clone --filter=blob:limit=1m <url>
git clone --no-checkout --filter=blob:none <url>
git clone --no-checkout --filter=tree:none <url>
git clone --no-checkout --filter=sparse:path=<path> <url>
```
Source: https://github.com/git/git/blob/v2.21.0/Documentation/rev-list-options.txt#L711-L739
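As a rough sketch of how partial clone and sparse checkout fit together, the script below builds a throwaway repository, clones it with `--filter=blob:none`, and restricts the working tree to a single directory. The directory names (`server`, `clone`, `docs`, `src`) are hypothetical, and the `uploadpack.*` settings are what a plain Git remote needs for this local experiment. It assumes a Git version with partial clone support and is not a statement of how GitLab or Gitaly will expose the feature.

```shell
#!/bin/sh
set -eu

# Throwaway workspace; every path and name below is hypothetical.
work=$(mktemp -d)

# 1. Build a small "server" repository with two top-level directories.
git init --quiet "$work/server"
mkdir "$work/server/docs" "$work/server/src"
echo "readme" > "$work/server/docs/readme.md"
echo "code" > "$work/server/src/main.c"
git -C "$work/server" add .
git -C "$work/server" -c user.name=Example -c user.email=example@example.com \
    commit --quiet -m "Initial commit"

# Let the remote honour --filter and serve blobs requested later by object ID.
git -C "$work/server" config uploadpack.allowFilter true
git -C "$work/server" config uploadpack.allowAnySHA1InWant true

# 2. Partial clone: fetch commits and trees, but no blobs and no working tree.
#    A file:// URL is used because plain local paths ignore --filter.
git clone --quiet --no-checkout --filter=blob:none \
    "file://$work/server" "$work/clone"

# 3. Sparse checkout: limit the working tree to docs/ before populating it.
git -C "$work/clone" config core.sparseCheckout true
mkdir -p "$work/clone/.git/info"
echo "/docs/" > "$work/clone/.git/info/sparse-checkout"

# 4. Populate the working tree; blobs are fetched on demand for docs/ only.
git -C "$work/clone" read-tree -mu HEAD

ls "$work/clone"
```

If this behaves as intended, the working tree of the clone should contain only `docs/`, and the blob for `src/main.c` should never be transferred.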
## Proposal
### GitLab
- [x] Add support for Git 2.20 https://gitlab.com/gitlab-org/gitlab-ce/issues/54255
- [x] Add support for Git 2.22 https://gitlab.com/gitlab-org/gitaly/issues/1715 ~security
- [ ] Feature flag to allow upload pack filters https://gitlab.com/gitlab-org/gitaly/issues/1553
Other GitLab features will also need improvement:
- CI for monorepos https://gitlab.com/groups/gitlab-org/-/epics/812
- Code review for monorepos https://gitlab.com/groups/gitlab-org/-/epics/1318
- and more ...
### Git
- [x] Git 2.22.0: fix partial clone security vulnerability https://gitlab.com/gitlab-org/git/issues/3 ~"git contribution"
- [ ] Git 2.22.0 rejects clone with `sparse:oid` filter options https://gitlab.com/gitlab-org/git/issues/4
- [ ] Perform sparse checkout when performing a partial clone by path https://gitlab.com/gitlab-org/git/issues/5
### Links / references
- https://public-inbox.org/git/20190326220906.111879-1-jonathantanmy@google.com/
- https://askubuntu.com/questions/460885/how-to-clone-git-repository-only-some-directories
- https://stackoverflow.com/questions/20910442/pre-load-git-repository/48012038
- https://briancoyner.github.io/2013/06/05/git-sparse-checkout.html