Streamlined filter-spec partial clone and checkout workflow
Problem to solve
Working on a project in a very large Git repository (e.g. 100GB) is very difficult, both because the whole repository needs to be cloned and because of the huge number (possibly millions) of files in the working copy.
Partial clone and sparse checkout are native Git solutions to these problems, but they are difficult to use.
Further details
Partial clone and sparse checkout use a file in the same format as .gitignore, but instead of listing the files to ignore, the file lists the files to include. For example:
# include everything in the docs, project-a, and project-b directories
# everything else will be excluded
docs/
project-a/
project-b/
However, currently this file needs to live on the server for Git to read when preparing the data to send, and then needs to be used again to configure sparse checkout in .git/info/sparse-checkout.
Proposal
There should be a single command to:
- perform the partial clone,
- configure sparse checkout, and
- perform the sparse checkout.
This would currently be done manually with:
# partial clone
git clone --no-checkout --filter=sparse:path=<path> <repo> <dir>
# the remaining commands run inside the new clone
cd <dir>
# configure sparse checkout
git config --local core.sparsecheckout true
cat <path> >> .git/info/sparse-checkout
# perform sparse checkout
git checkout master
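To make the proposal concrete, the manual steps above could be wrapped in a small shell function. This is only a sketch of the intended workflow, not an existing Git command: the names `sparse_clone`, `run`, and the `DRY_RUN` variable are hypothetical, the branch defaults to `master` as in the commands above, and whether the `sparse:path` filter is accepted depends on the Git version and the server configuration.

```shell
#!/bin/sh
# Hypothetical wrapper around the three manual steps; not part of Git.
# Usage: sparse_clone <filter-spec-file> <repo> <dir> [branch]
# Set DRY_RUN=1 to print the commands instead of running them.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

sparse_clone() {
    spec=$1
    repo=$2
    dir=$3
    branch=${4:-master}   # assumption: default branch name from the steps above

    # 1. partial clone, filtered by the spec file
    run git clone --no-checkout "--filter=sparse:path=$spec" "$repo" "$dir" || return 1

    # 2. configure sparse checkout inside the new clone
    run git -C "$dir" config --local core.sparseCheckout true || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ cat $spec >> $dir/.git/info/sparse-checkout"
    else
        mkdir -p "$dir/.git/info"
        cat "$spec" >> "$dir/.git/info/sparse-checkout" || return 1
    fi

    # 3. perform the sparse checkout
    run git -C "$dir" checkout "$branch"
}
```

With `DRY_RUN=1`, calling `sparse_clone <path> <repo> <dir>` only prints the four commands it would run, which is a cheap way to review the plan before cloning a very large repository.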