git-repack: Implement option to add a .promisor file after repacking with filter
In the partial clone demo, after repacking with a filter:
git -C <some remote> -c repack.writebitmaps=false repack -a -d \
--filter=blob:limit=900m --filter-to=<some path>
we need to manually add a .promisor file:
# Create a .promisor file next to each packfile
for pack in objects/pack/*.pack; do
    [ -e "$pack" ] && touch "${pack%.pack}.promisor"
done
Ideally, there would be a command-line option that tells git to create the .promisor file after repacking.
Proposal:
Add an optional argument --with-promisor-file to create a .promisor file.
Concerns:
A major concern is that we need to ensure data integrity before adding the .promisor file.
Quoting from the discussion with @chriscool:
It is important to make sure that no objects that are filtered out are lost. So before marking the other packfile as a promisor one (by adding a *.promisor file), and before removing the packfile that contains the filtered out objects, it makes sense to first check that the packfile that contains the filtered out objects can be removed without any object being lost.
In other words the steps should often be something like:
1. Repack with a filter.
2. Check that the filtered out objects are all available on the promisor remote (or send them there to make sure they are available).
3. Mark the packfile with objects that are not filtered out as a promisor one.
4. Remove the packfile that contains the filtered out objects.
Adding an option so that the repacking can do both steps 1. and 3. (instead of only 1.) might be seen as somehow bypassing step 2. (or encouraging users to bypass it), except that step 2. is important to make sure that no objects are lost (which would corrupt the repo)
...
Previously he (Junio Hamano) suggested to add an option that performs all the 4 steps I talk about above. But this is more complex than an option that just performs step 3. on top of step 1.
Current actions:
- Add steps 2 to 4 to the gitaly logic so that we have a data integrity check (steps 1 and 3 are already in the PoC code).
- Work on the git side to add step 3. If it gets rejected, we can try to implement steps 1-4 in git. This should not block our PoC, though, since we already have that in gitaly.
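
For illustration, steps 2 to 4 from the quoted discussion could be sketched in shell roughly as follows. This is only a sketch under assumptions, not the gitaly or git implementation: both function names are hypothetical, the promisor remote is assumed to be reachable as a local repository path, and the repository is assumed to use SHA-1 object names (the default for git show-index).

```shell
# Step 2 (hypothetical helper): succeed only if every object listed in the
# pack index "$1" exists in the repository at "$2". Assumes the promisor
# remote is available as a local path.
filtered_objects_available() {
    # git show-index prints one "<offset> <oid> ..." line per object.
    oids=$(git show-index <"$1" | awk '{print $2}')
    # Treat an unreadable or empty index as a failure, not as "nothing to check".
    [ -n "$oids" ] || return 1
    # cat-file --batch-check prints "<oid> missing" for objects it cannot find.
    if printf '%s\n' "$oids" | git -C "$2" cat-file --batch-check | grep -q ' missing$'; then
        return 1
    fi
}

# Steps 3 and 4 (hypothetical helper), performed only after step 2 passes.
#   $1: packfile holding the objects that were not filtered out
#   $2: index (.idx) of the packfile holding the filtered out objects
#   $3: local path of the promisor repository
mark_promisor_and_drop_filtered() {
    filtered_objects_available "$2" "$3" || return 1
    # Step 3: mark the kept packfile as a promisor pack.
    touch "${1%.pack}.promisor" || return 1
    # Step 4: remove the filtered out packfile and its companion files.
    rm -f "${2%.idx}.pack" "${2%.idx}.rev" "$2"
}
```

If any filtered out object is missing from the promisor repository, the second function returns non-zero before touching anything, so neither the .promisor file is created nor the filtered out packfile removed.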