git: Speed up creation of packfiles via pack window memory limit
Depending on the repository's shape, generation of packfiles may prove to be hugely expensive. This doesn't even require the repository to have a huge number of objects; it already suffices if it has a few objects which are comparatively big (e.g. 10MB and more).
The root cause for this is how Git generates deltas for objects which are part of the packfile. To compute such deltas, it first lists all objects which are going to be part of the pack, sorts them by various metrics like size, type and path of the object, and then uses a sliding window to iterate over those objects, trying to deltify the current object against all objects in that window. By default, this window has a width of 10 objects, which is quick to compute for small objects. But with growing object size, loading 10 objects into that window and comparing them with each other proves to be increasingly expensive. E.g. assuming you have 10 objects of 100MB each, every object is compared with all the others in order to find a delta base. That amounts to 45 comparisons of 100MB objects. One can easily see how this computation quickly adds up.
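The comparison count follows from the window width: with a window of n objects, each pair is compared once. A quick sketch for the default width of 10:

```shell
# With a window of n objects, pairwise comparison of all objects in the
# window amounts to n*(n-1)/2 comparisons in the worst case.
n=10
echo $(( n * (n - 1) / 2 ))
```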
Git provides multiple mechanisms with which one may speed up creation of packfiles, most importantly parameters which tune the pack window. The two most relevant ones in the context of biggish binary-like files are core.bigFileThreshold and pack.windowMemory. The former instructs Git to treat every file that is bigger than the configured limit as a binary file which is not to be deltified, while the latter configures an upper memory limit for the pack window.
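As a sketch, both knobs can be set via git-config; the values below are illustrative only, not the ones evaluated further down:

```shell
# Create a throwaway repository to demonstrate the two configuration knobs.
repo=$(mktemp -d)
git init --quiet "$repo"

# Files larger than 512kB are treated as binary and are never deltified.
git -C "$repo" config core.bigFileThreshold 512k

# The delta search window may hold at most 100MB worth of objects.
git -C "$repo" config pack.windowMemory 100m

git -C "$repo" config --get core.bigFileThreshold
git -C "$repo" config --get pack.windowMemory
```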
A benchmark which uses git-pack-objects to pack objects of random data shows though that core.bigFileThreshold doesn't have any impact at all on packfile generation. This is unexpected, but considering that it does have an impact on git-clone, it seems to only be honored under some of the conditions where packfiles are generated. Tuning pack.windowMemory does show a great impact though: the smaller it is, the faster packfile generation becomes when handling largeish objects.
|----------|---------|---------|---------|----------|----------|
|          | default | BFT=50M | PWM=50M | PWM=100M | PWM=250M |
|----------|---------|---------|---------|----------|----------|
| 10x10MB  |   5.63s |   5.67s |   3.52s |    4.90s |    5.67s |
| 25x10MB  |  16.47s |  21.11s |   9.07s |   19.95s |   20.44s |
|----------|---------|---------|---------|----------|----------|
| 10x50MB  |  25.49s |  25.70s |  11.12s |   11.15s |   19.71s |
| 25x50MB  |  95.69s |  95.80s |  27.79s |   28.68s |   59.59s |
|----------|---------|---------|---------|----------|----------|
| 10x100MB |  26.72s |  25.90s |  11.14s |   10.85s |   20.87s |
| 25x100MB |  96.14s |  95.45s |  27.95s |   27.84s |   56.46s |
|----------|---------|---------|---------|----------|----------|
| 10x250MB |  25.88s |  25.52s |  11.31s |   10.87s |   19.75s |
| 25x250MB |  96.50s |  95.92s |  29.10s |   28.15s |   54.81s |
|----------|---------|---------|---------|----------|----------|
BFT: core.bigFileThreshold
PWM: pack.windowMemory
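The benchmark can be reproduced along these lines; this is a sketch, not the exact script used for the table above, and blob count and size are parameters (the table used 10 to 25 blobs of 10MB to 250MB):

```shell
# Sketch: pack random blobs while limiting the pack window memory.
repo=$(mktemp -d)
git init --quiet "$repo"

count=10
size_mb=1   # the table above used blobs of 10MB and more

# Create blobs of random (and thus incompressible) data and commit them.
for i in $(seq 1 "$count"); do
    dd if=/dev/urandom of="$repo/blob-$i" bs=1M count="$size_mb" status=none
done
git -C "$repo" add .
git -C "$repo" -c user.name=bench -c user.email=bench@example.com \
    commit --quiet -m "random blobs"

# Time repacking all objects with pack.windowMemory set to 50MB.
time git -C "$repo" -c pack.windowMemory=50m repack -adq
```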
In the above table, one can nicely see that setting PWM creates kind of an upper bound for how long it takes to compute the packfile, which demonstrates that packfile generation is dominated by the deltification process.
As can be seen, a limit of 50MB resulted in a speedup of up to 3x while a limit of 250MB resulted in a speedup of up to 2x. Naturally, the speedup which can be achieved strongly depends on the shape of the objects which are to be compressed. First, it will only ever apply if the combined size of the 10 objects in the window actually exceeds that limit. And second, the larger the gap between the limit and the sum of object sizes, the bigger the effect is going to be.
One tradeoff is that with PWM being set, git may generate less efficient packfiles which are bigger due to missed deltification opportunities. Toying around with a "well-behaved" repository (linux.git in this case) showed that decreasing the PWM doesn't really show much of a difference in the generated packfile's size. Down to a PWM of 1MB, the size only increased by around 1%. Only when setting it as low as 1KB does the packfile grow significantly in size.
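The size impact can be measured with git-pack-objects directly. A sketch using a synthetic repository with delta-friendly history (the text above used linux.git; any repository with many similar revisions of a file shows the trend):

```shell
# Sketch: compare the resulting pack size for different window memory limits.
repo=$(mktemp -d)
git init --quiet "$repo"

# Create delta-friendly history: many similar versions of the same file.
for i in $(seq 1 20); do
    seq 1 "$((1000 + i))" > "$repo/data.txt"
    git -C "$repo" add data.txt
    git -C "$repo" -c user.name=bench -c user.email=bench@example.com \
        commit --quiet -m "revision $i"
done

# Pack all objects to stdout and count the bytes for each limit.
for pwm in 1k 1m; do
    bytes=$(git -C "$repo" -c pack.windowMemory="$pwm" \
        pack-objects --all --stdout </dev/null | wc -c)
    echo "pack.windowMemory=$pwm -> $bytes bytes"
done
```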
This tradeoff is going to hit non-well-behaved repositories (those with a lot of biggish files) a lot harder. But considering that cloning such repositories really stresses our infrastructure, this tradeoff is considered to be worth it. In the end, such binary blobs should ideally not be part of any repository anyway and should instead be moved into alternatives like e.g. Git LFS.
So based on our findings, let's limit the packfile window to 100MB. The difference in speedup between a 50MB and a 100MB limit wasn't significant enough to warrant the more aggressive setting, which is why we instead play it a bit safer.