Skip to content
  • Linus Torvalds's avatar
    optimize diffcore-delta by sorting hash entries. · eb4d0e3f
    Linus Torvalds authored and Junio C Hamano's avatar Junio C Hamano committed
    Here's a test-patch. I don't guarantee anything, except that when I did
    the timings I also did a "wc" on the result, and they matched..
    
    Before:
    	[torvalds@woody linux]$ time git diff -l0 --stat -C v2.6.22.. | wc
    	   7104   28574  438020
    
    	real    0m10.526s
    	user    0m10.401s
    	sys     0m0.136s
    
    After:
    	[torvalds@woody linux]$ time ~/git/git diff -l0 --stat -C v2.6.22.. | wc
    	   7104   28574  438020
    
    	real    0m8.876s
    	user    0m8.761s
    	sys     0m0.128s
    
    but the diff is fairly simple, so if somebody will go over it and say
    whether it's likely to be *correct* too, that 15% may well be worth it.
    
    [ Side note, without rename detection, that diff takes just under three
      seconds for me, so in that sense the improvement to the rename detection
      itself is larger than the overall 15% - it brings the cost of just
      rename detection from 7.5s to 5.9s, which would be on the order of just
      over a 20% performance improvement. ]
    
    Hmm. The patch depends on half-way subtle issues like the fact that the
    hashtables are guaranteed to not be full => we're guaranteed to have zero
    counts at the end => we don't need to do any steenking iterator count in
    the loop. A few comments might in order.
    
    		Linus
    eb4d0e3f