1. 05 Dec, 2009 1 commit
      Fix diff -B/--dirstat miscounting of newly added contents · 77cd6ab6
      Linus Torvalds authored
      What used to happen is that diffcore_count_changes() simply ignored any
      hashes in the destination that didn't match hashes in the source. EXCEPT
      if the source hash didn't exist at all, in which case it would count _one_
      destination hash that happened to have the "next" hash value.  As a
      consequence, newly added material was often undercounted, making output
      from --dirstat and "complete rewrite" detection used by -B unreliable.
      This changes it so that:
       - whenever it bypasses a destination hash (because it doesn't match a
         source), it counts the bytes associated with that as "literal added"
       - at the end (once we have used up all the source hashes), we do the same
         thing with the remaining destination hashes.
       - when hashes do match, and we use the difference in counts as a value,
         we also use up that destination hash entry (the 'd++').
      Signed-off-by: Linus Torvalds <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
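The three counting rules in the 77cd6ab6 message can be sketched as a merge-style walk over two hash arrays. This is a minimal illustration, not the actual diffcore_count_changes() code: the struct hash_count type and the count_changes() name are hypothetical, and it assumes both arrays are sorted by hash value and end with a zero-count entry.

```c
#include <stddef.h>

/* Hypothetical entry: one (hash, byte count) pair. cnt == 0 marks the end. */
struct hash_count {
	unsigned int hashval;
	unsigned long cnt;
};

/*
 * Walk two arrays sorted by hashval, each terminated by a cnt == 0 entry.
 * Sets *copied to the bytes common to both sides and *added to the bytes
 * that appear only in the destination ("literal added").
 */
void count_changes(const struct hash_count *s, const struct hash_count *d,
		   unsigned long *copied, unsigned long *added)
{
	unsigned long sc = 0, la = 0;

	for (; s->cnt; s++) {
		unsigned long dst_cnt = 0;

		/* Rule 1: a bypassed destination hash counts as added. */
		while (d->cnt && d->hashval < s->hashval) {
			la += d->cnt;
			d++;
		}
		/* Rule 3: on a match, use up the destination entry (the 'd++'). */
		if (d->cnt && d->hashval == s->hashval) {
			dst_cnt = d->cnt;
			d++;
		}
		if (s->cnt < dst_cnt) {
			la += dst_cnt - s->cnt;
			sc += s->cnt;
		} else {
			sc += dst_cnt;
		}
	}
	/* Rule 2: once the source is used up, the rest of dst is added. */
	for (; d->cnt; d++)
		la += d->cnt;

	*copied = sc;
	*added = la;
}
```

With this accounting every destination byte lands in exactly one of the two totals, so copied plus added equals the destination size, which is the property the old code lost when it skipped unmatched destination hashes.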
  2. 04 Oct, 2007 1 commit
      optimize diffcore-delta by sorting hash entries. · eb4d0e3f
      Linus Torvalds authored
      Here's a test-patch. I don't guarantee anything, except that when I did
      the timings I also did a "wc" on the result, and they matched..
      	[[email protected] linux]$ time git diff -l0 --stat -C v2.6.22.. | wc
      	   7104   28574  438020
      	real    0m10.526s
      	user    0m10.401s
      	sys     0m0.136s
      	[[email protected] linux]$ time ~/git/git diff -l0 --stat -C v2.6.22.. | wc
      	   7104   28574  438020
      	real    0m8.876s
      	user    0m8.761s
      	sys     0m0.128s
      but the diff is fairly simple, so if somebody will go over it and say
      whether it's likely to be *correct* too, that 15% may well be worth it.
      [ Side note, without rename detection, that diff takes just under three
        seconds for me, so in that sense the improvement to the rename detection
        itself is larger than the overall 15% - it brings the cost of just
        rename detection from 7.5s to 5.9s, which would be on the order of just
        over a 20% performance improvement. ]
      Hmm. The patch depends on half-way subtle issues like the fact that the
      hashtables are guaranteed to not be full => we're guaranteed to have zero
      counts at the end => we don't need to do any steenking iterator count in
      the loop. A few comments might be in order.
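The subtle point in the eb4d0e3f message, that a table which is never full leaves zero counts at the end after sorting, can be sketched as follows. This is an illustration under assumptions rather than the patch itself: struct span_entry, span_entry_cmp(), sort_span_entries(), and total_bytes() are hypothetical names, and an empty slot is assumed to be one whose count is zero.

```c
#include <stdlib.h>

/* Hypothetical hash table slot; cnt == 0 means the slot is unused. */
struct span_entry {
	unsigned int hashval;
	unsigned int cnt;
};

/* Order used entries by hash value and push unused (cnt == 0) slots last. */
int span_entry_cmp(const void *a_, const void *b_)
{
	const struct span_entry *a = a_;
	const struct span_entry *b = b_;

	if (!a->cnt)
		return !b->cnt ? 0 : 1;
	if (!b->cnt)
		return -1;
	return a->hashval < b->hashval ? -1 :
	       a->hashval > b->hashval ? 1 : 0;
}

void sort_span_entries(struct span_entry *table, size_t nr)
{
	qsort(table, nr, sizeof(*table), span_entry_cmp);
}

/*
 * Because at least one slot is unused, the sorted table always ends in a
 * cnt == 0 entry. A loop can stop on that sentinel and needs no separate
 * index or bounds check.
 */
unsigned long total_bytes(const struct span_entry *e)
{
	unsigned long total = 0;

	for (; e->cnt; e++)
		total += e->cnt;
	return total;
}
```

The sentinel only works if the caller really does keep one slot free; a completely full table would let total_bytes() run past the end of the array.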
  3. 06 Jul, 2007 1 commit
      Introduce diff_filespec_is_binary() · 29a3eefd
      Junio C Hamano authored
      This replaces an explicit initialization of filespec->is_binary
      field used for rename/break followed by direct access to that
      field with a wrapper function that lazily initializes and
      accesses the field.  We would add more attribute accesses for
      the use of diff routines, and it would be better to make this
      abstraction earlier.
      Signed-off-by: Junio C Hamano <[email protected]>
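The lazy-initialization pattern described in the 29a3eefd message might look roughly like the sketch below. It is not the real diff_filespec_is_binary(): struct file_spec, file_spec_is_binary(), the -1 "not checked yet" marker, and the NUL-byte check over the first 8000 bytes are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define FIRST_FEW_BYTES 8000 /* assumed scan limit for this sketch */

/* Hypothetical stand-in for a file spec with a cached "binary" flag. */
struct file_spec {
	const char *data;
	size_t size;
	int is_binary; /* -1: not checked yet, 0: text, 1: binary */
};

/*
 * Callers go through this wrapper instead of reading the field directly.
 * The first call computes the flag and caches it; later calls return the
 * cached value without looking at the data again.
 */
int file_spec_is_binary(struct file_spec *spec)
{
	if (spec->is_binary < 0) {
		size_t n = spec->size < FIRST_FEW_BYTES ?
			   spec->size : FIRST_FEW_BYTES;

		/* Assumed heuristic: a NUL byte near the start means binary. */
		spec->is_binary = n && memchr(spec->data, 0, n) != NULL;
	}
	return spec->is_binary;
}
```

Hiding the field behind a function means further per-file attributes can later be computed in the same place without touching every caller, which is the motivation the commit message gives for introducing the abstraction early.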
  4. 01 Jul, 2007 3 commits
  5. 15 Mar, 2006 2 commits
  6. 13 Mar, 2006 2 commits
  7. 12 Mar, 2006 1 commit
  8. 04 Mar, 2006 1 commit
  9. 01 Mar, 2006 2 commits