  1. 05 Apr, 2013 5 commits
    • diffcore-pickaxe: fix leaks in "log -S<block>" and "log -G<pattern>" · 88ff684d
      Junio C Hamano authored
      The diff_grep() and has_changes() functions had early return
      codepaths for unmerged filepairs, which simply returned 0.  When we
      taught textconv filter to them, one was ignored and continued to
      return early without freeing the result filtered by textconv, and
      the other had a failed attempt to fix, which allowed the planned
      return value 0 to be overwritten by a bogus call to contains().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
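The bug class described in this commit, an early return that skips cleanup, is commonly avoided by routing every exit through a single cleanup label. The following is a minimal sketch of that pattern, not the actual diffcore-pickaxe.c code: `struct side`, `convert()` and `has_hit()` are hypothetical stand-ins for one side of a filepair, the textconv step and the search.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one side of a filepair. */
struct side {
	const char *raw;   /* original content, NULL if the side is absent */
	int unmerged;      /* nonzero for an unmerged entry */
};

/* Hypothetical "textconv": returns an allocated copy the caller must free. */
static char *convert(const struct side *s)
{
	size_t len;
	char *out;

	if (!s->raw)
		return NULL;
	len = strlen(s->raw) + 1;
	out = malloc(len);
	if (out)
		memcpy(out, s->raw, len);
	return out;
}

/* Returns 1 if needle occurs on exactly one side, 0 otherwise. */
static int has_hit(const struct side *one, const struct side *two,
		   const char *needle)
{
	char *a = convert(one);
	char *b = convert(two);
	int ret = 0;

	/* The early-out case jumps to cleanup instead of returning directly. */
	if (one->unmerged || two->unmerged)
		goto cleanup;

	ret = (a && strstr(a, needle) != NULL) != (b && strstr(b, needle) != NULL);

cleanup:
	free(a);   /* free(NULL) is a no-op */
	free(b);
	return ret;
}
```

With this shape the planned return value for the unmerged case stays 0 and cannot be overwritten later, and both converted buffers are released on every path.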
    • diffcore-pickaxe: port optimization from has_changes() to diff_grep() · ebb72262
      Junio C Hamano authored
      These two functions are called in the same codeflow to implement
      "log -S<block>" and "log -G<pattern>", respectively, but the latter
      lacked two obvious optimizations the former implemented, namely:
      
       - When a pickaxe limit is not given at all, they should return
         without wasting any cycle;
      
       - When both sides of the filepair are the same, and the same
         textconv conversion applies to them, return early, as there will be
         no interesting differences between the two anyway.
      
      Also release the filespec data once the processing is done (this is
      not about leaking memory--it is about releasing data we finished
      looking at as early as possible).
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diffcore-pickaxe: respect --no-textconv · a8f61094
      Simon Ruderich authored
      git log -S doesn't respect --no-textconv:
      
          $ echo '*.txt diff=wrong' > .gitattributes
          $ git -c diff.wrong.textconv='xxx' log --no-textconv -Sfoo
          error: cannot run xxx: No such file or directory
          fatal: unable to read files to diff
      Reported-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
      Signed-off-by: Simon Ruderich <simon@ruderich.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diffcore-pickaxe: remove fill_one() · 7cdb9b42
      Jeff King authored
      fill_one is _almost_ identical to just calling fill_textconv; the
      exception is that for the !DIFF_FILE_VALID case, fill_textconv gives us
      an empty buffer rather than a NULL one. Since we currently use the NULL
      pointer as a signal that the file is not present on one side of the
      diff, we must now switch to using DIFF_FILE_VALID to make the same
      check.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Simon Ruderich <simon@ruderich.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diffcore-pickaxe: remove unnecessary call to get_textconv() · bc615898
      Simon Ruderich authored
      The fill_one() function is responsible for finding and filling the
      textconv filter as necessary, and is called by diff_grep() function
      that implements "git log -G<pattern>".
      
      The has_changes() function that implements "git log -S<block>" calls
      get_textconv() for two sides being compared, before it checks to see
      if it was asked to perform the pickaxe limiting.  Move the code
      around to avoid this wastage.
      
      After has_changes() calls get_textconv() to obtain textconv for both
      sides, fill_one() is called to use them.
      
      By adding get_textconv() to diff_grep() and relieving fill_one() of
      responsibility to find the textconv filter, we can avoid calling
      get_textconv() twice in has_changes().
      
      With this change it's also no longer necessary for fill_one() to
      modify the textconv argument, therefore pass a pointer instead of a
      pointer to a pointer.
      Signed-off-by: Simon Ruderich <simon@ruderich.org>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 28 Oct, 2012 3 commits
    • pickaxe: use textconv for -S counting · ef90ab66
      Jeff King authored
      We currently just look at raw blob data when using "-S" to
      pickaxe. This is mostly historical, as pickaxe predates the
      textconv feature. If the user has bothered to define a
      textconv filter, it is more likely that their search string will be
      on the textconv output, as that is what they will see in the
      diff (and we do not even provide a mechanism for them to
      search for binary needles that contain NUL characters).
      
      This patch teaches "-S" to use textconv, just as we
      already do for "-G".
      Signed-off-by: Jeff King <peff@peff.net>
    • pickaxe: hoist empty needle check · 8fa4b09f
      Jeff King authored
      If we are given an empty pickaxe needle like "git log -S ''",
      it is impossible for us to find anything (because no matter
      what the content, the count will always be 0). We currently
      check this at the lowest level of contains(). Let's hoist
      the logic much earlier to has_changes(), so that it is
      simpler to return our answer before loading any blob data.
      Signed-off-by: Jeff King <peff@peff.net>
    • diff_grep: use textconv buffers for add/deleted files · b1c2f57d
      Jeff King authored
      If you use "-G" to grep a diff, we will apply a configured
      textconv filter to the data before generating the diff.
      However, if the diff is an addition or deletion, we do not
      bother running the diff at all, and just look for the token
      in the added (or removed) content. This works because we
      know that the diff must contain every line of content.
      
      However, while we used the textconv-derived buffers in the
      regular diff, we accidentally passed the original unmodified
      buffers to regexec when checking the added or removed
      content. This could lead to an incorrect answer.
      
      Worse, in some cases we might have a textconv buffer but no
      original buffer (e.g., if we pulled the textconv data from
      cache, or if we reused a working tree file when generating
      it). In that case, we could actually feed NULL to regexec
      and segfault.
      Reported-by: Peter Oberndorfer <kumbayo84@arcor.de>
      Signed-off-by: Jeff King <peff@peff.net>
  3. 29 Feb, 2012 1 commit
    • pickaxe: allow -i to search in patch case-insensitively · accccde4
      Junio C Hamano authored
      "git log -S<string>" is a useful way to find the last commit in the
      codebase that touched the <string>. As it was designed to be used by a
      porcelain script to dig the history starting from a block of text that
      appears in the starting commit, it never had to look for anything but an
      exact match.
      
      When used by an end user who wants to look for the last commit that
      removed a string (e.g. name of a variable) that he vaguely remembers,
      however, it is useful to support case insensitive match.
      
      When given the "--regexp-ignore-case" (or "-i") option, which originally
      was designed to affect case sensitivity of the search done in the commit
      log part, e.g. "log --grep", the matches made with the -S/-G pickaxe search
      are now done case insensitively.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
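For the regular-expression path, case-insensitive matching can be requested from the POSIX regex library with the REG_ICASE compile flag. The sketch below only illustrates that idea under stated assumptions: `pattern_matches()` and its `ignore_case` parameter are hypothetical names, the flag combination is illustrative rather than copied from git, and the fixed-string -S path is not covered.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if pattern matches somewhere in text, 0 if it does not,
 * and -1 if the pattern fails to compile.
 * ignore_case plays the role of the "-i" option.
 */
static int pattern_matches(const char *pattern, const char *text, int ignore_case)
{
	regex_t re;
	int cflags = REG_EXTENDED | REG_NEWLINE | REG_NOSUB;
	int hit;

	if (ignore_case)
		cflags |= REG_ICASE;   /* fold case while matching */
	if (regcomp(&re, pattern, cflags))
		return -1;
	hit = !regexec(&re, text, 0, NULL, 0);   /* regexec() returns 0 on a match */
	regfree(&re);
	return hit;
}
```

A half-remembered variable name such as "frotz" then also finds a line that spells it "FROTZ", which is the end-user scenario the commit message describes.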
  4. 07 Oct, 2011 7 commits
  5. 21 Aug, 2011 1 commit
    • Use kwset in pickaxe · b95c5ada
      Fredrik K authored
      Benchmarks in the hot cache case:
      
      before:
      $ perf stat --repeat=5 git log -Sqwerty
      
      Performance counter stats for 'git log -Sqwerty' (5 runs):
      
             47,092,744 cache-misses             #      2.825 M/sec   ( +-   1.607% )
            123,368,389 cache-references         #      7.400 M/sec   ( +-   0.812% )
            330,040,998 branch-misses            #      3.134 %       ( +-   0.257% )
         10,530,896,750 branches                 #    631.663 M/sec   ( +-   0.121% )
         62,037,201,030 instructions             #      1.399 IPC     ( +-   0.142% )
         44,331,294,321 cycles                   #   2659.073 M/sec   ( +-   0.326% )
                 96,794 page-faults              #      0.006 M/sec   ( +-  11.952% )
                     25 CPU-migrations           #      0.000 M/sec   ( +-  25.266% )
                  1,424 context-switches         #      0.000 M/sec   ( +-   0.540% )
           16671.708650 task-clock-msecs         #      0.997 CPUs    ( +-   0.343% )
      
            16.728692052  seconds time elapsed   ( +-   0.344% )
      
      after:
      $ perf stat --repeat=5 git log -Sqwerty
      
      Performance counter stats for 'git log -Sqwerty' (5 runs):
      
             51,385,522 cache-misses             #      4.619 M/sec   ( +-   0.565% )
            129,177,880 cache-references         #     11.611 M/sec   ( +-   0.219% )
            319,222,775 branch-misses            #      6.946 %       ( +-   0.134% )
          4,595,913,233 branches                 #    413.086 M/sec   ( +-   0.112% )
         31,395,042,533 instructions             #      1.062 IPC     ( +-   0.129% )
         29,558,348,598 cycles                   #   2656.740 M/sec   ( +-   0.204% )
                 93,224 page-faults              #      0.008 M/sec   ( +-   4.487% )
                     19 CPU-migrations           #      0.000 M/sec   ( +-  10.425% )
                    950 context-switches         #      0.000 M/sec   ( +-   0.360% )
           11125.796039 task-clock-msecs         #      0.997 CPUs    ( +-   0.239% )
      
            11.164216599  seconds time elapsed   ( +-   0.240% )
      
      So the kwset code takes about 33% less elapsed time.
      Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
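The 33% figure can be checked against the two "seconds time elapsed" values reported above (16.729 s before, 11.164 s after, rounded to three decimals). The symbols below are introduced here only for the calculation:

```latex
\[
\frac{t_{\text{before}} - t_{\text{after}}}{t_{\text{before}}}
  = \frac{16.729 - 11.164}{16.729} \approx 0.333,
\qquad
\frac{t_{\text{before}}}{t_{\text{after}}}
  = \frac{16.729}{11.164} \approx 1.50
\]
```

That is, the run takes about a third less wall-clock time, which is the same as roughly 1.5 times the throughput.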
  6. 06 Oct, 2010 1 commit
  7. 05 Oct, 2010 1 commit
  8. 31 Aug, 2010 2 commits
    • git log/diff: add -G<regexp> that greps in the patch text · f506b8e8
      Junio C Hamano authored
      Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
      "git diff" family of commands.  This limits the diff queue to filepairs
      whose patch text actually has an added or a deleted line that matches the
      given regexp.  Unlike "-S<regexp>", changing other parts of the line that
      has a substring that matches the given regexp IS counted as a change, as
      such a change would appear as one deletion followed by one addition in a
      patch text.
      
      Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
      that changes the number of occurrences of hits between the preimage and
      the postimage to serve as a part of larger toolchain, this is meant to be
      used as the top-level Porcelain feature.
      
      The implementation unfortunately has to run "diff" twice if you are
      running "log" family of commands to produce patches in the final output
      (e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
      result in-core if we wanted to, but that would require larger surgery to
      the diffcore machinery (i.e. adding an extra pointer in the filepair
      structure to keep a pointer to a strbuf around, stuff the textual diff to
      the strbuf inside diffgrep_consume(), and make use of it in later stages
      when it is available) and it may not be worth it.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
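The -G behaviour described in this commit, selecting a filepair when an added or deleted patch line matches the regexp, can be sketched as a scan over hunk text. This is an illustration under assumptions, not git's implementation: `patch_has_match()` is a hypothetical helper, and its input is assumed to contain hunk lines only, without the "---"/"+++" file headers.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if any added ('+') or deleted ('-') line in the hunk text
 * matches the regexp, 0 if none does, -1 on error.
 */
static int patch_has_match(const char *patch, const char *pattern)
{
	regex_t re;
	const char *line = patch;
	int hit = 0;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	while (*line && !hit) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		/* Context lines (' ') and hunk headers ('@') are skipped. */
		if (line[0] == '+' || line[0] == '-') {
			/* regexec() needs a NUL-terminated string, so copy the line body. */
			char *body = malloc(len);   /* len - 1 body bytes plus the NUL */
			if (!body) {
				hit = -1;
				break;
			}
			memcpy(body, line + 1, len - 1);
			body[len - 1] = '\0';
			hit = !regexec(&re, body, 0, NULL, 0);
			free(body);
		}
		line = eol ? eol + 1 : line + len;
	}
	regfree(&re);
	return hit;
}
```

For a hunk that changes "int frotz = 1;" to "int frotz = 2;", the pattern "frotz" matches both the deleted and the added line, so this check reports a hit. The number of occurrences of "frotz" is the same before and after, which is why -S would not select the same change.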
    • diff: pass the entire diff-options to diffcore_pickaxe() · 382f013b
      Junio C Hamano authored
      That would make it easier to give enhanced feature to the
      pickaxe transformation.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  9. 07 May, 2010 1 commit
  10. 22 Mar, 2009 1 commit
    • pickaxe: count regex matches only once · 7ad3c52e
      René Scharfe authored
      When --pickaxe-regex is used, forward past the end of matches instead of
      advancing to the byte after their start.  This way matches count only
      once, even if the regular expression matches their tail -- like in the
      fixed-string fork of the code.
      
      E.g.: /.*/ used to count the number of bytes instead of the number of
      lines.  /aa/ resulted in a count of two in "aaa" instead of one.
      
      Also document the fact that regexec() needs a NUL-terminated string as
      its second argument by adding an assert().
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
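The counting rule described in this commit, resuming the search after the end of each match rather than one byte after its start, can be sketched with the POSIX regexec() interface. This is a minimal illustration, not the code from the commit: `count_regex_matches()` is a hypothetical name, and compiling the pattern inside the function is a simplification.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Counts non-overlapping matches of an extended regexp in a
 * NUL-terminated string.  Returns -1 if the pattern does not compile.
 */
static int count_regex_matches(const char *data, const char *pattern)
{
	regex_t re;
	regmatch_t m;
	int flags = 0;
	int cnt = 0;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE))
		return -1;

	while (*data && !regexec(&re, data, 1, &m, flags)) {
		flags |= REG_NOTBOL;   /* the remainder no longer starts the buffer */
		data += m.rm_eo;       /* resume after the END of the match */
		if (*data && m.rm_so == m.rm_eo)
			data++;        /* empty match: step one byte to make progress */
		cnt++;
	}
	regfree(&re);
	return cnt;
}
```

With this loop /aa/ is counted once in "aaa": the first match consumes the first two bytes and the remaining "a" cannot match. Advancing by only one byte from the match start would find a second, overlapping match.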
  11. 17 Mar, 2009 1 commit
    • pickaxe: count regex matches only once · 50fd6997
      René Scharfe authored
      When --pickaxe-regex is used, forward past the end of matches instead of
      advancing to the byte after their start.  This way matches count only
      once, even if the regular expression matches their tail -- like in the
      fixed-string fork of the code.
      
      E.g.: /.*/ used to count the number of bytes instead of the number of
      lines.  /aa/ resulted in a count of two in "aaa" instead of one.
      
      Also document the fact that regexec() needs a NUL-terminated string as
      its second argument by adding an assert().
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 03 Mar, 2009 1 commit
    • diffcore-pickaxe: use memmem() · ce163c79
      René Scharfe authored
      Use memmem() instead of open-coding it.  The system libraries usually have a
      much faster version than the memcmp()-loop here.  Even our own fall-back in
      compat/, which is used on Windows, is slightly faster.
      
      The following commands were run in a Linux kernel repository and timed, the
      best of five results is shown:
      
        $ STRING='Ensure that the real time constraints are schedulable.'
        $ git log -S"$STRING" HEAD -- kernel/sched.c >/dev/null
      
      On Ubuntu 8.10 x64, before (v1.6.2-rc2):
      
        8.09user 0.04system 0:08.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
        0inputs+0outputs (0major+30952minor)pagefaults 0swaps
      
      And with the patch:
      
        1.50user 0.04system 0:01.54elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
        0inputs+0outputs (0major+30645minor)pagefaults 0swaps
      
      On Fedora 10 x64, before:
      
        8.34user 0.05system 0:08.39elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
        0inputs+0outputs (0major+29268minor)pagefaults 0swaps
      
      And with the patch:
      
        1.15user 0.05system 0:01.20elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
        0inputs+0outputs (0major+32253minor)pagefaults 0swaps
      
      On Windows Vista x64, before:
      
        real    0m9.204s
        user    0m0.000s
        sys     0m0.000s
      
      And with the patch:
      
        real    0m8.470s
        user    0m0.000s
        sys     0m0.000s
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 07 Jun, 2007 1 commit
    • War on whitespace · a6080a0a
      Junio C Hamano authored
      This uses "git-apply --whitespace=strip" to fix whitespace errors that have
      crept in to our source files over time.  There are a few files that need
      to have trailing whitespaces (most notably, test vectors).  The result
      still passes the tests, and the build result in the Documentation/ area is unchanged.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  14. 07 May, 2007 1 commit
  15. 26 Jan, 2007 1 commit
    • diffcore-pickaxe: fix infinite loop on zero-length needle · e1b16116
      Jeff King authored
      The "contains" algorithm runs into an infinite loop if the needle string
      has zero length. The loop could be modified to handle this, but it makes
      more sense to simply have an empty needle return no matches. Thus, a
      command like
        git log -S
      produces no output.
      
      We place the check at the top of the function so that we get the same
      results with or without --pickaxe-regex. Note that until now,
        git log -S --pickaxe-regex
      would match everything, not nothing.
      
      Arguably, an empty pickaxe string should simply produce an error
      message; however, this is still a useful assertion to add to the
      algorithm at this layer of the code.
      
      Noticed by Bill Lear.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <junkio@cox.net>
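Why a zero-length needle can loop forever, and how a check at the top of the function prevents it, can be sketched as follows. This is an illustration only: `count_occurrences()` is a hypothetical helper that uses strstr() on NUL-terminated strings for brevity, which is an assumption and not the buffer-based loop the commit patched.

```c
#include <string.h>

/*
 * Counts non-overlapping occurrences of needle in haystack.
 * An empty needle is defined to have no occurrences.
 */
static unsigned int count_occurrences(const char *haystack, const char *needle)
{
	size_t len = strlen(needle);
	unsigned int cnt = 0;
	const char *p = haystack;

	/*
	 * Guard first: strstr(p, "") returns p itself and "p += 0" never
	 * advances, so without this check the loop below would not terminate.
	 */
	if (!len)
		return 0;

	while ((p = strstr(p, needle)) != NULL) {
		cnt++;
		p += len;   /* continue after the match just counted */
	}
	return cnt;
}
```

Because the guard runs before any searching, every caller gets the same answer for an empty needle: no matches.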
  16. 20 Dec, 2006 1 commit
    • simplify inclusion of system header files. · 85023577
      Junio C Hamano authored
      This is a mechanical clean-up of the way *.c files include
      system header files.
      
       (1) sources under compat/, platform sha-1 implementations, and
           xdelta code are exempt from the following rules;
      
       (2) the first #include must be "git-compat-util.h" or one of
           our own header files that includes it first (e.g. config.h,
           builtin.h, pkt-line.h);
      
       (3) system headers that are included in "git-compat-util.h"
           need not be included in individual C source files.
      
       (4) "git-compat-util.h" does not have to include subsystem
           specific header files (e.g. expat.h).
      Signed-off-by: Junio C Hamano <junkio@cox.net>
  17. 05 Apr, 2006 1 commit
  18. 04 Apr, 2006 1 commit
    • Support for pickaxe matching regular expressions · d01d8c67
      Petr Baudis authored
      git-diff-* --pickaxe-regex will change the -S pickaxe to match
      POSIX extended regular expressions instead of fixed strings.
      
      The regex.h library is a rather stupid interface and I like pcre too, but
      with any luck it will be everywhere we will want to run Git on, it being
      POSIX.2 and all. I'm not sure if we can expect platforms like AIX to
      conform to POSIX.2 or if win32 has regex.h. We might add a flag to
      Makefile if there is a portability trouble potential.
      Signed-off-by: Petr Baudis <pasky@suse.cz>
  19. 24 Jul, 2005 1 commit
  20. 29 May, 2005 4 commits
  21. 23 May, 2005 2 commits
    • [PATCH] Performance fix for pickaxe. · 046aa644
      Junio C Hamano authored
      The pickaxe was expanding the blobs and searching in them even
      when it should have already known that both sides are the same.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Rename/copy detection fix. · f7c1512a
      Junio C Hamano authored
      The rename/copy detection logic in earlier round was only good
      enough to show patch output and discussion on the mailing list
      about the diff-raw format updates revealed many problems with
      it.  This patch fixes all the ones known to me, without making
      things I want to do later impossible, mostly related to patch
      reordering.
      
       (1) Earlier rename/copy detector determined which one is rename
           and which one is copy too early, which made it impossible
           to later introduce diffcore transformers to reorder
           patches.  This patch fixes it by moving that logic to the
           very end of the processing.
      
       (2) Earlier output routine diff_flush() was pruning all the
           "no-change" entries indiscriminately.  This was done due
           to my false assumption that one of the requirements in the
           diff-raw output was not to show such an entry (which
           resulted in my incorrect comment about "diff-helper never
           being able to be equivalent to built-in diff driver").  My
           special thanks go to Linus for correcting me about this.
           When we produce diff-raw output, for the downstream to be
           able to tell renames from copies, sometimes it _is_
           necessary to output "no-change" entries, and this patch
           adds diffcore_prune() function for doing it.
      
       (3) Earlier diff_filepair structure was trying to be not too
           specific about rename/copy operations, but the purpose of
           the structure was to record one or two paths, which _was_
           indeed about rename/copy.  This patch discards xfrm_msg
           field which was trying to be generic for this wrong reason,
           and introduces a couple of fields (rename_score and
           rename_rank) that are explicitly specific to rename/copy
           logic.  One thing to note is that the information in a
           single diff_filepair structure _still_ does not distinguish
           renames from copies, and it is deliberately so.  This is to
           allow patches to be reordered in later stages.
      
       (4) This patch also adds some tests about diff-raw format
           output and makes sure that necessary "no-change" entries
           appear on the output.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  22. 22 May, 2005 2 commits
    • [PATCH] Diffcore updates. · 6b14d7fa
      Junio C Hamano authored
      This moves the path selection logic from individual programs to a new
      diffcore transformer (diff-tree still needs to have its own for
      performance reasons).  Also the header printing code in diff-tree was
      tweaked not to produce anything when pickaxe is in effect and there is
      nothing interesting to report.  An interesting example is the following
      in the GIT archive itself:
      
          $ git-whatchanged -p -C -S'or something in a real script'
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] The diff-raw format updates. · 81e50eab
      Junio C Hamano authored
      Update the diff-raw format as Linus and I discussed, except that
      it does not use sequence of underscore '_' letters to express
      nonexistence.  All '0' mode is used for that purpose instead.
      
      The new diff-raw format can express rename/copy, and the earlier
      restriction that -M and -C _must_ be used with the patch format
      output is no longer necessary.  The patch makes -M and -C flags
      independent of -p flag, so you need to say git-whatchanged -M -p
      to get the diff/patch format.
      
      Updated are both documentations and tests.
      Signed-off-by: Junio C Hamano <junkio@cox.net>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>