1. 20 Feb, 2019 1 commit
  2. 05 Nov, 2018 3 commits
    • xdiff-interface: drop parse_hunk_header() · 5eade074
      Jeff King authored
      This function was used only for parsing the hunk headers generated by
      xdiff. Now that we can use hunk callbacks to get that information
      directly, it has outlived its usefulness.
      
      Note to anyone who wants to resurrect it: the "len" parameter was
      totally unused, meaning that the function could read past the end of the
      "line" array. In practice this never happened, because we only used it
      to parse xdiff's generated header lines. But it would be dangerous to
      use it for other cases without fixing this defect.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      5eade074
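
The hazard described in this commit (a "len" parameter that is never consulted) can be avoided by checking every read against the length. The following is a minimal sketch of such a parser, not the removed function; the names parse_num, parse_range and parse_hunk_header_bounded are hypothetical, and numeric overflow is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Parse a decimal number from line[*pos..len), never reading past len. */
static int parse_num(const char *line, size_t len, size_t *pos, long *out)
{
	size_t p = *pos;
	long val = 0;

	if (p >= len || line[p] < '0' || line[p] > '9')
		return -1;
	while (p < len && line[p] >= '0' && line[p] <= '9')
		val = val * 10 + (line[p++] - '0'); /* overflow unchecked in this sketch */
	*pos = p;
	*out = val;
	return 0;
}

/* Parse "begin[,count]"; count defaults to 1 when omitted. */
static int parse_range(const char *line, size_t len, size_t *pos,
		       long *begin, long *count)
{
	if (parse_num(line, len, pos, begin))
		return -1;
	*count = 1;
	if (*pos < len && line[*pos] == ',') {
		(*pos)++;
		return parse_num(line, len, pos, count);
	}
	return 0;
}

/*
 * Hypothetical bounded parser for "@@ -a,b +c,d @@" headers.
 * Returns 0 on success, -1 on malformed or truncated input.
 */
int parse_hunk_header_bounded(const char *line, size_t len,
			      long *ob, long *on, long *nb, long *nn)
{
	size_t pos;

	assert(line && ob && on && nb && nn);
	if (len < 4 || line[0] != '@' || line[1] != '@' ||
	    line[2] != ' ' || line[3] != '-')
		return -1;
	pos = 4;
	if (parse_range(line, len, &pos, ob, on))
		return -1;
	if (pos + 1 >= len || line[pos] != ' ' || line[pos + 1] != '+')
		return -1;
	pos += 2;
	if (parse_range(line, len, &pos, nb, nn))
		return -1;
	if (pos + 2 >= len || line[pos] != ' ' ||
	    line[pos + 1] != '@' || line[pos + 2] != '@')
		return -1;
	return 0;
}
```

A header cut short by the caller's length is rejected instead of being read past its end.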
    • diff: use hunk callback for word-diff · 7c61e25f
      Jeff King authored
      Our word-diff does not look at the -/+ lines generated by xdiff at all
      (because they are not real lines to show the user, but just the
      tokenized words split into lines). Instead we use the line numbers from
      the hunk headers to index our own data structure.
      
      As a result, our xdi_diff_outf() callback throws away all lines except
      hunk headers. We can instead use a hunk callback, which has two
      benefits:
      
        1. We don't have to re-parse the generated hunk header line, but can
           use the passed parameters directly.
      
        2. By setting our line callback to NULL, we can tell xdiff-interface
           that it does not even need to bother generating the other lines,
           saving a small amount of work.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      7c61e25f
    • diff: avoid generating unused hunk header lines · 3b40a090
      Jeff King authored
      Some callers of xdi_diff_outf() do not look at the generated hunk header
      lines at all. Plugging in a no-op hunk callback tells xdiff not to even
      bother formatting them.
      
      This patch introduces a stock no-op callback and uses it with a few
      callers whose line callbacks explicitly ignore hunk headers (because
      they look only for +/- lines).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      3b40a090
  3. 02 Nov, 2018 2 commits
    • xdiff-interface: provide a separate consume callback for hunks · 9346d6d1
      Jeff King authored
      The previous commit taught xdiff to optionally provide the hunk header
      data to a specialized callback. But most users of xdiff actually use our
      more convenient xdi_diff_outf() helper, which ensures that our callbacks
      are always fed whole lines.
      
      Let's plumb the special hunk-callback through this interface, too. It
      will follow the same rule as xdiff when the hunk callback is NULL (i.e.,
      continue to pass a stringified hunk header to the line callback). Since
      we add NULL to each caller, there should be no behavior change yet.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      9346d6d1
    • xdiff: provide a separate emit callback for hunks · 611e42a5
      Jeff King authored
      The xdiff library always emits hunk header lines to our callbacks as
      formatted strings like "@@ -a,b +c,d @@\n". This is convenient if we're
      going to output a diff, but less so if we actually need to compute using
      those numbers, which requires re-parsing the line.
      
      In preparation for moving away from this, let's teach xdiff a new
      callback function which gets the broken-out hunk information. To help
      callers that don't want to use this new callback, if it's NULL we'll
      continue to format the hunk header into a string.
      
      Note that this patch renames the "outf" callback to "out_line", as
      well. This isn't strictly necessary, but helps in two ways:
      
        1. Now that there are two callbacks, it's nice to use more descriptive
           names.
      
        2. Many callers did not zero the emit_callback_data struct, and needed
           to be modified to set ecb.out_hunk to NULL. By changing the name of
           the existing struct member, that guarantees that any new callers
           from in-flight topics will break the build and be examined
           manually.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      611e42a5
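
The dispatch rule from this commit (use the hunk callback when it is set, otherwise fall back to a formatted header line) can be sketched as follows. This is a simplified illustration, not xdiff's actual code: struct emit_cb, emit_hunk and the three example callbacks are hypothetical, and the fallback always prints both counts, which real xdiff output may abbreviate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for xdiff's emit callback struct; not the real layout. */
struct emit_cb {
	void *priv;
	int (*out_hunk)(void *priv, long ob, long on, long nb, long nn);
	int (*out_line)(void *priv, const char *buf, size_t len);
};

/* Emit one hunk header: structured if possible, else as a formatted line. */
int emit_hunk(struct emit_cb *ecb, long ob, long on, long nb, long nn)
{
	char buf[128];
	int len;

	assert(ecb);
	if (ecb->out_hunk)
		return ecb->out_hunk(ecb->priv, ob, on, nb, nn);
	if (!ecb->out_line)
		return 0;
	len = snprintf(buf, sizeof(buf), "@@ -%ld,%ld +%ld,%ld @@\n",
		       ob, on, nb, nn);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;
	return ecb->out_line(ecb->priv, buf, (size_t)len);
}

struct hunk_range { long ob, on, nb, nn; };

/* Example hunk callback: store the numbers, no re-parsing needed. */
int record_hunk(void *priv, long ob, long on, long nb, long nn)
{
	struct hunk_range *r = priv;

	r->ob = ob; r->on = on; r->nb = nb; r->nn = nn;
	return 0;
}

/* Stock no-op hunk callback for callers that ignore hunk headers. */
int discard_hunk(void *priv, long ob, long on, long nb, long nn)
{
	(void)priv; (void)ob; (void)on; (void)nb; (void)nn;
	return 0;
}

/* Example line callback: copy the line into a buffer of at least 128 bytes. */
int capture_line(void *priv, const char *buf, size_t len)
{
	char *dst = priv;

	memcpy(dst, buf, len);
	dst[len] = '\0';
	return 0;
}
```

With record_hunk installed, the caller receives the four numbers directly; with discard_hunk installed, no header string is formatted at all.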
  4. 29 Aug, 2018 1 commit
    • convert "oidcmp() == 0" to oideq() · 4a7e27e9
      Jeff King authored
      Using the more restrictive oideq() should, in the long run,
      give the compiler more opportunities to optimize these
      callsites. For now, this conversion should be a complete
      noop with respect to the generated code.
      
      The result is also perhaps a little more readable, as it
      avoids the "zero is equal" idiom. Since it's so prevalent in
      C, I think seasoned programmers tend not to even notice it
      anymore, but it can sometimes make for awkward double
      negations (e.g., we can drop a few !!oidcmp() instances
      here).
      
      This patch was generated almost entirely by the included
      coccinelle patch. This mechanical conversion should be
      completely safe, because we check explicitly for cases where
      oidcmp() is compared to 0, which is what oideq() is doing
      under the hood. Note that we don't have to catch "!oidcmp()"
      separately; coccinelle's standard isomorphisms make sure the
      two are treated equivalently.
      
      I say "almost" because I did hand-edit the coccinelle output
      to fix up a few style violations (it mostly keeps the
      original formatting, but sometimes unwraps long lines).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      4a7e27e9
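
To make the relationship between the two helpers concrete, here is a small sketch in which oideq() is defined as "oidcmp() compared to 0", as the message describes. The struct layout and the 20-byte RAWSZ are assumptions made for this illustration only.

```c
#include <assert.h>
#include <string.h>

#define RAWSZ 20 /* assumed hash size for this sketch */

struct object_id {
	unsigned char hash[RAWSZ];
};

/* Three-way comparison: negative, zero or positive, like memcmp(). */
static inline int oidcmp(const struct object_id *a, const struct object_id *b)
{
	assert(a && b);
	return memcmp(a->hash, b->hash, RAWSZ);
}

/* Equality only: returns 1 if the ids are equal, 0 otherwise. */
static inline int oideq(const struct object_id *a, const struct object_id *b)
{
	return !oidcmp(a, b);
}

/*
 * The conversion at a call site then reads:
 *
 *	if (!oidcmp(a, b))   becomes   if (oideq(a, b))
 *	if (!!oidcmp(a, b))  becomes   if (!oideq(a, b))
 */
```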
  5. 16 May, 2018 1 commit
    • object-store: move object access functions to object-store.h · cbd53a21
      Stefan Beller authored
      This should make these functions easier to find and cache.h less
      overwhelming to read.
      
      In particular, this moves:
      - read_object_file
      - oid_object_info
      - write_object_file
      
      As a result, most of the codebase needs to #include object-store.h.
      In this patch the #include is only added to files that would fail to
      compile otherwise.  It would be better to #include wherever
      identifiers from the header are used.  That can happen later
      when we have better tooling for it.
      Signed-off-by: Stefan Beller <sbeller@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      cbd53a21
  6. 14 Mar, 2018 1 commit
    • sha1_file: convert read_sha1_file to struct object_id · b4f5aca4
      brian m. carlson authored
      Convert read_sha1_file to take a pointer to struct object_id and rename
      it read_object_file.  Do the same for read_sha1_file_extended.
      
      Convert one use in grep.c to use the new function without any other code
      change, since the pointer being passed is a void pointer that is already
      initialized with a pointer to struct object_id.  Update the declaration
      and definitions of the modified functions, and apply the following
      semantic patch to convert the remaining callers:
      
      @@
      expression E1, E2, E3;
      @@
      - read_sha1_file(E1.hash, E2, E3)
      + read_object_file(&E1, E2, E3)
      
      @@
      expression E1, E2, E3;
      @@
      - read_sha1_file(E1->hash, E2, E3)
      + read_object_file(E1, E2, E3)
      
      @@
      expression E1, E2, E3, E4;
      @@
      - read_sha1_file_extended(E1.hash, E2, E3, E4)
      + read_object_file_extended(&E1, E2, E3, E4)
      
      @@
      expression E1, E2, E3, E4;
      @@
      - read_sha1_file_extended(E1->hash, E2, E3, E4)
      + read_object_file_extended(E1, E2, E3, E4)
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b4f5aca4
  7. 26 Oct, 2017 1 commit
  8. 15 Jun, 2017 1 commit
  9. 26 May, 2017 1 commit
  10. 21 Sep, 2016 1 commit
    • regex: use regexec_buf() · b7d36ffc
      Johannes Schindelin authored
      The new regexec_buf() function operates on buffers with an explicitly
      specified length, rather than NUL-terminated strings.
      
      We need to use this function whenever the buffer we want to pass to
      regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated).
      
      Note: the original motivation for this patch was to fix a bug where
      `git diff -G <regex>` would crash. This patch converts more callers,
      though, some of which allocated memory to construct NUL-terminated strings,
      or worse, modified buffers to temporarily insert NULs while calling
      regexec(3).  By converting them to use regexec_buf(), the code has
      become much cleaner.
      Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b7d36ffc
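
One way to match against a length-delimited buffer is the REG_STARTEND extension to regexec(3), available on BSD and glibc. The sketch below assumes that approach and falls back to a NUL-terminated copy where the flag is missing; regexec_buf_sketch is a hypothetical name with a simplified signature, not git's regexec_buf().

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Match "preg" against the first "size" bytes of "buf", which need not be
 * NUL-terminated. Returns 0 on a match, like regexec(3).
 */
int regexec_buf_sketch(const regex_t *preg, const char *buf, size_t size)
{
	regmatch_t m[1];

	assert(preg && buf);
#ifdef REG_STARTEND
	/* Extension: m[0] delimits the subject, so no NUL is required. */
	m[0].rm_so = 0;
	m[0].rm_eo = (regoff_t)size;
	return regexec(preg, buf, 1, m, REG_STARTEND);
#else
	/* Portable fallback: match against a NUL-terminated copy. */
	char *copy = malloc(size + 1);
	int ret;

	if (!copy)
		return REG_ESPACE;
	memcpy(copy, buf, size);
	copy[size] = '\0';
	ret = regexec(preg, copy, 1, m, 0);
	free(copy);
	return ret;
#endif
}
```

Because the end of the subject is given by the length, an anchor such as "$" applies at that position rather than at the first NUL byte.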
  11. 07 Sep, 2016 1 commit
  12. 31 May, 2016 1 commit
    • xdiff: don't trim common tail with -W · e0876bca
      René Scharfe authored
      The function trim_common_tail() exits early if context lines are
      requested.  If -U0 and -W are specified together then it can still trim
      context lines that might belong to a changed function.  As a result
      that function is shown incompletely.
      
      Fix that by calling trim_common_tail() only if no function context or
      fixed context is requested.  The parameter ctx is no longer needed now;
      remove it.
      
      While at it fix an outdated comment as well.
      Signed-off-by: Rene Scharfe <l.s.r@web.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      e0876bca
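
The gating rule from this fix, together with a simplified idea of tail trimming, might look like the sketch below. It compares byte by byte and is not the actual trim_common_tail(); may_trim_tail and common_tail_to_trim are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Trimming is only safe when no fixed or function context is requested. */
int may_trim_tail(long ctxlen, int want_func_context)
{
	return ctxlen == 0 && !want_func_context;
}

/*
 * Return how many trailing bytes both buffers can drop. The common tail
 * is shortened so that the line in which the buffers start to agree
 * stays complete in both of them.
 */
size_t common_tail_to_trim(const char *a, size_t alen,
			   const char *b, size_t blen)
{
	size_t smaller = alen < blen ? alen : blen;
	size_t same = 0, keep = 0;

	assert(a && b);
	while (same < smaller && a[alen - 1 - same] == b[blen - 1 - same])
		same++;
	/* Give back bytes up to and including the first newline of the tail. */
	while (keep < same)
		if (a[alen - same + keep++] == '\n')
			break;
	return same - keep;
}
```

A caller would subtract the returned count from both sizes, and only when may_trim_tail() allows it.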
  13. 22 Feb, 2016 1 commit
  14. 28 Sep, 2015 1 commit
    • xdiff: reject files larger than ~1GB · dcd1742e
      Jeff King authored
      The xdiff code is not prepared to handle extremely large
      files. It uses "int" in many places, which can overflow if
      we have a very large number of lines or even bytes in our
      input files. This can cause us to produce incorrect diffs,
      with no indication that the output is wrong. Or worse, we
      may even underallocate a buffer whose size is the result of
      an overflowing addition.
      
      We're much better off to tell the user that we cannot diff
      or merge such a large file. This patch covers both cases,
      but in slightly different ways:
      
        1. For merging, we notice the large file and cleanly fall
           back to a binary merge (which is effectively "we cannot
           merge this").
      
        2. For diffing, we make the binary/text distinction much
           earlier, and in many different places. For this case,
           we'll use the xdi_diff as our choke point, and reject
           any diff there before it hits the xdiff code.
      
           This means in most cases we'll die() immediately after.
           That's not ideal, but in practice we shouldn't
           generally hit this code path unless the user is trying
           to do something tricky. We already consider files
           larger than core.bigfilethreshold to be binary, so this
           code would only kick in when that is circumvented
           (either by bumping that value, or by using a
           .gitattributes entry to mark a file as diffable).
      
           In other words, we can avoid being "nice" here, because
           there is already nice code that tries to do the right
           thing. We are adding the suspenders to the nice code's
           belt, so notice when it has been worked around (both to
           protect the user from malicious inputs, and because it
           is better to die() than generate bogus output).
      
      The maximum size was chosen after experimenting with feeding
      large files to the xdiff code. It's just under a gigabyte,
      which leaves room for two obvious cases:
      
        - a diff3 merge conflict result on files of maximum size X
          could be 3*X plus the size of the markers, which would
          still be only about 3G, which fits in a 32-bit int.
      
        - some of the diff code allocates arrays of one int per
          record. Even if each file consists only of blank lines,
          then a file smaller than 1G will have fewer than 1G
          records, and therefore the int array will fit in 4G.
      
      Since the limit is arbitrary anyway, I chose to go under a
      gigabyte, to leave a safety margin (e.g., we would not want
      to overflow by allocating "(records + 1) * sizeof(int)" or
      similar).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      dcd1742e
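
The choke-point check and the arithmetic behind the limit can be sketched as follows. The constant's exact value (1023 MiB) is an assumption consistent with "just under a gigabyte", and struct file_buf and check_diff_size are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Just under 1 GiB; the exact value is an assumption of this sketch. */
#define MAX_DIFF_SIZE (1023UL * 1024 * 1024)

struct file_buf { /* simplified stand-in for xdiff's buffer type */
	const char *ptr;
	unsigned long size;
};

/* Return -1 if either side is too large to hand to the diff engine. */
int check_diff_size(const struct file_buf *a, const struct file_buf *b)
{
	assert(a && b);
	if (a->size > MAX_DIFF_SIZE || b->size > MAX_DIFF_SIZE)
		return -1;
	return 0;
}

/*
 * Headroom under this assumed limit, for a 4-byte int:
 *   3 * MAX_DIFF_SIZE       = 3218079744 < 2^32 (diff3 conflict result)
 *   (MAX_DIFF_SIZE + 1) * 4 = 4290772996 < 2^32 (one int per record)
 */
```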
  15. 09 May, 2012 1 commit
  16. 15 May, 2011 1 commit
  17. 26 Dec, 2010 1 commit
  18. 10 Sep, 2010 1 commit
  19. 04 May, 2010 1 commit
  20. 17 Feb, 2010 1 commit
  21. 02 Jul, 2009 1 commit
  22. 08 Mar, 2009 1 commit
  23. 26 Nov, 2008 1 commit
  24. 25 Oct, 2008 1 commit
  25. 16 Oct, 2008 1 commit
    • xdiff-interface.c: strip newline (and cr) from line before pattern matching · 563d5a2c
      Brandon Casey authored
      POSIX doth sayeth:
      
         "In the regular expression processing described in IEEE Std 1003.1-2001,
          the <newline> is regarded as an ordinary character and both a period and
          a non-matching list can match one. ... Those utilities (like grep) that
          do not allow <newline>s to match are responsible for eliminating any
          <newline> from strings before matching against the RE."
      
      Thus far git has not been removing the trailing newline from strings matched
      against regular expression patterns. This has the effect that (quoting
      Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
      a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
      '^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.
      Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      563d5a2c
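
Stripping the line terminator before matching amounts to shortening the length handed to the matcher. A minimal sketch, with strip_eol as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of "line" without a trailing "\n" and, after that,
 * without a trailing "\r". The buffer itself is left unmodified.
 */
size_t strip_eol(const char *line, size_t len)
{
	assert(line);
	if (len > 0 && line[len - 1] == '\n')
		len--;
	if (len > 0 && line[len - 1] == '\r')
		len--;
	return len;
}
```

With the shortened length, a pattern such as '^FUNCNAME$' can match a line that originally ended in a newline.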
  26. 03 Oct, 2008 1 commit
    • xdiff-interface.c: strip newline (and cr) from line before pattern matching · a5a5a048
      Brandon Casey authored
      POSIX doth sayeth:
      
         "In the regular expression processing described in IEEE Std 1003.1-2001,
          the <newline> is regarded as an ordinary character and both a period and
          a non-matching list can match one. ... Those utilities (like grep) that
          do not allow <newline>s to match are responsible for eliminating any
          <newline> from strings before matching against the RE."
      
      Thus far git has not been removing the trailing newline from strings matched
      against regular expression patterns. This has the effect that (quoting
      Jonathan del Strother) "... a line containing just 'FUNCNAME' (terminated by
      a newline) will be matched by the pattern '^(FUNCNAME.$)' but not
      '^(FUNCNAME$)'", and more simply not '^FUNCNAME$'.
      Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
      Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
      a5a5a048
  27. 20 Sep, 2008 1 commit
    • diff: fix "multiple regexp" semantics to find hunk header comment · 3d8dccd7
      Junio C Hamano authored
      When multiple regular expressions are concatenated with "\n", they were
      traditionally AND'ed together, and only a line that matches _all_ of them
      is taken as a match.  This however is unwieldy when the multiple regexp
      feature is used to specify alternatives.
      
      This fixes the semantics to take the first match.  A negative pattern, if
      it matches, makes the line fail as before.  A match with a positive
      pattern will be the final match, and what it captures in $1 is used as the
      hunk header comment.
      
      We could write alternatives using "|" in ERE, but the machinery can only
      use captured $1 as the hunk header comment (or $0 if there is no match in
      $1), so you cannot write:
      
          "junk ( A | B ) | garbage ( C | D )"
      
      and expect both "junk" and "garbage" to get stripped with the existing
      code.  With this fix, you can write it as:
      
          "junk ( A | B ) \n garbage ( C | D )"
      
      and the way capture works would match the user expectation more
      naturally.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      3d8dccd7
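
The first-match rule described here can be sketched as a loop over the patterns in order. This is an illustration built on POSIX regexec(3), not the code from the commit; struct pattern and match_funcname are hypothetical names.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

struct pattern { /* hypothetical pairing of a regex with its polarity */
	regex_t re;
	int negate;
};

/*
 * Try patterns in order; the first one that matches decides the result.
 * Returns 0 and stores the span of $1 (or of $0 when group 1 did not
 * take part in the match) in "so" and "eo". Returns -1 if a negative
 * pattern matched first or if nothing matched.
 */
int match_funcname(struct pattern *pats, size_t nr, const char *line,
		   regoff_t *so, regoff_t *eo)
{
	regmatch_t m[2];
	size_t i;
	int which;

	assert(pats && line && so && eo);
	for (i = 0; i < nr; i++) {
		if (regexec(&pats[i].re, line, 2, m, 0))
			continue; /* no match, try the next pattern */
		if (pats[i].negate)
			return -1; /* negative pattern hit: reject the line */
		which = m[1].rm_so >= 0 ? 1 : 0;
		*so = m[which].rm_so;
		*eo = m[which].rm_eo;
		return 0;
	}
	return -1;
}
```

Splitting "junk (A|B)" and "garbage (C|D)" into two patterns lets each one capture its own $1, which a single alternation could not do.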
  28. 19 Sep, 2008 1 commit
  29. 31 Aug, 2008 1 commit
  30. 14 Aug, 2008 3 commits
    • xdiff-interface: hide the whole "xdiff_emit_state" business from the caller · 8a3f524b
      Junio C Hamano authored
      This further enhances xdi_diff_outf() interface so that it takes two
      common parameters: the callback function that processes one line at a
      time, and a pointer to its application specific callback data structure.
      xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes
      these two away inside it, which is used by the lowest level output
      function in the xdiff_outf() callchain, consume_one(), to call back to the
      application layer.  With this restructuring, we lift the requirement that
      the caller supplied callback data structure embeds xdiff_emit_state
      structure as its first member.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      8a3f524b
    • Use strbuf for struct xdiff_emit_state's remainder · b4637760
      Brian Downing authored
      Continually xreallocing and freeing the remainder member of struct
      xdiff_emit_state was a noticeable performance hit.  Use a strbuf
      instead.
      
      This yields a decent performance improvement on "git blame" on certain
      repositories.  For example, before this commit:
      
      $ time git blame -M -C -C -p --incremental server.c >/dev/null
      101.52user 0.17system 1:41.73elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+39561minor)pagefaults 0swaps
      
      With this commit:
      
      $ time git blame -M -C -C -p --incremental server.c >/dev/null
      80.38user 0.30system 1:20.81elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+0outputs (0major+50979minor)pagefaults 0swaps
      Signed-off-by: Brian Downing <bdowning@lavos.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      b4637760
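
The gain comes from keeping one growable buffer alive instead of allocating and freeing a remainder for every chunk. The sketch below illustrates that idea with a tiny hand-rolled buffer; it is far simpler than git's strbuf, and struct linebuf, feed_chunk and count_line are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct linebuf { /* hypothetical, much simpler than git's strbuf */
	char *buf;
	size_t len, alloc;
};

typedef void (*line_fn)(void *priv, const char *line, size_t len);

/* Append bytes, growing geometrically so repeated appends stay cheap. */
static int linebuf_add(struct linebuf *lb, const char *data, size_t n)
{
	if (lb->len + n > lb->alloc) {
		size_t want = lb->alloc ? lb->alloc * 2 : 64;
		char *p;

		while (want < lb->len + n)
			want *= 2;
		p = realloc(lb->buf, want);
		if (!p)
			return -1;
		lb->buf = p;
		lb->alloc = want;
	}
	memcpy(lb->buf + lb->len, data, n);
	lb->len += n;
	return 0;
}

/* Feed a chunk; call "fn" once per complete line, keep the partial tail. */
int feed_chunk(struct linebuf *lb, const char *data, size_t n,
	       line_fn fn, void *priv)
{
	size_t start = 0, i;

	assert(lb && data && fn);
	if (n == 0)
		return 0;
	if (linebuf_add(lb, data, n))
		return -1;
	for (i = 0; i < lb->len; i++) {
		if (lb->buf[i] != '\n')
			continue;
		fn(priv, lb->buf + start, i + 1 - start);
		start = i + 1;
	}
	/* Shift the unfinished tail to the front; the allocation is reused. */
	memmove(lb->buf, lb->buf + start, lb->len - start);
	lb->len -= start;
	return 0;
}

struct line_stats { int lines; size_t last_len; };

/* Example callback: count lines and remember the last length seen. */
void count_line(void *priv, const char *line, size_t len)
{
	struct line_stats *st = priv;

	(void)line;
	st->lines++;
	st->last_len = len;
}
```

The caller frees lb.buf once at the end, so the allocation cost is paid a handful of times rather than once per chunk.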
    • Make xdi_diff_outf interface for running xdiff_outf diffs · c99db9d2
      Brian Downing authored
      To prepare for the need to initialize and release resources for an
      xdi_diff with the xdiff_outf output function, make a new function to
      wrap this usage.
      
      Old:
      
      	ecb.outf = xdiff_outf;
      	ecb.priv = &state;
      	...
      	xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb);
      
      New:
      
      	xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb);
      Signed-off-by: Brian Downing <bdowning@lavos.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      c99db9d2
  31. 14 Mar, 2008 1 commit
  32. 22 Feb, 2008 1 commit
    • Avoid unnecessary "if-before-free" tests. · 8e0f7003
      Jim Meyering authored
      This change removes all obvious useless if-before-free tests.
      E.g., it replaces code like this:
      
              if (some_expression)
                      free (some_expression);
      
      with the now-equivalent:
      
              free (some_expression);
      
      It is equivalent not just because POSIX has required free(NULL)
      to work for a long time, but simply because it has worked for
      so long that no reasonable porting target fails the test.
      Here's some evidence from nearly 1.5 years ago:
      
          http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html
      
      FYI, the change below was prepared by running the following:
      
        git ls-files -z | xargs -0 \
        perl -0x3b -pi -e \
          's/\bif\s*\(\s*(\S+?)(?:\s*!=\s*NULL)?\s*\)\s+(free\s*\(\s*\1\s*\))/$2/s'
      
      Note however, that it doesn't handle brace-enclosed blocks like
      "if (x) { free (x); }".  But that's ok, since there were none like
      that in git sources.
      
      Beware: if you do use the above snippet, note that it can
      produce syntactically invalid C code.  That happens when the
      affected "if"-statement has a matching "else".
      E.g., it would transform this
      
        if (x)
          free (x);
        else
          foo ();
      
      into this:
      
        free (x);
        else
          foo ();
      
      There were none of those here, either.
      
      If you're interested in automating detection of the useless
      tests, you might like the useless-if-before-free script in gnulib:
      [it *does* detect brace-enclosed free statements, and has a --name=S
       option to make it detect free-like functions with different names]
      
        http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free
      
      Addendum:
        Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>.
      Signed-off-by: Jim Meyering <meyering@redhat.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      8e0f7003
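
As a small illustration of the rule this change relies on, a release helper can call free() on a pointer that may be NULL without any guard. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct state { /* hypothetical holder of an optional buffer */
	char *remainder; /* may be NULL */
};

/* free(NULL) is defined to do nothing, so no "if" is needed here. */
void state_release(struct state *s)
{
	assert(s);
	free(s->remainder);
	s->remainder = NULL;
}
```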
  33. 21 Dec, 2007 1 commit
    • Re(-re)*fix trim_common_tail() · d2f82950
      Linus Torvalds authored
      The tar-ball and the git archive itself are fine, but yes, the diff from
      2.6.23 to 2.6.24-rc6 is bad. It's the "trim_common_tail()" optimization
      that has caused way too much pain.
      
      Very interesting breakage. The patch was actually "correct" in a (rather
      limited) technical sense, but the context at the end was missing because
      while the trim_common_tail() code made sure to keep enough common context
      to allow a valid diff to be generated, the diff machinery itself could
      decide that it could generate the diff differently than the "obvious"
      solution.
      
      The sad fact is that the git optimization (which is very important for
      "git blame", which needs no context), is only really valid for that one
      case where we really don't need any context.
      
      [jc: since this is shared with "git diff -U0" codepath, context recovery
      to the end of line needs to be done even for zero context case.]
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      d2f82950
  34. 16 Dec, 2007 2 commits