1. 21 Mar, 2018 7 commits
    • diff-highlight: detect --graph by indent · 4551fbba
      Jeff King authored
      This patch fixes a corner case where diff-highlight may
      scramble some diffs when combined with --graph.
      
      Commit 7e4ffb4c (diff-highlight: add support for --graph
      output, 2016-08-29) taught diff-highlight to skip past the
      graph characters at the start of each line with this regex:
      
        ($COLOR?\|$COLOR?\s+)*
      
      I.e., any series of pipes separated by and followed by
      arbitrary whitespace.  We need to match more than just a
      single space because the commit in question may be indented
      to accommodate other parts of the graph drawing. E.g.:
      
       * commit 1234abcd
       | ...
       | diff --git ...
      
      has only a single space, but for the last commit before a
      fork:
      
       | | |
       | * | commit 1234abcd
       | |/  ...
       | |   diff --git
      
      the diff lines have more spaces between the pipes and the
      start of the diff.
      
      However, when we soak up all of those spaces with the
      $GRAPH regex, we may accidentally include the leading space
      for a context line. That means we may consider the actual
      contents of a context line as part of the diff syntax. In
      other words, something like this:
      
         normal context line
        -old line
        +new line
         -this is a context line with a leading dash
      
      would cause us to see that final context line as a removal
      line, and we'd end up showing the hunk in the wrong order:
      
        normal context line
        -old line
         -this is a context line with a leading dash
        +new line
      
      Instead, let's be a little more clever about parsing the
      graph. We'll look for the actual "*" line that marks the
      start of a commit, and record the indentation we see there.
      Then we can skip past that indentation when checking whether
      the line is a hunk header, removal, addition, etc.
      
      There is one tricky thing: the indentation in bytes may be
      different for various lines of the graph due to coloring.
      E.g., the "*" on a commit line is generally shown without
      color, but on the actual diff lines, it will be replaced
      with a colorized "|" character, adding several bytes. We
      work around this here by counting "visible" bytes. This is
      unfortunately a bit more expensive, making us about twice as
      slow to handle --graph output. But since this is meant to be
      used interactively anyway, it's tolerably fast (and the
      non-graph case is unaffected).
      
      One alternative would be to search for hunk header lines and
      use their indentation (since they'd have the same colors as
      the diff lines which follow). But that just opens up
      different corner cases. If we see:
      
        | |    @@ 1,2 1,3 @@
      
      we cannot know if this is a real diff that has been
      indented due to the graph, or if it's a context line that
      happens to look like a diff header. We can only be sure of
      the indent on the "*" lines, since we know those don't
      contain arbitrary data (technically the user could include a
      bunch of extra indentation via --format, but that's rare
      enough to disregard).
      Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
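
The indentation-based approach described in 4551fbba can be sketched roughly as follows. This is a Python illustration, not diff-highlight's actual Perl code: the names `COMMIT_LINE`, `visible_width`, `graph_indent`, and `strip_visible` are invented for the example, and the color pattern assumes simple SGR escapes such as `\x1b[31m`.

```python
import re

# Assumption: colors are plain SGR escapes like "\x1b[31m" or "\x1b[m".
COLOR = r"(?:\x1b\[[0-9;]*m)"
COLOR_RE = re.compile(COLOR)

# A graph line that begins a commit: optional "|" columns, the "*",
# optional trailing "|" columns, then any extra padding.
COMMIT_LINE = re.compile(
    rf"""^(?:{COLOR}?\|{COLOR}?[ ])*   # zero or more leading "|" columns
          {COLOR}?\*{COLOR}?[ ]        # the "*" that marks the commit
          (?:{COLOR}?\|{COLOR}?[ ])*   # zero or more trailing "|" columns
          [ ]*                         # extra padding
    """,
    re.VERBOSE,
)

def visible_width(s):
    """Length of s once color escapes are removed."""
    return len(COLOR_RE.sub("", s))

def graph_indent(line):
    """Visible width of the graph prefix on a commit line, else None."""
    m = COMMIT_LINE.match(line)
    return visible_width(m.group(0)) if m else None

def strip_visible(line, width):
    """Drop the first `width` visible characters and any colors among them."""
    seen = 0
    pos = 0
    while seen < width and pos < len(line):
        m = COLOR_RE.match(line, pos)
        if m:
            pos = m.end()      # color escapes take up bytes but no columns
        else:
            pos += 1
            seen += 1
    return line[pos:]
```

With the indent recorded from the "*" line, a context line that starts with a dash keeps its leading space after stripping, so it is no longer mistaken for a removal.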
    • diff-highlight: use flush() helper consistently · 009a81ed
      Jeff King authored
      The current flush() helper only shows the queued diff but
      does not clear the queue. This is conceptually a bug, but it
      works because we only call it once at the end of the
      program.
      
      Let's teach it to clear the queue, which will let us use it
      in more places (one for now, but more in future patches).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
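
As a rough illustration of why clearing the queue matters, here is a minimal Python sketch. The `DiffQueue` class and its fields are hypothetical and do not mirror the script's real data structures; the point is only that a flush which empties the queue can safely be called more than once.

```python
class DiffQueue:
    """Hypothetical holder for the removed/added lines of the current hunk."""

    def __init__(self):
        self.removed = []
        self.added = []

    def flush(self, out):
        """Emit queued lines to `out`, then empty the queue."""
        out.extend(self.removed)
        out.extend(self.added)
        # Clearing here is what makes repeated calls safe: without it,
        # a second flush would emit the same lines again.
        self.removed.clear()
        self.added.clear()
```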
    • diff-highlight: test graphs with --color · fbcf99e4
      Jeff King authored
      Our tests send git's output directly to files or pipes, so
      there will never be any color. Let's do at least one --color
      test to make sure that we can handle this case (which we
      currently can, but will be an easy thing to mess up when we
      touch the graph code in a future patch).
      
      We'll just cover the --graph case, since this is much more
      complex than the earlier cases (i.e., if it manages to
      highlight, then the non-graph case definitely would).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: test interleaved parallel lines of history · 7ce2f4ca
      Jeff King authored
      The graph test in t9400 covers the case of two simultaneous
      branches, but all of the commits during this time are on the
      right-hand branch. So we test a graph structure like:
      
        | |
        | * commit ...
        | |
      
      but we never see the reverse, a commit on the left-hand
      branch:
      
        | |
        * | commit ...
        | |
      
      Since this is an easy thing to get wrong when touching the
      graph-matching code, let's cover it by adding one more
      commit with its timestamp interleaved with the other branch.
      
      Note that we need to pass --date-order to convince Git to
      show it this way (since --topo-order tries to keep lines of
      history separate).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: prefer "echo" to "cat" in tests · e28ae507
      Jeff King authored
      We generate a bunch of one-line files whose contents match
      their names, and then generate our commits by cat-ing those
      files. Let's just echo the contents directly, which saves
      some processes.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: use test_tick in graph test · 53ab9f0e
      Jeff King authored
      The exact ordering output by Git may depend on the commit
      timestamps, so let's make sure they're actually
      monotonically increasing, and not all the same (or worse,
      subject to how long the test script takes to run).
      
      Let's use test_tick to make sure this is stable. Note that
      we actually have to rearrange the order of the branches to
      match the expected graph structure (which means that
      previously we might racily have been testing a slightly
      different output, though the test is written in such a way
      that we'd still pass).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: correct test graph diagram · 5013acc2
      Jeff King authored
      We actually branch "A" off of "D". The sample "--graph"
      output is right, but the left-to-right diagram is
      misleading. Let's fix it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  2. 06 Sep, 2017 1 commit
  3. 15 Jun, 2017 1 commit
    • diff-highlight: split code into module · 0c977dbc
      Jeff King authored
      The diff-so-fancy project is also written in perl, and most
      of its users pipe diffs through both diff-highlight and
      diff-so-fancy. It would be nice if this could be done in a
      single script. So let's pull most of diff-highlight's code
      into its own module which can be used by diff-so-fancy.
      
      In addition, we'll abstract a few basic items like reading
      from stdio so that a script using the module can do more
      processing before or after diff-highlight handles the lines.
      See the README update for more details.
      
      One small downside is that the diff-highlight script must
      now be built using the Makefile. There are ways around this,
      but it quickly gets into perl arcana. Let's go with the
      simple solution. As a bonus, our Makefile now respects the
      PERL_PATH variable if it is set.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  4. 31 Aug, 2016 3 commits
    • diff-highlight: avoid highlighting combined diffs · 3dbfe2b8
      Jeff King authored
      The algorithm in diff-highlight only understands how to look
      at two sides of a diff; it cannot correctly handle combined
      diffs with multiple preimages. Often highlighting does not
      trigger at all for these diffs because the line counts do
      not match up.  E.g., if we see:
      
        - ours
         -theirs
        ++resolved
      
      we would not bother highlighting; it naively looks like a
      single line went away, and then a separate hunk added
      another single line.
      
      But of course there are exceptions. E.g., if the other side
      deleted the line, we might see:
      
        - ours
        ++resolved
      
      which looks like we dropped " ours" and added "+resolved".
      This is only a small highlighting glitch (we highlight the
      space and the "+" along with the content), but it's also the
      tip of the iceberg. Even if we learned to find the true
      content here (by noticing we are in a 3-way combined diff
      and marking _two_ characters from the front of the line as
      uninteresting), there are other more complicated cases where
      we really do need to handle a 3-way hunk.
      
      Let's just punt for now; we can recognize combined diffs by
      the presence of extra "@" symbols in the hunk header, and
      treat them as non-diff content.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
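
Recognizing a combined diff by its hunk header might look like the Python sketch below. It is an illustration only: `is_combined_hunk_header` is a made-up name, and the sketch assumes an uncolored line with no graph prefix.

```python
import re

# A two-way hunk header starts with exactly two "@"; a combined diff
# with N parents starts with N + 1 of them (so three or more).
COMBINED_HEADER = re.compile(r"^@{3,} ")

def is_combined_hunk_header(line):
    """True if `line` looks like the hunk header of a combined diff."""
    return COMBINED_HEADER.match(line) is not None
```

Once such a header is seen, the lines that follow can be passed through untouched, as if they were not diff content at all.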
    • diff-highlight: add multi-byte tests · 1b5290b1
      Jeff King authored
      Now that we have a test suite for diff-highlight, we can
      show off the improvements from 8d00662d (diff-highlight: do
      not split multibyte characters, 2015-04-03).
      
      While we're at it, we can also add another case that
      _doesn't_ work: combining code points are treated as their
      own unit, which means that we may stick colors between them
      and the character they are modifying (with the result that
      the color is not shown in an xterm, though it's possible
      that other terminals err the other way, and show the color
      but not the accent).  There's no fix here, but let's
      document it as a failure.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: ignore test cruft · 9f76e520
      Jeff King authored
      These are the same as in the normal t/.gitignore, with the
      exception of ".prove", as our Makefile does not support it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  5. 29 Aug, 2016 3 commits
  6. 04 Apr, 2015 1 commit
    • diff-highlight: do not split multibyte characters · 8d00662d
      Kyle J. McKay authored
      When the input is UTF-8 and Perl is operating on bytes instead of
      characters, a diff that changes one multibyte character to another
      that shares an initial byte sequence will result in a broken diff
      display as the common byte sequence prefix will be separated from
      the rest of the bytes in the multibyte character.
      
      For example, if a single line contains only the Unicode character
      U+C9C4 (encoded as UTF-8 0xEC, 0xA7, 0x84) and that line is then
      changed to the Unicode character U+C9C0 (encoded as UTF-8 0xEC,
      0xA7, 0x80), when operating on bytes diff-highlight will show only
      the single byte change from 0x84 to 0x80 thus creating invalid UTF-8
      and a broken diff display.
      
      Fix this by putting Perl into character mode when splitting the line
      and then back into byte mode after the split is finished.
      
      The utf8::xxx functions require Perl 5.8 so we require that as well.
      
      Also, since we are mucking with code in the split_line function, we
      change a '*' quantifier to a '+' quantifier when matching the $COLOR
      expression which has the side effect of speeding everything up while
      eliminating useless '' elements in the returned array.
      Reported-by: Yi EungJun <semtlenori@gmail.com>
      Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
      Acked-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
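
The U+C9C4 / U+C9C0 example from 8d00662d can be reproduced with a short Python sketch. It only illustrates the byte-versus-character difference and is not how the Perl script performs its split; `common_prefix_len` and `is_valid_utf8` are helper names invented here.

```python
def common_prefix_len(a, b):
    """Number of leading items that a and b share (works on str or bytes)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def is_valid_utf8(data):
    """True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

old_char, new_char = "\uc9c4", "\uc9c0"
old_bytes = old_char.encode("utf-8")   # 0xEC 0xA7 0x84
new_bytes = new_char.encode("utf-8")   # 0xEC 0xA7 0x80

# Comparing bytes finds a shared two-byte prefix, so only the final byte
# would be marked as changed, and that lone byte is not valid UTF-8.
byte_prefix = common_prefix_len(old_bytes, new_bytes)
changed_tail = old_bytes[byte_prefix:]

# Comparing characters finds no shared prefix, so the whole character
# is treated as the change and the output stays well-formed.
char_prefix = common_prefix_len(old_char, new_char)
```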
  7. 20 Nov, 2014 1 commit
    • diff-highlight: allow configurable colors · bca45fbc
      Jeff King authored
      Until now, the highlighting colors were hard-coded in the
      script (as "reverse" and "noreverse"), and you had to edit
      the script to change them. This patch teaches diff-highlight
      to read from color.diff-highlight.* to set them.
      
      In addition, it expands the possibilities considerably by
      adding two features:
      
        1. Old/new lines can be colored independently (so you can
           use a color scheme that complements existing line
           coloring).
      
        2. Normal, unhighlighted parts of the lines can be colored,
           too. Technically this can be done by separately
           configuring color.diff.old/new and matching it to your
           diff-highlight colors. But you may want a different
           look for your highlighted diffs versus your regular
           diffs.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 04 Nov, 2014 1 commit
  9. 13 Feb, 2012 5 commits
    • diff-highlight: document some non-optimal cases · a0b676aa
      Jeff King authored
      The diff-highlight script works on heuristics, so it can be
      wrong. Let's document some of the wrong-ness in case
      somebody feels like working on it.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: match multi-line hunks · 34d9819e
      Jeff King authored
      Currently we only bother highlighting single-line hunks. The
      rationale was that the purpose of highlighting is to point
      out small changes between two similar lines that are
      otherwise hard to see. However, that meant we missed similar
      cases where two lines were changed together, like:
      
         -foo(buf);
         -bar(buf);
         +foo(obj->buf);
         +bar(obj->buf);
      
      Each of those changes is simple, and would benefit from
      highlighting (the "obj->" parts in this case).
      
      This patch considers whole hunks at a time. For now, we
      consider only the case where the hunk has the same number of
      removed and added lines, and assume that the lines from each
      segment correspond one-to-one. While this is just a
      heuristic, in practice it seems to generate sensible
      results (especially because we now omit highlighting on
      completely-changed lines, so when our heuristic is wrong, we
      tend to avoid highlighting at all).
      
      Based on an original idea and implementation by Michał
      Kiedrowicz.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
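
The one-to-one pairing heuristic from 34d9819e can be sketched in a few lines of Python. The function name `pair_lines` is hypothetical, and the sketch covers only the pairing step, not the highlighting itself.

```python
def pair_lines(removed, added):
    """Pair removed and added lines one-to-one, or give up.

    Returns a list of (old, new) tuples when both sides of the hunk have
    the same number of lines, and an empty list otherwise (meaning the
    hunk should be shown without highlighting).
    """
    if len(removed) != len(added):
        return []
    return list(zip(removed, added))
```

For the `foo(buf)` / `bar(buf)` hunk in the commit message, this pairs each removed call with its `obj->buf` counterpart, so each pair can then be compared on its own.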
    • diff-highlight: refactor to prepare for multi-line hunks · 6463fd7e
      Jeff King authored
      The current code structure assumes that we will only look at
      a pair of lines at any given time, and that the end result
      should always be to output that pair. However, we want to
      eventually handle multi-line hunks, which will involve
      collating pairs of removed/added lines. Let's refactor the
      code to return highlighted pairs instead of printing them.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • diff-highlight: don't highlight whole lines · 097128d1
      Jeff King authored
      If you have a change like:
      
        -foo
        +bar
      
      we end up highlighting the entirety of both lines (since the
      whole thing is changed). But the point of diff highlighting
      is to pinpoint the specific change in a pair of lines that
      are mostly identical. In this case, the highlighting is just
      noise, since there is nothing to pinpoint, and we are better
      off doing nothing.
      
      The implementation looks for "interesting" pairs by checking
      to see whether they actually have a matching prefix or
      suffix that does not simply consist of colorization and
      whitespace.  However, the implementation makes it easy to
      plug in other heuristics, too, like:
      
        1. Depending on the source material, the set of "boring"
           characters could be tweaked to include language-specific
           stuff (like braces or semicolons for C).
      
        2. Instead of saying "an interesting line has at least one
           character of prefix or suffix", we could require that
           less than N percent of the line be highlighted.
      
      The simple "ignore whitespace, and highlight if there are
      any matched characters" implemented by this patch seems to
      give good results on git.git. I'll leave experimentation
      with other heuristics to somebody who has a dataset that
      does not look good with the current code.
      
      Based on an original idea and implementation by Michał
      Kiedrowicz.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
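
The "interesting pair" check from 097128d1 can be approximated with the Python sketch below. It is a simplification under stated assumptions: the inputs are line contents without the leading "-"/"+" marker and without color escapes (the commit also treats colorization as boring), and the names `common_affixes`, `is_interesting`, and `changed_parts` are invented for the example.

```python
def common_affixes(a, b):
    """Return (prefix_len, suffix_len) shared by strings a and b."""
    limit = min(len(a), len(b))
    p = 0
    while p < limit and a[p] == b[p]:
        p += 1
    s = 0
    # Do not let the suffix overlap the prefix already claimed.
    while s < limit - p and a[-1 - s] == b[-1 - s]:
        s += 1
    return p, s

def is_interesting(a, b):
    """True if the shared prefix or suffix holds more than whitespace."""
    p, s = common_affixes(a, b)
    prefix = a[:p]
    suffix = a[len(a) - s:]
    return bool(prefix.strip() or suffix.strip())

def changed_parts(a, b):
    """The differing middle of each line, i.e. what would be highlighted."""
    p, s = common_affixes(a, b)
    return a[p:len(a) - s], b[p:len(b) - s]
```

A pair such as "foo" and "bar" shares nothing, so it is left unhighlighted, while "foo(buf);" and "foo(obj->buf);" share both a prefix and a suffix and only the "obj->" part stands out.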
    • diff-highlight: make perl strict and warnings fatal · 2b21008d
      Jeff King authored
      These perl features can catch bugs, and we shouldn't be
      violating any of the strict rules or creating any warnings,
      so let's turn them on.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 18 Oct, 2011 1 commit