  1. 28 Mar, 2013 3 commits
    • Implement line-history search (git log -L) · 12da1d1f
      Thomas Rast authored
      This is a rewrite of much of Bo's work, mainly in an effort to split
      it into smaller, easier-to-understand routines.
      
      The algorithm is built around the struct range_set, which encodes a
      series of line ranges as intervals [a,b).  This is used in two
      contexts:
      
      * A set of lines we are tracking (which will change as we dig through
        history).
      * To encode diffs, as pairs of ranges.
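
      As a rough illustration of the half-open interval idea, here is a
      minimal sketch.  The type layout and the helper names
      range_set_append() and range_set_contains() are assumptions made for
      this example and are not taken from the patch itself.

```c
#include <stdlib.h>

/* Hypothetical, simplified types: one half-open line interval [start, end). */
struct range {
	long start, end;
};

/* A sorted series of non-overlapping ranges. */
struct range_set {
	struct range *ranges;
	size_t nr, alloc;
};

/*
 * Append [a, b), merging it into the last range when the two touch or
 * overlap.  Assumes callers append in non-decreasing order of `a`.
 * Returns 0 on success, -1 if memory could not be allocated.
 */
int range_set_append(struct range_set *rs, long a, long b)
{
	if (a >= b)
		return 0; /* empty interval: nothing to record */
	if (rs->nr && rs->ranges[rs->nr - 1].end >= a) {
		if (rs->ranges[rs->nr - 1].end < b)
			rs->ranges[rs->nr - 1].end = b;
		return 0;
	}
	if (rs->nr == rs->alloc) {
		size_t n = rs->alloc ? 2 * rs->alloc : 4;
		struct range *p = realloc(rs->ranges, n * sizeof(*p));
		if (!p)
			return -1;
		rs->ranges = p;
		rs->alloc = n;
	}
	rs->ranges[rs->nr].start = a;
	rs->ranges[rs->nr].end = b;
	rs->nr++;
	return 0;
}

/* Is line number `line` covered by any range in the set? */
int range_set_contains(const struct range_set *rs, long line)
{
	size_t i;

	for (i = 0; i < rs->nr; i++)
		if (rs->ranges[i].start <= line && line < rs->ranges[i].end)
			return 1;
	return 0;
}
```

      Because the intervals are half-open, appending [10,20) and then
      [20,25) yields the single range [10,25), and line 25 itself is not
      part of it.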
      
      The main routine is range_set_map_across_diff().  It processes the
      diff between a commit C and some parent P.  It determines which diff
      hunks are relevant to the ranges tracked in C, and computes the new
      ranges for P.
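
      The mapping step can be sketched for the simplest case: one tracked
      range and one diff hunk.  This is a simplification under stated
      assumptions, not the routine from the patch.  The names struct hunk
      and map_range_across_hunk() are hypothetical, and the sketch assumes
      that no earlier hunk has already shifted the line numbers.

```c
/* Hypothetical, simplified types for illustration only. */
struct range {
	long start, end; /* half-open [start, end) */
};

/* One diff hunk as a pair of ranges: lines in parent P and in commit C. */
struct hunk {
	struct range parent, target;
};

/*
 * Rewrite `r` (line numbers in C) into line numbers in P.
 * Returns 1 if the hunk touches the tracked range (the commit is
 * relevant), 0 if the range is merely shifted or left alone.
 */
int map_range_across_hunk(struct range *r, const struct hunk *h)
{
	/* How far lines following the hunk move when going from C to P. */
	long shift = h->parent.end - h->target.end;

	if (h->target.end <= r->start) {
		/* Hunk lies entirely before the range: only shift it. */
		r->start += shift;
		r->end += shift;
		return 0;
	}
	if (h->target.start >= r->end)
		return 0; /* hunk lies entirely after the range */

	/* Overlap: widen the range so it covers the parent side of the hunk. */
	if (h->target.start <= r->start)
		r->start = h->parent.start;
	if (h->target.end >= r->end)
		r->end = h->parent.end;
	else
		r->end += shift;
	return 1;
}
```

      For example, if C replaced one line of P at [2,3) with three lines at
      [2,5), a range tracked at [10,20) in C corresponds to [8,18) in P and
      the commit is not relevant to it.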
      
      The algorithm is then simply to process history in topological order
      from newest to oldest, computing ranges and (partial) diffs.  At
      branch points, we need to merge the ranges we are watching.  We will
      find that many commits do not affect the chosen ranges, and mark them
      TREESAME (in addition to those already filtered by pathspec limiting).
      Another pass of history simplification then gets rid of such commits.
      
      This is wired as an extra filtering pass in the log machinery.  This
      currently only reduces code duplication, but should allow for other
      simplifications and options to be used.
      
      Finally, we hook a diff printer into the output chain.  Ideally we
      would wire directly into the diff logic, to optionally use features
      like word diff.  However, that will require some major reworking of
      the diff chain, so we completely replace the output with our own diff
      for now.
      
      As this was a GSoC project, and has quite some history by now, many
      people have helped.  In no particular order, thanks go to
      
        Jakub Narebski <[email protected].com>
        Jens Lehmann <[email protected]>
        Jonathan Nieder <[email protected]>
        Junio C Hamano <[email protected]>
        Ramsay Jones <[email protected]>
        Will Palmer <[email protected]>
      
      Apologies to everyone I forgot.
      Signed-off-by: Bo Yang <[email protected]>
      Signed-off-by: Thomas Rast <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
    • Export rewrite_parents() for 'log -L' · c7edcae0
      Bo Yang authored
      The function rewrite_one is used to rewrite a single
      parent of the current commit, and is used by rewrite_parents
      to rewrite all the parents.
      
      Decouple the two by making rewrite_one a callback function
      that is passed to rewrite_parents. Then export rewrite_parents
      for reuse by the line history browser.
      
      We will use this function in line-log.c.
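
      The callback arrangement can be sketched as follows.  This is an
      illustration only: the types are simplified stand-ins, and the names
      rewrite_parents_sketch(), skip_uninteresting() and the enum values
      are assumptions for this example, not the signatures in revision.c.

```c
#include <stddef.h>

/* Hypothetical, simplified commit graph types. */
struct commit;
struct parent_list {
	struct commit *item;
	struct parent_list *next;
};
struct commit {
	int id;
	int interesting;
	struct parent_list *parents;
};

enum rewrite_result { REWRITE_OK, REWRITE_NOPARENTS, REWRITE_ERROR };

/* The per-parent policy is supplied by the caller. */
typedef enum rewrite_result (*rewrite_parent_fn)(struct commit **pp);

/*
 * Apply `fn` to every parent of `c`.  Parents for which the callback
 * reports REWRITE_NOPARENTS are unlinked from the list (not freed:
 * ownership stays with the caller in this sketch).
 */
int rewrite_parents_sketch(struct commit *c, rewrite_parent_fn fn)
{
	struct parent_list **pp = &c->parents;

	while (*pp) {
		struct parent_list *p = *pp;

		switch (fn(&p->item)) {
		case REWRITE_OK:
			pp = &p->next;
			break;
		case REWRITE_NOPARENTS:
			*pp = p->next;
			break;
		case REWRITE_ERROR:
			return -1;
		}
	}
	return 0;
}

/* Example policy: follow first parents until an interesting commit. */
enum rewrite_result skip_uninteresting(struct commit **pp)
{
	for (;;) {
		struct commit *p = *pp;

		if (p->interesting)
			return REWRITE_OK;
		if (!p->parents)
			return REWRITE_NOPARENTS;
		*pp = p->parents->item;
	}
}
```

      With this split, a different caller can pass its own policy function
      without touching the loop over the parents.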
      Signed-off-by: Bo Yang <[email protected]>
      Signed-off-by: Thomas Rast <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
    • Refactor parse_loc · 25ed3412
      Bo Yang authored
      We want to use the same style of -L n,m argument for 'git log -L' as
      for git-blame.  Refactor the argument parsing of the range arguments
      from builtin/blame.c to the (new) file that will hold the 'git log -L'
      logic.
      
      To accommodate different data structures in blame and log -L, the file
      contents are abstracted away; parse_range_arg takes a callback that it
      uses to get the contents of a line of the (notional) file.
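
      A minimal sketch of the callback idea, under assumptions: the
      function find_line_containing() and the struct line_array backing
      store are hypothetical, and lines are taken to be NUL-terminated
      strings, which need not match the real blame or log -L data
      structures.

```c
#include <stddef.h>
#include <string.h>

/* Callback returning the contents of 1-based line `lno`, or NULL. */
typedef const char *(*nth_line_fn)(void *data, long lno);

/*
 * Return the number of the first line at or after `from` that contains
 * `needle`, or -1 if there is none.  The parser never looks at the
 * caller's data structure directly; it only asks for lines.
 */
long find_line_containing(nth_line_fn nth_line, void *data,
			  long lines, long from, const char *needle)
{
	long lno;

	for (lno = from; lno <= lines; lno++) {
		const char *line = nth_line(data, lno);

		if (line && strstr(line, needle))
			return lno;
	}
	return -1;
}

/* One possible backing store: an array of NUL-terminated lines. */
struct line_array {
	const char **lines;
	long nr;
};

const char *line_array_nth(void *data, long lno)
{
	struct line_array *a = data;

	return (lno >= 1 && lno <= a->nr) ? a->lines[lno - 1] : NULL;
}
```

      A second caller with a different representation of the file would
      only need to provide its own callback in place of line_array_nth().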
      
      The new test is for a case that made me pause during debugging: the
      'blame -L with invalid end' test was the only one that noticed an
      outright failure to parse the end *at all*.  So make a more explicit
      test for that.
      Signed-off-by: Bo Yang <[email protected]>
      Signed-off-by: Thomas Rast <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
  2. 27 Feb, 2013 7 commits
  3. 26 Feb, 2013 4 commits
  4. 25 Feb, 2013 25 commits
  5. 22 Feb, 2013 1 commit
    • Provide a mechanism to turn off symlink resolution in ceiling paths · 7ec30aaa
      Michael Haggerty authored
      Commit 1b77d83c 'setup_git_directory_gently_1(): resolve symlinks
      in ceiling paths' changed the setup code to resolve symlinks in the
      entries in GIT_CEILING_DIRECTORIES.  Because those entries are
      compared textually to the symlink-resolved current directory, an
      entry in GIT_CEILING_DIRECTORIES that contained a symlink would
      otherwise have no effect.  It was known that this could cause
      performance problems
      if the symlink resolution *itself* touched slow filesystems, but it
      was thought that such use cases would be unlikely.  The intention of
      the earlier change was to deal with a case when the user has this:
      
      	GIT_CEILING_DIRECTORIES=/home/gitster
      
      but in reality, /home/gitster is a symbolic link to somewhere else,
      e.g. /net/machine/home4/gitster. A textual comparison between the
      specified value /home/gitster and the location getcwd(3) returns
      would not help us, but readlink("/home/gitster") would still be
      fast.
      
      After this change was released, Anders Kaseorg <[email protected]>
      reported:
      
      > [...] my computer has been acting so slow when I’m not connected to
      > the network.  I put various network filesystem paths in
      > $GIT_CEILING_DIRECTORIES, such as
      > /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents
      > /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and
      > /afs/athena.mit.edu/user/a/n which all live in different AFS
      > volumes).  Now when I’m not connected to the network, every
      > invocation of Git, including the __git_ps1 in my shell prompt, waits
      > for AFS to timeout.
      
      To allow users to work around this problem, give them a mechanism to
      turn off symlink resolution in GIT_CEILING_DIRECTORIES entries.  All
      the entries that follow an empty entry will not be checked for symbolic
      links and will be used literally in the comparison.  E.g. with these:
      
      	GIT_CEILING_DIRECTORIES=:/foo/bar:/xyzzy or
      	GIT_CEILING_DIRECTORIES=/foo/bar::/xyzzy
      
      we will not readlink("/xyzzy") because it comes after an empty entry.
      
      With the former (but not with the latter), "/foo/bar" comes after an
      empty entry, and we will not readlink it, either.
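
      The rule can be sketched as a small parser.  This is an illustration
      under assumptions, not the code in setup.c: struct ceiling_entry and
      parse_ceiling_dirs() are hypothetical names, the separator is fixed
      to ':', and over-long entries are silently skipped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one parsed entry. */
struct ceiling_entry {
	char path[256];
	int resolve_symlinks; /* 0 once an empty entry has been seen */
};

/*
 * Split a colon-separated list.  Entries before the first empty entry
 * are marked for symlink resolution; entries after it are to be used
 * literally.  Returns the number of entries stored in `out`.
 */
size_t parse_ceiling_dirs(const char *value, struct ceiling_entry *out,
			  size_t max)
{
	size_t n = 0;
	int resolve = 1;
	const char *p = value;

	for (;;) {
		const char *sep = strchr(p, ':');
		size_t len = sep ? (size_t)(sep - p) : strlen(p);

		if (len == 0) {
			resolve = 0; /* empty entry: stop resolving from here on */
		} else if (n < max && len < sizeof(out->path)) {
			memcpy(out[n].path, p, len);
			out[n].path[len] = '\0';
			out[n].resolve_symlinks = resolve;
			n++;
		}
		if (!sep)
			break;
		p = sep + 1;
	}
	return n;
}
```

      Given ":/foo/bar:/xyzzy", both entries come back with
      resolve_symlinks set to 0; given "/foo/bar::/xyzzy", only "/xyzzy"
      does.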
      Signed-off-by: Michael Haggerty <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>