1. 15 Apr, 2013 2 commits
    • dir.c: unify is_excluded and is_path_excluded APIs · 95c6f271
      Karsten Blees authored
      The is_excluded and is_path_excluded APIs are very similar, except for a
      few noteworthy differences:
      
      is_excluded doesn't handle ignored directories, so results for paths within
      ignored directories are incorrect. This is probably based on the premise
      that recursive directory scans should stop at ignored directories, which
      is no longer true (in certain cases, read_directory_recursive currently
      calls is_excluded *and* is_path_excluded to get correct ignored state).
      
      is_excluded caches parsed .gitignore files of the last directory in struct
      dir_struct. If the directory changes, it finds a common parent directory
      and is very careful to drop only as much state as necessary. On the other
      hand, is_excluded will also read and parse .gitignore files in already
      ignored directories, which are completely irrelevant.
      
      is_path_excluded correctly handles ignored directories by checking if any
      component in the path is excluded. As it uses is_excluded internally, this
      unfortunately forces is_excluded to drop and re-read all .gitignore files,
      as there is no common parent directory for the root dir.
      
      is_path_excluded tracks state in a separate struct path_exclude_check,
      which is essentially a wrapper of dir_struct with two more fields. However,
      as is_path_excluded also modifies dir_struct, it is not possible to e.g.
      use multiple path_exclude_check structures with the same dir_struct in
      parallel. The additional structure just unnecessarily complicates the API.
      
      Teach is_excluded / prep_exclude about ignored directories: whenever
      entering a new directory, first check if the entire directory is excluded.
      Remember the excluded state in dir_struct. Don't traverse into already
      ignored directories (i.e. don't read irrelevant .gitignore files).
      
      Directories could also be excluded by exclude patterns specified on the
      command line or .git/info/exclude, so we cannot simply skip prep_exclude
      entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
      Move this check to just before actually reading the file.
      
      is_path_excluded is now equivalent to is_excluded, so we can simply
      redirect to it (the public API is cleaned up in the next patch).
      
      The performance impact of the additional ignored check per directory is
      hardly noticeable when reading directories recursively (e.g. 'git status').
      However, performance of git commands using the is_path_excluded API (e.g.
      'git ls-files --cached --ignored --exclude-standard') is greatly improved
      as this no longer re-reads .gitignore files on each call.
      
      Here's some performance data from the linux and WebKit repos (best of 10
      runs on a Debian Linux on SSD, core.preloadIndex=true):
      
             | ls-files -ci   |    status      | status --ignored
             | linux | WebKit | linux | WebKit | linux | WebKit
      -------+-------+--------+-------+--------+-------+---------
      before | 0.506 |  6.539 | 0.212 |  1.555 | 0.323 |  2.541
      after  | 0.080 |  1.191 | 0.218 |  1.583 | 0.321 |  2.579
      gain   | 6.325 |  5.490 | 0.972 |  0.982 | 1.006 |  0.985
      Signed-off-by: Karsten Blees <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
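The idea of remembering a directory's excluded state, rather than recomputing it for every path, can be sketched roughly as follows. This is a toy model, not git's code: `toy_dir_state`, `toy_dir_is_ignored` and `toy_is_excluded` are hypothetical names, the only "pattern" is a hard-coded directory name `build`, and per-file patterns are left out entirely.

```c
#include <string.h>

/* Hypothetical stand-in for the state git keeps in dir_struct: the
 * directory of the last checked path and its cached excluded state. */
struct toy_dir_state {
    char dirname[256];  /* directory part of the last checked path */
    int dir_excluded;   /* cached result for that directory */
    int valid;          /* non-zero once the cache holds a result */
    int evaluations;    /* how often the directory check actually ran */
};

/* Assumed rule for this sketch: a directory is ignored if any of its
 * path components is exactly "build". */
static int toy_dir_is_ignored(const char *dir, size_t len)
{
    size_t start = 0, i;

    for (i = 0; i <= len; i++) {
        if (i == len || dir[i] == '/') {
            if (i - start == 5 && !memcmp(dir + start, "build", 5))
                return 1;
            start = i + 1;
        }
    }
    return 0;
}

/* Returns 1 if the path lies in an ignored directory, 0 if not, and
 * -1 if the directory name does not fit into the cache buffer. */
static int toy_is_excluded(struct toy_dir_state *st, const char *path)
{
    const char *slash = strrchr(path, '/');
    size_t dirlen = slash ? (size_t)(slash - path) : 0;

    /* Re-evaluate only when the directory differs from the cached one. */
    if (!st->valid || strlen(st->dirname) != dirlen ||
        memcmp(st->dirname, path, dirlen)) {
        if (dirlen >= sizeof(st->dirname))
            return -1;
        memcpy(st->dirname, path, dirlen);
        st->dirname[dirlen] = '\0';
        st->dir_excluded = toy_dir_is_ignored(path, dirlen);
        st->valid = 1;
        st->evaluations++;
    }
    return st->dir_excluded;
}
```

Checking several paths from the same directory in a row triggers only one evaluation in this model, which is the effect the commit message attributes to no longer re-reading .gitignore files on each call.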
  2. 06 Jan, 2013 4 commits
    • dir.c: improve docs for match_pathspec() and match_pathspec_depth() · 52ed1894
      Adam Spiers authored
      Fix a grammatical issue in the description of these functions, and
      make it more obvious how and why seen[] can be reused across multiple
      invocations.
      Signed-off-by: Adam Spiers <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
    • dir.c: provide clear_directory() for reclaiming dir_struct memory · 270be816
      Adam Spiers authored
      By the end of a directory traversal, a dir_struct instance will
      typically contain pointers to various data structures on the heap.
      clear_directory() provides a convenient way to reclaim that memory.
      Signed-off-by: Adam Spiers <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
    • dir.c: keep track of where patterns came from · c04318e4
      Adam Spiers authored
      For exclude patterns read in from files, the filename is stored in the
      exclude list, and the originating line number is stored in the
      individual exclude (counting starting at 1).
      
      For exclude patterns provided on the command line, a string describing
      the source of the patterns is stored in the exclude list, and the
      sequence number assigned to each exclude pattern is negative, with
      counting starting at -1.  So for example the 2nd pattern provided via
      --exclude would be numbered -2.  This allows any future consumers of
      that data to easily distinguish between exclude patterns from files
      vs. from the CLI.
      Signed-off-by: Adam Spiers <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
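The numbering scheme described in this commit can be illustrated with a small sketch. All names here (`toy_exclude`, `toy_exclude_list`, `toy_add_cli`, `toy_add_file_lines`, `toy_from_cli`) are hypothetical and the fixed-size array is a simplification. Skipping blank lines and `#` comments is an assumption of the sketch, not something the commit message states.

```c
#define TOY_MAX_PATTERNS 8

/* One pattern plus its position within its source:
 * > 0: line number in a file (counting from 1)
 * < 0: sequence number on the command line (counting from -1) */
struct toy_exclude {
    const char *pattern;
    int srcpos;
};

/* All patterns from one source; src describes that source. */
struct toy_exclude_list {
    const char *src;
    struct toy_exclude items[TOY_MAX_PATTERNS];
    int nr;
};

static int toy_add(struct toy_exclude_list *el, const char *pattern, int srcpos)
{
    if (el->nr >= TOY_MAX_PATTERNS)
        return -1;
    el->items[el->nr].pattern = pattern;
    el->items[el->nr].srcpos = srcpos;
    el->nr++;
    return 0;
}

/* Command-line patterns get -1, -2, ... in the order they were given. */
static int toy_add_cli(struct toy_exclude_list *el, const char *pattern)
{
    return toy_add(el, pattern, -(el->nr + 1));
}

/* File patterns keep their original line number, so lines skipped here
 * (blank lines and '#' comments) still advance the count. */
static void toy_add_file_lines(struct toy_exclude_list *el,
                               const char *const *lines, int nlines)
{
    int i;

    for (i = 0; i < nlines; i++) {
        if (lines[i][0] == '\0' || lines[i][0] == '#')
            continue;
        toy_add(el, lines[i], i + 1);
    }
}

/* A consumer can tell the two kinds of source apart by the sign alone. */
static int toy_from_cli(const struct toy_exclude *x)
{
    return x->srcpos < 0;
}
```

With this layout, the second pattern added through `toy_add_cli` carries `srcpos == -2`, matching the `--exclude` example in the commit message.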
    • dir.c: use a single struct exclude_list per source of excludes · c082df24
      Adam Spiers authored
      Previously each exclude_list could potentially contain patterns
      from multiple sources.  For example dir->exclude_list[EXC_FILE]
      would typically contain patterns from .git/info/exclude and
      core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
      patterns from multiple per-directory .gitignore files during
      directory traversal (i.e. when dir->exclude_stack was more than
      one item deep).
      
      We split these composite exclude_lists up into three groups of
      exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
      exclude_list now contains patterns from a single source.  This will
      allow us to cleanly track the origin of each pattern simply by adding
      a src field to struct exclude_list, rather than to struct exclude,
      which would make memory management of the source string tricky in the
      EXC_DIRS case where its contents are dynamically generated.
      
      Similarly, by moving the filebuf member from struct exclude_stack to
      struct exclude_list, it allows us to track and subsequently free
      memory buffers allocated during the parsing of all exclude files,
      rather than only tracking buffers allocated for files in the EXC_DIRS
      group.
      Signed-off-by: Adam Spiers <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
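A rough sketch of the layout described above: each group holds several lists, each list belongs to exactly one source, and each list owns the buffer its patterns were parsed from. The `toy_` names and the growth strategy are assumptions made for illustration; they are not taken from dir.c.

```c
#include <stdlib.h>
#include <string.h>

/* Patterns from exactly one source (one file, or the command line). */
struct toy_exclude_list {
    const char *src;  /* where the patterns came from */
    char *filebuf;    /* owned copy of the file contents, NULL if none */
};

/* One group per kind of source, e.g. command line, per-directory
 * files, or global files. */
struct toy_exclude_list_group {
    struct toy_exclude_list *el;
    int nr, alloc;
};

/* Append a new list for one source. contents may be NULL for sources
 * without a backing file. The returned pointer is only valid until the
 * next call, because the array may be reallocated. */
static struct toy_exclude_list *toy_add_list(struct toy_exclude_list_group *g,
                                             const char *src,
                                             const char *contents)
{
    struct toy_exclude_list *el;

    if (g->nr == g->alloc) {
        int n = g->alloc ? 2 * g->alloc : 4;
        struct toy_exclude_list *p = realloc(g->el, n * sizeof(*p));
        if (!p)
            return NULL;
        g->el = p;
        g->alloc = n;
    }
    el = &g->el[g->nr];
    el->src = src;
    el->filebuf = NULL;
    if (contents) {
        el->filebuf = malloc(strlen(contents) + 1);
        if (!el->filebuf)
            return NULL;
        strcpy(el->filebuf, contents);
    }
    g->nr++;
    return el;
}

/* Because every list tracks its own buffer, freeing a group releases
 * the buffers of all files in it, not just per-directory ones. */
static void toy_clear_group(struct toy_exclude_list_group *g)
{
    int i;

    for (i = 0; i < g->nr; i++)
        free(g->el[i].filebuf);
    free(g->el);
    g->el = NULL;
    g->nr = g->alloc = 0;
}
```

Keeping `src` on the list rather than on each pattern means one string per source, which is the memory-management simplification the commit message argues for.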
  3. 28 Dec, 2012 7 commits
  4. 26 Nov, 2012 2 commits
  5. 15 Oct, 2012 2 commits
  6. 07 Jun, 2012 1 commit
  7. 06 Jun, 2012 2 commits
  8. 03 Jun, 2012 1 commit
    • ls-files -i: pay attention to exclusion of leading paths · eb41775e
      Junio C Hamano authored
      "git ls-files --exclude=t/ -i" does not show paths in directory t/
      that have been added to the index, but it should.
      
      The excluded() API was designed for callers who walk the tree from
      the top, checking at each level of the directory hierarchy whether
      it is excluded as they descend, and not even bothering to recurse
      into an excluded directory.  This allows us to optimize for a common
      case by not having to check if the exclude pattern "foo/" matches
      when looking at "foo/bar", because the caller should have noticed
      that "foo" is excluded and never read "foo/bar" out of
      opendir()/readdir() in the first place.
      
      The code for "ls-files -i" however walks the index linearly, feeding
      paths without checking if the leading directory is already excluded.
      
      Introduce a helper function path_excluded() so that this caller can
      run the excluded() check on the leading directories as necessary.
      Signed-off-by: Junio C Hamano <g[email protected]>
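A minimal sketch of the leading-path check, under loud assumptions: `toy_excluded` stands in for excluded() with a single hard-coded pattern equivalent to `--exclude=t/`, it ignores the file/directory distinction, and `toy_path_excluded` is a hypothetical name for the helper.

```c
#include <string.h>

/* Stand-in for excluded(): only looks at the path it is handed and
 * matches when that path is exactly "t". */
static int toy_excluded(const char *path, size_t len)
{
    return len == 1 && path[0] == 't';
}

/* Check every leading directory first, then the full path. A caller
 * walking the index linearly needs this, because nobody has checked
 * the parent directories on its behalf. */
static int toy_path_excluded(const char *path)
{
    size_t i, len = strlen(path);

    for (i = 0; i < len; i++)
        if (path[i] == '/' && toy_excluded(path, i))
            return 1;
    return toy_excluded(path, len);
}
```

Calling `toy_excluded` directly on `t/t0000-basic.sh` reports no match, while `toy_path_excluded` finds the excluded leading directory `t`, mirroring the `ls-files -i` bug described above.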
  9. 15 Mar, 2012 1 commit
  10. 12 Sep, 2011 1 commit
  11. 06 Sep, 2011 1 commit
  12. 29 Mar, 2011 2 commits
  13. 03 Feb, 2011 2 commits
  14. 29 Nov, 2010 1 commit
  15. 06 Oct, 2010 1 commit
    • Add string comparison functions that respect the ignore_case variable. · 8cf2a84e
      Joshua Jensen authored
      Multiple locations within this patch series replace a case-sensitive
      string comparison call such as strcmp() with a call to a comparison
      function that chooses case sensitivity based on the global
      ignore_case variable. Behaviorally, when core.ignorecase=false, the
      *_icase() versions are functionally equivalent to their C runtime
      counterparts.  When core.ignorecase=true, the *_icase() versions perform
      a case-insensitive comparison.
      
      Like Linus' earlier ignorecase patch, these may ignore filename
      conventions on certain file systems. By isolating filename comparisons
      to certain functions, support for those filename conventions may be more
      easily met.
      Signed-off-by: Joshua Jensen <[email protected]>
      Signed-off-by: Johannes Sixt <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
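The behavior described for the *_icase() functions can be sketched as thin wrappers that switch on a global flag. The `toy_` prefix marks these as illustrative names rather than the functions added by the patch, and `strcasecmp()`/`strncasecmp()` are assumed to be available from POSIX `<strings.h>`.

```c
#include <string.h>
#include <strings.h>  /* strcasecmp, strncasecmp (POSIX) */

/* Stand-in for the global ignore_case variable, which git sets from
 * core.ignorecase. */
static int toy_ignore_case;

/* Same as strcmp() when the flag is off, case-insensitive when on. */
static int toy_strcmp_icase(const char *a, const char *b)
{
    return toy_ignore_case ? strcasecmp(a, b) : strcmp(a, b);
}

/* Length-limited variant with the same switch. */
static int toy_strncmp_icase(const char *a, const char *b, size_t n)
{
    return toy_ignore_case ? strncasecmp(a, b, n) : strncmp(a, b, n);
}
```

Routing all filename comparisons through wrappers like these keeps the case policy in one place, which is the isolation the commit message mentions.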
  16. 12 Jul, 2010 1 commit
  17. 24 Aug, 2009 1 commit
  18. 29 Jul, 2009 1 commit
  19. 09 Jul, 2009 2 commits
    • Simplify read_directory[_recursive]() arguments · dba2e203
      Linus Torvalds authored
      Stop the insanity with separate 'path' and 'base' arguments that must
      match.  We don't need that crazy interface any more, since we cleaned up
      handling of 'path' in commit da4b3e8c.
      Signed-off-by: Linus Torvalds <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
    • Add 'fill_directory()' helper function for directory traversal · 1d8842d9
      Linus Torvalds authored
      Most of the users of "read_directory()" actually want a much simpler
      interface than the whole complex (but rather powerful) one.
      
      In fact 'git add' had already largely abstracted out the core interface
      issues into a private "fill_directory()" function that was largely
      applicable almost as-is to a number of callers.  Yes, 'git add' wants to
      do some extra work of its own, specific to the add semantics, but we can
      easily split that out, and use the core as a generic function.
      
      This function does exactly that, and now that much simplified
      'fill_directory()' function can be shared with a number of callers,
      while also ensuring that the rather more complex calling conventions of
      read_directory() are used by fewer call-sites.
      
      This also makes the 'common_prefix()' helper function private to dir.c,
      since all callers are now in that file.
      Signed-off-by: Linus Torvalds <[email protected]>
      Signed-off-by: Junio C Hamano <[email protected]>
  20. 18 Feb, 2009 1 commit
  21. 11 Jan, 2009 2 commits
  22. 03 Oct, 2008 1 commit
  23. 29 Sep, 2008 1 commit