1. 22 Feb, 2016 1 commit
    • Jeff King's avatar
      use xmallocz to avoid size arithmetic · 3733e694
      Jeff King authored
      We frequently allocate strings as xmalloc(len + 1), where
      the extra 1 is for the NUL terminator. This can be done more
      simply with xmallocz, which also checks for integer
      There's no case where switching xmalloc(n+1) to xmallocz(n)
      is wrong; the result is the same length, and malloc made no
      guarantees about what was in the buffer anyway. But in some
      cases, we can stop manually placing NUL at the end of the
      allocated buffer. But that's only safe if it's clear that
      the contents will always fill the buffer.
      In each case where this patch does so, I manually examined
      the control flow, and I tried to err on the side of caution.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  2. 25 Sep, 2015 1 commit
    • Jeff King's avatar
      entry.c: convert strcpy to xsnprintf · 330c8e26
      Jeff King authored
      This particular conversion is non-obvious, because nobody
      has passed our function the length of the destination
      buffer. However, the interface to checkout_entry specifies
      that the buffer must be at least TEMPORARY_FILENAME_LENGTH
      bytes long, so we can check that (meaning the existing code
      was not buggy, but merely worrisome to somebody reading it).
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  3. 13 Jun, 2014 2 commits
  4. 13 Mar, 2014 2 commits
  5. 24 Oct, 2013 2 commits
    • Junio C Hamano's avatar
      checkout_entry(): clarify the use of topath[] parameter · af2a651d
      Junio C Hamano authored
      The said function has this signature:
      	extern int checkout_entry(struct cache_entry *ce,
      				  const struct checkout *state,
      				  char *topath);
      At first glance, it might appear that the caller of checkout_entry()
      can specify to which path the contents are written out by the last
      parameter, and it is tempting to add "const" in front of its type.
      In reality, however, topath[] is to point at a buffer to store the
      temporary path generated by the callchain originating from this
      function, and the temporary path is always short, much shorter than
      the buffer prepared by its only caller in builtin/checkout-index.c.
      Document the code a bit to clarify so that future callers know how
      to use the function better.
      Noticed-by: Duy Nguyen's avatarNguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Duy Nguyen's avatar
      entry.c: convert checkout_entry to use strbuf · fd356f6a
      Duy Nguyen authored
      The old code does not do boundary check so any paths longer than
      PATH_MAX can cause buffer overflow. Replace it with strbuf to handle
      paths of arbitrary length.
      The OS may reject if the path is too long though. But in that case we
      report the cause (e.g. name too long) and usually move on to checking
      out the next entry.
      Signed-off-by: Duy Nguyen's avatarNguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  6. 18 Jul, 2013 1 commit
  7. 09 Jul, 2013 1 commit
    • Duy Nguyen's avatar
      Convert "struct cache_entry *" to "const ..." wherever possible · 9c5e6c80
      Duy Nguyen authored
      I attempted to make index_state->cache[] a "const struct cache_entry **"
      to find out how existing entries in index are modified and where. The
      question I have is what do we do if we really need to keep track of on-disk
      changes in the index. The result is
       - diff-lib.c: setting CE_UPTODATE
       - name-hash.c: setting CE_HASHED
       - preload-index.c, read-cache.c, unpack-trees.c and
         builtin/update-index: obvious
       - entry.c: write_entry() may refresh the checked out entry via
         fill_stat_cache_info(). This causes "non-const struct cache_entry
         *" in builtin/apply.c, builtin/checkout-index.c and
       - builtin/ls-files.c: --with-tree changes stagemask and may set
      Of these, write_entry() and its call sites are probably most
      interesting because it modifies on-disk info. But this is stat info
      and can be retrieved via refresh, at least for porcelain
      commands. Other just uses ce_flags for local purposes.
      So, keeping track of "dirty" entries is just a matter of setting a
      flag in index modification functions exposed by read-cache.c. Except
      unpack-trees, the rest of the code base does not do anything funny
      behind read-cache's back.
      The actual patch is less valueable than the summary above. But if
      anyone wants to re-identify the above sites. Applying this patch, then
          diff --git a/cache.h b/cache.h
          index 430d021..1692891 100644
          --- a/cache.h
          +++ b/cache.h
          @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
           #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)
           struct index_state {
          -	struct cache_entry **cache;
          +	const struct cache_entry **cache;
           	unsigned int version;
           	unsigned int cache_nr, cache_alloc, cache_changed;
           	struct string_list *resolve_undo;
      will help quickly identify them without bogus warnings.
      Signed-off-by: Duy Nguyen's avatarNguyễn Thái Ngọc Duy <pclouds@gmail.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  8. 27 Mar, 2013 1 commit
    • Jeff King's avatar
      streaming_write_entry: propagate streaming errors · d9c31e14
      Jeff King authored
      When we are streaming an index blob to disk, we store the
      error from stream_blob_to_fd in the "result" variable, and
      then immediately overwrite that with the return value of
      "close". That means we catch errors on close (e.g., problems
      committing the file to disk), but miss anything which
      happened before then.
      We can fix this by using bitwise-OR to accumulate errors in
      our result variable.
      While we're here, we can also simplify the error handling
      with an early return, which makes it easier to see under
      which circumstances we need to clean up.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  9. 14 Mar, 2013 1 commit
    • John Keeping's avatar
      entry: fix filter lookup · 7297a440
      John Keeping authored
      When looking up the stream filter, write_entry() should be passing the
      path of the file in the repository, not the path to which the content is
      going to be written.  This allows the file to be correctly looked up
      against the .gitattributes files in the working tree.
      This change makes the streaming case match the non-streaming case which
      passes ce->name to convert_to_working_tree later in the same function.
      The two tests added here test the different paths through write_entry
      since the CRLF filter is a streaming filter but the user-defined smudge
      filter is not streamed.
      Signed-off-by: John Keeping's avatarJohn Keeping <john@keeping.me.uk>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  10. 07 Mar, 2012 1 commit
  11. 26 May, 2011 1 commit
    • Junio C Hamano's avatar
      Add streaming filter API · b6691092
      Junio C Hamano authored
      This introduces an API to plug custom filters to an input stream.
      The caller gets get_stream_filter("path") to obtain an appropriate
      filter for the path, and then uses it when opening an input stream
      via open_istream().  After that, the caller can read from the stream
      with read_istream(), and close it with close_istream(), just like an
      unfiltered stream.
      This only adds a "null" filter that is a pass-thru filter, but later
      changes can add LF-to-CRLF and other filters, and the callers of the
      streaming API do not have to change.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  12. 21 May, 2011 3 commits
    • Junio C Hamano's avatar
      streaming_write_entry(): support files with holes · de6182db
      Junio C Hamano authored
      One typical use of a large binary file is to hold a sparse on-disk hash
      table with a lot of holes. Help preserving the holes with lseek().
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      streaming_write_entry(): use streaming API in write_entry() · dd8e9121
      Junio C Hamano authored
      When the output to a path does not have to be converted, we can read from
      the object database from the streaming API and write to the file in the
      working tree, without having to hold everything in the memory.
      The ident, auto- and safe- crlf conversions inherently require you to read
      the whole thing before deciding what to do, so while it is technically
      possible to support them by using a buffer of an unbound size or rewinding
      and reading the stream twice, it is less practical than the traditional
      "read the whole thing in core and convert" approach.
      Adding streaming filters for the other conversions on top of this should
      be doable by tweaking the can_bypass_conversion() function (it should be
      renamed to can_filter_stream() when it happens). Then the streaming API
      can be extended to wrap the git_istream streaming_write_entry() opens on
      the underlying object in another git_istream that reads from it, filters
      what is read, and let the streaming_write_entry() read the filtered
      result. But that is outside the scope of this series.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      write_entry(): separate two helper functions out · fd5db55d
      Junio C Hamano authored
      In the write-out codepath, a block of code determines what file in the
      working tree to write to, and opens an output file descriptor to it.
      After writing the contents out to the file, another block of code runs
      fstat() on the file descriptor when appropriate.
      Separate these blocks out to open_output_fd() and fstat_output()
      helper functions.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  13. 29 Nov, 2010 1 commit
  14. 12 Jan, 2010 1 commit
  15. 14 Dec, 2009 1 commit
  16. 18 Aug, 2009 1 commit
    • Junio C Hamano's avatar
      check_path(): allow symlinked directories to checkout-index --prefix · da02ca50
      Junio C Hamano authored
      Merlyn noticed that Documentation/install-doc-quick.sh no longer correctly
      removes old installed documents when the target directory has a leading
      path that is a symlink.  It turns out that "checkout-index --prefix" was
      broken by recent b6986d8a (git-checkout: be careful about untracked
      symlinks, 2009-07-29).
      I suspect has_symlink_leading_path() could learn the third parameter
      (prefix that is allowed to be symlinked directories) to allow us to retire
      a similar function has_dirs_only_path().
      Another avenue of fixing this I considered was to get rid of base_dir and
      base_dir_len from "struct checkout", and instead make "git checkout-index"
      when run with --prefix mkdir the leading path and chdir in there.  It
      might be the best longer term solution to this issue, as the base_dir
      feature is used only by that rather obscure codepath as far as I know.
      But at least this patch should fix this breakage.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  17. 30 Jul, 2009 1 commit
  18. 27 Jun, 2009 2 commits
  19. 30 Apr, 2009 1 commit
    • Alex Riesen's avatar
      replace direct calls to unlink(2) with unlink_or_warn · 691f1a28
      Alex Riesen authored
      This helps to notice when something's going wrong, especially on
      systems which lock open files.
      I used the following criteria when selecting the code for replacement:
      - it was already printing a warning for the unlink failures
      - it is in a function which already printing something or is
        called from such a function
      - it is in a static function, returning void and the function is only
        called from a builtin main function (cmd_)
      - it is in a function which handles emergency exit (signal handlers)
      - it is in a function which is obvously cleaning up the lockfiles
      Signed-off-by: default avatarAlex Riesen <raa.lkml@gmail.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  20. 20 Apr, 2009 1 commit
    • Johannes Sixt's avatar
      Windows: Skip fstat/lstat optimization in write_entry() · 34779c53
      Johannes Sixt authored
      Commit e4c72923 (write_entry(): use fstat() instead of lstat() when file
      is open, 2009-02-09) introduced an optimization of write_entry().
      Unfortunately, we cannot take advantage of this optimization on Windows
      because there is no guarantee that the time stamps are updated before the
      file is closed:
        "The only guarantee about a file timestamp is that the file time is
         correctly reflected when the handle that makes the change is closed."
      The failure of this optimization on Windows can be observed most easily by
      running a 'git checkout' that has to update several large files. In this
      case, 'git checkout' will report modified files, but infact only the
      timestamps were incorrectly recorded in the index, as can be verified by a
      subsequent 'git diff', which shows no change.
      Dmitry Potapov reports the same fix needs on Cygwin; this commit contains
      his updates for that.
      Signed-off-by: default avatarJohannes Sixt <j6t@kdbg.org>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  21. 10 Feb, 2009 4 commits
  22. 18 Jan, 2009 1 commit
    • Kjetil Barvik's avatar
      lstat_cache(): introduce has_dirs_only_path() function · bad4a54f
      Kjetil Barvik authored
      The create_directories() function in entry.c currently calls stat()
      or lstat() for each path component of the pathname 'path' each and every
      time.  For the 'git checkout' command, this function is called on each
      file for which we must do an update (ce->ce_flags & CE_UPDATE), so we get
      lots and lots of calls.
      To fix this, we make a new wrapper to the lstat_cache() function, and
      call the wrapper function instead of the calls to the stat() or the
      lstat() functions.  Since the paths given to the create_directories()
      function, is sorted alphabetically, the new wrapper would be very
      cache effective in this situation.
      To support it we must update the lstat_cache() function to be able to
      say that "please test the complete length of 'name'", and also to give
      it the length of a prefix, where the cache should use the stat()
      function instead of the lstat() function to test each path component.
      Thanks to Junio C Hamano, Linus Torvalds and Rene Scharfe for valuable
      comments to this patch!
      Signed-off-by: default avatarKjetil Barvik <barvik@broadpark.no>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  23. 11 Jan, 2009 1 commit
  24. 31 Aug, 2008 1 commit
  25. 19 Mar, 2008 1 commit
    • Linus Torvalds's avatar
      Fix possible Solaris problem in 'checkout_entry()' · 971f229c
      Linus Torvalds authored
      Currently when checking out an entry "path", we try to unlink(2) it first
      (because there could be stale file), and if there is a directory there,
      try to deal with it (typically we run recursive rmdir).  We ignore the
      error return from this unlink because there may not even be any file
      However if you are root on Solaris, you can unlink(2) a directory
      successfully and corrupt your filesystem.
      This moves the code around and check the directory first, and then
      unlink(2).  Also we check the error code from it.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  26. 21 Jan, 2008 1 commit
    • Linus Torvalds's avatar
      Make on-disk index representation separate from in-core one · 7a51ed66
      Linus Torvalds authored
      This converts the index explicitly on read and write to its on-disk
      format, allowing the in-core format to contain more flags, and be
      In particular, the in-core format is now host-endian (as opposed to the
      on-disk one that is network endian in order to be able to be shared
      across machines) and as a result we can dispense with all the
      htonl/ntohl on accesses to the cache_entry fields.
      This will make it easier to make use of various temporary flags that do
      not exist in the on-disk format.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  27. 10 Nov, 2007 1 commit
    • Junio C Hamano's avatar
      ce_match_stat, run_diff_files: use symbolic constants for readability · 4bd5b7da
      Junio C Hamano authored
      ce_match_stat() can be told:
       (1) to ignore CE_VALID bit (used under "assume unchanged" mode)
           and perform the stat comparison anyway;
       (2) not to perform the contents comparison for racily clean
           entries and report mismatch of cached stat information;
      using its "option" parameter.  Give them symbolic constants.
      Similarly, run_diff_files() can be told not to report anything
      on removed paths.  Also give it a symbolic constant for that.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  28. 22 Oct, 2007 1 commit
  29. 29 Sep, 2007 1 commit
    • Pierre Habouzit's avatar
      strbuf change: be sure ->buf is never ever NULL. · b315c5c0
      Pierre Habouzit authored
      For that purpose, the ->buf is always initialized with a char * buf living
      in the strbuf module. It is made a char * so that we can sloppily accept
      things that perform: sb->buf[0] = '\0', and because you can't pass "" as an
      initializer for ->buf without making gcc unhappy for very good reasons.
      strbuf_init/_detach/_grow have been fixed to trust ->alloc and not ->buf
      as a consequence strbuf_detach is _mandatory_ to detach a buffer, copying
      ->buf isn't an option anymore, if ->buf is going to escape from the scope,
      and eventually be free'd.
      API changes:
        * strbuf_setlen now always works, so just make strbuf_reset a convenience
        * strbuf_detatch takes a size_t* optional argument (meaning it can be
          NULL) to copy the buffer's len, as it was needed for this refactor to
          make the code more readable, and working like the callers.
      Signed-off-by: default avatarPierre Habouzit <madcoder@debian.org>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  30. 17 Sep, 2007 1 commit
    • Pierre Habouzit's avatar
      Rewrite convert_to_{git,working_tree} to use strbuf's. · 5ecd293d
      Pierre Habouzit authored
      * Now, those functions take an "out" strbuf argument, where they store their
        result if any. In that case, it also returns 1, else it returns 0.
      * those functions support "in place" editing, in the sense that it's OK to
        call them this way:
          convert_to_git(path, sb->buf, sb->len, sb);
        When doable, conversions are done in place for real, else the strbuf
        content is just replaced with the new one, transparentely for the caller.
      If you want to create a new filter working this way, being the accumulation
      of filter1, filter2, ... filtern, then your meta_filter would be:
          int meta_filter(..., const char *src, size_t len, struct strbuf *sb)
              int ret = 0;
              ret |= filter1(...., src, len, sb);
              if (ret) {
                  src = sb->buf;
                  len = sb->len;
              ret |= filter2(...., src, len, sb);
              if (ret) {
                  src = sb->buf;
                  len = sb->len;
              return ret | filtern(..., src, len, sb);
      That's why subfilters the convert_to_* functions called were also rewritten
      to work this way.
      Signed-off-by: default avatarPierre Habouzit <madcoder@debian.org>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  31. 15 Aug, 2007 1 commit
    • Junio C Hamano's avatar
      attr.c: read .gitattributes from index as well. · 1a9d7e9b
      Junio C Hamano authored
      This makes .gitattributes files to be read from the index when
      they are not checked out to the work tree.  This is in line with
      the way we always allowed low-level tools to operate in sparsely
      checked out work tree in a reasonable way.
      It swaps the order of new file creation and converting the blob
      to work tree representation; otherwise when we are in the middle
      of checking out .gitattributes we would notice an empty but
      unwritten .gitattributes file in the work tree and will ignore
      the copy in the index.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>