1. 05 Sep, 2018 1 commit
    • Jeff King's avatar
      reopen_tempfile(): truncate opened file · 6c003d6f
      Jeff King authored
      We provide a reopen_tempfile() function, which is in turn
      used by reopen_lockfile().  The idea is that a caller may
      want to rewrite the tempfile without letting go of the lock.
      And that's what our one caller does: after running
      add--interactive, "commit -p" will update the cache-tree
      extension of the index and write out the result, all while
      holding the lock.
      
      However, because we open the file with only the O_WRONLY
      flag, the existing index content is left in place, and we
      overwrite it starting at position 0. If the new index after
      updating the cache-tree is smaller than the original, those
      final bytes are not overwritten and remain in the file. This
      results in a corrupt index, since those cruft bytes are
      interpreted as part of the trailing hash (or even as an
      extension, if there are enough bytes).
      
      This bug actually pre-dates reopen_tempfile(); the original
      code from 9c4d6c02 (cache-tree: Write updated cache-tree
      after commit, 2014-07-13) has the same bug, and those lines
      were eventually refactored into the tempfile module. Nobody
      noticed until now for two reasons:
      
       - the bug can only be triggered in interactive mode
         ("commit -p" or "commit -i")
      
       - the size of the index must shrink after updating the
         cache-tree, which implies a non-trivial deletion. Notice
         that the included test actually has to create a 2-deep
         hierarchy. A single level is not enough to actually cause
         shrinkage.
      
      The fix is to truncate the file before writing out the
      second index. We can do that at the caller by using
      ftruncate(). But we shouldn't have to do that. There is no
      other place in Git where we want to open a file and
      overwrite bytes, making reopen_tempfile() a confusing and
      error-prone interface. Let's pass O_TRUNC there, which gives
      callers the same state they had after initially opening the
      file or lock.
      
      It's possible that we could later add a caller that wants
      something else (e.g., to open with O_APPEND). But this is
      the only caller we've had in the history of the codebase.
      Let's punt on doing anything more clever until another one
      comes along.
      Reported-by: Luc Van Oostenryck's avatarLuc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      6c003d6f
  2. 22 Feb, 2018 1 commit
  3. 06 Sep, 2017 11 commits
    • Jeff King's avatar
      tempfile: auto-allocate tempfiles on heap · 076aa2cb
      Jeff King authored
      The previous commit taught the tempfile code to give up
      ownership over tempfiles that have been renamed or deleted.
      That makes it possible to use a stack variable like this:
      
        struct tempfile t;
      
        create_tempfile(&t, ...);
        ...
        if (!err)
                rename_tempfile(&t, ...);
        else
                delete_tempfile(&t);
      
      But doing it this way has a high potential for creating
      memory errors. The tempfile we pass to create_tempfile()
      ends up on a global linked list, and it's not safe for it to
      go out of scope until we've called one of those two
      deactivation functions.
      
      Imagine that we add an early return from the function that
      forgets to call delete_tempfile(). With a static or heap
      tempfile variable, the worst case is that the tempfile hangs
      around until the program exits (and some functions like
      setup_shallow_temporary rely on this intentionally, creating
      a tempfile and then leaving it for later cleanup).
      
      But with a stack variable as above, this is a serious memory
      error: the variable goes out of scope and may be filled with
      garbage by the time the tempfile code looks at it.  Let's
      see if we can make it harder to get this wrong.
      
      Since many callers need to allocate arbitrary numbers of
      tempfiles, we can't rely on static storage as a general
      solution. So we need to turn to the heap. We could just ask
      all callers to pass us a heap variable, but that puts the
      burden on them to call free() at the right time.
      
      Instead, let's have the tempfile code handle the heap
      allocation _and_ the deallocation (when the tempfile is
      deactivated and removed from the list).
      
      This changes the return value of all of the creation
      functions. For the cleanup functions (delete and rename),
      we'll add one extra bit of safety: instead of taking a
      tempfile pointer, we'll take a pointer-to-pointer and set it
      to NULL after freeing the object. This makes it safe to
      double-call functions like delete_tempfile(), as the second
      call treats the NULL input as a noop. Several callsites
      follow this pattern.
      
      The resulting patch does have a fair bit of noise, as each
      caller needs to be converted to handle:
      
        1. Storing a pointer instead of the struct itself.
      
        2. Passing the pointer instead of taking the struct
           address.
      
        3. Handling a "struct tempfile *" return instead of a file
           descriptor.
      
      We could play games to make this less noisy. For example, by
      defining the tempfile like this:
      
        struct tempfile {
      	struct heap_allocated_part_of_tempfile {
                      int fd;
                      ...etc
              } *actual_data;
        }
      
      Callers would continue to have a "struct tempfile", and it
      would be "active" only when the inner pointer was non-NULL.
      But that just makes things more awkward in the long run.
      There aren't that many callers, so we can simply bite
      the bullet and adjust all of them. And the compiler makes it
      easy for us to find them all.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      076aa2cb
    • Jeff King's avatar
      tempfile: remove deactivated list entries · 422a21c6
      Jeff King authored
      Once a "struct tempfile" is added to the global cleanup
      list, it is never removed. This means that its storage must
      remain valid for the lifetime of the program. For single-use
      tempfiles and locks, this isn't a big deal: we just declare
      the struct static. But for library code which may take
      multiple simultaneous locks (like the ref code), they're
      forced to allocate a struct on the heap and leak it.
      
      This is mostly OK in practice. The size of the leak is
      bounded by the number of refs, and most programs exit after
      operating on a fixed number of refs (and allocate
      simultaneous memory proportional to the number of ref
      updates in the first place). But:
      
        1. It isn't hard to imagine a real leak: a program which
           runs for a long time taking a series of ref update
           instructions and fulfilling them one by one. I don't
           think we have such a program now, but it's certainly
           plausible.
      
        2. The leaked entries appear as false positives to
           tools like valgrind.
      
      Let's relax this rule by keeping only "active" tempfiles on
      the list. We can do this easily by moving the list-add
      operation from prepare_tempfile_object to activate_tempfile,
      and adding a deletion in deactivate_tempfile.
      
      Existing callers do not need to be updated immediately.
      They'll continue to leak any tempfile objects they may have
      allocated, but that's no different than the status quo. We
      can clean them up individually.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      422a21c6
    • Jeff King's avatar
      tempfile: use list.h for linked list · 24d82185
      Jeff King authored
      The tempfile API keeps to-be-cleaned tempfiles in a
      singly-linked list and never removes items from the list.  A
      future patch would like to start removing items, but removal
      from a singly linked list is O(n), as we have to walk the
      list to find the predecessor element. This means that a
      process which takes "n" simultaneous lockfiles (for example,
      an atomic transaction on "n" refs) may end up quadratic in
      "n".
      
      Before we start allowing items to be removed, it would be
      nice to have a way to cover this case in linear time.
      
      The simplest solution is to make an assumption about the
      order in which tempfiles are added and removed from the
      list. If both operations iterate over the tempfiles in the
      same order, then by putting new items at the end of the list
      our removal search will always find its items at the
      beginning of the list. And indeed, that would work for the
      case of refs. But it creates a hidden dependency between
      unrelated parts of the code. If anybody changes the ref code
      (or if we add a new caller that opens multiple simultaneous
      tempfiles) they may unknowingly introduce a performance
      regression.
      
      Another solution is to use a better data structure. A
      doubly-linked list works fine, and we already have an
      implementation in list.h. But there's one snag: the elements
      of "struct tempfile" are all marked as "volatile", since a
      signal handler may interrupt us and iterate over the list at
      any moment (even if we were in the middle of adding a new
      entry).
      
      We can declare a "volatile struct list_head", but we can't
      actually use it with the normal list functions. The compiler
      complains about passing a pointer-to-volatile via a regular
      pointer argument. And rightfully so, as the sub-function
      would potentially need different code to deal with the
      volatile case.
      
      That leaves us with a few options:
      
        1. Drop the "volatile" modifier for the list items.
      
           This is probably a bad idea. I checked the assembly
           output from "gcc -O2", and the "volatile" really does
           impact the order in which it updates memory.
      
        2. Use macros instead of inline functions. The irony here
           is that list.h is entirely implemented as trivial
           inline functions. So we basically are already
           generating custom code for each call. But sadly there's no
           way in C to declare the inline function to take a more
           generic type.
      
           We could do so by switching the inline functions to
           macros, but it does make the end result harder to read.
           And it doesn't fully solve the problem (for instance,
           the declaration of list_head needs to change so that
           its "prev" and "next" pointers point to other volatile
           structs).
      
        3. Don't use list.h, and just make our own ad-hoc
           doubly-linked list. It's not that much code to
           implement the basics that we need here. But if we're
           going to do so, why not add the few extra lines
           required to model it after the actual list.h interface?
           We can even reuse a few of the macro helpers.
      
      So this patch takes option 3, but actually implements a
      parallel "volatile list" interface in list.h, where it could
      potentially be reused by other code. This implements just
      enough for tempfile.c's use, though we could easily port
      other functions later if need be.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      24d82185
    • Jeff King's avatar
      tempfile: release deactivated strbufs instead of resetting · 102cf7a6
      Jeff King authored
      When a tempfile is deactivated, we reset its strbuf to the
      empty string, which means we hold onto the memory for later
      reuse.
      
      Since we'd like to move to a system where tempfile structs
      can actually be freed, deactivating one should drop all
      resources it is currently using. And thus "release" rather
      than "reset" is the appropriate function to call.
      
      In theory the reset may have saved a malloc() when a
      tempfile (or a lockfile) is reused multiple times. But in
      practice this happened rarely. Most of our tempfiles are
      single-use, since in cases where we might actually use many
      (like ref locking) we xcalloc() a fresh one for each ref. In
      fact, we leak those locks (to appease the rule that tempfile
      storage can never be freed). Which means that using reset is
      actively hurting us: instead of leaking just the tempfile
      struct, we're leaking the extra heap chunk for the filename,
      too.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      102cf7a6
    • Jeff King's avatar
      tempfile: robustify cleanup handler · 6b935066
      Jeff King authored
      We may call remove_tempfiles() from an atexit handler, or
      from a signal handler. In the latter case we must take care
      to avoid functions which may deadlock if the process is in
      an unknown state, including looking at any stdio handles
      (which may be in the middle of doing I/O and locked) or
      calling malloc() or free().
      
      The current implementation calls delete_tempfile(). We unset
      the tempfile's stdio handle (if any) to avoid deadlocking
      there. But delete_tempfile() still calls unlink_or_warn(),
      which can deadlock writing to stderr if the unlink fails.
      
      Since delete_tempfile() isn't very long, let's just
      open-code our own simple conservative version of the same
      thing.  Notably:
      
        1. The "skip_fclose" flag is now called "in_signal_handler",
           because it should inform more decisions than just the
           fclose handling.
      
        2. We can replace close_tempfile() with just close(fd).
           That skips the fclose() question altogether. This is
           fine for the atexit() case, too; there's no point
           flushing data to a file which we're about to delete
           anyway.
      
        3. We can choose between unlink/unlink_or_warn based on
           whether it's safe to use stderr.
      
        4. We can replace the deactivate_tempfile() call with a
           simple setting of the active flag. There's no need to
           do any further cleanup since we know the program is
           exiting.  And even though the current deactivation code
           is safe in a signal handler, this frees us up in future
           patches to make non-signal deactivation more
           complicated (e.g., by freeing resources).
      
        5. There's no need to remove items from the tempfile_list.
           The "active" flag is the ultimate answer to whether an
           entry has been handled or not. Manipulating the list
           just introduces more chance of recursive signals
           stomping on each other, and the whole list will go away
           when the program exits anyway. Less is more.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      6b935066
    • Jeff King's avatar
      tempfile: factor out deactivation · b5f4dcb5
      Jeff King authored
      When we deactivate a tempfile, we also have to clean up the
      "filename" strbuf. Let's pull this out into its own function
      to keep the logic in one place (which will become more
      important when a future patch makes it more complicated).
      
      Note that we can use the same function when deactivating an
      object that _isn't_ actually active yet (like when we hit an
      error creating a tempfile). These callsites don't currently
      reset the "active" flag to 0, but it's OK to do so (it's
      just a noop for these cases).
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      b5f4dcb5
    • Jeff King's avatar
      tempfile: factor out activation · 2933ebba
      Jeff King authored
      There are a few steps required to "activate" a tempfile
      struct. Let's pull these out into a function. That saves a
      few repeated lines now, but more importantly will make it
      easier to change the activation scheme later.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      2933ebba
    • Jeff King's avatar
      tempfile: replace die("BUG") with BUG() · 9b028aa4
      Jeff King authored
      Compared to die(), using BUG() triggers abort(). That may
      give us an actual coredump, which should make it easier to
      get a stack trace. And since the programming error for these
      assertions is not in the functions themselves but in their
      callers, such a stack trace is needed to actually find the
      source of the bug.
      
      In addition, abort() raises SIGABRT, which is more likely to
      be caught by our test suite.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      9b028aa4
    • Jeff King's avatar
      tempfile: handle NULL tempfile pointers gracefully · f5b4dc76
      Jeff King authored
      The tempfile functions all take pointers to tempfile
      objects, but do not check whether the argument is NULL.
      This isn't a big deal in practice, since the lifetime of any
      tempfile object is defined to last for the whole program. So
      even if we try to call delete_tempfile() on an
      already-deleted tempfile, our "active" check will tell us
      that it's a noop.
      
      In preparation for transitioning to a new system that
      loosens the "tempfile objects can never be freed" rule,
      let's tighten up our active checks:
      
        1. A NULL pointer is now defined as "inactive" (so it will
           BUG for most functions, but works as a silent noop for
           things like delete_tempfile).
      
        2. Functions should always do the "active" check before
           looking at any of the struct fields.
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      f5b4dc76
    • Jeff King's avatar
      tempfile: prefer is_tempfile_active to bare access · e6fc2673
      Jeff King authored
      The tempfile code keeps an "active" flag, and we have a
      number of assertions to make sure that the objects are being
      used in the right order. Most of these directly check
      "active" rather than using the is_tempfile_active()
      accessor.
      
      Let's prefer using the accessor, in preparation for it
      growing more complicated logic (like checking for NULL).
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      e6fc2673
    • Jeff King's avatar
      tempfile: do not delete tempfile on failed close · 49bd0fc2
      Jeff King authored
      When close_tempfile() fails, we delete the tempfile and
      reset the fields of the tempfile struct. This makes it
      easier for callers to return without cleaning up, but it
      also makes this common pattern:
      
        if (close_tempfile(tempfile))
      	return error_errno("error closing %s", tempfile->filename.buf);
      
      wrong, because the "filename" field has been reset after the
      failed close. And it's not easy to fix, as in many cases we
      don't have another copy of the filename (e.g., if it was
      created via one of the mks_tempfile functions, and we just
      have the original template string).
      
      Let's drop the feature that a failed close automatically
      deletes the file. This puts the burden on the caller to do
      the deletion themselves, but this isn't that big a deal.
      Callers which do:
      
        if (write(...) || close_tempfile(...)) {
      	delete_tempfile(...);
      	return -1;
        }
      
      already had to call delete when the write() failed, and so
      aren't affected. Likewise, any caller which just calls die()
      in the error path is OK; we'll delete the tempfile during
      the atexit handler.
      
      Because this patch changes the semantics of close_tempfile()
      without changing its signature, all callers need to be
      manually checked and converted to the new scheme. This patch
      covers all in-tree callers, but there may be others for
      not-yet-merged topics. To catch these, we rename the
      function to close_tempfile_gently(), which will attract
      compile-time attention to new callers. (Technically the
      original could be considered "gentle" already in that it
      didn't die() on errors, but this one is even more so).
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      49bd0fc2
  4. 17 Feb, 2017 1 commit
    • Jeff King's avatar
      tempfile: set errno to a known value before calling ferror() · 7e8c9355
      Jeff King authored
      In close_tempfile(), we return an error if ferror()
      indicated a previous failure, or if fclose() failed. In the
      latter case, errno is set and it is useful for callers to
      report it.
      
      However, if _only_ ferror() triggers, then the value of
      errno is based on whatever syscall happened to last fail,
      which may not be related to our filehandle at all. A caller
      cannot tell the difference between the two cases, and may
      use "die_errno()" or similar to report a nonsense errno value.
      
      One solution would be to actually pass back separate return
      values for the two cases, so a caller can write a more
      appropriate message for each case. But that makes the
      interface clunky.
      
      Instead, let's just set errno to the generic EIO in this case.
      That's not as descriptive as we'd like, but at least it's
      predictable. So it's better than the status quo in all cases
      but one: when the last syscall really did involve a failure
      on our filehandle, we'll be wiping that out. But that's a
      fragile thing for us to rely on.
      
      In any case, we'll let the errno result from fclose() take
      precedence over our value, as we know that's recent and
      accurate (and many I/O errors will persist through the
      fclose anyway).
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      7e8c9355
  5. 16 Feb, 2017 1 commit
    • Jeff King's avatar
      tempfile: avoid "ferror | fclose" trick · 0838cbc2
      Jeff King authored
      The current code wants to record an error condition from
      either ferror() or fclose(), but makes sure that we always
      call both functions. So it can't use logical-OR "||", which
      would short-circuit when ferror() is true. Instead, it uses
      bitwise-OR "|" to evaluate both functions and set one or
      more bits in the "err" flag if they reported a failure.
      
      Unlike logical-OR, though, bitwise-OR does not introduce a
      sequence point, and the order of evaluation for its operands
      is unspecified. So a compiler would be free to generate code
      which calls fclose() first, and then ferror() on the
      now-freed filehandle.
      
      There's no indication that this has happened in practice,
      but let's write it out in a way that follows the standard.
      Noticed-by: Andreas Schwab's avatarAndreas Schwab <schwab@linux-m68k.org>
      Signed-off-by: 's avatarJeff King <peff@peff.net>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      0838cbc2
  6. 23 Aug, 2016 1 commit
    • Ben Wijen's avatar
      mingw: ensure temporary file handles are not inherited by child processes · 05d1ed61
      Ben Wijen authored
      When the index is locked and child processes inherit the handle to
      said lock and the parent process wants to remove the lock before the
      child process exits, on Windows there is a problem: it won't work
      because files cannot be deleted if a process holds a handle on them.
      The symptom:
      
          Rename from 'xxx/.git/index.lock' to 'xxx/.git/index' failed.
          Should I try again? (y/n)
      
      Spawning child processes with bInheritHandles==FALSE would not work
      because no file handles would be inherited, not even the hStdXxx
      handles in STARTUPINFO (stdin/stdout/stderr).
      
      Opening every file with O_NOINHERIT does not work, either, as e.g.
      git-upload-pack expects inherited file handles.
      
      This leaves us with the only way out: creating temp files with the
      O_NOINHERIT flag. This flag is Windows-specific, however. For our
      purposes, it is equivalent to O_CLOEXEC (which does not exist on
      Windows), so let's just open temporary files with the O_CLOEXEC flag and
      map that flag to O_NOINHERIT on Windows.
      
      As Eric Wong pointed out, we need to be careful to handle the case where
      the Linux headers used to compile Git support O_CLOEXEC but the Linux
      kernel used to run Git does not: it returns an EINVAL.
      
      This fixes the test that we just introduced to demonstrate the problem.
      Signed-off-by: 's avatarBen Wijen <ben@wijen.net>
      Signed-off-by: Johannes Schindelin's avatarJohannes Schindelin <johannes.schindelin@gmx.de>
      Signed-off-by: 's avatarJunio C Hamano <gitster@pobox.com>
      05d1ed61
  7. 10 Aug, 2015 4 commits