1. 22 Jan, 2018 1 commit
  2. 14 Sep, 2017 1 commit
    • Jeff King's avatar
      avoid "write_in_full(fd, buf, len) != len" pattern · 06f46f23
      Jeff King authored
      The return value of write_in_full() is either "-1", or the
      requested number of bytes[1]. If we make a partial write
      before seeing an error, we still return -1, not a partial
      value. This goes back to f6aa66cb (write_in_full: really
      write in full or return error on disk full., 2007-01-11).
      So checking anything except "was the return value negative"
      is pointless. And there are a couple of reasons not to do
        1. It can do a funny signed/unsigned comparison. If your
           "len" is signed (e.g., a size_t) then the compiler will
           promote the "-1" to its unsigned variant.
           This works out for "!= len" (unless you really were
           trying to write the maximum size_t bytes), but is a
           bug if you check "< len" (an example of which was fixed
           recently in config.c).
           We should avoid promoting the mental model that you
           need to check the length at all, so that new sites are
           not tempted to copy us.
        2. Checking for a negative value is shorter to type,
           especially when the length is an expression.
        3. Linus says so. In d34cf19b (Clean up write_in_full()
           users, 2007-01-11), right after the write_in_full()
           semantics were changed, he wrote:
             I really wish every "write_in_full()" user would just
             check against "<0" now, but this fixes the nasty and
             stupid ones.
           Appeals to authority aside, this makes it clear that
           writing it this way does not have an intentional
           benefit. It's a historical curiosity that we never
           bothered to clean up (and which was undoubtedly
           cargo-culted into new sites).
      So let's convert these obviously-correct cases (this
      includes write_str_in_full(), which is just a wrapper for
      [1] A careful reader may notice there is one way that
          write_in_full() can return a different value. If we ask
          write() to write N bytes and get a return value that is
          _larger_ than N, we could return a larger total. But
          besides the fact that this would imply a totally broken
          version of write(), it would already invoke undefined
          behavior. Our internal remaining counter is an unsigned
          size_t, which means that subtracting too many byte will
          wrap it around to a very large number. So we'll instantly
          begin reading off the end of the buffer, trying to write
          gigabytes (or petabytes) of data.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Reviewed-by: default avatarJonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  3. 22 Aug, 2017 2 commits
    • Junio C Hamano's avatar
      rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved · 6e96cb52
      Junio C Hamano authored
      These two configuration variables are described in the documentation
      to take an expiry period expressed in the number of days:
      	    Records of conflicted merge you resolved earlier are
      	    kept for this many days when 'git rerere gc' is run.
      	    The default is 60 days.
      	    Records of conflicted merge you have not resolved are
      	    kept for this many days when 'git rerere gc' is run.
      	    The default is 15 days.
      There is no strong reason not to allow a more general "approxidate"
      expiry specification, e.g. "5.days.ago", or "never".
      Rename the config_get_expiry() helper introduced in the previous
      step to git_config_get_expiry_in_days() and move it to a more
      generic place, config.c, and use date.c::parse_expiry_date() to do
      so.  Give it an ability to allow the caller to tell among three
      cases (i.e. there is no "gc.rerereResolved" config, there is and it
      is correctly parsed into the *expiry variable, and there was an
      error in parsing the given value).  The current caller can work
      correctly without using the return value, though.
      In the future, we may find other variables that only allow an
      integer that specifies "this many days" or other unit of time, and
      when it happens we may need to drop "_days" suffix from the name of
      the function and instead pass the "scale" value as another parameter.
      But this will do for now.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: represent time duration in timestamp_t internally · 5ea82279
      Junio C Hamano authored
      The two configuration variables, gc.rerereResolved and
      gc.rerereUnresolved, are measured in days and are passed as such
      into the prune_one() helper function, which worked in time_t to see
      if an entry in the rerere database is past its expiry.
      Instead, have the caller turn the number of days into the expiry
      timestamp.  Further, use timestamp_t instead of time_t.  This will
      make it possible to extend the way the configuration variable is
      spelled by using date.c::parse_expiry_date() that gives the expiry
      timestamp in timestamp_t.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  4. 16 Jun, 2017 1 commit
  5. 15 Jun, 2017 1 commit
  6. 26 May, 2017 3 commits
  7. 07 Dec, 2016 1 commit
    • Junio C Hamano's avatar
      hold_locked_index(): align error handling with hold_lockfile_for_update() · b3e83cc7
      Junio C Hamano authored
      Callers of the hold_locked_index() function pass 0 when they want to
      prepare to write a new version of the index file without wishing to
      die or emit an error message when the request fails (e.g. somebody
      else already held the lock), and pass 1 when they want the call to
      die upon failure.
      This option is called LOCK_DIE_ON_ERROR by the underlying lockfile
      API, and the hold_locked_index() function translates the paramter to
      LOCK_DIE_ON_ERROR when calling the hold_lock_file_for_update().
      Replace these hardcoded '1' with LOCK_DIE_ON_ERROR and stop
      translating.  Callers other than the ones that are replaced with
      this change pass '0' to the function; no behaviour change is
      intended with this patch.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      Among the callers of hold_locked_index() that passes 0:
       - diff.c::refresh_index_quietly() at the end of "git diff" is an
         opportunistic update; it leaks the lockfile structure but it is
         just before the program exits and nobody should care.
       - builtin/describe.c::cmd_describe(),
         sequencer.c::read_and_refresh_cache() are all opportunistic
         updates and they are OK.
       - builtin/update-index.c::cmd_update_index() takes a lock upfront
         but we may end up not needing to update the index (i.e. the
         entries may be fully up-to-date), in which case we do not need to
         issue an error upon failure to acquire the lock.  We do diagnose
         and die if we indeed need to update, so it is OK.
       - wt-status.c::require_clean_work_tree() IS BUGGY.  It asks
         silence, does not check the returned value.  Compare with
         callsites like cmd_describe() and cmd_status() to notice that it
         is wrong to call update_index_if_able() unconditionally.
  8. 07 Sep, 2016 1 commit
  9. 19 May, 2016 1 commit
  10. 11 May, 2016 1 commit
  11. 09 May, 2016 1 commit
  12. 06 Apr, 2016 4 commits
    • Junio C Hamano's avatar
      rerere: adjust 'forget' to multi-variant world order · 890fca84
      Junio C Hamano authored
      Because conflicts with the same contents inside conflict blocks
      enclosed by "<<<<<<<" and ">>>>>>>" can now have multiple variants
      to help three-way merge to adjust to the differences outside the
      conflict blocks, "rerere forget $path" needs to be taught that there
      may be multiple recorded resolutions that share the same conflict
      hash (which groups the conflicts with "the same contents inside
      conflict blocks"), among which there are some that would not be
      relevant to the conflict we are looking at.  These "other variants"
      that happen to share the same conflict hash should not be cleared,
      and the variant that would apply to the current conflict may not be
      the zero-th one (which is the only one that is cleared by the
      current code).
      After finding the conflict hash, iterate over the existing variants
      and try to resolve the conflict using each of them to find the one
      that "cleanly" resolves the current conflict.  That is the one we
      want to forget and record the preimage for, so that the user can
      record the corrected resolution.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: split code to call ll_merge() further · 0ce02b36
      Junio C Hamano authored
      The merge() helper function is given an existing rerere ID (i.e. the
      name of the .git/rr-cache/* subdirectory, and the variant number)
      that identifies one <preimage, postimage> pair, try to see if the
      conflicted state in the given path can be resolved by using the pair,
      and if this succeeds, then update the conflicted path with the
      result in the working tree.
      To implement rerere_forget() in the multiple variant world, we'd
      need a helper to do the "see if a <preimage, postimage> pair cleanly
      resolves a conflicted state we have in-core" part, without actually
      touching any file in the working tree, in order to identify which
      variant(s) to remove.  Split the logic to do so into a separate
      helper function try_merge() out of merge().
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: move code related to "forget" together · 3d730ed9
      Junio C Hamano authored
      "rerere forget" is the only user of handle_cache() helper, which in
      turn is the only user of rerere_io that reads from an in-core buffer
      whose getline method is implemented as rerere_mem_getline().  Gather
      them together.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: gc and clear · 1be1e851
      Junio C Hamano authored
      Adjust "git rerere gc" and "git rerere clear" to the new world order
      with rerere database with multiple variants for the same shape of
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  13. 15 Mar, 2016 4 commits
    • Junio C Hamano's avatar
      rerere: do use multiple variants · 629716d2
      Junio C Hamano authored
      This enables the multiple-variant support for real.  Multiple
      conflicts of the same shape can have differences in contexts where
      they appear, interfering the replaying of recorded resolution of one
      conflict to another, and in such a case, their resolutions are
      recorded as different variants under the same conflict ID.
      We still need to adjust garbage collection codepaths for this
      change, but the basic "replay" functionality is functional with
      this change.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: allow multiple variants to exist · a13d1370
      Junio C Hamano authored
      The shape of the conflict in a path determines the conflict ID.  The
      preimage and postimage pair that was recorded for the conflict ID
      previously may or may not replay well for the conflict we just saw.
      Currently, we punt when the previous resolution does not cleanly
      replay, but ideally we should then be able to record the currently
      conflicted path by assigning a new 'variant', and then record the
      resolution the user is going to make.
      Introduce a mechanism to have more than one variant for a given
      conflict ID; we do not actually assign any variant other than 0th
      variant yet at this step.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: delay the recording of preimage · c0a5423b
      Junio C Hamano authored
      We record the preimage only when there is no directory to record the
      conflict we encountered, i.e. when $GIT_DIR/rr-cache/$ID does not
      exist.  As the plan is to allow multiple <preimage,postimage> pairs
      as variants for the same conflict ID eventually, this logic needs to
      As the first step in that direction, stop the "did we create the
      directory?  Then we record the preimage" logic.  Instead, we record
      if a preimage does not exist when we saw a conflict in a path.  Also
      make sure that we remove a stale postimage, which most likely is
      totally unrelated to the resolution of this new conflict, when we
      create a new preimage under $ID when $GIT_DIR/rr-cache/$ID already
      In later patches, we will further update this logic to be "do we
      have <preimage,postimage> pair that cleanly resolve the current
      conflicts?  If not, record a new preimage as a new variant", but
      that does not happen at this stage yet.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: handle leftover rr-cache/$ID directory and postimage files · 05dd9f13
      Junio C Hamano authored
      If by some accident there is only $GIT_DIR/rr-cache/$ID directory
      existed, we wouldn't have recorded a preimage for a conflict that
      is newly encountered, which would mean after a manual resolution,
      we wouldn't have recorded it by storing the postimage, because the
      logic used to be "if there is no rr-cache/$ID directory, then we are
      the first so record the preimage".  Instead, record preimage if we
      do not have one.
      In addition, if there is only $GIT_DIR/rr-cache/$ID/postimage
      without corresponding preimage, we would have tried to call into
      merge() and punted.
      These would have been a situation frustratingly hard to recover
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  14. 08 Feb, 2016 3 commits
    • Junio C Hamano's avatar
      rerere: scan $GIT_DIR/rr-cache/$ID when instantiating a rerere_id · 2c7929b1
      Junio C Hamano authored
      This will help fixing bootstrap corner-case issues, e.g. having an
      empty $GIT_DIR/rr-cache/$ID directory would fail to record a
      preimage, in later changes in this series.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Junio C Hamano's avatar
      rerere: split conflict ID further · 1869bbe1
      Junio C Hamano authored
      The plan is to keep assigning the backward compatible conflict ID
      based on the hash of the (normalized) text of conflicts, keep using
      that conflict ID as the directory name under $GIT_DIR/rr-cache/, but
      allow each conflicted path to use a separate "variant" to record
      resolutions, i.e. having more than one <preimage,postimage> pairs
      under $GIT_DIR/rr-cache/$ID/ directory.  As the first step in that
      direction, separate the shared "conflict ID" out of the rerere_id
      The plan is to keep information per $ID in rerere_dir, that can be
      shared among rerere_id that is per conflicted path.
      When we are done with rerere(), which can be directly called from
      other programs like "git apply", "git commit" and "git merge", the
      shared rerere_dir structures can be freed entirely, so they are not
      reference-counted and they are not freed when we release rerere_id's
      that reference them.
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
    • Jeff King's avatar
      rerere: replace strcpy with xsnprintf · f58316db
      Jeff King authored
      This shouldn't overflow, as we are copying a sha1 hex into a
      41-byte buffer. But it does not hurt to use a bound-checking
      function, which protects us and makes auditing for overflows
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  15. 01 Sep, 2015 1 commit
    • Jeff King's avatar
      rerere: release lockfile in non-writing functions · 9dd330e6
      Jeff King authored
      There's a bug in builtin/am.c in which we take a lock on
      MERGE_RR recursively. But rather than fix am.c, this patch
      fixes the confusing interface from rerere.c that caused the
      bug. Read on for the gory details.
      The setup_rerere() function both reads the existing MERGE_RR
      file, and takes MERGE_RR.lock. In the rerere() and
      rerere_forget() functions, we end up in write_rr(), which
      will then commit the lock file.
      But for functions like rerere_clear() that do not write to
      MERGE_RR, we expect the caller to have handled
      setup_rerere(). That caller would then need to release the
      lockfile, but it can't; the lock struct is local to
      For builtin/rerere.c, this is OK. We run a single rerere
      operation and then exit immediately, which has the side
      effect of rolling back the lockfile.
      But in builtin/am.c, this is actively wrong. If we run "git
      am -3 --skip", we call setup-rerere twice without releasing
      the lock:
        1. The "--skip" causes us to call am_rerere_clear(), which
           calls setup_rerere(), but never drops the lock.
        2. We then proceed to the next patch.
        3. The "--3way" may cause us to call rerere() to handle
           conflicts in that patch, but we are already holding the
           lock. The lockfile code dies with:
           BUG: prepare_tempfile_object called for active object
      We could fix this by having rerere_clear() call
      rollback_lock_file(). But it feels a bit odd for it to roll
      back a lockfile that it did not itself take. So let's
      simplify the interface further, and handle setup_rerere in
      the function itself, taking away the question from the
      caller over whether they need to do so.
      We can give rerere_gc() the same treatment, as well (even
      though it doesn't have any callers besides builtin/rerere.c
      at this point). Note that these functions don't take flags
      from their callers to pass along to setup_rerere; that's OK,
      because the flags would not be meaningful for what they are
      Both of those functions need to hold the lock because even
      though they do not write to MERGE_RR, they are still writing
      and should be protected from a simultaneous "rerere" run.
      But rerere_remaining(), "rerere diff", and "rerere status"
      are all read-only operations. They want to setup_rerere(),
      but do not care about taking the lock in the first place.
      Since our update of MERGE_RR is the usual atomic rename done
      by commit_lock_file, they can just do a lockless read. For
      that, we teach setup_rerere a READONLY flag to avoid the
      As a bonus, this pushes builtin/rerere.c's setup_rerere call
      closer to the functions that use it. Which means that "git
      rerere totally-bogus-command" will no longer silently
      exit(0) in a repository without rerere enabled.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  16. 10 Aug, 2015 1 commit
    • Jeff King's avatar
      memoize common git-path "constant" files · f932729c
      Jeff King authored
      One of the most common uses of git_path() is to pass a
      constant, like git_path("MERGE_MSG"). This has two
        1. The return value is a static buffer, and the lifetime
           is dependent on other calls to git_path, etc.
        2. There's no compile-time checking of the pathname. This
           is OK for a one-off (after all, we have to spell it
           correctly at least once), but many of these constant
           strings appear throughout the code.
      This patch introduces a series of functions to "memoize"
      these strings, which are essentially globals for the
      lifetime of the program. We compute the value once, take
      ownership of the buffer, and return the cached value for
      subsequent calls.  cache.h provides a helper macro for
      defining these functions as one-liners, and defines a few
      common ones for global use.
      Using a macro is a little bit gross, but it does nicely
      document the purpose of the functions. If we need to touch
      them all later (e.g., because we learned how to change the
      git_dir variable at runtime, and need to invalidate all of
      the stored values), it will be much easier to have the
      complete list.
      Note that the shared-global functions have separate, manual
      declarations. We could do something clever with the macros
      (e.g., expand it to a declaration in some places, and a
      declaration _and_ a definition in path.c). But there aren't
      that many, and it's probably better to stay away from
      too-magical macros.
      Likewise, if we abandon the C preprocessor in favor of
      generating these with a script, we could get much fancier.
      E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz".
      But the small amount of saved typing is probably not worth
      the resulting confusion to readers who want to grep for the
      function's definition.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
  17. 24 Jul, 2015 13 commits