1. 14 Jan, 2019 2 commits
  2. 12 Nov, 2018 1 commit
  3. 15 Oct, 2018 1 commit
  4. 21 Sep, 2018 2 commits
  5. 11 Sep, 2018 1 commit
    • rerere: avoid buffer overrun · ad2bf0d9
      Elijah Newren authored
      check_one_conflict() compares `i` to `active_nr` in two places to avoid
      buffer overruns, but left out an important third location.
      
      The code used to have a check here comparing i to active_nr, back
      before commit fb70a06d ("rerere: fix an off-by-one non-bug",
      2015-06-28), however the code at the time used an 'if' rather than a
      'while' meaning back then that this loop could not have read past the
      end of the array, making the check unnecessary and it was removed.
      Unfortunately, in commit 5eda906b ("rerere: handle conflicts with
      multiple stage #1 entries", 2015-07-24), the 'if' was changed to a
      'while' and the check comparing i and active_nr was not re-instated,
      leading to this problem.
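
      The pattern at issue can be sketched in isolation.  This is a minimal
      illustration, not the code in rerere.c: 'struct entry' and
      'skip_same_name' are hypothetical names, and 'nr' plays the role of
      active_nr.

      ```c
      #include <stddef.h>
      #include <string.h>

      /* Hypothetical stand-in for an index entry; not git's struct cache_entry. */
      struct entry {
      	const char *name;
      };

      /*
       * Skip over consecutive entries that share the name of entries[i] and
       * return the index of the first entry with a different name (or nr).
       *
       * Because this is a 'while' rather than an 'if', the bound check
       * "i < nr" has to be part of the loop condition.  Without it the loop
       * reads past the end of the array whenever the run of equal names
       * extends to the last element.
       */
      static size_t skip_same_name(const struct entry *entries, size_t nr, size_t i)
      {
      	const char *name;

      	if (i >= nr)
      		return nr;
      	name = entries[i].name;
      	while (i < nr && !strcmp(entries[i].name, name))
      		i++;
      	return i;
      }
      ```

      With a single 'if' in place of the 'while', 'i' advances at most once
      per call, which is why the older code could get away without the check.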
      Signed-off-by: Elijah Newren <newren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  6. 13 Aug, 2018 1 commit
  7. 06 Aug, 2018 8 commits
    • rerere: recalculate conflict ID when unresolved conflict is committed · bd7dfa54
      Thomas Gummerer authored
      Currently when a user doesn't resolve a conflict, commits the results,
      and does an operation which creates another conflict, rerere will use
      the ID of the previously unresolved conflict for the new conflict.
      This is because the conflict is kept in the MERGE_RR file, which
      'rerere' reads every time it is invoked.
      
      After the new conflict is solved, rerere will record the resolution
      with the ID of the old conflict.  So in order to replay the conflict,
      both merges would have to be re-done, instead of just the last one, in
      order for rerere to be able to automatically resolve the conflict.
      
      Instead of that, assign a new conflict ID if there are still conflicts
      in a file and the file had conflicts at a previous step.  This ID
      matches the conflict we actually resolved at the corresponding step.
      
      Note that there are no backwards compatibility worries here, as rerere
      would have failed to even normalize the conflict before this patch
      series.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: teach rerere to handle nested conflicts · 4af32207
      Thomas Gummerer authored
      Currently rerere can't handle nested conflicts and will error out when
      it encounters such conflicts.  Teach it to handle them by recursively
      calling the 'handle_conflict' function to normalize the conflict.
      
      Note that a conflict like this would only be produced if a user
      commits a file with conflict markers, and gets a conflict including
      that in a subsequent operation.
      
      The conflict ID calculation here deserves some explanation:
      
      As we are using the same handle_conflict function, the nested conflict
      is normalized the same way as for non-nested conflicts, which means
      the ancestor in the diff3 case is stripped out, and the parts of the
      conflict are ordered alphabetically.
      
      The conflict ID, however, is only calculated in the top level
      handle_conflict call, so it will include the markers that 'rerere'
      adds to the output.  e.g. say there's the following conflict:
      
          <<<<<<< HEAD
          1
          =======
          <<<<<<< HEAD
          3
          =======
          2
          >>>>>>> branch-2
          >>>>>>> branch-3~
      
      it would be recorded as follows in the preimage:
      
          <<<<<<<
          1
          =======
          <<<<<<<
          2
          =======
          3
          >>>>>>>
          >>>>>>>
      
      and the conflict ID would be calculated as
      
          sha1(1<NUL><<<<<<<
          2
          =======
          3
          >>>>>>><NUL>)
      
      Stripping out vs. leaving the conflict markers in place in the inner
      conflict should have no practical impact, but it simplifies the
      implementation.
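
      The ordering step of the normalization can be sketched as follows.
      This is only an illustration under assumptions: 'struct hunk' and
      'normalize_hunk' are hypothetical names, the sides are plain C strings
      rather than strbufs, and the ancestor part of a diff3 conflict is
      assumed to have been dropped already.

      ```c
      #include <string.h>

      /* Hypothetical representation of the two sides of one conflict hunk. */
      struct hunk {
      	const char *one;
      	const char *two;
      };

      /*
       * Order the two sides so that the one that sorts first (by strcmp)
       * comes first.  After this step, a conflict recorded as "A vs. B" and
       * the same conflict recorded as "B vs. A" have the same normalized form.
       */
      static void normalize_hunk(struct hunk *h)
      {
      	if (strcmp(h->one, h->two) > 0) {
      		const char *tmp = h->one;
      		h->one = h->two;
      		h->two = tmp;
      	}
      }
      ```

      Applied to the inner conflict in the example above, the sides "3" and
      "2" are swapped, which matches the order shown in the preimage.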
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: return strbuf from handle path · 5ebbdad3
      Thomas Gummerer authored
      Currently we write the conflict to disk directly in the handle_path
      function.  To make it re-usable for nested conflicts, instead of
      writing the conflict out directly, store it in a strbuf and let the
      caller write it out.
      
      This does mean some slight increase in memory usage, however that
      increase is limited to the size of the largest conflict we've
      currently processed.  We already keep one copy of the conflict in
      memory, and it shouldn't be too large, so the increase in memory usage
      seems acceptable.
      
      As a bonus this lets us replace the rerere_io_putconflict function
      with a trivial two line function.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: factor out handle_conflict function · c0f16f8e
      Thomas Gummerer authored
      Factor out the handle_conflict function, which handles a single
      conflict in a path.  This is in preparation for a subsequent commit,
      where this function will be re-used.
      
      Note that this does change the behaviour of 'git rerere' slightly.
      Where previously we'd consider all files where an unmatched conflict
      marker is found as invalid, we now only consider files invalid when
      the "ours" conflict marker ("<<<<<<< <text>") is unmatched, not when
      other conflict markers (e.g. "=======") are unmatched.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: only return whether a path has conflicts or not · 221444f5
      Thomas Gummerer authored
      We currently return the exact number of conflict hunks a certain path
      has from the 'handle_paths' function.  However all of its callers only
      care whether there are conflicts or not or if there is an error.
      Return only that information, and document that only that information
      is returned.  This will simplify the code in the subsequent steps.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: fix crash with files rerere can't handle · 93406a28
      Thomas Gummerer authored
      Currently when a user does a conflict resolution and ends it (in any
      way that calls 'git rerere' again) with a file 'rerere' can't handle,
      subsequent rerere operations that are interested in that path, such as
      'rerere clear' or 'rerere forget <path>' will fail or, even worse in
      the case of 'rerere clear', segfault.
      
      Such states include nested conflicts, or a conflict marker that
      doesn't have any match.
      
      This is because 'git rerere' calculates a conflict file and writes it
      to the MERGE_RR file.  When the user then changes the file in any way
      rerere can't handle, and then calls 'git rerere' on it again to record
      the conflict resolution, the handle_file function fails, and removes
      the 'preimage' file in the rr-cache in the process, while leaving the
      ID in the MERGE_RR file.
      
      Now when 'rerere clear' is run, it reads the ID from the MERGE_RR
      file, however the 'fit_variant' function for the ID is never called as
      the 'preimage' file does not exist anymore.  This means
      'collection->status' in 'has_rerere_resolution' is NULL, and the
      command will crash.
      
      To fix this, remove the rerere ID from the MERGE_RR file in the case
      when we can't handle it, just after the 'preimage' file was removed
      and remove the corresponding variant from .git/rr-cache/.  Removing it
      unconditionally is fine here, because if the user had resolved
      the conflict and run rerere, the entry would no longer be in the
      MERGE_RR file, so we wouldn't have this problem in the first place,
      while if the conflict was not resolved.
      
      Currently there is nothing left in this folder, as the 'preimage'
      was already deleted by the 'handle_file' function, so 'remove_variant'
      is a no-op.  Still call the function, to make sure we clean everything
      up, in case we add some other files corresponding to a variant in the
      future.
      
      Note that other variants that have the same conflict ID will not be
      touched.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: add documentation for conflict normalization · fb90dca3
      Thomas Gummerer authored
      Add some documentation for the logic behind the conflict normalization
      in rerere.
      Helped-by: Junio C Hamano <gitster@pobox.com>
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: mark strings for translation · 2373b650
      Thomas Gummerer authored
      'git rerere' is considered a porcelain command and as such its output
      should be translated.  Its functionality is also only enabled through
      a config setting, so scripts really shouldn't rely on the output
      either way.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  8. 16 Jul, 2018 3 commits
    • rerere: wrap paths in output in sq · 28fc9abd
      Thomas Gummerer authored
      It looks like most paths in the output in the git codebase are wrapped
      in single quotes.  Standardize on that in rerere as well.
      
      Apart from being more consistent, this also makes some of the strings
      match strings that are already translated in other parts of the
      codebase, thus reducing the work for translators, when the strings are
      marked for translation in a subsequent commit.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: lowercase error messages · c5d1d132
      Thomas Gummerer authored
      Documentation/CodingGuidelines mentions that error messages should be
      lowercase.  Prior to marking them for translation follow that pattern
      in rerere as well, so translators won't have to translate messages
      that don't conform to our guidelines.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: unify error messages when read_cache fails · e69db0b3
      Thomas Gummerer authored
      We have multiple different variants of the error message we show to
      the user if 'read_cache' fails.  The "Could not read index" variant we
      are using in 'rerere.c' is currently not used anywhere in translated
      form.
      
      As a subsequent commit will mark all output that comes from 'rerere.c'
      for translation, make the life of the translators a little bit easier
      by using a string that is used elsewhere, and marked for translation
      there, and thus most likely already translated.
      
      "index file corrupt" seems to be the most common error message we show
      when 'read_cache' fails, so use that here as well.
      Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  9. 17 May, 2018 1 commit
  10. 16 May, 2018 1 commit
    • object-store: move object access functions to object-store.h · cbd53a21
      Stefan Beller authored
      This should make these functions easier to find and cache.h less
      overwhelming to read.
      
      In particular, this moves:
      - read_object_file
      - oid_object_info
      - write_object_file
      
      As a result, most of the codebase needs to #include object-store.h.
      In this patch the #include is only added to files that would fail to
      compile otherwise.  It would be better to #include wherever
      identifiers from the header are used.  That can happen later
      when we have better tooling for it.
      Signed-off-by: Stefan Beller <sbeller@google.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  11. 10 May, 2018 1 commit
    • lock_file: move static locks into functions · 0fa5a2ed
      Martin Ågren authored
      Placing `struct lock_file`s on the stack used to be a bad idea, because
      the temp- and lockfile-machinery would keep a pointer into the struct.
      But after 076aa2cb (tempfile: auto-allocate tempfiles on heap,
      2017-09-05), we can safely have lockfiles on the stack. (This applies
      even if a user returns early, leaving a locked lock behind.)
      
      Each of these `struct lock_file`s is used from within a single function.
      Move them into the respective functions to make the scope clearer and
      drop the staticness.
      
      For good measure, I have inspected these sites and come to believe that
      they always release the lock, with the possible exception of bailing out
      using `die()` or `exit()` or by returning from a `cmd_foo()`.
      
      As pointed out by Jeff King, it would be bad if someone held on to a
      `struct lock_file *` for some reason. After some grepping, I agree with
      his findings: no-one appears to be doing that.
      
      After this commit, the remaining occurrences of "static struct
      lock_file" are locks that are used from within different functions. That
      is, they need to remain static. (Short of more intrusive changes like
      passing around pointers to non-static locks.)
      Signed-off-by: Martin Ågren <martin.agren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  12. 14 Mar, 2018 1 commit
    • sha1_file: convert read_sha1_file to struct object_id · b4f5aca4
      brian m. carlson authored
      Convert read_sha1_file to take a pointer to struct object_id and rename
      it read_object_file.  Do the same for read_sha1_file_extended.
      
      Convert one use in grep.c to use the new function without any other code
      change, since the pointer being passed is a void pointer that is already
      initialized with a pointer to struct object_id.  Update the declaration
      and definitions of the modified functions, and apply the following
      semantic patch to convert the remaining callers:
      
      @@
      expression E1, E2, E3;
      @@
      - read_sha1_file(E1.hash, E2, E3)
      + read_object_file(&E1, E2, E3)
      
      @@
      expression E1, E2, E3;
      @@
      - read_sha1_file(E1->hash, E2, E3)
      + read_object_file(E1, E2, E3)
      
      @@
      expression E1, E2, E3, E4;
      @@
      - read_sha1_file_extended(E1.hash, E2, E3, E4)
      + read_object_file_extended(&E1, E2, E3, E4)
      
      @@
      expression E1, E2, E3, E4;
      @@
      - read_sha1_file_extended(E1->hash, E2, E3, E4)
      + read_object_file_extended(E1, E2, E3, E4)
      Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  13. 01 Mar, 2018 1 commit
    • write_locked_index(): add flag to avoid writing unchanged index · 61000814
      Martin Ågren authored
      We have several callers like
      
      	if (active_cache_changed && write_locked_index(...))
      		handle_error();
      	rollback_lock_file(...);
      
      where the final rollback is needed because "!active_cache_changed"
      shortcuts the if-expression. There are also a few variants of this,
      including some if-else constructs that make it more clear when the
      explicit rollback is really needed.
      
      Teach `write_locked_index()` to take a new flag SKIP_IF_UNCHANGED and
      simplify the callers. Leave the most complicated of the callers (in
      builtin/update-index.c) unchanged. Rewriting it to use this new flag
      would end up duplicating logic.
      
      We could have made the new flag behave the other way round
      ("FORCE_WRITE"), but that could break existing users behind their backs.
      Let's take the more conservative approach. We can still migrate existing
      callers to use our new flag. Later we might even be able to flip the
      default, possibly without entirely ignoring the risk to in-flight or
      out-of-tree topics.
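
      The effect on callers can be modelled with a toy sketch.  Everything
      below is hypothetical and deliberately prefixed with "toy_": these are
      not git's real types, flag values, or signatures.  The sketch assumes
      the skip path releases the lock when the caller also asked for the lock
      to be committed, which is what lets callers drop their explicit
      rollback.

      ```c
      /* Toy model only: NOT git's real flags, types or signatures. */
      #define TOY_COMMIT_LOCK       (1u << 0)
      #define TOY_SKIP_IF_UNCHANGED (1u << 1)

      struct toy_index {
      	int changed; /* plays the role of active_cache_changed */
      	int locked;  /* 1 while the lock is held */
      	int writes;  /* how many times the index was written out */
      };

      static void toy_rollback(struct toy_index *idx)
      {
      	idx->locked = 0;
      }

      /*
       * With TOY_SKIP_IF_UNCHANGED, an unchanged index is not written; the
       * lock is released instead (assumed here to happen only when the caller
       * also passed TOY_COMMIT_LOCK).  Returns 0 on success.
       */
      static int toy_write_locked_index(struct toy_index *idx, unsigned flags)
      {
      	if ((flags & TOY_SKIP_IF_UNCHANGED) && !idx->changed) {
      		if (flags & TOY_COMMIT_LOCK)
      			toy_rollback(idx);
      		return 0;
      	}
      	idx->writes++;
      	idx->changed = 0;
      	if (flags & TOY_COMMIT_LOCK)
      		idx->locked = 0;
      	return 0;
      }
      ```

      A caller that used to test the "changed" flag itself and then roll
      back can, in this model, make a single call with both flags set.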
      Suggested-by: Jeff King <peff@peff.net>
      Signed-off-by: Martin Ågren <martin.agren@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  14. 22 Jan, 2018 1 commit
  15. 14 Sep, 2017 1 commit
    • avoid "write_in_full(fd, buf, len) != len" pattern · 06f46f23
      Jeff King authored
      The return value of write_in_full() is either "-1", or the
      requested number of bytes[1]. If we make a partial write
      before seeing an error, we still return -1, not a partial
      value. This goes back to f6aa66cb (write_in_full: really
      write in full or return error on disk full., 2007-01-11).
      
      So checking anything except "was the return value negative"
      is pointless. And there are a couple of reasons not to do
      so:
      
        1. It can do a funny signed/unsigned comparison. If your
           "len" is unsigned (e.g., a size_t) then the compiler will
           promote the "-1" to its unsigned variant.
      
           This works out for "!= len" (unless you really were
           trying to write the maximum size_t bytes), but is a
           bug if you check "< len" (an example of which was fixed
           recently in config.c).
      
           We should avoid promoting the mental model that you
           need to check the length at all, so that new sites are
           not tempted to copy us.
      
        2. Checking for a negative value is shorter to type,
           especially when the length is an expression.
      
        3. Linus says so. In d34cf19b (Clean up write_in_full()
           users, 2007-01-11), right after the write_in_full()
           semantics were changed, he wrote:
      
             I really wish every "write_in_full()" user would just
             check against "<0" now, but this fixes the nasty and
             stupid ones.
      
           Appeals to authority aside, this makes it clear that
           writing it this way does not have an intentional
           benefit. It's a historical curiosity that we never
           bothered to clean up (and which was undoubtedly
           cargo-culted into new sites).
      
      So let's convert these obviously-correct cases (this
      includes write_str_in_full(), which is just a wrapper for
      write_in_full()).
      
      [1] A careful reader may notice there is one way that
          write_in_full() can return a different value. If we ask
          write() to write N bytes and get a return value that is
          _larger_ than N, we could return a larger total. But
          besides the fact that this would imply a totally broken
          version of write(), it would already invoke undefined
          behavior. Our internal remaining counter is an unsigned
          size_t, which means that subtracting too many bytes will
          wrap it around to a very large number. So we'll instantly
          begin reading off the end of the buffer, trying to write
          gigabytes (or petabytes) of data.
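
      The signed/unsigned pitfall from point 1 can be shown in isolation.
      This is a sketch with hypothetical helper names ('write_failed' and
      'write_failed_buggy'), and it assumes a POSIX system where ssize_t is
      the signed counterpart of size_t.  The explicit cast only spells out the
      conversion the compiler would otherwise apply implicitly in a
      "ret < len" comparison.

      ```c
      #include <stddef.h>
      #include <sys/types.h>

      /* Preferred: only ask whether the write reported an error. */
      static int write_failed(ssize_t ret)
      {
      	return ret < 0;
      }

      /*
       * Buggy "ret < len" check.  For the comparison, ret is converted to
       * size_t, so -1 becomes SIZE_MAX and is never less than len: the error
       * return goes unnoticed.
       */
      static int write_failed_buggy(ssize_t ret, size_t len)
      {
      	return (size_t)ret < len;
      }
      ```

      For a return value of -1 and a length of 10, the first helper reports
      the failure while the second does not.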
      Signed-off-by: Jeff King <peff@peff.net>
      Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  16. 22 Aug, 2017 2 commits
    • rerere: allow approxidate in gc.rerereResolved/gc.rerereUnresolved · 6e96cb52
      Junio C Hamano authored
      These two configuration variables are described in the documentation
      to take an expiry period expressed in the number of days:
      
          gc.rerereResolved::
      	    Records of conflicted merge you resolved earlier are
      	    kept for this many days when 'git rerere gc' is run.
      	    The default is 60 days.
      
          gc.rerereUnresolved::
      	    Records of conflicted merge you have not resolved are
      	    kept for this many days when 'git rerere gc' is run.
      	    The default is 15 days.
      
      There is no strong reason not to allow a more general "approxidate"
      expiry specification, e.g. "5.days.ago", or "never".
      
      Rename the config_get_expiry() helper introduced in the previous
      step to git_config_get_expiry_in_days() and move it to a more
      generic place, config.c, and use date.c::parse_expiry_date() to do
      so.  Give it an ability to allow the caller to tell among three
      cases (i.e. there is no "gc.rerereResolved" config, there is and it
      is correctly parsed into the *expiry variable, and there was an
      error in parsing the given value).  The current caller can work
      correctly without using the return value, though.
      
      In the future, we may find other variables that only allow an
      integer that specifies "this many days" or other unit of time, and
      when it happens we may need to drop "_days" suffix from the name of
      the function and instead pass the "scale" value as another parameter.
      
      But this will do for now.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: represent time duration in timestamp_t internally · 5ea82279
      Junio C Hamano authored
      The two configuration variables, gc.rerereResolved and
      gc.rerereUnresolved, are measured in days and are passed as such
      into the prune_one() helper function, which worked in time_t to see
      if an entry in the rerere database is past its expiry.
      
      Instead, have the caller turn the number of days into the expiry
      timestamp.  Further, use timestamp_t instead of time_t.  This will
      make it possible to extend the way the configuration variable is
      spelled by using date.c::parse_expiry_date() that gives the expiry
      timestamp in timestamp_t.
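
      The conversion the caller now performs can be sketched like this.  It
      is an illustration under assumptions, not the code in rerere.c: the
      local 'timestamp_t' typedef is a stand-in taken to be an unsigned
      64-bit integer, and 'expiry_from_days' and 'is_expired' are
      hypothetical helper names.

      ```c
      #include <stdint.h>

      /* Assumption: local stand-in for git's timestamp_t. */
      typedef uint64_t timestamp_t;

      /*
       * Turn "keep for this many days" into an absolute cut-off timestamp.
       * The result is clamped at 0 so the unsigned subtraction cannot wrap.
       */
      static timestamp_t expiry_from_days(timestamp_t now, unsigned int days)
      {
      	timestamp_t span = (timestamp_t)days * 24 * 60 * 60;

      	return span < now ? now - span : 0;
      }

      /* An entry is past its expiry if it was last touched before the cut-off. */
      static int is_expired(timestamp_t mtime, timestamp_t expiry)
      {
      	return mtime < expiry;
      }
      ```

      Once the pruning helper only sees a cut-off timestamp, it no longer
      matters whether that timestamp came from a day count or from a more
      general date specification.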
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  17. 16 Jun, 2017 1 commit
  18. 15 Jun, 2017 1 commit
  19. 26 May, 2017 3 commits
  20. 07 Dec, 2016 1 commit
    • hold_locked_index(): align error handling with hold_lockfile_for_update() · b3e83cc7
      Junio C Hamano authored
      Callers of the hold_locked_index() function pass 0 when they want to
      prepare to write a new version of the index file without wishing to
      die or emit an error message when the request fails (e.g. somebody
      else already held the lock), and pass 1 when they want the call to
      die upon failure.
      
      This option is called LOCK_DIE_ON_ERROR by the underlying lockfile
      API, and the hold_locked_index() function translates the parameter to
      LOCK_DIE_ON_ERROR when calling the hold_lock_file_for_update().
      
      Replace these hardcoded '1' with LOCK_DIE_ON_ERROR and stop
      translating.  Callers other than the ones that are replaced with
      this change pass '0' to the function; no behaviour change is
      intended with this patch.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
      ---
      
      Among the callers of hold_locked_index() that pass 0:
      
       - diff.c::refresh_index_quietly() at the end of "git diff" is an
         opportunistic update; it leaks the lockfile structure but it is
         just before the program exits and nobody should care.
      
       - builtin/describe.c::cmd_describe(),
         builtin/commit.c::cmd_status(),
         sequencer.c::read_and_refresh_cache() are all opportunistic
         updates and they are OK.
      
       - builtin/update-index.c::cmd_update_index() takes a lock upfront
         but we may end up not needing to update the index (i.e. the
         entries may be fully up-to-date), in which case we do not need to
         issue an error upon failure to acquire the lock.  We do diagnose
         and die if we indeed need to update, so it is OK.
      
       - wt-status.c::require_clean_work_tree() IS BUGGY.  It asks for
         silence but does not check the returned value.  Compare with
         callsites like cmd_describe() and cmd_status() to notice that it
         is wrong to call update_index_if_able() unconditionally.
  21. 07 Sep, 2016 1 commit
  22. 19 May, 2016 1 commit
  23. 11 May, 2016 1 commit
  24. 09 May, 2016 1 commit
  25. 06 Apr, 2016 2 commits
    • rerere: adjust 'forget' to multi-variant world order · 890fca84
      Junio C Hamano authored
      Because conflicts with the same contents inside conflict blocks
      enclosed by "<<<<<<<" and ">>>>>>>" can now have multiple variants
      to help three-way merge to adjust to the differences outside the
      conflict blocks, "rerere forget $path" needs to be taught that there
      may be multiple recorded resolutions that share the same conflict
      hash (which groups the conflicts with "the same contents inside
      conflict blocks"), among which there are some that would not be
      relevant to the conflict we are looking at.  These "other variants"
      that happen to share the same conflict hash should not be cleared,
      and the variant that would apply to the current conflict may not be
      the zero-th one (which is the only one that is cleared by the
      current code).
      
      After finding the conflict hash, iterate over the existing variants
      and try to resolve the conflict using each of them to find the one
      that "cleanly" resolves the current conflict.  That is the one we
      want to forget and record the preimage for, so that the user can
      record the corrected resolution.
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • rerere: split code to call ll_merge() further · 0ce02b36
      Junio C Hamano authored
      The merge() helper function is given an existing rerere ID (i.e. the
      name of the .git/rr-cache/* subdirectory, and the variant number)
      that identifies one <preimage, postimage> pair, tries to see if the
      conflicted state in the given path can be resolved by using the pair,
      and if this succeeds, updates the conflicted path with the
      result in the working tree.
      
      To implement rerere_forget() in the multiple variant world, we'd
      need a helper to do the "see if a <preimage, postimage> pair cleanly
      resolves a conflicted state we have in-core" part, without actually
      touching any file in the working tree, in order to identify which
      variant(s) to remove.  Split the logic to do so into a separate
      helper function try_merge() out of merge().
      Signed-off-by: Junio C Hamano <gitster@pobox.com>