1. 09 Jul, 2017 6 commits
    • Jeff King's avatar
      reflog-walk: apply --since/--until to reflog dates · de239446
      Jeff King authored
      When doing a reflog walk, we use the commit's date to
      do any date limiting. In earlier versions of Git, this could
      lead to nonsense results, since a skipped commit would
      truncate the traversal. So a sequence like:
      
        git commit ...
        git checkout week-old-branch
        git checkout -
        git log -g --since=1.day.ago
      
      would stop at the week-old-branch, even though the "git
      commit" entry further back is still interesting.
      
      As of the prior commit, which uses a parent-less traversal
      of the reflog, you get the whole reflog minus any commits
      whose dates do not match the specified options. This is
      arguably useful, as you could scan the reflogs for commits
      that originated in a certain range.
      
      But more likely a user doing a reflog walk wants to limit
      based on the reflog entries themselves. You can simulate
      --until with:
      
        git log -g @{1.day.ago}
      
      but there's no way to ask Git to traverse only back to a
      certain date. E.g.:
      
        # show me reflog entries from the past day
        git log -g --since=1.day.ago
      
      This patch teaches the revision machinery to prefer the
      reflog entry dates to the commit dates when doing a reflog
      walk. Technically this is a change in behavior that affects
      plumbing, but the previous behavior was so buggy that it's
      unlikely anyone was relying on it.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      de239446
    • Jeff King's avatar
      reflog-walk: stop using fake parents · d08565bf
      Jeff King authored
      The reflog-walk system works by putting a ref's tip into the
      pending queue, and then "traversing" the reflog by
      pretending that the parent of each commit is the previous
      reflog entry.
      
      This causes a number of user-visible oddities, as documented
      in t1414 (and the commit message which introduced it). We
      can fix all of them in one go by replacing the fake-reflog
      system with a much simpler one: just keeping a list of
      reflogs to show, and walking through them entry by entry.
      
      The implementation is fairly straight-forward, but there are
      a few items to note:
      
        1. We obviously must skip calling add_parents_to_list()
           when we are traversing reflogs, since we do not want to
           walk the original parents at all.  As a result, we must call
           try_to_simplify_commit() ourselves.
      
           There are other parts of add_parents_to_list() we skip,
           as well, but none of them should matter for a reflog
           traversal:
      
             -  We do not allow UNINTERESTING commits, nor
                symmetric ranges (and we bail when these are used
                with "-g").
      
             - Using --source makes no sense, since we aren't
               traversing. The reflog selector shows the same
               information with more detail.
      
             - Using --first-parent is still sensible, since you
               may want to see the first-parent diff for each
               entry. But since we're not traversing, we don't
               need to cull the parent list here.
      
        2. Since we now just walk the reflog entries themselves,
           rather than starting with the ref tip, we now look at
           the "new" field of each entry rather than the "old"
           (i.e., we are showing entries, not faking parents).
           This removes all of the tricky logic around skipping
           past root commits.
      
           But note that we have no way to show an entry with the
           null sha1 in its "new" field (because such a commit
           obviously does not exist). Normally this would not
           happen, since we delete reflogs along with refs, but
           there is one special case. When we rename the currently
           checked out branch, we write two reflog entries into
           the HEAD log: one where the commit goes away, and
           another where it comes back.
      
           Prior to this commit, we show both entries with
           identical reflog messages. After this commit, we show
           only the "comes back" entry. See the update in t3200
           which demonstrates this.
      
           Arguably either is fine, as the whole double-entry
           thing is a bit hacky in the first place. And until a
           recent fix, we truncated the traversal in such a case
           anyway, which was _definitely_ wrong.
      
        3. We show individual reflogs in order, but choose which
           reflog to show at each stage based on which has the
           most recent timestamp.  This interleaves the output
           from multiple reflogs based on date order, which is
           probably what you'd want with limiting like "-n 30".
      
           Note that the implementation aims for simplicity. It
           does a linear walk over the reflog queue for each
           commit it pulls, which may perform badly if you
           interleave an enormous number of reflogs. That seems
           like an unlikely use case; if we did want to handle it,
           we could probably keep a priority queue of reflogs,
           ordered by the timestamp of their current tip entry.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      d08565bf
    • Jeff King's avatar
      rev-list: check reflog_info before showing usage · 7f97de5e
      Jeff King authored
      When git-rev-list sees no pending commits, it shows a usage
      message. This works even when reflog-walking is requested,
      because the reflog-walk code currently puts the reflog tips
      into the pending queue.
      
      In preparation for refactoring the reflog-walk code, let's
      explicitly check whether we have any reflogs to walk. For
      now this is a noop, but the existing reflog tests will make
      sure that it kicks in after the refactoring. Likewise, we'll
      add a test that "rev-list -g" without specifying any reflogs
      continues to fail (so that we know our check does not kick
      in too aggressively).
      
      Note that the implementation needs to go into its own
      sub-function, as the walk code does not expose its innards
      outside of reflog-walk.c.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      7f97de5e
    • Jeff King's avatar
      get_revision_1(): replace do-while with an early return · 7c2f08aa
      Jeff King authored
      The get_revision_1() function tries to avoid entering its
      main loop at all when there are no commits to look at. But
      it's perfectly safe to call pop_commit() on an empty list
      (in which case it will return NULL). Switching to an early
      return from the loop lets us skip repeating the loop
      condition before we enter the do-while. That will get more
      important when we start pulling reflog-walk commits from a
      source besides the revs->commits queue, as that condition
      will get much more complicated.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      7c2f08aa
    • Jeff King's avatar
      log: do not free parents when walking reflog · f35650df
      Jeff King authored
      When we're doing a reflog walk (instead of walking the
      actual parent pointers), we may see commits multiple times.
      For this reason, we hold on to the commit buffer for each
      commit rather than freeing it after we've showed the commit.
      
      We should do the same for the parent list. Right now this is
      just a minor optimization. But once we refactor how reflog
      walks are performed, keeping the parents will avoid
      confusing us the second time we see the commit.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      f35650df
    • Jeff King's avatar
      log: clarify comment about reflog cycles · 822601e8
      Jeff King authored
      When we're walking reflogs, we leave the commit buffer and
      parents in place. A comment explains that this is due to
      "cycles". But the interesting thing is the unsaid
      implication: that the cycles (plus our clearing of the SEEN
      flag) will cause us to show commits multiple times. Let's
      spell it out.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      822601e8
  2. 07 Jul, 2017 16 commits
    • Jeff King's avatar
      revision: disallow reflog walking with revs->limited · 82fd0f4a
      Jeff King authored
      The reflog-walk code doesn't work with limit_list().  That
      function traverses down the real history graph, not the fake
      reflog history that get_revision() returns. So it's not
      going to actually examine all of the commits we're going to
      show, because we'd add them to the pending list only during
      the actual traversal.
      
      In practice this limitation doesn't really matter, because
      the options that require list-limiting generally need
      UNINTERESTING endpoints or symmetric ranges, which already
      are forbidden for reflog walks. Still, there are likely some
      corner cases that would behave oddly. We're better off to
      warn the user that we can't fulfill their request than to
      generate potentially wrong output.
      
      This will also make it easier to refactor the reflog-walking
      code, because it eliminates a whole area of corner cases
      we'd have to consider (that already don't work anyway).
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      82fd0f4a
    • Jeff King's avatar
      t1414: document some reflog-walk oddities · 7cf686b9
      Jeff King authored
      Since its inception, the general strategy of the reflog-walk
      code has been to start with the tip commit for the ref, and
      as we traverse replace each commit's parent pointers with
      fake parents pointing to the previous reflog entry.
      
      This lets us traverse the reflog as if it were a real
      history, but it has some user-visible oddities. Namely:
      
        1. The fake parents are used for commit selection and
           display. So for example, "--merges" or "--no-merges"
           are not useful, because the history appears as a linear
           string of commits. Likewise, pathspec limiting is based
           on the diff between adjacent entries, not the changes
           actually introduced by a commit.
      
           These are often the same (e.g., because the entry was
           just running "git commit" and the adjacent entry _is_
           the true parent), but it may not be in several common
           cases. For instance, using "git reset" to jump around
           history, or "git checkout" to move HEAD.
      
        2. We reverse-map each commit back to its reflog. So when
           it comes time to show commit X, we say "a-ha, we added
           X because it was at the tip of the 'foo' reflog, so
           let's show the foo reflog". But this leads to nonsense
           results when you ask to traverse multiple reflogs: if
           two reflogs have the same tip commit, we only map back
           to one of them.  Instead, we should show both.
      
        3. If the tip of the reflog and the ref tip disagree on
           the current value, we show the ref tip but give no
           indication of the value in the reflog.  This situation
           isn't supposed to happen (since any ref update should
           touch the reflog). But if it does, given that the
           requested operation is to show the reflog, it makes
           sense to prefer that.
      
      This commit adds a new script with several expect_failure
      tests to demonstrate the problems.  This could be part of
      the existing t1411, but it's a bit easier to start from a
      fresh state, where we know exactly what will be in the log.
      
      Since the new multiple-reflog tests are checking the actual
      output, we can drop the "make sure we don't segfault" tests
      from t1411, which are a strict subset of what we're doing
      here.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      7cf686b9
    • Junio C Hamano's avatar
      Merge branch 'jk/reflog-walk-maint' into jk/reflog-walk · be5982a7
      Junio C Hamano authored
      * jk/reflog-walk-maint:
        reflog-walk: include all fields when freeing complete_reflogs
        reflog-walk: don't free reflogs added to cache
        reflog-walk: duplicate strings in complete_reflogs list
        reflog-walk: skip over double-null oid due to HEAD rename
      be5982a7
    • Jeff King's avatar
      reflog-walk: include all fields when freeing complete_reflogs · e30d463d
      Jeff King authored
      When we encounter an error adding reflogs for a walk, we try
      to free any logs we have read. But we didn't free all
      fields, meaning that we could in theory leak all of the
      "items" array (which would consitute the bulk of the
      allocated memory).
      
      This patch adds a helper which frees all of the entries and
      uses it as appropriate.
      
      As it turns out, the leak seems impossible to trigger with
      the current code. Of the three error paths that free the
      complete_reflogs struct, two only kick in when the items
      array is empty, and the third was removed entirely in the
      previous commit.
      
      So this patch should be a noop in terms of behavior, but it
      fixes a potential maintenance headache should anybody add a
      new error path and copy the partial-free code. Which is
      what happened in 5026b471 (add_reflog_for_walk: avoid
      memory leak, 2017-05-04), though its leaky call was the
      third one that was recently removed.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      e30d463d
    • Jeff King's avatar
      reflog-walk: don't free reflogs added to cache · 8aae3cf7
      Jeff King authored
      The add_reflog_for_walk() function keeps a cache mapping
      refnames to their reflog contents. We use a cached reflog
      entry if available, and otherwise allocate and store a new
      one.
      
      Since 5026b471 (add_reflog_for_walk: avoid memory leak,
      2017-05-04), when we hit an error parsing a date-based
      reflog spec, we free the reflog memory but leave the cache
      entry pointing to the now-freed memory.
      
      We can fix this by just leaving the memory intact once it
      has made it into the cache. This may leave an unused entry
      in the cache, but that's OK. And it means we also catch a
      similar situation: we may not have allocated at all in this
      invocation, but simply be pointing to a cached entry from a
      previous invocation (which is relying on that entry being
      present).
      
      The new test in t1411 exercises this case and fails when run
      with --valgrind or ASan.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      8aae3cf7
    • Jeff King's avatar
      reflog-walk: duplicate strings in complete_reflogs list · 75afe7ac
      Jeff King authored
      As part of the add_reflog_to_walk() function, we keep a
      string_list mapping refnames to their reflog contents. This
      serves as a cache so that accessing the same reflog twice
      requires only a single copy of the log in memory.
      
      The string_list is initialized via xcalloc, meaning its
      strdup_strings field is set to 0. But after inserting a
      string into the list, we unconditionally call free() on the
      string, leaving the list pointing to freed memory. If
      another reflog is added (e.g., "git log -g HEAD HEAD"), then
      the second one may have unpredictable results.
      
      The extra free was added by 5026b471 (add_reflog_for_walk:
      avoid memory leak, 2017-05-04). Though if you look
      carefully, you can see that the code was buggy even before
      then. If we tried to read the reflogs by time but came up
      with no entries, we exited with an error, freeing the string
      in that code path. So the bug was harder to trigger, but
      still there.
      
      We can fix it by just asking the string list to make a copy
      of the string. Technically we could fix the problem by not
      calling free() on our string (and just handing over
      ownership to the string list), but there are enough
      conditionals that it's quite hard to figure out which code
      paths need the free and which do not. Simpler is better
      here.
      
      The new test reliably shows the problem when run with
      --valgrind or ASAN.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      75afe7ac
    • Junio C Hamano's avatar
      Fifteenth batch for 2.14 · 8b2efe2a
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      8b2efe2a
    • Junio C Hamano's avatar
      Merge branch 'ab/strbuf-addftime-tzname-boolify' · 6ba649e4
      Junio C Hamano authored
      strbuf_addftime() is further getting tweaked.
      
      * ab/strbuf-addftime-tzname-boolify:
        strbuf: change an always NULL/"" strbuf_addftime() param to bool
        strbuf.h comment: discuss strbuf_addftime() arguments in order
      6ba649e4
    • Junio C Hamano's avatar
      Merge branch 'xz/send-email-batch-size' · eb37527a
      Junio C Hamano authored
      "git send-email" learned to overcome some SMTP server limitation
      that does not allow many pieces of e-mails to be sent over a single
      session.
      
      * xz/send-email-batch-size:
        send-email: --batch-size to work around some SMTP server limit
      eb37527a
    • Junio C Hamano's avatar
      Merge branch 'js/t5534-rev-parse-gives-multi-line-output-fix' · ccce1e51
      Junio C Hamano authored
      A few tests that tried to verify the contents of push certificates
      did not use 'git rev-parse' to formulate the line to look for in
      the certificate correctly.
      
      * js/t5534-rev-parse-gives-multi-line-output-fix:
        t5534: fix misleading grep invocation
      ccce1e51
    • Junio C Hamano's avatar
      Merge branch 'sb/merge-recursive-code-cleanup' · 2f4bcd8b
      Junio C Hamano authored
      Code clean-up.
      
      * sb/merge-recursive-code-cleanup:
        merge-recursive: use DIFF_XDL_SET macro
      2f4bcd8b
    • Junio C Hamano's avatar
      Merge branch 'rs/apply-avoid-over-reading' · f9b3252b
      Junio C Hamano authored
      Code clean-up to fix possible buffer over-reading.
      
      * rs/apply-avoid-over-reading:
        apply: use starts_with() in gitdiff_verify_name()
      f9b3252b
    • Junio C Hamano's avatar
      Merge branch 'ab/sha1dc-maint' · cbb8704a
      Junio C Hamano authored
      Update the sha1dc again to fix portability glitches.
      
      * ab/sha1dc-maint:
        sha1dc: update from upstream
      cbb8704a
    • Junio C Hamano's avatar
      Merge branch 'jc/utf8-fprintf' · 62458ea3
      Junio C Hamano authored
      Code cleanup.
      
      * jc/utf8-fprintf:
        submodule--helper: do not call utf8_fprintf() unnecessarily
      62458ea3
    • Junio C Hamano's avatar
      Merge branch 'js/fsck-name-object' · 8f58a34c
      Junio C Hamano authored
      Test fix.
      
      * js/fsck-name-object:
        t1450: use egrep for regexp "alternation"
      8f58a34c
    • Junio C Hamano's avatar
      Merge branch 'aw/contrib-subtree-doc-asciidoctor' · 33cc9cfc
      Junio C Hamano authored
      The Makefile rule in contrib/subtree for building documentation
      learned to honour USE_ASCIIDOCTOR just like the main documentation
      set does.
      
      * aw/contrib-subtree-doc-asciidoctor:
        subtree: honour USE_ASCIIDOCTOR when set
      33cc9cfc
  3. 05 Jul, 2017 8 commits
    • Junio C Hamano's avatar
      Fourteenth batch for 2.14 · 50ff9ea4
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      50ff9ea4
    • Junio C Hamano's avatar
      Merge branch 'jt/unify-object-info' · 00b7cf23
      Junio C Hamano authored
      Code clean-ups.
      
      * jt/unify-object-info:
        sha1_file: refactor has_sha1_file_with_flags
        sha1_file: do not access pack if unneeded
        sha1_file: teach sha1_object_info_extended more flags
        sha1_file: refactor read_object
        sha1_file: move delta base cache code up
        sha1_file: rename LOOKUP_REPLACE_OBJECT
        sha1_file: rename LOOKUP_UNKNOWN_OBJECT
        sha1_file: teach packed_object_info about typename
      00b7cf23
    • Junio C Hamano's avatar
      Merge branch 'cc/shared-index-permfix' · 8e90578f
      Junio C Hamano authored
      The split index code did not honor core.sharedrepository setting
      correctly.
      
      * cc/shared-index-permfix:
        t1700: make sure split-index respects core.sharedrepository
        t1301: move modebits() to test-lib-functions.sh
        read-cache: use shared perms when writing shared index
      8e90578f
    • Junio C Hamano's avatar
      Merge branch 'rs/sha1-name-readdir-optim' · 5ab148dd
      Junio C Hamano authored
      Optimize "what are the object names already taken in an alternate
      object database?" query that is used to derive the length of prefix
      an object name is uniquely abbreviated to.
      
      * rs/sha1-name-readdir-optim:
        sha1_file: guard against invalid loose subdirectory numbers
        sha1_file: let for_each_file_in_obj_subdir() handle subdir names
        p4205: add perf test script for pretty log formats
        sha1_name: cache readdir(3) results in find_short_object_filename()
      5ab148dd
    • Junio C Hamano's avatar
      Merge branch 'bw/repo-object' · 85ce4a68
      Junio C Hamano authored
      Introduce a "repository" object to eventually make it easier to
      work in multiple repositories (the primary focus is to work with
      the superproject and its submodules) in a single process.
      
      * bw/repo-object:
        ls-files: use repository object
        repository: enable initialization of submodules
        submodule: convert is_submodule_initialized to work on a repository
        submodule: add repo_read_gitmodules
        submodule-config: store the_submodule_cache in the_repository
        repository: add index_state to struct repo
        config: read config from a repository object
        path: add repo_worktree_path and strbuf_repo_worktree_path
        path: add repo_git_path and strbuf_repo_git_path
        path: worktree_git_path() should not use file relocation
        path: convert do_git_path to take a 'struct repository'
        path: convert strbuf_git_common_path to take a 'struct repository'
        path: always pass in commondir to update_common_dir
        path: create path.h
        environment: store worktree in the_repository
        environment: place key repository state in the_repository
        repository: introduce the repository object
        environment: remove namespace_len variable
        setup: add comment indicating a hack
        setup: don't perform lazy initialization of repository state
      85ce4a68
    • Jeff King's avatar
      reflog-walk: skip over double-null oid due to HEAD rename · 2272d3e5
      Jeff King authored
      Since 39ee4c6c (branch: record creation of renamed branch
      in HEAD's log, 2017-02-20), a rename on the currently
      checked out branch will create two entries in the HEAD
      reflog: one where the branch goes away (switching to the
      null oid), and one where it comes back (switching away from
      the null oid).
      
      This confuses the reflog-walk code. When walking backwards,
      it first sees the null oid in the "old" field of the second
      entry. Thanks to the "root commit" logic added by 71abeb75
      (reflog: continue walking the reflog past root commits,
      2016-06-03), we keep looking for the next entry by scanning
      the "new" field from the previous entry. But that field is
      also null! We need to go just a tiny bit further, and look
      at its "old" field. But with the current code, we decide the
      reflog has nothing else to show and just give up. To the
      user this looks like the reflog was truncated by the rename
      operation, when in fact those entries are still there.
      
      This patch does the absolute minimal fix, which is to look
      back that one extra level and keep traversing.
      
      The resulting behavior may not be the _best_ thing to do in
      the long run (for example, we show both reflog entries each
      with the same commit id), but it's a simple way to fix the
      problem without risking further regressions.
      Signed-off-by: default avatarJeff King <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      2272d3e5
    • Johannes Schindelin's avatar
      t5534: fix misleading grep invocation · 8722947e
      Johannes Schindelin authored
      It seems to be a little-known feature of `grep` (and it certainly came
      as a surprise to this here developer who believed to know the Unix tools
      pretty well) that multiple patterns can be passed in the same
      command-line argument simply by separating them by newlines. Watch, and
      learn:
      
      	$ printf '1\n2\n3\n' | grep "$(printf '1\n3\n')"
      	1
      	3
      
      That behavior also extends to patterns passed via `-e`, and it is not
      modified by passing the option `-E` (but trying this with -P issues the
      error "grep: the -P option only supports a single pattern").
      
      It seems that there are more old Unix hands who are surprised by this
      behavior, as grep invocations of the form
      
      	grep "$(git rev-parse A B) C" file
      
      were introduced in a85b377d (push: the beginning of "git push
      --signed", 2014-09-12), and later faithfully copy-edited in b9459019
      (push: heed user.signingkey for signed pushes, 2014-10-22).
      
      Please note that the output of `git rev-parse A B` separates the object
      IDs via *newlines*, not via spaces, and those newlines are preserved
      because the interpolation is enclosed in double quotes.
      
      As a consequence, these tests try to validate that the file contains
      either A's object ID, or B's object ID followed by C, or both. Clearly,
      however, what the test wanted to see is that there is a line that
      contains all of them.
      
      This is clearly unintended, and the grep invocations in question really
      match too many lines.
      
      Fix the test by avoiding the newlines in the patterns.
      Signed-off-by: Johannes Schindelin's avatarJohannes Schindelin <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      8722947e
    • xiaoqiang zhao's avatar
      send-email: --batch-size to work around some SMTP server limit · 5453b83b
      xiaoqiang zhao authored
      Some email servers (e.g. smtp.163.com) limit the number emails to be
      sent per session (connection) and this will lead to a faliure when
      sending many messages.
      
      Teach send-email to disconnect after sending a number of messages
      (configurable via the --batch-size=<num> option), wait for a few
      seconds (configurable via the --relogin-delay=<seconds> option) and
      reconnect, to work around such a limit.
      
      Also add two configuration variables to give these options the default.
      
      Note:
      
        We will use this as a band-aid for now, but in the longer term, we
        should look at and react to the SMTP error code from the server;
        Xianqiang reports that 450 and 451 are returned by problematic
        servers.
      
        cf. https://public-inbox.org/git/[email protected]/Signed-off-by: default avatarxiaoqiang zhao <[email protected]>
      Signed-off-by: default avatarJunio C Hamano <[email protected]>
      5453b83b
  4. 03 Jul, 2017 1 commit
  5. 01 Jul, 2017 2 commits
  6. 30 Jun, 2017 7 commits