1. 05 May, 2017 1 commit
    • Jeff King's avatar
      shell: disallow repo names beginning with dash · 3ec80449
      Jeff King authored
      When a remote server uses git-shell, the client side will
      connect to it like:
      
        ssh server "git-upload-pack 'foo.git'"
      
      and we literally exec ("git-upload-pack", "foo.git"). In
      early versions of upload-pack and receive-pack, we took a
      repository argument and nothing else. But over time they
      learned to accept dashed options. If the user passes a
      repository name that starts with a dash, the results are
      confusing at best (we complain of a bogus option instead of
      a non-existent repository) and malicious at worst (the user
      can start an interactive pager via "--help").
      
      We could pass "--" to the sub-process to make sure the
      user's argument is interpreted as a branch name. I.e.:
      
        git-upload-pack -- -foo.git
      
      But adding "--" automatically would make us inconsistent
      with a normal shell (i.e., when git-shell is not in use),
      where "-foo.git" would still be an error. For that case, the
      client would have to specify the "--", but they can't do so
      reliably, as existing versions of git-shell do not allow
      more than a single argument.
      
      The simplest thing is to simply disallow "-" at the start of
      the repo name argument. This hasn't worked either with or
      without git-shell since version 1.0.0, and nobody has
      complained.
      
      Note that this patch just applies to do_generic_cmd(), which
      runs upload-pack, receive-pack, and upload-archive. There
      are two other types of commands that git-shell runs:
      
        - do_cvs_cmd(), but this already restricts the argument to
          be the literal string "server"
      
        - admin-provided commands in the git-shell-commands
          directory. We'll pass along arbitrary arguments there,
          so these commands could have similar problems. But these
          commands might actually understand dashed arguments, so
          we cannot just block them here. It's up to the writer of
          the commands to make sure they are safe. With great
          power comes great responsibility.
      Reported-by: BlueC0re's avatarTimo Schmid <tschmid@ernw.de>
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      3ec80449
  2. 17 Mar, 2016 2 commits
    • Junio C Hamano's avatar
      Git 2.4.11 · 76542869
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      76542869
    • Junio C Hamano's avatar
      Merge branch 'jk/path-name-safety-2.4' into maint-2.4 · 32c6dca8
      Junio C Hamano authored
      Bugfix patches were backported from the 'master' front to plug heap
      corruption holes, to catch integer overflow in the computation of
      pathname lengths, and to get rid of the name_path API.  Both of
      these would have resulted in writing over an under-allocated buffer
      when formulating pathnames while tree traversal.
      
      * jk/path-name-safety-2.4:
        list-objects: pass full pathname to callbacks
        list-objects: drop name_path entirely
        list-objects: convert name_path to a strbuf
        show_object_with_name: simplify by using path_name()
        http-push: stop using name_path
        tree-diff: catch integer overflow in combine_diff_path allocation
        add helpers for detecting size_t overflow
      32c6dca8
  3. 16 Mar, 2016 7 commits
    • Jeff King's avatar
      list-objects: pass full pathname to callbacks · 2824e184
      Jeff King authored
      When we find a blob at "a/b/c", we currently pass this to
      our show_object_fn callbacks as two components: "a/b/" and
      "c". Callbacks which want the full value then call
      path_name(), which concatenates the two. But this is an
      inefficient interface; the path is a strbuf, and we could
      simply append "c" to it temporarily, then roll back the
      length, without creating a new copy.
      
      So we could improve this by teaching the callsites of
      path_name() this trick (and there are only 3). But we can
      also notice that no callback actually cares about the
      broken-down representation, and simply pass each callback
      the full path "a/b/c" as a string. The callback code becomes
      even simpler, then, as we do not have to worry about freeing
      an allocated buffer, nor rolling back our modification to
      the strbuf.
      
      This is theoretically less efficient, as some callbacks
      would not bother to format the final path component. But in
      practice this is not measurable. Since we use the same
      strbuf over and over, our work to grow it is amortized, and
      we really only pay to memcpy a few bytes.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      2824e184
    • Jeff King's avatar
      list-objects: drop name_path entirely · dc06dc88
      Jeff King authored
      In the previous commit, we left name_path as a thin wrapper
      around a strbuf. This patch drops it entirely. As a result,
      every show_object_fn callback needs to be adjusted. However,
      none of their code needs to be changed at all, because the
      only use was to pass it to path_name(), which now handles
      the bare strbuf.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      dc06dc88
    • Jeff King's avatar
      list-objects: convert name_path to a strbuf · f3badaed
      Jeff King authored
      The "struct name_path" data is examined in only two places:
      we generate it in process_tree(), and we convert it to a
      single string in path_name(). Everyone else just passes it
      through to those functions.
      
      We can further note that process_tree() already keeps a
      single strbuf with the leading tree path, for use with
      tree_entry_interesting().
      
      Instead of building a separate name_path linked list, let's
      just use the one we already build in "base". This reduces
      the amount of code (especially tricky code in path_name()
      which did not check for integer overflows caused by deep
      or large pathnames).
      
      It is also more efficient in some instances.  Any time we
      were using tree_entry_interesting, we were building up the
      strbuf anyway, so this is an immediate and obvious win
      there. In cases where we were not, we trade off storing
      "pathname/" in a strbuf on the heap for each level of the
      path, instead of two pointers and an int on the stack (with
      one pointer into the tree object). On a 64-bit system, the
      latter is 20 bytes; so if path components are less than that
      on average, this has lower peak memory usage.  In practice
      it probably doesn't matter either way; we are already
      holding in memory all of the tree objects leading up to each
      pathname, and for normal-depth pathnames, we are only
      talking about hundreds of bytes.
      
      This patch leaves "struct name_path" as a thin wrapper
      around the strbuf, to avoid disrupting callbacks. We should
      fix them, but leaving it out makes this diff easier to view.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      f3badaed
    • Jeff King's avatar
      show_object_with_name: simplify by using path_name() · 8eee9f92
      Jeff King authored
      When "git rev-list" shows an object with its associated path
      name, it does so by walking the name_path linked list and
      printing each component (stopping at any embedded NULs or
      newlines).
      
      We'd like to eventually get rid of name_path entirely in
      favor of a single buffer, and dropping this custom printing
      code is part of that. As a first step, let's use path_name()
      to format the list into a single buffer, and print that.
      This is strictly less efficient than the original, but it's
      a temporary step in the refactoring; our end game will be to
      get the fully formatted name in the first place.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      8eee9f92
    • Jeff King's avatar
      http-push: stop using name_path · c6bd2a1d
      Jeff King authored
      The graph traversal code here passes along a name_path to
      build up the pathname at which we find each blob. But we
      never actually do anything with the resulting names, making
      it a waste of code and memory.
      
      This usage came in aa1dbc98 (Update http-push functionality,
      2006-03-07), and originally the result was passed to
      "add_object" (which stored it, but didn't really use it,
      either). But we stopped using that function in 1f1e895f (Add
      "named object array" concept, 2006-06-19) in favor of
      storing just the objects themselves.
      
      Moreover, the generation of the name in process_tree() is
      buggy. It sticks "name" onto the end of the name_path linked
      list, and then passes it down again as it recurses (instead
      of "entry.path"). So it's a good thing this was unused, as
      the resulting path for "a/b/c/d" would end up as "a/a/a/a".
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      c6bd2a1d
    • Jeff King's avatar
      tree-diff: catch integer overflow in combine_diff_path allocation · d7701878
      Jeff King authored
      A combine_diff_path struct has two "flex" members allocated
      alongside the struct: a string to hold the pathname, and an
      array of parent pointers. We use an "int" to compute this,
      meaning we may easily overflow it if the pathname is
      extremely long.
      
      We can fix this by using size_t, and checking for overflow
      with the st_add helper.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      d7701878
    • Jeff King's avatar
      add helpers for detecting size_t overflow · 935de812
      Jeff King authored
      Performing computations on size_t variables that we feed to
      xmalloc and friends can be dangerous, as an integer overflow
      can cause us to allocate a much smaller chunk than we
      realized.
      
      We already have unsigned_add_overflows(), but let's add
      unsigned_mult_overflows() to that. Furthermore, rather than
      have each site manually check and die on overflow, we can
      provide some helpers that will:
      
        - promote the arguments to size_t, so that we know we are
          doing our computation in the same size of integer that
          will ultimately be fed to xmalloc
      
        - check and die on overflow
      
        - return the result so that computations can be done in
          the parameter list of xmalloc.
      
      These functions are a lot uglier to use than normal
      arithmetic operators (you have to do "st_add(foo, bar)"
      instead of "foo + bar"). To at least limit the damage, we
      also provide multi-valued versions. So rather than:
      
        st_add(st_add(a, b), st_add(c, d));
      
      you can write:
      
        st_add4(a, b, c, d);
      
      This isn't nearly as elegant as a varargs function, but it's
      a lot harder to get it wrong. You don't have to remember to
      add a sentinel value at the end, and the compiler will
      complain if you get the number of arguments wrong. This
      patch adds only the numbered variants required to convert
      the current code base; we can easily add more later if
      needed.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      935de812
  4. 28 Sep, 2015 9 commits
    • Junio C Hamano's avatar
      Git 2.4.10 · a2558fb8
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      a2558fb8
    • Junio C Hamano's avatar
      Sync with 2.3.10 · 6343e2f6
      Junio C Hamano authored
      6343e2f6
    • Junio C Hamano's avatar
      Git 2.3.10 · 18b58f70
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      18b58f70
    • Junio C Hamano's avatar
      92cdfd21
    • Jeff King's avatar
      merge-file: enforce MAX_XDIFF_SIZE on incoming files · 83c4d380
      Jeff King authored
      The previous commit enforces MAX_XDIFF_SIZE at the
      interfaces to xdiff: xdi_diff (which calls xdl_diff) and
      ll_xdl_merge (which calls xdl_merge).
      
      But we have another direct call to xdl_merge in
      merge-file.c. If it were written today, this probably would
      just use the ll_merge machinery. But it predates that code,
      and uses slightly different options to xdl_merge (e.g.,
      ZEALOUS_ALNUM).
      
      We could try to abstract out an xdi_merge to match the
      existing xdi_diff, but even that is difficult. Rather than
      simply report error, we try to treat large files as binary,
      and that distinction would happen outside of xdi_merge.
      
      The simplest fix is to just replicate the MAX_XDIFF_SIZE
      check in merge-file.c.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      83c4d380
    • Jeff King's avatar
      xdiff: reject files larger than ~1GB · dcd1742e
      Jeff King authored
      The xdiff code is not prepared to handle extremely large
      files. It uses "int" in many places, which can overflow if
      we have a very large number of lines or even bytes in our
      input files. This can cause us to produce incorrect diffs,
      with no indication that the output is wrong. Or worse, we
      may even underallocate a buffer whose size is the result of
      an overflowing addition.
      
      We're much better off to tell the user that we cannot diff
      or merge such a large file. This patch covers both cases,
      but in slightly different ways:
      
        1. For merging, we notice the large file and cleanly fall
           back to a binary merge (which is effectively "we cannot
           merge this").
      
        2. For diffing, we make the binary/text distinction much
           earlier, and in many different places. For this case,
           we'll use the xdi_diff as our choke point, and reject
           any diff there before it hits the xdiff code.
      
           This means in most cases we'll die() immediately after.
           That's not ideal, but in practice we shouldn't
           generally hit this code path unless the user is trying
           to do something tricky. We already consider files
           larger than core.bigfilethreshold to be binary, so this
           code would only kick in when that is circumvented
           (either by bumping that value, or by using a
           .gitattribute to mark a file as diffable).
      
           In other words, we can avoid being "nice" here, because
           there is already nice code that tries to do the right
           thing. We are adding the suspenders to the nice code's
           belt, so notice when it has been worked around (both to
           protect the user from malicious inputs, and because it
           is better to die() than generate bogus output).
      
      The maximum size was chosen after experimenting with feeding
      large files to the xdiff code. It's just under a gigabyte,
      which leaves room for two obvious cases:
      
        - a diff3 merge conflict result on files of maximum size X
          could be 3*X plus the size of the markers, which would
          still be only about 3G, which fits in a 32-bit int.
      
        - some of the diff code allocates arrays of one int per
          record. Even if each file consists only of blank lines,
          then a file smaller than 1G will have fewer than 1G
          records, and therefore the int array will fit in 4G.
      
      Since the limit is arbitrary anyway, I chose to go under a
      gigabyte, to leave a safety margin (e.g., we would not want
      to overflow by allocating "(records + 1) * sizeof(int)" or
      similar.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      dcd1742e
    • Jeff King's avatar
      react to errors in xdi_diff · 3efb9880
      Jeff King authored
      When we call into xdiff to perform a diff, we generally lose
      the return code completely. Typically by ignoring the return
      of our xdi_diff wrapper, but sometimes we even propagate
      that return value up and then ignore it later.  This can
      lead to us silently producing incorrect diffs (e.g., "git
      log" might produce no output at all, not even a diff header,
      for a content-level diff).
      
      In practice this does not happen very often, because the
      typical reason for xdiff to report failure is that it
      malloc() failed (it uses straight malloc, and not our
      xmalloc wrapper).  But it could also happen when xdiff
      triggers one our callbacks, which returns an error (e.g.,
      outf() in builtin/rerere.c tries to report a write failure
      in this way). And the next patch also plans to add more
      failure modes.
      
      Let's notice an error return from xdiff and react
      appropriately. In most of the diff.c code, we can simply
      die(), which matches the surrounding code (e.g., that is
      what we do if we fail to load a file for diffing in the
      first place). This is not that elegant, but we are probably
      better off dying to let the user know there was a problem,
      rather than simply generating bogus output.
      
      We could also just die() directly in xdi_diff, but the
      callers typically have a bit more context, and can provide a
      better message (and if we do later decide to pass errors up,
      we're one step closer to doing so).
      
      There is one interesting case, which is in diff_grep(). Here
      if we cannot generate the diff, there is nothing to match,
      and we silently return "no hits". This is actually what the
      existing code does already, but we make it a little more
      explicit.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      3efb9880
    • Junio C Hamano's avatar
    • Junio C Hamano's avatar
  5. 25 Sep, 2015 3 commits
    • Blake Burkhart's avatar
      http: limit redirection depth · b2581164
      Blake Burkhart authored
      By default, libcurl will follow circular http redirects
      forever. Let's put a cap on this so that somebody who can
      trigger an automated fetch of an arbitrary repository (e.g.,
      for CI) cannot convince git to loop infinitely.
      
      The value chosen is 20, which is the same default that
      Firefox uses.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      b2581164
    • Blake Burkhart's avatar
      http: limit redirection to protocol-whitelist · f4113cac
      Blake Burkhart authored
      Previously, libcurl would follow redirection to any protocol
      it was compiled for support with. This is desirable to allow
      redirection from HTTP to HTTPS. However, it would even
      successfully allow redirection from HTTP to SFTP, a protocol
      that git does not otherwise support at all. Furthermore
      git's new protocol-whitelisting could be bypassed by
      following a redirect within the remote helper, as it was
      only enforced at transport selection time.
      
      This patch limits redirects within libcurl to HTTP, HTTPS,
      FTP and FTPS. If there is a protocol-whitelist present, this
      list is limited to those also allowed by the whitelist. As
      redirection happens from within libcurl, it is impossible
      for an HTTP redirect to a protocol implemented within
      another remote helper.
      
      When the curl version git was compiled with is too old to
      support restrictions on protocol redirection, we warn the
      user if GIT_ALLOW_PROTOCOL restrictions were requested. This
      is a little inaccurate, as even without that variable in the
      environment, we would still restrict SFTP, etc, and we do
      not warn in that case. But anything else means we would
      literally warn every time git accesses an http remote.
      
      This commit includes a test, but it is not as robust as we
      would hope. It redirects an http request to ftp, and checks
      that curl complained about the protocol, which means that we
      are relying on curl's specific error message to know what
      happened. Ideally we would redirect to a working ftp server
      and confirm that we can clone without protocol restrictions,
      and not with them. But we do not have a portable way of
      providing an ftp server, nor any other protocol that curl
      supports (https is the closest, but we would have to deal
      with certificates).
      
      [jk: added test and version warning]
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      f4113cac
    • Jeff King's avatar
      transport: refactor protocol whitelist code · 5088d3b3
      Jeff King authored
      The current callers only want to die when their transport is
      prohibited. But future callers want to query the mechanism
      without dying.
      
      Let's break out a few query functions, and also save the
      results in a static list so we don't have to re-parse for
      each query.
      Based-on-a-patch-by: Blake Burkhart's avatarBlake Burkhart <bburky@bburky.com>
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      5088d3b3
  6. 23 Sep, 2015 2 commits
    • Jeff King's avatar
      submodule: allow only certain protocols for submodule fetches · 33cfccbb
      Jeff King authored
      Some protocols (like git-remote-ext) can execute arbitrary
      code found in the URL. The URLs that submodules use may come
      from arbitrary sources (e.g., .gitmodules files in a remote
      repository). Let's restrict submodules to fetching from a
      known-good subset of protocols.
      
      Note that we apply this restriction to all submodule
      commands, whether the URL comes from .gitmodules or not.
      This is more restrictive than we need to be; for example, in
      the tests we run:
      
        git submodule add ext::...
      
      which should be trusted, as the URL comes directly from the
      command line provided by the user. But doing it this way is
      simpler, and makes it much less likely that we would miss a
      case. And since such protocols should be an exception
      (especially because nobody who clones from them will be able
      to update the submodules!), it's not likely to inconvenience
      anyone in practice.
      Reported-by: Blake Burkhart's avatarBlake Burkhart <bburky@bburky.com>
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      33cfccbb
    • Jeff King's avatar
      transport: add a protocol-whitelist environment variable · a5adaced
      Jeff King authored
      If we are cloning an untrusted remote repository into a
      sandbox, we may also want to fetch remote submodules in
      order to get the complete view as intended by the other
      side. However, that opens us up to attacks where a malicious
      user gets us to clone something they would not otherwise
      have access to (this is not necessarily a problem by itself,
      but we may then act on the cloned contents in a way that
      exposes them to the attacker).
      
      Ideally such a setup would sandbox git entirely away from
      high-value items, but this is not always practical or easy
      to set up (e.g., OS network controls may block multiple
      protocols, and we would want to enable some but not others).
      
      We can help this case by providing a way to restrict
      particular protocols. We use a whitelist in the environment.
      This is more annoying to set up than a blacklist, but
      defaults to safety if the set of protocols git supports
      grows). If no whitelist is specified, we continue to default
      to allowing all protocols (this is an "unsafe" default, but
      since the minority of users will want this sandboxing
      effect, it is the only sensible one).
      
      A note on the tests: ideally these would all be in a single
      test file, but the git-daemon and httpd test infrastructure
      is an all-or-nothing proposition rather than a test-by-test
      prerequisite. By putting them all together, we would be
      unable to test the file-local code on machines without
      apache.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      a5adaced
  7. 04 Sep, 2015 10 commits
    • Junio C Hamano's avatar
      Git 2.4.9 · 74b67638
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      74b67638
    • Junio C Hamano's avatar
      Sync with 2.3.9 · ef0e938a
      Junio C Hamano authored
      ef0e938a
    • Junio C Hamano's avatar
      Git 2.3.9 · ecad27cf
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      ecad27cf
    • Junio C Hamano's avatar
      Sync with 2.2.3 · 8267cd11
      Junio C Hamano authored
      8267cd11
    • Junio C Hamano's avatar
      Git 2.2.3 · 441c4a40
      Junio C Hamano authored
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      441c4a40
    • Junio C Hamano's avatar
      f54cb059
    • Jeff King's avatar
      show-branch: use a strbuf for reflog descriptions · 78f23bdf
      Jeff King authored
      When we show "branch@{0}", we format into a fixed-size
      buffer using sprintf. This can overflow if you have long
      branch names. We can fix it by using a temporary strbuf.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      78f23bdf
    • Jeff King's avatar
      read_info_alternates: handle paths larger than PATH_MAX · 5015f01c
      Jeff King authored
      This function assumes that the relative_base path passed
      into it is no larger than PATH_MAX, and writes into a
      fixed-size buffer. However, this path may not have actually
      come from the filesystem; for example, add_submodule_odb
      generates a path using a strbuf and passes it in. This is
      hard to trigger in practice, though, because the long
      submodule directory would have to exist on disk before we
      would try to open its info/alternates file.
      
      We can easily avoid the bug, though, by simply creating the
      filename on the heap.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      5015f01c
    • Jeff King's avatar
      notes: use a strbuf in add_non_note · c29edfef
      Jeff King authored
      When we are loading a notes tree into our internal hash
      table, we also collect any files that are clearly non-notes.
      We format the name of the file into a PATH_MAX buffer, but
      unlike true notes (which cannot be larger than a fanned-out
      sha1 hash), these tree entries can be arbitrarily long,
      overflowing our buffer.
      
      We can fix this by switching to a strbuf. It doesn't even
      cost us an extra allocation, as we can simply hand ownership
      of the buffer over to the non-note struct.
      
      This is of moderate security interest, as you might fetch
      notes trees from an untrusted remote. However, we do not do
      so by default, so you would have to manually fetch into the
      notes namespace.
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      c29edfef
    • Jeff King's avatar
      verify_absent: allow filenames longer than PATH_MAX · f514ef97
      Jeff King authored
      When unpack-trees wants to know whether a path will
      overwrite anything in the working tree, we use lstat() to
      see if there is anything there. But if we are going to write
      "foo/bar", we can't just lstat("foo/bar"); we need to look
      for leading prefixes (e.g., "foo"). So we use the lstat cache
      to find the length of the leading prefix, and copy the
      filename up to that length into a temporary buffer (since
      the original name is const, we cannot just stick a NUL in
      it).
      
      The copy we make goes into a PATH_MAX-sized buffer, which
      will overflow if the prefix is longer than PATH_MAX. How
      this happens is a little tricky, since in theory PATH_MAX is
      the biggest path we will have read from the filesystem. But
      this can happen if:
      
        - the compiled-in PATH_MAX does not accurately reflect
          what the filesystem is capable of
      
        - the leading prefix is not _quite_ what is on disk; it
          contains the next element from the name we are checking.
          So if we want to write "aaa/bbb/ccc/ddd" and "aaa/bbb"
          exists, the prefix of interest is "aaa/bbb/ccc". If
          "aaa/bbb" approaches PATH_MAX, then "ccc" can overflow
          it.
      
      So this can be triggered, but it's hard to do. In
      particular, you cannot just "git clone" a bogus repo. The
      verify_absent checks happen before unpack-trees writes
      anything to the filesystem, so there are never any leading
      prefixes during the initial checkout, and the bug doesn't
      trigger. And by definition, these files are larger than
      PATH_MAX, so writing them will fail, and clone will
      complain (though it may write a partial path, which will
      cause a subsequent "git checkout" to hit the bug).
      
      We can fix it by creating the temporary path on the heap.
      The extra malloc overhead is not important, as we are
      already making at least one stat() call (and probably more
      for the prefix discovery).
      Signed-off-by: default avatarJeff King <peff@peff.net>
      Signed-off-by: default avatarJunio C Hamano <gitster@pobox.com>
      f514ef97
  8. 03 Aug, 2015 6 commits