1. 14 Mar, 2018 4 commits
  2. 15 Jun, 2017 1 commit
  3. 09 May, 2017 1 commit
    • archive-tar: fix a sparse 'constant too large' warning · 3f789719
      Ramsay Jones authored
      Commit dddbad72 ("timestamp_t: a new data type for timestamps",
      26-04-2017) introduced a new typedef 'timestamp_t', as a synonym for an
      unsigned long, which was used at the time to represent timestamps in
      git. A later commit 28f4aee3 ("use uintmax_t for timestamps",
      26-04-2017) changed the typedef to use an 'uintmax_t' for the timestamp
      representation type.
      
      When building on a 32-bit Linux system, sparse complains that a constant
      (USTAR_MAX_MTIME) used to detect a 'far-future mtime' timestamp, is too
      large; 'warning: constant 077777777777UL is so big it is unsigned long
      long' on lines 335 and 338 of archive-tar.c. Note that both gcc and
      clang only issue a warning if this constant is used in a context that
      requires an 'unsigned long' (rather than a uintmax_t). (Since TIME_MAX
      is no longer equal to 0xFFFFFFFF, even on a 32-bit system, the macro
      USTAR_MAX_MTIME is set to 077777777777UL, which cannot be represented as
      an 'unsigned long' constant.)
      
      In order to suppress the warning, change the definition of the macro
      constant USTAR_MAX_MTIME to use an 'ULL' type suffix.
      
      In a similar vein, on systems which use a 64-bit representation of the
      'unsigned long' type, the USTAR_MAX_SIZE constant macro is defined with
      the value 077777777777ULL. Although this does not cause any warning
      messages to be issued, it would be more appropriate for this constant
      to use an 'UL' type suffix rather than 'ULL'.
      Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
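
As a rough illustration of the suffix change described above, the following sketch defines the limit with an explicit 'ULL' suffix and compares a timestamp against it. The typedef follows the description of commit 28f4aee3; the helper name `mtime_exceeds_ustar_limit` is hypothetical and is not taken from archive-tar.c.

```c
#include <stdint.h>

/* Assumption: timestamp_t is uintmax_t, as described for commit 28f4aee3. */
typedef uintmax_t timestamp_t;

/*
 * Eleven octal digits of 7, i.e. 2^33 - 1. The value does not fit in a
 * 32-bit 'unsigned long', so the ULL suffix states the intended width
 * explicitly instead of relying on implicit promotion.
 */
#define USTAR_MAX_MTIME 077777777777ULL

/* Hypothetical helper: 1 if mtime is too large for the ustar mtime field. */
int mtime_exceeds_ustar_limit(timestamp_t mtime)
{
	return mtime > USTAR_MAX_MTIME;
}
```

With this definition the comparison is carried out in a type at least 64 bits wide on every platform, which is the property the warning was pointing at.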
  4. 27 Apr, 2017 1 commit
    • timestamp_t: a new data type for timestamps · dddbad72
      Johannes Schindelin authored
      Git's source code assumes that unsigned long is at least as precise as
      time_t, which is incorrect and causes a lot of problems, in particular
      where unsigned long is only 32-bit (notably on Windows, even in 64-bit
      versions).
      
      So let's just use a more appropriate data type instead. In preparation
      for this, we introduce the new `timestamp_t` data type.
      
      By necessity, this is a very, very large patch, as it has to replace all
      timestamps' data type in one go.
      
      As we will use a data type that is not necessarily identical to `time_t`,
      we need to be very careful to use `time_t` whenever we interact with the
      system functions, and `timestamp_t` everywhere else.
      Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
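
The rule in the last paragraph (use `time_t` only at the boundary with system functions) might look roughly like the sketch below. It assumes `timestamp_t` is an unsigned type wider than or equal to `time_t`; the helper `timestamp_to_time_t` is hypothetical and is not a function from git.

```c
#include <stdint.h>
#include <time.h>

/* Assumption for this sketch: an unsigned type at least as wide as time_t. */
typedef uintmax_t timestamp_t;

/*
 * Hypothetical helper: convert to time_t only when a system function
 * needs it, and refuse values that do not survive the round trip.
 * Returns 0 on success and -1 if the timestamp is not representable.
 */
int timestamp_to_time_t(timestamp_t ts, time_t *out)
{
	time_t t = (time_t)ts;

	if (t < 0 || (timestamp_t)t != ts)
		return -1;
	*out = t;
	return 0;
}
```

A caller would keep the value as `timestamp_t` everywhere else and call a helper like this just before passing it to something such as `gmtime()`.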
  5. 06 Aug, 2016 1 commit
  6. 15 Jul, 2016 1 commit
  7. 01 Jul, 2016 3 commits
    • archive-tar: drop return value · 5caeeb83
      Jeff King authored
      We never do any error checks, and so never return anything
      but "0". Let's just drop this to simplify the code.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • archive-tar: write extended headers for far-future mtime · 6e8e0991
      Jeff King authored
      The ustar format represents timestamps as seconds since the
      epoch, but only has room to store 11 octal digits.  To
      express anything larger, we need to use an extended header.
      This is exactly the same case we fixed for the size field in
      the previous commit, and the solution here follows the same
      pattern.
      
      This is even mentioned as an issue in f2f02675 (archive-tar:
      use xsnprintf for trivial formatting, 2015-09-24), but since
      it only affected things far in the future, it wasn't deemed
      worth dealing with. But note that my calculations claiming
      thousands of years were off there; because our xsnprintf
      produces a NUL byte, we only have until the year 2242 to fix
      this.
      
      Given that this is just around the corner (geologically
      speaking, anyway), and because it's easy to fix, let's just
      make it work. Unlike the previous fix for "size", where we
      had to write an individual extended header for each file, we
      can write one global header (since we have only one mtime
      for the whole archive).
      
      There's a slight bit of trickiness there. We may already be
      writing a global header with a "comment" field for the
      commit sha1. So we need to write our new field into the same
      header. To do this, we push the decision of whether to write
      such a header down into write_global_extended_header(),
      which will now assemble the header as it sees fit, and will
      return early if we have nothing to write (in practice, we'll
      only have a large mtime if it comes from a commit, but this
      makes it also work if you set your system clock ahead such
      that time() returns a huge value).
      
      Note that we don't (and never did) handle negative
      timestamps (i.e., before 1970). This would probably not be
      too hard to support in the same way, but since git does not
      support negative timestamps at all, I didn't bother here.
      
      After writing the extended header, we munge the timestamp in
      the ustar headers to the maximum-allowable size. This is
      wrong, but it's the least-wrong thing we can provide to a
      tar implementation that doesn't understand pax headers (it's
      also what GNU tar does).
      Helped-by: René Scharfe <l.s.r@web.de>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
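
The arithmetic behind the "year 2242" figure and the clamping step can be checked with a small sketch. It assumes an average Gregorian year of 31,556,952 seconds; both function names are hypothetical and only illustrate the idea in the message above.

```c
#include <stdint.h>

/* Largest value that fits in 11 octal digits: 2^33 - 1 seconds. */
#define USTAR_MAX_MTIME 077777777777ULL

/*
 * Hypothetical helper: the value to store in the ustar header itself.
 * Larger timestamps are clamped to the maximum; the exact value would
 * then be carried in a pax extended header.
 */
uintmax_t ustar_header_mtime(uintmax_t mtime)
{
	return mtime > USTAR_MAX_MTIME ? USTAR_MAX_MTIME : mtime;
}

/* Approximate calendar year in which the 11-digit field runs out. */
unsigned int ustar_mtime_limit_year(void)
{
	const uintmax_t secs_per_year = 31556952; /* 365.2425 days */

	return 1970 + (unsigned int)(USTAR_MAX_MTIME / secs_per_year);
}
```

Under that assumption, 8,589,934,591 seconds is a little over 272 years, which lands in 2242.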
    • archive-tar: write extended headers for file sizes >= 8GB · d1657b57
      Jeff King authored
      The ustar format has a fixed-length field for the size of
      each file entry which is supposed to contain up to 11 bytes
      of octal-formatted data plus a NUL or space terminator.
      
      This means that the largest size we can represent is
      077777777777, or 1 byte short of 8GB. The correct solution
      for a larger file, according to POSIX.1-2001, is to add an
      extended pax header, similar to how we handle long
      filenames. This patch does that, and writes zero for the
      size field in the ustar header (the last bit is not
      mentioned by POSIX, but it matches how GNU tar behaves with
      --format=pax).
      
      This should be a strict improvement over the current
      behavior, which is to die in xsnprintf with a "BUG".
      However, there's some interesting history here.
      
      Prior to f2f02675 (archive-tar: use xsnprintf for trivial
      formatting, 2015-09-24), we silently overflowed the "size"
      field. The extra bytes ended up in the "mtime" field of the
      header, which was then immediately written itself,
      overwriting our extra bytes. What that means depends on how
      many bytes we wrote.
      
      If the size was 64GB or greater, then we actually overflowed
      digits into the mtime field, meaning our value was
      effectively right-shifted by those lost octal digits. And
      this patch is again a strict improvement over that.
      
      But if the size was between 8GB and 64GB, then our 12-byte
      field held all of the actual digits, and only our NUL
      terminator overflowed. According to POSIX, there should be a
      NUL or space at the end of the field. However, GNU tar seems
      to be lenient here, and will correctly parse a size up to 64GB
      (minus one) from the field. So sizes in this range might
      have just worked, depending on the implementation reading
      the tarfile.
      
      This patch is mostly still an improvement there, as the 8GB
      limit is specifically mentioned in POSIX as the correct
      limit. But it's possible that it could be a regression
      (versus the pre-f2f02675 state) if all of the following are
      true:
      
        1. You have a file between 8GB and 64GB.
      
        2. Your tar implementation _doesn't_ know about pax
           extended headers.
      
        3. Your tar implementation _does_ parse 12-byte sizes from
           the ustar header without a delimiter.
      
      It's probably not worth worrying about such an obscure set
      of conditions, but I'm documenting it here just in case.
      Helped-by: René Scharfe <l.s.r@web.de>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
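
A pax extended header record has the form "<length> <keyword>=<value>\n", where the length counts the whole record, including its own digits. The sketch below shows one way to build such a record for an oversized file, plus the zero-size fallback described in the message. Both function names are hypothetical and are not the ones used in archive-tar.c.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Largest size that fits in 11 octal digits: one byte short of 8GB. */
#define USTAR_MAX_SIZE 077777777777ULL

/*
 * Hypothetical helper: write one pax record into buf.
 * Returns the record length, or -1 if buf is too small.
 */
int format_pax_record(char *buf, size_t bufsize,
		      const char *key, const char *value)
{
	/* Everything after the length digits: ' ' key '=' value '\n' */
	size_t body = 1 + strlen(key) + 1 + strlen(value) + 1;
	size_t len = body + 1; /* at least one length digit */
	size_t tmp;
	int n;

	/* Add one more digit each time the total crosses a power of ten. */
	for (tmp = 1; len / 10 >= tmp; tmp *= 10)
		len++;

	n = snprintf(buf, bufsize, "%zu %s=%s\n", len, key, value);
	if (n < 0 || (size_t)n >= bufsize || (size_t)n != len)
		return -1;
	return n;
}

/* Hypothetical helper: size to store in the ustar header field itself. */
uintmax_t ustar_header_size(uintmax_t size)
{
	return size > USTAR_MAX_SIZE ? 0 : size;
}
```

For a file of exactly 8GB (8589934592 bytes), the record would be "19 size=8589934592" followed by a newline, and the ustar size field would be written as zero.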
  8. 26 May, 2016 1 commit
    • archive-tar: convert snprintf to xsnprintf · 9e6c1e91
      Jeff King authored
      Commit f2f02675 (archive-tar: use xsnprintf for trivial
      formatting, 2015-09-24) converted cases of "sprintf" to
      "xsnprintf", but accidentally left one as just "snprintf".
      This meant that we could silently truncate the resulting
      buffer instead of flagging an error.
      
      In practice, this is impossible to achieve, as we are
      formatting a ustar checksum, which can be at most 7
      characters. But the point of xsnprintf is to document and
      check for "should be impossible" conditions; this site was
      just accidentally mis-converted during f2f02675.
      Noticed-by: Paul Green <Paul.Green@stratus.com>
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
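
To see why the checksum is bounded, here is a minimal sketch of the ustar checksum rule: the unsigned sum of all 512 header bytes, with the 8-byte checksum field itself counted as spaces. The function name is hypothetical; the field offset follows the standard ustar layout (name 100, mode 8, uid 8, gid 8, size 12, mtime 12, then chksum).

```c
#include <stddef.h>

#define USTAR_HEADER_SIZE   512
#define USTAR_CHKSUM_OFFSET 148
#define USTAR_CHKSUM_LEN    8

/* Hypothetical helper: checksum of a 512-byte ustar header block. */
unsigned int ustar_checksum(const unsigned char *header)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < USTAR_HEADER_SIZE; i++) {
		if (i >= USTAR_CHKSUM_OFFSET &&
		    i < USTAR_CHKSUM_OFFSET + USTAR_CHKSUM_LEN)
			sum += ' '; /* the checksum field counts as spaces */
		else
			sum += header[i];
	}
	return sum;
}
```

Even if every other byte were 0xFF, the sum would be 504 * 255 + 8 * 32 = 128776, which is six octal digits, so a seven-character octal field cannot overflow.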
  9. 25 Sep, 2015 3 commits
    • archive-tar: use xsnprintf for trivial formatting · f2f02675
      Jeff King authored
      When we generate tar headers, we sprintf() values directly
      into a struct with the fixed-size header values. For the
      most part this is fine, as we are formatting small values
      (e.g., the octal format of "mode & 07777" is of fixed
      length). But it's still a good idea to use xsnprintf here.
      It communicates to readers what our expectation is, and it
      provides a run-time check that we are not overflowing the
      buffers.
      
      The one exception here is the mtime, which comes from the
      epoch time of the commit we are archiving. For sane values,
      this fits into the 12-byte value allocated in the header.
      But since git can handle 64-bit times, if I claim to be a
      visitor from the year 10,000 AD, I can overflow the buffer.
      This turns out to be harmless, as we simply overflow into
      the chksum field, which is then overwritten.
      
      This case is also best as an xsnprintf. It should never come
      up, short of extremely malformed dates, and in that case we
      are probably better off dying than silently truncating the
      date value (and we cannot expand the size of the buffer,
      since it is dictated by the ustar format). Our friends in
      the year 5138 (when we legitimately flip to a 12-digit
      epoch) can deal with that problem then.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • convert trivial sprintf / strcpy calls to xsnprintf · 5096d490
      Jeff King authored
      We sometimes sprintf into fixed-size buffers when we know
      that the buffer is large enough to fit the input (either
      because it's a constant, or because it's numeric input that
      is bounded in size). Likewise with strcpy of constant
      strings.
      
      However, these sites make it hard to audit sprintf and
      strcpy calls for buffer overflows, as a reader has to
      cross-reference the size of the array with the input. Let's
      use xsnprintf instead, which communicates to a reader that
      we don't expect this to overflow (and catches the mistake in
      case we do).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
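
The idea of a formatting call that treats truncation as a programming error can be sketched as below. This is not git's xsnprintf; `checked_snprintf` is a hypothetical stand-in that aborts where git would report a bug.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an xsnprintf-style wrapper: format into a
 * fixed-size buffer and abort if the result would not fit, instead of
 * silently truncating. Returns the number of characters written.
 */
int checked_snprintf(char *dst, size_t max, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(dst, max, fmt, ap);
	va_end(ap);

	if (len < 0 || (size_t)len >= max) {
		fprintf(stderr, "BUG: formatted output does not fit buffer\n");
		abort();
	}
	return len;
}
```

Formatting a file mode into an 8-byte header field, for example with the format "%07o" and the value `0100644 & 07777`, would write "0000644" plus the terminating NUL and return 7.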
    • archive-tar: fix minor indentation violation · 108332c7
      Jeff King authored
      This looks like a simple omission from 85390709 (archive-tar:
      unindent write_tar_entry by one level, 2012-05-03).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  10. 20 Oct, 2014 1 commit
  11. 20 Aug, 2014 1 commit
  12. 04 Aug, 2014 1 commit
  13. 23 Jan, 2013 1 commit
  14. 06 Jan, 2013 1 commit
    • archive-tar: split long paths more carefully · 22f0dcd9
      René Scharfe authored
      The name field of a tar header has a size of 100 characters.  This limit
      was extended long ago in a backward compatible way by providing the
      additional prefix field, which can hold 155 additional characters.  The
      actual path is constructed at extraction time by concatenating the prefix
      field, a slash and the name field.
      
      get_path_prefix() is used to determine which slash in the path is used as
      the cutting point and thus which part of it is placed into the field
      prefix and which into the field name.  It tries to cram as much into the
      prefix field as possible.  (And only if we can't fit a path into the
      provided 255 characters we use a pax extended header to store it.)
      
      If a path is longer than 100 but shorter than 156 characters and ends
      with a slash (i.e. is for a directory) then get_path_prefix() puts the
      whole path in the prefix field and leaves the name field empty.  GNU tar
      reconstructs the path without complaint, but the tar included with
      NetBSD 6 does not: It reports the header to be invalid.
      
      For compatibility with this version of tar, make sure to never leave the
      name field empty.  In order to do that, trim the trailing slash from the
      part considered as possible prefix, if it exists -- that way the last
      path component (or more, but not less) will end up in the name field.
      Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
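
The splitting rule described above can be sketched as follows. This is an illustrative reimplementation, not the get_path_prefix() from archive-tar.c; the function names and the exact scanning strategy are assumptions.

```c
#include <stddef.h>

#define USTAR_NAME_MAX   100
#define USTAR_PREFIX_MAX 155

/*
 * Hypothetical helper: pick the slash at which to cut the path.
 * Returns the prefix length (the index of that slash), or 0 if no
 * usable slash exists. A trailing slash is skipped first, so the last
 * path component always ends up in the name field.
 */
size_t ustar_prefix_len(const char *path, size_t pathlen)
{
	size_t end = pathlen;
	size_t i;

	if (end > 1 && path[end - 1] == '/')
		end--; /* never cut at the trailing slash of a directory */

	/* A cut at index i yields an i-byte prefix, so i may be at most 155. */
	i = end < USTAR_PREFIX_MAX + 1 ? end : USTAR_PREFIX_MAX + 1;
	while (i > 0) {
		i--;
		if (path[i] == '/')
			return i;
	}
	return 0;
}

/* 1 if the path fits the ustar fields, 0 if a pax header is needed. */
int ustar_path_fits(const char *path, size_t pathlen)
{
	size_t plen, rest;

	if (pathlen <= USTAR_NAME_MAX)
		return 1;
	plen = ustar_prefix_len(path, pathlen);
	rest = pathlen - plen - 1; /* bytes after the cutting slash */
	return plen > 0 && rest <= USTAR_NAME_MAX;
}
```

For a 110-character directory path made of a 50-character component, a slash, a 58-character component, and a trailing slash, the cut lands at index 50, leaving the second component (with its trailing slash) in the name field rather than leaving the name empty.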
  15. 13 Jun, 2012 1 commit
  16. 18 May, 2012 1 commit
  17. 03 May, 2012 4 commits
  18. 22 Jun, 2011 6 commits
    • upload-archive: allow user to turn off filters · 7b97730b
      Jeff King authored
      Some tar filters may be very expensive to run, so sites do
      not want to expose them via upload-archive. This patch lets
      users configure tar.<filter>.remote to turn them off.
      
      By default, gzip filters are left on, as they are about as
      expensive as creating zip archives.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • archive: provide builtin .tar.gz filter · 0e804e09
      Jeff King authored
      This works exactly as if the user had configured it via:
      
        [tar "tgz"]
      	command = gzip -cn
        [tar "tar.gz"]
      	command = gzip -cn
      
      but since it is so common, it's convenient to have it
      builtin without the user needing to do anything.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • archive: implement configurable tar filters · 767cf457
      Jeff King authored
      It's common to pipe the tar output produced by "git archive"
      through gzip or some other compressor. Locally, this can
      easily be done by using a shell pipe. When requesting a
      remote archive, though, it cannot be done through the
      upload-archive interface.
      
      This patch allows configurable tar filters, so that one
      could define a "tar.gz" format that automatically pipes tar
      output through gzip.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • archive: pass archiver struct to write_archive callback · 4d7c9898
      Jeff King authored
      The current archivers are very static; when you are in the
      write_tar_archive function, you know you are writing a tar.
      However, to facilitate runtime-configurable archivers
      that will share a common write function, we need to tell the
      function which archiver was used.
      
      As a convenience, we also provide an opaque data pointer in
      the archiver struct so that individual archivers can put
      something useful there when they register themselves.
      Technically they could just use the "name" field to look in
      an internal map of names to data, but this is much simpler.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
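
A much-simplified sketch of the pattern follows: archivers register a struct, the shared write callback receives that struct, and per-archiver state travels in an opaque data pointer. The struct layout, the fixed-size table, and all function names here are assumptions for illustration and differ from git's actual archive.h.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified, hypothetical archiver description. */
struct archiver {
	const char *name;
	int (*write_archive)(const struct archiver *ar, const char *tree);
	void *data; /* opaque, interpreted only by the registering code */
};

#define MAX_ARCHIVERS 16
static const struct archiver *archivers[MAX_ARCHIVERS];
static size_t nr_archivers;

/* Add an archiver to the table; returns -1 if the table is full. */
int register_archiver(const struct archiver *ar)
{
	if (nr_archivers == MAX_ARCHIVERS)
		return -1;
	archivers[nr_archivers++] = ar;
	return 0;
}

/* Find a registered archiver by format name, or NULL. */
const struct archiver *lookup_archiver(const char *name)
{
	size_t i;

	for (i = 0; i < nr_archivers; i++)
		if (!strcmp(archivers[i]->name, name))
			return archivers[i];
	return NULL;
}

/*
 * One callback shared by several formats: it learns which filter to
 * use from the archiver it was called for, via the opaque pointer.
 */
int write_filtered_tar(const struct archiver *ar, const char *tree)
{
	const char *filter = ar->data;

	if (!filter)
		return -1;
	printf("would pipe tar of %s through: %s\n", tree, filter);
	return 0;
}
```

Registering a "tar.gz" entry whose data pointer holds the string "gzip -cn" and then looking it up by name would let the shared callback pick the right filter without any per-format function.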
    • archive: refactor list of archive formats · 13e0f88d
      Jeff King authored
      Most of the tar and zip code was nicely split out into two
      abstracted files which knew only about their specific
      formats. The entry point to this code was a single "write
      archive" function.
      
      However, as these basic formats grow more complex (e.g., by
      handling multiple file extensions and format names), a
      static list of the entry point functions won't be enough.
      Instead, let's provide a way for the tar and zip code to
      tell the main archive code what they support by registering
      archiver names and functions.
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
    • archive-tar: don't reload default config options · 40e76291
      Jeff King authored
      We load our own tar-specific config, and then chain to
      git_default_config. This is pointless, as our caller should
      already have loaded the default config. It also introduces a
      needless inconsistency with the zip archiver, which does not
      look at the config files at all (and therefore relies on the
      caller to have loaded config).
      Signed-off-by: Jeff King <peff@peff.net>
      Signed-off-by: Junio C Hamano <gitster@pobox.com>
  19. 09 May, 2009 1 commit
  20. 12 Oct, 2008 1 commit
  21. 19 Jul, 2008 1 commit
  22. 15 Jul, 2008 3 commits
  23. 09 Jun, 2008 1 commit