bundle.c

  • 9a84794a
    bundle: avoid closing file descriptor twice
    Johannes Schindelin authored and Junio C Hamano committed
    
    Already when introduced in c7a8a162 (Add bundle transport,
    2007-09-10), the `bundle` transport had a bug where it would open a file
    descriptor to the bundle file and then close it _twice_: First, the file
    descriptor (`data->fd`) is passed to `unbundle()`, which would use it as
    the `stdin` of the `index-pack` process, which as a consequence would
    close it via `start_command()`. However, `data->fd` would still hold the
    numerical value of the file descriptor, and `close_bundle()` would see
    that and happily close it again.
    
    This seems not to have caused too many problems in almost two decades,
    but I encountered a situation today where it _does_ cause problems: In
    i686 variants of Git for Windows, it seems that file descriptors are
    reused quickly after they have been closed.
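
The hazard described above can be reproduced outside of Git. The following is a minimal sketch, not code from Git itself: it assumes a POSIX system where `/dev/null` exists and where `open()` returns the lowest free descriptor number, and the function name `stale_fd_closes_unrelated_file()` is hypothetical. It keeps a stale copy of a closed descriptor number, opens a second file that receives the same number, and then shows that closing the stale copy invalidates the second file.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical demonstration helper (not part of Git).
 *
 * Returns 1 if closing a stale descriptor number ended up closing an
 * unrelated, newly opened file, 0 if the number was not reused, and
 * -1 if the setup itself failed.
 */
int stale_fd_closes_unrelated_file(void)
{
	int bundle_fd, stale, pack_fd;

	/* Stand-in for opening the bundle file. */
	bundle_fd = open("/dev/null", O_RDONLY);
	if (bundle_fd < 0)
		return -1;

	/* The caller keeps the numerical value around, like `data->fd`. */
	stale = bundle_fd;

	/* First close: stands in for the close done on behalf of index-pack. */
	close(bundle_fd);

	/* Stand-in for opening a pack file later; it may get the same number. */
	pack_fd = open("/dev/null", O_RDONLY);
	if (pack_fd < 0)
		return -1;
	if (pack_fd != stale) {
		close(pack_fd);
		return 0;
	}

	/* Second close: stands in for the erroneous close in close_bundle(). */
	close(stale);

	/* The "pack file" descriptor is now invalid although nobody meant to close it. */
	return fcntl(pack_fd, F_GETFD) == -1;
}
```

In a single-threaded process on such a system the second `open()` receives the number that was just released, so the second `close()` silently hits the wrong file.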
    
    In the particular scenario I faced, `git fetch <bundle> <ref>` gets the
    same file descriptor value when opening the bundle file and importing
    its embedded packfile (which implicitly closes the file descriptor) and
    then when opening a pack file in `fetch_and_consume_refs()` while
    looking up an object's header.
    
    Later on, after the bundle has been imported (and the `close_bundle()`
    function erroneously closes the file descriptor that has _already_ been
    closed when using it as `stdin` for `git index-pack`), the same file
    descriptor value has now been reused via `use_pack()`. Now, when either
    the recursive fetch (which defaults to "on", unfortunately) or a
    commit-graph update needs to `mmap()` the packfile, it fails due to a
    now-invalid file descriptor that _should_ point to the pack file but
    doesn't anymore.
    
    To fix that, let's invalidate `data->fd` after calling `unbundle()`.
    That way, `close_bundle()` does not close a file descriptor that may
    have been reused for something different. While at it, document that
    `unbundle()` closes the file descriptor, and ensure that it also does
    that when failing to verify the bundle.
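
The idea of the fix can be sketched as follows. This is an illustration under stated assumptions rather than the actual patch: `struct bundle_data_sketch`, `consume_fd()`, `close_bundle_sketch()` and `fixed_fetch_keeps_other_fd_open()` are hypothetical stand-ins, and `/dev/null` is assumed to exist. The caller resets its copy of the descriptor to `-1` right after handing it to the function that closes it, so the later cleanup has nothing left to close.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the transport's per-bundle state. */
struct bundle_data_sketch {
	int fd;
};

/* Stand-in for a callee that takes ownership of `fd` and closes it. */
static void consume_fd(int fd)
{
	close(fd);
}

/* Cleanup that only closes a descriptor the caller still owns. */
static void close_bundle_sketch(struct bundle_data_sketch *data)
{
	if (data->fd >= 0)
		close(data->fd);
	data->fd = -1;
}

/*
 * Returns 1 if an unrelated descriptor opened after the hand-off is
 * still valid once cleanup has run, 0 if it was closed by mistake,
 * and -1 if the setup itself failed.
 */
int fixed_fetch_keeps_other_fd_open(void)
{
	struct bundle_data_sketch data;
	int other_fd, still_open;

	data.fd = open("/dev/null", O_RDONLY);
	if (data.fd < 0)
		return -1;

	consume_fd(data.fd);
	data.fd = -1; /* the callee closed it; forget the stale number */

	/* This descriptor may reuse the number that was just released. */
	other_fd = open("/dev/null", O_RDONLY);
	if (other_fd < 0)
		return -1;

	close_bundle_sketch(&data); /* no-op: nothing left to close */

	still_open = fcntl(other_fd, F_GETFD) != -1;
	close(other_fd);
	return still_open;
}
```

Using `-1` as the sentinel works because it is never a valid descriptor number, which makes the ownership hand-off visible to any later cleanup code.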
    
    Luckily, this bug does not affect the bundle URI feature; it only
    affects the `git fetch <bundle>` code path.
    
    Note that this patch does not _completely_ clarify who is responsible
    for closing that file descriptor, as `run_command()` may fail _without_
    closing `cmd->in`. Addressing this issue thoroughly, however, would
    require a rather thorough re-design of the `start_command()` and
    `finish_command()` functionality to make it a lot less murky who is
    responsible for what file descriptors.
    
    At least this here patch is relatively easy to reason about, and
    addresses a hard failure (`fatal: mmap: could not determine filesize`)
    at the expense of leaking a file descriptor under very rare
    circumstances in which `git fetch` would error out anyway.
    
    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>