Skip to content
  • Eric Blake's avatar
    copy: CVE-2022-0485: Fail nbdcopy if NBD read or write fails · 8d444b41
    Eric Blake authored
    
    
    nbdcopy has a nasty bug when performing multi-threaded copies using
    asynchronous nbd calls - it was blindly treating the completion of an
    asynchronous command as successful, rather than checking the *error
    parameter.  This can result in the silent creation of a corrupted
    image in two different ways: when a read fails, we blindly wrote
    garbage to the destination; when a write fails, we did not flag that
    the destination was not written.
    
    Since nbdcopy already calls exit() on a synchronous read or write
    failure to a file, doing the same for an asynchronous op to an NBD
    server is the simplest solution.  A nicer solution, but more invasive
    to code and thus not done here, might be to allow up to N retries of
    the transaction (in case the read or write failure was transient), or
    even having a mode where as much data is copied as possible (portions
    of the copy that failed would be logged on stderr, and nbdcopy would
    still fail with a non-zero exit status, but this would copy more than
    just stopping at the first error, as can be done with rsync or
    ddrescue).
    
    Note that since we rely on auto-retiring and do NOT call
    nbd_aio_command_completed, our completion callbacks must always return
    1 (if they do not exit() first), even when acting on *error, so as not
    leave the command allocated until nbd_close.  As such, there is no
    sane way to return an error to a manual caller of the callback, and
    therefore we can drop dead code that calls perror() and exit() if the
    callback "failed".  It is also worth documenting the contract on when
    we must manually call the callback during the asynch_zero callback, so
    that we do not leak or double-free the command; thankfully, all the
    existing code paths were correct.
    
    The added testsuite script demonstrates several scenarios, some of
    which fail without the rest of this patch in place, and others which
    showcase ways in which sparse images can bypass errors.
    
    Once backports are complete, a followup patch on the main branch will
    edit docs/libnbd-security.pod with the mailing list announcement of
    the stable branch commit ids and release versions that incorporate
    this fix.
    
    Reported-by: default avatarNir Soffer <nsoffer@redhat.com>
    Fixes: bc896eec ("copy: Implement multi-conn, multiple threads, multiple requests in flight.", v1.5.6)
    Fixes: https://bugzilla.redhat.com/2046194
    
    
    Message-Id: <20220203202558.203013-6-eblake@redhat.com>
    Acked-by: default avatarRichard W.M. Jones <rjones@redhat.com>
    Acked-by: default avatarNir Soffer <nsoffer@redhat.com>
    [eblake: fix error message per Nir, tweak requires lines in unit test per Rich]
    Reviewed-by: default avatarLaszlo Ersek <lersek@redhat.com>
    8d444b41