Decompressor: Handle (trailing) garbage gracefully (add tests)
From Antonio Diaz Diaz:
For lzip it is easy to check the whole "transaction" (formed by one or more successive calls to 'wget_decompress') by calling 'LZ_decompress_finish' (+ read loop) in the 'lzip_exit' function, just as you have already done. (The read loop should not return any data; if it does, the output produced by 'lzip_decompress' was not complete.)
For gzip the change should be similar: call 'inflate' with Z_FINISH (+ read loop) in the 'gzip_exit' function.
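The end-of-transaction check described above can be sketched with Python's zlib binding. This is not the Wget C code: 'gunzip_strict' is a hypothetical helper, and 'flush()' plus the 'eof' attribute stand in for the final 'inflate' call with Z_FINISH and its return status.

```python
import gzip
import zlib


def gunzip_strict(chunks):
    """Decompress one gzip stream fed in pieces and verify that it ended.

    Returns (data, trailing), where trailing holds any bytes found after
    the end of the gzip stream. Raises ValueError on a truncated stream.
    """
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect gzip header and trailer
    out = []
    later = []
    for chunk in chunks:
        if d.eof:
            later.append(chunk)  # stream already finished: this is trailing data
        else:
            out.append(d.decompress(chunk))
    out.append(d.flush())  # final drain, analogous to the read loop at exit
    if not d.eof:
        raise ValueError("gzip stream ended before its trailer was seen")
    return b"".join(out), d.unused_data + b"".join(later)
```

A caller can then decide whether non-empty trailing bytes are an error or something to ignore.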
Trailing garbage may cause problems when the decompression library is used to check the data. I don't know if trailing garbage happens in the kind of compressed data managed by Wget, but in the case of lzlib it can be easily ignored by ignoring the 'LZ_header_error' error code in 'lzip_exit'.
http://www.nongnu.org/lzip/manual/lzlib_manual.html#Error-codes
-- Constant: enum LZ_Errno LZ_header_error
An invalid member header (one with the wrong magic bytes) was read. If this happens at the end of the data stream it may indicate trailing data.
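One possible policy for ignoring that error, sketched in Python rather than with lzlib calls. The function name and its two parameters are hypothetical; the only format fact used is that every lzip member starts with the magic bytes "LZIP". The rule that wrong magic is fatal only when no member was decoded before it is an assumption about what Wget would want, not something lzlib enforces.

```python
LZIP_MAGIC = b"LZIP"  # the four magic bytes that open every lzip member


def header_error_is_fatal(members_decoded, next_bytes):
    """Decide whether a bad member header should fail the whole transfer.

    members_decoded: number of members already decoded successfully.
    next_bytes: the bytes found where the next member header would start.
    """
    if not next_bytes or next_bytes[:4] == LZIP_MAGIC:
        return False  # nothing follows, or the magic is valid: no header error
    # Wrong magic: treat it as trailing data unless nothing valid preceded it.
    return members_decoded == 0
```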
> What would be helpful is a bunch of compressed input files (ok and with
> typical issues, e.g. wrong CRC) plus an expected result / result
> checksum. So developers can test their code against it (you can also use
> it for your test suite).
From the lzip documentation[1] it is trivial to create files with any kind of problem. I use some of them, but unzcrash can test thousands of corrupt files quickly.
[1] http://www.nongnu.org/lzip/manual/lzlib_manual.html#Data-format
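As an illustration of such a test corpus, here is a minimal sketch that builds a few damaged inputs together with the expected result for each. It uses gzip only because Python's standard library has no lzip codec; the file names, status strings and both function names are made up for this example.

```python
import gzip
import zlib


def make_corpus(payload):
    """Return {name: (blob, expected_status)} for a few typical gzip faults."""
    good = gzip.compress(payload)
    bad_crc = bytearray(good)
    bad_crc[-8] ^= 0x20  # flip one bit in the stored CRC32 (first trailer field)
    return {
        "ok.gz": (good, "ok"),
        "bad_crc.gz": (bytes(bad_crc), "corrupt"),
        "truncated.gz": (good[:-4], "truncated"),
        "trailing.gz": (good + b"garbage", "trailing"),
    }


def classify(blob):
    """Run a blob through a gzip decoder and report how the stream ended."""
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        d.decompress(blob)
    except zlib.error:
        return "corrupt"
    if not d.eof:
        return "truncated"
    return "trailing" if d.unused_data else "ok"
```

A test suite could loop over the corpus and compare 'classify(blob)' with the expected status stored next to each blob.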
One feature unique to the lzip format is that it provides 3-factor integrity checking, and the decompressors report mismatches in each factor separately:
$ lzip -cd bad_fox.lz
The quick brown fox jumps over the lazy dog.
bad_fox.lz: CRC mismatch; stored EB50CC4A, computed EB50CC6A
Data size mismatch; stored 44 (0x2C), computed 45 (0x2D)
Member size mismatch; stored 81 (0x51), computed 80 (0x50)
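The three factors come from the 20-byte member trailer: CRC32 of the uncompressed data, uncompressed data size, and total member size, all stored little-endian. A minimal sketch of the comparison follows, assuming the decoded data is already available (no LZMA decoding is done here); 'check_trailer' is a hypothetical helper, not part of lzlib.

```python
import struct
import zlib

# lzip member trailer: CRC32 of the uncompressed data, uncompressed size,
# and total member size (header + LZMA stream + trailer), all little-endian.
TRAILER = struct.Struct("<IQQ")


def check_trailer(member, data):
    """Compare the three trailer fields of one member against observed values.

    member: the complete compressed member as bytes.
    data: the bytes a decoder produced from it.
    Returns a list of mismatch messages; an empty list means all three agree.
    """
    crc, data_size, member_size = TRAILER.unpack(member[-TRAILER.size:])
    problems = []
    computed_crc = zlib.crc32(data)
    if crc != computed_crc:
        problems.append(
            f"CRC mismatch; stored {crc:08X}, computed {computed_crc:08X}")
    sizes = (("Data size", data_size, len(data)),
             ("Member size", member_size, len(member)))
    for label, stored, computed in sizes:
        if stored != computed:
            problems.append(
                f"{label} mismatch; stored {stored} (0x{stored:X}), "
                f"computed {computed} (0x{computed:X})")
    return problems
```

Reporting each factor on its own, as the lzip tool does, makes it easier to tell a damaged trailer from damaged payload data.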
If you need specially crafted (corrupted or not) lzip files for your testing, just tell me and I'll make them for you.