Add LZ77 + Huffman compression and decompression
This implements MS-XCA, sections 2.1 and 2.2.
The first two commits are MR !2735 (closed), which this depends on. Then there's the decompressor, the compressor, fuzzers, tests, and test utility scripts. The fuzzers have been running cleanly for a couple of weeks now on my local machine. Much of this work has been based on test vectors developed using a Windows API. Some of these vectors are included as test data.
Our compression ratio is very similar to that of the default Windows setting (better on some inputs, worse on others, averaging 0.4% worse over the test data). There are some constants that could be tweaked to trade compression ratio against speed, but they have little effect until taken to extremes.
The compression and decompression speed seems to be significantly faster than Windows, but I don't have exact comparisons. What I can say is that the time it takes to compress or decompress the test vectors on a Windows VM using the userspace API is greater than the time it takes to run the entire cmocka test program compiled at -O0, which does the round trip and a lot more. On the other hand, the VM adds overhead, and the userspace API has unwanted helpful features.
The compressors and decompressors both want a small amount of auxiliary memory to work with. I have made talloc and non-talloc versions, where the talloc ones allocate the auxiliary memory and the destination buffers. The allocation is done in the outer layer, so the talloc functions effectively wrap the non-talloc ones.
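As a rough illustration of that layering (not the code in this MR), the sketch below has an inner function that never allocates and an outer function that allocates the scratch memory and destination buffer before calling it. All names (`demo_scratch`, `demo_compress`, `demo_compress_alloc`) are hypothetical, `malloc` stands in for talloc so the example is self-contained, and the "codec" is a placeholder that builds a byte histogram and copies the input rather than doing LZ77 + Huffman.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical auxiliary memory; the real structures are different. */
struct demo_scratch {
	uint32_t counts[256];
};

/*
 * Inner layer: never allocates. The caller supplies the scratch memory
 * and the destination buffer. Returns bytes written, or -1 if the
 * destination is too small. The body is a stand-in, not a real codec.
 */
static long demo_compress(struct demo_scratch *scratch,
			  const uint8_t *in, size_t in_len,
			  uint8_t *out, size_t out_cap)
{
	size_t i;

	assert(scratch != NULL);
	if (in_len > out_cap) {
		return -1;
	}
	memset(scratch->counts, 0, sizeof(scratch->counts));
	for (i = 0; i < in_len; i++) {
		scratch->counts[in[i]]++; /* e.g. symbol frequencies */
	}
	if (in_len > 0) {
		memcpy(out, in, in_len);
	}
	return (long)in_len;
}

/*
 * Outer layer: allocates scratch and destination, calls the inner
 * function, and frees the scratch. malloc() stands in for talloc here.
 * The caller owns (and must free) the returned buffer.
 */
static uint8_t *demo_compress_alloc(const uint8_t *in, size_t in_len,
				    size_t *out_len)
{
	struct demo_scratch *scratch = malloc(sizeof(*scratch));
	uint8_t *out = malloc(in_len > 0 ? in_len : 1);
	long n;

	if (scratch == NULL || out == NULL) {
		free(scratch);
		free(out);
		return NULL;
	}
	n = demo_compress(scratch, in, in_len, out, in_len);
	free(scratch);
	if (n < 0) {
		free(out);
		return NULL;
	}
	*out_len = (size_t)n;
	return out;
}
```

A caller that already has buffers (or wants no allocator dependency) would call the inner function directly; everyone else would use the allocating wrapper.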
This work is built on top of a start @jsutton24 made. When it was already too late (at least, emotionally) I discovered that @scabrero had already worked on a decompressor that @aaptel picked up and adapted for Wireshark. I think this implementation covers more cases than that one, in that it works with multi-block messages, can handle a non-terminal 256 symbol, and supports the undocumented 32-bit match lengths.
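For readers unfamiliar with the format, here is a minimal sketch of how a match symbol's length might be decoded, based on my reading of the decompression pseudocode in MS-XCA 2.2. The function name `demo_match_length` and its signature are hypothetical and are not taken from this MR. The 32-bit escape is an assumption: it is not in the LZ77 + Huffman section of the spec, and the sketch assumes it is triggered by a zero 16-bit length, mirroring the plain LZ77 format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode the match length for a Huffman symbol in 256..511, reading any
 * extra length bytes from `extra`. Reports the number of offset bits
 * the caller must then read from the bitstream, and how many extra
 * bytes were consumed. Returns false on truncated or invalid input.
 *
 * Symbol 256 decodes here as an ordinary short match; deciding whether
 * it instead marks end-of-stream is left to the caller.
 */
static bool demo_match_length(uint16_t symbol,
			      const uint8_t *extra, size_t extra_len,
			      uint32_t *length, unsigned *distance_bits,
			      size_t *used)
{
	uint32_t len;
	size_t pos = 0;

	assert(symbol >= 256 && symbol < 512);
	symbol -= 256;
	len = symbol & 0xf;            /* low nibble: length - 3 */
	*distance_bits = symbol >> 4;  /* high nibble: offset bit count */

	if (len == 15) {
		if (extra_len - pos < 1) {
			return false;
		}
		len = extra[pos++];
		if (len == 255) {
			if (extra_len - pos < 2) {
				return false;
			}
			len = extra[pos] | (extra[pos + 1] << 8);
			pos += 2;
			if (len == 0) {
				/* ASSUMPTION: undocumented 32-bit length */
				if (extra_len - pos < 4) {
					return false;
				}
				len = extra[pos] |
				      (extra[pos + 1] << 8) |
				      (extra[pos + 2] << 16) |
				      ((uint32_t)extra[pos + 3] << 24);
				pos += 4;
			}
			if (len < 15) {
				return false;
			}
			len -= 15;
		}
		len += 15;
	}
	if (len > UINT32_MAX - 3) {
		return false; /* would overflow */
	}
	*length = len + 3;
	*used = pos;
	return true;
}
```

With this scheme, short matches cost no extra bytes, and each escape level (one byte, then 16 bits, then 32 bits) is only paid for by progressively longer matches.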
I have licensed this as LGPLv3+ because I think it might be of use to other projects and is fairly self-contained.
Checklist
- Commits have Signed-off-by: with name/author being identical to the commit author
- (optional) This MR is just one part towards a larger feature.
- (optional, if backport required) Bugzilla bug filed and BUG: tag added
- Test suite updated with functionality tests
- Test suite updated with negative tests
- Documentation updated
- CI timeout is 3h or higher (see Settings/CICD/General pipelines/Timeout)
Reviewer's checklist:
- There is a test suite reasonably covering new functionality or modifications
- Function naming, parameters, return values, types, etc., are consistent and according to README.Coding.md
- This feature/change has adequate documentation added
- No obvious mistakes in the code