Encode multiple shared files properly

Luke Champine requested to merge gzip into master

Previously, each file would create, write to, and close its own gzip stream. This caused problems when multiple files were written to the same output: the gzip decompressor doesn't expect multiple headers/footers in one stream. Instead, we now reuse a single compressor/decompressor when encoding multiple files, so the resulting .sia has only one header and one footer (checksum). Size may be further reduced because the same dictionary is shared across files.

This should not introduce any compatibility concerns, because prior to v0.5.0 the API did not support sharing multiple files.

Old format (264 bytes):

00000000  53 69 61 20 53 68 61 72  65 64 20 46 69 6c 65 03  |Sia Shared File.|
00000010  00 00 00 00 00 00 00 30  2e 34 02 00 00 00 00 00  |.......0.4......|
00000020  00 00 1f 8b 08 00 00 09  6e 88 02 ff e2 66 80 80  |........n....f..|
                ^^^^^ gzip magic bytes
00000030  92 d4 e2 92 b4 cc 9c 54  5d 33 d3 22 e6 88 12 90  |.......T]3."....|
00000040  90 99 73 d0 fd 02 e5 4f  5b 76 ad bd a6 fe f4 e1  |..s....O[v......|
00000050  23 81 d2 70 93 12 97 19  ab fe fc 5f 36 49 b5 e0  |#..p......._6I..|
00000060  e1 b2 a6 c3 d5 0c 28 c0  eb ff a3 45 20 5a 50 02  |......(....E ZP.|
00000070  c2 e7 81 8a 07 a5 a6 a6  e8 06 e7 e7 e4 e7 e6 e7  |................|
00000080  71 41 c5 58 50 b5 32 00  02 00 00 ff ff d0 25 26  |qA.XP.2.......%&|
                                                     ^^^^^ checksum
00000090  f7 87 00 00 00 1f 8b 08  00 00 09 6e 88 02 ff e2  |...........n....|
          ^^^^^ checksum ^^^^^ gzip magic bytes
000000a0  61 80 80 92 d4 e2 92 b4  cc 9c 54 5d 43 0b 8b 5f  |a.........T]C.._|
000000b0  c7 82 0d 41 62 a9 4f a7  c7 4b 4c d0 91 3a b2 4d  |...Ab.O..KL..:.M|
000000c0  51 20 c4 7b 6e f4 ed 20  c6 d4 09 67 1f 7d f1 79  |Q .{n.. ...g.}.y|
000000d0  2d 71 b4 95 fb b3 cc a4  8d 0c 28 60 cb fb f3 11  |-q........(`....|
000000e0  20 da 8a 0f c2 87 99 1d  94 9a 9a a2 1b 9c 9f 93  | ...............|
000000f0  9f 9b 9f c7 0a 15 63 41  d5 ca 00 08 00 00 ff ff  |......cA........|
00000100  36 c6 23 82 88 00 00 00                           |6.#.....|
          ^^^^^^^^^^^ checksum

New format (216 bytes):

00000000  53 69 61 20 53 68 61 72  65 64 20 46 69 6c 65 03  |Sia Shared File.|
00000010  00 00 00 00 00 00 00 30  2e 34 02 00 00 00 00 00  |.......0.4......|
00000020  00 00 1f 8b 08 00 00 09  6e 88 02 ff e2 61 80 80  |........n....a..|
                ^^^^^ gzip magic bytes
00000030  92 d4 e2 92 b4 cc 9c 54  5d 43 4b d3 5f 49 d6 02  |.......T]CK._I..|
00000040  20 b1 0b 17 5f bc ff f0  6a 9f 66 0e ef a5 26 9d  | ..._...j.f...&.|
00000050  bd 73 93 27 4a 07 fc de  fe 76 da 51 8e bb 1c 6f  |.s.'J....v.Q...o|
00000060  26 85 69 54 7d 0e 65 40  01 1b 36 c4 48 80 68 1e  |&.iT}.e@..6.H.h.|
00000070  36 08 1f 66 76 50 6a 6a  8a 6e 70 7e 4e 7e 6e 7e  |6..fvPjj.np~N~n~|
00000080  1e 07 54 8c 05 55 2b 03  16 77 18 4f 98 bc e3 3c  |..T..U+..w.O...<|
00000090  48 6c b2 43 ea ad 4e db  c5 bc 5a db fb b7 3e f8  |Hl.C..N...Z...>.|
000000a0  7b 49 bd 7a 75 86 c5 d6  f0 b8 63 ad eb 97 9f 63  |{I.zu.....c....c|
000000b0  3b ee 17 60 2a 86 6a 56  c8 bd e0 7b 20 f3 bd d7  |;..`*.jV...{ ...|
000000c0  b2 e0 74 07 13 54 8c 0b  cd 1d 80 00 00 00 ff ff  |..t..T..........|
000000d0  84 89 ac 9c 10 01 00 00                           |........|
          ^^^^^^^^^^^ checksum

(these files don't contain the same data, but their uncompressed size is roughly equivalent)
