Redesign nodeid format
First of all, a node ID is now 40 bits instead of 48.
If the MSB of the first byte is 0, the remaining 39 bits are a siphash, just as they were a 47-bit hash before.
If the MSB is 1, the node is inlined and the remaining bits are split up as shown under Description.
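The hash-versus-inlined split can be sketched as below. This is only an illustration: the function name `classify_node_id` is hypothetical, and it assumes the five bytes are stored most-significant byte first, which the description does not state.

```python
def classify_node_id(node_id: bytes) -> tuple[str, int]:
    """Split a 40-bit node ID into its kind and its 39-bit payload.

    Assumption: the first byte is the most significant one (big-endian).
    """
    if len(node_id) != 5:
        raise ValueError("expected exactly 5 bytes (40 bits)")
    value = int.from_bytes(node_id, "big")
    payload = value & ((1 << 39) - 1)  # everything except the MSB
    kind = "inlined" if node_id[0] & 0x80 else "hash"
    return kind, payload
```

For a "hash" result the payload is the 39-bit siphash; for an "inlined" result it still has to be split into the fields shown under Description.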
Before
53M examples/search.index
14M build/aarch64-apple-darwin/doc/search.index
51M build/aarch64-apple-darwin/compiler-doc/search.index
After
52M examples/search.index
14M build/aarch64-apple-darwin/doc/search.index
50M build/aarch64-apple-darwin/compiler-doc/search.index
Description
1 0 0 N N N N N X X X X X X ...
^ ^ ^ ~~~~~~~~~ ----------------- first row number
| | |     |
| | |     \------ "long" alphabitmap entry number
| | |
| | \-------- not a run
| |
| \--------- whole prefix match
|
\---------- inlined
1 0 1 N N N N N X X X X X X ...
^ ^ ^ ~~~~~~~~~ ----------------- first row number
| | |     |
| | |     \------ run length - 1 (a single-element node is 00000)
| | |
| | \-------- run
| |
| \--------- whole prefix match
|
\---------- inlined
1 1 0 N N N N N X X X X X X ...
^ ^ ^ ~~~~~~~~~ ----------------- first row number
| | |     |
| | |     \------ data length - 1
| | |
| | \-------- not a run
| |
| \--------- suffix-only
|
\---------- inlined
1 1 1 N N N N N X X X X X X ...
^ ^ ^ ~~~~~~~~~ ----------------- first row number
| | |     |
| | |     \------ run length - 1
| | |
| | \-------- run
| |
| \--------- suffix-only
|
\---------- inlined
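As a rough sketch, the inlined layouts above could be decoded like this. The names `InlinedNode` and `decode_inlined` are hypothetical and not taken from the actual implementation. The sketch assumes big-endian byte order, reads the second bit as the suffix-only flag (it is set in the `1 1 0` and `1 1 1` layouts), and takes the first row number to be the 32 bits left after the first byte (40 - 8).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InlinedNode:
    suffix_only: bool  # second bit: set in the "1 1 0" and "1 1 1" layouts
    is_run: bool       # third bit: set in the "1 0 1" and "1 1 1" layouts
    n_field: int       # 5-bit N field; its meaning depends on the two flags
    first_row: int     # remaining 32 bits (assumed big-endian)


def decode_inlined(node_id: bytes) -> InlinedNode:
    """Decode a 40-bit inlined node ID into its flag and number fields."""
    if len(node_id) != 5:
        raise ValueError("expected exactly 5 bytes (40 bits)")
    head = node_id[0]
    if not head & 0x80:
        raise ValueError("MSB is 0: this is a hashed node ID, not an inlined one")
    return InlinedNode(
        suffix_only=bool(head & 0x40),
        is_run=bool(head & 0x20),
        n_field=head & 0x1F,
        first_row=int.from_bytes(node_id[1:], "big"),
    )
```

For example, `decode_inlined(bytes([0b10100011, 0, 0, 0, 7]))` gives a run with `n_field == 3`, that is a run length of 4, starting at row 7. How `n_field` is interpreted (alphabitmap entry number, data length - 1, or run length - 1) is left to the caller, since it depends on the flag combination.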
Edited by Michael Howell