Some improvements to CRC.
- Replace the strange runtime assembling of the 32-bit polynomial with its final representation.
- Reduce unrolling to 4× (8× seems too extreme everywhere; on my computer the best value is 1×), and use constant offsets from `buf[0]` to `buf[number of unrolls - 1]`.
- Speed up CRC-128 by a factor of about 7 (x64) to 15 (i386) by manually inlining everything.
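
For illustration, here is a rough C sketch of the first two points. It is not the code from this change. It assumes the 32-bit polynomial is the common reflected CRC-32 one (`0xEDB88320`) and that the old runtime assembly looked like the term-list loop zlib uses. `poly_from_terms`, `crc32_init`, `crc32_update` and `CRC32_POLY` are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Final representation of the polynomial (assumed: reflected CRC-32). */
#define CRC32_POLY 0xEDB88320u

static uint32_t crc_table[256];

/* What the runtime assembly might have looked like (hypothetical):
 * build the polynomial from its exponents, bit-reversed. Kept only to
 * show that it yields the same constant as CRC32_POLY. */
static uint32_t poly_from_terms(void)
{
    static const unsigned char terms[] =
        {0, 1, 2, 4, 5, 7, 8, 10, 11, 12, 16, 22, 23, 26};
    uint32_t poly = 0;
    for (size_t i = 0; i < sizeof terms; i++)
        poly |= (uint32_t)1 << (31 - terms[i]);
    return poly;
}

/* Fill the byte-wise lookup table directly from the constant. */
static void crc32_init(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (CRC32_POLY ^ (c >> 1)) : (c >> 1);
        crc_table[n] = c;
    }
}

/* 4x unrolled update: each step reads buf[0]..buf[3] at a constant
 * offset, and the pointer is advanced once per iteration. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;
    while (len >= 4) {
        crc = crc_table[(crc ^ buf[0]) & 0xFF] ^ (crc >> 8);
        crc = crc_table[(crc ^ buf[1]) & 0xFF] ^ (crc >> 8);
        crc = crc_table[(crc ^ buf[2]) & 0xFF] ^ (crc >> 8);
        crc = crc_table[(crc ^ buf[3]) & 0xFF] ^ (crc >> 8);
        buf += 4;
        len -= 4;
    }
    while (len--)  /* remaining 0..3 bytes */
        crc = crc_table[(crc ^ *buf++) & 0xFF] ^ (crc >> 8);
    return ~crc;
}
```

After `crc32_init()`, `crc32_update(0, data, n)` gives the checksum of `n` bytes. Whether 4× actually beats 1× depends on the compiler and CPU, so the unroll factor here is just the one named in the list above.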