Hash flooding protection

The current hashtable library is pretty weak and is something of a Data Structures 101 implementation.

Here are some notes from Riastradh on #scheme, prompted by a comment in the hashtable library:

weinholt: An easy and widely used strategy for mitigating hash-flooding is to pick a uniform random key for each hash table and use siphash.

Siphash is a pseudorandom function family that is a reasonably high-value target for cryptographic scrutiny, and nobody has found any issues with it in the near-decade it's been around.

It's not going to be faster than (say) FNV-1, but -- assuming it is implemented without timing side channels of its own -- it will prevent anyone from guessing the key or predicting collisions.

Another approach is to use a universal hash family -- a UHF is much cheaper to compute than a PRF (comparable to FNV-1), and guarantees low collision probability for any pair of inputs under uniform random key, but it doesn't defend against an adaptive adversary who can act on knowledge of collisions between two inputs; it only defends against an adversary who submits all inputs up front before any interaction.

Here's a universal hash family using polynomial evaluation modulo 2^25 - 39, which fits comfortably in 58-bit fixnums: https://mumble.net/~campbell/tmp/20190820/uwc.scm But I recommend siphash for general-purpose hash tables which might see adaptive adversaries.

(Specifically: break a message m into 24-bit chunks, and interpret them as the coefficients of a polynomial over Z/(2^25 - 39)Z; the hash of a message is the evaluation of that polynomial at a uniform random point in Z/(2^25 - 39)Z.)

But first of all the hashtable library needs tests.

Edited Aug 22, 2021 by Gwen Weinholt

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information