Skip to content

Replace IP greylist structure by a generational Bloom filter

Benjamin Canou requested to merge nomadic-labs/tezos:klakplok@ok_bloomer into master

Merged as part of !2712 (merged)

The current implementation of IP greylisting uses a very complex data structure without fully exploiting its capacity, and without extensive testing to make sure that the CPU and RAM behaviour is as expected.

This MR suggest to use a variant of a Bloom filter (a very simple, RAM-constant and CPU efficient probabilistic structure) instead.

The standard Bloom filter structure is tweaked by replacing single-bit presence register by multi-bit countdown registers. When an element is added to the probabilistic set, all its registers are set to the maximum value of the countdown. Every now and then, all registers are decremented, so that in normal condition, the elements will be removed from the set when all their registers get decremented down to zero.

The implementation has unit tests to check self-consistency, and that the probabilistic behaviour is reasonably close to the theoretical model.

The current default values give the following behaviour:

run_21

Such a picture can be obtained by running:

BLOOMER_TEST_GNUPLOT_PATH=/tmp/plots dune runtest src/lib_stdlib && ( cd /tmp/plots && gnuplot *.plot )

The values are configurable using the node configuration file:

"peer_greylist_size"?: integer  [0, 2^16-1] /* The number of peer_ids kept in the peer_id greylist. */,
"ip_greylist_size_in_kylobytes"?: integer  [0, 2^16-1] /* The size of the IP address greylist (in kilobytes). */,
"ip_greylist_cleanup_delay"?: $timespan.system /* The time an IP address is kept in the greylist. */,

Default values are 1023 (it was previously hardcoded), 2MiB, and half an hour.

Edited by Clément Hurlin

Merge request reports