Skip to content

Round-robin AvailableCoins() to make sendtoaddress even faster (up to 300 tx/sec)

Summary

The current sendtoaddress/AvailableCoins() code with coinsel=1 or coinsel=2 will start enumerating coins from mapWallet.begin() and will stop enumerating coins from the wallet once 10x as much spendable value (plus fees) has been enumerated. When this is done repeatedly, this results in all of the early coins in mapWallet's siphashed iteration order getting consumed, and this causes sendtoaddress to get much slower over time, as the number of spent coins that AvailableCoins() needs to iterate past grows.

This MR will instead store a "cursor position" for the wallet, and will begin enumerating coins from the place where it finished on the last sendtoaddress call. This results in a more even coin usage pattern and much higher performance. It may also improve average coin selection quality, as repeated calls with small values is less likely to selectively deplete small coins from the start of the wallet and force large coins (with change) to be used.

This MR also includes a small performance boost when coinsel=2 is used by only enumerating 2nValue plus estimated fees, rather than 10nValue plus estimated fees when coinsel=1 is used.

Ramdisk

In order to get the full performance out of this code, the wallet needs to be run off of a ramdisk or fast SSD. For p2p_stresstest.py, you should do something like the following:

sudo mount -t tmpfs size=2G /tmpfs
python3 p2p_stresstest.py --tmpdir=/tmpfs/test1

On coinsel=2

This MR includes the most important element of what I had previously discussed as the coinsel=2 algorithm, but it does it by changing AvailableCoins() instead of implementing FastSelectCoins() as an alternative to SelectCoins(). This AvailableCoins() approach has proven to be a less invasive, less bug-prone, and only slightly slower method than I had attempted previously. As such, I am recommending this round-robin change be used for coinsel=1 as well as coinsel=2. This method achieves around 250 tx/sec with coinsel=1 (10nValue) and 300 tx/sec with coinsel=2 (2nValue). In comparison, the FastSelectCoins() method generates at around 400 tx/sec. This change does not make a future FastSelectCoins() implementation for coinsel=2 completely obsolete, but it does make it less urgent.

Test plan

  1. ninja check-all

  2. Run p2p_stresstest.py with a RAMdisk and verify that tx generation rate per node remains above 50 tx/sec per GHz of x86-equivalent CPU speed for all five blocks (e.g. above 200 tx/sec on a 4-GHz x86 CPU). On Linux, assuming your build directory is build/, you can do the following:

sudo mkdir /tmpfs
sudo mount -t tmpfs size=2G /tmpfs
cd ../test/benchmark
ln -s ../../build/test/config.ini ../
python3 p2p_stresstest.py --tmpdir=/tmpfs/test1

Merge request reports