prevalidator: reordered already handled mem calls
Context
This MR reorders the mem calls in the already_handled function.
From mainnet analysis with the profiler we got:
on_notify .................................................................... 2151 1651.611ms 101%
  already fetching ........................................................... 8774
  fetching thread ............................................................ 9154 87.055ms 100%
  is already handled ......................................................... 67797 1270.602ms 101%
    is_classified ............................................................ 66971 181.157ms 103%
    is_known_unparsable ...................................................... 380 0.677ms 100%
    is_live_operation ........................................................ 67625 276.640ms 100%
    is_pending ............................................................... 67797 111.201ms 101%
  may_fetch_operation ........................................................ 2151 7.173ms 102%
  not already handled ........................................................ 380
We can observe that is already handled is called 67797 times. This function looks into several sets, one after the other, to find out whether the operation hash is already known by the mempool.
The first set inspected is the pending one, with 67797 calls. We then search the live_operations set, with 67625 calls, which means that 172 operations were found in the pending set. Then we search the classified set, with 66971 calls, which means that 654 operations were found in the live_operations set.
The last set is the unparsable one, with 380 calls, which means that almost all the operation hashes we are notified of (66591) are in the classified set.
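The hit counts above follow from the fact that a lookup that succeeds stops the chain, so the hits in one set are the calls to it minus the calls to the next set. A minimal sketch of that arithmetic, using the call counts from the report (the `calls` list and the `hits` helper are illustrative names, not part of the code base):

```ocaml
(* Call counts copied from the profiler report above, in lookup order. *)
let calls =
  [ ("pending", 67797);
    ("live_operations", 67625);
    ("classified", 66971);
    ("unparsable", 380) ]

(* A successful lookup short-circuits the chain, so the number of hits in a
   set is the number of calls to it minus the number of calls to the next
   set. The last set has no successor and is left out of the result. *)
let rec hits = function
  | (name, n) :: ((_, next) :: _ as rest) -> (name, n - next) :: hits rest
  | [ _ ] | [] -> []

let () =
  List.iter (fun (name, h) -> Printf.printf "%s: %d hits\n" name h) (hits calls)
(* prints:
   pending: 172 hits
   live_operations: 654 hits
   classified: 66591 hits *)
```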
This MR proposes to reorder the mem calls so that the classified set is searched first. This avoids searching the pending and live_operations sets, which can be costly: live_operations contains the operations of the last 240 blocks (~50k operation hashes).
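As a rough illustration of the reordering, here is a self-contained sketch. It does not reproduce the prevalidator code: the `state` record, its field names, and the use of string sets for operation hashes are all assumptions made for the example. The order of the checks after classified is inferred from the call counts in the second report, not taken from the diff.

```ocaml
module Hashes = Set.Make (String)

(* Hypothetical stand-in for the mempool state; field names are illustrative. *)
type state = {
  classified : Hashes.t;
  live_operations : Hashes.t;
  pending : Hashes.t;
  unparsable : Hashes.t;
}

(* Previous order: the two sets that rarely contain the hash come first. *)
let already_handled_before st oph =
  Hashes.mem oph st.pending
  || Hashes.mem oph st.live_operations
  || Hashes.mem oph st.classified
  || Hashes.mem oph st.unparsable

(* New order: classified first. Since (||) short-circuits, most
   notifications stop after a single lookup. *)
let already_handled_after st oph =
  Hashes.mem oph st.classified
  || Hashes.mem oph st.live_operations
  || Hashes.mem oph st.pending
  || Hashes.mem oph st.unparsable

let () =
  let st =
    { classified = Hashes.singleton "op1";
      live_operations = Hashes.singleton "op2";
      pending = Hashes.empty;
      unparsable = Hashes.empty }
  in
  (* Both orders return the same answer; only the number of lookups differs. *)
  List.iter
    (fun oph ->
      assert (already_handled_before st oph = already_handled_after st oph))
    [ "op1"; "op2"; "op3" ]
```

Because the four checks are a plain disjunction, reordering them cannot change the result, only how many sets are searched before the answer is known.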
This change seems to significantly improve the time spent in the already_handled function, from 1270.602ms over 67797 calls to 631.853ms over 63320 calls:
on_notify .................................................................... 3896 1052.067ms 99%
  already fetching ........................................................... 5662
  fetching thread ............................................................ 6088 69.032ms 99%
  is already handled ......................................................... 63320 631.853ms 98%
    is_classified ............................................................ 63320 217.865ms 99%
    is_known_unparsable ...................................................... 426 0.701ms 101%
    is_live_operation ........................................................ 3455 17.424ms 102%
    is_pending ............................................................... 426 0.935ms 103%
  may_fetch_operation ........................................................ 3896 6.777ms 101%
  not already handled ........................................................ 426
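Since the two reports come from separate runs with different call counts, an average cost per call gives a fairer comparison than the totals. A small sketch of that computation, using only the is already handled lines of the two reports (the `per_call_us` helper is an illustrative name):

```ocaml
(* Average cost of one call, in microseconds. *)
let per_call_us ~total_ms ~calls = total_ms *. 1000. /. float_of_int calls

(* Figures taken from the "is already handled" line of each report. *)
let cost_before = per_call_us ~total_ms:1270.602 ~calls:67797
let cost_after = per_call_us ~total_ms:631.853 ~calls:63320

let () =
  Printf.printf "before: %.1f us/call, after: %.1f us/call\n"
    cost_before cost_after
(* prints: before: 18.7 us/call, after: 10.0 us/call *)
```

These are averages over a single run each, so they indicate the size of the gain rather than measure it precisely.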
Manually testing the MR
From the intrusive-profiling branch, run a node on mainnet and inspect the mempool_profiling.txt report file. Then cherry-pick the commit from this MR, recompile, restart the node, and compare the new numbers in this file.
Since this MR is only focused on performance improvement, there is no automatic test for it.
Checklist
- n/a Document the interface of any function added or modified (see the coding guidelines)
- n/a Document any change to the user interface, including configuration parameters (see node configuration)
- n/a Provide automatic testing (see the testing guide).
- n/a For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR