Added caching mechanism for GetMedianTimePast() to avoid wasteful recalculation
Summary
This MR adds a mechanism to cache the median time past value after it is
calculated for each CBlockIndex, so we don't have to recalculate it every time the GetMediumTimePast() function is called.
The caching mechanism stores the 32-bit integer MTP value into the previously unused
"padding" of class CBlockIndex (right next to another 32-bit int), so the type still
takes 360 bytes of memory per instance, as it did before this MR.
Additionally, a benchmark was added to show the savings from not having to recalculate this value each time.
This value never changes in a live blockchain, so it should be cached,
rather than recalculated each time. Caching it makes GetMedianTimePast()
8x-10x faster on my system.
By using an atomic int for the cached value, the caching is done in a thread-safe and reentrant manner.
Background
GetMedianTimePast() is called very often during tx and block validation
and as such optimizing this hot path is important for saving CPU cycles.
GetMedianTimePast() being constant depends on the following two invariants:
-
nTimebeing constant per block in the chain. This is always true because the block's hash commits tonTimein the header. - All blocks in a chain never are re-ordered. This is also always true because a block's hash commits to
previousBlockHashin the header.
As such, by consensus, GetMedianTimePast() is always constant for any block.
Note that some unit tests do indeed modify nTime on an set of CBlockIndex in a chain, in order to test edge-case behavior of GetMedianTimePast(). This scenario can never happen on a live chain and violates class invariants as well as consensus rules.
The tests that do this were modified to invalidate the MTP cached value when they touch nTime, so they don't use the cached MTP value for the unit test in question.
Performance Benefit
On a full IBD GetMedianTimePast() is called ~470 times on average per block (!!). On my system, the function before this MR benchmarks at about 200msec per million calls (after this MR it clocks in at about 20msec per million calls). There are currently ~904,000 blocks. So for a full IBD, before this MR, the system would waste ~70-80 seconds of CPU time redundantly calculating GetMedianTimePast(), making the IBD complete a minute to a minute and a half later than it would otherwise do with this MR.
This is not significantly going to speed up IBD or block validation, but it doesn't hurt to save CPU cycles whenever we can.
Note that for high tx volume to mempool GetMedianTimePast() would also be called frequently per mempool txn so it helps to reduce latency ever-so-slightly to not have to recalculate the value there as well.
Again, the value never changes on a live blockchain, so not caching it makes no sense. Especially if we can do so without any memory cost (which this MR achieves).
Summary of Changes
- Add a private nested class,
CachedMTPinsideCBlockIndex, which stores a 32-bit atomic unsigned int, at the very end of the class inside the 4-bytes that was previously used for alignment padding (thus not growing the class's memory usage at all). - The cached value starts out as "no value", which is used to indicate to
GetMedianTimePast()that it needs to calculate the MTP value for this block index the first time it is called. - The old
GetMedianTimePast()was renamed toCalculateMedianTimePast()which does the actual calculation, and has been madeprivateto indicate it should never be called by outside code.- The calculation previously created an array of 11 64-bit ints, which took 808 bytes of stack space for the calculation. However,
nTimeis a 32-bit int so these 64-bit ints were overkill and unnecessary. - Instead, we calculate MTP using an array of 11 32-bit ints, which only eats 404 bytes of stack memory and may improve the CPU cost of this calculation due to better locality of reference and less cache pressure due to manipulating less memory at a time.
- The calculation previously created an array of 11 64-bit ints, which took 808 bytes of stack space for the calculation. However,
-
GetMedianTimePast()was re-written to either callCalculateMedianTimePast()if there is no cached value, or if there is, not call it and instead return the cached value. - Strange unit tests that violate class invariants by fudging
nTimewere modified to clear the cached value when they fudgenTime.
Test Plan
ninja all check-all- Run the benchmark:
ninja bench_bitcoin && ./src/bench/bench_bitcoin -filter=GetMedianTime.\*and compare the_Nocacheversus the one with the cache._Nocacheis status quo before this MR, and the other one is with the MTP caching active.