Added caching mechanism for GetMedianTimePast() to avoid wasteful recalculation

Summary

This MR adds a mechanism to cache the median time past value after it is calculated for each CBlockIndex, so we don't have to recalculate it every time the GetMediumTimePast() function is called.

The caching mechanism stores the 32-bit integer MTP value into the previously unused "padding" of class CBlockIndex (right next to another 32-bit int), so the type still takes 360 bytes of memory per instance, as it did before this MR.

Additionally, a benchmark was added to show the savings from not having to recalculate this value each time.

This value never changes in a live blockchain, so it should be cached, rather than recalculated each time. Caching it makes GetMedianTimePast() 8x-10x faster on my system.

By using an atomic int for the cached value, the caching is done in a thread-safe and reentrant manner.

Background

GetMedianTimePast() is called very often during tx and block validation and as such optimizing this hot path is important for saving CPU cycles.

GetMedianTimePast() being constant depends on the following two invariants:

  • nTime being constant per block in the chain. This is always true because the block's hash commits to nTime in the header.
  • All blocks in a chain never are re-ordered. This is also always true because a block's hash commits to previousBlockHash in the header.

As such, by consensus, GetMedianTimePast() is always constant for any block.

Note that some unit tests do indeed modify nTime on an set of CBlockIndex in a chain, in order to test edge-case behavior of GetMedianTimePast(). This scenario can never happen on a live chain and violates class invariants as well as consensus rules.

The tests that do this were modified to invalidate the MTP cached value when they touch nTime, so they don't use the cached MTP value for the unit test in question.

Performance Benefit

On a full IBD GetMedianTimePast() is called ~470 times on average per block (!!). On my system, the function before this MR benchmarks at about 200msec per million calls (after this MR it clocks in at about 20msec per million calls). There are currently ~904,000 blocks. So for a full IBD, before this MR, the system would waste ~70-80 seconds of CPU time redundantly calculating GetMedianTimePast(), making the IBD complete a minute to a minute and a half later than it would otherwise do with this MR.

This is not significantly going to speed up IBD or block validation, but it doesn't hurt to save CPU cycles whenever we can.

Note that for high tx volume to mempool GetMedianTimePast() would also be called frequently per mempool txn so it helps to reduce latency ever-so-slightly to not have to recalculate the value there as well.

Again, the value never changes on a live blockchain, so not caching it makes no sense. Especially if we can do so without any memory cost (which this MR achieves).

Summary of Changes

  • Add a private nested class, CachedMTP inside CBlockIndex, which stores a 32-bit atomic unsigned int, at the very end of the class inside the 4-bytes that was previously used for alignment padding (thus not growing the class's memory usage at all).
  • The cached value starts out as "no value", which is used to indicate to GetMedianTimePast() that it needs to calculate the MTP value for this block index the first time it is called.
  • The old GetMedianTimePast() was renamed to CalculateMedianTimePast() which does the actual calculation, and has been made private to indicate it should never be called by outside code.
    • The calculation previously created an array of 11 64-bit ints, which took 808 bytes of stack space for the calculation. However, nTime is a 32-bit int so these 64-bit ints were overkill and unnecessary.
    • Instead, we calculate MTP using an array of 11 32-bit ints, which only eats 404 bytes of stack memory and may improve the CPU cost of this calculation due to better locality of reference and less cache pressure due to manipulating less memory at a time.
  • GetMedianTimePast() was re-written to either call CalculateMedianTimePast() if there is no cached value, or if there is, not call it and instead return the cached value.
  • Strange unit tests that violate class invariants by fudging nTime were modified to clear the cached value when they fudge nTime.

Test Plan

  • ninja all check-all
  • Run the benchmark: ninja bench_bitcoin && ./src/bench/bench_bitcoin -filter=GetMedianTime.\* and compare the _Nocache versus the one with the cache. _Nocache is status quo before this MR, and the other one is with the MTP caching active.
Edited by Calin Culianu

Merge request reports

Loading