Baker: forge block when waiting for round to end
Context
Proposes a solution to disassociate block production from injection..
Currently, when a baker has a baking slot in the next level, it calculates and schedules a timeout that will emit a Time_to_bake
event which will cause the baker to forge, sign and inject a fresh block. If we know that the baker is the first proposer in the next level and quorum has been reached in this level, then we can forge a signed block while waiting for the round to end. When it is our Time_to_bake
we simply inject the pre-emptively forged block.
This MR changes baker behaviour to preemptively forge a block if it is the first proposer of the next level by introducing a the Time_to_forge_block
timeout. In automaton loop, compute_next_timeout
schedules a timeout for Time_to_forge_block
if:
- QC has been reached for the current proposal
- The baker is the first baker (round 0) in the next level
- No pending forged block
The idea is that when the state transition receives Time_to_forge_block
event, it kicks off the block forging process as its action and stores the result into level_state.next_forged_block
then it starts the Time_to_bake_block
timeout. Additionally, the Inject_block
action has been split into 2 actions - Forge_block
and Inject_block
. Since this optimisation is not applied to baking rounds greater than 0, the inject block action which are Inject_only
and Forge_then_inject
.
In this implementation, time to forge is calculated at round end - d
where d = 0.15 * block_time
by default and is configurable by the flag pre-emptive-forge-time
.
closes: #6492 (closed)
Manually testing the MR
Testing in sandbox
Setup
Apply the patch below to set block time to 10 seconds.
diff --git a/src/proto_alpha/lib_parameters/default_parameters.ml b/src/proto_alpha/lib_parameters/default_parameters.ml
index 9aef53823b..8f4dd71ab7 100644
--- a/src/proto_alpha/lib_parameters/default_parameters.ml
+++ b/src/proto_alpha/lib_parameters/default_parameters.ml
@@ -295,7 +295,7 @@ let derive_cryptobox_parameters ~redundancy_factor ~mainnet_constants_divider =
let constants_sandbox =
let consensus_committee_size = 256 in
- let block_time = 1 in
+ let block_time = 10 in
let Constants.Generated.{consensus_threshold = _; issuance_weights} =
Constants.Generated.generate ~consensus_committee_size
in
Commands
# build
make
# terminal 1
./src/bin_node/octez-sandboxed-node.sh 1 --connections 0
# terminal 2
export DIR=<tmp data dir created as part of octez node setup>
eval `./src/bin_client/octez-init-sandboxed-client.sh 1`
octez-activate-alpha
TEZOS_LOG="* -> debug" octez-baker-alpha run with local node $DIR
Old logs
Nov 08 12:37:03.053: next potential slot for level 10 is at round 0 at 2023-11-08T12:37:13-00:00 <- 1
Nov 08 12:37:03.053: for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:03.054: waiting 9.946s until it's time to bake at 2023-11-08T12:37:13-00:00 <- 2
Nov 08 12:37:13.004: automaton step: current phase idle, event
Nov 08 12:37:13.004: timeout reached: time to bake next level at round 0. <-3
Nov 08 12:37:13.004: proposing fresh block for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx) <- 4
Nov 08 12:37:13.004: at round 0
Nov 08 12:37:13.009: prepare forging block at level 10, round 0 for
Nov 08 12:37:13.009: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.010: forging block at level 10, round 0 for <- 5
Nov 08 12:37:13.010: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.012: reading votes file: /tmp/tezos-tmp-client.SXTCJ4Hp/per_block_votes.json
Nov 08 12:37:13.012: Voting pass for liquidity baking toggle vote
Nov 08 12:37:13.012: Voting pass for adaptive issuance vote
Nov 08 12:37:13.015: start loading cache lazily
Nov 08 12:37:13.015: stop loading cache lazily
Nov 08 12:37:13.017: injecting block at level 10, round 0 for <-6
Nov 08 12:37:13.017: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.018: block BM8jJ4T2uzLY9LEQyuUkD3yLP7U5RAvugBwt9eNHTyZwj5i9tYq at level 10, round
Nov 08 12:37:13.018: 0 injected for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Steps:
- Calculate time of potential baking slot at the next level.
- If there is a slot, set a timeout to wait for about ~10 which is the current block time. In other words, we are waiting for the current level to end.
- Timeout expires after ~10 seconds and its time to start baking our block.
- Proposing fresh block gathers operations from the mempool to be included in the block. 5,6. Execute the
Inject_block
action. These are coupled together implementation wise.
This MR logs
Nov 08 13:09:47.065: next potential slot for level 10 is at round 0 at 2023-11-08T13:09:57-00:00 <- 1
Nov 08 13:09:47.065: for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:47.065: first baker of next level found among delegates. pre-emptively forging block. <- 2
Nov 08 13:09:47.065: waiting 8.435s until it's time to forge block at 2023-11-08T13:09:55-00:00 <- 3
Nov 08 13:09:55.504: automaton step: current phase idle, event
Nov 08 13:09:55.504: timeout reached: time to forge block <- 4
Nov 08 13:09:55.505: preparing fresh block for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx) <- 5
Nov 08 13:09:55.505: at round 0
Nov 08 13:09:55.510: prepare forging block at level 10, round 0 for
Nov 08 13:09:55.510: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.510: forging block at level 10, round 0 for <- 6
Nov 08 13:09:55.510: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.516: reading votes file: /tmp/tezos-tmp-client.E4kwd3Nt/per_block_votes.json
Nov 08 13:09:55.517: Voting pass for liquidity baking toggle vote
Nov 08 13:09:55.518: Voting pass for adaptive issuance vote
Nov 08 13:09:55.523: start loading cache lazily
Nov 08 13:09:55.523: stop loading cache lazily
Nov 08 13:09:55.528: found an elected block at level 9, round 0... checking baking rights
Nov 08 13:09:55.528: next potential slot for level 10 is at round 0 at 2023-11-08T13:09:57-00:00 <- 7
Nov 08 13:09:55.528: for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.529: waiting 1.471s until it's time to bake at 2023-11-08T13:09:57-00:00 <- 8
Nov 08 13:09:57.005: automaton step: current phase idle, event
Nov 08 13:09:57.005: timeout reached: time to bake next level at round 0 <- 9
Nov 08 13:09:57.007: found preemptively forged block. no need to forge block for <- 10
Nov 08 13:09:57.007: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx) at round 0
Nov 08 13:09:57.008: injecting block at level 10, round 0 for <- 11
Nov 08 13:09:57.008: bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:57.011: block BLdSgmskSeVYEN3g4y5oLoCwvEsy9AMsU1GhqSYnsyRcZvRnqre at level 10, round
Nov 08 13:09:57.011: 0 injected for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Steps:
- Next potential baking slot is calculated.
- If we are the first baker at the next level ie. round 0, then we pre-emptively forge block.
- Set a timeout to wait ~8.5 seconds (10-1.5). If we forge to soon, the operation inclusion window will be too small.
- Timeout to forge expires after ~8.5 seconds and its time to start forging our block.
- Proposing fresh block gathers operations from the mempool to be included in the block.
- Execute the
Forge_block
action. The resultant forged block is stored in the state. - The next potential baking slot is calculated again.
- But this time, we already have a forged block in state, so we set a timeout to bake at the baking time (~1.5 seconds)
- Timeout expires after ~1.5s and its time to bake.
- When executing
Inject_block
action, we find a forged block already in the state so skip forging. - Inject the forged block.
Experiment Profiler results
Old Profiling results
perform action 'inject block' ................................................ 1 3809.201ms 72% +12s99.349ms
forge and inject block for tz1SYq214SCBy ................................... 1 3809.187ms 72% +0.011ms
timestamp of round ....................................................... 1 0.005ms 80% +0.004ms
retrieve injection level ................................................. 1 2.534ms 36% +8.794ms
generate seed nonce ...................................................... 1 0.003ms 97% +11.332ms
retrieve resulting context hash .......................................... 1 2.394ms 30% +11.613ms
retrieve live blocks ..................................................... 1 2.435ms 61% +14.011ms
filter non live operations ............................................... 1 1.083ms 100% +16.448ms
filter with context ...................................................... 1 2666.447ms 98% +17.536ms
begin construction ..................................................... 1 37.183ms 100% +0.001ms
load_key(stake_distribution@636) ..................................... 1 3.445ms 100% +6.962ms
endorsing_rights_by_first_slot ....................................... 1 12.196ms 100% +10.525ms
load_key(sampler_state@636) ........................................ 1 2.809ms 100% +0.050ms
endorsing_rights_by_first_slot ....................................... 1 8.982ms 100% +22.726ms
simulate and filter consensus .......................................... 1 250.199ms 100% +37.187ms
validating operation ................................................. 226 247.679ms 100%
simulate and filter votes .............................................. 1 0.002ms 105% +287.389ms
simulate and filter anonymous .......................................... 1 0.002ms 47% +287.392ms
prioritize managers .................................................... 1 11.562ms 100% +287.395ms
simulate and filter managers ........................................... 1 2354.320ms 97% +298.960ms
invalid operation filtered ........................................... 1823
validating operation ................................................. 2500 1275.185ms 100%
compute operations merkle root ......................................... 1 5.308ms 100% +2s653.286ms
finalize construction .................................................. 1 0.060ms 99% +2s658.597ms
finalize block header .................................................. 1 0.004ms 99% +2s658.720ms
compute payload hash ................................................... 1 3.710ms 101% +2s662.729ms
compute proof of work .................................................... 1 100.931ms 100% +2s683.988ms
sign with minimum 1.00s delay ............................................ 1 1002.101ms 1% +2s784.939ms
sign block header ...................................................... 1 3.026ms 97% +0.000ms
serializing .......................................................... 1 0.019ms 105% +0.003ms
waiting for lockfile ................................................. 1 0.255ms 117% +0.028ms
check highwatermark .................................................. 1 0.607ms 97% +0.303ms
record highwatermark ................................................. 1 1.645ms 94% +0.913ms
signing block ........................................................ 1 0.449ms 96% +2.575ms
inject block to node ..................................................... 1 21.613ms 11% +3s787.052ms
This MR Profiling results
perform action 'forge_block' ................................................. 1 3338.539ms 66% +10s510.578ms
timestamp of round ......................................................... 1 0.002ms 140% +0.004ms
retrieve injection level ................................................... 1 4.267ms 20% +9.853ms
generate seed nonce ........................................................ 1 0.004ms 105% +14.124ms
retrieve resulting context hash ............................................ 1 9.565ms 9% +14.523ms
retrieve live blocks ....................................................... 1 10.008ms 20% +24.091ms
filter non live operations ................................................. 1 1.676ms 100% +34.103ms
filter with context ........................................................ 1 2294.781ms 94% +35.782ms
begin construction ....................................................... 1 26.783ms 100% +0.003ms
load_key(stake_distribution@636) ....................................... 1 0.633ms 100% +5.132ms
endorsing_rights_by_first_slot ......................................... 1 11.285ms 100% +5.861ms
load_key(sampler_state@636) .......................................... 1 2.205ms 100% +0.034ms
endorsing_rights_by_first_slot ......................................... 1 8.970ms 100% +17.149ms
simulate and filter consensus ............................................ 1 50.167ms 100% +26.788ms
validating operation ................................................... 100 48.898ms 100%
simulate and filter votes ................................................ 1 0.004ms 75% +76.957ms
simulate and filter anonymous ............................................ 1 0.002ms 94% +76.963ms
prioritize managers ...................................................... 1 12.024ms 100% +76.967ms
simulate and filter managers ............................................. 1 2199.647ms 94% +88.994ms
invalid operation filtered ............................................. 2143
validating operation ................................................... 2476 975.564ms 100%
compute operations merkle root ........................................... 1 3.347ms 100% +2s288.645ms
finalize construction .................................................... 1 0.037ms 101% +2s291.995ms
finalize block header .................................................... 1 0.003ms 105% +2s292.087ms
compute payload hash ..................................................... 1 1.033ms 101% +2s293.742ms
compute proof of work ...................................................... 1 6.667ms 100% +2s330.566ms
sign with minimum 1.00s delay .............................................. 1 1001.281ms 1% +2s337.246ms
sign block header ........................................................ 1 3.080ms 95% +0.001ms
serializing ............................................................ 1 0.013ms 100% +0.001ms
waiting for lockfile ................................................... 1 0.225ms 84% +0.020ms
check highwatermark .................................................... 1 0.528ms 101% +0.276ms
record highwatermark ................................................... 1 1.768ms 94% +0.806ms
signing block .......................................................... 1 0.489ms 99% +2.588ms
compute next timeout ......................................................... 1 0.400ms 77% +13s849.123ms
compute next round time .................................................... 1 0.014ms 100% +0.001ms
compute next potential baking time ......................................... 1 0.303ms 73% +0.016ms
do step with event 'timeout' ................................................. 1 0.128ms 88% +13s849.529ms
time to bake at next level ................................................. 1 0.123ms 87% +0.002ms
perform action 'inject block' ................................................ 1 16.760ms 11% +13s849.662ms
inject block to node ....................................................... 1 16.475ms 9% +0.002ms
compute next timeout ......................................................... 1 0.458ms 27% +13s866.425ms
compute next round time .................................................... 1 0.004ms 124% +0.002ms
compute next potential baking time ......................................... 1 0.012ms 101% +0.009ms
Teztale Dataviz
Below is comparison of teztale dataviz for experiments with and without the patch. Both experiments were run on n1-standard-4
machines with Intel Ivy Bridge
processor at 15 seconds block time. The experiments were run with a set pre-emptive forge time for 1.5s
Old
Experiment 20231124-160857
: https://drive.google.com/file/d/1eAuMgj-TL_jet6Z-34hfF0-XgN5kOmn9/view?usp=drive_link
The image shows that validation was observed at ~3.9 seconds for most levels.
The image shows the average validation delay is at 5.83s.
This MR
Experiment 20231124-163040
: https://drive.google.com/file/d/1gNiRf0ixVlIWK9DvLD908gsYav-8yvml/view?usp=drive_link
The image shows that block validation was reached at ~2.4s which is equivalent to the time saving from the pre-emptive forge time, implying that the optimization did indeed work.
This image shows the average validation delay is ~1.5s less than the experiment without the patch. Similarly, we see reduction of ~1.5s in avg delay of 66% and 90% att.
Checklist
-
Document the interface of any function added or modified (see the coding guidelines) -
Document any change to the user interface, including configuration parameters (see node configuration) -
Provide automatic testing (see the testing guide). -
For new features and bug fixes, add an item in the appropriate changelog ( docs/protocols/alpha.rst
for the protocol and the environment,CHANGES.rst
at the root of the repository for everything else). -
Select suitable reviewers using the Reviewers
field below. -
Select as Assignee
the next person who should take action on that MR