Skip to content

Baker: forge block when waiting for round to end

Ryan Tan requested to merge ryan@bake-block-before-injection into master

Context

Proposes a solution to disassociate block production from injection..

Currently, when a baker has a baking slot in the next level, it calculates and schedules a timeout that will emit a Time_to_bake event which will cause the baker to forge, sign and inject a fresh block. If we know that the baker is the first proposer in the next level and quorum has been reached in this level, then we can forge a signed block while waiting for the round to end. When it is our Time_to_bake we simply inject the pre-emptively forged block.

This MR changes baker behaviour to preemptively forge a block if it is the first proposer of the next level by introducing a the Time_to_forge_block timeout. In automaton loop, compute_next_timeout schedules a timeout for Time_to_forge_block if:

  1. QC has been reached for the current proposal
  2. The baker is the first baker (round 0) in the next level
  3. No pending forged block

The idea is that when the state transition receives Time_to_forge_block event, it kicks off the block forging process as its action and stores the result into level_state.next_forged_block then it starts the Time_to_bake_block timeout. Additionally, the Inject_block action has been split into 2 actions - Forge_block and Inject_block. Since this optimisation is not applied to baking rounds greater than 0, the inject block action which are Inject_only and Forge_then_inject.

In this implementation, time to forge is calculated at round end - d where d = 0.15 * block_time by default and is configurable by the flag pre-emptive-forge-time.

closes: #6492 (closed)

Manually testing the MR

Testing in sandbox

Setup

Apply the patch below to set block time to 10 seconds.

diff --git a/src/proto_alpha/lib_parameters/default_parameters.ml b/src/proto_alpha/lib_parameters/default_parameters.ml
index 9aef53823b..8f4dd71ab7 100644
--- a/src/proto_alpha/lib_parameters/default_parameters.ml
+++ b/src/proto_alpha/lib_parameters/default_parameters.ml
@@ -295,7 +295,7 @@ let derive_cryptobox_parameters ~redundancy_factor ~mainnet_constants_divider =

 let constants_sandbox =
   let consensus_committee_size = 256 in
-  let block_time = 1 in
+  let block_time = 10 in
   let Constants.Generated.{consensus_threshold = _; issuance_weights} =
     Constants.Generated.generate ~consensus_committee_size
   in

Commands

# build
make

# terminal 1
./src/bin_node/octez-sandboxed-node.sh 1 --connections 0

# terminal 2
export DIR=<tmp data dir created as part of octez node setup>
eval `./src/bin_client/octez-init-sandboxed-client.sh 1`
octez-activate-alpha
TEZOS_LOG="* -> debug" octez-baker-alpha run with local node $DIR

Old logs

Nov 08 12:37:03.053: next potential slot for level 10 is at round 0 at 2023-11-08T12:37:13-00:00 <- 1
Nov 08 12:37:03.053:   for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:03.054: waiting 9.946s until it's time to bake at 2023-11-08T12:37:13-00:00 <- 2
Nov 08 12:37:13.004: automaton step: current phase idle, event
Nov 08 12:37:13.004:   timeout reached: time to bake next level at round 0. <-3
Nov 08 12:37:13.004: proposing fresh block for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)  <- 4
Nov 08 12:37:13.004:   at round 0
Nov 08 12:37:13.009: prepare forging block at level 10, round 0 for 
Nov 08 12:37:13.009:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.010: forging block at level 10, round 0 for <- 5
Nov 08 12:37:13.010:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.012: reading votes file: /tmp/tezos-tmp-client.SXTCJ4Hp/per_block_votes.json
Nov 08 12:37:13.012: Voting pass for liquidity baking toggle vote
Nov 08 12:37:13.012: Voting pass for adaptive issuance vote
Nov 08 12:37:13.015: start loading cache lazily
Nov 08 12:37:13.015: stop loading cache lazily
Nov 08 12:37:13.017: injecting block at level 10, round 0 for <-6
Nov 08 12:37:13.017:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 12:37:13.018: block BM8jJ4T2uzLY9LEQyuUkD3yLP7U5RAvugBwt9eNHTyZwj5i9tYq at level 10, round
Nov 08 12:37:13.018:   0 injected for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)

Steps:

  1. Calculate time of potential baking slot at the next level.
  2. If there is a slot, set a timeout to wait for about ~10 which is the current block time. In other words, we are waiting for the current level to end.
  3. Timeout expires after ~10 seconds and its time to start baking our block.
  4. Proposing fresh block gathers operations from the mempool to be included in the block. 5,6. Execute the Inject_block action. These are coupled together implementation wise.

This MR logs

Nov 08 13:09:47.065: next potential slot for level 10 is at round 0 at 2023-11-08T13:09:57-00:00   <- 1
Nov 08 13:09:47.065:   for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:47.065: first baker of next level found among delegates. pre-emptively forging block. <- 2
Nov 08 13:09:47.065: waiting 8.435s until it's time to forge block at 2023-11-08T13:09:55-00:00    <- 3
Nov 08 13:09:55.504: automaton step: current phase idle, event
Nov 08 13:09:55.504:   timeout reached: time to forge block                                        <- 4
Nov 08 13:09:55.505: preparing fresh block for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)   <- 5
Nov 08 13:09:55.505:   at round 0
Nov 08 13:09:55.510: prepare forging block at level 10, round 0 for 
Nov 08 13:09:55.510:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.510: forging block at level 10, round 0 for                                        <- 6
Nov 08 13:09:55.510:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.516: reading votes file: /tmp/tezos-tmp-client.E4kwd3Nt/per_block_votes.json
Nov 08 13:09:55.517: Voting pass for liquidity baking toggle vote
Nov 08 13:09:55.518: Voting pass for adaptive issuance vote
Nov 08 13:09:55.523: start loading cache lazily
Nov 08 13:09:55.523: stop loading cache lazily
Nov 08 13:09:55.528: found an elected block at level 9, round 0... checking baking rights
Nov 08 13:09:55.528: next potential slot for level 10 is at round 0 at 2023-11-08T13:09:57-00:00   <- 7
Nov 08 13:09:55.528:   for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:55.529: waiting 1.471s until it's time to bake at 2023-11-08T13:09:57-00:00           <- 8
Nov 08 13:09:57.005: automaton step: current phase idle, event 
Nov 08 13:09:57.005:   timeout reached: time to bake next level at round 0                         <- 9
Nov 08 13:09:57.007: found preemptively forged block. no need to forge block for                   <- 10
Nov 08 13:09:57.007:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx) at round 0
Nov 08 13:09:57.008: injecting block at level 10, round 0 for                                      <- 11
Nov 08 13:09:57.008:   bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)
Nov 08 13:09:57.011: block BLdSgmskSeVYEN3g4y5oLoCwvEsy9AMsU1GhqSYnsyRcZvRnqre at level 10, round
Nov 08 13:09:57.011:   0 injected for bootstrap1 (tz1KqTpEZ7Yob7QbPE4Hy4Wo8fHG8LhKxZSx)

Steps:

  1. Next potential baking slot is calculated.
  2. If we are the first baker at the next level ie. round 0, then we pre-emptively forge block.
  3. Set a timeout to wait ~8.5 seconds (10-1.5). If we forge to soon, the operation inclusion window will be too small.
  4. Timeout to forge expires after ~8.5 seconds and its time to start forging our block.
  5. Proposing fresh block gathers operations from the mempool to be included in the block.
  6. Execute the Forge_block action. The resultant forged block is stored in the state.
  7. The next potential baking slot is calculated again.
  8. But this time, we already have a forged block in state, so we set a timeout to bake at the baking time (~1.5 seconds)
  9. Timeout expires after ~1.5s and its time to bake.
  10. When executing Inject_block action, we find a forged block already in the state so skip forging.
  11. Inject the forged block.

Experiment Profiler results

Old Profiling results

perform action 'inject block' ................................................ 1         3809.201ms  72% +12s99.349ms
    forge and inject block for tz1SYq214SCBy ................................... 1         3809.187ms  72% +0.011ms
      timestamp of round ....................................................... 1            0.005ms  80% +0.004ms
      retrieve injection level ................................................. 1            2.534ms  36% +8.794ms
      generate seed nonce ...................................................... 1            0.003ms  97% +11.332ms
      retrieve resulting context hash .......................................... 1            2.394ms  30% +11.613ms
      retrieve live blocks ..................................................... 1            2.435ms  61% +14.011ms
      filter non live operations ............................................... 1            1.083ms 100% +16.448ms
      filter with context ...................................................... 1         2666.447ms  98% +17.536ms
        begin construction ..................................................... 1           37.183ms 100% +0.001ms
          load_key(stake_distribution@636) ..................................... 1            3.445ms 100% +6.962ms
          endorsing_rights_by_first_slot ....................................... 1           12.196ms 100% +10.525ms
            load_key(sampler_state@636) ........................................ 1            2.809ms 100% +0.050ms
          endorsing_rights_by_first_slot ....................................... 1            8.982ms 100% +22.726ms
        simulate and filter consensus .......................................... 1          250.199ms 100% +37.187ms
          validating operation ................................................. 226        247.679ms 100%
        simulate and filter votes .............................................. 1            0.002ms 105% +287.389ms
        simulate and filter anonymous .......................................... 1            0.002ms  47% +287.392ms
        prioritize managers .................................................... 1           11.562ms 100% +287.395ms
        simulate and filter managers ........................................... 1         2354.320ms  97% +298.960ms
          invalid operation filtered ........................................... 1823                     
          validating operation ................................................. 2500      1275.185ms 100%
        compute operations merkle root ......................................... 1            5.308ms 100% +2s653.286ms
        finalize construction .................................................. 1            0.060ms  99% +2s658.597ms
        finalize block header .................................................. 1            0.004ms  99% +2s658.720ms
        compute payload hash ................................................... 1            3.710ms 101% +2s662.729ms
      compute proof of work .................................................... 1          100.931ms 100% +2s683.988ms
      sign with minimum 1.00s delay ............................................ 1         1002.101ms   1% +2s784.939ms
        sign block header ...................................................... 1            3.026ms  97% +0.000ms
          serializing .......................................................... 1            0.019ms 105% +0.003ms
          waiting for lockfile ................................................. 1            0.255ms 117% +0.028ms
          check highwatermark .................................................. 1            0.607ms  97% +0.303ms
          record highwatermark ................................................. 1            1.645ms  94% +0.913ms
          signing block ........................................................ 1            0.449ms  96% +2.575ms
      inject block to node ..................................................... 1           21.613ms  11% +3s787.052ms

This MR Profiling results

  perform action 'forge_block' ................................................. 1         3338.539ms  66% +10s510.578ms
    timestamp of round ......................................................... 1            0.002ms 140% +0.004ms
    retrieve injection level ................................................... 1            4.267ms  20% +9.853ms
    generate seed nonce ........................................................ 1            0.004ms 105% +14.124ms
    retrieve resulting context hash ............................................ 1            9.565ms   9% +14.523ms
    retrieve live blocks ....................................................... 1           10.008ms  20% +24.091ms
    filter non live operations ................................................. 1            1.676ms 100% +34.103ms
    filter with context ........................................................ 1         2294.781ms  94% +35.782ms
      begin construction ....................................................... 1           26.783ms 100% +0.003ms
        load_key(stake_distribution@636) ....................................... 1            0.633ms 100% +5.132ms
        endorsing_rights_by_first_slot ......................................... 1           11.285ms 100% +5.861ms
          load_key(sampler_state@636) .......................................... 1            2.205ms 100% +0.034ms
        endorsing_rights_by_first_slot ......................................... 1            8.970ms 100% +17.149ms
      simulate and filter consensus ............................................ 1           50.167ms 100% +26.788ms
        validating operation ................................................... 100         48.898ms 100%
      simulate and filter votes ................................................ 1            0.004ms  75% +76.957ms
      simulate and filter anonymous ............................................ 1            0.002ms  94% +76.963ms
      prioritize managers ...................................................... 1           12.024ms 100% +76.967ms
      simulate and filter managers ............................................. 1         2199.647ms  94% +88.994ms
        invalid operation filtered ............................................. 2143                     
        validating operation ................................................... 2476       975.564ms 100%
      compute operations merkle root ........................................... 1            3.347ms 100% +2s288.645ms
      finalize construction .................................................... 1            0.037ms 101% +2s291.995ms
      finalize block header .................................................... 1            0.003ms 105% +2s292.087ms
      compute payload hash ..................................................... 1            1.033ms 101% +2s293.742ms
    compute proof of work ...................................................... 1            6.667ms 100% +2s330.566ms
    sign with minimum 1.00s delay .............................................. 1         1001.281ms   1% +2s337.246ms
      sign block header ........................................................ 1            3.080ms  95% +0.001ms
        serializing ............................................................ 1            0.013ms 100% +0.001ms
        waiting for lockfile ................................................... 1            0.225ms  84% +0.020ms
        check highwatermark .................................................... 1            0.528ms 101% +0.276ms
        record highwatermark ................................................... 1            1.768ms  94% +0.806ms
        signing block .......................................................... 1            0.489ms  99% +2.588ms
  compute next timeout ......................................................... 1            0.400ms  77% +13s849.123ms
    compute next round time .................................................... 1            0.014ms 100% +0.001ms
    compute next potential baking time ......................................... 1            0.303ms  73% +0.016ms
  do step with event 'timeout' ................................................. 1            0.128ms  88% +13s849.529ms
    time to bake at next level ................................................. 1            0.123ms  87% +0.002ms
  perform action 'inject block' ................................................ 1           16.760ms  11% +13s849.662ms
    inject block to node ....................................................... 1           16.475ms   9% +0.002ms
  compute next timeout ......................................................... 1            0.458ms  27% +13s866.425ms
    compute next round time .................................................... 1            0.004ms 124% +0.002ms
    compute next potential baking time ......................................... 1            0.012ms 101% +0.009ms

Teztale Dataviz

Below is comparison of teztale dataviz for experiments with and without the patch. Both experiments were run on n1-standard-4 machines with Intel Ivy Bridge processor at 15 seconds block time. The experiments were run with a set pre-emptive forge time for 1.5s

Old

Experiment 20231124-160857: https://drive.google.com/file/d/1eAuMgj-TL_jet6Z-34hfF0-XgN5kOmn9/view?usp=drive_link

The image shows that validation was observed at ~3.9 seconds for most levels. image

The image shows the average validation delay is at 5.83s. image

This MR

Experiment 20231124-163040: https://drive.google.com/file/d/1gNiRf0ixVlIWK9DvLD908gsYav-8yvml/view?usp=drive_link

The image shows that block validation was reached at ~2.4s which is equivalent to the time saving from the pre-emptive forge time, implying that the optimization did indeed work. image

This image shows the average validation delay is ~1.5s less than the experiment without the patch. Similarly, we see reduction of ~1.5s in avg delay of 66% and 90% att. image

Checklist

  • Document the interface of any function added or modified (see the coding guidelines)
  • Document any change to the user interface, including configuration parameters (see node configuration)
  • Provide automatic testing (see the testing guide).
  • For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
  • Select suitable reviewers using the Reviewers field below.
  • Select as Assignee the next person who should take action on that MR
Edited by Ryan Tan

Merge request reports