Py2ml: translate `test_tenderbake_long_dynamic_bake.py` to Tezt

Context

Part of #3628 (closed).

Depends on !5245 (merged).

Notes on git history: I first add a small function to Tezt, then refactor, in 3 commits, the original test of this file (added in !5245 (merged)). Then I add the new test Long_dynamic_bake. Finally, the history contains a last commit inlining the code of the original test to ease review.

Description of original test

Run a network of 5 nodes and 3 bakers. Every 2 seconds, a random operation is generated. Every 20 seconds, a client verifies that the chain progresses (using polling with a timeout). It checks that the network has reached consensus on each level in at most 3 rounds. A final check verifies that the five nodes have reached a certain level (based on the duration of the test and the minimal block delay) and that, at their greatest common level, they all have the same head.

Description of translation

In the translated version, I work with levels instead of timeouts:

  (* This test runs [num_nodes] nodes for [test_levels]. All but one
     node has a baker attached. Every [kill_baker_period], a baker is
     added to the node lacking one and one of the bakers is killed.

     In addition, every [add_operation_period] level, a random
     transaction is injected.

     We measure and check for regressions in the time it takes the
     cluster to reach the final level [max_level]. *)

I did not implement the final fork check, as discussed below.
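
The level-based schedule described in the comment above can be sketched as a pure function from a level to the actions due at that level. This is only an illustration: the parameter names come from the comment, while the `action` type and the functions `actions_at_level` and `schedule` are hypothetical and are not the code of this MR.

```ocaml
(* Hypothetical sketch: the names mirror the parameters in the comment
   above, but this is not the code of the merge request. *)
type action = Inject_transaction | Rotate_baker

(* Actions to perform when the chain reaches [level]. [Rotate_baker]
   stands for "attach a baker to the node lacking one and kill one of
   the running bakers". *)
let actions_at_level ~add_operation_period ~kill_baker_period level =
  let due period = period > 0 && level mod period = 0 in
  List.concat
    [
      (if due add_operation_period then [Inject_transaction] else []);
      (if due kill_baker_period then [Rotate_baker] else []);
    ]

(* Walk through levels 1..test_levels and keep the levels at which
   something has to be done. *)
let schedule ~add_operation_period ~kill_baker_period ~test_levels =
  List.init test_levels (fun i ->
      let level = i + 1 in
      (level, actions_at_level ~add_operation_period ~kill_baker_period level))
  |> List.filter (fun (_, actions) -> actions <> [])
```

Driving the test from levels rather than wall-clock timers means the same actions happen at the same levels on every run, whatever the speed of the machine.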

Protocol particularities

None.

Similarities with other tests

Quite similar to test_tenderbake.py.

Flakiness

Examples:

The issue seems to appear in the ring topology, which is a bit weird (why not in the other topology?).

The issues seem quite diverse:

  1. sometimes too many operations are inserted on the same level, hitting the error "Only one manager operation per manager per block allowed": https://gitlab.com/tezos/tezos/-/jobs/2594907353
  2. sometimes the nodes' ring topology is not established: https://gitlab.com/tezos/tezos/-/jobs/2537087752
  3. sometimes the network does not progress: https://gitlab.com/tezos/tezos/-/jobs/2556011954#L57
  4. sometimes the network ends up in a fork? https://gitlab.com/tezos/tezos/-/jobs/2528440487#L73 and https://gitlab.com/tezos/tezos/-/jobs/2928362854#L117

I think the first issue can be resolved by more careful programming. The second issue will be resolved by using Tezt's cluster module, which I hope is more stable. The third issue will be solved by a move to long tests.
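
As an illustration of what "more careful programming" could look like for the first issue, here is a minimal sketch that never picks the same manager twice at one level. It is an assumption, not the code of this MR: `last_injection` and `pick_source` are hypothetical names, managers are plain strings, and the sketch assumes that an operation injected at a level is included before the next injection at a later level.

```ocaml
(* Hypothetical sketch, not the merge request's code: remember the last
   level at which each manager injected an operation. *)
let last_injection : (string, int) Hashtbl.t = Hashtbl.create 16

(* [pick_source ~level managers] returns the first manager that has not
   injected at [level] and records the choice, or [None] if every
   manager has already injected at this level. *)
let pick_source ~level managers =
  let free manager = Hashtbl.find_opt last_injection manager <> Some level in
  match List.find_opt free managers with
  | None -> None
  | Some manager ->
      Hashtbl.replace last_injection manager level ;
      Some manager
```

When `pick_source` returns `None`, the caller would skip the injection for that level instead of triggering the error.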

For issue 4, I simply did not implement the fork check. If a fork happens, the network progresses more slowly, so the fork shows up as a regression in the time taken to reach test_levels. Forks did seem to happen non-deterministically in the original test; I am not sure what can be done about that.
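
To make the idea of "captured as a regression" concrete, here is a minimal sketch of a duration check against earlier runs. It does not use or describe the actual regression machinery of Tezt's long tests: the functions `mean` and `is_regression`, the `margin` parameter and the comparison against a plain average are all assumptions made for illustration.

```ocaml
(* Hypothetical sketch: [previous] holds the durations (in seconds) of
   earlier runs, [current] is the duration of this run, and [margin] is
   the tolerated relative slowdown (0.2 means 20%). *)
let mean = function
  | [] -> None
  | xs -> Some (List.fold_left ( +. ) 0. xs /. float_of_int (List.length xs))

let is_regression ~margin ~previous current =
  match mean previous with
  | None -> false (* no history yet: nothing to compare against *)
  | Some baseline -> current > baseline *. (1. +. margin)
```

With such a check, a run slowed down by a fork would be flagged once the time to reach the final level exceeds the baseline by more than the margin.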

Manually testing the MR

Without data visualization:

alias long_tezt='dune exec tezt/long_tests/main.exe --'
long_tezt -f tenderbake.ml -m dynamic -i

With data visualization (requires docker-compose):

docker-compose -f tezt/lib_performance_regression/local-sandbox/docker-compose.yml up -d
alias long_tezt='TEZT_CONFIG=tezt/lib_performance_regression/local-sandbox/tezt_config.json dune exec tezt/long_tests/main.exe --'
long_tezt -f tenderbake.ml -m dynamic -i

then visit http://localhost:3000

Checklist

  • Document the interface of any function added or modified (see the coding guidelines)
  • Document any change to the user interface, including configuration parameters (see node configuration)
  • Provide automatic testing (see the testing guide).
  • For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
  • Select suitable reviewers using the Reviewers field below.
  • Select as Assignee the next person who should take action on that MR
Edited by Arvid Jakobsson