Block_validator: Mitigation of the infamous Irmin error
What
This MR aims to mitigate the Irmin error "unknown inode key (find_value)"
by not marking the buggy block as inapplicable.
Why
Because we don't like bugs (not this kind)
How
If the Irmin error is encountered, the node is no longer shut down gracefully right away. Instead, it is allowed to retry the buggy block before crashing again.
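The idea can be illustrated with a minimal, self-contained sketch. This is not the code of this MR: the types `apply_error` and `outcome`, the `inapplicable` table, and the `validate` function are all hypothetical names introduced for illustration, and the real block validator is considerably more involved. The sketch only shows the distinction described above: a block that fails with the Irmin error is not recorded as inapplicable, so a later attempt on the same block remains possible.

```ocaml
(* Hypothetical error type: only the Irmin inode error is treated specially. *)
type apply_error =
  | Unknown_inode_key            (* stands in for "unknown inode key (find_value)" *)
  | Other_error of string

type outcome =
  | Applied
  | Rejected of string           (* block is, or already was, marked inapplicable *)
  | Retry_later                  (* Irmin error: block left unmarked *)

(* Hypothetical registry of blocks that must never be tried again. *)
let inapplicable : (string, string) Hashtbl.t = Hashtbl.create 16

(* [validate apply block_hash] applies the block unless it is already marked
   inapplicable. The Irmin error does not mark the block, so it can be retried. *)
let validate (apply : string -> (unit, apply_error) result) (block_hash : string) : outcome =
  if Hashtbl.mem inapplicable block_hash then Rejected "already marked inapplicable"
  else
    match apply block_hash with
    | Ok () -> Applied
    | Error Unknown_inode_key -> Retry_later
    | Error (Other_error msg) ->
        Hashtbl.replace inapplicable block_hash msg;
        Rejected msg

(* Usage: the first application hits the simulated Irmin error, the second succeeds. *)
let () =
  let calls = ref 0 in
  let flaky _ = incr calls; if !calls = 1 then Error Unknown_inode_key else Ok () in
  assert (validate flaky "B1" = Retry_later);
  assert (validate flaky "B1" = Applied)
```

In this sketch the second call stands in for the block being received again; how and when the real node retries is determined by the MR itself, not by this example.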
Manually testing the MR
Even though the bug cannot currently be triggered on demand, its error can be simulated to check that a buggy node does not crash and is able to validate and apply the buggy block and the following ones.
To do so, here is the testing process:
- Compile Octez and start two nodes (in two different terminals) with:
./src/bin_node/octez-sandboxed-node.sh 1 --connections 2
./src/bin_node/octez-sandboxed-node.sh 2 --connections 2
- Apply the following irmin-patch.patch
- Recompile Octez and start a (buggy) node (in a third terminal) with:
./src/bin_node/octez-sandboxed-node.sh 3 --connections 2
- Open a fourth terminal where all the following commands will be used:
  - eval `./src/bin_client/octez-init-sandboxed-client.sh 1`
  - octez-activate-alpha
  - octez-client bake for --minimal-timestamp
    at least 5 times
- The third node should:
  - raise an unknown inode key (find_value) error without crashing,
  - receive the faulty block from the second node that didn't crash and was able to send the block back to the faulty node
  - apply the block without raising the error
  - be ready to receive the next blocks
Checklist
- Document the interface of any function added or modified (see the coding guidelines)
- Document any change to the user interface, including configuration parameters (see node configuration)
- Provide automatic testing (see the testing guide).
- For new features and bug fixes, add an item in the appropriate changelog (docs/protocols/alpha.rst for the protocol and the environment, CHANGES.rst at the root of the repository for everything else).
- Select suitable reviewers using the Reviewers field below.
- Select as Assignee the next person who should take action on that MR
Edited by Mattias