feat(node): shutdown if the DN stalls [AC-2434]
Closes AC-2434. If the DN stalls, shut down the node.
The nominal procedure for anchoring is:
- After every block, each BVN produces an anchor, incrementing Produced in dn.acme/anchors, and sends it to the DN.
  - Anchors are sequenced, and the sequence number is included in the partition signature.
- After every block, the DN produces an anchor including a receipt for each BVN anchor received in the block.
  - The sequence number of each BVN anchor is included along with the receipt.
- When the BVNs receive the DN anchor, they update Acknowledged in dn.acme/anchors with the sequence number from the receipt for their own anchor.
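The bookkeeping in the steps above can be sketched roughly as follows. This is an illustration only: `AnchorLedger` and its methods are hypothetical names, not the types used in this change, and the guard against stale or out-of-range receipts is an assumption of the sketch.

```go
package main

import "fmt"

// AnchorLedger is a hypothetical stand-in for a BVN's record in
// dn.acme/anchors; the real record layout is not shown in this description.
type AnchorLedger struct {
	Produced     uint64 // sequence number of the last anchor produced
	Acknowledged uint64 // highest sequence number receipted by the DN
}

// Produce records a new anchor and returns its sequence number.
func (l *AnchorLedger) Produce() uint64 {
	l.Produced++
	return l.Produced
}

// Acknowledge applies the sequence number carried by a DN receipt.
// Ignoring stale or out-of-range receipts is an assumption of this sketch.
func (l *AnchorLedger) Acknowledge(seq uint64) {
	if seq > l.Acknowledged && seq <= l.Produced {
		l.Acknowledged = seq
	}
}

// Pending reports how many produced anchors are not yet acknowledged.
func (l *AnchorLedger) Pending() uint64 {
	return l.Produced - l.Acknowledged
}

func main() {
	var l AnchorLedger
	l.Produce()        // anchor 1
	seq := l.Produce() // anchor 2
	l.Produce()        // anchor 3
	l.Acknowledge(seq) // DN receipt covers anchors up to 2
	fmt.Println(l.Produced, l.Acknowledged, l.Pending()) // prints "3 2 1"
}
```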
Thus, the BVNs (and the DN) track how many anchors have been produced and how many of those have been acknowledged by the DN.
To detect a DN stall:
- After every block, each BVN verifies that its anchors have been acknowledged.
  - If (number of anchors acknowledged) == (number of anchors produced), all anchors are acknowledged.
  - Otherwise, the BVN checks the height at which the oldest unacknowledged anchor was produced.
  - If that height is more than 50 blocks (configurable) below the current height (as reported by Tendermint), the node is shut down.
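A minimal sketch of the stall check described above, assuming the caller already knows the height at which the oldest unacknowledged anchor was produced. `IsStalled`, `DefaultMaxLag`, and the parameter names are hypothetical; only the threshold of 50 blocks comes from this description.

```go
package main

import "fmt"

// DefaultMaxLag mirrors the default threshold of 50 blocks described above.
const DefaultMaxLag = 50

// IsStalled reports whether the node should shut down. It is a hypothetical
// helper: produced and acknowledged are the anchor counts, oldestUnackedHeight
// is the height at which the oldest unacknowledged anchor was produced, and
// currentHeight is the height reported by the consensus engine.
func IsStalled(produced, acknowledged, oldestUnackedHeight, currentHeight, maxLag uint64) bool {
	if acknowledged == produced {
		return false // every anchor has been acknowledged
	}
	if currentHeight < oldestUnackedHeight {
		return false // guard against unsigned underflow on inconsistent input
	}
	return currentHeight-oldestUnackedHeight > maxLag
}

func main() {
	fmt.Println(IsStalled(10, 10, 0, 500, DefaultMaxLag))  // false: nothing pending
	fmt.Println(IsStalled(10, 8, 440, 500, DefaultMaxLag)) // true: 60 blocks behind
	fmt.Println(IsStalled(10, 8, 450, 500, DefaultMaxLag)) // false: exactly 50 is allowed
}
```

The comparison is strict, matching "more than 50" in the description: a lag of exactly 50 blocks does not trigger a shutdown.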
Review Checklist
If any item is not complete, the merge request is not ready to be reviewed and must be marked "Draft:".
- The merge request title is in the format <change type>(<change scope>): <short description> [<task id>]
  - For example, feat(cli): add QR code generation [AC-123]
  - For details, see CONTRIBUTING.md
- The description includes Closes <jira task ID> (or rarely Updates <jira task ID>)
- The change is fully validated by tests that are run during CI
  - In most cases this means a test in "validate.sh"
  - In some cases, a Go test may be acceptable
  - Validation is not applicable to things like documentation updates
  - Purely UI/UX changes can be manually validated, such as changes to human-readable output
  - For all other changes, automated validation tests are an absolute requirement unless a maintainer specifically explains why they are not in a comment on this merge request
- The change is marked with one of the validation labels
  - validation::ci/cd for changes validated by CI tests
  - validation::manual for changes validated by hand
  - validation::deferred for changes validated by a follow up merge request
  - validation::not applicable for changes where validation is not applicable
Merge Checklist
- CI is passing
- Merge conflicts are resolved
- All discussions are resolved
Related to AC-2434