
feat(node): shutdown if the DN stalls [AC-2434]

Ethan Reesor requested to merge AC-2434-detect-dn-stall into develop

Closes AC-2434. If the DN stalls, shut down the node.

The nominal procedure for anchoring is:

  • After every block, each BVN produces an anchor, incrementing Produced in dn.acme/anchors, and sends it to the DN.
  • Anchors are sequenced, and the sequence number is included in the partition signature.
  • After every block, the DN produces an anchor including a receipt for each BVN anchor received in the block.
  • The sequence number of each BVN anchor is included along with the receipt.
  • When the BVNs receive the DN anchor, they update Acknowledged in dn.acme/anchors with the sequence number from the receipt for their own anchor.
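As a rough illustration of the bookkeeping described above, the two counters can be modeled as follows. This is a minimal sketch, not the code in this merge request: the `anchorLedger` type and its methods are hypothetical names, and ignoring stale or out-of-range receipts is an assumption.

```go
package main

// anchorLedger is a hypothetical stand-in for one BVN's entry in
// dn.acme/anchors. The field names follow the description; the type
// itself is an assumption made for illustration.
type anchorLedger struct {
	Produced     uint64 // sequence number of the latest anchor produced
	Acknowledged uint64 // highest sequence number receipted by the DN
}

// produce records that a new anchor was produced at the end of a block
// and returns the sequence number assigned to it.
func (l *anchorLedger) produce() uint64 {
	l.Produced++
	return l.Produced
}

// acknowledge applies the sequence number carried by a DN receipt.
// Assumption: receipts that are stale or refer to an anchor that was
// never produced are ignored.
func (l *anchorLedger) acknowledge(seq uint64) {
	if seq > l.Acknowledged && seq <= l.Produced {
		l.Acknowledged = seq
	}
}

// pending reports how many produced anchors are still unacknowledged.
func (l *anchorLedger) pending() uint64 {
	return l.Produced - l.Acknowledged
}
```

With this model, a BVN calls `produce` once per block and `acknowledge` whenever a DN anchor carries a receipt for one of its own anchors; `pending` returning zero means the DN is fully caught up.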

Thus, the BVNs (and the DN) track how many anchors have been produced and how many of those have been acknowledged by the DN. To detect a DN stall:

  • After every block, each BVN verifies that its anchors have been acknowledged.
  • If (number of anchors acknowledged) == (number of anchors produced), all anchors are acknowledged.
  • Otherwise, the BVN checks the height at which the oldest unacknowledged anchor was produced.
  • If that height trails the current height (according to Tendermint) by more than 50 blocks (configurable), the node is shut down.
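The check above could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the implementation in this merge request: `isStalled`, `defaultStallLimit`, and the `heightOf` lookup are hypothetical names, and the sketch assumes the oldest unacknowledged anchor is the one with sequence number `acknowledged + 1`.

```go
package main

// defaultStallLimit mirrors the default of 50 blocks mentioned in the
// description. The constant name is hypothetical.
const defaultStallLimit = 50

// isStalled reports whether the DN appears to have stalled.
//
// produced and acknowledged are the counters from dn.acme/anchors.
// heightOf is a hypothetical lookup that returns the block height at
// which the anchor with the given sequence number was produced.
// currentHeight is the current block height according to Tendermint.
func isStalled(produced, acknowledged uint64, heightOf func(seq uint64) uint64, currentHeight, limit uint64) bool {
	// Every produced anchor has been acknowledged. The description
	// tests for equality; >= is used here only as a defensive guard.
	if acknowledged >= produced {
		return false
	}

	// Assumption: anchors are acknowledged in order, so the oldest
	// unacknowledged anchor is the next one in sequence.
	oldest := heightOf(acknowledged + 1)

	// Stalled only if the anchor is strictly more than limit blocks old.
	// The first comparison guards against unsigned underflow.
	return currentHeight > oldest && currentHeight-oldest > limit
}
```

A caller would run this after every block and begin shutting the node down when it returns true. Taking the limit as a parameter reflects that the threshold is configurable.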

Review Checklist

If any item is not complete, the merge request is not ready to be reviewed and must be marked Draft:.

  • The merge request title is in the format <change type>(<change scope>): <short description> [<task id>]
    • For example, feat(cli): add QR code generation [AC-123]
    • For details, see CONTRIBUTING.md
  • The description includes Closes <jira task ID> (or rarely Updates <jira task ID>)
  • The change is fully validated by tests that are run during CI
    • In most cases this means a test in "validate.sh"
    • In some cases, a Go test may be acceptable
    • Validation is not applicable to things like documentation updates
    • Purely UI/UX changes can be manually validated, such as changes to human-readable output
    • For all other changes, automated validation tests are an absolute requirement unless a maintainer specifically explains why they are not in a comment on this merge request
  • The change is marked with one of the validation labels

Merge Checklist

  • CI is passing
  • Merge conflicts are resolved
  • All discussions are resolved

Related to AC-2434
