[bifrost] Keygen Retries and ChurnRetryInterval Mimir
Resolves #1585 (closed) - will separate out the seed refresh component noted on a comment there into a separate issue after this since it will also require more changes in go-tss. This is just a draft and has seen no testing, just pushing for now in case someone wants to carry the torch, otherwise will continue with this testing next week.
This has been tested with paths noted in comment below. Also this was extended to read a mimir for the ChurnRetryInterval.
Activity
- Resolved by Ursa (9R)
(I echo the earlier thought that testing should ideally include whether a Ready node is able to (/fully) unbond after a keygen failure and then become an Active node on a keygen retry.)
added 19 commits
- e3144da4...5672046d - 18 commits from branch develop
- 7d354f24 - [bifrost] Keygen Retries and ChurnRetryInterval Mimir
Mocknet Testing:
- No keygen retry configured in mimir
  - Failed with one offline node, TssPool failures reported as expected
  - Offline node restarted, succeeded on next churn retry
- Keygen retry configured
  - Failed with one offline node, retried until within interval blocks of next churn retry, TssPool failures reported after retries exhausted
  - Failed with one offline node, retried until offline node enabled, next retry was successful, TssPool success reported
    - When offline node came back online and there were 2 keygen blocks since last online it still tries the first one (old keygen block with other nodes moved on), but this is consistent with current behavior and should not cause new unintended side effects
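For reviewers, the "retried until within interval blocks of next churn retry" behavior from the test notes above can be read roughly as the check below. This is only a sketch under assumptions: the function and parameter names are hypothetical and are not taken from the bifrost code, and it assumes that a non-positive retry interval means retries are not configured.

```go
package main

import "fmt"

// shouldRetryKeygen reports whether one more keygen attempt still fits before
// the next churn retry. Hypothetical helper, not the actual bifrost logic.
//   keygenHeight        - block height at which the keygen was requested
//   currentHeight       - block height at which the last attempt failed
//   churnRetryInterval  - blocks until the network retries the churn (assumed read from mimir)
//   keygenRetryInterval - blocks one keygen retry is assumed to take (assumed read from mimir)
func shouldRetryKeygen(keygenHeight, currentHeight, churnRetryInterval, keygenRetryInterval int64) bool {
	if keygenRetryInterval <= 0 {
		// Assumption: retries are disabled when no interval is configured.
		return false
	}
	nextChurnRetry := keygenHeight + churnRetryInterval
	// Stop retrying once another attempt would run into the next churn retry.
	return currentHeight+keygenRetryInterval < nextChurnRetry
}

func main() {
	// Illustrative numbers only: keygen at block 1000, churn retry 720 blocks later.
	fmt.Println(shouldRetryKeygen(1000, 1100, 720, 100)) // true: 1200 < 1720
	fmt.Println(shouldRetryKeygen(1000, 1650, 720, 100)) // false: 1750 >= 1720
	fmt.Println(shouldRetryKeygen(1000, 1100, 720, 0))   // false: retries not configured
}
```

Under this reading, the failure would only be reported once the check returns false, which matches the "TssPool failures reported after retries exhausted" case.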
Would appreciate @son-of-odin / @heimdallthor review in case of any gotchas.
- Resolved by Heimdall
- Resolved by Ursa (9R)
- Resolved by Heimdall
- Resolved by Multipartite
changed milestone to %Release-1.121.0
added 5 commits
- 0731113c...5d3959e6 - 2 commits from branch develop
- 0ed36be6 - [bifrost] Keygen Retries and ChurnRetryInterval Mimir
- 6fe9e397 - heimdall feedback
- 2bba6939 - fix smoke
- Resolved by Multipartite
Be aware of the slashing that occurs, which is 6 hours of yield (I think). If you're doing retries, then that slashing amount should probably be reduced to match the retry interval.
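One possible reading of this suggestion is to scale the per-failure penalty by the fraction of the churn retry interval that a single keygen retry covers. The sketch below assumes that linear scaling; the function name, parameters and numbers are hypothetical and do not reflect the actual slashing constants or code.

```go
package main

import "fmt"

// scaledSlashPoints reduces a base keygen-failure penalty in proportion to the
// retry interval. Hypothetical helper: linear scaling is an assumption, not
// something stated in this thread.
func scaledSlashPoints(basePoints, keygenRetryInterval, churnRetryInterval int64) int64 {
	// Fall back to the full penalty when retries are not configured or the
	// retry interval is not shorter than the churn retry interval.
	if keygenRetryInterval <= 0 || churnRetryInterval <= 0 || keygenRetryInterval >= churnRetryInterval {
		return basePoints
	}
	return basePoints * keygenRetryInterval / churnRetryInterval
}

func main() {
	// Illustrative numbers only.
	fmt.Println(scaledSlashPoints(720, 100, 720)) // 100
	fmt.Println(scaledSlashPoints(720, 0, 720))   // 720 (no retries configured)
}
```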
added 1 commit
- 48b07306 - add retry to keygen broadcast, remove dead backoff config