Skip to content

[Version-unspecific] Ensure blame of NewMsgTssKeysignFail has a FailReason

Multipartite requested to merge Multi/non-empty-failreason into develop

[Version-unspecific]

As discussed in more detail here (and here) in Discord, during 1.108.2 there were a number of attempts to broadcast keysign failure messages with empty FailReasons (despite having non-empty BlameNodes).

These MsgTssKeysignFail messages failed validation (ValidateBasic, IsEmpty, len(m.FailReason) == 0) and so could not be broadcast (thus unable to contribute to blame consensus),
though the locally-incremented sequence number from ErrUnknownRequest's commit code 6 resulted in other unrelated broadcasts failing in the same block (!2868).

This MR proposes checking the blame's FailReason right before the NewMsgTssKeysignFail (in PostKeysignFailure) and filling it in with "no fail reason available" to be non-empty.
Ideally other code changes could be made to prevent an empty FailReason from reaching that point in the first place,
but this represents a last resort in order to not attempt to broadcast an invalid message.

Chain-unspecific PostKeysignFailure is called from a number of chain-specific locations,
(mostly?) within the conditional
if errors.As(err, &keysignError) {
(/ if errors.As(utxoErr, &keysignError) {)
.

NewKeysignError is called in one line only, in toLocalTSSSigner.
Though not as thorough, a blame.IsEmpty() check and FailReason load could be done there instead;
the question is whether any meaningful information could be obtained from the keySignResp (from KeySign).

My current preference is to have the IsEmpty check right before NewMsgTssKeysignFail and have the loading of any meaningful information into FailReason be in the Go-TSS repository (downstream of KeySign called from toLocalTSSSigner) instead.

Merge request reports