When something does go wrong, native-THORChain refund reasons lack crucial information
[Update: As described in #1468 (closed),
when Saver withdraw synth->native swaps fail they cannot refund synths to the signer address,
and so it would be appropriate to record their 'fail to refund' status in the refund event.
Second update: Another example courtesy of akrokr:
https://midgard.ninerealms.com/v2/debug/block/9255018
A 1-RUNE withdraw request fails refund due to being under the 2000000 (0.02) RUNE transaction fee, but isn't recorded in the refund event as a 'fail to refund'.]
A bug in v1.103.0 introduced by !2728 (merged) (fixed by !2744 (merged) and !2747 (merged)) was initially discovered in the form of this transaction (added in block 9221022, handled in block 9221023).
The THORNode log clearly describes the problem:
{"level":"error","error":"488432852150eth/thor-0xa5f2211b9b8170f694421f2046281775e8468044 is smaller than 488921786509eth/thor-0xa5f2211b9b8170f694421f2046281775e8468044: insufficient funds","time":"2023-01-22T08:05:58Z","caller":"/app/x/thorchain/helpers.go:95","message":"fail to prepare outbund tx"}
488921786509 − 488432852150
= 488934359
which is the exact amount received by the affiliate address .a7xk ('t', THORSwap) in block 9221022.
However, when first discovered and without access to THORNode logs, the problem (comparing with other transactions) was a mystery.
This is because the refund event (for transaction BC12..AA86) in block 9221023 was thus:
"fail to add outbound tx: 2 errors occurred:\n\t* internal error\n\t* outbound amount does not meet requirements (83074657/83259714)\n\n"
Specifically, if something does go wrong and a native THORChain refund fails, the emitted event is not updated with 'fail to refund' and the reason because native THORChain transactions only have the refund event emitted before the txout attempt.
// for THORChain transactions, create the event before we txout. For other
// chains, do it after. The reason for this is we need to make sure the
// first event (refund) is created, before we create the outbound events
// (second). Because its THORChain, its safe to assume all the coins are
// safe to send back. Where as for external coins, we cannot make this
// assumption (ie coins we don't have pools for and therefore, don't know
// the value of it relative to rune)
if tx.Tx.Chain.Equals(common.THORChain) {
if err := addEvent(tx.Tx.Coins); err != nil {
return err
}
}
Since for external chains it is viewed as acceptable to try the event emission only after the txout attempt,
I question the line we need to make sure the first event (refund) is created, before we create the outbound events (second)
'.
I also refer to this comment:
Highly unlikely it will fail to emit an event , but even it does, Fail to emit event should not result in an error, we should ignore the error and move on.
Please tell me if there is a different reason for this.
I currently propose that all (including native THORChain) refund events be emitted after the txout attempt.
This would both preserve critical failure information when something did fail,
allow Flipside SQL searching by fail to refund
reason and `(asset = 'THOR.RUNE') OR (asset LIKE '%/%)' for any such cases (currently not possible),
and make refund event emission timing consistent (relative to outbound events) for all refunds, rather than being before outbound events for some and after outbound events for others.
At the same time, I also propose the minor change of altering the log spelling of 'fail to prepare outbund tx' (sic) to 'fail to prepare outbound tx',
particularly because it is inconveniently easy to automatically write 'outbound' when wanting to search for 'outbund'.
If it is felt that this might prompt any confusion with this 'fail to prepare outbound tx' error I welcome alternative proposals which would reduce ambiguity.