More txouts can be assigned to a vault than it has the balance to pay
Discord thread here.
In block 9827136, the Asgard vault with pubkey thor..k9v2 had a GAIA.ATOM balance of 54.85432700
and one assigned outbound-queue txout item of 32.78672600 GAIA.ATOM.
In this block it had no assigned GAIA.ATOM scheduled-outbound-queue txout items.
One block later, its outbound-queue txout item and GAIA.ATOM vault balance were unchanged,
but it had been assigned a new scheduled-outbound-queue txout item of amount 39.27404100 GAIA.ATOM.
As 32.78672600 + 39.27404100 = 72.06076700 is more than 54.85432700, this left nodes signing a transaction which could not succeed.
From a Cosmos Hub block explorer for transaction 5A7A175D1FB84138E5333CE2EABF6D3F3390E20290685086083373896B087401 (for which THORChain has no record):
"failed to execute message; message index: 0: 22061601uatom is smaller than 39274041uatom: insufficient funds"
In time, the txout item was assigned to a different vault and was completed. However, the GAIA block scanner had no function like Ethereum's getTxInFromFailedTransaction
(which allows vault balances and pool asset depths to reflect failed-transaction gas costs),
so upon the next churn (on 2023-03-08) vault migrations could not complete. The vault repeatedly attempted the final migration, failing each time and becoming more insolvent each time, as each failure used more gas that the vault didn't know it was losing.
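The drift described above can be illustrated with a minimal Go sketch. It is not THORNode code: the vault type, its fields, and all numbers are hypothetical, and it assumes the final migration needs the on-chain balance to cover the recorded balance, with gas charged on failure but never deducted from the recorded figure.

```go
package main

import "fmt"

// vault is a hypothetical model, not a THORNode type.
type vault struct {
	recorded int64 // balance the network believes the vault holds
	onChain  int64 // balance actually held at the address
}

// attemptFinalMigration tries to move the whole recorded balance.
// Assumption: a failed attempt still costs gas on chain, and nothing
// reports that cost back, so recorded stays unchanged.
func (v *vault) attemptFinalMigration(gas int64) bool {
	if v.onChain < v.recorded {
		v.onChain -= gas // gas is lost even though the transfer failed
		return false
	}
	v.onChain, v.recorded = 0, 0
	return true
}

// deficit is how far the real balance lags the recorded one.
func (v *vault) deficit() int64 { return v.recorded - v.onChain }

func main() {
	// Illustrative numbers only: one earlier failed transaction already cost 6.
	v := &vault{recorded: 1000, onChain: 994}
	for i := 1; i <= 3; i++ {
		ok := v.attemptFinalMigration(6)
		fmt.Printf("attempt %d: ok=%v deficit=%d\n", i, ok, v.deficit())
	}
	// Funds arriving without crediting the recorded balance close the gap,
	// which is how the noop:novault rescue is modelled here.
	v.onChain += v.deficit()
	fmt.Println("after top-up:", v.attemptFinalMigration(6))
}
```

Each failed attempt widens the deficit by one gas fee, so retrying alone can never succeed in this model.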
Ultimately the vault's GAIA.ATOM migration was rescued by a noop:novault transaction making it solvent again,
but a code change is appropriate to prevent this from recurring in future.
!2408 (merged) 'Optimise outbound vaults selection logic'
appears relevant in that, besides collecting all outbounds at once,
it also appears to have changed code behaviour to ignore txout items which have moved from the scheduled outbound queue ('/scheduled') to the outbound queue ('/outbound').
Specifically, this is in the code which was previously in deductVaultPendingOutboundBalance and afterwards in getPendingOutbounds.
At this time, I tentatively propose that the behaviour of getPendingOutbounds be changed (or reverted) to count outbound-queue txout items as well, since these are amounts which are yet to be deducted from the vault balance.
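As a rough illustration of the proposal, the Go sketch below compares the two behaviours using the figures from this report. The function canAssign is hypothetical, not the actual getPendingOutbounds code. Amounts are in 1e8 base units and gas is ignored.

```go
package main

import "fmt"

// canAssign reports whether a vault can take on a new txout of the given
// amount once every pending amount in the supplied queues is subtracted
// from its balance. Hypothetical helper, amounts in 1e8 base units.
func canAssign(balance, amount int64, pendingQueues ...[]int64) bool {
	available := balance
	for _, queue := range pendingQueues {
		for _, pending := range queue {
			available -= pending
		}
	}
	return amount <= available
}

func main() {
	const balance int64 = 5485432700 // 54.85432700 GAIA.ATOM
	const newItem int64 = 3927404100 // 39.27404100 GAIA.ATOM

	scheduled := []int64{}          // no scheduled-queue items in block 9827136
	outbound := []int64{3278672600} // 32.78672600 GAIA.ATOM awaiting signing

	// Counting only the scheduled queue: the new item appears to fit.
	fmt.Println("scheduled only:", canAssign(balance, newItem, scheduled))
	// Counting both queues: the vault is correctly seen as too small.
	fmt.Println("scheduled + outbound:", canAssign(balance, newItem, scheduled, outbound))
}
```

With both queues counted, only 22.06760100 GAIA.ATOM remains available, so the 39.27404100 item would have gone to a different vault.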
Update: For reference, the Querier functions queryScheduledOutbound and queryPendingOutbound
demonstrate how the scheduled-outbound-queue txout items and outbound-queue txout items
are determined by checking for future-block and (empty-OutHash) past-block items respectively.
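The distinction between the two queues can be sketched in Go as below. This is not the Querier code: the txOutItem type, its field names, and classify are hypothetical. Treating the current block as "past" is an assumption, since the boundary is not stated here.

```go
package main

import "fmt"

// txOutItem is a hypothetical stand-in holding only the two properties
// the classification depends on.
type txOutItem struct {
	Height  int64  // block height the item is assigned to
	OutHash string // empty until the outbound has been observed
}

// classify places an item in the scheduled queue if its block is in the
// future, in the outbound queue if its block has arrived but it has no
// OutHash yet, and otherwise treats it as completed.
// Assumption: an item at the current height counts as a past-block item.
func classify(item txOutItem, currentHeight int64) string {
	switch {
	case item.Height > currentHeight:
		return "scheduled"
	case item.OutHash == "":
		return "outbound"
	default:
		return "completed"
	}
}

func main() {
	const current int64 = 9827137
	// Heights and the hash below are illustrative only.
	items := []txOutItem{
		{Height: 9827200},
		{Height: 9827100},
		{Height: 9827000, OutHash: "ABCD"},
	}
	for _, item := range items {
		fmt.Printf("height %d -> %s\n", item.Height, classify(item, current))
	}
}
```

Under this model, both "scheduled" and "outbound" items represent amounts not yet deducted from the vault balance.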