[bugfix] maps are evil

Son of Odin requested to merge subsidize-map into develop

THORChain got into a consensus failure, meaning that nodes couldn't agree on the end result of the next block. In the last block, an unbond transaction was processed.

{"level":"info","coins":[{"asset":"THOR.RUNE","amount":"0"}],"from":"thor14c7vjwmctw3sqxx34xt4hyppmgmaljj4clxewv","memo":"UNBOND:thor1yr8n0val34wzyvfn0g0kk4zqlwzxguvj8n6thj:10000000000","time":"2021-11-12T14:35:45Z","message":"receive MsgDeposit"}
{"level":"info","amount":"10000000000","node address":"thor1yr8n0val34wzyvfn0g0kk4zqlwzxguvj8n6thj","request hash":"BDA4D8E4EC43E7D5279684945C0B5B48E3498E31059EE28562833AFDFFEAC6D2","time":"2021-11-12T14:35:45Z","message":"receive MsgUnBond"}
{"level":"info","request outbound tx hash":"0000000000000000000000000000000000000000000000000000000000000000","time":"2021-11-12T14:35:45Z","message":"receive MsgOutboundTx"}
{"level":"error","error":"fail to save pool: cannot save a pool with an empty asset","time":"2021-11-12T14:35:45Z","message":"fail to subsidize pool with slashed bond"}
{"level":"error","error":"fail to unbond: 2 errors occurred:\n\t* internal error\n\t* fail to save pool: cannot save a pool with an empty asset\n\n","time":"2021-11-12T14:35:45Z","message":"msg unbond fail handler"}

Looking closer at the code path of the error above, it was discovered that the code was iterating over a map, which should never be done in a blockchain. This is because maps do NOT maintain order (iteration order is non-deterministic), which can result in a consensus failure. When every step of the iteration succeeds, order doesn't matter, as the resulting state change is the same.

But in this case it wasn't successful, as one of the pools was empty (more on this later). This means that each node iterated and saved pool data a different number of times before it came to the "bad empty pool", leaving each node with a different resulting state change.
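
To illustrate the point, here is a minimal sketch of iterating a map in a deterministic order by sorting its keys first. The names `subsidize` and `applySorted`, and the string-keyed map, are hypothetical and are not taken from the THORChain codebase; the sketch only shows the general technique.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// subsidize is a hypothetical stand-in for the per-pool state write.
// It fails on an empty asset name, like the "bad empty pool" above.
func subsidize(saved *[]string, asset string, amount uint64) error {
	if asset == "" {
		return errors.New("cannot save a pool with an empty asset")
	}
	*saved = append(*saved, fmt.Sprintf("%s+%d", asset, amount))
	return nil
}

// applySorted visits the map in sorted key order, so every node performs
// the same writes in the same order before any failure is hit.
func applySorted(subsidies map[string]uint64) ([]string, error) {
	keys := make([]string, 0, len(subsidies))
	for k := range subsidies {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var saved []string
	for _, k := range keys {
		if err := subsidize(&saved, k, subsidies[k]); err != nil {
			return saved, err
		}
	}
	return saved, nil
}

func main() {
	subsidies := map[string]uint64{"ETH.ETH": 2, "": 3, "BNB.BNB": 1}
	saved, err := applySorted(subsidies)
	fmt.Println(saved, err)
}
```

With a plain `for k, v := range subsidies` loop, the number of writes completed before the failing entry can differ from run to run; with sorted keys it is the same on every node.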

The reason we saw an "empty" pool asset for the first time is that a ygg referenced a pool that no longer exists (BNB.CAS-167). Fetching this pool returned a pool with an empty asset, which then failed when saving the pool back to the kvstore.
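
The failure mode described above can be sketched as follows. This is an illustrative model only: `Pool`, `store`, `getPool`, `setPool`, and `addRune` are hypothetical simplifications, not the actual THORChain keeper API, and skipping a missing pool is just one possible guard, not necessarily what this merge request does.

```go
package main

import (
	"errors"
	"fmt"
)

// Pool is a simplified, hypothetical pool record.
type Pool struct {
	Asset       string
	BalanceRune uint64
}

// store is a hypothetical in-memory stand-in for the kvstore.
type store map[string]Pool

// getPool mimics the lookup described above: a missing key yields a
// zero-value Pool (empty asset) rather than an error.
func (s store) getPool(asset string) Pool { return s[asset] }

// setPool refuses to persist a pool without an asset.
func (s store) setPool(p Pool) error {
	if p.Asset == "" {
		return errors.New("cannot save a pool with an empty asset")
	}
	s[p.Asset] = p
	return nil
}

// addRune credits a pool only if it exists. For a missing pool it reports
// false instead of attempting a save that is bound to fail.
func (s store) addRune(asset string, amount uint64) (bool, error) {
	p := s.getPool(asset)
	if p.Asset == "" {
		return false, nil
	}
	p.BalanceRune += amount
	return true, s.setPool(p)
}

func main() {
	s := store{"BNB.BNB": {Asset: "BNB.BNB", BalanceRune: 10}}
	fmt.Println(s.addRune("BNB.CAS-167", 5)) // pool no longer exists
	fmt.Println(s.addRune("BNB.BNB", 5))
}
```

Checking for the empty asset right after the fetch keeps the missing pool from reaching the save step at all.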

Edited by Son of Odin