Skip to content

Bifrost can query the same THORNode (thorchainBridge) endpoints indefinitely many times in the same block without caching

Context: this Discord thread.
(Related to #1712 (closed)--"More caching in Bifrost could also be a win".)

{"level":"error","service":"bifrost","module":"observer","error":"failed to get node status: failed to get node status: failed to get node account: Status code: 429 Too Many Requests returned","time":"2023-10-11T20:34:22Z","caller":"/app/bifrost/observer/observe.go:224","message":"fail to send to THORChain"}

There, chunkifyAndSendToThorchain calls signAndSendToThorchain once for every chunk of transactions to observe.
(This was before !3173 (merged) was merged, so it could have been for a chunk containing an unobservable transaction over and over again?)
https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/observer/observe.go#L222-223

	for _, txIn := range o.chunkify(deck) {
		if err := o.signAndSendToThorchain(txIn); err != nil {

signAndSendToThorchain then starts by checking that the node is Active.
https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/observer/observe.go#L495-501

	nodeStatus, err := o.thorchainBridge.FetchNodeStatus()
	if err != nil {
		return fmt.Errorf("failed to get node status: %w", err)
	}
	if nodeStatus != stypes.NodeStatus_Active {
		return nil
	}

FetchNodeStatus calls GetNodeAccount,
GetNodeAccount calls getWithPath
(referring to NodeAccountEndpoint which is "/thorchain/node"), / getWithPath calls get
(finally with a full-URL argument),
and get calls Get
(using thorchainBridge's httpClient field)
and returns the 429 Too Many Requests status.

Notably, getVaultPubkeys and GetLastBlock both use caches to not query a THORNode endpoint more than once per block time span (constants.ThorchainBlockTime).
https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/thorclient/thorchain.go#L485-489

	if time.Since(b.lastPubKeysCheck) < constants.ThorchainBlockTime && b.lastPubKeyAddressPairs != nil {
		return b.lastPubKeyAddressPairs, nil
	}

	buf, s, err := b.getWithPath(PubKeysEndpoint)

https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/thorclient/block_height.go#L43-46

	if time.Since(b.lastBlockHeightCheck) < constants.ThorchainBlockTime && b.lastThorchainBlockHeight > 0 {
		return b.lastThorchainBlockHeight, nil
	}
	latestBlocks, err := b.getLastBlock(common.EmptyChain)

https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/thorclient/block_height.go#L59-64

func (b *thorchainBridge) getLastBlock(chain common.Chain) ([]types.QueryResLastBlockHeights, error) {
	path := LastBlockEndpoint
	if !chain.IsEmpty() {
		path = fmt.Sprintf("%s/%s", path, chain.String())
	}
	buf, _, err := b.getWithPath(path)

Checking, .getWithPath( appears 16 times in non-test code.
https://gitlab.com/search?search=.getWithPath%28&nav_source=navbar&project_id=13422983&group_id=5545144&search_code=true&repository_ref=v1.121.1

Individual caches could be implemented for every endpoint query,
but these could then be unintentionally left out of future endpoint queries, as well as adding clutter to the thorchainBridge struct.
https://gitlab.com/thorchain/thornode/-/blob/v1.121.1/bifrost/thorclient/thorchain.go#L74-79

	lastBlockHeightCheck     time.Time
	lastThorchainBlockHeight int64

	pubKeyCheckLock        *sync.Mutex
	lastPubKeysCheck       time.Time
	lastPubKeyAddressPairs []byte

I currently propose that the thorchainBridge-received get, which receives a full URL as an argument, have caching functionality that prevents Bifrost querying the same URL more than once per THORChain block time.

(If this is not preferred, getWithPath which calls getThorChainURL and so is only used for (local-node) THORNode endpoints could be an alternative target.
|
However, I believe avoiding get not to be necessary: as it is lower-case, it cannot be exported from its 'thorclient' package, and
git -C bifrost/thorclient grep "\.get("
returns only two results, both in thorchain.go: one already seen in getWithPath,
and one in IsCatchingUp, which also checks a THORNode (thus appropriate to not query more than once a block),
though non-local (b.cfg.ChainRPC) rather than local (b.cfg.ChainHost).)

Specifically: !3182 (merged)
'Cache responses for Bifrost get'.

Edited by Multipartite