Increase UTXO Daemon Probe Failure Thresholds

Ursa (9R) requested to merge ursa/utxo-probes into master

We (along with another operator) observed a case where the Bitcoin Cash daemon was stuck in a restart loop. After each restart of the service, Bifrosts consumed all of the available RPC threads for long enough to exceed the failure threshold of the liveness probe. This triggered another restart, after which the daemon waited for init and then continued to thrash in the same way.

This change sets the failure thresholds of all UTXO daemon readiness and liveness probes to 2.5 minutes and 15 minutes, respectively. Traffic to the daemon is stopped first if it is overloaded, and a restart is triggered only as a last resort if the daemon cannot become healthy within 15 minutes.
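
As a rough illustration of how those durations could map onto probe settings, here is a minimal sketch assuming Kubernetes-style probes, where the time a probe must keep failing is approximately `periodSeconds * failureThreshold` (ignoring timeouts and initial delay). The 10-second period and the helper name `threshold_for` are assumptions for illustration only; the actual values used in this change are in the diff.

```python
import math

# Assumed probe period in seconds (hypothetical; not taken from this MR).
PERIOD_SECONDS = 10


def threshold_for(window_seconds: float, period_seconds: int) -> int:
    """Return the failureThreshold needed so that consecutive failures
    must span at least window_seconds before the probe is marked failed."""
    return math.ceil(window_seconds / period_seconds)


# Readiness: stop routing traffic after about 2.5 minutes of failures.
readiness_threshold = threshold_for(2.5 * 60, PERIOD_SECONDS)

# Liveness: restart the daemon only after about 15 minutes of failures.
liveness_threshold = threshold_for(15 * 60, PERIOD_SECONDS)

print(f"readiness failureThreshold: {readiness_threshold}")
print(f"liveness failureThreshold: {liveness_threshold}")
```

Keeping the readiness window much shorter than the liveness window is what lets an overloaded daemon shed traffic and recover before a restart is considered.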
