[Bug] Misleading behaviour of the daemons' '--keep-alive' option
Environment (Mainnet, test network, build from source, ...)
Static v12.3 Linux binaries.
Summary
A daemon started with the --keep-alive option fails with a connection error when the node it is trying to connect to is not available at startup.
Expected behavior
The daemon keeps trying to reconnect to the node at some time interval until the node becomes available.
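To make the expectation concrete, here is a minimal Python sketch of the behaviour I would expect from --keep-alive. It is not the baker's actual code. The function name `wait_for_node`, the fixed 1-second interval and the plain TCP check are all assumptions made for illustration; only the address 127.0.0.1:8732 comes from the logs in this report.

```python
import socket
import time
from typing import Callable, Optional


def _tcp_connect(host: str, port: int) -> None:
    # Open and immediately close a TCP connection; raises OSError on failure.
    with socket.create_connection((host, port), timeout=5):
        pass


def wait_for_node(
    host: str = "127.0.0.1",
    port: int = 8732,
    interval: float = 1.0,  # hypothetical fixed delay between attempts
    connect: Optional[Callable[[str, int], None]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Block until the node accepts a connection.

    Returns the number of failed attempts. There is deliberately no
    retry limit: the loop only ends once a connection succeeds.
    """
    connect = connect or _tcp_connect
    failures = 0
    while True:
        try:
            connect(host, port)
            return failures
        except OSError:  # ConnectionRefusedError is a subclass of OSError
            failures += 1
            sleep(interval)
```

The `connect` and `sleep` parameters exist only so the loop can be exercised without a real node or real waiting.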
Actual behavior
The daemon gives up after a few retries and fails with a connection error.
Steps to reproduce
$ tezos-baker-012-Psithaca run with local node node-dir-ithacanet baker --keep-alive
Connection refused, retrying in 1.00 seconds...
Waiting for the node to be bootstrapped...
Connection refused, retrying in 1.50 seconds...
Connection refused, retrying in 2.25 seconds...
Connection refused, retrying in 3.38 seconds...
Connection refused, retrying in 5.06 seconds...
Error:
Rpc request failed:
- meth: GET
- uri: http://127.0.0.1:8732//monitor/bootstrapped
- error: Unable to connect to the node: "Unix.Unix_error(Unix.ECONNREFUSED, "connect", "")"
However, if the node was running when the daemon was started, a subsequent node disconnection does not make the daemon fail with a connection error:
$ tezos-baker-012-Psithaca run with local node node-dir-ithacanet baker --keep-alive
Node is bootstrapped.
Waiting for protocol 012-Psithaca to start...
Baker v12.3 (6e2037c9) for Psithaca2MLR started.
Apr 25 16:10:08.694 - 012-Psithaca.baker.transitions: received new head BM7XTi3qFb9qUq63XtZfR5RMivpbJn5CDpEMaSfonuwTdQX14QS at
...
Apr 25 16:10:09.667 - 012-Psithaca.baker.actions:
Lost connection with the node. Retrying to establish connection...
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: loop failed with
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: Error:
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: Rpc request failed:
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: - meth: GET
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: - uri: http://127.0.0.1:8732//chains/main/mempool/monitor_operations?applied=yes&refused=no&outdated=no&branch_refused=no&branch_delayed=yes
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker: - error: Unable to connect to the node: "Unix.Unix_error(Unix.ECONNREFUSED, "connect", "")"
Apr 25 16:10:12.832 - 012-Psithaca.baker.operation_worker:
Connection refused, retrying in 1.00 seconds...
Waiting for the node to be bootstrapped...
Connection refused, retrying in 1.50 seconds...
Connection refused, retrying in 2.25 seconds...
Connection refused, retrying in 3.38 seconds...
Connection refused, retrying in 5.06 seconds...
Connection refused, retrying in 7.59 seconds...
Connection refused, retrying in 10.00 seconds...
Connection refused, retrying in 10.00 seconds...
Connection refused, retrying in 10.00 seconds...
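The delays printed in the log above look consistent with an exponential backoff that starts at 1 second, multiplies by 1.5 after each failure and is capped at 10 seconds. The sketch below reproduces that schedule in Python. The factor, the cap and the names `retry_delays` and `first_delays` are inferred from the log output or invented for illustration. They are not taken from the baker's source.

```python
from itertools import islice
from typing import Iterator, List


def retry_delays(
    initial: float = 1.0,  # first delay seen in the log
    factor: float = 1.5,   # inferred: 1.00 -> 1.50 -> 2.25 -> ...
    cap: float = 10.0,     # inferred: delays stop growing at 10.00
) -> Iterator[float]:
    """Yield an endless sequence of capped exponential backoff delays."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, cap)


def first_delays(n: int) -> List[float]:
    """Return the first n delays of the default schedule."""
    return list(islice(retry_delays(), n))
```

Formatting `first_delays(9)` to two decimals should give the same sequence as the "retrying in ... seconds" lines above. In the failing case from the steps to reproduce, the daemon stops after the fifth delay (5.06 seconds) instead of continuing along this schedule.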
Logs
RPC logs:
$ tezos-baker-012-Psithaca -l run with local node node-dir-ithacanet baker --keep-alive
>>>>0: http://127.0.0.1:8732//monitor/bootstrapped
Connection refused, retrying in 1.00 seconds...
Waiting for the node to be bootstrapped...
>>>>1: http://127.0.0.1:8732//monitor/bootstrapped
Connection refused, retrying in 1.50 seconds...
>>>>2: http://127.0.0.1:8732//monitor/bootstrapped
Connection refused, retrying in 2.25 seconds...
>>>>3: http://127.0.0.1:8732//monitor/bootstrapped
Connection refused, retrying in 3.38 seconds...
>>>>4: http://127.0.0.1:8732//monitor/bootstrapped
Connection refused, retrying in 5.06 seconds...
>>>>5: http://127.0.0.1:8732//monitor/bootstrapped
Error:
Rpc request failed:
- meth: GET
- uri: http://127.0.0.1:8732//monitor/bootstrapped
- error: Unable to connect to the node: "Unix.Unix_error(Unix.ECONNREFUSED, "connect", "")"
Edited by Roman Melnikov