witness_node program is crashing on active SON node (Mainnet)
Description:
I have an active SON on Mainnet. After setting up the server, installing/configuring Peerplays and a Bitcoin node, the witness_node program syncs all the blocks up to current. Then when my SON is scheduled, it tries to create "son down" proposals for all the other SONs. Most of the time these new proposals fail. Sometimes the proposals succeed in being created. The SON will operate for a few minutes and then get suddenly disconnected/crash (I can't exactly tell what's happening). Either way, the witness_node program ends.
The first problem I found was that my server time was off. I was getting a p2p disconnect because I was proposing blocks with a timestamp that was "in the future". I installed and configured NTP which solved that problem. But now it's crashing for some other reason.
Relevant info:
Server details
- Amazon AWS EC2 instance (r6g.large)
- 2 CPUs, 16 GB RAM, 150 GB storage, 10 Gigabit network
- arm64 architecture
- static public IP
- Inbound ports for P2P, RPC, HTTP, HTTPS, and the Bitcoin node are open.
- Outbound ports are open for all outbound traffic.
Software details
- Peerplays version 1.5.13
- built on the machine from source
- Bitcoin node (Bitcoin Core) version 0.21.1
- downloaded & extracted .tar.gz from bitcoin.org
Configuration details
- Peerplays is set up as a seed node (opened p2p), API node (opened RPC), and SON node
- Peerplays private key in config.ini is the SON account's active private key
- SON's signing key is the SON account's active key
- son-wallet is set up with a bitcoin pubkey and private key
Logs
Here's the end of my default.log when the program ends:
2021-08-18T02:24:42.400 th_a:invoke handle_block handle_block ] Got block: #42704062 time: 2021-08-18T02:24:42 latency: 400 ms from: jburr-blocks irreversible: 42704052 (-10) application.cpp:495
2021-08-18T02:24:42.901 th_a:SON Processing son_processing ] Scheduled SON: 1.33.3 Now: 2021-08-18T02:24:43 peerplays_sidechain_plugin.cpp:398
2021-08-18T02:24:45.381 th_a:invoke handle_block handle_block ] Got block: #42704063 time: 2021-08-18T02:24:45 latency: 381 ms from: robert-hedler irreversible: 42704053 (-10) application.cpp:495
2021-08-18T02:24:45.882 th_a:SON Processing son_processing ] Scheduled SON: 1.33.4 Now: 2021-08-18T02:24:46 peerplays_sidechain_plugin.cpp:398
2021-08-18T02:24:48.518 th_a:invoke handle_block handle_block ] Got block: #42704064 time: 2021-08-18T02:24:48 latency: 518 ms from: eifos-witness irreversible: 42704056 (-8) application.cpp:495
2021-08-18T02:24:49.019 th_a:SON Processing son_processing ] Scheduled SON: 1.33.0 Now: 2021-08-18T02:24:49 peerplays_sidechain_plugin.cpp:398
2021-08-18T02:24:54.401 th_a:invoke handle_block handle_block ] Got block: #42704065 time: 2021-08-18T02:24:54 latency: 401 ms from: hiltos-witness irreversible: 42704057 (-8) application.cpp:495
2021-08-18T02:24:54.902 th_a:SON Processing son_processing ] Scheduled SON: 1.33.9 Now: 2021-08-18T02:24:55 peerplays_sidechain_plugin.cpp:398
2021-08-18T02:24:57.452 th_a:invoke handle_block handle_block ] Got block: #42704066 time: 2021-08-18T02:24:57 latency: 452 ms from: royal-flush irreversible: 42704058 (-8) application.cpp:495
2021-08-18T02:24:57.952 th_a:SON Processing son_processing ] Scheduled SON: 1.33.7 Now: 2021-08-18T02:24:58 peerplays_sidechain_plugin.cpp:398
2021-08-18T02:24:57.953 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.0 from 1.33.7 peerplays_sidechain_plugin.cpp:558
2021-08-18T02:24:57.953 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.1 from 1.33.7 peerplays_sidechain_plugin.cpp:558
2021-08-18T02:24:57.954 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.2 from 1.33.7 peerplays_sidechain_plugin.cpp:558
2021-08-18T02:24:57.955 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.3 from 1.33.7 peerplays_sidechain_plugin.cpp:558
2021-08-18T02:24:57.955 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.4 from 1.33.7 peerplays_sidechain_plugin.cpp:558
2021-08-18T02:24:57.956 th_a:SON Processing create_son_down_prop ] Sending son down proposal for 1.33.9 from 1.33.7 peerplays_sidechain_plugin.cpp:558
Here's the end of my p2p.log:
2021-08-18T02:24:39.208 p2p:message read_loop on_connection_accept ] I don't know if I'm firewalled. Sending a firewall check message to peer 104.225.216.198:9779 node.cpp:2121
2021-08-18T02:24:39.245 p2p:connect_to_task connect_to ] fatal: error connecting to peer 173.249.23.108:9777: 0 exception: unspecified
Connection refused
{"message":"Connection refused"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:39.247 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:39"},"format":"${message} ","data":{"message":"Connection refused"}}]} peer_connection.cpp:167
2021-08-18T02:24:39.252 p2p:connect_to_task connect_to ] fatal: error connecting to peer 109.111.245.114:9777: 0 exception: unspecified
No route to host
{"message":"No route to host"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:39.252 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:39"},"format":"${message} ","data":{"message":"No route to host"}}]} peer_connection.cpp:167
2021-08-18T02:24:39.573 p2p:message read_loop read_loop ] disconnected 11 eof_exception: End Of File
End of file
{"message":"End of file"}
asio asio.cpp:39 operator()
{"len":16}
p2p stcp_socket.cpp:117 readsome message_oriented_connection.cpp:218
2021-08-18T02:24:39.609 p2p:message read_loop on_check_firewall_re ] Firewall check we initiated has returned with result: connection_successful, endpoint = 18.116.153.207:9777 node.cpp:3774
2021-08-18T02:24:45.132 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Forcibly disconnecting from handshaking peer 88.198.13.39:9777 due to inactivity of at least 5 seconds node.cpp:1345
2021-08-18T02:24:45.132 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Peer's negotiating status: connecting, bytes sent: 0, bytes received: 0 node.cpp:1347
2021-08-18T02:24:45.132 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Forcibly disconnecting from handshaking peer 172.17.0.2:39319 due to inactivity of at least 5 seconds node.cpp:1345
2021-08-18T02:24:45.132 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Peer's negotiating status: connecting, bytes sent: 0, bytes received: 0 node.cpp:1347
2021-08-18T02:24:45.133 p2p:connect_to_task connect_to ] fatal: error connecting to peer 88.198.13.39:9777: 0 exception: unspecified
Operation canceled
{"message":"Operation canceled"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:45.133 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:45"},"format":"${message} ","data":{"message":"Operation canceled"}}]} peer_connection.cpp:167
2021-08-18T02:24:45.133 p2p:connect_to_task connect_to ] fatal: error connecting to peer 172.17.0.2:39319: 0 exception: unspecified
Operation canceled
{"message":"Operation canceled"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:45.133 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:45"},"format":"${message} ","data":{"message":"Operation canceled"}}]} peer_connection.cpp:167
2021-08-18T02:24:55.133 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Forcibly disconnecting from handshaking peer 95.216.8.243:42765 due to inactivity of at least 5 seconds node.cpp:1345
2021-08-18T02:24:55.133 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Peer's negotiating status: connecting, bytes sent: 0, bytes received: 0 node.cpp:1347
2021-08-18T02:24:55.133 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Forcibly disconnecting from handshaking peer 195.154.83.79:3338 due to inactivity of at least 5 seconds node.cpp:1345
2021-08-18T02:24:55.133 p2p:terminate_inactive_connections_loop terminate_inactive_c ] Peer's negotiating status: connecting, bytes sent: 0, bytes received: 0 node.cpp:1347
2021-08-18T02:24:55.133 p2p:connect_to_task connect_to ] fatal: error connecting to peer 95.216.8.243:42765: 0 exception: unspecified
Operation canceled
{"message":"Operation canceled"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:55.133 p2p:connect_to_task connect_to ] fatal: error connecting to peer 195.154.83.79:3338: 0 exception: unspecified
Operation canceled
{"message":"Operation canceled"}
asio asio.cpp:60 error_handler peer_connection.cpp:264
2021-08-18T02:24:55.133 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:55"},"format":"${message} ","data":{"message":"Operation canceled"}}]} peer_connection.cpp:167
2021-08-18T02:24:55.133 p2p:delayed_peer_deletion_task destroy ] Unexpected exception from peer_connection's accept_or_connect_task : {"code":0,"name":"exception","message":"unspecified","stack":[{"context":{"level":"error","file":"asio.cpp","line":60,"method":"error_handler","hostname":"","thread_name":"asio","timestamp":"2021-08-18T02:24:55"},"format":"${message} ","data":{"message":"Operation canceled"}}]} peer_connection.cpp:167
There are no entries in the rpc.log.
Expected results:
The witness_node program should continue to run. I don't think the SON should be producing "son down" proposals for the other SONs unless they truly are down.
Actual results:
The witness_node program ends after attempting to create/update proposals.