Worker crashed: no space left on device
Maybe my Linux-fu is lacking, but I cannot find a reason for this. Only tezos-node is reporting the issue. Earlier today, my server raised a disk-full alarm. At the time the only drive present was vda1, and it had filled up. I stopped tezos-node, created a new volume, moved .tezos-node/* to the new volume, and corrected the config file. I fixed permissions, etc., and started the node. When I checked on it afterwards, the node was reporting "no space left on device" errors.
There is clearly plenty of space on the new volume (50% free), and plenty of inodes are available. Config, disk usage, inode usage, and node log follow.
[bake@naaa ~]# cat /home/bake/.tezos-node/config.json
{ "data-dir": "/mnt/tezos-node",
"rpc": { "listen-addr": "127.0.0.1:8732" },
"p2p":
{ "listen-addr": "[::]:9732",
"limits":
{ "connection-timeout": 10, "min-connections": 5,
"expected-connections": 10, "max-connections": 15,
"max_known_points": [ 80, 60 ], "max_known_peer_ids": [ 80, 60 ] } },
"shell": { "chain_validator": { "bootstrap_threshold": 2 } } }
[bake@naaa ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 25G 5.4G 20G 22% /
devtmpfs 474M 0 474M 0% /dev
tmpfs 496M 0 496M 0% /dev/shm
tmpfs 496M 13M 484M 3% /run
tmpfs 496M 0 496M 0% /sys/fs/cgroup
/dev/sda 40G 20G 21G 50% /mnt/tezos-node
tmpfs 100M 0 100M 0% /run/user/1001
[bake@naaa ~]$ df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/vda1 13106624 90846 13015778 1% /
devtmpfs 121304 309 120995 1% /dev
tmpfs 126943 1 126942 1% /dev/shm
tmpfs 126943 395 126548 1% /run
tmpfs 126943 16 126927 1% /sys/fs/cgroup
/dev/sda 20971520 13 20971507 1% /mnt/tezos-node
tmpfs 126943 1 126942 1% /run/user/1001
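Plain `df` lists every mount but does not say which one a given directory actually lands on. A minimal sketch for checking that per path is below. The function names `fs_of` and `report` are made up for illustration, and the parsing assumes the mount point contains no spaces.

```shell
#!/bin/sh
# Print the mount point of the filesystem that backs the path in $1.
# df -P uses the fixed POSIX column layout; row 2, field 6 is "Mounted on".
fs_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# For each path argument, show where it is mounted, then the
# space (df -h) and inode (df -i) figures for that filesystem only.
report() {
    for p in "$@"; do
        printf '%s -> mounted on %s\n' "$p" "$(fs_of "$p")"
        df -h "$p" | tail -n 1
        df -i "$p" | tail -n 1
    done
}
```

With the paths from this post it would be called as `report /home/bake/.tezos-node /mnt/tezos-node`. If both paths report the mount point expected for them, the node's config directory and data directory are at least sitting on the filesystems shown in the `df` output above.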
Oct 08 23:50:36 node.main: Starting a RPC server listening on ::ffff:127.0.0.1:8732.
Oct 08 23:50:36 node.main: The Tezos node is now running!
Oct 08 23:50:41 p2p.maintenance: Too few connections (5)
Oct 08 23:50:58 p2p.maintenance: Too few connections (5)
Oct 08 23:51:08 validator.peer(1): Worker started for NetXdQprcVkpa:idscox1A17ge
Oct 08 23:51:26 p2p.maintenance: Too few connections (5)
Oct 08 23:51:32 validator.peer(2): Worker started for NetXdQprcVkpa:idt5bXJSyT7G
Oct 08 23:51:32 validator.peer(3): Worker started for NetXdQprcVkpa:idtzAQ2RbKrB
Oct 08 23:51:33 validator.peer(4): Worker started for NetXdQprcVkpa:idtvNfu5gYb2
Oct 08 23:51:52 validator.peer(5): Worker started for NetXdQprcVkpa:idrkKSNLk5Bh
Oct 08 23:51:59 p2p.maintenance: Too few connections (5)
Oct 08 23:52:08 validator.block: Validation of block BKnT5z9272SpTn3cN1k6CncfWCVRUJr9sVEXrfqimfd24Jn9Rko failed
Oct 08 23:52:08 validator.block: Pushed: 2018-10-08T23:52:08Z, Treated: 2018-10-08T23:52:08Z, Failed: 2018-10-08T23:52:08Z
Oct 08 23:52:08 validator.block: No space left on device
Oct 08 23:52:08 validator.block: Validation of block BKnT5z9272SpTn3cN1k6CncfWCVRUJr9sVEXrfqimfd24Jn9Rko failed
Oct 08 23:52:08 validator.block: Pushed: 2018-10-08T23:52:08Z, Treated: 2018-10-08T23:52:08Z, Failed: 2018-10-08T23:52:08Z
Oct 08 23:52:08 validator.block: No space left on device
Oct 08 23:52:08 validator.block: Validation of block BKnT5z9272SpTn3cN1k6CncfWCVRUJr9sVEXrfqimfd24Jn9Rko failed
Oct 08 23:52:08 validator.block: Pushed: 2018-10-08T23:52:08Z, Treated: 2018-10-08T23:52:08Z, Failed: 2018-10-08T23:52:08Z
Oct 08 23:52:08 validator.block: No space left on device
Oct 08 23:52:08 node.validator.bootstrap_pipeline: Unexpected error (validator): Error:
Oct 08 23:52:08 node.validator.bootstrap_pipeline: No space left on device
Oct 08 23:52:08 node.validator.bootstrap_pipeline:
Oct 08 23:52:08 node.validator.bootstrap_pipeline: Unexpected error (validator): Error:
Oct 08 23:52:08 node.validator.bootstrap_pipeline: No space left on device
Oct 08 23:52:08 node.validator.bootstrap_pipeline:
Oct 08 23:52:08 node.validator.bootstrap_pipeline: Unexpected error (validator): Error:
Oct 08 23:52:08 node.validator.bootstrap_pipeline: No space left on device
Oct 08 23:52:08 node.validator.bootstrap_pipeline:
Oct 08 23:52:08 validator.peer(3): Worker crashed:
Oct 08 23:52:08 validator.peer(3): No space left on device
Oct 08 23:52:08 validator.peer(5): Worker crashed:
Oct 08 23:52:08 validator.peer(5): No space left on device
Oct 08 23:52:08 validator.peer(4): Worker crashed:
Oct 08 23:52:08 validator.peer(4): No space left on device
Oct 08 23:52:10 validator.block: Validation of block BKnT5z9272SpTn3cN1k6CncfWCVRUJr9sVEXrfqimfd24Jn9Rko failed
Oct 08 23:52:10 validator.block: Pushed: 2018-10-08T23:52:10Z, Treated: 2018-10-08T23:52:10Z, Failed: 2018-10-08T23:52:10Z
Oct 08 23:52:10 validator.block: No space left on device
Oct 08 23:52:10 node.validator.bootstrap_pipeline: Unexpected error (validator): Error:
Oct 08 23:52:10 node.validator.bootstrap_pipeline: No space left on device
Oct 08 23:52:10 node.validator.bootstrap_pipeline:
I can write to files on this volume with no issue. I have unmounted/remounted the volume. I have even rebooted the server.
[root@naaa ~]# dd if=/dev/zero of=/mnt/tezos-node/2G-file.dat count=2048 bs=1MiB
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 6.46693 s, 332 MB/s
[root@naaa ~]# ls -la /mnt/tezos-node/2G-file.dat
-rw-r--r--. 1 root root 2147483648 Oct 9 00:02 /mnt/tezos-node/2G-file.dat
[root@naaa ~]# rm /mnt/tezos-node/2G-file.dat
rm: remove regular file ‘/mnt/tezos-node/2G-file.dat’? y
I've googled around, and there are no "deleted files" still taking up space. SELinux is in permissive mode.
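For reference, the "deleted files" check can be sketched as follows. It scans `/proc` for file descriptors whose target has been unlinked, which is the same situation `lsof +L1` reports. The function name `deleted_open_files` and its optional path-prefix argument are illustrative, and it needs to run as root to see other users' processes.

```shell
#!/bin/sh
# List open file descriptors whose file has been deleted but is still
# held open (and so still consumes space). Optional $1 restricts the
# output to targets under that path prefix; the default is everything.
deleted_open_files() {
    prefix=${1:-/}
    for fd in /proc/[0-9]*/fd/*; do
        # The fd may vanish or be unreadable; skip it in that case.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            # The kernel appends " (deleted)" to unlinked targets.
            "$prefix"*' (deleted)') printf '%s %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Calling `deleted_open_files /mnt/tezos-node` prints nothing when no process holds a deleted file open under that path.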
Something is preventing only tezos-node from writing to this volume, and I have no idea what. Is there something in the node's databases that is caching this state?
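One way to narrow this down, sketched under the assumption that `strace` is installed and run as root, is to attach to the running node and watch for the system call that actually returns ENOSPC. The helper names `enospc_only` and `trace_enospc` are hypothetical. This is a diagnostic sketch, not a known fix.

```shell
#!/bin/sh
# Keep only the lines of strace output (on stdin) that mention ENOSPC.
enospc_only() {
    grep -F 'ENOSPC'
}

# Attach to the process whose PID is $1 and show calls failing with ENOSPC.
# -f follows threads and child processes.
# -y annotates file descriptors with the path they refer to.
# strace writes to stderr, hence the 2>&1 before the pipe.
trace_enospc() {
    strace -f -y -p "$1" 2>&1 | enospc_only
}
```

It would be run as `trace_enospc PID` with the node's process ID. The matching lines show which call failed and, thanks to `-y`, which path was involved. ENOSPC is not limited to disk writes: `inotify_add_watch`, for example, returns it when the per-user watch limit is reached, so the call name is worth reading as closely as the path.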