Deadlock in ssh_channel_open_forward_unix
My ssh client was deadlocked for 2 days, and I found that the deadlock took place in ssh_channel_open_forward_unix.
This is with the libssh shipped with Ubuntu 24.04 (0.10.6/openssl/zlib).
Here is the stack I got by attaching gdb to the deadlocked client:
#0 0x0000792816f1b4cd in __GI___poll (fds=0x57f90e5349c0, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1 0x0000792817542531 in ?? () from /lib/x86_64-linux-gnu/libssh.so.4
#2 0x000079281754c0ed in ?? () from /lib/x86_64-linux-gnu/libssh.so.4
#3 0x000079281754c2c1 in ?? () from /lib/x86_64-linux-gnu/libssh.so.4
#4 0x0000792817561dc1 in ?? () from /lib/x86_64-linux-gnu/libssh.so.4
#5 0x00007928175242b5 in ssh_channel_open_forward_unix () from /lib/x86_64-linux-gnu/libssh.so.4
The deadlock happened while the server the client was connected to was rebooting. The sequence of events was:
- The Unix domain socket the client was connected to was closed when the service was killed by the reboot.
- The client tried to reconnect immediately.
- The client managed to create a new ssh session because the ssh server had not yet been killed by the reboot.
- The client deadlocked when trying to connect to the socket.
My guess is that the ssh server was killed during the reboot, right in the middle of ssh_channel_open_forward_unix, and that this may be what caused the deadlock. I have not managed to reproduce the problem.
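For illustration, frame #0 of the stack shows poll() being called with timeout=-1, which blocks until an event arrives on the descriptor. Below is a minimal sketch in plain POSIX C of how a bounded wait behaves when the peer stays silent. This is not libssh code: wait_readable and silent_peer_times_out are hypothetical helpers, and the socketpair only stands in for a connection whose other end never answers.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of libssh: wait until fd has an event.
 * Returns 1 if poll() reported an event (data, hang-up or error),
 * 0 if timeout_ms elapsed first, -1 if poll() itself failed.
 * Passing timeout_ms = -1 would block indefinitely, as in frame #0. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    do {
        /* Note: a retry after EINTR restarts the full timeout. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) {
        return -1;
    }
    return rc > 0 ? 1 : 0;
}

/* Simulates a peer that stays connected but never sends anything.
 * Returns the result of a 100 ms bounded wait on our end. */
static int silent_peer_times_out(void)
{
    int sv[2];
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }
    rc = wait_readable(sv[0], 100);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

With a finite timeout the caller regains control and can tear down the session and retry. Whether the library can be configured to bound this particular wait is something I have not verified.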
Edited by Rémi Coulom