Poll is blocked indefinitely when the server callback can not send the whole packet
Originally discovered in !263 (closed) and further elaborated in !340 (closed).
tl;dr:
The libssh callback based server is using poll to check when there is something to read or we can write to the socket. The problem is that the poll is invoked recursively. Probably to prevent infinite recursion of invoking callbacks, the socket is "locked" for the time the callback is executed. But as we are technically operating on a single socket, this does not prevent invoking more callbacks, but also writing to this socket through poll.
!340 (comment 1295396598):
Longer version based onThe callbacks are invoked from within the first poll loop and the poll socket is locked for the time the callback is executed. The callback itself processes the packet and constructs the reply packet to be send to the peer from sftp_reply_data. The code sends the first part of the packet, but can not send the rest so it tries to enable the POLLOUT event on the socket, which is not immediately effective, because the lock is active. This is still non-blocking part, which goes on without any issue. But after that, we got from ssh_channel_write_common to blocking flush of the outgoing buffer, which invokes the poll again. But as mentioned previously, the POLLOUT event is not enabled on the socket so it waits (in blocking mode indefinitely) in poll.
This accidentally works all the way until we try to send some larger packet and we are not able to write them to the underlying kernel/network layer. I think the socket wrapper is too strict with allowing writing only up to the MTU, but we would hit the same issue with larger packets when the kernel or something would not allow to buffer some amount of data.
What are the ways out?
- The problem is really the blocking flush inside of the poll callback with locked socket in https://gitlab.com/libssh/libssh-mirror/-/blob/master/src/channels.c#L1580 . This code lives in there for 11 years. I really can not guess what would happen if we would change it to non-blocking flush -- this would allow the code returning to the first poll, which would already pick up the POLLOUT event as it is correctly set after we return from the callback.
- The inside problem is calling the poll recursively while keeping the socket locked. I could not find much information about the reason why it is locked. Probably to avoid recursive execution of the callbacks. One option might be locking the socket only for the POLLIN events and allowing the POLLOUT events to break out of this issue? Or could the outside poll break if we would change the events from inside of the callback?