T125: libssh 0.8.5 under Windows 10
Description
Originally reported by Gilles_Pelletier: https://bugs.libssh.org/T125
Hello world, I am developing a PuTTY-like terminal with libssh and I have a problem in the bsd_poll() function: sometimes it hangs.
/* compute fd_sets and find largest descriptor */
nfds 1
fds[0] {fd=2712 events=0 revents=784 }
rc remains at -1, so the check
if (max_fd == SSH_INVALID_SOCKET || rc == -1) { errno = EINVAL; return -1; }
is true and the function bsd_poll() returns -1.
To work around this, I put the program to sleep for 200 ms when it reads 0 bytes of data, but in that case the program is too slow. What is wrong? Many thanks.
Comments:
Gilles_Pelletier commented on 2018-12-12 11:00:42 UTC:
also seen with libssh 0.8.2
asn commented on 2018-12-13 07:47:36 UTC:
Hi Gilles,
I have some fixes at https://git.libssh.org/users/asn/libssh.git/log/?h=master-poll
Gilles_Pelletier commented on 2018-12-14 11:27:54 UTC:
Hi Andreas,
I have done a mix between 0.8.5 from https://github.com/ShiftMediaProject/libssh and yours https://git.libssh.org/users/asn/libssh.git/snapshot/libssh-master-poll.zip
Some logs to confirm the new code:
ssh_connect: libssh 0.8.90 (c) 2003-2018 Aris Adamantiadis, Andreas Schneider and libssh contributors. Distributed under the LGPL, please refer to COPYING file for information about your rights, using threading threads_winlock
ssh_socket_connect: Nonblocking connection socket: 2648
ssh_connect: Socket connecting, now waiting for the callbacks to work
ssh_connect: Actual timeout : 10000
ssh_socket_pollcallback: Poll callback on socket 2648 (POLLOUT ), out buffer 0
ssh_socket_pollcallback: Received POLLOUT in connecting state
socket_callback_connected: Socket connection callback: 1 (0)
ssh_socket_unbuffered_write: Enabling POLLOUT for socket
ssh_socket_pollcallback: Poll callback on socket 2648 (POLLOUT ), out buffer 0
ssh_socket_pollcallback: Poll callback on socket 2648 (POLLIN ), out buffer 0
callback_receive_banner: Received banner: SSH-2.0-OpenSSH_7.1
ssh_client_connection_callback: SSH server banner: SSH-2.0-OpenSSH_7.1
ssh_analyze_banner: Analyzing banner: SSH-2.0-OpenSSH_7.1
ssh_analyze_banner: We are talking to an OpenSSH client version: 7.1 (70100)
The same issue occurs if I reduce the wait on an empty read to 1 ms (it's better! :) Before, it was occurring at 5 ms).
libssh\src\poll.c
static int bsd_poll(ssh_pollfd_t *fds, nfds_t nfds, int timeout)
../..
    /* compute fd_sets and find largest descriptor */
    for (rc = -1, max_fd = 0, i = 0; i < nfds; i++) {
        if (fds[i].fd == SSH_INVALID_SOCKET) {
            continue;
        }
#ifndef _WIN32
        if (fds[i].fd >= FD_SETSIZE) {
            rc = -1;
            break;
        }
#endif
        if (fds[i].events & (POLLIN | POLLRDNORM)) {
            FD_SET (fds[i].fd, &readfds);
        }
        if (fds[i].events & (POLLOUT | POLLWRNORM | POLLWRBAND)) {
            FD_SET (fds[i].fd, &writefds);
        }
        if (fds[i].events & (POLLPRI | POLLRDBAND)) {
            FD_SET (fds[i].fd, &exceptfds);
        }
        if (fds[i].fd > max_fd &&
            (fds[i].events & (POLLIN | POLLOUT | POLLPRI |
                              POLLRDNORM | POLLRDBAND |
                              POLLWRNORM | POLLWRBAND))) {
            max_fd = fds[i].fd;
            rc = 0;
        }
    }
    if (max_fd == SSH_INVALID_SOCKET || rc == -1) {
        errno = EINVAL;
        return -1;
    }
I put a breakpoint on the line “errno = EINVAL;”, launched my terminal program in the debugger and watched some variables at this point:
max_fd 0
rc -1
nfds 1
fds[0] {fd=2628 events=0 revents=784 }
I've tried forcing the select() or the timeout, or changing the -1 to 0, but the polling seems to be broken at this point.
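The watched values are consistent with the scan loop quoted above: rc only becomes 0 inside the last if, and that branch needs at least one requested event bit, so a single entry with events=0 leaves rc at -1 and max_fd at 0. A minimal sketch of just that scan, assuming simplified stand-ins (sk_pollfd, sk_scan and the SK_* constants are hypothetical names for illustration, not the libssh definitions, and the flag values are arbitrary):

```c
/* Hypothetical stand-ins for this sketch only; not the libssh definitions. */
#define SK_POLLIN  0x0001
#define SK_POLLPRI 0x0002
#define SK_POLLOUT 0x0004
#define SK_INVALID_SOCKET (-1)

struct sk_pollfd {
    int fd;
    short events;   /* events the caller asks for */
    short revents;  /* events reported back; ignored by the scan */
};

/*
 * Mirrors only the descriptor scan at the top of the quoted bsd_poll():
 * returns 0 if at least one valid descriptor requests an event,
 * -1 otherwise (the case where bsd_poll() sets errno = EINVAL).
 */
static int sk_scan(const struct sk_pollfd *fds, unsigned int nfds, int *max_fd)
{
    int rc = -1;
    unsigned int i;

    *max_fd = 0;
    for (i = 0; i < nfds; i++) {
        if (fds[i].fd == SK_INVALID_SOCKET) {
            continue;
        }
        /* rc is cleared only when some event bit is requested */
        if (fds[i].fd > *max_fd &&
            (fds[i].events & (SK_POLLIN | SK_POLLOUT | SK_POLLPRI))) {
            *max_fd = fds[i].fd;
            rc = 0;
        }
    }
    return rc;
}
```

Calling sk_scan() on one entry with fd=2628 and events=0 returns -1, whatever revents holds. The sketch only shows that the EINVAL path follows from the empty events mask; it does not explain why the mask is empty when bsd_poll() is called.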
My new questions: I notice these lines in the log (verbosity = 4, SSH_LOG_FUNCTIONS):
grow_window: growing window (channel 43:0) to 1280000 bytes
channel_rcv_data: Channel receiving 81 bytes data in 0 (local win=1279757 remote win=2097152)
channel_default_bufferize: placing 81 bytes into channel buffer (stderr=0)
channel_rcv_data: Channel windows are now (local win=1279676 remote win=2097152)
channel_request: Channel request shell success
Is it possible to enlarge this window (a buffer, I suppose...) with the C++ API?
Comment on the bsd_poll() function:
/*
 * This is a poll(2)-emulation using select for systems not providing a native
 * poll implementation.
 * Keep in mind that select is terribly inefficient. The interface is simply not
 * meant to be used with maximum descriptor value greater, say, 32 or so. With
 * a value as high as 1024 on Linux you'll pay dearly in every single call.
 * poll() will be orders of magnitude faster.
 */
Does it mean there are problems with select() on Windows? Where can I find a poll implementation for Windows (Visual Studio 2017...), if any exists?
Many thanks.
Gilles_Pelletier commented on 2019-01-02 14:45:46 UTC:
Hi Andreas (best wishes for this new year and new release :) ), I've updated my project with this version: https://github.com/ShiftMediaProject/libssh/releases/tag/libssh-0.8.6
and the same issue occurs. Something goes wrong in bsd_poll() and I'm not able to fix it.
Many thanks.
Gilles_Pelletier commented on 2019-01-22 17:35:25 UTC:
Hi Andreas,
As I can't go further, I hacked around the problem on top of the API. I removed the C++ wrapper to work in pure C. This way I can read and write the channel and session structures to patch the error and fix the problem. I noticed that when this error occurs, the data is really present on the server side. Here is my ugly (but working) code:
ssh_session m_pSession;
ssh_channel m_pChannel;
m_pSession and m_pChannel are successfully created elsewhere...
int SendSocket(const void *data, size_t len)
{
    int iRet = -1 ;

    if (m_pChannel != NULL) {
        iRet = ssh_channel_write(m_pChannel, data, len) ;
        if (iRet == -1) {
            if (m_pSession->session_state != SSH_SESSION_STATE_AUTHENTICATED) {
                m_pSession->session_state = SSH_SESSION_STATE_AUTHENTICATED ;
                Sleep(500) ;
                // write 0 !
                iRet = ssh_channel_write(m_pChannel, data, 0) ;
                m_pSession->session_state = SSH_SESSION_STATE_AUTHENTICATED ;
                iRet = len ;
            }
        }
    } else {
        iRet = -1 ;
    }

    return iRet ;
}
With this trick, no problem...
asn commented on 2019-02-02 16:47:47 UTC:
I haven't had time to look into this issue again. However, the session state is internal; how did you get access to it? Also, the C++ wrapper is unmaintained. I really know someone who uses it for complex stuff.
asn commented on 2019-02-27 07:39:39 UTC:
Some time ago we used WSAPoll() on Windows, but it had a lot of issues, which is why we stopped using it. I guess the best option would be to move to a library which provides different event loops, like libverto.
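For context on the WSAPoll() remark: WSAPoll() is the Winsock call modelled on POSIX poll(), and its WSAPOLLFD structure has the same fd/events/revents fields, so a thin wrapper can choose between the two at compile time. A minimal sketch, assuming hypothetical names (sk_poll and sk_pollfd_t are not part of libssh); it does nothing about the WSAPoll() issues mentioned above, and on Windows it assumes WSAStartup() has already been called and that only sockets are polled:

```c
#ifdef _WIN32
#include <winsock2.h>            /* WSAPoll, WSAPOLLFD; link with ws2_32 */
typedef WSAPOLLFD sk_pollfd_t;
#else
#include <poll.h>                /* poll, struct pollfd */
typedef struct pollfd sk_pollfd_t;
#endif

/*
 * Hypothetical helper, not a libssh function: waits up to timeout_ms
 * milliseconds for events on nfds descriptors. Returns the number of
 * entries with non-zero revents, 0 on timeout, or a negative value on error.
 */
static int sk_poll(sk_pollfd_t *fds, unsigned long nfds, int timeout_ms)
{
#ifdef _WIN32
    return WSAPoll(fds, (ULONG)nfds, timeout_ms);
#else
    return poll(fds, (nfds_t)nfds, timeout_ms);
#endif
}
```

A caller fills fd and events (for example POLLIN) for each entry, calls sk_poll(), and then inspects revents, exactly as with poll().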