T69: Problems with random disconnects when using channel read/write & remote shell
Description
Originally reported by migration: https://bugs.libssh.org/T69
In my pretty basic example program (working source code: https://pastebin.com/cybVfnVj) I receive random disconnects from the channel, which I cannot understand. The program serves as remote command executor using libssh, and while the first few commands are executed just fine, at some point ssh//channel//is//open() will return false with no obvious reason, and the channel is closed/unusable. Since I need a remote shell, I am working with channel read/write operations rather than ssh//channel//request//exec. When turning on verbosity, the last output of the above example program will be:
Execute [npm install]
[2017/05/08 08:55:50.187352, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:50.187357, 3] packet//send2: packet: wrote [len=28,padding=7,comp=20,payload=20]
[2017/05/08 08:55:50.187361, 3] channel//write//common: channel//write wrote 11 bytes
[2017/05/08 08:55:50.187370, 3] packet//send2: packet: wrote [len=28,padding=17,comp=10,payload=10]
[2017/05/08 08:55:50.187374, 3] channel//write//common: channel//write wrote 1 bytes
[2017/05/08 08:55:50.187383, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:50.187392, 3] packet//send2: packet: wrote [len=60,padding=8,comp=51,payload=51]
[2017/05/08 08:55:50.187395, 3] channel//write//common: channel//write wrote 42 bytes
[2017/05/08 08:55:50.187402, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:50.187410, 3] packet//send2: packet: wrote [len=28,padding=17,comp=10,payload=10]
[2017/05/08 08:55:50.187414, 3] channel//write//common: channel//write wrote 1 bytes
[2017/05/08 08:55:50.187423, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:50.187439, 3] ssh//channel//read//timeout: Read (1024) buffered : 0 bytes. Window: 1277707
[2017/05/08 08:55:53.608870, 3] ssh//packet//socket//callback: packet: read type 94 [len=60,padding=15,comp=44,payload=44]
[2017/05/08 08:55:53.608888, 3] ssh//packet//process: Dispatching handler for packet type 94
[2017/05/08 08:55:53.608895, 3] channel//rcv//data: Channel receiving 35 bytes data in 0 (local win=1277707 remote win=2096572)
[2017/05/08 08:55:53.608899, 3] channel//default//bufferize: placing 35 bytes into channel buffer (stderr=0)
[2017/05/08 08:55:53.608902, 3] channel//rcv//data: Channel windows are now (local win=1277672 remote win=2096572)----
Server response: Code (0)
Server output: [0 9903373aee394f16a077720cc5f58e80
]
Server error: []
Execute [pwd -P]
[2017/05/08 08:55:53.608945, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:53.608950, 3] packet//send2: packet: wrote [len=28,padding=12,comp=15,payload=15]
[2017/05/08 08:55:53.608954, 3] channel//write//common: channel//write wrote 6 bytes
[2017/05/08 08:55:53.608963, 3] packet//send2: packet: wrote [len=28,padding=17,comp=10,payload=10]
[2017/05/08 08:55:53.608967, 3] channel//write//common: channel//write wrote 1 bytes
[2017/05/08 08:55:53.608976, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:53.608985, 3] packet//send2: packet: wrote [len=60,padding=8,comp=51,payload=51]
[2017/05/08 08:55:53.608989, 3] channel//write//common: channel//write wrote 42 bytes
[2017/05/08 08:55:53.608996, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:53.609004, 3] packet//send2: packet: wrote [len=28,padding=17,comp=10,payload=10]
[2017/05/08 08:55:53.609007, 3] channel//write//common: channel//write wrote 1 bytes
[2017/05/08 08:55:53.609015, 3] ssh//socket//unbuffered//write: Enabling POLLOUT for socket
[2017/05/08 08:55:53.609021, 3] ssh//channel//read//timeout: Read (1024) buffered : 0 bytes. Window: 1277672
[2017/05/08 08:55:53.609116, 3] ssh//packet//socket//callback: packet: read type 98 [len=44,padding=18,comp=25,payload=25]
[2017/05/08 08:55:53.609127, 3] ssh//packet//process: Dispatching handler for packet type 98
[2017/05/08 08:55:53.609133, 3] channel//rcv//request: received exit-status 0
[2017/05/08 08:55:53.609137, 3] ssh//packet//socket//callback: Processing 72 bytes left in socket buffer
[2017/05/08 08:55:53.609144, 3] ssh//packet//socket//callback: packet: read type 96 [len=12,padding=6,comp=5,payload=5]
[2017/05/08 08:55:53.609147, 3] ssh//packet//process: Dispatching handler for packet type 96
[2017/05/08 08:55:53.609151, 3] channel//rcv//eof: Received eof on channel (43:0)
[2017/05/08 08:55:53.609154, 3] ssh//packet//socket//callback: Processing 36 bytes left in socket buffer
[2017/05/08 08:55:53.609160, 3] ssh//packet//socket//callback: packet: read type 97 [len=12,padding=6,comp=5,payload=5]
[2017/05/08 08:55:53.609163, 3] ssh//packet//process: Dispatching handler for packet type 97
[2017/05/08 08:55:53.609166, 3] channel//rcv//close: Received close on channel (43:0)
ERROR: CHANNEL IS CLOSED, trying to REOPEN....
Some things to notice:
- I will receive the disconnect only when actually connecting to a remote host, not localhost via ssh (the above is a connection from MacOS 10.12.4 to a RHEL 5.7 box)
- The disconnect will not occur when only issueing 1-2 remote commands. Also they will not occur when issueing the same command a lot of times in repetition. The commands in the example program always break for me at the last command ("pwd -P").
- Error handling/server verification is ommitted in the example for the sake of simplicity
- I am using libssh 0.7.5 compiled with openssl-1.0.0f on MacOS 10.12.4
Comments:
migration commented on 2017-06-19 06:54:24 UTC:
On Thursday, May 11th 2017 12:19:33, Patrick Fial wrote:
Actually I was able to pinpoint the "bad" remote command: the "npm install" command seems to request a PTY, which packet type 98 indicates. However, the channel is closed thereafter, probably because I requested a shell on the channel, but NOT a PTY. So either the server disconnects because libssh denies the pty request, or probably libssh closes the channel because I did not open a PTY? I was kind of able to confirm this assumption by changing my program code to also request a PTY via libssh commands, but this is not a solution, because parsing the resulting command/channel output in shell/pty mode will be cumbersome/impossible.
Either way, I wonder if there is a solution for this issue. Would this mean that it is not possible to execute remote commands in non-pty mode which might request a PTY?
I tried to use the channel//pty//request//function, like so:
struct ssh//channel//callbacks//struct cb = { .userdata = NULL, .channel//pty//request//function = onPtyRequest }; ssh//callbacks//init(&cb); ssh//set//channel//callbacks(channel, &cb); [...] int onPtyRequest(ssh//session session, ssh//channel channel, const char **term, int width, int height, int pxwidth, int pwheight, void **userdata) { printf("######################### PTY request received ##############\n"); return -1; }
My idea was that I could return -1 to deny the pty request, but this only makes the program crash with signal 11.
** thread #1, queue = 'com.apple.main-thread', stop reason = EXC//BAD//ACCESS (code=EXC//I386//GPFLT) ** frame #0: 0x0000000100001cef a.out frame #1: 0x00000001000c6da1 libssh.4.dylib`ssh//socket//pollcallback + 1857 frame #2: 0x00000001000c1e09 libssh.4.dylib`ssh//poll//ctx//dopoll + 441 frame #3: 0x00000001000c3af3 libssh.4.dylib`ssh//handle//packets + 291 frame #4: 0x00000001000c37c8 libssh.4.dylib`ssh//handle//packets//termination + 232 frame #5: 0x000000010009f9dc libssh.4.dylib`ssh//channel//read//timeout + 396 frame #6: 0x000000010009f83f libssh.4.dylib`ssh//channel//read + 47 frame #7: 0x000000010000142b a.out`readBytes(aBuffer=0x00000001003070f0, fromStderr=0, until="9903373aee394f16a077720cc5f58e80") at main2.cc:303 frame #8: 0x000000010000112a a.out`execute(command="date") at main2.cc:241 frame #9: 0x0000000100000b5b a.out`main(argc=1, argv=0x00007fff5fbff668) at main2.cc:75 frame #10: 0x00007fffaacc9235 libdyld.dylib`start + 1 frame #11: 0x00007fffaacc9235 libdyld.dylib`start + 1