Eric Blake authored
It's worth testing that our transition to the DEAD state works as expected, by intentionally killing a server. This test also makes it possible to test what happens when pending commands are stranded, so that an upcoming patch to add notifiers can show the difference it makes.

The test was surprisingly hard to write. For starters, sending SIGINT to nbdkit was sometimes enough to kill the process (if it hadn't yet read the NBD_CMD_READ, and therefore did not try to wait for any outstanding requests before quitting), but often it was not (because nbdkit was stuck in pthread_join()). Then there's the race where nbd_poll() can sometimes get lucky enough to catch a POLLHUP in REPLY.START, where recv() returning 0 transitions things to CLOSED, but more often catches a POLLERR and transitions to DEAD. Then there was the hour I spent scratching my head over why kill didn't seem to get rid of the child process even though poll() was definitely seeing the fd close, until I remembered that kill(pid, 0) on a zombie process succeeds until you wait() for it or alter SIGCHLD handling to specifically prevent zombies. Debugging the now-fixed uninitialized variable in nbd_unlocked_poll() added to the mix.

I'm not 100% sure the test is portable to non-Linux; I guess we'll eventually find out when someone worries about a BSD port of libnbd.
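
The zombie pitfall described above can be reproduced in isolation. The following is a minimal sketch, not code from the test itself: probe_zombie() is a hypothetical helper name, and it assumes the calling process has the default SIGCHLD disposition (so the child stays a zombie until reaped) and that the child's pid is not recycled between waitpid() and the final probe.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper for illustration; not part of libnbd.
 * Forks a child, kills it, and probes it with kill(pid, 0) both
 * before and after reaping it.
 * Assumes SIGCHLD has its default disposition in the caller.
 * Returns 0 on success, -1 if fork, kill or waitpid failed. */
int
probe_zombie (int *before_reap, int *errno_after_reap)
{
  pid_t pid;

  assert (before_reap != NULL && errno_after_reap != NULL);

  pid = fork ();
  if (pid == -1)
    return -1;
  if (pid == 0) {
    /* Child: sleep until the parent kills us. */
    for (;;)
      pause ();
  }

  if (kill (pid, SIGKILL) == -1)
    return -1;

  /* The child is now either still dying or already a zombie.  In both
   * cases the pid still exists, so the null-signal probe succeeds. */
  *before_reap = kill (pid, 0);

  /* Reap the child; only now does the pid really go away. */
  if (waitpid (pid, NULL, 0) != pid)
    return -1;

  /* After reaping, the probe is expected to fail with ESRCH. */
  if (kill (pid, 0) == -1)
    *errno_after_reap = errno;
  else
    *errno_after_reap = 0;
  return 0;
}
```

Under those assumptions, a caller should see *before_reap set to 0 (the probe succeeds even though the child has been killed) and *errno_after_reap set to ESRCH. This is why a check that relies on kill(pid, 0) alone cannot tell a dead-but-unreaped server from a live one.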
91f24f7b