
Busyloop on unexpected incoming UDP packet

Description of problem:

100% CPU usage of an ocserv-worker process

Version of ocserv used:

Debian stable backports: 1.1.2-1~bpo10+1 (not affected: 1.1.1-1~bpo10+1, so this is a regression in 1.1.2)

Client used:

Unknown.

Distributor of ocserv

Debian.

How reproducible:

About 1 in 40 clients.

Details

epoll reports a "ready-for-read" event (EPOLLIN) for fd 5 (the UDP connection), but ocserv never reads from it, so the worker keeps busy-looping. The UDP connection is in state UP_DISABLED.

One theory is that the handshake failed. On that path I don't see the code removing the epoll registration, nor does it terminate the connection: https://gitlab.com/openconnect/ocserv/-/blob/ae049ee9ab0066a5fcddb85d892ac132e08e96db/src/worker-vpn.c#L1481-1482 (The break; only leaves the switch statement; ret is cleared afterwards.)
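To illustrate the mechanism suspected above, here is a minimal standalone sketch. It is not ocserv code: the names `count_wakeups` and `MODE_*` are hypothetical, and an AF_UNIX datagram socketpair stands in for the UDP socket. It shows that a level-triggered epoll registration keeps reporting EPOLLIN for as long as a datagram sits unread, and that either draining the socket or removing the registration stops the wakeups.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical modes: what the loop does when epoll reports EPOLLIN. */
enum { MODE_IGNORE, MODE_DRAIN, MODE_DEREGISTER };

/* Hypothetical helper: queue one datagram, call epoll_wait() `rounds` times,
 * and return how many calls reported EPOLLIN (or -1 on setup failure). */
int count_wakeups(int mode, int rounds)
{
    int sv[2];
    int epfd;
    int wakeups = 0;
    char buf[64];
    struct epoll_event ev = { .events = EPOLLIN };

    /* Assumption: a connected datagram pair stands in for the UDP socket. */
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
        return -1;

    epfd = epoll_create1(0);
    if (epfd < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    ev.data.fd = sv[0];
    /* Level-triggered registration (no EPOLLET), then one unread datagram. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) != 0 ||
        write(sv[1], "x", 1) != 1) {
        wakeups = -1;
        goto out;
    }

    for (int i = 0; i < rounds; i++) {
        struct epoll_event got;
        int n = epoll_wait(epfd, &got, 1, 10); /* 10 ms timeout */

        if (n != 1 || !(got.events & EPOLLIN))
            continue; /* timed out: nothing is pending any more */

        wakeups++;
        if (mode == MODE_DRAIN) {
            /* Consume the datagram so the fd is no longer readable. */
            ssize_t r = read(sv[0], buf, sizeof(buf));
            (void)r;
        } else if (mode == MODE_DEREGISTER) {
            /* Stop watching the fd instead of reading from it. */
            epoll_ctl(epfd, EPOLL_CTL_DEL, sv[0], NULL);
        }
        /* MODE_IGNORE: do nothing, so the next epoll_wait() returns at once. */
    }

out:
    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return wakeups;
}
```

With `MODE_IGNORE` every `epoll_wait()` call returns immediately, which matches the pattern in the strace output in this report. Whether draining or deregistering is the appropriate fix for the UP_DISABLED state in ocserv is left open here.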

# strace -tt -f -p 10061
11:53:04.617444 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617493 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617543 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617593 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617643 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617692 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617742 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
11:53:04.617792 epoll_wait(3, [{EPOLLIN, {u32=5, u64=4294967301}}], 64, 7981) = 1
...
Breakpoint 2, dtls_mainloop (tnow=0x7ffc40addf70, dtls=0x55fb22007b78, ws=0x55fb22007580) at worker-vpn.c:2699
2699    worker-vpn.c: No such file or directory.
(gdb) p dtls->udp_state
$2 = UP_DISABLED
(gdb) p *dtls
$3 = {io = {active = 1, pending = 0, priority = 0, data = 0x0, cb = 0x55fb21fc3c70 <dtls_watcher_cb>, next = 0x0, fd = 5, events = 1}, dtls_tptr = {fd = 5, msg = 0x0, consumed = 0, rx_time = {tv_sec = 0, tv_nsec = 0}},                  
  dtls_session = 0x55fb22025670, udp_state = UP_DISABLED, last_dtls_rehandshake = 0}
# lsof -p 10061
COMMAND     PID   USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME
...
ocserv-wo 10061 nobody    0u     unix 0x000000000fe3fdb1      0t0 4795663 type=DGRAM
ocserv-wo 10061 nobody    1u      CHR             10,200     0t40   11109 /dev/net/tun
ocserv-wo 10061 nobody    2u     unix 0x0000000060782efc      0t0   15639 type=STREAM
ocserv-wo 10061 nobody    3u  a_inode               0,13        0    8253 [eventpoll]
ocserv-wo 10061 nobody    4u  a_inode               0,13        0    8253 [eventfd]
ocserv-wo 10061 nobody    5u     IPv4            4795867      0t0     UDP vpn-worker2.[...]:443->[...]:58420 
ocserv-wo 10061 nobody   16u     IPv4            4796824      0t0     TCP vpn-worker2.[...]:443->[...]:49460 (ESTABLISHED)
ocserv-wo 10061 nobody   42u     unix 0x000000005ccb8ece      0t0 4796826 type=STREAM
Edited by Stefan Bühler