lte: Keep PHY and MAC trace files open
This prevents unnecessary flushes and file opening/seeking/closing syscalls for the LTE trace files, which add significant overhead in slow filesystems (e.g. fuse, WSL, network shares).
For example lena-radio-link-failure --numberOfEnbs=2 --useIdealRrc=0 --interSiteDistance=700 --simTime=17
runtime dropped from 28s down to 16s after applying the patch when testing on a ntfs-3g fuse mount. In WSLv1, the time spent on these syscalls was 5x the time spent on the simulations themselves (while testing the lena-profiling
example).
Strace -c before patches
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
31,51 0,246243 2 103480 942 openat
30,23 0,236284 2 102360 write
19,90 0,155493 1 102538 close
16,65 0,130132 1 102426 lseek
1,05 0,008186 18 437 mmap
0,21 0,001671 16 99 newfstatat
0,20 0,001595 11 134 mprotect
0,18 0,001391 14 98 read
0,03 0,000237 13 18 getcwd
0,03 0,000197 3 65 brk
0,01 0,000052 13 4 pread64
0,00 0,000027 13 2 1 arch_prctl
0,00 0,000021 10 2 2 access
0,00 0,000013 13 1 set_robust_list
0,00 0,000012 12 1 set_tid_address
0,00 0,000000 0 1 munmap
0,00 0,000000 0 1 execve
0,00 0,000000 0 2 2 statfs
0,00 0,000000 0 6 4 prctl
0,00 0,000000 0 4 futex
0,00 0,000000 0 1 prlimit64
0,00 0,000000 0 1 getrandom
------ ----------- ----------- --------- --------- ----------------
100,00 0,781554 1 411681 951 total
Strace -c after patches
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
94,05 0,801526 7 102351 write
3,33 0,028340 11 2492 942 openat
0,88 0,007476 4 1550 close
0,68 0,005819 4 1438 lseek
0,63 0,005389 12 436 mmap
0,18 0,001570 11 134 mprotect
0,11 0,000908 9 99 newfstatat
0,10 0,000874 8 98 read
0,03 0,000230 4 56 brk
0,00 0,000041 10 4 pread64
0,00 0,000010 10 1 set_tid_address
0,00 0,000010 10 1 set_robust_list
0,00 0,000007 3 2 1 arch_prctl
0,00 0,000000 0 1 munmap
0,00 0,000000 0 2 2 access
0,00 0,000000 0 1 execve
0,00 0,000000 0 18 getcwd
0,00 0,000000 0 2 2 statfs
0,00 0,000000 0 6 4 prctl
0,00 0,000000 0 4 futex
0,00 0,000000 0 1 prlimit64
0,00 0,000000 0 1 getrandom
------ ----------- ----------- --------- --------- ----------------
100,00 0,852200 7 108698 951 total
While the time spent on syscalls themselves doesn't change much, the 4x reduction in syscalls means less context switches, rescheduling, etc.
It could get even faster by not explicitly flushing (replace std::endl with "\n"), but makes it harder to debug issues.
Strace -c after patches + no explicit flushing
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
40,82 0,021103 2 10111 write
35,11 0,018152 7 2492 942 openat
9,29 0,004805 11 436 mmap
4,11 0,002123 1 1550 close
2,58 0,001333 9 134 mprotect
2,32 0,001199 3 339 writev
2,12 0,001096 0 1438 lseek
1,67 0,000864 8 99 newfstatat
1,46 0,000757 7 98 read
0,21 0,000107 1 55 brk
0,21 0,000107 5 18 getcwd
0,07 0,000034 8 4 pread64
0,02 0,000008 8 1 set_tid_address
0,01 0,000007 3 2 1 arch_prctl
0,01 0,000007 7 1 set_robust_list
0,00 0,000000 0 1 munmap
0,00 0,000000 0 2 2 access
0,00 0,000000 0 1 execve
0,00 0,000000 0 2 2 statfs
0,00 0,000000 0 6 4 prctl
0,00 0,000000 0 4 futex
0,00 0,000000 0 1 prlimit64
0,00 0,000000 0 1 getrandom
------ ----------- ----------- --------- --------- ----------------
100,00 0,051702 3 16796 951 total
Previously reported similar findings in !777 (comment 745537669) and #330 (comment 476200296)