Event subscription in server_init_hook can crash
CI tests show that this feature is unreliable. Occasionally the test times out but, more concerning, it can also segfault (on both Linux and macOS).
We still need a call stack that shows the C++ frames as well, but here is the Python call stack:
tests/test_server.py::test_server_init_hook_subscribe_event_multiple_devices Fatal Python error: Segmentation fault
Current thread 0x00007fc4439ca700 (most recent call first):
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/device_proxy.py", line 1403 in __DeviceProxy__subscribe_event_attrib
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/device_proxy.py", line 1330 in __DeviceProxy__subscribe_event
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/green.py", line 112 in run
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/green.py", line 204 in greener
File "/builds/[MASKED]/pytango/tests/test_server.py", line 3581 in server_init_hook
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/green.py", line 101 in execute
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/server.py", line 666 in server_init_hook
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/server.py", line 1797 in tango_loop
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/green.py", line 112 in run
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/server.py", line 1803 in __server_run
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/server.py", line 1937 in run
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/test_context.py", line 456 in target
File "/opt/python/cp39-cp39/lib/python3.9/threading.py", line 917 in run
File "/opt/python/cp39-cp39/lib/python3.9/threading.py", line 980 in _bootstrap_inner
File "/opt/python/cp39-cp39/lib/python3.9/threading.py", line 937 in _bootstrap
Thread 0x00007fc46a513740 (most recent call first):
File "/opt/python/cp39-cp39/lib/python3.9/threading.py", line 316 in wait
File "/opt/python/cp39-cp39/lib/python3.9/queue.py", line 180 in get
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/test_context.py", line 559 in connect
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/test_context.py", line 554 in start
File "/builds/[MASKED]/pytango/venv/lib/python3.9/site-packages/tango/test_context.py", line 613 in __enter__
File "/builds/[MASKED]/pytango/tests/test_server.py", line 3593 in test_server_init_hook_subscribe_event_multiple_devices
...
Attempts to induce this fault while running under a C++ debugger have not worked. Instead, we should enable core dumps for the Linux source job and print a backtrace if a core file is found.
- In CI, we need to install the cpptango-dbg package for Linux builds.
- For a backtrace from the core dump, use pystack: https://github.com/bloomberg/pystack.
- Examples from cppTango: https://gitlab.com/tango-controls/cppTango/-/blob/main/ci/print_coredumps.sh?ref_type=heads and https://gitlab.com/tango-controls/cppTango/-/blob/main/ci/test.sh?ref_type=heads#L12-19
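The core dump steps listed above could be sketched roughly as follows. This is only an illustration, not the actual CI change: the helper names (enable_core_dumps, find_core_files, build_pystack_command, print_backtraces) are hypothetical, and the "core*" file name pattern is an assumption, since the real name depends on the runner's kernel.core_pattern setting. The pystack invocation assumes the "core" subcommand and "--native" option described in the pystack documentation.

```python
import resource
import shutil
import subprocess
import sys
from pathlib import Path


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit.

    The limit is inherited by child processes, so this would have to run
    in (or before) the process that executes the tests.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)[0]


def find_core_files(directory, pattern="core*"):
    """Return regular files in `directory` matching `pattern`, sorted by name.

    The default pattern is an assumption; adjust it to the core_pattern
    configured on the CI runner.
    """
    return sorted(p for p in Path(directory).glob(pattern) if p.is_file())


def build_pystack_command(core_file, executable=sys.executable):
    """Build the pystack command line for one core file.

    --native asks pystack to include native (C/C++) frames as well.
    """
    return ["pystack", "core", "--native", str(core_file), str(executable)]


def print_backtraces(directory="."):
    """Run pystack on every core file found; return how many were found."""
    cores = find_core_files(directory)
    for core in cores:
        cmd = build_pystack_command(core)
        print("$", " ".join(cmd), flush=True)
        if shutil.which("pystack") is None:
            print("pystack is not installed; skipping", flush=True)
            continue
        # Do not fail the job if pystack itself cannot read the core file.
        subprocess.run(cmd, check=False)
    return len(cores)
```

One possible arrangement would be to call enable_core_dumps() early in the test session (or use the shell equivalent, ulimit -c unlimited, in the job script) and to call print_backtraces() in a step that runs after the tests, even when they fail. Native frames will only be readable if debug symbols, such as those from the cpptango-dbg package, are installed.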
Edited by Anton Joubert