crash in threading, collections.Counter not thread-safe?
looks like our assumption (that collections.Counter is thread-safe) fails under load; a minimal repro and a possible fix are sketched after the log:
root@gitlab-02:~# tail -F /var/log/nginx/*_access.log | grep -v -e '==>' -e 'HTTP/1.1" 499' | awk '{print $1}' | asncounter --repl
INFO: selected input file <stdin>
INFO: using datfile ipasn_20250523.1600.dat.gz
INFO: collecting addresses in LineCollector mode
INFO: starting interactive console, use recorder.display_results() to show current results
INFO: recorder.asn_counter and .prefix_counter dictionaries have the full data
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> INFO: loading /root/.cache/pyasn/asnames.json
count percent ASN AS
unique ASN: 0
count percent prefix ASN AS
unique prefixes: 0
total resolved: 0
total skipped: 0
total failed: 0
INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...
count percent ASN AS
Traceback (most recent call last):
  File "/usr/bin/asncounter", line 8, in <module>
    sys.exit(main_wrap())
             ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 907, in main_wrap
    main()
  File "/usr/lib/python3/dist-packages/asncounter.py", line 888, in main
    repl_thread(collector, namespace)
  File "/usr/lib/python3/dist-packages/asncounter.py", line 814, in repl_thread
    repl_thread.join()
  File "/usr/lib/python3.11/threading.py", line 1112, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.11/threading.py", line 1132, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 862, in <lambda>
    signal(SIGHUP, lambda s, f: recorder.display_results())
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 261, in display_results
    for asn, count in self.asn_counter.most_common(args.top):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/collections/__init__.py", line 622, in most_common
    return heapq.nlargest(n, self.items(), key=_itemgetter(1))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/heapq.py", line 565, in nlargest
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/heapq.py", line 565, in <listcomp>
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: dictionary changed size during iteration
Fatal Python error: _enter_buffered_busy: could not acquire lock for <_io.BufferedReader name='/dev/tty'> at interpreter shutdown, possibly due to daemon threads
Python runtime state: finalizing (tstate=0x0000000000a840f8)
Current thread 0x00007ff2d60a3040 (most recent call first):
<no Python frame>
Extension modules: pyasn.pyasn_radix, _cffi_backend (total: 2)
Aborted
this was triggered just by sending a SIGHUP to the process; it happens pretty reliably with lots of traffic.
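
for context, here's a minimal sketch (hypothetical code, not asncounter's) that reproduces the same failure mode: one thread keeps inserting new keys into a shared Counter while another calls most_common(), which iterates the live dict; an insert that resizes the dict mid-iteration raises the same RuntimeError:

import threading
from collections import Counter

counter = Counter()

def writer():
    # every iteration inserts a brand-new key, forcing the underlying
    # dict to grow and occasionally resize
    for i in range(10_000_000):
        counter[i] += 1

t = threading.Thread(target=writer, daemon=True)
t.start()

try:
    while t.is_alive():
        # most_common() walks counter.items() under the hood, just like
        # asncounter's display_results() in the traceback above
        counter.most_common(10)
    print("no crash this run; the race is timing-dependent")
except RuntimeError as exc:
    print("reproduced:", exc)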
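
one possible fix, sketched under the assumption that both the collector and display_results() can be changed (LockedCounter is a hypothetical helper, not an existing asncounter class): serialize writes with a lock and copy the Counter under that same lock before iterating, so the SIGHUP handler never walks a dict the collector thread is still mutating:

import threading
from collections import Counter

class LockedCounter:
    # Counter wrapper: locked increments, snapshot-based reads
    def __init__(self):
        self._lock = threading.Lock()
        self._counter = Counter()

    def incr(self, key):
        with self._lock:
            self._counter[key] += 1

    def most_common(self, n=None):
        # copy under the lock, then iterate the private snapshot;
        # the copy itself iterates the dict, so it too needs the lock
        with self._lock:
            snapshot = self._counter.copy()
        return snapshot.most_common(n)

one caveat: the SIGHUP handler runs in the main thread, so this stays deadlock-free only as long as the main thread never holds the lock itself when the signal fires; since the writes here happen in the collector thread, that should hold.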