crash in threading: collections.Counter not thread-safe?

looks like our assumption (that collections.Counter is thread-safe) fails under load:

root@gitlab-02:~# tail -F /var/log/nginx/*_access.log | grep -v -e '==>' -e 'HTTP/1.1" 499' | awk '{print $1}' | asncounter --repl
INFO: selected input file <stdin>
INFO: using datfile ipasn_20250523.1600.dat.gz
INFO: collecting addresses in LineCollector mode
INFO: starting interactive console, use recorder.display_results() to show current results
INFO: recorder.asn_counter and .prefix_counter dictionaries have the full data
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> INFO: loading /root/.cache/pyasn/asnames.json
count   percent ASN     AS
unique ASN: 0
count   percent prefix  ASN     AS
unique prefixes: 0
total resolved: 0
total skipped: 0
total failed: 0
INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...
count   percent ASN     AS
Traceback (most recent call last):
  File "/usr/bin/asncounter", line 8, in <module>
    sys.exit(main_wrap())
             ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 907, in main_wrap
    main()
  File "/usr/lib/python3/dist-packages/asncounter.py", line 888, in main
    repl_thread(collector, namespace)
  File "/usr/lib/python3/dist-packages/asncounter.py", line 814, in repl_thread
    repl_thread.join()
  File "/usr/lib/python3.11/threading.py", line 1112, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.11/threading.py", line 1132, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 862, in <lambda>
    signal(SIGHUP, lambda s, f: recorder.display_results())
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/asncounter.py", line 261, in display_results
    for asn, count in self.asn_counter.most_common(args.top):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/collections/__init__.py", line 622, in most_common
    return heapq.nlargest(n, self.items(), key=_itemgetter(1))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/heapq.py", line 565, in nlargest
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/heapq.py", line 565, in <listcomp>
    result = [(key(elem), i, elem) for i, elem in zip(range(0, -n, -1), it)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: dictionary changed size during iteration
Fatal Python error: _enter_buffered_busy: could not acquire lock for <_io.BufferedReader name='/dev/tty'> at interpreter shutdown, possibly due to daemon threads
Python runtime state: finalizing (tstate=0x0000000000a840f8)

Current thread 0x00007ff2d60a3040 (most recent call first):
  <no Python frame>

Extension modules: pyasn.pyasn_radix, _cffi_backend (total: 2)
Aborted
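The root cause is the last Python-level frame: `most_common()` iterates over the counter's items while another thread is still inserting keys into it. `Counter` is a `dict` subclass, so the failure mode can be demonstrated deterministically, without any threads, by mutating the dict mid-iteration (a minimal sketch, unrelated to asncounter's own code):

```python
from collections import Counter

c = Counter(a=1, b=2)
it = iter(c.items())   # items() is a live view into the underlying dict
next(it)               # start iterating
c["new_key"] = 1       # inserting a key changes the dict's size

try:
    next(it)           # the paused iteration now fails
except RuntimeError as e:
    print(e)           # dictionary changed size during iteration
```

In the crash above, the same thing happens across threads: the SIGHUP handler runs `most_common()` in the main thread while the collector thread is still adding newly seen ASNs to the counter.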

this was triggered just by sending a SIGHUP to the process; it happens pretty reliably under heavy traffic.
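One plausible fix (a sketch only, not asncounter's actual code; the class and method names below are hypothetical) is to guard every read and write of the counter with a shared `threading.Lock`, so the signal handler's `most_common()` call can never overlap an update:

```python
import threading
from collections import Counter

class SafeRecorder:
    """Hypothetical sketch: serialize all Counter access with one lock."""

    def __init__(self):
        self.asn_counter = Counter()
        self._lock = threading.Lock()

    def record(self, asn):
        # Writer side: the collector thread increments under the lock.
        with self._lock:
            self.asn_counter[asn] += 1

    def top(self, n=10):
        # Reader side: take the lock so most_common() never iterates
        # the dict while a writer is resizing it.
        with self._lock:
            return self.asn_counter.most_common(n)
```

CPython's GIL makes individual dict operations safe, but a full iteration such as `most_common()` spans many bytecode steps and can be interleaved with inserts from another thread, which is why the crash only shows up under load.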