Deadlock in line 200 of exx.py?
Line 200 of exx.py appears to possibly cause a deadlock.
In the above screenshot (data taken using py-spy running with --native), there appears to be some sort of deadlock in Open MPI (3.1.4).
The relevant line appears to be a reduction operation, so maybe it has something to do with how the arrays are structured? I don't know how to reproduce this, since this part of the code (used by the GW module, which I'm using now) has worked perfectly fine before.
May be worth looking into?
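One way to test the array-structure guess would be to compare each rank's array shape and dtype just before the reduction. Below is a minimal sketch, assuming the per-rank descriptions have already been collected somehow (for example from per-rank log output). The function name and the sample data are hypothetical and are not taken from exx.py:

```python
from collections import Counter


def find_mismatched_ranks(descriptions):
    """Return the ranks whose (shape, dtype) differs from the most common one.

    `descriptions` is a list indexed by rank, for example
    [((4, 4), "float64"), ((4, 4), "float64")].
    """
    if not descriptions:
        return []
    # Treat the most common description as the expected one.
    expected, _ = Counter(descriptions).most_common(1)[0]
    return [rank for rank, desc in enumerate(descriptions) if desc != expected]


# Hypothetical data: rank 2 holds an array with a different shape.
per_rank = [
    ((4, 4), "float64"),
    ((4, 4), "float64"),
    ((4, 2), "float64"),
    ((4, 4), "float64"),
]
print(find_mismatched_ranks(per_rank))
```

If this reports no mismatched ranks, the arrays are probably not the problem, and the hang may instead come from some ranks never reaching the reduction call at all.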
Edit: Formatting hard.
Edit 2: Adding: py-spy had been watching this process for well over 3 hours before this, and it was stuck on exx.py line 200. The times shown here are much smaller because I only ran it with --native for a short while to confirm what was going on.
Edited by Anubhab Haldar