Fujitsu FX1000: crash in hardwareTopologyPrepareDetection() when using more than 8 MPI ranks per node
We are running GROMACS on the Fujitsu FX1000. It works fine with up to 8 MPI ranks per node, but with more than 8 it crashes in hardwareTopologyPrepareDetection(). Every rank executes that function, so with many ranks on a single node a considerable number of threads are spawned at the same time, and it seems that the FX1000 cannot handle that. I think the easiest solution would be to have only one process per node execute hardwareTopologyPrepareDetection().