Distributed-memory algorithm for ndiag (3*3 and up) causes MPI_Comm_free Error

I am using QE v7.1 on an HPC system and have encountered a rather curious error.

When I run pw.x with the -ndiag flag (e.g. mpiexec -np 32 ./pw.x -ndiag 4 -i pw.inp), I get the error shown further down, but only for -ndiag 9 or higher. I ran the following tests, keeping the input file and -np the same:

  • -ndiag 1 -> "a serial algorithm will be used" -> no error
  • -ndiag 4 -> "custom distributed-memory algorithm (size of sub-group: 2*2 procs)" -> no error
  • -ndiag 9 -> "custom distributed-memory algorithm (size of sub-group: 3*3 procs)" -> error (see below)
  • -ndiag 16 -> "custom distributed-memory algorithm (size of sub-group: 4*4 procs)" -> error (see below)
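
For anyone who wants to repeat this sweep, here is a minimal sketch in Python that only builds the command lines, one per square -ndiag value. It assumes the diagonalization group is always an n*n grid (as the 2*2, 3*3 and 4*4 messages above suggest) and that n*n must not exceed -np; the helper names square_side and sweep_commands are hypothetical, and the paths ./pw.x and pw.inp are taken from the example command above.

```python
import math
import shlex


def square_side(ndiag: int) -> int:
    """Side n of the largest n*n grid with n*n <= ndiag (assumed grid shape)."""
    return math.isqrt(ndiag)


def sweep_commands(np_total: int = 32, pw: str = "./pw.x", inp: str = "pw.inp"):
    """Build one mpiexec command per square -ndiag value that fits in np_total ranks."""
    commands = []
    for n in range(1, math.isqrt(np_total) + 1):
        ndiag = n * n  # 1, 4, 9, 16, ... up to np_total
        commands.append(
            ["mpiexec", "-np", str(np_total), pw, "-ndiag", str(ndiag), "-i", inp]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be pasted
    # into a job script one at a time.
    for cmd in sweep_commands():
        print(shlex.join(cmd))
```

With -np 32 this lists -ndiag 1, 4, 9, 16 and 25. If other parallelization levels (e.g. k-point pools) are in use, the number of ranks available to the diagonalization group may be smaller than -np, so the upper bound here is only a rough guess.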

The error traceback:

Fatal error in PMPI_Comm_free: Invalid communicator, error stack:
PMPI_Comm_free(145): MPI_Comm_free(comm=0x7fff64d69194) failed
PMPI_Comm_free(93).: Null communicator

It's puzzling that 2*2 works fine but 3*3 and up cause the error.

Any help, or suggestions on what else to test, would be highly appreciated.

Thank you!

-Peter

P.S.: The only thing I could find online that seems to be a similar (or the same) issue was this: link, but no solution was provided there.

Edited by Peter Schindler