
Limit the maximum GPU sharing for the default tMPI rank count

Szilárd Páll requested to merge sz_cap-tmpi-automatic-gpu-sharing into release-2022

With the large number of cores on modern HPC machines, a default thread-MPI launch can lead to >=8 ranks per GPU, which is most likely suboptimal. Therefore, when the tMPI rank count is determined automatically, we limit the maximum number of ranks per GPU; this value is currently set to four.
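As a rough illustration of the capping described above, here is a minimal sketch. It is not the code in this merge request: the names `c_maxAutoRanksPerGpu` and `capAutomaticRankCount` are hypothetical, and the sketch assumes the automatically chosen rank count and the number of GPUs in use are already known.

```cpp
#include <algorithm>

// Hypothetical constant: the description states the cap is currently four.
constexpr int c_maxAutoRanksPerGpu = 4;

// Hypothetical helper: caps an automatically determined tMPI rank count
// so that no more than c_maxAutoRanksPerGpu ranks share one GPU.
// With no GPUs in use, the rank count is returned unchanged.
int capAutomaticRankCount(int numAutoRanks, int numGpus)
{
    if (numGpus <= 0)
    {
        return numAutoRanks;
    }
    return std::min(numAutoRanks, numGpus * c_maxAutoRanksPerGpu);
}
```

For example, if the automatic choice would have been 16 ranks on 2 GPUs (8 per GPU), this sketch reduces it to 8 ranks (4 per GPU). A rank count set explicitly by the user is outside the scope of the cap, since the limit applies only when the count is determined automatically.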

Partially addresses #4332
