Call setupLocalGpuForceReduction after gmx_pme_send_coordinates
Call setupLocalGpuForceReduction() after gmx_pme_send_coordinates() for LIB_MPI as well. Without this, reduceKernel is not called by PP ranks for PME forces, because the rvec buffer is not registered on the PP side until the second neighbor search step.
This produces incorrect results: during the first few hundred steps, PME forces are not accumulated by PP ranks.
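
The ordering problem can be illustrated with a toy model. This is only a sketch and not GROMACS code: `ToyForceReduction`, `ToyPpRank` and `reducedForce` are hypothetical names, and the model assumes that the PP rank only learns which PME force buffer to register once coordinates have been sent.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a GPU force reduction; hypothetical, not the GROMACS class.
struct ToyForceReduction
{
    // Registered PME force buffer, or nullptr if none has been registered.
    const std::vector<float>* pmeForces = nullptr;

    // Adds PME forces to the local forces only when a buffer is registered.
    void reduce(std::vector<float>& localForces) const
    {
        if (pmeForces == nullptr)
        {
            return; // nothing registered: the PME contribution is silently skipped
        }
        for (std::size_t i = 0; i < localForces.size(); ++i)
        {
            localForces[i] += (*pmeForces)[i];
        }
    }
};

// Toy PP rank. Assumption of this sketch: the PME force buffer only becomes
// known to the rank once coordinates have been sent.
struct ToyPpRank
{
    std::vector<float>        pmeBuffer;
    const std::vector<float>* knownPmeBuffer = nullptr;
    ToyForceReduction         reduction;

    void sendCoordinates() { knownPmeBuffer = &pmeBuffer; }
    void setupReduction() { reduction.pmeForces = knownPmeBuffer; }
};

// Runs one search step (setup and send, in the chosen order) followed by one
// reduction, for a single atom with local force 1 and PME force 2.
float reducedForce(bool setupAfterSend)
{
    ToyPpRank rank;
    rank.pmeBuffer = { 2.0F };
    if (setupAfterSend)
    {
        rank.sendCoordinates();
        rank.setupReduction(); // buffer is known, so it gets registered
    }
    else
    {
        rank.setupReduction(); // buffer not yet known, nullptr is registered
        rank.sendCoordinates();
    }
    std::vector<float> local = { 1.0F };
    rank.reduction.reduce(local);
    return local[0];
}
```

In this model, `reducedForce(false)` leaves the PME contribution out, while `reducedForce(true)` includes it, which mirrors why the setup call has to follow the coordinate send.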
Fixes #4915
Edited by Andrey Alekseenko