Implements PME decomposition support in mixed mode for the CUDA backend
- Adds support for PME grid halo exchange for the CUDA backend using CUDA-aware MPI.
- Uses the atom displacement estimate for the PME GPU halo.
- Adds CUDA kernels for copying data from PME grid to FFT grid and back from FFT grid to PME grid.
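For reviewers unfamiliar with the grid layouts, the copy between the two grids can be pictured with the host-side C++ sketch below. It is only an illustration under stated assumptions, not the kernels in this MR: it assumes the PME grid is the FFT grid padded by a halo on each side, that both grids are stored with z as the fastest index, and all names (`Extent3`, `flatIndex`, `copyPmeToFftGrid`, `copyFftToPmeGrid`) are hypothetical. A CUDA version would typically assign one thread per FFT grid element instead of looping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3D extent type used only for this sketch (not a GROMACS type).
struct Extent3
{
    int x, y, z;
};

// Flat index into a 3D array stored with z as the fastest-varying dimension (assumed layout).
inline std::size_t flatIndex(int ix, int iy, int iz, const Extent3& size)
{
    return (static_cast<std::size_t>(ix) * size.y + iy) * size.z + iz;
}

// Copy the interior (non-halo) region of the padded PME grid into the compact FFT grid.
// haloLow is the assumed halo width on the lower side of each dimension.
void copyPmeToFftGrid(const std::vector<float>& pmeGrid,
                      const Extent3&            pmeSize,
                      const Extent3&            haloLow,
                      std::vector<float>&       fftGrid,
                      const Extent3&            fftSize)
{
    for (int ix = 0; ix < fftSize.x; ++ix)
    {
        for (int iy = 0; iy < fftSize.y; ++iy)
        {
            for (int iz = 0; iz < fftSize.z; ++iz)
            {
                fftGrid[flatIndex(ix, iy, iz, fftSize)] =
                        pmeGrid[flatIndex(ix + haloLow.x, iy + haloLow.y, iz + haloLow.z, pmeSize)];
            }
        }
    }
}

// Reverse copy: write the FFT grid back into the interior of the PME grid.
// Halo cells of the PME grid are left untouched.
void copyFftToPmeGrid(const std::vector<float>& fftGrid,
                      const Extent3&            fftSize,
                      const Extent3&            haloLow,
                      std::vector<float>&       pmeGrid,
                      const Extent3&            pmeSize)
{
    for (int ix = 0; ix < fftSize.x; ++ix)
    {
        for (int iy = 0; iy < fftSize.y; ++iy)
        {
            for (int iz = 0; iz < fftSize.z; ++iz)
            {
                pmeGrid[flatIndex(ix + haloLow.x, iy + haloLow.y, iz + haloLow.z, pmeSize)] =
                        fftGrid[flatIndex(ix, iy, iz, fftSize)];
            }
        }
    }
}
```

The two functions differ only in the direction of the assignment, so the index mapping can be checked once and reused for both directions.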
To do in separate follow-up MRs:
- Allow PME decomposition in mixed mode. Currently the program exits early if the new code path is enabled.
- Disable DLB when PME GPU decomposition is on
- Disable PME tuning when PME GPU decomposition is on
- Add a dev flag to enable PME GPU decomposition
Refs #3884
Edited by Gaurav Garg