Use CUDA-aware MPI

Sebastian Ohlmann requested to merge cuda_aware_mpi into develop

Description

Instead of copying the boundary from the device to the host, pass device pointers to MPI so that the CUDA-aware MPI implementation can copy the data directly between the devices. The transfer is also overlapped with the computation of the inner points to minimize the waiting time. This feature can be enabled at configure time (--enable-cudampi) and at runtime with a variable in the input file (CudaAwareMPI).
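For illustration only, the pattern described above can be sketched in a standalone CUDA program. This is not the Octopus implementation: the 1D periodic layout, the kernel names `stencil` and `fill`, the array size, and the message tags are all assumptions made for the example, and error checking is omitted for brevity. It assumes an MPI library built with CUDA support, so that device pointers may be passed directly to `MPI_Isend` and `MPI_Irecv`.

```cuda
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical 3-point average over indices [first, last] of a ghosted array.
__global__ void stencil(const double *u, double *out, int first, int last) {
    int i = first + blockIdx.x * blockDim.x + threadIdx.x;
    if (i <= last) out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0;
}

// Hypothetical helper: set the first `count` entries of u to `value`.
__global__ void fill(double *u, int count, double value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) u[i] = value;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 1 << 20;   // local points; u[0] and u[n+1] are ghost cells
    const int threads = 256;
    const int left = (rank + size - 1) % size;
    const int right = (rank + 1) % size;

    double *u, *out;
    cudaMalloc(&u, (n + 2) * sizeof(double));
    cudaMalloc(&out, (n + 2) * sizeof(double));
    fill<<<(n + 2 + threads - 1) / threads, threads>>>(u, n + 2, (double)rank);
    cudaDeviceSynchronize();  // boundary data must be complete before MPI reads it

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Start the halo exchange directly on device pointers (no host staging).
    MPI_Request req[4];
    MPI_Irecv(u,         1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(u + n + 1, 1, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(u + 1,     1, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(u + n,     1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[3]);

    // Overlap: inner points 2..n-1 do not read the ghost cells.
    stencil<<<(n - 2 + threads - 1) / threads, threads, 0, stream>>>(u, out, 2, n - 1);

    // Wait for the ghost cells, then update the two boundary points.
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    stencil<<<1, 1, 0, stream>>>(u, out, 1, 1);
    stencil<<<1, 1, 0, stream>>>(u, out, n, n);
    cudaStreamSynchronize(stream);

    double first_point;
    cudaMemcpy(&first_point, out + 1, sizeof(double), cudaMemcpyDeviceToHost);
    printf("rank %d: out[1] = %f\n", rank, first_point);

    cudaStreamDestroy(stream);
    cudaFree(u);
    cudaFree(out);
    MPI_Finalize();
    return 0;
}
```

The inner-point kernel is launched between posting the non-blocking calls and `MPI_Waitall`, so the device stays busy while the boundary values are in flight. The explicit `cudaDeviceSynchronize` before the sends matters: MPI has no knowledge of CUDA streams, so the boundary data must be fully written before a device pointer is handed to it.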

This MR also fixes a bug in subarray_gather, where a synchronization was missing.

News snippet

Use CUDA-aware MPI

Checklist

  • I have checked that my code follows the Octopus coding standards.
  • I have added tests for all the new features added in this request.

Closes #224 (closed) and #162 (closed).

Edited by Martin Lueders