Fix segfault with CUDA-aware MPI during finalization
Also includes minor cleanups:
- Initialize variables and remove unused ones
- Initialize the variables if PFFT is missing
- Do not call cuda_end() with CUDA-aware MPI
Edited by Meisam Tabriz