Draft: CUDA version of LAXLIB and parallel Davidson
This MR is a slightly updated version of the MR by @sorland and @bonfus, which fixes the distributed iterative diagonalization performed by the GPU versions of pcegterg and pregterg.
To do:
- test it before merging
- more refactoring to replace CUF kernels with OpenACC
- implement the matrix distribution in non-contiguous blocks (BLACS style)
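
For reference on the last item, here is a minimal sketch of the index mapping behind a block-cyclic (non-contiguous) layout. It is illustration only and not LAXLIB code: the function names and the block size `nb` are hypothetical, indices are 0-based, the first block is assumed to sit on process 0, and only one dimension is shown. A BLACS-style 2D layout would apply the same rule independently to rows and columns of the process grid.

```python
def owner_and_local_index(g, nb, nprocs):
    """Map a 0-based global index g to (owning process, 0-based local index)
    for a 1D block-cyclic layout with block size nb, first block on process 0."""
    block = g // nb                          # global block containing g
    owner = block % nprocs                   # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb  # full local blocks before it, plus offset
    return owner, local


def owned_indices(rank, n, nb, nprocs):
    """List the global indices in range(n) that are stored on process `rank`."""
    return [g for g in range(n) if owner_and_local_index(g, nb, nprocs)[0] == rank]


if __name__ == "__main__":
    # 10 rows, blocks of 2, 2 processes: each process holds non-contiguous blocks.
    for rank in range(2):
        print(rank, owned_indices(rank, n=10, nb=2, nprocs=2))
```

With a contiguous distribution each process would instead hold a single consecutive range of rows; the block-cyclic rule interleaves the blocks, which is what the item above refers to.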
Edited by Pietro Delugas