vexx_gamma_gpu band group
In vexx_gamma and vexx_gamma_gpu, the input argument psi has the dimension of (lda*npol,m). I believe the 2nd dimension should have been max_ibands for the band group parallelization to work. In the CPU case this is probably harmless. In the GPU case, however, the code can segfault at ALLOCATE(psi_d, source=psi).
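For illustration only, here is a minimal host-side sketch (plain Fortran, no CUDA, not the actual exx code) of why the declared shape matters for a sourced allocation. The names copy_like_gpu, psi_copy, ncol_actual and ncol_declared are hypothetical. The sketch deliberately uses the safe direction, where the declared second dimension is smaller than the actual one, to show that ALLOCATE(..., source=psi) takes its shape from the dummy argument declaration rather than from the array the caller passed.

```fortran
program sourced_alloc_shape
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sizes: the caller owns 6 columns, the callee declares 2.
  integer, parameter :: lda = 4, ncol_actual = 6, ncol_declared = 2
  complex(dp) :: psi(lda, ncol_actual)

  psi = (1.0_dp, 0.0_dp)
  call copy_like_gpu(lda, ncol_declared, psi)

contains

  subroutine copy_like_gpu(lda, m, psi)
    integer, intent(in) :: lda, m
    ! Explicit-shape dummy: the shape comes from lda and m,
    ! not from the array the caller actually passed.
    complex(dp), intent(in) :: psi(lda, m)
    complex(dp), allocatable :: psi_copy(:,:)

    ! The sourced allocation copies exactly lda*m elements.
    allocate(psi_copy, source=psi)
    print *, 'shape of copy:', shape(psi_copy)
    deallocate(psi_copy)
  end subroutine copy_like_gpu

end program sourced_alloc_shape
```

The copy comes out as 4 by 2, even though the caller's array is 4 by 6. If the declared second dimension were instead larger than what the caller allocated, the same statement would read past the end of the actual array, which would be consistent with the segfault seen in the GPU path.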
To reproduce, run the following calculation using 6 GPUs and 6 band groups.
There is no problem in vexx_k and vexx_k_gpu. psi is declared as (lda*npol,max_ibands) in the former and (:,:) in the latter.
A few GPU arrays in vexx_gamma_gpu are allocated but never used, notably hpsi_d. They have been cleaned up to save GPU memory.