Race condition with OpenMP in the nonlocal part of the pseudopotential
We have a possible race condition in the OpenMP parallelization of the calculation of the nonlocal part of the pseudopotential.
The code reads as follows:
    !$omp parallel do private(ip, ist)
    do ip = 1, npoints
      forall(ist = 1:nst)
        vpsib%pack%X(psi)(ist, pmat%map(ip)) = vpsib%pack%X(psi)(ist, pmat%map(ip)) + psi(ist, ip)
      end forall
    end do
    !$omp end parallel do
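As one can see, we accumulate into vpsib%pack%X(psi) through the index pmat%map(ip). This is effectively a reduction, but the code does not treat it as one (there is no "reduction" clause), so concurrent updates are not protected. This is a problem only if the same point of the grid is covered by two points in the submesh, because two threads can then update the same element at the same time. This possibility exists: there is even a flag in the submesh initialization that records it.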
The same possible aliasing through pmat%map prevents the compiler from vectorizing the loop, which makes the code inefficient.
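
One possible way to make the accumulation safe is to protect each update with an OpenMP atomic construct. The program below is only a minimal, self-contained sketch of that idea, not the actual code and not necessarily the fix to adopt: the arrays `vpsi`, `psi`, and `map`, and the sizes `nst`, `npoints`, and `nmesh`, are hypothetical stand-ins for `vpsib%pack%X(psi)`, `psi`, and `pmat%map`. The map is built with deliberate overlaps so that several submesh points hit the same grid point, and the parallel result is compared against a serial reference.

```fortran
program scatter_add_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sizes: nmesh < npoints forces overlapping map entries
  integer, parameter :: nst = 4, npoints = 1000, nmesh = 10
  integer :: map(npoints)
  real(dp) :: psi(nst, npoints), vpsi(nst, nmesh), ref(nst, nmesh)
  integer :: ip, ist

  ! Stand-in for pmat%map: many submesh points map to the same grid point
  do ip = 1, npoints
    map(ip) = mod(ip - 1, nmesh) + 1
  end do

  psi = 1.0_dp
  vpsi = 0.0_dp
  ref = 0.0_dp

  ! Serial reference accumulation
  do ip = 1, npoints
    do ist = 1, nst
      ref(ist, map(ip)) = ref(ist, map(ip)) + psi(ist, ip)
    end do
  end do

  ! Parallel accumulation: the atomic construct serializes conflicting
  ! updates to the same element, so overlapping map entries are safe
  !$omp parallel do private(ip, ist)
  do ip = 1, npoints
    do ist = 1, nst
      !$omp atomic
      vpsi(ist, map(ip)) = vpsi(ist, map(ip)) + psi(ist, ip)
    end do
  end do
  !$omp end parallel do

  ! Values are small integers, so the sums are exact in any order
  if (maxval(abs(vpsi - ref)) > 0.0_dp) error stop "parallel result differs"
  print *, "parallel and serial results agree"
end program scatter_add_sketch
```

The atomic update has a cost and does not help vectorization, so it would probably make sense to take this path only when the submesh is flagged as overlapping, and to keep an unprotected (and vectorizable) loop otherwise. This assumes the flag is reliable. A per-thread buffer combined with a reduction clause would be another option, at the price of extra memory.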