CUDA code for RMMDIIS
There seems to be a problem with the CUDA version of the RMMDIIS eigensolver:
Even when forcing the same block sizes and tightening EigensolverTolerance to 1e-9, the CUDA version of the code does not converge.
The example I used is testsuite/functionals/10-vdw_d3_dna.01-gs_novdw.inp.
Differences from the non-CUDA code already appear in the first printout of the eigenvalues:
CUDA:
Eigenvalues [H]
#st Spin Eigenvalue Occupation
#k = 1, k = ( 0.000000, 0.000000, 0.000000)
1 -- -1.027486 2.000000
2 -- -1.019632 2.000000
3 -- -0.978117 2.000000
4 -- -0.968847 2.000000
5 -- -0.953255 2.000000
6 -- -0.940347 2.000000
7 -- -0.932274 2.000000
8 -- -0.926782 2.000000
9 -- -0.912659 2.000000
10 -- -0.891247 2.000000
11 -- -0.889957 2.000000
12 -- -0.885875 2.000000
13 -- -0.871691 2.000000
14 -- -0.863637 2.000000
15 -- -0.851919 2.000000
16 -- -0.830239 2.000000
17 -- -0.815815 2.000000
18 -- -0.793890 2.000000
19 -- -0.773314 2.000000
20 -- -0.771723 2.000000
21 -- -0.690563 2.000000
22 -- -0.683969 2.000000
23 -- -0.673973 2.000000
24 -- -0.667205 2.000000
25 -- -0.660853 2.000000
...
non-CUDA:
Eigenvalues [H]
#st Spin Eigenvalue Occupation
#k = 1, k = ( 0.000000, 0.000000, 0.000000)
1 -- -1.027486 2.000000
2 -- -1.019632 2.000000
3 -- -0.978117 2.000000
4 -- -0.968846 2.000000
5 -- -0.953349 2.000000
6 -- -0.940348 2.000000
7 -- -0.932295 2.000000
8 -- -0.929263 2.000000
9 -- -0.912685 2.000000
10 -- -0.893838 2.000000
11 -- -0.890251 2.000000
12 -- -0.885876 2.000000
13 -- -0.871691 2.000000
14 -- -0.863660 2.000000
15 -- -0.852078 2.000000
16 -- -0.837568 2.000000
17 -- -0.816301 2.000000
18 -- -0.798950 2.000000
19 -- -0.773314 2.000000
20 -- -0.771725 2.000000
21 -- -0.692003 2.000000
22 -- -0.686843 2.000000
23 -- -0.673974 2.000000
24 -- -0.667211 2.000000
25 -- -0.660855 2.000000
...
An interesting thing is that some eigenvalues do agree exactly (states 1-3, 13, 19) and some others agree to within about 1e-6, but that might be coincidence.
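To make the comparison less error-prone than reading the two listings by eye, here is a minimal sketch of a script that parses both printouts and reports the states whose eigenvalues differ by more than a tolerance. It is not part of the code base; the function names `parse_eigenvalues` and `compare` and the default tolerance of 1e-5 are my own assumptions, and the regular expression only assumes the four-column layout shown above.

```python
import re

# Matches data lines such as "  4 -- -0.968847 2.000000"
# (state index, spin label, eigenvalue, occupation). Header lines are skipped.
LINE = re.compile(r"^\s*(\d+)\s+\S+\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$")


def parse_eigenvalues(text):
    """Return {state index: eigenvalue} from one eigenvalue printout."""
    values = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            values[int(m.group(1))] = float(m.group(2))
    return values


def compare(cuda_text, cpu_text, tol=1e-5):
    """Return (state, cuda, cpu, diff) for states differing by more than tol.

    tol is a hypothetical threshold; pick one that matches the printed
    precision (six decimals in the listings above).
    """
    cuda = parse_eigenvalues(cuda_text)
    cpu = parse_eigenvalues(cpu_text)
    rows = []
    for st in sorted(cuda.keys() & cpu.keys()):
        diff = cuda[st] - cpu[st]
        if abs(diff) > tol:
            rows.append((st, cuda[st], cpu[st], diff))
    return rows


def report(cuda_text, cpu_text, tol=1e-5):
    """Print one line per state that differs by more than tol."""
    for st, e_cuda, e_cpu, diff in compare(cuda_text, cpu_text, tol):
        print(f"{st:4d} {e_cuda: .6f} {e_cpu: .6f} {diff: .2e}")
```

Calling `report(cuda_text, cpu_text)` with the two listings pasted into strings lists only the states that differ by more than the chosen tolerance, which makes it easier to see whether the agreeing states follow any pattern.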
Edited by Martin Lueders