BUG: Trust Region Newton CG hangs when running in parallel
The Trust Region Newton CG solver has a strange bug when running in parallel: it can get into a state where it hangs on one processor and does not terminate until the wall time limit is reached or the user stops it.
I noticed this problem while implementing an IO example, and the error should be reproducible with that script. Caution: you have to set case = "C" in line 54; the other two cases do not use the trust region solver. The bug does not always occur; from time to time the script runs without an error. However, executing
mpirun -n 2 python3 netcdf_io_use_case.py
in the ./examples/ folder should lead to the following output:
at CG step 171: |r|/|b| = 1.45052e-08, cg_tol = 1e-08
at CG step 172: |r|/|b| = 1.29407e-08, cg_tol = 1e-08
at CG step 173: |r|/|b| = 1.17899e-08, cg_tol = 1e-08
at CG step 174: |r|/|b| = 1.13414e-08, cg_tol = 1e-08
at CG step 175: |r|/|b| = 1.14722e-08, cg_tol = 1e-08
at CG step 176: |r|/|b| = 1.17469e-08, cg_tol = 1e-08
at CG step 177: |r|/|b| = 1.16371e-08, cg_tol = 1e-08
at CG step 178: |r|/|b| = 1.09257e-08, cg_tol = 1e-08
at CG step 179: |r|/|b| = 9.89285e-09, cg_tol = 1e-08
CG finished, reason: reached tolerance
2 newton decrease tr 6.25 8.38577 8.38577 0.922383 (1e-08)
Norm of rhs in (trust region) Steihaug CG = 70.3211
CG finished, reason: step exceeded trust region bounds
3 tr exceeded decrease tr 1.5625 10.896 13.1543 0.722766 (1e-08)
Norm of rhs in (trust region) Steihaug CG = 118.723
CG finished, reason: step exceeded trust region bounds
4 tr exceeded decrease tr 0.390625 0.70719 10.7221 0.252478 (1e-08)
Norm of rhs in (trust region) Steihaug CG = 0.500117
CG finished, reason: step exceeded trust region bounds
At this point the code neither stops nor throws an error; it just runs forever.
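To narrow down where each rank is stuck, a periodic traceback dump might help. The following is only a minimal sketch using Python's standard-library faulthandler module; the helper names enable_hang_dump and disable_hang_dump and the 600 second default are my own assumptions and are not part of the example script or the solver.

```python
import faulthandler
import sys


def enable_hang_dump(timeout_s=600.0, stream=sys.stderr):
    """Dump the Python tracebacks of all threads in this process to `stream`
    every `timeout_s` seconds until disable_hang_dump() is called.

    Hypothetical helper for debugging: `stream` must be a real file object
    (it needs a file descriptor), and it must stay open while the dump is armed.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disable_hang_dump():
    """Cancel the periodic dump, e.g. after the solver has returned normally."""
    faulthandler.cancel_dump_traceback_later()
```

Calling enable_hang_dump() near the top of netcdf_io_use_case.py (an assumption about where it fits best) would make every MPI process report its current Python stack on its own stderr once the timeout elapses. Comparing the stacks of the two ranks should show whether one of them has already left the CG loop while the other is still waiting inside it.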