-
Junchao Zhang authored
Remove KSPPIPEFGMRES from example with skip convergence test since very sensitive to happy ending

Appears to have a sweet spot of much better performance for smallish vectors then matches unrolled code for large vectors

Sample results on Barry's Apple M2 Laptop (using Apple's BLAS)

./ex19 -da_refine 5 -pc_type none -log_view -ksp_gmres_preallocate -ksp_view

Vector length 37,636
VecMDot 1920 1.0 1.9707e-01 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 25 29 0 0 0 25 29 0 0 0 11291
-vec_mdot_use_gemv
VecMDot 1920 1.0 7.5098e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 12 29 0 0 0 12 29 0 0 0 29693
VecMDot 1920 1.0 8.1523e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 12 29 0 0 0 12 29 0 0 0 27353
VecMDot 1920 1.0 7.0889e-02 1.0 2.23e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 29 0 0 0 11 29 0 0 0 31456

-da_refine 6
Vector length 148,996
VecMDot 4340 1.0 1.7666e+00 1.0 2.00e+10 1.0 0.0e+00 0.0e+00 0.0e+00 20 29 0 0 0 20 29 0 0 0 11319
-vec_mdot_use_gemv
VecMDot 4422 1.0 1.3725e+00 1.0 2.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 15 29 0 0 0 15 29 0 0 0 14884
VecMDot 4422 1.0 1.4354e+00 1.0 2.04e+10 1.0 0.0e+00 0.0e+00 0.0e+00 16 29 0 0 0 16 29 0 0 0 14231

./ex19 -da_refine 7 -pc_type none -log_view -ksp_gmres_preallocate -ksp_view -vec_mdot_use_gemv -ksp_max_it 100 -snes_max_it 1

Vector length 592,900
VecMDot 100 1.0 1.5915e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10804
-vec_mdot_use_gemv
VecMDot 100 1.0 1.6854e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10230
VecMDot 100 1.0 1.5698e-01 1.0 1.72e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10983

-da_refine 8
Vector length 2,365,444
VecMDot 100 1.0 6.2499e-01 1.0 6.86e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 27 0 0 0 13 27 0 0 0 10976
-vec_mdot_use_gemv
VecMDot 100 1.0 6.8197e-01 1.0 6.88e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 27 0 0 0 14 27 0 0 0 10087
b29a8671
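
The option name -vec_mdot_use_gemv suggests that the several dot products computed by VecMDot are evaluated as a single matrix-vector product (a BLAS GEMV) instead of one pass per vector. The sketch below only illustrates why the two formulations give the same values. It is plain Python, assumes real-valued vectors, and is not PETSc's implementation; the function names mdot_separate and mdot_as_gemv are hypothetical.

```python
# Illustrative sketch only: plain Python lists, real-valued data,
# hypothetical function names. Not taken from PETSc.

def mdot_separate(x, ys):
    """Compute dot(x, y_j) for every y_j, one independent pass per vector."""
    return [sum(xi * yi for xi, yi in zip(x, y)) for y in ys]


def mdot_as_gemv(x, ys):
    """Compute the same values as one product A^T x, with the y_j as columns of A."""
    n, m = len(x), len(ys)
    # Build A explicitly so that A[i][j] = ys[j][i] (for illustration only).
    A = [[ys[j][i] for j in range(m)] for i in range(n)]
    out = [0.0] * m
    for i in range(n):  # a single sweep over the entries of x
        xi = x[i]
        row = A[i]
        for j in range(m):
            out[j] += row[j] * xi
    return out


x = [1.0, 2.0, 3.0]
ys = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.5, 2.0]]

separate = mdot_separate(x, ys)
gemv = mdot_as_gemv(x, ys)
print(separate)  # [1.0, 6.0, 7.5]
print(gemv)      # [1.0, 6.0, 7.5]
```

Both functions perform the same floating-point multiplications and additions; any performance difference, such as the one reported in the log lines above, would come from how an optimized BLAS schedules the memory traffic, which this sketch does not model.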