VECKOKKOS does not log host-device transfers
This is on release and main.
-dm_vec_type kokkos (no transfers)
$ ompi-cuda-g/tests/ts/tutorials/ex9 -log_view -log_view_gpu_time -dm_vec_type kokkos
[...]
VecAXPY 1112 1.0 7.8588e-01 1.0 1.11e+05 1.0 0.0e+00 0.0e+00 0.0e+00 51 62 0 0 0 51 62 0 0 0 0 0 0 0.00e+00 0 0.00e+00 100
VecAXPBYCZ 278 1.0 1.1626e-02 1.0 6.95e+04 1.0 0.0e+00 0.0e+00 0.0e+00 1 38 0 0 0 1 38 0 0 0 6 8 0 0.00e+00 0 0.00e+00 100
VecScatterBegin 1393 1.0 5.7381e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 1 2.16e-04 0 0.00e+00 0
VecScatterEnd 1393 1.0 5.8731e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxCreate 1 1.0 2.3222e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxSetUp 1 1.0 1.2690e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxSetDevice 1 1.0 1.2739e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
TSStep 278 1.0 1.2185e+00 1.0 1.81e+05 1.0 0.0e+00 0.0e+00 0.0e+00 79 100 0 0 0 79 100 0 0 0 0 0 0 0.00e+00 0 0.00e+00 100
TSFunctionEval 1390 1.0 3.4093e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 22 0 0 0 0 22 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
-dm_vec_type cuda (correct)
$ ompi-cuda-g/tests/ts/tutorials/ex9 -log_view -log_view_gpu_time -dm_vec_type cuda
[...]
VecAXPY 1112 1.0 3.5474e-02 1.0 1.11e+05 1.0 0.0e+00 0.0e+00 0.0e+00 2 62 0 0 0 2 62 0 0 0 3 6 1112 4.45e-01 0 0.00e+00 100
VecAXPBYCZ 278 1.0 1.9044e-02 1.0 6.95e+04 1.0 0.0e+00 0.0e+00 0.0e+00 1 38 0 0 0 1 38 0 0 0 4 9 278 1.11e-01 0 0.00e+00 100
VecScatterBegin 1393 1.0 6.1479e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 1394 6.02e-01 0 0.00e+00 0
VecScatterEnd 1393 1.0 6.1512e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
VecCUDACopyTo 2783 1.0 1.1600e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 2783 1.16e+00 0 0.00e+00 0
VecCUDACopyFrom 2784 1.0 2.4021e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 2784 1.16e+00 0
cuBLAS Init 1 1.0 7.1763e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 47 0 0 0 0 47 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxCreate 1 1.0 8.7790e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxSetUp 1 1.0 8.8418e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxSetDevice 1 1.0 1.2962e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
DCtxSync 7794 1.0 2.4647e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0
TSStep 278 1.0 1.1907e+00 1.0 1.81e+05 1.0 0.0e+00 0.0e+00 0.0e+00 79 100 0 0 0 79 100 0 0 0 0 7 2779 1.16e+00 2780 1.16e+00 100
TSFunctionEval 1390 1.0 3.4119e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 23 0 0 0 0 23 0 0 0 0 0 0 1389 6.00e-01 2780 1.16e+00 0
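
To compare the two runs without reading the columns by eye, a small script can pull the transfer counters out of each event line. This is only a sketch: it assumes the layout seen in the output above, where each line has 26 numeric fields after the event name and the last five are the CpuToGpu count and size, the GpuToCpu count and size, and GPU %F. The function name `parse_gpu_columns` and the dictionary keys are made up for illustration and are not part of PETSc.

```python
# Sketch: extract the host-device transfer columns from one -log_view event line.
# Assumption: 26 numeric fields follow the event name, and the last five are
# CpuToGpu count, CpuToGpu size, GpuToCpu count, GpuToCpu size, GPU %F.

N_NUMERIC_FIELDS = 26


def parse_gpu_columns(line):
    """Return the event name and its transfer counters as a dict."""
    # Event names may contain spaces (e.g. "cuBLAS Init"), so split from the right.
    parts = line.strip().rsplit(None, N_NUMERIC_FIELDS)
    if len(parts) != N_NUMERIC_FIELDS + 1:
        raise ValueError("unexpected number of fields: %r" % line)
    fields = parts[1:]
    return {
        "event": parts[0].strip(),
        "cpu_to_gpu_count": int(fields[-5]),
        "cpu_to_gpu_size": float(fields[-4]),
        "gpu_to_cpu_count": int(fields[-3]),
        "gpu_to_cpu_size": float(fields[-2]),
    }


# The VecAXPY lines from the two runs above.
KOKKOS_LINE = (
    "VecAXPY 1112 1.0 7.8588e-01 1.0 1.11e+05 1.0 0.0e+00 0.0e+00 0.0e+00 "
    "51 62 0 0 0 51 62 0 0 0 0 0 0 0.00e+00 0 0.00e+00 100"
)
CUDA_LINE = (
    "VecAXPY 1112 1.0 3.5474e-02 1.0 1.11e+05 1.0 0.0e+00 0.0e+00 0.0e+00 "
    "2 62 0 0 0 2 62 0 0 0 3 6 1112 4.45e-01 0 0.00e+00 100"
)

if __name__ == "__main__":
    for label, line in (("kokkos", KOKKOS_LINE), ("cuda", CUDA_LINE)):
        row = parse_gpu_columns(line)
        print(label, row["event"], row["cpu_to_gpu_count"], row["cpu_to_gpu_size"])
```

Running it over every event line of both logs and diffing the counters would show which events report transfers under cuda but none under kokkos.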