MPIPtAP: Correct logflops for allatonce and allatonce_merged

Created by: Fande-Kong

I added the missing flops in this PR, but the flops logged for "scalable" are still higher than for "allatonce" and "allatonce_merged". I guess the remaining gap comes from MatSetValues(ADD_VALUES): "allatonce" calls MatSetValues(ADD_VALUES) more often than "scalable" because it forms P^T (AP) with an outer product.
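To make the guess about MatSetValues(ADD_VALUES) concrete, here is a minimal sketch of an outer-product accumulation of C = P^T (AP) that counts multiplies and additive insertions separately. It is a dense toy model in plain Python, not the PETSc implementation; the function name `ptap_outer_product` and the counting convention (one add per inserted term) are assumptions made for illustration only.

```python
def ptap_outer_product(P, AP):
    """Accumulate C = P^T (AP) row by row as a sum of outer products.

    P is n x m and AP is n x k, both given as dense lists of lists
    (hypothetical toy inputs, not PETSc matrices).
    Returns (C, multiplies, adds), where `adds` counts the additions an
    ADD_VALUES-style insertion would perform when merging terms into C.
    """
    n, m, k = len(P), len(P[0]), len(AP[0])
    C = [[0.0] * k for _ in range(m)]
    multiplies = 0
    adds = 0
    for i in range(n):                      # row i of P pairs with row i of AP
        for a in range(m):
            if P[i][a] == 0.0:
                continue                    # skip zeros of P
            for b in range(k):
                if AP[i][b] == 0.0:
                    continue                # skip zeros of AP
                term = P[i][a] * AP[i][b]   # one multiply per term
                multiplies += 1
                C[a][b] += term             # additive insertion into C
                adds += 1
    return C, multiplies, adds


if __name__ == "__main__":
    P = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    AP = [[2.0, 0.0], [0.0, 3.0], [4.0, 0.0]]
    C, multiplies, adds = ptap_outer_product(P, AP)
    print(C, multiplies, adds)
```

In this model every multiply is paired with one add that happens at insertion time, so a count that only logs the multiplies would report about half of the arithmetic. Whether that matches what the PETSc kernels actually log is exactly the open question above.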

allatonce_merged

mpirun -n 8 ./ex96 -Mx 300 -My 160 -Mz 100 -matptap_via allatonce_merged -log_view -malloc_dump -malloc_debug_ -options_left 1

MatPtAP 2 1.0 4.2803e+01 1.0 7.84e+08 1.0 5.5e+02 2.0e+05 1.2e+01 27 15 17 28 8 27 15 17 28 8 145

MatPtAPSymbolic 1 1.0 1.5029e+01 1.0 0.00e+00 0.0 3.4e+02 1.5e+05 4.0e+00 9 0 11 13 3 9 0 11 13 3 0

MatPtAPNumeric 2 1.0 2.7791e+01 1.0 7.84e+08 1.0 2.1e+02 2.9e+05 8.0e+00 17 15 7 16 5 17 15 7 16 5 223

allatonce

mpirun -n 8 ./ex96 -Mx 300 -My 160 -Mz 100 -matptap_via allatonce -log_view -malloc_dump -malloc_debug_ -options_left 1

MatPtAP 2 1.0 4.3823e+01 1.0 8.33e+08 1.1 5.5e+02 2.0e+05 1.2e+01 32 16 17 28 8 32 16 17 28 8 146

MatPtAPSymbolic 1 1.0 1.6467e+01 1.0 0.00e+00 0.0 3.4e+02 1.5e+05 4.0e+00 12 0 11 13 3 12 0 11 13 3 0

MatPtAPNumeric 2 1.0 2.7379e+01 1.0 8.33e+08 1.1 2.1e+02 2.9e+05 8.0e+00 20 16 7 16 5 20 16 7 16 5 233

scalable

mpirun -n 8 ./ex96 -Mx 300 -My 160 -Mz 100 -matptap_via scalable -log_view -malloc_dump -malloc_debug_ -options_left 1

MatPtAP 2 1.0 4.6968e+01 1.0 1.30e+09 1.0 3.2e+02 3.6e+05 1.5e+01 31 23 11 29 9 31 23 11 29 10 219

MatPtAPSymbolic 1 1.0 1.3074e+01 1.0 0.00e+00 0.0 1.3e+02 3.2e+05 7.0e+00 9 0 4 10 4 9 0 4 10 5 0

MatPtAPNumeric 2 1.0 3.3924e+01 1.0 1.30e+09 1.0 1.9e+02 3.9e+05 8.0e+00 22 23 7 19 5 22 23 7 19 5 304
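The three runs above can be compared by pulling the time, flop, and Mflop/s fields out of each event line. The sketch below assumes the usual 21-token layout of a -log_view event row (name, count and ratio, max time and ratio, max flop and ratio, messages, average length, reductions, five global percentages, five stage percentages, total Mflop/s); the header row is not part of this description, so that layout is an assumption, and the names `EventLine` and `parse_event_line` are hypothetical.

```python
from typing import NamedTuple


class EventLine(NamedTuple):
    name: str
    count: int
    time_max: float   # max time over ranks, in seconds
    flop_max: float   # max flop over ranks
    mflops: float     # total Mflop/s (last column)


def parse_event_line(line: str) -> EventLine:
    """Parse one event row, assuming the 21-token layout described above."""
    tok = line.split()
    if len(tok) != 21:
        raise ValueError(f"expected 21 fields, got {len(tok)}")
    return EventLine(
        name=tok[0],
        count=int(tok[1]),
        time_max=float(tok[3]),
        flop_max=float(tok[5]),
        mflops=float(tok[20]),
    )


if __name__ == "__main__":
    merged = parse_event_line(
        "MatPtAPNumeric 2 1.0 2.7791e+01 1.0 7.84e+08 1.0 2.1e+02 2.9e+05 "
        "8.0e+00 17 15 7 16 5 17 15 7 16 5 223"
    )
    scalable = parse_event_line(
        "MatPtAPNumeric 2 1.0 3.3924e+01 1.0 1.30e+09 1.0 1.9e+02 3.9e+05 "
        "8.0e+00 22 23 7 19 5 22 23 7 19 5 304"
    )
    # Ratio of logged flops, scalable over allatonce_merged
    print(scalable.flop_max / merged.flop_max)
```

For the MatPtAPNumeric rows shown here, the logged flop ratio of scalable to allatonce_merged is roughly 1.66.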
