Tweak pasin_float, fix psqrt_complex

Reference issue

Fixes #2597 (closed)

pasin_float:

Replaced a comparison with bit flipping and made some other minor optimizations. This reduces runtime by ~11% (AVX).

size before after diff
32 1080 949 -12%
64 1058 992 -6%
128 1089 912 -16%
256 1127 889 -21%
512 1086 882 -18%
1024 1014 845 -16%
2048 1223 952 -22%
4096 1125 856 -23%
8192 1270 1018 -19%
16384 1133 841 -25%
32768 1129 1021 -9%
65536 1067 880 -17%
131072 1145 861 -24%
262144 1125 982 -12%
524288 1199 937 -21%
1048576 1460 929 -36%
2097152 1220 1042 -14%
4194304 1431 1166 -18%
8388608 1885 1195 -36%
16777216 1798 1275 -29%
33554432 1485 1137 -23%
67108864 1373 1112 -19%
134217728 1338 1149 -14%

psqrt_complex: Fixed NaN handling. Unless otherwise specified, if either the real or imaginary component of the input is NaN, the result is NaN. This case is now handled before the special infinity cases. Overall, the fixed version is slower.

https://godbolt.org/z/vneGGcGjc

size before after diff
32 464 527 13%
64 513 567 10%
128 498 544 9%
256 499 563 12%
512 492 514 4%
1024 605 636 5%
2048 468 540 15%
4096 551 612 11%
8192 467 548 17%
16384 510 612 20%
32768 466 571 22%
65536 517 542 4%
131072 520 608 16%
262144 520 537 3%
524288 510 577 13%
1048576 497 589 18%
2097152 505 583 15%
4194304 551 566 2%
8388608 514 600 16%
16777216 514 511 0%
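
To illustrate the ordering described for psqrt_complex, here is a scalar sketch of one reading of the rule. It is not Eigen's implementation: the function name `sqrt_nan_first` and the alias `cf` are hypothetical, and it assumes the "otherwise specified" cases are the `csqrt` special values from C99 Annex G, where some infinite inputs override the NaN default.

```cpp
#include <cmath>
#include <complex>
#include <limits>

using cf = std::complex<float>;  // hypothetical alias for brevity

// Hypothetical scalar reference; assumes C99 Annex G special values.
cf sqrt_nan_first(cf z) {
  const float inf = std::numeric_limits<float>::infinity();
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float re = z.real();
  const float im = z.imag();

  // Ordinary result for finite inputs.
  cf result = std::sqrt(z);

  // 1. Default rule: a NaN in either component makes the result NaN.
  if (std::isnan(re) || std::isnan(im)) result = cf(nan, nan);

  // 2. Infinity cases are applied afterwards, so they override step 1.
  if (std::isinf(im)) {
    result = cf(inf, im);  // sqrt(x +/- i*inf) = inf +/- i*inf, even for NaN x
  } else if (re == inf) {
    result = cf(inf, std::isnan(im) ? nan : std::copysign(0.0f, im));
  } else if (re == -inf) {
    result = std::isnan(im) ? cf(nan, inf)  // sign of the imaginary part is unspecified
                            : cf(0.0f, std::copysign(inf, im));
  }
  return result;
}
```

Applying the NaN rule first keeps each infinity branch simple, since each one only has to state its own exception.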

Edited by Charles Schlosser
