NEON Complex Intrinsics
What does this implement/fix?
ARM NEON provides intrinsics for complex arithmetic, including a complex multiply-accumulate (FCMLA); see: https://developer.arm.com/documentation/ddi0596/2021-03/SIMD-FP-Instructions/FCMLA--Floating-point-Complex-Multiply-Accumulate-?lang=en
This MR gives a potential implementation of pmul and pmadd using these intrinsics.
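For discussion, here is a minimal sketch of how a complex multiply-add can be built from two FCMLA steps (rotation 0, then rotation 90). It is not the code in this MR: the helper name `cmadd2_sketch` is hypothetical, it works on plain `std::complex<float>` arrays rather than Eigen packets, and it assumes the `vcmlaq_f32` / `vcmlaq_rot90_f32` intrinsics are available whenever `__ARM_FEATURE_COMPLEX` is defined. On other targets a scalar fallback mirrors the same two steps so the sketch still compiles.

```cpp
#include <complex>
#if defined(__ARM_FEATURE_COMPLEX)
#include <arm_neon.h>
#endif

// Hypothetical helper (not part of Eigen): out[i] = c[i] + a[i] * b[i]
// for two packed complex<float> values.
inline void cmadd2_sketch(const std::complex<float>* a,
                          const std::complex<float>* b,
                          const std::complex<float>* c,
                          std::complex<float>* out) {
#if defined(__ARM_FEATURE_COMPLEX)
  // std::complex<float> is layout-compatible with float[2] (re, im).
  float32x4_t va = vld1q_f32(reinterpret_cast<const float*>(a));
  float32x4_t vb = vld1q_f32(reinterpret_cast<const float*>(b));
  float32x4_t acc = vld1q_f32(reinterpret_cast<const float*>(c));
  // Rotation 0:  re += a.re * b.re,  im += a.re * b.im
  acc = vcmlaq_f32(acc, va, vb);
  // Rotation 90: re -= a.im * b.im,  im += a.im * b.re
  acc = vcmlaq_rot90_f32(acc, va, vb);
  vst1q_f32(reinterpret_cast<float*>(out), acc);
#else
  // Scalar fallback that follows the same two accumulation steps.
  for (int i = 0; i < 2; ++i) {
    float re = c[i].real();
    float im = c[i].imag();
    re += a[i].real() * b[i].real();  // rotation 0
    im += a[i].real() * b[i].imag();
    re -= a[i].imag() * b[i].imag();  // rotation 90
    im += a[i].imag() * b[i].real();
    out[i] = std::complex<float>(re, im);
  }
#endif
}
```

With a zero accumulator the same two steps give a plain complex product, which is how a pmul could be expressed in terms of the pmadd pattern.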
Additional information
I don't consider this MR complete, but I've already discussed it with @chuckyschluz and thought it best to put it up to spur discussion. In particular:
1 - Which tests from the suite should I be running to check the outputs of these?
2 - There are other complex NEON intrinsics that could potentially be added, see: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&f:@navigationhierarchiesinstructiongroup=[Complex%20arithmetic] but I have no idea how far to go.