Fix neon packet math tests, add missing neon intrinsics
Reference issue
What does this implement/fix?
This might fix the aarch64 tests. We are missing pnmsub (and a few others), so they fall back to a non-fused default implementation, while the reference scalar op uses the fused version.
See this code: https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/GenericPacketMath.h#L1330
template <typename Packet, typename EnableIf = void>
struct pmadd_impl {
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pmadd(const Packet& a, const Packet& b, const Packet& c) {
    return padd(pmul(a, b), c);
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pmsub(const Packet& a, const Packet& b, const Packet& c) {
    return psub(pmul(a, b), c);
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pnmadd(const Packet& a, const Packet& b, const Packet& c) {
    return psub(c, pmul(a, b));
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pnmsub(const Packet& a, const Packet& b, const Packet& c) {
    return pnegate(pmadd(a, b, c));
  }
};
Inside pmadd_impl, the unqualified call to pmadd in the default pnmsub resolves to the member pmadd_impl::pmadd (no FMA), not to the specialized pmadd in the global Eigen namespace. This probably works by coincidence on gcc (is fp contraction enabled by default?), whereas clang is stricter.
See: https://godbolt.org/z/xK3Gc88or
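The lookup behavior described above can be reproduced in isolation. This is a minimal sketch, not Eigen code: the namespace `sketch`, the struct `pmadd_impl_like`, and the integer marker return values are all hypothetical stand-ins, chosen only to show which function an unqualified call picks.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for the specialized (fused) namespace-scope pmadd.
// It returns a marker so the caller can tell which function was selected.
inline int pmadd(int, int, int) { return 1; }  // 1 == namespace-scope version

struct pmadd_impl_like {
  // Hypothetical stand-in for the generic, non-fused member.
  static int pmadd(int, int, int) { return 2; }  // 2 == member version

  // Unqualified call: class scope is searched before the enclosing
  // namespace, so this finds the member above.
  static int pnmsub_unqualified(int a, int b, int c) { return pmadd(a, b, c); }

  // Qualified call: explicitly selects the namespace-scope function.
  static int pnmsub_qualified(int a, int b, int c) { return sketch::pmadd(a, b, c); }
};

}  // namespace sketch
```

In this sketch `pnmsub_unqualified` reaches the member and `pnmsub_qualified` reaches the namespace-scope function, which mirrors why the default pnmsub does not pick up a specialized pmadd.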
Unrelated to the test failures, this adds native versions of pnmadd for the float, double and half NEON vector types. The four ops map to intrinsics as follows:
| eigen | arithmetic | neon intrinsic |
|---|---|---|
| pmadd | a * b + c | vfma(c,a,b) |
| pnmadd | c - a * b | vfms(c,a,b) |
| pmsub | a * b - c | pnegate(pnmadd(a,b,c)) |
| pnmsub | -(a * b + c) | pnegate(pmadd(a,b,c)) |
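The identities in the table can be checked on scalars. The sketch below assumes `std::fma` as a stand-in for the NEON fused intrinsics (so `vfma_like(c, a, b)` means c + a * b and `vfms_like(c, a, b)` means c - a * b, each with a single rounding); all function names here are hypothetical and are not part of Eigen or the NEON headers.

```cpp
#include <cassert>
#include <cmath>

// Scalar stand-ins for the fused intrinsics (assumed semantics, see lead-in).
inline float vfma_like(float c, float a, float b) { return std::fma(a, b, c); }   // c + a * b
inline float vfms_like(float c, float a, float b) { return std::fma(-a, b, c); }  // c - a * b

// The four ops, expressed exactly as in the table.
inline float pmadd_s(float a, float b, float c) { return vfma_like(c, a, b); }   // a * b + c
inline float pnmadd_s(float a, float b, float c) { return vfms_like(c, a, b); }  // c - a * b
inline float pmsub_s(float a, float b, float c) { return -pnmadd_s(a, b, c); }   // a * b - c
inline float pnmsub_s(float a, float b, float c) { return -pmadd_s(a, b, c); }   // -(a * b + c)
```

Negating a correctly rounded result under round-to-nearest gives the correctly rounded negated value, so building pmsub and pnmsub from a negation keeps them single-rounding.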
Additional information
125/895 Test #66: packetmath_13 ......................Child aborted***Exception: 0.32 sec
Initializing random number generator with seed 1749374907
Repeating each test 10 times
=== Testing packet of type '13__Float16x8_t' and scalar type 'N5Eigen4halfE' and size '8' ===
=== Testing packet of type '13__Float16x4_t' and scalar type 'N5Eigen4halfE' and size '4' ===
ref: [ -0.3623 -0.6743 0.003361 -1.128 -0.688 0.9873 1.253 0.6074] != vec: [ -0.3623 -0.6743 0.003418 -1.128 -0.6885 0.9873 1.254 0.6074]
Values differ in position 2: 0.003360748291015625 vs 0.00341796875
Test test::runner<half>::run() failed in ../test/packetmath.cpp (503)
test::areApprox(ref, data2, PacketSize) && "internal::pnmsub"
Stack:
- test::runner<half>::run()
- packetmath
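A mismatch of this kind is what one would expect when a fused reference is compared against a twice-rounded packet result. The sketch below illustrates the effect in float rather than half, with hypothetical function names and hand-picked inputs (not the values from the log); the `volatile` temporary is only there to stop the compiler from contracting the unfused path.

```cpp
#include <cassert>
#include <cmath>

// Two roundings: the product is rounded to float, then the sum is rounded.
inline float madd_unfused(float a, float b, float c) {
  volatile float prod = a * b;  // force the intermediate rounding
  return prod + c;
}

// One rounding: a * b + c is evaluated exactly and rounded once.
inline float madd_fused(float a, float b, float c) { return std::fma(a, b, c); }

// Inputs chosen so the exact product 1 + 2^-11 + 2^-24 is not representable
// in float and rounds to 1 + 2^-11 when computed on its own.
inline float example_a() { return 1.0f + std::ldexp(1.0f, -12); }
inline float example_c() { return -(1.0f + std::ldexp(1.0f, -11)); }
```

With `a = b = example_a()` and `c = example_c()`, the unfused path cancels to zero while the fused path keeps the 2^-24 residual.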
Edited by Charles Schlosser