Fix neon packet math tests, add missing neon intrinsics (!1904) · Merge requests · libeigen / eigen

Reference issue

What does this implement/fix?

This might fix the aarch64 tests. The NEON backend is missing pnmsub (and a few others), so these ops fall back to the generic non-fused implementation, while the reference scalar op uses the fused version.

See this code: https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/GenericPacketMath.h#L1330

template <typename Packet, typename EnableIf = void>
struct pmadd_impl {
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pmadd(const Packet& a, const Packet& b, const Packet& c) {
    return padd(pmul(a, b), c);
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pmsub(const Packet& a, const Packet& b, const Packet& c) {
    return psub(pmul(a, b), c);
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pnmadd(const Packet& a, const Packet& b, const Packet& c) {
    return psub(c, pmul(a, b));
  }
  static EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE Packet pnmsub(const Packet& a, const Packet& b, const Packet& c) {
    return pnegate(pmadd(a, b, c));
  }
};

The default implementation of pnmsub calls pmadd unqualified. Inside pmadd_impl that name resolves to the member pmadd_impl::pmadd (no FMA), not to the namespace-scope pmadd specialized for the packet type. This probably works by coincidence on gcc (is fp contraction enabled by default?), whereas clang is stricter.
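The lookup behaviour described above can be reproduced outside Eigen. The following is a minimal sketch, not Eigen code: the namespace `demo`, the struct `pmadd_impl_sketch`, and the `Which` enum are hypothetical names made up for illustration. It only shows that an unqualified call inside a class finds the class's own static member before a namespace-scope function of the same name.

```cpp
namespace demo {

// Hypothetical tag that reports which overload was actually called.
enum class Which { Member, NamespaceScope };

// Stand-in for a specialized (fused) free function at namespace scope.
inline Which pmadd(int, int, int) { return Which::NamespaceScope; }

struct pmadd_impl_sketch {
  // Stand-in for the generic, unfused member implementation.
  static Which pmadd(int, int, int) { return Which::Member; }

  // Unqualified call: class scope is searched first, so the member wins.
  static Which pnmsub_unqualified(int a, int b, int c) { return pmadd(a, b, c); }

  // Qualified call: explicitly selects the namespace-scope function.
  static Which pnmsub_qualified(int a, int b, int c) { return demo::pmadd(a, b, c); }
};

}  // namespace demo
```

In this sketch, `pnmsub_unqualified` reaches the member and `pnmsub_qualified` reaches the namespace-scope function, which mirrors why the generic pnmsub does not pick up a backend's fused pmadd.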

See: https://godbolt.org/z/xK3Gc88or

Unrelated to the test failures, this adds a native version of pnmadd for float, double and half neon vector types.

eigen	arithmetic	neon intrinsic
pmadd	a * b + c	vfma(c,a,b)
pnmadd	c - a * b	vfms(c,a,b)
pmsub	a * b - c	pnegate(pnmadd(a,b,c))
pnmsub	-(a * b + c)	pnegate(pmadd(a,b,c))
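As a scalar illustration of the table, the four operations can be written in terms of `std::fma`. This is a sketch under the assumption that `std::fma(-a, b, c)` is an acceptable scalar stand-in for the fused multiply-subtract (`c - a * b` with a single rounding); the `*_ref` function names are hypothetical and are not Eigen's reference implementations.

```cpp
#include <cmath>

// a * b + c, fused (one rounding).
inline double pmadd_ref(double a, double b, double c) { return std::fma(a, b, c); }

// c - a * b, fused. Negating a is exact, so this is still a single rounding.
inline double pnmadd_ref(double a, double b, double c) { return std::fma(-a, b, c); }

// a * b - c, expressed as the negation of pnmadd, as in the table.
inline double pmsub_ref(double a, double b, double c) { return -pnmadd_ref(a, b, c); }

// -(a * b + c), expressed as the negation of pmadd, as in the table.
inline double pnmsub_ref(double a, double b, double c) { return -pmadd_ref(a, b, c); }
```

Because negation is exact in IEEE 754 arithmetic, the two derived forms keep the single rounding of the fused operation they wrap.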

Additional information

125/895 Test  #66: packetmath_13 ......................Child aborted***Exception:   0.32 sec
Initializing random number generator with seed 1749374907
Repeating each test 10 times
=== Testing packet of type '13__Float16x8_t' and scalar type 'N5Eigen4halfE' and size '8' ===
=== Testing packet of type '13__Float16x4_t' and scalar type 'N5Eigen4halfE' and size '4' ===
ref: [ -0.3623   -0.6743  0.003361    -1.128    -0.688    0.9873     1.253    0.6074] != vec: [ -0.3623   -0.6743  0.003418    -1.128   -0.6885    0.9873     1.254    0.6074]
Values differ in position 2: 0.003360748291015625 vs 0.00341796875
Test test::runner<half>::run() failed in ../test/packetmath.cpp (503)
    test::areApprox(ref, data2, PacketSize) && "internal::pnmsub"
Stack:
  - test::runner<half>::run()
  - packetmath
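A mismatch of this kind is what a fused versus unfused multiply-add can produce when the product and the addend nearly cancel. The sketch below does not reproduce the half-precision values from the log; it uses double and hand-picked inputs (an assumption for illustration), and the names `madd_unfused`, `madd_fused`, and `cancelling_sample` are hypothetical. It assumes default floating-point settings (no fast-math).

```cpp
#include <cmath>

// Unfused a * b + c: the product is rounded to double before the add.
// The volatile store is only there to discourage the compiler from
// contracting the two operations into a single FMA.
inline double madd_unfused(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused a * b + c: the exact product feeds the add, with one final rounding.
inline double madd_fused(double a, double b, double c) { return std::fma(a, b, c); }

struct Sample {
  double a, b, c;
};

// Inputs chosen so that a * b = 1 + 2^-26 + 2^-54 exactly, and c cancels
// everything except the 2^-54 term, which the unfused product rounds away.
inline Sample cancelling_sample() {
  const double a = 1.0 + std::ldexp(1.0, -27);
  const double c = -(1.0 + std::ldexp(1.0, -26));
  return {a, a, c};
}
```

With these inputs the unfused path returns zero while the fused path keeps the 2^-54 residual, so a test comparing one against the other with a relative tolerance can fail even though both are valid roundings of their respective formulas.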

Edited Jun 09, 2025 by Charles Schlosser
