add madd_no_fma_helper
Modifies the general case of gebp_traits so that the tmp variable of madd in the no_fma case is only used if its type, RhsPacketType, is compatible with (able to store the result of) AccPacketType; otherwise a fresh AccPacketType is used instead.
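The dispatch described above can be sketched roughly as follows. This is a standalone illustration, not the code in this change: the names madd_no_fma_sketch, Scale, and Wide are hypothetical, plain operators stand in for Eigen's packet primitives, and std::is_same is only an assumed stand-in for the actual compatibility check.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in types: Scale * double yields a Wide,
// and a Wide cannot be stored in a double.
struct Scale { double k; };
struct Wide { double lo, hi; };
inline Wide operator*(Scale s, double x) { return Wide{s.k * x, s.k * x}; }
inline Wide operator+(Wide a, Wide b) { return Wide{a.lo + b.lo, a.hi + b.hi}; }

// This is why the old code fails to compile for such a type combination.
static_assert(!std::is_assignable<double&, Wide>::value,
              "a Wide result cannot be stored in a double tmp");

// Assumption: "compatible" is modelled here as "same type".
template <typename Lhs, typename Rhs, typename Acc,
          bool TmpUsable = std::is_same<Rhs, Acc>::value>
struct madd_no_fma_sketch;

// Rhs can hold the product: reuse the caller-provided tmp, as the old code did.
template <typename Lhs, typename Rhs, typename Acc>
struct madd_no_fma_sketch<Lhs, Rhs, Acc, true> {
  static void run(const Lhs& a, const Rhs& b, Acc& c, Rhs& tmp) {
    tmp = b;
    tmp = a * tmp;
    c = c + tmp;
  }
};

// Rhs cannot hold the product: ignore tmp and use a fresh Acc instead.
template <typename Lhs, typename Rhs, typename Acc>
struct madd_no_fma_sketch<Lhs, Rhs, Acc, false> {
  static void run(const Lhs& a, const Rhs& b, Acc& c, Rhs& /*tmp*/) {
    Acc fresh = a * b;
    c = c + fresh;
  }
};

// Small usage example covering both branches.
inline void example_usage() {
  double c1 = 1.0, tmp1 = 0.0;
  madd_no_fma_sketch<double, double, double>::run(2.0, 3.0, c1, tmp1);
  assert(c1 == 7.0);  // 1 + 2 * 3, computed through tmp1

  Wide c2{1.0, 1.0};
  double tmp2 = 0.0;
  madd_no_fma_sketch<Scale, double, Wide>::run(Scale{2.0}, 3.0, c2, tmp2);
  assert(c2.lo == 7.0 && c2.hi == 7.0);  // tmp2 is left untouched
}
```

Selecting the branch through a partial specialization keeps the existing path unchanged for type combinations that already compiled, so only the new combinations pay for the extra temporary.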
Why do this at all?
AccPacketType is, approximately, the result of LhsPacketType * RhsPacketType, and the existing code attempts to store a value of this type in tmp. However, tmp has the type RhsPacketType, so unless AccPacketType can be converted to and stored in RhsPacketType, a compilation error results. Alternative approaches, such as changing the type of the tmp argument to AccPacketType, necessitate spillover changes elsewhere, since both B# and T0 are passed as arguments for tmp and the function signatures become ambiguous. It is also painful for other projects to specialize the (medium-sized, tightly integrated) gebp_traits template.

The change here is minimal and should add no overhead for the cases that already worked with the old code. For the new use cases it enables, the code could be made more efficient, for example by reusing the temporary variable through a (thread-local) static or by not assigning from b. However, I wanted to keep the changes as simple as possible.