Inconsistent handling of NaNs by pmax/pmin
Assigned to Nobody
Link to original bugzilla bug (#1373)
Version: 3.3 (current stable)
Platform: x86 - general
Because std::min/std::max and the Intel SIMD instructions used for pmin/pmax handle a NaN argument differently, the vectorized and scalar code paths of scalar_min_op/scalar_max_op produce inconsistent outputs.
For example, std::max(a, b) is defined to have the same value as the ternary expression (a < b) ? b : a; in other words, if either argument is NaN, a is returned.
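A minimal sketch of the scalar behaviour described above. The names ternary_max and kNaN are hypothetical helpers introduced only for this illustration; they are not part of Eigen.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical helper: a quiet NaN to feed into the comparisons.
const float kNaN = std::numeric_limits<float>::quiet_NaN();

// Hypothetical helper: the ternary expression that std::max(a, b) matches.
// Any comparison involving NaN is false, so the first argument is returned
// whenever either input is NaN.
inline float ternary_max(float a, float b) { return (a < b) ? b : a; }

// std::max(kNaN, 1.0f) -> NaN  (first argument returned, NaN propagates)
// std::max(1.0f, kNaN) -> 1.0f (first argument returned, NaN is dropped)
```

So on the scalar path a NaN survives only when it is the first argument.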
Conversely, the MAXPS instruction invoked by the _mm_max_ps(a, b) intrinsic always returns the second argument, b, if either a or b is NaN.
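The same experiment on the SIMD side might look like the sketch below. The helper maxps_lane0 and the macro DEMO_HAVE_SSE are hypothetical names for this illustration. On targets without SSE the helper falls back to a scalar model of the rule stated above, which is an assumption of this sketch rather than a measurement.

```cpp
#include <cmath>
#include <limits>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define DEMO_HAVE_SSE 1
#endif

// Hypothetical helper: returns lane 0 of MAXPS applied to broadcast inputs.
inline float maxps_lane0(float a, float b) {
#ifdef DEMO_HAVE_SSE
  return _mm_cvtss_f32(_mm_max_ps(_mm_set1_ps(a), _mm_set1_ps(b)));
#else
  // Scalar model (assumption): the second operand is returned unless a > b.
  return (a > b) ? a : b;
#endif
}

// With nan = std::numeric_limits<float>::quiet_NaN():
// maxps_lane0(1.0f, nan) -> NaN  (second argument returned)
// maxps_lane0(nan, 1.0f) -> 1.0f (second argument returned)
```

This is the mirror image of std::max: here a NaN survives only when it is the second argument.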
I propose to swap the arguments to pmin/pmax in scalar_*_op::packetOp to get consistent results.
One could also swap the arguments in the implementation of scalar_*_op::operator(), but it turns out that the former approach makes the Eigen Tensor library properly propagate NaNs for the Relu operator, which is performance critical for neural net models.
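To illustrate the proposal, here is a rough scalar model of the swap, not the actual Eigen patch. All function names are hypothetical. maxps_model stands in for one lane of MAXPS, and the Relu is assumed to pass its input as the first argument, as in max(x, 0).

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Scalar model of one MAXPS lane (assumption): the second operand is
// returned whenever a > b is false, which includes any NaN input.
inline float maxps_model(float a, float b) { return (a > b) ? a : b; }

// Scalar path: std::max returns the first argument when either input is NaN.
inline float scalar_path(float a, float b) { return std::max(a, b); }

// Proposed packet path: the arguments are swapped before the SIMD max, so
// the first argument of the functor is again the one returned on NaN.
inline float packet_path_swapped(float a, float b) { return maxps_model(b, a); }

// Hypothetical Relu built on the swapped packet path, with x passed first.
inline float relu_swapped(float x) { return packet_path_swapped(x, 0.0f); }
```

Under these assumptions both paths return the first argument when a NaN is present, so relu_swapped propagates a NaN input instead of replacing it with 0.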
Do you approve of this solution or do you have a better idea?