Clean up Intel packet reductions

Reference issue

What does this implement/fix?

While working on maxCoeff / minCoeff, I noticed that several predux ops were missing. This MR reorganizes the predux operations for Intel intrinsics into separate files and adds a few that were missing (namely, predux_min / predux_max with NaN propagation). I also switched the floating-point reductions to the AVX512 built-ins.
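For readers unfamiliar with the distinction, here is a minimal scalar sketch of what a max-reduction with and without NaN propagation is expected to return. It is not the code in this MR: the function names `reduce_max_propagate_nan` and `reduce_max_propagate_numbers` are hypothetical, and a plain array of floats stands in for the lanes of a packet.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical scalar model of a packet max-reduction; NOT Eigen's implementation.
// NaN-propagating variant: if any lane is NaN, the result is NaN.
inline float reduce_max_propagate_nan(const float* lanes, std::size_t n) {
  assert(n > 0);
  float result = lanes[0];
  for (std::size_t i = 1; i < n; ++i) {
    if (std::isnan(result)) break;  // once NaN is seen, it wins
    if (std::isnan(lanes[i]) || lanes[i] > result) result = lanes[i];
  }
  return result;
}

// Number-propagating variant: NaN lanes are ignored unless every lane is NaN.
inline float reduce_max_propagate_numbers(const float* lanes, std::size_t n) {
  assert(n > 0);
  float result = lanes[0];
  for (std::size_t i = 1; i < n; ++i) {
    // Replace a NaN accumulator with whatever comes next; a NaN lane never
    // compares greater, so it cannot displace a number.
    if (std::isnan(result) || lanes[i] > result) result = lanes[i];
  }
  return result;
}
```

With lanes `{1, NaN, 3, 2}`, the first variant yields NaN and the second yields 3; with no NaN present, both agree.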

Additional information

Edited by Charles Schlosser
