Vectorize acos, asin, and atan for float.

Reference issue

What does this implement/fix?

This change vectorizes the acos, asin, and atan operators in Eigen.

Additional information

Accuracy: Exhaustive testing over all float arguments in [-1, 1] shows that this implementation is accurate to within 2.6 ulps for pacos and 3.8 ulps for pasin. The maximum relative error for patan is 2 ulps.
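For readers who want to reproduce this kind of measurement, here is a minimal sketch of an exhaustive ulp-error check in standard C++. It is not the harness used for this merge request; the helper names (`ulp_of`, `ulp_error`, `max_ulp_error`) are hypothetical, and it assumes that one ulp is the spacing between the float nearest the double-precision reference and the next float of larger magnitude. Other ulp conventions differ slightly near powers of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Spacing between |x| and the next larger float: one "unit in the last place".
inline double ulp_of(float x) {
  const float ax = std::fabs(x);
  const float next = std::nextafter(ax, std::numeric_limits<float>::infinity());
  return static_cast<double>(next) - static_cast<double>(ax);
}

// Error of a float result, measured in ulps, against a double reference.
inline double ulp_error(float approx, double reference) {
  assert(std::isfinite(reference));
  const double ulp = ulp_of(static_cast<float>(reference));
  return std::fabs(static_cast<double>(approx) - reference) / ulp;
}

// Visit every float in [lo, hi] (hi must be finite) and return the worst
// ulp error of the float function f against the double reference ref.
template <typename F, typename R>
double max_ulp_error(F f, R ref, float lo, float hi) {
  double worst = 0.0;
  for (float x = lo; x <= hi;
       x = std::nextafter(x, std::numeric_limits<float>::infinity())) {
    worst = std::max(worst, ulp_error(f(x), ref(static_cast<double>(x))));
  }
  return worst;
}
```

To check a float implementation of asin, one could pass it as `f`, pass the double-precision `std::asin` as `ref`, and scan from `-1.0f` to `1.0f`. That range covers roughly two billion float values, so a full scan takes a while.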

Speed: see $2393114.

Speedup for a 4k-element vector:

Function  SSE     AVX     AVX512
asin()    6x      15.9x   18.6x
acos()    11.4x   29.5x   29.8x
atan()    5.5x    9.3x    17x

Edited by Rasmus Munk Larsen