Vectorize acos, asin, and atan for float.
Reference issue
What does this implement/fix?
This change vectorizes the acos, asin, and atan operators in Eigen for float arguments.
Additional information
Accuracy: Exhaustive testing over all float arguments in [-1, 1] shows that this implementation
is accurate to within 2.6 ulps for pacos and 3.8 ulps for pasin. The maximum relative error for patan is 2 ulps.
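For readers who want to reproduce this kind of measurement, the following is a minimal sketch of an ulp-error sweep over float bit patterns in [-1, 1]. It assumes the double-precision standard library function is an adequate reference. The helper names (`ulp_size`, `ulp_error`, `float_from_bits`, `max_ulp_error`) are hypothetical and are not part of Eigen or of this change; the harness actually used for the numbers above is not shown here and may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Size of one ulp at x: the gap between |x| and the next float above it.
inline double ulp_size(float x) {
  const float ax = std::fabs(x);
  const float next = std::nextafter(ax, std::numeric_limits<float>::infinity());
  return static_cast<double>(next) - static_cast<double>(ax);
}

// Error of a float result against a double reference, in ulps at the reference.
inline double ulp_error(float approx, double reference) {
  const double err = std::fabs(static_cast<double>(approx) - reference);
  if (err == 0.0) return 0.0;
  return err / ulp_size(static_cast<float>(reference));
}

// Reinterpret a 32-bit pattern as a float.
inline float float_from_bits(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Maximum ulp error of f against ref over [-1, 1].
// Non-negative floats are ordered like their bit patterns, so stepping the
// pattern from 0 to that of 1.0f walks [0, 1]; each value is also negated.
// stride == 1 is exhaustive; a larger stride samples every stride-th float
// (and may skip the endpoint 1.0f).
template <typename F, typename Ref>
double max_ulp_error(F f, Ref ref, std::uint32_t stride = 1) {
  assert(stride > 0);
  const std::uint64_t one_bits = 0x3f800000u;  // bit pattern of 1.0f
  double worst = 0.0;
  auto check = [&](float v) {
    const double e = ulp_error(f(v), ref(static_cast<double>(v)));
    if (e > worst) worst = e;
  };
  for (std::uint64_t b = 0; b <= one_bits; b += stride) {
    const float x = float_from_bits(static_cast<std::uint32_t>(b));
    check(x);
    check(-x);
  }
  return worst;
}
```

To measure a candidate, pass it as `f` and the double-precision function as `ref`, for example `max_ulp_error([](float x) { return std::asin(x); }, [](double x) { return std::asin(x); })`. An exhaustive run visits roughly two billion values, so a stride greater than 1 is useful for quick checks.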
Speed: See: $2393114
Speedup for a 4k-element vector:
| | SSE | AVX | AVX512 |
|---|---|---|---|
| asin() | 6x | 15.9x | 18.6x |
| acos() | 11.4x | 29.5x | 29.8x |
| atan() | 5.5x | 9.3x | 17x |
Edited by Rasmus Munk Larsen