Vectorize atanh & add a missing definition and unit test for atan.

This change adds

  1. A vectorized implementation of atanh.
  2. A missing definition for atan<half>.
  3. Unit tests for patan.
  4. A new helper function for testing unary functors on special IEEE values. Tests using this helper are added for the most common mathematical functions.
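
To illustrate the kind of check item 4 refers to, here is a minimal standalone sketch. It is not the helper added in this change: the names `same_special_value` and `check_atanh_special_values` are hypothetical, and the expected results are the ones C99 / IEEE 754 specify for atanh at special inputs.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not the one added in this change): returns true when
// `actual` matches `expected`, treating NaN as equal to NaN and
// distinguishing +0 from -0.
template <typename T>
bool same_special_value(T actual, T expected) {
  if (std::isnan(expected)) return std::isnan(actual);
  if (actual != expected) return false;
  return std::signbit(actual) == std::signbit(expected);
}

// Runs a unary function `fn` over special inputs and compares against the
// results specified for atanh. Returns false on the first mismatch.
template <typename T, typename Fn>
bool check_atanh_special_values(Fn fn) {
  const T inf = std::numeric_limits<T>::infinity();
  const T nan = std::numeric_limits<T>::quiet_NaN();
  struct Case {
    T input;
    T expected;
  };
  const Case cases[] = {
      {T(0), T(0)},    // atanh(+0) = +0
      {-T(0), -T(0)},  // atanh(-0) = -0 (sign of zero is preserved)
      {T(1), inf},     // atanh(+1) = +inf
      {T(-1), -inf},   // atanh(-1) = -inf
      {T(2), nan},     // |x| > 1 is outside the domain
      {inf, nan},      // infinity is outside the domain as well
      {nan, nan},      // NaN propagates
  };
  for (const Case& c : cases) {
    if (!same_special_value(fn(c.input), c.expected)) return false;
  }
  return true;
}
```

For example, `check_atanh_special_values<float>([](float x) { return std::atanh(x); })` exercises the standard library implementation; a vectorized kernel could be checked the same way by wrapping it in a scalar lambda.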

The vectorized function is accurate to within 2 ULP.
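
For readers who want to reproduce an accuracy figure like this one, the following is a rough sketch of how a ULP error can be measured for a float function. It is an assumption about methodology, not the procedure used for this change: the function names are hypothetical, the reference is `std::atanh` evaluated in double and rounded to float, and the sample points are simply evenly spaced in (-1, 1).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Maps a float to an integer so that adjacent floats map to adjacent
// integers. +0 and -0 both map to 0.
inline std::int64_t ordered_bits(float x) {
  std::int32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  // Floats are sign-magnitude; mirror negative values onto a monotonic scale.
  return bits < 0 ? -static_cast<std::int64_t>(bits & 0x7fffffff)
                  : static_cast<std::int64_t>(bits);
}

// Number of representable floats between a and b (0 when they are equal).
inline std::int64_t ulp_distance(float a, float b) {
  const std::int64_t d = ordered_bits(a) - ordered_bits(b);
  return d < 0 ? -d : d;
}

// Largest ULP distance between fn(x) and the rounded double-precision
// reference over n - 1 evenly spaced points strictly inside (-1, 1).
template <typename Fn>
std::int64_t max_atanh_ulp_error(Fn fn, int n) {
  assert(n > 1);
  std::int64_t worst = 0;
  for (int i = 1; i < n; ++i) {
    const float x = -1.0f + 2.0f * static_cast<float>(i) / static_cast<float>(n);
    const float ref = static_cast<float>(std::atanh(static_cast<double>(x)));
    const std::int64_t err = ulp_distance(fn(x), ref);
    if (err > worst) worst = err;
  }
  return worst;
}
```

Evenly spaced sampling under-represents inputs near 0 and near ±1, so an exhaustive or logarithmically spaced sweep would give a more trustworthy bound.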

Benchmark numbers:

SSE:
name                       old cpu/op  new cpu/op  delta
BM_eigen_atanh_float/1     0.29ns ± 1%  2.45ns ± 0%  +748.86%  (p=0.000 n=53+46)
BM_eigen_atanh_float/8     36.4ns ± 0%  34.3ns ± 2%    -5.92%  (p=0.000 n=43+38)
BM_eigen_atanh_float/64     334ns ± 1%   205ns ± 4%   -38.61%  (p=0.000 n=44+60)
BM_eigen_atanh_float/512   2.65µs ± 1%  1.60µs ± 1%   -39.82%  (p=0.000 n=51+38)
BM_eigen_atanh_float/4k    21.2µs ± 0%  12.7µs ± 4%   -40.00%  (p=0.000 n=56+59)
BM_eigen_atanh_float/32k    169µs ± 0%   102µs ± 1%   -39.80%  (p=0.000 n=51+43)
BM_eigen_atanh_float/256k  1.36ms ± 0%  0.82ms ± 1%   -39.84%  (p=0.000 n=56+48)
BM_eigen_atanh_float/1M    5.43ms ± 0%  3.26ms ± 2%   -39.87%  (p=0.000 n=55+43)

AVX2:
name                       old cpu/op  new cpu/op  delta
BM_eigen_atanh_float/1     2.12ns ± 1%  1.91ns ± 0%   -9.68%  (p=0.000 n=54+55)
BM_eigen_atanh_float/8     38.1ns ± 0%  39.4ns ± 1%   +3.55%  (p=0.000 n=45+43)
BM_eigen_atanh_float/64     334ns ± 3%   137ns ± 4%  -59.10%  (p=0.000 n=50+59)
BM_eigen_atanh_float/512   2.66µs ± 2%  0.84µs ± 5%  -68.51%  (p=0.000 n=51+57)
BM_eigen_atanh_float/4k    21.3µs ± 1%   6.5µs ± 5%  -69.53%  (p=0.000 n=58+59)
BM_eigen_atanh_float/32k    170µs ± 1%    51µs ± 4%  -69.90%  (p=0.000 n=57+53)
BM_eigen_atanh_float/256k  1.36ms ± 1%  0.41ms ± 4%  -69.81%  (p=0.000 n=58+47)
BM_eigen_atanh_float/1M    5.46ms ± 1%  1.65ms ± 5%  -69.71%  (p=0.000 n=60+59)

AVX512:
name                       old cpu/op  new cpu/op  delta
BM_eigen_atanh_float/1     2.18ns ± 1%  3.28ns ± 1%  +49.98%  (p=0.000 n=56+47)
BM_eigen_atanh_float/8     38.1ns ± 0%  41.4ns ± 1%   +8.87%  (p=0.000 n=47+44)
BM_eigen_atanh_float/64     332ns ± 1%   148ns ± 3%  -55.49%  (p=0.000 n=43+54)
BM_eigen_atanh_float/512   2.66µs ± 1%  0.47µs ± 5%  -82.21%  (p=0.000 n=52+56)
BM_eigen_atanh_float/4k    21.3µs ± 1%   3.1µs ± 3%  -85.39%  (p=0.000 n=54+50)
BM_eigen_atanh_float/32k    170µs ± 0%    24µs ± 3%  -85.86%  (p=0.000 n=54+55)
BM_eigen_atanh_float/256k  1.36ms ± 1%  0.19ms ± 3%  -85.90%  (p=0.000 n=56+60)
BM_eigen_atanh_float/1M    5.46ms ± 1%  0.76ms ± 4%  -85.98%  (p=0.000 n=57+58)
Edited by Rasmus Munk Larsen
