Skip to content

Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance

Guoqiang QI requested to merge guoqiangqi1/eigen:fft_dev into master

Eigen using KissFFT as the default fft implementation now,but it`s performance drops sharply in some specific situations since kissfft fail to handle "odd-sized" inputs (i.e., sizes with large factors) efficiently, which was proposed in #1717.In pocketfft, for lengths with very large prime factors, Bluestein's algorithm is used, and instead of an FFT of length n, a convolution of length n2 >= 2*n-1 is performed,where n is chosen to be highly composite.
In addition, the google/jax project uses eigen's fft as its backend before, which also has problems with performance and accuracy. You can see their discussion FFT precision/performance #2952. Currently, they have switched to pocketfft as their default fft implementation.

Prelininary performance comparison(complex to complex and -O3 optimization ):

-------------------------------------------
  length      kissfft       pocketfft
-------------------------------------------
 100000      7.03 ms         4.36 ms          
 100000*2    13.7 ms         9.67 ms          
 100001      14999 ms        24.3 ms        

Updated benchmarks tested by fft_benchmark.cpp on AArch64 (time units:ns) :

Benchmark                                    Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------
test_scalar_float/100                     -0.1687         -0.1688          2555          2124          2554          2123
test_scalar_float/512                     -0.4176         -0.4176         14773          8603         14767          8600
test_scalar_float/4096                    -0.5413         -0.5413        158755         72818        158684         72781
test_scalar_float/32768                   -0.4713         -0.4714       1639528        866753       1638255        866018
test_scalar_float/100000                  -0.5012         -0.5012       6823375       3403716       6817207       3400439
test_scalar_float/100001                  -0.9988         -0.9988   22048197988      25454072   22029408670      25420215
test_scalar_double/100                    -0.1298         -0.1298          2421          2107          2420          2106
test_scalar_double/512                    -0.3835         -0.3835         13892          8565         13885          8561
test_scalar_double/4096                   -0.5060         -0.5060        153078         75624        153005         75581
test_scalar_double/32768                  -0.4349         -0.4350       1912198       1080593       1910733       1079557
test_scalar_double/100000                 -0.4256         -0.4256       7089696       4072526       7083264       4068571
test_scalar_double/100001                 -0.9985         -0.9985   29955156310      44809112   29930479990      44739807
test_complex_float/100                    -0.4638         -0.4638          4454          2388          4452          2387
test_complex_float/512                    -0.6285         -0.6285         29882         11102         29869         11097
test_complex_float/4096                   -0.6174         -0.6174        285366        109192        285238        109137
test_complex_float/32768                  -0.6265         -0.6265       3698800       1381495       3695956       1380318
test_complex_float/100000                 -0.6730         -0.6730      13653536       4465063      13642033       4460459
test_complex_float/100001                 -0.9990         -0.9990   22388960173      22772174   22370358320      22745811
test_complex_double/100                   -0.1601         -0.1602          4239          3560          4237          3558
test_complex_double/512                   -0.6085         -0.6085         28463         11143         28452         11138
test_complex_double/4096                  -0.5924         -0.5924        294522        120061        294352        119977
test_complex_double/32768                 -0.5963         -0.5963       4087268       1650075       4083765       1648615
test_complex_double/100000                -0.5011         -0.5011      15675452       7820870      15659329       7811764
test_complex_double/100001                -0.9985         -0.9985   28986634903      43583605   28961759140      43521986

fft_benchmark.cpp

Edited by Guoqiang QI

Merge request reports