FFT object instantiation
FFT objects are created multiple times per optimization iteration, and many times overall. As a result, multiple FFT plans (for FFTW or Accelerate FFT) are created, usually for the same data size.
FFTW:
impl_->p_fft = fftw_plan_dft(narrow<int>(dims_.size()), dims_.data(), dataPointer, dataPointer, FFTW_FORWARD, FFTW_ESTIMATE_PATIENT);
Accelerate:
impl_->p_fft = vDSP_create_fftsetupD(vDSP_Length( std::log2( dataLength ) ), FFT_RADIX2 );
I think we could reduce the number of FFT objects being initialized by reusing the existing ones.
This would make the most sense for Accelerate FFT plan creation, because the setup depends only on the dimensions of the data, not on the data itself. Accelerate FFT plans are therefore reusable for future transforms, yet we recreate them many times. (See the description of FFT weights arrays and the advice on reusing FFT plans at https://developer.apple.com/library/archive/documentation/Performance/Conceptual/vDSP_Programming_Guide/UsingFourierTransforms/UsingFourierTransforms.html#//apple_ref/doc/uid/TP40005147-CH3-SW1).
Example: executing a forward FFT in oneparticle-example (256 points) takes about 1-15 us, but creating a new plan can take 40-70 us.
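To make the reuse idea concrete, here is a minimal sketch of a plan cache keyed by the data dimensions. The names `PlanCache`, `Factory` and `get` are hypothetical and do not exist in the code base. The sketch uses only the standard library, so the backend-specific plan creation is passed in as a callback instead of being called directly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the existing code base): keeps one plan
// per distinct dimension vector so that repeated FFT object construction
// with the same dimensions does not pay the plan creation cost again.
template <typename Plan>
class PlanCache {
public:
    using Dims = std::vector<int>;
    // The factory performs the backend-specific (expensive) plan creation.
    using Factory = std::function<std::shared_ptr<Plan>(const Dims&)>;

    explicit PlanCache(Factory factory) : factory_(std::move(factory)) {
        assert(factory_ != nullptr);
    }

    // Returns the cached plan for `dims`, creating it on the first request.
    std::shared_ptr<Plan> get(const Dims& dims) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = plans_.find(dims);
        if (it == plans_.end()) {
            it = plans_.emplace(dims, factory_(dims)).first;
        }
        return it->second;
    }

    // Number of distinct dimension vectors with a cached plan.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return plans_.size();
    }

private:
    Factory factory_;
    std::map<Dims, std::shared_ptr<Plan>> plans_;
    mutable std::mutex mutex_;
};
```

For Accelerate, the factory could wrap vDSP_create_fftsetupD, with vDSP_destroy_fftsetupD as the deleter of the shared pointer. For FFTW the situation is less direct, because a plan is created for specific arrays. Reusing it on other buffers would have to go through the new-array execute interface (fftw_execute_dft), which requires the new arrays to match the original ones in size, alignment and in-place/out-of-place layout.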