Optimization vectorization (!98) · Merge requests · octopus-code / octopus

This improvement to the routine vnl_start applies only to its unpacked version.

I tested the gain for TD calculations on the following systems:

Si primitive in serial, 1kpt
Si cubic in serial, 1kpt (spacing = 0.5,0.45,0.4,0.35)
hBN bilayer with c=240 Bohr, 32 cores, 16kpt, parallelization in kpt+states (spacing = 0.5,0.45,0.4,0.35)

The attached plots show the time (from profiling/time.00000) vs the number of inner points in the mesh for hBN and for bulk silicon.

The gain becomes larger as the ratio of states to grid points becomes smaller.
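To illustrate how this ratio changes with the spacing, here is a minimal Python sketch. It is not Octopus code: the cell length, the number of states and the helper names `grid_points` and `states_per_point` are hypothetical, and the point count is a crude estimate for a uniformly sampled cubic cell rather than the number of inner points reported by the profiler.

```python
def grid_points(cell_length, spacing):
    """Rough number of points in a cubic cell sampled on a uniform grid."""
    per_axis = max(1, round(cell_length / spacing))
    return per_axis ** 3


def states_per_point(n_states, cell_length, spacing):
    """Ratio of states to grid points for a fixed number of states."""
    return n_states / grid_points(cell_length, spacing)


# Hypothetical inputs chosen for illustration only (not taken from the tests above).
CELL_LENGTH = 10.0  # Bohr, assumed
N_STATES = 16       # assumed

for spacing in (0.5, 0.45, 0.4, 0.35):
    points = grid_points(CELL_LENGTH, spacing)
    ratio = states_per_point(N_STATES, CELL_LENGTH, spacing)
    print(f"spacing={spacing:<5} points={points:<7} states/points={ratio:.2e}")
```

With the number of states held fixed, a finer spacing adds grid points roughly cubically, so the ratio drops. That is the regime in which the gain was observed to be larger.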
