
Some optimizations for batch operations and more

Nicolas Tancogne-Dejean requested to merge optimized_batch_ops into main

Description

  • Fix some unit tests and add the missing FLOP count for norm2.
  • Implement the batch scalar scal operation using BLAS. This is much faster than the original version.
  • Replace some internal batch code with BLAS calls.
  • Fix the exponential unit test: the timestep was too large, leading to NaNs after a large number of applications of the exponential.
  • Add profiling FLOP data for the density.
  • Change some inefficient OpenMP loops.
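
For readers unfamiliar with the idea behind the scal change, here is a minimal sketch and not the Octopus implementation. It assumes a batch stored as one contiguous buffer of `nst * npoints` values; the names `scal`, `batch_scal_loops`, and `batch_scal_single_call` are hypothetical. The `scal` function is a pure-Python stand-in that only mimics the argument order of the BLAS `?scal` routines (`n`, `alpha`, `x`, `incx`), so the example runs without linking BLAS.

```python
from array import array


def scal(n, alpha, x, incx=1):
    """Stand-in mimicking BLAS ?scal: x[i * incx] *= alpha for i in 0..n-1."""
    for i in range(n):
        x[i * incx] *= alpha


def batch_scal_loops(nst, npoints, alpha, data):
    """Reference version: explicit nested loops over points and states."""
    for ip in range(npoints):
        for ist in range(nst):
            data[ip * nst + ist] *= alpha


def batch_scal_single_call(nst, npoints, alpha, data):
    """Same result with a single scal call over the whole contiguous buffer."""
    scal(nst * npoints, alpha, data, 1)


# Hypothetical batch: 3 states on 4 grid points, stored contiguously.
nst, npoints = 3, 4
a = array("d", (float(i) for i in range(nst * npoints)))
b = array("d", a)

batch_scal_loops(nst, npoints, 0.5, a)
batch_scal_single_call(nst, npoints, 0.5, b)
print(a == b)
```

With a real BLAS library, the single call hands the whole buffer to an optimized routine, which is where a speed-up over hand-written loops would be expected to come from.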

News snippet

Code optimization.

Checklist

  • I have checked that my code follows the Octopus coding standards.
  • I have added tests for all the new features added in this request.
