Optimize OpenMP routines
Description
Rewrite the norm2 routine to avoid false sharing so that it scales with the number of OpenMP threads. Tune some OpenMP loops to improve memory access (by reducing NUMA effects) and use OpenMP simd more often.
News snippet
Improve performance using OpenMP.
Checklist
- I have checked that my code follows the Octopus coding standards
- I have added tests for all the new features added in this request.
Edited by Nicolas Tancogne-Dejean