Change default for StatesBlockSize

Sebastian Ohlmann requested to merge fix_statesblocksize into develop

Description

Until now, the default of StatesBlockSize for CPU runs depended on the number of OpenMP threads. However, performance is quite poor for StatesBlockSize > 8, because larger blocks put much more pressure on the memory bandwidth and lead to less cache reuse.

Tests with different hybrid (MPI/OpenMP) combinations showed that the OpenMP parallelization scales much better when this variable is kept fixed instead of depending on the number of threads.
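
Runs that relied on the old thread-dependent default can still set the block size explicitly in the input file. A minimal sketch, assuming the usual `inp` syntax; the value 4 is only an illustration and is not taken from this merge request:

```
# Pin the block size explicitly instead of relying on the default.
# The value below is illustrative; per the description above,
# values larger than 8 performed poorly on CPUs.
StatesBlockSize = 4
```

Setting the variable explicitly also keeps the block size the same when the number of OpenMP threads changes between runs, which makes timings easier to compare.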

News snippet

Change default for StatesBlockSize

Checklist

  • I have checked that my code follows the Octopus coding standards.
  • I have added tests for all the new features added in this request.