SYCL listed forces: optimize parameter passing

Andrey Alekseenko requested to merge aa-gfx90a-do-weird-indexing into release-2023

Speed-up of the "Bonded" kernels compared to 2023.0, measured on ADH cubic:

| Configuration | F-only | FV |
|---|---|---|
| hipSYCL 0.9.4, ROCm 5.2.5, bundled Clang, gfx90a | 2.67 | 2.33 |
| hipSYCL 0.9.4, ROCm 5.4.1, Clang 15.0.7, gfx1034 | 1.94 | 1.56 |
| hipSYCL 0.9.4, CUDA 11.8, Clang 15.0.7, sm_86 | 1.00 | 1.00 |
| IntelLLVM nightly 2023-02-06, CUDA 11.8, sm_86 | 4.09 | 1.55 |
| oneAPI 2023.0, Intel Arc 770 | 1.29 | 1.01 |
| oneAPI 2023.0, Ponte Vecchio | 1.10 | ~1 |

Based on an optimization from AMD/StreamHPC.

Refs #3928 (closed), #4593 (closed)

Edited by Andrey Alekseenko
