
SYCL: Avoid performance regression with ROCm 5.5 on MI250X

Andrey Alekseenko requested to merge aa-4874 into release-2023

Buffer splitting introduced in fb3e0b96 (!3104 (merged)) causes a significant performance slowdown with ROCm 5.5+ on MI250X (#4874).

Here we use templates to un-split the buffer for AMD devices while keeping the existing split-buffer code path for all other devices.
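The idea of choosing a buffer layout at compile time can be illustrated with a minimal sketch. It is not the code from this merge request: `Vendor`, `useSplitBuffers`, `Storage` and `fillAndSum` are hypothetical names invented for illustration, and plain host-side arrays stand in for whatever device buffers the real kernel uses. The sketch only shows how a boolean template parameter lets one code path use a single contiguous buffer while another keeps separate buffers, with no runtime branching on the vendor.

```cpp
#include <array>
#include <cstddef>

// Hypothetical vendor tag, used only in this sketch (not a GROMACS type).
enum class Vendor { Amd, Other };

// Assumption for illustration: only AMD devices use the un-split layout.
constexpr bool useSplitBuffers(Vendor v) { return v != Vendor::Amd; }

// Primary template, specialized below for the two layouts.
template<bool split, std::size_t N>
struct Storage;

// Split layout: one separate array per component.
template<std::size_t N>
struct Storage<true, N>
{
    static constexpr int numBuffers = 3;
    std::array<float, N> x{}, y{}, z{};
    float& at(std::size_t component, std::size_t i)
    {
        return component == 0 ? x[i] : (component == 1 ? y[i] : z[i]);
    }
};

// Un-split layout: all components in one contiguous array.
template<std::size_t N>
struct Storage<false, N>
{
    static constexpr int numBuffers = 1;
    std::array<float, 3 * N> data{};
    float& at(std::size_t component, std::size_t i) { return data[component * N + i]; }
};

// The calling code is identical for both layouts; the layout is
// fixed at compile time by the vendor template argument.
template<Vendor vendor, std::size_t N>
float fillAndSum()
{
    Storage<useSplitBuffers(vendor), N> storage;
    float sum = 0.0F;
    for (std::size_t c = 0; c < 3; ++c)
    {
        for (std::size_t i = 0; i < N; ++i)
        {
            storage.at(c, i) = static_cast<float>(c * N + i);
            sum += storage.at(c, i);
        }
    }
    return sum;
}
```

Calling `fillAndSum<Vendor::Amd, 4>()` and `fillAndSum<Vendor::Other, 4>()` produces the same result from two different memory layouts, which is the property that makes this kind of per-vendor specialization safe to apply.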

This is commit c828d428 (!3736 (merged)), cherry-picked from main to release-2023, with release notes added.

Refs #4593 (closed), #4854 (closed)
