OpenCL testing broken in CI after AMD GPU upgrade
Summary
The AMD GPUs in our k8s cluster were upgraded from GCN (gfx8xx) to RDNA2 (gfx1034). The new devices have 32-wide sub-groups, which our OpenCL backend does not support.
The SYCL backends (DPC++ and OpenSYCL) are fine.
Fixing the OpenCL NBNXM kernels seems like too much effort for a deprecated backend.
Exact steps to reproduce
Look at the status of the gcc-*:test pipeline: it fails due to a lack of compatible devices.
For developers: Why is this important?
We can't merge things until this is fixed.
If this is a bug, (1) what happens, and (2) what did you expect to happen?
(1) Pre-merge and other pipelines always fail
(2) Pre-merge and other pipelines pass unless there is an actual bug
Possible fixes
- In main, add a flag to force compiling kernels in Wave64 mode: !3532 (merged)
- Disable the job for the release-2022 and release-2023 branches: !3535 (merged), !3538 (merged)
- Once Intel GPUs (or some other OpenCL-compatible GPUs) are added to CI, re-enable the tests
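
To illustrate the first fix, here is a minimal sketch of a device compatibility check with a "force Wave64" override. It is not the code from !3532: every name in it (`DeviceInfo`, `effectiveSubGroupWidth`, `isDeviceCompatible`, `canCompileWave64`) is hypothetical, and it assumes that the AMD code path only works with 64-wide sub-groups.

```cpp
#include <cassert>

// Hypothetical types and names for illustration; this is not the GROMACS API.
enum class Vendor
{
    Amd,
    Intel,
    Nvidia
};

struct DeviceInfo
{
    Vendor vendor;
    int    nativeSubGroupWidth; // 64 on the old GCN devices, 32 on the new RDNA2 ones
    bool   canCompileWave64;    // assumption: the compiler accepts a Wave64 build option
};

// Assumption for this sketch: the AMD kernels require 64-wide sub-groups.
constexpr int c_requiredAmdSubGroupWidth = 64;

// Returns the sub-group width the kernels would be built for,
// or 0 if the device cannot be used at all.
int effectiveSubGroupWidth(const DeviceInfo& device, bool forceWave64)
{
    assert(device.nativeSubGroupWidth > 0);
    if (device.vendor != Vendor::Amd)
    {
        // Other vendors are outside the scope of this sketch: keep their native width.
        return device.nativeSubGroupWidth;
    }
    if (device.nativeSubGroupWidth == c_requiredAmdSubGroupWidth)
    {
        return c_requiredAmdSubGroupWidth;
    }
    // Narrower native width: usable only if Wave64 compilation is explicitly forced.
    if (forceWave64 && device.canCompileWave64)
    {
        return c_requiredAmdSubGroupWidth;
    }
    return 0;
}

bool isDeviceCompatible(const DeviceInfo& device, bool forceWave64)
{
    return effectiveSubGroupWidth(device, forceWave64) != 0;
}
```

With these assumptions, a device described as `{Vendor::Amd, 32, true}` is rejected by default, which matches the "lack of compatible devices" failure, and is accepted once `forceWave64` is set.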
Edited by Andrey Alekseenko