WIP: Replace GpuEventSynchronizer with DeviceEventSynchronizer
requested to merge 2527-gpueventsynchronizer-improvements into 3924-converge-gpueventsynchronizer-and-commandevent
Major differences:
- `markEvent` and `enqueueWaitEvent` functionality moved to `DeviceStream` class.
- `DeviceEventSynchronizer` is backend-agnostic, no more different implementations for CUDA, OpenCL, and SYCL.
- More configurable consumption counting. Previously, we had 1:1 on OpenCL and SYCL, and no counting on CUDA.
- Added a check whether the event was marked.
Notes:
- The new counting feature has not been properly enabled yet to avoid breaking CUDA code.
- Event marking is not used.
These limitations will be addressed in follow-up MRs. Here we aim to preserve the existing logic.
Overall, the idea is that we can do any synchronization we want with `DeviceEvent` and `DeviceStream`, while `DeviceEventSynchronizer` serves as a convenience wrapper with additional producer-consumer accounting.
We might also wish to replace some uses of `GpuEventSynchronizer` with a simple `DeviceEvent` when no accounting is needed.
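
To illustrate the split between raw event/stream operations and the accounting wrapper, here is a rough, backend-free sketch. It is not the code from this MR: `MockEvent`, `MockStream`, `ToyEventSynchronizer`, and the min/max consumption parameters are hypothetical names made up for this example, and no real GPU API is called.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a backend event; it only remembers whether it was marked.
struct MockEvent
{
    bool marked = false;
};

// Hypothetical stand-in for a stream that owns the mark and wait operations.
struct MockStream
{
    int waitsEnqueued = 0;

    void markEvent(MockEvent& event) { event.marked = true; }

    void enqueueWaitEvent(const MockEvent& event)
    {
        if (!event.marked)
        {
            throw std::logic_error("waiting on an event that was never marked");
        }
        ++waitsEnqueued;
    }
};

// Toy wrapper: one event plus producer-consumer accounting (an assumption, not the MR's class).
class ToyEventSynchronizer
{
public:
    ToyEventSynchronizer(int minConsumptions, int maxConsumptions) :
        min_(minConsumptions), max_(maxConsumptions)
    {
        assert(0 <= min_ && min_ <= max_);
    }

    // Producer side: refuse to re-mark until the previous mark was consumed often enough.
    void markEvent(MockStream& stream)
    {
        if (event_.marked && consumed_ < min_)
        {
            throw std::logic_error("event re-marked before it was consumed enough times");
        }
        stream.markEvent(event_);
        consumed_ = 0;
    }

    // Consumer side: refuse to consume one mark more often than allowed.
    void enqueueWaitEvent(MockStream& stream)
    {
        if (consumed_ >= max_)
        {
            throw std::logic_error("event consumed too many times");
        }
        stream.enqueueWaitEvent(event_);
        ++consumed_;
    }

    int consumptionCount() const { return consumed_; }

private:
    MockEvent event_;
    int       min_;
    int       max_;
    int       consumed_ = 0;
};
```

In this sketch, `ToyEventSynchronizer(1, 1)` mimics the strict 1:1 pairing, while a loose range such as `ToyEventSynchronizer(0, 1000)` stands in for the case with effectively no counting. Code that needs no accounting can use `MockStream` and `MockEvent` directly.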
Closes #2527
Edited by Andrey Alekseenko