Includes changes needed to support realtime on GPUs.
The mul3 kernel is significantly faster than running two mul2 kernels back to back.
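
A plausible reason for the speedup, sketched on the CPU for illustration only. This assumes mul2 and mul3 are elementwise multiplies over two and three inputs; the real kernel signatures are not shown in this change, so the function bodies below are hypothetical stand-ins and not the GPU code itself. Under that assumption, two mul2 passes touch 6N elements (4N reads, 2N writes) and need an intermediate buffer, while a single mul3 pass touches 4N elements (3N reads, N writes) and needs one launch instead of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a two-input kernel: out[i] = a[i] * b[i].
// Each call reads two arrays and writes one.
void mul2(const std::vector<float>& a, const std::vector<float>& b,
          std::vector<float>& out) {
    assert(a.size() == b.size());
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] * b[i];
    }
}

// Hypothetical stand-in for the fused kernel: out[i] = a[i] * b[i] * c[i].
// One pass reads three arrays and writes one; the intermediate product
// stays in a register instead of making a round trip through memory.
void mul3(const std::vector<float>& a, const std::vector<float>& b,
          const std::vector<float>& c, std::vector<float>& out) {
    assert(a.size() == b.size() && b.size() == c.size());
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = a[i] * b[i] * c[i];
    }
}
```

Calling `mul2(a, b, tmp)` followed by `mul2(tmp, c, out)` gives the same result as `mul3(a, b, c, out)`; the fused form just skips writing and re-reading `tmp`.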