Disable tree reduction on GPU.
Even for moderately sized inputs, running the tree reduction quickly
overflows the per-thread GPU stack space, leading to memory errors.
This was observed in the cxx11_tensor_complex_gpu test, for example.
Disabling tree reduction on GPU fixes this.
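
For illustration only, a minimal sketch of the idea, not the actual change: a recursive tree reduction that is compiled out when building for GPU, with a flat loop used instead. The macro SKETCH_GPU_COMPILE, the leaf size, and all function names are hypothetical and are not Eigen's identifiers.

```cpp
#include <cstddef>

// Hypothetical switch: assume the build defines SKETCH_GPU_COMPILE when
// compiling device code. This is not a real Eigen macro.
#ifndef SKETCH_GPU_COMPILE
#define SKETCH_USE_TREE_REDUCTION 1
#else
#define SKETCH_USE_TREE_REDUCTION 0
#endif

// Illustrative leaf size; below this the reduction falls back to a loop.
constexpr std::size_t kSketchLeafSize = 4;

// Flat reduction: constant stack usage regardless of input size.
inline float reduce_linear(const float* data, std::size_t n) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) acc += data[i];
  return acc;
}

#if SKETCH_USE_TREE_REDUCTION
// Tree reduction: splits the range in half and recurses. Each level of
// recursion consumes stack, which is what becomes a problem when the
// per-thread stack is small.
inline float reduce_tree(const float* data, std::size_t n) {
  if (n <= kSketchLeafSize) return reduce_linear(data, n);
  const std::size_t half = n / 2;
  return reduce_tree(data, half) + reduce_tree(data + half, n - half);
}
#endif

// Entry point: picks the tree path only where it is enabled.
inline float reduce_sum(const float* data, std::size_t n) {
#if SKETCH_USE_TREE_REDUCTION
  return reduce_tree(data, n);
#else
  return reduce_linear(data, n);
#endif
}
```

With the hypothetical macro defined, reduce_sum never recurses, so its stack usage does not grow with the input size.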