Optimize division operations in TensorVolumePatch
Describe the feature you would like to be implemented.
Remove one division from the index computation that generates a Packet for a given index in TensorVolumePatch.h.
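To illustrate the kind of change meant here, below is a minimal sketch, not the actual code in TensorVolumePatch.h. It assumes the computation splits a linear index into a quotient and a remainder; the names `PatchCoord`, `splitIndexTwoDivisions`, `splitIndexOneDivision`, and `patchSize` are hypothetical. The second function gets the remainder with a multiply and a subtract, reusing the quotient instead of issuing a second division.

```cpp
#include <cstdint>

// Hypothetical result type (not part of Eigen): the two pieces a linear
// index is split into.
struct PatchCoord {
  std::int64_t patchIndex;   // quotient: which patch the index falls in
  std::int64_t patchOffset;  // remainder: position inside that patch
};

// Baseline: a division for the quotient and a modulo for the remainder,
// i.e. two division-class operations as written.
inline PatchCoord splitIndexTwoDivisions(std::int64_t index,
                                         std::int64_t patchSize) {
  return {index / patchSize, index % patchSize};
}

// Variant: one division; the remainder is recovered from the quotient
// with a multiply and a subtract.
inline PatchCoord splitIndexOneDivision(std::int64_t index,
                                        std::int64_t patchSize) {
  const std::int64_t q = index / patchSize;
  return {q, index - q * patchSize};
}
```

Both functions return the same values for any non-zero `patchSize`. For built-in integer types a compiler may already fuse the `/` and `%` pair into a single instruction, so any gain would need to be confirmed by benchmarking the real code path.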
Would such a feature be useful for other users? Why?
Yes. Integer division is expensive relative to other integer arithmetic, so removing one division from this path should reduce the number of CPU cycles spent there.