Fix broken tensor executor test, allow tensor packets of size 1.
The cxx11_tensor_executor test assumed vectorization was always possible for the given types, which is not the case on every platform. Added a check via packet_traits<T>.
Also modified the tensor ops to allow PacketSize == 1, as is the
case for Packet1cd. The README previously said complex was known
to be broken, but fixes for that have since been added.
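
For illustration only, the kind of check described above can be sketched as follows. This is a self-contained approximation, not the code in this change: packet_traits_sketch is a hypothetical stand-in for packet_traits<T>, and the per-type values (including the float packet size of 4) are assumptions chosen for the example rather than what any particular platform reports.

```cpp
#include <complex>

// Hypothetical stand-in for packet_traits<T>. The default says
// "not vectorizable, packet size 1"; the specializations below use
// assumed values, not values queried from a real platform.
template <typename T>
struct packet_traits_sketch {
  static constexpr bool Vectorizable = false;
  static constexpr int size = 1;
};

template <>
struct packet_traits_sketch<float> {
  static constexpr bool Vectorizable = true;
  static constexpr int size = 4;  // assumed, e.g. a 128-bit register
};

template <>
struct packet_traits_sketch<std::complex<double>> {
  static constexpr bool Vectorizable = true;
  static constexpr int size = 1;  // one complex<double> per packet, as with Packet1cd
};

// A test would only request the vectorized path when the trait allows it.
template <typename T>
constexpr bool can_vectorize() {
  return packet_traits_sketch<T>::Vectorizable;
}

// Number of elements covered by whole packets. With a packet size of 1
// every element is covered and the scalar remainder is empty.
template <typename T>
constexpr int packet_covered(int n) {
  return (n / packet_traits_sketch<T>::size) * packet_traits_sketch<T>::size;
}

static_assert(can_vectorize<float>(), "float is vectorizable in this sketch");
static_assert(!can_vectorize<long double>(), "falls back to the default trait");
static_assert(packet_covered<std::complex<double>>(7) == 7, "size-1 packets cover all");
static_assert(packet_covered<float>(7) == 4, "three elements left for the scalar tail");
```

In this sketch a vectorizable type with a packet size of 1 is treated like any other vectorizable type; it simply leaves no scalar tail.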