libstd for nvptx64-nvidia-cuda target

Currently (0.3.0), device code works with the libcore and alloc crates, but not with libstd. This restriction is too strong for easy use. For example, we cannot use crates written for the CPU in device code, e.g. https://github.com/termoshtt/eom.
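
To illustrate the restriction, here is a minimal sketch of code that stays within `core` and `alloc`. The function `euler_step` is a hypothetical helper made up for this example; it is not taken from eom or from this project. Code in this style should be usable from device code today, while anything that reaches into `std` (directly or via a dependency) is not.

```rust
// Sketch only: every path below comes from `core` or `alloc`, never `std`,
// so the same function could in principle live in a `#![no_std]` crate.
extern crate alloc;

use alloc::vec::Vec;

/// Hypothetical helper: one explicit Euler step for dx/dt = -x.
pub fn euler_step(x: &[f64], dt: f64) -> Vec<f64> {
    // `Vec` and `collect` are provided by `alloc` and `core` respectively.
    x.iter().map(|&xi| xi + dt * (-xi)).collect()
}
```

A crate written for the CPU typically uses `std::vec::Vec` and other `std` paths instead, which is why it cannot be pulled into device code as-is even when the logic itself needs nothing beyond `core` and `alloc`.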