cuda10.2-vectorAdd should be a multistage build

The current Dockerfile should be a multistage build that copies only the generated binary into the final image. Something along the lines of:

# Samples provided with the CUDA toolkit.

# docker build -t cuda-vectoradd .
# docker run --gpus all -ti --rm cuda-vectoradd

FROM nvidia/cuda:10.2-devel-ubi8 as devel

RUN dnf update -y && dnf install -y git make && \
    rm -rf /var/cache/dnf/*

WORKDIR /usr/local/cuda-10.2

RUN git clone https://github.com/NVIDIA/cuda-samples.git && \
    cd cuda-samples && make -j"$(nproc)" -k || true 

FROM nvidia/cuda:10.2-devel-ubi8 as prod
ENV NVIDIA_VISIBLE_DEVICES all
# The samples are cloned under the devel stage's WORKDIR, so copy from there.
COPY --from=devel /usr/local/cuda-10.2/cuda-samples/Samples/vectorAdd_nvrtc/vectorAdd_nvrtc /vectorAdd_nvrtc

CMD ["/vectorAdd_nvrtc"]
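
To check that the split actually pays off, one option is to build the first stage on its own and compare its size with the final image. This is only a sketch: the `IMAGE` name and the `devel`/`latest` tags are placeholders chosen here (Docker requires lowercase repository names), and it assumes the Dockerfile above sits in the current directory.

```shell
# Placeholder repository name (assumption, not from the repo); must be lowercase.
IMAGE=cuda-vectoradd

if command -v docker >/dev/null 2>&1; then
    # Build only the first stage, which carries git, make and the full samples tree.
    docker build --target devel -t "$IMAGE:devel" .

    # Build the whole Dockerfile; the last stage becomes the final image.
    docker build -t "$IMAGE:latest" .

    # List both tags side by side to compare the SIZE column.
    docker image ls "$IMAGE"
else
    echo "docker not found; skipping image size comparison"
fi
```

If the two sizes come out close, the final stage is probably still inheriting most of its weight from its base image rather than from the build tools.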