update Kokkos profiling doc with nvtx-connector info
This PR/MR updates the Kokkos profiling doc with NVIDIA Nsight Systems profiling information. The reasons for these updates are:
- The `nvprof` connector is deprecated and has been removed from the latest Kokkos tools; a new NVTX connector is included instead.
- Setting `CUDA_ROOT` is necessary because the `cudatoolkit` module does not set this environment variable. Without `CUDA_ROOT`, the Kokkos tools fail to build.
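
For reviewers who want to reproduce the `CUDA_ROOT` issue, here is a minimal sketch of one way to set the variable before building the tools. It assumes `nvcc` is on `PATH` once the `cudatoolkit` module is loaded and that it sits in `<root>/bin/nvcc`. Neither assumption is confirmed by this PR, and `derive_cuda_root` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: derive the toolkit root from the full path to nvcc.
# Assumes the conventional layout <root>/bin/nvcc.
derive_cuda_root() {
    # Strip the last two path components (bin/nvcc).
    dirname "$(dirname "$1")"
}

# Only set CUDA_ROOT if the environment does not already provide it.
if [ -z "${CUDA_ROOT:-}" ]; then
    nvcc_path="$(command -v nvcc || true)"
    if [ -n "$nvcc_path" ]; then
        CUDA_ROOT="$(derive_cuda_root "$nvcc_path")"
        export CUDA_ROOT
    else
        echo "nvcc not found on PATH; set CUDA_ROOT by hand" >&2
    fi
fi

echo "CUDA_ROOT=${CUDA_ROOT:-unset}"
```

If the site installs the toolkit in a different layout, exporting `CUDA_ROOT` explicitly to the known install path is the safer choice.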