Config: update arch-nersc-perlmutter-opt.py with --gpu-bind=none

To work around errors like the following:

[jczhang@nid002305 ~/petsc/src/snes/tutorials]$
$ srun -n 2 --gpus-per-task=1 -c 32 ./ex19 -snes_monitor -dm_mat_type aijcusparse -dm_vec_type cuda -pc_type gamg -ksp_monitor -mg_levels_ksp_max_it 1
lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
  0 SNES Function norm 2.391552133017e-01
(GTL DEBUG: 0) cuIpcOpenMemHandle: invalid argument, CUDA_ERROR_INVALID_VALUE, line no 360
(GTL DEBUG: 1) cuIpcOpenMemHandle: invalid argument, CUDA_ERROR_INVALID_VALUE, line no 360
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[1]PETSC ERROR: General MPI error
[1]PETSC ERROR: MPI error 1 Invalid buffer pointer
[1]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR: General MPI error
[0]PETSC ERROR: MPI error 1 Invalid buffer pointer
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR:   Option left: name:-mg_levels_ksp_max_it value: 1 source: command line
[0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.21.1-175-gcef0416b  GIT Date: 2024-05-15 19:29:27 +0000
[0]PETSC ERROR: /global/u1/j/jczhang/petsc/src/snes/tutorials/./ex19 on a arch-kokkos-dbg named nid002305 by jczhang Thu May 16 08:31:54 2024
[0]PETSC ERROR: Configure options --PETSC_ARCH=arch-kokkos-dbg --with-make-np=8 --with-mpiexec="srun -G4" --with-batch=0 --with-cc=cc --with-cxx=CC --with-fc=0 --COPTFLAGS="   -g -O3" --CXXOPTFLAGS=" -g -O3" --FOPTFLAGS="   -g -O3" --CUDAFLAGS="   -g -O3" --with-debugging=0 --with-cuda=1 --download-kokkos --download-kokkos-kernels
[0]PETSC ERROR: #1 PetscSFLinkFinishCommunication_Default() at /global/u1/j/jczhang/petsc/src/vec/is/sf/impls/basic/sfmpi.c:13
[0]PETSC ERROR: #2 PetscSFLinkFinishCommunication() at /global/homes/j/jczhang/petsc/include/../src/vec/is/sf/impls/basic/sfpack.h:291
[0]PETSC ERROR: #3 PetscSFReduceEnd_Basic() at /global/u1/j/jczhang/petsc/src/vec/is/sf/impls/basic/sfbasic.c:411
[0]PETSC ERROR: #4 PetscSFReduceEnd() at /global/u1/j/jczhang/petsc/src/vec/is/sf/interface/sf.c:1638
[0]PETSC ERROR: #5 PetscSFGatherEnd() at /global/u1/j/jczhang/petsc/src/vec/is/sf/interface/sf.c:1923
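
For illustration, the change named in the title could look roughly like the sketch below. It is not the actual diff: the option list is reduced to values copied from the "Configure options" line in the log above, the placement of `--gpu-bind=none` inside `--with-mpiexec` is an assumption, and `mpiexec_command` is a hypothetical helper added only to make the effect easy to inspect. The likely reason the flag helps is that with `--gpus-per-task=1` each rank sees only its own GPU, so the CUDA IPC handle exchange done by the GPU-aware MPI layer (the `cuIpcOpenMemHandle` lines above) fails; `--gpu-bind=none` leaves all of the job's GPUs visible to every rank.

```python
# Sketch only: a reduced option list in the style of a PETSc configure script.
# Values other than --gpu-bind=none are copied from the "Configure options"
# line in the log; where the new flag goes is an assumption.
configure_options = [
    '--with-make-np=8',
    '--with-mpiexec=srun -G4 --gpu-bind=none',  # assumed placement of the new flag
    '--with-batch=0',
    '--with-cc=cc',
    '--with-cxx=CC',
    '--with-fc=0',
    '--with-debugging=0',
    '--with-cuda=1',
    '--download-kokkos',
    '--download-kokkos-kernels',
]


def mpiexec_command(options):
    """Hypothetical helper: return the launcher string from --with-mpiexec, or None."""
    prefix = '--with-mpiexec='
    for opt in options:
        if opt.startswith(prefix):
            return opt[len(prefix):]
    return None


if __name__ == '__main__':
    # Show the launcher that tests and examples would be run with.
    print(mpiexec_command(configure_options))
```

The same flag can also be passed directly when launching by hand, by adding `--gpu-bind=none` to the `srun` options of the command shown above.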
Edited by Junchao Zhang