recommend how to run non-MPI worker daemons on a Slurm cluster
Many new and interesting programming frameworks require starting a worker daemon on each node in a cluster. The obvious choice is to use `srun`, but that does not work. Quoting from the tutorial regarding Apache Spark:
> Next, start one worker on each compute node. This is a little ugly; `mpirun` will wait until everything is finished before returning, but we want to start the workers in the background, so we add `&` and introduce a race condition. (`srun` has different, even less helpful behavior: it kills the worker as soon as it goes into the background.)
>
>     $ mpirun -map-by '' -pernode ch-run -b ~/sparkconf /var/tmp/spark -- \
>       /spark/sbin/start-slave.sh $MASTER_URL &
In this case, the script `start-slave.sh` will daemonize a worker child and then exit, at which point `srun` kills the worker.
However, even if the script didn't daemonize itself, `srun` cannot be backgrounded with `&`, because any subsequent `srun` will wait for the first one to complete, even though the first is running in the background.
Use cases so far include Spark and FUSE.
The `mpirun` workaround above is awkward because MPI shouldn't be needed simply to start a process on each node. Other workarounds include `pdsh`, GNU Parallel, and similar tools.
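For comparison, a plain `ssh` loop falls in the same family of workarounds. The sketch below is untested against Spark and rests on assumptions that are not part of this issue: that `ssh` between the allocated nodes is permitted without a password, and that `scontrol show hostnames` is available to expand `$SLURM_JOB_NODELIST`. The function names and the `DRY_RUN` variable are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: start one worker per node without srun or mpirun.
# Assumes passwordless ssh between allocated nodes (site policy may forbid it).

# Print the hostnames in the current Slurm allocation, one per line.
list_nodes () {
    scontrol show hostnames "$SLURM_JOB_NODELIST"
}

# start_on_nodes CMD NODE...
# Run CMD once on each NODE via ssh. If the (hypothetical) DRY_RUN variable
# is non-empty, print the ssh command lines instead of running them.
start_on_nodes () {
    cmd=$1
    shift
    for node in "$@"; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "ssh $node $cmd"
        else
            # -n redirects ssh's stdin from /dev/null.
            ssh -n "$node" "$cmd" || echo "failed on $node" >&2
        fi
    done
}

# Possible use inside a job script (command taken from the tutorial quote):
#   start_on_nodes "ch-run -b ~/sparkconf /var/tmp/spark -- \
#       /spark/sbin/start-slave.sh $MASTER_URL" $(list_nodes)
```

Like `pdsh`, this bypasses Slurm's own launcher, so it carries the same drawback of not being the native way to start processes on the nodes.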
Ideally, we would figure out how to make `srun` do what we want, since that's the native way to start processes on Slurm nodes.
This issue is to figure out a recommendation and propagate it through the documentation and examples.
See also #156 and #160. The former includes a reproducer script for the Spark behavior.