
realm: identify shared_peers using uuid

Wei Wu requested to merge shared_peers_boot_id into master

This PR:

  1. Updates the RuntimeImpl::create_shared_peers function to use cat /proc/sys/kernel/random/boot_id (Linux) and system_profiler SPHardwareDataType (macOS), instead of the IPC mailbox, to identify which ranks are on the same physical node. Thanks to @cperry4 for the advice; this approach apparently gives accurate results even inside containers.
  2. Moves the RuntimeImpl::create_shared_peers call after network::attach, because the new implementation needs allgather, which is not available before attach in the GASNet1 module.
  3. Adds an allgather implementation for all network modules.
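The idea behind items 1 and 3 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: read_boot_id and find_shared_peers are hypothetical helper names, the allgather step is assumed to have already produced one ID string per rank, and treating a rank with an unreadable ID as having no shared peers is an assumption of this sketch. Only the Linux path is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read the per-boot ID of this node (Linux only).
// Returns an empty string if the file cannot be read.
std::string read_boot_id(
    const std::string& path = "/proc/sys/kernel/random/boot_id") {
  std::ifstream in(path);
  std::string id;
  if (in) std::getline(in, id);
  return id;
}

// Hypothetical helper: given the IDs collected from every rank (for example
// via an allgather), return the ranks whose ID matches the caller's.
// The caller's own rank is excluded from the result.
std::vector<std::size_t> find_shared_peers(
    const std::vector<std::string>& all_ids, std::size_t my_rank) {
  assert(my_rank < all_ids.size());
  std::vector<std::size_t> peers;
  const std::string& mine = all_ids[my_rank];
  // Assumption: an unknown (empty) ID means no peers can be proven shared.
  if (mine.empty()) return peers;
  for (std::size_t r = 0; r < all_ids.size(); ++r) {
    if (r != my_rank && all_ids[r] == mine) peers.push_back(r);
  }
  return peers;
}
```

Each rank would call read_boot_id(), exchange the result with all other ranks, and then pass the gathered vector to find_shared_peers with its own rank. Because the comparison needs every rank's ID, it can only run once the collective is available, which is why the call has to come after network::attach.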
Edited by Wei Wu
