Building (and making visible) software for accelerators

While we have initial support for NVIDIA GPUs, some packages, such as LAMMPS, cannot be built "fat" (that is, as a single build that selects the appropriate CUDA compute capability at runtime). This complicates EESSI: we need a way of making individual builds for each CUDA compute capability (e.g., by adding a versionsuffix), and we then rely on the user to make the correct choice for their hardware. It also creates a bit of a conundrum, as the "same software everywhere" concept (which seems feasible for CPUs) becomes far more problematic in this scenario. Do we even know what "same software everywhere" means for accelerators?

In addition to this, we will need to support a wider range of different (and potentially novel) accelerators in the future. Having thought about this a little, I have a proposal to make:

- We treat accelerator installations differently
  - Accelerator installations go into a different repository
  - For NVIDIA, installations are made into a CUDA compute capability-specific installation subdirectory
    - Somehow need to trigger repeated builds for each compute capability
- We provide no "same software everywhere" guarantee for accelerator software
- Tweak the UI so that users only see software they can actually use (which I guess boils down to a conditional module use ... statement)
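
To make the subdirectory and conditional module use ideas a bit more concrete, here is a minimal Python sketch. Everything about the layout is an assumption for illustration only: the repository root `/path/to/accel-repo`, the `nvidia/cc<major><minor>/modules/all` naming, and the helper names are hypothetical, not an agreed EESSI convention. It also assumes an NVIDIA driver recent enough that `nvidia-smi` supports the `compute_cap` query field.

```python
import subprocess
from pathlib import Path

# Hypothetical root of the separate accelerator repository (assumption).
ACCEL_ROOT = Path("/path/to/accel-repo")


def cc_subdir(compute_capability: str) -> str:
    """Map a compute capability such as '8.0' to a subdirectory name 'cc80'."""
    major, minor = compute_capability.strip().split(".")
    return f"cc{int(major)}{int(minor)}"


def parse_compute_caps(nvidia_smi_output: str) -> list[str]:
    """Return the unique compute capabilities, in order (one line per GPU)."""
    caps = []
    for line in nvidia_smi_output.splitlines():
        line = line.strip()
        if line and line not in caps:
            caps.append(line)
    return caps


def detect_compute_caps() -> list[str]:
    """Ask nvidia-smi for the compute capability of each GPU on this host.

    Returns an empty list if nvidia-smi is missing or the query fails.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_compute_caps(result.stdout)


def module_dir(root: Path, compute_capability: str) -> Path:
    """Hypothetical module directory for one compute capability."""
    return root / "nvidia" / cc_subdir(compute_capability) / "modules" / "all"


def module_use_lines(root: Path, caps: list[str]) -> list[str]:
    """Emit a 'module use' line only for directories that actually exist."""
    lines = []
    for cap in caps:
        moddir = module_dir(root, cap)
        if moddir.is_dir():
            lines.append(f"module use {moddir}")
    return lines


if __name__ == "__main__":
    # On a host without a supported GPU this prints nothing, so the user
    # never sees accelerator modules they cannot run.
    for line in module_use_lines(ACCEL_ROOT, detect_compute_caps()):
        print(line)
```

The same `cc_subdir` mapping could in principle be reused on the build side to pick the installation prefix for each compute capability, though how to trigger those repeated builds is still the open question above.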

Edited Apr 02, 2024 by ocaisa