Add Intel GPU tile partitioning support for Data Center GPU Max (!115) · Merge requests · HPCToolkit / QAHPCT

This commit adds support for Intel GPU tile (sub-device) partitioning on Intel Data Center GPU Max (Ponte Vecchio) systems. The feature allows fine-grained work distribution across GPU tiles for better resource utilization.

Key changes:

- Add 'tile' data descriptor to enable tile mode via SYCL_USE_INTEL_TILES=1
- Modify syclgpu.cc to partition Intel GPUs into NUMA-affine tiles when enabled
- Implement smart tile distribution for MPI+threads hybrid parallelism
- Add tile tests to smoke.level0 and full.level0 test suites
- Create missing MPI psxthreads.sycloffload.ipcx directory and Makefile
- Update documentation in DD_README.md and minitest/README.md

The implementation maintains backward compatibility: tile mode is activated only when explicitly requested via the 'tile' descriptor or the SYCL_USE_INTEL_TILES environment variable. On systems without Intel GPUs or without tile support, the code falls back to standard GPU mode.
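The opt-in and fallback behavior described above can be sketched in plain C++. This is only an illustration, not the code in syclgpu.cc: the helper names (is_enabled_flag, tile_mode_requested, choose_gpu_mode), the GpuMode enum, and the rule that only the exact value "1" enables tile mode are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: treat only the exact string "1" as "enabled".
// Whether the real code accepts other values is not stated in this commit.
bool is_enabled_flag(const char* value) {
    return value != nullptr && std::string(value) == "1";
}

// Hypothetical helper: tile mode is requested either by the 'tile'
// descriptor or by the SYCL_USE_INTEL_TILES environment variable.
bool tile_mode_requested(bool tile_descriptor_present) {
    return tile_descriptor_present ||
           is_enabled_flag(std::getenv("SYCL_USE_INTEL_TILES"));
}

enum class GpuMode { Standard, Tile };

// Use tile mode only when it was requested AND partitioning produced at
// least one tile; otherwise fall back to standard GPU mode.
GpuMode choose_gpu_mode(bool tile_requested, int tiles_found) {
    assert(tiles_found >= 0);
    return (tile_requested && tiles_found > 0) ? GpuMode::Tile
                                               : GpuMode::Standard;
}
```

With these assumptions, choose_gpu_mode(true, 0) yields GpuMode::Standard, which models the fallback on systems where no tiles are available, and choose_gpu_mode(false, n) is always GpuMode::Standard, which models the backward-compatible default.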

Tile assignment formula for MPI+threads: gpu_index = (rank + threadnum * mpi_size) % total_tiles

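Because rank + threadnum * mpi_size is distinct for every (rank, thread) pair, no two pairs are assigned the same tile as long as the total number of threads across all ranks does not exceed total_tiles. Beyond that point the modulo wraps around and tiles are shared round-robin.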
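The formula can be checked with a small standalone sketch. The function names tile_index and assignment_is_collision_free are hypothetical and exist only for this example; the arithmetic is the formula quoted above, and the sketch assumes ranks are numbered 0 to mpi_size - 1 and threads 0 to num_threads - 1.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper mirroring the formula from the commit message:
// gpu_index = (rank + threadnum * mpi_size) % total_tiles
int tile_index(int rank, int threadnum, int mpi_size, int total_tiles) {
    assert(mpi_size > 0 && total_tiles > 0);
    assert(rank >= 0 && rank < mpi_size && threadnum >= 0);
    return (rank + threadnum * mpi_size) % total_tiles;
}

// Returns true when every (rank, thread) pair maps to a distinct tile.
bool assignment_is_collision_free(int mpi_size, int num_threads,
                                  int total_tiles) {
    std::set<int> used;
    for (int t = 0; t < num_threads; ++t) {
        for (int r = 0; r < mpi_size; ++r) {
            // insert().second is false if the tile index was already taken.
            if (!used.insert(tile_index(r, t, mpi_size, total_tiles)).second) {
                return false;
            }
        }
    }
    return true;
}
```

For example, 4 ranks with 2 threads each on 8 tiles occupy tile indices 0 through 7 exactly once, so assignment_is_collision_free(4, 2, 8) is true. With 3 threads per rank there are 12 workers for 8 tiles, so some tiles are necessarily shared and the check returns false.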

Signed-off-by: Yuning Xia <yx87@rice.edu>
