For the PVC partition, Intel MPI is the preferred GPU-aware MPI implementation. Load an impi environment module to make it available.
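For example, a minimal sketch of the module commands (the impi/2021.11 version string is taken from the example job script below and may differ on your system):

```bash
# list the Intel MPI modules installed on the system
module avail impi
# load one of the listed versions (2021.11 is the version used in the example below)
module load impi/2021.11
```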
To enable GPU support, set the environment variable I_MPI_OFFLOAD to "1" in your job script. If you use GPUs on multiple nodes, it is strongly recommended to use the psm3 libfabric provider (FI_PROVIDER=psm3), as shown in the example job script below.
Depending on your application’s needs, set I_MPI_OFFLOAD_CELL to either tile or device to assign each MPI rank either a single tile or the whole GPU device.
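If you opt for one rank per device instead, the following sketch shows the corresponding settings; the assumption of 4 PVC GPUs (8 tiles) per node is taken from the example job script below and should be adjusted to the actual hardware:

```bash
# assign each MPI rank a whole PVC GPU instead of a single tile
export I_MPI_OFFLOAD=1
export I_MPI_OFFLOAD_CELL=device
# with 4 GPUs per node (assumed, see the example job script), request
# --ntasks-per-node=4 instead of 8
```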
It is recommended to check the pinning by setting I_MPI_DEBUG to (at least) 3 and I_MPI_OFFLOAD_PRINT_TOPOLOGY to 1.
Refer to the Intel MPI documentation on GPU support for further information.
Example Job Script:
```bash
#!/bin/bash
# example to use 2 x (2 x 4) = 16 MPI processes, each assigned
# to one of the two tiles (stacks) of a PVC GPU
#SBATCH --partition=gpu-pvc
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --job-name=pin-check
# required for usage of Intel GPUs
module load intel
# load the Intel MPI module
module load impi/2021.11
# libfabric provider recommended for multi-node GPU runs
export FI_PROVIDER=psm3
# to enable GPU support in Intel MPI
export I_MPI_OFFLOAD=1
# assign each rank a tile of a GPU
export I_MPI_OFFLOAD_CELL=tile
# for checking the process pinning
export I_MPI_DEBUG=3
export I_MPI_OFFLOAD_PRINT_TOPOLOGY=1
mpirun ./application
```
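To verify the pinning, submit the script and inspect the report that Intel MPI writes to the job's standard output. A minimal sketch, assuming the script is saved as jobscript.sh and no --output option is set (so Slurm writes to its default slurm-<jobid>.out file); the grep pattern is an assumption about the wording of the I_MPI_DEBUG messages:

```bash
# submit the example job script
sbatch jobscript.sh
# after the job has finished, look for the pinning / GPU topology report
# ("MPI startup" is an assumed marker of the I_MPI_DEBUG output lines)
grep "MPI startup" slurm-*.out
```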