Code Compilation
For code compilation you can choose one of two compilers: Intel oneAPI or GNU. Both compilers can be combined with the Intel MPI library.
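The compile commands below reference the source files hello.c, hello.f90, and hello.cpp, which are not shown on this page. As a minimal sketch (a hypothetical stand-in, not the actual files), a plain MPI hello.c could look like this:

/* hello.c - minimal MPI example (hypothetical sketch). Every rank prints one line. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                 /* initialize the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}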
Intel oneAPI compiler
plain MPI, icc
module load intel/19.0.5
module load impi/2019.5
mpiicc -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
mpiifort -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.f90
mpiicpc -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.cpp
hybrid MPI/OpenMP, icc
module load intel/19.0.5
module load impi/2019.5
mpiicc -qopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
mpiifort -qopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.f90
mpiicpc -qopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.cpp
GNU compiler
plain MPI, gcc
module load gcc/9.3.0
module load impi/2019.5
mpigcc -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
mpif90 -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.f90
mpigxx -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.cpp
hybrid MPI/OpenMP, gcc
module load gcc/9.3.0
module load impi/2019.5
mpigcc -fopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
mpif90 -fopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.f90
mpigxx -fopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.cpp
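Both hybrid variants above assume a source file that combines MPI with OpenMP threading. As a minimal, hypothetical sketch (not the actual hello.c used here), such a hybrid program could look like this:

/* hello.c - hybrid MPI/OpenMP sketch (hypothetical example).
   Each MPI rank opens an OpenMP parallel region; every thread reports
   its MPI rank and OpenMP thread number. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    #pragma omp parallel
    {
        printf("Rank %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }
    MPI_Finalize();
    return 0;
}

The number of threads per rank is taken from OMP_NUM_THREADS at run time, which is exactly what the job scripts in the next section set.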
Slurm job script
You need to start the MPI-parallelized code on the system. You can choose between two approaches, namely using mpirun or using srun.
Using mpirun
Using mpirun, the pinning is controlled by the MPI library. Pinning by Slurm needs to be switched off by adding export SLURM_CPU_BIND=none.
MPI only
MPI, full node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
mpirun -ppn 96 ./hello.bin
MPI, half node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export I_MPI_PIN_DOMAIN=core
export I_MPI_PIN_ORDER=scatter
mpirun -ppn 48 ./hello.bin
MPI, hyperthreading
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
mpirun -ppn 192 ./hello.bin
MPI, OpenMP
You can run one code compiled with MPI and OpenMP. The example covers the setup
- 2 nodes,
- 4 processes per node, 24 threads per process.
MPI, OpenMP compact, full node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=24
mpirun -ppn 4 ./hello.bin
The example covers the setup
- 2 nodes,
- 4 processes per node, 12 threads per process.
MPI, OpenMP scattered, half node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=12
mpirun -ppn 4 ./hello.bin
The example covers the setup
- 2 nodes,
- 4 processes per node using hyperthreading,
- 48 threads per process.
MPI, OpenMP hyperthreading
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
mpirun -ppn 4 ./hello.bin
Using srun
MPI only
MPI, full node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
srun --ntasks-per-node=96 ./hello.bin
MPI, half node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
srun --ntasks-per-node=48 ./hello.bin
MPI, OpenMP
You can run one code compiled with MPI and OpenMP. The example covers the setup
- 2 nodes,
- 4 processes per node, 24 threads per process.
MPI, OpenMP, full node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=24
srun --ntasks-per-node=4 --cpus-per-task=48 ./hello.bin
The example covers the setup
- 2 nodes,
- 4 processes per node, 12 threads per process.
MPI, OpenMP, half node
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=12
srun --ntasks-per-node=4 --cpus-per-task=24 ./hello.bin
The example covers the setup
- 2 nodes,
- 4 processes per node using hyperthreading,
- 48 threads per process.
MPI, OpenMP, hyperthreading
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
srun --ntasks-per-node=4 --cpus-per-task=48 ./hello.bin