Code Compilation
For code compilation you can choose between two compilers, Intel or GNU. Both compilers can be used with the Intel MPI library.
Intel compiler
Codeblock |
---|
title | MPI, icc |
---|
collapse | true |
---|
|
module load intel/19.0.5
module load impi/2019.5
mpiicc -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
mpiifort -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.f90
mpiicpc -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.cpp |
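The compile commands above expect a source file such as hello.c, which this page does not show. As a minimal sketch, the following shell snippet writes a hypothetical hello.c (the file content is an assumption, not the original test program) that prints the rank of each MPI process.

```shell
# Write a minimal MPI "hello world" to hello.c (hypothetical example source).
cat > hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* index of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
echo "wrote hello.c"
```

After loading the modules, the file can be compiled with the mpiicc line shown above.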
...
Using mpirun
When using mpirun, the pinning is controlled by the MPI library. Pinning by Slurm must be switched off by adding export SLURM_CPU_BIND=none.
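To see which pinning the MPI library actually applies, the debug output of Intel MPI can help. The sketch below assumes that I_MPI_DEBUG at level 4 or higher prints the pinning map, and it starts the program only if mpirun and ./hello.bin are present; the process count is taken from the full node example.

```shell
# Let the MPI library control the pinning instead of Slurm.
export SLURM_CPU_BIND=none
# Assumption: Intel MPI reports process pinning at debug level 4 or higher.
export I_MPI_DEBUG=4

if command -v mpirun >/dev/null 2>&1 && [ -x ./hello.bin ]; then
    mpirun -ppn 96 ./hello.bin
else
    echo "mpirun or ./hello.bin not found, only the settings were exported" >&2
fi
```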
MPI only
Codeblock |
---|
title | MPI, full node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
mpirun -ppn 96 ./hello.bin |
Codeblock |
---|
title | MPI scattered, half node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export I_MPI_PIN_DOMAIN=core
export I_MPI_PIN_ORDER=scatter
mpirun -ppn 48 ./hello.bin |
Codeblock |
---|
title | MPI, hyperthreading |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
mpirun -ppn 192 ./hello.bin |
...
You can run a code compiled with MPI and OpenMP. The example covers the setup
- 2 nodes,
- 4 processes per node, 24 threads per process.
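A quick way to check such a layout before submitting is to multiply processes by threads and compare the result with the core count. The sketch below assumes 96 physical cores per node, as suggested by the full node examples; the function name fits_on_node is hypothetical.

```shell
# Succeed if procs x threads fits into the given number of cores.
fits_on_node() {
    local procs=$1 threads=$2 cores=$3
    [ $((procs * threads)) -le "$cores" ]
}

cores_per_node=96   # assumption: physical cores per node

if fits_on_node 4 24 "$cores_per_node"; then
    echo "4 processes x 24 threads fit on $cores_per_node cores"
else
    echo "layout oversubscribes the node" >&2
fi
```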
Codeblock |
---|
title | MPI, OpenMP compact, full node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=24
mpirun -ppn 4 ./hello.bin |
The example covers the setup
- 2 nodes,
- 4 processes per node, 12 threads per process.
Codeblock |
---|
title | MPI, OpenMP scattered, half node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=12
mpirun -ppn 4 ./hello.bin |
The example covers the setup
- 2 nodes,
- 4 processes per node using hyperthreading,
- 48 threads per process.
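The numbers of the hyperthreading setup can be reproduced with shell arithmetic. This is a sketch under the assumption of 96 physical cores per node with two hardware threads each, which matches the 192 processes per node used in the MPI only hyperthreading example.

```shell
cores_per_node=96      # assumption: physical cores per node
threads_per_core=2     # assumption: two hardware threads per core
logical_cpus=$((cores_per_node * threads_per_core))

procs_per_node=4
omp_threads=48
used=$((procs_per_node * omp_threads))

echo "logical CPUs per node: $logical_cpus, used by the job: $used"
```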
Codeblock |
---|
title | MPI, OpenMP hyperthreading |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
mpirun -ppn 4 ./hello.bin |
Using srun
MPI only
Codeblock |
---|
title | MPI, full node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
srun --ntasks-per-node=96 ./hello.bin |
Codeblock |
---|
title | MPI, half node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
srun --ntasks-per-node=48 ./hello.bin |
MPI, OpenMP
You can run a code compiled with MPI and OpenMP. The example covers the setup
- 2 nodes,
- 4 processes per node, 24 threads per process.
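In the srun examples of this section the value of --cpus-per-task appears to count logical CPUs, so without hyperthreading it is twice the number of OpenMP threads. The following sketch derives the srun line from the setup under that assumption; the variable names are hypothetical, and the command is only printed, not executed.

```shell
threads_per_core=2     # assumption: Slurm counts two logical CPUs per core
procs_per_node=4
omp_threads=24

# Reserve both hardware threads of every core that runs an OpenMP thread.
cpus_per_task=$((omp_threads * threads_per_core))

srun_cmd="srun --ntasks-per-node=${procs_per_node} --cpus-per-task=${cpus_per_task} ./hello.bin"
echo "export OMP_NUM_THREADS=${omp_threads}"
echo "$srun_cmd"
```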
Codeblock |
---|
title | MPI, OpenMP, full node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=24
srun --ntasks-per-node=4 --cpus-per-task=48 ./hello.bin |
The example covers the setup
- 2 nodes,
- 4 processes per node, 12 threads per process.
Codeblock |
---|
title | MPI, OpenMP, half node |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=12
srun --ntasks-per-node=4 --cpus-per-task=24 ./hello.bin |
The example covers the setup
- 2 nodes,
- 4 processes per node using hyperthreading,
- 48 threads per process.
Codeblock |
---|
title | MPI, OpenMP, hyperthreading |
---|
collapse | true |
---|
|
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=standard96:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
srun --ntasks-per-node=4 --cpus-per-task=48 ./hello.bin |