Code execution
For code execution examples, see the Slurm pages for the CPU Genoa and CPU CLX partitions.
Code compilation
To compile code, please use the GNU compiler.
...
Codeblock | ||||
---|---|---|---|---|
| ||||
module load gcc/13.3.0
module load openmpi/gcc/5.0.3
mpicc -fopenmp -Wl,-rpath,$LD_RUN_PATH -o hello.bin hello.c
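The compile line above expects a source file hello.c, which this page does not show. The following is a minimal sketch of what such a file could look like, written from a shell heredoc so it can be pasted into a terminal. The program itself is an assumption for illustration (a hybrid MPI + OpenMP "hello" that prints one line per thread), not the site's reference code.

```shell
#!/bin/bash
# Write a small hybrid MPI + OpenMP test program to hello.c.
# The C source below is a hypothetical example, not taken from this page.
cat > hello.c <<'EOF'
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, size;

    /* Only the main thread makes MPI calls, so MPI_THREAD_FUNNELED is sufficient. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each OpenMP thread of each MPI rank prints one line. */
    #pragma omp parallel
    {
        printf("rank %d of %d, thread %d of %d\n",
               rank, size, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}
EOF
echo "wrote hello.c"
```

After the modules are loaded, the mpicc command shown above should build this file into hello.bin.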
Slurm job script
A Slurm job script is submitted to the Slurm job scheduler. It contains
- the request for compute nodes of a Slurm partition (CPU CLX or CPU Genoa) and
- commands to start your binary. You have two options to start an MPI binary:
  - using srun (recommended when using Open MPI)
  - using mpirun
...
When using mpirun (from the MPI library) to start the binary, you need to switch off Slurm binding by adding export SLURM_CPU_BIND=none to the job script.
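As a small sketch of where this setting goes, the lines below export the variable and then confirm that a child process sees it. In a real job script the export would sit before the mpirun line. The echo is only an illustrative check and is not required.

```shell
#!/bin/bash
# Switch off Slurm binding, as required when starting the binary with mpirun.
export SLURM_CPU_BIND=none

# Illustrative check: exported variables are inherited by child processes,
# which is how a later mpirun call would see the setting.
bash -c 'echo "SLURM_CPU_BIND=${SLURM_CPU_BIND}"'
```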
...
Using srun
When using Open MPI on the CPU Genoa partitions, you can benefit from Open MPI's support for Slurm. Resource specifications given in batch jobs or job steps (srun), such as the number of tasks, are understood by the MPI library.
Codeblock | ||||||
---|---|---|---|---|---|---|
| ||||||
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=cpu-genoa
#SBATCH --ntasks-per-node=192
srun ./hello.bin
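To see which resource values a job actually received, it can help to print the environment variables that Slurm sets for the job. The sketch below uses standard Slurm variable names. The helper function show_slurm_resources is a hypothetical name introduced here, and a variable is reported as "unset" when the script runs outside a job or the matching option was not requested.

```shell
#!/bin/bash
# Print the resource values Slurm exports into the job environment.
# show_slurm_resources is an illustrative helper, not part of Slurm.
show_slurm_resources() {
    echo "nodes: ${SLURM_JOB_NUM_NODES:-unset}"
    echo "tasks: ${SLURM_NTASKS:-unset}"
    echo "tasks per node: ${SLURM_NTASKS_PER_NODE:-unset}"
    echo "cpus per task: ${SLURM_CPUS_PER_TASK:-unset}"
}

show_slurm_resources
```

Calling the function at the top of a job script, before srun, gives a quick record of the layout in the job output.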
You can also run hybrid codes, i.e. applications using both MPI and OpenMP. The example covers the following setup:
- 2 nodes
- 8 (MPI) processes per node, 24 (OpenMP) threads per process.
Note that
Codeblock | ||||
---|---|---|---|---|
|
...
| |||
#!/bin/bash
#SBATCH --partition=cpu-genoa
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=24
# Avoid using hyperthreads
#SBATCH --hint=nomultithread
# Set the number of OpenMP threads to the number of CPUs per task requested from Slurm
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Ensure proper binding of OpenMP threads
export OMP_PROC_BIND=true
export OMP_PLACES=cores
srun ./hello.bin
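The numbers in a hybrid layout have to fit the node: processes per node times threads per process should not exceed the physical cores of a node. The sketch below checks this arithmetic. It assumes 192 cores per Genoa node, a value inferred from the full-node example (--ntasks-per-node=192) rather than stated on this page, and check_layout is a hypothetical helper name.

```shell
#!/bin/bash
# Assumption: a Genoa node offers 192 physical cores (inferred from the full-node example).
CORES_PER_NODE=192

# check_layout NODES TASKS_PER_NODE CPUS_PER_TASK
# Prints the total number of MPI ranks and the cores used per node,
# and returns non-zero if the layout asks for more cores than a node has.
check_layout() {
    local nodes=$1 tasks_per_node=$2 cpus_per_task=$3
    local ranks=$(( nodes * tasks_per_node ))
    local used=$(( tasks_per_node * cpus_per_task ))
    echo "ranks=${ranks} cores_per_node_used=${used}"
    [ "${used}" -le "${CORES_PER_NODE}" ]
}

check_layout 2 8 24   # prints: ranks=16 cores_per_node_used=192
```

With 8 processes and 24 threads each, the layout uses exactly the assumed 192 cores per node, so no hyperthreads are needed.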
You can run a code compiled with both MPI and OpenMP. The example covers the following setup:
- 2 nodes,
- 4 processes per node, 24 threads per process.
...
Using mpirun
With mpirun, you can control process binding, mapping, and ranking through its command-line arguments. Refer to the man page of the respective MPI library for details.
Codeblock | ||||||
---|---|---|---|---|---|---|
| ||||||
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=cpu-genoa
module load openmpi/gcc/5.0.3
...
...
# MPI, full node
...
# Run 384 MPI processes, distributed blockwise with ranks 0-191 on the first node and the remaining ones on the second.
# Bind processes to cores for potentially better performance.
mpirun -np 384 --map-by ppr:192:node --bind-to core ./hello.bin
You can also run hybrid codes, i.e. applications using both MPI and OpenMP. The example covers the following setup:
- 2 nodes,
- 8 processes per node, 24 threads per process.
Codeblock | ||||||
---|---|---|---|---|---|---|
| ||||||
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=cpu-genoa
module load openmpi/gcc/5.0.3
export OMP_NUM_THREADS=24
# Ensure proper binding of OpenMP threads
export OMP_PROC_BIND=true
export OMP_PLACES=cores
# Bind processes to cores for potentially better performance.
mpirun -np 16 --map-by ppr:8:node:pe=${OMP_NUM_THREADS} --bind-to core ./hello.bin
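The two mpirun examples follow one pattern: the value of -np is nodes times processes per node, and hybrid runs add pe=<threads> to the mapping. As a sketch of that relationship, the hypothetical helper mpirun_cmd below only prints the command line and does not launch anything. It uses only the mpirun options that appear in the examples on this page.

```shell
#!/bin/bash
# mpirun_cmd NODES PROCS_PER_NODE [THREADS_PER_PROC] [BINARY]
# Prints (does not run) an mpirun command line for the given layout.
# mpirun_cmd is an illustrative helper, not part of Open MPI.
mpirun_cmd() {
    local nodes=$1 ppr=$2 threads=${3:-1} binary=${4:-./hello.bin}
    local map="ppr:${ppr}:node"
    # For hybrid runs, reserve one core per OpenMP thread for each process.
    if [ "${threads}" -gt 1 ]; then
        map="${map}:pe=${threads}"
    fi
    echo "mpirun -np $(( nodes * ppr )) --map-by ${map} --bind-to core ${binary}"
}

mpirun_cmd 2 192      # pure MPI, full nodes
mpirun_cmd 2 8 24     # hybrid: 8 processes per node, 24 threads each
```

Printing the command first makes it easy to check the process count against the node request before submitting the job.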