...
HTML Comment |
---|
Commenting out this block, since Berlin and Göttingen have separate documentation pages now.
Codeblock |
---|
language | bash |
---|
title | For Intel Skylake CPU compute nodes (Phase 1, Göttingen only): |
---|
| #!/bin/bash
#SBATCH --time 12:00:00
#SBATCH --nodes 2
#SBATCH --tasks-per-node 40
export SLURM_CPU_BIND=none
module load impi/2019.5
module load vasp/5.4.4.p1
mpirun vasp_std |
|
The following example shows a job script that will run on the Nvidia A100 GPU nodes (Berlin). By default, VASP uses one GPU per MPI task. If you plan to use 4 GPUs per node, you need to set 4 MPI tasks per node. Then, set the number of OpenMP threads to 18 (because 4x18=72, which is the number of CPU cores per node in the GPU A100 partition) to speed up your calculation. This, however, also requires proper process pinning, as shown below.
Codeblock |
---|
language | bash |
---|
title | For Nvidia A100 GPU compute nodes (Berlin) |
---|
|
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --partition=gpu-a100
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
# Avoid hcoll as MPI collective algorithm
export OMPI_MCA_coll="^hcoll"
# You may need to adjust this limit, depending on the case
export OMP_STACKSIZE=512m
module load nvhpc-hpcx/23.1
module load vasp/6.4.1
# Carefully adjust ppr:2, if you don't use 4 MPI processes per node
mpirun --bind-to core --map-by ppr:2:socket:PE=${SLURM_CPUS_PER_TASK} vasp_std |
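To confirm that the pinning described above is actually in effect, lines like the following can be added to the job script just before the mpirun call. This is only a sketch: it assumes an OpenMP 5.0 runtime (for OMP_DISPLAY_AFFINITY) and the nvidia-smi tool on the GPU nodes, and it is not part of the official example.
Codeblock |
---|
language | bash |
---|
title | Optional sanity check for thread and GPU layout (sketch) |
---|
|
# Report each OpenMP thread's binding at startup (OpenMP 5.0 runtimes)
export OMP_DISPLAY_AFFINITY=TRUE
# List the GPUs visible on one node of the allocation
srun --nodes=1 --ntasks=1 nvidia-smi -L |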
The following job script exemplifies how to run vasp 6.4.1 making use of OpenMP threads. Here, we have 2 OpenMP threads and 48 MPI tasks per node (the product of these two numbers should ideally be equal to the number of CPU cores per node).
...
Codeblock |
---|
language | bash |
---|
title | For compute nodes with CentOS 7 |
---|
|
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=48
#SBATCH --cpus-per-task=2
#SBATCH --partition=standard96
export SLURM_CPU_BIND=none
# Set the number of OpenMP threads as given by the SLURM parameter "cpus-per-task"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Adjust the maximum stack size of OpenMP threads
export OMP_STACKSIZE=512m
# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close
# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core
module load impi/2021.7.1
module load vasp/6.4.1
mpirun vasp_std |
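If you change the task or thread counts, a small pre-flight check can catch combinations that leave cores idle or oversubscribed. The sketch below assumes a standard96 node with 96 cores and that both SLURM variables are set by the #SBATCH options above; adjust or drop it as needed.
Codeblock |
---|
language | bash |
---|
title | Optional pre-flight check of the MPI x OpenMP decomposition (sketch) |
---|
|
# 96 cores per standard96 node; change this value for other partitions
CORES_PER_NODE=96
# Warn if tasks-per-node times cpus-per-task does not fill the node
if [ $(( SLURM_NTASKS_PER_NODE * SLURM_CPUS_PER_TASK )) -ne ${CORES_PER_NODE} ]; then
    echo "Warning: ${SLURM_NTASKS_PER_NODE} tasks x ${SLURM_CPUS_PER_TASK} threads does not match ${CORES_PER_NODE} cores per node" >&2
fi |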
The last example demonstrates how to run a job with vasp 5.4.4.p1 on nodes with CentOS 7.
Codeblock |
---|
language | bash |
---|
title | For compute nodes with CentOS 7 |
---|
|
#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=2
#SBATCH --tasks-per-node=96
#SBATCH --partition=standard96
export SLURM_CPU_BIND=none
module load impi/2019.5
module load vasp/5.4.4.p1
mpirun vasp_std |
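Any of the job scripts above is submitted with sbatch in the usual way; the file name vasp.slurm below is only a placeholder for whatever you save the script as.
Codeblock |
---|
language | bash |
---|
title | Submitting and monitoring the job (example) |
---|
|
# Submit the job script; sbatch prints the job ID
sbatch vasp.slurm
# Check the state of your queued and running jobs
squeue -u $USER |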