...
CP2K is an MPI-parallel application. You can use either mpirun or srun as the job starter. If you opt for mpirun, then, apart from loading the corresponding impi or openmpi modules, CPU and/or GPU pinning has to be taken care of explicitly (see the job script examples below).
CP2K Version | Modulefile | Requirement | Compute Partitions | Supported features | CPU/GPU | Lise/Emmy |
---|---|---|---|---|---|---|
7.1 | cp2k/7.1 | impi/2021.13 | Rocky Linux 9 | omp, libint, fftw3, libxc, elpa, parallel, mpi3, scalapack, xsmm, spglib, mkl | / | / |
2023.1 | cp2k/2023.1 | openmpi/gcc.11/4.1.4 cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | / | / |
2023.2 | cp2k/2023.2 | openmpi/gcc.11/4.1.4 cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | / | / |
2024.1 | cp2k/2024.1 | impi/2021.13 | Rocky Linux 9 | omp, libint, fftw3, fftw3_mkl, libxc, elpa, parallel, mpi_f08, scalapack, xsmm, spglib, mkl, sirius, hdf5 | / | / |
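To see which features a particular build actually provides (the "Supported features" column above), you can also query the binary itself: cp2k.psmp --version prints the version together with the list of compiled-in features (the cp2kflags line). For example, for the 2024.1 CPU build:

# Load a CP2K module together with its required MPI module (here: the 2024.1 build)
module load impi/2021.13 cp2k/2024.1
# Print version information, including the compiled-in features (cp2kflags)
cp2k.psmp --version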
...
- You need to check whether a considerable acceleration is expected for your problem. For example, a performance degradation has been reported for the following test cases: https://www.cp2k.org/performance:piz-daint-h2o-64, https://www.cp2k.org/performance:piz-daint-h2o-64-ri-mp2, https://www.cp2k.org/performance:piz-daint-lih-hfx, https://www.cp2k.org/performance:piz-daint-fayalite-fist
GPU pinning is required (see the example job script below). Don't forget to make the script that takes care of GPU pinning executable. In the example, this is achieved with:
chmod +x gpu_bind.sh
...
CP2K as a library
Starting from version 2023.2, CP2K has been compiled with the option that allows it to be used as a library: libcp2k.a can be found inside $CP2K_LIB_DIR. The header libcp2k.h is located in $CP2K_HEADER_DIR, and the module files (.mod), possibly needed by Fortran users, are in $CP2K_MOD_DIR.
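As a rough sketch of how these variables can be used (the file names my_driver.c/my_driver.f90 are placeholders, the compiler wrapper names depend on the loaded MPI module, and linking the static libcp2k.a typically requires further libraries of the same toolchain, e.g. DBCSR, MKL, libxc, libint):

# Load a CP2K module (version >= 2023.2) together with its compiler/MPI requirements first.
# C program including libcp2k.h:
mpicc -I${CP2K_HEADER_DIR} my_driver.c -L${CP2K_LIB_DIR} -lcp2k -o my_driver
# Fortran program using the CP2K module files:
mpifort -I${CP2K_MOD_DIR} my_driver.f90 -L${CP2K_LIB_DIR} -lcp2k -o my_driver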
...
Job script using mpirun (partition cpu-clx):

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --partition=cpu-clx
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close

# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core

# Our tests have shown that CP2K performs better with psm2 as the libfabric provider.
# Check whether this also applies to your system.
# To stick to the default provider, comment out the following line.
export FI_PROVIDER=psm2

module load impi/2021.13 cp2k/2024.1

mpirun cp2k.psmp input > output
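If you want to verify that the Intel MPI pinning settings above take effect, you can optionally raise the Intel MPI debug level before the mpirun call; at level 4 or higher the library prints the rank-to-core mapping at startup. A minimal addition to the job script:

# Optional: report the process pinning map at program start (Intel MPI)
export I_MPI_DEBUG=4
mpirun cp2k.psmp input > output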
Job script using mpirun (intel + impi toolchain):

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Binding OpenMP threads
export OMP_PLACES=cores
export OMP_PROC_BIND=close

# Binding MPI tasks
export I_MPI_PIN=yes
export I_MPI_PIN_DOMAIN=omp
export I_MPI_PIN_CELL=core

# Select the appropriate version, e.g. cp2k/2024.1 or cp2k/2023.2
module load intel/2021.2 impi/2021.7.1 cp2k/2023.2

mpirun cp2k.psmp input > output
Job script using srun:

#!/bin/bash
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --cpus-per-task=4
#SBATCH --job-name=cp2k

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

module load intel/2021.2 impi/2021.7.1 cp2k/2023.2

srun cp2k.psmp input > output
Job script for the GPU A100 partition:

#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --time=12:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=18
#SBATCH --job-name=cp2k

export SLURM_CPU_BIND=none
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PLACES=cores
export OMP_PROC_BIND=close

module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2

# gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
# Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output
...
gpu_bind.sh:

#!/bin/bash
# Assign each MPI rank the GPU matching its node-local rank, then run the
# command passed as arguments (quoting "$@" preserves arguments with spaces).
export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
"$@"
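To check that the mapping behaves as expected, you can, for instance, run the wrapper around a plain echo command from within a GPU job. This is only a quick sanity check, not part of the recommended setup, and it relies on the quoted "$@" in the wrapper above. Each rank should then report a different GPU:

# Each local rank should print a different CUDA_VISIBLE_DEVICES value (one GPU per rank)
mpirun -np 4 ./gpu_bind.sh bash -c 'echo "local rank ${OMPI_COMM_WORLD_LOCAL_RANK}: CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"'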
Remark on OpenMP
Depending on the problem size, the code may stop with a segmentation fault due to insufficient stack size or due to threads exceeding their stack space. To circumvent this, we recommend inserting in the job script:
...