...
CP2K Version | Modulefile | Requirement | Compute Partitions | Supported features | CPU/GPU | Lise/Emmy |
---|---|---|---|---|---|---|
7.1 | cp2k/7.1 | impi/2021.13 | Rocky Linux 9 | omp, libint, fftw3, libxc, elpa, parallel, mpi3, scalapack, xsmm, spglib, mkl | / | / |
2023.1 | cp2k/2023.1 | openmpi/gcc.11/4.1.4 cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | / | / |
2023.2 | cp2k/2023.2 | openmpi/gcc.11/4.1.4 cuda/11.8 | GPU A100 | libint, fftw3, libxc, elpa, elpa_nvidia_gpu, scalapack, cosma, xsmm, dbcsr_acc, spglib, mkl, sirius, offload_cuda, spla_gemm, m_offloading, libvdwxc | / | / |
2024.1 | cp2k/2024.1 | impi/2021.13 | Rocky Linux 9 | omp, libint, fftw3, fftw3_mkl, libxc, elpa, parallel, mpi_f08, scalapack, xsmm, spglib, mkl, sirius, hdf5 | / | / |
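For example, to make the 2024.1 CPU build from the table available on the Rocky Linux 9 partitions, load the listed requirement together with the CP2K modulefile (the same combination is used in the CPU job script further below):

    module load impi/2021.13 cp2k/2024.1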
...
- Check whether a considerable acceleration can be expected for your problem. For example, a performance degradation has been reported for the following test cases: https://www.cp2k.org/performance:piz-daint-h2o-64, https://www.cp2k.org/performance:piz-daint-h2o-64-ri-mp2, https://www.cp2k.org/performance:piz-daint-lih-hfx, https://www.cp2k.org/performance:piz-daint-fayalite-fist
- GPU pinning is required (see the example job script below). Don't forget to make the script that takes care of the GPU pinning executable. In the example, this is achieved with:
chmod +x gpu_bind.sh
Using
...
CP2K as a library
Starting from version 2023.2, CP2K has been compiled with the option that allows it to be used as a library: libcp2k.a can be found inside $CP2K_LIB_DIR. The header libcp2k.h is located in $CP2K_HEADER_DIR, and the module files (.mod), which Fortran users may need, are in $CP2K_MOD_DIR.
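These environment variables can be used directly in the compile and link commands. As a minimal sketch (not part of the module documentation), the following shows how a small C program calling the libcp2k API might be built; the file name demo.c and the compiler wrapper names are assumptions, and, depending on how the library was built, further CP2K dependencies (MKL, libint, libxc, ...) may have to be appended to the link line:

    # Hypothetical example: build a C program against libcp2k.
    # Assumes the cp2k module and its compiler/MPI modules are loaded, so that
    # CP2K_HEADER_DIR and CP2K_LIB_DIR are set as described above.
    mpicc -c -I"${CP2K_HEADER_DIR}" demo.c
    # Link with the Fortran wrapper so that the Fortran runtime needed by libcp2k is pulled in.
    mpifort -fopenmp -o demo demo.o -L"${CP2K_LIB_DIR}" -lcp2k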
...
    #!/bin/bash
    #SBATCH --time=12:00:00
    #SBATCH --partition=cpu-clx
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=24
    #SBATCH --cpus-per-task=4
    #SBATCH --job-name=cp2k

    export SLURM_CPU_BIND=none
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

    # Binding OpenMP threads
    export OMP_PLACES=cores
    export OMP_PROC_BIND=close

    # Binding MPI tasks
    export I_MPI_PIN=yes
    export I_MPI_PIN_DOMAIN=omp
    export I_MPI_PIN_CELL=core

    # Our tests have shown that CP2K has better performance with psm2 as the libfabric provider
    # Check whether this also applies to your system
    # To stick to the default provider, comment out the following line
    export FI_PROVIDER=psm2

    module load impi/2021.13 cp2k/2024.1

    mpirun cp2k.psmp input > output
    #!/bin/bash
    #SBATCH --time=12:00:00
    #SBATCH --partition standard96
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=24
    #SBATCH --cpus-per-task=4
    #SBATCH --job-name=cp2k

    export SLURM_CPU_BIND=none
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

    # Binding OpenMP threads
    export OMP_PLACES=cores
    export OMP_PROC_BIND=close

    # Binding MPI tasks
    export I_MPI_PIN=yes
    export I_MPI_PIN_DOMAIN=omp
    export I_MPI_PIN_CELL=core

    # Select the appropriate version
    module load intel/2021.2 impi/2021.7.1 cp2k/2023.2

    mpirun cp2k.psmp input > output
    #!/bin/bash
    #SBATCH --time=12:00:00
    #SBATCH --partition standard96
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=24
    #SBATCH --cpus-per-task=4
    #SBATCH --job-name=cp2k

    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

    module load intel/2021.2 impi/2021.7.1 cp2k/2023.2

    srun cp2k.psmp input > output
    #!/bin/bash
    #SBATCH --partition=gpu-a100
    #SBATCH --time=12:00:00
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=4
    #SBATCH --cpus-per-task=18
    #SBATCH --job-name=cp2k

    export SLURM_CPU_BIND=none
    export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
    export OMP_PLACES=cores
    export OMP_PROC_BIND=close

    module load gcc/11.3.0 openmpi/gcc.11/4.1.4 cuda/11.8 cp2k/2023.2

    # gpu_bind.sh (see the following script) should be placed inside the same directory where cp2k will be executed
    # Don't forget to make gpu_bind.sh executable by running: chmod +x gpu_bind.sh
    mpirun --bind-to core --map-by numa:PE=${SLURM_CPUS_PER_TASK} ./gpu_bind.sh cp2k.psmp input > output
...
    #!/bin/bash
    # Map each MPI rank to the GPU matching its node-local rank,
    # then execute the command passed as arguments (here: cp2k.psmp)
    export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK
    $@
Remark on OpenMP
Depending on the problem size, the code may stop with a segmentation fault because of an insufficient stack size or because threads exceed their stack space. To circumvent this, we recommend adding the following to the job script:
...