...
Code execution
For examples of code execution, please visit Slurm partition CPU CLX.
Code compilation
Intel oneAPI compiler
Serial code execution:

module load intel
icx -o hello.bin hello.c
ifx -o hello.bin hello.f90
icpx -o hello.bin hello.cpp
OpenMP parallel (threaded) code execution:

module load intel
icx -fopenmp -o hello.bin hello.c
ifx -fopenmp -o hello.bin hello.f90
icpx -fopenmp -o hello.bin hello.cpp
...
GNU compiler
Serial code execution:

module load gcc
gcc -o hello.bin hello.c
gfortran -o hello.bin hello.f90
g++ -o hello.bin hello.cpp
OpenMP threaded code execution:

module load gcc
gcc -fopenmp -o hello.bin hello.c
gfortran -fopenmp -o hello.bin hello.f90
g++ -fopenmp -o hello.bin hello.cpp
Slurm job script
The following examples of Slurm job scripts, e.g. myjobscript.slurm, cover the setup
- 1 node,
- 1 OpenMP code running.

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test
./hello.bin
OpenMP, full node:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=96
./hello.bin
OpenMP, half node:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
./hello.bin
OpenMP, hyperthreading:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=cpu-clx:test
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=192
./hello.bin
You can run different OpenMP codes at the same time. The examples cover the setup
- 2 nodes,
- 4 OpenMP codes running simultaneously,
- the codes are not MPI parallel; mpirun is used only to start them.
OpenMP simultaneously:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=cpu-clx:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=48
mpirun -ppn 2 \
  -np 1 ./code1.bin : -np 1 ./code2.bin : -np 1 ./code3.bin : -np 1 ./code4.bin
OpenMP simultaneously, hyperthreading:

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=cpu-clx:test
module load impi/2019.5
export SLURM_CPU_BIND=none
export OMP_PROC_BIND=spread
export OMP_NUM_THREADS=96
mpirun -ppn 2 \
  -np 1 ./code1.bin : -np 1 ./code2.bin : -np 1 ./code3.bin : -np 1 ./code4.bin
Compiler flags
To make full use of the vectorizing capabilities of the Intel Cascade Lake CPUs, AVX-512 instructions and the 512-bit ZMM registers can be used with the following compile flags of the Intel compilers:
-xCORE-AVX512 -qopt-zmm-usage=high
However, high ZMM register usage is not recommended in all cases.
With the GNU compilers (GCC 9.x and later), architecture-specific optimization for Cascade Lake CPUs is enabled with the compiler flags
-march=cascadelake -mprefer-vector-width=512
Using the Intel MKL
The Intel® Math Kernel Library (Intel® MKL) is designed to run on multiple processors and operating systems. It is also compatible with several compilers and third-party libraries, and provides different interfaces to its functionality. To support these different environments, tools, and interfaces, Intel MKL provides multiple libraries from which to choose.
Check out Intel's link line advisor to see which libraries are recommended for a particular use case: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/