CUDA (Compute Unified Device Architecture) is an interface for programming Nvidia GPUs. It supports languages such as C, C++, and Fortran.

...

Code build

To build code, we recommend the NVIDIA hpcx software package, which combines a compiler with powerful libraries such as CUDA, cuBLAS, and MPI.

CUDA, and CUDA with cuBLAS
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ module list
Currently Loaded Modulefiles: ... 4) hpcx   5) nvhpc-hpcx/23.1
bgnlogin1 $ nvc -cuda -gpu=cc8.0 cuda.c -o cuda.bin
bgnlogin1 $ nvc -cuda -gpu=cc8.0 -cudalib=cublas cuda_cublas.c -o cuda_cublas.bin

CUDA can be used in combination with MPI.

CUDA with MPI
bgnlogin1 $ module load nvhpc-hpcx/23.1
bgnlogin1 $ mpicc -cuda -gpu=cc8.0 -cudalib=cublas mpi_cuda_cublas.c -o mpi_cuda_cublas.bin
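
Before compiling, it can help to confirm that loading the module actually put the compilers on the PATH. The following is a minimal sketch rather than part of the official setup: the function name check_tools is hypothetical, and the tool names nvc and mpicc are taken from the build commands above.

```shell
#!/bin/bash
# Hypothetical helper (not part of nvhpc-hpcx): report whether each
# named tool can be found on the PATH.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one tool was not found.
    return "$missing"
}

# nvc and mpicc are the compilers used in the build commands above.
check_tools nvc mpicc || echo "hint: run 'module load nvhpc-hpcx/23.1' first"
```

The function only inspects the PATH. It does not verify that the compilers can target the GPU.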

Code execution

All available Slurm partitions for the A100 GPUs are listed on the page Slurm partitions GPU A100.

Job script for CUDA
#!/bin/bash
#SBATCH --partition=gpu-a100:shared
#SBATCH --gres=gpu:1
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72

./cuda.bin
./cuda_cublas.bin
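
To see inside a job which GPUs were assigned, a short check can be placed before the program calls. This is a sketch under an assumption: it relies on Slurm exporting CUDA_VISIBLE_DEVICES for jobs that request --gres=gpu, which depends on the site configuration. The function name report_gpus is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report the GPUs visible to the current job.
report_gpus() {
    # Assumption: Slurm exports CUDA_VISIBLE_DEVICES for --gres=gpu jobs.
    local devices="${CUDA_VISIBLE_DEVICES:-}"
    if [ -z "$devices" ]; then
        echo "no GPUs visible"
        return 0
    fi
    # Split the comma-separated device list and count the entries.
    local IFS=','
    local ids=($devices)
    echo "${#ids[@]} GPU(s) visible: $devices"
}

report_gpus
```

Calling report_gpus before ./cuda.bin writes the result to the job's output file, which makes a missing GPU allocation easy to spot.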

...

Job script for CUDA with MPI

...

Apptainer is provided as a module and can be used to download, build, and run containers, for example those published by Nvidia:

Apptainer example
bgnlogin1 ~ $ module load apptainer
Module for Apptainer 1.1.6 loaded.

# pull a TensorFlow image from nvcr.io; the image needs to be compatible with the local driver
bgnlogin1 ~ $ apptainer pull tensorflow-22.01-tf2-py3.sif docker://nvcr.io/nvidia/tensorflow:22.01-tf2-py3
...

# example: single-node run calling Python from the container in an interactive job using 4 GPUs
bgnlogin1 ~ $ srun -pgpu-a100 --gres=gpu:4 --nodes=1 --pty --interactive --preserve-env ${SHELL}
...
bgn1003 ~ $ apptainer run --nv tensorflow-22.01-tf2-py3.sif python
...
Python 3.8.10 (default, Nov 26 2021, 20:14:08) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.config.list_physical_devices("GPU")
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:3', device_type='GPU')]

# optional: clean up the Apptainer cache
bgnlogin1 ~ $ apptainer cache list
...
bgnlogin1 ~ $ apptainer cache clean