...
- login via ssh,
- file systems, and
- general Slurm usage.
Software and environment modules
- Login and compute nodes of the A100 GPU partition are running under Rocky Linux (currently version 8.6).
- Software for the A100 GPU partition provided by NHR@ZIB can be found using the module command, see Quickstart.
- Please note the presence of the sw.a100 environment module. It controls the software selection for the GPU A100 partition.
```
bgnlogin1 $ module avail
...
bgnlogin1 $ module load gcc
...
bgnlogin1 $ module list
Currently Loaded Modulefiles:
 1) HLRNenv   2) sw.a100   3) slurm   4) gcc/11.3.0(default)
```
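To check from a script that sw.a100 is among the loaded modules, one option is to scan the text printed by module list. The following is a minimal sketch under that assumption: the helper name has_module is hypothetical (it is not part of the module command), and the sample text is copied from the listing above.

```shell
#!/bin/bash
# has_module NAME: succeed if NAME (with or without a /version suffix)
# appears in "module list" text supplied on standard input.
# Hypothetical helper, not part of the module command itself.
has_module() {
    tr -s ' \t' '\n' |
        awk -v m="$1" '$0 == m || index($0, m "/") == 1 { found = 1 }
                       END { exit !found }'
}

# Demo on the sample listing shown above.
sample='Currently Loaded Modulefiles:
 1) HLRNenv   2) sw.a100   3) slurm   4) gcc/11.3.0(default)'

if printf '%s\n' "$sample" | has_module sw.a100; then
    echo "sw.a100 is loaded"
else
    echo "sw.a100 is missing"
fi
```

In a live session the call would probably look like `module list 2>&1 | has_module sw.a100`. The redirection is included because module implementations commonly print the list to standard error.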
Program build and execution
- Each node of the GPU A100 system combines a host CPU with its four attached GPU devices. A wide range of software supports this hardware.
- We recommend using the GPU A100 login nodes for program builds. If a build requires the CUDA drivers to be present, you can also compile on a compute node within a Slurm job session.
- We restrict our presentation to examples. For these, please visit our manual on
...
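For builds that need the CUDA drivers, a compute-node build inside a Slurm job could be sketched as follows. This is an illustration, not the site's recommended job script: the function name build_job_script, the time limit, and the `make` step are placeholders, and your project may need further options such as an account. The partition name gpu-a100 is the one that appears in the squeue output on this page.

```shell
#!/bin/bash
# build_job_script: print a minimal Slurm job script that runs a build
# on one GPU A100 compute node. Hypothetical helper for illustration.
build_job_script() {
    cat <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu-a100
#SBATCH --nodes=1
#SBATCH --time=00:30:00

module load gcc     # compiler module, as on the login node
nvidia-smi          # confirm that the driver sees the GPUs
make                # placeholder for your project's build command
EOF
}

# Show the generated script; redirect it to a file to submit it.
build_job_script
```

To use the sketch, one might run `build_job_script > build_job.slurm` and then `sbatch build_job.slurm`, after replacing the placeholder build command.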
```
bgnlogin1 $ squeue -u myaccount
   JOBID PARTITION     NAME      USER ST  TIME NODES NODELIST(REASON)
 7748370  gpu-a100 a100_mpi myaccount  R  1:23     2 bgn[1007,1017]
bgnlogin1 $ ssh bgn1007
bgn1007 $ top
bgn1007 $ nvidia-smi
bgn1007 $ module load nvtop
bgn1007 $ nvtop
```
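The node name for the ssh step above is read by eye from the NODELIST column. As a hedged sketch, a small helper can derive the first node name from a compact node list such as `bgn[1007,1017]`. The function name first_node is hypothetical, and it only handles the simple list forms shown in its comments.

```shell
#!/bin/bash
# first_node NODELIST: print the first host name of a compact Slurm
# node list. Hypothetical helper covering only these simple forms:
#   bgn[1007,1017]   -> bgn1007
#   bgn[1007-1010]   -> bgn1007
#   bgn1007,bgn1017  -> bgn1007
#   bgn1007          -> bgn1007
first_node() {
    printf '%s\n' "$1" |
        sed -E 's/^([^[,]*)\[([0-9]+)[^]]*\].*$/\1\2/; s/,.*$//'
}

# Demo with the node list from the squeue output above.
first_node 'bgn[1007,1017]'   # prints bgn1007
```

In a session, the argument could come from `squeue -h -j 7748370 -o %N`, where the job ID is the one from the example. For general node list expansion, Slurm's own `scontrol show hostnames` is the more robust choice.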
Using the Slurm batch system
...