The HLRN-IV system
The HLRN-IV system consists of two independent systems named Lise (named after Lise Meitner) and Emmy (named after Emmy Noether). The systems are located at the Zuse Institute Berlin and the University of Göttingen, respectively. In total, the HLRN-IV system comprises 1,270 compute nodes with 121,920 cores. You can learn more about the system and the differences between the sites on the HLRN-IV website.
Login
Please log in to the gateway nodes using the Secure Shell ssh (protocol version 2), see the example below. The standard gateways are called blogin.hlrn.de (Berlin) and glogin.hlrn.de (Göttingen).
Login authentication is possible only by SSH keys. For information and instructions, please see our SSH Pubkey tutorial.
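A typical login might look like the following sketch; the user name and key file are placeholders, and the -i option can be omitted if your key is loaded in an SSH agent:

```
# Log in to the Berlin gateway with an SSH key (user name and key path are placeholders)
ssh -i ~/.ssh/id_rsa_hlrn myuser@blogin.hlrn.de
# ... or to the Göttingen gateway
ssh -i ~/.ssh/id_rsa_hlrn myuser@glogin.hlrn.de
```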
File Systems
Each complex has the following file systems available. More information about quotas, usage, and best practices is available here.
- Home file system with 340 TiByte capacity containing the $HOME directories /home/${USER}/
- Lustre parallel file system with 8.1 PiByte capacity containing the $WORK directories /scratch/usr/${USER}/, the $TMPDIR directories /scratch/tmp/${USER}/, and the project data directories /scratch/projects/<projectID>/ (not yet available)
- Tape archive with 120 TiByte capacity (accessible on the login nodes only)
- On Emmy: SSD for temporary data at $LOCAL_TMPDIR (400 GB shared among all jobs running on the node)
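The directory variables above can be used directly on the command line and in job scripts; a small illustrative sketch (the file name input.dat is a placeholder):

```
echo $HOME            # personal home directory, /home/$USER/
echo $WORK            # Lustre work directory, /scratch/usr/$USER/
echo $TMPDIR          # scratch space for temporary job data, /scratch/tmp/$USER/
cp input.dat $WORK/   # example: stage large job data on the Lustre file system
```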
Info: Filesystem quotas are currently not activated for the $HOME and $WORK directories.
Info: Best practices for using WORK as a Lustre filesystem: https://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html
Info: Hints for fair usage of the shared WORK resource: Metadata Usage on WORK
Software and Environment
The webpage Software gives you more information about available software on the HLRN systems.
Partitions on system Lise
Compute system Lise at NHR@ZIB contains different compute partitions for CPUs and GPUs. Your choice of partition affects
- the login nodes,
- the Slurm partition (see Compute partitions and Accounting), and
- the available software.
Login nodes
To log in to system Lise, please
- choose a login node associated with your compute partition, and
- use authentication via SSH Login.
Software and environment modules
The webpage Software gives you information about available software on the NHR systems.
NHR provides a number of compilers and software packages for parallel computing and (serial) pre- and postprocessing:
- Compilers: Intel, GNU
- Libraries: NetCDF, LAPACK, ScaLAPACK, BLAS, FFTW, ...
- Debuggers: Allinea DDT, Rogue Wave TotalView, ...
- Tools: octave, python, R ...
- Visualisation: mostly tools to investigate gridded data sets from earth-system modelling
- Application software: mostly for engineering and chemistry (molecular dynamics)
Environment Modules are used to manage access to these software packages and libraries. The module command offers the following functionality:
- show lists of available software
- enable access to software in different versions
...
To avoid conflicts between different compilers and compiler versions, builds of most important libraries are provided for all compilers and major release numbers.
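A typical module session might look like the following sketch (the fftw3 module name is an assumption; actual names and versions may differ):

```
module avail        # list the software available via environment modules
module load intel   # load the Intel compiler
module load fftw3   # assumption: load an FFTW build matching the loaded compiler
module list         # show the currently loaded modules
```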
Program Build
Here only a brief introduction to program building using the Intel compiler is given. For more detailed instructions, including important compiler flags and special libraries, refer to our webpage Compilation Guide.
Examples for building a program on the Atos system
To build executables for the Atos system, call the standard compiler executables (icc, ifort, gcc, gfortran) directly.
Serial program:

```
module load intel
icc -o hello.bin hello.c
```
MPI program with Intel MPI:

```
module load intel
module load impi
mpiicc -o hello.bin hello.c
```
OpenMP program:

```
module load intel
icc -qopenmp -o hello.bin hello.c
```
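The same pattern should carry over to the GNU compilers mentioned above; a minimal sketch, assuming a gcc module is available under that name:

```
module load gcc                     # assumption: GNU compiler module name
gcc -o hello.bin hello.c            # serial build with the GNU compiler
gcc -fopenmp -o hello.bin hello.c   # OpenMP build with the GNU compiler
```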
MPI, Communication Libraries, OpenMP
We provide several communication libraries:
- Intel MPI
- OpenMPI
As Intel MPI is the communication library recommended by the system vendor, currently only documentation for Intel MPI is provided, except for application-specific documentation.
OpenMP support is built into the compilers from Intel and GNU.
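For illustration, building the MPI example with OpenMPI instead of Intel MPI might look like the following sketch (module names are assumptions and may differ on the system):

```
module load gcc               # assumption: GNU compiler module name
module load openmpi           # assumption: OpenMPI module name
mpicc -o hello.bin hello.c    # OpenMPI compiler wrapper for C
```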
Using the Batch System
To run your applications on the HLRN systems, you need to go through our batch system/scheduler, Slurm. The scheduler uses meta-information about the job (requested node and core count, wall time, etc.) and runs your program on the compute nodes once the resources are available and your job is next in line. For a more in-depth introduction, visit our Slurm documentation.
We distinguish two kinds of jobs:
- Interactive job execution
- Job script execution
Resource specification
To request resources, several flags can be used when submitting the job.
...
-p <name>
...
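For illustration, a submission combining several common resource flags might look like this (partition, node count, and wall time are example values):

```
# Request 2 nodes for 10 minutes in the standard96 partition (example values)
sbatch -p standard96 -N 2 -t 00:10:00 jobscript.sh
```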
Interactive jobs
Interactive MPI programs are executed applying the following steps (example for the default medium partition):
- Ask for an interactive shell with the command srun <…> --pty --interactive bash. For reduced waiting times we advise using one of the test partitions when submitting interactive jobs.
- In the interactive shell, execute the parallel program with the MPI starter mpirun or srun.
```
blogin1:~ > srun -t 00:10:00 -p medium40:test -N2 --tasks-per-node 24 --pty --interactive bash
bash-4.2$ mpirun hello_world >> hello_world.out
bash-4.2$ exit
blogin1:~ >
```
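Inside the interactive shell, srun can also be used to inspect the allocation or to start the program, for example (illustrative):

```
bash-4.2$ srun hostname                        # list the nodes allocated to the interactive job
bash-4.2$ srun hello_world > hello_world.out   # start the program with srun instead of mpirun
```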
Job scripts
Please go to our webpage MPI, OpenMP start Guide for more details about job scripts. As an introduction, standard batch jobs are executed applying the following steps:
- Provide (write) a batch job script, see the examples below.
- Submit the job script with the command sbatch (sbatch jobscript.sh).
- Monitor and control the job execution, e.g. with the commands squeue and scancel (to cancel the job).
A job script is a script (written in bash, ksh or csh syntax) containing Slurm keywords which are used as arguments for the command sbatch.
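A typical submit-and-monitor sequence might look like this sketch (the job ID shown is a made-up example):

```
blogin1:~ > sbatch jobscript.sh    # submit the job script
Submitted batch job 123456         # example output; the actual job ID will differ
blogin1:~ > squeue -u $USER        # check the state of your jobs
blogin1:~ > scancel 123456         # cancel the job if necessary
```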
...
Intel MPI Job Script
Requesting 4 nodes with 96 cores each (no hyperthreading) in the standard96 partition for 10 minutes, using Intel MPI.
```
#!/bin/bash
#SBATCH -t 00:10:00
#SBATCH -N 4
#SBATCH --tasks-per-node 96
#SBATCH -p standard96

module load impi

export SLURM_CPU_BIND=none  # important when using "mpirun" from Intel MPI!
                            # Do NOT use this with srun!
export I_MPI_HYDRA_TOPOLIB=ipl
export I_MPI_HYDRA_BRANCH_COUNT=-1

mpirun hello_world > hello.output
```
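If the program is started with srun instead of mpirun, the CPU-binding export must be omitted (see the comment in the script above); a hedged sketch of the last lines in that case:

```
module load impi
# do NOT set SLURM_CPU_BIND=none when launching with srun
srun hello_world > hello.output
```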
...
OpenMP Job Script
Requesting 1 large node with 96 physical cores for 20 minutes, then using 192 hyperthreads.
```
#!/bin/bash
#SBATCH -t 00:20:00
#SBATCH -N 1
#SBATCH --cpus-per-task=96
#SBATCH -p large96:test

# This binds each thread to one core
export OMP_PROC_BIND=TRUE
# Number of threads: value of -c / --cpus-per-task, doubled to use the hyperthreads
export OMP_NUM_THREADS=$(($SLURM_CPUS_PER_TASK * 2))
export KMP_AFFINITY=verbose,scatter

hello_world > hello.output
```
Job Accounting
The webpage Accounting and NPL gives you more information about job accounting.
Every batch job on Lise and Emmy is accounted. The account (project) which is debited for a batch job can be specified using the sbatch parameter --account <account>. If a batch job does not state an account (project), a default is taken from the account database. It defaults to the personal project of the user, which has the same name as the user. Users may modify their default project by visiting the HLRN service portal.
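For example, to debit a specific project instead of the personal default, the account can be given at submission time (the project name below is a placeholder):

```
sbatch --account=myproject jobscript.sh   # "myproject" is a placeholder for your project ID
```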
Getting Help
...
File systems
Each complex has the following file systems available. More information about quotas, usage, and best practices is available on Fixing Quota Issues. Hints for data transfer are given here.
- Home file system with 340 TiByte capacity containing the $HOME directories /home/${USER}/
- Lustre parallel file system with 8.1 PiByte capacity containing the $WORK directories /scratch/usr/${USER}/, the $TMPDIR directories /scratch/tmp/${USER}/, and the project data directories /scratch/projects/<projectID>/ (not yet available)
- Tape archive with 120 TiByte capacity (accessible on the login nodes only)
Info: Best practices for using WORK as a Lustre filesystem: https://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html
Info: Hints for fair usage of the shared WORK resource: Metadata Usage on WORK