The HLRN-IV system

The HLRN-IV system consists of two independent systems, Lise (named after Lise Meitner) and Emmy (named after Emmy Noether). The systems are located at the Zuse Institute Berlin and the University of Göttingen, respectively. Overall, the HLRN-IV system consists of 1270 compute nodes with 121,920 cores in total. You can learn more about the system and the differences between the sites on the HLRN-IV website.

Please log in to the gateway nodes using the Secure Shell ssh (protocol version 2); see the example below. The standard gateways are called

blogin.hlrn.de (Berlin)
and
glogin.hlrn.de (Göttingen).

Login authentication is possible only by SSH keys. For information and instructions please see our SSH Pubkey tutorial.

...

For questions, please contact the support crew support@nhr.zib.de.

Login authentication is possible via SSH keys only. Please visit our tutorial SSH Login.

Partition of Lise	Login node
CPU partition "Lise"	blogin.nhr.zib.de
GPU A100 partition	bgnlogin.nhr.zib.de
GPU PVC partition	bgilogin.nhr.zib.de

Example CPU partition

office $ ssh -i $HOME/.ssh/id_rsa_nhr nhr_username@blogin.nhr.zib.de
Enter passphrase for key '...':
blogin1 $
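
The login table and the ssh example above can be combined into a small helper. This is only a sketch: the short partition names (cpu, a100, pvc) and the variables NHR_USER and NHR_KEY are hypothetical conveniences, not part of the system. The host names and the key path are the ones shown above. The command is printed rather than executed, so it can be checked first.

```shell
# Map a (hypothetical) short partition name to the login node from the table above.
login_host() {
    case "$1" in
        cpu)  echo "blogin.nhr.zib.de" ;;
        a100) echo "bgnlogin.nhr.zib.de" ;;
        pvc)  echo "bgilogin.nhr.zib.de" ;;
        *)    echo "unknown partition: $1" >&2; return 1 ;;
    esac
}

# Print the ssh command for a partition.
# NHR_USER and NHR_KEY are assumed variable names; the defaults mirror the example above.
nhr_ssh_command() {
    host=$(login_host "$1") || return 1
    echo "ssh -i ${NHR_KEY:-$HOME/.ssh/id_rsa_nhr} ${NHR_USER:-nhr_username}@${host}"
}

nhr_ssh_command cpu
```

Running the printed command from your workstation then asks for the key passphrase, as in the session above.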

File systems

Each complex has the following file systems available. More information about quota, usage, and best practices is available on the page Fixing Quota Issues. Hints for data transfer are given here.

Home file system with 340 TiByte capacity containing $HOME directories /home/${USER}/
Lustre parallel file system with 8.1 PiByte capacity containing
- $WORK directories /scratch/usr/${USER}/
- $TMPDIR directories /scratch/tmp/${USER}/
- project data directories /scratch/projects/<projectID>/ (not yet available)
Tape archive with 120 TiByte capacity (accessible on the login nodes only)
On Emmy: SSD for temporary data at $LOCAL_TMPDIR (400 GB shared among all jobs running on the node)
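
As an illustration of how the variables in the list above might be used inside a job, the following sketch picks a directory for temporary data. The order of preference (node-local SSD first, then the Lustre $TMPDIR) and the fallback to /tmp are assumptions made for this example, not a site recommendation.

```shell
# Choose a directory for temporary job data.
# Assumed preference: $LOCAL_TMPDIR (SSD, set on Emmy), then $TMPDIR, then /tmp.
pick_tmpdir() {
    if [ -n "${LOCAL_TMPDIR:-}" ]; then
        echo "$LOCAL_TMPDIR"
    elif [ -n "${TMPDIR:-}" ]; then
        echo "$TMPDIR"
    else
        echo "/tmp"
    fi
}

echo "temporary data goes to: $(pick_tmpdir)"
```

Keep in mind that the SSD space is shared among all jobs running on the node, so large temporary files may still belong on the Lustre file system.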

Info
Filesystem quotas are currently not activated for the $HOME and $WORK directories

Info
Best practices for using WORK as a lustre filesystem: https://www.nas.nasa.gov/hecc/support/kb/lustre-best-practices_226.html

...

Info
Hints for fair usage of the shared WORK resource: Metadata Usage on WORK

Software and environment modules

The webpage Software gives you more information about available software on the NHR systems.

NHR provides a number of compilers and software packages for parallel computing and (serial) pre- and postprocessing:

Compilers: Intel, GNU
Libraries: NetCDF, LAPACK, ScaLAPACK, BLAS, FFTW, ...
Debuggers: Allinea DDT, Roguewave TotalView...
Tools: octave, python, R ...
Visualisation: mostly tools to investigate gridded data sets from earth-system modelling
Application software: mostly for engineering and chemistry (molecular dynamics)

Environment Modules are used to manage access to these software packages and libraries. The module command offers the following functionality.

Shows lists of available software
Enables access to software in different versions
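
The two functions listed above correspond to the module subcommands avail and load. A minimal sketch follows, wrapped in a helper function whose name (show_and_load) is made up for this example. The module name intel is the one used in the build example on this page; which versions exist on the system is not assumed here.

```shell
# List the available versions of a package, load it, and show what is loaded.
# The function name is hypothetical; avail, load and list are standard module subcommands.
show_and_load() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found (are you on a login node?)" >&2
        return 1
    fi
    module avail "$1"            # show the available versions of the package
    module load "$1" || return 1 # make the default version accessible in this shell
    module list                  # show all currently loaded modules
}

# Usage on a login node:
#   show_and_load intel
```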

...

To avoid conflicts between different compilers and compiler versions, builds of the most important libraries are provided for all compilers and major release numbers.

Program build

Only a brief introduction to program building with the Intel compiler is given here. For more detailed instructions, including important compiler flags and special libraries, refer to our webpage Compilation Guide CPU CLX.

Examples for building a program on the Atos system

To build executables for the Atos system, call the standard compiler executables (icc, ifort, gcc, gfortran) directly.

...

Parallel Code with OpenMP

module load intel
icc -qopenmp -o hello.bin hello.c

MPI, Communication Libraries, OpenMP

We provide several communication libraries:

...

As Intel MPI is the communication library recommended by the system vendor, currently only documentation for Intel MPI is provided, except for application specific documentation.

OpenMP support is available with the compilers from Intel and GNU.

Using the batch system

To run your applications on the systems, you need to go through our batch system/scheduler: Slurm. The scheduler uses meta information about the job (requested node and core count, wall time, etc.) and then runs your program on the compute nodes once the resources are available and your job is next in line. For a more in-depth introduction, visit our Slurm documentation.

We distinguish two kinds of jobs:

Interactive job execution
Job script execution

Resource specification

To request resources, there are multiple flags to be used when submitting the job.

	Parameter	Default Value
# tasks	-n #	1
# nodes	-N #	1
# tasks per node	--tasks-per-node #
partition	-p <name>	standard96/medium40
Timelimit	-t hh:mm:ss	12:00:00
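
To show how the flags from the table combine on one command line, here is a small sketch that assembles and prints an sbatch call. The function name and the script name job.sh are placeholders for this example. The flags are the ones listed in the table.

```shell
# Assemble the resource flags from the table into one sbatch command line.
# The command is printed, not executed, so it can be inspected before use.
sbatch_command() {
    nodes="$1"
    tasks_per_node="$2"
    partition="$3"
    timelimit="$4"
    script="$5"
    echo "sbatch -N ${nodes} --tasks-per-node ${tasks_per_node} -p ${partition} -t ${timelimit} ${script}"
}

# Two nodes with 96 tasks each on standard96 for one hour (job.sh is a placeholder).
sbatch_command 2 96 standard96 01:00:00 job.sh
```

The same flags can also be written as #SBATCH lines at the top of the job script instead of being passed on the command line.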

Interactive jobs

To use compute resources interactively, e.g. to follow the execution of MPI programs, the following steps are required. Note that non-interactive batch jobs via job scripts (see below) are the primary way of using the compute resources.

A resource allocation for interactive usage has to be requested first with the salloc --interactive command, which should also include your resource requirements.
When salloc has successfully allocated the requested resources, issue an additional srun command (see example below) if you want to work on one of the allocated compute nodes.
Afterwards, srun or MPI launch commands, like mpirun or mpiexec, can be used to start parallel programs (see the corresponding user guides).

blogin1 ~ $ salloc -t 00:10:00 -p standard96:test -N2 --tasks-per-node 24
salloc: Granted job allocation [...]
salloc: Waiting for resource configuration
salloc: Nodes bcn[1001,1003] are ready for job
# To get a shell on one of the allocated nodes
blogin1 ~ $ srun --pty --interactive --preserve-env ${SHELL}
bcn1001 ~ $ srun hostname | sort | uniq -c
     24 bcn1001
     24 bcn1003
bcn1001 ~ $ exit
# Exit a second time for Berlin/Lise 
blogin1:~ > exit
salloc: Relinquishing job allocation [...]

Job scripts

Please go to our webpage MPI, OpenMP start Guide for CPU partition "Lise" for more details about job scripts. As an introduction, standard batch system jobs are executed by applying the following steps:

...

OpenMP job

Requesting 1 large node with 96 CPUs (physical cores) for 20 minutes, and then using 192 hyperthreads

#!/bin/bash
#SBATCH -t 00:20:00
#SBATCH -N 1
#SBATCH --cpus-per-task=96
#SBATCH -p large96:test

# This binds each thread to one core
export OMP_PROC_BIND=TRUE
# Number of threads: twice the value given by -c / --cpus-per-task, to use both hyperthreads of each core
export OMP_NUM_THREADS=$(($SLURM_CPUS_PER_TASK * 2))
export KMP_AFFINITY=verbose,scatter

hello_world > hello.output
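
The thread count in the script above is plain arithmetic: physical cores requested per task times hardware threads per core. A small sketch of that calculation follows. The function name is made up for this example, and the default of 2 threads per core is taken from the 96 cores / 192 hyperthreads figures above.

```shell
# OpenMP thread count = cores per task x hardware threads per core.
# Falls back to $SLURM_CPUS_PER_TASK (or 1) and to 2 threads per core if no arguments are given.
omp_threads() {
    cpus_per_task="${1:-${SLURM_CPUS_PER_TASK:-1}}"
    threads_per_core="${2:-2}"
    echo $(( $cpus_per_task * $threads_per_core ))
}

omp_threads 96 2   # prints 192
```

To run one thread per physical core instead, pass 1 as the second argument.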

Job Accounting

The webpage Accounting gives you more information about job accounting.

Every batch job on Lise and Emmy is accounted. The account (project) which is debited for a batch job can be specified using the sbatch parameter --account <account>. If a batch job does not state an account (project), a default is taken from the account database. It defaults to the personal project of the user, which has the same name as the user. Users may modify their default project by visiting the HLRN service portal.
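
The account selection described above can be sketched as follows. The function name, the script name job.sh, and the project name myproject are placeholders. Only the --account parameter comes from the text. When no account is passed, the flag is simply omitted and the default from the account database applies.

```shell
# Print an sbatch call that debits a given account (project), or the default one if none is given.
submit_command() {
    if [ -n "${2:-}" ]; then
        echo "sbatch --account $2 $1"
    else
        echo "sbatch $1"
    fi
}

submit_command job.sh myproject   # debit the (placeholder) project "myproject"
submit_command job.sh             # debit the default project of the user
```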

Getting Help

For questions, please contact the HLRN support crew (support@hlrn.de) or see the Portal NHR@ZIB.
