The compute nodes of Lise in Berlin (blogin.hlrn.de) and Emmy in Göttingen (glogin.hlrn.de) are organized via the following SLURM partitions:
Lise (Berlin)
...
Partition (number holds cores per node)
...
Code execution
After creation of
- a binary (executable, model code) like in Workflow CPU CLX,
- a Slurm job script like in Workflow CPU CLX, using a Slurm partition from the table Partition for CPU CLX,
submit the slurm job script to execute the binary on compute nodes.
> sbatch myjobscript.slurm
Submitted batch job 8028673
> ls slurm-8028673.out
slurm-8028673.out
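The job script itself is described in Workflow CPU CLX. As a rough illustration only, a minimal script could be written as sketched below. The partition `cpu-clx:test`, the task count of 96 per node (2 x 48 cores, taken from the list of CPUs), the requested wall time and the executable name `mybinary` are all assumptions chosen for this example, not values prescribed by the system.

```shell
# Write a minimal job script to myjobscript.slurm.
# All values below are illustrative assumptions, not site defaults.
cat > myjobscript.slurm <<'EOF'
#!/bin/bash
# Partition from the table "Partition for CPU CLX" (test partition, 1 h wall time limit)
#SBATCH --partition=cpu-clx:test
#SBATCH --nodes=1
# Assumes 2 x 48 cores per node
#SBATCH --ntasks-per-node=96
# Must stay within the wall time limit of the chosen partition
#SBATCH --time=00:10:00

# mybinary is a hypothetical executable name
srun ./mybinary
EOF
```

The resulting file can then be submitted with `sbatch myjobscript.slurm` as shown above.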
Partition for CPU CLX
The compute nodes of the CPU cluster of system Lise are organised via the following Slurm partitions.
Partition name | Node count | CPU | Main memory (GB) | Max. nodes per job | Max. jobs per user (running / queued) | Wall time limit (hh:mm:ss) | Remark |
---|---|---|---|---|---|---|---|
cpu-clx | 948 | Cascade 9242 | 362 | 512 | 128 / 500 | 12:00:00 | default |
cpu-clx:test | 16 dedicated +128 on demand | | 362 | 16 | 1 / 500 | 01:00:00 | test nodes with higher priority but less wall time |
cpu-clx:ssd | 50 | | 362 | | 128 / 500 | 12:00:00 | local 2TB SSD for IO |
cpu-clx:large | 28 | | 747 | 8 | 128 / 500 | 12:00:00 | fat memory nodes blogin1-2.nhr.zib.de |
cpu-clx:huge | 2 | | 1522 | 1 | 128 / 500 | 24:00:00 | very fat memory nodes for data pre- and post-processing |
See Slurm usage for how to pass the 12h wall time limit with job dependencies.
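The idea behind job dependencies can be sketched as follows: the same job script is submitted several times, and each submission after the first is told to wait until its predecessor has finished successfully. The sketch uses the standard `sbatch` options `--parsable` and `--dependency=afterok:<jobid>`. The function name `submit_chain` and the variable `SBATCH_CMD` are hypothetical helpers introduced for this example, and the approach assumes that the job script can resume from its own checkpoint files.

```shell
# SBATCH_CMD is a hypothetical override so the loop can be tried without a scheduler.
SBATCH_CMD=${SBATCH_CMD:-sbatch}

# submit_chain SCRIPT COUNT
# Submit SCRIPT COUNT times. Each job after the first starts only after
# its predecessor has finished successfully (afterok).
submit_chain() {
    chain_script=$1
    chain_count=$2
    chain_prev=""
    chain_i=1
    while [ "$chain_i" -le "$chain_count" ]; do
        if [ -z "$chain_prev" ]; then
            chain_id=$("$SBATCH_CMD" --parsable "$chain_script") || return 1
        else
            chain_id=$("$SBATCH_CMD" --parsable \
                --dependency="afterok:$chain_prev" "$chain_script") || return 1
        fi
        # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
        chain_id=${chain_id%%;*}
        echo "$chain_id"
        chain_prev=$chain_id
        chain_i=$((chain_i + 1))
    done
}
```

For example, `submit_chain myjobscript.slurm 3` would queue three jobs of up to 12 hours each that run one after another and print their job IDs.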
Which partition to choose?
The default partition cpu-clx is suitable for most calculations. The :test partitions are, as the name suggests, intended for shorter and smaller test runs. These have a higher priority and a few dedicated nodes, but provide only limited resources (wall time and number of nodes). Shared nodes are suitable for pre- and post-processing. A job running on a shared node is only accounted for its core fraction (cores of job / all cores per node). All non-shared nodes are exclusive to one job at a time, which implies that the full NPL for the node are paid.
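The core fraction rule for shared nodes (cores of job / all cores per node) can be illustrated with a small calculation. This is only a sketch of the arithmetic: the function name `core_fraction` is hypothetical, and the figure of 96 cores per node is an assumption derived from the list of CPUs (2 units x 48 cores). The actual charge rates are listed under Accounting.

```shell
# core_fraction JOB_CORES CORES_PER_NODE
# Print the fraction of a shared node that is accounted to a job.
core_fraction() {
    awk -v job="$1" -v node="$2" 'BEGIN { printf "%.4f\n", job / node }'
}

# Example: a job using 24 cores on a node with 96 cores
core_fraction 24 96   # prints 0.2500
```

A job on a non-shared node corresponds to a fraction of 1, regardless of how many cores it actually uses.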
Details about the CPU/GPU types can be found below.
The network topology is described here.
The available home/local-ssd/work/perm file systems are discussed under File Systems.
For an overview of all Slurm partitions and the status of nodes: sinfo -r
For detailed information about a particular node: scontrol show node <nodename>
Charge rates for accounting
Charge rates for the Slurm partitions can be found under Accounting.
Fat-Tree Communication Network of Lise
See OPA Fat Tree network of Lise
List of CPUs
...
Short name | Link to manufacturer specifications | Where to find | Units per node | Cores per unit | Clock speed (GHz) |
---|---|---|---|---|---|
Cascade 9242 | Intel Cascade Lake Platinum 9242 (CLX-AP) | CPU partition "Lise" | 2 | 48 | 2.3 |
Cascade 4210 | Intel Cascade Lake Silver 4210 (CLX) | blogin[1-8] | 2 | 10 | 2.2 |
...