The compute nodes of Lise in Berlin (blogin.hlrn.de) and Emmy in Göttingen (glogin.hlrn.de) are organized via the following Slurm partitions:

Lise (Berlin)

| Partition (number holds cores per node) | Node name | Max. walltime | Nodes | Max. nodes per job | Max. jobs per user (running / queued) | Usable memory MB per node | CPU | Charged core-hours per node | Remark |
|---|---|---|---|---|---|---|---|---|---|
| standard96 | bcn# | 12:00:00 | 1204 | 512 | 16 / 500 | 362 000 | Cascade 9242 | 96 | default partition |
| standard96:test | bcn# | 1:00:00 | 32 dedicated + 128 on demand | 16 | 1 / 500 | 362 000 | Cascade 9242 | 96 | test nodes with higher priority but less wall time |
| large96 | bfn# | 12:00:00 | 28 | 8 | 16 / 500 | 747 000 | Cascade 9242 | 144 | fat memory nodes |
| large96:test | bfn# | 1:00:00 | 2 dedicated + 2 on demand | 2 | 1 / 500 | 747 000 | Cascade 9242 | 144 | fat memory test nodes with higher priority but less wall time |
| large96:shared | bfn# | 48:00:00 | 2 dedicated | 1 | 16 / 500 | 747 000 | Cascade 9242 | 144 | fat memory nodes for data pre- and post-processing |
| huge96 | bsn# | 01:00:00 | 2 | 1 | 16 / 500 | 1522 000 | Cascade 9242 | 192 | very fat memory nodes for data pre- and post-processing |

Are 12 hours too short? See Slurm usage for how to pass the 12h wall time limit with job dependencies.
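The idea of passing the wall time limit with job dependencies can be sketched as a chain of jobs, where each segment starts only after the previous one has finished successfully. The following is a minimal sketch, not the site's recommended workflow: the function names `sbatch_submit` and `submit_chain` and the script name `job.sh` are hypothetical, and it assumes the job script can resume from its own checkpoint files. It relies only on the standard `sbatch` options `--parsable` and `--dependency=afterok:<jobid>`.

```python
import subprocess
from typing import Callable, List


def sbatch_submit(args: List[str]) -> str:
    """Submit a job with sbatch and return the job ID it prints."""
    out = subprocess.run(
        ["sbatch", "--parsable", *args],
        check=True, capture_output=True, text=True,
    ).stdout
    # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
    return out.strip().split(";")[0]


def submit_chain(script: str, segments: int,
                 submit: Callable[[List[str]], str] = sbatch_submit) -> List[str]:
    """Submit `segments` copies of `script`, each one depending on the previous.

    Assumption: the script restarts from a checkpoint written by the
    preceding segment, so every segment fits into the wall time limit.
    """
    job_ids: List[str] = []
    for _ in range(segments):
        args: List[str] = []
        if job_ids:
            # Start only after the previous segment ended with exit code 0.
            args.append(f"--dependency=afterok:{job_ids[-1]}")
        args.append(script)
        job_ids.append(submit(args))
    return job_ids
```

Calling `submit_chain("job.sh", 3)` on a login node would queue three segments of up to 12 hours each. The `submit` parameter exists only so the chaining logic can be tried without a Slurm installation.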

Emmy (Göttingen)

| Partition (number holds cores per node) | Node name | Max. walltime | Nodes | Max. nodes per job | Max. jobs per user | Usable memory MB per node | CPU, GPU type | NPL per node hour | Remark |
|---|---|---|---|---|---|---|---|---|---|
| standard96 | gcn# | 12:00:00 | 996 | 256 | unlimited | 362 000 | Cascade 9242 | 96 | default partition |
| standard96:test | gcn# | 1:00:00 | 8 dedicated + 128 on demand | 16 | unlimited | 362 000 | Cascade 9242 | 96 | test nodes with higher priority but less wall time |
| large96 | gfn# | 12:00:00 | 12 | 2 | unlimited | 747 000 | Cascade 9242 | 144 | fat memory nodes |
| large96:test | gfn# | 1:00:00 | 2 dedicated + 2 on demand | 2 | unlimited | 747 000 | Cascade 9242 | 144 | fat memory test nodes with higher priority but less wall time |
| large96:shared | gfn# | 48:00:00 | 2 dedicated + 6 on demand | 1 | unlimited | 747 000 | Cascade 9242 | 144 | fat memory nodes for data pre- and post-processing |
| huge96 | gsn# | 24:00:00 | 2 | 1 | unlimited | 1522 000 | Cascade 9242 | 192 | very fat memory nodes for data pre- and post-processing |
| medium40 | gcn# | 48:00:00 | 424 | 128 | unlimited | 181 000 | Skylake 6148 | 40 | |
| medium40:test | gcn# | 1:00:00 | 8 dedicated + 64 on demand | 8 | unlimited | 181 000 | Skylake 6148 | 40 | test nodes with higher priority but less wall time |
| large40 | gfn# | 48:00:00 | 12 | 4 | unlimited | 764 000 | Skylake 6148 | 80 | fat memory nodes |
| large40:test | gfn# | 1:00:00 | 2 dedicated + 2 on demand | 2 | unlimited | 764 000 | Skylake 6148 | 80 | fat memory test nodes with higher priority but less wall time |
| large40:shared | gfn# | 48:00:00 | 2 dedicated + 6 on demand | 1 | unlimited | 764 000 | Skylake 6148 | 80 | fat memory nodes |
| gpu | ggpu# | 48:00:00 | 3 | 2 | unlimited | 764 000 MB per node (32GB HBM per GPU) | Skylake 6148 + 4 NVIDIA V100 32GB | 375 | see GPU Usage |
| grete | ggpu# | 48:00:00 | 33 | 8 | unlimited | 500 000 MB per node (40GB HBM per GPU) | Zen3 EPYC 7513 + 4 NVIDIA A100 40GB | 600 | |
| grete:shared | ggpu# | 48:00:00 | 35 | 1 | unlimited | 500 000 MB and 1 000 000 MB per node (40GB or 80GB HBM per GPU) | Zen3 EPYC 7513 + 4 NVIDIA A100 40GB and Zen2 EPYC 7662 + 8 NVIDIA A100 80GB | 150 per GPU | |
| grete:interactive | ggpu# | 48:00:00 | 3 | 1 | unlimited | 500 000 MB (10GB or 20GB HBM per MIG slice) | Zen3 EPYC 7513 + 4 NVIDIA A100 40GB split into 2g.10gb and 3g.20gb slices | 47 per MIG slice | see GPU Usage; GPUs are split into slices via MIG (3 slices per GPU) |
| grete:preemptible | ggpu# | 48:00:00 | 3 | 1 | unlimited | 500 000 MB (10GB or 20GB HBM per MIG slice) | Zen3 EPYC 7513 + 4 NVIDIA A100 40GB split into 2g.10gb and 3g.20gb slices | 47 per MIG slice | |

See Slurm usage for how to pass the 12h wall time limit with job dependencies.

Which partition to choose?

If you do not request a partition, your job will be placed in the default partition, which is standard96.

The default partition is suitable for most calculations. The :test partitions are, as the name suggests, intended for shorter and smaller test runs. These have a higher priority and a few dedicated nodes, but are limited in time and number of nodes. Shared nodes are suitable for pre- and post-processing. A job running on a shared node is only accounted for its core fraction (cores of job / all cores per node). All non-shared nodes are exclusive to one job at a time, which implies that the full NPL are paid. Details about the CPU/GPU types can be found below.
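The core-fraction accounting can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions rather than the official accounting formula: the function name `charged_npl` is hypothetical, the job is assumed to run on a single node, and the charge is assumed to scale linearly with the wall time in hours, as the "per node hour" rate suggests.

```python
def charged_npl(npl_per_node_hour: float, cores_requested: int,
                cores_per_node: int, hours: float, shared: bool) -> float:
    """Estimate the charge for a job that runs on one node.

    On a shared node only the core fraction (cores of job / all cores
    per node) is accounted for; on an exclusive node the full rate applies.
    """
    if not 0 < cores_requested <= cores_per_node:
        raise ValueError("cores_requested must be between 1 and cores_per_node")
    fraction = cores_requested / cores_per_node if shared else 1.0
    return npl_per_node_hour * fraction * hours


# 24 of 96 cores for 2 hours at a rate of 144 per node hour (large96:shared)
print(charged_npl(144, 24, 96, 2, shared=True))   # 72.0
# The same request on an exclusive node (large96) pays for the whole node
print(charged_npl(144, 24, 96, 2, shared=False))  # 288.0
```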

The network topology is described here. The available home/local-ssd/work/perm file systems are discussed under File Systems.

For an overview of all Slurm partitions and the status of their nodes: sinfo -r
For detailed information about a particular node: scontrol show node <nodename>
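The output of scontrol show node consists of whitespace-separated Key=Value pairs, so it can be turned into a dictionary for scripting. The following is a rough sketch: the function name `parse_scontrol_node` is hypothetical, and the sample text is made up for illustration (including the node name and all values), not captured from a real node. Values that contain spaces, such as a reason text, are cut at the first space by this simple approach.

```python
def parse_scontrol_node(text: str) -> dict:
    """Split 'Key=Value' tokens from scontrol-style output into a dict."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not Key=Value pairs
            fields[key] = value
    return fields


# Illustrative sample only; real output has many more fields.
sample = """NodeName=bcn1001 CPUTot=96
   RealMemory=362000 State=IDLE Partitions=standard96"""

info = parse_scontrol_node(sample)
print(info["CPUTot"], info["RealMemory"], info["Partitions"])
```

On a login node the text could come from `subprocess.run(["scontrol", "show", "node", name], capture_output=True, text=True).stdout` instead of the sample string.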

Charge rates

Charge rates for the Slurm partitions can be found under Accounting.

Fat-Tree Communication Network of Lise

See OPA Fat Tree network of Lise

List of CPUs

| Short name | Link to manufacturer specifications | Where to find | Units per node | Cores per unit | Clock speed [GHz] |
|---|---|---|---|---|---|
| Cascade 9242 | Intel Cascade Lake Platinum 9242 (CLX-AP) | Lise and Emmy compute partitions | 2 | 48 | 2.3 |
| Cascade 4210 | Intel Cascade Lake Silver 4210 (CLX) | blogin[1-8], glogin[3-8] | 2 | 10 | 2.2 |
| Skylake 6148 | Intel Skylake Gold 6148 | Emmy compute partitions | 2 | 20 | 2.4 |
| Skylake 4110 | Intel Skylake Silver 4110 | glogin[1-2] | 2 | 8 | 2.1 |
| Tesla V100 | NVIDIA Tesla V100 32GB | Emmy gpu partition | 4 | 640/5120* | |
| Tesla A100 | NVIDIA Tesla A100 40GB and 80GB | Emmy grete partitions | 4 or 8 | 432/6912* | |

...