This page contains the most important information about the batch system Slurm that you will need to run software on the HLRN. It does not cover every feature that Slurm has to offer; for that, please consult the official documentation and the man pages.
Submission of jobs mainly happens via the `sbatch` command using a job script, but interactive jobs and node allocations are also possible using `srun` or `salloc`. Resource selection (e.g. the number of nodes or cores) is handled via command-line parameters or can be specified in the job script.
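For illustration, a minimal sketch of these three ways of working (the script name `myjob.sh` and the chosen resources are only placeholders):

```bash
# Submit a job script (the file name myjob.sh is only a placeholder)
sbatch myjob.sh

# Start an interactive shell on one node of the test partition
srun -p standard96:test -N 1 --pty bash

# Allocate two nodes for 30 minutes; afterwards, start parallel tasks with srun
salloc -p standard96:test -N 2 -t 00:30:00
```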
Partitions
Partition | Availability | Max. walltime | Nodes | Max. nodes per job | Max jobs per user | Remark |
---|---|---|---|---|---|---|
standard96 | Lise | 12:00:00 | 952 | 256 | (var) | normal nodes with 384 GB memory, default partition |
standard96:test | Lise | 1:00:00 | 32 dedicated +128 on demand | 16 | 1 | normal test nodes with higher priority but lower walltime |
large96 | Lise | 12:00:00 | 28 | 4 | (var) | fat nodes with 768 GB memory |
large96:test | Lise | 1:00:00 | 2 dedicated +2 on demand | 2 | 1 | fat test nodes with higher priority but lower walltime |
large96:shared | Lise | 48:00:00 | 2 dedicated | 1 | (var) | fat nodes for data pre- and postprocessing |
huge96 | Lise | 24:00:00 | 2 | 1 | (var) | very fat nodes with 1536 GB memory |
medium40 | Emmy | 12:00:00 | 368 | 128 | unlimited | normal nodes with 192 GB memory, default partition |
medium40:test | Emmy | 1:00:00 | 16 dedicated +48 on demand | 8 | unlimited | normal test nodes with higher priority but lower walltime |
large40 | Emmy | 12:00:00 | 11 | 4 | unlimited | fat nodes with 768 GB memory |
large40:test | Emmy | 1:00:00 | 3 | 2 | unlimited | fat test nodes with higher priority but lower walltime |
large40:shared | Emmy | 24:00:00 | 2 | 1 | unlimited | for data pre- and postprocessing |
gpu | Emmy | 12:00:00 | 1 | 1 | unlimited | equipped with 4 x NVIDIA Tesla V100 32GB |
If you do not request a partition, your job will be placed in the default partition, which is standard96 in Berlin and medium40 in Göttingen.
The default partitions are suitable for most calculations. The :test partitions are, as the name suggests, intended for shorter and smaller test runs. They have a higher priority and a few dedicated nodes, but are limited in walltime and number of nodes. The :shared partitions are mainly intended for data pre- and postprocessing. Nearly all nodes are exclusive to a single job; only the nodes in the :shared partitions can be shared by several jobs.
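For example, a non-default partition is requested with -p, either inside the job script or on the command line (the script name is a placeholder):

```bash
# Inside a job script: run on the fat nodes of Lise
#SBATCH -p large96

# On the command line: submit a short test run to the test partition
sbatch -p standard96:test myjob.sh
```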
Parameters
Parameter | Flag | Default / Remark |
---|---|---|
# nodes | -N # | 1 |
# tasks | -n # | 1 |
# tasks per node | --ntasks-per-node # | 96 |
partition | -p <name> | standard96/medium40 |
# CPUs per task | -c # | default 1, relevant for OpenMP/hybrid jobs |
time limit | -t hh:mm:ss | 12:00:00 |
mail notifications | --mail-type=ALL | see the sbatch man page for the available types |
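As a sketch, these parameters can be combined in the header of a job script, e.g. for a hybrid MPI/OpenMP run (all values and the binary name are only examples):

```bash
#!/bin/bash
#SBATCH -p standard96          # partition
#SBATCH -N 2                   # number of nodes
#SBATCH --ntasks-per-node=24   # MPI tasks per node
#SBATCH -c 4                   # CPUs (OpenMP threads) per task
#SBATCH -t 02:00:00            # time limit (hh:mm:ss)
#SBATCH --mail-type=ALL        # e-mail notifications

# one OpenMP thread per allocated CPU of each task
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
srun ./mybinary
```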
Job Scripts
A job script can be any script that contains special instructions for Slurm. The most commonly used form is a shell script, such as `bash` or plain `sh`, but other scripting languages (e.g. Python, Perl, R) are also possible.
Example job script:

```bash
#!/bin/bash
#SBATCH -p medium40
#SBATCH -N 16
#SBATCH -t 06:00:00

module load impi
srun mybinary
```
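This example requests 16 nodes of the medium40 partition for six hours, loads the Intel MPI module, and then launches the application mybinary in parallel with srun.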