...

Codeblock
language: bash
title: Example Batch Script
#!/bin/bash

#SBATCH -p cpu-clx:test
#SBATCH -N 16
#SBATCH -t 06:00:00

module load impi
srun mybinary

...

Parameter          SBATCH flag             Comment
# nodes            -N <#>
# tasks            -n <#>
# tasks per node   --tasks-per-node <#>    Different defaults between mpirun and srun
partition          -p <name>               e.g. cpu-clx (overview: Slurm partition CPU CLX)
# CPUs per task    -c <#>                  Relevant for OpenMP/hybrid jobs
Wall time limit    -t hh:mm:ss
Mail               --mail-type=ALL         See the sbatch man page for other types
Project/Account    -A <project>            Specify the project for core hour accounting
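
As a rough illustration of how the flags in the table combine, the following sketch writes a hybrid MPI/OpenMP job script to a file and checks its syntax. The project name myproject, the file name hybrid_job.sh, and the layout of 24 tasks with 4 CPUs each are assumptions chosen for illustration, not site recommendations; mybinary and the impi module are taken from the example batch script.

```shell
# Write a hypothetical hybrid MPI/OpenMP job script.
# Project name, file name and the 24 x 4 task/CPU layout are assumptions.
cat > hybrid_job.sh <<'EOF'
#!/bin/bash
# 2 nodes, 24 MPI tasks per node, 4 CPUs (OpenMP threads) per task
#SBATCH -N 2
#SBATCH --tasks-per-node 24
#SBATCH -c 4
# partition, wall time limit, mail notifications, project for accounting
#SBATCH -p cpu-clx:test
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL
#SBATCH -A myproject

# One OpenMP thread per CPU requested with -c
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

module load impi
srun mybinary
EOF

# Syntax check only; nothing is submitted here
bash -n hybrid_job.sh && echo "syntax ok"

# Submit from a login node with: sbatch hybrid_job.sh
```

The quoted heredoc delimiter keeps ${SLURM_CPUS_PER_TASK} unexpanded in the file, so the value is filled in by Slurm when the job runs.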

...

Codeblock
title: Example: out of core hours
You can check the account of a job that has run out of core hours.
> squeue
... myaccount ... AccountOutOfNPL ...
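
To see the account and the pending reason side by side, squeue can be given an explicit output format. The sketch below is one possible column selection, not the site's recommended invocation; the column widths are arbitrary, and the command is only executed when squeue is available.

```shell
# Columns: job ID, partition, account, state, reason
fmt='%.10i %.12P %.14a %.10T %r'

if command -v squeue >/dev/null 2>&1; then
    # List only your own jobs
    squeue -u "$USER" -o "$fmt"
else
    echo "squeue not found; run this on a login node"
fi
```

A job whose account has no core hours left would then show its account in the third column and a reason such as AccountOutOfNPL in the last one.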

Interactive jobs

To use compute resources interactively, e.g. to follow the execution of MPI programs, the following steps are required. Note that non-interactive batch jobs submitted via job scripts (see below) remain the primary way of using the compute resources.

  1. A resource allocation for interactive usage must first be requested with the salloc --interactive command, which should also include your resource requirements.
  2. Once salloc has successfully allocated the requested resources, issue an additional srun command to get a shell on one of the allocated nodes (see example below) if you want to work on a compute node.
  3. Afterwards, srun or MPI launch commands such as mpirun or mpiexec can be used to start parallel programs (see the corresponding user guides).
Codeblock
language: text
blogin1 ~ $ salloc -t 00:10:00 -p cpu-clx:test -N2 --tasks-per-node 24
salloc: Granted job allocation [...]
salloc: Waiting for resource configuration
salloc: Nodes bcn[1001,1003] are ready for job
# To get a shell on one of the allocated nodes
blogin1 ~ $ srun --pty --interactive --preserve-env ${SHELL}
bcn1001 ~ $ srun hostname | sort | uniq -c
     24 bcn1001
     24 bcn1003
bcn1001 ~ $ exit
# Exit a second time for Berlin/Lise 
blogin1:~ > exit
salloc: Relinquishing job allocation [...]
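
When several shells are open, it can be unclear whether the current one belongs to an allocation. A minimal sketch for checking this prints a few environment variables that Slurm sets inside a job; the function name show_alloc is made up for illustration, and which variables are set depends on how the allocation was requested.

```shell
# Print selected Slurm job variables, or "<not set>" outside an allocation
show_alloc() {
    for var in SLURM_JOB_ID SLURM_JOB_NODELIST SLURM_JOB_NUM_NODES SLURM_NTASKS; do
        value=$(printenv "$var") || value='<not set>'
        printf '%s=%s\n' "$var" "$value"
    done
}

show_alloc
```

On a login node outside of salloc, all four lines would normally read "<not set>".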

Using the Shared Nodes

We provide a varying number of nodes from the large40 and large96 partitions as post-processing nodes in shared mode, so that multiple jobs can run on a single node at once. You can request CPUs and memory, and you should take care not to exceed your limits. For each CPU/hyperthread, there is about 9.6 GB of memory on large40:shared and about 4 GB on large96:shared.
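
The per-CPU figures above can be turned into a rough memory estimate for a shared job. The helper below is only a sketch: the function name mem_for_cpus is hypothetical, and the values 9.6 and 4 are the approximate numbers quoted above, read as GB per CPU/hyperthread.

```shell
# mem_for_cpus PARTITION NCPUS
# Prints the approximate memory share in GB for NCPUS CPUs/hyperthreads.
mem_for_cpus() {
    case "$1" in
        large40:shared) per_cpu=9.6 ;;
        large96:shared) per_cpu=4 ;;
        *) echo "unknown partition: $1" >&2; return 1 ;;
    esac
    LC_ALL=C awk -v n="$2" -v g="$per_cpu" 'BEGIN { printf "%.1f\n", n * g }'
}

mem_for_cpus large40:shared 4    # prints 38.4
mem_for_cpus large96:shared 8    # prints 32.0
```

A job asking for 8 CPUs on large96:shared would therefore stay within its share with about 32 GB, for example by requesting -p large96:shared, -c 8 and --mem=32G in the job script.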

...