Using srun to create multiple job steps

You can use srun to start multiple job steps concurrently on a single node, e.g. if your job is not big enough to fill a whole node. There are a few details to keep in mind:

  • By default, the srun command gets exclusive access to all resources of the job allocation and uses all allocated tasks
    • you therefore need to limit srun to use only part of the allocation
    • this includes implicitly granted resources, i.e. memory and GPUs
    • the --exact flag is needed
    • if running non-MPI programs, use the -c option to specify the number of cores each process should have access to
  • srun waits for the program to finish, so you need to start concurrent processes in the background (with & and a final wait, see the examples below)
  • Good default memory-per-CPU values (without hyperthreading) are usually:


    Partition       --mem-per-cpu
    standard96      3770M
    large96         7781M
    huge96          15854M
    medium40        4525M
    large40/gpu     19075M
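As a quick worked example: a job step started with -c 10 --mem-per-cpu 3770M on standard96 may use up to 10 × 3770M ≈ 37 GB of memory.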


Examples

Codeblock (bash): Four concurrent Programs
#!/bin/bash
#SBATCH -p standard96
#SBATCH -t 06:00:00
#SBATCH -N 1

srun --exact -n1 -c 10 --mem-per-cpu 3770M  ./program1 &
srun --exact -n1 -c 80 --mem-per-cpu 3770M  ./program2 &
srun --exact -n1 -c 6 --mem-per-cpu 3770M  ./program3 &
wait
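
In this example the three job steps request 10 + 80 + 6 = 96 cores in total, exactly filling one standard96 node.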

...

Codeblock (bash): Run a single GPU program four times concurrently
#!/bin/bash
#SBATCH -p gpu
#SBATCH -t 12:00:00
#SBATCH -N 1

srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
srun --exact -n1 -c 10 -G1 --mem-per-cpu 19075M  ./single-gpu-program &
wait
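
Here each job step requests 10 cores, one GPU and 10 × 19075M ≈ 186 GB of memory, so the four concurrent steps together occupy 40 cores and four GPUs (assuming the gpu partition provides 40-core nodes with four GPUs each).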

Using the Linux parallel command to run a large number of tasks

If you have to run many nearly identical but small tasks (single-core, little memory), you can try the Linux parallel command. To use this approach, you first need to write a bash shell script, e.g. task.sh, which executes a single task. As an example we will use the following script:

...
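A minimal, hypothetical sketch of such a task.sh could look like this (the actual task and its parameter handling will of course differ):

Codeblock (bash): task.sh (hypothetical sketch)
#!/bin/bash
# task.sh - stand-in for a single small task: takes one input parameter,
# simulates some work and reports which parameter it processed
parameter=$1
sleep $(( RANDOM % 10 ))     # placeholder for the actual computation
echo "finished task with parameter ${parameter}"

Remember to make the script executable (chmod +x task.sh), since the job script below calls it as ./task.sh.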

Codeblock (bash): parallel_job.sh
#!/bin/bash

#SBATCH --partition standard96:test      # adjust partition as needed
#SBATCH --nodes 1                        # more than 1 node can be used
#SBATCH --tasks-per-node 96              # one task per CPU core, adjust for partition

# set memory available per core
MEM_PER_CORE=3770    # must be set to the value that corresponds with the partition (see table above)
                     # see https://www.hlrn.de/doc/display/PUB/Multiple+concurrent+programs+on+a+single+node

# Define srun arguments:
srun="srun -n1 -N1 --exclusive --mem-per-cpu $MEM_PER_CORE"
# --exclusive     ensures srun uses distinct CPUs for each job step
# -N1 -n1         allocates a single core to each task

# Define parallel arguments:
parallel="parallel -N 1 --delay .2 -j $SLURM_NTASKS --joblog parallel_job.log"
# -N                number of arguments you want to pass to the task script
# -j                number of parallel tasks (determined from resources provided by Slurm)
# --delay .2        prevents overloading the controlling node on short jobs
# --resume          add if needed to use joblog to continue an interrupted run (job resubmitted)
# --joblog          creates a log-file, required for resuming

# Run the tasks in parallel
$parallel "$srun ./task.sh {1}" ::: {1..100}
# task.sh          executable(!) script with the task to complete, may depend on some input parameter
# ::: {a..b}       range of parameters, alternatively $(seq 100) should also work
# {1}              parameter from range is passed here, multiple parameters can be used with
#                  additional {i}, e.g. {2} {3} (refer to parallel documentation)

...

Codeblock
$ sbatch parallel_job.sh

Looping over two arrays

You can use parallel to loop over multiple arrays. The --xapply option controls whether the inputs are paired element by element or all combinations are generated:

Codeblock (bash): Looping over multiple inputs
$ parallel --xapply echo {1} {2} ::: 1 2 3 ::: a b c
1 a
2 b
3 c
$ parallel echo {1} {2} ::: 1 2 3 ::: a b c
1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c

...