Too many open files


When using srun --propagate

Since the Slurm upgrade to version 21, a process started with "srun" using the "--propagate" option can fail with "Too many open files".

Slurm version 21 runs the compute process with a hard open file limit (RLIMIT_NOFILE) of only 4096. A program that needs more open file descriptors than that fails with the error above.
See also https://github.com/SchedMD/slurm/commit/18b2f4fff3f8fd5773ab14ec631bbd5f2995fa6e
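
To check whether this limit is the cause, it can help to print the open file limits that a process actually sees. The following is a minimal sketch, not an official Slurm tool; the function name show_nofile is made up for this example.

```shell
#!/usr/bin/env bash
# show_nofile is a hypothetical helper name, not part of Slurm.
# It prints the soft and hard open file limits (RLIMIT_NOFILE)
# of the shell it runs in.
show_nofile() {
    echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
}

show_nofile
```

Running the script once on the submitting node and once as the command of a job step shows whether the two environments have different limits.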


Solution

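Add NOFILE to the list of limits passed to --propagate, so that the open file limit is propagated to the compute process as well. See also man 1 srun.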

Example:

$ srun --propagate=STACK,NOFILE ...

instead of

$ srun --propagate=STACK ...
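
Since the propagated limits are taken from the submitting shell, the soft limit there may need to be raised before calling srun. The following is a sketch under that assumption; the function name raise_nofile is hypothetical, and whether the resulting value is sufficient depends on the site configuration.

```shell
#!/usr/bin/env bash
# raise_nofile is a hypothetical helper name, not part of Slurm.
# It raises the soft open file limit of the current shell up to
# its hard limit, which an unprivileged user is allowed to do.
raise_nofile() {
    hard=$(ulimit -Hn)
    ulimit -Sn "$hard"
}

raise_nofile
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
```

After this, an srun call with NOFILE in --propagate started from the same shell passes on the raised soft limit.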