Problem

In a job that requires "staging" of large new input files (8 GB in 650 files) at runtime, the job fails with error messages like "invalid file format". Inspecting the files afterwards does not reveal any errors; the input files are intact.

Code block
cp repository/* input_area
mpirun ...

This appears to be a Lustre cache-related problem: the parallel process starts up faster than Lustre can synchronise itself across all nodes.

Solution

Add some delay after copying large file sets:

Code block
cp repository/* input_area
sleep 20
mpirun ...
sleep 20
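
The fixed delay can be wrapped in a small helper so that the wait time is adjustable per job. This is a minimal sketch, not a verified fix: the function name `stage_inputs` and the variable `STAGE_DELAY` are hypothetical, and the extra `sync` call is an assumption (flushing buffered writes before waiting may help, but this has not been confirmed for this problem). Only the copy and the 20 second sleep come from the workaround above.

```shell
#!/bin/sh
# Hypothetical helper: copy the input files, flush buffered writes,
# then wait before the parallel program is started.
# STAGE_DELAY (seconds) is an assumed knob; the default of 20 matches
# the delay used in the workaround above.
stage_inputs() {
    src_dir=$1
    dst_dir=$2
    # Abort early if the copy itself fails.
    cp "$src_dir"/* "$dst_dir" || return 1
    # Assumption: flushing client-side buffers before waiting is harmless
    # and may help; it is not part of the original workaround.
    sync
    sleep "${STAGE_DELAY:-20}"
}
```

In a job script this could be called as `stage_inputs repository input_area` directly before the `mpirun ...` line.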

Alternatively, the tool nocache serves as a workaround for this issue (thanks John):

Code block
nocache cp repository/* input_area
mpirun ...
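
Since nocache may not be installed on every system, a job script could check for it and otherwise fall back to the plain copy followed by a delay. The following is only a sketch under that assumption: the function name `copy_inputs` and the variable `FALLBACK_DELAY` are hypothetical, and the 20 second default is taken from the sleep-based workaround.

```shell
#!/bin/sh
# Hypothetical helper: prefer nocache when it is found on PATH,
# otherwise copy normally and wait before the parallel start.
copy_inputs() {
    src_dir=$1
    dst_dir=$2
    if command -v nocache >/dev/null 2>&1; then
        # nocache runs the given command while minimising its effect
        # on the page cache.
        nocache cp "$src_dir"/* "$dst_dir"
    else
        # Fallback: plain copy plus a delay (FALLBACK_DELAY is an assumed knob).
        cp "$src_dir"/* "$dst_dir" && sleep "${FALLBACK_DELAY:-20}"
    fi
}
```

The call `copy_inputs repository input_area` would then replace the copy line before `mpirun ...`.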
