NHR@ZIB operates three central storage systems at each site, each with its own global file system:
File System | Capacity | Storage Technology and Function |
---|---|---|
HOME | 340 TiB | IBM Spectrum Scale file system, exported via NFS to compute and login nodes |
WORK | 10 PiB | DDN ExaScaler with Lustre parallel file system |
PERM | | Tape archive with multi-petabyte capacity and additional hard-disk caches |
The system Emmy has additional storage options for high IO demands:
- Phase 1 nodes (partitions medium40 and large40): local SSD. All nodes of the partition standard96:ssd have local SSDs for temporary data at $LOCAL_TMPDIR (400 GiB shared among all jobs running on the node). The environment variable $LOCAL_TMPDIR is available on all nodes, but on the phase 2 systems it points to a ramdisk.
- DDN IME based burst buffer with 48 TiB NVMe storage (general availability together with the phase 2 nodes)
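A job can use the node-local storage by staging its input there, working locally, and copying the results back at the end. The following is only a minimal sketch of that pattern: the helper names `stage_in` and `stage_out`, the fallback to `TMPDIR` or `/tmp` when `LOCAL_TMPDIR` is unset, and all file names are assumptions made for illustration, not part of the HLRN setup.

```shell
#!/bin/bash
# Sketch: stage data onto node-local storage and copy results back.
# LOCAL_TMPDIR is provided by the system; the fallback chain below is an
# assumption so that the helpers also work where the variable is not set.

stage_in() {
    # stage_in <input file>
    # Creates a private directory below $LOCAL_TMPDIR, copies the input file
    # into it and prints the directory path.
    local input="$1"
    local base="${LOCAL_TMPDIR:-${TMPDIR:-/tmp}}"
    local scratch
    scratch="$(mktemp -d "${base}/job.XXXXXX")" || return 1
    cp -- "$input" "$scratch/" || return 1
    printf '%s\n' "$scratch"
}

stage_out() {
    # stage_out <scratch dir> <result file name> <destination dir>
    # Copies one result file back and removes the scratch directory afterwards,
    # because the local space is shared among all jobs on the node.
    local scratch="$1" result="$2" dest="$3"
    cp -- "$scratch/$result" "$dest/" && rm -rf -- "$scratch"
}

# Hypothetical use inside a job script (my_app and the file names are placeholders):
#   scratch="$(stage_in "$WORK/input.dat")"
#   (cd "$scratch" && my_app input.dat > output.dat)
#   stage_out "$scratch" output.dat "$WORK"
```

Because the 400 GiB are shared among all jobs running on a node, removing the scratch directory at the end of the job is part of the sketch rather than an afterthought.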
Login and copying data between HLRN sites
Inter-complex login (ssh) as well as data copy (rsync/sftp) between both sites (Berlin and Göttingen) should work out of the box. The same is true for intra-complex ssh and scp between nodes of one site. This is enabled through host-based authentication.
...
- 2 TB per node). For more details refer to Special Filesystems.
LIFETIME
In general, we store all data for an extra year after the end of a test account/project. Unless extended, the standard term of a test account/project is one year.
HOME
Each user has one home directory at each compute site (Emmy and Lise).
...
HOME directory:
- directory HOME=/home/${USER}
- for a higher number of files
- configuration files
- source code and executables
- limited disk space
- backup is available
The home filesystem and /sw are mounted via NFS, so performance is moderate. We take daily snapshots of the filesystem, which can be used to restore a former state of a file or directory. These snapshots can be accessed through the path /home/.snapshots or /sw/.snapshots. There are additional regular backups to restore the filesystem in case of a catastrophic failure.
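Restoring a file from a snapshot amounts to locating the snapshot that still holds the wanted version and copying it back. The sketch below assumes, without confirmation from the text above, that every snapshot appears as a subdirectory of the snapshot root and mirrors the filesystem below it (`<snapshot root>/<snapshot name>/<relative path>`). The function names and the example paths are hypothetical.

```shell
#!/bin/bash
# Sketch: find and restore earlier versions of a file from filesystem snapshots.
# ASSUMPTION: layout is <snapshot root>/<snapshot name>/<path relative to the filesystem>.

list_snapshot_versions() {
    # list_snapshot_versions <snapshot root> <relative path>
    # Prints every snapshot copy of the given path, one per line.
    local root="$1" rel="$2" snap
    for snap in "$root"/*/; do
        [ -e "${snap}${rel}" ] && printf '%s\n' "${snap}${rel}"
    done
    return 0
}

restore_from_snapshot() {
    # restore_from_snapshot <path inside a snapshot> <destination>
    # Copy to a new name first so the current version is not overwritten.
    cp -a -- "$1" "$2"
}

# Hypothetical use (the relative path and file names are placeholders):
#   list_snapshot_versions /home/.snapshots "${USER}/project/config.yaml"
#   restore_from_snapshot <one of the printed paths> "$HOME/project/config.yaml.restored"
```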
...
The Lustre based work filesystem /scratch is the main work filesystem for the HLRN systems. Each user can distribute data to different directories.
...
- parallel input/output for production jobs
- moderate number of files
- transient nature of data
- no backup, no disaster recovery
- available directories
  - user directory WORK=/scratch/usr/${USER}
    - large input/output data for production jobs
    - intended for use by the individual user only
  - project directory /scratch/projects/<projectID>
    - large input/output data for production jobs
    - intended for sharing data within the project group (please remember: no backup of the Lustre file system), see also hints on disk quota
  - TMPDIR directory TMPDIR=/scratch/tmp/${USER}
    - applications and compilers store data temporarily
We provide no backup of this filesystem. The storage system of Emmy provides around 65 GiB/s streaming bandwidth; that of Lise provided around 85 GiB/s during the acceptance test. With higher occupancy, the effective (write) streaming bandwidth is reduced.
The storage system is hard-disk based (with SSDs for metadata), so the best performance is reached with sequential IO on large files that is aligned to the fullstripe size of the underlying RAID6 (Emmy 1 MiB, Lise 16 MiB).
If you are accessing a large file (1 GiB+) from multiple nodes in parallel, please consider activating striping of the file with the Lustre command lfs setstripe (for this specific file or for a whole directory; changes apply only to new files, so applying a new striping to an existing file requires a file copy) with a sensible stripe_count (recommendation: Emmy up to 32, Lise up to 8) and a stripe_size that is a multiple of the RAID6 fullstripe size and matches the IO sizes of your job.
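The striping recommendation can be turned into a small helper: round the typical IO size of the job up to the next multiple of the fullstripe size, then set that as the default layout of an output directory. This is a sketch under assumptions. The function names, the example directory, and the 20 MiB IO size are made up for illustration; only `lfs setstripe -c`/`-S` and `lfs getstripe -d` are actual Lustre commands.

```shell
#!/bin/bash
# Sketch: derive a stripe size and apply a default striping to a directory.

round_up_to_fullstripe() {
    # round_up_to_fullstripe <IO size in MiB> <fullstripe size in MiB>
    # Prints the smallest multiple of the fullstripe size that is >= the IO size.
    local io="$1" full="$2"
    echo $(( (io + full - 1) / full * full ))
}

apply_striping() {
    # apply_striping <directory> <stripe count> <stripe size in MiB>
    # Only files created after this call inherit the layout.
    if command -v lfs >/dev/null 2>&1; then
        lfs setstripe -c "$2" -S "${3}M" "$1" && lfs getstripe -d "$1"
    else
        echo "lfs not found; would run: lfs setstripe -c $2 -S ${3}M $1"
    fi
}

# Hypothetical use on Lise (fullstripe 16 MiB, job writes chunks of about 20 MiB):
#   size=$(round_up_to_fullstripe 20 16)      # 32
#   apply_striping "$WORK/output" 8 "$size"
```

To restripe an existing file, copy it into a directory that already carries the new layout, since the layout of an existing file does not change.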
A general recommendation for network filesystems is to keep the number of metadata operations, such as opening and closing files or checking for file existence or changes, as low as possible. These operations often become a bottleneck for the IO of your job and, on large clusters such as the ones operated by HLRN, can easily overload the file servers.
...
The magnetic tape archive provides additional storage for inactive data to free up space on the WORK or HOME filesystem. It is directly accessible on the login nodes under the directory listed below.
- directory /perm/${USER}
...
- secure file system location on magnetic tapes
- not a solution for long-term data archiving
- no guarantee of 10-year retention as required by the rules of good scientific practice
For reasons of efficiency and performance, small files and/or complex directory structures should not be transferred to the archive directly. Please aggregate your data into compressed tarballs or other archive containers with a maximum size of 5.5 TiB before copying your data to the archive.
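A minimal sketch of the aggregation step: pack one directory into a single compressed tarball and refuse to continue if the result exceeds the 5.5 TiB limit. The function name `pack_for_archive`, the variable `MAX_ARCHIVE_BYTES`, and the directory names in the usage comment are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: aggregate a directory into one compressed tarball for the tape archive.

# 5.5 TiB expressed in bytes (55/10 * 1024^4).
MAX_ARCHIVE_BYTES=$(( 55 * 1024 * 1024 * 1024 * 1024 / 10 ))

pack_for_archive() {
    # pack_for_archive <source directory> <tarball path>
    # Creates a gzip-compressed tarball and checks its size against the limit.
    local src="$1" tarball="$2" size
    tar -czf "$tarball" -C "$(dirname -- "$src")" "$(basename -- "$src")" || return 1
    size=$(( $(wc -c < "$tarball") ))
    if [ "$size" -gt "$MAX_ARCHIVE_BYTES" ]; then
        echo "tarball exceeds 5.5 TiB; split the data into several archives" >&2
        return 1
    fi
}

# Hypothetical use on a login node (run42 is a placeholder):
#   pack_for_archive "$WORK/run42" "$WORK/run42.tar.gz" \
#       && cp "$WORK/run42.tar.gz" "/perm/${USER}/"
```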
...
If you have questions regarding your quota please contact your consultant.
You can display quota information for your user account with the command hlrnquota. More details can be found on the page Quota solutions.
...
You may exceed your soft-quota limit for a grace period of 2 weeks; after that, further write access is denied. Exceeding a hard-quota limit results in immediate denial.
File system quotas at HLRN are realized
- on HOME and WORK with respect to the unix groups
- on PERM with respect to the user account.
Quota on HOME and WORK
For each file stored on the file systems HOME and WORK, the unix group of the file determines which group's quota the file is counted against. The directory in which a file is located (/scratch/usr/${USER} or /scratch/projects/<projectID>) does not matter for quota.
Each user account is a member of a number of unix groups. You can check the list of unix groups with the command groups.
```
blogin4> groups myaccount
myaccount prj00012
```
Once you set the unix group of a file to prj00012, the size of this file counts toward the used quota of the unix group prj00012.
```
blogin4> chgrp prj00012 somefile.txt
blogin4> ls -la somefile.txt
-rw------- 1 myaccount prj00012 237271040 Jul 3 2020 somefile.txt
```
```
blogin4> chgrp --recursive prj00012 somedirectory
blogin4> ls -lad somedirectory
drwx------ 1 myaccount prj00012 4096 Jul 3 2020 somedirectory
```
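Since quota follows the unix group of each file, it can help to check which files in a directory tree belong to a given group and roughly how much space they account for. The sketch below relies on GNU find (`-group`, `-printf`). The function names are hypothetical, and the byte total is the sum of apparent file sizes, which is only an approximation of what the quota system reports.

```shell
#!/bin/bash
# Sketch: inspect group ownership below a directory with respect to quota.

group_usage_bytes() {
    # group_usage_bytes <directory> <unix group>
    # Prints the summed apparent size (bytes) of regular files owned by the group.
    find "$1" -type f -group "$2" -printf '%s\n' \
        | awk '{ total += $1 } END { printf "%.0f\n", total }'
}

files_not_in_group() {
    # files_not_in_group <directory> <unix group>
    # Lists entries that are counted against some other group's quota.
    find "$1" ! -group "$2" -print
}

# Hypothetical use with the group from the examples above:
#   group_usage_bytes somedirectory prj00012
#   files_not_in_group somedirectory prj00012
```

Entries listed by `files_not_in_group` can then be reassigned with `chgrp` as shown in the examples above.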
On Feb 1st 2021 all members of a project will be added to the matching unix group and gain access to the project's files. Please adjust your project members and files accordingly by then. If you want to grant project members access to the files before that date, simply re-add them under https://zulassung.hlrn.de/.