Ch. 2.3: Storage
The national clusters have several types of storage attached:
- Home space is each user's own space and is the default directory. It is relatively small and is generally backed up. It is available on compute and login nodes.
- Project space is somewhat larger than home and is a suitable location for project data shared within the research group. It is available on compute and login nodes.
- Scratch space is a high-speed, very large space for transient data, such as temporary files used by running jobs. It is only available on compute nodes.
- Nearline is slower storage made up of a combination of disk and tape. It is intended for data that belongs to active projects but is accessed infrequently. It is only available on login nodes.
Details on the storage available on each cluster are available here.
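As a rough illustration of how the storage types described above can work together, the following Python sketch stages an input file from project space into scratch, does its work there, and copies only the result back. The function name `run_with_scratch`, the file names, and the line-counting "computation" are all hypothetical placeholders, and the real directory layout differs between clusters, so treat this as a pattern rather than a recipe.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_scratch(project_input: Path, scratch_root: Path, results_dir: Path) -> Path:
    """Copy an input file to scratch, work on it there, and copy the result back.

    All three paths are supplied by the caller; nothing here assumes a
    particular cluster layout.
    """
    # Create a private working directory under scratch for this run.
    workdir = Path(tempfile.mkdtemp(prefix="job_", dir=scratch_root))
    try:
        staged = workdir / project_input.name
        shutil.copy2(project_input, staged)

        # Placeholder for the real computation: count the lines of the input.
        line_count = len(staged.read_text().splitlines())
        output = workdir / "line_count.txt"
        output.write_text(f"{line_count}\n")

        # Keep only the result; project space is the longer-lived location.
        results_dir.mkdir(parents=True, exist_ok=True)
        final = results_dir / output.name
        shutil.copy2(output, final)
        return final
    finally:
        # Scratch is for transient data, so remove the working directory.
        shutil.rmtree(workdir, ignore_errors=True)
```

Because the sketch uses scratch, it would run inside a job on a compute node, with `scratch_root` and `results_dir` pointing at your own scratch and project directories.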
You can see a breakdown of your storage on an Alliance national cluster using the diskusage_report command:
$ diskusage_report
Description                      Space        # of files
/home (user youruser)            401M/50G     11k/500k
/scratch (user youruser)         33k/20T      1/1000k
/project (group youruser)        0/2048k      0/1025
/project (group def-youruser)    255k/1000G   9/505k
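Each entry in the report is a used/quota pair, so the fraction of a quota in use can be worked out directly. The sketch below is one way to do that arithmetic in Python. The helper names `parse_amount` and `percent_used` are hypothetical, and the code assumes that the suffixes k, M, G and T are decimal multipliers; the actual tool may use binary units, in which case the multipliers would need adjusting.

```python
import re

# Assumption: suffixes are decimal multipliers (k = 10**3, M = 10**6, ...).
# The real report may use binary units instead.
_MULTIPLIERS = {"": 1, "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_amount(text: str) -> float:
    """Convert a value such as '401M' or '1025' into a plain number."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised amount: {text!r}")
    number, suffix = match.groups()
    return float(number) * _MULTIPLIERS[suffix]


def percent_used(pair: str) -> float:
    """Return the percentage used for a 'used/quota' pair such as '401M/50G'."""
    used, quota = pair.split("/")
    return 100.0 * parse_amount(used) / parse_amount(quota)


# The /home line from the report above.
print(f"space: {percent_used('401M/50G'):.1f}%")   # space: 0.8%
print(f"files: {percent_used('11k/500k'):.1f}%")   # files: 2.2%
```

Under these assumptions, the home directory in the example is using under 1% of its space quota and about 2% of its file-count quota.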
The Globus Portal is the Alliance’s recommended method for transferring data to and from the national clusters. It is a fast, safe, and reliable tool that lets researchers control data movement between their computers and Alliance storage through an intuitive web interface. Globus handles the data transfer in the background. It can even move data to a researcher’s local computer when a firewall, NAT (Network Address Translation), or router would otherwise block connections. More information on the Globus Portal can be found here.
Exercises