The /scratch file system on the NYU HPC clusters is intended for short-term storage of analysis-ready data sets. It is not suitable for storing data for long periods of time.
Files on /scratch are NOT backed up. Always back up your important data to /archive. Note that /archive is available only on the HPC login nodes, not on compute nodes.
Files under /scratch that have not been accessed for more than 60 days will be automatically deleted. To find out which files will be purged, you can run the following command:
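A minimal sketch of such a check uses find with its -atime test. It assumes your files live under /scratch/$USER, which this page does not confirm. The function name list_purge_candidates is hypothetical, and the site's official command may differ.

```shell
# List regular files whose last access time is older than N days (default 60).
# find ignores fractional days, so "-atime +60" matches files last accessed
# 61 or more full days ago; treat the result as an approximation of the purge list.
# Usage: list_purge_candidates [directory] [days]
list_purge_candidates() {
    find "${1:-/scratch/$USER}" -type f -atime +"${2:-60}" -print
}
```

Calling list_purge_candidates with no arguments scans the assumed default directory. Pass a smaller day count, such as 50, to see files that are approaching the limit.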
Modifying file access times (using "touch" or any other method) for the purpose of circumventing purge policies may result in the loss of access to the cluster.
If, even with this policy in place, total usage of /scratch remains above 70%, the heaviest /scratch users will be asked to reduce their usage.
Please do not use the Linux cp command directly to copy data from /scratch to /archive on the Mercer login nodes. Doing so would put a heavy load on /archive and slow down access to /home on Mercer, since both file systems are served from the same disks. Instead, use the /share/apps/utils/rsync.sh shell script wrapper to copy data from /scratch to /archive. Inside this script, rsync is limited to a bandwidth of no more than 20 MB/s, and the -a flag (archive mode) is enabled.
When the source and destination are on different file systems with different block size settings, the directory sizes may look different. This is fine, since rsync verifies checksums to make sure the data was transferred successfully.
The wrapper's usage is similar to that of the rsync command, e.g.:
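As an illustration only, the behaviour described for the wrapper (archive mode plus a bandwidth cap of about 20 MB/s) can be approximated with plain rsync options. The function name throttled_rsync and the example paths are hypothetical. The actual contents of /share/apps/utils/rsync.sh are not shown here and may differ.

```shell
# Hypothetical stand-in for /share/apps/utils/rsync.sh; the real script may differ.
# rsync's --bwlimit is given in units of 1024 bytes per second by default,
# so 20480 corresponds to roughly 20 MB/s.
throttled_rsync() {
    rsync -a --bwlimit=20480 "$@"
}

# On the cluster you would call the wrapper itself with rsync-style arguments
# (both paths below are placeholders, not confirmed locations):
#   /share/apps/utils/rsync.sh /scratch/$USER/my_project /archive/$USER/
```

As with rsync itself, a trailing slash on the source directory copies its contents and not the directory itself.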
Please back up only what you actually need to retain: the minimum needed to reproduce your work.
If you see warnings about set_acl similar to the following example, please ignore them. The /scratch file system is Lustre with FACLs enabled, while /archive is a ZFS file system. FACLs set on files in /scratch cannot be preserved when the files are synchronized to /archive, which is what the warning is reporting. This also means that when you copy data back from /archive to /scratch and want to share it with others, you will have to set the FACLs again.
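Setting FACLs again after a restore could look like the sketch below. It assumes the standard setfacl and getfacl tools are available on the cluster. The function name share_with_user, the user name, and the paths are all hypothetical.

```shell
# Grant another user read access to a restored directory tree.
# $1 = user name to share with, $2 = directory (both placeholders).
# "rX" means read, plus execute only on directories and on files that are
# already executable, so the tree can be traversed without marking data
# files as executable.
share_with_user() {
    setfacl -R -m "u:$1:rX" "$2"
}

# Example (placeholder values):
#   share_with_user some_netid /scratch/$USER/my_project
# Inspect the resulting ACL with:
#   getfacl /scratch/$USER/my_project
```

The other user also needs permission to traverse the parent directories, which this sketch does not set.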