
This page is retained from an earlier version of the HPC wiki only for reference, and the equivalent up-to-date page is at Quick Links.


The NYU HPC team currently maintains two clusters: the HPC cluster Prince and the Hadoop cluster Dumbo.

HPC user accounts

An HPC user account provides access to all NYU HPC and Big Data clusters. If you don't have one yet, you may apply for an HPC user account.

Old HPC clusters

The NYU HPC team has retired its older clusters (Union Square, Cardiac, Bowery, Mercer). The current production HPC cluster is Prince.


  • Prince

    Prince is the new HPC cluster currently being deployed; it will replace the Mercer HPC cluster.

  • Dumbo

    Dumbo is a 44 data node Hadoop cluster running Cloudera Distribution of Hadoop (CDH).

    • For a detailed description of Dumbo and how to access it, please see the Dumbo wiki pages.

  • ViDA OpenStack

    The ViDA OpenStack cluster is currently being deployed and is not yet in production.


The table below shows the file systems available on the Prince cluster.

Mountpoint        | Storage Capacity (User Quota) | FS Type | Backed up? | Flushed?                                   | Availability                                   | Variable      | Value
/home             | 43 TB (20 GB / user)          | ZFS     | Yes        | No                                         | All Prince nodes (login, compute)              | $HOME         | /home/$USER
/scratch          | 1.1 PB (5 TB / user)          | Lustre  | No         | Yes (files unused for 60 days are deleted) | All Prince nodes (login, compute)              | $SCRATCH      | /scratch/$USER
/beegfs           | 500 TB (2 TB / user)          | BeeGFS  | No         | Yes (files unused for 60 days are deleted) | All nodes (login, compute)                     | $BEEGFS       | /beegfs/$USER
/archive          | 700 TB (2 TB / user)          | ZFS     | Yes        | No                                         | Only on login nodes                            | $ARCHIVE      | /archive/$USER
/state/partition1 | Varies, mostly >100 GB        | ext3    | No         | Yes (at the end of each job)               | Separate local filesystem on each compute node | $SLURM_JOBTMP | /state/partition1/$SLURM_JOBID
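
The per-user paths in the table are exposed as environment variables on the cluster. As a minimal sketch (assuming a node where the cluster environment sets the variables listed above), the Python snippet below resolves each variable and reports whether the corresponding directory is reachable from the current node; $SLURM_JOBTMP only exists inside a running batch job, and /archive is only mounted on login nodes, so some entries can legitimately be missing.

import os

# Environment variables taken from the table above. $SLURM_JOBTMP is only set
# inside a running Slurm batch job, and /archive ($ARCHIVE) is only mounted on
# login nodes, so some entries may be unset or unreachable on a given node.
FILESYSTEM_VARS = ["HOME", "SCRATCH", "BEEGFS", "ARCHIVE", "SLURM_JOBTMP"]

for var in FILESYSTEM_VARS:
    path = os.environ.get(var)
    if path is None:
        print(f"${var}: not set on this node")
    elif os.path.isdir(path):
        print(f"${var} -> {path} (reachable)")
    else:
        print(f"${var} -> {path} (not reachable from this node)")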

NYU HPC is currently in the process of a major upgrade:

  • 2014 Q1: A new system with 3200 Intel Ivy Bridge (c2013) cores in 160 nodes, to be named Mercer, is being installed
  • 2014 Q2: Union Square and Cardiac, which are at the end of their working lives, will be decommissioned
  • 2014 Q2: Most of the hardware comprising Bowery will be incorporated into Mercer to form a single, heterogeneous system.
  • 2014 Q2: Hydra will be integrated into Mercer
  • 2014 Q2: Lustre will be upgraded
  • 2015 Q1: Babar will reach end-of-life and be decommissioned

Consequently the information here is in a state of flux!

TODO: mention myquota.

The diagram below shows network and storage access of the NYU clusters.

 

Some important aspects of the cluster setup are:

  • The NYU clusters cannot be accessed directly from the internet: users must first log in to the bastion host hpc.nyu.edu (outbound internet connections from the clusters are supported, however).
  • Each cluster consists of login nodes and compute nodes. The login nodes are for compiling code and preparing runs; actual computation should be run on the compute nodes by submitting it as a batch job (TODO: link to how-to)
  • The /scratch filesystem is available on login and compute nodes, and is on a high-speed network. $SCRATCH is optimized for large-block I/O; please try not to use it for frequent, small I/O transfers. For those, we recommend the node-local filesystem /state/partition1, as sketched after this list (TODO: link to more info about fs usage)
  • The /archive filesystem is only available on the login nodes
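
As referenced in the /scratch item above, the following is a minimal Python sketch of the recommended pattern inside a batch job: stage a large input from $SCRATCH in one transfer, do the frequent small-file work under the node-local $SLURM_JOBTMP directory, and copy the results back to $SCRATCH before the job ends. The file names input.dat and results/ and the per-record work are hypothetical placeholders, not part of any NYU HPC workflow.

import os
import shutil

# $SCRATCH (Lustre) is optimized for large-block I/O; $SLURM_JOBTMP is the
# node-local /state/partition1/$SLURM_JOBID directory, wiped at the end of the job.
scratch = os.environ["SCRATCH"]
jobtmp = os.environ["SLURM_JOBTMP"]

# 1. Stage the large input from $SCRATCH to node-local storage in one transfer
#    ("input.dat" is a hypothetical placeholder name).
staged_input = os.path.join(jobtmp, "input.dat")
shutil.copy(os.path.join(scratch, "input.dat"), staged_input)

# 2. Do the frequent, small-file work entirely on the node-local filesystem.
workdir = os.path.join(jobtmp, "results")
os.makedirs(workdir)
with open(staged_input) as infile:
    for i, line in enumerate(infile):
        with open(os.path.join(workdir, f"part_{i:06d}.txt"), "w") as out:
            out.write(line.upper())  # placeholder for real per-record work

# 3. Copy the results back to $SCRATCH in a single transfer before the job
#    exits, because /state/partition1 is flushed at the end of each job.
shutil.copytree(workdir, os.path.join(scratch, "results"))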

Each cluster has three primary filesystems, described on the storage page (TODO: add link). Stakeholder users (TODO: add a link to "how to become a stakeholder") also have access to a fourth filesystem, /work.
