How to access the Prince cluster?

Information on how to access the HPC Prince cluster can be found here.
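Access is typically over SSH to one of the login nodes listed under "Login Nodes" below. As a minimal sketch (not the official procedure, and with a placeholder account name):

import subprocess

# Minimal sketch only: open an interactive SSH session to a Prince login node.
# "my_netid" is a placeholder user name; see the access documentation linked
# above for the actual account details and any additional requirements.
subprocess.run(["ssh", "my_netid@prince0.hpc.nyu.edu"])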

Prince

Overview

Prince is the new HPC cluster. The Mercer cluster was retired on Friday, May 19th, 2017.

Cluster components listed below in green font will become part of the Prince cluster in Phase 2. Most of the components in green font are currently part of the HPC Mercer cluster.

Hardware Specifications

System Name: HPC Cluster Prince

Vendor: Dell

Network

  • Infiniband by Mellanox for MPI and access to file systems (home (ZFS), scratch (Lustre and BeeGFS), and archive (ZFS))
  • 10Gbit Management Network (node provisioning and configuration)
  • 1 Gb Ethernet Service Network for IPMI/iDRAC access
  • 10Gbit access to the public NYU Network (only available on the Prince login nodes and selected management nodes)

Operating System: CentOS 7.3

Login Nodes

2 login nodes: prince0.hpc.nyu.edu and prince1.hpc.nyu.edu

Each login node has 2 Intel Xeon E5-2680v4 2.4GHz CPUs ("Broadwell", 14 cores/socket, for a total of 28 cores per login node) and 128 GB of memory.

Compute Nodes

Standard Compute Nodes

  • 68 nodes each with 2 Intel Xeon E5-2690v4 2.6GHz CPUs ("Broadwell", 14 cores/socket, 28 cores/node) and 125GB memory, EDR interconnects
  • 32 nodes each with 2 Intel Xeon E5-2690v4 2.6GHz CPUs ("Broadwell", 14 cores/socket, 28 cores/node) and 250GB memory, EDR interconnects 
  • 32 nodes each with 2 Intel Xeon E5-2660v3 2.6GHz CPUs ("Haswell", 10 cores/socket, 20 cores/node) and 62 GB memory. The 32 nodes are M630 Blade servers on 2 M1000e chassis and are interconnected via FDR Infiniband
  • 64 nodes each with 2 Intel Xeon E5-2690v2 3.0GHz CPUs ("Ivy Bridge", 10 cores/socket, 20 cores/node) and 62 GB memory. The 64 nodes are M620 Blade servers on 4 M1000e chassis and are interconnected via FDR Infiniband (used to be Mercer chassis 0, 1, 2, 3)
  • 112 nodes each with 2 Intel Xeon E5-2690v2 3.0GHz CPUs ("Ivy Bridge", 10 cores/socket, 20 cores/node) and 62 GB memory. The 112 nodes are M620 Blade servers on 7 M1000e chassis and are interconnected via QDR Infiniband (Mercer chassis 14-20)
  • 48 nodes each with 2 Intel Xeon E5-2690v2 3.0GHz CPUs ("Ivy Bridge", 10 cores/socket, 20 cores/node) and 189 GB memory. The 48 nodes are M620 Blade servers on 3 M1000e chassis and are interconnected via QDR Infiniband (Mercer chassis 21-23)

Nodes equipped with NVIDIA GPUs

  • 9 nodes each with 2 Intel Xeon E5-2690v4 2.6GHz CPUs ("Broadwell", 14 cores/socket, 28 cores/node) and 256GB memory, EDR interconnects, each node equipped with 2 NVIDIA K80 GPU cards (24 GB per card, split between the card's 2 GPUs)
  • 4 nodes each with 2 Intel Xeon E5-2690v4 2.6GHz CPUs ("Broadwell", 14 cores/socket, 28 cores/node) and 128GB memory, EDR interconnects, each node equipped with 4 NVIDIA GTX 1080 GPUs (8 GB)
  • 8 nodes each with 2 Intel Xeon E5-2670v2 2.5GHz CPUs ("Ivy Bridge", 10 cores/socket, 20 cores/node) and 128 GB memory, FDR interconnects, each node equipped with 4 NVIDIA K80 GPUs

Medium Memory Node

  • 4 nodes each with 2 Intel Xeon E5-2687Wv3 3.1GHz ("Haswell", 10 cores/socket, 20 cores/node), 512GB memory, FDR interconnects.

High Memory Nodes

  • 2 nodes each with 4 Intel Xeon E7-8857v2 3.0GHz ("Ivy Bridge", 12 cores/socket, 48 cores/node), 1.5TB of memory, FDR interconnects.

 


Total Nodes: 387 (385 compute nodes + 2 login nodes)

CPU cores: 8928 cores on compute nodes + 56 cores on login nodes

GPUs: 50 NVIDIA K80 (24 GB) and 16 NVIDIA GTX 1080 (8 GB)

Total memory: 46 TB on compute nodes + 256 GB on login nodes

File Systems

The file systems available on the Prince cluster are listed below.

/home
  • Storage capacity (user quota): 43 TB (20 GB / user)
  • FS type: ZFS
  • Backed up: Yes
  • Flushed: No
  • Availability: all Prince nodes (login, compute)
  • Variable: $HOME (value: /home/$USER)

/scratch
  • Storage capacity (user quota): 1.1 PB (5 TB / user)
  • FS type: Lustre
  • Backed up: No
  • Flushed: Yes, files unused for 60 days are deleted
  • Availability: all Prince nodes (login, compute)
  • Variable: $SCRATCH (value: /scratch/$USER)

/beegfs
  • Storage capacity (user quota): 500 TB (2 TB / user)
  • FS type: BeeGFS
  • Backed up: No
  • Flushed: Yes, files unused for 60 days are deleted
  • Availability: all nodes (login, compute)
  • Variable: $BEEGFS (value: /beegfs/$USER)

/archive
  • Storage capacity (user quota): 700 TB (2 TB / user)
  • FS type: ZFS
  • Backed up: Yes
  • Flushed: No
  • Availability: only on login nodes
  • Variable: $ARCHIVE (value: /archive/$USER)

/state/partition1
  • Storage capacity: varies, mostly >100 GB
  • FS type: ext3
  • Backed up: No
  • Flushed: Yes, at the end of each job
  • Availability: separate local filesystem on each compute node
  • Variable: $SLURM_JOBTMP (value: /state/partition1/$SLURM_JOBID)
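Because files unused for 60 days are flushed from /scratch and /beegfs, it can be useful to check which of your files are approaching that limit. The following is a minimal, illustrative Python sketch (not the cluster's actual purge tool) that walks $SCRATCH and reports files whose last-access time is older than 60 days; the real flush mechanism may use different criteria.

import os
import time
from pathlib import Path

# Illustrative user-side check only: list files under $SCRATCH whose
# last-access time is more than 60 days old, i.e. candidates for the
# documented flush policy. The cluster's actual purge criteria may differ.
SIXTY_DAYS = 60 * 24 * 60 * 60
scratch = Path(os.environ["SCRATCH"])   # /scratch/$USER on Prince
now = time.time()

for path in scratch.rglob("*"):
    try:
        if path.is_file() and now - path.stat().st_atime > SIXTY_DAYS:
            days = int((now - path.stat().st_atime) // 86400)
            print(f"{path}  (last accessed {days} days ago)")
    except OSError:
        # Files may disappear or be unreadable while scanning; skip them.
        continue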