This article covers the essential topics for running jobs on Dalma. To learn more about the Dalma hardware, see the page Cluster - Dalma.
The operating system on Dalma is Linux. Make sure you know the basics. Useful links:
This is the structure of our HPC cluster, Dalma.
Your typical workflow on Dalma:
- (One time only) Let us know your computational requirement.
- (One time only) Apply for an HPC account and pass our quiz.
- If needed, transfer your input data to Dalma.
- Log on to Dalma login nodes.
- Submit jobs on login nodes.
- Your jobs will queue for execution.
- Once done, examine the output.
These steps are explained below.
Getting or Renewing an Account
Get a new account and pass the quiz as instructed here: Accounts.
The yearly renewal instructions are on the same page.
It takes 2 business days to activate your account.
Once your account is ready, you can access Dalma. From Linux or Mac inside the NYU AD/NY network, simply run ssh in your local terminal:
If you use Windows or are outside the NYU AD/NY network, follow the instructions here: Access Dalma.
We have 4 storage systems for you: $HOME (/home/<NetID>), $SCRATCH (/scratch/<NetID>), $WORK (/work/<NetID>) and $ARCHIVE (/archive/<NetID>).
In short, you should put all your data in $SCRATCH and run your jobs from there. Keep only a small persistent fraction in $HOME (e.g., source code, executables, Python / R packages). For long-term storage, archive your data to $ARCHIVE. $WORK is not visible on compute nodes but can be mounted on your local workstation, so it is best suited to quick post-processing, analysis and visualization without moving your data.
Backing up is each user's own responsibility. E.g., if a user deletes something accidentally, we unfortunately cannot recover it.
$HOME and $SCRATCH
$HOME and $SCRATCH can be accessed as follows:
We urge our users to clean up their storage regularly.
Retention Policy Applies
Files older than 90 days in $SCRATCH will be deleted.
Running jobs from /home is a serious violation of HPC policy. Any user who intentionally violates this policy will have their account suspended. The /home SSDs are not designed to serve as scratch disks, and using them that way will wear them out quickly.
$WORK can be accessed on login nodes as follows:
$WORK can also be mounted on your workstation (Linux and Mac only). Instructions are on this page: Mount $WORK with SSHFS.
We urge our users to clean up their storage regularly.
Retention Policy Applies
Files older than 120 days in $WORK will be deleted.
$ARCHIVE can be accessed as described in The guide to Archive on Dalma.
The usages are summarized below.
| ||$HOME||$SCRATCH||$WORK||$ARCHIVE|
|Use for storing||source code / executable / perl-python-R packages||data||anything||anything|
|Accessible From||login / compute||login / compute||login||login|
|Use to Run Jobs||No||Yes||No||No|
|Retention Time (Days)||No Limit||90||120||No Limit|
|Default Quota||5GB, 100K Files||5TB, 500K Files||5TB, 500K Files||5TB, 125K Files|
More precisely, these are the default block quota and inode quota.
Run myquota in the terminal on Dalma to check your current usage and quota. Example output:
At this point, you should be able to log in to Dalma and upload/download data.
Dalma consists of more than 12K CPU cores. But it is very unlikely that your code can scale up to use them all (contact us directly if you are confident). From the user perspective, here are the important specifications for most nodes:
- The CPU model is Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, supporting AVX2.
- 28 CPU cores per node. Implications:
- If your code doesn't support MPI, or you don't know what MPI is, use a maximum of 28 cores per job.
- For MPI jobs using more than one node, always use a number of cores divisible by 28, to utilize full nodes.
- 4 GB memory per core by default.
Contact us if you need a special configuration (extra large memory, GPU, etc.).
- The Operating System is CentOS 7. Windows / Mac software is not supported.
- No GUI. No display.
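The 28-cores-per-node rule above can be checked with a little shell arithmetic before you write a job script. The sketch below is only an illustration: the helper name `nodes_for_cores` is hypothetical, and it assumes every node has exactly 28 cores, which holds for most but not all nodes.

```shell
#!/bin/bash
# Cores per node on most Dalma nodes (from the specifications above).
CORES_PER_NODE=28

# Hypothetical helper: print how many full nodes a core count fills,
# or fail if the count would leave a node only partially used.
nodes_for_cores() {
    local cores=$1
    if (( cores % CORES_PER_NODE != 0 )); then
        echo "$cores cores is not a multiple of $CORES_PER_NODE" >&2
        return 1
    fi
    echo $(( cores / CORES_PER_NODE ))
}

nodes_for_cores 56    # prints 2
nodes_for_cores 84    # prints 3
nodes_for_cores 60 || echo "pick 56 or 84 instead"
```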
You can compile / install your own software, and/or use our Module system. For the latter, first check what applications are available.
Then you can select the desired software to load. The following example shows how to load a self-sufficient, single-application environment for gromacs.
The following example shows how to load an environment for compiling source code from scratch.
At this point, compilers like 'gcc', 'gfortran' and 'g++' are available, in the sense that the paths to those executables are prepended to $PATH. Also, paths to the library files from FFTW3 are prepended to $LD_LIBRARY_PATH.
If you cannot find a certain version of the software (for example, you are looking for Python 3 but find that only Python 2 is available), try running the following command to make all modules visible first.
As you can see, Python 3 is now available. You can load Python 3 by loading the specific module.
At this point, you should be able to invoke the executable, e.g., 'python'.
Alternatively, install Conda in your $HOME for a hassle-free, independent Python environment. Follow this page: Python - Create Your Own Environment using Anaconda
Now comes the exciting part. With input data and software ready, you can run your computational tasks.
On HPC, you don't run tasks directly on the login nodes. Instead, you submit jobs from the login nodes. These jobs are queued by the system and executed eventually. Conceptually, each job is a 2-step process:
- You request certain resources from the system. The most common resources are CPU cores.
- With the assigned resources, you run your computational tasks.
There are two ways to do this: interactive sessions and batch jobs.
You can get an interactive session on compute nodes directly from your terminal. Use interactive sessions only for short jobs (e.g., experimenting with new modifications to your Matlab code).
To start an interactive session, use the srun command:
Then you can run your applications in the terminal directly. E.g.,
In a real scenario, the system might have no resources available for you. In that case, you need to wait.
In this example, user gh50 requested 1 CPU core (-n 1) on login node (login-0-1). The system responded, assigned a job id (775175), queued the job and assigned 1 CPU core from one of the compute nodes (compute-21-1) to the user.
To exit the interactive session, type Ctrl+d, or
Besides interactive sessions, a user can submit batch jobs to the system. For production jobs, batch jobs should be used.
A complete batch job workflow:
- Write a job script, which consists of 2 parts:
- Resources requirement.
- Commands to be executed.
- Submit the job.
- Relax, have a coffee, log off if you wish. The computer will do the work.
- Come back to examine the result.
Batch Job Script
A job script is a text file describing the job. As discussed, the first part states how many resources you want. The second part is what you want to run. Choose one of the following examples to start with. If you are not sure, contact us.
The cluster is shared among the whole university. The HPC steering committee decides each year on resource limits for each department. We at the NYUAD HPC center implement these limits.
Typically, a user can ask for a maximum of 48 hours and 700 CPU cores per job.
If you ask for more resources than you are allowed to use (e.g., you specify a walltime of 10000 hours in your job script), your job will stay in the queue forever.
If you have multiple jobs (which is very normal), your jobs will start immediately if the system is free and the quotas for you and your department have not been exhausted. Otherwise, they will wait in the queue.
A Job with 1 CPU Core
This is a very basic example, using only one CPU core.
As you can see, it is a simple bash script, plus some lines on the top, starting with #SBATCH, which are the Slurm directives.
Those Slurm directives specify the resources required. E.g., '--ntasks=1' requests 1 CPU core, and '--time=00:30:00' sets the maximum walltime to 30 minutes. '-o job.%J.out' redirects stdout, which is usually your screen output, to a file called 'job.$JOBID.out'. Why? Because the system runs your job in the background, so there is no display.
Everything under the Slurm directives is a normal Linux command. This example runs 'hostname', which prints the hostname. In reality, you should load your desired modules and execute whatever you want to run.
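For reference, a script matching the description above might look like the following sketch. It uses only the directives quoted in the text. Wrapping the script in `cat > job.sh` is just a convenient way to save it to a file, and the file name job.sh is an assumption that matches the name used later for submission.

```shell
# Write the job script to a file named job.sh.
cat > job.sh <<'EOF'
#!/bin/bash
# Request 1 CPU core.
#SBATCH --ntasks=1
# Maximum walltime of 30 minutes.
#SBATCH --time=00:30:00
# Redirect stdout to a per-job output file.
#SBATCH -o job.%J.out

# Commands to run on the compute node. Replace this with your own
# module loads and application.
hostname
EOF
```

The quoted 'EOF' keeps the shell from expanding anything inside the script while it is being written.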
Multithreading enables a process to spawn multiple threads to accelerate its execution. The most common multithreading model in HPC is OpenMP. If your application supports this (not sure? contact us to find out), you can use the example below.
Compared to the previous example, there are 2 extra lines:
- '#SBATCH --cpus-per-task=4': this asks the system to assign 4 CPU cores per task. This number should be a divisor of 28 and no larger than 28 (e.g., 2, 4, 7, 14, 28), because the majority of our nodes come with 28 cores.
- 'export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK': this tells your application, if it supports OpenMP, to use all the CPU cores assigned to your job by spawning exactly that number of OpenMP threads.
Remember, running a job is a 2-step process: 1. Request the resources. 2. Use the resources. This example is a perfect illustration. Run with what you requested, no more, no less.
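A multithreading job script with the two extra lines described above could be sketched as follows. The time limit and output file are carried over from the single-core example. The file name omp_job.sh and the executable `./my_openmp_app` are hypothetical placeholders for your own script and OpenMP program.

```shell
# Write the multithreading job script to omp_job.sh.
cat > omp_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
# 4 CPU cores for the single task (a divisor of 28).
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00
#SBATCH -o job.%J.out

# Spawn exactly as many OpenMP threads as cores were assigned.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Hypothetical OpenMP-enabled executable.
./my_openmp_app
EOF
```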
Pure MPI Job
Now come pure MPI jobs.
Compared to the 1 core example, there are 2 different lines:
- '#SBATCH --ntasks=56': This line requests 56 cores. This number should be divisible by 28. E.g., 56, 84, 112...
- 'srun hostname': This tells your application to run with MPI support, utilizing all CPU cores requested.
The old school 'mpiexec' or 'mpirun' are supported as well, but you need to load the 'openmpi' module in this case.
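Putting the two changed lines into the single-core script gives something like the sketch below. The file name mpi_job.sh is a hypothetical choice. In a real job you would replace `hostname` with your own MPI executable.

```shell
# Write the pure MPI job script to mpi_job.sh.
cat > mpi_job.sh <<'EOF'
#!/bin/bash
# 56 MPI tasks, i.e. two full 28-core nodes.
#SBATCH --ntasks=56
#SBATCH --time=00:30:00
#SBATCH -o job.%J.out

# srun starts one copy of the command per requested task.
srun hostname
EOF
```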
Hybrid MPI Job
If your application supports MPI + OpenMP hybrid parallelization, you can follow this example to submit a hybrid job.
In this case, the number of CPU cores requested is 56 (ntasks) * 4 (cpus-per-task) = 224. This number should be divisible by 28 to use all the cores on the nodes. As in the multithreading job example, make sure 'cpus-per-task' is a divisor of 28.
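A hybrid job script with these numbers might be sketched like this. The file name hybrid_job.sh and the executable `./my_hybrid_app` are hypothetical. The last lines repeat the core-count arithmetic from the paragraph above as a quick sanity check.

```shell
# Write the hybrid MPI + OpenMP job script to hybrid_job.sh.
cat > hybrid_job.sh <<'EOF'
#!/bin/bash
# 56 MPI tasks, each with 4 CPU cores for OpenMP threads.
#SBATCH --ntasks=56
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00
#SBATCH -o job.%J.out

# One OpenMP thread per core assigned to each task.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Hypothetical hybrid MPI + OpenMP executable.
srun ./my_hybrid_app
EOF

# Sanity check of the request: total cores and full 28-core nodes.
total_cores=$(( 56 * 4 ))
echo "$total_cores cores on $(( total_cores / 28 )) nodes"   # prints "224 cores on 8 nodes"
```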
This example shows how to submit a job array consisting of 100 jobs, with the environment variable SLURM_ARRAY_TASK_ID varying from 1 to 100.
Or you can vary SLURM_ARRAY_TASK_ID from 51 to 100.
Or set the maximum number of simultaneously running tasks from the job array to 10.
We only allow a maximum of 200 jobs in the queue for any given user.
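The three job array variants above map onto Slurm's `--array` option. The following is a minimal sketch, assuming a hypothetical file name array_job.sh and a placeholder `echo` command standing in for your real per-task work. The output pattern `%A_%a` (array job id and task index) is a choice made here so that each task writes to its own file.

```shell
# Write the job array script to array_job.sh.
cat > array_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=00:30:00
# 100 array tasks with SLURM_ARRAY_TASK_ID from 1 to 100.
# Use --array=51-100 to run only tasks 51 to 100, or
# --array=1-100%10 to run at most 10 tasks at the same time.
#SBATCH --array=1-100
# One output file per array task.
#SBATCH -o job.%A_%a.out

# Placeholder workload: each task sees its own index.
echo "Processing task $SLURM_ARRAY_TASK_ID"
EOF
```

A common pattern is to use the task index to select an input file or parameter set for each task.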
Submitting a Job
Once you have your job script prepared, you could use the command sbatch to submit your job.
Let's say you saved your job script in a file called 'job.sh'. Then you should run the following.
After the submission, it will return the corresponding job id. E.g.,
In this case, the job id is 775602. You can safely log off Dalma at this point. Once the system can accommodate your request, the script will be executed. The screen output will be saved to the files you specified in the job script.
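If you want to reuse the job id in later commands, you can extract it from the line that sbatch prints. The sketch below works on a fixed example string so that it runs anywhere. On Dalma you would capture the real output instead, for example with `submit_output=$(sbatch job.sh)`. The variable names are hypothetical.

```shell
# Example line printed by sbatch after a successful submission.
submit_output="Submitted batch job 775602"

# Strip everything up to the last space, leaving the numeric job id.
job_id=${submit_output##* }
echo "$job_id"    # prints 775602
```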
Checking Job Status
Before and During Job Execution
This command shows all your current jobs.
It means the job with Job ID 31408 has been running (ST: R) for 2 minutes on compute-21-4.
For more verbose information, use scontrol show job.
After Job Execution
Once the job is finished, it can no longer be inspected with squeue or scontrol show job. At this point, you can inspect the job with sacct.
The following commands give you extremely verbose information on a job.
Canceling a Job
If you decide to end a job prematurely, use scancel.
Use with Caution
To cancel all jobs from your account, run this in the Dalma terminal.
That is it. At this point, you should be able to run your computational tasks on Dalma. If you have any questions, don't hesitate to contact us (contacts on the right)!