Where is my job in the queue - or on the system - and why?
The simplest queuing algorithm is "first come, first served". The queuing of jobs on the HPC cluster is a little more sophisticated as we pursue several goals:
- Minimal queuing times, especially for short jobs. Nobody wants to spend 4 hours in the queue for a 1-hour job.
- Efficient use of the available resources. If there is a job ready which can use hardware that would otherwise be idle, run it, even if it's not next in the queue.
- Fair use of resources. If you've made heavy use of the cluster recently, jobs belonging to users who have had less CPU time will get higher priority than yours.
At NYU "recently" means "the last 24 hours", so users with large workloads are not excessively penalized.
Should you need more resources than the fair share allocations because of critical deadlines such as a grant application, a publication deadline, or class use, please email firstname.lastname@example.org to make special arrangements.
- Special consideration for HPC stakeholders. NYU HPC uses a "condo" model in which we manage HPC resources owned by specific schools and departments, in exchange for allowing the rest of the NYU HPC community to use those resources when the owners are not using them.
Slurm supports these goals by calculating a priority for each submitted job and placing the job in the queue according to its priority. The schedule of which job will run where and when is built from the job queues. When a job finishes earlier than scheduled (because its walltime request was overestimated), Slurm attempts to fill the newly available space by scanning the queue for the first job that will fit without delaying an already-scheduled, higher-priority job. In this way, low-priority jobs with smaller resource requirements can jump ahead and run early.
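To see how the scheduler currently ranks one of your own jobs, a minimal sketch follows. The job ID is a made-up placeholder, and sprio only reports a breakdown on clusters configured with Slurm's multifactor priority plugin, which is an assumption here.

```shell
# Hypothetical job ID; replace it with one of your own pending jobs.
job_id=1234567

if command -v squeue >/dev/null 2>&1; then
    # The scheduler's current estimate of when the pending job will start.
    squeue --start --jobs="$job_id"
    # Breakdown of the factors that make up the job's priority
    # (assumes the multifactor priority plugin is enabled).
    sprio --jobs="$job_id"
else
    echo "Slurm commands not found; run this on a cluster login node."
fi
```

The estimated start time is only a projection and can move earlier or later as other jobs finish or are submitted.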
You can take advantage of this by requesting CPU, walltime and memory resources as accurately as possible. Be careful not to request too little though, or your job may exceed the request and be killed.
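As an illustration, resource requests are usually stated at the top of the batch script. The values below are hypothetical and should be replaced with estimates based on what the job actually needs:

```shell
#!/bin/bash
# Hypothetical resource requests; adjust them to your own job.
# One node, four CPU cores for a single task:
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
# Walltime limit in HH:MM:SS:
#SBATCH --time=01:00:00
# Memory per node:
#SBATCH --mem=8GB

# Placeholder for the real workload.
echo "Running on $(hostname) with ${SLURM_CPUS_PER_TASK:-unknown} CPUs per task"
```

A request slightly above the measured need leaves a safety margin while still letting the job fit into small gaps in the schedule.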
Monitoring jobs with Slurm
sinfo: Report the status of partitions, nodes and queues etc.
squeue: Report job and job step status in the scheduling queue
sacct: Report accounting information by individual job and job step
sstat: Report accounting information about currently running jobs and job steps
scontrol: Administrator tool to view and/or update system, job, step, partition or reservation status
To see the status of a single job - or a list of specific jobs - pass the Job IDs to squeue, as in the following example:
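A minimal sketch of such a call, with made-up job IDs standing in for the ones reported when you submitted your jobs:

```shell
# Hypothetical job IDs; substitute the IDs reported by sbatch.
job_ids="1234567,1234568"

if command -v squeue >/dev/null 2>&1; then
    # Show only the listed jobs (comma-separated, no spaces).
    squeue --jobs="$job_ids"
else
    echo "squeue not found; run this on a cluster login node."
fi
```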
Most of the fields in the output are self-explanatory.
The column "ST" in the middle is the job status, which can be:
PD - pending: waiting for resource allocation
F - failed: non-zero exit code or other failure
CD - completed: all processes terminated with zero exit code
CG - completing: in the process of completing; some processes may still be alive
The last column, "NODELIST(REASON)", shows the reason a job is in its current state, which can be:
Priority: higher-priority jobs exist
ReqNodeNotAvail: the requested node may be down, in use, or reserved for other jobs
BeginTime: the start time has not been reached yet
Dependency: waiting for a job it depends on to finish
Other, less common job flags are described in the manual (
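To focus on jobs that are still waiting, squeue can filter by state and print the reason column explicitly. This is a minimal sketch; the format string is only one possible layout, and $USER is assumed to hold your username.

```shell
# One possible output layout:
# %i job ID, %P partition, %j job name, %t compact state, %r reason
fmt="%.18i %.9P %.20j %.2t %r"

if command -v squeue >/dev/null 2>&1; then
    # Only pending jobs belonging to the current user.
    squeue --user="$USER" --states=PENDING --format="$fmt"
else
    echo "squeue not found; run this on a cluster login node."
fi
```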
To list all jobs owned by you or any specific friend, use:
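A minimal sketch of the call; to look at someone else's jobs, put their username in place of $USER:

```shell
if command -v squeue >/dev/null 2>&1; then
    # List every job owned by the current user.
    squeue -u "$USER"
else
    echo "squeue not found; run this on a cluster login node."
fi
```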
In Unix, the shell sets an environment variable USER to your username (at NYU this is your NetID). In the example above this environment variable is used instead of explicitly typing my NetID.
While a job with steps is running, sstat can show its memory usage and a lot of other information (run 'sstat --helpformat' to see which other quantities are available), e.g.
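A minimal sketch, assuming a made-up job ID and a job that was submitted as a batch script (the ".batch" suffix selects the batch step; the fields chosen are just a sample):

```shell
# Hypothetical job ID of a job that is currently running.
job_id=1234567

if command -v sstat >/dev/null 2>&1; then
    # Peak resident memory, peak virtual memory and average CPU time
    # of the batch step.
    sstat --jobs="${job_id}.batch" --format=JobID,MaxRSS,MaxVMSize,AveCPU
else
    echo "sstat not found; run this on a cluster login node."
fi
```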
If your job seems stuck in the queue, scontrol can give useful hints about why:
scontrol is mainly for administrators to view and/or update system, job, step, partition or reservation status. Users can also list detailed job and step information, e.g.
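A minimal sketch with a made-up job ID. For a pending job, the "Reason=" field in the output is usually the most useful hint:

```shell
# Hypothetical job ID; replace it with the job you are investigating.
job_id=1234567

if command -v scontrol >/dev/null 2>&1; then
    # Print the detailed record for the job, including JobState and Reason.
    scontrol show job "$job_id"
else
    echo "scontrol not found; run this on a cluster login node."
fi
```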
Request an interactive batch session on the Prince cluster. When it starts (or immediately, in another window), use squeue -u $USER to see it in the queue. Try scontrol show job -dd with the job's ID.