
Where is my job in the queue - or on the system - and why?

The simplest queuing algorithm is "first come, first served". The queuing of jobs on the HPC cluster is a little more sophisticated as we pursue several goals:

  • Minimal queuing times, especially for short jobs. Nobody wants to spend 4 hours in the queue for a 1-hour job. 
  • Efficient use of the available resources. If there is a job ready which can use hardware that would otherwise be idle, run it, even if it's not next in the queue.
  • Fair use of resources. If you've made heavy use of the cluster recently, jobs belonging to a user who has had less CPU time will get higher priority. 
    At NYU "recently" means "the last 24 hours", so users with large workloads are not excessively penalized.

    Should you need more resources than the fair share allocations because of critical deadlines such as a grant application, a publication deadline, or class use, please email hpc@nyu.edu to make special arrangements.

  • Special consideration for HPC Stakeholders. NYU HPC uses a "condo" model in which we manage HPC resources owned by specific schools and departments, in exchange for allowing the rest of the NYU HPC community to use those resources when the owners are not using them.

Slurm supports these goals by calculating a priority for each submitted job and placing the job in the queue according to its priority. The schedule of which job will run where and when is built from the job queues. When a job finishes earlier than scheduled (due to an overestimated walltime request), Slurm attempts to fill the newly-available space by scanning the queue for the first job which will fit without delaying an already-scheduled, higher priority job. In this way low-priority jobs with smaller resource requirements can jump ahead and be run early.

You can take advantage of this by requesting CPU, walltime and memory resources as accurately as possible. Be careful not to request too little though, or your job may exceed the request and be killed.
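For example, a job script might pin down its requests with directives like these (a sketch only; the values are illustrative and should be tuned to what your job actually needs):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4      # request only the CPUs the job can really use
#SBATCH --time=02:00:00        # a modest overestimate of the expected runtime
#SBATCH --mem=8GB              # a modest overestimate of peak memory usage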

Monitoring jobs with Slurm

  • sinfo    - report the status of partitions, nodes, queues, etc.
  • squeue   - report job and job step status in the scheduling queue
  • sacct    - report accounting information by individual job and job step
  • sstat    - report accounting information about currently running jobs and job steps
  • scontrol - administrator tool to view and/or update system, job, step, partition or reservation status

To see the status of a single job - or a list of specific jobs - pass the Job IDs to squeue, as in the following example: 
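The job IDs, names and nodes below are hypothetical; your output will differ:

$ squeue -j 1234567,1234568
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
1234567     batch   myjob1   ab1234  R    1:02:33      1 c01-02
1234568     batch   myjob2   ab1234 PD       0:00      1 (Priority)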

Most of the fields in the output are self-explanatory. 

The column "ST" in the middle is the job status, which can be :

  • PD - pending: waiting for resource allocation
  • S  - suspended
  • R  - running
  • F  - failed: non-zero exit code or other failures
  • CD - completed: all processes terminated with zero exit code
  • CG - completing: in the completing process, some processes may still be alive

The column "NODELIST(REASON)" in the end is job status due to the reason(s), which can be :

  • JobHeldUser:          (obviously) 

  • Priority:                 higher priority jobs exist
  • ReqNodeNotAvail:  requested node maybe down, in use, reserved for other jobs
  • BeginTime:               start time not reached yet
  • Dependency:             wait for a depended job to finish

Other, less common reason codes are described in the manual (man squeue).

To list all jobs owned by you or by any specific user, use: 

$ squeue -u NetID

For example:
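The output shown is illustrative; the job ID, partition and node names are hypothetical:

$ squeue -u $USER
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
1234567     batch    myjob   ab1234  R      12:34      1 c01-02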

In Unix, the shell sets the environment variable USER to your username (at NYU this is your NetID). In the example above, this environment variable is used instead of typing the NetID explicitly.

 

While a job with steps is running, sstat can show its memory usage and a lot of other information (run 'sstat --helpformat' to see what other quantities are available), e.g.
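A minimal sketch with a hypothetical job ID (sstat queries the steps of a running job, so the batch step is addressed as <job_id>.batch):

$ sstat --format=JobID,MaxRSS,AveCPU -j 1234567.batch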

 

If your job seems stuck in the queue, scontrol can give useful hints about why:

scontrol is mainly for administrators to view and/or update system, job, step, partition or reservation status. Users can also list detailed job and step information, e.g. 
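For example, with a hypothetical job ID (the -dd option requests additional detail):

$ scontrol show job -dd 1234567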

Exercise

Request an interactive batch session on the Prince cluster. When it starts (or immediately, in another window), use squeue -u $USER to see it in the queue. Try scontrol show job -dd <job_id> too.
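One common way to start an interactive session, as a sketch (options and defaults vary by site; the resource values here are illustrative):

$ srun --nodes=1 --time=1:00:00 --mem=4GB --pty /bin/bash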
