There are two aspects to a batch job script:
- A set of PBS directives describing the resources required and other information about the job for Torque
- The script itself, consisting of commands to set up and perform the computations without additional user interaction
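The two parts can be sketched as follows. This is a minimal illustration, not one of the NYU HPC example scripts: the job name, the resource values, and the `echo` standing in for real work are all assumptions.

```shell
#!/bin/bash
# Part 1: PBS directives. To bash these are ordinary comments; Torque reads
# them when the script is submitted with qsub. All values are illustrative.
#PBS -N example_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00

# Part 2: the commands to run, with no user interaction required.
# Torque sets PBS_O_WORKDIR to the directory qsub was run from; fall back
# to the current directory when trying the script outside a batch job.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# Placeholder for the real computation.
result="job ${PBS_JOBNAME:-example_job} running in $workdir"
echo "$result"
```

Because the directives are comments as far as the shell is concerned, the same file can be run directly with `bash` for a quick check before it is submitted.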
A simple example
Typical batch scripts on an NYU HPC cluster look something like these:
|Using precompiled third-party software||Using self-developed or built software|
We'll work through them more closely in a moment.
You submit the job with qsub:
And monitor its progress (discussed further here) with:
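A hedged sketch of the submit-and-monitor step. The script name `my_job.pbs` is hypothetical, and the `command -v` guard is only there so the snippet is harmless on a machine without Torque installed. `qstat -u` limits the listing to one user's jobs.

```shell
job_script="my_job.pbs"   # hypothetical job script name

if command -v qsub >/dev/null 2>&1; then
    # qsub prints the id of the newly created job on stdout.
    job_id=$(qsub "$job_script")
    echo "submitted: $job_id"
    # List this user's jobs and their current states.
    qstat -u "$USER"
else
    status="qsub not found; run this on a cluster login node"
    echo "$status"
fi
```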
What just happened? Here's an annotated version of the first script:
The second script has the same PBS directives, but this time we are using code we compiled ourselves. Starting after the PBS directives:
Submitting a job
Jobs are submitted with the qsub command. The options tell Torque information about the job, such as what resources will be needed. These can be specified in the job script as PBS directives, on the command line as options, or both (in which case the command-line options take precedence should the two contradict each other). For each option there is a corresponding PBS directive. For example, you can specify that a job needs 2 nodes and 8 cores on each node by adding a directive to the script, or by passing the equivalent command-line option to qsub when you submit the job.
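As a sketch of the two equivalent forms, the snippet below writes a small job script that carries the request as a `#PBS` directive, then prints the command-line equivalent rather than running it. The file name `demo_job.pbs` is made up for illustration.

```shell
# Write a job script that requests 2 nodes with 8 cores each via a directive.
cat > demo_job.pbs <<'EOF'
#!/bin/bash
#PBS -l nodes=2:ppn=8
echo "job running on $(hostname)"
EOF

# Show the directives Torque would read from the script.
grep '^#PBS' demo_job.pbs

# The same request as a command-line option (printed, not executed).
# If both are given and disagree, the command-line value takes precedence.
echo "qsub -l nodes=2:ppn=8 demo_job.pbs"
```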
Options to manage job output:
Give the job a name. The default is the filename of the job script. Within the job, $PBS_JOBNAME expands to the job name.
Merge stderr into the stdout file.
Send stdout to path/for/stdout. Can be a filename or an existing directory. The default filename is myjob.o12345, in the directory from which the job was submitted.
Send stderr to path/for/stderr. Same usage as for stdout.
Send email to firstname.lastname@example.org when certain events occur. By default an email is sent only if the job is killed by the batch system.
-m b -m e -m a -m abe
Send email when the job begins (b), ends (e) and/or is aborted (a).
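The naming, output, and email options could appear in a job script as in the sketch below. The option letters (`-N`, `-j oe`, `-o`, `-e`, `-M`, `-m`) are standard Torque `qsub` options, but the job name `my_analysis` is an assumption and the path and address are the placeholders used on this page.

```shell
#!/bin/bash
#PBS -N my_analysis
#PBS -j oe
#PBS -o path/for/stdout
#PBS -M firstname.lastname@example.org
#PBS -m abe
# -N names the job; -j oe merges stderr into the stdout file;
# -o sets where stdout goes (-e would do the same for stderr);
# -M sets the address and -m abe sends mail on begin, end and abort.

# Inside a running job, Torque sets PBS_JOBNAME to the job name.
# The fallback lets the script also run outside the batch system.
job_name="${PBS_JOBNAME:-my_analysis}"
echo "job name: $job_name"
```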
Options to set the job environment:
Use the shell at /path/to/shell to interpret the script. Default is your login shell, which at NYU HPC is normally
-v VAR1,VAR2="some value",VAR3
Pass variables to the job, either with a specific value (the VAR= form) or from the submitting environment (without "=").
Pass the full environment the job was submitted from.
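A minimal sketch of the environment options, assuming the standard Torque letters `-S`, `-v` and `-V`. The variable names `INPUT_FILE` and `RUN_LABEL`, their values, and the choice of `/bin/bash` are hypothetical.

```shell
#!/bin/bash
#PBS -S /bin/bash
#PBS -v INPUT_FILE,RUN_LABEL=trial1
# -S picks the interpreter for the script.
# -v passes INPUT_FILE from the submitting environment (no "=") and sets
#    RUN_LABEL to an explicit value. -V would pass the whole environment.

# Fallbacks so the snippet also runs outside the batch system.
input_file="${INPUT_FILE:-input.dat}"
run_label="${RUN_LABEL:-trial1}"
echo "processing $input_file for run $run_label"
```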
Options to request compute resources:
Maximum wallclock time the job will need. Default is 1 hour. Walltime is specified in seconds or as
Maximum memory per node the job will need. Default depends on queue, normally 2GB for serial jobs and the full node for parallel jobs. Memory should be specified with units, eg
Number of nodes and number of processors per node required. Default is 1 node and 1 processor per node. The :ppn=num can be omitted, in which case (at NYU HPC) you will get full nodes. When using multiple nodes the job script will be executed on the first allocated node.
Submit to a specific queue. If not specified, Torque will choose a queue based on the resources requested.
A job submitted without requesting a specific queue or resources will go to the default serial queue (s48 on Mercer) with the default resource limits for that queue.
Requesting the resources you need, as accurately as possible, allows your job to be started at the earliest opportunity as well as helping the system to schedule work efficiently to everyone's benefit.
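A resource request combining the options above might look like the following sketch. It assumes the standard Torque `-l` resource names (`walltime`, `mem`, `nodes`/`ppn`); the specific values and the script name are illustrative only. The command is printed rather than executed.

```shell
# Resource requests: 4 hours of wallclock time, 4 GB of memory per node,
# and 2 nodes with 8 processors on each. Values are illustrative.
resources="-l walltime=04:00:00 -l mem=4GB -l nodes=2:ppn=8"
job_script="my_job.pbs"   # hypothetical job script name

# Print the command instead of running it; on a cluster, run it directly.
submit_cmd="qsub $resources $job_script"
echo "$submit_cmd"
```

The same three requests could instead be written as `#PBS -l ...` lines at the top of the job script, which keeps them versioned alongside the commands they apply to.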
Options for running interactively on the compute nodes:
Don't just submit the job, but also wait for it to start and connect stdin to the current terminal.
Enable X forwarding, so programs using a GUI can be used during the session (provided you have X forwarding to your workstation set up)
Pass the current environment to the interactive batch job
- To leave an interactive batch session, type exit at the command prompt.
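An interactive session could be requested as sketched below, assuming the standard Torque options `-I` (interactive), `-X` (X forwarding) and `-V` (pass the environment). The resource values are illustrative, and the command is printed rather than run because it would block until a node is allocated.

```shell
# -I: wait for the job to start and attach it to this terminal
# -X: forward X11 so GUI programs can display on your workstation
# -V: pass the current environment into the session
interactive_cmd="qsub -I -X -V -l nodes=1:ppn=1,walltime=01:00:00"

# Printed rather than executed, since it would block waiting for a node.
echo "$interactive_cmd"
# Once the session starts, type 'exit' at its prompt to end it.
```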
Options for delaying starting a job:
Delay starting this job until jobid has completed successfully.
Delay starting this job until after the specified date and time. Month (MM) and day-of-month (DD) are optional, hour and minute are required.
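The two kinds of delay can be sketched as below, assuming the standard Torque options `-W depend=afterok:jobid` and `-a date_time`. The script name and the job id `12345` are placeholders, and both commands are printed rather than submitted.

```shell
second_script="stage2.pbs"   # hypothetical job script name
first_id="12345"             # placeholder; in practice, the id printed by qsub

# Hold the second job until job 12345 has finished successfully.
dependent_cmd="qsub -W depend=afterok:$first_id $second_script"
echo "$dependent_cmd"

# Start no earlier than 22:30 on the 15th. The full format is
# [[[[CC]YY]MM]DD]hhmm[.SS]; here only DD, hh and mm are given.
start_at="152230"
timed_cmd="qsub -a $start_at $second_script"
echo "$timed_cmd"
```

On a real cluster the placeholder id would typically be captured from the first submission, for example `first_id=$(qsub stage1.pbs)`, where `stage1.pbs` is again a hypothetical name.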
Options for many similar jobs (array jobs and pbsdsh):
Submit an array of jobs with array ids as specified. Array ids can be specified as a numerical range, a comma-separated list of numbers, or as some combination of the two. Each job instance will have an environment variable
As above, but the appended '%n' specifies the maximum number of array items (in this case, 5) which should be running at one time.
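An array job script might look like the sketch below. It assumes the standard Torque `-t` option and the `PBS_ARRAYID` variable that Torque sets in each instance; the range `1-20` and the input file naming scheme are hypothetical.

```shell
#!/bin/bash
#PBS -t 1-20%5
# Array ids 1 to 20, with at most 5 instances running at any one time.

# Torque sets PBS_ARRAYID in each instance; default to 1 for local testing.
task_id="${PBS_ARRAYID:-1}"

# Hypothetical naming scheme: one input file per array id.
input_file="input_${task_id}.dat"
echo "instance $task_id would process $input_file"
```

Keying the input file on the array id lets one script serve every instance, so the only thing that changes between instances is the value Torque supplies.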
- Submit a single "shepherd" job requesting multiple processes and from it start individual jobs with