
Linux Commands

If you have never used Linux before, you may refer to this Linux Commands tutorial page.

Getting started

If you are new to the HPC clusters and would like to know how to transfer files or run jobs, you may find the Getting Started page useful.

Frequently Asked Questions

  1. Why is "ls" on Lustre so slow?
  2. Is it possible to make sure a job gets executed only after another one completes?
  3. How do I log in to a specific node?
  4. My job is resource-intensive. How can I make sure it is running smoothly?
  5. How do I run a STATA job?
  6. How do I run a Gaussian job?
  7. What about a Matlab job?
  8. The "module: command not found" error.
  9. I need more memory with my jobs.
  10. Warning: no access to tty (Bad file descriptor), Thus no job control in this shell.
  11. I get a "Warning: no display specified." error when I use the -X flag with ssh

Q: Why is "ls" on Lustre so slow?

A: Please see the Lustre FAQ.

Q: Is it possible to make sure a job gets executed only after another one completes?

A: Yes. Instead of qsub job2.pbs when you submit job2, type:

$ qsub -W depend=afterok:42785.crunch.local job2.pbs

Job2 is then scheduled for execution only after job 42785.crunch.local has terminated without errors. If you would like job2 to launch even if job1 fails (e.g., by exceeding its walltime limit), use "depend=afterany" instead of "depend=afterok". For more information, please refer to the qsub manual page.
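
If you prefer not to copy the job ID by hand, you can capture it from the output of qsub when submitting the first job. A minimal sketch, assuming a bash login shell and hypothetical script names job1.pbs and job2.pbs,

$ JOBID=$(qsub job1.pbs)
$ qsub -W depend=afterok:$JOBID job2.pbs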

Q: How do I log in to a specific node?

A: To log in to a compute node within a cluster, type ssh COMPUTE_NODE_NAME, e.g.,

$ ssh compute-0-96

To log in to a login node within a cluster, use ssh LOGIN_NODE_NAME. For example, enter

$ ssh login-0-3

if you would like to switch to the bowery3 login node from any other Bowery node.

Q: My job is resource-intensive. How can I make sure it is running smoothly?

A: After submitting jobs, you can locate the nodes where your jobs are executing by running pbstop -u NetID. You can then monitor them by logging in to the corresponding compute nodes and running top, which shows both CPU and memory consumption. If your job leaves little free memory (or even little free swap), you should increase the "ppn" number in your PBS script, or consider using the nodes with larger memory. See this question for more details.
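
A typical monitoring session might look like this, assuming pbstop reports that your job is running on compute-0-96,

$ pbstop -u NetID
$ ssh compute-0-96
$ top

In top, the %CPU and RES columns show the CPU and resident memory usage of each process; press q to quit.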

Q: How do I run a STATA job?

A: STATA 11 is installed on USQ. To run STATA jobs, you need to:

(1) Prepare a STATA do file, such as "stata-test.do", which might contain,

sysuse auto, clear
summarize
graph twoway (scatter mpg weight) (lfit mpg weight)
graph export stata-test.ps, replace

StataMP, the parallel version of Stata, is also installed on USQ. To use multiple processors in StataMP, just insert "set processors X" as a line in your do file, where X is the number of processors; it should equal the "ppn" number in your PBS script, as in the sketch below.
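
For example, to let StataMP use 4 processors (4 is only an illustration), add this line to your do file,

set processors 4

and request the same number of cores in your PBS script,

#PBS -l nodes=1:ppn=4,walltime=01:00:00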

(2) Create a PBS script "run-stata.pbs" to run STATA jobs in batch mode. The content of this file can look like this,

#!/bin/csh -f

#PBS -V
#PBS -S /bin/tcsh
#PBS -N stata-test
#PBS -l nodes=1:ppn=1,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

source /etc/profile.d/env-modules.csh
module load stata/11

cd /scratch/NetID/hpc-tutorial/stata

stata -b do stata-test.do

Note

Be sure to replace "NetID" with your own NetID.

You also need to change the paths to match your own directories. Please refer to this page for more information.

(3) Then submit the job by typing

$ qsub run-stata.pbs
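
After submitting, you can check the status of your job with the standard PBS command qstat,

$ qstat -u NetID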

Q: How do I run a Gaussian job?

A: Gaussian 03 E01 is installed on USQ and Bowery. To run a Gaussian job, you need to prepare a Gaussian input file, which might look like this,

%Chk=H2.chk
#SP B3LYP/6-31+G*

No title

H 0.00000 0.00000 0.00000
H 0.75000 0.00000 0.00000

and save it as "input.com".

You can run it from an interactive session if you expect your jobs to finish very soon. The command is,

$ /share/apps/gaussian/G03-E01/intel/g03/run-g03.csh input.com >& output.out

Please use this script rather than loading the module "gaussian/intel/G03-E01" and executing g03 directly; otherwise Gaussian writes its scratch files to the default path, which can fill up the system space.

You may copy the "run-g03.csh" script to any directory and even rename it for your convenience.

(a) Running serial Gaussian jobs
A serial Gaussian job, which uses one CPU core, can only be submitted to the USQ cluster. A typical PBS script (named "run-gaussian.pbs") is,

#!/bin/csh -f

#PBS -V
#PBS -S /bin/tcsh
#PBS -N Gaussian-test
#PBS -l nodes=1:ppn=1,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

cd /scratch/NetID/gaussian-workdir
/share/apps/gaussian/G03-E01/intel/g03/run-g03.csh input.com >& output.out

Note

Be sure to replace "NetID" with your own NetID.

You also need to change the paths to match your own directories. Please refer to this page for more information.

Then submit the job by typing qsub run-gaussian.pbs.

(b) Running parallel Gaussian jobs (on one node)

Gaussian also supports jobs that use multiple processors within one node. These jobs are still treated as "serial" on the NYU HPC clusters since they do not involve inter-node data communication, and thus can only be submitted to USQ or Cardiac. To enable this, simply add the following line to the top of your Gaussian input file (input.com):

%NProcShared = PROC_NUM

where "PROC_NUM" is the number of shared processors that you would like Gaussian to use (4, for example); it should not exceed the total number of CPU cores in a node.

You also need to change the "ppn" number correspondingly in the "#PBS -l" line of your PBS script, as in the sketch below, then submit the job with qsub run-gaussian.pbs as before.
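
For example, to run Gaussian on 4 shared processors (again, 4 is only an illustration), the top of "input.com" would read,

%NProcShared=4
%Chk=H2.chk
#SP B3LYP/6-31+G*

and the resource line in "run-gaussian.pbs" would become,

#PBS -l nodes=1:ppn=4,walltime=01:00:00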

Q: What about a Matlab job?

(a) Running serial Matlab jobs with a single thread

You may run serial Matlab jobs from an interactive session or in batch mode. In either case, you need to either load the module,

$ module load matlab/R2009b
$ matlab

or provide the full path,

$ /share/apps/matlab/R2009b/bin/matlab

By default, MATLAB makes use of the multithreading capabilities of the computer on which it is running. This can cause trouble for other users who are not aware of it, so this behavior is disabled: Matlab on the NYU HPC clusters runs with a single thread by default. You can add the "-singleCompThread" flag to make the single-threaded usage explicit,

$ /share/apps/matlab/R2009b/bin/matlab -nodisplay -singleCompThread

(b) Running serial Matlab jobs with multiple threads

You can add the flag "-multipleCompThreads" to your Matlab command if you would like to take advantage of multiple threads,

$ /share/apps/matlab/R2009b/bin/matlab -nodisplay -multipleCompThreads

Note that this is a locally added flag that only works on the NYU HPC clusters.

Please also request the correct number of CPU cores for your Matlab job; it should match the number of threads. An example multi-threaded Matlab job script is,

#!/bin/csh -f

#PBS -V
#PBS -S /bin/tcsh
#PBS -N Matlab-test
#PBS -l nodes=1:ppn=8,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

cd /scratch/NetID/matlab-workdir
/share/apps/matlab/R2009b/bin/matlab <INPUT >OUTPUT

Note

If you need fewer CPU cores than the total number in a node (8 in most cases), you must also set the "OMP_NUM_THREADS" environment variable to the correct number.

Note

Be sure to replace "NetID" with your own NetID.

If you only need two CPU cores, your PBS script will look like this,

#!/bin/csh -f

#PBS -V
#PBS -S /bin/tcsh
#PBS -N Matlab-test
#PBS -l nodes=1:ppn=2,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

setenv OMP_NUM_THREADS 2
cd /scratch/NetID/matlab-workdir
/share/apps/matlab/R2009b/bin/matlab <INPUT >OUTPUT

If you use bash, change the command setenv OMP_NUM_THREADS 2 to export OMP_NUM_THREADS=2, as in the sketch below.
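
For reference, a complete bash version of the same two-core script might look like this sketch (again, NetID and the paths are placeholders to replace with your own),

#!/bin/bash

#PBS -V
#PBS -S /bin/bash
#PBS -N Matlab-test
#PBS -l nodes=1:ppn=2,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

export OMP_NUM_THREADS=2
cd /scratch/NetID/matlab-workdir
/share/apps/matlab/R2009b/bin/matlab <INPUT >OUTPUT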

Note

Be sure to replace "NetID" with your own NetID.

(c) Running parallel Matlab jobs with PCT (on one node)

Because the MATLAB Distributed Computing Server is not available on the NYU HPC clusters, you cannot run parallel Matlab jobs across multiple compute nodes; you can only run them on one node, with up to 8 CPU cores, using the Parallel Computing Toolbox (PCT). You may refer to the Matlab website for more information on PCT. PCT jobs are also treated as "serial" on the NYU HPC clusters since they do not involve inter-node data communication, and thus can only be submitted to USQ or Cardiac.

Please note that PCT in Matlab tends to write scratch files to the same directory (e.g., /home/NetID/.matlab/R2010b). As a result, these files can be overwritten or conflict with each other if you run multiple PCT jobs at the same time. The solution is to give each job a private folder.

To do so, if you use tcsh/csh, add the following to your PBS script below any "#PBS" lines and before the line that executes Matlab,

set tmp_folder_base = /state/partition1/$USER
mkdir -p $tmp_folder_base
set data_location = `mktemp -d "$tmp_folder_base/MATLAB-data-XXXXXXXXXX"`
setenv DATA_LOCATION $data_location
setenv NTHREADS `cat $PBS_NODEFILE | wc -l`

Or add this part instead if you use bash,

tmp_folder_base=/state/partition1/$USER
mkdir -p $tmp_folder_base
data_location=$(mktemp -d "$tmp_folder_base/MATLAB-data-XXXXXXXXXX")
export DATA_LOCATION=$data_location
export NTHREADS=$(cat $PBS_NODEFILE | wc -l)

For either shell, please also add "rm -rf $data_location" right before the "exit" in your PBS script.

Then modify the .m input file to include these lines before calling the "matlabpool" function,

data_location = getenv('DATA_LOCATION');
nthreads = str2num(getenv('NTHREADS'));  % getenv returns a string, so convert it to a number
scheduler = findResource('scheduler', 'type', 'local')
scheduler.DataLocation = data_location

Now you can start PCT by calling "matlabpool" like this in your .m input file,

matlabpool('open', 'local', nthreads)
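
Putting the bash pieces above together, a complete single-node PCT job script might look like the sketch below; the job name, working directory, and the input/output file names (pct-test.m, pct-test.out) are placeholders for your own files.

#!/bin/bash

#PBS -V
#PBS -S /bin/bash
#PBS -N Matlab-PCT-test
#PBS -l nodes=1:ppn=8,walltime=01:00:00
#PBS -M NetID@nyu.edu
#PBS -m abe

# Give this job its own private folder for PCT scratch files
tmp_folder_base=/state/partition1/$USER
mkdir -p $tmp_folder_base
data_location=$(mktemp -d "$tmp_folder_base/MATLAB-data-XXXXXXXXXX")
export DATA_LOCATION=$data_location
# Use as many threads as there are allocated CPU cores
export NTHREADS=$(cat $PBS_NODEFILE | wc -l)

cd /scratch/NetID/matlab-workdir
/share/apps/matlab/R2010b/bin/matlab -nodisplay < pct-test.m > pct-test.out

# Remove the private PCT folder before the job exits
rm -rf $data_location
exit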

Special Notes on Running Matlab 2010b

Matlab 2010b is also installed on USQ and Bowery. However, possibly due to a bug, the program hangs after printing the first line (lineA) if you run this script,

#!/bin/bash -f

/share/apps/matlab/R2010b/bin/matlab -nodisplay -r "
fprintf('lineA\n');
fprintf('lineB\n');
fprintf('lineC\n');
exit
"

This may also prevent you from executing any further Matlab commands. As an alternative, you can change your script to this format,

#!/bin/bash -f

cat <<EOF | /share/apps/matlab/R2010b/bin/matlab -nodisplay
fprintf('lineA\n');
fprintf('lineB\n');
fprintf('lineC\n');
exit
EOF

which can finish correctly with an output,

lineA
lineB
lineC

The "module: command not found" error.

Q: I tried to load modules in my PBS script. It failed with a "module: command not found" error message.

A: Be sure to set up the environment for the "module" command before loading any modules in your script. Add this line

source /etc/profile.d/env-modules.csh

right above any "module load" lines in your PBS file if it uses tcsh/csh, or insert instead

source /etc/profile.d/env-modules.sh

if the file uses bash.
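
For example, the top of a bash PBS script that loads the Stata module mentioned earlier on this page might look like this sketch,

#!/bin/bash

#PBS -N stata-test
#PBS -l nodes=1:ppn=1,walltime=01:00:00

source /etc/profile.d/env-modules.sh
module load stata/11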

I need more memory with my jobs.

Most nodes have a total of 16GB of memory on USQ (shared by 8 CPU cores) or 24GB on Bowery (shared by 12 CPU cores). As a result, please avoid consuming more than 2GB of memory per CPU core; otherwise the job may crash the whole node.

If you need a large amount of memory, you may increase the "ppn" number in your PBS script correspondingly.

Example: Each of several Matlab jobs on USQ needs 3.5GB of memory.
Solution: Do not submit more than 4 such jobs to one node, or there will be no memory left. To ensure this, change "#PBS -l nodes=1:ppn=1,walltime=12:00:00" to "#PBS -l nodes=1:ppn=2,walltime=12:00:00".

If for any reason your jobs need more than 16GB of memory on USQ or more than 24GB on Bowery, you may take advantage of the Bigmem queue. See more details here or write to hpc@nyu.edu.

Warning: no access to tty (Bad file descriptor), Thus no job control in this shell.

This warning is harmless and does not indicate an error. It simply means you are running a script (rather than a binary) under a job that has no access to a TTY.

In other words, you cannot interrupt it (^C), suspend it (^Z) or use other interactive commands because there is no screen or keyboard to interact with it.

It can be safely ignored.

"Warning: no display specified." with -X flag

Mac OS X

By default, X11 forwarding is not enabled on Mac OS X Leopard. To enable it, you need the line "X11Forwarding yes" in the file /private/etc/sshd_config. You can add it with this command from the terminal (a plain sudo echo with ">>" would fail, because the redirection is not performed as root).

$ echo "X11Forwarding yes" | sudo tee -a /private/etc/sshd_config

Enter the password when prompted.
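
You can then reconnect with the -X flag and check that a display has been forwarded. A quick sanity check might look like this, where LOGIN_NODE_NAME is a placeholder for the cluster host and the exact DISPLAY value will differ,

$ ssh -X NetID@LOGIN_NODE_NAME
$ echo $DISPLAY
localhost:10.0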

Windows

In order to display X Windows applications from remote systems, you must have an X server installed on your Windows system. Cygwin/X and Xming are implementations of the X Window System that run under Microsoft Windows.

PBS Script Generator

An interactive tool that generates a PBS script based on the user's input. Check this page for more details.

Front-Line HPC Consulting

HPC consultations are available once a week, Monday 1-3 PM. Appointments are required. Please make an appointment at hpc@nyu.edu.
