Pulling it all together - Preparing, submitting and monitoring a job on Prince

In this section we will prepare, submit and monitor a small Amber (molecular dynamics) job. Our test case is taken from the test suite distributed with Amber.

Exercise

Start a terminal session on Prince and replicate this example in it. 


Choose your own example

After - or instead of - following this example through, prepare and submit a run of something genuinely relevant to your research. This way, if you are doing this tutorial in a classroom, the presenter will be available should you have questions or run into difficulties.

We're using Amber, so first we'll look for available modules. On Prince:
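A minimal sketch of that lookup, assuming Prince provides the usual `module` command (Environment Modules or Lmod). The first line only defines a stand-in so the snippet also runs on a machine without modules; on Prince it has no effect.

```shell
# Stand-in for machines without Environment Modules; does nothing on Prince.
type module >/dev/null 2>&1 || module() { echo "(no module command here: module $*)"; }

# List every available modulefile whose name matches "amber".
module avail amber
```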

There are quite a few versions there. We'll select a recent one, purging first to ensure we start from a clean environment:
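The two steps might look like the sketch below. The module name `amber/VERSION` is a placeholder, not a real version: substitute one of the names printed by `module avail amber`. The variable `AMBER_MODULE` is introduced here purely for illustration, and the first line is again only a stand-in for machines without modules.

```shell
# Stand-in for machines without Environment Modules; does nothing on Prince.
type module >/dev/null 2>&1 || module() { echo "(no module command here: module $*)"; }

# Placeholder, not a real version: replace with a name from `module avail amber`.
AMBER_MODULE="amber/VERSION"

module purge                   # unload everything currently loaded
module load "$AMBER_MODULE"    # load Amber and whatever the modulefile pulls in
```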

Take a look at what it did:
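One way to check, assuming the standard `module list` subcommand (the first line is only a stand-in for machines without modules):

```shell
# Stand-in for machines without Environment Modules; does nothing on Prince.
type module >/dev/null 2>&1 || module() { echo "(no module command here: module $*)"; }

# Show the modules currently loaded in this shell.
module list
```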

... clearly, Amber uses a lot of other packages. The modulefile has looked after loading the correct ones.

Loading the module set a useful environment variable, AMBERHOME. We'll get our test case from there:

$ ls $AMBERHOME/test/amoeba_jac/
amoeba_jac.ips.mdout.save
amoeba_jac.mdout.save
amoeba_jac.pmemd.mdout.save
inpcrd
inpcrd.rst7
prmtop
Run.amoeba_jac
Run.amoeba_jac.ips
Run.amoeba_jac.pmemd

We are interested in the input files (inpcrd, inpcrd.rst7 and prmtop) and the Run.amoeba_jac script. Make a directory in your $HOME and copy them there.

Run.amoeba_jac is a script for running the test after building Amber. It's not exactly what we want, but we'll take some parts of it. First, we need the namelist file it creates, so we'll copy the lines between "cat > mdin <<EOF" and "EOF" into a file called "mdin". (Note the liberal use of shortcuts in the snapshot below. The TAB key is also useful here!)

$ mkdir $HOME/tutorial-2-ex1
$ cd !$
$ cp $AMBERHOME/test/amoeba_jac/[ip]* .
$ cp $AMBERHOME/test/amoeba_jac/Run.amoeba_jac .
$ cp Run.amoeba_jac mdin
$ vi mdin 

# ... and selectively delete the unwanted lines  
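If you would rather not edit by hand, the same extraction can be sketched with awk. This assumes the script contains a line with exactly "cat > mdin <<EOF" and a closing line beginning with "EOF", as described above; the function name `extract_mdin` is made up for this example.

```shell
# Copy the lines between "cat > mdin <<EOF" and the closing "EOF" into a new file.
#   $1: script containing the here-document
#   $2: file to write
extract_mdin() {
    awk '/cat > mdin <<EOF/ { inside = 1; next }   # start copying after this line
         /^EOF/             { inside = 0 }         # stop at the terminator
         inside' "$1" > "$2"
}

# Usage, from the tutorial directory:
#   extract_mdin Run.amoeba_jac mdin
```

Either way, compare the result against the listing that follows before using it.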

It should come out looking like this: (don't worry about the nstlim line just yet)

short md, nve ensemble
&cntrl
ntx=1, irest=0,
nstlim=10,
ntpr=1, ntwr=10000,
dt=0.001, vlimit=10.0,
cut=8., jfastw=4,
ntt=1, temp0=50.0,tempi=0.0,
iamoeba=1,
/
&ewald
nfft1=80,nfft2=80,nfft3=80,
skinnb=2.,nbtell=0,order=5,ew_coeff=0.45,
/
&amoeba
do_bond=1,do_ureyb=1,do_reg_angle=1,do_trig_angle=1,
do_opbend=1,do_torsion=1,do_pi_torsion=1,do_strbend=1,
do_torsion_torsion=1,do_amoeba_nonbond=1,
dipole_scf_tol = 0.01,dipole_scf_iter_max=20,
sor_coefficient=0.7,ee_damped_cut=4.5,ee_dsum_cut=6.7,
beeman_integrator=1,

/

Next we need to write a job script.

What resources will we need?

  • It's a serial job, so nodes=1 and ntasks=1
  • This test uses about 800MB of memory. We'll request 1GB to be sure
  • The test takes less than a minute. We'll request 5 minutes. You can optionally increase the run length (currently 10 steps of 1 fs, i.e. 10 fs) by increasing nstlim in the mdin file (the nstlim=10 line above).
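The run length is simply nstlim multiplied by dt, where Amber reads dt in picoseconds. A quick check of that arithmetic, with the values copied by hand from the mdin file above (the shell variables are only for illustration):

```shell
nstlim=10    # number of MD steps, as in mdin
dt=0.001     # time step in picoseconds, as in mdin

# Simulated time in femtoseconds: steps x time step (ps) x 1000 fs/ps.
awk -v n="$nstlim" -v dt="$dt" 'BEGIN { printf "%g fs\n", n * dt * 1000 }'
```

Raising nstlim to 1000, for example, would simulate 1000 fs (1 ps) with the same time step.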

As for the workflow, the script needs to:

  • Load the Environment Module for Amber
  • Set up a run directory. We'll run in $SCRATCH, even though this is a very small test
    • NOTE: currently $SCRATCH is only defined for login (usually just interactive) shells. We can make any shell a login shell (even if not interactive) by calling bash (or csh) with the -l option. See the first line of the script below.
    • copy the input files there
  • Change into the run directory and run the job
    • The command to start this Amber job can be determined from the script Run.amoeba_jac:
      sander -O -i mdin -o amoeba_jac.mdout
    • We can get a report at the end of the run on the time and memory resources used by using the "time" command. 
      Here we give it an explicit path because bash has its own built-in "time" keyword, and we want the more powerful command from /usr/bin.
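Putting the resource requests and the workflow together, a job script might look like the sketch below, written here as a here-document so it can be created straight from the terminal. Several details are assumptions: `amber/VERSION` is a placeholder for a real module name, the run directory name `tutorial-2-ex1-$SLURM_JOB_ID` is just one way of giving the directory a job id suffix, and the sander command relies on sander's default file names (prmtop and inpcrd) for the topology and coordinates.

```shell
# Write the job script; the quoted 'EOF' keeps $-variables unexpanded until the job runs.
cat > my_job_script.s <<'EOF'
#!/bin/bash -l
# The -l above makes this a login shell, so that $SCRATCH is defined.

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=1GB
#SBATCH --time=00:05:00

# Start from a clean environment, then load Amber.
# "amber/VERSION" is a placeholder: use a name listed by `module avail amber`.
module purge
module load amber/VERSION

# Set up a run directory in $SCRATCH with the job id as suffix (name is an assumption).
RUNDIR=$SCRATCH/tutorial-2-ex1-$SLURM_JOB_ID
mkdir -p "$RUNDIR"

# Copy the input files prepared earlier.
cp "$HOME"/tutorial-2-ex1/{mdin,inpcrd,prmtop} "$RUNDIR"

# Change into the run directory and run the job, reporting time and memory used.
cd "$RUNDIR"
/usr/bin/time -v sander -O -i mdin -o amoeba_jac.mdout
EOF
```

The heredoc only creates the file my_job_script.s; nothing is submitted until sbatch is called.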

 


Now we can submit our job and monitor its progress:

$ sbatch my_job_script.s

You'll get a job id returned.

Is it running yet?

$ squeue -u $USER

You could watch the output in the run directory:

$ ls -l ${SCRATCH}

Look for a directory whose suffix is the job id. Inside it, a file called amoeba_jac.mdout will be growing, with the output of the Amber run.
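A small helper for that lookup might look like the following sketch. The function name `find_run_dir` is made up for this example, and it assumes the run directory sits directly under $SCRATCH and ends in the job id. The job id 12418 is only an example; use the one sbatch returned.

```shell
# Print the first directory entry under $SCRATCH whose name ends in the given job id.
find_run_dir() {
    ls -d "$SCRATCH"/*"$1" 2>/dev/null | head -n 1
}

# Usage: follow the Amber output as it grows (Ctrl-C to stop).
#   tail -f "$(find_run_dir 12418)/amoeba_jac.mdout"
```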

Finally, when the job finishes, you should see a slurm-*.out file in the directory you submitted from.

 

$ ls -lt
-rw-rw-r-- 1 johd johd      766 Jan 20 14:35 slurm-12418.out

Use 'cat' to see what is in the .out file. It's the timing information from /usr/bin/time -v. The job was very short! Try increasing nstlim (as mentioned above) so the job will take at least two or three minutes.

Exercise

Experiment with sbatch options for the job name, the merging and location of the output and error files, and resource limits. What happens if the job exceeds them? (You might need to increase nstlim quite a lot to hit a resource limit.)
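As a starting point, the directives below could be added near the top of the job script. The job name and file names are made-up examples; `%j` in a file name is replaced by the job id. With only `--output` given, Slurm writes standard error to the same file, and adding `--error` separates the two.

```shell
#SBATCH --job-name=amoeba-test
#SBATCH --output=amoeba-%j.out
#SBATCH --error=amoeba-%j.err
#SBATCH --time=00:02:00
#SBATCH --mem=500MB
```

The time and memory values here are deliberately tighter than the earlier requests, so that a longer run has a chance of hitting them.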

