Introduction
A quick reference sheet for the SLURM resource manager.
Job scheduling commands
| Command | Function | Basic Usage | Example |
|---|---|---|---|
| sbatch | submit a SLURM batch job | sbatch [script] | $ sbatch job.sub |
| scancel | cancel a SLURM batch job | scancel [job_id] | $ scancel 123456 |
| scontrol hold | hold a SLURM batch job | scontrol hold [job_id] | $ scontrol hold 123456 |
| scontrol release | release a held SLURM batch job | scontrol release [job_id] | $ scontrol release 123456 |
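As an example, a typical submit-and-manage cycle using these commands might look like the sketch below (the job ID 123456 is a placeholder for whatever ID sbatch actually prints):

```bash
# submit the script; sbatch prints the assigned job ID
sbatch job.sub            # -> Submitted batch job 123456

# put the pending job on hold, then release it again
scontrol hold 123456
scontrol release 123456

# cancel the job entirely
scancel 123456
```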
Job management commands
| Command | Function |
|---|---|
| sinfo -a | list all queues |
| squeue | list all jobs |
| squeue -u userid | list jobs for userid |
| squeue -t R | list running jobs |
| smap | show jobs, partitions and nodes in a graphical network topology |
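These flags can be combined. For example, to list only your own running jobs (assuming $USER matches your cluster userid):

```bash
# show only running (-t R) jobs that belong to you
squeue -u $USER -t R
```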
Job script basics
A typical job script will look like this:
```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --time=02:00:00
#SBATCH --mem=128G
#SBATCH --mail-user=netid@gmail.com
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
#SBATCH --error=JobName.%J.err
#SBATCH --output=JobName.%J.out

cd $SLURM_SUBMIT_DIR
module load <modulename>
# your commands go below
```
Lines starting with #SBATCH are directives to the SLURM resource manager, used to request resources for your job on the HPC cluster. Some important options are as follows:
| Option | Example | Description |
|---|---|---|
| --nodes | #SBATCH --nodes=1 | Number of nodes |
| --cpus-per-task | #SBATCH --cpus-per-task=16 | Number of CPUs per task |
| --time | #SBATCH --time=HH:MM:SS | Total walltime requested for your job |
| --output | #SBATCH --output=filename | Redirect STDOUT to a file |
| --error | #SBATCH --error=filename | Redirect STDERR to a file |
| --mail-user | #SBATCH --mail-user=user@domain.edu | Email address to send notifications to |
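Any of these options can also be given to sbatch on the command line, where they override the corresponding #SBATCH directives in the script. For example:

```bash
# request 4 hours and 2 nodes for this submission only,
# overriding the --time and --nodes values set inside job.sub
sbatch --time=04:00:00 --nodes=2 job.sub
```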
Interactive session
To start an interactive session, execute the following:

```bash
srun -N 1 -t 4:00:00 --pty /bin/bash
```

This command requests 1 node for 4 hours and opens an interactive bash shell on it.
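Depending on your cluster, you may also want to request CPUs and memory explicitly; a minimal sketch (the partition name interactive is a placeholder, check your site's documentation for the right queue):

```bash
# 1 node, 1 task with 4 CPUs and 16 GB of memory, for 2 hours;
# the partition name below is a placeholder for your site's interactive queue
srun -N 1 -n 1 -c 4 --mem=16G -t 2:00:00 -p interactive --pty /bin/bash
```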
Getting info on past jobs
You can query the SLURM accounting database to see how much memory your previous jobs used:

```bash
sacct -j <JOBID> --format JobID,Partition,Submit,Start,End,NodeList%40,ReqMem,MaxRSS,MaxRSSNode,MaxRSSTask,MaxVMSize,ExitCode
```

This command reports the requested memory (ReqMem) along with the resident (MaxRSS) and virtual (MaxVMSize) memory actually used by job JOBID.
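If you check past jobs often, you can wrap this call in a small shell function; a minimal sketch (the name jobinfo is arbitrary, place it in your .bashrc):

```bash
# print accounting information for a given job ID
# usage: jobinfo 123456
jobinfo() {
    sacct -j "$1" --format JobID,Partition,Submit,Start,End,NodeList%40,ReqMem,MaxRSS,MaxRSSNode,MaxRSSTask,MaxVMSize,ExitCode
}
```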
Useful aliases
Here are a few aliases that present useful information parsed from the SLURM commands.
Place these aliases in your .bashrc, following the instructions in this practical tutorial.
```bash
alias si="sinfo -o \"%20P %5D %14F %8z %10m %10d %11l %16f %N\""
alias sq="squeue -o \"%8i %12j %4t %10u %20q %20a %10g %20P %10Q %5D %11l %11L %R\""
```
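Once defined (after sourcing your .bashrc or opening a new shell), the aliases accept extra flags just like the underlying commands. For example:

```bash
si            # partition overview: node counts, memory, timelimit, features
sq -u $USER   # your own jobs, in the custom format
```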
Resources
More information about Slurm can be found in external resources such as the official documentation at https://slurm.schedmd.com.
Further Reading
- Creating SLURM job submission scripts
- Submitting dependency jobs using SLURM
- PBS: Portable Batch System
- PBS commands
- Creating PBS job submission scripts
- Submitting dependency jobs using PBS
- Introduction to GNU parallel
- Introduction to containers