Creating SLURM job submission scripts
For a quick list of frequently used SLURM commands/options, take a look at this cheat sheet. This guide will show you how to generate SLURM job submission scripts for your commands, easily and efficiently!
1. Generate commands
Job submission scripts are configured to run frequently used programs with settings that are tailored to fit user needs. The idea is to make it simple to run frequently used programs without having to read the entire manual.
Some of the common bioinformatics job scripts are runBWA.sh
, runGSNAP.sh
, runBLASTn
, etc. Here we will take runBLASTn.sh as an example, but you can use this for almost any of the other run scripts as well as your own commands.
First, lets generate the commands!
Let’s assiume we have 5 fasta files: file0.fsa
, file1.fsa
, file2.fsa
, file3.fsa
and file4.fsa
(each with 1000 sequences) for which we need to run BLAST against nr
database.
For the purpose of this exercise, simply create 5 empty files like this:
touch file0.fsa file1.fsa file2.fsa file3.fsa file4.fsa
You can use the ls
command to confirm the files exist now.
ls file0.fsa file1.fsa file2.fsa file3.fsa file4.fsa
Then we can generate the BLAST execution commands as follows:
for file in file?.fsa; do
echo ./runBLASTn.sh $file;
done > blastn.cmds
The blastn.cmds
now contains 5 lines, each set to run blastn
on each of the file.
less blastn.cmds ./runBLASTn.sh file0.fsa ./runBLASTn.sh file1.fsa ./runBLASTn.sh file2.fsa ./runBLASTn.sh file3.fsa ./runBLASTn.sh file4.fsa
2. Create submission files
To create slurm script for each one of the lines in the blastn.cmds
file, we run the (downloaded) makeSLURMs.sh script as follows:
makeSLURMs.py 1 blastn.cmds
Here, 1
is to tell that put one job per submission file, and blastn.cmds
is the commands file, generated in the previous step.
Once the makeSLURMs.py
command is executed, you should see 5 .sub
files, each identified by numbers corresponding to input files.
The content should look like this:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 36
#SBATCH -t 96:00:00
#SBATCH -J blastn_0
#SBATCH -o blastn_0.o%j
#SBATCH -e blastn_0.e%j
#SBATCH --mail-user=arnstrm@gmail.com
#SBATCH --mail-type=begin
#SBATCH --mail-type=end
cd $SLURM_SUBMIT_DIR
ulimit -s unlimited
module purge
module use /opt/rit/spack-modules/lmod/linux-rhel7-x86_64/Core
module use /opt/rit/spack-modules/lmod/linux-rhel7-x86_64/gcc/7.3.0
#module use /work/GIF/software/modules
./runBLASTn.sh file1.fsa
scontrol show job $SLURM_JOB_ID
3. Submit the jobs
The next step is to submit the jobs to the cluster. It is seamless when using the for loop:
for f in blastn*.slurm; do
sbatch $f;
done
All commands will be submitted to the queue and will start running as per the availability of nodes.