Introduction
In late 2021, Singularity underwent a rebranding to Apptainer, changing the command keyword fromsingularity
to apptainer
, though the majority of subsequent commands and options retained their functionality. Learn more about the changes in the tutorial Apptainer: the container system for secure HPC.
Apptainer is the recommended module for container management, and SCINet users are advised to transition to it for enhanced functionality and support.
What are Containers?
A container can be thought of as a light weight Virtual machine that are in itself a portable linux environment. They help with reproducibility and dependencies required to run a script, program or pipeline.
Initial setup
Before diving into container tasks on HPC infrastructure:
- ensure you checked the available module using the commands:
module avail apptainer module avail singularity
- load a module of your choice:
module load apptainer/<version> # e.g., module load apptainer/1.2.5
- and consistently use the appropriate keyword for your commands (in this example: apptainer).
In order to use the container you will need to load the Singularity module on your HPC resource or install it on your local machine. This will vary from HPC resource to HPC resource. You can find what your HPC called the module by using the module avail
command. In many cases it will be as simple as the command below.
module load apptainer
On the SCINet HPC, please note that the archival version of Singularity is still accessible.
If required, you can load it using the singularityCE/3.11.4
module.
module load singularityCE/3.11.4
Finding and Downloading a Singularity Container
You can find containers that are compatible with Singularity in several main places:
resource | description |
---|---|
Singularity Hub | Singularity Hub is a dedicated repository for Singularity containers. It hosts a wide range of containers that are specifically built and optimized for use with Singularity. |
Docker Hub | Docker Hub is a popular repository for Docker images, many of which can be converted to Singularity format using tools like docker2singularity . |
Biocontainers | Biocontainers is a collection of bioinformatics software packages provided in the form of Docker images. These images can often be converted to Singularity format and used for bioinformatics analysis. |
Research Groups, Research Projects | Many research groups and projects develop and share their own Singularity containers tailored to their specific needs. These containers may be hosted on their websites, GitHub repositories, or other platforms. |
Community Forums and Websites | Community forums, mailing lists, and websites related to specific software or fields often provide links to Singularity-compatible containers shared by users and developers. |
The first time you use singularity it will by default put a .singularity
folder in your HOME
directory which commonly has limited storage space. Therefore it is important that you move that folder to a different location and then create a softlink from your home directory to the new location.
pull from Docker Hub
Docker Hub is a repository of Docker images, hosting a vast collection of containerized software. When searching for a tool on Docker Hub, utilize keywords, filters, and ratings for efficient navigation and selection.
The container used in this example contains the MAKER software tool, which is commonly used for genome annotation.
By pulling and running this Docker container, users can easily utilize the maker
tool without needing to install dependencies or manage software versions manually.
Pull the container from Docker hub:
apptainer pull docker://sjackman/maker
Pulling the specified container with apptainer may require some time…
2024/04/29 15:11:46 info unpack layer: sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 INFO: Creating SIF file...
and upon completion, it will generate a Singularity Image Format (SIF):
ls
maker_latest.sif
To run a containerized tool, use the exec
command with the apptainer
keyword:
apptainer exec --bind $PWD maker_latest.sif maker --help
This command executes the maker tool inside the maker_latest.sif
container while binding the current directory ($PWD
) to the container’s filesystem and passing the --help
argument to the maker tool.
see the expected output
MAKER version 2.31.6 Usage: maker [options] {maker_opts} {maker_bopts} {maker_exe} Description: MAKER is a program that produces gene annotations in GFF3 format using evidence such as EST alignments and protein homology. MAKER can be used to produce gene annotations for new genomes as well as update annotations from existing genome databases. The three input arguments are control files that specify how MAKER should behave. All options for MAKER should be set in the control files, but a few can also be set on the command line. Command line options provide a convenient machanism to override commonly altered control file values. MAKER will automatically search for the control files in the current working directory if they are not specified on the command line. Input files listed in the control options files must be in fasta format unless otherwise specified. Please see MAKER documentation to learn more about control file configuration. MAKER will automatically try and locate the user control files in the current working directory if these arguments are not supplied when initializing MAKER. It is important to note that MAKER does not try and recalculated data that it has already calculated. For example, if you run an analysis twice on the same dataset you will notice that MAKER does not rerun any of the BLAST analyses, but instead uses the blast analyses stored from the previous run. To force MAKER to rerun all analyses, use the -f flag. MAKER also supports parallelization via MPI on computer clusters. Just launch MAKER via mpiexec (i.e. mpiexec -n 40 maker). MPI support must be configured during the MAKER installation process for this to work though Options: -genome|g {file} Overrides the genome file path in the control files -RM_off|R Turns all repeat masking options off. -datastore/ Forcably turn on/off MAKER's two deep directory nodatastore structure for output. Always on by default. -old_struct Use the old directory styles (MAKER 2.26 and lower) -base {string} Set the base name MAKER uses to save output files. MAKER uses the input genome file name by default. -tries|t {integer} Run contigs up to the specified number of tries. -cpus|c {integer} Tells how many cpus to use for BLAST analysis. Note: this is for BLAST and not for MPI! -force|f Forces MAKER to delete old files before running again. This will require all blast analyses to be rerun. -again|a recaculate all annotations and output files even if no settings have changed. Does not delete old analyses. -quiet|q Regular quiet. Only a handlful of status messages. -qq Even more quiet. There are no status messages. -dsindex Quickly generate datastore index file. Note that this will not check if run settings have changed on contigs -nolock Turn off file locks. May be usful on some file systems, but can cause race conditions if running in parallel. -TMP Specify temporary directory to use. -CTL Generate empty control files in the current directory. -OPTS Generates just the maker_opts.ctl file. -BOPTS Generates just the maker_bopts.ctl file. -EXE Generates just the maker_exe.ctl file. -MWAS {option} Easy way to control mwas_server for web-based GUI options: STOP START RESTART -version Prints the MAKER version. -help|? Prints this usage statement.
pull from Singularity Hub
Singularity Hub has transitioned to DataLad in early 2024 involves a significant change in website layout. However, the archival containers from Singularity Hub are still accessible for those who need them.
The new DataLad interface presents a simpler directory structure, displaying collections of datasets in a table format.
- You can find collections under clearly labeled folders, such as “data,” “logs,” and others, with metadata like size and last modified date.
- The search bar on the DataLad website facilitates locating specific datasets or tools.
- Additionally, instructions on how to install datasets from the DataLad server are displayed in the top-right corner.
First go to DataLad and locate the container you want through the search box located on the top-right corner of the page.
Archived containers from Singularity Hub are now accessible via shub:// and can be pulled using standard Apptainer commands. To find and pull a specific container, like utilities
version 1.0.1
, search for the repository name ISUGIFsingularity
.
N avigate by clicking through the nested folders to the desired container, and then copy the blue path displayed at the top. This path is the reference you’ll use with shub:// to pull the container.
Unfortunately, the copied path (ISUGIFsingularity / utilities / 1.0.1
) requires some adjustments:
- you need to replace the last slash
/
with a colon:
to specify the container version - and remove all white spaces.
The adjusted path will look like ISUGIFsingularity/utilities:1.0.1
, ready for your apptainer command:
apptainer pull shub://ISUGIFsingularity/utilities:1.0.1
INFO: Environment variable SINGULARITY_CACHEDIR is set, but APPTAINER_CACHEDIR is preferred INFO: Downloading shub image 392.2MiB / 392.2MiB [=======================================================================================] 100 % 65.4 MiB/s 0s
ls
utilities_1.0.1.sif
If you get a CERTIFICATE_VERIFY_FAILED: error then you can set your python certificate verification to off.
export PYTHONHTTPSVERIFY=0
Direct execution of Singularity containers
Containers often have runscripts that will provide you with useful information on how to use the container. The run scripts get initiated by executing the image as follows:
./ISUGIFsingularity-utilities-master-1.0.1.simg
Which in this case produces a list of files that can be called by the image.
colsum createhist.awk intervalBins.awk nb nbsearch new_Assemblathon.pl readme README.md seqlen.awk summary.sh
As I mentioned above, these are useful scripts often used in our group. One script borrowed from the Assemblathon paper, new_Assemblathon.pl, we use when evaluating genome assemblies and want to get a summary of the assembly statistics. To use this script via the container we can execute it in the following manner.
singularity exec ISUGIFsingularity-utilities-master-1.0.1.simg new_Assemblathon.pl Spirochaete.fasta
This way of executing containers is kind of tedious as it requires so much more than just the new_Assemblathon.pl script that you could just download and place it somewhere in your path. Later in this tutorial, we will show you how to create a bash script wrapper that will simplify calling the function from within the container. In most cases creating a container for a simple script (perl, python, bash) doesn’t make sense as it creates a small overhead to load singularity and the container before executing the script. Where containers shine, is when your script or pipeline requires lots and lots of software prerequisites (for example: a specific version of perl, blast, qiime, samtools, etc)
We ran the new_Assemblathon.pl script from the container on a spirochaete genome we had handy and it produced the desired output.
---------------- Information for assembly '/Users/severin/Downloads/Spirochaete.fasta' ----------------
Number of scaffolds 1
Total size of scaffolds 3251735
Longest scaffold 3251735
Shortest scaffold 3251735
Number of scaffolds > 1K nt 1 100.0%
Number of scaffolds > 10K nt 1 100.0%
Number of scaffolds > 100K nt 1 100.0%
Number of scaffolds > 1M nt 1 100.0%
Number of scaffolds > 10M nt 0 0.0%
Mean scaffold size 3251735
Median scaffold size 3251735
N50 scaffold length 3251735
L50 scaffold count 1
scaffold %A 25.62
scaffold %C 24.57
scaffold %G 24.21
scaffold %T 25.60
scaffold %N 0.00
scaffold %non-ACGTN 0.00
Number of scaffold non-ACGTN nt 0
...
This output has been truncated
How to install and use Singularity on your local machine (Mac)
If you want to explore containers on your local Mac you canfFollow the directions on this website http://singularity.lbl.gov/install-mac
Note: You may need to issue the following command if you update your operating system or get a new computer.
Note you may need to allow Oracle permission via your security settings if you are on a mac. See this website
Starting a singularity Virtual Machine (VM) instance on the Mac
If you are using your local machine, this will allow you to not only execute containers but also build containers from recipes and test them out.
mkdir singularity-vm
cd singularity-vm
vagrant destroy
vagrant init singularityware/singularity-2.4
vagrant up
vagrant ssh
Running a command using a Singularity container. (Same as above)
singularity exec ISUGIFsingularity-utilities-master-1.0.1.simg new_Assemblathon.pl Spirochaete.fasta
transferring files off and onto a local vagrant virtual machines
You probably started your vm and realized that you can’t access any of your files to try out the command above.
If you are running a VM locally to use singularity, you can transfer files to and from your VM using scp
and the VM private key.
Change into the folder you initiated your vagrant vm singularity-vm
and run the vagrant ssh-config command to get the private key
vagrant ssh-config
OUTPUT
Host default
HostName 127.0.0.1
User vagrant
Port 2222
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
PasswordAuthentication no
IdentityFile /Users/severin/singularity-vm/.vagrant/machines/default/virtualbox/private_key
IdentitiesOnly yes
LogLevel FATAL
Then using that key scp files off of the virtual box from a newly opened terminal that is not in the VM.
scp -P 2222 -i /Users/severin/singularity-vm/.vagrant/machines/default/virtualbox/private_key vagrant@127.0.0.1:/home/vagrant/recipe .
or transfer a file to the VM.
scp -P 2222 -i /Users/severin/singularity-vm/.vagrant/machines/default/virtualbox/private_key ~/Downloads/Spirochaete.fasta vagrant@127.0.0.1:/home/vagrant/
creating wrappers for the singularity commands
Here is an example bash script wrapper for a singularity execution of a command in a container.
new_assemblathon
#!/bin/bash
singularity exec ISUGIFsingularity-utilities-master.simg new_Assemblathon.pl
This wrapper contains the singularity command and once put in your path you can just use
new_Assemblathon
to execute the script via the singularity container.