Software on HPC - overview
The software available on a high-performance computing (HPC) system can vary depending on the specific system and its intended use. However, some common types of software that are typically available on HPC systems include:
1. compilers
[ a special program that translates a programming language’s source code into machine code ]
HPC systems often have a variety of compilers installed, such as GCC ⤴ and Intel Fortran ⤴, that can be used to compile and optimize code for the system’s architecture.
2. programming languages
[ a system of syntax and semantics for writing instructions that can be executed by a computer ]
HPC systems often have a wide range of programming languages installed, such as C ⤴, C++ ⤴, Fortran ⤴, Python ⤴, and R ⤴, which can be used to write and run code on the system.
3. libraries and frameworks
[ ready-made code (functions and classes) to solve common tasks and boost development ]
HPC systems often have a variety of libraries and frameworks available, such as the MPI library ⤴ for parallel programming and the CUDA library ⤴ for GPU programming.
4. basic visualization software
[ remote visualization to display data and derive meaningful insights ]
HPC systems often have software for visualizing and analyzing data, such as ParaView ⤴, VisIt ⤴, gnuplot ⤴, VMD ⤴, and others ⤴.
5. job schedulers
[ a computer application for controlling resources and execution of jobs ]
HPC systems often have a job scheduler installed, such as Slurm ⤴, PBS ⤴ or LSF ⤴, which is responsible for managing the allocation of resources and scheduling jobs to run on the system.
6. data management
HPC systems have data management software, including:
category | examples | description |
---|---|---|
distributed file systems | Lustre ⤴, GlusterFS ⤴, GPFS ⤴ | enable efficient data sharing and collaboration among multiple users |
backup and archiving software | Amanda ⤴, Bacula ⤴, Tivoli Storage Manager ⤴ | protect data by creating regular backups and archiving older data to long-term storage |
data transfer software | Globus ⤴, GridFTP ⤴, Aspera ⤴ | transfer large amounts of data quickly and efficiently between HPC systems and other storage or computing machines |
data cataloging and metadata management software | iRODS ⤴, Dataverse ⤴, XNAT ⤴ | manage and organize large amounts of data, and provide search capabilities |
database management software | MySQL ⤴, MongoDB ⤴ | store, manage, and analyze large amounts of structured data |
How to find available software?
There are several ways to find available software on a high-performance computing (HPC) system. For some of them you can find the hands-on mini tutorials in the following subsections:
- A. Software as built-in commands
- B. Software as built-in modules
- C. Software via package manager
- D. Check the documentation
- E. Ask the system administrator
It’s worth noting that each HPC system can have a different way of managing and organizing the software, so it’s best to consult the documentation or ask the system administrator for specific instructions.
Software as built-in commands
There are many different types of software that may be available as built-in commands on a high-performance computing (HPC) system. Some examples include:
- System utilities, see section
- Text processing and manipulation tools, see section
- Compression and archiving tools, see section
- Job management tools, see section
- Remote access tools, see section
- File transfer tools, see section
How to check built-in commands?
There are a few ways to check the list of available built-in commands on a high-performance computing (HPC) system:
- Using the help command
Many HPC systems run the bash shell, whose help command prints the list of shell built-in commands. The man command, followed by a command name, opens the manual page of any installed command.
For example:
help
man ls
- Using the builtin command
The bash shell has a builtin command, which runs a shell built-in directly, and a compgen command, whose -b option lists the names of all shell built-ins.
Try this command on your HPC system:
builtin compgen -b
- Using the alias command
The alias command can be used to view a list of all currently defined aliases, which are often used to create custom built-in commands. Learn more from subsection Define aliases in the practical tutorial in this workbook.
alias
- Examining shell initialization files
Some HPC systems may define built-in commands in shell initialization files such as .bashrc, .bash_profile, .bash_aliases or similar. The user can check these files for custom built-in commands.
Learn more from hands-on tutorials available in this workbook:
section Unix Shell Configuration: .bashrc & .bash_profile in the tutorial
tutorial Setting up your home directory for data analysis
tutorial Example .bashrc file configuration
- Trying to use the desired command
If you know what the command corresponding to the program could be called, you can always try calling it in the terminal window. If such a command exists, then calling it with the -h or --help flag will usually display the available options.
For example:
chmod -h
abadacz@MacBook(bash):bin$ chmod -h
usage: chmod [-fhv] [-R [-H | -L | -P]] [-a | +a | =a [i][# [ n]]] mode|entry file ...
       chmod [-fhv] [-R [-H | -L | -P]] [-E | -C | -N | -i | -I] file ...
If such a command does not exist, then an error message will be printed.
random_command
abadacz@MacBook(bash):bin$ random_command
-bash: random_command: command not found
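The checks described above can be combined into a short script. This is a minimal sketch, not a standard tool: the helper name describe_command is hypothetical, and the compgen step only runs when the script is executed by bash.

```shell
# Show the first few shell built-ins (compgen is specific to bash).
if [ -n "${BASH_VERSION:-}" ]; then
    compgen -b | head -n 5
fi

# describe_command is a hypothetical helper for illustration.
# command -v prints how the shell resolves a name and fails if it cannot.
describe_command() {
    if resolved=$(command -v "$1"); then
        echo "$1 -> $resolved"
    else
        echo "$1: command not found"
        return 1
    fi
}

describe_command cd                     # a shell built-in
describe_command ls                     # usually an external program
describe_command random_command || true # expected to be missing
```

Using command -v instead of simply running the program avoids starting software as a side effect of checking whether it exists.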
Explore example software typically available as built-in commands:
1. System utilities
Basic system utilities, such as ls, cd and mkdir are often available as built-in commands. These utilities allow users to:
- navigate the file system,
- manage files and directories,
- and perform other basic tasks.
2. Text processing and manipulation tools
Some common text processing and manipulation tools like sed, awk, grep and cut are often available, which allow users to manipulate and extract data from text files or command-line text streams.
There are also built-in command-line text editors with a basic text-based interface, such as nano or vim, which allow you to write a script, edit a configuration file, modify a data file, or create a quick note or documentation.
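As a small illustration of how these tools divide the work, the sketch below runs each of them on a made-up three-line table. The data (job name, user, CPU hours) is hypothetical and exists only for this example.

```shell
# Hypothetical sample data: job name, user, CPU hours (comma-separated).
data=$(mktemp)
printf '%s\n' 'align,alice,12' 'assemble,bob,30' 'annotate,alice,8' > "$data"

# grep: keep only the lines that mention alice
grep 'alice' "$data"

# cut: print the first column (job names), using ',' as the delimiter
cut -d',' -f1 "$data"

# awk: sum the third column (CPU hours) over all rows
total=$(awk -F',' '{ sum += $3 } END { print sum }' "$data")
echo "total CPU hours: $total"

# sed: replace the first 'alice' on each line with 'ALICE'
# (only the printed output changes; the file itself is not modified)
sed 's/alice/ALICE/' "$data"

rm -f "$data"
```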
3. Compression and archiving tools
Tools like gzip, tar and zip are often available, which allow users to compress and archive large files and directories.
Quick cheatsheet
Compress a single file:
gzip -c filename > filename.gz
Compress all files in a directory:
gzip -r directory
Decompress a single file:
gzip -d filename.gz
Decompress all files in a directory:
gzip -dr directory
Compress an entire directory or a single file to a .tar.gz archive:
tar -czvf archive_name.tar.gz /path/to/directory-or-file
Extract the .tar.gz archive:
tar -xzvf archive_name.tar.gz
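To see the cheatsheet commands working together, the following sketch performs a round trip in a temporary directory: it archives a small folder, lists the archive, extracts it elsewhere, and compares the result with the original. The file and directory names are made up for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
echo "sample output" > "$workdir/results/run1.txt"
echo "more output"   > "$workdir/results/run2.txt"

# Create a compressed archive; -C makes the stored paths relative to $workdir.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents without extracting anything.
tar -tzf "$workdir/results.tar.gz"

# Extract into a separate directory and compare with the original.
mkdir -p "$workdir/restored"
tar -xzf "$workdir/results.tar.gz" -C "$workdir/restored"
if diff -r "$workdir/results" "$workdir/restored/results"; then
    roundtrip="identical"
else
    roundtrip="different"
fi
echo "round trip: $roundtrip"

# Single file: gzip -c writes to stdout and keeps the original file.
gzip -c "$workdir/results/run1.txt" > "$workdir/run1.txt.gz"
restored=$(gzip -dc "$workdir/run1.txt.gz")

rm -rf "$workdir"
```

Listing an archive with tar -tzf before extracting is a cheap way to check where its files will land.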
4. Job management tools
The PBS ⤴ or Slurm ⤴ tools are commonly used on HPC systems to submit and manage jobs on the system. These tools allow users to submit jobs, monitor the status of their jobs, and view the job queue.
Useful job management commands:
SLURM tools | PBS tools | description |
---|---|---|
squeue -u {user} | qstat -u {user} | gives info about user’s jobs |
sbatch {job_script} | qsub {job_script} | submits job to the queue |
scancel {jobID} | qdel {jobID} | stops and removes job |
sinfo -N -l | pbsnodes -l | gives info about queues, partitions, or nodes |
scontrol show | pbsnodes -l | provides details about jobs (scontrol show job {jobID}), partitions (scontrol show partition {pID}), or nodes (scontrol show nodes) |
seff {job_ID} | qstat -fxw {job_ID} | provides resource usage report for a finished job |
salloc | qsub -I {job_script} | starts interactive session |
Learn more from the practical Introduction to job scheduling tutorials (including SLURM and PBS) in this workbook.
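A minimal sketch of the submit-and-monitor cycle with Slurm is shown below. The job name, resource values, and file names are illustrative assumptions; many clusters additionally require an account or partition option, so check the local documentation. The script only submits the job if sbatch is actually present.

```shell
# Write a small job script to a temporary location (hypothetical file name).
job_dir=$(mktemp -d)
job_script="$job_dir/hello_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=hello
# Number of tasks (processes)
#SBATCH --ntasks=1
# Wall-time limit (HH:MM:SS)
#SBATCH --time=00:05:00
# Output file; %j expands to the job ID
#SBATCH --output=hello_%j.out

echo "Running on $(hostname)"
EOF

if command -v sbatch >/dev/null 2>&1; then
    sbatch "$job_script"                 # submit the job to the queue
    squeue -u "${USER:-$(id -un)}"       # list this user's jobs
else
    echo "sbatch not found; $job_script was written but not submitted"
fi
```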
5. Remote access tools
Tools like ssh, telnet and rlogin are often available, which allow users to remotely access and control other systems on the network.
6. File transfer tools
Tools such as scp and rsync are often available as built-in commands, which allow users to securely transfer files to and from the HPC system.
Copy data from local to remote machine (while being on a local machine):
# syntax:
scp <path_on_local>/<transferred_file_name> <user>@<hostname_to_remote>:<path_on_remote>
# example:
scp ~/.bashrc alex.badacz@atlas-dtn.hpc.msstate.edu:/project/90daydata/
Copy data from remote to local machine (while being on a local machine):
# syntax:
scp <user>@<hostname_to_remote>:<path_on_remote>/<transferred_file_name> <path_on_local>/
# example:
scp alex.badacz@atlas-dtn.hpc.msstate.edu:/project/90daydata/file.txt ~/DATA/
To copy directories use:
scp -r {path_to_the_source} {path_to_the_destination}
To synchronize the content in both locations, recursively transfer the data using the rsync command:
rsync -avz --no-p --no-g {path_to_the_source} {path_to_the_destination}
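The rsync options can be tried safely on two local temporary directories before pointing them at a remote host. The sketch below is an assumption-laden demo: the paths are throwaway, -z is omitted because compression brings nothing for a local copy, and a plain cp is used as a fallback on machines without rsync.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo "v1" > "$src/data.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n (--dry-run) previews the transfer without copying anything.
    rsync -avn --no-p --no-g "$src/" "$dst/"
    # The trailing slash on the source means "the contents of src",
    # not the directory src itself.
    rsync -av --no-p --no-g "$src/" "$dst/"
else
    # Fallback without rsync: a plain recursive copy, not a synchronization.
    cp -R "$src/." "$dst/"
fi

copied=$(cat "$dst/data.txt")
echo "destination now contains: $copied"
rm -rf "$src" "$dst"
```

For a real transfer, either path can be replaced by a remote location in the same user@host:path form that scp uses.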
…from the practical tutorials about available in section of this workbook.
If you seek for a guide about transferring data to , see tutorials:
Software as built-in modules
Many HPC systems use a system of software modules to manage and organize the software that is available. The Environment Modules ⤴ package lets users load and unload software packages, and specific versions of them, on demand, which makes an HPC system more flexible and accessible for a wide range of users.
The module command can be used to list the available modules, and to see which modules are currently loaded.
module avail # List available packages
module avail <name> # List available variants of a given package
module list # List currently loaded modules
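Because the module command is usually defined as a shell function by the system's login scripts, it may be missing in some shells. The sketch below checks for it first; the package name gcc in the comments is only an illustrative assumption and may not exist on your system.

```shell
if type module >/dev/null 2>&1; then
    modules_status="available"
    # Many module implementations print their listings to stderr, hence 2>&1.
    module avail 2>&1 | head -n 20
    module list 2>&1
else
    modules_status="unavailable"
    echo "The module command is not defined in this shell."
fi

# Typical follow-up on a system with modules (gcc is an illustrative name):
#   module load gcc      # add the package to the current environment
#   module unload gcc    # remove it again
#   module purge         # unload all currently loaded modules
```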
Software via package manager
The centralized package manager enables searching for and listing the available software packages on HPC systems. Different package managers match various operating systems. So first, check the operating system (OS) on your HPC infrastructure and identify the package manager in use. Then follow the cheatsheet below to search for the software needed. Learn more from the practical tutorial available in section of this workbook.
- for Ubuntu / Debian: .deb packages managed by apt and dpkg
# List installed and available packages:
apt list
# Search apt list for a given package:
apt search <software_name>
- for RHEL / Fedora / Rocky: .rpm packages managed by yum (yum has been supplanted by dnf)
# List installed and available packages:
yum list all
# List only available packages:
yum list available
# Search yum list for a given package:
yum search <software_name>
- for FreeBSD: .txz packages managed by pkg
# Search pkg list for a given package:
pkg search <software_name>
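If you are not sure which of these package managers applies, a short script can check for you. This is a rough sketch: it reads /etc/os-release when that file exists and then looks for the first of the four commands named above, so it will report none on systems that use something else.

```shell
# Report the operating system if the os-release file is present.
# The subshell keeps the variables it defines out of the current shell.
if [ -r /etc/os-release ]; then
    ( . /etc/os-release; echo "OS: ${PRETTY_NAME:-unknown}" )
fi

# Pick the first package manager found on the PATH.
pkg_manager="none"
for candidate in apt dnf yum pkg; do
    if command -v "$candidate" >/dev/null 2>&1; then
        pkg_manager=$candidate
        break
    fi
done
echo "package manager: $pkg_manager"

# Suggest the matching search command from the cheatsheet.
case $pkg_manager in
    apt|dnf|yum|pkg) echo "try: $pkg_manager search <software_name>" ;;
    *)               echo "no supported package manager detected" ;;
esac
```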
Learn more from external resources:
- Package Management Essentials ⤴ by DigitalOcean
- A Guide to Yum and Apt ⤴ by Baeldung
- Linux package management with YUM and RPM ⤴ by RedHat
Check the documentation
HPC systems often have extensive documentation available, including information on the software that is available. Users can consult the documentation to find a list of the software packages that are available. Primarily, such a list should contain:
- the licensed software, which may be available only for selected users
- the software with graphical interface (GUI) only, which may require the user to log on to the HPC via a web-based interface (instead of command line)
Software available on the SCINet HPC systems:
Software available on the ISU HPC systems:
Ask the system administrator
HPC administrators have access to information on all the installed software, so in case of any doubts, it’s best to reach out to them for assistance.
- regarding SCINet HPC, contact VRSC team: scinet_vrsc@usda.gov
- regarding ISU HPC, contact administrators: hpc-help@iastate.edu
• How to get new software installed?
1. Check that the software is not already installed (follow the guide in this tutorial)
2. Consider the following criteria:
- you think that the new software will be useful to many more users, or
- the software is licensed, or
- installation requires superuser privileges
If the answer to any is YES, contact the HPC administrator and submit a request for software installation.
Otherwise, go to step 3.
3. Go to the tutorial to learn how to install the necessary software yourself.