Software on HPC - overview
The software available on a High-Performance Computing system can vary depending on the specific system and its intended use. However, HPC systems typically have a wide range of software available, including:
- operating systems
- cluster management software, including job schedulers
- programming languages
- development tools, including compilers and debuggers
- scientific libraries to accelerate computations
- applications, including numerical simulations, data analysis, and visualization tools
Explore the tutorial in this workbook, to learn more about:
Software as built-in modules
Many HPC systems use software modules to manage and organize the software available to all users. The Environment Modules package simplifies the management and use of software on HPC systems (a benefit for cluster administrators) and improves the efficiency and reproducibility of research and development (a benefit for users). Modules are useful for managing different versions of applications. They can be loaded and unloaded easily, dynamically, and atomically, and a single module (one installed copy of the package) can be used by many users at the same time.
Benefits of using Environment Modules
The Environment Modules package provides several benefits for both users and system administrators.
A. For you as a user:
- Easy access to a wide range of software
  Environment Modules allows users to access a wide range of software that is installed on the system, without having to worry about dependencies or conflicts. This can help to make the system more accessible and user-friendly, especially for users who are not experts in the field.
- Easy access to multiple versions of software
  Environment Modules allows users to easily switch between different versions of software that are installed on the system, and to use the version of the software that is most appropriate for their needs. This is particularly useful when different versions of a software package have different dependencies or are compiled with different options.
- Improved reproducibility
  Using environment modules allows users to specify the version of the software they used in their computations, which can help to improve reproducibility and collaboration among researchers.
- Improved portability of software (and collaboration)
  Environment Modules allows users to use the same software on different systems, without having to worry about compatibility issues. This can be especially useful for users who need to use the same software on multiple systems or who need to share their work with other users.
- Efficient use of system resources
  Environment Modules allows users to only load the software that they need, rather than having all the software available all the time. This can help to save resources and improve the performance of the system, which is especially important for large HPC systems with many users.
B. For HPC administrators:
- Simplified management of software and environment variables
  Environment Modules allows system administrators to easily manage the software that is available on the system, and to set environment variables such as executable and library search paths automatically. This reduces the work users must do to set up their environment and helps to ensure that the software is used correctly.
- Compatibility with batch job systems
  Environment Modules can be used in conjunction with batch job systems, such as LSF, PBS, and SLURM, to ensure that the correct version of software is used when a job is run. This can help to avoid conflicts and errors when running jobs.
- Flexibility in supporting different users with different needs
  Many HPC systems support multiple users with different needs. Environment Modules allows system administrators to make different versions of software available to different users or groups of users.
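As a rough illustration of the batch-system point above, the sketch below writes a small SLURM job script that loads its modules explicitly before running a program. The file name job_modules.sh, the resource requests, and ./my_program are hypothetical placeholders, and the module names are only examples that may not exist on your cluster.

```shell
# Write a hypothetical SLURM job script; nothing is submitted here.
cat > job_modules.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=modules-demo
#SBATCH --nodes=1
#SBATCH --time=00:10:00

# Start from a clean environment so the job does not depend on
# whatever happened to be loaded in the login shell.
module purge
module load gcc/8.3.0
module load openmpi/4.1.3
module list        # record the loaded modules in the job log

srun ./my_program  # placeholder for the real application
EOF

echo "wrote job_modules.sh; submit it with: sbatch job_modules.sh"
```

Listing the modules inside the job script, rather than relying on the login environment, is one way to keep a record of exactly which versions a run used.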
module command
The module command is the interface for managing and accessing software packages on HPC systems that use the Environment Modules package. It lets users switch between different versions of installed software and sets environment variables such as executable and library search paths.
Note that some modules depend on other modules, so loading one module may automatically load others, and unloading a module may also unload modules that depend on it. The module command has further subcommands and options beyond those covered here; consult your system administrators and the module documentation for dependency information and best practices.
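To make the idea of "setting environment variables" concrete, here is a minimal sketch of what loading and unloading a module amounts to, written in plain shell without the module command. The install prefix /opt/apps/gcc/8.3.0 and its bin and lib64 subdirectories are assumptions for illustration; real modules may set more variables and use a different layout.

```shell
prefix="/opt/apps/gcc/8.3.0"   # hypothetical install prefix

# Roughly what loading does: prepend the package's directories.
PATH="$prefix/bin:$PATH"
LD_LIBRARY_PATH="$prefix/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export PATH LD_LIBRARY_PATH
first_after_load="${PATH%%:*}"
echo "first PATH entry after load: $first_after_load"

# Roughly what unloading does: remove those entries again.
PATH="${PATH#"$prefix/bin:"}"
LD_LIBRARY_PATH="${LD_LIBRARY_PATH#"$prefix/lib64"}"
LD_LIBRARY_PATH="${LD_LIBRARY_PATH#:}"
export PATH LD_LIBRARY_PATH
```

Doing this by hand for every package and version is error-prone, which is the bookkeeping the module command takes over.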
How to find a module?
Here are some common module commands that can be used to list and search for available software on an HPC system:
list all modules
To list all of the available modules on the system, use:
module avail
The output lists the name and version of each available module. For a short description of what a module does, module whatis <name> can be used.
list loaded modules
To show the modules that are currently loaded in the user’s environment, use:
module list
search for a specific module
To search for a specific module by name, use:
module spider <name>
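It will show all the versions of that module that are available on the system, along with information about the module, such as dependencies and conflicts. Note that spider is provided by the Lmod implementation of modules and may not be available on every system.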
details of a specific module
To display the details of a specific module, use:
module show <name>
It will show the version, the dependencies, and the environment variables that the module sets.
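The search commands above can be combined with ordinary text tools. The sketch below defines a small helper, find_module (a hypothetical name, not part of the module command), that filters the module avail listing case-insensitively. It assumes the listing may be written to standard error, which is the case on many installations.

```shell
# Hypothetical helper: case-insensitive search of the "module avail" listing.
find_module() {
    # Many installations print the listing on standard error,
    # so merge it into standard output before filtering.
    module avail 2>&1 | grep -i -- "$1"
}

# Only run the search where the module command actually exists.
if type module >/dev/null 2>&1; then
    find_module gcc || echo "no module matching 'gcc' found"
else
    echo "module command not available in this shell" >&2
fi
```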
How to load a module?
Here are some common module commands that can be used to load and unload modules on an HPC system:
load a specific module
To load a specific module into the user’s environment, use:
module load <name>
It will load the module, making the software that the module provides available to use.
unload a specific module
To unload a specific module from the user’s environment, use:
module unload <name>
This command removes a module that has been loaded into the user’s environment.
swap one module with another
To easily switch between different versions of the same software, use:
module swap <name1> <name2>
This command will swap one module that is currently loaded in the user’s environment with another module.
unload all modules
To unload all modules from your environment, use:
module purge
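The load, unload, and purge commands are often used together at the start of a session or script. As a sketch, the helper below (load_clean is a hypothetical name) purges the environment, loads the requested modules in order, and lists the result. It assumes the module command returns a nonzero status on failure, which may not hold for every version.

```shell
# Hypothetical helper: reset the environment, then load the named modules.
load_clean() {
    if ! type module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi
    module purge || return 1
    for m in "$@"; do
        # Stop at the first module that fails to load (assumes a
        # nonzero exit status is reported on failure).
        module load "$m" || return 1
    done
    module list
}
```

A call such as load_clean gcc/8.3.0 openmpi/4.1.3 would then give the same starting environment every time, provided those modules exist on the system.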
Examples of software modules
A wide range of software can be made available as modules on an HPC system. Some examples of the types of software that are commonly made available as modules include:
1. Compilers and interpreters
2. Scientific libraries
3. Applications: numerical simulations, data analysis
4. Deep learning frameworks
5. Visualization tools
6. Interoperability packages
It’s worth noting that the specific syntax and names of the modules may vary depending on the HPC system and the naming conventions used by the system administrator. In general, you can use the module avail command to see the list of available modules, and the module show <name> command to see the details of a specific module, such as its dependencies and conflicts.
1. Compilers and interpreters
Compilers and interpreters for languages such as C, C++, Fortran, Python, and R can be made available as modules, along with libraries and tools for those languages.
Loading a specific version of the GCC compiler:
module load gcc/8.3.0
2. Scientific libraries
Libraries such as BLAS (Basic Linear Algebra Subprograms), LAPACK (Linear Algebra PACKage), and FFTW (Fastest Fourier Transform in the West), which are commonly used in scientific computing, can be made available as modules.
Loading a specific version of the OpenBLAS library:
module load openblas/0.3.9
3. Applications
A wide range of applications, such as numerical simulation tools [ GROMACS, NAMD, AMBER, Gaussian, ANSYS ] and data analysis tools [ R, SciPy, MATLAB, Octave, Python Data Science Libraries ], can be made available as modules.
Loading a specific version of the VMD (Visual Molecular Dynamics) application, which is used to visualize and analyze molecular simulations:
module load vmd/1.9.4
Typical module names for other simulation tools: gromacs, namd, amber, gaussian, ansys.
Loading a specific version of a data analysis application:
module load R/4.0.3-openblas-0.3.9
This will load version 4.0.3 of the R programming language, built with OpenBLAS version 0.3.9.
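Module specifications like the one above commonly follow a name/version pattern, with build details appended to the version. As a small sketch, assuming that pattern, the name and version parts can be separated with shell parameter expansion; the variable names are illustrative only.

```shell
# Split a module specification at the first slash.
spec="R/4.0.3-openblas-0.3.9"
name="${spec%%/*}"      # text before the first "/"
version="${spec#*/}"    # text after the first "/"
echo "name=$name version=$version"
# prints: name=R version=4.0.3-openblas-0.3.9
```

If a specification contains no slash, both variables end up holding the whole string, so a script relying on this split should check for that case.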
Typical module names for other data analysis tools: scipy, matlab, octave; and for Python: pandas, numpy, matplotlib, seaborn, plotly.
4. Deep learning frameworks
Many popular deep learning frameworks can be made available as modules on an HPC system, including TensorFlow, PyTorch, Caffe, Theano, and Keras.
Loading a specific version of the PyTorch deep learning framework:
module load pytorch/1.6-cuda-11.0-python-3.8
This will load version 1.6 of the PyTorch deep learning framework, built for CUDA 11.0 and Python 3.8.
Typical module names for these frameworks: tensorflow, pytorch, caffe, theano, keras.
5. Visualization tools
Many visualization tools can be made available as modules on an HPC system, including ParaView, VisIt, Gnuplot, matplotlib, and Mayavi.
Loading a specific version of the ParaView visualization tool:
module load paraview/5.9-mpi-4.0.3-python-3.8
This will load version 5.9 of the ParaView visualization tool, built with MPI 4.0.3 and Python 3.8.
Typical module names for these tools: paraview, visit, gnuplot, matplotlib, mayavi.
6. Interoperability packages
Environment Modules can be used to set environment variables or paths for specialized software that requires particular library versions or has different dependencies, such as MPI libraries.
Loading a specific version of the OpenMPI library:
module load openmpi/4.1.3
SCINet HPC modules
Both SCINet HPC clusters, Atlas and Ceres, use the Environment Modules package.
Learn more from SCINet resources:
ISU HPC modules
Both ISU HPC clusters, Nova and Condo, use the Environment Modules package.
Learn more from ISU resources: