Introduction
Remote access is the ability of a user to access, view, and manipulate files, data, and software on another computer, such as a server, a database, or a high-performance computing (HPC) system. This can be done from anywhere over a network connection such as the Internet, provided the user has the necessary permissions and credentials for the remote machine and its contents. In particular, there are several ways to remotely access the resources available on HPC clusters.
Accessing data on an HPC cluster remotely can be slower than accessing it locally because of the added latency of transmitting data over the network.
In addition, users may need to be granted access to the HPC cluster in order to use it remotely.
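As a rough illustration of the latency point above, the time to fetch a remote file can be approximated as network latency plus file size divided by bandwidth. The sketch below uses this simplified model. The function name and the example numbers (a 1 GB file, a 100 Mbit/s link, 50 ms latency) are assumptions chosen for illustration, not measurements from any cluster.

```python
def estimate_transfer_seconds(size_bytes, bandwidth_bits_per_s,
                              latency_s=0.0, round_trips=1):
    """Approximate transfer time: latency per round trip plus payload time.

    This is a simplified model; it ignores protocol overhead, congestion,
    and disk speed on either end.
    """
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    payload_time = (size_bytes * 8) / bandwidth_bits_per_s
    return round_trips * latency_s + payload_time


if __name__ == "__main__":
    one_gb = 1_000_000_000          # hypothetical file size in bytes
    link = 100_000_000              # hypothetical 100 Mbit/s connection
    seconds = estimate_transfer_seconds(one_gb, link, latency_s=0.05)
    print(f"Estimated transfer time: {seconds:.2f} s")
```

For many small files the latency term dominates, because each file pays at least one round trip; for a single large file the bandwidth term dominates.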
1. VPN (Virtual Private Network)
A VPN is a technology that allows users to securely access a private network over the Internet. VPNs can be used to access files and data stored on remote computers within that network. A VPN protects users' data from being intercepted or monitored by unauthorized parties when logging in from off campus.
Jump to solution to get started with:
- VPN access to Atlas and Ceres ⤴ computing clusters of the SCINet Scientific Computing
- VPN access to Nova and Condo ⤴ computing clusters of the
2. SSH (Secure Shell connection)
SSH is a cryptographic network protocol that lets users connect to the cluster and then browse, manipulate, and execute files as if they were sitting at the terminal of a computer on the cluster.
…from the hands-on tutorials in the
Jump to solution to get started with:
- SSH access to Atlas computing cluster of the SCINet HPC system
- SSH access to Ceres computing cluster of the SCINet HPC system
- SSH access to Nova computing cluster of the ISU HPC system
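An SSH connection as described in this section can also be scripted. The sketch below builds the argument list for the standard `ssh` command-line client and, optionally, runs a single command on the remote machine. The user name `jane.doe` and host `cluster.example.org` are hypothetical placeholders, not the real login nodes of Atlas, Ceres, or Nova; the helper names are likewise illustrative.

```python
import shlex
import subprocess


def build_ssh_command(user, host, port=22, remote_command=None):
    """Return the argument list for an `ssh` invocation (nothing is executed)."""
    cmd = ["ssh", "-p", str(port), f"{user}@{host}"]
    if remote_command:
        # ssh passes this string to the remote shell for execution
        cmd.append(remote_command)
    return cmd


def run_remote(user, host, remote_command, port=22):
    """Run one command on the remote machine and return its standard output.

    Requires network access and valid credentials (for example an SSH key).
    """
    result = subprocess.run(
        build_ssh_command(user, host, port, remote_command),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical account and host, shown only to print the command line.
    cmd = build_ssh_command("jane.doe", "cluster.example.org",
                            remote_command="ls -l")
    print(shlex.join(cmd))
```

Building the command as a list, rather than a single string, avoids local shell quoting problems when the user name or remote command contains spaces.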
3. Remote web-based access
Some HPC clusters also provide web-based interfaces (e.g., Open OnDemand ⤴) for remotely accessing and managing data. Such an interface allows users to submit computing jobs to the HPC queueing system through a web page or API, without logging in to the underlying infrastructure from the command line.
Jump to solution to get started with:
- OOD access to Atlas computing cluster of the SCINet HPC system
- OOD access to Ceres computing cluster of the SCINet HPC system
- OOD access to Nova ⤴ computing cluster of the ISU HPC system
4. Remote desktop software
Remote desktop software such as VNC (Virtual Network Computing ⤴) or RDP (Remote Desktop Protocol ⤴ by Microsoft) allows users to remotely access and control a desktop (graphical user interface) on another computer, including some clusters.
5. RFS (Remote File System)
An RFS protocol ⤴ is often used in computing clusters to connect multiple nodes over a high-speed network. With an RFS protocol, nodes in a cluster can access data stored on other nodes as if it were stored locally, which simplifies data access and eliminates the need to copy large amounts of data between nodes. This can improve the performance and scalability of the cluster and lets the nodes work together more efficiently.
An RFS protocol also allows users to access files stored on a remote computer without first transferring them to their local machine. Users can remotely read, write, and modify files as if they were stored locally on their own computer. Some HPC systems have a Remote File System (RFS) pre-installed and configured, while others do not.
…from the hands-on tutorials in the Remote data preview (without downloading) section of this workbook:
If the HPC system already has RFS pre-configured, the user may simply need to follow the appropriate steps to access the remote file system, such as mounting the file system and logging in with their credentials. The specific steps and commands required to access the RFS will vary depending on the operating system and RFS implementation being used.
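The mounting step mentioned above can be sketched with `sshfs`, one common remote file system client that mounts a remote directory over SSH. This is only an assumption for illustration: the source does not say which RFS implementation a given cluster uses, and `sshfs` must be installed on the local machine. The account, host, and paths below are hypothetical placeholders.

```python
import shlex


def build_sshfs_mount(user, host, remote_path, mount_point):
    """Return the `sshfs` argument list that mounts remote_path at mount_point."""
    return ["sshfs", f"{user}@{host}:{remote_path}", mount_point]


def build_unmount(mount_point, linux=True):
    """Return the unmount command: `fusermount -u` on Linux, `umount` otherwise."""
    if linux:
        return ["fusermount", "-u", mount_point]
    return ["umount", mount_point]


if __name__ == "__main__":
    # Hypothetical values; the local mount point must be an existing empty directory.
    mount = build_sshfs_mount("jane.doe", "cluster.example.org",
                              "/project/data", "/home/jane/remote_data")
    print(shlex.join(mount))
    print(shlex.join(build_unmount("/home/jane/remote_data")))
```

The functions only assemble the commands; passing the resulting lists to `subprocess.run` would perform the actual mount and unmount, which requires network access and valid credentials.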
Further Reading
Virtual Private Network (VPN) Connection (command line)
Secure Shell Connection (SSH) (command line)
SSH shortcuts and password-less login
Open On Demand (OOD) Connection (web-based GUI)
Setting up your home directory for data analysis
Software Available on HPC
Introduction to job scheduling
Introduction to GNU parallel
Introduction to containers
MODULE 07: Data Acquisition and Wrangling