Although today’s laptops can handle many advanced and computationally intensive tasks, projects involving Big Data require significantly more resources. That need is met by HPC (high-performance computing) infrastructure, built from a network of computing clusters combined with immense memory. Access to these resources is remote, so job submission and data preview occur through an interface on a local machine, from any (allowed) geolocation. The HPC infrastructure is a shared community space, so familiarize yourself with the usage policy to avoid disrupting your peers' work.

Table of contents

1. Introduction to HPC infrastructure

2. Remote Access to HPC Resources

(see more in section 7: Data Acquisition: Remote Data Access ⤴)

3. Setting up Your Home Directory for Data Analysis

4. Software Available on HPC

5. Introduction to Job Scheduling

6. Introduction to GNU Parallel

7. Introduction to Containers

