01. Introduction to Data Science
02. Introduction to the Command Line
- Terminal: a text-based interface for command-line operations
- Introduction to UNIX Shell: shell variables, HOME dir, .bashrc
- Useful Text Manipulation Programs
- UNIX CheatSheet
03. Setting Up a Computing Machine
04. Development Environment
05. Introduction to Programming
- Basics of Algorithm Structure
- Introduction to Bash Scripting
- Introduction to Python programming
- Introduction to R programming
06. High-Performance Computing (HPC)
- Setting up your home directory for data analysis
- Introduction to HPC infrastructure
- Secure Shell Connection (SSH)
- Remote Data Access (see more in section 7: Data Acquisition: Remote Data Access ⤴)
- Software Available on HPC
- Introduction to Job Scheduling
- Introduction to GNU Parallel
- Introduction to Containers
07. Data Acquisition and Wrangling
- Remote Data Access
- Remote Data Transfer
- Tutorial: Copying Data using SSH
- Tutorial: Copying Data using Globus
- Tutorial: File Transfer using iRODS
- Tutorial: File Transfer using SRA Toolkit
- Tutorial: Downloading Online Data using WGET
- Tutorial: Downloading Online Data using Web Scraping
- Tutorial: Downloading Online GitHub Repos using GIT
- Tutorial: Downloading Online GitHub Folders using SVN
- Remote Data Preview without Downloading
- Data Manipulation
- Manipulating Excel Data Sheets
- Manipulating Text Files with Python
- Tutorial: Read, Write, Split, Select Data
- Tutorial: JSON Module - Encoding & Decoding JSON Data
- Tutorial: Math Module - Various Mathematical Functions
- Tutorial: Pandas Library - Data Structure Manipulation Tool
- Tutorial: NumPy Library - Multi-Dimensional Arrays Parser
- Tutorial: SciPy Library - Algorithms for Scientific Computing
- Data Wrangling: ready-made apps
08. Data Visualization
- Introduction to Scientific Graphic Design
- Introduction to Scientific Graphing
- Gnuplot: Creating Plots in the UNIX Shell
- Plotly-Dash: Data Processing & Interactive Plotting with Python
- RStudio: Data Processing & Plotting with R