DataScience Workbook / 09. Project Management / 3. Resource Management / 3.3. Storage & Version Control / 3.3.2 Online Hosting Platforms for GIT Repositories / Introduction to GitHub
GitHub ⤴ is a cloud-based hosting platform that provides developers a space to store their projects, collaborate with others, and version-control their code. There are three primary use cases for using a version control platform like GitHub in science:
- Sharing of bioinformatics scripts
- Bioinformatic Program development
- Documentation of bioinformatic analyses
Probably the most important use-case for new users is documentation.
For those transitioning from a wet-lab, GitHub repos can be thought of as the
equivalent to a web-lab notebook, where every command performed in a bioinformatics analysis is recorded with an explanation as to why it was performed, when it was performed (date) and where it was performed (pwd).
Github documents can be written using markdown.
See Introduction to Markdown ⤴ tutorial as a good starting point.
GitHub leverages the power of Git distributed version control system.
Git, a distributed version control system, tracks changes to files over time, making it ideal for team-based work. Therefore, when used together, GitHub and Git create a robust system for code management, collaboration, and open-source contribution.
Version Control Systems
Version control systems are software tools that help manage changes to source code over time, allowing multiple contributors to work collaboratively and track modifications, enabling easy recovery of previous versions if needed.
What VCS allows you to do?
- Track changes made to each file
- Revert the entire project or a single file to a previous version
- Review changes made over time
- View who modified the file (and
blamethem for something if necessary)
- Collaborate with others without overwriting their work or risk file corruption, etc.
- Have multiple independent branches of the same repository and make changes without effecting others’ work.
- And more…
GitHub ⤴ utilizes Git ⤴, a distributed version control system that tracks changes in files over time, perfect for facilitating collaboration among multiple contributors. As a cloud-based hosting platform, GitHub provides a space for developers to store their
Git-initialized projects and manage versions of their code. Thus, the combination of GitHub and Git forms a robust infrastructure for effective code management, team collaboration, and open-source contributions.
Before we delve into the specifics of getting started with GitHub, it’s crucial to first revisit a few fundamental concepts of Git, the version control system used for synchronizing your local changes with the remote GitHub repository.
Why to ♥ Git?
Git manages a filesystem as a set of snapshots.
- Snapshots are called commits
(image source: https://git-scm.com)
- Almost every interaction with Git happens locally.
(image source https://git-scm.com)
- A remote host adds an additional level to a Git repository.
(image source https://www.git-tower.com)
- Also, Git allows for collaboration and back-up.
How to get a Github account
Signing up for an account is very easy. Just go to the signup webpage and fill out the form.
Most choose the free Unlimited public repositories option and don't set up an organization right away.
After you have an account, and if you are a researcher or educator you can request a free upgrade at about-github-education-for-educators-and-researchers/.
Setting up your Authentication key to streamline remote access
Authentication with a SSH key
Passwords are not always secure and can be annoying to type.
- SSH keys are much more secure and allow you to log in without typing your password (or just a different, simpler passphrase).
- When you generate a key, you create two things: a public key and a private key.
- You place the public key on any server and then unlock it by connecting to it with a client that already has the private key.
- When the two match up, the system unlocks without the need for a password.
SSH keys are also very important for using Git with remote hosts (e.g., GitHub)
Setup Authentication with a SSH key
Log in to the
or open a terminal on your local machine.
Create the key pair in your home directory:
$ ssh-keygen -t rsa
ssh-keygen command has been issued, you will be asked a few questions:
Enter file in which to save the key (/home/yourID/.ssh/id_rsa):
You can just hit enter here and it should save it to the file path given.
Enter passphrase (empty for no passphrase):
Here is where you decide if you want to password-protect your key. The downside, to having a passphrase, is then having to type it in each time you use the Key Pair.
Create the SSH key
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/userid/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/userid/.ssh/id_rsa.
Your public key has been saved in /home/userid/.ssh/id_rsa.pub.
The key fingerprint is:
The key's randomart image is:
+--[ RSA 2048]----+
| .oo. |
| . o.E |
| + . o |
| . = = . |
| = S = . |
| o + = + |
| . o + o . |
| . o |
Copy SSH key to GitHub
You now have a file called
id_rsa.pub in your
[userid@hpc-class ~]$ cat .ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC8QgicqcpFPeyYZhJFW5lBTAdAjHBaYzLwH3l7+lrdmpEKMMMhXMZV5ucxN5WzWU/ERYviYQvQ8NBzkSuHo+SgNJkufF92UqeHIfI/KqgVEGbQn6NGfa5WFBgWZAJAjMzUUrAhJ2qsBez4M1f70os1S2SNcNfoFAJRdWEGE8SX2lpww8+VdCOY6ONw3AYbZbrZtn/ua2hJg+XjYb3T04ggV6TNyV4nnN5r2pRIjJA5OBX1TWcB9HOE4ZIGZoZlk5OYuUJ5rOfuzVLqanWayB3ujuPxW3IUmI6XJt7uDc1N5iVNs2FusjSZmuggWtzCw/pb7EExvNxYMYOxCsewjE0L userid@<remotehost>
Copy the entire contents of this file and add it to your GitHub account.
How to Add SSH key to GitHub Repo
Add your new ssh key to your GitHub account by going to SSH and GPG keys in your profile Settings. You will need to be logged into GitHub for the above link to work.
Learn details from the most recent docs provided at the GitHub official website: Adding a new SSH key to your account ⤴
Contibutions for this markdown document came from Matt Hufford and was modified with permission from his BCB546 course. Check out his amazing power point he created at the above link.
- GitHub for advanced users
- 4. Quality Assurance
- 5. Project Closing