Git and GitHub

Original Author: Sumon C, May 27, 2021

Introduction

The following is a short introduction to the Git version control system, and GitHub, a web-based hosting service for version control using Git. Both of these tools are used during software development at REDCap@Yale. Here, we'll be using GitHub, and the Git Bash program for Windows (or the system Terminal for Mac and Linux). Instructions assume understanding of basic command line operations. Most commands written below should be typed in the UNIX or Git Bash terminal.

If you are familiar with using Git for personal projects and are only interested in learning how we apply the distributed developer workflow within REDCap@Yale, you can jump to the workflow section.

Initial set up on GitHub

Setting up GitHub

First, you will have to create an online account. GitHub can be found here: https://github.com

Sign in with your NetID and password.

Once signed in, locate your public profile by selecting your avatar in the top right corner, and selecting Settings.

Setting up Git

Once you have your online account set up, you will need to download Git for your operating system:

For Windows: run the downloaded executable and install Git onto your machine.

Note: For Windows, I installed to C:\Users\<your netID>\AppData\Local to bypass needing administrator privileges. This also allows you to modify configuration files (such as .bashrc), which I was not able to do when I installed to C:\Program Files.

Once installed, configure Git in the terminal with your name and email found on GitHub (which should be your Yale email) using:

git config --global user.name "Your Name"
git config --global user.email your.email@yale.edu

This sets your name and email for all Git contributions on this local machine.

You can also set up your name and email for specific projects by using git config without the --global option.
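
As a minimal sketch of the per-project option, the commands below set an identity for a single throwaway repository and read it back. The temporary directory and the name and email values are placeholders chosen for illustration, not real accounts.

```shell
# Create a throwaway repository so the global settings stay untouched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Without --global, the values are stored in this repository's .git/config only.
git config user.name "Project Name"
git config user.email project.email@yale.edu

# Read back the values that apply to this repository alone.
git config --local user.name
git config --local user.email
```

Commits made inside this repository would use the project-specific identity, while every other repository keeps using the global one.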

Creating a new repository

At REDCap@Yale, we use the "remote copy first" method to create new Git repositories. This method also lets GitHub create a README.md file for you when creating the repository.

  1. Remote copy first: Create new repository on GitHub.

    Once created, copy the URL of your new repository by clicking the "Clone" button.

    Then, on your command line, cd to the location where you want to create the directory, and run:

    git clone <url>

Sometimes you may already have some code that you wish to upload to GitHub. In that case, you will have to link your local code with a GitHub repository. The steps are identical to the "remote copy first" method, except for the last step on the command line:

  1. Local copy first: Move into the directory that contains your project using cd.

    Then, initialize the project as a Git repository with:

    git init

    Once initialized, link the local directory with your remote directory:

    git remote add origin <url>

Now you have a repository to work with!
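
The "local copy first" steps can be sketched end to end as follows. To keep the sketch runnable offline, a bare repository on the local disk stands in for the GitHub repository (an assumption for illustration); with a real project, the value of remote would be the URL copied from GitHub. The file name hello.sh is hypothetical.

```shell
# Stand-in for the GitHub repository: a bare repository on the local disk.
remote="$(mktemp -d)/project.git"
git init -q --bare "$remote"

# An existing project directory with some code in it (hypothetical file).
project="$(mktemp -d)"
cd "$project"
echo 'echo "hello"' > hello.sh

# Turn the directory into a Git repository and link it to the remote.
git init -q
git remote add origin "$remote"

# Confirm which remote the repository is linked to.
git remote -v
```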

Using GitHub

Basic workflow

The general workflow is:

  1. Create a change in your Working Directory (local copy).
  2. Stage these changes to the Index.
  3. Commit these changes to HEAD (which points to the last commit you made).

Staging

You can stage changes you made to the Index with:

git add <filename>

which lets you specify the file you want to stage. Otherwise:

git add .

stages every new or modified file in the current directory and its subdirectories, except the ignored files.

Staging allows you to pick and choose the changes that you want in a specific commit. You may have multiple files that currently have changes, but if the changes are not logically associated, you can split them into multiple commits by only committing the files that contain logically associated changes.
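
A small sketch of splitting unrelated changes into separate commits is shown below. It runs in a throwaway repository; the file names, commit messages, and identity are placeholders chosen for illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

# Two unrelated changes sitting in the working directory.
echo "fix for the login form" > login.txt
echo "notes on the export feature" > export.txt

# Stage and commit only the first file.
git add login.txt
git commit -q -m "Fixes login form validation"

# export.txt is still listed as untracked and can go into its own commit.
git status --short

git add export.txt
git commit -q -m "Adds notes on export feature"

# Each change now has its own commit.
git log --pretty=oneline
```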

Committing

You can then commit these changes to HEAD with:

git commit -m "Commit message"

If you want to note down details of your changes and cannot fit them into one line, then you can:

git commit

and Git will open the text editor that you specified during installation, where you can write a multi-line commit message. Git Bash will wait until you save the message and close the editor instance (if you are using a graphical editor such as Atom or Sublime Text). If you prefer staying in the terminal, you can change the default editor to vim or nano with a config command, for example:

git config --global core.editor vim

Replace vim with nano if you prefer nano.

Notes on commit messages: Try to make commit messages as descriptive as possible, using present tense verbs. "Fix things" is not a good commit message. Instead, opt for something like:

Fixes A1 issue reported by user B2 using C3 feature

Adds D4 routine to circumvent E5 function
Removes E5 function as no longer needed
Adds comments to clean up code

where the first line gives a general description of the commit, with following lines elaborating on the changes. Of course, if it fits in one line, a one-liner is fine. Further information can be found in the coding standards section.
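
If you would rather not open an editor, one alternative is to pass -m more than once; Git joins the values as separate paragraphs. The sketch below assumes a throwaway repository and a hypothetical file, and reuses the example message from above.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "placeholder change" > routine.txt
git add routine.txt

# The first -m is the summary line; the second -m becomes the body paragraph.
git commit -q \
    -m "Fixes A1 issue reported by user B2 using C3 feature" \
    -m "Adds D4 routine to circumvent E5 function
Removes E5 function as no longer needed"

# Show the full message of the latest commit.
git log -1 --pretty=format:%B
```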

Before you push a commit to the remote repository, it is possible to edit it. This can come in handy when you want to alter the commit message or realize you have to add additional files. Stage any additional changes as before, then commit with the --amend flag to alter the last local commit:

git commit --amend
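
As a sketch of amending, the commands below add a forgotten file to the last commit and replace its message in one step. The repository, file names, and messages are placeholders; the -m option is used here only so the example does not open an editor.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Adds notes"

# A file that should have been part of that commit.
echo "forgotten file" > extra.txt
git add extra.txt

# Rewrite the last local commit: include extra.txt and replace the message.
git commit -q --amend -m "Adds notes and extra file"

# Still a single commit, now containing both files.
git log --pretty=oneline
git show --name-only --pretty=format:%s HEAD
```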

Pushing changes to your remote repository

Changes are now committed to HEAD in your local Git, but your remote repository has no idea about these changes. You can send these changes to the remote repository with:

git push origin main

origin is your remote repository, and main is the branch you want to push your changes to. You can push changes from any branch that you want. It is good practice to specify the branch you're pushing, just to make sure you're pushing the right branch to GitHub.
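
The push can be tried out without touching GitHub. In the sketch below, a bare repository on the local disk plays the role of origin (an assumption for illustration), and git branch -M main is used only to make sure the local branch is named main regardless of the installed Git defaults.

```shell
# Stand-in for the GitHub repository.
remote="$(mktemp -d)/project.git"
git init -q --bare "$remote"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu
git remote add origin "$remote"

echo "first version" > README.md
git add README.md
git commit -q -m "Adds README"
git branch -M main                            # name the local branch main

# Send the commits on main to the remote named origin.
git push -q origin main

# The remote now lists a main branch pointing at the same commit.
git ls-remote origin
```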

Resetting local changes

What if you made changes that broke things or you no longer want to keep?

You can revert back using:

git checkout -- <filename>

which discards the unstaged changes to that file in your working directory and restores the version from the Index (which matches the last commit in HEAD unless you have staged something newer). Newly created files and changes already staged to the Index are kept.
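
A short sketch of this behavior, using a throwaway repository with hypothetical file names:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "working version" > app.txt
git add app.txt
git commit -q -m "Adds app file"

# Break the tracked file and create a new, untracked one.
echo "broken edit" > app.txt
echo "scratch" > new.txt

# Discard the unstaged edit to app.txt.
git checkout -- app.txt

cat app.txt           # shows the committed content again
git status --short    # new.txt is still there, untracked
```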

If you just want to revert to the last status saved in the remote repository (and drop all of your local changes and commits), you can do this with:

git fetch origin
git reset --hard origin/main

fetch only lets the local Git know of the status of your remote repository, but does not actually make any changes to your local repository. reset makes the changes.
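
The fetch and reset pair can be sketched as follows. A local bare repository stands in for GitHub (an assumption for illustration), and the file and commit messages are placeholders. Note that both the extra local commit and the uncommitted edit are lost.

```shell
# Stand-in for the GitHub repository.
remote="$(mktemp -d)/project.git"
git init -q --bare "$remote"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu
git remote add origin "$remote"

echo "published" > file.txt
git add file.txt
git commit -q -m "Adds file"
git branch -M main
git push -q origin main

# A local commit and an uncommitted edit that should both be thrown away.
echo "local experiment" > file.txt
git commit -q -am "Tries an experiment"
echo "more edits" >> file.txt

# Update the local view of the remote, then move main back to it.
git fetch -q origin
git reset -q --hard origin/main

cat file.txt
git log --pretty=oneline
```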

Branching

So far we've been committing all of our changes to the main branch. This is not an issue for personal projects or simple projects involving a small number of developers. However, it is good practice to create branches to keep track of your workflow.

At REDCap@Yale, we treat the main branch of the central repository as the "bleeding edge" version of the software, where all bug fixes, new features, and enhancements will be centrally incorporated. Various releases and versions will be created by branching off of this main branch. For example, you may create a production branch that only contains the code present in the current version of the software that is used in production.

You can see all the branches your repository has with:

git branch

You can create new branches with:

git branch <name>

You can then switch to this branch with:

git checkout <name>

If that seems tedious, you can also create a new branch and immediately switch to it using:

git checkout -b <name>

The branch is only available on your local Git. To make it available to others, you will have to push the branch to GitHub with:

git push origin <branch>

You can always return to your main branch with:

git checkout main

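You can delete a branch with the command below. With -d, Git refuses to delete a branch whose commits have not been merged; the -D option forces the deletion, which can leave those commits unreachable, so be careful: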

git branch -d <name>
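
The branching commands can be walked through in a throwaway repository as sketched below. The branch name feature_x and the file are hypothetical. The last command illustrates that -d leaves an unmerged branch in place.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "base" > file.txt
git add file.txt
git commit -q -m "Adds base file"
git branch -M main                            # name the local branch main

# Create a branch and switch to it in one step.
git checkout -q -b feature_x
echo "feature work" >> file.txt
git commit -q -am "Adds feature work"

# List local branches; the current one is marked with *.
git branch

# Go back to main; the feature commit is not part of this branch.
git checkout -q main
cat file.txt

# -d refuses here because feature_x has not been merged into main.
git branch -d feature_x || echo "feature_x kept: not merged yet"
```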

Merging

So what do you do when you have new features in a branch that you want to move to main? This is when you merge the branches.

You can do this by moving into your main branch with git checkout main, then:

git merge <branch>

where branch is the branch you want to pull updates from.

Note: Make sure you're in the branch you want to merge the changes to before using merge!

If main (or the branch you want to merge into) has no new commits since the branch was created, which is often the case, Git does a fast-forward merge that simply moves main to the latest commit of branch.

Note: A fast-forward merge is treated as if the branching never happened. For example, if you merged branch with changes into main, it is treated as if the changes were directly done in main. To make sure Git keeps the branch, use the --no-ff option, which causes the merge to create a new commit object.
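
The difference between a fast-forward merge and a --no-ff merge can be seen in the following sketch. The branch names feature_ff and feature_noff are hypothetical, and -m is passed to the second merge only so that no editor opens.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "base" > file.txt
git add file.txt
git commit -q -m "Adds base file"
git branch -M main

# First branch: merged with the default fast-forward.
git checkout -q -b feature_ff
echo "one" >> file.txt
git commit -q -am "Adds line one"
git checkout -q main
git merge -q feature_ff                       # main simply moves forward

# Second branch: merged with --no-ff, which records a merge commit.
git checkout -q -b feature_noff
echo "two" >> file.txt
git commit -q -am "Adds line two"
git checkout -q main
git merge -q --no-ff -m "Merges feature_noff" feature_noff

# The graph shows a straight line for the first merge and a fork for the second.
git log --graph --oneline
```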

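If the merge stops because of conflicts (see Merge Conflicts below), edit the affected files to resolve them, then stage and commit the result with: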

git add <filename>
git commit

and Git will commit the merge as a new commit.

If you want to see the differences in the files before doing a merge, you can use:

git diff <source> <target>

to see how the branches differ, where source is the branch you want to merge from, and target is the branch you want to merge into.

It is good practice to periodically make sure your local copy is up to date with your remote repository. You can do this with:

git pull

This fetches and merges remote changes into your working directory. Here, the same conflict resolution method is applied as manual merges, and Git will handle it automatically if there are no conflicts.

Merge Conflicts

However, sometimes both branches have distinct changes, and this results in merge conflicts.

In this case, Git will mark the conflicts in the affected files, and you will have to edit those files manually to choose between or combine the conflicting changes.

Each merge conflict will have two sections:

  1. The section starting with <<<<<<< HEAD will contain the version in the local branch, and
  2. The section starting with ======= will contain the version in the merge branch.

>>>>>>> followed by the name or commit hash of the branch being merged marks the end of that specific merge conflict. You may have multiple conflicts in the same file.

You will have to choose which version of the merge conflict to keep in each file with merge conflicts, and then stage and commit the changes.
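
To see the markers for yourself, the sketch below provokes a conflict on purpose and then resolves it. It assumes a throwaway repository with a hypothetical branch named feature; the resolution shown (combining both versions) is just one possible choice.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "original line" > file.txt
git add file.txt
git commit -q -m "Adds file"
git branch -M main

# Change the same line differently on two branches.
git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "Changes line on feature"
git checkout -q main
echo "main version" > file.txt
git commit -q -am "Changes line on main"

# The merge stops and writes conflict markers into file.txt.
git merge feature || true
cat file.txt

# Resolve by writing the content to keep, then stage and commit.
echo "main version plus feature version" > file.txt
git add file.txt
git commit -q -m "Merges feature and resolves conflict in file.txt"
```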

Log

Now that you have all these commits and changes, how do you keep track of all of them?

git log

lets you see the history of a repository.

git log has many options that can help you out. Some favorites:

  • --author=<author-name>: shows only the commits from a specific author
  • --pretty=oneline: shows each commit on a single line
  • --graph --oneline --decorate --all: shows a visual tree of all the branches and their commits within your terminal
  • --name-status: shows the files that have been changed in addition to normal log information
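
The options above can be tried on a small throwaway history, as in this sketch. The file names and the author name are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu

echo "a" > a.txt
git add a.txt
git commit -q -m "Adds a.txt"
echo "b" > b.txt
git add b.txt
git commit -q -m "Adds b.txt"

# Only commits by one author, one line each.
git log --author="Your Name" --pretty=oneline

# Compact tree view of all branches.
git log --graph --oneline --decorate --all

# Normal log plus the files added (A), modified (M), or deleted (D).
git log --name-status
```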

Tagging and Versioning

When you have a sufficient number of changes/features, you may want to version your software to let others know the status of your production releases. This is especially important for REDCap@Yale developers, as REDCap External Modules have to follow a distinct semantic versioning style for the REDCap server to recognize and utilize the modules.

You can do this in GitHub by clicking on "Create a new release".

You can then enter the version that you want this release to be associated with, type in a one-liner explaining the release, and also write down a detailed description of the release.

Advanced Technique

Git Preferences

git config contains all of your settings for Git. git config --global --edit lets you open the Git configuration file and make changes that apply to all of your repositories. Make sure to make a copy of your original configuration file before making changes so you can revert back to the original in case you break things.

Some example git config changes:

  • color.ui true: lets you use colorful Git outputs
  • format.pretty oneline: shows log with just one line per commit by default

Bash/terminal Preferences

The bash configuration file (.bashrc; for Git Bash on Windows it is located at /etc/bash.bashrc) controls how Git Bash (on Windows) or your terminal (on Linux and Mac) looks and behaves. If you don't like the way the bash shell or terminal looks, this is where you can make modifications. As always, make a copy of your original bashrc file before making changes so you can revert to the original in case you break things.

Ignoring Files

If you have files that you do not want Git to add automatically or report as untracked, you can add them to .gitignore. Such files are usually generated files, such as logs, executables, or compiled files. Files already tracked by Git or explicitly added are not affected. Each line of the file specifies a pattern, directory, or path to a file that should be ignored.

Some examples:

  • *.o would ignore all files ending in ".o"
  • logs/ would ignore the "logs" directory and anything contained inside
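
The two example patterns can be checked in a throwaway repository, as sketched below. The file names are hypothetical; git check-ignore is used to show which pattern matches a given path.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

mkdir logs
echo "compiled" > main.o
echo "log entry" > logs/run.log
echo "source" > main.c

# One pattern per line.
printf '%s\n' '*.o' 'logs/' > .gitignore

# Only .gitignore and main.c are reported as untracked.
git status --short

# check-ignore -v reports which pattern matches each path.
git check-ignore -v main.o logs/run.log
```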

Full details of patterns and precedence can be found in the Git documentation for gitignore.

Tagging in Git

Earlier we showed that you can create new releases and versions within the GitHub web interface. You can do the same within the command line using Git.

You can tag your current/latest commit with:

git tag 1.0.0

You may have also noticed the long commit IDs such as 8e67a543302cb6b4472f41b28df8c5d43241c866 while using git log. These can be used to tag specific commits that you've already passed.

git tag 0.1.0 8e67a54330

lets you tag commit 8e67a54330 with the version number 0.1.0. You generally don't need the entire commit ID; the first 10 characters are usually enough, as long as they are unique within the repository.

git push does not upload tags to the remote repository by default. You will have to use:

git push origin <tag>

to push specific tags to GitHub.

If you have a lot of tags:

git push --tags

lets you push all of them to GitHub.
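
The tagging commands can be sketched end to end as follows. A local bare repository stands in for GitHub (an assumption for illustration), the file and commit messages are placeholders, and the remote name is spelled out in the last push for clarity.

```shell
# Stand-in for the GitHub repository.
remote="$(mktemp -d)/project.git"
git init -q --bare "$remote"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Your Name"              # throwaway identity for this sketch
git config user.email your.email@yale.edu
git remote add origin "$remote"

echo "v0" > file.txt
git add file.txt
git commit -q -m "Adds file"
git branch -M main
first="$(git rev-parse --short=10 HEAD)"      # abbreviated ID of the first commit

echo "v1" > file.txt
git commit -q -am "Updates file"

git tag 1.0.0                                 # tags the latest commit
git tag 0.1.0 "$first"                        # tags the earlier commit by its ID

git push -q origin main
git push -q origin 1.0.0                      # push one specific tag
git push -q origin --tags                     # push all remaining tags

# List the tags that the remote now knows about.
git ls-remote --tags origin
```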

A more in-depth explanation of tagging in Git can be found here.

Additional Help

Resources

Git has a free online book that is very useful and should be your go-to reference when you have questions about using Git for version control.

This guide was largely based on:

with some additional detail added on.

The Ultimate Beginner Git Cheat Sheet goes into more detail once you're a bit more comfortable.

GitHub has a pretty useful cheat sheet.

The Git documentation is also very helpful when you want to know all of the options available to you while using a command. Using --help in the command line with git also takes you to the online documentation.

GitHub Help is a resource/guide for using GitHub.

Other resources I found by web searching while creating this guide:

And as always, Google is your friend.