Introduction to Git

Git is a common and useful tool for version control. I use it in all of my sizable software projects. However, it has a reputation for being impenetrable and daunting to learn. As a result, some people develop incredibly convoluted workarounds just to avoid it [1]. Git is complicated because it has a lot of features. But I really just use the same few commands over and over again, and this subset is pretty easy to understand. This is, hopefully, a quick and simple guide to using Git the way I do.

What is Git?

Git is a computer program that manages Git repositories. A Git repository (a project, or "repo" for short) is like a tree, where time flows from a single root out to potentially multiple branches. Each branch represents a history of the project, like a timeline. Each branch contains markers called commits. Each commit records the state of the repo at a particular point in time, and you can jump back to any of them if desired. Essentially, every time you want to save the project state, you create a commit.

As you work on a project, you may decide that the current state is worth taking a snapshot of. Maybe you've just finished implementing some feature and want to share it with others while your code looks relatively clean. Or maybe you're about to make some extensive changes and it'd be nice to have a restore point to fall back on. These are both good reasons to make a commit.

Git also enables collaboration. Two people can work on different branches, completely independently from each other. If and when they want to combine their efforts, Git will allow them to merge their branches together, fusing them into one with both of their changes. This is facilitated by having multiple copies of the repo: one is the "origin" and serves as the source of truth, while the others are copies that each person modifies locally. When they want to synchronize, they use Git to push their commits to the origin and pull everyone else's commits from it.

If every commit stored a complete copy of every file, you can imagine that a repo would take a huge amount of space. Conceptually, each commit really is a full snapshot of the project, but Git stores it economically: a file that hasn't changed since the previous commit is not stored again, only referenced, and when Git packs a repository it compresses similar versions of a file as differences (deltas) against each other. That way, the state of every commit can still be restored, but the space usage stays small.
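If you're curious what a commit actually records, here is a small sketch you can run in a throwaway folder. The file names and the "Demo User" identity are placeholders I made up; git cat-file and git ls-tree are low-level commands you won't need day to day.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "hello" > a.txt
cp a.txt b.txt                            # same content, different name
git add a.txt b.txt
git commit -q -m "Add two identical files"

# A commit points at a "tree" (the snapshot) plus author and message.
git cat-file -p HEAD

# The tree lists each file next to the ID of its content.
# Both files show the same ID, so the content is stored only once.
git ls-tree HEAD
```

The same mechanism is why a commit that changes one file out of a thousand stays cheap: the other 999 entries just point at content Git already has.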

What is GitHub?

There's a difference between Git and GitHub. While Git is the computer program for version control, GitHub is a website that hosts Git repositories and also provides a number of tools to manage them. My workflow has GitHub store the "origin" repository, while every contributor keeps a local copy (managed with Git) and pushes to or pulls from GitHub to sync with others.

It's completely possible to only use Git, as long as other people can access your repository and push to/pull from it (e.g. through a self-hosted server or a company intranet). Similarly, it's possible to have version control using only GitHub, but you'd be stuck with the website UI, which is rather limited.

There are other sites which host Git repositories, like GitLab and Bitbucket.

What's a terminal?

If you've never run commands in a terminal before, this guide probably won't make much sense to you. Git is a command-line program, which means you interact with it by typing instructions (commands) into a text-based program, upon which it executes them. You can still read this guide for general information and try to use a UI version of Git like GitHub Desktop, but I much prefer using the command line.

Okay, so what are the commands?

git init

This initializes a fresh Git repository in the current folder. If you've done some work on a project already and want to incorporate Git, this is the command for you. Note that this does not link the repo to GitHub: you can do that by uploading your folder there, or adding a remote.
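A minimal sketch of both steps, assuming you already created an empty repo on GitHub. The folder name, file, and URL are hypothetical, so swap in your own; adding a remote only records the address and doesn't contact the server yet.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
mkdir my-project && cd my-project
echo "print('hi')" > main.py      # pretend this is work you've already done

git init -q                       # creates the hidden .git folder

# Hypothetical URL: replace with the address of your own empty repo.
git remote add origin https://github.com/your-name/my-project.git
git remote -v                     # lists the remotes Git knows about
```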

git clone <repo_url>

This creates a local copy of a Git repository from the link provided. Changes can be synced between the remote repo and the local repo using git pull and git push.
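To try this without a network connection, here is a sketch that clones from a local folder instead of a URL. The "bare" repository named origin.git is just my stand-in for a repo hosted on GitHub; with a real project you'd paste its https:// or git@ address in place of the path.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"

# Stand-in for a hosted repo: a bare repository in a local folder.
git init -q --bare origin.git

# Git warns that the repo is empty, which is expected here.
git clone -q origin.git my-copy
cd my-copy
git remote -v      # "origin" points back at where the clone came from
```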

git pull

This updates your local repository with the latest information from the remote repo.

If you've made local commits while others have pushed to the remote, your local history will have diverged from the remote's history. To reconcile these, you can:

  • merge the two histories by creating a new merge commit on top, which might involve telling Git how to resolve places that both sides edited
  • rebase your local commits on top of the remote history, which takes your extra local commits and reapplies them, as if they were done on top of the remote.
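Here is a sketch of a diverged history and the rebase option, simulated entirely on one machine. Alice, Bob, and the local origin.git folder are made-up stand-ins for two contributors and a GitHub repo, and the identity is a placeholder.

```shell
cd "$(mktemp -d)"
# Placeholder identity used for every commit in this script.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare origin.git            # stand-in for the GitHub repo
git clone -q origin.git alice

# Alice publishes the first commit, then Bob clones the project.
cd alice
echo "intro" > notes.txt
git add notes.txt && git commit -q -m "Add notes"
git push -q -u origin HEAD
cd ..
git clone -q origin.git bob

# Alice pushes a second commit ...
cd alice
echo "from alice" > alice.txt
git add alice.txt && git commit -q -m "Alice's change"
git push -q
cd ..

# ... while Bob commits locally without knowing about it.
cd bob
echo "from bob" > bob.txt
git add bob.txt && git commit -q -m "Bob's change"

# Option 1 (merge) would be: git pull --no-rebase
# Option 2 (rebase): replay Bob's commit on top of Alice's.
git pull -q --rebase
git log --oneline                        # three commits in a straight line
```

With the merge option you'd end up with an extra merge commit tying the two histories together; which one you prefer is mostly a matter of taste and team convention.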

If you have uncommitted local changes that the pull would overwrite, Git will refuse and ask you to commit or stash them first: see git stash.


Once you have a set of files you want to put into a commit:

git add <path to files...>

This adds a set of files to the staging index, which marks them for inclusion in the next commit. You can pass . (a period, meaning the current folder) to add every changed file in the current folder and its subfolders, but it's perfectly normal for a commit to only contain some of the files you've worked on, while others are still in progress.

git commit

This bundles everything in the staging index into a commit. You will be prompted to provide a commit message: a short summary line, optionally followed by a blank line and a longer description.

By default, Git used to open a vim editor window for this, which inadvertently led to one of the most viewed Stack Overflow posts of all time. It might not do that anymore, but you can choose the editor yourself using git config --global core.editor "nano". Or just learn the basics of vim.
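Putting git add and git commit together, a sketch with made-up file names and a placeholder identity. The -m flag supplies the message on the command line, so no editor opens at all.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "finished feature" > done.txt
echo "half-written idea" > wip.txt

git add done.txt                          # stage only the finished file
git commit -q -m "Add finished feature"   # -m skips the editor
git status --short                        # wip.txt is listed as "??" (untracked)
```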

git push

This updates a remote repo with your local commits. If the remote repo has new commits that aren't in your history, Git will ask you to reconcile the divergence by pulling before you can push: see git pull.
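A sketch of a rejected push and how to get past it, again simulated locally. Alice, Bob, and origin.git are hypothetical stand-ins, and I chose to reconcile with a rebase here, though a merge would work just as well.

```shell
cd "$(mktemp -d)"
# Placeholder identity used for every commit in this script.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare origin.git            # stand-in for the GitHub repo
git clone -q origin.git alice
cd alice
echo "intro" > notes.txt
git add notes.txt && git commit -q -m "Add notes"
git push -q -u origin HEAD               # first push: also links the branch to origin
cd ..
git clone -q origin.git bob

# Bob pushes first.
cd bob
echo "from bob" > bob.txt
git add bob.txt && git commit -q -m "Bob's change"
git push -q
cd ..

# Alice's push is now rejected, because origin has a commit she lacks.
cd alice
echo "from alice" > alice.txt
git add alice.txt && git commit -q -m "Alice's change"
git push -q || echo "push rejected: pull first"

git pull -q --rebase                     # reconcile the divergence
git push -q                              # this time it goes through
```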


If you want to manage your files:

git rm <path to files...>

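This deletes a set of files and stages their removal, so the next commit no longer contains them. If you only want Git to stop tracking the files while keeping them on disk, use git rm --cached <path to files...>. This is often used after you accidentally commit some files you weren't supposed to, even though they're needed in the project (such as config files with secrets). Note that the files still exist in earlier commits, so a leaked secret should be treated as compromised anyway.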
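A sketch of untracking a file while keeping it on disk, using the --cached flag. The file name and its contents are invented for the example.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "API_KEY=not-a-real-key" > secrets.env
git add secrets.env
git commit -q -m "Oops: commit a config file"

git rm -q --cached secrets.env            # stop tracking, keep the file on disk
git commit -q -m "Stop tracking secrets.env"

ls secrets.env                            # still there
git status --short                        # now listed as "??" (untracked)
```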

git restore <path to files...>

This discards your uncommitted changes to the given files, resetting them to their last staged state (or to the last commit, if nothing is staged). If you want to remove files from the staging index, use git restore --staged <path to files...>.
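A sketch of both forms in sequence, with a made-up file. Be careful with the plain form on real work: the discarded edit is not saved anywhere.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "original" > file.txt
git add file.txt && git commit -q -m "Add file"

echo "experiment" > file.txt
git add file.txt                  # staged the experiment by mistake
git restore --staged file.txt     # unstage it; the edit stays on disk
git restore file.txt              # now discard the edit itself
cat file.txt                      # back to the committed content
```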

git status

Provides a summary of the current state of your local repository. It shows a list of files changed, a list of files in the staging index, what branch you're on, etc.

git diff

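Shows the changes in your working folder that you haven't staged yet. Use git diff --staged to see what is staged, or git diff HEAD to see everything that has changed since the latest commit on your branch. Press spacebar to scroll down and q to exit.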
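The three variants are easiest to tell apart on a file that has one staged edit and one unstaged edit. A sketch, with a made-up file and placeholder identity:

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt && git commit -q -m "Add file"

echo "two" >> file.txt
git add file.txt                  # the "two" line is staged
echo "three" >> file.txt          # the "three" line is not staged

git diff                # unstaged changes only: the added "three" line
git diff --staged       # staged changes only: the added "two" line
git diff HEAD           # everything since the last commit: both lines
```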

git stash

Collects all of your uncommitted changes to tracked files and saves them into a stash, then restores your repository to the state of the most recent commit. Useful for setting all of your changes aside temporarily.

Use git stash pop to restore all the changes from the stash.
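A sketch of the round trip, with a made-up file and placeholder identity:

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "stable" > file.txt
git add file.txt && git commit -q -m "Add file"

echo "work in progress" >> file.txt
git stash -q              # shelve the edit
cat file.txt              # only the committed line is left
git stash pop -q          # bring the edit back
cat file.txt              # both lines again
```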


When you want to manage multiple histories:

git checkout <branch_name>

This switches to a particular branch. All of your files will be updated automatically.

  • To see the current list of branches, use git branch.
  • If you want to create a new branch, use git checkout -b <branch_name>.
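A sketch of hopping between two branches. The branch names main and experiment are my own choices, and git branch -M main just renames the starting branch so the example doesn't depend on your Git's default name.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt && git commit -q -m "Add file"
git branch -M main                  # name the starting branch "main"

git checkout -q -b experiment       # create a branch and switch to it
echo "risky idea" > idea.txt
git add idea.txt && git commit -q -m "Try an idea"

git checkout -q main                # back on main, idea.txt disappears
git branch                          # lists both; * marks the current one
```

Nothing is lost when idea.txt vanishes: it's safely stored in the commit on experiment and reappears as soon as you check that branch out again.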

git merge <branch>

This merges the given branch into the branch you currently have checked out. If both branches changed the same part of a file, Git produces merge conflicts that need to be resolved manually. I typically use GitHub to deal with merges, since it provides a very informative and easy-to-use UI.
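For completeness, a sketch of a merge done locally from the command line. The branch and file names are invented, and I deliberately made the two branches touch different files so that no conflict comes up; --no-edit accepts the default merge message instead of opening an editor.

```shell
# Work in a temporary folder so nothing real is touched.
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt && git commit -q -m "Add file"
git branch -M main                  # name the starting branch "main"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt && git commit -q -m "Add feature"

git checkout -q main
echo "hotfix" > fix.txt
git add fix.txt && git commit -q -m "Add fix"

# The histories have diverged, so Git ties them together with a merge commit.
git merge -q --no-edit feature
git log --oneline --graph           # draws the fork and the join
```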


That ended up being quite a bit longer than I expected, and I didn't even mention forking: maybe some of it just comes with experience.


  1. Stuff like sending zip files in a shared Discord channel. It's functional in the most basic sense, but otherwise unusable. 
