Local Version Control With Git

Introduction

I’ve recently become obsessed with git for version control. I know that there are many git tutorials out there; this is probably redundant, but it focuses on only the simplest operations for my own reference. The git man pages themselves are extremely well written. This post is an adaptation of an article I wrote for our department wiki, targeted at some of my fellow researchers who have probably never heard of git. I removed all of the references to our internal projects.

Unfortunately, one of our major codebases at work is stored in a Microsoft Visual SourceSafe repository; it has been for years, and there’s enough momentum that we likely won’t move to a better version control system any time soon. If the motivation sounds odd, I mostly write research code, so we are regularly creating totally new code that may be used once or twice and then never again.

Visual SourceSafe has limited branching capabilities, so many users keep large subsets of the codebase checked out onto their development machine. When a major experimental change is feature-complete, dealing with merges in VSS is a pain that typically involves face-to-face negotiations of who checks in what when, or one person spending a day or two fixing conflicts in WinMerge or a similar tool. This article demonstrates how to use git, a modern distributed version control system used by the Linux kernel (among others), to manage smaller-scale changes to VSS code trees on Windows without having to risk checking in files that “break” functionality for other users.

The main point of using a local version control repository inside of a VSS checkout is three-fold:

  1. Commit more often
  2. Easy branching for experimental code (whether or not it pans out)
  3. More detailed history/self-documentation of recent changes

I’ve found that my development habits have gotten much better using git; in particular, I commit just about every successful compile-and-test, and I try to isolate commits by functionality as much as possible, so that it’s trivial to cherry pick experimental changes that should be kept or not.

Setup

Install Git

You’ll only need to do these steps once on your development machine.

  1. Get git
    1. If you don’t have it archived, get the latest version of Cygwin’s setup.exe.
    2. Run setup.exe, accept the cached repository and install locations
    3. Find git in the list (in category Devel), and set it to install
    4. Accept and finish the Cygwin incremental install
  2. Configure git
    1. Open a Cygwin shell
    2. Run these git setup commands to set authorship:
      $ git config --global user.name "Your Name"
      $ git config --global user.email "yname@you.com"

    3. Run these git setup commands to avoid choking on Windows newlines:
      $ git config --global core.autocrlf true

Create Git Repository

You’ll do this once for each working directory you want to track. The commands below assume that you’re working in c:/Projects/HelloWorld/.

  1. Create Repository
      $ cd c:/Projects/
      $ mkdir HelloWorld
      $ cd HelloWorld
      $ git init
      Initialized empty Git repository in c:/Projects/HelloWorld/.git/

  2. Configure Repository
    1. Tell git to ignore certain Visual Studio/SourceSafe files that change constantly
      $ echo -e "*.user\n*.ncb\n*.suo" >> .git/info/exclude

    2. See man git-config for more options
  3. Initial check-in
    1. Tell git which files you’re checking in
      $ git add <project subdirs>

    2. Commit all files (this may take a minute or two)
      $ git commit -m "Initial check-in"
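
If you want to see the setup steps work before touching a real checkout, here is a minimal sketch that runs them in a throwaway directory. The temporary directory stands in for c:/Projects/HelloWorld, and the file names (hello.c, hello.user) are made up for illustration. The identity is set for this one repository only, so it doesn't disturb your global settings.

```shell
#!/bin/sh
# Sketch only: walks through init, exclude, add, and commit in a scratch repo.
set -eu

repo=$(mktemp -d)                 # stand-in for c:/Projects/HelloWorld
cd "$repo"
git init -q

# Placeholder identity, stored in this repository's own config.
git config user.name "Your Name"
git config user.email "yname@you.com"

# Same ignore patterns as above; printf is a portable stand-in for echo -e.
mkdir -p .git/info
printf '%s\n' '*.user' '*.ncb' '*.suo' >> .git/info/exclude

# Hypothetical project contents: one source file, one file git should skip.
mkdir src
echo 'int main(void) { return 0; }' > src/hello.c
echo 'per-user settings' > src/hello.user

git add src
git commit -q -m "Initial check-in"

# List what was actually committed; the excluded file should be absent.
tracked=$(git ls-files)
echo "$tracked"
```

Adding the whole src directory is safe here only because the exclude patterns were in place first.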

Usage

This section is divided into common operations. One important note: the commands can be run from any subdirectory of the repository you created above, but most operations apply to the entire tree. The git manpages are excellent, and can explain further. There are a ton of options, but these get at the core functionality you’re likely to use working on most codebases.

View Changes

There are two commands useful here:
$ git status
$ git diff

One conceptual caveat: files that are pending a commit get added to the repository’s index. They’ll be listed in the status, but won’t appear in a plain git diff (git diff --cached shows them instead). Be especially careful about using wildcards with the add command, because you generally don’t want to commit binaries or intermediate build files.

status will tell you which branch you are on, which files have been added to the index but not yet committed, which files have changed since the last commit but haven’t been added to the index, and which files aren’t being tracked by git. New files you create will appear in that last list, alongside build directories you’re probably ignoring. For each file it will tell you if it’s been added, deleted, or modified.

diff shows the output of diff for each file that has changed since the last commit but hasn’t yet been added to the index. This is a very detailed view; I generally use it to make sure that I’ve added everything I want before committing.
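
The split between the index and the working tree is easier to see than to describe. The sketch below, in a scratch repository with made-up file names, changes two files, adds only one of them, and then compares what git diff and git diff --cached report.

```shell
#!/bin/sh
# Sketch only: shows which changes land in git diff vs. git diff --cached.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"          # placeholder identity
git config user.email "yname@you.com"

printf 'one\n' > staged.txt
printf 'one\n' > unstaged.txt
git add staged.txt unstaged.txt
git commit -q -m "Baseline"

# Change both files, but add only one of them to the index.
printf 'two\n' >> staged.txt
printf 'two\n' >> unstaged.txt
git add staged.txt

# Compact status: first column is the index, second is the working tree.
git status --short

# Plain diff compares the working tree against the index.
plain=$(git diff --name-only)
# --cached compares the index against the last commit.
cached=$(git diff --cached --name-only)

echo "git diff:          $plain"
echo "git diff --cached: $cached"
```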

Commit Changes

This is one of the big advantages to local version control: you can commit after even the most trivial change, at least once you’re sure it works.

There are two commands that you’ll use for this process:
 $ git add <file/dir to add> ...
 $ git commit -m "Message for log"

Be very careful specifying directories when adding; git will add any unknown file contained in all subdirectories, which might end up adding a bunch of build files you don’t want.

In general, I use status and diff (described under View Changes above) to see what I’ve changed since the last commit, and to make sure I’ve added everything relevant before committing.
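
One way to avoid sweeping build output into a commit is to preview the add first. This is a small sketch, assuming a hypothetical src directory with a Debug build folder inside it; the --dry-run option (short form -n) lists what git add would take without touching the index.

```shell
#!/bin/sh
# Sketch only: preview a directory add before doing it for real.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical layout: a source file plus build output underneath it.
mkdir -p src/Debug
echo 'int main(void) { return 0; }' > src/hello.c
echo 'object code' > src/Debug/hello.obj

# Lists every file that "git add src" would pick up, build output included.
preview=$(git add --dry-run src)
echo "$preview"

# The index is still empty, so nothing has been committed to by accident.
staged=$(git diff --cached --name-only)
```

If the preview shows files you don't want, add the specific paths instead, or extend .git/info/exclude before adding the directory.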

Tag

By default, a tag applies to the most recent commit. If you have uncommitted changes, they won’t be under the tag. Tag names can’t contain spaces.

The usage is simple:
 $ git tag -m "Message for log" "my-tag-name"
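
Here is a short sketch of tagging in a scratch repository. The tag names and file are made up; the second tag shows that you can name a commit after the tag name if you want to tag something other than the most recent commit.

```shell
#!/bin/sh
# Sketch only: tag the latest commit, then tag an older one by its id.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"          # placeholder identity
git config user.email "yname@you.com"

echo 'v1' > notes.txt
git add notes.txt
git commit -q -m "First version"
first=$(git rev-parse HEAD)

echo 'v2' >> notes.txt
git commit -q -a -m "Second version"

# With no commit argument, the tag lands on the most recent commit.
git tag -m "Latest working build" latest-build
# Naming a commit tags that one instead.
git tag -m "Original version" first-version "$first"

git tag -l                                 # list all tags

# Resolve each tag back to the commit it points at.
tagged=$(git rev-parse "first-version^{commit}")
latest=$(git rev-parse "latest-build^{commit}")
```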

Move/Rename/Delete Files

Inside of a git repository, use git’s own commands rather than plain rm or mv, so that the repository hears about the change at the same moment the file moves. This includes renaming files inside of Visual Studio, since VS2005 effectively performs a mv behind the scenes. It’s not the end of the world if you do it: git doesn’t record renames explicitly, it detects them by comparing file contents, so the history usually survives once you add the new path and remove the old one. You just have to tell the index about both halves yourself.

The commands for doing this are:
 $ git rm <file to remove> ...
 $ git mv <source path> <destination path>

They otherwise work exactly like the shell commands; they just inform the repository of the change as well. The changes are automatically added to the index, ready for committing; if you further change a moved file, you’ll still need to add it from its new location.
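
A quick sketch of both commands in a scratch repository, with made-up file names. After the rename is committed, git log --follow is one way to check that the file's history still reaches back past its old name.

```shell
#!/bin/sh
# Sketch only: rename one file, remove another, then inspect the history.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"          # placeholder identity
git config user.email "yname@you.com"

echo 'hello' > old_name.txt
echo 'scratch' > obsolete.txt
git add old_name.txt obsolete.txt
git commit -q -m "Add two files"

# Both commands change the working tree and the index in one step.
git mv old_name.txt new_name.txt
git rm -q obsolete.txt
git status --short
git commit -q -m "Rename one file, remove the other"

# --follow asks log to keep tracing the file across the rename.
history=$(git log --follow --format=%s -- new_name.txt)
echo "$history"
count=$(echo "$history" | wc -l | tr -d ' ')
```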

Use Branches

Let’s say you’re trying out some new functionality, but it’s just not going anywhere. Or maybe you’re partway through adding something and need to hop back to the last working commit to do a build, without losing your work. These are just a few of the cases where you might want to branch. You can also branch preemptively.

  • Create a new branch to contain current/future changes
    $ git checkout -b <branch name>
    (edit file, if necessary)
    $ git add <broken files>
    $ git commit -m "Stuff I broke"

  • Switch between branches (this will replace files, make sure your changes are committed to the correct branch)
    $ git checkout master
    (in master branch)
    $ git checkout experimental
    (in experimental branch)

  • List all branches (* indicates current branch)
    $ git branch
      experimental
    * master

  • There are two ways to delete a branch:
    • -d refuses to delete the branch unless its commits have already been merged into the branch you’re currently on (typically master)
    • -D just nukes the branch whether or not its commits have been merged anywhere

    $ git branch -d <branch name>
    $ git branch -D <branch name>

  • Merging between branches is also easy:
    $ git checkout target
    (in target branch)
    $ git merge source
    (all non-conflicting commits from source will be added to target)
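
The whole branch cycle fits in a few lines. The sketch below runs it in a scratch repository: branch, commit on the branch, switch back, merge, and delete. It reads the starting branch name from git rather than assuming it is master, since newer git versions may name it differently; the branch name experimental and the file are made up.

```shell
#!/bin/sh
# Sketch only: create, merge, and delete an experimental branch.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Your Name"          # placeholder identity
git config user.email "yname@you.com"

echo 'stable' > code.txt
git add code.txt
git commit -q -m "Working version"
base=$(git symbolic-ref --short HEAD)      # master, or whatever init chose

# Branch off and commit the experiment there.
git checkout -q -b experimental
echo 'experiment' >> code.txt
git commit -q -a -m "Try something new"

# Back on the base branch, the file is as it was before the experiment.
git checkout -q "$base"

# The experiment panned out, so bring it in and drop the branch.
git merge -q experimental
git branch -d experimental                 # allowed because it is merged

remaining=$(git for-each-ref --format='%(refname:short)' refs/heads)
```

If the experiment had not panned out, skipping the merge and using -D instead of -d would discard it.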

Sharing Repositories

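Just make your working directory accessible as a shared folder, and another person on the network can get your commit history by creating a new repository and pulling from your repository. A pull like this fetches the branch you currently have checked out; if they want your other branches as well, git clone accepts the same path and copies all of them.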
$ cd c:/Projects
$ mkdir HelloWorld
$ cd HelloWorld
$ git init
$ git pull //yourcomputer/Projects/HelloWorld/
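
You can try the same thing on one machine before involving a network share. In this sketch two temporary directories play the two roles: one stands in for //yourcomputer/Projects/HelloWorld/ and the other for your colleague's new working directory. The file name is made up.

```shell
#!/bin/sh
# Sketch only: pull from one local repository into a freshly created one.
set -eu

shared=$(mktemp -d)     # stand-in for //yourcomputer/Projects/HelloWorld/
mine=$(mktemp -d)       # the other person's new working directory

# Your side: a repository with one commit in it.
cd "$shared"
git init -q
git config user.name "Your Name"          # placeholder identity
git config user.email "yname@you.com"
echo 'hello' > hello.txt
git add hello.txt
git commit -q -m "Initial check-in"

# Their side: create an empty repository and pull yours into it.
cd "$mine"
git init -q
git pull -q "$shared"

# Both repositories should now point at the same commit.
theirs=$(git -C "$shared" rev-parse HEAD)
ours=$(git rev-parse HEAD)
```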


Comments

7 responses to “Local Version Control With Git”

  1. Matti

    Nick, this is awesome. I am going to channel Martin and suggest you link to this post from the GitWiki blog posts page.

    Now do an article about how non-coders can use git to sync their personal rc files etc. across many machines. Bonus points if you include some way to trigger shell scripts on a push. I mean, not that I need such a doc or anything.

    1. Dealing with dotfiles is on my project list, actually; I have a very old CVS + shell script setup that I used for all of my *nix logins at school, but I broke it a while ago and haven’t fixed it. My shell config is mostly “mature” at this point, so it’s not a huge deal for me. I was thinking of using branches for each host, so you could potentially do slight tweaks if you’re on Mac OS 10.4 vs. Ubuntu, as well as host-specific changes.

  2. flurie

    For some reason these dinky little commit logs using -m instead of tracts composed in EDITOR make me think of twitter vs real blogging.

    1. I’ve almost always done that with CVS and SVN in the past; I just don’t feel the need to wait for $EDITOR to start when it’s just going to be a sentence or two.

    2. Although this approach works fine, it does mean that your web pages need a git repository embedded within them. A better approach, and one I use now, is to use rsync to copy (and synchronise) the files at the remote end. Here is an example, which detects a git merge, or a commit in one of two branches (site and test), and then rsyncs via ssh (to site mb) into the web locations for the production and test sites respectively.

      #!/bin/sh
      #
      # Output a version file that we can include at the bottom of the page
      branch=$(git branch | sed -n s/^\*\ //p)
      cd "$(git rev-parse --show-cdup)"
      if [ "$branch" = "site" ]; then
          git clean -f
          rsync -axq --delete --exclude=scripts/calendar --exclude=htaccess --exclude=.git --exclude=test ./ mb:public_html/static/
      fi
      if [ "$branch" = "test" ]; then
          git clean -f
          rsync -axq --delete --exclude=scripts/calendar --exclude=htaccess --exclude=.git ./ mb:public_html/static/test/static/
      fi

  3. Judging from the response, your post is both well-written and informative. I tried to read it, honest, but started cracking up when I read “many git tutorials.” Google says:

    Definitions of git on the Web:

    rotter: a person who is deemed to be despicable or contemptible; “only a rotter would do that”; “kill the rat”; “throw the bum out”; “you cowardly …
    wordnet.princeton.edu/perl/webwn

    1. If you watch Linus Torvalds’s lecture at Google about git, and why he and it are so awesome, he explains that he picked the name for precisely that connotation (and, presumably, a pun on “get”).
