I’ve recently become obsessed with git for version control. I know that there are many git tutorials out there; this is probably redundant, but it focuses on only the simplest operations for my own reference. The git man pages themselves are extremely well written. This post is an adaptation of an article I wrote for our department wiki, targeted at some of my fellow researchers who have probably never heard of git. I removed all of the references to our internal projects.
Unfortunately, one of our major codebases at work is stored in a Microsoft Visual SourceSafe repository; it has been for years, and there’s enough momentum that we likely won’t move to a better version control system any time soon. If the motivation sounds odd, keep in mind that I mostly write research code, so we regularly create entirely new code that may be used once or twice and then never again.
Visual SourceSafe has limited branching capabilities, so many users keep large subsets of the codebase checked out onto their development machine. When a major experimental change is feature-complete, dealing with merges in VSS is a pain that typically involves face-to-face negotiations of who checks in what when, or one person spending a day or two fixing conflicts in WinMerge or a similar tool. This article demonstrates how to use git, a modern distributed version control system used by the Linux kernel (among others), to manage smaller-scale changes to VSS code trees on Windows without having to risk checking in files that “break” functionality for other users.
The main point of using a local version control repository inside of a VSS checkout is three-fold:
- Commit more often
- Easy branching for experimental code (whether or not it pans out)
- More detailed history/self-documentation of recent changes
I’ve found that my development habits have gotten much better using git; in particular, I commit after just about every successful compile-and-test, and I try to isolate commits by functionality as much as possible, so that it’s trivial to cherry-pick the experimental changes worth keeping and leave the rest behind.
You’ll only need to do these steps once on your development machine.
- Get git
  - If you don’t have it archived, get the latest version of Cygwin’s setup.exe.
  - Run setup.exe, accept the cached repository and install locations
  - Find git in the list (in category Devel), and set it to install
  - Accept and finish the Cygwin incremental install
- Configure git
  - Open a Cygwin shell
  - Run these git setup commands to set authorship:
$ git config --global user.name "Your Name"
$ git config --global user.email "email@example.com"
  - Run these git setup commands to avoid choking on Windows newlines:
$ git config --global core.autocrlf true
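To confirm the settings took effect, each value can be read back with git config --get. The sketch below is only an illustration: it points HOME at a scratch directory (my assumption, so that running it does not overwrite a real global configuration) and reuses the placeholder name and email from the commands above.

```shell
set -e
# Scratch HOME so this demonstration leaves any real ~/.gitconfig alone (assumption)
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

git config --global user.name "Your Name"
git config --global user.email "email@example.com"
git config --global core.autocrlf true

# Read each setting back individually
name=$(git config --global --get user.name)
email=$(git config --global --get user.email)
autocrlf=$(git config --global --get core.autocrlf)
echo "$name <$email>, autocrlf=$autocrlf"
```

On a real machine, the three --get lines on their own are enough to check what git will use.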
Create Git Repository
You’ll do this once for each working directory you create as per above. The commands below assume that you’re working in c:/Projects/HelloWorld/.
- Create Repository
$ cd c:/Projects/
$ mkdir HelloWorld
$ cd HelloWorld
$ git init
Initialized empty Git repository in c:/Projects/HelloWorld/.git/
- Configure Repository
  - Tell git to ignore certain Visual Studio/SourceSafe files that change constantly
  - See man git-config for more options
$ echo -e "*.user\n*.ncb\n*.suo" >> .git/info/exclude
- Initial check-in
  - Tell git which files you’re checking in
  - Commit all files (this may take a minute or two)
$ git add <project subdirs>
$ git commit -m "Initial check-in"
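Here is the whole sequence end to end, as a rough sketch in a scratch directory rather than c:/Projects/HelloWorld. The src directory and the two files in it are hypothetical stand-ins for a real project; the point is to check that the excluded *.user file stays out of the initial check-in.

```shell
set -e
repo="$(mktemp -d)/HelloWorld"        # scratch stand-in for c:/Projects/HelloWorld
mkdir -p "$repo/src"
cd "$repo"
git init -q
git config user.name "Your Name"      # per-repository identity, in case no global one is set
git config user.email "email@example.com"

# Ignore per-user Visual Studio files, as in the steps above
mkdir -p .git/info
printf '%s\n' '*.user' '*.ncb' '*.suo' >> .git/info/exclude

# Hypothetical project contents: one source file, one file that should be ignored
echo 'int main() { return 0; }' > src/main.cpp
echo 'local settings' > src/HelloWorld.user

git add src
git commit -q -m "Initial check-in"

# List what was actually committed
tracked=$(git ls-files)
echo "$tracked"
```

Running git ls-files after the first commit is a cheap way to catch build output or user files that slipped in before they become part of the history.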
This section is divided into common operations. One important note: the commands can be run from any subdirectory of the repository you created above, but most operations apply to the entire tree. The git manpages are excellent, and can explain further. There are a ton of options, but these get at the core functionality you’re likely to use working on most codebases.
View Changes
There are two commands useful here:
$ git status
$ git diff
One conceptual caveat: files that are pending a commit get added to the repository’s index. They’ll be listed in the status, but won’t appear in a plain git diff (use git diff --cached to see what has been added to the index). Be especially careful about using wildcards with the add command, because you generally don’t want to commit binaries or intermediate build files.
status will tell you which branch you are on, which files have been added to the index but not yet committed, which files have changed since the last commit but haven’t been added to the index, and which files aren’t being tracked by git. New files you create will appear in this last list, alongside any build directories you haven’t told git to ignore. For each tracked file it will tell you whether it’s been added, deleted, or modified.
diff shows a line-by-line diff for each file that has changed since the last commit but hasn’t yet been added to the index. This is a very detailed view; I generally use it to make sure that I’ve added everything I want before committing.
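The split between the index and the working tree is easiest to see with one file that has both kinds of change. The following is a minimal sketch in a scratch repository; notes.txt and its contents are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"
git config user.email "email@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "two" >> notes.txt
git add notes.txt               # this change is now in the index
echo "three" >> notes.txt       # this change is only in the working tree

unstaged=$(git diff)            # working tree vs. index: contains "+three" only
staged=$(git diff --cached)     # index vs. last commit: contains "+two" only
git status --short              # prints "MM notes.txt": modified in both places
```

If a plain git diff comes back empty but status still lists modified files, everything has already been added and git diff --cached is the view to check.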
This is one of the big advantages to local version control: you can commit after even the most trivial change, at least once you’re sure it works.
There are two commands that you’ll use for this process:
$ git add <file/dir to add> ...
$ git commit -m "Message for log"
Be very careful specifying directories when adding; git will add every untracked file in that directory and all of its subdirectories, which might end up adding a bunch of build files you don’t want.
In general, I use status and diff (described under View Changes above) to see what I’ve changed since the last commit, and to make sure I’ve added everything relevant before committing.
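One way to guard against over-eager adds is the -n (--dry-run) flag, which lists what git add would stage without changing the index. This is a small sketch with a hypothetical src/Debug build directory; the file names are invented for the example.

```shell
set -e
cd "$(mktemp -d)"
git init -q
mkdir -p src/Debug
echo 'int main() { return 0; }' > src/main.cpp
echo 'binary junk' > src/Debug/main.obj

# Preview: nothing is staged yet, git only reports what it would add
preview=$(git add -n src)
echo "$preview"

# Stage only the file that belongs in the history
git add src/main.cpp
staged=$(git diff --cached --name-only)
echo "$staged"
```

The preview here includes src/Debug/main.obj, which is the cue to add files individually or to extend .git/info/exclude before adding the directory.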
By default, a tag applies to the most recent commit on the current branch. If you have uncommitted changes, they won’t be under the tag. Tag names can’t contain spaces.
The usage is simple:
$ git tag -m "Message for log" "my-tag-name"
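Tags are not limited to the latest commit: git tag accepts a commit as an optional last argument. The sketch below, in a scratch repository with invented tag names, tags both the current commit and an earlier one.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"
git config user.email "email@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "First version"
first=$(git rev-parse HEAD)           # remember the first commit's id

echo "v2" > file.txt
git commit -q -a -m "Second version"  # -a stages modified tracked files

git tag -m "Latest working build" latest-build       # tags the most recent commit
git tag -m "Original baseline" baseline "$first"     # tags the earlier commit

git tag -l                                           # lists both tag names
tagged=$(git rev-parse 'baseline^{commit}')          # commit the baseline tag points at
```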
It is best to avoid plain rm or mv inside of a git repository, because git is not told what happened to the files: the old path simply shows up as deleted and the new one as untracked until you add both changes yourself. This includes renaming files inside of Visual Studio, since VS2005 effectively performs a mv behind the scenes. It’s not the end of the world if you do it. Git detects renames by comparing file contents, so history usually survives once both changes are added, but if the file was also heavily edited git may see it as a delete followed by an unrelated create.
The commands for doing this are:
$ git rm <file to remove> ...
$ git mv <source path> <destination path>
They otherwise work exactly like the shell commands; they just also inform the repository of the change. The changes are automatically added to the index, ready for committing; if you further change a moved file, you’ll still need to add it from its new location.
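To see how a rename shows up, here is a short sketch in a scratch repository with made-up file names. It uses git log --follow, which asks log to keep tracing a single file across renames.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"
git config user.email "email@example.com"

printf 'line 1\nline 2\nline 3\n' > old_name.txt
git add old_name.txt
git commit -q -m "Add file"

git mv old_name.txt new_name.txt
status=$(git status --short)      # "R  old_name.txt -> new_name.txt"
echo "$status"
git commit -q -m "Rename file"

# History of the file under its new name, including the commit before the rename
history=$(git log --follow --format=%s -- new_name.txt)
echo "$history"
```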
Let’s say you’re trying out some new functionality, but it’s just not going anywhere. Or maybe you’re partway through a change and need to hop back to the last working commit to do a build, without losing your work in progress. These are just a few of the cases where you might want to branch. You can also branch preemptively.
- Create a new branch to contain current/future changes
$ git checkout -b <branch name>
(edit file, if necessary)
$ git add <broken files>
$ git commit -m "Stuff I broke"
- Switch between branches (this will replace files, make sure your changes are committed to the correct branch)
$ git checkout master
(in master branch)
$ git checkout experimental
(in experimental branch)
- List all branches (* indicates current branch)
$ git branch
- There are two ways to delete a branch:
  - -d only deletes the branch if its commits have already been merged (into master, for example)
  - -D just nukes the entire branch and all of its commits, merged or not
$ git branch -d <branch name>
$ git branch -D <branch name>
- Merging between branches is also easy:
$ git checkout target
(in target branch)
$ git merge source
(all commits from source will be merged into target; conflicting changes have to be resolved by hand before the merge completes)
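Put together, a full branch lifecycle might look like the following sketch, run in a scratch repository. The branch name experimental and the files are illustrative; the default branch is read back from git rather than assumed, since newer git versions may name it something other than master.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Your Name"
git config user.email "email@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Working version"
main_branch=$(git rev-parse --abbrev-ref HEAD)   # name of the default branch

# Branch off and commit the experiment there
git checkout -q -b experimental
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Experimental feature"

# Back on the original branch, feature.txt is gone until the merge
git checkout -q "$main_branch"
git merge -q experimental

# -d succeeds because the branch has been merged
git branch -d experimental
git branch
```

If the experiment had not panned out, skipping the merge and using git branch -D experimental would discard it instead.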
Just make your working directory accessible as a shared folder, and another person on the network can get your complete working tree (with commit history and branches) by creating a new repository and pulling from your repository.
$ cd c:/Projects
$ mkdir HelloWorld
$ cd HelloWorld
$ git init
$ git pull //yourcomputer/Projects/HelloWorld/
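As an alternative to the init-and-pull sequence, git clone does both steps at once and records the source repository as a remote named origin, so later updates are a plain git pull. The sketch below simulates the shared folder with a local scratch directory; the paths are stand-ins, not a real network share.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the colleague's shared working directory
mkdir "$work/shared"
cd "$work/shared"
git init -q
git config user.name "Your Name"
git config user.email "email@example.com"
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Initial check-in"

# Clone creates the directory, initializes it, and fetches the history in one step
cd "$work"
git clone -q "$work/shared" HelloWorld
cd HelloWorld
git log --format=%s      # prints "Initial check-in"
git remote -v            # shows the shared directory recorded as origin
```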