If you’re already using source code control, you totally rock. You can skip this posting if you wish, but check out Off-Site Backups before you go.
I’ve been a professional software developer for the last two decades or so, and I have always used a source code control system. Packages with weird names like “RCS”, “CVS”, “Projector”, “Source Safe”, “Subversion”, “Perforce”, “Mercurial”, and “Git”. Even for personal projects, I get twitchy if I’ve written a hundred lines of code and I don’t have a place to put it.
In my work teaching at the Big Nerd Ranch, running a local CocoaHeads chapter, and mentoring junior programmers, I’ve noticed that a lot of folks who program just for the joy of it haven’t heard of source code control, or think it’s a Big Scary Thing and have avoided it. Hopefully I can convince you to use it.
“Source code control” is basically a big memory bank for all the stuff that makes up your program. Your code. Your Xcode projects. Your button images. Maybe even your Photoshop and Illustrator documents. You make something new, then you add it to your source code control system and check it in. You make some changes, and you check those in too. That’s pretty much it. At any time, you can compare your current code with code that you wrote yesterday or last year. Think of it like a database of changes you’ve made to your code.
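To make the “database of changes” idea concrete, here is a toy model in Python. It is only a sketch: the names ToyRepo, check_in, and check_out are made up for illustration, and real systems like Git or Subversion store history far more cleverly than a list of full snapshots.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Commit:
    message: str            # the check-in comment
    files: dict[str, str]   # full snapshot: filename -> contents


@dataclass
class ToyRepo:
    """Hypothetical stand-in for a real source code control system."""
    history: list[Commit] = field(default_factory=list)

    def check_in(self, files: dict[str, str], message: str) -> int:
        """Store a copy of the files and return the new revision number."""
        self.history.append(Commit(message, dict(files)))
        return len(self.history) - 1

    def check_out(self, revision: int) -> dict[str, str]:
        """Return a copy of the files as they were at the given revision."""
        return dict(self.history[revision].files)

    def log(self) -> list[str]:
        """List every revision with its check-in comment, oldest first."""
        return [f"r{i}: {c.message}" for i, c in enumerate(self.history)]


repo = ToyRepo()

# Make something new, then check it in.
working = {"main.m": 'NSLog(@"hello");'}
first = repo.check_in(working, "Initial version")

# Make some changes, and check those in too.
working["main.m"] = 'NSLog(@"hello, world");'
working["button.png"] = "<image bytes>"
second = repo.check_in(working, "Friendlier greeting, add button image")

# At any time, get the older version back.
restored = repo.check_out(first)
```

Every check-in keeps its comment and the full set of files that went in together, which is what lets you ask later what changed and why.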
So why would you want to have a database like this? Isn’t the file system good enough? There’s a bunch of reasons.
The biggest reason, for me, is that it can act like a giant undo button. I can edit code fearlessly. I can make changes. I can do big refactorings. If I decide I don’t like it, I can revert to an earlier version.
I also don’t need to comment out chunks of code “just in case I might need it in the future.” If it’s not needed, it’s dead. If I want it back, I can go to the file history and retrieve it.
You can see what changes you’ve made. Did some coding today and now things are broken? You can get a quick diff and see what you did.
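If you have never seen a diff, here is roughly what one looks like. This sketch uses Python’s standard difflib module on two made-up versions of a file; the file contents and the “yesterday”/“today” labels are invented for illustration, and your source code control tool will produce this kind of output for you.

```python
import difflib

# Two hypothetical versions of the same file.
yesterday = """\
def total(prices):
    return sum(prices)
""".splitlines(keepends=True)

today = """\
def total(prices):
    return sum(prices) * 2  # oops
""".splitlines(keepends=True)

# unified_diff yields header lines, then each changed region with
# "-" marking removed lines and "+" marking added lines.
diff = list(
    difflib.unified_diff(yesterday, today, fromfile="yesterday", tofile="today")
)
print("".join(diff), end="")
```

The line starting with “-” is what the code used to say and the line starting with “+” is what it says now, so the accidental change stands out right away.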
My debugging process involves lots of code hacking. Maybe adding early returns to functions. Perhaps cutting out method bodies. Inserting logging. Pasting in debugging utilities. It can get really messy. Once I’ve found the bug I can undo (a.k.a. “revert”) my hacking-n-slashing and apply the bug fix to pristine code.
Collaboration is a powerful aspect of source code control. Multiple programmers can be working in the same code base, but can be doing their day-to-day work in a private copy on their own machines. They can introduce errors, break code, fix those errors, and do all sorts of the software violence that’s necessary to implement a new feature or fix a bug. Once the feature is done (it compiles cleanly and works properly), you can commit the changes to the shared code. The other programmers on your team can then update their private copies to include the new goodness.
Backup is very important too. By keeping master versions of your source code off of your machine, your magnificent creation will survive your machine dying. If your code is the source of your income, having it outside of your home or place of business will mean you can keep developing and shipping it, even if your house or office gets hit by a meteor.
So, why not just use Time Machine? You get the backup and undo features from it (at least until the hourly backups age out). But you lose the ability to easily know why things changed over time because there are no check-in comments. You’d have to diff each file against its counterpart in the Time Machine backup to know what changed. You also can’t easily figure out what set of files changed. If you modified four view controllers and four xib files, and then checked all of them in at the same time into a proper source code control system, you can go through your history and see “Huh. Why did these files change? Oh yeah, all of them were modified at the same time because I added the New User Invitation workflow.” Time Machine can’t answer these kinds of questions. Time Machine also can’t be used for collaboration. Two different programmers can’t use the same Time Machine backup to work together on the same code base.
So, why not just make a zip file every couple of hours, and share that around? That takes care of backups, and can be used as a form of collaboration, but it now falls on the shoulders of everyone involved to make sure they’re not making conflicting edits to files: Bob sets the main window’s background color to fuchsia, and George sets it to teal. Which one wins? Unless there’s a shared canonical version of the code everyone is living in, all the versions of the program quickly get out of sync as everyone makes changes. Having the team working out of a single source tree (say everyone logged into a single machine) is generally a non-starter: most non-trivial amounts of work will break the system until the work is finished.
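The Bob-and-George problem is exactly what a shared canonical version lets a tool catch. Here is a minimal sketch of the idea, assuming each person’s work can be boiled down to a dictionary of settings; find_conflicts and the setting names are hypothetical, and real tools compare file contents line by line rather than dictionaries.

```python
def find_conflicts(base, mine, theirs):
    """Return the settings both people changed from base, to different values."""
    conflicts = []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        original = base.get(key)
        a, b = mine.get(key), theirs.get(key)
        # Only a problem if BOTH sides changed it and they disagree.
        if a != original and b != original and a != b:
            conflicts.append(key)
    return conflicts


# The shared canonical version, plus each programmer's private copy.
base = {"window.background": "white", "window.title": "Main"}
bob = {"window.background": "fuchsia", "window.title": "Main"}
george = {"window.background": "teal", "window.title": "Main Window"}

print(find_conflicts(base, bob, george))  # prints ['window.background']
```

George’s title change is not flagged because Bob never touched the title, so it can be combined safely. Only the background color needs a human to decide who wins. With zip files passed around by hand there is no common base to compare against, so nobody finds out until something is overwritten.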
OK, so you’re convinced. Now what? Pick a source code control system and start using it! Subversion (svn), Git, and Mercurial (hg) are popular. You can find hosting and collaboration through sites like Google Code, GitHub, or BitBucket. There is a wealth of books and tutorials out there, usually concentrating on the command line.
Which one to pick? It really doesn’t matter. If your friends or colleagues have one they prefer, use that one. You now have people you can ask for help. Subversion is based on straightforward concepts and is easy to learn. Git is very popular, and very powerful, allowing for many different collaboration modes. It’s also notorious for being hard to learn and having opaque error messages. Mercurial strikes a middle ground: it’s easier to use, but not as popular as Subversion or Git. If you’re completely new to the world of source code control, you might want to start out with Subversion because it has less conceptual overhead than the “distributed” source code control systems like Mercurial or Git. Plus it’s pretty easy to migrate from Subversion to another system.
Xcode has some integration with Subversion and Git, which might drive your decision. You can do most of the common source code control operations from within Xcode. Although, if you’re not familiar with how the command-line tools work, it’s easy to get confused when Xcode does something inexplicable.
There are also stand-alone GUI applications that expose a lot of the power of the underlying source code control system, such as GitX, GitHub’s App for Mac or Windows, Tower (git), SourceTree (git/mercurial), Versions (svn), and Tortoise SVN.
Be sure that your code gets backed up off-site. Using a source code control system is a great first step, but you can still experience catastrophic data loss if you don’t have some kind of backup in place.
One of my students was a happy Git user, gleefully using it for all of his projects. One problem: his Git repository was local to his machine, with no off-site backup. So when the hard drive died, he lost his repository. Unfortunately, he didn’t have a machine backup either, so he lost a number of projects from college. Keeping a machine backup (and you are backing up your machine, right?) would have meant that he would have only lost a couple of hours’, or maybe a couple of days’, worth of work, but it’s still decidedly unfun to have to re-do work you’ve already done. With off-site storage, whether on an internet presence you control or one of the hosting services like GitHub or BitBucket, you won’t lose any committed code that you’ve pushed online.
I queried some of my friends for suggestions of stuff to look at. Software Carpentry has a set of lectures on source code control. For those wanting to learn Git, there’s Git Immersion. Think Like a Git explores the guts of Git, explaining how things work so that everyday operations aren’t quite as scary-boo. Pro Git is one of the canonical references. Mercurial users can get started with Hg Init or the Hg tutorial.
No matter what tool you choose, play with it first. My friend Jeremy W. Sherman told me “I honestly think the best way to get the hang of version control is to just play around with it. You really need some data you don’t care about destroying to be able to just play around and maybe mess things up.” Very true words. Set up some scenarios, and play around with commits and checkouts. And then once you’ve gotten a little familiarity, put one of your projects under source code control and enjoy all the benefits.