Subverting the system. June 18, 2008

Posted by Scott Morrison in Uncategorized.

Subversion, often abbreviated as SVN, is a “version control system”. Prompted by Nathan’s request to hear about collaborative software for mathematicians, and the comments on Ben’s post on the subject, I’m going to briefly describe how you might use Subversion to collaborate on a maths paper. Even better, I’m offering to set up a subversion repository for any mathematician who’d like to try it. Jump to the bottom if you already have your subversion-fu, and just want the goodies.

Why would you want to try it? Essentially because I can’t imagine how you’re currently surviving in the dark ages without it! Having met Subversion through a programming project (the sadly defunct omath.org), I’ve used it for each and every maths paper I’ve written since, and even persuaded 6 coauthors so far to jump through the required hoops. They’ve all been happy enough. The alternative, emailing drafts back and forth, and having to keep track of “who’s in charge” at any given point, seems miserable. Besides automating the process of merging changes made by several people, it provides several nifty capabilities: recovering any previous version, if something goes wrong or you rediscover the charm of a dropped paragraph, as well as tools to show differences, or to “blame” a file, showing who last edited each line or section.

You might have heard of one of Subversion’s many cousins: CVS, essentially made obsolete by SVN, and perhaps darcs or git, both of which are “distributed” systems, not requiring a central repository. I know mathematicians using all 4; I know about SVN, so I’ll stick with that today.

How do you use Subversion? I’ll describe here the command line interface; if you use Windows, you should install TortoiseSVN, which gives a nice “right-click” interface to all of this, but the description should translate readily. There are also GUI clients for varieties of Linux and OS X; hopefully someone will chime in on those in the comments. Installing Subversion is fairly simple for most people: on a Debian-like system, you can use the incantation “sudo apt-get install subversion”. At Berkeley, I used the incantation “Dear Julie, could you please install subversion on the department machines? It would be really useful for me and my collaborators. Thanks, Scott”, to which she replied “Sure, done!” about half an hour later. Julie is awesome.

Each SVN repository has a URL, for example something like http://tqft.net/svn/d4. (That’s a real live URL, corresponding to the SVN repository Noah and Emily and I are using to write a paper about the D_2n subfactor planar algebras. It’s even public, if you want to play along.)

To get started, you “check out” the repository:

svn checkout http://tqft.net/svn/d4

That should create a local directory called “d4”, containing several subdirectories and files. (If you know you only want part of the repository, you can use a longer URL, like http://tqft.net/svn/d4/trunk/code.) Once you’ve done this initial check out, the commands you’ll mostly use are

svn up

and

svn commit -m "This is a message describing the changes I just made."

The svn up command “updates” your local copy, automatically incorporating any changes that other people have made on the repository. Make sure that you’re actually inside a directory containing files under version control. If you’ve just checked out the “d4” repository described above, you have to type cd d4 before svn up will do anything. Noah makes this mistake all the time! :-)
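One way to guard against the wrong-directory mistake is a small wrapper that checks first. This is only a sketch: the function name safe_update is made up for illustration, and it relies on svn info failing when the current directory is not part of a working copy.

```shell
# Hypothetical helper: run "svn up" only inside a Subversion working copy.
# "svn info" exits with a non-zero status outside a working copy,
# so it serves as a cheap check before updating.
safe_update() {
    if svn info >/dev/null 2>&1; then
        svn up
    else
        echo "Not inside a working copy; try 'cd d4' first." >&2
        return 1
    fi
}
```

After defining it in your shell startup file, you would type safe_update wherever you would otherwise type svn up.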

Nearly always Subversion incorporates remote changes successfully, even if you’ve also been editing one of the modified files locally. (Be careful, though, to close and reopen any updated files in your text editor!) Sometimes it fails though, and this is called a conflict. If this happens, well, ask someone who’s dealt with one before, or go read some of the Subversion book! (Incidentally, this is a well-written and thorough resource, available freely online. You don’t need to read much of it for normal use, but the real Subversion guru will eventually need to master all its appendixes.)
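For the curious, here is roughly what a conflict involves. When an update cannot merge a file, Subversion flags it with a C and writes both versions into the file, separated by marker lines beginning with <<<<<<<, ======= and >>>>>>>. You edit the file by hand, delete the markers, and then tell Subversion you are done (with svn resolved paper.tex in the versions I know). The helper below is a sketch for spotting markers you forgot to delete; the name conflict_markers and the file name paper.tex are just placeholders.

```shell
# Hypothetical helper: list (with line numbers) any leftover conflict
# marker lines in a file. Exits non-zero if none are found.
conflict_markers() {
    grep -n -E '^(<<<<<<< |=======$|>>>>>>> )' "$1"
}

# Usage, after hand-editing a conflicted file:
#   conflict_markers paper.tex || svn resolved paper.tex
```

A line consisting of exactly seven equals signs in your own text would be a false alarm, so treat the output as a hint, not a verdict.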

The svn commit command sends your local changes to the repository. You should include a very brief message describing your changes, although I’m often lazy about this, and use a very very brief message: “”. The important rule for happy Subversion use is “commit early and often”. This only really matters when you’re concurrently editing files with other people, but Noah and Emily and I have found this a really nice way to use Subversion (three people typing at once makes for quickly growing papers)!

A few more things: when you create a new file, it isn’t automatically included in the repository. You have to use a command like

svn add my-new-file.tex

Further, instead of using commands mv, cp or rm to move, copy or delete a file, use svn mv, svn cp or svn rm, so Subversion knows what’s going on. The GUI clients mentioned above make this easy.
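If you tend to forget svn add, the output of svn status helps: files Subversion doesn’t know about are listed with a ? at the start of the line. As a sketch (the function name unversioned is made up), the following filter pulls out just those paths.

```shell
# Hypothetical helper: read "svn status" output on standard input and
# print only the paths flagged '?', i.e. not yet under version control.
unversioned() {
    sed -n 's/^?[[:space:]]*//p'
}

# Usage, inside a working copy:
#   svn status | unversioned
```

Anything it prints that belongs in the paper (as opposed to .aux and .log debris) still needs an svn add before your next commit.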

Finally, it’s give-away time. I’ve set up so many Subversion repositories by now (papers, all my private files, several programming projects, and repositories for friends) that I have it down to a fine art, and in particular just a few minutes startup time. So, if you’d like one for your own use, tell me:

  1. The name for the repository.
  2. A list of usernames and passwords for accessing the repository
    • I didn’t mention this above, but you’ll be prompted for a password the first time you try to make changes at the repository.
  3. Whether it should be public or private (i.e. readable by the world, or just those with passwords).
  4. If you’d like automatic emails every time anything gets committed, and if so which email address(es) to use. This is strongly recommended, even if it sounds unnecessary at first.

Disclaimers: I’m not promising support outside of this comment thread, and you’re on your own if my hosting company does a runner, or I do. In principle, I can read your private repository, but in practice I don’t care enough to do so. On the other hand, I’ll include your repository in my completely paranoid backup system.

If you’d prefer to do all this yourself, the command you want to start with is “svnadmin create abc123”. Hooking up the repository to apache or another webserver for easy access requires some practice, however. Alternatively, if you don’t mind paying $6 a month, http://dreamhost.com/ offers SVN hosting amongst their many other services. They’re cheap and cheerful, and every so often offline.
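If you just want to experiment before involving a webserver, a repository on your own disk can be checked out with a file:// URL. The sketch below is not how my hosted repositories are set up; the function name try_local_repo and the scratch directory are assumptions for illustration, and it does nothing if the Subversion tools aren’t installed.

```shell
# Hypothetical helper: create a throwaway repository called abc123 in a
# scratch directory, check it out over file://, and print the scratch path.
try_local_repo() {
    # Do nothing (successfully) if svnadmin is not available.
    command -v svnadmin >/dev/null 2>&1 || { echo "svnadmin not found" >&2; return 0; }
    scratch=$(mktemp -d) || return 1
    svnadmin create "$scratch/abc123" || return 1
    # $scratch is an absolute path, so this yields file:///...
    svn checkout -q "file://$scratch/abc123" "$scratch/wc" || return 1
    echo "$scratch"
}
```

The working copy ends up in the wc subdirectory of the printed path, and svn add, svn commit and svn up behave there just as described above. A file:// repository is only reachable from that one machine, so it’s for practice, not for collaborating.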

Comments»

1. MercurialUser - June 18, 2008

After using a system with a distributed philosophy, it would be very hard for me to go back.

I recommend Mercurial: http://www.selenic.com/mercurial/
We use it for everything from papers to very large software projects.

2. Ben Webster - June 18, 2008

I’ll just note, there are a ton of places to get free Subversion accounts online, often in communities aimed at developers. The one I’m trying at the moment is assembla.com, but I haven’t gotten enough of a feeling for it to make a real endorsement. It seems like stealing the coders’ bug-tracking tools could be rather useful, though.

Let me add that if you want a functional free account, you should definitely NOT go to beanstalk. Their free accounts suck.

Scott, do you have any thoughts about such sites?

3. jeremy - June 18, 2008

Yay subversion. It’s saved my butt many times when designing websites.

4. Nathan Dunfield - June 18, 2008

Here’s a nice LaTeX trick when using CVS or Subversion. The following code defines a macro \versioninfo which you can use to print the current revision number within your document e.g. in a footnote or the running head:

\def\RCS$#1: #2 ${\expandafter\def\csname RCS#1\endcsname{#2}}
\RCS$Revision: 1.99 $
\RCS$Date: 2008/03/18 03:44:56 $
\newcommand{\versioninfo}{Version \RCSRevision; Last commit \RCSDate}

Oh, and for Subversion you need to run:

svn propset svn:keywords "Id Revision Date" filename.tex

for this to work.
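Once the keywords are being expanded, the same revision number can also be read straight out of the .tex file from the command line, which is handy when comparing copies with a coauthor. This is a sketch only: the function name revision_of is made up, and it assumes the file contains an expanded keyword of the form $Revision: N $.

```shell
# Hypothetical helper: print the number inside the first expanded
# $Revision: N $ keyword found in a file. Accepts CVS-style numbers
# such as 1.99 as well as plain Subversion revision numbers.
revision_of() {
    sed -n 's/.*\$Revision: \([0-9.][0-9.]*\) \$.*/\1/p' "$1" | head -n 1
}

# Usage:
#   revision_of filename.tex
```

It prints nothing if the keyword has not been expanded yet, which usually means the property was not set or the file has not been committed since.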

5. Nathan Dunfield - June 18, 2008

“I recommend Mercurial.
We use it for everything from papers to very large software projects.”

I’ve switched to Mercurial for coding work, and it is indeed truly excellent there, but have so far stuck with traditional central repository systems (CVS and SVN) for papers. I really like being able to track the revisions with specific numbers that can appear in the TeXed file as per my last comment, and have found this very helpful to coauthors unfamiliar with version control (“does it say Revision 1.41 in your copy of the file”, “no, only 1.40”, “ok, you need to do ‘cvs update’ to get my latest changes…”). Having one’s revisions called things like “2ad3dcb8d811” instead of “1.40” is a tad confusing to the neophyte user, though such keyword substitutions are apparently now supported in Mercurial.

6. Aaron - June 18, 2008

I have asked several of the people in my department (Physics and Astronomy) why they don’t use revision control software, and most hadn’t even heard of it, even the ones that do substantial amounts of coding. The ones that had were all ex-programmers of one sort or another. Once I explained, they all said “sounds too complicated”, even though they have had significant problems with shared code and others changing it.

It’s frustrating.

7. Mikael Vejdemo Johansson - June 18, 2008

My reactions ended up being too big for this comment thread. I posted about them on my own blog instead.
http://blog.mikael.johanssons.org/archive/2008/06/a-vision-for-collaborative-mathematics-platforms/

8. David Loeffler - June 19, 2008

I can’t believe I never thought of that :-) I’m currently writing up my thesis, and I work sometimes from my laptop at home and sometimes from my office machine; I use rsync to transfer stuff across, and it’s a right pain, as it will happily overwrite new stuff with old if you get the syntax even slightly wrong.

Your article has inspired me to just chuck the whole lot into SVN, which is just obviously a better solution, even though this isn’t a collaborative project — just me working from two places.

9. Nathan Dunfield - June 19, 2008

“I use rsync to transfer stuff across, and it’s a right pain, as it will happily overwrite new stuff with old if you get the syntax even slightly wrong.”

Yeah, bidirectional synchronization is not rsync’s forte; unison works very well for this. (Though of course for one’s thesis actual version control is the way to go.)

10. Scott Morrison - June 19, 2008

@ben (#2),

I have in the past used Jira, a “bug tracking” program. It’s made by a friend of mine in Australia, and his company Atlassian. Sadly it’s closed source, and commercial, but they’re very supportive of open source projects, and I got a free license for the omath.org project years ago.

I used it for all sorts of things beyond omath, however — keeping track of things that still needed to be written in a long and complicated paper, as well as organising my life for a while (keeping track of travel arrangements, passport and visa applications, papers to referee, etc.) In the end it fizzled out, and I fell back on simpler methods. I still have a working installation of Jira, however, if anyone would like to play.

Trac is by now the standard free software solution to the “issue tracking” problem. Does anyone have experience using it in mathematics? I’m not sure how useful it is, even for a complicated many-author paper. On the other hand, if I were ever appointed dictator of a maths department, I’d be tempted to have my first command be “Go create an account on our Trac server, and get started.”

11. Odd Man Out - June 20, 2008

The problem with SVN (and CVS) is that it requires a central server. This comes with its own set of problems, like needing the technical know-how to set one up, and a lack of data integrity. The latter means losing not only data but history, etc. as well if the server crashes. And what if the server is unreachable?

As mentioned above Mercurial:

http://www.selenic.com/mercurial/wiki/

is an excellent distributed system that can also be set up like a central server. Not only that, but it’s really easy to set that up for use over http with zero extra configuration needed for the web server. In fact, I did this while I was at a web hoster not too long ago. All that was needed was Python to be installed on the server. Everything else can be installed ‘locally’.

Another distributed version control system is GIT:

http://git.or.cz/

I haven’t tried it myself, but many people use and like it.

As I mentioned, I use Mercurial. I thought about using SVN, but it’s just way too heavy (bloated) for my tastes. Mercurial is fast, written in Python and pretty much runs anywhere. All that with a *very* small learning curve.

@Scott Morrison:

Re: Trac:

Why? Don’t you think that Trac is overkill for a paper? Trac would be like digging a hole in your backyard with an H-bomb.

Personally, I would think that using a version control system (e.g. Mercurial) and possibly a mailing list would be just right. That way, no data gets lost and everyone gets cc’d (as long as replies go to the list) on emails.

“Make everything as simple as possible, but not simpler.”
– Albert Einstein

12. Scott Morrison - June 30, 2008

@Nathan, #4

Just a pointer for Windows users of Subversion: Nathan’s suggested command

svn propset svn:keywords "Id Revision Date" filename.tex

doesn’t seem to work on the Windows command line. You can achieve the same effect using the GUI provided by TortoiseSVN, or running svn through cygwin. I’m sure this is just something about how quotation marks are interpreted by the Windows shell, but I’m no expert.