Request: long-distance collaboration

Nathan Dunfield (a new commenter!) supplies our first request:

How about a discussion of long-distance collaboration tools and methods, beyond just using email and talking on the phone? It seems like there are a lot of things that might work, e.g. pointing a cheap webcam at a piece of paper, using collaborative text editors (e.g. SubEthaEdit), IM’ing (some clients have LaTeX support, I think), virtual whiteboards (e.g. Scriblink.com), but which might also turn out to be useless in practice for all sorts of annoying technical reasons. So it would be interesting to hear from people who have had success or failure with various methods.

Unfortunately, I have nothing insightful to say on this topic (I would be really excited to hear if anyone else has exciting ideas, for reasons which will be clear below). This is a little sad, since I’m a perfect candidate for having done something interesting in this area. I’m pretty technophilic, even for a mathematician, and am currently writing two different papers with two different people in Germany, and working on another paper in a group of 4 where I don’t think more than 2 of us have been in the same state simultaneously in over a year. Almost all this work has been done over email, or face-to-face, with occasional phone conversations and one video chat on Skype (but with no attempt to write anything on a board or paper, just gesticulation). In particular, the last paper I mentioned has been written entirely while we were all permanently in different locations (me in Princeton and Boston, one in Oregon, one in California, and one in Amherst), and generated an enormous number of emails, I think around 500 (thank Gbus for Gmail).

So, why haven’t I done anything more exciting? Well, as Nathan said, the main reason is I just haven’t found the killer app that seemed worth investing in. There are online collaborative word processing programs, but none which do LaTeX well, to the best of my knowledge, so it’s easier to just pass files around via email.

The other problem is that if I did find a program I liked, I would then have to convince my coauthors it was worth using, and they’re, on the whole, more skeptical about these things than I am. I mean, with some of them, getting BibTeX was something of a fight, and I think BibTeX is about the best LaTeX add-on in history. More generally, coauthors disagree about what technology is useful or convenient. I mean, Nick Proudfoot and I still have half-joking arguments about whether emacs or vi is superior (for the record: I’m on the side of emacs).

I think the other issue is my work-style. Even when collaborating on research with someone, I’m a very solitary worker. I prefer to go think on my own for a while, and then meet again if I get stuck or find something interesting (or vice versa), which means that I often don’t find email too limiting. I mean, it would obviously be better to be able to meet face-to-face whenever it felt necessary, but I think it feels necessary for me less often than it does for many people.

37 thoughts on “Request: long-distance collaboration”

  1. LaTeX-equipped wikis can be pretty nice, as a supplement to (but not replacement of) email/phone/Skype/IM/whatever….

  2. What about subversion? I’m thinking about setting up a subversion server to track changes on my own papers; it seems like a pretty good call for a paper being jointly done. More for writing the paper than working out the argument, though.

  3. I’d like to see more knowledge of version control software, such as Subversion, Mercurial, and so on — especially distributed version control software (Mercurial, git, Bazaar, darcs, and so on). Mathematicians seem very ignorant of such software, even though it can make collaboration much easier.

    If you can learn a complex software system like LaTeX, you can learn to push, pull, and merge changes with your coauthor(s) using version control software!
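
    As a rough illustration of what “merge” means here: a minimal sketch of a line-based three-way merge, assuming each coauthor edited their own copy of the same base version. The names `changes`, `merge`, and `MergeConflict` are made up for this example, and real tools such as Subversion, Mercurial, or git use more careful algorithms; this only shows the idea.

```python
import difflib


class MergeConflict(Exception):
    """Raised when both sides changed the same region of the base text."""


def changes(base, other):
    """List the edits turning `base` into `other` as (start, end, new_lines)."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge(base, mine, theirs):
    """Combine two independently edited copies of `base` (lists of lines)."""
    my_edits = changes(base, mine)
    their_edits = changes(base, theirs)
    for a1, a2, a_new in my_edits:
        for b1, b2, b_new in their_edits:
            if (a1, a2, a_new) == (b1, b2, b_new):
                continue  # both coauthors made the identical edit
            # Conservative rule: overlapping ranges, or edits starting at
            # the same base line, are reported instead of guessed at.
            if not (a2 <= b1 or b2 <= a1) or a1 == b1:
                raise MergeConflict(
                    f"both sides edited base lines {min(a1, b1)}-{max(a2, b2)}"
                )
    merged = list(base)
    edits = my_edits + [e for e in their_edits if e not in my_edits]
    # Apply from the bottom up so earlier line numbers stay valid.
    for start, end, new_lines in sorted(
        edits, key=lambda e: (e[0], e[1]), reverse=True
    ):
        merged[start:end] = new_lines
    return merged


if __name__ == "__main__":
    base = ["\\section{Intro}", "We prove X.", "\\section{Proof}", "Trivial."]
    mine = ["\\section{Introduction}", "We prove X.", "\\section{Proof}", "Trivial."]
    theirs = ["\\section{Intro}", "We prove X.", "\\section{Proof}", "See Lemma 2."]
    print("\n".join(merge(base, mine, theirs)))
```

    Edits to different parts of the file are combined automatically; when both coauthors touch the same lines, the sketch raises `MergeConflict` rather than picking a winner, which is roughly the point at which a real tool asks a human to decide.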

  4. When I was a PhD student, some of my applied math contemporaries raved about the benefits of using a version control system such as CVS. The idea is that rather than emailing copies around and having to keep track of whose version is latest, the CVS system locks copies, tracks previous edits and so on. Plus people could log in and make changes as and when they wanted, provided someone else wasn’t simultaneously doing so.

    Having said that I never used it, and it certainly sounds like a non-starter for colleagues who are reluctant even to go for BibTeX.

  5. I recently set up SVN for a budding collaborator of mine, in order to make revisions of papers-in-progress easier. With TortoiseSVN (he’s on Windows), seeing changes and dealing with them ends up being quite easy. Generally, I make sure I keep all my own LaTeX code in a backed up source code repository at all times; it gives me editing history and storage security: things that aren’t all bad for collaborations either.

    Furthermore, I very recently discovered that both Miranda and Pidgin support a LaTeX plugin: Instant Messaging with pretty wide-reaching LaTeX support!

  6. Mikael: which Pidgin plugin do you use? I tried the pidgin-latex project at sourceforge but had trouble getting it to work properly. Is that the one you use, or is there a better one?

  7. I’ve also been using SVN for a number of years now (and CVS before). The motivation was mostly to have easy backup and history, but it’s quite useful at least for the write-up part of a collaboration, even if only one person uses it to keep the “master” copy of a paper (or a book; in fact, I think for collaborating on a book, where many writing issues are magnified, such a setup is really extremely useful).
    I’ve discussed my setup briefly in this post:
    http://blogs.ethz.ch/kowalski/2008/02/23/version-control/

  8. Jay: I use the pidgin-latex plugin. Haven’t done much with it yet, but when I tried it out it worked neatly.

    You have to take care only to use
    $$ [some formula] $$
    as it won’t recognize $ .. $.

    What problems have you had?
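
    To illustrate the delimiter rule just described: a minimal sketch of picking out only `$$ ... $$` formulas from a chat message. This is not the plugin’s actual code; the pattern and the function name `extract_display_math` are assumptions made for the example.

```python
import re

# Matches text between a pair of double-dollar delimiters. Single-dollar
# inline math is deliberately not matched, mirroring the behaviour
# described in the comment (hypothetical pattern, not taken from the plugin).
DISPLAY_MATH = re.compile(r"\$\$(.+?)\$\$", re.DOTALL)


def extract_display_math(message):
    """Return the formulas written as $$ ... $$ in `message`, trimmed."""
    return [formula.strip() for formula in DISPLAY_MATH.findall(message)]


if __name__ == "__main__":
    message = "so $$ e^{i\\pi} = -1 $$ holds, and $x$ is left alone"
    print(extract_display_math(message))
```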

  9. I use svn and am continuously surprised that other people don’t. I’ve tried to use it for collaboration but for most people it seems too difficult. I feel the same thing about emailing documents back and forth….

    kopete also has a latex plugin for IM.

  10. Like the previous posters, I’ve had good experiences using version control systems (e.g. CVS, subversion) to manage the LaTeX files of papers I’m writing collaboratively. I’ve found the benefits are especially large when there is more than one other person involved, or when there are more than one or two files, e.g. if you have lots of figures.

    While the person who manages the central repository where the paper is stored needs a certain level of technical acumen, the other users don’t; the Tortoise front-ends that Mikael mentions make it quite easy to use, even on Windows. Also, the distributed systems that Dan mentions give you a way around the problem of needing all collaborators to have accounts on a common machine, or one of the collaborators running a server, as is the case for traditional tools like CVS.

    The PracTeX journal had an issue last year devoted to collaborative issues, which includes an introduction to a common version control system, Subversion.

  11. Also, regardless of how one moves the LaTeX files around, I’ve found the “latexdiff” program very useful for seeing what one’s collaborators have been up to. It produces output like this where you can clearly see what changes have been made. I also find it helpful to double-check my own edits, to make sure I haven’t introduced any new typos in my attempts to remove the old ones.
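
    For a feel of what such a comparison shows, here is a minimal sketch of a word-level diff between two versions of a sentence, using Python’s standard `difflib`. It is only a stand-in for the idea: latexdiff itself produces marked-up LaTeX, while the `[-...-]` and `{+...+}` markers and the function name `word_diff` below are choices made for this example.

```python
import difflib


def word_diff(old, new):
    """Mark words removed from `old` as [-...-] and words added in `new` as {+...+}."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    print(word_diff("Let $G$ be a group.", "Let $G$ be a finite group."))
```

    Splitting on whitespace is crude for LaTeX (a long formula counts as several “words”), which is one reason a dedicated tool is preferable for real papers.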

  12. “Having said that I never used it, and it certainly sounds like a non-starter for colleagues who are reluctant even to go for BibTeX.”

    As the colleague who was reluctant to go for BibTeX, I will say that this subversion thing sounds like a good idea, provided that Ben walks me through the procedure. Emailing documents back and forth ten times a day along with itemized descriptions of changes sure was a pain in the ass.

  13. One of my long-term collaborations has a yahoo discussion group dedicated to our projects; the ability to upload files and search through archived posts is very useful.

    Another collaboration has a wordpress group blog. The ability to scribble on each other’s posts is particularly useful (we even agreed on a colour scheme to distinguish each of our comments). And of course, the LaTeX support is a big plus.

    Of course, both of these are members-only, for obvious reasons. These are also collaborations involving three or more people; with just two people, it seems that one can get by on good old email with attachments, plus the occasional phone call, fax, and of course face-to-face meeting (preferably near a blackboard).

  14. I’m sure Scott will have a lot more to say about this, but all the projects I’m collaborating with him on (up to 3 at the moment) we’ve been using SVN both for the LaTeX files and for the Mathematica files. It’s really nice for LaTeX, and kinda nice for Mathematica. On the other hand, I wouldn’t have been able to start using it without an expert in the room, and even now would be a little apprehensive about using it without Scott being reachable (though I mostly have the hang of it).

    SVN is also nice when you are in the same room. Scott, Emily, and I have had various paper writing days where we’re all working on a different section and every time you update and compile it’s added all the stuff that the other people have been working on. This is very satisfying and makes writing somewhat less intimidating.

    As an added bonus if I’m not at a computer I can look up any of the images in our papers on my iPhone via Scott’s SVN server.

    I also do a *lot* of math via IM, so I’m excited to hear about these IM clients which have LaTeX support.

  15. I’ve been using skype together with Windows Netmeeting for a virtual whiteboard, which is okay for just shooting ideas around (if both/all parties are running Windows). We’ve also used whiteboards (like jarnal) which permit use of an electronic pen and easy creation of pdf files, but are not too generous with space (meaning screen area). I myself would be interested in flexible and spacious whiteboards; haven’t tried scriblink yet.

  16. Mikael: took me a while to get it to compile, but that’s my fault (I’m running Linux). Since I’m running Pidgin 2.4.1 I had to update to the latest version in the CVS. Last night every time I tried to trigger it I got an error message saying the .png file didn’t exist where it should. Today I’m not getting an error message, but every math item I send displays as just the image-not-found red x. I’ll probably play around with it a little more this evening–although mathim sounds more likely to convince my less techie friends–but any suggestions you have would be appreciated.

    I’m also glad to see so much support for the version-control stuff. I’m planning on setting up a server for myself anyway, so svn should be pretty effective whenever I need it. (And gives me a free off-site backup).

  17. The pidgin latex plugin works pretty nicely for me. Compiles cleanly, and installs easily.

    However, you should note that pidgin-latex doesn’t display any graphics unless you have the right image conversion programs installed. You can find out which programs by clicking the ‘plugin details’ button in pidgin’s Plugins configuration window.

  18. Which of these solutions (aside from the pidgin-latex plugin) are open source? CVS has the advantage of being available (and standard!) for people running Linux/Unix…

  19. Jay: You might want to check that you have ImageMagick installed and working on your system. IIRC, the kind of magic that the pidgin latex plugin relies on is basically the image conversion tools in ImageMagick.

    rmb: All the version control systems mentioned above are open source, and easily available on Linux/Unix/MacOSX.

  20. Johanna asked me to look through this thread and provide some input. I’m not sure if this is what you’d want to do, but there are some pretty decent whiteboarding applications out there, several of which are free. Basically, the software does what you’d expect: it’s a virtual white board where all participants can see what’s being written. Some have VOIP capability, too (or, if not, you can use Skype concurrently).

    http://socialsourcecommons.org/search/query?q=whiteboard&submit=Search

    I’ve actually used Vyew, albeit for a non-mathematical group, and I thought it worked fairly well.

    Hope this helps.

  21. G’day y’all.

    I recently switched to a version control system for my papers (in fact, for just about everything – dotfiles, tex style files, programs, even my website is version controlled). As well as the easy backup and collaboration features, it also makes it easy to transfer files from one machine to another. I use bazaar. I wrote up my initial experiences with the shift on my website and, if anyone’s interested, you can find it here. I did try subversion initially but found that bazaar was much more flexible – it took me a few goes to find a system that I liked and bazaar was easier to rub out and start again with. I’d never used a version system before and consider bazaar extremely easy to use (mind you, I am a bit of a Linux geek, so take “easy” in that light). However, as Canonical are developing it, I think that they are trying to make it as easy to use as possible whilst retaining all the features. The online documentation – see the ‘links’ section of my website – is extremely useful and I recommend reading it before setting up a system. One extremely useful feature of bazaar (may be in others as well) is that it can use sftp (ie ssh) as a protocol.

    On the page linked above you’ll also find a link to a wikibook about using versioning systems for LaTeX papers. That gave me a few ideas before setting out.

    I would also add that it is extremely useful even if you don’t have collaborators. Having a ‘ChangeLog’ helps a lot too.

    Oh, and Emacs has an extremely useful macro: ‘add-entry-to-changelog’ so you can easily keep that up to date as well.

    (For the record, I use both emacs and vim).

    For those using, or considering using, such a system I’d recommend reading the section on line breaks on the page on my website.

    Another program that I have recently installed and instantly found useful is xournal. Amongst other things, it provides for easy annotation of PDFs. This might be useful if, say, in the collaboration you have a “designated writer”. Not sure if it has a LaTeX mode, though.

    Regarding long-distance interactive tools, I’d love to hear more about what works. For ‘real time’ interaction, I suspect that something LaTeX-enabled isn’t going to work. Where you need LaTeX capabilities is when the mathematics gets too hard for ascii (I’d be willing to bet that most readers of this can read a line of simple LaTeX code without needing it to be compiled) and then it can take a few goes to get it to look right, spoiling the immediacy of real time interaction. What I’d be most interested in is something like a mini-interactive whiteboard, so a graphics tablet connected to some program that everyone involved could see the output.

    For non-‘real time’ collaboration, something more wiki-ish is probably better. Though I’d be interested to know why a wiki is better than a forum and why everyone is so keen to be able to put LaTeX on the actual webpage. Although mathml is great, I don’t think that the current system is ideal. The general method seems to be to code something that is a little but not entirely unlike LaTeX which is then processed into mathml (eg via the amazing iTeX2MML). This looks nice, but if one wants to quote something then one has to essentially reverse engineer the code. Moreover, the various filters are not extendible so typing a long section of mathematics quickly becomes tedious (and cut-and-pasting from a genuine LaTeX document is fraught with difficulty). Wouldn’t it be so much simpler to upload a PDF?

    Here’s a thought to close with: imagine doing a spectral sequence calculation with version control. Each commit would be a new ‘page’ so you could step backwards and forwards through the sequence at will to see where all the differentials vanish. He he.

    PS Sorry if this post got a bit long …

  22. Andrew-

    Certainly most mathematicians can read bare LaTeX but it’s a bit like following a formula spoken out-loud: it gets hard fast. It happens to me semi-regularly that I don’t have the patience to read a MathSciNet review in plain-text; I’m sure I could, but it just doesn’t seem worth it. Math is often hard to follow under optimal conditions, so taking even one step out of the processing can make it better.

  23. Ben,

    You misunderstood me, so I apologise for not stating my point clearly enough. I was not saying “Why do we want LaTeX->(math|ht|x)ml when we have ASCII?”, I was saying “Why do we want LaTeX->(math|ht|x)ml when we have LaTeX->PDF?”. The obvious first objection is that it’s too much hassle to compile a PDF for a simple expression such as -e^{\pi i}=1. I was attempting to anticipate this by saying that for simple expressions, ASCII is good enough. But for anything for which ASCII is not good enough, probably neither is LaTeX->(math|ht|x)ml. The reason being that if ASCII is not good enough, the expression is probably complicated enough that it should be previewed before posting and at that point the advantages of LaTeX->PDF vastly outweigh the disadvantages (IMHO).

    The primary advantages are the following:

    WYSIWIP: What you see is what I posted. One of the major problems, for mathematics, for (math|ht|x)ml is that the display is not sufficiently under the author’s control. A simple example can be found on the n-cafe TeXnical issues thread on placement of primes.
    Speed. Any post should be previewed before submitting but that goes double if it has maths in it. Having to send the data and wait for it to be processed then sent back often takes quite a while. My most complicated LaTeX document took 2.56s to compile. I don’t think I’ve ever had a preview request from the cafe that quickly. Of course, I could install the correct filter locally, run my post through that until I’ve gotten it right, and then post but how many people have the know-how to do that? But that brings me on to my next point.
    Not having to learn yet another markup. I’m very impressed by filters such as iTeX2MML. But they are very definitely not LaTeX->mathml. They are a new markup language and you have to learn new syntax to use them (I’m always having to look up how to do aligned equations in iTeX). I don’t know whether or not it is possible to write a converter that does honest LaTeX->.*ml (tex4ht seems pretty close, but that may be overkill for a blog filter) – I’ve tried writing one myself (babytex) so I appreciate the difficulties. In addition, since there are several filters, different sites will use different filters and one has to remember how to write certain equations each time.
    The Power of Emacs (or vi). Writing any markup without using Emacs feels like trying to play squash with both hands tied behind my back. It’s related to my last point: I know Emacs, I know how to use it. Writing in a tiny box on a webpage just isn’t the same.

    This, of course, all relates to non-real time interaction. For real time interaction then I don’t think any system involving LaTeX will do and one would be better off with a graphics tablet and an artistic package (proving that maths is an art, after all).

  24. I don’t think I’ve ever had a preview request from the cafe that quickly.

    My impression is that the delays involved with posting comments on the Cafe are not related to the processing of the math typesetting, but result from a wealth of other tasks that run in the background, such as email notification and notably anti-spam measures.

  25. I did not mean to disparage the system at the cafe! I used the cafe as an example because that is the one I have most experience of using but my remarks were intended as being a comparison of local vs remote. Namely, whatever task I wish to do – whether it be preview a post on a blog, compile a LaTeX document, or anything – it is highly likely to be faster if I can do it locally rather than remotely.

    It’s mildly interesting that there are two seemingly opposite trends in current computing: distributed computing and remote applications. Actually, they are quite complementary. After all, while I’m waiting for the remote system to notice that I’ve typed another character into my amazing article, the computer in front of me may as well be using its resources to scan for intelligent life on this planet (“Here I am, brain the size of a planet, and they ask me to pick up a piece of paper.”).

    If I were posting to the cafe at a much higher rate than I currently do (something like your rate, Urs!) then I would set up a system like I describe above with a local version of the filters to convert my post to mathml and then post the raw mathml.

    Actually, that wouldn’t be hard to do …

    Mind you, the delays you mention and the delays that I mentioned have null intersection. I was talking about the delay between first hitting “Preview” and (after several reviews) finally hitting “Post” whilst you are talking about the delay between hitting “Post” and the comment actually appearing. The delays in the former set are, I imagine, purely due to data processing and transfer.

  26. As the maintainer of itex2MML, let me make a few comments.

    1) It is really fast — much faster than any TeX implementation you will ever see. There are lots of things that are slow at the n-Category Café but itex2MML isn’t one of them. Because it only concerns itself with equations, rather than general page-layout, even a relatively un-optimized parser can beat the pants off TeX.

    2) It’s certainly not LaTeX, but it is designed to be as close as possible to AMSLaTeX. For aligned equations, it supports the standard AMSLaTeX environments:

    aligned, gathered, split

    It doesn’t support eqnarray because eqnarray is crap. I’m very happy to add other idioms of AMSLaTeX, as users demand them.

    3) Currently, it operates as a Unix stream filter, and has native bindings for Ruby. I’m happy to add native bindings for Perl or Python if anyone wants them. I want to make it as easy as possible to incorporate in other blogging/wiki/… systems.

    4) I am sympathetic with your idea of doing more of the requisite processing locally, rather than remotely. But the likelihood, say, of getting something like itex2MML to work in Javascript is rather slim.
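
    Since the filter is described as a Unix stream filter (text in on standard input, markup out on standard output), a local preview step could be as small as the following sketch. The function name `run_stream_filter` is made up for this example, and the executable name in the comment is an assumption; the demonstration uses a harmless stand-in filter so that it runs without anything extra installed.

```python
import subprocess
import sys


def run_stream_filter(command, text):
    """Send `text` to the filter's standard input and return its standard output."""
    result = subprocess.run(
        command, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in filter that just upper-cases its input. To preview a real post one
# would pass the locally installed converter instead, e.g. ["itex2MML"] if an
# executable of that name is on the PATH (assumed name, not verified here).
STAND_IN_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    print(run_stream_filter(STAND_IN_FILTER, "x^2 + y^2"))
```

    Everything here happens on the local machine, which is the whole appeal of the edit-locally, paste-when-done workflow discussed in this thread.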

  27. Jacques, Urs, and anyone else still reading, …

    I have no criticism to make of the cafe or any of the machinery that underlies it! I’m extremely impressed by all of it, especially itex2MML (by the way, I have a modified version of your MT plugin for blosxom if you’d like it).

    What is slow about the cafe is that if I want to check that what I’ve written looks right, I have to send the entire post to the remote machine, let it process it, and wait for the result to be sent back. That machine may be fast, the link may be over a high-speed line, but still it’s over there and that’s what makes it slow.

    I’m only using the cafe as an example because I’ve used it.

    I’ve seen people, such as Urs, saying that they tend to prepare their comments in a separate editor and then paste them in. All they need is a previewer to check that it looks right and ta-da! That’s the model I’m suggesting, not a javascript implementation (which still involves network communication, if only to send the initial script).

    My point, which is getting a little lost, is that – great as they are – maybe remote filters are the wrong way to implement this. And maybe part of the problem is that “this” has not been clearly defined.

    So, just to try to ensure that I’m not misunderstood (and to make sure that neither Jacques nor Urs add an extra line to the cafe filters directing all my comments to /dev/null):

    The cafe, and the supporting infrastructure, are fantastic.

    But I’m not sure that they make a good prototype for wide-spread adoption because, as far as I can see, it works by taking existing systems and adapting them for maths rather than building something specifically designed for maths from the ground up. If one is going to adapt something, I’d rather adapt LaTeX to the web than the web to LaTeX.

    PS I agree – eqnarray is not worth implementing. What I have to keep reminding myself is that it uses ‘aligned’ and not ‘align’. Your offer is great, but exposes one of the problems I mentioned earlier: to extend itex, I have to email you with a recommendation and hope that you implement it.

  28. …by the way, I have a modified version of your MT plugin for blosxom if you’d like it

    By all means, send me a link, and I will publicize it. (If you are interested, I could even include it in my distribution — see below.)

    I’ve seen people, such as Urs, saying that they tend to prepare their comments in a separate editor and then paste them in. All they need is a previewer to check that it looks right and ta-da! That’s the model I’m suggesting, not a javascript implementation (which still involves network communication, if only to send the initial script).

    AbiWord uses itex2MML and GtkMathView for its Math editing. So you could edit things in that word processor.

    A Javascript-based solution (like jsMath) would not involve round-tripping the content to the server. But it would still be slow. And I’m unimpressed by the Javascript-based previewers, like the plugin for WordPress that allows some sort of rudimentary preview of one’s comments before posting.

    …it works by taking existing systems and adapting them for maths rather than building something specifically designed for maths from the ground up.

    That’s indeed a problem. But, it’s really a second-order problem. The first-order problem is that I have my own research to do, and haven’t the time to write a comparable system from scratch.

    What I have to keep reminding myself is that it uses ‘aligned’ and not ‘align’.

    The simple mnemonic is that I’ve implemented those environments which work inside an equation context. Thus: gathered instead of gather, aligned instead of align, and split (which, despite not having an “ed” suffix, only works inside an equation context).
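
    In plain AMS-LaTeX terms, the distinction behind that mnemonic looks like the following small document. This is standard `amsmath` usage, shown only to illustrate the difference between the two families of environments; the exact input a given web filter accepts may differ in details.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% align is a top-level display environment: it is not nested inside equation.
\begin{align}
  (a+b)^2 &= (a+b)(a+b) \\
          &= a^2 + 2ab + b^2
\end{align}

% aligned lives inside an equation context, here the equation environment.
\begin{equation}
  \begin{aligned}
    (a+b)^2 &= (a+b)(a+b) \\
            &= a^2 + 2ab + b^2
  \end{aligned}
\end{equation}

\end{document}
```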

    Your offer is great, but exposes one of the problems I mentioned earlier: to extend itex, I have to email you with a recommendation and hope that you implement it.

    Actually, you can check out a copy of the source code BZR repository, make whatever changes you have in mind, and then send me an email with the URL of your BZR repository, and I’ll pull your changes from there.
