July 30, 2008

Revision control for LaTeX: in search of an answer

By in Software, Writing


As a die-hard LaTeX addict, I hate to admit that there is nothing in the TeX universe comparable to the amazingly simple and intuitive revision tracking that Microsoft implemented in Word. OpenOffice apparently has an equally powerful change-tracking feature built into its Writer.

Those of you who have ever ventured into the territory of TeX-based collaborative writing certainly know how painful it can be to keep track of changes among several authors in TeX. TeX sources are raw text, so if you need proper diffing or revision tracking you will probably have to resort to a revision control system (RCS) such as Subversion or Git. Revision tracking via an RCS, however, can be a nightmare to set up and to learn to use fluently if you’re not already familiar with some basic notions of software revision control.

After an ugly number of emails exchanged with coauthors to let each other know who was doing what with a manuscript, I decided to search the Web for an answer.

Unfortunately, I couldn’t find anything particularly useful or enlightening. I came across this nice tutorial recently published in PracTeX describing a few steps to make the most of revision control in LaTeX via SVN and some dedicated packages:

U. Ziegenhagen, LaTeX document management with Subversion, The PracTeX Journal, 3 (2007).

(incidentally there’s also a SourceForge project for OpenOffice-SVN integration)

Using this solution still means convincing my collaborators to learn SVN, which I am not sure is a viable option. I anticipate that the first person who comes up with a good solution for revision tracking in LaTeX (e.g. a server-side tool similar to revision control systems but optimized for LaTeX and with good integration with a variety of TeX front-ends) will definitely become rich and famous.

Until this happens, I’d be curious to hear what strategies people use for TeX-based collaborative authoring.


49 Responses to “Revision control for LaTeX: in search of an answer”

  1. Rob Hyndman says:

    When I saw your post title, I was hoping that you had an answer to this problem! I’ve searched, and an RCS seems to be the only available solution, but the costs of doing that are relatively high. My manual low-tech solution is to append an incrementing number to each version of the tex file (e.g., paper1.tex, paper2.tex, …). Then I use CSDiff or CompareIt! to identify the changes made.

    One problem with RCS and my manual system is that the addition of a single word can lead to changes in every line of the paragraph due to auto-wrapping.

  2. T says:

    LyX is able to track changes. It works quite well.

    Most of the time I work with paper copies, when possible.

    Skim (on Mac) provides a useful notes/comments feature for PDFs.

    But I haven’t found any long-term solution yet.

  3. Miguel says:

    I know it’s what you want to avoid, but I have to recommend using git (or svn if you must) for that kind of task. Sure, it will take some effort to learn, but it’s an investment that will pay off in many other areas, like programming.

  4. If you’re collaborating with just one other person, I find that email + rcs works ok. It’s not the perfect solution (you need to be pretty explicit about who is editing the document at each point in time), but it at least allows me to keep track of what I’ve changed, and what my co-author changed.

    I keep looking for better solutions, but I’ve yet to find an interface where I’m the only one who needs to know about the RCS. http://gist.github.com/ looks promising but requires copying and pasting and some knowledge of how to work with git/GitHub. Dropbox (http://www.getdropbox.com/) is another possibility – you can share files, and revisions are automatically tracked, but there’s no way to programmatically get at previous revisions. (I can send you an invite if you want to try it out)

    I think my ideal solution would have an email front end and git back end. Email is how most academics deal with the problem now, and would provide a familiar interface. To use the system, you’d just cc a special email address and the site would do the rest – the git back end would make it easy to track changes and resolve conflicts, and advanced users could work directly from the command line.

    Rob: a good diffing algorithm should be able to ignore those whitespace related changes – http://code.google.com/p/google-diff-match-patch/ does a particularly nice job.

  5. dario says:

    Thanks all for sharing these tips, which confirm my initial impression about RCS as the best available solution.

    @T, I didn’t know LyX supported revision tracking. I’ll definitely check it out (although I don’t see myself abandoning my current TeX editing tools anytime soon).

    @Hadley, mail+Git sounds like an interesting solution. After all, there are wiki engines and hosted wiki farms out there that accept revisions by mail or SMS, so I don’t see why this should not be possible with an RCS back-end.

    I guess any viable solution for non-programmers would need at the very least to:
    1. make the commit process seamless (either via email or via plugins directly integrated in the front-end)
    2. allow easy diffing/merging between TeX revisions, avoiding the annoying issues pointed out by Peter and others.

    (@Peter, incidentally, I would not recommend CVS to someone willing to learn revision control from scratch).

    Regarding diff optimized for LaTeX, there is a latexdiff package at CTAN as well as a Perl script (not actively maintained any more though) that may turn out to be useful.

  6. Peter says:

    I use cvs out of old habit, but this works equally well with roughly any other VCS. Here is a workflow that keeps the number of conflicts to a minimum for me:

    Initially: One person (you) creates the document, puts in all the usepackages, and makes all the sections in the .tex. Make sure the document renders, and check in the document and the .cls.

    Others: check out that document.

    Repeat:

    1) cvs update
    2) Edit “your” section.
    3) cvs update, check for any messages about conflicts (and solve them).
    4) cvs commit -m 'Nice short description'

    Always stop on point 4 when you leave for the day, even if your work is unfinished. Not having thousands of lines of uncommitted content will make it a lot easier to resolve any conflicts, but will make it harder to back out changes. Since this is an article of 50 pages (tops), chances are you will finish it before you need to make any backouts. Make sure your commits never break the document rendering, since that breaks it for all the others too.

    If you stick to editing things in “your” designated section and never change the names of sections, you will not have any conflicts. If you need to edit someone else’s section, sync with them through email/phone or simply tell them to insert your paragraph between these two other paragraphs.

    The others have to learn two commands: cvs update and cvs commit, nothing else. Those who know the RCS get all the goodies like cvs diff, commit history and so on.

  7. Rahul Premraj says:

    In my experience, CVS and SVN are perhaps the best options for this purpose. I use both extensively with my co-authors and the work flows seamlessly.

    If you find the use of these systems difficult, you can consider using GUI tools that make this process much simpler, even for new users. Two tools I can strongly recommend are SmartCVS and SmartSVN.

    Cheers!

  8. Ian Mulvany says:

    Hi,

    For comparing two LaTeX files, looking at output from ‘diff’ is a pain in the ass, so I whipped together a small script that takes file1.tex and file2.tex and produces a diffFile.tex. The diffFile.tex uses the packages correct.sty and color.sty to create a pdf that has the same kind of markup that you get from track changes.

    It works by serialising both initial files into word lists, and doing a diff on the lists. There is some regex work going on to take care of some LaTeX syntax, but it’s not 100% robust.
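
The word-list approach described above can be sketched in a few lines of Python. This is not texdiffer’s actual code, only a minimal illustration built on the standard-library difflib module. The \added{} and \deleted{} macro names are placeholders (assumed to be defined in the preamble or supplied by a package), and the tokenizer naively splits on whitespace without understanding LaTeX commands, math or environments.

```python
import difflib


def tokenize(tex):
    """Split LaTeX source into words; line wraps and extra spaces vanish."""
    return tex.split()


def mark_changes(old, new):
    """Return the new text with placeholder \\added{} / \\deleted{} markup.

    The macro names are assumptions for illustration only.
    """
    a, b = tokenize(old), tokenize(new)
    out = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(b[j1:j2])
            continue
        # "replace" yields both parts, "delete" and "insert" only one.
        if i2 > i1:
            out.append(r"\deleted{" + " ".join(a[i1:i2]) + "}")
        if j2 > j1:
            out.append(r"\added{" + " ".join(b[j1:j2]) + "}")
    return " ".join(out)
```

Because the comparison runs on words rather than lines, two versions that differ only in where the editor wrapped the lines come out without any markup at all. The output is a single long line, so a real tool would also have to restore paragraph breaks.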

    I’ve had this stuff sitting on my hard drive for a few years, but wasn’t using it any more, so I’ve just popped it up on Google Code:

    http://code.google.com/p/texdiffer/wiki/QuickStartGuide?updated=QuickStartGuide&ts=1217515515

    If I get time over the weekend I’ll post the other version of the code, which might be more robust, but I can’t remember.

    Let me know whether this is at all useful.

  9. gray says:

    TortoiseSVN is a pretty intuitive Windows frontend for Subversion. I don’t know how seamless you need it to be, but pretty much everything can be done from a right-click on the appropriate folder or file.

    I haven’t used it for co-authoring, but I do use it for backing up my work — all of the repositories are on the university server, and the local copies are on my laptop. If I were working with other people, I assume all they would have to do is install TortoiseSVN and checkout copies onto their own machines.

  10. dario says:

    @Ian, texdiffer looks fantastic, thanks for sharing this.

    @gray, I agree, TortoiseSVN is recommended by most of my Windows-based developer collaborators. I’m on Mac OS and for my development projects I use a combination of command line and Subclipse (a plugin for Eclipse), although I’d never recommend Eclipse for collaborative authoring (any of you using TeXlipse by any chance?). I wish there was some SVN interface built straight into TeX front-ends (as per my comment above) so as to make updating, diffing and merging as seamless as saving a document in a word processor.

  11. Simon Spero says:

    There’s a finder plugin for macos that covers much of the same ground as TortoiseSVN:

    See http://scplugin.tigris.org/

  12. vita says:

    hi all, i tried latexdiff with a very satisfactory result.

    it compares two similar tex files and creates one with changes in a graphical way similar to word or openoffice.
    the revision is also possible.

    for Windows users latexdiff is a part of MiKTeX, for Linux download from ftp://cam.ctan.org/tex-archive/support/latexdiff.tar.gz

    The latexdiff script makes use of the Perl package Algorithm::Diff (available from http://www.cpan.org, current version 1.19)
    so you need to have perl installed.

  13. ol says:

    To extend on tracking changes using LaTeX… do some of you LaTeX users have to collaborate on papers with people who don’t even know that there are alternatives to MS Word? I had to quit LaTeX because of that… It was too much of a pain to go through the convert/adapt/…/ process for the colleagues to simply be able to open the document :-( Any suggestions?

  14. vita says:

    i think that tracking changes is very helpful for publishing with non-tex people. these people can just fill tex files with text (probably written in the familiar MS notepad:) and you can do the tex-work yourself.
    i have tried this approach several times. of course it depends on the effort/effect ratio of a document. maybe in future the other people will appreciate not only the quality of latex-made documents but also the advantages in creating them (equations, bibtex … and index without repeating pain)

  15. pqs says:

    I’m using Bazaar for this purpose. It’s a distributed version control system that is very easy to use. What I love is that it doesn’t need a special server: it can upload your tree to any sftp server. Its commands are very intuitive (it is used by the Ubuntu community) and it is multiplatform, as it is written in Python.

  16. Duncan says:

    You may want to try the trackchanges package from sourceforge. This can be used to (as the name suggests) track changes.

  17. Uwe Brauer says:

    Hello

    I will first describe what I am using, and then point out what I am looking for, namely some wiki-like collaborative LaTeX-based software.

    – What I am doing.
    Diff tools:

    Let me first describe some diff tools I find handy: mgdiff,
    which is only linewise; wdiff, which is wordwise but whose
    formatting is not very attractive; and, within (X)Emacs,
    ediff, which can compare wordwise, whose output is nicely
    formatted and which, most importantly, has nice merging tools (see
    below). However, the ultimate diff tool is, no doubt,
    *latexdiff*
    (I just tried out ldiff and it does not come
    close). Latexdiff also handles files under version control,
    which is very nice.

    – Revision tools. Within XEmacs I use RCS with the vc
    backend; my colleagues (still) don’t. So it works
    roughly as follows.

    I start with my version, say result.tex, and use, within XEmacs,
    vc-visit-other-version
    I select the latest revision and obtain result-rev-1.1.tex
    I send that version to my colleague, who makes his
    changes and sends the file back to me. I check that new file
    in using RCS. Then I run
    `latexdiff-vc-fast-file-current-inferior’
    a small lisp hack which calls latexdiff-vc in an
    appropriate manner.
    This generates result-diff1.1.tex, which I run pdflatex
    over, and I send the resulting pdf file back to my colleague.

    I apply my changes, check out,
    run
    vc-visit-other-version (1.3)
    `latexdiff-vc-fast-file-current-inferior’
    This generates result-diff1.2.tex, which I run pdflatex over, and
    then I send both files to my colleague.

    Works pretty well.

    – MERGING. It can happen that two versions have to
    be merged. Then meld, or ediff within XEmacs, are good
    tools. A merge, however, is cumbersome, no doubt.

    – What I would love to see, as I said, is some wiki-like
    collaborative LaTeX-based software. It seems that
    LyX may offer something like this in the future.

    Uwe Brauer

  18. Rich says:

    I use a wiki for my research, which is not collaborative in my case, but could be. I then export to latex format via Word and Word Macros. So my changes are recorded in the wiki.

    Easy.

  19. [...] and merge changes.  You can quickly see who contributed what to a paper.  Dario Taraborelli wrote about this a few months ago, though his point was that you would need your collaborators to be familiar with a [...]

  20. James Allen says:

    I use http://www.scribtex.com which is somewhere between a wiki and Google Docs for LaTeX. It allows collaboration and version tracking and seems to fit your requirements nicely.

  21. Dave says:

    You use LaTeX but a VCS is too intimidating?

    Compared with learning LaTeX markup, svn or git are simple (at least in their basic functionality).

    For collaborative work, I would go with PmWiki, which has a wonderful plugin that exports wiki pages to LaTeX. You could set up a password protected wiki. You and your coauthors could write your paper collaboratively and then export the end result to LaTeX.

  22. Adr says:

    I agree with Dave,

    git add fileyouwantversioncontrolon.tex
    gvim fileyouwantversioncontrolon.tex
    edit edit edit
    git commit -a -m "This is what I changed in the file"

    How hard can that be?

  23. Ian Mulvany says:

    Dave, Adr,

    I think the problem is that, though writing LaTeX is straightforward, reviewing a lot of changes in a LaTeX file by just looking at the source code is painful. A visual representation of the changes is very helpful when it comes to checking revisions. Version control systems don’t give you this.

  24. Dan says:

    I have been playing around with \usepackage{changes}, available on CTAN. It allows for edits to be made and highlighted in color in the typeset document. It’s not ideal since there’s no way to accept changes other than removing the markup in the source file, but it is something, and simpler than setting up version control. Also, the documentation is in German so you sort of have to figure it out as you go (but at least the commands are in English, e.g. \added[]{this is my inserted text}).

    There is another package called trackchanges but it is not in CTAN; it’s hosted on Sourceforge and it apparently allows for comments as well as edits, but I haven’t tried it (yet).

  25. dario says:

    There’s a discussion on Slashdot started today on the same topic.

  26. Uwe Brauer says:

    Hello

    I think noosphere, the underlying software of planetmath, is
    *precisely* (:-) ) what one is looking for:
    wiki(pedia) sort of collaborative software, using Latex.

    Uwe Brauer

  27. Thanks for latexdiff, it’s great!

    And thanks for emacs with ediff and merge, it’s also great!

    A few questions about DIFF and MERGE:

    - How do you configure emacs to make DIFF, and, most important, MERGE, ignoring line wraps, extra spaces etc.?

    - Is there anything like latexdiff3 ?

    I would love the following MERGE:

    A tool that compares wordwise and ignores single line breaks but not double ones. (A tool that compares just as latexdiff to be precise.)

    LaTeX normally does not make a difference between a single line break and a white space, but the text editors keep breaking and appending lines over and over.

    Does anyone know whether such a tool exists?

    More precisely,

    It would be perfect if the tool compared Mine and Yours, with Older as the reference, treating triple or quadruple line breaks as double line breaks and single line breaks as spaces, and then showed me just the real conflicts (words, sentences or paragraphs where both Mine and Yours differ from Older).

    It should allow the user to decide whether the merged version borrows its formatting (spaces and line breaks) from Mine, Yours, or Older.

    Is there anything that does it?
    It doesn’t sound complicated…

    I can imagine something just as simple as that:
    Step 1: Given three files, make a 3-file comparison and apply the spacing and line-breaking style from A to B and C (from B to C in regions where A is empty).
    Step 2: Do the standard merging with emacs, kdiff3 or any other good tool.

    For this purpose it only remains to implement the very simple tool to do Step 1.
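
Step 1 could be approximated by bringing all three files into one canonical layout, rather than transferring the layout of file A onto B and C. The Python sketch below is an illustration under that assumption, not an existing tool, and the function names normalize and normalize_paragraph are made up for the example. It treats single line breaks as spaces and any run of blank lines as one paragraph break, and it keeps a line break after every line containing an unescaped %, so that a comment cannot swallow the text that follows it. It does not protect verbatim environments and does not recognise a % that follows an escaped backslash.

```python
import re

# An unescaped % starts a LaTeX comment that runs to the end of the line.
COMMENT = re.compile(r"(?<!\\)%")


def normalize_paragraph(paragraph):
    """Join the lines of one paragraph, keeping a break after comment lines."""
    out, current = [], []
    for line in paragraph.split("\n"):
        current.extend(line.split())
        if COMMENT.search(line):
            # Ending the output line here keeps the comment from
            # absorbing the words of the following source line.
            out.append(" ".join(current))
            current = []
    if current:
        out.append(" ".join(current))
    return "\n".join(out)


def normalize(tex):
    """Return tex with one paragraph per line and single blank separators."""
    text = tex.replace("\r\n", "\n").strip()
    # Two or more consecutive line breaks mark a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text)
    normalized = [normalize_paragraph(p) for p in paragraphs]
    return "\n\n".join(p for p in normalized if p) + "\n"
```

Running Mine, Yours and Older through normalize before handing them to kdiff3 or ediff (Step 2) should leave only genuine textual changes as candidates for conflicts, at the price of losing each author’s line wrapping. Since every paragraph becomes one long line, a wordwise merge view is still preferable to a purely linewise one.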

    Version Control:

    I’m astonished by how unfriendly it is to work with these tools. I’ve tried hard but I’m just about to give up and go back to my file1.tex, file2.tex, …, file23.tex, …

  28. Uwe Brauer says:

    Well, have you tried ediff-regions-wordwise in (X)Emacs?
    That comes quite close to what you are looking for, but it only works for 2 files.

    Uwe Brauer

  29. Thanks, Uwe.

    But for 2 files we already have latexdiff.

    What would be really nice is latexdiff3 and wordwise ediff3

    About the script for “Step 1″, do you know anything like that?

  30. Uwe Brauer says:

    Well, but you cannot *merge* with latexdiff! With ediff you can!
    With meld as well, but that is not wordwise.

    I have not checked the latexdiff manual very carefully but I got the impression that latexdiff cannot deal with 3 files, but
    you might want to contact the author of that package.

    I did that several times and his responses have usually been fast.

    Uwe

  31. Mica says:

    I might be a little late on this, but Git now comes with graphical front ends written in Tcl/Tk. I run them on Windows at work, and at home on my Mac and Linux boxes. They are called gitk and git gui, and they come with the Git bundle… As for your stubborn collaborators, in the GUI they would only have to click the “commit” button, then select “push” from one of the drop-down menus. Pretty simple :P

  32. Jose says:

    LyX did it for me. Excellent application.

  33. Duke says:

    As far as I am concerned, I prefer git to CVS or SVN, since it works well on a disconnected laptop, and does not try to contact a remote server at every commit.
    But it’s OK, any versioning system is OK and does almost the same thing.
    What is mainly important is:
    + have a Makefile, that builds the latex document
    + fix the file and directory structure of your document, so that everyone ends up editing the same files and nobody adds or removes any during the collaborative process; it is OK, and even better, if only one person is responsible for this structure

    Then just commit around. For the diffs you can use your ordinary source diff, and for a clearer view of the real changes I found something called latexdiff (google for it) and wrapped it in a small shell script that generates a document tree with a diff between 2 git tags (or checksums).
    This way I have a tree with every diff latex file (generated with latexdiff from the 2 versions I want), and I just have to type “make”, to generate a pdf file that shows the diffs (you can even choose how old removed text is displayed as well as the new one).
    This works because of the 2 previous points.
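
A wrapper of this kind might look roughly like the following Python sketch. It is not the shell script described above, only an illustration under a few assumptions: git and latexdiff are on the PATH, the document lives in a single .tex file, and the function and file names (latexdiff_between, old-…, new-…, diff-…) are invented for the example. It relies on `git show <rev>:<path>` printing a file as stored at a revision (the path is taken relative to the repository root) and on latexdiff writing the marked-up source to standard output.

```python
import subprocess
from pathlib import Path


def show_command(rev, path):
    """Command that prints `path` as stored at tag or checksum `rev`."""
    return ["git", "show", f"{rev}:{path}"]


def latexdiff_command(old_file, new_file):
    """Command that prints the marked-up LaTeX diff to standard output."""
    return ["latexdiff", str(old_file), str(new_file)]


def run(command):
    """Run a command and return its standard output as text."""
    result = subprocess.run(
        command, check=True, capture_output=True, text=True
    )
    return result.stdout


def latexdiff_between(old_rev, new_rev, path, out_dir="diff"):
    """Write old, new and diff versions of `path` into `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    name = Path(path).name
    old_file = out / f"old-{name}"
    new_file = out / f"new-{name}"
    diff_file = out / f"diff-{name}"
    old_file.write_text(run(show_command(old_rev, path)))
    new_file.write_text(run(show_command(new_rev, path)))
    diff_file.write_text(run(latexdiff_command(old_file, new_file)))
    return diff_file
```

Calling latexdiff_between("v1", "v2", "paper.tex") inside a repository that has both tags would leave diff/diff-paper.tex ready to be compiled, for instance by the Makefile mentioned above. Documents split over several files would need each included file processed the same way.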

    I developed this just to show some colleagues that latex is still better than word.
    I am looking to adapt this to the latex and git plugins in Eclipse, since it can be a good approach to these issues (I don’t believe in the LyX way).

  34. File diff round-up says:

    [...] on Debian Science List Discussion on Ask Slashdot Discussion on Academic Productivity ScribTex — a online collaborative wiki-like LaTeX [...]

  35. Andres says:

    How about Google Wave? It would be nice to have a bot that would do the editing; version control seems pretty easy.

  36. Marc says:

    there is the trackchanges.sty at sourceforge:
    http://sourceforge.net/projects/trackchanges/
    but I am having some bugs and conflicts. It seems to be a very promising tool for further development, though. Anyone willing to do this?

  37. Uwe Brauer says:

    I just checked. No documentation, and as far as I can see you have to insert your changes with \change comments etc.
    It does not look too comfortable. I have not checked the
    Python scripts though. But my first impression is that latexdiff is better… if it could only merge…

  38. Dan Doherty says:

    I use LaTeX for legal documents, so this topic has interested me for a long time. I have given up on methods that attempt to create a markup from the original LaTeX sources, and instead keep the .PDF of each version of the file (with git for some projects). Then I use Adobe Acrobat (got mine for $117) to compare the two versions, which produces its own .PDF with the markup embedded in it.

    It even finds changes in tables and other dark corners, and does not rely on knowing which LaTeX macros produce text, or on any other impossible-to-generalize means of analyzing LaTeX sources.

    Hope this helps someone.

  39. dario says:

    32 requests to introduce LaTeX support along with revision control in Google Docs have been posted and voted by several hundred users on Google Product Ideas.

  40. Balaji says:

    I saw this blog because I am myself searching for a perfect solution to this. But I still have an approximate solution in my mind that eliminates the need for any RCS. There is the perl script latexdiff here:

    http://www.ctan.org/tex-archive/support/latexdiff/

    I haven’t yet tried it, but here are my plans. I will test it and let you know my results sometime soon. Please contact me by about July end if you don’t hear from me after this – I am busy submitting my thesis and graduating so I may forget.

    1. I use a LaTeX editor called LEd and this system has a feature that allows me to create my own user defined build commands which will be essentially batch programs.

    2. The great thing about this program is that I can create a LaTeX project out of my work and archive it. So each archived version will contain all the LaTeX files in the project. (One of the things LEd lacks so far is an automatic script that will use latexdiff. So I plan to write a batch file for it)

    3. Another thing that LEd really lacks is automatic project file creation based on the main LaTeX file. It should simply pick the main LaTeX file, look for include commands, and some commands for includegraphics and any bibliography commands and automatically generate the project. I think TeXnicCenter does this very well, but it does not have custom batch file utilities and doesn’t have a good in-built DVI viewer so I prefer LEd.

    This should summarize my steps. I hope all this works and if it does, I’ll let you know. In any case, contact me by July end if I don’t revert.

  41. glopglop says:

    I have not yet had the chance to try it, but I just discovered the existence of the fixme package (cf. CTAN), which lets you put editing notes and comments in a latex document. I don’t think it’s been mentioned in this thread. Maybe it is of use to someone.

  42. alfC says:

    Use SVN, CVS, email, diff or whatever, but above all
    % author0: use freaking comments!
    % author1 — be polite
    % author0 — ok :
    please, use comment lines. It is really that easy.

  43. Jaume says:

    I came here looking for a tool that could show the changes in different colors, and latexdiff is exactly what I was looking for.

    I’ve just tried and it is amazing. Thanks!

  44. carlos says:

    just use scribTeX online.
