Tool for the (collaborative) job

“Screws,” another classic from xkcd.com.  The mouse-over text reads: “if you encounter a hex bolt, but you only brought screwdrivers, you can try sandwiching the head of the bolt between two parallel screwdriver shafts, squeezing the screwdrivers together with a hand at either end, then twisting. It doesn’t work and it’s a great way to hurt yourself, but you can try it!” I have similar thoughts about the potential benefits of using the wrong tools for collaborating. 

 

Managing collaborative projects can be a job unto itself. I have five active collaborative projects right now, and each one follows a slightly different system, with its own protocol and collaborative tools, determined by the needs of the project and the personalities of the collaborators. Of course, everybody has their own ideas of the best way to use each of these tools, so there’s room for conflict. Sara’s post about Slack a few weeks ago was great, and I have talked with a few people who are really happy with that. I have used and heard about a lot of other tools, though, each of which seems to be good for many things but has some potential points for discussion. Also, some of these collaborative tools have spilled over into my teaching life, and have become more and more important as my teaching has evolved. In this week’s blog I’ll run through some things I’ve used and some I’ve just heard about, starting from basics.

Tool(s) 0: Skype and email. I really enjoy Skyping with my collaborators so we can have informal conversations and talk in real time about the direction of the project. Email is also really useful (can’t believe I’m writing something so obvious) for needs including sending thoughts at odd hours and sharing more complete ideas which involve notation or longer arguments.

Room for discussion: I think that it sometimes takes much more time to resolve simple questions over email, and it is often hard to communicate “squishy” ideas without some back and forth. So Skype is an important part of my collaborative life. That said, I currently have a collaborator who hates Skyping. She often doesn’t understand what I am trying to say when we Skype. Instead, she wants a list of carefully formed comments and questions, and she wants to think before responding. So email has become even more important in this collaboration than it usually is. This is also a totally valid way to work, but it has been a struggle for me to adjust.  Really both of these technologies are so general that people’s ways of using them can vary wildly, opening up plenty of room for collaborator disagreement.

Tool 1: Dropbox.

Dropbox is sort of the classic collaborative file sharing tool. I am using it in 3 different projects, each in a slightly different way. In all cases, we have created a Dropbox folder and shared it with all collaborators. We generally store references, images, slides, and all the files for the paper we are writing. With internet access, any collaborator can log onto Dropbox from any computer and access the files, edit them, and upload the changed documents to the shared folder. The feature that I like most about Dropbox is that since I have Dropbox installed on my two laptops, all the Dropbox files are stored on these computers and I can access the files when I don’t have internet access. They will automatically sync with the shared folder when my computer next connects to the internet. This can cause a problem occasionally if another collaborator has edited the file in the meantime, but nothing is lost—the later update just gets saved as a “conflicted copy”.

Room for discussion: I love Dropbox, and in fact I think it does its job perfectly. The discussion points here could actually apply to most of the file sharing systems that follow. These stem from my experiences with sharing documents, which have revealed how many conflicting assumptions and attachments my collaborators and I have. Basically, you have to decide who is allowed to change what, and how they will do that.

In some cases, one person has written a draft and then other people have made editing suggestions. In other cases, my collaborators and I have all assigned ourselves sections of a document, written the sections in separate LaTeX files, and then split up editing duties. Both of these are reasonable, as long as everyone is on the same page about what kind of changes people can make. I like sharing the initial writing duties, because then everyone feels like they own the document, but it always stings when someone wants to cut ¾ of your carefully crafted section and change the wording of the rest. So, no matter how it works, plenty of consideration and care are required in the editing process. The sharing technology can both help and hinder this. Some people like to create a new version of a document with edits, so that the old one still exists just in case. This can lead to a profusion of different versions in the Dropbox folder, somewhat maddening to those who love super organized and clean files and folders. Also, edits can get lost, or the same section can be edited by different people, resulting in conflicting changes, and so on. Someone has to go through and synthesize all these changes. All this is no problem if the group is basically on the same page about things and is fairly organized, but it can get messy, especially since LaTeX isn’t super helpful here.

Two hacks for better organization/easier LaTeX via Dropbox collaborative editing that I learned from some great collaborators:

  • Create one main skeleton .tex file, which has all your packages, commands, authors, title, and such in the preamble, and has the actual \begin{document}, \maketitle, and \end{document}. All the actual content should be in a separate .tex file for each section. These are merged by typing \include{section name here} in the body of the main file. For example, if you had a separate .tex file called introduction, you would write \include{introduction}. Then your file introduction.tex would start with the line \section{Introduction}\label{sec:intro}. To merge, just make sure the section .tex files are saved in the same folder as the main, and then compile the main document.
  • Create a command for each collaborator to comment in a different color in the document. You can type \usepackage[usenames,dvipsnames]{color} in your main file preamble, and then create a command for each person (also in the preamble) by typing something like \newcommand{\bet}[1]{{\color{magenta} \sf Beth: [#1]}} Then, when I want to comment in any section, I just type \bet{my comment here}. When I compile, my name and comment will show up in magenta.
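
Putting the two hacks together, a main file might look roughly like the sketch below. This is only an illustration of the setup described above, not a canonical template: the document class, title, and author are placeholders I made up, and it assumes a file called introduction.tex sits in the same folder as the main file.

```latex
% main.tex: the skeleton file. This is the only file you compile.
\documentclass{article}                   % placeholder class; use whatever your project needs
\usepackage[usenames,dvipsnames]{color}   % color support for the comment commands

% One comment command per collaborator (add a similar line for each person).
\newcommand{\bet}[1]{{\color{magenta} \sf Beth: [#1]}}

\title{Working Title}                     % placeholder title
\author{Beth \and A. Collaborator}        % placeholder author list

\begin{document}
\maketitle

\include{introduction}   % pulls in introduction.tex from the same folder
% Add one \include line per additional section file.

\end{document}

% --- Contents of introduction.tex (a separate file, with no preamble) ---
% \section{Introduction}\label{sec:intro}
% Opening paragraph of the section.
% \bet{Should we say more about the motivation here?}
```

One thing to be aware of: \include starts each included file on a new page. If you would rather have the sections run together, \input{introduction} merges the file in the same way without the page break.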

Tool 2: SageMathCloud.

One of my collaborations involves thousands of lines of Sage code (SageMath is an awesome open-source computer algebra package). When my collaborator and I were in the same place, we could just work on Sage together in his office. When I changed institutions, we tried emailing Sage files back and forth, but it was very clumsy. This was right about the time when Sage launched SageMathCloud (SMC), its cloud-computing portal. SMC has been perfect for us, because it allows real-time collaborative editing of the same document, and all computations run in the cloud, so you never need to download and install or update Sage on your machine. I can even check computations and edit from my phone. SMC also has a shared LaTeX editor, so we used this to share the TeX file that eventually became our first paper. It also supports sharing IPython notebooks, and many other file types. SMC has recently added course organization features as well, and I am using SageMathCloud with my class this semester (more on that in a couple of weeks). Another nice SMC feature is the ability to “time travel,” which allows you to see old versions of your documents. I have periodically lost work due to some mistake on my part, or a cloud/internet glitch of some kind, and I have often been able to recover it using this feature.

Room for discussion: I love SMC! Let that be said before I say anything else. I also think it’s really worth watching this video about the history of Sage and SMC. I used the free version of SMC (a lot) for a couple years, and I was very happy with it. However, it was occasionally a bit glitchy. Computations ran at variable speeds, and sometimes seemed to just stall completely for no reason I could understand. I decided to invest in a membership, which allows me to run my computations on members-only servers and get other handy upgrades. I’m totally fine with paying for this, and I strongly support the model. However, the free version now seems even glitchier when I go back and try to use it with my students. I think it’s just that I got used to the improvements, and the free version still works fine. However, I would suggest that if you try the free version and like the basic idea, you might be happy with actually buying a membership. You can always go back to free, and your work will never disappear.

Tool 3: Network-Attached Storage.

I am currently working on a project with a great undergraduate student and a Psychology professor, who has a lab with many projects, many students, and a lot of data, something like 4 terabytes in all. This is way too much stuff to store on Dropbox, so his lab uses network-attached storage for all their work. This is my first time working on a project that involves so much stored information, so this is new to me. I am still figuring out the system but it seems to work well so far.

Room for discussion: As with SMC, I need to be connected to the internet to access the files, and I either have to be on campus or use a VPN, which is a small pain. Also, this isn’t my problem, but clearly my collaborator had to do some work to set up this system: buying a server, maintaining it, etc. In addition, my collaborator had to trust me with a username and password for the NAS server, and it makes me nervous to have access to all the files for his entire lab. What if I mess something up?! Of course that is unlikely, but it feels like a real responsibility. It would be great if I could only have access to a small part of the system, but that would mean we couldn’t easily draw on the data and figures from the whole in our document, so I guess I just need to be careful. I am definitely being careful.

Tool 4: Overleaf.

Overleaf is designed for creating and sharing documents in LaTeX. I haven’t used Overleaf, but it’s been recommended to me recently by two of my favorite math people. It looks great. It has a friendly LaTeX user interface, which is either a good or bad thing depending on how attached you are to your editor. Multiple people can work on the same document and see changes in real time, as in SMC. I am told that it solves one of my main Dropbox problems, which is how to easily track changes and incorporate comments in a shared document. The comments feature is really nice, similar to my comment hack above but much better. However, I don’t know that it has the mark-up capabilities that I would like. Really I just need to explore it more.

Room for discussion: I don’t know enough about Overleaf to say much about potential issues, aside from my speculation above. However, the free version of Overleaf shares one problem with some of these other tools—you can only edit or access documents when you are connected to the internet. You can pay for the pro version to sync with Dropbox.

Tool 5: ShareLaTeX

ShareLaTeX has been around for a while now, and it definitely does what you would think—lets you share LaTeX files, and work on TeX files on machines that do not have LaTeX installed (as long as you have internet access). Like Overleaf and SMC, it allows multiple users to work on the same document and see changes in real time. I used ShareLaTeX on one collaboration and it worked well, and I have taught several classes using this as an alternative to having students install LaTeX on their machines. It has a good previewer for viewing changes to the .pdf as you write the .tex code, so the students aren’t completely freaked out by their first LaTeX experience. I haven’t used it to collaborate recently, but it now seems to have a nice method for tracking changes. I am intrigued!

Room for discussion: ShareLaTeX seems to have many of the same features that Overleaf has. Similarly, you must subscribe to the premium version to automatically sync with Dropbox and be able to access documents and work offline. The free version also allows you to work with only one collaborator on each project. I think that is fair, but it is nice to be able to work with a larger group.

Thoughts about these tools? What do you use to collaborate or teach? Love the Corb Lund song that inspired this title? Let us know in the comments.

PS I have to add a personal update. I had a very Philadelphia weekend, including seeing Bruce Springsteen at Citizens Bank Park, and then leaving Philadelphia to go to the Jersey shore. Lucky for me, because apparently it was raining catfish in Philly. Okay, this happened on Labor Day, but it still seems worth sharing…

From USA TODAY:
Catfish falls from sky, hits woman on street
Falling catfish weren’t generally considered to be one of the hazards of life in Philadelphia until now.


9 Responses to Tool for the (collaborative) job

  1. Kyle says:

    “git” is often used in programming for version control, and can be used in that way if there is a shared code or shared LaTeX file. I haven’t used it personally, but I’ve heard good things.

    • bmalmskog says:

      Thanks, Kyle. I used git briefly when working on writing Sage functions in a group, but I haven’t used it in a while. It seems like the go-to for computer scientists, so maybe I should reinvestigate.

  2. aBa Mbirika says:

    Beth, nice post! I’ve immediately used the idea of color coding multiple collaborators’ comments in my collaborators’ shared LaTeX files. Per your AWESOME idea, I made myself the following:
    newcommand{aBa}[1]{{color{magenta} sf aBa: [#1]}}
    And made similar ones for my collaborators.

    • bmalmskog says:

      Awesome! It is definitely not my original idea–I got it from the wondrous Christelle Vincent, and I see that maybe other people have had this idea as well. But I am so glad that it is as helpful to you as it was to me. Also, there is something strange going on with the html for comments, and it won’t display the backslashes in your comment. I tried making them verbatim, but no luck. Sorry about that!

      • bmalmskog says:

        Update, Christelle says she didn’t bring that idea to the group, either. So I will just say it came from good collaborators.

  3. Tom Hull says:

    Thanks for these reviews! This is a very timely issue, and can be very complicated for folks to figure out which of the many options would work best for them and/or their collaboration groups. A few comments of my own:

    (1) Dropbox has a few big limitations. One is the size of the storage space it gives you for free. When I first started using Dropbox I thought it was fine … until my collaborators started uploading tons of large image files and my storage space ran out. After that I had to convince one of my collaboration groups to switch to Google Drive (another Dropbox-like option, which at the time gave more free space). So if you have a lot of collaborators, or if one of the projects uses large files, then you’ll either need to pay for more Dropbox space or find an alternative.

    The second issue with Dropbox, and shared file space services like it, is that it requires you to change your file management philosophy. If you’re the kind of person who organizes your research into folders on your hard drive designated by, say, topic, or by year, or whatever, then you’ll have to abandon that for your collaboration projects. Dropbox requires that your shared files are all in the “Dropbox Folder” and cannot be mirrored elsewhere in your file system. If you stubbornly try to keep copies of such projects in your personal file hierarchy (like I foolishly tried to do) then it gets massively confusing keeping things up-to-date. Thus, the Dropbox model is horrible for people who like to have control over their computer filesystem organization. Your personal research workflow might not work well with Dropbox unless you’re willing to conform to it.

    (2) The best collaboration software that I’ve used is Apache Subversion (subversion.apache.org). The (major) drawback of Subversion (often abbreviated SVN) is that it needs to be set up on a server for your collaborators to use. That said, if one of your collaborators is at an institution with a Unix server that can be used, then setting SVN up for yourselves can be easy. The (major) advantage of SVN is that it handles version tracking of your files. That is, if multiple users try to update the same file at roughly the same time, then SVN tries to take care of the potential conflict by either (a) merging the two file edits if that can be done easily (say, if the two people’s edits do not conflict with each other), (b) adding easily-seen commentary in the file to highlight what you changed that could be in conflict with another edit, or (c) refusing to let you update the file because your changes conflict too substantially with someone else’s. That might sound confusing, but SVN does a great job managing things so that multiple writers do not step on each other’s toes. Collaborators are still advised to manage themselves when working on a document, but when your system breaks down or an accident happens, SVN keeps things in order. (E.g., it is much harder for a user to accidentally delete everything in SVN, whereas with Dropbox it’s as easy to make a mistake as it is with any of your other files.)

  4. Robert says:

    I can’t imagine collaborating with others on LaTeX files without git. In fact, having checkpointed history is quite valuable even if you’re not working with someone else. (Similar to SVN, mentioned above, but much easier to set up and more flexible.) I use that and the Google Docs suite.

  5. Harald Schilly says:

    for latex files consisting of multiple ones, I always suggest to use “subfiles”. that makes the individual ones “full” files and hence compilable without tricks.

    https://github.com/sagemath/cloud-examples/tree/master/latex/multiple-files
