We academics should be obsessed with the amount of stuff that we write, and it could be that one bottleneck of our output is simply the speed at which we type. We have provided some tools to help you write faster (see our review of an autocompleter here), but audio could be a very good tool to get your ideas into a more manageable form, whether that is text or simply an audio file. For example, it’s very, very easy to do a brain dump using audio. You just start talking about the idea that you just had and try to put it in a way that sounds reasonable, so that you could play it to other people and they would understand what you are saying.
In that sense, it is a lot better to use audio because you speak at a speed that is a lot higher than your typing speed.
Actually, this post has been dictated into Audacity, which is free software that I use for dictating. One of the things that changed my mind and made me try dictation was Peter Fisher’s Podcast series; Peter Fisher is a professor at MIT, and he has a series of Podcasts on Academic Productivity. I seriously recommend his stuff (see my review here); I think he has plenty of very valuable advice in his Podcasts. But anyway, I want to go through the advantages of using audio as a means to get your ideas down on paper.
The first advantage is that audio forces linearity on you. When I write text, I can jump freely around; I can go to the introduction, then add to the end of the paper; I can work on the Methods section, go back to the intro, then back to Method and so on. This is not something that you can do with audio; you really have to start from point A and run all the way to point Z. This could be an advantage or it could be a disadvantage, but for short ideas like a Blog post or just a quick note, this should be an advantage.
The second advantage is that dictating also prevents multi-tasking; that is, when you are doing your audio recording, you cannot be working on other things at the same time, like reading email while writing a paper (most people do that). Setting up the recording program means that you will do only dictating for a while. I don’t know about you, but having my undivided attention on just one task helps. I have been known to do several tasks at the same time, and this is really not a good idea. We will talk about multi-tasking at some point in Academic Productivity, but right now I am pretty happy knowing that I have to talk to a microphone, and that is the only thing that I have to do, so there is no easy way I can be doing two things at the same time, like listening to music and writing for example.
The third advantage of audio is that it removes a barrier to entry for developing an idea. For example, sometimes I feel too lazy to fully develop an idea into a manuscript because it’s just too complicated. I can see way too many steps that I have to take until the full paper is completely written. It is just a pain to type the idea down, correct the spelling and so on. When you are dictating, you don’t have that; you don’t have to worry about spelling, you don’t have to worry much about the structure, and you can simply jump in and get your idea down as soon as possible.
Of course, for longer-form pieces you are going to need a structure (but interestingly, not for something like a note, a Blog post or a quick introduction for a letter). For longer-form pieces, I would recommend you work on your outline with paper and pencil or a computer, and dedicate as much time as you want to the outline. Then, and only then, dictate. You may even have some standard structure that you just copy and paste and repeat over and over. Take experimental papers: they all have (1) an introduction, (2) a theory part where you connect your ideas with previous theories or compare the predictions of two existing theories, (3) a Method section that describes what you actually did, (4) a results section and (5) a discussion. This is a pretty common structure, and if you have many experiments you may well have this structure repeated x times. It is good to have that structure before you start dictating, and you can even go into fine-grained detail and say, “Okay, the introduction must have three paragraphs; in the first one I am going to explain idea y, in the second one idea z, or whatever.” You may even make notes on what the paragraphs are going to contain and then dictate, elaborating on those ideas. So you can get a detailed outline and then dictate to fill in the gaps in that outline.
For outlining, the best tool I have found is Microsoft OneNote. But this is worth a Blog post in itself, so I am not going to elaborate on OneNote here.
A fourth advantage of dictating is that it is actually hands-free, so you could do something else at the same time (again, only if you really have to multi-task while you are dictating). So, in an extreme case of both multitasking and having no life, you could be feeding your baby or doing something else that requires your hands while dictating an idea to your computer, or you could even use a portable voice recorder. You could be waiting for the bus or commuting somehow and dictating. I have not tried driving and dictating, or doing delicate things while dictating, but I guess that with practice you may even be able to do that.
Of course, dictating has disadvantages. Let us cover those too. For a start, it is very difficult to start dictating and get a perfectly fine manuscript out of your brain dump. It is going to take several iterations. I have only just started using dictation, so I don’t really know how good it is, but I think it may be worth it because you get a mass of text that you can then work on. You can remove paragraphs, paste them together and clean up stuff; it should not be that difficult, I think. With a few iterations you can get a pretty good manuscript that you can just send off somewhere.
Finally, I’d like to compare sending your ideas out to the world in pure audio form (what is called a Podcast) versus a text post. I mean: if I have dictated all this, why would I not just post the audio file?
I think there are several advantages for text. First, text can definitely be consumed faster. We are trained to skim, and we do it consistently. It’s a lot more difficult to skim audio (although possible). The non-skimmable character of some speed reading methods made me abandon them. Second, text is indexed. No search engine indexes audio or video yet (although I’m sure it’s coming). So it’s harder to get noticed if you produce only audio, and people will have trouble trying to track down where you said what, making quotation difficult.
A nice advantage of audio, though, is that it can be consumed anywhere; for example I listen to Podcasts while jogging or commuting.
I know of at least one successful academic who dictates: Robin Dawes, at the Department of Social and Decision Sciences, Carnegie Mellon University. Drop a comment if you know of anyone else, or if dictation has made a big impact on the way you work on your writing.
We academics skim text all day long, every working hour we have, so we are very good at it, and I think most people skim blog posts at incredible speed. So it may be that people really want that kind of speed to consume your ideas and don’t want to wait while you talk for ten minutes. It will take them ten minutes to get your entire idea in audio, whereas they could probably skim the same idea in text format in just a minute.
So, that’s one of the advantages for text. And of course, text is indexed in search engines, so if you want people to find you and your ideas, you have to have them in text form; it is very difficult to index audio. Eventually search engines will start indexing audio and video, but that is not happening yet.
Of course, one advantage of audio, like Podcasts, is that you can consume it anywhere you are: for example, while waiting for the bus, driving or jogging. Actually, that is how I consume most of the Podcasts I listen to.
Now finally, I want to talk about how you can get your audio into text form if you want to do that. The first way is to use a transcription service, and that’s what I am using right now. It can be costly: it’s about $1 a minute, and it takes a few days for the person to do the job, which can be the downside to this option. The person I am using right now takes three to five days, and then you have to go through the text and make sure it is the way you wanted it to be. While there’ll always be mistakes in transcription, I still think it’s the most convenient way… as you will see below.
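To put that price in perspective, here is a rough back-of-the-envelope sketch in Python. The $1 per minute rate is the one quoted above; the speaking rate of 150 words per minute is only an assumption for illustration (not something I measured), and the function names are made up for this example.

```python
def dictation_minutes(word_count: int, words_per_minute: float = 150.0) -> float:
    """Estimate the minutes of audio needed to dictate `word_count` words.

    The default speaking rate is an assumed figure, not a measured one.
    """
    return word_count / words_per_minute


def transcription_cost(audio_minutes: float, dollars_per_minute: float = 1.0) -> float:
    """Estimate the transcription cost at a flat per-minute rate."""
    return audio_minutes * dollars_per_minute


if __name__ == "__main__":
    # A short blog post, a medium draft and a long paper (word counts are illustrative).
    for words in (750, 3000, 6000):
        minutes = dictation_minutes(words)
        cost = transcription_cost(minutes)
        print(f"{words} words: about {minutes:.0f} min of audio, about ${cost:.0f}")
```

Under those assumptions, a 3000-word draft is roughly 20 minutes of audio, or about $20 of transcription. Change the two default values to match your own speaking pace and your transcriber’s rate.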
The second way is to use software that can do the transcription automatically. The best one is supposed to be Dragon Naturally Speaking. They are now at version 9.1 and, to tell you the truth, I haven’t tried their software; I haven’t really spent time working with Dragon to see how well it can transcribe. They report accuracy rates of about 99% correct. I know you have to train the system, so it may take a while to get there. It’s pretty CPU intensive, and it may be a bit disappointing the first time you use it because you have to correct errors on the fly. As you are talking, you see the text on your screen, and it can be in any kind of computer program, like Word or OneNote; you can even compose an email with your voice and see the words written as you talk, which is pretty cool. But you will see that plenty of things are misspelt or just plain wrong; sometimes it is really hilarious. So it takes quite a bit of work: you have to type corrections, which is pretty attention demanding, so I would say transcription services are actually better.
UPDATE: Now I have given Dragon a try. And unfortunately, my first impression is that Dragon shows an impressive collection of fatal mistakes.
- The program asks permission to scan things you have written to improve and personalize its vocabulary. That’s great; but then it only scans the ‘My documents’ folder. What if you use a different folder to save your writings? Then it will miss them. First fatal design flaw.
- When you fire up Dragon, it installs a large toolbar across your screen. Well, if you close the program, the space that used to be reserved for that toolbar is never released (!). Talk about civic behavior! All other applications, even after closing Dragon, will have lost a few inches of screen real estate. This is just rude.
- In use. The initial accuracy was pretty poor, to the point of making me laugh out loud. But then, the laughs would show up as text (Dragon’s best guesses!). Then, you obviously need to do some work on the produced text… but if you correct the mistakes on the fly with a keyboard, the process is so slow and so distracting that you’re better off writing the thing yourself.
- Performance. I wonder how, with current technology and fast CPUs, this program needs several seconds to figure out what the hell you were saying.
Conclusion: honestly, if I were the CEO of this company I would never have released this program. You may get some use out of Dragon in extreme circumstances, e.g., if you have had an accident that prevents you from using your hands and you cannot type, or if you have to hold your children while writing your paper and you really need to dictate. But overall, an extremely disappointing experience.