Have you considered the productivity loss associated with a disastrous computer crash (one where you cannot retrieve any of your files)? HDs do die, and it is really time-consuming to get back to a working state.
Anyone who regularly works on more than one computer and needs access to the same set of files will benefit from using a syncing tool. The following scenario is pretty common but not efficient:
You are working on a desktop computer and a laptop at home, as well as on a desktop computer at your office. You routinely copy your Word documents, Excel spreadsheets, PDFs, and other files over to USB flash drives, carry them between your home and workplace, and manually copy every file over to its appropriate directory within My Documents (or whatever home dir you use).
Sometimes you get things wrong and clobber a newer version of a file with an older one, sometimes you move a file into the wrong place and end up with duplicates that you must compare by hand, and sometimes you lose a chunk of your valuable work data when one of your computers’ hard drives crashes.
Most people don’t have a backup strategy in place. Everybody has colleagues who have suffered some kind of disastrous data loss, but somehow they think that their own computer is immune to accidents, thieves, and hardware failures.
If you are lucky, your department/uni pays someone to keep backups of your work. However, you will probably still need to sync several computers (e.g., home/office) and to work on a home computer that is not covered by your uni’s backup system (if any).
Note: before writing this post I tested quite a lot of software, and I have to agree with Roger Grimes that most backup software sucks. Syncing is another story. I have come to the conclusion that using a syncing program to do backups is not only possible but advisable. Anyway, here is my backup/sync strategy.
An excellent guide to different backup types and programs is this one. However, as we’ll see later, the program I recommend is not among those listed in that guide.
The key questions to ask here are WHAT and WHEN. Not all files change often, so it’s best to select only the folders that change constantly for daily backups and leave the rest of your HD on a less intense backup schedule.
In my case, I want to keep a current mirror on an external HD (updated often, maybe every 2-3 days) and a remote backup (to a server) updated maybe monthly.
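Picking the WHAT can even be automated. Here is a small Python sketch (my own illustration, not part of any of the tools discussed here) that flags the folders containing files modified in the last few days, i.e. the candidates for the frequent-backup schedule:

```python
import os
import time

def recently_changed(root, days=3):
    """Return subfolders of `root` that contain at least one file
    modified in the last `days` days. These are candidates for the
    frequent backup schedule; everything else can wait."""
    cutoff = time.time() - days * 86400
    hot = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) >= cutoff:
                    hot.add(dirpath)
                    break  # one recent file is enough to flag this folder
            except OSError:
                pass  # file vanished between listing and stat; skip it
    return sorted(hot)
```

Run it over your home dir and feed the result to whatever backup profile you use daily.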
If you think that your work is secure right now, just try to answer this question: would you cry if your HD died right now? This is the crying test. If your answer is yes, start backing up now!
It is really important that you keep incremental, versioned backups of your personal documents. By versioned I mean that you keep multiple copies of each document, organized by date, so that you can go back to an earlier version if you discover that you’ve deleted something important.
Note that FileHamster does versioned backups, in a way; however, with a general backup strategy you can protect all the files in your home directory, not only those added to FileHamster.
Since space is cheap, you can keep months of history (old versions of papers, programs, etc.) on the same HD you use for backups. This comes in handy if you realize you need to go back to changes you discarded in, say, a paper, but now prefer over the most recent version. From the Tao of Backup:
An important element of any backup scheme is the retention of backups from various times in the past. For example, each week, you could place a backup tape into permanent storage. By doing so you are leaving a “trail” of backups that enable you to access data as it was at various times in the past.
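The “trail of backups” idea is simple enough to sketch in a few lines of Python (a toy illustration of the concept, not how any of the programs below implement it): every backup run copies the file under a date-stamped name, so earlier versions are never overwritten.

```python
import os
import shutil
from datetime import datetime

def versioned_copy(src, backup_dir):
    """Copy `src` into `backup_dir` under a timestamped name, e.g.
    thesis.doc -> thesis.doc.20240131-154502.123456, leaving a trail
    of versions you can walk back through."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S.%f")
    dest = os.path.join(backup_dir, f"{os.path.basename(src)}.{stamp}")
    shutil.copy2(src, dest)  # copy2 preserves the file's timestamps
    return dest
```

Call it once per backup run and you get exactly the date-organized history described above.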
If you connect through a network, you want to move the smallest amount of information possible.
Nerf (from the DC forums) makes the point clear:
RSync type applications are of no use where the source and destination files are on the same PC, even if the destination is on an external hard drive. RSync is a client/server program and its purpose is to minimize the amount of data (traffic) that needs to be sent between two computers in order to synchronize files. The client and server read the files in their entirety, several times in fact and send data back and forth to determine what parts of the files have changed. The client finally sends just the parts that have changed to the server. So the data going down the wire is minimal but the CPU and Disk use at each end is higher than a simple Copy (Backup) process.
Also note that RSync et al. are useless where the content of files changes dramatically over time. For example, let’s say you have 100 files and only two have changed. RSync will perform very well here. But if you Zip up these 100 files and use RSync on the zipped file you will probably find the entire file is sent down the wire. This is simply because most every bit in the source and destination zip files has changed. Encrypted files such as TrueCrypt containers similarly won’t work effectively.
Some programs do ‘patches’ (aka delta backups), that is, they copy/move only the bits of the file that have changed. Any of these delta backup programs should beat traditional ones on speed, but this is only useful when copying large files that change in small parts over a network.
Ideally you also want to copy files to a server in a remote location (Rule #3 in the Tao of Backup). For that you definitely need a program that moves the minimum amount of information over the network, and of course access to a PC in a remote location, ideally a server running rsync. This adds a whole new level of security.
There is a Windows version of rsync, cwRsync, so rsync can be easily installed on any computer you use for work, no matter what OS it is running.
Lifehacker posted a while ago about a Mac solution to automatically sync a thumb drive when it is plugged in.
A good strategy is to get a Dreamhost account (like the one hosting this blog), because they provide lots of space for cheap, and back up to Dreamhost using rsync. An easy way to configure this is covered here.
Putting all of this together: Super Flexible File Synchronizer (SFFS)
I have tested SFFS against:
- Backup4All: slow, huge memory usage
- SyncBackSE: unreliable, confusing, doesn’t support ssh tunneling
- Genie backup: never really worked, created duplicate files.
Some have poor UI, some don’t support this or that method, some insist on zipping. None do the things that SFFS does.
A few highlights of SFFS:
- sftp support
- Amazon S3 support
- delta updating of *files* (i.e., if a big file is changed, only the changes are copied, not the entire file)
- easy and reliable resuming of a profile after something went wrong
- caching the index so you do not have to reindex large drives that change little (saves lots of time)
- low memory and CPU usage
- a feeling of reliability overall
If a backup session is aborted or interrupted for any reason, SFFS can resume where it left off. That is, if you are scanning a large folder and have to stop the SFFS process, the next time you start the same profile it will skip the files it has already compared; this is quite a time saver.
SFFS has its own ‘Daemon’ to run periodically in the background. I run it daily for most of my important files.
In my tests SFFS is really fast. SFTP and WebDAV (and Amazon S3!) support is a plus. Another interesting feature is “detect moved files” (a lot less transfer).
Super Flexible does delta updating too. From the help:
“In your profile, make the following checkmark: Use Partial File Updating, which is on the Advanced tab sheet in Advanced Mode.”
To make SFFS work like a backup program, say Backup4All, here is what you have to do:
- Create your normal profile, left to right.
- Select “add timestamp to filename” (note: do not use delta updating here).
- Keep as many versions as you like (say 10).
- Zip files if needed.
- Create another profile by copying the first one, but swap the paths, so it’s “right to left”.
- Test it.
In fact, more recent versions do have a wizard for this.
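If the timestamps added to the filenames sort chronologically (e.g. a YYYYMMDD-HHMMSS suffix), the “keep as many versions as you like” step is easy to emulate by hand too. This is a hypothetical sketch of that pruning step, not SFFS’s actual cleanup logic:

```python
import os

def prune_versions(backup_dir, basename, keep=10):
    """Delete all but the newest `keep` timestamped versions of
    `basename` in `backup_dir`. Assumes version names sort
    chronologically (e.g. basename.YYYYMMDD-HHMMSS)."""
    versions = sorted(f for f in os.listdir(backup_dir)
                      if f.startswith(basename + "."))
    for stale in versions[:-keep]:  # everything older than the last `keep`
        os.remove(os.path.join(backup_dir, stale))
    return versions[-keep:]  # the versions that survive
```

Run it after each backup pass and the history never grows past the retention limit you chose.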
Don’t forget the cardinal rule of computer work. – back up your work regularly. There is nothing more frustrating in computing than losing the lot because the system has gone down (and hard drives do fail). If it hasn’t happened yet, it may well one day. Your work can be just as much at risk on a server hard drive as on a local machine.
If you think your backup strategy is good, then read the Tao of Backup. I did, and I found plenty of holes in mine.