Mar 04

Backups and Data

I am no expert on this at all, but I felt like writing briefly about how I try to keep my data relatively safe.

No backup strategy is perfect, but everyone should have one.

And each backup strategy should have three components:

  • Live data (the data in the place where you actually work with it)
  • Offsite replica (a copy of the live data that is stored somewhere physically removed from the live data; in Japan, that means far enough away not to be destroyed in the same earthquake as the live data and the offline replica)
  • Offline replica (a copy of the live data that is stored on a device that is only powered on while data is being copied; this copy protects against things like a virus or other forms of data corruption)

Here is my solution.

All my data is copied incrementally, once a month, to an external 2TB hard disk, which I then unplug from the system.  This is my offline backup.
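
To illustrate what "incremental" means here, the following is a minimal Python sketch that copies only files that are new or have changed since the last run. It is not the tool used for the monthly backup described above (the post does not name one); the function names `incremental_copy` and `needs_copy` and the two-second timestamp tolerance are assumptions made for this example.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing, differs in size, or is older than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # Allow 2 seconds of slack: some filesystems store coarse timestamps.
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime + 2


def incremental_copy(src_root, dst_root):
    """Copy new or changed files from src_root to dst_root.

    Returns the relative paths of the files that were copied.
    Files deleted from src_root are NOT removed from dst_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            if needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also preserves the timestamp
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Running it a second time against an unchanged source copies nothing, which is the whole point of an incremental pass. Because the sketch never deletes anything on the destination, a file removed by accident from the live data stays recoverable on the offline disk.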

Documents: Live data is automatically synchronized via JungleDisk software to an encrypted location on a Rackspace server in the U.S.  The same program also creates a local copy on each of my three main computers.  Documents that need high availability (instant replication across all PCs, like my password file) are synchronized using a free dropbox.com account.

Photos: The live data lives on a 2.5” encrypted USB HDD, which I can bring with me if I’m on the road.  I synchronize that data manually (using Beyond Compare) to my Nexenta server.  Nexenta is a variant of OpenSolaris that is focused on being a NAS (Network Attached Storage) server.  Like OpenSolaris, it uses the ZFS filesystem (yes, I know that’s like saying “the HIV virus”) which, to put it simply, handles large amounts of data very well.  At the moment, I also synchronize these files manually to the Rackspace server as my offsite replica.

However, with 150GB of photos and around 50GB of video, the Rackspace charges ($0.15 USD per GB per month) are starting to get high.  So I’ve decided to move my photos to datastorageunit in order to save money.  I am currently using 160GB of data on Rackspace, which is costing me about $24 USD a month ($288 USD per year).  On the other hand, datastorageunit costs $150 USD per year ($12.50 per month) for 300GB, which means that I can add video backups to that as well.
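
The arithmetic behind that comparison can be checked with a few lines of Python. This is only a sketch using the prices quoted above; the constant and function names are made up for the example, and it assumes the flat plan has no extra charges beyond the yearly fee.

```python
# Figures quoted in the post; the helper names are made up for this sketch.
RACKSPACE_USD_PER_GB_MONTH = 0.15
DSU_USD_PER_YEAR = 150.00
DSU_QUOTA_GB = 300


def rackspace_cost(gb):
    """Return (monthly, yearly) per-GB cost in USD for `gb` gigabytes."""
    monthly = round(gb * RACKSPACE_USD_PER_GB_MONTH, 2)
    return monthly, round(monthly * 12, 2)


def dsu_cost(gb):
    """Return (monthly, yearly) flat-plan cost in USD, or None if over quota."""
    if gb > DSU_QUOTA_GB:
        return None
    return round(DSU_USD_PER_YEAR / 12, 2), DSU_USD_PER_YEAR


# 160 GB is the current usage; 200 GB is 150 GB of photos plus 50 GB of video.
for gb in (160, 200):
    r_month, r_year = rackspace_cost(gb)
    d_month, d_year = dsu_cost(gb)
    print(f"{gb} GB: per-GB plan ${r_month:.2f}/mo (${r_year:.2f}/yr), "
          f"flat plan ${d_month:.2f}/mo (${d_year:.2f}/yr), "
          f"saving ${r_year - d_year:.2f}/yr")
```

At the current 160GB the flat plan saves $138 USD a year, and the gap widens as the archive grows, since the per-GB bill keeps rising while the flat fee stays the same up to the 300GB quota.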

The advantages of Rackspace (via JungleDisk) are that it is easy to use and sync, and that it is encrypted, both in transmission and on the server, which is why I am leaving ~25GB of my live documents there.

Datastorageunit’s philosophy is more homebrew in that the user can (must) decide how to connect and transfer files.  This means that while transmission is encrypted, the remote filesystem is not.  There are ways for the user to encrypt that data, but I’ve decided not to, in order to keep transfer times lower and avoid a massive headache.  When it comes down to it, my photos do not need encryption.

I will probably never move my main documents folder there, because I need the multiple-machine synchronization features that JungleDisk offers me.  It’s really nice turning on my laptop and having it automatically download all the recent changes to my files.

Once the data has been transferred over, I need to figure out how to automate my rsync job so the data gets mirrored to datastorageunit every night to preserve changes I’ve made throughout the day.
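
As a starting point, here is a hedged Python sketch of what such a nightly job might look like. It only builds and runs a standard rsync-over-SSH command; the host name, user name, and paths are placeholders rather than real datastorageunit details, and it assumes SSH key authentication is already set up so that the job can run without a password prompt.

```python
import subprocess


def build_rsync_command(local_dir, remote_user, remote_host, remote_dir,
                        dry_run=False):
    """Build an rsync command that mirrors local_dir to the remote host.

    -a keeps permissions and timestamps, -z compresses in transit, and
    --delete removes remote files that no longer exist locally (a true
    mirror, so a local deletion is propagated on the next run).
    """
    cmd = ["rsync", "-az", "--delete", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, transfer nothing
    # A trailing slash makes rsync copy the directory's contents, not the
    # directory itself.
    source = local_dir.rstrip("/") + "/"
    destination = f"{remote_user}@{remote_host}:{remote_dir}"
    return cmd + [source, destination]


def run_mirror(cmd):
    """Run the command and return rsync's exit code (0 means success)."""
    return subprocess.run(cmd, check=False).returncode


# Hypothetical usage; every value below is a placeholder:
#   cmd = build_rsync_command("/tank/photos", "me", "backup.example.com",
#                             "photos/")
#   run_mirror(cmd)
#
# A crontab entry along these lines would run a script at 02:00 every night
# (the script path is hypothetical):
#   0 2 * * * /usr/bin/python3 /home/me/bin/mirror_photos.py
```

Trying it first with `dry_run=True` shows what would be transferred or deleted without touching the remote copy, which seems prudent given that `--delete` mirrors deletions as well as additions.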

A quick Google search reveals that this may not be as straightforward as I originally thought…