Using ZFS incrementals for disaster recovery backups in AWS Glacier
My home Ubuntu Linux fileserver runs ZFS on Linux for all my main data filesystems. My goal was efficient, cost-effective off-site storage for the data in case my home server is irretrievably lost or damaged.
AWS Glacier currently charges $0.007 / GB of storage / month, so ~1TB of data should be ~$7/month for storage under normal circumstances.
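As a quick sanity check of that arithmetic, here is a small shell sketch. The function name is made up for illustration, the price is just the figure quoted above (check current AWS pricing before relying on it), and it covers storage only, not requests or retrieval.

```shell
#!/bin/bash
# Price per GB-month as quoted in the text; this is an assumption that will drift over time.
PRICE_PER_GB_MONTH=0.007

# estimate_monthly_cost GB
# Prints the estimated storage cost in dollars per month for GB gigabytes.
# (Hypothetical helper; storage only.)
estimate_monthly_cost() {
    awk -v gb="$1" -v price="$PRICE_PER_GB_MONTH" 'BEGIN { printf "%.2f\n", gb * price }'
}

estimate_monthly_cost 1000   # roughly 1 TB, prints 7.00
```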
Additionally, I wanted a method that would be light on both upload bandwidth and the CPU and disk work done on the fileserver. Rsync is fine, for example, but traversing big filesystems is costly.
ZFS send/recv writes and reads a stream representing a given filesystem snapshot, and it has an incremental option that sends only the difference between the given snapshot and its baseline. These use STDIN and STDOUT very nicely, so backup and restore are covered at the filesystem level, assuming I can reliably upload and retrieve the data. The incremental backups are very efficient: they send only the blocks that have changed and are not a tax on the system when run.
The next step, therefore, is getting the data into and out of Glacier as needed. For this, I found a glacier-cmd utility on GitHub. This gives us the missing STDIN/STDOUT to and from Glacier. (Update: trying a different branch to deal with AWS timeouts better.)
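To see how small an incremental stream would be before uploading anything, zfs send has a dry-run mode. The sketch below is only an illustration. The function name and the ZFS override variable are hypothetical, and it assumes your zfs version prints a "size" line in its parsable dry-run output.

```shell
#!/bin/bash
# ZFS can be overridden (for testing, for example); defaults to the real zfs binary.
: "${ZFS:=zfs}"

# incremental_size_bytes BASE_SNAPSHOT NEW_SNAPSHOT
# Prints the estimated size in bytes of the incremental stream without sending it.
#   -n  dry run, nothing is actually sent
#   -v  verbose
#   -P  machine-parsable output, which includes a "size<TAB><bytes>" line
incremental_size_bytes() {
    "$ZFS" send -n -v -P -i "$1" "$2" 2>&1 | awk '$1 == "size" { print $2 }'
}

# Hypothetical usage:
#   incremental_size_bytes vat/testfs@lastbackup vat/testfs@morechanges
```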
Working with Glacier takes a little getting used to. Here are some basic facts:
Data is organized in Glacier in vaults.
Within a vault are individual files, each with a filename and a description (I use the description field).
Once files are uploaded, you have to wait for Glacier to build the inventory of the vault (basically a list of files) before you can see or access them. This can take a day or more.
Actually getting the inventory refreshed is an asynchronous process (get used to these with Glacier) that can take several hours by itself.
Retrieving files (in my case) required me to first 'getarchive' (async) and then 'download' (sync).
glacier-cmd lets you set up AWS SNS to email you when async Glacier commands finish. This can be tied into real applications you may write, but for my case, just getting an email is fine.
So, I can follow this basic workflow for backups:
# Create the vault in Glacier if needed
glacier-cmd mkvault testvault

# Full backup
zfs snap vat/testfs@lastbackup
zfs send -p vat/testfs@lastbackup | glacier-cmd upload --description "Full backup test" --stdin testvault

# Incremental backup (repeat as often as needed)
zfs snap vat/testfs@morechanges
zfs send -p -i vat/testfs@lastbackup vat/testfs@morechanges | glacier-cmd upload --description "Incremental" --stdin testvault

# If the backup succeeded, make the 'lastbackup' snapshot reflect the most recent Glacier update:
zfs destroy vat/testfs@lastbackup
zfs rename vat/testfs@morechanges vat/testfs@lastbackup
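The "if the backup succeeded" step is worth guarding in a script, because by default a shell pipeline only reports the exit status of its last command. Below is a minimal bash sketch of one way to do that. The function name and the ZFS and GLACIER_CMD override variables are hypothetical, and it assumes glacier-cmd exits non-zero when an upload fails, which I have not verified.

```shell
#!/bin/bash
# Make a pipeline fail if ANY command in it fails, not just the last one.
set -o pipefail

# Both commands can be overridden (for testing, for example).
: "${ZFS:=zfs}"
: "${GLACIER_CMD:=glacier-cmd}"

# incremental_backup FILESYSTEM VAULT
# Sends an incremental stream and rotates the snapshots only if both
# zfs send and the upload succeeded.
incremental_backup() {
    local fs="$1" vault="$2"
    local base="${fs}@lastbackup" new="${fs}@morechanges"

    "$ZFS" snap "$new" || return 1

    if "$ZFS" send -p -i "$base" "$new" |
        "$GLACIER_CMD" upload --description "Incremental" --stdin "$vault"
    then
        # Upload worked: the new snapshot becomes the baseline.
        "$ZFS" destroy "$base" && "$ZFS" rename "$new" "$base"
    else
        # Upload failed: keep the old baseline and drop the unsent snapshot.
        echo "upload failed; keeping $base as the baseline" >&2
        "$ZFS" destroy "$new"
        return 1
    fi
}

# Hypothetical usage:
#   incremental_backup vat/testfs testvault
```

Dropping the unsent snapshot on failure is a design choice made for this sketch, so that the next run starts from a clean state.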
Restores would roughly look like this:
# Trigger an inventory of our vault
glacier-cmd inventory testvault

# Wait until this finishes and outputs all our files
glacier-cmd listjobs testvault

# When the inventory is finished, you can re-run the inventory request and get something like this:
+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+----------------------+------------------------------------------------------------------+-----------+
| Archive ID | Archive Description | Uploaded | SHA256 tree hash | Size |
+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+----------------------+------------------------------------------------------------------+-----------+
| U-piHs9hp3RwZq-Xxrwjjy7e0ruVrhD_74SOE-8Ye-4T4NhvVg63kuQCeVRR2dOx1GVQ6uOxdSSG0NH5LBgdDRaYdF9SSmD_l4lGR6QbdpFBSvvj5Yc33Bf8nIU09nuZQQ1pMX5qZA | No description. | 2015-12-23T00:32:59Z | d30abc7a7707cd140d8ac1c43df257915ed391bbc4971516940a445956d1b5dc | 223506978 |
| aEvEc_zCFiytsPEh8XMdMCcILmNkJm_MaOGuqpXN0-KFj52TkC-LA44IfoOxBHUSjI6gZMlaKlNzwOgxdiPkagkm6NhnZgRUvywKBYl9v8SIX7HO_5pjTXBVS4xVSlJHMfBv5M5YXQ | No description. | 2015-12-23T00:43:16Z | f64568291ca27e669081c426152682562c631a5c9896a202a0b442e5fd092122 | 223641765 |
| MqC9kdtEsNvFv2w0XtLbotdMcZx3euMjkFRrmsd9jxgMeeXMngjPxD_Sl3tH7zCZ93WJJ6L308Gr3NbOBB_QijOFjaR44OOsEdb1bjc3c9SKYSNYMXnJcpePwlNdxxtKG_PlEJlpGg | No description. | 2015-12-23T00:49:41Z | fc5e44b5839c0d23e0ef3fe6da21d25d0702b12433ce5195319ac0fe280cacc5 | 2093 |
+--------------------------------------------------------------------------------------------------------------------------------------------+---------------------+----------------------+------------------------------------------------------------------+-----------+

# For each Archive ID, we need to 'getarchive' and then finally download:
glacier-cmd getarchive testvault U-piHs9hp3RwZq-Xxrwjjy7e0ruVrhD_74SOE-8Ye-4T4NhvVg63kuQCeVRR2dOx1GVQ6uOxdSSG0NH5LBgdDRaYdF9SSmD_l4lGR6QbdpFBSvvj5Yc33Bf8nIU09nuZQQ1pMX5qZA

# Wait until that job is finished, then finally download.
glacier-cmd download testvault U-piHs9hp3RwZq-Xxrwjjy7e0ruVrhD_74SOE-8Ye-4T4NhvVg63kuQCeVRR2dOx1GVQ6uOxdSSG0NH5LBgdDRaYdF9SSmD_l4lGR6QbdpFBSvvj5Yc33Bf8nIU09nuZQQ1pMX5qZA | zfs recv vat/restorefs
We'd need to zfs recv the full backup and then all the successive incrementals in order.
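Since the order matters, it helps to turn the inventory table into a list of archive IDs sorted by upload time. The sketch below is one way to do that with awk and sort. It assumes the table layout shown above, that the vault only holds backups of a single filesystem, and that the oldest archive is the full backup. The function name is made up for illustration.

```shell
#!/bin/bash
# archive_ids_in_upload_order
# Reads a glacier-cmd inventory table on stdin and prints the archive IDs,
# oldest upload first. The "Uploaded" timestamps are ISO 8601 in UTC, so a
# plain text sort puts them in chronological order.
archive_ids_in_upload_order() {
    awk -F'|' '
        # Keep only data rows: column 4 (Uploaded) starts with a 4-digit year.
        $4 ~ /^ *[0-9][0-9][0-9][0-9]-/ {
            id = $2; uploaded = $4
            gsub(/ /, "", id); gsub(/ /, "", uploaded)
            print uploaded, id
        }
    ' | LC_ALL=C sort | awk '{ print $2 }'
}

# Hypothetical usage, with the inventory table saved to inventory.txt:
#   archive_ids_in_upload_order < inventory.txt | while read -r id; do
#       glacier-cmd getarchive testvault "$id"
#   done
```

Once the getarchive jobs have finished, the same ordered list can drive the download and zfs recv steps, full backup first.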
To make this a bit simpler and easy to run from cron, I wrote a quick Perl script to automate the backups (full or incremental). The basic usage is:
glacier_inc.pl vat/testfs
It creates a vault in Glacier for the filesystem.
It manages a '@glacier' snapshot for you.
If the @glacier snapshot doesn't exist, it does a full backup and creates it.
If the @glacier snapshot does exist, it creates a new snapshot and does an incremental instead.
If there are no diffs between the @glacier snapshot and the new snapshot, it skips the Glacier upload.
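The decision logic in that list can be sketched in a few lines of shell. This is not the Perl script itself, just an illustration of the same rules. The function names, the '@glacier-new' snapshot name, the ZFS override variable, and the vault naming scheme are all assumptions, and using zfs diff to detect changes is my guess at one way to do it.

```shell
#!/bin/bash
# ZFS can be overridden (for testing, for example); defaults to the real zfs binary.
: "${ZFS:=zfs}"

# vault_name_for FILESYSTEM
# Hypothetical naming scheme: Glacier vault names cannot contain "/",
# so replace it with "_".
vault_name_for() {
    printf '%s\n' "$1" | tr '/' '_'
}

# backup_mode FILESYSTEM
# Prints "full", "incremental" or "skip". Assumes the caller has already
# created FILESYSTEM@glacier-new when FILESYSTEM@glacier exists.
backup_mode() {
    local fs="$1" changes
    if ! "$ZFS" list -H -t snapshot -o name "${fs}@glacier" >/dev/null 2>&1; then
        echo full
        return 0
    fi
    # zfs diff prints one line per changed path, and nothing if the snapshots match.
    changes=$("$ZFS" diff "${fs}@glacier" "${fs}@glacier-new") || return 1
    if [ -z "$changes" ]; then
        echo skip
    else
        echo incremental
    fi
}

# Hypothetical usage:
#   backup_mode vat/testfs
```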