It’s the time of year to think about those plans for making the next year better — lose weight, quit smoking, learn some new skill.
Here’s what I’m urging people to do: take advantage of very inexpensive and easy-to-use cloud storage services and make backup copies of your precious personal data, such as digital photographs, movies, and more.
Amazon Glacier is a very low-cost storage service designed and priced for long-term archiving and backup. It’s just US$ 0.01 per gigabyte per month in their US-East-1 and US-West-2 regions, nominally Virginia and Oregon, slightly more in other regions. I can back up over 50,000 pictures for less than $0.60 per month:
% find Pictures/ -type f | wc -l
51816
% du -sh Pictures/
54G	Pictures/
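The arithmetic behind that estimate is easy to check. Here is a minimal sketch: `monthly_cost` is a hypothetical helper name, and the US$ 0.01 rate is simply the price quoted above, so substitute whatever your region currently charges.

```shell
#!/bin/sh
# monthly_cost SIZE_GB RATE_PER_GB
# Hypothetical helper: prints the estimated monthly storage charge in dollars.
# The rate is an input, not something looked up from Amazon.
monthly_cost() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# 54 GB at US$ 0.01 per gigabyte per month
monthly_cost 54 0.01    # prints 0.54
```

This covers storage only; any request or retrieval charges would be on top of it.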
As we discuss in Learning Tree’s Cloud Security Essentials course, you can’t really prove data availability in any rigorous sense. But Amazon describes their design as providing average annual durability of 99.999999999% for an archive. That is, they expect only a 1-in-100,000,000,000 chance (1 in 100 billion) that a given archived file would be lost within one year.
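Converting the quoted durability figure into odds is a short calculation. The second line is only an illustrative estimate: it assumes each of the 51,816 files counted above is stored as its own archive and that losses are independent, neither of which is confirmed by Amazon's description.

```latex
\begin{align*}
P(\text{one archive lost in one year}) &= 1 - 0.99999999999 = 10^{-11} \\
E[\text{archives lost per year}] &\approx 51{,}816 \times 10^{-11} \approx 5.2 \times 10^{-7}
\end{align*}
```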
Nothing is perfect, and magnetic tape is far from magic. Should your company or government agency have that much confidence in your archiving solution? If so, please let the rest of us know how you accomplish that!
Amazon’s design uses redundancy and geographic diversity. An archive is stored at multiple facilities, and on multiple devices within each facility.
They only provide limited details, but there is hardware RAID at the very bottom, coupled with aggressive replacement of hardware well before it starts to show recoverable errors.
The data resides on any one platform for only a limited time, and is periodically rewritten onto a new platform (new for the data set, not necessarily built on just-out-of-the-box disks). At any moment there are at least three copies, plus at least a partial fourth while a new write is in progress. Hash values are calculated for the multiple copies of each data set to detect when an error has somehow crept in. If one of the three differs, it is immediately re-created from the other two. It would be interesting to know how often that happens, but Amazon doesn’t say.
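You can apply the same compare-and-repair idea to your own local copies. The sketch below is an illustration of the general technique, not Amazon's implementation, which they don't publish. The function names `digest` and `repair_replicas` are made up for this example, and it assumes the GNU `sha256sum` tool is installed (on some systems the equivalent is `shasum -a 256`).

```shell
#!/bin/sh
# Hypothetical sketch: compare SHA-256 digests of three copies of a file and
# overwrite the one that disagrees with the other two.

# Print only the hex digest of a file (assumes GNU sha256sum).
digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# repair_replicas COPY_A COPY_B COPY_C
repair_replicas() {
    a=$(digest "$1"); b=$(digest "$2"); c=$(digest "$3")
    if [ "$a" = "$b" ] && [ "$b" = "$c" ]; then
        echo "all three copies agree"
    elif [ "$a" = "$b" ]; then
        cp "$1" "$3"; echo "repaired $3"
    elif [ "$a" = "$c" ]; then
        cp "$1" "$2"; echo "repaired $2"
    elif [ "$b" = "$c" ]; then
        cp "$2" "$1"; echo "repaired $1"
    else
        # All three differ, so there is no majority to trust.
        echo "no two copies agree; cannot repair" >&2
        return 1
    fi
}
```

A call such as `repair_replicas disk1/photo.jpg disk2/photo.jpg disk3/photo.jpg` (paths invented for illustration) leaves all three files identical whenever at least two of them still match.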
While Amazon’s design is very impressive for data availability and integrity, I would not trust it at all for confidentiality. They tout their use of encryption, with uploads over TLS only and the data being encrypted with AES using 256-bit keys. Keys that you don’t control, that you don’t even see, and which Amazon will happily and silently turn over to the U.S. Government.
If you want confidentiality, do it yourself! Get the OpenSSL toolkit and encrypt your files before uploading them. (We show you how to do this in the Cloud Security Essentials course.) Do not lose your key, as this is real cryptography and there’s no “back door”.
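As a rough sketch of what encrypt-before-upload can look like with the `openssl enc` command: the wrapper names `encrypt_file` and `decrypt_file` and the environment variable `BACKUP_PASS` are invented for this example, and the `-pbkdf2` option assumes OpenSSL 1.1.1 or later. This is one reasonable approach, not necessarily the procedure taught in the course.

```shell
#!/bin/sh
# Hypothetical wrappers around `openssl enc`. The passphrase is read from the
# exported environment variable BACKUP_PASS (a name chosen for this sketch).

# encrypt_file PLAINTEXT_IN CIPHERTEXT_OUT
encrypt_file() {
    openssl enc -aes-256-cbc -salt -pbkdf2 \
        -in "$1" -out "$2" -pass env:BACKUP_PASS
}

# decrypt_file CIPHERTEXT_IN PLAINTEXT_OUT
decrypt_file() {
    openssl enc -d -aes-256-cbc -pbkdf2 \
        -in "$1" -out "$2" -pass env:BACKUP_PASS
}
```

After `export BACKUP_PASS='...'`, something like `encrypt_file pictures.tar pictures.tar.enc` produces the file you would upload, and `decrypt_file` reverses it after retrieval. Test the round trip on a small file before trusting it with the only copy of anything.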
Let’s all (including our data) be safe in the new year!