In my last blog post, I explained how you can verify that Amazon cloud storage is zeroized before being redeployed for the next user.
But what about the cloud provider browsing through your data? After all, it’s stored on their hardware in their facility, so they have physical access plus the ability to interact with the virtualization hypervisors underlying your virtual machines and storage. Should we worry?
Let’s start by looking at the hardware. The disks are grouped into RAID arrays. RAID-based virtual disks are grouped into logical volumes. Your cloud storage volumes are then built on top of that.
Making your way down from what you perceive as a disk takes you through two levels of division, or really shredding: logical volumes are spread across RAID arrays, and each array is built from many physical devices. This is like the output of an immense community shredder, as your cloud systems share that infrastructure with other cloud customers. How many? You can’t tell, but it’s safe to guess that the number is large. And this is a good thing, because a would-be browser of your data is further engulfed in a snowstorm of irrelevant file fragments.
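To make the shredding picture concrete, here is a toy sketch of round-robin striping, the basic idea behind spreading data across the members of an array. It is an illustration under simplifying assumptions, not a description of how any provider actually lays out storage: the function names, the four-disk array, and the 4-byte chunk size are all made up for the example, and real arrays add parity, mirroring, and much larger chunks.

```python
def stripe(data, n_disks, chunk_size):
    """Split data into fixed-size chunks and deal them out round-robin.

    Returns one list of chunks per (hypothetical) physical disk.
    """
    disks = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % n_disks].append(data[offset:offset + chunk_size])
    return disks


def reassemble(disks):
    """Rebuild the original data; this only works with every disk in hand."""
    chunks = []
    depth = max(len(d) for d in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


# Illustrative values only: a 32-byte "file" on a 4-disk array, 4-byte chunks.
secret = b"my confidential customer records"
disks = stripe(secret, n_disks=4, chunk_size=4)

for number, contents in enumerate(disks):
    print("disk", number, contents)
```

Anyone holding a single drive from this toy array sees only every fourth chunk of the file. Add a second layer of division plus other customers' chunks interleaved on the same drives, and the fragments become much harder to make sense of.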
Don’t forget that the cloud providers regularly replace and destroy storage hardware. So if some isolated shreds of your data end up on a given drive, how long is that drive going to stay in the array?
Next, look at the financial pressure on the provider. Amazon, Google, Microsoft and the other cloud providers are in business to make money. They know that everyone’s greatest worry is confidentiality, and they will lose lots of potential income if people lose confidence. So they work very hard to combine technology and processes to strictly limit staff access, and to detect and shut down any attempted violations.
Finally, consider the scale of the problem. Gmail gives you 10 GB of storage for free, more if you are willing to pay. As of January 2012 they had 350 million users. Do the math, realize that the vast majority of the content is trivial, and where is your motivation for casual browsing?
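Here is that math as a rough sketch. It uses the two figures quoted above and treats the free quota as if every user filled it, so the result is an upper bound on allocated space rather than a measurement of what is actually stored. The 1 GB/s reading rate is a purely hypothetical number chosen to give a sense of scale.

```python
# Figures quoted in the text above.
FREE_QUOTA_GB = 10          # free storage per user
USERS = 350_000_000         # users as of January 2012

# Assumption for illustration only: a sustained reading rate of 1 GB/s.
HYPOTHETICAL_READ_BYTES_PER_SEC = 10**9

SECONDS_PER_YEAR = 365 * 24 * 3600


def total_quota_bytes(users, quota_gb):
    """Upper bound on storage if every user filled the free quota (decimal GB)."""
    return users * quota_gb * 10**9


def years_to_read(total_bytes, bytes_per_second):
    """Time to read everything once at a constant rate, in years."""
    return total_bytes / bytes_per_second / SECONDS_PER_YEAR


total = total_quota_bytes(USERS, FREE_QUOTA_GB)
years = years_to_read(total, HYPOTHETICAL_READ_BYTES_PER_SEC)

print("Total free quota:", total / 10**18, "exabytes")
print("Years to read it once at 1 GB/s:", round(years))
```

That works out to 3.5 exabytes of free quota, and roughly 111 years to read it all once at the assumed rate. Even if actual usage is a small fraction of the quota, nobody is paging through it out of idle curiosity.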
Knowing about all of that, I felt pretty comfortable. Then I taught a class attended by a guy who does data forensics for a U.S. government agency. Think about it: his job is to violate other people’s confidentiality.
We were talking about this issue, and I asked him whether he could share any thoughts on recovering cloud customer data from a U.S. cloud provider that cooperates with government requests.
He just laughed.
You choose your battles, and this is one you have no hope of winning. This is why investigators use keyloggers, wiretaps and other techniques: even with full cooperation from the cloud provider, enormous effort would gain very little.
That sounds secure enough for me!
These sorts of interesting conversations come up all the time in courses like Learning Tree’s cloud security course.
I like teaching classes where I get to learn things, too.