For the sake of your compliance, you hope that the answer is “nobody”!
For most organizations, confidentiality is the greatest concern. The most worrying thing about using cloud technology is storing your data on someone else’s hardware.
But how big of a risk is this, really?
Stored data should be encrypted.
However, how much of your data is simply uploaded to the cloud for static archiving? Don’t you process some, if not much or even all, of it? Unless you are using homomorphic encryption, your data will need to be in plaintext form for processing. So, a plaintext copy of the data is written to the disk, used for some processing, and eventually deleted when the processing results are encrypted into ciphertext.
Remember that deleting a file simply removes the reference to that file in the directory containing it. The contents are not erased.
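If you want to see that for yourself, here is a minimal Python sketch. It assumes a POSIX-style system such as Linux, where a file can be unlinked while it is still open. The function name `read_after_unlink` is made up for this illustration. It writes a few bytes, removes the directory entry, and then reads the same bytes back through the still-open descriptor.

```python
import os
import tempfile
from typing import Tuple


def read_after_unlink(payload: bytes) -> Tuple[bool, bytes]:
    """Write payload to a temp file, unlink it while it is still open,
    then read the data back through the open file descriptor.

    Assumes POSIX semantics: unlink removes the name, not the contents.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                      # remove the directory entry only
        name_gone = not os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)         # rewind to the start of the data
        data = os.read(fd, len(payload))     # the contents are still there
    finally:
        os.close(fd)
    return name_gone, data


if __name__ == "__main__":
    gone, data = read_after_unlink(b"sensitive plaintext")
    print("name removed:", gone)
    print("data still readable:", data)
```

This only shows that removing the name does not erase the contents while something still holds the file open. What happens to the underlying disk blocks after the last reference goes away depends on the file system and the storage device.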
That is what worries potential cloud users: plaintext copies of their sensitive data get written to hardware owned by others. We know we must not sell or give away our old storage devices and media without properly sanitizing them by overwriting under our careful supervision. This makes cloud-based storage seem so wrong!
I would suggest that we need a little calm thought about what really happens at the cloud provider.
Before you start, think about how you must finish. At some point you will be done with a cloud resource and you will turn it loose. What happens to your data bits? An experiment would be helpful.
Deploy an Amazon EC2 instance, and also deploy an EBS storage volume in the same region and availability zone. Connect that volume to the instance.
The disk device that came with the instance, probably /dev/xvda, is for the operating system. It’s the other disk, probably /dev/xvdf, where your sensitive data would go.
Think about what is happening. Amazon did not deploy brand new hardware when you asked for that storage volume. Their system automatically carved it out of pre-existing storage that has already been used who knows how many times. This means that you have been given storage media that has had other customers’ sensitive data stored on it! But let’s see what is really visible. Hexdump that device, and you see nothing but zeros!
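Scrolling through a hexdump works, but a short script can confirm that every byte is zero. The following is a rough Python sketch, not a forensic tool. The function name `first_nonzero_offset` and its parameters are invented for this example. It reads the device in chunks and reports the offset of the first non-zero byte, or `None` if it finds only zeros.

```python
from typing import Optional


def first_nonzero_offset(path: str,
                         chunk_size: int = 1024 * 1024,
                         limit: Optional[int] = None) -> Optional[int]:
    """Return the byte offset of the first non-zero byte in path,
    or None if every byte read was zero.

    limit, if given, caps how many bytes are examined.
    """
    offset = 0
    with open(path, "rb") as f:
        while limit is None or offset < limit:
            want = chunk_size if limit is None else min(chunk_size, limit - offset)
            chunk = f.read(want)
            if not chunk:                      # end of device or file
                return None
            stripped = chunk.lstrip(b"\x00")   # drop the leading zero bytes
            if stripped:
                return offset + len(chunk) - len(stripped)
            offset += len(chunk)
    return None


if __name__ == "__main__":
    # /dev/xvdf is the device name guessed at in the text; yours may differ.
    # Reading a raw block device normally requires root privileges.
    result = first_nonzero_offset("/dev/xvdf")
    print("all zeros" if result is None else f"non-zero byte at offset {result}")
```

A result of `None` tells you the volume reads back as zeros from inside your instance. It says nothing about how the provider achieves that behind the scenes.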
We just experimentally verified that a newly deployed volume reads back as all zeros. Amazon zeroizes storage, or at least makes it look zeroized to its new owner, before deployment. So, as long as we can trust AWS (and Eucalyptus, an AWS-compatible reimplementation, and OpenStack and cloud systems built on those) to always work the same way, we don’t have to worry about other customers seeing our data.
What about keeping secrets from our provider? More on that next time! Meanwhile, if you think the experiment sounds interesting, we actually do it in Learning Tree’s Cloud Security Essentials course. And right after that, we set up encrypted cloud storage on that device.