I came across this interesting paper from the recently concluded 22nd USENIX Security Symposium: https://www.usenix.org/conference/usenixsecurity13/encryption-deduplicated-storage-dupless. The system is called DupLESS.
Typically, mixing encryption and deduplication does not yield good results. Since each user encrypts with their own key, the resulting ciphertexts differ even when two users encrypt the same file, so deduplication becomes impossible. Message-Locked Encryption was mooted to work around this. Put simply, it encrypts each chunk of data using the chunk's own cryptographic hash as the key, so two identical chunks produce identical ciphertexts and can still be deduplicated. However, this leaks information: if the data is not thoroughly unpredictable, an attacker can brute-force the plaintext by encrypting candidate chunks and comparing the results against the stored ciphertext. As another example, privileged users with access to the storage server can store files of their own and check whether they deduplicate against other users' data, thereby getting an idea of other users' contents even though they are encrypted.
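To make the idea and its weakness concrete, here is a minimal sketch of message-locked encryption in Python. The function names are made up for illustration, and the SHA-256 counter-mode keystream is only a toy stand-in for a real cipher such as AES, chosen so the example runs with the standard library alone. It is not how any particular system implements it.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def mle_encrypt(chunk: bytes) -> Tuple[bytes, bytes]:
    """Encrypt a chunk under a key derived from the chunk itself."""
    key = hashlib.sha256(chunk).digest()
    ciphertext = bytes(a ^ b for a, b in zip(chunk, keystream(key, len(chunk))))
    return key, ciphertext


def mle_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt with the key the owner kept from mle_encrypt."""
    return bytes(a ^ b for a, b in zip(ciphertext, keystream(key, len(ciphertext))))


def brute_force(ciphertext: bytes, candidates: Iterable[bytes]) -> Optional[bytes]:
    """The leak: anyone can encrypt guesses and compare against a stored ciphertext."""
    for guess in candidates:
        if mle_encrypt(guess)[1] == ciphertext:
            return guess
    return None
```

Two users calling mle_encrypt on the same chunk get the same ciphertext, which is what lets the storage server deduplicate. The same property is what brute_force exploits when the set of plausible plaintexts is small.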
The DupLESS system above introduces a key server into the picture to perform authentication and then serve chunk encryption keys in a secure manner. From my understanding, this means that all users authenticating to the same key server will be able to deduplicate data amongst themselves. An organization using a cloud storage service that deduplicates at the back end will be able to deduplicate data among its users by running a local, secured key server. This will prevent the storage provider or any external privileged user from gleaning any information about the data. A trusted third party can also provide a key service that is shared among groups or categories of users, while not allowing the storage provider access to the key service. Neither the key server nor the storage service can glean any information about the plaintext data.
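One way a key server could hand out chunk keys without learning anything about the chunk is a blinded request in the style of RSA blind signatures. The sketch below is my own illustration under that assumption, not code from the DupLESS release. The function names are hypothetical, authentication is left out, and the RSA parameters are toy values that are far too weak for real use.

```python
import hashlib
import math
import secrets

# Toy RSA parameters for a hypothetical key server (two Mersenne primes).
# Far too small and structured for real use; illustration only.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, known only to the key server


def key_server_sign(blinded: int) -> int:
    """Key server: applies its secret exponent to a blinded value it cannot interpret."""
    return pow(blinded, D, N)


def derive_chunk_key(chunk: bytes) -> bytes:
    """Client: get a chunk key from the key server without revealing the chunk hash."""
    h = int.from_bytes(hashlib.sha256(chunk).digest(), "big") % N
    # Pick a random blinding factor r that is invertible mod N.
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    blinded = (h * pow(r, E, N)) % N          # the only value the server sees
    signed = key_server_sign(blinded)         # equals h^D * r mod N
    unblinded = (signed * pow(r, -1, N)) % N  # equals h^D mod N, independent of r
    # Check the server's answer against its public key (E, N).
    if pow(unblinded, E, N) != h:
        raise ValueError("key server returned an invalid response")
    return hashlib.sha256(unblinded.to_bytes((N.bit_length() + 7) // 8, "big")).digest()
```

Every client talking to the same key server derives the same key for the same chunk, so ciphertexts still deduplicate. Someone without access to the key server cannot compute keys for guessed plaintexts offline, which blunts the brute-force attack on plain message-locked encryption.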
Very interesting stuff. The source code and a bunch of background is available at this link: http://cseweb.ucsd.edu/users/skeelvee/dupless/