
There was a Swiss cloud backup service that existed until a few years ago... I can't recall the name, but it started with a 'V'. They also encrypted files on the client side before transmission, but each file was encrypted using a hash of its own contents (md5sum or some such) as the key, so identical files from different systems produced identical ciphertext and could still be de-duplicated across their whole system.


Interesting! I can picture how the clients could calculate a hash prior to encryption, which would let the server know that two files have the same contents once decrypted. But how would that save disk space? The server still can't see the contents of the file even if it knows two copies are the same, so how could it deduplicate the storage? If it drops either copy, it's left with a single version encrypted under only one client's key, which it can't serve up to anyone else.


I assume they had a kind of pool for files, and a system linking files (or should I say "blobs") to each client's directory layout. Kind of like if I have a disk with different subdirectories: I could run a tool (such tools do exist) to find duplicates, delete all except one copy, and hardlink the rest to that one.
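The hardlink trick can be sketched in a few lines of Python. This is only a rough illustration of the idea, assuming SHA-256 for content hashing; real dedup tools are more careful about permissions, symlinks, races, and files too large to read into memory:

    import hashlib, os, pathlib

    def hardlink_duplicates(root: str) -> None:
        # Map content hash -> first path seen with that content.
        seen = {}
        for path in sorted(pathlib.Path(root).rglob("*")):
            if not path.is_file() or path.is_symlink():
                continue
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            if digest in seen:
                path.unlink()                 # drop the duplicate copy...
                os.link(seen[digest], path)   # ...and hardlink it to the survivor
            else:
                seen[digest] = path

After this runs, every set of identical files shares one inode, so the duplicate contents are stored only once on disk.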

As for the cloud storage system: the files were, as mentioned, stored in encrypted form, using a hash of the original file as the key (possibly md5, possibly something else, I can't recall at the moment). The cloud provider didn't know the key, but the client's application did. The same encrypted file could be handed out to every client that had it, and each of those clients could decrypt it, because the clients keep the encryption keys (the original hashes, one per file).
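This scheme is generally known as convergent encryption. A minimal sketch in Python, assuming SHA-256 as the content hash and AES-GCM from the 'cryptography' package (the actual service may well have used md5 and a different cipher, as noted above; the function names here are just illustrative):

    import hashlib
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def convergent_encrypt(plaintext: bytes):
        key = hashlib.sha256(plaintext).digest()          # key = hash of the content
        nonce = hashlib.sha256(key).digest()[:12]         # deterministic nonce
        ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
        blob_id = hashlib.sha256(ciphertext).hexdigest()  # what the server dedups on
        return blob_id, ciphertext, key                   # client keeps the key

    def convergent_decrypt(ciphertext: bytes, key: bytes) -> bytes:
        nonce = hashlib.sha256(key).digest()[:12]
        return AESGCM(key).decrypt(nonce, ciphertext, None)

Because the key and nonce are derived purely from the file's contents, two clients encrypting the same file produce byte-identical ciphertext and the same blob id, so the server can keep a single copy without ever learning the key. The flip side is that deterministic encryption leaks which users hold identical files, and that leak is precisely what makes the cross-user dedup possible.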

I don't have the details anymore; there used to be a document describing the whole scheme, but I probably got rid of it after they shut the service down (I used it for several years with no issues).



