A few people at the MozCamp were interested in Weave’s use of cryptography to protect the user’s data and privacy. Although the specs for the Weave server are available, it may take someone new a while to wrap their head around the whole scheme. I’m going to attempt explaining what crypto operations we do and why we do it in this blog post.
First, let’s get some basic definitions out of the way. Symmetric cryptography means you have one key that can perform both encryption and decryption, and they are complementary operations. For Weave, we use AES with a 256 bit key, and we use it in a mode that requires an ‘initialization vector’ for every decryption. Asymmetric cryptography means there’s a pair of keys (usually called ‘public’ and ‘private’ keys). A piece of text “encrypted” by one key can only be “decrypted” by the other key. Here, we use RSA with a 2048 bit private key.
So, when a user first signs up for Weave using the wizard on their computer, we generate a (random) pair of public and private keys. Next, we use the user’s passphrase to create a symmetric key. This is done using a pretty standard algorithm known as PBKDF2 (short for “Password Key Derivation Function”). The PBKDF2 algorithm requires a ‘salt’ value which is also stored on the server. Now that we have a symmetric key, we use it to encrypt the user’s private key and upload it along with the public key to the server. Note that the passphrase is never sent to the server, so if the user’s password ever gets compromised all the attacker can get is their encrypted private key, which really isn’t of much use (especially given that the key is 2048 bits long).
Whenever a particular “engine” is to be synchronized (an engine could be Tabs, Bookmarks, History etc.) we generate a random symmetric key for that engine. This key is then encrypted using the user’s public key (now, one can only retrieve the original symmetric key with the corresponding private key) and uploaded as being associated with a particular engine. All entries (the ‘ciphertext’ property in a “Weave Basic Object”) in that engine are encrypted with the symmetric key that was generated for it.
To make things clear, let’s enumerate the steps we would take to decrypt a single tab object for user ‘foo’:
Find the user’s cluster by making a GET request to
https://services.mozilla.com/user/1/foo/node/weave. It returns
Fetch the user’s encrypted private key and public key from
https://sj-weave06.services.mozilla.com/0.5/foo/storage/keys/pubkeyrespectively. The user’s password is required to access these JSON objects.
Ask the user for their passphrase and generate a 256 bit symmetric key from it using PBKDF2 and the ‘salt’ found in the privkey object.
Use the generated symmetric key and the initialization vector found in the ‘iv’ property of the privkey object to decrypt the user’s private key.
Fetch the user’s encrypted tab objects from
Fetch the corresponding symmetric key (the URL is also listed in the “encryption” property of every WBO), in this case
Decrypt the symmetric key with the user’s private key.
Use the decrypted symmetric key to decrypt any WBO from the tabs collection with the initialization vector found in the ‘bulkIV’ property of the tabs symmetric key WBO.
A word about the formats in which the keys are actually stored in. All values are Base64. For symmetric keys, the key is stored as-is. For asymmetric keys, I wish we used a standard format like PKCS#12, but we don’t. It’s still ASN.1 though, in some format NSS exports private keys in. You need to do a bit of ASN.1 parsing to figure out the values you’re interested in.
Finally, a quick note about why we do all this. Sharing is now reasonably easy, if you want to share your bookmarks with someone, you just need to encrypt the corresponding symmetric key with their public key and they’re good to go. Also, each WBO has it’s own ‘encryption’ property so this can be as granular as needed. Secondly, the passphrase is never stored anywhere (except possibly on the user’s computer) so the server never sees anything other than encrypted blobs of Base64'ed text. Along with making HTTPS mandatory, we think this is a pretty secure way of protecting the user’s data.
If you have other encryption schemes that might fit into Weave’s use cases please let us know! (We’ve already been looking at interesting developments in this area such as Tahoe). I’d also love to hear from you if you have any questions on our current cryptography scheme. We’re constantly trying to improve the security and efficiency of our system so these details are only valid until we change our scheme :-)
Now, go write that third-party Weave client, you have no excuse not to!