├── .gitignore ├── LICENSE ├── README.md ├── src ├── publickey.py └── secretkey.py └── doc ├── chapter2.txt ├── intro.txt ├── chapter4.txt ├── chapter5.txt ├── chapter3.txt └── chapter1.txt /.gitignore: -------------------------------------------------------------------------------- 1 | # emacs is creative in coming up with new garbage to name backup files as 2 | \#*.*# 3 | .*.swp 4 | *.*~ 5 | .#* 6 | .hg 7 | *.pyc 8 | .hgignore 9 | .hg 10 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013 Kyle Isom 2 | 3 | Permission to use, copy, modify, and distribute this software for any 4 | purpose with or without fee is hereby granted, provided that the above 5 | copyright notice and this permission notice appear in all copies. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES 8 | WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF 9 | MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR 10 | ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES 11 | WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN 12 | ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF 13 | OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 14 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Introduction to Cryptography with PyCrypto 2 | ========================================== 3 | 4 | author: kyle isom 5 | 6 | Read online: 7 | * [Github page](http://kisom.github.com/crypto_intro) 8 | * [Leanpub](https://leanpub.com/pycrypto/read) 9 | 10 | This is the repository for an intro to crypto using PyCrypto. 
It is aimed 11 | at introducing basic cryptography topics to programmers who are 12 | unfamiliar with cryptography. It uses the 13 | [PyCrypto](https://www.dlitz.net/software/pycrypto/) 14 | library. 15 | 16 | The tutorial is in the doc/ folder, where there is a LaTeX source file and a 17 | Makefile to build the PDF and (eventually) the markdown file. I've 18 | 19 | It includes a library of sample code to illustrate use of the PyCrypto 20 | library as well as a tutorial. The sample code is purposefully simple to 21 | illustrate clearly how to use the software. 22 | 23 | Included source files (in src/): 24 | 25 | secretkey.py secret key cryptographic functions 26 | publickey.py public key cryptographic functions 27 | 28 | Contributors: 29 | * [zenmower](https://github.com/clarke187) provided grammar, spelling, and 30 | readability critiques. 31 | * [Kim Lidström](https://github.com/dxtr) provided spelling critiques. 32 | * [Erik Musick](http://erikmusick.com/) provided grammar and spelling critiques. 33 | 34 | 35 | -------------------------------------------------------------------------------- /src/publickey.py: -------------------------------------------------------------------------------- 1 | # publickey.py: public key cryptographic functions 2 | """ 3 | Public-key functions from chapter 3 of "A Working Introduction to 4 | Cryptography with Python". 5 | """ 6 | 7 | import Crypto.Hash.SHA384 as SHA384 8 | import pyelliptic 9 | import secretkey 10 | import struct 11 | 12 | 13 | __CURVE = 'secp521r1' 14 | 15 | 16 | def generate_key(): 17 | """Generate a new elliptic curve keypair.""" 18 | return pyelliptic.ECC(curve=__CURVE) 19 | 20 | 21 | def sign(priv, msg): 22 | """Sign a message with the ECDSA key.""" 23 | return priv.sign(msg) 24 | 25 | 26 | def verify(pub, msg, sig): 27 | """ 28 | Verify the public key's signature on the message. pub should 29 | be a serialised public key. 
30 | """ 31 | return pyelliptic.ECC(curve='secp521r1', pubkey=pub).verify(sig, msg) 32 | 33 | 34 | def shared_key(priv, pub): 35 | """Generate a new shared encryption key from a keypair.""" 36 | key = priv.get_ecdh_key(pub) 37 | key = key[:32] + SHA384.new(key[32:]).digest() 38 | return key 39 | 40 | 41 | def encrypt(pub, msg): 42 | """ 43 | Encrypt the message to the public key using ECIES. The public key 44 | should be a serialised public key. 45 | """ 46 | ephemeral = generate_key() 47 | key = shared_key(ephemeral, pub) 48 | ephemeral_pub = struct.pack('>H', len(ephemeral.get_pubkey())) 49 | ephemeral_pub += ephemeral.get_pubkey() 50 | return ephemeral_pub+secretkey.encrypt(msg, key) 51 | 52 | 53 | def decrypt(priv, msg): 54 | """ 55 | Decrypt an ECIES-encrypted message with the private key. 56 | """ 57 | ephemeral_len = struct.unpack('>H', msg[:2])[0] 58 | ephemeral_pub = msg[2:2+ephemeral_len] 59 | key = shared_key(priv, ephemeral_pub) 60 | return secretkey.decrypt(msg[2+ephemeral_len:], key) 61 | -------------------------------------------------------------------------------- /doc/chapter2.txt: -------------------------------------------------------------------------------- 1 | # ASCII-Armouring 2 | 3 | I'm going to take a quick detour and talk about ASCII armouring. If 4 | you've played with the crypto functions above, you'll notice they 5 | produce an annoying dump of binary data that can be a hassle to 6 | deal with. One common technique for making the data a little bit 7 | easier to deal with is to encode it with base64. There are a 8 | few ways to incorporate this into Python: 9 | ## Absolute Base64 Encoding 10 | The easiest way is to just base64 encode everything in the encrypt function. Everything that goes into the 11 | decrypt function should be in base64 - if it's not, the `base64` 12 | module will throw an error: you could catch this and then try to 13 | decode it as binary data. 
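As a rough illustration of the "encode everything, fall back to binary on error" approach, here is a minimal sketch. It assumes Python 3 and the standard `base64` module (the tutorial's own listings are Python 2), and the helper names `armour` and `dearmour` are made up for this example rather than taken from the sample library.

```python
import base64
import binascii


def armour(ciphertext):
    """Base64-encode raw ciphertext bytes."""
    return base64.b64encode(ciphertext)


def dearmour(blob):
    """Decode base64 input; fall back to treating it as raw binary."""
    try:
        # validate=True rejects any byte outside the base64 alphabet.
        return base64.b64decode(blob, validate=True)
    except binascii.Error:
        # Not valid base64, so assume the caller passed binary data.
        return blob


raw = b'\x00\x9f\xff binary ciphertext'
assert dearmour(armour(raw)) == raw   # armoured input round-trips
assert dearmour(raw) == raw           # binary input passes through
```

One weakness of this sketch is that binary data which happens to look like valid base64 would be decoded by mistake, which is one reason an explicit marker byte is attractive.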
14 | 15 | ## A Simple Header 16 | 17 | A slightly more complex option, and the one I adopt in this 18 | article, is to use a `\x00` as the first byte of the ciphertext for 19 | binary data, and to use `\x41` (an ASCII "`A`") for ASCII encoded 20 | data. This will increase the complexity of the encryption and 21 | decryption functions slightly. We'll also pack the initialisation 22 | vector at the beginning of the file as well. Given now that the 23 | `iv` argument might be `None` in the decrypt function, I will have 24 | to rearrange the arguments a bit; for consistency, I will move it 25 | in both functions. My modified functions look like this now: 26 | 27 | ```python 28 | def encrypt(data, key, armour=False): 29 | """ 30 | Encrypt data using AES in CBC mode. The IV is prepended to the 31 | ciphertext. 32 | """ 33 | data = pad_data(data) 34 | ivec = generate_nonce() 35 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 36 | ctxt = aes.encrypt(data) 37 | tag = new_tag(ivec+ctxt, key[__AES_KEYLEN:]) 38 | if armour: 39 | return '\x41' + (ivec + ctxt + tag).encode('base64') 40 | else: 41 | return '\x00' + ivec + ctxt + tag 42 | 43 | def decrypt(ciphertext, key): 44 | """ 45 | Decrypt a ciphertext encrypted with AES in CBC mode; assumes the IV 46 | has been prepended to the ciphertext. 
47 | """ 48 | if ciphertext[0] == '\x41': 49 | ciphertext = ciphertext[1:].decode('base64') 50 | else: 51 | ciphertext = ciphertext[1:] 52 | if len(ciphertext) <= AES.block_size: 53 | return None, False 54 | tag_start = len(ciphertext) - __TAG_LEN 55 | ivec = ciphertext[:AES.block_size] 56 | data = ciphertext[AES.block_size:tag_start] 57 | if not verify_tag(ciphertext, key[__AES_KEYLEN:]): 58 | return None, False 59 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 60 | data = aes.decrypt(data) 61 | return unpad_data(data), True 62 | ``` 63 | 64 | ## A More Complex Container 65 | 66 | 67 | There are more complex ways to do it (and you’ll see it with the 68 | public keys in the next section) that involve putting the base64 into 69 | a container of sorts that contains additional information about the 70 | key. 71 | -------------------------------------------------------------------------------- /doc/intro.txt: -------------------------------------------------------------------------------- 1 | # Introduction 2 | 3 | Recently at work I have been using the 4 | [PyCrypto](https://www.dlitz.net/software/pycrypto/) libraries quite a 5 | bit. The documentation is pretty good, but there are a few areas that 6 | took me a bit to figure out. In this post, I’ll be writing up a quick 7 | overview of the PyCrypto library and cover some general things to know 8 | when writing cryptographic code in general. I’ll go over symmetric, 9 | public-key, hybrid, and message authentication codes. Keep in mind 10 | this is a quick introduction and a lot of gross simplifications are 11 | made. For a more complete introduction to cryptography, take a look at 12 | the references at the end of this article. This article is just an 13 | appetite-whetter - if you have a real need for information security 14 | you should hire an expert. 
Real data security goes beyond this quick 15 | introduction (you wouldn’t trust the design and engineering of a 16 | bridge to a student with a quick introduction to civil engineering, 17 | would you?) 18 | 19 | Some quick terminology: for those unfamiliar, I introduce the 20 | following terms: 21 | 22 | * plaintext: the original message 23 | 24 | * ciphertext: the message after cryptographic transformations are 25 | applied to obscure the original message. 26 | 27 | * encrypt: producing ciphertext by applying cryptographic 28 | transformations to plaintext. 29 | 30 | * decrypt: producing plaintext by applying cryptographic 31 | transformations to ciphertext. 32 | 33 | * cipher: a particular set of cryptographic transformations providing 34 | means of both encryption and decryption. 35 | 36 | * hash: a set of cryptographic transformations that take a large input 37 | and transform it to a unique (typically fixed-size) output. For 38 | hashes to be cryptographically secure, collisions should be 39 | practically nonexistent. It should be practically impossible to 40 | determine the input from the output. 41 | 42 | Cryptography is an often misunderstood component of information 43 | security, so an overview of what it is and what role it plays is in 44 | order. There are four major roles that cryptography plays: 45 | 46 | * confidentiality: ensuring that only the intended recipients receive 47 | the plaintext of the message. 48 | 49 | * data integrity: the plaintext message arrives unaltered. 50 | 51 | * entity authentication: the identity of the sender is verified. An 52 | entity may be a person or a machine. 53 | 54 | * message authentication: the message is verified as having been 55 | unaltered. 56 | 57 | Note that cryptography is used to obscure the contents of a message 58 | and verify its contents and source. It will **not** hide the fact that 59 | two entities are communicating. 
60 | 61 | There are two basic types of ciphers: symmetric and public-key 62 | ciphers. A symmetric-key cipher uses a shared secret 63 | key. Symmetric ciphers also tend to be much faster than public-key ciphers. A 64 | public-key cipher is so called because each key pair consists of a private 65 | key, which is used to generate a public key. As their names imply, 66 | the private key is kept secret while the public key is passed 67 | around. First, I’ll take a look at a specific type of symmetric 68 | cipher: block ciphers. 69 | 70 | 71 | 72 | -------------------------------------------------------------------------------- /doc/chapter4.txt: -------------------------------------------------------------------------------- 1 | # Key Exchange 2 | 3 | So how does Bob know the key actually belongs to Alice? There are two 4 | main schools of thought regarding the authentication of key ownership: 5 | centralised and decentralised. TLS/SSL follow the centralised school: 6 | a root certificate[^rootcert] authority (CA) signs intermediary CA 7 | keys, which then sign user keys. For example, if Bob runs Foo Widgets, 8 | LLC, he can generate an SSL keypair. From this, he generates a 9 | certificate signing request, and sends this to the CA. The CA, usually 10 | after taking some money and ostensibly actually verifying Bob's 11 | identity[^caverify], then signs Bob's certificate. Bob sets up his 12 | webserver to use his SSL certificate for all secure traffic, and Alice 13 | sees that the CA did in fact sign his certificate. This relies on 14 | trusted central authorities, like VeriSign[^verisign]. Alice's web 15 | browser would ship with a keystore of select trusted CA public keys 16 | (like VeriSign's) that she could use to verify signatures on the 17 | certificates from various sites. This system is called a public key 18 | infrastructure. The other school of thought is followed by PGP[^pgp] - 19 | the decentralised model. 
20 | 21 | In PGP, this is manifested as the Web of Trust[^wot]. For example, if 22 | Carol now wants to talk to Bob and gives Bob her public key, Bob can 23 | check to see if Carol's key has been signed by anyone else. We'll also 24 | say that Bob knows for a fact that Alice's key belongs to Alice, and 25 | he trusts her[^trust], and that Alice has signed Carol's key. Bob sees 26 | Alice's signature on Carol's key and then can be reasonably sure that 27 | Carol is who she says she is. If we repeat the process with Dave, 28 | whose key was signed by Carol (whose key was signed by Alice), Bob 29 | might be reasonably certain that the key belongs to Dave, but 30 | maybe he doesn't really trust Carol to properly verify identities. Bob 31 | can mark keys as having various trust levels, and from this a web of 32 | trust emerges: a picture of how well you can trust that a given key 33 | belongs to a given user. 34 | 35 | The key distribution problem is not a quick and easy problem to 36 | solve; a lot of very smart people have spent a lot of time coming 37 | up with solutions to the problem. There are key exchange protocols 38 | (such as the Diffie-Hellman key exchange[^dh] and IKE[^ike], which 39 | uses Diffie-Hellman) that provide alternatives to the web of trust 40 | and public key infrastructures. 41 | 42 | [^rootcert]: A certificate is a public key encoded with X.509 and 43 | which can have additional informational attributes attached, such as 44 | organisation name and country. 45 | 46 | [^caverify]: The extent to which this actually happens varies widely based on the different CAs. 47 | 48 | [^verisign]: There is some question as to whether VeriSign can 49 | actually be trusted, but that is another discussion for another 50 | day... 51 | 52 | [^pgp]: and GnuPG 53 | 54 | [^wot]: http://www.rubin.ch/pgp/weboftrust.en.html 55 | 56 | [^trust]: It is quite often important to distinguish between *I know 57 | this key belongs to that user* and *I trust that user*. 
This is 58 | especially important with key signatures - if Bob cannot trust 59 | Alice to properly check identities, she might sign a key for an 60 | identity she hasn't checked. 61 | 62 | [^dh]: http://is.gd/Tr0zLP 63 | 64 | [^ike]: https://secure.wikimedia.org/wikipedia/en/wiki/Internet\_Key\_Exchange 65 | -------------------------------------------------------------------------------- /src/secretkey.py: -------------------------------------------------------------------------------- 1 | # secretkey.py: secret-key cryptographic functions 2 | """ 3 | Secret-key functions from chapter 1 of "A Working Introduction to 4 | Cryptography with Python". 5 | """ 6 | 7 | import Crypto.Cipher.AES as AES 8 | import Crypto.Hash.HMAC as HMAC 9 | import Crypto.Hash.SHA384 as SHA384 10 | import Crypto.Random.OSRNG.posix as RNG 11 | import pbkdf2 12 | import streql 13 | 14 | 15 | __AES_KEYLEN = 32 16 | __TAG_KEYLEN = 48 17 | __TAG_LEN = __TAG_KEYLEN 18 | KEYSIZE = __AES_KEYLEN + __TAG_KEYLEN 19 | 20 | 21 | def pad_data(data): 22 | """pad_data pads out the data to an AES block length.""" 23 | # return data if no padding is required 24 | if len(data) % 16 == 0: 25 | return data 26 | 27 | # subtract one byte that should be the 0x80 28 | # if 0 bytes of padding are required, it means only 29 | # a single \x80 is required. 
30 | 31 | padding_required = 15 - (len(data) % 16) 32 | 33 | data = '%s\x80' % data 34 | data = '%s%s' % (data, '\x00' * padding_required) 35 | 36 | return data 37 | 38 | 39 | def unpad_data(data): 40 | """unpad_data removes padding from the data.""" 41 | if not data: 42 | return data 43 | 44 | data = data.rstrip('\x00') 45 | if data[-1] == '\x80': 46 | return data[:-1] 47 | else: 48 | return data 49 | 50 | 51 | def generate_nonce(): 52 | """Generate a random number used once.""" 53 | return RNG.new().read(AES.block_size) 54 | 55 | 56 | def new_tag(ciphertext, key): 57 | """Compute a new message tag using HMAC-SHA-384.""" 58 | return HMAC.new(key, msg=ciphertext, digestmod=SHA384).digest() 59 | 60 | 61 | def verify_tag(ciphertext, key): 62 | """Verify the tag on a ciphertext.""" 63 | tag_start = len(ciphertext) - __TAG_LEN 64 | data = ciphertext[:tag_start] 65 | tag = ciphertext[tag_start:] 66 | actual_tag = new_tag(data, key) 67 | return streql.equals(actual_tag, tag) 68 | 69 | 70 | def decrypt(ciphertext, key): 71 | """ 72 | Decrypt a ciphertext encrypted with AES in CBC mode; assumes the IV 73 | has been prepended to the ciphertext. 74 | """ 75 | if len(ciphertext) <= AES.block_size: 76 | return None, False 77 | tag_start = len(ciphertext) - __TAG_LEN 78 | ivec = ciphertext[:AES.block_size] 79 | data = ciphertext[AES.block_size:tag_start] 80 | if not verify_tag(ciphertext, key[__AES_KEYLEN:]): 81 | return None, False 82 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 83 | data = aes.decrypt(data) 84 | return unpad_data(data), True 85 | 86 | 87 | def encrypt(data, key): 88 | """ 89 | Encrypt data using AES in CBC mode. The IV is prepended to the 90 | ciphertext. 
91 | """ 92 | data = pad_data(data) 93 | ivec = generate_nonce() 94 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 95 | ctxt = aes.encrypt(data) 96 | tag = new_tag(ivec+ctxt, key[__AES_KEYLEN:]) 97 | return ivec + ctxt + tag 98 | 99 | 100 | def generate_salt(salt_len): 101 | """Generate a salt for use with PBKDF2.""" 102 | return RNG.new().read(salt_len) 103 | 104 | 105 | def password_key(passphrase, salt=None): 106 | """Generate a key from a passphrase. Returns the tuple (salt, key).""" 107 | if salt is None: 108 | salt = generate_salt(16) 109 | passkey = pbkdf2.PBKDF2(passphrase, salt, iterations=16384).read(KEYSIZE) 110 | return salt, passkey 111 | -------------------------------------------------------------------------------- /doc/chapter5.txt: -------------------------------------------------------------------------------- 1 | # Source Code Listings 2 | 3 | ## secretkey.py 4 | 5 | ```python 6 | # secretkey.py: secret-key cryptographic functions 7 | """ 8 | Secret-key functions from chapter 1 of "A Working Introduction to 9 | Cryptography with Python". 10 | """ 11 | 12 | import Crypto.Cipher.AES as AES 13 | import Crypto.Hash.HMAC as HMAC 14 | import Crypto.Hash.SHA384 as SHA384 15 | import Crypto.Random.OSRNG.posix as RNG 16 | import pbkdf2 17 | import streql 18 | 19 | 20 | __AES_KEYLEN = 32 21 | __TAG_KEYLEN = 48 22 | __TAG_LEN = __TAG_KEYLEN 23 | KEYSIZE = __AES_KEYLEN + __TAG_KEYLEN 24 | 25 | 26 | def pad_data(data): 27 | """pad_data pads out the data to an AES block length.""" 28 | # return data if no padding is required 29 | if len(data) % 16 == 0: 30 | return data 31 | 32 | # subtract one byte that should be the 0x80 33 | # if 0 bytes of padding are required, it means only 34 | # a single \x80 is required. 
35 | 36 | padding_required = 15 - (len(data) % 16) 37 | 38 | data = '%s\x80' % data 39 | data = '%s%s' % (data, '\x00' * padding_required) 40 | 41 | return data 42 | 43 | 44 | def unpad_data(data): 45 | """unpad_data removes padding from the data.""" 46 | if not data: 47 | return data 48 | 49 | data = data.rstrip('\x00') 50 | if data[-1] == '\x80': 51 | return data[:-1] 52 | else: 53 | return data 54 | 55 | 56 | def generate_nonce(): 57 | """Generate a random number used once.""" 58 | return RNG.new().read(AES.block_size) 59 | 60 | 61 | def new_tag(ciphertext, key): 62 | """Compute a new message tag using HMAC-SHA-384.""" 63 | return HMAC.new(key, msg=ciphertext, digestmod=SHA384).digest() 64 | 65 | 66 | def verify_tag(ciphertext, key): 67 | """Verify the tag on a ciphertext.""" 68 | tag_start = len(ciphertext) - __TAG_LEN 69 | data = ciphertext[:tag_start] 70 | tag = ciphertext[tag_start:] 71 | actual_tag = new_tag(data, key) 72 | return streql.equals(actual_tag, tag) 73 | 74 | 75 | def decrypt(ciphertext, key): 76 | """ 77 | Decrypt a ciphertext encrypted with AES in CBC mode; assumes the IV 78 | has been prepended to the ciphertext. 79 | """ 80 | if len(ciphertext) <= AES.block_size: 81 | return None, False 82 | tag_start = len(ciphertext) - __TAG_LEN 83 | ivec = ciphertext[:AES.block_size] 84 | data = ciphertext[AES.block_size:tag_start] 85 | if not verify_tag(ciphertext, key[__AES_KEYLEN:]): 86 | return None, False 87 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 88 | data = aes.decrypt(data) 89 | return unpad_data(data), True 90 | 91 | 92 | def encrypt(data, key): 93 | """ 94 | Encrypt data using AES in CBC mode. The IV is prepended to the 95 | ciphertext. 
96 | """ 97 | data = pad_data(data) 98 | ivec = generate_nonce() 99 | aes = AES.new(key[:__AES_KEYLEN], AES.MODE_CBC, ivec) 100 | ctxt = aes.encrypt(data) 101 | tag = new_tag(ivec+ctxt, key[__AES_KEYLEN:]) 102 | return ivec + ctxt + tag 103 | 104 | 105 | def generate_salt(salt_len): 106 | """Generate a salt for use with PBKDF2.""" 107 | return RNG.new().read(salt_len) 108 | 109 | 110 | def password_key(passphrase, salt=None): 111 | """Generate a key from a passphrase. Returns the tuple (salt, key).""" 112 | if salt is None: 113 | salt = generate_salt(16) 114 | passkey = pbkdf2.PBKDF2(passphrase, salt, iterations=16384).read(KEYSIZE) 115 | return salt, passkey 116 | ``` 117 | 118 | ## publickey.py 119 | 120 | ```python 121 | # publickey.py: public key cryptographic functions 122 | """ 123 | Public-key functions from chapter 3 of "A Working Introduction to 124 | Cryptography with Python". 125 | """ 126 | 127 | import Crypto.Hash.SHA384 as SHA384 128 | import pyelliptic 129 | import secretkey 130 | import struct 131 | 132 | 133 | __CURVE = 'secp521r1' 134 | 135 | 136 | def generate_key(): 137 | """Generate a new elliptic curve keypair.""" 138 | return pyelliptic.ECC(curve=__CURVE) 139 | 140 | 141 | def sign(priv, msg): 142 | """Sign a message with the ECDSA key.""" 143 | return priv.sign(msg) 144 | 145 | 146 | def verify(pub, msg, sig): 147 | """ 148 | Verify the public key's signature on the message. pub should 149 | be a serialised public key. 150 | """ 151 | return pyelliptic.ECC(curve='secp521r1', pubkey=pub).verify(sig, msg) 152 | 153 | 154 | def shared_key(priv, pub): 155 | """Generate a new shared encryption key from a keypair.""" 156 | key = priv.get_ecdh_key(pub) 157 | key = key[:32] + SHA384.new(key[32:]).digest() 158 | return key 159 | 160 | 161 | def encrypt(pub, msg): 162 | """ 163 | Encrypt the message to the public key using ECIES. The public key 164 | should be a serialised public key. 
165 | """ 166 | ephemeral = generate_key() 167 | key = shared_key(ephemeral, pub) 168 | ephemeral_pub = struct.pack('>H', len(ephemeral.get_pubkey())) 169 | ephemeral_pub += ephemeral.get_pubkey() 170 | return ephemeral_pub+secretkey.encrypt(msg, key) 171 | 172 | 173 | def decrypt(priv, msg): 174 | """ 175 | Decrypt an ECIES-encrypted message with the private key. 176 | """ 177 | ephemeral_len = struct.unpack('>H', msg[:2])[0] 178 | ephemeral_pub = msg[2:2+ephemeral_len] 179 | key = shared_key(priv, ephemeral_pub) 180 | return secretkey.decrypt(msg[2+ephemeral_len:], key) 181 | ``` 182 | -------------------------------------------------------------------------------- /doc/chapter3.txt: -------------------------------------------------------------------------------- 1 | # Public-key Cryptography 2 | 3 | The original version of this document had examples of using RSA 4 | cryptography with Python. However, RSA should be avoided for modern 5 | secure systems due to concerns with advancements in the discrete 6 | logarithm problem. While I haven't written Python in a while, I 7 | have done some research into packages for elliptic curve cryptography 8 | (ECC). The most promising one so far is 9 | [PyElliptic](https://pypi.python.org/pypi/pyelliptic/1.1), by [Yann 10 | GUIBET](https://github.com/yann2192). 11 | 12 | Public key cryptography is a type of cryptography that simplifies 13 | the key exchange problem: there is no need for a secure channel to 14 | communicate keys over. Instead, each user generates a private key 15 | with an associated public key. The public key can be given out 16 | without any security risk. There is still the challenge of distributing 17 | and verifying public keys, but that is outside the scope of this 18 | document. 19 | 20 | With elliptic curves, we have two types of operations that we 21 | generally want to accomplish: 22 | 23 | * Digital signatures are the public key equivalent of message 24 | authentication codes. 
Alice signs a document using her private 25 | key, and users verify the signature against her public key. 26 | 27 | * Encryption with elliptic curves is done by performing a key 28 | exchange. Alice uses a function called elliptic curve Diffie-Hellman 29 | (ECDH) to generate a shared key to encrypt messages to Bob. 30 | 31 | There are three curves we generally use with elliptic curve cryptography: 32 | 33 | * the NIST P256 curve (also known as secp256r1), which is roughly 34 | equivalent in strength to an AES-128 key 35 | * the NIST P384 curve (also known as secp384r1), which is roughly 36 | equivalent in strength to an AES-192 key 37 | * the NIST P521 curve (also known as secp521r1), which is roughly 38 | equivalent in strength to an AES-256 key 39 | 40 | Alternatively, there is the Curve25519 curve, which can be used for 41 | key exchange, and the Ed25519 curve, which can be used for digital 42 | signatures. 43 | 44 | ## Generating Keys 45 | 46 | Generating new keys with PyElliptic is done with the `ECC` 47 | class. As we used AES-256 previously, we'll use P521 here. 48 | 49 | ```python 50 | import pyelliptic 51 | 52 | 53 | def generate_key(): 54 | return pyelliptic.ECC(curve='secp521r1') 55 | ``` 56 | 57 | Public and private keys can be exported (e.g. for storage) using the 58 | accessors (the examples shown are for Python 2). 59 | 60 | ``` 61 | >>> key = generate_key() 62 | >>> priv = key.get_privkey() 63 | >>> type(priv) 64 | str 65 | >>> pub = key.get_pubkey() 66 | >>> type(pub) 67 | str 68 | ``` 69 | 70 | The keys can be imported when instantiating an instance of the `ECC` 71 | class. 72 | 73 | ``` 74 | >>> pyelliptic.ECC(privkey=priv) 75 | 76 | >>> pyelliptic.ECC(pubkey=pub) 77 | 78 | ``` 79 | 80 | ## Signing Messages 81 | 82 | Normally when we do signatures, we compute the hash of the message and 83 | sign that. PyElliptic does this for us, using SHA-512. Signing 84 | messages is done with the private key and some message. The algorithm 85 | used by PyElliptic for signatures is called ECDSA. 
86 | 87 | ```python 88 | def sign(key, msg): 89 | """Sign a message with the ECDSA key.""" 90 | return key.sign(msg) 91 | ``` 92 | 93 | In order to verify a message, we need the public key for the signing 94 | key, the message, and the signature. We'll expect a serialised public 95 | key and perform the import to a `pyelliptic.ecc.ECC` instance internally. 96 | 97 | ```python 98 | def verify(pub, msg, sig): 99 | """Verify the signature on a message.""" 100 | return pyelliptic.ECC(curve='secp521r1', pubkey=pub).verify(sig, msg) 101 | ``` 102 | 103 | ## Encryption 104 | 105 | Using elliptic curves, we encrypt using a function that generates a 106 | symmetric key from one party's private key and the other party's public key. The function that 107 | we use, ECDH (elliptic curve Diffie-Hellman), works such that: 108 | 109 | ``` 110 | ECDH(alice_pub, bob_priv) == ECDH(bob_pub, alice_priv) 111 | ``` 112 | 113 | That is, ECDH with Alice's private key and Bob's public key returns 114 | the same shared key as ECDH with Bob's private key and Alice's public 115 | key. 116 | 117 | With `pyelliptic`, the private key used must be an instance of 118 | `pyelliptic.ecc.ECC`; the public key must be in serialised form. 119 | 120 | ``` 121 | >>> type(priv) 122 | 123 | >>> type(pub) 124 | str 125 | >>> shared_key = priv.get_ecdh_key(pub) 126 | >>> len(shared_key) 127 | 64 128 | ``` 129 | 130 | Our shared key is 64 bytes; this is enough for AES-256 and 131 | HMAC-SHA-256. What about HMAC-SHA-384? We could use a short key, or we 132 | could expand the last 32 bytes of the key using SHA-384 (which 133 | produces a 48-byte hash). 
Here's a function to do that: 134 | 135 | ```python 136 | def shared_key(priv, pub): 137 | """Generate a new shared encryption key from a keypair.""" 138 | shared_key = priv.get_ecdh_key(pub) 139 | shared_key = shared_key[:32] + SHA384.new(shared_key[32:]).digest() 140 | return shared_key 141 | ``` 142 | 143 | ### Ephemeral keys 144 | 145 | For improved security, we should use *ephemeral* keys for encryption; 146 | that is, we generate a new elliptic curve key pair for each encryption 147 | operation. This works as long as we send the public key with the 148 | message. Let's look at a sample EC encryption function. For this 149 | function, we need the public key of our recipient, and we'll pack our 150 | ephemeral public key at the beginning of the message. This method of encryption is 151 | called the elliptic curve integrated encryption scheme, or ECIES. 152 | 153 | ```python 154 | import secretkey 155 | import struct 156 | 157 | def encrypt(pub, msg): 158 | """ 159 | Encrypt the message to the public key using ECIES. The public key 160 | should be a serialised public key. 161 | """ 162 | ephemeral = generate_key() 163 | key = shared_key(ephemeral, pub) 164 | ephemeral_pub = struct.pack('>H', len(ephemeral.get_pubkey())) 165 | ephemeral_pub += ephemeral.get_pubkey() 166 | return ephemeral_pub+secretkey.encrypt(msg, key) 167 | ``` 168 | 169 | Encryption packs the public key at the beginning, writing first a 170 | 16-bit unsigned integer containing the public key length and then 171 | appending the ephemeral public key and the ciphertext to 172 | this. Decryption needs to unpack the ephemeral public key (by reading 173 | the length and extracting that many bytes from the message) and then 174 | decrypt the message with the shared key. 175 | 176 | ```python 177 | def decrypt(priv, msg): 178 | """ 179 | Decrypt an ECIES-encrypted message with the private key. 
180 | """ 181 | ephemeral_len = struct.unpack('>H', msg[:2])[0] 182 | ephemeral_pub = msg[2:2+ephemeral_len] 183 | key = shared_key(priv, ephemeral_pub) 184 | return secretkey.decrypt(msg[2+ephemeral_len:], key) 185 | ``` 186 | 187 | -------------------------------------------------------------------------------- /doc/chapter1.txt: -------------------------------------------------------------------------------- 1 | # Block Ciphers 2 | 3 | There are two further types of symmetric ciphers: stream and block 4 | ciphers. Stream ciphers operate on data streams, i.e. one byte at a 5 | time. Block ciphers operate on blocks of data, typically 16 bytes at a 6 | time. The most common block cipher and the standard one you should use 7 | unless you have a very good reason to use another one is the 8 | [AES](https://secure.wikimedia.org/wikipedia/en/wiki/Advanced_Encryption_Standard) 9 | block cipher, also documented in 10 | [FIPS PUB 197](http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf). AES 11 | is a specific subset of the Rijndael cipher. AES uses a block size of 12 | 128 bits (16 bytes); data should be padded out to fit the block size - 13 | the length of the data must be a multiple of the block size. For 14 | example, given an input of `ABCDABCDABCDABCD ABCDABCDABCDABCD` no 15 | padding would need to be done. However, given `ABCDABCDABCDABCD 16 | ABCDABCDABCD` an additional 4 bytes of padding would need to be 17 | added. A common padding scheme is to use `0x80` as the first byte of 18 | padding, with `0x00` bytes filling out the rest of the padding. With 19 | padding, the previous example would look like: `ABCDABCDABCDABCD 20 | ABCDABCDABCD\x80\x00\x00\x00`. 
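The arithmetic in that example can be checked with a short sketch. It assumes Python 3 byte strings, and it assumes the space in the example inputs is only a visual separator between blocks, so the second input is 28 bytes long; `padding_for` is a hypothetical helper used only for this check.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def padding_for(length):
    """Return the 0x80/0x00 padding for a message of `length` bytes.

    Returns an empty byte string when the length is already a multiple
    of the block size, as in the scheme described above.
    """
    remainder = length % BLOCK_SIZE
    if remainder == 0:
        return b''
    # One 0x80 marker byte, then 0x00 bytes up to the block boundary.
    return b'\x80' + b'\x00' * (BLOCK_SIZE - remainder - 1)


message = b'ABCDABCDABCDABCD' + b'ABCDABCDABCD'  # 28 bytes
padded = message + padding_for(len(message))
assert len(padded) == 32
assert padded.endswith(b'\x80\x00\x00\x00')
```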

Here's our padding function:

```python
def pad_data(data):
    # return data if no padding is required
    if len(data) % 16 == 0:
        return data

    # One byte of the padding is always the 0x80 marker, so the number
    # of 0x00 bytes is one less than the total padding length. If this
    # is 0, only the single \x80 byte is added.
    padding_required = 15 - (len(data) % 16)

    data = '%s\x80' % data
    data = '%s%s' % (data, '\x00' * padding_required)

    return data
```

Our function to remove padding is similar:

```python
def unpad_data(data):
    if not data:
        return data

    data = data.rstrip('\x00')
    # endswith() is safe even if stripping left an empty string
    if data.endswith('\x80'):
        return data[:-1]
    else:
        return data
```

Encryption with a block cipher requires selecting a
[block mode](https://en.wikipedia.org/wiki/Block_cipher_mode). By far
the most common mode used is **cipher block chaining** or *CBC* mode.
Other modes include *counter (CTR)*, *cipher feedback (CFB)*, and the
extremely insecure *electronic codebook (ECB)*. CBC mode is the
standard and is well-vetted, so I will stick to that in this tutorial.
Cipher block chaining works by XORing the previous block of ciphertext
with the current block of plaintext before encrypting it. You might
recognise that the first block has nothing to be XOR'd with; enter the
[*initialisation vector*](https://en.wikipedia.org/wiki/Initialization_vector).
This comprises a number of randomly-generated bytes of data the same
size as the cipher's block size. This initialisation vector should be
random enough that it cannot be predicted.

One of the most critical components to encryption is properly
generating random data. Fortunately, most of this is handled by the
PyCrypto library’s `Crypto.Random.OSRNG` module.
You should know that
the more entropy sources that are available (such as network traffic
and disk activity), the faster the system can generate
cryptographically-secure random data. I’ve written a function that can
generate a
[*nonce*](https://secure.wikimedia.org/wikipedia/en/wiki/Cryptographic_nonce)
suitable for use as an initialisation vector. This will work on a UNIX
machine; the comments note how easy it is to adapt it to a Windows
machine. This function requires PyCrypto 2.1.0 or later.

```python
import Crypto.Cipher.AES as AES
# On Windows, import Crypto.Random.OSRNG.nt instead of the posix module.
import Crypto.Random.OSRNG.posix as RNG

def generate_nonce():
    """Generate a random number used once."""
    return RNG.new().read(AES.block_size)
```

I will note here that the Python `random` module is completely
unsuitable for cryptography (as it is completely deterministic). You
shouldn’t use it for cryptographic code.

Symmetric ciphers are so-named because the same key is shared by all
of the communicating parties. There are three key sizes for AES:
128-bit, 192-bit, and 256-bit, aka 16-byte, 24-byte, and 32-byte key
sizes. We'll use AES-256, so we just need to generate 32 random bytes
(and make sure we keep track of them) and use that as the key:

```python
KEYSIZE = 32


def generate_key():
    return RNG.new().read(KEYSIZE)
```

We can use this key to encrypt and decrypt data. To encrypt, we
need the initialisation vector (i.e. a nonce), the key, and the
data. However, the IV isn't a secret. When we encrypt, we'll prepend
the IV to our encrypted data and make that part of the output. We
can (and should) generate a completely random IV for each new
message.

```python
import Crypto.Cipher.AES as AES

def encrypt(data, key):
    """
    Encrypt data using AES in CBC mode. The IV is prepended to the
    ciphertext.
    """
    data = pad_data(data)
    ivec = generate_nonce()
    aes = AES.new(key, AES.MODE_CBC, ivec)
    ctxt = aes.encrypt(data)
    return ivec + ctxt


def decrypt(ciphertext, key):
    """
    Decrypt a ciphertext encrypted with AES in CBC mode; assumes the IV
    has been prepended to the ciphertext.
    """
    # a valid message holds the IV plus at least one block of ciphertext
    if len(ciphertext) <= AES.block_size:
        raise Exception("Invalid ciphertext.")
    ivec = ciphertext[:AES.block_size]
    ciphertext = ciphertext[AES.block_size:]
    aes = AES.new(key, AES.MODE_CBC, ivec)
    data = aes.decrypt(ciphertext)
    return unpad_data(data)
```

However, this is only part of the equation for securing messages:
AES only gives us confidentiality. Remember how we had a few other
criteria? We still need to add integrity and authenticity to our
process. Readers with some experience might immediately think of
hashing algorithms, like MD5 (which should be avoided like the
plague) and SHA. The problem with these is that they are unkeyed:
anyone who alters the message can simply recompute the digest, and
there is no indication that anything has been changed. We need a
construction that uses a key to generate the digest; the one we'll
use is called HMAC. We do not want to reuse the key used to encrypt
the message; we should have a new, freshly generated key that is the
same size as the digest's output size (although in many cases, this
will be overkill).

In order to encrypt properly, then, we need to modify our code a bit.
The first thing you need to know is that HMAC is based on a
particular hash function. Since we're using AES-256, we'll use
SHA-384. We say our message tags are computed using HMAC-SHA-384. This
produces a 48-byte digest.
Let's add a few new constants in, and
update the KEYSIZE variable:

```python
__aes_keylen = 32
__tag_keylen = 48
KEYSIZE = __aes_keylen + __tag_keylen
```

Now, let's add message tagging in:

```python
import Crypto.Hash.HMAC as HMAC
import Crypto.Hash.SHA384 as SHA384


def new_tag(ciphertext, key):
    """Compute a new message tag using HMAC-SHA-384."""
    return HMAC.new(key, msg=ciphertext, digestmod=SHA384).digest()
```

Here's our updated encrypt function:

```python
def encrypt(data, key):
    """
    Encrypt data using AES in CBC mode. The IV is prepended to the
    ciphertext and the message tag is appended.
    """
    data = pad_data(data)
    ivec = generate_nonce()
    # the first part of the key is for AES, the rest is for HMAC
    aes = AES.new(key[:__aes_keylen], AES.MODE_CBC, ivec)
    ctxt = aes.encrypt(data)
    tag = new_tag(ivec + ctxt, key[__aes_keylen:])
    return ivec + ctxt + tag
```

Decryption has a snag: what we want to do is check to see if the
message tag matches what we think it should be. However, the Python
`==` operator stops comparing at the first byte it finds that
doesn't match. This opens a verification based on the `==` operator to
a timing attack. Without going into much detail, note that several
cryptosystems have fallen prey to this exact attack; the keyczar
system, for example, used the `==` operator and suffered an attack on
the system. We'll use the `streql` package (i.e. `pip install streql`)
to perform a constant-time comparison of the tags.
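
To make the difference concrete, here is a minimal sketch of an
early-exit comparison next to a constant-time one. It uses
`hmac.compare_digest` from the Python standard library (available
from Python 2.7.7 and 3.3) as a stand-in for `streql`; the function
names `naive_equals` and `constant_time_equals` are illustrative and
not part of this tutorial's code.

```python
import hmac


def naive_equals(a, b):
    """Early-exit comparison: running time depends on where the inputs differ."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # returns as soon as a mismatch is found
    return True


def constant_time_equals(a, b):
    """Comparison that avoids content-based short circuiting."""
    return hmac.compare_digest(a, b)
```

Both functions give the same answers; the difference is only in how
much the running time reveals about the position of the first
mismatch. The tutorial's own tag check uses `streql`: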

```python
import streql


def verify_tag(ciphertext, key):
    """Verify the tag on a ciphertext."""
    # HMAC-SHA-384 tags are 48 bytes, the same length as the tag key
    tag_start = len(ciphertext) - __tag_keylen
    data = ciphertext[:tag_start]
    tag = ciphertext[tag_start:]
    actual_tag = new_tag(data, key)
    return streql.equals(actual_tag, tag)
```

We'll also change our decrypt function to return a tuple: the
original message (or None on failure), and a boolean that will be
True if the tag was authenticated and the message decrypted:

```python
def decrypt(ciphertext, key):
    """
    Decrypt a ciphertext encrypted with AES in CBC mode; assumes the IV
    has been prepended to the ciphertext and the tag appended to it.
    Returns the tuple (message, ok).
    """
    # a valid message holds the IV, at least one block, and the tag
    if len(ciphertext) <= AES.block_size + __tag_keylen:
        return None, False
    tag_start = len(ciphertext) - __tag_keylen
    ivec = ciphertext[:AES.block_size]
    data = ciphertext[AES.block_size:tag_start]
    # check the tag before doing any decryption
    if not verify_tag(ciphertext, key[__aes_keylen:]):
        return None, False
    aes = AES.new(key[:__aes_keylen], AES.MODE_CBC, ivec)
    data = aes.decrypt(data)
    return unpad_data(data), True
```

We could also generate a key using a passphrase; to do so, you should
use a key derivation algorithm, such as
[PBKDF2](https://en.wikipedia.org/wiki/Pbkdf2). A function to derive a
key from a passphrase will also need to store the salt that goes with
the passphrase. PBKDF2 takes three arguments: the passphrase, the
salt, and the number of iterations to run through. The currently
recommended minimum number of iterations is 16384; this is a sensible
default for programs using PBKDF2.

What is a salt? A salt is a randomly generated value used to make sure
that two runs of PBKDF2 on the same passphrase produce different
output. Generally, this should be a minimum of 16 bytes (128 bits).
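
To illustrate the role of the salt, here is a minimal sketch using
`hashlib.pbkdf2_hmac` and `os.urandom` from the Python 3 standard
library rather than the `pbkdf2` package used in this tutorial. The
helper name `derive`, the example passphrase, and the choice of
SHA-256 are assumptions made for the example.

```python
import hashlib
import os


def derive(passphrase, salt, iterations=16384, length=32):
    """Derive `length` bytes from a passphrase with PBKDF2-HMAC-SHA-256."""
    return hashlib.pbkdf2_hmac('sha256', passphrase, salt, iterations,
                               dklen=length)


passphrase = b'correct horse battery staple'
salt_a = os.urandom(16)  # 16 random bytes (128 bits)
salt_b = os.urandom(16)

key_a = derive(passphrase, salt_a)
key_b = derive(passphrase, salt_b)
key_a_again = derive(passphrase, salt_a)
```

The same passphrase with two different salts gives two unrelated
keys, while repeating the derivation with `salt_a` reproduces `key_a`.
This is why the salt has to be stored alongside whatever the key
protects.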

Here are two functions to generate a random salt and generate a secret
key from PBKDF2:

```python
import pbkdf2


def generate_salt(salt_len):
    """Generate a salt for use with PBKDF2."""
    return RNG.new().read(salt_len)


def password_key(passphrase, salt=None):
    """Generate a key from a passphrase. Returns the tuple (salt, key)."""
    if salt is None:
        salt = generate_salt(16)
    passkey = pbkdf2.PBKDF2(passphrase, salt, iterations=16384).read(KEYSIZE)
    return salt, passkey
```

Keep in mind that the salt, while a public and non-secret value, must
be present to recover the key. To generate a new key, pass `None` as
the salt value, and a random salt will be generated. To recover the
same key from the passphrase, the salt must be provided (and it must
be the same salt that was generated along with the original key). As
an example, the salt could be provided as the first `len(salt)` bytes
of the ciphertext.

That should cover the basics of block cipher encryption. We’ve
gone over key generation, padding, encryption / decryption, and
message tagging. This code has been packaged up in the example source
directory as `secretkey`.
--------------------------------------------------------------------------------