├── README.md
├── meta.md
├── ipcipher.md.html
├── meta.md.html
└── ipcipher.md


/README.md:
--------------------------------------------------------------------------------
1 | ipcipher.md


--------------------------------------------------------------------------------
/meta.md:
--------------------------------------------------------------------------------
1 | meta.md.html


--------------------------------------------------------------------------------
/ipcipher.md.html:
--------------------------------------------------------------------------------
1 | ipcipher.md


--------------------------------------------------------------------------------
/meta.md.html:
--------------------------------------------------------------------------------
  1 |                 <meta charset="utf-8" emacsmode="-*- markdown -*-">
  2 |                             **ipcipher: discussion and guidance**
  3 | 
  4 | ipcipher
  5 | ========
  6 | `ipcipher` is a simple way to encrypt IPv4 and IPv6 addresses such
  7 | that any address encrypts to a valid address.  This enables existing tools
  8 | to be used on encrypted IPv4 and IPv6 addresses.
  9 | 
 10 | The protocol is described [here](ipcipher.md.html).  
 11 | 
 12 | This page is about how and when to use the protocol, and what guarantees it
 13 | does and does not offer.
 14 | 
 15 | Applicability
 16 | =============
 17 | `ipcipher` is meant to enable the analysis of traces, pcaps, logfiles etc.
 18 | containing customer IP addresses, without revealing those actual (customer)
 19 | IP addresses.  Given privacy trends around the world and specifically the
 20 | advent of [EU GDPR](https://www.eugdpr.org/), it is more and more important
 21 | to protect personally identifying information. Worth noting, the GDPR
 22 | touches specifically on how pseudonymization is part of [privacy by
 23 | design](https://iapp.org/news/a/top-10-operational-impacts-of-the-gdpr-part-8-pseudonymization/)
 24 | and how it can protect data if it leaks.
 25 | 
 26 | `ipcipher` is not in any way meant as an encryption algorithm that enables
 27 | the public dissemination of traces without impacting user privacy. It should
 28 | not be compared to, say, AES or Salsa20.
 29 | 
 30 | In other words, `ipcipher` encrypted IP addresses must still be protected.
 31 | This is because of inherent limitations of
 32 | '[pseudonymisation](https://en.wikipedia.org/wiki/Pseudonymization)'. 
 33 | 
 34 | Limitations of pseudonymity
 35 | ===========================
 36 | 
 37 | From the [Wikipedia](https://en.wikipedia.org/wiki/Pseudonymization):
 38 | 
 39 | > Pseudonymization is a procedure by which the most identifying fields
 40 | > within a data record are replaced by one or more artificial identifiers,
 41 | > or pseudonyms.  (...) The purpose is to render the data record less
 42 | > identifying and therefore lower customer or patient objections to its use. 
 43 | > Data in this form is suitable for extensive analytics and processing.
 44 | 
 45 | Pseudonymous data can be analysed just like regular data. So for example, a
 46 | PCAP of a user-originated denial of service attack can be studied using
 47 | regular tools to identify the pseudonymous IP addresses performing the
 48 | attack.
 49 | 
 50 | These IP addresses can subsequently be decrypted to identify the actual
 51 | culprits.
 52 | 
 53 | If however the PCAP file contains more information, it may be possible to
 54 | deanonymize the encrypted IP addresses. For example, HTTP requests may
 55 | contain the actual user IP address as a referrer. This ties the encrypted
 56 | address to the original, and breaks pseudonymization.
 57 | 
 58 | Another famous example was the [AOL search data
 59 | leak](https://en.wikipedia.org/wiki/AOL_search_data_leak) in which AOL
 60 | released search queries for AOL user-ids, without revealing user names. It
 61 | rapidly proved possible however to identify specific users based on their
 62 | search traffic.
 63 | 
 64 | Chosen plaintext attack
 65 | =======================
 66 | Since there are only around 4.3 billion IPv4 addresses, if an attacker can
 67 | both choose which IP addresses appear in an `ipcipher` encrypted log, and
 68 | read that log, the algorithm fails at least for IPv4. This is
 69 | because the attacker can enumerate each and every IPv4 address to see how it
 70 | ends up in the log.
 71 | 
 72 | IPv6 is not suitable for enumeration, but a chosen plaintext attack could
 73 | still be used to verify a potential actual IP address candidate.
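
The enumeration described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_oracle` is a hypothetical stand-in (a plain XOR, not `ipcipher`) for whatever mechanism lets an attacker get a chosen address into the encrypted log and read back the result, and `build_reverse_table` is a name introduced here.

```python
import ipaddress


def toy_oracle(addr: str) -> str:
    """Hypothetical stand-in for 'address in, pseudonym out'. NOT ipcipher."""
    value = int(ipaddress.IPv4Address(addr)) ^ 0x5A5A5A5A
    return str(ipaddress.IPv4Address(value))


def build_reverse_table(oracle, network: str) -> dict:
    """Map every pseudonym in `network` back to the address that produced it."""
    table = {}
    for addr in ipaddress.ip_network(network):
        table[oracle(str(addr))] = str(addr)
    return table


if __name__ == "__main__":
    # A /24 keeps the demonstration small; the full IPv4 space has
    # 2**32 = 4,294,967,296 entries, which is still enumerable.
    table = build_reverse_table(toy_oracle, "192.0.2.0/24")
    seen_in_log = toy_oracle("192.0.2.7")
    print(seen_in_log, "->", table[seen_in_log])
```

For IPv6 the same oracle would only be useful for checking individual candidate addresses, since the table cannot be built in full.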
 74 | 
 75 | Suggested doctrine
 76 | ==================
 77 | It is suggested that a new passphrase is used whenever new data is encrypted
 78 | for analysis.  This minimizes the opportunity for attackers to benefit from
 79 | previous re-identification efforts.
 80 | 
 81 | The procedure is then as follows:
 82 | 
 83 |  * Collect data to be analysed
 84 |  * Create a random passphrase, possibly using `pwgen`
 85 |  * Store passphrase securely
 86 |  * Encrypt IP addresses in trace/log/pcap
 87 |  * Send data off for analysis
 88 |  * If analysts have found interesting pseudonymized IP addresses, decrypt
 89 |     using stored key
 90 |  * Analysis team destroys encrypted trace data
 91 |  * Passphrase can be destroyed
 92 |  
 93 | Specific things to watch out for
 94 | ================================
 95 | IP addresses should be encrypted wherever they appear in a trace or a log.
 96 | This is not always easy. Of specific note, ICMP messages may for example
 97 | contain another copy of the source or destination IP address of the packet.
 98 | So when encrypting PCAPs, make sure to drop any traffic known to contain
 99 | further copies of actual IP addresses that you aren't also encrypting.
100 | 
101 | Of specific note, tunnelled but unencrypted traffic (GRE, VXLAN, IPIP, SIT)
102 | is guaranteed to carry further IP addresses 'inside'. 
103 | 
104 | It is advised to restrict PCAP captures to only the intended protocol (say,
105 | DNS).
106 | 
107 | When encrypting text-based log files, be sure to encrypt not only 1.2.3.4
108 | but also 1.2.3.4:25. Similarly, ::1 should be encrypted, but also [::1]:25
109 | and ::1#25, as well as fe80::1%eth0. 
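
A minimal sketch of recognising these notations in a log token is shown below. The function names `split_address` and `rewrite_token` are introduced here for illustration, the `encrypt` argument is a hypothetical callable that maps an address string to its encrypted form, and the set of notations handled is limited to the ones listed above.

```python
import ipaddress


def split_address(token: str):
    """Split a token into (prefix, address, suffix), or return None if no
    IP address is recognised. Handles 1.2.3.4, 1.2.3.4:25, ::1, [::1]:25,
    ::1#25 and fe80::1%eth0."""
    prefix, body, suffix = "", token, ""
    if body.startswith("["):
        # Bracketed IPv6, e.g. [::1]:25
        end = body.find("]")
        if end == -1:
            return None
        prefix, body, suffix = "[", body[1:end], body[end:]
    else:
        if "#" in body:
            # address#port notation
            body, _, tail = body.partition("#")
            suffix = "#" + tail
        if body.count(":") == 1 and "." in body:
            # IPv4 with port, e.g. 1.2.3.4:25
            body, _, tail = body.partition(":")
            suffix = ":" + tail + suffix
    if "%" in body:
        # IPv6 zone identifier, e.g. fe80::1%eth0
        body, _, zone = body.partition("%")
        suffix = "%" + zone + suffix
    try:
        ipaddress.ip_address(body)
    except ValueError:
        return None
    return prefix, body, suffix


def rewrite_token(token: str, encrypt) -> str:
    """Replace the address inside a token, keeping port/zone decoration."""
    parts = split_address(token)
    if parts is None:
        return token
    prefix, address, suffix = parts
    return prefix + encrypt(address) + suffix
```

Tokens that do not contain a recognisable address are returned unchanged, so it is worth checking the output for formats this sketch does not cover.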
110 | 
111 | Additionally, when stamping out IPv4 or IPv6 addresses in data structures
112 | with checksums (like IP headers), be sure to also update (or zero) those
113 | checksums, as these may provide a weak or even strong indication of the
114 | original IP address.
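
As an illustration of updating such a checksum, the sketch below rewrites the addresses in a raw IPv4 header and recomputes the header checksum (the standard Internet checksum of RFC 1071). The function names are introduced here, and the new addresses are assumed to have been encrypted already. TCP and UDP checksums also cover the addresses through a pseudo-header, so they need the same treatment; that is not shown.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Internet checksum over an IPv4 header whose checksum field is zero."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        # Fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def replace_addresses(header: bytes, src: bytes, dst: bytes) -> bytes:
    """Return a copy of the header with new 4-byte source and destination
    addresses and a freshly computed checksum."""
    hdr = bytearray(header)
    hdr[12:16] = src          # source address
    hdr[16:20] = dst          # destination address
    hdr[10:12] = b"\x00\x00"  # zero the checksum field before summing
    struct.pack_into("!H", hdr, 10, ipv4_header_checksum(bytes(hdr)))
    return bytes(hdr)
```

Summing a header that already contains a correct checksum yields zero, which gives a quick way to verify the rewritten packet.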
115 | 
116 | Protection level that should be accorded to encrypted IP addresses
117 | ==================================================================
118 | In general, the more data is stored, the higher the protection level should
119 | be, even with encrypted IP addresses.
120 | 
121 | As an example, a trace of 100 DNS packets from 100 different IP addresses
122 | does not offer a lot of scope for deanonymization. However, a full day
123 | recording of millions of IP addresses should not be shared as if it can't be
124 | deanonymized. 
125 | 
126 | The risk can already be reduced by encrypting 24 hours of IP
127 | addresses with 24 different keys, for example.
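
One way to arrange the per-hour keys mentioned above is sketched here. It assumes fully random 128-bit keys rather than passphrases, and the helper names `make_hourly_keys` and `key_for` are introduced for illustration only.

```python
import secrets
from datetime import datetime, timezone


def make_hourly_keys() -> list:
    """Return 24 independent random 128-bit keys, indexed by hour (0-23)."""
    return [secrets.token_bytes(16) for _ in range(24)]


def key_for(keys: list, timestamp: float) -> bytes:
    """Pick the key for the UTC hour in which a record was captured.
    `timestamp` is a Unix timestamp in seconds."""
    hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
    return keys[hour]
```

The keys must be stored as carefully as a single key would be, since each one is needed to decrypt the addresses from its hour.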
128 | 
129 | In general however, should data leak, damage will be significantly less if
130 | IP addresses were encrypted than if they had not been encrypted. 
131 | 
132 | Another way to look at it is that encrypting IP addresses is always a win,
133 | unless the traces are suddenly shared more widely than before.
134 | 
135 | <script>window.markdeepOptions={};
136 | window.markdeepOptions.tocStyle="short";</script>
137 | <!--  Markdeep:  --><style  class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script  src="markdeep.min.js"></script><script  src="https://casual-effects.com/markdeep/latest/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
138 | 


--------------------------------------------------------------------------------
/ipcipher.md:
--------------------------------------------------------------------------------
  1 |                 <meta charset="utf-8" emacsmode="-*- markdown -*-">
  2 |                             **ipcipher: encrypting IP addresses**
  3 | 
  4 | STATUS: This standard is open for discussion. We hope to finalize it
  5 | quickly - bert.hubert@powerdns.com /
  6 | [@PowerDNS_Bert](https://twitter.com/PowerDNS_Bert).
  7 | 
  8 | ipcipher
  9 | ========
 10 | This page documents a simple way to encrypt IPv4 and IPv6 addresses such
 11 | that any address encrypts to a valid address.  This enables existing tools
 12 | to be used on encrypted IPv4 and IPv6 addresses.
 13 | 
 14 | There are many ways to do this, especially for IPv6, but the method
 15 | described here is simple and interoperable.  This page:
 16 | 
 17 |  * Describes the algorithms used to encrypt/decrypt IP addresses
 18 |  * Specifies how to derive the key from a password
 19 |  * Links to reference implementations in various languages
 20 |  * Provides a set of published test vectors to test interoperability
 21 | 
 22 | In order to enhance interoperability, implementations that want to encrypt
 23 | IP addresses are encouraged to do so using this 'ipcipher' standard.
 24 | 
 25 | Known implementations:
 26 | 
 27 |  * [In Go, by Silke Hofstra](https://github.com/silkeh/ipcipher)
 28 |  * PowerDNS
 29 | 
 30 | Discussion on how and when to use `ipcipher` can be found in the
 31 | [meta](meta.md.html) document.
 32 | 
 33 | Acknowledgements
 34 | ================
 35 | Silke Hofstra built the first interoperable implementation and found many
 36 | mistakes in the specification and test vectors. Jean-Philippe Aumasson
 37 | supplied the `ipcrypt` algorithm & guidance on key derivation. Further thanks to: 
 38 | Frank Denis for providing the C implementation of `ipcrypt` and general
 39 | advice, Edwin van Vliet for noting the risk of checksums providing a hint of
 40 | the old IP address.
 41 | 
 42 | 
 43 | Why encrypt IP addresses?
 44 | =========================
 45 | Frequently, privacy concerns and regulations get in the way of security
 46 | analysis.  Privacy is important, but so is security.  Compromised systems
 47 | eventually also harm privacy.
 48 | 
 49 | Per-customer/subscriber traces are extremely useful for researching the
 50 | security of networks.  However, privacy officers rightly object to the
 51 | unbridled sharing of which IP address did what. 
 52 | 
 53 | One potential solution is to encrypt IP addresses in log files or PCAPs with
 54 | a secret key.  Crucially, this can be done in a way that the IP addresses
 55 | still look like IP addresses, and can be stored 'in place'.
 56 | 
 57 | The encryption key is held by the privacy officer, or their department, and
 58 | if based on encrypted IP addresses something interesting is found, the
 59 | address can be decrypted for further action.
 60 | 
 61 | The needs and merits of IP encryption are further explored in '[On IP address encryption: security analysis with respect for
 62 | privacy](https://medium.com/@bert.hubert/on-ip-address-encryption-security-analysis-with-respect-for-privacy-dabe1201b476)'.
 63 | Importantly, this also touches on inherent limitations of encrypting IP
 64 | addresses for privacy. 
 65 | 
 66 | Guidance on how to use `ipcipher` can be found [here](meta.md.html).
 67 | 
 68 | Key derivation
 69 | ==============
 70 | Both IPv4 and IPv6 encryption use a 128-bit key. To derive this key from the
 71 | passphrase, use PBKDF2 as follows:
 72 | 
 73 | ```
 74 | DK = PBKDF2(SHA1, Password, "ipcipheripcipher", 50000, 16)
 75 | ```
 76 | 
 77 | Or in words, RFC 2898 with SHA1 as hashing function, `ipcipheripcipher` as
 78 | salt, 50000 iterations, 16 bytes of key `DK`. In OpenSSL this
 79 | corresponds to:
 80 | 
 81 | ```
 82 |   // PKCS5_PBKDF2_HMAC_SHA1 is declared in <openssl/evp.h>.
 83 |   static const char salt[]="ipcipheripcipher";
 84 |   unsigned char out[16];
 85 |   PKCS5_PBKDF2_HMAC_SHA1(passwordptr, passwordlen, (const unsigned char*)salt, sizeof(salt)-1, 50000, sizeof(out), out);
 86 | ```
 87 | 
 88 | The key derivation step is not optional.  The `ipcrypt` algorithm used for
 89 | IPv4 requires a fully randomized key and is not secure without it. In
 90 | addition, PBKDF2 protects against brute forcing of the passphrase.
 91 | 
 92 | Some test vectors for key derivation, where first entry is an empty string:
 93 | 
 94 |  * "" -> bb 8d cd 7b e9 a6 f4 3b 33 04 c6 40 d7 d7 10 3c
 95 |  * "3.141592653589793" ->  37 05 bd 6c 0e 26 a1 a8 39 89 8f 1f a0 16 a3 74
 96 |  * "crypto is not a coin" -> 06 c4 ba d2 3a 38 b9 e0 ad 9d 05 90 b0 a3 d9 3a
 97 |  
 98 | Take care not to process a possible trailing NUL byte in the password (or salt).
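
For comparison, the same derivation can be sketched with Python's standard library. This is a minimal illustration: the function name `derive_key` is introduced here, and encoding the passphrase as UTF-8 is an assumption, since the specification only speaks of a password.

```python
import hashlib

SALT = b"ipcipheripcipher"  # fixed salt from the specification
ITERATIONS = 50000
KEY_LENGTH = 16             # bytes, i.e. a 128-bit key


def derive_key(passphrase: str) -> bytes:
    """Derive the 128-bit key with PBKDF2-HMAC-SHA1.
    Assumption: the passphrase is encoded as UTF-8, without a trailing NUL."""
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode("utf-8"), SALT, ITERATIONS, KEY_LENGTH
    )


if __name__ == "__main__":
    # Print the derived keys in the same format as the test vectors above,
    # so the output can be compared against them.
    for pw in ("", "3.141592653589793", "crypto is not a coin"):
        key = derive_key(pw)
        print(repr(pw), "->", " ".join("%02x" % b for b in key))
```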
 99 | 
100 | Note: it is of course also possible to use a fully random 128-bit key that
101 | is not derived from a passphrase. This offers some security advantages too,
102 | as the full 128-bit keyspace is used. Implementations are encouraged to make
103 | it possible to provide either a passphrase or a 128-bit key, but be
104 | careful: it is not possible to tell these two apart automatically, so the
105 | user has to state which one is being supplied!
106 | 
107 | IPv4 algorithm
108 | ==============
109 | An IPv4 address is a 32 bit value, and to encrypt it to another IPv4 address
110 | we need a block cipher that is 32 bit native.  A modern and suitable
111 | algorithm is '[ipcrypt](https://github.com/veorq/ipcrypt)' by [Jean-Philippe
112 | Aumasson](https://aumasson.jp/). ipcrypt was inspired by
113 | [SipHash](https://en.wikipedia.org/wiki/SipHash) (which was invented by
114 | Aumasson and Dan J.  Bernstein).
115 | 
116 | ipcrypt uses a 128-bit key. There is no padding, no cipher mode or anything
117 | else.
118 | 
119 | Implementations:
120 | 
121 |  * [C](https://github.com/jedisct1/c-ipcrypt) by Frank Denis
122 |  * [Go](https://github.com/veorq/ipcrypt) by Jean-Philippe Aumasson
123 |  * [Python](https://github.com/veorq/ipcrypt) by Jean-Philippe Aumasson
124 |  * [Rust](https://github.com/stbuehler/rust-ipcrypt) by Stefan Bühler
125 | 
126 | Note that the (combined) Python and Go repository also includes command line
127 | tools.
128 |  
129 | Test vectors using the literal 16-byte key "some 16-byte key" (minus the quotes), with no key derivation:
130 | 
131 |  * 127.0.0.1 -> 114.62.227.59
132 |  * 8.8.8.8 -> 46.48.51.50
133 |  * 1.2.3.4 -> 171.238.15.199
134 |  
135 | Using the following key in hex:
136 | 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10
137 | 
138 |  * Start with IP address 192.168.69.42 and encrypt it 100 million times ->
139 |    93.155.197.186 (so keep on encrypting the encrypted address)
140 |    
141 | Using the password "crypto is not a coin":
142 | 
143 |  * 198.41.0.4 -> 139.111.117.167
144 |  * 130.161.180.1 -> 66.235.221.231
145 |  * 0.0.0.0 -> 203.253.152.187
146 |  
147 | Note that this password needs to be used to derive the actual key first.
148 | 
149 | IPv6 algorithm
150 | ==============
151 | IPv6 addresses are 128 bits, and there is a wealth of suitable algorithms
152 | available.  AES-128 is robust and widely available, and more than good
153 | enough.
154 | 
155 | AES is typically deployed in a mode like Cipher Block Chaining, but no such
156 | mode is required to encrypt IP addresses. A straight AES operation is used,
157 | with no further XORing, as in Electronic Code Book "mode".
158 | 
159 | AES is almost always already available.  To get a raw AES-128 encryption
160 | operation out of OpenSSL or its variants:
161 | 
162 | ```
163 |   AES_KEY wctx;
164 |   AES_set_encrypt_key(key, 128, &wctx);
165 |   AES_encrypt((const unsigned char*)&ca.sin6.sin6_addr.s6_addr,
166 |               (unsigned char*)&ret.sin6.sin6_addr.s6_addr, &wctx);  
167 | ```
168 | 
169 | Decryption is the same, with the obvious s/encrypt/decrypt/ change.
170 | 
171 | There is as yet no command line tool that performs these operations,
172 | although PowerDNS `pdnsutil` will feature this in the 4.2 release.
173 | 
174 | Test vectors using the key "some 16-byte key":
175 | 
176 |  * ::1 -> 3718:8853:1723:6c88:7e5f:2e60:c79a:2bf
177 |  * 2001:503:ba3e::2:30 -> 64d2:883d:ffb5:dd79:24b:943c:22aa:4ae7
178 |  * 2001:DB8:: -> ce7e:7e39:d282:e7b1:1d6d:5ca1:d4de:246f
179 | 
180 | Using the password "crypto is not a coin":
181 | 
182 |  * ::1 -> a551:9cb0:c9b:f6e1:6112:58a:af29:3a6c
183 |  * 2001:503:ba3e::2:30 -> 6e60:2674:2fac:d383:f9d5:dcfe:fc53:328e
184 |  * 2001:DB8:: -> a8f5:16c8:e2ea:23b9:748d:67a2:4107:9d2e
185 | 
186 | Note that this password needs to be used to derive the key first.
187 | 
188 | <script>window.markdeepOptions={};
189 | window.markdeepOptions.tocStyle="short";</script>
190 | <!--  Markdeep:  --><style  class="fallback">body{visibility:hidden;white-space:pre;font-family:monospace}</style><script  src="markdeep.min.js"></script><script  src="https://casual-effects.com/markdeep/latest/markdeep.min.js"></script><script>window.alreadyProcessedMarkdeep||(document.body.style.visibility="visible")</script>
191 | 


--------------------------------------------------------------------------------