├── .gitignore ├── CONTRIBUTING ├── LICENSE ├── README.md ├── go.mod ├── orderedcode.go └── orderedcode_test.go /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | -------------------------------------------------------------------------------- /CONTRIBUTING: -------------------------------------------------------------------------------- 1 | Want to contribute? Great! First, read this page (including the small print at the end). 2 | 3 | ### Before you contribute 4 | Before we can use your code, you must sign the 5 | [Google Individual Contributor License Agreement](https://developers.google.com/open-source/cla/individual?csw=1) 6 | (CLA), which you can do online. The CLA is necessary mainly because you own the 7 | copyright to your changes, even after your contribution becomes part of our 8 | codebase, so we need your permission to use and distribute your code. We also 9 | need to be sure of various other things—for instance that you'll tell us if you 10 | know that your code infringes on other people's patents. You don't have to sign 11 | the CLA until after you've submitted your code for review and a member has 12 | approved it, but you must do it before we can put your code into our codebase. 13 | Before you start working on a larger contribution, you should get in touch with 14 | us first through the issue tracker with your idea so that we can help out and 15 | possibly guide you. Coordinating up front makes it much easier to avoid 16 | frustration later on. 17 | 18 | ### Code reviews 19 | All submissions, including submissions by project members, require review. We 20 | use Github pull requests for this purpose. 21 | 22 | ### The small print 23 | Contributions made by corporations are covered by a different agreement than 24 | the one above, the Software Grant and Corporate Contributor License Agreement. 
25 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 
40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | END OF TERMS AND CONDITIONS 178 | 179 | APPENDIX: How to apply the Apache License to your work. 180 | 181 | To apply the Apache License to your work, attach the following 182 | boilerplate notice, with the fields enclosed by brackets "[]" 183 | replaced with your own identifying information. (Don't include 184 | the brackets!) The text should be enclosed in the appropriate 185 | comment syntax for the file format. 
We also recommend that a 186 | file or class name and description of purpose be included on the 187 | same "printed page" as the copyright notice for easier 188 | identification within third-party archives. 189 | 190 | Copyright [yyyy] [name of copyright owner] 191 | 192 | Licensed under the Apache License, Version 2.0 (the "License"); 193 | you may not use this file except in compliance with the License. 194 | You may obtain a copy of the License at 195 | 196 | http://www.apache.org/licenses/LICENSE-2.0 197 | 198 | Unless required by applicable law or agreed to in writing, software 199 | distributed under the License is distributed on an "AS IS" BASIS, 200 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 201 | See the License for the specific language governing permissions and 202 | limitations under the License. 203 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # orderedcode 2 | 3 | orderedcode provides a byte encoding of a sequence of typed items. 4 | The resulting bytes can be lexicographically compared to yield the same 5 | ordering as item-wise comparison on the original sequences. 6 | 7 | This is particularly useful for specifying the order of rows in a database with 8 | lexicographically ordered string keys, such as Bigtable. 9 | 10 | See the package documentation in orderedcode.go for details and examples. 11 | -------------------------------------------------------------------------------- /go.mod: -------------------------------------------------------------------------------- 1 | module github.com/google/orderedcode 2 | -------------------------------------------------------------------------------- /orderedcode.go: -------------------------------------------------------------------------------- 1 | // Copyright 2015 Google Inc. All Rights Reserved. 
2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // http://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 14 | 15 | /* 16 | Package orderedcode provides a byte encoding of a sequence of typed items. 17 | The resulting bytes can be lexicographically compared to yield the same 18 | ordering as item-wise comparison on the original sequences. 19 | 20 | More precisely, suppose: 21 | - A is the encoding of the sequence of items [A_1..A_n], 22 | - B is the encoding of the sequence of items [B_1..B_n], 23 | - For each i, A_i and B_i have the same type. 24 | Then comparing A versus B lexicographically is the same as comparing the 25 | vectors [A_1..A_n] and [B_1..B_n] lexicographically. 26 | 27 | Furthermore, if i < j then [A_1..A_i]'s encoding is a prefix of [A_1..A_j]'s 28 | encoding. 29 | 30 | The order-maintaining and prefix properties described above are useful for 31 | generating keys for databases like Bigtable. 32 | 33 | Call Append(buffer, item1, ..., itemN) to construct the encoded bytes. The 34 | valid item types are: 35 | - string, 36 | - struct{}, which is an 'infinity' that sorts greater than any other string, 37 | - orderedcode.StringOrInfinity, which is a union type, 38 | - TrailingString, 39 | - float64, 40 | - int64, 41 | - uint64. 42 | 43 | As a convenience, orderedcode.Infinity is a value of type struct{}. 
For 44 | example, to encode a sequence of two strings, an 'infinity' and a uint64: 45 | buf, err := orderedcode.Append( 46 | nil, "foo", "bar", orderedcode.Infinity, uint64(42)) 47 | if err != nil { 48 | return err 49 | } 50 | key := string(buf) 51 | 52 | Alternatively, encoding can be done in multiple steps: 53 | var buf []byte 54 | // Ignore errors, for demonstration purposes. 55 | buf, _ = orderedcode.Append(buf, "foo") 56 | buf, _ = orderedcode.Append(buf, "bar") 57 | buf, _ = orderedcode.Append(buf, orderedcode.Infinity) 58 | buf, _ = orderedcode.Append(buf, uint64(42)) 59 | key := string(buf) 60 | 61 | Call Parse(encoded, &item1, ..., &itemN) to deconstruct an encoded string. 62 | The valid argument types are the pointers to the valid encoding types. For 63 | example: 64 | var ( 65 | s1, s2 string 66 | infinity3 struct{} 67 | u4 uint64 68 | ) 69 | remainingKey, err := orderedcode.Parse(key, &s1, &s2, &infinity3, &u4) 70 | 71 | Alternatively: 72 | var ( 73 | x1, x2, x3 orderedcode.StringOrInfinity 74 | u4 uint64 75 | ) 76 | remainingKey, err := orderedcode.Parse(key, &x1, &x2, &x3, &u4) 77 | 78 | A TrailingString is a string that, if present, must be the last item appended 79 | or parsed. It is not mandatory to use a TrailingString; it is valid for the 80 | last item to be a standard string or any other type listed above. A 81 | TrailingString simply allows a more efficient encoding while retaining the 82 | lexicographic order-maintaining property. If used, you cannot append a 83 | TrailingString and parse the result as a standard string, or as a 84 | StringOrInfinity.
For example: 85 | key, err := orderedcode.Append( 86 | nil, "first", "middle", orderedcode.TrailingString("last")) 87 | if err != nil { 88 | return err 89 | } 90 | var ( 91 | s1, s2 string 92 | t3 orderedcode.TrailingString 93 | ) 94 | remainingKey, err := orderedcode.Parse(string(key), &s1, &s2, &t3) 95 | if err != nil { 96 | return err 97 | } 98 | fmt.Printf("trailing string: got s1=%q s2=%q t3=%q\n", s1, s2, t3) 99 | 100 | The same sequence of types should be used for encoding and decoding (although 101 | StringOrInfinity can substitute for either a string or a struct{}, but not for 102 | a TrailingString). The wire format is not fully self-describing: 103 | "\x00\x01\x04\x03\x02\x00\x01" is a valid encoding of both ["", "\x04\x03\x02"] 104 | and [uint64(0), uint64(4), uint64(0x20001)]. Decoding into a pointer of the 105 | wrong type may return corrupt data and no error. 106 | 107 | Each item can optionally be encoded in decreasing order. If the i'th item is 108 | encoded in decreasing order and the lexicographic comparison of A and B comes down to A_i versus B_i, then 109 | A < B will equal A_i > B_i. 110 | 111 | To encode in decreasing order, wrap the item in an orderedcode.Decr value. To 112 | decode, wrap the item pointer in an orderedcode.Decr. For example: 113 | key, err := orderedcode.Append(nil, "foo", orderedcode.Decr("bar")) 114 | if err != nil { 115 | return err 116 | } 117 | var s1, s2 string 118 | _, err = orderedcode.Parse(string(key), &s1, orderedcode.Decr(&s2)) 119 | if err != nil { 120 | return err 121 | } 122 | fmt.Printf("round trip: got s1=%q s2=%q\n", s1, s2) 123 | 124 | Each item's ordering is independent from other items, but the same ordering 125 | should be used to encode and decode the i'th item.
126 | */ 127 | package orderedcode // import "github.com/google/orderedcode" 128 | 129 | import ( 130 | "errors" 131 | "fmt" 132 | "math" 133 | ) 134 | 135 | func invert(b []byte) { 136 | for i := range b { 137 | b[i] ^= 0xff 138 | } 139 | } 140 | 141 | // Infinity is an encodable value that sorts greater than any other string. 142 | var Infinity struct{} 143 | 144 | // StringOrInfinity is a union type. If Infinity is true, String must be "", 145 | // and the value represents an 'infinity' that is greater than any other 146 | // string. Otherwise, the value is the String string. 147 | type StringOrInfinity struct { 148 | String string 149 | Infinity bool 150 | } 151 | 152 | // TrailingString is a string that, if present, must be the last item appended 153 | // or parsed. 154 | type TrailingString string 155 | 156 | // Decr wraps a value so that it is encoded or decoded in decreasing order. 157 | func Decr(val interface{}) interface{} { 158 | return decr{val} 159 | } 160 | 161 | type decr struct { 162 | val interface{} 163 | } 164 | 165 | // Append appends the encoded representations of items to buf. Items can have 166 | // different underlying types, but each item must have type T or be the value 167 | // Decr(somethingOfTypeT), for T in the set: string, struct{}, StringOrInfinity, 168 | // TrailingString, float64, int64 or uint64. 
169 | func Append(buf []byte, items ...interface{}) ([]byte, error) { 170 | for _, item := range items { 171 | n := len(buf) 172 | d, dOK := item.(decr) 173 | if dOK { 174 | item = d.val 175 | } 176 | 177 | switch x := item.(type) { 178 | case string: 179 | buf = appendString(buf, x) 180 | case struct{}: 181 | buf = appendInfinity(buf) 182 | case StringOrInfinity: 183 | if x.Infinity { 184 | if x.String != "" { 185 | return nil, errors.New("orderedcode: StringOrInfinity has non-zero String and non-zero Infinity") 186 | } 187 | buf = appendInfinity(buf) 188 | } else { 189 | buf = appendString(buf, x.String) 190 | } 191 | case TrailingString: 192 | buf = append(buf, string(x)...) 193 | case float64: 194 | buf = appendFloat64(buf, x) 195 | case int64: 196 | buf = appendInt64(buf, x) 197 | case uint64: 198 | buf = appendUint64(buf, x) 199 | default: 200 | return nil, fmt.Errorf("orderedcode: cannot append an item of type %T", item) 201 | } 202 | 203 | if dOK { 204 | invert(buf[n:]) 205 | } 206 | } 207 | return buf, nil 208 | } 209 | 210 | // The wire format for strings or infinity is: 211 | // - \x00\x01 terminates the string. 212 | // - \x00 bytes are escaped as \x00\xff. 213 | // - \xff bytes are escaped as \xff\x00. 214 | // - \xff\xff encodes 'infinity'. 215 | // - All other bytes are literals. 216 | const ( 217 | term = "\x00\x01" 218 | lit00 = "\x00\xff" 219 | litff = "\xff\x00" 220 | inf = "\xff\xff" 221 | ) 222 | 223 | func appendString(s []byte, x string) []byte { 224 | last := 0 225 | for i := 0; i < len(x); i++ { 226 | switch x[i] { 227 | case 0x00: 228 | s = append(s, x[last:i]...) 229 | s = append(s, lit00...) 230 | last = i + 1 231 | case 0xff: 232 | s = append(s, x[last:i]...) 233 | s = append(s, litff...) 234 | last = i + 1 235 | } 236 | } 237 | s = append(s, x[last:]...) 238 | s = append(s, term...) 239 | return s 240 | } 241 | 242 | func appendInfinity(s []byte) []byte { 243 | return append(s, inf...) 
244 | } 245 | 246 | // The wire format for a float64 value x is, for positive x, the encoding of 247 | // the 64 bits (in IEEE 754 format) re-interpreted as an int64. For negative 248 | // x, we keep the sign bit and invert all other bits. Negative zero is a 249 | // special case that encodes the same as positive zero. 250 | 251 | func appendFloat64(s []byte, x float64) []byte { 252 | i := int64(math.Float64bits(x)) 253 | if i < 0 { 254 | i = math.MinInt64 - i 255 | } 256 | return appendInt64(s, i) 257 | } 258 | 259 | // The wire format for an int64 value x is, for non-negative x, n leading 1 260 | // bits, followed by a 0 bit, followed by n-1 bytes. That entire slice, after 261 | // masking off the leading 1 bits, is the big-endian representation of x. 262 | // n is the smallest positive integer that can represent x this way. 263 | // 264 | // The encoding of a negative x is the inversion of the encoding for ^x. 265 | // Thus, the encoded form's leading bit is a sign bit: it is 0 for negative x 266 | // and 1 for non-negative x. 267 | // 268 | // For example: 269 | // - 0x23 encodes as 10 100011 270 | // n=1, the remainder after masking is 0x23. 271 | // - 0x10e encodes as 110 00001 00001110 272 | // n=2, the remainder after masking is 0x10e. 273 | // - -0x10f encodes as 001 11110 11110001 274 | // This is the inverse of the encoding of 0x10e. 275 | // There are many more examples in orderedcode_test.go. 276 | 277 | func appendInt64(s []byte, x int64) []byte { 278 | // Fast-path those values of x that encode to a single byte. 279 | if x >= -64 && x < 64 { 280 | return append(s, uint8(x)^0x80) 281 | } 282 | // If x is negative, invert it, and correct for this at the end. 283 | neg := x < 0 284 | if neg { 285 | x = ^x 286 | } 287 | // x is now non-negative, and so its encoding starts with a 1: the sign bit. 288 | n := 1 289 | // buf is 8 bytes for x's big-endian representation plus 2 bytes for leading 1 bits.
290 | var buf [10]byte 291 | // Fill buf from back-to-front. 292 | i := 9 293 | for x > 0 { 294 | buf[i] = byte(x) 295 | n, i, x = n+1, i-1, x>>8 296 | } 297 | // Check if we need a full byte of leading 1 bits. 7 is 8 - 1; the 8 is the 298 | // number of bits in a byte, and the 1 is because lengthening the encoding 299 | // by one byte requires incrementing n. 300 | leadingFFByte := n > 7 301 | if leadingFFByte { 302 | n -= 7 303 | } 304 | // If we can squash the leading 1 bits together with x's most significant byte, 305 | // then we can save one byte. 306 | // 307 | // We need to adjust 8-n by -1 for the separating 0 bit, but also by 308 | // +1 because we are trying to get away with one fewer leading 1 bits. 309 | // The two adjustments cancel each other out. 310 | if buf[i+1] < 1< 0 { 341 | buf[i] = byte(x) 342 | i, x = i-1, x>>8 343 | } 344 | // The front-most byte is n, the number of bytes in the big-endian representation. 345 | buf[i] = byte(8 - i) 346 | return append(s, buf[i:]...) 347 | } 348 | 349 | // For decreasing order, the encoded bytes are the bitwise-not of the regular 350 | // encoding. Bitwise-not of a byte is equivalent to bitwise-xor with 0xff, and 351 | // bitwise-xor with 0x00 is a no-op. 352 | const ( 353 | increasing byte = 0x00 354 | decreasing byte = 0xff 355 | ) 356 | 357 | // errCorrupt is returned from Parse if the input cannot be decoded into the 358 | // requested types. 359 | var errCorrupt = errors.New("orderedcode: corrupt input") 360 | 361 | // Parse parses the next len(items) of their respective types and returns any 362 | // remaining encoded data. Items can have different underlying types, but each 363 | // item must have type *T or be the value Decr(somethingOfTypeStarT), for T in 364 | // the set: string, struct{}, StringOrInfinity, TrailingString, float64, int64 365 | // or uint64. 
366 | func Parse(encoded string, items ...interface{}) (remaining string, err error) { 367 | for _, item := range items { 368 | dir := increasing 369 | if d, dOK := item.(decr); dOK { 370 | dir, item = decreasing, d.val 371 | } 372 | switch x := item.(type) { 373 | case *string: 374 | encoded, err = parseString(x, encoded, dir) 375 | case *struct{}: 376 | encoded, err = parseInfinity(encoded, dir) 377 | case *StringOrInfinity: 378 | if rem, err1 := parseInfinity(encoded, dir); err1 == nil { 379 | *x = StringOrInfinity{Infinity: true} 380 | encoded = rem 381 | } else { 382 | var s string 383 | encoded, err = parseString(&s, encoded, dir) 384 | if err == nil { 385 | *x = StringOrInfinity{String: s} 386 | } 387 | } 388 | case *TrailingString: 389 | if dir == increasing { 390 | *x, encoded = TrailingString(encoded), "" 391 | } else { 392 | b := []byte(encoded) 393 | invert(b) 394 | *x, encoded = TrailingString(b), "" 395 | } 396 | case *float64: 397 | encoded, err = parseFloat64(x, encoded, dir) 398 | case *int64: 399 | encoded, err = parseInt64(x, encoded, dir) 400 | case *uint64: 401 | encoded, err = parseUint64(x, encoded, dir) 402 | default: 403 | return "", fmt.Errorf("orderedcode: cannot parse an item of type %T", item) 404 | } 405 | if err != nil { 406 | return "", err 407 | } 408 | } 409 | return encoded, nil 410 | } 411 | 412 | func parseString(dst *string, s string, dir byte) (string, error) { 413 | var ( 414 | buf []byte 415 | last, i int 416 | ) 417 | for i < len(s) { 418 | switch v := s[i] ^ dir; v { 419 | case 0x00: 420 | if i+1 >= len(s) { 421 | return "", errCorrupt 422 | } 423 | switch s[i+1] ^ dir { 424 | case 0x01: 425 | // The terminator mark ends the string. 426 | if last == 0 && dir == increasing { 427 | // As an optimization, if no \x00 or \xff bytes were escaped, 428 | // and the result does not need inverting, then set *dst to a 429 | // sub-string of the original input. 
430 | *dst = s[:i] 431 | return s[i+2:], nil 432 | } 433 | buf = append(buf, s[last:i]...) 434 | if dir != increasing { 435 | invert(buf) 436 | } 437 | *dst = string(buf) 438 | return s[i+2:], nil 439 | case 0xff: 440 | // Unescape the \x00. 441 | buf = append(buf, s[last:i]...) 442 | buf = append(buf, 0x00^dir) 443 | i += 2 444 | last = i 445 | default: 446 | return "", errCorrupt 447 | } 448 | case 0xff: 449 | if i+1 >= len(s) || s[i+1]^dir != 0x00 { 450 | return "", errCorrupt 451 | } 452 | // Unescape the \xff. 453 | buf = append(buf, s[last:i]...) 454 | buf = append(buf, 0xff^dir) 455 | i += 2 456 | last = i 457 | default: 458 | i++ 459 | } 460 | } 461 | return "", errCorrupt 462 | } 463 | 464 | func parseInfinity(s string, dir byte) (string, error) { 465 | if len(s) < 2 { 466 | return "", errCorrupt 467 | } 468 | for i := 0; i < 2; i++ { 469 | if s[i]^dir != inf[i] { 470 | return "", errCorrupt 471 | } 472 | } 473 | return s[2:], nil 474 | } 475 | 476 | func parseFloat64(dst *float64, s string, dir byte) (string, error) { 477 | var i int64 478 | s, err := parseInt64(&i, s, dir) 479 | if err != nil { 480 | return "", err 481 | } 482 | if i < 0 { 483 | i = math.MinInt64 - i 484 | } 485 | *dst = math.Float64frombits(uint64(i)) 486 | return s, nil 487 | } 488 | 489 | func parseInt64(dst *int64, s string, dir byte) (string, error) { 490 | if len(s) == 0 { 491 | return "", errCorrupt 492 | } 493 | // Fast-path any single-byte encoding. 494 | c := s[0] ^ dir 495 | if c >= 0x40 && c < 0xc0 { 496 | *dst = int64(int8(c ^ 0x80)) 497 | return s[1:], nil 498 | } 499 | // Invert everything if the encoded value is negative. 500 | neg := c&0x80 == 0 501 | if neg { 502 | c, dir = ^c, ^dir 503 | } 504 | // Consume the leading 0xff full of 1 bits, if present. 505 | n := 0 506 | if c == 0xff { 507 | if len(s) == 1 { 508 | return "", errCorrupt 509 | } 510 | s = s[1:] 511 | c = s[0] ^ dir 512 | // The encoding of the largest int64 (1<<63-1) starts with "\xff\xc0". 
513 | if c > 0xc0 { 514 | return "", errCorrupt 515 | } 516 | // The 7 (being 8 - 1) is for the same reason as in appendInt64. 517 | n = 7 518 | } 519 | // Count and mask off any remaining 1 bits. 520 | for mask := byte(0x80); c&mask != 0; mask >>= 1 { 521 | c &= ^mask 522 | n++ 523 | } 524 | if len(s) < n { 525 | return "", errCorrupt 526 | } 527 | // Decode the big-endian, invert if necessary, and return. 528 | x := int64(c) 529 | for i := 1; i < n; i++ { 530 | c = s[i] ^ dir 531 | x = x<<8 | int64(c) 532 | } 533 | if neg { 534 | x = ^x 535 | } 536 | *dst = x 537 | return s[n:], nil 538 | } 539 | 540 | func parseUint64(dst *uint64, s string, dir byte) (string, error) { 541 | if len(s) == 0 { 542 | return "", errCorrupt 543 | } 544 | n := int(s[0] ^ dir) 545 | if n > 8 || len(s) < 1+n { 546 | return "", errCorrupt 547 | } 548 | x := uint64(0) 549 | for i := 0; i < n; i++ { 550 | x = x<<8 | uint64(s[1+i]^dir) 551 | } 552 | *dst = x 553 | return s[1+n:], nil 554 | } 555 | -------------------------------------------------------------------------------- /orderedcode_test.go: -------------------------------------------------------------------------------- 1 | // Copyright 2015 Google Inc. All Rights Reserved. 2 | // 3 | // Licensed under the Apache License, Version 2.0 (the "License"); 4 | // you may not use this file except in compliance with the License. 5 | // You may obtain a copy of the License at 6 | // 7 | // http://www.apache.org/licenses/LICENSE-2.0 8 | // 9 | // Unless required by applicable law or agreed to in writing, software 10 | // distributed under the License is distributed on an "AS IS" BASIS, 11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | // See the License for the specific language governing permissions and 13 | // limitations under the License. 

package orderedcode

import (
	"bytes"
	"fmt"
	"math"
	"math/rand"
	"reflect"
	"testing"
	"time"
)

// See http://en.wikipedia.org/wiki/IEEE_754-1985 for the bit-level float64 format.
var neg0 = math.Float64frombits(0x8000000000000000)

func init() {
	// IEEE 754 states that negative zero == positive zero.
	if neg0 != 0 {
		panic("neg0 != 0")
	}
}

var testCases = []struct {
	val interface{}
	enc string
}{
	// Strings.
	{"", "\x00\x01"},
	{"\x00", "\x00\xff\x00\x01"},
	{"\x00\x00", "\x00\xff\x00\xff\x00\x01"},
	{"\x01", "\x01\x00\x01"},
	{"foo", "foo\x00\x01"},
	{"foo\x00", "foo\x00\xff\x00\x01"},
	{"foo\x00\x01", "foo\x00\xff\x01\x00\x01"},
	{"foo\x01", "foo\x01\x00\x01"},
	{"foo\x01\x00", "foo\x01\x00\xff\x00\x01"},
	{"foo\xfe", "foo\xfe\x00\x01"},
	{"foo\xff", "foo\xff\x00\x00\x01"},
	{"\xff", "\xff\x00\x00\x01"},
	{"\xff\xff", "\xff\x00\xff\x00\x00\x01"},
	// Infinity.
	{Infinity, "\xff\xff"},
	// Float64s.
	{float64(math.Inf(-1)), "\x00\x3f\x80\x10\x00\x00\x00\x00\x00\x00"},
	{float64(-math.MaxFloat64), "\x00\x3f\x80\x10\x00\x00\x00\x00\x00\x01"},
	{float64(-2.71828), "\x00\x3f\xbf\xfa\x40\xf6\x6a\x55\x08\x70"},
	{float64(-1.0), "\x00\x40\x10\x00\x00\x00\x00\x00\x00"},
	{float64(-math.SmallestNonzeroFloat64), "\x7f"},
	{neg0, "\x80"},
	{float64(0), "\x80"},
	{float64(+math.SmallestNonzeroFloat64), "\x81"},
	{float64(+0.333333333), "\xff\xbf\xd5\x55\x55\x54\xf9\xb5\x16"},
	{float64(+1.0), "\xff\xbf\xf0\x00\x00\x00\x00\x00\x00"},
	{float64(+1.41421), "\xff\xbf\xf6\xa0\x9a\xaa\x3a\xd1\x8d"},
	{float64(+1.5), "\xff\xbf\xf8\x00\x00\x00\x00\x00\x00"},
	{float64(+2.0), "\xff\xc0\x40\x00\x00\x00\x00\x00\x00\x00"},
	{float64(+3.14159), "\xff\xc0\x40\x09\x21\xf9\xf0\x1b\x86\x6e"},
	{float64(+6.022e23), "\xff\xc0\x44\xdf\xe1\x54\xf4\x57\xea\x13"},
	{float64(+math.MaxFloat64), "\xff\xc0\x7f\xef\xff\xff\xff\xff\xff\xff"},
	{float64(math.Inf(+1)), "\xff\xc0\x7f\xf0\x00\x00\x00\x00\x00\x00"},
	// Int64s (values near zero).
	{int64(-8193), "\x1f\xdf\xff"}, // 00011111 11011111 11111111
	{int64(-8192), "\x20\x00"},     // 00100000 00000000
	{int64(-4097), "\x2f\xff"},     // 00101111 11111111
	{int64(-257), "\x3e\xff"},      // 00111110 11111111
	{int64(-256), "\x3f\x00"},      // 00111111 00000000
	{int64(-66), "\x3f\xbe"},       // 00111111 10111110
	{int64(-65), "\x3f\xbf"},       // 00111111 10111111
	{int64(-64), "\x40"},           // 01000000
	{int64(-63), "\x41"},           // 01000001
	{int64(-3), "\x7d"},            // 01111101
	{int64(-2), "\x7e"},            // 01111110
	{int64(-1), "\x7f"},            // 01111111
	{int64(+0), "\x80"},            // 10000000
	{int64(+1), "\x81"},            // 10000001
	{int64(+2), "\x82"},            // 10000010
	{int64(+62), "\xbe"},           // 10111110
	{int64(+63), "\xbf"},           // 10111111
	{int64(+64), "\xc0\x40"},       // 11000000 01000000
	{int64(+65), "\xc0\x41"},       // 11000000 01000001
	{int64(+255), "\xc0\xff"},      // 11000000 11111111
	{int64(+256), "\xc1\x00"},      // 11000001 00000000
	{int64(+4096), "\xd0\x00"},     // 11010000 00000000
	{int64(+8191), "\xdf\xff"},     // 11011111 11111111
	{int64(+8192), "\xe0\x20\x00"}, // 11100000 00100000 00000000
	// Int64s.
	{int64(-0x800), "\x38\x00"},
	{int64(0x424242), "\xf0\x42\x42\x42"},
	{int64(0x23), "\xa3"},
	{int64(0x10e), "\xc1\x0e"},
	{int64(-0x10f), "\x3e\xf1"},
	{int64(0x020b0c0d), "\xf2\x0b\x0c\x0d"},
	{int64(0x0a0b0c0d), "\xf8\x0a\x0b\x0c\x0d"},
	{int64(0x0102030405060708), "\xff\x81\x02\x03\x04\x05\x06\x07\x08"},
	// Int64s (edge cases).
	{int64(-1<<63 - 0), "\x00\x3f\x80\x00\x00\x00\x00\x00\x00\x00"}, // 00000000 00111111 10000000 0x00 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(-1<<62 - 1), "\x00\x3f\xbf\xff\xff\xff\xff\xff\xff\xff"}, // 00000000 00111111 10111111 0xff 0xff 0xff 0xff 0xff 0xff 0xff
	{int64(-1<<62 - 0), "\x00\x40\x00\x00\x00\x00\x00\x00\x00"},     // 00000000 01000000 00000000 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(-1<<55 - 1), "\x00\x7f\x7f\xff\xff\xff\xff\xff\xff"},     // 00000000 01111111 01111111 0xff 0xff 0xff 0xff 0xff 0xff
	{int64(-1<<55 - 0), "\x00\x80\x00\x00\x00\x00\x00\x00"},         // 00000000 10000000 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(-1<<48 - 1), "\x00\xfe\xff\xff\xff\xff\xff\xff"},         // 00000000 11111110 0xff 0xff 0xff 0xff 0xff 0xff
	{int64(-1<<48 - 0), "\x01\x00\x00\x00\x00\x00\x00"},             // 00000001 00000000 0x00 0x00 0x00 0x00 0x00
	{int64(-1<<41 - 1), "\x01\xfd\xff\xff\xff\xff\xff"},             // 00000001 11111101 0xff 0xff 0xff 0xff 0xff
	{int64(-1<<41 - 0), "\x02\x00\x00\x00\x00\x00"},                 // 00000010 00000000 0x00 0x00 0x00 0x00
	{int64(-1<<34 - 1), "\x03\xfb\xff\xff\xff\xff"},                 // 00000011 11111011 0xff 0xff 0xff 0xff
	{int64(-1<<34 - 0), "\x04\x00\x00\x00\x00"},                     // 00000100 00000000 0x00 0x00 0x00
	{int64(-1<<27 - 1), "\x07\xf7\xff\xff\xff"},                     // 00000111 11110111 0xff 0xff 0xff
	{int64(-1<<27 - 0), "\x08\x00\x00\x00"},                         // 00001000 00000000 0x00 0x00
	{int64(-1<<20 - 1), "\x0f\xef\xff\xff"},                         // 00001111 11101111 0xff 0xff
	{int64(-1<<20 - 0), "\x10\x00\x00"},                             // 00010000 00000000 0x00
	{int64(-1<<13 - 1), "\x1f\xdf\xff"},                             // 00011111 11011111 0xff
	{int64(-1<<13 - 0), "\x20\x00"},                                 // 00100000 00000000
	{int64(-1<<6 - 1), "\x3f\xbf"},                                  // 00111111 10111111
	{int64(-1<<6 - 0), "\x40"},                                      // 01000000
	{int64(+1<<6 - 1), "\xbf"},                                      // 10111111
	{int64(+1<<6 - 0), "\xc0\x40"},                                  // 11000000 01000000
	{int64(+1<<13 - 1), "\xdf\xff"},                                 // 11011111 11111111
	{int64(+1<<13 - 0), "\xe0\x20\x00"},                             // 11100000 00100000 0x00
	{int64(+1<<20 - 1), "\xef\xff\xff"},                             // 11101111 11111111 0xff
	{int64(+1<<20 - 0), "\xf0\x10\x00\x00"},                         // 11110000 00010000 0x00 0x00
	{int64(+1<<27 - 1), "\xf7\xff\xff\xff"},                         // 11110111 11111111 0xff 0xff
	{int64(+1<<27 - 0), "\xf8\x08\x00\x00\x00"},                     // 11111000 00001000 0x00 0x00 0x00
	{int64(+1<<34 - 1), "\xfb\xff\xff\xff\xff"},                     // 11111011 11111111 0xff 0xff 0xff
	{int64(+1<<34 - 0), "\xfc\x04\x00\x00\x00\x00"},                 // 11111100 00000100 0x00 0x00 0x00 0x00
	{int64(+1<<41 - 1), "\xfd\xff\xff\xff\xff\xff"},                 // 11111101 11111111 0xff 0xff 0xff 0xff
	{int64(+1<<41 - 0), "\xfe\x02\x00\x00\x00\x00\x00"},             // 11111110 00000010 0x00 0x00 0x00 0x00 0x00
	{int64(+1<<48 - 1), "\xfe\xff\xff\xff\xff\xff\xff"},             // 11111110 11111111 0xff 0xff 0xff 0xff 0xff
	{int64(+1<<48 - 0), "\xff\x01\x00\x00\x00\x00\x00\x00"},         // 11111111 00000001 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(+1<<55 - 1), "\xff\x7f\xff\xff\xff\xff\xff\xff"},         // 11111111 01111111 0xff 0xff 0xff 0xff 0xff 0xff
	{int64(+1<<55 - 0), "\xff\x80\x80\x00\x00\x00\x00\x00\x00"},     // 11111111 10000000 10000000 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(+1<<62 - 1), "\xff\xbf\xff\xff\xff\xff\xff\xff\xff"},     // 11111111 10111111 11111111 0xff 0xff 0xff 0xff 0xff 0xff
	{int64(+1<<62 - 0), "\xff\xc0\x40\x00\x00\x00\x00\x00\x00\x00"}, // 11111111 11000000 01000000 0x00 0x00 0x00 0x00 0x00 0x00 0x00
	{int64(+1<<63 - 1), "\xff\xc0\x7f\xff\xff\xff\xff\xff\xff\xff"}, // 11111111 11000000 01111111 0xff 0xff 0xff 0xff 0xff 0xff 0xff
	// Uint64s.
	{uint64(0), "\x00"},
	{uint64(1), "\x01\x01"},
	{uint64(255), "\x01\xff"},
	{uint64(256), "\x02\x01\x00"},
	{uint64(1025), "\x02\x04\x01"},
	{uint64(0x0a0b0c0d), "\x04\x0a\x0b\x0c\x0d"},
	{uint64(0x0102030405060708), "\x08\x01\x02\x03\x04\x05\x06\x07\x08"},
	{uint64(1<<64 - 1), "\x08\xff\xff\xff\xff\xff\xff\xff\xff"},
}

func invertString(s string) string {
	b := []byte(s)
	for i := range b {
		b[i] = ^b[i]
	}
	return string(b)
}

// expect checks that decoding enc with direction dir and val's type yields
// val and exhausts the input.
func expect(enc string, dir byte, val interface{}) error {
	dst := reflect.New(reflect.TypeOf(val))
	item := dst.Interface()
	if dir == decreasing {
		item = Decr(item)
	}
	enc, err := Parse(enc, item)
	if err != nil {
		return fmt.Errorf("val=%v of type %T: got error %v", val, val, err)
	}
	if got := dst.Elem().Interface(); !reflect.DeepEqual(got, val) {
		return fmt.Errorf("val=%v of type %T: got %v, want %v", val, val, got, val)
	}
	if len(enc) != 0 {
		return fmt.Errorf("code was not exhausted, remainder has length %d", len(enc))
	}
	return nil
}

func TestIndividualEncodings(t *testing.T) {
	for _, tc := range testCases {
		// Test in-increasing-order.
		buf0, err := Append(nil, tc.val)
		if err != nil {
			t.Errorf("append incr: val=%v of type %T: %v", tc.val, tc.val, err)
			continue
		}
		enc0 := string(buf0)
		if enc0 != tc.enc {
			t.Errorf("append incr: val=%v of type %T:\ngot % x\nwant % x", tc.val, tc.val, enc0, tc.enc)
			continue
		}
		if err := expect(enc0, increasing, tc.val); err != nil {
			t.Errorf("parse incr: %v", err)
		}

		// Test in-decreasing-order.
		buf1, err := Append(nil, Decr(tc.val))
		if err != nil {
			t.Errorf("append decr: val=%v of type %T: %v", tc.val, tc.val, err)
			continue
		}
		enc1 := string(buf1)
		// The in-decreasing-order encoding should be the bitwise-not of the regular encoding.
		if enc1 != invertString(tc.enc) {
			t.Errorf("append decr: val=%v of type %T:\ngot % x\nwant % x", tc.val, tc.val, enc1, invertString(tc.enc))
			continue
		}
		if err := expect(enc1, decreasing, tc.val); err != nil {
			t.Errorf("parse decr: %v", err)
		}
	}
}

func TestConcatenation(t *testing.T) {
	// The encoding of multiple values should equal the concatenation of
	// the individual encodings.
	var (
		items []interface{}
		buf1  bytes.Buffer
	)
	for _, tc := range testCases {
		items = append(items, tc.val)
		buf1.WriteString(tc.enc)
	}
	buf0, err := Append(nil, items...)
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	if s0, s1 := string(buf0), buf1.String(); s0 != s1 {
		t.Errorf("\ngot %q\nwant %q", s0, s1)
	}
}

func TestNaN(t *testing.T) {
	buf, err := Append(nil, math.NaN())
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	var f float64
	_, err = Parse(string(buf), &f)
	if err != nil {
		t.Fatalf("parse: %v", err)
	}
	if !math.IsNaN(f) {
		t.Errorf("got %v want NaN", f)
	}
}

func TestTrailingString(t *testing.T) {
	testCases := []string{
		"",
		"\x00",
		"\x00\x01",
		"a",
		"bcd",
		"foo\x00",
		"foo\x00bar",
		"foo\x00bar\x00",
		"\xff",
		"\xff\x00",
		"\xff\xfe",
		"\xff\xff",
	}
	for _, decr := range []bool{false, true} {
		for _, tc := range testCases {
			src := interface{}(TrailingString(tc))
			if decr {
				src = Decr(src)
			}
			buf, err := Append(nil, src)
			if err != nil {
				t.Errorf("decr=%v, tc=%q: append: %v", decr, tc, err)
				continue
			}

			enc, encWant := string(buf), tc
			if decr {
				encWant = invertString(encWant)
			}
			if enc != encWant {
				t.Errorf("decr=%v, tc=%q: append: got %q want %q", decr, tc, enc, encWant)
				continue
			}

			var x TrailingString
			dst := interface{}(&x)
			if decr {
				dst = Decr(dst)
			}
			rem, err := Parse(enc, dst)
			if err != nil {
				t.Errorf("decr=%v, tc=%q: parse: %v", decr, tc, err)
				continue
			}
			if rem != "" {
				t.Errorf(`decr=%v, tc=%q: parse: got remainder %q want ""`, decr, tc, rem)
				continue
			}
			if string(x) != tc {
				t.Errorf("decr=%v, tc=%q: parse: got %q want %q", decr, tc, x, tc)
				continue
			}
		}
	}
}

func TestIncrDecr(t *testing.T) {
	buf, err := Append(nil,
		uint64(0),
		Decr(uint64(1)),
		uint64(2),
		Decr(uint64(516)),
		uint64(517),
		Decr(uint64(0)),
	)
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	got := string(buf)
	want := "\x00" + "\xfe\xfe" + "\x01\x02" + "\xfd\xfd\xfb" + "\x02\x02\x05" + "\xff"
	if got != want {
		t.Errorf("\ngot %q\nwant %q", got, want)
	}
}

func TestRoundTrip(t *testing.T) {
	key, err := Append(nil, "foo", Decr("bar"))
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	var s1, s2 string
	_, err = Parse(string(key), &s1, Decr(&s2))
	if err != nil {
		t.Fatalf("parse: %v", err)
	}
	if s1 != "foo" || s2 != "bar" {
		t.Fatalf("got s1=%q s2=%q, want s1=%q s2=%q\n", s1, s2, "foo", "bar")
	}
}

func TestRandomStrings(t *testing.T) {
	const maxStrLen = 16
	seed := time.Now().UnixNano()
	t.Logf("random seed = %v", seed)
	// generator returns a func() string that is an infinite iterator of strings.
	// Calling that func() string returns the next string and advances the iterator.
	// Calling generator twice results in two independent iterators that yield
	// the same pseudo-random sequence of strings.
	generator := func() func() string {
		r := rand.New(rand.NewSource(seed))
		return func() string {
			b := make([]byte, r.Intn(maxStrLen))
			for i := range b {
				b[i] = byte(r.Intn(256))
			}
			return string(b)
		}
	}
	const n = 1e5
	g0, g1, items := generator(), generator(), make([]interface{}, n)
	for i := 0; i < n; i++ {
		items[i] = g0()
	}
	buf, err := Append(nil, items...)
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	enc := string(buf)
	if len(enc) < n*maxStrLen/2 {
		// On average, each of the n strings has length maxStrLen/2 before encoding.
		// The encoded length is greater, due to escaping and the terminator mark.
		t.Fatalf("enc is too short, length=%d", len(enc))
	}
	for i := 0; i < n; i++ {
		var got string
		enc, err = Parse(enc, &got)
		if err != nil {
			t.Fatalf("i=%d: %v", i, err)
		}
		if want := g1(); got != want {
			t.Fatalf("i=%d: got %q, want %q", i, got, want)
		}
	}
	if len(enc) != 0 {
		t.Errorf("code was not exhausted, remainder has length %d", len(enc))
	}
}

func TestRandomInt64s(t *testing.T) {
	seed := time.Now().UnixNano()
	t.Logf("random seed = %v", seed)
	generator := func() func() int64 {
		r := rand.New(rand.NewSource(seed))
		return func() int64 {
			x := int64(r.Uint32())
			y := int64(r.Uint32())
			return x<<32 | y
		}
	}
	const n = 1e5
	g0, g1, items := generator(), generator(), make([]interface{}, n)
	for i := 0; i < n; i++ {
		items[i] = g0()
	}
	buf, err := Append(nil, items...)
	if err != nil {
		t.Fatalf("append: %v", err)
	}
	enc := string(buf)
	for i := 0; i < n; i++ {
		var got int64
		enc, err = Parse(enc, &got)
		if err != nil {
			t.Fatalf("i=%d: %v", i, err)
		}
		if want := g1(); got != want {
			t.Fatalf("i=%d: got %d, want %d", i, got, want)
		}
	}
	if len(enc) != 0 {
		t.Errorf("code was not exhausted, remainder has length %d", len(enc))
	}
}

func TestStringOrInfinity(t *testing.T) {
	check := func(got StringOrInfinity, want interface{}) error {
		if got.String != "" && got.Infinity {
			return fmt.Errorf("StringOrInfinity has non-zero String and non-zero Infinity: %v", got)
		}
		switch v := want.(type) {
		case string:
			if got.String != v {
				return fmt.Errorf("got %q, want %q", got.String, v)
			}
		case struct{}:
			if !got.Infinity {
				return fmt.Errorf("got not-infinity, want infinity")
			}
		default:
			panic("unreachable")
		}
		return nil
	}

	vals := []interface{}{
		"foo",
		"bar",
		Infinity,
		"",
		"\x00",
		Infinity,
		Infinity,
		"\xff",
		"AB\x00\x01\x02MN\xfd\xfe\xffYZ",
	}
	buf, err := Append(nil, vals...)
	if err != nil {
		t.Fatalf("append: %v", err)
	}

	// Test parsing one at a time.
	enc := string(buf)
	for i, val := range vals {
		var x StringOrInfinity
		enc, err = Parse(enc, &x)
		if err != nil {
			t.Fatalf("parse one: i=%d: Parse: %v", i, err)
		}
		if err := check(x, val); err != nil {
			t.Fatalf("parse one: i=%d: %v", i, err)
		}
	}
	if len(enc) != 0 {
		t.Errorf("parse one: code was not exhausted, remainder=%q", enc)
	}

	// Test parsing many at a time.
	enc = string(buf)
	got := make([]interface{}, len(vals))
	for i := range got {
		got[i] = new(StringOrInfinity)
	}
	enc, err = Parse(enc, got...)
	if err != nil {
		t.Fatalf("parse many: Parse: %v", err)
	}
	for i, p := range got {
		if err := check(*p.(*StringOrInfinity), vals[i]); err != nil {
			t.Fatalf("parse many: i=%d: %v", i, err)
		}
	}
	if len(enc) != 0 {
		t.Errorf("parse many: code was not exhausted, remainder=%q", enc)
	}
}

func TestCorruptStringOrInfinity(t *testing.T) {
	var dst0, dst1, dst2 StringOrInfinity

	// Parse one StringOrInfinity value.
	input := "\x00" // The "\x00" is neither a valid string nor a valid infinity.
	if _, err := Parse(input, &dst0); err != errCorrupt {
		t.Errorf("parse one: got %v, want errCorrupt", err)
	}

	// Parse many StringOrInfinity values.
	input = "foo\x00\x01" + "\xff\xff" + "\x00"
	if _, err := Parse(input, &dst0, &dst1, &dst2); err != errCorrupt {
		t.Errorf("parse many: got %v, want errCorrupt", err)
	}
}

func TestCorrupt(t *testing.T) {
	testCases := []struct {
		dst    interface{}
		inputs []string
	}{
		{
			new(string),
			[]string{
				"",
				"\x00", // A valid uint64, but not a valid string.
				"\x00\x00",
				"\x00\x00\x01",
				"\x00\x02",
				"abc",
				"abc\xff\xff",
				"foo\x00",
				"\xa3", // A valid float64 or int64, but not a valid string.
				"\xff",
				"\xff\x00",
				"\xff\xfe",
				"\xff\xff", // A valid infinity, but not a valid string.
			},
		},
		{
			&Infinity,
			[]string{
				"",
				"\x00", // A valid uint64, but not a valid infinity.
				"abc",
				"foo\x00\x01", // A valid string, but not a valid infinity.
				"\xa3",        // A valid float64 or int64, but not a valid infinity.
				"\xff",
				"\xff\x00",
				"\xff\xfe",
			},
		},
		{
			new(float64),
			[]string{
				"",
				"\x00", // A valid uint64, but not a valid float64.
				"\x00\x00",
				"\x00\x00abcdefghijklmnopqrst",
				"\x00\x01", // A valid string, but not a valid float64.
				"\xc0",
				"\xf0\x00",
				"\xff\xffabcdefghijklmnopqrst",
				"\xff\xff", // A valid infinity, but not a valid float64.
			},
		},
		{
			new(int64),
			[]string{
				"",
				"\x00", // A valid uint64, but not a valid int64.
				"\x00\x00",
				"\x00\x00abcdefghijklmnopqrst",
				"\x00\x01", // A valid string, but not a valid int64.
				"\xc0",
				"\xf0\x00",
				"\xff\xffabcdefghijklmnopqrst",
				"\xff\xff", // A valid infinity, but not a valid int64.
			},
		},
		{
			new(uint64),
			[]string{
				"",
				"\x01",
				"\x08abcd",
				"\x09abcdefghijklmnopqrst",
				"abc",
				"abc\xff\xff",
				"foo\x00\x01", // A valid string, but not a valid uint64.
				"\xa3",        // A valid float64 or int64, but not a valid uint64.
				"\xff",
				"\xff\x00",
				"\xff\xfe",
				"\xff\xff", // A valid infinity, but not a valid uint64.
			},
		},
	}
	for _, tc := range testCases {
		for _, input := range tc.inputs {
			if _, err := Parse(input, tc.dst); err != errCorrupt {
				t.Errorf("dst has type %T, input=%q: got %v want errCorrupt", tc.dst, input, err)
			}
		}
	}
}

--------------------------------------------------------------------------------
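
The uint64 test vectors and parseUint64 in the files above suggest a simple layout: one length byte n (0 to 8), then the n significant bytes of the value in big-endian order, with every byte XORed with 0xff for a decreasing encoding. The following is a minimal, standalone sketch of that layout using only the standard library. The name encodeUint64 and its decr parameter are hypothetical and are not taken from the package; the sketch only illustrates why byte-wise comparison of the encodings agrees with numeric order, and it is not the library's own implementation.

```go
package main

import (
	"bytes"
	"fmt"
)

// encodeUint64 is a hypothetical helper for illustration only. It appends a
// length byte followed by the minimal big-endian bytes of x, the layout that
// the uint64 test vectors imply. When decr is true, every emitted byte is
// bitwise-inverted, mirroring the "dir" XOR used by the parse functions.
func encodeUint64(dst []byte, x uint64, decr bool) []byte {
	// Count the significant bytes of x (zero has none).
	n := 0
	for v := x; v != 0; v >>= 8 {
		n++
	}
	dir := byte(0x00)
	if decr {
		dir = 0xff
	}
	dst = append(dst, byte(n)^dir)
	// Emit the most significant byte first.
	for i := n - 1; i >= 0; i-- {
		dst = append(dst, byte(x>>(8*uint(i)))^dir)
	}
	return dst
}

func main() {
	a := encodeUint64(nil, 255, false)
	b := encodeUint64(nil, 256, false)
	fmt.Printf("% x\n", a) // 01 ff
	fmt.Printf("% x\n", b) // 02 01 00

	// A shorter value has a smaller length byte, so it sorts first.
	fmt.Println(bytes.Compare(a, b)) // -1

	// Inverting every byte reverses the byte-wise order.
	da := encodeUint64(nil, 255, true)
	db := encodeUint64(nil, 256, true)
	fmt.Println(bytes.Compare(da, db)) // 1
}
```

The length prefix is what keeps the order correct across values of different widths: comparing the raw big-endian bytes alone would place "\x01\x00" (256) before "\xff" (255).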