├── LICENSE
├── README.md
├── phc-sf-parse.c
└── phc-sf-spec.md


/LICENSE:
--------------------------------------------------------------------------------
CC0 1.0 Universal

Statement of Purpose

The laws of most jurisdictions throughout the world automatically confer
exclusive Copyright and Related Rights (defined below) upon the creator and
subsequent owner(s) (each and all, an "owner") of an original work of
authorship and/or a database (each, a "Work").

Certain owners wish to permanently relinquish those rights to a Work for the
purpose of contributing to a commons of creative, cultural and scientific
works ("Commons") that the public can reliably and without fear of later
claims of infringement build upon, modify, incorporate in other works, reuse
and redistribute as freely as possible in any form whatsoever and for any
purposes, including without limitation commercial purposes. These owners may
contribute to the Commons to promote the ideal of a free culture and the
further production of creative, cultural and scientific works, or to gain
reputation or greater distribution for their Work in part through the use and
efforts of others.

For these and/or other purposes and motivations, and without any expectation
of additional consideration or compensation, the person associating CC0 with a
Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
and publicly distribute the Work under its terms, with knowledge of his or her
Copyright and Related Rights in the Work and the meaning and intended legal
effect of CC0 on those rights.

1. Copyright and Related Rights. A Work made available under CC0 may be
protected by copyright and related or neighboring rights ("Copyright and
Related Rights").
Copyright and Related Rights include, but are not limited
to, the following:

  i. the right to reproduce, adapt, distribute, perform, display, communicate,
     and translate a Work;

  ii. moral rights retained by the original author(s) and/or performer(s);

  iii. publicity and privacy rights pertaining to a person's image or likeness
     depicted in a Work;

  iv. rights protecting against unfair competition in regards to a Work,
     subject to the limitations in paragraph 4(a), below;

  v. rights protecting the extraction, dissemination, use and reuse of data in
     a Work;

  vi. database rights (such as those arising under Directive 96/9/EC of the
     European Parliament and of the Council of 11 March 1996 on the legal
     protection of databases, and under any national implementation thereof,
     including any amended or successor version of such directive); and

  vii. other similar, equivalent or corresponding rights throughout the world
     based on applicable law or treaty, and any national implementations thereof.

2. Waiver. To the greatest extent permitted by, but not in contravention of,
applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
and Related Rights and associated claims and causes of action, whether now
known or unknown (including existing as well as future claims and causes of
action), in the Work (i) in all territories worldwide, (ii) for the maximum
duration provided by applicable law or treaty (including future time
extensions), (iii) in any current or future medium and for any number of
copies, and (iv) for any purpose whatsoever, including without limitation
commercial, advertising or promotional purposes (the "Waiver").
Affirmer makes
the Waiver for the benefit of each member of the public at large and to the
detriment of Affirmer's heirs and successors, fully intending that such Waiver
shall not be subject to revocation, rescission, cancellation, termination, or
any other legal or equitable action to disrupt the quiet enjoyment of the Work
by the public as contemplated by Affirmer's express Statement of Purpose.

3. Public License Fallback. Should any part of the Waiver for any reason be
judged legally invalid or ineffective under applicable law, then the Waiver
shall be preserved to the maximum extent permitted taking into account
Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
is so judged Affirmer hereby grants to each affected person a royalty-free,
non transferable, non sublicensable, non exclusive, irrevocable and
unconditional license to exercise Affirmer's Copyright and Related Rights in
the Work (i) in all territories worldwide, (ii) for the maximum duration
provided by applicable law or treaty (including future time extensions), (iii)
in any current or future medium and for any number of copies, and (iv) for any
purpose whatsoever, including without limitation commercial, advertising or
promotional purposes (the "License"). The License shall be deemed effective as
of the date CC0 was applied by Affirmer to the Work. Should any part of the
License for any reason be judged legally invalid or ineffective under
applicable law, such partial invalidity or ineffectiveness shall not
invalidate the remainder of the License, and in such case Affirmer hereby
affirms that he or she will not (i) exercise any of his or her remaining
Copyright and Related Rights in the Work or (ii) assert any associated claims
and causes of action with respect to the Work, in either case contrary to
Affirmer's express Statement of Purpose.

4. Limitations and Disclaimers.

  a. No trademark or patent rights held by Affirmer are waived, abandoned,
     surrendered, licensed or otherwise affected by this document.

  b. Affirmer offers the Work as-is and makes no representations or warranties
     of any kind concerning the Work, express, implied, statutory or otherwise,
     including without limitation warranties of title, merchantability, fitness
     for a particular purpose, non infringement, or the absence of latent or
     other defects, accuracy, or the present or absence of errors, whether or not
     discoverable, all to the greatest extent permissible under applicable law.

  c. Affirmer disclaims responsibility for clearing rights of other persons
     that may apply to the Work or any use thereof, including without limitation
     any person's Copyright and Related Rights in the Work. Further, Affirmer
     disclaims responsibility for obtaining any necessary consents, permissions
     or other rights required for any use of the Work.

  d. Affirmer understands and acknowledges that Creative Commons is not a
     party to this document and has no duty or obligation with respect to this
     CC0 or use of the Work.

For more information, please see


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

# PHC string format

Specification of

* An extended crypt(3) encoding format to support additional parameters
  of password hashing functions.

* An encoding for the PHC winner Argon2.

--------------------------------------------------------------------------------
/phc-sf-parse.c:
--------------------------------------------------------------------------------
/*
 * Example code for a decoder and encoder of "hash strings", with Argon2i
 * parameters.
 *
 * This code comprises three sections:
 *
 *   -- The first section contains generic Base64 encoding and decoding
 *   functions. It is conceptually applicable to any hash function
 *   implementation that uses Base64 to encode and decode parameters,
 *   salts and outputs. It could be made into a library, provided that
 *   the relevant functions are made public (non-static) and be given
 *   reasonable names to avoid collisions with other functions.
 *
 *   -- The second section is specific to Argon2i. It encodes and decodes
 *   the parameters, salts and outputs. It does not compute the hash
 *   itself.
 *
 *   -- The third section is test code, with a main() function. With
 *   this section, the whole file compiles as a stand-alone program
 *   that exercises the encoding and decoding functions with some
 *   test vectors.
 *
 * The code was originally written by Thomas Pornin,
 * to whom comments and remarks may be sent. It is released under what
 * should amount to Public Domain or its closest equivalent; the
 * following mantra is supposed to incarnate that fact with all the
 * proper legal rituals:
 *
 * ---------------------------------------------------------------------
 * This file is provided under the terms of Creative Commons CC0 1.0
 * Public Domain Dedication. To the extent possible under law, the
 * author (Thomas Pornin) has waived all copyright and related or
 * neighboring rights to this file. This work is published from: Canada.
 * ---------------------------------------------------------------------
 *
 * Copyright (c) 2015 Thomas Pornin
 */

#include <limits.h>   /* ULONG_MAX */
#include <stdio.h>    /* fprintf, printf, sprintf */
#include <stdlib.h>   /* exit, EXIT_FAILURE */
#include <string.h>   /* memcpy, strcmp, strlen, strncmp */

/* ==================================================================== */
/*
 * Common code; could be shared between different hash functions.
 *
 * Note: the Base64 functions below assume that uppercase letters (resp.
 * lowercase letters) have consecutive numerical codes, that fit on 8
 * bits. All modern systems use ASCII-compatible charsets, where these
 * properties are true. If you are stuck with a dinosaur of a system
 * that still defaults to EBCDIC then you already have much bigger
 * interoperability issues to deal with.
 */

/*
 * Some macros for constant-time comparisons. These work over values in
 * the 0..255 range. Returned value is 0x00 on "false", 0xFF on "true".
 */
#define EQ(x, y) ((((-((unsigned)(x) ^ (unsigned)(y))) >> 8) & 0xFF) ^ 0xFF)
#define GT(x, y) ((((unsigned)(y) - (unsigned)(x)) >> 8) & 0xFF)
#define GE(x, y) (GT(y, x) ^ 0xFF)
#define LT(x, y) GT(y, x)
#define LE(x, y) GE(y, x)

/*
 * Convert value x (0..63) to corresponding Base64 character.
 */
static int
b64_byte_to_char(unsigned x)
{
    return (LT(x, 26) & (x + 'A'))
        | (GE(x, 26) & LT(x, 52) & (x + ('a' - 26)))
        | (GE(x, 52) & LT(x, 62) & (x + ('0' - 52)))
        | (EQ(x, 62) & '+') | (EQ(x, 63) & '/');
}

/*
 * Convert character c to the corresponding 6-bit value. If character c
 * is not a Base64 character, then 0xFF (255) is returned.
 */
static unsigned
b64_char_to_byte(int c)
{
    unsigned x;

    x = (GE(c, 'A') & LE(c, 'Z') & (c - 'A'))
        | (GE(c, 'a') & LE(c, 'z') & (c - ('a' - 26)))
        | (GE(c, '0') & LE(c, '9') & (c - ('0' - 52)))
        | (EQ(c, '+') & 62) | (EQ(c, '/') & 63);
    return x | (EQ(x, 0) & (EQ(c, 'A') ^ 0xFF));
}

/*
 * Convert some bytes to Base64. 'dst_len' is the length (in characters)
 * of the output buffer 'dst'; if that buffer is not large enough to
 * receive the result (including the terminating 0), then (size_t)-1
 * is returned.
 * Otherwise, the zero-terminated Base64 string is written
 * in the buffer, and the output length (counted WITHOUT the terminating
 * zero) is returned.
 */
static size_t
to_base64(char *dst, size_t dst_len, const void *src, size_t src_len)
{
    size_t olen;
    const unsigned char *buf;
    unsigned acc, acc_len;

    olen = (src_len / 3) << 2;
    switch (src_len % 3) {
    case 2:
        olen ++;
        /* fall through */
    case 1:
        olen += 2;
        break;
    }
    if (dst_len <= olen) {
        return (size_t)-1;
    }
    acc = 0;
    acc_len = 0;
    buf = (const unsigned char *)src;
    while (src_len -- > 0) {
        acc = (acc << 8) + (*buf ++);
        acc_len += 8;
        while (acc_len >= 6) {
            acc_len -= 6;
            *dst ++ = b64_byte_to_char((acc >> acc_len) & 0x3F);
        }
    }
    if (acc_len > 0) {
        *dst ++ = b64_byte_to_char((acc << (6 - acc_len)) & 0x3F);
    }
    *dst ++ = 0;
    return olen;
}

/*
 * Decode Base64 chars into bytes. The '*dst_len' value must initially
 * contain the length of the output buffer '*dst'; when the decoding
 * ends, the actual number of decoded bytes is written back in
 * '*dst_len'.
 *
 * Decoding stops when a non-Base64 character is encountered, or when
 * the output buffer capacity is exceeded. If an error occurred (output
 * buffer is too small, invalid last characters leading to unprocessed
 * buffered bits), then NULL is returned; otherwise, the returned value
 * points to the first non-Base64 character in the source stream, which
 * may be the terminating zero.
 */
static const char *
from_base64(void *dst, size_t *dst_len, const char *src)
{
    size_t len;
    unsigned char *buf;
    unsigned acc, acc_len;

    buf = (unsigned char *)dst;
    len = 0;
    acc = 0;
    acc_len = 0;
    for (;;) {
        unsigned d;

        d = b64_char_to_byte(*src);
        if (d == 0xFF) {
            break;
        }
        src ++;
        acc = (acc << 6) + d;
        acc_len += 6;
        if (acc_len >= 8) {
            acc_len -= 8;
            if ((len ++) >= *dst_len) {
                return NULL;
            }
            *buf ++ = (acc >> acc_len) & 0xFF;
        }
    }

    /*
     * If the input length is equal to 1 modulo 4 (which is
     * invalid), then there will remain 6 unprocessed bits;
     * otherwise, only 0, 2 or 4 bits are buffered. The buffered
     * bits must also all be zero.
     */
    if (acc_len > 4 || (acc & (((unsigned)1 << acc_len) - 1)) != 0) {
        return NULL;
    }
    *dst_len = len;
    return src;
}

/*
 * Decode decimal integer from 'str'; the value is written in '*v'.
 * Returned value is a pointer to the next non-decimal character in the
 * string. If there is no digit at all, or the value encoding is not
 * minimal (extra leading zeros), or the value does not fit in an
 * 'unsigned long', then NULL is returned.
 */
static const char *
decode_decimal(const char *str, unsigned long *v)
{
    const char *orig;
    unsigned long acc;

    acc = 0;
    for (orig = str;; str ++) {
        int c;

        c = *str;
        if (c < '0' || c > '9') {
            break;
        }
        c -= '0';
        if (acc > (ULONG_MAX / 10)) {
            return NULL;
        }
        acc *= 10;
        if ((unsigned long)c > (ULONG_MAX - acc)) {
            return NULL;
        }
        acc += (unsigned long)c;
    }
    if (str == orig || (*orig == '0' && str != (orig + 1))) {
        return NULL;
    }
    *v = acc;
    return str;
}

/* ==================================================================== */
/*
 * Code specific to Argon2i.
 *
 * The code below applies the following format:
 *
 *  $argon2i$m=<num>,t=<num>,p=<num>[,keyid=<bin>][,data=<bin>][$<bin>[$<bin>]]
 *
 * where <num> is a decimal integer (positive, fits in an 'unsigned long')
 * and <bin> is Base64-encoded data (no '=' padding characters, no newline
 * or whitespace). The "keyid" is a binary identifier for a key (up to 8
 * bytes); "data" is associated data (up to 32 bytes). When the 'keyid'
 * (resp. the 'data') is empty, then it is omitted from the output.
 *
 * The last two binary chunks (encoded in Base64) are, in that order,
 * the salt and the output. Both are optional, but you cannot have an
 * output without a salt. The binary salt length is between 8 and 48 bytes.
 * The binary output length is between 12 and 64 bytes (these are the
 * bounds enforced by the decoder below).
 */

/*
 * A structure containing the values that get encoded into Argon2i hash
 * strings.
 *
 * key_id_len is 0 if the string contains no key ID.
 * associated_data_len is 0 if the string contains no associated data.
 * salt_len is 0 if the string contains no salt (parameter-only string).
 * output_len is 0 if the string contains no output (a salt string, with
 * parameters and salt but no output).
 */
typedef struct {
    unsigned long m;
    unsigned long t;
    unsigned long p;
    unsigned char key_id[8];
    size_t key_id_len;
    unsigned char associated_data[32];
    size_t associated_data_len;
    unsigned char salt[48];
    size_t salt_len;
    unsigned char output[64];
    size_t output_len;
} argon2i_params;

/*
 * Decode an Argon2i hash string into the provided structure 'pp'.
 * Returned value is 1 on success, 0 on error.
 */
int
argon2i_decode_string(argon2i_params *pp, const char *str)
{
#define CC(prefix) do { \
        size_t cc_len = strlen(prefix); \
        if (strncmp(str, prefix, cc_len) != 0) { \
            return 0; \
        } \
        str += cc_len; \
    } while (0)

#define CC_opt(prefix, code) do { \
        size_t cc_len = strlen(prefix); \
        if (strncmp(str, prefix, cc_len) == 0) { \
            str += cc_len; \
            { code; } \
        } \
    } while (0)

#define DECIMAL(x) do { \
        unsigned long dec_x; \
        str = decode_decimal(str, &dec_x); \
        if (str == NULL) { \
            return 0; \
        } \
        (x) = dec_x; \
    } while (0)

#define BIN(buf, max_len, len) do { \
        size_t bin_len = (max_len); \
        str = from_base64(buf, &bin_len, str); \
        if (str == NULL) { \
            return 0; \
        } \
        (len) = bin_len; \
    } while (0)

    pp->key_id_len = 0;
    pp->associated_data_len = 0;
    pp->salt_len = 0;
    pp->output_len = 0;
    CC("$argon2i");
    CC("$m=");
    DECIMAL(pp->m);
    CC(",t=");
    DECIMAL(pp->t);
    CC(",p=");
    DECIMAL(pp->p);

    /*
     * Both m and t must be no more than 2^32-1.
     * The tests below
     * use a shift by 30 bits to avoid a direct comparison with
     * 0xFFFFFFFF, which may trigger a spurious compiler warning
     * on machines where 'unsigned long' is a 32-bit type.
     */
    if (pp->m < 1 || (pp->m >> 30) > 3) {
        return 0;
    }
    if (pp->t < 1 || (pp->t >> 30) > 3) {
        return 0;
    }

    /*
     * The parallelism p must be between 1 and 255. The memory cost
     * parameter, expressed in kilobytes, must be at least 8 times
     * the value of p.
     */
    if (pp->p < 1 || pp->p > 255) {
        return 0;
    }
    if (pp->m < (pp->p << 3)) {
        return 0;
    }

    CC_opt(",keyid=", BIN(pp->key_id, sizeof pp->key_id, pp->key_id_len));
    CC_opt(",data=", BIN(pp->associated_data, sizeof pp->associated_data,
        pp->associated_data_len));
    if (*str == 0) {
        return 1;
    }
    CC("$");
    BIN(pp->salt, sizeof pp->salt, pp->salt_len);
    if (pp->salt_len < 8) {
        return 0;
    }
    if (*str == 0) {
        return 1;
    }
    CC("$");
    BIN(pp->output, sizeof pp->output, pp->output_len);
    if (pp->output_len < 12) {
        return 0;
    }
    return *str == 0;

#undef CC
#undef CC_opt
#undef DECIMAL
#undef BIN
}

/*
 * Encode an Argon2i hash string into the provided buffer. 'dst_len'
 * contains the size, in characters, of the 'dst' buffer; if 'dst_len'
 * is less than the number of required characters (including the
 * terminating 0), then this function returns 0.
 *
 * If pp->output_len is 0, then the hash string will be a salt string
 * (no output). If pp->salt_len is also 0, then the string will be a
 * parameter-only string (no salt and no output).
 *
 * On success, 1 is returned.
 */
int
argon2i_encode_string(char *dst, size_t dst_len, const argon2i_params *pp)
{
/* Append a literal string, failing if it does not fit. */
#define SS(str) do { \
        size_t pp_len = strlen(str); \
        if (pp_len >= dst_len) { \
            return 0; \
        } \
        memcpy(dst, str, pp_len + 1); \
        dst += pp_len; \
        dst_len -= pp_len; \
    } while (0)

/* Append an integer in decimal. */
#define SX(x) do { \
        char tmp[30]; \
        sprintf(tmp, "%lu", (unsigned long)(x)); \
        SS(tmp); \
    } while (0)

/* Append a binary value in Base64. */
#define SB(buf, len) do { \
        size_t sb_len = to_base64(dst, dst_len, buf, len); \
        if (sb_len == (size_t)-1) { \
            return 0; \
        } \
        dst += sb_len; \
        dst_len -= sb_len; \
    } while (0)

    SS("$argon2i$m=");
    SX(pp->m);
    SS(",t=");
    SX(pp->t);
    SS(",p=");
    SX(pp->p);
    if (pp->key_id_len > 0) {
        SS(",keyid=");
        SB(pp->key_id, pp->key_id_len);
    }
    if (pp->associated_data_len > 0) {
        SS(",data=");
        SB(pp->associated_data, pp->associated_data_len);
    }
    if (pp->salt_len == 0) {
        return 1;
    }
    SS("$");
    SB(pp->salt, pp->salt_len);
    if (pp->output_len == 0) {
        return 1;
    }
    SS("$");
    SB(pp->output, pp->output_len);
    return 1;

#undef SS
#undef SX
#undef SB
}

/* ==================================================================== */
/*
 * Test code.
 */

static const char *KAT_GOOD[] = {
    "$argon2i$m=120,t=5000,p=2",
    "$argon2i$m=120,t=4294967295,p=2",
    "$argon2i$m=2040,t=5000,p=255",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQ",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQA",
    "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc",
    "$argon2i$m=120,t=5000,p=2$/LtFjH5rVL8",
    "$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI",
    "$argon2i$m=120,t=5000,p=2$BwUgJHHQaynE+a4nZrYRzOllGSjjxuxNXxyNRUtI6Dlw/zlbt6PzOL8Onfqs6TcG",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0$4fXXG0spB92WPB1NitT8/OH0VKI",
    "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI",
    "$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
    "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$iHSDPHzUhPzK7rCcJgOFfg$EkCWX6pSTqWruiR0",
    "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$iHSDPHzUhPzK7rCcJgOFfg$J4moa2MM0/6uf3HbY2Tf5Fux8JIBTwIhmhxGRbsY14qhTltQt+Vw3b7tcJNEbk8ium8AQfZeD4tabCnNqfkD1g",
    NULL
};

static const char *KAT_BAD[] = {
    /* bad function name */
    "$argon2j$m=120,t=5000,p=2",

    /* missing parameter 'm' */
"$argon2i$t=5000,p=2", 489 | 490 | /* missing parameter 't' */ 491 | "$argon2i$m=120,p=2", 492 | 493 | /* missing parameter 'p' */ 494 | "$argon2i$m=120,t=5000", 495 | 496 | /* value of 'm' is too small (lower than 8*p) */ 497 | "$argon2i$m=15,t=5000,p=2", 498 | 499 | /* value of 't' is invalid */ 500 | "$argon2i$m=120,t=0,p=2", 501 | 502 | /* value of 'p' is invalid (too small) */ 503 | "$argon2i$m=120,t=5000,p=0", 504 | 505 | /* value of 'p' is invalid (too large) */ 506 | "$argon2i$m=2000,t=5000,p=256", 507 | 508 | /* value of 'm' has non-minimal encoding */ 509 | "$argon2i$m=0120,t=5000,p=2", 510 | 511 | /* value of 't' has non-minimal encoding */ 512 | "$argon2i$m=120,t=05000,p=2", 513 | 514 | /* value of 'p' has non-minimal encoding */ 515 | "$argon2i$m=120,t=5000,p=02", 516 | 517 | /* value of 't' exceeds 2^32-1 */ 518 | "$argon2i$m=120,t=4294967296,p=2", 519 | 520 | /* invalid Base64 for keyid (length = 9 characters) */ 521 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0Z", 522 | 523 | /* invalid Base64 for keyid (unprocessed bits are not 0) */ 524 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZR", 525 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQB", 526 | 527 | /* invalid keyid (too large) */ 528 | "$argon2i$m=120,t=5000,p=2,keyid=Mwmcv5/avkXJ", 529 | 530 | /* invalid associated data (too large) */ 531 | "$argon2i$m=120,t=5000,p=2,data=Vrai0ME0m7lorfxfOCG3+6we5N89+2hXwkbv0C5SECab", 532 | 533 | /* invalid salt (too small) */ 534 | "$argon2i$m=120,t=5000,p=2$+yPbRi6hdw", 535 | 536 | /* invalid salt (too large) */ 537 | "$argon2i$m=120,t=5000,p=2$SIZzzPhYC/CXOf64vWG/IZjO/amlRgvKscaRCYwdg9R1boFN/NjaC1VdXdcOtFx+0A", 538 | 539 | /* invalid output (too small) */ 540 | "$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI$iHSDPHzUhPzK7rCcJgOFfg$c+jbgTK0PT0eCMI", 541 | 542 | /* invalid output (too large) */ 543 | 
"$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI$iHSDPHzUhPzK7rCcJgOFfg$KtTPhiUlDb98psIiNxUSZ8GYVEm1CsfEaLJrppBe5poD2/sQOUu5mmowSiQUbH+ZK3PjFdY3KUuf83bT5XqTZy0", 544 | 545 | NULL 546 | }; 547 | 548 | int 549 | main(void) 550 | { 551 | const char **s; 552 | 553 | for (s = KAT_GOOD; *s; s ++) { 554 | const char *str; 555 | argon2i_params pp; 556 | char tmp[300]; 557 | size_t len; 558 | 559 | str = *s; 560 | if (!argon2i_decode_string(&pp, str)) { 561 | fprintf(stderr, "Failed to decode: %s\n", str); 562 | exit(EXIT_FAILURE); 563 | } 564 | if (!argon2i_encode_string(tmp, sizeof tmp, &pp)) { 565 | fprintf(stderr, "Failed to encode back: %s\n", str); 566 | exit(EXIT_FAILURE); 567 | } 568 | if (strcmp(str, tmp) != 0) { 569 | fprintf(stderr, "Decode/encode difference:\n"); 570 | fprintf(stderr, " in: %s\n", str); 571 | fprintf(stderr, " out: %s\n", tmp); 572 | } 573 | len = strlen(str); 574 | if (!argon2i_encode_string(tmp, len + 1, &pp)) { 575 | fprintf(stderr, "Encode failure (1): %s\n", str); 576 | exit(EXIT_FAILURE); 577 | } 578 | if (argon2i_encode_string(tmp, len, &pp)) { 579 | fprintf(stderr, "Encode failure (2): %s\n", str); 580 | exit(EXIT_FAILURE); 581 | } 582 | } 583 | 584 | for (s = KAT_BAD; *s; s ++) { 585 | const char *str; 586 | argon2i_params pp; 587 | 588 | str = *s; 589 | if (argon2i_decode_string(&pp, str)) { 590 | fprintf(stderr, "Decoded invalid string: %s\n", str); 591 | exit(EXIT_FAILURE); 592 | } 593 | } 594 | 595 | printf("All tests OK\n"); 596 | return 0; 597 | } 598 | -------------------------------------------------------------------------------- /phc-sf-spec.md: -------------------------------------------------------------------------------- 1 | # PHC string format 2 | 3 | ## Example 4 | 5 | Given the following inputs: 6 | 7 | * Password: `hunter2` 8 | * Salt: ```\x81\x98\x95\xFC\xCD`=\xCD\xB6\x12P\a\xFC\x98u\x1F``` 9 | * Secret: `pepper` 10 | * Variant: `argon2id` 11 | * Version: `19` 12 | * Time cost: `2` 13 | * Memory cost: `65536` 14 
* Parallelism cost: `1`

Argon2 will generate the following digest:

`$argon2id$v=19$m=65536,t=2,p=1$gZiV/M1gPc22ElAH/Jh1Hw$CWOrkoo7oJBQ/iyh7uJ0LO2aLEfrHwTWllSAxT0zRno`

## Specification

This document specifies string encodings for the output of a password
hashing function. Three kinds of strings are defined:

- Parameter string: identifies the function and contains values for
  its parameters.
- Salt string: a parameter string that also specifies the salt value.
- Hash string: a salt string that also specifies the hash output.

The specification calls for deterministic encoding: for a given
function, set of parameters, salt value and output, producers MUST
output the exact unique sequence of characters prescribed in this
documentation. This allows testing with regard to explicit test
vectors, and promotes interoperability by discouraging local variants.
Consumers may accept other encodings, but are also allowed to reject any
string that differs from the format herein described.


We define the following format:

    $<id>[$v=<version>][$<param>=<value>(,<param>=<value>)*][$<salt>[$<hash>]]

where:

- `<id>` is the symbolic name for the function
- `<version>` is the algorithm version
- `<param>` is a parameter name
- `<value>` is a parameter value
- `<salt>` is an encoding of the salt
- `<hash>` is an encoding of the hash output

The string is then the concatenation, in that order, of:

- a `$` sign;
- the function symbolic name;
- optionally, a `$` sign followed by the algorithm version with a `v=version` format;
- optionally, a `$` sign followed by one or several parameters, each
  with a `name=value` format; the parameters are separated by commas;
- optionally, a `$` sign followed by the (encoded) salt value;
- optionally, a `$` sign followed by the (encoded) hash output (the
  hash output may be present only if the salt is present).
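
As an illustration of the layout listed above, here is a minimal sketch in C that splits a string into its `$`-separated fields. The names `phc_span`, `phc_fields` and `phc_split` are hypothetical: they are not defined by this specification and do not appear in `phc-sf-parse.c`. The sketch leans on two rules stated elsewhere in this document, namely that the version field starts with `v=` (no parameter may be named `v`) and that `=` may appear only in the version field or the parameter list. It also assumes that empty fields are invalid. It only locates the fields; it does not check character sets or decode anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A field located inside the input string. ptr is NULL when the
 * (optional) field is absent. Hypothetical type, for illustration. */
typedef struct {
    const char *ptr;
    size_t len;
} phc_span;

typedef struct {
    phc_span id, version, params, salt, hash;
} phc_fields;

/* Read one '$'-prefixed segment at *pp and advance *pp to the next '$'
 * or to the terminating zero. Returns 0 if *pp is not at a '$' or if
 * the segment is empty (assumption: empty fields are rejected). */
static int
next_segment(const char **pp, phc_span *seg)
{
    const char *p = *pp;

    if (*p != '$') {
        return 0;
    }
    p ++;
    seg->ptr = p;
    seg->len = strcspn(p, "$");
    *pp = p + seg->len;
    return seg->len > 0;
}

/* Split 's' into fields. Returns 1 on success, 0 on a malformed layout. */
int
phc_split(const char *s, phc_fields *out)
{
    const phc_fields empty = { { NULL, 0 } };
    phc_span seg;

    assert(s != NULL && out != NULL);
    *out = empty;

    if (!next_segment(&s, &out->id)) {
        return 0;
    }
    if (*s == 0) {
        return 1;
    }
    if (!next_segment(&s, &seg)) {
        return 0;
    }

    /* Optional version: a segment of the form "v=<version>". */
    if (seg.len >= 2 && seg.ptr[0] == 'v' && seg.ptr[1] == '=') {
        if (seg.len == 2) {
            return 0;
        }
        out->version.ptr = seg.ptr + 2;
        out->version.len = seg.len - 2;
        if (*s == 0) {
            return 1;
        }
        if (!next_segment(&s, &seg)) {
            return 0;
        }
    }

    /* Optional parameter list: the only other place '=' may occur. */
    if (memchr(seg.ptr, '=', seg.len) != NULL) {
        out->params = seg;
        if (*s == 0) {
            return 1;
        }
        if (!next_segment(&s, &seg)) {
            return 0;
        }
    }

    /* Whatever remains is the salt, optionally followed by the hash. */
    out->salt = seg;
    if (*s == 0) {
        return 1;
    }
    if (!next_segment(&s, &out->hash)) {
        return 0;
    }
    return *s == 0;
}
```

Applied to the example string from the top of this document, `phc_split()` would report `argon2id` as the identifier, `19` as the version, `m=65536,t=2,p=1` as the parameter list, and the last two segments as salt and hash. The spans point into the caller's string, so no allocation is needed.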

The function symbolic name is a sequence of characters in: `[a-z0-9-]`
(lowercase letters, digits, and the minus sign). No other character is
allowed. Each function defines its own identifier (or identifiers in
case of a function family); identifiers should be explicit (human
readable, not a single digit), with a length of about 5 to 10
characters. An identifier name MUST NOT exceed 32 characters in length.

The value for the version shall be a sequence of characters in: `[0-9]`.

Each parameter name shall be a sequence of characters in: `[a-z0-9-]`
(lowercase letters, digits, and the minus sign). No other character is
allowed. Parameter names SHOULD be readable for a human user. A
parameter name MUST NOT exceed 32 characters in length. A parameter
name MUST NOT be equal to the string `v` (to avoid confusion with the
version field).

The value for each parameter consists of characters in:
`[a-zA-Z0-9/+.-]` (lowercase letters, uppercase letters, digits, `/`,
`+`, `.` and `-`). No other character is allowed. Interpretation of the
value depends on the parameter and the function. The function
specification MUST unambiguously define the set of valid parameter
values. The function specification MUST define a maximum length (in
characters) for each parameter. For numerical parameters, functions
SHOULD use plain decimal encoding (other encodings are possible as long
as they are clearly defined).

The function specification MUST define a clear, unambiguous,
deterministic encoding for each possible value of a parameter. Producers
of strings MUST follow that encoding. Consumers MAY accept alternate
encodings.

A version may be optional; if the version is optional, then the
function MUST define the default version to use.
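
The character-set and length rules above for identifiers, versions, parameter names and parameter values can be sketched in C as follows. This is only an illustration: the function names (`phc_valid_id` and friends) and the `PHC_MAX_NAME_LEN` macro are hypothetical, and rejecting empty identifiers, names and versions is an assumption, since the text above only says "a sequence of characters". Per-parameter maximum value lengths are defined by each function and are therefore not checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PHC_MAX_NAME_LEN 32   /* limit for identifiers and parameter names */

/* Explicit alphabets avoid any dependency on the C locale. */
static const char PHC_NAME_CHARS[] =
    "abcdefghijklmnopqrstuvwxyz0123456789-";
static const char PHC_VALUE_CHARS[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/+.-";
static const char PHC_DIGITS[] = "0123456789";

/* Return 1 if all 'len' characters of 's' belong to 'set'. */
static int
all_in_set(const char *s, size_t len, const char *set)
{
    size_t i, set_len;

    assert(s != NULL && set != NULL);
    set_len = strlen(set);
    for (i = 0; i < len; i ++) {
        if (memchr(set, s[i], set_len) == NULL) {
            return 0;
        }
    }
    return 1;
}

/* Function symbolic name: [a-z0-9-], 1 to 32 characters. */
int
phc_valid_id(const char *s, size_t len)
{
    return len >= 1 && len <= PHC_MAX_NAME_LEN
        && all_in_set(s, len, PHC_NAME_CHARS);
}

/* Parameter name: same rules as an identifier, but never "v". */
int
phc_valid_param_name(const char *s, size_t len)
{
    if (len == 1 && s[0] == 'v') {
        return 0;
    }
    return phc_valid_id(s, len);
}

/* Version: digits only (non-empty by assumption). */
int
phc_valid_version(const char *s, size_t len)
{
    return len >= 1 && all_in_set(s, len, PHC_DIGITS);
}

/* Parameter value: [a-zA-Z0-9/+.-]; length limits are per-function. */
int
phc_valid_value(const char *s, size_t len)
{
    return all_in_set(s, len, PHC_VALUE_CHARS);
}
```

A consumer could run these checks on each field before handing values to the function-specific decoder, which is still responsible for interpreting the value and enforcing its own range and length limits.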

A parameter may be optional; if a parameter is optional, then the
function MUST define the default value of the parameter. That default
value MUST NOT be subject to context-dependent alterations (e.g. a value
configurable in a system-wide setting is not an acceptable default).
When a parameter is optional, producers MUST omit the parameter if its
value is equal to the default value. The function MUST specify which
parameters are optional and which are not.

The function MUST specify the order in which parameters may appear.
Producers MUST NOT allow parameters to appear in any other order.

If the function expects no parameter at all, or all parameters are
optional and their value happens to match the default, then the complete
list, including its starting `$` sign, is omitted. Note that the `=`
sign may appear within the complete string only as part of a list of
parameters.

The salt consists of a sequence of characters in: `[a-zA-Z0-9/+.-]`
(lowercase letters, uppercase letters, digits, `/`, `+`, `.` and `-`).
The function specification MUST define the set of valid salt values and
a maximum length for this field. Functions that work over arbitrary
binary salts SHOULD define that field to be the B64 encoding for a
binary value whose length falls in a defined range or set of ranges.

The hash output, if present (in a "hash string"), MUST be the B64
encoding of the raw output of the hash function. The function
specification MUST define the minimum, maximum and default output
length.


### B64

The B64 encoding is the standard Base64 encoding (RFC 4648, section 4)
except that the padding `=` signs are omitted, and extra characters
(whitespace) are not allowed:

- Input is split into successive groups of bytes.
Each group, except 134 | possibly the last one, contains exactly three bytes. 135 | 136 | - For a group of bytes b0, b1 and b2, compute the following value: 137 | 138 | x = (b0 << 16) + (b1 << 8) + b2 139 | 140 | Then split `x` into four 6-bit values `y0`, `y1`, `y2` and `y3` 141 | such that: 142 | 143 | x = (y0 << 18) + (y1 << 12) + (y2 << 6) + y3 144 | 145 | - Each 6-bit value is encoded into a character in the `[A-Za-z0-9+/]` 146 | alphabet, in that order: 147 | * `A`..`Z` = 0 to 25 148 | * `a`..`z` = 26 to 51 149 | * `0`..`9` = 52 to 61 150 | * `+` = 62 151 | * `/` = 63 152 | 153 | - If the last group does not contain exactly three bytes, then: 154 | 155 | 1. The group is completed with one or two bytes of value 0x00, 156 | then processed as above. 157 | 2. The resulting sequence of characters is truncated to its 158 | first two characters (if the group initially contained a single 159 | byte) or to its first three characters (if the group initially 160 | contained two bytes). 161 | 162 | A B64-encoded value thus yields a string whose length, taken modulo 4, 163 | can be equal to 0, 2 or 3, but not to 1. Take note that a sequence of 164 | characters of the right length may still be an invalid encoding if it 165 | defines some non-zero trailing bits in the last incomplete group; 166 | producers MUST set the trailing bits to 0, while consumers MAY ignore 167 | them, or MAY reject such invalid encodings. 168 | 169 | 170 | ### Decimal Encoding 171 | 172 | For an integer value _x_, its decimal encoding consists of the following: 173 | 174 | - If _x_ < 0, then its decimal encoding is the minus sign `-` followed 175 | by the decimal encoding of -_x_. 176 | - If _x_ = 0, then its decimal encoding is the single character `0`. 177 | - If _x_ > 0, then its decimal encoding is the shortest sequence of 178 | ASCII digits that matches its value (i.e. there is no leading zero).
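
The three cases above can be sketched in C as a small encoder, together with a strict checker that accepts exactly the strings such an encoder can produce (ignoring range limits). This is an illustrative sketch only: the names `phc_encode_decimal` and `phc_is_canonical_decimal` are hypothetical, the `long` type is an arbitrary choice for the example, and the digit comparisons assume ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the decimal encoding of x into out (NUL-terminated).
 * Returns the number of characters written (excluding the NUL),
 * or 0 if out_len is too small. */
static size_t phc_encode_decimal(long x, char *out, size_t out_len)
{
    char tmp[32];  /* ample for a 64-bit long: 19 digits plus a sign */
    size_t n = 0, i;
    /* Magnitude computed in unsigned arithmetic so that the most
     * negative value does not overflow. */
    unsigned long m = (x < 0) ? 0UL - (unsigned long)x : (unsigned long)x;

    assert(out != NULL);
    /* Digits are produced least significant first; the do/while form
     * yields the single character "0" when x is 0. */
    do {
        tmp[n++] = (char)('0' + (int)(m % 10));
        m /= 10;
    } while (m != 0);
    if (x < 0) {
        tmp[n++] = '-';
    }
    if (n + 1 > out_len) {
        return 0;
    }
    for (i = 0; i < n; i++) {
        out[i] = tmp[n - 1 - i];
    }
    out[n] = '\0';
    return n;
}

/* Returns 1 if s is a canonical decimal encoding: an optional '-',
 * then digits with no leading zero; "0" is valid, "-0" is not. */
static int phc_is_canonical_decimal(const char *s)
{
    size_t i = 0;

    assert(s != NULL);
    if (s[0] == '-') {
        i = 1;
        if (s[1] == '0') {
            return 0;  /* "-0" and "-0..." are never produced */
        }
    }
    if (s[i] == '\0') {
        return 0;      /* empty string, or a lone '-' */
    }
    if (s[i] == '0' && s[i + 1] != '\0') {
        return 0;      /* leading zero */
    }
    for (; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9') {
            return 0;
        }
    }
    return 1;
}
```

A consumer that wants strictness could run such a checker before converting the value, since general-purpose conversion routines tend to be more lenient about signs, whitespace and leading zeros.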
179 | 180 | Thus, a string is a valid decimal encoding of an integer _x_ if and only if all of 181 | the following hold true: 182 | 183 | - The first character is either a `-` sign, or an ASCII digit. 184 | - All characters other than the first are ASCII digits. 185 | - If the first character is a `-` sign, then there is at least one more 186 | character, and the second character is not a `0`. 187 | - If the string consists of more than one character, then the first 188 | one cannot be a `0`. 189 | 190 | The C functions `strtol()` and `strtoul()` can decode decimal values if 191 | their `base` parameter is set to 10. Note that they also accept leading whitespace, an optional `+` sign and leading zeros, so a consumer that wants to reject non-canonical encodings must check the rules above separately. 192 | 193 | 194 | ### Function Duties 195 | 196 | A password hashing function that uses this specification for its salt 197 | and hash strings MUST specify the following: 198 | 199 | - The function symbolic name. 200 | 201 | - The unique order in which parameters may appear. 202 | 203 | - For each parameter: 204 | * the parameter name; 205 | * the set or range of acceptable values for the parameter; 206 | * the deterministic encoding of the parameter; 207 | * the maximum size (in characters) of the encoded parameter value; 208 | * whether the parameter is optional, and, if yes, its default 209 | value when not encoded. 210 | 211 | - The set of valid salt values, in particular minimum and maximum 212 | length (in characters, and in bytes when applicable). 213 | 214 | - The minimum, maximum and default output lengths (in bytes, and in 215 | characters after encoding). 216 | 217 | 218 | It is RECOMMENDED to follow these guidelines: 219 | 220 | - The function name, and the parameter names, should promote 221 | readability. (Note that readability depends a lot on who is doing 222 | the reading, and there is no universal definition of that property.) 223 | 224 | - Making parameters optional means that human readers must know what 225 | value a parameter has when it has been omitted. Parameters for 226 | optional features (e.g.
some explicit "additional data") are most 227 | naturally made optional; other parameters such as the number of 228 | iterations are best kept specified explicitly. 229 | 230 | - Maximum lengths for salt, output and parameter values are meant to 231 | help consumer implementations, in particular those written in C and using 232 | stack-allocated buffers. These buffers must account for the worst 233 | case, i.e. the maximum defined length. Therefore, keep these lengths 234 | low. 235 | 236 | - The role of salts is to achieve uniqueness. A _random_ salt is fine 237 | for that as long as its length is sufficient; a 16-byte salt would 238 | work well (by definition, UUIDs are very good salts, and they encode 239 | over exactly 16 bytes). 16 bytes encode as 22 characters in B64. 240 | Functions should disallow salt values that are too small for 241 | security (4 bytes should be viewed as an absolute minimum). 242 | 243 | - The hash output, for verification, must be long enough to make 244 | preimage attacks at least as hard as password guessing. To promote 245 | wide acceptance, a default output size of 256 bits (32 bytes, 246 | encoded as 43 characters) is recommended. Function implementations 247 | SHOULD NOT allow outputs of less than 80 bits to be used for 248 | password verification. 249 | 250 | 251 | ## API 252 | 253 | The traditional Unix crypt() function is used both for password 254 | registration, and for password verification. It uses two string 255 | parameters: 256 | 257 | char *crypt(const char *key, const char *salt); 258 | 259 | The `key` is the password, while `salt` is a salt string or a hash 260 | string. In order to be compatible with how the crypt() function is 261 | used in existing software, the following must hold: 262 | 263 | - If `salt` is a salt string (no output), then the function must 264 | compute a hash output whose length is the default output length for 265 | that function.
The returned string MUST be the strict, deterministic 266 | encoding of the used parameters, salt and output. 267 | 268 | - If `salt` is a parameter string (no salt nor output), then the 269 | function must generate a new appropriate salt value as mandated by 270 | the function specification (e.g. using the defined default salt 271 | length), and then proceed as in the previous case. The returned 272 | string MUST be the strict, deterministic encoding of the used 273 | parameters, salt and output. 274 | 275 | - If `salt` is a hash string, then the function must compute an output 276 | with exactly the same length as the one provided in the input. The 277 | output is then the concatenation of the parameters and salt _as they 278 | were received_, and the newly computed output. Basically, the 279 | function truncates the `salt` string at its last `$` sign, then 280 | appends the recomputed output. 281 | 282 | The third case departs from the prescription that string producers must 283 | always follow the deterministic encoding. This is done that way in order 284 | to support the common case of password verification: the `salt` value is 285 | the complete hash string as it is stored; the hash is recomputed, and 286 | the caller verifies that the exact same string is obtained (e.g. with a 287 | `strcmp()` call). This is the reason why the parameters and salt are 288 | reused "as is" in the output, even if they do not match the 289 | deterministic encoding prescribed in this document. 290 | 291 | On the other hand, when the input `salt` string does not include the 292 | hash output, then this is initial registration, and we insist on using 293 | the unique valid deterministic encoding. The whole point is to try to 294 | avoid local variations that are detrimental to interoperability, while 295 | not breaking existing password hashes. 
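
The truncate-and-append behaviour described for the third case can be sketched in C as follows. This is only an illustration under stated assumptions: the `phc_*` names are hypothetical and are not taken from `phc-sf-parse.c`, the caller is assumed to already know that the stored string is a hash string, and the recomputed output is assumed to be supplied already B64-encoded. The sketch also derives the output length in bytes from the number of B64 characters, which is the length the recomputation has to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the part to reuse "as is": everything up to and including
 * the last '$'. Returns 0 if the string contains no '$' at all. */
static size_t phc_prefix_len(const char *hash_string)
{
    const char *last;

    assert(hash_string != NULL);
    last = strrchr(hash_string, '$');
    return (last == NULL) ? 0 : (size_t)(last - hash_string) + 1;
}

/* Number of bytes encoded by nchars unpadded B64 characters.
 * Returns (size_t)-1 when nchars is 1 modulo 4, which no valid
 * encoding can produce. */
static size_t phc_b64_decoded_len(size_t nchars)
{
    size_t rem = nchars % 4;

    if (rem == 1) {
        return (size_t)-1;
    }
    return (nchars / 4) * 3 + (rem == 0 ? 0 : rem - 1);
}

/* Builds the returned string for the verification case: the stored
 * parameters and salt exactly as received, followed by the newly
 * computed output (already B64-encoded by the caller).
 * Returns 0 on success, -1 if there is no '$' or out is too small. */
static int phc_rebuild(const char *stored, const char *output_b64,
                       char *out, size_t out_len)
{
    size_t plen, olen;

    assert(stored != NULL && output_b64 != NULL && out != NULL);
    plen = phc_prefix_len(stored);
    olen = strlen(output_b64);
    if (plen == 0 || plen + olen + 1 > out_len) {
        return -1;
    }
    memcpy(out, stored, plen);
    memcpy(out + plen, output_b64, olen + 1);  /* copies the NUL too */
    return 0;
}
```

For a 43-character output field, `phc_b64_decoded_len(43)` gives 32 bytes, which matches the recommended default output size. The caller would then compare the rebuilt string with the stored one to complete the verification.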
296 | 297 | 298 | ## Argon2 Encoding 299 | 300 | For Argon2, the following is specified: 301 | 302 | - The identifier for Argon2d is `argon2d`. 303 | 304 | - The identifier for Argon2i is `argon2i`. 305 | 306 | - The identifier for Argon2id is `argon2id`. 307 | 308 | - The versions are: [16, 19]. 309 | 310 | - The parameters are: 311 | 312 | * `m`: Memory size, expressed in kilobytes, between 1 and (2^32)-1. 313 | Value is an integer in decimal, over 1 to 10 digits. 314 | 315 | * `t`: Number of iterations, between 1 and (2^32)-1. 316 | Value is an integer in decimal, over 1 to 10 digits. 317 | 318 | * `p`: Degree of parallelism, between 1 and 255. 319 | Value is an integer in decimal, over 1 to 3 digits. 320 | 321 | * `keyid`: Binary identifier for a key. Value is a sequence of 0 322 | to 8 bytes, encoded in B64 as 0 to 11 characters. This parameter 323 | is optional; the default value is the empty sequence (no byte at 324 | all) and its meaning is that no key is to be used. The contents of 325 | the identifier are chosen by the application and are meant to 326 | allow the application to locate the key to use. 327 | 328 | * `data`: Associated data. Value is a sequence of 0 to 32 bytes, 329 | encoded in B64 as 0 to 43 characters. This parameter is optional; 330 | the default value is the empty sequence (no byte at all). The 331 | associated data is an extra, non-secret value that is included in the 332 | Argon2 input. 333 | 334 | The parameters shall appear in the `m,t,p,keyid,data` order. 335 | The `keyid` and `data` parameters are optional; the three others 336 | are NOT optional. 337 | 338 | - The salt value is encoded in B64. The length in bytes of the 339 | salt is between 8 and 48 bytes(*), thus yielding a length in 340 | characters between 11 and 64 characters (and that length is never 341 | equal to 1 modulo 4). The default byte length of the salt is 16 342 | bytes (22 characters in B64 encoding).
An encoded UUID, or a 343 | sequence of 16 bytes produced with a cryptographically strong 344 | PRNG, are appropriate salt values. 345 | 346 | ((*) the Argon2 specification states that the salt can be much 347 | longer, up to 2^32-1 bytes, but this makes little sense for 348 | password hashing. Specifying a relatively small maximum length 349 | allows for parsing with a stack allocated buffer.) 350 | 351 | - The hash output is encoded in B64. Its length shall be between 352 | 12 and 64 bytes (16 and 86 characters, respectively). The default 353 | output length is 32 bytes (43 characters). 354 | --------------------------------------------------------------------------------