├── LICENSE
├── README.md
├── phc-sf-parse.c
└── phc-sf-spec.md
/LICENSE:
--------------------------------------------------------------------------------
1 | CC0 1.0 Universal
2 |
3 | Statement of Purpose
4 |
5 | The laws of most jurisdictions throughout the world automatically confer
6 | exclusive Copyright and Related Rights (defined below) upon the creator and
7 | subsequent owner(s) (each and all, an "owner") of an original work of
8 | authorship and/or a database (each, a "Work").
9 |
10 | Certain owners wish to permanently relinquish those rights to a Work for the
11 | purpose of contributing to a commons of creative, cultural and scientific
12 | works ("Commons") that the public can reliably and without fear of later
13 | claims of infringement build upon, modify, incorporate in other works, reuse
14 | and redistribute as freely as possible in any form whatsoever and for any
15 | purposes, including without limitation commercial purposes. These owners may
16 | contribute to the Commons to promote the ideal of a free culture and the
17 | further production of creative, cultural and scientific works, or to gain
18 | reputation or greater distribution for their Work in part through the use and
19 | efforts of others.
20 |
21 | For these and/or other purposes and motivations, and without any expectation
22 | of additional consideration or compensation, the person associating CC0 with a
23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright
24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work
25 | and publicly distribute the Work under its terms, with knowledge of his or her
26 | Copyright and Related Rights in the Work and the meaning and intended legal
27 | effect of CC0 on those rights.
28 |
29 | 1. Copyright and Related Rights. A Work made available under CC0 may be
30 | protected by copyright and related or neighboring rights ("Copyright and
31 | Related Rights"). Copyright and Related Rights include, but are not limited
32 | to, the following:
33 |
34 | i. the right to reproduce, adapt, distribute, perform, display, communicate,
35 | and translate a Work;
36 |
37 | ii. moral rights retained by the original author(s) and/or performer(s);
38 |
39 | iii. publicity and privacy rights pertaining to a person's image or likeness
40 | depicted in a Work;
41 |
42 | iv. rights protecting against unfair competition in regards to a Work,
43 | subject to the limitations in paragraph 4(a), below;
44 |
45 | v. rights protecting the extraction, dissemination, use and reuse of data in
46 | a Work;
47 |
48 | vi. database rights (such as those arising under Directive 96/9/EC of the
49 | European Parliament and of the Council of 11 March 1996 on the legal
50 | protection of databases, and under any national implementation thereof,
51 | including any amended or successor version of such directive); and
52 |
53 | vii. other similar, equivalent or corresponding rights throughout the world
54 | based on applicable law or treaty, and any national implementations thereof.
55 |
56 | 2. Waiver. To the greatest extent permitted by, but not in contravention of,
57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and
58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright
59 | and Related Rights and associated claims and causes of action, whether now
60 | known or unknown (including existing as well as future claims and causes of
61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum
62 | duration provided by applicable law or treaty (including future time
63 | extensions), (iii) in any current or future medium and for any number of
64 | copies, and (iv) for any purpose whatsoever, including without limitation
65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes
66 | the Waiver for the benefit of each member of the public at large and to the
67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver
68 | shall not be subject to revocation, rescission, cancellation, termination, or
69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work
70 | by the public as contemplated by Affirmer's express Statement of Purpose.
71 |
72 | 3. Public License Fallback. Should any part of the Waiver for any reason be
73 | judged legally invalid or ineffective under applicable law, then the Waiver
74 | shall be preserved to the maximum extent permitted taking into account
75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver
76 | is so judged Affirmer hereby grants to each affected person a royalty-free,
77 | non transferable, non sublicensable, non exclusive, irrevocable and
78 | unconditional license to exercise Affirmer's Copyright and Related Rights in
79 | the Work (i) in all territories worldwide, (ii) for the maximum duration
80 | provided by applicable law or treaty (including future time extensions), (iii)
81 | in any current or future medium and for any number of copies, and (iv) for any
82 | purpose whatsoever, including without limitation commercial, advertising or
83 | promotional purposes (the "License"). The License shall be deemed effective as
84 | of the date CC0 was applied by Affirmer to the Work. Should any part of the
85 | License for any reason be judged legally invalid or ineffective under
86 | applicable law, such partial invalidity or ineffectiveness shall not
87 | invalidate the remainder of the License, and in such case Affirmer hereby
88 | affirms that he or she will not (i) exercise any of his or her remaining
89 | Copyright and Related Rights in the Work or (ii) assert any associated claims
90 | and causes of action with respect to the Work, in either case contrary to
91 | Affirmer's express Statement of Purpose.
92 |
93 | 4. Limitations and Disclaimers.
94 |
95 | a. No trademark or patent rights held by Affirmer are waived, abandoned,
96 | surrendered, licensed or otherwise affected by this document.
97 |
98 | b. Affirmer offers the Work as-is and makes no representations or warranties
99 | of any kind concerning the Work, express, implied, statutory or otherwise,
100 | including without limitation warranties of title, merchantability, fitness
101 | for a particular purpose, non infringement, or the absence of latent or
102 | other defects, accuracy, or the present or absence of errors, whether or not
103 | discoverable, all to the greatest extent permissible under applicable law.
104 |
105 | c. Affirmer disclaims responsibility for clearing rights of other persons
106 | that may apply to the Work or any use thereof, including without limitation
107 | any person's Copyright and Related Rights in the Work. Further, Affirmer
108 | disclaims responsibility for obtaining any necessary consents, permissions
109 | or other rights required for any use of the Work.
110 |
111 | d. Affirmer understands and acknowledges that Creative Commons is not a
112 | party to this document and has no duty or obligation with respect to this
113 | CC0 or use of the Work.
114 |
115 | For more information, please see
116 |
117 |
118 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # PHC string format
3 |
4 | Specification of
5 |
6 | * An extended crypt(3) encoding format to support additional parameters
7 | of password hashing functions.
8 |
9 | * An encoding for the PHC winner Argon2.
10 |
--------------------------------------------------------------------------------
/phc-sf-parse.c:
--------------------------------------------------------------------------------
1 | /*
2 | * Example code for a decoder and encoder of "hash strings", with Argon2i
3 | * parameters.
4 | *
5 | * This code comprises three sections:
6 | *
7 | * -- The first section contains generic Base64 encoding and decoding
8 | * functions. It is conceptually applicable to any hash function
9 | * implementation that uses Base64 to encode and decode parameters,
10 | * salts and outputs. It could be made into a library, provided that
11 | * the relevant functions are made public (non-static) and be given
12 | * reasonable names to avoid collisions with other functions.
13 | *
14 | * -- The second section is specific to Argon2i. It encodes and decodes
15 | * the parameters, salts and outputs. It does not compute the hash
16 | * itself.
17 | *
18 | * -- The third section is test code, with a main() function. With
19 | * this section, the whole file compiles as a stand-alone program
20 | * that exercises the encoding and decoding functions with some
21 | * test vectors.
22 | *
23 | * The code was originally written by Thomas Pornin,
24 | * to whom comments and remarks may be sent. It is released under what
25 | * should amount to Public Domain or its closest equivalent; the
26 | * following mantra is supposed to incarnate that fact with all the
27 | * proper legal rituals:
28 | *
29 | * ---------------------------------------------------------------------
30 | * This file is provided under the terms of Creative Commons CC0 1.0
31 | * Public Domain Dedication. To the extent possible under law, the
32 | * author (Thomas Pornin) has waived all copyright and related or
33 | * neighboring rights to this file. This work is published from: Canada.
34 | * ---------------------------------------------------------------------
35 | *
36 | * Copyright (c) 2015 Thomas Pornin
37 | */
38 |
39 | #include <stdio.h>
40 | #include <stdlib.h>
41 | #include <string.h>
42 | #include <limits.h>
43 |
44 | /* ==================================================================== */
45 | /*
46 | * Common code; could be shared between different hash functions.
47 | *
48 | * Note: the Base64 functions below assume that uppercase letters (resp.
49 | * lowercase letters) have consecutive numerical codes, that fit on 8
50 | * bits. All modern systems use ASCII-compatible charsets, where these
51 | * properties are true. If you are stuck with a dinosaur of a system
52 | * that still defaults to EBCDIC then you already have much bigger
53 | * interoperability issues to deal with.
54 | */
55 |
56 | /*
57 | * Some macros for constant-time comparisons. These work over values in
58 | * the 0..255 range. Returned value is 0x00 on "false", 0xFF on "true".
59 | */
60 | #define EQ(x, y) ((((-((unsigned)(x) ^ (unsigned)(y))) >> 8) & 0xFF) ^ 0xFF)
61 | #define GT(x, y) ((((unsigned)(y) - (unsigned)(x)) >> 8) & 0xFF)
62 | #define GE(x, y) (GT(y, x) ^ 0xFF)
63 | #define LT(x, y) GT(y, x)
64 | #define LE(x, y) GE(y, x)
65 |
66 | /*
67 | * Convert value x (0..63) to corresponding Base64 character.
68 | */
69 | static int
70 | b64_byte_to_char(unsigned x)
71 | {
72 | return (LT(x, 26) & (x + 'A'))
73 | | (GE(x, 26) & LT(x, 52) & (x + ('a' - 26)))
74 | | (GE(x, 52) & LT(x, 62) & (x + ('0' - 52)))
75 | | (EQ(x, 62) & '+') | (EQ(x, 63) & '/');
76 | }
77 |
78 | /*
79 | * Convert character c to the corresponding 6-bit value. If character c
80 | * is not a Base64 character, then 0xFF (255) is returned.
81 | */
82 | static unsigned
83 | b64_char_to_byte(int c)
84 | {
85 | unsigned x;
86 |
87 | x = (GE(c, 'A') & LE(c, 'Z') & (c - 'A'))
88 | | (GE(c, 'a') & LE(c, 'z') & (c - ('a' - 26)))
89 | | (GE(c, '0') & LE(c, '9') & (c - ('0' - 52)))
90 | | (EQ(c, '+') & 62) | (EQ(c, '/') & 63);
91 | return x | (EQ(x, 0) & (EQ(c, 'A') ^ 0xFF));
92 | }
93 |
94 | /*
95 | * Convert some bytes to Base64. 'dst_len' is the length (in characters)
96 | * of the output buffer 'dst'; if that buffer is not large enough to
97 | * receive the result (including the terminating 0), then (size_t)-1
98 | * is returned. Otherwise, the zero-terminated Base64 string is written
99 | * in the buffer, and the output length (counted WITHOUT the terminating
100 | * zero) is returned.
101 | */
102 | static size_t
103 | to_base64(char *dst, size_t dst_len, const void *src, size_t src_len)
104 | {
105 | size_t olen;
106 | const unsigned char *buf;
107 | unsigned acc, acc_len;
108 |
109 | olen = (src_len / 3) << 2;
110 | switch (src_len % 3) {
111 | case 2:
112 | olen ++;
113 | /* fall through */
114 | case 1:
115 | olen += 2;
116 | break;
117 | }
118 | if (dst_len <= olen) {
119 | return (size_t)-1;
120 | }
121 | acc = 0;
122 | acc_len = 0;
123 | buf = (const unsigned char *)src;
124 | while (src_len -- > 0) {
125 | acc = (acc << 8) + (*buf ++);
126 | acc_len += 8;
127 | while (acc_len >= 6) {
128 | acc_len -= 6;
129 | *dst ++ = b64_byte_to_char((acc >> acc_len) & 0x3F);
130 | }
131 | }
132 | if (acc_len > 0) {
133 | *dst ++ = b64_byte_to_char((acc << (6 - acc_len)) & 0x3F);
134 | }
135 | *dst ++ = 0;
136 | return olen;
137 | }
138 |
139 | /*
140 | * Decode Base64 chars into bytes. The '*dst_len' value must initially
141 | * contain the length of the output buffer '*dst'; when the decoding
142 | * ends, the actual number of decoded bytes is written back in
143 | * '*dst_len'.
144 | *
145 | * Decoding stops when a non-Base64 character is encountered, or when
146 | * the output buffer capacity is exceeded. If an error occurred (output
147 | * buffer is too small, invalid last characters leading to unprocessed
148 | * buffered bits), then NULL is returned; otherwise, the returned value
149 | * points to the first non-Base64 character in the source stream, which
150 | * may be the terminating zero.
151 | */
152 | static const char *
153 | from_base64(void *dst, size_t *dst_len, const char *src)
154 | {
155 | size_t len;
156 | unsigned char *buf;
157 | unsigned acc, acc_len;
158 |
159 | buf = (unsigned char *)dst;
160 | len = 0;
161 | acc = 0;
162 | acc_len = 0;
163 | for (;;) {
164 | unsigned d;
165 |
166 | d = b64_char_to_byte(*src);
167 | if (d == 0xFF) {
168 | break;
169 | }
170 | src ++;
171 | acc = (acc << 6) + d;
172 | acc_len += 6;
173 | if (acc_len >= 8) {
174 | acc_len -= 8;
175 | if ((len ++) >= *dst_len) {
176 | return NULL;
177 | }
178 | *buf ++ = (acc >> acc_len) & 0xFF;
179 | }
180 | }
181 |
182 | /*
183 | * If the input length is equal to 1 modulo 4 (which is
184 | * invalid), then there will remain 6 unprocessed bits;
185 | * otherwise, only 0, 2 or 4 bits are buffered. The buffered
186 | * bits must also all be zero.
187 | */
188 | if (acc_len > 4 || (acc & (((unsigned)1 << acc_len) - 1)) != 0) {
189 | return NULL;
190 | }
191 | *dst_len = len;
192 | return src;
193 | }
194 |
195 | /*
196 | * Decode decimal integer from 'str'; the value is written in '*v'.
197 | * Returned value is a pointer to the next non-decimal character in the
198 | * string. If there is no digit at all, or the value encoding is not
199 | * minimal (extra leading zeros), or the value does not fit in an
200 | * 'unsigned long', then NULL is returned.
201 | */
202 | static const char *
203 | decode_decimal(const char *str, unsigned long *v)
204 | {
205 | const char *orig;
206 | unsigned long acc;
207 |
208 | orig = str;
209 | acc = 0;
210 | for (orig = str;; str ++) {
211 | int c;
212 |
213 | c = *str;
214 | if (c < '0' || c > '9') {
215 | break;
216 | }
217 | c -= '0';
218 | if (acc > (ULONG_MAX / 10)) {
219 | return NULL;
220 | }
221 | acc *= 10;
222 | if ((unsigned long)c > (ULONG_MAX - acc)) {
223 | return NULL;
224 | }
225 | acc += (unsigned long)c;
226 | }
227 | if (str == orig || (*orig == '0' && str != (orig + 1))) {
228 | return NULL;
229 | }
230 | *v = acc;
231 | return str;
232 | }
233 |
234 | /* ==================================================================== */
235 | /*
236 | * Code specific to Argon2i.
237 | *
238 | * The code below applies the following format:
239 | *
240 | * $argon2i$m=<num>,t=<num>,p=<num>[,keyid=<bin>][,data=<bin>][$<bin>[$<bin>]]
241 | *
242 | * where <num> is a decimal integer (positive, fits in an 'unsigned long')
243 | * and <bin> is Base64-encoded data (no '=' padding characters, no newline
244 | * or whitespace). The "keyid" is a binary identifier for a key (up to 8
245 | * bytes); "data" is associated data (up to 32 bytes). When the 'keyid'
246 | * (resp. the 'data') is empty, then it is omitted from the output.
247 | *
248 | * The last two binary chunks (encoded in Base64) are, in that order,
249 | * the salt and the output. Both are optional, but you cannot have an
250 | * output without a salt. The binary salt length is between 8 and 48 bytes.
251 | * The output length is between 12 and 64 bytes.
252 | */
253 |
254 | /*
255 | * A structure containing the values that get encoded into Argon2i hash
256 | * strings.
257 | *
258 | * key_id_len is 0 if the string contains no key ID.
259 | * associated_data_len is 0 if the string contains no associated data.
260 | * salt_len is 0 if the string contains no salt (parameter-only string).
261 | * output_len is 0 if the string contains no output (a salt string, with
262 | * parameters and salt but no output).
263 | */
264 | typedef struct {
265 | unsigned long m;
266 | unsigned long t;
267 | unsigned long p;
268 | unsigned char key_id[8];
269 | size_t key_id_len;
270 | unsigned char associated_data[32];
271 | size_t associated_data_len;
272 | unsigned char salt[48];
273 | size_t salt_len;
274 | unsigned char output[64];
275 | size_t output_len;
276 | } argon2i_params;
277 |
278 | /*
279 | * Decode an Argon2i hash string into the provided structure 'pp'.
280 | * Returned value is 1 on success, 0 on error.
281 | */
282 | int
283 | argon2i_decode_string(argon2i_params *pp, const char *str)
284 | {
285 | #define CC(prefix) do { \
286 | size_t cc_len = strlen(prefix); \
287 | if (strncmp(str, prefix, cc_len) != 0) { \
288 | return 0; \
289 | } \
290 | str += cc_len; \
291 | } while (0)
292 |
293 | #define CC_opt(prefix, code) do { \
294 | size_t cc_len = strlen(prefix); \
295 | if (strncmp(str, prefix, cc_len) == 0) { \
296 | str += cc_len; \
297 | { code; } \
298 | } \
299 | } while (0)
300 |
301 | #define DECIMAL(x) do { \
302 | unsigned long dec_x; \
303 | str = decode_decimal(str, &dec_x); \
304 | if (str == NULL) { \
305 | return 0; \
306 | } \
307 | (x) = dec_x; \
308 | } while (0)
309 |
310 | #define BIN(buf, max_len, len) do { \
311 | size_t bin_len = (max_len); \
312 | str = from_base64(buf, &bin_len, str); \
313 | if (str == NULL) { \
314 | return 0; \
315 | } \
316 | (len) = bin_len; \
317 | } while (0)
318 |
319 | pp->key_id_len = 0;
320 | pp->associated_data_len = 0;
321 | pp->salt_len = 0;
322 | pp->output_len = 0;
323 | CC("$argon2i");
324 | CC("$m=");
325 | DECIMAL(pp->m);
326 | CC(",t=");
327 | DECIMAL(pp->t);
328 | CC(",p=");
329 | DECIMAL(pp->p);
330 |
331 | /*
332 | * Both m and t must be no more than 2^32-1. The tests below
333 | * use a shift by 30 bits to avoid a direct comparison with
334 | * 0xFFFFFFFF, which may trigger a spurious compiler warning
335 | * on machines where 'unsigned long' is a 32-bit type.
336 | */
337 | if (pp->m < 1 || (pp->m >> 30) > 3) {
338 | return 0;
339 | }
340 | if (pp->t < 1 || (pp->t >> 30) > 3) {
341 | return 0;
342 | }
343 |
344 | /*
345 | * The parallelism p must be between 1 and 255. The memory cost
346 | * parameter, expressed in kilobytes, must be at least 8 times
347 | * the value of p.
348 | */
349 | if (pp->p < 1 || pp->p > 255) {
350 | return 0;
351 | }
352 | if (pp->m < (pp->p << 3)) {
353 | return 0;
354 | }
355 |
356 | CC_opt(",keyid=", BIN(pp->key_id, sizeof pp->key_id, pp->key_id_len));
357 | CC_opt(",data=", BIN(pp->associated_data, sizeof pp->associated_data,
358 | pp->associated_data_len));
359 | if (*str == 0) {
360 | return 1;
361 | }
362 | CC("$");
363 | BIN(pp->salt, sizeof pp->salt, pp->salt_len);
364 | if (pp->salt_len < 8) {
365 | return 0;
366 | }
367 | if (*str == 0) {
368 | return 1;
369 | }
370 | CC("$");
371 | BIN(pp->output, sizeof pp->output, pp->output_len);
372 | if (pp->output_len < 12) {
373 | return 0;
374 | }
375 | return *str == 0;
376 |
377 | #undef CC
378 | #undef CC_opt
379 | #undef DECIMAL
380 | #undef BIN
381 | }
382 |
383 | /*
384 | * Encode an Argon2i hash string into the provided buffer. 'dst_len'
385 | * contains the size, in characters, of the 'dst' buffer; if 'dst_len'
386 | * is less than the number of required characters (including the
387 | * terminating 0), then this function returns 0.
388 | *
389 | * If pp->output_len is 0, then the hash string will be a salt string
390 | * (no output). If pp->salt_len is also 0, then the string will be a
391 | * parameter-only string (no salt and no output).
392 | *
393 | * On success, 1 is returned.
394 | */
395 | int
396 | argon2i_encode_string(char *dst, size_t dst_len, const argon2i_params *pp)
397 | {
398 | #define SS(str) do { \
399 | size_t pp_len = strlen(str); \
400 | if (pp_len >= dst_len) { \
401 | return 0; \
402 | } \
403 | memcpy(dst, str, pp_len + 1); \
404 | dst += pp_len; \
405 | dst_len -= pp_len; \
406 | } while (0)
407 |
408 | #define SX(x) do { \
409 | char tmp[30]; \
410 | sprintf(tmp, "%lu", (unsigned long)(x)); \
411 | SS(tmp); \
412 | } while (0)
413 |
414 | #define SB(buf, len) do { \
415 | size_t sb_len = to_base64(dst, dst_len, buf, len); \
416 | if (sb_len == (size_t)-1) { \
417 | return 0; \
418 | } \
419 | dst += sb_len; \
420 | dst_len -= sb_len; \
421 | } while (0)
422 |
423 | SS("$argon2i$m=");
424 | SX(pp->m);
425 | SS(",t=");
426 | SX(pp->t);
427 | SS(",p=");
428 | SX(pp->p);
429 | if (pp->key_id_len > 0) {
430 | SS(",keyid=");
431 | SB(pp->key_id, pp->key_id_len);
432 | }
433 | if (pp->associated_data_len > 0) {
434 | SS(",data=");
435 | SB(pp->associated_data, pp->associated_data_len);
436 | }
437 | if (pp->salt_len == 0) {
438 | return 1;
439 | }
440 | SS("$");
441 | SB(pp->salt, pp->salt_len);
442 | if (pp->output_len == 0) {
443 | return 1;
444 | }
445 | SS("$");
446 | SB(pp->output, pp->output_len);
447 | return 1;
448 |
449 | #undef SS
450 | #undef SX
451 | #undef SB
452 | }
453 |
454 | /* ==================================================================== */
455 | /*
456 | * Test code.
457 | */
458 |
459 | static const char *KAT_GOOD[] = {
460 | "$argon2i$m=120,t=5000,p=2",
461 | "$argon2i$m=120,t=4294967295,p=2",
462 | "$argon2i$m=2040,t=5000,p=255",
463 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0",
464 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQ",
465 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQA",
466 | "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc",
467 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc",
468 | "$argon2i$m=120,t=5000,p=2$/LtFjH5rVL8",
469 | "$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI",
470 | "$argon2i$m=120,t=5000,p=2$BwUgJHHQaynE+a4nZrYRzOllGSjjxuxNXxyNRUtI6Dlw/zlbt6PzOL8Onfqs6TcG",
471 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0$4fXXG0spB92WPB1NitT8/OH0VKI",
472 | "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI",
473 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI",
474 | "$argon2i$m=120,t=5000,p=2$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
475 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
476 | "$argon2i$m=120,t=5000,p=2,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
477 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$4fXXG0spB92WPB1NitT8/OH0VKI$iPBVuORECm5biUsjq33hn9/7BKqy9aPWKhFfK2haEsM",
478 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$iHSDPHzUhPzK7rCcJgOFfg$EkCWX6pSTqWruiR0",
479 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0,data=sRlHhRmKUGzdOmXn01XmXygd5Kc$iHSDPHzUhPzK7rCcJgOFfg$J4moa2MM0/6uf3HbY2Tf5Fux8JIBTwIhmhxGRbsY14qhTltQt+Vw3b7tcJNEbk8ium8AQfZeD4tabCnNqfkD1g",
480 | NULL
481 | };
482 |
483 | static const char *KAT_BAD[] = {
484 | /* bad function name */
485 | "$argon2j$m=120,t=5000,p=2",
486 |
487 | /* missing parameter 'm' */
488 | "$argon2i$t=5000,p=2",
489 |
490 | /* missing parameter 't' */
491 | "$argon2i$m=120,p=2",
492 |
493 | /* missing parameter 'p' */
494 | "$argon2i$m=120,t=5000",
495 |
496 | /* value of 'm' is too small (lower than 8*p) */
497 | "$argon2i$m=15,t=5000,p=2",
498 |
499 | /* value of 't' is invalid */
500 | "$argon2i$m=120,t=0,p=2",
501 |
502 | /* value of 'p' is invalid (too small) */
503 | "$argon2i$m=120,t=5000,p=0",
504 |
505 | /* value of 'p' is invalid (too large) */
506 | "$argon2i$m=2000,t=5000,p=256",
507 |
508 | /* value of 'm' has non-minimal encoding */
509 | "$argon2i$m=0120,t=5000,p=2",
510 |
511 | /* value of 't' has non-minimal encoding */
512 | "$argon2i$m=120,t=05000,p=2",
513 |
514 | /* value of 'p' has non-minimal encoding */
515 | "$argon2i$m=120,t=5000,p=02",
516 |
517 | /* value of 't' exceeds 2^32-1 */
518 | "$argon2i$m=120,t=4294967296,p=2",
519 |
520 | /* invalid Base64 for keyid (length = 9 characters) */
521 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0Z",
522 |
523 | /* invalid Base64 for keyid (unprocessed bits are not 0) */
524 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZR",
525 | "$argon2i$m=120,t=5000,p=2,keyid=Hj5+dsK0ZQB",
526 |
527 | /* invalid keyid (too large) */
528 | "$argon2i$m=120,t=5000,p=2,keyid=Mwmcv5/avkXJ",
529 |
530 | /* invalid associated data (too large) */
531 | "$argon2i$m=120,t=5000,p=2,data=Vrai0ME0m7lorfxfOCG3+6we5N89+2hXwkbv0C5SECab",
532 |
533 | /* invalid salt (too small) */
534 | "$argon2i$m=120,t=5000,p=2$+yPbRi6hdw",
535 |
536 | /* invalid salt (too large) */
537 | "$argon2i$m=120,t=5000,p=2$SIZzzPhYC/CXOf64vWG/IZjO/amlRgvKscaRCYwdg9R1boFN/NjaC1VdXdcOtFx+0A",
538 |
539 | /* invalid output (too small) */
540 | "$argon2i$m=120,t=5000,p=2$iHSDPHzUhPzK7rCcJgOFfg$c+jbgTK0PT0eCMI",
541 |
542 | /* invalid output (too large) */
543 | "$argon2i$m=120,t=5000,p=2$iHSDPHzUhPzK7rCcJgOFfg$KtTPhiUlDb98psIiNxUSZ8GYVEm1CsfEaLJrppBe5poD2/sQOUu5mmowSiQUbH+ZK3PjFdY3KUuf83bT5XqTZy0",
544 |
545 | NULL
546 | };
547 |
548 | int
549 | main(void)
550 | {
551 | const char **s;
552 |
553 | for (s = KAT_GOOD; *s; s ++) {
554 | const char *str;
555 | argon2i_params pp;
556 | char tmp[300];
557 | size_t len;
558 |
559 | str = *s;
560 | if (!argon2i_decode_string(&pp, str)) {
561 | fprintf(stderr, "Failed to decode: %s\n", str);
562 | exit(EXIT_FAILURE);
563 | }
564 | if (!argon2i_encode_string(tmp, sizeof tmp, &pp)) {
565 | fprintf(stderr, "Failed to encode back: %s\n", str);
566 | exit(EXIT_FAILURE);
567 | }
568 | if (strcmp(str, tmp) != 0) {
569 | fprintf(stderr, "Decode/encode difference:\n");
570 | fprintf(stderr, " in: %s\n", str);
571 | fprintf(stderr, " out: %s\n", tmp);
572 | }
573 | len = strlen(str);
574 | if (!argon2i_encode_string(tmp, len + 1, &pp)) {
575 | fprintf(stderr, "Encode failure (1): %s\n", str);
576 | exit(EXIT_FAILURE);
577 | }
578 | if (argon2i_encode_string(tmp, len, &pp)) {
579 | fprintf(stderr, "Encode failure (2): %s\n", str);
580 | exit(EXIT_FAILURE);
581 | }
582 | }
583 |
584 | for (s = KAT_BAD; *s; s ++) {
585 | const char *str;
586 | argon2i_params pp;
587 |
588 | str = *s;
589 | if (argon2i_decode_string(&pp, str)) {
590 | fprintf(stderr, "Decoded invalid string: %s\n", str);
591 | exit(EXIT_FAILURE);
592 | }
593 | }
594 |
595 | printf("All tests OK\n");
596 | return 0;
597 | }
598 |
--------------------------------------------------------------------------------
/phc-sf-spec.md:
--------------------------------------------------------------------------------
1 | # PHC string format
2 |
3 | ## Example
4 |
5 | Given the following inputs:
6 |
7 | * Password: `hunter2`
8 | * Salt: ```\x81\x98\x95\xFC\xCD`=\xCD\xB6\x12P\a\xFC\x98u\x1F```
9 | * Secret: `pepper`
10 | * Variant: `argon2id`
11 | * Version: `19`
12 | * Time cost: `2`
13 | * Memory cost: `65536`
14 | * Parallelism cost: `1`
15 |
16 | Argon2 will generate the following digest:
17 |
18 | `$argon2id$v=19$m=65536,t=2,p=1$gZiV/M1gPc22ElAH/Jh1Hw$CWOrkoo7oJBQ/iyh7uJ0LO2aLEfrHwTWllSAxT0zRno`
19 |
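As an illustration, the `$`-separated fields of the digest above can be pulled apart with a few lines of C. This is only a minimal sketch, not part of the reference code: the helper name `phc_split` and its in-place splitting strategy are assumptions made for this example, and it performs no validation of the individual fields.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of phc-sf-parse.c): split a PHC string
 * in place on its '$' signs. The string must start with '$'. Pointers
 * to the fields are stored in 'fields', which has room for 'max_fields'
 * entries. Returned value is the number of fields, or 0 on error
 * (no leading '$', or more fields than 'max_fields').
 */
static size_t
phc_split(char *str, char **fields, size_t max_fields)
{
    size_t n = 0;

    if (str[0] != '$') {
        return 0;
    }
    while (str != NULL) {
        *str ++ = 0;               /* overwrite the '$' separator */
        if (n >= max_fields) {
            return 0;
        }
        fields[n ++] = str;
        str = strchr(str, '$');    /* NULL once the last field is reached */
    }
    return n;
}
```

Called on a writable copy of the digest above with room for five fields, the sketch yields `argon2id`, `v=19`, `m=65536,t=2,p=1`, the encoded salt and the encoded hash output, in that order.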
20 | ## Specification
21 |
22 | This document specifies string encodings for the output of a password
23 | hashing function. Three kinds of strings are defined:
24 |
25 | - Parameter string: identifies the function and contains values for
26 | its parameters.
27 | - Salt string: a parameter string that also specifies the salt value.
28 | - Hash string: a salt string that also specifies the hash output.
29 |
30 | The specification calls for deterministic encoding: for a given
31 | function, set of parameters, salt value and output, producers MUST
32 | output the exact unique sequence of characters prescribed in this
33 | document. This allows testing against explicit test
34 | vectors, and promotes interoperability by discouraging local variants.
35 | Consumers may accept other encodings, but are also allowed to reject any
36 | string that differs from the format herein described.
37 |
38 |
39 | We define the following format:
40 |
41 | $<id>[$v=<version>][$<param>=<value>(,<param>=<value>)*][$<salt>[$<hash>]]
42 |
43 | where:
44 |
45 | - `<id>` is the symbolic name for the function
46 | - `<version>` is the algorithm version
47 | - `<param>` is a parameter name
48 | - `<value>` is a parameter value
49 | - `<salt>` is an encoding of the salt
50 | - `<hash>` is an encoding of the hash output
51 |
52 | The string is then the concatenation, in that order, of:
53 |
54 | - a `$` sign;
55 | - the function symbolic name;
56 | - optionally, a `$` sign followed by the algorithm version with a `v=version` format;
57 | - optionally, a `$` sign followed by one or several parameters, each
58 | with a `name=value` format; the parameters are separated by commas;
59 | - optionally, a `$` sign followed by the (encoded) salt value;
60 | - optionally, a `$` sign followed by the (encoded) hash output (the
61 | hash output may be present only if the salt is present).
62 |
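The concatenation order listed above can be sketched in C as follows. This is an illustrative sketch under stated assumptions, not the reference encoder: the names `phc_append` and `phc_build` are hypothetical, a `NULL` argument stands for an absent optional field, and the fields are assumed to be already encoded and valid (no character-set checks are done here).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append 'prefix' then 's' at *dst, and advance
 * *dst and *dst_len. Returns 1 on success, 0 if the buffer is too small.
 */
static int
phc_append(char **dst, size_t *dst_len, const char *prefix, const char *s)
{
    int n = snprintf(*dst, *dst_len, "%s%s", prefix, s);

    if (n < 0 || (size_t)n >= *dst_len) {
        return 0;
    }
    *dst += n;
    *dst_len -= (size_t)n;
    return 1;
}

/*
 * Hypothetical builder: assemble a string from already-encoded fields.
 * 'version', 'params', 'salt' and 'hash' may be NULL when absent.
 * Returns 1 on success, 0 on error.
 */
static int
phc_build(char *dst, size_t dst_len, const char *id, const char *version,
    const char *params, const char *salt, const char *hash)
{
    if (hash != NULL && salt == NULL) {
        return 0;    /* the hash output requires the salt */
    }
    if (!phc_append(&dst, &dst_len, "$", id)) {
        return 0;
    }
    if (version != NULL && !phc_append(&dst, &dst_len, "$v=", version)) {
        return 0;
    }
    if (params != NULL && !phc_append(&dst, &dst_len, "$", params)) {
        return 0;
    }
    if (salt != NULL && !phc_append(&dst, &dst_len, "$", salt)) {
        return 0;
    }
    if (hash != NULL && !phc_append(&dst, &dst_len, "$", hash)) {
        return 0;
    }
    return 1;
}
```

Passing `"argon2id"`, `"19"`, `"m=65536,t=2,p=1"` and the encoded salt and output from the example section reproduces the example digest. Omitting the hash gives a salt string, and omitting both salt and hash gives a parameter string.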
63 | The function symbolic name is a sequence of characters in: `[a-z0-9-]`
64 | (lowercase letters, digits, and the minus sign). No other character is
65 | allowed. Each function defines its own identifier (or identifiers in
66 | case of a function family); identifiers should be explicit (human
67 | readable, not a single digit), with a length of about 5 to 10
68 | characters. An identifier name MUST NOT exceed 32 characters in length.
69 |
70 | The value for the version shall be a sequence of characters in: `[0-9]`.
71 |
72 | Each parameter name shall be a sequence of characters in: `[a-z0-9-]`
73 | (lowercase letters, digits, and the minus sign). No other character is
74 | allowed. Parameter names SHOULD be readable for a human user. A
75 | parameter name MUST NOT exceed 32 characters in length. A parameter
76 | name MUST NOT be equal to the string `v` (to avoid confusion with the
77 | version field).
78 |
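The naming rules for function identifiers and parameter names can be checked mechanically. The sketch below is one possible way to do so; the function names `phc_name_chars_ok` and `phc_param_name_ok` are hypothetical, and rejecting the empty string is an assumption of this example (the text above does not state a minimum length). Like the Base64 code in `phc-sf-parse.c`, it assumes an ASCII-compatible character set.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check: returns 1 if 'name' is 1 to 32 characters long
 * and uses only characters in [a-z0-9-], 0 otherwise. Rejecting the
 * empty string is an assumption of this sketch.
 */
static int
phc_name_chars_ok(const char *name)
{
    size_t len = strlen(name);
    size_t i;

    if (len < 1 || len > 32) {
        return 0;
    }
    for (i = 0; i < len; i ++) {
        char c = name[i];

        if (!((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
            || c == '-'))
        {
            return 0;
        }
    }
    return 1;
}

/*
 * Hypothetical check for parameter names: same rules as above, and the
 * name "v" is refused since it is reserved for the version field.
 */
static int
phc_param_name_ok(const char *name)
{
    return phc_name_chars_ok(name) && strcmp(name, "v") != 0;
}
```

With these rules, `argon2id` and `m` are accepted, while `Argon2` (uppercase letter), `key_id` (underscore) and the parameter name `v` are rejected.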
79 | The value for each parameter consists of characters in:
80 | `[a-zA-Z0-9/+.-]` (lowercase letters, uppercase letters, digits, `/`,
81 | `+`, `.` and `-`). No other character is allowed. Interpretation of the
82 | value depends on the parameter and the function. The function
83 | specification MUST unambiguously define the set of valid parameter
84 | values. The function specification MUST define a maximum length (in
85 | characters) for each parameter. For numerical parameters, functions
86 | SHOULD use plain decimal encoding (other encodings are possible as long
87 | as they are clearly defined).
88 |
89 | The function specification MUST define a clear, unambiguous,
90 | deterministic encoding for each possible value of a parameter. Producers
91 | of strings MUST follow that encoding. Consumers MAY accept alternate
92 | encodings.
93 |
94 | A version may be optional; if the version is optional, then the
95 | function MUST define the default version to use.
96 |
97 | A parameter may be optional; if a parameter is optional, then the
98 | function MUST define the default value of the parameter. That default
99 | value MUST NOT be subject to context-dependent alterations (e.g. a value
100 | configurable in a system-wide setting is not an acceptable default).
101 | When a parameter is optional, producers MUST omit the parameter if its
102 | value is equal to the default value. The function MUST specify which
103 | parameters are optional and which are not.
104 |
105 | The function MUST specify the order in which parameters may appear.
106 | Producers MUST NOT allow parameters to appear in any other order.
107 |
108 | If the function expects no parameter at all, or all parameters are
109 | optional and their value happens to match the default, then the complete
110 | list, including its starting `$` sign, is omitted. Note that the `=`
111 | sign may appear within the complete string only as part of a list of
112 | parameters.
113 |
114 | The salt consists of a sequence of characters in: `[a-zA-Z0-9/+.-]`
115 | (lowercase letters, uppercase letters, digits, `/`, `+`, `.` and `-`).
116 | The function specification MUST define the set of valid salt values and
117 | a maximum length for this field. Functions that work over arbitrary
118 | binary salts SHOULD define that field to be the B64 encoding for a
119 | binary value whose length falls in a defined range or set of ranges.
120 |
121 | The hash output, if present (in a "hash string"), MUST be the B64
122 | encoding of the raw output of the hash function. The function
123 | specification MUST define the minimum, maximum and default output
124 | length.
125 |
126 |
127 | ### B64
128 |
129 | The B64 encoding is the standard Base64 encoding (RFC 4648, section 4)
130 | except that the padding `=` signs are omitted, and extra characters
131 | (whitespace) are not allowed:
132 |
133 | - Input is split into successive groups of bytes. Each group, except
134 | possibly the last one, contains exactly three bytes.
135 |
136 | - For a group of bytes b0, b1 and b2, compute the following value:
137 |
138 | x = (b0 << 16) + (b1 << 8) + b2
139 |
140 | Then split `x` into four 6-bit values `y0`, `y1`, `y2` and `y3`
141 | such that:
142 |
143 | x = (y0 << 18) + (y1 << 12) + (y2 << 6) + y3
144 |
145 | - Each 6-bit value is encoded into a character in the `[A-Za-z0-9+/]`
146 | alphabet, in that order:
147 | * `A`..`Z` = 0 to 25
148 | * `a`..`z` = 26 to 51
149 | * `0`..`9` = 52 to 61
150 | * `+` = 62
151 | * `/` = 63
152 |
153 | - If the last group does not contain exactly three bytes, then:
154 |
155 | 1. The group is completed with one or two bytes of value 0x00,
156 | then processed as above.
157 | 2. The resulting sequence of characters is truncated to its
158 | first two characters (if the group initially contained a single
159 | byte) or to its first three characters (if the group initially
160 | contained two bytes).
161 |
162 | A B64-encoded value thus yields a string whose length, taken modulo 4,
163 | can be equal to 0, 2 or 3, but not to 1. Take note that a sequence of
164 | characters of the right length may still be an invalid encoding if it
165 | defines some non-zero trailing bits in the last incomplete group;
166 | producers MUST set the trailing bits to 0, while consumers MAY ignore
167 | them, or MAY reject such invalid encodings.
168 |
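The encoding steps above can be sketched in C as follows. This is an illustrative encoder only, not the reference implementation: the name `b64_encode` and its buffer-size convention are assumptions of this sketch. Because missing bytes of the last group are treated as 0x00, the trailing bits are always zero, as required of producers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes src[0..src_len-1] as unpadded B64 into
 * dst and NUL-terminates it. Returns the number of characters written
 * (excluding the NUL), or (size_t)-1 if dst_len is too small.
 * Very large src_len values (near SIZE_MAX) are not guarded against. */
size_t b64_encode(char *dst, size_t dst_len,
                  const uint8_t *src, size_t src_len)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t rem = src_len % 3;
    size_t out_len = (src_len / 3) * 4 + (rem == 1 ? 2 : rem == 2 ? 3 : 0);
    size_t o = 0;

    if (dst_len < out_len + 1)
        return (size_t)-1;

    for (size_t i = 0; i < src_len; i += 3) {
        size_t n = src_len - i;          /* bytes in this group: 1..3 */
        uint32_t x;

        if (n > 3)
            n = 3;
        /* x = (b0 << 16) + (b1 << 8) + b2, missing bytes count as 0x00 */
        x = (uint32_t)src[i] << 16;
        if (n > 1)
            x |= (uint32_t)src[i + 1] << 8;
        if (n > 2)
            x |= (uint32_t)src[i + 2];

        /* Emit y0 and y1 always; y2 and y3 only for fuller groups. */
        dst[o++] = alphabet[(x >> 18) & 0x3F];
        dst[o++] = alphabet[(x >> 12) & 0x3F];
        if (n > 1)
            dst[o++] = alphabet[(x >> 6) & 0x3F];
        if (n > 2)
            dst[o++] = alphabet[x & 0x3F];
    }
    dst[o] = '\0';
    return o;
}
```

With this sketch, the three bytes `foo` encode to `Zm9v`, the two bytes `fo` to `Zm8`, and the single byte `f` to `Zg`, i.e. the RFC 4648 test vectors with the `=` padding removed.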
169 |
170 | ### Decimal Encoding
171 |
172 | For an integer value _x_, its decimal encoding consists of the following:
173 |
174 | - If _x_ < 0, then its decimal encoding is the minus sign `-` followed
175 | by the decimal encoding of -_x_.
176 | - If _x_ = 0, then its decimal encoding is the single character `0`.
177 | - If _x_ > 0, then its decimal encoding is the smallest sequence of
178 | ASCII digits that matches its value (i.e. there is no leading zero).
179 |
180 | Thus, a string is a valid decimal encoding of an integer _x_ if and only
181 | if all of the following hold true:
182 |
183 | - The first character is either a `-` sign, or an ASCII digit.
184 | - All characters other than the first are ASCII digits.
185 | - If the first character is a `-` sign, then there is at least one other
186 | character, and the second character is not a `0`.
187 | - If the string consists of more than one character, then the first
188 | one cannot be a `0`.
189 |
190 | The C functions `strtol()` and `strtoul()` can decode decimal values if
191 | their `base` parameter is set to 10.
192 |
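Note that `strtol()` and `strtoul()` also accept leading whitespace, a `+` sign and leading zeros, so they decode values but do not by themselves check that a string is the unique encoding described above. A consumer that wants strict validation could run a check such as the following sketch first. The function name `decimal_encoding_ok` is hypothetical, and range checking against the parameter's allowed values is left to the caller.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if `s` satisfies the validity rules
 * listed above for a decimal encoding, 0 otherwise.
 * It does not check that the value fits in any integer type. */
int decimal_encoding_ok(const char *s)
{
    size_t i = 0;

    if (s[0] == '-') {
        /* A '-' must be followed by a non-zero digit ("-", "-0" and
         * "-05" are all rejected). */
        if (s[1] < '1' || s[1] > '9')
            return 0;
        i = 1;
    } else if (s[0] < '0' || s[0] > '9') {
        /* Also rejects the empty string, '+' and leading whitespace. */
        return 0;
    }

    /* A leading '0' is only allowed when the whole string is "0". */
    if (s[i] == '0' && s[i + 1] != '\0')
        return 0;

    /* All remaining characters must be ASCII digits. */
    for (; s[i] != '\0'; i++) {
        if (s[i] < '0' || s[i] > '9')
            return 0;
    }
    return 1;
}
```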
193 |
194 | ### Function Duties
195 |
196 | A password hashing function that uses this specification for its salt
197 | and hash strings MUST specify the following:
198 |
199 | - The function symbolic name.
200 |
201 | - The unique order in which parameters may appear.
202 |
203 | - For each parameter:
204 | * the parameter name;
205 | * the set or range of acceptable values for the parameter;
206 | * the deterministic encoding of the parameter;
207 | * the maximum size (in characters) of the encoded parameter value;
208 | * whether the parameter is optional, and, if yes, its default
209 | value when not encoded.
210 |
211 | - The set of valid salt values, in particular minimum and maximum
212 | length (in characters, and in bytes when applicable).
213 |
214 | - The minimum, maximum and default output lengths (in bytes, and in
215 | characters after encoding).
216 |
217 |
218 | It is RECOMMENDED to follow these guidelines:
219 |
220 | - The function name, and the parameter names, should promote
221 | readability. (Note that readability depends a lot on who is doing
222 | the reading, and there is no universal definition of that property.)
223 |
224 | - Making parameters optional means that human readers must know what
225 | value a parameter has when it has been omitted. Parameters for
226 | optional features (e.g. some explicit "additional data") are most
227 | naturally made optional; other parameters such as number of
228 | iterations are best kept specified explicitly.
229 |
230 | - Maximum lengths for salt, output and parameter values are meant to
231 | help consumer implementations, in particular those written in C and using
232 | stack-allocated buffers. These buffers must account for the worst
233 | case, i.e. the maximum defined length. Therefore, keep these lengths
234 | low.
235 |
236 | - The role of salts is to achieve uniqueness. A _random_ salt is fine
237 | for that as long as its length is sufficient; a 16-byte salt would
238 | work well (by definition, UUIDs are very good salts, and they encode
239 | over exactly 16 bytes). 16 bytes encode as 22 characters in B64.
240 | Functions should disallow salt values that are too small for
241 | security (4 bytes should be viewed as an absolute minimum).
242 |
243 | - The hash output, for a verification, must be long enough to make
244 | preimage attacks at least as hard as password guessing. To promote
245 | wide acceptance, a default output size of 256 bits (32 bytes,
246 | encoded as 43 characters) is recommended. Function implementations
247 | SHOULD NOT allow outputs of less than 80 bits to be used for
248 | password verification.
249 |
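The character counts quoted in these guidelines (22 characters for a 16-byte salt, 43 characters for a 32-byte output) follow directly from the B64 rules: each full group of three bytes yields four characters, and a final group of one or two bytes yields two or three characters. A small sketch, with the hypothetical name `b64_encoded_len`, that can be used to size buffers:

```c
#include <stddef.h>

/* Hypothetical helper: number of characters in the unpadded B64
 * encoding of `nbytes` bytes. */
size_t b64_encoded_len(size_t nbytes)
{
    size_t len = (nbytes / 3) * 4;   /* full three-byte groups */

    switch (nbytes % 3) {            /* trailing incomplete group */
    case 1: len += 2; break;
    case 2: len += 3; break;
    default: break;
    }
    return len;
}
```

A stack-allocated buffer for an encoded field then needs `b64_encoded_len(max_bytes) + 1` characters if it is to hold a terminating NUL as well.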
250 |
251 | ## API
252 |
253 | The traditional Unix crypt() function is used both for password
254 | registration, and for password verification. It uses two string
255 | parameters:
256 |
257 | char *crypt(const char *key, const char *salt);
258 |
259 | The `key` is the password, while `salt` is a salt string or a hash
260 | string. In order to be compatible with how the crypt() function is
261 | used in existing software, the following must hold:
262 |
263 | - If `salt` is a salt string (no output), then the function must
264 | compute a hash output whose length is the default output length for
265 | that function. The returned string MUST be the strict, deterministic
266 | encoding of the used parameters, salt and output.
267 |
268 | - If `salt` is a parameter string (no salt nor output), then the
269 | function must generate a new appropriate salt value as mandated by
270 | the function specification (e.g. using the defined default salt
271 | length), and then proceed as in the previous case. The returned
272 | string MUST be the strict, deterministic encoding of the used
273 | parameters, salt and output.
274 |
275 | - If `salt` is a hash string, then the function must compute an output
276 | with exactly the same length as the one provided in the input. The
277 | output is then the concatenation of the parameters and salt _as they
278 | were received_, and the newly computed output. Basically, the
279 | function truncates the `salt` string at its last `$` sign, then
280 | appends the recomputed output.
281 |
282 | The third case departs from the prescription that string producers must
283 | always follow the deterministic encoding. This is done in order
284 | to support the common case of password verification: the `salt` value is
285 | the complete hash string as it is stored; the hash is recomputed, and
286 | the caller verifies that the exact same string is obtained (e.g. with a
287 | `strcmp()` call). This is the reason why the parameters and salt are
288 | reused "as is" in the output, even if they do not match the
289 | deterministic encoding prescribed in this document.
290 |
291 | On the other hand, when the input `salt` string does not include the
292 | hash output, then this is initial registration, and we insist on using
293 | the unique valid deterministic encoding. The whole point is to try to
294 | avoid local variations that are detrimental to interoperability, while
295 | not breaking existing password hashes.
296 |
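The third case (input is a complete hash string) amounts to keeping everything up to and including the last `$` sign unchanged and appending the recomputed output. The sketch below illustrates only that string handling: the name `rebuild_hash_string` is hypothetical, and the recomputed output is assumed to have been produced and B64-encoded elsewhere by the hashing function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `stored` up to and including its last
 * '$' into dst, then appends `new_output` (the recomputed hash output,
 * already B64-encoded) and a terminating NUL.
 * Returns 0 on success, -1 if `stored` contains no '$' or if dst is
 * too small. Parameters and salt are reused exactly as received. */
int rebuild_hash_string(char *dst, size_t dst_len,
                        const char *stored, const char *new_output)
{
    const char *last = strrchr(stored, '$');
    size_t prefix_len, out_len;

    if (last == NULL)
        return -1;
    prefix_len = (size_t)(last - stored) + 1;   /* keep the '$' itself */
    out_len = strlen(new_output);
    if (prefix_len + out_len + 1 > dst_len)
        return -1;

    memcpy(dst, stored, prefix_len);
    memcpy(dst + prefix_len, new_output, out_len + 1);  /* includes NUL */
    return 0;
}
```

The caller then compares the rebuilt string with the stored one, as described above; a match means that the password is accepted.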
297 |
298 | ## Argon2 Encoding
299 |
300 | For Argon2, the following is specified:
301 |
302 | - The identifier for Argon2d is `argon2d`.
303 |
304 | - The identifier for Argon2i is `argon2i`.
305 |
306 | - The identifier for Argon2id is `argon2id`.
307 |
308 | - The versions are: [16, 19].
309 |
310 | - The parameters are:
311 |
312 | * `m`: Memory size, expressed in kibibytes, between 1 and (2^32)-1.
313 | Value is an integer in decimal, over 1 to 10 digits.
314 |
315 | * `t`: Number of iterations, between 1 and (2^32)-1.
316 | Value is an integer in decimal, over 1 to 10 digits.
317 |
318 | * `p`: Degree of parallelism, between 1 and 255.
319 | Value is an integer in decimal, over 1 to 3 digits.
320 |
321 | * `keyid`: Binary identifier for a key. Value is a sequence of 0
322 | to 8 bytes, encoded in B64 as 0 to 11 characters. This parameter
323 | is optional; the default value is the empty sequence (no byte at
324 | all) and its meaning is that no key is to be used. The contents of
325 | the identifier are chosen by the application and are meant to
326 | allow the application to locate the key to use.
327 |
328 | * `data`: Associated data. Value is a sequence of 0 to 32 bytes,
329 | encoded in B64 as 0 to 43 characters. This parameter is optional;
330 | the default value is the empty sequence (no byte at all). The
331 | associated data is an extra, non-secret value that is included in the
332 | Argon2 input.
333 |
334 | The parameters shall appear in the `m,t,p,keyid,data` order.
335 | The `keyid` and `data` parameters are optional; the three others
336 | are NOT optional.
337 |
338 | - The salt value is encoded in B64. The length in bytes of the
339 | salt is between 8 and 48 bytes(*), thus yielding a length in
340 | characters between 11 and 64 characters (and that length is never
341 | equal to 1 modulo 4). The default byte length of the salt is 16
342 | bytes (22 characters in B64 encoding). An encoded UUID, or a
343 | sequence of 16 bytes produced with a cryptographically strong
344 | PRNG, are appropriate salt values.
345 |
346 | ((*) the Argon2 specification states that the salt can be much
347 | longer, up to 2^32-1 bytes, but this makes little sense for
348 | password hashing. Specifying a relatively small maximum length
349 | allows for parsing with a stack allocated buffer.)
350 |
351 | - The hash output is encoded in B64. Its length shall be between
352 | 12 and 64 bytes (16 and 86 characters, respectively). The default
353 | output length is 32 bytes (43 characters).
354 |
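As an illustration of the Argon2 rules above, the following sketch assembles a hash string for the common case where `keyid` and `data` keep their default (empty) values and are therefore omitted. Several things here are assumptions rather than statements of this specification: the function name `argon2_format` is hypothetical, the overall `$id$v=...$params$salt$hash` layout is taken from the general format defined earlier in this document, the version field is always written, and the salt and output are assumed to be already B64-encoded with valid lengths.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes
 *   $<id>$v=<version>$m=<m>,t=<t>,p=<p>$<salt>$<hash>
 * into dst. The optional keyid and data parameters are omitted, which
 * corresponds to their default (empty) values.
 * Returns the string length on success, -1 on invalid parameters or if
 * dst is too small. Salt and hash lengths are not checked here. */
int argon2_format(char *dst, size_t dst_len, const char *id,
                  unsigned version, uint32_t m, uint32_t t, unsigned p,
                  const char *salt_b64, const char *hash_b64)
{
    int n;

    /* Ranges from the parameter list: m and t >= 1, p in 1..255. */
    if (m < 1 || t < 1 || p < 1 || p > 255)
        return -1;

    /* %lu / %u print plain decimal with no sign and no leading zero. */
    n = snprintf(dst, dst_len, "$%s$v=%u$m=%lu,t=%lu,p=%u$%s$%s",
                 id, version, (unsigned long)m, (unsigned long)t, p,
                 salt_b64, hash_b64);
    if (n < 0 || (size_t)n >= dst_len)
        return -1;
    return n;
}
```

For example, with identifier `argon2id`, version 19, `m=65536`, `t=2` and `p=1`, the parameter portion of the resulting string reads `$argon2id$v=19$m=65536,t=2,p=1`, followed by `$`, the salt, `$` and the output.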
--------------------------------------------------------------------------------