├── README.md ├── SIS_hardness.sage ├── compute_weight.sage ├── jl.py ├── plot_paper.py ├── poly-arith ├── Makefile ├── README.md ├── bench.c ├── bench.h ├── cpucycles.c ├── cpucycles.h ├── fips202.c ├── fips202.h ├── ntt.c ├── ntt.h ├── params.h ├── poly.c ├── poly.h ├── randombytes.c ├── randombytes.h ├── reduce.c ├── reduce.h ├── symmetric-shake.c ├── symmetric.h ├── test_mul ├── test_mul.c ├── util.c └── util.h └── proof_size_estimate.py /README.md: -------------------------------------------------------------------------------- 1 | # aggregate-falcon 2 | This repository contains supplementary material for the paper "Aggregating Falcon Signatures With LaBRADOR". 3 | 4 | It contains the following files: 5 | 6 | - `jl.py`: This Python script computes the constant 'C_1' and slack 'sqrt(lambda/C_2)' of the Johnson-Lindenstrauss lemma, Lemma 2.2 in the paper. 7 | The constant 'c' (C_1 in the paper) is used in 'proof_size_estimate.py' as JL_const and the slack (sqrt(lambda/C_2) in the paper) is used as JL_slack. 8 | To run the program, simply run the Python script, for example >>> python3 ./jl.py 9 | 10 | - `compute_weight.sage`: This SageMath script computes the weight 'w' and infinity norm bound 'gamma' of the different challenge sets as presented in Table 3 in Section D.4 of the paper. The weight and the infinity norm bound directly define the bound on the square of the l2-norm (tau in the Python script and T_2 in the paper) and the bound on the operator norm (T in the Python script and T_op in the paper) and are used in 'proof_size_estimate.py'. 11 | To run the program, simply run the SageMath script, for example >>> load("compute_weight.sage") 12 | 13 | - `SIS_hardness.sage`: This SageMath script computes the estimated hardness of Module-SIS needed for our aggregate signature. 14 | It is derived from corresponding code provided to us by Gregor Seiler.
15 | Its output is then used as 'kappa_lim' in the Python script 'proof_size_estimate.py'. 16 | Concretely, it computes the corresponding 'kappa_lim' lists for the classes 17 | 1) FALCON_64_128 (used for two-splitting 128-bit security) 18 | 2) FALCON_128_128 (used for almost-fully-splitting 128-bit security) 19 | 3) FALCON_128_256 (used for two-splitting 256-bit security) 20 | 4) FALCON_256_256 (used for almost-fully-splitting 256-bit security) 21 | To run the program, simply run the SageMath script, for example >>> load("SIS_hardness.sage") 22 | 23 | - `proof_size_estimate.py`: This Python script computes the estimated sizes of our aggregate signature as presented in Section 6 and Appendix E of our paper. It is derived from corresponding code provided to us by Gregor Seiler. 24 | It first provides the numbers for the comparison with [JRS23], Squirrel and Chipmunk. 25 | Then it provides the numbers stored in 'estimates-lin-.csv' to derive the plots through 'plot_paper.py' later. 26 | To run the program, simply run the Python script, for example >>> python3 ./proof_size_estimate.py 27 | 28 | - `plot_paper.py`: After having run 'proof_size_estimate.py', this Python script produces the corresponding plots for Section 6. 29 | To obtain Figure 3 (left) in the paper, run the Python script with the only function not commented out being 'plot_512_1024_2S_lin()' 30 | To obtain Figure 3 (right) in the paper, run the Python script with the only function not commented out being 'plot_512_2S_lin_no_salt()' 31 | To obtain Figure 4 in the paper, first run the Python script with the only function not commented out being 'plot_512_AS_2S_lin()', then run it again with the only function not commented out being 'plot_1024_AS_2S_lin()'. 32 | 33 | - `poly-arith/*`: Experiments with polynomial arithmetic for various choices of parameters. Check the `README.md` inside for instructions.
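The two-splitting search in `compute_weight.sage` needs only integer binomials and logarithms, so it can be tried without SageMath. The block below is a minimal sketch in plain Python, assuming the same search order as the script (increase the weight first, then the infinity norm bound) and the same cap `max_r = 601`. The names `challenge_space_bits` and `two_splitting_weight` are hypothetical and do not exist in the repository. Unlike the script, the sketch raises an error when no admissible pair is found.

```python
from math import comb, log2

MAX_R = 601  # cap on the number of witnesses, as in compute_weight.sage


def challenge_space_bits(d, w, gam):
    """Return log2 of binomial(d, w) * (2*gam)^w, the challenge-set size."""
    return log2(comb(d, w)) + w * log2(2 * gam)


def two_splitting_weight(d, secparam, max_gam=100):
    """Return the first (w, gam) in search order with |C| / (9 * MAX_R) >= 2^secparam.

    The factor 9 is (5 + 2*l) with l = 2, as in the SageMath script.
    """
    for gam in range(1, max_gam + 1):
        for w in range(1, d):  # weights 1 .. d-1, as in the script
            if challenge_space_bits(d, w, gam) - log2(9 * MAX_R) >= secparam:
                return w, gam
    raise ValueError("no admissible (w, gamma) with gamma <= max_gam")


if __name__ == "__main__":
    for d, secparam in ((64, 128), (128, 256)):
        w, gam = two_splitting_weight(d, secparam)
        print(f"d = {d}, lambda = {secparam}: w = {w}, gamma = {gam}")
```

The two calls at the bottom use the same `(d, secparam)` pairs as the two-splitting section of `compute_weight.sage`, so the printed values can be compared against the output of the SageMath script.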
34 | 35 | -------------------------------------------------------------------------------- /SIS_hardness.sage: -------------------------------------------------------------------------------- 1 | ''' This SageMath script computes the estimated hardness of Module-SIS needed for our aggregate signature. 2 | It is derived from corresponding code provided to us by Gregor Seiler. 3 | Its output is then used as 'kappa_lim' in the python script 'proof_size_estimate.py' 4 | Concretely, it computes the corresponding 'kappa_lim' lists for the classes 5 | 1) FALCON_64_128 (used for two-splitting 128-bit security) 6 | 2) FALCON_128_128 (used for almost-fully-splitting 128-bit security) 7 | 3) FALCON_128_256 (used for two-splitting 256-bit security) 8 | 4) FALCON_256_256 (used for almost-fully-splitting 256-bit security) 9 | To run the program, simply run the SageMath script, for example >>> load("SIS_hardness.sage")''' 10 | 11 | def GSAhermite(bs): 12 | return (bs/(2*pi*e)*numerical_approx(pi*bs)^(1/bs))^(1/(2*bs-2)) 13 | 14 | def findblocksize(h): 15 | bs=9999 16 | for i in range(50, 1000): 17 | if GSAhermite(i) <= h: 18 | bs = i 19 | break 20 | return bs 21 | 22 | def BDGLcost(bs): 23 | return bs*log(sqrt(3/2),2) 24 | 25 | # m: rows 26 | # q: modulus 27 | # b: L2 norm bound 28 | def SIShardness(m, q, b, verbose=False): 29 | if b >= q: 30 | print("Norm bound bigger than q!\n") 31 | return 32 | 33 | # When b = q^(m/d) h^d , h is maximal at d = 2 m log_b(q) 34 | d = round(2*m*log(q, b)) 35 | h = numerical_approx((b*q^(-m/d)))^(1/d) # calc root HF for lattice dimension d 36 | #print(d) 37 | #print(h) 38 | #for i in range(m+1, n, 5): 39 | # t = (b*q^(-m/i))^(1/i) # calc root HF for col i (lattice dim) 40 | # if t > h: 41 | # h = t 42 | # d = i 43 | 44 | bs = findblocksize(h) 45 | if bs > d: 46 | #print("Required block size bigger than lattice dimension!\n") 47 | return 292 48 | 49 | cost = BDGLcost(bs) 50 | if verbose: 51 | print("SIS hardness in l2 norm:\n"); 52 | print(" SIS 
lattice dimension used: %d\n" % d); 53 | print(" Root Hermite factor: %.5f\n"% h); 54 | print(" BKZ block size: %d\n" % bs); 55 | print(" Classical Core-SVP bit-cost for BDGL16 sieve: %.2f\n" % cost) 56 | 57 | return cost 58 | 59 | #Binary search for the largest beta such that SIS is hard 60 | def binary_search(sec, d, q, kappa, max_beta): 61 | l = 1 62 | r = max_beta 63 | cost = 0 64 | while l < r: 65 | mid = (l+r)//2 66 | #print("mid", mid) 67 | cost = SIShardness(m=kappa*d, q=q, b=mid) 68 | if abs(sec-cost) < 1 and cost > sec: # stop if the cost is close enough the the desired security level 69 | break 70 | if cost < sec: 71 | r = mid 72 | else: 73 | l = mid + 1 74 | 75 | print("Classical Core-SVP bit-cost for BDGL16 sieve: %.2f\n" % cost) 76 | return r-1 77 | 78 | def build_table(sec, d, q, max_kappa): 79 | l = [-1] 80 | for kappa in range(1, max_kappa+1): # change 81 | beta = binary_search(sec, d, q, kappa, q) 82 | l.append(beta) 83 | print("kappa", kappa, "beta", beta, "\n") 84 | return l 85 | 86 | ###### ====== SEC PARAMETER 128 ===== ###### 87 | 88 | print('Security Parameter 128') 89 | sec = 128 90 | 91 | # class FALCON_64_128 92 | d = 64 # ring degree 93 | logq = 47 #Lower bound for q, for hardness estimation 94 | q = 2**logq-1 95 | max_kappa = 35 #max kappa chosen such that maximal beta value is reached 96 | 97 | l = build_table(sec, d, q, max_kappa) 98 | print('Ring Degree 64 - Security Parameter 128') 99 | print(l) 100 | 101 | # class FALCON_128_128 102 | d = 128 # ring degree 103 | logq = 47 #Lower bound for q, for hardness estimation 104 | q = 2**logq-1 105 | max_kappa = 18 #max kappa chosen such that maximal beta value is reached 106 | 107 | l = build_table(sec, d, q, max_kappa) 108 | print('Ring Degree 128 - Security Parameter 128') 109 | print(l) 110 | 111 | ###### ====== SEC PARAMETER 256 ===== ###### 112 | 113 | print('Security Parameter 256') 114 | sec = 256 115 | 116 | # class FALCON_128_256 117 | d = 128 # ring degree 118 | logq = 47 #Lower 
bound for q, for hardness estimation 119 | q = 2**logq-1 120 | max_kappa = 29 #max kappa chosen such that maximal beta value is reached 121 | 122 | l = build_table(sec, d, q, max_kappa) 123 | print('Ring Degree 128 - Security Parameter 256') 124 | print(l) 125 | 126 | # class FALCON_256_256 127 | d = 256 # ring degree 128 | logq = 47 #Lower bound for q, for hardness estimation 129 | q = 2**logq-1 130 | max_kappa = 15 #max kappa chosen such that maximal beta value is reached 131 | 132 | l = build_table(sec, d, q, max_kappa) 133 | print('Ring Degree 256 - Security Parameter 256') 134 | print(l) 135 | 136 | # For d = 64 137 | # 128-bit security l = [-1, 271, 2687, 16383, 73727, 524287, 917503, 2621439, 7340031, 18874367, 50331647, 117440511, 251658239, 1073741823, 1207959551, 2684354559, 5368709119, 10737418239, 21474836479, 68719476735, 137438953471, 137438953471, 240518168575, 549755813887, 755914244095, 1374389534719, 4398046511103, 4398046511103, 8796093022207, 13194139533311, 35184372088831, 35184372088831, 52776558133247, 87960930222079, 140737488355326, 140737488355326] 138 | # after that just repetitions 139 | # 140 | # For d = 128: 141 | # 128-bit security l = [-1, 2687, 73727, 917503, 7340031, 50331647, 251658239, 1207959551, 5368709119, 21474836479, 137438953471, 240518168575, 755914244095, 4398046511103, 8796093022207, 35184372088831, 52776558133247, 140737488355326, 140737488355326] 142 | # 256-bit security l = [-1, 463, 5887, 49151, 229375, 917503, 3407871, 11534335, 34603007, 100663295, 268435455, 805306367, 1744830463, 4294967295, 9663676415, 21474836479, 47244640255, 103079215103, 206158430207, 549755813887, 1099511627775, 1649267441663, 3161095929855, 6047313952767, 13194139533311, 21990232555519, 39582418599935, 70368744177663, 140737488355326, 140737488355326] 143 | # after that just repetitions 144 | # 145 | # For d = 256 146 | # 256-bit security l = [-1, 5887, 229375, 3407871, 34603007, 268435455, 1744830463, 9663676415, 47244640255, 
206158430207, 1099511627775, 3161095929855, 13194139533311, 39582418599935, 140737488355326, 140737488355326] 147 | # after that just repetitions 148 | -------------------------------------------------------------------------------- /compute_weight.sage: -------------------------------------------------------------------------------- 1 | # MatRiCT+ invertibility bound heuristic model computation (https://gitlab.com/raykzhao/matrict_plus) 2 | # Ron Steinfeld, 29 July 2021 3 | # Adapted by authors of the submission to encompass larger infinity norm bounds (gamma), 7 February 2024 4 | # Please cite the MatRiCT+ paper accepted at IEEE S&P 2022 conference (full version available at https://eprint.iacr.org/2021/545) when using scripts 5 | 6 | ''' 7 | SageMath script to compute the weight 'w' and infinity norm bound 'gamma' of the different challenge sets as presented in Table 3 in Section D.4 of the submission. 8 | The weight and the inifinity norm bound directly define the bound on the square of the l2-norm (tau in python script and T_2 in paper) and the bound on the operator norm (T in python script and T_op in paper) and are used in 'proof_size_estimate.py'. 9 | To run the program, simply run the SageMath script, for example >>> load("compute_weight.sage") 10 | ''' 11 | 12 | import math 13 | import numpy as np 14 | import time 15 | import csv 16 | from sage.modules.free_module_integer import IntegerLattice 17 | 18 | # maximal number of witnesses in each iteration of LaBRADOR (cf. 
Equation 9 in submission) 19 | # max_r = 6 ceil(sqrt(N)) + 1 for N <= 10~000, r_max <= 601 20 | max_r = 601 21 | 22 | # simple function to compute a prime number with the right splitting behaviour 23 | def compute_prime(num_bit,ell): 24 | for i in range(2^(num_bit-1),2^(num_bit+3)): 25 | if i % (4*ell) == (2*ell+1) and is_prime(i): 26 | return(i) 27 | return 0 28 | 29 | # compute gama and weight for the almost fully splitting case 30 | def compute_weight_almost_fully_splitting(d,ell,q,secparam): 31 | max_gam = 100 # maximal gamma we are interested in; 32 | gam = 1 33 | tilde_w = 1 34 | delt = d//ell # degree of the irred factors of X^d+1 mod q (delta in paper) 35 | while True: 36 | # FOLLOWING THE FORMULAS OF LEMMA D.1 37 | A = RDF(gamma((tilde_w+1)/2)/sqrt(pi()*(ell*gam)^tilde_w)) # tilde_w'th moment of Gaussian with var 1/(2*gam*ell) 38 | eta = RDF(ell^tilde_w * factorial(ell-tilde_w) / factorial(ell)) # eta as in paper 39 | expect_EM2 = eta * RDF(1/q + (1-1/q)*A) # Expected value of M2 40 | expect_B = expect_EM2^delt # expected bound for B (with delta independent coefficients per CRT slot) 41 | 42 | # FOLLOWING THE EQUATION 2 43 | final_term = (5+2*ell)*max_r * expect_B # (5+2l)Br = one of the additive terms in the knowledge error of LaBRADOR 44 | 45 | # logs of Results 46 | lgEM2 = round(RDF(log(expect_EM2)/log(2)),1) 47 | lgB = round(RDF(log(expect_B)/log(2)),1) 48 | lgfinal_term = round(RDF(log(final_term)/log(2)),1) 49 | 50 | # increase the weight if security level is not yet reached and weight is below the number of CRT slots 51 | if lgfinal_term > -secparam and tilde_w < ell: 52 | tilde_w += 1 53 | continue 54 | # increase gamma, reset weight to 1 if security level is not yet reached but weight is reaching the number of CRT slots 55 | if lgfinal_term > - secparam and tilde_w >= ell: 56 | if gam >= max_gam: 57 | print("ERROR: TOO LARGE GAMMA >= ",max_gam) 58 | break 59 | else: 60 | tilde_w = 1 61 | gam += 1 62 | continue 63 | else: 64 | final_w = 
tilde_w*delt 65 | binom = binomial(d,final_w)*(2*gam)^(final_w) 66 | if log(binom,2) < secparam: # double check if size of challenge space is big enough 67 | print("ERROR, CHALLENGE SPACE TOO SMALL") 68 | break 69 | # stop once securtiy level is reached 70 | else: 71 | print("\n") 72 | print ("=== COMPUTE WEIGHT - ALMOST FULLY SPLITTING ===") 73 | print("\n") 74 | print ("Input Pars:") 75 | print ("ring degree (d in paper) = ", d) 76 | print ("no. of irred factors (r in paper) = ", ell) 77 | print ("bit length of modulus (q in paper) = ", math.ceil(log(q)/log(2))) 78 | print ("security paramater (lambda in paper)= ", secparam) 79 | print("\n") 80 | print ("Output Pars:") 81 | print ("log of M2 = ",lgEM2) 82 | print ("log of well-spreadness = ",lgB) 83 | print ("log of final term in knowledge error (5+2l)Br= ",lgfinal_term) 84 | print ("weight of each S_i set (tilde_w in paper)= ", tilde_w) 85 | print ("total weight of callenge elements (w in paper)= ", final_w) 86 | print ("infinity norm bound (gamma in paper) = ", gam) 87 | break 88 | 89 | # compute gama and weight for the two splitting case 90 | def compute_weight_two_splitting(d,secparam): 91 | w = 1 92 | gam = 1 93 | max_gam = 100 # maximal gamma we are interested in; 94 | while True: 95 | binom = binomial(d,w)*(2*gam)^w 96 | # in two-splitting: B = 1/|C| -> |C|/(5+2l)r has to be larger than 2^{secpar} 97 | if log(binom/(9*max_r),2) < secparam: # only condition: size of challenge space is big enough; we divide by (5+2l)r where l=2 and r<= max_r 98 | if w < d-1: 99 | w += 1 100 | continue 101 | if w >= d-1: 102 | if gam >= max_gam: 103 | print("ERROR, MAX_GAM =", max_gam, " WAS REACHED") 104 | else: 105 | w = 1 106 | gam += 1 107 | continue 108 | else: 109 | break 110 | print("\n") 111 | print ("=== COMPUTE WEIGHT - TWO SPLITTING ===") 112 | print("\n") 113 | print ("Input Pars:") 114 | print ("ring degree (d in paper) = ", d) 115 | print ("security paramater (lambda in paper)= ", secparam) 116 | print("\n") 117 
| print ("Output Pars:") 118 | print("total weight of challenge elements (w in paper)= ",w) 119 | print ("infinity norm bound (gamma in paper) = ", gam) 120 | 121 | 122 | #### Aggregating Falcon Signatures With LaBRADOR #### 123 | #### Reproducing the numbers of Table 1 of the submission ###### 124 | 125 | ### ALMOST FULLY SPLITTING 126 | 127 | ## aiming security level of 128-bits 128 | secparam=128 129 | 130 | # splitting up to level 4 131 | d=128 132 | ell=d//4 133 | q=compute_prime(47,ell) 134 | #q=compute_prime(63,ell) #gives slightly smaller parameters 135 | compute_weight_almost_fully_splitting(d,ell,q,secparam) 136 | 137 | ## aiming security level of 256-bits 138 | secparam=256 139 | 140 | # splitting up to level 8 141 | d =256 142 | ell=d//8 143 | q=compute_prime(47,ell) 144 | #q=compute_prime(63,ell) #gives slightly smaller parameters 145 | compute_weight_almost_fully_splitting(d,ell,q,secparam) 146 | 147 | ### TWO SPLITTING 148 | 149 | ## aiming security level of 128-bits 150 | secparam=128 151 | d=64 152 | compute_weight_two_splitting(d,secparam) 153 | 154 | ## aiming security level of 256-bits 155 | secparam=256 156 | d=128 157 | compute_weight_two_splitting(d,secparam) 158 | -------------------------------------------------------------------------------- /jl.py: -------------------------------------------------------------------------------- 1 | ''' 2 | This python script computes the constant 'C_1' and slack 'sqrt(lambda/C_2)' of the Johnson-Lindenstrauss lemma, Lemma 2.2 in the submission. 
3 | The constant 'c' (C_1 in the paper) is used in 'proof_size_estimate.py' as JL_const and the slack (sqrt(lambda/C_2 in paper)) is used as JL_slack 4 | To run the program, simply run the python script, for example >>> python3 ./jl.py 5 | ''' 6 | 7 | from scipy.stats import chi2,norm 8 | from scipy.special import binom 9 | from math import log2,log,sqrt,ceil,floor,pi 10 | 11 | def ceil_decimal(x, numdecials): 12 | if numdecials == 0: 13 | return ceil(x) 14 | factor = 10**numdecials 15 | return ceil(x*factor)/factor 16 | 17 | def floor_decimal(x, numdecimals): 18 | if numdecimals == 0: 19 | return floor(x) 20 | factor = 10**numdecimals 21 | return floor(x*factor)/factor 22 | 23 | #Added margin to get more sound proof for Labrador strengthening 24 | def tailbounds_chl21(secparam, round=False, numdecimals=2,margin=0): 25 | target = 2.0**(-secparam-margin) 26 | projdim = 2*secparam 27 | 28 | #Pr[|| > alpha norm(w)] <= target * 1/projdim (for union bound over coords of proj) 29 | inv = norm.isf(target*(1/projdim)) 30 | alpha = inv / sqrt(2) 31 | 32 | #Pr[norm(Pw)^2 < beta norm(w)^2] <= target 33 | inv = chi2.ppf(target, projdim) 34 | beta = inv / 2.0 35 | 36 | #Pr[norm(Pw)^2 > gamma norm(w)^2] <= target 37 | inv = chi2.isf(target, projdim) 38 | gamma = inv / 2.0 39 | 40 | if(round): 41 | alpha = ceil_decimal(alpha, numdecimals) 42 | beta = floor_decimal(beta, numdecimals) 43 | gamma = ceil_decimal(gamma, numdecimals) 44 | 45 | return projdim, alpha, beta, gamma 46 | 47 | projdim128, alpha128, beta128, gamma128 = tailbounds_chl21(128) 48 | assert alpha128 < 9.75 49 | assert beta128 > 30 50 | assert gamma128 < 337 51 | projdim256, alpha256, beta256, gamma256 = tailbounds_chl21(256, margin=1) 52 | 53 | #Proof of Cor 3.3 in [GHL21] works as long as sqrt(beta)b < q/(8d) 54 | #Output: c s.t. 
the lemma holds for b < q/(cd) 55 | def jl_glh21_normreq(beta, round=False, numdecimals=2): 56 | c = 8 * sqrt(beta) 57 | if round: 58 | return ceil_decimal(c, numdecimals) 59 | return c 60 | 61 | assert jl_glh21_normreq(beta128) < 45 62 | 63 | #Output: c s.t. the lemma holds for b < q/c 64 | def jl_labrador_normreq(secparam, projdim, alpha, beta, round=False, numdecimals=2): 65 | #Case 1: 66 | c1 = sqrt(beta)/(1-alpha/10) 67 | 68 | #Case 2 69 | #Finding number of coordinates of proj (not mod q) that can have magnitude 70 | #greater than q/120 before prob is < 2^-secparam 71 | union_bound = 0 72 | last_val = union_bound 73 | bound = 2**(-secparam) 74 | prob_other_coords = 3.0**(-projdim) 75 | i = 0 76 | while union_bound < bound and i <= projdim: 77 | last_val = union_bound 78 | i += 1 79 | prob_other_coords *= 3 80 | union_bound += binom(projdim, i)*prob_other_coords 81 | case2numdigits = i 82 | 83 | c2 = 120*sqrt(beta) / sqrt(case2numdigits) 84 | c = max(c1, c2) 85 | 86 | #Case 3 87 | prob_qhalf = 2 * norm.sf(sqrt(2)*alpha/2) 88 | c3 = 11*sqrt(2*beta)/(sqrt(pi)*((1/sqrt(2))-prob_qhalf-0.3)) 89 | c = max(c, c3) 90 | 91 | if round: 92 | return ceil_decimal(c, numdecimals) 93 | return c 94 | 95 | assert jl_labrador_normreq(128, 256, alpha128, beta128) <= 125 96 | 97 | c128 = jl_labrador_normreq(128, 256, alpha128, beta128) 98 | c256 = jl_labrador_normreq(256, 512, alpha256, beta256) 99 | 100 | def print_jl_msg(secparam): 101 | projdim, alpha, beta, _ = tailbounds_chl21(secparam) 102 | c = jl_labrador_normreq(secparam, projdim, alpha, beta, round=True, numdecimals=0) 103 | beta_rounded = floor(beta) 104 | print(f"=== Modular Johnson-Lindenstrauss for secparam = {secparam} ===") 105 | print("Let w be a fixed vector in Z^d.") 106 | print(f"Let {projdim} be the dimension of our projections.") 107 | print(f"Let b <= q/{c}.") #c corresponds to C_1 in Lemma 2.2 in paper 108 | print() 109 | print(f"If norm(w)_2 > b, then the probability that norm(proj)_2 < 
sqrt({beta_rounded}) norm(w)_2") # beta_rounded corresponds to C_2 in Lemma 2.2 in paper 110 | print(f"is heuristically less than 2^-{secparam}.") 111 | print() 112 | print(f"This gives a slack of {ceil_decimal(sqrt(projdim/(2*beta)), 2)} for the approximate norm check.") 113 | 114 | # Run function for security parameter lambda = 128 -> used for Falcon-512 115 | print_jl_msg(128) 116 | # Run function for security parameter lambda = 256 -> used for Falcon-1024 117 | print_jl_msg(256) 118 | -------------------------------------------------------------------------------- /plot_paper.py: -------------------------------------------------------------------------------- 1 | ''' After having run 'proof_size_estimate.py', this python script computes the corresponding plots for Section 6. 2 | To obtain Figure 3 (left) in submission, run the python script with the only function not commented out being 'plot_512_1024_AS_lin()' 3 | To obtain Figure 3 (right) in submission, run the python script with the only function not commented out being 'plot_512_AS_lin_no_salt()' 4 | To obtain Figure 4 in submission, first run the python script with the only function not commented out being 'plot_512_AS_2S_FS_lin()', then run it again with the only function not commented out being 'plot_1024_AS_2S_FS_lin()'. 
5 | ''' 6 | 7 | from pandas import * 8 | import matplotlib.pyplot as plt 9 | from matplotlib.ticker import AutoMinorLocator 10 | plt.style.use('tableau-colorblind10') 11 | 12 | def kB(sizes_bits): 13 | conversion = lambda b: b / 8000 14 | sizes_kB = list(map(conversion, sizes_bits)) 15 | return sizes_kB 16 | 17 | def add_salts(num_sigs, sizes): 18 | add = lambda e: e[0] + (320 * e[1]) 19 | return list(map(add, zip(sizes, num_sigs))) 20 | 21 | #data = read_csv("estimates-lin-zoom.csv") 22 | data = read_csv("estimates-lin-.csv") 23 | # print(data.keys) 24 | #with AS 25 | heading ="Num-Sigs,Naive-512,Naive-1024,Falcon-512-2S,Falcon-512-AS,Falcon-1024-2S,Falcon-1024-AS" 26 | 27 | num_sigs = data["Num-Sigs"].tolist() 28 | naive_512 = data["Naive-512"].tolist() 29 | naive_1024 = data["Naive-1024"].tolist() 30 | f512_2S = data["Falcon-512-2S"].tolist() 31 | f512_AS = data["Falcon-512-AS"].tolist() 32 | f1024_2S = data["Falcon-1024-2S"].tolist() 33 | f1024_AS = data["Falcon-1024-AS"].tolist() 34 | 35 | data_mod = read_csv("estimates-mod-size.csv") 36 | num_sigs_mod = data_mod["Num-Sigs"].tolist() 37 | mod_512 = data_mod["Falcon-512-log-q"].tolist() 38 | mod_1024 = data_mod["Falcon-1024-log-q"].tolist() 39 | 40 | ''' Naive signature sizes for optimized Falcon as described in [ETWY22] 41 | Numbers for skew factor gamma = 8 come from Table 1 in ETWY22, where we translate from bytes to bits; 42 | The salt is still of size 40 bytes = 320 bits 43 | ''' 44 | 45 | elips_512 = [8 * (410-40)* n for n in num_sigs] 46 | elips_1024= [8 * (780-40)* n for n in num_sigs] 47 | 48 | fig, ax = plt.subplots() 49 | 50 | def plot_512_1024_2S_lin(): 51 | plt.title("Comparison With Trivial Aggregation") 52 | ax.plot(num_sigs, kB(add_salts(num_sigs, f512_2S)), label="Our Aggregation of Falcon-512") 53 | ax.plot(num_sigs, kB(add_salts(num_sigs, f1024_2S)), label="Our Aggregation of Falcon-1024", linestyle="dashed") 54 | plt.plot(num_sigs, kB(add_salts(num_sigs, naive_512)), label="Trivial 
Aggregation of Falcon-512") 55 | plt.plot(num_sigs, kB(add_salts(num_sigs, naive_1024)), label="Trivial Aggregation of Falcon-1024", linestyle="dashed") 56 | plt.ylabel("Size (in kB)") 57 | plt.xlabel("Number of Signatures") 58 | ax.legend() 59 | ax.grid(axis="y") 60 | ratio = 0.5 61 | x_left, x_right = ax.get_xlim() 62 | y_low, y_high = ax.get_ylim() 63 | ax.set_aspect(abs((x_right-x_left)/(y_low-y_high))*ratio) 64 | ax.xaxis.set_minor_locator(AutoMinorLocator()) 65 | plt.ylim(ymin=0) 66 | plt.xlim(xmin=0) 67 | plt.savefig("plot-512-1024-2S-lin.pdf") 68 | 69 | def plot_512_1024_2S_lin_zoom(): 70 | plt.title("Comparison With Trivial Aggregation") 71 | ax.plot(num_sigs, kB(add_salts(num_sigs, f512_2S)), label="Our Aggregation of Falcon-512") 72 | ax.plot(num_sigs, kB(add_salts(num_sigs, f1024_2S)), label="Our Aggregation of Falcon-1024", linestyle="dashed") 73 | plt.plot(num_sigs, kB(add_salts(num_sigs, naive_512)), label="Trivial Aggregation of Falcon-512") 74 | plt.plot(num_sigs, kB(add_salts(num_sigs, naive_1024)), label="Trivial Aggregation of Falcon-1024", linestyle="dashed") 75 | plt.ylabel("Size (in kB)") 76 | plt.xlabel("Number of Signatures") 77 | ax.legend() 78 | ax.grid(axis="y") 79 | ratio = 0.5 80 | x_left, x_right = ax.get_xlim() 81 | y_low, y_high = ax.get_ylim() 82 | ax.set_aspect(abs((x_right-x_left)/(y_low-y_high))*ratio) 83 | ax.xaxis.set_minor_locator(AutoMinorLocator()) 84 | plt.ylim(ymin=0) 85 | plt.xlim(xmin=0) 86 | plt.savefig("plot-512-1024-2S-lin_zoom.pdf") 87 | 88 | def plot_512_2S_lin_no_salt(): 89 | plt.title("Effect of Salt on Aggregate Signature") 90 | ax.plot(num_sigs, kB(add_salts(num_sigs, f512_2S)), label="Our Aggregation of Falcon-512") 91 | ax.plot(num_sigs, kB(f512_2S), label="Our Aggregation of Falcon-512 (No Salt)", linestyle="dashed") 92 | plt.plot(num_sigs, kB(add_salts(num_sigs, naive_512)), label="Trivial Aggregation of Falcon-512") 93 | plt.plot(num_sigs, kB(naive_512), label="Trivial Aggregation of Falcon-512 (No Salt)", 
linestyle="dashed") 94 | plt.ylabel("Size (in kB)") 95 | plt.xlabel("Number of Signatures") 96 | ax.legend() 97 | ax.grid(axis="y") 98 | ratio = 0.5 99 | x_left, x_right = ax.get_xlim() 100 | y_low, y_high = ax.get_ylim() 101 | ax.set_aspect(abs((x_right-x_left)/(y_low-y_high))*ratio) 102 | ax.xaxis.set_minor_locator(AutoMinorLocator()) 103 | plt.ylim(ymin=0) 104 | plt.xlim(xmin=0) 105 | plt.savefig("plot-512-2S-lin-no-salt.pdf") 106 | 107 | def plot_512_AS_2S_lin(): 108 | plt.title("Comparison With Other Challenge Sets (Falcon-512)") 109 | plt.plot(num_sigs, kB(add_salts(num_sigs, f512_AS)), label="Almost-Fully-Splitting for Falcon-512") 110 | plt.plot(num_sigs, kB(add_salts(num_sigs, f512_2S)), label="Two-Splitting for Falcon-512", linestyle="dashed") 111 | plt.ylabel("Size (in kB)") 112 | plt.xlabel("Number of Signatures") 113 | ax.legend() 114 | ax.grid(axis="y") 115 | ratio = 0.5 116 | x_left, x_right = ax.get_xlim() 117 | y_low, y_high = ax.get_ylim() 118 | ax.set_aspect(abs((x_right-x_left)/(y_low-y_high))*ratio) 119 | ax.xaxis.set_minor_locator(AutoMinorLocator()) 120 | plt.ylim(ymin=0) 121 | plt.xlim(xmin=0) 122 | plt.savefig("plot-512-AS-2S-lin.pdf") 123 | 124 | def plot_1024_AS_2S_lin(): 125 | plt.title("Comparison With Other Challenge Sets (Falcon-1024)") 126 | ax.plot(num_sigs, kB(add_salts(num_sigs, f1024_AS)), label="Almost-Fully-Splitting for Falcon-1024") 127 | ax.plot(num_sigs, kB(add_salts(num_sigs, f1024_2S)), label="Two-Splitting for Falcon-1024", linestyle="dashed") 128 | plt.ylabel("Size (in kB)") 129 | plt.xlabel("Number of Signatures") 130 | ax.legend() 131 | ax.grid(axis="y") 132 | ratio = 0.5 133 | x_left, x_right = ax.get_xlim() 134 | y_low, y_high = ax.get_ylim() 135 | ax.set_aspect(abs((x_right-x_left)/(y_low-y_high))*ratio) 136 | ax.xaxis.set_minor_locator(AutoMinorLocator()) 137 | plt.ylim(ymin=0) 138 | plt.xlim(xmin=0) 139 | plt.savefig("plot-1024-AS-2S-lin.pdf") 140 | 141 | 142 | ### Figure 3 (left) in submission 143 | 
plot_512_1024_2S_lin() 144 | 145 | ### zoom 146 | #plot_512_1024_2S_lin_zoom() 147 | 148 | ### Figure 3 (right) in submission 149 | #plot_512_2S_lin_no_salt() 150 | 151 | ### Figure 4 (left) in submission 152 | #plot_512_AS_2S_lin() 153 | 154 | ### Figure 4 (right) in submission 155 | #plot_1024_AS_2S_lin() 156 | 157 | 158 | -------------------------------------------------------------------------------- /poly-arith/Makefile: -------------------------------------------------------------------------------- 1 | CC ?= /usr/bin/cc 2 | CFLAGS += -Wall -Wextra -Wmissing-prototypes -Wredundant-decls -Wshadow -Wpointer-arith -O3 -fomit-frame-pointer -ggdb -lflint -DDEGREE=64 3 | RM = /bin/rm 4 | 5 | SOURCES = poly.c ntt.c reduce.c util.c fips202.c symmetric-shake.c cpucycles.c 6 | HEADERS = params.h poly.h ntt.h reduce.h util.h 7 | 8 | .PHONY: all speed shared clean 9 | 10 | all: \ 11 | test_mul 12 | 13 | test_mul: $(SOURCES) randombytes.c poly.c ntt.c test_mul.c bench.c 14 | $(CC) $(CFLAGS) $^ -o test_mul -lm 15 | 16 | clean: 17 | -$(RM) -rf *.gcno *.gcda *.gch *.lcov *.o *.so 18 | -$(RM) -rf test_mul 19 | -------------------------------------------------------------------------------- /poly-arith/README.md: -------------------------------------------------------------------------------- 1 | Proof-of-concept code for the polynomial arithmetic experiments accompanying the paper "Aggregating Falcon Signatures With LaBRADOR" submitted to CRYPTO 2024. 2 | 3 | Dependencies are the GMP 6.2 and FLINT 2.9 libraries. 4 | 5 | For building the code, run `make` inside the source directory. This will build the `test_mul` binary for running tests and benchmarks for various parameters. 6 | 7 | By default it runs the experiments for subring degree 64, but other parameters can be enabled by replacing `-DDEGREE=64` with `-DDEGREE=128` or `-DDEGREE=256` in the Makefile. 8 | The code implements various polynomial multiplication algorithms for a 64-bit modulus.
One can then manually scale the latencies to a 128-bit modulus. 9 | -------------------------------------------------------------------------------- /poly-arith/bench.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include "bench.h" 3 | #include "cpucycles.h" 4 | 5 | // Private Definitions 6 | 7 | //Stores the time measured before the execution of the benchmark. 8 | static long long before; 9 | 10 | //Stores the time measured after the execution of the benchmark. 11 | static long long after; 12 | 13 | //Stores the sum of timings for the current benchmark. 14 | static long long total; 15 | 16 | // Public Definitions 17 | 18 | void bench_reset() { 19 | total = 0; 20 | } 21 | 22 | void bench_before() { 23 | before = cpucycles(); 24 | } 25 | 26 | void bench_after() { 27 | long long result; 28 | after = cpucycles(); 29 | result = (after - before); 30 | total += result; 31 | } 32 | 33 | void bench_compute(int benches) { 34 | total = total / benches; 35 | } 36 | 37 | void bench_print() { 38 | printf("%lld cycles\n", total); 39 | printf("\n"); 40 | } 41 | 42 | unsigned long long bench_total() { 43 | return total; 44 | } 45 | -------------------------------------------------------------------------------- /poly-arith/bench.h: -------------------------------------------------------------------------------- 1 | #include <string.h> 2 | 3 | #ifndef BENCH_H 4 | #define BENCH_H 5 | 6 | /*============================================================================*/ 7 | /* Macro definitions */ 8 | /*============================================================================*/ 9 | 10 | /** 11 | * Number of times each benchmark is run. 12 | */ 13 | #define BENCH 100 14 | 15 | /** 16 | * Runs a new benchmark once. 17 | * 18 | * @param[in] LABEL - the label for this benchmark. 19 | * @param[in] FUNCTION - the function to benchmark. 
20 | */ 21 | #define BENCH_ONCE(LABEL, FUNCTION) \ 22 | bench_reset(); \ 23 | printf("BENCH: " LABEL "%*c = ", (int)(20 - strlen(LABEL)), ' '); \ 24 | bench_before(); \ 25 | FUNCTION; \ 26 | bench_after(); \ 27 | bench_compute(1); \ 28 | bench_print(); \ 29 | 30 | /** 31 | * Runs a new benchmark a small number of times. 32 | * 33 | * @param[in] LABEL - the label for this benchmark. 34 | * @param[in] FUNCTION - the function to benchmark. 35 | */ 36 | #define BENCH_SMALL(LABEL, FUNCTION) \ 37 | bench_reset(); \ 38 | printf("BENCH: " LABEL "%*c = ", (int)(20 - strlen(LABEL)), ' '); \ 39 | bench_before(); \ 40 | for (int i = 0; i < BENCH; i++) { \ 41 | FUNCTION; \ 42 | } \ 43 | bench_after(); \ 44 | bench_compute(BENCH); \ 45 | bench_print(); \ 46 | 47 | /** 48 | * Runs a new benchmark. 49 | * 50 | * @param[in] LABEL - the label for this benchmark. 51 | */ 52 | #define BENCH_BEGIN(LABEL) \ 53 | bench_reset(); \ 54 | printf("BENCH: " LABEL "%*c = ", (int)(32 - strlen(LABEL)), ' '); \ 55 | for (int i = 0; i < BENCH; i++) { \ 56 | 57 | /** 58 | * Prints the mean timing of each execution in nanoseconds. 59 | */ 60 | #define BENCH_END \ 61 | } \ 62 | bench_compute(BENCH * BENCH); \ 63 | bench_print() \ 64 | 65 | /** 66 | * Measures the time of one execution and adds it to the benchmark total. 67 | * 68 | * @param[in] FUNCTION - the function executed. 69 | */ 70 | #define BENCH_ADD(FUNCTION) \ 71 | FUNCTION; \ 72 | bench_before(); \ 73 | for (int j = 0; j < BENCH; j++) { \ 74 | FUNCTION; \ 75 | } \ 76 | bench_after(); \ 77 | 78 | /*============================================================================*/ 79 | /* Function prototypes */ 80 | /*============================================================================*/ 81 | 82 | /** 83 | * Resets the benchmark data. 84 | * 85 | * @param[in] label - the benchmark label. 86 | */ 87 | void bench_reset(void); 88 | 89 | /** 90 | * Measures the time before a benchmark is executed. 
91 | */ 92 | void bench_before(void); 93 | 94 | /** 95 | * Measures the time after a benchmark was started and adds it to the total. 96 | */ 97 | void bench_after(void); 98 | 99 | /** 100 | * Computes the mean elapsed time between the start and the end of a benchmark. 101 | * 102 | * @param benches - the number of executed benchmarks. 103 | */ 104 | void bench_compute(int benches); 105 | 106 | /** 107 | * Prints the last benchmark. 108 | */ 109 | void bench_print(void); 110 | 111 | /** 112 | * Returns the result of the last benchmark. 113 | * 114 | * @return the last benchmark. 115 | */ 116 | unsigned long long bench_total(void); 117 | 118 | #endif /* !BENCH_H */ 119 | -------------------------------------------------------------------------------- /poly-arith/cpucycles.c: -------------------------------------------------------------------------------- 1 | #include "cpucycles.h" 2 | 3 | long long cpucycles(void) 4 | { 5 | unsigned long long result; 6 | asm volatile(".byte 15;.byte 49;shlq $32,%%rdx;orq %%rdx,%%rax" 7 | : "=a" (result) :: "%rdx"); 8 | return result; 9 | } 10 | -------------------------------------------------------------------------------- /poly-arith/cpucycles.h: -------------------------------------------------------------------------------- 1 | #ifndef CPUCYCLES_H 2 | #define CPUCYCLES_H 3 | 4 | long long cpucycles(void); 5 | 6 | #endif 7 | -------------------------------------------------------------------------------- /poly-arith/fips202.c: -------------------------------------------------------------------------------- 1 | /* Based on the public domain implementation in crypto_hash/keccakc512/simple/ from 2 | * http://bench.cr.yp.to/supercop.html by Ronny Van Keer and the public domain "TweetFips202" 3 | * implementation from https://twitter.com/tweetfips202 by Gilles Van Assche, Daniel J. 
Bernstein, 4 | * and Peter Schwabe */ 5 | 6 | #include 7 | #include 8 | #include "fips202.h" 9 | 10 | #define NROUNDS 24 11 | #define ROL(a, offset) ((a << offset) ^ (a >> (64-offset))) 12 | 13 | /************************************************* 14 | * Name: load64 15 | * 16 | * Description: Load 8 bytes into uint64_t in little-endian order 17 | * 18 | * Arguments: - const uint8_t *x: pointer to input byte array 19 | * 20 | * Returns the loaded 64-bit unsigned integer 21 | **************************************************/ 22 | static uint64_t load64(const uint8_t x[8]) { 23 | unsigned int i; 24 | uint64_t r = 0; 25 | 26 | for(i=0;i<8;i++) 27 | r |= (uint64_t)x[i] << 8*i; 28 | 29 | return r; 30 | } 31 | 32 | /************************************************* 33 | * Name: store64 34 | * 35 | * Description: Store a 64-bit integer to array of 8 bytes in little-endian order 36 | * 37 | * Arguments: - uint8_t *x: pointer to the output byte array (allocated) 38 | * - uint64_t u: input 64-bit unsigned integer 39 | **************************************************/ 40 | static void store64(uint8_t x[8], uint64_t u) { 41 | unsigned int i; 42 | 43 | for(i=0;i<8;i++) 44 | x[i] = u >> 8*i; 45 | } 46 | 47 | /* Keccak round constants */ 48 | static const uint64_t KeccakF_RoundConstants[NROUNDS] = { 49 | (uint64_t)0x0000000000000001ULL, 50 | (uint64_t)0x0000000000008082ULL, 51 | (uint64_t)0x800000000000808aULL, 52 | (uint64_t)0x8000000080008000ULL, 53 | (uint64_t)0x000000000000808bULL, 54 | (uint64_t)0x0000000080000001ULL, 55 | (uint64_t)0x8000000080008081ULL, 56 | (uint64_t)0x8000000000008009ULL, 57 | (uint64_t)0x000000000000008aULL, 58 | (uint64_t)0x0000000000000088ULL, 59 | (uint64_t)0x0000000080008009ULL, 60 | (uint64_t)0x000000008000000aULL, 61 | (uint64_t)0x000000008000808bULL, 62 | (uint64_t)0x800000000000008bULL, 63 | (uint64_t)0x8000000000008089ULL, 64 | (uint64_t)0x8000000000008003ULL, 65 | (uint64_t)0x8000000000008002ULL, 66 | (uint64_t)0x8000000000000080ULL, 67 | 
(uint64_t)0x000000000000800aULL, 68 | (uint64_t)0x800000008000000aULL, 69 | (uint64_t)0x8000000080008081ULL, 70 | (uint64_t)0x8000000000008080ULL, 71 | (uint64_t)0x0000000080000001ULL, 72 | (uint64_t)0x8000000080008008ULL 73 | }; 74 | 75 | /************************************************* 76 | * Name: KeccakF1600_StatePermute 77 | * 78 | * Description: The Keccak F1600 Permutation 79 | * 80 | * Arguments: - uint64_t *state: pointer to input/output Keccak state 81 | **************************************************/ 82 | static void KeccakF1600_StatePermute(uint64_t state[25]) 83 | { 84 | int round; 85 | 86 | uint64_t Aba, Abe, Abi, Abo, Abu; 87 | uint64_t Aga, Age, Agi, Ago, Agu; 88 | uint64_t Aka, Ake, Aki, Ako, Aku; 89 | uint64_t Ama, Ame, Ami, Amo, Amu; 90 | uint64_t Asa, Ase, Asi, Aso, Asu; 91 | uint64_t BCa, BCe, BCi, BCo, BCu; 92 | uint64_t Da, De, Di, Do, Du; 93 | uint64_t Eba, Ebe, Ebi, Ebo, Ebu; 94 | uint64_t Ega, Ege, Egi, Ego, Egu; 95 | uint64_t Eka, Eke, Eki, Eko, Eku; 96 | uint64_t Ema, Eme, Emi, Emo, Emu; 97 | uint64_t Esa, Ese, Esi, Eso, Esu; 98 | 99 | //copyFromState(A, state) 100 | Aba = state[ 0]; 101 | Abe = state[ 1]; 102 | Abi = state[ 2]; 103 | Abo = state[ 3]; 104 | Abu = state[ 4]; 105 | Aga = state[ 5]; 106 | Age = state[ 6]; 107 | Agi = state[ 7]; 108 | Ago = state[ 8]; 109 | Agu = state[ 9]; 110 | Aka = state[10]; 111 | Ake = state[11]; 112 | Aki = state[12]; 113 | Ako = state[13]; 114 | Aku = state[14]; 115 | Ama = state[15]; 116 | Ame = state[16]; 117 | Ami = state[17]; 118 | Amo = state[18]; 119 | Amu = state[19]; 120 | Asa = state[20]; 121 | Ase = state[21]; 122 | Asi = state[22]; 123 | Aso = state[23]; 124 | Asu = state[24]; 125 | 126 | for(round = 0; round < NROUNDS; round += 2) { 127 | // prepareTheta 128 | BCa = Aba^Aga^Aka^Ama^Asa; 129 | BCe = Abe^Age^Ake^Ame^Ase; 130 | BCi = Abi^Agi^Aki^Ami^Asi; 131 | BCo = Abo^Ago^Ako^Amo^Aso; 132 | BCu = Abu^Agu^Aku^Amu^Asu; 133 | 134 | //thetaRhoPiChiIotaPrepareTheta(round, A, E) 135 | Da = 
BCu^ROL(BCe, 1); 136 | De = BCa^ROL(BCi, 1); 137 | Di = BCe^ROL(BCo, 1); 138 | Do = BCi^ROL(BCu, 1); 139 | Du = BCo^ROL(BCa, 1); 140 | 141 | Aba ^= Da; 142 | BCa = Aba; 143 | Age ^= De; 144 | BCe = ROL(Age, 44); 145 | Aki ^= Di; 146 | BCi = ROL(Aki, 43); 147 | Amo ^= Do; 148 | BCo = ROL(Amo, 21); 149 | Asu ^= Du; 150 | BCu = ROL(Asu, 14); 151 | Eba = BCa ^((~BCe)& BCi ); 152 | Eba ^= (uint64_t)KeccakF_RoundConstants[round]; 153 | Ebe = BCe ^((~BCi)& BCo ); 154 | Ebi = BCi ^((~BCo)& BCu ); 155 | Ebo = BCo ^((~BCu)& BCa ); 156 | Ebu = BCu ^((~BCa)& BCe ); 157 | 158 | Abo ^= Do; 159 | BCa = ROL(Abo, 28); 160 | Agu ^= Du; 161 | BCe = ROL(Agu, 20); 162 | Aka ^= Da; 163 | BCi = ROL(Aka, 3); 164 | Ame ^= De; 165 | BCo = ROL(Ame, 45); 166 | Asi ^= Di; 167 | BCu = ROL(Asi, 61); 168 | Ega = BCa ^((~BCe)& BCi ); 169 | Ege = BCe ^((~BCi)& BCo ); 170 | Egi = BCi ^((~BCo)& BCu ); 171 | Ego = BCo ^((~BCu)& BCa ); 172 | Egu = BCu ^((~BCa)& BCe ); 173 | 174 | Abe ^= De; 175 | BCa = ROL(Abe, 1); 176 | Agi ^= Di; 177 | BCe = ROL(Agi, 6); 178 | Ako ^= Do; 179 | BCi = ROL(Ako, 25); 180 | Amu ^= Du; 181 | BCo = ROL(Amu, 8); 182 | Asa ^= Da; 183 | BCu = ROL(Asa, 18); 184 | Eka = BCa ^((~BCe)& BCi ); 185 | Eke = BCe ^((~BCi)& BCo ); 186 | Eki = BCi ^((~BCo)& BCu ); 187 | Eko = BCo ^((~BCu)& BCa ); 188 | Eku = BCu ^((~BCa)& BCe ); 189 | 190 | Abu ^= Du; 191 | BCa = ROL(Abu, 27); 192 | Aga ^= Da; 193 | BCe = ROL(Aga, 36); 194 | Ake ^= De; 195 | BCi = ROL(Ake, 10); 196 | Ami ^= Di; 197 | BCo = ROL(Ami, 15); 198 | Aso ^= Do; 199 | BCu = ROL(Aso, 56); 200 | Ema = BCa ^((~BCe)& BCi ); 201 | Eme = BCe ^((~BCi)& BCo ); 202 | Emi = BCi ^((~BCo)& BCu ); 203 | Emo = BCo ^((~BCu)& BCa ); 204 | Emu = BCu ^((~BCa)& BCe ); 205 | 206 | Abi ^= Di; 207 | BCa = ROL(Abi, 62); 208 | Ago ^= Do; 209 | BCe = ROL(Ago, 55); 210 | Aku ^= Du; 211 | BCi = ROL(Aku, 39); 212 | Ama ^= Da; 213 | BCo = ROL(Ama, 41); 214 | Ase ^= De; 215 | BCu = ROL(Ase, 2); 216 | Esa = BCa ^((~BCe)& BCi ); 217 | Ese = BCe ^((~BCi)& BCo ); 
218 | Esi = BCi ^((~BCo)& BCu ); 219 | Eso = BCo ^((~BCu)& BCa ); 220 | Esu = BCu ^((~BCa)& BCe ); 221 | 222 | // prepareTheta 223 | BCa = Eba^Ega^Eka^Ema^Esa; 224 | BCe = Ebe^Ege^Eke^Eme^Ese; 225 | BCi = Ebi^Egi^Eki^Emi^Esi; 226 | BCo = Ebo^Ego^Eko^Emo^Eso; 227 | BCu = Ebu^Egu^Eku^Emu^Esu; 228 | 229 | //thetaRhoPiChiIotaPrepareTheta(round+1, E, A) 230 | Da = BCu^ROL(BCe, 1); 231 | De = BCa^ROL(BCi, 1); 232 | Di = BCe^ROL(BCo, 1); 233 | Do = BCi^ROL(BCu, 1); 234 | Du = BCo^ROL(BCa, 1); 235 | 236 | Eba ^= Da; 237 | BCa = Eba; 238 | Ege ^= De; 239 | BCe = ROL(Ege, 44); 240 | Eki ^= Di; 241 | BCi = ROL(Eki, 43); 242 | Emo ^= Do; 243 | BCo = ROL(Emo, 21); 244 | Esu ^= Du; 245 | BCu = ROL(Esu, 14); 246 | Aba = BCa ^((~BCe)& BCi ); 247 | Aba ^= (uint64_t)KeccakF_RoundConstants[round+1]; 248 | Abe = BCe ^((~BCi)& BCo ); 249 | Abi = BCi ^((~BCo)& BCu ); 250 | Abo = BCo ^((~BCu)& BCa ); 251 | Abu = BCu ^((~BCa)& BCe ); 252 | 253 | Ebo ^= Do; 254 | BCa = ROL(Ebo, 28); 255 | Egu ^= Du; 256 | BCe = ROL(Egu, 20); 257 | Eka ^= Da; 258 | BCi = ROL(Eka, 3); 259 | Eme ^= De; 260 | BCo = ROL(Eme, 45); 261 | Esi ^= Di; 262 | BCu = ROL(Esi, 61); 263 | Aga = BCa ^((~BCe)& BCi ); 264 | Age = BCe ^((~BCi)& BCo ); 265 | Agi = BCi ^((~BCo)& BCu ); 266 | Ago = BCo ^((~BCu)& BCa ); 267 | Agu = BCu ^((~BCa)& BCe ); 268 | 269 | Ebe ^= De; 270 | BCa = ROL(Ebe, 1); 271 | Egi ^= Di; 272 | BCe = ROL(Egi, 6); 273 | Eko ^= Do; 274 | BCi = ROL(Eko, 25); 275 | Emu ^= Du; 276 | BCo = ROL(Emu, 8); 277 | Esa ^= Da; 278 | BCu = ROL(Esa, 18); 279 | Aka = BCa ^((~BCe)& BCi ); 280 | Ake = BCe ^((~BCi)& BCo ); 281 | Aki = BCi ^((~BCo)& BCu ); 282 | Ako = BCo ^((~BCu)& BCa ); 283 | Aku = BCu ^((~BCa)& BCe ); 284 | 285 | Ebu ^= Du; 286 | BCa = ROL(Ebu, 27); 287 | Ega ^= Da; 288 | BCe = ROL(Ega, 36); 289 | Eke ^= De; 290 | BCi = ROL(Eke, 10); 291 | Emi ^= Di; 292 | BCo = ROL(Emi, 15); 293 | Eso ^= Do; 294 | BCu = ROL(Eso, 56); 295 | Ama = BCa ^((~BCe)& BCi ); 296 | Ame = BCe ^((~BCi)& BCo ); 297 | Ami = BCi 
^((~BCo)& BCu ); 298 | Amo = BCo ^((~BCu)& BCa ); 299 | Amu = BCu ^((~BCa)& BCe ); 300 | 301 | Ebi ^= Di; 302 | BCa = ROL(Ebi, 62); 303 | Ego ^= Do; 304 | BCe = ROL(Ego, 55); 305 | Eku ^= Du; 306 | BCi = ROL(Eku, 39); 307 | Ema ^= Da; 308 | BCo = ROL(Ema, 41); 309 | Ese ^= De; 310 | BCu = ROL(Ese, 2); 311 | Asa = BCa ^((~BCe)& BCi ); 312 | Ase = BCe ^((~BCi)& BCo ); 313 | Asi = BCi ^((~BCo)& BCu ); 314 | Aso = BCo ^((~BCu)& BCa ); 315 | Asu = BCu ^((~BCa)& BCe ); 316 | } 317 | 318 | //copyToState(state, A) 319 | state[ 0] = Aba; 320 | state[ 1] = Abe; 321 | state[ 2] = Abi; 322 | state[ 3] = Abo; 323 | state[ 4] = Abu; 324 | state[ 5] = Aga; 325 | state[ 6] = Age; 326 | state[ 7] = Agi; 327 | state[ 8] = Ago; 328 | state[ 9] = Agu; 329 | state[10] = Aka; 330 | state[11] = Ake; 331 | state[12] = Aki; 332 | state[13] = Ako; 333 | state[14] = Aku; 334 | state[15] = Ama; 335 | state[16] = Ame; 336 | state[17] = Ami; 337 | state[18] = Amo; 338 | state[19] = Amu; 339 | state[20] = Asa; 340 | state[21] = Ase; 341 | state[22] = Asi; 342 | state[23] = Aso; 343 | state[24] = Asu; 344 | } 345 | 346 | /************************************************* 347 | * Name: keccak_init 348 | * 349 | * Description: Initializes the Keccak state. 350 | * 351 | * Arguments: - uint64_t *s: pointer to Keccak state 352 | **************************************************/ 353 | static void keccak_init(uint64_t s[25]) 354 | { 355 | unsigned int i; 356 | for(i=0;i<25;i++) 357 | s[i] = 0; 358 | } 359 | 360 | /************************************************* 361 | * Name: keccak_absorb 362 | * 363 | * Description: Absorb step of Keccak; incremental. 
364 | * 365 | * Arguments: - uint64_t *s: pointer to Keccak state 366 | * - unsigned int pos: position in current block to be absorbed 367 | * - unsigned int r: rate in bytes (e.g., 168 for SHAKE128) 368 | * - const uint8_t *in: pointer to input to be absorbed into s 369 | * - size_t inlen: length of input in bytes 370 | * 371 | * Returns new position pos in current block 372 | **************************************************/ 373 | static unsigned int keccak_absorb(uint64_t s[25], 374 | unsigned int pos, 375 | unsigned int r, 376 | const uint8_t *in, 377 | size_t inlen) 378 | { 379 | unsigned int i; 380 | 381 | while(pos+inlen >= r) { 382 | for(i=pos;i> 8*(i%8); 441 | outlen -= i-pos; 442 | pos = i; 443 | } 444 | 445 | return pos; 446 | } 447 | 448 | 449 | /************************************************* 450 | * Name: keccak_absorb_once 451 | * 452 | * Description: Absorb step of Keccak; 453 | * non-incremental, starts by zeroeing the state. 454 | * 455 | * Arguments: - uint64_t *s: pointer to (uninitialized) output Keccak state 456 | * - unsigned int r: rate in bytes (e.g., 168 for SHAKE128) 457 | * - const uint8_t *in: pointer to input to be absorbed into s 458 | * - size_t inlen: length of input in bytes 459 | * - uint8_t p: domain-separation byte for different Keccak-derived functions 460 | **************************************************/ 461 | static void keccak_absorb_once(uint64_t s[25], 462 | unsigned int r, 463 | const uint8_t *in, 464 | size_t inlen, 465 | uint8_t p) 466 | { 467 | unsigned int i; 468 | 469 | for(i=0;i<25;i++) 470 | s[i] = 0; 471 | 472 | while(inlen >= r) { 473 | for(i=0;is); 526 | state->pos = 0; 527 | } 528 | 529 | /************************************************* 530 | * Name: shake128_absorb 531 | * 532 | * Description: Absorb step of the SHAKE128 XOF; incremental. 
533 | * 534 | * Arguments: - keccak_state *state: pointer to (initialized) output Keccak state 535 | * - const uint8_t *in: pointer to input to be absorbed into s 536 | * - size_t inlen: length of input in bytes 537 | **************************************************/ 538 | void shake128_absorb(keccak_state *state, const uint8_t *in, size_t inlen) 539 | { 540 | state->pos = keccak_absorb(state->s, state->pos, SHAKE128_RATE, in, inlen); 541 | } 542 | 543 | /************************************************* 544 | * Name: shake128_finalize 545 | * 546 | * Description: Finalize absorb step of the SHAKE128 XOF. 547 | * 548 | * Arguments: - keccak_state *state: pointer to Keccak state 549 | **************************************************/ 550 | void shake128_finalize(keccak_state *state) 551 | { 552 | keccak_finalize(state->s, state->pos, SHAKE128_RATE, 0x1F); 553 | state->pos = SHAKE128_RATE; 554 | } 555 | 556 | /************************************************* 557 | * Name: shake128_squeeze 558 | * 559 | * Description: Squeeze step of SHAKE128 XOF. Squeezes arbitrarily many 560 | * bytes. Can be called multiple times to keep squeezing. 561 | * 562 | * Arguments: - uint8_t *out: pointer to output blocks 563 | * - size_t outlen : number of bytes to be squeezed (written to output) 564 | * - keccak_state *s: pointer to input/output Keccak state 565 | **************************************************/ 566 | void shake128_squeeze(uint8_t *out, size_t outlen, keccak_state *state) 567 | { 568 | state->pos = keccak_squeeze(out, outlen, state->s, state->pos, SHAKE128_RATE); 569 | } 570 | 571 | /************************************************* 572 | * Name: shake128_absorb_once 573 | * 574 | * Description: Initialize, absorb into and finalize SHAKE128 XOF; non-incremental.
575 | * 576 | * Arguments: - keccak_state *state: pointer to (uninitialized) output Keccak state 577 | * - const uint8_t *in: pointer to input to be absorbed into s 578 | * - size_t inlen: length of input in bytes 579 | **************************************************/ 580 | void shake128_absorb_once(keccak_state *state, const uint8_t *in, size_t inlen) 581 | { 582 | keccak_absorb_once(state->s, SHAKE128_RATE, in, inlen, 0x1F); 583 | state->pos = SHAKE128_RATE; 584 | } 585 | 586 | /************************************************* 587 | * Name: shake128_squeezeblocks 588 | * 589 | * Description: Squeeze step of SHAKE128 XOF. Squeezes full blocks of 590 | * SHAKE128_RATE bytes each. Can be called multiple times 591 | * to keep squeezing. Assumes new block has not yet been 592 | * started (state->pos = SHAKE128_RATE). 593 | * 594 | * Arguments: - uint8_t *out: pointer to output blocks 595 | * - size_t nblocks: number of blocks to be squeezed (written to output) 596 | * - keccak_state *s: pointer to input/output Keccak state 597 | **************************************************/ 598 | void shake128_squeezeblocks(uint8_t *out, size_t nblocks, keccak_state *state) 599 | { 600 | keccak_squeezeblocks(out, nblocks, state->s, SHAKE128_RATE); 601 | } 602 | 603 | /************************************************* 604 | * Name: shake256_init 605 | * 606 | * Description: Initializes Keccak state for use as SHAKE256 XOF 607 | * 608 | * Arguments: - keccak_state *state: pointer to (uninitialized) Keccak state 609 | **************************************************/ 610 | void shake256_init(keccak_state *state) 611 | { 612 | keccak_init(state->s); 613 | state->pos = 0; 614 | } 615 | 616 | /************************************************* 617 | * Name: shake256_absorb 618 | * 619 | * Description: Absorb step of the SHAKE256 XOF; incremental.
620 | * 621 | * Arguments: - keccak_state *state: pointer to (initialized) output Keccak state 622 | * - const uint8_t *in: pointer to input to be absorbed into s 623 | * - size_t inlen: length of input in bytes 624 | **************************************************/ 625 | void shake256_absorb(keccak_state *state, const uint8_t *in, size_t inlen) 626 | { 627 | state->pos = keccak_absorb(state->s, state->pos, SHAKE256_RATE, in, inlen); 628 | } 629 | 630 | /************************************************* 631 | * Name: shake256_finalize 632 | * 633 | * Description: Finalize absorb step of the SHAKE256 XOF. 634 | * 635 | * Arguments: - keccak_state *state: pointer to Keccak state 636 | **************************************************/ 637 | void shake256_finalize(keccak_state *state) 638 | { 639 | keccak_finalize(state->s, state->pos, SHAKE256_RATE, 0x1F); 640 | state->pos = SHAKE256_RATE; 641 | } 642 | 643 | /************************************************* 644 | * Name: shake256_squeeze 645 | * 646 | * Description: Squeeze step of SHAKE256 XOF. Squeezes arbitrarily many 647 | * bytes. Can be called multiple times to keep squeezing. 648 | * 649 | * Arguments: - uint8_t *out: pointer to output blocks 650 | * - size_t outlen : number of bytes to be squeezed (written to output) 651 | * - keccak_state *s: pointer to input/output Keccak state 652 | **************************************************/ 653 | void shake256_squeeze(uint8_t *out, size_t outlen, keccak_state *state) 654 | { 655 | state->pos = keccak_squeeze(out, outlen, state->s, state->pos, SHAKE256_RATE); 656 | } 657 | 658 | /************************************************* 659 | * Name: shake256_absorb_once 660 | * 661 | * Description: Initialize, absorb into and finalize SHAKE256 XOF; non-incremental.
662 | * 663 | * Arguments: - keccak_state *state: pointer to (uninitialized) output Keccak state 664 | * - const uint8_t *in: pointer to input to be absorbed into s 665 | * - size_t inlen: length of input in bytes 666 | **************************************************/ 667 | void shake256_absorb_once(keccak_state *state, const uint8_t *in, size_t inlen) 668 | { 669 | keccak_absorb_once(state->s, SHAKE256_RATE, in, inlen, 0x1F); 670 | state->pos = SHAKE256_RATE; 671 | } 672 | 673 | /************************************************* 674 | * Name: shake256_squeezeblocks 675 | * 676 | * Description: Squeeze step of SHAKE256 XOF. Squeezes full blocks of 677 | * SHAKE256_RATE bytes each. Can be called multiple times 678 | * to keep squeezing. Assumes next block has not yet been 679 | * started (state->pos = SHAKE256_RATE). 680 | * 681 | * Arguments: - uint8_t *out: pointer to output blocks 682 | * - size_t nblocks: number of blocks to be squeezed (written to output) 683 | * - keccak_state *s: pointer to input/output Keccak state 684 | **************************************************/ 685 | void shake256_squeezeblocks(uint8_t *out, size_t nblocks, keccak_state *state) 686 | { 687 | keccak_squeezeblocks(out, nblocks, state->s, SHAKE256_RATE); 688 | } 689 | 690 | /************************************************* 691 | * Name: shake128 692 | * 693 | * Description: SHAKE128 XOF with non-incremental API 694 | * 695 | * Arguments: - uint8_t *out: pointer to output 696 | * - size_t outlen: requested output length in bytes 697 | * - const uint8_t *in: pointer to input 698 | * - size_t inlen: length of input in bytes 699 | **************************************************/ 700 | void shake128(uint8_t *out, size_t outlen, const uint8_t *in, size_t inlen) 701 | { 702 | size_t nblocks; 703 | keccak_state state; 704 | 705 | shake128_absorb_once(&state, in, inlen); 706 | nblocks = outlen/SHAKE128_RATE; 707 | shake128_squeezeblocks(out, nblocks, &state); 708 | outlen -= 
nblocks*SHAKE128_RATE; 709 | out += nblocks*SHAKE128_RATE; 710 | shake128_squeeze(out, outlen, &state); 711 | } 712 | 713 | /************************************************* 714 | * Name: shake256 715 | * 716 | * Description: SHAKE256 XOF with non-incremental API 717 | * 718 | * Arguments: - uint8_t *out: pointer to output 719 | * - size_t outlen: requested output length in bytes 720 | * - const uint8_t *in: pointer to input 721 | * - size_t inlen: length of input in bytes 722 | **************************************************/ 723 | void shake256(uint8_t *out, size_t outlen, const uint8_t *in, size_t inlen) 724 | { 725 | size_t nblocks; 726 | keccak_state state; 727 | 728 | shake256_absorb_once(&state, in, inlen); 729 | nblocks = outlen/SHAKE256_RATE; 730 | shake256_squeezeblocks(out, nblocks, &state); 731 | outlen -= nblocks*SHAKE256_RATE; 732 | out += nblocks*SHAKE256_RATE; 733 | shake256_squeeze(out, outlen, &state); 734 | } 735 | 736 | /************************************************* 737 | * Name: sha3_256 738 | * 739 | * Description: SHA3-256 with non-incremental API 740 | * 741 | * Arguments: - uint8_t *h: pointer to output (32 bytes) 742 | * - const uint8_t *in: pointer to input 743 | * - size_t inlen: length of input in bytes 744 | **************************************************/ 745 | void sha3_256(uint8_t h[32], const uint8_t *in, size_t inlen) 746 | { 747 | unsigned int i; 748 | uint64_t s[25]; 749 | 750 | keccak_absorb_once(s, SHA3_256_RATE, in, inlen, 0x06); 751 | KeccakF1600_StatePermute(s); 752 | for(i=0;i<4;i++) 753 | store64(h+8*i,s[i]); 754 | } 755 | 756 | /************************************************* 757 | * Name: sha3_512 758 | * 759 | * Description: SHA3-512 with non-incremental API 760 | * 761 | * Arguments: - uint8_t *h: pointer to output (64 bytes) 762 | * - const uint8_t *in: pointer to input 763 | * - size_t inlen: length of input in bytes 764 | **************************************************/ 765 | void sha3_512(uint8_t 
h[64], const uint8_t *in, size_t inlen) 766 | { 767 | unsigned int i; 768 | uint64_t s[25]; 769 | 770 | keccak_absorb_once(s, SHA3_512_RATE, in, inlen, 0x06); 771 | KeccakF1600_StatePermute(s); 772 | for(i=0;i<8;i++) 773 | store64(h+8*i,s[i]); 774 | } 775 | -------------------------------------------------------------------------------- /poly-arith/fips202.h: -------------------------------------------------------------------------------- 1 | #ifndef FIPS202_H 2 | #define FIPS202_H 3 | 4 | #include <stddef.h> 5 | #include <stdint.h> 6 | 7 | #define SHAKE128_RATE 168 8 | #define SHAKE256_RATE 136 9 | #define SHA3_256_RATE 136 10 | #define SHA3_512_RATE 72 11 | 12 | #define FIPS202_NAMESPACE(s) pqmx_fips202_ref_##s 13 | 14 | typedef struct { 15 | uint64_t s[25]; 16 | unsigned int pos; 17 | } keccak_state; 18 | 19 | #define shake128_init FIPS202_NAMESPACE(shake128_init) 20 | void shake128_init(keccak_state *state); 21 | #define shake128_absorb FIPS202_NAMESPACE(shake128_absorb) 22 | void shake128_absorb(keccak_state *state, const uint8_t *in, size_t inlen); 23 | #define shake128_finalize FIPS202_NAMESPACE(shake128_finalize) 24 | void shake128_finalize(keccak_state *state); 25 | #define shake128_squeeze FIPS202_NAMESPACE(shake128_squeeze) 26 | void shake128_squeeze(uint8_t *out, size_t outlen, keccak_state *state); 27 | #define shake128_absorb_once FIPS202_NAMESPACE(shake128_absorb_once) 28 | void shake128_absorb_once(keccak_state *state, const uint8_t *in, size_t inlen); 29 | #define shake128_squeezeblocks FIPS202_NAMESPACE(shake128_squeezeblocks) 30 | void shake128_squeezeblocks(uint8_t *out, size_t nblocks, keccak_state *state); 31 | 32 | #define shake256_init FIPS202_NAMESPACE(shake256_init) 33 | void shake256_init(keccak_state *state); 34 | #define shake256_absorb FIPS202_NAMESPACE(shake256_absorb) 35 | void shake256_absorb(keccak_state *state, const uint8_t *in, size_t inlen); 36 | #define shake256_finalize FIPS202_NAMESPACE(shake256_finalize) 37 | void shake256_finalize(keccak_state
*state); 38 | #define shake256_squeeze FIPS202_NAMESPACE(shake256_squeeze) 39 | void shake256_squeeze(uint8_t *out, size_t outlen, keccak_state *state); 40 | #define shake256_absorb_once FIPS202_NAMESPACE(shake256_absorb_once) 41 | void shake256_absorb_once(keccak_state *state, const uint8_t *in, size_t inlen); 42 | #define shake256_squeezeblocks FIPS202_NAMESPACE(shake256_squeezeblocks) 43 | void shake256_squeezeblocks(uint8_t *out, size_t nblocks, keccak_state *state); 44 | 45 | #define shake128 FIPS202_NAMESPACE(shake128) 46 | void shake128(uint8_t *out, size_t outlen, const uint8_t *in, size_t inlen); 47 | #define shake256 FIPS202_NAMESPACE(shake256) 48 | void shake256(uint8_t *out, size_t outlen, const uint8_t *in, size_t inlen); 49 | #define sha3_256 FIPS202_NAMESPACE(sha3_256) 50 | void sha3_256(uint8_t h[32], const uint8_t *in, size_t inlen); 51 | #define sha3_512 FIPS202_NAMESPACE(sha3_512) 52 | void sha3_512(uint8_t h[64], const uint8_t *in, size_t inlen); 53 | 54 | #endif 55 | -------------------------------------------------------------------------------- /poly-arith/ntt.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include "params.h" 3 | #include "ntt.h" 4 | #include "reduce.h" 5 | 6 | // generate with the following SageMath code: 7 | // _zetas = [(mont_r * pow(rou, br(i,7), q)) % q for i in range(128)] 8 | // zetas = [(int(a)-q) if (a > (q-1)//2) else a for a in _zetas] 9 | const int64_t zetas[PQMX_N] = { 10 | 532719943776463611, 1712824210598473412, -1370295354093451833, 124262200749178965, -714223376490581314, -138275710883642869, -1102926763625050, 1767657196266643088, -279197397721253178, 356492044517202178, 1487661701559950979, 1304103426002015710, 1629618216410753009, 1445036730941650804, 1619116696143497796, 105245167723797935, 367723163442095531, -1487071528781225970, 1778506941797319594, -1394647439355336766, 518804722893076255, 141291798759015968, -1463055328214512227, 
1403678053431936803, 1629369437266972731, 116935029746966406, 1153404459772611806, 667428405244972246, 1787810736653776682, 347928242243858624, -358756082137868769, -178447815807751549, 998523887703719690, 145591344792672111, -1035093695668908134, -713763388126671157, -1025203798968210061, 918295552878371182, 837927154393850157, -397395448593122216, -749263788221464499, -726376690825334826, 1110731177573170827, -85056957706944670, 1404312931376851606, 400469219639645685, 270310627308674971, 1768608497918996086, 502419640554247474, 931599578710541146, -1180772881415063460, 735294279160574257, 1755306377104268619, -1085009530551386428, 1550895881188125058, -958859323388743482, -50256176979663429, -898615674107773859, 156591195085678140, 1156811988645169641, 716682534159876124, -776091615957528220, 1154435461316943051, 988374064973051516 11 | }; 12 | 13 | 14 | /************************************************* 15 | * Name: fqmul 16 | * 17 | * Description: Multiplication followed by Montgomery reduction 18 | * 19 | * Arguments: - int64_t a: first factor 20 | * - int64_t b: second factor 21 | * 22 | * Returns 64-bit integer congruent to a*b*R^{-1} mod q 23 | **************************************************/ 24 | int64_t fqmul(int64_t a, int64_t b) { 25 | return montgomery_reduce((__int128)a*b); 26 | } 27 | 28 | /************************************************* 29 | * Name: ntt 30 | * 31 | * Description: Inplace number-theoretic transform (NTT) in Rq.
32 | * input is in standard order, output is in bitreversed order 33 | * 34 | * Arguments: - int64_t r[PQMX_N]: pointer to input/output vector of elements of Zq 35 | **************************************************/ 36 | void ntt4(int64_t r[PQMX_N]) { 37 | unsigned int len, start, j, k; 38 | int64_t t, zeta; 39 | 40 | k = 1; 41 | for(len = PQMX_N/2; len >= 4; len >>= 1) { 42 | for(start = 0; start < PQMX_N; start = j + len) { 43 | zeta = zetas[k++]; 44 | for(j = start; j < start + len; j++) { 45 | t = fqmul(zeta, r[j + len]); 46 | r[j + len] = barrett_reduce(r[j]) - t; 47 | r[j] = barrett_reduce(r[j]) + t; 48 | } 49 | } 50 | } 51 | } 52 | 53 | void ntt8(int64_t r[PQMX_N]) { 54 | unsigned int len, start, j, k; 55 | int64_t t, zeta; 56 | 57 | k = 1; 58 | for(len = PQMX_N/2; len >= 8; len >>= 1) { 59 | for(start = 0; start < PQMX_N; start = j + len) { 60 | zeta = zetas[k++]; 61 | for(j = start; j < start + len; j++) { 62 | t = fqmul(zeta, r[j + len]); 63 | r[j + len] = barrett_reduce(r[j]) - t; 64 | r[j] = barrett_reduce(r[j]) + t; 65 | } 66 | } 67 | } 68 | } 69 | 70 | void ntt_full(int64_t r[PQMX_N]) { 71 | unsigned int len, start, j, k; 72 | int64_t t, zeta; 73 | 74 | k = 1; 75 | for(len = PQMX_N/2; len > 0; len >>= 1) { 76 | for(start = 0; start < PQMX_N; start = j + len) { 77 | zeta = zetas[k++]; 78 | for(j = start; j < start + len; j++) { 79 | t = fqmul(zeta, r[j + len]); 80 | r[j + len] = barrett_reduce(r[j]) - t; 81 | r[j] = barrett_reduce(r[j]) + t; 82 | } 83 | } 84 | } 85 | } 86 | 87 | /************************************************* 88 | * Name: invntt 89 | * 90 | * Description: Inplace inverse number-theoretic transform in Rq and 91 | * multiplication by the Montgomery factor R.
92 | * Input is in bitreversed order, output is in standard order 93 | * 94 | * Arguments: - int64_t r[PQMX_N]: pointer to input/output vector of elements of Zq 95 | **************************************************/ 96 | void invntt(int64_t r[PQMX_N]) { 97 | unsigned int start, len, j, k; 98 | int64_t t, zeta; 99 | const int64_t f = PQMX_MONT4; // mont^2/PQMX_L 100 | 101 | k = PQMX_L-1; 102 | for(len = (PQMX_N/PQMX_L); len < PQMX_N; len <<= 1) { 103 | for(start = 0; start < PQMX_N; start = j + len) { 104 | zeta = zetas[k--]; 105 | for(j = start; j < start + len; j++) { 106 | t = r[j]; 107 | r[j] = barrett_reduce(t + r[j + len]); 108 | r[j + len] = barrett_reduce(r[j + len] - t); 109 | r[j + len] = fqmul(zeta, r[j + len]); 110 | } 111 | } 112 | } 113 | for(j = 0; j < PQMX_N; j++) 114 | r[j] = fqmul(r[j], f); 115 | 116 | } 117 | 118 | /************************************************* 119 | * Name: basemul4 120 | * 121 | * Description: Multiplication of polynomials in Zq[X]/(X^4-zeta) 122 | * used for multiplication of elements in Rq in NTT domain 123 | * 124 | * Arguments: - int64_t r[4]: pointer to the output polynomial 125 | * - const int64_t a[4]: pointer to the first factor 126 | * - const int64_t b[4]: pointer to the second factor 127 | * - int64_t zeta: integer defining the reduction polynomial 128 | **************************************************/ 129 | void basemul4(int64_t r[4], const int64_t a[4], const int64_t b[4], int64_t zeta) 130 | { 131 | int64_t tmp, rr[4]; 132 | 133 | rr[0] = fqmul(a[0], b[0]); 134 | tmp = fqmul(a[1], b[3]); 135 | rr[0] += fqmul(tmp, zeta); 136 | tmp = fqmul(a[2], b[2]); 137 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 138 | tmp = fqmul(a[3], b[1]); 139 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 140 | 141 | rr[1] = fqmul(a[0], b[1]); 142 | rr[1] += fqmul(a[1], b[0]); 143 | tmp = fqmul(a[2], b[3]); 144 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 145 | tmp = fqmul(a[3], b[2]); 146 | rr[1] = 
barrett_reduce(rr[1]) + fqmul(tmp, zeta); 147 | 148 | rr[2] = fqmul(a[0],b[2]); 149 | rr[2] += fqmul(a[1],b[1]); 150 | rr[2] = barrett_reduce(rr[2]) + fqmul(a[2],b[0]); 151 | tmp = fqmul(a[3], b[3]); 152 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 153 | 154 | rr[3] = fqmul(a[0],b[3]); 155 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[1],b[2]); 156 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[2],b[1]); 157 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[3],b[0]); 158 | 159 | r[0] = rr[0]; 160 | r[1] = rr[1]; 161 | r[2] = rr[2]; 162 | r[3] = rr[3]; 163 | } 164 | 165 | void basemul8(int64_t r[8], const int64_t a[8], const int64_t b[8], int64_t zeta) 166 | { 167 | int64_t tmp, rr[8]; 168 | 169 | rr[0] = fqmul(a[0], b[0]); 170 | tmp = fqmul(a[1], b[7]); 171 | rr[0] += fqmul(tmp, zeta); 172 | tmp = fqmul(a[2], b[6]); 173 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 174 | tmp = fqmul(a[3], b[5]); 175 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 176 | tmp = fqmul(a[4], b[4]); 177 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 178 | tmp = fqmul(a[5], b[3]); 179 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 180 | tmp = fqmul(a[6], b[2]); 181 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 182 | tmp = fqmul(a[7], b[1]); 183 | rr[0] = barrett_reduce(rr[0]) + fqmul(tmp, zeta); 184 | 185 | rr[1] = fqmul(a[0], b[1]); 186 | rr[1] += fqmul(a[1], b[0]); 187 | tmp = fqmul(a[2], b[7]); 188 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 189 | tmp = fqmul(a[3], b[6]); 190 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 191 | tmp = fqmul(a[4], b[5]); 192 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 193 | tmp = fqmul(a[5], b[4]); 194 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 195 | tmp = fqmul(a[6], b[3]); 196 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 197 | tmp = fqmul(a[7], b[2]); 198 | rr[1] = barrett_reduce(rr[1]) + fqmul(tmp, zeta); 199 | 200 | rr[2] = fqmul(a[0],b[2]); 201 | rr[2] += fqmul(a[1],b[1]); 202 | rr[2] = 
barrett_reduce(rr[2]) + fqmul(a[2],b[0]); 203 | tmp = fqmul(a[3], b[7]); 204 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 205 | tmp = fqmul(a[4], b[6]); 206 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 207 | tmp = fqmul(a[5], b[5]); 208 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 209 | tmp = fqmul(a[6], b[4]); 210 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 211 | tmp = fqmul(a[7], b[3]); 212 | rr[2] = barrett_reduce(rr[2]) + fqmul(tmp, zeta); 213 | 214 | rr[3] = fqmul(a[0],b[3]); 215 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[1],b[2]); 216 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[2],b[1]); 217 | rr[3] = barrett_reduce(rr[3]) + fqmul(a[3],b[0]); 218 | tmp = fqmul(a[4], b[7]); 219 | rr[3] = barrett_reduce(rr[3]) + fqmul(tmp, zeta); 220 | tmp = fqmul(a[5], b[6]); 221 | rr[3] = barrett_reduce(rr[3]) + fqmul(tmp, zeta); 222 | tmp = fqmul(a[6], b[5]); 223 | rr[3] = barrett_reduce(rr[3]) + fqmul(tmp, zeta); 224 | tmp = fqmul(a[7], b[4]); 225 | rr[3] = barrett_reduce(rr[3]) + fqmul(tmp, zeta); 226 | 227 | rr[4] = fqmul(a[0],b[4]); 228 | rr[4] = barrett_reduce(rr[4]) + fqmul(a[1],b[3]); 229 | rr[4] = barrett_reduce(rr[4]) + fqmul(a[2],b[2]); 230 | rr[4] = barrett_reduce(rr[4]) + fqmul(a[3],b[1]); 231 | rr[4] = barrett_reduce(rr[4]) + fqmul(a[4],b[0]); 232 | tmp = fqmul(a[5], b[7]); 233 | rr[4] = barrett_reduce(rr[4]) + fqmul(tmp, zeta); 234 | tmp = fqmul(a[6], b[6]); 235 | rr[4] = barrett_reduce(rr[4]) + fqmul(tmp, zeta); 236 | tmp = fqmul(a[7], b[5]); 237 | rr[4] = barrett_reduce(rr[4]) + fqmul(tmp, zeta); 238 | 239 | rr[5] = fqmul(a[0],b[5]); 240 | rr[5] = barrett_reduce(rr[5]) + fqmul(a[1],b[4]); 241 | rr[5] = barrett_reduce(rr[5]) + fqmul(a[2],b[3]); 242 | rr[5] = barrett_reduce(rr[5]) + fqmul(a[3],b[2]); 243 | rr[5] = barrett_reduce(rr[5]) + fqmul(a[4],b[1]); 244 | rr[5] = barrett_reduce(rr[5]) + fqmul(a[5],b[0]); 245 | tmp = fqmul(a[6], b[7]); 246 | rr[5] = barrett_reduce(rr[5]) + fqmul(tmp, zeta); 247 | tmp = fqmul(a[7], b[6]); 248 | 
  rr[5] = barrett_reduce(rr[5]) + fqmul(tmp, zeta);

  rr[6] = fqmul(a[0],b[6]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[1],b[5]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[2],b[4]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[3],b[3]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[4],b[2]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[5],b[1]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(a[6],b[0]);
  tmp = fqmul(a[7], b[7]);
  rr[6] = barrett_reduce(rr[6]) + fqmul(tmp, zeta);

  rr[7] = fqmul(a[0],b[7]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[1],b[6]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[2],b[5]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[3],b[4]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[4],b[3]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[5],b[2]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[6],b[1]);
  rr[7] = barrett_reduce(rr[7]) + fqmul(a[7],b[0]);

  r[0] = rr[0];
  r[1] = rr[1];
  r[2] = rr[2];
  r[3] = rr[3];
  r[4] = rr[4];
  r[5] = rr[5];
  r[6] = rr[6];
  r[7] = rr[7];
}

/*************************************************
* Name:        scalar_field_mul
*
* Description: Scalar multiplication of a polynomial in Zq[X]/(X^4-zeta)
*
* Arguments:   - int64_t r[4]: pointer to the output polynomial
*              - const int64_t a: scalar factor
*              - const int64_t b[4]: pointer to the input polynomial factor
**************************************************/
void scalar_field_mul(int64_t r[4], const int64_t a, const int64_t b[4]){
  unsigned int i;
  for(i=0;i<4;i++){
    r[i] = fqmul(a, b[i]);
    r[i] = fqmul(r[i], PQMX_MONT2);
  }
}

/*************************************************
* Name:        field_mul
*
* Description: Field multiplication in Zq.
*
* Arguments:   - int64_t *c: pointer to the output element
*              - int64_t *a: pointer to the first field element
*              - int64_t *b: pointer to the second field element
**************************************************/
void field_mul(int64_t *c, int64_t *a, int64_t *b) {
  *c = fqmul(*a, *b);
}
--------------------------------------------------------------------------------
/poly-arith/ntt.h:
--------------------------------------------------------------------------------
#ifndef NTT_H
#define NTT_H

#include 
#include "params.h"

#define zetas PQMX_NAMESPACE(zetas)
extern const int64_t zetas[PQMX_N];

#define ntt4 PQMX_NAMESPACE(ntt4)
void ntt4(int64_t poly[PQMX_N]);

#define ntt8 PQMX_NAMESPACE(ntt8)
void ntt8(int64_t poly[PQMX_N]);

#define ntt_full PQMX_NAMESPACE(ntt_full)
void ntt_full(int64_t poly[PQMX_N]);

#define invntt PQMX_NAMESPACE(invntt)
void invntt(int64_t poly[PQMX_N]);

#define basemul4 PQMX_NAMESPACE(basemul4)
void basemul4(int64_t r[4], const int64_t a[4], const int64_t b[4], int64_t zeta);

#define basemul8 PQMX_NAMESPACE(basemul8)
void basemul8(int64_t r[8], const int64_t a[8], const int64_t b[8], int64_t zeta);

#define scalar_field_mul PQMX_NAMESPACE(scalar_field_mul)
void scalar_field_mul(int64_t r[4], const int64_t a, const int64_t b[4]);

#define field_mul PQMX_NAMESPACE(field_mul)
void field_mul(int64_t *c, int64_t *a, int64_t *b);

#define fqmul PQMX_NAMESPACE(fqmul)
int64_t fqmul(int64_t a, int64_t b);

#endif

--------------------------------------------------------------------------------
/poly-arith/params.h:
--------------------------------------------------------------------------------
#ifndef PARAMS_H
#define PARAMS_H

#define PQMX_NAMESPACE(s) PQMX_##s

#if DEGREE == 64

#define PQMX_N 64
#define PQMX_L 16
#define 
PQMX_Q 3582804825986617601 11 | #define PQMX_BETA PQMX_N 12 | #define PQMX_ETA 2 13 | 14 | #define PQMX_MONT 532719943776463611L // 2^64 mod q 15 | #define PQMX_MONT2 2729501898859279540L // 2^128 mod q 16 | #define PQMX_MONT3 480494926098756462L // 1/MONT mod q 17 | #define PQMX_MONT4 -725107337817949429L // MONT^2/PQMX_L mod q 18 | #define PQMX_QINV 15972825632445050625UL // q^-1 mod 2^64 19 | 20 | #elif DEGREE == 128 21 | 22 | #define PQMX_N 128 23 | #define PQMX_L 32 24 | #define PQMX_Q 3582804825986617601 25 | #define PQMX_BETA PQMX_N 26 | #define PQMX_ETA 2 27 | 28 | #define PQMX_MONT 532719943776463611L // 2^64 mod q 29 | #define PQMX_MONT2 2729501898859279540L // 2^128 mod q 30 | #define PQMX_MONT3 480494926098756462L // 1/MONT mod q 31 | #define PQMX_MONT4 1428848744084334086L // MONT^2/PQMX_L mod q 32 | #define PQMX_QINV 15972825632445050625UL // q^-1 mod 2^64 33 | 34 | #elif DEGREE == 256 35 | 36 | #define PQMX_N 256 37 | #define PQMX_L 32 38 | #define PQMX_Q 3582804825986617601 39 | #define PQMX_BETA PQMX_N 40 | #define PQMX_ETA 2 41 | 42 | #define PQMX_MONT 532719943776463611L // 2^64 mod q 43 | #define PQMX_MONT2 2729501898859279540L // 2^128 mod q 44 | #define PQMX_MONT3 480494926098756462L // 1/MONT mod q 45 | #define PQMX_MONT4 1428848744084334086L // MONT^2/PQMX_L mod q 46 | #define PQMX_QINV 15972825632445050625UL // q^-1 mod 2^64 47 | 48 | #endif 49 | 50 | #define PQMX_SYMBYTES (PQMX_N/8) /* size in bytes of hashes, and seeds */ 51 | #define PQMX_SSBYTES 32 /* size in bytes of shared key */ 52 | 53 | #define PQMX_POLYBYTES (PQMX_N*sizeof(int64_t)) 54 | 55 | #define PQMX_MU 1 56 | #define PQMX_LAMBDA 1 57 | #define PQMX_NV 10 58 | 59 | #ifndef SHORT 60 | #define PQMX_M (8*PQMX_NV+4+PQMX_ETA) 61 | #define PQMX_DELTA (1L<<43) 62 | #define PQMX_POLYCOMPRESSEDBYTES 22528 63 | #else 64 | #define PQMX_M (4*PQMX_LAMBDA+20*PQMX_NV+3) 65 | #define PQMX_DELTA (1L<<47) 66 | #define PQMX_POLYCOMPRESSEDBYTES 24576 67 | #endif 68 | 69 | #define 
PQMX_INDCPA_MSGBYTES (PQMX_SYMBYTES) 70 | #define PQMX_INDCPA_PUBLICKEYBYTES (PQMX_POLYVECBYTES + PQMX_SYMBYTES) 71 | #define PQMX_INDCPA_SECRETKEYBYTES (PQMX_POLYVECBYTES) 72 | #define PQMX_INDCPA_BYTES (PQMX_POLYVECCOMPRESSEDBYTES + PQMX_POLYCOMPRESSEDBYTES) 73 | 74 | 75 | #endif 76 | -------------------------------------------------------------------------------- /poly-arith/poly.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "params.h" 4 | #include "poly.h" 5 | #include "ntt.h" 6 | #include "reduce.h" 7 | #include "symmetric.h" 8 | 9 | /* Polynomial defining the cyclotomic ring. */ 10 | nmod_poly_t cyclo_poly, crt_poly[2]; 11 | 12 | /************************************************* 13 | * Name: rej_uniform 14 | * 15 | * Description: Run rejection sampling on uniform random bytes to generate 16 | * uniform random integers mod q 17 | * 18 | * Arguments: - int64_t *r: pointer to output buffer 19 | * - unsigned int len: requested number of 64-bit integers (uniform mod q) 20 | * - const uint8_t *buf: pointer to input buffer (assumed to be uniformly random bytes) 21 | * - unsigned int buflen: length of input buffer in bytes 22 | * 23 | * Returns number of sampled 64-bit integers (at most len) 24 | **************************************************/ 25 | static unsigned int rej_uniform(int64_t *r, 26 | unsigned int len, 27 | const uint8_t *buf, 28 | unsigned int buflen) 29 | { 30 | unsigned int ctr, pos, j; 31 | 32 | uint64_t t; 33 | ctr = pos = 0; 34 | while(ctr < len && pos + 8 <= buflen) { 35 | t = buf[pos++]; 36 | for(j=8;j<64;j+=8) 37 | t |= (uint64_t)buf[pos++] << j; 38 | t &= (1L << 62)-1; 39 | 40 | if(t < PQMX_Q) 41 | r[ctr++] = t; 42 | } 43 | return ctr; 44 | } 45 | 46 | /************************************************* 47 | * Name: poly_uniform 48 | * 49 | * Description: Generate uniform random polynomial 50 | * 51 | * Arguments: - const poly *r: pointer to output polynomial 52 | * 
- const uint8_t seed[]: pointer to input buffer 53 | * (assumed to be uniformly random bytes) of length PQMX_SYMBYTES 54 | * - uint32_t nonce: 32-bit nonce 55 | **************************************************/ 56 | #define POLY_UNIFORM_NBLOCKS (PQMX_POLYBYTES + XOF_BLOCKBYTES - 1)/(XOF_BLOCKBYTES) 57 | void poly_uniform(poly *r, const uint8_t seed[PQMX_SYMBYTES], uint32_t nonce) 58 | { 59 | unsigned int i, ctr, off, buflen; 60 | xof_state state; 61 | uint8_t rnd[POLY_UNIFORM_NBLOCKS*XOF_BLOCKBYTES+2]; 62 | memset(rnd,0,sizeof(rnd)); 63 | memset(r->coeffs, 0, PQMX_POLYBYTES); 64 | 65 | buflen = POLY_UNIFORM_NBLOCKS*XOF_BLOCKBYTES; 66 | 67 | xof_absorb(&state, seed, nonce); 68 | xof_squeezeblocks(rnd, POLY_UNIFORM_NBLOCKS, &state); 69 | 70 | ctr = rej_uniform(r->coeffs, PQMX_N, rnd, buflen); 71 | 72 | while(ctrcoeffs + ctr, PQMX_N - ctr, rnd, buflen); 79 | } 80 | 81 | poly_reduce(r); 82 | } 83 | 84 | /************************************************* 85 | * Name: poly_ntt 86 | * 87 | * Description: Computes negacyclic number-theoretic transform (NTT) of 88 | * a polynomial in place; 89 | * inputs assumed to be in normal order, output in bitreversed order 90 | * 91 | * Arguments: - uint64_t *r: pointer to in/output polynomial 92 | **************************************************/ 93 | void poly_ntt4(poly *r) 94 | { 95 | ntt4(r->coeffs); 96 | poly_reduce(r); 97 | } 98 | 99 | void poly_ntt8(poly *r) 100 | { 101 | ntt8(r->coeffs); 102 | poly_reduce(r); 103 | } 104 | 105 | void poly_ntt_full(poly *r) 106 | { 107 | ntt_full(r->coeffs); 108 | poly_reduce(r); 109 | } 110 | 111 | /************************************************* 112 | * Name: poly_invntt_tomont 113 | * 114 | * Description: Computes inverse of negacyclic number-theoretic transform (NTT) 115 | * of a polynomial in place; 116 | * inputs assumed to be in bitreversed order, output in normal order 117 | * 118 | * Arguments: - uint64_t *a: pointer to in/output polynomial 119 | 
**************************************************/
void poly_invntt_tomont(poly *r)
{
  invntt(r->coeffs);
}

/*************************************************
* Name:        poly_basemul_montgomery4, poly_basemul_montgomery8
*
* Description: Multiplication of two polynomials in NTT domain
*
* Arguments:   - poly *r: pointer to output polynomial
*              - const poly *a: pointer to first input polynomial
*              - const poly *b: pointer to second input polynomial
**************************************************/
void poly_basemul_montgomery4(poly *r, const poly *a, const poly *b)
{
  unsigned int i;
  /* each iteration multiplies two degree-4 blocks (8 coefficients) */
  for(i=0;i<PQMX_N/8;i++) {
    basemul4(&r->coeffs[8*i], &a->coeffs[8*i], &b->coeffs[8*i], zetas[PQMX_L/2+i]);
    basemul4(&r->coeffs[8*i+4], &a->coeffs[8*i+4], &b->coeffs[8*i+4], -zetas[PQMX_L/2+i]);
  }
}

void poly_basemul_montgomery8(poly *r, const poly *a, const poly *b)
{
  unsigned int i;
  /* each iteration multiplies two degree-8 blocks (16 coefficients) */
  for(i=0;i<PQMX_N/16;i++) {
    basemul8(&r->coeffs[16*i], &a->coeffs[16*i], &b->coeffs[16*i], zetas[PQMX_L/2+i]);
    basemul8(&r->coeffs[16*i+8], &a->coeffs[16*i+8], &b->coeffs[16*i+8], -zetas[PQMX_L/2+i]);
  }
}

/*************************************************
* Name:        poly_pointwise_montgomery
*
* Description: Pointwise multiplication of polynomials in NTT domain
*              representation and multiplication of resulting polynomial
*              by 2^{-64}.
*
* Arguments:   - poly *c: pointer to output polynomial
*              - const poly *a: pointer to first input polynomial
*              - const poly *b: pointer to second input polynomial
**************************************************/
void poly_pointwise_montgomery(poly *c, const poly *a, const poly *b) {
  unsigned int i;

  /* widen to 128 bits before multiplying; a 64-bit product would overflow */
  for(i = 0; i < PQMX_N; ++i)
    c->coeffs[i] = montgomery_reduce((__int128)a->coeffs[i] * b->coeffs[i]);
}

/*************************************************
* Name:        poly_crtmul
*
* Description: CRT multiplication of two half-degree polynomials.
*
* Arguments:   - nmod_poly_t *c: pointer to the two output CRT components
*              - const nmod_poly_t *a: pointer to the CRT components of the first input
*              - const nmod_poly_t *b: pointer to the CRT components of the second input
**************************************************/
void poly_crtmul(nmod_poly_t *c, const nmod_poly_t *a, const nmod_poly_t *b) {
  nmod_poly_mulmod(c[0], a[0], b[0], crt_poly[0]);
  nmod_poly_mulmod(c[1], a[1], b[1], crt_poly[1]);
}

/*************************************************
* Name:        poly_tomont
*
* Description: Inplace conversion of all coefficients of a polynomial
*              from normal domain to Montgomery domain
*
* Arguments:   - poly *r: pointer to input/output polynomial
**************************************************/
void poly_tomont(poly *r)
{
  unsigned int i;
  for(i=0;i<PQMX_N;i++)
    r->coeffs[i] = montgomery_reduce((__int128)r->coeffs[i]*PQMX_MONT2);
}

/*************************************************
* Name:        poly_reduce
*
* Description: Applies Barrett reduction to all coefficients of a polynomial
*              for details of the Barrett reduction see comments in reduce.c
*
* Arguments:   - poly *r: pointer to input/output polynomial
**************************************************/
void poly_reduce(poly
*r)
{
  unsigned int i;
  for(i=0;i<PQMX_N;i++)
    r->coeffs[i] = barrett_reduce(r->coeffs[i]);
}

/*************************************************
* Name:        poly_reduce_mont
*
* Description: Applies Montgomery reduction to all coefficients of a polynomial
*              for details of the Montgomery reduction see comments in reduce.c
*
* Arguments:   - poly *r: pointer to input/output polynomial
**************************************************/
void poly_reduce_mont(poly *r)
{
  unsigned int i;
  for(i=0;i<PQMX_N;i++)
    r->coeffs[i] = montgomery_reduce( (__int128) r->coeffs[i] );
}


/*************************************************
* Name:        poly_add
*
* Description: Add two polynomials; no modular reduction is performed
*
* Arguments:   - poly *r: pointer to output polynomial
*              - const poly *a: pointer to first input polynomial
*              - const poly *b: pointer to second input polynomial
**************************************************/
void poly_add(poly *r, const poly *a, const poly *b)
{
  unsigned int i;
  for(i=0;i<PQMX_N;i++)
    r->coeffs[i] = a->coeffs[i] + b->coeffs[i];
}

/*************************************************
* Name:        poly_sub
*
* Description: Subtract two polynomials; no modular reduction is performed
*
* Arguments:   - poly *r: pointer to output polynomial
*              - const poly *a: pointer to first input polynomial
*              - const poly *b: pointer to second input polynomial
**************************************************/
void poly_sub(poly *r, const poly *a, const poly *b)
{
  unsigned int i;
  for(i=0;i<PQMX_N;i++)
    r->coeffs[i] = a->coeffs[i] - b->coeffs[i];
}

/*************************************************
* Name:        poly_shift
*
* Description: Inplace shift of the polynomial coefficients up by one position
*              (multiplication by X), negating the wrapped-around
coefficient.
*
* Arguments:   - poly *r: pointer to input/output polynomial
**************************************************/
void poly_shift(poly *r){
  int64_t tmp = r->coeffs[PQMX_N-1];
  unsigned int i;
  for(i=PQMX_N-1;i>0;i--){
    r->coeffs[i] = r->coeffs[i-1];
  }
  r->coeffs[0] = -1*tmp;
}
--------------------------------------------------------------------------------
/poly-arith/poly.h:
--------------------------------------------------------------------------------
#ifndef POLY_H
#define POLY_H

#include 
#include "params.h"

#include 
#include 

/*
 * Elements of R_q = Z_q[X]/(X^n + 1). Represents polynomial
 * coeffs[0] + X*coeffs[1] + X^2*coeffs[2] + ... + X^{n-1}*coeffs[n-1]
 */
typedef struct{
  int64_t coeffs[PQMX_N];
} poly;


#define poly_uniform PQMX_NAMESPACE(poly_uniform)
void poly_uniform(poly *r, const uint8_t seed[PQMX_SYMBYTES], uint32_t nonce);

#define poly_ntt4 PQMX_NAMESPACE(poly_ntt4)
void poly_ntt4(poly *r);
#define poly_ntt8 PQMX_NAMESPACE(poly_ntt8)
void poly_ntt8(poly *r);
#define poly_ntt_full PQMX_NAMESPACE(poly_ntt_full)
void poly_ntt_full(poly *r);
#define poly_crtmul PQMX_NAMESPACE(poly_crtmul)
void poly_crtmul(nmod_poly_t *c, const nmod_poly_t *a, const nmod_poly_t *b);
#define poly_invntt_tomont PQMX_NAMESPACE(poly_invntt_tomont)
void poly_invntt_tomont(poly *r);
#define poly_basemul_montgomery4 PQMX_NAMESPACE(poly_basemul_montgomery4)
void poly_basemul_montgomery4(poly *r, const poly *a, const poly *b);
#define poly_basemul_montgomery8 PQMX_NAMESPACE(poly_basemul_montgomery8)
void poly_basemul_montgomery8(poly *r, const poly *a, const poly *b);
#define poly_pointwise_montgomery PQMX_NAMESPACE(poly_pointwise_montgomery)
void poly_pointwise_montgomery(poly *r, const poly *a, const poly *b);

#define 
poly_basemul_acc PQMX_NAMESPACE(poly_basemul_acc) 40 | void poly_basemul_acc(int64_t r[4], const poly *a, const poly *b); 41 | 42 | #define poly_tomont PQMX_NAMESPACE(poly_tomont) 43 | void poly_tomont(poly *r); 44 | 45 | #define poly_inner_prod_mont PQMX_NAMESPACE(poly_inner_prod_mont) 46 | void poly_inner_prod_mont(poly *r, const poly *a, const poly *b); 47 | 48 | 49 | #define poly_reduce PQMX_NAMESPACE(poly_reduce) 50 | void poly_reduce(poly *r); 51 | #define poly_reduce_mont PQMX_NAMESPACE(poly_reduce_mont) 52 | void poly_reduce_mont(poly *r); 53 | 54 | #define poly_add PQMX_NAMESPACE(poly_add) 55 | void poly_add(poly *r, const poly *a, const poly *b); 56 | #define poly_sub PQMX_NAMESPACE(poly_sub) 57 | void poly_sub(poly *r, const poly *a, const poly *b); 58 | 59 | #define poly_shift PQMX_NAMESPACE(poly_shift) 60 | void poly_shift(poly *r); 61 | #endif 62 | -------------------------------------------------------------------------------- /poly-arith/randombytes.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include "randombytes.h" 5 | 6 | #ifdef _WIN32 7 | #include 8 | #include 9 | #else 10 | #include 11 | #include 12 | #ifdef __linux__ 13 | #define _GNU_SOURCE 14 | #include 15 | #include 16 | #else 17 | #include 18 | #endif 19 | #endif 20 | 21 | #ifdef _WIN32 22 | void randombytes(uint8_t *out, size_t outlen) { 23 | HCRYPTPROV ctx; 24 | size_t len; 25 | 26 | if(!CryptAcquireContext(&ctx, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT)) 27 | abort(); 28 | 29 | while(outlen > 0) { 30 | len = (outlen > 1048576) ? 
1048576 : outlen; 31 | if(!CryptGenRandom(ctx, len, (BYTE *)out)) 32 | abort(); 33 | 34 | out += len; 35 | outlen -= len; 36 | } 37 | 38 | if(!CryptReleaseContext(ctx, 0)) 39 | abort(); 40 | } 41 | #elif defined(__linux__) && defined(SYS_getrandom) 42 | void randombytes(uint8_t *out, size_t outlen) { 43 | ssize_t ret; 44 | 45 | while(outlen > 0) { 46 | ret = syscall(SYS_getrandom, out, outlen, 0); 47 | if(ret == -1 && errno == EINTR) 48 | continue; 49 | else if(ret == -1) 50 | abort(); 51 | 52 | out += ret; 53 | outlen -= ret; 54 | } 55 | } 56 | #else 57 | void randombytes(uint8_t *out, size_t outlen) { 58 | static int fd = -1; 59 | ssize_t ret; 60 | 61 | while(fd == -1) { 62 | fd = open("/dev/urandom", O_RDONLY); 63 | if(fd == -1 && errno == EINTR) 64 | continue; 65 | else if(fd == -1) 66 | abort(); 67 | } 68 | 69 | while(outlen > 0) { 70 | ret = read(fd, out, outlen); 71 | if(ret == -1 && errno == EINTR) 72 | continue; 73 | else if(ret == -1) 74 | abort(); 75 | 76 | out += ret; 77 | outlen -= ret; 78 | } 79 | } 80 | #endif 81 | -------------------------------------------------------------------------------- /poly-arith/randombytes.h: -------------------------------------------------------------------------------- 1 | #ifndef RANDOMBYTES_H 2 | #define RANDOMBYTES_H 3 | 4 | #include 5 | #include 6 | 7 | void randombytes(uint8_t *out, size_t outlen); 8 | 9 | #endif 10 | -------------------------------------------------------------------------------- /poly-arith/reduce.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include "params.h" 3 | #include "reduce.h" 4 | 5 | /************************************************* 6 | * Name: montgomery_reduce 7 | * 8 | * Description: Montgomery reduction; given a 128-bit integer a, computes 9 | * 64-bit integer congruent to a * R^-1 mod q, where R=2^64 10 | * 11 | * Arguments: - __int128 a: input integer to be reduced; 12 | * has to be in {-q2^63,...,q2^63-1} 13 | * 14 | * 
Returns: integer in {-q+1,...,q-1} congruent to a * R^-1 modulo q. 15 | **************************************************/ 16 | int64_t montgomery_reduce(__int128 a) 17 | { 18 | int64_t t; 19 | 20 | t = (int64_t)a*PQMX_QINV; 21 | t = (a - (__int128)t*PQMX_Q) >> 64; 22 | return t; 23 | } 24 | 25 | /************************************************* 26 | * Name: barrett_reduce 27 | * 28 | * Description: Barrett reduction; given a 64-bit integer a, computes 29 | * centered representative congruent to a mod q in {-(q-1)/2,...,(q-1)/2} 30 | * 31 | * Arguments: - int64_t a: input integer to be reduced 32 | * 33 | * Returns: integer in {-(q-1)/2,...,(q-1)/2} congruent to a modulo q. 34 | **************************************************/ 35 | int64_t barrett_reduce(int64_t a) { 36 | int64_t t; 37 | const int64_t v = ( ((__int128)1 <<124 ) + PQMX_Q/2)/PQMX_Q; 38 | t = ((__int128)v*a + ((__int128)1 << 123) ) >> 124; 39 | t *= PQMX_Q; 40 | return a - t; 41 | } -------------------------------------------------------------------------------- /poly-arith/reduce.h: -------------------------------------------------------------------------------- 1 | #ifndef REDUCE_H 2 | #define REDUCE_H 3 | 4 | #include 5 | #include "params.h" 6 | 7 | #define montgomery_reduce PQMX_NAMESPACE(montgomery_reduce) 8 | int64_t montgomery_reduce(__int128 a); 9 | 10 | #define barrett_reduce PQMX_NAMESPACE(barrett_reduce) 11 | int64_t barrett_reduce(int64_t a); 12 | 13 | #endif 14 | -------------------------------------------------------------------------------- /poly-arith/symmetric-shake.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include "params.h" 5 | #include "symmetric.h" 6 | #include "fips202.h" 7 | 8 | /************************************************* 9 | * Name: PQMX_shake128_absorb 10 | * 11 | * Description: Absorb step of the SHAKE128 specialized for the PQMX context. 
12 | * 13 | * Arguments: - keccak_state *state: pointer to (uninitialized) output Keccak state 14 | * - const uint8_t *seed: pointer to PQMX_SYMBYTES input to be absorbed into state 15 | * - uint8_t x: additional byte of input 16 | **************************************************/ 17 | void PQMX_shake128_absorb(keccak_state *state, 18 | const uint8_t seed[PQMX_SYMBYTES], 19 | uint8_t x) 20 | { 21 | uint8_t extseed[PQMX_SYMBYTES+1]; 22 | 23 | memcpy(extseed, seed, PQMX_SYMBYTES); 24 | extseed[PQMX_SYMBYTES+0] = x; 25 | //extseed[PQMX_SYMBYTES+1] = y; 26 | 27 | shake128_absorb_once(state, extseed, sizeof(extseed)); 28 | } 29 | 30 | /************************************************* 31 | * Name: PQMX_shake256_prf 32 | * 33 | * Description: Usage of SHAKE256 as a PRF, concatenates secret and public input 34 | * and then generates outlen bytes of SHAKE256 output 35 | * 36 | * Arguments: - uint8_t *out: pointer to output 37 | * - size_t outlen: number of requested output bytes 38 | * - const uint8_t *key: pointer to the key (of length PQMX_SYMBYTES) 39 | * - uint8_t nonce: single-byte nonce (public PRF input) 40 | **************************************************/ 41 | void PQMX_shake256_prf(uint8_t *out, size_t outlen, const uint8_t key[PQMX_SYMBYTES], uint8_t nonce) 42 | { 43 | uint8_t extkey[PQMX_SYMBYTES+1]; 44 | 45 | memcpy(extkey, key, PQMX_SYMBYTES); 46 | extkey[PQMX_SYMBYTES] = nonce; 47 | 48 | shake256(out, outlen, extkey, sizeof(extkey)); 49 | } 50 | -------------------------------------------------------------------------------- /poly-arith/symmetric.h: -------------------------------------------------------------------------------- 1 | #ifndef SYMMETRIC_H 2 | #define SYMMETRIC_H 3 | 4 | #include 5 | #include 6 | #include "params.h" 7 | 8 | 9 | #include "fips202.h" 10 | 11 | typedef keccak_state xof_state; 12 | 13 | #define PQMX_shake128_absorb PQMX_NAMESPACE(PQMX_shake128_absorb) 14 | void PQMX_shake128_absorb(keccak_state *s, 15 | const uint8_t 
seed[PQMX_SYMBYTES], 16 | uint8_t x); 17 | 18 | #define PQMX_shake256_prf PQMX_NAMESPACE(PQMX_shake256_prf) 19 | void PQMX_shake256_prf(uint8_t *out, size_t outlen, const uint8_t key[PQMX_SYMBYTES], uint8_t nonce); 20 | 21 | #define XOF_BLOCKBYTES SHAKE128_RATE 22 | 23 | #define hash_h(OUT, IN, INBYTES) sha3_256(OUT, IN, INBYTES) 24 | #define hash_g(OUT, IN, INBYTES) sha3_512(OUT, IN, INBYTES) 25 | #define xof_absorb(STATE, SEED, X) PQMX_shake128_absorb(STATE, SEED, X) 26 | #define xof_squeezeblocks(OUT, OUTBLOCKS, STATE) shake128_squeezeblocks(OUT, OUTBLOCKS, STATE) 27 | #define prf(OUT, OUTBYTES, KEY, NONCE) PQMX_shake256_prf(OUT, OUTBYTES, KEY, NONCE) 28 | #define kdf(OUT, IN, INBYTES) shake256(OUT, PQMX_SSBYTES, IN, INBYTES) 29 | 30 | 31 | #endif /* SYMMETRIC_H */ 32 | -------------------------------------------------------------------------------- /poly-arith/test_mul: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dfaranha/aggregate-falcon/971472cd8f75513ed963f3b3e4ffa8adc490b794/poly-arith/test_mul -------------------------------------------------------------------------------- /poly-arith/test_mul.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "params.h" 4 | #include "randombytes.h" 5 | #include "ntt.h" 6 | #include "poly.h" 7 | #include "assert.h" 8 | #include "bench.h" 9 | 10 | #define NTESTS 1 11 | #define N PQMX_N 12 | #define Q PQMX_Q 13 | #define SEEDBYTES PQMX_SYMBYTES 14 | 15 | /* Modulus and roots for the two-splitting case. 
*/ 16 | #define P 1092158679064256381 17 | #define P0 267789826476780723 18 | #define P1 824368852587475658 19 | 20 | extern nmod_poly_t cyclo_poly, crt_poly[2]; 21 | int test(void); 22 | int bench(void); 23 | 24 | static void poly_naivemul(poly *c, const poly *a, const poly *b) { 25 | unsigned int i,j; 26 | int64_t r[2*N] = {0}; 27 | 28 | for(i = 0; i < N; i++) 29 | for(j = 0; j < N; j++) 30 | r[i+j] = (r[i+j] + ((__int128)a->coeffs[i]*b->coeffs[j])) % Q; 31 | 32 | for(i = N; i < 2*N; i++) 33 | r[i-N] = (r[i-N] - r[i]) % Q; 34 | 35 | for(i = 0; i < N; i++) 36 | c->coeffs[i] = r[i]; 37 | } 38 | 39 | int test(void) { 40 | int i, j; 41 | uint8_t seed[SEEDBYTES]; 42 | uint16_t nonce = 0; 43 | poly a, b, c, d; 44 | 45 | randombytes(seed, sizeof(seed)); 46 | for(i = 0; i < NTESTS; ++i) { 47 | poly_uniform(&a, seed, nonce++); 48 | poly_uniform(&b, seed, nonce++); 49 | 50 | c = a; 51 | #if DEGREE < 256 52 | poly_ntt4(&c); 53 | for(j = 0; j < N; ++j) 54 | c.coeffs[j] = ((__int128)c.coeffs[j]*PQMX_MONT3) % Q; 55 | poly_invntt_tomont(&c); 56 | for(j = 0; j < N; ++j) { 57 | if((c.coeffs[j] - a.coeffs[j]) % Q) 58 | fprintf(stderr, "ERROR in ntt/invntt: c[%d] = %ld != %ld\n", j, c.coeffs[j]%Q, a.coeffs[j]); 59 | } 60 | 61 | poly_naivemul(&c, &a, &b); 62 | poly_ntt4(&a); 63 | poly_ntt4(&b); 64 | poly_basemul_montgomery4(&d, &a, &b); 65 | poly_invntt_tomont(&d); 66 | #else 67 | poly_ntt8(&c); 68 | for(j = 0; j < N; ++j) 69 | c.coeffs[j] = ((__int128)c.coeffs[j]*PQMX_MONT3) % Q; 70 | poly_invntt_tomont(&c); 71 | for(j = 0; j < N; ++j) { 72 | if((c.coeffs[j] - a.coeffs[j]) % Q) 73 | fprintf(stderr, "ERROR in ntt/invntt: c[%d] = %ld != %ld\n", j, c.coeffs[j]%Q, a.coeffs[j]); 74 | } 75 | 76 | poly_naivemul(&c, &a, &b); 77 | poly_ntt8(&a); 78 | poly_ntt8(&b); 79 | poly_basemul_montgomery8(&d, &a, &b); 80 | poly_invntt_tomont(&d); 81 | #endif 82 | 83 | for(j = 0; j < N; ++j) { 84 | if((d.coeffs[j] - c.coeffs[j]) % Q) 85 | fprintf(stderr, "ERROR in multiplication: d[%d] = %ld != %ld\n", 
j, d.coeffs[j], c.coeffs[j]); 86 | } 87 | } 88 | 89 | return 0; 90 | } 91 | 92 | int bench(void) { 93 | uint8_t seed[SEEDBYTES]; 94 | uint16_t nonce = 0; 95 | poly a, b, c, d; 96 | nmod_poly_t crt_a[2], crt_b[2], crt_c[2]; 97 | uint64_t coeff; 98 | 99 | for (size_t i = 0; i < 2; i++) { 100 | nmod_poly_init(crt_a[i], P); 101 | nmod_poly_init(crt_b[i], P); 102 | nmod_poly_init(crt_c[i], P); 103 | } 104 | 105 | randombytes(seed, sizeof(seed)); 106 | poly_uniform(&a, seed, nonce++); 107 | poly_uniform(&b, seed, nonce++); 108 | for (int i = 0; i < N/2; i++) { 109 | for (int j = 0; j < 2; j++) { 110 | randombytes((uint8_t *)&coeff, sizeof(coeff)); 111 | nmod_poly_set_coeff_ui(crt_a[j], i, coeff % P); 112 | randombytes((uint8_t *)&coeff, sizeof(coeff)); 113 | nmod_poly_set_coeff_ui(crt_b[j], i, coeff % P); 114 | randombytes((uint8_t *)&coeff, sizeof(coeff)); 115 | nmod_poly_set_coeff_ui(crt_c[j], i, coeff % P); 116 | } 117 | } 118 | 119 | BENCH_SMALL("field mul: ", field_mul(&c.coeffs[0], &a.coeffs[0], &b.coeffs[0])); 120 | BENCH_SMALL("almost-full up-to-4 ntt: ", poly_ntt4(&a)); 121 | BENCH_SMALL("almost-full up-to-8 ntt: ", poly_ntt8(&a)); 122 | BENCH_SMALL("fully-split ntt: ", poly_ntt_full(&a)); 123 | BENCH_SMALL("inv ntt: ", poly_invntt_tomont(&d);) 124 | BENCH_SMALL("naive schoolbook mul: ", poly_naivemul(&c, &a, &b)); 125 | BENCH_SMALL("two-split CRT mul: ", poly_crtmul(crt_c, crt_a, crt_b)); 126 | BENCH_SMALL("almost-full up-to-4 mul: ", poly_basemul_montgomery4(&d, &a, &b);); 127 | BENCH_SMALL("almost-full up-to-8 mul: ", poly_basemul_montgomery8(&d, &a, &b);); 128 | BENCH_SMALL("fully-split mul: ", poly_pointwise_montgomery(&c, &a, &b)); 129 | 130 | for (size_t i = 0; i < 2; i++) { 131 | nmod_poly_clear(crt_a[i]); 132 | nmod_poly_clear(crt_b[i]); 133 | nmod_poly_clear(crt_c[i]); 134 | } 135 | 136 | return 0; 137 | } 138 | 139 | int main(void) { 140 | nmod_poly_init(cyclo_poly, P); 141 | for (int i = 0; i < 2; i++) { 142 | nmod_poly_init(crt_poly[i], P); 143 | } 
144 | 145 | // Initialize polynomial as x^N + 1. 146 | nmod_poly_set_coeff_ui(cyclo_poly, N, 1); 147 | nmod_poly_set_coeff_ui(cyclo_poly, 0, 1); 148 | 149 | // Initialize two factors of the polynomial for CRT representation. 150 | nmod_poly_set_coeff_ui(crt_poly[0], N/2, 1); 151 | nmod_poly_set_coeff_ui(crt_poly[0], 0, P0); 152 | nmod_poly_set_coeff_ui(crt_poly[1], N/2, 1); 153 | nmod_poly_set_coeff_ui(crt_poly[1], 0, P1); 154 | 155 | for (int i = 0; i < 2; i++) { 156 | nmod_poly_clear(crt_poly[i]); 157 | } 158 | nmod_poly_clear(cyclo_poly); 159 | 160 | test(); 161 | bench(); 162 | } -------------------------------------------------------------------------------- /poly-arith/util.c: -------------------------------------------------------------------------------- 1 | #include "util.h" 2 | #include 3 | #include 4 | #include "randombytes.h" 5 | #include "reduce.h" 6 | 7 | /************************************************* 8 | * Name: poly_norm 9 | * 10 | * Description: Calculate l_p norm of a polynomial 11 | * 12 | * Arguments: - const poly *a: pointer to input polynomial 13 | * - int l : norm degree. 0 for infinity norm. 14 | **************************************************/ 15 | double poly_norm(const poly *a, int l) 16 | { 17 | double norm = 0.0; 18 | int64_t i; 19 | double j; 20 | for(i=0;icoeffs[i]); 22 | if(l == 0) 23 | norm = max(norm, j); 24 | else 25 | norm += pow(j, l*1.0); 26 | } 27 | if(l==0) return norm; 28 | return pow(norm, 1.0/l); 29 | } 30 | 31 | /************************************************* 32 | * Name: polyvec_norm 33 | * 34 | * Description: Calculate l_p norm of a polynomial vector 35 | * 36 | * Arguments: - const poly *a: pointer to input polynomial 37 | * - int l : norm degree. 0 for infinity norm. 
38 | * - int vlen: vector dimension 39 | **************************************************/ 40 | double polyvec_norm(const poly *a, int l, int len) 41 | { 42 | double norm = 0.0, tmp; 43 | int64_t i; 44 | 45 | for(i=0;i 0; i--){ 84 | j = secure_random(i); 85 | tmp = arr[j]; 86 | arr[j] = arr[i]; 87 | arr[i] = tmp; 88 | } 89 | } 90 | 91 | /************************************************* 92 | * Name: padding 93 | * 94 | * Description: A simple padding mechanism. padded-data = \x01 \x01 *\xFF \x01 data \x00 95 | * 96 | * Arguments: - uint8_t *out: pointer to output buffer 97 | * - const uint8_t *buf: pointer to input buffer 98 | * - size_t buflen: buffer length 99 | **************************************************/ 100 | void padding(uint8_t out[PQMX_INDCPA_MSGBYTES], const uint8_t *buf, size_t buflen) 101 | { 102 | memset(out, 0xFF, PQMX_INDCPA_MSGBYTES); 103 | out[0] = 0x01; 104 | out[1] = 0x01; 105 | out[PQMX_INDCPA_MSGBYTES-1] = 0x00; 106 | out[PQMX_INDCPA_MSGBYTES - buflen - 1] = 0x01; 107 | memcpy(out+PQMX_INDCPA_MSGBYTES-buflen, buf, buflen); 108 | } 109 | 110 | /************************************************* 111 | * Name: remove_padding 112 | * 113 | * Description: Unpadding 114 | * 115 | * Arguments: - uint8_t *out: pointer to output buffer 116 | * - const uint8_t *in: pointer to input buffer 117 | **************************************************/ 118 | void remove_padding(uint8_t *out, const uint8_t in[PQMX_INDCPA_MSGBYTES]) 119 | { 120 | memset(out, 0, 32); 121 | int i=2; 122 | while(in[i] == 0xFF){ 123 | i++; 124 | } 125 | memcpy(out, in+i+1, PQMX_INDCPA_MSGBYTES - i-1); 126 | } -------------------------------------------------------------------------------- /poly-arith/util.h: -------------------------------------------------------------------------------- 1 | #ifndef UTIL_H 2 | #define UTIL_H 3 | 4 | #include 5 | #include 6 | #include "params.h" 7 | #include "poly.h" 8 | 9 | #define max(a,b) \ 10 | ({ __typeof__ (a) _a = (a); \ 11 | __typeof__ 
(b) _b = (b); \ 12 | _a > _b ? _a : _b; }) 13 | 14 | #define poly_norm PQMX_NAMESPACE(poly_norm) 15 | double poly_norm(const poly *a, int l); 16 | 17 | #define polyvec_norm PQMX_NAMESPACE(polyvec_norm) 18 | double polyvec_norm(const poly *a, int l, int len); 19 | 20 | int secure_random(int j); 21 | void shuffle_array(int64_t *arr, int len); 22 | 23 | void padding(uint8_t out[PQMX_INDCPA_MSGBYTES], const uint8_t *buf, size_t buflen); 24 | void remove_padding(uint8_t *out, const uint8_t in[PQMX_INDCPA_MSGBYTES]); 25 | 26 | #endif -------------------------------------------------------------------------------- /proof_size_estimate.py: -------------------------------------------------------------------------------- 1 | ''' This Python script computes the estimated sizes of our aggregate signature as presented in Section 6 and Appendix E of our submission. 2 | It is derived from corresponding code provided to us by Gregor Seiler. 3 | It first provides the numbers for the comparison with [JRS23], Squirrel and Chipmunk. 4 | Then it provides the numbers stored in 'estimates-lin-.csv' to derive the plots through 'plot_paper.py' later. 
5 | To run the program, simply run the Python script, for example >>> python3 ./proof_size_estimate.py ''' 6 | 7 | import math 8 | import sys 9 | import time 10 | import csv 11 | from enum import Enum 12 | 13 | ### FALCON PARAMETER SETS ### 14 | 15 | ''' Defining parameter sets for FALCON_DEG_SEC: 16 | Parameters depend on ring degree DEG and security level SEC, thus different classes for the relevant combinations 17 | 18 | d = ring degree 19 | q = Falcon modulus 20 | beta = norm bound of signatures 21 | bit_len = bit length of a signature without nonce (from Falcon specifications, Table 3.3 and Algorithm 10) 22 | SECPARAM = targeted security level 23 | JL_const = the constant in the Johnson-Lindenstrauss Lemma (for the corresponding security level) 24 | JL_slack = the slack in the Johnson-Lindenstrauss Lemma (for the given security level) 25 | kappa_lim[kappa] = largest beta for which that kappa gives us SECPARAM-bit MSIS security (using the SIS_hardness.sage script) 26 | ''' 27 | 28 | # all configurations we need 29 | class FALCON_64_128(): 30 | def __init__(self) -> None: 31 | self.d = 64 32 | self.q = 12289 33 | self.beta = 5834 34 | self.bit_len = 5328-320 35 | self.SECPARAM = 128 36 | self.JL_const = 120 37 | self.JL_slack = math.sqrt(128/30) 38 | self.kappa_lim = [-1, 271, 2687, 16383, 73727, 524287, 917503, 2621439, 7340031, 18874367, 50331647, 117440511, 251658239, 1073741823, 1207959551, 2684354559, 5368709119, 10737418239, 21474836479, 68719476735, 137438953471, 137438953471, 240518168575, 549755813887, 755914244095, 1374389534719, 4398046511103, 4398046511103, 8796093022207, 13194139533311, 35184372088831, 35184372088831, 52776558133247, 87960930222079, 140737488355326, 140737488355326] 39 | 40 | class FALCON_128_128(): 41 | def __init__(self) -> None: 42 | self.d = 128 43 | self.q = 12289 44 | self.beta = 5834 45 | self.bit_len = 5328-320 46 | self.SECPARAM = 128 47 | self.JL_const = 120 48 | self.JL_slack = math.sqrt(128/30) 49 | self.kappa_lim = [-1, 2687, 
73727, 917503, 7340031, 50331647, 251658239, 1207959551, 5368709119, 21474836479, 137438953471, 240518168575, 755914244095, 4398046511103, 8796093022207, 35184372088831, 52776558133247, 140737488355326, 140737488355326] 50 | 51 | class FALCON_128_256(): 52 | def __init__(self) -> None: 53 | self.d = 128 54 | self.q = 12289 55 | self.beta = 5834 56 | self.bit_len = 10240-320 57 | self.SECPARAM = 256 58 | self.JL_const = 168 59 | self.JL_slack = math.sqrt(128/30) 60 | self.kappa_lim = [-1, 463, 5887, 49151, 229375, 917503, 3407871, 11534335, 34603007, 100663295, 268435455, 805306367, 1744830463, 4294967295, 9663676415, 21474836479, 47244640255, 103079215103, 206158430207, 549755813887, 1099511627775, 1649267441663, 3161095929855, 6047313952767, 13194139533311, 21990232555519, 39582418599935, 70368744177663, 140737488355326, 140737488355326] 61 | 62 | class FALCON_256_256(): 63 | def __init__(self) -> None: 64 | self.d = 256 65 | self.q = 12289 66 | self.beta = 5834 67 | self.bit_len = 10240-320 68 | self.SECPARAM = 256 69 | self.JL_const = 168 70 | self.JL_slack = math.sqrt(128/30) 71 | self.kappa_lim = [-1, 5887, 229375, 3407871, 34603007, 268435455, 1744830463, 9663676415, 47244640255, 206158430207, 1099511627775, 3161095929855, 13194139533311, 39582418599935, 140737488355326, 140737488355326] 72 | 73 | 74 | ### LABRADOR CHALLENGE SETS ### 75 | 76 | ''' Defining challenge sets for FALCON_DEG_SEC in the case of 77 | 1) two-splitting rings (2_SPLIT) 78 | 2) almost fully-splitting rings (ALMOST_FULL_SPLIT) 79 | 80 | (Parameters depend on ring degree DEG and security level SEC, thus different classes for the relevant combinations) 81 | 82 | rho = number of CRT slots (ell in the paper) 83 | tau = bound on the square of l_2 norm (given by the weight, T_2 in the paper) 84 | T = bound on operator norm (given by the weight, T_op in the paper) 85 | rep = boolean to indicate whether parallel repetition is necessary 86 | 87 | Parameters are set such that 88 | 89 | i) size of 
challenge set > 2^SECPARAM 90 | ii) probability of non-invertibility < 2^(-SECPARAM) 91 | 92 | using the SageMath program 'compute_weight.sage' in this repository 93 | ''' 94 | 95 | # best subring -> best parameter sets 96 | class CHAL_2_SPLIT_64_128(): 97 | def __init__(self) -> None: 98 | self.rho = 2 99 | self.tau = math.ceil(172/2) #omega * gam^2 = 43*(2**2)=172 100 | self.T = math.ceil(86/2) #omega * gam = 43*2 =86 101 | self.rep = False 102 | 103 | class CHAL_2_SPLIT_128_256(): 104 | def __init__(self) -> None: 105 | self.rho = 2 106 | self.tau = math.ceil(296/2.5) #omega * gam^2 = 74*(2**2)=296 107 | self.T = math.ceil(148/2.5) #omega * gam = 74*2=148 108 | self.rep = False 109 | 110 | # best subring -> best parameter sets 111 | class CHAL_ALMOST_FULL_SPLIT_128_128(): 112 | def __init__(self) -> None: 113 | self.rho = 128//4 114 | self.tau = math.ceil(1024/2) # 64*(4**2)=1024 115 | self.T = math.ceil(256/2) # 64*4=256 116 | self.rep = False 117 | 118 | class CHAL_ALMOST_FULL_SPLIT_256_256(): 119 | def __init__(self) -> None: 120 | self.rho = 256//8 121 | self.tau = math.ceil(1296/2.5) #144*(3**2)=1296 122 | self.T = math.ceil(432/2.5) #144*3=432 123 | self.rep = False 124 | 125 | #beta_fun is a lambda that takes kappa as input and outputs a norm. Needed for Thm 5.1. 126 | def get_kappa(beta_fun, q_,kappa_lim): 127 | kappa = 0 128 | beta = 0 129 | for i in range(1, len(kappa_lim)): 130 | beta = beta_fun(i) 131 | if beta <= kappa_lim[i]: 132 | kappa = i 133 | break 134 | 135 | if beta >= q_: 136 | raise Exception("Beta must be smaller than the LaBRADOR modulus.") 137 | if kappa == 0: 138 | raise Exception("The table kappa_lim has not found a kappa for such a large beta", beta) 139 | return kappa 140 | 141 | ##### Printing sizes ##### 142 | 143 | def format_size(nbits): 144 | # Should we print KiB instead? 
145 | nbytes = math.ceil(nbits / 8) 146 | nkb = round(nbytes / 1000, 2) 147 | nmb = round(nbytes / 1000**2, 2) 148 | ngb = round(nbytes / 1000**3, 2) 149 | if nbytes < 1000: 150 | return "{v:10.2f} B".format(v=nbytes) 151 | if nkb < 1000: 152 | return "{v:10.2f} kB".format(v=nkb) 153 | if nmb < 1000: 154 | return "{v:10.2f} MB".format(v=nmb) 155 | return "{v:10.2f} GB".format(v=ngb) 156 | 157 | #### LaBRADOR script utils #### 158 | 159 | def gaussianentropy(sig): 160 | a = 1 161 | if (sig >= 4): 162 | a = math.floor(sig/2) 163 | sig /= a; 164 | 165 | d = 1/(2*sig**2) 166 | n = 0 167 | for i in range(-math.ceil(15*sig), 0): 168 | n += math.exp(-i**2*d) 169 | n = 2*n + 1 170 | logn = math.log(n) 171 | e = 0 172 | for i in range(-math.ceil(15*sig), 0): 173 | f = math.exp(-i**2*d) 174 | e += f*(math.log(f) - logn) 175 | e = (-2*e + logn)/(n*math.log(2.0)) 176 | 177 | return(e+math.log2(a)) 178 | 179 | def l2norm(list): 180 | return math.sqrt(sum(list[i]**2 for i in range(len(list)))) 181 | 182 | ##### Starting params ###### 183 | 184 | def get_initial_params(num_sigs, falcon, chal, scal, verbose=False): 185 | ## signature l2-norm (contains both s_i's and epsilon_i's as well as their conjugates): 186 | beta_sig_ell2 = math.sqrt(num_sigs) * 2 * falcon.beta # called beta_2^(1) in paper 187 | ## quotient l2-norm (contains v_i's for lifting the modulus): 188 | beta_quotient_ell2 = math.sqrt(num_sigs) * (falcon.beta* falcon.d + math.sqrt(falcon.d) + 1) # called beta_2^(2) in paper 189 | ## sum of both l2-norm bounds gives final l2-norm bound 190 | beta_labrador = beta_sig_ell2 + beta_quotient_ell2 191 | 192 | # Condition 1 193 | # for beta_2^(1) 194 | bound_cond_1_sig = math.sqrt(falcon.SECPARAM) * beta_sig_ell2 * falcon.JL_const 195 | # for beta_2^(2) 196 | bound_cond_1_quo = math.sqrt(falcon.SECPARAM) * beta_quotient_ell2 * falcon.JL_const 197 | 198 | # Condition 2 199 | # for beta_inf^(1) 200 | bound_cond_2_sig = (falcon.JL_slack)**2 * (beta_sig_ell2)**2 * 4 * (falcon.d + 
2) 201 | # for beta_inf^(2) 202 | bound_cond_2_quo = falcon.JL_slack * beta_quotient_ell2 * 6 * falcon.q 203 | 204 | 205 | q_bitlen = math.ceil( max(math.log2(bound_cond_1_sig), math.log2(bound_cond_1_quo), math.log2(bound_cond_2_sig), math.log2(bound_cond_2_quo))) 206 | 207 | ''' 208 | if verbose: 209 | print(num_sigs) 210 | print("beta_2^1: ", beta_sig_ell2, "beta_2^2: ", beta_quotient_ell2) 211 | print("bound through condition 1 for signature: ", bound_cond_1_sig) 212 | print("bound through condition 1 for quotient: ", bound_cond_1_quo) 213 | print("bound through condition 2 for signature: ", bound_cond_2_sig) 214 | print("bound through condition 2 for quotient: ", bound_cond_2_quo) 215 | print("bit length of Labrador modulus: ", q_bitlen) 216 | ''' 217 | if falcon.SECPARAM > q_bitlen * (falcon.d / chal.rho): 218 | print("ERROR: q^{d/l} not big enough") 219 | 220 | q_ = 2**q_bitlen-1 221 | 222 | ## Starting Constraints 223 | n = scal * num_sigs 224 | r = 6 * math.ceil(math.sqrt(num_sigs)) + 1 225 | return q_, n, [r], [beta_labrador,0] 226 | 227 | ############################### 228 | class Stage(Enum): 229 | FIRST = 1 230 | MID = 2 231 | SECLAST = 3 232 | LAST = 4 #Meant for last round optimization, not used any more. 233 | 234 | class Iteration(): 235 | def __init__(self, q_, d, slack, n, r_list, beta_list, chal, secparam, kappa_lim, stage=Stage.MID, prevnu=1, prevmu=1) -> None: 236 | self.q_ = q_ 237 | self.d = d 238 | self.slack = slack 239 | self.n = n 240 | self.r_list = r_list 241 | self.beta_list = beta_list 242 | self.chal = chal 243 | self.stage = stage 244 | self.prevnu = prevnu 245 | self.prevmu = prevmu 246 | self.secparam = secparam 247 | self.kappa_lim = kappa_lim 248 | 249 | #Initializing internal variables 250 | if stage == Stage.LAST: 251 | self.init_last() 252 | else: 253 | self.init_normal() 254 | 255 | def init_normal(self): 256 | #Combined norm of the old witness vectors. 
257 | self.beta = l2norm(self.beta_list) 258 | self.logq = math.ceil(math.log2(self.q_)) 259 | 260 | #sig=sigma, the usual symbol for standard deviation. 261 | #They assume s_i are Gaussian with this standard deviation (p.16). 262 | self.sigs = [self.beta_list[i]/math.sqrt(self.r_list[i]*self.n*self.d) for i in range(0, len(self.r_list))] 263 | 264 | #In the paper they model z as having SD sigs*sqrt(r*tau). 265 | #This seems to be the new SD of z_0 and z_1, when they keep track of the norm and SD of the 266 | #subcomponents of the witness separately. 267 | self.sigz = math.sqrt(self.sigs[0]**2*(1.0+(self.r_list[0]-1)*self.chal.tau) + sum([self.sigs[i]**2*self.r_list[i]*self.chal.tau for i in range(1, len(self.r_list))])) 268 | 269 | #An upper bound on the standard deviation of the g_ij's (p.19, max not mentioned) 270 | self.sigh = math.sqrt(2*self.n*self.d)*max(self.sigs)**2 271 | 272 | if self.stage == Stage.SECLAST: 273 | self.t,self.b = 1, 1 274 | else: 275 | #z is decomposed into t=2 parts wrt the basis b (p.16) 276 | self.t,self.b = 2,round(math.sqrt(math.sqrt(12.0)*self.sigz)) 277 | 278 | #t_i and h_ij are split into t1 parts wrt the basis b1. 279 | #t1 is different compared to the paper. I assume the max is to make sure 280 | #we don't get weird dividing by log(1) issues. 281 | self.t1 = round(self.logq/math.log2(math.sqrt(12.0)*self.sigz/self.b)) 282 | self.t1 = max(2,self.t1) 283 | self.t1 = min(14,self.t1) 284 | self.b1 = math.ceil(2**(self.logq/self.t1)) 285 | 286 | #g_ij is split into t2 parts wrt the basis b2. 287 | #t2 and b2 are both different from the paper. 
288 | self.t2 = round(math.log(math.sqrt(12)*self.sigh)/math.log(math.sqrt(12)*self.sigz/self.b)) 289 | self.t2 = max(1,self.t2) 290 | self.b2 = math.ceil((math.sqrt(12)*self.sigh)**(1/self.t2)) 291 | 292 | #Total number of witness elements 293 | sumr = sum(self.r_list) 294 | #Computes the combined l2-norm upper bound for z_0 and z_1, sqrt(2/b**2 * gamma) (see p.15) 295 | #gamma = sigz *sqrt(nd), because the norm contribution of each Zq coefficient is on average the SD, 296 | #and there are nd coefficients. 297 | self.nextbeta_list = [self.sigz/self.b*math.sqrt(float(self.t)*self.n*self.d), 0] 298 | 299 | #Rank of inner commitments 300 | newbeta1_fun = lambda kappa: math.sqrt(self.b1**2/12*self.t1*sumr*kappa*self.d + (self.b1**2*self.t1+self.b2**2*self.t2)/12*(sumr**2+sumr)/2*self.d) 301 | newbeta_fun = lambda kappa: math.sqrt(self.nextbeta_list[0]**2 + newbeta1_fun(kappa)**2) 302 | kappa_norm = lambda kappa: max(6*self.chal.T*self.b*self.slack*newbeta_fun(kappa),2*self.b*self.slack*newbeta_fun(kappa)+4*self.chal.T*self.slack*self.beta) 303 | self.kappa = get_kappa(kappa_norm, self.q_,self.kappa_lim) 304 | self.nextbeta_list[1] = newbeta1_fun(self.kappa) 305 | 306 | #The rank of the outer commitments. 307 | kappa1_norm = lambda kappa: 2*self.slack*newbeta_fun(kappa) 308 | self.kappa1 = get_kappa(kappa1_norm, self.q_,self.kappa_lim) 309 | 310 | self.m = self.t1*sumr*self.kappa + (self.t1+self.t2)*(sumr**2+sumr)/2 311 | 312 | #Tail is the version of the last iteration that is in the paper. 
313 | def init_last(self): 314 | #Same as main 315 | self.beta = l2norm(self.beta_list) 316 | self.logq = math.ceil(math.log2(self.q_)) 317 | # size += 4*128; # challenges 318 | 319 | #Same as main 320 | self.sigs = [self.beta_list[i]/math.sqrt(self.r_list[i]*self.n*self.d) for i in range(0, len(self.r_list))] 321 | self.sigz = math.sqrt(self.sigs[0]**2*(1.0+(self.r_list[0]-1)*self.chal.tau) + sum([self.sigs[i]**2*self.r_list[i]*self.chal.tau for i in range(1, len(self.r_list))])) 322 | self.sigh = math.sqrt(2*self.n*self.d)*max(self.sigs)**2 323 | 324 | #The norm of z when it is not decomposed into z_0 and z_1. 325 | self.beta = self.sigz*math.sqrt(self.n*self.d) 326 | #Picks the inner com rank according to Theorem 5.1, but with the new expression for the norm of z 327 | #We also don't need to multiply with the slack since we are not going to recurse further. 328 | kappa_norm = lambda kappa: max(6*self.chal.T*self.beta,2*self.beta+4*self.chal.T*self.beta) 329 | self.kappa = get_kappa(kappa_norm, self.q_,self.kappa_lim) 330 | 331 | #For printing purposes 332 | self.kappa1 = 0 333 | self.m = 0 334 | 335 | def size_step(self): 336 | numproj = 2 if self.stage == Stage.FIRST else 1 337 | jl_proj = numproj * 2 * self.secparam*gaussianentropy(self.beta/math.sqrt(2.0)) 338 | 339 | jl_proof = math.ceil(self.secparam/self.logq)*self.d*self.logq 340 | 341 | numouter = 0 if self.stage == Stage.SECLAST else 2 342 | outer_com = numouter * self.kappa1 * self.d * self.logq 343 | 344 | return jl_proj + jl_proof + outer_com 345 | 346 | 347 | def size_ti(self): 348 | return sum(self.r_list)*self.kappa*self.d*self.logq 349 | 350 | def size_gij(self): 351 | if self.stage == Stage.LAST: 352 | #According to Section 5.6, there will be 2*nu+1 g_ij polynomials 353 | return (1+2*self.r_list[0])*self.d*gaussianentropy(self.sigh) # quadratic garbage polys, assumes t = 1 354 | 355 | return ((self.r_list[0]**2 + self.r_list[0])/2)*self.d*gaussianentropy(self.sigh) 356 | 357 | def 
size_hij(self): 358 | if self.stage == Stage.LAST: 359 | #According to Section 5.6, there will be 2r-1 h_ij polynomials. 360 | return (2*sum(self.r_list)-1)*self.d*self.logq # linear garbage polys 361 | 362 | return ((self.r_list[0]**2 + self.r_list[0])/2)*self.d*self.logq 363 | 364 | def size_lastmsg(self): 365 | return self.size_ti() + self.size_gij() + self.size_hij() + self.size_z() 366 | 367 | def size_all(self): 368 | return self.size_step() + self.size_lastmsg() 369 | 370 | def parallel_reps(self): 371 | if not self.chal.rep: 372 | return 1 373 | for k in range(1,20): 374 | if self.secparam < math.log2( k / self.d) + k * math.log2(self.q_): # <=> 2^(-secparam) > d/k * (1/q)^k 375 | return k 376 | # TODO this is super hacky, but we should never end in this case 377 | return 21 378 | 379 | def size_z(self): 380 | if self.stage == Stage.LAST: 381 | #TODO: times two to avoid exponential loss in weak opening norms (ask sebastian) 382 | return 2* self.parallel_reps()*self.n*self.d*gaussianentropy(self.sigz) 383 | return self.parallel_reps()*self.n*self.d*gaussianentropy(self.sigz) 384 | 385 | def next_it(self, recursion_strategy, nextstage=Stage.MID): 386 | nu, mu = recursion_strategy(self, nextstage) 387 | n_ = math.ceil(self.parallel_reps()*self.n/nu) 388 | m_ = math.ceil(self.m/mu) #Without the ceils and floors, this would make newm=n/nu=newn. 389 | n_ = max(n_,m_) #New rank 390 | r_ = [self.t*nu,mu]; #New multiplicity. 
391 | it = Iteration(self.q_, self.d, self.slack, n_, r_, self.nextbeta_list, self.chal, self.secparam, self.kappa_lim, nextstage, nu, mu,) 392 | return it 393 | 394 | def __str__(self): 395 | return f"n: {self.n:10d}, r: {sum(self.r_list):10d}, m: {math.ceil(self.m):10d}, ν: {self.prevnu:4d}, μ: {self.prevmu:4d}, kappa: {self.kappa:4d}, kappa1: {self.kappa1:4d}, log(β) {math.log2(self.beta):5.2f} norm balance: {100*(self.beta_list[1]-self.beta_list[0])/max(self.beta_list):.2f} size: {format_size(self.size_step())}{', ' +format_size(self.size_lastmsg())}" 396 | 397 | def lastmsg_str(self): 398 | # self.size_z() + self.size_t_i() + self.size_g_ij() + self.size_hij() 399 | return f"z: {format_size(self.size_z())}, ti: {format_size(self.size_ti())}, gij: {format_size(self.size_gij())}, hij: {format_size(self.size_hij())}" 400 | 401 | #### Recursion #### 402 | 403 | def recursion_strategy_4(it, nextstage): 404 | best_nu, best_mu = 1,1 405 | best = it.next_it((lambda a,b: (1,1)), nextstage) 406 | best_size = best.parallel_reps() * 2 * best.n + best.m 407 | res = 50 408 | for nu in range(1,res): 409 | for mu in range(1,res): 410 | candidate = it.next_it((lambda a,b: (nu,mu)), nextstage) 411 | cand_size = candidate.parallel_reps() * 2 * candidate.n + candidate.m 412 | if cand_size < best_size: 413 | best_size = cand_size 414 | best_nu, best_mu = nu, mu 415 | # print(best_nu,best_mu) 416 | return best_nu,best_mu 417 | 418 | def recursion_strategy_final_4(it, nextstage): 419 | best_nu, best_mu = 1,1 420 | best = it.next_it((lambda a, b: (1,1)), Stage.LAST) 421 | best_size = best.size_all() 422 | res = 100 423 | for nu in range(1,res): 424 | for mu in range(1,res): 425 | candidate = it.next_it((lambda a,b: (nu,mu)), Stage.LAST) 426 | if candidate.size_all() < best_size: 427 | best_size = candidate.size_all() 428 | best_nu, best_mu = nu, mu 429 | # print(best_nu,best_mu) 430 | return best_nu,best_mu 431 | 432 | def recursion_to_depth(initial_it, depth): 433 | iters = 
[initial_it] 434 | it = initial_it 435 | strategies = [] if depth <= 3 else [recursion_strategy_4] * (depth-3) 436 | for strat in strategies: 437 | it = it.next_it(strat, Stage.MID) 438 | iters.append(it) 439 | 440 | seclast = it.next_it(recursion_strategy_4, Stage.SECLAST) 441 | last = seclast.next_it(recursion_strategy_final_4, Stage.LAST) 442 | iters.append(seclast) 443 | iters.append(last) 444 | 445 | total = 0 446 | for it in iters: 447 | total += it.size_step() 448 | 449 | total += iters[len(iters)-1].size_lastmsg() 450 | 451 | return iters, total 452 | 453 | def recursion_to_depth_no_last_opt(initial_it, depth): 454 | iters = [initial_it] 455 | it = initial_it 456 | strategies = [] if depth <= 2 else [recursion_strategy_4] * (depth-2) 457 | for strat in strategies: 458 | it = it.next_it(strat, Stage.MID) 459 | iters.append(it) 460 | 461 | seclast = it.next_it(recursion_strategy_4, Stage.SECLAST) 462 | iters.append(seclast) 463 | 464 | total = 0 465 | for it in iters: 466 | total += it.size_step() 467 | 468 | total += iters[len(iters)-1].size_lastmsg() 469 | 470 | return iters, total 471 | 472 | def search(num_sigs, max_depth, falcon, chal, scal, verbose=False): 473 | q_, n, r_list, beta_labrador_list = get_initial_params(num_sigs, falcon, chal, scal, verbose=verbose) 474 | if verbose: print("Initial Modulus (in bits):", math.ceil(math.log2(q_))) 475 | 476 | it = Iteration(q_, falcon.d, falcon.JL_slack, n, r_list, beta_labrador_list, chal, falcon.SECPARAM, falcon.kappa_lim, Stage.FIRST) 477 | if verbose: print(f'Requires {it.parallel_reps()} parallel repetition(s)') 478 | 479 | best_iterations, best_size = recursion_to_depth_no_last_opt(it, 1) 480 | 481 | for d in range(2, max_depth + 1): 482 | iterations, size = recursion_to_depth_no_last_opt(it, d) 483 | if size < best_size: 484 | best_iterations, best_size = iterations, size 485 | if verbose: 486 | #for depth, it in enumerate(best_iterations): 487 | # print(f"{depth:2d}", it) 488 | #print("Last 
message:",best_iterations[len(best_iterations)-1].lastmsg_str()) 489 | print("Total (with salt):", format_size(best_size + 320*num_sigs)) 490 | print("Total (without salt):", format_size(best_size)) 491 | print("Trivial solution (with salt): ", format_size(num_sigs*falcon.bit_len + 320*num_sigs)) 492 | print("Trivial solution (without salt): ", format_size(num_sigs*falcon.bit_len)) 493 | print("Compression rate (with salt): ", round((best_size + 320*num_sigs)/(num_sigs*falcon.bit_len + 320*num_sigs),2)) 494 | 495 | return best_size 496 | 497 | ## FUNCTIONS TO COMPUTE NUMBERS AND STORE IN CSV FILES 498 | 499 | def compute_sizes_best(file_name, sizes): 500 | f = open(file_name,"a") 501 | max_depth = 15 502 | 503 | f.write("Num-Sigs,Naive-512,Naive-1024,Falcon-512-2S,Falcon-512-AS,Falcon-1024-2S,Falcon-1024-AS\n") 504 | for num_sigs in sizes: 505 | sys.stdout.write('\r') 506 | sys.stdout.write(f'{num_sigs}') 507 | sys.stdout.flush() 508 | # Falcon-512 2S (two-splitting), AS (almost-fully-splitting) 509 | f512_2S = search(num_sigs, max_depth, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8) 510 | f512_AS = search(num_sigs, max_depth, FALCON_128_128(), CHAL_ALMOST_FULL_SPLIT_128_128(),4) 511 | # Falcon-1024 2S (two-splitting), AS (almost-fully-splitting) 512 | f1024_2S = search(num_sigs, max_depth, FALCON_128_256(), CHAL_2_SPLIT_128_256(),8) 513 | f1024_AS = search(num_sigs, max_depth, FALCON_256_256(), CHAL_ALMOST_FULL_SPLIT_256_256(),4) 514 | # Also include naive concatenation 515 | f.write(f'{num_sigs}, {num_sigs * FALCON_128_128().bit_len}, {num_sigs * FALCON_256_256().bit_len}, {f512_2S}, {f512_AS}, {f1024_2S},{f1024_AS} \n') 516 | f.flush() 517 | f.close() 518 | print("Done!") 519 | 520 | ## NEW WITH BEST SUBRING 521 | # COMPARISON WITH [JRS23] 522 | 523 | print("\n RESULTS FOR COMPARISON WITH [JRS23] (Table 4 in the submission)") 524 | print("\n For Falcon-512 (degree 64, secpar 121) and N=500") 525 | search(500, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 526 | 
print("\n For Falcon-512 (degree 64, secpar 121) and N=1000") 527 | search(1000, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 528 | print("\n For Falcon-512 (degree 64, secpar 121) and N=2000") 529 | search(2000, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 530 | 531 | # COMPARISON WITH SQUIRREL AND CHIPMUNK 532 | 533 | print("\n RESULTS FOR COMPARISON WITH SQUIRREL & CHIPMUNK (Table 5 in the submission)") 534 | print("\n For Falcon-512 (degree 64, secpar 121) and N=1024") 535 | search(1024, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 536 | print("\n For Falcon-512 (degree 64, secpar 121) and N=4096") 537 | search(4096, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 538 | print("\n For Falcon-512 (degree 64, secpar 121) and N=8192") 539 | search(8192, 15, FALCON_64_128(), CHAL_2_SPLIT_64_128(),8, True) 540 | 541 | # Computes all relevant aggregate signature sizes; plot_paper.py then uses them to produce the figures in Section 6.2 542 | compute_sizes_best("estimates-lin-.csv", range(100, 10101, 100)) # start, end, step size 543 | --------------------------------------------------------------------------------
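The rows that `compute_sizes_best` appends to 'estimates-lin-.csv' hold sizes in bits and exclude the 320-bit salt per signature. The following is a minimal sketch of how such a file could be read back and turned into a compression rate. The helper names `load_estimates` and `compression_rate` are hypothetical and not part of the repository, and the aggregate sizes in the sample row are made up for illustration; they are not results from the paper.

```python
import csv
import io


def load_estimates(handle):
    """Parse rows in the layout written by compute_sizes_best (sizes in bits)."""
    # skipinitialspace handles the ", " separators used in the data rows.
    reader = csv.reader(handle, skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    rows = []
    for raw in reader:
        cells = [cell.strip() for cell in raw]
        if not any(cells):
            continue  # skip blank lines
        rows.append({name: float(cell) for name, cell in zip(header, cells)})
    return rows


def compression_rate(row, aggregate_col, naive_col):
    """Aggregate size divided by naive concatenation size (both without salt)."""
    return row[aggregate_col] / row[naive_col]


# Illustrative input only: the naive columns follow num_sigs * bit_len,
# the aggregate columns are invented placeholder values.
sample = io.StringIO(
    "Num-Sigs,Naive-512,Naive-1024,Falcon-512-2S,Falcon-512-AS,Falcon-1024-2S,Falcon-1024-AS\n"
    "100, 500800, 992000, 400000.0, 450000.0, 800000.0,900000.0 \n"
)
rows = load_estimates(sample)
rate = compression_rate(rows[0], "Falcon-512-2S", "Naive-512")
print(f"{rate:.3f}")
```

Because `compute_sizes_best` opens the file in append mode, running the script more than once writes the header line again; this sketch assumes a file produced by a single run.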