├── vdf_portal_config.json ├── msu ├── rtl │ ├── vivado_ozturk │ │ ├── generate.sh │ │ ├── msu.srcs │ │ │ ├── constrs_1 │ │ │ │ └── new │ │ │ │ │ └── user.xdc │ │ │ └── tb.sv │ │ ├── run_vivado.sh │ │ └── tb_behav.wcfg │ ├── vivado_simple │ │ ├── generate.sh │ │ ├── msu.srcs │ │ │ ├── constrs_1 │ │ │ │ └── new │ │ │ │ │ └── user.xdc │ │ │ └── tb.sv │ │ ├── run_vivado.sh │ │ └── tb_behav.wcfg │ ├── input_simple.vc │ ├── sdaccel │ │ ├── placer_constrs.xdc │ │ ├── open_waves.tcl │ │ ├── kernel.xml │ │ ├── Makefile │ │ ├── utils.mk │ │ ├── tcl │ │ │ ├── gen_xo.tcl │ │ │ └── package_kernel.tcl │ │ ├── vdf_counter.sv │ │ ├── vdf_wrapper.sv │ │ ├── Makefile.sdaccel │ │ ├── vdf_kernel.sv │ │ └── vdf.v │ ├── input.vc │ ├── verilator.mk │ ├── gen_test.py │ ├── modular_square_wrapper.sv │ ├── multiplier.mk │ ├── Makefile │ └── msu.sv ├── scripts │ ├── sdaccel_env.sh │ ├── simulation_setup.sh │ └── f1_setup.sh ├── sw │ ├── Config.h │ ├── MSUVerilator.hpp │ ├── MSUSDAccel.hpp │ ├── MSU.hpp │ ├── MSU.cpp │ ├── MSUVerilatorDirect.cpp │ ├── MSUSDAccel.cpp │ ├── main.cpp │ ├── MSUVerilator.cpp │ └── Squarer.hpp └── Makefile ├── docs ├── interface_timing.png ├── generate_modulus.md ├── verilator.md ├── onprem.md ├── test_portal.md └── aws_f1.md ├── FPGA_Competition_Application_Form.pdf ├── FPGA_Competition_Official_Rules_and_Disclosures.pdf ├── primitives ├── README.md ├── rtl │ ├── full_adder.sv │ ├── multiplier.sv │ ├── carry_save_adder.sv │ ├── carry_save_adder_tree_level.sv │ ├── compressor_tree_3_to_2.sv │ └── multiply.sv └── model │ └── primitives.py ├── submission_form.txt ├── .gitignore ├── modular_square ├── model │ └── vdf_basic.py └── rtl │ └── modular_square_simple.sv └── LICENSE /vdf_portal_config.json: -------------------------------------------------------------------------------- 1 | { 2 | "target": "liveness" 3 | } 4 | -------------------------------------------------------------------------------- /msu/rtl/vivado_ozturk/generate.sh: 
-------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | vivado -source msu.tcl -mode batch 4 | 5 | -------------------------------------------------------------------------------- /msu/rtl/vivado_simple/generate.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | vivado -source msu.tcl -mode batch 4 | 5 | -------------------------------------------------------------------------------- /docs/interface_timing.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/supranational/vdf-fpga/HEAD/docs/interface_timing.png -------------------------------------------------------------------------------- /FPGA_Competition_Application_Form.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/supranational/vdf-fpga/HEAD/FPGA_Competition_Application_Form.pdf -------------------------------------------------------------------------------- /FPGA_Competition_Official_Rules_and_Disclosures.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/supranational/vdf-fpga/HEAD/FPGA_Competition_Official_Rules_and_Disclosures.pdf -------------------------------------------------------------------------------- /msu/rtl/input_simple.vc: -------------------------------------------------------------------------------- 1 | // This file typically lists flags required by a large project, e.g. include directories 2 | +librescan +libext+.v+.sv+.vh+.svh -y . 
3 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/placer_constrs.xdc: -------------------------------------------------------------------------------- 1 | add_cells_to_pblock [get_pblocks pblock_dynamic_SLR2] [get_cells [list {WRAPPER_INST/CL/vdf_1/inst/inst_wrapper/inst_kernel/msu/modsqr/modsqr}]] 2 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/open_waves.tcl: -------------------------------------------------------------------------------- 1 | current_fileset 2 | open_wave_database obj_hw_emu/xilinx_aws-vu9p-f1-04261818_dynamic_5_0-0-vdf.hw_emu.xilinx_aws-vu9p-f1-04261818_dynamic_5_0.wdb 3 | 4 | -------------------------------------------------------------------------------- /msu/rtl/input.vc: -------------------------------------------------------------------------------- 1 | // This file typically lists flags required by a large project, e.g. include directories 2 | +librescan +libext+.v+.sv+.vh+.svh -y . -y ./obj_dir/mem -y ../../primitives/rtl -y ../rtl -y ../../modular_square/rtl 3 | -------------------------------------------------------------------------------- /docs/generate_modulus.md: -------------------------------------------------------------------------------- 1 | 2 | To generate a new RSA modulus: 3 | ``` 4 | openssl genrsa -out mykey.pem 1024 5 | openssl rsa -in mykey.pem -pubout > mykey.pub 6 | openssl rsa -pubin -modulus -noout -in mykey.pub 7 | rm mykey.pem 8 | rm mykey.pub 9 | ``` 10 | -------------------------------------------------------------------------------- /primitives/README.md: -------------------------------------------------------------------------------- 1 | This repository contains low level arithmetic primitives in RTL that can be used for FPGA or ASIC based designs. 2 | 3 | Multiply is a fully parameterized polynomial multiplier with configurable polynomial degree and coefficient bit-width. 
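As a rough software illustration of what a polynomial multiplier with a configurable degree and coefficient width computes, the sketch below splits two integers into fixed-width coefficients, multiplies them schoolbook style, and recombines the result. The helper names (`to_coeffs`, `poly_multiply`, `from_coeffs`) and the chosen word length are assumptions made for this example; they are not taken from `multiply.sv` or `primitives.py`.

```python
def to_coeffs(value, num_coeffs, word_len):
    """Split an integer into num_coeffs coefficients of word_len bits each (little endian)."""
    mask = (1 << word_len) - 1
    return [(value >> (i * word_len)) & mask for i in range(num_coeffs)]


def poly_multiply(a, b):
    """Schoolbook polynomial product; coefficients are left unreduced (no carry propagation)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def from_coeffs(coeffs, word_len):
    """Evaluate the polynomial at 2**word_len to recover the integer it represents."""
    return sum(c << (i * word_len) for i, c in enumerate(coeffs))


# Hypothetical parameters: 4 coefficients of 16 bits each
WORD_LEN = 16
NUM_COEFFS = 4
a = 0x123456789ABCDEF0
b = 0x0FEDCBA987654321
product = poly_multiply(to_coeffs(a, NUM_COEFFS, WORD_LEN),
                        to_coeffs(b, NUM_COEFFS, WORD_LEN))
assert from_coeffs(product, WORD_LEN) == a * b
```

The product coefficients are wider than the inputs, which is why hardware designs of this kind typically follow the multiply with carry-save accumulation before any final carry propagation.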
4 | 5 | These were created as part of the https://www.cryptophage.com/ project. 6 | -------------------------------------------------------------------------------- /submission_form.txt: -------------------------------------------------------------------------------- 1 | VDF FPGA Competition Submission Form 2 | 3 | To submit your design: 4 | - submit an official team entry form 5 | - fill in the fields below 6 | - create your final commit with git signoff: 7 | git commit -s -m "round 1 entry" 8 | - email your final repo + commit to hello@vdfalliance.org 9 | 10 | Team name: 11 | Expected result (avg ns/square): 12 | Design documentation (below): 13 | 14 | -------------------------------------------------------------------------------- /msu/rtl/vivado_ozturk/msu.srcs/constrs_1/new/user.xdc: -------------------------------------------------------------------------------- 1 | 2 | create_clock -period 10.000 -name ap_clk -waveform {0.000 5.000} [get_ports ap_clk] 3 | 4 | create_pblock sl_exclusion 5 | resize_pblock [get_pblocks sl_exclusion] -add {CLOCKREGION_X4Y0:CLOCKREGION_X5Y9} 6 | set_property EXCLUDE_PLACEMENT 1 [get_pblocks sl_exclusion] 7 | create_pblock SLR2 8 | add_cells_to_pblock [get_pblocks SLR2] [get_cells -quiet [list inst_wrapper/inst_kernel/msu/modsqr/modsqr]] 9 | resize_pblock [get_pblocks SLR2] -add {CLOCKREGION_X0Y10:CLOCKREGION_X5Y14} 10 | -------------------------------------------------------------------------------- /msu/rtl/vivado_simple/msu.srcs/constrs_1/new/user.xdc: -------------------------------------------------------------------------------- 1 | 2 | create_clock -period 10.000 -name ap_clk -waveform {0.000 5.000} [get_ports ap_clk] 3 | 4 | create_pblock sl_exclusion 5 | resize_pblock [get_pblocks sl_exclusion] -add {CLOCKREGION_X4Y0:CLOCKREGION_X5Y9} 6 | set_property EXCLUDE_PLACEMENT 1 [get_pblocks sl_exclusion] 7 | create_pblock SLR2 8 | add_cells_to_pblock [get_pblocks SLR2] [get_cells -quiet [list 
inst_wrapper/inst_kernel/msu/modsqr/modsqr]] 9 | resize_pblock [get_pblocks SLR2] -add {CLOCKREGION_X0Y10:CLOCKREGION_X5Y14} 10 | -------------------------------------------------------------------------------- /msu/scripts/sdaccel_env.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # source this script to set up the environment for F1 development 4 | 5 | hostname|grep ec2 > /dev/null 6 | if [ $? == 0 ] 7 | then 8 | echo "Setting up an EC2 environment..." 9 | 10 | else 11 | echo "Setting up an on-premise environment..." 12 | 13 | export XILINX_SDX=/tools/Xilinx/SDx/2018.3 14 | PATH=$PATH:$XILINX_SDX/bin 15 | export AWS_FPGA_REPO_DIR=~/src/project_data/aws-fpga 16 | fi 17 | 18 | git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR 19 | pushd $AWS_FPGA_REPO_DIR 20 | 21 | # The following will require sudo 22 | source sdaccel_setup.sh 23 | 24 | popd 25 | -------------------------------------------------------------------------------- /msu/scripts/simulation_setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | grep Ubuntu /etc/os-release > /dev/null 4 | if [ $? == 0 ] 5 | then 6 | # Ubuntu 7 | echo "Running Ubuntu setup..." 8 | 9 | sudo apt update -y 10 | sudo apt install -y python3 libgmp-dev gtkwave 11 | 12 | wget https://www.veripool.org/ftp/verilator-4.016.tgz 13 | sudo apt-get install -y make autoconf g++ flex bison 14 | tar xvzf verilator*.t*gz 15 | cd verilator-4.016/ 16 | ./configure 17 | make -j 4 18 | sudo make install 19 | 20 | else 21 | # Assume CentOS 22 | echo "Running CentOS setup..." 
23 | sudo yum update -y 24 | sudo yum install -y gmp-devel verilator python36 gtkwave 25 | fi 26 | 27 | export PATH=/tools/Xilinx/Vivado/2018.3/bin:$PATH 28 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | **~ 2 | **__pycache__ 3 | **logs 4 | **obj_dir 5 | **obj 6 | **\.dat 7 | **msuconfig.vh 8 | 9 | **vivado_*backup.jou 10 | **vivado_*backup.log 11 | **vivado.jou 12 | **vivado.log 13 | **vivado_pid*.str 14 | 15 | msu/rtl/vivado_ozturk/msu 16 | msu/rtl/vivado_ozturk/test.txt 17 | msu/rtl/vivado_ozturk/msu.cache 18 | msu/rtl/vivado_ozturk/msu.hw 19 | msu/rtl/vivado_ozturk/msu.ip_user_files 20 | msu/rtl/vivado_ozturk/msu.runs 21 | msu/rtl/vivado_ozturk/msu.srcs/mem 22 | msu/rtl/vivado_ozturk/msu.sim 23 | 24 | msu/rtl/vivado_simple/msu 25 | msu/rtl/vivado_simple/test.txt 26 | msu/rtl/vivado_simple/msu.cache 27 | msu/rtl/vivado_simple/msu.hw 28 | msu/rtl/vivado_simple/msu.ip_user_files 29 | msu/rtl/vivado_simple/msu.runs 30 | msu/rtl/vivado_simple/msu.srcs/mem 31 | msu/rtl/vivado_simple/msu.sim 32 | -------------------------------------------------------------------------------- /msu/scripts/f1_setup.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Run this script to setup newly instantiated hosts 4 | 5 | # Install simulation dependencies 6 | sudo yum update -y 7 | sudo yum install -y gmp-devel verilator python36 8 | 9 | # Install the aws-fpga repo 10 | git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR; 11 | cd $AWS_FPGA_REPO_DIR && git pull; 12 | source $AWS_FPGA_REPO_DIR/sdaccel_setup.sh 13 | 14 | # Install VNC (optional, but provides a richer working environment) 15 | sudo yum -y install tigervnc-server tigervnc-server-minimal 16 | sudo yum -y groupinstall X11 17 | sudo yum --enablerepo=epel -y groups install "Xfce" 18 | sudo yum -y install kdiff3 19 | sudo yum 
-y install emacs 20 | 21 | cd 22 | mkdir .vnc 23 | cd .vnc 24 | cat <<EOF > xstartup 25 | #!/bin/bash 26 | startxfce4 & 27 | EOF 28 | chmod +x xstartup 29 | 30 | -------------------------------------------------------------------------------- /msu/rtl/vivado_simple/run_vivado.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -e 4 | 5 | # Configuration 6 | export MOD_LEN=1024 7 | export SIMPLE_SQ=1 8 | MODEL=msu 9 | 10 | # Set current directory to the location of this script 11 | SCRIPT=$(dirname "$0") 12 | SCRIPTPATH=$(realpath "$SCRIPT") 13 | cd $SCRIPTPATH 14 | 15 | # Clean up the msuconfig file in rtl so vivado doesn't choose it 16 | # (why is there no way to configure the vivado include path?) 17 | rm -f ../msuconfig.vh 18 | 19 | # Generate a test 20 | ../gen_test.py -c -s $MOD_LEN 21 | 22 | # Generate the Vivado project 23 | if [ ! -d msu ]; then 24 | echo "Generating vivado project" 25 | ./generate.sh 26 | fi 27 | 28 | # Update the project directory to the current dir 29 | #sed 's@\(Project [^ ]\+ [^ ]\+ Path="\)[^\\"]\+@\1'$SCRIPTPATH/$MODEL.xpr'@' $MODEL.xpr > $MODEL.xpr_new 30 | #mv $MODEL.xpr_new $MODEL.xpr 31 | 32 | cd msu 33 | vivado $MODEL.xpr & 34 | -------------------------------------------------------------------------------- /modular_square/model/vdf_basic.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | 3 | from random import getrandbits 4 | 5 | # Competition is for 1024 bits 6 | NUM_BITS = 1024 7 | 8 | NUM_ITERATIONS = 1000 9 | 10 | # Rather than being random each time, we will provide randomly generated values 11 | x = getrandbits(NUM_BITS) 12 | N = 
124066695684124741398798927404814432744698427125735684128131855064976895337309138910015071214657674309443149407457493434579063840841220334555160125016331040933690674569571217337630239191517205721310197608387239846364360850220896772964978569683229449266819903414117058030106528073928633017118689826625594484331 13 | 14 | # t should be small for testing purposes. 15 | # For the final FPGA runs, t will be 2^30 16 | t = NUM_ITERATIONS 17 | 18 | # Iterative modular squaring t times 19 | # This is the function that needs to be optimized on FPGA 20 | for _ in range(t): 21 | x = (x * x) % N 22 | 23 | # Final result is a 1024b value 24 | h = x 25 | print(h) 26 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/kernel.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | -------------------------------------------------------------------------------- /primitives/rtl/full_adder.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | *******************************************************************************/ 16 | 17 | /* 18 | A basic 1-bit full adder 19 | ------- 20 | | FA | 21 | A --> | | --> S 22 | B --> | | 23 | Cin --> | | --> Cout 24 | ------- 25 | */ 26 | 27 | module full_adder 28 | ( 29 | input logic A, 30 | input logic B, 31 | input logic Cin, 32 | output logic Cout, 33 | output logic S 34 | ); 35 | 36 | always_comb begin 37 | S = A ^ B ^ Cin; 38 | Cout = (A & B) | (Cin & (A ^ B)); 39 | end 40 | endmodule 41 | -------------------------------------------------------------------------------- /docs/verilator.md: -------------------------------------------------------------------------------- 1 | # Verilator 2 | 3 | The Ozturk design supports verilator as a simulator. 4 | 5 | While we're big fans of verilator, it unfortunately doesn't support 1024 bit modular squaring using * and %. As a result the default bitwidth for this design when using verilator is 128 bits. We found it can also be finicky with large bitwidths. Unpacked arrays of smaller words seem more stable. 6 | 7 | Enabling verilator takes just a few steps on Ubuntu 18 and AWS F1 CentOS. The setup script requires sudo access to install dependencies. 8 | 9 | ``` 10 | # Install dependencies 11 | ./msu/scripts/simulation_setup.sh 12 | 13 | # Run simulations for both designs 14 | cd msu 15 | make 16 | ``` 17 | 18 | The verilator testbench instantiates the MSU portion of the design as well as the squarer circuit. The MSU connects to the SDAccel interfaces and provides control logic to count the number of iterations, capture the result, and send it back to the host driver. 19 | 20 | Simulating the MSU design is a fast way to iterate, debug, and test before moving on to hardware emulation. 
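When debugging a simulation it can help to have an independent software value to compare against. The sketch below follows the squaring loop from `modular_square/model/vdf_basic.py` at the 128 bit width mentioned above. The function name `iterated_square`, the randomly drawn modulus, and the iteration count are assumptions for illustration; the repository's own test generation (`gen_test.py`) may choose its values differently.

```python
from random import getrandbits


def iterated_square(x, modulus, t):
    """Square x modulo `modulus` t times, mirroring the loop in vdf_basic.py."""
    for _ in range(t):
        x = (x * x) % modulus
    return x


NUM_BITS = 128  # bit width used with verilator, per the text above

# Hypothetical odd modulus with the top bit set, for illustration only
modulus = getrandbits(NUM_BITS) | (1 << (NUM_BITS - 1)) | 1
x = getrandbits(NUM_BITS) % modulus
t = 100

expected = iterated_square(x, modulus, t)

# Cross-check: t squarings are the same as raising x to the power 2**t
assert expected == pow(x, 2 ** t, modulus)
print(hex(expected))
```

The cross-check against `pow` is only a sanity test of the model itself; the value to compare with the simulated `sq_out` is `expected`, computed for the same `x`, modulus, and iteration count that the testbench uses.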
21 | 22 | You can run simulations and view waveforms for a particular design as follows: 23 | ``` 24 | cd msu 25 | 26 | # Simple squarer 27 | make clean; make simple 28 | 29 | # 8 cycle Ozturk squarer 30 | make clean; make ozturk 31 | 32 | # View waveforms 33 | gtkwave rtl/obj_dir/logs/vlt_dump.vcd 34 | ``` 35 | -------------------------------------------------------------------------------- /msu/sw/Config.h: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | */ 16 | 17 | #ifndef _CONFIG_H_ 18 | #define _CONFIG_H_ 19 | 20 | #include <gmp.h> 21 | 22 | #define T_LEN 64 23 | 24 | #define MSU_BYTES_PER_WORD 4 25 | #define MSU_WORD_LEN (MSU_BYTES_PER_WORD*8) 26 | #define EXTRA_ELEMENTS 2 27 | #define NUM_SEGMENTS 4 28 | 29 | // Constants for Ozturk construction 30 | #define REDUNDANT_ELEMENTS 2 31 | #define WORD_LEN 16 32 | 33 | // Use to define size of word on cpp side (1,2,4,8) depending on bit_len 34 | #define BN_BUFFER_SIZE 4 // top.sv BIT_LEN = 17-32 35 | 36 | // Use to create offset when using larger words for bit_len 37 | // Such as when bit_len in top.sv is 17b and is 16b here, offset is 16 38 | #define BN_BUFFER_OFFSET 0 39 | 40 | void bn_shl(mpz_t bn, int bits); 41 | void bn_shr(mpz_t bn, int bits); 42 | void bn_init_mask(mpz_t mask, int bits); 43 | 44 | #endif 45 | -------------------------------------------------------------------------------- /msu/rtl/vivado_ozturk/run_vivado.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -e 4 | 5 | # Configuration 6 | # If using 128 bits be sure to change tb.sv as well. 7 | export MOD_LEN=1024 8 | MODEL=msu 9 | OBJ=../sdaccel/obj_vivado 10 | 11 | # Set current directory to the location of this script 12 | SCRIPT=$(dirname "$0") 13 | SCRIPTPATH=$(realpath "$SCRIPT") 14 | cd $SCRIPTPATH 15 | 16 | # Clean up the msuconfig file in rtl so vivado doesn't choose it 17 | # (why is there no way to configure the vivado include path?) 18 | rm -f ../msuconfig.vh 19 | 20 | # Generate a test 21 | # msuconfig.vh from this script will be replaced with the one from 22 | # Makefile.sdaccel. 
23 | ../gen_test.py -s ${MOD_LEN} 24 | rm -f msu.srcs/msuconfig.vh 25 | 26 | # Build dependencies 27 | mkdir -p ${MODEL}.srcs 28 | rm -fr ${OBJ} 29 | mkdir -p ${OBJ} 30 | 31 | # Delete any old files first to ensure the generated ones are up to date 32 | TARGETS="msuconfig.vh mem/reduction_lut_000.dat" 33 | export MODSQR_DIR=../../../../../modular_square 34 | DIRECT_TB=1 make -C ${OBJ} -f ../../multiplier.mk ${TARGETS} 35 | 36 | # Copy the ROM files into the src directory. 37 | cp ${OBJ}/msuconfig.vh ${MODEL}.srcs 38 | cp -r ${OBJ}/mem ${MODEL}.srcs 39 | rm -fr ${OBJ} 40 | 41 | # Generate the Vivado project 42 | if [ ! -d msu ]; then 43 | echo "Generating vivado project" 44 | ./generate.sh 45 | fi 46 | 47 | # Update the project directory to the current dir 48 | #sed 's@\(Project [^ ]\+ [^ ]\+ Path="\)[^\\"]\+@\1'$SCRIPTPATH/$MODEL.xpr'@' $MODEL.xpr > $MODEL.xpr_new 49 | #mv $MODEL.xpr_new $MODEL.xpr 50 | 51 | cd msu 52 | vivado $MODEL.xpr & 53 | 54 | 55 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019 Supranational, LLC 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # 16 | 17 | OBJ ?= obj 18 | 19 | # Build the simulation model plus the host. To rebuild the RTL 20 | # after making changes you must 'make clean' first. 
21 | hw_emu_random: 22 | mkdir -p $(OBJ) 23 | RANDOM_MODULUS=1 \ 24 | $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel TARGETS=hw_emu check 25 | 26 | hw_emu: 27 | mkdir -p $(OBJ) 28 | time $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel TARGETS=hw_emu check 29 | 30 | # Build the host software 31 | host: 32 | $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel TARGETS=hw host 33 | 34 | # Build an FPGA model 35 | hw: 36 | mkdir -p $(OBJ) 37 | time $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel TARGETS=hw all 38 | 39 | # Gather files to copy to F1 machine 40 | to_f1: 41 | $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel to_f1 42 | 43 | # Gather the sources needed for sdx for interactive hw emulation debug 44 | sdx: 45 | mkdir -p $(OBJ) 46 | $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel sdx 47 | 48 | clean: 49 | ifdef XILINX_SDX 50 | mkdir -p $(OBJ) 51 | $(MAKE) -C $(OBJ) -f ../Makefile.sdaccel cleanall 52 | endif 53 | rm -fr $(OBJ) 54 | rm -fr $(OBJ)_hw_emu 55 | rm -fr $(OBJ)_hw 56 | 57 | -------------------------------------------------------------------------------- /primitives/rtl/multiplier.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | *******************************************************************************/ 16 | 17 | /* 18 | Parameterized width full multiply 19 | 20 | ------- ---- 21 | A --> | | | FF | 22 | | A * B | --> | | --> P 23 | B --> | | | /\ | 24 | ------- ---- 25 | ^ 26 | clk --------------------| 27 | 28 | 29 | Can be used to represent an FPGA DSP multiplier for unsigned values 30 | */ 31 | 32 | module multiplier 33 | #( 34 | parameter int A_BIT_LEN = 17, 35 | parameter int B_BIT_LEN = 17, 36 | 37 | parameter int MUL_OUT_BIT_LEN = A_BIT_LEN + B_BIT_LEN 38 | ) 39 | ( 40 | input logic clk, 41 | input logic [A_BIT_LEN-1:0] A, 42 | input logic [B_BIT_LEN-1:0] B, 43 | output logic [MUL_OUT_BIT_LEN-1:0] P 44 | ); 45 | 46 | logic [MUL_OUT_BIT_LEN-1:0] P_result; 47 | 48 | always_comb begin 49 | P_result[MUL_OUT_BIT_LEN-1:0] = A[A_BIT_LEN-1:0] * B[B_BIT_LEN-1:0]; 50 | end 51 | 52 | always_ff @(posedge clk) begin 53 | P[MUL_OUT_BIT_LEN-1:0] <= P_result[MUL_OUT_BIT_LEN-1:0]; 54 | end 55 | endmodule 56 | -------------------------------------------------------------------------------- /msu/sw/MSUVerilator.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | */ 16 | 17 | #ifndef _MSU_VERILATOR_H_ 18 | #define _MSU_VERILATOR_H_ 19 | 20 | #include 21 | #include 22 | #include 23 | 24 | // If "verilator --trace" is used, include the tracing class 25 | #if VM_TRACE 26 | # include 27 | #endif 28 | 29 | extern vluint64_t *main_time_singleton; 30 | 31 | class MSUVerilator : public MSUDevice { 32 | public: 33 | Vtb *tb; 34 | VerilatedVcdC* tfp; 35 | 36 | // Current simulation time (64-bit unsigned) 37 | vluint64_t main_time = 0; 38 | 39 | // Watchdog cycle count 40 | uint64_t watchdog; 41 | 42 | mpz_t msu_in; 43 | mpz_t msu_out; 44 | int msu_words_in; 45 | int msu_words_out; 46 | 47 | MSUVerilator(int argc, char** argv); 48 | virtual ~MSUVerilator(); 49 | 50 | virtual void init(MSU *_msu, Squarer *_squarer); 51 | virtual void reset(); 52 | virtual void clock_cycle(); 53 | virtual void compute_job(uint64_t t_start, 54 | uint64_t t_final, 55 | mpz_t sq_in, 56 | mpz_t sq_out); 57 | 58 | void axi_write(mpz_t data, int words); 59 | void axi_read(mpz_t data, int words); 60 | 61 | void pet() { 62 | watchdog = 0; 63 | } 64 | }; 65 | 66 | #endif 67 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/utils.mk: -------------------------------------------------------------------------------- 1 | #+------------------------------------------------------------------------------- 2 | # The following parameters are assigned with default values. 
These parameters can 3 | # be overridden through the make command line 4 | #+------------------------------------------------------------------------------- 5 | 6 | REPORT := no 7 | PROFILE := no 8 | DEBUG := no 9 | 10 | #'estimate' for estimate report generation 11 | #'system' for system report generation 12 | ifneq ($(REPORT), no) 13 | CLFLAGS += --report estimate 14 | CLLDFLAGS += --report system 15 | endif 16 | 17 | #Generates profile summary report 18 | ifeq ($(PROFILE), yes) 19 | LDCLFLAGS += --profile_kernel data:all:all:all 20 | endif 21 | 22 | #Generates debug summary report 23 | ifeq ($(DEBUG), yes) 24 | CLFLAGS += --dk protocol:all:all:all 25 | endif 26 | 27 | #Checks for XILINX_SDX 28 | ifndef XILINX_SDX 29 | $(warning XILINX_SDX variable is not set, please set correctly and rerun) 30 | $(error source msu/scripts/sdaccel_env.sh) 31 | endif 32 | 33 | # sanitize_dsa - create a filesystem friendly name from dsa name 34 | # $(1) - name of dsa 35 | COLON=: 36 | PERIOD=. 37 | UNDERSCORE=_ 38 | sanitize_dsa = $(strip $(subst $(PERIOD),$(UNDERSCORE),$(subst $(COLON),$(UNDERSCORE),$(1)))) 39 | 40 | device2dsa = $(if $(filter $(suffix $(1)),.xpfm),$(shell $(COMMON_REPO)/utility/parsexpmf.py $(1) dsa 2>/dev/null),$(1)) 41 | device2sandsa = $(call sanitize_dsa,$(call device2dsa,$(1))) 42 | device2dep = $(if $(filter $(suffix $(1)),.xpfm),$(dir $(1))/$(shell $(COMMON_REPO)/utility/parsexpmf.py $(1) hw 2>/dev/null) $(1),) 43 | 44 | # Cleaning stuff 45 | RM = rm -f 46 | RMDIR = rm -rf 47 | 48 | ECHO:= @echo 49 | 50 | docs: README.md 51 | 52 | README.md: description.json 53 | $(ABS_COMMON_REPO)/utility/readme_gen/readme_gen.py description.json 54 | 55 | check-devices: 56 | ifndef DEVICE 57 | $(error DEVICE not set. Please set the DEVICE properly and rerun. Run "make help" for more details.) 
58 | endif 59 | -------------------------------------------------------------------------------- /msu/rtl/vivado_simple/tb_behav.wcfg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | clk 25 | clk 26 | 27 | 28 | reset 29 | reset 30 | 31 | 32 | start 33 | start 34 | 35 | 36 | sq_in[1023:0] 37 | sq_in[1023:0] 38 | 39 | 40 | sq_out[1023:0] 41 | sq_out[1023:0] 42 | 43 | 44 | valid 45 | valid 46 | 47 | 48 | -------------------------------------------------------------------------------- /docs/onprem.md: -------------------------------------------------------------------------------- 1 | # SDAccel On-Premise 2 | 3 | It's possible to perform hardware emulation and synthesis on-premise using the flow defined by AWS. 4 | 5 | The steps to enable an on-premise flow are described here: . 6 | 7 | You will need a license for the vu9p in Vivado and for SDAccel. Xilinx offers trial licenses on their website. The licenses should be loaded through the license manager, which is accessed from the Vivado Help menu. 8 | 9 | Host requirements: 32GB of memory is preferred, though 16GB should be sufficient. Single-threaded performance is the main determinant of runtime. 10 | 11 | ## Ubuntu 18 12 | 13 | While Ubuntu 18 is not officially supported, the on-premise flow can be made to work with a few additional changes after installing SDAccel. 
14 | 15 | ``` 16 | # Link to the OS installed version of libstdc++: 17 | cd /tools/Xilinx/SDx/2018.3/lib/lnx64.o/Default/ 18 | mv libstdc++.so.6 libstdc++.so.6_orig 19 | ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 20 | 21 | cd /tools/Xilinx/SDx/2018.3/lib/lnx64.o/Default/ 22 | mv libstdc++.so.6 libstdc++.so.6_orig 23 | ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6 24 | 25 | # After the changes above this should report "ERROR: no card found" 26 | /opt/xilinx/xrt/bin/xbutil validate 27 | 28 | # Some of the python scripts reference /bin/env 29 | cd /bin 30 | sudo ln -s /usr/bin/env 31 | ``` 32 | 33 | ## helloworld 34 | 35 | The `helloworld_ocl` example should now successfully complete: 36 | ``` 37 | source ./msu/scripts/sdaccel_env.sh 38 | cd $AWS_FPGA_REPO_DIR/SDAccel/examples/xilinx/getting_started/host/helloworld_ocl 39 | 40 | # in Makefile, change DEVICE to: 41 | DEVICE := $(AWS_PLATFORM) 42 | 43 | make cleanall; make TARGETS=sw_emu DEVICES=$AWS_PLATFORM check 44 | ``` 45 | 46 | You can now follow the hardware emulation and synthesis flows described in [aws_f1](docs/aws_f1.md). 47 | 48 | To register the image built from on-premise synthesis, first copy the `msu/rtl/obj/xclbin/vdf.hw.xilinx_aws-vu9p-f1-04261818_dynamic_5_0.xclbin` and `host` files to an AWS F1 instance, then run `create_sdaccel_afi.sh`. 49 | -------------------------------------------------------------------------------- /docs/test_portal.md: -------------------------------------------------------------------------------- 1 | # Test portal 2 | 3 | The online test portal dramatically lowers the bar to testing your design in the AWS F1 environment. 4 | 5 | Rather than go through the process of enabling AWS, the F1 environment, etc., you can design, test and tune your multiplier in Vivado and submit it to the portal to make sure the results are what you expect. 
6 | 7 | Once you submit your design, the test portal will clone your repo, run simulation, hardware emulation, synthesis/place and route, and provide the results back to you in an encrypted file on S3. 8 | 9 | ## Usage limitations 10 | 11 | - The portal is not intended for basic testing - you should test and tune your design in Vivado first. 12 | - The script will schedule requests to prevent spamming and provide a level of access/fairness to the teams. 13 | - There will be a time limit of 8 hours for any request. We'll revise this if needed based on usage data. The goal is to balance allowing jobs to complete with fairness and availability to all teams. 14 | 15 | ## API 16 | 17 | Usage: msu/scripts/portal --access KEY [command] 18 | 19 | - --access - secret access key, issued per team. This is a hash of the encryption key. 20 | - command 21 | - list - display pending jobs 22 | - cancel JOBID - cancel a job 23 | - submit repo [options] - submit a repo for processing 24 | - --sim - run simulations 25 | - --hw-emu - run hardware emulation 26 | - --synthesis - run synthesis/pnr 27 | - --email - notification email address 28 | - Each stage runs all preceding stages 29 | 30 | ## Job flow 31 | 32 | 1. The API endpoint will validate the request and use the secret key to authorize the transaction. 33 | 1. Once the job is scheduled the endpoint will dispatch it to a worker, which may be a long running instance, AWS Batch, or some other mechanism. 34 | 1. The worker will instantiate a docker image on a z1d.2xlarge, set up the F1 environment, and run the job. 35 | 1. The worker will gather the results, including log files and reports, create a tarball, and encrypt it with a randomly generated password. 36 | 1. The worker will publish the results on a shared S3 node and send an email notification. 
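The command line described in the API section, together with the rule that each stage runs all preceding stages, can be sketched as follows. This is a hypothetical parser written only from the option list above, not the actual `msu/scripts/portal` script. The stage order (simulation, hardware emulation, synthesis) is inferred from the order in which the portal is described as running them, and the names `stages_to_run`, `build_parser`, and `requested_stages` are invented for the example.

```python
import argparse

# Stage order inferred from the portal description:
# simulation, then hardware emulation, then synthesis/place and route.
STAGES = ["sim", "hw-emu", "synthesis"]


def stages_to_run(requested):
    """Return every stage up to and including the latest requested one."""
    if not requested:
        return []
    last = max(STAGES.index(stage) for stage in requested)
    return STAGES[: last + 1]


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(prog="portal")
    parser.add_argument("--access", required=True, help="secret access key")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("list", help="display pending jobs")
    cancel = sub.add_parser("cancel", help="cancel a job")
    cancel.add_argument("jobid")
    submit = sub.add_parser("submit", help="submit a repo for processing")
    submit.add_argument("repo")
    submit.add_argument("--sim", action="store_true")
    submit.add_argument("--hw-emu", action="store_true", dest="hw_emu")
    submit.add_argument("--synthesis", action="store_true")
    submit.add_argument("--email", help="notification email address")
    return parser


def requested_stages(args):
    """Collect the stage flags that were set on a parsed submit command."""
    flags = {"sim": args.sim, "hw-emu": args.hw_emu, "synthesis": args.synthesis}
    return [stage for stage, enabled in flags.items() if enabled]


# Example: asking only for hardware emulation also schedules simulation
args = build_parser().parse_args(
    ["--access", "KEY", "submit", "REPO_URL", "--hw-emu"]
)
print(stages_to_run(requested_stages(args)))
```

Resolving the stage list in one place keeps the "preceding stages" rule out of the individual job runners, so a worker only needs to iterate over the returned list in order.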
37 | -------------------------------------------------------------------------------- /primitives/rtl/carry_save_adder.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | *******************************************************************************/ 16 | 17 | /* 18 | A parameterized carry save adder (CSA) 19 | Loops through each input bit and feeds a full adder (FA) 20 | -------------------------------- 21 | | CSA | 22 | | for each i in BIT_LEN | 23 | | ------- | 24 | | | FA | | 25 | A[] --> | Ai --> | | --> Si | --> S[] 26 | B[] --> | Bi --> | | | 27 | Cin[] --> | Cini --> | | --> Couti | --> Cout[] 28 | | ------- | 29 | -------------------------------- 30 | */ 31 | 32 | module carry_save_adder 33 | #( 34 | parameter int BIT_LEN = 19 35 | ) 36 | ( 37 | input logic [BIT_LEN-1:0] A, 38 | input logic [BIT_LEN-1:0] B, 39 | input logic [BIT_LEN-1:0] Cin, 40 | output logic [BIT_LEN-1:0] Cout, 41 | output logic [BIT_LEN-1:0] S 42 | ); 43 | 44 | genvar i; 45 | generate 46 | for (i=0; i 21 | #include 22 | #include 23 | 24 | #define KERNEL_NAME "vdf" 25 | 26 | class OpenCLContext { 27 | public: 28 | bool quiet; 29 | 30 | // Host memory for buffers 31 | std::vector> input_buf; 32 | std::vector> output_buf; 33 | int msu_words_in; 34 | int msu_words_out; 35 
| 36 | // OpenCL structures 37 | cl::Context *context; 38 | cl::CommandQueue *q; 39 | cl::Program *program; 40 | cl::Kernel *krnl_vdf; 41 | cl::Buffer *inBuffer; 42 | cl::Buffer *outBuffer; 43 | std::vector inBufferVec; 44 | std::vector outBufferVec; 45 | 46 | OpenCLContext() {} 47 | ~OpenCLContext(); 48 | void init(int msu_words_in, int msu_words_out); 49 | void compute_job(mpz_t msu_out, mpz_t msu_in); 50 | }; 51 | 52 | class MSUSDAccel : public MSUDevice { 53 | OpenCLContext ocl; 54 | public: 55 | mpz_t msu_in; 56 | mpz_t msu_out; 57 | int msu_words_in; 58 | int msu_words_out; 59 | 60 | MSUSDAccel() { 61 | mpz_inits(msu_in, msu_out, 0); 62 | } 63 | virtual ~MSUSDAccel() { 64 | mpz_clears(msu_in, msu_out, 0); 65 | } 66 | 67 | virtual void init(MSU *_msu, Squarer *_squarer); 68 | virtual void compute_job(uint64_t t_start, 69 | uint64_t t_final, 70 | mpz_t sq_in, 71 | mpz_t sq_out); 72 | 73 | virtual void set_quiet(bool _quiet) { 74 | quiet = _quiet; 75 | ocl.quiet = _quiet; 76 | } 77 | }; 78 | 79 | #endif 80 | -------------------------------------------------------------------------------- /primitives/rtl/carry_save_adder_tree_level.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | *******************************************************************************/ 16 | 17 | /* 18 | Group the input terms into sets of three for input into a carry save adder 19 | Shift the CSA carry output by 1 for use in the next level 20 | The sum already has the correct weight, therefore we only pad for consistency 21 | Any leftover terms that did not fit into a set are returned padded 22 | */ 23 | 24 | module carry_save_adder_tree_level 25 | #( 26 | parameter int NUM_ELEMENTS = 3, 27 | parameter int BIT_LEN = 19, 28 | 29 | parameter int NUM_RESULTS = (integer'(NUM_ELEMENTS/3) * 2) + 30 | (NUM_ELEMENTS%3) 31 | ) 32 | ( 33 | input logic [BIT_LEN-1:0] terms[NUM_ELEMENTS], 34 | output logic [BIT_LEN-1:0] results[NUM_RESULTS] 35 | ); 36 | 37 | genvar i; 38 | generate 39 | for (i=0; i<(NUM_ELEMENTS / 3); i++) begin : csa_insts 40 | // Add three consecutive terms 41 | carry_save_adder #(.BIT_LEN(BIT_LEN)) 42 | carry_save_adder ( 43 | .A(terms[i*3]), 44 | .B(terms[(i*3)+1]), 45 | .Cin(terms[(i*3)+2]), 46 | .Cout({results[i*2][0], 47 | results[i*2][BIT_LEN-1:1]}), 48 | .S(results[(i*2)+1][BIT_LEN-1:0]) 49 | ); 50 | end 51 | 52 | // Save any unused terms for the next level 53 | for (i=0; i<(NUM_ELEMENTS % 3); i++) begin : csa_level_extras 54 | always_comb begin 55 | results[(NUM_RESULTS - 1) - i][BIT_LEN-1:0] = 56 | terms[(NUM_ELEMENTS- 1) - i][BIT_LEN-1:0]; 57 | end 58 | end 59 | endgenerate 60 | endmodule 61 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/tcl/gen_xo.tcl: -------------------------------------------------------------------------------- 1 | # /******************************************************************************* 2 | # Copyright (c) 2018, Xilinx, Inc. 3 | # All rights reserved. 4 | # 5 | # Redistribution and use in source and binary forms, with or without modification, 6 | # are permitted provided that the following conditions are met: 7 | # 8 | # 1. 
Redistributions of source code must retain the above copyright notice, 9 | # this list of conditions and the following disclaimer. 10 | # 11 | # 12 | # 2. Redistributions in binary form must reproduce the above copyright notice, 13 | # this list of conditions and the following disclaimer in the documentation 14 | # and/or other materials provided with the distribution. 15 | # 16 | # 17 | # 3. Neither the name of the copyright holder nor the names of its contributors 18 | # may be used to endorse or promote products derived from this software 19 | # without specific prior written permission. 20 | # 21 | # 22 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 25 | # IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | # INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | # BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | # OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | # EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | # 33 | # *******************************************************************************/ 34 | 35 | if { $::argc != 4 } { 36 | puts "ERROR: Program \"$::argv0\" requires 4 arguments!\n" 37 | puts "Usage: $::argv0 \n" 38 | exit 39 | } 40 | 41 | set xoname [lindex $::argv 0] 42 | set krnl_name [lindex $::argv 1] 43 | set target [lindex $::argv 2] 44 | set device [lindex $::argv 3] 45 | 46 | set suffix "${krnl_name}_${target}_${device}" 47 | 48 | #source -notrace ./scripts/package_kernel.tcl 49 | source ../tcl/package_kernel.tcl 50 | 51 | if {[file exists "${xoname}"]} { 52 | file delete -force "${xoname}" 53 | } 54 | 55 | package_xo -xo_path ${xoname} -kernel_name ${krnl_name} -ip_directory ./packaged_kernel_${suffix} -kernel_xml ../kernel.xml 56 | -------------------------------------------------------------------------------- /msu/rtl/vivado_ozturk/tb_behav.wcfg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | clk 25 | clk 26 | 27 | 28 | reset 29 | reset 30 | 31 | 32 | start 33 | start 34 | 35 | 36 | valid 37 | valid 38 | 39 | 40 | modulus[1023:0] 41 | modulus[1023:0] 42 | 43 | 44 | sq_in[1023:0] 45 | sq_in[1023:0] 46 | 47 | 48 | sq_out[2111:0] 49 | sq_out[2111:0] 50 | 51 | 52 | t_start[31:0] 53 | t_start[31:0] 54 | 55 | 56 | t_final[31:0] 57 | t_final[31:0] 58 | 59 | 60 | -------------------------------------------------------------------------------- /msu/Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019 Supranational, LLC 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 
6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # 16 | 17 | SDACCEL_DIR=rtl/sdaccel 18 | 19 | 20 | # Main targets 21 | 22 | all: hw_emu 23 | 24 | # Requires verilator 25 | regress: sim hw_emu_simple judge 26 | 27 | judge: synthesis 28 | 29 | hw_emu_simple: 30 | make clean 31 | OBJ=obj_hw_emu MOD_LEN=1024 SIMPLE_SQ=1 $(MAKE) -C $(SDACCEL_DIR) hw_emu 32 | 33 | hw: 34 | MOD_LEN=1024 SIMPLE_SQ=0 $(MAKE) -C $(SDACCEL_DIR) hw 35 | 36 | 37 | # This target is used by the test portal to perform hardware emulation 38 | hw_emu: 39 | make clean 40 | 41 | @echo "" 42 | @echo "############################################################" 43 | @echo "# Running hardware emulation..." 44 | @echo "############################################################" 45 | OBJ=obj_hw_emu MOD_LEN=1024 SIMPLE_SQ=0 \ 46 | $(MAKE) -C $(SDACCEL_DIR) hw_emu |& tee hw_emu.log 47 | 48 | # This target is used by the test portal to perform synthesis 49 | synthesis: 50 | make clean 51 | 52 | @echo "" 53 | @echo "############################################################" 54 | @echo "# Running hardware emulation..." 55 | @echo "############################################################" 56 | OBJ=obj_hw_emu MOD_LEN=1024 SIMPLE_SQ=0 \ 57 | $(MAKE) -C $(SDACCEL_DIR) hw_emu |& tee hw_emu.log 58 | 59 | @echo "" 60 | @echo "############################################################" 61 | @echo "# Synthesizing..." 
62 | @echo "############################################################" 63 | OBJ=obj_hw MOD_LEN=1024 SIMPLE_SQ=0 \ 64 | $(MAKE) -C $(SDACCEL_DIR) hw |& tee hw.log 65 | 66 | 67 | # Additional, mostly verilator, targets 68 | 69 | sim: simple simple ozturk ozturk 70 | 71 | ozturk: 72 | $(MAKE) clean 73 | $(MAKE) -C rtl run 74 | 75 | simple: 76 | $(MAKE) clean 77 | SIMPLE_SQ=1 $(MAKE) -C rtl run 78 | 79 | 80 | hw_emu_random: 81 | make clean 82 | $(MAKE) -C $(SDACCEL_DIR) hw_emu_random 83 | 84 | 85 | clean: 86 | $(MAKE) -C rtl clean 87 | $(MAKE) -C $(SDACCEL_DIR) clean 88 | 89 | 90 | # These work but seems unnecessary to maintain another testbench variant 91 | # ozturk: 92 | # $(MAKE) clean 93 | # DIRECT_TB=1 $(MAKE) -C rtl run 94 | 95 | # simple: 96 | # $(MAKE) clean 97 | # SIMPLE_SQ=1 DIRECT_TB=1 $(MAKE) -C rtl run 98 | -------------------------------------------------------------------------------- /msu/rtl/gen_test.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python3 2 | 3 | ################################################################################ 4 | # Copyright 2019 Supranational LLC 5 | # 6 | # Licensed under the Apache License, Version 2.0 (the "License"); 7 | # you may not use this file except in compliance with the License. 8 | # You may obtain a copy of the License at 9 | # 10 | # http://www.apache.org/licenses/LICENSE-2.0 11 | # 12 | # Unless required by applicable law or agreed to in writing, software 13 | # distributed under the License is distributed on an "AS IS" BASIS, 14 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 15 | # See the License for the specific language governing permissions and 16 | # limitations under the License. 
17 | ################################################################################ 18 | 19 | import sys 20 | import getopt 21 | 22 | MOD_LEN = 1024 23 | M = None 24 | 25 | # Set to 50k for final regression runs 26 | T_FINAL = 1000 27 | 28 | gen_msuconfig = False 29 | 30 | try: 31 | opts, args = getopt.getopt(sys.argv[1:],"hcM:s:", ["modulus=", "size="]) 32 | except getopt.GetoptError: 33 | print ('gen_test.py -M [-c]') 34 | sys.exit(2) 35 | 36 | for opt, arg in opts: 37 | if opt == '-h': 38 | print ('gen_test.py -M [-c]') 39 | sys.exit() 40 | elif opt in ("-s", "--size"): 41 | MOD_LEN = int(arg) 42 | elif opt in ("-M", "--modulus"): 43 | M = int(arg) 44 | elif opt == "-c": 45 | gen_msuconfig = True 46 | 47 | if MOD_LEN == 128 and M is None: 48 | M = 302934307671667531413257853548643485645 49 | 50 | if MOD_LEN == 1024 and M is None: 51 | # For the Ozturk design this modulus must match what is found in modulus.mk 52 | # since reduction LUTs have to be generated ahead of time. 53 | MOD_LEN = 1024 54 | M = 124066695684124741398798927404814432744698427125735684128131855064976895337309138910015071214657674309443149407457493434579063840841220334555160125016331040933690674569571217337630239191517205721310197608387239846364360850220896772964978569683229449266819903414117058030106528073928633017118689826625594484331 55 | 56 | print("MOD_LEN = %d" % MOD_LEN) 57 | print("MODULUS = %d" % M) 58 | print(" bitlen = %d" % (M.bit_length())) 59 | 60 | f = open('test.txt', 'w') 61 | 62 | sq_in = 2 63 | f.write("%x\n" % sq_in) 64 | 65 | def sqr(t_start, t_final, incr, sq_in): 66 | for i in range(t_start+incr, t_final+1, incr): 67 | for j in range(incr): 68 | sq_in = (sq_in * sq_in) % M 69 | 70 | f.write("%d, %x\n" % (i, sq_in)) 71 | return(i, sq_in) 72 | 73 | (t_curr, sq_in) = sqr(0, 10, 1, sq_in) 74 | (t_curr, sq_in) = sqr(10, T_FINAL, 10, sq_in) 75 | 76 | f.close() 77 | 78 | if gen_msuconfig: 79 | f = open('msu.srcs/msuconfig.vh', 'w') 80 | f.write("`define SIMPLE_SQ 1\n") 81 | 
f.write("`define SQ_IN_BITS_DEF %d\n" % (MOD_LEN)) 82 | f.write("`define SQ_OUT_BITS_DEF %d\n" % (MOD_LEN)) 83 | f.write("`define MOD_LEN_DEF %d\n" % (MOD_LEN)) 84 | f.write("`define MODULUS_DEF %d'h%x\n" % (MOD_LEN, M)) 85 | f.close() 86 | 87 | -------------------------------------------------------------------------------- /msu/sw/MSU.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | */ 16 | 17 | #ifndef _MSU_LIB_H_ 18 | #define _MSU_LIB_H_ 19 | 20 | #include 21 | #include 22 | #include 23 | #include 24 | 25 | 26 | template <typename T> 27 | void bn_to_buffer(mpz_t bn, T *var, size_t words, 28 | bool suppress_warning = false, 29 | bool zero_extra_words = false) { 30 | size_t countp; 31 | mpz_export(var, &countp, -1, BN_BUFFER_SIZE, 0, BN_BUFFER_OFFSET, bn); 32 | if(countp != words) { 33 | if(!suppress_warning) { 34 | printf("WARNING: expected %zu words, got %zu\n", words, countp); 35 | } 36 | if(zero_extra_words) { 37 | for(size_t i = countp; i < words; i++) { 38 | var[i] = 0; 39 | } 40 | } 41 | } 42 | } 43 | 44 | template <typename T> 45 | void bn_from_buffer(mpz_t bn, T *var, size_t words) { 46 | mpz_import(bn, words, -1, BN_BUFFER_SIZE, 0, BN_BUFFER_OFFSET, var); 47 | } 48 | 49 | class MSU; 50 | 51 | class MSUDevice { 52 | protected: 53 | bool quiet; 54 | MSU *msu; 55 | Squarer *squarer; 56 | public: 57 | virtual ~MSUDevice() {} 58 | virtual void init(MSU *_msu, Squarer *_squarer) { 59 | msu = _msu; 60 | squarer = _squarer; 61 | } 62 | virtual void reset() {} 63 | virtual void clock_cycle() {} 64 | virtual void compute_job(uint64_t t_start, 65 | uint64_t t_final, 66 | mpz_t sq_in, 67 | mpz_t sq_out) = 0; 68 | virtual void set_quiet(bool _quiet) { 69 | quiet = _quiet; 70 | } 71 | }; 72 | 73 | class MSU { 74 | public: 75 | gmp_randstate_t rand_state; 76 | 77 | int mod_len; 78 | mpz_t modulus; 79 | 80 | int num_elements; 81 | 82 | mpz_t sq_in; 83 | mpz_t sq_out; 84 | uint64_t t_start; 85 | uint64_t t_final; 86 | 87 | uint64_t compute_time; 88 | 89 | bool quiet; 90 | 91 | MSUDevice &device; 92 | 93 | MSU(MSUDevice &_d, int mod_len, mpz_t _modulus); 94 | virtual ~MSU(); 95 | 96 | int run_fixed(uint64_t t_start, uint64_t t_final, mpz_t sq_in, 97 | bool check); 98 | int run_random(uint64_t t_start, uint64_t t_final, bool rrandom, 99 | bool check); 100 | void prepare_random_job(bool rrandom); 101 | void compute_job(); 102 | int check_job(); 103 | 104 | void 
set_quiet(bool _quiet) { 105 | quiet = _quiet; 106 | } 107 | }; 108 | #endif 109 | -------------------------------------------------------------------------------- /primitives/model/primitives.py: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019 Supranational, LLC 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # 16 | 17 | def int_to_bits(x, bit_len): 18 | return [x >> i & 1 for i in range(0, bit_len)] 19 | 20 | def bits_to_int(x): 21 | y = 0 22 | for i, b in enumerate(x): 23 | y = (b << i) | y 24 | return y 25 | 26 | # Full Adder 27 | def fa(A, B, Cin): 28 | S = A ^ B ^ Cin 29 | Cout = (A & B) | (Cin & (A ^ B)) 30 | return Cout, S 31 | 32 | # Carry Save Adder 33 | def csa(A, B, Cin, bit_len): 34 | Cout = bit_len*[0] 35 | S = bit_len*[0] 36 | for i in range(bit_len): 37 | Cout[i], S[i] = fa(A[i], B[i], Cin[i]) 38 | return Cout, S 39 | 40 | # One level of the compressor tree 41 | def csa_level(terms, bit_len): 42 | num_results = len(terms)//3 43 | 44 | result_terms = [] 45 | 46 | # Feed three consecutive terms to a CSA 47 | for i in range(2, len(terms), 3): 48 | cout, s = csa(terms[i-2], terms[i-1], terms[i], bit_len) 49 | # Need to shift carry 1 bit 50 | cout.insert(0,0) 51 | s.append(0) 52 | result_terms.append(cout) 53 | result_terms.append(s) 54 | 55 | # Push any leftover terms not fed to a CSA to the next level 56 | for i in range(len(terms)%3): 57 | temp_term = 
terms[(len(terms)-1)-i] 58 | temp_term.append(0) 59 | result_terms.append(temp_term) 60 | 61 | return result_terms 62 | 63 | # 3:2 compressor tree 64 | def compressor_tree(terms, bit_len): 65 | if (len(terms) == 3): 66 | cout, s = csa(terms[0], terms[1], terms[2], bit_len) 67 | cout.insert(0,0) 68 | s.append(0) 69 | else: 70 | next_level_terms = csa_level(terms, bit_len) 71 | cout, s = compressor_tree(next_level_terms, bit_len+1) 72 | 73 | return cout, s 74 | 75 | # Multiplier 76 | def multiplier(A, B): 77 | P = A * B 78 | return P 79 | 80 | def multiply(A, B, NUM_ELEMENTS, COL_BIT_LEN, WORD_LEN): 81 | mul_result = (NUM_ELEMENTS*NUM_ELEMENTS)*[0] 82 | for i in range (NUM_ELEMENTS): 83 | for j in range(NUM_ELEMENTS): 84 | mul_result[(NUM_ELEMENTS*i)+j] = multiplier(A[i], B[j]) 85 | 86 | # grid[col][row] 87 | grid = [[0 for x in range(NUM_ELEMENTS*2)] for y in range(NUM_ELEMENTS*2)] 88 | for i in range (NUM_ELEMENTS): 89 | for j in range(NUM_ELEMENTS): 90 | grid[i+j][2*i] = mul_result[(NUM_ELEMENTS*i)+j] & \ 91 | (pow(2,WORD_LEN)-1) 92 | grid[i+j+1][(2*i)+1] = (mul_result[(NUM_ELEMENTS*i)+j] >> WORD_LEN) & \ 93 | (pow(2,COL_BIT_LEN)-1) 94 | 95 | cout = (NUM_ELEMENTS*2)*[0] 96 | s = (NUM_ELEMENTS*2)*[0] 97 | 98 | cout[0] = 0 99 | cout[(NUM_ELEMENTS*2)-1] = 0 100 | 101 | s[0] = grid[0][0] 102 | s[(NUM_ELEMENTS*2)-1] = grid[(NUM_ELEMENTS*2)-1][(NUM_ELEMENTS*2)-1] 103 | 104 | for i in range (1, (NUM_ELEMENTS*2)-1): 105 | grid_bits = [] 106 | for g in grid[i]: 107 | grid_bits.append(int_to_bits(g, COL_BIT_LEN)) 108 | 109 | result = compressor_tree(grid_bits, COL_BIT_LEN) 110 | cout[i] = bits_to_int(result[0]) 111 | s[i] = bits_to_int(result[1]) 112 | 113 | return cout, s 114 | 115 | -------------------------------------------------------------------------------- /modular_square/rtl/modular_square_simple.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 
| Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | *******************************************************************************/ 16 | 17 | `include "msuconfig.vh" 18 | 19 | // Set a default modulus and bitwidth but allow them to be defined 20 | // externally as well. 21 | `ifndef MOD_LEN_DEF 22 | `define MOD_LEN_DEF 1024 23 | `endif 24 | `ifndef MODULUS_DEF 25 | `define MODULUS_DEF 1024'd124066695684124741398798927404814432744698427125735684128131855064976895337309138910015071214657674309443149407457493434579063840841220334555160125016331040933690674569571217337630239191517205721310197608387239846364360850220896772964978569683229449266819903414117058030106528073928633017118689826625594484331 26 | `endif 27 | 28 | module modular_square_simple 29 | #( 30 | parameter int MOD_LEN = `MOD_LEN_DEF 31 | ) 32 | ( 33 | input logic clk, 34 | input logic reset, 35 | input logic start, 36 | input logic [MOD_LEN-1:0] sq_in, 37 | output logic [MOD_LEN-1:0] sq_out, 38 | output logic valid 39 | ); 40 | 41 | localparam [MOD_LEN-1:0] MODULUS = `MODULUS_DEF; 42 | 43 | logic [MOD_LEN-1:0] cur_sq_in; 44 | logic [MOD_LEN*2-1:0] squared; 45 | logic [MOD_LEN-1:0] sq_out_comb; 46 | 47 | // Mimic a pipeline 48 | localparam [3:0] PIPELINE_DEPTH = 10; 49 | logic [3:0] valid_count; 50 | logic running; 51 | logic valid_next; 52 | 53 | // Store the square input, circulate the result back to the input 54 | always_ff @(posedge clk) begin 55 | 
if(start) begin 56 | cur_sq_in <= sq_in; 57 | end else if(valid_next) begin 58 | cur_sq_in <= sq_out_comb; 59 | end 60 | end 61 | assign sq_out = valid ? cur_sq_in : {MOD_LEN{1'bx}}; 62 | 63 | // Control 64 | always_ff @(posedge clk) begin 65 | if(reset) begin 66 | running <= 0; 67 | valid_count <= 0; 68 | end else begin 69 | if(start || valid_next) begin 70 | running <= 1; 71 | valid_count <= 0; 72 | end else begin 73 | valid_count <= valid_count + 1; 74 | end 75 | end 76 | end 77 | 78 | assign valid_next = running && (valid_count == PIPELINE_DEPTH-1); 79 | always_ff @(posedge clk) begin 80 | valid <= valid_next; 81 | end 82 | 83 | //---------------------------------------------------------------------- 84 | // EDIT HERE 85 | // Insert/instantiate your multiplier below 86 | // Modify control above as needed while satisfying the interface 87 | // 88 | 89 | // Compute the modular square function 90 | always_comb begin 91 | squared = {{MOD_LEN{1'b0}}, cur_sq_in}; 92 | squared = squared * squared; 93 | squared = squared % {{MOD_LEN{1'b0}}, MODULUS}; 94 | sq_out_comb = squared[MOD_LEN-1:0]; 95 | end 96 | 97 | // EDIT HERE 98 | //---------------------------------------------------------------------- 99 | 100 | endmodule 101 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/tcl/package_kernel.tcl: -------------------------------------------------------------------------------- 1 | # /******************************************************************************* 2 | # Copyright (c) 2018, Xilinx, Inc. 3 | # All rights reserved. 4 | # 5 | # Redistribution and use in source and binary forms, with or without modification, 6 | # are permitted provided that the following conditions are met: 7 | # 8 | # 1. Redistributions of source code must retain the above copyright notice, 9 | # this list of conditions and the following disclaimer. 10 | # 11 | # 12 | # 2. 
Redistributions in binary form must reproduce the above copyright notice, 13 | # this list of conditions and the following disclaimer in the documentation 14 | # and/or other materials provided with the distribution. 15 | # 16 | # 17 | # 3. Neither the name of the copyright holder nor the names of its contributors 18 | # may be used to endorse or promote products derived from this software 19 | # without specific prior written permission. 20 | # 21 | # 22 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 25 | # IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | # INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | # BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | # OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | # EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | # 33 | # *******************************************************************************/ 34 | 35 | set root_path ../../../.. 
36 | set msu_path $root_path/msu/rtl 37 | set sdaccel_path $root_path/msu/rtl/sdaccel 38 | set primitives_path $root_path/primitives/rtl 39 | set modsqr_path $root_path/modular_square/rtl 40 | 41 | set path_to_packaged "./packaged_kernel_${suffix}" 42 | set path_to_tmp_project "./tmp_kernel_pack_${suffix}" 43 | 44 | create_project -force kernel_pack $path_to_tmp_project 45 | 46 | add_files -norecurse [glob $msu_path/msu.sv] 47 | add_files -norecurse [glob $msu_path/modular_square_wrapper.sv] 48 | add_files -norecurse [glob msuconfig.vh] 49 | add_files -norecurse [glob mem/*.dat] 50 | add_files -norecurse [glob mem/reduction_lut.sv] 51 | add_files -norecurse [glob $sdaccel_path/*.sv] 52 | add_files -norecurse [glob $sdaccel_path/*.v] 53 | add_files -norecurse [glob $primitives_path/*.sv] 54 | add_files -norecurse [glob $modsqr_path/*.sv] 55 | 56 | set_property top ${krnl_name} [current_fileset] 57 | 58 | update_compile_order -fileset sources_1 59 | update_compile_order -fileset sim_1 60 | ipx::package_project -root_dir $path_to_packaged -vendor xilinx.com -library RTLKernel -taxonomy /KernelIP -import_files -set_current false 61 | ipx::unload_core $path_to_packaged/component.xml 62 | ipx::edit_ip_in_project -upgrade true -name tmp_edit_project -directory $path_to_packaged $path_to_packaged/component.xml 63 | set_property core_revision 2 [ipx::current_core] 64 | foreach up [ipx::get_user_parameters] { 65 | ipx::remove_user_parameter [get_property NAME $up] [ipx::current_core] 66 | } 67 | set_property sdx_kernel true [ipx::current_core] 68 | set_property sdx_kernel_type rtl [ipx::current_core] 69 | ipx::create_xgui_files [ipx::current_core] 70 | ipx::associate_bus_interfaces -busif m00_axi -clock ap_clk [ipx::current_core] 71 | ipx::associate_bus_interfaces -busif s_axi_control -clock ap_clk [ipx::current_core] 72 | set_property xpm_libraries {XPM_CDC XPM_MEMORY XPM_FIFO} [ipx::current_core] 73 | set_property supported_families { } [ipx::current_core] 74 | 
set_property auto_family_support_level level_2 [ipx::current_core] 75 | ipx::update_checksums [ipx::current_core] 76 | ipx::save_core [ipx::current_core] 77 | close_project -delete 78 | -------------------------------------------------------------------------------- /primitives/rtl/compressor_tree_3_to_2.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | *******************************************************************************/ 16 | 17 | /* 18 | Tree built out of 3:2 compressors. 
19 | Parameterized to take any number of inputs, each of a common size 20 | */ 21 | 22 | module compressor_tree_3_to_2 23 | #( 24 | parameter int NUM_ELEMENTS = 9, 25 | parameter int BIT_LEN = 16 26 | ) 27 | ( 28 | input logic [BIT_LEN-1:0] terms[NUM_ELEMENTS], 29 | output logic [BIT_LEN-1:0] C, 30 | output logic [BIT_LEN-1:0] S 31 | ); 32 | 33 | `ifdef FASTSIM 34 | // This is intended for simulation only to improve compile and run time 35 | always_comb begin 36 | C = 0; 37 | S = 0; 38 | for(int k = 0; k < NUM_ELEMENTS; k++) begin 39 | S += terms[k]; 40 | end 41 | end 42 | 43 | `else 44 | 45 | // If there is only one or two elements, then return the input (no tree) 46 | // If there are three elements, this is the last level in the tree 47 | // For greater than three elements: 48 | // Instantiate a set of carry save adders to process this level's terms 49 | // Recursive instantiate this module to complete the rest of the tree 50 | generate 51 | if (NUM_ELEMENTS == 1) begin // Return value 52 | always_comb begin 53 | C[BIT_LEN-1:0] = '0; 54 | S[BIT_LEN-1:0] = terms[0]; 55 | end 56 | end 57 | else if (NUM_ELEMENTS == 2) begin // Return value 58 | always_comb begin 59 | C[BIT_LEN-1:0] = terms[1]; 60 | S[BIT_LEN-1:0] = terms[0]; 61 | end 62 | end 63 | else if (NUM_ELEMENTS == 3) begin // last level 64 | /* verilator lint_off UNUSED */ 65 | logic [BIT_LEN-1:0] Cout; 66 | /* verilator lint_on UNUSED */ 67 | 68 | carry_save_adder #(.BIT_LEN(BIT_LEN)) 69 | carry_save_adder ( 70 | .A(terms[0]), 71 | .B(terms[1]), 72 | .Cin(terms[2]), 73 | .Cout(Cout), 74 | .S(S[BIT_LEN-1:0]) 75 | ); 76 | always_comb begin 77 | C[BIT_LEN-1:0] = {Cout[BIT_LEN-2:0], 1'b0}; 78 | end 79 | end 80 | else begin 81 | //localparam integer NUM_RESULTS = ($rtoi($floor(NUM_ELEMENTS/3)) * 2) + 82 | // (NUM_ELEMENTS%3); 83 | localparam integer NUM_RESULTS = (integer'(NUM_ELEMENTS/3) * 2) + 84 | (NUM_ELEMENTS%3); 85 | 86 | logic [BIT_LEN-1:0] next_level_terms[NUM_RESULTS]; 87 | 88 | 
carry_save_adder_tree_level #(.NUM_ELEMENTS(NUM_ELEMENTS), 89 | .BIT_LEN(BIT_LEN) 90 | ) 91 | carry_save_adder_tree_level ( 92 | .terms(terms), 93 | .results(next_level_terms) 94 | ); 95 | 96 | compressor_tree_3_to_2 #(.NUM_ELEMENTS(NUM_RESULTS), 97 | .BIT_LEN(BIT_LEN) 98 | ) 99 | compressor_tree_3_to_2 ( 100 | .terms(next_level_terms), 101 | .C(C), 102 | .S(S) 103 | ); 104 | end 105 | endgenerate 106 | `endif 107 | endmodule 108 | -------------------------------------------------------------------------------- /msu/rtl/modular_square_wrapper.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | *******************************************************************************/ 16 | 17 | // Pipe the modular squaring circuit IOs to relieve timing pressure. 
18 | `ifndef MOD_LEN_DEF 19 | `define MOD_LEN_DEF 1024 20 | `endif 21 | 22 | module modular_square_wrapper 23 | #( 24 | parameter int MOD_LEN = `MOD_LEN_DEF, 25 | 26 | parameter int WORD_LEN = 16, 27 | parameter int REDUNDANT_ELEMENTS = 2, 28 | parameter int NONREDUNDANT_ELEMENTS = MOD_LEN / WORD_LEN, 29 | parameter int NUM_ELEMENTS = REDUNDANT_ELEMENTS + 30 | NONREDUNDANT_ELEMENTS, 31 | // Send the coefficients out in 32 bits - somewhat inefficient use 32 | // of space but not timing critical and easier to read/debug 33 | parameter int SQ_OUT_BITS = NUM_ELEMENTS * WORD_LEN*2 34 | ) 35 | ( 36 | input logic clk, 37 | input logic reset, 38 | input logic start, 39 | input logic [MOD_LEN-1:0] sq_in, 40 | output logic [SQ_OUT_BITS-1:0] sq_out, 41 | output logic valid 42 | ); 43 | 44 | localparam int BIT_LEN = 17; 45 | localparam int IO_STAGES = 3; 46 | 47 | logic start_stages[IO_STAGES]; 48 | logic [BIT_LEN-1:0] sq_in_stages[IO_STAGES][NUM_ELEMENTS]; 49 | logic [BIT_LEN-1:0] sq_out_stages[IO_STAGES][NUM_ELEMENTS]; 50 | logic valid_stages[IO_STAGES]; 51 | 52 | genvar j; 53 | 54 | always_ff @(posedge clk) begin 55 | start_stages[0] <= start; 56 | end 57 | 58 | // Split sq_in into polynomial coefficients 59 | generate 60 | for(j = 0; j < NONREDUNDANT_ELEMENTS; j++) begin 61 | always @(posedge clk) begin 62 | sq_in_stages[0][j] <= {{(BIT_LEN-WORD_LEN){1'b0}}, 63 | sq_in[j*WORD_LEN +: WORD_LEN]}; 64 | end 65 | end 66 | // Clear the redundant coefficients 67 | for(j = NONREDUNDANT_ELEMENTS; j < NUM_ELEMENTS; j++) begin 68 | always @(posedge clk) begin 69 | sq_in_stages[0][j] <= 0; 70 | end 71 | end 72 | endgenerate 73 | 74 | // Gather the output coefficients into sq_out 75 | generate 76 | for(j = 0; j < NUM_ELEMENTS; j++) begin 77 | always_comb begin 78 | sq_out[j*WORD_LEN*2 +: 2*WORD_LEN] = {{(2*WORD_LEN-BIT_LEN){1'b0}}, 79 | sq_out_stages[IO_STAGES-1][j]}; 80 | end 81 | end 82 | endgenerate 83 | 84 | assign valid = valid_stages[IO_STAGES-1]; 85 | 86 | // Create the pipeline 
87 | generate 88 | for(j = 1; j < IO_STAGES; j++) begin 89 | always_ff @(posedge clk) begin 90 | start_stages[j] <= start_stages[j-1]; 91 | sq_in_stages[j] <= sq_in_stages[j-1]; 92 | sq_out_stages[j] <= sq_out_stages[j-1]; 93 | valid_stages[j] <= valid_stages[j-1]; 94 | end 95 | end 96 | endgenerate 97 | 98 | modular_square_8_cycles 99 | #( 100 | .NONREDUNDANT_ELEMENTS(NONREDUNDANT_ELEMENTS) 101 | ) 102 | modsqr( 103 | .clk (clk), 104 | .reset (reset), 105 | .start (start_stages[IO_STAGES-1]), 106 | .sq_in (sq_in_stages[IO_STAGES-1]), 107 | .sq_out (sq_out_stages[0]), 108 | .valid (valid_stages[0]) 109 | ); 110 | 111 | endmodule 112 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/vdf_counter.sv: -------------------------------------------------------------------------------- 1 | // /******************************************************************************* 2 | // Copyright (c) 2018, Xilinx, Inc. 3 | // All rights reserved. 4 | // 5 | // Redistribution and use in source and binary forms, with or without modification, 6 | // are permitted provided that the following conditions are met: 7 | // 8 | // 1. Redistributions of source code must retain the above copyright notice, 9 | // this list of conditions and the following disclaimer. 10 | // 11 | // 12 | // 2. Redistributions in binary form must reproduce the above copyright notice, 13 | // this list of conditions and the following disclaimer in the documentation 14 | // and/or other materials provided with the distribution. 15 | // 16 | // 17 | // 3. Neither the name of the copyright holder nor the names of its contributors 18 | // may be used to endorse or promote products derived from this software 19 | // without specific prior written permission. 
20 | // 21 | // 22 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | // WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 25 | // IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | // INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | // BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | // OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | // EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | // 33 | // *******************************************************************************/ 34 | 35 | // default_nettype of none prevents implicit wire declaration. 
36 | `default_nettype none 37 | 38 | module vdf_counter #( 39 | parameter integer C_WIDTH = 4, 40 | parameter [C_WIDTH-1:0] C_INIT = {C_WIDTH{1'b0}} 41 | ) 42 | ( 43 | input wire clk, 44 | input wire clken, 45 | input wire rst, 46 | input wire load, 47 | input wire incr, 48 | input wire decr, 49 | input wire [C_WIDTH-1:0] load_value, 50 | output wire [C_WIDTH-1:0] count, 51 | output wire is_zero 52 | ); 53 | 54 | timeunit 1ps; 55 | timeprecision 1ps; 56 | 57 | ///////////////////////////////////////////////////////////////////////////// 58 | // Local Parameters 59 | ///////////////////////////////////////////////////////////////////////////// 60 | localparam [C_WIDTH-1:0] LP_ZERO = {C_WIDTH{1'b0}}; 61 | localparam [C_WIDTH-1:0] LP_ONE = {{C_WIDTH-1{1'b0}},1'b1}; 62 | localparam [C_WIDTH-1:0] LP_MAX = {C_WIDTH{1'b1}}; 63 | 64 | ///////////////////////////////////////////////////////////////////////////// 65 | // Variables 66 | ///////////////////////////////////////////////////////////////////////////// 67 | reg [C_WIDTH-1:0] count_r = C_INIT; 68 | reg is_zero_r = (C_INIT == LP_ZERO); 69 | 70 | ///////////////////////////////////////////////////////////////////////////// 71 | // Begin RTL 72 | ///////////////////////////////////////////////////////////////////////////// 73 | assign count = count_r; 74 | 75 | always @(posedge clk) begin 76 | if (rst) begin 77 | count_r <= C_INIT; 78 | end 79 | else if (clken) begin 80 | if (load) begin 81 | count_r <= load_value; 82 | end 83 | else if (incr & ~decr) begin 84 | count_r <= count_r + 1'b1; 85 | end 86 | else if (~incr & decr) begin 87 | count_r <= count_r - 1'b1; 88 | end 89 | else 90 | count_r <= count_r; 91 | end 92 | end 93 | 94 | assign is_zero = is_zero_r; 95 | 96 | always @(posedge clk) begin 97 | if (rst) begin 98 | is_zero_r <= (C_INIT == LP_ZERO); 99 | end 100 | else if (clken) begin 101 | if (load) begin 102 | is_zero_r <= (load_value == LP_ZERO); 103 | end 104 | else begin 105 | is_zero_r <= incr ^ decr ? 
(decr && (count_r == LP_ONE)) || (incr && (count_r == LP_MAX)) : is_zero_r; 106 | end 107 | end 108 | else begin 109 | is_zero_r <= is_zero_r; 110 | end 111 | end 112 | 113 | 114 | endmodule : vdf_counter 115 | `default_nettype wire 116 | 117 | -------------------------------------------------------------------------------- /msu/rtl/multiplier.mk: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019 Supranational, LLC 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 
15 | # 16 | 17 | ############################################################################ 18 | # Multiplier configuration 19 | ############################################################################ 20 | 21 | SIMPLE_SQ ?= 0 22 | ifeq ($(SIMPLE_SQ), 1) 23 | MOD_LEN ?= 128 24 | else 25 | MOD_LEN ?= 1024 26 | endif 27 | 28 | # 1 - Connect the testbench directly to the squaring circuit 29 | # 0 - Connect the testbench directly to the MSU 30 | DIRECT_TB ?= 0 31 | 32 | # Constants for the Ozturk multiplier 33 | REDUNDANT_ELEMENTS = 2 34 | NONREDUNDANT_ELEMENTS ?= $(shell expr $(MOD_LEN) \/ $(WORD_LEN)) 35 | NUM_ELEMENTS = $(shell expr $(NONREDUNDANT_ELEMENTS) \+ \ 36 | $(REDUNDANT_ELEMENTS)) 37 | WORD_LEN = 16 38 | BIT_LEN = 17 39 | 40 | ifeq ($(SIMPLE_SQ), 1) 41 | SQ_IN_BITS = $(MOD_LEN) 42 | SQ_OUT_BITS = $(MOD_LEN) 43 | else 44 | SQ_IN_BITS = $(MOD_LEN) 45 | SQ_OUT_BITS = $(shell expr $(NUM_ELEMENTS) \* $(WORD_LEN) \* 2) 46 | endif 47 | 48 | # Default modulus for various sizes 49 | ifndef MODULUS 50 | ifeq ($(NONREDUNDANT_ELEMENTS), 1) 51 | MODULUS = 49088 52 | endif 53 | ifeq ($(NONREDUNDANT_ELEMENTS), 2) 54 | MODULUS = 1319797480 55 | endif 56 | ifeq ($(NONREDUNDANT_ELEMENTS), 4) 57 | MODULUS = 10290524089509967236 58 | endif 59 | ifeq ($(NONREDUNDANT_ELEMENTS), 8) 60 | MODULUS = 302934307671667531413257853548643485645 61 | endif 62 | ifeq ($(NONREDUNDANT_ELEMENTS), 16) 63 | MODULUS = 33025623512261490103902707258419309725034860259537403375815092309878324079655 64 | endif 65 | ifeq ($(NONREDUNDANT_ELEMENTS), 32) 66 | MODULUS = 6489662188004289912380470564448077957325054535910000462604166663459673710886837850185567098610688907939251192940184027313309919696320700640064979438888128 67 | endif 68 | ifeq ($(NONREDUNDANT_ELEMENTS), 64) 69 | MODULUS = 
124066695684124741398798927404814432744698427125735684128131855064976895337309138910015071214657674309443149407457493434579063840841220334555160125016331040933690674569571217337630239191517205721310197608387239846364360850220896772964978569683229449266819903414117058030106528073928633017118689826625594484331 70 | endif 71 | ifeq ($(NONREDUNDANT_ELEMENTS), 128) 72 | MODULUS = 9377944221571685634155357309238201353523714494933203932192352610373185905160064191380814163563653465686686344569948132435768764189230283870831379273286538073257936156915196745293608951123906426669343509495359436534714767355508167485174462490387748891786824058464058759514090422733587163281784566205124153235051703550025891216469399946549380070025504308122753979231888712348434628534163045096998571026286859992004518389268564973163318230346906823917015138015136534425282323916197448591565660862677175296696705791983908960387617248409752260394512393068089040746777040892828872978879414544318732112296166363704634142810 73 | endif 74 | endif 75 | 76 | ifeq ($(RANDOM_MODULUS),1) 77 | MODULUS := $(shell python3 -c \ 78 | "import random; \ 79 | bits = $(NONREDUNDANT_ELEMENTS)*$(WORD_LEN); \ 80 | M = random.getrandbits(bits); \ 81 | print(M)") 82 | endif 83 | export RANDOM_MODULUS = 0 84 | 85 | # Configure MSU parameters. 
These are included through vdf_kernel.sv 86 | msuconfig.vh: 87 | echo "\`define SQ_IN_BITS_DEF $(SQ_IN_BITS)" \ 88 | > msuconfig.vh 89 | echo "\`define SQ_OUT_BITS_DEF $(SQ_OUT_BITS)" \ 90 | >> msuconfig.vh 91 | echo "\`define MODULUS_DEF $(MOD_LEN)'d$(MODULUS)" \ 92 | >> msuconfig.vh 93 | echo "\`define MOD_LEN_DEF $(MOD_LEN)" \ 94 | >> msuconfig.vh 95 | ifeq ($(SIMPLE_SQ), 1) 96 | echo "\`define SIMPLE_SQ $(SIMPLE_SQ)" \ 97 | >> msuconfig.vh 98 | endif 99 | 100 | mem/reduction_lut_000.dat: 101 | mkdir -p mem 102 | cd mem && $(MODSQR_DIR)/rtl/gen_reduction_lut.py \ 103 | --nonredundant $(NONREDUNDANT_ELEMENTS) \ 104 | --modulus $(MODULUS) 105 | -------------------------------------------------------------------------------- /msu/rtl/vivado_simple/msu.srcs/tb.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | *******************************************************************************/ 16 | 17 | `include "msuconfig.vh" 18 | 19 | module tb(); 20 | localparam integer MOD_LEN = 1024; 21 | //localparam integer MOD_LEN = 128; 22 | 23 | 24 | logic clk; 25 | logic reset; 26 | logic start; 27 | logic valid; 28 | logic [MOD_LEN-1:0] modulus; 29 | logic [MOD_LEN-1:0] sq_in; 30 | logic [MOD_LEN-1:0] sq_out; 31 | logic [MOD_LEN-1:0] sq_out_expected; 32 | logic [MOD_LEN-1:0] sq_out_actual; 33 | 34 | integer t_start; 35 | integer t_final; 36 | integer t_curr; 37 | 38 | integer test_file; 39 | integer i, ret; 40 | integer cycle_count; 41 | integer error_count; 42 | 43 | integer total_cycle_count; 44 | integer total_squarings; 45 | 46 | modular_square_simple 47 | #( 48 | .MOD_LEN(MOD_LEN) 49 | ) 50 | uut( 51 | clk, 52 | reset, 53 | start, 54 | sq_in, 55 | sq_out, 56 | valid 57 | ); 58 | 59 | initial begin 60 | test_file = $fopen("../../../../../test.txt", "r"); 61 | if(test_file == 0) begin 62 | $display("test_file handle was NULL"); 63 | $finish; 64 | end 65 | end 66 | 67 | always begin 68 | #5 clk = ~clk; 69 | end 70 | 71 | initial begin 72 | // Reset the design 73 | clk = 1'b0; 74 | reset = 1'b1; 75 | sq_in = 0; 76 | start = 1'b0; 77 | t_start = 0; 78 | t_curr = 0; 79 | 80 | @(negedge clk); 81 | @(negedge clk); 82 | @(negedge clk); 83 | @(negedge clk); 84 | 85 | reset = 1'b0; 86 | 87 | @(negedge clk); 88 | @(negedge clk); 89 | @(negedge clk); 90 | @(negedge clk); 91 | 92 | // Scan in the initial value 93 | $fscanf(test_file, "%x\n", sq_in); 94 | @(negedge clk); 95 | 96 | start = 1'b1; 97 | @(negedge clk); 98 | start = 1'b0; 99 | 100 | // Run the squarer and periodically check results 101 | error_count = 0; 102 | total_cycle_count = 0; 103 | total_squarings = 0; 104 | while(1) begin 105 | ret = $fscanf(test_file, "%d, %x\n", t_final, sq_out_expected); 106 | if(ret != 2) begin 107 | break; 108 | end 109 | 110 | // Run to the next checkpoint specified in the test 
file 111 | cycle_count = 1; 112 | t_start = t_curr; 113 | while(t_curr < t_final) begin 114 | if(valid == 1'b1) begin 115 | t_curr = t_curr + 1; 116 | sq_out_actual = sq_out; 117 | total_squarings = total_squarings + 1; 118 | end 119 | 120 | @(negedge clk); 121 | cycle_count = cycle_count + 1; 122 | total_cycle_count = total_cycle_count + 1; 123 | end 124 | 125 | sq_out_actual = sq_out_actual; 126 | 127 | $display("%5d %0.2f %x", t_final, 128 | real'(cycle_count) / real'(t_final - t_start), 129 | sq_out_actual); 130 | 131 | // Check correctness 132 | if(sq_out_actual !== sq_out_expected) begin 133 | $display("MISMATCH expected %x", sq_out_expected); 134 | $display(" actual %x", sq_out_actual); 135 | error_count = error_count + 1; 136 | break; 137 | end 138 | @(negedge clk); 139 | total_cycle_count = total_cycle_count + 1; 140 | end 141 | $display("Overall %d cycles, %d squarings, %0.2f cyc/sq", 142 | total_cycle_count, total_squarings, 143 | real'(total_cycle_count) / real'(total_squarings)); 144 | if(error_count == 0) begin 145 | $display("SUCCESS!"); 146 | $finish(); 147 | end 148 | @(negedge clk); 149 | @(negedge clk); 150 | @(negedge clk); 151 | @(negedge clk); 152 | $error("FAILURE %d mismatches", error_count); 153 | $finish(); 154 | end 155 | endmodule 156 | 157 | -------------------------------------------------------------------------------- /msu/sw/MSU.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include 18 | #include 19 | #include 20 | #include 21 | #include 22 | 23 | void bn_shl(mpz_t bn, int bits) { 24 | mpz_mul_2exp(bn, bn, bits); 25 | } 26 | void bn_shr(mpz_t bn, int bits) { 27 | mpz_fdiv_q_2exp(bn, bn, bits); 28 | } 29 | void bn_init_mask(mpz_t mask, int bits) { 30 | mpz_set_ui(mask, 1); 31 | mpz_mul_2exp(mask, mask, bits); 32 | mpz_sub_ui(mask, mask, 1); 33 | } 34 | 35 | // Start a nanosecond-resolution timer 36 | struct timespec timer_start(){ 37 | struct timespec start_time; 38 | clock_gettime(CLOCK_REALTIME, &start_time); 39 | return start_time; 40 | } 41 | 42 | // End a timer, returning nanoseconds elapsed as a long 43 | long timer_end(struct timespec start_time){ 44 | struct timespec end_time; 45 | clock_gettime(CLOCK_REALTIME, &end_time); 46 | long diffInNanos = (end_time.tv_sec - start_time.tv_sec) * 47 | (long)1e9 + (end_time.tv_nsec - start_time.tv_nsec); 48 | return diffInNanos; 49 | } 50 | 51 | 52 | MSU::MSU(MSUDevice &_d, int _mod_len, mpz_t _modulus) 53 | : device(_d) { 54 | unsigned long seed = 0; 55 | gmp_randinit_mt(rand_state); 56 | gmp_randseed_ui(rand_state, seed); 57 | 58 | mod_len = _mod_len; 59 | 60 | mpz_inits(sq_in, modulus, sq_out, NULL); 61 | mpz_set(modulus, _modulus); 62 | gmp_printf("Modulus is %Zd\n\n", modulus); 63 | } 64 | 65 | MSU::~MSU() { 66 | mpz_clears(sq_in, modulus, sq_out, NULL); 67 | } 68 | 69 | // Run a job using the provided sq_in starting value. 70 | int MSU::run_fixed(uint64_t _t_start, uint64_t _t_final, mpz_t _sq_in, 71 | bool check) { 72 | t_start = _t_start; 73 | t_final = _t_final; 74 | mpz_set(sq_in, _sq_in); 75 | compute_job(); 76 | if(check) { 77 | return(check_job()); 78 | } 79 | return 0; 80 | } 81 | 82 | // Run a job using a random sq_in starting value. 
83 | int MSU::run_random(uint64_t _t_start, uint64_t _t_final, bool rrandom, 84 | bool check) { 85 | t_start = _t_start; 86 | t_final = _t_final; 87 | prepare_random_job(rrandom); 88 | compute_job(); 89 | if(check) { 90 | return(check_job()); 91 | } 92 | return 0; 93 | } 94 | 95 | // Generate a random starting input 96 | void MSU::prepare_random_job(bool rrandom) { 97 | int num_rand_bits = mod_len; 98 | if(rrandom) { 99 | // Use a smaller bit size to avoid getting an input bigger than the 100 | // modulus 101 | mpz_rrandomb(sq_in, rand_state, num_rand_bits-2); 102 | } else { 103 | mpz_urandomb(sq_in, rand_state, num_rand_bits); 104 | } 105 | mpz_mod(sq_in, sq_in, modulus); 106 | } 107 | 108 | // Once the job parameters are configured compute_job will execute it on the 109 | // target. 110 | void MSU::compute_job() { 111 | struct timespec start_ts; 112 | start_ts = timer_start(); 113 | 114 | ////////////////////////////////////////////////////////////////////// 115 | // PREPROCESSING goes below this line (Montgomery conversion, etc) 116 | // 117 | 118 | // Perform the computation 119 | device.compute_job(t_start, t_final, sq_in, sq_out); 120 | 121 | // 122 | // POSTPROCESSING goes above this line (Montgomery conversion, etc) 123 | ////////////////////////////////////////////////////////////////////// 124 | 125 | compute_time = timer_end(start_ts); 126 | 127 | if(!quiet) { 128 | gmp_printf("sq_out is 0x%Zx\n", sq_out); 129 | } 130 | } 131 | 132 | // Check the result by comparing it to the expected value as computed by 133 | // software. 
134 | int MSU::check_job() { 135 | mpz_t expected; 136 | mpz_inits(expected, NULL); 137 | 138 | mpz_set(expected, sq_in); 139 | for(uint64_t i = t_start; i < t_final; i++) { 140 | mpz_powm_ui(expected, expected, 2, modulus); 141 | //gmp_printf("sq_in^2 is 0x%Zx\n", expected); 142 | } 143 | 144 | if(!quiet) { 145 | gmp_printf("sq_in is 0x%Zx\n", sq_in); 146 | gmp_printf("expected is 0x%Zx\n", expected); 147 | gmp_printf("actual is 0x%Zx\n", sq_out); 148 | } 149 | 150 | // Check product 151 | int failures = 0; 152 | if (mpz_cmp(expected, sq_out) != 0) { 153 | printf("MISMATCH found - test Failed!\n"); 154 | failures++; 155 | } 156 | if(failures == 0) { 157 | printf("MATCH!"); 158 | } 159 | 160 | mpz_clears(expected, NULL); 161 | 162 | return(failures); 163 | } 164 | 165 | -------------------------------------------------------------------------------- /msu/sw/MSUVerilatorDirect.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | */ 16 | 17 | #include 18 | 19 | vluint64_t *main_time_singleton = 0; 20 | 21 | // Called by $time in Verilog 22 | double sc_time_stamp() { 23 | if(main_time_singleton) { 24 | return *main_time_singleton; 25 | } else { 26 | return(0); 27 | } 28 | } 29 | 30 | MSUVerilator::MSUVerilator(int argc, char** argv) { 31 | mpz_inits(msu_in, msu_out, 0); 32 | 33 | main_time_singleton = &main_time; 34 | 35 | pet(); 36 | 37 | // Pass arguments so Verilated code can see them, e.g. $value$plusargs 38 | Verilated::commandArgs(argc, argv); 39 | 40 | // Set debug level, 0 is off, 9 is highest presently used 41 | Verilated::debug(0); 42 | 43 | // Randomization reset policy 44 | Verilated::randReset(2); 45 | 46 | // Construct the Verilated model 47 | tb = new Vtb; 48 | 49 | // If verilator was invoked with --trace argument, 50 | // and if at run time passed the +trace argument, turn on tracing 51 | tfp = NULL; 52 | #if VM_TRACE 53 | const char* flag = Verilated::commandArgsPlusMatch("trace"); 54 | if (flag && 0==strcmp(flag, "+trace")) { 55 | Verilated::traceEverOn(true); 56 | VL_PRINTF("Enabling waves into obj_dir/logs/vlt_dump.vcd...\n"); 57 | tfp = new VerilatedVcdC; 58 | tb->trace(tfp, 99); // Trace 99 levels of hierarchy 59 | // Not supported in default Centos version 60 | //Verilated::mkdir("logs"); 61 | tfp->open("logs/vlt_dump.vcd"); // Open the dump file 62 | } 63 | #endif 64 | } 65 | 66 | MSUVerilator::~MSUVerilator() { 67 | mpz_clears(msu_in, msu_out, 0); 68 | 69 | tb->final(); 70 | 71 | // Close trace if opened 72 | #if VM_TRACE 73 | if (tfp) { tfp->close(); tfp = NULL; } 74 | #endif 75 | 76 | // Coverage analysis (since test passed) 77 | #if VM_COVERAGE 78 | Verilated::mkdir("logs"); 79 | VerilatedCov::write("logs/coverage.dat"); 80 | #endif 81 | 82 | // Destroy model 83 | delete tb; tb = NULL; 84 | } 85 | 86 | void MSUVerilator::init(MSU *_msu, Squarer *_squarer) { 87 | MSUDevice::init(_msu, _squarer); 88 | 89 | int nonredundant_elements = msu->mod_len / WORD_LEN; 
90 | int num_elements = nonredundant_elements + REDUNDANT_ELEMENTS; 91 | msu_words_in = (T_LEN/MSU_WORD_LEN*2 + (nonredundant_elements+1)/2); 92 | msu_words_out = (T_LEN/MSU_WORD_LEN + num_elements); 93 | } 94 | 95 | void MSUVerilator::reset() { 96 | // Reset the device 97 | tb->reset = 1; 98 | tb->clk = 1; 99 | tb->start = 0; 100 | 101 | for(int i = 0; i < 10; i++) { 102 | clock_cycle(); 103 | } 104 | 105 | // Out of reset 106 | tb->reset = 0; 107 | clock_cycle(); 108 | clock_cycle(); 109 | clock_cycle(); 110 | } 111 | 112 | void MSUVerilator::compute_job(uint64_t t_start, 113 | uint64_t t_final, 114 | mpz_t sq_in, 115 | mpz_t sq_out) { 116 | reset(); 117 | 118 | // Number of 32-bit words we need to copy from mpz to sq_in/sq_out 119 | //int sq_words = msu->mod_len / 8 / BN_BUFFER_SIZE; 120 | bn_to_buffer(sq_in, tb->sq_in, squarer->msu_words_in(), true, true); 121 | 122 | // Load values 123 | tb->start = 1; 124 | clock_cycle(); 125 | tb->start = 0; 126 | 127 | uint64_t t_cur = t_start; 128 | while(t_cur < t_final ){ 129 | while(!tb->valid) { 130 | clock_cycle(); 131 | } 132 | pet(); 133 | 134 | t_cur++; 135 | if(t_cur == t_final) { 136 | break; 137 | } 138 | 139 | clock_cycle(); 140 | } 141 | 142 | bn_from_buffer(msu_out, tb->sq_out, squarer->msu_words_out()); 143 | gmp_printf("squarer result is 0x%Zx\n", msu_out); 144 | 145 | squarer->unpack(sq_out, 0, msu_out, WORD_LEN); 146 | 147 | clock_cycle(); 148 | clock_cycle(); 149 | clock_cycle(); 150 | } 151 | 152 | void MSUVerilator::clock_cycle() { 153 | watchdog++; 154 | if(watchdog == 1000) { 155 | printf("ERROR: Hit cycle count limit\n"); 156 | #if VM_TRACE 157 | if (tfp) { tfp->close(); tfp = NULL; } 158 | #endif 159 | exit(0); 160 | } 161 | 162 | main_time++; 163 | tb->clk = 0; 164 | tb->eval(); 165 | #if VM_TRACE 166 | if (tfp) tfp->dump (main_time); 167 | #endif 168 | 169 | main_time++; 170 | tb->clk = 1; 171 | tb->eval(); 172 | #if VM_TRACE 173 | if (tfp) tfp->dump (main_time); 174 | #endif 175 | } 176 | 177 | 
-------------------------------------------------------------------------------- /msu/sw/MSUSDAccel.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include 18 | #include 19 | #include 20 | #include 21 | 22 | using namespace std; 23 | 24 | // Print a buffer in 32 bit words. 25 | void print_buffer(const char *name, uint32_t *buffer, int size) { 26 | printf("BUFFER: %s size %d words, %d bytes\n", name, size, size * 4); 27 | for(int i = 0; i < size; i++) { 28 | printf(" %3d: 0x%04x\n", i, buffer[i]); 29 | } 30 | } 31 | 32 | // Print a buffer in 32 bit words as one long line. 
33 | void print_buffer_concise(const char *name, uint32_t *buffer, int size) { 34 | printf("%s: ", name); 35 | for(int i = 0; i < size; i++) { 36 | if(i != 0) { 37 | printf(", "); 38 | } 39 | printf("%05x", buffer[size - i - 1]); 40 | } 41 | printf("\n"); 42 | } 43 | 44 | void OpenCLContext::init(int _msu_words_in, int _msu_words_out) { 45 | cl_int err; 46 | 47 | msu_words_in = _msu_words_in; 48 | msu_words_out = _msu_words_out; 49 | 50 | input_buf.resize(msu_words_in); 51 | output_buf.resize(msu_words_out); 52 | 53 | // Clear the data buffers 54 | int i = 0; 55 | for(i = 0; i < msu_words_in; i++) { 56 | input_buf[i] = 0; 57 | } 58 | for(i = 0; i < msu_words_out; i++) { 59 | output_buf[i] = 0; 60 | } 61 | 62 | // Create Program and Kernel 63 | std::vector<cl::Device> devices = xcl::get_xil_devices(); 64 | cl::Device device = devices[0]; 65 | 66 | OCL_CHECK(err, context = 67 | new cl::Context(device, NULL, NULL, NULL, &err)) 68 | OCL_CHECK(err, q = 69 | new cl::CommandQueue(*context, device, 70 | CL_QUEUE_PROFILING_ENABLE, &err)); 71 | std::string device_name = device.getInfo<CL_DEVICE_NAME>(); 72 | 73 | std::string binaryFile = xcl::find_binary_file(device_name, KERNEL_NAME); 74 | cl::Program::Binaries bins = xcl::import_binary_file(binaryFile); 75 | devices.resize(1); 76 | OCL_CHECK(err, program = 77 | new cl::Program(*context, devices, bins, NULL, &err)); 78 | OCL_CHECK(err, krnl_vdf = 79 | new cl::Kernel(*program, KERNEL_NAME, &err)); 80 | 81 | // Allocate OpenCL buffers in memory 82 | OCL_CHECK(err, inBuffer = 83 | new cl::Buffer(*context, 84 | CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY, 85 | (size_t)msu_words_in*MSU_BYTES_PER_WORD, 86 | input_buf.data(), &err)); 87 | OCL_CHECK(err, outBuffer = 88 | new cl::Buffer(*context, 89 | CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY, 90 | (size_t)msu_words_out*MSU_BYTES_PER_WORD, 91 | output_buf.data(), &err)); 92 | inBufferVec.push_back(*inBuffer); 93 | outBufferVec.push_back(*outBuffer); 94 | 95 | // Set kernel arguments. 
96 | // Not used 97 | OCL_CHECK(err, err = krnl_vdf->setArg(0, 0)); 98 | OCL_CHECK(err, err = krnl_vdf->setArg(1, *inBuffer)); 99 | OCL_CHECK(err, err = krnl_vdf->setArg(2, *outBuffer)); 100 | // Not used 101 | OCL_CHECK(err, err = krnl_vdf->setArg(3, *outBuffer)); 102 | } 103 | 104 | OpenCLContext::~OpenCLContext() { 105 | delete outBuffer; 106 | delete inBuffer; 107 | delete krnl_vdf; 108 | delete program; 109 | } 110 | 111 | void OpenCLContext::compute_job(mpz_t msu_out, mpz_t msu_in) { 112 | if(!quiet) { 113 | gmp_printf("msu_in is 0x%Zx\n", msu_in); 114 | } 115 | bn_to_buffer(msu_in, input_buf.data(), msu_words_in, true, true); 116 | //print_buffer_concise("msu_in", input_buf.data(), msu_words_in); 117 | 118 | cl_int err; 119 | 120 | // DMA the buffers to the FPGA 121 | OCL_CHECK(err, err = q->enqueueMigrateMemObjects(inBufferVec, 0)); 122 | 123 | // Launch the Kernel 124 | OCL_CHECK(err, err = q->enqueueTask(*krnl_vdf)); 125 | 126 | // DMA the results from FPGA to host 127 | OCL_CHECK(err, err = 128 | q->enqueueMigrateMemObjects(outBufferVec, 129 | CL_MIGRATE_MEM_OBJECT_HOST)); 130 | OCL_CHECK(err, err = q->finish()); 131 | 132 | // Extract the result 133 | bn_from_buffer(msu_out, output_buf.data(), msu_words_out); 134 | if(!quiet) { 135 | gmp_printf("msu_out is 0x%Zx\n", msu_out); 136 | //print_buffer_concise("msu_out", output_buf.data(), msu_words_out); 137 | } 138 | } 139 | 140 | void MSUSDAccel::init(MSU *_msu, Squarer *_squarer) { 141 | MSUDevice::init(_msu, _squarer); 142 | 143 | int nonredundant_elements = msu->mod_len / WORD_LEN; 144 | int num_elements = nonredundant_elements + REDUNDANT_ELEMENTS; 145 | msu_words_in = (T_LEN/MSU_WORD_LEN*2 + (nonredundant_elements+1)/2); 146 | msu_words_out = (T_LEN/MSU_WORD_LEN + num_elements); 147 | 148 | ocl.init(msu_words_in, msu_words_out); 149 | } 150 | 151 | void MSUSDAccel::compute_job(uint64_t t_start, 152 | uint64_t t_final, 153 | mpz_t sq_in, 154 | mpz_t sq_out) { 155 | squarer->pack(msu_in, t_start, 
t_final, sq_in); 156 | ocl.compute_job(msu_out, msu_in); 157 | 158 | uint64_t t_final_out; 159 | squarer->unpack(sq_out, &t_final_out, msu_out, WORD_LEN); 160 | } 161 | -------------------------------------------------------------------------------- /msu/rtl/vivado_ozturk/msu.srcs/tb.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 
15 | *******************************************************************************/ 16 | 17 | `include "msuconfig.vh" 18 | 19 | module tb(); 20 | localparam integer MOD_LEN = 1024; 21 | //localparam integer MOD_LEN = 128; 22 | 23 | // Ozturk parameters 24 | localparam integer WORD_LEN = 16; 25 | localparam integer BIT_LEN = 17; 26 | localparam integer AXI_LEN = 32; 27 | localparam integer SQ_IN_BITS = `SQ_IN_BITS_DEF; 28 | localparam integer SQ_OUT_BITS = `SQ_OUT_BITS_DEF; 29 | localparam MODULUS = `MODULUS_DEF; 30 | 31 | 32 | logic clk; 33 | logic reset; 34 | logic start; 35 | logic valid; 36 | logic [MOD_LEN-1:0] modulus; 37 | logic [SQ_IN_BITS-1:0] sq_in; 38 | logic [SQ_OUT_BITS-1:0] sq_out; 39 | logic [MOD_LEN-1:0] sq_out_expected; 40 | logic [SQ_OUT_BITS-1:0] sq_out_actual; 41 | logic [SQ_OUT_BITS-1:0] sq_out_reducing; 42 | logic [MOD_LEN-1:0] sq_out_reduced; 43 | 44 | integer t_start; 45 | integer t_final; 46 | integer t_curr; 47 | 48 | integer test_file; 49 | integer i, ret; 50 | integer cycle_count; 51 | integer error_count; 52 | 53 | integer total_cycle_count; 54 | integer total_squarings; 55 | 56 | modular_square_wrapper 57 | #( 58 | .MOD_LEN(MOD_LEN) 59 | ) 60 | uut( 61 | clk, 62 | reset, 63 | start, 64 | sq_in, 65 | sq_out, 66 | valid 67 | ); 68 | 69 | initial begin 70 | test_file = $fopen("../../../../../test.txt", "r"); 71 | if(test_file == 0) begin 72 | $display("test_file handle was NULL"); 73 | $finish; 74 | end 75 | end 76 | 77 | always begin 78 | #5 clk = ~clk; 79 | end 80 | 81 | initial begin 82 | // Reset the design 83 | clk = 1'b0; 84 | reset = 1'b1; 85 | sq_in = 0; 86 | start = 1'b0; 87 | t_start = 0; 88 | t_curr = 0; 89 | 90 | @(negedge clk); 91 | @(negedge clk); 92 | @(negedge clk); 93 | @(negedge clk); 94 | 95 | reset = 1'b0; 96 | 97 | @(negedge clk); 98 | @(negedge clk); 99 | @(negedge clk); 100 | @(negedge clk); 101 | 102 | // Scan in the modulus and initial value 103 | $fscanf(test_file, "%x\n", sq_in); 104 | @(negedge clk); 105 | 106 
| start = 1'b1; 107 | @(negedge clk); 108 | start = 1'b0; 109 | 110 | // Run the squarer and periodically check results 111 | error_count = 0; 112 | total_cycle_count = 0; 113 | total_squarings = 0; 114 | while(1) begin 115 | ret = $fscanf(test_file, "%d, %x\n", t_final, sq_out_expected); 116 | if(ret != 2) begin 117 | break; 118 | end 119 | 120 | // Run to the next checkpoint specified in the test file 121 | cycle_count = 1; 122 | t_start = t_curr; 123 | while(t_curr < t_final) begin 124 | if(valid == 1'b1) begin 125 | t_curr = t_curr + 1; 126 | sq_out_actual = sq_out; 127 | total_squarings = total_squarings + 1; 128 | end 129 | 130 | @(negedge clk); 131 | cycle_count = cycle_count + 1; 132 | total_cycle_count = total_cycle_count + 1; 133 | end 134 | 135 | // Reduce the result from polynomial form 136 | sq_out_reducing = 0; 137 | for(i = 0; i < SQ_OUT_BITS / AXI_LEN; i++) begin 138 | if(i > 0) begin 139 | sq_out_reducing <<= WORD_LEN; 140 | sq_out_actual <<= AXI_LEN; 141 | end 142 | sq_out_reducing += sq_out_actual[SQ_OUT_BITS-AXI_LEN +: BIT_LEN]; 143 | end 144 | sq_out_reduced = sq_out_reducing % MODULUS; 145 | 146 | $display("%5d %0.2f %x", t_final, 147 | real'(cycle_count) / real'(t_final - t_start), 148 | sq_out_reduced); 149 | 150 | // Check correctness 151 | if(sq_out_reduced !== sq_out_expected) begin 152 | $display("MISMATCH expected %x", sq_out_expected); 153 | $display(" actual %x", sq_out_reduced); 154 | error_count = error_count + 1; 155 | break; 156 | end 157 | @(negedge clk); 158 | total_cycle_count = total_cycle_count + 1; 159 | end 160 | $display("Overall %d cycles, %d squarings, %0.2f cyc/sq", 161 | total_cycle_count, total_squarings, 162 | real'(total_cycle_count) / real'(total_squarings)); 163 | if(error_count == 0) begin 164 | $display("SUCCESS!"); 165 | $finish(); 166 | end 167 | @(negedge clk); 168 | @(negedge clk); 169 | @(negedge clk); 170 | @(negedge clk); 171 | $error("FAILURE %d mismatches", error_count); 172 | $finish(); 173 | end 174 | 
endmodule 175 | 176 | -------------------------------------------------------------------------------- /msu/rtl/Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019 Supranational, LLC 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # 16 | 17 | default: run 18 | 19 | # Multiplier configuration 20 | include multiplier.mk 21 | 22 | # Test configuration 23 | ITERATIONS ?= 25 24 | T_FINAL ?= 3 25 | FASTSIM ?= 1 26 | INTERMEDIATES ?= 0 27 | 28 | include ./verilator.mk 29 | 30 | 31 | # Overrides to reproduce specific cases 32 | # ITERATIONS = 1 33 | # T_FINAL = 30 34 | # MOD_LEN = 128 35 | # MODULUS = 302934307671667531413257853548643485645 36 | # SQ_IN = 0x45558c7335696741d41c91186caf806b 37 | 38 | ifdef SQ_IN 39 | SQ_IN_FLAG = -s $(SQ_IN) 40 | endif 41 | ifeq ($(RRANDOM), 1) 42 | RRANDOM_FLAG = -1 43 | endif 44 | ifeq ($(FASTSIM), 1) 45 | VERILATOR_FLAGS += -DFASTSIM=1 46 | endif 47 | 48 | ifeq ($(DIRECT_TB), 1) 49 | ifeq ($(SIMPLE_SQ), 1) 50 | TOP_FILE = modular_square_simple.sv 51 | else 52 | TOP_FILE = modular_square_wrapper.sv 53 | endif 54 | else 55 | TOP_FILE = msu.sv 56 | endif 57 | 58 | VERILATOR_FLAGS += -DMOD_LEN_DEF=$(MOD_LEN) 59 | VERILATOR_FLAGS += -DMODULUS_DEF=$(MOD_LEN)\'d$(MODULUS) 60 | VERILATOR_FLAGS += -CFLAGS '-I../../sw -Wall -std=c++11' 61 | VERILATOR_FLAGS += ../sw/MSU.cpp 62 | VERILATOR_FLAGS += --prefix Vtb 63 | 64 | ifeq ($(SIMPLE_SQ), 1) 
65 | VERILATOR_FLAGS += -DSIMPLE_SQ=1 66 | VERILATOR_FLAGS += -CFLAGS '-DSIMPLE_SQ=1' 67 | endif 68 | ifeq ($(DIRECT_TB), 1) 69 | VERILATOR_FLAGS += -CFLAGS '-DDIRECT_TB=1' 70 | VERILATOR_FLAGS += ../sw/MSUVerilatorDirect.cpp 71 | else 72 | VERILATOR_FLAGS += ../sw/MSUVerilator.cpp 73 | endif 74 | 75 | VERILATOR_FLAGS += -DMSU_SQ_IN_BITS_DEF=$(SQ_IN_BITS) 76 | VERILATOR_FLAGS += -DMSU_SQ_OUT_BITS_DEF=$(SQ_OUT_BITS) 77 | 78 | MODSQR_PATH = $(realpath ../../modular_square/rtl) 79 | 80 | ###################################################################### 81 | 82 | run: msuconfig.vh 83 | @echo 84 | @echo "-- Large Integer Modular Squaring" 85 | 86 | ifeq ($(SIMPLE_SQ), 0) 87 | @echo 88 | @echo "-- GENERATE LUTs -----------------" 89 | mkdir -p obj_dir 90 | cd obj_dir && $(MODSQR_PATH)/gen_reduction_lut.py \ 91 | --nonredundant $(NONREDUNDANT_ELEMENTS) \ 92 | --modulus $(MODULUS) 93 | endif 94 | 95 | @echo 96 | @echo "-- VERILATE ----------------" 97 | $(VERILATOR) $(VERILATOR_FLAGS) -f input.vc $(TOP_FILE) ../sw/main.cpp 98 | 99 | @echo 100 | @echo "-- COMPILE -----------------" 101 | LIBS=-lgmp $(MAKE) -j 4 -C obj_dir -f Vtb.mk 102 | 103 | @echo 104 | @echo "-- RUN ---------------------" 105 | @mkdir -p obj_dir/logs 106 | cd obj_dir && ./Vtb $(TRACE_FLAG) -i $(ITERATIONS) \ 107 | -n $(MOD_LEN) \ 108 | -t $(INTERMEDIATES) \ 109 | -f $(T_FINAL) \ 110 | -m $(MODULUS) $(RRANDOM_FLAG) $(SQ_IN_FLAG) \ 111 | -e 112 | 113 | @echo 114 | @echo "-- DONE --------------------" 115 | ifeq ($(VERILATOR_TRACE), 1) 116 | @echo "To see waveforms:" 117 | @echo "gtkwave obj_dir/logs/vlt_dump.vcd &" 118 | endif 119 | @echo 120 | 121 | # Run multiple tests with a random modulus 122 | judge: 123 | for number in 1 2 3 4 5 ; do \ 124 | echo "" ; \ 125 | echo "TEST ITERATION $$number" ; \ 126 | make clean; MOD_LEN=1024 \ 127 | ITERATIONS=1 \ 128 | T_FINAL=1000 \ 129 | FASTSIM=1 \ 130 | RANDOM_MODULUS=1 \ 131 | RRANDOM=1 \ 132 | VERILATOR_TRACE=0 \ 133 | make; \ 134 | done 135 | 136 | # Does 
not work with the simple multiplier due to verilator bitwidth limitations 137 | regression: 138 | make clean; MOD_LEN=128 \ 139 | ITERATIONS=20 \ 140 | T_FINAL=30 \ 141 | FASTSIM=0 \ 142 | VERILATOR_TRACE=0 \ 143 | make 144 | make clean; MOD_LEN=128 \ 145 | ITERATIONS=100 \ 146 | T_FINAL=30 \ 147 | FASTSIM=0 \ 148 | RRANDOM=1 \ 149 | VERILATOR_TRACE=0 \ 150 | make 151 | make clean; MOD_LEN=256 \ 152 | ITERATIONS=10 \ 153 | T_FINAL=10 \ 154 | FASTSIM=0 \ 155 | RANDOM_MODULUS=1 \ 156 | RRANDOM=1 \ 157 | VERILATOR_TRACE=0 \ 158 | make 159 | make clean; MOD_LEN=512 \ 160 | ITERATIONS=10 \ 161 | T_FINAL=10 \ 162 | FASTSIM=1 \ 163 | RANDOM_MODULUS=1 \ 164 | RRANDOM=1 \ 165 | VERILATOR_TRACE=0 \ 166 | make 167 | make clean; MOD_LEN=2048 \ 168 | ITERATIONS=10 \ 169 | T_FINAL=10 \ 170 | FASTSIM=1 \ 171 | RANDOM_MODULUS=1 \ 172 | RRANDOM=1 \ 173 | VERILATOR_TRACE=0 \ 174 | make 175 | 176 | 177 | ###################################################################### 178 | # Other targets 179 | 180 | show-config: 181 | $(VERILATOR) -V 182 | 183 | maintainer-copy:: 184 | clean mostlyclean distclean maintainer-clean:: 185 | -rm -rf obj_dir logs *.log *.dmp *.vpd coverage.dat core mem 186 | -rm -rf msuconfig.vh 187 | -------------------------------------------------------------------------------- /msu/sw/main.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include 18 | #include 19 | #include 20 | #include 21 | 22 | #if defined(FPGA) || defined(SDX_PLATFORM) 23 | #include 24 | #elif defined(SIMPLE_SQ) 25 | #include 26 | #include 27 | #else 28 | #include 29 | #include 30 | #endif 31 | 32 | #include 33 | 34 | #ifndef MOD_LEN 35 | #define MOD_LEN 1024 36 | #endif 37 | #ifndef MODULUS 38 | #define MODULUS "302934307671667531413257853548643485645" 39 | #endif 40 | 41 | 42 | void print_usage() { 43 | printf("Usage: host [1e] -m modulus\n"); 44 | printf("\n"); 45 | printf("Options:\n"); 46 | printf(" -1 Use libgmp rrandom (default urandom)\n"); 47 | printf(" -e Enable hw emulation mode\n"); 48 | printf(" -q Quiet\n"); 49 | printf(" -i num Set the number of test iterations to run\n"); 50 | printf(" -f num Set t_final\n"); 51 | printf(" -t num Number of modsqr iterations per intermediate value\n"); 52 | printf(" -n num Set the number of modulus bits\n"); 53 | printf(" -s 0xnum Set the starting sq_in (default random)\n"); 54 | printf("\n"); 55 | exit(0); 56 | } 57 | 58 | int main(int argc, char** argv, char** env) { 59 | 60 | mpz_t modulus, sq_in; 61 | mpz_inits(modulus, sq_in, NULL); 62 | 63 | mpz_set_str(modulus, MODULUS, 10); 64 | 65 | int test_iterations = 1; 66 | uint64_t t_final = 1; 67 | uint64_t intermediate_iters = 0; 68 | int mod_len = MOD_LEN; 69 | bool rrandom = false; 70 | bool hw_emu = false; 71 | bool quiet = false; 72 | int opt; 73 | while((opt = getopt(argc, argv, "h1qi:f:t:m:s:n:u:e")) != -1) { 74 | switch(opt) { 75 | case 'h': 76 | print_usage(); 77 | break; 78 | case '1': 79 | rrandom = true; 80 | break; 81 | case 'e': 82 | hw_emu = true; 83 | break; 84 | case 'q': 85 | quiet = true; 86 | break; 87 | case 'i': 88 | test_iterations = atoi(optarg); 89 | break; 90 | case 'f': 91 | t_final = atol(optarg); 92 | break; 93 | case 't': 94 | intermediate_iters = atol(optarg); 95 | break; 
96 | case 'n': 97 | mod_len = atoi(optarg); 98 | break; 99 | case 's': 100 | if(mpz_set_str(sq_in, optarg+2, 16) != 0) { 101 | printf("Failed to parse sq_in %s!\n", optarg); 102 | exit(1); 103 | } 104 | break; 105 | case 'm': 106 | if(mpz_set_str(modulus, optarg, 10) != 0) { 107 | printf("Failed to parse modulus %s!\n", optarg); 108 | exit(1); 109 | } 110 | break; 111 | } 112 | }; 113 | if(mpz_cmp_ui(modulus, 0) == 0) { 114 | printf("ERROR: must provide a modulus with -m\n"); 115 | exit(1); 116 | } 117 | 118 | if(rrandom) { 119 | printf("Enabling rrandom testing\n"); 120 | } 121 | if(hw_emu) { 122 | printf("Enabling hardware emulation mode\n"); 123 | } 124 | 125 | #if defined(FPGA) || defined(SDX_PLATFORM) 126 | MSUSDAccel device; 127 | #else 128 | MSUVerilator device(argc, argv); 129 | #endif 130 | 131 | #if defined(SIMPLE_SQ) 132 | #if defined(DIRECT_TB) 133 | Squarer *squarer = new SquarerSimpleDirect(mod_len, modulus); 134 | #else 135 | Squarer *squarer = new SquarerSimple(mod_len, modulus); 136 | #endif 137 | #else 138 | #if defined(DIRECT_TB) 139 | Squarer *squarer = new SquarerOzturkDirect(mod_len, modulus); 140 | #else 141 | Squarer *squarer = new SquarerOzturk(mod_len, modulus); 142 | #endif 143 | #endif 144 | 145 | MSU msu(device, mod_len, modulus); 146 | msu.set_quiet(quiet); 147 | 148 | device.init(&msu, squarer); 149 | device.set_quiet(quiet); 150 | 151 | device.reset(); 152 | 153 | 154 | if(intermediate_iters == 0) { 155 | intermediate_iters = t_final; 156 | } 157 | 158 | int failures = 0; 159 | uint64_t t_start = 0; 160 | for(int test = 0; test < test_iterations; test++) { 161 | uint64_t iter = 0; 162 | while(iter < t_final) { 163 | uint64_t run_t_final = intermediate_iters; 164 | if(run_t_final + iter > t_final) { 165 | run_t_final = t_final - iter; 166 | } 167 | 168 | if(mpz_cmp_ui(sq_in, 0) != 0) { 169 | failures += msu.run_fixed(t_start, run_t_final, 170 | sq_in, hw_emu); 171 | } else { 172 | failures += msu.run_random(t_start, run_t_final, 173 | 
rrandom, hw_emu); 174 | } 175 | 176 | iter += intermediate_iters; 177 | mpz_set(sq_in, msu.sq_out); 178 | 179 | printf("\n"); 180 | if(failures > 0) { 181 | return(failures); 182 | } 183 | if(!hw_emu) { 184 | double ns_per_iter = ((double)msu.compute_time / 185 | (double)run_t_final); 186 | gmp_printf("%lu %0.1lf ns/sq: %Zd\n", iter, ns_per_iter, 187 | msu.sq_out); 188 | } 189 | } 190 | } 191 | if(failures == 0 && hw_emu) { 192 | printf("\nPASSED %ld iterations\n", test_iterations*(t_final-t_start)); 193 | } 194 | 195 | return(failures); 196 | } 197 | -------------------------------------------------------------------------------- /msu/sw/MSUVerilator.cpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #include 18 | 19 | vluint64_t *main_time_singleton = 0; 20 | 21 | // Called by $time in Verilog 22 | double sc_time_stamp() { 23 | if(main_time_singleton) { 24 | return *main_time_singleton; 25 | } else { 26 | return(0); 27 | } 28 | } 29 | 30 | MSUVerilator::MSUVerilator(int argc, char** argv) { 31 | mpz_inits(msu_in, msu_out, 0); 32 | 33 | main_time_singleton = &main_time; 34 | 35 | pet(); 36 | 37 | // Pass arguments so Verilated code can see them, e.g. 
$value$plusargs 38 | Verilated::commandArgs(argc, argv); 39 | 40 | // Set debug level, 0 is off, 9 is highest presently used 41 | Verilated::debug(0); 42 | 43 | // Randomization reset policy 44 | Verilated::randReset(2); 45 | 46 | // Construct the Verilated model 47 | tb = new Vtb; 48 | 49 | // If verilator was invoked with --trace argument, 50 | // and if at run time passed the +trace argument, turn on tracing 51 | tfp = NULL; 52 | #if VM_TRACE 53 | const char* flag = Verilated::commandArgsPlusMatch("trace"); 54 | if (flag && 0==strcmp(flag, "+trace")) { 55 | Verilated::traceEverOn(true); 56 | VL_PRINTF("Enabling waves into obj_dir/logs/vlt_dump.vcd...\n"); 57 | tfp = new VerilatedVcdC; 58 | tb->trace(tfp, 99); // Trace 99 levels of hierarchy 59 | // Not supported in default Centos version 60 | //Verilated::mkdir("logs"); 61 | tfp->open("logs/vlt_dump.vcd"); // Open the dump file 62 | } 63 | #endif 64 | } 65 | 66 | MSUVerilator::~MSUVerilator() { 67 | mpz_clears(msu_in, msu_out, 0); 68 | 69 | tb->final(); 70 | 71 | // Close trace if opened 72 | #if VM_TRACE 73 | if (tfp) { tfp->close(); tfp = NULL; } 74 | #endif 75 | 76 | // Coverage analysis (since test passed) 77 | #if VM_COVERAGE 78 | Verilated::mkdir("logs"); 79 | VerilatedCov::write("logs/coverage.dat"); 80 | #endif 81 | 82 | // Destroy model 83 | delete tb; tb = NULL; 84 | } 85 | 86 | void MSUVerilator::init(MSU *_msu, Squarer *_squarer) { 87 | MSUDevice::init(_msu, _squarer); 88 | 89 | int nonredundant_elements = msu->mod_len / WORD_LEN; 90 | int num_elements = nonredundant_elements + REDUNDANT_ELEMENTS; 91 | msu_words_in = (T_LEN/MSU_WORD_LEN*2 + (nonredundant_elements+1)/2); 92 | msu_words_out = (T_LEN/MSU_WORD_LEN + num_elements); 93 | } 94 | 95 | void MSUVerilator::reset() { 96 | // Reset the device 97 | tb->reset = 1; 98 | tb->clk = 1; 99 | tb->ap_start = 0; 100 | tb->s_axis_tlast = 0; 101 | tb->s_axis_tvalid = 0; 102 | tb->m_axis_tready = 0; 103 | 104 | for(int i = 0; i < 10; i++) { 105 | 
clock_cycle(); 106 | } 107 | 108 | // Out of reset 109 | tb->reset = 0; 110 | clock_cycle(); 111 | clock_cycle(); 112 | clock_cycle(); 113 | } 114 | 115 | void MSUVerilator::compute_job(uint64_t t_start, 116 | uint64_t t_final, 117 | mpz_t sq_in, 118 | mpz_t sq_out) { 119 | // Load values 120 | tb->ap_start = 1; 121 | clock_cycle(); 122 | tb->ap_start = 0; 123 | 124 | squarer->pack(msu_in, t_start, t_final, sq_in); 125 | gmp_printf("msu_in is 0x%Zx\n", msu_in); 126 | axi_write(msu_in, squarer->msu_words_in()); 127 | 128 | while(!tb->start_xfer) { 129 | clock_cycle(); 130 | } 131 | pet(); 132 | 133 | //bn_from_buffer(msu_out, tb->msu_out, squarer->msu_words_out()); 134 | axi_read(msu_out, squarer->msu_words_out()); 135 | gmp_printf("MSU result is 0x%Zx\n", msu_out); 136 | 137 | uint64_t t_final_out; 138 | squarer->unpack(sq_out, &t_final_out, msu_out, WORD_LEN); 139 | 140 | clock_cycle(); 141 | clock_cycle(); 142 | clock_cycle(); 143 | } 144 | 145 | void MSUVerilator::clock_cycle() { 146 | watchdog++; 147 | if(watchdog == 1000) { 148 | printf("ERROR: Hit cycle count limit\n"); 149 | #if VM_TRACE 150 | if (tfp) { tfp->close(); tfp = NULL; } 151 | #endif 152 | exit(0); 153 | } 154 | 155 | main_time++; 156 | tb->clk = 0; 157 | tb->eval(); 158 | #if VM_TRACE 159 | if (tfp) tfp->dump (main_time); 160 | #endif 161 | 162 | main_time++; 163 | tb->clk = 1; 164 | tb->eval(); 165 | #if VM_TRACE 166 | if (tfp) tfp->dump (main_time); 167 | #endif 168 | } 169 | 170 | void MSUVerilator::axi_write(mpz_t data, int words) { 171 | while(words > 0) { 172 | uint32_t d = mpz_get_ui(data); 173 | bn_shr(data, BN_BUFFER_SIZE*8); 174 | 175 | while(!tb->s_axis_tready) { 176 | clock_cycle(); 177 | } 178 | pet(); 179 | 180 | if(words == 1) { 181 | tb->s_axis_tlast = 1; 182 | } 183 | tb->s_axis_tvalid = 1; 184 | tb->s_axis_tdata = d; 185 | clock_cycle(); 186 | tb->s_axis_tlast = 0; 187 | words--; 188 | } 189 | clock_cycle(); 190 | } 191 | 192 | 193 | void MSUVerilator::axi_read(mpz_t data, int 
words) { 194 | printf("Reading from axi\n"); 195 | mpz_t d; 196 | mpz_init(d); 197 | 198 | uint64_t total_words = words; 199 | 200 | tb->m_axis_tready = 1; 201 | while(words > 0) { 202 | while(!tb->m_axis_tvalid) { 203 | clock_cycle(); 204 | } 205 | pet(); 206 | 207 | bn_shr(data, BN_BUFFER_SIZE*8); 208 | mpz_set_ui(d, tb->m_axis_tdata); 209 | bn_shl(d, (total_words-1) * BN_BUFFER_SIZE*8); 210 | mpz_add(data, data, d); 211 | clock_cycle(); 212 | words--; 213 | } 214 | } 215 | -------------------------------------------------------------------------------- /msu/sw/Squarer.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | #ifndef _SQUARER_H_ 18 | #define _SQUARER_H_ 19 | 20 | #include 21 | #include 22 | #include 23 | 24 | class Squarer { 25 | protected: 26 | uint64_t mod_len; 27 | mpz_t modulus; 28 | public: 29 | Squarer(uint64_t _mod_len, mpz_t _modulus) { 30 | mod_len = _mod_len; 31 | 32 | mpz_init(modulus); 33 | mpz_set(modulus, _modulus); 34 | } 35 | virtual ~Squarer() { 36 | mpz_clear(modulus); 37 | } 38 | 39 | virtual uint64_t msu_words_in() = 0; 40 | virtual uint64_t msu_words_out() = 0; 41 | 42 | // Pack data into a buffer to be transmitted to the SDAccel RTL kernel. 
43 | virtual void pack(mpz_t msu_in, uint64_t t_start, uint64_t t_final, 44 | mpz_t sq_in) = 0; 45 | 46 | // Unpack data from a buffer after receiving from the SDAccel RTL kernel. 47 | virtual void unpack(mpz_t sq_out, uint64_t *t_final, mpz_t msu_out, 48 | int word_len) = 0; 49 | }; 50 | 51 | class SquarerOzturk : public Squarer { 52 | protected: 53 | int words_in; 54 | int words_out; 55 | 56 | public: 57 | SquarerOzturk(uint64_t _mod_len, mpz_t _modulus) 58 | : Squarer(_mod_len, _modulus) { 59 | int nonredundant_elements = _mod_len / WORD_LEN; 60 | int num_elements = nonredundant_elements + REDUNDANT_ELEMENTS; 61 | // Only the square in/out words are included here 62 | words_in = (nonredundant_elements+1)/2; 63 | words_out = num_elements; 64 | } 65 | 66 | virtual uint64_t msu_words_in() { 67 | return(T_LEN/MSU_WORD_LEN*2 + words_in); 68 | } 69 | 70 | virtual uint64_t msu_words_out() { 71 | return(T_LEN/MSU_WORD_LEN + words_out); 72 | } 73 | 74 | virtual void pack(mpz_t msu_in, uint64_t t_start, uint64_t t_final, 75 | mpz_t sq_in) { 76 | mpz_set(msu_in, sq_in); 77 | 78 | // t_final 79 | bn_shl(msu_in, T_LEN); 80 | mpz_add_ui(msu_in, msu_in, t_final); 81 | 82 | // t_start 83 | bn_shl(msu_in, T_LEN); 84 | mpz_add_ui(msu_in, msu_in, t_start); 85 | 86 | } 87 | virtual void unpack(mpz_t sq_out, uint64_t *t_final, mpz_t msu_out, 88 | int word_len) { 89 | *t_final = mpz_get_ui(msu_out); 90 | bn_shr(msu_out, T_LEN); 91 | 92 | // Reduce the polynomial from redundant form 93 | reduce_polynomial(sq_out, msu_out, word_len, MSU_WORD_LEN); 94 | } 95 | 96 | void reduce_polynomial(mpz_t result, mpz_t poly, 97 | int word_len, int padded_word_len) { 98 | uint64_t mask = (1ULL<). 10 | 11 | A distilled down set of instructions specific to this design follows. 12 | 13 | **Note that you can also enable AWS F1 hardware emulation and synthesis on-premise. 
See [SDAccel On-Premise](#sdaccel-on-premise)** 14 | 15 | ## Host instantiation 16 | 17 | We assume some familiarity with the AWS environment. To instantiate a new AWS host for working with the FPGA, follow these steps: 18 | 19 | 1. Log in to the AWS page, then go to the EC2 service portal 20 | 1. Click on Launch Instance 21 | 1. For AMI, go to AWS Marketplace, then search for FPGA 22 | 1. Choose FPGA Developer AMI 23 | 1. For instance type choose z1d.2xlarge for development or f1.2xlarge for an FPGA-enabled host, then Review and Launch 24 | 1. For configuration of the host we recommend: 25 | 1. Increase root disk space by about 20GB for an f1.2xlarge, 60GB for a z1d.2xlarge. 26 | 1. Add a descriptive tag to help track instances and volumes 27 | 1. Launch the instance 28 | 1. In the EC2 Instances page, select the instance and choose Actions->Connect. This will tell you the instance hostname that you can ssh to. 29 | 1. Note that for the FPGA Developer AMI the username will be 'centos' 30 | 1. Log in with `ssh centos@HOST` 31 | 32 | You may find it convenient to install additional ssh keys for github, etc. 33 | 34 | ## Host setup 35 | 36 | Some initial setup is required for new F1 hosts. See the AWS documentation for more detail. 37 | 38 | We've encapsulated a typical setup that includes vnc: 39 | ``` 40 | ./msu/scripts/f1_setup.sh 41 | ``` 42 | 43 | You can then optionally start a vncserver if you prefer to work in an X-windows environment: 44 | ``` 45 | # Start a vncserver 46 | vncserver 47 | ``` 48 | 49 | Connect using ssh to tunnel the vnc port: 50 | ``` 51 | ssh -L 5908:localhost:5901 centos@HOST 52 | ``` 53 | 54 | And view it locally: 55 | ``` 56 | vncviewer :8 57 | ``` 58 | 59 | Once you have vnc up, run vncconfig to enable copy/paste: 60 | ``` 61 | vncconfig & 62 | ``` 63 | 64 | ## Vivado 65 | 66 | You can run Vivado on either the simple or Ozturk multipliers using the pre-installed tools. 
67 | 68 | Choose your target: 69 | ``` 70 | # Simple 71 | cd msu/rtl/vivado_simple 72 | 73 | # Ozturk 74 | cd msu/rtl/vivado_ozturk 75 | ``` 76 | 77 | Update the target part. The AWS FPGA AMI does not have the basic initial part installed, so change the msu.tcl file to use the vu9p target part. 78 | ``` 79 | sed 's/xc7s100fgga676-2/xcvu9p-flga2104-1-e/g' msu.tcl > msu2.tcl 80 | mv msu2.tcl msu.tcl 81 | ``` 82 | 83 | Launch Vivado: 84 | ``` 85 | ./run_vivado.sh 86 | ``` 87 | 88 | ## Hardware Emulation 89 | 90 | To build and run a test in hardware emulation: 91 | ``` 92 | source ./msu/scripts/sdaccel_env.sh 93 | cd msu 94 | make clean 95 | make hw_emu 96 | ``` 97 | 98 | Rerunning without cleaning the build will retain the hardware emulation (hardware) portion while rebuilding and executing the host (software) portion. 99 | 100 | Tracing is enabled by default in the hw_emu run. To view the resulting waveforms run: 101 | ``` 102 | vivado -source open_waves.tcl 103 | ``` 104 | 105 | ## Hardware Synthesis 106 | 107 | Synthesis and Place&Route compile the design from RTL into a bitstream that can be loaded on the FPGA. This step takes 1-3 hours depending on the complexity of the design, host speed, synthesis targets, etc. 108 | 109 | You can enable a **faster run** by relaxing the kernel frequency (search for kernel_frequency in the Makefile) or building a smaller multiplier (comment out 1024b, uncomment 128b in the Makefile). This is often convenient when trying things out. 110 | 111 | ``` 112 | source ./msu/scripts/sdaccel_env.sh 113 | cd msu 114 | make clean 115 | make hw 116 | ``` 117 | 118 | Once synthesis completes successfully you can register the new image to process it for running on FPGA hardware. Follow the AWS instructions to set up an S3 bucket. This only needs to be done once. We assume a bucket name 'vdfsn' but you will need to change this to match your bucket name. Once that is done, run the following: 119 | 120 | ``` 121 | # Configure AWS credentials. 
You should only need to do this once on a given 122 | # host 123 | # AWS Access Key ID [None]: XXXXXX 124 | # AWS Secret Access Key [None]: XXXXXX 125 | # Default region name [None]: us-east-1 126 | # Default output format [None]: json 127 | aws configure 128 | 129 | # Register the new bitstream 130 | # Update S3_BUCKET in Makefile.sdaccel to reflect the name of your bucket. 131 | cd msu/rtl/sdaccel 132 | make to_f1 133 | 134 | # Check status using the afi_id from the last step. It should say 135 | # pending for about 30 minutes, then available. 136 | cat *afi_id.txt 137 | aws ec2 describe-fpga-images --fpga-image-ids afi-XXXXXXXXXXXX 138 | 139 | # Copy the required files to an FPGA enabled host for execution: 140 | HOST=xxxx # Your F1 hostname here 141 | scp obj/to_f1.tar.gz centos@$HOST:. 142 | ``` 143 | 144 | ## FPGA Execution 145 | 146 | Once you have synthesized a bitstream, registered it using `create_sdaccel_afi.sh`, confirmed that `aws ec2 describe-fpga-images` reports it as available, and copied the necessary files to an F1 machine, you are ready to execute on the FPGA. 147 | 148 | Currently debug mode is required due to a known AWS issue. Create an `sdaccel.ini` file in the same directory you will be running from: 149 | ``` 150 | cat <<EOF > sdaccel.ini 151 | [Debug] 152 | profile=true 153 | EOF 154 | ``` 155 | 156 | Execute the host driver code. This will automatically load the image referenced by the awsxclbin file onto the FPGA. 
157 | ``` 158 | tar xf to_f1.tar.gz 159 | sudo su 160 | source $AWS_FPGA_REPO_DIR/sdaccel_runtime_setup.sh 161 | 162 | # Run a short test and verify the result in software 163 | ./host -e -f 100 164 | 165 | # Run a billion iterations starting with an input of 2 166 | ./host -s 0x2 -f 1073741824 167 | ``` 168 | 169 | The expected result of 2^2^2^30 using the default 1k (64 coefficient) modulus in the Makefile is: 170 | `9782776834334634490446343758704728706980122657033141222406929631982781114105293252444979173994924549755313289718816652420124314107449156688222852673024696927113240716169907514261823484008194829047317452425855361884165852504086556390349991640188347831084926001670580437428161157316196941905575574310934275893` 171 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/vdf_wrapper.sv: -------------------------------------------------------------------------------- 1 | // /******************************************************************************* 2 | // Copyright (c) 2018, Xilinx, Inc. 3 | // All rights reserved. 4 | // 5 | // Redistribution and use in source and binary forms, with or without modification, 6 | // are permitted provided that the following conditions are met: 7 | // 8 | // 1. Redistributions of source code must retain the above copyright notice, 9 | // this list of conditions and the following disclaimer. 10 | // 11 | // 12 | // 2. Redistributions in binary form must reproduce the above copyright notice, 13 | // this list of conditions and the following disclaimer in the documentation 14 | // and/or other materials provided with the distribution. 15 | // 16 | // 17 | // 3. Neither the name of the copyright holder nor the names of its contributors 18 | // may be used to endorse or promote products derived from this software 19 | // without specific prior written permission. 
20 | // 21 | // 22 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | // WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 25 | // IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | // INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | // BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | // OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | // EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | // 33 | // *******************************************************************************/ 34 | 35 | // default_nettype of none prevents implicit wire declaration. 
36 | `default_nettype none 37 | module vdf_wrapper #( 38 | parameter integer C_M00_AXI_ADDR_WIDTH = 64 , 39 | parameter integer C_M00_AXI_DATA_WIDTH = 512 40 | ) 41 | ( 42 | // System Signals 43 | input wire ap_clk , 44 | input wire ap_rst_n , 45 | // AXI4 master interface m00_axi 46 | output wire m00_axi_awvalid , 47 | input wire m00_axi_awready , 48 | output wire [C_M00_AXI_ADDR_WIDTH-1:0] m00_axi_awaddr , 49 | output wire [8-1:0] m00_axi_awlen , 50 | output wire m00_axi_wvalid , 51 | input wire m00_axi_wready , 52 | output wire [C_M00_AXI_DATA_WIDTH-1:0] m00_axi_wdata , 53 | output wire [C_M00_AXI_DATA_WIDTH/8-1:0] m00_axi_wstrb , 54 | output wire m00_axi_wlast , 55 | input wire m00_axi_bvalid , 56 | output wire m00_axi_bready , 57 | output wire m00_axi_arvalid , 58 | input wire m00_axi_arready , 59 | output wire [C_M00_AXI_ADDR_WIDTH-1:0] m00_axi_araddr , 60 | output wire [8-1:0] m00_axi_arlen , 61 | input wire m00_axi_rvalid , 62 | output wire m00_axi_rready , 63 | input wire [C_M00_AXI_DATA_WIDTH-1:0] m00_axi_rdata , 64 | input wire m00_axi_rlast , 65 | // SDx Control Signals 66 | input wire ap_start , 67 | output wire ap_idle , 68 | output wire ap_done , 69 | input wire [32-1:0] input0 , 70 | input wire [64-1:0] input_mem , 71 | input wire [64-1:0] output_mem , 72 | input wire [64-1:0] intermediates_mem 73 | ); 74 | 75 | 76 | timeunit 1ps; 77 | timeprecision 1ps; 78 | 79 | /////////////////////////////////////////////////////////////////////////////// 80 | // Local Parameters 81 | /////////////////////////////////////////////////////////////////////////////// 82 | // Large enough for interesting traffic. 
83 | localparam integer LP_DEFAULT_LENGTH_IN_BYTES = (C_M00_AXI_DATA_WIDTH/8*20); 84 | localparam integer LP_NUM_EXAMPLES = 1; 85 | 86 | /////////////////////////////////////////////////////////////////////////////// 87 | // Wires and Variables 88 | /////////////////////////////////////////////////////////////////////////////// 89 | (* KEEP = "yes" *) 90 | logic areset = 1'b0; 91 | logic ap_start_r = 1'b0; 92 | logic ap_idle_r = 1'b1; 93 | logic ap_start_pulse ; 94 | logic [LP_NUM_EXAMPLES-1:0] ap_done_i ; 95 | logic [LP_NUM_EXAMPLES-1:0] ap_done_r = {LP_NUM_EXAMPLES{1'b0}}; 96 | logic [32-1:0] ctrl_constant = 32'd1; 97 | 98 | /////////////////////////////////////////////////////////////////////////////// 99 | // Begin RTL 100 | /////////////////////////////////////////////////////////////////////////////// 101 | 102 | // Register and invert reset signal. 103 | always @(posedge ap_clk) begin 104 | areset <= ~ap_rst_n; 105 | end 106 | 107 | // create pulse when ap_start transitions to 1 108 | always @(posedge ap_clk) begin 109 | begin 110 | ap_start_r <= ap_start; 111 | end 112 | end 113 | 114 | assign ap_start_pulse = ap_start & ~ap_start_r; 115 | 116 | // ap_idle is asserted when done is asserted, it is de-asserted when ap_start_pulse 117 | // is asserted 118 | always @(posedge ap_clk) begin 119 | if (areset) begin 120 | ap_idle_r <= 1'b1; 121 | end 122 | else begin 123 | ap_idle_r <= ap_done ? 1'b1 : 124 | ap_start_pulse ? 1'b0 : ap_idle; 125 | end 126 | end 127 | 128 | assign ap_idle = ap_idle_r; 129 | 130 | // Done logic 131 | always @(posedge ap_clk) begin 132 | if (areset) begin 133 | ap_done_r <= '0; 134 | end 135 | else begin 136 | ap_done_r <= (ap_start_pulse | ap_done) ? 
'0 : ap_done_r | ap_done_i; 137 | end 138 | end 139 | 140 | assign ap_done = &ap_done_r; 141 | 142 | // VDF kernel 143 | vdf_kernel #( 144 | .C_M_AXI_ADDR_WIDTH ( C_M00_AXI_ADDR_WIDTH ), 145 | .C_M_AXI_DATA_WIDTH ( C_M00_AXI_DATA_WIDTH ), 146 | .C_ADDER_BIT_WIDTH ( 32 ), 147 | .C_XFER_SIZE_WIDTH ( 32 ) 148 | ) 149 | inst_kernel ( 150 | .aclk ( ap_clk ), 151 | .areset ( areset ), 152 | .kernel_clk ( ap_clk ), 153 | .kernel_rst ( areset ), 154 | .ctrl_addr_offset_in ( input_mem ), 155 | .ctrl_addr_offset_out ( output_mem ), 156 | .ctrl_constant ( 32'b1 ), 157 | .ap_start ( ap_start_pulse ), 158 | .ap_done ( ap_done_i[0] ), 159 | .m_axi_awvalid ( m00_axi_awvalid ), 160 | .m_axi_awready ( m00_axi_awready ), 161 | .m_axi_awaddr ( m00_axi_awaddr ), 162 | .m_axi_awlen ( m00_axi_awlen ), 163 | .m_axi_wvalid ( m00_axi_wvalid ), 164 | .m_axi_wready ( m00_axi_wready ), 165 | .m_axi_wdata ( m00_axi_wdata ), 166 | .m_axi_wstrb ( m00_axi_wstrb ), 167 | .m_axi_wlast ( m00_axi_wlast ), 168 | .m_axi_bvalid ( m00_axi_bvalid ), 169 | .m_axi_bready ( m00_axi_bready ), 170 | .m_axi_arvalid ( m00_axi_arvalid ), 171 | .m_axi_arready ( m00_axi_arready ), 172 | .m_axi_araddr ( m00_axi_araddr ), 173 | .m_axi_arlen ( m00_axi_arlen ), 174 | .m_axi_rvalid ( m00_axi_rvalid ), 175 | .m_axi_rready ( m00_axi_rready ), 176 | .m_axi_rdata ( m00_axi_rdata ), 177 | .m_axi_rlast ( m00_axi_rlast ), 178 | .input0 ( input0 ) 179 | ); 180 | 181 | 182 | endmodule : vdf_wrapper 183 | `default_nettype wire 184 | -------------------------------------------------------------------------------- /primitives/rtl/multiply.sv: -------------------------------------------------------------------------------- 1 | /******************************************************************************* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License.
6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | *******************************************************************************/ 16 | 17 | /* 18 | Multiply two arrays element by element 19 | The products are split into low (L) and high (H) values 20 | The products in each column are summed using compressor trees 21 | Leave results in carry/sum format 22 | 23 | Example A*B 4x4 element multiply results in 8 carry/sum values 24 | 25 | |----------------------------------| 26 | | B3 | B2 | B1 | B0 | 27 | |----------------------------------| 28 | |----------------------------------| 29 | x | A3 | A2 | A1 | A0 | 30 | |----------------------------------| 31 | ------------------------------------------------------------------------- 32 | 33 | Col 34 | Row 7 6 5 4 3 2 1 0 35 | 0 A00B03L A00B02L A00B01L A00B00L 36 | 1 A00B03H A00B02H A00B01H A00B00H 37 | 2 A01B03L A01B02L A01B01L A01B00L 38 | 3 A01B03H A01B02H A01B01H A01B00H 39 | 4 A02B03L A02B02L A02B01L A02B00L 40 | 5 A02B03H A02B02H A02B01H A02B00H 41 | 6 A03B03L A03B02L A03B01L A03B00L 42 | 7 + A03B03H A03B02H A03B01H A03B00H 43 | ------------------------------------------------------------------------- 44 | C7,S7 C6,S6 C5,S5 C4,S4 C3,S3 C2,S2 C1,S1 C0,S0 45 | */ 46 | 47 | module multiply 48 | #( 49 | parameter int NUM_ELEMENTS = 33, 50 | parameter int A_BIT_LEN = 17, 51 | parameter int B_BIT_LEN = 17, 52 | parameter int WORD_LEN = 16, 53 | 54 | parameter int MUL_OUT_BIT_LEN = A_BIT_LEN + B_BIT_LEN, 55 | parameter int COL_BIT_LEN = MUL_OUT_BIT_LEN - WORD_LEN, 56 | 57 | // Extra bits needed for accumulation depend on bit width 58 | // If one
operand is larger than the other, then only need enough extra 59 | // bits based on number of larger operands. 60 | parameter int EXTRA_TREE_BITS = (COL_BIT_LEN > WORD_LEN) ? 61 | $clog2(NUM_ELEMENTS) : 62 | $clog2(NUM_ELEMENTS*2), 63 | parameter int OUT_BIT_LEN = COL_BIT_LEN + EXTRA_TREE_BITS 64 | ) 65 | ( 66 | input logic clk, 67 | input logic [A_BIT_LEN-1:0] A[NUM_ELEMENTS], 68 | input logic [B_BIT_LEN-1:0] B[NUM_ELEMENTS], 69 | output logic [OUT_BIT_LEN-1:0] Cout[NUM_ELEMENTS*2], 70 | output logic [OUT_BIT_LEN-1:0] S[NUM_ELEMENTS*2] 71 | ); 72 | 73 | localparam int GRID_PAD_SHORT = EXTRA_TREE_BITS; 74 | localparam int GRID_PAD_LONG = (COL_BIT_LEN - WORD_LEN) + 75 | EXTRA_TREE_BITS; 76 | 77 | logic [MUL_OUT_BIT_LEN-1:0] mul_result[NUM_ELEMENTS*NUM_ELEMENTS]; 78 | logic [OUT_BIT_LEN-1:0] grid[NUM_ELEMENTS*2][NUM_ELEMENTS*2]; 79 | 80 | 81 | // Instantiate all the multipliers, requires NUM_ELEMENTS^2 muls 82 | genvar i, j; 83 | generate 84 | for (i=0; i DEVICE=" 6 | $(ECHO) " Generate the design for specified Target and Device." 7 | $(ECHO) "" 8 | $(ECHO) " make clean " 9 | $(ECHO) " Remove the generated non-hardware files." 10 | $(ECHO) "" 11 | $(ECHO) " make cleanall" 12 | $(ECHO) " Remove all the generated files." 13 | $(ECHO) "" 14 | $(ECHO) " make check TARGET= DEVICE=" 15 | $(ECHO) " Run application in emulation." 
16 | $(ECHO) "" 17 | 18 | 19 | ############################################################################ 20 | # Multiplier configuration 21 | ############################################################################ 22 | include ../../multiplier.mk 23 | 24 | HOST_FLAGS_HW_EMU = -e -f 1 25 | HOST_FLAGS_FPGA = -e -f 10 26 | 27 | ############################################################################ 28 | # Synthesis directives 29 | ############################################################################ 30 | 31 | ifeq ($(SIMPLE_SQ), 1) 32 | LDCLFLAGS += -DSIMPLE_SQ=1 33 | endif 34 | LDCLFLAGS += -DMOD_LEN_DEF=$(MOD_LEN) 35 | LDCLFLAGS += -DMODULUS_DEF=$(MOD_LEN)\'d$(MODULUS) 36 | LDCLFLAGS += -DMSU_SQ_IN_BITS_DEF=$(SQ_IN_BITS) 37 | LDCLFLAGS += -DMSU_SQ_OUT_BITS_DEF=$(SQ_OUT_BITS) 38 | 39 | 40 | LDCLFLAGS += --xp "vivado_prop:run.pfm_dynamic_vdf_1_0_synth_1.\ 41 | {STEPS.SYNTH_DESIGN.ARGS.FANOUT_LIMIT}={400}" 42 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 43 | {STEPS.OPT_DESIGN.ARGS.DIRECTIVE}={Explore}" 44 | #LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 45 | {STEPS.PLACE_DESIGN.ARGS.DIRECTIVE}={SSI_HighUtilSLRs}" 46 | #LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 47 | # {STEPS.PLACE_DESIGN.ARGS.DIRECTIVE}={SSI_SpreadLogic_High}" 48 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 49 | {STEPS.PLACE_DESIGN.ARGS.DIRECTIVE}={Explore}" 50 | #LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 51 | # {STEPS.PLACE_DESIGN.ARGS.DIRECTIVE}={ExtraTimingOpt}" 52 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 53 | {STEPS.PHYS_OPT_DESIGN.ARGS.DIRECTIVE}=\ 54 | {AlternateFlowWithRetiming}" 55 | 56 | # Vivado 2018.3 57 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 58 | {STEPS.ROUTE_DESIGN.ARGS.DIRECTIVE}={AggressiveExplore}" 59 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.\ 60 | {STEPS.POST_ROUTE_PHYS_OPT_DESIGN.ARGS.DIRECTIVE}=\ 61 | {AggressiveExplore}" 62 | 63 | # Add in additional constraints, such as pblocks 64 | # This constraint places all performance critical logic in 
SLR2, which is 65 | # free of shell logic. 66 | PLACER_CONSTRS = $(realpath ../placer_constrs.xdc) 67 | LDCLFLAGS += --xp "vivado_prop:run.impl_1.{STEPS.PLACE_DESIGN.TCL.PRE}=\ 68 | $(PLACER_CONSTRS)" 69 | 70 | LDCLFLAGS += --kernel_frequency 161 71 | 72 | ############################################################################ 73 | # AWS/SDAccel configuration 74 | ############################################################################ 75 | 76 | # Points to Utility Directory 77 | COMMON_REPO = $(AWS_FPGA_REPO_DIR)/SDAccel/examples/xilinx 78 | ABS_COMMON_REPO = $(shell readlink -f $(COMMON_REPO)) 79 | 80 | ROOT_DIR = $(realpath ../../../..) 81 | MODSQR_DIR = $(ROOT_DIR)/modular_square 82 | SCRIPTS_DIR = ../tcl 83 | HOST_SRC_DIR = $(ROOT_DIR)/msu/sw 84 | 85 | HOST_SRCS += $(HOST_SRC_DIR)/MSUSDAccel.cpp 86 | HOST_SRCS += $(HOST_SRC_DIR)/MSU.cpp 87 | HOST_SRCS += $(HOST_SRC_DIR)/main.cpp 88 | 89 | CXXFLAGS += -I$(HOST_SRC_DIR) -DFPGA=1 \ 90 | -DMODULUS=\"$(MODULUS)\" \ 91 | -DMOD_LEN=$(MOD_LEN) 92 | ifeq ($(SIMPLE_SQ), 1) 93 | CXXFLAGS += -DSIMPLE_SQ=1 94 | endif 95 | 96 | TARGETS := hw 97 | TARGET := $(TARGETS) 98 | DEVICE := $(AWS_PLATFORM) 99 | XCLBIN := ./xclbin 100 | 101 | include ../utils.mk 102 | 103 | DSA := $(call device2sandsa, $(DEVICE)) 104 | BUILD_DIR := ./vdf/_x.$(TARGET).$(DSA) 105 | 106 | CXX := $(XILINX_SDX)/bin/xcpp 107 | XOCC := $(XILINX_SDX)/bin/xocc 108 | VIVADO := vivado 109 | 110 | # Include Libraries 111 | include $(ABS_COMMON_REPO)/libs/opencl/opencl.mk 112 | include $(ABS_COMMON_REPO)/libs/xcl2/xcl2.mk 113 | CXXFLAGS += $(xcl2_CXXFLAGS) 114 | LDFLAGS += $(xcl2_LDFLAGS) 115 | HOST_SRCS += $(xcl2_SRCS) 116 | 117 | CXXFLAGS += $(opencl_CXXFLAGS) -Wall -O0 -std=c++14 118 | LDFLAGS += $(opencl_LDFLAGS) -lgmp 119 | 120 | # Host compiler global settings 121 | CXXFLAGS += -fmessage-length=0 122 | LDFLAGS += -lrt -lstdc++ 123 | 124 | # Kernel compiler global settings 125 | CLFLAGS += -t $(TARGET) --platform $(DEVICE) 126 | CLFLAGS += 
--save-temps --temp_dir $(BUILD_DIR) 127 | 128 | # Enable waveform tracing 129 | TRACE = 1 130 | CLFLAGS += -g 131 | 132 | # Gather files to copy to an FPGA enabled server 133 | F1_FILE = to_f1.tar.gz 134 | 135 | # Host side executable name 136 | EXECUTABLE = host 137 | 138 | EMCONFIG_DIR = $(XCLBIN)/$(DSA) 139 | 140 | BINARY_CONTAINER = $(XCLBIN)/vdf.$(TARGET).$(DSA).xclbin 141 | BINARY_CONTAINER_XO = $(XCLBIN)/vdf.$(TARGET).$(DSA).xo 142 | BINARY_CONTAINER_AWS = vdf.$(TARGET).$(DSA).awsxclbin 143 | 144 | CP = cp -rf 145 | 146 | S3_BUCKET = vdfsn 147 | 148 | 149 | ############################################################################ 150 | # Rules 151 | ############################################################################ 152 | 153 | .PHONY: all clean cleanall docs emconfig 154 | all: check-devices $(EXECUTABLE) $(BINARY_CONTAINER) emconfig 155 | 156 | .PHONY: exe 157 | exe: $(EXECUTABLE) 158 | 159 | # Gather source files needed to run the sdx GUI 160 | sdx: 161 | mkdir -p sdx/src 162 | cp $(HOST_SRCS) sdx/src 163 | cp $(xcl2_HDRS) sdx/src 164 | cp $(XCLBIN)/vdf.hw_emu.$(DSA).xo sdx/src 165 | cp $(HOST_SRC_DIR)/*.cpp sdx/src 166 | cp $(HOST_SRC_DIR)/*.hpp sdx/src 167 | cp mem/*.dat sdx/src 168 | 169 | # Gather files needed to run on the FPGA 170 | .PHONY: to_f1 171 | to_f1: $(F1_FILE) 172 | 173 | $(F1_FILE): $(EXECUTABLE) $(BINARY_CONTAINER_AWS) 174 | tar czvf $@ $^ 175 | 176 | # Register the new bitstream 177 | # This only works on an AWS host 178 | $(BINARY_CONTAINER_AWS): $(BINARY_CONTAINER) 179 | cd $(XCLBIN) 180 | $(SDACCEL_DIR)/tools/create_sdaccel_afi.sh \ 181 | -xclbin=$(BINARY_CONTAINER) \ 182 | -o=vdf.hw.xilinx_aws-vu9p-f1-04261818_dynamic_5_0 \ 183 | -s3_bucket=$(S3_BUCKET) -s3_dcp_key=dcp -s3_logs_key=logs 184 | cat *afi_id.txt 185 | 186 | 187 | # Generate the LUTs. reduction_lut_000.dat will be present for any bitwidth. 
188 | $(BINARY_CONTAINER_XO): mem/reduction_lut_000.dat 189 | $(BINARY_CONTAINER_XO): msuconfig.vh 190 | 191 | sdaccel.ini: 192 | ifeq ($(TRACE), 1) 193 | echo "[Emulation]" > sdaccel.ini 194 | echo "launch_waveform=batch" >> sdaccel.ini 195 | echo "[Debug]" >> sdaccel.ini 196 | echo "profile=true" >> sdaccel.ini 197 | echo "timeline_trace=true" >> sdaccel.ini 198 | echo "device_profile=true" >> sdaccel.ini 199 | 200 | $(BINARY_CONTAINER_XO): sdaccel.ini 201 | else 202 | rm -f sdaccel.ini 203 | endif 204 | 205 | xo: $(XCLBIN)/vdf.$(TARGET).$(DSA).xo 206 | 207 | $(XCLBIN)/vdf.$(TARGET).$(DSA).xo: 208 | mkdir -p ${XCLBIN} 209 | $(VIVADO) -mode batch -source $(SCRIPTS_DIR)/gen_xo.tcl \ 210 | -tclargs $@ vdf hw $(DEVICE) 211 | 212 | 213 | # Building kernel 214 | $(XCLBIN)/vdf.$(TARGET).$(DSA).xclbin: $(BINARY_CONTAINER_XO) 215 | mkdir -p $(XCLBIN) 216 | $(XOCC) $(CLFLAGS) $(LDCLFLAGS) \ 217 | -lo $(XCLBIN)/vdf.$(TARGET).$(DSA).xclbin \ 218 | $(XCLBIN)/vdf.$(TARGET).$(DSA).xo 219 | 220 | # Building Host 221 | $(EXECUTABLE): $(HOST_SRCS) $(HOST_HDRS) 222 | mkdir -p $(XCLBIN) 223 | $(CXX) $(CXXFLAGS) $(HOST_SRCS) $(HOST_HDRS) -o '$@' $(LDFLAGS) 224 | 225 | emconfig:$(EMCONFIG_DIR)/emconfig.json 226 | $(EMCONFIG_DIR)/emconfig.json: 227 | emconfigutil --platform $(DEVICE) --od $(EMCONFIG_DIR) 228 | 229 | check: all 230 | ifeq ($(TARGET),$(filter $(TARGET),sw_emu hw_emu)) 231 | $(CP) $(EMCONFIG_DIR)/emconfig.json . 232 | XCL_EMULATION_MODE=$(TARGET) ./$(EXECUTABLE) $(HOST_FLAGS_HW_EMU) 233 | else 234 | ./$(EXECUTABLE) $(HOST_FLAGS_FPGA) 235 | endif 236 | 237 | ifneq ($(TARGET),$(findstring $(TARGET), hw hw_emu)) 238 | $(warning WARNING:Application supports only hw hw_emu TARGET. 
\ 239 | Please use the target for running the application) 240 | endif 241 | 242 | 243 | # Cleaning stuff 244 | clean: 245 | -$(RMDIR) $(EXECUTABLE) $(XCLBIN)/{*sw_emu*,*hw_emu*} 246 | -$(RMDIR) sdaccel_* TempConfig system_estimate.xtxt *.rpt 247 | -$(RMDIR) src/*.ll _xocc_* .Xil emconfig.json 248 | -$(RMDIR) dltmp* xmltmp* *.log *.jou *.wcfg *.wdb 249 | -$(RMDIR) mem 250 | 251 | cleanall: clean 252 | -$(RMDIR) $(XCLBIN) 253 | -$(RMDIR) _x.* 254 | -$(RMDIR) ./tmp_kernel_pack* ./packaged_kernel* 255 | -$(RMDIR) mem 256 | -------------------------------------------------------------------------------- /msu/rtl/msu.sv: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2019 Supranational LLC 3 | 4 | Licensed under the Apache License, Version 2.0 (the "License"); 5 | you may not use this file except in compliance with the License. 6 | You may obtain a copy of the License at 7 | 8 | http://www.apache.org/licenses/LICENSE-2.0 9 | 10 | Unless required by applicable law or agreed to in writing, software 11 | distributed under the License is distributed on an "AS IS" BASIS, 12 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | See the License for the specific language governing permissions and 14 | limitations under the License. 15 | */ 16 | 17 | `include "msuconfig.vh" 18 | 19 | // MSU configuration 20 | `ifndef SQ_IN_BITS_DEF 21 | `define SQ_IN_BITS_DEF 1024 22 | `endif 23 | `ifndef SQ_OUT_BITS_DEF 24 | `define SQ_OUT_BITS_DEF 1024 25 | `endif 26 | 27 | 28 | module msu 29 | #( 30 | // Data width of both input and output data on AXI bus. 31 | parameter int AXI_LEN = 32, 32 | parameter int C_XFER_SIZE_WIDTH = 32, 33 | 34 | parameter int SQ_IN_BITS = `SQ_IN_BITS_DEF, 35 | parameter int SQ_OUT_BITS = `SQ_OUT_BITS_DEF, 36 | parameter int T_LEN = 64 37 | ) 38 | ( 39 | input wire clk, 40 | input wire reset, 41 | 42 | // Incoming AXI interface. 
43 | input wire s_axis_tvalid, 44 | output wire s_axis_tready, 45 | input wire [AXI_LEN-1:0] s_axis_tdata, 46 | /* verilator lint_off UNUSED */ 47 | input wire [AXI_LEN/8-1:0] s_axis_tkeep, 48 | /* verilator lint_on UNUSED */ 49 | input wire s_axis_tlast, 50 | output wire [C_XFER_SIZE_WIDTH-1:0] s_axis_xfer_size_in_bytes, 51 | 52 | // Outgoing AXI interface. 53 | output wire m_axis_tvalid, 54 | input wire m_axis_tready, 55 | output wire [AXI_LEN-1:0] m_axis_tdata, 56 | output wire [AXI_LEN/8-1:0] m_axis_tkeep, 57 | output wire m_axis_tlast, 58 | output wire [C_XFER_SIZE_WIDTH-1:0] m_axis_xfer_size_in_bytes, 59 | /* verilator lint_off UNUSED */ 60 | input wire ap_start, 61 | /* verilator lint_on UNUSED */ 62 | output wire ap_done, 63 | output wire start_xfer 64 | ); 65 | 66 | // Incoming txn count: t_start, t_final, sq_in 67 | localparam int AXI_IN_COUNT = (T_LEN/AXI_LEN*2 + 68 | SQ_IN_BITS / AXI_LEN); 69 | // Outgoing txn count: t_current, sq_out 70 | localparam int AXI_OUT_COUNT = (T_LEN/AXI_LEN + 71 | SQ_OUT_BITS / AXI_LEN); 72 | localparam int AXI_BYTES_PER_TXN = AXI_LEN/8; 73 | localparam int AXI_IN_BITS = AXI_IN_COUNT * AXI_LEN; 74 | localparam int AXI_OUT_BITS = AXI_OUT_COUNT * AXI_LEN; 75 | 76 | 77 | // State machine states. 78 | typedef enum { 79 | STATE_INIT, 80 | STATE_RECV, 81 | STATE_SQIN, 82 | STATE_START, 83 | STATE_COMPUTE, 84 | STATE_PREPARE_SEND, 85 | STATE_SEND, 86 | STATE_IDLE 87 | } State; 88 | State state; 89 | State next_state; 90 | 91 | // Squaring parameters 92 | logic [T_LEN-1:0] t_current; 93 | logic [T_LEN-1:0] t_final; 94 | logic [SQ_IN_BITS-1:0] sq_in; 95 | logic [SQ_OUT_BITS-1:0] sq_out; 96 | 97 | logic sq_start; 98 | logic sq_finished; 99 | 100 | logic final_iteration; 101 | 102 | // AXI data storage 103 | logic [AXI_IN_BITS-1:0] axi_in; 104 | logic [AXI_OUT_BITS-1:0] axi_out; 105 | logic [C_XFER_SIZE_WIDTH-1:0] axi_out_count; 106 | logic axi_in_shift; 107 | 108 | 109 | genvar gi; 110 | 111 | // Xilinx recommends clocking reset. 
112 | logic reset_1d; 113 | always @(posedge clk) begin 114 | reset_1d <= reset; 115 | end 116 | 117 | 118 | ////////////////////////////////////////////////////////////////////// 119 | // State machine 120 | ////////////////////////////////////////////////////////////////////// 121 | 122 | always @(posedge clk) begin 123 | state <= next_state; 124 | end 125 | 126 | always_comb begin 127 | if(reset_1d) begin 128 | next_state = STATE_INIT; 129 | end else begin 130 | case(state) 131 | STATE_INIT: 132 | if(ap_start) begin 133 | next_state = STATE_RECV; 134 | end else begin 135 | next_state = STATE_INIT; 136 | end 137 | 138 | STATE_RECV: 139 | if(s_axis_tlast && s_axis_tvalid && s_axis_tready) begin 140 | next_state = STATE_SQIN; 141 | end else begin 142 | next_state = STATE_RECV; 143 | end 144 | 145 | STATE_SQIN: 146 | next_state = STATE_START; 147 | 148 | STATE_START: 149 | next_state = STATE_COMPUTE; 150 | 151 | STATE_COMPUTE: 152 | if(t_current == t_final) begin 153 | next_state = STATE_PREPARE_SEND; 154 | end else begin 155 | next_state = STATE_COMPUTE; 156 | end 157 | 158 | STATE_PREPARE_SEND: 159 | next_state = STATE_SEND; 160 | 161 | STATE_SEND: 162 | if(axi_out_count == AXI_OUT_COUNT-1 && m_axis_tready) begin 163 | next_state = STATE_IDLE; 164 | end else begin 165 | next_state = STATE_SEND; 166 | end 167 | 168 | STATE_IDLE: 169 | next_state = STATE_INIT; 170 | 171 | default: 172 | next_state = STATE_INIT; 173 | endcase 174 | end 175 | end 176 | 177 | ////////////////////////////////////////////////////////////////////// 178 | // Receive AXI data 179 | ////////////////////////////////////////////////////////////////////// 180 | assign axi_in_shift = state == STATE_RECV && s_axis_tvalid; 181 | 182 | always @(posedge clk) begin 183 | if(axi_in_shift) begin 184 | axi_in <= { s_axis_tdata, axi_in[AXI_IN_BITS-1:AXI_LEN] }; 185 | end 186 | end 187 | 188 | always @(posedge clk) begin 189 | if(state == STATE_SQIN) begin 190 | t_current <= axi_in[T_LEN-1:0]; 191 | 
t_final <= axi_in[2*T_LEN-1:T_LEN]; 192 | sq_in <= axi_in[AXI_IN_BITS-1:2*T_LEN]; 193 | end else if(state == STATE_COMPUTE && sq_finished) begin 194 | t_current <= t_current + 1; 195 | end 196 | end 197 | assign final_iteration = sq_finished && (t_current == t_final-1); 198 | 199 | assign sq_start = state == STATE_START; 200 | assign s_axis_xfer_size_in_bytes = (AXI_IN_COUNT*AXI_BYTES_PER_TXN); 201 | assign s_axis_tready = (state == STATE_RECV); 202 | 203 | ////////////////////////////////////////////////////////////////////// 204 | // Modsqr function 205 | ////////////////////////////////////////////////////////////////////// 206 | 207 | `ifdef SIMPLE_SQ 208 | modular_square_simple 209 | `else 210 | modular_square_wrapper 211 | `endif 212 | #( 213 | .MOD_LEN(SQ_IN_BITS) 214 | ) 215 | modsqr 216 | ( 217 | .clk (clk), 218 | .reset (reset || reset_1d || state == STATE_RECV), 219 | .start (sq_start), 220 | .sq_in (sq_in), 221 | .sq_out (sq_out), 222 | .valid (sq_finished) 223 | ); 224 | 225 | ////////////////////////////////////////////////////////////////////// 226 | // Send AXI data 227 | ////////////////////////////////////////////////////////////////////// 228 | localparam int SQ_OUT_OFFSET = 2; 229 | always @(posedge clk) begin 230 | if(final_iteration) begin 231 | axi_out_count <= 0; 232 | axi_out[T_LEN-1:0] <= t_current; 233 | axi_out[AXI_OUT_BITS-1:T_LEN] <= sq_out; 234 | end else if(state == STATE_SEND && m_axis_tready) begin 235 | axi_out <= { {AXI_LEN{1'b0}}, 236 | axi_out[AXI_OUT_BITS-1:AXI_LEN] }; 237 | 238 | axi_out_count <= axi_out_count + 1; 239 | end 240 | end 241 | 242 | assign m_axis_xfer_size_in_bytes = AXI_OUT_COUNT*AXI_BYTES_PER_TXN; 243 | assign m_axis_tvalid = (state == STATE_SEND && 244 | axi_out_count < AXI_OUT_COUNT); 245 | assign m_axis_tdata = axi_out[AXI_LEN-1:0]; 246 | assign m_axis_tlast = 0; 247 | assign m_axis_tkeep = {(AXI_LEN/8){1'b1}}; 248 | assign start_xfer = state == STATE_PREPARE_SEND; 249 | assign ap_done = state == 
STATE_IDLE; 250 | endmodule 251 | 252 | // Local Variables: 253 | // verilog-library-directories:("." "../hw" "../../squarer_6_13/rtl/") 254 | // End: 255 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/vdf_kernel.sv: -------------------------------------------------------------------------------- 1 | // /******************************************************************************* 2 | // Copyright (c) 2018, Xilinx, Inc. 3 | // All rights reserved. 4 | // 5 | // Redistribution and use in source and binary forms, with or without modification, 6 | // are permitted provided that the following conditions are met: 7 | // 8 | // 1. Redistributions of source code must retain the above copyright notice, 9 | // this list of conditions and the following disclaimer. 10 | // 11 | // 12 | // 2. Redistributions in binary form must reproduce the above copyright notice, 13 | // this list of conditions and the following disclaimer in the documentation 14 | // and/or other materials provided with the distribution. 15 | // 16 | // 17 | // 3. Neither the name of the copyright holder nor the names of its contributors 18 | // may be used to endorse or promote products derived from this software 19 | // without specific prior written permission. 20 | // 21 | // 22 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | // WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
25 | // IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | // INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | // BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | // OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | // EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | // 33 | // *******************************************************************************/ 34 | 35 | `include "msuconfig.vh" 36 | 37 | // default_nettype of none prevents implicit wire declaration. 38 | `default_nettype none 39 | 40 | module vdf_kernel #( 41 | parameter integer C_M_AXI_ADDR_WIDTH = 64 , 42 | parameter integer C_M_AXI_DATA_WIDTH = 512, 43 | parameter integer C_XFER_SIZE_WIDTH = 32, 44 | parameter integer C_ADDER_BIT_WIDTH = 32 45 | ) 46 | ( 47 | // System Signals 48 | input wire aclk , 49 | input wire areset , 50 | // Extra clocks 51 | input wire kernel_clk , 52 | input wire kernel_rst , 53 | // AXI4 master interface 54 | output wire m_axi_awvalid , 55 | input wire m_axi_awready , 56 | output wire [C_M_AXI_ADDR_WIDTH-1:0] m_axi_awaddr , 57 | output wire [8-1:0] m_axi_awlen , 58 | output wire m_axi_wvalid , 59 | input wire m_axi_wready , 60 | output wire [C_M_AXI_DATA_WIDTH-1:0] m_axi_wdata , 61 | output wire [C_M_AXI_DATA_WIDTH/8-1:0] m_axi_wstrb , 62 | output wire m_axi_wlast , 63 | output wire m_axi_arvalid , 64 | input wire m_axi_arready , 65 | output wire [C_M_AXI_ADDR_WIDTH-1:0] m_axi_araddr , 66 | output wire [8-1:0] m_axi_arlen , 67 | input wire m_axi_rvalid , 68 | output wire m_axi_rready , 69 | input wire [C_M_AXI_DATA_WIDTH-1:0] m_axi_rdata , 70 | input wire m_axi_rlast , 71 | input wire m_axi_bvalid , 72 | output wire m_axi_bready , 73 | input wire ap_start , 74 | output 
wire ap_done , 75 | input wire [C_M_AXI_ADDR_WIDTH-1:0] ctrl_addr_offset_in, 76 | input wire [C_M_AXI_ADDR_WIDTH-1:0] ctrl_addr_offset_out, 77 | input wire [C_ADDER_BIT_WIDTH-1:0] ctrl_constant, 78 | input wire [32-1:0] input0 79 | ); 80 | 81 | timeunit 1ps; 82 | timeprecision 1ps; 83 | 84 | 85 | /////////////////////////////////////////////////////////////////////////////// 86 | // Local Parameters 87 | /////////////////////////////////////////////////////////////////////////////// 88 | localparam integer LP_DW_BYTES = C_M_AXI_DATA_WIDTH/8; 89 | localparam integer LP_AXI_BURST_LEN = 4096/LP_DW_BYTES < 256 ? 4096/LP_DW_BYTES : 256; 90 | localparam integer LP_LOG_BURST_LEN = $clog2(LP_AXI_BURST_LEN); 91 | localparam integer LP_BRAM_DEPTH = 512; 92 | localparam integer LP_RD_MAX_OUTSTANDING = LP_BRAM_DEPTH / LP_AXI_BURST_LEN; 93 | localparam integer LP_WR_MAX_OUTSTANDING = 32; 94 | 95 | /////////////////////////////////////////////////////////////////////////////// 96 | // Wires and Variables 97 | /////////////////////////////////////////////////////////////////////////////// 98 | 99 | // Control logic 100 | logic done = 1'b0; 101 | // AXI read master stage 102 | logic read_done; 103 | logic rd_tvalid; 104 | logic rd_tready; 105 | logic rd_tlast; 106 | logic [C_M_AXI_DATA_WIDTH-1:0] rd_tdata; 107 | // Adder stage 108 | logic adder_tvalid; 109 | logic adder_tready; 110 | logic [C_M_AXI_DATA_WIDTH-1:0] adder_tdata; 111 | logic [C_XFER_SIZE_WIDTH-1:0] adder_out_xfer_size_in_bytes; 112 | logic [C_XFER_SIZE_WIDTH-1:0] adder_in_xfer_size_in_bytes; 113 | 114 | // AXI write master stage 115 | logic write_done; 116 | logic start_xfer; 117 | logic ap_done_core; 118 | 119 | /////////////////////////////////////////////////////////////////////////////// 120 | // Begin RTL 121 | /////////////////////////////////////////////////////////////////////////////// 122 | 123 | // AXI4 Read Master, output format is an AXI4-Stream master, one stream per thread. 
124 | vdf_axi_read_master #( 125 | .C_M_AXI_ADDR_WIDTH ( C_M_AXI_ADDR_WIDTH ) , 126 | .C_M_AXI_DATA_WIDTH ( C_M_AXI_DATA_WIDTH ) , 127 | .C_XFER_SIZE_WIDTH ( C_XFER_SIZE_WIDTH ) , 128 | .C_MAX_OUTSTANDING ( LP_RD_MAX_OUTSTANDING ) , 129 | .C_INCLUDE_DATA_FIFO ( 1 ) 130 | ) 131 | inst_axi_read_master ( 132 | .aclk ( aclk ) , 133 | .areset ( areset ) , 134 | .ctrl_start ( ap_start ) , 135 | .ctrl_done ( read_done ) , 136 | .ctrl_addr_offset ( ctrl_addr_offset_in ) , 137 | .ctrl_xfer_size_in_bytes ( adder_in_xfer_size_in_bytes ) , 138 | .m_axi_arvalid ( m_axi_arvalid ) , 139 | .m_axi_arready ( m_axi_arready ) , 140 | .m_axi_araddr ( m_axi_araddr ) , 141 | .m_axi_arlen ( m_axi_arlen ) , 142 | .m_axi_rvalid ( m_axi_rvalid ) , 143 | .m_axi_rready ( m_axi_rready ) , 144 | .m_axi_rdata ( m_axi_rdata ) , 145 | .m_axi_rlast ( m_axi_rlast ) , 146 | .m_axis_aclk ( kernel_clk ) , 147 | .m_axis_areset ( kernel_rst ) , 148 | .m_axis_tvalid ( rd_tvalid ) , 149 | .m_axis_tready ( rd_tready ) , 150 | .m_axis_tlast ( rd_tlast ) , 151 | .m_axis_tdata ( rd_tdata ) 152 | ); 153 | 154 | msu #( 155 | .AXI_LEN ( C_M_AXI_DATA_WIDTH ), 156 | .C_XFER_SIZE_WIDTH ( C_XFER_SIZE_WIDTH ), 157 | .SQ_IN_BITS ( `SQ_IN_BITS_DEF ), 158 | .SQ_OUT_BITS ( `SQ_OUT_BITS_DEF ) 159 | ) 160 | msu 161 | ( 162 | .clk ( kernel_clk ) , 163 | .reset ( kernel_rst ) , 164 | .s_axis_tvalid ( rd_tvalid ) , 165 | .s_axis_tready ( rd_tready ) , 166 | .s_axis_tdata ( rd_tdata ) , 167 | .s_axis_tkeep ( {C_M_AXI_DATA_WIDTH/8{1'b1}} ) , 168 | .s_axis_tlast ( rd_tlast ) , 169 | .m_axis_tvalid ( adder_tvalid ) , 170 | .m_axis_tready ( adder_tready ) , 171 | .m_axis_tdata ( adder_tdata ) , 172 | .m_axis_tkeep ( ) , // Not used 173 | .m_axis_tlast ( ) , // Not used 174 | .start_xfer ( start_xfer ) , 175 | .ap_start ( ap_start ) , 176 | .ap_done ( ap_done_core ) , 177 | .m_axis_xfer_size_in_bytes(adder_out_xfer_size_in_bytes), 178 | .s_axis_xfer_size_in_bytes(adder_in_xfer_size_in_bytes) 179 | ); 180 | 181 | // AXI4 Write Master 
182 | vdf_axi_write_master #( 183 | .C_M_AXI_ADDR_WIDTH ( C_M_AXI_ADDR_WIDTH ) , 184 | .C_M_AXI_DATA_WIDTH ( C_M_AXI_DATA_WIDTH ) , 185 | .C_XFER_SIZE_WIDTH ( C_XFER_SIZE_WIDTH ) , 186 | .C_MAX_OUTSTANDING ( LP_WR_MAX_OUTSTANDING ) , 187 | .C_INCLUDE_DATA_FIFO ( 1 ) 188 | ) 189 | inst_axi_write_master ( 190 | .aclk ( aclk ) , 191 | .areset ( areset ) , 192 | .ctrl_start ( start_xfer ) , 193 | .ctrl_done ( write_done ) , 194 | .ctrl_addr_offset ( ctrl_addr_offset_out ) , 195 | .ctrl_xfer_size_in_bytes ( adder_out_xfer_size_in_bytes ) , 196 | .m_axi_awvalid ( m_axi_awvalid ) , 197 | .m_axi_awready ( m_axi_awready ) , 198 | .m_axi_awaddr ( m_axi_awaddr ) , 199 | .m_axi_awlen ( m_axi_awlen ) , 200 | .m_axi_wvalid ( m_axi_wvalid ) , 201 | .m_axi_wready ( m_axi_wready ) , 202 | .m_axi_wdata ( m_axi_wdata ) , 203 | .m_axi_wstrb ( m_axi_wstrb ) , 204 | .m_axi_wlast ( m_axi_wlast ) , 205 | .m_axi_bvalid ( m_axi_bvalid ) , 206 | .m_axi_bready ( m_axi_bready ) , 207 | .s_axis_aclk ( kernel_clk ) , 208 | .s_axis_areset ( kernel_rst ) , 209 | .s_axis_tvalid ( adder_tvalid ) , 210 | .s_axis_tready ( adder_tready ) , 211 | .s_axis_tdata ( adder_tdata ) 212 | ); 213 | 214 | assign ap_done = ((adder_out_xfer_size_in_bytes == 0) ? ap_done_core : 215 | write_done); 216 | endmodule : vdf_kernel 217 | `default_nettype wire 218 | 219 | -------------------------------------------------------------------------------- /msu/rtl/sdaccel/vdf.v: -------------------------------------------------------------------------------- 1 | // /******************************************************************************* 2 | // Copyright (c) 2018, Xilinx, Inc. 3 | // All rights reserved. 4 | // 5 | // Redistribution and use in source and binary forms, with or without modification, 6 | // are permitted provided that the following conditions are met: 7 | // 8 | // 1. Redistributions of source code must retain the above copyright notice, 9 | // this list of conditions and the following disclaimer. 
10 | // 11 | // 12 | // 2. Redistributions in binary form must reproduce the above copyright notice, 13 | // this list of conditions and the following disclaimer in the documentation 14 | // and/or other materials provided with the distribution. 15 | // 16 | // 17 | // 3. Neither the name of the copyright holder nor the names of its contributors 18 | // may be used to endorse or promote products derived from this software 19 | // without specific prior written permission. 20 | // 21 | // 22 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 23 | // ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,THE IMPLIED 24 | // WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 25 | // IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 26 | // INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | // BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 28 | // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 29 | // OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING 30 | // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 31 | // EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | // 33 | // *******************************************************************************/ 34 | 35 | // default_nettype of none prevents implicit wire declaration. 36 | `default_nettype none 37 | 38 | // Top level of the kernel. Do not modify module name, parameters or ports. 
39 | module vdf #( 40 | parameter integer C_S_AXI_CONTROL_ADDR_WIDTH = 6 , 41 | parameter integer C_S_AXI_CONTROL_DATA_WIDTH = 32 , 42 | parameter integer C_M00_AXI_ADDR_WIDTH = 64 , 43 | parameter integer C_M00_AXI_DATA_WIDTH = 32 44 | ) 45 | ( 46 | // System Signals 47 | input wire ap_clk , 48 | input wire ap_rst_n , 49 | // Note: A minimum subset of AXI4 memory mapped signals are declared. AXI 50 | // signals omitted from these interfaces are automatically inferred with the 51 | // optimal values for Xilinx SDx systems. This allows Xilinx AXI4 Interconnects 52 | // within the system to be optimized by removing logic for AXI4 protocol 53 | // features that are not necessary. When adapting AXI4 masters within the RTL 54 | // kernel that have signals not declared below, it is suitable to add the 55 | // signals to the declarations below to connect them to the AXI4 Master. 56 | // 57 | // List of omitted signals - effect 58 | // ------------------------------- 59 | // ID - Transaction ID are used for multithreading and out of order 60 | // transactions. This increases complexity. This saves logic and increases Fmax 61 | // in the system when omitted. 62 | // SIZE - Default value is log2(data width in bytes). Needed for subsize bursts. 63 | // This saves logic and increases Fmax in the system when omitted. 64 | // BURST - Default value (0b01) is incremental. Wrap and fixed bursts are not 65 | // recommended. This saves logic and increases Fmax in the system when omitted. 66 | // LOCK - Not supported in AXI4 67 | // CACHE - Default value (0b0011) allows modifiable transactions. No benefit to 68 | // changing this. 69 | // PROT - Has no effect in SDx systems. 70 | // QOS - Has no effect in SDx systems. 71 | // REGION - Has no effect in SDx systems. 72 | // USER - Has no effect in SDx systems. 73 | // RESP - Not useful in most SDx systems. 
74 | // 75 | // AXI4 master interface m00_axi 76 | output wire m00_axi_awvalid , 77 | input wire m00_axi_awready , 78 | output wire [C_M00_AXI_ADDR_WIDTH-1:0] m00_axi_awaddr , 79 | output wire [8-1:0] m00_axi_awlen , 80 | output wire m00_axi_wvalid , 81 | input wire m00_axi_wready , 82 | output wire [C_M00_AXI_DATA_WIDTH-1:0] m00_axi_wdata , 83 | output wire [C_M00_AXI_DATA_WIDTH/8-1:0] m00_axi_wstrb , 84 | output wire m00_axi_wlast , 85 | input wire m00_axi_bvalid , 86 | output wire m00_axi_bready , 87 | output wire m00_axi_arvalid , 88 | input wire m00_axi_arready , 89 | output wire [C_M00_AXI_ADDR_WIDTH-1:0] m00_axi_araddr , 90 | output wire [8-1:0] m00_axi_arlen , 91 | input wire m00_axi_rvalid , 92 | output wire m00_axi_rready , 93 | input wire [C_M00_AXI_DATA_WIDTH-1:0] m00_axi_rdata , 94 | input wire m00_axi_rlast , 95 | // AXI4-Lite slave interface 96 | input wire s_axi_control_awvalid, 97 | output wire s_axi_control_awready, 98 | input wire [C_S_AXI_CONTROL_ADDR_WIDTH-1:0] s_axi_control_awaddr , 99 | input wire s_axi_control_wvalid , 100 | output wire s_axi_control_wready , 101 | input wire [C_S_AXI_CONTROL_DATA_WIDTH-1:0] s_axi_control_wdata , 102 | input wire [C_S_AXI_CONTROL_DATA_WIDTH/8-1:0] s_axi_control_wstrb , 103 | input wire s_axi_control_arvalid, 104 | output wire s_axi_control_arready, 105 | input wire [C_S_AXI_CONTROL_ADDR_WIDTH-1:0] s_axi_control_araddr , 106 | output wire s_axi_control_rvalid , 107 | input wire s_axi_control_rready , 108 | output wire [C_S_AXI_CONTROL_DATA_WIDTH-1:0] s_axi_control_rdata , 109 | output wire [2-1:0] s_axi_control_rresp , 110 | output wire s_axi_control_bvalid , 111 | input wire s_axi_control_bready , 112 | output wire [2-1:0] s_axi_control_bresp , 113 | output wire interrupt 114 | ); 115 | 116 | /////////////////////////////////////////////////////////////////////////////// 117 | // Local Parameters 118 | /////////////////////////////////////////////////////////////////////////////// 119 | 120 | 
/////////////////////////////////////////////////////////////////////////////// 121 | // Wires and Variables 122 | /////////////////////////////////////////////////////////////////////////////// 123 | (* DONT_TOUCH = "yes" *) 124 | reg areset = 1'b0; 125 | wire ap_start ; 126 | wire ap_idle ; 127 | wire ap_done ; 128 | wire [32-1:0] input0 ; 129 | wire [64-1:0] input_mem ; 130 | wire [64-1:0] output_mem ; 131 | wire [64-1:0] intermediates_mem ; 132 | 133 | // Register and invert reset signal. 134 | always @(posedge ap_clk) begin 135 | areset <= ~ap_rst_n; 136 | end 137 | 138 | /////////////////////////////////////////////////////////////////////////////// 139 | // Begin control interface RTL. Modifying not recommended. 140 | /////////////////////////////////////////////////////////////////////////////// 141 | 142 | 143 | // AXI4-Lite slave interface 144 | vdf_control_s_axi #( 145 | .C_ADDR_WIDTH ( C_S_AXI_CONTROL_ADDR_WIDTH ), 146 | .C_DATA_WIDTH ( C_S_AXI_CONTROL_DATA_WIDTH ) 147 | ) 148 | inst_control_s_axi ( 149 | .aclk ( ap_clk ), 150 | .areset ( areset ), 151 | .aclk_en ( 1'b1 ), 152 | .awvalid ( s_axi_control_awvalid ), 153 | .awready ( s_axi_control_awready ), 154 | .awaddr ( s_axi_control_awaddr ), 155 | .wvalid ( s_axi_control_wvalid ), 156 | .wready ( s_axi_control_wready ), 157 | .wdata ( s_axi_control_wdata ), 158 | .wstrb ( s_axi_control_wstrb ), 159 | .arvalid ( s_axi_control_arvalid ), 160 | .arready ( s_axi_control_arready ), 161 | .araddr ( s_axi_control_araddr ), 162 | .rvalid ( s_axi_control_rvalid ), 163 | .rready ( s_axi_control_rready ), 164 | .rdata ( s_axi_control_rdata ), 165 | .rresp ( s_axi_control_rresp ), 166 | .bvalid ( s_axi_control_bvalid ), 167 | .bready ( s_axi_control_bready ), 168 | .bresp ( s_axi_control_bresp ), 169 | .interrupt ( interrupt ), 170 | .ap_start ( ap_start ), 171 | .ap_done ( ap_done ), 172 | .ap_idle ( ap_idle ), 173 | .input0 ( input0 ), 174 | .input_mem ( input_mem ), 175 | .output_mem ( output_mem ), 176 | 
.intermediates_mem ( intermediates_mem ) 177 | ); 178 | 179 | /////////////////////////////////////////////////////////////////////////////// 180 | // Add kernel logic here. Modify/remove example code as necessary. 181 | /////////////////////////////////////////////////////////////////////////////// 182 | 183 | // Example RTL block. Remove to insert custom logic. 184 | vdf_wrapper #( 185 | .C_M00_AXI_ADDR_WIDTH ( C_M00_AXI_ADDR_WIDTH ), 186 | .C_M00_AXI_DATA_WIDTH ( C_M00_AXI_DATA_WIDTH ) 187 | ) 188 | inst_wrapper ( 189 | .ap_clk ( ap_clk ), 190 | .ap_rst_n ( ap_rst_n ), 191 | .m00_axi_awvalid ( m00_axi_awvalid ), 192 | .m00_axi_awready ( m00_axi_awready ), 193 | .m00_axi_awaddr ( m00_axi_awaddr ), 194 | .m00_axi_awlen ( m00_axi_awlen ), 195 | .m00_axi_wvalid ( m00_axi_wvalid ), 196 | .m00_axi_wready ( m00_axi_wready ), 197 | .m00_axi_wdata ( m00_axi_wdata ), 198 | .m00_axi_wstrb ( m00_axi_wstrb ), 199 | .m00_axi_wlast ( m00_axi_wlast ), 200 | .m00_axi_bvalid ( m00_axi_bvalid ), 201 | .m00_axi_bready ( m00_axi_bready ), 202 | .m00_axi_arvalid ( m00_axi_arvalid ), 203 | .m00_axi_arready ( m00_axi_arready ), 204 | .m00_axi_araddr ( m00_axi_araddr ), 205 | .m00_axi_arlen ( m00_axi_arlen ), 206 | .m00_axi_rvalid ( m00_axi_rvalid ), 207 | .m00_axi_rready ( m00_axi_rready ), 208 | .m00_axi_rdata ( m00_axi_rdata ), 209 | .m00_axi_rlast ( m00_axi_rlast ), 210 | .ap_start ( ap_start ), 211 | .ap_done ( ap_done ), 212 | .ap_idle ( ap_idle ), 213 | .input0 ( input0 ), 214 | .input_mem ( input_mem ), 215 | .output_mem ( output_mem ), 216 | .intermediates_mem ( intermediates_mem ) 217 | ); 218 | 219 | endmodule 220 | `default_nettype wire 221 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | Apache License 3 | Version 2.0, January 2004 4 | http://www.apache.org/licenses/ 5 | 6 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 7 | 8 | 
1. Definitions. 9 | 10 | "License" shall mean the terms and conditions for use, reproduction, 11 | and distribution as defined by Sections 1 through 9 of this document. 12 | 13 | "Licensor" shall mean the copyright owner or entity authorized by 14 | the copyright owner that is granting the License. 15 | 16 | "Legal Entity" shall mean the union of the acting entity and all 17 | other entities that control, are controlled by, or are under common 18 | control with that entity. For the purposes of this definition, 19 | "control" means (i) the power, direct or indirect, to cause the 20 | direction or management of such entity, whether by contract or 21 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 22 | outstanding shares, or (iii) beneficial ownership of such entity. 23 | 24 | "You" (or "Your") shall mean an individual or Legal Entity 25 | exercising permissions granted by this License. 26 | 27 | "Source" form shall mean the preferred form for making modifications, 28 | including but not limited to software source code, documentation 29 | source, and configuration files. 30 | 31 | "Object" form shall mean any form resulting from mechanical 32 | transformation or translation of a Source form, including but 33 | not limited to compiled object code, generated documentation, 34 | and conversions to other media types. 35 | 36 | "Work" shall mean the work of authorship, whether in Source or 37 | Object form, made available under the License, as indicated by a 38 | copyright notice that is included in or attached to the work 39 | (an example is provided in the Appendix below). 40 | 41 | "Derivative Works" shall mean any work, whether in Source or Object 42 | form, that is based on (or derived from) the Work and for which the 43 | editorial revisions, annotations, elaborations, or other modifications 44 | represent, as a whole, an original work of authorship. 
For the purposes 45 | of this License, Derivative Works shall not include works that remain 46 | separable from, or merely link (or bind by name) to the interfaces of, 47 | the Work and Derivative Works thereof. 48 | 49 | "Contribution" shall mean any work of authorship, including 50 | the original version of the Work and any modifications or additions 51 | to that Work or Derivative Works thereof, that is intentionally 52 | submitted to Licensor for inclusion in the Work by the copyright owner 53 | or by an individual or Legal Entity authorized to submit on behalf of 54 | the copyright owner. For the purposes of this definition, "submitted" 55 | means any form of electronic, verbal, or written communication sent 56 | to the Licensor or its representatives, including but not limited to 57 | communication on electronic mailing lists, source code control systems, 58 | and issue tracking systems that are managed by, or on behalf of, the 59 | Licensor for the purpose of discussing and improving the Work, but 60 | excluding communication that is conspicuously marked or otherwise 61 | designated in writing by the copyright owner as "Not a Contribution." 62 | 63 | "Contributor" shall mean Licensor and any individual or Legal Entity 64 | on behalf of whom a Contribution has been received by Licensor and 65 | subsequently incorporated within the Work. 66 | 67 | 2. Grant of Copyright License. Subject to the terms and conditions of 68 | this License, each Contributor hereby grants to You a perpetual, 69 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 70 | copyright license to reproduce, prepare Derivative Works of, 71 | publicly display, publicly perform, sublicense, and distribute the 72 | Work and such Derivative Works in Source or Object form. 73 | 74 | 3. Grant of Patent License. 
Subject to the terms and conditions of 75 | this License, each Contributor hereby grants to You a perpetual, 76 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 77 | (except as stated in this section) patent license to make, have made, 78 | use, offer to sell, sell, import, and otherwise transfer the Work, 79 | where such license applies only to those patent claims licensable 80 | by such Contributor that are necessarily infringed by their 81 | Contribution(s) alone or by combination of their Contribution(s) 82 | with the Work to which such Contribution(s) was submitted. If You 83 | institute patent litigation against any entity (including a 84 | cross-claim or counterclaim in a lawsuit) alleging that the Work 85 | or a Contribution incorporated within the Work constitutes direct 86 | or contributory patent infringement, then any patent licenses 87 | granted to You under this License for that Work shall terminate 88 | as of the date such litigation is filed. 89 | 90 | 4. Redistribution. 
You may reproduce and distribute copies of the 91 | Work or Derivative Works thereof in any medium, with or without 92 | modifications, and in Source or Object form, provided that You 93 | meet the following conditions: 94 | 95 | (a) You must give any other recipients of the Work or 96 | Derivative Works a copy of this License; and 97 | 98 | (b) You must cause any modified files to carry prominent notices 99 | stating that You changed the files; and 100 | 101 | (c) You must retain, in the Source form of any Derivative Works 102 | that You distribute, all copyright, patent, trademark, and 103 | attribution notices from the Source form of the Work, 104 | excluding those notices that do not pertain to any part of 105 | the Derivative Works; and 106 | 107 | (d) If the Work includes a "NOTICE" text file as part of its 108 | distribution, then any Derivative Works that You distribute must 109 | include a readable copy of the attribution notices contained 110 | within such NOTICE file, excluding those notices that do not 111 | pertain to any part of the Derivative Works, in at least one 112 | of the following places: within a NOTICE text file distributed 113 | as part of the Derivative Works; within the Source form or 114 | documentation, if provided along with the Derivative Works; or, 115 | within a display generated by the Derivative Works, if and 116 | wherever such third-party notices normally appear. The contents 117 | of the NOTICE file are for informational purposes only and 118 | do not modify the License. You may add Your own attribution 119 | notices within Derivative Works that You distribute, alongside 120 | or as an addendum to the NOTICE text from the Work, provided 121 | that such additional attribution notices cannot be construed 122 | as modifying the License. 
123 | 124 | You may add Your own copyright statement to Your modifications and 125 | may provide additional or different license terms and conditions 126 | for use, reproduction, or distribution of Your modifications, or 127 | for any such Derivative Works as a whole, provided Your use, 128 | reproduction, and distribution of the Work otherwise complies with 129 | the conditions stated in this License. 130 | 131 | 5. Submission of Contributions. Unless You explicitly state otherwise, 132 | any Contribution intentionally submitted for inclusion in the Work 133 | by You to the Licensor shall be under the terms and conditions of 134 | this License, without any additional terms or conditions. 135 | Notwithstanding the above, nothing herein shall supersede or modify 136 | the terms of any separate license agreement you may have executed 137 | with Licensor regarding such Contributions. 138 | 139 | 6. Trademarks. This License does not grant permission to use the trade 140 | names, trademarks, service marks, or product names of the Licensor, 141 | except as required for reasonable and customary use in describing the 142 | origin of the Work and reproducing the content of the NOTICE file. 143 | 144 | 7. Disclaimer of Warranty. Unless required by applicable law or 145 | agreed to in writing, Licensor provides the Work (and each 146 | Contributor provides its Contributions) on an "AS IS" BASIS, 147 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 148 | implied, including, without limitation, any warranties or conditions 149 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 150 | PARTICULAR PURPOSE. You are solely responsible for determining the 151 | appropriateness of using or redistributing the Work and assume any 152 | risks associated with Your exercise of permissions under this License. 153 | 154 | 8. Limitation of Liability. 
In no event and under no legal theory, 155 | whether in tort (including negligence), contract, or otherwise, 156 | unless required by applicable law (such as deliberate and grossly 157 | negligent acts) or agreed to in writing, shall any Contributor be 158 | liable to You for damages, including any direct, indirect, special, 159 | incidental, or consequential damages of any character arising as a 160 | result of this License or out of the use or inability to use the 161 | Work (including but not limited to damages for loss of goodwill, 162 | work stoppage, computer failure or malfunction, or any and all 163 | other commercial damages or losses), even if such Contributor 164 | has been advised of the possibility of such damages. 165 | 166 | 9. Accepting Warranty or Additional Liability. While redistributing 167 | the Work or Derivative Works thereof, You may choose to offer, 168 | and charge a fee for, acceptance of support, warranty, indemnity, 169 | or other liability obligations and/or rights consistent with this 170 | License. However, in accepting such obligations, You may act only 171 | on Your own behalf and on Your sole responsibility, not on behalf 172 | of any other Contributor, and only if You agree to indemnify, 173 | defend, and hold each Contributor harmless for any liability 174 | incurred by, or claims asserted against, such Contributor by reason 175 | of your accepting any such warranty or additional liability. 176 | 177 | ------------------------------------------------------------------------------ 178 | ------------------------------------------------------------------------------ 179 | 180 | The Cryptophage project contains open source components with separate copyright 181 | notices and license terms. Your use of the source code for these 182 | components is subject to the terms and conditions of the following 183 | licenses. 
184 | 185 | ------------------------------------------------------------------------------ 186 | ------------------------------------------------------------------------------ 187 | 188 | ./tcl 189 | ./rtl/sdaccel 190 | 191 | Copyright (c) 2018, Xilinx, Inc. 192 | All rights reserved. 193 | 194 | Redistribution and use in source and binary forms, with or without modification, 195 | are permitted provided that the following conditions are met: 196 | 197 | 1. Redistributions of source code must retain the above copyright notice, 198 | this list of conditions and the following disclaimer. 199 | 200 | 2. Redistributions in binary form must reproduce the above copyright notice, 201 | this list of conditions and the following disclaimer in the documentation 202 | and/or other materials provided with the distribution. 203 | 204 | 3. Neither the name of the copyright holder nor the names of its contributors 205 | may be used to endorse or promote products derived from this software 206 | without specific prior written permission. 207 | 208 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 209 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 210 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 211 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 212 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 213 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 214 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 215 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 216 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 217 | 218 | 219 | 220 | --------------------------------------------------------------------------------