├── Agenda
└── Agenda_10815
├── LICENSE
├── README.md
├── gen
├── Makefile
├── README
├── afu_csr.h
├── afu_csr.spec
├── afu_csr.sv
└── afu_csr.vh
├── rtl
├── cacheline_buffer.sv
├── conv_forward_layer.sv
├── conv_forward_layer_tb.sv
├── inner_product_backward.sv
├── inner_product_backward_tb.sv
├── inner_product_forward.sv
├── inner_product_forward_tb.sv
├── loss_layer_tb.sv
├── loss_opt.sv
├── pooling_backward_layer_tb.sv
├── pooling_backward_opt.sv
├── qa_conv.sv
├── qip
│ ├── float_add.bsf
│ ├── float_add.cmp
│ ├── float_add.inc
│ ├── float_add.qip
│ ├── float_add.v
│ ├── float_add_bb.v
│ ├── float_add_inst.v
│ ├── float_add_syn.v
│ ├── float_mult.bsf
│ ├── float_mult.cmp
│ ├── float_mult.inc
│ ├── float_mult.qip
│ ├── float_mult.v
│ ├── float_mult_bb.v
│ ├── float_mult_inst.v
│ ├── float_mult_syn.v
│ ├── iplauncher_debug.log
│ ├── ram_2p.qip
│ ├── ram_2p.v
│ └── ram_2p_bb.v
├── relu_backward_layer.sv
├── relu_backward_layer.sv.bak
├── relu_backward_layer_tb.sv
├── relu_backward_layer_tb.sv.bak
├── relu_backward_opt.sv
├── relu_backward_opt_tb.sv
├── relu_forward.sv
└── relu_forward_tb.sv
├── test
├── conv_forward_tests_header.py
├── pooling_backward_tests_header.py
├── pooling_forward_tests_header.py
├── relu_backward_tests_header.py
├── relu_forward_tests_header.py
├── softmax_with_loss_tests_header.py
└── test_data
│ ├── conv_forward_test_data.vh
│ ├── inner_product_backward_test_data.vh
│ ├── ip_backward_test_data.vh
│ ├── ip_forward_test_data.vh
│ ├── pooling_backward_test_data.vh
│ ├── pooling_forward_test_data.vh
│ ├── relu_backward_test_data.vh
│ ├── relu_forward_test_data.vh
│ └── softmax_with_loss_test_data.vh
└── tools
├── caffe_install_deps.sh
└── nvidia_smi_command.sh
/Agenda/Agenda_10815:
--------------------------------------------------------------------------------
1 | Agenda for team meeting (Oct 8, 2015)
2 |
3 | Meta Data Management System
4 | GitHub Repo: Use for SCV
5 | - To be setup
6 | git config --global user.name "Your Name"
7 | git config --global user.email you@example.com
8 |
9 | Use Gist for code snippets
10 |
11 | JIRA: Use for workflow control
12 | - Can be linked with GitHub
13 | - Brian will host
14 | - To be config
15 | - Ultra-DNS service
16 |
17 | Confluence/GitWiki: Use for Wiki [Doc]
18 | - Which one to use?
19 | - Use this as a doc for our proj
20 |
21 | Slack
22 | - Done
23 | - Thank Brian
24 |
25 | GoogleSite Blog
26 | Make sure you can edit it
27 | User-friendly Proposal
28 | - To be re-written
29 |
30 | Engineering Requirements
31 | How are we doing it?
32 | - The Stakeholder has no list
33 | - Translate Stk paragraph to bullet?
34 | - Use it as a baseline?
35 | - We each should indep come up w/ 10 req
36 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Brian Hill
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Synopsis
2 |
3 | The objective of this project is to showcase the increased performance and efficiency of custom computing over a fixed processor architecture by benchmarking an FPGA implementation of deep learning kernels against a GPU implementation. Code created during this project will be given to the open source community to increase awareness of configurable hardware and demonstrate the utility of FPGAs for deep learning. In doing so, students will become more familiar with programming for an FPGA as well as applying deep learning in a real-world application: diabetic retinopathy. The project will also showcase the power of deep CNNs for computer vision by creating a binary classification system for breast invasive carcinoma tissue images.
4 |
5 | ## Code Example
6 |
7 | Show what the library does as concisely as possible, developers should be able to figure out **how** your project solves their problem by looking at the code example. Make sure the API you are showing off is obvious, and that your code is short and concise.
8 |
9 | ## Motivation
10 |
11 | We propose to create a "scalable," energy-efficient method of deep machine learning that helps make medical diagnosis more accurate and less time-consuming. We will apply this method to classify severity in diabetic retinopathy images; in the field of cancer, we will apply it to breast invasive carcinoma to classify the tumor grade or the absence of a tumor.
12 |
13 | The implementation will start from the most widely used open source deep learning package and benchmark it across multiple implementations in an energy efficiency study. Three of the most popular deep learning packages are Theano, Torch, and Caffe. We will select the one most widely used by the deep learning community, rewrite its kernels for a field-programmable gate array (FPGA), and compare the result with other implementations (GPU). All of the code will be shared with the open source community.
14 |
15 |
16 |
17 |
18 | ## Installation
19 |
20 | Provide code examples and explanations of how to get the project.
21 |
22 | ## API Reference
23 |
24 | Depending on the size of the project, if it is small and simple enough the reference docs can be added to the README. For medium size to larger projects it is important to at least provide a link to where the API reference docs live.
25 |
26 | ## Tests
27 |
28 | Describe and show how to run the tests with code examples.
29 |
30 | ## Contributors
31 |
32 | Brian Hill
33 | Sophia Zhang
34 |
35 | ## License
36 |
37 | MIT License. See the LICENSE file for the full text.
38 |
--------------------------------------------------------------------------------
/gen/Makefile:
--------------------------------------------------------------------------------
1 | GEN_TOOL = /opt/python-2.7/bin/python2.7 ../../tools/afu_csr_gen.py
2 |
3 | afu_csr.sv: afu_csr.spec
4 | $(GEN_TOOL)
5 |
6 | clean:
7 | rm -f *.h *.vh *.sv
8 |
--------------------------------------------------------------------------------
/gen/README:
--------------------------------------------------------------------------------
1 | * Automatically generate RTL to implement and C code to access AFU control registers.
2 | * Each register is mapped to a unique 32-bit address
3 | * 64-bit registers are supported and occupy two consecutive 32-bit addresses (low word, then high word)
4 | * Each register can have a unique reset wire
5 |
6 |
7 | 1. Create a spec file in this format:
8 |
9 | # comment
10 | register name, number of bits [, reset signal name]
11 |
12 |
13 | 2. Generate code from afu_csr.spec file:
14 |
15 | make
16 |
--------------------------------------------------------------------------------
/gen/afu_csr.h:
--------------------------------------------------------------------------------
1 | // Code generated by afu_csr_gen
2 |
3 | #define CSR_AFU_DSM_BASE 0x8a00
4 | #define CSR_AFU_CNTXT_BASE 0x8a08
5 | #define CSR_AFU_EN 0x8a10
6 | #define CSR_DOORBELL 0x8a14
7 | #define CSR_READ_BUFFER_LINES 0x8a18
8 | #define CSR_READ_BUFFER_BASE 0x8a1c
9 | #define CSR_WRITE_BUFFER_BASE 0x8a24
10 | #define CSR_UPDATE_DSM 0x8a2c
11 | #define CSR_PLL_RESET 0x8a30
12 | #define CSR_LOAD_WEIGHTS 0x8a34
13 | #define CSR_NUM_CL_PER_FILTER 0x8a38
14 | #define CSR_NUM_FILTERS 0x8a3c
15 | #define CSR_MAX_WEIGHT_BUFFER_ADDR 0x8a40
16 | #define CSR_LOAD_IMAGES 0x8a44
17 | #define CSR_WRITE_FENCE 0x8a48
18 |
--------------------------------------------------------------------------------
/gen/afu_csr.spec:
--------------------------------------------------------------------------------
1 | # afu csrs
2 | # register name, bits [, reset]
3 |
4 | afu_en, 1
5 | doorbell, 32, reset_doorbell
6 |
7 | read_buffer_lines, 32
8 | read_buffer_base, 64
9 | write_buffer_base, 64
10 |
11 | update_dsm, 32, reset_update_dsm
12 |
13 | pll_reset, 1
14 | load_weights, 1
15 | num_cl_per_filter, 8
16 | num_filters, 16
17 | max_weight_buffer_addr, 16
18 | load_images, 1
19 |
20 | write_fence, 1, DEFAULT
21 |
--------------------------------------------------------------------------------
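The relationship between the entries in afu_csr.spec and the offsets in the generated afu_csr.h can be reproduced with a short sketch. This is not the real afu_csr_gen.py, which is not part of this listing. The function name `assign_addresses` and the starting offset `0x8a10` are assumptions inferred from the generated header, where the two built-in 64-bit base registers occupy 0x8a00 through 0x8a0c.

```python
# Hypothetical sketch of how spec entries could map to CSR byte offsets.
# Not the real afu_csr_gen.py; FIRST_USER_ADDR is inferred from afu_csr.h.
FIRST_USER_ADDR = 0x8A10

SPEC = """
# afu csrs
afu_en, 1
doorbell, 32, reset_doorbell
read_buffer_lines, 32
read_buffer_base, 64
write_buffer_base, 64
update_dsm, 32, reset_update_dsm
pll_reset, 1
load_weights, 1
num_cl_per_filter, 8
num_filters, 16
max_weight_buffer_addr, 16
load_images, 1
write_fence, 1, DEFAULT
"""


def assign_addresses(spec_text, base=FIRST_USER_ADDR):
    """Return {register_name: byte_offset}.

    Each register takes one 32-bit slot, except registers wider than
    32 bits, which take two consecutive slots.
    """
    addresses = {}
    addr = base
    for line in spec_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        name, bits = fields[0], int(fields[1])
        addresses[name] = addr
        addr += 8 if bits > 32 else 4
    return addresses


if __name__ == "__main__":
    for name, addr in assign_addresses(SPEC).items():
        print("#define CSR_%-24s 0x%04x" % (name.upper(), addr))
```

Under these assumptions the sketch yields the same offsets as the `#define` lines in afu_csr.h, including the 8-byte stride after each 64-bit buffer base register.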
/gen/afu_csr.sv:
--------------------------------------------------------------------------------
1 | // Code generated by afu_csr_gen
2 |
3 |
4 | `include "spl.vh"
5 | `include "afu.vh"
6 | `include "afu_csr.vh"
7 |
8 | module afu_csr
9 | (
10 | input logic clk,
11 | input logic resetb,
12 | spl_bus_t spl_bus,
13 | afu_bus_t afu_bus
14 | );
15 |
16 | always_ff @(posedge clk) begin
17 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_DSM_BASEL) begin
18 | afu_bus.csr.afu_dsm_base[31:0] <= spl_bus.rw_rsp.data[31:0];
19 | end
20 | end
21 |
22 | always_ff @(posedge clk) begin
23 | if (~resetb) begin
24 | afu_bus.csr.afu_dsm_base_valid <= 0;
25 | end else if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_DSM_BASEL) begin
26 | afu_bus.csr.afu_dsm_base_valid <= 1;
27 | end
28 | end
29 |
30 | always_ff @(posedge clk) begin
31 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_DSM_BASEH) begin
32 | afu_bus.csr.afu_dsm_base[63:32] <= spl_bus.rw_rsp.data[31:0];
33 | end
34 | end
35 |
36 | always_ff @(posedge clk) begin
37 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_CNTXT_BASEL) begin
38 | afu_bus.csr.afu_cntxt_base[31:0] <= spl_bus.rw_rsp.data[31:0];
39 | end
40 | end
41 |
42 | always_ff @(posedge clk) begin
43 | if (~resetb) begin
44 | afu_bus.csr.afu_cntxt_base_valid <= 0;
45 | end else if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_CNTXT_BASEL) begin
46 | afu_bus.csr.afu_cntxt_base_valid <= 1;
47 | end
48 | end
49 |
50 | always_ff @(posedge clk) begin
51 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_CNTXT_BASEH) begin
52 | afu_bus.csr.afu_cntxt_base[63:32] <= spl_bus.rw_rsp.data[31:0];
53 | end
54 | end
55 |
56 | always_ff @(posedge clk) begin
57 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_AFU_EN) begin
58 | afu_bus.csr.afu_en <= spl_bus.rw_rsp.data[0];
59 | end
60 | end
61 |
62 | always_ff @(posedge clk) begin
63 | if (afu_bus.csr.reset_doorbell) begin
64 | afu_bus.csr.doorbell <= 0;
65 | end else if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_DOORBELL) begin
66 | afu_bus.csr.doorbell <= spl_bus.rw_rsp.data[31:0];
67 | end
68 | end
69 |
70 | always_ff @(posedge clk) begin
71 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_READ_BUFFER_LINES) begin
72 | afu_bus.csr.read_buffer_lines <= spl_bus.rw_rsp.data[31:0];
73 | end
74 | end
75 |
76 | always_ff @(posedge clk) begin
77 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_READ_BUFFER_BASEL) begin
78 | afu_bus.csr.read_buffer_base[31:0] <= spl_bus.rw_rsp.data[31:0];
79 | end
80 | end
81 |
82 | always_ff @(posedge clk) begin
83 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_READ_BUFFER_BASEH) begin
84 | afu_bus.csr.read_buffer_base[63:32] <= spl_bus.rw_rsp.data[31:0];
85 | end
86 | end
87 |
88 | always_ff @(posedge clk) begin
89 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_WRITE_BUFFER_BASEL) begin
90 | afu_bus.csr.write_buffer_base[31:0] <= spl_bus.rw_rsp.data[31:0];
91 | end
92 | end
93 |
94 | always_ff @(posedge clk) begin
95 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_WRITE_BUFFER_BASEH) begin
96 | afu_bus.csr.write_buffer_base[63:32] <= spl_bus.rw_rsp.data[31:0];
97 | end
98 | end
99 |
100 | always_ff @(posedge clk) begin
101 | if (afu_bus.csr.reset_update_dsm) begin
102 | afu_bus.csr.update_dsm <= 0;
103 | end else if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_UPDATE_DSM) begin
104 | afu_bus.csr.update_dsm <= spl_bus.rw_rsp.data[31:0];
105 | end
106 | end
107 |
108 | always_ff @(posedge clk) begin
109 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_PLL_RESET) begin
110 | afu_bus.csr.pll_reset <= spl_bus.rw_rsp.data[0];
111 | end
112 | end
113 |
114 | always_ff @(posedge clk) begin
115 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_LOAD_WEIGHTS) begin
116 | afu_bus.csr.load_weights <= spl_bus.rw_rsp.data[0];
117 | end
118 | end
119 |
120 | always_ff @(posedge clk) begin
121 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_NUM_CL_PER_FILTER) begin
122 | afu_bus.csr.num_cl_per_filter <= spl_bus.rw_rsp.data[7:0];
123 | end
124 | end
125 |
126 | always_ff @(posedge clk) begin
127 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_NUM_FILTERS) begin
128 | afu_bus.csr.num_filters <= spl_bus.rw_rsp.data[15:0];
129 | end
130 | end
131 |
132 | always_ff @(posedge clk) begin
133 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_MAX_WEIGHT_BUFFER_ADDR) begin
134 | afu_bus.csr.max_weight_buffer_addr <= spl_bus.rw_rsp.data[15:0];
135 | end
136 | end
137 |
138 | always_ff @(posedge clk) begin
139 | if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_LOAD_IMAGES) begin
140 | afu_bus.csr.load_images <= spl_bus.rw_rsp.data[0];
141 | end
142 | end
143 |
144 | always_ff @(posedge clk) begin
145 | if (~resetb) begin
146 | afu_bus.csr.write_fence <= 0;
147 | end else if (spl_bus.rw_rsp.cfg_valid && {spl_bus.rw_rsp.header[13:0], 2'b0} == ADDR_WRITE_FENCE) begin
148 | afu_bus.csr.write_fence <= spl_bus.rw_rsp.data[0];
149 | end
150 | end
151 |
152 | endmodule
--------------------------------------------------------------------------------
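Every block in afu_csr.sv compares `{spl_bus.rw_rsp.header[13:0], 2'b0}` against an address constant, and 64-bit registers are filled by two separate 32-bit writes. The following software model sketches that behavior for one register. The names `decode_csr_address` and `CsrModel` are hypothetical and exist only for illustration; the address constants are copied from afu_csr.vh.

```python
# Illustrative software model of the CSR write decode (not generated code).
ADDR_READ_BUFFER_BASEL = 0x8A1C
ADDR_READ_BUFFER_BASEH = 0x8A20


def decode_csr_address(header):
    """Model {header[13:0], 2'b0}: a 14-bit word index becomes a byte address."""
    return (header & 0x3FFF) << 2


class CsrModel:
    """Holds one 64-bit register that is written as two 32-bit halves."""

    def __init__(self):
        self.read_buffer_base = 0

    def config_write(self, header, data):
        addr = decode_csr_address(header)
        data &= 0xFFFFFFFF
        if addr == ADDR_READ_BUFFER_BASEL:
            # Replace bits [31:0], keep bits [63:32].
            upper = self.read_buffer_base & 0xFFFFFFFF00000000
            self.read_buffer_base = upper | data
        elif addr == ADDR_READ_BUFFER_BASEH:
            # Replace bits [63:32], keep bits [31:0].
            lower = self.read_buffer_base & 0x00000000FFFFFFFF
            self.read_buffer_base = (data << 32) | lower


if __name__ == "__main__":
    csr = CsrModel()
    csr.config_write(ADDR_READ_BUFFER_BASEL >> 2, 0xDEADBEEF)
    csr.config_write(ADDR_READ_BUFFER_BASEH >> 2, 0x00000001)
    print(hex(csr.read_buffer_base))
```

Because the two halves land on different cycles, software would need to write both words before the hardware consumes the full 64-bit value.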
/gen/afu_csr.vh:
--------------------------------------------------------------------------------
1 | // Code generated by afu_csr_gen
2 |
3 |
4 | `ifndef AFU_CSR_VH
5 | `define AFU_CSR_VH
6 |
7 | localparam ADDR_AFU_DSM_BASEL = 16'h8a00;
8 | localparam ADDR_AFU_DSM_BASEH = 16'h8a04;
9 | localparam ADDR_AFU_CNTXT_BASEL = 16'h8a08;
10 | localparam ADDR_AFU_CNTXT_BASEH = 16'h8a0c;
11 | localparam ADDR_AFU_EN = 16'h8a10;
12 | localparam ADDR_DOORBELL = 16'h8a14;
13 | localparam ADDR_READ_BUFFER_LINES = 16'h8a18;
14 | localparam ADDR_READ_BUFFER_BASEL = 16'h8a1c;
15 | localparam ADDR_READ_BUFFER_BASEH = 16'h8a20;
16 | localparam ADDR_WRITE_BUFFER_BASEL = 16'h8a24;
17 | localparam ADDR_WRITE_BUFFER_BASEH = 16'h8a28;
18 | localparam ADDR_UPDATE_DSM = 16'h8a2c;
19 | localparam ADDR_PLL_RESET = 16'h8a30;
20 | localparam ADDR_LOAD_WEIGHTS = 16'h8a34;
21 | localparam ADDR_NUM_CL_PER_FILTER = 16'h8a38;
22 | localparam ADDR_NUM_FILTERS = 16'h8a3c;
23 | localparam ADDR_MAX_WEIGHT_BUFFER_ADDR = 16'h8a40;
24 | localparam ADDR_LOAD_IMAGES = 16'h8a44;
25 | localparam ADDR_WRITE_FENCE = 16'h8a48;
26 |
27 | typedef struct
28 | {
29 | logic afu_dsm_base_valid;
30 | logic [63:0] afu_dsm_base;
31 | logic afu_cntxt_base_valid;
32 | logic [63:0] afu_cntxt_base;
33 | logic afu_en;
34 | logic [31:0] doorbell;
35 | logic [31:0] read_buffer_lines;
36 | logic [63:0] read_buffer_base;
37 | logic [63:0] write_buffer_base;
38 | logic [31:0] update_dsm;
39 | logic pll_reset;
40 | logic load_weights;
41 | logic [7:0] num_cl_per_filter;
42 | logic [15:0] num_filters;
43 | logic [15:0] max_weight_buffer_addr;
44 | logic load_images;
45 | logic write_fence;
46 | logic reset_doorbell;
47 | logic reset_update_dsm;
48 | } afu_csr_t;
49 |
50 | `endif
--------------------------------------------------------------------------------
/rtl/cacheline_buffer.sv:
--------------------------------------------------------------------------------
1 | module cacheline_buffer(
2 | input logic wr_clk,
3 | input logic wr_en,
4 | input logic [7:0] wr_addr,
5 | input logic [511:0] wr_data,
6 | input logic rd_clk,
7 | input logic [7:0] rd_addr,
8 | output logic [511:0] rd_data
9 | );
10 |
11 | ram_2p ram_2p_low(
12 | .data(wr_data[255:0]),
13 | .rdaddress(rd_addr),
14 | .rdclock(rd_clk),
15 | .wraddress(wr_addr),
16 | .wrclock(wr_clk),
17 | .wren(wr_en),
18 | .q(rd_data[255:0])
19 | );
20 |
21 | ram_2p ram_2p_high(
22 | .data(wr_data[511:256]),
23 | .rdaddress(rd_addr),
24 | .rdclock(rd_clk),
25 | .wraddress(wr_addr),
26 | .wrclock(wr_clk),
27 | .wren(wr_en),
28 | .q(rd_data[511:256])
29 | );
30 |
31 | endmodule
32 |
--------------------------------------------------------------------------------
/rtl/conv_forward_layer.sv:
--------------------------------------------------------------------------------
1 | module conv_forward_layer #(parameter WIDTH = 8)
2 | (
3 | input logic clk,
4 | input logic reset,
5 | input logic [7:0] id,
6 | input logic [31:0] in_data [WIDTH-1:0],
7 | input logic [31:0] weight_vec [WIDTH-1:0],
8 | // input logic [31:0] bias_term,
9 | output logic [31:0] out_data,
10 | output logic [7:0] id_out
11 | );
12 |
13 | logic [31:0] connections [2*WIDTH] ;
14 |
15 | genvar i, j;
16 | generate
17 | //create float_mult blocks to multiply WIDTH number
18 | //of inputs with weight_vec
19 | for (i = 0; i < WIDTH; i++) begin : GEN_MULTS
20 | float_mult float_mult_inst(
21 | .clk_en(!reset),
22 | .clock(clk),
23 | .dataa(in_data[i]),
24 | .datab(weight_vec[i]),
25 | .result(connections[i+WIDTH])
26 | );
27 | end
28 | //sum the products, and reduce to single value
29 | for (i = WIDTH; i > 1; i = i / 2) begin : GEN_SUMS
30 | for (j = i; j > i/2 && j != 1; j--) begin : SUM_MULTS
31 | float_add float_add_inst(
32 | .aclr(reset),
33 | .clock(clk),
34 | .dataa(connections[2*j-1]),
35 | .datab(connections[2*j-2]),
36 | .result(connections[j-1])
37 | );
38 | end
39 | end
40 | endgenerate
41 |
42 | //add bias term to sum to produce final sum
43 | // float_add float_add_bias_term(
44 | // .aclr(reset),
45 | // .clock(clk),
46 | // .dataa(connections[1]),
47 | // .datab(bias_term),
48 | // .result(connections[0])
49 | // );
50 | //write result to output reg + pass id val on
51 | always @(posedge clk) begin
52 | out_data <= connections[1];
53 | id_out <= id;
54 | end
55 |
56 | endmodule
--------------------------------------------------------------------------------
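The `connections` array in conv_forward_layer.sv is indexed like a binary heap: the products sit in entries `WIDTH` to `2*WIDTH-1`, and each adder writes entry `j-1` from entries `2*j-1` and `2*j-2`, so the final sum ends up in entry 1. A minimal software model of that indexing, assuming `WIDTH` is a power of two, is sketched below. The function name `dot_product_tree` is hypothetical, and Python floats are double precision, so results can differ from the single-precision hardware in the last bits.

```python
def dot_product_tree(in_data, weight_vec):
    """Software model of the multiply stage and the adder reduction tree."""
    width = len(in_data)
    # The heap-style indexing only covers every product when WIDTH is 2**k.
    assert width == len(weight_vec)
    assert width >= 2 and width & (width - 1) == 0

    connections = [0.0] * (2 * width)

    # GEN_MULTS: products are stored in the upper half of the array.
    for i in range(width):
        connections[i + width] = in_data[i] * weight_vec[i]

    # GEN_SUMS / SUM_MULTS: entry j-1 sums entries 2j-1 and 2j-2.
    # Iterating j downward guarantees both inputs are already computed.
    for j in range(width, 1, -1):
        connections[j - 1] = connections[2 * j - 1] + connections[2 * j - 2]

    # The module registers connections[1] as out_data.
    return connections[1]


if __name__ == "__main__":
    print(dot_product_tree([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], [1.0] * 8))
```

In hardware all adders of one tree level work in parallel; the sequential loop here only reproduces the data dependencies, not the pipeline timing.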
/rtl/conv_forward_layer_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module conv_forward_layer_tb();
4 | `include "/home/b/FPGA-CNN/test/test_data/conv_forward_test_data.vh"
5 | parameter CYCLE = 5; //clk period: 5ns = 200 MHz
6 | parameter MULT_DELAY = 5; //#clks to complete a mult
7 | parameter ADD_DELAY = 7; //#clks to complete an add
8 | parameter WIDTH = 8; //input vector width
9 |
10 | parameter NUM_TESTS = 5000;
11 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
12 |
13 | reg clk, reset;
14 | logic [31:0] in_vec [WIDTH-1:0]; //input vec to module
15 | logic [31:0] weight_vec [WIDTH-1:0]; //weight vec to module
16 | logic [31:0] bias_term; //bias term to module
17 | logic [31:0] out; //output from module
18 | int i, j, num_errors, num_add_levels, delay;
19 |
20 | //initialize clk
21 | initial begin
22 | clk = 0;
23 | end
24 |
25 | //forever cycle the clk
26 | always begin
27 | #(CYCLE/2.0) clk = ~clk;
28 | end
29 |
30 | //instantiate the module
31 | conv_forward_layer #(.WIDTH(WIDTH))
32 | conv_forward_inst(
33 | .clk(clk),
34 | .reset(reset),
35 | .id(8'b0),
36 | .in_data(in_vec),
37 | .weight_vec(weight_vec),
38 | //.bias_term(bias_term), //bias_term port is commented out in conv_forward_layer.sv
39 | .out_data(out)
40 | );
41 |
42 | initial begin
43 | reset = 0;
44 | num_errors = 0;
45 | num_add_levels = 1;
46 | //calculate log2(WIDTH)
47 | while ((1 << num_add_levels) < WIDTH) begin
48 | num_add_levels++;
49 | end
50 | //calculate total delay of one calculation
51 | //1 mult delay, log2(WIDTH) add delays to sum products, 1 add delay for bias term
52 | delay = CYCLE*(MULT_DELAY + ADD_DELAY*(num_add_levels + 1) + 1);
53 |
54 | $display("num add levels: %d", num_add_levels);
55 | //for all test cases
56 | for (i = 0; i < MEM_SIZE; i = i + WIDTH) begin
57 | //copy each value to input vector
58 | for (j = 0; j < WIDTH; j++) begin
59 | in_vec[j] = test_input[i+j];
60 | end
61 | //copy each value to weight vector
62 | for (j = 0; j < WIDTH; j++) begin
63 | weight_vec[j] = test_weights[i+j];
64 | end
65 | //copy bias term
66 | bias_term = test_bias[i/WIDTH];
67 |
68 | //wait for computation to finish
69 | #(delay)
70 |
71 | //if we were wrong, check for rounding error
72 | if( out != test_output[i/WIDTH] ) begin
73 | //if the number was off because of a rounding error, ignore
74 | if ( out - test_output[i/WIDTH] < 32'h000000ff ||
75 | test_output[i/WIDTH] - out < 32'h000000ff ) begin
76 | //ignore
77 | //otherwise, complain
78 | end else begin
79 | assert( out == test_output[i/WIDTH] );
80 | $display("output: %h\tcalculated: %h", out, test_output[i/WIDTH]);
81 | num_errors++;
82 | end
83 | end
84 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
85 | end
86 | $display("############################################\n");
87 | $display("Testing complete!\n");
88 | $display("%d of %d tests passed\n", NUM_TESTS-num_errors, NUM_TESTS);
89 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
90 | $display("############################################\n");
91 | end
92 |
93 | endmodule
94 |
--------------------------------------------------------------------------------
/rtl/inner_product_backward.sv:
--------------------------------------------------------------------------------
1 | /* Sophia Zhang
2 | * ECE 44x Senior Design
3 | * Block: Inner Product Layer (Backward)
4 | * File Name: inner_product_backward.sv
5 | * Module: Inner Product Layer (Backward)
6 | * Description: The inner product layer (backpropagation) takes in the number of filters
7 | * along with the height and width of the vectors. The bias and weight are used along with
8 | * floating point multiplication, for dot product, to learn the differences.
9 | * The bias_filler is a constant with a default value of zero, while the weight_filler is
10 | * a constant set to zero by default.
11 | */
12 |
13 | module ip_backward#(parameter WIDTH = 8)
14 | (
15 | input logic clk, //clock signal
16 | input logic reset, //reset
17 | input logic [31:0] in_data [WIDTH-1:0], //input data, vector of floats
18 | input logic [31:0] weights [WIDTH-1:0], //weight
19 | input logic [31:0] bias,
20 | input logic [7:0] in_id,
21 | output logic [31:0] out_data, //output data, vector of floats
22 | output logic [7:0] out_id
23 | );
24 |
25 | logic [31:0] connections [2*WIDTH];
26 | genvar i, j;
27 | generate
28 | //create float_mult blocks to multiply WIDTH number of inputs with weights
29 | for (i = 0; i < WIDTH; i++) begin : GEN_MULTS
30 | float_mult floating_mult_inst(
31 | .clk_en(!reset),
32 | .clock(clk),
33 | .dataa(in_data[i]),
34 | .datab(weights[i]),
35 | .result(connections[i + WIDTH])
36 | );
37 | end
38 |
39 | //add the products, and reduce to a single value
40 | for (i = WIDTH; i > 1; i = i / 2) begin : GEN_SUMS
41 | for (j = i; j > i / 2 && j != 1; j--) begin : SUM_MULTS
42 | float_add float_add_inst(
43 | .aclr(reset),
44 | .clock(clk),
45 | .dataa(connections[2*j-1]),
46 | .datab(connections[2*j-2]),
47 | .result(connections[j-1])
48 | );
49 | end
50 | end
51 | endgenerate
52 |
53 | //add bias term to sum to produce final sum
54 | float_add float_add_bias_term(
55 | .aclr(reset),
56 | .clock(clk),
57 | .dataa(connections[1]),
58 | .datab(bias),
59 | .result(connections[0])
60 | );
61 |
62 | always @(posedge clk) begin
63 | out_data <= connections[0];
64 | out_id <= in_id;
65 | end
66 |
67 | endmodule
68 |
--------------------------------------------------------------------------------
/rtl/inner_product_backward_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module inner_product_backward_tb();
4 | //`include "/nfs/stak/students/z/zhangso/ECE441/inner_product_backward/test_data/ip_backward_test_data.vh"
5 | `include "/home/b/FPGA-CNN/test/test_data/inner_product_backward_test_data.vh"
6 |
7 | parameter CYCLE = 5;
8 | parameter MULT_DELAY = 5;
9 | parameter ADD_DELAY = 7;
10 | parameter WIDTH = 8;
11 |
12 | parameter NUM_TESTS = 5000;
13 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
14 |
15 | reg clk, reset;
16 | logic [31:0] in_vec [WIDTH-1:0]; //input vec to module
17 | logic [31:0] weight_vec [WIDTH-1:0]; //weight vec to module
18 | logic [31:0] bias_term;
19 | logic [31:0] out; //output from module
20 | int i, j, num_errors, num_add_levels, delay;
21 |
22 | //initialize clk
23 | initial begin
24 | clk = 0;
25 | end
26 |
27 | //forever cycle the clk
28 | always begin
29 | #(CYCLE/2.0) clk = ~clk;
30 | end
31 |
32 | //instantiate the module
33 | ip_backward #(.WIDTH(WIDTH))
34 | ip_backward_inst(
35 | .clk(clk),
36 | .reset(reset),
37 | .in_data(in_vec),
38 | .weights(weight_vec),
39 | .bias(bias_term),
40 | .in_id(8'b0),
41 | .out_data(out)
42 | );
43 |
44 | initial begin
45 | reset = 0;
46 | num_errors = 0;
47 | num_add_levels = 1;
48 | //calculate log2(WIDTH)
49 | while ((1 << num_add_levels) < WIDTH) begin
50 | num_add_levels++;
51 | end
52 | //calculate total delay of one calculation
53 | //1 mult delay, log2(WIDTH) add delays to sum products, 1 add delay for bias term
54 | delay = CYCLE*(MULT_DELAY + ADD_DELAY*(num_add_levels + 1) + 1);
55 |
56 | $display("num add levels: %d", num_add_levels);
57 | //for all test cases
58 | for (i = 0; i < MEM_SIZE; i = i + WIDTH) begin
59 | //copy each value to input vector
60 | for (j = 0; j < WIDTH; j++) begin
61 | in_vec[j] = test_input[i+j];
62 | end
63 | //copy each value to weight vector
64 | for (j = 0; j < WIDTH; j++) begin
65 | weight_vec[j] = test_weights[i+j];
66 | end
67 | //copy bias term
68 | bias_term = test_bias[i/WIDTH];
69 |
70 | //wait for computation to finish
71 | #(delay)
72 |
73 | $display("output: %h\tcalculated: %h", out, test_output[i/WIDTH]);
74 | if( out != test_output[i/WIDTH] ) begin
75 | //if the number was off because of a rounding error, ignore
76 | if ( out - test_output[i/WIDTH] < 32'h000000ff ||
77 | test_output[i/WIDTH] - out < 32'h000000ff ) begin
78 | //ignore
79 | //otherwise, complain
80 | end else begin
81 | assert( out == test_output[i/WIDTH] );
82 | $display("output: %h\tcalculated: %h", out, test_output[i/WIDTH]);
83 | num_errors++;
84 | end
85 | end
86 | end
87 | $display("############################################\n");
88 | $display("Testing complete!\n");
89 | $display("%d of %d tests passed\n", NUM_TESTS-num_errors, NUM_TESTS);
90 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
91 | $display("############################################\n");
92 | end
93 |
94 | endmodule
95 |
--------------------------------------------------------------------------------
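The rounding-error check in the testbench compares the raw 32-bit patterns of the two floats and accepts the result when either unsigned difference is below `32'h000000ff`. A software sketch of the same comparison is given below. The helper names `float_to_bits` and `within_tolerance` are hypothetical; the 32-bit wraparound is modeled explicitly with a mask because Python integers do not overflow.

```python
import struct

MASK32 = 0xFFFFFFFF
TOLERANCE = 0xFF  # same bound as 32'h000000ff in the testbench


def float_to_bits(value):
    """Return the IEEE 754 single-precision bit pattern of `value`."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def within_tolerance(out_bits, expected_bits, tol=TOLERANCE):
    """Mirror the testbench check on two 32-bit unsigned bit patterns."""
    diff_a = (out_bits - expected_bits) & MASK32  # wraps like 32-bit logic
    diff_b = (expected_bits - out_bits) & MASK32
    return diff_a < tol or diff_b < tol


if __name__ == "__main__":
    one = float_to_bits(1.0)
    print(hex(one))
    print(within_tolerance(one, one + 10))
    print(within_tolerance(one, one + 0x100))
```

For two finite floats of the same sign, the difference of the bit patterns counts the representable values between them, so this bound tolerates errors in roughly the low 8 bits of the mantissa. Values of opposite sign differ in bit 31 and are never accepted.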
/rtl/inner_product_forward.sv:
--------------------------------------------------------------------------------
1 | /*
2 | * ECE 44x Senior Design
3 | * Block: Inner Product Layer (Forward)
4 | * File Name: inner_product_forward.sv
5 | * Module: Inner Product Layer (Forward)
6 | * Description: The inner product layer (forward) is the dot product of the weight and an input vector.
7 | * Both the forward and backward passes can include a bias.
8 | */
9 |
10 | module ip_forward#(parameter WIDTH = 8)
11 | (
12 | input logic clk, //clock signal
13 | input logic reset, //reset
14 | input logic [31:0] in_data [WIDTH-1:0], //input data
15 | input logic [31:0] weights [WIDTH-1:0], //used in dot product
16 | input logic [7:0] in_id,
17 | output logic [31:0] out_data, //output data
18 | output logic [7:0] out_id
19 | );
20 |
21 | logic [31:0] connections [2*WIDTH];
22 | genvar i, j;
23 | generate
24 | //create float_mult blocks to multiply the WIDTH number of inputs by the weights
25 | for (i = 0; i < WIDTH; i++) begin : GEN_MULTS
26 | float_mult floating_mult_inst(
27 | .clk_en(!reset),
28 | .clock(clk),
29 | .dataa(in_data[i]),
30 | .datab(weights[i]),
31 | .result(connections[i+WIDTH])
32 | );
33 | end
34 |
35 | //add the products and reduce to a single value
36 | for (i = WIDTH; i > 1; i = i / 2) begin : GEN_SUMS
37 | for (j = i; j > i / 2 && j != 1; j--) begin : SUM_MULTS
38 | float_add float_add_inst(
39 | .aclr(reset),
40 | .clock(clk),
41 | .dataa(connections[2*j-1]),
42 | .datab(connections[2*j-2]),
43 | .result(connections[j-1])
44 | );
45 | end
46 | end
47 | endgenerate
48 |
49 |
50 | always @(posedge clk) begin
51 | out_data <= connections[1];
52 | out_id <= in_id;
53 | end
54 |
55 | endmodule
56 |
--------------------------------------------------------------------------------
/rtl/inner_product_forward_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module inner_product_forward_tb();
4 | `include "/nfs/stak/students/z/zhangso/ECE441/inner_product_forward/test_data/ip_forward_test_data.vh"
5 | parameter CYCLE = 5;
6 | parameter MULT_DELAY = 5;
7 | parameter ADD_DELAY = 7;
8 | parameter WIDTH = 8;
9 |
10 | parameter NUM_TESTS = 5000;
11 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
12 |
13 | reg clk, reset;
14 | logic [31:0] in_vec [WIDTH-1:0]; //input vec to module
15 | logic [31:0] weight_vec [WIDTH-1:0]; //weight vec to module
16 | //logic [31:0] bias_term; //bias term to module
17 | logic [31:0] out; //output from module
18 | int id, i, j, num_errors, num_add_levels, delay;
19 |
20 | //initialize clk
21 | initial begin
22 | clk = 0;
23 | end
24 |
25 | //forever cycle the clk
26 | always begin
27 | #(CYCLE/2.0) clk = ~clk;
28 | end
29 |
30 | //instantiate the module
31 | ip_forward #(.WIDTH(WIDTH))
32 | inner_product_forward_inst(
33 | .clk(clk),
34 | .reset(reset),
35 | .in_data(in_vec),
36 | .weights(weight_vec),
37 | .in_id(8'b0),
38 | .out_data(out)
39 | );
40 |
41 | initial begin
42 | reset = 0;
43 | num_errors = 0;
44 | num_add_levels = 1;
45 | //calculate log2(WIDTH)
46 | while ((1 << num_add_levels) < WIDTH) begin
47 | num_add_levels++;
48 | end
49 | //calculate total delay of one calculation
50 | //1 mult delay plus log2(WIDTH) add delays to sum products; ip_forward has no bias add
51 | //the trailing +2 clks leave slack for the registered output
52 | delay = CYCLE*(MULT_DELAY + ADD_DELAY*(num_add_levels) + 2);
53 |
54 | $display("num add levels: %d", num_add_levels);
55 | //for all test cases
56 | for (i = 0; i < MEM_SIZE; i = i + WIDTH) begin
57 | //copy each value to input vector
58 | for (j = 0; j < WIDTH; j++) begin
59 | in_vec[j] = test_input[i+j];
60 | end
61 | //copy each value to weight vector
62 | for (j = 0; j < WIDTH; j++) begin
63 | weight_vec[j] = test_weights[i+j];
64 | end
65 | //copy bias term
66 | //bias_term = test_bias[i/WIDTH];
67 |
68 | //wait for computation to finish
69 | #(delay)
70 |
71 | $display("output: %h\tcalculated: %h", out, test_output[i/WIDTH]);
72 | assert( out == test_output[i/WIDTH] );
73 | //if we were wrong, increase error count
74 | if( out != test_output[i/WIDTH] ) begin
75 | num_errors++;
76 | end
77 | end
78 | $display("############################################\n");
79 | $display("Testing complete!\n");
80 | $display("%d of %d tests passed\n", NUM_TESTS-num_errors, NUM_TESTS);
81 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
82 | $display("############################################\n");
83 | end
84 |
85 | endmodule
86 |
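The wait time used by the testbench above follows from the pipeline latencies: one multiplier pass, one adder pass per level of the binary adder tree, and two extra cycles. A minimal Python sketch of that arithmetic, assuming the same parameter values as the testbench (the function names are illustrative and not part of the repository):

```python
def adder_tree_levels(width: int) -> int:
    """Number of levels in a binary adder tree that reduces `width` values to one."""
    levels = 0
    while (1 << levels) < width:
        levels += 1
    return levels


def ip_forward_delay_ns(width=8, cycle_ns=5, mult_delay=5, add_delay=7, extra_cycles=2):
    """Wait time in ns: one multiply, one add per tree level, plus extra cycles."""
    return cycle_ns * (mult_delay + add_delay * adder_tree_levels(width) + extra_cycles)


if __name__ == "__main__":
    print(adder_tree_levels(8))    # 3
    print(ip_forward_delay_ns())   # 140
```

With WIDTH = 8 this gives three adder levels and a 140 ns wait per test vector.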
--------------------------------------------------------------------------------
/rtl/loss_layer_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module loss_layer_tb();
4 | `include "/home/b/bear_git/FPGA-CNN/test/test_data/softmax_with_loss_test_data.vh"
5 | parameter CYCLE = 5; //clk period: 5ns = 200 MHz
6 | parameter MULT_DELAY = 5; //#clks to complete a mult
7 | parameter ADD_DELAY = 7; //#clks to complete an add
8 | parameter SUB_DELAY = 7; //#clks to complete a sub
9 | parameter EXP_DELAY = 17; //#clks to complete an exponential
10 | parameter LOG_DELAY = 21; //#clks to complete a log
11 | parameter DIV_DELAY = 6; //#clks to complete a div
12 | parameter WIDTH = 8; //input vector width
13 |
14 | parameter NUM_TESTS = 10000;
15 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
16 |
17 | reg clk, reset;
18 | logic [31:0] in_vec [WIDTH-1:0]; //input vec to module
19 | logic [31:0] label; //correct classification
20 | logic [7:0] id; //identification value
21 | logic [31:0] out; //output from module
22 | logic f_overall_sum;
23 | int i, j, num_errors, num_add_levels, delay, sub_exp_add_delay, div_log_delay;
24 |
25 | //initialize clk
26 | initial begin
27 | clk = 0;
28 | end
29 |
30 | //forever cycle the clk
31 | always begin
32 | #(CYCLE/2.0) clk = ~clk;
33 | end
34 |
35 | //instantiate the module
36 | lol_opt #(.WIDTH(WIDTH))
37 | lol_opt_inst(
38 | .clk(clk),
39 | .reset_n(reset),
40 | .in_ID(id),
41 | .f_overall_sum(f_overall_sum),
42 | .all_clsf(in_vec),
43 | .corr_clsf(label),
44 | .data_out(out)
45 | );
46 |
47 | initial begin
48 | reset = 1;
49 | id = 0;
50 | f_overall_sum = 0;
51 | num_errors = 0;
52 | num_add_levels = 0;
53 | //calculate ceil(log2(WIDTH)): number of adder-tree levels
54 | while ((1 << num_add_levels) < WIDTH) begin
55 | num_add_levels++;
56 | end
57 | //calculate total delay of one calculation
58 | sub_exp_add_delay = CYCLE*(SUB_DELAY + EXP_DELAY + ADD_DELAY*(num_add_levels));
59 | div_log_delay = CYCLE*(DIV_DELAY + LOG_DELAY-1);
60 |
61 | $display("num add levels: %d", num_add_levels);
62 | //for all test cases
63 | for (i = 0; i < MEM_SIZE; i = i + WIDTH) begin
64 | //reset module
65 | reset = 0;
66 | #CYCLE reset = 1;
67 |
68 | //copy each value to input vector
69 | for (j = 0; j < WIDTH; j++) begin
70 | in_vec[j] = test_input[i+j];
71 | end
72 |
73 | //copy label
74 | label = test_label[i/WIDTH];
75 |
76 | //wait for computation to finish
77 | #(sub_exp_add_delay)
78 | f_overall_sum = 1;
79 | #CYCLE
80 | f_overall_sum = 0;
81 | //add to overall sum
82 | #(CYCLE*ADD_DELAY)
83 | //div and log
84 | #(div_log_delay)
85 |
86 | //if we were wrong, check for rounding error
87 | if( out != test_output[i/WIDTH] ) begin
88 | //if log(1.0) in NumPy gave us garbage, do our own check
89 | if ( test_div[i/WIDTH] == 32'h3f800000 ) begin
90 | if ( out != 32'h80000000 ) begin
91 | $display("Error! Module did not correctly handle log(1.0)");
92 | $display("output: %h\tcalculated: 32'h80000000", out);
93 | end
94 | //if the number was off because of a rounding error, ignore
95 | end else if ( out - test_output[i/WIDTH] < 32'h0000ffff ||
96 | test_output[i/WIDTH] - out < 32'h0000ffff ) begin
97 | //$display("Rounding error");
98 | //otherwise, complain
99 | end else begin
100 | //assert( out == test_output[i/WIDTH] );
101 | $display("Error! Module result not expected value");
102 | $display("output: %h\tcalculated: %h", out, test_output[i/WIDTH]);
103 | $display("out&:\t\t%b", out & 32'hfffff000);
104 | $display("corr&:\t\t%b", test_output[i/WIDTH] & 32'hfffff000);
105 | $display("out-corr:\t\t%b", out - test_output[i/WIDTH]);
106 | $display("corr-out:\t\t%b", test_output[i/WIDTH] - out);
107 | num_errors++;
108 | end
109 | end
110 | $display("(%f percent)\n", 100.0*((i/WIDTH)+1-num_errors)/((i/WIDTH)+1));
111 | end
112 | $display("############################################\n");
113 | $display("Testing complete!\n");
114 | $display("%d of %d tests passed\n", NUM_TESTS-num_errors, NUM_TESTS);
115 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
116 | $display("############################################\n");
117 | end
118 |
119 | endmodule
120 |
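The rounding check in the testbench above compares raw IEEE-754 bit patterns with wrapping 32-bit subtraction: whichever operand is larger, one of the two differences is the true distance and the other wraps to a huge value, so OR-ing the two comparisons acts as an absolute-difference test. A minimal Python sketch of that idea (the helper names are hypothetical and not part of the repository):

```python
import struct

MASK32 = 0xFFFFFFFF


def float_to_bits(x: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def within_tolerance(out_bits: int, expected_bits: int, tol: int = 0x0000FFFF) -> bool:
    """Mirror the testbench check: either wrapped 32-bit difference is below tol."""
    forward = (out_bits - expected_bits) & MASK32
    backward = (expected_bits - out_bits) & MASK32
    return forward < tol or backward < tol


if __name__ == "__main__":
    print(hex(float_to_bits(1.0)))                      # 0x3f800000
    print(within_tolerance(0x3F800010, 0x3F800000))     # True
    print(within_tolerance(0x00000000, 0x80000000))     # False
```

A bit-level comparison treats +0.0 and -0.0 as far apart, which is presumably why the testbench handles the log(1.0) case separately and expects 32'h80000000 there.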
--------------------------------------------------------------------------------
/rtl/loss_opt.sv:
--------------------------------------------------------------------------------
1 | /* Author: Youthawin Philavastvanid
2 | * Date : 02/08/2016
3 | *
4 | * Module: loss_opt
5 | * Desc :
6 | *
7 | * Design:
8 | * Input :
9 | * Output:
10 | *
11 | * Timeline:
12 |
13 | * WARNING:
14 | f_inc_idx should be a pulse 1 clk cycle wide, indicating it is time to increment the idx of the adder
15 | f_inc_idx_exp should be a pulse 1 clk cycle wide, indicating it is time to increment the idx of the e^( z_correctClassification )
16 | */
17 |
18 | module lol_opt#(
19 | parameter WEIGHT=1,
20 | WIDTH=8 //number of float input
21 | )(
22 |
23 | input reset_n, //reset
24 | input clk, //clock
25 | input f_overall_sum, //Summer inc flag
26 | input [31:0] all_clsf [WIDTH-1:0], //calculated classification
27 | input [31:0] corr_clsf, //correct classification
28 | input reg [7:0] in_ID,
29 |
30 | output reg [7:0] out_ID,
31 | output reg [31:0] data_out //Vector data output
32 |
33 | );
34 |
35 | reg [31:0] sub_result [WIDTH-1:0];
36 | reg [31:0] corr_clsf_sub_result;
37 | reg [31:0] overall_sum;
38 | reg [31:0] buff_overall_sum;
39 | reg [31:0] current_sum;
40 |
41 | reg [31:0] sum_e_all_clsf; //SUM( e^( z_allClassification ) )
42 | reg [31:0] buff_sum_e_all_clsf; //output buff for SUM( e^( z_allClassification ) )
43 |
44 | reg [31:0] e_all_clsf [WIDTH-1:0]; //e^( z_allClassification )
45 | reg [15:0] idx_e_all_clsf; //idx for e_all_clsf
46 | reg f_set_inc_idx_e_all; //increment flag for idx_e_all_clsf
47 |
48 | reg [31:0] e_corr_clsf; //e^( z_correctClassification )
49 |
50 | reg [31:0] div_ecorr_sumall; //e^( z_correctClassification ) / SUM( e^( z_allClassification )
51 |
52 | reg [31:0] buff_out_div;
53 | reg [31:0] buff_out;
54 |
55 | logic [31:0] connections [2*WIDTH] ;
56 |
57 | genvar i, j;
58 | generate
59 | //create float_sub blocks to subtract corr_clsf
60 | //from each of the WIDTH inputs
61 | for (i = 0; i < WIDTH; i++) begin : GEN_SUBS
62 | flt_sub flt_sub_inst(
63 | .aclr(!reset_n),
64 | .clock(clk),
65 | .dataa(all_clsf[i]),
66 | .datab(corr_clsf),
67 | .result(sub_result[i])
68 | );
69 | end
70 | //create float_exp blocks to exponentiate each
71 | //of the WIDTH subtraction results
72 | for (i = 0; i < WIDTH; i++) begin : GEN_EXPS
73 | flt_exp flt_exp_inst(
74 | .aclr(!reset_n),
75 | .clock(clk),
76 | .data(sub_result[i]),
77 | .result(connections[i+WIDTH])
78 | );
79 | end
80 | //sum the exponentials, and reduce to single value
81 | for (i = WIDTH; i > 1; i = i / 2) begin : GEN_SUMS
82 | for (j = i; j > i/2 && j != 1; j--) begin : SUM_MULTS
83 | flt_add flt_add_inst(
84 | .aclr(!reset_n),
85 | .clock(clk),
86 | .dataa(connections[2*j-1]),
87 | .datab(connections[2*j-2]),
88 | .result(connections[j-1])
89 | );
90 | end
91 | end
92 | endgenerate
93 |
94 | flt_add_new flt_add_overall_sum(
95 | .aclr(!reset_n),
96 | .clock(clk),
97 | .dataa(current_sum),
98 | .datab(overall_sum),
99 | .result(buff_overall_sum)
100 | );
101 | //forwarding the ID
102 | assign out_ID = in_ID;
103 | assign current_sum = connections[1];
104 | always_ff @(posedge clk, negedge reset_n) begin
105 | if (!reset_n) begin
106 | overall_sum <= 0;
107 | end else begin
108 | if( f_overall_sum )
109 | overall_sum <= buff_overall_sum;
110 | else
111 | overall_sum <= overall_sum;
112 | end
113 | end
114 |
115 |
116 | //Dividing -- 1.0 / SUM( e^( z_allClassification - z_correctClassification ) ); numerator is 1.0 because corr_clsf was already subtracted from every input
117 | flt_div_new flt_div_inst( //[+6]
118 | .aclr (!reset_n),
119 | .clock (clk),
120 | .dataa (32'h3f800000),
121 | .datab (buff_overall_sum),
122 | .result (div_ecorr_sumall) );
123 |
124 | //Taking log of the quotient
125 | flt_log flt_log( //[+21]
126 | .aclr (!reset_n),
127 | .clock (clk),
128 | .data (div_ecorr_sumall),
129 | .result (buff_out_div) );
130 |
131 | //Multiply by (-1) by flipping the IEEE-754 sign bit
132 | assign data_out = buff_out_div ^ (1<<31);
133 |
134 | endmodule
135 |
136 |
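The datapath above can be read as a softmax-with-loss computation: subtract the correct-class score from every input, exponentiate, sum, take the reciprocal, take the log, and flip the sign. The following Python sketch is a reference model under two assumptions that the source does not state outright: `corr_clsf` is the score of the correct class, and the `overall_sum` accumulator is zero (as it is right after reset). It uses double precision, so it only approximates the single-precision hardware result.

```python
import math


def softmax_loss_reference(all_clsf, corr_clsf):
    """Model of the pipeline: sub -> exp -> adder tree -> 1/x -> log -> negate."""
    shifted = [z - corr_clsf for z in all_clsf]   # flt_sub stage
    exps = [math.exp(d) for d in shifted]         # flt_exp stage
    total = sum(exps)                             # adder tree (accumulator assumed 0)
    quotient = 1.0 / total                        # divider with numerator 1.0
    return -math.log(quotient)                    # flt_log, then sign flip


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0, 4.0]
    print(softmax_loss_reference(scores, 3.0))
```

Because e^(z_c) / SUM(e^(z_i)) equals 1 / SUM(e^(z_i - z_c)), the result matches the usual -log(softmax) of the correct class.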
--------------------------------------------------------------------------------
/rtl/pooling_backward_layer_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 | `define DEBUG 1
3 | module pooling_backward_layer_tb();
4 | `include "/home/b/bear_git/FPGA-CNN/test/test_data/pooling_backward_test_data.vh"
5 | parameter CYCLE = 5;
6 | parameter MULT_DELAY = 5;
7 | parameter KERNEL_WIDTH = 3;
8 | parameter KERNEL_HEIGHT = 3;
9 | parameter WIDTH = KERNEL_WIDTH*KERNEL_HEIGHT;
10 |
11 | parameter NUM_TESTS = 5000;
12 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
13 |
14 |
15 | reg clk, reset;
16 | logic [31:0] in_vec [WIDTH-1:0]; //input vec to module
17 | logic [7:0] in_idx;
18 | logic [31:0] in_err_term;
19 | logic [31:0] out_data [WIDTH-1:0]; //output from module
20 | int i, j, k, num_errors, num_depth, delay;
21 |
22 | //initialize clk
23 | initial begin
24 | clk = 0;
25 | end
26 |
27 | //forever cycle the clk
28 | always begin
29 | #(CYCLE/2.0) clk = ~clk;
30 | end
31 |
32 | //instantiate the module
33 | pooling_backward_opt #( .k_w(KERNEL_WIDTH), .k_h(KERNEL_HEIGHT) )
34 | pooling_backward_tbmodule (
35 | .reset_n (reset), //reset
36 | .clk (clk), //clock
37 | .max_flt_idx (in_idx),
38 | .data_vect_in (in_vec), //Vector data input
39 | .error_term (in_err_term),
40 |
41 | .data_vect_out (out_data) //Vector data output
42 | );
43 |
44 | initial begin
45 | reset = 0;
46 | num_errors = 0;
47 |
48 | //calculate total delay of one calculation
49 | //one multiplication, plus one cycle to load operand, one to load result
50 | delay = CYCLE*(MULT_DELAY + 3);
51 |
52 | //for all test cases
53 | for (i = 0; i < MEM_SIZE; i = i + WIDTH) begin
54 | //copy each value to input vector
55 |
56 | for (j = 0; j < WIDTH; j++) begin
57 | in_vec[j] = test_input[i+j];
58 | end
59 |
60 | in_idx = test_index[i/WIDTH];
61 | in_err_term = test_error_term[i/WIDTH];
62 |
63 | //wait for computation to finish
64 | #(delay)
65 |
66 | $display("test case: %d\t", i/WIDTH);
67 | $display("test idx: %d\t", in_idx);
68 | $display("err_term: %h\t", in_err_term);
69 | `ifdef DEBUG
70 | $display("in_vec\t test_input");
71 | for (j = 0; j < WIDTH; j++) begin
72 | $display("%h\t%h", in_vec[j], test_input[i+j]);
73 | end
74 | $display("out_data\t test_output\t");
75 | for (j = 0; j < WIDTH; j++) begin
76 | $display("%h\t%h", out_data[j], test_output[i+j]);
77 | end
78 | `endif
79 |
80 | if( out_data[in_idx] != test_output[i+in_idx]) begin
81 | //if the number was off because of a rounding error, ignore
82 | if ( out_data[in_idx] - test_output[i+in_idx] < 32'h000000ff ||
83 | test_output[i+in_idx] - out_data[in_idx] < 32'h000000ff ) begin
84 | `ifdef DEBUG
85 | $display("rounding error");
86 | `endif
87 | //otherwise, complain
88 | end else begin
89 | assert( out_data[in_idx] == test_output[i+in_idx] );
90 | $display("output: %h\tcalculated: %h", out_data[in_idx], test_output[i+in_idx]);
91 | num_errors++;
92 | end
93 | end
94 |
95 | $display("\n\n");
96 |
97 | end
98 | $display("############################################\n");
99 | $display("Testing complete!\n");
100 | $display("%d of %d tests passed\n", NUM_TESTS-num_errors, NUM_TESTS);
101 | $display("(%f percent)\n", 100.0*(NUM_TESTS-num_errors)/NUM_TESTS);
102 | $display("############################################\n");
103 | end
104 |
105 | endmodule
106 |
--------------------------------------------------------------------------------
/rtl/pooling_backward_opt.sv:
--------------------------------------------------------------------------------
1 | /* Author: Youthawin Philavastvanid
2 | * Date : 02/08/2016
3 | *
4 | * Module: pooling_backward_opt
5 | * Desc :
6 | *
7 | * Design:
8 | * Input : Takes in a 1D vector containing all the floating point values
9 | * Output: Maximum value of the float
10 | *
11 | * WARNING: Max number the module can handle is 32 floating-point values
12 | */
13 |
14 | module pooling_backward_opt#(
15 | parameter
16 | k_w=3, //kernel width
17 | k_h=3, //kernel height
18 | k_size= k_w*k_h)(//kernel size
19 |
20 |
21 | input logic reset_n, //reset
22 | input logic clk, //clock
23 | input logic [7:0] max_flt_idx, //idx of max float in a kernel
24 | input logic [31:0] data_vect_in[k_size-1:0], //data input
25 | input logic [31:0] error_term, //error term for a kernel
26 |
27 | output reg [31:0] data_vect_out[k_size-1:0] //Vector data output
28 | );
29 |
30 | reg [31:0] max;
31 | reg [31:0] result;
32 |
33 | // data_vect_out[row][col] <= data_vect_in[row][col]*error_term;
34 | float_mult float_mult_inst (
35 | .clk_en(!reset_n),
36 | .clock(clk),
37 | .dataa(max),
38 | .datab(error_term),
39 | .result(result)
40 | );
41 |
42 | always @(posedge clk) begin
43 | for(int i=0; i NUM_CYCLES) begin
72 | rd_addr <= rd_addr + 1;
73 | cycle_count <= 0;
74 | end else begin
75 | cycle_count <= cycle_count + 1;
76 | end
77 | end
78 | end
79 |
80 | //select buffer to write to
81 | assign wr_en_input_data = !buffer_select;
82 | assign wr_en_weight_data = buffer_select;
83 |
84 |
85 | logic [255:0] conv_out;
86 | logic [511:0] data_buffer_out;
87 | logic [511:0] weight_buffer_out;
88 |
89 | cacheline_buffer input_data_buffer(
90 | .wr_clk(clk),
91 | .wr_en(wr_en_input_data),
92 | .wr_addr(wr_addr),
93 | .wr_data(data),
94 | .rd_clk(clk),
95 | .rd_addr(rd_addr),
96 | .rd_data(data_buffer_out)
97 | );
98 |
99 | cacheline_buffer weight_data_buffer(
100 | .wr_clk(clk),
101 | .wr_en(wr_en_weight_data),
102 | .wr_addr(wr_addr),
103 | .wr_data(data),
104 | .rd_clk(clk),
105 | .rd_addr(rd_addr),
106 | .rd_data(weight_buffer_out)
107 | );
108 |
109 | ram_2p result_data_buffer(
110 | .wrclock(clk),
111 | .wren(clk),
112 | .wraddress(rd_addr),
113 | .data(conv_out),
114 | .rdclock(clk),
115 | .rdaddress(),
116 | .q(result)
117 | );
118 |
119 | genvar i;
120 | generate
121 | for (i = 0; i < NUM_BLOCKS; i++) begin : GEN_CONV
122 | conv_forward_layer #(.WIDTH(2))
123 | conv_forward_inst(
124 | .clk(clk),
125 | .reset(resetb),
126 | .id(rd_addr),
127 | .in_data(conv_bus.data[(i+1)*64-1:i*64]),
128 | .weight_vec(conv_bus.weights[(i+1)*64-1:i*64]),
129 | // .bias_term(conv_bus.bias[(i+1)*32-1:i*32]),
130 | .out_data(conv_out[(i+1)*32-1:i*32])
131 | );
132 | end
133 | endgenerate
134 |
135 |
136 |
137 | endmodule
138 |
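The generate loop above hands each convolution block a 64-bit slice of the data and weight buses and collects a 32-bit result per block. A minimal Python sketch of that slicing, assuming eight lanes (512 / 64 on the input side, 256 / 32 on the output side) and assuming the lower 32 bits of a lane hold the first float; neither assumption is confirmed by the visible source, and the helper names are illustrative:

```python
import struct


def split_lanes(word: int, lane_bits: int, num_lanes: int):
    """Split an integer bus into lanes, lane i covering bits [(i+1)*lane_bits-1 : i*lane_bits]."""
    mask = (1 << lane_bits) - 1
    return [(word >> (i * lane_bits)) & mask for i in range(num_lanes)]


def lane_to_floats(lane: int, floats_per_lane: int = 2):
    """Interpret a lane as packed single-precision floats, lowest 32 bits first (assumed order)."""
    raw = lane.to_bytes(4 * floats_per_lane, "little")
    return list(struct.unpack("<%df" % floats_per_lane, raw))


if __name__ == "__main__":
    bus = sum(i << (64 * i) for i in range(8))
    print(split_lanes(bus, 64, 8))                 # [0, 1, 2, 3, 4, 5, 6, 7]
    print(lane_to_floats(0x400000003F800000))      # [1.0, 2.0]
```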
--------------------------------------------------------------------------------
/rtl/qip/float_add.bsf:
--------------------------------------------------------------------------------
1 | /*
2 | WARNING: Do NOT edit the input and output ports in this file in a text
3 | editor if you plan to continue editing the block that represents it in
4 | the Block Editor! File corruption is VERY likely to occur.
5 | */
6 | /*
7 | Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
8 | Your use of Altera Corporation's design tools, logic functions
9 | and other software and tools, and its AMPP partner logic
10 | functions, and any output files from any of the foregoing
11 | (including device programming or simulation files), and any
12 | associated documentation or information are expressly subject
13 | to the terms and conditions of the Altera Program License
14 | Subscription Agreement, the Altera Quartus Prime License Agreement,
15 | the Altera MegaCore Function License Agreement, or other
16 | applicable license agreement, including, without limitation,
17 | that your use is for the sole purpose of programming logic
18 | devices manufactured by Altera and sold by Altera or its
19 | authorized distributors. Please refer to the applicable
20 | agreement for further details.
21 | */
22 | (header "symbol" (version "1.2"))
23 | (symbol
24 | (rect 0 0 184 248)
25 | (text "float_add" (rect 66 0 182 18)(font "Arial" (font_size 10)))
26 | (text "inst" (rect 8 232 45 244)(font "Arial" ))
27 | (port
28 | (pt 0 40)
29 | (input)
30 | (text "dataa[31..0]" (rect 0 0 127 15)(font "Arial" (font_size 8)))
31 | (text "dataa[31..0]" (rect 20 34 78 46)(font "Arial" (font_size 8)))
32 | (line (pt 0 40)(pt 16 40)(line_width 3))
33 | )
34 | (port
35 | (pt 0 56)
36 | (input)
37 | (text "datab[31..0]" (rect 0 0 127 15)(font "Arial" (font_size 8)))
38 | (text "datab[31..0]" (rect 20 50 78 62)(font "Arial" (font_size 8)))
39 | (line (pt 0 56)(pt 16 56)(line_width 3))
40 | )
41 | (port
42 | (pt 0 88)
43 | (input)
44 | (text "clock" (rect 0 0 53 15)(font "Arial" (font_size 8)))
45 | (text "clock" (rect 20 82 45 94)(font "Arial" (font_size 8)))
46 | (line (pt 0 88)(pt 16 88))
47 | )
48 | (port
49 | (pt 0 104)
50 | (input)
51 | (text "aclr" (rect 0 0 42 15)(font "Arial" (font_size 8)))
52 | (text "aclr" (rect 20 98 38 110)(font "Arial" (font_size 8)))
53 | (line (pt 0 104)(pt 16 104))
54 | )
55 | (port
56 | (pt 184 40)
57 | (output)
58 | (text "result[31..0]" (rect 0 0 138 15)(font "Arial" (font_size 8)))
59 | (text "result[31..0]" (rect 109 34 166 46)(font "Arial" (font_size 8)))
60 | (line (pt 184 40)(pt 168 40)(line_width 3))
61 | )
62 | (drawing
63 | (text "Clock Cycles: 7" (rect 20 115 108 240)(font "Arial" ))
64 | (text "Single Precision" (rect 20 131 107 272)(font "Arial" ))
65 | (text "Exponent Width: 8" (rect 20 147 118 304)(font "Arial" ))
66 | (text "Mantissa Width: 23" (rect 20 163 122 336)(font "Arial" ))
67 | (text "Direction: Add" (rect 20 179 100 368)(font "Arial" ))
68 | (text "Optimization: Speed" (rect 20 195 125 400)(font "Arial" ))
69 | (line (pt 0 0)(pt 186 0))
70 | (line (pt 186 0)(pt 186 250))
71 | (line (pt 0 250)(pt 186 250))
72 | (line (pt 0 0)(pt 0 250))
73 | (line (pt 16 24)(pt 170 24))
74 | (line (pt 170 24)(pt 170 226))
75 | (line (pt 16 226)(pt 170 226))
76 | (line (pt 16 24)(pt 16 226))
77 | )
78 | )
79 |
--------------------------------------------------------------------------------
/rtl/qip/float_add.cmp:
--------------------------------------------------------------------------------
1 | --Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
2 | --Your use of Altera Corporation's design tools, logic functions
3 | --and other software and tools, and its AMPP partner logic
4 | --functions, and any output files from any of the foregoing
5 | --(including device programming or simulation files), and any
6 | --associated documentation or information are expressly subject
7 | --to the terms and conditions of the Altera Program License
8 | --Subscription Agreement, the Altera Quartus Prime License Agreement,
9 | --the Altera MegaCore Function License Agreement, or other
10 | --applicable license agreement, including, without limitation,
11 | --that your use is for the sole purpose of programming logic
12 | --devices manufactured by Altera and sold by Altera or its
13 | --authorized distributors. Please refer to the applicable
14 | --agreement for further details.
15 |
16 |
17 | component float_add
18 | PORT
19 | (
20 | aclr : IN STD_LOGIC ;
21 | clock : IN STD_LOGIC ;
22 | dataa : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
23 | datab : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
24 | result : OUT STD_LOGIC_VECTOR (31 DOWNTO 0)
25 | );
26 | end component;
27 |
--------------------------------------------------------------------------------
/rtl/qip/float_add.inc:
--------------------------------------------------------------------------------
1 | --Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
2 | --Your use of Altera Corporation's design tools, logic functions
3 | --and other software and tools, and its AMPP partner logic
4 | --functions, and any output files from any of the foregoing
5 | --(including device programming or simulation files), and any
6 | --associated documentation or information are expressly subject
7 | --to the terms and conditions of the Altera Program License
8 | --Subscription Agreement, the Altera Quartus Prime License Agreement,
9 | --the Altera MegaCore Function License Agreement, or other
10 | --applicable license agreement, including, without limitation,
11 | --that your use is for the sole purpose of programming logic
12 | --devices manufactured by Altera and sold by Altera or its
13 | --authorized distributors. Please refer to the applicable
14 | --agreement for further details.
15 |
16 |
17 | FUNCTION float_add
18 | (
19 | aclr,
20 | clock,
21 | dataa[31..0],
22 | datab[31..0]
23 | )
24 |
25 | RETURNS (
26 | result[31..0]
27 | );
28 |
--------------------------------------------------------------------------------
/rtl/qip/float_add.qip:
--------------------------------------------------------------------------------
1 | set_global_assignment -name IP_TOOL_NAME "ALTFP_ADD_SUB"
2 | set_global_assignment -name IP_TOOL_VERSION "15.1"
3 | set_global_assignment -name IP_GENERATED_DEVICE_FAMILY "{Stratix V}"
4 | set_global_assignment -name VERILOG_FILE [file join $::quartus(qip_path) "float_add.v"]
5 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add.bsf"]
6 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add_inst.v"]
7 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add_bb.v"]
8 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add.inc"]
9 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add.cmp"]
10 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_add_syn.v"]
11 |
--------------------------------------------------------------------------------
/rtl/qip/float_add_bb.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %ALTFP_ADD_SUB%VBB%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: altfp_add_sub
5 |
6 | // ============================================================
7 | // File Name: float_add.v
8 | // Megafunction Name(s):
9 | // altfp_add_sub
10 | //
11 | // Simulation Library Files(s):
12 | // lpm
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 15.1.1 Build 189 12/02/2015 SJ Standard Edition
18 | // ************************************************************
19 |
20 | //Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
21 | //Your use of Altera Corporation's design tools, logic functions
22 | //and other software and tools, and its AMPP partner logic
23 | //functions, and any output files from any of the foregoing
24 | //(including device programming or simulation files), and any
25 | //associated documentation or information are expressly subject
26 | //to the terms and conditions of the Altera Program License
27 | //Subscription Agreement, the Altera Quartus Prime License Agreement,
28 | //the Altera MegaCore Function License Agreement, or other
29 | //applicable license agreement, including, without limitation,
30 | //that your use is for the sole purpose of programming logic
31 | //devices manufactured by Altera and sold by Altera or its
32 | //authorized distributors. Please refer to the applicable
33 | //agreement for further details.
34 |
35 | module float_add (
36 | aclr,
37 | clock,
38 | dataa,
39 | datab,
40 | result)/* synthesis synthesis_clearbox = 1 */;
41 |
42 | input aclr;
43 | input clock;
44 | input [31:0] dataa;
45 | input [31:0] datab;
46 | output [31:0] result;
47 |
48 | endmodule
49 |
50 | // ============================================================
51 | // CNX file retrieval info
52 | // ============================================================
53 | // Retrieval info: PRIVATE: FPM_FORMAT NUMERIC "0"
54 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
55 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX STRING "1"
56 | // Retrieval info: PRIVATE: WIDTH_DATA NUMERIC "32"
57 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
58 | // Retrieval info: CONSTANT: DENORMAL_SUPPORT STRING "NO"
59 | // Retrieval info: CONSTANT: DIRECTION STRING "ADD"
60 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "Stratix V"
61 | // Retrieval info: CONSTANT: OPTIMIZE STRING "SPEED"
62 | // Retrieval info: CONSTANT: PIPELINE NUMERIC "7"
63 | // Retrieval info: CONSTANT: REDUCED_FUNCTIONALITY STRING "NO"
64 | // Retrieval info: CONSTANT: WIDTH_EXP NUMERIC "8"
65 | // Retrieval info: CONSTANT: WIDTH_MAN NUMERIC "23"
66 | // Retrieval info: USED_PORT: aclr 0 0 0 0 INPUT NODEFVAL "aclr"
67 | // Retrieval info: USED_PORT: clock 0 0 0 0 INPUT NODEFVAL "clock"
68 | // Retrieval info: USED_PORT: dataa 0 0 32 0 INPUT NODEFVAL "dataa[31..0]"
69 | // Retrieval info: USED_PORT: datab 0 0 32 0 INPUT NODEFVAL "datab[31..0]"
70 | // Retrieval info: USED_PORT: result 0 0 32 0 OUTPUT NODEFVAL "result[31..0]"
71 | // Retrieval info: CONNECT: @aclr 0 0 0 0 aclr 0 0 0 0
72 | // Retrieval info: CONNECT: @clock 0 0 0 0 clock 0 0 0 0
73 | // Retrieval info: CONNECT: @dataa 0 0 32 0 dataa 0 0 32 0
74 | // Retrieval info: CONNECT: @datab 0 0 32 0 datab 0 0 32 0
75 | // Retrieval info: CONNECT: result 0 0 32 0 @result 0 0 32 0
76 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add.v TRUE
77 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add.inc TRUE
78 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add.cmp TRUE
79 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add.bsf TRUE
80 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add_inst.v TRUE
81 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add_bb.v TRUE
82 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_add_syn.v TRUE
83 | // Retrieval info: LIB_FILE: lpm
84 |
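The generated adder above is configured with WIDTH_EXP = 8 and WIDTH_MAN = 23, which is the IEEE-754 single-precision layout: 1 sign bit, 8 exponent bits, 23 mantissa bits. A small Python sketch for decoding the 32-bit words used throughout the testbenches (the function name is hypothetical and not part of the repository):

```python
def decode_single(bits: int):
    """Split a 32-bit word into (sign, biased exponent, mantissa) fields."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF      # 8-bit exponent, bias 127
    mantissa = bits & 0x7FFFFF          # 23-bit fraction
    return sign, exponent, mantissa


if __name__ == "__main__":
    print(decode_single(0x3F800000))   # (0, 127, 0), i.e. 1.0
    print(decode_single(0x80000000))   # (1, 0, 0), i.e. -0.0
```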
--------------------------------------------------------------------------------
/rtl/qip/float_add_inst.v:
--------------------------------------------------------------------------------
1 | float_add float_add_inst (
2 | .aclr ( aclr_sig ),
3 | .clock ( clock_sig ),
4 | .dataa ( dataa_sig ),
5 | .datab ( datab_sig ),
6 | .result ( result_sig )
7 | );
8 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult.bsf:
--------------------------------------------------------------------------------
1 | /*
2 | WARNING: Do NOT edit the input and output ports in this file in a text
3 | editor if you plan to continue editing the block that represents it in
4 | the Block Editor! File corruption is VERY likely to occur.
5 | */
6 | /*
7 | Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
8 | Your use of Altera Corporation's design tools, logic functions
9 | and other software and tools, and its AMPP partner logic
10 | functions, and any output files from any of the foregoing
11 | (including device programming or simulation files), and any
12 | associated documentation or information are expressly subject
13 | to the terms and conditions of the Altera Program License
14 | Subscription Agreement, the Altera Quartus Prime License Agreement,
15 | the Altera MegaCore Function License Agreement, or other
16 | applicable license agreement, including, without limitation,
17 | that your use is for the sole purpose of programming logic
18 | devices manufactured by Altera and sold by Altera or its
19 | authorized distributors. Please refer to the applicable
20 | agreement for further details.
21 | */
22 | (header "symbol" (version "1.2"))
23 | (symbol
24 | (rect 0 0 224 176)
25 | (text "float_mult" (rect 80 0 209 20)(font "Dialog" (font_size 10)))
26 | (text "inst" (rect 8 160 45 172)(font "Arial" ))
27 | (port
28 | (pt 0 48)
29 | (input)
30 | (text "dataa[31..0]" (rect 0 0 127 15)(font "Dialog" (font_size 8)))
31 | (text "dataa[31..0]" (rect 4 35 72 47)(font "Dialog" (font_size 8)))
32 | (line (pt 0 48)(pt 80 48)(line_width 3))
33 | )
34 | (port
35 | (pt 0 64)
36 | (input)
37 | (text "datab[31..0]" (rect 0 0 127 15)(font "Dialog" (font_size 8)))
38 | (text "datab[31..0]" (rect 4 51 72 63)(font "Dialog" (font_size 8)))
39 | (line (pt 0 64)(pt 80 64)(line_width 3))
40 | )
41 | (port
42 | (pt 0 80)
43 | (input)
44 | (text "clk_en" (rect 0 0 63 15)(font "Dialog" (font_size 8)))
45 | (text "clk_en" (rect 4 67 38 79)(font "Dialog" (font_size 8)))
46 | (line (pt 0 80)(pt 80 80))
47 | )
48 | (port
49 | (pt 0 96)
50 | (input)
51 | (text "clock" (rect 0 0 53 15)(font "Dialog" (font_size 8)))
52 | (text "clock" (rect 4 83 34 95)(font "Dialog" (font_size 8)))
53 | (line (pt 0 96)(pt 80 96))
54 | )
55 | (port
56 | (pt 224 48)
57 | (output)
58 | (text "result[31..0]" (rect 0 0 138 15)(font "Dialog" (font_size 8)))
59 | (text "result[31..0]" (rect 151 35 218 47)(font "Dialog" (font_size 8)))
60 | (line (pt 224 48)(pt 144 48)(line_width 3))
61 | )
62 | (drawing
63 | (text "Clock cycles: 5" (rect 151 114 368 238)(font "Arial" ))
64 | (text "Single Precision" (rect 145 130 357 270)(font "Arial" ))
65 | (text "Exponent Width: 8" (rect 138 146 354 302)(font "Arial" ))
66 | (text "Mantissa Width: 23" (rect 134 162 350 334)(font "Arial" ))
67 | (line (pt 80 32)(pt 144 32))
68 | (line (pt 144 32)(pt 144 112))
69 | (line (pt 80 112)(pt 144 112))
70 | (line (pt 80 32)(pt 80 112))
71 | (line (pt 0 0)(pt 224 0))
72 | (line (pt 224 0)(pt 224 176))
73 | (line (pt 0 176)(pt 224 176))
74 | (line (pt 0 0)(pt 0 176))
75 | )
76 | )
77 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult.cmp:
--------------------------------------------------------------------------------
1 | --Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
2 | --Your use of Altera Corporation's design tools, logic functions
3 | --and other software and tools, and its AMPP partner logic
4 | --functions, and any output files from any of the foregoing
5 | --(including device programming or simulation files), and any
6 | --associated documentation or information are expressly subject
7 | --to the terms and conditions of the Altera Program License
8 | --Subscription Agreement, the Altera Quartus Prime License Agreement,
9 | --the Altera MegaCore Function License Agreement, or other
10 | --applicable license agreement, including, without limitation,
11 | --that your use is for the sole purpose of programming logic
12 | --devices manufactured by Altera and sold by Altera or its
13 | --authorized distributors. Please refer to the applicable
14 | --agreement for further details.
15 |
16 |
17 | component float_mult
18 | PORT
19 | (
20 | clk_en : IN STD_LOGIC ;
21 | clock : IN STD_LOGIC ;
22 | dataa : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
23 | datab : IN STD_LOGIC_VECTOR (31 DOWNTO 0);
24 | result : OUT STD_LOGIC_VECTOR (31 DOWNTO 0)
25 | );
26 | end component;
27 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult.inc:
--------------------------------------------------------------------------------
1 | --Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
2 | --Your use of Altera Corporation's design tools, logic functions
3 | --and other software and tools, and its AMPP partner logic
4 | --functions, and any output files from any of the foregoing
5 | --(including device programming or simulation files), and any
6 | --associated documentation or information are expressly subject
7 | --to the terms and conditions of the Altera Program License
8 | --Subscription Agreement, the Altera Quartus Prime License Agreement,
9 | --the Altera MegaCore Function License Agreement, or other
10 | --applicable license agreement, including, without limitation,
11 | --that your use is for the sole purpose of programming logic
12 | --devices manufactured by Altera and sold by Altera or its
13 | --authorized distributors. Please refer to the applicable
14 | --agreement for further details.
15 |
16 |
17 | FUNCTION float_mult
18 | (
19 | clk_en,
20 | clock,
21 | dataa[31..0],
22 | datab[31..0]
23 | )
24 |
25 | RETURNS (
26 | result[31..0]
27 | );
28 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult.qip:
--------------------------------------------------------------------------------
1 | set_global_assignment -name IP_TOOL_NAME "ALTFP_MULT"
2 | set_global_assignment -name IP_TOOL_VERSION "15.1"
3 | set_global_assignment -name IP_GENERATED_DEVICE_FAMILY "{Stratix V}"
4 | set_global_assignment -name VERILOG_FILE [file join $::quartus(qip_path) "float_mult.v"]
5 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult.bsf"]
6 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult_inst.v"]
7 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult_bb.v"]
8 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult.inc"]
9 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult.cmp"]
10 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "float_mult_syn.v"]
11 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %ALTFP_MULT%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: ALTFP_MULT
5 |
6 | // ============================================================
7 | // File Name: float_mult.v
8 | // Megafunction Name(s):
9 | // ALTFP_MULT
10 | //
11 | // Simulation Library Files(s):
12 | // lpm
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 15.1.1 Build 189 12/02/2015 SJ Standard Edition
18 | // ************************************************************
19 |
20 |
21 | //Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
22 | //Your use of Altera Corporation's design tools, logic functions
23 | //and other software and tools, and its AMPP partner logic
24 | //functions, and any output files from any of the foregoing
25 | //(including device programming or simulation files), and any
26 | //associated documentation or information are expressly subject
27 | //to the terms and conditions of the Altera Program License
28 | //Subscription Agreement, the Altera Quartus Prime License Agreement,
29 | //the Altera MegaCore Function License Agreement, or other
30 | //applicable license agreement, including, without limitation,
31 | //that your use is for the sole purpose of programming logic
32 | //devices manufactured by Altera and sold by Altera or its
33 | //authorized distributors. Please refer to the applicable
34 | //agreement for further details.
35 |
36 |
37 | //altfp_mult CBX_AUTO_BLACKBOX="ALL" DEDICATED_MULTIPLIER_CIRCUITRY="YES" DENORMAL_SUPPORT="NO" DEVICE_FAMILY="Stratix V" EXCEPTION_HANDLING="NO" PIPELINE=5 REDUCED_FUNCTIONALITY="NO" ROUNDING="TO_NEAREST" WIDTH_EXP=8 WIDTH_MAN=23 clk_en clock dataa datab result
38 | //VERSION_BEGIN 15.1 cbx_alt_ded_mult_y 2015:11:24:18:49:55:SJ cbx_altbarrel_shift 2015:11:24:18:49:55:SJ cbx_altera_mult_add 2015:11:24:18:49:55:SJ cbx_altera_mult_add_rtl 2015:11:24:18:49:55:SJ cbx_altfp_mult 2015:11:24:18:49:55:SJ cbx_altmult_add 2015:11:24:18:49:55:SJ cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_lpm_compare 2015:11:24:18:49:55:SJ cbx_lpm_mult 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_padd 2015:11:24:18:49:55:SJ cbx_parallel_add 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ cbx_util_mgl 2015:11:24:18:49:55:SJ VERSION_END
39 | // synthesis VERILOG_INPUT_VERSION VERILOG_2001
40 | // altera message_off 10463
41 |
42 |
43 | //synthesis_resources = lpm_add_sub 4 lpm_mult 1 reg 136
44 | //synopsys translate_off
45 | `timescale 1 ps / 1 ps
46 | //synopsys translate_on
47 | module float_mult_altfp_mult_t9o
48 | (
49 | clk_en,
50 | clock,
51 | dataa,
52 | datab,
53 | result) ;
54 | input clk_en;
55 | input clock;
56 | input [31:0] dataa;
57 | input [31:0] datab;
58 | output [31:0] result;
59 | `ifndef ALTERA_RESERVED_QIS
60 | // synopsys translate_off
61 | `endif
62 | tri1 clk_en;
63 | `ifndef ALTERA_RESERVED_QIS
64 | // synopsys translate_on
65 | `endif
66 |
67 | reg dataa_exp_all_one_ff_p1;
68 | reg dataa_exp_not_zero_ff_p1;
69 | reg dataa_man_not_zero_ff_p1;
70 | reg dataa_man_not_zero_ff_p2;
71 | reg datab_exp_all_one_ff_p1;
72 | reg datab_exp_not_zero_ff_p1;
73 | reg datab_man_not_zero_ff_p1;
74 | reg datab_man_not_zero_ff_p2;
75 | reg [9:0] delay_exp2_bias;
76 | reg [9:0] delay_exp_bias;
77 | reg delay_man_product_msb;
78 | reg delay_man_product_msb_p0;
79 | reg [8:0] exp_add_p1;
80 | reg [7:0] exp_result_ff;
81 | reg input_is_infinity_dffe_0;
82 | reg input_is_infinity_dffe_1;
83 | reg input_is_infinity_ff1;
84 | reg input_is_nan_dffe_0;
85 | reg input_is_nan_dffe_1;
86 | reg input_is_nan_ff1;
87 | reg input_not_zero_dffe_0;
88 | reg input_not_zero_dffe_1;
89 | reg input_not_zero_ff1;
90 | reg lsb_dffe;
91 | reg [22:0] man_result_ff;
92 | reg [23:0] man_round_p;
93 | reg [24:0] man_round_p2;
94 | reg round_dffe;
95 | reg [0:0] sign_node_ff0;
96 | reg [0:0] sign_node_ff1;
97 | reg [0:0] sign_node_ff2;
98 | reg [0:0] sign_node_ff3;
99 | reg [0:0] sign_node_ff4;
100 | reg sticky_dffe;
101 | wire [8:0] wire_exp_add_adder_result;
102 | wire [9:0] wire_exp_adj_adder_result;
103 | wire [9:0] wire_exp_bias_subtr_result;
104 | wire [24:0] wire_man_round_adder_result;
105 | wire [47:0] wire_man_product2_mult_result;
106 | wire aclr;
107 | wire [9:0] bias;
108 | wire [7:0] dataa_exp_all_one;
109 | wire [7:0] dataa_exp_not_zero;
110 | wire [22:0] dataa_man_not_zero;
111 | wire [7:0] datab_exp_all_one;
112 | wire [7:0] datab_exp_not_zero;
113 | wire [22:0] datab_man_not_zero;
114 | wire exp_is_inf;
115 | wire exp_is_zero;
116 | wire [9:0] expmod;
117 | wire [7:0] inf_num;
118 | wire lsb_bit;
119 | wire [23:0] man_result_round;
120 | wire [24:0] man_shift_full;
121 | wire [7:0] result_exp_all_one;
122 | wire [8:0] result_exp_not_zero;
123 | wire round_bit;
124 | wire round_carry;
125 | wire [22:0] sticky_bit;
126 |
127 | // synopsys translate_off
128 | initial
129 | dataa_exp_all_one_ff_p1 = 0;
130 | // synopsys translate_on
131 | always @ ( posedge clock or posedge aclr)
132 | if (aclr == 1'b1) dataa_exp_all_one_ff_p1 <= 1'b0;
133 | else if (clk_en == 1'b1) dataa_exp_all_one_ff_p1 <= dataa_exp_all_one[7];
134 | // synopsys translate_off
135 | initial
136 | dataa_exp_not_zero_ff_p1 = 0;
137 | // synopsys translate_on
138 | always @ ( posedge clock or posedge aclr)
139 | if (aclr == 1'b1) dataa_exp_not_zero_ff_p1 <= 1'b0;
140 | else if (clk_en == 1'b1) dataa_exp_not_zero_ff_p1 <= dataa_exp_not_zero[7];
141 | // synopsys translate_off
142 | initial
143 | dataa_man_not_zero_ff_p1 = 0;
144 | // synopsys translate_on
145 | always @ ( posedge clock or posedge aclr)
146 | if (aclr == 1'b1) dataa_man_not_zero_ff_p1 <= 1'b0;
147 | else if (clk_en == 1'b1) dataa_man_not_zero_ff_p1 <= dataa_man_not_zero[10];
148 | // synopsys translate_off
149 | initial
150 | dataa_man_not_zero_ff_p2 = 0;
151 | // synopsys translate_on
152 | always @ ( posedge clock or posedge aclr)
153 | if (aclr == 1'b1) dataa_man_not_zero_ff_p2 <= 1'b0;
154 | else if (clk_en == 1'b1) dataa_man_not_zero_ff_p2 <= dataa_man_not_zero[22];
155 | // synopsys translate_off
156 | initial
157 | datab_exp_all_one_ff_p1 = 0;
158 | // synopsys translate_on
159 | always @ ( posedge clock or posedge aclr)
160 | if (aclr == 1'b1) datab_exp_all_one_ff_p1 <= 1'b0;
161 | else if (clk_en == 1'b1) datab_exp_all_one_ff_p1 <= datab_exp_all_one[7];
162 | // synopsys translate_off
163 | initial
164 | datab_exp_not_zero_ff_p1 = 0;
165 | // synopsys translate_on
166 | always @ ( posedge clock or posedge aclr)
167 | if (aclr == 1'b1) datab_exp_not_zero_ff_p1 <= 1'b0;
168 | else if (clk_en == 1'b1) datab_exp_not_zero_ff_p1 <= datab_exp_not_zero[7];
169 | // synopsys translate_off
170 | initial
171 | datab_man_not_zero_ff_p1 = 0;
172 | // synopsys translate_on
173 | always @ ( posedge clock or posedge aclr)
174 | if (aclr == 1'b1) datab_man_not_zero_ff_p1 <= 1'b0;
175 | else if (clk_en == 1'b1) datab_man_not_zero_ff_p1 <= datab_man_not_zero[10];
176 | // synopsys translate_off
177 | initial
178 | datab_man_not_zero_ff_p2 = 0;
179 | // synopsys translate_on
180 | always @ ( posedge clock or posedge aclr)
181 | if (aclr == 1'b1) datab_man_not_zero_ff_p2 <= 1'b0;
182 | else if (clk_en == 1'b1) datab_man_not_zero_ff_p2 <= datab_man_not_zero[22];
183 | // synopsys translate_off
184 | initial
185 | delay_exp2_bias = 0;
186 | // synopsys translate_on
187 | always @ ( posedge clock or posedge aclr)
188 | if (aclr == 1'b1) delay_exp2_bias <= 10'b0;
189 | else if (clk_en == 1'b1) delay_exp2_bias <= delay_exp_bias;
190 | // synopsys translate_off
191 | initial
192 | delay_exp_bias = 0;
193 | // synopsys translate_on
194 | always @ ( posedge clock or posedge aclr)
195 | if (aclr == 1'b1) delay_exp_bias <= 10'b0;
196 | else if (clk_en == 1'b1) delay_exp_bias <= wire_exp_bias_subtr_result;
197 | // synopsys translate_off
198 | initial
199 | delay_man_product_msb = 0;
200 | // synopsys translate_on
201 | always @ ( posedge clock or posedge aclr)
202 | if (aclr == 1'b1) delay_man_product_msb <= 1'b0;
203 | else if (clk_en == 1'b1) delay_man_product_msb <= delay_man_product_msb_p0;
204 | // synopsys translate_off
205 | initial
206 | delay_man_product_msb_p0 = 0;
207 | // synopsys translate_on
208 | always @ ( posedge clock or posedge aclr)
209 | if (aclr == 1'b1) delay_man_product_msb_p0 <= 1'b0;
210 | else if (clk_en == 1'b1) delay_man_product_msb_p0 <= wire_man_product2_mult_result[47];
211 | // synopsys translate_off
212 | initial
213 | exp_add_p1 = 0;
214 | // synopsys translate_on
215 | always @ ( posedge clock or posedge aclr)
216 | if (aclr == 1'b1) exp_add_p1 <= 9'b0;
217 | else if (clk_en == 1'b1) exp_add_p1 <= wire_exp_add_adder_result;
218 | // synopsys translate_off
219 | initial
220 | exp_result_ff = 0;
221 | // synopsys translate_on
222 | always @ ( posedge clock or posedge aclr)
223 | if (aclr == 1'b1) exp_result_ff <= 8'b0;
224 | else if (clk_en == 1'b1) exp_result_ff <= ((inf_num & {8{((exp_is_inf | input_is_infinity_ff1) | input_is_nan_ff1)}}) | ((wire_exp_adj_adder_result[7:0] & {8{(~ exp_is_zero)}}) & {8{input_not_zero_ff1}}));
225 | // synopsys translate_off
226 | initial
227 | input_is_infinity_dffe_0 = 0;
228 | // synopsys translate_on
229 | always @ ( posedge clock or posedge aclr)
230 | if (aclr == 1'b1) input_is_infinity_dffe_0 <= 1'b0;
231 | else if (clk_en == 1'b1) input_is_infinity_dffe_0 <= ((dataa_exp_all_one_ff_p1 & (~ (dataa_man_not_zero_ff_p1 | dataa_man_not_zero_ff_p2))) | (datab_exp_all_one_ff_p1 & (~ (datab_man_not_zero_ff_p1 | datab_man_not_zero_ff_p2))));
232 | // synopsys translate_off
233 | initial
234 | input_is_infinity_dffe_1 = 0;
235 | // synopsys translate_on
236 | always @ ( posedge clock or posedge aclr)
237 | if (aclr == 1'b1) input_is_infinity_dffe_1 <= 1'b0;
238 | else if (clk_en == 1'b1) input_is_infinity_dffe_1 <= input_is_infinity_dffe_0;
239 | // synopsys translate_off
240 | initial
241 | input_is_infinity_ff1 = 0;
242 | // synopsys translate_on
243 | always @ ( posedge clock or posedge aclr)
244 | if (aclr == 1'b1) input_is_infinity_ff1 <= 1'b0;
245 | else if (clk_en == 1'b1) input_is_infinity_ff1 <= input_is_infinity_dffe_1;
246 | // synopsys translate_off
247 | initial
248 | input_is_nan_dffe_0 = 0;
249 | // synopsys translate_on
250 | always @ ( posedge clock or posedge aclr)
251 | if (aclr == 1'b1) input_is_nan_dffe_0 <= 1'b0;
252 | else if (clk_en == 1'b1) input_is_nan_dffe_0 <= ((dataa_exp_all_one_ff_p1 & (dataa_man_not_zero_ff_p1 | dataa_man_not_zero_ff_p2)) | (datab_exp_all_one_ff_p1 & (datab_man_not_zero_ff_p1 | datab_man_not_zero_ff_p2)));
253 | // synopsys translate_off
254 | initial
255 | input_is_nan_dffe_1 = 0;
256 | // synopsys translate_on
257 | always @ ( posedge clock or posedge aclr)
258 | if (aclr == 1'b1) input_is_nan_dffe_1 <= 1'b0;
259 | else if (clk_en == 1'b1) input_is_nan_dffe_1 <= input_is_nan_dffe_0;
260 | // synopsys translate_off
261 | initial
262 | input_is_nan_ff1 = 0;
263 | // synopsys translate_on
264 | always @ ( posedge clock or posedge aclr)
265 | if (aclr == 1'b1) input_is_nan_ff1 <= 1'b0;
266 | else if (clk_en == 1'b1) input_is_nan_ff1 <= input_is_nan_dffe_1;
267 | // synopsys translate_off
268 | initial
269 | input_not_zero_dffe_0 = 0;
270 | // synopsys translate_on
271 | always @ ( posedge clock or posedge aclr)
272 | if (aclr == 1'b1) input_not_zero_dffe_0 <= 1'b0;
273 | else if (clk_en == 1'b1) input_not_zero_dffe_0 <= (dataa_exp_not_zero_ff_p1 & datab_exp_not_zero_ff_p1);
274 | // synopsys translate_off
275 | initial
276 | input_not_zero_dffe_1 = 0;
277 | // synopsys translate_on
278 | always @ ( posedge clock or posedge aclr)
279 | if (aclr == 1'b1) input_not_zero_dffe_1 <= 1'b0;
280 | else if (clk_en == 1'b1) input_not_zero_dffe_1 <= input_not_zero_dffe_0;
281 | // synopsys translate_off
282 | initial
283 | input_not_zero_ff1 = 0;
284 | // synopsys translate_on
285 | always @ ( posedge clock or posedge aclr)
286 | if (aclr == 1'b1) input_not_zero_ff1 <= 1'b0;
287 | else if (clk_en == 1'b1) input_not_zero_ff1 <= input_not_zero_dffe_1;
288 | // synopsys translate_off
289 | initial
290 | lsb_dffe = 0;
291 | // synopsys translate_on
292 | always @ ( posedge clock or posedge aclr)
293 | if (aclr == 1'b1) lsb_dffe <= 1'b0;
294 | else if (clk_en == 1'b1) lsb_dffe <= lsb_bit;
295 | // synopsys translate_off
296 | initial
297 | man_result_ff = 0;
298 | // synopsys translate_on
299 | always @ ( posedge clock or posedge aclr)
300 | if (aclr == 1'b1) man_result_ff <= 23'b0;
301 | else if (clk_en == 1'b1) man_result_ff <= {((((((man_result_round[22] & input_not_zero_ff1) & (~ input_is_infinity_ff1)) & (~ exp_is_inf)) & (~ exp_is_zero)) | (input_is_infinity_ff1 & (~ input_not_zero_ff1))) | input_is_nan_ff1), (((((man_result_round[21:0] & {22{input_not_zero_ff1}}) & {22{(~ input_is_infinity_ff1)}}) & {22{(~ exp_is_inf)}}) & {22{(~ exp_is_zero)}}) & {22{(~ input_is_nan_ff1)}})};
302 | // synopsys translate_off
303 | initial
304 | man_round_p = 0;
305 | // synopsys translate_on
306 | always @ ( posedge clock or posedge aclr)
307 | if (aclr == 1'b1) man_round_p <= 24'b0;
308 | else if (clk_en == 1'b1) man_round_p <= man_shift_full[24:1];
309 | // synopsys translate_off
310 | initial
311 | man_round_p2 = 0;
312 | // synopsys translate_on
313 | always @ ( posedge clock or posedge aclr)
314 | if (aclr == 1'b1) man_round_p2 <= 25'b0;
315 | else if (clk_en == 1'b1) man_round_p2 <= wire_man_round_adder_result;
316 | // synopsys translate_off
317 | initial
318 | round_dffe = 0;
319 | // synopsys translate_on
320 | always @ ( posedge clock or posedge aclr)
321 | if (aclr == 1'b1) round_dffe <= 1'b0;
322 | else if (clk_en == 1'b1) round_dffe <= round_bit;
323 | // synopsys translate_off
324 | initial
325 | sign_node_ff0 = 0;
326 | // synopsys translate_on
327 | always @ ( posedge clock or posedge aclr)
328 | if (aclr == 1'b1) sign_node_ff0 <= 1'b0;
329 | else if (clk_en == 1'b1) sign_node_ff0 <= (dataa[31] ^ datab[31]);
330 | // synopsys translate_off
331 | initial
332 | sign_node_ff1 = 0;
333 | // synopsys translate_on
334 | always @ ( posedge clock or posedge aclr)
335 | if (aclr == 1'b1) sign_node_ff1 <= 1'b0;
336 | else if (clk_en == 1'b1) sign_node_ff1 <= sign_node_ff0[0:0];
337 | // synopsys translate_off
338 | initial
339 | sign_node_ff2 = 0;
340 | // synopsys translate_on
341 | always @ ( posedge clock or posedge aclr)
342 | if (aclr == 1'b1) sign_node_ff2 <= 1'b0;
343 | else if (clk_en == 1'b1) sign_node_ff2 <= sign_node_ff1[0:0];
344 | // synopsys translate_off
345 | initial
346 | sign_node_ff3 = 0;
347 | // synopsys translate_on
348 | always @ ( posedge clock or posedge aclr)
349 | if (aclr == 1'b1) sign_node_ff3 <= 1'b0;
350 | else if (clk_en == 1'b1) sign_node_ff3 <= sign_node_ff2[0:0];
351 | // synopsys translate_off
352 | initial
353 | sign_node_ff4 = 0;
354 | // synopsys translate_on
355 | always @ ( posedge clock or posedge aclr)
356 | if (aclr == 1'b1) sign_node_ff4 <= 1'b0;
357 | else if (clk_en == 1'b1) sign_node_ff4 <= sign_node_ff3[0:0];
358 | // synopsys translate_off
359 | initial
360 | sticky_dffe = 0;
361 | // synopsys translate_on
362 | always @ ( posedge clock or posedge aclr)
363 | if (aclr == 1'b1) sticky_dffe <= 1'b0;
364 | else if (clk_en == 1'b1) sticky_dffe <= sticky_bit[22];
365 | lpm_add_sub exp_add_adder
366 | (
367 | .aclr(aclr),
368 | .cin(1'b0),
369 | .clken(clk_en),
370 | .clock(clock),
371 | .cout(),
372 | .dataa({1'b0, dataa[30:23]}),
373 | .datab({1'b0, datab[30:23]}),
374 | .overflow(),
375 | .result(wire_exp_add_adder_result)
376 | `ifndef FORMAL_VERIFICATION
377 | // synopsys translate_off
378 | `endif
379 | ,
380 | .add_sub(1'b1)
381 | `ifndef FORMAL_VERIFICATION
382 | // synopsys translate_on
383 | `endif
384 | );
385 | defparam
386 | exp_add_adder.lpm_pipeline = 1,
387 | exp_add_adder.lpm_width = 9,
388 | exp_add_adder.lpm_type = "lpm_add_sub";
389 | lpm_add_sub exp_adj_adder
390 | (
391 | .cin(1'b0),
392 | .cout(),
393 | .dataa(delay_exp2_bias),
394 | .datab(expmod),
395 | .overflow(),
396 | .result(wire_exp_adj_adder_result)
397 | `ifndef FORMAL_VERIFICATION
398 | // synopsys translate_off
399 | `endif
400 | ,
401 | .aclr(1'b0),
402 | .add_sub(1'b1),
403 | .clken(1'b1),
404 | .clock(1'b0)
405 | `ifndef FORMAL_VERIFICATION
406 | // synopsys translate_on
407 | `endif
408 | );
409 | defparam
410 | exp_adj_adder.lpm_width = 10,
411 | exp_adj_adder.lpm_type = "lpm_add_sub";
412 | lpm_add_sub exp_bias_subtr
413 | (
414 | .cout(),
415 | .dataa({1'b0, exp_add_p1[8:0]}),
416 | .datab({bias[9:0]}),
417 | .overflow(),
418 | .result(wire_exp_bias_subtr_result)
419 | `ifndef FORMAL_VERIFICATION
420 | // synopsys translate_off
421 | `endif
422 | ,
423 | .aclr(1'b0),
424 | .add_sub(1'b1),
425 | .cin(),
426 | .clken(1'b1),
427 | .clock(1'b0)
428 | `ifndef FORMAL_VERIFICATION
429 | // synopsys translate_on
430 | `endif
431 | );
432 | defparam
433 | exp_bias_subtr.lpm_direction = "SUB",
434 | exp_bias_subtr.lpm_pipeline = 0,
435 | exp_bias_subtr.lpm_representation = "UNSIGNED",
436 | exp_bias_subtr.lpm_width = 10,
437 | exp_bias_subtr.lpm_type = "lpm_add_sub";
438 | lpm_add_sub man_round_adder
439 | (
440 | .cout(),
441 | .dataa({1'b0, man_round_p}),
442 | .datab({{24{1'b0}}, round_carry}),
443 | .overflow(),
444 | .result(wire_man_round_adder_result)
445 | `ifndef FORMAL_VERIFICATION
446 | // synopsys translate_off
447 | `endif
448 | ,
449 | .aclr(1'b0),
450 | .add_sub(1'b1),
451 | .cin(),
452 | .clken(1'b1),
453 | .clock(1'b0)
454 | `ifndef FORMAL_VERIFICATION
455 | // synopsys translate_on
456 | `endif
457 | );
458 | defparam
459 | man_round_adder.lpm_pipeline = 0,
460 | man_round_adder.lpm_width = 25,
461 | man_round_adder.lpm_type = "lpm_add_sub";
462 | lpm_mult man_product2_mult
463 | (
464 | .aclr(aclr),
465 | .clken(clk_en),
466 | .clock(clock),
467 | .dataa({1'b1, dataa[22:0]}),
468 | .datab({1'b1, datab[22:0]}),
469 | .result(wire_man_product2_mult_result)
470 | `ifndef FORMAL_VERIFICATION
471 | // synopsys translate_off
472 | `endif
473 | ,
474 | .sum({1{1'b0}})
475 | `ifndef FORMAL_VERIFICATION
476 | // synopsys translate_on
477 | `endif
478 | );
479 | defparam
480 | man_product2_mult.lpm_pipeline = 2,
481 | man_product2_mult.lpm_representation = "UNSIGNED",
482 | man_product2_mult.lpm_widtha = 24,
483 | man_product2_mult.lpm_widthb = 24,
484 | man_product2_mult.lpm_widthp = 48,
485 | man_product2_mult.lpm_widths = 1,
486 | man_product2_mult.lpm_type = "lpm_mult",
487 | man_product2_mult.lpm_hint = "DEDICATED_MULTIPLIER_CIRCUITRY=YES";
488 | assign
489 | aclr = 1'b0,
490 | bias = {{3{1'b0}}, {7{1'b1}}},
491 | dataa_exp_all_one = {(dataa[30] & dataa_exp_all_one[6]), (dataa[29] & dataa_exp_all_one[5]), (dataa[28] & dataa_exp_all_one[4]), (dataa[27] & dataa_exp_all_one[3]), (dataa[26] & dataa_exp_all_one[2]), (dataa[25] & dataa_exp_all_one[1]), (dataa[24] & dataa_exp_all_one[0]), dataa[23]},
492 | dataa_exp_not_zero = {(dataa[30] | dataa_exp_not_zero[6]), (dataa[29] | dataa_exp_not_zero[5]), (dataa[28] | dataa_exp_not_zero[4]), (dataa[27] | dataa_exp_not_zero[3]), (dataa[26] | dataa_exp_not_zero[2]), (dataa[25] | dataa_exp_not_zero[1]), (dataa[24] | dataa_exp_not_zero[0]), dataa[23]},
493 | dataa_man_not_zero = {(dataa[22] | dataa_man_not_zero[21]), (dataa[21] | dataa_man_not_zero[20]), (dataa[20] | dataa_man_not_zero[19]), (dataa[19] | dataa_man_not_zero[18]), (dataa[18] | dataa_man_not_zero[17]), (dataa[17] | dataa_man_not_zero[16]), (dataa[16] | dataa_man_not_zero[15]), (dataa[15] | dataa_man_not_zero[14]), (dataa[14] | dataa_man_not_zero[13]), (dataa[13] | dataa_man_not_zero[12]), (dataa[12] | dataa_man_not_zero[11]), dataa[11], (dataa[10] | dataa_man_not_zero[9]), (dataa[9] | dataa_man_not_zero[8]), (dataa[8] | dataa_man_not_zero[7]), (dataa[7] | dataa_man_not_zero[6]), (dataa[6] | dataa_man_not_zero[5]), (dataa[5] | dataa_man_not_zero[4]), (dataa[4] | dataa_man_not_zero[3]), (dataa[3] | dataa_man_not_zero[2]), (dataa[2] | dataa_man_not_zero[1]), (dataa[1] | dataa_man_not_zero[0]), dataa[0]},
494 | datab_exp_all_one = {(datab[30] & datab_exp_all_one[6]), (datab[29] & datab_exp_all_one[5]), (datab[28] & datab_exp_all_one[4]), (datab[27] & datab_exp_all_one[3]), (datab[26] & datab_exp_all_one[2]), (datab[25] & datab_exp_all_one[1]), (datab[24] & datab_exp_all_one[0]), datab[23]},
495 | datab_exp_not_zero = {(datab[30] | datab_exp_not_zero[6]), (datab[29] | datab_exp_not_zero[5]), (datab[28] | datab_exp_not_zero[4]), (datab[27] | datab_exp_not_zero[3]), (datab[26] | datab_exp_not_zero[2]), (datab[25] | datab_exp_not_zero[1]), (datab[24] | datab_exp_not_zero[0]), datab[23]},
496 | datab_man_not_zero = {(datab[22] | datab_man_not_zero[21]), (datab[21] | datab_man_not_zero[20]), (datab[20] | datab_man_not_zero[19]), (datab[19] | datab_man_not_zero[18]), (datab[18] | datab_man_not_zero[17]), (datab[17] | datab_man_not_zero[16]), (datab[16] | datab_man_not_zero[15]), (datab[15] | datab_man_not_zero[14]), (datab[14] | datab_man_not_zero[13]), (datab[13] | datab_man_not_zero[12]), (datab[12] | datab_man_not_zero[11]), datab[11], (datab[10] | datab_man_not_zero[9]), (datab[9] | datab_man_not_zero[8]), (datab[8] | datab_man_not_zero[7]), (datab[7] | datab_man_not_zero[6]), (datab[6] | datab_man_not_zero[5]), (datab[5] | datab_man_not_zero[4]), (datab[4] | datab_man_not_zero[3]), (datab[3] | datab_man_not_zero[2]), (datab[2] | datab_man_not_zero[1]), (datab[1] | datab_man_not_zero[0]), datab[0]},
497 | exp_is_inf = (((~ wire_exp_adj_adder_result[9]) & wire_exp_adj_adder_result[8]) | ((~ wire_exp_adj_adder_result[8]) & result_exp_all_one[7])),
498 | exp_is_zero = (wire_exp_adj_adder_result[9] | (~ result_exp_not_zero[8])),
499 | expmod = {{8{1'b0}}, (delay_man_product_msb & man_round_p2[24]), (delay_man_product_msb ^ man_round_p2[24])},
500 | inf_num = {8{1'b1}},
501 | lsb_bit = man_shift_full[1],
502 | man_result_round = ((man_round_p2[23:0] & {24{(~ man_round_p2[24])}}) | (man_round_p2[24:1] & {24{man_round_p2[24]}})),
503 | man_shift_full = ((wire_man_product2_mult_result[46:22] & {25{(~ wire_man_product2_mult_result[47])}}) | (wire_man_product2_mult_result[47:23] & {25{wire_man_product2_mult_result[47]}})),
504 | result = {sign_node_ff4[0:0], exp_result_ff[7:0], man_result_ff[22:0]},
505 | result_exp_all_one = {(result_exp_all_one[6] & wire_exp_adj_adder_result[7]), (result_exp_all_one[5] & wire_exp_adj_adder_result[6]), (result_exp_all_one[4] & wire_exp_adj_adder_result[5]), (result_exp_all_one[3] & wire_exp_adj_adder_result[4]), (result_exp_all_one[2] & wire_exp_adj_adder_result[3]), (result_exp_all_one[1] & wire_exp_adj_adder_result[2]), (result_exp_all_one[0] & wire_exp_adj_adder_result[1]), wire_exp_adj_adder_result[0]},
506 | result_exp_not_zero = {(result_exp_not_zero[7] | wire_exp_adj_adder_result[8]), (result_exp_not_zero[6] | wire_exp_adj_adder_result[7]), (result_exp_not_zero[5] | wire_exp_adj_adder_result[6]), (result_exp_not_zero[4] | wire_exp_adj_adder_result[5]), (result_exp_not_zero[3] | wire_exp_adj_adder_result[4]), (result_exp_not_zero[2] | wire_exp_adj_adder_result[3]), (result_exp_not_zero[1] | wire_exp_adj_adder_result[2]), (result_exp_not_zero[0] | wire_exp_adj_adder_result[1]), wire_exp_adj_adder_result[0]},
507 | round_bit = man_shift_full[0],
508 | round_carry = (round_dffe & (lsb_dffe | sticky_dffe)),
509 | sticky_bit = {(sticky_bit[21] | (wire_man_product2_mult_result[47] & wire_man_product2_mult_result[22])), (sticky_bit[20] | wire_man_product2_mult_result[21]), (sticky_bit[19] | wire_man_product2_mult_result[20]), (sticky_bit[18] | wire_man_product2_mult_result[19]), (sticky_bit[17] | wire_man_product2_mult_result[18]), (sticky_bit[16] | wire_man_product2_mult_result[17]), (sticky_bit[15] | wire_man_product2_mult_result[16]), (sticky_bit[14] | wire_man_product2_mult_result[15]), (sticky_bit[13] | wire_man_product2_mult_result[14]), (sticky_bit[12] | wire_man_product2_mult_result[13]), (sticky_bit[11] | wire_man_product2_mult_result[12]), (sticky_bit[10] | wire_man_product2_mult_result[11]), (sticky_bit[9] | wire_man_product2_mult_result[10]), (sticky_bit[8] | wire_man_product2_mult_result[9]), (sticky_bit[7] | wire_man_product2_mult_result[8]), (sticky_bit[6] | wire_man_product2_mult_result[7]), (sticky_bit[5] | wire_man_product2_mult_result[6]), (sticky_bit[4] | wire_man_product2_mult_result[5]), (sticky_bit[3] | wire_man_product2_mult_result[4]), (sticky_bit[2] | wire_man_product2_mult_result[3]), (sticky_bit[1] | wire_man_product2_mult_result[2]), (sticky_bit[0] | wire_man_product2_mult_result[1]), wire_man_product2_mult_result[0]};
510 | endmodule //float_mult_altfp_mult_t9o
511 | //VALID FILE
512 |
513 |
514 | // synopsys translate_off
515 | `timescale 1 ps / 1 ps
516 | // synopsys translate_on
517 | module float_mult (
518 | clk_en,
519 | clock,
520 | dataa,
521 | datab,
522 | result);
523 |
524 | input clk_en;
525 | input clock;
526 | input [31:0] dataa;
527 | input [31:0] datab;
528 | output [31:0] result;
529 |
530 | wire [31:0] sub_wire0;
531 | wire [31:0] result = sub_wire0[31:0];
532 |
533 | float_mult_altfp_mult_t9o float_mult_altfp_mult_t9o_component (
534 | .clk_en (clk_en),
535 | .clock (clock),
536 | .dataa (dataa),
537 | .datab (datab),
538 | .result (sub_wire0));
539 |
540 | endmodule
541 |
542 | // ============================================================
543 | // CNX file retrieval info
544 | // ============================================================
545 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
546 | // Retrieval info: PRIVATE: FPM_FORMAT STRING "Single"
547 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
548 | // Retrieval info: CONSTANT: DEDICATED_MULTIPLIER_CIRCUITRY STRING "YES"
549 | // Retrieval info: CONSTANT: DENORMAL_SUPPORT STRING "NO"
550 | // Retrieval info: CONSTANT: EXCEPTION_HANDLING STRING "NO"
551 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "UNUSED"
552 | // Retrieval info: CONSTANT: LPM_HINT STRING "UNUSED"
553 | // Retrieval info: CONSTANT: LPM_TYPE STRING "altfp_mult"
554 | // Retrieval info: CONSTANT: PIPELINE NUMERIC "5"
555 | // Retrieval info: CONSTANT: REDUCED_FUNCTIONALITY STRING "NO"
556 | // Retrieval info: CONSTANT: ROUNDING STRING "TO_NEAREST"
557 | // Retrieval info: CONSTANT: WIDTH_EXP NUMERIC "8"
558 | // Retrieval info: CONSTANT: WIDTH_MAN NUMERIC "23"
559 | // Retrieval info: USED_PORT: clk_en 0 0 0 0 INPUT NODEFVAL "clk_en"
560 | // Retrieval info: CONNECT: @clk_en 0 0 0 0 clk_en 0 0 0 0
561 | // Retrieval info: USED_PORT: clock 0 0 0 0 INPUT NODEFVAL "clock"
562 | // Retrieval info: CONNECT: @clock 0 0 0 0 clock 0 0 0 0
563 | // Retrieval info: USED_PORT: dataa 0 0 32 0 INPUT NODEFVAL "dataa[31..0]"
564 | // Retrieval info: CONNECT: @dataa 0 0 32 0 dataa 0 0 32 0
565 | // Retrieval info: USED_PORT: datab 0 0 32 0 INPUT NODEFVAL "datab[31..0]"
566 | // Retrieval info: CONNECT: @datab 0 0 32 0 datab 0 0 32 0
567 | // Retrieval info: USED_PORT: result 0 0 32 0 OUTPUT NODEFVAL "result[31..0]"
568 | // Retrieval info: CONNECT: result 0 0 32 0 @result 0 0 32 0
569 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.v TRUE FALSE
570 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.qip TRUE FALSE
571 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.bsf TRUE TRUE
572 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_inst.v TRUE TRUE
573 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_bb.v TRUE TRUE
574 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.inc TRUE TRUE
575 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.cmp TRUE TRUE
576 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX NUMERIC "1"
577 | // Retrieval info: LIB_FILE: lpm
578 |
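
Note on the rounding path in float_mult.v above: the assignments to man_shift_full, round_bit, lsb_bit, sticky_bit, round_carry, man_result_round and expmod together appear to implement round-to-nearest-even on the 48-bit significand product. The following is a minimal Python sketch of that combinational logic only, written for checking test vectors by hand. It is an assumption-laden model, not part of the generated IP: the function name normalize_and_round is hypothetical, pipeline registers are ignored, and the NaN/infinity/zero handling in exp_result_ff and man_result_ff is not modeled.

```python
def normalize_and_round(product48):
    """Model the normalize/round step on a 48-bit significand product.

    product48 is the product of two 24-bit significands (hidden 1 included).
    Returns (mantissa23, exp_adjust), where exp_adjust mirrors expmod.
    """
    msb = (product48 >> 47) & 1
    # man_shift_full: product[47:23] when the MSB is set, else product[46:22]
    shift = 23 if msb else 22
    shift_full = (product48 >> shift) & 0x1FFFFFF
    lsb_bit = (shift_full >> 1) & 1
    round_bit = shift_full & 1
    # sticky_bit[22]: OR of product[21:0], plus product[22] when the MSB is set
    sticky_mask = (1 << 23) - 1 if msb else (1 << 22) - 1
    sticky = 1 if (product48 & sticky_mask) else 0
    # round_carry = round & (lsb | sticky), i.e. round half to even
    round_carry = round_bit & (lsb_bit | sticky)
    man_round = (shift_full >> 1) + round_carry  # 25-bit adder result
    carry = (man_round >> 24) & 1
    # man_result_round: drop one more bit if rounding overflowed
    man_result = (man_round >> 1) if carry else man_round
    exp_adjust = msb + carry  # expmod = {msb & carry, msb ^ carry}
    return man_result & 0x7FFFFF, exp_adjust


# Example: 1.5 * 1.5 = 2.25 = 1.125 * 2**1
mantissa, exp_adjust = normalize_and_round(0xC00000 * 0xC00000)
print(hex(mantissa), exp_adjust)
```

The exponent itself (sum of the two biased exponents minus the bias of 127, plus exp_adjust) is left out so that the sketch stays focused on the rounding decision.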
--------------------------------------------------------------------------------
/rtl/qip/float_mult_bb.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %ALTFP_MULT%VBB%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: ALTFP_MULT
5 |
6 | // ============================================================
7 | // File Name: float_mult.v
8 | // Megafunction Name(s):
9 | // ALTFP_MULT
10 | //
11 | // Simulation Library Files(s):
12 | // lpm
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 15.1.1 Build 189 12/02/2015 SJ Standard Edition
18 | // ************************************************************
19 |
20 | //Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
21 | //Your use of Altera Corporation's design tools, logic functions
22 | //and other software and tools, and its AMPP partner logic
23 | //functions, and any output files from any of the foregoing
24 | //(including device programming or simulation files), and any
25 | //associated documentation or information are expressly subject
26 | //to the terms and conditions of the Altera Program License
27 | //Subscription Agreement, the Altera Quartus Prime License Agreement,
28 | //the Altera MegaCore Function License Agreement, or other
29 | //applicable license agreement, including, without limitation,
30 | //that your use is for the sole purpose of programming logic
31 | //devices manufactured by Altera and sold by Altera or its
32 | //authorized distributors. Please refer to the applicable
33 | //agreement for further details.
34 |
35 | module float_mult (
36 | clk_en,
37 | clock,
38 | dataa,
39 | datab,
40 | result)/* synthesis synthesis_clearbox = 1 */;
41 |
42 | input clk_en;
43 | input clock;
44 | input [31:0] dataa;
45 | input [31:0] datab;
46 | output [31:0] result;
47 |
48 | endmodule
49 |
50 | // ============================================================
51 | // CNX file retrieval info
52 | // ============================================================
53 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
54 | // Retrieval info: PRIVATE: FPM_FORMAT STRING "Single"
55 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
56 | // Retrieval info: CONSTANT: DEDICATED_MULTIPLIER_CIRCUITRY STRING "YES"
57 | // Retrieval info: CONSTANT: DENORMAL_SUPPORT STRING "NO"
58 | // Retrieval info: CONSTANT: EXCEPTION_HANDLING STRING "NO"
59 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "UNUSED"
60 | // Retrieval info: CONSTANT: LPM_HINT STRING "UNUSED"
61 | // Retrieval info: CONSTANT: LPM_TYPE STRING "altfp_mult"
62 | // Retrieval info: CONSTANT: PIPELINE NUMERIC "5"
63 | // Retrieval info: CONSTANT: REDUCED_FUNCTIONALITY STRING "NO"
64 | // Retrieval info: CONSTANT: ROUNDING STRING "TO_NEAREST"
65 | // Retrieval info: CONSTANT: WIDTH_EXP NUMERIC "8"
66 | // Retrieval info: CONSTANT: WIDTH_MAN NUMERIC "23"
67 | // Retrieval info: USED_PORT: clk_en 0 0 0 0 INPUT NODEFVAL "clk_en"
68 | // Retrieval info: CONNECT: @clk_en 0 0 0 0 clk_en 0 0 0 0
69 | // Retrieval info: USED_PORT: clock 0 0 0 0 INPUT NODEFVAL "clock"
70 | // Retrieval info: CONNECT: @clock 0 0 0 0 clock 0 0 0 0
71 | // Retrieval info: USED_PORT: dataa 0 0 32 0 INPUT NODEFVAL "dataa[31..0]"
72 | // Retrieval info: CONNECT: @dataa 0 0 32 0 dataa 0 0 32 0
73 | // Retrieval info: USED_PORT: datab 0 0 32 0 INPUT NODEFVAL "datab[31..0]"
74 | // Retrieval info: CONNECT: @datab 0 0 32 0 datab 0 0 32 0
75 | // Retrieval info: USED_PORT: result 0 0 32 0 OUTPUT NODEFVAL "result[31..0]"
76 | // Retrieval info: CONNECT: result 0 0 32 0 @result 0 0 32 0
77 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.v TRUE FALSE
78 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.qip TRUE FALSE
79 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.bsf TRUE TRUE
80 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_inst.v TRUE TRUE
81 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_bb.v TRUE TRUE
82 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.inc TRUE TRUE
83 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.cmp TRUE TRUE
84 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX NUMERIC "1"
85 | // Retrieval info: LIB_FILE: lpm
86 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult_inst.v:
--------------------------------------------------------------------------------
1 | float_mult float_mult_inst (
2 | .clk_en ( clk_en_sig ),
3 | .clock ( clock_sig ),
4 | .dataa ( dataa_sig ),
5 | .datab ( datab_sig ),
6 | .result ( result_sig )
7 | );
8 |
--------------------------------------------------------------------------------
/rtl/qip/float_mult_syn.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %ALTFP_MULT%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: ALTFP_MULT
5 |
6 | // ============================================================
7 | // File Name: float_mult.v
8 | // Megafunction Name(s):
9 | // ALTFP_MULT
10 | //
11 | // Simulation Library Files(s):
12 | // lpm
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 15.1.1 Build 189 12/02/2015 SJ Standard Edition
18 | // ************************************************************
19 |
20 |
21 | //Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
22 | //Your use of Altera Corporation's design tools, logic functions
23 | //and other software and tools, and its AMPP partner logic
24 | //functions, and any output files from any of the foregoing
25 | //(including device programming or simulation files), and any
26 | //associated documentation or information are expressly subject
27 | //to the terms and conditions of the Altera Program License
28 | //Subscription Agreement, the Altera Quartus Prime License Agreement,
29 | //the Altera MegaCore Function License Agreement, or other
30 | //applicable license agreement, including, without limitation,
31 | //that your use is for the sole purpose of programming logic
32 | //devices manufactured by Altera and sold by Altera or its
33 | //authorized distributors. Please refer to the applicable
34 | //agreement for further details.
35 |
36 |
37 | //altfp_mult DEDICATED_MULTIPLIER_CIRCUITRY="YES" DENORMAL_SUPPORT="NO" DEVICE_FAMILY="Stratix V" EXCEPTION_HANDLING="NO" PIPELINE=5 REDUCED_FUNCTIONALITY="NO" ROUNDING="TO_NEAREST" WIDTH_EXP=8 WIDTH_MAN=23 clk_en clock dataa datab result
38 | //VERSION_BEGIN 15.1 cbx_alt_ded_mult_y 2015:11:24:18:49:55:SJ cbx_altbarrel_shift 2015:11:24:18:49:55:SJ cbx_altera_mult_add 2015:11:24:18:49:55:SJ cbx_altera_mult_add_rtl 2015:11:24:18:49:55:SJ cbx_altfp_mult 2015:11:24:18:49:55:SJ cbx_altmult_add 2015:11:24:18:49:55:SJ cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_lpm_compare 2015:11:24:18:49:55:SJ cbx_lpm_mult 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_padd 2015:11:24:18:49:55:SJ cbx_parallel_add 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ cbx_util_mgl 2015:11:24:18:49:55:SJ VERSION_END
39 | // synthesis VERILOG_INPUT_VERSION VERILOG_2001
40 | // altera message_off 10463
41 |
42 |
43 |
44 | //lpm_add_sub DEVICE_FAMILY="Stratix V" LPM_PIPELINE=1 LPM_WIDTH=9 aclr cin clken clock dataa datab result
45 | //VERSION_BEGIN 15.1 cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ VERSION_END
46 |
47 |
48 | //lpm_add_sub DEVICE_FAMILY="Stratix V" LPM_WIDTH=10 cin dataa datab result
49 | //VERSION_BEGIN 15.1 cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ VERSION_END
50 |
51 |
52 | //lpm_add_sub DEVICE_FAMILY="Stratix V" LPM_DIRECTION="SUB" LPM_PIPELINE=0 LPM_REPRESENTATION="UNSIGNED" LPM_WIDTH=10 dataa datab result
53 | //VERSION_BEGIN 15.1 cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ VERSION_END
54 |
55 |
56 | //lpm_add_sub DEVICE_FAMILY="Stratix V" LPM_PIPELINE=0 LPM_WIDTH=25 dataa datab result
57 | //VERSION_BEGIN 15.1 cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ VERSION_END
58 |
59 |
60 | //lpm_mult DEDICATED_MULTIPLIER_CIRCUITRY="YES" DEVICE_FAMILY="Stratix V" LPM_PIPELINE=2 LPM_REPRESENTATION="UNSIGNED" LPM_WIDTHA=24 LPM_WIDTHB=24 LPM_WIDTHP=48 LPM_WIDTHS=1 aclr clken clock dataa datab result
61 | //VERSION_BEGIN 15.1 cbx_cycloneii 2015:11:24:18:49:55:SJ cbx_lpm_add_sub 2015:11:24:18:49:55:SJ cbx_lpm_mult 2015:11:24:18:49:55:SJ cbx_mgl 2015:11:24:20:43:33:SJ cbx_nadder 2015:11:24:18:49:55:SJ cbx_padd 2015:11:24:18:49:55:SJ cbx_stratix 2015:11:24:18:49:55:SJ cbx_stratixii 2015:11:24:18:49:55:SJ cbx_util_mgl 2015:11:24:18:49:55:SJ VERSION_END
62 |
63 | //synthesis_resources =
64 | //synopsys translate_off
65 | `timescale 1 ps / 1 ps
66 | //synopsys translate_on
67 | module float_mult_mult
68 | (
69 | aclr,
70 | clken,
71 | clock,
72 | dataa,
73 | datab,
74 | result) /* synthesis synthesis_clearbox=1 */;
75 | input aclr;
76 | input clken;
77 | input clock;
78 | input [23:0] dataa;
79 | input [23:0] datab;
80 | output [47:0] result;
81 | `ifndef ALTERA_RESERVED_QIS
82 | // synopsys translate_off
83 | `endif
84 | tri0 aclr;
85 | tri1 clken;
86 | tri0 clock;
87 | `ifndef ALTERA_RESERVED_QIS
88 | // synopsys translate_on
89 | `endif
90 |
91 | reg [23:0] dataa_input_reg;
92 | reg [23:0] datab_input_reg;
93 | reg [47:0] result_output_reg;
94 | wire [23:0] dataa_wire;
95 | wire [23:0] datab_wire;
96 | wire [47:0] result_wire;
97 |
98 |
99 | // synopsys translate_off
100 | initial
101 | dataa_input_reg = 0;
102 | // synopsys translate_on
103 | always @(posedge clock or posedge aclr)
104 | if (aclr == 1'b1) dataa_input_reg <= 24'b0;
105 | else if (clken == 1'b1) dataa_input_reg <= dataa;
106 | // synopsys translate_off
107 | initial
108 | datab_input_reg = 0;
109 | // synopsys translate_on
110 | always @(posedge clock or posedge aclr)
111 | if (aclr == 1'b1) datab_input_reg <= 24'b0;
112 | else if (clken == 1'b1) datab_input_reg <= datab;
113 | // synopsys translate_off
114 | initial
115 | result_output_reg = 0;
116 | // synopsys translate_on
117 | always @(posedge clock or posedge aclr)
118 | if (aclr == 1'b1) result_output_reg <= 48'b0;
119 | else if (clken == 1'b1) result_output_reg <= result_wire[47:0];
120 |
121 | assign dataa_wire = dataa_input_reg;
122 | assign datab_wire = datab_input_reg;
123 | assign result_wire = dataa_wire * datab_wire;
124 | assign result = ({result_output_reg});
125 |
126 | endmodule //float_mult_mult
127 |
128 | //synthesis_resources = lut 55 reg 136
129 | //synopsys translate_off
130 | `timescale 1 ps / 1 ps
131 | //synopsys translate_on
132 | module float_mult_altfp_mult
133 | (
134 | clk_en,
135 | clock,
136 | dataa,
137 | datab,
138 | result) /* synthesis synthesis_clearbox=1 */;
139 | input clk_en;
140 | input clock;
141 | input [31:0] dataa;
142 | input [31:0] datab;
143 | output [31:0] result;
144 | `ifndef ALTERA_RESERVED_QIS
145 | // synopsys translate_off
146 | `endif
147 | tri1 clk_en;
148 | `ifndef ALTERA_RESERVED_QIS
149 | // synopsys translate_on
150 | `endif
151 |
152 | reg dataa_exp_all_one_ff_p1;
153 | reg dataa_exp_not_zero_ff_p1;
154 | reg dataa_man_not_zero_ff_p1;
155 | reg dataa_man_not_zero_ff_p2;
156 | reg datab_exp_all_one_ff_p1;
157 | reg datab_exp_not_zero_ff_p1;
158 | reg datab_man_not_zero_ff_p1;
159 | reg datab_man_not_zero_ff_p2;
160 | reg [9:0] delay_exp2_bias;
161 | reg [9:0] delay_exp_bias;
162 | reg delay_man_product_msb;
163 | reg delay_man_product_msb_p0;
164 | reg [8:0] exp_add_p1;
165 | reg [7:0] exp_result_ff;
166 | reg input_is_infinity_dffe_0;
167 | reg input_is_infinity_dffe_1;
168 | reg input_is_infinity_ff1;
169 | reg input_is_nan_dffe_0;
170 | reg input_is_nan_dffe_1;
171 | reg input_is_nan_ff1;
172 | reg input_not_zero_dffe_0;
173 | reg input_not_zero_dffe_1;
174 | reg input_not_zero_ff1;
175 | reg lsb_dffe;
176 | reg [22:0] man_result_ff;
177 | reg [23:0] man_round_p;
178 | reg [24:0] man_round_p2;
179 | reg round_dffe;
180 | reg [0:0] sign_node_ff0;
181 | reg [0:0] sign_node_ff1;
182 | reg [0:0] sign_node_ff2;
183 | reg [0:0] sign_node_ff3;
184 | reg [0:0] sign_node_ff4;
185 | reg sticky_dffe;
186 | (* ALTERA_ATTRIBUTE = {"POWER_UP_LEVEL=LOW"} *)
187 | reg [8:0] wire_exp_add_adder_pipeline_dffe_Q;
188 | wire [8:0] wire_exp_add_adder_pipeline_dffe_D;
189 | wire [9:0] wire_exp_add_adder_result_int;
190 | wire wire_exp_add_adder_aclr;
191 | wire wire_exp_add_adder_cin;
192 | wire wire_exp_add_adder_clken;
193 | wire wire_exp_add_adder_clock;
194 | wire [8:0] wire_exp_add_adder_dataa;
195 | wire [8:0] wire_exp_add_adder_datab;
196 | wire [8:0] wire_exp_add_adder_result;
197 | wire [10:0] wire_exp_adj_adder_result_int;
198 | wire wire_exp_adj_adder_cin;
199 | wire [9:0] wire_exp_adj_adder_dataa;
200 | wire [9:0] wire_exp_adj_adder_datab;
201 | wire [9:0] wire_exp_adj_adder_result;
202 | wire [9:0] wire_exp_bias_subtr_dataa;
203 | wire [9:0] wire_exp_bias_subtr_datab;
204 | wire [9:0] wire_exp_bias_subtr_result;
205 | wire [24:0] wire_man_round_adder_dataa;
206 | wire [24:0] wire_man_round_adder_datab;
207 | wire [24:0] wire_man_round_adder_result;
208 | wire [23:0] wire_man_product2_mult_dataa;
209 | wire [23:0] wire_man_product2_mult_datab;
210 | wire [47:0] wire_man_product2_mult_result;
211 | wire aclr;
212 | wire [9:0] bias;
213 | wire [7:0] dataa_exp_all_one;
214 | wire [7:0] dataa_exp_not_zero;
215 | wire [22:0] dataa_man_not_zero;
216 | wire [7:0] datab_exp_all_one;
217 | wire [7:0] datab_exp_not_zero;
218 | wire [22:0] datab_man_not_zero;
219 | wire exp_is_inf;
220 | wire exp_is_zero;
221 | wire [9:0] expmod;
222 | wire [7:0] inf_num;
223 | wire lsb_bit;
224 | wire [23:0] man_result_round;
225 | wire [24:0] man_shift_full;
226 | wire [7:0] result_exp_all_one;
227 | wire [8:0] result_exp_not_zero;
228 | wire round_bit;
229 | wire round_carry;
230 | wire [22:0] sticky_bit;
231 |
232 | // synopsys translate_off
233 | initial
234 | dataa_exp_all_one_ff_p1 = 0;
235 | // synopsys translate_on
236 | always @ ( posedge clock or posedge aclr)
237 | if (aclr == 1'b1) dataa_exp_all_one_ff_p1 <= 1'b0;
238 | else if (clk_en == 1'b1) dataa_exp_all_one_ff_p1 <= dataa_exp_all_one[7];
239 | // synopsys translate_off
240 | initial
241 | dataa_exp_not_zero_ff_p1 = 0;
242 | // synopsys translate_on
243 | always @ ( posedge clock or posedge aclr)
244 | if (aclr == 1'b1) dataa_exp_not_zero_ff_p1 <= 1'b0;
245 | else if (clk_en == 1'b1) dataa_exp_not_zero_ff_p1 <= dataa_exp_not_zero[7];
246 | // synopsys translate_off
247 | initial
248 | dataa_man_not_zero_ff_p1 = 0;
249 | // synopsys translate_on
250 | always @ ( posedge clock or posedge aclr)
251 | if (aclr == 1'b1) dataa_man_not_zero_ff_p1 <= 1'b0;
252 | else if (clk_en == 1'b1) dataa_man_not_zero_ff_p1 <= dataa_man_not_zero[10];
253 | // synopsys translate_off
254 | initial
255 | dataa_man_not_zero_ff_p2 = 0;
256 | // synopsys translate_on
257 | always @ ( posedge clock or posedge aclr)
258 | if (aclr == 1'b1) dataa_man_not_zero_ff_p2 <= 1'b0;
259 | else if (clk_en == 1'b1) dataa_man_not_zero_ff_p2 <= dataa_man_not_zero[22];
260 | // synopsys translate_off
261 | initial
262 | datab_exp_all_one_ff_p1 = 0;
263 | // synopsys translate_on
264 | always @ ( posedge clock or posedge aclr)
265 | if (aclr == 1'b1) datab_exp_all_one_ff_p1 <= 1'b0;
266 | else if (clk_en == 1'b1) datab_exp_all_one_ff_p1 <= datab_exp_all_one[7];
267 | // synopsys translate_off
268 | initial
269 | datab_exp_not_zero_ff_p1 = 0;
270 | // synopsys translate_on
271 | always @ ( posedge clock or posedge aclr)
272 | if (aclr == 1'b1) datab_exp_not_zero_ff_p1 <= 1'b0;
273 | else if (clk_en == 1'b1) datab_exp_not_zero_ff_p1 <= datab_exp_not_zero[7];
274 | // synopsys translate_off
275 | initial
276 | datab_man_not_zero_ff_p1 = 0;
277 | // synopsys translate_on
278 | always @ ( posedge clock or posedge aclr)
279 | if (aclr == 1'b1) datab_man_not_zero_ff_p1 <= 1'b0;
280 | else if (clk_en == 1'b1) datab_man_not_zero_ff_p1 <= datab_man_not_zero[10];
281 | // synopsys translate_off
282 | initial
283 | datab_man_not_zero_ff_p2 = 0;
284 | // synopsys translate_on
285 | always @ ( posedge clock or posedge aclr)
286 | if (aclr == 1'b1) datab_man_not_zero_ff_p2 <= 1'b0;
287 | else if (clk_en == 1'b1) datab_man_not_zero_ff_p2 <= datab_man_not_zero[22];
288 | // synopsys translate_off
289 | initial
290 | delay_exp2_bias = 0;
291 | // synopsys translate_on
292 | always @ ( posedge clock or posedge aclr)
293 | if (aclr == 1'b1) delay_exp2_bias <= 10'b0;
294 | else if (clk_en == 1'b1) delay_exp2_bias <= delay_exp_bias;
295 | // synopsys translate_off
296 | initial
297 | delay_exp_bias = 0;
298 | // synopsys translate_on
299 | always @ ( posedge clock or posedge aclr)
300 | if (aclr == 1'b1) delay_exp_bias <= 10'b0;
301 | else if (clk_en == 1'b1) delay_exp_bias <= wire_exp_bias_subtr_result;
302 | // synopsys translate_off
303 | initial
304 | delay_man_product_msb = 0;
305 | // synopsys translate_on
306 | always @ ( posedge clock or posedge aclr)
307 | if (aclr == 1'b1) delay_man_product_msb <= 1'b0;
308 | else if (clk_en == 1'b1) delay_man_product_msb <= delay_man_product_msb_p0;
309 | // synopsys translate_off
310 | initial
311 | delay_man_product_msb_p0 = 0;
312 | // synopsys translate_on
313 | always @ ( posedge clock or posedge aclr)
314 | if (aclr == 1'b1) delay_man_product_msb_p0 <= 1'b0;
315 | else if (clk_en == 1'b1) delay_man_product_msb_p0 <= wire_man_product2_mult_result[47];
316 | // synopsys translate_off
317 | initial
318 | exp_add_p1 = 0;
319 | // synopsys translate_on
320 | always @ ( posedge clock or posedge aclr)
321 | if (aclr == 1'b1) exp_add_p1 <= 9'b0;
322 | else if (clk_en == 1'b1) exp_add_p1 <= wire_exp_add_adder_result;
323 | // synopsys translate_off
324 | initial
325 | exp_result_ff = 0;
326 | // synopsys translate_on
327 | always @ ( posedge clock or posedge aclr)
328 | if (aclr == 1'b1) exp_result_ff <= 8'b0;
329 | else if (clk_en == 1'b1) exp_result_ff <= ((inf_num & {8{((exp_is_inf | input_is_infinity_ff1) | input_is_nan_ff1)}}) | ((wire_exp_adj_adder_result[7:0] & {8{(~ exp_is_zero)}}) & {8{input_not_zero_ff1}}));
330 | // synopsys translate_off
331 | initial
332 | input_is_infinity_dffe_0 = 0;
333 | // synopsys translate_on
334 | always @ ( posedge clock or posedge aclr)
335 | if (aclr == 1'b1) input_is_infinity_dffe_0 <= 1'b0;
336 | else if (clk_en == 1'b1) input_is_infinity_dffe_0 <= ((dataa_exp_all_one_ff_p1 & (~ (dataa_man_not_zero_ff_p1 | dataa_man_not_zero_ff_p2))) | (datab_exp_all_one_ff_p1 & (~ (datab_man_not_zero_ff_p1 | datab_man_not_zero_ff_p2))));
337 | // synopsys translate_off
338 | initial
339 | input_is_infinity_dffe_1 = 0;
340 | // synopsys translate_on
341 | always @ ( posedge clock or posedge aclr)
342 | if (aclr == 1'b1) input_is_infinity_dffe_1 <= 1'b0;
343 | else if (clk_en == 1'b1) input_is_infinity_dffe_1 <= input_is_infinity_dffe_0;
344 | // synopsys translate_off
345 | initial
346 | input_is_infinity_ff1 = 0;
347 | // synopsys translate_on
348 | always @ ( posedge clock or posedge aclr)
349 | if (aclr == 1'b1) input_is_infinity_ff1 <= 1'b0;
350 | else if (clk_en == 1'b1) input_is_infinity_ff1 <= input_is_infinity_dffe_1;
351 | // synopsys translate_off
352 | initial
353 | input_is_nan_dffe_0 = 0;
354 | // synopsys translate_on
355 | always @ ( posedge clock or posedge aclr)
356 | if (aclr == 1'b1) input_is_nan_dffe_0 <= 1'b0;
357 | else if (clk_en == 1'b1) input_is_nan_dffe_0 <= ((dataa_exp_all_one_ff_p1 & (dataa_man_not_zero_ff_p1 | dataa_man_not_zero_ff_p2)) | (datab_exp_all_one_ff_p1 & (datab_man_not_zero_ff_p1 | datab_man_not_zero_ff_p2)));
358 | // synopsys translate_off
359 | initial
360 | input_is_nan_dffe_1 = 0;
361 | // synopsys translate_on
362 | always @ ( posedge clock or posedge aclr)
363 | if (aclr == 1'b1) input_is_nan_dffe_1 <= 1'b0;
364 | else if (clk_en == 1'b1) input_is_nan_dffe_1 <= input_is_nan_dffe_0;
365 | // synopsys translate_off
366 | initial
367 | input_is_nan_ff1 = 0;
368 | // synopsys translate_on
369 | always @ ( posedge clock or posedge aclr)
370 | if (aclr == 1'b1) input_is_nan_ff1 <= 1'b0;
371 | else if (clk_en == 1'b1) input_is_nan_ff1 <= input_is_nan_dffe_1;
372 | // synopsys translate_off
373 | initial
374 | input_not_zero_dffe_0 = 0;
375 | // synopsys translate_on
376 | always @ ( posedge clock or posedge aclr)
377 | if (aclr == 1'b1) input_not_zero_dffe_0 <= 1'b0;
378 | else if (clk_en == 1'b1) input_not_zero_dffe_0 <= (dataa_exp_not_zero_ff_p1 & datab_exp_not_zero_ff_p1);
379 | // synopsys translate_off
380 | initial
381 | input_not_zero_dffe_1 = 0;
382 | // synopsys translate_on
383 | always @ ( posedge clock or posedge aclr)
384 | if (aclr == 1'b1) input_not_zero_dffe_1 <= 1'b0;
385 | else if (clk_en == 1'b1) input_not_zero_dffe_1 <= input_not_zero_dffe_0;
386 | // synopsys translate_off
387 | initial
388 | input_not_zero_ff1 = 0;
389 | // synopsys translate_on
390 | always @ ( posedge clock or posedge aclr)
391 | if (aclr == 1'b1) input_not_zero_ff1 <= 1'b0;
392 | else if (clk_en == 1'b1) input_not_zero_ff1 <= input_not_zero_dffe_1;
393 | // synopsys translate_off
394 | initial
395 | lsb_dffe = 0;
396 | // synopsys translate_on
397 | always @ ( posedge clock or posedge aclr)
398 | if (aclr == 1'b1) lsb_dffe <= 1'b0;
399 | else if (clk_en == 1'b1) lsb_dffe <= lsb_bit;
400 | // synopsys translate_off
401 | initial
402 | man_result_ff = 0;
403 | // synopsys translate_on
404 | always @ ( posedge clock or posedge aclr)
405 | if (aclr == 1'b1) man_result_ff <= 23'b0;
406 | else if (clk_en == 1'b1) man_result_ff <= {((((((man_result_round[22] & input_not_zero_ff1) & (~ input_is_infinity_ff1)) & (~ exp_is_inf)) & (~ exp_is_zero)) | (input_is_infinity_ff1 & (~ input_not_zero_ff1))) | input_is_nan_ff1), (((((man_result_round[21:0] & {22{input_not_zero_ff1}}) & {22{(~ input_is_infinity_ff1)}}) & {22{(~ exp_is_inf)}}) & {22{(~ exp_is_zero)}}) & {22{(~ input_is_nan_ff1)}})};
407 | // synopsys translate_off
408 | initial
409 | man_round_p = 0;
410 | // synopsys translate_on
411 | always @ ( posedge clock or posedge aclr)
412 | if (aclr == 1'b1) man_round_p <= 24'b0;
413 | else if (clk_en == 1'b1) man_round_p <= man_shift_full[24:1];
414 | // synopsys translate_off
415 | initial
416 | man_round_p2 = 0;
417 | // synopsys translate_on
418 | always @ ( posedge clock or posedge aclr)
419 | if (aclr == 1'b1) man_round_p2 <= 25'b0;
420 | else if (clk_en == 1'b1) man_round_p2 <= wire_man_round_adder_result;
421 | // synopsys translate_off
422 | initial
423 | round_dffe = 0;
424 | // synopsys translate_on
425 | always @ ( posedge clock or posedge aclr)
426 | if (aclr == 1'b1) round_dffe <= 1'b0;
427 | else if (clk_en == 1'b1) round_dffe <= round_bit;
428 | // synopsys translate_off
429 | initial
430 | sign_node_ff0 = 0;
431 | // synopsys translate_on
432 | always @ ( posedge clock or posedge aclr)
433 | if (aclr == 1'b1) sign_node_ff0 <= 1'b0;
434 | else if (clk_en == 1'b1) sign_node_ff0 <= (dataa[31] ^ datab[31]);
435 | // synopsys translate_off
436 | initial
437 | sign_node_ff1 = 0;
438 | // synopsys translate_on
439 | always @ ( posedge clock or posedge aclr)
440 | if (aclr == 1'b1) sign_node_ff1 <= 1'b0;
441 | else if (clk_en == 1'b1) sign_node_ff1 <= sign_node_ff0[0:0];
442 | // synopsys translate_off
443 | initial
444 | sign_node_ff2 = 0;
445 | // synopsys translate_on
446 | always @ ( posedge clock or posedge aclr)
447 | if (aclr == 1'b1) sign_node_ff2 <= 1'b0;
448 | else if (clk_en == 1'b1) sign_node_ff2 <= sign_node_ff1[0:0];
449 | // synopsys translate_off
450 | initial
451 | sign_node_ff3 = 0;
452 | // synopsys translate_on
453 | always @ ( posedge clock or posedge aclr)
454 | if (aclr == 1'b1) sign_node_ff3 <= 1'b0;
455 | else if (clk_en == 1'b1) sign_node_ff3 <= sign_node_ff2[0:0];
456 | // synopsys translate_off
457 | initial
458 | sign_node_ff4 = 0;
459 | // synopsys translate_on
460 | always @ ( posedge clock or posedge aclr)
461 | if (aclr == 1'b1) sign_node_ff4 <= 1'b0;
462 | else if (clk_en == 1'b1) sign_node_ff4 <= sign_node_ff3[0:0];
463 | // synopsys translate_off
464 | initial
465 | sticky_dffe = 0;
466 | // synopsys translate_on
467 | always @ ( posedge clock or posedge aclr)
468 | if (aclr == 1'b1) sticky_dffe <= 1'b0;
469 | else if (clk_en == 1'b1) sticky_dffe <= sticky_bit[22];
470 | assign
471 | wire_exp_add_adder_result_int = {wire_exp_add_adder_dataa, wire_exp_add_adder_cin} + {wire_exp_add_adder_datab, wire_exp_add_adder_cin};
472 | //synopsys translate_off
473 | initial
474 | wire_exp_add_adder_pipeline_dffe_Q = 0;
475 | //synopsys translate_on
476 | always @(posedge wire_exp_add_adder_clock or posedge wire_exp_add_adder_aclr)
477 | if (wire_exp_add_adder_aclr == 1'b1) wire_exp_add_adder_pipeline_dffe_Q <= 9'b0;
478 | else if (wire_exp_add_adder_clken == 1'b1) wire_exp_add_adder_pipeline_dffe_Q <= wire_exp_add_adder_pipeline_dffe_D;
479 | assign
480 | wire_exp_add_adder_result = wire_exp_add_adder_pipeline_dffe_Q[8:0],
481 | wire_exp_add_adder_pipeline_dffe_D[8:0] = wire_exp_add_adder_result_int[9:1];
482 | assign
483 | wire_exp_add_adder_aclr = aclr,
484 | wire_exp_add_adder_cin = 1'b0,
485 | wire_exp_add_adder_clken = clk_en,
486 | wire_exp_add_adder_clock = clock,
487 | wire_exp_add_adder_dataa = {1'b0, dataa[30:23]},
488 | wire_exp_add_adder_datab = {1'b0, datab[30:23]};
489 | assign
490 | wire_exp_adj_adder_result_int = {wire_exp_adj_adder_dataa, wire_exp_adj_adder_cin} + {wire_exp_adj_adder_datab, wire_exp_adj_adder_cin};
491 | assign
492 | wire_exp_adj_adder_result = wire_exp_adj_adder_result_int[10:1];
493 | assign
494 | wire_exp_adj_adder_cin = 1'b0,
495 | wire_exp_adj_adder_dataa = delay_exp2_bias,
496 | wire_exp_adj_adder_datab = expmod;
497 | assign
498 | wire_exp_bias_subtr_result = wire_exp_bias_subtr_dataa - wire_exp_bias_subtr_datab;
499 | assign
500 | wire_exp_bias_subtr_dataa = {1'b0, exp_add_p1[8:0]},
501 | wire_exp_bias_subtr_datab = {bias[9:0]};
502 | assign
503 | wire_man_round_adder_result = wire_man_round_adder_dataa + wire_man_round_adder_datab;
504 | assign
505 | wire_man_round_adder_dataa = {1'b0, man_round_p},
506 | wire_man_round_adder_datab = {{24{1'b0}}, round_carry};
507 | float_mult_mult man_product2_mult
508 | (
509 | .aclr(aclr),
510 | .clken(clk_en),
511 | .clock(clock),
512 | .dataa({1'b1, dataa[22:0]}),
513 | .datab({1'b1, datab[22:0]}),
514 | .result(wire_man_product2_mult_result));
515 | assign
516 | aclr = 1'b0,
517 | bias = {{3{1'b0}}, {7{1'b1}}},
518 | dataa_exp_all_one = {(dataa[30] & dataa_exp_all_one[6]), (dataa[29] & dataa_exp_all_one[5]), (dataa[28] & dataa_exp_all_one[4]), (dataa[27] & dataa_exp_all_one[3]), (dataa[26] & dataa_exp_all_one[2]), (dataa[25] & dataa_exp_all_one[1]), (dataa[24] & dataa_exp_all_one[0]), dataa[23]},
519 | dataa_exp_not_zero = {(dataa[30] | dataa_exp_not_zero[6]), (dataa[29] | dataa_exp_not_zero[5]), (dataa[28] | dataa_exp_not_zero[4]), (dataa[27] | dataa_exp_not_zero[3]), (dataa[26] | dataa_exp_not_zero[2]), (dataa[25] | dataa_exp_not_zero[1]), (dataa[24] | dataa_exp_not_zero[0]), dataa[23]},
520 | dataa_man_not_zero = {(dataa[22] | dataa_man_not_zero[21]), (dataa[21] | dataa_man_not_zero[20]), (dataa[20] | dataa_man_not_zero[19]), (dataa[19] | dataa_man_not_zero[18]), (dataa[18] | dataa_man_not_zero[17]), (dataa[17] | dataa_man_not_zero[16]), (dataa[16] | dataa_man_not_zero[15]), (dataa[15] | dataa_man_not_zero[14]), (dataa[14] | dataa_man_not_zero[13]), (dataa[13] | dataa_man_not_zero[12]), (dataa[12] | dataa_man_not_zero[11]), dataa[11], (dataa[10] | dataa_man_not_zero[9]), (dataa[9] | dataa_man_not_zero[8]), (dataa[8] | dataa_man_not_zero[7]), (dataa[7] | dataa_man_not_zero[6]), (dataa[6] | dataa_man_not_zero[5]), (dataa[5] | dataa_man_not_zero[4]), (dataa[4] | dataa_man_not_zero[3]), (dataa[3] | dataa_man_not_zero[2]), (dataa[2] | dataa_man_not_zero[1]), (dataa[1] | dataa_man_not_zero[0]), dataa[0]},
521 | datab_exp_all_one = {(datab[30] & datab_exp_all_one[6]), (datab[29] & datab_exp_all_one[5]), (datab[28] & datab_exp_all_one[4]), (datab[27] & datab_exp_all_one[3]), (datab[26] & datab_exp_all_one[2]), (datab[25] & datab_exp_all_one[1]), (datab[24] & datab_exp_all_one[0]), datab[23]},
522 | datab_exp_not_zero = {(datab[30] | datab_exp_not_zero[6]), (datab[29] | datab_exp_not_zero[5]), (datab[28] | datab_exp_not_zero[4]), (datab[27] | datab_exp_not_zero[3]), (datab[26] | datab_exp_not_zero[2]), (datab[25] | datab_exp_not_zero[1]), (datab[24] | datab_exp_not_zero[0]), datab[23]},
523 | datab_man_not_zero = {(datab[22] | datab_man_not_zero[21]), (datab[21] | datab_man_not_zero[20]), (datab[20] | datab_man_not_zero[19]), (datab[19] | datab_man_not_zero[18]), (datab[18] | datab_man_not_zero[17]), (datab[17] | datab_man_not_zero[16]), (datab[16] | datab_man_not_zero[15]), (datab[15] | datab_man_not_zero[14]), (datab[14] | datab_man_not_zero[13]), (datab[13] | datab_man_not_zero[12]), (datab[12] | datab_man_not_zero[11]), datab[11], (datab[10] | datab_man_not_zero[9]), (datab[9] | datab_man_not_zero[8]), (datab[8] | datab_man_not_zero[7]), (datab[7] | datab_man_not_zero[6]), (datab[6] | datab_man_not_zero[5]), (datab[5] | datab_man_not_zero[4]), (datab[4] | datab_man_not_zero[3]), (datab[3] | datab_man_not_zero[2]), (datab[2] | datab_man_not_zero[1]), (datab[1] | datab_man_not_zero[0]), datab[0]},
524 | exp_is_inf = (((~ wire_exp_adj_adder_result[9]) & wire_exp_adj_adder_result[8]) | ((~ wire_exp_adj_adder_result[8]) & result_exp_all_one[7])),
525 | exp_is_zero = (wire_exp_adj_adder_result[9] | (~ result_exp_not_zero[8])),
526 | expmod = {{8{1'b0}}, (delay_man_product_msb & man_round_p2[24]), (delay_man_product_msb ^ man_round_p2[24])},
527 | inf_num = {8{1'b1}},
528 | lsb_bit = man_shift_full[1],
529 | man_result_round = ((man_round_p2[23:0] & {24{(~ man_round_p2[24])}}) | (man_round_p2[24:1] & {24{man_round_p2[24]}})),
530 | man_shift_full = ((wire_man_product2_mult_result[46:22] & {25{(~ wire_man_product2_mult_result[47])}}) | (wire_man_product2_mult_result[47:23] & {25{wire_man_product2_mult_result[47]}})),
531 | result = {sign_node_ff4[0:0], exp_result_ff[7:0], man_result_ff[22:0]},
532 | result_exp_all_one = {(result_exp_all_one[6] & wire_exp_adj_adder_result[7]), (result_exp_all_one[5] & wire_exp_adj_adder_result[6]), (result_exp_all_one[4] & wire_exp_adj_adder_result[5]), (result_exp_all_one[3] & wire_exp_adj_adder_result[4]), (result_exp_all_one[2] & wire_exp_adj_adder_result[3]), (result_exp_all_one[1] & wire_exp_adj_adder_result[2]), (result_exp_all_one[0] & wire_exp_adj_adder_result[1]), wire_exp_adj_adder_result[0]},
533 | result_exp_not_zero = {(result_exp_not_zero[7] | wire_exp_adj_adder_result[8]), (result_exp_not_zero[6] | wire_exp_adj_adder_result[7]), (result_exp_not_zero[5] | wire_exp_adj_adder_result[6]), (result_exp_not_zero[4] | wire_exp_adj_adder_result[5]), (result_exp_not_zero[3] | wire_exp_adj_adder_result[4]), (result_exp_not_zero[2] | wire_exp_adj_adder_result[3]), (result_exp_not_zero[1] | wire_exp_adj_adder_result[2]), (result_exp_not_zero[0] | wire_exp_adj_adder_result[1]), wire_exp_adj_adder_result[0]},
534 | round_bit = man_shift_full[0],
535 | round_carry = (round_dffe & (lsb_dffe | sticky_dffe)),
536 | sticky_bit = {(sticky_bit[21] | (wire_man_product2_mult_result[47] & wire_man_product2_mult_result[22])), (sticky_bit[20] | wire_man_product2_mult_result[21]), (sticky_bit[19] | wire_man_product2_mult_result[20]), (sticky_bit[18] | wire_man_product2_mult_result[19]), (sticky_bit[17] | wire_man_product2_mult_result[18]), (sticky_bit[16] | wire_man_product2_mult_result[17]), (sticky_bit[15] | wire_man_product2_mult_result[16]), (sticky_bit[14] | wire_man_product2_mult_result[15]), (sticky_bit[13] | wire_man_product2_mult_result[14]), (sticky_bit[12] | wire_man_product2_mult_result[13]), (sticky_bit[11] | wire_man_product2_mult_result[12]), (sticky_bit[10] | wire_man_product2_mult_result[11]), (sticky_bit[9] | wire_man_product2_mult_result[10]), (sticky_bit[8] | wire_man_product2_mult_result[9]), (sticky_bit[7] | wire_man_product2_mult_result[8]), (sticky_bit[6] | wire_man_product2_mult_result[7]), (sticky_bit[5] | wire_man_product2_mult_result[6]), (sticky_bit[4] | wire_man_product2_mult_result[5]), (sticky_bit[3] | wire_man_product2_mult_result[4]), (sticky_bit[2] | wire_man_product2_mult_result[3]), (sticky_bit[1] | wire_man_product2_mult_result[2]), (sticky_bit[0] | wire_man_product2_mult_result[1]), wire_man_product2_mult_result[0]};
537 | endmodule //float_mult_altfp_mult
538 | //VALID FILE
539 |
540 |
541 | // synopsys translate_off
542 | `timescale 1 ps / 1 ps
543 | // synopsys translate_on
544 | module float_mult (
545 | clk_en,
546 | clock,
547 | dataa,
548 | datab,
549 | result)/* synthesis synthesis_clearbox = 1 */;
550 |
551 | input clk_en;
552 | input clock;
553 | input [31:0] dataa;
554 | input [31:0] datab;
555 | output [31:0] result;
556 |
557 | wire [31:0] sub_wire0;
558 | wire [31:0] result = sub_wire0[31:0];
559 |
560 | float_mult_altfp_mult float_mult_altfp_mult_component (
561 | .clk_en (clk_en),
562 | .clock (clock),
563 | .dataa (dataa),
564 | .datab (datab),
565 | .result (sub_wire0));
566 |
567 | endmodule
568 |
569 | // ============================================================
570 | // CNX file retrieval info
571 | // ============================================================
572 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
573 | // Retrieval info: PRIVATE: FPM_FORMAT STRING "Single"
574 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
575 | // Retrieval info: CONSTANT: DEDICATED_MULTIPLIER_CIRCUITRY STRING "YES"
576 | // Retrieval info: CONSTANT: DENORMAL_SUPPORT STRING "NO"
577 | // Retrieval info: CONSTANT: EXCEPTION_HANDLING STRING "NO"
578 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "UNUSED"
579 | // Retrieval info: CONSTANT: LPM_HINT STRING "UNUSED"
580 | // Retrieval info: CONSTANT: LPM_TYPE STRING "altfp_mult"
581 | // Retrieval info: CONSTANT: PIPELINE NUMERIC "5"
582 | // Retrieval info: CONSTANT: REDUCED_FUNCTIONALITY STRING "NO"
583 | // Retrieval info: CONSTANT: ROUNDING STRING "TO_NEAREST"
584 | // Retrieval info: CONSTANT: WIDTH_EXP NUMERIC "8"
585 | // Retrieval info: CONSTANT: WIDTH_MAN NUMERIC "23"
586 | // Retrieval info: USED_PORT: clk_en 0 0 0 0 INPUT NODEFVAL "clk_en"
587 | // Retrieval info: CONNECT: @clk_en 0 0 0 0 clk_en 0 0 0 0
588 | // Retrieval info: USED_PORT: clock 0 0 0 0 INPUT NODEFVAL "clock"
589 | // Retrieval info: CONNECT: @clock 0 0 0 0 clock 0 0 0 0
590 | // Retrieval info: USED_PORT: dataa 0 0 32 0 INPUT NODEFVAL "dataa[31..0]"
591 | // Retrieval info: CONNECT: @dataa 0 0 32 0 dataa 0 0 32 0
592 | // Retrieval info: USED_PORT: datab 0 0 32 0 INPUT NODEFVAL "datab[31..0]"
593 | // Retrieval info: CONNECT: @datab 0 0 32 0 datab 0 0 32 0
594 | // Retrieval info: USED_PORT: result 0 0 32 0 OUTPUT NODEFVAL "result[31..0]"
595 | // Retrieval info: CONNECT: result 0 0 32 0 @result 0 0 32 0
596 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.v TRUE FALSE
597 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.qip TRUE FALSE
598 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.bsf TRUE TRUE
599 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_inst.v TRUE TRUE
600 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult_bb.v TRUE TRUE
601 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.inc TRUE TRUE
602 | // Retrieval info: GEN_FILE: TYPE_NORMAL float_mult.cmp TRUE TRUE
603 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX NUMERIC "1"
604 | // Retrieval info: LIB_FILE: lpm
605 |
--------------------------------------------------------------------------------
/rtl/qip/iplauncher_debug.log:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/brianhill11/FPGA-CNN/23f4b55a8d7acdb33eaa3584de8e833e1448f8f6/rtl/qip/iplauncher_debug.log
--------------------------------------------------------------------------------
/rtl/qip/ram_2p.qip:
--------------------------------------------------------------------------------
1 | set_global_assignment -name IP_TOOL_NAME "RAM: 2-PORT"
2 | set_global_assignment -name IP_TOOL_VERSION "13.1"
3 | set_global_assignment -name VERILOG_FILE [file join $::quartus(qip_path) "ram_2p.v"]
4 | set_global_assignment -name MISC_FILE [file join $::quartus(qip_path) "ram_2p_bb.v"]
5 |
--------------------------------------------------------------------------------
/rtl/qip/ram_2p.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %RAM: 2-PORT%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: altsyncram
5 |
6 | // ============================================================
7 | // File Name: ram_2p.v
8 | // Megafunction Name(s):
9 | // altsyncram
10 | //
11 | // Simulation Library Files(s):
12 | // altera_mf
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 13.1.4 Build 182 03/12/2014 Patches 4.26 SJ Full Version
18 | // ************************************************************
19 |
20 |
21 | //Copyright (C) 1991-2014 Altera Corporation
22 | //Your use of Altera Corporation's design tools, logic functions
23 | //and other software and tools, and its AMPP partner logic
24 | //functions, and any output files from any of the foregoing
25 | //(including device programming or simulation files), and any
26 | //associated documentation or information are expressly subject
27 | //to the terms and conditions of the Altera Program License
28 | //Subscription Agreement, Altera MegaCore Function License
29 | //Agreement, or other applicable license agreement, including,
30 | //without limitation, that your use is for the sole purpose of
31 | //programming logic devices manufactured by Altera and sold by
32 | //Altera or its authorized distributors. Please refer to the
33 | //applicable agreement for further details.
34 |
35 |
36 | // synopsys translate_off
37 | `timescale 1 ps / 1 ps
38 | // synopsys translate_on
39 | module ram_2p (
40 | data,
41 | rdaddress,
42 | rdclock,
43 | wraddress,
44 | wrclock,
45 | wren,
46 | q);
47 |
48 | input [255:0] data;
49 | input [7:0] rdaddress;
50 | input rdclock;
51 | input [7:0] wraddress;
52 | input wrclock;
53 | input wren;
54 | output [255:0] q;
55 | `ifndef ALTERA_RESERVED_QIS
56 | // synopsys translate_off
57 | `endif
58 | tri1 wrclock;
59 | tri0 wren;
60 | `ifndef ALTERA_RESERVED_QIS
61 | // synopsys translate_on
62 | `endif
63 |
64 | wire [255:0] sub_wire0;
65 | wire [255:0] q = sub_wire0[255:0];
66 |
67 | altsyncram altsyncram_component (
68 | .address_a (wraddress),
69 | .clock0 (wrclock),
70 | .data_a (data),
71 | .wren_a (wren),
72 | .address_b (rdaddress),
73 | .clock1 (rdclock),
74 | .q_b (sub_wire0),
75 | .aclr0 (1'b0),
76 | .aclr1 (1'b0),
77 | .addressstall_a (1'b0),
78 | .addressstall_b (1'b0),
79 | .byteena_a (1'b1),
80 | .byteena_b (1'b1),
81 | .clocken0 (1'b1),
82 | .clocken1 (1'b1),
83 | .clocken2 (1'b1),
84 | .clocken3 (1'b1),
85 | .data_b ({256{1'b1}}),
86 | .eccstatus (),
87 | .q_a (),
88 | .rden_a (1'b1),
89 | .rden_b (1'b1),
90 | .wren_b (1'b0));
91 | defparam
92 | altsyncram_component.address_aclr_b = "NONE",
93 | altsyncram_component.address_reg_b = "CLOCK1",
94 | altsyncram_component.clock_enable_input_a = "BYPASS",
95 | altsyncram_component.clock_enable_input_b = "BYPASS",
96 | altsyncram_component.clock_enable_output_b = "BYPASS",
97 | altsyncram_component.intended_device_family = "Stratix V",
98 | altsyncram_component.lpm_type = "altsyncram",
99 | altsyncram_component.numwords_a = 256,
100 | altsyncram_component.numwords_b = 256,
101 | altsyncram_component.operation_mode = "DUAL_PORT",
102 | altsyncram_component.outdata_aclr_b = "NONE",
103 | altsyncram_component.outdata_reg_b = "CLOCK1",
104 | altsyncram_component.power_up_uninitialized = "FALSE",
105 | altsyncram_component.widthad_a = 8,
106 | altsyncram_component.widthad_b = 8,
107 | altsyncram_component.width_a = 256,
108 | altsyncram_component.width_b = 256,
109 | altsyncram_component.width_byteena_a = 1;
110 |
111 |
112 | endmodule
113 |
114 | // ============================================================
115 | // CNX file retrieval info
116 | // ============================================================
117 | // Retrieval info: PRIVATE: ADDRESSSTALL_A NUMERIC "0"
118 | // Retrieval info: PRIVATE: ADDRESSSTALL_B NUMERIC "0"
119 | // Retrieval info: PRIVATE: BYTEENA_ACLR_A NUMERIC "0"
120 | // Retrieval info: PRIVATE: BYTEENA_ACLR_B NUMERIC "0"
121 | // Retrieval info: PRIVATE: BYTE_ENABLE_A NUMERIC "0"
122 | // Retrieval info: PRIVATE: BYTE_ENABLE_B NUMERIC "0"
123 | // Retrieval info: PRIVATE: BYTE_SIZE NUMERIC "8"
124 | // Retrieval info: PRIVATE: BlankMemory NUMERIC "1"
125 | // Retrieval info: PRIVATE: CLOCK_ENABLE_INPUT_A NUMERIC "0"
126 | // Retrieval info: PRIVATE: CLOCK_ENABLE_INPUT_B NUMERIC "0"
127 | // Retrieval info: PRIVATE: CLOCK_ENABLE_OUTPUT_A NUMERIC "0"
128 | // Retrieval info: PRIVATE: CLOCK_ENABLE_OUTPUT_B NUMERIC "0"
129 | // Retrieval info: PRIVATE: CLRdata NUMERIC "0"
130 | // Retrieval info: PRIVATE: CLRq NUMERIC "0"
131 | // Retrieval info: PRIVATE: CLRrdaddress NUMERIC "0"
132 | // Retrieval info: PRIVATE: CLRrren NUMERIC "0"
133 | // Retrieval info: PRIVATE: CLRwraddress NUMERIC "0"
134 | // Retrieval info: PRIVATE: CLRwren NUMERIC "0"
135 | // Retrieval info: PRIVATE: Clock NUMERIC "1"
136 | // Retrieval info: PRIVATE: Clock_A NUMERIC "0"
137 | // Retrieval info: PRIVATE: Clock_B NUMERIC "0"
138 | // Retrieval info: PRIVATE: IMPLEMENT_IN_LES NUMERIC "0"
139 | // Retrieval info: PRIVATE: INDATA_ACLR_B NUMERIC "0"
140 | // Retrieval info: PRIVATE: INDATA_REG_B NUMERIC "0"
141 | // Retrieval info: PRIVATE: INIT_FILE_LAYOUT STRING "PORT_B"
142 | // Retrieval info: PRIVATE: INIT_TO_SIM_X NUMERIC "0"
143 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
144 | // Retrieval info: PRIVATE: JTAG_ENABLED NUMERIC "0"
145 | // Retrieval info: PRIVATE: JTAG_ID STRING "NONE"
146 | // Retrieval info: PRIVATE: MAXIMUM_DEPTH NUMERIC "0"
147 | // Retrieval info: PRIVATE: MEMSIZE NUMERIC "65536"
148 | // Retrieval info: PRIVATE: MEM_IN_BITS NUMERIC "0"
149 | // Retrieval info: PRIVATE: MIFfilename STRING ""
150 | // Retrieval info: PRIVATE: OPERATION_MODE NUMERIC "2"
151 | // Retrieval info: PRIVATE: OUTDATA_ACLR_B NUMERIC "0"
152 | // Retrieval info: PRIVATE: OUTDATA_REG_B NUMERIC "1"
153 | // Retrieval info: PRIVATE: RAM_BLOCK_TYPE NUMERIC "0"
154 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_MIXED_PORTS NUMERIC "2"
155 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_PORT_A NUMERIC "3"
156 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_PORT_B NUMERIC "3"
157 | // Retrieval info: PRIVATE: REGdata NUMERIC "1"
158 | // Retrieval info: PRIVATE: REGq NUMERIC "1"
159 | // Retrieval info: PRIVATE: REGrdaddress NUMERIC "1"
160 | // Retrieval info: PRIVATE: REGrren NUMERIC "1"
161 | // Retrieval info: PRIVATE: REGwraddress NUMERIC "1"
162 | // Retrieval info: PRIVATE: REGwren NUMERIC "1"
163 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX STRING "0"
164 | // Retrieval info: PRIVATE: USE_DIFF_CLKEN NUMERIC "0"
165 | // Retrieval info: PRIVATE: UseDPRAM NUMERIC "1"
166 | // Retrieval info: PRIVATE: VarWidth NUMERIC "0"
167 | // Retrieval info: PRIVATE: WIDTH_READ_A NUMERIC "256"
168 | // Retrieval info: PRIVATE: WIDTH_READ_B NUMERIC "256"
169 | // Retrieval info: PRIVATE: WIDTH_WRITE_A NUMERIC "256"
170 | // Retrieval info: PRIVATE: WIDTH_WRITE_B NUMERIC "256"
171 | // Retrieval info: PRIVATE: WRADDR_ACLR_B NUMERIC "0"
172 | // Retrieval info: PRIVATE: WRADDR_REG_B NUMERIC "0"
173 | // Retrieval info: PRIVATE: WRCTRL_ACLR_B NUMERIC "0"
174 | // Retrieval info: PRIVATE: enable NUMERIC "0"
175 | // Retrieval info: PRIVATE: rden NUMERIC "0"
176 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
177 | // Retrieval info: CONSTANT: ADDRESS_ACLR_B STRING "NONE"
178 | // Retrieval info: CONSTANT: ADDRESS_REG_B STRING "CLOCK1"
179 | // Retrieval info: CONSTANT: CLOCK_ENABLE_INPUT_A STRING "BYPASS"
180 | // Retrieval info: CONSTANT: CLOCK_ENABLE_INPUT_B STRING "BYPASS"
181 | // Retrieval info: CONSTANT: CLOCK_ENABLE_OUTPUT_B STRING "BYPASS"
182 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "Stratix V"
183 | // Retrieval info: CONSTANT: LPM_TYPE STRING "altsyncram"
184 | // Retrieval info: CONSTANT: NUMWORDS_A NUMERIC "256"
185 | // Retrieval info: CONSTANT: NUMWORDS_B NUMERIC "256"
186 | // Retrieval info: CONSTANT: OPERATION_MODE STRING "DUAL_PORT"
187 | // Retrieval info: CONSTANT: OUTDATA_ACLR_B STRING "NONE"
188 | // Retrieval info: CONSTANT: OUTDATA_REG_B STRING "CLOCK1"
189 | // Retrieval info: CONSTANT: POWER_UP_UNINITIALIZED STRING "FALSE"
190 | // Retrieval info: CONSTANT: WIDTHAD_A NUMERIC "8"
191 | // Retrieval info: CONSTANT: WIDTHAD_B NUMERIC "8"
192 | // Retrieval info: CONSTANT: WIDTH_A NUMERIC "256"
193 | // Retrieval info: CONSTANT: WIDTH_B NUMERIC "256"
194 | // Retrieval info: CONSTANT: WIDTH_BYTEENA_A NUMERIC "1"
195 | // Retrieval info: USED_PORT: data 0 0 256 0 INPUT NODEFVAL "data[255..0]"
196 | // Retrieval info: USED_PORT: q 0 0 256 0 OUTPUT NODEFVAL "q[255..0]"
197 | // Retrieval info: USED_PORT: rdaddress 0 0 8 0 INPUT NODEFVAL "rdaddress[7..0]"
198 | // Retrieval info: USED_PORT: rdclock 0 0 0 0 INPUT NODEFVAL "rdclock"
199 | // Retrieval info: USED_PORT: wraddress 0 0 8 0 INPUT NODEFVAL "wraddress[7..0]"
200 | // Retrieval info: USED_PORT: wrclock 0 0 0 0 INPUT VCC "wrclock"
201 | // Retrieval info: USED_PORT: wren 0 0 0 0 INPUT GND "wren"
202 | // Retrieval info: CONNECT: @address_a 0 0 8 0 wraddress 0 0 8 0
203 | // Retrieval info: CONNECT: @address_b 0 0 8 0 rdaddress 0 0 8 0
204 | // Retrieval info: CONNECT: @clock0 0 0 0 0 wrclock 0 0 0 0
205 | // Retrieval info: CONNECT: @clock1 0 0 0 0 rdclock 0 0 0 0
206 | // Retrieval info: CONNECT: @data_a 0 0 256 0 data 0 0 256 0
207 | // Retrieval info: CONNECT: @wren_a 0 0 0 0 wren 0 0 0 0
208 | // Retrieval info: CONNECT: q 0 0 256 0 @q_b 0 0 256 0
209 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.v TRUE
210 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.inc FALSE
211 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.cmp FALSE
212 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.bsf FALSE
213 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p_inst.v FALSE
214 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p_bb.v TRUE
215 | // Retrieval info: LIB_FILE: altera_mf
216 |
--------------------------------------------------------------------------------
/rtl/qip/ram_2p_bb.v:
--------------------------------------------------------------------------------
1 | // megafunction wizard: %RAM: 2-PORT%VBB%
2 | // GENERATION: STANDARD
3 | // VERSION: WM1.0
4 | // MODULE: altsyncram
5 |
6 | // ============================================================
7 | // File Name: ram_2p.v
8 | // Megafunction Name(s):
9 | // altsyncram
10 | //
11 | // Simulation Library Files(s):
12 | // altera_mf
13 | // ============================================================
14 | // ************************************************************
15 | // THIS IS A WIZARD-GENERATED FILE. DO NOT EDIT THIS FILE!
16 | //
17 | // 13.1.4 Build 182 03/12/2014 Patches 4.26 SJ Full Version
18 | // ************************************************************
19 |
20 | //Copyright (C) 1991-2014 Altera Corporation
21 | //Your use of Altera Corporation's design tools, logic functions
22 | //and other software and tools, and its AMPP partner logic
23 | //functions, and any output files from any of the foregoing
24 | //(including device programming or simulation files), and any
25 | //associated documentation or information are expressly subject
26 | //to the terms and conditions of the Altera Program License
27 | //Subscription Agreement, Altera MegaCore Function License
28 | //Agreement, or other applicable license agreement, including,
29 | //without limitation, that your use is for the sole purpose of
30 | //programming logic devices manufactured by Altera and sold by
31 | //Altera or its authorized distributors. Please refer to the
32 | //applicable agreement for further details.
33 |
34 | module ram_2p (
35 | data,
36 | rdaddress,
37 | rdclock,
38 | wraddress,
39 | wrclock,
40 | wren,
41 | q);
42 |
43 | input [255:0] data;
44 | input [7:0] rdaddress;
45 | input rdclock;
46 | input [7:0] wraddress;
47 | input wrclock;
48 | input wren;
49 | output [255:0] q;
50 | `ifndef ALTERA_RESERVED_QIS
51 | // synopsys translate_off
52 | `endif
53 | tri1 wrclock;
54 | tri0 wren;
55 | `ifndef ALTERA_RESERVED_QIS
56 | // synopsys translate_on
57 | `endif
58 |
59 | endmodule
60 |
61 | // ============================================================
62 | // CNX file retrieval info
63 | // ============================================================
64 | // Retrieval info: PRIVATE: ADDRESSSTALL_A NUMERIC "0"
65 | // Retrieval info: PRIVATE: ADDRESSSTALL_B NUMERIC "0"
66 | // Retrieval info: PRIVATE: BYTEENA_ACLR_A NUMERIC "0"
67 | // Retrieval info: PRIVATE: BYTEENA_ACLR_B NUMERIC "0"
68 | // Retrieval info: PRIVATE: BYTE_ENABLE_A NUMERIC "0"
69 | // Retrieval info: PRIVATE: BYTE_ENABLE_B NUMERIC "0"
70 | // Retrieval info: PRIVATE: BYTE_SIZE NUMERIC "8"
71 | // Retrieval info: PRIVATE: BlankMemory NUMERIC "1"
72 | // Retrieval info: PRIVATE: CLOCK_ENABLE_INPUT_A NUMERIC "0"
73 | // Retrieval info: PRIVATE: CLOCK_ENABLE_INPUT_B NUMERIC "0"
74 | // Retrieval info: PRIVATE: CLOCK_ENABLE_OUTPUT_A NUMERIC "0"
75 | // Retrieval info: PRIVATE: CLOCK_ENABLE_OUTPUT_B NUMERIC "0"
76 | // Retrieval info: PRIVATE: CLRdata NUMERIC "0"
77 | // Retrieval info: PRIVATE: CLRq NUMERIC "0"
78 | // Retrieval info: PRIVATE: CLRrdaddress NUMERIC "0"
79 | // Retrieval info: PRIVATE: CLRrren NUMERIC "0"
80 | // Retrieval info: PRIVATE: CLRwraddress NUMERIC "0"
81 | // Retrieval info: PRIVATE: CLRwren NUMERIC "0"
82 | // Retrieval info: PRIVATE: Clock NUMERIC "1"
83 | // Retrieval info: PRIVATE: Clock_A NUMERIC "0"
84 | // Retrieval info: PRIVATE: Clock_B NUMERIC "0"
85 | // Retrieval info: PRIVATE: IMPLEMENT_IN_LES NUMERIC "0"
86 | // Retrieval info: PRIVATE: INDATA_ACLR_B NUMERIC "0"
87 | // Retrieval info: PRIVATE: INDATA_REG_B NUMERIC "0"
88 | // Retrieval info: PRIVATE: INIT_FILE_LAYOUT STRING "PORT_B"
89 | // Retrieval info: PRIVATE: INIT_TO_SIM_X NUMERIC "0"
90 | // Retrieval info: PRIVATE: INTENDED_DEVICE_FAMILY STRING "Stratix V"
91 | // Retrieval info: PRIVATE: JTAG_ENABLED NUMERIC "0"
92 | // Retrieval info: PRIVATE: JTAG_ID STRING "NONE"
93 | // Retrieval info: PRIVATE: MAXIMUM_DEPTH NUMERIC "0"
94 | // Retrieval info: PRIVATE: MEMSIZE NUMERIC "65536"
95 | // Retrieval info: PRIVATE: MEM_IN_BITS NUMERIC "0"
96 | // Retrieval info: PRIVATE: MIFfilename STRING ""
97 | // Retrieval info: PRIVATE: OPERATION_MODE NUMERIC "2"
98 | // Retrieval info: PRIVATE: OUTDATA_ACLR_B NUMERIC "0"
99 | // Retrieval info: PRIVATE: OUTDATA_REG_B NUMERIC "1"
100 | // Retrieval info: PRIVATE: RAM_BLOCK_TYPE NUMERIC "0"
101 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_MIXED_PORTS NUMERIC "2"
102 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_PORT_A NUMERIC "3"
103 | // Retrieval info: PRIVATE: READ_DURING_WRITE_MODE_PORT_B NUMERIC "3"
104 | // Retrieval info: PRIVATE: REGdata NUMERIC "1"
105 | // Retrieval info: PRIVATE: REGq NUMERIC "1"
106 | // Retrieval info: PRIVATE: REGrdaddress NUMERIC "1"
107 | // Retrieval info: PRIVATE: REGrren NUMERIC "1"
108 | // Retrieval info: PRIVATE: REGwraddress NUMERIC "1"
109 | // Retrieval info: PRIVATE: REGwren NUMERIC "1"
110 | // Retrieval info: PRIVATE: SYNTH_WRAPPER_GEN_POSTFIX STRING "0"
111 | // Retrieval info: PRIVATE: USE_DIFF_CLKEN NUMERIC "0"
112 | // Retrieval info: PRIVATE: UseDPRAM NUMERIC "1"
113 | // Retrieval info: PRIVATE: VarWidth NUMERIC "0"
114 | // Retrieval info: PRIVATE: WIDTH_READ_A NUMERIC "256"
115 | // Retrieval info: PRIVATE: WIDTH_READ_B NUMERIC "256"
116 | // Retrieval info: PRIVATE: WIDTH_WRITE_A NUMERIC "256"
117 | // Retrieval info: PRIVATE: WIDTH_WRITE_B NUMERIC "256"
118 | // Retrieval info: PRIVATE: WRADDR_ACLR_B NUMERIC "0"
119 | // Retrieval info: PRIVATE: WRADDR_REG_B NUMERIC "0"
120 | // Retrieval info: PRIVATE: WRCTRL_ACLR_B NUMERIC "0"
121 | // Retrieval info: PRIVATE: enable NUMERIC "0"
122 | // Retrieval info: PRIVATE: rden NUMERIC "0"
123 | // Retrieval info: LIBRARY: altera_mf altera_mf.altera_mf_components.all
124 | // Retrieval info: CONSTANT: ADDRESS_ACLR_B STRING "NONE"
125 | // Retrieval info: CONSTANT: ADDRESS_REG_B STRING "CLOCK1"
126 | // Retrieval info: CONSTANT: CLOCK_ENABLE_INPUT_A STRING "BYPASS"
127 | // Retrieval info: CONSTANT: CLOCK_ENABLE_INPUT_B STRING "BYPASS"
128 | // Retrieval info: CONSTANT: CLOCK_ENABLE_OUTPUT_B STRING "BYPASS"
129 | // Retrieval info: CONSTANT: INTENDED_DEVICE_FAMILY STRING "Stratix V"
130 | // Retrieval info: CONSTANT: LPM_TYPE STRING "altsyncram"
131 | // Retrieval info: CONSTANT: NUMWORDS_A NUMERIC "256"
132 | // Retrieval info: CONSTANT: NUMWORDS_B NUMERIC "256"
133 | // Retrieval info: CONSTANT: OPERATION_MODE STRING "DUAL_PORT"
134 | // Retrieval info: CONSTANT: OUTDATA_ACLR_B STRING "NONE"
135 | // Retrieval info: CONSTANT: OUTDATA_REG_B STRING "CLOCK1"
136 | // Retrieval info: CONSTANT: POWER_UP_UNINITIALIZED STRING "FALSE"
137 | // Retrieval info: CONSTANT: WIDTHAD_A NUMERIC "8"
138 | // Retrieval info: CONSTANT: WIDTHAD_B NUMERIC "8"
139 | // Retrieval info: CONSTANT: WIDTH_A NUMERIC "256"
140 | // Retrieval info: CONSTANT: WIDTH_B NUMERIC "256"
141 | // Retrieval info: CONSTANT: WIDTH_BYTEENA_A NUMERIC "1"
142 | // Retrieval info: USED_PORT: data 0 0 256 0 INPUT NODEFVAL "data[255..0]"
143 | // Retrieval info: USED_PORT: q 0 0 256 0 OUTPUT NODEFVAL "q[255..0]"
144 | // Retrieval info: USED_PORT: rdaddress 0 0 8 0 INPUT NODEFVAL "rdaddress[7..0]"
145 | // Retrieval info: USED_PORT: rdclock 0 0 0 0 INPUT NODEFVAL "rdclock"
146 | // Retrieval info: USED_PORT: wraddress 0 0 8 0 INPUT NODEFVAL "wraddress[7..0]"
147 | // Retrieval info: USED_PORT: wrclock 0 0 0 0 INPUT VCC "wrclock"
148 | // Retrieval info: USED_PORT: wren 0 0 0 0 INPUT GND "wren"
149 | // Retrieval info: CONNECT: @address_a 0 0 8 0 wraddress 0 0 8 0
150 | // Retrieval info: CONNECT: @address_b 0 0 8 0 rdaddress 0 0 8 0
151 | // Retrieval info: CONNECT: @clock0 0 0 0 0 wrclock 0 0 0 0
152 | // Retrieval info: CONNECT: @clock1 0 0 0 0 rdclock 0 0 0 0
153 | // Retrieval info: CONNECT: @data_a 0 0 256 0 data 0 0 256 0
154 | // Retrieval info: CONNECT: @wren_a 0 0 0 0 wren 0 0 0 0
155 | // Retrieval info: CONNECT: q 0 0 256 0 @q_b 0 0 256 0
156 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.v TRUE
157 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.inc FALSE
158 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.cmp FALSE
159 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p.bsf FALSE
160 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p_inst.v FALSE
161 | // Retrieval info: GEN_FILE: TYPE_NORMAL ram_2p_bb.v TRUE
162 | // Retrieval info: LIB_FILE: altera_mf
163 |
--------------------------------------------------------------------------------
/rtl/relu_backward_layer.sv:
--------------------------------------------------------------------------------
1 |
2 | module relu_backward_layer #(parameter WIDTH = 16, parameter NEGATIVE_SLOPE = 0.0)
3 | (
4 | input logic clk, //clock signal
5 | input logic reset, //reset signal
6 | input logic [7:0] id, //id value
7 | input logic [31:0] in_vec [WIDTH-1:0],//vector of floats
8 | output reg [7:0] id_out, //output id value
9 | output reg [31:0] out_vec [WIDTH-1:0] //vector of floats
10 | );
11 |
12 | generate
13 | genvar i;
14 |
15 | for (i = 0; i < WIDTH; i = i+1) begin : RELU_BACKWARD
16 | relu_backward_opt #(.NEGATIVE_SLOPE(NEGATIVE_SLOPE))
17 | relu_ops ( .clk(clk), .reset(reset),
18 | .in_data(in_vec[i]), .out_data(out_vec[i]) );
19 | end
20 | endgenerate
21 |
22 | always @(posedge clk) begin
23 | id_out <= id;
24 | end
25 | endmodule
26 |
--------------------------------------------------------------------------------
/rtl/relu_backward_layer.sv.bak:
--------------------------------------------------------------------------------
1 |
2 | module relu_backward_layer #(parameter WIDTH = 4, parameter NEGATIVE_SLOPE = 0)
3 | ( input logic clk, //clock signal
4 | input logic reset, //reset signal
5 | input logic [31:0] in_vec [WIDTH-1:0],//vector of floats
6 | output reg [31:0] out_vec [WIDTH-1:0] //vector of floats
7 | );
8 |
9 | //parameter NEGATIVE_SLOPE = 0;
10 | //parameter WIDTH = 4;
11 |
12 | generate
13 | genvar i;
14 |
15 | for (i = 0; i < WIDTH; i = i+1) begin
16 | relu_backward_opt #(.NEGATIVE_SLOPE(NEGATIVE_SLOPE))
17 | relu_ops ( .clk(clk), .reset(reset),
18 | .in_data(in_vec[i]), .out_data(out_vec[i]) );
19 | end
20 | endgenerate
21 |
22 | endmodule
23 |
--------------------------------------------------------------------------------
/rtl/relu_backward_layer_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module relu_backward_layer_tb();
4 | `include "/home/b/FPGA-CNN/test/test_data/relu_backward_test_data.vh"
5 | parameter CYCLE = 5; //clk period: 5 ns = 200 MHz signal
6 | parameter NEG_SLOPE = 0.0; //negative slope param
7 | parameter WIDTH = 8; //width of input/output vec
8 |
9 | parameter NUM_TESTS = 5000; //number of test iterations
10 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
11 |
12 | reg clk, reset;
13 | reg [31:0] in_vec [WIDTH-1:0]; //input vec to module
14 | reg [31:0] out_vec [WIDTH-1:0]; //output vec from module
15 | int i, j, num_errors;
16 |
17 | //initialize clk
18 | initial begin
19 | clk = 0;
20 | end
21 |
22 | //forever cycle the clk
23 | always begin
24 | #(CYCLE/2.0) clk = ~clk;
25 | end
26 |
27 | //instantiate the module
28 | relu_backward_layer #(.WIDTH(8), .NEGATIVE_SLOPE(NEG_SLOPE) )
29 | relu( .clk(clk), .reset(reset), .id(8'b0), .in_vec(in_vec), .out_vec(out_vec) );
30 |
31 | initial begin
32 | reset = 0;
33 | num_errors = 0;
34 | //for all test cases
35 | for (i = 0; i < MEM_SIZE; i = i+(WIDTH)) begin
36 | //for each value in input vector
37 | for (j = 0; j < WIDTH; j++) begin
38 | //use test input value as input
39 | in_vec[j] = test_input[i+j];
40 | end
41 | //wait for it...
42 | #(CYCLE)
43 | //for each value in output vector (same size as input)
44 | for (j = 0; j < WIDTH; j++) begin
45 | //check output of module against value calculated by Python
46 | $display("output: %h\tcalculated:%h", out_vec[j], test_output[i+j]);
47 | assert( out_vec[j] == test_output[i+j] );
48 | //if we were wrong, increase error count
49 | if( out_vec[j] != test_output[i+j] ) begin
50 | num_errors++;
51 | end
52 | end
53 | end
54 | $display("############################################\n");
55 | $display("Testing complete!\n");
56 | $display("%d of %d output values matched!\n", MEM_SIZE-num_errors, MEM_SIZE);
57 | $display("(%f percent)\n", 100.0*(MEM_SIZE-num_errors)/MEM_SIZE);
58 | $display("############################################\n");
59 | end
60 | endmodule
61 |
62 |
--------------------------------------------------------------------------------
/rtl/relu_backward_layer_tb.sv.bak:
--------------------------------------------------------------------------------
1 | `timescale 1ns/1ns
2 |
3 | module relu_backward_layer_tb();
4 |
5 | parameter CYCLE = 100;
6 | //use $shortrealtobits() to convert float to binary
7 | parameter NEG_SLOPE = $shortrealtobits(0.0001);
8 | parameter WIDTH = 4;
9 |
10 | reg clk, reset;
11 | shortreal a [WIDTH-1:0];
12 | reg [31:0] b [WIDTH-1:0];
13 | reg random_sign; //1 bit for random float sign
14 | reg [7:0] random_exp; //8 bits for random float exp
15 | reg [22:0] random_mantissa; //23 bits for random float mantissa
16 |
17 | //forever cycle the clk
18 | initial begin
19 | clk <= 0;
20 | forever begin
21 | #(CYCLE/2) clk = ~clk;
22 | end
23 | end
24 |
25 | relu_backward_layer #(.NEGATIVE_SLOPE(NEG_SLOPE), .WIDTH(WIDTH) )
26 | relu( .clk(clk), .reset(reset), .in_vec(a), .out_vec(b) );
27 |
28 | int i, j;
29 | initial begin
30 | reset = 0;
31 | repeat(10) begin
32 | i = i+1;
33 | //build the input vector of floats
34 | for (j = 0; j < WIDTH; j = j+1) begin
35 | $display("Build\tIteration %d\tinput %d\n", i, j);
36 | //generate a random sign bit, exponent, and mantissa value
37 | random_sign = $urandom(i+j) % 2;
38 | random_exp = $urandom(i+j+2) % 255;
39 | random_mantissa = $urandom(i+j+5);
40 | //concatenate sign bit, exponent, and mantissa to make float
41 | a[j] = {random_sign, random_exp, random_mantissa};
42 | end
43 | //take a quick break
44 | $display("a[0]: %b\n", a[0]);
45 | #(3*CYCLE)
46 | //check the output vector of floats
47 | for (j = 0; j < WIDTH; j = j+1) begin
48 | $display("Test: Iteration %d\tinput %d\n", i, j);
49 | //if the input value is greater than zero, then
50 | //output should be same as input
51 | if ($bitstoshortreal(a[j]) > 0e0) begin
52 | $display("%f is greater than zero: %b\n", a[j], a[j]);
53 | $display("a: %b b: %b\n", a[j], b[j]);
54 | assert( a[j] == b[j] );
55 | //else input value less than or equal to zero,
56 | //so output should be NEG_SLOPE
57 | end else begin
58 | $display("%f is less than or equal to zero: %b\n", a[j], a[j]);
59 | $display("NEG_SLOPE: %b\n", NEG_SLOPE);
60 | $display("a: %b b: %b\n", a[j], b[j]);
61 | assert( $bitstoshortreal(b[j]) == $bitstoshortreal(NEG_SLOPE) );
62 | end
63 | end
64 | end
65 | $display("############################################\n");
66 | $display("All tests passed!\n");
67 | $display("############################################\n");
68 | end
69 | endmodule
70 |
71 |
--------------------------------------------------------------------------------
/rtl/relu_backward_opt.sv:
--------------------------------------------------------------------------------
1 |
2 | module relu_backward_opt( input logic clk, //clock signal
3 | input logic reset, //reset signal
4 | input logic [31:0] in_data, //32-bit float
5 | output logic [31:0] out_data); //32-bit float
6 |
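7 | parameter NEGATIVE_SLOPE = 0.0; //driven onto out_data as-is: override with an IEEE-754 bit pattern (a real override is rounded to an integer)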
8 |
9 | //at rising edge of clock
10 | always @(posedge clk, negedge reset) begin
11 | //check for reset value, else continue
12 | if (!reset) begin
13 | //if value is positive, output the value
14 | if (in_data[31] == 0) begin
15 | out_data <= in_data;
16 | //else output the NEGATIVE_SLOPE (usually 0)
17 | end else begin
18 | out_data <= NEGATIVE_SLOPE;
19 | end
20 | end
21 | end
22 | endmodule
23 |
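
The testbenches compare module outputs against values calculated in Python. A minimal sketch of a bit-level reference model for `relu_backward_opt` follows. The helper names `float_to_bits` and `relu_backward_ref` are hypothetical and not part of this repository, and the model assumes `NEGATIVE_SLOPE` is supplied as an IEEE-754 single-precision bit pattern.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def relu_backward_ref(in_bits: int, negative_slope_bits: int = 0) -> int:
    """Hypothetical reference model mirroring relu_backward_opt.

    Like the RTL, it inspects only the sign bit (bit 31): a clear sign bit
    passes the input through unchanged, a set sign bit yields the
    NEGATIVE_SLOPE bit pattern. Note that -0.0 has its sign bit set and is
    therefore treated as negative.
    """
    sign = (in_bits >> 31) & 1
    return in_bits if sign == 0 else negative_slope_bits


if __name__ == "__main__":
    for value in (2.5, -2.5, 0.0, -0.0):
        out = relu_backward_ref(float_to_bits(value))
        print(f"{value!r:>5} -> {out:08x}")
```

Because the model works on raw bit patterns, its results can be compared directly with `out_vec[j]` without any floating-point tolerance.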
--------------------------------------------------------------------------------
/rtl/relu_backward_opt_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/1ns
2 |
3 | module relu_backward_opt_tb();
4 |
5 | parameter CYCLE = 100;
6 | //use $shortrealtobits() to convert float to binary
7 | parameter NEG_SLOPE = $shortrealtobits(0.0001);
8 |
9 | reg clk, reset;
10 | reg [31:0] a; //input as raw IEEE-754 bits (a shortreal would convert the concatenation numerically)
11 | reg [31:0] b;
12 | reg random_sign; //1 bit for random float sign
13 | reg [7:0] random_exp; //8 bits for random float exp
14 | reg [22:0] random_mantissa; //23 bits for random float mantissa
15 |
16 | //forever cycle the clk
17 | initial begin
18 | clk <= 0;
19 | forever begin
20 | #(CYCLE/2) clk = ~clk;
21 | end
22 | end
23 |
24 | relu_backward_opt #(.NEGATIVE_SLOPE(NEG_SLOPE)) relu( .clk(clk), .reset(reset), .in_data(a), .out_data(b) );
25 |
26 | int i;
27 | initial begin
28 | reset = 0;
29 | repeat(10000) begin
30 | i = i+1;
31 | //generate a random sign bit, exponent, and mantissa value
32 | random_sign = $urandom(i) % 2;
33 | random_exp = $urandom(i+2) % 255;
34 | random_mantissa = $urandom(i+5);
35 | //concatenate sign bit, exponent, and mantissa to make float
36 | a = {random_sign, random_exp, random_mantissa};
37 | #(3*CYCLE)
38 | //if the input value is greater than zero, then
39 | //output should be same as input
40 | if ($bitstoshortreal(a) > 0e0) begin
41 | //$display("%f is greater than zero: %b\n", a, a);
42 | //$display("a: %b b: %b\n", a, b);
43 | assert( a == b );
44 | //else input value less than or equal to zero,
45 | //so output should be NEG_SLOPE
46 | end else begin
47 | //$display("%f is less than or equal to zero: %b\n", a, a);
48 | //$display("NEG_SLOPE: %b\n", NEG_SLOPE);
49 | //$display("a: %b b: %b\n", a, b);
50 | assert( $bitstoshortreal(b) == $bitstoshortreal(NEG_SLOPE) );
51 | end
52 | end
53 | $display("############################################\n");
54 | $display("All tests passed!\n");
55 | $display("############################################\n");
56 | end
57 | endmodule
58 |
59 |
--------------------------------------------------------------------------------
/rtl/relu_forward.sv:
--------------------------------------------------------------------------------
1 | module relu_forward #(parameter negative_slope = 0, parameter WIDTH = 8)
2 | (
3 | input logic reset_n, //reset
4 | input logic clk_en,
5 | input logic clk, //clock signal
6 | input logic [31:0] in_data [WIDTH-1:0], //data vector of floats
7 | input logic [7:0] in_id,
8 | output reg [31:0] out_data [WIDTH-1:0], //data vector of floats
9 | output reg [7:0] out_id
10 | );
11 |
12 |
13 | //default negative slope is 0
14 |
15 | genvar i;
16 | generate
17 | for(i = 0; i < WIDTH; i = i+1) begin : RELU_FORWARD_MULT
18 | relu_forward_opt #(.negative_slope(negative_slope))
19 | opt(
20 | .reset_n(reset_n),
21 | .clk_en(clk_en),
22 | .clk(clk),
23 | .in_data(in_data[i]),
24 | .out_data(out_data[i])
25 | );
26 | end
27 |
28 | endgenerate
29 |
30 | always @(posedge clk) begin
31 | //b <= out_data;
32 | out_id <= in_id;
33 | end
34 |
35 | endmodule
36 |
37 | module relu_forward_opt #(parameter negative_slope = 0)
38 | (
39 | input logic reset_n, //reset
40 | input logic clk_en,
41 | input logic clk, //clock signal
42 | input logic [31:0] in_data, //data vector of floats
43 | output reg [31:0] out_data //data vector of floats
44 | );
45 |
46 | reg [31:0] b;
47 | floating_mult floating_mult_inst(
48 | .clk_en(clk_en),
49 | .clock(clk),
50 | .dataa(in_data),
51 | .datab(b),
52 | .result(out_data)
53 | );
54 |
55 | always @(posedge clk) begin
56 | if (in_data[31] == 0) begin
57 | //if positive, multiply input by 1 (don't change)
58 | b <= 32'b00111111100000000000000000000000; //IEEE-754 single-precision 1.0
59 | end else begin
60 | b <= negative_slope;
61 | end
62 | //b <= out_data;
63 | end
64 |
65 | endmodule
66 |
--------------------------------------------------------------------------------
/rtl/relu_forward_tb.sv:
--------------------------------------------------------------------------------
1 | `timescale 1ns/100ps
2 |
3 | module relu_forward_tb();
4 | `include "/nfs/stak/students/z/zhangso/ECE441/relu_forward/test_data/relu_forward_test_data.vh"
5 | parameter CYCLE = 5; //clk period: 5 ns = 200 MHz signal
6 | parameter NEG_SLOPE = 0.0; //parameter negative slope
7 | parameter WIDTH = 8; //width of the input and output vectors
8 |
9 | parameter NUM_TESTS = 4000; //number of test iterations
10 | parameter MEM_SIZE = NUM_TESTS*WIDTH;
11 |
12 | reg clk, reset;
13 | reg clk_en;
14 | reg [31:0] in_data [WIDTH-1:0]; //input vec to module
15 | reg [31:0] out_data [WIDTH-1:0]; //output vec from module
16 | int i, j, num_errors;
17 |
18 | //initialize clk
19 | initial begin
20 | clk = 0;
21 | clk_en = 1; //enable the DUT; otherwise clk_en is X in simulation
22 | end
23 |
24 | //forever cycle the clk
25 | always begin
26 | #(CYCLE/2.0) clk = ~clk;
27 | end
28 |
29 | //instantiate the module
30 | relu_forward #(.negative_slope(NEG_SLOPE), .WIDTH(WIDTH) )
31 | relu( .reset_n(reset), .clk(clk), .clk_en(clk_en), .in_data(in_data), .out_data(out_data) );
32 |
33 | initial begin
34 | reset = 0;
35 | num_errors = 0;
36 | //for all test cases
37 | for (i = 0; i < MEM_SIZE; i = i+(WIDTH)) begin
38 | //for each value in input vector
39 | for (j = 0; j < WIDTH; j++) begin
40 | //use test input value as input
41 | in_data[j] = test_input[i+j];
42 | end
43 | //wait for it...
44 | #(5*CYCLE) //5*CYCLE due to multiplication
45 | //for each value in output vector (same size as input)
46 | for (j = 0; j < WIDTH; j++) begin
47 | //check output of module against value calculated by Python
48 | $display("output: %h\tcalculated:%h", out_data[j], test_output[i+j]);
49 | assert( out_data[j] == test_output[i+j] );
50 | //if we were wrong, increase error count
51 | if( out_data[j] != test_output[i+j] ) begin
52 | num_errors++;
53 | end
54 | end
55 | end
56 | $display("############################################\n");
57 | $display("Testing complete!\n");
58 | $display("%d of %d output values passed!\n", MEM_SIZE-num_errors, MEM_SIZE);
59 | $display("(%f percent)\n", 100.0*(MEM_SIZE-num_errors)/MEM_SIZE);
60 | $display("############################################\n");
61 | end
62 | endmodule
63 |
--------------------------------------------------------------------------------
/test/conv_forward_tests_header.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import csv
4 | import random
5 | import struct
6 | import argparse
7 | import numpy as np
8 |
9 | data_file_name = 'test_data/conv_forward_test_data.vh'
10 |
11 | # convert floating point value to hex value
12 | def float_to_hex(f):
13 | return format(struct.unpack(' 0, output = input
98 | if (input_val > 0):
99 | output_val = input_val
100 | # else, output = NEGATIVE_SLOPE (usually 0)
101 | else:
102 | output_val = NEGATIVE_SLOPE
103 | # add to vectors
104 | input_vec.append( input_val )
105 | output_vec.append( output_val )
106 | f.writerow( build_data_line( 'test_input', input_vec, i, 'hex' ) )
107 | f.writerow( build_data_line( 'test_output', output_vec, i, 'hex' ) )
108 | # for debugging/sanity check..
109 | if (DEBUG):
110 | f.writerow( ["/*############ DEBUG ############"] )
111 | f.writerow( build_data_line( 'test_input', input_vec, i, 'float' ) )
112 | f.writerow( build_data_line( 'test_output', output_vec, i, 'float' ) )
113 | f.writerow( ["############ END DEBUG ############*/"] )
114 | # end the 'initial begin' statement
115 | f.writerow( ['end'] )
116 | # add endif statement
117 | f.writerow( ['`endif'] )
118 |
119 |
120 | if __name__ == '__main__':
121 | main()
122 |
--------------------------------------------------------------------------------
/test/relu_forward_tests_header.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import csv
4 | import random
5 | import struct
6 | import argparse
7 | import numpy as np
8 |
9 | data_file_name = 'test_data/relu_forward_test_data.vh'
10 |
11 | # convert floating point value to hex value
12 | def float_to_hex(f):
13 | return format(struct.unpack(' 0, output = input
91 | if (input_vec[j] > 0):
92 | output_vec.append( input_vec[j] )
93 | # else, output = input * NEGATIVE_SLOPE (usually 0)
94 | else:
95 | output_vec.append( input_vec[j]*NEGATIVE_SLOPE )
96 | f.writerow( build_data_line( 'test_input', input_vec, i, 'hex' ) )
97 | f.writerow( build_data_line( 'test_output', output_vec, i, 'hex' ) )
98 | # for debugging/sanity check..
99 | if (DEBUG):
100 | f.writerow( ["/*############ DEBUG ############"] )
101 | f.writerow( build_data_line( 'test_input', input_vec, i, 'float' ) )
102 | f.writerow( build_data_line( 'test_output', output_vec, i, 'float' ) )
103 | f.writerow( ["############ END DEBUG ############*/"] )
104 |
105 | # end the 'initial begin' statement
106 | f.writerow( ['end'] )
107 | # add endif statement
108 | f.writerow( ['`endif'] )
109 |
110 |
111 | if __name__ == '__main__':
112 | main()
113 |
--------------------------------------------------------------------------------
/test/softmax_with_loss_tests_header.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/python
2 |
3 | import csv
4 | import random
5 | import struct
6 | import argparse
7 | import numpy as np
8 |
9 | data_file_name = 'test_data/softmax_with_loss_test_data.vh'
10 |
11 | # convert floating point value to hex value
12 | def float_to_hex(f):
13 | return format(struct.unpack('> $CAFFE_CONFIG
29 |
30 | ##############################################
31 | ##### Install Dependencies
32 | ##############################################
33 | # create dependencies dir if not exist
34 | mkdir -p $BASE/dependencies
35 |
36 | ##############################################
37 | ##### install boost 1.59.0
38 | ##############################################
39 | cd $BASE/dependencies
40 | mkdir -p boost
41 | cd boost
42 | wget http://sourceforge.net/projects/boost/files/boost/1.59.0/boost_1_59_0.tar.gz
43 | tar -zxvf boost_1_59_0.tar.gz
44 | cd boost_1_59_0
45 | ./bootstrap.sh --prefix=${PWD}
46 | ./b2 install -j${NUM_CORES}
47 |
48 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
49 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
50 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
51 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
52 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
53 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
54 |
55 | ##############################################
56 | ##### install protobuf
57 | ##############################################
58 | cd $BASE/dependencies
59 | # build protobuf
60 | git clone https://github.com/google/protobuf.git
61 | cd protobuf
62 | ./autogen.sh
63 | ./configure --prefix=${PWD} && make -j${NUM_CORES} && make install
64 |
65 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
66 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
67 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
68 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
69 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
70 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
71 |
72 | ##############################################
73 | ##### install snappy
74 | ##############################################
75 | cd $BASE/dependencies
76 | mkdir -p snappy
77 | cd snappy
78 | wget https://snappy.googlecode.com/files/snappy-1.1.1.tar.gz
79 | tar -xzvf snappy-1.1.1.tar.gz
80 | cd snappy-1.1.1
81 | ./configure --prefix=${PWD} && make -j${NUM_CORES} && make install
82 |
83 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
84 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
85 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
86 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
87 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
88 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
89 |
90 | ##############################################
91 | ##### install gflags
92 | ##############################################
93 | cd $BASE/dependencies
94 | mkdir -p gflags
95 | cd gflags
96 | wget https://gflags.googlecode.com/files/gflags-2.0-no-svn-files.tar.gz
97 | tar -xzvf gflags-2.0-no-svn-files.tar.gz
98 | cd gflags-2.0
99 | ./configure --prefix=${PWD} && make -j${NUM_CORES} && make install
100 |
101 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
102 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
103 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
104 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
105 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
106 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
107 |
108 | ##############################################
109 | ##### install glog
110 | ##############################################
111 | cd $BASE/dependencies
112 | mkdir -p glog
113 | cd glog
114 | wget https://google-glog.googlecode.com/files/glog-0.3.3.tar.gz
115 | tar zxvf glog-0.3.3.tar.gz
116 | cd glog-0.3.3
117 | ./configure --prefix=${PWD} && make -j${NUM_CORES} && make install
118 |
119 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
120 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
121 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
122 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
123 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
124 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
125 |
126 | ##### install lmdb
127 | cd $BASE/dependencies
128 | git clone https://github.com/LMDB/lmdb
129 | cd lmdb/libraries/liblmdb
130 | sed -i 's_/usr/local_._g' Makefile
131 | make -j${NUM_CORES} && make install
132 |
133 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
134 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
135 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
136 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
137 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
138 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
139 |
140 | ##############################################
141 | ##### install hdf5
142 | ##############################################
143 | cd $BASE/dependencies
144 | mkdir -p hdf5
145 | cd hdf5
146 | wget http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.16.tar.gz
147 | tar -zxvf hdf5-1.8.16.tar.gz
148 | cd hdf5-1.8.16
149 | ./configure --prefix=${PWD} && make -j${NUM_CORES} && make install
150 |
151 | echo "export LD_LIBRARY_PATH=\"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.bashrc
152 | echo "export PATH=\"${PWD}/bin:"'${PATH}"' >> ~/.bashrc
153 | echo "setenv LD_LIBRARY_PATH \"${PWD}/lib:"'${LD_LIBRARY_PATH}"' >> ~/.cshrc
154 | echo "setenv PATH \"${PWD}/bin:"'${PATH}"' >> ~/.cshrc
155 | echo "LIBRARY_DIRS += ${PWD}/lib" >> $CAFFE_CONFIG
156 | echo "INCLUDE_DIRS += ${PWD}/include" >> $CAFFE_CONFIG
157 |
158 | echo
159 | echo "########################################################"
160 | echo "Please source your ~/.bashrc file and/or ~/.cshrc file"
161 | echo "i.e.: source ~/.bashrc "
162 | echo "########################################################"
163 |
164 |
--------------------------------------------------------------------------------
/tools/nvidia_smi_command.sh:
--------------------------------------------------------------------------------
1 | #!/bin/sh
2 |
3 | # this command is used to query the Nvidia GPUs for various metrics
4 |
5 | nvidia-smi --query-gpu=timestamp,index,pstate,memory.total,memory.used,memory.free,utilization.gpu,utilization.memory,power.draw,power.limit,clocks.gr,clocks.sm,clocks.mem,clocks.applications.gr,clocks.applications.mem,gpu_uuid -l 10 -f gpu_metric_dump.csv --format=csv,nounits
6 |
--------------------------------------------------------------------------------