├── .gitignore
├── README.md
├── build_ice40.sh
├── build_simulation.sh
├── examples
    ├── average.png
    ├── diff.png
    ├── fruits_add_threshold.png
    ├── fruits_edge_detection.png
    ├── fruits_gaussian.png
    ├── fruits_mult.png
    ├── image_fruits_128.png
    ├── peppers128.png
    ├── peppers_add_threshold.png
    ├── peppers_edge_detection.png
    ├── peppers_gaussian.png
    └── peppers_mult.png
├── hdl
    └── image_processing.v
├── ice40
    ├── hdl
    │   ├── Makefile
    │   ├── io.pcf
    │   ├── ram_interface.v
    │   ├── spi_interface.v
    │   └── top.v
    └── software
    │   ├── image_processing_ice40.cpp
    │   ├── image_processing_ice40.hpp
    │   └── spi_lib
    │       ├── spi_lib.c
    │       └── spi_lib.h
├── run_gnuplot.sh
├── simulation
    ├── image_processing_simulation.cpp
    └── image_processing_simulation.hpp
└── software
    ├── image_processing.hpp
    ├── images
        ├── image_fruits_64.h
        ├── image_fruits_8.h
        └── image_sequential.h
    └── main.cpp


/.gitignore:
--------------------------------------------------------------------------------
obj_dir/
simu
output.dat
soft_ice40

*.asc
*.bin
*.blif

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# FPGA Image Processing

Implementation of simple image processing operations in verilog. This project revolves around a central image processing module, `image_processing.v`, which can be included either in a simulation environment using verilator or in a `top.v` for the ice40 Ultraplus fpga. Both cases are implemented, in the `simulation/` and `ice40/` folders respectively.

It is targeted at low end fpga devices (both in price and power consumption) such as the ice40 ultraplus. It uses 1Mbit of ram to store the images in
two buffers, the input buffer and the storage buffer.
Images are loaded and read in the input buffer, while the calculations are done on the storage buffer. The two buffers can be swapped.
Most operations are done in the storage buffer. If an operation is applied to both images (for example binary_add), the resulting image is written to the storage buffer.

The images are stored in a .h file (exported with gimp).

### Operations
- per pixel ops
   - add/sub
   - invert
   - threshold
   - mult/div
- 3x3 matrix convolution
   - apply convolution on storage buffer, writes result back
   - apply convolution on input buffer, adds result with storage buffer.
- binary operation (input *op* buffer)
   - add
   - sub
   - mult
- switch input and buffer
- load image to input
- read image from input

### Examples

Example images (see the `examples/` folder):
- Original
- Add & threshold
- Multiplication 0.5x
- Gaussian blur
- Edge detection
- Average: 0.5\*fruits + 0.5\*peppers
- Difference: abs(fruits-peppers)
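
As a rough software model of the "Multiplication 0.5x" example, the sketch below mirrors the `apply_clamp_fixed16` logic found in `hdl/image_processing.v`. It assumes the 8-bit fixed point format with 4 fractional bits used by `COMMAND_APPLY_MULT`, so 0.5 is encoded as 8. The names `encode_q34` and `apply_mult_pixel` are hypothetical and not part of the repository, and only non-negative multipliers are handled because the HDL zero-extends the multiplier for this command.

```cpp
#include <cassert>
#include <cstdint>

// 4 fractional bits: 1.0 is encoded as 16 and 0.5 as 8.
constexpr int kFractionBits = 4;

// Hypothetical helper: encodes a non-negative multiplier such as 0.5.
std::uint8_t encode_q34(double value) {
    assert(value >= 0.0 && value < 8.0);
    return static_cast<std::uint8_t>(value * (1 << kFractionBits) + 0.5);
}

// Software model of one COMMAND_APPLY_MULT pixel (assumption: follows
// apply_clamp_fixed16 as it appears in the HDL).
std::uint8_t apply_mult_pixel(std::uint8_t pixel, std::uint8_t mult, bool clamp) {
    // 8bit x 8bit product kept on 16 bits, like temp_calc in the HDL.
    std::uint16_t product = static_cast<std::uint16_t>(mult * pixel);
    // Dropping the 4 fractional bits leaves the 12-bit value in[15:4].
    int scaled = product >> kFractionBits;
    if (scaled >= 2048) {
        scaled -= 4096;  // the HDL interprets those 12 bits as signed
    }
    if (clamp && scaled > 255) {
        return 255;
    }
    if (clamp && scaled < 0) {
        return 0;
    }
    return static_cast<std::uint8_t>(scaled & 0xFF);  // wraps when not clamped
}
```

With these definitions, `apply_mult_pixel(200, encode_q34(0.5), true)` yields 100, while a 2.0x multiplier saturates the same pixel at 255 when clamping is on and wraps to 144 when it is off.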

### Commands

The image processing module receives messages made of 8-bit values.
A message is composed of a 1-byte command code followed by its parameters, which can be of variable length (from 0 to n bytes).

The commands are listed as an enum in `software/image_processing.hpp`

| Command name | Content | Description |
|-----|-----|-----|
|COMMAND_PARAM | byte0: width LSB<br>byte1: width MSB<br>byte2: height LSB<br>byte3: height MSB | This is the first command to be sent to the IP module. It does some init and sets the image size. Sizes are given as unsigned 16-bit numbers, in little endian |
|COMMAND_SEND_IMG | width*height bytes data | Sends the image to the module, the image will be stored in the input buffer |
|COMMAND_GET_STATUS | none | Returns some status on the module. 4 bytes are returned, of which only bit0 of byte0 is useful (for now): it tells if the module is busy or not |
|COMMAND_READ_IMG | none | The host receives the image data from the input buffer, which is image_width*image_height bytes. The sizes are the ones given with COMMAND_PARAM |
|COMMAND_APPLY_ADD | byte0: add value LSB<br>byte1: add value MSB<br>byte2: bit0: clamp | Asks the module to do an add, image+value. The value is a 16-bit signed value. The last parameter is the clamp: if set to one, the result is clamped between 0 and 255, which avoids wrap-around overflow |
|COMMAND_APPLY_THRESHOLD | byte0: thresholding value<br>byte1: replacement value<br>byte2: bit0: upper selection of threshold | Replaces pixels in the resulting image if they satisfy the thresholding. If upper selection is 1, every pixel >= thresh_val is set to the replacement value, otherwise every pixel <= thresh_val is replaced |
|COMMAND_SWITCH_BUFFERS | none | The images in the input and storage buffers are switched |
|COMMAND_BINARY_ADD | byte0: bit0: clamp | Adds the input pixels to the storage pixels and stores the result in storage. The clamp parameter prevents values from going over 255 or under 0 and wrapping around |
|COMMAND_APPLY_INVERT | none | Inverts the image in the storage buffer, inversion means that the new image will have pixels = 255 - storage_pixels |
|COMMAND_CONVOLUTION | byte0: bit0: clamp, bit1: source input, bit2: add to result<br>bytes 1 to 9: 3x3 convolution matrix, each byte an 8-bit fixed point value, in row major format (first 3 bytes are for the first row) | Does a convolution of the image with the 3x3 matrix. The border pixels are not evaluated (set to 0). The format of the kernel matrix is 8-bit fixed point: 1 sign bit, 3 integer bits, 4 fractional bits.<br>The convolution is normally applied on the image in the storage buffer and the result is written to the same buffer. However, the convolution can be applied on the input buffer by setting the source_input bit. By setting the add_to_result bit, the convolution result is added to the content of the storage buffer instead of overwriting it; this is useful for edge detection (multiple kernels on different gradient orientations) |
|COMMAND_BINARY_SUB | byte0: bit0: clamp, bit1: absolute difference | Does image input - image storage and stores the result in storage. If bit1 is set, does the absolute difference between input and storage |
|COMMAND_BINARY_MULT | byte0: bit0: clamp | Does input*storage and stores the image in storage |
|COMMAND_APPLY_MULT | byte0: fixed point multiplication value to be applied on the image<br>byte1: bit0: clamp | Multiplies the storage buffer by the value given in fixed point format (1 sign bit, 3 integer bits, 4 fractional bits) and stores the result in storage |

### Fixed point calculation

Operations such as multiplication or convolution require real numbers. For example the gaussian blur
is implemented using a kernel whose sum is one, so each element of the kernel should be <= 1.0.

Real numbers are represented as fixed point numbers on 8 bits.
The first bit is used as the sign bit, 3 bits for the integer part and 4 for the fraction.
This means the values can go from about -8 to just under 8 (the largest magnitude with 3 integer and 4 fractional bits is 7.9375) with a precision of 0.0625 (1/2^4).

### Architecture

Two modes of operation are possible: simulation with verilator or running on an ice40 fpga.

The files specific to these two modes are located in the `simulation/` and `ice40/` folders.

The two implementations of the image processing interface `software/image_processing.hpp` reflect this architecture by communicating either
with the verilator class or with the ice40 fpga via SPI.
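
The build scripts pass `-DSIMULATION` or `-DICE40` to the compiler, which suggests that the backend is chosen at compile time. The sketch below illustrates that kind of selection. The two structs are hypothetical stand-ins rather than the real classes, and falling back to the simulation when `ICE40` is not defined is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two backends. The real classes live in
// simulation/image_processing_simulation.hpp and
// ice40/software/image_processing_ice40.hpp and are not reproduced here.
struct SimulationBackend {
    std::string name() const { return "simulation"; }
};

struct Ice40Backend {
    std::string name() const { return "ice40"; }
};

// build_ice40.sh compiles with -DICE40, build_simulation.sh with -DSIMULATION.
// Assumption: anything that is not an ICE40 build uses the simulation.
#if defined(ICE40)
using Backend = Ice40Backend;
#else
using Backend = SimulationBackend;
#endif

// The application code only sees the Backend alias, so the same main.cpp
// can be built for either target.
std::string selected_backend() {
    return Backend{}.name();
}
```

Compiled without `-DICE40`, `selected_backend()` returns `"simulation"`.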

```
      +---------------------+        +-----------------------+
      |                     |        |                       |
      |                     |        |                       |
      |      main.cpp       +--------+ Image_processing.hpp  |
      |                     |        |                       |
      |                     |        |                       |
      +---------------------+        +---------+-------------+
                                               ^
                                               |
                                  +------------+-----------+
                                  |                        |
+----------------+      +---------+-------------+  +-------+-----------+              +-----------+
|                |      |                       |  |                   |              |           |
|   Verilator    |      |  IP_simulation.hpp    |  |   IP_ice40.hpp    |     SPI      |   FPGA    |
|   Simulation   +------+  IP_simulation.cpp    |  |   IP_ice40.cpp    +--------------+   Ice40   |
|    obj_dir/    |      |                       |  |                   |              |           |
|                |      |                       |  |                   |              |           |
+----------------+      +-----------------------+  +-------------------+              +-----------+

```

### ice40 SPI communication

In ice40 mode, the host software communicates with the ice40 over the SPI interface to send commands and receive data.

The linux host computer uses the ftdi lib to communicate with the ftdi chip on the breakout board. The ice40 uses its hardware SPI module.

As seen from the programmer, an SPI packet consists of one SPI command followed by one byte. The SPI command is read by the SPI module and the byte is forwarded to the IP module. The exception to this rule is sending and reading images, where the SPI packets are bigger to increase throughput.

The SPI commands are different from the image processing commands.
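
To make the "one SPI command, one byte" framing concrete, here is a minimal sketch of how a `COMMAND_PARAM` message could be split into SPI packets using the `SPI_SEND_CMD` and `SPI_SEND_DATA` commands described in this section. The opcode value 0 comes from the `COMMAND_*` parameter list in `hdl/image_processing.v`. The `SpiCommand` enum, the `SpiPacket` alias and `frame_command_param` are hypothetical, and the numeric encoding of the SPI commands is deliberately left out because the README does not give it.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Symbolic SPI commands only; their wire values are not assumed here.
enum class SpiCommand { SendCmd, SendData };

// Image processing opcode, as listed in hdl/image_processing.v.
constexpr std::uint8_t COMMAND_PARAM = 0;

// One SPI packet: an SPI command plus the byte forwarded to the IP module.
using SpiPacket = std::pair<SpiCommand, std::uint8_t>;

// Frames COMMAND_PARAM (width and height as little-endian 16-bit values)
// as a sequence of "one SPI command, one byte" packets.
std::vector<SpiPacket> frame_command_param(std::uint16_t width, std::uint16_t height) {
    return {
        {SpiCommand::SendCmd, COMMAND_PARAM},
        {SpiCommand::SendData, static_cast<std::uint8_t>(width & 0xFF)},
        {SpiCommand::SendData, static_cast<std::uint8_t>(width >> 8)},
        {SpiCommand::SendData, static_cast<std::uint8_t>(height & 0xFF)},
        {SpiCommand::SendData, static_cast<std::uint8_t>(height >> 8)},
    };
}
```

For a 128x64 image this produces five packets: the command itself, then the data bytes 128, 0, 64 and 0.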

Here are the possible SPI commands:
- SPI_INIT: first command to be sent to the fpga, should contain {0, 0, 0x11} as body
- SPI_READ_DATA: reads two bytes from the fpga, the first one is the status (its first bit is high if there is a valid value), the second one is the data itself
- SPI_SEND_CMD: sends an image processing command (such as COMMAND_APPLY_ADD) to the fpga
- SPI_SEND_DATA: sends a byte to the fpga, normally after a command, for its parameters
- SPI_READ_DATA32: reads 32 bytes from the fpga (first byte is status)
- SPI_SEND_DATA32: sends 32 bytes to the fpga (used for sending images)

# Build & run

### Simulation

`./build_simulation.sh` Builds the simulation and the C++ program

`./simu` Runs the simulator

`./run_gnuplot.sh` to display the output (`output.dat`) in gnuplot

### ice40

Tested with an ice40 ultraplus on a breakout board

In `ice40/hdl/` create the bitstream by calling `make`

Call `make prog` to program the `top.v` bitstream on the fpga

To compile the host software, call `./build_ice40.sh` in the root

Call `./soft_ice40`, which will communicate with the fpga (send images, receive them)

As with the simulation, `./run_gnuplot.sh` to display the output (`output.dat`) in gnuplot

## Needed tools

- Verilator 3.874
- gnuplot 5.0
- yosys 0.9
- FTDI lib to build the ice40 host software (libftdi-lib0.20)

--------------------------------------------------------------------------------
/build_ice40.sh:
--------------------------------------------------------------------------------
echo "TODO"

g++ -DICE40 ice40/software/spi_lib/spi_lib.c ice40/software/image_processing_ice40.cpp software/main.cpp -o soft_ice40 -lftdi
# g++ -DICE40 ice40/software/spi_lib/spi_lib.c ice40/software/image_processing_ice40.cpp -o soft_ice40

-------------------------------------------------------------------------------- /build_simulation.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e 3 | #compile simple_cpu 4 | 5 | if [ -d "obj_dir" ]; then 6 | rm -r obj_dir 7 | fi 8 | 9 | # to create the obj dir: 10 | verilator -Wall --cc hdl/image_processing.v --exe simulation/image_processing_simulation.cpp software/main.cpp 11 | 12 | # to compile: 13 | make CXXFLAGS="-g -DSIMULATION" -j -C obj_dir -f Vimage_processing.mk Vimage_processing 14 | 15 | cp obj_dir/Vimage_processing simu 16 | -------------------------------------------------------------------------------- /examples/average.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/average.png -------------------------------------------------------------------------------- /examples/diff.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/diff.png -------------------------------------------------------------------------------- /examples/fruits_add_threshold.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/fruits_add_threshold.png -------------------------------------------------------------------------------- /examples/fruits_edge_detection.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/fruits_edge_detection.png -------------------------------------------------------------------------------- 
/examples/fruits_gaussian.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/fruits_gaussian.png -------------------------------------------------------------------------------- /examples/fruits_mult.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/fruits_mult.png -------------------------------------------------------------------------------- /examples/image_fruits_128.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/image_fruits_128.png -------------------------------------------------------------------------------- /examples/peppers128.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/peppers128.png -------------------------------------------------------------------------------- /examples/peppers_add_threshold.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/peppers_add_threshold.png -------------------------------------------------------------------------------- /examples/peppers_edge_detection.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/peppers_edge_detection.png -------------------------------------------------------------------------------- /examples/peppers_gaussian.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/peppers_gaussian.png -------------------------------------------------------------------------------- /examples/peppers_mult.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/damdoy/fpga_image_processing/b1d7480cd804e53a53af48e95850bbf61088f40f/examples/peppers_mult.png -------------------------------------------------------------------------------- /hdl/image_processing.v: -------------------------------------------------------------------------------- 1 | /* verilator lint_off UNUSED */ 2 | 3 | module image_processing( 4 | clk, reset, 5 | 6 | //memory access 7 | addr, wr_en, rd_en, data_read, data_read_valid, data_write, 8 | 9 | //comm module access 10 | comm_cmd, comm_data_in, comm_data_in_valid, comm_data_out, comm_data_out_valid, comm_data_out_free 11 | ); 12 | 13 | input wire clk; 14 | input wire reset; 15 | 16 | //memory interface 17 | output reg [31:0] addr; 18 | output reg wr_en; 19 | output reg rd_en; 20 | output reg [15:0] data_write; 21 | input wire [15:0] data_read; 22 | input wire data_read_valid; 23 | 24 | //comm module interface 25 | input wire [7:0] comm_cmd; 26 | input wire [7:0] comm_data_in; 27 | input wire comm_data_in_valid; 28 | output reg [7:0] comm_data_out; 29 | output reg comm_data_out_valid; 30 | input wire comm_data_out_free; 31 | 32 | parameter STATE_IDLE = 0, STATE_WAIT_COMMAND = STATE_IDLE+1, STATE_READ_COMMAND_PARAM_WIDTH=STATE_WAIT_COMMAND+1, 33 | STATE_READ_COMMAND_PARAM_HEIGHT = STATE_READ_COMMAND_PARAM_WIDTH+1, STATE_SEND_IMG = STATE_READ_COMMAND_PARAM_HEIGHT+1, 34 | STATE_READ_IMG = STATE_SEND_IMG+1, STATE_GET_STATUS = STATE_READ_IMG+1, 35 | STATE_APPLY_ADD_READ_PARAM = STATE_GET_STATUS+1, STATE_APPLY_ADD = STATE_APPLY_ADD_READ_PARAM+1, 36 | STATE_THRESHOLD_READ_PARAM 
= STATE_APPLY_ADD+1, STATE_PROC_THRESHOLD = STATE_THRESHOLD_READ_PARAM+1, 37 | STATE_BINARY_ADD_READ_PARAM = STATE_PROC_THRESHOLD+1, STATE_PROC_BINARY = STATE_BINARY_ADD_READ_PARAM+1, 38 | STATE_APPLY_INVERT_READ_PARAM = STATE_PROC_BINARY+1, STATE_PROC_UNARY = STATE_APPLY_INVERT_READ_PARAM+1, 39 | STATE_CONVOLUTION_READ_PARAM = STATE_PROC_UNARY+1, STATE_PROC_CONVOLUTION = STATE_CONVOLUTION_READ_PARAM+1, 40 | STATE_BINARY_SUB_READ_PARAM = STATE_PROC_CONVOLUTION+1, STATE_BINARY_MULT_READ_PARAM = STATE_BINARY_SUB_READ_PARAM+1, 41 | STATE_APPLY_MULT_READ_PARAM = STATE_BINARY_MULT_READ_PARAM+1, STATE_APPLY_MULT = STATE_APPLY_MULT_READ_PARAM+1, 42 | STATE_PROC_CONVOLUTION_CALCULATION = STATE_APPLY_MULT+1, STATE_PROC_CONVOLUTION_WRITEBACK_1 = STATE_PROC_CONVOLUTION_CALCULATION+1, 43 | STATE_PROC_CONVOLUTION_WRITEBACK_2 = STATE_PROC_CONVOLUTION_WRITEBACK_1+1; 44 | 45 | //only works in systemverilog 46 | // enum bit [7:0] {STATE_IDLE, STATE_WAIT_COMMAND, STATE_READ_COMMAND_PARAM_WIDTH, 47 | // STATE_READ_COMMAND_PARAM_HEIGHT, STATE_SEND_IMG, 48 | // STATE_READ_IMG, STATE_GET_STATUS, STATE_APPLY_ADD_READ_PARAM, STATE_APPLY_ADD, 49 | // STATE_THRESHOLD_READ_PARAM, STATE_PROC_THRESHOLD, 50 | // STATE_BINARY_ADD_READ_PARAM, STATE_PROC_BINARY, 51 | // STATE_APPLY_INVERT_READ_PARAM, STATE_PROC_UNARY, 52 | // STATE_APPLY_POW_READ_PARAM, 53 | // STATE_CONVOLUTION_READ_PARAM, STATE_PROC_CONVOLUTION, 54 | // STATE_BINARY_SUB_READ_PARAM, STATE_BINARY_MULT_READ_PARAM, 55 | // STATE_APPLY_MULT_READ_PARAM, STATE_APPLY_MULT} States; 56 | 57 | reg [7:0] state; 58 | 59 | reg [7:0] state_processing; 60 | reg [7:0] processing_command; 61 | 62 | 63 | parameter COMMAND_PARAM = 0, COMMAND_SEND_IMG = COMMAND_PARAM+1, COMMAND_READ_IMG = COMMAND_SEND_IMG+1, 64 | COMMAND_GET_STATUS = COMMAND_READ_IMG+1, COMMAND_APPLY_ADD = COMMAND_GET_STATUS+1, COMMAND_APPLY_THRESHOLD = COMMAND_APPLY_ADD+1, 65 | COMMAND_SWITCH_BUFFERS = COMMAND_APPLY_THRESHOLD+1, COMMAND_BINARY_ADD = COMMAND_SWITCH_BUFFERS+1, 66 | 
COMMAND_APPLY_INVERT = COMMAND_BINARY_ADD+1, 67 | COMMAND_CONVOLUTION = COMMAND_APPLY_INVERT+1, COMMAND_BINARY_SUB = COMMAND_CONVOLUTION+1, COMMAND_BINARY_MULT = COMMAND_BINARY_SUB+1, 68 | COMMAND_APPLY_MULT = COMMAND_BINARY_MULT+1; 69 | 70 | //systemverilog enum 71 | // enum bit [7:0] { 72 | // COMMAND_PARAM, COMMAND_SEND_IMG, COMMAND_READ_IMG, 73 | // COMMAND_GET_STATUS, COMMAND_APPLY_ADD, COMMAND_APPLY_THRESHOLD,COMMAND_SWITCH_BUFFERS, COMMAND_BINARY_ADD, 74 | // COMMAND_APPLY_INVERT, COMMAND_APPLY_POW, COMMAND_APPLY_SQRT,COMMAND_CONVOLUTION, COMMAND_BINARY_SUB, COMMAND_BINARY_MULT, 75 | // COMMAND_APPLY_MULT 76 | // } Commands; 77 | 78 | //128KB memory available on spram 79 | //2*64KB 80 | //256*256 images assuming 1B/pixel 81 | parameter MEMORY_SIZE = 1024*128; 82 | parameter BUFFER_SIZE = MEMORY_SIZE/2; 83 | parameter BUFFER2_LOCATION = MEMORY_SIZE/2; 84 | 85 | parameter CONVOLUTION_LINE_MAX_SIZE = 256; 86 | 87 | ////////local reg 88 | reg [15:0] counter_read; 89 | reg [31:0] memory_addr_counter; 90 | reg [15:0] mem_data_buffer; 91 | reg mem_data_buffer_full; 92 | reg [15:0] buffer_read; 93 | 94 | reg [15:0] proc_counter_read; 95 | reg [31:0] proc_memory_addr_counter; 96 | reg [31:0] proc_conv_memory_addr_read; 97 | reg [31:0] proc_conv_memory_addr_write; 98 | 99 | reg binary_read_buffer; 100 | reg operation_step; 101 | 102 | //will keep the previous lines for convolution 103 | reg [7:0] convolution_buffer [0:CONVOLUTION_LINE_MAX_SIZE-1][0:1]; 104 | 105 | //a 4x3 matrix for current calculation 106 | reg [7:0] convolution_buffer_local [0:3][0:2]; 107 | reg [7:0] matrix_convolution_counter; 108 | reg [15:0] calc_left_buf; 109 | reg [15:0] calc_right_buf; 110 | 111 | reg [15:0] counter_convolution_x; 112 | reg [15:0] counter_convolution_y; 113 | 114 | //since we have to read in advance, there is a slight offset between the read and write 115 | reg [15:0] counter_convolution_x_write; 116 | reg [15:0] counter_convolution_y_write; 117 | 118 | //buffers to keep last 
reads, as we need 3x3 matrices but only have two lines of buffers 119 | reg [15:0] convolution_previous_read[1:0]; 120 | reg [15:0] convolution_previous_read_counter_x[1:0]; 121 | reg [15:0] convolution_previous_read_counter_y[1:0]; 122 | reg [7:0] convolution_last_calculation; 123 | 124 | reg [1:0] convolution_reading_data; 125 | reg [15:0] convolution_data_to_add; 126 | reg [15:0] data_read_store; 127 | 128 | //params 129 | reg [15:0] img_width; 130 | reg [15:0] img_height; 131 | 132 | reg [15:0] add_value; 133 | reg clamp; 134 | reg absolute_diff; 135 | reg convolution_source_input; 136 | reg convolution_add_to_result; 137 | 138 | reg [7:0] threshold_value; 139 | reg [7:0] threshold_replacement; 140 | reg threshold_upper; 141 | 142 | reg [31:0] buffer_input_address; 143 | reg [31:0] buffer_storage_address; 144 | 145 | reg [15:0] convolution_matrix [0:8]; //3x3 matrix 146 | reg [7:0] convolution_params; 147 | 148 | reg [7:0] mult_value_param; 149 | 150 | //returns 8bit value from 16bit with possible clamping 151 | function [7:0] apply_clamp; 152 | input [15:0] in; 153 | input clamp_en; 154 | begin 155 | apply_clamp = in[7:0]; 156 | if(clamp_en == 1 && $signed(in) > 255)begin 157 | apply_clamp = 255; 158 | end 159 | if(clamp_en == 1 && $signed(in) < 0) begin 160 | apply_clamp = 0; 161 | end 162 | end 163 | endfunction 164 | 165 | //same as apply_clamp but with a fixed point value, acts as rounding by taking the [11:4] bits 166 | function [7:0] apply_clamp_fixed16; 167 | input [15:0] in; 168 | input clamp_en; 169 | begin 170 | apply_clamp_fixed16 = in[11:4]; 171 | if(clamp_en == 1 && $signed(in[15:4]) > 255)begin 172 | apply_clamp_fixed16 = 255; 173 | end 174 | if(clamp_en == 1 && $signed(in[15:4]) < 0) begin 175 | apply_clamp_fixed16 = 0; 176 | end 177 | end 178 | endfunction 179 | 180 | initial begin 181 | //outputs reg 182 | addr = 32'b0; 183 | wr_en = 0; 184 | rd_en = 0; 185 | data_write = 16'b0; 186 | 187 | mem_data_buffer_full = 0; 188 | comm_data_out = 8'b0; 
189 | comm_data_out_valid = 0; 190 | 191 | proc_counter_read = 0; 192 | proc_memory_addr_counter = 0; 193 | 194 | binary_read_buffer = 0; 195 | 196 | state = STATE_WAIT_COMMAND; 197 | state_processing = STATE_IDLE; 198 | 199 | counter_read = 0; 200 | 201 | buffer_input_address = 0; 202 | //doesn't want to be init at this value ==> do it in init state 203 | // buffer_storage_address = BUFFER2_LOCATION; 204 | buffer_storage_address = 0; 205 | 206 | img_width = 0; 207 | img_height = 0; 208 | end 209 | 210 | always @(posedge clk) 211 | begin 212 | 213 | //default 214 | comm_data_out_valid <= 0; 215 | wr_en <= 0; 216 | rd_en <= 0; 217 | 218 | case (state) 219 | STATE_IDLE: begin 220 | end 221 | STATE_WAIT_COMMAND: begin 222 | if(comm_data_in_valid == 1) 223 | begin 224 | case (comm_cmd) 225 | COMMAND_PARAM: begin //also acts as init 226 | state <= STATE_READ_COMMAND_PARAM_WIDTH; 227 | counter_read <= 1; //will be used to read the 16bits 228 | buffer_storage_address <= BUFFER2_LOCATION; 229 | buffer_input_address <= 0; 230 | end 231 | COMMAND_SEND_IMG: begin 232 | state <= STATE_SEND_IMG; 233 | counter_read <= img_width*img_height; 234 | memory_addr_counter <= buffer_input_address; 235 | end 236 | COMMAND_READ_IMG: begin 237 | state <= STATE_READ_IMG; 238 | counter_read <= img_width*img_height; 239 | memory_addr_counter <= buffer_input_address; 240 | end 241 | COMMAND_GET_STATUS: begin 242 | state <= STATE_GET_STATUS; 243 | counter_read <= 3; 244 | end 245 | COMMAND_APPLY_ADD: begin 246 | state <= STATE_APPLY_ADD_READ_PARAM; 247 | counter_read <= 2; //read 16bit of parameters 248 | end 249 | COMMAND_APPLY_THRESHOLD: begin 250 | state <= STATE_THRESHOLD_READ_PARAM; 251 | counter_read <= 2; // read 3*8bits of params 252 | end 253 | COMMAND_SWITCH_BUFFERS: begin 254 | state <= STATE_WAIT_COMMAND; 255 | buffer_input_address <= buffer_storage_address; 256 | buffer_storage_address <= buffer_input_address; 257 | end 258 | COMMAND_BINARY_ADD: begin 259 | state <= 
STATE_BINARY_ADD_READ_PARAM; 260 | counter_read <= 0; 261 | end 262 | COMMAND_APPLY_INVERT: begin 263 | state <= STATE_WAIT_COMMAND; 264 | state_processing <= STATE_PROC_UNARY; 265 | processing_command <= COMMAND_APPLY_INVERT; 266 | counter_read <= 0; 267 | proc_counter_read <= img_width*img_height; 268 | proc_memory_addr_counter <= buffer_storage_address; 269 | end 270 | COMMAND_CONVOLUTION: begin 271 | state <= STATE_CONVOLUTION_READ_PARAM; 272 | counter_read <= 9; // will read 10 params 273 | end 274 | COMMAND_BINARY_SUB: begin 275 | state <= STATE_BINARY_SUB_READ_PARAM; 276 | counter_read <= 0; 277 | end 278 | COMMAND_APPLY_MULT: begin 279 | state <= STATE_APPLY_MULT_READ_PARAM; 280 | counter_read <= 1; 281 | end 282 | default: begin 283 | end 284 | endcase 285 | end 286 | end 287 | STATE_READ_COMMAND_PARAM_WIDTH: begin 288 | if(comm_data_in_valid == 1)begin 289 | if(counter_read == 1) begin 290 | img_width[7:0] <= comm_data_in; 291 | counter_read <= 0; 292 | end else begin 293 | img_width[15:8] <= comm_data_in; 294 | state <= STATE_READ_COMMAND_PARAM_HEIGHT; 295 | counter_read <= 1; 296 | end 297 | end 298 | end 299 | STATE_READ_COMMAND_PARAM_HEIGHT: begin 300 | if(comm_data_in_valid == 1)begin 301 | if(counter_read == 1) begin 302 | img_height[7:0] <= comm_data_in; 303 | counter_read <= 0; 304 | end else begin 305 | img_height[15:8] <= comm_data_in; 306 | state <= STATE_WAIT_COMMAND; //just wait next command 307 | end 308 | end 309 | end 310 | STATE_GET_STATUS: begin 311 | if(comm_data_out_free == 1) begin 312 | if(counter_read == 3) begin //first status response is "is_busy" 313 | comm_data_out_valid <= 1; 314 | comm_data_out[7:0] <= 8'h0; 315 | comm_data_out[0] <= ~(state_processing == STATE_IDLE); 316 | end else if(counter_read == 2) begin 317 | comm_data_out_valid <= 1; 318 | comm_data_out[7:0] <= 8'h0; 319 | end else if(counter_read == 1) begin 320 | comm_data_out_valid <= 1; 321 | comm_data_out[7:0] <= 8'h0; 322 | end else begin 323 | 
comm_data_out_valid <= 1; 324 | comm_data_out[7:0] <= 8'h0; 325 | end 326 | 327 | if(counter_read > 0) begin 328 | counter_read <= counter_read - 1; 329 | end else begin //counter_read == 0 330 | state <= STATE_WAIT_COMMAND; 331 | end 332 | end 333 | end 334 | STATE_SEND_IMG: begin //receives image from the host 335 | if(comm_data_in_valid == 1) begin 336 | 337 | if(memory_addr_counter[0] == 1'b0) begin 338 | data_write[7:0] <= comm_data_in; 339 | end else begin 340 | data_write[15:8] <= comm_data_in; 341 | wr_en <= 1; 342 | addr <= {memory_addr_counter[31:1], 1'b0}; 343 | end 344 | 345 | memory_addr_counter <= memory_addr_counter+1; 346 | 347 | if(counter_read > 1) begin 348 | counter_read <= counter_read - 1; 349 | end else begin 350 | state <= STATE_WAIT_COMMAND; 351 | end 352 | end 353 | end 354 | //reads the parameters for the add command 355 | STATE_APPLY_ADD_READ_PARAM: begin 356 | if(comm_data_in_valid == 1)begin 357 | if(counter_read == 2) begin 358 | add_value[7:0] <= comm_data_in; 359 | counter_read <= 1; 360 | end else if(counter_read == 1) begin 361 | add_value[15:8] <= comm_data_in; 362 | counter_read <= 0; 363 | end else begin 364 | clamp <= comm_data_in[0]; 365 | state_processing <= STATE_PROC_UNARY; 366 | processing_command <= COMMAND_APPLY_ADD; 367 | state <= STATE_WAIT_COMMAND; 368 | proc_counter_read <= img_width*img_height; 369 | proc_memory_addr_counter <= buffer_storage_address; 370 | end 371 | end 372 | end 373 | STATE_READ_IMG: begin 374 | if(memory_addr_counter[0] == 1'b0) begin 375 | rd_en <= 1; 376 | addr <= memory_addr_counter; 377 | memory_addr_counter <= memory_addr_counter + 1; 378 | end else begin 379 | if (counter_read[0] == 1'b0 && data_read_valid == 1'b1) begin //image size mod 2 should be 0 380 | mem_data_buffer <= data_read; 381 | mem_data_buffer_full <= 1; 382 | end 383 | 384 | if( comm_data_out_free == 1 && mem_data_buffer_full == 1 ) begin 385 | if (counter_read[0] == 1'b0) begin //image size mod 2 should be 0 386 | 
comm_data_out_valid <= 1; 387 | comm_data_out <= mem_data_buffer[7:0]; 388 | counter_read <= counter_read - 1; 389 | end else begin 390 | comm_data_out_valid <= 1; 391 | comm_data_out <= mem_data_buffer[15:8]; 392 | counter_read <= counter_read - 1; 393 | memory_addr_counter <= memory_addr_counter+1; 394 | mem_data_buffer_full <= 0; 395 | if(counter_read <= 1) begin // = 1 and not 0 because we are shifted by one 396 | state <= STATE_WAIT_COMMAND; 397 | end 398 | end 399 | end 400 | end 401 | end 402 | STATE_THRESHOLD_READ_PARAM: begin 403 | if(comm_data_in_valid == 1)begin 404 | if(counter_read == 2) begin 405 | threshold_value[7:0] <= comm_data_in; 406 | counter_read <= 1; 407 | end else if(counter_read == 1) begin 408 | threshold_replacement[7:0] <= comm_data_in; 409 | counter_read <= 0; 410 | end else begin 411 | threshold_upper <= comm_data_in[0]; 412 | state_processing <= STATE_PROC_UNARY; 413 | processing_command <= COMMAND_APPLY_THRESHOLD; 414 | state <= STATE_WAIT_COMMAND; 415 | proc_counter_read <= img_width*img_height; 416 | proc_memory_addr_counter <= buffer_storage_address; 417 | end 418 | end 419 | end 420 | STATE_BINARY_ADD_READ_PARAM: begin 421 | if(comm_data_in_valid == 1)begin 422 | state_processing <= STATE_PROC_BINARY; 423 | processing_command <= COMMAND_BINARY_ADD; 424 | state <= STATE_WAIT_COMMAND; 425 | clamp <= comm_data_in[0]; 426 | proc_counter_read <= img_width*img_height; 427 | proc_memory_addr_counter <= 0; //offset 428 | binary_read_buffer <= 0; 429 | end 430 | end 431 | STATE_CONVOLUTION_READ_PARAM: begin 432 | if(comm_data_in_valid == 1) begin 433 | if(counter_read == 9)begin //first value are gneral params 434 | convolution_params <= comm_data_in; 435 | clamp <= comm_data_in[0]; 436 | convolution_source_input <= comm_data_in[1]; 437 | convolution_add_to_result <= comm_data_in[2]; 438 | counter_read <= 8; 439 | end else begin //the kernel matrix (3x3) 440 | //sign extend the 8bits values into 16bits 441 | 
convolution_matrix[8-counter_read] <= { {8{comm_data_in[7]}}, comm_data_in}; 442 | counter_read <= counter_read - 1; 443 | 444 | if(counter_read == 0) begin 445 | state_processing <= STATE_PROC_CONVOLUTION; 446 | state <= STATE_WAIT_COMMAND; 447 | proc_counter_read <= img_width*img_height; 448 | counter_convolution_x <= 0; 449 | counter_convolution_y <= 0; 450 | counter_convolution_x_write <= 0; 451 | counter_convolution_y_write <= 0; 452 | convolution_reading_data <= 2'b00; 453 | if(convolution_source_input == 1) begin //input buffer used as source for convolution 454 | proc_conv_memory_addr_read <= buffer_input_address; 455 | end else begin 456 | proc_conv_memory_addr_read <= buffer_storage_address; 457 | end 458 | proc_conv_memory_addr_write <= buffer_storage_address; 459 | end 460 | end 461 | end 462 | end 463 | STATE_BINARY_SUB_READ_PARAM: begin 464 | if(comm_data_in_valid == 1)begin 465 | state_processing <= STATE_PROC_BINARY; 466 | processing_command <= COMMAND_BINARY_SUB; 467 | state <= STATE_WAIT_COMMAND; 468 | clamp <= comm_data_in[0]; 469 | absolute_diff <= comm_data_in[1]; 470 | proc_counter_read <= img_width*img_height; 471 | proc_memory_addr_counter <= 0; //offset 472 | binary_read_buffer <= 0; 473 | end 474 | end 475 | STATE_BINARY_MULT_READ_PARAM: begin 476 | if(comm_data_in_valid == 1)begin 477 | state_processing <= STATE_PROC_BINARY; 478 | processing_command <= COMMAND_BINARY_MULT; 479 | state <= STATE_WAIT_COMMAND; 480 | clamp <= comm_data_in[0]; 481 | proc_counter_read <= img_width*img_height; 482 | proc_memory_addr_counter <= 0; //offset 483 | binary_read_buffer <= 0; 484 | end 485 | end 486 | STATE_APPLY_MULT_READ_PARAM: begin 487 | if(comm_data_in_valid == 1 && counter_read == 1) begin 488 | mult_value_param <= comm_data_in; 489 | counter_read <= 0; 490 | end else if (comm_data_in_valid == 1 && counter_read == 0) begin 491 | state_processing <= STATE_PROC_UNARY; 492 | processing_command <= COMMAND_APPLY_MULT; 493 | state <= 
STATE_WAIT_COMMAND; 494 | clamp <= comm_data_in[0]; 495 | proc_counter_read <= img_width*img_height; 496 | proc_memory_addr_counter <= buffer_storage_address; 497 | end 498 | end 499 | default: begin 500 | end 501 | endcase 502 | 503 | case (state_processing) 504 | STATE_IDLE: begin 505 | end 506 | STATE_PROC_UNARY: begin : unary 507 | reg [15:0] temp_calc; //used for calculations 508 | //assume single port ram, reads the data 509 | if(proc_memory_addr_counter[0] == 1'b0) begin 510 | rd_en <= 1; 511 | addr <= proc_memory_addr_counter; //set by previous state to be either buffers 512 | proc_memory_addr_counter <= proc_memory_addr_counter+1; 513 | end else begin 514 | if (data_read_valid == 1'b1) begin //received the data, apply the unary operation 515 | 516 | if(processing_command == COMMAND_APPLY_ADD)begin 517 | temp_calc = {8'b0, data_read[7:0]}+add_value; 518 | data_write[7:0] <= apply_clamp(temp_calc, clamp); 519 | temp_calc = {8'b0, data_read[15:8]}+add_value; 520 | data_write[15:8] <= apply_clamp(temp_calc, clamp); 521 | end else if(processing_command == COMMAND_APPLY_THRESHOLD) begin 522 | data_write <= data_read; 523 | if(threshold_upper == 1) begin 524 | if(data_read[7:0] >= threshold_value) begin 525 | data_write[7:0] <= threshold_replacement; 526 | end 527 | if(data_read[15:8] >= threshold_value) begin 528 | data_write[15:8] <= threshold_replacement; 529 | end 530 | end else begin 531 | if(data_read[7:0] <= threshold_value) begin 532 | data_write[7:0] <= threshold_replacement; 533 | end 534 | if(data_read[15:8] <= threshold_value) begin 535 | data_write[15:8] <= threshold_replacement; 536 | end 537 | end 538 | end else if(processing_command == COMMAND_APPLY_INVERT) begin 539 | data_write <= ~data_read; 540 | end else if(processing_command == COMMAND_APPLY_MULT) begin 541 | temp_calc = {8'b0, mult_value_param}*{8'b0, data_read[7:0]}; 542 | data_write[7:0] <= apply_clamp_fixed16(temp_calc, clamp); 543 | temp_calc = {8'b0, mult_value_param}*{8'b0, 
data_read[15:8]}; 544 | data_write[15:8] <= apply_clamp_fixed16(temp_calc, clamp); 545 | end 546 | 547 | //write back the data 548 | wr_en <= 1; 549 | //16bits data addressing 550 | addr <= {proc_memory_addr_counter[31:1], 1'b0}; 551 | proc_memory_addr_counter <= proc_memory_addr_counter+1; 552 | 553 | if(proc_counter_read > 2) begin // > 2 and not 0 because we are shifted by one due to clk assignment 554 | proc_counter_read <= proc_counter_read - 2; 555 | end else begin 556 | state_processing <= STATE_IDLE; 557 | end 558 | end 559 | end 560 | end 561 | STATE_PROC_BINARY: begin : binary 562 | reg [15:0] temp_calc; //used for calculations 563 | //assume single port ram 564 | if(proc_memory_addr_counter[0] == 1'b0 && binary_read_buffer == 0) begin 565 | rd_en <= 1; 566 | //must read the buffer storage first o/w seems to be problems with the writeback (timing constraints?) 567 | addr <= buffer_storage_address+proc_memory_addr_counter; 568 | binary_read_buffer <= 1; 569 | end if (proc_memory_addr_counter[0] == 1'b0 && binary_read_buffer == 1) begin 570 | //reads data from first buffer, issue read for the second buffer, same address 571 | if (data_read_valid == 1'b1) begin 572 | buffer_read <= data_read; 573 | rd_en <= 1; 574 | addr <= buffer_input_address+proc_memory_addr_counter; 575 | binary_read_buffer <= 0; 576 | proc_memory_addr_counter <= proc_memory_addr_counter + 1; 577 | operation_step <= 0; //binary op done in multiple clk counts due to timing constr. 
578 | end 579 | end else begin 580 | //separated into two steps due to what seemed to be timing constraints 581 | if (data_read_valid == 1'b1 && operation_step == 0) begin 582 | temp_calc = 0; 583 | 584 | if( processing_command == COMMAND_BINARY_ADD) begin 585 | temp_calc = {8'b0, buffer_read[7:0]} + {8'b0, data_read[7:0]}; 586 | data_write[7:0] <= apply_clamp(temp_calc, clamp); 587 | end else if (processing_command == COMMAND_BINARY_SUB) begin 588 | temp_calc = {8'b0, buffer_read[7:0]} - {8'b0, data_read[7:0]}; 589 | if(absolute_diff == 1 && $signed(temp_calc) < 0) begin 590 | temp_calc = -temp_calc; 591 | end 592 | data_write[7:0] <= apply_clamp(temp_calc, clamp); 593 | end else if (processing_command == COMMAND_BINARY_MULT) begin 594 | temp_calc = {8'b0, buffer_read[7:0]} * {8'b0, data_read[7:0]}; 595 | data_write[7:0] <= apply_clamp(temp_calc, clamp); 596 | end 597 | 598 | operation_step <= 1; 599 | 600 | end else if (operation_step == 1) begin 601 | if( processing_command == COMMAND_BINARY_ADD) begin 602 | temp_calc = {8'b0, buffer_read[15:8]} + {8'b0, data_read[15:8]}; 603 | data_write[15:8] <= apply_clamp(temp_calc, clamp); 604 | end else if (processing_command == COMMAND_BINARY_SUB) begin 605 | temp_calc = {8'b0, buffer_read[15:8]} - {8'b0, data_read[15:8]}; 606 | if(absolute_diff == 1 && $signed(temp_calc) < 0) begin 607 | temp_calc = -temp_calc; 608 | end 609 | data_write[15:8] <= apply_clamp(temp_calc, clamp); 610 | end else if (processing_command == COMMAND_BINARY_MULT) begin 611 | temp_calc = {8'b0, buffer_read[15:8]} * {8'b0, data_read[15:8]}; 612 | data_write[15:8] <= apply_clamp(temp_calc, clamp); 613 | end 614 | 615 | operation_step <= 0; 616 | 617 | //write back data into storage, same 16-bit address 618 | wr_en <= 1; 619 | addr <= buffer_storage_address+{proc_memory_addr_counter[31:1], 1'b0}; 620 | proc_memory_addr_counter <= proc_memory_addr_counter+1; 621 | 622 | if(proc_counter_read > 2) begin // > 2 and not 0 because we are shifted by one 623 
| proc_counter_read <= proc_counter_read - 2; 624 | end else begin 625 | state_processing <= STATE_IDLE; 626 | end 627 | end 628 | end 629 | end 630 | STATE_PROC_CONVOLUTION: begin : conv 631 | 632 | //read data in target image to be added to the convolution result 633 | if(convolution_add_to_result == 1 && convolution_reading_data != 2'b10) begin 634 | if(convolution_reading_data == 2'b00) begin 635 | rd_en <= 1; 636 | addr <= proc_conv_memory_addr_write; 637 | convolution_reading_data <= 2'b01; 638 | end else if(convolution_reading_data == 2'b01 && data_read_valid == 1) begin 639 | convolution_data_to_add <= data_read; 640 | convolution_reading_data <= 2'b10; 641 | end 642 | end else if(proc_conv_memory_addr_read[0] == 1'b0) begin //read the data (input) 643 | rd_en <= 1; 644 | addr <= proc_conv_memory_addr_read; 645 | proc_conv_memory_addr_read <= proc_conv_memory_addr_read + 1; 646 | 647 | if(convolution_add_to_result == 0) begin 648 | convolution_data_to_add <= 16'b0; 649 | end 650 | end else if (data_read_valid == 1) begin 651 | 652 | //data read will be written in conv. buffer at the end of writeback 653 | data_read_store <= data_read; 654 | matrix_convolution_counter <= 0; //counts clk cycles for calculation 655 | state_processing <= STATE_PROC_CONVOLUTION_CALCULATION; 656 | 657 | // //do the lookup before the calculation (will infer the sprams! 
(yosys 0.9)) 658 | convolution_buffer_local[0][0] <= convolution_buffer[counter_convolution_x_write[7:0]-1][ (counter_convolution_y_write[0]+1)%2]; 659 | convolution_buffer_local[1][0] <= convolution_buffer[counter_convolution_x_write[7:0]][ (counter_convolution_y_write[0]+1)%2]; 660 | convolution_buffer_local[2][0] <= convolution_buffer[counter_convolution_x_write[7:0]+1][ (counter_convolution_y_write[0]+1)%2]; 661 | convolution_buffer_local[3][0] <= convolution_buffer[counter_convolution_x_write[7:0]+2][ (counter_convolution_y_write[0]+1)%2]; 662 | 663 | convolution_buffer_local[0][1] <= convolution_buffer[counter_convolution_x_write[7:0]-1][ (counter_convolution_y_write[0])]; 664 | convolution_buffer_local[1][1] <= convolution_buffer[counter_convolution_x_write[7:0]][ (counter_convolution_y_write[0])]; 665 | convolution_buffer_local[2][1] <= convolution_buffer[counter_convolution_x_write[7:0]+1][ (counter_convolution_y_write[0])]; 666 | convolution_buffer_local[3][1] <= convolution_buffer[counter_convolution_x_write[7:0]+2][ (counter_convolution_y_write[0])]; 667 | 668 | convolution_buffer_local[0][2] <= convolution_previous_read[1][15:8]; 669 | convolution_buffer_local[1][2] <= convolution_previous_read[0][7:0]; 670 | convolution_buffer_local[2][2] <= convolution_previous_read[0][15:8]; 671 | convolution_buffer_local[3][2] <= data_read[7:0]; 672 | end 673 | end 674 | STATE_PROC_CONVOLUTION_CALCULATION: begin : conv_proc 675 | reg [15:0] temp_calc; //used for calculations 676 | 677 | //if we are on a border => write 0 678 | if(counter_convolution_y_write == 0 || counter_convolution_y_write >= img_height-1 || counter_convolution_x_write == 0) begin 679 | data_write[7:0] <= 0; 680 | end else begin 681 | temp_calc = 0; 682 | 683 | 684 | if(matrix_convolution_counter == 0) begin 685 | temp_calc = temp_calc + convolution_matrix[0]*{8'b0, convolution_buffer_local[0][0]}; 686 | calc_left_buf <= temp_calc; 687 | end else if (matrix_convolution_counter == 1) begin 688 | 
temp_calc = calc_left_buf; 689 | temp_calc = temp_calc + convolution_matrix[1]*{8'b0, convolution_buffer_local[1][0]}; 690 | calc_left_buf <= temp_calc; 691 | end else if (matrix_convolution_counter == 2) begin 692 | temp_calc = calc_left_buf; 693 | temp_calc = temp_calc + convolution_matrix[2]*{8'b0, convolution_buffer_local[2][0]}; 694 | calc_left_buf <= temp_calc; 695 | end else if (matrix_convolution_counter == 3) begin 696 | temp_calc = calc_left_buf; 697 | temp_calc = temp_calc + convolution_matrix[3]*{8'b0, convolution_buffer_local[0][1]}; 698 | calc_left_buf <= temp_calc; 699 | end else if (matrix_convolution_counter == 4) begin 700 | temp_calc = calc_left_buf; 701 | temp_calc = temp_calc + convolution_matrix[4]*{8'b0, convolution_buffer_local[1][1]}; 702 | calc_left_buf <= temp_calc; 703 | end else if (matrix_convolution_counter == 5) begin 704 | temp_calc = calc_left_buf; 705 | temp_calc = temp_calc + convolution_matrix[5]*{8'b0, convolution_buffer_local[2][1]}; 706 | calc_left_buf <= temp_calc; 707 | end else if (matrix_convolution_counter == 6) begin 708 | temp_calc = calc_left_buf; 709 | temp_calc = temp_calc + convolution_matrix[6]*{8'b0, convolution_buffer_local[0][2]}; 710 | calc_left_buf <= temp_calc; 711 | end else if (matrix_convolution_counter == 7) begin 712 | temp_calc = calc_left_buf; 713 | temp_calc = temp_calc + convolution_matrix[7]*{8'b0, convolution_buffer_local[1][2]}; 714 | calc_left_buf <= temp_calc; 715 | end else if (matrix_convolution_counter == 8) begin 716 | temp_calc = calc_left_buf; 717 | temp_calc = temp_calc + convolution_matrix[8]*{8'b0, convolution_buffer_local[2][2]}; 718 | calc_left_buf <= temp_calc; 719 | end else if (matrix_convolution_counter == 9) begin 720 | temp_calc = calc_left_buf; 721 | end 722 | 723 | temp_calc[7:0] = apply_clamp_fixed16(temp_calc, clamp); 724 | //convolution_data_to_add is either the value already existing at this address, or 0 (depends on param) 725 | data_write[7:0] <= apply_clamp({8'b0, 
convolution_data_to_add[7:0]}+{8'b0, temp_calc[7:0]}, 1); 726 | end 727 | 728 | //if second byte value are on the border => 0 729 | if(counter_convolution_y_write == 0 || counter_convolution_y_write >= img_height-1 || counter_convolution_x_write >= img_width-2) begin 730 | data_write[15:8] <= 0; 731 | end else begin 732 | 733 | temp_calc = 0; 734 | 735 | //starts at 1 because want to couple similar operations with the first byte calculation 736 | if(matrix_convolution_counter == 1) begin 737 | temp_calc = temp_calc + convolution_matrix[0]*{8'b0, convolution_buffer_local[1][0]}; 738 | calc_right_buf <= temp_calc; 739 | end else if (matrix_convolution_counter == 2) begin 740 | temp_calc = calc_right_buf; 741 | temp_calc = temp_calc + convolution_matrix[1]*{8'b0, convolution_buffer_local[2][0]}; 742 | calc_right_buf <= temp_calc; 743 | end else if (matrix_convolution_counter == 3) begin 744 | temp_calc = calc_right_buf; 745 | temp_calc = temp_calc + convolution_matrix[2]*{8'b0, convolution_buffer_local[3][0]}; 746 | calc_right_buf <= temp_calc; 747 | end else if (matrix_convolution_counter == 4) begin 748 | temp_calc = calc_right_buf; 749 | temp_calc = temp_calc + convolution_matrix[3]*{8'b0, convolution_buffer_local[1][1]}; 750 | calc_right_buf <= temp_calc; 751 | end else if (matrix_convolution_counter == 5) begin 752 | temp_calc = calc_right_buf; 753 | temp_calc = temp_calc + convolution_matrix[4]*{8'b0, convolution_buffer_local[2][1]}; 754 | calc_right_buf <= temp_calc; 755 | end else if (matrix_convolution_counter == 6) begin 756 | temp_calc = calc_right_buf; 757 | temp_calc = temp_calc + convolution_matrix[5]*{8'b0, convolution_buffer_local[3][1]}; 758 | calc_right_buf <= temp_calc; 759 | end else if (matrix_convolution_counter == 7) begin 760 | temp_calc = calc_right_buf; 761 | temp_calc = temp_calc + convolution_matrix[6]*{8'b0, convolution_buffer_local[1][2]}; 762 | calc_right_buf <= temp_calc; 763 | end else if (matrix_convolution_counter == 8) begin 764 | 
temp_calc = calc_right_buf; 765 | temp_calc = temp_calc + convolution_matrix[7]*{8'b0, convolution_buffer_local[2][2]}; 766 | calc_right_buf <= temp_calc; 767 | end else if (matrix_convolution_counter == 9) begin 768 | temp_calc = calc_right_buf; 769 | temp_calc = temp_calc + convolution_matrix[8]*{8'b0, convolution_buffer_local[3][2]}; 770 | end 771 | 772 | temp_calc[7:0] = apply_clamp_fixed16(temp_calc, clamp); 773 | data_write[15:8] <= apply_clamp({8'b0, convolution_data_to_add[15:8]}+{8'b0, temp_calc[7:0]}, 1); 774 | 775 | end 776 | if(matrix_convolution_counter == 9) begin 777 | state_processing <= STATE_PROC_CONVOLUTION_WRITEBACK_1; 778 | end else begin 779 | matrix_convolution_counter <= matrix_convolution_counter + 1; 780 | end 781 | end 782 | STATE_PROC_CONVOLUTION_WRITEBACK_1: begin 783 | //stores last read in the buffer 784 | //this is done in two states to infer a spram for the convolution buffer 785 | convolution_buffer[convolution_previous_read_counter_x[1][7:0]][convolution_previous_read_counter_y[1][0]] <= convolution_previous_read[1][7:0]; 786 | state_processing <= STATE_PROC_CONVOLUTION_WRITEBACK_2; 787 | end 788 | STATE_PROC_CONVOLUTION_WRITEBACK_2: begin 789 | //keep the current read in a buffer before putting it in the convolution buffer because we read data 2 by 2 and we need 3x3 matrices 790 | convolution_previous_read[0] <= data_read_store; 791 | convolution_previous_read_counter_x[0] <= counter_convolution_x; 792 | convolution_previous_read_counter_y[0] <= counter_convolution_y; 793 | 794 | convolution_previous_read[1] <= convolution_previous_read[0]; 795 | convolution_previous_read_counter_x[1] <= convolution_previous_read_counter_x[0]; 796 | convolution_previous_read_counter_y[1] <= convolution_previous_read_counter_y[0]; 797 | 798 | //second write to conv. 
buffer (spram behaviour) 799 | convolution_buffer[convolution_previous_read_counter_x[1][7:0]+1][convolution_previous_read_counter_y[1][0]] <= convolution_previous_read[1][15:8]; 800 | 801 | //counts the reads and increments the y reg when x has swept the width 802 | if(counter_convolution_x+2 >= img_width) begin 803 | counter_convolution_x <= 0; 804 | counter_convolution_y <= counter_convolution_y + 1; 805 | end else begin 806 | counter_convolution_x <= counter_convolution_x + 2; 807 | end 808 | 809 | if(counter_convolution_y == 0 || (counter_convolution_y == 1 && counter_convolution_x == 0)) begin 810 | //first line not written back, need delay to fill the convolution buffer 811 | wr_en <= 0; 812 | end else begin 813 | wr_en <= 1; 814 | proc_conv_memory_addr_write <= proc_conv_memory_addr_write + 2; //reads 2x8bits 815 | 816 | //offset between read and write 817 | //only update write counter when there is an actual write 818 | if(counter_convolution_x_write+2 >= img_width) begin 819 | counter_convolution_x_write <= 0; 820 | counter_convolution_y_write <= counter_convolution_y_write + 1; 821 | end else begin 822 | counter_convolution_x_write <= counter_convolution_x_write + 2; 823 | end 824 | 825 | end 826 | 827 | addr <= {proc_conv_memory_addr_write[31:1], 1'b0}; 828 | 829 | proc_conv_memory_addr_read <= proc_conv_memory_addr_read + 1; 830 | 831 | convolution_reading_data <= 2'b00; 832 | 833 | //end condition 834 | if(counter_convolution_y >= img_height+1 && counter_convolution_x+2 >= img_width)begin 835 | state_processing <= STATE_IDLE; 836 | end else begin 837 | state_processing <= STATE_PROC_CONVOLUTION; 838 | end 839 | end 840 | default: begin 841 | end 842 | endcase 843 | end 844 | endmodule 845 | -------------------------------------------------------------------------------- /ice40/hdl/Makefile: -------------------------------------------------------------------------------- 1 | filename = top 2 | pcf_file = io.pcf 3 | 4 | build: 5 | yosys -p "synth_ice40 
-blif $(filename).blif" $(filename).v 6 | arachne-pnr -d 5k -P sg48 -p $(pcf_file) $(filename).blif -o $(filename).asc 7 | # yosys -p "synth_ice40 -blif $(filename).blif -json $(filename).json" $(filename).v 8 | # nextpnr-ice40 --up5k --json $(filename).json --pcf $(pcf_file) --asc $(filename).asc #doesn't seem to work with the HW SPI module 9 | icepack $(filename).asc $(filename).bin 10 | 11 | prog: 12 | iceprog -S $(filename).bin 13 | 14 | prog_flash: 15 | iceprog $(filename).bin 16 | 17 | clean: 18 | rm -rf $(filename).blif $(filename).asc $(filename).bin 19 | -------------------------------------------------------------------------------- /ice40/hdl/io.pcf: -------------------------------------------------------------------------------- 1 | # For the iCE40 UltraPlus (iCE40UP5K-QFN) Breakout Board 2 | 3 | set_io LED_R 41 4 | set_io LED_G 40 5 | set_io LED_B 39 6 | set_io SW[0] 23 7 | set_io SW[1] 25 8 | set_io SW[2] 34 9 | set_io SW[3] 43 10 | set_io clk 35 11 | 12 | # bank 0 13 | set_io IOT_39A 26 14 | set_io IOT_38B 27 15 | set_io IOT_42B 31 16 | set_io IOT_43A 32 17 | set_io IOT_45A_G1 37 18 | set_io IOT_51A 42 19 | set_io IOT_50B 38 20 | 21 | #spi 22 | set_io SPI_SS 16 23 | set_io SPI_SCK 15 24 | set_io SPI_MOSI 17 25 | set_io SPI_MISO 14 26 | -------------------------------------------------------------------------------- /ice40/hdl/ram_interface.v: -------------------------------------------------------------------------------- 1 | //ice40 has 1024 kbit of spram ==> 128KB 2 | 3 | // wire [31:0] ip_mem_addr; 4 | // wire ip_mem_wr_en; 5 | // wire ip_mem_rd_en; 6 | // wire [15:0] ip_mem_data_write; 7 | // wire [15:0] ip_mem_data_read; 8 | // wire ip_mem_data_read_valid; 9 | 10 | module ram_interface(input clk, input [31:0] addr, input wr_en, input rd_en, input [15:0] data_write, output reg [15:0] data_read, output reg data_read_valid); 11 | 12 | wire ram_wren [3:0]; 13 | 14 | wire [15:0] ram_data_out [3:0]; 15 | 16 | reg [1:0] output_mux [1:0]; 17 | reg 
rd_en_buffer[2:0]; 18 | 19 | 20 | SB_SPRAM256KA spram0 21 | ( 22 | .ADDRESS(addr[14:1]), //14bits (16K*2B) 23 | .DATAIN(data_write), 24 | .MASKWREN(4'b1111), 25 | .WREN(ram_wren[0]), 26 | .CHIPSELECT(1'b1), 27 | .CLOCK(clk), 28 | .STANDBY(1'b0), 29 | .SLEEP(1'b0), 30 | .POWEROFF(1'b1), 31 | .DATAOUT(ram_data_out[0]) 32 | ); 33 | 34 | SB_SPRAM256KA spram1 35 | ( 36 | .ADDRESS(addr[14:1]), 37 | .DATAIN(data_write), 38 | .MASKWREN(4'b1111), 39 | .WREN(ram_wren[1]), 40 | .CHIPSELECT(1'b1), 41 | .CLOCK(clk), 42 | .STANDBY(1'b0), 43 | .SLEEP(1'b0), 44 | .POWEROFF(1'b1), 45 | .DATAOUT(ram_data_out[1]) 46 | ); 47 | 48 | SB_SPRAM256KA spram2 49 | ( 50 | .ADDRESS(addr[14:1]), 51 | .DATAIN(data_write), 52 | .MASKWREN(4'b1111), 53 | .WREN(ram_wren[2]), 54 | .CHIPSELECT(1'b1), 55 | .CLOCK(clk), 56 | .STANDBY(1'b0), 57 | .SLEEP(1'b0), 58 | .POWEROFF(1'b1), 59 | .DATAOUT(ram_data_out[2]) 60 | ); 61 | 62 | SB_SPRAM256KA spram3 63 | ( 64 | .ADDRESS(addr[14:1]), 65 | .DATAIN(data_write), 66 | .MASKWREN(4'b1111), 67 | .WREN(ram_wren[3]), 68 | .CHIPSELECT(1'b1), 69 | .CLOCK(clk), 70 | .STANDBY(1'b0), 71 | .SLEEP(1'b0), 72 | .POWEROFF(1'b1), 73 | .DATAOUT(ram_data_out[3]) 74 | ); 75 | 76 | assign ram_wren[addr[16:15]] = wr_en; 77 | 78 | always @(posedge clk) 79 | begin 80 | output_mux[0] <= addr[16:15]; 81 | output_mux[1] <= output_mux[0]; 82 | rd_en_buffer[0] <= rd_en; 83 | rd_en_buffer[1] <= rd_en_buffer[0]; 84 | rd_en_buffer[2] <= rd_en_buffer[1]; 85 | data_read_valid <= (rd_en_buffer[2] == 0 && rd_en_buffer[1] == 1); 86 | data_read <= ram_data_out[output_mux[1]]; 87 | end 88 | 89 | endmodule 90 | -------------------------------------------------------------------------------- /ice40/hdl/spi_interface.v: -------------------------------------------------------------------------------- 1 | 2 | module spi_interface(input clk, input spi_sck, input spi_ss, input spi_mosi, output spi_miso, 3 | output reg [7:0] spi_cmd, output reg [7:0] spi_data_out, output reg spi_data_out_valid, input 
[7:0] spi_data_in, input spi_data_in_valid, 4 | output spi_data_in_free, output reg [2:0] led_debug); 5 | 6 | parameter NOP=0, INIT=1, SEND_DATA=2, RECEIVE_CMD=3, RECEIVE_DATA=4, SEND_DATA32=5, RECEIVE_DATA32=6; 7 | 8 | //state machine parameters 9 | parameter INIT_SPICR0=0, INIT_SPICR1=INIT_SPICR0+1, INIT_SPICR2=INIT_SPICR1+1, INIT_SPIBR=INIT_SPICR2+1, INIT_SPICSR=INIT_SPIBR+1, 10 | SPI_WAIT_RECEPTION=INIT_SPICSR+1, SPI_READ_OPCODE=SPI_WAIT_RECEPTION+1, SPI_READ_LED_VALUE=SPI_READ_OPCODE+1, 11 | SPI_READ_INIT=SPI_READ_LED_VALUE+1, SPI_SEND_DATA=SPI_READ_INIT+1, SPI_WAIT_TRANSMIT_READY=SPI_SEND_DATA+1, 12 | SPI_TRANSMIT=SPI_WAIT_TRANSMIT_READY+1; 13 | 14 | parameter SPI_ADDR_SPICR0 = 8'b00001000, SPI_ADDR_SPICR1 = 8'b00001001, SPI_ADDR_SPICR2 = 8'b00001010, SPI_ADDR_SPIBR = 8'b00001011, 15 | SPI_ADDR_SPITXDR = 8'b00001101, SPI_ADDR_SPIRXDR = 8'b00001110, SPI_ADDR_SPICSR = 8'b00001111, SPI_ADDR_SPISR = 8'b00001100; 16 | 17 | reg [7:0] state_spi; 18 | 19 | //hw spi signals 20 | reg spi_stb; //strobe must be set to high when read or write 21 | reg spi_rw; //selects read or write (high = write) 22 | reg [7:0] spi_adr; // address 23 | reg [7:0] spi_dati; // data input 24 | wire [7:0] spi_dato; // data output 25 | wire spi_ack; //ack that the transfer is done (read valid or write ack) 26 | //the miso/mosi signals are not used, because this module is set as a slave 27 | wire spi_miso_unused; 28 | wire spi_mosi_unused; 29 | 30 | SB_SPI SB_SPI_inst(.SBCLKI(clk), .SBSTBI(spi_stb), .SBRWI(spi_rw), 31 | .SBADRI0(spi_adr[0]), .SBADRI1(spi_adr[1]), .SBADRI2(spi_adr[2]), .SBADRI3(spi_adr[3]), .SBADRI4(spi_adr[4]), .SBADRI5(spi_adr[5]), .SBADRI6(spi_adr[6]), .SBADRI7(spi_adr[7]), 32 | .SBDATI0(spi_dati[0]), .SBDATI1(spi_dati[1]), .SBDATI2(spi_dati[2]), .SBDATI3(spi_dati[3]), .SBDATI4(spi_dati[4]), .SBDATI5(spi_dati[5]), .SBDATI6(spi_dati[6]), .SBDATI7(spi_dati[7]), 33 | .SBDATO0(spi_dato[0]), .SBDATO1(spi_dato[1]), .SBDATO2(spi_dato[2]), .SBDATO3(spi_dato[3]), 
.SBDATO4(spi_dato[4]), .SBDATO5(spi_dato[5]), .SBDATO6(spi_dato[6]), .SBDATO7(spi_dato[7]), 34 | .SBACKO(spi_ack), 35 | .MI(spi_miso_unused), .SO(spi_miso), 36 | .MO(spi_mosi_unused), .SI(spi_mosi), 37 | .SCKI(spi_sck), .SCSNI(spi_ss) 38 | ); 39 | 40 | reg is_spi_init; //waits for the INIT command from the master 41 | 42 | reg [7:0] counter_read; //counts the bytes to read to form a command 43 | reg [7:0] command_data[63:0]; //the command, saved as array of bytes 44 | 45 | reg [7:0] counter_send; //counts the bytes to send 46 | reg [7:0] data_to_send; //buffer for data to be written in send register 47 | reg is_data_to_send; 48 | 49 | reg [7:0] buffer_data_in; //keeps data from the image processing module to be sent 50 | reg buffer_full; 51 | 52 | //not free if buffer is full or data is incoming 53 | assign spi_data_in_free = !(buffer_full || spi_data_in_valid); 54 | 55 | initial begin 56 | 57 | spi_stb = 0; 58 | spi_rw = 0; 59 | spi_adr = 0; 60 | spi_dati = 0; 61 | data_to_send = 0; 62 | is_data_to_send = 0; 63 | 64 | is_spi_init = 0; 65 | counter_send = 0; 66 | 67 | buffer_full = 0; 68 | 69 | state_spi = INIT_SPICR0; 70 | 71 | led_debug <= 3'b100; 72 | end 73 | 74 | always @(posedge clk) 75 | begin 76 | 77 | //default 78 | spi_stb <= 0; 79 | spi_data_out_valid <= 0; 80 | 81 | if (spi_data_in_valid == 1) begin 82 | buffer_data_in <= spi_data_in; 83 | buffer_full <= 1; 84 | end 85 | 86 | case (state_spi) 87 | INIT_SPICR0 : begin //spi control register 0, nothing interesting for this case (delay counts) 88 | spi_adr <= SPI_ADDR_SPICR0; 89 | spi_dati <= 8'b00000000; 90 | spi_stb <= 1; 91 | spi_rw <= 1; 92 | if(spi_ack == 1) begin 93 | spi_stb <= 0; 94 | state_spi <= INIT_SPICR1; 95 | end 96 | end 97 | INIT_SPICR1 : begin //spi control register 1 98 | spi_adr <= SPI_ADDR_SPICR1; 99 | spi_dati <= 8'b10000000; //bit7: enable SPI 100 | spi_stb <= 1; 101 | spi_rw <= 1; 102 | if(spi_ack == 1) begin 103 | spi_stb <= 0; 104 | state_spi <= INIT_SPICR2; 105 | end 106 | end 107 | 
INIT_SPICR2 : begin //spi control register 2 108 | spi_adr <= SPI_ADDR_SPICR2; 109 | spi_dati <= 8'b00000001; //bit0: lsb first 110 | spi_stb <= 1; 111 | spi_rw <= 1; 112 | if(spi_ack == 1) begin 113 | spi_stb <= 0; 114 | state_spi <= INIT_SPIBR; 115 | end 116 | end 117 | INIT_SPIBR : begin //spi clock prescale 118 | spi_adr <= SPI_ADDR_SPIBR; 119 | spi_dati <= 8'b00000000; //clock divider => 1 120 | spi_stb <= 1; 121 | spi_rw <= 1; 122 | if(spi_ack == 1) begin 123 | spi_stb <= 0; 124 | state_spi <= INIT_SPICSR; 125 | end 126 | end 127 | INIT_SPICSR : begin //SPI master chip select register, absolutely no use as SPI module set as slave 128 | spi_adr <= SPI_ADDR_SPICSR; 129 | spi_dati <= 8'b00000000; 130 | spi_stb <= 1; 131 | spi_rw <= 1; 132 | if(spi_ack == 1) begin 133 | spi_stb <= 0; 134 | state_spi <= SPI_WAIT_RECEPTION; 135 | counter_read <= 0; 136 | end 137 | end 138 | SPI_WAIT_RECEPTION : begin 139 | spi_adr <= SPI_ADDR_SPISR; //status register 140 | spi_stb <= 1; 141 | spi_rw <= 0; //read 142 | if(spi_ack == 1) begin 143 | spi_stb <= 0; 144 | state_spi <= SPI_WAIT_RECEPTION; 145 | 146 | //wait for bit3, tells that data is available 147 | if (is_spi_init == 0 && spi_dato[3] == 1) begin 148 | state_spi <= SPI_READ_INIT; 149 | end 150 | 151 | if (is_spi_init == 1 && spi_dato[3] == 1) begin 152 | if( (command_data[0] != SEND_DATA32 && counter_send < 2) || (command_data[0] == SEND_DATA32 && counter_send < 31)) begin //can only send 2 bytes back 153 | state_spi <= SPI_WAIT_TRANSMIT_READY; 154 | end else begin 155 | state_spi <= SPI_READ_OPCODE; 156 | end 157 | end 158 | end 159 | end 160 | SPI_WAIT_TRANSMIT_READY: begin 161 | spi_adr <= SPI_ADDR_SPISR; //status registers 162 | spi_stb <= 1; 163 | spi_rw <= 0; //read 164 | if(spi_ack == 1) begin 165 | spi_stb <= 0; 166 | 167 | //bit 4 = TRDY, transmit ready 168 | if (spi_dato[4] == 1) begin 169 | state_spi <= SPI_TRANSMIT; 170 | end 171 | end 172 | end 173 | SPI_TRANSMIT: begin 174 | spi_adr <= SPI_ADDR_SPITXDR; 
175 | if(counter_send == 0) begin //status 176 | spi_dati <= 8'b01000000; 177 | spi_dati[0] <= buffer_full; //tells that something useful was sent 178 | // spi_dati[0] <= 1; //tells that something useful was sent 179 | end else begin 180 | if(is_data_to_send == 1) begin 181 | spi_dati <= data_to_send; 182 | // spi_dati <= 8'h48 183 | end else begin 184 | spi_dati <= 8'h42; 185 | end 186 | end 187 | 188 | spi_stb <= 1; 189 | spi_rw <= 1; 190 | if(spi_ack == 1) begin 191 | spi_stb <= 0; 192 | counter_send <= counter_send + 1; 193 | 194 | if(is_data_to_send == 1) begin 195 | is_data_to_send <= 0; 196 | buffer_full <= 0; 197 | end 198 | 199 | if (is_spi_init == 0) begin 200 | state_spi <= SPI_READ_INIT; 201 | end else begin 202 | state_spi <= SPI_READ_OPCODE; 203 | end 204 | end 205 | end 206 | SPI_READ_INIT: begin 207 | spi_adr <= SPI_ADDR_SPIRXDR; //read data register 208 | spi_stb <= 1; 209 | spi_rw <= 0; //read 210 | if(spi_ack == 1) begin 211 | spi_stb <= 0; 212 | state_spi <= SPI_WAIT_RECEPTION; 213 | command_data[counter_read] <= spi_dato; 214 | 215 | if(spi_dato == 8'h11) begin 216 | counter_read <= 0; 217 | is_spi_init <= 1; 218 | counter_send <= 0; 219 | end 220 | end 221 | end 222 | SPI_READ_OPCODE: begin 223 | spi_adr <= SPI_ADDR_SPIRXDR; //read data register 224 | spi_stb <= 1; 225 | spi_rw <= 0; //read 226 | if(spi_ack == 1) begin 227 | spi_stb <= 0; 228 | counter_read <= counter_read + 1; 229 | 230 | state_spi <= SPI_WAIT_RECEPTION; 231 | command_data[counter_read] <= spi_dato; 232 | 233 | if( counter_read == 0 && spi_dato == SEND_DATA) begin 234 | if(buffer_full == 1)begin 235 | data_to_send <= buffer_data_in; 236 | is_data_to_send <= 1; 237 | end 238 | end 239 | if( counter_read == 0 && spi_dato == SEND_DATA32 ) begin 240 | if(buffer_full == 1)begin 241 | data_to_send <= buffer_data_in; 242 | is_data_to_send <= 1; 243 | end 244 | end 245 | 246 | if( counter_read == 1 ) begin 247 | if( command_data[0] == RECEIVE_CMD) begin 248 | spi_data_out_valid <= 1; 
249 | spi_cmd <= spi_dato; 250 | end else if( command_data[0] == RECEIVE_DATA) begin 251 | spi_data_out_valid <= 1; 252 | spi_data_out <= spi_dato; 253 | end 254 | end 255 | if( counter_read >= 1 ) begin 256 | if( command_data[0] == RECEIVE_DATA32 ) begin 257 | spi_data_out_valid <= 1; 258 | spi_data_out <= spi_dato; 259 | end 260 | else if( command_data[0] == SEND_DATA32 && counter_read < 30) begin 261 | if(buffer_full == 1)begin 262 | data_to_send <= buffer_data_in; 263 | is_data_to_send <= 1; 264 | end 265 | end 266 | end 267 | 268 | if( command_data[0] == RECEIVE_DATA32 || command_data[0] == SEND_DATA32 ) begin 269 | if( counter_read == 32 ) begin 270 | counter_read <= 0; 271 | counter_send <= 0; 272 | end 273 | end 274 | else begin 275 | if( counter_read == 3 ) begin 276 | counter_read <= 0; 277 | counter_send <= 0; 278 | end 279 | end 280 | end 281 | end 282 | 283 | endcase 284 | end 285 | 286 | endmodule 287 | -------------------------------------------------------------------------------- /ice40/hdl/top.v: -------------------------------------------------------------------------------- 1 | `include "../../hdl/image_processing.v" 2 | `include "ram_interface.v" 3 | `include "spi_interface.v" 4 | 5 | module top(input [3:0] SW, input clk, output LED_R, output LED_G, output LED_B, input SPI_SCK, input SPI_SS, input SPI_MOSI, output SPI_MISO); 6 | 7 | reg [2:0] led; 8 | 9 | //leds are active low 10 | assign LED_R = ~led[0]; 11 | assign LED_G = ~led[1]; 12 | assign LED_B = ~led[2]; 13 | 14 | reg reset_ip; 15 | 16 | wire [31:0] ip_mem_addr; 17 | wire ip_mem_wr_en; 18 | wire ip_mem_rd_en; 19 | wire [15:0] ip_mem_data_write; 20 | wire [15:0] ip_mem_data_read; 21 | wire ip_mem_data_read_valid; 22 | 23 | wire [7:0] ip_comm_cmd; 24 | wire [7:0] ip_comm_data_in; 25 | wire ip_comm_data_in_valid; 26 | wire [7:0] ip_comm_data_out; 27 | wire ip_comm_data_out_valid; 28 | wire ip_comm_data_out_free; 29 | 30 | wire [2:0] spi_led_debug; 31 | 32 | image_processing 
image_processing(.clk(clk), .reset(reset_ip), 33 | .addr(ip_mem_addr), .wr_en(ip_mem_wr_en), .rd_en(ip_mem_rd_en), .data_read(ip_mem_data_read), .data_read_valid(ip_mem_data_read_valid), .data_write(ip_mem_data_write), 34 | .comm_cmd(ip_comm_cmd), .comm_data_in(ip_comm_data_in), .comm_data_in_valid(ip_comm_data_in_valid), .comm_data_out(ip_comm_data_out), .comm_data_out_valid(ip_comm_data_out_valid), 35 | .comm_data_out_free(ip_comm_data_out_free) 36 | ); 37 | 38 | ram_interface ram_interface(.clk(clk), .addr(ip_mem_addr), .wr_en(ip_mem_wr_en), .rd_en(ip_mem_rd_en), 39 | .data_write(ip_mem_data_write), .data_read(ip_mem_data_read), .data_read_valid(ip_mem_data_read_valid) 40 | ); 41 | 42 | spi_interface spi_interface(.clk(clk), .spi_sck(SPI_SCK), .spi_ss(SPI_SS), .spi_mosi(SPI_MOSI), .spi_miso(SPI_MISO), 43 | .spi_cmd(ip_comm_cmd), .spi_data_out(ip_comm_data_in), .spi_data_out_valid(ip_comm_data_in_valid), .spi_data_in(ip_comm_data_out), .spi_data_in_valid(ip_comm_data_out_valid), 44 | .spi_data_in_free(ip_comm_data_out_free), .led_debug(spi_led_debug) 45 | ); 46 | 47 | initial begin 48 | led <= 3'b000; 49 | end 50 | 51 | always @(posedge clk) 52 | begin 53 | led <= spi_led_debug; 54 | end 55 | 56 | endmodule 57 | -------------------------------------------------------------------------------- /ice40/software/image_processing_ice40.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | 7 | #include "image_processing_ice40.hpp" 8 | 9 | #define SPI_NOP 0x00 10 | #define SPI_INIT 0x01 11 | #define SPI_READ_DATA 0x02 12 | #define SPI_SEND_CMD 0x03 13 | #define SPI_SEND_DATA 0x04 14 | #define SPI_READ_DATA32 0x05 15 | #define SPI_SEND_DATA32 0x06 16 | 17 | Image_processing_ice40::Image_processing_ice40(){ 18 | 19 | } 20 | 21 | Image_processing_ice40::~Image_processing_ice40(){ 22 | 23 | } 24 | 25 | void Image_processing_ice40::send_params(uint16_t img_width, uint16_t 
img_height){ 26 | 27 | spi_init(); 28 | 29 | uint8_t init_param[3] = {0x0, 0x0, 0x11}; 30 | 31 | if (spi_command_send(SPI_INIT, init_param) != 0){ 32 | printf("trouble to get answer\n"); 33 | } 34 | 35 | this->image_width = img_width; 36 | this->image_height = img_height; 37 | 38 | uint8_t img_width8[2] = {img_width&0xFF, (img_width>>8)&0xFF}; 39 | uint8_t img_height8[2] = {img_height&0xFF, (img_height>>8)&0xFF}; 40 | // printf("ice40!\n"); 41 | printf("sending img_width8[0]: %u, [1]: %u\n", img_width8[0], img_width8[1]); 42 | printf("sending img_height8[0]: %u, [1]: %u\n", img_height8[0], img_height8[1]); 43 | 44 | spi_command_send(SPI_SEND_CMD, COMMAND_PARAM); 45 | spi_command_send(SPI_SEND_DATA, img_width8[0]); 46 | spi_command_send(SPI_SEND_DATA, img_width8[1]); 47 | spi_command_send(SPI_SEND_DATA, img_height8[0]); 48 | spi_command_send(SPI_SEND_DATA, img_height8[1]); 49 | } 50 | 51 | void Image_processing_ice40::read_status(uint8_t *output){ 52 | 53 | spi_command_send(SPI_SEND_CMD, COMMAND_GET_STATUS); 54 | 55 | uint8_t send_data[3] = {0, 0, 0}; //don't care 56 | uint8_t recv_data[2]; 57 | int counter_output = 0; 58 | for (size_t i = 0; i < 100; i++) { 59 | spi_command_send_recv(SPI_READ_DATA, send_data, recv_data); 60 | printf("recv_data[0]: 0x%x\n", recv_data[0]); 61 | if((recv_data[0]&1) == 1 && counter_output<4){ 62 | printf("recv data: 0x%x\n", recv_data[1]); 63 | output[counter_output] = recv_data[1]; 64 | counter_output++; 65 | } 66 | } 67 | } 68 | 69 | void Image_processing_ice40::send_image(uint8_t *image){ 70 | 71 | printf("sending..\n"); 72 | spi_command_send(SPI_SEND_CMD, COMMAND_SEND_IMG); 73 | 74 | uint image_size = image_width*image_height; 75 | 76 | //fast send, 32 bytes per packet (full packets only, so we never read past the end of the image) 77 | uint8_t data_to_send[32]; 78 | size_t i = 0; 79 | for (i = 0; i + 32 <= image_size; i+=32) { 80 | memcpy(data_to_send, image+i, 32); 81 | spi_command_send_32B(SPI_SEND_DATA32, data_to_send); 82 | } 83 | 84 | //send the remaining bytes one by one if the image size is not a multiple of 32 85 | for (size_t j = 
0; j < (image_size%32); j++) { 86 | spi_command_send(SPI_SEND_DATA, image[i+j]); 87 | } 88 | } 89 | 90 | void Image_processing_ice40::send_add(int16_t value, bool clamp){ 91 | uint8_t add_value8[2] = {value&0xFF, (value>>8)&0xFF}; 92 | 93 | spi_command_send(SPI_SEND_CMD, COMMAND_APPLY_ADD); 94 | spi_command_send(SPI_SEND_DATA, add_value8[0]); 95 | spi_command_send(SPI_SEND_DATA, add_value8[1]); 96 | spi_command_send(SPI_SEND_DATA, clamp); 97 | } 98 | 99 | void Image_processing_ice40::send_threshold(uint8_t threshold_value, uint8_t replacement_value, bool upper_selection){ 100 | 101 | spi_command_send(SPI_SEND_CMD, COMMAND_APPLY_THRESHOLD); 102 | spi_command_send(SPI_SEND_DATA, threshold_value); 103 | spi_command_send(SPI_SEND_DATA, replacement_value); 104 | spi_command_send(SPI_SEND_DATA, upper_selection); 105 | } 106 | 107 | void Image_processing_ice40::send_image_invert(){ 108 | 109 | spi_command_send(SPI_SEND_CMD, COMMAND_APPLY_INVERT); 110 | } 111 | 112 | void Image_processing_ice40::send_mult(float value, bool clamp){ 113 | uint8_t val_fixed_4_4 = 0; 114 | if( value < 0 ){ 115 | val_fixed_4_4 = 0; //no sense to multiply an image by negative val 116 | } 117 | 118 | float value_buf = value; 119 | //run over 7 bits (4bits for signed decimal value, so 8th bit will be 0) 120 | for (int i = 6; i >= 0; i--) { 121 | if( value_buf >= pow(2, i-4) ) { 122 | value_buf -= pow(2, i-4); 123 | val_fixed_4_4 += 1<send_threshold(0, value, true); 211 | } 212 | -------------------------------------------------------------------------------- /ice40/software/image_processing_ice40.hpp: -------------------------------------------------------------------------------- 1 | #ifndef IMAGE_PROCESSING_ICE40_H 2 | #define IMAGE_PROCESSING_ICE40_H 3 | 4 | #include "../../software/image_processing.hpp" 5 | #include "spi_lib/spi_lib.h" 6 | 7 | struct Operation{ 8 | bool is_command; 9 | Commands command; 10 | uint8_t data; 11 | 12 | Operation(bool is_command, Commands com, uint8_t data){ 13 | 
this->is_command = is_command; 14 | this->command = com; 15 | this->data = data; 16 | } 17 | }; 18 | 19 | typedef std::queue FIFO_OP; 20 | 21 | class Image_processing_ice40 : public Image_processing{ 22 | public: 23 | Image_processing_ice40(); 24 | ~Image_processing_ice40(); 25 | 26 | virtual void send_params(uint16_t img_width, uint16_t img_height); 27 | virtual void read_status(uint8_t *output); //TODO vector? 28 | virtual void send_image(uint8_t *img); 29 | 30 | virtual void send_add(int16_t value, bool clamp); 31 | virtual void send_threshold(uint8_t threshold_value, uint8_t replacement, bool upper_selection); 32 | virtual void send_image_invert(); 33 | virtual void send_mult(float value, bool clamp); 34 | 35 | virtual void wait_end_busy(); 36 | 37 | virtual void read_image(uint8_t *img); 38 | 39 | virtual void switch_buffers(); 40 | 41 | virtual void send_binary_add(bool clamp); 42 | virtual void send_binary_sub(bool clamp, bool absolute_diff); 43 | virtual void send_binary_mult(bool clamp); 44 | virtual void send_convolution(uint8_t *kernel, bool clamp, bool input_source, bool add_to_output); 45 | virtual void send_clear(uint8_t value); 46 | 47 | private: 48 | }; 49 | 50 | #endif 51 | -------------------------------------------------------------------------------- /ice40/software/spi_lib/spi_lib.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include "spi_lib.h" 4 | 5 | static struct ftdi_context ftdic; 6 | static int ftdic_open = 0; //false 7 | static int verbose = 0; //false 8 | static int ftdic_latency_set = 0; //false 9 | static unsigned char ftdi_latency; 10 | 11 | static void send_byte(uint8_t data) 12 | { 13 | int rc = ftdi_write_data(&ftdic, &data, 1); 14 | if (rc != 1) { 15 | fprintf(stderr, "Write error (single byte, rc=%d, expected %d) data: 0x%x.\n", rc, 1, data); 16 | exit(2); 17 | } 18 | } 19 | 20 | static uint8_t recv_byte() 21 | { 22 | uint8_t data; 23 | while (1) { 24 | int rc 
= ftdi_read_data(&ftdic, &data, 1); 25 | if (rc < 0) { 26 | fprintf(stderr, "Read error.\n"); 27 | exit(2); 28 | } 29 | if (rc == 1) 30 | break; 31 | usleep(100); 32 | } 33 | return data; 34 | } 35 | 36 | static uint8_t xfer_spi_bits(uint8_t data, int n) 37 | { 38 | if (n < 1) 39 | return 0; 40 | 41 | /* Input and output, update data on negative edge read on positive, bits. */ 42 | send_byte(MC_DATA_IN | MC_DATA_OUT | MC_DATA_BITS | MC_DATA_LSB ); 43 | send_byte(n - 1); 44 | send_byte(data); 45 | 46 | return recv_byte(); 47 | } 48 | 49 | static void xfer_spi(uint8_t *data, int n) 50 | { 51 | if (n < 1) 52 | return; 53 | 54 | /* Input and output, update data on negative edge read on positive. */ 55 | send_byte(MC_DATA_IN | MC_DATA_OUT | MC_DATA_LSB | MC_DATA_OCN); 56 | send_byte(n - 1); 57 | send_byte((n - 1) >> 8); 58 | 59 | int rc = ftdi_write_data(&ftdic, data, n); 60 | if (rc != n) { 61 | fprintf(stderr, "Write error (chunk, rc=%d, expected %d).\n", rc, n); 62 | exit(2); 63 | } 64 | 65 | for (int i = 0; i < n; i++) 66 | data[i] = recv_byte(); 67 | } 68 | 69 | static void send_spi(uint8_t *data, int n) 70 | { 71 | if (n < 1) 72 | return; 73 | 74 | //sends data on the negative edges 75 | send_byte(MC_DATA_OUT | MC_DATA_LSB | MC_DATA_OCN); 76 | send_byte(n - 1); 77 | send_byte((n - 1) >> 8); 78 | 79 | int rc = ftdi_write_data(&ftdic, data, n); 80 | if (rc != n) { 81 | fprintf(stderr, "Write error (chunk, rc=%d, expected %d).\n", rc, n); 82 | exit(2); 83 | } 84 | } 85 | 86 | static void read_spi(uint8_t *data, int n) 87 | { 88 | if (n < 1) 89 | return; 90 | 91 | //receive data on positive edge 92 | send_byte(MC_DATA_IN | MC_DATA_LSB /*| MC_DATA_OCN*/); 93 | send_byte(n - 1); 94 | send_byte((n - 1) >> 8); 95 | 96 | // int rc = ftdi_write_data(&ftdic, data, n); 97 | // if (rc != n) { 98 | // fprintf(stderr, "Write error (chunk, rc=%d, expected %d).\n", rc, n); 99 | // error(2); 100 | // } 101 | 102 | for (int i = 0; i < n; i++) 103 | data[i] = recv_byte(); 104 | } 105 
| 106 | 107 | static void set_gpio(int slavesel_b, int creset_b) 108 | { 109 | uint8_t gpio = 0; 110 | 111 | if (slavesel_b) { 112 | // ADBUS4 (GPIOL0) 113 | gpio |= 0x10; 114 | } 115 | 116 | if (creset_b) { 117 | // ADBUS7 (GPIOL3) 118 | gpio |= 0x80; 119 | } 120 | 121 | send_byte(MC_SETB_LOW); 122 | send_byte(gpio); /* Value */ 123 | send_byte(0x93); /* Direction */ 124 | } 125 | 126 | // the FPGA reset is released so also FLASH chip select should be deasserted 127 | static void flash_release_reset() 128 | { 129 | set_gpio(1, 1); 130 | } 131 | 132 | // SRAM reset is the same as flash_chip_select() 133 | // For ease of code reading we use this function instead 134 | static void sram_reset() 135 | { 136 | // Asserting chip select and reset lines 137 | set_gpio(0, 0); 138 | } 139 | 140 | // SRAM chip select assert 141 | // When accessing FPGA SRAM the reset should be released 142 | static void sram_chip_select() 143 | { 144 | set_gpio(0, 1); 145 | } 146 | 147 | 148 | int spi_init() 149 | { 150 | // ftdi initialization taken from iceprog https://github.com/cliffordwolf/icestorm/blob/master/iceprog/iceprog.c 151 | enum ftdi_interface ifnum = INTERFACE_A; 152 | int status = 0; 153 | 154 | printf("init..\n"); 155 | 156 | status = ftdi_init(&ftdic); 157 | if( status != 0) 158 | { 159 | printf("couldn't initialize ftdi\n"); 160 | return 1; 161 | } 162 | 163 | status = ftdi_set_interface(&ftdic, ifnum); 164 | if(status != 0) 165 | { 166 | printf("couldn't initialize ftdi interface\n"); 167 | return 1; 168 | } 169 | 170 | if (ftdi_usb_open(&ftdic, 0x0403, 0x6010) && ftdi_usb_open(&ftdic, 0x0403, 0x6014)) { 171 | printf("Can't find iCE FTDI USB device (vendor_id 0x0403, device_id 0x6010 or 0x6014).\n"); 172 | return 1; 173 | } 174 | 175 | if (ftdi_usb_reset(&ftdic)) { 176 | fprintf(stderr, "Failed to reset iCE FTDI USB device.\n"); 177 | return 1; 178 | } 179 | 180 | if (ftdi_usb_purge_buffers(&ftdic)) { 181 | fprintf(stderr, "Failed to purge buffers on iCE FTDI USB 
device.\n"); 182 | return 1; 183 | } 184 | 185 | if (ftdi_get_latency_timer(&ftdic, &ftdi_latency) < 0) { 186 | fprintf(stderr, "Failed to get latency timer (%s).\n", ftdi_get_error_string(&ftdic)); 187 | return 1; 188 | } 189 | 190 | /* 1 is the fastest polling, it means 1 kHz polling */ 191 | if (ftdi_set_latency_timer(&ftdic, 1) < 0) { 192 | fprintf(stderr, "Failed to set latency timer (%s).\n", ftdi_get_error_string(&ftdic)); 193 | return 1; 194 | } 195 | 196 | // ftdic_latency_set = 1; 197 | 198 | /* Enter MPSSE (Multi-Protocol Synchronous Serial Engine) mode. Set all pins to output. */ 199 | if (ftdi_set_bitmode(&ftdic, 0xff, BITMODE_MPSSE) < 0) { 200 | fprintf(stderr, "Failed to set BITMODE_MPSSE on iCE FTDI USB device.\n"); 201 | exit(2); 202 | } 203 | 204 | //enable clock divide by 5 ==> 6MHz 205 | send_byte(MC_TCK_D5); 206 | 207 | //divides by value+1 208 | send_byte(MC_SET_CLK_DIV); 209 | send_byte(0); 210 | send_byte(0x01); 211 | //so, 6/2 MHz ==> 3MHz 212 | 213 | usleep(100); 214 | 215 | sram_chip_select(); 216 | usleep(2000); 217 | 218 | return 0; 219 | } 220 | 221 | //send without caring about result 222 | int spi_command_send(uint8_t cmd, uint8_t param[3]){ 223 | uint8_t nop_param[] = {0,0}; //won't be returned 224 | 225 | return spi_command_send_recv(cmd, param, nop_param); 226 | } 227 | 228 | //send one byte without caring about result 229 | int spi_command_send(uint8_t cmd, uint8_t param){ 230 | uint8_t nop_param[] = {0,0}; //won't be returned 231 | uint8_t params[] = {param, 0,0}; 232 | 233 | return spi_command_send_recv(cmd, params, nop_param); 234 | } 235 | 236 | //sends 32 bytes 237 | int spi_command_send_32B(uint8_t cmd, uint8_t data[32]){ 238 | uint8_t nop_param[31]; //won't be returned 239 | 240 | return spi_command_send_recv_32B(cmd, data, nop_param); 241 | } 242 | 243 | int spi_command_send_recv(uint8_t cmd, uint8_t send_param[3], uint8_t recv_data[2]) 244 | { 245 | uint8_t to_send[] = {cmd, send_param[0], send_param[1], send_param[2]}; 246 | 
 uint retries = 0; 247 | uint max_retries = 10; 248 | 249 | do{ 250 | to_send[0] = cmd; 251 | memcpy(to_send+1, send_param, 3); //to_send holds 4 bytes: cmd + 3 params 252 | xfer_spi(to_send, 4); 253 | retries++; 254 | } while(retries < max_retries && (to_send[2] & STATUS_FPGA_RECV_MASK) == 0); //check that fpga really received the data 255 | 256 | //copy data received to output 257 | //first 2 bytes are garbage, the third one is the status 258 | memcpy(recv_data, to_send+2, 2); 259 | 260 | return retries >= max_retries; 261 | } 262 | 263 | //TODO merge with last function 264 | int spi_command_send_recv_32B(uint8_t cmd, uint8_t send_param[32], uint8_t recv_data[31]) 265 | { 266 | uint8_t to_send[33]; 267 | uint retries = 0; 268 | uint max_retries = 10; 269 | 270 | do{ 271 | to_send[0] = cmd; 272 | memcpy(to_send+1, send_param, 32); 273 | xfer_spi(to_send, 33); 274 | retries++; 275 | } while(retries < max_retries && (to_send[2] & STATUS_FPGA_RECV_MASK) == 0); //check that fpga really received the data 276 | 277 | //copy data received to output 278 | //first 2 bytes are garbage, the third one is the status 279 | memcpy(recv_data, to_send+2, 31); 280 | 281 | return retries >= max_retries; 282 | } 283 | -------------------------------------------------------------------------------- /ice40/software/spi_lib/spi_lib.h: -------------------------------------------------------------------------------- 1 | #ifndef SPI_LIB_H 2 | #define SPI_LIB_H 3 | #include 4 | 5 | // ftdi initialization taken from iceprog https://github.com/cliffordwolf/icestorm/blob/master/iceprog/iceprog.c 6 | // --------------------------------------------------------- 7 | // MPSSE / FTDI definitions 8 | // --------------------------------------------------------- 9 | 10 | /* FTDI bank pinout typically used for iCE dev boards 11 | * BUS IO | Signal | Control 12 | * -------+--------+-------------- 13 | * xDBUS0 | SCK | MPSSE 14 | * xDBUS1 | MOSI | MPSSE 15 | * xDBUS2 | MISO | MPSSE 16 | * xDBUS3 | nc | 17 | * xDBUS4 | CS | GPIO 18 | * xDBUS5
| nc | 19 | * xDBUS6 | CDONE | GPIO 20 | * xDBUS7 | CRESET | GPIO 21 | */ 22 | 23 | /* MPSSE engine command definitions */ 24 | enum mpsse_cmd 25 | { 26 | /* Mode commands */ 27 | MC_SETB_LOW = 0x80, /* Set Data bits LowByte */ 28 | MC_READB_LOW = 0x81, /* Read Data bits LowByte */ 29 | MC_SETB_HIGH = 0x82, /* Set Data bits HighByte */ 30 | MC_READB_HIGH = 0x83, /* Read data bits HighByte */ 31 | MC_LOOPBACK_EN = 0x84, /* Enable loopback */ 32 | MC_LOOPBACK_DIS = 0x85, /* Disable loopback */ 33 | MC_SET_CLK_DIV = 0x86, /* Set clock divisor */ 34 | MC_FLUSH = 0x87, /* Flush buffer fifos to the PC. */ 35 | MC_WAIT_H = 0x88, /* Wait on GPIOL1 to go high. */ 36 | MC_WAIT_L = 0x89, /* Wait on GPIOL1 to go low. */ 37 | MC_TCK_X5 = 0x8A, /* Disable /5 div, enables 60MHz master clock */ 38 | MC_TCK_D5 = 0x8B, /* Enable /5 div, backward compat to FT2232D */ 39 | MC_EN_3PH_CLK = 0x8C, /* Enable 3 phase clk, DDR I2C */ 40 | MC_DIS_3PH_CLK = 0x8D, /* Disable 3 phase clk */ 41 | MC_CLK_N = 0x8E, /* Clock every bit, used for JTAG */ 42 | MC_CLK_N8 = 0x8F, /* Clock every byte, used for JTAG */ 43 | MC_CLK_TO_H = 0x94, /* Clock until GPIOL1 goes high */ 44 | MC_CLK_TO_L = 0x95, /* Clock until GPIOL1 goes low */ 45 | MC_EN_ADPT_CLK = 0x96, /* Enable adaptive clocking */ 46 | MC_DIS_ADPT_CLK = 0x97, /* Disable adaptive clocking */ 47 | MC_CLK8_TO_H = 0x9C, /* Clock until GPIOL1 goes high, count bytes */ 48 | MC_CLK8_TO_L = 0x9D, /* Clock until GPIOL1 goes low, count bytes */ 49 | MC_TRI = 0x9E, /* Set IO to only drive on 0 and tristate on 1 */ 50 | /* CPU mode commands */ 51 | MC_CPU_RS = 0x90, /* CPUMode read short address */ 52 | MC_CPU_RE = 0x91, /* CPUMode read extended address */ 53 | MC_CPU_WS = 0x92, /* CPUMode write short address */ 54 | MC_CPU_WE = 0x93, /* CPUMode write extended address */ 55 | }; 56 | 57 | #define MC_DATA_TMS (0x40) /* When set use TMS mode */ 58 | #define MC_DATA_IN (0x20) /* When set read data (Data IN) */ 59 | #define MC_DATA_OUT (0x10) /* When set 
write data (Data OUT) */ 60 | #define MC_DATA_LSB (0x08) /* When set input/output data LSB first. */ 61 | #define MC_DATA_ICN (0x04) /* When set receive data on negative clock edge */ 62 | #define MC_DATA_BITS (0x02) /* When set count bits not bytes */ 63 | #define MC_DATA_OCN (0x01) /* When set update data on negative clock edge */ 64 | 65 | #define STATUS_FPGA_RECV_OFFSET 6 //fpga has received data 66 | #define STATUS_FPGA_SEND_OFFSET 7 //fpga has sent data 67 | 68 | #define STATUS_FPGA_RECV_MASK (0x1< 2 | #include 3 | #include 4 | 5 | #include "image_processing_simulation.hpp" 6 | 7 | Image_processing_simulation::Image_processing_simulation(){ 8 | this->simulator = new Vimage_processing; 9 | 10 | this->memory = new uint16_t[512*128]; 11 | } 12 | 13 | Image_processing_simulation::~Image_processing_simulation(){ 14 | delete this->simulator; 15 | delete[] this->memory; 16 | } 17 | 18 | void Image_processing_simulation::send_params(uint16_t img_width, uint16_t img_height){ 19 | 20 | this->image_width = img_width; 21 | this->image_height = img_height; 22 | 23 | uint8_t img_width8[2] = {img_width&0xFF, (img_width>>8)&0xFF}; 24 | uint8_t img_height8[2] = {img_height&0xFF, (img_height>>8)&0xFF}; 25 | 26 | printf("sending img_width8[0]: %u, [1]: %u\n", img_width8[0], img_width8[1]); 27 | printf("sending img_height8[0]: %u, [1]: %u\n", img_height8[0], img_height8[1]); 28 | 29 | fifo_in.push(Operation(true, COMMAND_PARAM, 0)); 30 | fifo_in.push(Operation(false, COMMAND_NONE, img_width8[0])); 31 | fifo_in.push(Operation(false, COMMAND_NONE, img_width8[1])); 32 | fifo_in.push(Operation(false, COMMAND_NONE, img_height8[0])); 33 | fifo_in.push(Operation(false, COMMAND_NONE, img_height8[1])); 34 | 35 | for (size_t i = 0; i < 5; i++) { 36 | main_loop_clk(); 37 | } 38 | } 39 | 40 | void Image_processing_simulation::read_status(uint8_t *output){ 41 | fifo_in.push(Operation(true, COMMAND_GET_STATUS, 0)); 42 | 43 | for (size_t i = 0; i < 100; i++) { 44 | main_loop_clk(); 45 | } 46 | 
47 | // while(!fifo_out.empty()){ 48 | // printf("read status: %u\n", fifo_out.front()); 49 | // fifo_out.pop(); 50 | // } 51 | 52 | for (size_t i = 0; i < 4; i++) { 53 | if(!fifo_out.empty()){ 54 | printf("read status: %u\n", fifo_out.front()); 55 | output[i] = fifo_out.front(); 56 | fifo_out.pop(); 57 | } 58 | } 59 | } 60 | 61 | void Image_processing_simulation::send_image(uint8_t *image){ 62 | fifo_in.push(Operation(true, COMMAND_SEND_IMG, 0)); 63 | for (size_t i = 0; i < image_width*image_height; i++) { 64 | fifo_in.push(Operation(false, COMMAND_NONE, image[i])); 65 | } 66 | 67 | for (size_t i = 0; i < image_width*image_height+500; i++) { 68 | main_loop_clk(); 69 | } 70 | } 71 | 72 | void Image_processing_simulation::send_add(int16_t value, bool clamp){ 73 | uint8_t add_value8[2] = {value&0xFF, (value>>8)&0xFF}; 74 | 75 | fifo_in.push(Operation(true, COMMAND_APPLY_ADD, 0)); 76 | fifo_in.push(Operation(false, COMMAND_NONE, add_value8[0])); 77 | fifo_in.push(Operation(false, COMMAND_NONE, add_value8[1])); 78 | fifo_in.push(Operation(false, COMMAND_NONE, clamp)); 79 | 80 | for (size_t i = 0; i < 100; i++) { //don't wait too long, will be checked with status 81 | main_loop_clk(); 82 | } 83 | } 84 | 85 | void Image_processing_simulation::send_threshold(uint8_t threshold_value, uint8_t replacement_value, bool upper_selection){ 86 | fifo_in.push(Operation(true, COMMAND_APPLY_THRESHOLD, 0)); 87 | fifo_in.push(Operation(false, COMMAND_NONE, threshold_value)); 88 | fifo_in.push(Operation(false, COMMAND_NONE, replacement_value)); 89 | fifo_in.push(Operation(false, COMMAND_NONE, upper_selection)); 90 | 91 | for (size_t i = 0; i < 100; i++) { 92 | main_loop_clk(); 93 | } 94 | } 95 | 96 | void Image_processing_simulation::send_image_invert(){ 97 | fifo_in.push(Operation(true, COMMAND_APPLY_INVERT, 0)); 98 | 99 | for (size_t i = 0; i < 10; i++) { 100 | main_loop_clk(); 101 | } 102 | } 103 | 104 | void Image_processing_simulation::send_mult(float value, bool clamp){ 105 | 
uint8_t val_fixed_4_4 = 0; 106 | if( value < 0 ){ 107 | val_fixed_4_4 = 0; //no sense to multiply an image by negative val 108 | } 109 | 110 | float value_buf = value; 111 | //run over 7 bits (4bits for signed decimal value, so 8th bit will be 0) 112 | for (int i = 6; i >= 0; i--) { 113 | if( value_buf >= pow(2, i-4) ) { 114 | value_buf -= pow(2, i-4); 115 | val_fixed_4_4 += 1<send_threshold(0, value, true); 219 | } 220 | 221 | 222 | 223 | void Image_processing_simulation::main_loop_clk(){ 224 | simulator->clk = 0; 225 | simulator->reset = 0; 226 | simulator->eval(); 227 | simulator->clk = 1; 228 | simulator->reset = 0; 229 | simulator->comm_data_in_valid = 0; 230 | static int counter_free = 0; 231 | if(counter_free > 0){ //simulates the fact that the comm line can be full 232 | counter_free--; 233 | } 234 | simulator->comm_data_out_free = (counter_free == 0); 235 | if(!fifo_in.empty()){ 236 | Operation op = fifo_in.front(); 237 | fifo_in.pop(); 238 | if(op.is_command){ 239 | simulator->comm_data_in_valid = 1; 240 | simulator->comm_cmd = op.command; 241 | } 242 | else{ 243 | simulator->comm_data_in_valid = 1; 244 | simulator->comm_data_in = op.data; 245 | } 246 | } 247 | 248 | if(simulator->comm_data_out_valid == 1){ 249 | fifo_out.push(simulator->comm_data_out); 250 | counter_free = 3; 251 | simulator->comm_data_out_free = (counter_free == 0); 252 | } 253 | 254 | simulator->data_read_valid = 0; 255 | if(simulator->rd_en == 1){ 256 | printf("read req addr: 0x%x\n", simulator->addr); 257 | simulator->data_read = memory[simulator->addr/2]; 258 | simulator->data_read_valid = 1; 259 | } 260 | if(simulator->wr_en == 1){ 261 | printf("wants to write: addr:0x%x data:0x%x\n", simulator->addr, simulator->data_write); 262 | memory[simulator->addr/2] = simulator->data_write; 263 | } 264 | 265 | simulator->eval(); 266 | } 267 | -------------------------------------------------------------------------------- /simulation/image_processing_simulation.hpp: 
-------------------------------------------------------------------------------- 1 | #ifndef IMAGE_PROCESSING_SIMULATION_H 2 | #define IMAGE_PROCESSING_SIMULATION_H 3 | 4 | #include "../software/image_processing.hpp" 5 | #include "../obj_dir/Vimage_processing.h" 6 | 7 | struct Operation{ 8 | bool is_command; 9 | Commands command; 10 | uint8_t data; 11 | 12 | Operation(bool is_command, Commands com, uint8_t data){ 13 | this->is_command = is_command; 14 | this->command = com; 15 | this->data = data; 16 | } 17 | }; 18 | 19 | typedef std::queue FIFO_OP; 20 | 21 | class Image_processing_simulation : public Image_processing{ 22 | public: 23 | Image_processing_simulation(); 24 | ~Image_processing_simulation(); 25 | 26 | virtual void send_params(uint16_t img_width, uint16_t img_height); 27 | virtual void read_status(uint8_t *output); //TODO vector? 28 | virtual void send_image(uint8_t *img); 29 | 30 | virtual void send_add(int16_t value, bool clamp); 31 | virtual void send_threshold(uint8_t threshold_value, uint8_t replacement, bool upper_selection); 32 | virtual void send_image_invert(); 33 | virtual void send_mult(float value, bool clamp); 34 | 35 | virtual void wait_end_busy(); 36 | 37 | virtual void read_image(uint8_t *img); 38 | 39 | virtual void switch_buffers(); 40 | 41 | virtual void send_binary_add(bool clamp); 42 | virtual void send_binary_sub(bool clamp, bool absolute_diff); 43 | virtual void send_binary_mult(bool clamp); 44 | virtual void send_convolution(uint8_t *kernel, bool clamp, bool input_source, bool add_to_output); 45 | virtual void send_clear(uint8_t value); 46 | 47 | private: 48 | void main_loop_clk(); 49 | 50 | Vimage_processing *simulator; 51 | FIFO_OP fifo_in; 52 | std::queue fifo_out; 53 | 54 | uint16_t *memory; 55 | }; 56 | 57 | #endif 58 | -------------------------------------------------------------------------------- /software/image_processing.hpp: -------------------------------------------------------------------------------- 1 | #ifndef 
IMAGE_PROCESSING_H 2 | #define IMAGE_PROCESSING_H 3 | 4 | enum Commands {COMMAND_PARAM, COMMAND_SEND_IMG, COMMAND_READ_IMG, COMMAND_GET_STATUS, COMMAND_APPLY_ADD, COMMAND_APPLY_THRESHOLD, 5 | COMMAND_SWITCH_BUFFERS, COMMAND_BINARY_ADD, COMMAND_APPLY_INVERT, 6 | COMMAND_CONVOLUTION, COMMAND_BINARY_SUB, COMMAND_BINARY_MULT, COMMAND_APPLY_MULT, COMMAND_NONE=255}; 7 | 8 | class Image_processing { 9 | 10 | public: 11 | Image_processing(){ 12 | 13 | } 14 | 15 | virtual void send_params(uint16_t img_width, uint16_t img_height) = 0; 16 | virtual void read_status(uint8_t *output) = 0; //TODO vector? 17 | virtual void send_image(uint8_t *img) = 0; 18 | 19 | virtual void send_add(int16_t value, bool clamp) = 0; 20 | virtual void send_threshold(uint8_t threshold_value, uint8_t replacement, bool upper_selection) = 0; 21 | virtual void send_image_invert() = 0; 22 | virtual void send_mult(float value, bool clamp) = 0; 23 | 24 | virtual void wait_end_busy() = 0; 25 | 26 | virtual void read_image(uint8_t *img) = 0; 27 | 28 | virtual void switch_buffers() = 0; 29 | 30 | virtual void send_binary_add(bool clamp) = 0; 31 | virtual void send_binary_sub(bool clamp, bool absolute_diff) = 0; 32 | virtual void send_binary_mult(bool clamp) = 0; 33 | virtual void send_convolution(uint8_t *kernel, bool clamp, bool input_source, bool add_to_output) = 0; 34 | virtual void send_clear(uint8_t value) = 0; 35 | 36 | protected: 37 | uint16_t image_width; 38 | uint16_t image_height; 39 | }; 40 | 41 | #endif 42 | -------------------------------------------------------------------------------- /software/images/image_fruits_64.h: -------------------------------------------------------------------------------- 1 | /* GIMP header image file format (RGB)*/ 2 | 3 | static unsigned int image_width = 64; 4 | static unsigned int image_height = 64; 5 | 6 | /* Call this macro repeatedly. 
After each use, the pixel data can be extracted */ 7 | 8 | #define HEADER_PIXEL(data,pixel) {\ 9 | pixel[0] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4)); \ 10 | pixel[1] = ((((data[1] - 33) & 0xF) << 4) | ((data[2] - 33) >> 2)); \ 11 | pixel[2] = ((((data[2] - 33) & 0x3) << 6) | ((data[3] - 33))); \ 12 | data += 4; \ 13 | } 14 | static const char *header_data = 15 | "V>86VN<7X^`@Y/$AX^`@Y/$AV>86UN,3UN,3X.T=U^04T-T-?XN\\J[?HT]`0T]`0" 16 | "T=X.@HZ_J[?HD9W.TM\\/S=H*R]@(Z?8F`0T]\\O\\OV.45Q-$!Q-$!M\\/TN\\?XO,CY" 17 | "NL;WN,3UML+SKKKKL;WNLK[OK[OLKKKKJ[?HHJ[?I+#AK+CIML+SM<'RLK[OI+#A" 18 | "7FJ;&\"1535F*=8&R;7FJBY?(H*S=D)S-D9W.D9W.DI[/E*#1DY_0DY_0EJ+3F*35" 19 | "W^P86X^`@SML+U>(2UN,3X>X>U>(2UN,3T]`0C)C)[/DIYO,C" 20 | "H:W>:W>H%\"!1WNL;S-D)Q]0$R]@(^@8V_0DY\\_`PR=8&Q-$!PL[_PHEJ+3JK;GBI;'F:76FJ;7DI[/D9W.DI[/DY_0EJ+3FJ;7FZ?8HJ[?" 23 | "5&\"1W>H:X>X>V>86V.45VN<7W.D9U>(2U.$1W.D9XN\\?X>X>XN\\?V>86DI[/>86V" 24 | ".T=X'2E:H:W>VN<7S=H*RM<'V^@8`0T]^04UZO86U>(2ZOH:X>X>X.T=V^@8U>(2UN,3U>(2JK;G-4%R" 28 | "*C9GGJK;U.$1W.D9T=X.Y/$A^@8V`0T]\\?XNUN,3P\\_`Q=(\"PL[_O\\O\\P,S]NL;W" 29 | "M,#QN,3UML+SM<'RL[_PL+SMJK;GJK;GK+CII[/DI[/DJ;7FML+SM<'RL[_PJ;7F" 30 | "0$Q]&256JK;GIK+CKKKK" 31 | "X>X>35F*XN\\?Y_0DXN\\?XN\\?XN\\?U^04SML+U.$1X>X>UN,3W>H:=H*SA)#!(\"Q=" 32 | "I;'BX^`@WNL;WNL;VN<7U>(2_PL[^PG:G:EJ+3E*#1DY_0D9W.EZ/4FZ?8GJK;I;'BJ;7FL;WN" 35 | "X^`@Y?(BAI+#W^P86VN<7T-T-XN\\?TM\\/<'RMA9'\"E*#1Y?(B" 36 | "V^@8S=H*T-T-S=H*W^P<[/DI`@X^]0$QX>X>Q=(\"Q-$!N,3UPL[_O\\O\\OU>(2U.$1V>86VN<7X.T=V.45X^`@Y_0DM<'R=(\"Q)#!AF*35T]`0" 40 | "R-4%T-T-T=X.S=H*[/DI`0T]^`@X\\O\\OR=8&PL[_PL[_O\\O\\N,3UO\\O\\O,CYNL;W" 41 | "NL;WN,3UK+CIL;WNL+SML+SMK+CIK;GJH*S=IK+CI;'BI+#AM<'RLK[OM,#QJK;G" 42 | "97&B*S=HJ[?HH:W>G:G:E:'2DY_0DY_0EJ+3GJK;GZOX>W.D9X.T=8FZ?X>X>W^P86U>(2SML+\\_`P`@X^]@(RX^`@Q=(\"PI+#AM,#QM,#QM,#QJ+3E" 46 | ">86V256&E:'2GJK;FJ;7E*#1E*#1E:'2G:G:H*S=H*S=H:W>I+#AK+CIL;WNM,#Q" 47 | "Y?(BV>86W^PX>" 48 | "WNL;W.D9V^@8ZOFJ;7GZOG*C9EZ/4DY_0E:'2FZ?8H:W>H:W>HZ_@H:W>I[/DJK;GL+SMLK[O" 51 | 
"Y?(BX>X>U^04UN,3IK+CT]`0W>H:KKKK5F*3*C9G256&V.45LK[OM\\/TT=X.VN<7" 52 | "[OLKZOX>U>(2T]`0U.$186V>56&2,#QM>X>X=8&RU.$1P\\_`S=H*U^04X>X>W^P<" 56 | "Z/4EX>X>^04U`P\\_^@8V]0$QL+SMQ=(\"PI;'BI;'BI[/DJ+3EI+#AHZ_@HZ_@HZ_@HZ_@" 59 | "TM\\/X.T=T=X.U^04ML+S2%2%*C9G)C)CTM\\/XN\\?U>(2R]@(R]@(OLK[T-T-U^04" 60 | "ZO(2P,S]SML+UN,3" 64 | "Z/4E`P\\__`P\\^@8V[/DIX>X>V.45S]P,N,3UO(2L[_PLK[OL;WNU.$1P\\_`Q=(\"QM,#VN<7WNL;`P\\_" 72 | "`@X^^`@XZ?8FZ?8FZ/4EX.T=T=X.R-4%M,#QM\\/TN,3UN<7VN<7VN,3UOH:W>H*S=GJK;FJ;7K+CIJK;GJK;GKKKK" 74 | "J[?HEZ/4DI[/DY_0EJ+3H:W>I+#AK+CIKKKKJK;GH*S=F:76E*#1E*#1DY_0D)S-" 75 | "'RM<2E:'T-T-SML+N\\?XQM,#L+SMLK[OON\\?XT=X.O,CYKKKKK+CIS]P,P,S]QM,#V>86T]`0T]`0Z/4E`@X^`@X^_0DY" 80 | "\\/TMZO(2SML+R-4%PL[_P,S]OLK[ML+SK+CIJ;7FJ+3EH:W>I;'BJ+3EJ;7FKKKK" 82 | "I+#ADI[/D)S-DI[/EJ+3G*C9HZ_@J;7FI+#AG*C9F*35D)S-BI;'B97&C)C)B)3%" 83 | "T=X.O\\O\\P\\_`N\\?XKKKKL+SMU>(2Q-$!R=8&S=H*Z?8F[/DI`P\\__`P\\^P86P(2" 88 | "Y_0DZ/4EYO,CX.T=W^P86U>(2SML+DY_0N\\?XQ-$!Q=(\"R]@(S=H*SML+T=X." 89 | "SML+T-T-T]`0T=X.S]P,SML+S]P,T-T-S-D)PL[_L;WNK;GJK;GJGZO86U^04V.45V.45U.$1J+3ET=X.V>86V>86U^04TM\\/R=8&Q-$!" 
93 | "QM,#R=8&S=H*RM<'S-D)R]@(S=H*S]P,U.$1U.$1T=X.R=8&P\\_`OLK[K;GJN<7V" 94 | "O,CYC)C)B97&BY?(C)C)D9W.BY?(@(R]?(BY>86V>H:W@X^`AI+#?XN\\?8FZ?8FZ" 95 | "OLK[G:G:HJ[?Q]0$I[/DKKKKP\\_`_PL[`P\\__PL[^@8V]@(RUN,3S=H*Q]0$WNL;" 96 | "W.D9V>86T]`0TM\\/T=X.U>(2U.$1UN,3K[OLU.$1V>86VN<7V^@8V>86U.$1R]@(" 97 | "PL[_P\\_`PH:W?8FZ?8FZ?XN\\?8FZ;'BI?(BY?(BY?(BY" 99 | "P,S]FJ;7G:G:J;7FQ=(\"Q]0$_@HZ`@X^_0DY]`0T[OLKR]@(S=H*TM\\/W^PH:" 100 | "W.D9W.D9V>86U.$1S-D)R]@(SML+S]P,JK;GT]`0V>86U^04U.$1T]`0T=X.T-T-" 101 | "R]@(PL[_O(2UAI+#?XN\\>H:W?(BY<7VN>(2UH:W^PX>" 104 | "X.T=X.T=X.T=W.D9V>86V>86V.45UN,3UN,3UN,3TM\\/S]P,U.$1TM\\/S=H*SML+" 105 | "S]P,R=8&O\\O\\LK[OKKKKJ;7FGZO86VX>X>X>" 108 | "XN\\?X^`@WNL;WNL;W>H:V^@8WNL;W>H:V^@8U^04UN,3U>(2T-T-T=X.T]`0S=H*" 109 | "S]P,S-D)QM,#M\\/TGJK;E:'2C9G*B)3%@HZ_BY?(J;7FL+SMGZOS]P,T-T-" 110 | "S=H*JK;GB97&AY/$@HZ_B97&A9'\"A)#!D)S-BI;'CIK+?8FZ=X.T<7VN;'BI:76F" 111 | "[OLK`P\\_`0T]^`@X]P,S\\?XNV^@8S]P,T]`0U.$1T]`0V.45W>H:WNL;WNL;X>X>" 112 | "X>X>X>X>W^PX>W^P(2T=X.TM\\/S]P," 113 | "S]P,RM<'Q-$!P,S]H:W>D)S-E:'2K;GJS=H*W.D9V^@8KKKKE:'2P\\_`T-T-T-T-" 114 | "OLK[FZ?8?8FZA)#!BI;'AY/$>X>XAI+#D)S-E:'2E*#1B)3%>86V>X>X;'BI97&B" 115 | "`0T]_0DY]`0T]0$QXN\\?TM\\/U.$1U>(2U^04U^04U.$1VN<7VN<7W^PX>W>H:W>H:V>86U.$1V.45U>(2U>(2S=H*" 117 | "S=H*Q]0$R=8&OLK[K[OLSML+VN<7UN,3U^04WNL;M,#QBI;'K[OLQM,#T]`0R=8&" 118 | "J+3EB97&@X^`@(R]AI+#?8FZBI;'A9'\"D)S-G:G:GZO8" 119 | "]`0T]0$QY?(BT]`0U>(2UN,3TM\\/TM\\/T-T-S-D)T=X.UN,3V^@8W.D9X>X>W>H:" 120 | "XN\\?W^PH:V^@8X.T=X.T=X>X>Y?(BX.T=X.T=VN<7VN<7U^04T]`0T]`0T-T-" 121 | "RM<'Q=(\"Q-$!O86U.$1G*C9L;WNJ+3EHJ[?R=8&R]@(O,CY" 122 | "D9W.BY?(?(BY>X>XBY?(C)C)A)#!BY?(B97&G:G:IK+CI+#A?XN\\<'RM:'2EH:WNL;X.T=XN\\?X>X>X.T=VN<7U^04U>(2S]P,SML+SML+" 125 | "R=8&P\\_`OLK[M\\/TJK;GN\\?XUN,3V^@8I+#AQ-$!S-D)LK[OE*#1Q=(\"N<7VJ+3E" 126 | "A9'\"A)#!=8&R@X^`D9W.CIK+AY/$DY_0AY/$HJ[?ML+SGJK;AY/$:W>H>(2UJ[?HM\\/TQ-$!S-D)T-T-RM<'T]`0V.45VN<7W.D9WNL;W^P<" 128 | "W^PH:VN<7VN<7V^@8V^@8V^@8VN<7U^04T=X.T=X.T-T-S=H*R]@(" 129 | "P\\_`P,S]N,3UKKKKJK;GLK[OW^PH:W=(\"Q" 131 
| "K[OLM\\/TOLK[R-4%TM\\/V>86W.D9W>H:WNL;V>86T-T-UN,3V.45VN<7U>(2W.D9" 132 | "WNL;WNL;W.D9V.45VN<7V>86V>86T]`0U^04U^04U>(2SML+PL[_Q]0$QM,#Q=(\"" 133 | "OLK[NL;WL+SMJ[?HH:W>I[/D\\?XNK;GJQ-$!Q]0$T=X.N\\?X?(BYJ;7FH:W>?HJ[" 134 | "@X^`>H:W=X.T@(R]D9W.E:'2D9W.DI[/D9W.D)S-@X^`A)#!A)#!:W>H=X.TX>X.T=Y/$AX^`@XN\\?V^@8S=H*U.$1U^04U^04UN,3U^04" 136 | "V.45VN<7V>86U^04PL[_N\\?XO,CYS=H*TM\\/T]`0S=H*S=H*R=8&Q=(\"P,S]O,CY" 137 | "N<7VLK[OKKKKHJ[?G*C9FJ;7X.T=L[_POLK[S]P,RM<'NL;W?(BYG:G:EZ/4>(2U" 138 | ">X>X?(BY@X^`?XN\\D)S-D9W.CIK+CYO,E:'2>86V?8FZA9'\"?8FZ=8&R;'BI;7FJ" 139 | "7FJ;<'RMT-T-V.45V^@8W^PH:X.T=WNL;U.$1R-4%T-T-U.$1UN,3UN,3UN,3" 140 | "TM\\/T-T-T=X.K+CIK[OLKKKKK;GJI;'BO(2U" 142 | ">86V@(R]@8V^@X^`BY?(C)C)C)C)C)C)FJ;7A)#!G:G:G:G:@HZ_AY/$?HJ[:G:G" 143 | "7&B99W.DBY?(Q=(\"R]@(TM\\/UN,3UN,3VN<7U>(2P,S]T-T-OF:76CIK+@8V^GJK;O,CYRM<'Q-$!NL;WKKKKD)S-@8V^AY/$;GJK" 146 | "?HJ[?XN\\A9'\"@X^`BY?(B)3%@HZ_@8V^D9W.BI;'JK;GH*S=CIK+A)#!@(R]8V^@" 147 | "66663UN,4EZ/Q]0$RM<'S]P,T]`0V.45T-T-S]P,NL;WN,3UJ+3EKKKKKKKKJ;7F" 148 | "GJK;OEJ+3CIK+@(R]CIK+E*#1F:76GJK;K[OLM,#QI;'BFJ;7=H*S>X>X86V@X^`D9W.EZ/4BY?(@X^`>(2U;WNL" 151 | "4%R-35F*04U^RM<'S=H*TM\\/T]`0U^04T=X.V>86M<'RJ;7FKKKKKKKKKKKKI;'B" 152 | "H:W>EZ/4DY_0JK;GI[/DI+#AG:G:F*35EJ+3CYO,L[_PCYO,EJ+3K+CIJK;GI[/D" 153 | "H*S=GJK;E*#1A9'\"(2U97&B" 154 | ">86VA9'\"B)3%A9'\"BY?(F*35AI+#AY/$>X>XAI+#BY?(GJK;D)S-@HZ_@(R]<'RM" 155 | "35F*0T]`-D)SZ/4EW>H:W^PH:U^04LK[OJK;GJK;GJK;GHZ_@J;7F" 156 | "I;'BF:76DI[/F*35F*35B)3%>X>XC)C)FJ;7G:G:D9W.?(BYEJ+3I+#AK;GJJ[?H" 157 | "L+SMJ;7FEZ/4>X>XB97&KKKKJK;GI+#AIK+CDI[/DI[/HJ[?HZ_@FJ;7DI[/>X>X" 158 | "<'RMA)#!A)#!@X^`C9G*DY_0B97&AI+#@X^`CIK+D)S-FZ?8EJ+3D9W.>86V:76F" 159 | "1U.$1U.$D9W.U^04W^PI+#AH:W>I[/DJ+3E" 160 | "H*S=GJK;D)S-CYO,BI;'=(\"QHJ[?L+SMF*35CIK+FZ?8AY/$F:76I+#AKKKKK+CI" 161 | "K[OLJK;GJ;7FF*35DI[/I[/DJ;7FI+#AH*S=E*#1E*#1DI[/GJK;G:G:E:'2D9W." 
162 | "<7VN=X.T(2UB97&CIK+@(R]>86V?8FZ?XN\\CIK+JK;GF*35?XN\\?8FZ:G:G" 163 | "3UN,F:76VN<7T]`0T-T-X.T=UN,3I+#AJK;GG*C9DI[/BI;'FZ?8H*S=I[/DJ;7F" 164 | "H*S=I;'BDI[/B97&CYO,AY/$K[OLJK;GF:76D)S-AY/$E*#1?8FZI[/DK+CIKKKK" 165 | "LK[OK;GJI;'BG*C9H*S=I[/DH:W>GJK;F:76D9W.D9W.CIK+FJ;7F*35BI;'A9'\"" 166 | "A9'\">X>X<7VN<7VN?HJ[?XN\\:76F;WNL8V^@;WNL=8&RA9'\"EJ+3>86VDY_09G*C" 167 | "O\\O\\OHZ_@LK[OI+#AH*S=EJ+3DY_0BI;'B97&AI+#CYO,D9W.A9'\"@HZ_" 170 | "=H*SE*#1G:G:EJ+3?XN\\?HJ[=8&R6V>89W.D=8&R(2U;'BI>(2U?XN\\:W>H" 171 | "Q=(\"LK[OS]P,UN,3U^04M<'RH:W>JK;GJK;GJ;7FGZOGJK;D)S-H*S=O,CYHJ[?J+3EIK+CFJ;7" 173 | "G:G:G*C9D9W.JK;GKKKKK;GJK+CIG:G:CYO,CYO,>X>XB)3%FZ?8FJ;7B97&:G:G" 174 | "DY_0B)3%G:G:H:W>GZOX>X?(BYBY?(D)S-@(R]?(BY=H*S:'2E" 175 | "Q]0$M<'RR-4%U>(2V^@8L+SMGZOG:G:D)S-BY?(" 178 | "?XN\\D)S-GZOX>X=(\"Q45V.@HZ_?XN\\@8V^I;'BI[/DI+#AHJ[?FZ?8EJ+3FZ?8" 181 | "?XN\\@(R]>H:WF:76GJK;FZ?8EZ/4D9W.BI;'?(BYDY_0G*C9I;'BKKKKA)#!?HJ[" 182 | "C)C)C9G*F:76F:76EJ+3E:'2E*#18FZ?5V.466668&R=8V^@9W.D9G*C:76F9'\"A" 183 | "JK;GG:G:J;7FS-D)WNL;Y/$A?XN\\C)C)CIK+DY_0F*35EJ+3H*S=L[_PLK[OJK;G" 184 | "HZ_@A)#!:76F:G:G1U.$256&4EZ/8&R=:'2EGZOX>XD)S-CYO,B)3%@X^`?HJ[?8FZDY_0FJ;7HJ[?I;'B@(R]>(2U" 186 | "A9'\"C)C)B)3%?8FZ@8V^C9G*A9'\"7FJ;4EZ/5&\"18FZ?7VN<6V>886V>9W.D86V>" 187 | "N,3UH:W>CIK+PL[_W.D9Y?(BS=H*DY_0@X^`BI;'E*#1G:G:GZOFZ?8FJ;7E*#1D)S-B97&" 189 | ">(2U8FZ?6F:75&\"1<7VN?HJ[E:'2N<7VI[/DB97&C)C)CIK+D)S-C)C)>(2U=(\"Q" 190 | "?XN\\?(BY?8FZ@HZ_>86V@X^`<7VN1E*#8V^@9G*C5&\"17FJ;97&B7&B98&R=7FJ;" 191 | "R]@(O,CYCIK+IK+CTM\\/M\\/TF*35I;'BC)C)>H:WBY?(GJK;E*#1F*35HJ[?IK+C" 192 | "G:G:CIK+@HZ_6F:7CIK+M\\/TKKKKJ;7FAI+#CYO,F*35DY_0D)S-BI;'A9'\"?8FZ" 193 | "8V^@3UN,BY?(DI[/A)#!?(BYM,#QF:76KKKKD)S-B97&C9G*X>X" 194 | "<7VN?XN\\G*C9L;WNKKKKE:'29'\"A=8&RH*S=HZ_@E*#17&B90T]`7FJ;8&R=8FZ?" 
195 | "S-D)Q]0$IK+C@X^`K[OLDY_0E:'2J;7FGJK;?XN\\=(\"QF*35G:G:EZ/4GJK;FJ;7" 196 | "D9W.BI;'A9'\"BY?(L+SMK;GJJ[?HHZ_@E:'2>86VCIK+AY/$AI+#?8FZ=X.T<'RM" 197 | "/$AY97&BD9W.FJ;7G*C9IK+CH:W>GZOEZ/4@8V^A9'\"G*C9EZ/4CIK+D)S-D)S-" 200 | "C9G*@X^`>H:WG:G:K;GJJ;7FH*S=EZ/4DI[/:G:G:'2E@8V^=X.T<'RM:G:G6&25" 201 | "(R]@97&BBY?(F:76FJ;7G*C9D)S-BI;'BI;'CYO,@(R]>(2UBI;'97&B8FZ?7FJ;" 202 | ">86VB97&HZ_@HZ_@FJ;7FZ?8G*C9C9G*FJ;7C9G*D9W.D9W.=X.T4U^066664EZ/" 203 | "U.$1U.$1UN,3T-T-AY/$DI[/E:'2D9W.@X^`>H:W?XN\\DY_0C)C)@HZ_@8V^AI+#" 204 | "?HJ[>(2U;GJKG*C9IK+CHJ[?G*C9E*#1BY?(=(\"Q3EJ+;7FJCYO,AI+#=H*S-T-T" 205 | "+CIK>86VCYO,E*#1D)S-BI;'D9W.@HZ_A)#!AY/$>H:W@X^`:G:G>(2U=(\"Q;WNL" 206 | "=X.TB)3%E*#1D9W.C9G*D)S-D9W.AI+#C9G*EZ/4DI[/B)3%?HJ[1U.$5F*34EZ/" 207 | "S]P,T=X.T]`0V.45D9W.CIK+CIK+B)3%AI+#@HZ_=8&R?(BYB)3%?(BYH:WEZ/4CYO,F*35J;7F" 210 | "G*C9@(R]@(R]>H:W?HJ[>H:WA)#!=8&RA9'\"DI[/EZ/4F*3586V;WNL" 212 | "97&B6F:735F*?(BYE*#1CYO,BI;'@HZ_?8FZ=H*SK[OLK;GJK;GJI;'BF*35A)#!" 213 | "?HJ[+3EJ;'BI=H*S<'RM66665F*3=(\"Q=X.TH:W>H:W>86V:76FC9G*DY_0E*#1E*#1C9G*04U^256&3EJ+" 215 | "T-T-U^04UN,3V.45U^04T-T-G*C9BI;'?XN\\=X.T76F:6&2597&B86V>>(2U<'RM" 216 | "7VN<6F:77FJ;7&B9BY?(@X^`?HJ[=X.T;WNLAI+#IK+CGZOX>X66662%2%56&28FZ?8V^@6V>81E*#*#1E,3UN7&B9HZ_@F:76G:G:IK+CGZO<" 218 | "AI+#GJK;<'RM:G:G:'2E;7FJ9G*C<7VNB)3%C9G*D)S-D)S-CIK+4EZ/3%B)4U^0" 219 | "U>(2U^04V.45UN,3T]`0T]`0S]P,AI+#DI[/;GJK?8FZ>X>X;'BI9G*C=H*S=H*S" 220 | ";7FJ:W>H6666:76F97&B=8&R;WNL;'BI86V>?(BYFZ?8I+#AH*S=G*C9D9W.@8V^" 221 | "?(BY>(2U?(BYDY_0KKKKGZOH:W>EJ+3C9G*" 222 | "A9'\"DI[/;WNL66667VN<6&254%R-;7FJ?HJ[AI+#BI;'AI+#?XN\\5V.43EJ+4U^0" 223 | "U^04V>86T]`0T=X.T=X.SML+RM<'P\\_`BI;'EZ/4E:'2A)#!<'RM9G*C<'RM?8FZ" 224 | "X>XDI[/CIK+@8V^" 225 | "?XN\\<'RMD9W.HZ_@HZ_@HZ_@FZ?8@8V^>H:W6V>83UN,G*C9GJK;C)C)CYO,A)#!" 
226 | "?(BYA)#!4U^0256&,S]P)C)C2U>(9G*C=(\"Q>H:W?8FZ=H*S;GJK6F:76F:77VN<" 227 | "T]`0U>(2UN,3UN,3S]P,S-D)QM,#M,#QG:G:F*35A)#!?8FZ@HZ_=X.T;'BI;'BI" 228 | ":W>HDY_0CYO,F:76H:W>FJ;7C)C)H:W@HZ_>X>X" 229 | "=8&R86V>DI[/H*S=FJ;7G*C9B97&D)S-@X^`7VN<,#QMA)#!CIK+C9G*?HJ[>(2U" 230 | ";'BI=X.T,CYO256&256&5&\"104U^56&29'\"A;7FJ;WNL86V>9W.D?XN\\:G:G;'BI" 231 | "T]`0U^04UN,3RM<'P\\_`NL;WN\\?XK;GJC)C)A)#!H:WAY/$H+CIK.$1U3UN,:G:G5F*31U.$4EZ/6&257FJ;=X.T;'BI7VN<7VN<" 235 | "T]`0OLK[M,#QL;WNLK[OL+SMH:W>CYO,=H*S=X.T?(BY;'BI9'\"A7&B976F:76F:" 236 | "76F:7FJ;6F:7D)S-FZ?8D9W.AI+#@8V^7&B97&B95F*3;GJK>86V<7VNH:W@(R]@HZ_>H:W7VN<76F:(2U>86V=H*S<'RM" 238 | "8FZ?45V.;WNL5V.40DY_.45V1%\"!;7FJ8FZ?9W.D@(R]E:'2FZ?84EZ/5F*35V.4" 239 | "N,3UG*C9AI+#>X>X9G*C35F*4EZ/4EZ/3EJ+2U>(4U^0>86V?8FZ@8V^=(\"Q9'\"A" 240 | "8&R=8&R=6V>87FJ;B97&A)#!?XN\\A)#!6V>85&\"15V.45F*33UN,15&\",CYO.T=X" 241 | "3UN,6F:76&25@HZ_>(2U?8FZ>X>X=(\"Q=X.T;GJK9W.D9G*C:G:G97&B86V>9'\"A" 242 | "9W.D:'2E<7VN:'2E3%B)-T-T86V>7VN<0$Q]86V>7FJ;@(R](3%B)76F:>H:WH*S=G:G:=8&R" 244 | ":76F:W>H86V>5V.4:W>H=(\"Q?8FZ?8FZ4%R--4%R1%\"!04U^/DI[.T=X2%2%256&" 245 | "/$AY-T-T/$AY.45V76F:;WNL;WNL=8&R8FZ?8&R=7&B97&B96V>87FJ;6F:76666" 246 | "6F:776F:7VN<7FJ;5F*34%R-.D9W7FJ;AI+#7&B976F:;GJK3UN,56&235F*;GJK" 247 | "3UN,5&\"145V.4EZ/7&B986V>:'2E76F:76F:66664U^06F:77&B9I;'BFJ;7=(\"Q" 248 | ";'BI;'BI9'\"A4EZ/6V>8;7FJ=H*S;'BI0T]`2U>(.$1U+3EJ0DY_2%2%2U>(45V." 
249 | "5&\"1/DI[-D)S+#AI3%B)=(\"Q;GJK:76F8&R=7FJ;6V>87FJ;9G*C(0$Q]2U>(6V>8;7FJ9G*C;WNL97&B86V>" 251 | "5F*356&26V>8;7FJ;WNL>(2U@(R]@HZ_=8&R:76F9G*C8FZ?35F*7&B9=H*SH0T]`35F*45V.0T]`-$!Q256&35F*4U^06V>8" 253 | "97&B5&\"1.T=X)S-D@8V^AI+#8FZ?:'2E8FZ?76F:6V>886V>@X^`A)#!E*#17&B9" 254 | "5V.45V.45V.44EZ/4EZ/6&2545V.3UN,7VN<:W>H?(BY?(BY>H:W<'RM=(\"QH6666:G:G=H*S@(R]@X^`?8FZAI+#@X^`9G*C6666;WNL9W.D76F:>(2U;'BI" 256 | "9W.D7VN<4EZ/3EJ+256&3UN,256&256&5&\"15V.41E*#.D9W5&\"15V.47FJ;;WNL" 257 | "(2U>H:W=H*S@X^`>86V7VN<86V>97&B;GJK7&B98V^@FZ?8P,S]:W>H6666" 258 | "5V.45F*37FJ;5&\"15&\"15V.45V.49'\"A>86V>H:WH?8FZ?8FZ=8&R=8&RA)#!A9'\"B97&:G:G7VN<<7VN9W.D;7FJ<7VN97&B" 260 | "86V>66664%R-3%B)2U>(/DI[3UN,4%R-3EJ+4EZ/2U>(+#AI7VN<;GJK;7FJ6V>87&B9>X>XBY?(8FZ?6V>89'\"A@X^`7&B95V.4" 262 | "5&\"14U^05&\"14EZ/2U>(6F:7<'RM;GJK:W>H8FZ?7FJ;7FJ;97&B8FZ?5&\"13UN," 263 | ":G:GA)#!>(2U:'2E@8V^@HZ_>86V@X^`?8FZ>X>X;GJK;WNL:76F8V^@;'BI;'BI" 264 | "76F:56&23EJ+2U>(5F*3-T-T,S]P/$AY/DI[.$1U(BY?0DY_=8&RH;WNL8V^@6&257VN<86V>8V^@6&258FZ?;7FJ86V>666666665F*35&\"14U^0" 266 | "45V.45V.4%R-3EJ+2U>(9'\"A9'\"A7&B97&B9:'2E>(2U@HZ_C9G*BI;';7FJ3UN," 267 | ":76F@X^`=X.TA9'\"@HZ_X>X=H*S:76F97&B;'BI:76F8V^@:'2E:76F" 268 | "8V^@4U^05V.445V.:'2E76F:7VN<8FZ?7&B97&B9:W>H76F:>(2U>X>X7VN<8&R=" 269 | "AI+#HZ_@E:'26F:77&B9:G:G7FJ;76F:4U^05&\"156&25&\"15&\"145V.4%R-3EJ+" 270 | "5&\"13EJ+35F*35F*7&B9:'2E5V.46V>8;WNLA)#!G:G:L[_PO,CYL[_P@HZ_45V." 271 | ""; 272 | -------------------------------------------------------------------------------- /software/images/image_fruits_8.h: -------------------------------------------------------------------------------- 1 | /* GIMP header image file format (RGB)*/ 2 | 3 | static unsigned int image_width = 8; 4 | static unsigned int image_height = 8; 5 | 6 | /* Call this macro repeatedly. 
After each use, the pixel data can be extracted */
7 |
8 | #define HEADER_PIXEL(data,pixel) {\
9 | pixel[0] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4)); \
10 | pixel[1] = ((((data[1] - 33) & 0xF) << 4) | ((data[2] - 33) >> 2)); \
11 | pixel[2] = ((((data[2] - 33) & 0x3) << 6) | ((data[3] - 33))); \
12 | data += 4; \
13 | }
14 | static const char *header_data =
15 | "U.$1M\\/TS=H*Q]0$LK[OL;WN>(2UG:G:J;7FO86V^@8U>(2O,CYN,3UA9'\"AI+#"
17 | "KKKKK+CIFZ?8GZOX>X?8FZ8&R=<7VN:76F;'BI6V>83%B)86V>97&B5F*3;7FJ"
19 | "";
20 |
--------------------------------------------------------------------------------
/software/images/image_sequential.h:
--------------------------------------------------------------------------------
1 | /* GIMP header image file format (RGB)*/
2 |
3 | static unsigned int image_width = 8;
4 | static unsigned int image_height = 8;
5 |
6 | /* Call this macro repeatedly. After each use, the pixel data can be extracted */
7 |
8 | #define HEADER_PIXEL(data,pixel) {\
9 | pixel[0] = ((data-header_data)/4); \
10 | pixel[1] = ((data-header_data)/4); \
11 | pixel[2] = ((data-header_data)/4); \
12 | data += 4; \
13 | }
14 | static const char *header_data =
15 | "U.$1M\\/TS=H*Q]0$LK[OL;WN>(2UG:G:J;7FO86V^@8U>(2O,CYN,3UA9'\"AI+#"
17 | "KKKKK+CIFZ?8GZOX>X?8FZ8&R=<7VN:76F;'BI6V>83%B)86V>97&B5F*3;7FJ"
19 | "";
20 |
--------------------------------------------------------------------------------
/software/main.cpp:
--------------------------------------------------------------------------------
1 | #include <stdio.h>
2 | #include <stdint.h>
3 | #include <stddef.h>
4 |
5 | // #include "images/peppers128.h"
6 | // #include "images/image_fruits_128.h"
7 | #include "images/image_fruits_64.h"
8 | // #include "images/image_fruits_16.h"
9 | // #include "images/image_fruits_8.h"
10 | // #include "images/image_sequential.h"
11 |
12 | #include "image_processing.hpp"
13 |
14 | #ifdef SIMULATION
15 | #include "../simulation/image_processing_simulation.hpp"
16 | #elif ICE40
17 | #include "../ice40/software/image_processing_ice40.hpp"
18 | #endif
19 |
20 | void test_send_read(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
21 |     img_proc->send_params(image_width, image_height);
22 |     uint8_t status[4];
23 |     img_proc->read_status(status);
24 |     printf("after send params\n");
25 |     for (size_t i = 0; i < 4; i++) {
26 |         printf("status out %zu : 0x%x\n", i, status[i]);
27 |     }
28 |
29 |     printf("sending image\n");
30 |     img_proc->send_image(image_input);
31 |
32 |     printf("receiving image\n");
33 |     img_proc->read_image(image_output);
34 |
35 |     img_proc->read_status(status);
36 | }
37 |
38 | void test_add_threshold(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
39 |     img_proc->send_params(image_width, image_height);
40 |
41 |     uint8_t status[4];
42 |     img_proc->read_status(status);
43 |     printf("after send params\n");
44 |     for (size_t i = 0; i < 4; i++) {
45 |         printf("status out %zu : 0x%x\n", i, status[i]);
46 |     }
47 |
48 |     img_proc->send_image(image_input);
49 |
50 |     printf("===========ADD===========\n");
51 |     img_proc->switch_buffers();
52 |     img_proc->send_add(32, true);
53 |     img_proc->read_status(status);
54 |     printf("after ADD\n");
55 |     for (size_t i = 0; i < 4; i++) {
56 |         printf("status out %zu : 0x%x\n", i, status[i]);
57 |     }
58 |
59 |     img_proc->wait_end_busy();
60 |
61 |     img_proc->read_status(status);
62 |     printf("Finished ADD\n");
63 |     for (size_t i = 0; i < 4; i++) {
64 |         printf("status out %zu : 0x%x\n", i, status[i]);
65 |     }
66 |
67 |     printf("===========THRESHOLD===========\n");
68 |     img_proc->send_threshold(168, 0, 0);
69 |     img_proc->wait_end_busy();
70 |
71 |     img_proc->switch_buffers();
72 |     img_proc->read_image(image_output);
73 | }
74 |
75 | // Loads the image into the input buffer, fills the storage buffer with pixels of value 32,
76 | // then adds the two buffers.
77 | void test_binary_add(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
78 |
79 |     img_proc->send_params(image_width, image_height);
79 |     img_proc->send_image(image_input); //in input buffer
80 |
81 |     img_proc->send_clear(32); //storage buffer will have an image full of pixels 32
82 |     img_proc->wait_end_busy();
83 |
84 |     img_proc->switch_buffers();
85 |
86 |     img_proc->send_binary_add(true);
87 |     img_proc->wait_end_busy();
88 |
89 |     img_proc->switch_buffers();
90 |     img_proc->read_image(image_output);
91 | }
92 |
93 | void test_gaussian_blur(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
94 |     img_proc->send_params(image_width, image_height);
95 |     img_proc->send_image(image_input);
96 |
97 |     //gaussian blur kernel
98 |     uint8_t conv_kernel[9] = {(1)<<0, (1)<<1, (1)<<0, (1)<<1, ((1)<<2), (1)<<1, (1)<<0, (1)<<1, (1)<<0};
99 |
100 |     // uint8_t conv_kernel[9] = {(1)<<4, (1)<<4, (1)<<4, (1)<<4, ((-7)<<4), (1)<<4, (1)<<4, (1)<<4, (1)<<4};
101 |
102 |     img_proc->send_convolution(conv_kernel, true, true, false);
103 |
104 |     printf("wait end busy\n");
105 |
106 |     img_proc->wait_end_busy();
107 |
108 |     printf("switch buffers\n");
109 |
110 |     img_proc->switch_buffers();
111 |
112 |     img_proc->read_image(image_output);
113 | }
114 |
115 | //use gradient kernels
116 | void test_simple_edge_detection(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
117 |     img_proc->send_params(image_width, image_height);
118 |     img_proc->send_image(image_input);
119 |
120 |     //top gradient
121 |     {
122 |         uint8_t conv_kernel[9] = {(1)<<3, (1)<<4, (1)<<3, (0)<<4, ((0)<<4), (0)<<4, (-1)<<3, (-1)<<4, (-1)<<3};
123 |         img_proc->send_convolution(conv_kernel, true, true, false);
124 |     }
125 |     img_proc->wait_end_busy();
126 |
127 |     //bottom gradient
128 |     {
129 |         uint8_t conv_kernel[9] = {(-1)<<3, (-1)<<4, (-1)<<3, (0)<<4, ((0)<<4), (0)<<4, (1)<<3, (1)<<4, (1)<<3};
130 |         img_proc->send_convolution(conv_kernel, true, true, true);
131 |     }
132 |     img_proc->wait_end_busy();
133 |
134 |     //left gradient
135 |     {
136 |         uint8_t conv_kernel[9] = {(1)<<3, (0)<<4, (-1)<<3, (1)<<4, ((0)<<4), (-1)<<4, (1)<<3, (0)<<4, (-1)<<3};
137 |         img_proc->send_convolution(conv_kernel, true, true, true);
138 |     }
139 |     img_proc->wait_end_busy();
140 |
141 |     //right gradient
142 |     {
143 |         uint8_t conv_kernel[9] = {(-1)<<3, (0)<<4, (1)<<3, (-1)<<4, ((0)<<4), (1)<<4, (-1)<<3, (0)<<4, (1)<<3};
144 |         img_proc->send_convolution(conv_kernel, true, true, true);
145 |     }
146 |     img_proc->wait_end_busy();
147 |
148 |     img_proc->switch_buffers();
149 |
150 |     img_proc->read_image(image_output);
151 | }
152 |
153 | void test_multiplication(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
154 |
155 |     img_proc->send_params(image_width, image_height);
156 |     img_proc->send_image(image_input);
157 |
158 |     img_proc->switch_buffers();
159 |
160 |     img_proc->send_mult(0.5f, true);
161 |     img_proc->wait_end_busy();
162 |
163 |     img_proc->switch_buffers();
164 |     img_proc->read_image(image_output);
165 | }
166 |
167 | void test_binary_diff(uint8_t *image_input, uint8_t *image_output, Image_processing *img_proc){
168 |     img_proc->send_params(image_width, image_height);
169 |     img_proc->send_image(image_input); //in input buffer
170 |
171 |     img_proc->send_clear(128);
172 |     img_proc->wait_end_busy();
173 |
174 |     img_proc->send_binary_sub(true, true);
175 |     // img_proc->send_binary_add(true);
176 |     img_proc->wait_end_busy();
177 |
178 |     img_proc->switch_buffers();
179 |     img_proc->read_image(image_output);
180 | }
181 |
182 | void test_images_average(uint8_t *image_input1, uint8_t *image_input2, uint8_t *image_output, Image_processing *img_proc){
183 |     img_proc->send_params(image_width, image_height);
184 |     img_proc->send_image(image_input1); //in input buffer
185 |
186 |     img_proc->switch_buffers();
187 |
188 |     img_proc->send_image(image_input2);
189 |
190 |     img_proc->send_mult(0.5f, true);
191 |     img_proc->wait_end_busy();
192 |
193 |     img_proc->switch_buffers();
194 |
195 |     img_proc->send_mult(0.5f, true);
196 |     img_proc->wait_end_busy();
197 |
198 |     img_proc->send_binary_add(true);
199 |     img_proc->wait_end_busy();
200 |
201 |     img_proc->switch_buffers();
202 |     img_proc->read_image(image_output);
203 | }
204 |
205 | void test_images_diff(uint8_t *image_input1, uint8_t *image_input2, uint8_t *image_output, Image_processing *img_proc){
206 |     img_proc->send_params(image_width, image_height);
207 |     img_proc->send_image(image_input1); //in input buffer
208 |
209 |     img_proc->switch_buffers();
210 |
211 |     img_proc->send_image(image_input2);
212 |     img_proc->switch_buffers();
213 |
214 |     img_proc->send_binary_sub(true, true);
215 |     img_proc->wait_end_busy();
216 |
217 |     img_proc->switch_buffers();
218 |     img_proc->read_image(image_output);
219 | }
220 |
221 | int main(){
222 |     FILE *output_file = fopen("output.dat", "w");
223 |
224 |     Image_processing *img_proc;
225 |
226 |     #ifdef SIMULATION
227 |     img_proc = new Image_processing_simulation();
228 |     #elif ICE40
229 |     img_proc = new Image_processing_ice40();
230 |     #endif
231 |
232 |     uint8_t *image_input = new uint8_t[image_width*image_height];
233 |     uint8_t *image_input2 = new uint8_t[image_width*image_height];
234 |     uint8_t *image_output = new uint8_t[image_width*image_height];
235 |
236 |     const char *ptr_image = header_data;
237 |     for (size_t i = 0; i < image_height*image_width; i++) {
238 |         uint8_t pixel[3];
239 |         HEADER_PIXEL(ptr_image, pixel);
240 |         image_input[i] = pixel[0];
241 |     }
242 |
243 |     //load second image
244 |     // ptr_image = header_data2;
245 |     // for (size_t i = 0; i < image_height*image_width; i++) {
246 |     //     uint8_t pixel[3];
247 |     //     HEADER_PIXEL(ptr_image, pixel);
248 |     //     image_input2[i] = pixel[0];
249 |     // }
250 |
251 |     //test selection
252 |     // test_send_read(image_input, image_output, img_proc);
253 |     // test_add_threshold(image_input, image_output, img_proc);
254 |     // test_binary_add(image_input, image_output, img_proc);
255 |     // test_gaussian_blur(image_input, image_output, img_proc);
256 |     test_simple_edge_detection(image_input, image_output, img_proc);
257 |     // test_multiplication(image_input, image_output, img_proc);
258 |     // test_binary_diff(image_input, image_output, img_proc);
259 |     // test_images_average(image_input, image_input2, image_output, img_proc);
260 |     // test_images_diff(image_input, image_input2, image_output, img_proc);
261 |
262 |     // test_simulation(image_input, image_output);
263 |
264 |     for (size_t i = 0; i < image_height*image_width; i++) {
265 |         fprintf(output_file, "%d ", image_output[i]);
266 |         if( ((i+1) % (image_width)) == 0){
267 |             fprintf(output_file, "\n");
268 |         }
269 |     }
270 |
271 |     fclose(output_file);
272 |
273 |     delete img_proc;
274 |     return 0;
275 | }
276 |
--------------------------------------------------------------------------------
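The `HEADER_PIXEL` macro in the GIMP-exported headers packs each RGB pixel into four printable characters, six bits per character with an offset of 33, and `main.cpp` keeps only the first (red) channel as the grayscale value. The sketch below shows the same decoding as a plain function. It is only an illustration: the name `decode_gray` and the range check are assumptions and do not exist in the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the repository): decodes `count` pixels
// from a GIMP header string and keeps only the red channel, mirroring what
// main.cpp does with pixel[0] after HEADER_PIXEL.
std::vector<uint8_t> decode_gray(const char *data, std::size_t count) {
    std::vector<uint8_t> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; i++) {
        unsigned c[4];
        for (int k = 0; k < 4; k++) {
            // Each character carries 6 bits, stored as value + 33 ('!').
            assert(data[k] >= 33 && data[k] <= 96);
            c[k] = static_cast<unsigned>(data[k] - 33);
        }
        // Red   = 6 bits of c[0] followed by the top 2 bits of c[1].
        // Green = low 4 bits of c[1] followed by the top 4 bits of c[2].
        // Blue  = low 2 bits of c[2] followed by all 6 bits of c[3].
        // Only red is kept here, as in main.cpp.
        out.push_back(static_cast<uint8_t>((c[0] << 2) | (c[1] >> 4)));
        data += 4;  // advance to the next encoded pixel
    }
    return out;
}
```

For instance, the first four characters of `image_fruits_8.h`, `U.$1`, decode to a red value of 208, which is what `image_input[0]` would hold after the loading loop in `main()`.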