├── .gitignore ├── .gitmodules ├── CMakeLists.txt ├── LICENSE ├── README.md ├── build └── .gitignore ├── depths.png ├── designs ├── Makefile ├── mul.v ├── silice_blaze.v ├── silice_blinky.v ├── silice_div.v ├── silice_icev_leds.v ├── silice_mulpip.v ├── silice_vga_demo.v ├── silice_vga_demo_flyover.v ├── silice_vga_test.v ├── simple.v ├── test1.si ├── test2.si └── test3.si ├── lut4.png ├── silice_vga_test.gif ├── src ├── CMakeLists.txt ├── analyze.cc ├── analyze.h ├── blif.cc ├── blif.h ├── fstapi │ ├── CMakeLists.txt │ ├── fastlz.c │ ├── fastlz.h │ ├── fstapi.c │ ├── fstapi.h │ ├── lz4.c │ └── lz4.h ├── read.cc ├── read.h ├── sh_clear.cs ├── sh_init.cs ├── sh_outports.cs ├── sh_posedge.cs ├── sh_simul.cs ├── sh_visu.fp ├── sh_visu.vp ├── silixel.cc ├── silixel_cpu.cc ├── simul_cpu.cc ├── simul_cpu.h ├── simul_gpu.cc ├── simul_gpu.h ├── uintX.h └── wasi.cc ├── synth.sh ├── synth ├── synth.yosys └── synth_bram.yosys └── synth_bram.sh /.gitignore: -------------------------------------------------------------------------------- 1 | .*.sw* 2 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "src/LibSL"] 2 | path = src/LibSL 3 | url = https://github.com/sylefeb/LibSL.git 4 | -------------------------------------------------------------------------------- /CMakeLists.txt: -------------------------------------------------------------------------------- 1 | CMAKE_MINIMUM_REQUIRED(VERSION 2.6) 2 | PROJECT(silixel) 3 | 4 | add_subdirectory(src) 5 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 4 | All rights reserved. 
5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Exploring gate-level simulation on CPU and GPU 2 | 3 | This repository contains my experiments on gate-level simulation. By that I mean taking the output of [Yosys](https://github.com/YosysHQ/yosys) and simulating the gate network. 
This is done in the context of hardware design for FPGAs, with a graphics twist (because making your own [graphics hardware is fun](https://github.com/sylefeb/Silice/blob/master/projects/README.md)). As a quick reminder, hardware design is achieved by going from a hardware description (e.g. a source code in [Verilog](https://en.wikipedia.org/wiki/Verilog)) to a network of gates that implement the logic ; this is what Yosys does. This network of gates can be later turned into a configuration file for an [FPGA](https://www.nandland.com/articles/fpga-101-fpgas-for-beginners.html), or into lithography masks to actually manufacture [ASICs](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit), in both cases implementing the design in hardware. 4 | 5 |

6 | A VGA test design being simulated on the GPU, with gate binary outputs overlaid.
Synthesized on an FPGA this design produces the same pattern on a VGA output, 640x480 @60Hz.
7 |
8 |

9 | 10 | > The purpose of this repo is to learn about hardware simulation, have fun hacking, and understand how this is possible at all. We are thus implementing a toy simulator. For actual, efficient simulation please refer to [*Verilator*](https://www.veripool.org/verilator/), [*CXXRTL*](https://tomverbeure.github.io/2020/08/08/CXXRTL-the-New-Yosys-Simulation-Backend.html) and [*Icarus Verilog*](http://iverilog.icarus.com/). 11 | 12 | > **Work in progress**: I am currently working on this README and commenting/cleaning the source code. Feedback is welcome! 13 | 14 | This all started as I stumbled upon an entry to the Google CTF 2019 contest: [reversing-gpurtl](https://www.youtube.com/watch?v=3ac9HAsfV8c). The source code [is available](https://github.com/google/google-ctf/tree/master/2019/finals/reversing-gpurtl) and shows how to brute force a gate-level simulation onto the GPU. 15 | 16 | *What does that mean? How does that work? We're going to precisely answer these questions!* 17 | 18 | By analyzing the `reversing-gpurtl` source code and scripts (which are in Python and Rust), I got a good understanding of how the gate-level simulation was achieved. And I was surprised to discover that it is *simple*! 19 | 20 | But first, what is a *gate* in our context? The simplest (and only!) logical element in the network will be a *LUT4*. A LUT (Lookup Table) is a basic building block of an FPGA. A simplified LUT4 schematic looks like this: 21 |

22 | 23 | The LUT4 has 4 single bit inputs (`a`,`b`,`c`,`d`) and two single bit outputs: `D` and `Q`. Output `D` is 'immediately' updated (as fast as the circuit can do it) when `a`,`b`,`c` or `d` change. Output `Q` is updated with the current value of `D` only when the clock ticks (positive edge on `clk`). Given `a`,`b`,`c`,`d` the value taken by `D` depends on the LUT configuration, which is a 16 entry truth table (configured by Yosys). It specifies the value of the output bit based on the values of `a`, `b`, `c` and `d`: four bits that can be either 0 or 1, and thus $2^4=16$ possibilities. This configuration implies that the LUT4 has a small internal memory (16 bits), which is indeed what gets configured by Yosys in the LUT4s. 24 | 25 | Fundamentally, the idea for simulation is as follows: 26 | 1. First, ask Yosys to synthesize a design using only LUT4s, see the [script here](synth/synth.yosys). 27 | 28 | 1. Second, parse the result written by Yosys (a `blif` file) and prepare a data-structure for simulation. The file tells us about the LUT4s, their configurations and how they are connected. There are a few minor complications that are detailed in the source code comments. 29 | 30 | 1. Third, run the simulation! The basic idea (we'll improve next) is to simulate all LUTs in parallel. For each LUT, we read its four inputs and update its `D` output based on its configuration. Once nothing changes anymore, we simulate a positive clock edge by updating the `Q` output to reflect the value of the `D` output. Rinse and repeat. 31 | 32 | And that's all there is for a basic, working simulator! 33 | 34 | Let's now briefly look at an overview of the source code, and then take a closer look at how the simulation behaves. This will lead us to some optimizations, and will let us understand some performance tradeoffs. 
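The three steps above can be sketched in a few lines of C++. This is only a minimal illustration under stated assumptions, not the code found in `simul_cpu.cc`: the `Lut4` struct, the signal layout (D of LUT `i` at index `2*i`, Q at `2*i+1`, one extra constant-zero signal at the end) and the bit order used to address the truth table (`a` is bit 0, `d` is bit 3) are all hypothetical choices made for the example. It also updates LUTs sequentially rather than in parallel, which reaches the same fixed point as long as the combinational logic has no loops.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical data layout, for illustration only.
struct Lut4 {
  uint16_t           config;  // 16-entry truth table, one bit per input combination
  std::array<int, 4> inputs;  // indices of the signals feeding a, b, c, d
};

// Assumed signal layout: D output of LUT i at 2*i, Q output at 2*i+1.
inline std::size_t dIndex(std::size_t lut) { return 2 * lut; }
inline std::size_t qIndex(std::size_t lut) { return 2 * lut + 1; }

// Computes the D output of a LUT from the current signal values.
inline bool evalLut(const Lut4& l, const std::vector<uint8_t>& sig) {
  unsigned addr = 0;
  for (int k = 0; k < 4; ++k) {
    assert(l.inputs[k] >= 0 && static_cast<std::size_t>(l.inputs[k]) < sig.size());
    addr |= (sig[l.inputs[k]] & 1u) << k;  // a is bit 0, d is bit 3 (assumption)
  }
  return ((l.config >> addr) & 1u) != 0;
}

// Simulates one clock cycle: update all D outputs until nothing changes,
// then copy every D into its Q (the positive clock edge).
// Assumes there is no combinational loop, otherwise this may never settle.
inline void simulCycle(const std::vector<Lut4>& luts, std::vector<uint8_t>& sig) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (std::size_t i = 0; i < luts.size(); ++i) {
      const uint8_t d = evalLut(luts[i], sig) ? 1 : 0;
      if (d != sig[dIndex(i)]) {
        sig[dIndex(i)] = d;
        changed = true;
      }
    }
  }
  for (std::size_t i = 0; i < luts.size(); ++i) {
    sig[qIndex(i)] = sig[dIndex(i)];
  }
}
```

As a usage example, a single LUT configured as an inverter (`config = 0x0001`) whose input `a` reads its own Q output, with the three other inputs tied to the constant-zero signal, behaves as a 1-bit toggle: Q flips on every call to `simulCycle`.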
35 | 36 | ## Source code overview 37 | 38 | To give you a rough outline of the source code: 39 | - Step 1 is covered in the [synth.yosys](synth/synth.yosys) script and [synth.sh](synth.sh). 40 | - Step 2 is covered in [blif.cc](src/blif.cc) and [read.cc](src/read.cc). 41 | - Step 3 is covered in [simul_cpu.cc](src/simul_cpu.cc) (CPU) and [simul_gpu.cc](src/simul_gpu.cc) (GPU), both being called from the main app [silixel.cc](src/silixel.cc). On the GPU side, the two important compute shaders are [sh_simul.cs](src/sh_simul.cs) and [sh_posedge.cs](src/sh_posedge.cs). A second application does only CPU simulation (see [silixel_cpu.cc](src/silixel_cpu.cc)). It is very simple, so it can be a good starting point. 42 | 43 | ## A closer look and some improvements 44 | 45 | Blindly simulating all LUTs in parallel works just fine. However, it is quite inefficient in terms of *effective simulated LUTs per computation step*. What do I mean by that? 46 | 47 | Let us assume a perfectly parallel computer, with exactly one core per LUT (on a small design and large GPU this might just be the case!). 48 | It turns out that, in most cases, at each 'parallel update' only a few LUT outputs are actually changing. This is quite expected: at each simulation step the logic is unlikely to generate changes to all LUTs throughout the entire design. Well, to what extent this is true depends *entirely* on your design of course, but on most designs I tried only a small percentage of LUTs are actually modified at every iteration. This implies that a lot of LUT updates are wasted computations. 49 | 50 | So what can we do to improve efficiency? We will apply two refinements. The first one 51 | is used both on the CPU and GPU implementations. The second one is used only on the CPU implementation. 52 | 53 | ### *Refinement 1: sorting LUTs by combinational depth* 54 | 55 | Let's have a look at a simple network: 56 | 57 |

58 | 59 | I numbered the LUTs from `L0` to `L5`. The LUTs in the network have been arranged 60 | by *combinational depth*. Given a LUT, the depth counts how many other LUTs are 61 | in between any of its inputs and a Q (flip-flop) output, *at most* considering all inputs. 62 | 63 | > Recall that the D outputs are updated as soon as the inputs change (they are *combinational* outputs) while the Q outputs are updated only at the positive clock edge (*registered* outputs). Thus, within a clock cycle we propagate data from `Q` outputs to `D` outputs until nothing changes, before simulating the next positive clock edge and moving on to the next cycle. 64 | 65 | For instance, `L0` is at depth 0 because both its inputs `a` and `c` read directly 66 | from Q outputs. The same is true of `L1`. 67 | Now `L4` is at depth 1 because while `c` reads from a Q output (which would mean depth 0), `a` reads from the D output of `L1`. Since `L1` is at depth 0, `L4` has to be depth 1. The final depth of the LUT is the largest considering all inputs. 68 | 69 | The depth analysis is performed in [analyze.cc](src/analyze.cc). 70 | 71 | How does that help? Remember that during simulation, we update all LUTs in parallel 72 | until nothing changes, and then simulate a positive clock edge. 73 | This introduces two problems. First, we need to track whether something changes, 74 | and with large numbers of LUTs that is not free if running in parallel on the GPU, for instance. 75 | Second, only a few LUT outputs actually change at every iteration, while we update all of them. 76 | In the illustrated example, `L5` would not change until the very last iteration. And during this last 77 | iteration it is the only one to change, so the update is wasted on all other LUTs.
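The depth definition above can be written as a short recursive computation. The following C++ is a minimal sketch under assumptions and is not the algorithm of `analyze.cc`: the `LutInputs` struct and the function names are hypothetical, and the network is assumed to have no combinational loop (otherwise the recursion would not terminate). Each LUT input is described by the index of the LUT whose D output drives it, or `-1` when it reads a Q output or is unconnected.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical description of a LUT's inputs, for illustration only.
struct LutInputs {
  // For each of the 4 inputs: index of the LUT whose D output drives it,
  // or -1 if the input reads a Q output (or is unconnected).
  std::array<int, 4> dSource;
};

// Returns the combinational depth of one LUT, memoized in 'depth'
// (-1 means "not computed yet"). Assumes no combinational loop.
inline int depthOf(int lut, const std::vector<LutInputs>& net, std::vector<int>& depth) {
  if (depth[lut] >= 0) {
    return depth[lut];
  }
  int d = 0;  // reading only from Q outputs means depth 0
  for (int src : net[lut].dSource) {
    if (src >= 0) {
      // one level deeper than the deepest LUT feeding this one
      d = std::max(d, 1 + depthOf(src, net, depth));
    }
  }
  depth[lut] = d;
  return d;
}

// Computes the depth of every LUT in the network.
inline std::vector<int> computeDepths(const std::vector<LutInputs>& net) {
  std::vector<int> depth(net.size(), -1);
  for (std::size_t i = 0; i < net.size(); ++i) {
    depthOf(static_cast<int>(i), net, depth);
  }
  return depth;
}
```

With a hypothetical wiring consistent with the depths described above (`L2` reading the D output of `L0`, `L3` and `L4` reading the D output of `L1`, and `L5` reading the D output of `L4`), `computeDepths` yields depths 0, 0, 1, 1, 1, 2 for `L0` to `L5`.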
78 | 79 | Having the depth gives us some nice properties to reduce the impact of these issues: 80 | - Since we know the maximum overall depth (2 in the example) we know exactly 81 | how many iterations to run and do not have to implement a 'no change' detection 82 | (3 iterations in the example). 83 | - LUTs at the same depth are independent from one another. Consider `L2`, `L3` and `L4`: changing 84 | the D output of one does not impact the others. 85 | This is true by construction since if one depended on another, it would have been 86 | assigned to the next depth in the network. 87 | Furthermore, LUTs at the same depth can only depend on changes of 88 | the D output of LUTs *at lower depths*. Thus, we can do less work at each iteration, 89 | focusing only on the LUTs that could possibly change. In the example, we would 90 | run three parallel iterations, first {`L0`,`L1`}, then {`L2`,`L3`,`L4`}, then {`L5`}. 91 | This results in substantial savings. On the GPU, we can still update large chunks 92 | of LUTs in parallel *without any synchronization* (LUTs at the same depth), which is 93 | ideal. 94 | 95 | > The maximum depth also reflects the maximum frequency at which the circuit can run. Indeed, assuming 96 | the signal takes the same delay to propagate through each LUT, the number of LUTs 97 | to traverse *at most* determines the worst-case propagation delay, and hence the 98 | maximum frequency. This is often reported as the *critical path* by place-and-route software 99 | such as [nextpnr](https://github.com/YosysHQ/nextpnr). 100 | 101 | Now we have seen all the ingredients of the GPU implementation. 102 | See in particular function `simulCycle_gpu` in source file [`simul_gpu.cc`](src/simul_gpu.cc), 103 | which calls the compute shaders once for each depth level, and then for the positive clock edge. 104 | 105 | > A detail, not discussed here, is that some LUTs remain constant during simulation 106 | and can be skipped after initialization.
This is done in the implementation. 107 | 108 | ### *Refinement 2: fanout and compute lists* 109 | 110 | The first refinement avoids blind updates to all LUTs. However, it remains 111 | very likely that within a set of LUTs at a same depth, many are updated while 112 | their inputs did not change. Consider {`L2`,`L3`,`L4`}. If only the D output of `L0` changed, 113 | then only `L2` actually requires an update. 114 | 115 | This second refinement avoids this issue, implementing a *compute list* per depth level 116 | (including the final positive edge update of Q outputs). 117 | An iteration at a given depth *k* inserts LUTs that should be refreshed in the compute lists of the next depth levels (> *k*). 118 | These are the LUTs using as input the changing D output of a LUT at depth *k*. 119 | 120 | To do this efficiently, we first compute the *fanout* of the LUTs. Let us consider a single LUT: 121 | its fanout is the list of LUTs that use its D output, and of course all are deeper in the 122 | network. Given this list, whenever a LUT D output changes we can efficiently insert all LUTs of 123 | its fanout to the compute lists 124 | (see `addFanout` in source file [`simul_cpu.cc`](src/simul_cpu.cc)). LUTs are inserted 125 | only once thanks to a 'dirty bit' flag. 126 | 127 | This approach works very well on the CPU, which is using a single thread and is 128 | anyway a sequential traversal. In fact, it outperforms the GPU on all but very large 129 | designs (which, on top of it, are large for bad reasons due to memories (BRAM/SPRAM) 130 | being turned into humongous networks of LUTs). 131 | 132 | > This approach is not easily amenable to the GPU. I actually tried, but this required 133 | atomic updates, synchronization and indirect compute dispatch ... which in the end 134 | together killed performance. But it might be that I did not find the right way yet! 135 | 136 | > Performance can be further improved on the CPU. 
First, the computations seem a 137 | case for SSE instructions. Second, I should be using more cores! However, 138 | like on the GPU, synchronization can quickly become a performance bottleneck... 139 | 140 | Alright, time for some compilation and testing! 141 | 142 | ## Compile and run 143 | 144 | First, make sure to get the submodules: 145 | ``` 146 | git submodule init 147 | git submodule update 148 | ``` 149 | Use `CMake` to prepare a Makefile for your system (possibly use `-G` to specify the makefile generator), then `make`: 150 | ``` 151 | cd build 152 | cmake .. 153 | make 154 | cd .. 155 | ``` 156 | Before simulating, we have to run Yosys on a design (a Verilog source file). 157 | Yosys has to be installed and in PATH. 158 | From a command line in the repo root, run: 159 | 160 | ``` 161 | ./synth.sh silice_vga_test 162 | ``` 163 | 164 | This synthesizes a design ([silice_vga_test.v](designs/silice_vga_test.v)) and 165 | generates the output in [`build`](./build). There are several designs, see [`designs`](./designs/). 166 | 167 | After running `./build/src/silixel` you should see this: 168 | 169 |

170 | 171 | Time to experiment, the source code is yours to hack! 172 | Please let me know what you thought, and feel free to [reach out on Mastodon](https://mastodon.online/@sylefeb). 173 | 174 | ## Limitations 175 | 176 | - Single clock domain 177 | -------------------------------------------------------------------------------- /build/.gitignore: -------------------------------------------------------------------------------- 1 | * 2 | !.gitignore 3 | -------------------------------------------------------------------------------- /depths.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/depths.png -------------------------------------------------------------------------------- /designs/Makefile: -------------------------------------------------------------------------------- 1 | 2 | .DEFAULT: $@ 3 | silice-make.py -s $@.si -b bare -p basic -o BUILD_$(subst :,_,$@) $(ARGS) 4 | cd .. 
; ./synth_bram.sh /BUILD_$(subst :,_,$@)/build ; cd - 5 | 6 | clean: 7 | rm -rf BUILD_* 8 | -------------------------------------------------------------------------------- /designs/mul.v: -------------------------------------------------------------------------------- 1 | module multest(clock, out); 2 | 3 | input clock; 4 | output reg [7:0] out = 0; 5 | reg [7:0] a = 0; 6 | // reg [7:0] b = 0; 7 | 8 | always @(posedge clock) 9 | begin 10 | out <= a * a; 11 | a <= a + 1; 12 | // b <= b + 1; 13 | end 14 | 15 | endmodule 16 | -------------------------------------------------------------------------------- /designs/silice_blinky.v: -------------------------------------------------------------------------------- 1 | /* 2 | 3 | Copyright 2019, (C) Sylvain Lefebvre and contributors 4 | List contributors with: git shortlog -n -s -- 5 | 6 | MIT license 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy of 9 | this software and associated documentation files (the "Software"), to deal in 10 | the Software without restriction, including without limitation the rights to 11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 12 | the Software, and to permit persons to whom the Software is furnished to do so, 13 | subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in all 16 | copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
24 | 25 | (header_2_M) 26 | 27 | */ 28 | `define BARE 1 29 | `define COLOR_DEPTH 6 30 | 31 | 32 | module top( 33 | `ifdef VGA 34 | // VGA 35 | output out_video_clock, 36 | output reg [`COLOR_DEPTH-1:0] out_video_r, 37 | output reg [`COLOR_DEPTH-1:0] out_video_g, 38 | output reg [`COLOR_DEPTH-1:0] out_video_b, 39 | output out_video_hs, 40 | output out_video_vs, 41 | `endif 42 | // basic 43 | output [7:0] out_leds, 44 | input clock 45 | ); 46 | 47 | reg [2:0] ready = 3'b111; 48 | 49 | always @(posedge clock) begin 50 | ready <= ready >> 1; 51 | end 52 | 53 | wire run_main; 54 | assign run_main = 1'b1; 55 | 56 | M_main __main( 57 | .clock(clock), 58 | .reset(ready[0]), 59 | .out_leds(out_leds), 60 | `ifdef VGA 61 | .out_video_clock(out_video_clock), 62 | .out_video_r(out_video_r), 63 | .out_video_g(out_video_g), 64 | .out_video_b(out_video_b), 65 | .out_video_hs(out_video_hs), 66 | .out_video_vs(out_video_vs), 67 | `endif 68 | .in_run(run_main) 69 | ); 70 | 71 | endmodule 72 | 73 | module M_main ( 74 | out_leds, 75 | in_run, 76 | out_done, 77 | reset, 78 | out_clock, 79 | clock 80 | ); 81 | output [7:0] out_leds; 82 | input in_run; 83 | output out_done; 84 | input reset; 85 | output out_clock; 86 | input clock; 87 | assign out_clock = clock; 88 | 89 | reg [7:0] _d_cnt; 90 | reg [7:0] _q_cnt; 91 | reg [7:0] _d_leds; 92 | reg [7:0] _q_leds; 93 | reg [1:0] _d_index,_q_index = 3; 94 | reg _autorun = 0; 95 | assign out_leds = _q_leds; 96 | assign out_done = (_q_index == 3) & _autorun; 97 | 98 | 99 | 100 | `ifdef FORMAL 101 | initial begin 102 | assume(reset); 103 | end 104 | assume property($initstate || (out_done)); 105 | `endif 106 | always @* begin 107 | _d_cnt = _q_cnt; 108 | _d_leds = _q_leds; 109 | _d_index = _q_index; 110 | // _always_pre 111 | _d_leds = _q_cnt[(8)-(8)+:8]; 112 | (* full_case *) 113 | case (_q_index) 114 | 0: begin 115 | // _top 116 | _d_index = 1; 117 | end 118 | 1: begin 119 | // __while__block_1 120 | if (1) begin 121 | // __block_2 122 | // 
__block_4 123 | _d_cnt = _q_cnt+1; 124 | // __block_5 125 | _d_index = 1; 126 | end else begin 127 | _d_index = 2; 128 | end 129 | end 130 | 2: begin 131 | // __block_3 132 | _d_index = 3; 133 | end 134 | 3: begin // end of 135 | end 136 | default: begin 137 | _d_index = {2{1'bx}}; 138 | `ifdef FORMAL 139 | assume(0); 140 | `endif 141 | end 142 | endcase 143 | // _always_post 144 | end 145 | 146 | always @(posedge clock) begin 147 | _q_cnt <= (reset) ? 0 : _d_cnt; 148 | _q_leds <= _d_leds; 149 | _q_index <= reset ? 3 : ( ~_autorun ? 0 : _d_index); 150 | _autorun <= reset ? 0 : 1; 151 | end 152 | 153 | endmodule 154 | 155 | -------------------------------------------------------------------------------- /designs/silice_div.v: -------------------------------------------------------------------------------- 1 | /* 2 | 3 | Copyright 2019, (C) Sylvain Lefebvre and contributors 4 | List contributors with: git shortlog -n -s -- 5 | 6 | MIT license 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy of 9 | this software and associated documentation files (the "Software"), to deal in 10 | the Software without restriction, including without limitation the rights to 11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 12 | the Software, and to permit persons to whom the Software is furnished to do so, 13 | subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in all 16 | copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR 21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 24 | 25 | (header_2_M) 26 | 27 | */ 28 | `define BARE 1 29 | `define COLOR_DEPTH 6 30 | 31 | 32 | module top( 33 | `ifdef VGA 34 | // VGA 35 | output out_video_clock, 36 | output reg [`COLOR_DEPTH-1:0] out_video_r, 37 | output reg [`COLOR_DEPTH-1:0] out_video_g, 38 | output reg [`COLOR_DEPTH-1:0] out_video_b, 39 | output out_video_hs, 40 | output out_video_vs, 41 | `endif 42 | // basic 43 | output [7:0] out_leds, 44 | input clock 45 | ); 46 | 47 | reg [2:0] ready = 3'b111; 48 | 49 | always @(posedge clock) begin 50 | ready <= ready >> 1; 51 | end 52 | 53 | wire run_main; 54 | assign run_main = 1'b1; 55 | 56 | M_main __main( 57 | .clock(clock), 58 | .reset(ready[0]), 59 | .out_leds(out_leds), 60 | `ifdef VGA 61 | .out_video_clock(out_video_clock), 62 | .out_video_r(out_video_r), 63 | .out_video_g(out_video_g), 64 | .out_video_b(out_video_b), 65 | .out_video_hs(out_video_hs), 66 | .out_video_vs(out_video_vs), 67 | `endif 68 | .in_run(run_main) 69 | ); 70 | 71 | endmodule 72 | 73 | 74 | module M_div16__div0 ( 75 | in_inum, 76 | in_iden, 77 | out_ret, 78 | in_run, 79 | out_done, 80 | reset, 81 | out_clock, 82 | clock 83 | ); 84 | input signed [15:0] in_inum; 85 | input signed [15:0] in_iden; 86 | output signed [15:0] out_ret; 87 | input in_run; 88 | output out_done; 89 | input reset; 90 | output out_clock; 91 | input clock; 92 | assign out_clock = clock; 93 | wire [16:0] _w_diff; 94 | wire [15:0] _w_num; 95 | wire [15:0] _w_den; 96 | 97 | reg [16:0] _d_ac; 98 | reg [16:0] _q_ac; 99 | reg [4:0] _d_i; 100 | reg [4:0] _q_i; 101 | reg signed [15:0] _d_ret; 102 | reg signed [15:0] _q_ret; 103 | reg [1:0] _d_index,_q_index = 3; 104 | assign out_ret = _q_ret; 105 | assign out_done = (_q_index == 3); 106 | 
107 | 108 | assign _w_diff = _q_ac-_w_den; 109 | assign _w_num = in_inum; 110 | assign _w_den = in_iden; 111 | 112 | `ifdef FORMAL 113 | initial begin 114 | assume(reset); 115 | end 116 | assume property($initstate || (in_run || out_done)); 117 | `endif 118 | always @* begin 119 | _d_ac = _q_ac; 120 | _d_i = _q_i; 121 | _d_ret = _q_ret; 122 | _d_index = _q_index; 123 | // _always_pre 124 | (* full_case *) 125 | case (_q_index) 126 | 0: begin 127 | // _top 128 | _d_ac = {{15{1'b0}},_w_num[15+:1]}; 129 | _d_ret = {_w_num[0+:15],1'b0}; 130 | _d_index = 1; 131 | end 132 | 1: begin 133 | // __while__block_1 134 | if (_q_i!=16) begin 135 | // __block_2 136 | // __block_4 137 | if (_w_diff[16+:1]==0) begin 138 | // __block_5 139 | // __block_7 140 | _d_ac = {_w_diff[0+:15],_q_ret[15+:1]}; 141 | _d_ret = {_q_ret[0+:15],1'b1}; 142 | // __block_8 143 | end else begin 144 | // __block_6 145 | // __block_9 146 | _d_ac = {_q_ac[0+:15],_q_ret[15+:1]}; 147 | _d_ret = {_q_ret[0+:15],1'b0}; 148 | // __block_10 149 | end 150 | // __block_11 151 | _d_i = _q_i+1; 152 | // __block_12 153 | _d_index = 1; 154 | end else begin 155 | _d_index = 2; 156 | end 157 | end 158 | 2: begin 159 | // __block_3 160 | _d_index = 3; 161 | end 162 | 3: begin // end of 163 | end 164 | default: begin 165 | _d_index = {2{1'bx}}; 166 | `ifdef FORMAL 167 | assume(0); 168 | `endif 169 | end 170 | endcase 171 | // _always_post 172 | end 173 | 174 | always @(posedge clock) begin 175 | _q_ac <= _d_ac; 176 | _q_i <= (reset | ~in_run) ? 0 : _d_i; 177 | _q_ret <= (reset | ~in_run) ? 0 : _d_ret; 178 | _q_index <= reset ? 3 : ( ~in_run ? 
0 : _d_index); 179 | end 180 | 181 | endmodule 182 | 183 | module M_main ( 184 | out_leds, 185 | in_run, 186 | out_done, 187 | reset, 188 | out_clock, 189 | clock 190 | ); 191 | output [7:0] out_leds; 192 | input in_run; 193 | output out_done; 194 | input reset; 195 | output out_clock; 196 | input clock; 197 | assign out_clock = clock; 198 | wire signed [15:0] _w_div0_ret; 199 | wire _w_div0_done; 200 | wire signed [15:0] _c_num; 201 | assign _c_num = 20043; 202 | wire signed [15:0] _c_den; 203 | assign _c_den = 41; 204 | reg signed [15:0] _t_result; 205 | 206 | reg [7:0] _d_leds; 207 | reg [7:0] _q_leds; 208 | reg signed [15:0] _d_div0_inum,_q_div0_inum; 209 | reg signed [15:0] _d_div0_iden,_q_div0_iden; 210 | reg [1:0] _d_index,_q_index = 3; 211 | reg _autorun = 0; 212 | reg _div0_run = 0; 213 | assign out_leds = _q_leds; 214 | assign out_done = (_q_index == 3) & _autorun; 215 | M_div16__div0 div0 ( 216 | .in_inum(_q_div0_inum), 217 | .in_iden(_q_div0_iden), 218 | .out_ret(_w_div0_ret), 219 | .out_done(_w_div0_done), 220 | .in_run(_div0_run), 221 | .reset(reset), 222 | .clock(clock)); 223 | 224 | 225 | 226 | `ifdef FORMAL 227 | initial begin 228 | assume(reset); 229 | end 230 | assume property($initstate || (out_done)); 231 | `endif 232 | always @* begin 233 | _d_leds = _q_leds; 234 | _d_div0_inum = _q_div0_inum; 235 | _d_div0_iden = _q_div0_iden; 236 | _d_index = _q_index; 237 | _div0_run = 1; 238 | _t_result = 0; 239 | // _always_pre 240 | (* full_case *) 241 | case (_q_index) 242 | 0: begin 243 | // _top 244 | _d_div0_inum = _c_num; 245 | _d_div0_iden = _c_den; 246 | _div0_run = 0; 247 | _d_index = 1; 248 | end 249 | 1: begin 250 | // __block_1 251 | if (_w_div0_done == 1) begin 252 | _d_index = 2; 253 | end else begin 254 | _d_index = 1; 255 | end 256 | end 257 | 2: begin 258 | // __block_2 259 | _t_result = _w_div0_ret; 260 | $display("%d / %d = %d",_c_num,_c_den,_t_result); 261 | _d_leds = _t_result[0+:8]; 262 | _d_index = 3; 263 | end 264 | 3: begin // end 
of 265 | end 266 | default: begin 267 | _d_index = {2{1'bx}}; 268 | `ifdef FORMAL 269 | assume(0); 270 | `endif 271 | end 272 | endcase 273 | // _always_post 274 | end 275 | 276 | always @(posedge clock) begin 277 | _q_leds <= _d_leds; 278 | _q_index <= reset ? 3 : ( ~_autorun ? 0 : _d_index); 279 | _autorun <= reset ? 0 : 1; 280 | _q_div0_inum <= _d_div0_inum; 281 | _q_div0_iden <= _d_div0_iden; 282 | end 283 | 284 | endmodule 285 | 286 | -------------------------------------------------------------------------------- /designs/silice_mulpip.v: -------------------------------------------------------------------------------- 1 | /* 2 | 3 | Copyright 2019, (C) Sylvain Lefebvre and contributors 4 | List contributors with: git shortlog -n -s -- 5 | 6 | MIT license 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy of 9 | this software and associated documentation files (the "Software"), to deal in 10 | the Software without restriction, including without limitation the rights to 11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 12 | the Software, and to permit persons to whom the Software is furnished to do so, 13 | subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in all 16 | copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
24 | 25 | (header_2_M) 26 | 27 | */ 28 | `define BARE 1 29 | `define COLOR_DEPTH 6 30 | 31 | 32 | module top( 33 | `ifdef VGA 34 | // VGA 35 | output out_video_clock, 36 | output reg [`COLOR_DEPTH-1:0] out_video_r, 37 | output reg [`COLOR_DEPTH-1:0] out_video_g, 38 | output reg [`COLOR_DEPTH-1:0] out_video_b, 39 | output out_video_hs, 40 | output out_video_vs, 41 | `endif 42 | // basic 43 | output [7:0] out_leds, 44 | input clock 45 | ); 46 | 47 | reg [2:0] ready = 3'b111; 48 | 49 | always @(posedge clock) begin 50 | ready <= ready >> 1; 51 | end 52 | 53 | wire run_main; 54 | assign run_main = 1'b1; 55 | 56 | M_main __main( 57 | .clock(clock), 58 | .reset(ready[0]), 59 | .out_leds(out_leds), 60 | `ifdef VGA 61 | .out_video_clock(out_video_clock), 62 | .out_video_r(out_video_r), 63 | .out_video_g(out_video_g), 64 | .out_video_b(out_video_b), 65 | .out_video_hs(out_video_hs), 66 | .out_video_vs(out_video_vs), 67 | `endif 68 | .in_run(run_main) 69 | ); 70 | 71 | endmodule 72 | 73 | 74 | module M_mulpip16__mul ( 75 | in_im0, 76 | in_im1, 77 | out_ret, 78 | out_done, 79 | reset, 80 | out_clock, 81 | clock 82 | ); 83 | input signed [15:0] in_im0; 84 | input signed [15:0] in_im1; 85 | output signed [15:0] out_ret; 86 | output out_done; 87 | input reset; 88 | output out_clock; 89 | input clock; 90 | assign out_clock = clock; 91 | reg [15:0] _t_sum_1_0; 92 | reg [15:0] _t_sum_2_0; 93 | reg [15:0] _t_sum_2_1; 94 | reg [15:0] _t_sum_3_0; 95 | reg [15:0] _t_sum_3_1; 96 | reg [15:0] _t_sum_3_2; 97 | reg [15:0] _t_sum_3_3; 98 | reg [15:0] _t_sum_4_0; 99 | reg [15:0] _t_sum_4_1; 100 | reg [15:0] _t_sum_4_2; 101 | reg [15:0] _t_sum_4_3; 102 | reg [15:0] _t_sum_4_4; 103 | reg [15:0] _t_sum_4_5; 104 | reg [15:0] _t_sum_4_6; 105 | reg [15:0] _t_sum_4_7; 106 | reg [15:0] _t_m0; 107 | reg [15:0] _t_m1; 108 | reg [0:0] _t_m0_neg; 109 | reg [0:0] _t_m1_neg; 110 | reg [0:0] _t___pip_138_0_m0_neg; 111 | reg [0:0] _t___pip_138_0_m1_neg; 112 | reg [15:0] _t___pip_138_3_sum_1_0; 113 | reg 
[15:0] _t___pip_138_2_sum_2_0; 114 | reg [15:0] _t___pip_138_2_sum_2_1; 115 | reg [15:0] _t___pip_138_1_sum_3_0; 116 | reg [15:0] _t___pip_138_1_sum_3_1; 117 | reg [15:0] _t___pip_138_1_sum_3_2; 118 | reg [15:0] _t___pip_138_1_sum_3_3; 119 | reg [15:0] _t___pip_138_0_sum_4_0; 120 | reg [15:0] _t___pip_138_0_sum_4_1; 121 | reg [15:0] _t___pip_138_0_sum_4_2; 122 | reg [15:0] _t___pip_138_0_sum_4_3; 123 | reg [15:0] _t___pip_138_0_sum_4_4; 124 | reg [15:0] _t___pip_138_0_sum_4_5; 125 | reg [15:0] _t___pip_138_0_sum_4_6; 126 | reg [15:0] _t___pip_138_0_sum_4_7; 127 | 128 | reg [0:0] _d___pip_138_1_m0_neg; 129 | reg [0:0] _q___pip_138_1_m0_neg; 130 | reg [0:0] _d___pip_138_2_m0_neg; 131 | reg [0:0] _q___pip_138_2_m0_neg; 132 | reg [0:0] _d___pip_138_3_m0_neg; 133 | reg [0:0] _q___pip_138_3_m0_neg; 134 | reg [0:0] _d___pip_138_4_m0_neg; 135 | reg [0:0] _q___pip_138_4_m0_neg; 136 | reg [0:0] _d___pip_138_1_m1_neg; 137 | reg [0:0] _q___pip_138_1_m1_neg; 138 | reg [0:0] _d___pip_138_2_m1_neg; 139 | reg [0:0] _q___pip_138_2_m1_neg; 140 | reg [0:0] _d___pip_138_3_m1_neg; 141 | reg [0:0] _q___pip_138_3_m1_neg; 142 | reg [0:0] _d___pip_138_4_m1_neg; 143 | reg [0:0] _q___pip_138_4_m1_neg; 144 | reg [15:0] _d___pip_138_4_sum_1_0; 145 | reg [15:0] _q___pip_138_4_sum_1_0; 146 | reg [15:0] _d___pip_138_3_sum_2_0; 147 | reg [15:0] _q___pip_138_3_sum_2_0; 148 | reg [15:0] _d___pip_138_3_sum_2_1; 149 | reg [15:0] _q___pip_138_3_sum_2_1; 150 | reg [15:0] _d___pip_138_2_sum_3_0; 151 | reg [15:0] _q___pip_138_2_sum_3_0; 152 | reg [15:0] _d___pip_138_2_sum_3_1; 153 | reg [15:0] _q___pip_138_2_sum_3_1; 154 | reg [15:0] _d___pip_138_2_sum_3_2; 155 | reg [15:0] _q___pip_138_2_sum_3_2; 156 | reg [15:0] _d___pip_138_2_sum_3_3; 157 | reg [15:0] _q___pip_138_2_sum_3_3; 158 | reg [15:0] _d___pip_138_1_sum_4_0; 159 | reg [15:0] _q___pip_138_1_sum_4_0; 160 | reg [15:0] _d___pip_138_1_sum_4_1; 161 | reg [15:0] _q___pip_138_1_sum_4_1; 162 | reg [15:0] _d___pip_138_1_sum_4_2; 163 | reg [15:0] 
_q___pip_138_1_sum_4_2; 164 | reg [15:0] _d___pip_138_1_sum_4_3; 165 | reg [15:0] _q___pip_138_1_sum_4_3; 166 | reg [15:0] _d___pip_138_1_sum_4_4; 167 | reg [15:0] _q___pip_138_1_sum_4_4; 168 | reg [15:0] _d___pip_138_1_sum_4_5; 169 | reg [15:0] _q___pip_138_1_sum_4_5; 170 | reg [15:0] _d___pip_138_1_sum_4_6; 171 | reg [15:0] _q___pip_138_1_sum_4_6; 172 | reg [15:0] _d___pip_138_1_sum_4_7; 173 | reg [15:0] _q___pip_138_1_sum_4_7; 174 | reg signed [15:0] _d_ret; 175 | reg signed [15:0] _q_ret; 176 | reg [1:0] _d_index,_q_index = 3; 177 | reg _autorun = 0; 178 | assign out_ret = _q_ret; 179 | assign out_done = (_q_index == 3) & _autorun; 180 | 181 | 182 | 183 | `ifdef FORMAL 184 | initial begin 185 | assume(reset); 186 | end 187 | assume property($initstate || (out_done)); 188 | `endif 189 | always @* begin 190 | _d___pip_138_1_m0_neg = _q___pip_138_1_m0_neg; 191 | _d___pip_138_2_m0_neg = _q___pip_138_2_m0_neg; 192 | _d___pip_138_3_m0_neg = _q___pip_138_3_m0_neg; 193 | _d___pip_138_4_m0_neg = _q___pip_138_4_m0_neg; 194 | _d___pip_138_1_m1_neg = _q___pip_138_1_m1_neg; 195 | _d___pip_138_2_m1_neg = _q___pip_138_2_m1_neg; 196 | _d___pip_138_3_m1_neg = _q___pip_138_3_m1_neg; 197 | _d___pip_138_4_m1_neg = _q___pip_138_4_m1_neg; 198 | _d___pip_138_4_sum_1_0 = _q___pip_138_4_sum_1_0; 199 | _d___pip_138_3_sum_2_0 = _q___pip_138_3_sum_2_0; 200 | _d___pip_138_3_sum_2_1 = _q___pip_138_3_sum_2_1; 201 | _d___pip_138_2_sum_3_0 = _q___pip_138_2_sum_3_0; 202 | _d___pip_138_2_sum_3_1 = _q___pip_138_2_sum_3_1; 203 | _d___pip_138_2_sum_3_2 = _q___pip_138_2_sum_3_2; 204 | _d___pip_138_2_sum_3_3 = _q___pip_138_2_sum_3_3; 205 | _d___pip_138_1_sum_4_0 = _q___pip_138_1_sum_4_0; 206 | _d___pip_138_1_sum_4_1 = _q___pip_138_1_sum_4_1; 207 | _d___pip_138_1_sum_4_2 = _q___pip_138_1_sum_4_2; 208 | _d___pip_138_1_sum_4_3 = _q___pip_138_1_sum_4_3; 209 | _d___pip_138_1_sum_4_4 = _q___pip_138_1_sum_4_4; 210 | _d___pip_138_1_sum_4_5 = _q___pip_138_1_sum_4_5; 211 | _d___pip_138_1_sum_4_6 = 
_q___pip_138_1_sum_4_6; 212 | _d___pip_138_1_sum_4_7 = _q___pip_138_1_sum_4_7; 213 | _d_ret = _q_ret; 214 | _d_index = _q_index; 215 | _t_sum_1_0 = 0; 216 | _t_sum_2_0 = 0; 217 | _t_sum_2_1 = 0; 218 | _t_sum_3_0 = 0; 219 | _t_sum_3_1 = 0; 220 | _t_sum_3_2 = 0; 221 | _t_sum_3_3 = 0; 222 | _t_sum_4_0 = 0; 223 | _t_sum_4_1 = 0; 224 | _t_sum_4_2 = 0; 225 | _t_sum_4_3 = 0; 226 | _t_sum_4_4 = 0; 227 | _t_sum_4_5 = 0; 228 | _t_sum_4_6 = 0; 229 | _t_sum_4_7 = 0; 230 | _t_m0 = 0; 231 | _t_m1 = 0; 232 | _t_m0_neg = 0; 233 | _t_m1_neg = 0; 234 | _t___pip_138_0_m0_neg = 0; 235 | _t___pip_138_0_m1_neg = 0; 236 | _t___pip_138_3_sum_1_0 = 0; 237 | _t___pip_138_2_sum_2_0 = 0; 238 | _t___pip_138_2_sum_2_1 = 0; 239 | _t___pip_138_1_sum_3_0 = 0; 240 | _t___pip_138_1_sum_3_1 = 0; 241 | _t___pip_138_1_sum_3_2 = 0; 242 | _t___pip_138_1_sum_3_3 = 0; 243 | _t___pip_138_0_sum_4_0 = 0; 244 | _t___pip_138_0_sum_4_1 = 0; 245 | _t___pip_138_0_sum_4_2 = 0; 246 | _t___pip_138_0_sum_4_3 = 0; 247 | _t___pip_138_0_sum_4_4 = 0; 248 | _t___pip_138_0_sum_4_5 = 0; 249 | _t___pip_138_0_sum_4_6 = 0; 250 | _t___pip_138_0_sum_4_7 = 0; 251 | // _always_pre 252 | (* full_case *) 253 | case (_q_index) 254 | 0: begin 255 | // _top 256 | _d_index = 1; 257 | end 258 | 1: begin 259 | // __while__block_1 260 | if (1) begin 261 | // __block_2 262 | // __block_4 263 | // pipeline 264 | // -------- stage 0 265 | // __stage___block_6 266 | // __block_7 267 | if (in_im0<0) begin 268 | // __block_8 269 | // __block_10 270 | _t_m0_neg = 1; 271 | _t_m0 = -in_im0; 272 | // __block_11 273 | end else begin 274 | // __block_9 275 | // __block_12 276 | _t_m0 = in_im0; 277 | // __block_13 278 | end 279 | // __block_14 280 | if (in_im1<0) begin 281 | // __block_15 282 | // __block_17 283 | _t_m1_neg = 1; 284 | _t_m1 = -in_im1; 285 | // __block_18 286 | end else begin 287 | // __block_16 288 | // __block_19 289 | _t_m1 = in_im1; 290 | // __block_20 291 | end 292 | // __block_21 293 | case (_t_m1[0+:2]) 294 | 2'b00: begin 295 | // 
__block_23_case 296 | // __block_24 297 | _t_sum_4_0 = 0; 298 | // __block_25 299 | end 300 | 2'b10: begin 301 | // __block_26_case 302 | // __block_27 303 | _t_sum_4_0 = _t_m0<<1; 304 | // __block_28 305 | end 306 | 2'b01: begin 307 | // __block_29_case 308 | // __block_30 309 | _t_sum_4_0 = _t_m0<<0; 310 | // __block_31 311 | end 312 | 2'b11: begin 313 | // __block_32_case 314 | // __block_33 315 | _t_sum_4_0 = (_t_m0<<0)+(_t_m0<<1); 316 | // __block_34 317 | end 318 | endcase 319 | // __block_22 320 | case (_t_m1[2+:2]) 321 | 2'b00: begin 322 | // __block_36_case 323 | // __block_37 324 | _t_sum_4_1 = 0; 325 | // __block_38 326 | end 327 | 2'b10: begin 328 | // __block_39_case 329 | // __block_40 330 | _t_sum_4_1 = _t_m0<<3; 331 | // __block_41 332 | end 333 | 2'b01: begin 334 | // __block_42_case 335 | // __block_43 336 | _t_sum_4_1 = _t_m0<<2; 337 | // __block_44 338 | end 339 | 2'b11: begin 340 | // __block_45_case 341 | // __block_46 342 | _t_sum_4_1 = (_t_m0<<2)+(_t_m0<<3); 343 | // __block_47 344 | end 345 | endcase 346 | // __block_35 347 | case (_t_m1[4+:2]) 348 | 2'b00: begin 349 | // __block_49_case 350 | // __block_50 351 | _t_sum_4_2 = 0; 352 | // __block_51 353 | end 354 | 2'b10: begin 355 | // __block_52_case 356 | // __block_53 357 | _t_sum_4_2 = _t_m0<<5; 358 | // __block_54 359 | end 360 | 2'b01: begin 361 | // __block_55_case 362 | // __block_56 363 | _t_sum_4_2 = _t_m0<<4; 364 | // __block_57 365 | end 366 | 2'b11: begin 367 | // __block_58_case 368 | // __block_59 369 | _t_sum_4_2 = (_t_m0<<4)+(_t_m0<<5); 370 | // __block_60 371 | end 372 | endcase 373 | // __block_48 374 | case (_t_m1[6+:2]) 375 | 2'b00: begin 376 | // __block_62_case 377 | // __block_63 378 | _t_sum_4_3 = 0; 379 | // __block_64 380 | end 381 | 2'b10: begin 382 | // __block_65_case 383 | // __block_66 384 | _t_sum_4_3 = _t_m0<<7; 385 | // __block_67 386 | end 387 | 2'b01: begin 388 | // __block_68_case 389 | // __block_69 390 | _t_sum_4_3 = _t_m0<<6; 391 | // __block_70 392 
| end 393 | 2'b11: begin 394 | // __block_71_case 395 | // __block_72 396 | _t_sum_4_3 = (_t_m0<<6)+(_t_m0<<7); 397 | // __block_73 398 | end 399 | endcase 400 | // __block_61 401 | case (_t_m1[8+:2]) 402 | 2'b00: begin 403 | // __block_75_case 404 | // __block_76 405 | _t_sum_4_4 = 0; 406 | // __block_77 407 | end 408 | 2'b10: begin 409 | // __block_78_case 410 | // __block_79 411 | _t_sum_4_4 = _t_m0<<9; 412 | // __block_80 413 | end 414 | 2'b01: begin 415 | // __block_81_case 416 | // __block_82 417 | _t_sum_4_4 = _t_m0<<8; 418 | // __block_83 419 | end 420 | 2'b11: begin 421 | // __block_84_case 422 | // __block_85 423 | _t_sum_4_4 = (_t_m0<<8)+(_t_m0<<9); 424 | // __block_86 425 | end 426 | endcase 427 | // __block_74 428 | case (_t_m1[10+:2]) 429 | 2'b00: begin 430 | // __block_88_case 431 | // __block_89 432 | _t_sum_4_5 = 0; 433 | // __block_90 434 | end 435 | 2'b10: begin 436 | // __block_91_case 437 | // __block_92 438 | _t_sum_4_5 = _t_m0<<11; 439 | // __block_93 440 | end 441 | 2'b01: begin 442 | // __block_94_case 443 | // __block_95 444 | _t_sum_4_5 = _t_m0<<10; 445 | // __block_96 446 | end 447 | 2'b11: begin 448 | // __block_97_case 449 | // __block_98 450 | _t_sum_4_5 = (_t_m0<<10)+(_t_m0<<11); 451 | // __block_99 452 | end 453 | endcase 454 | // __block_87 455 | case (_t_m1[12+:2]) 456 | 2'b00: begin 457 | // __block_101_case 458 | // __block_102 459 | _t_sum_4_6 = 0; 460 | // __block_103 461 | end 462 | 2'b10: begin 463 | // __block_104_case 464 | // __block_105 465 | _t_sum_4_6 = _t_m0<<13; 466 | // __block_106 467 | end 468 | 2'b01: begin 469 | // __block_107_case 470 | // __block_108 471 | _t_sum_4_6 = _t_m0<<12; 472 | // __block_109 473 | end 474 | 2'b11: begin 475 | // __block_110_case 476 | // __block_111 477 | _t_sum_4_6 = (_t_m0<<12)+(_t_m0<<13); 478 | // __block_112 479 | end 480 | endcase 481 | // __block_100 482 | case (_t_m1[14+:2]) 483 | 2'b00: begin 484 | // __block_114_case 485 | // __block_115 486 | _t_sum_4_7 = 0; 487 | // 
__block_116 488 | end 489 | 2'b10: begin 490 | // __block_117_case 491 | // __block_118 492 | _t_sum_4_7 = _t_m0<<15; 493 | // __block_119 494 | end 495 | 2'b01: begin 496 | // __block_120_case 497 | // __block_121 498 | _t_sum_4_7 = _t_m0<<14; 499 | // __block_122 500 | end 501 | 2'b11: begin 502 | // __block_123_case 503 | // __block_124 504 | _t_sum_4_7 = (_t_m0<<14)+(_t_m0<<15); 505 | // __block_125 506 | end 507 | endcase 508 | // __block_113 509 | _t___pip_138_0_m0_neg = _t_m0_neg; 510 | _t___pip_138_0_m1_neg = _t_m1_neg; 511 | _t___pip_138_0_sum_4_0 = _t_sum_4_0; 512 | _t___pip_138_0_sum_4_1 = _t_sum_4_1; 513 | _t___pip_138_0_sum_4_2 = _t_sum_4_2; 514 | _t___pip_138_0_sum_4_3 = _t_sum_4_3; 515 | _t___pip_138_0_sum_4_4 = _t_sum_4_4; 516 | _t___pip_138_0_sum_4_5 = _t_sum_4_5; 517 | _t___pip_138_0_sum_4_6 = _t_sum_4_6; 518 | _t___pip_138_0_sum_4_7 = _t_sum_4_7; 519 | // -------- stage 1 520 | // __stage___block_127 521 | // __block_128 522 | _t_sum_3_0 = _q___pip_138_1_sum_4_0+_q___pip_138_1_sum_4_1; 523 | _t_sum_3_1 = _q___pip_138_1_sum_4_2+_q___pip_138_1_sum_4_3; 524 | _t_sum_3_2 = _q___pip_138_1_sum_4_4+_q___pip_138_1_sum_4_5; 525 | _t_sum_3_3 = _q___pip_138_1_sum_4_6+_q___pip_138_1_sum_4_7; 526 | _t___pip_138_1_sum_3_1 = _t_sum_3_1; 527 | _t___pip_138_1_sum_3_0 = _t_sum_3_0; 528 | _t___pip_138_1_sum_3_2 = _t_sum_3_2; 529 | _t___pip_138_1_sum_3_3 = _t_sum_3_3; 530 | // -------- stage 2 531 | // __stage___block_130 532 | // __block_131 533 | _t_sum_2_0 = _q___pip_138_2_sum_3_0+_q___pip_138_2_sum_3_1; 534 | _t_sum_2_1 = _q___pip_138_2_sum_3_2+_q___pip_138_2_sum_3_3; 535 | _t___pip_138_2_sum_2_0 = _t_sum_2_0; 536 | _t___pip_138_2_sum_2_1 = _t_sum_2_1; 537 | // -------- stage 3 538 | // __stage___block_133 539 | // __block_134 540 | _t_sum_1_0 = _q___pip_138_3_sum_2_0+_q___pip_138_3_sum_2_1; 541 | _t___pip_138_3_sum_1_0 = _t_sum_1_0; 542 | // -------- stage 4 543 | // __stage___block_136 544 | // __block_137 545 | if (_q___pip_138_4_m0_neg^_q___pip_138_4_m1_neg) 
begin 546 | // __block_138 547 | // __block_140 548 | _d_ret = -_q___pip_138_4_sum_1_0; 549 | // __block_141 550 | end else begin 551 | // __block_139 552 | // __block_142 553 | _d_ret = _q___pip_138_4_sum_1_0; 554 | // __block_143 555 | end 556 | // __block_144 557 | // __block_5 558 | // __block_146 559 | _d_index = 1; 560 | end else begin 561 | _d_index = 2; 562 | end 563 | end 564 | 2: begin 565 | // __block_3 566 | _d_index = 3; 567 | end 568 | 3: begin // end of 569 | end 570 | default: begin 571 | _d_index = {2{1'bx}}; 572 | `ifdef FORMAL 573 | assume(0); 574 | `endif 575 | end 576 | endcase 577 | // _always_post 578 | end 579 | 580 | always @(posedge clock) begin 581 | _q___pip_138_1_m0_neg <= _t___pip_138_0_m0_neg; 582 | _q___pip_138_2_m0_neg <= _d___pip_138_1_m0_neg; 583 | _q___pip_138_3_m0_neg <= _d___pip_138_2_m0_neg; 584 | _q___pip_138_4_m0_neg <= _d___pip_138_3_m0_neg; 585 | _q___pip_138_1_m1_neg <= _t___pip_138_0_m1_neg; 586 | _q___pip_138_2_m1_neg <= _d___pip_138_1_m1_neg; 587 | _q___pip_138_3_m1_neg <= _d___pip_138_2_m1_neg; 588 | _q___pip_138_4_m1_neg <= _d___pip_138_3_m1_neg; 589 | _q___pip_138_4_sum_1_0 <= _t___pip_138_3_sum_1_0; 590 | _q___pip_138_3_sum_2_0 <= _t___pip_138_2_sum_2_0; 591 | _q___pip_138_3_sum_2_1 <= _t___pip_138_2_sum_2_1; 592 | _q___pip_138_2_sum_3_0 <= _t___pip_138_1_sum_3_0; 593 | _q___pip_138_2_sum_3_1 <= _t___pip_138_1_sum_3_1; 594 | _q___pip_138_2_sum_3_2 <= _t___pip_138_1_sum_3_2; 595 | _q___pip_138_2_sum_3_3 <= _t___pip_138_1_sum_3_3; 596 | _q___pip_138_1_sum_4_0 <= _t___pip_138_0_sum_4_0; 597 | _q___pip_138_1_sum_4_1 <= _t___pip_138_0_sum_4_1; 598 | _q___pip_138_1_sum_4_2 <= _t___pip_138_0_sum_4_2; 599 | _q___pip_138_1_sum_4_3 <= _t___pip_138_0_sum_4_3; 600 | _q___pip_138_1_sum_4_4 <= _t___pip_138_0_sum_4_4; 601 | _q___pip_138_1_sum_4_5 <= _t___pip_138_0_sum_4_5; 602 | _q___pip_138_1_sum_4_6 <= _t___pip_138_0_sum_4_6; 603 | _q___pip_138_1_sum_4_7 <= _t___pip_138_0_sum_4_7; 604 | _q_ret <= _d_ret; 605 | _q_index <= reset 
? 3 : ( ~_autorun ? 0 : _d_index); 606 | _autorun <= reset ? 0 : 1; 607 | end 608 | 609 | endmodule 610 | 611 | module M_main ( 612 | out_leds, 613 | in_run, 614 | out_done, 615 | reset, 616 | out_clock, 617 | clock 618 | ); 619 | output [7:0] out_leds; 620 | input in_run; 621 | output out_done; 622 | input reset; 623 | output out_clock; 624 | input clock; 625 | assign out_clock = clock; 626 | wire signed [15:0] _w_mul_ret; 627 | wire _w_mul_done; 628 | reg signed [15:0] _t_result; 629 | 630 | reg signed [15:0] _d_m0; 631 | reg signed [15:0] _q_m0; 632 | reg signed [15:0] _d_m1; 633 | reg signed [15:0] _q_m1; 634 | reg [7:0] _d_leds; 635 | reg [7:0] _q_leds; 636 | reg signed [15:0] _d_mul_im0,_q_mul_im0; 637 | reg signed [15:0] _d_mul_im1,_q_mul_im1; 638 | reg [3:0] _d_index,_q_index = 13; 639 | reg _autorun = 0; 640 | assign out_leds = _q_leds; 641 | assign out_done = (_q_index == 13) & _autorun; 642 | M_mulpip16__mul mul ( 643 | .in_im0(_d_mul_im0), 644 | .in_im1(_d_mul_im1), 645 | .out_ret(_w_mul_ret), 646 | .out_done(_w_mul_done), 647 | .reset(reset), 648 | .clock(clock)); 649 | 650 | 651 | 652 | `ifdef FORMAL 653 | initial begin 654 | assume(reset); 655 | end 656 | assume property($initstate || (out_done)); 657 | `endif 658 | always @* begin 659 | _d_m0 = _q_m0; 660 | _d_m1 = _q_m1; 661 | _d_leds = _q_leds; 662 | _d_mul_im0 = _q_mul_im0; 663 | _d_mul_im1 = _q_mul_im1; 664 | _d_index = _q_index; 665 | // _always_pre 666 | _t_result = _w_mul_ret; 667 | _d_mul_im0 = _q_m0; 668 | _d_mul_im1 = _q_m1; 669 | _d_leds = _t_result[0+:8]; 670 | (* full_case *) 671 | case (_q_index) 672 | 0: begin 673 | // _top 674 | _d_m0 = 2; 675 | _d_m1 = 3; 676 | $display("%d * %d = ...",_d_m0,_d_m1); 677 | _d_index = 1; 678 | end 679 | 1: begin 680 | // __block_1 681 | _d_m0 = _q_m0+1; 682 | _d_m1 = -_q_m1-1; 683 | $display("%d * %d = ...",_d_m0,_d_m1); 684 | _d_index = 2; 685 | end 686 | 2: begin 687 | // __block_2 688 | _d_m0 = _q_m0+1; 689 | _d_m1 = -_q_m1+1; 690 | $display("%d * 
%d = ...",_d_m0,_d_m1); 691 | _d_index = 3; 692 | end 693 | 3: begin 694 | // __block_3 695 | _d_m0 = _q_m0+1; 696 | _d_m1 = -_q_m1-1; 697 | $display("%d * %d = ...",_d_m0,_d_m1); 698 | _d_index = 4; 699 | end 700 | 4: begin 701 | // __block_4 702 | _d_m0 = _q_m0+1; 703 | _d_m1 = -_q_m1+1; 704 | $display("%d * %d = ...",_d_m0,_d_m1); 705 | _d_index = 5; 706 | end 707 | 5: begin 708 | // __block_5 709 | _d_m0 = _q_m0+1; 710 | _d_m1 = -_q_m1-1; 711 | $display("%d * %d = ...",_d_m0,_d_m1); 712 | _d_index = 6; 713 | end 714 | 6: begin 715 | // __block_6 716 | $display("... = %d",_t_result); 717 | _d_index = 7; 718 | end 719 | 7: begin 720 | // __block_7 721 | $display("... = %d",_t_result); 722 | _d_index = 8; 723 | end 724 | 8: begin 725 | // __block_8 726 | $display("... = %d",_t_result); 727 | _d_index = 9; 728 | end 729 | 9: begin 730 | // __block_9 731 | $display("... = %d",_t_result); 732 | _d_index = 10; 733 | end 734 | 10: begin 735 | // __block_10 736 | $display("... = %d",_t_result); 737 | _d_index = 11; 738 | end 739 | 11: begin 740 | // __block_11 741 | $display("... = %d",_t_result); 742 | _d_index = 12; 743 | end 744 | 12: begin 745 | // __block_12 746 | _d_index = 13; 747 | end 748 | 13: begin // end of 749 | end 750 | default: begin 751 | _d_index = {4{1'bx}}; 752 | `ifdef FORMAL 753 | assume(0); 754 | `endif 755 | end 756 | endcase 757 | // _always_post 758 | end 759 | 760 | always @(posedge clock) begin 761 | _q_m0 <= (reset) ? 0 : _d_m0; 762 | _q_m1 <= (reset) ? 0 : _d_m1; 763 | _q_leds <= (reset) ? 0 : _d_leds; 764 | _q_index <= reset ? 13 : ( ~_autorun ? 0 : _d_index); 765 | _autorun <= reset ? 
0 : 1; 766 | _q_mul_im0 <= _d_mul_im0; 767 | _q_mul_im1 <= _d_mul_im1; 768 | end 769 | 770 | endmodule 771 | 772 | -------------------------------------------------------------------------------- /designs/silice_vga_test.v: -------------------------------------------------------------------------------- 1 | `define VGA 1 2 | /* 3 | 4 | Copyright 2019, (C) Sylvain Lefebvre and contributors 5 | List contributors with: git shortlog -n -s -- 6 | 7 | MIT license 8 | 9 | Permission is hereby granted, free of charge, to any person obtaining a copy of 10 | this software and associated documentation files (the "Software"), to deal in 11 | the Software without restriction, including without limitation the rights to 12 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of 13 | the Software, and to permit persons to whom the Software is furnished to do so, 14 | subject to the following conditions: 15 | 16 | The above copyright notice and this permission notice shall be included in all 17 | copies or substantial portions of the Software. 18 | 19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS 21 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 22 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 23 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 24 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
25 | 26 | (header_2_M) 27 | 28 | */ 29 | `define BARE 1 30 | `define COLOR_DEPTH 6 31 | 32 | 33 | module top( 34 | `ifdef VGA 35 | // VGA 36 | output out_video_clock, 37 | output reg [`COLOR_DEPTH-1:0] out_video_r, 38 | output reg [`COLOR_DEPTH-1:0] out_video_g, 39 | output reg [`COLOR_DEPTH-1:0] out_video_b, 40 | output out_video_hs, 41 | output out_video_vs, 42 | `endif 43 | // basic 44 | output [7:0] out_leds, 45 | input clock 46 | ); 47 | 48 | reg [2:0] ready = 3'b111; 49 | 50 | always @(posedge clock) begin 51 | ready <= ready >> 1; 52 | end 53 | 54 | wire run_main; 55 | assign run_main = 1'b1; 56 | 57 | M_main __main( 58 | .clock(clock), 59 | .reset(ready[0]), 60 | .out_leds(out_leds), 61 | `ifdef VGA 62 | .out_video_clock(out_video_clock), 63 | .out_video_r(out_video_r), 64 | .out_video_g(out_video_g), 65 | .out_video_b(out_video_b), 66 | .out_video_hs(out_video_hs), 67 | .out_video_vs(out_video_vs), 68 | `endif 69 | .in_run(run_main) 70 | ); 71 | 72 | endmodule 73 | 74 | 75 | module M_vga__vga_driver ( 76 | out_vga_hs, 77 | out_vga_vs, 78 | out_active, 79 | out_vblank, 80 | out_vga_x, 81 | out_vga_y, 82 | reset, 83 | out_clock, 84 | clock 85 | ); 86 | output [0:0] out_vga_hs; 87 | output [0:0] out_vga_vs; 88 | output [0:0] out_active; 89 | output [0:0] out_vblank; 90 | output [9:0] out_vga_x; 91 | output [9:0] out_vga_y; 92 | input reset; 93 | output out_clock; 94 | input clock; 95 | assign out_clock = clock; 96 | wire [9:0] _w_pix_x; 97 | wire [9:0] _w_pix_y; 98 | wire [0:0] _w_active_h; 99 | wire [0:0] _w_active_v; 100 | 101 | reg [9:0] _d_xcount = 0; 102 | reg [9:0] _q_xcount = 0; 103 | reg [9:0] _d_ycount = 0; 104 | reg [9:0] _q_ycount = 0; 105 | reg [0:0] _d_vga_hs; 106 | reg [0:0] _q_vga_hs; 107 | reg [0:0] _d_vga_vs; 108 | reg [0:0] _q_vga_vs; 109 | reg [0:0] _d_active; 110 | reg [0:0] _q_active; 111 | reg [0:0] _d_vblank; 112 | reg [0:0] _q_vblank; 113 | reg [9:0] _d_vga_x; 114 | reg [9:0] _q_vga_x; 115 | reg [9:0] _d_vga_y; 116 | reg [9:0] 
_q_vga_y; 117 | assign out_vga_hs = _q_vga_hs; 118 | assign out_vga_vs = _q_vga_vs; 119 | assign out_active = _q_active; 120 | assign out_vblank = _q_vblank; 121 | assign out_vga_x = _q_vga_x; 122 | assign out_vga_y = _q_vga_y; 123 | 124 | 125 | assign _w_pix_x = (_q_xcount-160); 126 | assign _w_pix_y = (_q_ycount-45); 127 | assign _w_active_h = (_q_xcount>=160&&_q_xcount<800); 128 | assign _w_active_v = (_q_ycount>=45&&_q_ycount<525); 129 | 130 | `ifdef FORMAL 131 | initial begin 132 | assume(reset); 133 | end 134 | `endif 135 | always @* begin 136 | _d_xcount = _q_xcount; 137 | _d_ycount = _q_ycount; 138 | _d_vga_hs = _q_vga_hs; 139 | _d_vga_vs = _q_vga_vs; 140 | _d_active = _q_active; 141 | _d_vblank = _q_vblank; 142 | _d_vga_x = _q_vga_x; 143 | _d_vga_y = _q_vga_y; 144 | // _always_pre 145 | _d_active = _w_active_h&&_w_active_v; 146 | _d_vga_hs = ~((_q_xcount>=16&&_q_xcount<112)); 147 | _d_vga_vs = ~((_q_ycount>=10&&_q_ycount<12)); 148 | _d_vblank = (_q_ycount<45); 149 | // __block_1 150 | _d_vga_x = _w_active_h ? _w_pix_x:0; 151 | _d_vga_y = _w_active_v ? 
_w_pix_y:0; 152 | if (_q_xcount==799) begin 153 | // __block_2 154 | // __block_4 155 | _d_xcount = 0; 156 | if (_q_ycount==524) begin 157 | // __block_5 158 | // __block_7 159 | _d_ycount = 0; 160 | // __block_8 161 | end else begin 162 | // __block_6 163 | // __block_9 164 | _d_ycount = _q_ycount+1; 165 | // __block_10 166 | end 167 | // __block_11 168 | // __block_12 169 | end else begin 170 | // __block_3 171 | // __block_13 172 | _d_xcount = _q_xcount+1; 173 | // __block_14 174 | end 175 | // __block_15 176 | // __block_16 177 | // _always_post 178 | end 179 | 180 | always @(posedge clock) begin 181 | _q_xcount <= _d_xcount; 182 | _q_ycount <= _d_ycount; 183 | _q_vga_hs <= _d_vga_hs; 184 | _q_vga_vs <= _d_vga_vs; 185 | _q_active <= _d_active; 186 | _q_vblank <= _d_vblank; 187 | _q_vga_x <= _d_vga_x; 188 | _q_vga_y <= _d_vga_y; 189 | end 190 | 191 | endmodule 192 | 193 | 194 | module M_frame_display__display ( 195 | in_pix_x, 196 | in_pix_y, 197 | in_pix_active, 198 | in_pix_vblank, 199 | out_pix_red, 200 | out_pix_green, 201 | out_pix_blue, 202 | out_done, 203 | reset, 204 | out_clock, 205 | clock 206 | ); 207 | input [9:0] in_pix_x; 208 | input [9:0] in_pix_y; 209 | input [0:0] in_pix_active; 210 | input [0:0] in_pix_vblank; 211 | output [5:0] out_pix_red; 212 | output [5:0] out_pix_green; 213 | output [5:0] out_pix_blue; 214 | output out_done; 215 | input reset; 216 | output out_clock; 217 | input clock; 218 | assign out_clock = clock; 219 | reg [5:0] _t_pix_red; 220 | reg [5:0] _t_pix_green; 221 | reg [5:0] _t_pix_blue; 222 | 223 | reg [2:0] _d_index,_q_index = 5; 224 | reg _autorun = 0; 225 | assign out_pix_red = _t_pix_red; 226 | assign out_pix_green = _t_pix_green; 227 | assign out_pix_blue = _t_pix_blue; 228 | assign out_done = (_q_index == 5) & _autorun; 229 | 230 | 231 | 232 | `ifdef FORMAL 233 | initial begin 234 | assume(reset); 235 | end 236 | assume property($initstate || (out_done)); 237 | `endif 238 | always @* begin 239 | _d_index = _q_index; 
240 | // _always_pre 241 | _t_pix_red = 0; 242 | _t_pix_green = 0; 243 | _t_pix_blue = 0; 244 | (* full_case *) 245 | case (_q_index) 246 | 0: begin 247 | // _top 248 | _d_index = 1; 249 | end 250 | 1: begin 251 | // __while__block_1 252 | if (1) begin 253 | // __block_2 254 | // __block_4 255 | _d_index = 3; 256 | end else begin 257 | _d_index = 2; 258 | end 259 | end 260 | 3: begin 261 | // __while__block_5 262 | if (in_pix_vblank==0) begin 263 | // __block_6 264 | // __block_8 265 | if (in_pix_active) begin 266 | // __block_9 267 | // __block_11 268 | _t_pix_blue = in_pix_x[4+:6]; 269 | _t_pix_green = in_pix_y[4+:6]; 270 | _t_pix_red = in_pix_x[1+:6]; 271 | // __block_12 272 | end else begin 273 | // __block_10 274 | end 275 | // __block_13 276 | // __block_14 277 | _d_index = 3; 278 | end else begin 279 | _d_index = 4; 280 | end 281 | end 282 | 2: begin 283 | // __block_3 284 | _d_index = 5; 285 | end 286 | 4: begin 287 | // __while__block_15 288 | if (in_pix_vblank==1) begin 289 | // __block_16 290 | // __block_18 291 | // __block_19 292 | _d_index = 4; 293 | end else begin 294 | _d_index = 1; 295 | end 296 | end 297 | 5: begin // end of 298 | end 299 | default: begin 300 | _d_index = {3{1'bx}}; 301 | `ifdef FORMAL 302 | assume(0); 303 | `endif 304 | end 305 | endcase 306 | // _always_post 307 | end 308 | 309 | always @(posedge clock) begin 310 | _q_index <= reset ? 5 : ( ~_autorun ? 0 : _d_index); 311 | _autorun <= reset ? 
0 : 1; 312 | end 313 | 314 | endmodule 315 | 316 | module M_main ( 317 | out_leds, 318 | out_video_clock, 319 | out_video_r, 320 | out_video_g, 321 | out_video_b, 322 | out_video_hs, 323 | out_video_vs, 324 | in_run, 325 | out_done, 326 | reset, 327 | out_clock, 328 | clock 329 | ); 330 | output [7:0] out_leds; 331 | output [0:0] out_video_clock; 332 | output [5:0] out_video_r; 333 | output [5:0] out_video_g; 334 | output [5:0] out_video_b; 335 | output [0:0] out_video_hs; 336 | output [0:0] out_video_vs; 337 | input in_run; 338 | output out_done; 339 | input reset; 340 | output out_clock; 341 | input clock; 342 | assign out_clock = clock; 343 | wire [0:0] _w_vga_driver_vga_hs; 344 | wire [0:0] _w_vga_driver_vga_vs; 345 | wire [0:0] _w_vga_driver_active; 346 | wire [0:0] _w_vga_driver_vblank; 347 | wire [9:0] _w_vga_driver_vga_x; 348 | wire [9:0] _w_vga_driver_vga_y; 349 | wire [5:0] _w_display_pix_red; 350 | wire [5:0] _w_display_pix_green; 351 | wire [5:0] _w_display_pix_blue; 352 | wire _w_display_done; 353 | reg [7:0] _t_leds; 354 | 355 | reg [7:0] _d_frame; 356 | reg [7:0] _q_frame; 357 | reg [0:0] _d_video_clock; 358 | reg [0:0] _q_video_clock; 359 | reg [2:0] _d_index,_q_index = 7; 360 | reg _autorun = 0; 361 | assign out_leds = _t_leds; 362 | assign out_video_clock = _q_video_clock; 363 | assign out_video_r = _w_display_pix_red; 364 | assign out_video_g = _w_display_pix_green; 365 | assign out_video_b = _w_display_pix_blue; 366 | assign out_video_hs = _w_vga_driver_vga_hs; 367 | assign out_video_vs = _w_vga_driver_vga_vs; 368 | assign out_done = (_q_index == 7) & _autorun; 369 | M_vga__vga_driver vga_driver ( 370 | .out_vga_hs(_w_vga_driver_vga_hs), 371 | .out_vga_vs(_w_vga_driver_vga_vs), 372 | .out_active(_w_vga_driver_active), 373 | .out_vblank(_w_vga_driver_vblank), 374 | .out_vga_x(_w_vga_driver_vga_x), 375 | .out_vga_y(_w_vga_driver_vga_y), 376 | .reset(reset), 377 | .clock(clock)); 378 | M_frame_display__display display ( 379 | 
.in_pix_x(_w_vga_driver_vga_x), 380 | .in_pix_y(_w_vga_driver_vga_y), 381 | .in_pix_active(_w_vga_driver_active), 382 | .in_pix_vblank(_w_vga_driver_vblank), 383 | .out_pix_red(_w_display_pix_red), 384 | .out_pix_green(_w_display_pix_green), 385 | .out_pix_blue(_w_display_pix_blue), 386 | .out_done(_w_display_done), 387 | .reset(reset), 388 | .clock(clock)); 389 | 390 | 391 | 392 | `ifdef FORMAL 393 | initial begin 394 | assume(reset); 395 | end 396 | assume property($initstate || (out_done)); 397 | `endif 398 | always @* begin 399 | _d_frame = _q_frame; 400 | _d_video_clock = _q_video_clock; 401 | _d_index = _q_index; 402 | _t_leds = 0; 403 | // _always_pre 404 | (* full_case *) 405 | case (_q_index) 406 | 0: begin 407 | // _top 408 | _d_index = 1; 409 | end 410 | 1: begin 411 | // __while__block_1 412 | if (1) begin 413 | // __block_2 414 | // __block_4 415 | _d_index = 3; 416 | end else begin 417 | _d_index = 2; 418 | end 419 | end 420 | 3: begin 421 | // __while__block_5 422 | if (_w_vga_driver_vblank==1) begin 423 | // __block_6 424 | // __block_8 425 | // __block_9 426 | _d_index = 3; 427 | end else begin 428 | _d_index = 4; 429 | end 430 | end 431 | 2: begin 432 | // __block_3 433 | _d_index = 7; 434 | end 435 | 4: begin 436 | // __block_7 437 | $display("vblank off"); 438 | _d_index = 5; 439 | end 440 | 5: begin 441 | // __while__block_10 442 | if (_w_vga_driver_vblank==0) begin 443 | // __block_11 444 | // __block_13 445 | // __block_14 446 | _d_index = 5; 447 | end else begin 448 | _d_index = 6; 449 | end 450 | end 451 | 6: begin 452 | // __block_12 453 | $display("vblank on"); 454 | _d_frame = _q_frame+1; 455 | // __block_15 456 | _d_index = 1; 457 | end 458 | 7: begin // end of 459 | end 460 | default: begin 461 | _d_index = {3{1'bx}}; 462 | `ifdef FORMAL 463 | assume(0); 464 | `endif 465 | end 466 | endcase 467 | // _always_post 468 | end 469 | 470 | always @(posedge clock) begin 471 | _q_frame <= (reset) ? 
0 : _d_frame; 472 | _q_video_clock <= _d_video_clock; 473 | _q_index <= reset ? 7 : ( ~_autorun ? 0 : _d_index); 474 | _autorun <= reset ? 0 : 1; 475 | end 476 | 477 | endmodule 478 | 479 | -------------------------------------------------------------------------------- /designs/simple.v: -------------------------------------------------------------------------------- 1 | module counter(clock, out); 2 | 3 | input clock; 4 | output reg [8:0] out = 0; 5 | 6 | always @(posedge clock) 7 | begin 8 | out <= out + 1; 9 | end 10 | 11 | endmodule 12 | -------------------------------------------------------------------------------- /designs/test1.si: -------------------------------------------------------------------------------- 1 | unit main(output uint8 test) 2 | { 3 | bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)}; 4 | 5 | always { 6 | test = ram.rdata; 7 | ram.addr = ram.rdata; 8 | } 9 | } 10 | -------------------------------------------------------------------------------- /designs/test2.si: -------------------------------------------------------------------------------- 1 | unit main(output uint8 test) 2 | { 3 | //bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)}; 4 | 5 | always_before { 6 | test = 8hff; 7 | } 8 | 9 | algorithm { 10 | test = 8haa; 11 | while (1) { 12 | test = 1; 13 | ++: 14 | test = 2; 15 | ++: 16 | test = 3; 17 | } 18 | } 19 | } 20 | -------------------------------------------------------------------------------- /designs/test3.si: -------------------------------------------------------------------------------- 1 | unit main(output uint8 test) 2 | { 3 | bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)}; 4 | 5 | always_before { 6 | test = 8hff; 7 | } 8 | 9 | algorithm { 10 | ram.addr = 0; 11 | ram.wenable = 0; 12 | while (1) { 13 | test = ram.rdata; 14 | ram.wenable = 1; 15 | ram.wdata = ram.rdata + 10; 16 | ++: 17 | test = 8hff; 18 | ram.wenable = 0; 19 | ram.addr = ram.addr > 10 ? 
0 : ram.addr + 1; 20 | } 21 | } 22 | } 23 | -------------------------------------------------------------------------------- /lut4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/lut4.png -------------------------------------------------------------------------------- /silice_vga_test.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/silice_vga_test.gif -------------------------------------------------------------------------------- /src/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | CMAKE_MINIMUM_REQUIRED(VERSION 3.5) 2 | 3 | ADD_SUBDIRECTORY(LibSL EXCLUDE_FROM_ALL) 4 | ADD_SUBDIRECTORY(fstapi) 5 | 6 | IF(WASI) 7 | ELSE() 8 | 9 | SET(SHADERS 10 | sh_simul 11 | sh_posedge 12 | sh_outports 13 | sh_init 14 | sh_visu 15 | ) 16 | AUTO_BIND_SHADERS( ${SHADERS} ) 17 | 18 | ADD_EXECUTABLE(silixel 19 | silixel.cc 20 | blif.cc 21 | blif.h 22 | sh_simul.cs 23 | sh_simul.h 24 | sh_posedge.cs 25 | sh_posedge.h 26 | sh_outports.cs 27 | sh_outports.h 28 | sh_init.cs 29 | sh_init.h 30 | sh_visu.fp 31 | sh_visu.vp 32 | sh_visu.h 33 | simul_cpu.cc 34 | simul_cpu.h 35 | simul_gpu.cc 36 | simul_gpu.h 37 | read.cc 38 | read.h 39 | analyze.cc 40 | analyze.h 41 | ) 42 | 43 | IF(LINUX) 44 | TARGET_LINK_LIBRARIES(silixel LibSL LibSL_gl4core freeglut) 45 | ELSE() 46 | TARGET_LINK_LIBRARIES(silixel LibSL LibSL_gl4core) 47 | ENDIF() 48 | 49 | ENDIF() 50 | 51 | ADD_DEFINITIONS(-DSRC_PATH=\"${CMAKE_SOURCE_DIR}\") 52 | 53 | ADD_EXECUTABLE(silixel_cpu 54 | silixel_cpu.cc 55 | simul_cpu.cc 56 | simul_cpu.h 57 | blif.cc 58 | blif.h 59 | read.cc 60 | read.h 61 | analyze.cc 62 | analyze.h 63 | wasi.cc 64 | ) 65 | 66 | TARGET_LINK_LIBRARIES(silixel_cpu LibSL fstapi) 67 | 68 | # install and paths 
69 | 70 | install(TARGETS silixel_cpu RUNTIME DESTINATION bin/) 71 | -------------------------------------------------------------------------------- /src/analyze.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* --------------------------------------------------------------------- 3 | 4 | Analyzes the design, determines the 'depth' of each LUT by propagating 5 | from the Q outputs (depth 0). The LUT depth is 1 + the max of its input depths. 6 | Within a clock cycle: 7 | - LUTs of lower depth are not influenced by LUTs of higher depth. 8 | - LUTs at the same depth do not influence each other. 9 | The LUTs are then sorted by depth and the data structure reordered. 10 | 11 | ----------------------------------------------------------------------- */ 12 | /* 13 | BSD 3-Clause License 14 | 15 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 16 | All rights reserved. 17 | 18 | Redistribution and use in source and binary forms, with or without 19 | modification, are permitted provided that the following conditions are met: 20 | 21 | 1. Redistributions of source code must retain the above copyright notice, this 22 | list of conditions and the following disclaimer. 23 | 24 | 2. Redistributions in binary form must reproduce the above copyright notice, 25 | this list of conditions and the following disclaimer in the documentation 26 | and/or other materials provided with the distribution. 27 | 28 | 3. Neither the name of the copyright holder nor the names of its 29 | contributors may be used to endorse or promote products derived from 30 | this software without specific prior written permission. 31 | 32 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 33 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 34 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 35 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 36 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 37 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 38 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 39 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 40 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 41 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 42 | */ 43 | 44 | #include 45 | 46 | #include 47 | #include 48 | #include 49 | #include 50 | #include 51 | #include 52 | #include 53 | #include 54 | #include 55 | #include 56 | 57 | using namespace std; 58 | 59 | #include "read.h" 60 | #include "analyze.h" 61 | 62 | // ----------------------------------------------------------------------------- 63 | 64 | // Propagates depths through the network (from all Q at depth 0). 65 | // Returns whether something changed. 
66 | bool analyzeStep(const vector& luts,vector& _depths) 67 | { 68 | bool changed = false; 69 | for (int l=0;l -1) { 75 | int other_value = 0; 76 | if ((luts[l].inputs[i] & 1) == 0) { 77 | other_value = _depths[(luts[l].inputs[i] >> 1)]; 78 | } 79 | if (other_value < std::numeric_limits::max()) { 80 | ++other_value; 81 | } 82 | new_value = max(new_value , other_value); 83 | } 84 | } 85 | // update output depth if changed 86 | if (_depths[l] != new_value) { 87 | _depths[l] = new_value; 88 | changed = true; 89 | } 90 | } 91 | return changed; 92 | } 93 | 94 | // ----------------------------------------------------------------------------- 95 | 96 | // Performs an analysis of the design, computing the combinational depth 97 | // of all LUTs 98 | void analyze( 99 | vector& _luts, 100 | vector& _brams, 101 | vector >& _outbits, 102 | map& _indices, 103 | vector& _ones, 104 | vector& _step_starts, 105 | vector& _step_ends, 106 | vector& _depths) 107 | { 108 | vector lut_depths; 109 | lut_depths.resize(_luts.size(),std::numeric_limits::max()); 110 | /// iterate the analysis step 111 | // propagates combinational depth from Q outputs 112 | bool changed = true; 113 | int maxiter = 1024; 114 | while (changed && maxiter-- > 0) { 115 | changed = analyzeStep(_luts, lut_depths); 116 | } 117 | if (maxiter <= 0) { 118 | fprintf(stderr, "cannot perform analysis, combinational loop in design?"); 119 | exit(-1); 120 | } 121 | // reorder by increasing depth 122 | vector > source; 123 | source.resize(_luts.size()); 124 | int max_depth = 0; 125 | for (int l = 0; l < _luts.size(); ++l) { 126 | source[l] = make_pair(lut_depths[l], l); // depth,id 127 | if (lut_depths[l] < std::numeric_limits::max()) { 128 | max_depth = max(max_depth, lut_depths[l]); 129 | } 130 | } 131 | if (max_depth == 0) { 132 | fprintf(stderr, "analysis failed (why?)"); 133 | exit(-1); 134 | } 135 | /// determine const LUTs based on initialization 136 | // const LUTs are placed at depth 0, which is not simulated 137 | 
// we can only consider const if the inputs were not initialized, otherwise 138 | // there may be an on-purpose cascade of FF from the initialization point 139 | set with_init; 140 | for (auto one : _ones) { 141 | with_init.insert(one); 142 | } 143 | // promote 0-depth cells with init to 1-depth (non const) 144 | for (int l = 0; l < _luts.size(); ++l) { 145 | if (source[l].first == 0) { 146 | if (with_init.count((l << 1) + 0) || with_init.count((l << 1) + 1)) { 147 | source[l].first = 1; 148 | break; 149 | } 150 | } 151 | } 152 | // convert d-depth cells using only 0-depth const cells as 0-depth (const) 153 | for (int depth = 1; depth <= max_depth; depth++) { 154 | for (int l = 0; l < _luts.size(); ++l) { 155 | if (source[l].first == depth) { 156 | bool no_init_input = true; 157 | for (int i = 0; i < 4; ++i) { 158 | if (_luts[l].inputs[i] > -1) { 159 | if (with_init.count(_luts[l].inputs[i]) != 0) { 160 | no_init_input = false; break; 161 | } 162 | if (_luts[_luts[l].inputs[i] >> 1].external) { 163 | no_init_input = false; break; 164 | } 165 | } 166 | } 167 | if (no_init_input) { 168 | // now we check that all inputs are 0-depth 169 | bool all_inputs_0depth = true; 170 | for (int i = 0; i < 4; ++i) { 171 | if (_luts[l].inputs[i] > -1) { 172 | int idepth = source[_luts[l].inputs[i] >> 1].first; 173 | if (idepth > 0) { 174 | all_inputs_0depth = false; 175 | } 176 | } 177 | } 178 | if (all_inputs_0depth) { 179 | source[l].first = 0; 180 | } 181 | } 182 | } 183 | } 184 | } 185 | // update max_depth 186 | max_depth = 0; 187 | for (int l = 0; l < _luts.size(); ++l) { 188 | max_depth = max(max_depth, source[l].first); 189 | } 190 | #if 0 191 | // debug: output full list of LUTs 192 | for (int l = 0; l < _luts.size(); ++l) { 193 | int i0d = _luts[l].inputs[0] < 0 ? 999 : (_luts[l].inputs[0] & 1 ? 999 : source[_luts[l].inputs[0] >> 1].first); 194 | int i1d = _luts[l].inputs[1] < 0 ? 999 : (_luts[l].inputs[1] & 1 ? 
999 : source[_luts[l].inputs[1] >> 1].first); 195 | int i2d = _luts[l].inputs[2] < 0 ? 999 : (_luts[l].inputs[2] & 1 ? 999 : source[_luts[l].inputs[2] >> 1].first); 196 | int i3d = _luts[l].inputs[3] < 0 ? 999 : (_luts[l].inputs[3] & 1 ? 999 : source[_luts[l].inputs[3] >> 1].first); 197 | fprintf(stderr, "LUT %d, depth %d min input depths: %d\n", l<<1, source[l].first, min(min(i0d, i1d), min(i2d, i3d))); 198 | } 199 | #endif 200 | // sort by depth 201 | sort(source.begin(),source.end()); 202 | // build the reordering arrays 203 | vector reorder; 204 | vector inv_reorder; 205 | reorder .resize(_luts.size()); 206 | inv_reorder .resize(_luts.size()); 207 | _depths .resize(_luts.size()); 208 | _step_starts.resize(max_depth+1,std::numeric_limits::max()); 209 | _step_ends .resize(max_depth+1,0); 210 | for (int o=0;o::max()) { 215 | _step_starts[source[o].first] = min(_step_starts[source[o].first],o); 216 | _step_ends [source[o].first] = max(_step_ends [source[o].first],o); 217 | } 218 | } 219 | // reorder the LUTs 220 | vector init_luts = _luts; 221 | reorderLUTs(init_luts, reorder, inv_reorder, _luts, _brams, _outbits, _indices, _ones); 222 | #if 0 223 | // print report 224 | fprintf(stderr,"analysis done\n"); 225 | for (int d=0;d<_step_starts.size();++d) { 226 | fprintf(stderr,"depth %3d on luts %6d-%6d (%6d/%6d)\n", 227 | d,_step_starts[d],_step_ends[d], 228 | _step_ends[d] - _step_starts[d] + 1, 229 | (int)_luts.size()); 230 | } 231 | #endif 232 | } 233 | 234 | // ----------------------------------------------------------------------------- 235 | 236 | // Reorders the LUT datastructure based on input reordering arrays 237 | void reorderLUTs( 238 | const vector& init_luts, 239 | const vector& reorder, 240 | const vector& inv_reorder, 241 | vector& _luts, 242 | vector& _brams, 243 | vector >& _outbits, 244 | map& _indices, 245 | vector& _ones) 246 | { 247 | /// apply the reordering 248 | // -> luts 249 | _luts.resize(init_luts.size()); 250 | for (int o=0;o -1) { 257 | 
int reg = init_luts[l].inputs[i] &1; 258 | int src = init_luts[l].inputs[i]>>1; 259 | _luts[o].inputs[i] = (inv_reorder[src]<<1) | reg; 260 | } else { 261 | _luts[o].inputs[i] = -1; 262 | } 263 | } 264 | } 265 | // -> bits 266 | for (int b = 0; b < _outbits.size(); ++b) { 267 | int reg = _outbits[b].second &1; 268 | int src = _outbits[b].second>>1; 269 | _outbits[b].second = (inv_reorder[src]<<1) | reg; 270 | } 271 | // -> indices 272 | for (auto& idc : _indices) { 273 | int reg = idc.second & 1; 274 | int src = idc.second >> 1; 275 | idc.second = (inv_reorder[src] << 1) | reg; 276 | } 277 | // -> brams 278 | for (auto& b : _brams) { 279 | vector*> vecs; 280 | vecs.push_back(&b.rd_addr); 281 | vecs.push_back(&b.rd_data); 282 | vecs.push_back(&b.wr_addr); 283 | vecs.push_back(&b.wr_data); 284 | vecs.push_back(&b.wr_en); 285 | for (auto vptr : vecs) { 286 | for (int i=0;i < (int)vptr->size();++i) { 287 | int reg = vptr->at(i) & 1; 288 | int src = vptr->at(i) >> 1; 289 | vptr->at(i) = (inv_reorder[src] << 1) | reg; 290 | } 291 | } 292 | } 293 | // -> ones (init) 294 | for (int b = 0; b < _ones.size(); ++b) { 295 | int reg = _ones[b] & 1; 296 | int src = _ones[b] >> 1; 297 | _ones[b] = (inv_reorder[src] << 1) | reg; 298 | } 299 | } 300 | 301 | // ----------------------------------------------------------------------------- 302 | 303 | 304 | // Builds a data-structure representing the fanout of each LUT: the list 305 | // of LUTs that use it as an input. This is used to only simulate the LUTs 306 | // whose inputs have changed at each depth. 
307 | void buildFanout( 308 | vector& _luts, 309 | vector& _fanout) 310 | { 311 | // build fanout 312 | vector > fanouts; 313 | fanouts.resize(_luts.size()); 314 | for (int l = 0; l < _luts.size(); ++l) { 315 | ForIndex(i, 4) { 316 | if (_luts[l].inputs[i] > -1) { 317 | int lut_in = _luts[l].inputs[i] >> 1; 318 | int lut_in_q_else_d = _luts[l].inputs[i] & 1; 319 | fanouts[lut_in].insert((l << 1) | lut_in_q_else_d); 320 | } 321 | } 322 | } 323 | // -> flatten in output 324 | int totsz = 0; 325 | for (const auto& fo : fanouts) { 326 | totsz += (int)fo.size() + 1; 327 | } 328 | _fanout.reserve(_luts.size() /*header, 1 index per lut*/ + totsz); 329 | int rsum = (int)_luts.size(); 330 | for (const auto& fo : fanouts) { 331 | _fanout.push_back(rsum); 332 | rsum += (int)fo.size() + 1; 333 | } 334 | for (const auto& fo : fanouts) { 335 | for (auto l : fo) { 336 | _fanout.push_back(l); 337 | } 338 | _fanout.push_back(-1); 339 | } 340 | sl_assert(_fanout.size() == _luts.size() + totsz); 341 | } 342 | 343 | // ----------------------------------------------------------------------------- 344 | -------------------------------------------------------------------------------- /src/analyze.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. 
Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #pragma once 35 | 36 | #include 37 | using namespace std; 38 | 39 | void analyze( 40 | std::vector& _luts, 41 | std::vector& _brams, 42 | std::vector >& _outbits, 43 | std::map& _indices, 44 | std::vector& _ones, 45 | std::vector& _step_starts, 46 | std::vector& _step_ends, 47 | std::vector& _depths); 48 | 49 | void reorderLUTs( 50 | const std::vector& init_luts, 51 | const std::vector& reorder, 52 | const std::vector& inv_reorder, 53 | std::vector& _luts, 54 | std::vector& _brams, 55 | std::vector >& _outbits, 56 | std::map& _indices, 57 | std::vector& _ones); 58 | 59 | void buildFanout( 60 | std::vector& _luts, 61 | std::vector& _fanout); 62 | -------------------------------------------------------------------------------- /src/blif.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-08 2 | /* 3 | 4 | Simple BLIF file parser, nothing special. 
5 | Reads the inputs, outputs, gates and latches into a t_blif struct. 6 | 7 | */ 8 | /* 9 | BSD 3-Clause License 10 | 11 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 12 | All rights reserved. 13 | 14 | Redistribution and use in source and binary forms, with or without 15 | modification, are permitted provided that the following conditions are met: 16 | 17 | 1. Redistributions of source code must retain the above copyright notice, this 18 | list of conditions and the following disclaimer. 19 | 20 | 2. Redistributions in binary form must reproduce the above copyright notice, 21 | this list of conditions and the following disclaimer in the documentation 22 | and/or other materials provided with the distribution. 23 | 24 | 3. Neither the name of the copyright holder nor the names of its 25 | contributors may be used to endorse or promote products derived from 26 | this software without specific prior written permission. 27 | 28 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 29 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 30 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 31 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 32 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 33 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 34 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 35 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 36 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 37 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
38 | */ 39 | 40 | #include "blif.h" 41 | #include 42 | 43 | // ------------------------------------------------------------------- 44 | 45 | using namespace std; 46 | 47 | // ------------------------------------------------------------------- 48 | 49 | void readList( 50 | LibSL::BasicParser::Parser& parser, 51 | vector& _list) 52 | { 53 | while (1) { 54 | parser.skipSpaces(); 55 | char next = parser.readChar(false); 56 | if (next == '\n') { 57 | break; 58 | } else { 59 | string name = parser.readString(); 60 | _list.emplace_back(name); 61 | } 62 | } 63 | } 64 | 65 | // ------------------------------------------------------------------- 66 | 67 | void readConfig( 68 | LibSL::BasicParser::Parser& parser, 69 | pair& _cfgs) 70 | { 71 | string vals = parser.readString(); 72 | string out = parser.readString(); 73 | if (out.empty()) { std::swap(vals,out); } 74 | _cfgs = make_pair(vals, out); 75 | } 76 | 77 | // ------------------------------------------------------------------- 78 | 79 | ushort lut_config(const std::vector >& config_strings) 80 | { 81 | ushort cfg = 0; 82 | for (auto cs : config_strings) { 83 | if (cs.second == "1") { // probably always the case, defaults to 0 84 | ForIndex(c, 16) { // for each of 16 configs 85 | bool accept = true; 86 | ForIndex(j, cs.first.length()) { 87 | if (cs.first[cs.first.length()-1-j] == '1') { 88 | if (!(c & (1 << j))) { accept = false; break; } 89 | } else { 90 | if ( c & (1 << j) ) { accept = false; break; } 91 | } 92 | } 93 | if (accept) { 94 | cfg |= (1 << c); 95 | } 96 | } 97 | } 98 | } 99 | return cfg; 100 | } 101 | 102 | // ------------------------------------------------------------------- 103 | 104 | void split(const std::string& s, char delim, std::vector& elems) 105 | { 106 | std::stringstream ss(s); 107 | std::string item; 108 | while (getline(ss, item, delim)) { 109 | elems.push_back(item); 110 | } 111 | } 112 | 113 | // ------------------------------------------------------------------- 114 | 115 | void parse(const 
char *fname, t_blif& _blif) 116 | { 117 | 118 | LibSL::BasicParser::FileStream stream(fname); 119 | LibSL::BasicParser::Parser parser(stream,false); 120 | 121 | bool in_subckt = false; 122 | bool in_bram = false; 123 | 124 | fprintf(stderr, "Parsing ... "); 125 | Console::processingInit(); 126 | while (!parser.eof()) { 127 | parser.skipSpaces(); 128 | char first = parser.readChar(false); 129 | if (first == '#') { 130 | // skip comment 131 | } else if (first == '.') { 132 | string type = parser.readString(); 133 | if (type == ".model") { 134 | in_subckt = false; 135 | string name = parser.readString(); 136 | } else if (type == ".inputs") { 137 | in_subckt = false; 138 | readList(parser, _blif.inputs); 139 | } else if (type == ".outputs") { 140 | in_subckt = false; 141 | readList(parser, _blif.outputs); 142 | } else if (type == ".names") { 143 | in_subckt = false; 144 | vector ios; 145 | readList(parser, ios); 146 | _blif.gates.push_back(t_gate_nfo()); 147 | if (!ios.empty()) { 148 | _blif.gates.back().output = ios.back(); 149 | for (int i = 0; i < (int)ios.size() - 1; ++i) { 150 | _blif.gates.back().inputs.push_back(ios[i]); 151 | } 152 | } 153 | } else if (type == ".latch") { 154 | in_subckt = false; 155 | _blif.latches.push_back(t_latch_nfo()); 156 | vector nfos; 157 | readList(parser, nfos); 158 | sl_assert(nfos.size() == 5); 159 | _blif.latches.back().input = nfos[0]; 160 | _blif.latches.back().output = nfos[1]; 161 | _blif.latches.back().init = nfos[4]; 162 | } else if (type == ".subckt") { 163 | in_subckt = true; 164 | in_bram = false; 165 | string type = parser.readString(); 166 | if (type == "$mem_v2") { 167 | in_bram = true; 168 | _blif.brams.push_back(t_bram_nfo()); 169 | vector bindings; 170 | readList(parser, bindings); 171 | for (auto b : bindings) { 172 | std::vector left_right; 173 | split(b,'=',left_right); 174 | if (left_right.size() != 2) { 175 | fprintf(stderr," cannot interpret binding %s\n",b.c_str()); 176 | } else { 177 | 
_blif.brams.back().bindings[left_right[0]] = left_right[1]; 178 | // fprintf(stderr,"%s = %s\n",left_right[0].c_str(),left_right[1].c_str()); 179 | } 180 | } 181 | } 182 | } else if (type == ".param") { 183 | if (in_subckt && in_bram) { 184 | string param; 185 | param = parser.readString(); 186 | if (param == "MEMID") { 187 | string id; 188 | id = parser.readString(); 189 | _blif.brams.back().name = id; 190 | } else if (param == "INIT") { 191 | // read init bits and store 192 | parser.skipSpaces(); 193 | uint b = 0; 194 | while (1) { 195 | char next = parser.readChar(false); 196 | if (next == '\n') { 197 | break; 198 | } else { 199 | char bit = parser.readChar(true); 200 | _blif.brams.back().data.set(b, (bit == '1')); 201 | ++b; 202 | } 203 | } 204 | // fprintf(stderr, "read %d init bits\n", _blif.brams.back().data.bitsize()); 205 | } else { 206 | // read value (max 32 bits) 207 | uint32_t v = 0; 208 | parser.skipSpaces(); 209 | while (1) { 210 | char next = parser.readChar(false); 211 | if (next == '\n') { 212 | break; 213 | } else { 214 | char bit = parser.readChar(true); 215 | if (bit == '1') { 216 | v = (v << 1) | 1; 217 | } else { 218 | v = v << 1; 219 | } 220 | } 221 | } 222 | if (param == "ABITS") { 223 | _blif.brams.back().addr_width = v; 224 | } else if (param == "SIZE") { 225 | _blif.brams.back().size = v; 226 | } else if (param == "WIDTH") { 227 | _blif.brams.back().data_width = v; 228 | } else { 229 | // TODO: check num ports, etc 230 | } 231 | } 232 | } 233 | } 234 | } else if (first == '0' || first == '1' || first == '-') { 235 | // read configuration 236 | _blif.gates.back().config_strings.push_back(pair()); 237 | readConfig(parser, _blif.gates.back().config_strings.back()); 238 | } 239 | // skip to next line 240 | parser.reachChar('\n'); 241 | Console::processingUpdate(); 242 | } 243 | Console::processingEnd(); 244 | fprintf(stderr, " done.\n"); 245 | 246 | } 247 | 248 | // ------------------------------------------------------------------- 249 | 
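The cover-to-truth-table conversion that `lut_config` performs in blif.cc can be illustrated in isolation. What follows is a minimal sketch under stated assumptions, not the repository's code: the function name `cover_to_truth_table` is hypothetical, and unlike `lut_config` (which requires the corresponding bit to be 0 for any character other than `'1'`) it treats `-` as a don't-care, which is the usual meaning of `-` in a BLIF single-output cover. As in `lut_config`, the last character of a pattern maps to bit 0 of the input combination, and bit `c` of the result is the LUT output for combination `c`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the repository): builds a 16-bit LUT4
// truth table from BLIF cover rows given as (input pattern, output) pairs.
// Assumption: '-' in a pattern is a don't-care, which differs from
// lut_config in blif.cc.
uint16_t cover_to_truth_table(
    const std::vector<std::pair<std::string, std::string>>& rows)
{
  uint16_t table = 0;
  for (const auto& row : rows) {
    if (row.second != "1") continue;  // only on-set rows contribute
    const std::string& pattern = row.first;
    for (int c = 0; c < 16; ++c) {    // each of the 16 input combinations
      bool accept = true;
      for (std::size_t j = 0; j < pattern.size() && accept; ++j) {
        // last character of the pattern corresponds to bit 0 of c
        const char ch  = pattern[pattern.size() - 1 - j];
        const bool bit = ((c >> j) & 1) != 0;
        if (ch == '1' && !bit) accept = false;
        if (ch == '0' &&  bit) accept = false;
        // any other character (e.g. '-') leaves the bit unconstrained
      }
      if (accept) table |= static_cast<uint16_t>(1u << c);
    }
  }
  return table;
}
```

With this sketch, a two-input AND given as the single row `("11", "1")` sets bits 3, 7, 11 and 15, because the two unused upper inputs are unconstrained. An empty pattern with output `"1"` (a constant) sets all 16 bits.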
-------------------------------------------------------------------------------- /src/blif.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-08 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | 34 | #pragma once 35 | 36 | #include 37 | 38 | #include 39 | #include 40 | 41 | #include "uintX.h" 42 | 43 | typedef struct { 44 | std::string input; 45 | std::string output; 46 | std::string init; 47 | } t_latch_nfo; 48 | 49 | typedef struct { 50 | std::vector inputs; 51 | std::string output; 52 | std::vector > config_strings; 53 | } t_gate_nfo; 54 | 55 | typedef struct { 56 | std::map bindings; 57 | std::string name; 58 | int size; 59 | int addr_width; 60 | int data_width; 61 | uintX data; 62 | } t_bram_nfo; 63 | 64 | typedef struct { 65 | std::vector inputs; 66 | std::vector outputs; 67 | std::vector latches; 68 | std::vector gates; 69 | std::vector brams; 70 | } t_blif; 71 | 72 | /// Parses a blif file 73 | void parse(const char *fname, t_blif& _blif); 74 | 75 | /// Returns an integer representing the LUT configuration provided as strings 76 | ushort lut_config(const std::vector >& config_strings); 77 | 78 | /// Utilities 79 | // splits a string using a delimiter 80 | void split(const std::string &s, char delim, std::vector &elems); 81 | -------------------------------------------------------------------------------- /src/fstapi/CMakeLists.txt: -------------------------------------------------------------------------------- 1 | cmake_minimum_required(VERSION 3.5) 2 | project(fstapi) 3 | 4 | INCLUDE_DIRECTORIES( 5 | ${PROJECT_SOURCE_DIR}/ 6 | ${PROJECT_SOURCE_DIR}/../LibSL/src/libs/src/zlib/ 7 | ) 8 | 9 | ADD_LIBRARY(fstapi 10 | fastlz.c 11 | fastlz.h 12 | lz4.c 13 | lz4.h 14 | fstapi.c 15 | fstapi.h 16 | ) 17 | -------------------------------------------------------------------------------- /src/fstapi/fastlz.c: -------------------------------------------------------------------------------- 1 | /* 2 | FastLZ - lightning-fast lossless compression library 3 | 4 | Copyright (C) 2007 Ariya Hidayat (ariya@kde.org) 5 | Copyright (C) 2006 Ariya Hidayat (ariya@kde.org) 6 | Copyright (C) 2005 Ariya Hidayat (ariya@kde.org) 7 | 8 | Permission is hereby 
granted, free of charge, to any person obtaining a copy 9 | of this software and associated documentation files (the "Software"), to deal 10 | in the Software without restriction, including without limitation the rights 11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | copies of the Software, and to permit persons to whom the Software is 13 | furnished to do so, subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in 16 | all copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 24 | THE SOFTWARE. 25 | 26 | SPDX-License-Identifier: MIT 27 | */ 28 | 29 | #include "fastlz.h" 30 | 31 | #if !defined(FASTLZ__COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR) 32 | 33 | /* 34 | * Always check for bound when decompressing. 35 | * Generally it is best to leave it defined. 36 | */ 37 | #define FASTLZ_SAFE 38 | 39 | 40 | /* 41 | * Give hints to the compiler for branch prediction optimization. 42 | */ 43 | #if defined(__GNUC__) && (__GNUC__ > 2) 44 | #define FASTLZ_EXPECT_CONDITIONAL(c) (__builtin_expect((c), 1)) 45 | #define FASTLZ_UNEXPECT_CONDITIONAL(c) (__builtin_expect((c), 0)) 46 | #else 47 | #define FASTLZ_EXPECT_CONDITIONAL(c) (c) 48 | #define FASTLZ_UNEXPECT_CONDITIONAL(c) (c) 49 | #endif 50 | 51 | /* 52 | * Use inlined functions for supported systems. 
53 | */ 54 | #if defined(__GNUC__) || defined(__DMC__) || defined(__POCC__) || defined(__WATCOMC__) || defined(__SUNPRO_C) 55 | #define FASTLZ_INLINE inline 56 | #elif defined(__BORLANDC__) || defined(_MSC_VER) || defined(__LCC__) 57 | #define FASTLZ_INLINE __inline 58 | #else 59 | #define FASTLZ_INLINE 60 | #endif 61 | 62 | /* 63 | * Prevent accessing more than 8-bit at once, except on x86 architectures. 64 | */ 65 | #if !defined(FASTLZ_STRICT_ALIGN) 66 | #define FASTLZ_STRICT_ALIGN 67 | #if defined(__i386__) || defined(__386) /* GNU C, Sun Studio */ 68 | #undef FASTLZ_STRICT_ALIGN 69 | #elif defined(__i486__) || defined(__i586__) || defined(__i686__) || defined(__amd64) /* GNU C */ 70 | #undef FASTLZ_STRICT_ALIGN 71 | #elif defined(_M_IX86) /* Intel, MSVC */ 72 | #undef FASTLZ_STRICT_ALIGN 73 | #elif defined(__386) 74 | #undef FASTLZ_STRICT_ALIGN 75 | #elif defined(_X86_) /* MinGW */ 76 | #undef FASTLZ_STRICT_ALIGN 77 | #elif defined(__I86__) /* Digital Mars */ 78 | #undef FASTLZ_STRICT_ALIGN 79 | #endif 80 | #endif 81 | 82 | /* prototypes */ 83 | int fastlz_compress(const void* input, int length, void* output); 84 | int fastlz_compress_level(int level, const void* input, int length, void* output); 85 | int fastlz_decompress(const void* input, int length, void* output, int maxout); 86 | 87 | #define MAX_COPY 32 88 | #define MAX_LEN 264 /* 256 + 8 */ 89 | #define MAX_DISTANCE 8192 90 | 91 | #if !defined(FASTLZ_STRICT_ALIGN) 92 | #define FASTLZ_READU16(p) *((const flzuint16*)(p)) 93 | #else 94 | #define FASTLZ_READU16(p) ((p)[0] | (p)[1]<<8) 95 | #endif 96 | 97 | #define HASH_LOG 13 98 | #define HASH_SIZE (1<< HASH_LOG) 99 | #define HASH_MASK (HASH_SIZE-1) 100 | #define HASH_FUNCTION(v,p) { v = FASTLZ_READU16(p); v ^= FASTLZ_READU16(p+1)^(v>>(16-HASH_LOG));v &= HASH_MASK; } 101 | 102 | #undef FASTLZ_LEVEL 103 | #define FASTLZ_LEVEL 1 104 | 105 | #undef FASTLZ_COMPRESSOR 106 | #undef FASTLZ_DECOMPRESSOR 107 | #define FASTLZ_COMPRESSOR fastlz1_compress 108 | #define 
FASTLZ_DECOMPRESSOR fastlz1_decompress 109 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output); 110 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout); 111 | #include "fastlz.c" 112 | 113 | #undef FASTLZ_LEVEL 114 | #define FASTLZ_LEVEL 2 115 | 116 | #undef MAX_DISTANCE 117 | #define MAX_DISTANCE 8191 118 | #define MAX_FARDISTANCE (65535+MAX_DISTANCE-1) 119 | 120 | #undef FASTLZ_COMPRESSOR 121 | #undef FASTLZ_DECOMPRESSOR 122 | #define FASTLZ_COMPRESSOR fastlz2_compress 123 | #define FASTLZ_DECOMPRESSOR fastlz2_decompress 124 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output); 125 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout); 126 | #include "fastlz.c" 127 | 128 | int fastlz_compress(const void* input, int length, void* output) 129 | { 130 | /* for short block, choose fastlz1 */ 131 | if(length < 65536) 132 | return fastlz1_compress(input, length, output); 133 | 134 | /* else... 
*/ 135 | return fastlz2_compress(input, length, output); 136 | } 137 | 138 | int fastlz_decompress(const void* input, int length, void* output, int maxout) 139 | { 140 | /* magic identifier for compression level */ 141 | int level = ((*(const flzuint8*)input) >> 5) + 1; 142 | 143 | if(level == 1) 144 | return fastlz1_decompress(input, length, output, maxout); 145 | if(level == 2) 146 | return fastlz2_decompress(input, length, output, maxout); 147 | 148 | /* unknown level, trigger error */ 149 | return 0; 150 | } 151 | 152 | int fastlz_compress_level(int level, const void* input, int length, void* output) 153 | { 154 | if(level == 1) 155 | return fastlz1_compress(input, length, output); 156 | if(level == 2) 157 | return fastlz2_compress(input, length, output); 158 | 159 | return 0; 160 | } 161 | 162 | #else /* !defined(FASTLZ_COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR) */ 163 | 164 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output) 165 | { 166 | const flzuint8* ip = (const flzuint8*) input; 167 | const flzuint8* ip_bound = ip + length - 2; 168 | const flzuint8* ip_limit = ip + length - 12; 169 | flzuint8* op = (flzuint8*) output; 170 | 171 | const flzuint8* htab[HASH_SIZE]; 172 | const flzuint8** hslot; 173 | flzuint32 hval; 174 | 175 | flzuint32 copy; 176 | 177 | /* sanity check */ 178 | if(FASTLZ_UNEXPECT_CONDITIONAL(length < 4)) 179 | { 180 | if(length) 181 | { 182 | /* create literal copy only */ 183 | *op++ = length-1; 184 | ip_bound++; 185 | while(ip <= ip_bound) 186 | *op++ = *ip++; 187 | return length+1; 188 | } 189 | else 190 | return 0; 191 | } 192 | 193 | /* initializes hash table */ 194 | for (hslot = htab; hslot < htab + HASH_SIZE; hslot++) 195 | *hslot = ip; 196 | 197 | /* we start with literal copy */ 198 | copy = 2; 199 | *op++ = MAX_COPY-1; 200 | *op++ = *ip++; 201 | *op++ = *ip++; 202 | 203 | /* main loop */ 204 | while(FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit)) 205 | { 206 | const flzuint8* ref; 207 | 
flzuint32 distance; 208 | 209 | /* minimum match length */ 210 | flzuint32 len = 3; 211 | 212 | /* comparison starting-point */ 213 | const flzuint8* anchor = ip; 214 | 215 | /* check for a run */ 216 | #if FASTLZ_LEVEL==2 217 | if(ip[0] == ip[-1] && FASTLZ_READU16(ip-1)==FASTLZ_READU16(ip+1)) 218 | { 219 | distance = 1; 220 | /* ip += 3; */ /* scan-build, never used */ 221 | ref = anchor - 1 + 3; 222 | goto match; 223 | } 224 | #endif 225 | 226 | /* find potential match */ 227 | HASH_FUNCTION(hval,ip); 228 | hslot = htab + hval; 229 | ref = htab[hval]; 230 | 231 | /* calculate distance to the match */ 232 | distance = anchor - ref; 233 | 234 | /* update hash table */ 235 | *hslot = anchor; 236 | 237 | /* is this a match? check the first 3 bytes */ 238 | if(distance==0 || 239 | #if FASTLZ_LEVEL==1 240 | (distance >= MAX_DISTANCE) || 241 | #else 242 | (distance >= MAX_FARDISTANCE) || 243 | #endif 244 | *ref++ != *ip++ || *ref++!=*ip++ || *ref++!=*ip++) 245 | goto literal; 246 | 247 | #if FASTLZ_LEVEL==2 248 | /* far, needs at least 5-byte match */ 249 | if(distance >= MAX_DISTANCE) 250 | { 251 | if(*ip++ != *ref++ || *ip++!= *ref++) 252 | goto literal; 253 | len += 2; 254 | } 255 | 256 | match: 257 | #endif 258 | 259 | /* last matched byte */ 260 | ip = anchor + len; 261 | 262 | /* distance is biased */ 263 | distance--; 264 | 265 | if(!distance) 266 | { 267 | /* zero distance means a run */ 268 | flzuint8 x = ip[-1]; 269 | while(ip < ip_bound) 270 | if(*ref++ != x) break; else ip++; 271 | } 272 | else 273 | for(;;) 274 | { 275 | /* safe because the outer check against ip limit */ 276 | if(*ref++ != *ip++) break; 277 | if(*ref++ != *ip++) break; 278 | if(*ref++ != *ip++) break; 279 | if(*ref++ != *ip++) break; 280 | if(*ref++ != *ip++) break; 281 | if(*ref++ != *ip++) break; 282 | if(*ref++ != *ip++) break; 283 | if(*ref++ != *ip++) break; 284 | while(ip < ip_bound) 285 | if(*ref++ != *ip++) break; 286 | break; 287 | } 288 | 289 | /* if we have copied something, 
adjust the copy count */ 290 | if(copy) 291 | /* copy is biased, '0' means 1 byte copy */ 292 | *(op-copy-1) = copy-1; 293 | else 294 | /* back, to overwrite the copy count */ 295 | op--; 296 | 297 | /* reset literal counter */ 298 | copy = 0; 299 | 300 | /* length is biased, '1' means a match of 3 bytes */ 301 | ip -= 3; 302 | len = ip - anchor; 303 | 304 | /* encode the match */ 305 | #if FASTLZ_LEVEL==2 306 | if(distance < MAX_DISTANCE) 307 | { 308 | if(len < 7) 309 | { 310 | *op++ = (len << 5) + (distance >> 8); 311 | *op++ = (distance & 255); 312 | } 313 | else 314 | { 315 | *op++ = (7 << 5) + (distance >> 8); 316 | for(len-=7; len >= 255; len-= 255) 317 | *op++ = 255; 318 | *op++ = len; 319 | *op++ = (distance & 255); 320 | } 321 | } 322 | else 323 | { 324 | /* far away, but not yet in the another galaxy... */ 325 | if(len < 7) 326 | { 327 | distance -= MAX_DISTANCE; 328 | *op++ = (len << 5) + 31; 329 | *op++ = 255; 330 | *op++ = distance >> 8; 331 | *op++ = distance & 255; 332 | } 333 | else 334 | { 335 | distance -= MAX_DISTANCE; 336 | *op++ = (7 << 5) + 31; 337 | for(len-=7; len >= 255; len-= 255) 338 | *op++ = 255; 339 | *op++ = len; 340 | *op++ = 255; 341 | *op++ = distance >> 8; 342 | *op++ = distance & 255; 343 | } 344 | } 345 | #else 346 | 347 | if(FASTLZ_UNEXPECT_CONDITIONAL(len > MAX_LEN-2)) 348 | while(len > MAX_LEN-2) 349 | { 350 | *op++ = (7 << 5) + (distance >> 8); 351 | *op++ = MAX_LEN - 2 - 7 -2; 352 | *op++ = (distance & 255); 353 | len -= MAX_LEN-2; 354 | } 355 | 356 | if(len < 7) 357 | { 358 | *op++ = (len << 5) + (distance >> 8); 359 | *op++ = (distance & 255); 360 | } 361 | else 362 | { 363 | *op++ = (7 << 5) + (distance >> 8); 364 | *op++ = len - 7; 365 | *op++ = (distance & 255); 366 | } 367 | #endif 368 | 369 | /* update the hash at match boundary */ 370 | HASH_FUNCTION(hval,ip); 371 | htab[hval] = ip++; 372 | HASH_FUNCTION(hval,ip); 373 | htab[hval] = ip++; 374 | 375 | /* assuming literal copy */ 376 | *op++ = MAX_COPY-1; 377 | 378 | 
continue; 379 | 380 | literal: 381 | *op++ = *anchor++; 382 | ip = anchor; 383 | copy++; 384 | if(FASTLZ_UNEXPECT_CONDITIONAL(copy == MAX_COPY)) 385 | { 386 | copy = 0; 387 | *op++ = MAX_COPY-1; 388 | } 389 | } 390 | 391 | /* left-over as literal copy */ 392 | ip_bound++; 393 | while(ip <= ip_bound) 394 | { 395 | *op++ = *ip++; 396 | copy++; 397 | if(copy == MAX_COPY) 398 | { 399 | copy = 0; 400 | *op++ = MAX_COPY-1; 401 | } 402 | } 403 | 404 | /* if we have copied something, adjust the copy length */ 405 | if(copy) 406 | *(op-copy-1) = copy-1; 407 | else 408 | op--; 409 | 410 | #if FASTLZ_LEVEL==2 411 | /* marker for fastlz2 */ 412 | *(flzuint8*)output |= (1 << 5); 413 | #endif 414 | 415 | return op - (flzuint8*)output; 416 | } 417 | 418 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout) 419 | { 420 | const flzuint8* ip = (const flzuint8*) input; 421 | const flzuint8* ip_limit = ip + length; 422 | flzuint8* op = (flzuint8*) output; 423 | flzuint8* op_limit = op + maxout; 424 | flzuint32 ctrl = (*ip++) & 31; 425 | int loop = 1; 426 | 427 | do 428 | { 429 | const flzuint8* ref = op; 430 | flzuint32 len = ctrl >> 5; 431 | flzuint32 ofs = (ctrl & 31) << 8; 432 | 433 | if(ctrl >= 32) 434 | { 435 | #if FASTLZ_LEVEL==2 436 | flzuint8 code; 437 | #endif 438 | len--; 439 | ref -= ofs; 440 | if (len == 7-1) 441 | #if FASTLZ_LEVEL==1 442 | len += *ip++; 443 | ref -= *ip++; 444 | #else 445 | do 446 | { 447 | code = *ip++; 448 | len += code; 449 | } while (code==255); 450 | code = *ip++; 451 | ref -= code; 452 | 453 | /* match from 16-bit distance */ 454 | if(FASTLZ_UNEXPECT_CONDITIONAL(code==255)) 455 | if(FASTLZ_EXPECT_CONDITIONAL(ofs==(31 << 8))) 456 | { 457 | ofs = (*ip++) << 8; 458 | ofs += *ip++; 459 | ref = op - ofs - MAX_DISTANCE; 460 | } 461 | #endif 462 | 463 | #ifdef FASTLZ_SAFE 464 | if (FASTLZ_UNEXPECT_CONDITIONAL(op + len + 3 > op_limit)) 465 | return 0; 466 | 467 | if (FASTLZ_UNEXPECT_CONDITIONAL(ref-1 < 
(flzuint8 *)output)) 468 | return 0; 469 | #endif 470 | 471 | if(FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit)) 472 | ctrl = *ip++; 473 | else 474 | loop = 0; 475 | 476 | if(ref == op) 477 | { 478 | /* optimize copy for a run */ 479 | flzuint8 b = ref[-1]; 480 | *op++ = b; 481 | *op++ = b; 482 | *op++ = b; 483 | for(; len; --len) 484 | *op++ = b; 485 | } 486 | else 487 | { 488 | #if !defined(FASTLZ_STRICT_ALIGN) 489 | const flzuint16* p; 490 | flzuint16* q; 491 | #endif 492 | /* copy from reference */ 493 | ref--; 494 | *op++ = *ref++; 495 | *op++ = *ref++; 496 | *op++ = *ref++; 497 | 498 | #if !defined(FASTLZ_STRICT_ALIGN) 499 | /* copy a byte, so that now it's word aligned */ 500 | if(len & 1) 501 | { 502 | *op++ = *ref++; 503 | len--; 504 | } 505 | 506 | /* copy 16-bit at once */ 507 | q = (flzuint16*) op; 508 | op += len; 509 | p = (const flzuint16*) ref; 510 | for(len>>=1; len > 4; len-=4) 511 | { 512 | *q++ = *p++; 513 | *q++ = *p++; 514 | *q++ = *p++; 515 | *q++ = *p++; 516 | } 517 | for(; len; --len) 518 | *q++ = *p++; 519 | #else 520 | for(; len; --len) 521 | *op++ = *ref++; 522 | #endif 523 | } 524 | } 525 | else 526 | { 527 | ctrl++; 528 | #ifdef FASTLZ_SAFE 529 | if (FASTLZ_UNEXPECT_CONDITIONAL(op + ctrl > op_limit)) 530 | return 0; 531 | if (FASTLZ_UNEXPECT_CONDITIONAL(ip + ctrl > ip_limit)) 532 | return 0; 533 | #endif 534 | 535 | *op++ = *ip++; 536 | for(--ctrl; ctrl; ctrl--) 537 | *op++ = *ip++; 538 | 539 | loop = FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit); 540 | if(loop) 541 | ctrl = *ip++; 542 | } 543 | } 544 | while(FASTLZ_EXPECT_CONDITIONAL(loop)); 545 | 546 | return op - (flzuint8*)output; 547 | } 548 | 549 | #endif /* !defined(FASTLZ_COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR) */ 550 | -------------------------------------------------------------------------------- /src/fstapi/fastlz.h: -------------------------------------------------------------------------------- 1 | /* 2 | FastLZ - lightning-fast lossless compression library 3 | 4 | Copyright 
(C) 2007 Ariya Hidayat (ariya@kde.org) 5 | Copyright (C) 2006 Ariya Hidayat (ariya@kde.org) 6 | Copyright (C) 2005 Ariya Hidayat (ariya@kde.org) 7 | 8 | Permission is hereby granted, free of charge, to any person obtaining a copy 9 | of this software and associated documentation files (the "Software"), to deal 10 | in the Software without restriction, including without limitation the rights 11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 12 | copies of the Software, and to permit persons to whom the Software is 13 | furnished to do so, subject to the following conditions: 14 | 15 | The above copyright notice and this permission notice shall be included in 16 | all copies or substantial portions of the Software. 17 | 18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 24 | THE SOFTWARE. 25 | 26 | SPDX-License-Identifier: MIT 27 | */ 28 | 29 | #ifndef FASTLZ_H 30 | #define FASTLZ_H 31 | 32 | #include 33 | 34 | #define flzuint8 uint8_t 35 | #define flzuint16 uint16_t 36 | #define flzuint32 uint32_t 37 | 38 | 39 | #define FASTLZ_VERSION 0x000100 40 | 41 | #define FASTLZ_VERSION_MAJOR 0 42 | #define FASTLZ_VERSION_MINOR 0 43 | #define FASTLZ_VERSION_REVISION 0 44 | 45 | #define FASTLZ_VERSION_STRING "0.1.0" 46 | 47 | #if defined (__cplusplus) 48 | extern "C" { 49 | #endif 50 | 51 | /** 52 | Compress a block of data in the input buffer and returns the size of 53 | compressed block. The size of input buffer is specified by length. The 54 | minimum input buffer size is 16. 
55 | 56 | The output buffer must be at least 5% larger than the input buffer 57 | and can not be smaller than 66 bytes. 58 | 59 | If the input is not compressible, the return value might be larger than 60 | length (input buffer size). 61 | 62 | The input buffer and the output buffer can not overlap. 63 | */ 64 | 65 | int fastlz_compress(const void* input, int length, void* output); 66 | 67 | /** 68 | Decompress a block of compressed data and returns the size of the 69 | decompressed block. If error occurs, e.g. the compressed data is 70 | corrupted or the output buffer is not large enough, then 0 (zero) 71 | will be returned instead. 72 | 73 | The input buffer and the output buffer can not overlap. 74 | 75 | Decompression is memory safe and guaranteed not to write the output buffer 76 | more than what is specified in maxout. 77 | */ 78 | 79 | int fastlz_decompress(const void* input, int length, void* output, int maxout); 80 | 81 | /** 82 | Compress a block of data in the input buffer and returns the size of 83 | compressed block. The size of input buffer is specified by length. The 84 | minimum input buffer size is 16. 85 | 86 | The output buffer must be at least 5% larger than the input buffer 87 | and can not be smaller than 66 bytes. 88 | 89 | If the input is not compressible, the return value might be larger than 90 | length (input buffer size). 91 | 92 | The input buffer and the output buffer can not overlap. 93 | 94 | Compression level can be specified in parameter level. At the moment, 95 | only level 1 and level 2 are supported. 96 | Level 1 is the fastest compression and generally useful for short data. 97 | Level 2 is slightly slower but it gives better compression ratio. 98 | 99 | Note that the compressed data, regardless of the level, can always be 100 | decompressed using the function fastlz_decompress above. 
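  As a rough illustration of the output buffer rule stated above (at least
  5% larger than the input, never smaller than 66 bytes), the capacity could
  be computed as sketched below. The helper name flz_out_capacity is
  hypothetical and is not part of this header; it only shows one way a
  caller might size the buffer, under the assumption that rounding up by one
  extra byte is acceptable.

```cpp
#include <stddef.h>

// Hypothetical helper, not provided by fastlz.h: returns an output buffer
// capacity that is strictly more than 105% of `length` and at least 66 bytes.
static size_t flz_out_capacity(size_t length)
{
  // length/20 is 5% rounded down; the +1 makes up for the rounding.
  size_t cap = length + length / 20 + 1;
  return cap < 66 ? 66 : cap;
}
```

  A caller would then allocate flz_out_capacity(length) bytes for the output
  buffer before calling fastlz_compress_level.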
101 | */ 102 | 103 | int fastlz_compress_level(int level, const void* input, int length, void* output); 104 | 105 | #if defined (__cplusplus) 106 | } 107 | #endif 108 | 109 | #endif /* FASTLZ_H */ 110 | -------------------------------------------------------------------------------- /src/read.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #include 35 | 36 | #include 37 | #include 38 | #include 39 | #include 40 | #include 41 | #include 42 | #include 43 | #include 44 | #include 45 | #include 46 | 47 | using namespace std; 48 | 49 | #include "read.h" 50 | #include "blif.h" 51 | 52 | // ----------------------------------------------------------------------------- 53 | /* 54 | From a read blif file, prepares a data-structure for simulation 55 | The blif file contains: 56 | - gates, which are LUT4s 57 | - latches, which indicate a flip-flop 58 | 59 | In most cases, a latch corresponds to the output of a gate, so it is 60 | simply a matter of connecting to the Q output of the corresponding gate. 61 | There are cases however where latches are chained. This requires us to 62 | instantiate gates to implement the flip-flops (since we only simulate 63 | gates). These gates are passthrough, and only their Q output is used, 64 | see tag [extra gates] in comments below. 
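  The chained-latch case described above can be sketched in isolation as
  follows. This is a minimal illustration only: Latch, Gate and
  insertPassThroughGates are hypothetical, simplified stand-ins, not the
  t_blif structures used in this file.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the BLIF structures.
struct Latch { std::string input, output; };
struct Gate  { std::vector<std::string> inputs; std::string output; };

// For every latch fed directly by another latch's Q output, add a
// pass-through gate and rewire the latch input to that gate's output.
// Returns the number of gates added.
int insertPassThroughGates(std::vector<Latch>& latches, std::vector<Gate>& gates)
{
  std::map<std::string, int> latch_outputs; // output name -> latch index
  for (int l = 0; l < (int)latches.size(); ++l) {
    latch_outputs[latches[l].output] = l;
  }
  int added = 0;
  for (auto& l : latches) {
    if (latch_outputs.count(l.input)) {
      Gate g;
      g.inputs.push_back(l.input);      // gate reads the earlier latch's Q
      g.output = "__extra__" + l.input; // same naming scheme as [extra gates]
      gates.push_back(g);
      l.input = g.output;               // latch now reads the new gate
      ++added;
    }
  }
  return added;
}
```

  With two latches d -> q1 and q1 -> q2, the second latch ends up reading
  from a new gate named __extra__q1, and the first latch is left untouched.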
65 | */ 66 | void buildSimulData( 67 | t_blif& _blif, // might change (adding extra LUTs) 68 | vector<t_lut>& _luts, // output vector of LUTs (see header) 69 | vector<t_bram>& _brams, // output vector of BRAMs 70 | vector<pair<string,int> >& _outbits, // output bit indices 71 | vector<int>& _ones, // which output bits start as '1' 72 | map<string,int>& _indices) // signal to LUT map 73 | { 74 | // gather output names and their source gate/latch 75 | map<string,v2i> output2src; 76 | // -> gates 77 | ForArray(_blif.gates, g) { 78 | output2src[_blif.gates[g].output] = v2i(0, g); 79 | } 80 | // -> latches 81 | ForArray(_blif.latches, l) { 82 | output2src[_blif.latches[l].output] = v2i(1, l); 83 | } 84 | // -> BRAMs 85 | ForArray(_blif.brams, b) { 86 | for (int i=0; i < _blif.brams[b].data_width; ++i) { 87 | string port = "RD_DATA[" + std::to_string(i) + "]"; 88 | auto P = _blif.brams[b].bindings.find(port); 89 | if (P == _blif.brams[b].bindings.end()) { 90 | fprintf(stderr, " bram '%s' is disconnected\n", port.c_str()); 91 | } else { 92 | output2src[P->second] = v2i(2,b); 93 | } 94 | } 95 | } 96 | // -> inputs 97 | ForArray(_blif.inputs, i) { 98 | output2src[_blif.inputs[i]] = v2i(3, i); 99 | } 100 | // number all outputs 101 | // prepare to create luts 102 | // -> find register outputs that depend on other registers 103 | for (const auto& o : output2src) { 104 | if (o.second[0] == 1) { // latch 105 | // find input type 106 | sl_assert(output2src.count(_blif.latches[o.second[1]].input)); 107 | const auto& I = output2src.find(_blif.latches[o.second[1]].input); 108 | if (I->second[0] == 1) { 109 | // input of this latch is the output (Q) of an earlier latch 110 | // we need a pass-through gate to do that [extra gates] 111 | int g = (int)_blif.gates.size(); 112 | _blif.gates.push_back(t_gate_nfo()); 113 | _blif.gates.back().config_strings.push_back(make_pair("1", "1")); 114 | _blif.gates.back().inputs.push_back(I->first); 115 | string ex = "__extra__" + I->first; 116 | _blif.gates.back().output = ex; 117 | 
_blif.latches[o.second[1]].input = ex; 118 | output2src[ex] = v2i(0, g); 119 | } 120 | } 121 | } 122 | // -> create one LUT per latch 123 | vector<int> lut_gates; 124 | for (const auto& o : output2src) { 125 | if (o.second[0] == 1) { // latch 126 | // find input type 127 | const auto& I = output2src.find (_blif.latches[o.second[1]].input); 128 | sl_assert(I != output2src.end()); 129 | sl_assert(I->second[0] != 1); // other has to not be a latch 130 | /// create LUT for the D output (latch input) 131 | /// assign output to Q (latch output) 132 | // store indices of input (D) and output (Q) 133 | _indices[I->first] = (((int)lut_gates.size()) << 1); 134 | _indices[o.first] = (((int)lut_gates.size()) << 1) + 1; 135 | // gate that corresponds to the lut 136 | lut_gates.push_back(I->second[1]); 137 | } 138 | } 139 | // -> create one LUT per comb output 140 | for (const auto& o : output2src) { 141 | if (o.second[0] != 1) { // not latch 142 | // ignore clock 143 | if (o.first == "clock") { 144 | continue; 145 | } 146 | // check if the output is already assigned 147 | if (_indices.count(o.first)) { 148 | continue; 149 | } 150 | // check that it does not use clock // NOTE: investigate 151 | bool skip = false; 152 | for (auto i : _blif.gates[o.second[1]].inputs) { 153 | if (i == "clock") { 154 | skip = true; 155 | break; 156 | } 157 | } 158 | if (skip) continue; 159 | /// create LUT for the D output 160 | // store index 161 | _indices[o.first] = (((int)lut_gates.size()) << 1); 162 | if (o.second[0] == 0) { 163 | lut_gates.push_back(o.second[1]); // gate that corresponds to the lut 164 | } else if (o.second[0] == 2) { 165 | lut_gates.push_back(-1); // external lut 166 | } else if (o.second[0] == 3) { 167 | lut_gates.push_back(-1); // external lut 168 | } else { 169 | fprintf(stderr, " unexpected\n"); 170 | } 171 | } 172 | } 173 | // -> connect BRAMs to design 174 | for (const auto &b : _blif.brams) { 175 | _brams.push_back(t_bram()); 176 | _brams.back().name = b.name; 177 | 
_brams.back().data = b.data; 178 | // -> list what to connect 179 | vector< tuple<string, int, vector<int>* > > ports_width; 180 | ports_width.push_back(make_tuple("RD_ADDR", b.addr_width, &_brams.back().rd_addr)); 181 | ports_width.push_back(make_tuple("RD_DATA", b.data_width, &_brams.back().rd_data)); 182 | ports_width.push_back(make_tuple("WR_ADDR", b.addr_width, &_brams.back().wr_addr)); 183 | ports_width.push_back(make_tuple("WR_DATA", b.data_width, &_brams.back().wr_data)); 184 | ports_width.push_back(make_tuple("WR_EN", b.data_width, &_brams.back().wr_en)); 185 | // -> check and connect 186 | for (auto &pw : ports_width) { 187 | for (int i=0; i < (int)get<1>(pw); ++i) { 188 | string port = get<0>(pw) + "[" + std::to_string(i) + "]"; 189 | auto P = b.bindings.find(port); 190 | if (P == b.bindings.end()) { 191 | fprintf(stderr, " bram '%s' is disconnected\n", port.c_str()); 192 | } else { 193 | auto I = _indices.find(P->second); 194 | if (I == _indices.end()) { 195 | fprintf(stderr, " bram '%s' is connected to unknown '%s'\n", port.c_str(), P->second.c_str()); 196 | exit(-1); 197 | } else { 198 | // std::cerr << P->first << " " << I->second << '\n'; 199 | get<2>(pw)->push_back(I->second); 200 | } 201 | } 202 | } 203 | } 204 | } 205 | // -> instantiate LUTs 206 | for (auto g : lut_gates) { 207 | _luts.push_back(t_lut()); 208 | ForIndex(i, 4) { 209 | _luts.back().inputs[i] = -1; 210 | } 211 | if (g == -1) { 212 | _luts.back().cfg = 0; 213 | _luts.back().external = true; // TODO FIXME: merge with above!!! 
214 | } else { 215 | _luts.back().cfg = lut_config(_blif.gates[g].config_strings); 216 | _luts.back().external = false; 217 | int i = 4 - (int)_blif.gates[g].inputs.size(); 218 | for (auto inp : _blif.gates[g].inputs) { 219 | auto I = _indices.find(inp); 220 | if (I == _indices.end()) { 221 | fprintf(stderr, " input '%s' disconnected\n", inp.c_str()); 222 | } else { 223 | _luts.back().inputs[i++] = I->second; 224 | } 225 | } 226 | } 227 | } 228 | 229 | for (auto op : _blif.outputs) { 230 | auto I = _indices.find(op); 231 | if (I == _indices.end()) { 232 | fprintf(stderr, " outport '%s' disconnected\n", op.c_str()); 233 | } else { 234 | _outbits.push_back(make_pair(op, I->second)); 235 | } 236 | } 237 | 238 | for (const auto& l : _blif.latches) { 239 | if (l.init == "1") { 240 | auto I = _indices.find(l.output); 241 | sl_assert(I != _indices.end()); 242 | _ones.push_back(I->second); 243 | } 244 | } 245 | 246 | /// DEBUG 247 | #if 0 248 | map<int,string> reverse_indices; 249 | for (auto& idc : _indices) { 250 | reverse_indices[idc.second] = idc.first; 251 | } 252 | for (int l = 0; l < _luts.size(); ++l) { 253 | fprintf(stderr,"LUT %3d (%s), cfg:%4x, inputs: %4d %4d %4d %4d ext:%d\n", 254 | l<<1, reverse_indices.at(l<<1).c_str(), _luts[l].cfg, 255 | _luts[l].inputs[0], _luts[l].inputs[1], 256 | _luts[l].inputs[2], _luts[l].inputs[3], 257 | _luts[l].external); 258 | } 259 | #endif 260 | } 261 | 262 | // ----------------------------------------------------------------------------- 263 | 264 | /* 265 | Reads the design from a BLIF file 266 | */ 267 | void readDesign( 268 | const char *path, 269 | vector<t_lut>& _luts, 270 | vector<t_bram>& _brams, 271 | vector<pair<string,int> >& _outbits, 272 | vector<int>& _ones, 273 | map<string,int>& _indices) 274 | { 275 | t_blif blif; 276 | // parse the blif file 277 | parse(path, blif); 278 | // build the design datastructure 279 | buildSimulData(blif, _luts, _brams, _outbits, _ones, _indices); 280 | } 281 | 282 | // 
----------------------------------------------------------------------------- 283 | -------------------------------------------------------------------------------- /src/read.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | 34 | #pragma once 35 | 36 | #include 37 | #include 38 | 39 | #include "uintX.h" 40 | 41 | typedef unsigned char uchar; 42 | 43 | #pragma pack(push) 44 | #pragma pack(1) 45 | // struct holding a LUT configuration: 46 | // - cfg is a 16-bit integer that defines the truth table for 4 inputs 47 | // - inputs[4] are the indices of the inputs (other LUTs in the LUT table) 48 | // The lowest bit of each index indicates whether the input is connected to D (0) 49 | // or Q (1). The higher bits are the LUT index. 50 | // So for LUT i the index is obtained as (i<<1) + 0 if connected to D 51 | // or (i<<1) + 1 if connected to Q 52 | // Given the index x, the LUT is (x>>1) and (x&1) == 1 if Q, otherwise D 53 | typedef struct s_lut { 54 | unsigned short cfg; 55 | int inputs[4]; 56 | bool external; // TODO: try to move this out of t_lut, only used between read and analyze 57 | } t_lut; 58 | #pragma pack(pop) 59 | 60 | // struct holding a BRAM 61 | typedef struct s_bram { 62 | std::string name; 63 | uintX data; 64 | std::vector<int> rd_addr; 65 | std::vector<int> rd_data; 66 | std::vector<int> wr_addr; 67 | std::vector<int> wr_data; 68 | std::vector<int> wr_en; 69 | // int rd_clock; 70 | // int wr_clock; 71 | } t_bram; 72 | 73 | void readDesign( 74 | const char *path, 75 | std::vector<t_lut>& _luts, 76 | std::vector<t_bram>& _brams, 77 | std::vector<std::pair<std::string,int> >& _outbits, 78 | std::vector<int>& _ones, 79 | std::map<std::string,int>& _indices); 80 | -------------------------------------------------------------------------------- /src/sh_clear.cs: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. 
Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #version 430 35 | 36 | layout(local_size_x = 1, local_size_y = 1) in; 37 | 38 | layout(std430, binding = 0) buffer Buf { uint buf[]; }; 39 | 40 | void main() 41 | { 42 | uint id = gl_GlobalInvocationID.x; 43 | buf[id] = 0; 44 | } 45 | -------------------------------------------------------------------------------- /src/sh_init.cs: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 
7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | 34 | #version 430 35 | 36 | layout(local_size_x = 1, local_size_y = 1) in; 37 | 38 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; }; 39 | readonly layout(std430, binding = 5) buffer Buf5 { uint ones []; }; 40 | 41 | void main() 42 | { 43 | uint id = gl_GlobalInvocationID.x; 44 | // update flipflop 45 | uint o = ones[id]; 46 | atomicOr(outputs[o >> 1u], 1u << (o & 1u)); 47 | } 48 | -------------------------------------------------------------------------------- /src/sh_outports.cs: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #version 430 35 | 36 | layout(local_size_x = 1, local_size_y = 1) in; 37 | 38 | coherent readonly layout(std430, binding = 2) buffer Buf2 { uint outputs []; }; 39 | readonly layout(std430, binding = 3) buffer Buf3 { uint portlocs[]; }; 40 | writeonly layout(std430, binding = 4) buffer Buf4 { uint portvals[]; }; 41 | 42 | uniform uint offset; 43 | 44 | void main() 45 | { 46 | uint id = gl_GlobalInvocationID.x; 47 | uint o = portlocs[id]; 48 | portvals[offset + id] = (outputs[o>>1u] >> (o&1u)) & 1u; 49 | } 50 | -------------------------------------------------------------------------------- /src/sh_posedge.cs: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. 
Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #version 430 35 | 36 | layout(local_size_x = 128, local_size_y = 1) in; 37 | 38 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; }; 39 | 40 | uniform uint num; 41 | 42 | void main() 43 | { 44 | if (gl_GlobalInvocationID.x < num) 45 | { 46 | uint lut_id = gl_GlobalInvocationID.x; 47 | // update Q output from D, but only if their values differ. 48 | uint outv = outputs[lut_id]; 49 | if ((outv & 1) != ((outv>>1)&1)) { 50 | if ((outv & 1u) == 1u) { 51 | outputs[lut_id] = 3u; 52 | } else { 53 | outputs[lut_id] = 0u; 54 | } 55 | } 56 | } 57 | } 58 | -------------------------------------------------------------------------------- /src/sh_simul.cs: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 
7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | 34 | #version 430 35 | 36 | layout(local_size_x = 128, local_size_y = 1) in; 37 | 38 | readonly layout(std430, binding = 0) buffer Buf0 { uint cfg []; }; 39 | readonly layout(std430, binding = 1) buffer Buf1 { ivec4 addrs []; }; 40 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; }; 41 | 42 | uniform uint start_lut; 43 | uniform uint num; 44 | 45 | uint get_output(uint a) 46 | { 47 | return (outputs[a >> 1u] >> (a & 1u)) & 1u; 48 | } 49 | 50 | void main() 51 | { 52 | if (gl_GlobalInvocationID.x < num) 53 | { 54 | uint lut_id = start_lut + gl_GlobalInvocationID.x; 55 | // apply LUT logic 56 | uint C = cfg [lut_id]; 57 | ivec4 a = addrs[lut_id]; 58 | uint i0 = get_output(a.x); 59 | uint i1 = get_output(a.y); 60 | uint i2 = get_output(a.z); 61 | uint i3 = get_output(a.w); 62 | uint sh = i3 | (i2 << 1) | (i1 << 2) | (i0 << 3); 63 | // get previous value, compute old/new 64 | uint outv = outputs[lut_id]; 65 | uint old_d = outv & 1u; 66 | uint new_d = (C >> sh) & 1u; 67 | // if different, assign 68 | if (old_d != new_d) { 69 | if (new_d == 1u) { 70 | outputs[lut_id] = outv | 1u; 71 | } else { 72 | outputs[lut_id] = outv & 0xfffffffeu; 73 | } 74 | } 75 | } 76 | } 77 | -------------------------------------------------------------------------------- /src/sh_visu.fp: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-09 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. 
Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #version 430 35 | 36 | in vec2 uv; 37 | out vec4 color; 38 | 39 | readonly layout(std430, binding = 2) buffer Buf2 { uint outputs[]; }; 40 | 41 | uniform int sqsz; 42 | uniform int num; 43 | uniform int depth0_end; 44 | 45 | void main() 46 | { 47 | int id = depth0_end + int(uv.x*sqsz) + int(uv.y*sqsz)*sqsz; 48 | ivec2 o = id < num ? 
ivec2(outputs[id]&1u,(outputs[id]>>1u)&1u) : ivec2(0,0); 49 | vec2 c = vec2(o.xy); 50 | color = vec4(c.x,c.yy,1.0); 51 | } 52 | -------------------------------------------------------------------------------- /src/sh_visu.vp: -------------------------------------------------------------------------------- 1 | // @sylefeb 2021-01-09 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | 34 | #version 430 35 | 36 | in vec4 mvf_vertex; 37 | out vec2 uv; 38 | 39 | void main() 40 | { 41 | uv = mvf_vertex.xy; 42 | gl_Position = vec4(mvf_vertex.xy*2.0-1.0,0.5,1.0); 43 | } 44 | -------------------------------------------------------------------------------- /src/silixel.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* --------------------------------------------------------------------- 3 | 4 | Main file, creates a small GUI (OpenGL+ImGUI) around a 5 | simulated design. If the design has VGA signals, displays the result 6 | using a texture. Allows selecting between GPU and CPU simulation. 7 | 8 | ----------------------------------------------------------------------- */ 9 | /* 10 | BSD 3-Clause License 11 | 12 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 13 | All rights reserved. 14 | 15 | Redistribution and use in source and binary forms, with or without 16 | modification, are permitted provided that the following conditions are met: 17 | 18 | 1. Redistributions of source code must retain the above copyright notice, this 19 | list of conditions and the following disclaimer. 20 | 21 | 2. Redistributions in binary form must reproduce the above copyright notice, 22 | this list of conditions and the following disclaimer in the documentation 23 | and/or other materials provided with the distribution. 24 | 25 | 3. Neither the name of the copyright holder nor the names of its 26 | contributors may be used to endorse or promote products derived from 27 | this software without specific prior written permission. 28 | 29 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 30 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 31 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 32 | DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 33 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 34 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 35 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 36 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 37 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 38 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 39 | */ 40 | // -------------------------------------------------------------- 41 | 42 | #include 43 | #include 44 | #include 45 | 46 | #include 47 | #include 48 | 49 | #include "read.h" 50 | #include "analyze.h" 51 | #include "simul_cpu.h" 52 | #include "simul_gpu.h" 53 | 54 | // -------------------------------------------------------------- 55 | 56 | #include 57 | #include 58 | 59 | // -------------------------------------------------------------- 60 | 61 | using namespace std; 62 | 63 | // -------------------------------------------------------------- 64 | 65 | #define SCREEN_W (640) // screen width and height 66 | #define SCREEN_H (480) 67 | 68 | // -------------------------------------------------------------- 69 | 70 | // Shader to visualize LUT outputs 71 | #include "sh_visu.h" 72 | AutoBindShader::sh_visu g_ShVisu; 73 | 74 | // Output ports 75 | map g_OutPorts; // name, LUT id and rank in g_OutPortsValues 76 | Array g_OutPortsValues; // output port values 77 | 78 | int g_Cycle = 0; 79 | 80 | vector g_step_starts; 81 | vector g_step_ends; 82 | vector g_luts; 83 | vector g_brams; 84 | map g_indices; 85 | vector g_ones; 86 | vector g_cpu_fanout; 87 | vector g_cpu_depths; 88 | vector g_cpu_outputs; 89 | vector g_cpu_computelists; 90 | 91 | AutoPtr g_Quad; 92 | GLTimer g_GPU_timer; 93 | 94 | bool g_Use_GPU = true; 95 | 96 | // -------------------------------------------------------------- 97 | 98 | bool designHasVGA() 99 | { 
100 | return (g_OutPorts.count("out_video_vs") > 0); 101 | } 102 | 103 | /* -------------------------------------------------------- */ 104 | 105 | ImageRGBA_Ptr g_Framebuffer; 106 | Tex2DRGBA_Ptr g_FramebufferTex; 107 | int g_X = 0; 108 | int g_Y = 0; 109 | int g_HS = 0; 110 | int g_VS = 0; 111 | double g_Hz = 0; 112 | double g_UsecPerCycle = 0; 113 | string g_OutPortString; 114 | int g_OutportCycle = 0; 115 | 116 | /* -------------------------------------------------------- */ 117 | 118 | void simulGPUNextWait() 119 | { 120 | g_GPU_timer.start(); 121 | while (1) { 122 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends); 123 | if (simulReadback_gpu()) break; 124 | } 125 | g_GPU_timer.stop(); 126 | auto ms = g_GPU_timer.waitResult(); 127 | g_Hz = (double)CYCLE_BUFFER_LEN / ((double)ms / 1000.0); 128 | g_UsecPerCycle = (double)ms * 1000.0 / (double)CYCLE_BUFFER_LEN; 129 | g_OutportCycle = 0; 130 | } 131 | 132 | /* -------------------------------------------------------- */ 133 | 134 | void simulGPUNext() 135 | { 136 | g_GPU_timer.start(); 137 | 138 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends); 139 | bool datain = simulReadback_gpu(); 140 | 141 | g_GPU_timer.stop(); 142 | auto ms = g_GPU_timer.waitResult(); 143 | g_Hz = (double)1 / ((double)ms / 1000.0); 144 | g_UsecPerCycle = (double)ms * 1000.0 / (double)1; 145 | 146 | if (datain) { 147 | g_OutportCycle = 0; 148 | } else { 149 | ++g_OutportCycle; 150 | } 151 | 152 | } 153 | 154 | /* -------------------------------------------------------- */ 155 | 156 | void updateFrame(int vs, int hs, int r, int g, int b) 157 | { 158 | if (vs) { 159 | if (hs) { 160 | if (g_X >= 48 && g_Y >= 34) { 161 | g_Framebuffer->pixel(g_X - 48, g_Y - 34) = v4b(r << 2, g << 2, b << 2, 255); 162 | } 163 | ++g_X; 164 | } else { 165 | g_X = 0; 166 | if (g_HS) { 167 | ++g_Y; 168 | g_FramebufferTex = Tex2DRGBA_Ptr(new Tex2DRGBA(g_Framebuffer->pixels())); 169 | } 170 | } 171 | } else { 172 | g_X = g_Y = 0; 173 | } 174 | g_VS = vs; 175 | 
g_HS = hs; 176 | } 177 | 178 | /* -------------------------------------------------------- */ 179 | 180 | void simulGPU() 181 | { 182 | if (designHasVGA()) { // design has VGA output, display it 183 | simulGPUNextWait(); // simulates a number of cycles and waits for the readback 184 | // read the output of the simulated cycles 185 | ForIndex(cy, CYCLE_BUFFER_LEN) { 186 | int offset = cy * (int)g_OutPorts.size(); 187 | int vs = g_OutPortsValues[offset + g_OutPorts["out_video_vs"][0]]; 188 | int hs = g_OutPortsValues[offset + g_OutPorts["out_video_hs"][0]]; 189 | int r = 0; 190 | ForIndex(i, 6) { 191 | r = r | ((g_OutPortsValues[offset + g_OutPorts["out_video_r[" + to_string(i) + "]"][0]]) << i); 192 | } 193 | int g = 0; 194 | ForIndex(i, 6) { 195 | g = g | ((g_OutPortsValues[offset + g_OutPorts["out_video_g[" + to_string(i) + "]"][0]]) << i); 196 | } 197 | int b = 0; 198 | ForIndex(i, 6) { 199 | b = b | ((g_OutPortsValues[offset + g_OutPorts["out_video_b[" + to_string(i) + "]"][0]]) << i); 200 | } 201 | updateFrame(vs, hs, r, g, b); 202 | } 203 | } else { // design has no VGA, show the output ports 204 | simulGPUNext(); // step one cycle 205 | // make the output string 206 | g_OutPortString = ""; 207 | int offset = g_OutportCycle * (int)g_OutPorts.size(); 208 | for (auto op : g_OutPorts) { 209 | g_OutPortString = (g_OutPortsValues[offset + op.second[0]] ?
"1" : "0") + g_OutPortString; 210 | } 211 | } 212 | } 213 | 214 | /* -------------------------------------------------------- */ 215 | 216 | uchar simulCPU_output(std::string o) 217 | { 218 | int pos = g_OutPorts.at(o)[1]; 219 | int lut = pos >> 1; 220 | int q_else_d = pos & 1; 221 | uchar bit = (g_cpu_outputs[lut] >> q_else_d) & 1; 222 | return bit; 223 | } 224 | 225 | /* -------------------------------------------------------- */ 226 | 227 | void simulCPU() 228 | { 229 | if (designHasVGA()) { 230 | // multiple steps 231 | int num_measures = 0; 232 | const int N_measures = 100; 233 | Elapsed el; 234 | while (num_measures++ < N_measures) { 235 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 236 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 237 | int vs = simulCPU_output("out_video_vs"); 238 | int hs = simulCPU_output("out_video_hs"); 239 | int r = 0; 240 | ForIndex(i, 6) { 241 | r = r | (simulCPU_output("out_video_r[" + to_string(i) + "]") << i); 242 | } 243 | int g = 0; 244 | ForIndex(i, 6) { 245 | g = g | (simulCPU_output("out_video_g[" + to_string(i) + "]") << i); 246 | } 247 | int b = 0; 248 | ForIndex(i, 6) { 249 | b = b | (simulCPU_output("out_video_b[" + to_string(i) + "]") << i); 250 | } 251 | updateFrame(vs, hs, r, g, b); 252 | } 253 | auto ms = el.elapsed(); 254 | g_Hz = (double)N_measures / ((double)ms / 1000.0); 255 | g_UsecPerCycle = (double)ms * 1000.0 / (double)N_measures; 256 | } else { 257 | // multiple steps 258 | int num_measures = 0; 259 | const int N_measures = 20; 260 | Elapsed el; 261 | while (num_measures++ < N_measures) { 262 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 263 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 264 | } 265 | auto ms = 
el.elapsed(); 266 | if (ms > 0) { 267 | g_Hz = (double)N_measures / ((double)ms / 1000.0); 268 | g_UsecPerCycle = (double)ms * 1000.0 / (double)N_measures; 269 | } else { 270 | g_Hz = -1; 271 | g_UsecPerCycle = -1; 272 | } 273 | // make the output string 274 | g_OutPortString = ""; 275 | for (auto op : g_OutPorts) { 276 | g_OutPortString = (simulCPU_output(op.first) ? "1" : "0") + g_OutPortString; 277 | } 278 | } 279 | } 280 | 281 | /* -------------------------------------------------------- */ 282 | 283 | void mainRender() 284 | { 285 | 286 | // simulate 287 | if (g_Use_GPU) { 288 | simulGPU(); 289 | } else { 290 | simulCPU(); 291 | } 292 | 293 | // basic rendering 294 | LibSL::GPUHelpers::clearScreen(LIBSL_COLOR_BUFFER | LIBSL_DEPTH_BUFFER, 0.2f, 0.2f, 0.2f); 295 | 296 | // render display 297 | if (designHasVGA()) { 298 | // -> texture for VGA display 299 | GLBasicPipeline::getUniqueInstance()->begin(); 300 | GLBasicPipeline::getUniqueInstance()->setProjection(orthoMatrixGL(0, 1, 1, 0, -1, 1)); 301 | GLBasicPipeline::getUniqueInstance()->setModelview(m4x4f::identity()); 302 | GLBasicPipeline::getUniqueInstance()->setColor(v4f(1)); 303 | if (!g_FramebufferTex.isNull()) { 304 | g_FramebufferTex->bind(); 305 | } 306 | GLBasicPipeline::getUniqueInstance()->enableTexture(); 307 | GLBasicPipeline::getUniqueInstance()->bindTextureUnit(0); 308 | g_Quad->render(); 309 | GLBasicPipeline::getUniqueInstance()->end(); 310 | } 311 | 312 | // render LUTs+FF 313 | if (g_Use_GPU) { 314 | GLProtectViewport vp; 315 | glViewport(0, 0, SCREEN_H*2/3, SCREEN_H*2/3); 316 | g_ShVisu.begin(); 317 | g_Quad->render(); 318 | g_ShVisu.end(); 319 | } 320 | 321 | // -> GUI 322 | ImGui::SetNextWindowSize(ImVec2(300, 150), ImGuiCond_Once); 323 | ImGui::Begin("Status"); 324 | ImGui::Checkbox("Simulate on GPU", &g_Use_GPU); 325 | if (g_Use_GPU && !g_brams.empty()) { 326 | cerr << "this design has BRAMs, currently unsupported on GPU\n"; 327 | g_Use_GPU = false; 328 | } 329 | ImGui::Text("%5.1f KHz 
%5.1f usec / cycle", g_Hz/1000.0, g_UsecPerCycle); 330 | ImGui::Text("simulated cycle: %6d", g_Cycle); 331 | ImGui::Text("simulated LUT4+FF %7d", (int)g_luts.size()); 332 | ImGui::Text("screen row %3d",g_Y); 333 | if (!g_OutPortString.empty()) { 334 | ImGui::Text("outputs: %s", g_OutPortString.c_str()); 335 | } 336 | ImGui::End(); 337 | 338 | SimpleUI::renderImGui(); 339 | } 340 | 341 | /* -------------------------------------------------------- */ 342 | 343 | int main(int argc, char **argv) 344 | { 345 | try { 346 | 347 | /// init simple UI (glut clone for both GL and D3D) 348 | cerr << "Init SimpleUI "; 349 | SimpleUI::init(SCREEN_W, SCREEN_H); 350 | SimpleUI::onRender = mainRender; 351 | cerr << "[OK]" << endl; 352 | 353 | /// bind imgui 354 | SimpleUI::bindImGui(); 355 | SimpleUI::initImGui(); 356 | SimpleUI::onReshape(SCREEN_W, SCREEN_H); 357 | 358 | glDisable(GL_DEPTH_TEST); 359 | glDisable(GL_CULL_FACE); 360 | 361 | /// help 362 | printf("[ESC] - quit\n"); 363 | 364 | /// display stuff 365 | g_Framebuffer = ImageRGBA_Ptr(new ImageRGBA(640,480)); 366 | g_Quad = AutoPtr(new GLMesh()); 367 | g_Quad->begin(GPUMESH_TRIANGLESTRIP); 368 | g_Quad->texcoord0_2(0, 0); g_Quad->vertex_2(0, 0); 369 | g_Quad->texcoord0_2(1, 0); g_Quad->vertex_2(1, 0); 370 | g_Quad->texcoord0_2(0, 1); g_Quad->vertex_2(0, 1); 371 | g_Quad->texcoord0_2(1, 1); g_Quad->vertex_2(1, 1); 372 | g_Quad->end(); 373 | 374 | /// GPU shaders init 375 | g_ShVisu.init(); 376 | 377 | /// GPU timer 378 | g_GPU_timer.init(); 379 | 380 | /// load up design 381 | vector > outbits; 382 | readDesign(SRC_PATH "/build/synth.blif", g_luts, g_brams, outbits, g_ones, g_indices); 383 | 384 | if (!g_brams.empty()) { 385 | g_Use_GPU = false; 386 | } 387 | 388 | analyze(g_luts, g_brams, outbits, g_indices, g_ones, g_step_starts, g_step_ends, g_cpu_depths); 389 | 390 | buildFanout(g_luts, g_cpu_fanout); 391 | 392 | int rank = 0; 393 | for (auto op : outbits) { 394 | g_OutPorts.insert(make_pair(op.first,v2i(rank++, 
op.second))); 395 | } 396 | g_OutPortsValues.allocate(rank * CYCLE_BUFFER_LEN); 397 | 398 | /// GPU buffers init 399 | simulInit_gpu(g_luts, g_ones); 400 | 401 | // init CPU simulation 402 | simulInit_cpu(g_luts, g_brams, g_step_starts, g_step_ends, g_ones, g_cpu_computelists, g_cpu_outputs); 403 | 404 | /// Quick benchmarking at startup 405 | #if 0 406 | // -> time GPU 407 | simulBegin_gpu(g_luts,g_step_starts,g_step_ends,g_ones); 408 | { 409 | ForIndex(trials, 3) { 410 | int n_cycles = 10000; 411 | g_GPU_timer.start(); 412 | ForIndex(cycle, n_cycles) { 413 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends); 414 | simulReadback_gpu(); 415 | ++g_Cycle; 416 | } 417 | g_GPU_timer.stop(); 418 | simulPrintOutput_gpu(outbits); 419 | auto ms = g_GPU_timer.waitResult(); 420 | printf("[GPU] %d msec, ~ %f Hz, cycle time: %f usec\n", 421 | (int)ms, 422 | (double)n_cycles / ((double)ms / 1000.0), 423 | (double)ms * 1000.0 / (double)n_cycles); 424 | } 425 | } 426 | simulEnd_gpu(); 427 | // -> time CPU 428 | { 429 | ForIndex(trials, 3) { 430 | Elapsed el; 431 | int n_cycles = 1000; 432 | ForIndex(cy, n_cycles) { 433 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 434 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs); 435 | } 436 | auto ms = el.elapsed(); 437 | printf("[CPU] %d msec, ~ %f Hz, cycle time: %f usec\n", 438 | (int)ms, 439 | (double)n_cycles / ((double)ms / 1000.0), 440 | (double)ms * 1000.0 / (double)n_cycles); 441 | } 442 | } 443 | #endif 444 | 445 | /// shader parameters 446 | g_ShVisu.begin(); 447 | int n_simul = (int)g_luts.size() - g_step_ends[0]; 448 | int sqsz = (int)sqrt((double)(n_simul)) + 1; 449 | fprintf(stderr, "simulating %d LUTs+FF (%dx%d pixels)", n_simul, sqsz, sqsz); 450 | g_ShVisu.sqsz .set(sqsz); 451 | g_ShVisu.num .set((int)(g_luts.size())); 452 | g_ShVisu.depth0_end.set((int)(g_step_ends[0])); 453 | 
g_ShVisu.end(); 454 | 455 | /// main loop 456 | simulBegin_gpu(g_luts, g_step_starts, g_step_ends, g_ones); 457 | SimpleUI::loop(); 458 | simulEnd_gpu(); 459 | 460 | /// clean exit 461 | simulTerminate_gpu(); 462 | g_ShVisu.terminate(); 463 | g_GPU_timer.terminate(); 464 | g_FramebufferTex = Tex2DRGBA_Ptr(); 465 | g_Quad = AutoPtr(); 466 | 467 | /// shutdown SimpleUI 468 | SimpleUI::shutdown(); 469 | 470 | } catch (Fatal& e) { 471 | cerr << e.message() << endl; 472 | return (-1); 473 | } 474 | 475 | return (0); 476 | } 477 | 478 | /* -------------------------------------------------------- */ 479 | -------------------------------------------------------------------------------- /src/silixel_cpu.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | // -------------------------------------------------------------- 34 | 35 | #include 36 | 37 | #include 38 | #include 39 | #include 40 | #include 41 | #include 42 | #include 43 | #include 44 | #include 45 | 46 | using namespace std; 47 | 48 | #include "simul_cpu.h" 49 | #include "blif.h" 50 | #include "fstapi/fstapi.h" 51 | #define FST_TS_S 0 52 | #define FST_TS_MS -3 53 | #define FST_TS_US -6 54 | #define FST_TS_NS -9 55 | #define FST_TS_PS -12 56 | 57 | // ----------------------------------------------------------------------------- 58 | 59 | typedef struct { 60 | string name; 61 | string base_name; 62 | int bit_index; 63 | int lut_index; 64 | fstHandle fst_handle; 65 | fstVarType fst_type; 66 | } t_watch; 67 | 68 | string base_name(string str) 69 | { 70 | size_t dot_pos = str.rfind('.'); 71 | size_t pos = str.find('['); 72 | if (pos != std::string::npos) { 73 | if (dot_pos != std::string::npos) { 74 | return str.substr(dot_pos + 1, pos - dot_pos - 1); 75 | } else { 76 | return str.substr(0, pos); 77 | } 78 | } else if (dot_pos != std::string::npos) { 79 | return str.substr(dot_pos+1); 80 | } else { 81 | return str; 82 | } 83 | } 84 | 85 | int index(string str) 86 | { 87 | size_t s = str.find('['); 88 | size_t e = str.find(']', s); 89 | if (s == std::string::npos || e == std::string::npos || s >= e - 1) { 90 | return -1; // no index 91 | } 92 | string istr = str.substr(s + 1, e - s - 1); 93 | return std::stoi(istr); 
94 | } 95 | 96 | t_watch& add_watch(string signal, const map& indices, vector &_watches) 97 | { 98 | auto I = indices.find(signal); 99 | if (I == indices.end()) { 100 | fprintf(stderr, " cannot find signal '%s' to watch\n", signal.c_str()); 101 | exit (-1); 102 | } 103 | t_watch w; 104 | w.name = signal; 105 | w.base_name = base_name(w.name); 106 | w.bit_index = index(w.name); 107 | w.lut_index = I->second; 108 | w.fst_handle = 0; 109 | w.fst_type = FST_VT_VCD_WIRE; 110 | _watches.push_back(w); 111 | return _watches.back(); 112 | } 113 | 114 | void setFstScope(fstWriterContext *fst, string signal) 115 | { 116 | vector path; 117 | split(signal, '.', path); 118 | if (!path.empty()) path.pop_back(); 119 | for (auto node : path) { 120 | fstWriterSetScope(fst, FST_ST_VCD_MODULE, node.c_str(), NULL); 121 | } 122 | } 123 | 124 | void unsetFstScope(fstWriterContext *fst, string signal) 125 | { 126 | vector path; 127 | split(signal, '.', path); 128 | if (!path.empty()) path.pop_back(); 129 | for (auto node : path) { 130 | fstWriterSetUpscope(fst); 131 | } 132 | } 133 | 134 | 135 | // ----------------------------------------------------------------------------- 136 | 137 | const char *c_ClockAnim[] = { 138 | " _____ \n", 139 | " _____/ \\ ", 140 | " _____ \n", 141 | " ____/ \\_ ", 142 | " _____ \n", 143 | " ___/ \\__ ", 144 | " _____ \n", 145 | " __/ \\___ ", 146 | " _____ \n", 147 | " _/ \\____ ", 148 | " _____ \n", 149 | " / \\_____ ", 150 | " _____ \n", 151 | " \\_____/ ", 152 | " ____ _ \n", 153 | " \\_____/ ", 154 | " ___ __ \n", 155 | " \\_____/ ", 156 | " __ ___ \n", 157 | " \\_____/ ", 158 | " _ ____ \n", 159 | " \\_____/ ", 160 | " _____ \n", 161 | " \\_____/ ", 162 | }; 163 | 164 | 165 | int main(int argc,const char **argv) 166 | { 167 | bool silice_design = false; 168 | int num_cycles = 10000; 169 | const char *blif_path = SRC_PATH "/build/synth.blif"; 170 | 171 | fprintf(stderr, "<<<====----- Silixel v0.1 by @sylefeb -----====>>>\n"); 172 | 173 | /// parse 
options 174 | int i = 1; 175 | while (i < argc) { 176 | if (strcmp(argv[i], "--silice") == 0) { 177 | silice_design = true; 178 | ++i; 179 | } else if (strcmp(argv[i], "--cycles") == 0) { 180 | if (i + 1 == argc) { 181 | fprintf(stderr, "--cycles expects a parameter (integer, number of cycles to simulate)\n"); 182 | exit(-1); 183 | } 184 | ++i; 185 | num_cycles = atoi(argv[i]); 186 | ++i; 187 | } else if (strcmp(argv[i], "--blif") == 0) { 188 | if (i + 1 == argc) { 189 | fprintf(stderr, "--blif expects a parameter (string, file to load)\n"); 190 | exit(-1); 191 | } 192 | ++i; 193 | blif_path = argv[i]; 194 | ++i; 195 | } else { ++i; } 196 | } 197 | 198 | /// checks 199 | { 200 | FILE *f = 0; 201 | fopen_s(&f, blif_path, "rb"); 202 | if (f == NULL) { 203 | fprintf(stderr, " cannot open input blif file %s\n", blif_path); 204 | exit(-1); 205 | } else { 206 | fclose(f); 207 | } 208 | } 209 | 210 | /// load up design 211 | vector luts; 212 | std::vector brams; 213 | vector > outbits; 214 | vector ones; 215 | map indices; 216 | readDesign(blif_path, luts, brams, outbits, ones, indices); 217 | 218 | vector step_starts; 219 | vector step_ends; 220 | vector depths; 221 | analyze(luts, brams, outbits, indices, ones, step_starts, step_ends, depths); 222 | 223 | vector fanout; 224 | buildFanout(luts, fanout); 225 | 226 | /// add reset to init to ones 227 | bool has_reset = indices.count("reset") > 0; 228 | if (has_reset) { 229 | ones.push_back(indices.at("reset")); 230 | } 231 | 232 | /// simulate 233 | vector outputs; 234 | vector computelists; 235 | simulInit_cpu(luts, brams, step_starts, step_ends, ones, computelists, outputs); 236 | 237 | /// automatically add all outputs as watches 238 | vector watches; 239 | if (silice_design) { 240 | // selection specialized to a silice design 241 | for (auto signal : indices) { 242 | if (signal.first.substr(0, 3) == "out") { 243 | auto &w = add_watch(signal.first, indices, watches); 244 | w.fst_type = FST_VT_VCD_WIRE; 245 | } else if 
(signal.first.find("_q_") != std::string::npos) { 246 | auto &w = add_watch(signal.first, indices, watches); 247 | w.fst_type = FST_VT_VCD_REG; 248 | } 249 | } 250 | } else { 251 | // selection for any other design 252 | for (auto signal : indices) { 253 | if (signal.first[0] != '$') { 254 | auto &w = add_watch(signal.first, indices, watches); 255 | w.fst_type = FST_VT_VCD_WIRE; 256 | } 257 | } 258 | } 259 | if (has_reset) { 260 | add_watch("reset", indices, watches); 261 | } 262 | 263 | LibSL::CppHelpers::Console::clear(); 264 | LibSL::CppHelpers::Console::pushCursor(); 265 | fprintf(stderr, " _____\n"); 266 | fprintf(stderr, " init_/ "); 267 | // simulPrintOutput_cpu(outputs, outbits); 268 | 269 | // FST trace 270 | fstWriterContext *fst = fstWriterCreate("./trace.fst", 1); 271 | if (fst == NULL) { 272 | fprintf(stderr,"cannot open trace.fst for writing\n"); 273 | exit (-1); 274 | } 275 | fstWriterSetTimescale(fst, 1); 276 | fstWriterSetScope(fst, FST_ST_VCD_MODULE, "top", NULL); 277 | // -> group individual bits 278 | map bitcounts; 279 | for (auto &w : watches) { 280 | bitcounts[w.base_name] = max(bitcounts[w.base_name], index(w.name)+1); 281 | } 282 | set added; 283 | for (auto& w : watches) { 284 | if (!added.count(w.base_name)) { 285 | added.insert(w.base_name); 286 | setFstScope(fst, w.name); 287 | w.fst_handle = fstWriterCreateVar(fst, (enum fstVarType)w.fst_type, FST_VD_IMPLICIT, max(1,bitcounts[w.base_name]), w.base_name.c_str(), NULL); 288 | unsetFstScope(fst, w.name); 289 | } 290 | } 291 | auto fst_clock = fstWriterCreateVar(fst, FST_VT_VCD_REG, FST_VD_IMPLICIT, 1, "clock", NULL); 292 | 293 | LibSL::CppHelpers::Console::popCursor(); 294 | LibSL::CppHelpers::Console::pushCursor(); 295 | 296 | int anim = 0; 297 | Every ev(100); 298 | 299 | int cycles = 0; 300 | while (num_cycles == -1 || cycles < num_cycles) { 301 | 302 | if (has_reset) { 303 | if (cycles < 16) { 304 | simulSetSignal_cpu(indices.at("reset"), true, depths, (int)step_starts.size(), fanout,
computelists, outputs); 305 | } else if (cycles == 16) { 306 | simulSetSignal_cpu(indices.at("reset"), false, depths, (int)step_starts.size(), fanout, computelists, outputs); 307 | } 308 | } 309 | 310 | fstWriterEmitTimeChange(fst, (cycles << 1) + 0); 311 | fstWriterEmitValueChange(fst, fst_clock, "0"); 312 | simulCycle_cpu(luts, brams, depths, step_starts, step_ends, fanout, computelists, outputs); 313 | 314 | fstWriterEmitTimeChange(fst, (cycles << 1) + 1); 315 | fstWriterEmitValueChange(fst, fst_clock, "1"); 316 | simulPosEdge_cpu(luts, depths, (int)step_starts.size(), fanout, computelists, outputs); 317 | 318 | int console_out = ev.expired(); 319 | 320 | if (console_out) { 321 | LibSL::CppHelpers::Console::popCursor(); 322 | LibSL::CppHelpers::Console::pushCursor(); 323 | int a = anim % 12; 324 | fputs(c_ClockAnim[a * 2 + 0], stderr); 325 | fputs(c_ClockAnim[a * 2 + 1], stderr); 326 | ++anim; 327 | if (num_cycles > -1) { 328 | fprintf(stderr, " (%7d cycles, %3d%% completed)\n", cycles, (int)(100LL * cycles / num_cycles)); 329 | } else { 330 | fprintf(stderr, "\n"); 331 | } 332 | } 333 | 334 | // print and trace watches 335 | map values; 336 | for (auto w : watches) { 337 | int b = w.lut_index; 338 | int lut = b >> 1; 339 | int q_else_d = b & 1; 340 | int bit = (outputs[lut] >> q_else_d) & 1; 341 | if (w.bit_index > -1) { 342 | if (values[w.base_name].empty()) { 343 | values[w.base_name].resize(bitcounts[w.base_name], '0'); 344 | } 345 | values[w.base_name][bitcounts[w.base_name]-1-w.bit_index] = bit ? '1' : '0'; 346 | } else { 347 | values[w.base_name] = bit ?
"1" : "0"; 348 | } 349 | } 350 | set added; 351 | const int max_display = 16; 352 | int num_display = 0; 353 | for (auto w : watches) { 354 | if (!added.count(w.base_name)) { 355 | added.insert(w.base_name); 356 | fstWriterEmitValueChange(fst, w.fst_handle, values[w.base_name].c_str()); 357 | if (console_out && num_display < max_display) { 358 | fprintf(stderr, "%-40s %s\n", w.base_name.c_str(), values[w.base_name].c_str()); 359 | ++num_display; 360 | } 361 | } 362 | } 363 | 364 | ++cycles; 365 | // Sleep(500); /// slow down on purpose 366 | } 367 | 368 | fstWriterClose(fst); 369 | 370 | fprintf(stderr, "\n\noutput: trace.fst\n\n"); 371 | 372 | return 0; 373 | } 374 | 375 | // ----------------------------------------------------------------------------- 376 | -------------------------------------------------------------------------------- /src/simul_cpu.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 
21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #include 35 | 36 | #include 37 | #include 38 | #include 39 | #include 40 | #include 41 | #include 42 | #include 43 | #include 44 | #include 45 | #include 46 | 47 | using namespace std; 48 | 49 | #include "read.h" 50 | #include "blif.h" 51 | 52 | // ----------------------------------------------------------------------------- 53 | 54 | // forward def 55 | void simulPosEdgeAll_cpu(const vector& luts, vector& _outputs); 56 | 57 | // ----------------------------------------------------------------------------- 58 | 59 | static inline void simulLUT_cpu( 60 | int l, 61 | const vector& luts, 62 | vector& _outputs) 63 | { 64 | // read inputs 65 | unsigned short cfg_idx = 0; 66 | for (int i = 0; i < 4; ++i) { 67 | if (luts[l].inputs[i] > -1) { 68 | int lut = luts[l].inputs[i] >> 1; 69 | int q_else_d = luts[l].inputs[i] & 1; 70 | uchar bit = (_outputs[lut] >> q_else_d) & 1; 71 | cfg_idx |= bit ? 
(1 << (3 - i)) : 0; 72 | } 73 | } 74 | // update outputs 75 | uchar new_value = (luts[l].cfg >> cfg_idx) & 1; 76 | if (new_value) _outputs[l] |= 1; 77 | else _outputs[l] &= 0xfffffffe; 78 | } 79 | 80 | // ----------------------------------------------------------------------------- 81 | 82 | // add LUT fanout to the compute lists 83 | static inline void addFanout( 84 | int l, 85 | int q_else_d, 86 | const vector& depths, 87 | int numdepths, 88 | const vector& fanout, 89 | vector& _computelists, 90 | vector& _outputs 91 | ) { 92 | int cur = fanout[l]; 93 | int other = fanout[cur]; 94 | while (other != -1) { 95 | int other_lut = other >> 1; 96 | if (q_else_d == (other&1)) { // other uses D/Q input 97 | if ((_outputs[other_lut] & 4) == 0) { // not yet inserted 98 | _outputs[other_lut] |= 4; // tag as inserted 99 | // insert in comb. depth compute list 100 | int dpt = depths[other_lut]; 101 | int cls = _computelists[dpt]; 102 | int idx = _computelists[cls]++; 103 | _computelists[cls + 1 + idx] = other_lut; 104 | } 105 | } 106 | ++cur; 107 | other = fanout[cur]; 108 | } 109 | } 110 | 111 | // ----------------------------------------------------------------------------- 112 | 113 | static inline void simulLUT_cpu( 114 | int l, 115 | const vector& luts, 116 | const vector& depths, 117 | int numdepths, 118 | const vector& fanout, 119 | vector& _computelists, 120 | vector& _outputs) 121 | { 122 | // skip externals 123 | if (luts[l].external) { 124 | return; 125 | } 126 | // read inputs 127 | unsigned short cfg_idx = 0; 128 | for (int i = 0; i < 4; ++i) { 129 | if (luts[l].inputs[i] > -1) { 130 | int lut = luts[l].inputs[i] >> 1; 131 | int q_else_d = luts[l].inputs[i] & 1; 132 | uchar bit = (_outputs[lut] >> q_else_d)&1; 133 | cfg_idx |= bit ? 
(1 << (3 - i)) : 0; 134 | } 135 | } 136 | // update outputs 137 | uchar new_value = (luts[l].cfg >> cfg_idx) & 1; 138 | if ((_outputs[l]&1) != new_value) { 139 | if (new_value) _outputs[l] |= 1; 140 | else _outputs[l] &= ~1; 141 | // fprintf(stderr, "LUT %d changed (new:%d)\n",l<<1,new_value); 142 | // add fanout to compute list 143 | addFanout(l, 0, depths, numdepths, fanout, _computelists, _outputs); 144 | // add this LUT to posedge list 145 | if ((_outputs[l] & 8) == 0) { // not yet inserted 146 | _outputs[l] |= 8; // tag as inserted 147 | // insert in posedge compute list 148 | int dpt = numdepths; 149 | int cls = _computelists[dpt]; 150 | int idx = _computelists[cls]++; 151 | _computelists[cls + 1 + idx] = l; 152 | } 153 | } 154 | // reset inserted flag (preserve posedge flag) 155 | _outputs[l] &= 3|8; 156 | } 157 | 158 | // ----------------------------------------------------------------------------- 159 | 160 | void simulBRAMS_cpu( 161 | vector& _brams, 162 | const vector& depths, 163 | int numdepths, 164 | const vector& fanout, 165 | vector& _computelists, 166 | vector& _outputs) 167 | { 168 | // process BRAMs 169 | for (auto &bram : _brams) { 170 | // make rd_addr 171 | uint rd_addr = 0; 172 | for (int i=0;i < bram.rd_addr.size();++i) { 173 | int b = bram.rd_addr[i]; 174 | int lut = b >> 1; 175 | int q_else_d = b & 1; 176 | uint bit = ((_outputs[lut] >> q_else_d) & 1) ? 1 : 0; 177 | rd_addr = rd_addr | (bit << i); 178 | } 179 | // make wr_addr 180 | uint wr_addr = 0; 181 | for (int i=0;i < bram.wr_addr.size();++i) { 182 | int b = bram.wr_addr[i]; 183 | int lut = b >> 1; 184 | int q_else_d = b & 1; 185 | uint bit = ((_outputs[lut] >> q_else_d) & 1) ? 
1 : 0; 186 | wr_addr = wr_addr | (bit << i); 187 | } 188 | // DEBUG 189 | uint32_t dbg_rdata = 0; 190 | uint32_t dbg_wdata = 0; 191 | uint32_t dbg_wen = 0; 192 | for (int i=0;i < bram.rd_data.size();++i) { 193 | int o = bram.data.bitsize() - (int)(rd_addr * bram.rd_data.size()); 194 | bool bitr = bram.data.get(o - 1 - i); 195 | dbg_rdata = dbg_rdata | ((bitr?1:0) << i); 196 | if (!bram.wr_data.empty()) { 197 | int bw = bram.wr_data[i]; 198 | uint bit_w = (_outputs[bw >> 1] >> (bw & 1)) & 1; 199 | dbg_wdata = dbg_wdata | ((bit_w?1:0) << i); 200 | int be = bram.wr_en[i]; 201 | uint bit_e = (_outputs[be >> 1] >> (be & 1)) & 1; 202 | dbg_wen = dbg_wen | ((bit_e?1:0) << i); 203 | } 204 | } 205 | // fetch and set rd_data 206 | for (int i=0;i < bram.rd_data.size();++i) { 207 | int o = bram.data.bitsize() - (int)(rd_addr * bram.rd_data.size()); 208 | bool bit = bram.data.get(o - 1 - i); 209 | int b = bram.rd_data[i]; 210 | int lut = b >> 1; 211 | int q_else_d = b & 1; 212 | if (bit) { 213 | _outputs[lut] |= 0b11; 214 | } else { 215 | _outputs[lut] &= ~0b11; 216 | } 217 | // add fanout to compute list 218 | addFanout(lut, 0, depths, numdepths, fanout, _computelists, _outputs); 219 | addFanout(lut, 1, depths, numdepths, fanout, _computelists, _outputs); 220 | } 221 | // get wr_data and store 222 | if (!bram.wr_data.empty()) { 223 | for (int i=0;i < bram.wr_data.size();++i) { 224 | int o = bram.data.bitsize() - (int)(wr_addr * bram.rd_data.size()); 225 | int bw = bram.wr_data[i]; 226 | uint bit_w = (_outputs[bw >> 1] >> (bw & 1)) & 1; 227 | int be = bram.wr_en[i]; 228 | uint bit_e = (_outputs[be >> 1] >> (be & 1)) & 1; 229 | if (bit_e) { 230 | bram.data.set(o - 1 - i, bit_w != 0); 231 | } 232 | } 233 | } 234 | // report 235 | // fprintf(stderr, "- bram %s @%08x = %08x w:@%08x=%08x(%08x)\n", bram.name.c_str(), rd_addr, dbg_rdata, wr_addr, dbg_wdata, dbg_wen); 236 | } 237 | } 238 | 239 | // ----------------------------------------------------------------------------- 240 | 241 
| void simulInit_cpu( 242 | const vector& luts, 243 | vector& _brams, 244 | const vector& step_starts, 245 | const vector& step_ends, 246 | const vector& ones, 247 | vector& _computelists, 248 | vector& _outputs) 249 | { 250 | _outputs.resize(luts.size(),0); 251 | // initialize ones 252 | for (int o = 0; o < ones.size(); ++o) { 253 | _outputs[ones[o] >> 1] |= 1 << (ones[o] & 1); 254 | } 255 | // resolve const cells 256 | for (int cy = 0; cy < 2; ++cy) { // those which are const, and then those that only depend on consts 257 | for (int l = step_starts[0]; l <= step_ends[0]; ++l) { 258 | simulLUT_cpu(l, luts, _outputs); 259 | } 260 | simulPosEdgeAll_cpu(luts, _outputs); 261 | } 262 | // initialize ones 263 | // Why a second time? Some of these registers may have been cleared after const resolve 264 | for (int o = 0; o < ones.size(); ++o) { 265 | _outputs[ones[o] >> 1] |= 1 << (ones[o] & 1); 266 | } 267 | // computelists 268 | int cpl_sz = (int)step_starts.size()+1; // header, one index per depth + 1 for posedge 269 | for (int d = 0; d < step_starts.size(); ++d) { 270 | int num = step_ends[d] - step_starts[d] + 1; // max entries in list for this depth 271 | cpl_sz += num + 1; // +1 for list length 272 | } 273 | cpl_sz += (int)luts.size() + 1; // final list for posedge 274 | _computelists.resize(cpl_sz,0); 275 | int offset = (int)step_starts.size()+1; // header, one index per depth + 1 for posedge 276 | for (int d = 0; d < step_starts.size(); ++d) { 277 | _computelists[d] = offset; // header, start of list (first entry is length) 278 | int num = step_ends[d] - step_starts[d] + 1; // max entries in list for this depth 279 | offset += num + 1; // +1 for list length 280 | } 281 | _computelists[step_starts.size()] = offset; // final list for posedge 282 | // -> initially we put all LUTs in the computelist 283 | for (int d = 0; d < (int)step_starts.size() ; ++d) { 284 | int cls = _computelists[d]; 285 | // fill-in list 286 | for (int l = step_starts[d]; l <= step_ends[d]; 
++l) { 287 | int idx = _computelists[cls]++; 288 | sl_assert(cls + 1 + idx < _computelists.size()); 289 | _computelists[cls + 1 + idx] = l; 290 | } 291 | } 292 | { 293 | int cls = _computelists[step_starts.size()]; 294 | for (int l = 0; l < luts.size(); ++l) { 295 | int idx = _computelists[cls]++; 296 | sl_assert(cls + 1 + idx < _computelists.size()); 297 | _computelists[cls + 1 + idx] = l; 298 | } 299 | } 300 | // -> we tag all LUTs as being inserted already 301 | for (int l = 0; l < luts.size(); ++l) { 302 | _outputs[l] |= 4 | 8; 303 | } 304 | } 305 | 306 | // ----------------------------------------------------------------------------- 307 | 308 | void simulCycle_cpu( 309 | const vector& luts, 310 | vector& _brams, 311 | const vector& depths, 312 | const vector& step_starts, 313 | const vector& step_ends, 314 | const vector& fanout, 315 | vector& _computelists, 316 | vector& _outputs) 317 | { 318 | // BRAMs 319 | simulBRAMS_cpu(_brams, depths, (int)step_starts.size(), fanout, _computelists, _outputs); 320 | // process LUTs 321 | for (int depth = 0; depth < step_starts.size(); ++depth) { 322 | int cls = _computelists[depth]; 323 | int num = _computelists[cls]; 324 | // cerr << sprint("depth: %5d, num: %5d\n", depth, num); 325 | for (int n = 0; n < num ; ++n) { 326 | int l = _computelists[cls + 1 + n]; 327 | simulLUT_cpu(l, luts, depths, (int)step_starts.size(), fanout, _computelists, _outputs); 328 | } 329 | // clear compute list for this depth 330 | _computelists[cls] = 0; 331 | } 332 | } 333 | 334 | // ----------------------------------------------------------------------------- 335 | 336 | void simulPosEdgeAll_cpu( 337 | const vector& luts, 338 | vector& _outputs) 339 | { 340 | for (int l = 0; l < luts.size(); ++l) { 341 | uchar d = _outputs[l] & 1; 342 | uchar q = (_outputs[l] >> 1) & 1; 343 | if (d != q) { 344 | if (d) { 345 | _outputs[l] |= 2; 346 | } else { 347 | _outputs[l] &= 0xfffffffd; 348 | } 349 | } 350 | } 351 | } 352 | 353 | // 
----------------------------------------------------------------------------- 354 | 355 | void simulPosEdge_cpu( 356 | const vector& luts, 357 | const vector& depths, 358 | int numdepths, 359 | const vector& fanout, 360 | vector& _computelists, 361 | vector& _outputs) 362 | { 363 | int cls = _computelists[numdepths]; 364 | // process LUTs 365 | int num = _computelists[cls]; 366 | // cerr << sprint("posedge num: %5d\n", num); 367 | for (int n = 0; n < num; ++n) { 368 | int l = _computelists[cls + 1 + n]; 369 | uchar d = _outputs[l] & 1; 370 | uchar q = (_outputs[l] >> 1) & 1; 371 | if (d != q) { 372 | if (d) { 373 | _outputs[l] |= 2; 374 | } else { 375 | _outputs[l] &= 0xfffffffd; 376 | } 377 | // add fanout to compute list 378 | addFanout(l, 1, depths, numdepths, fanout, _computelists, _outputs); 379 | } 380 | // reset inserted flag 381 | _outputs[l] &= 7; 382 | } 383 | // clear compute list 384 | _computelists[cls] = 0; 385 | } 386 | 387 | // ----------------------------------------------------------------------------- 388 | 389 | void simulPrintOutput_cpu( 390 | const vector& outputs, 391 | const vector >& outbits) 392 | { 393 | // display result 394 | int val = 0; 395 | string str; 396 | for (int b = 0; b < outbits.size() ; b++) { 397 | int lut = outbits[b].second >> 1; 398 | int q_else_d = outbits[b].second & 1; 399 | uchar bit = (outputs[lut] >> q_else_d) & 1; 400 | str = (bit ? 
"1" : "0") + str; 401 | val += bit << b; 402 | } 403 | fprintf(stderr,"b%s (d%03d h%03x) \n",str.c_str(),val,val); 404 | } 405 | 406 | // ----------------------------------------------------------------------------- 407 | 408 | void simulSetSignal_cpu( 409 | int sig, 410 | bool v, 411 | const vector& depths, 412 | int numdepths, 413 | const vector& fanout, 414 | vector& _computelists, 415 | vector& _outputs 416 | ) { 417 | int b = sig; 418 | int lut = b >> 1; 419 | // set D 420 | if (v) { 421 | _outputs[lut] |= 1; 422 | } else { 423 | _outputs[lut] &= ~1; 424 | } 425 | // add fanout to compute list 426 | addFanout(lut, 0, depths, numdepths, fanout, _computelists, _outputs); 427 | } 428 | 429 | // ----------------------------------------------------------------------------- 430 | -------------------------------------------------------------------------------- /src/simul_cpu.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 
21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #pragma once 35 | 36 | #include 37 | #include 38 | using namespace std; 39 | 40 | #include "read.h" 41 | #include "analyze.h" 42 | 43 | void simulInit_cpu( 44 | const vector& luts, 45 | vector& _brams, 46 | const vector& step_starts, 47 | const vector& step_ends, 48 | const vector& ones, 49 | vector& _computelists, 50 | vector& _outputs); 51 | 52 | void simulCycle_cpu( 53 | const vector& luts, 54 | vector& _brams, 55 | const vector& depths, 56 | const vector& step_starts, 57 | const vector& step_ends, 58 | const vector& fanout, 59 | vector& _computelists, 60 | vector& _outputs); 61 | 62 | void simulPosEdge_cpu( 63 | const vector& luts, 64 | const vector& depths, 65 | int numdepths, 66 | const vector& fanout, 67 | vector& _computelists, 68 | vector& _outputs); 69 | 70 | void simulPrintOutput_cpu( 71 | const vector& outputs, 72 | const vector >& outbits); 73 | 74 | 75 | void simulSetSignal_cpu( 76 | int sig, 77 | bool v, 78 | const vector& depths, 79 | int numdepths, 80 | const vector& fanout, 81 | vector& _computelists, 82 | vector& _outputs); 83 | -------------------------------------------------------------------------------- /src/simul_gpu.cc: 
-------------------------------------------------------------------------------- 1 | // @sylefeb 2022-02-11 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
32 | */ 33 | // -------------------------------------------------------------- 34 | 35 | #include 36 | #include 37 | 38 | #include "read.h" 39 | #include "analyze.h" 40 | #include "simul_gpu.h" 41 | 42 | using namespace std; 43 | 44 | // -------------------------------------------------------------- 45 | 46 | // Ah, some good old globals and externs 47 | // defined in silixel.cc 48 | extern map g_OutPorts; 49 | extern Array g_OutPortsValues; 50 | extern int g_Cycle; 51 | 52 | // -------------------------------------------------------------- 53 | 54 | // Oh, some more globals, sure! 55 | // All shaders 56 | AutoBindShader::sh_simul g_ShSimul; // simulates a sub-cycle step 57 | AutoBindShader::sh_posedge g_ShPosEdge; // simulates posedge 58 | AutoBindShader::sh_outports g_ShOutPorts;// fills output ports 59 | AutoBindShader::sh_init g_ShInit; // init LUTs with ones 60 | 61 | // -------------------------------------------------------------- 62 | 63 | using namespace std; 64 | 65 | // -------------------------------------------------------------- 66 | 67 | // More?? Of course, why not? 
68 | GLBuffer g_LUTs_Cfg; // uint, one per LUT (NOTE: 16 bits are used, could pack) 69 | GLBuffer g_LUTs_Addrs; // uint, four per LUT (NOTE: 24 bits per addr is enough, could pack on three) 70 | GLBuffer g_LUTs_Outputs; // uint, one per LUT, bit 0 (D) bit 1 (Q) bit 2 (dirty) 71 | GLBuffer g_GPU_OutPortsLocs; // uint, per outport 72 | GLBuffer g_GPU_OutPortsVals; // uint, per outport * CYCLE_BUFFER_LEN (NOTE: total overkill, could be one bit) 73 | GLBuffer g_GPU_OutInits; // uint, one per output to initialize 74 | 75 | // -------------------------------------------------------------- 76 | const int G = 128; 77 | // -------------------------------------------------------------- 78 | 79 | void simulInit_gpu(const vector& luts,const vector& ones) 80 | { 81 | g_ShSimul.init(); 82 | g_ShPosEdge.init(); 83 | g_ShOutPorts.init(); 84 | g_ShInit.init(); 85 | 86 | int n_luts = (int)luts.size(); 87 | n_luts += ( (n_luts & (G - 1)) ? (G - (n_luts & (G - 1))) : 0 ); 88 | g_LUTs_Cfg .init( n_luts * sizeof(uint), GL_SHADER_STORAGE_BUFFER); 89 | g_LUTs_Addrs .init((n_luts << 2) * sizeof(uint), GL_SHADER_STORAGE_BUFFER); 90 | g_LUTs_Outputs.init( n_luts * sizeof(uint), GL_SHADER_STORAGE_BUFFER); 91 | g_GPU_OutPortsVals.init((int)g_OutPorts.size() * sizeof(uint) * CYCLE_BUFFER_LEN, GL_SHADER_STORAGE_BUFFER); 92 | g_GPU_OutPortsLocs.init((int)g_OutPorts.size() * sizeof(uint), GL_SHADER_STORAGE_BUFFER); 93 | g_GPU_OutInits.init((int)ones.size() * sizeof(uint), GL_SHADER_STORAGE_BUFFER); 94 | 95 | // we initialize all outputs to zero 96 | { 97 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_LUTs_Outputs.glId()); 98 | int *ptr = (int*)glMapBufferARB(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY); 99 | memset(ptr, 0x00, g_LUTs_Outputs.size()); 100 | glUnmapBufferARB(GL_SHADER_STORAGE_BUFFER); 101 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 102 | } 103 | // initialize the static LUT table 104 | // -> configs 105 | { 106 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_LUTs_Cfg.glId()); 107 | int 
*ptr = (int*)glMapBufferARB(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY); 108 | ForIndex(l, (int)luts.size()) { 109 | ptr[l] = (int)luts[l].cfg; 110 | } 111 | glUnmapBufferARB(GL_SHADER_STORAGE_BUFFER); 112 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 113 | } 114 | // -> addrs 115 | { 116 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_LUTs_Addrs.glId()); 117 | int *ptr = (int*)glMapBufferARB(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY); 118 | ForIndex(l, (int)luts.size()) { 119 | ForIndex(i, 4) { 120 | ptr[(l<<2)+i] = max(0,(int)luts[l].inputs[i]); 121 | } 122 | } 123 | glUnmapBufferARB(GL_SHADER_STORAGE_BUFFER); 124 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 125 | } 126 | // -> outport locations 127 | { 128 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_GPU_OutPortsLocs.glId()); 129 | int *ptr = (int*)glMapBufferARB(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY); 130 | for (auto op : g_OutPorts) { 131 | ptr[op.second[0]] = op.second[1]; 132 | } 133 | glUnmapBufferARB(GL_SHADER_STORAGE_BUFFER); 134 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 135 | } 136 | // -> initialized outputs 137 | { 138 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_GPU_OutInits.glId()); 139 | int *ptr = (int*)glMapBufferARB(GL_SHADER_STORAGE_BUFFER, GL_WRITE_ONLY); 140 | ForIndex(o,ones.size()) { ptr[o] = ones[o]; } 141 | glUnmapBufferARB(GL_SHADER_STORAGE_BUFFER); 142 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 143 | } 144 | glMemoryBarrier(GL_ALL_BARRIER_BITS); 145 | } 146 | 147 | /* -------------------------------------------------------- */ 148 | 149 | void simulBegin_gpu( 150 | const vector& luts, 151 | const vector& step_starts, 152 | const vector& step_ends, 153 | const vector& ones) 154 | { 155 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, g_LUTs_Cfg.glId()); 156 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, g_LUTs_Addrs.glId()); 157 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, g_LUTs_Outputs.glId()); 158 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 3, 
g_GPU_OutPortsLocs.glId()); 159 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 4, g_GPU_OutPortsVals.glId()); 160 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 5, g_GPU_OutInits.glId()); 161 | // init cells 162 | g_ShInit.begin(); 163 | g_ShInit.run(v3i((int)ones.size(),1,1)); 164 | g_ShInit.end(); 165 | // resolve constant cells 166 | ForIndex (c,2) { 167 | int n = step_ends[0] - step_starts[0] + 1; 168 | g_ShSimul.begin(); 169 | g_ShSimul.start_lut.set((uint)0); 170 | g_ShSimul.num.set((uint)n); 171 | g_ShSimul.run(v3i((n / G) + ((n & (G - 1)) ? 1 : 0), 1, 1)); 172 | g_ShSimul.end(); 173 | glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT); 174 | g_ShPosEdge.begin(); 175 | n = (int)luts.size(); 176 | g_ShPosEdge.num.set((uint)n); 177 | g_ShPosEdge.run(v3i((n / G) + ((n & (G - 1)) ? 1 : 0), 1, 1)); 178 | g_ShPosEdge.end(); 179 | glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT); 180 | } 181 | // init cells 182 | // Why a second time? Some of these registers may have been cleared after const resolve 183 | g_ShInit.begin(); 184 | g_ShInit.run(v3i((int)ones.size(), 1, 1)); 185 | g_ShInit.end(); 186 | } 187 | 188 | /* -------------------------------------------------------- */ 189 | 190 | /* 191 | Simulate one cycle on the GPU 192 | */ 193 | void simulCycle_gpu( 194 | const vector& luts, 195 | const vector& step_starts, 196 | const vector& step_ends) 197 | { 198 | 199 | g_ShSimul.begin(); 200 | // iterate on depth levels (skipping const depth 0) 201 | ForRange(depth, 1, (int)step_starts.size()-1) { 202 | // only update LUTs at this particular level 203 | int n = step_ends[depth] - step_starts[depth] + 1; 204 | g_ShSimul.start_lut.set((uint)step_starts[depth]); 205 | g_ShSimul.num.set((uint)n); 206 | g_ShSimul.run(v3i((n / G) + ((n & (G - 1)) ? 
1 : 0), 1, 1)); 207 | // sync required between iterations 208 | glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT); 209 | } 210 | g_ShSimul.end(); 211 | // simulate positive clock edge 212 | g_ShPosEdge.begin(); 213 | int n = (int)luts.size(); 214 | g_ShPosEdge.num.set((uint)n); 215 | g_ShPosEdge.run(v3i((n / G) + ((n & (G - 1)) ? 1 : 0), 1, 1)); 216 | g_ShPosEdge.end(); 217 | // sync required to ensure further reads see the update 218 | glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT); 219 | 220 | ++g_Cycle; 221 | 222 | } 223 | 224 | /* -------------------------------------------------------- */ 225 | 226 | void simulEnd_gpu() 227 | { 228 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 5, 0); 229 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 4, 0); 230 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 3, 0); 231 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, 0); 232 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, 0); 233 | glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, 0); 234 | } 235 | 236 | /* -------------------------------------------------------- */ 237 | 238 | uint g_RBCycle = 0; 239 | 240 | bool simulReadback_gpu() 241 | { 242 | glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT); 243 | 244 | // gather outport values 245 | g_ShOutPorts.begin(); 246 | g_ShOutPorts.offset.set((uint)g_OutPorts.size() * g_RBCycle); 247 | g_ShOutPorts.run(v3i((int)g_OutPorts.size(), 1, 1)); // TODO: local size >= 32 248 | g_ShOutPorts.end(); 249 | 250 | ++g_RBCycle; 251 | 252 | if (g_RBCycle == CYCLE_BUFFER_LEN) { 253 | // readback buffer 254 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, g_GPU_OutPortsVals.glId()); 255 | glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, g_OutPortsValues.sizeOfData(), g_OutPortsValues.raw()); 256 | glBindBufferARB(GL_SHADER_STORAGE_BUFFER, 0); 257 | g_RBCycle = 0; 258 | } 259 | 260 | return g_RBCycle == 0; 261 | } 262 | 263 | /* -------------------------------------------------------- */ 264 | 265 | void simulPrintOutput_gpu(const vector >& outbits) 266 | { 267 | // 
display result (assumes readback done) 268 | int val = 0; 269 | string str; 270 | for (int b = 0; b < outbits.size(); b++) { 271 | int vb = g_OutPortsValues[b]; 272 | str = (vb ? "1" : "0") + str; 273 | val += vb << b; 274 | } 275 | fprintf(stderr, "b%s (d%d h%x)\n", str.c_str(), val, val); 276 | } 277 | 278 | // -------------------------------------------------------------- 279 | 280 | void simulTerminate_gpu() 281 | { 282 | g_LUTs_Addrs.terminate(); 283 | g_LUTs_Cfg.terminate(); 284 | g_LUTs_Outputs.terminate(); 285 | g_GPU_OutPortsLocs.terminate(); 286 | g_GPU_OutPortsVals.terminate(); 287 | g_GPU_OutInits.terminate(); 288 | } 289 | 290 | // -------------------------------------------------------------- 291 | -------------------------------------------------------------------------------- /src/simul_gpu.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2022-01-04 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 
21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | #pragma once 34 | 35 | // -------------------------------------------------------------- 36 | 37 | #include 38 | using namespace std; 39 | 40 | // -------------------------------------------------------------- 41 | 42 | #define CYCLE_BUFFER_LEN 1024 43 | 44 | // -------------------------------------------------------------- 45 | 46 | #include "sh_simul.h" 47 | extern AutoBindShader::sh_simul g_ShSimul; 48 | #include "sh_posedge.h" 49 | extern AutoBindShader::sh_posedge g_ShPosEdge; 50 | #include "sh_outports.h" 51 | extern AutoBindShader::sh_outports g_ShOutPorts; 52 | #include "sh_init.h" 53 | extern AutoBindShader::sh_init g_ShInit; 54 | 55 | typedef GPUMESH_MVF2(mvf_vertex_2f, mvf_texcoord0_2f) mvf_simple; 56 | typedef GPUMesh_GL_VBO GLMesh; 57 | 58 | extern AutoPtr g_Quad; 59 | 60 | extern GLTimer g_GPU_timer; 61 | 62 | // -------------------------------------------------------------- 63 | 64 | void simulInit_gpu( 65 | const vector& luts, 66 | const vector& ones 67 | ); 68 | 69 | void simulBegin_gpu( 70 | const vector& luts, 71 | const vector& step_starts, 72 | const vector& step_ends, 73 | const vector& ones); 74 | 75 | void simulCycle_gpu( 76 | const 
vector& luts, 77 | const vector& step_starts, 78 | const vector& step_ends); 79 | 80 | bool simulReadback_gpu(); 81 | 82 | void simulPrintOutput_gpu(const vector >& outbits); 83 | 84 | void simulEnd_gpu(); 85 | 86 | void simulTerminate_gpu(); 87 | 88 | // -------------------------------------------------------------- 89 | -------------------------------------------------------------------------------- /src/uintX.h: -------------------------------------------------------------------------------- 1 | // @sylefeb 2025-04-08 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #pragma once 35 | 36 | #include <vector> 37 | 38 | class uintX 39 | { 40 | private: 41 | std::vector<uint32_t> m_bits; 42 | public: 43 | uintX() {} 44 | void set(uint b, bool v) { 45 | uint i = b >> 5; 46 | if (i >= m_bits.size()) { m_bits.resize(i + 1, 0); } 47 | if (v) m_bits[i] |= (uint32_t(1) << (b & 31)); 48 | else m_bits[i] &= ~(uint32_t(1) << (b & 31)); 49 | } 50 | bool get(uint b) { 51 | uint i = b >> 5; 52 | if (i >= m_bits.size()) return 0; 53 | return 1 == ((m_bits[i] >> (b & 31)) & 1); 54 | } 55 | uint bitsize() { return (uint)m_bits.size() * 32; } 56 | }; 57 | 58 | -------------------------------------------------------------------------------- /src/wasi.cc: -------------------------------------------------------------------------------- 1 | // @sylefeb 2025-04-22 2 | /* 3 | BSD 3-Clause License 4 | 5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without 9 | modification, are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, this 12 | list of conditions and the following disclaimer. 13 | 14 | 2. Redistributions in binary form must reproduce the above copyright notice, 15 | this list of conditions and the following disclaimer in the documentation 16 | and/or other materials provided with the distribution. 17 | 18 | 3. 
Neither the name of the copyright holder nor the names of its 19 | contributors may be used to endorse or promote products derived from 20 | this software without specific prior written permission. 21 | 22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 32 | */ 33 | 34 | #include 35 | #include 36 | #include 37 | #include 38 | 39 | // This file contains extra definitions that are required for proper compilation 40 | // with the WASI WebAssembly framework, due to lack of support for exceptions 41 | // and threads (the latter should not be needed, ANTLR is creating the missing 42 | // __cxa_thread_atexit export) 43 | 44 | #if defined(__wasi__) 45 | 46 | // well ... 
47 | extern "C" { 48 | void * __cxa_allocate_exception(size_t /*thrown_size*/) { abort(); } 49 | void __cxa_throw(void */*thrown_object*/, std::type_info */*tinfo*/, void (*/*dest*/)(void *)) { abort(); } 50 | int system( const char* ) { return -1; } // stub: no command processor under WASI 51 | clock_t clock() { return 0; } 52 | FILE *tmpfile() { return NULL; } 53 | int __cxa_thread_atexit(void*, void*, void*) { return 0; } // stub: nothing is registered 54 | } 55 | 56 | #endif 57 | 58 | // ------------------------------------------------- 59 | -------------------------------------------------------------------------------- /synth.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # run ./synth.sh DESIGN 4 | # (without the .v, where design is a Verilog file in ./designs/) 5 | 6 | mkdir -p build 7 | 8 | cp "designs/$1.v" build/synth.v 9 | 10 | cd synth 11 | yosys -s synth.yosys 12 | cd .. 13 | -------------------------------------------------------------------------------- /synth/synth.yosys: -------------------------------------------------------------------------------- 1 | # BSD 3-Clause License 2 | 3 | # Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 4 | # All rights reserved. 5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # 1. Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. 11 | 12 | # 2. Redistributions in binary form must reproduce the above copyright notice, 13 | # this list of conditions and the following disclaimer in the documentation 14 | # and/or other materials provided with the distribution. 15 | 16 | # 3. Neither the name of the copyright holder nor the names of its 17 | # contributors may be used to endorse or promote products derived from 18 | # this software without specific prior written permission. 
19 | 20 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | 31 | read_verilog -sv ../build/synth.v 32 | 33 | techmap -map +/adff2dff.v 34 | 35 | hierarchy -check -auto-top 36 | 37 | proc 38 | # async2sync 39 | flatten 40 | opt_expr 41 | opt_clean 42 | check 43 | opt -nodffe -nosdff 44 | fsm 45 | opt -nodffe -nosdff 46 | wreduce 47 | peepopt 48 | opt_clean 49 | techmap -map +/cmp2lut.v -D LUT_WIDTH=4 50 | # alumacc 51 | share 52 | opt -nodffe -nosdff 53 | memory -nomap 54 | opt_clean 55 | 56 | opt -fast -full -nodffe -nosdff 57 | memory_map 58 | opt -full -nodffe -nosdff 59 | techmap 60 | opt -fast -nodffe -nosdff 61 | abc -fast -lut 4 62 | opt -fast -nodffe -nosdff 63 | 64 | hierarchy -check 65 | stat 66 | check 67 | 68 | write_blif -param ../build/synth.blif 69 | -------------------------------------------------------------------------------- /synth/synth_bram.yosys: -------------------------------------------------------------------------------- 1 | # BSD 3-Clause License 2 | 3 | # Copyright (c) 2022, Sylvain Lefebvre (@sylefeb) 4 | # All rights reserved. 
5 | 6 | # Redistribution and use in source and binary forms, with or without 7 | # modification, are permitted provided that the following conditions are met: 8 | 9 | # 1. Redistributions of source code must retain the above copyright notice, this 10 | # list of conditions and the following disclaimer. 11 | 12 | # 2. Redistributions in binary form must reproduce the above copyright notice, 13 | # this list of conditions and the following disclaimer in the documentation 14 | # and/or other materials provided with the distribution. 15 | 16 | # 3. Neither the name of the copyright holder nor the names of its 17 | # contributors may be used to endorse or promote products derived from 18 | # this software without specific prior written permission. 19 | 20 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | # DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | # SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | # OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
30 | 31 | read_verilog -sv ../build/synth.v 32 | 33 | techmap -map +/adff2dff.v 34 | 35 | hierarchy -check -auto-top 36 | 37 | proc 38 | # async2sync 39 | flatten 40 | opt_expr 41 | opt_clean 42 | check 43 | opt -nodffe -nosdff 44 | fsm 45 | opt -nodffe -nosdff 46 | wreduce 47 | peepopt 48 | opt_clean 49 | techmap -map +/cmp2lut.v -D LUT_WIDTH=4 50 | # alumacc 51 | share 52 | opt -nodffe -nosdff 53 | memory -nomap 54 | opt_clean 55 | 56 | opt -fast -full -nodffe -nosdff 57 | # memory_map 58 | opt -full -nodffe -nosdff 59 | techmap 60 | opt -fast -nodffe -nosdff 61 | abc -fast -lut 4 62 | opt -fast -nodffe -nosdff 63 | 64 | hierarchy -check 65 | stat 66 | check 67 | 68 | write_blif -param ../build/synth.blif 69 | -------------------------------------------------------------------------------- /synth_bram.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # run ./synth_bram.sh DESIGN 4 | # (without the .v, where design is a Verilog file in ./designs/) 5 | 6 | mkdir -p build 7 | 8 | cp "designs/$1.v" build/synth.v 9 | 10 | cd synth 11 | yosys -s synth_bram.yosys 12 | cd .. 13 | --------------------------------------------------------------------------------
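The two wrapper scripts, `synth.sh` and `synth_bram.sh`, differ only in which Yosys script they pass to `yosys -s`. As a minimal sketch of how they could share one code path: the file name `synth_common.sh`, the `bram` mode argument, and the helper names `pick_script` and `run_synth` below are all hypothetical and are not part of the repository.

```shell
#!/bin/bash
# Hypothetical merged wrapper (NOT in the repository).
# Usage: ./synth_common.sh DESIGN [bram]
# DESIGN is given without the .v, as in the original scripts.

# Map the optional mode argument to one of the two Yosys scripts in ./synth/.
pick_script() {
  if [ "${1:-}" = "bram" ]; then
    echo "synth_bram.yosys"
  else
    echo "synth.yosys"
  fi
}

# Copy the design to the path both Yosys scripts read, then run Yosys.
run_synth() {
  local design="$1"
  local script
  script="$(pick_script "${2:-}")"
  mkdir -p build
  cp "designs/${design}.v" build/synth.v || return 1
  # Subshell so the caller's working directory is left unchanged.
  (cd synth && yosys -s "$script")
}

# Only synthesize when a design name is given, so the file can also be sourced.
if [ -n "${1:-}" ]; then
  run_synth "$1" "${2:-}"
fi
```

Under these assumptions, `./synth_common.sh silice_blinky` would behave like `./synth.sh silice_blinky`, and adding `bram` as a second argument would select `synth_bram.yosys` instead.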