├── .gitignore
├── .gitmodules
├── CMakeLists.txt
├── LICENSE
├── README.md
├── build
    └── .gitignore
├── depths.png
├── designs
    ├── Makefile
    ├── mul.v
    ├── silice_blaze.v
    ├── silice_blinky.v
    ├── silice_div.v
    ├── silice_icev_leds.v
    ├── silice_mulpip.v
    ├── silice_vga_demo.v
    ├── silice_vga_demo_flyover.v
    ├── silice_vga_test.v
    ├── simple.v
    ├── test1.si
    ├── test2.si
    └── test3.si
├── lut4.png
├── silice_vga_test.gif
├── src
    ├── CMakeLists.txt
    ├── analyze.cc
    ├── analyze.h
    ├── blif.cc
    ├── blif.h
    ├── fstapi
    │   ├── CMakeLists.txt
    │   ├── fastlz.c
    │   ├── fastlz.h
    │   ├── fstapi.c
    │   ├── fstapi.h
    │   ├── lz4.c
    │   └── lz4.h
    ├── read.cc
    ├── read.h
    ├── sh_clear.cs
    ├── sh_init.cs
    ├── sh_outports.cs
    ├── sh_posedge.cs
    ├── sh_simul.cs
    ├── sh_visu.fp
    ├── sh_visu.vp
    ├── silixel.cc
    ├── silixel_cpu.cc
    ├── simul_cpu.cc
    ├── simul_cpu.h
    ├── simul_gpu.cc
    ├── simul_gpu.h
    ├── uintX.h
    └── wasi.cc
├── synth.sh
├── synth
    ├── synth.yosys
    └── synth_bram.yosys
└── synth_bram.sh
/.gitignore:
--------------------------------------------------------------------------------
1 | .*.sw*
2 |
--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
1 | [submodule "src/LibSL"]
2 | path = src/LibSL
3 | url = https://github.com/sylefeb/LibSL.git
4 |
--------------------------------------------------------------------------------
/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | CMAKE_MINIMUM_REQUIRED(VERSION 2.6)
2 | PROJECT(silixel)
3 |
4 | add_subdirectory(src)
5 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | 1. Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | 2. Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | 3. Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 |
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Exploring gate-level simulation on CPU and GPU
2 |
3 | This repository contains my experiments on gate-level simulation. By that I mean taking the output of [Yosys](https://github.com/YosysHQ/yosys) and simulating the gate network. This is done in the context of hardware design for FPGAs, with a graphics twist (because making your own [graphics hardware is fun](https://github.com/sylefeb/Silice/blob/master/projects/README.md)). As a quick reminder, hardware design is achieved by going from a hardware description (e.g. source code in [Verilog](https://en.wikipedia.org/wiki/Verilog)) to a network of gates that implements the logic; this is what Yosys does. This network of gates can later be turned into a configuration file for an [FPGA](https://www.nandland.com/articles/fpga-101-fpgas-for-beginners.html), or into lithography masks to actually manufacture [ASICs](https://en.wikipedia.org/wiki/Application-specific_integrated_circuit), in both cases implementing the design in hardware.
4 |
5 |
6 | A VGA test design being simulated on the GPU, with gate binary outputs overlaid.
Synthesized on an FPGA this design produces the same pattern on a VGA output, 640x480 @60Hz.
7 | 
8 |
9 |
10 | > The purpose of this repo is to learn about hardware simulation, having fun hacking and understanding how this is possible at all. We are thus implementing a toy simulator. For actual, efficient simulation please refer to [*Verilator*](https://www.veripool.org/verilator/), [*CXXRTL*](https://tomverbeure.github.io/2020/08/08/CXXRTL-the-New-Yosys-Simulation-Backend.html) and [*Icarus Verilog*](http://iverilog.icarus.com/).
11 |
12 | > **Work in progress**: I am currently working on this README and commenting/cleaning the source code. Feedback is welcome!
13 |
14 | This all started as I stumbled upon an entry to the Google CTF 2019 contest: [reversing-gpurtl](https://www.youtube.com/watch?v=3ac9HAsfV8c). The source code [is available](https://github.com/google/google-ctf/tree/master/2019/finals/reversing-gpurtl) and shows how to brute force a gate-level simulation onto the GPU.
15 |
16 | *What does that mean? How does that work? We're going to precisely answer these questions!*
17 |
18 | By analyzing the `reversing-gpurtl` source code and scripts (which are in Python and Rust), I got a good understanding of how the gate-level simulation is achieved. And I was surprised to discover that it is *simple*!
19 |
20 | But first, what is a *gate* in our context? The simplest (and only!) logical element in the network will be a *LUT4*. A LUT (Look-Up Table) is a basic building block of an FPGA. A simplified LUT4 schematic looks like this:
21 | 
22 |
23 | The LUT4 has 4 single bit inputs (`a`,`b`,`c`,`d`) and two single bit outputs: `D` and `Q`. Output `D` is 'immediately' updated (as fast as the circuit can do it) when `a`,`b`,`c` or `d` change. Output `Q` is updated with the current value of `D` only when the clock ticks (positive edge on `clk`). Given `a`,`b`,`c`,`d` the value taken by `D` depends on the LUT configuration, which is a 16 entry truth table (configured by Yosys). It specifies the value of the output bit based on the values of `a`, `b`, `c` and `d`: four bits that can be either 0 or 1, and thus $2^4=16$ possibilities. This configuration implies that the LUT4 has a small internal memory (16 bits), which is indeed what gets configured by Yosys in the LUT4s.
24 |
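A minimal sketch of how the `D` output can be read from the 16-bit configuration. The function name `lut4_eval` and the bit ordering (`a` as the least significant index bit, `d` as the most significant) are assumptions made for illustration; the ordering actually used by the repository may differ.

```cpp
#include <cstdint>

// Evaluate the combinational output D of a LUT4.
// The 16-bit configuration is a truth table: the four inputs form a 4-bit
// index (assumed here: a = bit 0, b = bit 1, c = bit 2, d = bit 3) that
// selects one bit of the table.
inline bool lut4_eval(std::uint16_t cfg, bool a, bool b, bool c, bool d) {
  unsigned index = (a ? 1u : 0u) | (b ? 2u : 0u) | (c ? 4u : 0u) | (d ? 8u : 0u);
  return ((cfg >> index) & 1u) != 0;
}
```

With this ordering, a configuration of `0x8000` behaves as a 4-input AND (only entry 15 is set) and `0x6666` as `a XOR b`, ignoring `c` and `d`.
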
25 | Fundamentally, the idea for simulation is as follows:
26 | 1. First, ask Yosys to synthesize a design using only LUT4s, see the [script here](synth/synth.yosys).
27 |
28 | 1. Second, parse the result written by Yosys (a `blif` file) and prepare a data-structure for simulation. The file tells us about the LUT4s, their configurations and how they are connected. There are a few minor complications that are detailed in the source code comments.
29 |
30 | 1. Third, run the simulation! The basic idea (which we'll improve on next) is to simulate all LUTs in parallel. For each LUT, we read its four inputs and update its `D` output based on its configuration. Once nothing changes anymore, we simulate a positive clock edge by updating the `Q` output to reflect the value of the `D` output. Rinse and repeat.
31 |
32 | And that's all there is for a basic, working simulator!
33 |
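The third step can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the `Input` and `Lut` structures and the function `simulCycleNaive` are hypothetical and do not mirror the data layout in the repository, the sweep is sequential and updates `D` in place rather than in parallel, and the network is assumed to contain no combinational loop (otherwise the inner loop may never settle).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical layout, for illustration only.
struct Input {
  int  lut;    // index of the LUT being read, or -1 for a constant 0
  bool fromQ;  // true: read the registered Q output, false: read D
};

struct Lut {
  std::uint16_t cfg = 0;  // 16-entry truth table
  Input in[4] = {{-1, false}, {-1, false}, {-1, false}, {-1, false}};
  bool D = false;         // combinational output
  bool Q = false;         // registered output
};

inline bool readInput(const std::vector<Lut>& luts, const Input& i) {
  if (i.lut < 0) return false;
  return i.fromQ ? luts[i.lut].Q : luts[i.lut].D;
}

// Simulates one clock cycle: sweep over all LUTs until no D output changes,
// then copy D into Q (positive clock edge). Returns the number of sweeps.
inline int simulCycleNaive(std::vector<Lut>& luts) {
  int sweeps = 0;
  bool changed = true;
  while (changed) {
    changed = false;
    ++sweeps;
    for (Lut& l : luts) {
      unsigned index = 0;
      for (int k = 0; k < 4; ++k) {
        if (readInput(luts, l.in[k])) index |= 1u << k;
      }
      bool d = ((l.cfg >> index) & 1u) != 0;
      if (d != l.D) { l.D = d; changed = true; }
    }
  }
  for (Lut& l : luts) l.Q = l.D;  // positive clock edge
  return sweeps;
}
```

For example, a single LUT configured as an inverter (`cfg = 0x5555`) whose first input reads its own `Q` output toggles `Q` at every cycle.
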
34 | Let's now briefly look at an overview of the source code, and then take a closer look at how the simulation behaves. This will lead us to some optimizations, and will let us understand some performance tradeoffs.
35 |
36 | ## Source code overview
37 |
38 | To give you a rough outline of the source code:
39 | - Step 1 is covered in the [synth.yosys](synth/synth.yosys) script and [synth.sh](synth.sh).
40 | - Step 2 is covered in [blif.cc](src/blif.cc) and [read.cc](src/read.cc)
41 | - Step 3 is covered in [simul_cpu.cc](src/simul_cpu.cc) (CPU) and [simul_gpu.cc](src/simul_gpu.cc) (GPU), both being called from the main app [silixel.cc](src/silixel.cc). On the GPU side, the two important compute shaders are [sh_simul.cs](src/sh_simul.cs) and [sh_posedge.cs](src/sh_posedge.cs). A second application does only CPU simulation (see [silixel_cpu.cc](src/silixel_cpu.cc)). It is very simple, so it can be a good starting point.
42 |
43 | ## A closer look and some improvements
44 |
45 | Blindly simulating all LUTs in parallel works just fine. However, it is quite inefficient in terms of *effective simulated LUTs per computation step*. What do I mean by that?
46 |
47 | Let us assume a perfectly parallel computer, with exactly one core per LUT (on a small design and large GPU this might just be the case!).
48 | It turns out that, in most cases, only a few LUT outputs actually change at each 'parallel update'. This is to be expected: at each simulation step the logic is unlikely to generate changes to all LUTs throughout the entire design. Well, to what extent this is true depends *entirely* on your design of course, but on most designs I tried only a small percentage of LUTs are actually modified at every iteration. This implies that a lot of LUT updates are wasted computations.
49 |
50 | So what can we do to improve efficiency? We will apply two refinements. The first one
51 | is used both on the CPU and GPU implementations. The second one is used only on the CPU implementation.
52 |
53 | ### *Refinement 1: sorting LUTs by combinational depth*
54 |
55 | Let's have a look at a simple network:
56 |
57 | 
58 |
59 | I numbered the LUTs from `L0` to `L5`. The LUTs in the network have been arranged
60 | by *combinational depth*. The depth of a LUT is the largest number of other LUTs
61 | found between any of its inputs and a Q (flip-flop) output, considering all inputs.
62 |
63 | > Recall the D outputs are updated as soon as the inputs change (they are *combinational* outputs) while the Q outputs are updated only at the positive clock edge (*registered* outputs). Thus, within a clock cycle we propagate data from `Q` outputs to `D` outputs until nothing changes, before simulating the next positive clock edge and moving on to the next cycle.
64 |
65 | For instance, `L0` is at depth 0 because both its inputs `a` and `c` read directly
66 | from Q outputs. The same is true of `L1`.
67 | Now `L4` is at depth 1 because while `c` reads from a Q output (which would mean depth 0), `a` reads from the D output of `L1`. Since `L1` is at depth 0, `L4` has to be depth 1. The final depth of the LUT is the largest considering all inputs.
68 |
69 | The depth analysis is performed in [analyze.cc](src/analyze.cc).
70 |
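The depth rule can be sketched as below. This is not the code from `analyze.cc`: the function `computeDepths` and its input representation are hypothetical, and the sketch assumes a network without combinational loops (with a loop, the relaxation would never terminate).

```cpp
#include <algorithm>
#include <vector>

// d_inputs[i] lists the LUTs whose D output feeds LUT i.
// Inputs that read a Q output are simply not listed: they add no depth.
// Assumes the combinational network has no loop.
inline std::vector<int> computeDepths(const std::vector<std::vector<int>>& d_inputs) {
  const int n = static_cast<int>(d_inputs.size());
  std::vector<int> depth(n, 0);
  bool changed = true;
  while (changed) {  // relax until every depth is stable
    changed = false;
    for (int i = 0; i < n; ++i) {
      int d = 0;  // depth 0 when all inputs read Q outputs
      for (int src : d_inputs[i]) d = std::max(d, depth[src] + 1);
      if (d != depth[i]) { depth[i] = d; changed = true; }
    }
  }
  return depth;
}
```

As a usage example, take a wiring consistent with the text (partly assumed, since only some connections are described): `L2` reads the D output of `L0`, `L3` and `L4` read `L1`, and `L5` reads `L2` and `L4`. Calling `computeDepths({{}, {}, {0}, {1}, {1}, {2, 4}})` then yields depths 0, 0, 1, 1, 1, 2.
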
71 | How does that help? Remember that during simulation, we update all LUTs in parallel
72 | until nothing changes, and then simulate a positive clock edge.
73 | This introduces two problems. First, we need to track whether something changes,
74 | and with large numbers of LUTs that is not free, for instance when running in parallel on the GPU.
75 | Second, only a few LUT outputs actually change at every iteration, while we update all of them.
76 | In the illustrated example, `L5` would not change until the very last iteration. And during this last
77 | iteration it is the only one to change, so the update is wasted on all other LUTs.
78 |
79 | Having the depth gives us some nice properties to reduce the impact of these issues:
80 | - Since we know the maximum overall depth (2 in the example) we know exactly
81 | how many iterations to run and do not have to implement a 'no change' detection
82 | (3 iterations in the example).
83 | - LUTs at the same depth are independent from one another. Consider `L2`, `L3` and `L4`: changing
84 | the D output of one does not impact the others.
85 | This is true by construction: if one depended on another, it would have been
86 | assigned to a deeper level in the network.
87 | Furthermore, LUTs at a given depth can only depend on changes of
88 | the D output of LUTs *at lower depths*. Thus, we can do less work at each iteration,
89 | focusing only on the LUTs that could possibly change. In the example, we would
90 | run three parallel iterations, first {`L0`,`L1`}, then {`L2`,`L3`,`L4`}, then {`L5`}.
91 | This results in substantial savings. On the GPU, we can still update large chunks
92 | of LUTs in parallel *without any synchronization* (LUTs at the same depth), which is
93 | ideal.
94 |
95 | > The maximum depth also reflects the maximum frequency at which the circuit can run. Indeed, assuming
96 | a signal takes the same time to propagate through every LUT, the number of LUTs
97 | to traverse *at most* determines the worst case propagation delay, and hence the
98 | maximum frequency. This is often reported as the *critical path* by place and route software
99 | such as [nextpnr](https://github.com/YosysHQ/nextpnr).
100 |
101 | Now we have seen all the ingredients of the GPU implementation.
102 | See in particular function `simulCycle_gpu` in source file [`simul_gpu.cc`](src/simul_gpu.cc),
103 | that calls the compute shaders once for each depth level, and then for the positive clock edge.
104 |
105 | > A detail, not discussed here, is that some LUTs remain constant during simulation
106 | and can be skipped after initialization. This is done in the implementation.
107 |
108 | ### *Refinement 2: fanout and compute lists*
109 |
110 | The first refinement avoids blind updates to all LUTs. However, it remains
111 | very likely that within a set of LUTs at the same depth, many are updated while
112 | their inputs did not change. Consider {`L2`,`L3`,`L4`}. If only the D output of `L0` changed,
113 | then only `L2` actually requires an update.
114 |
115 | This second refinement avoids this issue, implementing a *compute list* per depth level
116 | (including the final positive edge update of Q outputs).
117 | An iteration at a given depth *k* inserts LUTs that should be refreshed in the compute lists of the next depth levels (> *k*).
118 | These are the LUTs using as input the changing D output of a LUT at depth *k*.
119 |
120 | To do this efficiently, we first compute the *fanout* of the LUTs. Let us consider a single LUT:
121 | its fanout is the list of LUTs that use its D output, and of course all are deeper in the
122 | network. Given this list, whenever a LUT D output changes we can efficiently insert all LUTs of
123 | its fanout to the compute lists
124 | (see `addFanout` in source file [`simul_cpu.cc`](src/simul_cpu.cc)). LUTs are inserted
125 | only once thanks to a 'dirty bit' flag.
126 |
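A minimal sketch of compute lists driven by fanout, under simplifying assumptions: it covers only the combinational propagation within one cycle (Q outputs and the positive-edge list are left out), every input either reads the D output of another LUT or a constant 0, and the names `Lut`, `Network`, `addFanoutSketch` and `propagate` are hypothetical. See `addFanout` in `simul_cpu.cc` for the actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout, for illustration only.
struct Lut {
  std::uint16_t cfg = 0;          // 16-entry truth table
  int  in[4] = {-1, -1, -1, -1};  // LUT whose D output is read; -1 reads constant 0
  int  depth = 0;                 // combinational depth
  bool D = false;                 // combinational output
  bool dirty = false;             // already queued in a compute list?
  std::vector<int> fanout;        // LUTs reading this LUT's D output
};

struct Network {
  std::vector<Lut> luts;
  std::vector<std::vector<int>> computeLists;  // one list per depth level
};

// Queue every LUT in the fanout of `src`, once each thanks to the dirty flag.
inline void addFanoutSketch(Network& net, int src) {
  for (int dst : net.luts[src].fanout) {
    Lut& l = net.luts[dst];
    if (!l.dirty) {
      l.dirty = true;
      net.computeLists[l.depth].push_back(dst);
    }
  }
}

// Process the compute lists from shallow to deep.
// Returns the number of LUTs that were actually evaluated.
inline int propagate(Network& net) {
  int evaluated = 0;
  for (std::size_t k = 0; k < net.computeLists.size(); ++k) {
    // Fanout LUTs are always deeper, so list k does not grow while we read it.
    for (std::size_t n = 0; n < net.computeLists[k].size(); ++n) {
      const int id = net.computeLists[k][n];
      Lut& l = net.luts[id];
      l.dirty = false;
      unsigned index = 0;
      for (int b = 0; b < 4; ++b) {
        if (l.in[b] >= 0 && net.luts[l.in[b]].D) index |= 1u << b;
      }
      const bool d = ((l.cfg >> index) & 1u) != 0;
      ++evaluated;
      if (d != l.D) { l.D = d; addFanoutSketch(net, id); }
    }
    net.computeLists[k].clear();
  }
  return evaluated;
}
```

The caller seeds the depth-0 list with the LUTs to refresh. Only LUTs downstream of an output that actually changed are then evaluated, which is where the savings over updating a whole depth level come from.
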
127 | This approach works very well on the CPU, which uses a single thread and
128 | traverses the lists sequentially anyway. In fact, it outperforms the GPU on all but very large
129 | designs (which, on top of that, are large for bad reasons: memories (BRAM/SPRAM)
130 | get turned into humongous networks of LUTs).
131 |
132 | > This approach is not easily amenable to the GPU. I actually tried, but this required
133 | atomic updates, synchronization and indirect compute dispatch ... which in the end
134 | together killed performance. But it might be that I did not find the right way yet!
135 |
136 | > Performance can be further improved on the CPU. First, the computations seem a
137 | case for SSE instructions. Second, I should be using more cores! However,
138 | like on the GPU, synchronization can quickly become a performance bottleneck...
139 |
140 | Alright, time for some compilation and testing!
141 |
142 | ## Compile and run
143 |
144 | First, make sure to get the submodules:
145 | ```
146 | git submodule init
147 | git submodule update
148 | ```
149 | Use `CMake` to prepare a Makefile for your system (possibly use `-G` to specify the makefile generator), then `make`:
150 | ```
151 | cd build
152 | cmake ..
153 | make
154 | cd ..
155 | ```
156 | Before simulating, we have to run Yosys on a design (a Verilog source file).
157 | Yosys has to be installed and in PATH.
158 | From a command line in the repo root, run:
159 |
160 | ```
161 | ./synth.sh silice_vga_test
162 | ```
163 |
164 | This synthesizes a design ([silice_vga_test.v](designs/silice_vga_test.v)) and
165 | generates the output in [`build`](./build). There are several designs, see [`designs`](./designs/).
166 |
167 | After running `./build/src/silixel` you should see this:
168 |
169 | 
170 |
171 | Time to experiment, the source code is yours to hack!
172 | Please let me know what you think, and feel free to [reach out on Mastodon](https://mastodon.online/@sylefeb).
173 |
174 | ## Limitations
175 |
176 | - Single clock domain
177 |
--------------------------------------------------------------------------------
/build/.gitignore:
--------------------------------------------------------------------------------
1 | *
2 | !.gitignore
3 |
--------------------------------------------------------------------------------
/depths.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/depths.png
--------------------------------------------------------------------------------
/designs/Makefile:
--------------------------------------------------------------------------------
1 |
2 | .DEFAULT: $@
3 | 	silice-make.py -s $@.si -b bare -p basic -o BUILD_$(subst :,_,$@) $(ARGS)
4 | 	cd .. ; ./synth_bram.sh /BUILD_$(subst :,_,$@)/build ; cd -
5 |
6 | clean:
7 | 	rm -rf BUILD_*
8 |
--------------------------------------------------------------------------------
/designs/mul.v:
--------------------------------------------------------------------------------
1 | module multest(clock, out);
2 |
3 | input clock;
4 | output reg [7:0] out = 0;
5 | reg [7:0] a = 0;
6 | // reg [7:0] b = 0;
7 |
8 | always @(posedge clock)
9 | begin
10 | out <= a * a;
11 | a <= a + 1;
12 | // b <= b + 1;
13 | end
14 |
15 | endmodule
16 |
--------------------------------------------------------------------------------
/designs/silice_blinky.v:
--------------------------------------------------------------------------------
1 | /*
2 |
3 | Copyright 2019, (C) Sylvain Lefebvre and contributors
4 | List contributors with: git shortlog -n -s --
5 |
6 | MIT license
7 |
8 | Permission is hereby granted, free of charge, to any person obtaining a copy of
9 | this software and associated documentation files (the "Software"), to deal in
10 | the Software without restriction, including without limitation the rights to
11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
12 | the Software, and to permit persons to whom the Software is furnished to do so,
13 | subject to the following conditions:
14 |
15 | The above copyright notice and this permission notice shall be included in all
16 | copies or substantial portions of the Software.
17 |
18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
24 |
25 | (header_2_M)
26 |
27 | */
28 | `define BARE 1
29 | `define COLOR_DEPTH 6
30 |
31 |
32 | module top(
33 | `ifdef VGA
34 | // VGA
35 | output out_video_clock,
36 | output reg [`COLOR_DEPTH-1:0] out_video_r,
37 | output reg [`COLOR_DEPTH-1:0] out_video_g,
38 | output reg [`COLOR_DEPTH-1:0] out_video_b,
39 | output out_video_hs,
40 | output out_video_vs,
41 | `endif
42 | // basic
43 | output [7:0] out_leds,
44 | input clock
45 | );
46 |
47 | reg [2:0] ready = 3'b111;
48 |
49 | always @(posedge clock) begin
50 | ready <= ready >> 1;
51 | end
52 |
53 | wire run_main;
54 | assign run_main = 1'b1;
55 |
56 | M_main __main(
57 | .clock(clock),
58 | .reset(ready[0]),
59 | .out_leds(out_leds),
60 | `ifdef VGA
61 | .out_video_clock(out_video_clock),
62 | .out_video_r(out_video_r),
63 | .out_video_g(out_video_g),
64 | .out_video_b(out_video_b),
65 | .out_video_hs(out_video_hs),
66 | .out_video_vs(out_video_vs),
67 | `endif
68 | .in_run(run_main)
69 | );
70 |
71 | endmodule
72 |
73 | module M_main (
74 | out_leds,
75 | in_run,
76 | out_done,
77 | reset,
78 | out_clock,
79 | clock
80 | );
81 | output [7:0] out_leds;
82 | input in_run;
83 | output out_done;
84 | input reset;
85 | output out_clock;
86 | input clock;
87 | assign out_clock = clock;
88 |
89 | reg [7:0] _d_cnt;
90 | reg [7:0] _q_cnt;
91 | reg [7:0] _d_leds;
92 | reg [7:0] _q_leds;
93 | reg [1:0] _d_index,_q_index = 3;
94 | reg _autorun = 0;
95 | assign out_leds = _q_leds;
96 | assign out_done = (_q_index == 3) & _autorun;
97 |
98 |
99 |
100 | `ifdef FORMAL
101 | initial begin
102 | assume(reset);
103 | end
104 | assume property($initstate || (out_done));
105 | `endif
106 | always @* begin
107 | _d_cnt = _q_cnt;
108 | _d_leds = _q_leds;
109 | _d_index = _q_index;
110 | // _always_pre
111 | _d_leds = _q_cnt[(8)-(8)+:8];
112 | (* full_case *)
113 | case (_q_index)
114 | 0: begin
115 | // _top
116 | _d_index = 1;
117 | end
118 | 1: begin
119 | // __while__block_1
120 | if (1) begin
121 | // __block_2
122 | // __block_4
123 | _d_cnt = _q_cnt+1;
124 | // __block_5
125 | _d_index = 1;
126 | end else begin
127 | _d_index = 2;
128 | end
129 | end
130 | 2: begin
131 | // __block_3
132 | _d_index = 3;
133 | end
134 | 3: begin // end of
135 | end
136 | default: begin
137 | _d_index = {2{1'bx}};
138 | `ifdef FORMAL
139 | assume(0);
140 | `endif
141 | end
142 | endcase
143 | // _always_post
144 | end
145 |
146 | always @(posedge clock) begin
147 | _q_cnt <= (reset) ? 0 : _d_cnt;
148 | _q_leds <= _d_leds;
149 | _q_index <= reset ? 3 : ( ~_autorun ? 0 : _d_index);
150 | _autorun <= reset ? 0 : 1;
151 | end
152 |
153 | endmodule
154 |
155 |
--------------------------------------------------------------------------------
/designs/silice_div.v:
--------------------------------------------------------------------------------
1 | /*
2 |
3 | Copyright 2019, (C) Sylvain Lefebvre and contributors
4 | List contributors with: git shortlog -n -s --
5 |
6 | MIT license
7 |
8 | Permission is hereby granted, free of charge, to any person obtaining a copy of
9 | this software and associated documentation files (the "Software"), to deal in
10 | the Software without restriction, including without limitation the rights to
11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
12 | the Software, and to permit persons to whom the Software is furnished to do so,
13 | subject to the following conditions:
14 |
15 | The above copyright notice and this permission notice shall be included in all
16 | copies or substantial portions of the Software.
17 |
18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
24 |
25 | (header_2_M)
26 |
27 | */
28 | `define BARE 1
29 | `define COLOR_DEPTH 6
30 |
31 |
32 | module top(
33 | `ifdef VGA
34 | // VGA
35 | output out_video_clock,
36 | output reg [`COLOR_DEPTH-1:0] out_video_r,
37 | output reg [`COLOR_DEPTH-1:0] out_video_g,
38 | output reg [`COLOR_DEPTH-1:0] out_video_b,
39 | output out_video_hs,
40 | output out_video_vs,
41 | `endif
42 | // basic
43 | output [7:0] out_leds,
44 | input clock
45 | );
46 |
47 | reg [2:0] ready = 3'b111;
48 |
49 | always @(posedge clock) begin
50 | ready <= ready >> 1;
51 | end
52 |
53 | wire run_main;
54 | assign run_main = 1'b1;
55 |
56 | M_main __main(
57 | .clock(clock),
58 | .reset(ready[0]),
59 | .out_leds(out_leds),
60 | `ifdef VGA
61 | .out_video_clock(out_video_clock),
62 | .out_video_r(out_video_r),
63 | .out_video_g(out_video_g),
64 | .out_video_b(out_video_b),
65 | .out_video_hs(out_video_hs),
66 | .out_video_vs(out_video_vs),
67 | `endif
68 | .in_run(run_main)
69 | );
70 |
71 | endmodule
72 |
73 |
74 | module M_div16__div0 (
75 | in_inum,
76 | in_iden,
77 | out_ret,
78 | in_run,
79 | out_done,
80 | reset,
81 | out_clock,
82 | clock
83 | );
84 | input signed [15:0] in_inum;
85 | input signed [15:0] in_iden;
86 | output signed [15:0] out_ret;
87 | input in_run;
88 | output out_done;
89 | input reset;
90 | output out_clock;
91 | input clock;
92 | assign out_clock = clock;
93 | wire [16:0] _w_diff;
94 | wire [15:0] _w_num;
95 | wire [15:0] _w_den;
96 |
97 | reg [16:0] _d_ac;
98 | reg [16:0] _q_ac;
99 | reg [4:0] _d_i;
100 | reg [4:0] _q_i;
101 | reg signed [15:0] _d_ret;
102 | reg signed [15:0] _q_ret;
103 | reg [1:0] _d_index,_q_index = 3;
104 | assign out_ret = _q_ret;
105 | assign out_done = (_q_index == 3);
106 |
107 |
108 | assign _w_diff = _q_ac-_w_den;
109 | assign _w_num = in_inum;
110 | assign _w_den = in_iden;
111 |
112 | `ifdef FORMAL
113 | initial begin
114 | assume(reset);
115 | end
116 | assume property($initstate || (in_run || out_done));
117 | `endif
118 | always @* begin
119 | _d_ac = _q_ac;
120 | _d_i = _q_i;
121 | _d_ret = _q_ret;
122 | _d_index = _q_index;
123 | // _always_pre
124 | (* full_case *)
125 | case (_q_index)
126 | 0: begin
127 | // _top
128 | _d_ac = {{15{1'b0}},_w_num[15+:1]};
129 | _d_ret = {_w_num[0+:15],1'b0};
130 | _d_index = 1;
131 | end
132 | 1: begin
133 | // __while__block_1
134 | if (_q_i!=16) begin
135 | // __block_2
136 | // __block_4
137 | if (_w_diff[16+:1]==0) begin
138 | // __block_5
139 | // __block_7
140 | _d_ac = {_w_diff[0+:15],_q_ret[15+:1]};
141 | _d_ret = {_q_ret[0+:15],1'b1};
142 | // __block_8
143 | end else begin
144 | // __block_6
145 | // __block_9
146 | _d_ac = {_q_ac[0+:15],_q_ret[15+:1]};
147 | _d_ret = {_q_ret[0+:15],1'b0};
148 | // __block_10
149 | end
150 | // __block_11
151 | _d_i = _q_i+1;
152 | // __block_12
153 | _d_index = 1;
154 | end else begin
155 | _d_index = 2;
156 | end
157 | end
158 | 2: begin
159 | // __block_3
160 | _d_index = 3;
161 | end
162 | 3: begin // end of
163 | end
164 | default: begin
165 | _d_index = {2{1'bx}};
166 | `ifdef FORMAL
167 | assume(0);
168 | `endif
169 | end
170 | endcase
171 | // _always_post
172 | end
173 |
174 | always @(posedge clock) begin
175 | _q_ac <= _d_ac;
176 | _q_i <= (reset | ~in_run) ? 0 : _d_i;
177 | _q_ret <= (reset | ~in_run) ? 0 : _d_ret;
178 | _q_index <= reset ? 3 : ( ~in_run ? 0 : _d_index);
179 | end
180 |
181 | endmodule
182 |
183 | module M_main (
184 | out_leds,
185 | in_run,
186 | out_done,
187 | reset,
188 | out_clock,
189 | clock
190 | );
191 | output [7:0] out_leds;
192 | input in_run;
193 | output out_done;
194 | input reset;
195 | output out_clock;
196 | input clock;
197 | assign out_clock = clock;
198 | wire signed [15:0] _w_div0_ret;
199 | wire _w_div0_done;
200 | wire signed [15:0] _c_num;
201 | assign _c_num = 20043;
202 | wire signed [15:0] _c_den;
203 | assign _c_den = 41;
204 | reg signed [15:0] _t_result;
205 |
206 | reg [7:0] _d_leds;
207 | reg [7:0] _q_leds;
208 | reg signed [15:0] _d_div0_inum,_q_div0_inum;
209 | reg signed [15:0] _d_div0_iden,_q_div0_iden;
210 | reg [1:0] _d_index,_q_index = 3;
211 | reg _autorun = 0;
212 | reg _div0_run = 0;
213 | assign out_leds = _q_leds;
214 | assign out_done = (_q_index == 3) & _autorun;
215 | M_div16__div0 div0 (
216 | .in_inum(_q_div0_inum),
217 | .in_iden(_q_div0_iden),
218 | .out_ret(_w_div0_ret),
219 | .out_done(_w_div0_done),
220 | .in_run(_div0_run),
221 | .reset(reset),
222 | .clock(clock));
223 |
224 |
225 |
226 | `ifdef FORMAL
227 | initial begin
228 | assume(reset);
229 | end
230 | assume property($initstate || (out_done));
231 | `endif
232 | always @* begin
233 | _d_leds = _q_leds;
234 | _d_div0_inum = _q_div0_inum;
235 | _d_div0_iden = _q_div0_iden;
236 | _d_index = _q_index;
237 | _div0_run = 1;
238 | _t_result = 0;
239 | // _always_pre
240 | (* full_case *)
241 | case (_q_index)
242 | 0: begin
243 | // _top
244 | _d_div0_inum = _c_num;
245 | _d_div0_iden = _c_den;
246 | _div0_run = 0;
247 | _d_index = 1;
248 | end
249 | 1: begin
250 | // __block_1
251 | if (_w_div0_done == 1) begin
252 | _d_index = 2;
253 | end else begin
254 | _d_index = 1;
255 | end
256 | end
257 | 2: begin
258 | // __block_2
259 | _t_result = _w_div0_ret;
260 | $display("%d / %d = %d",_c_num,_c_den,_t_result);
261 | _d_leds = _t_result[0+:8];
262 | _d_index = 3;
263 | end
264 | 3: begin // end of
265 | end
266 | default: begin
267 | _d_index = {2{1'bx}};
268 | `ifdef FORMAL
269 | assume(0);
270 | `endif
271 | end
272 | endcase
273 | // _always_post
274 | end
275 |
276 | always @(posedge clock) begin
277 | _q_leds <= _d_leds;
278 | _q_index <= reset ? 3 : ( ~_autorun ? 0 : _d_index);
279 | _autorun <= reset ? 0 : 1;
280 | _q_div0_inum <= _d_div0_inum;
281 | _q_div0_iden <= _d_div0_iden;
282 | end
283 |
284 | endmodule
285 |
286 |
--------------------------------------------------------------------------------
/designs/silice_mulpip.v:
--------------------------------------------------------------------------------
1 | /*
2 |
3 | Copyright 2019, (C) Sylvain Lefebvre and contributors
4 | List contributors with: git shortlog -n -s --
5 |
6 | MIT license
7 |
8 | Permission is hereby granted, free of charge, to any person obtaining a copy of
9 | this software and associated documentation files (the "Software"), to deal in
10 | the Software without restriction, including without limitation the rights to
11 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
12 | the Software, and to permit persons to whom the Software is furnished to do so,
13 | subject to the following conditions:
14 |
15 | The above copyright notice and this permission notice shall be included in all
16 | copies or substantial portions of the Software.
17 |
18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
20 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
21 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
22 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
23 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
24 |
25 | (header_2_M)
26 |
27 | */
28 | `define BARE 1
29 | `define COLOR_DEPTH 6
30 |
31 |
32 | module top(
33 | `ifdef VGA
34 | // VGA
35 | output out_video_clock,
36 | output reg [`COLOR_DEPTH-1:0] out_video_r,
37 | output reg [`COLOR_DEPTH-1:0] out_video_g,
38 | output reg [`COLOR_DEPTH-1:0] out_video_b,
39 | output out_video_hs,
40 | output out_video_vs,
41 | `endif
42 | // basic
43 | output [7:0] out_leds,
44 | input clock
45 | );
46 |
47 | reg [2:0] ready = 3'b111;
48 |
49 | always @(posedge clock) begin
50 | ready <= ready >> 1;
51 | end
52 |
53 | wire run_main;
54 | assign run_main = 1'b1;
55 |
56 | M_main __main(
57 | .clock(clock),
58 | .reset(ready[0]),
59 | .out_leds(out_leds),
60 | `ifdef VGA
61 | .out_video_clock(out_video_clock),
62 | .out_video_r(out_video_r),
63 | .out_video_g(out_video_g),
64 | .out_video_b(out_video_b),
65 | .out_video_hs(out_video_hs),
66 | .out_video_vs(out_video_vs),
67 | `endif
68 | .in_run(run_main)
69 | );
70 |
71 | endmodule
72 |
73 |
74 | module M_mulpip16__mul (
75 | in_im0,
76 | in_im1,
77 | out_ret,
78 | out_done,
79 | reset,
80 | out_clock,
81 | clock
82 | );
83 | input signed [15:0] in_im0;
84 | input signed [15:0] in_im1;
85 | output signed [15:0] out_ret;
86 | output out_done;
87 | input reset;
88 | output out_clock;
89 | input clock;
90 | assign out_clock = clock;
91 | reg [15:0] _t_sum_1_0;
92 | reg [15:0] _t_sum_2_0;
93 | reg [15:0] _t_sum_2_1;
94 | reg [15:0] _t_sum_3_0;
95 | reg [15:0] _t_sum_3_1;
96 | reg [15:0] _t_sum_3_2;
97 | reg [15:0] _t_sum_3_3;
98 | reg [15:0] _t_sum_4_0;
99 | reg [15:0] _t_sum_4_1;
100 | reg [15:0] _t_sum_4_2;
101 | reg [15:0] _t_sum_4_3;
102 | reg [15:0] _t_sum_4_4;
103 | reg [15:0] _t_sum_4_5;
104 | reg [15:0] _t_sum_4_6;
105 | reg [15:0] _t_sum_4_7;
106 | reg [15:0] _t_m0;
107 | reg [15:0] _t_m1;
108 | reg [0:0] _t_m0_neg;
109 | reg [0:0] _t_m1_neg;
110 | reg [0:0] _t___pip_138_0_m0_neg;
111 | reg [0:0] _t___pip_138_0_m1_neg;
112 | reg [15:0] _t___pip_138_3_sum_1_0;
113 | reg [15:0] _t___pip_138_2_sum_2_0;
114 | reg [15:0] _t___pip_138_2_sum_2_1;
115 | reg [15:0] _t___pip_138_1_sum_3_0;
116 | reg [15:0] _t___pip_138_1_sum_3_1;
117 | reg [15:0] _t___pip_138_1_sum_3_2;
118 | reg [15:0] _t___pip_138_1_sum_3_3;
119 | reg [15:0] _t___pip_138_0_sum_4_0;
120 | reg [15:0] _t___pip_138_0_sum_4_1;
121 | reg [15:0] _t___pip_138_0_sum_4_2;
122 | reg [15:0] _t___pip_138_0_sum_4_3;
123 | reg [15:0] _t___pip_138_0_sum_4_4;
124 | reg [15:0] _t___pip_138_0_sum_4_5;
125 | reg [15:0] _t___pip_138_0_sum_4_6;
126 | reg [15:0] _t___pip_138_0_sum_4_7;
127 |
128 | reg [0:0] _d___pip_138_1_m0_neg;
129 | reg [0:0] _q___pip_138_1_m0_neg;
130 | reg [0:0] _d___pip_138_2_m0_neg;
131 | reg [0:0] _q___pip_138_2_m0_neg;
132 | reg [0:0] _d___pip_138_3_m0_neg;
133 | reg [0:0] _q___pip_138_3_m0_neg;
134 | reg [0:0] _d___pip_138_4_m0_neg;
135 | reg [0:0] _q___pip_138_4_m0_neg;
136 | reg [0:0] _d___pip_138_1_m1_neg;
137 | reg [0:0] _q___pip_138_1_m1_neg;
138 | reg [0:0] _d___pip_138_2_m1_neg;
139 | reg [0:0] _q___pip_138_2_m1_neg;
140 | reg [0:0] _d___pip_138_3_m1_neg;
141 | reg [0:0] _q___pip_138_3_m1_neg;
142 | reg [0:0] _d___pip_138_4_m1_neg;
143 | reg [0:0] _q___pip_138_4_m1_neg;
144 | reg [15:0] _d___pip_138_4_sum_1_0;
145 | reg [15:0] _q___pip_138_4_sum_1_0;
146 | reg [15:0] _d___pip_138_3_sum_2_0;
147 | reg [15:0] _q___pip_138_3_sum_2_0;
148 | reg [15:0] _d___pip_138_3_sum_2_1;
149 | reg [15:0] _q___pip_138_3_sum_2_1;
150 | reg [15:0] _d___pip_138_2_sum_3_0;
151 | reg [15:0] _q___pip_138_2_sum_3_0;
152 | reg [15:0] _d___pip_138_2_sum_3_1;
153 | reg [15:0] _q___pip_138_2_sum_3_1;
154 | reg [15:0] _d___pip_138_2_sum_3_2;
155 | reg [15:0] _q___pip_138_2_sum_3_2;
156 | reg [15:0] _d___pip_138_2_sum_3_3;
157 | reg [15:0] _q___pip_138_2_sum_3_3;
158 | reg [15:0] _d___pip_138_1_sum_4_0;
159 | reg [15:0] _q___pip_138_1_sum_4_0;
160 | reg [15:0] _d___pip_138_1_sum_4_1;
161 | reg [15:0] _q___pip_138_1_sum_4_1;
162 | reg [15:0] _d___pip_138_1_sum_4_2;
163 | reg [15:0] _q___pip_138_1_sum_4_2;
164 | reg [15:0] _d___pip_138_1_sum_4_3;
165 | reg [15:0] _q___pip_138_1_sum_4_3;
166 | reg [15:0] _d___pip_138_1_sum_4_4;
167 | reg [15:0] _q___pip_138_1_sum_4_4;
168 | reg [15:0] _d___pip_138_1_sum_4_5;
169 | reg [15:0] _q___pip_138_1_sum_4_5;
170 | reg [15:0] _d___pip_138_1_sum_4_6;
171 | reg [15:0] _q___pip_138_1_sum_4_6;
172 | reg [15:0] _d___pip_138_1_sum_4_7;
173 | reg [15:0] _q___pip_138_1_sum_4_7;
174 | reg signed [15:0] _d_ret;
175 | reg signed [15:0] _q_ret;
176 | reg [1:0] _d_index,_q_index = 3;
177 | reg _autorun = 0;
178 | assign out_ret = _q_ret;
179 | assign out_done = (_q_index == 3) & _autorun;
180 |
181 |
182 |
183 | `ifdef FORMAL
184 | initial begin
185 | assume(reset);
186 | end
187 | assume property($initstate || (out_done));
188 | `endif
189 | always @* begin
190 | _d___pip_138_1_m0_neg = _q___pip_138_1_m0_neg;
191 | _d___pip_138_2_m0_neg = _q___pip_138_2_m0_neg;
192 | _d___pip_138_3_m0_neg = _q___pip_138_3_m0_neg;
193 | _d___pip_138_4_m0_neg = _q___pip_138_4_m0_neg;
194 | _d___pip_138_1_m1_neg = _q___pip_138_1_m1_neg;
195 | _d___pip_138_2_m1_neg = _q___pip_138_2_m1_neg;
196 | _d___pip_138_3_m1_neg = _q___pip_138_3_m1_neg;
197 | _d___pip_138_4_m1_neg = _q___pip_138_4_m1_neg;
198 | _d___pip_138_4_sum_1_0 = _q___pip_138_4_sum_1_0;
199 | _d___pip_138_3_sum_2_0 = _q___pip_138_3_sum_2_0;
200 | _d___pip_138_3_sum_2_1 = _q___pip_138_3_sum_2_1;
201 | _d___pip_138_2_sum_3_0 = _q___pip_138_2_sum_3_0;
202 | _d___pip_138_2_sum_3_1 = _q___pip_138_2_sum_3_1;
203 | _d___pip_138_2_sum_3_2 = _q___pip_138_2_sum_3_2;
204 | _d___pip_138_2_sum_3_3 = _q___pip_138_2_sum_3_3;
205 | _d___pip_138_1_sum_4_0 = _q___pip_138_1_sum_4_0;
206 | _d___pip_138_1_sum_4_1 = _q___pip_138_1_sum_4_1;
207 | _d___pip_138_1_sum_4_2 = _q___pip_138_1_sum_4_2;
208 | _d___pip_138_1_sum_4_3 = _q___pip_138_1_sum_4_3;
209 | _d___pip_138_1_sum_4_4 = _q___pip_138_1_sum_4_4;
210 | _d___pip_138_1_sum_4_5 = _q___pip_138_1_sum_4_5;
211 | _d___pip_138_1_sum_4_6 = _q___pip_138_1_sum_4_6;
212 | _d___pip_138_1_sum_4_7 = _q___pip_138_1_sum_4_7;
213 | _d_ret = _q_ret;
214 | _d_index = _q_index;
215 | _t_sum_1_0 = 0;
216 | _t_sum_2_0 = 0;
217 | _t_sum_2_1 = 0;
218 | _t_sum_3_0 = 0;
219 | _t_sum_3_1 = 0;
220 | _t_sum_3_2 = 0;
221 | _t_sum_3_3 = 0;
222 | _t_sum_4_0 = 0;
223 | _t_sum_4_1 = 0;
224 | _t_sum_4_2 = 0;
225 | _t_sum_4_3 = 0;
226 | _t_sum_4_4 = 0;
227 | _t_sum_4_5 = 0;
228 | _t_sum_4_6 = 0;
229 | _t_sum_4_7 = 0;
230 | _t_m0 = 0;
231 | _t_m1 = 0;
232 | _t_m0_neg = 0;
233 | _t_m1_neg = 0;
234 | _t___pip_138_0_m0_neg = 0;
235 | _t___pip_138_0_m1_neg = 0;
236 | _t___pip_138_3_sum_1_0 = 0;
237 | _t___pip_138_2_sum_2_0 = 0;
238 | _t___pip_138_2_sum_2_1 = 0;
239 | _t___pip_138_1_sum_3_0 = 0;
240 | _t___pip_138_1_sum_3_1 = 0;
241 | _t___pip_138_1_sum_3_2 = 0;
242 | _t___pip_138_1_sum_3_3 = 0;
243 | _t___pip_138_0_sum_4_0 = 0;
244 | _t___pip_138_0_sum_4_1 = 0;
245 | _t___pip_138_0_sum_4_2 = 0;
246 | _t___pip_138_0_sum_4_3 = 0;
247 | _t___pip_138_0_sum_4_4 = 0;
248 | _t___pip_138_0_sum_4_5 = 0;
249 | _t___pip_138_0_sum_4_6 = 0;
250 | _t___pip_138_0_sum_4_7 = 0;
251 | // _always_pre
252 | (* full_case *)
253 | case (_q_index)
254 | 0: begin
255 | // _top
256 | _d_index = 1;
257 | end
258 | 1: begin
259 | // __while__block_1
260 | if (1) begin
261 | // __block_2
262 | // __block_4
263 | // pipeline
264 | // -------- stage 0
265 | // __stage___block_6
266 | // __block_7
267 | if (in_im0<0) begin
268 | // __block_8
269 | // __block_10
270 | _t_m0_neg = 1;
271 | _t_m0 = -in_im0;
272 | // __block_11
273 | end else begin
274 | // __block_9
275 | // __block_12
276 | _t_m0 = in_im0;
277 | // __block_13
278 | end
279 | // __block_14
280 | if (in_im1<0) begin
281 | // __block_15
282 | // __block_17
283 | _t_m1_neg = 1;
284 | _t_m1 = -in_im1;
285 | // __block_18
286 | end else begin
287 | // __block_16
288 | // __block_19
289 | _t_m1 = in_im1;
290 | // __block_20
291 | end
292 | // __block_21
293 | case (_t_m1[0+:2])
294 | 2'b00: begin
295 | // __block_23_case
296 | // __block_24
297 | _t_sum_4_0 = 0;
298 | // __block_25
299 | end
300 | 2'b10: begin
301 | // __block_26_case
302 | // __block_27
303 | _t_sum_4_0 = _t_m0<<1;
304 | // __block_28
305 | end
306 | 2'b01: begin
307 | // __block_29_case
308 | // __block_30
309 | _t_sum_4_0 = _t_m0<<0;
310 | // __block_31
311 | end
312 | 2'b11: begin
313 | // __block_32_case
314 | // __block_33
315 | _t_sum_4_0 = (_t_m0<<0)+(_t_m0<<1);
316 | // __block_34
317 | end
318 | endcase
319 | // __block_22
320 | case (_t_m1[2+:2])
321 | 2'b00: begin
322 | // __block_36_case
323 | // __block_37
324 | _t_sum_4_1 = 0;
325 | // __block_38
326 | end
327 | 2'b10: begin
328 | // __block_39_case
329 | // __block_40
330 | _t_sum_4_1 = _t_m0<<3;
331 | // __block_41
332 | end
333 | 2'b01: begin
334 | // __block_42_case
335 | // __block_43
336 | _t_sum_4_1 = _t_m0<<2;
337 | // __block_44
338 | end
339 | 2'b11: begin
340 | // __block_45_case
341 | // __block_46
342 | _t_sum_4_1 = (_t_m0<<2)+(_t_m0<<3);
343 | // __block_47
344 | end
345 | endcase
346 | // __block_35
347 | case (_t_m1[4+:2])
348 | 2'b00: begin
349 | // __block_49_case
350 | // __block_50
351 | _t_sum_4_2 = 0;
352 | // __block_51
353 | end
354 | 2'b10: begin
355 | // __block_52_case
356 | // __block_53
357 | _t_sum_4_2 = _t_m0<<5;
358 | // __block_54
359 | end
360 | 2'b01: begin
361 | // __block_55_case
362 | // __block_56
363 | _t_sum_4_2 = _t_m0<<4;
364 | // __block_57
365 | end
366 | 2'b11: begin
367 | // __block_58_case
368 | // __block_59
369 | _t_sum_4_2 = (_t_m0<<4)+(_t_m0<<5);
370 | // __block_60
371 | end
372 | endcase
373 | // __block_48
374 | case (_t_m1[6+:2])
375 | 2'b00: begin
376 | // __block_62_case
377 | // __block_63
378 | _t_sum_4_3 = 0;
379 | // __block_64
380 | end
381 | 2'b10: begin
382 | // __block_65_case
383 | // __block_66
384 | _t_sum_4_3 = _t_m0<<7;
385 | // __block_67
386 | end
387 | 2'b01: begin
388 | // __block_68_case
389 | // __block_69
390 | _t_sum_4_3 = _t_m0<<6;
391 | // __block_70
392 | end
393 | 2'b11: begin
394 | // __block_71_case
395 | // __block_72
396 | _t_sum_4_3 = (_t_m0<<6)+(_t_m0<<7);
397 | // __block_73
398 | end
399 | endcase
400 | // __block_61
401 | case (_t_m1[8+:2])
402 | 2'b00: begin
403 | // __block_75_case
404 | // __block_76
405 | _t_sum_4_4 = 0;
406 | // __block_77
407 | end
408 | 2'b10: begin
409 | // __block_78_case
410 | // __block_79
411 | _t_sum_4_4 = _t_m0<<9;
412 | // __block_80
413 | end
414 | 2'b01: begin
415 | // __block_81_case
416 | // __block_82
417 | _t_sum_4_4 = _t_m0<<8;
418 | // __block_83
419 | end
420 | 2'b11: begin
421 | // __block_84_case
422 | // __block_85
423 | _t_sum_4_4 = (_t_m0<<8)+(_t_m0<<9);
424 | // __block_86
425 | end
426 | endcase
427 | // __block_74
428 | case (_t_m1[10+:2])
429 | 2'b00: begin
430 | // __block_88_case
431 | // __block_89
432 | _t_sum_4_5 = 0;
433 | // __block_90
434 | end
435 | 2'b10: begin
436 | // __block_91_case
437 | // __block_92
438 | _t_sum_4_5 = _t_m0<<11;
439 | // __block_93
440 | end
441 | 2'b01: begin
442 | // __block_94_case
443 | // __block_95
444 | _t_sum_4_5 = _t_m0<<10;
445 | // __block_96
446 | end
447 | 2'b11: begin
448 | // __block_97_case
449 | // __block_98
450 | _t_sum_4_5 = (_t_m0<<10)+(_t_m0<<11);
451 | // __block_99
452 | end
453 | endcase
454 | // __block_87
455 | case (_t_m1[12+:2])
456 | 2'b00: begin
457 | // __block_101_case
458 | // __block_102
459 | _t_sum_4_6 = 0;
460 | // __block_103
461 | end
462 | 2'b10: begin
463 | // __block_104_case
464 | // __block_105
465 | _t_sum_4_6 = _t_m0<<13;
466 | // __block_106
467 | end
468 | 2'b01: begin
469 | // __block_107_case
470 | // __block_108
471 | _t_sum_4_6 = _t_m0<<12;
472 | // __block_109
473 | end
474 | 2'b11: begin
475 | // __block_110_case
476 | // __block_111
477 | _t_sum_4_6 = (_t_m0<<12)+(_t_m0<<13);
478 | // __block_112
479 | end
480 | endcase
481 | // __block_100
482 | case (_t_m1[14+:2])
483 | 2'b00: begin
484 | // __block_114_case
485 | // __block_115
486 | _t_sum_4_7 = 0;
487 | // __block_116
488 | end
489 | 2'b10: begin
490 | // __block_117_case
491 | // __block_118
492 | _t_sum_4_7 = _t_m0<<15;
493 | // __block_119
494 | end
495 | 2'b01: begin
496 | // __block_120_case
497 | // __block_121
498 | _t_sum_4_7 = _t_m0<<14;
499 | // __block_122
500 | end
501 | 2'b11: begin
502 | // __block_123_case
503 | // __block_124
504 | _t_sum_4_7 = (_t_m0<<14)+(_t_m0<<15);
505 | // __block_125
506 | end
507 | endcase
508 | // __block_113
509 | _t___pip_138_0_m0_neg = _t_m0_neg;
510 | _t___pip_138_0_m1_neg = _t_m1_neg;
511 | _t___pip_138_0_sum_4_0 = _t_sum_4_0;
512 | _t___pip_138_0_sum_4_1 = _t_sum_4_1;
513 | _t___pip_138_0_sum_4_2 = _t_sum_4_2;
514 | _t___pip_138_0_sum_4_3 = _t_sum_4_3;
515 | _t___pip_138_0_sum_4_4 = _t_sum_4_4;
516 | _t___pip_138_0_sum_4_5 = _t_sum_4_5;
517 | _t___pip_138_0_sum_4_6 = _t_sum_4_6;
518 | _t___pip_138_0_sum_4_7 = _t_sum_4_7;
519 | // -------- stage 1
520 | // __stage___block_127
521 | // __block_128
522 | _t_sum_3_0 = _q___pip_138_1_sum_4_0+_q___pip_138_1_sum_4_1;
523 | _t_sum_3_1 = _q___pip_138_1_sum_4_2+_q___pip_138_1_sum_4_3;
524 | _t_sum_3_2 = _q___pip_138_1_sum_4_4+_q___pip_138_1_sum_4_5;
525 | _t_sum_3_3 = _q___pip_138_1_sum_4_6+_q___pip_138_1_sum_4_7;
526 | _t___pip_138_1_sum_3_1 = _t_sum_3_1;
527 | _t___pip_138_1_sum_3_0 = _t_sum_3_0;
528 | _t___pip_138_1_sum_3_2 = _t_sum_3_2;
529 | _t___pip_138_1_sum_3_3 = _t_sum_3_3;
530 | // -------- stage 2
531 | // __stage___block_130
532 | // __block_131
533 | _t_sum_2_0 = _q___pip_138_2_sum_3_0+_q___pip_138_2_sum_3_1;
534 | _t_sum_2_1 = _q___pip_138_2_sum_3_2+_q___pip_138_2_sum_3_3;
535 | _t___pip_138_2_sum_2_0 = _t_sum_2_0;
536 | _t___pip_138_2_sum_2_1 = _t_sum_2_1;
537 | // -------- stage 3
538 | // __stage___block_133
539 | // __block_134
540 | _t_sum_1_0 = _q___pip_138_3_sum_2_0+_q___pip_138_3_sum_2_1;
541 | _t___pip_138_3_sum_1_0 = _t_sum_1_0;
542 | // -------- stage 4
543 | // __stage___block_136
544 | // __block_137
545 | if (_q___pip_138_4_m0_neg^_q___pip_138_4_m1_neg) begin
546 | // __block_138
547 | // __block_140
548 | _d_ret = -_q___pip_138_4_sum_1_0;
549 | // __block_141
550 | end else begin
551 | // __block_139
552 | // __block_142
553 | _d_ret = _q___pip_138_4_sum_1_0;
554 | // __block_143
555 | end
556 | // __block_144
557 | // __block_5
558 | // __block_146
559 | _d_index = 1;
560 | end else begin
561 | _d_index = 2;
562 | end
563 | end
564 | 2: begin
565 | // __block_3
566 | _d_index = 3;
567 | end
568 | 3: begin // end of
569 | end
570 | default: begin
571 | _d_index = {2{1'bx}};
572 | `ifdef FORMAL
573 | assume(0);
574 | `endif
575 | end
576 | endcase
577 | // _always_post
578 | end
579 |
580 | always @(posedge clock) begin
581 | _q___pip_138_1_m0_neg <= _t___pip_138_0_m0_neg;
582 | _q___pip_138_2_m0_neg <= _d___pip_138_1_m0_neg;
583 | _q___pip_138_3_m0_neg <= _d___pip_138_2_m0_neg;
584 | _q___pip_138_4_m0_neg <= _d___pip_138_3_m0_neg;
585 | _q___pip_138_1_m1_neg <= _t___pip_138_0_m1_neg;
586 | _q___pip_138_2_m1_neg <= _d___pip_138_1_m1_neg;
587 | _q___pip_138_3_m1_neg <= _d___pip_138_2_m1_neg;
588 | _q___pip_138_4_m1_neg <= _d___pip_138_3_m1_neg;
589 | _q___pip_138_4_sum_1_0 <= _t___pip_138_3_sum_1_0;
590 | _q___pip_138_3_sum_2_0 <= _t___pip_138_2_sum_2_0;
591 | _q___pip_138_3_sum_2_1 <= _t___pip_138_2_sum_2_1;
592 | _q___pip_138_2_sum_3_0 <= _t___pip_138_1_sum_3_0;
593 | _q___pip_138_2_sum_3_1 <= _t___pip_138_1_sum_3_1;
594 | _q___pip_138_2_sum_3_2 <= _t___pip_138_1_sum_3_2;
595 | _q___pip_138_2_sum_3_3 <= _t___pip_138_1_sum_3_3;
596 | _q___pip_138_1_sum_4_0 <= _t___pip_138_0_sum_4_0;
597 | _q___pip_138_1_sum_4_1 <= _t___pip_138_0_sum_4_1;
598 | _q___pip_138_1_sum_4_2 <= _t___pip_138_0_sum_4_2;
599 | _q___pip_138_1_sum_4_3 <= _t___pip_138_0_sum_4_3;
600 | _q___pip_138_1_sum_4_4 <= _t___pip_138_0_sum_4_4;
601 | _q___pip_138_1_sum_4_5 <= _t___pip_138_0_sum_4_5;
602 | _q___pip_138_1_sum_4_6 <= _t___pip_138_0_sum_4_6;
603 | _q___pip_138_1_sum_4_7 <= _t___pip_138_0_sum_4_7;
604 | _q_ret <= _d_ret;
605 | _q_index <= reset ? 3 : ( ~_autorun ? 0 : _d_index);
606 | _autorun <= reset ? 0 : 1;
607 | end
608 |
609 | endmodule
610 |
611 | module M_main (
612 | out_leds,
613 | in_run,
614 | out_done,
615 | reset,
616 | out_clock,
617 | clock
618 | );
619 | output [7:0] out_leds;
620 | input in_run;
621 | output out_done;
622 | input reset;
623 | output out_clock;
624 | input clock;
625 | assign out_clock = clock;
626 | wire signed [15:0] _w_mul_ret;
627 | wire _w_mul_done;
628 | reg signed [15:0] _t_result;
629 |
630 | reg signed [15:0] _d_m0;
631 | reg signed [15:0] _q_m0;
632 | reg signed [15:0] _d_m1;
633 | reg signed [15:0] _q_m1;
634 | reg [7:0] _d_leds;
635 | reg [7:0] _q_leds;
636 | reg signed [15:0] _d_mul_im0,_q_mul_im0;
637 | reg signed [15:0] _d_mul_im1,_q_mul_im1;
638 | reg [3:0] _d_index,_q_index = 13;
639 | reg _autorun = 0;
640 | assign out_leds = _q_leds;
641 | assign out_done = (_q_index == 13) & _autorun;
642 | M_mulpip16__mul mul (
643 | .in_im0(_d_mul_im0),
644 | .in_im1(_d_mul_im1),
645 | .out_ret(_w_mul_ret),
646 | .out_done(_w_mul_done),
647 | .reset(reset),
648 | .clock(clock));
649 |
650 |
651 |
652 | `ifdef FORMAL
653 | initial begin
654 | assume(reset);
655 | end
656 | assume property($initstate || (out_done));
657 | `endif
658 | always @* begin
659 | _d_m0 = _q_m0;
660 | _d_m1 = _q_m1;
661 | _d_leds = _q_leds;
662 | _d_mul_im0 = _q_mul_im0;
663 | _d_mul_im1 = _q_mul_im1;
664 | _d_index = _q_index;
665 | // _always_pre
666 | _t_result = _w_mul_ret;
667 | _d_mul_im0 = _q_m0;
668 | _d_mul_im1 = _q_m1;
669 | _d_leds = _t_result[0+:8];
670 | (* full_case *)
671 | case (_q_index)
672 | 0: begin
673 | // _top
674 | _d_m0 = 2;
675 | _d_m1 = 3;
676 | $display("%d * %d = ...",_d_m0,_d_m1);
677 | _d_index = 1;
678 | end
679 | 1: begin
680 | // __block_1
681 | _d_m0 = _q_m0+1;
682 | _d_m1 = -_q_m1-1;
683 | $display("%d * %d = ...",_d_m0,_d_m1);
684 | _d_index = 2;
685 | end
686 | 2: begin
687 | // __block_2
688 | _d_m0 = _q_m0+1;
689 | _d_m1 = -_q_m1+1;
690 | $display("%d * %d = ...",_d_m0,_d_m1);
691 | _d_index = 3;
692 | end
693 | 3: begin
694 | // __block_3
695 | _d_m0 = _q_m0+1;
696 | _d_m1 = -_q_m1-1;
697 | $display("%d * %d = ...",_d_m0,_d_m1);
698 | _d_index = 4;
699 | end
700 | 4: begin
701 | // __block_4
702 | _d_m0 = _q_m0+1;
703 | _d_m1 = -_q_m1+1;
704 | $display("%d * %d = ...",_d_m0,_d_m1);
705 | _d_index = 5;
706 | end
707 | 5: begin
708 | // __block_5
709 | _d_m0 = _q_m0+1;
710 | _d_m1 = -_q_m1-1;
711 | $display("%d * %d = ...",_d_m0,_d_m1);
712 | _d_index = 6;
713 | end
714 | 6: begin
715 | // __block_6
716 | $display("... = %d",_t_result);
717 | _d_index = 7;
718 | end
719 | 7: begin
720 | // __block_7
721 | $display("... = %d",_t_result);
722 | _d_index = 8;
723 | end
724 | 8: begin
725 | // __block_8
726 | $display("... = %d",_t_result);
727 | _d_index = 9;
728 | end
729 | 9: begin
730 | // __block_9
731 | $display("... = %d",_t_result);
732 | _d_index = 10;
733 | end
734 | 10: begin
735 | // __block_10
736 | $display("... = %d",_t_result);
737 | _d_index = 11;
738 | end
739 | 11: begin
740 | // __block_11
741 | $display("... = %d",_t_result);
742 | _d_index = 12;
743 | end
744 | 12: begin
745 | // __block_12
746 | _d_index = 13;
747 | end
748 | 13: begin // end of
749 | end
750 | default: begin
751 | _d_index = {4{1'bx}};
752 | `ifdef FORMAL
753 | assume(0);
754 | `endif
755 | end
756 | endcase
757 | // _always_post
758 | end
759 |
760 | always @(posedge clock) begin
761 | _q_m0 <= (reset) ? 0 : _d_m0;
762 | _q_m1 <= (reset) ? 0 : _d_m1;
763 | _q_leds <= (reset) ? 0 : _d_leds;
764 | _q_index <= reset ? 13 : ( ~_autorun ? 0 : _d_index);
765 | _autorun <= reset ? 0 : 1;
766 | _q_mul_im0 <= _d_mul_im0;
767 | _q_mul_im1 <= _d_mul_im1;
768 | end
769 |
770 | endmodule
771 |
772 |
--------------------------------------------------------------------------------
/designs/silice_vga_test.v:
--------------------------------------------------------------------------------
1 | `define VGA 1
2 | /*
3 |
4 | Copyright 2019, (C) Sylvain Lefebvre and contributors
5 | List contributors with: git shortlog -n -s --
6 |
7 | MIT license
8 |
9 | Permission is hereby granted, free of charge, to any person obtaining a copy of
10 | this software and associated documentation files (the "Software"), to deal in
11 | the Software without restriction, including without limitation the rights to
12 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
13 | the Software, and to permit persons to whom the Software is furnished to do so,
14 | subject to the following conditions:
15 |
16 | The above copyright notice and this permission notice shall be included in all
17 | copies or substantial portions of the Software.
18 |
19 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
20 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
21 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
22 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
23 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
24 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
25 |
26 | (header_2_M)
27 |
28 | */
29 | `define BARE 1
30 | `define COLOR_DEPTH 6
31 |
32 |
33 | module top(
34 | `ifdef VGA
35 | // VGA
36 | output out_video_clock,
37 | output reg [`COLOR_DEPTH-1:0] out_video_r,
38 | output reg [`COLOR_DEPTH-1:0] out_video_g,
39 | output reg [`COLOR_DEPTH-1:0] out_video_b,
40 | output out_video_hs,
41 | output out_video_vs,
42 | `endif
43 | // basic
44 | output [7:0] out_leds,
45 | input clock
46 | );
47 |
48 | reg [2:0] ready = 3'b111;
49 |
50 | always @(posedge clock) begin
51 | ready <= ready >> 1;
52 | end
53 |
54 | wire run_main;
55 | assign run_main = 1'b1;
56 |
57 | M_main __main(
58 | .clock(clock),
59 | .reset(ready[0]),
60 | .out_leds(out_leds),
61 | `ifdef VGA
62 | .out_video_clock(out_video_clock),
63 | .out_video_r(out_video_r),
64 | .out_video_g(out_video_g),
65 | .out_video_b(out_video_b),
66 | .out_video_hs(out_video_hs),
67 | .out_video_vs(out_video_vs),
68 | `endif
69 | .in_run(run_main)
70 | );
71 |
72 | endmodule
73 |
74 |
75 | module M_vga__vga_driver (
76 | out_vga_hs,
77 | out_vga_vs,
78 | out_active,
79 | out_vblank,
80 | out_vga_x,
81 | out_vga_y,
82 | reset,
83 | out_clock,
84 | clock
85 | );
86 | output [0:0] out_vga_hs;
87 | output [0:0] out_vga_vs;
88 | output [0:0] out_active;
89 | output [0:0] out_vblank;
90 | output [9:0] out_vga_x;
91 | output [9:0] out_vga_y;
92 | input reset;
93 | output out_clock;
94 | input clock;
95 | assign out_clock = clock;
96 | wire [9:0] _w_pix_x;
97 | wire [9:0] _w_pix_y;
98 | wire [0:0] _w_active_h;
99 | wire [0:0] _w_active_v;
100 |
101 | reg [9:0] _d_xcount = 0;
102 | reg [9:0] _q_xcount = 0;
103 | reg [9:0] _d_ycount = 0;
104 | reg [9:0] _q_ycount = 0;
105 | reg [0:0] _d_vga_hs;
106 | reg [0:0] _q_vga_hs;
107 | reg [0:0] _d_vga_vs;
108 | reg [0:0] _q_vga_vs;
109 | reg [0:0] _d_active;
110 | reg [0:0] _q_active;
111 | reg [0:0] _d_vblank;
112 | reg [0:0] _q_vblank;
113 | reg [9:0] _d_vga_x;
114 | reg [9:0] _q_vga_x;
115 | reg [9:0] _d_vga_y;
116 | reg [9:0] _q_vga_y;
117 | assign out_vga_hs = _q_vga_hs;
118 | assign out_vga_vs = _q_vga_vs;
119 | assign out_active = _q_active;
120 | assign out_vblank = _q_vblank;
121 | assign out_vga_x = _q_vga_x;
122 | assign out_vga_y = _q_vga_y;
123 |
124 |
125 | assign _w_pix_x = (_q_xcount-160);
126 | assign _w_pix_y = (_q_ycount-45);
127 | assign _w_active_h = (_q_xcount>=160&&_q_xcount<800);
128 | assign _w_active_v = (_q_ycount>=45&&_q_ycount<525);
129 |
130 | `ifdef FORMAL
131 | initial begin
132 | assume(reset);
133 | end
134 | `endif
135 | always @* begin
136 | _d_xcount = _q_xcount;
137 | _d_ycount = _q_ycount;
138 | _d_vga_hs = _q_vga_hs;
139 | _d_vga_vs = _q_vga_vs;
140 | _d_active = _q_active;
141 | _d_vblank = _q_vblank;
142 | _d_vga_x = _q_vga_x;
143 | _d_vga_y = _q_vga_y;
144 | // _always_pre
145 | _d_active = _w_active_h&&_w_active_v;
146 | _d_vga_hs = ~((_q_xcount>=16&&_q_xcount<112));
147 | _d_vga_vs = ~((_q_ycount>=10&&_q_ycount<12));
148 | _d_vblank = (_q_ycount<45);
149 | // __block_1
150 | _d_vga_x = _w_active_h ? _w_pix_x:0;
151 | _d_vga_y = _w_active_v ? _w_pix_y:0;
152 | if (_q_xcount==799) begin
153 | // __block_2
154 | // __block_4
155 | _d_xcount = 0;
156 | if (_q_ycount==524) begin
157 | // __block_5
158 | // __block_7
159 | _d_ycount = 0;
160 | // __block_8
161 | end else begin
162 | // __block_6
163 | // __block_9
164 | _d_ycount = _q_ycount+1;
165 | // __block_10
166 | end
167 | // __block_11
168 | // __block_12
169 | end else begin
170 | // __block_3
171 | // __block_13
172 | _d_xcount = _q_xcount+1;
173 | // __block_14
174 | end
175 | // __block_15
176 | // __block_16
177 | // _always_post
178 | end
179 |
180 | always @(posedge clock) begin
181 | _q_xcount <= _d_xcount;
182 | _q_ycount <= _d_ycount;
183 | _q_vga_hs <= _d_vga_hs;
184 | _q_vga_vs <= _d_vga_vs;
185 | _q_active <= _d_active;
186 | _q_vblank <= _d_vblank;
187 | _q_vga_x <= _d_vga_x;
188 | _q_vga_y <= _d_vga_y;
189 | end
190 |
191 | endmodule
192 |
193 |
194 | module M_frame_display__display (
195 | in_pix_x,
196 | in_pix_y,
197 | in_pix_active,
198 | in_pix_vblank,
199 | out_pix_red,
200 | out_pix_green,
201 | out_pix_blue,
202 | out_done,
203 | reset,
204 | out_clock,
205 | clock
206 | );
207 | input [9:0] in_pix_x;
208 | input [9:0] in_pix_y;
209 | input [0:0] in_pix_active;
210 | input [0:0] in_pix_vblank;
211 | output [5:0] out_pix_red;
212 | output [5:0] out_pix_green;
213 | output [5:0] out_pix_blue;
214 | output out_done;
215 | input reset;
216 | output out_clock;
217 | input clock;
218 | assign out_clock = clock;
219 | reg [5:0] _t_pix_red;
220 | reg [5:0] _t_pix_green;
221 | reg [5:0] _t_pix_blue;
222 |
223 | reg [2:0] _d_index,_q_index = 5;
224 | reg _autorun = 0;
225 | assign out_pix_red = _t_pix_red;
226 | assign out_pix_green = _t_pix_green;
227 | assign out_pix_blue = _t_pix_blue;
228 | assign out_done = (_q_index == 5) & _autorun;
229 |
230 |
231 |
232 | `ifdef FORMAL
233 | initial begin
234 | assume(reset);
235 | end
236 | assume property($initstate || (out_done));
237 | `endif
238 | always @* begin
239 | _d_index = _q_index;
240 | // _always_pre
241 | _t_pix_red = 0;
242 | _t_pix_green = 0;
243 | _t_pix_blue = 0;
244 | (* full_case *)
245 | case (_q_index)
246 | 0: begin
247 | // _top
248 | _d_index = 1;
249 | end
250 | 1: begin
251 | // __while__block_1
252 | if (1) begin
253 | // __block_2
254 | // __block_4
255 | _d_index = 3;
256 | end else begin
257 | _d_index = 2;
258 | end
259 | end
260 | 3: begin
261 | // __while__block_5
262 | if (in_pix_vblank==0) begin
263 | // __block_6
264 | // __block_8
265 | if (in_pix_active) begin
266 | // __block_9
267 | // __block_11
268 | _t_pix_blue = in_pix_x[4+:6];
269 | _t_pix_green = in_pix_y[4+:6];
270 | _t_pix_red = in_pix_x[1+:6];
271 | // __block_12
272 | end else begin
273 | // __block_10
274 | end
275 | // __block_13
276 | // __block_14
277 | _d_index = 3;
278 | end else begin
279 | _d_index = 4;
280 | end
281 | end
282 | 2: begin
283 | // __block_3
284 | _d_index = 5;
285 | end
286 | 4: begin
287 | // __while__block_15
288 | if (in_pix_vblank==1) begin
289 | // __block_16
290 | // __block_18
291 | // __block_19
292 | _d_index = 4;
293 | end else begin
294 | _d_index = 1;
295 | end
296 | end
297 | 5: begin // end of
298 | end
299 | default: begin
300 | _d_index = {3{1'bx}};
301 | `ifdef FORMAL
302 | assume(0);
303 | `endif
304 | end
305 | endcase
306 | // _always_post
307 | end
308 |
309 | always @(posedge clock) begin
310 | _q_index <= reset ? 5 : ( ~_autorun ? 0 : _d_index);
311 | _autorun <= reset ? 0 : 1;
312 | end
313 |
314 | endmodule
315 |
316 | module M_main (
317 | out_leds,
318 | out_video_clock,
319 | out_video_r,
320 | out_video_g,
321 | out_video_b,
322 | out_video_hs,
323 | out_video_vs,
324 | in_run,
325 | out_done,
326 | reset,
327 | out_clock,
328 | clock
329 | );
330 | output [7:0] out_leds;
331 | output [0:0] out_video_clock;
332 | output [5:0] out_video_r;
333 | output [5:0] out_video_g;
334 | output [5:0] out_video_b;
335 | output [0:0] out_video_hs;
336 | output [0:0] out_video_vs;
337 | input in_run;
338 | output out_done;
339 | input reset;
340 | output out_clock;
341 | input clock;
342 | assign out_clock = clock;
343 | wire [0:0] _w_vga_driver_vga_hs;
344 | wire [0:0] _w_vga_driver_vga_vs;
345 | wire [0:0] _w_vga_driver_active;
346 | wire [0:0] _w_vga_driver_vblank;
347 | wire [9:0] _w_vga_driver_vga_x;
348 | wire [9:0] _w_vga_driver_vga_y;
349 | wire [5:0] _w_display_pix_red;
350 | wire [5:0] _w_display_pix_green;
351 | wire [5:0] _w_display_pix_blue;
352 | wire _w_display_done;
353 | reg [7:0] _t_leds;
354 |
355 | reg [7:0] _d_frame;
356 | reg [7:0] _q_frame;
357 | reg [0:0] _d_video_clock;
358 | reg [0:0] _q_video_clock;
359 | reg [2:0] _d_index,_q_index = 7;
360 | reg _autorun = 0;
361 | assign out_leds = _t_leds;
362 | assign out_video_clock = _q_video_clock;
363 | assign out_video_r = _w_display_pix_red;
364 | assign out_video_g = _w_display_pix_green;
365 | assign out_video_b = _w_display_pix_blue;
366 | assign out_video_hs = _w_vga_driver_vga_hs;
367 | assign out_video_vs = _w_vga_driver_vga_vs;
368 | assign out_done = (_q_index == 7) & _autorun;
369 | M_vga__vga_driver vga_driver (
370 | .out_vga_hs(_w_vga_driver_vga_hs),
371 | .out_vga_vs(_w_vga_driver_vga_vs),
372 | .out_active(_w_vga_driver_active),
373 | .out_vblank(_w_vga_driver_vblank),
374 | .out_vga_x(_w_vga_driver_vga_x),
375 | .out_vga_y(_w_vga_driver_vga_y),
376 | .reset(reset),
377 | .clock(clock));
378 | M_frame_display__display display (
379 | .in_pix_x(_w_vga_driver_vga_x),
380 | .in_pix_y(_w_vga_driver_vga_y),
381 | .in_pix_active(_w_vga_driver_active),
382 | .in_pix_vblank(_w_vga_driver_vblank),
383 | .out_pix_red(_w_display_pix_red),
384 | .out_pix_green(_w_display_pix_green),
385 | .out_pix_blue(_w_display_pix_blue),
386 | .out_done(_w_display_done),
387 | .reset(reset),
388 | .clock(clock));
389 |
390 |
391 |
392 | `ifdef FORMAL
393 | initial begin
394 | assume(reset);
395 | end
396 | assume property($initstate || (out_done));
397 | `endif
398 | always @* begin
399 | _d_frame = _q_frame;
400 | _d_video_clock = _q_video_clock;
401 | _d_index = _q_index;
402 | _t_leds = 0;
403 | // _always_pre
404 | (* full_case *)
405 | case (_q_index)
406 | 0: begin
407 | // _top
408 | _d_index = 1;
409 | end
410 | 1: begin
411 | // __while__block_1
412 | if (1) begin
413 | // __block_2
414 | // __block_4
415 | _d_index = 3;
416 | end else begin
417 | _d_index = 2;
418 | end
419 | end
420 | 3: begin
421 | // __while__block_5
422 | if (_w_vga_driver_vblank==1) begin
423 | // __block_6
424 | // __block_8
425 | // __block_9
426 | _d_index = 3;
427 | end else begin
428 | _d_index = 4;
429 | end
430 | end
431 | 2: begin
432 | // __block_3
433 | _d_index = 7;
434 | end
435 | 4: begin
436 | // __block_7
437 | $display("vblank off");
438 | _d_index = 5;
439 | end
440 | 5: begin
441 | // __while__block_10
442 | if (_w_vga_driver_vblank==0) begin
443 | // __block_11
444 | // __block_13
445 | // __block_14
446 | _d_index = 5;
447 | end else begin
448 | _d_index = 6;
449 | end
450 | end
451 | 6: begin
452 | // __block_12
453 | $display("vblank on");
454 | _d_frame = _q_frame+1;
455 | // __block_15
456 | _d_index = 1;
457 | end
458 | 7: begin // end of
459 | end
460 | default: begin
461 | _d_index = {3{1'bx}};
462 | `ifdef FORMAL
463 | assume(0);
464 | `endif
465 | end
466 | endcase
467 | // _always_post
468 | end
469 |
470 | always @(posedge clock) begin
471 | _q_frame <= (reset) ? 0 : _d_frame;
472 | _q_video_clock <= _d_video_clock;
473 | _q_index <= reset ? 7 : ( ~_autorun ? 0 : _d_index);
474 | _autorun <= reset ? 0 : 1;
475 | end
476 |
477 | endmodule
478 |
479 |
--------------------------------------------------------------------------------
/designs/simple.v:
--------------------------------------------------------------------------------
1 | module counter(clock, out);
2 |
3 | input clock;
4 | output reg [8:0] out = 0;
5 |
6 | always @(posedge clock)
7 | begin
8 | out <= out + 1;
9 | end
10 |
11 | endmodule
12 |
--------------------------------------------------------------------------------
/designs/test1.si:
--------------------------------------------------------------------------------
1 | unit main(output uint8 test)
2 | {
3 | bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)};
4 |
5 | always {
6 | test = ram.rdata;
7 | ram.addr = ram.rdata;
8 | }
9 | }
10 |
--------------------------------------------------------------------------------
/designs/test2.si:
--------------------------------------------------------------------------------
1 | unit main(output uint8 test)
2 | {
3 | //bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)};
4 |
5 | always_before {
6 | test = 8hff;
7 | }
8 |
9 | algorithm {
10 | test = 8haa;
11 | while (1) {
12 | test = 1;
13 | ++:
14 | test = 2;
15 | ++:
16 | test = 3;
17 | }
18 | }
19 | }
20 |
--------------------------------------------------------------------------------
/designs/test3.si:
--------------------------------------------------------------------------------
1 | unit main(output uint8 test)
2 | {
3 | bram uint8 ram[256] = {1,2,3,4,5,6,7,8,9,0,pad(0)};
4 |
5 | always_before {
6 | test = 8hff;
7 | }
8 |
9 | algorithm {
10 | ram.addr = 0;
11 | ram.wenable = 0;
12 | while (1) {
13 | test = ram.rdata;
14 | ram.wenable = 1;
15 | ram.wdata = ram.rdata + 10;
16 | ++:
17 | test = 8hff;
18 | ram.wenable = 0;
19 | ram.addr = ram.addr > 10 ? 0 : ram.addr + 1;
20 | }
21 | }
22 | }
23 |
--------------------------------------------------------------------------------
/lut4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/lut4.png
--------------------------------------------------------------------------------
/silice_vga_test.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sylefeb/Silixel/09bb5313db3a26002615b670a8a87fed1de529a6/silice_vga_test.gif
--------------------------------------------------------------------------------
/src/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | CMAKE_MINIMUM_REQUIRED(VERSION 3.5)
2 |
3 | ADD_SUBDIRECTORY(LibSL EXCLUDE_FROM_ALL)
4 | ADD_SUBDIRECTORY(fstapi)
5 |
6 | IF(WASI)
7 | ELSE()
8 |
9 | SET(SHADERS
10 | sh_simul
11 | sh_posedge
12 | sh_outports
13 | sh_init
14 | sh_visu
15 | )
16 | AUTO_BIND_SHADERS( ${SHADERS} )
17 |
18 | ADD_EXECUTABLE(silixel
19 | silixel.cc
20 | blif.cc
21 | blif.h
22 | sh_simul.cs
23 | sh_simul.h
24 | sh_posedge.cs
25 | sh_posedge.h
26 | sh_outports.cs
27 | sh_outports.h
28 | sh_init.cs
29 | sh_init.h
30 | sh_visu.fp
31 | sh_visu.vp
32 | sh_visu.h
33 | simul_cpu.cc
34 | simul_cpu.h
35 | simul_gpu.cc
36 | simul_gpu.h
37 | read.cc
38 | read.h
39 | analyze.cc
40 | analyze.h
41 | )
42 |
43 | IF(LINUX)
44 | TARGET_LINK_LIBRARIES(silixel LibSL LibSL_gl4core freeglut)
45 | ELSE()
46 | TARGET_LINK_LIBRARIES(silixel LibSL LibSL_gl4core)
47 | ENDIF()
48 |
49 | ENDIF()
50 |
51 | ADD_DEFINITIONS(-DSRC_PATH=\"${CMAKE_SOURCE_DIR}\")
52 |
53 | ADD_EXECUTABLE(silixel_cpu
54 | silixel_cpu.cc
55 | simul_cpu.cc
56 | simul_cpu.h
57 | blif.cc
58 | blif.h
59 | read.cc
60 | read.h
61 | analyze.cc
62 | analyze.h
63 | wasi.cc
64 | )
65 |
66 | TARGET_LINK_LIBRARIES(silixel_cpu LibSL fstapi)
67 |
68 | # install and paths
69 |
70 | install(TARGETS silixel_cpu RUNTIME DESTINATION bin/)
71 |
--------------------------------------------------------------------------------
/src/analyze.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /* ---------------------------------------------------------------------
3 |
4 | Analyzes the design, determines the 'depth' of each LUT by propagating
5 | from the Q outputs (depth 0). The LUT depth is 1 + the max of its input depths.
6 | Within a clock cycle:
7 | - LUTs of lower depth are not influenced by LUTs of higher depth.
8 | - LUTs at the same depth do not influence each other.
9 | The LUTs are then sorted by depth and the data structure reordered.
10 |
11 | ----------------------------------------------------------------------- */
12 | /*
13 | BSD 3-Clause License
14 |
15 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
16 | All rights reserved.
17 |
18 | Redistribution and use in source and binary forms, with or without
19 | modification, are permitted provided that the following conditions are met:
20 |
21 | 1. Redistributions of source code must retain the above copyright notice, this
22 | list of conditions and the following disclaimer.
23 |
24 | 2. Redistributions in binary form must reproduce the above copyright notice,
25 | this list of conditions and the following disclaimer in the documentation
26 | and/or other materials provided with the distribution.
27 |
28 | 3. Neither the name of the copyright holder nor the names of its
29 | contributors may be used to endorse or promote products derived from
30 | this software without specific prior written permission.
31 |
32 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
33 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
34 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
35 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
36 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
37 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
38 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
39 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
40 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
41 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
42 | */
43 |
44 | #include <LibSL/LibSL.h>
45 |
46 | #include <algorithm>
47 | #include <cstdio>
48 | #include <cstdlib>
49 | #include <limits>
50 | #include <map>
51 | #include <set>
52 | #include <string>
53 | #include <utility>
54 | #include <vector>
55 |
56 |
57 | using namespace std;
58 |
59 | #include "read.h"
60 | #include "analyze.h"
61 |
62 | // -----------------------------------------------------------------------------
63 |
64 | // Propagates depths through the network (from all Q at depth 0).
65 | // Returns whether something changed.
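66 | bool analyzeStep(const vector<t_lut>& luts,vector<int>& _depths)
67 | {
68 | bool changed = false;
69 | for (int l=0;l<luts.size();++l) {
70 | // the depth of a LUT is 1 + the max depth of its connected inputs
71 | int new_value = 0;
72 | for (int i=0;i<4;++i) {
73 | // unconnected inputs are marked with -1
74 | if (luts[l].inputs[i] > -1) {
75 | int other_value = 0;
76 | if ((luts[l].inputs[i] & 1) == 0) {
77 | other_value = _depths[(luts[l].inputs[i] >> 1)];
78 | }
79 | if (other_value < std::numeric_limits<int>::max()) {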
80 | ++other_value;
81 | }
82 | new_value = max(new_value , other_value);
83 | }
84 | }
85 | // update output depth if changed
86 | if (_depths[l] != new_value) {
87 | _depths[l] = new_value;
88 | changed = true;
89 | }
90 | }
91 | return changed;
92 | }
93 |
94 | // -----------------------------------------------------------------------------
95 |
96 | // Performs an analysis of the design, computing the combinational depth
97 | // of all LUTs
98 | void analyze(
99 | vector<t_lut>& _luts,
100 | vector<t_bram>& _brams,
101 | vector<pair<string,int> >& _outbits,
102 | map<string,int>& _indices,
103 | vector<int>& _ones,
104 | vector<int>& _step_starts,
105 | vector<int>& _step_ends,
106 | vector<int>& _depths)
107 | {
108 | vector<int> lut_depths;
109 | lut_depths.resize(_luts.size(),std::numeric_limits<int>::max());
110 | /// iterate the analysis step
111 | // propagates combinational depth from Q outputs
112 | bool changed = true;
113 | int maxiter = 1024;
114 | while (changed && maxiter-- > 0) {
115 | changed = analyzeStep(_luts, lut_depths);
116 | }
117 | if (maxiter <= 0) {
118 | fprintf(stderr, "cannot perform analysis, combinational loop in design?");
119 | exit(-1);
120 | }
121 | // reorder by increasing depth
122 | vector<pair<int,int> > source;
123 | source.resize(_luts.size());
124 | int max_depth = 0;
125 | for (int l = 0; l < _luts.size(); ++l) {
126 | source[l] = make_pair(lut_depths[l], l); // depth,id
127 | if (lut_depths[l] < std::numeric_limits<int>::max()) {
128 | max_depth = max(max_depth, lut_depths[l]);
129 | }
130 | }
131 | if (max_depth == 0) {
132 | fprintf(stderr, "analysis failed (why?)");
133 | exit(-1);
134 | }
135 | /// determine const LUTs based on initialization
136 | // const LUTs are placed at depth 0, which is not simulated
137 | // we can only consider const if the inputs were not initialized, otherwise
138 | // there may be an on-purpose cascade of FF from the initialization point
139 | set<int> with_init;
140 | for (auto one : _ones) {
141 | with_init.insert(one);
142 | }
143 | // promote 0-depth cells with init to 1-depth (non const)
144 | for (int l = 0; l < _luts.size(); ++l) {
145 | if (source[l].first == 0) {
146 | if (with_init.count((l << 1) + 0) || with_init.count((l << 1) + 1)) {
147 | source[l].first = 1;
148 | break;
149 | }
150 | }
151 | }
152 | // convert d-depth cells using only 0-depth const cells as 0-depth (const)
153 | for (int depth = 1; depth <= max_depth; depth++) {
154 | for (int l = 0; l < _luts.size(); ++l) {
155 | if (source[l].first == depth) {
156 | bool no_init_input = true;
157 | for (int i = 0; i < 4; ++i) {
158 | if (_luts[l].inputs[i] > -1) {
159 | if (with_init.count(_luts[l].inputs[i]) != 0) {
160 | no_init_input = false; break;
161 | }
162 | if (_luts[_luts[l].inputs[i] >> 1].external) {
163 | no_init_input = false; break;
164 | }
165 | }
166 | }
167 | if (no_init_input) {
168 | // now we check that all inputs are 0-depth
169 | bool all_inputs_0depth = true;
170 | for (int i = 0; i < 4; ++i) {
171 | if (_luts[l].inputs[i] > -1) {
172 | int idepth = source[_luts[l].inputs[i] >> 1].first;
173 | if (idepth > 0) {
174 | all_inputs_0depth = false;
175 | }
176 | }
177 | }
178 | if (all_inputs_0depth) {
179 | source[l].first = 0;
180 | }
181 | }
182 | }
183 | }
184 | }
185 | // update max_depth
186 | max_depth = 0;
187 | for (int l = 0; l < _luts.size(); ++l) {
188 | max_depth = max(max_depth, source[l].first);
189 | }
190 | #if 0
191 | // debug: output full list of LUTs
192 | for (int l = 0; l < _luts.size(); ++l) {
193 | int i0d = _luts[l].inputs[0] < 0 ? 999 : (_luts[l].inputs[0] & 1 ? 999 : source[_luts[l].inputs[0] >> 1].first);
194 | int i1d = _luts[l].inputs[1] < 0 ? 999 : (_luts[l].inputs[1] & 1 ? 999 : source[_luts[l].inputs[1] >> 1].first);
195 | int i2d = _luts[l].inputs[2] < 0 ? 999 : (_luts[l].inputs[2] & 1 ? 999 : source[_luts[l].inputs[2] >> 1].first);
196 | int i3d = _luts[l].inputs[3] < 0 ? 999 : (_luts[l].inputs[3] & 1 ? 999 : source[_luts[l].inputs[3] >> 1].first);
197 | fprintf(stderr, "LUT %d, depth %d min input depths: %d\n", l<<1, source[l].first, min(min(i0d, i1d), min(i2d, i3d)));
198 | }
199 | #endif
200 | // sort by depth
201 | sort(source.begin(),source.end());
202 | // build the reordering arrays
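203 | vector<int> reorder;
204 | vector<int> inv_reorder;
205 | reorder .resize(_luts.size());
206 | inv_reorder .resize(_luts.size());
207 | _depths .resize(_luts.size());
208 | _step_starts.resize(max_depth+1,std::numeric_limits<int>::max());
209 | _step_ends .resize(max_depth+1,0);
210 | for (int o=0;o<_luts.size();++o) {
211 | reorder [o] = source[o].second; // new position -> original LUT
212 | inv_reorder[source[o].second] = o; // original LUT -> new position
213 | _depths [o] = source[o].first;
214 | if (source[o].first < std::numeric_limits<int>::max()) {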
215 | _step_starts[source[o].first] = min(_step_starts[source[o].first],o);
216 | _step_ends [source[o].first] = max(_step_ends [source[o].first],o);
217 | }
218 | }
219 | // reorder the LUTs
220 | vector<t_lut> init_luts = _luts;
221 | reorderLUTs(init_luts, reorder, inv_reorder, _luts, _brams, _outbits, _indices, _ones);
222 | #if 0
223 | // print report
224 | fprintf(stderr,"analysis done\n");
225 | for (int d=0;d<_step_starts.size();++d) {
226 | fprintf(stderr,"depth %3d on luts %6d-%6d (%6d/%6d)\n",
227 | d,_step_starts[d],_step_ends[d],
228 | _step_ends[d] - _step_starts[d] + 1,
229 | (int)_luts.size());
230 | }
231 | #endif
232 | }
233 |
234 | // -----------------------------------------------------------------------------
235 |
236 | // Reorders the LUT datastructure based on input reordering arrays
237 | void reorderLUTs(
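238 | const vector<t_lut>& init_luts,
239 | const vector<int>& reorder,
240 | const vector<int>& inv_reorder,
241 | vector<t_lut>& _luts,
242 | vector<t_bram>& _brams,
243 | vector<pair<string,int> >& _outbits,
244 | map<string,int>& _indices,
245 | vector<int>& _ones)
246 | {
247 | /// apply the reordering
248 | // -> luts
249 | _luts.resize(init_luts.size());
250 | for (int o=0;o<_luts.size();++o) {
251 | // original LUT that moves to position o
252 | int l = reorder[o];
253 | // copy the LUT, its inputs are remapped below
254 | _luts[o] = init_luts[l];
255 | for (int i=0;i<4;++i) {
256 | if (init_luts[l].inputs[i] > -1) {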
257 | int reg = init_luts[l].inputs[i] &1;
258 | int src = init_luts[l].inputs[i]>>1;
259 | _luts[o].inputs[i] = (inv_reorder[src]<<1) | reg;
260 | } else {
261 | _luts[o].inputs[i] = -1;
262 | }
263 | }
264 | }
265 | // -> bits
266 | for (int b = 0; b < _outbits.size(); ++b) {
267 | int reg = _outbits[b].second &1;
268 | int src = _outbits[b].second>>1;
269 | _outbits[b].second = (inv_reorder[src]<<1) | reg;
270 | }
271 | // -> indices
272 | for (auto& idc : _indices) {
273 | int reg = idc.second & 1;
274 | int src = idc.second >> 1;
275 | idc.second = (inv_reorder[src] << 1) | reg;
276 | }
277 | // -> brams
278 | for (auto& b : _brams) {
279 | vector<vector<int>*> vecs;
280 | vecs.push_back(&b.rd_addr);
281 | vecs.push_back(&b.rd_data);
282 | vecs.push_back(&b.wr_addr);
283 | vecs.push_back(&b.wr_data);
284 | vecs.push_back(&b.wr_en);
285 | for (auto vptr : vecs) {
286 | for (int i=0;i < (int)vptr->size();++i) {
287 | int reg = vptr->at(i) & 1;
288 | int src = vptr->at(i) >> 1;
289 | vptr->at(i) = (inv_reorder[src] << 1) | reg;
290 | }
291 | }
292 | }
293 | // -> ones (init)
294 | for (int b = 0; b < _ones.size(); ++b) {
295 | int reg = _ones[b] & 1;
296 | int src = _ones[b] >> 1;
297 | _ones[b] = (inv_reorder[src] << 1) | reg;
298 | }
299 | }
300 |
301 | // -----------------------------------------------------------------------------
302 |
303 |
304 | // Builds a data structure representing the fanout of each LUT: the list
305 | // of LUTs that use it as an input. This is used to only simulate the LUTs
306 | // whose inputs have changed at each depth.
307 | void buildFanout(
308 | vector<t_lut>& _luts,
309 | vector<int>& _fanout)
310 | {
311 | // build fanout
312 | vector<set<int> > fanouts;
313 | fanouts.resize(_luts.size());
314 | for (int l = 0; l < _luts.size(); ++l) {
315 | ForIndex(i, 4) {
316 | if (_luts[l].inputs[i] > -1) {
317 | int lut_in = _luts[l].inputs[i] >> 1;
318 | int lut_in_q_else_d = _luts[l].inputs[i] & 1;
319 | fanouts[lut_in].insert((l << 1) | lut_in_q_else_d);
320 | }
321 | }
322 | }
323 | // -> flatten in output
324 | int totsz = 0;
325 | for (const auto& fo : fanouts) {
326 | totsz += (int)fo.size() + 1;
327 | }
328 | _fanout.reserve(_luts.size() /*header, 1 index per lut*/ + totsz);
329 | int rsum = (int)_luts.size();
330 | for (const auto& fo : fanouts) {
331 | _fanout.push_back(rsum);
332 | rsum += (int)fo.size() + 1;
333 | }
334 | for (const auto& fo : fanouts) {
335 | for (auto l : fo) {
336 | _fanout.push_back(l);
337 | }
338 | _fanout.push_back(-1);
339 | }
340 | sl_assert(_fanout.size() == _luts.size() + totsz);
341 | }
342 |
343 | // -----------------------------------------------------------------------------
344 |
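The depth analysis described at the top of this file can be sketched in isolation. The snippet below is a minimal standalone illustration, not the code used by the simulator: `Lut` and `computeDepths` are hypothetical names, and unlike `analyzeStep` it starts every depth at 0 rather than at a max sentinel and bounds the number of passes by the LUT count.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the simulator's LUT record. Each input is -1 when
// unused, otherwise (source_lut << 1) | q, where q = 1 selects the registered
// (Q) output and q = 0 the combinational (D) output.
struct Lut { std::array<int,4> inputs; };

// Returns the combinational depth of every LUT, or an empty vector when no
// fixed point is reached (which would indicate a combinational loop).
std::vector<int> computeDepths(const std::vector<Lut>& luts)
{
  std::vector<int> depths(luts.size(), 0);
  // a loop-free network of n LUTs converges within n+1 passes
  for (std::size_t pass = 0; pass <= luts.size(); ++pass) {
    bool changed = false;
    for (std::size_t l = 0; l < luts.size(); ++l) {
      int d = 0;
      for (int in : luts[l].inputs) {
        if (in < 0) continue;                      // unused input
        assert((in >> 1) < (int)luts.size());
        int src = (in & 1) ? 0 : depths[in >> 1];  // Q outputs count as depth 0
        d = std::max(d, src + 1);
      }
      if (depths[l] != d) { depths[l] = d; changed = true; }
    }
    if (!changed) return depths;
  }
  return {};
}
```

With this convention a LUT that only reads Q outputs ends up at depth 1, and a LUT with no connected input stays at depth 0.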
--------------------------------------------------------------------------------
/src/analyze.h:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #pragma once
35 |
36 | #include <LibSL/LibSL.h>
37 | using namespace std;
38 |
39 | void analyze(
40 | std::vector<t_lut>& _luts,
41 | std::vector<t_bram>& _brams,
42 | std::vector<std::pair<std::string,int> >& _outbits,
43 | std::map<std::string,int>& _indices,
44 | std::vector<int>& _ones,
45 | std::vector<int>& _step_starts,
46 | std::vector<int>& _step_ends,
47 | std::vector<int>& _depths);
48 |
49 | void reorderLUTs(
50 | const std::vector<t_lut>& init_luts,
51 | const std::vector<int>& reorder,
52 | const std::vector<int>& inv_reorder,
53 | std::vector<t_lut>& _luts,
54 | std::vector<t_bram>& _brams,
55 | std::vector<std::pair<std::string,int> >& _outbits,
56 | std::map<std::string,int>& _indices,
57 | std::vector<int>& _ones);
58 |
59 | void buildFanout(
60 | std::vector<t_lut>& _luts,
61 | std::vector<int>& _fanout);
62 |
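The flattened fanout array produced by `buildFanout` has a simple layout: one header entry per LUT giving the start offset of its list, followed by the lists themselves, each terminated by -1. The sketch below reproduces that layout on plain sets and shows how a list could be walked. `flattenFanout` and `fanoutOf` are hypothetical helper names used for illustration only.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Flattens per-LUT fanout sets into one array: n header entries holding the
// start offset of each list, then the lists, each terminated by -1.
std::vector<int> flattenFanout(const std::vector<std::set<int>>& fanouts)
{
  std::vector<int> flat;
  int offset = (int)fanouts.size();          // lists start right after the header
  for (const auto& fo : fanouts) {
    flat.push_back(offset);
    offset += (int)fo.size() + 1;            // +1 for the -1 terminator
  }
  for (const auto& fo : fanouts) {
    flat.insert(flat.end(), fo.begin(), fo.end());
    flat.push_back(-1);
  }
  return flat;
}

// Collects the fanout entries of LUT `lut` by walking its list up to -1.
std::vector<int> fanoutOf(const std::vector<int>& flat, int lut)
{
  assert(lut >= 0 && lut < (int)flat.size());
  std::vector<int> out;
  for (int i = flat[lut]; flat[i] != -1; ++i) {
    out.push_back(flat[i]);
  }
  return out;
}
```

Storing offsets and terminators in a single array keeps the structure in one contiguous buffer, which is convenient when the same data has to be handed to a shader.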
--------------------------------------------------------------------------------
/src/blif.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-08
2 | /*
3 |
4 | Simple BLIF file parser, nothing special.
5 | Reads the inputs, outputs, gates and latches into a t_blif struct.
6 |
7 | */
8 | /*
9 | BSD 3-Clause License
10 |
11 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
12 | All rights reserved.
13 |
14 | Redistribution and use in source and binary forms, with or without
15 | modification, are permitted provided that the following conditions are met:
16 |
17 | 1. Redistributions of source code must retain the above copyright notice, this
18 | list of conditions and the following disclaimer.
19 |
20 | 2. Redistributions in binary form must reproduce the above copyright notice,
21 | this list of conditions and the following disclaimer in the documentation
22 | and/or other materials provided with the distribution.
23 |
24 | 3. Neither the name of the copyright holder nor the names of its
25 | contributors may be used to endorse or promote products derived from
26 | this software without specific prior written permission.
27 |
28 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
29 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
30 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
31 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
32 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
33 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
34 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
35 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
36 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
37 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
38 | */
39 |
40 | #include "blif.h"
41 | #include <sstream>
42 |
43 | // -------------------------------------------------------------------
44 |
45 | using namespace std;
46 |
47 | // -------------------------------------------------------------------
48 |
49 | void readList(
50 | LibSL::BasicParser::Parser<LibSL::BasicParser::FileStream>& parser,
51 | vector<string>& _list)
52 | {
53 | while (1) {
54 | parser.skipSpaces();
55 | char next = parser.readChar(false);
56 | if (next == '\n') {
57 | break;
58 | } else {
59 | string name = parser.readString();
60 | _list.emplace_back(name);
61 | }
62 | }
63 | }
64 |
65 | // -------------------------------------------------------------------
66 |
67 | void readConfig(
68 | LibSL::BasicParser::Parser<LibSL::BasicParser::FileStream>& parser,
69 | pair<string,string>& _cfgs)
70 | {
71 | string vals = parser.readString();
72 | string out = parser.readString();
73 | if (out.empty()) { std::swap(vals,out); }
74 | _cfgs = make_pair(vals, out);
75 | }
76 |
77 | // -------------------------------------------------------------------
78 |
79 | ushort lut_config(const std::vector<pair<std::string,std::string> >& config_strings)
80 | {
81 | ushort cfg = 0;
82 | for (auto cs : config_strings) {
83 | if (cs.second == "1") { // probably always the case, defaults to 0
84 | ForIndex(c, 16) { // for each of 16 configs
85 | bool accept = true;
86 | ForIndex(j, cs.first.length()) {
87 | if (cs.first[cs.first.length()-1-j] == '1') {
88 | if (!(c & (1 << j))) { accept = false; break; }
89 | } else {
90 | if ( c & (1 << j) ) { accept = false; break; }
91 | }
92 | }
93 | if (accept) {
94 | cfg |= (1 << c);
95 | }
96 | }
97 | }
98 | }
99 | return cfg;
100 | }
101 |
102 | // -------------------------------------------------------------------
103 |
104 | void split(const std::string& s, char delim, std::vector<std::string>& elems)
105 | {
106 | std::stringstream ss(s);
107 | std::string item;
108 | while (getline(ss, item, delim)) {
109 | elems.push_back(item);
110 | }
111 | }
112 |
113 | // -------------------------------------------------------------------
114 |
115 | void parse(const char *fname, t_blif& _blif)
116 | {
117 |
118 | LibSL::BasicParser::FileStream stream(fname);
119 | LibSL::BasicParser::Parser<LibSL::BasicParser::FileStream> parser(stream,false);
120 |
121 | bool in_subckt = false;
122 | bool in_bram = false;
123 |
124 | fprintf(stderr, "Parsing ... ");
125 | Console::processingInit();
126 | while (!parser.eof()) {
127 | parser.skipSpaces();
128 | char first = parser.readChar(false);
129 | if (first == '#') {
130 | // skip comment
131 | } else if (first == '.') {
132 | string type = parser.readString();
133 | if (type == ".model") {
134 | in_subckt = false;
135 | string name = parser.readString();
136 | } else if (type == ".inputs") {
137 | in_subckt = false;
138 | readList(parser, _blif.inputs);
139 | } else if (type == ".outputs") {
140 | in_subckt = false;
141 | readList(parser, _blif.outputs);
142 | } else if (type == ".names") {
143 | in_subckt = false;
144 | vector<string> ios;
145 | readList(parser, ios);
146 | _blif.gates.push_back(t_gate_nfo());
147 | if (!ios.empty()) {
148 | _blif.gates.back().output = ios.back();
149 | for (int i = 0; i < (int)ios.size() - 1; ++i) {
150 | _blif.gates.back().inputs.push_back(ios[i]);
151 | }
152 | }
153 | } else if (type == ".latch") {
154 | in_subckt = false;
155 | _blif.latches.push_back(t_latch_nfo());
156 | vector<string> nfos;
157 | readList(parser, nfos);
158 | sl_assert(nfos.size() == 5);
159 | _blif.latches.back().input = nfos[0];
160 | _blif.latches.back().output = nfos[1];
161 | _blif.latches.back().init = nfos[4];
162 | } else if (type == ".subckt") {
163 | in_subckt = true;
164 | in_bram = false;
165 | string type = parser.readString();
166 | if (type == "$mem_v2") {
167 | in_bram = true;
168 | _blif.brams.push_back(t_bram_nfo());
169 | vector<string> bindings;
170 | readList(parser, bindings);
171 | for (auto b : bindings) {
172 | std::vector<std::string> left_right;
173 | split(b,'=',left_right);
174 | if (left_right.size() != 2) {
175 | fprintf(stderr," cannot interpret binding %s\n",b.c_str());
176 | } else {
177 | _blif.brams.back().bindings[left_right[0]] = left_right[1];
178 | // fprintf(stderr,"%s = %s\n",left_right[0].c_str(),left_right[1].c_str());
179 | }
180 | }
181 | }
182 | } else if (type == ".param") {
183 | if (in_subckt && in_bram) {
184 | string param;
185 | param = parser.readString();
186 | if (param == "MEMID") {
187 | string id;
188 | id = parser.readString();
189 | _blif.brams.back().name = id;
190 | } else if (param == "INIT") {
191 | // read init bits and store
192 | parser.skipSpaces();
193 | uint b = 0;
194 | while (1) {
195 | char next = parser.readChar(false);
196 | if (next == '\n') {
197 | break;
198 | } else {
199 | char bit = parser.readChar(true);
200 | _blif.brams.back().data.set(b, (bit == '1'));
201 | ++b;
202 | }
203 | }
204 | // fprintf(stderr, "read %d init bits\n", _blif.brams.back().data.bitsize());
205 | } else {
206 | // read value (max 32 bits)
207 | uint32_t v = 0;
208 | parser.skipSpaces();
209 | while (1) {
210 | char next = parser.readChar(false);
211 | if (next == '\n') {
212 | break;
213 | } else {
214 | char bit = parser.readChar(true);
215 | if (bit == '1') {
216 | v = (v << 1) | 1;
217 | } else {
218 | v = v << 1;
219 | }
220 | }
221 | }
222 | if (param == "ABITS") {
223 | _blif.brams.back().addr_width = v;
224 | } else if (param == "SIZE") {
225 | _blif.brams.back().size = v;
226 | } else if (param == "WIDTH") {
227 | _blif.brams.back().data_width = v;
228 | } else {
229 | // TODO: check num ports, etc
230 | }
231 | }
232 | }
233 | }
234 | } else if (first == '0' || first == '1' || first == '-') {
235 | // read configuration
236 | _blif.gates.back().config_strings.push_back(pair<string,string>());
237 | readConfig(parser, _blif.gates.back().config_strings.back());
238 | }
239 | // skip to next line
240 | parser.reachChar('\n');
241 | Console::processingUpdate();
242 | }
243 | Console::processingEnd();
244 | fprintf(stderr, " done.\n");
245 |
246 | }
247 |
248 | // -------------------------------------------------------------------
249 |
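The expansion performed by `lut_config` turns each cover row of a `.names` gate into bits of a 16-entry truth table, with the last character of the row mapped to input bit 0. A minimal standalone sketch of the idea follows. `rowToTruthTable` and `evalLut` are hypothetical names, and treating `-` as a don't-care is an assumption of this sketch (the function above treats any character other than `1` as requiring a 0).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Expands one BLIF cover row (e.g. "11" for a 2-input AND) into a 16-bit
// truth table. Bit c of the result is set when input combination c matches.
// Assumption of this sketch: '-' matches both 0 and 1.
uint16_t rowToTruthTable(const std::string& row)
{
  assert(row.size() <= 4);                   // LUT4: at most four inputs
  uint16_t cfg = 0;
  for (int c = 0; c < 16; ++c) {
    bool accept = true;
    for (std::size_t j = 0; j < row.size(); ++j) {
      char ch  = row[row.size() - 1 - j];    // last character is input bit 0
      bool bit = ((c >> j) & 1) != 0;
      if ((ch == '1' && !bit) || (ch == '0' && bit)) { accept = false; break; }
    }
    if (accept) cfg |= (uint16_t)(1u << c);
  }
  return cfg;
}

// Evaluates the LUT: the four input bits select one entry of the table.
bool evalLut(uint16_t cfg, int inputs)
{
  return ((cfg >> (inputs & 15)) & 1) != 0;
}
```

Rows whose output column is `1` would be OR-ed together to obtain the full configuration of a gate, as the loop over `config_strings` does above.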
--------------------------------------------------------------------------------
/src/blif.h:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-08
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #pragma once
35 |
36 | #include <LibSL/LibSL.h>
37 |
38 | #include <map>
39 | #include <string>
40 | #include <vector>
41 | #include "uintX.h"
42 |
43 | typedef struct {
44 | std::string input;
45 | std::string output;
46 | std::string init;
47 | } t_latch_nfo;
48 |
49 | typedef struct {
50 | std::vector<std::string> inputs;
51 | std::string output;
52 | std::vector<std::pair<std::string,std::string> > config_strings;
53 | } t_gate_nfo;
54 |
55 | typedef struct {
56 | std::map<std::string,std::string> bindings;
57 | std::string name;
58 | int size;
59 | int addr_width;
60 | int data_width;
61 | uintX data;
62 | } t_bram_nfo;
63 |
64 | typedef struct {
65 | std::vector<std::string> inputs;
66 | std::vector<std::string> outputs;
67 | std::vector<t_latch_nfo> latches;
68 | std::vector<t_gate_nfo> gates;
69 | std::vector<t_bram_nfo> brams;
70 | } t_blif;
71 |
72 | /// Parses a blif file
73 | void parse(const char *fname, t_blif& _blif);
74 |
75 | /// Returns an integer representing the LUT configuration provided as strings
76 | ushort lut_config(const std::vector<std::pair<std::string,std::string> >& config_strings);
77 |
78 | /// Utilities
79 | // splits a string using a delimiter
80 | void split(const std::string &s, char delim, std::vector<std::string> &elems);
81 |
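The `split` utility is what turns the `pin=net` tokens of a `.subckt` line into the `bindings` map of `t_bram_nfo`. A small standalone sketch of that step is given below. `splitOn`, `parseBindings` and `netOf` are hypothetical names, and the net names in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Splits `s` on `delim`, returning the pieces in order.
std::vector<std::string> splitOn(const std::string& s, char delim)
{
  std::vector<std::string> elems;
  std::stringstream ss(s);
  std::string item;
  while (std::getline(ss, item, delim)) {
    elems.push_back(item);
  }
  return elems;
}

// Builds a pin -> net map from "pin=net" tokens; tokens that do not split
// into exactly two parts are skipped.
std::map<std::string,std::string> parseBindings(const std::vector<std::string>& tokens)
{
  std::map<std::string,std::string> bindings;
  for (const auto& t : tokens) {
    std::vector<std::string> lr = splitOn(t, '=');
    if (lr.size() == 2) {
      bindings[lr[0]] = lr[1];
    }
  }
  return bindings;
}

// Returns the net bound to `pin`; the pin is expected to be present.
const std::string& netOf(const std::map<std::string,std::string>& b, const std::string& pin)
{
  auto it = b.find(pin);
  assert(it != b.end());
  return it->second;
}
```

For instance, `parseBindings({"RD_ADDR[0]=net_a", "WR_EN[0]=net_b"})` would yield a map with two entries, one per memory pin.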
--------------------------------------------------------------------------------
/src/fstapi/CMakeLists.txt:
--------------------------------------------------------------------------------
1 | cmake_minimum_required(VERSION 3.5)
2 | project(fstapi)
3 |
4 | INCLUDE_DIRECTORIES(
5 | ${PROJECT_SOURCE_DIR}/
6 | ${PROJECT_SOURCE_DIR}/../LibSL/src/libs/src/zlib/
7 | )
8 |
9 | ADD_LIBRARY(fstapi
10 | fastlz.c
11 | fastlz.h
12 | lz4.c
13 | lz4.h
14 | fstapi.c
15 | fstapi.h
16 | )
17 |
--------------------------------------------------------------------------------
/src/fstapi/fastlz.c:
--------------------------------------------------------------------------------
1 | /*
2 | FastLZ - lightning-fast lossless compression library
3 |
4 | Copyright (C) 2007 Ariya Hidayat (ariya@kde.org)
5 | Copyright (C) 2006 Ariya Hidayat (ariya@kde.org)
6 | Copyright (C) 2005 Ariya Hidayat (ariya@kde.org)
7 |
8 | Permission is hereby granted, free of charge, to any person obtaining a copy
9 | of this software and associated documentation files (the "Software"), to deal
10 | in the Software without restriction, including without limitation the rights
11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
12 | copies of the Software, and to permit persons to whom the Software is
13 | furnished to do so, subject to the following conditions:
14 |
15 | The above copyright notice and this permission notice shall be included in
16 | all copies or substantial portions of the Software.
17 |
18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
24 | THE SOFTWARE.
25 |
26 | SPDX-License-Identifier: MIT
27 | */
28 |
29 | #include "fastlz.h"
30 |
31 | #if !defined(FASTLZ__COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR)
32 |
33 | /*
34 | * Always check for bound when decompressing.
35 | * Generally it is best to leave it defined.
36 | */
37 | #define FASTLZ_SAFE
38 |
39 |
40 | /*
41 | * Give hints to the compiler for branch prediction optimization.
42 | */
43 | #if defined(__GNUC__) && (__GNUC__ > 2)
44 | #define FASTLZ_EXPECT_CONDITIONAL(c) (__builtin_expect((c), 1))
45 | #define FASTLZ_UNEXPECT_CONDITIONAL(c) (__builtin_expect((c), 0))
46 | #else
47 | #define FASTLZ_EXPECT_CONDITIONAL(c) (c)
48 | #define FASTLZ_UNEXPECT_CONDITIONAL(c) (c)
49 | #endif
50 |
51 | /*
52 | * Use inlined functions for supported systems.
53 | */
54 | #if defined(__GNUC__) || defined(__DMC__) || defined(__POCC__) || defined(__WATCOMC__) || defined(__SUNPRO_C)
55 | #define FASTLZ_INLINE inline
56 | #elif defined(__BORLANDC__) || defined(_MSC_VER) || defined(__LCC__)
57 | #define FASTLZ_INLINE __inline
58 | #else
59 | #define FASTLZ_INLINE
60 | #endif
61 |
62 | /*
63 | * Prevent accessing more than 8-bit at once, except on x86 architectures.
64 | */
65 | #if !defined(FASTLZ_STRICT_ALIGN)
66 | #define FASTLZ_STRICT_ALIGN
67 | #if defined(__i386__) || defined(__386) /* GNU C, Sun Studio */
68 | #undef FASTLZ_STRICT_ALIGN
69 | #elif defined(__i486__) || defined(__i586__) || defined(__i686__) || defined(__amd64) /* GNU C */
70 | #undef FASTLZ_STRICT_ALIGN
71 | #elif defined(_M_IX86) /* Intel, MSVC */
72 | #undef FASTLZ_STRICT_ALIGN
73 | #elif defined(__386)
74 | #undef FASTLZ_STRICT_ALIGN
75 | #elif defined(_X86_) /* MinGW */
76 | #undef FASTLZ_STRICT_ALIGN
77 | #elif defined(__I86__) /* Digital Mars */
78 | #undef FASTLZ_STRICT_ALIGN
79 | #endif
80 | #endif
81 |
82 | /* prototypes */
83 | int fastlz_compress(const void* input, int length, void* output);
84 | int fastlz_compress_level(int level, const void* input, int length, void* output);
85 | int fastlz_decompress(const void* input, int length, void* output, int maxout);
86 |
87 | #define MAX_COPY 32
88 | #define MAX_LEN 264 /* 256 + 8 */
89 | #define MAX_DISTANCE 8192
90 |
91 | #if !defined(FASTLZ_STRICT_ALIGN)
92 | #define FASTLZ_READU16(p) *((const flzuint16*)(p))
93 | #else
94 | #define FASTLZ_READU16(p) ((p)[0] | (p)[1]<<8)
95 | #endif
96 |
97 | #define HASH_LOG 13
98 | #define HASH_SIZE (1<< HASH_LOG)
99 | #define HASH_MASK (HASH_SIZE-1)
100 | #define HASH_FUNCTION(v,p) { v = FASTLZ_READU16(p); v ^= FASTLZ_READU16(p+1)^(v>>(16-HASH_LOG));v &= HASH_MASK; }
101 |
102 | #undef FASTLZ_LEVEL
103 | #define FASTLZ_LEVEL 1
104 |
105 | #undef FASTLZ_COMPRESSOR
106 | #undef FASTLZ_DECOMPRESSOR
107 | #define FASTLZ_COMPRESSOR fastlz1_compress
108 | #define FASTLZ_DECOMPRESSOR fastlz1_decompress
109 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output);
110 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout);
111 | #include "fastlz.c"
112 |
113 | #undef FASTLZ_LEVEL
114 | #define FASTLZ_LEVEL 2
115 |
116 | #undef MAX_DISTANCE
117 | #define MAX_DISTANCE 8191
118 | #define MAX_FARDISTANCE (65535+MAX_DISTANCE-1)
119 |
120 | #undef FASTLZ_COMPRESSOR
121 | #undef FASTLZ_DECOMPRESSOR
122 | #define FASTLZ_COMPRESSOR fastlz2_compress
123 | #define FASTLZ_DECOMPRESSOR fastlz2_decompress
124 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output);
125 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout);
126 | #include "fastlz.c"
127 |
128 | int fastlz_compress(const void* input, int length, void* output)
129 | {
130 | /* for short block, choose fastlz1 */
131 | if(length < 65536)
132 | return fastlz1_compress(input, length, output);
133 |
134 | /* else... */
135 | return fastlz2_compress(input, length, output);
136 | }
137 |
138 | int fastlz_decompress(const void* input, int length, void* output, int maxout)
139 | {
140 | /* magic identifier for compression level */
141 | int level = ((*(const flzuint8*)input) >> 5) + 1;
142 |
143 | if(level == 1)
144 | return fastlz1_decompress(input, length, output, maxout);
145 | if(level == 2)
146 | return fastlz2_decompress(input, length, output, maxout);
147 |
148 | /* unknown level, trigger error */
149 | return 0;
150 | }
151 |
152 | int fastlz_compress_level(int level, const void* input, int length, void* output)
153 | {
154 | if(level == 1)
155 | return fastlz1_compress(input, length, output);
156 | if(level == 2)
157 | return fastlz2_compress(input, length, output);
158 |
159 | return 0;
160 | }
161 |
162 | #else /* !defined(FASTLZ_COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR) */
163 |
164 | static FASTLZ_INLINE int FASTLZ_COMPRESSOR(const void* input, int length, void* output)
165 | {
166 | const flzuint8* ip = (const flzuint8*) input;
167 | const flzuint8* ip_bound = ip + length - 2;
168 | const flzuint8* ip_limit = ip + length - 12;
169 | flzuint8* op = (flzuint8*) output;
170 |
171 | const flzuint8* htab[HASH_SIZE];
172 | const flzuint8** hslot;
173 | flzuint32 hval;
174 |
175 | flzuint32 copy;
176 |
177 | /* sanity check */
178 | if(FASTLZ_UNEXPECT_CONDITIONAL(length < 4))
179 | {
180 | if(length)
181 | {
182 | /* create literal copy only */
183 | *op++ = length-1;
184 | ip_bound++;
185 | while(ip <= ip_bound)
186 | *op++ = *ip++;
187 | return length+1;
188 | }
189 | else
190 | return 0;
191 | }
192 |
193 | /* initializes hash table */
194 | for (hslot = htab; hslot < htab + HASH_SIZE; hslot++)
195 | *hslot = ip;
196 |
197 | /* we start with literal copy */
198 | copy = 2;
199 | *op++ = MAX_COPY-1;
200 | *op++ = *ip++;
201 | *op++ = *ip++;
202 |
203 | /* main loop */
204 | while(FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit))
205 | {
206 | const flzuint8* ref;
207 | flzuint32 distance;
208 |
209 | /* minimum match length */
210 | flzuint32 len = 3;
211 |
212 | /* comparison starting-point */
213 | const flzuint8* anchor = ip;
214 |
215 | /* check for a run */
216 | #if FASTLZ_LEVEL==2
217 | if(ip[0] == ip[-1] && FASTLZ_READU16(ip-1)==FASTLZ_READU16(ip+1))
218 | {
219 | distance = 1;
220 | /* ip += 3; */ /* scan-build, never used */
221 | ref = anchor - 1 + 3;
222 | goto match;
223 | }
224 | #endif
225 |
226 | /* find potential match */
227 | HASH_FUNCTION(hval,ip);
228 | hslot = htab + hval;
229 | ref = htab[hval];
230 |
231 | /* calculate distance to the match */
232 | distance = anchor - ref;
233 |
234 | /* update hash table */
235 | *hslot = anchor;
236 |
237 | /* is this a match? check the first 3 bytes */
238 | if(distance==0 ||
239 | #if FASTLZ_LEVEL==1
240 | (distance >= MAX_DISTANCE) ||
241 | #else
242 | (distance >= MAX_FARDISTANCE) ||
243 | #endif
244 | *ref++ != *ip++ || *ref++!=*ip++ || *ref++!=*ip++)
245 | goto literal;
246 |
247 | #if FASTLZ_LEVEL==2
248 | /* far, needs at least 5-byte match */
249 | if(distance >= MAX_DISTANCE)
250 | {
251 | if(*ip++ != *ref++ || *ip++!= *ref++)
252 | goto literal;
253 | len += 2;
254 | }
255 |
256 | match:
257 | #endif
258 |
259 | /* last matched byte */
260 | ip = anchor + len;
261 |
262 | /* distance is biased */
263 | distance--;
264 |
265 | if(!distance)
266 | {
267 | /* zero distance means a run */
268 | flzuint8 x = ip[-1];
269 | while(ip < ip_bound)
270 | if(*ref++ != x) break; else ip++;
271 | }
272 | else
273 | for(;;)
274 | {
275 | /* safe because the outer check against ip limit */
276 | if(*ref++ != *ip++) break;
277 | if(*ref++ != *ip++) break;
278 | if(*ref++ != *ip++) break;
279 | if(*ref++ != *ip++) break;
280 | if(*ref++ != *ip++) break;
281 | if(*ref++ != *ip++) break;
282 | if(*ref++ != *ip++) break;
283 | if(*ref++ != *ip++) break;
284 | while(ip < ip_bound)
285 | if(*ref++ != *ip++) break;
286 | break;
287 | }
288 |
289 | /* if we have copied something, adjust the copy count */
290 | if(copy)
291 | /* copy is biased, '0' means 1 byte copy */
292 | *(op-copy-1) = copy-1;
293 | else
294 | /* back, to overwrite the copy count */
295 | op--;
296 |
297 | /* reset literal counter */
298 | copy = 0;
299 |
300 | /* length is biased, '1' means a match of 3 bytes */
301 | ip -= 3;
302 | len = ip - anchor;
303 |
304 | /* encode the match */
305 | #if FASTLZ_LEVEL==2
306 | if(distance < MAX_DISTANCE)
307 | {
308 | if(len < 7)
309 | {
310 | *op++ = (len << 5) + (distance >> 8);
311 | *op++ = (distance & 255);
312 | }
313 | else
314 | {
315 | *op++ = (7 << 5) + (distance >> 8);
316 | for(len-=7; len >= 255; len-= 255)
317 | *op++ = 255;
318 | *op++ = len;
319 | *op++ = (distance & 255);
320 | }
321 | }
322 | else
323 | {
324 | /* far away, but not yet in the another galaxy... */
325 | if(len < 7)
326 | {
327 | distance -= MAX_DISTANCE;
328 | *op++ = (len << 5) + 31;
329 | *op++ = 255;
330 | *op++ = distance >> 8;
331 | *op++ = distance & 255;
332 | }
333 | else
334 | {
335 | distance -= MAX_DISTANCE;
336 | *op++ = (7 << 5) + 31;
337 | for(len-=7; len >= 255; len-= 255)
338 | *op++ = 255;
339 | *op++ = len;
340 | *op++ = 255;
341 | *op++ = distance >> 8;
342 | *op++ = distance & 255;
343 | }
344 | }
345 | #else
346 |
347 | if(FASTLZ_UNEXPECT_CONDITIONAL(len > MAX_LEN-2))
348 | while(len > MAX_LEN-2)
349 | {
350 | *op++ = (7 << 5) + (distance >> 8);
351 | *op++ = MAX_LEN - 2 - 7 -2;
352 | *op++ = (distance & 255);
353 | len -= MAX_LEN-2;
354 | }
355 |
356 | if(len < 7)
357 | {
358 | *op++ = (len << 5) + (distance >> 8);
359 | *op++ = (distance & 255);
360 | }
361 | else
362 | {
363 | *op++ = (7 << 5) + (distance >> 8);
364 | *op++ = len - 7;
365 | *op++ = (distance & 255);
366 | }
367 | #endif
368 |
369 | /* update the hash at match boundary */
370 | HASH_FUNCTION(hval,ip);
371 | htab[hval] = ip++;
372 | HASH_FUNCTION(hval,ip);
373 | htab[hval] = ip++;
374 |
375 | /* assuming literal copy */
376 | *op++ = MAX_COPY-1;
377 |
378 | continue;
379 |
380 | literal:
381 | *op++ = *anchor++;
382 | ip = anchor;
383 | copy++;
384 | if(FASTLZ_UNEXPECT_CONDITIONAL(copy == MAX_COPY))
385 | {
386 | copy = 0;
387 | *op++ = MAX_COPY-1;
388 | }
389 | }
390 |
391 | /* left-over as literal copy */
392 | ip_bound++;
393 | while(ip <= ip_bound)
394 | {
395 | *op++ = *ip++;
396 | copy++;
397 | if(copy == MAX_COPY)
398 | {
399 | copy = 0;
400 | *op++ = MAX_COPY-1;
401 | }
402 | }
403 |
404 | /* if we have copied something, adjust the copy length */
405 | if(copy)
406 | *(op-copy-1) = copy-1;
407 | else
408 | op--;
409 |
410 | #if FASTLZ_LEVEL==2
411 | /* marker for fastlz2 */
412 | *(flzuint8*)output |= (1 << 5);
413 | #endif
414 |
415 | return op - (flzuint8*)output;
416 | }
417 |
418 | static FASTLZ_INLINE int FASTLZ_DECOMPRESSOR(const void* input, int length, void* output, int maxout)
419 | {
420 | const flzuint8* ip = (const flzuint8*) input;
421 | const flzuint8* ip_limit = ip + length;
422 | flzuint8* op = (flzuint8*) output;
423 | flzuint8* op_limit = op + maxout;
424 | flzuint32 ctrl = (*ip++) & 31;
425 | int loop = 1;
426 |
427 | do
428 | {
429 | const flzuint8* ref = op;
430 | flzuint32 len = ctrl >> 5;
431 | flzuint32 ofs = (ctrl & 31) << 8;
432 |
433 | if(ctrl >= 32)
434 | {
435 | #if FASTLZ_LEVEL==2
436 | flzuint8 code;
437 | #endif
438 | len--;
439 | ref -= ofs;
440 | if (len == 7-1)
441 | #if FASTLZ_LEVEL==1
442 | len += *ip++;
443 | ref -= *ip++;
444 | #else
445 | do
446 | {
447 | code = *ip++;
448 | len += code;
449 | } while (code==255);
450 | code = *ip++;
451 | ref -= code;
452 |
453 | /* match from 16-bit distance */
454 | if(FASTLZ_UNEXPECT_CONDITIONAL(code==255))
455 | if(FASTLZ_EXPECT_CONDITIONAL(ofs==(31 << 8)))
456 | {
457 | ofs = (*ip++) << 8;
458 | ofs += *ip++;
459 | ref = op - ofs - MAX_DISTANCE;
460 | }
461 | #endif
462 |
463 | #ifdef FASTLZ_SAFE
464 | if (FASTLZ_UNEXPECT_CONDITIONAL(op + len + 3 > op_limit))
465 | return 0;
466 |
467 | if (FASTLZ_UNEXPECT_CONDITIONAL(ref-1 < (flzuint8 *)output))
468 | return 0;
469 | #endif
470 |
471 | if(FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit))
472 | ctrl = *ip++;
473 | else
474 | loop = 0;
475 |
476 | if(ref == op)
477 | {
478 | /* optimize copy for a run */
479 | flzuint8 b = ref[-1];
480 | *op++ = b;
481 | *op++ = b;
482 | *op++ = b;
483 | for(; len; --len)
484 | *op++ = b;
485 | }
486 | else
487 | {
488 | #if !defined(FASTLZ_STRICT_ALIGN)
489 | const flzuint16* p;
490 | flzuint16* q;
491 | #endif
492 | /* copy from reference */
493 | ref--;
494 | *op++ = *ref++;
495 | *op++ = *ref++;
496 | *op++ = *ref++;
497 |
498 | #if !defined(FASTLZ_STRICT_ALIGN)
499 | /* copy a byte, so that now it's word aligned */
500 | if(len & 1)
501 | {
502 | *op++ = *ref++;
503 | len--;
504 | }
505 |
506 | /* copy 16-bit at once */
507 | q = (flzuint16*) op;
508 | op += len;
509 | p = (const flzuint16*) ref;
510 | for(len>>=1; len > 4; len-=4)
511 | {
512 | *q++ = *p++;
513 | *q++ = *p++;
514 | *q++ = *p++;
515 | *q++ = *p++;
516 | }
517 | for(; len; --len)
518 | *q++ = *p++;
519 | #else
520 | for(; len; --len)
521 | *op++ = *ref++;
522 | #endif
523 | }
524 | }
525 | else
526 | {
527 | ctrl++;
528 | #ifdef FASTLZ_SAFE
529 | if (FASTLZ_UNEXPECT_CONDITIONAL(op + ctrl > op_limit))
530 | return 0;
531 | if (FASTLZ_UNEXPECT_CONDITIONAL(ip + ctrl > ip_limit))
532 | return 0;
533 | #endif
534 |
535 | *op++ = *ip++;
536 | for(--ctrl; ctrl; ctrl--)
537 | *op++ = *ip++;
538 |
539 | loop = FASTLZ_EXPECT_CONDITIONAL(ip < ip_limit);
540 | if(loop)
541 | ctrl = *ip++;
542 | }
543 | }
544 | while(FASTLZ_EXPECT_CONDITIONAL(loop));
545 |
546 | return op - (flzuint8*)output;
547 | }
548 |
549 | #endif /* !defined(FASTLZ_COMPRESSOR) && !defined(FASTLZ_DECOMPRESSOR) */
550 |
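The decompressor above dispatches on a control byte: values below 32 start a literal run of ctrl+1 bytes, anything else is a back-reference whose upper 3 bits carry the length field and whose lower 5 bits carry the high part of the offset. The first byte of a block also stores the compression level in its upper 3 bits. A minimal sketch of that decoding, assuming only what the code above shows; `FlzOpcode`, `flz_decode_ctrl` and `flz_level_of` are illustrative names that do not exist in fastlz.c:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative record, not part of FastLZ.
struct FlzOpcode {
  bool     is_match;       // false: literal run, true: back-reference
  uint32_t literal_count;  // literal bytes that follow (literal run only)
  uint32_t length_field;   // upper 3 bits of the control byte (match only)
  uint32_t offset_high;    // lower 5 bits moved to bits 8..12 (match only)
};

// Split a control byte the same way the decompressor loop does.
FlzOpcode flz_decode_ctrl(uint8_t ctrl)
{
  FlzOpcode op = {false, 0, 0, 0};
  if (ctrl >= 32) {
    op.is_match     = true;
    op.length_field = ctrl >> 5;
    op.offset_high  = (ctrl & 31u) << 8;
  } else {
    op.literal_count = ctrl + 1u;  // the count is biased: 0 means one byte
  }
  return op;
}

// Level stored in the first byte of a compressed block (1 or 2 are valid).
int flz_level_of(uint8_t first_byte)
{
  return (first_byte >> 5) + 1;
}
```

Because the decompressor masks the first control byte with 31, a block always starts with a literal run, which is why the level bits can share that byte.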
--------------------------------------------------------------------------------
/src/fstapi/fastlz.h:
--------------------------------------------------------------------------------
1 | /*
2 | FastLZ - lightning-fast lossless compression library
3 |
4 | Copyright (C) 2007 Ariya Hidayat (ariya@kde.org)
5 | Copyright (C) 2006 Ariya Hidayat (ariya@kde.org)
6 | Copyright (C) 2005 Ariya Hidayat (ariya@kde.org)
7 |
8 | Permission is hereby granted, free of charge, to any person obtaining a copy
9 | of this software and associated documentation files (the "Software"), to deal
10 | in the Software without restriction, including without limitation the rights
11 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
12 | copies of the Software, and to permit persons to whom the Software is
13 | furnished to do so, subject to the following conditions:
14 |
15 | The above copyright notice and this permission notice shall be included in
16 | all copies or substantial portions of the Software.
17 |
18 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
20 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
21 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
22 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
23 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
24 | THE SOFTWARE.
25 |
26 | SPDX-License-Identifier: MIT
27 | */
28 |
29 | #ifndef FASTLZ_H
30 | #define FASTLZ_H
31 |
32 | #include <stdint.h>
33 |
34 | #define flzuint8 uint8_t
35 | #define flzuint16 uint16_t
36 | #define flzuint32 uint32_t
37 |
38 |
39 | #define FASTLZ_VERSION 0x000100
40 |
41 | #define FASTLZ_VERSION_MAJOR 0
42 | #define FASTLZ_VERSION_MINOR 1
43 | #define FASTLZ_VERSION_REVISION 0
44 |
45 | #define FASTLZ_VERSION_STRING "0.1.0"
46 |
47 | #if defined (__cplusplus)
48 | extern "C" {
49 | #endif
50 |
51 | /**
52 | Compress a block of data in the input buffer and returns the size of
53 | compressed block. The size of input buffer is specified by length. The
54 | minimum input buffer size is 16.
55 |
56 | The output buffer must be at least 5% larger than the input buffer
57 | and can not be smaller than 66 bytes.
58 |
59 | If the input is not compressible, the return value might be larger than
60 | length (input buffer size).
61 |
62 | The input buffer and the output buffer can not overlap.
63 | */
64 |
65 | int fastlz_compress(const void* input, int length, void* output);
66 |
67 | /**
68 | Decompress a block of compressed data and returns the size of the
69 | decompressed block. If error occurs, e.g. the compressed data is
70 | corrupted or the output buffer is not large enough, then 0 (zero)
71 | will be returned instead.
72 |
73 | The input buffer and the output buffer can not overlap.
74 |
75 | Decompression is memory safe and guaranteed not to write the output buffer
76 | more than what is specified in maxout.
77 | */
78 |
79 | int fastlz_decompress(const void* input, int length, void* output, int maxout);
80 |
81 | /**
82 | Compress a block of data in the input buffer and returns the size of
83 | compressed block. The size of input buffer is specified by length. The
84 | minimum input buffer size is 16.
85 |
86 | The output buffer must be at least 5% larger than the input buffer
87 | and can not be smaller than 66 bytes.
88 |
89 | If the input is not compressible, the return value might be larger than
90 | length (input buffer size).
91 |
92 | The input buffer and the output buffer can not overlap.
93 |
94 | Compression level can be specified in parameter level. At the moment,
95 | only level 1 and level 2 are supported.
96 | Level 1 is the fastest compression and generally useful for short data.
97 | Level 2 is slightly slower but it gives better compression ratio.
98 |
99 | Note that the compressed data, regardless of the level, can always be
100 | decompressed using the function fastlz_decompress above.
101 | */
102 |
103 | int fastlz_compress_level(int level, const void* input, int length, void* output);
104 |
105 | #if defined (__cplusplus)
106 | }
107 | #endif
108 |
109 | #endif /* FASTLZ_H */
110 |
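The header comments state two sizing rules for the output buffer: at least 5% larger than the input, and never smaller than 66 bytes. A small sketch of a helper that applies both rules; `flz_output_capacity` is a hypothetical name, not a FastLZ function, and rounding the 5% upward is an assumption made here to stay on the safe side:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of FastLZ): smallest buffer size that
// satisfies both rules from the fastlz_compress documentation.
std::size_t flz_output_capacity(std::size_t input_length)
{
  // 5% of the input, rounded up, added on top of the input size
  std::size_t cap = input_length + (input_length + 19) / 20;
  // never smaller than 66 bytes
  return cap < 66 ? 66 : cap;
}
```

A caller would allocate `flz_output_capacity(length)` bytes before passing the buffer as `output` to `fastlz_compress`.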
--------------------------------------------------------------------------------
/src/read.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #include
35 |
36 | #include
37 | #include
38 | #include
39 | #include
40 | #include
41 | #include
42 | #include
43 | #include
44 | #include
45 | #include
46 |
47 | using namespace std;
48 |
49 | #include "read.h"
50 | #include "blif.h"
51 |
52 | // -----------------------------------------------------------------------------
53 | /*
54 | From a read blif file, prepares a data-structure for simulation
55 | The blif file contains:
56 | - gates, which are LUT4s
57 | - latches, which indicate a flip-flop
58 |
59 | In most cases, a latch corresponds to the output of a gate, so it is
60 | simply a matter of connecting to the Q output of the corresponding gate.
61 | There are cases however where latches are chained. This requires us to
62 | instantiate gates to implement the flip-flops (since we only simulate
63 | gates). These gates are passthrough, and only their Q output is used,
64 | see tag [extra gates] in comments below.
65 | */
66 | void buildSimulData(
67 | t_blif& _blif, // might change (adding extra LUTs)
68 | vector<t_lut>& _luts, // output vector of LUTs (see header)
69 | vector<t_bram>& _brams, // output vector of BRAMs
70 | vector<pair<string,int> >& _outbits, // output bit indices
71 | vector<int>& _ones, // which output bits start as '1'
72 | map<string,int>& _indices) // signal to LUT map
73 | {
74 | // gather output names and their source gate/latch
75 | map<string, v2i> output2src;
76 | // -> gates
77 | ForArray(_blif.gates, g) {
78 | output2src[_blif.gates[g].output] = v2i(0, g);
79 | }
80 | // -> latches
81 | ForArray(_blif.latches, l) {
82 | output2src[_blif.latches[l].output] = v2i(1, l);
83 | }
84 | // -> BRAMs
85 | ForArray(_blif.brams, b) {
86 | for (int i=0; i < _blif.brams[b].data_width; ++i) {
87 | string port = "RD_DATA[" + std::to_string(i) + "]";
88 | auto P = _blif.brams[b].bindings.find(port);
89 | if (P == _blif.brams[b].bindings.end()) {
90 | fprintf(stderr, " bram '%s' is disconnected\n", port.c_str());
91 | } else {
92 | output2src[P->second] = v2i(2,b);
93 | }
94 | }
95 | }
96 | // -> inputs
97 | ForArray(_blif.inputs, i) {
98 | output2src[_blif.inputs[i]] = v2i(3, i);
99 | }
100 | // number all outputs
101 | // prepare to create luts
102 | // -> find register outputs that depend on other registers
103 | for (const auto& o : output2src) {
104 | if (o.second[0] == 1) { // latch
105 | // find input type
106 | sl_assert(output2src.count(_blif.latches[o.second[1]].input));
107 | const auto& I = output2src.find(_blif.latches[o.second[1]].input);
108 | if (I->second[0] == 1) {
109 | // input of this latch is the output (Q) of an earlier latch
110 | // we need a pass-through gate to do that [extra gates]
111 | int g = (int)_blif.gates.size();
112 | _blif.gates.push_back(t_gate_nfo());
113 | _blif.gates.back().config_strings.push_back(make_pair("1", "1"));
114 | _blif.gates.back().inputs.push_back(I->first);
115 | string ex = "__extra__" + I->first;
116 | _blif.gates.back().output = ex;
117 | _blif.latches[o.second[1]].input = ex;
118 | output2src[ex] = v2i(0, g);
119 | }
120 | }
121 | }
122 | // -> create one LUT per latch
123 | vector<int> lut_gates;
124 | for (const auto& o : output2src) {
125 | if (o.second[0] == 1) { // latch
126 | // find input type
127 | const auto& I = output2src.find (_blif.latches[o.second[1]].input);
128 | sl_assert(I != output2src.end());
129 | sl_assert(I->second[0] != 1); // other has to not be a latch
130 | /// create LUT for the D output (latch input)
131 | /// assign output to Q (latch output)
132 | // store indices of input (D) and output (Q)
133 | _indices[I->first] = (((int)lut_gates.size()) << 1);
134 | _indices[o.first] = (((int)lut_gates.size()) << 1) + 1;
135 | // gate that corresponds to the lut
136 | lut_gates.push_back(I->second[1]);
137 | }
138 | }
139 | // -> create one LUT per comb output
140 | for (const auto& o : output2src) {
141 | if (o.second[0] != 1) { // not latch
142 | // ignore clock
143 | if (o.first == "clock") {
144 | continue;
145 | }
146 | // check if the output is already assigned
147 | if (_indices.count(o.first)) {
148 | continue;
149 | }
150 | // check that it does not use clock // NOTE: investigate
151 | bool skip = false;
152 | for (auto i : _blif.gates[o.second[1]].inputs) {
153 | if (i == "clock") {
154 | skip = true;
155 | break;
156 | }
157 | }
158 | if (skip) continue;
159 | /// create LUT for the D output
160 | // store index
161 | _indices[o.first] = (((int)lut_gates.size()) << 1);
162 | if (o.second[0] == 0) {
163 | lut_gates.push_back(o.second[1]); // gate that corresponds to the lut
164 | } else if (o.second[0] == 2) {
165 | lut_gates.push_back(-1); // external lut
166 | } else if (o.second[0] == 3) {
167 | lut_gates.push_back(-1); // external lut
168 | } else {
169 | fprintf(stderr, " unexpected\n");
170 | }
171 | }
172 | }
173 | // -> connect BRAMs to design
174 | for (const auto &b : _blif.brams) {
175 | _brams.push_back(t_bram());
176 | _brams.back().name = b.name;
177 | _brams.back().data = b.data;
178 | // -> list what to connect
179 | vector< tuple<string, int, vector<int>* > > ports_width;
180 | ports_width.push_back(make_tuple("RD_ADDR", b.addr_width, &_brams.back().rd_addr));
181 | ports_width.push_back(make_tuple("RD_DATA", b.data_width, &_brams.back().rd_data));
182 | ports_width.push_back(make_tuple("WR_ADDR", b.addr_width, &_brams.back().wr_addr));
183 | ports_width.push_back(make_tuple("WR_DATA", b.data_width, &_brams.back().wr_data));
184 | ports_width.push_back(make_tuple("WR_EN", b.data_width, &_brams.back().wr_en));
185 | // -> check and connect
186 | for (auto &pw : ports_width) {
187 | for (int i=0; i < (int)get<1>(pw); ++i) {
188 | string port = get<0>(pw) + "[" + std::to_string(i) + "]";
189 | auto P = b.bindings.find(port);
190 | if (P == b.bindings.end()) {
191 | fprintf(stderr, " bram '%s' is disconnected\n", port.c_str());
192 | } else {
193 | auto I = _indices.find(P->second);
194 | if (I == _indices.end()) {
195 | fprintf(stderr, " bram '%s' is connected to unknown '%s'\n", port.c_str(), P->second.c_str());
196 | exit(-1);
197 | } else {
198 | // std::cerr << P->first << " " << I->second << '\n';
199 | get<2>(pw)->push_back(I->second);
200 | }
201 | }
202 | }
203 | }
204 | }
205 | // -> instantiate LUTs
206 | for (auto g : lut_gates) {
207 | _luts.push_back(t_lut());
208 | ForIndex(i, 4) {
209 | _luts.back().inputs[i] = -1;
210 | }
211 | if (g == -1) {
212 | _luts.back().cfg = 0;
213 | _luts.back().external = true; // TODO FIXME: merge with above!!!
214 | } else {
215 | _luts.back().cfg = lut_config(_blif.gates[g].config_strings);
216 | _luts.back().external = false;
217 | int i = 4 - (int)_blif.gates[g].inputs.size();
218 | for (auto inp : _blif.gates[g].inputs) {
219 | auto I = _indices.find(inp);
220 | if (I == _indices.end()) {
221 | fprintf(stderr, " input '%s' disconnected\n", inp.c_str());
222 | } else {
223 | _luts.back().inputs[i++] = I->second;
224 | }
225 | }
226 | }
227 | }
228 |
229 | for (auto op : _blif.outputs) {
230 | auto I = _indices.find(op);
231 | if (I == _indices.end()) {
232 | fprintf(stderr, " outport '%s' disconnected\n", op.c_str());
233 | } else {
234 | _outbits.push_back(make_pair(op, I->second));
235 | }
236 | }
237 |
238 | for (const auto& l : _blif.latches) {
239 | if (l.init == "1") {
240 | auto I = _indices.find(l.output);
241 | sl_assert(I != _indices.end());
242 | _ones.push_back(I->second);
243 | }
244 | }
245 |
246 | /// DEBUG
247 | #if 0
248 | map<int,string> reverse_indices;
249 | for (auto& idc : _indices) {
250 | reverse_indices[idc.second] = idc.first;
251 | }
252 | for (int l = 0; l < _luts.size(); ++l) {
253 | fprintf(stderr,"LUT %3d (%s), cfg:%4x, inputs: %4d %4d %4d %4d ext:%d\n",
254 | l<<1, reverse_indices.at(l<<1).c_str(), _luts[l].cfg,
255 | _luts[l].inputs[0], _luts[l].inputs[1],
256 | _luts[l].inputs[2], _luts[l].inputs[3],
257 | _luts[l].external);
258 | }
259 | #endif
260 | }
261 |
262 | // -----------------------------------------------------------------------------
263 |
264 | /*
265 | Reads the design from a BLIF file
266 | */
267 | void readDesign(
268 | const char *path,
269 | vector<t_lut>& _luts,
270 | vector<t_bram>& _brams,
271 | vector<pair<string,int> >& _outbits,
272 | vector<int>& _ones,
273 | map<string,int>& _indices)
274 | {
275 | t_blif blif;
276 | // parse the blif file
277 | parse(path, blif);
278 | // build the design datastructure
279 | buildSimulData(blif, _luts, _brams, _outbits, _ones, _indices);
280 | }
281 |
282 | // -----------------------------------------------------------------------------
283 |
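The [extra gates] step in buildSimulData can be looked at in isolation: whenever a latch is fed directly by the output of another latch, a pass-through gate is inserted and the latch is rewired to the gate's output. The sketch below reproduces that idea on simplified stand-in types; `Latch`, `Gate` and `insert_passthroughs` are hypothetical and much smaller than the t_blif structures used above:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins for the BLIF records (assumption: one input per gate).
struct Latch { std::string input, output; };
struct Gate  { std::string input, output; };

// Rewire every latch that is fed by another latch's output (Q) through a
// new pass-through gate. Returns the number of gates added.
int insert_passthroughs(std::vector<Latch>& latches, std::vector<Gate>& gates)
{
  // names that are produced by a latch
  std::map<std::string, int> latch_outputs;
  for (int l = 0; l < (int)latches.size(); ++l) {
    latch_outputs[latches[l].output] = l;
  }
  int added = 0;
  for (auto& l : latches) {
    if (latch_outputs.count(l.input)) {
      // chained latch: route through an extra gate named after its source
      std::string ex = "__extra__" + l.input;
      gates.push_back(Gate{l.input, ex});
      l.input = ex;
      ++added;
    }
  }
  return added;
}
```

For a two-stage shift register (d -> q0 -> q1) this adds a single gate between q0 and the second latch, so that every latch ends up driven by a gate, which is what the later "one LUT per latch" pass asserts.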
--------------------------------------------------------------------------------
/src/read.h:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #pragma once
35 |
36 | #include
37 | #include
38 |
39 | #include "uintX.h"
40 |
41 | typedef unsigned char uchar;
42 |
43 | #pragma pack(push)
44 | #pragma pack(1)
45 | // struct holding a LUT configuration:
46 | // - cfg is a 16-bit integer that defines the truth table for 4 inputs
47 | // - inputs[4] are the indices of the inputs (other LUTs in the LUT table)
48 | // Each index lower bit indicates whether the input is connected to D (0)
49 | // or Q (1). The higher bits are the LUT index.
50 | // So for LUT i the index is obtained as (i<<1) + 0 if connected to D
51 | // or (i<<1) + 1 if connected to Q
52 | // Given the index x, the LUT is (x>>1) and (x&1) == 1 if Q, otherwise D
53 | typedef struct s_lut {
54 | unsigned short cfg;
55 | int inputs[4];
56 | bool external; // TODO: try to move this out of t_lut, only used between read and analyze
57 | } t_lut;
58 | #pragma pack(pop)
59 |
60 | // struct holding a BRAM
61 | typedef struct s_bram {
62 | std::string name;
63 | uintX data;
64 | std::vector<int> rd_addr;
65 | std::vector<int> rd_data;
66 | std::vector<int> wr_addr;
67 | std::vector<int> wr_data;
68 | std::vector<int> wr_en;
69 | // int rd_clock;
70 | // int wr_clock;
71 | } t_bram;
72 |
73 | void readDesign(
74 | const char *path,
75 | std::vector<t_lut>& _luts,
76 | std::vector<t_bram>& _brams,
77 | std::vector<std::pair<std::string,int> >& _outbits,
78 | std::vector<int>& _ones,
79 | std::map<std::string,int>& _indices);
80 |
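The comment on t_lut describes how an input index packs a LUT number together with a D/Q selector in its lowest bit. A minimal sketch of that packing follows, with hypothetical helper names (`pack_index`, `lut_of`, `is_q`). The `eval_lut4` function additionally shows how a 16-bit truth table can be looked up; its bit ordering (first input is the least significant address bit) is an assumption made for illustration and may differ from what the simulator does:

```cpp
#include <cassert>

// Index of LUT 'lut', selecting its Q output (registered) or D output.
inline int pack_index(int lut, bool q)
{
  assert(lut >= 0);
  return (lut << 1) + (q ? 1 : 0);
}

// Recover the LUT number and the D/Q selector from a packed index.
inline int  lut_of(int x) { return x >> 1; }
inline bool is_q(int x)   { return (x & 1) == 1; }

// Truth-table lookup for a 4-input LUT.
// Assumption: a0 is the least significant bit of the table address.
inline int eval_lut4(unsigned short cfg, int a0, int a1, int a2, int a3)
{
  int addr = (a0 & 1) | ((a1 & 1) << 1) | ((a2 & 1) << 2) | ((a3 & 1) << 3);
  return (cfg >> addr) & 1;
}
```

With this layout, a configuration of 0x8000 behaves as a 4-input AND and 0xFFFE as a 4-input OR, regardless of the input ordering.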
--------------------------------------------------------------------------------
/src/sh_clear.cs:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | layout(local_size_x = 1, local_size_y = 1) in;
37 |
38 | layout(std430, binding = 0) buffer Buf { uint buf[]; };
39 |
40 | void main()
41 | {
42 | uint id = gl_GlobalInvocationID.x;
43 | buf[id] = 0;
44 | }
45 |
--------------------------------------------------------------------------------
/src/sh_init.cs:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | layout(local_size_x = 1, local_size_y = 1) in;
37 |
38 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; };
39 | readonly layout(std430, binding = 5) buffer Buf5 { uint ones []; };
40 |
41 | void main()
42 | {
43 | uint id = gl_GlobalInvocationID.x;
44 | // set the bit (D or Q) of each signal that is initialized to one
45 | uint o = ones[id];
46 | atomicOr(outputs[o >> 1u], 1u << (o & 1u));
47 | }
48 |
--------------------------------------------------------------------------------
/src/sh_outports.cs:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | layout(local_size_x = 1, local_size_y = 1) in;
37 |
38 | coherent readonly layout(std430, binding = 2) buffer Buf2 { uint outputs []; };
39 | readonly layout(std430, binding = 3) buffer Buf3 { uint portlocs[]; };
40 | writeonly layout(std430, binding = 4) buffer Buf4 { uint portvals[]; };
41 |
42 | uniform uint offset;
43 |
44 | void main()
45 | {
46 | uint id = gl_GlobalInvocationID.x;
47 | uint o = portlocs[id];
48 | portvals[offset + id] = (outputs[o>>1u] >> (o&1u)) & 1u;
49 | }
50 |
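The shaders above address signals with a packed index: `o >> 1` selects the word in `outputs` (one word per LUT) and `o & 1` selects the bit inside it. Judging from `sh_simul.cs` and `sh_posedge.cs`, bit 0 holds the combinational D value and bit 1 the registered Q value. The following is a minimal CPU-side sketch of that decoding, assuming a plain `std::vector<uint32_t>` stands in for the GPU buffer. The names `make_addr` and `read_signal` are illustrative and do not exist in the repository.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs a LUT index and a selector (0 = D, 1 = Q)
// into the address format that the shaders decode.
inline uint32_t make_addr(uint32_t lut, uint32_t q_else_d)
{
  assert(q_else_d <= 1u);
  return (lut << 1u) | q_else_d;
}

// Hypothetical helper: same expression as in sh_outports.cs, reading one
// bit from a CPU copy of the outputs buffer.
inline uint32_t read_signal(const std::vector<uint32_t>& outputs, uint32_t addr)
{
  return (outputs[addr >> 1u] >> (addr & 1u)) & 1u;
}
```

With this layout a single 32-bit word per LUT carries both the pending value and the latched value, so one address format can refer to either.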
--------------------------------------------------------------------------------
/src/sh_posedge.cs:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | layout(local_size_x = 128, local_size_y = 1) in;
37 |
38 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; };
39 |
40 | uniform uint num;
41 |
42 | void main()
43 | {
44 | if (gl_GlobalInvocationID.x < num)
45 | {
46 | uint lut_id = gl_GlobalInvocationID.x;
47 | // update Q output from D, but only if their values differ.
48 | uint outv = outputs[lut_id];
49 | if ((outv & 1u) != ((outv >> 1u) & 1u)) {
50 | if ((outv & 1u) == 1u) {
51 | outputs[lut_id] = 3u;
52 | } else {
53 | outputs[lut_id] = 0u;
54 | }
55 | }
56 | }
57 | }
58 |
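The clock-edge step in `sh_posedge.cs` can be mirrored on the CPU. The sketch below assumes that only the two low bits of each word are in use (bit 0 = D, bit 1 = Q), which is what the shader's writes of `3u` and `0u` suggest. `posedge_cell` and `posedge_all` are illustrative names, not functions from the repository.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU equivalent of the shader body for one cell:
// after the clock edge, Q (bit 1) takes the value of D (bit 0).
inline uint32_t posedge_cell(uint32_t v)
{
  assert(v <= 3u); // assumption: only bits 0 and 1 are used
  return (v & 1u) ? 3u : 0u;
}

// Applies the edge to every cell, writing only the cells whose D and Q
// differ, as the shader does. Returns the number of cells written.
inline int posedge_all(std::vector<uint32_t>& outputs)
{
  int num_written = 0;
  for (auto& v : outputs) {
    if ((v & 1u) != ((v >> 1u) & 1u)) {
      v = posedge_cell(v);
      ++num_written;
    }
  }
  return num_written;
}
```

Skipping cells that are already consistent avoids redundant writes; the resulting state is the same as copying D to Q unconditionally.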
--------------------------------------------------------------------------------
/src/sh_simul.cs:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | layout(local_size_x = 128, local_size_y = 1) in;
37 |
38 | readonly layout(std430, binding = 0) buffer Buf0 { uint cfg []; };
39 | readonly layout(std430, binding = 1) buffer Buf1 { ivec4 addrs []; };
40 | coherent layout(std430, binding = 2) buffer Buf2 { uint outputs[]; };
41 |
42 | uniform uint start_lut;
43 | uniform uint num;
44 |
45 | uint get_output(uint a)
46 | {
47 | return (outputs[a >> 1u] >> (a & 1u)) & 1u;
48 | }
49 |
50 | void main()
51 | {
52 | if (gl_GlobalInvocationID.x < num)
53 | {
54 | uint lut_id = start_lut + gl_GlobalInvocationID.x;
55 | // apply LUT logic
56 | uint C = cfg [lut_id];
57 | ivec4 a = addrs[lut_id];
58 | uint i0 = get_output(a.x);
59 | uint i1 = get_output(a.y);
60 | uint i2 = get_output(a.z);
61 | uint i3 = get_output(a.w);
62 | uint sh = i3 | (i2 << 1) | (i1 << 2) | (i0 << 3);
63 | // get previous value, compute old/new
64 | uint outv = outputs[lut_id];
65 | uint old_d = outv & 1u;
66 | uint new_d = (C >> sh) & 1u;
67 | // if different, assign
68 | if (old_d != new_d) {
69 | if (new_d == 1u) {
70 | outputs[lut_id] = outv | 1u;
71 | } else {
72 | outputs[lut_id] = outv & 0xfffffffeu;
73 | }
74 | }
75 | }
76 | }
77 |
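The LUT evaluation in `sh_simul.cs` amounts to a table lookup: the four input bits form a 4-bit index and the configuration word is shifted by that index. A small CPU sketch follows, assuming the configuration is a 16-entry truth table held in the low bits of `cfg`. The helper names are illustrative and are not taken from the repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the index computation in sh_simul.cs:
// i0 is the most significant bit of the table index, i3 the least.
inline uint32_t lut4_index(uint32_t i0, uint32_t i1, uint32_t i2, uint32_t i3)
{
  assert((i0 | i1 | i2 | i3) <= 1u); // each input must be a single bit
  return i3 | (i2 << 1u) | (i1 << 2u) | (i0 << 3u);
}

// Looks up the LUT output for the given inputs.
inline uint32_t lut4_eval(uint32_t cfg, uint32_t i0, uint32_t i1, uint32_t i2, uint32_t i3)
{
  return (cfg >> lut4_index(i0, i1, i2, i3)) & 1u;
}

// Stores the new D value in bit 0 and leaves Q (bit 1) untouched,
// matching the two assignments at the end of the shader.
inline uint32_t store_d(uint32_t outv, uint32_t new_d)
{
  return new_d ? (outv | 1u) : (outv & 0xfffffffeu);
}
```

For example, `cfg = 0x8000` has only entry 15 set, so under this indexing it behaves as a four-input AND.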
--------------------------------------------------------------------------------
/src/sh_visu.fp:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-09
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | in vec2 uv;
37 | out vec4 color;
38 |
39 | readonly layout(std430, binding = 2) buffer Buf2 { uint outputs[]; };
40 |
41 | uniform int sqsz;
42 | uniform int num;
43 | uniform int depth0_end;
44 |
45 | void main()
46 | {
47 | int id = depth0_end + int(uv.x*sqsz) + int(uv.y*sqsz)*sqsz;
48 | ivec2 o = id < num ? ivec2(outputs[id]&1u,(outputs[id]>>1u)&1u) : ivec2(0,0);
49 | vec2 c = vec2(o.xy);
50 | color = vec4(c.x,c.yy,1.0);
51 | }
52 |
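The fragment shader maps each pixel of a square grid to one LUT, skipping the first `depth0_end` entries, and colors it `(D, Q, Q, 1)`: red when only D is set, cyan when only Q is set, white when both are set. The sketch below reproduces the index arithmetic on the CPU, together with the grid-side computation that `silixel.cc` uses for `sqsz`. `grid_side` and `lut_at` are illustrative names, not part of the repository.

```cpp
#include <cassert>
#include <cmath>

// Side of the square grid, computed as in silixel.cc:
// the integer part of sqrt(n_simul), plus one.
inline int grid_side(int n_simul)
{
  assert(n_simul >= 0);
  return (int)std::sqrt((double)n_simul) + 1;
}

// Hypothetical helper: index of the LUT displayed at texture
// coordinates (u, v), using the same expression as sh_visu.fp.
inline int lut_at(float u, float v, int sqsz, int depth0_end)
{
  assert(sqsz > 0);
  return depth0_end + (int)(u * sqsz) + (int)(v * sqsz) * sqsz;
}
```

Because the grid side is rounded up, some pixels can map past the last LUT; the shader handles this with its `id < num` test and draws those pixels with D = Q = 0.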
--------------------------------------------------------------------------------
/src/sh_visu.vp:
--------------------------------------------------------------------------------
1 | // @sylefeb 2021-01-09
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #version 430
35 |
36 | in vec4 mvf_vertex;
37 | out vec2 uv;
38 |
39 | void main()
40 | {
41 | uv = mvf_vertex.xy;
42 | gl_Position = vec4(mvf_vertex.xy*2.0-1.0,0.5,1.0);
43 | }
44 |
--------------------------------------------------------------------------------
/src/silixel.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /* ---------------------------------------------------------------------
3 |
4 | Main file: creates a small GUI (OpenGL + ImGui) around a
5 | simulated design. If the design has VGA signals, the result is displayed
6 | using a texture. Lets the user select between GPU and CPU simulation.
7 |
8 | ----------------------------------------------------------------------- */
9 | /*
10 | BSD 3-Clause License
11 |
12 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
13 | All rights reserved.
14 |
15 | Redistribution and use in source and binary forms, with or without
16 | modification, are permitted provided that the following conditions are met:
17 |
18 | 1. Redistributions of source code must retain the above copyright notice, this
19 | list of conditions and the following disclaimer.
20 |
21 | 2. Redistributions in binary form must reproduce the above copyright notice,
22 | this list of conditions and the following disclaimer in the documentation
23 | and/or other materials provided with the distribution.
24 |
25 | 3. Neither the name of the copyright holder nor the names of its
26 | contributors may be used to endorse or promote products derived from
27 | this software without specific prior written permission.
28 |
29 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
30 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
31 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
32 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
33 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
34 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
35 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
36 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
37 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
38 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
39 | */
40 | // --------------------------------------------------------------
41 |
42 | #include
43 | #include
44 | #include
45 |
46 | #include
47 | #include
48 |
49 | #include "read.h"
50 | #include "analyze.h"
51 | #include "simul_cpu.h"
52 | #include "simul_gpu.h"
53 |
54 | // --------------------------------------------------------------
55 |
56 | #include
57 | #include
58 |
59 | // --------------------------------------------------------------
60 |
61 | using namespace std;
62 |
63 | // --------------------------------------------------------------
64 |
65 | #define SCREEN_W (640) // screen width and height
66 | #define SCREEN_H (480)
67 |
68 | // --------------------------------------------------------------
69 |
70 | // Shader to visualize LUT outputs
71 | #include "sh_visu.h"
72 | AutoBindShader::sh_visu g_ShVisu;
73 |
74 | // Output ports
75 | map<string, v2i> g_OutPorts; // name -> (rank in g_OutPortsValues, LUT output address)
76 | Array g_OutPortsValues; // output port values
77 |
78 | int g_Cycle = 0;
79 |
80 | vector g_step_starts;
81 | vector g_step_ends;
82 | vector g_luts;
83 | vector g_brams;
84 | map g_indices;
85 | vector g_ones;
86 | vector g_cpu_fanout;
87 | vector g_cpu_depths;
88 | vector g_cpu_outputs;
89 | vector g_cpu_computelists;
90 |
91 | AutoPtr<GLMesh> g_Quad;
92 | GLTimer g_GPU_timer;
93 |
94 | bool g_Use_GPU = true;
95 |
96 | // --------------------------------------------------------------
97 |
98 | bool designHasVGA()
99 | {
100 | return (g_OutPorts.count("out_video_vs") > 0);
101 | }
102 |
103 | /* -------------------------------------------------------- */
104 |
105 | ImageRGBA_Ptr g_Framebuffer;
106 | Tex2DRGBA_Ptr g_FramebufferTex;
107 | int g_X = 0;
108 | int g_Y = 0;
109 | int g_HS = 0;
110 | int g_VS = 0;
111 | double g_Hz = 0;
112 | double g_UsecPerCycle = 0;
113 | string g_OutPortString;
114 | int g_OutportCycle = 0;
115 |
116 | /* -------------------------------------------------------- */
117 |
118 | void simulGPUNextWait()
119 | {
120 | g_GPU_timer.start();
121 | while (1) {
122 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends);
123 | if (simulReadback_gpu()) break;
124 | }
125 | g_GPU_timer.stop();
126 | auto ms = g_GPU_timer.waitResult();
127 | g_Hz = (double)CYCLE_BUFFER_LEN / ((double)ms / 1000.0);
128 | g_UsecPerCycle = (double)ms * 1000.0 / (double)CYCLE_BUFFER_LEN;
129 | g_OutportCycle = 0;
130 | }
131 |
132 | /* -------------------------------------------------------- */
133 |
134 | void simulGPUNext()
135 | {
136 | g_GPU_timer.start();
137 |
138 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends);
139 | bool datain = simulReadback_gpu();
140 |
141 | g_GPU_timer.stop();
142 | auto ms = g_GPU_timer.waitResult();
143 | g_Hz = (double)1 / ((double)ms / 1000.0);
144 | g_UsecPerCycle = (double)ms * 1000.0 / (double)1;
145 |
146 | if (datain) {
147 | g_OutportCycle = 0;
148 | } else {
149 | ++g_OutportCycle;
150 | }
151 |
152 | }
153 |
154 | /* -------------------------------------------------------- */
155 |
156 | void updateFrame(int vs, int hs, int r, int g, int b)
157 | {
158 | if (vs) {
159 | if (hs) {
160 | if (g_X >= 48 && g_Y >= 34 && g_X - 48 < SCREEN_W && g_Y - 34 < SCREEN_H) { // stay inside the framebuffer
161 | g_Framebuffer->pixel(g_X - 48, g_Y - 34) = v4b(r << 2, g << 2, b << 2, 255);
162 | }
163 | ++g_X;
164 | } else {
165 | g_X = 0;
166 | if (g_HS) {
167 | ++g_Y;
168 | g_FramebufferTex = Tex2DRGBA_Ptr(new Tex2DRGBA(g_Framebuffer->pixels()));
169 | }
170 | }
171 | } else {
172 | g_X = g_Y = 0;
173 | }
174 | g_VS = vs;
175 | g_HS = hs;
176 | }
177 |
178 | /* -------------------------------------------------------- */
179 |
180 | void simulGPU()
181 | {
182 | if (designHasVGA()) { // design has VGA output, display it
183 | simulGPUNextWait(); // simulates a number of cycles and waits for the results
184 | // read the output of the simulated cycles
185 | ForIndex(cy, CYCLE_BUFFER_LEN) {
186 | int offset = cy * (int)g_OutPorts.size();
187 | int vs = g_OutPortsValues[offset + g_OutPorts["out_video_vs"][0]];
188 | int hs = g_OutPortsValues[offset + g_OutPorts["out_video_hs"][0]];
189 | int r = 0;
190 | ForIndex(i, 6) {
191 | r = r | ((g_OutPortsValues[offset + g_OutPorts["out_video_r[" + to_string(i) + "]"][0]]) << i);
192 | }
193 | int g = 0;
194 | ForIndex(i, 6) {
195 | g = g | ((g_OutPortsValues[offset + g_OutPorts["out_video_g[" + to_string(i) + "]"][0]]) << i);
196 | }
197 | int b = 0;
198 | ForIndex(i, 6) {
199 | b = b | ((g_OutPortsValues[offset + g_OutPorts["out_video_b[" + to_string(i) + "]"][0]]) << i);
200 | }
201 | updateFrame(vs, hs, r, g, b);
202 | }
203 | } else { // design has no VGA, show the output ports
204 | simulGPUNext(); // step one cycle
205 | // make the output string
206 | g_OutPortString = "";
207 | int offset = g_OutportCycle * (int)g_OutPorts.size();
208 | for (auto op : g_OutPorts) {
209 | g_OutPortString = (g_OutPortsValues[offset + op.second[0]] ? "1" : "0") + g_OutPortString;
210 | }
211 | }
212 | }
213 |
214 | /* -------------------------------------------------------- */
215 |
216 | uchar simulCPU_output(std::string o)
217 | {
218 | int pos = g_OutPorts.at(o)[1];
219 | int lut = pos >> 1;
220 | int q_else_d = pos & 1;
221 | uchar bit = (g_cpu_outputs[lut] >> q_else_d) & 1;
222 | return bit;
223 | }
224 |
225 | /* -------------------------------------------------------- */
226 |
227 | void simulCPU()
228 | {
229 | if (designHasVGA()) {
230 | // multiple steps
231 | int num_measures = 0;
232 | const int N_measures = 100;
233 | Elapsed el;
234 | while (num_measures++ < N_measures) {
235 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
236 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
237 | int vs = simulCPU_output("out_video_vs");
238 | int hs = simulCPU_output("out_video_hs");
239 | int r = 0;
240 | ForIndex(i, 6) {
241 | r = r | (simulCPU_output("out_video_r[" + to_string(i) + "]") << i);
242 | }
243 | int g = 0;
244 | ForIndex(i, 6) {
245 | g = g | (simulCPU_output("out_video_g[" + to_string(i) + "]") << i);
246 | }
247 | int b = 0;
248 | ForIndex(i, 6) {
249 | b = b | (simulCPU_output("out_video_b[" + to_string(i) + "]") << i);
250 | }
251 | updateFrame(vs, hs, r, g, b);
252 | }
253 | auto ms = el.elapsed();
254 | g_Hz = (double)N_measures / ((double)ms / 1000.0);
255 | g_UsecPerCycle = (double)ms * 1000.0 / (double)N_measures;
256 | } else {
257 | // multiple steps
258 | int num_measures = 0;
259 | const int N_measures = 20;
260 | Elapsed el;
261 | while (num_measures++ < N_measures) {
262 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
263 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
264 | }
265 | auto ms = el.elapsed();
266 | if (ms > 0) {
267 | g_Hz = (double)N_measures / ((double)ms / 1000.0);
268 | g_UsecPerCycle = (double)ms * 1000.0 / (double)N_measures;
269 | } else {
270 | g_Hz = -1;
271 | g_UsecPerCycle = -1;
272 | }
273 | // make the output string
274 | g_OutPortString = "";
275 | for (auto op : g_OutPorts) {
276 | g_OutPortString = (simulCPU_output(op.first) ? "1" : "0") + g_OutPortString;
277 | }
278 | }
279 | }
280 |
281 | /* -------------------------------------------------------- */
282 |
283 | void mainRender()
284 | {
285 |
286 | // simulate
287 | if (g_Use_GPU) {
288 | simulGPU();
289 | } else {
290 | simulCPU();
291 | }
292 |
293 | // basic rendering
294 | LibSL::GPUHelpers::clearScreen(LIBSL_COLOR_BUFFER | LIBSL_DEPTH_BUFFER, 0.2f, 0.2f, 0.2f);
295 |
296 | // render display
297 | if (designHasVGA()) {
298 | // -> texture for VGA display
299 | GLBasicPipeline::getUniqueInstance()->begin();
300 | GLBasicPipeline::getUniqueInstance()->setProjection(orthoMatrixGL(0, 1, 1, 0, -1, 1));
301 | GLBasicPipeline::getUniqueInstance()->setModelview(m4x4f::identity());
302 | GLBasicPipeline::getUniqueInstance()->setColor(v4f(1));
303 | if (!g_FramebufferTex.isNull()) {
304 | g_FramebufferTex->bind();
305 | }
306 | GLBasicPipeline::getUniqueInstance()->enableTexture();
307 | GLBasicPipeline::getUniqueInstance()->bindTextureUnit(0);
308 | g_Quad->render();
309 | GLBasicPipeline::getUniqueInstance()->end();
310 | }
311 |
312 | // render LUTs+FF
313 | if (g_Use_GPU) {
314 | GLProtectViewport vp;
315 | glViewport(0, 0, SCREEN_H*2/3, SCREEN_H*2/3);
316 | g_ShVisu.begin();
317 | g_Quad->render();
318 | g_ShVisu.end();
319 | }
320 |
321 | // -> GUI
322 | ImGui::SetNextWindowSize(ImVec2(300, 150), ImGuiCond_Once);
323 | ImGui::Begin("Status");
324 | ImGui::Checkbox("Simulate on GPU", &g_Use_GPU);
325 | if (g_Use_GPU && !g_brams.empty()) {
326 | cerr << "this design has BRAMs, currently unsupported on GPU\n";
327 | g_Use_GPU = false;
328 | }
329 | ImGui::Text("%5.1f KHz %5.1f usec / cycle", g_Hz/1000.0, g_UsecPerCycle);
330 | ImGui::Text("simulated cycle: %6d", g_Cycle);
331 | ImGui::Text("simulated LUT4+FF %7d", (int)g_luts.size());
332 | ImGui::Text("screen row %3d",g_Y);
333 | if (!g_OutPortString.empty()) {
334 | ImGui::Text("outputs: %s", g_OutPortString.c_str());
335 | }
336 | ImGui::End();
337 |
338 | SimpleUI::renderImGui();
339 | }
340 |
341 | /* -------------------------------------------------------- */
342 |
343 | int main(int argc, char **argv)
344 | {
345 | try {
346 |
347 | /// init simple UI (glut clone for both GL and D3D)
348 | cerr << "Init SimpleUI ";
349 | SimpleUI::init(SCREEN_W, SCREEN_H);
350 | SimpleUI::onRender = mainRender;
351 | cerr << "[OK]" << endl;
352 |
353 | /// bind imgui
354 | SimpleUI::bindImGui();
355 | SimpleUI::initImGui();
356 | SimpleUI::onReshape(SCREEN_W, SCREEN_H);
357 |
358 | glDisable(GL_DEPTH_TEST);
359 | glDisable(GL_CULL_FACE);
360 |
361 | /// help
362 | printf("[ESC] - quit\n");
363 |
364 | /// display stuff
365 | g_Framebuffer = ImageRGBA_Ptr(new ImageRGBA(640,480));
366 | g_Quad = AutoPtr<GLMesh>(new GLMesh());
367 | g_Quad->begin(GPUMESH_TRIANGLESTRIP);
368 | g_Quad->texcoord0_2(0, 0); g_Quad->vertex_2(0, 0);
369 | g_Quad->texcoord0_2(1, 0); g_Quad->vertex_2(1, 0);
370 | g_Quad->texcoord0_2(0, 1); g_Quad->vertex_2(0, 1);
371 | g_Quad->texcoord0_2(1, 1); g_Quad->vertex_2(1, 1);
372 | g_Quad->end();
373 |
374 | /// GPU shaders init
375 | g_ShVisu.init();
376 |
377 | /// GPU timer
378 | g_GPU_timer.init();
379 |
380 | /// load up design
381 | vector<pair<string, int> > outbits;
382 | readDesign(SRC_PATH "/build/synth.blif", g_luts, g_brams, outbits, g_ones, g_indices);
383 |
384 | if (!g_brams.empty()) {
385 | g_Use_GPU = false;
386 | }
387 |
388 | analyze(g_luts, g_brams, outbits, g_indices, g_ones, g_step_starts, g_step_ends, g_cpu_depths);
389 |
390 | buildFanout(g_luts, g_cpu_fanout);
391 |
392 | int rank = 0;
393 | for (auto op : outbits) {
394 | g_OutPorts.insert(make_pair(op.first,v2i(rank++, op.second)));
395 | }
396 | g_OutPortsValues.allocate(rank * CYCLE_BUFFER_LEN);
397 |
398 | /// GPU buffers init
399 | simulInit_gpu(g_luts, g_ones);
400 |
401 | // init CPU simulation
402 | simulInit_cpu(g_luts, g_brams, g_step_starts, g_step_ends, g_ones, g_cpu_computelists, g_cpu_outputs);
403 |
404 | /// Quick benchmarking at startup
405 | #if 0
406 | // -> time GPU
407 | simulBegin_gpu(g_luts,g_step_starts,g_step_ends,g_ones);
408 | {
409 | ForIndex(trials, 3) {
410 | int n_cycles = 10000;
411 | g_GPU_timer.start();
412 | ForIndex(cycle, n_cycles) {
413 | simulCycle_gpu(g_luts, g_step_starts, g_step_ends);
414 | simulReadback_gpu();
415 | ++g_Cycle;
416 | }
417 | g_GPU_timer.stop();
418 | simulPrintOutput_gpu(outbits);
419 | auto ms = g_GPU_timer.waitResult();
420 | printf("[GPU] %d msec, ~ %f Hz, cycle time: %f usec\n",
421 | (int)ms,
422 | (double)n_cycles / ((double)ms / 1000.0),
423 | (double)ms * 1000.0 / (double)n_cycles);
424 | }
425 | }
426 | simulEnd_gpu();
427 | // -> time CPU
428 | {
429 | ForIndex(trials, 3) {
430 | Elapsed el;
431 | int n_cycles = 1000;
432 | ForIndex(cy, n_cycles) {
433 | simulCycle_cpu(g_luts, g_brams, g_cpu_depths, g_step_starts, g_step_ends, g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
434 | simulPosEdge_cpu(g_luts, g_cpu_depths, (int)g_step_starts.size(), g_cpu_fanout, g_cpu_computelists, g_cpu_outputs);
435 | }
436 | auto ms = el.elapsed();
437 | printf("[CPU] %d msec, ~ %f Hz, cycle time: %f usec\n",
438 | (int)ms,
439 | (double)n_cycles / ((double)ms / 1000.0),
440 | (double)ms * 1000.0 / (double)n_cycles);
441 | }
442 | }
443 | #endif
444 |
445 | /// shader parameters
446 | g_ShVisu.begin();
447 | int n_simul = (int)g_luts.size() - g_step_ends[0];
448 | int sqsz = (int)sqrt((double)(n_simul)) + 1;
449 | fprintf(stderr, "simulating %d LUTs+FF (%dx%d pixels)\n", n_simul, sqsz, sqsz);
450 | g_ShVisu.sqsz .set(sqsz);
451 | g_ShVisu.num .set((int)(g_luts.size()));
452 | g_ShVisu.depth0_end.set((int)(g_step_ends[0]));
453 | g_ShVisu.end();
454 |
455 | /// main loop
456 | simulBegin_gpu(g_luts, g_step_starts, g_step_ends, g_ones);
457 | SimpleUI::loop();
458 | simulEnd_gpu();
459 |
460 | /// clean exit
461 | simulTerminate_gpu();
462 | g_ShVisu.terminate();
463 | g_GPU_timer.terminate();
464 | g_FramebufferTex = Tex2DRGBA_Ptr();
465 | g_Quad = AutoPtr<GLMesh>();
466 |
467 | /// shutdown SimpleUI
468 | SimpleUI::shutdown();
469 |
470 | } catch (Fatal& e) {
471 | cerr << e.message() << endl;
472 | return (-1);
473 | }
474 |
475 | return (0);
476 | }
477 |
478 | /* -------------------------------------------------------- */
479 |
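The throughput figures shown in the status window (`g_Hz` and `g_UsecPerCycle`) are computed the same way in every simulation path: a cycle count divided by the elapsed time in milliseconds. A small sketch of that arithmetic follows, including the guard against a zero elapsed time that the non-VGA CPU path applies. The `Throughput` struct and the `measure` function are illustrative and do not exist in the repository.

```cpp
#include <cassert>

// Hypothetical container for the two values displayed in the GUI.
struct Throughput {
  double hz;             // simulated cycles per second
  double usec_per_cycle; // microseconds per simulated cycle
};

// Converts a cycle count and an elapsed time in milliseconds into the
// displayed figures. Returns -1 for both when the elapsed time is too
// small to measure, as the non-VGA CPU path in silixel.cc does.
inline Throughput measure(int num_cycles, double elapsed_ms)
{
  assert(num_cycles > 0);
  if (elapsed_ms <= 0.0) {
    return { -1.0, -1.0 };
  }
  return { (double)num_cycles / (elapsed_ms / 1000.0),
           elapsed_ms * 1000.0 / (double)num_cycles };
}
```

For instance, 1000 cycles measured over 500 ms correspond to 2000 Hz, or 500 microseconds per cycle.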
--------------------------------------------------------------------------------
/src/silixel_cpu.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 | // --------------------------------------------------------------
34 |
35 | #include
36 |
37 | #include
38 | #include
39 | #include
40 | #include
41 | #include
42 | #include
43 | #include
44 | #include
45 |
46 | using namespace std;
47 |
48 | #include "simul_cpu.h"
49 | #include "blif.h"
50 | #include "fstapi/fstapi.h"
51 | #define FST_TS_S 0
52 | #define FST_TS_MS -3
53 | #define FST_TS_US -6
54 | #define FST_TS_NS -9
55 | #define FST_TS_PS -12
56 |
57 | // -----------------------------------------------------------------------------
58 |
59 | typedef struct {
60 | string name;
61 | string base_name;
62 | int bit_index;
63 | int lut_index;
64 | fstHandle fst_handle;
65 | fstVarType fst_type;
66 | } t_watch;
67 |
68 | string base_name(string str)
69 | {
70 | size_t dot_pos = str.rfind('.');
71 | size_t pos = str.find('[');
72 | if (pos != std::string::npos) {
73 | if (dot_pos != std::string::npos) {
74 | return str.substr(dot_pos + 1, pos - dot_pos - 1);
75 | } else {
76 | return str.substr(0, pos);
77 | }
78 | } else if (dot_pos != std::string::npos) {
79 | return str.substr(dot_pos+1);
80 | } else {
81 | return str;
82 | }
83 | }
84 |
85 | int index(string str)
86 | {
87 | size_t s = str.find('[');
88 | size_t e = str.find(']', s);
89 | if (s == std::string::npos || e == std::string::npos || s >= e - 1) {
90 | return -1; // no index
91 | }
92 | string istr = str.substr(s + 1, e - s - 1);
93 | return std::stoi(istr);
94 | }
95 |
96 | t_watch& add_watch(string signal, const map<string,int>& indices, vector<t_watch> &_watches)
97 | {
98 | auto I = indices.find(signal);
99 | if (I == indices.end()) {
100 | fprintf(stderr, " cannot find signal '%s' to watch\n", signal.c_str());
101 | exit (-1);
102 | }
103 | t_watch w;
104 | w.name = signal;
105 | w.base_name = base_name(w.name);
106 | w.bit_index = index(w.name);
107 | w.lut_index = I->second;
108 | w.fst_handle = 0;
109 | w.fst_type = FST_VT_VCD_WIRE;
110 | _watches.push_back(w);
111 | return _watches.back();
112 | }
113 |
114 | void setFstScope(fstWriterContext *fst, string signal)
115 | {
116 | vector<string> path;
117 | split(signal, '.', path);
118 | if (!path.empty()) path.pop_back();
119 | for (auto node : path) {
120 | fstWriterSetScope(fst, FST_ST_VCD_MODULE, node.c_str(), NULL);
121 | }
122 | }
123 |
124 | void unsetFstScope(fstWriterContext *fst, string signal)
125 | {
126 | vector<string> path;
127 | split(signal, '.', path);
128 | if (!path.empty()) path.pop_back();
129 | for (auto node : path) {
130 | fstWriterSetUpscope(fst);
131 | }
132 | }
133 |
134 |
135 | // -----------------------------------------------------------------------------
136 |
137 | const char *c_ClockAnim[] = {
138 | " _____ \n",
139 | " _____/ \\ ",
140 | " _____ \n",
141 | " ____/ \\_ ",
142 | " _____ \n",
143 | " ___/ \\__ ",
144 | " _____ \n",
145 | " __/ \\___ ",
146 | " _____ \n",
147 | " _/ \\____ ",
148 | " _____ \n",
149 | " / \\_____ ",
150 | " _____ \n",
151 | " \\_____/ ",
152 | " ____ _ \n",
153 | " \\_____/ ",
154 | " ___ __ \n",
155 | " \\_____/ ",
156 | " __ ___ \n",
157 | " \\_____/ ",
158 | " _ ____ \n",
159 | " \\_____/ ",
160 | " _____ \n",
161 | " \\_____/ ",
162 | };
163 |
164 |
165 | int main(int argc,const char **argv)
166 | {
167 | bool silice_design = false;
168 | int num_cycles = 10000;
169 | const char *blif_path = SRC_PATH "/build/synth.blif";
170 |
171 | fprintf(stderr, "<<<====----- Silixel v0.1 by @sylefeb -----====>>>\n");
172 |
173 | /// parse options
174 | int i = 1;
175 | while (i < argc) {
176 | if (strcmp(argv[i], "--silice") == 0) {
177 | silice_design = true;
178 | ++i;
179 | } else if (strcmp(argv[i], "--cycles") == 0) {
180 | if (i + 1 == argc) {
181 | fprintf(stderr, "--cycles expects a parameter (integer, number of cycles to simulate)\n");
182 | exit(-1);
183 | }
184 | ++i;
185 | num_cycles = atoi(argv[i]);
186 | ++i;
187 | } else if (strcmp(argv[i], "--blif") == 0) {
188 | if (i + 1 == argc) {
189 | fprintf(stderr, "--blif expects a parameter (string, file to load)\n");
190 | exit(-1);
191 | }
192 | ++i;
193 | blif_path = argv[i];
194 | ++i;
195 | } else { ++i; }
196 | }
197 |
198 | /// checks
199 | {
200 | FILE *f = 0;
201 | fopen_s(&f, blif_path, "rb");
202 | if (f == NULL) {
203 | fprintf(stderr, " cannot open input blif file %s\n", blif_path);
204 | exit(-1);
205 | } else {
206 | fclose(f);
207 | }
208 | }
209 |
210 | /// load up design
211 | vector luts;
212 | std::vector brams;
213 | vector > outbits;
214 | vector<int> ones;
215 | map<string,int> indices;
216 | readDesign(blif_path, luts, brams, outbits, ones, indices);
217 |
218 | vector<int> step_starts;
219 | vector<int> step_ends;
220 | vector<int> depths;
221 | analyze(luts, brams, outbits, indices, ones, step_starts, step_ends, depths);
222 |
223 | vector<int> fanout;
224 | buildFanout(luts, fanout);
225 |
226 | /// initialize reset to one by adding it to ones
227 | bool has_reset = indices.count("reset") > 0;
228 | if (has_reset) {
229 | ones.push_back(indices.at("reset"));
230 | }
231 |
232 | /// simulate
233 | vector outputs;
234 | vector<int> computelists;
235 | simulInit_cpu(luts, brams, step_starts, step_ends, ones, computelists, outputs);
236 |
237 | /// automatically add all outputs as watches
238 | vector<t_watch> watches;
239 | if (silice_design) {
240 | // selection specialized to a silice design
241 | for (auto signal : indices) {
242 | if (signal.first.substr(0, 3) == "out") {
243 | auto &w = add_watch(signal.first, indices, watches);
244 | w.fst_type = FST_VT_VCD_WIRE;
245 | } else if (signal.first.find("_q_") != std::string::npos) {
246 | auto &w = add_watch(signal.first, indices, watches);
247 | w.fst_type = FST_VT_VCD_REG;
248 | }
249 | }
250 | } else {
251 | // selection for any other design
252 | for (auto signal : indices) {
253 | if (signal.first[0] != '$') {
254 | auto &w = add_watch(signal.first, indices, watches);
255 | w.fst_type = FST_VT_VCD_WIRE;
256 | }
257 | }
258 | }
259 | if (has_reset) {
260 | add_watch("reset", indices, watches);
261 | }
262 |
263 | LibSL::CppHelpers::Console::clear();
264 | LibSL::CppHelpers::Console::pushCursor();
265 | fprintf(stderr, " _____\n");
266 | fprintf(stderr, " init_/ ");
267 | // simulPrintOutput_cpu(outputs, outbits);
268 |
269 | // FST trace
270 | fstWriterContext *fst = fstWriterCreate("./trace.fst", 1);
271 | if (fst == NULL) {
272 | fprintf(stderr,"cannot open trace.fst for writing\n");
273 | exit (-1);
274 | }
275 | fstWriterSetTimescale(fst, 1);
276 | fstWriterSetScope(fst, FST_ST_VCD_MODULE, "top", NULL);
277 | // -> group individual bits
278 | map<string,int> bitcounts;
279 | for (auto &w : watches) {
280 | bitcounts[w.base_name] = max(bitcounts[w.base_name], index(w.name)+1);
281 | }
282 | set<string> added;
283 | for (auto& w : watches) {
284 | if (!added.count(w.base_name)) {
285 | added.insert(w.base_name);
286 | setFstScope(fst, w.name);
287 | w.fst_handle = fstWriterCreateVar(fst, FST_VT_VCD_REG, FST_VD_IMPLICIT, max(1,bitcounts[w.base_name]), w.base_name.c_str(), NULL);
288 | unsetFstScope(fst, w.name);
289 | }
290 | }
291 | auto fst_clock = fstWriterCreateVar(fst, FST_VT_VCD_REG, FST_VD_IMPLICIT, 1, "clock", NULL);
292 |
293 | LibSL::CppHelpers::Console::popCursor();
294 | LibSL::CppHelpers::Console::pushCursor();
295 |
296 | int anim = 0;
297 | Every ev(100);
298 |
299 | int cycles = 0;
300 | while (num_cycles == -1 || cycles < num_cycles) {
301 |
302 | if (has_reset) {
303 | if (cycles < 16) {
304 | simulSetSignal_cpu(indices.at("reset"), true, depths, (int)step_starts.size(), fanout, computelists, outputs);
305 | } else if (cycles == 16) {
306 | simulSetSignal_cpu(indices.at("reset"), false, depths, (int)step_starts.size(), fanout, computelists, outputs);
307 | }
308 | }
309 |
310 | fstWriterEmitTimeChange(fst, (cycles << 1) + 0);
311 | fstWriterEmitValueChange(fst, fst_clock, "0");
312 | simulCycle_cpu(luts, brams, depths, step_starts, step_ends, fanout, computelists, outputs);
313 |
314 | fstWriterEmitTimeChange(fst, (cycles << 1) + 1);
315 | fstWriterEmitValueChange(fst, fst_clock, "1");
316 | simulPosEdge_cpu(luts, depths, (int)step_starts.size(), fanout, computelists, outputs);
317 |
318 | int console_out = ev.expired();
319 |
320 | if (console_out) {
321 | LibSL::CppHelpers::Console::popCursor();
322 | LibSL::CppHelpers::Console::pushCursor();
323 | int a = anim % 12;
324 | fputs(c_ClockAnim[a * 2 + 0], stderr);
325 | fputs(c_ClockAnim[a * 2 + 1], stderr);
326 | ++anim;
327 | if (num_cycles > -1) {
328 | fprintf(stderr, " (%7d cycles, %3d%% completed)\n", cycles, (int)(100LL * cycles / num_cycles));
329 | } else {
330 | fprintf(stderr, "\n");
331 | }
332 | }
333 |
334 | // print and trace watches
335 | map<string,string> values;
336 | for (const auto &w : watches) {
337 | int b = w.lut_index;
338 | int lut = b >> 1;
339 | int q_else_d = b & 1;
340 | int bit = (outputs[lut] >> q_else_d) & 1;
341 | if (w.bit_index > -1) {
342 | if (values[w.base_name].empty()) {
343 | values[w.base_name].resize(bitcounts[w.base_name], '0');
344 | }
345 | values[w.base_name][bitcounts[w.base_name]-1-w.bit_index] = bit ? '1' : '0';
346 | } else {
347 | values[w.base_name] = bit ? "1" : "0";
348 | }
349 | }
350 | set<string> added;
351 | const int max_display = 16;
352 | int num_display = 0;
353 | for (const auto &w : watches) {
354 | if (!added.count(w.base_name)) {
355 | added.insert(w.base_name);
356 | fstWriterEmitValueChange(fst, w.fst_handle, values[w.base_name].c_str());
357 | if (console_out && num_display < max_display) {
358 | fprintf(stderr, "%-40s %s\n", w.base_name.c_str(), values[w.base_name].c_str());
359 | ++num_display;
360 | }
361 | }
362 | }
363 |
364 | ++cycles;
365 | // Sleep(500); /// slow down on purpose
366 | }
367 |
368 | fstWriterClose(fst);
369 |
370 | fprintf(stderr, "\n\noutput: trace.fst\n\n");
371 |
372 | return 0;
373 | }
374 |
375 | // -----------------------------------------------------------------------------
376 |
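Note on watch grouping: the trace loop above splits each watched signal name into a base name and a bit index, then packs the per-bit values into one MSB-first string per base name before emitting it to the FST writer. The following is a minimal standalone sketch of that grouping, not code from the repository. The names `watch_base_name`, `watch_bit_index` and `group_bits` are hypothetical, and the input is assumed to be a list of (signal name, bit value) pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: drop the hierarchy prefix and the bit index,
// e.g. "top.cpu.pc[3]" -> "pc".
std::string watch_base_name(const std::string& s) {
  size_t dot   = s.rfind('.');
  size_t brk   = s.find('[');
  size_t start = (dot == std::string::npos) ? 0 : dot + 1;
  if (brk == std::string::npos) return s.substr(start);
  return s.substr(start, brk - start);
}

// Hypothetical helper: return the bit index in "name[i]", or -1 for a scalar.
int watch_bit_index(const std::string& s) {
  size_t b = s.find('[');
  if (b == std::string::npos) return -1;
  size_t e = s.find(']', b);
  if (e == std::string::npos || e <= b + 1) return -1;
  return std::stoi(s.substr(b + 1, e - b - 1));
}

// Group single-bit samples into one MSB-first bit string per base name.
std::map<std::string, std::string> group_bits(
    const std::vector<std::pair<std::string, int>>& samples) {
  // first pass: vector width = highest bit index + 1 (0 for scalars)
  std::map<std::string, int> widths;
  for (const auto& s : samples) {
    int& w = widths[watch_base_name(s.first)];
    w = std::max(w, watch_bit_index(s.first) + 1);
  }
  // second pass: place each bit, most significant bit first
  std::map<std::string, std::string> values;
  for (const auto& s : samples) {
    std::string base = watch_base_name(s.first);
    int  idx = watch_bit_index(s.first);
    char c   = s.second ? '1' : '0';
    if (idx < 0) { values[base] = std::string(1, c); continue; }
    std::string& v = values[base];
    if (v.empty()) v.assign(widths[base], '0');
    assert(idx < widths[base]);
    v[widths[base] - 1 - idx] = c;
  }
  return values;
}
```

With samples `top.cnt[0]=1`, `top.cnt[1]=0`, `top.cnt[2]=1`, the sketch yields the string `101` for `cnt`. Bits that are not watched stay at `'0'`, as in the loop above.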
--------------------------------------------------------------------------------
/src/simul_cpu.cc:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #include
35 |
36 | #include
37 | #include
38 | #include
39 | #include
40 | #include
41 | #include
42 | #include
43 | #include
44 | #include
45 | #include
46 |
47 | using namespace std;
48 |
49 | #include "read.h"
50 | #include "blif.h"
51 |
52 | // -----------------------------------------------------------------------------
53 |
54 | // forward def
55 | void simulPosEdgeAll_cpu(const vector& luts, vector& _outputs);
56 |
57 | // -----------------------------------------------------------------------------
58 |
59 | static inline void simulLUT_cpu(
60 | int l,
61 | const vector& luts,
62 | vector& _outputs)
63 | {
64 | // read inputs
65 | unsigned short cfg_idx = 0;
66 | for (int i = 0; i < 4; ++i) {
67 | if (luts[l].inputs[i] > -1) {
68 | int lut = luts[l].inputs[i] >> 1;
69 | int q_else_d = luts[l].inputs[i] & 1;
70 | uchar bit = (_outputs[lut] >> q_else_d) & 1;
71 | cfg_idx |= bit ? (1 << (3 - i)) : 0;
72 | }
73 | }
74 | // update outputs
75 | uchar new_value = (luts[l].cfg >> cfg_idx) & 1;
76 | if (new_value) _outputs[l] |= 1;
77 | else _outputs[l] &= 0xfffffffe;
78 | }
79 |
80 | // -----------------------------------------------------------------------------
81 |
82 | // add LUT fanout to the compute lists
83 | static inline void addFanout(
84 | int l,
85 | int q_else_d,
86 | const vector<int>& depths,
87 | int numdepths,
88 | const vector<int>& fanout,
89 | vector<int>& _computelists,
90 | vector& _outputs
91 | ) {
92 | int cur = fanout[l];
93 | int other = fanout[cur];
94 | while (other != -1) {
95 | int other_lut = other >> 1;
96 | if (q_else_d == (other&1)) { // other uses D/Q input
97 | if ((_outputs[other_lut] & 4) == 0) { // not yet inserted
98 | _outputs[other_lut] |= 4; // tag as inserted
99 | // insert in comb. depth compute list
100 | int dpt = depths[other_lut];
101 | int cls = _computelists[dpt];
102 | int idx = _computelists[cls]++;
103 | _computelists[cls + 1 + idx] = other_lut;
104 | }
105 | }
106 | ++cur;
107 | other = fanout[cur];
108 | }
109 | }
110 |
111 | // -----------------------------------------------------------------------------
112 |
113 | static inline void simulLUT_cpu(
114 | int l,
115 | const vector& luts,
116 | const vector<int>& depths,
117 | int numdepths,
118 | const vector<int>& fanout,
119 | vector<int>& _computelists,
120 | vector& _outputs)
121 | {
122 | // skip externals
123 | if (luts[l].external) {
124 | return;
125 | }
126 | // read inputs
127 | unsigned short cfg_idx = 0;
128 | for (int i = 0; i < 4; ++i) {
129 | if (luts[l].inputs[i] > -1) {
130 | int lut = luts[l].inputs[i] >> 1;
131 | int q_else_d = luts[l].inputs[i] & 1;
132 | uchar bit = (_outputs[lut] >> q_else_d)&1;
133 | cfg_idx |= bit ? (1 << (3 - i)) : 0;
134 | }
135 | }
136 | // update outputs
137 | uchar new_value = (luts[l].cfg >> cfg_idx) & 1;
138 | if ((_outputs[l]&1) != new_value) {
139 | if (new_value) _outputs[l] |= 1;
140 | else _outputs[l] &= ~1;
141 | // fprintf(stderr, "LUT %d changed (new:%d)\n",l<<1,new_value);
142 | // add fanout to compute list
143 | addFanout(l, 0, depths, numdepths, fanout, _computelists, _outputs);
144 | // add this LUT to posedge list
145 | if ((_outputs[l] & 8) == 0) { // not yet inserted
146 | _outputs[l] |= 8; // tag as inserted
147 | // insert in posedge compute list
148 | int dpt = numdepths;
149 | int cls = _computelists[dpt];
150 | int idx = _computelists[cls]++;
151 | _computelists[cls + 1 + idx] = l;
152 | }
153 | }
154 | // reset inserted flag (preserve posedge flag)
155 | _outputs[l] &= 3|8;
156 | }
157 |
158 | // -----------------------------------------------------------------------------
159 |
160 | void simulBRAMS_cpu(
161 | vector& _brams,
162 | const vector<int>& depths,
163 | int numdepths,
164 | const vector<int>& fanout,
165 | vector<int>& _computelists,
166 | vector& _outputs)
167 | {
168 | // process BRAMs
169 | for (auto &bram : _brams) {
170 | // make rd_addr
171 | uint rd_addr = 0;
172 | for (int i=0;i < bram.rd_addr.size();++i) {
173 | int b = bram.rd_addr[i];
174 | int lut = b >> 1;
175 | int q_else_d = b & 1;
176 | uint bit = ((_outputs[lut] >> q_else_d) & 1) ? 1 : 0;
177 | rd_addr = rd_addr | (bit << i);
178 | }
179 | // make wr_addr
180 | uint wr_addr = 0;
181 | for (int i=0;i < bram.wr_addr.size();++i) {
182 | int b = bram.wr_addr[i];
183 | int lut = b >> 1;
184 | int q_else_d = b & 1;
185 | uint bit = ((_outputs[lut] >> q_else_d) & 1) ? 1 : 0;
186 | wr_addr = wr_addr | (bit << i);
187 | }
188 | // DEBUG
189 | uint32_t dbg_rdata = 0;
190 | uint32_t dbg_wdata = 0;
191 | uint32_t dbg_wen = 0;
192 | for (int i=0;i < bram.rd_data.size();++i) {
193 | int o = bram.data.bitsize() - (int)(rd_addr * bram.rd_data.size());
194 | bool bitr = bram.data.get(o - 1 - i);
195 | dbg_rdata = dbg_rdata | ((bitr?1:0) << i);
196 | if (!bram.wr_data.empty()) {
197 | int bw = bram.wr_data[i];
198 | uint bit_w = (_outputs[bw >> 1] >> (bw & 1)) & 1;
199 | dbg_wdata = dbg_wdata | ((bit_w?1:0) << i);
200 | int be = bram.wr_en[i];
201 | uint bit_e = (_outputs[be >> 1] >> (be & 1)) & 1;
202 | dbg_wen = dbg_wen | ((bit_e?1:0) << i);
203 | }
204 | }
205 | // fetch and set rd_data
206 | for (int i=0;i < bram.rd_data.size();++i) {
207 | int o = bram.data.bitsize() - (int)(rd_addr * bram.rd_data.size());
208 | bool bit = bram.data.get(o - 1 - i);
209 | int b = bram.rd_data[i];
210 | int lut = b >> 1;
211 | int q_else_d = b & 1;
212 | if (bit) {
213 | _outputs[lut] |= 0b11;
214 | } else {
215 | _outputs[lut] &= ~0b11;
216 | }
217 | // add fanout to compute list
218 | addFanout(lut, 0, depths, numdepths, fanout, _computelists, _outputs);
219 | addFanout(lut, 1, depths, numdepths, fanout, _computelists, _outputs);
220 | }
221 | // get wr_data and store
222 | if (!bram.wr_data.empty()) {
223 | for (int i=0;i < bram.wr_data.size();++i) {
224 | int o = bram.data.bitsize() - (int)(wr_addr * bram.rd_data.size());
225 | int bw = bram.wr_data[i];
226 | uint bit_w = (_outputs[bw >> 1] >> (bw & 1)) & 1;
227 | int be = bram.wr_en[i];
228 | uint bit_e = (_outputs[be >> 1] >> (be & 1)) & 1;
229 | if (bit_e) {
230 | bram.data.set(o - 1 - i, bit_w != 0);
231 | }
232 | }
233 | }
234 | // report
235 | // fprintf(stderr, "- bram %s @%08x = %08x w:@%08x=%08x(%08x)\n", bram.name.c_str(), rd_addr, dbg_rdata, wr_addr, dbg_wdata, dbg_wen);
236 | }
237 | }
238 |
239 | // -----------------------------------------------------------------------------
240 |
241 | void simulInit_cpu(
242 | const vector& luts,
243 | vector& _brams,
244 | const vector<int>& step_starts,
245 | const vector<int>& step_ends,
246 | const vector<int>& ones,
247 | vector<int>& _computelists,
248 | vector& _outputs)
249 | {
250 | _outputs.resize(luts.size(),0);
251 | // initialize ones
252 | for (int o = 0; o < ones.size(); ++o) {
253 | _outputs[ones[o] >> 1] |= 1 << (ones[o] & 1);
254 | }
255 | // resolve const cells
256 | for (int cy = 0; cy < 2; ++cy) { // those which are const, and then those that only depend on consts
257 | for (int l = step_starts[0]; l <= step_ends[0]; ++l) {
258 | simulLUT_cpu(l, luts, _outputs);
259 | }
260 | simulPosEdgeAll_cpu(luts, _outputs);
261 | }
262 | // initialize ones
263 | // Why a second time? Some of these registers may have been cleared after const resolve
264 | for (int o = 0; o < ones.size(); ++o) {
265 | _outputs[ones[o] >> 1] |= 1 << (ones[o] & 1);
266 | }
267 | // computelists
268 | int cpl_sz = (int)step_starts.size()+1; // header, one index per depth + 1 for posedge
269 | for (int d = 0; d < step_starts.size(); ++d) {
270 | int num = step_ends[d] - step_starts[d] + 1; // max entries in list for this depth
271 | cpl_sz += num + 1; // +1 for list length
272 | }
273 | cpl_sz += (int)luts.size() + 1; // final list for posedge
274 | _computelists.resize(cpl_sz,0);
275 | int offset = (int)step_starts.size()+1; // header, one index per depth + 1 for posedge
276 | for (int d = 0; d < step_starts.size(); ++d) {
277 | _computelists[d] = offset; // header, start of list (first entry is length)
278 | int num = step_ends[d] - step_starts[d] + 1; // max entries in list for this depth
279 | offset += num + 1; // +1 for list length
280 | }
281 | _computelists[step_starts.size()] = offset; // final list for posedge
282 | // -> initially we put all LUTs in the computelist
283 | for (int d = 0; d < (int)step_starts.size() ; ++d) {
284 | int cls = _computelists[d];
285 | // fill-in list
286 | for (int l = step_starts[d]; l <= step_ends[d]; ++l) {
287 | int idx = _computelists[cls]++;
288 | sl_assert(cls + 1 + idx < _computelists.size());
289 | _computelists[cls + 1 + idx] = l;
290 | }
291 | }
292 | {
293 | int cls = _computelists[step_starts.size()];
294 | for (int l = 0; l < luts.size(); ++l) {
295 | int idx = _computelists[cls]++;
296 | sl_assert(cls + 1 + idx < _computelists.size());
297 | _computelists[cls + 1 + idx] = l;
298 | }
299 | }
300 | // -> we tag all LUTs as being inserted already
301 | for (int l = 0; l < luts.size(); ++l) {
302 | _outputs[l] |= 4 | 8;
303 | }
304 | }
305 |
306 | // -----------------------------------------------------------------------------
307 |
308 | void simulCycle_cpu(
309 | const vector& luts,
310 | vector& _brams,
311 | const vector<int>& depths,
312 | const vector<int>& step_starts,
313 | const vector<int>& step_ends,
314 | const vector<int>& fanout,
315 | vector<int>& _computelists,
316 | vector& _outputs)
317 | {
318 | // BRAMs
319 | simulBRAMS_cpu(_brams, depths, (int)step_starts.size(), fanout, _computelists, _outputs);
320 | // process LUTs
321 | for (int depth = 0; depth < step_starts.size(); ++depth) {
322 | int cls = _computelists[depth];
323 | int num = _computelists[cls];
324 | // cerr << sprint("depth: %5d, num: %5d\n", depth, num);
325 | for (int n = 0; n < num ; ++n) {
326 | int l = _computelists[cls + 1 + n];
327 | simulLUT_cpu(l, luts, depths, (int)step_starts.size(), fanout, _computelists, _outputs);
328 | }
329 | // clear compute list for this depth
330 | _computelists[cls] = 0;
331 | }
332 | }
333 |
334 | // -----------------------------------------------------------------------------
335 |
336 | void simulPosEdgeAll_cpu(
337 | const vector& luts,
338 | vector& _outputs)
339 | {
340 | for (int l = 0; l < luts.size(); ++l) {
341 | uchar d = _outputs[l] & 1;
342 | uchar q = (_outputs[l] >> 1) & 1;
343 | if (d != q) {
344 | if (d) {
345 | _outputs[l] |= 2;
346 | } else {
347 | _outputs[l] &= 0xfffffffd;
348 | }
349 | }
350 | }
351 | }
352 |
353 | // -----------------------------------------------------------------------------
354 |
355 | void simulPosEdge_cpu(
356 | const vector& luts,
357 | const vector<int>& depths,
358 | int numdepths,
359 | const vector<int>& fanout,
360 | vector<int>& _computelists,
361 | vector& _outputs)
362 | {
363 | int cls = _computelists[numdepths];
364 | // process LUTs
365 | int num = _computelists[cls];
366 | // cerr << sprint("posedge num: %5d\n", num);
367 | for (int n = 0; n < num; ++n) {
368 | int l = _computelists[cls + 1 + n];
369 | uchar d = _outputs[l] & 1;
370 | uchar q = (_outputs[l] >> 1) & 1;
371 | if (d != q) {
372 | if (d) {
373 | _outputs[l] |= 2;
374 | } else {
375 | _outputs[l] &= 0xfffffffd;
376 | }
377 | // add fanout to compute list
378 | addFanout(l, 1, depths, numdepths, fanout, _computelists, _outputs);
379 | }
380 | // reset inserted flag
381 | _outputs[l] &= 7;
382 | }
383 | // clear compute list
384 | _computelists[cls] = 0;
385 | }
386 |
387 | // -----------------------------------------------------------------------------
388 |
389 | void simulPrintOutput_cpu(
390 | const vector& outputs,
391 | const vector >& outbits)
392 | {
393 | // display result
394 | int val = 0;
395 | string str;
396 | for (int b = 0; b < outbits.size() ; b++) {
397 | int lut = outbits[b].second >> 1;
398 | int q_else_d = outbits[b].second & 1;
399 | uchar bit = (outputs[lut] >> q_else_d) & 1;
400 | str = (bit ? "1" : "0") + str;
401 | val += bit << b;
402 | }
403 | fprintf(stderr,"b%s (d%03d h%03x) \n",str.c_str(),val,val);
404 | }
405 |
406 | // -----------------------------------------------------------------------------
407 |
408 | void simulSetSignal_cpu(
409 | int sig,
410 | bool v,
411 | const vector<int>& depths,
412 | int numdepths,
413 | const vector<int>& fanout,
414 | vector<int>& _computelists,
415 | vector& _outputs
416 | ) {
417 | int b = sig;
418 | int lut = b >> 1;
419 | // set D
420 | if (v) {
421 | _outputs[lut] |= 1;
422 | } else {
423 | _outputs[lut] &= ~1;
424 | }
425 | // add fanout to compute list
426 | addFanout(lut, 0, depths, numdepths, fanout, _computelists, _outputs);
427 | }
428 |
429 | // -----------------------------------------------------------------------------
430 |
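Note on the compute lists: simulInit_cpu above packs one work list per combinational depth, plus a final one for the posedge pass, into a single flat integer buffer. A header holds the start offset of each list, and each list begins with its current length followed by its entries. The following is a minimal standalone model of that layout, assuming a hypothetical `ComputeLists` wrapper that is not part of the repository. It leaves out the "already inserted" flag bits (4 and 8) that the simulator keeps in the output words to avoid duplicate insertions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the flat compute-list buffer.
// Layout: [ header: one start offset per list ]
//         [ list 0: count, entry, entry, ... ][ list 1: count, ... ] ...
struct ComputeLists {
  std::vector<int> buf;
  int num_lists;

  // capacities[i] is the maximum number of entries list i can hold
  explicit ComputeLists(const std::vector<int>& capacities)
    : num_lists((int)capacities.size())
  {
    int total = num_lists;                   // header
    for (int c : capacities) total += c + 1; // +1 for the length slot
    buf.assign(total, 0);
    int offset = num_lists;
    for (int i = 0; i < num_lists; ++i) {
      buf[i] = offset;                       // start of list i
      offset += capacities[i] + 1;
    }
  }

  // append an item, same indexing as `_computelists[cls + 1 + idx] = l`
  void push(int list, int item) {
    int cls = buf[list];
    int idx = buf[cls]++;
    assert(cls + 1 + idx < (int)buf.size());
    buf[cls + 1 + idx] = item;
  }

  int  count(int list) const     { return buf[buf[list]]; }
  int  at(int list, int n) const { return buf[buf[list] + 1 + n]; }
  void clear(int list)           { buf[buf[list]] = 0; } // entries stay, length resets
};
```

Clearing a list only resets its length slot, which matches how simulCycle_cpu and simulPosEdge_cpu empty a list after processing it. Keeping everything in one flat array presumably also eases mirroring the structure in a GPU buffer, though that motivation is an assumption here.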
--------------------------------------------------------------------------------
/src/simul_cpu.h:
--------------------------------------------------------------------------------
1 | // @sylefeb 2022-01-04
2 | /*
3 | BSD 3-Clause License
4 |
5 | Copyright (c) 2022, Sylvain Lefebvre (@sylefeb)
6 | All rights reserved.
7 |
8 | Redistribution and use in source and binary forms, with or without
9 | modification, are permitted provided that the following conditions are met:
10 |
11 | 1. Redistributions of source code must retain the above copyright notice, this
12 | list of conditions and the following disclaimer.
13 |
14 | 2. Redistributions in binary form must reproduce the above copyright notice,
15 | this list of conditions and the following disclaimer in the documentation
16 | and/or other materials provided with the distribution.
17 |
18 | 3. Neither the name of the copyright holder nor the names of its
19 | contributors may be used to endorse or promote products derived from
20 | this software without specific prior written permission.
21 |
22 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
23 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
24 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
25 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
26 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
27 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
28 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
29 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
30 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
31 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
32 | */
33 |
34 | #pragma once
35 |
36 | #include