├── .DS_Store
├── CODE_OF_CONDUCT.md
├── README.md
├── docs
│   ├── .$Flowchart.drawio.bkp
│   ├── Flowchart.drawio
│   ├── Flowchart.png
│   ├── Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf
│   ├── Wallace Tree.drawio
│   └── WallaceTree.png
└── src
    ├── 00_TESTBED
    │   ├── MAC32_top_tb.sv
    │   ├── PATTERN.v
    │   ├── PATTERN_sample.v
    │   ├── TESTBED.v
    │   └── TESTBED_sample.v
    ├── 01_RTL
    │   ├── 01_run
    │   ├── Compressor32.v
    │   ├── Compressor42.v
    │   ├── EACAdder.v
    │   ├── FullAdder.v
    │   ├── LeadingOneDetector_Top.v
    │   ├── MAC32_top.v
    │   ├── MSBIncrementer.v
    │   ├── Normalizer.v
    │   ├── PreNormalizer.v
    │   ├── R4Booth.v
    │   ├── Rounder.v
    │   ├── SpecialCaseDetector.v
    │   ├── WallaceTree.v
    │   ├── ZeroDetector_Base.v
    │   └── ZeroDetector_Group.v
    ├── 02_SYN
    │   ├── 01_run_dc
    │   ├── 09_clean_up
    │   ├── Netlist
    │   │   ├── SUBWAY_SYN.sdf
    │   │   └── SUBWAY_SYN.v
    │   ├── Report
    │   │   ├── SUBWAY.area
    │   │   ├── SUBWAY.check
    │   │   ├── SUBWAY.resource
    │   │   └── SUBWAY.timing
    │   ├── default.svf
    │   ├── syn.log
    │   └── syn.tcl
    └── 03_GATE_SIM
        ├── 01_run
        ├── 09_clean_up
        ├── SUBWAY_SYN.sdf.X
        └── irun.log
/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/.DS_Store
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Contributor Covenant Code of Conduct
2 |
3 | ## Our Pledge
4 |
5 | We as members, contributors, and leaders pledge to make participation in our
6 | community a harassment-free experience for everyone, regardless of age, body
7 | size, visible or invisible disability, ethnicity, sex characteristics, gender
8 | identity and expression, level of experience, education, socio-economic status,
9 | nationality, personal appearance, race, religion, or sexual identity
10 | and orientation.
11 |
12 | We pledge to act and interact in ways that contribute to an open, welcoming,
13 | diverse, inclusive, and healthy community.
14 |
15 | ## Our Standards
16 |
17 | Examples of behavior that contributes to a positive environment for our
18 | community include:
19 |
20 | * Demonstrating empathy and kindness toward other people
21 | * Being respectful of differing opinions, viewpoints, and experiences
22 | * Giving and gracefully accepting constructive feedback
23 | * Accepting responsibility and apologizing to those affected by our mistakes,
24 | and learning from the experience
25 | * Focusing on what is best not just for us as individuals, but for the
26 | overall community
27 |
28 | Examples of unacceptable behavior include:
29 |
30 | * The use of sexualized language or imagery, and sexual attention or
31 | advances of any kind
32 | * Trolling, insulting or derogatory comments, and personal or political attacks
33 | * Public or private harassment
34 | * Publishing others' private information, such as a physical or email
35 | address, without their explicit permission
36 | * Other conduct which could reasonably be considered inappropriate in a
37 | professional setting
38 |
39 | ## Enforcement Responsibilities
40 |
41 | Community leaders are responsible for clarifying and enforcing our standards of
42 | acceptable behavior and will take appropriate and fair corrective action in
43 | response to any behavior that they deem inappropriate, threatening, offensive,
44 | or harmful.
45 |
46 | Community leaders have the right and responsibility to remove, edit, or reject
47 | comments, commits, code, wiki edits, issues, and other contributions that are
48 | not aligned to this Code of Conduct, and will communicate reasons for moderation
49 | decisions when appropriate.
50 |
51 | ## Scope
52 |
53 | This Code of Conduct applies within all community spaces, and also applies when
54 | an individual is officially representing the community in public spaces.
55 | Examples of representing our community include using an official e-mail address,
56 | posting via an official social media account, or acting as an appointed
57 | representative at an online or offline event.
58 |
59 | ## Enforcement
60 |
61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
62 | reported to the community leaders responsible for enforcement at
63 | hankshyu@gmail.com.
64 | All complaints will be reviewed and investigated promptly and fairly.
65 |
66 | All community leaders are obligated to respect the privacy and security of the
67 | reporter of any incident.
68 |
69 | ## Enforcement Guidelines
70 |
71 | Community leaders will follow these Community Impact Guidelines in determining
72 | the consequences for any action they deem in violation of this Code of Conduct:
73 |
74 | ### 1. Correction
75 |
76 | **Community Impact**: Use of inappropriate language or other behavior deemed
77 | unprofessional or unwelcome in the community.
78 |
79 | **Consequence**: A private, written warning from community leaders, providing
80 | clarity around the nature of the violation and an explanation of why the
81 | behavior was inappropriate. A public apology may be requested.
82 |
83 | ### 2. Warning
84 |
85 | **Community Impact**: A violation through a single incident or series
86 | of actions.
87 |
88 | **Consequence**: A warning with consequences for continued behavior. No
89 | interaction with the people involved, including unsolicited interaction with
90 | those enforcing the Code of Conduct, for a specified period of time. This
91 | includes avoiding interactions in community spaces as well as external channels
92 | like social media. Violating these terms may lead to a temporary or
93 | permanent ban.
94 |
95 | ### 3. Temporary Ban
96 |
97 | **Community Impact**: A serious violation of community standards, including
98 | sustained inappropriate behavior.
99 |
100 | **Consequence**: A temporary ban from any sort of interaction or public
101 | communication with the community for a specified period of time. No public or
102 | private interaction with the people involved, including unsolicited interaction
103 | with those enforcing the Code of Conduct, is allowed during this period.
104 | Violating these terms may lead to a permanent ban.
105 |
106 | ### 4. Permanent Ban
107 |
108 | **Community Impact**: Demonstrating a pattern of violation of community
109 | standards, including sustained inappropriate behavior, harassment of an
110 | individual, or aggression toward or disparagement of classes of individuals.
111 |
112 | **Consequence**: A permanent ban from any sort of public interaction within
113 | the community.
114 |
115 | ## Attribution
116 |
117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage],
118 | version 2.0, available at
119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
120 |
121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct
122 | enforcement ladder](https://github.com/mozilla/diversity).
123 |
124 | [homepage]: https://www.contributor-covenant.org
125 |
126 | For answers to common questions about this code of conduct, see the FAQ at
127 | https://www.contributor-covenant.org/faq. Translations are available at
128 | https://www.contributor-covenant.org/translations.
129 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Implementation of a RISC-V compatible Multiply-Add Fused Unit
2 |
3 | ***The full paper can be found [here](docs/Implementation%20of%20a%20RISC-V%20compatible%20Multiply-Add%20Fused%20Unit.pdf)***
4 |
5 | ## Abstract
6 |
7 | The floating-point Multiply-Add Fused (MAF, also known as Multiply-ACcumulate, MAC) unit is popular in modern microprocessor design due to its efficiency and performance advantages. The design aims to speed up scientific computations, multimedia applications, and, in particular, convolutional neural networks for machine learning tasks. This study implements a MAF unit with RISC-V "F" extension compatibility, incorporating standard IEEE 754-2008 exception handling, NaN propagation, and denormalized number support. Five distinct rounding modes and accrued exception flags are also supported in the proposed design. We test our implementation with carefully crafted corner cases and randomly generated floating-point numbers to verify its correctness.
8 |
9 | **Index Terms—Floating-Point Unit, Multiply-Add fused, Multiply Accumulate, RISC-V**
10 |
11 | ## 1. Introduction
12 | Floating-point operations play a crucial role in modern computing, especially as the machine learning domain flourishes. Growing computational power makes training sophisticated models possible. Applying machine learning models in real-life applications typically requires floating-point computation, which is demanding because large amounts of real-time data must be processed. Moreover, deep learning algorithms with heavy floating-point requirements, such as neural networks, have recently grown in popularity. These applications further challenge the floating-point processing power of microprocessors. Among all floating-point operations, the multiply-add combination is the most demanding; it appears in the convolution layers of convolutional neural networks, digital filtering, and many other computing models' architectures.
13 |
14 | Floating-point units are available on most microprocessors nowadays. Most designs center around a fused multiply-add dataflow due to its simplicity and performance advantage over separate multiplier and adder pipelines. It combines two basic operations with only one rounding error and shares hardware components to save chip area. Such a design is also consistent with the basic RISC philosophy of heavily optimizing key units in order to rapidly carry out the most frequently expected functions. Furthermore, a fused multiply-add unit leads to more efficient superscalar CPU design, since three floating-point instructions (add, multiply, and fused multiply-add) can be scheduled to the same functional unit.
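
The "only one rounding error" property can be illustrated with a small sketch. It is written in Python rather than Verilog and works in double precision (Python's native `float`), whereas the unit in this repository targets single precision. The helper names `unfused_multiply_add` and `fused_multiply_add` are made up for this illustration and are not part of the project; exact rational arithmetic stands in for the wide internal datapath of a fused unit.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded, then the sum is rounded again."""
    product = a * b      # first rounding
    return product + c   # second rounding


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused operation: exact a*b + c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # exact, no rounding
    return float(exact)  # the single rounding step (round to nearest, ties to even)


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# in double precision and rounds to exactly 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0: the small term was lost in the first rounding
print(fused_multiply_add(a, b, c))    # -2**-60: the small term survives
```

The same effect occurs in single precision with correspondingly larger perturbations; the sketch only demonstrates the principle and says nothing about how the RTL in `src/01_RTL` is organized.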
15 |
16 | To take full advantage of the MAF dataflow, [3] transforms a set of equations into a series of multiply-adds using a numerical analysis technique called Horner's rule. [4] presents a general method for converting any transform algorithm into a MAF-optimized algorithm. [5] presents a framework for automatically generating MAF code for every linear DSP transform. These examples show that the MAF architecture is recognized in modern computing and can also be exploited through software-level optimization.
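
To make the Horner's-rule idea concrete, here is a minimal Python sketch. The function name `horner` and the sample polynomial are illustrative assumptions, not taken from [3]. Rewriting a polynomial as nested multiply-adds leaves exactly one multiply-add per coefficient, which is the operation a MAF unit executes as a single instruction.

```python
from typing import Sequence


def horner(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial whose coefficients are given highest degree first."""
    acc = 0.0
    for coeff in coeffs:
        # One multiply-add per step: acc <- acc * x + coeff.
        # On MAF hardware each iteration could map to one fused instruction;
        # plain Python rounds the product and the sum separately.
        acc = acc * x + coeff
    return acc


# 2x^2 + 3x + 4 at x = 5, evaluated as (2*5 + 3)*5 + 4
print(horner([2.0, 3.0, 4.0], 5.0))  # 69.0
```

A degree-n polynomial evaluated this way needs n multiply-add steps after the leading coefficient is loaded, instead of separate power, multiply, and add operations.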
17 |
18 | ## 3. Overall MAF Unit Architecture
19 |
20 | 
21 |
--------------------------------------------------------------------------------
/docs/.$Flowchart.drawio.bkp:
--------------------------------------------------------------------------------
1 | 7V1dc6M2F/41vowHSYDgMnGStjObbaZp323fGw9rE5sWmxSTxNlfX2EjjCQwAgQo2d2LnSCQwNI5z/mWJmi22f8Ue0/ru2jphxNoLPcTdD2B0IUG+T9teDs2ANMEx5ZVHCyztlPDQ/DNzxqzjqvnYOnvmAeTKAqT4IltXETbrb9ImDYvjqNX9rHHKGTf+uStfKHhYeGFYuuXYJmsj60OxKf2n/1gtaZvBrZ7vLPx6MPZL9mtvWX0WmhCNxM0i6MoOf612c/8MJ08Oi/HfrcVd/MPi/1tItPBvntwDXy7+P2bEf/rui/xl3/RBTqO8uKFz9kPfnjyF4EXzrydf+0nZEqjOPv+5I1OShw9b5d+Oi6YoKvXdZD4D0/eIr37SsiAtK2TTZjdFr8z+/QXP078faEp++6f/GjjJ/EbeSS7awPr2CWjImgZU/vY8npaFGBmM70uLIidtXkZHazywU9TRf7IZqvBzAELCvPiLwntZJdRnKyjVbT1wptT69Vp5gxydXrmUxQ9ZfP1t58kbxkjeM9JxM7mY7RNspsgHWKXeHFymRI6aVmE3m4XLGjzbRDSbv52SR+KyGIfWwr3Kxcp/Ulnlyj2Qy8JXlh2KZvtrOt9FJBX5EsLHW5pQb60dBDyU1Z+kvUrEjg3lMkPZQlD7aLneOELQ5Gp8d4Kjz2lD+yqP1p4U/7RVd9W/TNPNHj8ihNF5pPcnkhNgb1/M68I4qxHZ2nTMqeuqR1XW8KEffHCkMzD77Hv6zhpSINJs4VJ+5rS2Pz+XkTIEw5to63PzhALjmSa4rc/04upRS//yuGMXFzvi09ev9GrfZAUupGrvwp3Tp3SC9qniKrw3FodAaSegI6YVc+ZtfBaWFSrZE1pW0cUNk2BsGxDFj3F0VxhNEvE4gpYb4rFFjzDE72CK9X0NFIAMp4qSn+Dlf7ZEyfpf47VjBpWa88mriSbADSQGiIiK5bWRFQRFKYC4QSkl/M7b5vMvTBYbQnVlNHbJ++rH7I0cng81QfJuvgxaUhFTECMmsvsxiZYLo/k6O+Cb97Xw3jpimbcRga3ribWdQm9iWt+ljkE4ZabatlbJ0VrqEzoXRhTaCPMrM0F7AZ7dBiLE4nR4+PO72dtYcnabgprO18TVP8uF9iYmjZ2WFWw2/L2v5wAGaVr9c7BH1kYmMRWyf6vVrtAid6VXt37cUDmNqXJM8oYOKuMVUqVWmnhwGGkBRClhQNb262ptlT4h1moQ6IYUmTFuhb3C/AQWhNAos/p7mW+9kLyabc3+6f55mW+S1Hs/WBhjgZKpB1CNmBWBuiPhtZHQcPOKm3f2MNxLZI2rJQttmmPsdgnSYJBUZYQhjGM8wKlu8+0XHCSd/Dy7vCqbCJKPPSa0RLvlXT7MtL5FwE4iIUObNE1NZsHY0sWk1yH6euuvMU/qwNnzKIwig8fg25vb+Fslj8SxUs/prczOmSpTF5MUc5VobM7xGRWKaWo/GMHvUDcCD0KMROPi2vt3JUKAU1/vLKGwivuRRDVBnjqevSFcGLE4konhCuDLwnsq5Gj5xlYBbxhC5jdAC1HMHc4BHPGRbBUzBcwjChmri6KmQTN6Q9/cCj4414E3Fr4q+vRF/yJEe5LneCvB5BzFIKcQcVHV5AzzaFAzhUW/D72P0fxhizmN7KI/NIPHaMHjhiPhOPH6LEhzBsbW5qHBFB1c+L0kNBUC+Y083CEkKOJ+0J0JL7MwnWgXtLJ5DuxuD6o0xGIqu/s4XK+8OKMX8bzUMkHOwrRF9wo9q6X85MyTX1AX3leYbcsTjCyMxO2IhX4g1RGIBVqXrJ4s3vejK1vNglVAYU+QIAw5/Xopk0OIDRENWiWyosH78W/XC410B9LcjwtUwP9UTS0iMKogbBlHA45mtYgKGDgswCn/SDoSY1VpqTKpkRIJ9AV4s9pPK1IfxetjMTGKiqlXZqAalp1+infw7V0UU6xKCtSdiGyYh5sX96NytFvvhHLhMhEJgSG
BW0ELOy49hmeRPSKd/MNwFHSSUa5mIQYcwzVUU5SUQGBkIYkZm0rsiBNl2U2zIdwSz4Pd+xx+jXjMzR1uwkMPS4zNzA1h2RmAEwTI9vJOfrDMLNhKGJex51i95Ru6bK8Iu3bH9blIjDBzeVMW73ZRuPrzQCaYwIEmDRQj/XTcqGslmtJsnFHpuU1TgdzdCNbImUBdiDXsHqR2vx7nCx9sN+oHBTTrnbr4DGZb9+Xr4TyrpK0XuA4thLJwUqKC45wekR/OEqWbzmUSRr6A7tKPx4ENjYTOIR0gTEZAHBEc9sLw29+HL0ruKEBLSVwY1tYCdxcQE4R7RNhhGX85HvLYLv6davPbiT5ji5UpjrGyEqmI+rlaTWhHu4mTS1UC8OifTq0twlJwnpTAzUth3CR0mpKqvuKiS3yRW+NM9B4b5Ndm6zgoPM9xsxUGDUvk1WcsKzmxLIL7hQiGSMnXZrF1MeZ28U/MO9grY1/cD1sfDb+oY6cxRy8PBZYKW801bmAsnTKFHkdoEbnQoOlUwIoRsSp7jCu3tAmMYbFLMcAx1BS5opuI+I1y53pUW2wHbdVCVfz2hpxqxVHH1ldUp3+cPXLdhH7myNMjW1+lCQY244Gju6SveYOMzP+jJniFlHAAONPmTOKWtgR0VRnWw/kuTcNbrcLflVVOdyRUfqeypB3xXf17C8rCeTtn+bbKB5Z6H+H6bCyOwZSgNUlHZa6FYqSMq0nGZ+IvsP0+/dKRGIJh0alXCV6g6tBKRfdHWg0jMbMP6eK2YDIbVU7pmijjtAdMnpXR4BYWjX4Tj62aOtkux0fIgfe8u8xCU023nth9UNZdZvlgVOGKq12d2qEhuIgAiXWetQfqKTRdODUZKnakg4RKKPqyj28R/dpMRQNKykaVRB0gfjQidYOuDs1EGpDfHqpMdIErdxo7CSQTTHdSSCz/PSPdIaW3m6dE5TgBS9Qzi6Jo3/yg0cOLWvvKR1ys1+lB65M/dBfJHHqNJ96X8nj3iKZbp73sGRt4yghkxGl73KPLw6j10t6RsrZleuUhmmWulpMS1SZnL5UJqsaE3SrYOqCCrjRRvma8T4ei/fb+XkwEugc2TUBOwuLzNGq0zCnh9hiaKhuf9bv0wipUxWb8eUAPCS7YVFe9UBM4cxY6BrZRCI191ezVGJlOXVR9bJOfB2SGNTCbm2n8YJaJbskMeLv3Zt6RV6DbpHZLlrbZe9UPKq39brpxmJm7nvQjRVowhYUISEvaR5NE3ZETfg+2j2Maxhr6eppz6uNky/Y6uCOW7JRicRtuSyUKqmSsecDmPXf1fR5fCZAKm5MN9j2wXTRBcaa3z3vdYS83eHZKL4PtunkQBYFD98VxOSJ4zUh+vSlKoARGWcSgJiwilsSVuHJWF0syhDd4b/5u+cwmZdUkZTV/FQeZcfvDcnv3129V+S1deNcm+fmXRZkWIyhUniY4+XS09+Mwj/2W4AhbnsnXUfpsOkWABhyMFdS5sJ9VB1wNPYYWNynGs555BN+G9ehr+xe0er/43ZMFjBvwK2lngVoKHygExZdfjHbE70wVnui57PihpKWgE6rRrlv7/H0ZH7X+Q6nUFkMREO2muZwNm/xNpQi3Y5HUsGBjgmxRs6owEMe/aAZBfMHY6PWFAxSg7v6ILWBSBhwCaD1R0fU9uiJ6mmRCeveJ6bw4eMm0A4JKVx9jclfq/Sv+zha+LudBhW4/IRpcAY2QGIQ+JD/HmxX80209Mt2pe+kQVVtZihVqCeleUmcRFS5mOoyEcudJGhQAwZxFCfuziSru2HHnAJ8QiHuRAdp9GtcXcj9AsCfkV3ypVbXHoYpGQIpt8Hyyswhjk4RaxoV8munfHVGxGfSm5fv7eX3MAzEnw8CpU81Fc3iTlJeETsBo+r3VH02ty+K0KEv1Va/GqP2Wmqbs9IqOaM+5CbJQYptONBi6z/KGqBWmVak5VojnfdjuR+InNuDdld/bOXqNSY5m7ffxCx2RSTH
v2mwI6Y+kqeqotKuLwiV3Xy8c955rXYhTc82N5TZGz1b42gE45QP9U7PDXxo2tOzcJqe9LlQJfXIHD1Lhx2a0jPmtkUD54+KUkbPJSebiQTeyXJrL6cHii0hNrxq2OxCdAk18W4D2VCTuuUVS9U//68sHDn8EY9KYpg1RK3glB4IabZF150gAUtY/Ag9umcw/O6Z3FYYQObHGp6rxVybX0uTDD4WV1Mq1oqred15OKYWA1CKmfqkHtoYF/RDMDVAdaVVVZWILETIbNhgjoklJanJeXJRc38sr/cNDyYlu0j8+fHBhHKPTmACeZt2MDBxmpkBmTXZAkkAPh7YQv/vWHommaquZHcL6cT0gaKq3OEUuO1pGpDLlMQ8lqnalZh7j5OlDPdr4DqirvTZ+1yKbjru0zX0kQp9O2ewwRIBkiysqB9IOiaqjrREDeyX7eMP0hqJtGzkTh1HDXWVjTUCgYmq2bX/A7tGwy6kCrv4gUYgLTHh7v9VB778oK0BwAsroi1hIIW0RS7jKEqKj6d1c3fRMrUKb/4D
--------------------------------------------------------------------------------
/docs/Flowchart.drawio:
--------------------------------------------------------------------------------
1 | 7V3dd5u4Ev9r/BgfJAGCx3z27jlNN7fZe7vdFx9qk5hdbFKMU6d//QobYSSBESAESdqHnlgGgTUzvxnNlybocrX7EHtPy9to4YcTaCx2E3Q1gRA7kPyfDrwcBgCE1mHkMQ4W2dhx4D746WeDRja6DRb+hrkwiaIwCZ7YwXm0XvvzhBnz4jj6wV72EIXsU5+8R18YuJ97oTj6JVgky8OoA/Fx/D9+8LikTwa2e/hm5dGLs1+yWXqL6EdhCF1P0GUcRcnhr9Xu0g/TxaPrcrjvpuLb/MVif53I3GDf3rsGvpn/8dOIv7vuc/zlOzpDh1mevXCb/eD7J38eeOGlt/Gv/IQsaRRn75+80EWJo+164afzggm6+LEMEv/+yZun3/4gbEDGlskqzL4W3zN79Wc/TvxdYSh77w9+tPKT+IVckn1rg4xpMi6CljG1DyM/jkQBZrbSywJB7GzMy/jgMZ/8uFTkj2y1GqwcsKCwLv6C8E72MYqTZfQYrb3w+jh6cVw5g3w6XvMxip6y9frbT5KXTBC8bRKxq/kQrZPsS5BOsUm8ODlPGZ2MzENvswnmdPgmCOlt/npBL4oIsQ8jhe8riZT+pJMkiv3QS4JnVlzKVju79S4KyCNy0kKHIy3ISUsnIT/l0U+y+4oMzk1l8lNZwlSbaBvPfWEqsjTeS+Gyp/SCTfVLC0/KX7rq3ap/5pEHD29x5Mh8kdszqSmI92fzgiDOcnCRNi1z6pqjk2pLWLAvXhiSdfgj9v0xLhoawaLZwqJ9S3lsdncnIuQRh9bR2mdXiAVHskzxy5/ph6lFP37N4Yx8uNoVr7x6oZ92QVK4jXz6WvjmeFP6gd5TRFV4ilYHAKlnoANm1UtmLbwWiGqV0JSOdURh0xQYyzZk0VOczRVms0QsroD1plhswRMy0Su4UktvRAZAJlNF7W+w2j+74qj9T4maUSNq7cXElRQTgDSZISKyYmlLRBVDYaoQjkB6Prv11snMC4PHNeGaMn776H3zQ5ZH9pen9iChix+TgVTFBGRTc559sQoWiwM7+pvgp/dtP19K0UzayOTWxcS6KuE3keYnhUNQbvlWLXvqpLgbKlN6Z8YU2ggztDmD3WCPTmNxKjF6eNj4/dAWltB2VaDtbElQ/V0S2JiaNnZYU7AbefsnJ0BGKa1eOfgjCwOT7FWy/6vNLlBid6Wf7vw4IGub8uQJYwycNMYqtUqttqDepr61BRC1hQNb71tTa6nwD7NQh0Q1pGgX61rcL8A6rCaARJ/T7fNs6YXk1W6ud0+z1fNsk6LY68HCHA2UaDuEbMBQBowfDa1XiIbNHGGyJm3f2MNJLZLeWCkjtmkPQeyjJsGgqEuIwBjGaYXS3WdarjjJM3h9t39UthAlHvqR8RLvlXT72qTzDwJQyw4d2KJr6nIWDK1ZTPI5TB934c3/edxLxmUURvH+ZdDNzQ28vMwvieKFH9OvMz5kuUxeTVHJVWGzO2TLrFJLUf3HTnqGuBl6VGImHhbX2rkrFQLa+PHK0oVX3IMgqg3w1N3RF8KJEYuLMSFcGXxJYF+NHj0twCrgDVvA7AZoOYK5+hDMGRbBUjVfwDBimLljMcwkeG788Ad1wR/3IODWwl/dHX3BnxjhPh8T/PUAco5CkDOo+ugKcqapC+RcgeB3sf8pileEmD8JEXnS647RA0eMR8LhY/TYENaNjS3NQgKoY3Pi9JDQVAvmNPNwgJCjiftCdCQ+zMJ1oF5yk8nfxOK6VqcjEE3fy/vz2dyLM3kZzkMlH+woRF9wo9h7H6Gg7kJTH9BXnlfYLYsTDOzMhK1YBf5ilQFYhW4vWbzZbFdD25tNQlVAoQ8QIMx5PbpZkxqUhmgGXab64t579s8XixHYjyU5npY5AvtR3GgRg3EEypZxOORoWoOggIHPApz2g6BHM1aZkSqbEiGdQFeIP6fxtCL/nbXaJDY2USnv0gRU06qzT/k7XGssxikWdUUqLkRXzIL186sxOfrN
N2KFEJnIhMCwoI2AhR3XPiGTiH7i3XwaJEo6yShXkxBjTqA66kmqKiAQ0pDErG1FO0jTZYUN8yHcktfDHe84/prhBZq63QSBHlaYG2w1dQozAKaJke3kEv1mhNkwFAmv406xe0y3dFlZkfbt63W5CEJwfX45WrvZRsPbzQCaQwIEmDQwj8dn5UJZK9eSFOOOQstbnA7m+Ea2RMoC7ESuYfWitfnnOFn6YL9ROSimXW2WwUMyW78uXwmVXSVpvcBxbCWag9UUZxzj9Ij+cJAs33Iok9zoa3aVvj0IbLxN4BDSBcZEA+CI220vDH/6cfSq4IYGtJTAjW1hJXBzBjlDtE+EEcj40fcWwfrx9/V4upHkHV2oTnWMgY1MR7TL02rCcbibRrpDtTAs7k91e5uQJKw33aCm5RAuUlpNSW1fMbFFvuitcQYa722ya5MVHHT6jiEzFQbNy2QNJyxrObHigjuFSIbISZcWMfVx5nbxD8w7WGvjH9wdNj4Z/1DHzmIOXh4LrNQ3I7W5gLJ0yhR5HaDG5kLa0ikBFCPi1HYY1m5okxjDYpZjgEMoKXNFt1HxI8ud6dFssB23VQlX89oasdWKMx5dXVKdfn/x23oe+6sDTA29/ShJMLadETi6S3rN7Vdm+BUzxRZRwADDL5kziFk4rqp5XZ570+C6XfBUVeVwR0bpcypD3hXv1bO/rCSQt3uaraN4YKX/DtNhZTsGUoAdSzosdSsUNWVaTzI8E73D9PvXykRiCceISrlK7AZ3BKVctDvQYBiNmX9OlbABUdqqOqaMxhyhHTJ6N0eAWFqlvZOPLe51sm7H+8iBt/h7SEaTjfeeWf1wVl2zPHDMUKXV7k6N0lAcRKDMWo/6mkoaTQdOTZarLekQgTKuruzhPbhPi+FoWMnRqIKhC8yHjry2x92pgVAb5huXGSPN0Mo3jZ0UsimmOwlslp/+ka7Qwtssc4YSvOAFztkkcfRPfvDIfmTpPaVTrnaP6YErUz/050mcOs2n3jdyuTdPpqvtDpbQNo4SshhR+iz38OAw+nFOz0g5SblOaZgmFOPjpiUaTE5fBpNVjQhjq1/qggm4UZv8kUk+Hkry23l5MBK4HNk14ToLi6LR6iY9Z4fYYmCorjvr+9yC1BmKzeRSgwzJtivKax7IRjjbKnSNayKRm/urWCrZYzl1MfWym/gqJDGkhd3am4YLaZX0SGLU36vf6BVlDbpFYTtrvSt7pepR/U6vm2Us5uW+BstYgR1sQRES8oLmwSxhR7SE76LN/bDb4rE7ekw7zxzJMeVUumgPnh6azlEv/7CZbj9jC5M7doOj6pDr9ixUSalS8Kdjp/Xv1fR6fCI2K/bE09a5mHKHINWz2+1ujHi72V8bxXfBOl0cyELw/r2CmFxx+EykI32oClRGxoncIyai45ZEdHg2VhcGM0RP/Gd/sw2TWUkBS1m5UeUpenxbSr51eHWbyivr2rkyT627LMiwGENNAD0n26UHzxmFf+y7AEPsuCddwumwmR4AGHIwV1Jhw71UHXA0dldY3KsazmnkE34bd0NficWiy+F/N0OKgHkNbiz1IkCj8JoOd3R5YrZnemGu9kzPJ+Tp0paALuuI0u5e48HNfMP7DgdgWQxEQ7aQZ38scPFrKMW6HU/DgppOKLEGTubAOk+dGBkH82dyo9YcDNLdfvUZbppYGHC5p/WnVtTe0RPX0/oWNrZA9uH7l5tAOySscPEtJn89pn/dxdHc32xGUPzLL9gIjt8GSIw/71Pvg/XjbBUt/LKG+J0sqKo+ilI1glKWl8QhSJXEVJcEWe4kQVo3MIjjOLExlKzthh1zCvARhbjDJKTRr3FhI/cLAH88d8mbWl3vMEzJ+Ev5HiwvCtVxaotYTqlQXjulyjMqPtPevH5vr7/1CBB/NAmUPlBV3BZ30vKKxAkYVb+n6rW5lizCDX2ZtuMrb2pvpbY5pq1SMurjfZISpHgPB1p0HaSiAWqNaUVWrjXQUUOW
+4bYuT1od/XHVlKvMcvZ/P5NTKBXxHL8k7SdbvWWPFUVRX59Qahs3/POKe+11oU0P9vcVGZv/GwNYxEMU7nUOz838KGNnp+Fg/ykj6QqKYXm+Fk67NCUnzHXkQ2cPqVKGT+XHKomMninnVt7Pa0ptoTY8Kphs4ToEmri3QayoSZ15BWr5D/9vywcqf90SSUxzBqmVnBAEIQ026JrE0rAMhY/Q4/uGQzfvZDbCgPI/Fz6pVrMtfm9NMngbUk15eJRSTVvO+sTajEApVioj+ahjXHBPgRTA1SXeVWVqFRCRH1rCFlfkqZ8LDEvOk8uau6P5e0+/WBS0sDiz7cPJlR6xgQmkN/TagMTp9k2INtNtkASgA9nxdD/O9a9SebJK2msId2QUFNUlTsXA7c9yANymZKYxzJVDZG55zhZynC/G1xHtJU+eZ9K0W2MLcJ0n+bQt3MGGywTIMnCivqJpGOi6lhLtMB+Wz/8Yq2BWMtG7tRx1HBX2VwDMJhoml35v7BrMOxCqrCLn2gA1hIT7v6qOmvmF29pAC+siLeEifrkre3yMzY+/PXf7/dgt9i4wL/6uqFpaJJbiYxWeRUfOLWvaE8eTW5Gk138vG9QYx8jN1GetKveK1BKQ9Fofjc0tIEiGvIT6aZhM//gm6ahsPSvRQ5FA/D90NAQ20+0hlNUP1fPlKQcqde2mm/j55wVGL6oJH/RD1b6S3T1lxXrVKQI1rwFQ/lzZIP63PWd3Vin1rzYqHwbJsFTGJQ0mU78XTIpaX3AVZ88ELOXG5J35peVKUniS4NKJVsQ27yDQQGE7BIQ4nFaolSJfIyjKCkSLu0KcRst0pjH9b8=
--------------------------------------------------------------------------------
/docs/Flowchart.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/Flowchart.png
--------------------------------------------------------------------------------
/docs/Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf
--------------------------------------------------------------------------------
/docs/Wallace Tree.drawio:
--------------------------------------------------------------------------------
1 | 7Z1dc6M2FIZ/jS+TAQkBvkyz2bYzu9OdZqZtrnaIkW1ajFyQN3Z/fYVBBmSZDwdLsMtVzEEcsPTwvrIkyAw+bvY/x952/Zn4OJwBw9/P4IcZAKZh2OxPGjlkETR3s8AqDvy8UBF4Dv7D/Mg8ugt8nFQKUkJCGmyrwQWJIryglZgXx+StWmxJwupZt94KnwWeF154Hv0z8Ok6i7rAKeK/4GC15mc27Xm2Z+Pxwvk3SdaeT95KIfg0g48xITT7tNk/4jCtPF4v2XEfL+w9XViMI9rmgAdwWP5GvF+f0J583v1On57dlzszT5PQA//G2GcVkG+SmK7JikRe+FREf4rJLvJxmtZgW0WZT4RsWdBkwb8xpYe8Nb0dJSy0ppsw34v3Af0rPfzeQfnmS2nXh32e+rhxyDeWJKJ5RtNKd0Y0PpSzpNsv/Mh0o8hz3OKJsm+cfs2LNclrheziBa6rvpxIL15hWlMOndqb3SiYbDC7HnZcjEOPBt+q1+HlxK5O5U6HfiEBu0Jg7Pm9lKOV31sWR42nyK4rP6pA4yGOvUOp2DYtkFw+j20L5zEE0pquq1qefciugG+V6qQIHentQjLQSzJoT3KJ3E7ginfAWEHuzJ9R5QnOG/iza8vfiL/slN+8cJdXwwzYIc2brQKm/e+O8B13ybFBH1gB09ruj63K97NPq/zvMdErD3z6w/xq8Ci72lexJItlZxUSxDwC2Skz29xsY5wkJD67e4p7I6X5bR1Q/Lz1jhS9Ma+v3gfLIAwfScjSpMdC38PucsHiCY3JP7i0x164+HV5gvcbjine1+N7jhtvZ0doZ65/b4VTmzy2Lrk0Mi4TWqLkCgjs0dhpSYTAUFQIDFmFEOioQlCHCjljdMFu/bcBAGjrANCqt7Xz8kgDgECxDZqTDVqi0Oi3QXeMNjgUEYJDFiHodhShete8kQvOx+iCg+mGDRtA0G0wAuoYjICKXRBMLgjmQ3PB08DvqFxwMH1xq6UKOTpUiMlIJxuEpo6+uObB/XEPiQ6aP9Pt5oKgvvxt+LMUuyCcXNAUf/Prd0H447lgwkSDPqSzzywSkQjz2Mcgrby+ZyFRS6VydSiVOBmIzIZhU1Bb/kZOaf1wTjkxenmoHjW4qSUyrcJNkVI3BdMEY9Gup14T1O6maIxu+r6RLcVKZQ9ZqcQ5nUalsjUoFdA8C65jEnKgkM51QCoOuTZBKs4pKIHUVmyn00QlSzY4O9W8XELDRKVioXIGLVRmR6Gqd98buanmyXQdk5nfKaTv0ipHsWGBybCAONes37A0r2y4zrDmEJTV4M64NwyzQRKOW19wHLCKw/EwdMJtqROmIcdKjVBwSMdlGLaFzhhBEyM3YsRVaiZwGkw8X6aJLN1mAsfz8F9JKCByunQth6gR81FoBDPtEfqIUaEjtRHTnhC5ESJzxTYyDaKdrXPUbyN8EEVCQanxrPo+QNpKLWgBDbQkWy96VyJLwsjpqrPs/IwX0TFaoCOoi4gSwq5vyVBywSu07X5QEoezbCRByVLaI5EtnBY9KPIFGS7Va7URNLwyoFmw5U1SqnJUU+PvfGOA+CQ/NISWzHzp7I0B3RNdePVAbwvsZashegTl1Akp90FO3Q5pH6QtI9raXnyKXnxbROu2FxOpbnvZ1N2IREIXAMic3zt2P/e/LFfLt4/0hoFsQHySgDoCLLe51doSIMulWghko1gTAbUEoB4JkORSTYDsB+hkBY0YQKc/K5DlUmwF/DfKJATtCbD6EwJZLsVCYMlevzQRUEtAiy5cawJadC1vTQCYrOAaDADszwpkuVRbwY1HkL5DIQBGf0Igy6VaCGSPnU4E1BFgtujCtSVAlks1Ad0GBxehlyTB4ko36LbQsS0J1z93rogY8bU+V1uGMJ0F7HasdH5gXlzK1/AYKkC15W/0iN94f8e0fcesttHvSy9J6zz6LSZSLG6o2y+djuJWXYFhs95U6yUYbENcQ9EnPBfWOSgaNhOmu2zn
Snos4b0xjqWYnm6/kkZBjzYoDOPeANXmvNYGLeNWXLDN4n8MZMWL/9QAn/4H
--------------------------------------------------------------------------------
/docs/WallaceTree.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/WallaceTree.png
--------------------------------------------------------------------------------
/src/00_TESTBED/PATTERN.v:
--------------------------------------------------------------------------------
1 | `ifdef RTL
2 | `define CYCLE_TIME 20.0
3 | `endif
4 | `ifdef GATE
5 | `define CYCLE_TIME 20.0
6 | `endif
7 |
8 |
9 | module PATTERN #(
10 | parameter PARM_RM = 3,
11 | parameter PARM_XLEN = 32,
12 | parameter PARM_RM_RNE = 3'b000,
13 | parameter PARM_RM_RTZ = 3'b001,
14 | parameter PARM_RM_RDN = 3'b010,
15 | parameter PARM_RM_RUP = 3'b011,
16 | parameter PARM_RM_RMM = 3'b100
17 | ) (
18 | output reg clk,
19 | output reg rst_n,
20 |
21 | output reg [PARM_RM - 1 : 0] Rounding_mode_i,
22 | output reg [PARM_XLEN - 1 : 0] A_i,
23 | output reg [PARM_XLEN - 1 : 0] B_i,
24 | output reg [PARM_XLEN - 1 : 0] C_i,
25 |
26 | input [PARM_XLEN - 1 : 0] Result_o, // T (result_o) = A + (B * C)
27 | //Accrued exceptions (fflags)
28 | input NV_o,
29 | input OF_o,
30 | input UF_o,
31 | input NX_o );
32 |
33 | //================================================================
34 | // Main Function
35 | //================================================================
36 | initial begin
37 | $display("Welcome to the RISCV MAC project!");
38 | #5;
39 | $finish;
40 | end
41 |
42 |
43 | endmodule
--------------------------------------------------------------------------------
/src/00_TESTBED/PATTERN_sample.v:
--------------------------------------------------------------------------------
1 | `ifdef RTL
2 | `define CYCLE_TIME 10.0
3 | `endif
4 | `ifdef GATE
5 | `define CYCLE_TIME 10.0
6 | `endif
7 |
8 | module PATTERN(
9 | // Output Signals
10 | clk,
11 | rst_n,
12 | in_valid,
13 | init,
14 | in0,
15 | in1,
16 | in2,
17 | in3,
18 | // Input Signals
19 | out_valid,
20 | out
21 | );
22 |
23 | //================================================================
24 | // INPUT AND OUTPUT DECLARATION
25 | //================================================================
26 | /* Input for design */
27 | output reg clk, rst_n;
28 | output reg in_valid;
29 | output reg [1:0] init;
30 | output reg [1:0] in0, in1, in2, in3;
31 |
32 | /* Output for pattern */
33 | input out_valid;
34 | input [1:0] out;
35 |
36 | //================================================================
37 | // integer
38 | //================================================================
39 | real CYCLE = `CYCLE_TIME;
40 | parameter PATNUM = 300;
41 | integer SEED = 82;
42 | integer total_latency;
43 | integer patcount;
44 | reg [1 : 0] map [64-1 : 0][4-1 : 0];
45 | reg [1:0]init_in;
46 | integer wait_val_time;
47 | integer i;
48 | integer cac; //check answer cycle
49 |
50 | integer bonus_point;
51 | integer sum_bonus;
52 |
53 | reg [1:0] spotA, spotB, move;
54 | reg [1:0] current_line;
55 |
56 | integer resetted;
57 |
58 | //================================================================
59 | // parameter
60 | //================================================================
61 |
62 | parameter M_FORWARD = 2'd0;
63 | parameter M_RIGHT = 2'd1;
64 | parameter M_LEFT = 2'd2;
65 | parameter M_JUMP = 2'd3;
66 |
67 | parameter S_ROAD = 2'd0;
68 | parameter S_LO = 2'd1;
69 | parameter S_HO = 2'd2;
70 | parameter S_TRAINS = 2'd3;
71 |
72 | parameter PRINT_MSG = 1; //set to 0 to hand in the exercise
73 |
74 |
75 | //================================================================
76 | // clock
77 | //================================================================
78 | initial clk = 0;
79 |
80 | always #(CYCLE/2.0) clk = ~clk;
81 |
82 | //================================================================
83 | // initial
84 | //================================================================
85 | initial begin
86 | resetted = 0;
87 | @(posedge rst_n)
88 | resetted = 1;
89 | end
90 |
91 | always begin
92 |
93 | if(resetted)begin
94 | // The out should be reset when your out_valid is low.
95 | if(out_valid == 1'b0 && (out != 2'b00))begin
96 | $display("SPEC 4 IS FAIL!");
97 | $finish;
98 | end else if((in_valid === 1) && (out_valid !==0))begin
99 | // The out_valid should not be high when in_valid is high.
100 | $display("SPEC 5 IS FAIL!");
101 | $finish;
102 | end
103 | end
104 | #(CYCLE/10.0);
105 | end
106 |
107 |
108 | initial begin
109 |
110 | rst_n = 1'b1;
111 | in_valid = 1'b0;
112 | init = 'bx;
113 | in0 = 'bx;
114 | in1 = 'bx;
115 | in2 = 'bx;
116 | in3 = 'bx;
117 | sum_bonus = 0;
118 | force clk = 0;
119 | total_latency = 0;
120 | genmap;//this is to avoid starting out_valid...
121 | reset_signal_task;
122 | check_ans;
123 | for(patcount=1; patcount<=PATNUM; patcount=patcount+1) begin
124 | if(PRINT_MSG) $display("PATTERN:%05d",patcount);
125 | input_task;
126 | wait_out_valid;
127 | check_ans;
128 |
129 | @(negedge clk);
130 | check_ans;
131 | @(negedge clk);
132 | check_ans;
133 | @(negedge clk);
134 |
135 | end
136 | if(PRINT_MSG) $display("Total BONUS: %d running %d cycles!!",sum_bonus,PATNUM);
137 | YOU_PASS_task;
138 | end
139 |
140 | //================================================================
141 | // task
142 | //================================================================
143 |
144 | task reset_signal_task;
145 | begin
146 | #(0.5); rst_n=0;
147 | #(2.0);
148 | if((out_valid !== 0)||(out !== 0)) begin
149 | $display("SPEC 3 IS FAIL!");
150 | // The reset signal (rst_n) would be given only once at the beginning of simulation. All output
151 | // signals should be reset after the reset signal is asserted.
152 | $finish;
153 | end
154 | #(10); rst_n=1;
155 | #(3); release clk;
156 | end
157 | endtask
158 |
159 |
160 | task input_task;
161 | begin
162 | // Inputs start from the second negative edge after the beginning of the clock
163 | if(patcount=='d1)begin
164 | repeat(2)@(negedge clk);
165 | end
166 |
167 | genmap;
168 | if(PRINT_MSG) printmap;
169 |
170 | in_valid = 1'b1;
171 | for (i = 0; i < 64; i = i+1) begin
172 | init = (i == 0)? init_in : 2'bxx;
173 |
174 | in0 = map[i][0];
175 | in1 = map[i][1];
176 | in2 = map[i][2];
177 | in3 = map[i][3];
178 |
179 | if(out_valid !== 0)begin
180 | $display("SPEC 5 IS FAIL!");
181 | // The out_valid should not be high when in_valid is high.
182 | $finish;
183 | end
184 |
185 | @(negedge clk);
186 | //disable input
187 |
188 | end
189 |
190 | in_valid = 1'b0;
191 | init = 2'bx;
192 | in0 = 2'bx;
193 | in1 = 2'bx;
194 | in2 = 2'bx;
195 | in3 = 2'bx;
196 | end
197 | endtask
198 |
199 | task wait_out_valid; begin
200 | wait_val_time = -1;
201 | while(out_valid !== 1) begin
202 | wait_val_time = wait_val_time + 1;
203 | if(wait_val_time == 3000)begin
204 | $display("SPEC 6 IS FAIL!");
205 | // The execution latency is limited in 3000 cycles. The latency is the time of the clock cycles
206 | // between the falling edge of the in_valid and the rising edge of the out_valid.
207 | $finish;
208 | end
209 | if(out !== 2'b00)begin
210 | $display("SPEC 4 IS FAIL!");
211 | // The out should be reset when your out_valid is low.
212 | $finish;
213 | end
214 | @(negedge clk);
215 |
216 | end
217 | total_latency = total_latency + wait_val_time;
218 | end endtask
219 |
220 | task check_ans;
221 | begin
222 |
223 | //++++++++++++++++++++++++++++++++++++++++++++++++
224 | // Check the answer here
225 | cac = 0;
226 | bonus_point = 0;
227 |
228 |
229 | while(out_valid)begin
230 |
231 | if((cac > 62) || !((out === 2'd0)|| (out === 2'd1)|| (out === 2'd2)|| (out === 2'd3)))begin
232 | $display("SPEC 7 IS FAIL!");
233 | // The out_valid and out must be asserted successively in 63 cycles.
234 | $finish;
235 | end
236 | else begin
237 | //check for incorrect answers
238 |
239 | if(cac == 0)begin
240 | current_line = init_in;
241 | end
242 |
243 | if((current_line == 0 && out == M_LEFT) || (current_line == 3 && out == M_RIGHT))begin
244 | $display("SPEC 8-1 IS FAIL!");
245 | // - SPEC 8-1 (5%): The character cannot run outside the map.
246 | $finish;
247 | end else if( ((out == M_FORWARD) && (map[cac+1][current_line] == S_LO)) ||
248 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_LO)) ||
249 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_LO)) )begin
250 | $display("SPEC 8-2 IS FAIL!");
251 | // - SPEC 8-2 (5%): The character must avoid hitting lower obstacles.
252 | $finish;
253 |
254 | end else if( ((out == M_JUMP) && (map[cac+1][current_line] == S_HO)) ||
255 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_HO)) ||
256 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_HO)) )begin
257 | $display("SPEC 8-3 IS FAIL!");
258 | // - SPEC 8-3 (5%): The character must avoid hitting higher obstacles.
259 | $finish;
260 | end else if( ((out == M_FORWARD) && (map[cac+1][current_line] == S_TRAINS)) ||
261 | ((out == M_JUMP) && (map[cac+1][current_line] == S_TRAINS)) ||
262 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_TRAINS)) ||
263 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_TRAINS)) )begin
264 | $display("SPEC 8-4 IS FAIL!");
265 | // - SPEC 8-4 (5%): The character must avoid hitting trains.
266 | $finish;
267 | end else if((out == M_JUMP) && (map[cac][current_line] == S_LO))begin
268 | $display("SPEC 8-5 IS FAIL!");
269 | // - SPEC 8-5 (5%): If you are on a lower obstacle (2'b01), you cannot use jump.
270 | $finish;
271 | end
272 |
273 | if(PRINT_MSG)begin
274 |
275 | if(cac % 8 == 0)$write("Block %02d: ",cac/8);
276 |
277 | if(out == M_FORWARD)begin
278 | $write("%dF ",current_line);
279 | bonus_point = bonus_point + 1;
280 |
281 | end else if(out == M_JUMP)begin
282 | $write("%dJ ",current_line);
283 | bonus_point = bonus_point + 4;
284 |
285 | end else if(out == M_LEFT)begin
286 | $write("%dL ",current_line);
287 | current_line = current_line - 1;
288 | bonus_point = bonus_point + 2;
289 | end else if(out == M_RIGHT)begin
290 | $write("%dR ",current_line);
291 | current_line = current_line + 1;
292 | bonus_point = bonus_point + 2;
293 |
294 | end
295 | if(cac % 8 == 7)$display();
296 | end else begin
297 | if(out == M_LEFT)begin
298 | current_line = current_line - 1;
299 | end else if(out == M_RIGHT)begin
300 | current_line = current_line + 1;
301 |
302 | end
303 | end
304 |
305 | end
306 | @(negedge clk);
307 |
308 | cac = cac + 1;
309 |
310 | end
311 | //+++++++++++++++++++++++++++++++++++++++++++++++
312 | if((cac < 63) && (cac != 0)) begin
313 | $display("SPEC 7 IS FAIL!");
314 | // out_valid and out must be asserted for 63 consecutive cycles.
315 | $finish;
316 | end
317 | if(PRINT_MSG)begin
318 | $display();
319 | $display("Bonus is : %d",bonus_point);
320 | sum_bonus = sum_bonus + bonus_point;
321 | $display("\033[0;34mPASS PATTERN NO.%4d,\033[m \033[0;32mexecution cycle : %3d\033[m",patcount ,wait_val_time);
322 | end
323 |
324 | end
325 | endtask
326 |
327 | task YOU_PASS_task;
328 | begin
329 | if(PRINT_MSG)begin
330 | $display("\n");
331 | $display("\n");
332 | $display(" ---------------------------- ");
333 | $display(" -- -- |\__|| ");
334 | $display(" -- Congratulations !! -- / O.O | ");
335 | $display(" -- -- /_____ | ");
336 | $display(" -- Simulation out!! -- /^ ^ ^ \\ |");
337 | $display(" -- -- |^ ^ ^ ^ |w| ");
338 | $display(" ---------------------------- \\m___m__|_|");
339 | $display("\n");
340 | end
341 |
342 | #(500);
343 | $finish;
344 | end
345 | endtask
346 |
347 | task genmap;
348 | integer idx, jdx, kdx, ldx;
349 | integer grow_obstacles, grow_trainnum, grow_trainpos;
350 | begin
351 | for (idx = 0; idx < 64; idx = idx+1) begin
352 | for(jdx = 0; jdx < 4; jdx = jdx + 1)begin
353 | //the map is covered with road
354 | map[idx][jdx] = 2'b00;
355 | //grow high and low obstacles
356 | if((idx % 'd2 == 0) && (idx % 'd8 != 0))begin
357 | grow_obstacles = $random(SEED) % 'd3;
358 | if(grow_obstacles == 1) map[idx][jdx] = 2'b01; //low obstacles (LO)
359 | else if(grow_obstacles == 2) map[idx][jdx] = 2'b10; //high obstacles (HO)
360 | end
361 | end
362 | end
363 | // put on the trains
364 | for (jdx = 0; jdx < 8; jdx = jdx+1) begin
365 | grow_trainnum = ($random(SEED) %'d3)+1;
366 | for(ldx = 0; ldx < grow_trainnum; ldx = ldx+1)begin
367 | grow_trainpos = $random(SEED) %'d4;
368 | while(map[jdx*8][grow_trainpos] == 2'b11)begin
369 | grow_trainpos = $random(SEED) %'d4; //regenerate position
370 | end
371 | map[jdx*8 + 0][grow_trainpos] = 2'b11;
372 | map[jdx*8 + 1][grow_trainpos] = 2'b11;
373 | map[jdx*8 + 2][grow_trainpos] = 2'b11;
374 | map[jdx*8 + 3][grow_trainpos] = 2'b11;
375 | end
376 | end
377 | //generate the input position
378 | init_in = $random(SEED) % 'd4;
379 | while(map[0][init_in] == 2'b11)begin
380 | init_in = $random(SEED) % 'd4;
381 | end
382 |
383 | end
384 |
385 | endtask
386 |
387 | task printmap;
388 | integer idx, jdx;
389 | begin
390 | $display("\t >< ||0***+*** 1***+*** 2***+*** 3***+*** 4***+*** 5***+*** 6***+*** 7***+*** (%d)",init_in);
391 | for (jdx = 0; jdx < 4; jdx = jdx+1) begin
392 | $write("%d ||",jdx);
393 | for (idx = 0; idx < 64; idx = idx+1) begin
394 | $write("%d",map[idx][jdx]);
395 | if(idx % 'd8 == 7) $write(" ");
396 | end
397 | $display();
398 | end
399 | end
400 | endtask
401 |
402 | endmodule
403 |
404 |
405 | // fail.txt
406 | // SPEC 3 IS FAIL!
407 | // The reset signal (rst_n) would be given only once at the beginning of simulation. All output
408 | // signals should be reset after the reset signal is asserted.
409 |
410 | // SPEC 4 IS FAIL!
411 | // The out should be reset when your out_valid is low.
412 |
413 | // SPEC 5 IS FAIL!
414 | // The out_valid should not be high when in_valid is high.
415 |
416 | // SPEC 6 IS FAIL!
417 | // The execution latency is limited to 3000 cycles. The latency is the number of clock cycles
418 | // between the falling edge of in_valid and the rising edge of out_valid.
419 |
420 | // SPEC 7 IS FAIL!
421 | // out_valid and out must be asserted for 63 consecutive cycles.
422 |
423 |
424 | // SPEC 8-1 IS FAIL!
425 | // - SPEC 8-1 (5%): The character cannot run outside the map.
426 |
427 | // SPEC 8-2 IS FAIL!
428 | // - SPEC 8-2 (5%): The character must avoid hitting lower obstacles.
429 |
430 | // SPEC 8-3 IS FAIL!
431 | // - SPEC 8-3 (5%): The character must avoid hitting higher obstacles.
432 |
433 | // SPEC 8-4 IS FAIL!
434 | // - SPEC 8-4 (5%): The character must avoid hitting trains.
435 |
436 | // SPEC 8-5 IS FAIL!
437 | // - SPEC 8-5 (5%): If you are on a lower obstacle (2'b01), you cannot use jump.
438 |
--------------------------------------------------------------------------------
/src/00_TESTBED/TESTBED.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 11/20/2023 05:03:25 PM
5 | // Module Name: TESTBED
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: MAC32_top.v
10 | // PATTERN.v
11 | //
12 | //////////////////////////////////////////////////////////////////////////////////
13 | // Description: Testbed of the MAC32_top module; acts as a breadboard
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 |
20 | `timescale 1ns/10ps
21 |
22 | `include "PATTERN.v"
23 | `ifdef RTL
24 | `include "MAC.v"
25 | `endif
26 | `ifdef GATE
27 | `include "MAC_SYN.v"
28 | `endif
29 |
30 | module TESTBED;
31 |
32 | // parameter declaration
33 | parameter PARM_RM = 3;
34 | parameter PARM_XLEN = 32;
35 | parameter PARM_RM_RNE = 3'b000;
36 | parameter PARM_RM_RTZ = 3'b001;
37 | parameter PARM_RM_RDN = 3'b010;
38 | parameter PARM_RM_RUP = 3'b011;
39 | parameter PARM_RM_RMM = 3'b100;
40 |
41 | // interconnect wire declarations
42 | wire clk, rst_n;
43 |
44 | wire [PARM_RM - 1 : 0] Rounding_mode_wire;
45 | wire [PARM_XLEN - 1 : 0] A_wire;
46 | wire [PARM_XLEN - 1 : 0] B_wire;
47 | wire [PARM_XLEN - 1 : 0] C_wire;
48 |
49 | wire [PARM_XLEN - 1 : 0] Result_wire;
50 | wire NV_wire;
51 | wire OF_wire;
52 | wire UF_wire;
53 | wire NX_wire;
54 |
55 | initial begin
56 | `ifdef RTL
57 | $fsdbDumpfile("MAC.fsdb");
58 | $fsdbDumpvars(0,"+mda");
59 | `endif
60 | `ifdef GATE
61 | $sdf_annotate("MAC_SYN.sdf", u_MAC32_top);
62 | $fsdbDumpfile("MAC_SYN.fsdb");
63 | $fsdbDumpvars(0,"+mda");
64 | `endif
65 | end
66 |
67 | PATTERN #(
68 | .PARM_RM (PARM_RM),
69 | .PARM_XLEN(PARM_XLEN),
70 | .PARM_RM_RNE(PARM_RM_RNE),
71 | .PARM_RM_RTZ(PARM_RM_RTZ),
72 | .PARM_RM_RDN(PARM_RM_RDN),
73 | .PARM_RM_RUP(PARM_RM_RUP),
74 | .PARM_RM_RMM(PARM_RM_RMM)
75 | )u_PATTERN(
76 | .clk(clk),
77 | .rst_n(rst_n),
78 |
79 | .Rounding_mode_o(Rounding_mode_wire),
80 | .A_o(A_wire),
81 | .B_o(B_wire),
82 | .C_o(C_wire),
83 |
84 | .Result_i(Result_wire),
85 |
86 | //Accrued exceptions (fflags)
87 | .NV_i(NV_wire),
88 | .OF_i(OF_wire),
89 | .UF_i(UF_wire),
90 | .NX_i(NX_wire)
91 | );
92 |
93 | MAC32_top #(
94 | .PARM_RM (PARM_RM),
95 | .PARM_XLEN(PARM_XLEN),
96 | .PARM_RM_RNE(PARM_RM_RNE),
97 | .PARM_RM_RTZ(PARM_RM_RTZ),
98 | .PARM_RM_RDN(PARM_RM_RDN),
99 | .PARM_RM_RUP(PARM_RM_RUP),
100 | .PARM_RM_RMM(PARM_RM_RMM)
101 | ) u_MAC32_top (
102 | //input clk_i,
103 | //input rst_i,
104 | //input stall_i,
105 | //input req_i,
106 |
107 | .Rounding_mode_i(Rounding_mode_wire),
108 | .A_i(A_wire),
109 | .B_i(B_wire),
110 | .C_i(C_wire),
111 |
112 | .Result_o(Result_wire), // T (result_o) = A + (B * C)
113 | //output ready_o,
114 |
115 | //Accrued exceptions (fflags)
116 | .NV_o(NV_wire),
117 | .OF_o(OF_wire),
118 | .UF_o(UF_wire),
119 | .NX_o(NX_wire)
120 | );
121 |
122 | endmodule
--------------------------------------------------------------------------------
/src/00_TESTBED/TESTBED_sample.v:
--------------------------------------------------------------------------------
1 | /**************************************************************************/
2 | // Copyright (c) 2023, OASIS Lab
3 | // MODULE: TESTBED
4 | // FILE NAME: TESTBED.v
5 | // VERSION: 1.0
6 | // DATE: Feb 8, 2023
7 | // AUTHOR: Kuan-Wei Chen, NYCU IEE
8 | // CODE TYPE: RTL or Behavioral Level (Verilog)
9 | // DESCRIPTION: 2023 Spring IC Lab / Exercise Lab03 / SUBWAY
10 | // MODIFICATION HISTORY:
11 | // Date Description
12 | //
13 | /**************************************************************************/
14 | `timescale 1ns/10ps
15 |
16 | `include "PATTERN.v"
17 | `ifdef RTL
18 | `include "SUBWAY.v"
19 | `endif
20 | `ifdef GATE
21 | `include "SUBWAY_SYN.v"
22 | `endif
23 |
24 | module TESTBED;
25 |
26 | wire clk, rst_n, in_valid;
27 | wire [1:0] init;
28 | wire [1:0] in0, in1, in2, in3;
29 | wire out_valid;
30 | wire [1:0] out;
31 |
32 | initial begin
33 | `ifdef RTL
34 | $fsdbDumpfile("SUBWAY.fsdb");
35 | $fsdbDumpvars(0,"+mda");
36 | `endif
37 | `ifdef GATE
38 | $sdf_annotate("SUBWAY_SYN.sdf", u_SUBWAY);
39 | $fsdbDumpfile("SUBWAY_SYN.fsdb");
40 | $fsdbDumpvars(0,"+mda");
41 | `endif
42 | end
43 |
44 | SUBWAY u_SUBWAY(
45 | .clk(clk),
46 | .rst_n(rst_n),
47 | .in_valid(in_valid),
48 | .init(init),
49 | .in0(in0),
50 | .in1(in1),
51 | .in2(in2),
52 | .in3(in3),
53 | .out_valid(out_valid),
54 | .out(out)
55 | );
56 |
57 | PATTERN u_PATTERN(
58 | .clk(clk),
59 | .rst_n(rst_n),
60 | .in_valid(in_valid),
61 | .init(init),
62 | .in0(in0),
63 | .in1(in1),
64 | .in2(in2),
65 | .in3(in3),
66 | .out_valid(out_valid),
67 | .out(out)
68 | );
69 |
70 | endmodule
71 |
--------------------------------------------------------------------------------
/src/01_RTL/01_run:
--------------------------------------------------------------------------------
1 | irun TESTBED.v -define RTL -define FUNC -debug -incdir /usr/synthesis/dw/sim_ver/ ./1_RTL/ -notimingchecks -loadpli1 debpli:novas_pli_boot
--------------------------------------------------------------------------------
/src/01_RTL/Compressor32.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/25/2022 10:34:02 AM
5 | // Module Name: Compressor32
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: FullAdder.v
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: This is a 3:2 compressor, a.k.a. a carry-save adder.
13 | //
14 | //////////////////////////////////////////////////////////////////////////////////
15 | // Revision:
16 | // 09/12/2022 - Add BSD-3-Clause Licence
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 | // License information:
20 | //
21 | // This software is released under the BSD-3-Clause Licence,
22 | // see https://opensource.org/licenses/BSD-3-Clause for details.
23 | // In the following license statements, "software" refers to the
24 | // "source code" of the complete hardware/software system.
25 | //
26 | // Copyright 2022,
27 | // Embedded Intelligent Systems Lab (EISL)
28 | // Department of Computer Science
29 | // National Yang Ming Chiao Tung University
30 | // Hsinchu, Taiwan.
31 | //
32 | // All rights reserved.
33 | //
34 | // Redistribution and use in source and binary forms, with or without
35 | // modification, are permitted provided that the following conditions are met:
36 | //
37 | // 1. Redistributions of source code must retain the above copyright notice,
38 | // this list of conditions and the following disclaimer.
39 | //
40 | // 2. Redistributions in binary form must reproduce the above copyright notice,
41 | // this list of conditions and the following disclaimer in the documentation
42 | // and/or other materials provided with the distribution.
43 | //
44 | // 3. Neither the name of the copyright holder nor the names of its contributors
45 | // may be used to endorse or promote products derived from this software
46 | // without specific prior written permission.
47 | //
48 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
49 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
50 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
51 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
52 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
53 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
54 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
55 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
56 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
57 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
58 | // POSSIBILITY OF SUCH DAMAGE.
59 | //////////////////////////////////////////////////////////////////////////////////
60 |
61 |
62 |
63 | module Compressor32 #(
64 | parameter XLEN = 49
65 | ) (
66 | input [XLEN - 1 : 0] A_i,
67 | input [XLEN - 1 : 0] B_i,
68 | input [XLEN - 1 : 0] C_i,
69 | output [XLEN - 1 : 0] Sum_o,
70 | output [XLEN - 1 : 0] Carry_o
71 | );
72 |
73 | generate
74 | genvar j;
75 | for(j = 0; j < XLEN; j = j+1)begin
76 | FullAdder FA(
77 | .augend_i(A_i[j]),
78 | .addend_i(B_i[j]),
79 | .carry_i(C_i[j]),
80 | .sum_o(Sum_o[j]),
81 | .carry_o(Carry_o[j])
82 | );
83 |
84 | end
85 | endgenerate
86 |
87 | endmodule
88 |
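89 | // ---------------------------------------------------------------------------
90 | // Illustrative self-check (added for documentation; not part of the original
91 | // design). A minimal sketch of the carry-save invariant the description
92 | // refers to: A + B + C == Sum + 2*Carry. The macro COMPRESSOR32_SELFCHECK,
93 | // the module name, and the 8-bit width W are assumptions chosen for the sketch.
94 | // ---------------------------------------------------------------------------
95 | `ifdef COMPRESSOR32_SELFCHECK
96 | module Compressor32_selfcheck; // hypothetical name
97 |     localparam W = 8;          // narrow width, chosen only for this sketch
98 |     reg  [W-1:0] a, b, c;
99 |     wire [W-1:0] s, cy;
100 |     integer k, errors;
101 |
102 |     Compressor32 #(.XLEN(W)) dut(
103 |         .A_i(a), .B_i(b), .C_i(c),
104 |         .Sum_o(s), .Carry_o(cy)
105 |     );
106 |
107 |     initial begin
108 |         errors = 0;
109 |         for (k = 0; k < 100; k = k + 1) begin
110 |             a = $random; b = $random; c = $random; // truncated to W bits
111 |             #1;
112 |             // Both sides are W+2 bits wide, so no carry is lost.
113 |             if (({2'b00, a} + {2'b00, b} + {2'b00, c}) !==
114 |                 ({2'b00, s} + {1'b0, cy, 1'b0})) begin
115 |                 errors = errors + 1;
116 |                 $display("Compressor32 mismatch: a=%h b=%h c=%h", a, b, c);
117 |             end
118 |         end
119 |         $display("Compressor32 self-check done, errors = %0d", errors);
120 |         $finish;
121 |     end
122 | endmodule
123 | `endif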
--------------------------------------------------------------------------------
/src/01_RTL/Compressor42.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/25/2022 10:34:02 AM
5 | // Module Name: Compressor42
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: Compressor32.v
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Takes 4 partial sums as input and outputs a sum and a carry.
13 | // One possible implementation uses two 3:2 compressors;
14 | // it could also be mapped to a more efficient design.
15 | //
16 | //////////////////////////////////////////////////////////////////////////////////
17 | // Revision:
18 | // 07/25/2022 - Compressor32 down32 module should take top_carry shift one bit to the left as input
19 | // 07/25/2022 - hidden_carry_msb wire added, collect overflow bits to suppress sign extension
20 | // 09/12/2022 - Add BSD-3-Clause Licence
21 | //
22 | //////////////////////////////////////////////////////////////////////////////////
23 | // License information:
24 | //
25 | // This software is released under the BSD-3-Clause Licence,
26 | // see https://opensource.org/licenses/BSD-3-Clause for details.
27 | // In the following license statements, "software" refers to the
28 | // "source code" of the complete hardware/software system.
29 | //
30 | // Copyright 2022,
31 | // Embedded Intelligent Systems Lab (EISL)
32 | // Department of Computer Science
33 | // National Yang Ming Chiao Tung University
34 | // Hsinchu, Taiwan.
35 | //
36 | // All rights reserved.
37 | //
38 | // Redistribution and use in source and binary forms, with or without
39 | // modification, are permitted provided that the following conditions are met:
40 | //
41 | // 1. Redistributions of source code must retain the above copyright notice,
42 | // this list of conditions and the following disclaimer.
43 | //
44 | // 2. Redistributions in binary form must reproduce the above copyright notice,
45 | // this list of conditions and the following disclaimer in the documentation
46 | // and/or other materials provided with the distribution.
47 | //
48 | // 3. Neither the name of the copyright holder nor the names of its contributors
49 | // may be used to endorse or promote products derived from this software
50 | // without specific prior written permission.
51 | //
52 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
53 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
54 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
55 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
56 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
57 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
58 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
59 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
60 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
61 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
62 | // POSSIBILITY OF SUCH DAMAGE.
63 | //////////////////////////////////////////////////////////////////////////////////
64 |
65 |
66 | module Compressor42 #(
67 | parameter XLEN = 49
68 | ) (
69 | input [XLEN - 1 : 0] A_i,
70 | input [XLEN - 1 : 0] B_i,
71 | input [XLEN - 1 : 0] C_i,
72 | input [XLEN - 1 : 0] D_i,
73 | output [XLEN - 1 : 0] Sum_o,
74 | output [XLEN - 1 : 0] Carry_o,
75 | output hidden_carry_msb);
76 |
77 | wire [XLEN - 1: 0] top_sum;
78 | wire [XLEN - 1: 0] top_carry;
79 |
80 | Compressor32 #(.XLEN(XLEN)) top32( // forward XLEN so non-default widths stay consistent
81 | .A_i(A_i),
82 | .B_i(B_i),
83 | .C_i(C_i),
84 | .Sum_o(top_sum),
85 | .Carry_o(top_carry)
86 | );
87 |
88 | Compressor32 #(.XLEN(XLEN)) down32( // forward XLEN so non-default widths stay consistent
89 | .A_i(top_sum),
90 | .B_i({top_carry<<1}),
91 | .C_i(D_i),
92 | .Sum_o(Sum_o),
93 | .Carry_o(Carry_o)
94 | );
95 |
96 | assign hidden_carry_msb = top_carry[XLEN - 1];
97 |
98 | endmodule
99 |
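100 | // ---------------------------------------------------------------------------
101 | // Illustrative self-check (added for documentation; not part of the original
102 | // design). A minimal sketch of the invariant implied by the two chained 3:2
103 | // compressors and the hidden_carry_msb output:
104 | //   A + B + C + D == Sum + 2*Carry + hidden_carry_msb * 2^XLEN
105 | // The macro COMPRESSOR42_SELFCHECK and the module name are assumptions. The
106 | // default XLEN (49) is used so the sketch matches the module as written.
107 | // ---------------------------------------------------------------------------
108 | `ifdef COMPRESSOR42_SELFCHECK
109 | module Compressor42_selfcheck; // hypothetical name
110 |     localparam W = 49;         // default XLEN of Compressor42
111 |     reg  [W-1:0] a, b, c, d;
112 |     wire [W-1:0] s, cy;
113 |     wire         h;
114 |     integer k, errors;
115 |
116 |     Compressor42 dut(
117 |         .A_i(a), .B_i(b), .C_i(c), .D_i(d),
118 |         .Sum_o(s), .Carry_o(cy), .hidden_carry_msb(h)
119 |     );
120 |
121 |     initial begin
122 |         errors = 0;
123 |         for (k = 0; k < 100; k = k + 1) begin
124 |             // Two 32-bit draws are concatenated, then truncated to W bits.
125 |             a = {$random, $random}; b = {$random, $random};
126 |             c = {$random, $random}; d = {$random, $random};
127 |             #1;
128 |             // Both sides are W+2 bits wide; the hidden carry has weight 2^W.
129 |             if (({2'b00, a} + {2'b00, b} + {2'b00, c} + {2'b00, d}) !==
130 |                 ({2'b00, s} + {1'b0, cy, 1'b0} + {1'b0, h, {W{1'b0}}})) begin
131 |                 errors = errors + 1;
132 |                 $display("Compressor42 mismatch: a=%h b=%h c=%h d=%h", a, b, c, d);
133 |             end
134 |         end
135 |         $display("Compressor42 self-check done, errors = %0d", errors);
136 |         $finish;
137 |     end
138 | endmodule
139 | `endif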
--------------------------------------------------------------------------------
/src/01_RTL/EACAdder.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/29/2022 10:40:06 AM
5 | // Module Name: EACAdder
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: An adder that outputs a positive-magnitude result and preferably
13 | // only needs to conditionally complement one operand
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | // 07/29/2022 - I/O port names renamed with correct suffix
18 | // 08/06/2022 - Add A_Zero_i signal to detect that A is -0, in order to avoid a false end-around carry
19 | // 09/12/2022 - Add BSD-3-Clause Licence
20 | //
21 | //////////////////////////////////////////////////////////////////////////////////
22 | // License information:
23 | //
24 | // This software is released under the BSD-3-Clause Licence,
25 | // see https://opensource.org/licenses/BSD-3-Clause for details.
26 | // In the following license statements, "software" refers to the
27 | // "source code" of the complete hardware/software system.
28 | //
29 | // Copyright 2022,
30 | // Embedded Intelligent Systems Lab (EISL)
31 | // Department of Computer Science
32 | // National Yang Ming Chiao Tung University
33 | // Hsinchu, Taiwan.
34 | //
35 | // All rights reserved.
36 | //
37 | // Redistribution and use in source and binary forms, with or without
38 | // modification, are permitted provided that the following conditions are met:
39 | //
40 | // 1. Redistributions of source code must retain the above copyright notice,
41 | // this list of conditions and the following disclaimer.
42 | //
43 | // 2. Redistributions in binary form must reproduce the above copyright notice,
44 | // this list of conditions and the following disclaimer in the documentation
45 | // and/or other materials provided with the distribution.
46 | //
47 | // 3. Neither the name of the copyright holder nor the names of its contributors
48 | // may be used to endorse or promote products derived from this software
49 | // without specific prior written permission.
50 | //
51 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
52 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
53 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
54 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
55 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
56 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
57 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
58 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
59 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
60 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
61 | // POSSIBILITY OF SUCH DAMAGE.
62 | //////////////////////////////////////////////////////////////////////////////////
63 |
64 |
65 | module EACAdder #(
66 | parameter PARM_MANT = 23
67 | ) (
68 | input [2*PARM_MANT + 1 : 0] CSA_sum_i,
69 | input [2*PARM_MANT + 1 : 0] CSA_carry_i,
70 | input Carry_postcor_i,
71 | input Sub_Sign_i,
72 | input A_Zero_i,
73 |
74 | output [2*PARM_MANT + 1 : 0] low_sum_o,
75 | output low_carry_o,
76 | output [2*PARM_MANT + 1 : 0] low_sum_inv_o,
77 | output low_carry_inv_o);
78 |
79 | wire end_round_carry = Sub_Sign_i & (~A_Zero_i);
80 | assign {low_carry_o, low_sum_o} = CSA_sum_i + {Carry_postcor_i, CSA_carry_i[2*PARM_MANT : 0], end_round_carry};
81 | assign {low_carry_inv_o, low_sum_inv_o} = 2'b10 + {1'b1, ~CSA_sum_i} + {~Carry_postcor_i, ~CSA_carry_i[2*PARM_MANT : 0], ~end_round_carry};
82 |
83 | endmodule
84 |
--------------------------------------------------------------------------------
/src/01_RTL/FullAdder.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/22/2022 10:13:32 AM
5 | // Module Name: FullAdder
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: A full adder module with 3 inputs and 2 outputs
13 | //
14 | //////////////////////////////////////////////////////////////////////////////////
15 | // Revision:
16 | // 07/25/2022 - Output ports naming inconsistent with definition, bug fixed
17 | // 09/12/2022 - Add BSD-3-Clause Licence
18 | //
19 | //////////////////////////////////////////////////////////////////////////////////
20 | // License information:
21 | //
22 | // This software is released under the BSD-3-Clause Licence,
23 | // see https://opensource.org/licenses/BSD-3-Clause for details.
24 | // In the following license statements, "software" refers to the
25 | // "source code" of the complete hardware/software system.
26 | //
27 | // Copyright 2022,
28 | // Embedded Intelligent Systems Lab (EISL)
29 | // Department of Computer Science
30 | // National Yang Ming Chiao Tung University
31 | // Hsinchu, Taiwan.
32 | //
33 | // All rights reserved.
34 | //
35 | // Redistribution and use in source and binary forms, with or without
36 | // modification, are permitted provided that the following conditions are met:
37 | //
38 | // 1. Redistributions of source code must retain the above copyright notice,
39 | // this list of conditions and the following disclaimer.
40 | //
41 | // 2. Redistributions in binary form must reproduce the above copyright notice,
42 | // this list of conditions and the following disclaimer in the documentation
43 | // and/or other materials provided with the distribution.
44 | //
45 | // 3. Neither the name of the copyright holder nor the names of its contributors
46 | // may be used to endorse or promote products derived from this software
47 | // without specific prior written permission.
48 | //
49 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
50 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
51 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
52 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
53 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
54 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
55 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
56 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
57 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
58 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
59 | // POSSIBILITY OF SUCH DAMAGE.
60 | //////////////////////////////////////////////////////////////////////////////////
61 |
62 |
63 | module FullAdder(
64 | input augend_i,
65 | input addend_i,
66 | input carry_i,
67 | output sum_o,
68 | output carry_o);
69 |
70 | assign sum_o = augend_i ^ addend_i ^ carry_i;
71 | assign carry_o = (augend_i & addend_i) | (addend_i & carry_i) | (carry_i & augend_i);
72 |
73 | endmodule
74 |
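75 | // ---------------------------------------------------------------------------
76 | // Illustrative self-check (added for documentation; not part of the original
77 | // design). A minimal sketch that applies all 8 input combinations and checks
78 | // that {carry_o, sum_o} equals the 2-bit arithmetic sum of the three inputs.
79 | // The macro FULLADDER_SELFCHECK and the module name are assumptions.
80 | // ---------------------------------------------------------------------------
81 | `ifdef FULLADDER_SELFCHECK
82 | module FullAdder_selfcheck; // hypothetical name
83 |     reg  a, b, c;
84 |     wire s, co;
85 |     integer k, errors;
86 |
87 |     FullAdder dut(
88 |         .augend_i(a),
89 |         .addend_i(b),
90 |         .carry_i(c),
91 |         .sum_o(s),
92 |         .carry_o(co)
93 |     );
94 |
95 |     initial begin
96 |         errors = 0;
97 |         for (k = 0; k < 8; k = k + 1) begin
98 |             {a, b, c} = k[2:0];
99 |             #1;
100 |             // Zero-extend each input to 2 bits so the sum (0..3) fits.
101 |             if ({co, s} !== ({1'b0, a} + {1'b0, b} + {1'b0, c})) begin
102 |                 errors = errors + 1;
103 |                 $display("FullAdder mismatch: a=%b b=%b c=%b", a, b, c);
104 |             end
105 |         end
106 |         $display("FullAdder self-check done, errors = %0d", errors);
107 |         $finish;
108 |     end
109 | endmodule
110 | `endif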
--------------------------------------------------------------------------------
/src/01_RTL/LeadingOneDetector_Top.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/29/2022 11:01:00 PM
5 | // Module Name: LeadingOneDetector_Top
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: ZeroDetector_Base.v
10 | // ZeroDetector_Group.v
11 | //
12 | //////////////////////////////////////////////////////////////////////////////////
13 | // Description: Detects the shift amount needed to bring the leading one to the MSB
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | // 07/29/2022 - Mux simplification, combine one else if clause into else clause
18 | // 09/12/2022 - Add BSD-3-Clause Licence
19 | //
20 | //////////////////////////////////////////////////////////////////////////////////
21 | // License information:
22 | //
23 | // This software is released under the BSD-3-Clause Licence,
24 | // see https://opensource.org/licenses/BSD-3-Clause for details.
25 | // In the following license statements, "software" refers to the
26 | // "source code" of the complete hardware/software system.
27 | //
28 | // Copyright 2022,
29 | // Embedded Intelligent Systems Lab (EISL)
30 | // Department of Computer Science
31 | // National Yang Ming Chiao Tung University
32 | // Hsinchu, Taiwan.
33 | //
34 | // All rights reserved.
35 | //
36 | // Redistribution and use in source and binary forms, with or without
37 | // modification, are permitted provided that the following conditions are met:
38 | //
39 | // 1. Redistributions of source code must retain the above copyright notice,
40 | // this list of conditions and the following disclaimer.
41 | //
42 | // 2. Redistributions in binary form must reproduce the above copyright notice,
43 | // this list of conditions and the following disclaimer in the documentation
44 | // and/or other materials provided with the distribution.
45 | //
46 | // 3. Neither the name of the copyright holder nor the names of its contributors
47 | // may be used to endorse or promote products derived from this software
48 | // without specific prior written permission.
49 | //
50 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
51 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
52 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
53 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
54 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
55 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
56 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
57 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
58 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
59 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
60 | // POSSIBILITY OF SUCH DAMAGE.
61 | //////////////////////////////////////////////////////////////////////////////////
62 |
63 |
64 | module LeadingOneDetector_Top #(
65 | parameter X_LEN = 74,
66 | parameter PARM_SHIFTZERO = $clog2(X_LEN)
67 | ) (
68 | input [X_LEN - 1 : 0] data_i,
69 |
70 | output reg [PARM_SHIFTZERO - 1 : 0] shift_num_o,
71 | output allzero_o );
72 |
73 |
74 | wire [7:0] base_zeros;
75 | generate
76 | genvar i;
77 | for(i = 0; i < 8; i = i+1)begin
78 | ZeroDetector_Base #(8) lzd_base(
79 | .base_data_i(data_i[(72 - i*8) -: 8]),
80 | .zero_o(base_zeros[i])
81 | );
82 | end
83 | endgenerate
84 |
85 | wire [3:0] lv1_zeros;
86 | generate
87 | genvar j;
88 | for (j = 0; j < 4; j = j+1) begin
89 | ZeroDetector_Group #(2) lzd_grouplv1(
90 | .group_data_i(base_zeros[j*2 +:2]),
91 | .group_zero_o(lv1_zeros[j])
92 | );
93 | end
94 | endgenerate
95 |
96 | wire [1:0] lv2_zeros;
97 | ZeroDetector_Group #(2) lzd_grouplv2_0(
98 | .group_data_i(lv1_zeros[1:0]),
99 | .group_zero_o(lv2_zeros[0])
100 | );
101 |
102 | ZeroDetector_Group #(2) lzd_grouplv2_1(
103 | .group_data_i(lv1_zeros[3:2]),
104 | .group_zero_o(lv2_zeros[1])
105 | );
106 |
107 | wire lv3_zeros;
108 | ZeroDetector_Group #(2) lzd_grouplv3(
109 | .group_data_i(lv2_zeros),
110 | .group_zero_o(lv3_zeros)
111 | );
112 |
113 | wire left_zero = (data_i[8:0] == 9'd0);
114 |
115 |
116 | //output logic
117 | assign allzero_o = lv3_zeros & left_zero;
118 |
119 | always @(*) begin
120 | if(lv3_zeros)begin
121 | if(data_i[8]) shift_num_o = 64;
122 | else if(data_i[7]) shift_num_o = 65;
123 | else if(data_i[6]) shift_num_o = 66;
124 | else if(data_i[5]) shift_num_o = 67;
125 | else if(data_i[4]) shift_num_o = 68;
126 | else if(data_i[3]) shift_num_o = 69;
127 | else if(data_i[2]) shift_num_o = 70;
128 | else if(data_i[1]) shift_num_o = 71;
129 | else shift_num_o = 72; //when all zero or data_i[0]
130 |
131 | end
132 | else begin //1 appears in 72 : 9
133 | if(lv2_zeros[0])begin // 1 appears in 40 : 9
134 | if(lv1_zeros[2])begin // 1 appears in 24 : 9
135 | if(base_zeros[6])begin // 1 appears in 16 : 9
136 |
137 | if(data_i[16]) shift_num_o = 56;
138 | else if(data_i[15]) shift_num_o = 57;
139 | else if(data_i[14]) shift_num_o = 58;
140 | else if(data_i[13]) shift_num_o = 59;
141 | else if(data_i[12]) shift_num_o = 60;
142 | else if(data_i[11]) shift_num_o = 61;
143 | else if(data_i[10]) shift_num_o = 62;
144 | else shift_num_o = 63; //data_i[9]
145 | end
146 | else begin // 1 appears in 24 : 17
147 |
148 | if(data_i[24]) shift_num_o = 48;
149 | else if(data_i[23]) shift_num_o = 49;
150 | else if(data_i[22]) shift_num_o = 50;
151 | else if(data_i[21]) shift_num_o = 51;
152 | else if(data_i[20]) shift_num_o = 52;
153 | else if(data_i[19]) shift_num_o = 53;
154 | else if(data_i[18]) shift_num_o = 54;
155 | else shift_num_o = 55; // data_i[17]
156 | end
157 | end
158 | else begin // 1 appears in 40 : 25
159 | if(base_zeros[4])begin // 1 appears in 32 : 25
160 |
161 | if(data_i[32]) shift_num_o = 40;
162 | else if(data_i[31]) shift_num_o = 41;
163 | else if(data_i[30]) shift_num_o = 42;
164 | else if(data_i[29]) shift_num_o = 43;
165 | else if(data_i[28]) shift_num_o = 44;
166 | else if(data_i[27]) shift_num_o = 45;
167 | else if(data_i[26]) shift_num_o = 46;
168 | else shift_num_o = 47; //data_i[25]
169 | end
170 | else begin // 1 appears in 40 : 33
171 |
172 | if(data_i[40]) shift_num_o = 32;
173 | else if(data_i[39]) shift_num_o = 33;
174 | else if(data_i[38]) shift_num_o = 34;
175 | else if(data_i[37]) shift_num_o = 35;
176 | else if(data_i[36]) shift_num_o = 36;
177 | else if(data_i[35]) shift_num_o = 37;
178 | else if(data_i[34]) shift_num_o = 38;
179 | else shift_num_o = 39; // data_i[33]
180 | end
181 | end
182 | end
183 | else begin //1 in 72 : 41
184 | if(lv1_zeros[0])begin //1 appears in 56 : 41
185 | if(base_zeros[2])begin // 1 appears in 48 : 41
186 |
187 | if(data_i[48]) shift_num_o = 24;
188 | else if(data_i[47]) shift_num_o = 25;
189 | else if(data_i[46]) shift_num_o = 26;
190 | else if(data_i[45]) shift_num_o = 27;
191 | else if(data_i[44]) shift_num_o = 28;
192 | else if(data_i[43]) shift_num_o = 29;
193 | else if(data_i[42]) shift_num_o = 30;
194 | else shift_num_o = 31; // data_i[41]
195 | end
196 | else begin // 1 appears in 56 : 49
197 |
198 | if(data_i[56]) shift_num_o = 16;
199 | else if(data_i[55]) shift_num_o = 17;
200 | else if(data_i[54]) shift_num_o = 18;
201 | else if(data_i[53]) shift_num_o = 19;
202 | else if(data_i[52]) shift_num_o = 20;
203 | else if(data_i[51]) shift_num_o = 21;
204 | else if(data_i[50]) shift_num_o = 22;
205 | else shift_num_o = 23; // data_i[49]
206 | end
207 |
208 | end
209 | else begin // 1 appears in 72 : 57
210 | if(base_zeros[0])begin // 1 appears in 64 : 57
211 |
212 | if(data_i[64]) shift_num_o = 8;
213 | else if(data_i[63]) shift_num_o = 9;
214 | else if(data_i[62]) shift_num_o = 10;
215 | else if(data_i[61]) shift_num_o = 11;
216 | else if(data_i[60]) shift_num_o = 12;
217 | else if(data_i[59]) shift_num_o = 13;
218 | else if(data_i[58]) shift_num_o = 14;
219 | else shift_num_o = 15; // data_i[57]
220 | end
221 | else begin // 1 appears in 72 : 65
222 |
223 | if(data_i[72]) shift_num_o = 0;
224 | else if(data_i[71]) shift_num_o = 1;
225 | else if(data_i[70]) shift_num_o = 2;
226 | else if(data_i[69]) shift_num_o = 3;
227 | else if(data_i[68]) shift_num_o = 4;
228 | else if(data_i[67]) shift_num_o = 5;
229 | else if(data_i[66]) shift_num_o = 6;
230 | else shift_num_o = 7; // data_i[65]
231 |
232 | end
233 | end
234 | end
235 | end
236 | end
237 |
238 |
239 | endmodule
240 |
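The if/else tree in LeadingOneDetector_Top.v can be cross-checked against a much simpler behavioral model. The sketch below is not part of the repository: the module name `lod_ref_sketch` and the function `ref_shift_num` are hypothetical. It assumes that ZeroDetector_Base flags an all-zero 8-bit slice and that ZeroDetector_Group flags when all of its inputs are set (neither module is shown in this file).

```verilog
// Hypothetical reference model; names are illustrative, not from the repository.
module lod_ref_sketch;

    // Count the zero bits from bit 72 downward until the first set bit.
    // Bit 73 is never examined, and the result saturates at 72, which covers
    // both "only bit 0 set" and "all zero", as in the if/else tree above.
    function [6:0] ref_shift_num;
        input [73:0] data;
        integer k;
        reg found;
        begin
            ref_shift_num = 7'd72;
            found = 1'b0;
            for (k = 72; k >= 1; k = k - 1) begin
                if (!found && data[k]) begin
                    ref_shift_num = 72 - k;
                    found = 1'b1;
                end
            end
        end
    endfunction

    reg [73:0] vec;

    initial begin
        vec = 74'd1 << 72; $display("bit 72 set -> shift %0d", ref_shift_num(vec));
        vec = 74'd1 << 9;  $display("bit  9 set -> shift %0d", ref_shift_num(vec));
        vec = 74'd0;       $display("all zero   -> shift %0d", ref_shift_num(vec));
        $finish;
    end

endmodule
```

In a testbench, the same function could be compared against `shift_num_o` for random `data_i` values. Because bit 73 is outside the detector's range, the caller has to handle it separately.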
--------------------------------------------------------------------------------
/src/01_RTL/MAC32_top.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/21/2022 03:34:32 PM
5 | // Module Name: MAC32_top
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: SpecialCaseDetector.v
10 | // R4Booth.v
11 | // WallaceTree.v
12 | // PreNormalizer.v
13 | // Compressor32.v
14 | // EACAdder.v
15 | // MSBIncrementer.v
16 | // LeadingOneDetector_Top.v
17 | // Normalizer.v
18 | // Rounder.v
19 | //////////////////////////////////////////////////////////////////////////////////
20 | // Description:
21 | //
22 | //////////////////////////////////////////////////////////////////////////////////
23 | // Revision:
24 | // 08/12/2022 - Update mv_halt signal, now zero is viewed as the smallest denormalized number.
25 | // 08/14/2022 - Stable non-pipelined build (v1.0)
26 | // 08/15/2022 - R4Booth and Wallace Tree update
27 | // 08/16/2022 - Instantiation name start with UpperCase
28 | // 09/12/2022 - Add BSD-3-Clause Licence
29 | //
30 | //////////////////////////////////////////////////////////////////////////////////
31 | // License information:
32 | //
33 | // This software is released under the BSD-3-Clause Licence,
34 | // see https://opensource.org/licenses/BSD-3-Clause for details.
35 | // In the following license statements, "software" refers to the
36 | // "source code" of the complete hardware/software system.
37 | //
38 | // Copyright 2022,
39 | // Embedded Intelligent Systems Lab (EISL)
40 | // Department of Computer Science
41 | // National Yang Ming Chiao Tung University
42 | // Hsinchu, Taiwan.
43 | //
44 | // All rights reserved.
45 | //
46 | // Redistribution and use in source and binary forms, with or without
47 | // modification, are permitted provided that the following conditions are met:
48 | //
49 | // 1. Redistributions of source code must retain the above copyright notice,
50 | // this list of conditions and the following disclaimer.
51 | //
52 | // 2. Redistributions in binary form must reproduce the above copyright notice,
53 | // this list of conditions and the following disclaimer in the documentation
54 | // and/or other materials provided with the distribution.
55 | //
56 | // 3. Neither the name of the copyright holder nor the names of its contributors
57 | // may be used to endorse or promote products derived from this software
58 | // without specific prior written permission.
59 | //
60 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
61 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
62 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
63 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
64 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
65 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
66 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
67 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
68 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
69 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
70 | // POSSIBILITY OF SUCH DAMAGE.
71 | //////////////////////////////////////////////////////////////////////////////////
72 | // Additional Comments:
73 | //
74 | // Floating-point control and status register:
75 | // |31 8|7 5|4 0|
76 | // |reserved| Rounding Mode (frm) | Accrued Exceptions (fflags) |
77 | // NV DZ OF UF NX
78 | //
79 | // Rounding mode encoding:
80 | // Rounding Mode| Mnemonic | Meaning
81 | // -------------------------------------------------------------------------------------------
82 | // 000 | RNE | Round to Nearest, ties to Even
83 | // 001 | RTZ | Round towards Zero
84 | // 010 | RDN | Round Down (towards -INFINITY)
85 | // 011 | RUP | Round UP (towards +INFINITY)
86 | // 100 | RMM | Round to Nearest, ties Max Magnitude
87 | // 101 | --- | Invalid. Reserved for future use
88 | // 110 | --- | Invalid. Reserved for future use
89 | // 111 | DYN | In instruction's rm field, selects dynamic rounding mode;
90 | // In Rounding Mode register, Invalid
91 | //
92 | // Accrued exception flag encoding:
93 | // Flag Mnemonic | Flag Meaning
94 | // --------------------------------------
95 | // NV | Invalid Operation
96 | // DZ | Divide by Zero
97 | // OF | Overflow
98 | // UF | Underflow
99 | // NX | Inexact
100 | //
101 | //////////////////////////////////////////////////////////////////////////////////
102 |
103 |
104 | module MAC32_top #(
105 | parameter PARM_RM = 3,
106 | parameter PARM_XLEN = 32,
107 | parameter PARM_RM_RNE = 3'b000,
108 | parameter PARM_RM_RTZ = 3'b001,
109 | parameter PARM_RM_RDN = 3'b010,
110 | parameter PARM_RM_RUP = 3'b011,
111 | parameter PARM_RM_RMM = 3'b100
112 | ) (
113 | //input clk_i,
114 | //input rst_i,
115 | //input stall_i,
116 | //input req_i,
117 |
118 | input [PARM_RM - 1 : 0] Rounding_mode_i,
119 |
120 | input [PARM_XLEN - 1 : 0] A_i,
121 | input [PARM_XLEN - 1 : 0] B_i,
122 | input [PARM_XLEN - 1 : 0] C_i,
123 |
124 |
125 | output [PARM_XLEN - 1 : 0] Result_o, // T (result_o) = A + (B * C)
126 | //output ready_o,
127 |
128 | //Accrued exceptions (fflags)
129 | output NV_o,
130 | //output DZ_o, //would not occur in Multiplication or Addition
131 | output OF_o,
132 | output UF_o,
133 | output NX_o );
134 |
135 |
136 | parameter PARM_EXP = 8;
137 | parameter PARM_MANT = 23;
138 | parameter PARM_BIAS = 127;
139 | parameter PARM_LEADONE_WIDTH = 7;
140 | parameter PARM_EXP_ONE = 8'h01;
141 | parameter PARM_MANT_NAN = 23'b100_0000_0000_0000_0000_0000; //RISC-V defines canonical NaN to be 0x7fc0_0000
142 |
143 |
144 | //inputs wires of specialCaseDetectors
145 | wire A_Leadingbit = | A_i[PARM_XLEN - 2 : PARM_MANT]; //normalized number has leading 1, denormalized with leading 0
146 | wire B_Leadingbit = | B_i[PARM_XLEN - 2 : PARM_MANT];
147 | wire C_Leadingbit = | C_i[PARM_XLEN - 2 : PARM_MANT];
148 | //outputs wires of specialCaseDetectors
149 | wire A_Inf, B_Inf, C_Inf;
150 | wire A_Zero, B_Zero, C_Zero;
151 | wire A_NaN, B_NaN, C_NaN;
152 | wire A_DeN, B_DeN, C_DeN;
153 |
154 |
155 | SpecialCaseDetector #(
156 | .PARM_XLEN(PARM_XLEN),
157 | .PARM_EXP(PARM_EXP),
158 | .PARM_MANT(PARM_MANT)
159 | ) SpecialCaseDetector (
160 | .A_i(A_i),
161 | .B_i(B_i),
162 | .C_i(C_i),
163 | .A_Leadingbit_i(A_Leadingbit),
164 | .B_Leadingbit_i(B_Leadingbit),
165 | .C_Leadingbit_i(C_Leadingbit),
166 |
167 | .A_Inf_o(A_Inf),
168 | .B_Inf_o(B_Inf),
169 | .C_Inf_o(C_Inf),
170 | .A_Zero_o(A_Zero),
171 | .B_Zero_o(B_Zero),
172 | .C_Zero_o(C_Zero),
173 | .A_NaN_o(A_NaN),
174 | .B_NaN_o(B_NaN),
175 | .C_NaN_o(C_NaN),
176 | .A_DeN_o(A_DeN),
177 | .B_DeN_o(B_DeN),
178 | .C_DeN_o(C_DeN)
179 | );
180 |
181 |
182 | wire A_Sign = A_i[PARM_XLEN - 1];
183 | wire B_Sign = B_i[PARM_XLEN - 1];
184 | wire C_Sign = C_i[PARM_XLEN - 1];
185 | wire Sub_Sign = A_Sign ^ B_Sign ^ C_Sign; // indicator of effective subtraction
186 |
187 | //denormalized number has exponent 1
188 | wire [PARM_EXP - 1: 0] A_Exp = A_DeN? PARM_EXP_ONE : A_i[PARM_XLEN - 2 : PARM_MANT];
189 | wire [PARM_EXP - 1: 0] B_Exp = B_DeN? PARM_EXP_ONE : B_i[PARM_XLEN - 2 : PARM_MANT];
190 | wire [PARM_EXP - 1: 0] C_Exp = C_DeN? PARM_EXP_ONE : C_i[PARM_XLEN - 2 : PARM_MANT];
191 |
192 | wire [PARM_MANT : 0] A_Mant = {A_Leadingbit, A_i[PARM_MANT - 1 : 0]};
193 | wire [PARM_MANT : 0] B_Mant = {B_Leadingbit, B_i[PARM_MANT - 1 : 0]};
194 | wire [PARM_MANT : 0] C_Mant = {C_Leadingbit, C_i[PARM_MANT - 1 : 0]};
195 |
196 | //Generate 13 partial products with the radix-4 Booth algorithm
197 | wire [2*PARM_MANT + 2 : 0] booth_PP [12 - 1: 0];
198 | wire [2*PARM_MANT + 1 : 0] booth_PP_13; //The 13th partial product's MSB is always 0, so it is one bit narrower
199 |
200 |
201 | R4Booth #(
202 | .PARM_MANT(PARM_MANT)
203 | ) R4Booth (
204 | .MantA_i(B_Mant),
205 | .MantB_i(C_Mant),
206 |
207 | .pp_00_o(booth_PP[ 0]),
208 | .pp_01_o(booth_PP[ 1]),
209 | .pp_02_o(booth_PP[ 2]),
210 | .pp_03_o(booth_PP[ 3]),
211 | .pp_04_o(booth_PP[ 4]),
212 | .pp_05_o(booth_PP[ 5]),
213 | .pp_06_o(booth_PP[ 6]),
214 | .pp_07_o(booth_PP[ 7]),
215 | .pp_08_o(booth_PP[ 8]),
216 | .pp_09_o(booth_PP[ 9]),
217 | .pp_10_o(booth_PP[10]),
218 | .pp_11_o(booth_PP[11]),
219 | .pp_12_o(booth_PP_13)
220 | );
221 |
222 |
223 | //Sum the 13 partial products with a Wallace tree
224 | wire [2*PARM_MANT + 2 : 0] Wallace_sum;
225 | wire [2*PARM_MANT + 2 : 0] Wallace_carry;
226 | wire Wallace_suppression_sign_extension;
227 |
228 |
229 | WallaceTree #(
230 | .PARM_MANT(PARM_MANT)
231 | ) WallaceTree (
232 | .pp_00_i(booth_PP[ 0]),
233 | .pp_01_i(booth_PP[ 1]),
234 | .pp_02_i(booth_PP[ 2]),
235 | .pp_03_i(booth_PP[ 3]),
236 | .pp_04_i(booth_PP[ 4]),
237 | .pp_05_i(booth_PP[ 5]),
238 | .pp_06_i(booth_PP[ 6]),
239 | .pp_07_i(booth_PP[ 7]),
240 | .pp_08_i(booth_PP[ 8]),
241 | .pp_09_i(booth_PP[ 9]),
242 | .pp_10_i(booth_PP[10]),
243 | .pp_11_i(booth_PP[11]),
244 | .pp_12_i(booth_PP_13),
245 |
246 | .wallace_sum_o(Wallace_sum),
247 | .wallace_carry_o(Wallace_carry),
248 | .suppression_sign_extension_o(Wallace_suppression_sign_extension)
249 | );
250 |
251 |
252 | //Prenormalization of the augend, in parallel with multiplication.
253 | //global signals ...
254 | wire Sign_aligned;
255 | wire Exp_mv_sign;
256 | wire Mv_halt;
257 |
258 | //Exponent Processor
259 | //d = expA - (expB + expC - bias), with bias = 127
260 | //mv = 27 - d = expB + expC - expA - 100
261 |
262 | wire [PARM_EXP + 1 : 0] Exp_mv = 27 - A_Exp + B_Exp + C_Exp - PARM_BIAS; // d = expA - (expB + expC - 127), mv = 27 - d
263 | wire [PARM_EXP + 1 : 0] Exp_mv_neg = -27 + A_Exp - B_Exp - C_Exp + PARM_BIAS;
264 |
265 | assign Exp_mv_sign = Exp_mv[PARM_EXP + 1]; // the sign bit of the mv parameter, Sign_amt_DO
266 |
267 | //Revision 2.00 - Update mv_halt signal, now zero is viewed as the smallest denormalized number.
268 | //Halt when the right shift (positive mv) is out of range, i.e. 74 or more, or when A is zero
269 | assign Mv_halt = ((~Exp_mv_sign) & (Exp_mv[PARM_EXP : 0] > 73))|| A_Zero;
270 |
271 | //signals for prenormalizer:
272 | wire SignFlip_ADD_PRN;
273 |
274 | wire [3*PARM_MANT + 5 : 0] A_Mant_aligned;
275 | wire [PARM_MANT + 3 : 0] A_Mant_aligned_high = A_Mant_aligned[3*PARM_MANT + 5 : 2*PARM_MANT + 2];
276 | wire [2*PARM_MANT + 1 : 0] A_Mant_aligned_low = A_Mant_aligned[2*PARM_MANT + 1 : 0];
277 |
278 | wire signed [PARM_EXP + 1 : 0] Exp_aligned;
279 | wire Mant_sticky_sht_out;
280 |
281 |
282 | PreNormalizer #(
283 | .PARM_EXP(PARM_EXP),
284 | .PARM_MANT(PARM_MANT),
285 | .PARM_BIAS(PARM_BIAS)
286 | ) PreNormalizer (
287 | .A_sign_i(A_Sign),
288 | .B_sign_i(B_Sign),
289 | .C_sign_i(C_Sign),
290 | .Sub_Sign_i(Sub_Sign),
291 | .A_Exp_i(A_Exp),
292 | .B_Exp_i(B_Exp),
293 | .C_Exp_i(C_Exp),
294 | .A_Mant_i(A_Mant),
295 | .Sign_flip_i(SignFlip_ADD_PRN),
296 | .Mv_halt_i(Mv_halt),
297 | .Exp_mv_i(Exp_mv),
298 | .Exp_mv_sign_i(Exp_mv_sign),
299 |
300 | .A_Mant_aligned_o(A_Mant_aligned),
301 | .Exp_aligned_o(Exp_aligned),
302 | .Sign_aligned_o(Sign_aligned),
303 | .Mant_sticky_sht_out_o(Mant_sticky_sht_out)
304 | );
305 |
306 |
307 | //adjust wallace sum to send in...
308 | wire [2*PARM_MANT + 2 : 0] Wallace_sum_adjusted;
309 | wire [2*PARM_MANT + 2 : 0] Wallace_carry_adjusted;
310 |
311 | assign Wallace_sum_adjusted = (Exp_mv_sign)? 0 : Wallace_sum;
312 | assign Wallace_carry_adjusted = (Exp_mv_sign) ? 0 : Wallace_carry;
313 |
314 | //Sums the Wallace outputs with A_Low
315 | wire [2*PARM_MANT + 1 : 0] CSA_sum;
316 | wire [2*PARM_MANT + 1 : 0] CSA_carry;
317 |
318 | Compressor32 #(
319 | .XLEN(2*PARM_MANT + 2)
320 | ) CarrySaveAdder (
321 | .A_i(A_Mant_aligned_low), //A_low
322 | .B_i(Wallace_sum_adjusted[2*PARM_MANT + 1 : 0]),
323 | .C_i({Wallace_carry_adjusted[2*PARM_MANT : 0], 1'b0}),
324 |
325 | .Sum_o(CSA_sum),
326 | .Carry_o(CSA_carry)
327 | );
328 |
329 | //The correction for sign extension is also handled in the grand adder.
330 | //output signals
331 | reg [73 : 0] PosSum;
332 | wire Minus_sticky_bit;
333 |
334 | wire Adder_sign; //global signal for Sign_out_D
335 |
336 | //End Around Carry Adders, LSBs
337 |
338 | wire wallace_msb_G = Wallace_sum_adjusted[2*PARM_MANT + 2] & Wallace_carry_adjusted[2*PARM_MANT + 1];
339 | //if Wallace's msb is 1, or will carry to 1
340 | wire adder_Correlated_sign = Wallace_suppression_sign_extension | Wallace_carry_adjusted[2*PARM_MANT + 2] | wallace_msb_G;
341 | wire Carry_postcor = (~Exp_mv_sign) & ((~adder_Correlated_sign) ^ CSA_carry[2*PARM_MANT + 1]);
342 |
343 | wire [2*PARM_MANT + 1 : 0] low_sum;
344 | wire low_carry;
345 | wire [2*PARM_MANT + 1 : 0] low_sum_inv;
346 | wire low_carry_inv;
347 |
348 |
349 | EACAdder #(
350 | .PARM_MANT(PARM_MANT)
351 | ) EACAdder (
352 | .CSA_sum_i(CSA_sum),
353 | .CSA_carry_i(CSA_carry),
354 | .Carry_postcor_i(Carry_postcor),
355 | .Sub_Sign_i(Sub_Sign),
356 | .A_Zero_i(A_Zero),//This is added to deal with false Sub_Sign_i(If a is -0)
357 |
358 | .low_sum_o(low_sum),
359 | .low_carry_o(low_carry),
360 | .low_sum_inv_o(low_sum_inv),
361 | .low_carry_inv_o(low_carry_inv)
362 | );
363 |
364 |
365 | //Incrementer, Work on MSBs
366 | wire [PARM_MANT + 3 : 0]high_sum;
367 | wire [PARM_MANT + 3 : 0]high_sum_inv;
368 |
369 |
370 | MSBIncrementer #(
371 | .PARM_MANT(PARM_MANT)
372 | ) MSBIncrementer (
373 | .low_carry_i(low_carry),
374 | .low_carry_inv_i(low_carry_inv),
375 | .A_Mant_aligned_high_i(A_Mant_aligned_high),
376 |
377 | .high_sum_o(high_sum),
378 | .high_sum_inv_o(high_sum_inv)
379 | );
380 |
381 |
382 | wire bc_not_strange = ~(B_Inf | C_Inf | B_Zero | C_Zero | B_NaN | C_NaN);
383 | wire [3*PARM_MANT + 4 : 0] sub_minus = {{A_Mant_aligned_high[PARM_MANT+2 : 0], 1'b0} - bc_not_strange, 47'd0};
384 |
385 | //Output of the Adder stage...
386 | assign SignFlip_ADD_PRN = high_sum[PARM_MANT + 3];
387 | assign Adder_sign = Exp_mv_sign? Sign_aligned: (SignFlip_ADD_PRN ^ Sign_aligned);
388 |
389 | always @(*) begin
390 | if(Mv_halt)
391 | PosSum = {{26'd0}, low_sum};
392 | else if(Exp_mv_sign) //b*c does not participate
393 | PosSum = Sub_Sign? sub_minus : {A_Mant_aligned_high[PARM_MANT+2 : 0], 48'd0};
394 | else if(SignFlip_ADD_PRN)
395 | PosSum = {high_sum_inv[PARM_MANT + 2 : 0], low_sum_inv};
396 | else
397 | PosSum = {high_sum[PARM_MANT + 2 : 0], low_sum};
398 | end
399 |
400 |
401 | // For Sign_amt_DI = 1'b1, it is difficult to compute combined with other cases.
402 | // When addition, | (b*c); when subtraction, | (b*c) for rounding exception truncation.
403 | assign Minus_sticky_bit = Exp_mv_sign && (bc_not_strange);
404 |
405 | //Leading-one detector: finds the shift amount needed for normalization
406 | wire [PARM_LEADONE_WIDTH - 1 : 0] shift_num;
407 | wire allzero;
408 |
409 |
410 | LeadingOneDetector_Top #(
411 | .X_LEN(74)
412 | ) LeadingOneDetector (
413 | .data_i(PosSum),
414 |
415 | .shift_num_o(shift_num),
416 | .allzero_o(allzero)
417 | );
418 |
419 |
420 | //Normalize the mantissa and adjust the exponent according to the result of LeadingOneDetector
421 | wire [3*PARM_MANT + 4 : 0] Mant_norm;
422 | wire [PARM_EXP + 1 : 0] Exp_norm;
423 | wire [PARM_EXP + 1 : 0] Exp_norm_mone;
424 | wire [PARM_EXP + 1 : 0] Exp_max_rs;
425 | wire [3*PARM_MANT + 6 : 0] Rs_Mant;
426 |
427 | Normalizer #(
428 | .PARM_EXP(PARM_EXP),
429 | .PARM_MANT(PARM_MANT),
430 | .PARM_LEADONE_WIDTH(PARM_LEADONE_WIDTH)
431 | ) Normalizer (
432 | .Mant_i(PosSum),
433 | .Exp_i(Exp_aligned),
434 | .Shift_num_i(shift_num),
435 | .Exp_mv_sign_i(Exp_mv_sign),
436 |
437 | .Mant_norm_o(Mant_norm),
438 | .Exp_norm_o(Exp_norm),
439 | .Exp_norm_mone_o(Exp_norm_mone),
440 | .Exp_max_rs_o(Exp_max_rs),
441 | .Rs_Mant_o(Rs_Mant)
442 | );
443 |
444 | wire Sign_result;
445 | wire [PARM_EXP - 1 : 0] Exp_result;
446 | wire [PARM_MANT - 1 : 0] Mant_result;
447 |
448 | assign Result_o = {Sign_result, Exp_result, Mant_result}; //outputlogic
449 |
450 | Rounder #(
451 | .PARM_RM(PARM_RM),
452 | .PARM_RM_RNE(PARM_RM_RNE),
453 | .PARM_RM_RTZ(PARM_RM_RTZ),
454 | .PARM_RM_RDN(PARM_RM_RDN),
455 | .PARM_RM_RUP(PARM_RM_RUP),
456 | .PARM_RM_RMM(PARM_RM_RMM),
457 | .PARM_MANT_NAN(PARM_MANT_NAN),
458 | .PARM_EXP(PARM_EXP),
459 | .PARM_MANT(PARM_MANT),
460 | .PARM_LEADONE_WIDTH(PARM_LEADONE_WIDTH)
461 | ) Rounder (
462 | .Exp_i(Exp_aligned),
463 | .Sign_i(Adder_sign),
464 | .Allzero_i(allzero),
465 | .Exp_mv_sign_i(Exp_mv_sign),
466 | .Sub_Sign_i(Sub_Sign),
467 | .A_Exp_raw_i(A_i[PARM_XLEN - 2 : PARM_MANT]), // This is different from A_Exp, since we would like the "raw" bits
468 | .Rounding_mode_i(Rounding_mode_i),
469 | .A_Mant_i(A_Mant),
470 | .A_Sign_i(A_Sign),
471 | .B_Sign_i(B_Sign),
472 | .C_Sign_i(C_Sign),
473 | .A_DeN_i(A_DeN),
474 | .A_Inf_i(A_Inf),
475 | .B_Inf_i(B_Inf),
476 | .C_Inf_i(C_Inf),
477 | .A_Zero_i(A_Zero),
478 | .B_Zero_i(B_Zero),
479 | .C_Zero_i(C_Zero),
480 | .A_NaN_i(A_NaN),
481 | .B_NaN_i(B_NaN),
482 | .C_NaN_i(C_NaN),
483 | .Mant_sticky_sht_out_i(Mant_sticky_sht_out),
484 | .Minus_sticky_bit_i(Minus_sticky_bit),
485 | .Mant_norm_i(Mant_norm),
486 | .Exp_norm_i(Exp_norm),
487 | .Exp_norm_mone_i(Exp_norm_mone),
488 | .Exp_max_rs_i(Exp_max_rs),
489 | .Rs_Mant_i(Rs_Mant),
490 |
491 | .Sign_result_o(Sign_result),
492 | .Exp_result_o(Exp_result),
493 | .Mant_result_o(Mant_result),
494 | .Invalid_o(NV_o),
495 | .Overflow_o(OF_o),
496 | .Underflow_o(UF_o),
497 | .Inexact_o(NX_o)
498 | );
499 |
500 | endmodule
501 |
502 |
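The Exponent Processor comments in MAC32_top.v define the alignment shift as mv = 27 - d with d = expA - (expB + expC - 127). A small worked example makes the sign convention easier to see. The sketch below is not part of the repository: the module name `exp_mv_sketch` and the task `show` are hypothetical, the widths assume PARM_EXP = 8, and the A_Zero term of Mv_halt is deliberately left out.

```verilog
// Hypothetical worked example; names are illustrative, not from the repository.
module exp_mv_sketch;

    reg  [7:0] a_exp, b_exp, c_exp;

    // Same expression and result width as Exp_mv in MAC32_top.
    wire [9:0] exp_mv      = 27 - a_exp + b_exp + c_exp - 127;
    wire       exp_mv_sign = exp_mv[9];
    // Positive shift of 74 or more is out of range (A_Zero term omitted here).
    wire       mv_halt     = (~exp_mv_sign) & (exp_mv[8:0] > 73);

    task show;
        begin
            #1 $display("A=%0d B=%0d C=%0d -> mv=%0d sign=%b halt=%b",
                        a_exp, b_exp, c_exp, $signed(exp_mv), exp_mv_sign, mv_halt);
        end
    endtask

    initial begin
        a_exp = 127; b_exp = 127; c_exp = 127; show; // equal magnitudes: mv = 27
        a_exp = 200; b_exp = 127; c_exp = 127; show; // A dominates: mv = -46, sign set
        a_exp = 1;   b_exp = 127; c_exp = 127; show; // B*C dominates: mv = 153, halt set
        $finish;
    end

endmodule
```

A negative mv (sign bit set) corresponds to the case where B*C does not participate in the sum, which is why Wallace_sum_adjusted and Wallace_carry_adjusted are forced to zero in that branch.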
--------------------------------------------------------------------------------
/src/01_RTL/MSBIncrementer.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/29/2022 10:53:38 AM
5 | // Module Name: MSBIncrementer
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Increments A_High when the A_Low addition produces a carry
13 | //
14 | //////////////////////////////////////////////////////////////////////////////////
15 | // Revision:
16 | // 09/12/2022 - Add BSD-3-Clause Licence
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 | // License information:
20 | //
21 | // This software is released under the BSD-3-Clause Licence,
22 | // see https://opensource.org/licenses/BSD-3-Clause for details.
23 | // In the following license statements, "software" refers to the
24 | // "source code" of the complete hardware/software system.
25 | //
26 | // Copyright 2022,
27 | // Embedded Intelligent Systems Lab (EISL)
28 | // Department of Computer Science
29 | // National Yang Ming Chiao Tung University
30 | // Hsinchu, Taiwan.
31 | //
32 | // All rights reserved.
33 | //
34 | // Redistribution and use in source and binary forms, with or without
35 | // modification, are permitted provided that the following conditions are met:
36 | //
37 | // 1. Redistributions of source code must retain the above copyright notice,
38 | // this list of conditions and the following disclaimer.
39 | //
40 | // 2. Redistributions in binary form must reproduce the above copyright notice,
41 | // this list of conditions and the following disclaimer in the documentation
42 | // and/or other materials provided with the distribution.
43 | //
44 | // 3. Neither the name of the copyright holder nor the names of its contributors
45 | // may be used to endorse or promote products derived from this software
46 | // without specific prior written permission.
47 | //
48 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
49 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
50 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
51 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
52 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
53 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
54 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
55 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
56 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
57 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
58 | // POSSIBILITY OF SUCH DAMAGE.
59 | //////////////////////////////////////////////////////////////////////////////////
60 |
61 |
62 | module MSBIncrementer #(
63 | parameter PARM_MANT = 23
64 | ) (
65 | input low_carry_i,
66 | input low_carry_inv_i,
67 | input [PARM_MANT + 3 : 0] A_Mant_aligned_high_i,
68 |
69 | output [PARM_MANT + 3 : 0] high_sum_o,
70 | output [PARM_MANT + 3 : 0] high_sum_inv_o
71 | );
72 | wire high_carry; // carry-out, intentionally unused
73 | wire high_carry_inv; // carry-out, intentionally unused
74 |
75 | assign {high_carry, high_sum_o} = (low_carry_i)? A_Mant_aligned_high_i + 1 : A_Mant_aligned_high_i;
76 | assign {high_carry_inv, high_sum_inv_o} = (low_carry_inv_i)? ~A_Mant_aligned_high_i : ~A_Mant_aligned_high_i - 1;
77 |
78 | endmodule
79 |
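The idea behind MSBIncrementer.v is that an addition which only touches the low part of a wide word can be finished by incrementing the high part when the low part carries out. The sketch below checks that identity exhaustively on a scaled-down 8-bit word. It is not part of the repository: the module name `split_add_sketch` and all signal names are hypothetical, and only the non-inverted path (`high_sum_o`) is modelled.

```verilog
// Hypothetical scaled-down check; names are illustrative, not from the repository.
module split_add_sketch;

    reg  [7:0] x;                 // stands in for the wide word {high, low}
    reg  [3:0] y;                 // value added to the low part only
    reg  [3:0] low_sum, high_sum;
    reg        low_carry;
    reg  [7:0] full;
    integer    i, j, mismatches;

    initial begin
        mismatches = 0;
        for (i = 0; i < 256; i = i + 1)
            for (j = 0; j < 16; j = j + 1) begin
                x = i;
                y = j;
                // Low-part adder with its carry-out.
                {low_carry, low_sum} = x[3:0] + y;
                // High part is incremented only when the low part carries.
                high_sum = low_carry ? x[7:4] + 1 : x[7:4];
                // Reference: one full-width addition, truncated to 8 bits.
                full = x + y;
                if ({high_sum, low_sum} !== full)
                    mismatches = mismatches + 1;
            end
        $display("mismatches = %0d", mismatches);
        $finish;
    end

endmodule
```

Splitting the adder this way keeps the carry-propagate adder as narrow as the low part, with the cheaper incrementer covering the upper bits.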
--------------------------------------------------------------------------------
/src/01_RTL/Normalizer.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 08/01/2022 03:36:51 PM
5 | // Module Name: Normalizer
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Normalizes the fraction and corrects the exponent using
13 | // the output of the Leading One Detector
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | // 08/01/2022 - Output logic mistaken port name, fixed
18 | // 08/04/2022 - Remove redundant parameters
19 | // 09/12/2022 - Add BSD-3-Clause Licence
20 | //
21 | //////////////////////////////////////////////////////////////////////////////////
22 | // License information:
23 | //
24 | // This software is released under the BSD-3-Clause Licence,
25 | // see https://opensource.org/licenses/BSD-3-Clause for details.
26 | // In the following license statements, "software" refers to the
27 | // "source code" of the complete hardware/software system.
28 | //
29 | // Copyright 2022,
30 | // Embedded Intelligent Systems Lab (EISL)
31 | // Deparment of Computer Science
32 | // National Yang Ming Chiao Tung Uniersity
33 | // Hsinchu, Taiwan.
34 | //
35 | // All rights reserved.
36 | //
37 | // Redistribution and use in source and binary forms, with or without
38 | // modification, are permitted provided that the following conditions are met:
39 | //
40 | // 1. Redistributions of source code must retain the above copyright notice,
41 | // this list of conditions and the following disclaimer.
42 | //
43 | // 2. Redistributions in binary form must reproduce the above copyright notice,
44 | // this list of conditions and the following disclaimer in the documentation
45 | // and/or other materials provided with the distribution.
46 | //
47 | // 3. Neither the name of the copyright holder nor the names of its contributors
48 | // may be used to endorse or promote products derived from this software
49 | // without specific prior written permission.
50 | //
51 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
52 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
53 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
54 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
55 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
56 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
57 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
58 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
59 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
60 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
61 | // POSSIBILITY OF SUCH DAMAGE.
62 | //////////////////////////////////////////////////////////////////////////////////
63 |
64 |
65 | module Normalizer#(
66 | parameter PARM_EXP = 8,
67 | parameter PARM_MANT = 23,
68 | parameter PARM_LEADONE_WIDTH = 7
69 | ) (
70 | input [3*PARM_MANT + 4 : 0]Mant_i,
71 | input [PARM_EXP + 1 : 0]Exp_i,
72 | input [PARM_LEADONE_WIDTH - 1 : 0] Shift_num_i,
73 | input Exp_mv_sign_i,
74 |
75 | output [3*PARM_MANT + 4 : 0] Mant_norm_o,
76 | output reg [PARM_EXP + 1 : 0] Exp_norm_o,
77 | output [PARM_EXP + 1 : 0] Exp_norm_mone_o,
78 | output [PARM_EXP + 1 : 0] Exp_max_rs_o,
79 | output [3*PARM_MANT + 6 : 0] Rs_Mant_o
80 | );
81 |
82 | //Exponent correction and normalization using the leading-one detector result
83 |
84 | wire [PARM_LEADONE_WIDTH - 1 : 0] Shift_num = (Exp_mv_sign_i | Mant_i[3*PARM_MANT + 4])? 0 : Shift_num_i; //If the exponent < 0, or it has a leading one (1xxxxxx....)
85 |
86 | reg [PARM_EXP : 0] norm_amt;
87 | always @(*) begin
88 | if(Exp_i[PARM_EXP + 1])
89 | norm_amt = 0; // the exponent overflows
90 | else if(Exp_i > Shift_num)
91 | norm_amt = Shift_num; // ensures the exponent stays above 0
92 | else
93 | norm_amt = Exp_i[PARM_EXP : 0] - 1; //Denormalized numbers have an exponent field of 0, representing -126
94 | end
95 |
96 | assign Mant_norm_o = Mant_i << norm_amt;
97 |
98 |
99 | always @(*) begin
100 | if(Exp_i[PARM_EXP + 1])
101 | Exp_norm_o = 0; // the exponent overflows
102 | else if(Exp_i > Shift_num)
103 | Exp_norm_o = Exp_i - Shift_num; // ensures the exponent stays above 0
104 | else
105 | Exp_norm_o = 1; //Denormalized numbers have an exponent field of 0, representing -126
106 | end
107 |
108 | assign Exp_norm_mone_o = Exp_i - Shift_num - 1;
109 |
110 | //if Exp < 0, shift Right
111 |
112 | assign Exp_max_rs_o = Exp_i[PARM_EXP : 0] + 74;
113 | wire [PARM_EXP + 1 : 0] Rs_count = (~Exp_i + 1) + 1; // -Exp_i + 1, number of right shifts to get a denormalized number.
114 | assign Rs_Mant_o = {Mant_i, 2'd0} >> Rs_count;
115 |
116 | endmodule
117 |
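The left-shift clamp in Normalizer.v is easiest to follow with concrete numbers. The sketch below replays the same three-way decision on a few hand-picked inputs. It is not part of the repository: the module name `norm_clamp_sketch` and the task `clamp` are hypothetical, and the widths assume PARM_EXP = 8 and PARM_LEADONE_WIDTH = 7.

```verilog
// Hypothetical worked example; names are illustrative, not from the repository.
module norm_clamp_sketch;

    reg [9:0] exp_i;      // aligned exponent, bit 9 is the sign bit
    reg [6:0] shift_num;  // shift suggested by the leading-one detector
    reg [8:0] norm_amt;   // shift actually applied to the mantissa
    reg [9:0] exp_norm;   // exponent after normalization

    // Same branch structure as the two always blocks in Normalizer.
    task clamp;
        begin
            if (exp_i[9]) begin
                norm_amt = 0;
                exp_norm = 0;
            end
            else if (exp_i > shift_num) begin
                norm_amt = shift_num;
                exp_norm = exp_i - shift_num;
            end
            else begin
                norm_amt = exp_i[8:0] - 1;
                exp_norm = 1;
            end
            $display("exp=%0d shift=%0d -> norm_amt=%0d exp_norm=%0d",
                     $signed(exp_i), shift_num, norm_amt, exp_norm);
        end
    endtask

    initial begin
        exp_i = 10'd130;  shift_num = 7'd3;  clamp; // full shift fits: 3, 127
        exp_i = 10'd3;    shift_num = 7'd10; clamp; // clamped: 2, 1
        exp_i = 10'h3FB;  shift_num = 7'd10; clamp; // sign bit set (-5): 0, 0
        $finish;
    end

endmodule
```

In the clamped case the mantissa is shifted only until the exponent reaches 1, so the leading one may not reach the top bit and the result is left for the rounder to treat as denormalized.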
--------------------------------------------------------------------------------
/src/01_RTL/PreNormalizer.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/25/2022 10:50:12 PM
5 | // Module Name: PreNormalizer
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Shifts the augend to the correct position and calculates
13 | // its exponent; works in parallel with the multiplier
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | // 07/25/2022 - Ports renaming into appropriate suffix
18 | // 07/26/2022 - Add debug signals to probe the shifting
19 | // 07/26/2022 - Debug wires removed
20 | // 07/27/2022 - Input wire "sign_change_i" renamed to "sign_flip_i"
21 | // 09/12/2022 - Add BSD-3-Clause Licence
22 | //
23 | //////////////////////////////////////////////////////////////////////////////////
24 | // License information:
25 | //
26 | // This software is released under the BSD-3-Clause Licence,
27 | // see https://opensource.org/licenses/BSD-3-Clause for details.
28 | // In the following license statements, "software" refers to the
29 | // "source code" of the complete hardware/software system.
30 | //
31 | // Copyright 2022,
32 | // Embedded Intelligent Systems Lab (EISL)
33 | // Department of Computer Science
34 | // National Yang Ming Chiao Tung University
35 | // Hsinchu, Taiwan.
36 | //
37 | // All rights reserved.
38 | //
39 | // Redistribution and use in source and binary forms, with or without
40 | // modification, are permitted provided that the following conditions are met:
41 | //
42 | // 1. Redistributions of source code must retain the above copyright notice,
43 | // this list of conditions and the following disclaimer.
44 | //
45 | // 2. Redistributions in binary form must reproduce the above copyright notice,
46 | // this list of conditions and the following disclaimer in the documentation
47 | // and/or other materials provided with the distribution.
48 | //
49 | // 3. Neither the name of the copyright holder nor the names of its contributors
50 | // may be used to endorse or promote products derived from this software
51 | // without specific prior written permission.
52 | //
53 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
54 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
55 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
56 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
57 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
58 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
59 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
60 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
61 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
62 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
63 | // POSSIBILITY OF SUCH DAMAGE.
64 | //////////////////////////////////////////////////////////////////////////////////
65 |
66 |
67 | module PreNormalizer #(
68 | parameter PARM_EXP = 8,
69 | parameter PARM_MANT = 23,
70 | parameter PARM_BIAS = 127
71 | ) (
72 | input A_sign_i,
73 | input B_sign_i,
74 | input C_sign_i,
75 | input Sub_Sign_i,
76 | input [PARM_EXP - 1 : 0] A_Exp_i,
77 | input [PARM_EXP - 1 : 0] B_Exp_i,
78 | input [PARM_EXP - 1 : 0] C_Exp_i,
79 | input [PARM_MANT : 0] A_Mant_i,
80 | input Sign_flip_i,
81 | input Mv_halt_i,
82 | input [PARM_EXP + 1 : 0] Exp_mv_i,
83 | input Exp_mv_sign_i,
84 |
85 | output Sign_aligned_o,
86 | output [PARM_EXP + 1: 0] Exp_aligned_o,
87 | output reg [74 : 0] A_Mant_aligned_o,
88 | output reg Mant_sticky_sht_out_o
89 | );
90 |
91 |
92 | wire [73 : 0] A_Mant_aligned;
93 | wire [PARM_MANT : 0] Drop_bits;
94 | assign {A_Mant_aligned, Drop_bits} = {A_Mant_i, 74'd0} >> (Mv_halt_i ? 0 : Exp_mv_i);
95 |
96 | //output logic for aligner
97 | assign Sign_aligned_o = (Exp_mv_sign_i)? A_sign_i : B_sign_i ^ C_sign_i;
98 | assign Exp_aligned_o = (Exp_mv_sign_i)? A_Exp_i : (B_Exp_i + C_Exp_i - PARM_BIAS + 27); // exponent = (expB + expC -127) + point distance(= 27)
99 |
100 | //output logic for A_Mant_aligned_o
101 | always @(*) begin
102 | if(Exp_mv_sign_i)
103 | A_Mant_aligned_o = (A_Mant_i << 50);
104 | else if(~Mv_halt_i)
105 | A_Mant_aligned_o = {Sub_Sign_i, {74{Sub_Sign_i}}^A_Mant_aligned};
106 | else
107 | A_Mant_aligned_o = 0;
108 | end
109 |
110 |
111 | wire [PARM_MANT : 0] A_Mant_2compelemnt = (~A_Mant_i) + 1; //2's complement of mantA
112 | wire [PARM_MANT : 0] Drop_bits_2complement = (~Drop_bits) + 1; //2's complement of Drop_bits
113 |
114 | //output logic for Mant_sticky_sht_out_o
115 | always @(*) begin
116 | if(Sub_Sign_i & (~Sign_flip_i))
117 | Mant_sticky_sht_out_o = (Mv_halt_i)? (|A_Mant_2compelemnt) : (|Drop_bits_2complement);
118 | else
119 | Mant_sticky_sht_out_o = (Mv_halt_i)? (|A_Mant_i) : (|Drop_bits);
120 | end
121 |
122 | endmodule
123 |
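The alignment shift in PreNormalizer can be cross-checked with a small software model. The sketch below is a Python reference written for illustration, not part of the repository. It mirrors only the `{A_Mant_i, 74'd0} >> Exp_mv_i` path with `Mv_halt_i = 0` and no subtraction. The name `align_addend` and the constant names are assumptions.

```python
MANT_BITS = 24     # hidden bit + 23 fraction bits (PARM_MANT + 1)
ALIGNED_BITS = 74  # width of A_Mant_aligned in the RTL


def align_addend(a_mant: int, exp_mv: int):
    """Model {A_Mant_aligned, Drop_bits} = {A_Mant_i, 74'd0} >> Exp_mv_i.

    Returns (aligned, drop_bits, sticky), where sticky is the OR-reduction
    of the 24 bits that fall below the 74-bit aligned field.
    Hypothetical helper: addition path only, Mv_halt_i assumed to be 0.
    """
    assert 0 <= a_mant < (1 << MANT_BITS)
    assert exp_mv >= 0
    wide = (a_mant << ALIGNED_BITS) >> exp_mv    # 98-bit shifted value
    aligned = wide >> MANT_BITS                  # upper 74 bits
    drop_bits = wide & ((1 << MANT_BITS) - 1)    # lower 24 bits
    sticky = drop_bits != 0
    return aligned, drop_bits, sticky
```

With a shift of 0 the mantissa sits at the top of the 74-bit field and nothing is dropped. With a shift of 74 the whole mantissa lands in `drop_bits` and only the sticky flag records it.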
--------------------------------------------------------------------------------
/src/01_RTL/R4Booth.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/22/2022 10:59:09 AM
5 | // Module Name: R4Booth
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Breaks a 24-bit x 24-bit multiplication into 13 partial products
13 | // using the Radix-4 Booth algorithm
14 | //
15 | //////////////////////////////////////////////////////////////////////////////////
16 | // Revision:
17 | // 07/22/2022 - Encode the input by Radix-4 Booth's Recoding Table
18 | // 07/22/2022 - Utilize generate statements in the encoder
19 | // 07/24/2022 - Combine the module with Booth recoding module
20 | // 07/25/2022 - Add decoder section to generate partial product from the encoded message
21 | // 07/25/2022 - Decoder index bug fix
22 | // 07/25/2022 - Parameters updated, redundancy removed and comments added
23 | // 08/15/2022 - Save one bit by reducing the bus width of pp_12_o
24 | // 09/12/2022 - Add BSD-3-Clause Licence
25 | //
26 | //////////////////////////////////////////////////////////////////////////////////
27 | // License information:
28 | //
29 | // This software is released under the BSD-3-Clause Licence,
30 | // see https://opensource.org/licenses/BSD-3-Clause for details.
31 | // In the following license statements, "software" refers to the
32 | // "source code" of the complete hardware/software system.
33 | //
34 | // Copyright 2022,
35 | // Embedded Intelligent Systems Lab (EISL)
36 | // Department of Computer Science
37 | // National Yang Ming Chiao Tung University
38 | // Hsinchu, Taiwan.
39 | //
40 | // All rights reserved.
41 | //
42 | // Redistribution and use in source and binary forms, with or without
43 | // modification, are permitted provided that the following conditions are met:
44 | //
45 | // 1. Redistributions of source code must retain the above copyright notice,
46 | // this list of conditions and the following disclaimer.
47 | //
48 | // 2. Redistributions in binary form must reproduce the above copyright notice,
49 | // this list of conditions and the following disclaimer in the documentation
50 | // and/or other materials provided with the distribution.
51 | //
52 | // 3. Neither the name of the copyright holder nor the names of its contributors
53 | // may be used to endorse or promote products derived from this software
54 | // without specific prior written permission.
55 | //
56 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
57 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
58 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
59 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
60 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
61 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
62 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
63 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
64 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
65 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
66 | // POSSIBILITY OF SUCH DAMAGE.
67 | //////////////////////////////////////////////////////////////////////////////////
68 |
69 |
70 | module R4Booth #(
71 | parameter PARM_MANT = 23
72 | ) (
73 | input [PARM_MANT : 0] MantA_i, // input is {hidden_bit, mantissa} = 1 + 23 = 24 bits
74 | input [PARM_MANT : 0] MantB_i,
75 |
76 | output [2*PARM_MANT + 2 : 0] pp_00_o, //output range is 24*2 +1(if x2 multiplicand) = 49 bits
77 | output [2*PARM_MANT + 2 : 0] pp_01_o,
78 | output [2*PARM_MANT + 2 : 0] pp_02_o,
79 | output [2*PARM_MANT + 2 : 0] pp_03_o,
80 | output [2*PARM_MANT + 2 : 0] pp_04_o,
81 | output [2*PARM_MANT + 2 : 0] pp_05_o,
82 | output [2*PARM_MANT + 2 : 0] pp_06_o,
83 | output [2*PARM_MANT + 2 : 0] pp_07_o,
84 | output [2*PARM_MANT + 2 : 0] pp_08_o,
85 | output [2*PARM_MANT + 2 : 0] pp_09_o,
86 | output [2*PARM_MANT + 2 : 0] pp_10_o,
87 | output [2*PARM_MANT + 2 : 0] pp_11_o,
88 | output [2*PARM_MANT + 1 : 0] pp_12_o
89 | );
90 | parameter PARM_PP = ((PARM_MANT+1)+1+1)/2; // radix-4 Booth recoding produces at most CEILING((n+1)/2) partial products for an n-bit multiplier
91 |
92 | //Modified Booth's Recoding Table
93 | // Multiplier
94 | //| Bit i + 1 | Bit i | Bit i - 1 | Multiplicand selected |
95 | //| 0 | 0 | 0 | 0 x Multiplicand |
96 | //| 0 | 0 | 1 | +1 x Multiplicand |
97 | //| 0 | 1 | 0 | +1 x Multiplicand |
98 | //| 0 | 1 | 1 | +2 x Multiplicand |
99 | //| 1 | 0 | 0 | -2 x Multiplicand |
100 | //| 1 | 0 | 1 | -1 x Multiplicand |
101 | //| 1 | 1 | 0 | -1 x Multiplicand |
102 | //| 1 | 1 | 1 | 0 x Multiplicand |
103 |
104 |
105 | wire [PARM_MANT + 3 : 0] mant_B_Padding = {2'd0, MantB_i, 1'd0};
106 |
107 | wire [PARM_PP - 1 : 0] mul1x; // mul1x_o = bit (i) ^ bit(i - 1)
108 | wire [PARM_PP - 1 : 0] mul2x; // mul2x_o = (pattern == 3'b011 || pattern_i == 3'b100);
109 | wire [PARM_PP - 1 : 0] mulsign; // mulsign_o = bit (i + 1)
110 |
111 |
112 | generate
113 | genvar j;
114 | for (j = 0; j < 13; j = j+1) begin
115 | assign mul1x[j] = mant_B_Padding[j*2] ^ mant_B_Padding[j*2 + 1];
116 | assign mul2x[j] = ((~mant_B_Padding[j*2]) & (~mant_B_Padding[j*2 + 1]) & (mant_B_Padding[j*2 + 2])) ||
117 | ((mant_B_Padding[j*2]) & (mant_B_Padding[j*2+1]) & (~mant_B_Padding[j*2+2]));
118 | assign mulsign[j] = mant_B_Padding[j*2 + 2];
119 | end
120 | endgenerate
121 |
122 |
123 | // Select the 0x, 1x or 2x multiple of the multiplicand for each partial product
124 | reg [PARM_MANT + 1 : 0] booth_PP_tmp [PARM_PP - 1: 0];
125 | wire [PARM_MANT + 1 : 0] booth_PP [PARM_PP - 1: 0];
126 |
127 | integer idx;
128 | always @(*) begin
129 | for (idx = 0; idx < PARM_PP; idx = idx + 1) begin
130 | if(mul1x[idx]) booth_PP_tmp[idx] = MantA_i;
131 | else if(mul2x[idx]) booth_PP_tmp[idx] = MantA_i << 1;
132 | else booth_PP_tmp[idx] = 0;
133 |
134 | end
135 | end
136 |
137 | // Negative Booth digits: form the 2's complement by inverting the bits here and adding the 1 in the next row.
138 | generate
139 | genvar k;
140 | for(k = 0; k < PARM_PP; k = k + 1)begin
141 | assign booth_PP[k] = (mulsign[k])? ~booth_PP_tmp[k] : booth_PP_tmp[k];
142 | end
143 | endgenerate
144 |
145 |
146 | // Add the "1 triangle" at the upper left. This assumes an unsigned multiplication.
147 | assign pp_00_o = {21'd0, ~mulsign[ 0],{2{mulsign[0]}},booth_PP[0]};
148 | assign pp_01_o = {21'd1, ~mulsign[ 1], booth_PP[ 1], 1'b0, mulsign[ 0]};
149 | assign pp_02_o = {19'd1, ~mulsign[ 2], booth_PP[ 2], 1'b0, mulsign[ 1], 2'd0};
150 | assign pp_03_o = {17'd1, ~mulsign[ 3], booth_PP[ 3], 1'b0, mulsign[ 2], 4'd0};
151 | assign pp_04_o = {15'd1, ~mulsign[ 4], booth_PP[ 4], 1'b0, mulsign[ 3], 6'd0};
152 | assign pp_05_o = {13'd1, ~mulsign[ 5], booth_PP[ 5], 1'b0, mulsign[ 4], 8'd0};
153 | assign pp_06_o = {11'd1, ~mulsign[ 6], booth_PP[ 6], 1'b0, mulsign[ 5], 10'd0};
154 | assign pp_07_o = { 9'd1, ~mulsign[ 7], booth_PP[ 7], 1'b0, mulsign[ 6], 12'd0};
155 | assign pp_08_o = { 7'd1, ~mulsign[ 8], booth_PP[ 8], 1'b0, mulsign[ 7], 14'd0};
156 | assign pp_09_o = { 5'd1, ~mulsign[ 9], booth_PP[ 9], 1'b0, mulsign[ 8], 16'd0};
157 | assign pp_10_o = { 3'd1, ~mulsign[10], booth_PP[10], 1'b0, mulsign[ 9], 18'd0};
158 | assign pp_11_o = { 1'd1, ~mulsign[11], booth_PP[11], 1'b0, mulsign[10], 20'd0};
159 | assign pp_12_o = {booth_PP[12][PARM_MANT : 0], 1'b0, mulsign[11], 22'd0}; //Save one bit, MSB is always 0
160 |
161 | endmodule
162 |
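The recoding table in R4Booth can be checked against plain integer multiplication. The following Python sketch is an illustrative reference model, not code from the repository. It follows the same padding as `mant_B_Padding` (`{2'd0, MantB_i, 1'd0}`) and the same 13-digit count as `PARM_PP`. The function names are assumptions.

```python
def booth_radix4_digits(multiplier: int, width: int = 24):
    """Recode an unsigned multiplier into radix-4 Booth digits in {-2..2}.

    Digit j is read from padded bits (2j+2, 2j+1, 2j), matching the
    recoding table: digit = bit(i-1) + bit(i) - 2*bit(i+1).
    """
    assert 0 <= multiplier < (1 << width)
    padded = multiplier << 1          # one zero appended below the LSB
    n_digits = (width + 2) // 2       # 13 for a 24-bit multiplier
    digits = []
    for j in range(n_digits):
        low = (padded >> (2 * j)) & 1
        mid = (padded >> (2 * j + 1)) & 1
        high = (padded >> (2 * j + 2)) & 1
        digits.append(low + mid - 2 * high)
    return digits


def booth_multiply(a: int, b: int, width: int = 24) -> int:
    """Sum the partial products digit_j * a * 4**j (hypothetical helper)."""
    digits = booth_radix4_digits(b, width)
    return sum(d * a * 4 ** j for j, d in enumerate(digits))
```

Because the two padding bits above the multiplier are zero, the last digit is never negative, which is consistent with the narrower `pp_12_o` bus.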
--------------------------------------------------------------------------------
/src/01_RTL/Rounder.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/30/2022 10:47:12 AM
5 | // Module Name: Rounder
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Process Rounding by checking the rounding mode set, guard bit,
13 | // round bit and sticky bit.
14 | // Raises Invalid, Overflow and Underflow under appropriate situations
15 | // Adjust the exponent and mantissa for the module output
16 | //
17 | //////////////////////////////////////////////////////////////////////////////////
18 | // Revision:
19 | // 07/30/2022 - File Created
20 | // 08/01/2022 - Rename File
21 | // 08/08/2022 - Add PARM_RM_RMM support
22 | // 08/09/2022 - Invalid_o shall raise whilst Overflow/Underflow
23 | // 08/10/2022 - Debug wires added to observe the chosen MUX path
24 | // 08/12/2022 - Remove A = 0 as special case, due to the update of mv_halt in MAC32_top.v
25 | // 08/13/2022 - Fix multidriven Net, Mant_result_o
26 | // 08/13/2022 - Underflow signal fixed, denorm number wouldn't fire Underflow Signal
27 | // 08/14/2022 - Debug wires removed
28 | // 09/12/2022 - Add BSD-3-Clause Licence
29 | //
30 | //////////////////////////////////////////////////////////////////////////////////
31 | // License information:
32 | //
33 | // This software is released under the BSD-3-Clause Licence,
34 | // see https://opensource.org/licenses/BSD-3-Clause for details.
35 | // In the following license statements, "software" refers to the
36 | // "source code" of the complete hardware/software system.
37 | //
38 | // Copyright 2022,
39 | // Embedded Intelligent Systems Lab (EISL)
40 | // Department of Computer Science
41 | // National Yang Ming Chiao Tung University
42 | // Hsinchu, Taiwan.
43 | //
44 | // All rights reserved.
45 | //
46 | // Redistribution and use in source and binary forms, with or without
47 | // modification, are permitted provided that the following conditions are met:
48 | //
49 | // 1. Redistributions of source code must retain the above copyright notice,
50 | // this list of conditions and the following disclaimer.
51 | //
52 | // 2. Redistributions in binary form must reproduce the above copyright notice,
53 | // this list of conditions and the following disclaimer in the documentation
54 | // and/or other materials provided with the distribution.
55 | //
56 | // 3. Neither the name of the copyright holder nor the names of its contributors
57 | // may be used to endorse or promote products derived from this software
58 | // without specific prior written permission.
59 | //
60 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
61 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
62 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
63 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
64 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
65 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
66 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
67 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
68 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
69 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
70 | // POSSIBILITY OF SUCH DAMAGE.
71 | //////////////////////////////////////////////////////////////////////////////////
72 | // Additional Comments:
73 | // IEEE Std 754-2008 Chap7. Default exception handling
74 | //
75 | // -------------------------------------------------------------------------------
76 | // 7.2 Invalid Operation
77 | //
78 | // The invalid operation exception is signaled if and only if there is no usefully definable result.
79 | // In these cases the operands are invalid for the operation to be performed.
80 | // For operations producing results in floating-point format, the default result of an operation that signals the
81 | // invalid operation exception shall be a quiet NaN that should provide some diagnostic information (see 6.2).
82 | // These operations are:
83 | // a) any general-computational or signaling-computational operation on a signaling NaN (see 6.2),
84 | // except for some conversions (see 5.12)
85 | // b) multiplication: multiplication(0, ∞) or multiplication(∞, 0)
86 | // c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet
87 | // NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception
88 | // is signaled
89 | // d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as:
90 | // addition(+∞, −∞)
91 | // e) division: division(0, 0) or division(∞, ∞)
92 | // f) remainder: remainder(x, y), when y is zero or x is infinite and neither is NaN
93 | // g) squareRoot if the operand is less than zero
94 | // h) quantize when the result does not fit in the destination format or when one operand is finite and the
95 | // other is infinite
96 | // -------------------------------------------------------------------------------
97 | // 7.4 Overflow (IEEE 754-2008)
98 | //
99 | // The overflow exception shall be signaled if and only if the destination format’s largest finite number is
100 | // exceeded in magnitude by what would have been the rounded floating-point result (see 4) were the exponent
101 | // range unbounded. The default result shall be determined by the rounding-direction attribute and the sign of
102 | // the intermediate result as follows:
103 | // a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the sign of the intermediate
104 | // result.
105 | // b) roundTowardZero carries all overflows to the format’s largest finite number with the sign of the
106 | // intermediate result.
107 | // c) roundTowardNegative carries positive overflows to the format’s largest finite number, and carries
108 | // negative overflows to −∞.
109 | // d) roundTowardPositive carries negative overflows to the format’s most negative finite number, and
110 | // carries positive overflows to +∞.
111 | // In addition, under default exception handling for overflow, the overflow flag shall be raised and the inexact
112 | // exception shall be signaled.
113 | // -------------------------------------------------------------------------------
114 | // 7.6 Inexact
115 | //
116 | // Unless stated otherwise, if the rounded result of an operation is inexact—that is, it differs from what would
117 | // have been computed were both exponent range and precision unbounded—then the inexact exception shall
118 | // be signaled. The rounded or overflowed result shall be delivered to the destination
119 | // (emphasis added)
120 | // When all of these exceptions are handled by default, the inexact flag
121 | // is always raised when either the overflow or underflow flag is raised.
122 | //
123 | //////////////////////////////////////////////////////////////////////////////////
124 |
125 |
126 | module Rounder #(
127 | parameter PARM_RM = 3,
128 | parameter PARM_RM_RNE = 3'b000,
129 | parameter PARM_RM_RTZ = 3'b001,
130 | parameter PARM_RM_RDN = 3'b010,
131 | parameter PARM_RM_RUP = 3'b011,
132 | parameter PARM_RM_RMM = 3'b100,
133 | parameter PARM_MANT_NAN = 23'b100_0000_0000_0000_0000_0000,
134 | parameter PARM_EXP = 8,
135 | parameter PARM_MANT = 23,
136 | parameter PARM_LEADONE_WIDTH = 7
137 | ) (
138 |
139 | input [PARM_EXP + 1 : 0]Exp_i,
140 | input Sign_i,
141 |
142 | input Allzero_i,
143 | input Exp_mv_sign_i,
144 |
145 | input Sub_Sign_i,
146 | input [PARM_EXP - 1 : 0] A_Exp_raw_i,
147 | input [PARM_MANT : 0] A_Mant_i,
148 | input [PARM_RM - 1 : 0] Rounding_mode_i,
149 | input A_Sign_i,
150 | input B_Sign_i,
151 | input C_Sign_i,
152 |
153 | input A_DeN_i,
154 | input A_Inf_i,
155 | input B_Inf_i,
156 | input C_Inf_i,
157 | input A_Zero_i,
158 | input B_Zero_i,
159 | input C_Zero_i,
160 | input A_NaN_i,
161 | input B_NaN_i,
162 | input C_NaN_i,
163 |
164 | input Mant_sticky_sht_out_i,
165 | input Minus_sticky_bit_i,
166 |
167 | input [3*PARM_MANT + 4 : 0] Mant_norm_i,
168 | input [PARM_EXP + 1 : 0] Exp_norm_i,
169 | input [PARM_EXP + 1 : 0] Exp_norm_mone_i,
170 | input [PARM_EXP + 1 : 0] Exp_max_rs_i,
171 | input [3*PARM_MANT + 6 : 0] Rs_Mant_i,
172 |
173 | output reg Sign_result_o,
174 | output reg [PARM_EXP - 1 : 0] Exp_result_o,
175 | output reg [PARM_MANT - 1 : 0] Mant_result_o,
176 | output Invalid_o,
177 | output reg Overflow_o,
178 | output Underflow_o,
179 | output Inexact_o);
180 |
181 | //Sticky bit
182 | reg [2*PARM_MANT + 1 : 0] Mant_sticky_changed;
183 | always @(*) begin
184 | if(Exp_norm_i[PARM_EXP + 1])
185 | Mant_sticky_changed = Rs_Mant_i [2*PARM_MANT + 3 : 2];
186 | else if(Exp_norm_i == 0)
187 | Mant_sticky_changed = Mant_norm_i[2*PARM_MANT + 2 : 1];
188 | else if(Mant_norm_i[3*PARM_MANT + 4]) // | Exp_norm_i == 0
189 | Mant_sticky_changed = Mant_norm_i[2*PARM_MANT + 1 : 0];
190 | else
191 | Mant_sticky_changed = {Mant_norm_i[2*PARM_MANT : 0], 1'b0};
192 | end
193 |
194 | wire Sticky_one = (|Mant_sticky_changed) || Mant_sticky_sht_out_i || Minus_sticky_bit_i;
195 |
196 |
197 | wire includeNaN = A_NaN_i | B_NaN_i | C_NaN_i;
198 | wire zeromulinf = (B_Zero_i & C_Inf_i) | (C_Zero_i & B_Inf_i);
199 | wire subinf = (Sub_Sign_i & A_Inf_i & (B_Inf_i | C_Inf_i));
200 |
201 | assign Invalid_o = (includeNaN | zeromulinf | subinf);
202 |
203 | reg Mant_sticky;
204 | reg [PARM_MANT : 0] Mant_result_norm; // 24 bit
205 | reg [PARM_EXP - 1 : 0] Exp_result_norm; // 8 bit
206 | reg [1 : 0] Mant_lower;
207 |
208 |
209 | always @(*) begin
210 | //assign value to avoid latches
211 | Overflow_o = 1'b0;
212 | Mant_result_norm = 0;
213 | Exp_result_norm = 0;
214 | Mant_lower = 2'b00;
215 | Sign_result_o = 1'b0;
216 | Mant_sticky = 1'b0;
217 | if(Invalid_o)begin
218 | Mant_result_norm = {1'b0, PARM_MANT_NAN}; //PARM_MANT_NAN is 23 bit
219 | Exp_result_norm = 8'b1111_1111;
220 |
221 | end
222 | else if(A_Inf_i | B_Inf_i | C_Inf_i)begin
223 | // The result is Infinity
224 | // Operations on infinite operands are exact and therefore signal no exceptions
225 | Exp_result_norm = 8'b1111_1111;
226 | // If there are two infinities, they must have the same sign; if there are three, the sign is the same as A_sign
227 | if(A_Inf_i) Sign_result_o = A_Sign_i;
228 | else Sign_result_o = B_Sign_i ^ C_Sign_i;
229 |
230 | end
231 | else if(B_Zero_i | C_Zero_i)begin
232 | // For the cases A + B*0 and A + 0*C
233 | Mant_result_norm = A_Mant_i;
234 | Exp_result_norm = A_Exp_raw_i;
235 | Sign_result_o = A_Sign_i;
236 |
237 | end
238 | else if(Exp_mv_sign_i)begin
239 | // Only A counts; B x C is too small compared to A
240 | Mant_result_norm = A_Mant_i;
241 | Exp_result_norm = A_Exp_raw_i;
242 | Sign_result_o = A_Sign_i;
243 | Mant_sticky = Sticky_one; // When the exponent move left (negative), sticky bit would come from Mant_sticky
244 |
245 | end
246 | else if(Allzero_i)begin
247 | Sign_result_o = Sign_i;
248 |
249 | end
250 | else if(Exp_i[PARM_EXP + 1])begin
251 | if(~Exp_max_rs_i[PARM_EXP + 1])begin
252 | // Exponent would be < 0 after the right shift (too negative)
253 | Overflow_o = 1;
254 | Sign_result_o = Sign_i;
255 | end
256 | else begin
257 | // Denormalized number
258 | Mant_result_norm = {1'b0, Rs_Mant_i[3*PARM_MANT + 6 : 2*PARM_MANT + 6]};
259 | Mant_lower = Rs_Mant_i[2*PARM_MANT + 5 : 2*PARM_MANT + 4];
260 | Sign_result_o = Sign_i;
261 | Mant_sticky = Sticky_one;
262 | end
263 |
264 | end
265 | else if((Exp_norm_i[PARM_EXP : 0] == 256) & (~Mant_norm_i[3*PARM_MANT + 4]) & (Mant_norm_i[3*PARM_MANT + 3 : 2*PARM_MANT+3] != 0))begin
266 | // Overflow
267 | Overflow_o = 1;
268 | Sign_result_o = Sign_i;
269 |
270 | end
271 | else if(Exp_norm_i[PARM_EXP - 1 : 0] == 8'b1111_1111)begin
272 |
273 | if(Mant_norm_i[3*PARM_MANT + 4] || (Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4] == 0))begin
274 | // Overflow
275 | Overflow_o = 1;
276 | Sign_result_o = Sign_i;
277 | end
278 | else begin
279 | // Normal numbers
280 | Exp_result_norm = 8'b1111_1110; //254
281 | Sign_result_o = Sign_i;
282 |
283 | Mant_result_norm = Mant_norm_i [3*PARM_MANT + 2 : 2*PARM_MANT + 3];//originally out of bound
284 | Mant_lower = Mant_norm_i[2*PARM_MANT + 2 : 2*PARM_MANT + 1];
285 | Mant_sticky = Sticky_one;
286 |
287 | // Overflow occurs if the mantissa is all ones and rounding would increment it
288 | if(Mant_result_norm[PARM_MANT - 1 : 0] == {(PARM_MANT){1'b1}})begin
289 | case (Rounding_mode_i)
290 | PARM_RM_RNE:
291 | Overflow_o = Mant_lower[1] & (Mant_lower[0] | Mant_sticky | Mant_result_norm[0]);
292 | PARM_RM_RTZ:
293 | Overflow_o = 0;
294 | PARM_RM_RDN:
295 | Overflow_o = ((|Mant_lower) || Mant_sticky) & Sign_i;
296 | PARM_RM_RUP:
297 | Overflow_o = ((|Mant_lower) || Mant_sticky) & (~Sign_i);
298 | PARM_RM_RMM:
299 | Overflow_o = Mant_lower[1];
300 | default:
301 | Overflow_o = 0;
302 | endcase
303 | end
304 | end
305 |
306 | end
307 | else if(Exp_norm_i[PARM_EXP])begin
308 | // Overflow occurs: the exponent at pre-normalization (multiplication) is over 127
309 | Overflow_o = 1;
310 | Sign_result_o = Sign_i;
311 | end
312 | else if(Exp_norm_i == 10'd0)begin
313 | // Denormalized number
314 | Mant_result_norm = {1'b0, Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 5]};
315 | Mant_lower = Mant_norm_i[2*PARM_MANT + 4 : 2*PARM_MANT + 3];
316 | Sign_result_o = Sign_i;
317 | Mant_sticky = Sticky_one;
318 | end
319 | else if(Exp_norm_i == 10'd1)begin
320 | if(Mant_norm_i[3*PARM_MANT + 4])begin
321 | //Normal Number
322 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4];
323 | Exp_result_norm = 1;
324 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2];
325 | Sign_result_o = Sign_i;
326 | Mant_sticky = Sticky_one;
327 |
328 | end
329 | else begin
330 | // Denormalized Number
331 | // A denormalized result does not necessarily mean an underflow
332 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4: 2*PARM_MANT + 4];
333 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2];
334 | Sign_result_o = Sign_i;
335 | Mant_sticky = Sticky_one;
336 |
337 | end
338 |
339 | end
340 | else if(~Mant_norm_i[3*PARM_MANT + 4])begin
341 | // Numbers with 0X.XX, normal numbers
342 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 3 : 2*PARM_MANT + 3];
343 | Exp_result_norm = Exp_norm_mone_i[PARM_EXP - 1 : 0];
344 | Mant_lower = Mant_norm_i[2*PARM_MANT + 2 : 2*PARM_MANT + 1];
345 | Sign_result_o = Sign_i;
346 | Mant_sticky = Sticky_one;
347 | end
348 | else begin
349 | // Numbers with 1X.XX, normal numbers
350 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4];
351 | Exp_result_norm = Exp_norm_i[PARM_EXP - 1 : 0];
352 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2];
353 | Sign_result_o = Sign_i;
354 | Mant_sticky = Sticky_one;
355 | end
356 | end
357 |
358 | //Represents Guard, Round and Sticky bit
359 | // Guard Bit: Mant_lower[1]
360 | // Round Bit: Mant_lower[0]
361 | // Sticky Bit: Mant_sticky
362 | wire GRSbits = (|Mant_lower) || Mant_sticky;
363 |
364 | // Rounding determines whether to add 1 to the mantissa, signaled by Mant_roundup
365 | reg Mant_roundup;
366 |
367 | always @(*) begin
368 | case (Rounding_mode_i)
369 | PARM_RM_RNE:
370 | Mant_roundup = Mant_lower[1] & (Mant_lower[0] | Mant_sticky | Mant_result_norm[0]);
371 | PARM_RM_RTZ:
372 | Mant_roundup = 0;
373 | PARM_RM_RDN:
374 | Mant_roundup = GRSbits & Sign_i;
375 | PARM_RM_RUP:
376 | Mant_roundup = GRSbits & (~Sign_i);
377 | PARM_RM_RMM:
378 | Mant_roundup = Mant_lower[1];
379 | default:
380 | Mant_roundup = 0;
381 | endcase
382 | end
383 |
384 | wire [PARM_MANT + 1 : 0] Mant_upper_rounded = Mant_result_norm + Mant_roundup;
385 | wire Mant_renormalize = Mant_upper_rounded[PARM_MANT + 1];
386 |
387 | //output logic
388 |
389 | always @(*) begin
390 | if(Overflow_o)begin
391 | case (Rounding_mode_i)
392 | PARM_RM_RNE:
393 | Mant_result_o = 0; // to Inf
394 | PARM_RM_RTZ:
395 | Mant_result_o = {PARM_MANT{1'b1}};//to Largest Finite Number
396 | PARM_RM_RDN:
397 | Mant_result_o = (Sign_result_o)? 0 : {PARM_MANT{1'b1}}; //+: to largest Finite Number -: to Inf
398 | PARM_RM_RUP:
399 | Mant_result_o = (Sign_result_o)? {PARM_MANT{1'b1}} : 0; //+: to Inf -: to most negative Finite Number
400 | PARM_RM_RMM:
401 | Mant_result_o = 0; // to Inf
402 | default:
403 | Mant_result_o = 0;
404 | endcase
405 | end
406 | else if(Mant_renormalize)
407 | Mant_result_o = Mant_upper_rounded[PARM_MANT : 1];
408 | else
409 | Mant_result_o = Mant_upper_rounded[PARM_MANT - 1 : 0];
410 | end
411 |
412 | always@(*)begin
413 | if(Overflow_o)begin
414 | case (Rounding_mode_i)
415 | PARM_RM_RNE:
416 | Exp_result_o = {PARM_EXP{1'b1}}; // to Inf
417 | PARM_RM_RTZ:
418 | Exp_result_o = {{(PARM_EXP-1){1'b1}},1'b0}; // to Largest Finite Number, exp = 1111_1110
419 | PARM_RM_RDN:
420 | Exp_result_o = (Sign_result_o)? {PARM_EXP{1'b1}} : {{(PARM_EXP-1){1'b1}},1'b0};
421 | PARM_RM_RUP:
422 | Exp_result_o = (Sign_result_o)? {{(PARM_EXP-1){1'b1}},1'b0} : {PARM_EXP{1'b1}};
423 | PARM_RM_RMM:
424 | Exp_result_o = {PARM_EXP{1'b1}}; // to Inf
425 | default:
426 | Exp_result_o = 0; //Revision 8/13/2022 - Fix multidrive net, Exp_result_o
427 | endcase
428 | end
429 | else
430 | Exp_result_o = Exp_result_norm + Mant_renormalize;
431 | end
432 |
433 | assign Underflow_o = ({Exp_result_o,Mant_result_o} == 0) & GRSbits;
434 | assign Inexact_o = GRSbits || Overflow_o ||Underflow_o;
435 |
436 | endmodule
437 |
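The rounding decision and the default overflow results in Rounder can be summarized in a short software model. The Python sketch below is an illustrative reference, not part of the repository. The mode encodings are taken from the `PARM_RM_*` parameters; the function names are assumptions. Unlike the RTL default branch, `overflow_result` raises an error for an unknown mode.

```python
RNE, RTZ, RDN, RUP, RMM = 0b000, 0b001, 0b010, 0b011, 0b100


def round_up(mode, sign, lsb, guard, rnd, sticky):
    """Return 1 if the mantissa magnitude is incremented (models Mant_roundup)."""
    grs = guard | rnd | sticky
    if mode == RNE:
        return guard & (rnd | sticky | lsb)   # ties go to the even mantissa
    if mode == RTZ:
        return 0
    if mode == RDN:
        return grs & sign                     # toward -inf: only negatives grow
    if mode == RUP:
        return grs & (sign ^ 1)               # toward +inf: only positives grow
    if mode == RMM:
        return guard                          # ties go away from zero
    return 0                                  # mirrors the RTL default branch


def round_mantissa(mant24, exp, mode, sign, guard, rnd, sticky):
    """Apply the increment, then renormalize on carry-out of bit 23."""
    rounded = mant24 + round_up(mode, sign, mant24 & 1, guard, rnd, sticky)
    carry = rounded >> 24                     # models Mant_renormalize
    frac = (rounded >> 1 if carry else rounded) & 0x7FFFFF
    return frac, exp + carry


def overflow_result(mode, sign):
    """Return (exponent, fraction) of the IEEE 754 default overflow result."""
    inf, max_finite = (0xFF, 0), (0xFE, 0x7FFFFF)
    if mode in (RNE, RMM):
        return inf
    if mode == RTZ:
        return max_finite
    if mode == RDN:
        return inf if sign else max_finite
    if mode == RUP:
        return max_finite if sign else inf
    raise ValueError("unsupported rounding mode")
```

For example, an all-ones mantissa with the guard bit set rounds up under RNE, carries out, and yields a zero fraction with the exponent increased by one.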
--------------------------------------------------------------------------------
/src/01_RTL/SpecialCaseDetector.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/21/2022 07:41:57 PM
5 | // Module Name: SpecialCaseDetector
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Detect whether the input data is:
13 | // 1. Infinity
14 | // 2. Zero
15 | // 3. NaN
16 | // 4. Denormalized Number
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 | // Revision:
20 | // 07/21/2022 - Infinity, Zero and NaN detection done.
21 | // 07/21/2022 - Denormalized Number detection added
22 | // 08/04/2022 - I/O ports renamed, appropriate suffix added
23 | // 08/04/2022 - parameters arranged, comments added for readability
24 | // 08/14/2022 - Wire rename, avoid non-parameter upper cased wire
25 | // 09/12/2022 - Add BSD-3-Clause Licence
26 | //
27 | //////////////////////////////////////////////////////////////////////////////////
28 | // License information:
29 | //
30 | // This software is released under the BSD-3-Clause Licence,
31 | // see https://opensource.org/licenses/BSD-3-Clause for details.
32 | // In the following license statements, "software" refers to the
33 | // "source code" of the complete hardware/software system.
34 | //
35 | // Copyright 2022,
36 | // Embedded Intelligent Systems Lab (EISL)
37 | // Department of Computer Science
38 | // National Yang Ming Chiao Tung University
39 | // Hsinchu, Taiwan.
40 | //
41 | // All rights reserved.
42 | //
43 | // Redistribution and use in source and binary forms, with or without
44 | // modification, are permitted provided that the following conditions are met:
45 | //
46 | // 1. Redistributions of source code must retain the above copyright notice,
47 | // this list of conditions and the following disclaimer.
48 | //
49 | // 2. Redistributions in binary form must reproduce the above copyright notice,
50 | // this list of conditions and the following disclaimer in the documentation
51 | // and/or other materials provided with the distribution.
52 | //
53 | // 3. Neither the name of the copyright holder nor the names of its contributors
54 | // may be used to endorse or promote products derived from this software
55 | // without specific prior written permission.
56 | //
57 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
58 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
59 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
60 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
61 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
62 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
63 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
64 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
65 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
66 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
67 | // POSSIBILITY OF SUCH DAMAGE.
68 | //////////////////////////////////////////////////////////////////////////////////
69 |
70 |
71 | module SpecialCaseDetector #(
72 | parameter PARM_XLEN = 32,
73 | parameter PARM_EXP = 8,
74 | parameter PARM_MANT = 23
75 |
76 | ) (
77 | input [PARM_XLEN - 1 : 0] A_i,
78 | input [PARM_XLEN - 1 : 0] B_i,
79 | input [PARM_XLEN - 1 : 0] C_i,
80 | input A_Leadingbit_i,
81 | input B_Leadingbit_i,
82 | input C_Leadingbit_i,
83 |
84 | output A_Inf_o,
85 | output B_Inf_o,
86 | output C_Inf_o,
87 | output A_Zero_o,
88 | output B_Zero_o,
89 | output C_Zero_o,
90 | output A_NaN_o,
91 | output B_NaN_o,
92 | output C_NaN_o,
93 | output A_DeN_o,
94 | output B_DeN_o,
95 | output C_DeN_o);
96 |
97 |
98 | wire [PARM_EXP - 1: 0] Exp_Fullone = {PARM_EXP{1'b1}}; // Exponent is all '1'
99 |
100 |
101 | wire A_ExpZero = ~A_Leadingbit_i;
102 | wire B_ExpZero = ~B_Leadingbit_i;
103 | wire C_ExpZero = ~C_Leadingbit_i;
104 |
105 | wire A_ExpFull = (A_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone);
106 | wire B_ExpFull = (B_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone);
107 | wire C_ExpFull = (C_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone);
108 |
109 | wire A_MantZero = (A_i[PARM_MANT - 1 : 0] == 0);
110 | wire B_MantZero = (B_i[PARM_MANT - 1 : 0] == 0);
111 | wire C_MantZero = (C_i[PARM_MANT - 1 : 0] == 0);
112 |
113 |
114 | //output logic
115 | assign A_Zero_o = A_ExpZero & A_MantZero;
116 | assign B_Zero_o = B_ExpZero & B_MantZero;
117 | assign C_Zero_o = C_ExpZero & C_MantZero;
118 |
119 | assign A_Inf_o = A_ExpFull & A_MantZero;
120 | assign B_Inf_o = B_ExpFull & B_MantZero;
121 | assign C_Inf_o = C_ExpFull & C_MantZero;
122 |
123 | assign A_NaN_o = A_ExpFull & (~A_MantZero);
124 | assign B_NaN_o = B_ExpFull & (~B_MantZero);
125 | assign C_NaN_o = C_ExpFull & (~C_MantZero);
126 |
127 | assign A_DeN_o = A_ExpZero & (~A_MantZero);
128 | assign B_DeN_o = B_ExpZero & (~B_MantZero);
129 | assign C_DeN_o = C_ExpZero & (~C_MantZero);
130 |
131 | endmodule
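Editorial sketch (not part of the repository): the classification rules above can be exercised with a small self-checking testbench. The testbench name, the `check` task and the way the leading bit is derived (OR-reduction of the exponent field) are assumptions made for illustration; the real top level may generate `*_Leadingbit_i` differently. Compile it together with SpecialCaseDetector.v.

```verilog
`timescale 1ns / 1ps
// Hypothetical testbench, single precision (default parameters) only.
module SpecialCaseDetector_sketch_tb;
    reg  [31:0] a, b, c;
    // Assumption: leading (hidden) bit = 1 when the exponent field is nonzero.
    wire a_lead = |a[30:23];
    wire b_lead = |b[30:23];
    wire c_lead = |c[30:23];
    wire a_inf, b_inf, c_inf, a_zero, b_zero, c_zero;
    wire a_nan, b_nan, c_nan, a_den, b_den, c_den;
    integer errors;

    SpecialCaseDetector dut (
        .A_i(a), .B_i(b), .C_i(c),
        .A_Leadingbit_i(a_lead), .B_Leadingbit_i(b_lead), .C_Leadingbit_i(c_lead),
        .A_Inf_o(a_inf),   .B_Inf_o(b_inf),   .C_Inf_o(c_inf),
        .A_Zero_o(a_zero), .B_Zero_o(b_zero), .C_Zero_o(c_zero),
        .A_NaN_o(a_nan),   .B_NaN_o(b_nan),   .C_NaN_o(c_nan),
        .A_DeN_o(a_den),   .B_DeN_o(b_den),   .C_DeN_o(c_den));

    // Compare a packed flag vector {inf, zero, nan, den} with its expected value.
    task check;
        input [3:0] got;
        input [3:0] expected;
        begin
            if (got !== expected) begin
                errors = errors + 1;
                $display("FAIL: got %b, expected %b", got, expected);
            end
        end
    endtask

    initial begin
        errors = 0;

        a = 32'h7F800000;  // +infinity: exponent all ones, mantissa zero
        b = 32'h80000000;  // -0: exponent zero, mantissa zero
        c = 32'h7FC00000;  // NaN: exponent all ones, mantissa nonzero
        #1;
        check({a_inf, a_zero, a_nan, a_den}, 4'b1000);
        check({b_inf, b_zero, b_nan, b_den}, 4'b0100);
        check({c_inf, c_zero, c_nan, c_den}, 4'b0010);

        a = 32'h00000001;  // denormal: exponent zero, mantissa nonzero
        b = 32'h3F800000;  // 1.0: ordinary normal number, no flag set
        c = 32'hFF800000;  // -infinity: the sign bit does not affect the flags
        #1;
        check({a_inf, a_zero, a_nan, a_den}, 4'b0001);
        check({b_inf, b_zero, b_nan, b_den}, 4'b0000);
        check({c_inf, c_zero, c_nan, c_den}, 4'b1000);

        if (errors == 0) $display("All special-case checks passed");
        $finish;
    end
endmodule
```

The four flags are mutually exclusive for any single operand, which is why each check compares the whole packed vector rather than one bit.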
--------------------------------------------------------------------------------
/src/01_RTL/WallaceTree.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/22/2022 03:15:31 PM
5 | // Module Name: WallaceTree
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: Compressor32.v
10 | // Compressor42.v
11 | //
12 | //////////////////////////////////////////////////////////////////////////////////
13 | // Description: Reduces 13 partial products to a carry/sum pair using carry-save adders (CSA)
14 | // with:
15 | // 9x 3-2 Compressor
16 | // 1x 4-2 Compressor
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 | // Revision:
20 | // 07/22/2022 - Basic wiring finished, I/O signals updated for appropriate prefix
21 | // 07/25/2022 - Use generate statements to simplify code
22 | // 07/25/2022 - Multidriven net fixed
23 | // 07/25/2022 - Interleaving Wires rearranged
24 | // 08/15/2022 - Interconnection changed, to reduce critical path
25 | // 09/12/2022 - Add BSD-3-Clause Licence
26 | //
27 | //////////////////////////////////////////////////////////////////////////////////
28 | // License information:
29 | //
30 | // This software is released under the BSD-3-Clause Licence,
31 | // see https://opensource.org/licenses/BSD-3-Clause for details.
32 | // In the following license statements, "software" refers to the
33 | // "source code" of the complete hardware/software system.
34 | //
35 | // Copyright 2022,
36 | // Embedded Intelligent Systems Lab (EISL)
37 | // Department of Computer Science
38 | // National Yang Ming Chiao Tung University
39 | // Hsinchu, Taiwan.
40 | //
41 | // All rights reserved.
42 | //
43 | // Redistribution and use in source and binary forms, with or without
44 | // modification, are permitted provided that the following conditions are met:
45 | //
46 | // 1. Redistributions of source code must retain the above copyright notice,
47 | // this list of conditions and the following disclaimer.
48 | //
49 | // 2. Redistributions in binary form must reproduce the above copyright notice,
50 | // this list of conditions and the following disclaimer in the documentation
51 | // and/or other materials provided with the distribution.
52 | //
53 | // 3. Neither the name of the copyright holder nor the names of its contributors
54 | // may be used to endorse or promote products derived from this software
55 | // without specific prior written permission.
56 | //
57 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
58 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
59 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
60 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
61 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
62 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
63 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
64 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
65 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
66 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
67 | // POSSIBILITY OF SUCH DAMAGE.
68 | //////////////////////////////////////////////////////////////////////////////////
69 |
70 |
71 | module WallaceTree #(
72 | parameter PARM_MANT = 23
73 | ) (
74 | input [2*PARM_MANT + 2 : 0] pp_00_i,
75 | input [2*PARM_MANT + 2 : 0] pp_01_i,
76 | input [2*PARM_MANT + 2 : 0] pp_02_i,
77 | input [2*PARM_MANT + 2 : 0] pp_03_i,
78 | input [2*PARM_MANT + 2 : 0] pp_04_i,
79 | input [2*PARM_MANT + 2 : 0] pp_05_i,
80 | input [2*PARM_MANT + 2 : 0] pp_06_i,
81 | input [2*PARM_MANT + 2 : 0] pp_07_i,
82 | input [2*PARM_MANT + 2 : 0] pp_08_i,
83 | input [2*PARM_MANT + 2 : 0] pp_09_i,
84 | input [2*PARM_MANT + 2 : 0] pp_10_i,
85 | input [2*PARM_MANT + 2 : 0] pp_11_i,
86 | input [2*PARM_MANT + 1 : 0] pp_12_i,
87 |
88 | output [2*PARM_MANT + 2 : 0] wallace_sum_o,
89 | output [2*PARM_MANT + 2 : 0] wallace_carry_o,
90 | output suppression_sign_extension_o);
91 |
92 |
93 | wire [2*PARM_MANT + 2 : 0] csa_sum [9 - 1: 0];
94 | wire [2*PARM_MANT + 2 : 0] csa_carry [9 - 1: 0];
95 |
96 | wire [2*PARM_MANT + 2 : 0] csa_shcy [9 - 1: 0];
97 | wire [9 : 3] sign_extension;
98 | generate
99 | genvar i;
100 | for(i = 3; i < 9; i = i+1)begin
101 | assign sign_extension[i] = csa_carry[i][2*PARM_MANT + 2];
102 | end
103 | endgenerate
104 |
105 | generate
106 | genvar j;
107 | for(j = 0; j < 9; j = j+1)begin
108 | assign csa_shcy[j] = csa_carry[j] << 1;
109 | end
110 | endgenerate
111 |
112 |
113 | Compressor32 #(2*PARM_MANT + 3) LV1_0 (.A_i(pp_00_i),.B_i(pp_01_i),.C_i(pp_02_i),.Sum_o(csa_sum[0]),.Carry_o(csa_carry[0]));
114 | Compressor32 #(2*PARM_MANT + 3) LV1_1 (.A_i(pp_03_i),.B_i(pp_04_i),.C_i(pp_05_i),.Sum_o(csa_sum[1]),.Carry_o(csa_carry[1]));
115 | Compressor32 #(2*PARM_MANT + 3) LV1_2 (.A_i(pp_06_i),.B_i(pp_07_i),.C_i(pp_08_i),.Sum_o(csa_sum[2]),.Carry_o(csa_carry[2]));
116 | Compressor32 #(2*PARM_MANT + 3) LV1_3 (.A_i(pp_09_i),.B_i(pp_10_i),.C_i(pp_11_i),.Sum_o(csa_sum[3]),.Carry_o(csa_carry[3]));
117 |
118 | Compressor32 #(2*PARM_MANT + 3) LV2_0 (.A_i(csa_sum[0] ),.B_i(csa_shcy[0]),.C_i(csa_sum[1] ),.Sum_o(csa_sum[4]),.Carry_o(csa_carry[4]));
119 | Compressor32 #(2*PARM_MANT + 3) LV2_1 (.A_i(csa_shcy[1]),.B_i(csa_sum[2] ),.C_i(csa_shcy[2]),.Sum_o(csa_sum[5]),.Carry_o(csa_carry[5]));
120 | Compressor32 #(2*PARM_MANT + 3) LV2_2 (.A_i(csa_sum[3] ),.B_i(csa_shcy[3]),.C_i({1'd0, pp_12_i}),.Sum_o(csa_sum[6]),.Carry_o(csa_carry[6]));
121 |
122 | Compressor32 #(2*PARM_MANT + 3) LV3_0 (.A_i(csa_sum[4] ),.B_i(csa_shcy[4]),.C_i(csa_sum[5] ),.Sum_o(csa_sum[7]),.Carry_o(csa_carry[7]));
123 | Compressor32 #(2*PARM_MANT + 3) LV3_1 (.A_i(csa_shcy[5]),.B_i(csa_sum[6] ),.C_i(csa_shcy[6]),.Sum_o(csa_sum[8]),.Carry_o(csa_carry[8]));
124 |
125 | Compressor42 #(2*PARM_MANT + 3)
126 | LV4_Final (
127 | .A_i(csa_sum[7]),
128 | .B_i(csa_shcy[7]),
129 | .C_i(csa_sum[8]),
130 | .D_i(csa_shcy[8]),
131 | .Sum_o(wallace_sum_o),
132 | .Carry_o(wallace_carry_o),
133 | .hidden_carry_msb(sign_extension[9])
134 | );
135 |
136 | assign suppression_sign_extension_o = |sign_extension;
137 |
138 | endmodule
139 |
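Editorial sketch (not part of the repository): the tree above feeds every carry vector back in shifted left by one bit (`csa_shcy`). The snippet below illustrates why, using a stand-in 3:2 stage. `csa32_sketch` is a hypothetical module written for this note; it assumes the usual bitwise full-adder formulation and is not the repository's Compressor32.v, which is not shown in this section.

```verilog
`timescale 1ns / 1ps
// Hypothetical 3:2 carry-save stage: one full adder per bit position.
module csa32_sketch #(
    parameter XLEN = 49
) (
    input  [XLEN - 1 : 0] a_i,
    input  [XLEN - 1 : 0] b_i,
    input  [XLEN - 1 : 0] c_i,
    output [XLEN - 1 : 0] sum_o,
    output [XLEN - 1 : 0] carry_o);

    assign sum_o   = a_i ^ b_i ^ c_i;                          // per-bit sum
    assign carry_o = (a_i & b_i) | (a_i & c_i) | (b_i & c_i);  // per-bit carry, weight 2

endmodule

// Checks a + b + c == sum + (carry << 1), modulo 2^XLEN, on random vectors.
module csa32_sketch_tb;
    localparam XLEN = 49;  // 2*PARM_MANT + 3 with PARM_MANT = 23

    reg  [XLEN - 1 : 0] a, b, c;
    wire [XLEN - 1 : 0] s, cy;
    integer k, errors;

    csa32_sketch #(XLEN) dut (.a_i(a), .b_i(b), .c_i(c), .sum_o(s), .carry_o(cy));

    initial begin
        errors = 0;
        for (k = 0; k < 100; k = k + 1) begin
            a = {$random, $random};  // 64 random bits, truncated to XLEN
            b = {$random, $random};
            c = {$random, $random};
            #1;
            // All operands are XLEN bits wide, so both sides wrap modulo 2^XLEN.
            if ((s + (cy << 1)) !== (a + b + c)) begin
                errors = errors + 1;
                $display("FAIL: a=%h b=%h c=%h", a, b, c);
            end
        end
        if (errors == 0) $display("Carry-save identity held for all vectors");
        $finish;
    end
endmodule
```

Because each carry bit has twice the weight of the sum bit in its column, it must be moved up one position before the next level adds it, which is the role `csa_shcy` plays in the tree.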
--------------------------------------------------------------------------------
/src/01_RTL/ZeroDetector_Base.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/29/2022 11:01:59 PM
5 | // Module Name: ZeroDetector_Base
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Check if the input bits are all zero
13 | //
14 | //////////////////////////////////////////////////////////////////////////////////
15 | // Revision:
16 | // 08/15/2022 - allow parameter to control input size
17 | // 09/12/2022 - Add BSD-3-Clause Licence
18 | //
19 | //////////////////////////////////////////////////////////////////////////////////
20 | // License information:
21 | //
22 | // This software is released under the BSD-3-Clause Licence,
23 | // see https://opensource.org/licenses/BSD-3-Clause for details.
24 | // In the following license statements, "software" refers to the
25 | // "source code" of the complete hardware/software system.
26 | //
27 | // Copyright 2022,
28 | // Embedded Intelligent Systems Lab (EISL)
29 | // Department of Computer Science
30 | // National Yang Ming Chiao Tung University
31 | // Hsinchu, Taiwan.
32 | //
33 | // All rights reserved.
34 | //
35 | // Redistribution and use in source and binary forms, with or without
36 | // modification, are permitted provided that the following conditions are met:
37 | //
38 | // 1. Redistributions of source code must retain the above copyright notice,
39 | // this list of conditions and the following disclaimer.
40 | //
41 | // 2. Redistributions in binary form must reproduce the above copyright notice,
42 | // this list of conditions and the following disclaimer in the documentation
43 | // and/or other materials provided with the distribution.
44 | //
45 | // 3. Neither the name of the copyright holder nor the names of its contributors
46 | // may be used to endorse or promote products derived from this software
47 | // without specific prior written permission.
48 | //
49 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
50 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
51 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
52 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
53 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
54 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
55 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
56 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
57 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
58 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
59 | // POSSIBILITY OF SUCH DAMAGE.
60 | //////////////////////////////////////////////////////////////////////////////////
61 |
62 |
63 | module ZeroDetector_Base #(
64 | parameter XLEN = 8
65 | ) (
66 | input [XLEN - 1: 0] base_data_i,
67 | output zero_o );
68 |
69 | assign zero_o = (base_data_i == 0);
70 |
71 | endmodule
72 |
--------------------------------------------------------------------------------
/src/01_RTL/ZeroDetector_Group.v:
--------------------------------------------------------------------------------
1 | `timescale 1ns / 1ps
2 | //////////////////////////////////////////////////////////////////////////////////
3 | // Engineer: Tzu-Han Hsu
4 | // Create Date: 07/29/2022 11:01:59 PM
5 | // Module Name: ZeroDetector_Group
6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
7 | // HDL(Version): Verilog-2005
8 | //
9 | // Dependencies: None
10 | //
11 | //////////////////////////////////////////////////////////////////////////////////
12 | // Description: Check if all groups are zero (AND of the per-group zero flags on the input)
13 | //
14 | //////////////////////////////////////////////////////////////////////////////////
15 | // Revision:
16 | // 09/12/2022 - Add BSD-3-Clause Licence
17 | //
18 | //////////////////////////////////////////////////////////////////////////////////
19 | // License information:
20 | //
21 | // This software is released under the BSD-3-Clause Licence,
22 | // see https://opensource.org/licenses/BSD-3-Clause for details.
23 | // In the following license statements, "software" refers to the
24 | // "source code" of the complete hardware/software system.
25 | //
26 | // Copyright 2022,
27 | // Embedded Intelligent Systems Lab (EISL)
28 | // Department of Computer Science
29 | // National Yang Ming Chiao Tung University
30 | // Hsinchu, Taiwan.
31 | //
32 | // All rights reserved.
33 | //
34 | // Redistribution and use in source and binary forms, with or without
35 | // modification, are permitted provided that the following conditions are met:
36 | //
37 | // 1. Redistributions of source code must retain the above copyright notice,
38 | // this list of conditions and the following disclaimer.
39 | //
40 | // 2. Redistributions in binary form must reproduce the above copyright notice,
41 | // this list of conditions and the following disclaimer in the documentation
42 | // and/or other materials provided with the distribution.
43 | //
44 | // 3. Neither the name of the copyright holder nor the names of its contributors
45 | // may be used to endorse or promote products derived from this software
46 | // without specific prior written permission.
47 | //
48 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
49 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
50 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
51 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
52 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
53 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
54 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
55 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
56 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
57 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
58 | // POSSIBILITY OF SUCH DAMAGE.
59 | //////////////////////////////////////////////////////////////////////////////////
60 |
61 |
62 | module ZeroDetector_Group #(
63 | parameter XLEN = 2
64 | ) (
65 | input [XLEN - 1 : 0] group_data_i,
66 | output group_zero_o );
67 |
68 | assign group_zero_o = &group_data_i;
69 |
70 | endmodule
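Editorial sketch (not part of the repository): one plausible way to combine the two detectors is to let ZeroDetector_Base test each slice of a word and let ZeroDetector_Group AND the resulting flags. The wrapper `ZeroDetector16_sketch`, its 16-bit width and the testbench are hypothetical and chosen only for illustration; the repository's own instantiation may differ. Compile together with ZeroDetector_Base.v and ZeroDetector_Group.v.

```verilog
`timescale 1ns / 1ps
// Hypothetical 16-bit zero detector built from two 8-bit base cells and one group cell.
module ZeroDetector16_sketch (
    input  [15 : 0] data_i,
    output          zero_o);

    wire [1 : 0] base_zero;  // base_zero[n] = 1 when byte n is all zero

    ZeroDetector_Base #(8) u_lo (.base_data_i(data_i[7 : 0]),  .zero_o(base_zero[0]));
    ZeroDetector_Base #(8) u_hi (.base_data_i(data_i[15 : 8]), .zero_o(base_zero[1]));

    // The group cell AND-reduces the flags: the word is zero only if every byte is zero.
    ZeroDetector_Group #(2) u_grp (.group_data_i(base_zero), .group_zero_o(zero_o));

endmodule

module ZeroDetector16_sketch_tb;
    reg  [15 : 0] d;
    wire          z;
    integer errors;

    ZeroDetector16_sketch dut (.data_i(d), .zero_o(z));

    initial begin
        errors = 0;
        d = 16'h0000; #1; if (z !== 1'b1) errors = errors + 1;  // all zero
        d = 16'h0100; #1; if (z !== 1'b0) errors = errors + 1;  // upper byte nonzero
        d = 16'h0001; #1; if (z !== 1'b0) errors = errors + 1;  // lower byte nonzero
        d = 16'hFFFF; #1; if (z !== 1'b0) errors = errors + 1;  // both bytes nonzero
        if (errors == 0) $display("Zero-detector composition checks passed");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```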
--------------------------------------------------------------------------------
/src/02_SYN/01_run_dc:
--------------------------------------------------------------------------------
1 | dc_shell-t -f syn.tcl | tee syn.log
2 |
--------------------------------------------------------------------------------
/src/02_SYN/09_clean_up:
--------------------------------------------------------------------------------
1 | rm -rf INCA_libs nWaveLog
2 | rm -rf *.fsdb
3 | rm -rf *.log
4 | rm -rf *~
5 | rm -rf ./Netlist/*.*
6 | rm -rf ./Report/*.*
7 | rm -rf dwsvf*
8 | rm -rf alib*
9 | rm -rf default.svf
10 | rm -rf alib-52
11 | rm -rf *-verilog.*
12 | rm -rf *.mr
13 | rm -rf diff_syn
14 |
--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.area:
--------------------------------------------------------------------------------
1 |
2 | ****************************************
3 | Report : area
4 | Design : SUBWAY
5 | Version: T-2022.03
6 | Date : Sun Apr 9 02:23:57 2023
7 | ****************************************
8 |
9 | Library(s) Used:
10 |
11 | slow (File: /RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db)
12 |
13 | Number of ports: 16
14 | Number of nets: 835
15 | Number of cells: 718
16 | Number of combinational cells: 554
17 | Number of sequential cells: 164
18 | Number of macros/black boxes: 0
19 | Number of buf/inv: 73
20 | Number of references: 36
21 |
22 | Combinational area: 8236.166493
23 | Buf/Inv area: 728.481627
24 | Noncombinational area: 11991.672226
25 | Macro/Black Box area: 0.000000
26 | Net Interconnect area: undefined (No wire load specified)
27 |
28 | Total cell area: 20227.838719
29 | Total area: undefined
30 | 1
31 |
--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.check:
--------------------------------------------------------------------------------
1 | 1
2 |
--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.resource:
--------------------------------------------------------------------------------
1 |
2 | ****************************************
3 | Report : resources
4 | Design : SUBWAY
5 | Version: T-2022.03
6 | Date : Sun Apr 9 02:23:57 2023
7 | ****************************************
8 |
9 |
10 | Resource Report for this hierarchy in file
11 | /RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v
12 | =============================================================================
13 | | Cell | Module | Parameters | Contained Operations |
14 | =============================================================================
15 | | lt_x_2 | DW_cmp | width=7 | lt_100 (SUBWAY.v:100) |
16 | | gte_x_3 | DW_cmp | width=7 | gte_111 (SUBWAY.v:111) |
17 | | add_x_100 | DW01_inc | width=6 | add_418 (SUBWAY.v:418) |
18 | | sub_x_7 | DW01_dec | width=2 | sub_146 (SUBWAY.v:146) |
19 | | | | | sub_146_2 (SUBWAY.v:146) |
20 | | | | | sub_149 (SUBWAY.v:149) |
21 | | | | | sub_173 (SUBWAY.v:173) |
22 | | | | | sub_175 (SUBWAY.v:175) |
23 | | | | | sub_192 (SUBWAY.v:192) |
24 | | | | | sub_206 (SUBWAY.v:206) |
25 | | | | | sub_209 (SUBWAY.v:209) |
26 | | | | | sub_225 (SUBWAY.v:225) |
27 | | | | | sub_242 (SUBWAY.v:242) |
28 | | | | | sub_242_2 (SUBWAY.v:242) |
29 | | | | | sub_242_3 (SUBWAY.v:242) |
30 | | | | | sub_247 (SUBWAY.v:247) |
31 | | | | | sub_251 (SUBWAY.v:251) |
32 | | | | | sub_260 (SUBWAY.v:260) |
33 | | | | | sub_283 (SUBWAY.v:283) |
34 | | | | | sub_283_2 (SUBWAY.v:283) |
35 | | | | | sub_286 (SUBWAY.v:286) |
36 | | | | | sub_311 (SUBWAY.v:311) |
37 | | | | | sub_313 (SUBWAY.v:313) |
38 | | | | | sub_330 (SUBWAY.v:330) |
39 | | | | | sub_344 (SUBWAY.v:344) |
40 | | | | | sub_347 (SUBWAY.v:347) |
41 | | | | | sub_363 (SUBWAY.v:363) |
42 | | | | | sub_380 (SUBWAY.v:380) |
43 | | | | | sub_380_2 (SUBWAY.v:380) |
44 | | | | | sub_380_3 (SUBWAY.v:380) |
45 | | | | | sub_385 (SUBWAY.v:385) |
46 | | | | | sub_389 (SUBWAY.v:389) |
47 | | | | | sub_398 (SUBWAY.v:398) |
48 | | add_x_6 | DW01_inc | width=2 | add_142 (SUBWAY.v:142) |
49 | | | | | add_142_2 (SUBWAY.v:142) |
50 | | | | | add_145 (SUBWAY.v:145) |
51 | | | | | add_169 (SUBWAY.v:169) |
52 | | | | | add_172 (SUBWAY.v:172) |
53 | | | | | add_188 (SUBWAY.v:188) |
54 | | | | | add_202 (SUBWAY.v:202) |
55 | | | | | add_205 (SUBWAY.v:205) |
56 | | | | | add_221 (SUBWAY.v:221) |
57 | | | | | add_236 (SUBWAY.v:236) |
58 | | | | | add_242 (SUBWAY.v:242) |
59 | | | | | add_242_2 (SUBWAY.v:242) |
60 | | | | | add_246 (SUBWAY.v:246) |
61 | | | | | add_247 (SUBWAY.v:247) |
62 | | | | | add_279 (SUBWAY.v:279) |
63 | | | | | add_279_2 (SUBWAY.v:279) |
64 | | | | | add_282 (SUBWAY.v:282) |
65 | | | | | add_307 (SUBWAY.v:307) |
66 | | | | | add_310 (SUBWAY.v:310) |
67 | | | | | add_326 (SUBWAY.v:326) |
68 | | | | | add_340 (SUBWAY.v:340) |
69 | | | | | add_343 (SUBWAY.v:343) |
70 | | | | | add_359 (SUBWAY.v:359) |
71 | | | | | add_374 (SUBWAY.v:374) |
72 | | | | | add_380 (SUBWAY.v:380) |
73 | | | | | add_380_2 (SUBWAY.v:380) |
74 | | | | | add_384 (SUBWAY.v:384) |
75 | | | | | add_385 (SUBWAY.v:385) |
76 | | eq_x_103 | DW_cmp | width=4 | eq_184 (SUBWAY.v:184) |
77 | | | | | eq_322 (SUBWAY.v:322) |
78 | | DP_OP_304J1_122_9552 | | |
79 | | | DP_OP_304J1_122_9552 | | |
80 | | DP_OP_305J1_123_2228 | | |
81 | | | DP_OP_305J1_123_2228 | | |
82 | =============================================================================
83 |
84 | Datapath Report for DP_OP_304J1_122_9552
85 | ==============================================================================
86 | | Cell | Contained Operations |
87 | ==============================================================================
88 | | DP_OP_304J1_122_9552 | mult_add_87_aco (SUBWAY.v:87) |
89 | | | add_87_aco (SUBWAY.v:87) |
90 | ==============================================================================
91 |
92 | ==============================================================================
93 | | | | Data | | |
94 | | Var | Type | Class | Width | Expression |
95 | ==============================================================================
96 | | I1 | PI | Unsigned | 7 | |
97 | | I2 | PI | Unsigned | 1 | |
98 | | T163 | IFO | Unsigned | 7 | I1 * I2 (SUBWAY.v:87) |
99 | | O1 | PO | Unsigned | 7 | T163 + $unsigned(1'b1) (SUBWAY.v:87) |
100 | ==============================================================================
101 |
102 | Datapath Report for DP_OP_305J1_123_2228
103 | ==============================================================================
104 | | Cell | Contained Operations |
105 | ==============================================================================
106 | | DP_OP_305J1_123_2228 | add_184 (SUBWAY.v:184) add_184_2 (SUBWAY.v:184) |
107 | | | add_184_3 (SUBWAY.v:184) add_322 (SUBWAY.v:322) |
108 | | | add_322_2 (SUBWAY.v:322) add_322_3 (SUBWAY.v:322) |
109 | ==============================================================================
110 |
111 | ==============================================================================
112 | | | | Data | | |
113 | | Var | Type | Class | Width | Expression |
114 | ==============================================================================
115 | | I1 | PI | Unsigned | 1 | |
116 | | I2 | PI | Unsigned | 1 | |
117 | | I3 | PI | Unsigned | 1 | |
118 | | I4 | PI | Unsigned | 1 | |
119 | | O1 | PO | Unsigned | 4 | I1 + I2 + I3 + I4 ( SUBWAY.v:184 SUBWAY.v:322 ) |
120 | ==============================================================================
121 |
122 |
123 | Implementation Report
124 | ===============================================================================
125 | | | | Current | Set |
126 | | Cell | Module | Implementation | Implementation |
127 | ===============================================================================
128 | | lt_x_2 | DW_cmp | apparch (area) | |
129 | | gte_x_3 | DW_cmp | apparch (area) | |
130 | | add_x_100 | DW01_inc | apparch (area) | |
131 | | sub_x_7 | DW01_dec | apparch (area) | |
132 | | add_x_6 | DW01_inc | apparch (area) | |
133 | | eq_x_103 | DW_cmp | apparch (area) | |
134 | | DP_OP_304J1_122_9552 | | |
135 | | | DP_OP_304J1_122_9552 | str (area) | |
136 | | | | mult_arch: and | |
137 | | DP_OP_305J1_123_2228 | | |
138 | | | DP_OP_305J1_123_2228 | str (area) | |
139 | ===============================================================================
140 |
141 | 1
142 |
--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.timing:
--------------------------------------------------------------------------------
1 | Information: Updating design information... (UID-85)
2 |
3 | ****************************************
4 | Report : timing
5 | -path full
6 | -delay max
7 | -max_paths 1
8 | Design : SUBWAY
9 | Version: T-2022.03
10 | Date : Sun Apr 9 02:23:57 2023
11 | ****************************************
12 |
13 | Operating Conditions: slow Library: slow
14 | Wire Load Model Mode: top
15 |
16 | Startpoint: in_valid (input port clocked by clk)
17 | Endpoint: map_reg[0][0][0]
18 | (rising edge-triggered flip-flop clocked by clk)
19 | Path Group: clk
20 | Path Type: max
21 |
22 | Point Incr Path
23 | -----------------------------------------------------------
24 | clock clk (rise edge) 0.00 0.00
25 | clock network delay (ideal) 0.00 0.00
26 | input external delay 5.00 5.00 f
27 | in_valid (in) 0.00 5.00 f
28 | U679/Y (NAND2XL) 0.13 5.13 r
29 | U645/Y (OAI31XL) 0.30 5.43 f
30 | U614/Y (NOR2XL) 1.63 7.06 r
31 | U739/Y (INVXL) 0.69 7.74 f
32 | U784/Y (AOI2BB2XL) 0.47 8.21 f
33 | map_reg[0][0][0]/D (DFFRX1) 0.00 8.21 f
34 | data arrival time 8.21
35 |
36 | clock clk (rise edge) 10.00 10.00
37 | clock network delay (ideal) 0.00 10.00
38 | map_reg[0][0][0]/CK (DFFRX1) 0.00 10.00 r
39 | library setup time -0.32 9.68
40 | data required time 9.68
41 | -----------------------------------------------------------
42 | data required time 9.68
43 | data arrival time -8.21
44 | -----------------------------------------------------------
45 | slack (MET) 1.46
46 |
47 |
48 | 1
49 |
--------------------------------------------------------------------------------
/src/02_SYN/default.svf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/src/02_SYN/default.svf
--------------------------------------------------------------------------------
/src/02_SYN/syn.log:
--------------------------------------------------------------------------------
1 |
2 | Design Compiler Graphical
3 | DC Ultra (TM)
4 | DFTMAX (TM)
5 | Power Compiler (TM)
6 | DesignWare (R)
7 | DC Expert (TM)
8 | Design Vision (TM)
9 | HDL Compiler (TM)
10 | VHDL Compiler (TM)
11 | DFT Compiler
12 | Design Compiler(R)
13 |
14 | Version T-2022.03 for linux64 - Feb 22, 2022
15 |
16 | Copyright (c) 1988 - 2022 Synopsys, Inc.
17 | This software and the associated documentation are proprietary to Synopsys,
18 | Inc. This software may only be used in accordance with the terms and conditions
19 | of a written license agreement with Synopsys, Inc. All other use, reproduction,
20 | or distribution of this software is strictly prohibited. Licensed Products
21 | communicate with Synopsys servers for the purpose of providing software
22 | updates, detecting software piracy and verifying that customers are using
23 | Licensed Products in conformity with the applicable License Key for such
24 | Licensed Products. Synopsys will use information gathered in connection with
25 | this process to deliver software updates and pursue software pirates and
26 | infringers.
27 |
28 | Inclusivity & Diversity - Visit SolvNetPlus to read the "Synopsys Statement on
29 | Inclusivity and Diversity" (Refer to article 000036315 at
30 | https://solvnetplus.synopsys.com)
31 | Initializing...
32 | #======================================================
33 | #
34 | # Synopsys Synthesis Scripts (Design Vision dctcl mode)
35 | #
36 | #======================================================
37 | #======================================================
38 | # Set Libraries
39 | #======================================================
40 | set search_path {./../01_RTL \
41 | ~iclabta01/umc018/Synthesis/ \
42 | /usr/synthesis/libraries/syn/ }
43 | ./../01_RTL ~iclabta01/umc018/Synthesis/ /usr/synthesis/libraries/syn/
44 | set synthetic_library {dw_foundation.sldb}
45 | dw_foundation.sldb
46 | set link_library {* dw_foundation.sldb standard.sldb slow.db}
47 | * dw_foundation.sldb standard.sldb slow.db
48 | set target_library {slow.db}
49 | slow.db
50 | #======================================================
51 | # Global Parameters
52 | #======================================================
53 | set DESIGN "SUBWAY"
54 | SUBWAY
55 | set hdlin_ff_always_sync_set_reset true
56 | true
57 | set CLK_period 10.0
58 | 10.0
59 | #======================================================
60 | # Read RTL Code
61 | #======================================================
62 | read_sverilog $DESIGN.v
63 | Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/dw_foundation.sldb'
64 | Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/standard.sldb'
65 | Loading db file '/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db'
66 | Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/gtech.db'
67 | Loading link library 'slow'
68 | Loading link library 'gtech'
69 | Loading sverilog file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
70 | Detecting input file type automatically (-rtl or -netlist).
71 | Reading with Presto HDL Compiler (equivalent to -rtl option).
72 | Running PRESTO HDLC
73 | Compiling source file /RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v
74 | Warning: /RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v:77: signed to unsigned conversion occurs. (VER-318)
75 |
76 | Statistics for case statements in always block at line 74 in file
77 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
78 | ===============================================
79 | | Line | full/ parallel |
80 | ===============================================
81 | | 75 | auto/auto |
82 | ===============================================
83 |
84 | Statistics for case statements in always block at line 125 in file
85 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
86 | ===============================================
87 | | Line | full/ parallel |
88 | ===============================================
89 | | 133 | no/auto |
90 | | 138 | auto/auto |
91 | | 231 | auto/auto |
92 | | 275 | auto/auto |
93 | | 369 | auto/auto |
94 | ===============================================
95 |
96 | Inferred memory devices in process
97 | in routine SUBWAY line 69 in file
98 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
99 | ===============================================================================
100 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
101 | ===============================================================================
102 | | current_state_reg | Flip-flop | 2 | Y | N | Y | N | N | N | N |
103 | ===============================================================================
104 |
105 | Inferred memory devices in process
106 | in routine SUBWAY line 84 in file
107 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
108 | ===============================================================================
109 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
110 | ===============================================================================
111 | | cnt_reg | Flip-flop | 7 | Y | N | Y | N | N | N | N |
112 | ===============================================================================
113 |
114 | Inferred memory devices in process
115 | in routine SUBWAY line 91 in file
116 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
117 | ===============================================================================
118 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
119 | ===============================================================================
120 | | map_reg | Flip-flop | 32 | Y | N | Y | N | N | N | N |
121 | ===============================================================================
122 |
123 | Inferred memory devices in process
124 | in routine SUBWAY line 125 in file
125 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
126 | ===============================================================================
127 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
128 | ===============================================================================
129 | | answer_reg | Flip-flop | 112 | Y | N | Y | N | N | N | N |
130 | | lane_reg | Flip-flop | 2 | Y | N | Y | N | N | N | N |
131 | ===============================================================================
132 |
133 | Inferred memory devices in process
134 | in routine SUBWAY line 414 in file
135 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
136 | ===============================================================================
137 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
138 | ===============================================================================
139 | | ans_idx_reg | Flip-flop | 6 | Y | N | Y | N | N | N | N |
140 | ===============================================================================
141 |
142 | Inferred memory devices in process
143 | in routine SUBWAY line 427 in file
144 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
145 | ===============================================================================
146 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
147 | ===============================================================================
148 | | out_valid_reg | Flip-flop | 1 | N | N | Y | N | N | N | N |
149 | ===============================================================================
150 |
151 | Inferred memory devices in process
152 | in routine SUBWAY line 436 in file
153 | '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
154 | ===============================================================================
155 | | Register Name | Type | Width | Bus | MB | AR | AS | SR | SS | ST |
156 | ===============================================================================
157 | | out_reg | Flip-flop | 2 | Y | N | Y | N | N | N | N |
158 | ===============================================================================
159 | Statistics for MUX_OPs
160 | ======================================================
161 | | block name/line | Inputs | Outputs | # sel inputs |
162 | ======================================================
163 | | SUBWAY/140 | 4 | 4 | 2 |
164 | | SUBWAY/142 | 4 | 2 | 2 |
165 | | SUBWAY/142 | 4 | 2 | 2 |
166 | | SUBWAY/146 | 4 | 2 | 2 |
167 | | SUBWAY/146 | 4 | 2 | 2 |
168 | | SUBWAY/169 | 4 | 2 | 2 |
169 | | SUBWAY/173 | 4 | 2 | 2 |
170 | | SUBWAY/202 | 4 | 2 | 2 |
171 | | SUBWAY/206 | 4 | 2 | 2 |
172 | | SUBWAY/242 | 4 | 2 | 2 |
173 | | SUBWAY/242 | 4 | 2 | 2 |
174 | | SUBWAY/242 | 4 | 2 | 2 |
175 | | SUBWAY/242 | 4 | 2 | 2 |
176 | | SUBWAY/242 | 4 | 2 | 2 |
177 | | SUBWAY/247 | 4 | 2 | 2 |
178 | | SUBWAY/247 | 4 | 2 | 2 |
179 | | SUBWAY/279 | 4 | 2 | 2 |
180 | | SUBWAY/279 | 4 | 2 | 2 |
181 | | SUBWAY/283 | 4 | 2 | 2 |
182 | | SUBWAY/283 | 4 | 2 | 2 |
183 | | SUBWAY/307 | 4 | 2 | 2 |
184 | | SUBWAY/311 | 4 | 2 | 2 |
185 | | SUBWAY/340 | 4 | 2 | 2 |
186 | | SUBWAY/344 | 4 | 2 | 2 |
187 | | SUBWAY/380 | 4 | 2 | 2 |
188 | | SUBWAY/380 | 4 | 2 | 2 |
189 | | SUBWAY/380 | 4 | 2 | 2 |
190 | | SUBWAY/380 | 4 | 2 | 2 |
191 | | SUBWAY/380 | 4 | 2 | 2 |
192 | | SUBWAY/385 | 4 | 2 | 2 |
193 | | SUBWAY/385 | 4 | 2 | 2 |
194 | ======================================================
195 | Information: Complex logic will not be considered for set/reset inference. (ELAB-2008)
196 | Presto compilation completed successfully.
197 | Current design is now '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.db:SUBWAY'
198 | Loaded 1 design.
199 | Current design is 'SUBWAY'.
200 | SUBWAY
201 | current_design $DESIGN
202 | Current design is 'SUBWAY'.
203 | {SUBWAY}
204 | #======================================================
205 | # Global Setting
206 | #======================================================
207 | set_wire_load_mode top
208 | 1
209 | #======================================================
210 | # Set Design Constraints
211 | #======================================================
212 | create_clock -name "clk" -period $CLK_period clk
213 | 1
214 | set_input_delay [ expr $CLK_period*0.5 ] -clock clk [all_inputs]
215 | 1
216 | set_output_delay [ expr $CLK_period*0.5 ] -clock clk [all_outputs]
217 | 1
218 | set_input_delay 0 -clock clk clk
219 | 1
220 | set_load 0.05 [all_outputs]
221 | 1
222 | #======================================================
223 | # Optimization
224 | #======================================================
225 | uniquify
226 | 1
227 | set_fix_multiple_port_nets -all -buffer_constants
228 | 1
229 | compile_ultra
230 | Information: Performing power optimization. (PWR-850)
231 | Analyzing: "/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db"
232 | Library analysis succeeded.
233 | Information: Evaluating DesignWare library utilization. (UISN-27)
234 |
235 | ============================================================================
236 | | DesignWare Building Block Library | Version | Available |
237 | ============================================================================
238 | | Basic DW Building Blocks | S-2021.06-DWBB_202106.0 | * |
239 | | Licensed DW Building Blocks | S-2021.06-DWBB_202106.0 | * |
240 | ============================================================================
241 |
242 | ====================================================================================================
243 | | Flow Information |
244 | ----------------------------------------------------------------------------------------------------
245 | | Flow | Design Compiler WLM |
246 | | Comand line | compile_ultra |
247 | ====================================================================================================
248 | | Design Information | Value |
249 | ====================================================================================================
250 | | Number of Scenarios | 0 |
251 | | Leaf Cell Count | 3498 |
252 | | Number of User Hierarchies | 0 |
253 | | Sequential Cell Count | 164 |
254 | | Macro Count | 0 |
255 | | Number of Power Domains | 0 |
256 | | Number of Path Groups | 2 |
257 | | Number of VT class | 0 |
258 | | Number of Clocks | 1 |
259 | | Number of Dont Touch cells | 264 |
260 | | Number of Dont Touch nets | 0 |
261 | | Number of size only cells | 0 |
262 | | Design with UPF Data | false |
263 | ----------------------------------------------------------------------------------------------------
264 | | Variables | Value |
265 | ----------------------------------------------------------------------------------------------------
266 | | set_fix_multiple_port_nets | -all -buffer_constants |
267 | ====================================================================================================
268 | Information: Sequential output inversion is enabled. SVF file must be used for formal verification. (OPT-1208)
269 |
270 | Information: There are 30 potential problems in your design. Please run 'check_design' for more information. (LINT-99)
271 |
272 | Simplifying Design 'SUBWAY'
273 |
274 | Loaded alib file './alib-52/slow.db.alib'
275 | Building model 'DW01_NAND2'
276 | Information: Ungrouping 0 of 1 hierarchies before Pass 1 (OPT-775)
277 | Information: State dependent leakage is now switched from on to off.
278 |
279 | Beginning Pass 1 Mapping
280 | ------------------------
281 | Processing 'SUBWAY'
282 | Information: Added key list 'DesignWare' to design 'SUBWAY'. (DDB-72)
283 | Implement Synthetic for 'SUBWAY'.
284 |
285 | Updating timing information
286 | Information: Updating design information... (UID-85)
287 | Information: The library cell 'HOLDX1' in the library 'slow' is not characterized for internal power. (PWR-536)
288 | Information: The target library(s) contains cell(s), other than black boxes, that are not characterized for internal power. (PWR-24)
289 |
290 | Beginning Mapping Optimizations (Ultra High effort)
291 | -------------------------------
292 | Information: There is no timing violation in design SUBWAY. Delay-based auto_ungroup will not be performed. (OPT-780)
293 |
294 | TOTAL
295 | ELAPSED WORST NEG SETUP DESIGN LEAKAGE
296 | TIME AREA SLACK COST RULE COST ENDPOINT POWER
297 | --------- --------- --------- --------- --------- ------------------------- ---------
298 | 0:00:04 27748.8 0.00 0.0 0.0 3245796.2500
299 | 0:00:04 27738.8 0.00 0.0 0.0 3245446.2500
300 |
301 | Beginning Constant Register Removal
302 | -----------------------------------
303 | 0:00:04 27738.8 0.00 0.0 0.0 3245446.2500
304 | 0:00:04 27738.8 0.00 0.0 0.0 3245446.2500
305 |
306 | Beginning Global Optimizations
307 | ------------------------------
308 | Numerical Synthesis (Phase 1)
309 | Numerical Synthesis (Phase 2)
310 | Global Optimization (Phase 1)
311 | Global Optimization (Phase 2)
312 | Global Optimization (Phase 3)
313 | Global Optimization (Phase 4)
314 | Global Optimization (Phase 5)
315 | Global Optimization (Phase 6)
316 | Global Optimization (Phase 7)
317 | Global Optimization (Phase 8)
318 | Global Optimization (Phase 9)
319 | Global Optimization (Phase 10)
320 | Global Optimization (Phase 11)
321 | Global Optimization (Phase 12)
322 | Global Optimization (Phase 13)
323 | Global Optimization (Phase 14)
324 | Global Optimization (Phase 15)
325 | Global Optimization (Phase 16)
326 | Global Optimization (Phase 17)
327 | Global Optimization (Phase 18)
328 | Global Optimization (Phase 19)
329 | Global Optimization (Phase 20)
330 | Global Optimization (Phase 21)
331 | Global Optimization (Phase 22)
332 | Global Optimization (Phase 23)
333 | Global Optimization (Phase 24)
334 | Global Optimization (Phase 25)
335 | Global Optimization (Phase 26)
336 | Global Optimization (Phase 27)
337 | Global Optimization (Phase 28)
338 | Global Optimization (Phase 29)
339 | Global Optimization (Phase 30)
340 |
341 | Beginning Isolate Ports
342 | -----------------------
343 |
344 | Beginning Delay Optimization
345 | ----------------------------
346 | 0:00:05 20969.6 0.00 0.0 0.0 1455825.1250
347 | 0:00:05 20969.6 0.00 0.0 0.0 1455825.1250
348 | 0:00:05 20969.6 0.00 0.0 0.0 1455825.1250
349 | 0:00:05 20297.7 0.00 0.0 0.0 1435287.1250
350 | 0:00:05 20291.0 0.00 0.0 0.0 1435205.0000
351 | 0:00:05 20291.0 0.00 0.0 0.0 1435205.0000
352 |
353 | Beginning WLM Backend Optimization
354 | --------------------------------------
355 | 0:00:05 20291.0 0.00 0.0 0.0 1429606.6250
356 | 0:00:05 20291.0 0.00 0.0 0.0 1429606.6250
357 | 0:00:05 20291.0 0.00 0.0 0.0 1429606.6250
358 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
359 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
360 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
361 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
362 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
363 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
364 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
365 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
366 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
367 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
368 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
369 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
370 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
371 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
372 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
373 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
374 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
375 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
376 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
377 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
378 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
379 |
380 |
381 | Beginning Leakage Power Optimization (max_leakage_power 0)
382 | ------------------------------------
383 |
384 | TOTAL
385 | ELAPSED WORST NEG SETUP DESIGN LEAKAGE
386 | TIME AREA SLACK COST RULE COST ENDPOINT POWER
387 | --------- --------- --------- --------- --------- ------------------------- ---------
388 | 0:00:05 20287.7 0.00 0.0 0.0 1355246.8750
389 | Global Optimization (Phase 31)
390 | Global Optimization (Phase 32)
391 | Global Optimization (Phase 33)
392 | Global Optimization (Phase 34)
393 | Global Optimization (Phase 35)
394 | Global Optimization (Phase 36)
395 | Global Optimization (Phase 37)
396 | Global Optimization (Phase 38)
397 | Global Optimization (Phase 39)
398 | Global Optimization (Phase 40)
399 | Global Optimization (Phase 41)
400 | Global Optimization (Phase 42)
401 | Global Optimization (Phase 43)
402 | 0:00:05 20440.7 0.00 0.0 0.0 1230133.5000
403 | 0:00:05 20440.7 0.00 0.0 0.0 1230133.5000
404 | 0:00:05 20440.7 0.00 0.0 0.0 1230133.5000
405 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
406 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
407 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
408 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
409 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
410 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
411 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
412 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
413 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
414 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
415 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
416 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
417 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
418 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
419 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
420 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
421 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
422 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
423 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
424 |
425 | TOTAL
426 | ELAPSED WORST NEG SETUP DESIGN LEAKAGE
427 | TIME AREA SLACK COST RULE COST ENDPOINT POWER
428 | --------- --------- --------- --------- --------- ------------------------- ---------
429 | 0:00:06 20347.6 0.00 0.0 0.0 1235536.2500
430 | 0:00:06 20148.0 0.00 0.0 0.0 1246586.7500
431 | 0:00:06 20148.0 0.00 0.0 0.0 1246586.7500
432 | 0:00:06 20148.0 0.00 0.0 0.0 1246586.7500
433 | 0:00:06 20148.0 0.00 0.0 0.0 1246586.7500
434 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
435 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
436 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
437 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
438 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
439 | 0:00:06 20227.8 0.00 0.0 0.0 1241531.3750
440 | Loading db file '/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db'
441 |
442 |
443 | Note: Symbol # after min delay cost means estimated hold TNS across all active scenarios
444 |
445 |
446 | Optimization Complete
447 | ---------------------
448 | Information: State dependent leakage is now switched from off to on.
449 | Information: Propagating switching activity (low effort zero delay simulation). (PWR-6)
450 | 1
451 | #======================================================
452 | # Output Reports
453 | #======================================================
454 | check_design > Report/$DESIGN\.check
455 | report_timing > Report/$DESIGN\.timing
456 | report_area > Report/$DESIGN\.area
457 | report_resource > Report/$DESIGN\.resource
458 | #======================================================
459 | # Change Naming Rule
460 | #======================================================
461 | set bus_inference_style "%s\[%d\]"
462 | %s[%d]
463 | set bus_naming_style "%s\[%d\]"
464 | %s[%d]
465 | set hdlout_internal_busses true
466 | true
467 | change_names -hierarchy -rule verilog
468 | 1
469 | define_name_rules name_rule -allowed "a-z A-Z 0-9 _" -max_length 255 -type cell
470 | 1
471 | define_name_rules name_rule -allowed "a-z A-Z 0-9 _[]" -max_length 255 -type net
472 | 1
473 | define_name_rules name_rule -map {{"\\*cell\\*" "cell"}}
474 | 1
475 | change_names -hierarchy -rules name_rule
476 | 1
477 | #======================================================
478 | # Output Results
479 | #======================================================
480 | set verilogout_higher_designs_first true
481 | true
482 | write -format verilog -output Netlist/$DESIGN\_SYN.v -hierarchy
483 | Writing verilog file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/02_SYN/Netlist/SUBWAY_SYN.v'.
484 | 1
485 | write_sdf -version 2.1 -context verilog -load_delay cell Netlist/$DESIGN\_SYN.sdf
486 | Information: Writing timing information to file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/02_SYN/Netlist/SUBWAY_SYN.sdf'. (WT-3)
487 | 1
488 | report_area
489 |
490 | ****************************************
491 | Report : area
492 | Design : SUBWAY
493 | Version: T-2022.03
494 | Date : Sun Apr 9 02:23:57 2023
495 | ****************************************
496 |
497 | Library(s) Used:
498 |
499 | slow (File: /RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db)
500 |
501 | Number of ports: 16
502 | Number of nets: 835
503 | Number of cells: 718
504 | Number of combinational cells: 554
505 | Number of sequential cells: 164
506 | Number of macros/black boxes: 0
507 | Number of buf/inv: 73
508 | Number of references: 36
509 |
510 | Combinational area: 8236.166493
511 | Buf/Inv area: 728.481627
512 | Noncombinational area: 11991.672226
513 | Macro/Black Box area: 0.000000
514 | Net Interconnect area: undefined (No wire load specified)
515 |
516 | Total cell area: 20227.838719
517 | Total area: undefined
518 | 1
519 | report_timing
520 |
521 | ****************************************
522 | Report : timing
523 | -path full
524 | -delay max
525 | -max_paths 1
526 | Design : SUBWAY
527 | Version: T-2022.03
528 | Date : Sun Apr 9 02:23:57 2023
529 | ****************************************
530 |
531 | Operating Conditions: slow Library: slow
532 | Wire Load Model Mode: top
533 |
534 | Startpoint: in_valid (input port clocked by clk)
535 | Endpoint: map_reg_0__0__0_
536 | (rising edge-triggered flip-flop clocked by clk)
537 | Path Group: clk
538 | Path Type: max
539 |
540 | Point Incr Path
541 | -----------------------------------------------------------
542 | clock clk (rise edge) 0.00 0.00
543 | clock network delay (ideal) 0.00 0.00
544 | input external delay 5.00 5.00 f
545 | in_valid (in) 0.00 5.00 f
546 | U679/Y (NAND2XL) 0.13 5.13 r
547 | U645/Y (OAI31XL) 0.30 5.43 f
548 | U614/Y (NOR2XL) 1.63 7.06 r
549 | U739/Y (INVXL) 0.69 7.74 f
550 | U784/Y (AOI2BB2XL) 0.47 8.21 f
551 | map_reg_0__0__0_/D (DFFRX1) 0.00 8.21 f
552 | data arrival time 8.21
553 |
554 | clock clk (rise edge) 10.00 10.00
555 | clock network delay (ideal) 0.00 10.00
556 | map_reg_0__0__0_/CK (DFFRX1) 0.00 10.00 r
557 | library setup time -0.32 9.68
558 | data required time 9.68
559 | -----------------------------------------------------------
560 | data required time 9.68
561 | data arrival time -8.21
562 | -----------------------------------------------------------
563 | slack (MET) 1.46
564 |
565 |
566 | 1
567 | #======================================================
568 | # Finish and Quit
569 | #======================================================
570 | exit
571 |
572 | Memory usage for this session 196 Mbytes.
573 | Memory usage for this session including child processes 241 Mbytes.
574 | CPU usage for this session 57 seconds ( 0.02 hours ).
575 | Elapsed time for this session 62 seconds ( 0.02 hours ).
576 |
577 | Thank you...
578 |
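The headline numbers in the `report_timing` and `report_area` output above can be re-derived by hand. The sketch below is not part of the repository: the function names `slack` and `total_cell_area` are hypothetical, and the inputs are copied from the log (10.00 ns clock period, 0.32 ns library setup time, 8.21 ns data arrival time, and the two area subtotals). It assumes a POSIX shell with `awk` available.

```shell
#!/bin/sh
# Hypothetical helpers (not in the repository) that recompute figures
# printed in syn.log from the values shown in the reports.

# Setup slack = (clock period - library setup time) - data arrival time
slack() {
    # $1 = clock period (ns), $2 = setup time (ns), $3 = data arrival time (ns)
    awk -v period="$1" -v setup="$2" -v arrival="$3" \
        'BEGIN { printf "%.2f\n", (period - setup) - arrival }'
}

# Total cell area = combinational area + noncombinational area
# (the Buf/Inv area is already counted inside the combinational figure)
total_cell_area() {
    # $1 = combinational area, $2 = noncombinational area
    awk -v comb="$1" -v seq="$2" 'BEGIN { printf "%.6f\n", comb + seq }'
}

slack 10.00 0.32 8.21                      # prints 1.47
total_cell_area 8236.166493 11991.672226   # prints 20227.838719
```

The recomputed slack of 1.47 differs from the reported 1.46 by one unit in the last digit. That is most likely because the tool subtracts unrounded internal values and rounds only for display, whereas this sketch starts from the already-rounded figures. The area sum matches the reported total cell area exactly.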
--------------------------------------------------------------------------------
/src/02_SYN/syn.tcl:
--------------------------------------------------------------------------------
1 | #======================================================
2 | #
3 | # Synopsys Synthesis Scripts (Design Vision dctcl mode)
4 | #
5 | #======================================================
6 |
7 | #======================================================
8 | # Set Libraries
9 | #======================================================
10 | set search_path {./../01_RTL \
11 | ~iclabta01/umc018/Synthesis/ \
12 | /usr/synthesis/libraries/syn/ }
13 |
14 | set synthetic_library {dw_foundation.sldb}
15 | set link_library {* dw_foundation.sldb standard.sldb slow.db}
16 | set target_library {slow.db}
17 |
18 | #======================================================
19 | # Global Parameters
20 | #======================================================
21 | set DESIGN "SUBWAY"
22 | set hdlin_ff_always_sync_set_reset true
23 | set CLK_period 10.0
24 |
25 | #======================================================
26 | # Read RTL Code
27 | #======================================================
28 | read_sverilog $DESIGN.v
29 | current_design $DESIGN
30 |
31 | #======================================================
32 | # Global Setting
33 | #======================================================
34 | set_wire_load_mode top
35 |
36 | #======================================================
37 | # Set Design Constraints
38 | #======================================================
39 | create_clock -name "clk" -period $CLK_period clk
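40 | set_input_delay [expr {$CLK_period * 0.5}] -clock clk [all_inputs] ;# reserve half a cycle for logic outside the block
41 | set_output_delay [expr {$CLK_period * 0.5}] -clock clk [all_outputs] ;# reserve half a cycle for logic outside the block
42 | set_input_delay 0 -clock clk clk ;# override: the clock port itself gets no input delay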
43 | set_load 0.05 [all_outputs]
44 |
45 | #======================================================
46 | # Optimization
47 | #======================================================
48 | uniquify
49 | set_fix_multiple_port_nets -all -buffer_constants
50 | compile_ultra
51 |
52 | #======================================================
53 | # Output Reports
54 | #======================================================
55 | check_design > Report/$DESIGN\.check
56 | report_timing > Report/$DESIGN\.timing
57 | report_area > Report/$DESIGN\.area
58 | report_resource > Report/$DESIGN\.resource
59 |
60 | #======================================================
61 | # Change Naming Rule
62 | #======================================================
63 | set bus_inference_style "%s\[%d\]"
64 | set bus_naming_style "%s\[%d\]"
65 | set hdlout_internal_busses true
66 |
67 | change_names -hierarchy -rule verilog
68 |
69 | define_name_rules name_rule -allowed "a-z A-Z 0-9 _" -max_length 255 -type cell
70 | define_name_rules name_rule -allowed "a-z A-Z 0-9 _[]" -max_length 255 -type net
71 | define_name_rules name_rule -map {{"\\*cell\\*" "cell"}}
72 | change_names -hierarchy -rules name_rule
73 |
74 | #======================================================
75 | # Output Results
76 | #======================================================
77 |
78 | set verilogout_higher_designs_first true
79 | write -format verilog -output Netlist/$DESIGN\_SYN.v -hierarchy
80 | write_sdf -version 2.1 -context verilog -load_delay cell Netlist/$DESIGN\_SYN.sdf
81 | report_area
82 | report_timing
83 | #======================================================
84 | # Finish and Quit
85 | #======================================================
86 | exit
87 |
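Since the script above redirects `report_timing` to `Report/$DESIGN.timing`, a run can be checked automatically by looking for the slack line in that file. The following is a minimal sketch, not something the repository provides: the function name `timing_met` is hypothetical, and it assumes the report uses the `slack (MET)` / `slack (VIOLATED...)` wording seen in Design Compiler timing reports such as the one in `syn.log`.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): succeed only if the timing
# report contains a MET slack line and no VIOLATED slack line.
timing_met() {
    report="$1"

    # Return 2 if the report is missing or unreadable.
    [ -r "$report" ] || return 2

    # Any violated path means timing was not met.
    if grep -q 'slack (VIOLATED' "$report"; then
        return 1
    fi

    # Require at least one explicit MET line.
    grep -q 'slack (MET)' "$report"
}

# Example usage after synthesis, run from 02_SYN:
#   timing_met Report/SUBWAY.timing && echo "timing met"
```

Returning distinct codes for "violated" (1) and "report missing" (2) lets a calling script tell a failed synthesis run apart from a real timing failure.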
--------------------------------------------------------------------------------
/src/03_GATE_SIM/01_run:
--------------------------------------------------------------------------------
1 | irun -timescale 1ns/1fs -override_precision -sdf_precision 1fs TESTBED.v -define GATE -define FUNC -debug -v ~iclabta01/umc018/Verilog/umc18_neg.v -nontcglitch -loadpli1 debpli:novas_pli_boot
2 |
--------------------------------------------------------------------------------
/src/03_GATE_SIM/09_clean_up:
--------------------------------------------------------------------------------
1 | rm -rf INCA_libs nWaveLog
2 | rm -rf *.fsdb
3 | rm -rf *.log
4 | rm -rf *~
5 | rm -rf *.sdf.X
6 | rm -rf *.key
7 | rm -rf *.conf
8 | rm -rf *.rc
9 |
10 |
--------------------------------------------------------------------------------
/src/03_GATE_SIM/SUBWAY_SYN.sdf.X:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/src/03_GATE_SIM/SUBWAY_SYN.sdf.X
--------------------------------------------------------------------------------