├── .DS_Store ├── CODE_OF_CONDUCT.md ├── README.md ├── docs ├── .$Flowchart.drawio.bkp ├── Flowchart.drawio ├── Flowchart.png ├── Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf ├── Wallace Tree.drawio └── WallaceTree.png └── src ├── 00_TESTBED ├── MAC32_top_tb.sv ├── PATTERN.v ├── PATTERN_sample.v ├── TESTBED.v └── TESTBED_sample.v ├── 01_RTL ├── 01_run ├── Compressor32.v ├── Compressor42.v ├── EACAdder.v ├── FullAdder.v ├── LeadingOneDetector_Top.v ├── MAC32_top.v ├── MSBIncrementer.v ├── Normalizer.v ├── PreNormalizer.v ├── R4Booth.v ├── Rounder.v ├── SpecialCaseDetector.v ├── WallaceTree.v ├── ZeroDetector_Base.v └── ZeroDetector_Group.v ├── 02_SYN ├── 01_run_dc ├── 09_clean_up ├── Netlist │ ├── SUBWAY_SYN.sdf │ └── SUBWAY_SYN.v ├── Report │ ├── SUBWAY.area │ ├── SUBWAY.check │ ├── SUBWAY.resource │ └── SUBWAY.timing ├── default.svf ├── syn.log └── syn.tcl └── 03_GATE_SIM ├── 01_run ├── 09_clean_up ├── SUBWAY_SYN.sdf.X └── irun.log /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/.DS_Store -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 
14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 
55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | hankshyu@gmail.com. 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 
99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 
129 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Implementation of a RISC-V compatible Multiply-Add Fused Unit 2 | 3 | ***The full paper can be found [here](docs/Implementation%20of%20a%20RISC-V%20compatible%20Multiply-Add%20Fused%20Unit.pdf)*** 4 | 5 | ## Abstract 6 | 7 | The floating-point Multiply-Add Fused (MAF, also known as Multiply-ACcumulate, MAC) unit is popular in modern microprocessor design due to its efficiency and performance advantages. The design aims to speed up scientific computations, multimedia applications, and in particular, convolutional neural networks for machine learning tasks. This study implements a MAF unit compatible with the RISC-V "F" extension, incorporating standard IEEE 754-2008 exception handling, NaN propagation, and denormalized number support. Five distinct rounding modes and accrued exception flags are also supported in the proposed design. We test our implementation with carefully crafted corner cases and randomly generated floating-point numbers to verify its correctness. 8 | 9 | **Index Terms—Floating-Point Unit, Multiply-Add fused, Multiply Accumulate, RISC-V** 10 | 11 | ## 1. Introduction 12 | Floating-point operations play a crucial role in modern computing, especially as the machine learning domain flourishes. Growing computational power makes training sophisticated models possible. Applying machine learning models in real-life applications typically requires floating-point computation, which is demanding because large amounts of real-time data must be processed. Moreover, deep learning algorithms with an intensive need for floating-point computation, such as neural networks, have recently grown in popularity. These applications further challenge the floating-point processing power of microprocessors. 
Among all floating-point operations, multiply-and-add is the most in demand; the combination appears in the convolution layers of convolutional neural networks, digital filtering, and the architectures of many other computing models. 13 | 14 | Floating-point units are available on most microprocessors nowadays. Most designs center around a fused multiply-add dataflow due to its simplicity and performance advantage over separate multiplier and adder pipelines. It combines two basic operations with only one rounding error and shares hardware components to save chip area. Such a design is also consistent with the basic RISC philosophy of heavily optimizing key units in order to rapidly carry out the most frequently expected functions. Furthermore, the existence of a fused multiply-add unit leads to more efficient superscalar CPU design, since three floating-point instructions (add, multiply, and fused multiply-add) can be scheduled to the same functional unit. 15 | 16 | To take full advantage of the MAF dataflow, [3] transforms a set of equations into a series of multiply-adds by a numerical analysis technique called Horner's rule. [4] presents a general method to convert any transform algorithm into MAF optimized algorithms. [5] presents a framework for automatically generating MAF code for every linear DSP transform. These examples show that the MAF architecture is recognized in modern computing and can be optimized for at the software level. 17 | 18 | ## 3. 
Overall Maf Unit Architecture 19 | 20 | ![maf](docs/Flowchart.png) 21 | -------------------------------------------------------------------------------- /docs/.$Flowchart.drawio.bkp: -------------------------------------------------------------------------------- 1 | (compressed draw.io diagram data) -------------------------------------------------------------------------------- /docs/Flowchart.drawio: -------------------------------------------------------------------------------- 1 | (compressed draw.io diagram data) -------------------------------------------------------------------------------- /docs/Flowchart.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/Flowchart.png -------------------------------------------------------------------------------- /docs/Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/Implementation of a RISC-V compatible Multiply-Add Fused Unit.pdf -------------------------------------------------------------------------------- /docs/Wallace Tree.drawio: -------------------------------------------------------------------------------- 1 | (compressed draw.io diagram data) -------------------------------------------------------------------------------- /docs/WallaceTree.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/docs/WallaceTree.png -------------------------------------------------------------------------------- /src/00_TESTBED/PATTERN.v: -------------------------------------------------------------------------------- 1 | `ifdef RTL 2 | `define CYCLE_TIME 20.0 3 | `endif 4 | `ifdef GATE 5 | `define CYCLE_TIME 20.0 6 | `endif 7 | 8 | 9 | module PATTERN #( 10 | parameter PARM_RM = 3, 11 | parameter PARM_XLEN = 32, 12 | parameter PARM_RM_RNE = 3'b000, 13 | parameter PARM_RM_RTZ = 3'b001, 14 | parameter PARM_RM_RDN = 3'b010, 15 | parameter PARM_RM_RUP = 3'b011, 16 | parameter PARM_RM_RMM = 3'b100 17 | ) ( 18 | output reg clk, 19 | output reg rst_n, 20 | 21 | output reg [PARM_RM - 1 : 0] Rounding_mode_i, 22 | output reg [PARM_XLEN - 1 : 0] A_i, 23 | output reg [PARM_XLEN - 1 : 0] B_i, 24 | output reg [PARM_XLEN - 1 : 0] C_i, 25 | 26 | input [PARM_XLEN - 1 : 0] Result_o, // 
T (result_o) = A + (B * C) 27 | //Accrued exceptions (fflags) 28 | input NV_o, 29 | input OF_o, 30 | input UF_o, 31 | input NX_o ); 32 | 33 | //================================================================ 34 | // Main Function 35 | //================================================================ 36 | initial begin 37 | $display("Welcome to the RISCV MAC project!"); 38 | #5; 39 | $finish; 40 | end 41 | 42 | 43 | endmodule -------------------------------------------------------------------------------- /src/00_TESTBED/PATTERN_sample.v: -------------------------------------------------------------------------------- 1 | `ifdef RTL 2 | `define CYCLE_TIME 10.0 3 | `endif 4 | `ifdef GATE 5 | `define CYCLE_TIME 10.0 6 | `endif 7 | 8 | module PATTERN( 9 | // Output Signals 10 | clk, 11 | rst_n, 12 | in_valid, 13 | init, 14 | in0, 15 | in1, 16 | in2, 17 | in3, 18 | // Input Signals 19 | out_valid, 20 | out 21 | ); 22 | 23 | //================================================================ 24 | // INPUT AND OUTPUT DECLARATION 25 | //================================================================ 26 | /* Input for design */ 27 | output reg clk, rst_n; 28 | output reg in_valid; 29 | output reg [1:0] init; 30 | output reg [1:0] in0, in1, in2, in3; 31 | 32 | /* Output for pattern */ 33 | input out_valid; 34 | input [1:0] out; 35 | 36 | //================================================================ 37 | // integer 38 | //================================================================ 39 | real CYCLE = `CYCLE_TIME; 40 | parameter PATNUM = 300; 41 | integer SEED = 82; 42 | integer total_latency; 43 | integer patcount; 44 | reg [1 : 0] map [64-1 : 0][4-1 : 0]; 45 | reg [1:0]init_in; 46 | integer wait_val_time; 47 | integer i; 48 | integer cac; //check answer cycle 49 | 50 | integer bonus_point; 51 | integer sum_bonus; 52 | 53 | reg [1:0] spotA, spotB, move; 54 | reg [1:0] current_line; 55 | 56 | integer resetted; 57 | 58 | 
//================================================================ 59 | // parameter 60 | //================================================================ 61 | 62 | parameter M_FORWARD = 2'd0; 63 | parameter M_RIGHT = 2'd1; 64 | parameter M_LEFT = 2'd2; 65 | parameter M_JUMP = 2'd3; 66 | 67 | parameter S_ROAD = 2'd0; 68 | parameter S_LO = 2'd1; 69 | parameter S_HO = 2'd2; 70 | parameter S_TRAINS = 2'd3; 71 | 72 | parameter PRINT_MSG = 1; //set to 0 to hand in the exercise 73 | 74 | 75 | //================================================================ 76 | // clock 77 | //================================================================ 78 | initial clk = 0; 79 | 80 | always #(CYCLE/2.0) clk = ~clk; 81 | 82 | //================================================================ 83 | // initial 84 | //================================================================ 85 | initial begin 86 | resetted = 0; 87 | @(posedge rst_n) 88 | resetted = 1; 89 | end 90 | 91 | always begin 92 | 93 | if(resetted)begin 94 | // The out should be reset when your out_valid is low. 95 | if(out_valid == 1'b0 && (out != 2'b00))begin 96 | $display("SPEC 4 IS FAIL!"); 97 | $finish; 98 | end else if((in_valid === 1) && (out_valid !==0))begin 99 | // The out_valid should not be high when in_valid is high. 100 | $display("SPEC 5 IS FAIL!"); 101 | $finish; 102 | end 103 | end 104 | #(CYCLE/10.0); 105 | end 106 | 107 | 108 | initial begin 109 | 110 | rst_n = 1'b1; 111 | in_valid = 1'b0; 112 | init = 'bx; 113 | in0 = 'bx; 114 | in1 = 'bx; 115 | in2 = 'bx; 116 | in3 = 'bx; 117 | sum_bonus = 0; 118 | force clk = 0; 119 | total_latency = 0; 120 | genmap;//this is to avoid starting out_valid... 
121 | reset_signal_task; 122 | check_ans; 123 | for(patcount=1; patcount<=PATNUM; patcount=patcount+1) begin 124 | if(PRINT_MSG) $display("PATTERN:%05d",patcount); 125 | input_task; 126 | wait_out_valid; 127 | check_ans; 128 | 129 | @(negedge clk); 130 | check_ans; 131 | @(negedge clk); 132 | check_ans; 133 | @(negedge clk); 134 | 135 | end 136 | if(PRINT_MSG) $display("Total BONUS: %d running %d cycles!!",sum_bonus,PATNUM); 137 | YOU_PASS_task; 138 | end 139 | 140 | //================================================================ 141 | // task 142 | //================================================================ 143 | 144 | task reset_signal_task; 145 | begin 146 | #(0.5); rst_n=0; 147 | #(2.0); 148 | if((out_valid !== 0)||(out !== 0)) begin 149 | $display("SPEC 3 IS FAIL!"); 150 | // The reset signal (rst_n) is given only once at the beginning of simulation. All output 151 | // signals should be reset after the reset signal is asserted. 152 | $finish; 153 | end 154 | #(10); rst_n=1; 155 | #(3); release clk; 156 | end 157 | endtask 158 | 159 | 160 | task input_task; 161 | begin 162 | // Inputs start from the second negative edge after the clock begins 163 | if(patcount=='d1)begin 164 | repeat(2)@(negedge clk); 165 | end 166 | 167 | genmap; 168 | if(PRINT_MSG) printmap; 169 | 170 | in_valid = 1'b1; 171 | for (i = 0; i < 64; i = i+1) begin 172 | init = (i == 0)? init_in : 2'bxx; 173 | 174 | in0 = map[i][0]; 175 | in1 = map[i][1]; 176 | in2 = map[i][2]; 177 | in3 = map[i][3]; 178 | 179 | if(out_valid !== 0)begin 180 | $display("SPEC 5 IS FAIL!"); 181 | // The out_valid should not be high when in_valid is high. 
182 | $finish; 183 | end 184 | 185 | @(negedge clk); 186 | //disable input 187 | 188 | end 189 | 190 | in_valid = 1'b0; 191 | init = 2'bx; 192 | in0 = 2'bx; 193 | in1 = 2'bx; 194 | in2 = 2'bx; 195 | in3 = 2'bx; 196 | end 197 | endtask 198 | 199 | task wait_out_valid; begin 200 | wait_val_time = -1; 201 | while(out_valid !== 1) begin 202 | wait_val_time = wait_val_time + 1; 203 | if(wait_val_time == 3000)begin 204 | $display("SPEC 6 IS FAIL!"); 205 | // The execution latency is limited to 3000 cycles. The latency is the number of clock cycles 206 | // between the falling edge of the in_valid and the rising edge of the out_valid. 207 | $finish; 208 | end 209 | if(out !== 2'b00)begin 210 | $display("SPEC 4 IS FAIL!"); 211 | // The out should be reset when your out_valid is low. 212 | $finish; 213 | end 214 | @(negedge clk); 215 | 216 | end 217 | total_latency = total_latency + wait_val_time; 218 | end endtask 219 | 220 | task check_ans; 221 | begin 222 | 223 | //++++++++++++++++++++++++++++++++++++++++++++++++ 224 | // Check the answer here 225 | cac = 0; 226 | bonus_point = 0; 227 | 228 | 229 | while(out_valid)begin 230 | 231 | if((cac > 62) || !((out === 2'd0)|| (out === 2'd1)|| (out === 2'd2)|| (out === 2'd3)))begin 232 | $display("SPEC 7 IS FAIL!"); 233 | // The out_valid and out must be asserted successively in 63 cycles, and out must be a known value (no X/Z). 234 | $finish; 235 | end 236 | else begin 237 | //check for incorrect answers 238 | 239 | if(cac == 0)begin 240 | current_line = init_in; 241 | end 242 | 243 | if((current_line == 0 && out == M_LEFT) || (current_line == 3 && out == M_RIGHT))begin 244 | $display("SPEC 8-1 IS FAIL!"); 245 | // - SPEC 8-1 (5%): The character cannot run outside the map. 
246 | $finish; 247 | end else if( ((out == M_FORWARD) && (map[cac+1][current_line] == S_LO)) || 248 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_LO)) || 249 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_LO)) )begin 250 | $display("SPEC 8-2 IS FAIL!"); 251 | // - SPEC 8-2 (5%): The character must avoid hitting lower obstacles. 252 | $finish; 253 | 254 | end else if( ((out == M_JUMP) && (map[cac+1][current_line] == S_HO)) || 255 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_HO)) || 256 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_HO)) )begin 257 | $display("SPEC 8-3 IS FAIL!"); 258 | // - SPEC 8-3 (5%): The character must avoid hitting higher obstacles. 259 | $finish; 260 | end else if( ((out == M_FORWARD) && (map[cac+1][current_line] == S_TRAINS)) || 261 | ((out == M_JUMP) && (map[cac+1][current_line] == S_TRAINS)) || 262 | ((out == M_RIGHT) && (map[cac+1][current_line+1] == S_TRAINS)) || 263 | ((out == M_LEFT) && (map[cac+1][current_line-1] == S_TRAINS)) )begin 264 | $display("SPEC 8-4 IS FAIL!"); 265 | // - SPEC 8-4 (5%): The character must avoid hitting trains. 266 | $finish; 267 | end else if((out == M_JUMP) && (map[cac][current_line] == S_LO))begin 268 | $display("SPEC 8-5 IS FAIL!"); 269 | // - SPEC 8-5 (5%): If you are on a lower obstacle (2'b01), you cannot use jump. 
270 | $finish; 271 | end 272 | 273 | if(PRINT_MSG)begin 274 | 275 | if(cac % 8 == 0)$write("Block %02d: ",cac/8); 276 | 277 | if(out == M_FORWARD)begin 278 | $write("%dF ",current_line); 279 | bonus_point = bonus_point + 1; 280 | 281 | end else if(out == M_JUMP)begin 282 | $write("%dJ ",current_line); 283 | bonus_point = bonus_point + 4; 284 | 285 | end else if(out == M_LEFT)begin 286 | $write("%dL ",current_line); 287 | current_line = current_line - 1; 288 | bonus_point = bonus_point + 2; 289 | end else if(out == M_RIGHT)begin 290 | $write("%dR ",current_line); 291 | current_line = current_line + 1; 292 | bonus_point = bonus_point + 2; 293 | 294 | end 295 | if(cac % 8 == 7)$display(); 296 | end else begin 297 | if(out == M_LEFT)begin 298 | current_line = current_line - 1; 299 | end else if(out == M_RIGHT)begin 300 | current_line = current_line + 1; 301 | 302 | end 303 | end 304 | 305 | end 306 | @(negedge clk); 307 | 308 | cac = cac + 1; 309 | 310 | end 311 | //+++++++++++++++++++++++++++++++++++++++++++++++ 312 | if((cac < 62) && (cac != 0)) begin 313 | $display("SPEC 7 IS FAIL!"); 314 | // The out_valid and out must be asserted successively in 63 cycles. 315 | $finish; 316 | end 317 | if(PRINT_MSG)begin 318 | $display(); 319 | $display("Bonus is : %d",bonus_point); 320 | sum_bonus = sum_bonus + bonus_point; 321 | $display("\033[0;34mPASS PATTERN NO.%4d,\033[m \033[0;32mexecution cycle : %3d\033[m",patcount ,wait_val_time); 322 | end 323 | 324 | end 325 | endtask 326 | 327 | task YOU_PASS_task; 328 | begin 329 | if(PRINT_MSG)begin 330 | $display("\n"); 331 | $display("\n"); 332 | $display(" ---------------------------- "); 333 | $display(" -- -- |\__|| "); 334 | $display(" -- Congratulations !! -- / O.O | "); 335 | $display(" -- -- /_____ | "); 336 | $display(" -- Simulation out!! 
-- /^ ^ ^ \\ |"); 337 | $display(" -- -- |^ ^ ^ ^ |w| "); 338 | $display(" ---------------------------- \\m___m__|_|"); 339 | $display("\n"); 340 | end 341 | 342 | #(500); 343 | $finish; 344 | end 345 | endtask 346 | 347 | task genmap; 348 | integer idx, jdx, kdx, ldx; 349 | integer grow_obstacles, grow_trainnum, grow_trainpos; 350 | begin 351 | for (idx = 0; idx < 64; idx = idx+1) begin 352 | for(jdx = 0; jdx < 4; jdx = jdx + 1)begin 353 | //the map is covered with road 354 | map[idx][jdx] = 2'b00; 355 | //grow high and low obstacles 356 | if((idx % 'd2 == 0) && (idx % 'd8 != 0))begin 357 | grow_obstacles = $random(SEED) % 'd3; 358 | if(grow_obstacles == 1) map[idx][jdx] = 2'b01; //low obstacles (LO) 359 | else if(grow_obstacles == 2) map[idx][jdx] = 2'b10; //high obstacles (HO) 360 | end 361 | end 362 | end 363 | // put on the trains 364 | for (jdx = 0; jdx < 8; jdx = jdx+1) begin 365 | grow_trainnum = ($random(SEED) %'d3)+1; 366 | for(ldx = 0; ldx < grow_trainnum; ldx = ldx+1)begin 367 | grow_trainpos = $random(SEED) %'d4; 368 | while(map[jdx*8][grow_trainpos] == 2'b11)begin 369 | grow_trainpos = $random(SEED) %'d4; //regenerate position 370 | end 371 | map[jdx*8 + 0][grow_trainpos] = 2'b11; 372 | map[jdx*8 + 1][grow_trainpos] = 2'b11; 373 | map[jdx*8 + 2][grow_trainpos] = 2'b11; 374 | map[jdx*8 + 3][grow_trainpos] = 2'b11; 375 | end 376 | end 377 | //generate the input position 378 | init_in = $random(SEED) % 'd4; 379 | while(map[0][init_in] == 2'b11)begin 380 | init_in = $random(SEED) % 'd4; 381 | end 382 | 383 | end 384 | 385 | endtask 386 | 387 | task printmap; 388 | integer idx, jdx; 389 | begin 390 | $display("\t >< ||0***+*** 1***+*** 2***+*** 3***+*** 4***+*** 5***+*** 6***+*** 7***+*** (%d)",init_in); 391 | for (jdx = 0; jdx < 4; jdx = jdx+1) begin 392 | $write("%d ||",jdx); 393 | for (idx = 0; idx < 64; idx = idx+1) begin 394 | $write("%d",map[idx][jdx]); 395 | if(idx % 'd8 == 7) $write(" "); 396 | end 397 | $display(); 398 | end 399 | end 400 | 
endtask 401 | 402 | endmodule 403 | 404 | 405 | // fail.txt 406 | // SPEC 3 IS FAIL! 407 | // The reset signal (rst_n) would be given only once at the beginning of simulation. All output 408 | // signals should be reset after the reset signal is asserted. 409 | 410 | // SPEC 4 IS FAIL! 411 | // The out should be reset when your out_valid is low. 412 | 413 | // SPEC 5 IS FAIL! 414 | // The out_valid should not be high when in_valid is high. 415 | 416 | // SPEC 6 IS FAIL! 417 | // The execution latency is limited in 3000 cycles. The latency is the time of the clock cycles 418 | // between the falling edge of the in_valid and the rising edge of the out_valid. 419 | 420 | // SPEC 7 IS FAIL! 421 | // The out_valid and out must be asserted successively in 63 cycles. 422 | 423 | 424 | // SPEC 8-1 IS FAIL! 425 | // - SPEC 8-1 (5%): The character cannot run outside the map. 426 | 427 | // SPEC 8-2 IS FAIL! 428 | // - SPEC 8-2 (5%): The character must avoid hitting lower obstacles. 429 | 430 | // SPEC 8-3 IS FAIL! 431 | // - SPEC 8-3 (5%): The character must avoid hitting higher obstacles. 432 | 433 | // SPEC 8-4 IS FAIL! 434 | // - SPEC 8-4 (5%): The character must avoid hitting trains. 435 | 436 | // SPEC 8-5 IS FAIL! 437 | // - SPEC 8-5 (5%): If you are on a lower obstacle (2’b01), you cannot use jump. 
438 | -------------------------------------------------------------------------------- /src/00_TESTBED/TESTBED.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 11/20/2023 05:03:25 PM 5 | // Module Name: TESTBED 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: MAC32_top.v 10 | // PATTERN.v 11 | // 12 | ////////////////////////////////////////////////////////////////////////////////// 13 | // Description: Testbed of the MAC32_top module, acts as a breadboard 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 18 | ////////////////////////////////////////////////////////////////////////////////// 19 | 20 | `timescale 1ns/10ps 21 | 22 | `include "PATTERN.v" 23 | `ifdef RTL 24 | `include "MAC.v" 25 | `endif 26 | `ifdef GATE 27 | `include "MAC_SYN.v" 28 | `endif 29 | 30 | module TESTBED; 31 | 32 | // parameter declaration 33 | parameter PARM_RM = 3; 34 | parameter PARM_XLEN = 32; 35 | parameter PARM_RM_RNE = 3'b000; 36 | parameter PARM_RM_RTZ = 3'b001; 37 | parameter PARM_RM_RDN = 3'b010; 38 | parameter PARM_RM_RUP = 3'b011; 39 | parameter PARM_RM_RMM = 3'b100; 40 | 41 | // interconnect wire declarations 42 | wire clk, rst_n; 43 | 44 | wire [PARM_RM - 1 : 0] Rounding_mode_wire; 45 | wire [PARM_XLEN - 1 : 0] A_wire; 46 | wire [PARM_XLEN - 1 : 0] B_wire; 47 | wire [PARM_XLEN - 1 : 0] C_wire; 48 | 49 | wire [PARM_XLEN - 1 : 0] Result_wire; 50 | wire NV_wire; 51 | wire OF_wire; 52 | wire UF_wire; 53 | wire NX_wire; 54 | 55 | initial begin 56 | `ifdef RTL 57 | $fsdbDumpfile("MAC.fsdb"); 58 | $fsdbDumpvars(0,"+mda"); 59 | `endif 60 | `ifdef GATE 61 | $sdf_annotate("MAC_SYN.sdf", u_MAC32_top); 62 | $fsdbDumpfile("MAC_SYN.fsdb"); 63 | 
$fsdbDumpvars(0,"+mda"); 64 | `endif 65 | end 66 | 67 | PATTERN #( 68 | .PARM_RM (PARM_RM), 69 | .PARM_XLEN(PARM_XLEN), 70 | .PARM_RM_RNE(PARM_RM_RNE), 71 | .PARM_RM_RTZ(PARM_RM_RTZ), 72 | .PARM_RM_RDN(PARM_RM_RDN), 73 | .PARM_RM_RUP(PARM_RM_RUP), 74 | .PARM_RM_RMM(PARM_RM_RMM) 75 | )u_PATTERN( 76 | .clk(clk), 77 | .rst_n(rst_n), 78 | 79 | .Rounding_mode_o(Rounding_mode_wire), 80 | .A_o(A_wire), 81 | .B_o(B_wire), 82 | .C_o(C_wire), 83 | 84 | .Result_i(Result_wire), 85 | 86 | //Accrued exceptions (fflags) 87 | .NV_i(NV_wire), 88 | .OF_i(OF_wire), 89 | .UF_i(UF_wire), 90 | .NX_i(NX_wire) 91 | ); 92 | 93 | MAC32_top #( 94 | .PARM_RM (PARM_RM), 95 | .PARM_XLEN(PARM_XLEN), 96 | .PARM_RM_RNE(PARM_RM_RNE), 97 | .PARM_RM_RTZ(PARM_RM_RTZ), 98 | .PARM_RM_RDN(PARM_RM_RDN), 99 | .PARM_RM_RUP(PARM_RM_RUP), 100 | .PARM_RM_RMM(PARM_RM_RMM) 101 | ) u_MAC32_top ( 102 | //input clk_i, 103 | //input rst_i, 104 | //input stall_i, 105 | //input req_i, 106 | 107 | .Rounding_mode_i(Rounding_mode_wire), 108 | .A_i(A_wire), 109 | .B_i(B_wire), 110 | .C_i(C_wire), 111 | 112 | .Result_o(Result_wire), // T (result_o) = A + (B * C) 113 | //output ready_o, 114 | 115 | //Accrued exceptions (fflags) 116 | .NV_o(NV_wire), 117 | .OF_o(OF_wire), 118 | .UF_o(UF_wire), 119 | .NX_o(NX_wire) 120 | ); 121 | 122 | endmodule -------------------------------------------------------------------------------- /src/00_TESTBED/TESTBED_sample.v: -------------------------------------------------------------------------------- 1 | /**************************************************************************/ 2 | // Copyright (c) 2023, OASIS Lab 3 | // MODULE: TESTBED 4 | // FILE NAME: TESTBED.v 5 | // VERSION: 1.0 6 | // DATE: Feb 8, 2023 7 | // AUTHOR: Kuan-Wei Chen, NYCU IEE 8 | // CODE TYPE: RTL or Behavioral Level (Verilog) 9 | // DESCRIPTION: 2023 Spring IC Lab / Exercise Lab03 / SUBWAY 10 | // MODIFICATION HISTORY: 11 | // Date Description 12 | // 13 | 
/**************************************************************************/ 14 | `timescale 1ns/10ps 15 | 16 | `include "PATTERN.v" 17 | `ifdef RTL 18 | `include "SUBWAY.v" 19 | `endif 20 | `ifdef GATE 21 | `include "SUBWAY_SYN.v" 22 | `endif 23 | 24 | module TESTBED; 25 | 26 | wire clk, rst_n, in_valid; 27 | wire [1:0] init; 28 | wire [1:0] in0, in1, in2, in3; 29 | wire out_valid; 30 | wire [1:0] out; 31 | 32 | initial begin 33 | `ifdef RTL 34 | $fsdbDumpfile("SUBWAY.fsdb"); 35 | $fsdbDumpvars(0,"+mda"); 36 | `endif 37 | `ifdef GATE 38 | $sdf_annotate("SUBWAY_SYN.sdf", u_SUBWAY); 39 | $fsdbDumpfile("SUBWAY_SYN.fsdb"); 40 | $fsdbDumpvars(0,"+mda"); 41 | `endif 42 | end 43 | 44 | SUBWAY u_SUBWAY( 45 | .clk(clk), 46 | .rst_n(rst_n), 47 | .in_valid(in_valid), 48 | .init(init), 49 | .in0(in0), 50 | .in1(in1), 51 | .in2(in2), 52 | .in3(in3), 53 | .out_valid(out_valid), 54 | .out(out) 55 | ); 56 | 57 | PATTERN u_PATTERN( 58 | .clk(clk), 59 | .rst_n(rst_n), 60 | .in_valid(in_valid), 61 | .init(init), 62 | .in0(in0), 63 | .in1(in1), 64 | .in2(in2), 65 | .in3(in3), 66 | .out_valid(out_valid), 67 | .out(out) 68 | ); 69 | 70 | endmodule 71 | -------------------------------------------------------------------------------- /src/01_RTL/01_run: -------------------------------------------------------------------------------- 1 | irun TESTBED.v -define RTL -define FUNC -debug -incdir /usr/synthesis/dw/sim_ver/ ./1_RTL/ -notimingchecks -loadpli1 debpli:novas_pli_boot -------------------------------------------------------------------------------- /src/01_RTL/Compressor32.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/25/2022 10:34:02 AM 5 | // Module Name: Compressor32 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 
| // Dependencies: FullAdder.v 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: This is a 3:2 compressor, a.k.a. carry-save adder. 13 | // 14 | ////////////////////////////////////////////////////////////////////////////////// 15 | // Revision: 16 | // 09/12/2022 - Add BSD-3-Clause Licence 17 | // 18 | ////////////////////////////////////////////////////////////////////////////////// 19 | // License information: 20 | // 21 | // This software is released under the BSD-3-Clause Licence, 22 | // see https://opensource.org/licenses/BSD-3-Clause for details. 23 | // In the following license statements, "software" refers to the 24 | // "source code" of the complete hardware/software system. 25 | // 26 | // Copyright 2022, 27 | // Embedded Intelligent Systems Lab (EISL) 28 | // Department of Computer Science 29 | // National Yang Ming Chiao Tung University 30 | // Hsinchu, Taiwan. 31 | // 32 | // All rights reserved. 33 | // 34 | // Redistribution and use in source and binary forms, with or without 35 | // modification, are permitted provided that the following conditions are met: 36 | // 37 | // 1. Redistributions of source code must retain the above copyright notice, 38 | // this list of conditions and the following disclaimer. 39 | // 40 | // 2. Redistributions in binary form must reproduce the above copyright notice, 41 | // this list of conditions and the following disclaimer in the documentation 42 | // and/or other materials provided with the distribution. 43 | // 44 | // 3. Neither the name of the copyright holder nor the names of its contributors 45 | // may be used to endorse or promote products derived from this software 46 | // without specific prior written permission. 
47 | // 48 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 49 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 50 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 51 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 52 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 53 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 54 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 55 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 56 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 57 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 58 | // POSSIBILITY OF SUCH DAMAGE. 59 | ////////////////////////////////////////////////////////////////////////////////// 60 | 61 | 62 | 63 | module Compressor32 #( 64 | parameter XLEN = 49 65 | ) ( 66 | input [XLEN - 1 : 0] A_i, 67 | input [XLEN - 1 : 0] B_i, 68 | input [XLEN - 1 : 0] C_i, 69 | output [XLEN - 1 : 0] Sum_o, 70 | output [XLEN - 1 : 0] Carry_o 71 | ); 72 | 73 | generate 74 | genvar j; 75 | for(j = 0; j < XLEN; j = j+1)begin 76 | FullAdder FA( 77 | .augend_i(A_i[j]), 78 | .addend_i(B_i[j]), 79 | .carry_i(C_i[j]), 80 | .sum_o(Sum_o[j]), 81 | .carry_o(Carry_o[j]) 82 | ); 83 | 84 | end 85 | endgenerate 86 | 87 | endmodule 88 | -------------------------------------------------------------------------------- /src/01_RTL/Compressor42.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/25/2022 10:34:02 AM 5 | // Module Name: Compressor42 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // 
Dependencies: Compressor32.v 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Takes 4 partial sums as input and outputs a sum and a carry. 13 | // One possible implementation uses two 3:2 compressors; 14 | // it could also be mapped to a more efficient design. 15 | // 16 | ////////////////////////////////////////////////////////////////////////////////// 17 | // Revision: 18 | // 07/25/2022 - Compressor32 down32 module should take top_carry shift one bit to the left as input 19 | // 07/25/2022 - hidden_carry_msb wire added, collect overflow bits to suppress sign extension 20 | // 09/12/2022 - Add BSD-3-Clause Licence 21 | // 22 | ////////////////////////////////////////////////////////////////////////////////// 23 | // License information: 24 | // 25 | // This software is released under the BSD-3-Clause Licence, 26 | // see https://opensource.org/licenses/BSD-3-Clause for details. 27 | // In the following license statements, "software" refers to the 28 | // "source code" of the complete hardware/software system. 29 | // 30 | // Copyright 2022, 31 | // Embedded Intelligent Systems Lab (EISL) 32 | // Department of Computer Science 33 | // National Yang Ming Chiao Tung University 34 | // Hsinchu, Taiwan. 35 | // 36 | // All rights reserved. 37 | // 38 | // Redistribution and use in source and binary forms, with or without 39 | // modification, are permitted provided that the following conditions are met: 40 | // 41 | // 1. Redistributions of source code must retain the above copyright notice, 42 | // this list of conditions and the following disclaimer. 43 | // 44 | // 2. Redistributions in binary form must reproduce the above copyright notice, 45 | // this list of conditions and the following disclaimer in the documentation 46 | // and/or other materials provided with the distribution. 47 | // 48 | // 3. 
Neither the name of the copyright holder nor the names of its contributors 49 | // may be used to endorse or promote products derived from this software 50 | // without specific prior written permission. 51 | // 52 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 53 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 54 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 55 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 56 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 57 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 58 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 59 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 60 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 61 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 62 | // POSSIBILITY OF SUCH DAMAGE. 
63 | ////////////////////////////////////////////////////////////////////////////////// 64 | 65 | 66 | module Compressor42 #( 67 | parameter XLEN = 49 68 | ) ( 69 | input [XLEN - 1 : 0] A_i, 70 | input [XLEN - 1 : 0] B_i, 71 | input [XLEN - 1 : 0] C_i, 72 | input [XLEN - 1 : 0] D_i, 73 | output [XLEN - 1 : 0] Sum_o, 74 | output [XLEN - 1 : 0] Carry_o, 75 | output hidden_carry_msb); 76 | 77 | wire [XLEN - 1: 0] top_sum; 78 | wire [XLEN - 1: 0] top_carry; 79 | 80 | Compressor32 top32( 81 | .A_i(A_i), 82 | .B_i(B_i), 83 | .C_i(C_i), 84 | .Sum_o(top_sum), 85 | .Carry_o(top_carry) 86 | ); 87 | 88 | Compressor32 down32( 89 | .A_i(top_sum), 90 | .B_i({top_carry<<1}), 91 | .C_i(D_i), 92 | .Sum_o(Sum_o), 93 | .Carry_o(Carry_o) 94 | ); 95 | 96 | assign hidden_carry_msb = top_carry[XLEN - 1]; 97 | 98 | endmodule 99 | -------------------------------------------------------------------------------- /src/01_RTL/EACAdder.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/29/2022 10:40:06 AM 5 | // Module Name: EACAdder 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: An adder outputs a positive magnitude result and preferably 13 | // only need to conditionally complement one operand 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 07/29/2022 - I/O port names renamed with correct suffix 18 | // 08/06/2022 - Add A_Zero_i signal to detect A is -0, in order to avoid false end round carry 19 | // 09/12/2022 - Add BSD-3-Clause Licence 20 | // 21 | 
////////////////////////////////////////////////////////////////////////////////// 22 | // License information: 23 | // 24 | // This software is released under the BSD-3-Clause Licence, 25 | // see https://opensource.org/licenses/BSD-3-Clause for details. 26 | // In the following license statements, "software" refers to the 27 | // "source code" of the complete hardware/software system. 28 | // 29 | // Copyright 2022, 30 | // Embedded Intelligent Systems Lab (EISL) 31 | // Department of Computer Science 32 | // National Yang Ming Chiao Tung University 33 | // Hsinchu, Taiwan. 34 | // 35 | // All rights reserved. 36 | // 37 | // Redistribution and use in source and binary forms, with or without 38 | // modification, are permitted provided that the following conditions are met: 39 | // 40 | // 1. Redistributions of source code must retain the above copyright notice, 41 | // this list of conditions and the following disclaimer. 42 | // 43 | // 2. Redistributions in binary form must reproduce the above copyright notice, 44 | // this list of conditions and the following disclaimer in the documentation 45 | // and/or other materials provided with the distribution. 46 | // 47 | // 3. Neither the name of the copyright holder nor the names of its contributors 48 | // may be used to endorse or promote products derived from this software 49 | // without specific prior written permission. 50 | // 51 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 52 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 53 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 54 | // ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 55 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 56 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 57 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 58 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 59 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 60 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 61 | // POSSIBILITY OF SUCH DAMAGE. 62 | ////////////////////////////////////////////////////////////////////////////////// 63 | 64 | 65 | module EACAdder #( 66 | parameter PARM_MANT = 23 67 | ) ( 68 | input [2*PARM_MANT + 1 : 0] CSA_sum_i, 69 | input [2*PARM_MANT + 1 : 0] CSA_carry_i, 70 | input Carry_postcor_i, 71 | input Sub_Sign_i, 72 | input A_Zero_i, 73 | 74 | output [2*PARM_MANT + 1 : 0] low_sum_o, 75 | output low_carry_o, 76 | output [2*PARM_MANT + 1 : 0] low_sum_inv_o, 77 | output low_carry_inv_o); 78 | 79 | wire end_round_carry = Sub_Sign_i & (~A_Zero_i); 80 | assign {low_carry_o, low_sum_o} = CSA_sum_i + {Carry_postcor_i, CSA_carry_i[2*PARM_MANT : 0], end_round_carry}; 81 | assign {low_carry_inv_o, low_sum_inv_o} = 2'b10 + {1'b1, ~CSA_sum_i} + {~Carry_postcor_i, ~CSA_carry_i[2*PARM_MANT : 0], ~end_round_carry}; 82 | 83 | endmodule 84 | -------------------------------------------------------------------------------- /src/01_RTL/FullAdder.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/22/2022 10:13:32 AM 5 | // Module Name: FullAdder 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | 
////////////////////////////////////////////////////////////////////////////////// 12 | // Description: A full adder module with 3 inputs and 2 outputs 13 | // 14 | ////////////////////////////////////////////////////////////////////////////////// 15 | // Revision: 16 | // 07/25/2022 - Output ports naming inconsistent with definition, bug fixed 17 | // 09/12/2022 - Add BSD-3-Clause Licence 18 | // 19 | ////////////////////////////////////////////////////////////////////////////////// 20 | // License information: 21 | // 22 | // This software is released under the BSD-3-Clause Licence, 23 | // see https://opensource.org/licenses/BSD-3-Clause for details. 24 | // In the following license statements, "software" refers to the 25 | // "source code" of the complete hardware/software system. 26 | // 27 | // Copyright 2022, 28 | // Embedded Intelligent Systems Lab (EISL) 29 | // Department of Computer Science 30 | // National Yang Ming Chiao Tung University 31 | // Hsinchu, Taiwan. 32 | // 33 | // All rights reserved. 34 | // 35 | // Redistribution and use in source and binary forms, with or without 36 | // modification, are permitted provided that the following conditions are met: 37 | // 38 | // 1. Redistributions of source code must retain the above copyright notice, 39 | // this list of conditions and the following disclaimer. 40 | // 41 | // 2. Redistributions in binary form must reproduce the above copyright notice, 42 | // this list of conditions and the following disclaimer in the documentation 43 | // and/or other materials provided with the distribution. 44 | // 45 | // 3. Neither the name of the copyright holder nor the names of its contributors 46 | // may be used to endorse or promote products derived from this software 47 | // without specific prior written permission. 
48 | // 49 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 50 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 51 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 52 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 53 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 54 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 55 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 56 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 57 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 58 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 59 | // POSSIBILITY OF SUCH DAMAGE. 60 | ////////////////////////////////////////////////////////////////////////////////// 61 | 62 | 63 | module FullAdder( 64 | input augend_i, 65 | input addend_i, 66 | input carry_i, 67 | output sum_o, 68 | output carry_o); 69 | 70 | assign sum_o = augend_i ^ addend_i ^ carry_i; 71 | assign carry_o = (augend_i & addend_i) | (addend_i & carry_i) | (carry_i & augend_i); 72 | 73 | endmodule 74 | -------------------------------------------------------------------------------- /src/01_RTL/LeadingOneDetector_Top.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/29/2022 11:01:00 PM 5 | // Module Name: LeadingOneDetector_Top 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: ZeroDetector_Base.v 10 | // ZeroDetector_Group.v 11 | // 12 | ////////////////////////////////////////////////////////////////////////////////// 13 | // Description: It detects the 
shifting amount needed for a leading one 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 07/29/2022 - Mux simplification, combine one else if clause into else clause 18 | // 09/12/2022 - Add BSD-3-Clause Licence 19 | // 20 | ////////////////////////////////////////////////////////////////////////////////// 21 | // License information: 22 | // 23 | // This software is released under the BSD-3-Clause Licence, 24 | // see https://opensource.org/licenses/BSD-3-Clause for details. 25 | // In the following license statements, "software" refers to the 26 | // "source code" of the complete hardware/software system. 27 | // 28 | // Copyright 2022, 29 | // Embedded Intelligent Systems Lab (EISL) 30 | // Department of Computer Science 31 | // National Yang Ming Chiao Tung University 32 | // Hsinchu, Taiwan. 33 | // 34 | // All rights reserved. 35 | // 36 | // Redistribution and use in source and binary forms, with or without 37 | // modification, are permitted provided that the following conditions are met: 38 | // 39 | // 1. Redistributions of source code must retain the above copyright notice, 40 | // this list of conditions and the following disclaimer. 41 | // 42 | // 2. Redistributions in binary form must reproduce the above copyright notice, 43 | // this list of conditions and the following disclaimer in the documentation 44 | // and/or other materials provided with the distribution. 45 | // 46 | // 3. Neither the name of the copyright holder nor the names of its contributors 47 | // may be used to endorse or promote products derived from this software 48 | // without specific prior written permission. 49 | // 50 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 51 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 52 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 53 | // ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 54 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 55 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 56 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 57 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 58 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 59 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 60 | // POSSIBILITY OF SUCH DAMAGE. 61 | ////////////////////////////////////////////////////////////////////////////////// 62 | 63 | 64 | module LeadingOneDetector_Top #( 65 | parameter X_LEN = 74, 66 | parameter PARM_SHIFTZERO = $clog2(X_LEN) 67 | ) ( 68 | input [X_LEN - 1 : 0] data_i, 69 | 70 | output reg [PARM_SHIFTZERO - 1 : 0] shift_num_o, 71 | output allzero_o ); 72 | 73 | 74 | wire [7:0] base_zeros; 75 | generate 76 | genvar i; 77 | for(i = 0; i < 8; i = i+1)begin 78 | ZeroDetector_Base #(8) lzd_base( 79 | .base_data_i(data_i[(72 - i*8) -: 8]), 80 | .zero_o(base_zeros[i]) 81 | ); 82 | end 83 | endgenerate 84 | 85 | wire [3:0] lv1_zeros; 86 | generate 87 | genvar j; 88 | for (j = 0; j < 4; j = j+1) begin 89 | ZeroDetector_Group #(2) lzd_grouplv1( 90 | .group_data_i(base_zeros[j*2 +:2]), 91 | .group_zero_o(lv1_zeros[j]) 92 | ); 93 | end 94 | endgenerate 95 | 96 | wire [1:0] lv2_zeros; 97 | ZeroDetector_Group #(2) lzd_grouplv2_0( 98 | .group_data_i(lv1_zeros[1:0]), 99 | .group_zero_o(lv2_zeros[0]) 100 | ); 101 | 102 | ZeroDetector_Group #(2) lzd_grouplv2_1( 103 | .group_data_i(lv1_zeros[3:2]), 104 | .group_zero_o(lv2_zeros[1]) 105 | ); 106 | 107 | wire lv3_zeros; 108 | ZeroDetector_Group #(2) lzd_grouplv3( 109 | .group_data_i(lv2_zeros), 110 | .group_zero_o(lv3_zeros) 111 | ); 112 | 113 | wire left_zero = (data_i[8:0] == 9'd0); 114 | 115 | 116 | //output logic 117 | assign allzero_o = lv3_zeros & left_zero; 118 | 119 | always @(*) 
begin 120 | if(lv3_zeros)begin 121 | if(data_i[8]) shift_num_o = 64; 122 | else if(data_i[7]) shift_num_o = 65; 123 | else if(data_i[6]) shift_num_o = 66; 124 | else if(data_i[5]) shift_num_o = 67; 125 | else if(data_i[4]) shift_num_o = 68; 126 | else if(data_i[3]) shift_num_o = 69; 127 | else if(data_i[2]) shift_num_o = 70; 128 | else if(data_i[1]) shift_num_o = 71; 129 | else shift_num_o = 72; //when all zero or data_i[0] 130 | 131 | end 132 | else begin //1 appears in 72 : 9 133 | if(lv2_zeros[0])begin // 1 appears in 40 : 9 134 | if(lv1_zeros[2])begin // 1 appears in 24 : 9 135 | if(base_zeros[6])begin // 1 appears in 16 : 9 136 | 137 | if(data_i[16]) shift_num_o = 56; 138 | else if(data_i[15]) shift_num_o = 57; 139 | else if(data_i[14]) shift_num_o = 58; 140 | else if(data_i[13]) shift_num_o = 59; 141 | else if(data_i[12]) shift_num_o = 60; 142 | else if(data_i[11]) shift_num_o = 61; 143 | else if(data_i[10]) shift_num_o = 62; 144 | else shift_num_o = 63; //data_i[9] 145 | end 146 | else begin // 1 appears in 24 : 17 147 | 148 | if(data_i[24]) shift_num_o = 48; 149 | else if(data_i[23]) shift_num_o = 49; 150 | else if(data_i[22]) shift_num_o = 50; 151 | else if(data_i[21]) shift_num_o = 51; 152 | else if(data_i[20]) shift_num_o = 52; 153 | else if(data_i[19]) shift_num_o = 53; 154 | else if(data_i[18]) shift_num_o = 54; 155 | else shift_num_o = 55; // data_i[17] 156 | end 157 | end 158 | else begin // 1 appears in 40 : 25 159 | if(base_zeros[4])begin // 1 appears in 32 : 25 160 | 161 | if(data_i[32]) shift_num_o = 40; 162 | else if(data_i[31]) shift_num_o = 41; 163 | else if(data_i[30]) shift_num_o = 42; 164 | else if(data_i[29]) shift_num_o = 43; 165 | else if(data_i[28]) shift_num_o = 44; 166 | else if(data_i[27]) shift_num_o = 45; 167 | else if(data_i[26]) shift_num_o = 46; 168 | else shift_num_o = 47; //data_i[25] 169 | end 170 | else begin // 1 appears in 40 : 33 171 | 172 | if(data_i[40]) shift_num_o = 32; 173 | else if(data_i[39]) shift_num_o = 33; 174 
| else if(data_i[38]) shift_num_o = 34; 175 | else if(data_i[37]) shift_num_o = 35; 176 | else if(data_i[36]) shift_num_o = 36; 177 | else if(data_i[35]) shift_num_o = 37; 178 | else if(data_i[34]) shift_num_o = 38; 179 | else shift_num_o = 39; // data_i[33] 180 | end 181 | end 182 | end 183 | else begin //1 in 72 : 41 184 | if(lv1_zeros[0])begin //1 appears in 56 : 41 185 | if(base_zeros[2])begin // 1 appears in 48 : 41 186 | 187 | if(data_i[48]) shift_num_o = 24; 188 | else if(data_i[47]) shift_num_o = 25; 189 | else if(data_i[46]) shift_num_o = 26; 190 | else if(data_i[45]) shift_num_o = 27; 191 | else if(data_i[44]) shift_num_o = 28; 192 | else if(data_i[43]) shift_num_o = 29; 193 | else if(data_i[42]) shift_num_o = 30; 194 | else shift_num_o = 31; // data_i[41] 195 | end 196 | else begin // 1 appears in 56 : 49 197 | 198 | if(data_i[56]) shift_num_o = 16; 199 | else if(data_i[55]) shift_num_o = 17; 200 | else if(data_i[54]) shift_num_o = 18; 201 | else if(data_i[53]) shift_num_o = 19; 202 | else if(data_i[52]) shift_num_o = 20; 203 | else if(data_i[51]) shift_num_o = 21; 204 | else if(data_i[50]) shift_num_o = 22; 205 | else shift_num_o = 23; // data_i[49] 206 | end 207 | 208 | end 209 | else begin // 1 appears in 72 : 57 210 | if(base_zeros[0])begin // 1 appears in 64 : 57 211 | 212 | if(data_i[64]) shift_num_o = 8; 213 | else if(data_i[63]) shift_num_o = 9; 214 | else if(data_i[62]) shift_num_o = 10; 215 | else if(data_i[61]) shift_num_o = 11; 216 | else if(data_i[60]) shift_num_o = 12; 217 | else if(data_i[59]) shift_num_o = 13; 218 | else if(data_i[58]) shift_num_o = 14; 219 | else shift_num_o = 15; // data_i[57] 220 | end 221 | else begin // 1 appears in 72 : 65 222 | 223 | if(data_i[72]) shift_num_o = 0; 224 | else if(data_i[71]) shift_num_o = 1; 225 | else if(data_i[70]) shift_num_o = 2; 226 | else if(data_i[69]) shift_num_o = 3; 227 | else if(data_i[68]) shift_num_o = 4; 228 | else if(data_i[67]) shift_num_o = 5; 229 | else if(data_i[66]) shift_num_o = 
6; 230 | else shift_num_o = 7; // data_i[65] 231 | 232 | end 233 | end 234 | end 235 | end 236 | end 237 | 238 | 239 | endmodule 240 | -------------------------------------------------------------------------------- /src/01_RTL/MAC32_top.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/21/2022 03:34:32 PM 5 | // Module Name: MAC32_top 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: SpecialCaseDetector.v 10 | // R4Booth.v 11 | // WallaceTree.v 12 | // PreNormalizer.v 13 | // Compressor32.v 14 | // EACAdder.v 15 | // MSBIncrementer.v 16 | // LeadingOneDetector_Top.v 17 | // Normalizer.v 18 | // Rounder.v 19 | ////////////////////////////////////////////////////////////////////////////////// 20 | // Description: 21 | // 22 | ////////////////////////////////////////////////////////////////////////////////// 23 | // Revision: 24 | // 08/12/2022 - Update mv_halt signal, now zero is viewed as the smallest denormalized number. 25 | // 08/14/2022 - Stable non-pipelined build (v1.0) 26 | // 08/15/2022 - R4Booth and Wallace Tree update 27 | // 08/16/2022 - Instantiation name start with UpperCase 28 | // 09/12/2022 - Add BSD-3-Clause Licence 29 | // 30 | ////////////////////////////////////////////////////////////////////////////////// 31 | // License information: 32 | // 33 | // This software is released under the BSD-3-Clause Licence, 34 | // see https://opensource.org/licenses/BSD-3-Clause for details. 35 | // In the following license statements, "software" refers to the 36 | // "source code" of the complete hardware/software system. 
37 | // 38 | // Copyright 2022, 39 | // Embedded Intelligent Systems Lab (EISL) 40 | // Department of Computer Science 41 | // National Yang Ming Chiao Tung University 42 | // Hsinchu, Taiwan. 43 | // 44 | // All rights reserved. 45 | // 46 | // Redistribution and use in source and binary forms, with or without 47 | // modification, are permitted provided that the following conditions are met: 48 | // 49 | // 1. Redistributions of source code must retain the above copyright notice, 50 | // this list of conditions and the following disclaimer. 51 | // 52 | // 2. Redistributions in binary form must reproduce the above copyright notice, 53 | // this list of conditions and the following disclaimer in the documentation 54 | // and/or other materials provided with the distribution. 55 | // 56 | // 3. Neither the name of the copyright holder nor the names of its contributors 57 | // may be used to endorse or promote products derived from this software 58 | // without specific prior written permission. 59 | // 60 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 61 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 62 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 63 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 64 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 65 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 66 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 67 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 68 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 69 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 70 | // POSSIBILITY OF SUCH DAMAGE. 
71 | ////////////////////////////////////////////////////////////////////////////////// 72 | // Additional Comments: 73 | // 74 | // Floating-point control and status register: 75 | // |31 8|7 5|4 0| 76 | // |reserved| Rounding Mode (frm) | Accrued Exceptions (fflags) | 77 | // NV DZ OF UF NX 78 | // 79 | // Rounding mode encoding: 80 | // Rounding Mode| Mnemonic | Meaning 81 | // ------------------------------------------------------------------------------------------- 82 | // 000 | RNE | Round to Nearest, ties to Even 83 | // 001 | RTZ | Round towards Zero 84 | // 010 | RDN | Round Down (towards -INFINITY) 85 | // 011 | RUP | Round UP (towards +INFINITY) 86 | // 100 | RMM | Round to Nearest, ties Max Magnitude 87 | // 101 | --- | Invalid. Reserved for future use 88 | // 110 | --- | Invalid. Reserved for future use 89 | // 111 | DYN | In instruction's rm field, selects dynamic rounding mode; 90 | // In Rounding Mode register, Invalid 91 | // 92 | // Accrued exception flag encoding: 93 | // Flag Mnemonic | Flag Meaning 94 | // -------------------------------------- 95 | // NV | Invalid Operation 96 | // DZ | Divide by Zero 97 | // OF | Overflow 98 | // UF | Underflow 99 | // NX | Inexact 100 | // 101 | ////////////////////////////////////////////////////////////////////////////////// 102 | 103 | 104 | module MAC32_top #( 105 | parameter PARM_RM = 3, 106 | parameter PARM_XLEN = 32, 107 | parameter PARM_RM_RNE = 3'b000, 108 | parameter PARM_RM_RTZ = 3'b001, 109 | parameter PARM_RM_RDN = 3'b010, 110 | parameter PARM_RM_RUP = 3'b011, 111 | parameter PARM_RM_RMM = 3'b100 112 | ) ( 113 | //input clk_i, 114 | //input rst_i, 115 | //input stall_i, 116 | //input req_i, 117 | 118 | input [PARM_RM - 1 : 0] Rounding_mode_i, 119 | 120 | input [PARM_XLEN - 1 : 0] A_i, 121 | input [PARM_XLEN - 1 : 0] B_i, 122 | input [PARM_XLEN - 1 : 0] C_i, 123 | 124 | 125 | output [PARM_XLEN - 1 : 0] Result_o, // T (result_o) = A + (B * C) 126 | //output ready_o, 127 | 128 | //Accrued 
exceptions (fflags) 129 | output NV_o, 130 | //output DZ_o, //would not occur in Multiplication or Addition 131 | output OF_o, 132 | output UF_o, 133 | output NX_o ); 134 | 135 | 136 | parameter PARM_EXP = 8; 137 | parameter PARM_MANT = 23; 138 | parameter PARM_BIAS = 127; 139 | parameter PARM_LEADONE_WIDTH = 7; 140 | parameter PARM_EXP_ONE = 8'h01; 141 | parameter PARM_MANT_NAN = 23'b100_0000_0000_0000_0000_0000; //RISC-V defines canonical NaN to be 0x7fc0_0000 142 | 143 | 144 | //inputs wires of specialCaseDetectors 145 | wire A_Leadingbit = | A_i[PARM_XLEN - 2 : PARM_MANT]; //normalized number has leading 1, denormalized with leading 0 146 | wire B_Leadingbit = | B_i[PARM_XLEN - 2 : PARM_MANT]; 147 | wire C_Leadingbit = | C_i[PARM_XLEN - 2 : PARM_MANT]; 148 | //outputs wires of specialCaseDetectors 149 | wire A_Inf, B_Inf, C_Inf; 150 | wire A_Zero, B_Zero, C_Zero; 151 | wire A_NaN, B_NaN, C_NaN; 152 | wire A_DeN, B_DeN, C_DeN; 153 | 154 | 155 | SpecialCaseDetector #( 156 | .PARM_XLEN(PARM_XLEN), 157 | .PARM_EXP(PARM_EXP), 158 | .PARM_MANT(PARM_MANT) 159 | ) SpecialCaseDetector ( 160 | .A_i(A_i), 161 | .B_i(B_i), 162 | .C_i(C_i), 163 | .A_Leadingbit_i(A_Leadingbit), 164 | .B_Leadingbit_i(B_Leadingbit), 165 | .C_Leadingbit_i(C_Leadingbit), 166 | 167 | .A_Inf_o(A_Inf), 168 | .B_Inf_o(B_Inf), 169 | .C_Inf_o(C_Inf), 170 | .A_Zero_o(A_Zero), 171 | .B_Zero_o(B_Zero), 172 | .C_Zero_o(C_Zero), 173 | .A_NaN_o(A_NaN), 174 | .B_NaN_o(B_NaN), 175 | .C_NaN_o(C_NaN), 176 | .A_DeN_o(A_DeN), 177 | .B_DeN_o(B_DeN), 178 | .C_DeN_o(C_DeN) 179 | ); 180 | 181 | 182 | wire A_Sign = A_i[PARM_XLEN - 1]; 183 | wire B_Sign = B_i[PARM_XLEN - 1]; 184 | wire C_Sign = C_i[PARM_XLEN - 1]; 185 | wire Sub_Sign = A_Sign ^ B_Sign ^ C_Sign; // indicator of effective subtraction 186 | 187 | //denormalized number has exponent 1 188 | wire [PARM_EXP - 1: 0] A_Exp = A_DeN? PARM_EXP_ONE : A_i[PARM_XLEN - 2 : PARM_MANT]; 189 | wire [PARM_EXP - 1: 0] B_Exp = B_DeN? 
PARM_EXP_ONE : B_i[PARM_XLEN - 2 : PARM_MANT]; 190 | wire [PARM_EXP - 1: 0] C_Exp = C_DeN? PARM_EXP_ONE : C_i[PARM_XLEN - 2 : PARM_MANT]; 191 | 192 | wire [PARM_MANT : 0] A_Mant = {A_Leadingbit, A_i[PARM_MANT - 1 : 0]}; 193 | wire [PARM_MANT : 0] B_Mant = {B_Leadingbit, B_i[PARM_MANT - 1 : 0]}; 194 | wire [PARM_MANT : 0] C_Mant = {C_Leadingbit, C_i[PARM_MANT - 1 : 0]}; 195 | 196 | //Generate 13 Partial Product by Radix-4 Booth's Algorithm 197 | wire [2*PARM_MANT + 2 : 0] booth_PP [12 - 1: 0]; 198 | wire [2*PARM_MANT + 1 : 0] booth_PP_13; //Partial Product's MSB is always 0 199 | 200 | 201 | R4Booth #( 202 | .PARM_MANT(PARM_MANT) 203 | ) R4Booth ( 204 | .MantA_i(B_Mant), 205 | .MantB_i(C_Mant), 206 | 207 | .pp_00_o(booth_PP[ 0]), 208 | .pp_01_o(booth_PP[ 1]), 209 | .pp_02_o(booth_PP[ 2]), 210 | .pp_03_o(booth_PP[ 3]), 211 | .pp_04_o(booth_PP[ 4]), 212 | .pp_05_o(booth_PP[ 5]), 213 | .pp_06_o(booth_PP[ 6]), 214 | .pp_07_o(booth_PP[ 7]), 215 | .pp_08_o(booth_PP[ 8]), 216 | .pp_09_o(booth_PP[ 9]), 217 | .pp_10_o(booth_PP[10]), 218 | .pp_11_o(booth_PP[11]), 219 | .pp_12_o(booth_PP_13) 220 | ); 221 | 222 | 223 | //Sum 13 partial Product by Wallace Tree 224 | wire [2*PARM_MANT + 2 : 0] Wallace_sum; 225 | wire [2*PARM_MANT + 2 : 0] Wallace_carry; 226 | wire Wallace_suppression_sign_extension; 227 | 228 | 229 | WallaceTree #( 230 | .PARM_MANT(PARM_MANT) 231 | ) WallaceTree ( 232 | .pp_00_i(booth_PP[ 0]), 233 | .pp_01_i(booth_PP[ 1]), 234 | .pp_02_i(booth_PP[ 2]), 235 | .pp_03_i(booth_PP[ 3]), 236 | .pp_04_i(booth_PP[ 4]), 237 | .pp_05_i(booth_PP[ 5]), 238 | .pp_06_i(booth_PP[ 6]), 239 | .pp_07_i(booth_PP[ 7]), 240 | .pp_08_i(booth_PP[ 8]), 241 | .pp_09_i(booth_PP[ 9]), 242 | .pp_10_i(booth_PP[10]), 243 | .pp_11_i(booth_PP[11]), 244 | .pp_12_i(booth_PP_13), 245 | 246 | .wallace_sum_o(Wallace_sum), 247 | .wallace_carry_o(Wallace_carry), 248 | .suppression_sign_extension_o(Wallace_suppression_sign_extension) 249 | ); 250 | 251 | 252 | //Prenormalization of the augend, in 
parallel with multiplication. 253 | //global signals ... 254 | wire Sign_aligned; 255 | wire Exp_mv_sign; 256 | wire Mv_halt; 257 | 258 | //Exponent Processor 259 | //d = expA - (expB + expC - bias[127]) 260 | //mv = 27 - d = expB + expC - expA - 100 261 | 262 | wire [PARM_EXP + 1 : 0] Exp_mv = 27 - A_Exp + B_Exp + C_Exp - PARM_BIAS; // d = expA - (expB + expC - 127), mv = 27 - d 263 | wire [PARM_EXP + 1 : 0] Exp_mv_neg = -27 + A_Exp - B_Exp - C_Exp + PARM_BIAS; 264 | 265 | assign Exp_mv_sign = Exp_mv[PARM_EXP + 1]; // the sign bit of the mv parameter, Sign_amt_DO 266 | 267 | //Revision 2.00 - Update mv_halt signal, now zero is viewed as the smallest denormalized number. 268 | //right shift(+) is out of range, which is 74 or more 269 | assign Mv_halt = ((~Exp_mv_sign) & (Exp_mv[PARM_EXP : 0] > 73))|| A_Zero; 270 | 271 | //signals for prenormalizer: 272 | wire SignFlip_ADD_PRN; 273 | 274 | wire [3*PARM_MANT + 5 : 0] A_Mant_aligned; 275 | wire [PARM_MANT + 3 : 0] A_Mant_aligned_high = A_Mant_aligned[3*PARM_MANT + 5 : 2*PARM_MANT + 2]; 276 | wire [2*PARM_MANT + 1 : 0] A_Mant_aligned_low = A_Mant_aligned[2*PARM_MANT + 1 : 0]; 277 | 278 | wire signed [PARM_EXP + 1 : 0] Exp_aligned; 279 | wire Mant_sticky_sht_out; 280 | 281 | 282 | PreNormalizer #( 283 | .PARM_EXP(PARM_EXP), 284 | .PARM_MANT(PARM_MANT), 285 | .PARM_BIAS(PARM_BIAS) 286 | ) PreNormalizer ( 287 | .A_sign_i(A_Sign), 288 | .B_sign_i(B_Sign), 289 | .C_sign_i(C_Sign), 290 | .Sub_Sign_i(Sub_Sign), 291 | .A_Exp_i(A_Exp), 292 | .B_Exp_i(B_Exp), 293 | .C_Exp_i(C_Exp), 294 | .A_Mant_i(A_Mant), 295 | .Sign_flip_i(SignFlip_ADD_PRN), 296 | .Mv_halt_i(Mv_halt), 297 | .Exp_mv_i(Exp_mv), 298 | .Exp_mv_sign_i(Exp_mv_sign), 299 | 300 | .A_Mant_aligned_o(A_Mant_aligned), 301 | .Exp_aligned_o(Exp_aligned), 302 | .Sign_aligned_o(Sign_aligned), 303 | .Mant_sticky_sht_out_o(Mant_sticky_sht_out) 304 | ); 305 | 306 | 307 | //adjust wallace sum to send in... 
308 | wire [2*PARM_MANT + 2 : 0] Wallace_sum_adjusted; 309 | wire [2*PARM_MANT + 2 : 0] Wallace_carry_adjusted; 310 | 311 | assign Wallace_sum_adjusted = (Exp_mv_sign)? 0 : Wallace_sum; 312 | assign Wallace_carry_adjusted = (Exp_mv_sign) ? 0 : Wallace_carry; 313 | 314 | //Sums the Wallace outputs with A_Low 315 | wire [2*PARM_MANT + 1 : 0] CSA_sum; 316 | wire [2*PARM_MANT + 1 : 0] CSA_carry; 317 | 318 | Compressor32 #( 319 | .XLEN(2*PARM_MANT + 2) 320 | ) CarrySaveAdder ( 321 | .A_i(A_Mant_aligned_low), //A_low 322 | .B_i(Wallace_sum_adjusted[2*PARM_MANT + 1 : 0]), 323 | .C_i({Wallace_carry_adjusted[2*PARM_MANT : 0], 1'b0}), 324 | 325 | .Sum_o(CSA_sum), 326 | .Carry_o(CSA_carry) 327 | ); 328 | 329 | //the correction based on sign extension is also done in the grand adder. 330 | //output signals 331 | reg [73 : 0] PosSum; 332 | wire Minus_sticky_bit; 333 | 334 | wire Adder_sign; //global signal for Sign_out_D 335 | 336 | //End Around Carry Adders, LSBs 337 | 338 | wire wallace_msb_G = Wallace_sum_adjusted[2*PARM_MANT + 2] & Wallace_carry_adjusted[2*PARM_MANT + 1]; 339 | //if Wallace's msb is 1, or will carry to 1 340 | wire adder_Correlated_sign = Wallace_suppression_sign_extension | Wallace_carry_adjusted[2*PARM_MANT + 2] | wallace_msb_G; 341 | wire Carry_postcor = (~Exp_mv_sign) & ((~adder_Correlated_sign) ^ CSA_carry[2*PARM_MANT + 1]); 342 | 343 | wire [2*PARM_MANT + 1 : 0] low_sum; 344 | wire low_carry; 345 | wire [2*PARM_MANT + 1 : 0] low_sum_inv; 346 | wire low_carry_inv; 347 | 348 | 349 | EACAdder #( 350 | .PARM_MANT(PARM_MANT) 351 | ) EACAdder ( 352 | .CSA_sum_i(CSA_sum), 353 | .CSA_carry_i(CSA_carry), 354 | .Carry_postcor_i(Carry_postcor), 355 | .Sub_Sign_i(Sub_Sign), 356 | .A_Zero_i(A_Zero),//This is added to deal with false Sub_Sign_i(If a is -0) 357 | 358 | .low_sum_o(low_sum), 359 | .low_carry_o(low_carry), 360 | .low_sum_inv_o(low_sum_inv), 361 | .low_carry_inv_o(low_carry_inv) 362 | ); 363 | 364 | 365 | //Incrementer, Work on MSBs 366 | wire [PARM_MANT + 3 : 
0]high_sum; 367 | wire [PARM_MANT + 3 : 0]high_sum_inv; 368 | 369 | 370 | MSBIncrementer #( 371 | .PARM_MANT(PARM_MANT) 372 | ) MSBIncrementer ( 373 | .low_carry_i(low_carry), 374 | .low_carry_inv_i(low_carry_inv), 375 | .A_Mant_aligned_high_i(A_Mant_aligned_high), 376 | 377 | .high_sum_o(high_sum), 378 | .high_sum_inv_o(high_sum_inv) 379 | ); 380 | 381 | 382 | wire bc_not_strange = ~(B_Inf | C_Inf | B_Zero | C_Zero | B_NaN | C_NaN); 383 | wire [3*PARM_MANT + 4 : 0] sub_minus = {{A_Mant_aligned_high[PARM_MANT+2 : 0], 1'b0} - bc_not_strange, 47'd0}; 384 | 385 | //Output of the Adder stage... 386 | assign SignFlip_ADD_PRN = high_sum[PARM_MANT + 3]; 387 | assign Adder_sign = Exp_mv_sign? Sign_aligned: (SignFlip_ADD_PRN ^ Sign_aligned); 388 | 389 | always @(*) begin 390 | if(Mv_halt) 391 | PosSum = {{26'd0}, low_sum}; 392 | else if(Exp_mv_sign) //b*c does not participate 393 | PosSum = Sub_Sign? sub_minus : {A_Mant_aligned_high[PARM_MANT+2 : 0], 48'd0}; 394 | else if(SignFlip_ADD_PRN) 395 | PosSum = {high_sum_inv[PARM_MANT + 2 : 0], low_sum_inv}; 396 | else 397 | PosSum = {high_sum[PARM_MANT + 2 : 0], low_sum}; 398 | end 399 | 400 | 401 | // for Sign_amt_DI=1'b1, it is difficult to compute combined with other cases. 402 | // For addition, | (b*c); for subtraction, | (b*c), for rounding except truncation. 
403 | assign Minus_sticky_bit = Exp_mv_sign && (bc_not_strange); 404 | 405 | //leading one anticipator, detects the shift amount necessary for normalization 406 | wire [PARM_LEADONE_WIDTH - 1 : 0] shift_num; 407 | wire allzero; 408 | 409 | 410 | LeadingOneDetector_Top #( 411 | .X_LEN(74) 412 | ) LeadingOneDetector ( 413 | .data_i(PosSum), 414 | 415 | .shift_num_o(shift_num), 416 | .allzero_o(allzero) 417 | ); 418 | 419 | 420 | //Shift the exponent according to the result of LeadingOneDetector 421 | wire [3*PARM_MANT + 4 : 0] Mant_norm; 422 | wire [PARM_EXP + 1 : 0] Exp_norm; 423 | wire [PARM_EXP + 1 : 0] Exp_norm_mone; 424 | wire [PARM_EXP + 1 : 0] Exp_max_rs; 425 | wire [3*PARM_MANT + 6 : 0] Rs_Mant; 426 | 427 | Normalizer #( 428 | .PARM_EXP(PARM_EXP), 429 | .PARM_MANT(PARM_MANT), 430 | .PARM_LEADONE_WIDTH(PARM_LEADONE_WIDTH) 431 | ) Normalizer ( 432 | .Mant_i(PosSum), 433 | .Exp_i(Exp_aligned), 434 | .Shift_num_i(shift_num), 435 | .Exp_mv_sign_i(Exp_mv_sign), 436 | 437 | .Mant_norm_o(Mant_norm), 438 | .Exp_norm_o(Exp_norm), 439 | .Exp_norm_mone_o(Exp_norm_mone), 440 | .Exp_max_rs_o(Exp_max_rs), 441 | .Rs_Mant_o(Rs_Mant) 442 | ); 443 | 444 | wire Sign_result; 445 | wire [PARM_EXP - 1 : 0] Exp_result; 446 | wire [PARM_MANT - 1 : 0] Mant_result; 447 | 448 | assign Result_o = {Sign_result, Exp_result, Mant_result}; //outputlogic 449 | 450 | Rounder #( 451 | .PARM_RM(PARM_RM), 452 | .PARM_RM_RNE(PARM_RM_RNE), 453 | .PARM_RM_RTZ(PARM_RM_RTZ), 454 | .PARM_RM_RDN(PARM_RM_RDN), 455 | .PARM_RM_RUP(PARM_RM_RUP), 456 | .PARM_RM_RMM(PARM_RM_RMM), 457 | .PARM_MANT_NAN(PARM_MANT_NAN), 458 | .PARM_EXP(PARM_EXP), 459 | .PARM_MANT(PARM_MANT), 460 | .PARM_LEADONE_WIDTH(PARM_LEADONE_WIDTH) 461 | ) Rounder ( 462 | .Exp_i(Exp_aligned), 463 | .Sign_i(Adder_sign), 464 | .Allzero_i(allzero), 465 | .Exp_mv_sign_i(Exp_mv_sign), 466 | .Sub_Sign_i(Sub_Sign), 467 | .A_Exp_raw_i(A_i[PARM_XLEN - 2 : PARM_MANT]), // This is different from A_Exp, since we would like the "raw" bits 468 | 
.Rounding_mode_i(Rounding_mode_i), 469 | .A_Mant_i(A_Mant), 470 | .A_Sign_i(A_Sign), 471 | .B_Sign_i(B_Sign), 472 | .C_Sign_i(C_Sign), 473 | .A_DeN_i(A_DeN), 474 | .A_Inf_i(A_Inf), 475 | .B_Inf_i(B_Inf), 476 | .C_Inf_i(C_Inf), 477 | .A_Zero_i(A_Zero), 478 | .B_Zero_i(B_Zero), 479 | .C_Zero_i(C_Zero), 480 | .A_NaN_i(A_NaN), 481 | .B_NaN_i(B_NaN), 482 | .C_NaN_i(C_NaN), 483 | .Mant_sticky_sht_out_i(Mant_sticky_sht_out), 484 | .Minus_sticky_bit_i(Minus_sticky_bit), 485 | .Mant_norm_i(Mant_norm), 486 | .Exp_norm_i(Exp_norm), 487 | .Exp_norm_mone_i(Exp_norm_mone), 488 | .Exp_max_rs_i(Exp_max_rs), 489 | .Rs_Mant_i(Rs_Mant), 490 | 491 | .Sign_result_o(Sign_result), 492 | .Exp_result_o(Exp_result), 493 | .Mant_result_o(Mant_result), 494 | .Invalid_o(NV_o), 495 | .Overflow_o(OF_o), 496 | .Underflow_o(UF_o), 497 | .Inexact_o(NX_o) 498 | ); 499 | 500 | endmodule 501 | 502 | -------------------------------------------------------------------------------- /src/01_RTL/MSBIncrementer.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/29/2022 10:53:38 AM 5 | // Module Name: MSBIncrementer 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Increments A_High if needed by the A_Low carry signal 13 | // 14 | ////////////////////////////////////////////////////////////////////////////////// 15 | // Revision: 16 | // 09/12/2022 - Add BSD-3-Clause Licence 17 | // 18 | ////////////////////////////////////////////////////////////////////////////////// 19 | // License information: 20 | // 21 | // This software is released under the BSD-3-Clause Licence, 22 | // see 
https://opensource.org/licenses/BSD-3-Clause for details. 23 | // In the following license statements, "software" refers to the 24 | // "source code" of the complete hardware/software system. 25 | // 26 | // Copyright 2022, 27 | // Embedded Intelligent Systems Lab (EISL) 28 | // Department of Computer Science 29 | // National Yang Ming Chiao Tung University 30 | // Hsinchu, Taiwan. 31 | // 32 | // All rights reserved. 33 | // 34 | // Redistribution and use in source and binary forms, with or without 35 | // modification, are permitted provided that the following conditions are met: 36 | // 37 | // 1. Redistributions of source code must retain the above copyright notice, 38 | // this list of conditions and the following disclaimer. 39 | // 40 | // 2. Redistributions in binary form must reproduce the above copyright notice, 41 | // this list of conditions and the following disclaimer in the documentation 42 | // and/or other materials provided with the distribution. 43 | // 44 | // 3. Neither the name of the copyright holder nor the names of its contributors 45 | // may be used to endorse or promote products derived from this software 46 | // without specific prior written permission. 47 | // 48 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 49 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 50 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 51 | // ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 52 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 53 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 54 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 55 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 56 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 57 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 58 | // POSSIBILITY OF SUCH DAMAGE. 59 | ////////////////////////////////////////////////////////////////////////////////// 60 | 61 | 62 | module MSBIncrementer #( 63 | parameter PARM_MANT = 23 64 | ) ( 65 | input low_carry_i, 66 | input low_carry_inv_i, 67 | input [PARM_MANT + 3 : 0] A_Mant_aligned_high_i, 68 | 69 | output [PARM_MANT + 3 : 0] high_sum_o, 70 | output [PARM_MANT + 3 : 0] high_sum_inv_o 71 | ); 72 | wire high_carry; // signal that is abandoned 73 | wire high_carry_inv; // signal that is abandoned 74 | 75 | assign {high_carry, high_sum_o} = (low_carry_i)? A_Mant_aligned_high_i + 1 : A_Mant_aligned_high_i; 76 | assign {high_carry_inv, high_sum_inv_o} = (low_carry_inv_i)? 
~A_Mant_aligned_high_i : ~A_Mant_aligned_high_i - 1; 77 | 78 | endmodule 79 | -------------------------------------------------------------------------------- /src/01_RTL/Normalizer.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 08/01/2022 03:36:51 PM 5 | // Module Name: Normalizer 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Normalizes the fraction and corrects the exponent using 13 | // the output of the Leading One Detector 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 08/01/2022 - Output logic mistaken port name, fixed 18 | // 08/04/2022 - Remove redundant parameters 19 | // 09/12/2022 - Add BSD-3-Clause Licence 20 | // 21 | ////////////////////////////////////////////////////////////////////////////////// 22 | // License information: 23 | // 24 | // This software is released under the BSD-3-Clause Licence, 25 | // see https://opensource.org/licenses/BSD-3-Clause for details. 26 | // In the following license statements, "software" refers to the 27 | // "source code" of the complete hardware/software system. 28 | // 29 | // Copyright 2022, 30 | // Embedded Intelligent Systems Lab (EISL) 31 | // Department of Computer Science 32 | // National Yang Ming Chiao Tung University 33 | // Hsinchu, Taiwan. 34 | // 35 | // All rights reserved. 36 | // 37 | // Redistribution and use in source and binary forms, with or without 38 | // modification, are permitted provided that the following conditions are met: 39 | // 40 | // 1. 
Redistributions of source code must retain the above copyright notice, 41 | // this list of conditions and the following disclaimer. 42 | // 43 | // 2. Redistributions in binary form must reproduce the above copyright notice, 44 | // this list of conditions and the following disclaimer in the documentation 45 | // and/or other materials provided with the distribution. 46 | // 47 | // 3. Neither the name of the copyright holder nor the names of its contributors 48 | // may be used to endorse or promote products derived from this software 49 | // without specific prior written permission. 50 | // 51 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 52 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 53 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 54 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 55 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 56 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 57 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 58 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 59 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 60 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 61 | // POSSIBILITY OF SUCH DAMAGE. 
62 | ////////////////////////////////////////////////////////////////////////////////// 63 | 64 | 65 | module Normalizer#( 66 | parameter PARM_EXP = 8, 67 | parameter PARM_MANT = 23, 68 | parameter PARM_LEADONE_WIDTH = 7 69 | ) ( 70 | input [3*PARM_MANT + 4 : 0]Mant_i, 71 | input [PARM_EXP + 1 : 0]Exp_i, 72 | input [PARM_LEADONE_WIDTH - 1 : 0] Shift_num_i, 73 | input Exp_mv_sign_i, 74 | 75 | output [3*PARM_MANT + 4 : 0] Mant_norm_o, 76 | output reg [PARM_EXP + 1 : 0] Exp_norm_o, 77 | output [PARM_EXP + 1 : 0] Exp_norm_mone_o, 78 | output [PARM_EXP + 1 : 0] Exp_max_rs_o, 79 | output [3*PARM_MANT + 6 : 0] Rs_Mant_o 80 | ); 81 | 82 | //Exponent corrections and normalization by results from LOA 83 | 84 | wire [PARM_LEADONE_WIDTH - 1 : 0] Shift_num = (Exp_mv_sign_i | Mant_i[3*PARM_MANT + 4])? 0 : Shift_num_i; //If the exponent < 0, or it has a leading one (1xxxxxx....) 85 | 86 | reg [PARM_EXP : 0] norm_amt; 87 | always @(*) begin 88 | if(Exp_i[PARM_EXP + 1]) 89 | norm_amt = 0; // the exponent overflows 90 | else if(Exp_i > Shift_num) 91 | norm_amt = Shift_num; // assure that exp would not < 0 92 | else 93 | norm_amt = Exp_i[PARM_EXP : 0] - 1; //Denormalized Numbers, has exponent of 0, representing -126 94 | end 95 | 96 | assign Mant_norm_o = Mant_i << norm_amt; 97 | 98 | 99 | always @(*) begin 100 | if(Exp_i[PARM_EXP + 1]) 101 | Exp_norm_o = 0; // the exponent overflows 102 | else if(Exp_i > Shift_num) 103 | Exp_norm_o = Exp_i - Shift_num; // assure that exp would not < 0 104 | else 105 | Exp_norm_o = 1; //Denormalized Numbers, has exponent of 0, representing -126 106 | end 107 | 108 | assign Exp_norm_mone_o = Exp_i - Shift_num - 1; 109 | 110 | //if Exp < 0, shift Right 111 | 112 | assign Exp_max_rs_o = Exp_i[PARM_EXP : 0] + 74; 113 | wire [PARM_EXP + 1 : 0] Rs_count = (~Exp_i + 1) + 1; // -Exp_i + 1, number of right shifts to get a denormalized number. 
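// Worked example for Rs_count (illustrative only; the input values below are assumptions, not taken from the testbench):
//   With PARM_EXP = 8, Exp_i is 10 bits wide and its MSB is treated as a sign bit.
//   Exp_i = 10'h3FD (-3): ~Exp_i = 10'h002, so Rs_count = (2 + 1) + 1 = 4, i.e. -(-3) + 1.
//   Exp_i = 10'h000 ( 0): ~Exp_i = 10'h3FF, so Rs_count wraps to 1 in 10 bits, i.e. -(0) + 1.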
114 | assign Rs_Mant_o = {Mant_i, 2'd0} >> Rs_count; 115 | 116 | endmodule 117 | -------------------------------------------------------------------------------- /src/01_RTL/PreNormalizer.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/25/2022 10:50:12 PM 5 | // Module Name: PreNormalizer 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: It shifts the augend to the correct position, and calculates 13 | // its exponent, works in parallel with the multiplier 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 07/25/2022 - Ports renaming into appropriate suffix 18 | // 07/26/2022 - Add debug signals to probe the shifting 19 | // 07/26/2022 - Debug wires removed 20 | // 07/27/2022 - Input wire "sign_change_i" renamed to "sign_flip_i" 21 | // 09/12/2022 - Add BSD-3-Clause Licence 22 | // 23 | ////////////////////////////////////////////////////////////////////////////////// 24 | // License information: 25 | // 26 | // This software is released under the BSD-3-Clause Licence, 27 | // see https://opensource.org/licenses/BSD-3-Clause for details. 28 | // In the following license statements, "software" refers to the 29 | // "source code" of the complete hardware/software system. 30 | // 31 | // Copyright 2022, 32 | // Embedded Intelligent Systems Lab (EISL) 33 | // Department of Computer Science 34 | // National Yang Ming Chiao Tung University 35 | // Hsinchu, Taiwan. 36 | // 37 | // All rights reserved. 
38 | // 39 | // Redistribution and use in source and binary forms, with or without 40 | // modification, are permitted provided that the following conditions are met: 41 | // 42 | // 1. Redistributions of source code must retain the above copyright notice, 43 | // this list of conditions and the following disclaimer. 44 | // 45 | // 2. Redistributions in binary form must reproduce the above copyright notice, 46 | // this list of conditions and the following disclaimer in the documentation 47 | // and/or other materials provided with the distribution. 48 | // 49 | // 3. Neither the name of the copyright holder nor the names of its contributors 50 | // may be used to endorse or promote products derived from this software 51 | // without specific prior written permission. 52 | // 53 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 54 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 55 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 56 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 57 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 58 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 59 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 60 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 61 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 62 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 63 | // POSSIBILITY OF SUCH DAMAGE. 
64 | ////////////////////////////////////////////////////////////////////////////////// 65 | 66 | 67 | module PreNormalizer #( 68 | parameter PARM_EXP = 8, 69 | parameter PARM_MANT = 23, 70 | parameter PARM_BIAS = 127 71 | ) ( 72 | input A_sign_i, 73 | input B_sign_i, 74 | input C_sign_i, 75 | input Sub_Sign_i, 76 | input [PARM_EXP - 1 : 0] A_Exp_i, 77 | input [PARM_EXP - 1 : 0] B_Exp_i, 78 | input [PARM_EXP - 1 : 0] C_Exp_i, 79 | input [PARM_MANT : 0] A_Mant_i, 80 | input Sign_flip_i, 81 | input Mv_halt_i, 82 | input [PARM_EXP + 1 : 0] Exp_mv_i, 83 | input Exp_mv_sign_i, 84 | 85 | output Sign_aligned_o, 86 | output [PARM_EXP + 1: 0] Exp_aligned_o, 87 | output reg [74 : 0] A_Mant_aligned_o, 88 | output reg Mant_sticky_sht_out_o 89 | ); 90 | 91 | 92 | wire [73 : 0] A_Mant_aligned; 93 | wire [PARM_MANT : 0] Drop_bits; 94 | assign {A_Mant_aligned, Drop_bits} = {A_Mant_i, 74'd0} >> (Mv_halt_i ? 0 : Exp_mv_i); 95 | 96 | //output logic for aligner 97 | assign Sign_aligned_o = (Exp_mv_sign_i)? A_sign_i : B_sign_i ^ C_sign_i; 98 | assign Exp_aligned_o = (Exp_mv_sign_i)? A_Exp_i : (B_Exp_i + C_Exp_i - PARM_BIAS + 27); // exponent = (expB + expC - 127) + point distance (= 27) 99 | 100 | //output logic for A_Mant_aligned_o 101 | always @(*) begin 102 | if(Exp_mv_sign_i) 103 | A_Mant_aligned_o = (A_Mant_i << 50); 104 | else if(~Mv_halt_i) 105 | A_Mant_aligned_o = {Sub_Sign_i, {74{Sub_Sign_i}}^A_Mant_aligned}; 106 | else 107 | A_Mant_aligned_o = 0; 108 | end 109 | 110 | 111 | wire [PARM_MANT : 0] A_Mant_2complement = (~A_Mant_i) + 1; //2's complement of mantA 112 | wire [PARM_MANT : 0] Drop_bits_2complement = (~Drop_bits) + 1; //2's complement of Drop_bits 113 | 114 | //output logic for Mant_sticky_sht_out_o 115 | always @(*) begin 116 | if(Sub_Sign_i & (~Sign_flip_i)) 117 | Mant_sticky_sht_out_o = (Mv_halt_i)? (|A_Mant_2complement) : (|Drop_bits_2complement); 118 | else 119 | Mant_sticky_sht_out_o = (Mv_halt_i)? 
(|A_Mant_i) : (|Drop_bits); 120 | end 121 | 122 | endmodule 123 | -------------------------------------------------------------------------------- /src/01_RTL/R4Booth.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/22/2022 10:59:09 AM 5 | // Module Name: R4Booth 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Breaks a 24-bit x 24-bit multiplication into 13 partial products, using 13 | // Radix-4 Booth's Algorithm 14 | // 15 | ////////////////////////////////////////////////////////////////////////////////// 16 | // Revision: 17 | // 07/22/2022 - Encode the input by Radix-4 Booth's Recoding Table 18 | // 07/22/2022 - Utilize generate statements in the encoder 19 | // 07/24/2022 - Combine the module with Booth recoding module 20 | // 07/25/2022 - Add decoder section to generate partial products from the encoded message 21 | // 07/25/2022 - Decoder index bug fix 22 | // 07/25/2022 - Parameters updated, redundancy removed and comments added 23 | // 08/15/2022 - Save one bit by reducing the bus width of pp_12_o 24 | // 09/12/2022 - Add BSD-3-Clause Licence 25 | // 26 | ////////////////////////////////////////////////////////////////////////////////// 27 | // License information: 28 | // 29 | // This software is released under the BSD-3-Clause Licence, 30 | // see https://opensource.org/licenses/BSD-3-Clause for details. 31 | // In the following license statements, "software" refers to the 32 | // "source code" of the complete hardware/software system. 
33 | // 34 | // Copyright 2022, 35 | // Embedded Intelligent Systems Lab (EISL) 36 | // Department of Computer Science 37 | // National Yang Ming Chiao Tung University 38 | // Hsinchu, Taiwan. 39 | // 40 | // All rights reserved. 41 | // 42 | // Redistribution and use in source and binary forms, with or without 43 | // modification, are permitted provided that the following conditions are met: 44 | // 45 | // 1. Redistributions of source code must retain the above copyright notice, 46 | // this list of conditions and the following disclaimer. 47 | // 48 | // 2. Redistributions in binary form must reproduce the above copyright notice, 49 | // this list of conditions and the following disclaimer in the documentation 50 | // and/or other materials provided with the distribution. 51 | // 52 | // 3. Neither the name of the copyright holder nor the names of its contributors 53 | // may be used to endorse or promote products derived from this software 54 | // without specific prior written permission. 55 | // 56 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 57 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 58 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 59 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 60 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 61 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 62 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 63 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 64 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 65 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 66 | // POSSIBILITY OF SUCH DAMAGE. 
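// Illustrative sketch (not part of the original design): the Radix-4 Booth
// recoding named in the description above can be pictured one digit at a
// time. Each digit looks at three multiplier bits {bit i+1, bit i, bit i-1}
// and selects 0x, 1x or 2x of the multiplicand plus a sign. The module name
// BoothDigitSketch and its port names are hypothetical.
module BoothDigitSketch (
    input  [2 : 0] triplet_i, // {bit i+1, bit i, bit i-1} of the multiplier
    output         sel1x_o,   // select +/-1 x multiplicand
    output         sel2x_o,   // select +/-2 x multiplicand
    output         neg_o      // selected multiple is to be negated
);
    // 1x when the two lower bits differ (001, 010, 101, 110).
    assign sel1x_o = triplet_i[1] ^ triplet_i[0];
    // 2x only for the patterns 011 (+2) and 100 (-2).
    assign sel2x_o = (triplet_i == 3'b011) || (triplet_i == 3'b100);
    // The top bit carries the sign; for 111 this negates a zero, which is harmless.
    assign neg_o   = triplet_i[2];
endmodule
// For example, triplet_i = 3'b100 gives sel2x_o = 1 and neg_o = 1, that is -2 x multiplicand.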
67 | ////////////////////////////////////////////////////////////////////////////////// 68 | 69 | 70 | module R4Booth #( 71 | parameter PARM_MANT = 23 72 | ) ( 73 | input [PARM_MANT : 0] MantA_i, // input is {hidden_bit, mantissa} = 1 + 23 = 24 bits 74 | input [PARM_MANT : 0] MantB_i, 75 | 76 | output [2*PARM_MANT + 2 : 0] pp_00_o, //output range is 24*2 +1(if x2 multiplicand) = 49 bits 77 | output [2*PARM_MANT + 2 : 0] pp_01_o, 78 | output [2*PARM_MANT + 2 : 0] pp_02_o, 79 | output [2*PARM_MANT + 2 : 0] pp_03_o, 80 | output [2*PARM_MANT + 2 : 0] pp_04_o, 81 | output [2*PARM_MANT + 2 : 0] pp_05_o, 82 | output [2*PARM_MANT + 2 : 0] pp_06_o, 83 | output [2*PARM_MANT + 2 : 0] pp_07_o, 84 | output [2*PARM_MANT + 2 : 0] pp_08_o, 85 | output [2*PARM_MANT + 2 : 0] pp_09_o, 86 | output [2*PARM_MANT + 2 : 0] pp_10_o, 87 | output [2*PARM_MANT + 2 : 0] pp_11_o, 88 | output [2*PARM_MANT + 1 : 0] pp_12_o 89 | ); 90 | parameter PARM_PP = ((PARM_MANT+1)+1+1)/2; //Booth's algorithm produces at most CEILING((n+1)/2) partial products 91 | 92 | //Modified Booth's Recoding Table 93 | // Multiplier 94 | //| Bit i + 1 | Bit i | Bit i - 1 | Multiplicand selected | 95 | //| 0 | 0 | 0 | 0 x Multiplicand | 96 | //| 0 | 0 | 1 | +1 x Multiplicand | 97 | //| 0 | 1 | 0 | +1 x Multiplicand | 98 | //| 0 | 1 | 1 | +2 x Multiplicand | 99 | //| 1 | 0 | 0 | -2 x Multiplicand | 100 | //| 1 | 0 | 1 | -1 x Multiplicand | 101 | //| 1 | 1 | 0 | -1 x Multiplicand | 102 | //| 1 | 1 | 1 | 0 x Multiplicand | 103 | 104 | 105 | wire [PARM_MANT + 3 : 0] mant_B_Padding = {2'd0, MantB_i, 1'd0}; 106 | 107 | wire [PARM_PP - 1 : 0] mul1x; // mul1x = bit(i) ^ bit(i - 1) 108 | wire [PARM_PP - 1 : 0] mul2x; // mul2x = (pattern == 3'b011 || pattern == 3'b100) 109 | wire [PARM_PP - 1 : 0] mulsign; // mulsign = bit(i + 1) 110 | 111 | 112 | generate 113 | genvar j; 114 | for (j = 0; j < 13; j = j+1) begin 115 | assign mul1x[j] = mant_B_Padding[j*2] ^ mant_B_Padding[j*2 + 1]; 116 | assign mul2x[j] = 
((~mant_B_Padding[j*2]) & (~mant_B_Padding[j*2 + 1]) & (mant_B_Padding[j*2 + 2])) || 117 | ((mant_B_Padding[j*2]) & (mant_B_Padding[j*2+1]) & (~mant_B_Padding[j*2+2])); 118 | assign mulsign[j] = mant_B_Padding[j*2 + 2]; 119 | end 120 | endgenerate 121 | 122 | 123 | // Partial products are differentiated into 0x, 1x and 2x here 124 | reg [PARM_MANT + 1 : 0] booth_PP_tmp [PARM_PP - 1: 0]; 125 | wire [PARM_MANT + 1 : 0] booth_PP [PARM_PP - 1: 0]; 126 | 127 | integer idx; 128 | always @(*) begin 129 | for (idx = 0; idx < PARM_PP; idx = idx + 1) begin 130 | if(mul1x[idx]) booth_PP_tmp[idx] = MantA_i; 131 | else if(mul2x[idx]) booth_PP_tmp[idx] = MantA_i << 1; 132 | else booth_PP_tmp[idx] = 0; 133 | 134 | end 135 | end 136 | 137 | //Flip the bits if Booth's algorithm makes the partial product negative; the 2's complement is formed by bitwise inversion here, with the +1 added in the next row. 138 | generate 139 | genvar k; 140 | for(k = 0; k < PARM_PP; k = k + 1)begin 141 | assign booth_PP[k] = (mulsign[k])? ~booth_PP_tmp[k] : booth_PP_tmp[k]; 142 | end 143 | endgenerate 144 | 145 | 146 | //Add the "1 triangle" in the upper-left corner. This assumes an unsigned multiplication. 
147 | assign pp_00_o = {21'd0, ~mulsign[ 0],{2{mulsign[0]}},booth_PP[0]}; 148 | assign pp_01_o = {21'd1, ~mulsign[ 1], booth_PP[ 1], 1'b0, mulsign[ 0]}; 149 | assign pp_02_o = {19'd1, ~mulsign[ 2], booth_PP[ 2], 1'b0, mulsign[ 1], 2'd0}; 150 | assign pp_03_o = {17'd1, ~mulsign[ 3], booth_PP[ 3], 1'b0, mulsign[ 2], 4'd0}; 151 | assign pp_04_o = {15'd1, ~mulsign[ 4], booth_PP[ 4], 1'b0, mulsign[ 3], 6'd0}; 152 | assign pp_05_o = {13'd1, ~mulsign[ 5], booth_PP[ 5], 1'b0, mulsign[ 4], 8'd0}; 153 | assign pp_06_o = {11'd1, ~mulsign[ 6], booth_PP[ 6], 1'b0, mulsign[ 5], 10'd0}; 154 | assign pp_07_o = { 9'd1, ~mulsign[ 7], booth_PP[ 7], 1'b0, mulsign[ 6], 12'd0}; 155 | assign pp_08_o = { 7'd1, ~mulsign[ 8], booth_PP[ 8], 1'b0, mulsign[ 7], 14'd0}; 156 | assign pp_09_o = { 5'd1, ~mulsign[ 9], booth_PP[ 9], 1'b0, mulsign[ 8], 16'd0}; 157 | assign pp_10_o = { 3'd1, ~mulsign[10], booth_PP[10], 1'b0, mulsign[ 9], 18'd0}; 158 | assign pp_11_o = { 1'd1, ~mulsign[11], booth_PP[11], 1'b0, mulsign[10], 20'd0}; 159 | assign pp_12_o = {booth_PP[12][PARM_MANT : 0], 1'b0, mulsign[11], 22'd0}; //Save one bit, MSB is always 0 160 | 161 | endmodule 162 | -------------------------------------------------------------------------------- /src/01_RTL/Rounder.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/30/2022 10:47:12 AM 5 | // Module Name: Rounder 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Process Rounding by checking the rounding mode set, guard bit, 13 | // round bit and sticky bit. 
14 | // Raises Invalid, Overflow and Underflow under appropriate situations 15 | // Adjust the exponent and mantissa for the module output 16 | // 17 | ////////////////////////////////////////////////////////////////////////////////// 18 | // Revision: 19 | // 07/30/2022 - File Created 20 | // 08/01/2022 - Rename File 21 | // 08/08/2022 - Add PARM_MATN_RMM support 22 | // 08/09/2022 - Invalid_o shall raise whilst Overflow/Underflow 23 | // 08/10/2022 - Debug wires added to observe the chosen MUX path 24 | // 08/12/2022 - Remove A = 0 as special case, due to the update of mv_halt in MAC32_top.v 25 | // 08/13/2022 - Fix multidriven Net, Mant_result_o 26 | // 08/13/2022 - Underflow signal fixed, denorm number wouldn't fire Underflow Signal 27 | // 08/14/2022 - Debug wires removed 28 | // 09/12/2022 - Add BSD-3-Clause Licence 29 | // 30 | ////////////////////////////////////////////////////////////////////////////////// 31 | // License information: 32 | // 33 | // This software is released under the BSD-3-Clause Licence, 34 | // see https://opensource.org/licenses/BSD-3-Clause for details. 35 | // In the following license statements, "software" refers to the 36 | // "source code" of the complete hardware/software system. 37 | // 38 | // Copyright 2022, 39 | // Embedded Intelligent Systems Lab (EISL) 40 | // Department of Computer Science 41 | // National Yang Ming Chiao Tung University 42 | // Hsinchu, Taiwan. 43 | // 44 | // All rights reserved. 45 | // 46 | // Redistribution and use in source and binary forms, with or without 47 | // modification, are permitted provided that the following conditions are met: 48 | // 49 | // 1. Redistributions of source code must retain the above copyright notice, 50 | // this list of conditions and the following disclaimer. 51 | // 52 | // 2. 
Redistributions in binary form must reproduce the above copyright notice, 53 | // this list of conditions and the following disclaimer in the documentation 54 | // and/or other materials provided with the distribution. 55 | // 56 | // 3. Neither the name of the copyright holder nor the names of its contributors 57 | // may be used to endorse or promote products derived from this software 58 | // without specific prior written permission. 59 | // 60 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 61 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 62 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 63 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 64 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 65 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 66 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 67 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 68 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 69 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 70 | // POSSIBILITY OF SUCH DAMAGE. 71 | ////////////////////////////////////////////////////////////////////////////////// 72 | // Additional Comments: 73 | // IEEE Std 754-2008 Chap7. Default exception handling 74 | // 75 | // ------------------------------------------------------------------------------- 76 | // 7.2 Invalid Operation 77 | // 78 | // The invalid operation exception is signaled if and only if there is no usefully definable result. 79 | // In these cases the operands are invalid for the operation to be performed. 
80 | // For operations producing results in floating-point format, the default result of an operation that signals the 81 | // invalid operation exception shall be a quiet NaN that should provide some diagnostic information (see 6.2). 82 | // These operations are: 83 | // a) any general-computational or signaling-computational operation on a signaling NaN (see 6.2), 84 | // except for some conversions (see 5.12) 85 | // b) multiplication: multiplication(0, ∞) or multiplication(∞, 0) 86 | // c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet 87 | // NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception 88 | // is signaled 89 | // d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: 90 | // addition(+∞, −∞) 91 | // e) division: division(0, 0) or division(∞, ∞) 92 | // f) remainder: remainder(x, y), when y is zero or x is infinite and neither is NaN 93 | // g) squareRoot if the operand is less than zero 94 | // h) quantize when the result does not fit in the destination format or when one operand is finite and the 95 | // other is infinite 96 | // ------------------------------------------------------------------------------- 97 | // 7.4 Overflow (IEEE 754-2008) 98 | // 99 | // The overflow exception shall be signaled if and only if the destination format’s largest finite number is 100 | // exceeded in magnitude by what would have been the rounded floating-point result (see 4) were the exponent 101 | // range unbounded. The default result shall be determined by the rounding-direction attribute and the sign of 102 | // the intermediate result as follows: 103 | // a) roundTiesToEven and roundTiesToAway carry all overflows to ∞ with the sign of the intermediate 104 | // result. 105 | // b) roundTowardZero carries all overflows to the format’s largest finite number with the sign of the 106 | // intermediate result. 
107 | // c) roundTowardNegative carries positive overflows to the format’s largest finite number, and carries 108 | // negative overflows to −∞. 109 | // d) roundTowardPositive carries negative overflows to the format’s most negative finite number, and 110 | // carries positive overflows to +∞. 111 | // In addition, under default exception handling for overflow, the overflow flag shall be raised and the inexact 112 | // exception shall be signaled. 113 | // ------------------------------------------------------------------------------- 114 | // 7.6 Inexact 115 | // 116 | // Unless stated otherwise, if the rounded result of an operation is inexact—that is, it differs from what would 117 | // have been computed were both exponent range and precision unbounded—then the inexact exception shall 118 | // be signaled. The rounded or overflowed result shall be delivered to the destination 119 | // (emphasis added) 120 | // When all of these exceptions are handled by default, the inexact flag 121 | // is always raised when either the overflow or underflow flag is raised. 
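// Illustrative sketch (not part of the original design): the default overflow
// results quoted above (7.4 a to d) reduce to one decision per rounding mode,
// namely whether the result becomes infinity or the largest finite number.
// The module name OverflowDefaultSketch and its ports are hypothetical, and
// the rounding-mode encoding is assumed to follow the RISC-V frm field
// (RNE = 000, RTZ = 001, RDN = 010, RUP = 011, RMM = 100).
module OverflowDefaultSketch (
    input            sign_i,   // sign of the intermediate result (1 = negative)
    input  [2 : 0]   rm_i,     // rounding mode, assumed RISC-V frm encoding
    output reg       to_inf_o  // 1: deliver infinity, 0: deliver largest finite number
);
    always @(*) begin
        case (rm_i)
            3'b000:  to_inf_o = 1'b1;     // roundTiesToEven: always to infinity
            3'b001:  to_inf_o = 1'b0;     // roundTowardZero: always largest finite
            3'b010:  to_inf_o = sign_i;   // roundTowardNegative: only negative overflows reach -infinity
            3'b011:  to_inf_o = ~sign_i;  // roundTowardPositive: only positive overflows reach +infinity
            3'b100:  to_inf_o = 1'b1;     // roundTiesToAway: always to infinity
            default: to_inf_o = 1'b0;     // reserved encodings: arbitrary choice for this sketch
        endcase
    end
endmodule
// A consumer would then pick exponent all ones with a zero mantissa when
// to_inf_o is set, and the maximum finite exponent with an all-ones mantissa otherwise.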
122 | // 123 | ////////////////////////////////////////////////////////////////////////////////// 124 | 125 | 126 | module Rounder #( 127 | parameter PARM_RM = 3, 128 | parameter PARM_RM_RNE = 3'b000, 129 | parameter PARM_RM_RTZ = 3'b001, 130 | parameter PARM_RM_RDN = 3'b010, 131 | parameter PARM_RM_RUP = 3'b011, 132 | parameter PARM_RM_RMM = 3'b100, 133 | parameter PARM_MANT_NAN = 23'b100_0000_0000_0000_0000_0000, 134 | parameter PARM_EXP = 8, 135 | parameter PARM_MANT = 23, 136 | parameter PARM_LEADONE_WIDTH = 7 137 | ) ( 138 | 139 | input [PARM_EXP + 1 : 0]Exp_i, 140 | input Sign_i, 141 | 142 | input Allzero_i, 143 | input Exp_mv_sign_i, 144 | 145 | input Sub_Sign_i, 146 | input [PARM_EXP - 1 : 0] A_Exp_raw_i, 147 | input [PARM_MANT : 0] A_Mant_i, 148 | input [PARM_RM - 1 : 0] Rounding_mode_i, 149 | input A_Sign_i, 150 | input B_Sign_i, 151 | input C_Sign_i, 152 | 153 | input A_DeN_i, 154 | input A_Inf_i, 155 | input B_Inf_i, 156 | input C_Inf_i, 157 | input A_Zero_i, 158 | input B_Zero_i, 159 | input C_Zero_i, 160 | input A_NaN_i, 161 | input B_NaN_i, 162 | input C_NaN_i, 163 | 164 | input Mant_sticky_sht_out_i, 165 | input Minus_sticky_bit_i, 166 | 167 | input [3*PARM_MANT + 4 : 0] Mant_norm_i, 168 | input [PARM_EXP + 1 : 0] Exp_norm_i, 169 | input [PARM_EXP + 1 : 0] Exp_norm_mone_i, 170 | input [PARM_EXP + 1 : 0] Exp_max_rs_i, 171 | input [3*PARM_MANT + 6 : 0] Rs_Mant_i, 172 | 173 | output reg Sign_result_o, 174 | output reg [PARM_EXP - 1 : 0] Exp_result_o, 175 | output reg [PARM_MANT - 1 : 0] Mant_result_o, 176 | output Invalid_o, 177 | output reg Overflow_o, 178 | output Underflow_o, 179 | output Inexact_o); 180 | 181 | //Sticky bit 182 | reg [2*PARM_MANT + 1 : 0] Mant_sticky_changed; 183 | always @(*) begin 184 | if(Exp_norm_i[PARM_EXP + 1]) 185 | Mant_sticky_changed = Rs_Mant_i [2*PARM_MANT + 3 : 2]; 186 | else if(Exp_norm_i == 0) 187 | Mant_sticky_changed = Mant_norm_i[2*PARM_MANT + 2 : 1]; 188 | else if(Mant_norm_i[3*PARM_MANT + 4]) // | Exp_norm_i == 0 
189 | Mant_sticky_changed = Mant_norm_i[2*PARM_MANT + 1 : 0]; 190 | else 191 | Mant_sticky_changed = {Mant_norm_i[2*PARM_MANT : 0], 1'b0}; 192 | end 193 | 194 | wire Sticky_one = (|Mant_sticky_changed) || Mant_sticky_sht_out_i || Minus_sticky_bit_i; 195 | 196 | 197 | wire includeNaN = A_NaN_i | B_NaN_i | C_NaN_i; 198 | wire zeromulinf = (B_Zero_i & C_Inf_i) | (C_Zero_i & B_Inf_i); 199 | wire subinf = (Sub_Sign_i & A_Inf_i & (B_Inf_i | C_Inf_i)); 200 | 201 | assign Invalid_o = (includeNaN | zeromulinf | subinf); 202 | 203 | reg Mant_sticky; 204 | reg [PARM_MANT : 0] Mant_result_norm; // 24 bit 205 | reg [PARM_EXP - 1 : 0] Exp_result_norm; // 8 bit 206 | reg [1 : 0] Mant_lower; 207 | 208 | 209 | always @(*) begin 210 | //assign default values to avoid latches 211 | Overflow_o = 1'b0; 212 | Mant_result_norm = 0; 213 | Exp_result_norm = 0; 214 | Mant_lower = 2'b00; 215 | Sign_result_o = 1'b0; 216 | Mant_sticky = 1'b0; 217 | if(Invalid_o)begin 218 | Mant_result_norm = {1'b0, PARM_MANT_NAN}; //PARM_MANT_NAN is 23 bit 219 | Exp_result_norm = 8'b1111_1111; 220 | 221 | end 222 | else if(A_Inf_i | B_Inf_i | C_Inf_i)begin 223 | // The result is Infinity 224 | // Operations on infinite operands are exact and therefore signal no exceptions 225 | Exp_result_norm = 8'b1111_1111; 226 | // If there are two infinities, they must have the same sign; if there are three, the sign is that of A 227 | if(A_Inf_i) Sign_result_o = A_Sign_i; 228 | else Sign_result_o = B_Sign_i ^ C_Sign_i; 229 | 230 | end 231 | else if(B_Zero_i | C_Zero_i)begin 232 | // For the cases A + B*0 and A + 0*C, the result is A 233 | Mant_result_norm = A_Mant_i; 234 | Exp_result_norm = A_Exp_raw_i; 235 | Sign_result_o = A_Sign_i; 236 | 237 | end 238 | else if(Exp_mv_sign_i)begin 239 | // Only A counts; B x C is too small compared to A 240 | Mant_result_norm = A_Mant_i; 241 | Exp_result_norm = A_Exp_raw_i; 242 | Sign_result_o = A_Sign_i; 243 | Mant_sticky = Sticky_one; // When the exponent moves left (negative), sticky bit would come from 
Mant_sticky 244 | 245 | end 246 | else if(Allzero_i)begin 247 | Sign_result_o = Sign_i; 248 | 249 | end 250 | else if(Exp_i[PARM_EXP + 1])begin 251 | if(~Exp_max_rs_i[PARM_EXP + 1])begin 252 | // Exponent would <0 after right shift (too negative) 253 | Overflow_o = 1; 254 | Sign_result_o = Sign_i; 255 | end 256 | else begin 257 | // Denormalized number 258 | Mant_result_norm = {1'b0, Rs_Mant_i[3*PARM_MANT + 6 : 2*PARM_MANT + 6]}; 259 | Mant_lower = Rs_Mant_i[2*PARM_MANT + 5 : 2*PARM_MANT + 4]; 260 | Sign_result_o = Sign_i; 261 | Mant_sticky = Sticky_one; 262 | end 263 | 264 | end 265 | else if((Exp_norm_i[PARM_EXP : 0] == 256) & (~Mant_norm_i[3*PARM_MANT + 4]) & (Mant_norm_i[3*PARM_MANT + 3 : 2*PARM_MANT+3] != 0))begin 266 | // Overflow 267 | Overflow_o = 1; 268 | Sign_result_o = Sign_i; 269 | 270 | end 271 | else if(Exp_norm_i[PARM_EXP - 1 : 0] == 8'b1111_1111)begin 272 | 273 | if(Mant_norm_i[3*PARM_MANT + 4] || (Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4] == 0))begin 274 | // Overflow 275 | Overflow_o = 1; 276 | Sign_result_o = Sign_i; 277 | end 278 | else begin 279 | // Normal numbers 280 | Exp_result_norm = 8'b1111_1110; //254 281 | Sign_result_o = Sign_i; 282 | 283 | Mant_result_norm = Mant_norm_i [3*PARM_MANT + 2 : 2*PARM_MANT + 3];//originally out of bound 284 | Mant_lower = Mant_norm_i[2*PARM_MANT + 2 : 2*PARM_MANT + 1]; 285 | Mant_sticky = Sticky_one; 286 | 287 | //see if it's overflow, if mant is full and about to round up 288 | if(Mant_result_norm[PARM_MANT - 1 : 0] == {(PARM_MANT){1'b1}})begin 289 | case (Rounding_mode_i) 290 | PARM_RM_RNE: 291 | Overflow_o = Mant_lower[1] & (Mant_lower[0] | Mant_sticky | Mant_result_norm[0]); 292 | PARM_RM_RTZ: 293 | Overflow_o = 0; 294 | PARM_RM_RDN: 295 | Overflow_o = ((|Mant_lower) || Mant_sticky) & Sign_i; 296 | PARM_RM_RUP: 297 | Overflow_o = ((|Mant_lower) || Mant_sticky) & (~Sign_i); 298 | PARM_RM_RMM: 299 | Overflow_o = Mant_lower[1]; 300 | default: 301 | Overflow_o = 0; 302 | endcase 303 | end 304 | end 305 
| 306 | end 307 | else if(Exp_norm_i[PARM_EXP])begin 308 | //Overflow Occurs, the exponent at preNorm(multiplication is over 127) 309 | Overflow_o = 1; 310 | Sign_result_o = Sign_i; 311 | end 312 | else if(Exp_norm_i == 10'd0)begin 313 | // Denormalized number 314 | Mant_result_norm = {1'b0, Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 5]}; 315 | Mant_lower = Mant_norm_i[2*PARM_MANT + 4 : 2*PARM_MANT + 3]; 316 | Sign_result_o = Sign_i; 317 | Mant_sticky = Sticky_one; 318 | end 319 | else if(Exp_norm_i == 10'd1)begin 320 | if(Mant_norm_i[3*PARM_MANT + 4])begin 321 | //Normal Number 322 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4]; 323 | Exp_result_norm = 1; 324 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2]; 325 | Sign_result_o = Sign_i; 326 | Mant_sticky = Sticky_one; 327 | 328 | end 329 | else begin 330 | // Denormalized Number 331 | // Denormalized does not mean exactly an underflow... 332 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4: 2*PARM_MANT + 4]; 333 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2]; 334 | Sign_result_o = Sign_i; 335 | Mant_sticky = Sticky_one; 336 | 337 | end 338 | 339 | end 340 | else if(~Mant_norm_i[3*PARM_MANT + 4])begin 341 | // Numbers with 0X.XX, normal numbers 342 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 3 : 2*PARM_MANT + 3]; 343 | Exp_result_norm = Exp_norm_mone_i[PARM_EXP - 1 : 0]; 344 | Mant_lower = Mant_norm_i[2*PARM_MANT + 2 : 2*PARM_MANT + 1]; 345 | Sign_result_o = Sign_i; 346 | Mant_sticky = Sticky_one; 347 | end 348 | else begin 349 | // Numbers with 1X.XX, normal numbers 350 | Mant_result_norm = Mant_norm_i[3*PARM_MANT + 4 : 2*PARM_MANT + 4]; 351 | Exp_result_norm = Exp_norm_i[PARM_EXP - 1 : 0]; 352 | Mant_lower = Mant_norm_i[2*PARM_MANT + 3 : 2*PARM_MANT + 2]; 353 | Sign_result_o = Sign_i; 354 | Mant_sticky = Sticky_one; 355 | end 356 | end 357 | 358 | //Represents Guard, Round and Sticky bit 359 | // Guard Bit: Mant_lower[1] 360 | // Round Bit: Mant_lower[0] 361 | 
// Sticky Bit: Mant_sticky 362 | wire GRSbits = (|Mant_lower) || Mant_sticky; 363 | 364 | //Rounding determines whether to add 1 to the mantissa, signalled by Mant_roundup 365 | reg Mant_roundup; 366 | 367 | always @(*) begin 368 | case (Rounding_mode_i) 369 | PARM_RM_RNE: 370 | Mant_roundup = Mant_lower[1] & (Mant_lower[0] | Mant_sticky | Mant_result_norm[0]); 371 | PARM_RM_RTZ: 372 | Mant_roundup = 0; 373 | PARM_RM_RDN: 374 | Mant_roundup = GRSbits & Sign_i; 375 | PARM_RM_RUP: 376 | Mant_roundup = GRSbits & (~Sign_i); 377 | PARM_RM_RMM: 378 | Mant_roundup = Mant_lower[1]; 379 | default: 380 | Mant_roundup = 0; 381 | endcase 382 | end 383 | 384 | wire [PARM_MANT + 1 : 0] Mant_upper_rounded = Mant_result_norm + Mant_roundup; 385 | wire Mant_renormalize = Mant_upper_rounded[PARM_MANT + 1]; 386 | 387 | //output logic 388 | 389 | always @(*) begin 390 | if(Overflow_o)begin 391 | case (Rounding_mode_i) 392 | PARM_RM_RNE: 393 | Mant_result_o = 0; // to Inf 394 | PARM_RM_RTZ: 395 | Mant_result_o = {PARM_MANT{1'b1}};//to Largest Finite Number 396 | PARM_RM_RDN: 397 | Mant_result_o = (Sign_result_o)? 0 : {PARM_MANT{1'b1}}; //+: to largest Finite Number -: to Inf 398 | PARM_RM_RUP: 399 | Mant_result_o = (Sign_result_o)? {PARM_MANT{1'b1}} : 0; //+: to Inf -: to most negative Finite Number 400 | PARM_RM_RMM: 401 | Mant_result_o = 0; // to Inf 402 | default: 403 | Mant_result_o = 0; 404 | endcase 405 | end 406 | else if(Mant_renormalize) 407 | Mant_result_o = Mant_upper_rounded[PARM_MANT : 1]; 408 | else 409 | Mant_result_o = Mant_upper_rounded[PARM_MANT - 1 : 0]; 410 | end 411 | 412 | always@(*)begin 413 | if(Overflow_o)begin 414 | case (Rounding_mode_i) 415 | PARM_RM_RNE: 416 | Exp_result_o = {PARM_EXP{1'b1}}; // to Inf 417 | PARM_RM_RTZ: 418 | Exp_result_o = {{(PARM_EXP-1){1'b1}},1'b0}; //to Largest Finite Number, exp = 1111_1110 419 | PARM_RM_RDN: 420 | Exp_result_o = (Sign_result_o)? 
{PARM_EXP{1'b1}} : {{(PARM_EXP-1){1'b1}},1'b0}; 421 | PARM_RM_RUP: 422 | Exp_result_o = (Sign_result_o)? {{(PARM_EXP-1){1'b1}},1'b0} : {PARM_EXP{1'b1}}; 423 | PARM_RM_RMM: 424 | Exp_result_o = {PARM_EXP{1'b1}}; // to Inf 425 | default: 426 | Exp_result_o = 0; //Revision 8/13/2022 - Fix multidrive net, Exp_result_o 427 | endcase 428 | end 429 | else 430 | Exp_result_o = Exp_result_norm + Mant_renormalize; 431 | end 432 | 433 | assign Underflow_o = ({Exp_result_o,Mant_result_o} == 0) & GRSbits; 434 | assign Inexact_o = GRSbits || Overflow_o ||Underflow_o; 435 | 436 | endmodule 437 | -------------------------------------------------------------------------------- /src/01_RTL/SpecialCaseDetector.v: -------------------------------------------------------------------------------- 1 | `timescale 1ns / 1ps 2 | ////////////////////////////////////////////////////////////////////////////////// 3 | // Engineer: Tzu-Han Hsu 4 | // Create Date: 07/21/2022 07:41:57 PM 5 | // Module Name: SpecialCaseDetector 6 | // Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit 7 | // HDL(Version): Verilog-2005 8 | // 9 | // Dependencies: None 10 | // 11 | ////////////////////////////////////////////////////////////////////////////////// 12 | // Description: Detect whether the input data is: 13 | // 1. Infinity 14 | // 2. Zero 15 | // 3. NaN 16 | // 4. Denormalized Number 17 | // 18 | ////////////////////////////////////////////////////////////////////////////////// 19 | // Revision: 20 | // 07/21/2022 - Infinity, Zero and NaN detection done. 
21 | // 07/21/2022 - Denormalized Number detection added 22 | // 08/04/2022 - I/O ports renaming, appropriate suffix added 23 | // 08/04/2022 - parameters arranged, comments added for readability 24 | // 08/14/2022 - Wire rename, avoid non-parameter upper cased wire 25 | // 09/12/2022 - Add BSD-3-Clause Licence 26 | // 27 | ////////////////////////////////////////////////////////////////////////////////// 28 | // License information: 29 | // 30 | // This software is released under the BSD-3-Clause Licence, 31 | // see https://opensource.org/licenses/BSD-3-Clause for details. 32 | // In the following license statements, "software" refers to the 33 | // "source code" of the complete hardware/software system. 34 | // 35 | // Copyright 2022, 36 | // Embedded Intelligent Systems Lab (EISL) 37 | // Department of Computer Science 38 | // National Yang Ming Chiao Tung University 39 | // Hsinchu, Taiwan. 40 | // 41 | // All rights reserved. 42 | // 43 | // Redistribution and use in source and binary forms, with or without 44 | // modification, are permitted provided that the following conditions are met: 45 | // 46 | // 1. Redistributions of source code must retain the above copyright notice, 47 | // this list of conditions and the following disclaimer. 48 | // 49 | // 2. Redistributions in binary form must reproduce the above copyright notice, 50 | // this list of conditions and the following disclaimer in the documentation 51 | // and/or other materials provided with the distribution. 52 | // 53 | // 3. Neither the name of the copyright holder nor the names of its contributors 54 | // may be used to endorse or promote products derived from this software 55 | // without specific prior written permission. 
56 | // 57 | // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 58 | // AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 59 | // IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 60 | // ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE 61 | // LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 62 | // CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 63 | // SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 64 | // INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 65 | // CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 66 | // ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 67 | // POSSIBILITY OF SUCH DAMAGE. 68 | ////////////////////////////////////////////////////////////////////////////////// 69 | 70 | 71 | module SpecialCaseDetector #( 72 | parameter PARM_XLEN = 32, 73 | parameter PARM_EXP = 8, 74 | parameter PARM_MANT = 23 75 | 76 | ) ( 77 | input [PARM_XLEN - 1 : 0] A_i, 78 | input [PARM_XLEN - 1 : 0] B_i, 79 | input [PARM_XLEN - 1 : 0] C_i, 80 | input A_Leadingbit_i, 81 | input B_Leadingbit_i, 82 | input C_Leadingbit_i, 83 | 84 | output A_Inf_o, 85 | output B_Inf_o, 86 | output C_Inf_o, 87 | output A_Zero_o, 88 | output B_Zero_o, 89 | output C_Zero_o, 90 | output A_NaN_o, 91 | output B_NaN_o, 92 | output C_NaN_o, 93 | output A_DeN_o, 94 | output B_DeN_o, 95 | output C_DeN_o); 96 | 97 | 98 | wire [PARM_EXP - 1: 0] Exp_Fullone = {PARM_EXP{1'b1}}; // Exponent is all '1' 99 | 100 | 101 | wire A_ExpZero = ~A_Leadingbit_i; 102 | wire B_ExpZero = ~B_Leadingbit_i; 103 | wire C_ExpZero = ~C_Leadingbit_i; 104 | 105 | wire A_ExpFull = (A_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone); 106 | wire B_ExpFull = (B_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone); 107 | wire C_ExpFull = (C_i[PARM_XLEN - 2 : PARM_MANT] == Exp_Fullone); 108 | 109 | 
    wire A_MantZero = (A_i[PARM_MANT - 1 : 0] == 0);
    wire B_MantZero = (B_i[PARM_MANT - 1 : 0] == 0);
    wire C_MantZero = (C_i[PARM_MANT - 1 : 0] == 0);


    //output logic
    assign A_Zero_o = A_ExpZero & A_MantZero;
    assign B_Zero_o = B_ExpZero & B_MantZero;
    assign C_Zero_o = C_ExpZero & C_MantZero;

    assign A_Inf_o = A_ExpFull & A_MantZero;
    assign B_Inf_o = B_ExpFull & B_MantZero;
    assign C_Inf_o = C_ExpFull & C_MantZero;

    assign A_NaN_o = A_ExpFull & (~A_MantZero);
    assign B_NaN_o = B_ExpFull & (~B_MantZero);
    assign C_NaN_o = C_ExpFull & (~C_MantZero);

    assign A_DeN_o = A_ExpZero & (~A_MantZero);
    assign B_DeN_o = B_ExpZero & (~B_MantZero);
    assign C_DeN_o = C_ExpZero & (~C_MantZero);

endmodule
--------------------------------------------------------------------------------
/src/01_RTL/WallaceTree.v:
--------------------------------------------------------------------------------
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Engineer: Tzu-Han Hsu
// Create Date: 07/22/2022 03:15:31 PM
// Module Name: WallaceTree
// Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
// HDL(Version): Verilog-2005
//
// Dependencies: Compressor32.v
// Compressor42.v
//
//////////////////////////////////////////////////////////////////////////////////
// Description: Sums 13 partial products using carry save adder(CSA) into carry and sum
// with:
// 9x 3-2 Compressor
// 1x 4-2 Compressor
//
//////////////////////////////////////////////////////////////////////////////////
// Revision:
// 07/22/2022 - Basic wiring finished, I/O signals updated for appropriate prefix
// 07/25/2022 - Use generate statements to simplify code
// 07/25/2022 - Multidriven net fixed
// 07/25/2022 - Interleaving Wires rearranged
// 08/15/2022 - Interconnection changed, to reduce critical path
// 09/12/2022 - Add BSD-3-Clause Licence
//
//////////////////////////////////////////////////////////////////////////////////
// License information:
//
// This software is released under the BSD-3-Clause Licence,
// see https://opensource.org/licenses/BSD-3-Clause for details.
// In the following license statements, "software" refers to the
// "source code" of the complete hardware/software system.
//
// Copyright 2022,
// Embedded Intelligent Systems Lab (EISL)
// Department of Computer Science
// National Yang Ming Chiao Tung University
// Hsinchu, Taiwan.
//
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its contributors
// may be used to endorse or promote products derived from this software
// without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED.
// IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//////////////////////////////////////////////////////////////////////////////////


module WallaceTree #(
    parameter PARM_MANT = 23
) (
    input [2*PARM_MANT + 2 : 0] pp_00_i,
    input [2*PARM_MANT + 2 : 0] pp_01_i,
    input [2*PARM_MANT + 2 : 0] pp_02_i,
    input [2*PARM_MANT + 2 : 0] pp_03_i,
    input [2*PARM_MANT + 2 : 0] pp_04_i,
    input [2*PARM_MANT + 2 : 0] pp_05_i,
    input [2*PARM_MANT + 2 : 0] pp_06_i,
    input [2*PARM_MANT + 2 : 0] pp_07_i,
    input [2*PARM_MANT + 2 : 0] pp_08_i,
    input [2*PARM_MANT + 2 : 0] pp_09_i,
    input [2*PARM_MANT + 2 : 0] pp_10_i,
    input [2*PARM_MANT + 2 : 0] pp_11_i,
    input [2*PARM_MANT + 1 : 0] pp_12_i,

    output [2*PARM_MANT + 2 : 0] wallace_sum_o,
    output [2*PARM_MANT + 2 : 0] wallace_carry_o,
    output suppression_sign_extension_o);


    wire [2*PARM_MANT + 2 : 0] csa_sum [9 - 1: 0];
    wire [2*PARM_MANT + 2 : 0] csa_carry [9 - 1: 0];

    wire [2*PARM_MANT + 2 : 0] csa_shcy [9 - 1: 0];
    wire [9 : 3] sign_extension;
    generate
        genvar i;
        for(i = 3; i < 9; i = i+1)begin
            assign sign_extension[i] = csa_carry[i][2*PARM_MANT + 2];
        end
    endgenerate

    generate
        genvar j;
        for(j = 0; j < 9; j = j+1)begin
            assign csa_shcy[j] = csa_carry[j] << 1;
        end
    endgenerate


    Compressor32 #(2*PARM_MANT + 3)
        LV1_0 (.A_i(pp_00_i),.B_i(pp_01_i),.C_i(pp_02_i),.Sum_o(csa_sum[0]),.Carry_o(csa_carry[0]));
    Compressor32 #(2*PARM_MANT + 3) LV1_1 (.A_i(pp_03_i),.B_i(pp_04_i),.C_i(pp_05_i),.Sum_o(csa_sum[1]),.Carry_o(csa_carry[1]));
    Compressor32 #(2*PARM_MANT + 3) LV1_2 (.A_i(pp_06_i),.B_i(pp_07_i),.C_i(pp_08_i),.Sum_o(csa_sum[2]),.Carry_o(csa_carry[2]));
    Compressor32 #(2*PARM_MANT + 3) LV1_3 (.A_i(pp_09_i),.B_i(pp_10_i),.C_i(pp_11_i),.Sum_o(csa_sum[3]),.Carry_o(csa_carry[3]));

    Compressor32 #(2*PARM_MANT + 3) LV2_0 (.A_i(csa_sum[0] ),.B_i(csa_shcy[0]),.C_i(csa_sum[1] ),.Sum_o(csa_sum[4]),.Carry_o(csa_carry[4]));
    Compressor32 #(2*PARM_MANT + 3) LV2_1 (.A_i(csa_shcy[1]),.B_i(csa_sum[2] ),.C_i(csa_shcy[2]),.Sum_o(csa_sum[5]),.Carry_o(csa_carry[5]));
    Compressor32 #(2*PARM_MANT + 3) LV2_2 (.A_i(csa_sum[3] ),.B_i(csa_shcy[3]),.C_i({1'd0, pp_12_i}),.Sum_o(csa_sum[6]),.Carry_o(csa_carry[6]));

    Compressor32 #(2*PARM_MANT + 3) LV3_0 (.A_i(csa_sum[4] ),.B_i(csa_shcy[4]),.C_i(csa_sum[5] ),.Sum_o(csa_sum[7]),.Carry_o(csa_carry[7]));
    Compressor32 #(2*PARM_MANT + 3) LV3_1 (.A_i(csa_shcy[5]),.B_i(csa_sum[6] ),.C_i(csa_shcy[6]),.Sum_o(csa_sum[8]),.Carry_o(csa_carry[8]));

    Compressor42 #(2*PARM_MANT + 3)
    LV4_Final (
        .A_i(csa_sum[7]),
        .B_i(csa_shcy[7]),
        .C_i(csa_sum[8]),
        .D_i(csa_shcy[8]),
        .Sum_o(wallace_sum_o),
        .Carry_o(wallace_carry_o),
        .hidden_carry_msb(sign_extension[9])
    );

    assign suppression_sign_extension_o = |sign_extension;

endmodule

--------------------------------------------------------------------------------
/src/01_RTL/ZeroDetector_Base.v:
--------------------------------------------------------------------------------
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Engineer: Tzu-Han Hsu
// Create Date: 07/29/2022 11:01:59 PM
// Module Name: ZeroDetector_Base
// Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
// HDL(Version): Verilog-2005
//
// Dependencies: None
//
//////////////////////////////////////////////////////////////////////////////////
// Description: Check if the input bits are all zero
//
//////////////////////////////////////////////////////////////////////////////////
// Revision:
// 08/15/2022 - allow parameter to control input size
// 09/12/2022 - Add BSD-3-Clause Licence
//
//////////////////////////////////////////////////////////////////////////////////
// License information:
//
// This software is released under the BSD-3-Clause Licence,
// see https://opensource.org/licenses/BSD-3-Clause for details.
// In the following license statements, "software" refers to the
// "source code" of the complete hardware/software system.
//
// Copyright 2022,
// Embedded Intelligent Systems Lab (EISL)
// Department of Computer Science
// National Yang Ming Chiao Tung University
// Hsinchu, Taiwan.
//
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its contributors
// may be used to endorse or promote products derived from this software
// without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//////////////////////////////////////////////////////////////////////////////////


module ZeroDetector_Base #(
    parameter XLEN = 8
) (
    input [XLEN - 1: 0] base_data_i,
    output zero_o );

    assign zero_o = (base_data_i == 0);

endmodule

--------------------------------------------------------------------------------
/src/01_RTL/ZeroDetector_Group.v:
--------------------------------------------------------------------------------
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Engineer: Tzu-Han Hsu
// Create Date: 07/29/2022 11:01:59 PM
// Module Name: ZeroDetector_Group
// Project Name: IEEE-754 & RISC-V Compatible Multiply-Accumulate Unit
// HDL(Version): Verilog-2005
//
// Dependencies: None
//
//////////////////////////////////////////////////////////////////////////////////
// Description: Check if two groups are all zero
//
//////////////////////////////////////////////////////////////////////////////////
// Revision:
// 09/12/2022 - Add BSD-3-Clause Licence
//
//////////////////////////////////////////////////////////////////////////////////
// License information:
//
// This software is released under the BSD-3-Clause Licence,
// see https://opensource.org/licenses/BSD-3-Clause for details.
// In the following license statements, "software" refers to the
// "source code" of the complete hardware/software system.
//
// Copyright 2022,
// Embedded Intelligent Systems Lab (EISL)
// Department of Computer Science
// National Yang Ming Chiao Tung University
// Hsinchu, Taiwan.
//
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// 1. Redistributions of source code must retain the above copyright notice,
// this list of conditions and the following disclaimer.
//
// 2. Redistributions in binary form must reproduce the above copyright notice,
// this list of conditions and the following disclaimer in the documentation
// and/or other materials provided with the distribution.
//
// 3. Neither the name of the copyright holder nor the names of its contributors
// may be used to endorse or promote products derived from this software
// without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED.
// IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
// LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
// CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
// SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
// INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
// CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
// ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
// POSSIBILITY OF SUCH DAMAGE.
//////////////////////////////////////////////////////////////////////////////////


module ZeroDetector_Group #(
    parameter XLEN = 2
) (
    input [XLEN - 1 : 0] group_data_i,
    output group_zero_o );

    assign group_zero_o = &group_data_i;

endmodule
--------------------------------------------------------------------------------
/src/02_SYN/01_run_dc:
--------------------------------------------------------------------------------
dc_shell-t -f syn.tcl | tee syn.log

--------------------------------------------------------------------------------
/src/02_SYN/09_clean_up:
--------------------------------------------------------------------------------
rm -rf INCA_libs nWaveLog
rm -rf *.fsdb
rm -rf *.log
rm -rf *~
rm -rf ./Netlist/*.*
rm -rf ./Report/*.*
rm -rf dwsvf*
rm -rf alib*
rm -rf default.svf
rm -rf alib-52
rm -rf *-verilog.*
rm -rf *.mr
rm -rf diff_syn

--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.area:
--------------------------------------------------------------------------------

****************************************
Report : area
Design : SUBWAY
Version: T-2022.03
Date : Sun Apr 9 02:23:57 2023
****************************************

Library(s) Used:

    slow (File:
/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db)

Number of ports:                   16
Number of nets:                    835
Number of cells:                   718
Number of combinational cells:     554
Number of sequential cells:        164
Number of macros/black boxes:      0
Number of buf/inv:                 73
Number of references:              36

Combinational area:                8236.166493
Buf/Inv area:                      728.481627
Noncombinational area:             11991.672226
Macro/Black Box area:              0.000000
Net Interconnect area:             undefined (No wire load specified)

Total cell area:                   20227.838719
Total area:                        undefined
1

--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.check:
--------------------------------------------------------------------------------
1

--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.resource:
--------------------------------------------------------------------------------

****************************************
Report : resources
Design : SUBWAY
Version: T-2022.03
Date : Sun Apr 9 02:23:57 2023
****************************************


Resource Report for this hierarchy in file
/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v
=================================================================================
| Cell         | Module               | Parameters | Contained Operations       |
=================================================================================
| lt_x_2       | DW_cmp               | width=7    | lt_100 (SUBWAY.v:100)      |
| gte_x_3      | DW_cmp               | width=7    | gte_111 (SUBWAY.v:111)     |
| add_x_100    | DW01_inc             | width=6    | add_418 (SUBWAY.v:418)     |
| sub_x_7      | DW01_dec             | width=2    | sub_146 (SUBWAY.v:146)     |
|              |                      |            | sub_146_2 (SUBWAY.v:146)   |
|              |                      |            | sub_149 (SUBWAY.v:149)     |
|              |                      |            | sub_173 (SUBWAY.v:173)     |
|              |                      |            | sub_175 (SUBWAY.v:175)     |
|              |                      |            | sub_192 (SUBWAY.v:192)     |
|              |                      |            | sub_206 (SUBWAY.v:206)     |
|              |                      |            | sub_209 (SUBWAY.v:209)     |
|              |                      |            | sub_225 (SUBWAY.v:225)     |
|              |                      |            | sub_242 (SUBWAY.v:242)     |
|              |                      |            | sub_242_2 (SUBWAY.v:242)   |
|              |                      |            | sub_242_3 (SUBWAY.v:242)   |
|              |                      |            | sub_247 (SUBWAY.v:247)     |
|              |                      |            | sub_251 (SUBWAY.v:251)     |
|              |                      |            | sub_260 (SUBWAY.v:260)     |
|              |                      |            | sub_283 (SUBWAY.v:283)     |
|              |                      |            | sub_283_2 (SUBWAY.v:283)   |
|              |                      |            | sub_286 (SUBWAY.v:286)     |
|              |                      |            | sub_311 (SUBWAY.v:311)     |
|              |                      |            | sub_313 (SUBWAY.v:313)     |
|              |                      |            | sub_330 (SUBWAY.v:330)     |
|              |                      |            | sub_344 (SUBWAY.v:344)     |
|              |                      |            | sub_347 (SUBWAY.v:347)     |
|              |                      |            | sub_363 (SUBWAY.v:363)     |
|              |                      |            | sub_380 (SUBWAY.v:380)     |
|              |                      |            | sub_380_2 (SUBWAY.v:380)   |
|              |                      |            | sub_380_3 (SUBWAY.v:380)   |
|              |                      |            | sub_385 (SUBWAY.v:385)     |
|              |                      |            | sub_389 (SUBWAY.v:389)     |
|              |                      |            | sub_398 (SUBWAY.v:398)     |
| add_x_6      | DW01_inc             | width=2    | add_142 (SUBWAY.v:142)     |
|              |                      |            | add_142_2 (SUBWAY.v:142)   |
|              |                      |            | add_145 (SUBWAY.v:145)     |
|              |                      |            | add_169 (SUBWAY.v:169)     |
|              |                      |            | add_172 (SUBWAY.v:172)     |
|              |                      |            | add_188 (SUBWAY.v:188)     |
|              |                      |            | add_202 (SUBWAY.v:202)     |
|              |                      |            | add_205 (SUBWAY.v:205)     |
|              |                      |            | add_221 (SUBWAY.v:221)     |
|              |                      |            | add_236 (SUBWAY.v:236)     |
|              |                      |            | add_242 (SUBWAY.v:242)     |
|              |                      |            | add_242_2 (SUBWAY.v:242)   |
|              |                      |            | add_246 (SUBWAY.v:246)     |
|              |                      |            | add_247 (SUBWAY.v:247)     |
|              |                      |            | add_279 (SUBWAY.v:279)     |
|              |                      |            | add_279_2 (SUBWAY.v:279)   |
|              |                      |            | add_282 (SUBWAY.v:282)     |
|              |                      |            | add_307 (SUBWAY.v:307)     |
|              |                      |            | add_310 (SUBWAY.v:310)     |
|              |                      |            | add_326 (SUBWAY.v:326)     |
|              |                      |            | add_340 (SUBWAY.v:340)     |
|              |                      |            | add_343 (SUBWAY.v:343)     |
|              |                      |            | add_359 (SUBWAY.v:359)     |
|              |                      |            | add_374 (SUBWAY.v:374)     |
|              |                      |            | add_380 (SUBWAY.v:380)     |
|              |                      |            | add_380_2 (SUBWAY.v:380)   |
|              |                      |            | add_384 (SUBWAY.v:384)     |
|              |                      |            | add_385 (SUBWAY.v:385)     |
| eq_x_103     | DW_cmp               | width=4    | eq_184 (SUBWAY.v:184)      |
|              |                      |            | eq_322 (SUBWAY.v:322)      |
| DP_OP_304J1_122_9552                |            |                            |
|              | DP_OP_304J1_122_9552 |            |                            |
| DP_OP_305J1_123_2228                |            |                            |
|              | DP_OP_305J1_123_2228 |            |                            |
=================================================================================

Datapath Report for DP_OP_304J1_122_9552
==============================================================================
| Cell                 | Contained Operations                                |
==============================================================================
| DP_OP_304J1_122_9552 | mult_add_87_aco (SUBWAY.v:87)                       |
|                      | add_87_aco (SUBWAY.v:87)                            |
==============================================================================

==============================================================================
|       |      | Data     |       |                                          |
| Var   | Type | Class    | Width | Expression                               |
==============================================================================
| I1    | PI   | Unsigned | 7     |                                          |
| I2    | PI   | Unsigned | 1     |                                          |
| T163  | IFO  | Unsigned | 7     | I1 * I2 (SUBWAY.v:87)                    |
| O1    | PO   | Unsigned | 7     | T163 + $unsigned(1'b1) (SUBWAY.v:87)     |
==============================================================================

Datapath Report for DP_OP_305J1_123_2228
==============================================================================
| Cell                 | Contained Operations                                |
==============================================================================
| DP_OP_305J1_123_2228 | add_184 (SUBWAY.v:184) add_184_2 (SUBWAY.v:184)     |
|                      | add_184_3 (SUBWAY.v:184) add_322 (SUBWAY.v:322)     |
|                      | add_322_2 (SUBWAY.v:322) add_322_3 (SUBWAY.v:322)   |
==============================================================================

=====================================================================================
|       |      | Data     |       |                                                 |
| Var   | Type | Class    | Width | Expression                                      |
=====================================================================================
| I1    | PI   | Unsigned | 1     |                                                 |
| I2    | PI   | Unsigned | 1     |                                                 |
| I3    | PI   | Unsigned | 1     |                                                 |
| I4    | PI   | Unsigned | 1     |                                                 |
| O1    | PO   | Unsigned | 4     | I1 + I2 + I3 + I4 ( SUBWAY.v:184 SUBWAY.v:322 ) |
=====================================================================================


Implementation Report
===============================================================================
|                |                      | Current            | Set            |
| Cell           | Module               | Implementation     | Implementation |
===============================================================================
| lt_x_2         | DW_cmp               | apparch (area)     |                |
| gte_x_3        | DW_cmp               | apparch (area)     |                |
| add_x_100      | DW01_inc             | apparch (area)     |                |
| sub_x_7        | DW01_dec             | apparch (area)     |                |
| add_x_6        | DW01_inc             | apparch (area)     |                |
| eq_x_103       | DW_cmp               | apparch (area)     |                |
| DP_OP_304J1_122_9552                  |                    |                |
|                | DP_OP_304J1_122_9552 | str (area)         |                |
|                |                      | mult_arch: and     |                |
| DP_OP_305J1_123_2228                  |                    |                |
|                | DP_OP_305J1_123_2228 | str (area)         |                |
===============================================================================

1

--------------------------------------------------------------------------------
/src/02_SYN/Report/SUBWAY.timing:
--------------------------------------------------------------------------------
Information: Updating design information...
(UID-85)

****************************************
Report : timing
        -path full
        -delay max
        -max_paths 1
Design : SUBWAY
Version: T-2022.03
Date : Sun Apr 9 02:23:57 2023
****************************************

Operating Conditions: slow   Library: slow
Wire Load Model Mode: top

  Startpoint: in_valid (input port clocked by clk)
  Endpoint: map_reg[0][0][0]
            (rising edge-triggered flip-flop clocked by clk)
  Path Group: clk
  Path Type: max

  Point                                    Incr       Path
  -----------------------------------------------------------
  clock clk (rise edge)                    0.00       0.00
  clock network delay (ideal)              0.00       0.00
  input external delay                     5.00       5.00 f
  in_valid (in)                            0.00       5.00 f
  U679/Y (NAND2XL)                         0.13       5.13 r
  U645/Y (OAI31XL)                         0.30       5.43 f
  U614/Y (NOR2XL)                          1.63       7.06 r
  U739/Y (INVXL)                           0.69       7.74 f
  U784/Y (AOI2BB2XL)                       0.47       8.21 f
  map_reg[0][0][0]/D (DFFRX1)              0.00       8.21 f
  data arrival time                                   8.21

  clock clk (rise edge)                   10.00      10.00
  clock network delay (ideal)              0.00      10.00
  map_reg[0][0][0]/CK (DFFRX1)             0.00      10.00 r
  library setup time                      -0.32       9.68
  data required time                                  9.68
  -----------------------------------------------------------
  data required time                                  9.68
  data arrival time                                  -8.21
  -----------------------------------------------------------
  slack (MET)                                         1.46


1

--------------------------------------------------------------------------------
/src/02_SYN/default.svf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/src/02_SYN/default.svf

--------------------------------------------------------------------------------
/src/02_SYN/syn.log:
--------------------------------------------------------------------------------

Design Compiler Graphical
DC Ultra (TM)
DFTMAX (TM)
Power Compiler (TM)
DesignWare (R)
DC Expert (TM)
Design Vision (TM)
HDL Compiler (TM)
VHDL Compiler (TM)
DFT Compiler
Design Compiler(R)

Version T-2022.03 for linux64 - Feb 22, 2022

Copyright (c) 1988 - 2022 Synopsys, Inc.
This software and the associated documentation are proprietary to Synopsys,
Inc. This software may only be used in accordance with the terms and conditions
of a written license agreement with Synopsys, Inc. All other use, reproduction,
or distribution of this software is strictly prohibited. Licensed Products
communicate with Synopsys servers for the purpose of providing software
updates, detecting software piracy and verifying that customers are using
Licensed Products in conformity with the applicable License Key for such
Licensed Products. Synopsys will use information gathered in connection with
this process to deliver software updates and pursue software pirates and
infringers.

Inclusivity & Diversity - Visit SolvNetPlus to read the "Synopsys Statement on
Inclusivity and Diversity" (Refer to article 000036315 at
https://solvnetplus.synopsys.com)
Initializing...
#======================================================
#
# Synopsys Synthesis Scripts (Design Vision dctcl mode)
#
#======================================================
#======================================================
# Set Libraries
#======================================================
set search_path {./../01_RTL \
                 ~iclabta01/umc018/Synthesis/ \
                 /usr/synthesis/libraries/syn/ }
./../01_RTL ~iclabta01/umc018/Synthesis/ /usr/synthesis/libraries/syn/
set synthetic_library {dw_foundation.sldb}
dw_foundation.sldb
set link_library {* dw_foundation.sldb standard.sldb slow.db}
* dw_foundation.sldb standard.sldb slow.db
set target_library {slow.db}
slow.db
#======================================================
# Global Parameters
#======================================================
set DESIGN "SUBWAY"
SUBWAY
set hdlin_ff_always_sync_set_reset true
true
set CLK_period 10.0
10.0
#======================================================
# Read RTL Code
#======================================================
read_sverilog $DESIGN.v
Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/dw_foundation.sldb'
Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/standard.sldb'
Loading db file '/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db'
Loading db file '/usr/cad/synopsys/synthesis/2022.03/libraries/syn/gtech.db'
Loading link library 'slow'
Loading link library 'gtech'
Loading sverilog file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
Detecting input file type automatically (-rtl or -netlist).
Reading with Presto HDL Compiler (equivalent to -rtl option).
Running PRESTO HDLC
Compiling source file /RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v
Warning: /RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v:77: signed to unsigned conversion occurs. (VER-318)

Statistics for case statements in always block at line 74 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
===============================================
| Line                   | full/ parallel     |
===============================================
| 75                     | auto/auto          |
===============================================

Statistics for case statements in always block at line 125 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'
===============================================
| Line                   | full/ parallel     |
===============================================
| 133                    | no/auto            |
| 138                    | auto/auto          |
| 231                    | auto/auto          |
| 275                    | auto/auto          |
| 369                    | auto/auto          |
===============================================

Inferred memory devices in process
    in routine SUBWAY line 69 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| current_state_reg   | Flip-flop | 2     | Y   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 84 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| cnt_reg             | Flip-flop | 7     | Y   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 91 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| map_reg             | Flip-flop | 32    | Y   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 125 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| answer_reg          | Flip-flop | 112   | Y   | N  | Y  | N  | N  | N  | N  |
| lane_reg            | Flip-flop | 2     | Y   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 414 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| ans_idx_reg         | Flip-flop | 6     | Y   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 427 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
| Register Name       | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
| out_valid_reg       | Flip-flop | 1     | N   | N  | Y  | N  | N  | N  | N  |
===============================================================================

Inferred memory devices in process
    in routine SUBWAY line 436 in file
    '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.v'.
===============================================================================
|    Register Name    |   Type    | Width | Bus | MB | AR | AS | SR | SS | ST |
===============================================================================
|       out_reg       | Flip-flop |   2   |  Y  | N  | Y  | N  | N  | N  | N  |
===============================================================================
Statistics for MUX_OPs
======================================================
| block name/line  | Inputs | Outputs | # sel inputs |
======================================================
|    SUBWAY/140    |   4    |    4    |      2       |
|    SUBWAY/142    |   4    |    2    |      2       |
|    SUBWAY/142    |   4    |    2    |      2       |
|    SUBWAY/146    |   4    |    2    |      2       |
|    SUBWAY/146    |   4    |    2    |      2       |
|    SUBWAY/169    |   4    |    2    |      2       |
|    SUBWAY/173    |   4    |    2    |      2       |
|    SUBWAY/202    |   4    |    2    |      2       |
|    SUBWAY/206    |   4    |    2    |      2       |
|    SUBWAY/242    |   4    |    2    |      2       |
|    SUBWAY/242    |   4    |    2    |      2       |
|    SUBWAY/242    |   4    |    2    |      2       |
|    SUBWAY/242    |   4    |    2    |      2       |
|    SUBWAY/242    |   4    |    2    |      2       |
|    SUBWAY/247    |   4    |    2    |      2       |
|    SUBWAY/247    |   4    |    2    |      2       |
|    SUBWAY/279    |   4    |    2    |      2       |
|    SUBWAY/279    |   4    |    2    |      2       |
|    SUBWAY/283    |   4    |    2    |      2       |
|    SUBWAY/283    |   4    |    2    |      2       |
|    SUBWAY/307    |   4    |    2    |      2       |
|    SUBWAY/311    |   4    |    2    |      2       |
|    SUBWAY/340    |   4    |    2    |      2       |
|    SUBWAY/344    |   4    |    2    |      2       |
|    SUBWAY/380    |   4    |    2    |      2       |
|    SUBWAY/380    |   4    |    2    |      2       |
|    SUBWAY/380    |   4    |    2    |      2       |
|    SUBWAY/380    |   4    |    2    |      2       |
|    SUBWAY/380    |   4    |    2    |      2       |
|    SUBWAY/385    |   4    |    2    |      2       |
|    SUBWAY/385    |   4    |    2    |      2       |
======================================================
Information: Complex logic will not be considered for set/reset inference. (ELAB-2008)
Presto compilation completed successfully.
Current design is now '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/01_RTL/SUBWAY.db:SUBWAY'
Loaded 1 design.
Current design is 'SUBWAY'.
SUBWAY
current_design $DESIGN
Current design is 'SUBWAY'.
{SUBWAY}
#======================================================
#  Global Setting
#======================================================
set_wire_load_mode top
1
#======================================================
#  Set Design Constraints
#======================================================
create_clock -name "clk" -period $CLK_period clk
1
set_input_delay [ expr $CLK_period*0.5 ] -clock clk [all_inputs]
1
set_output_delay [ expr $CLK_period*0.5 ] -clock clk [all_outputs]
1
set_input_delay 0 -clock clk clk
1
set_load 0.05 [all_outputs]
1
#======================================================
#  Optimization
#======================================================
uniquify
1
set_fix_multiple_port_nets -all -buffer_constants
1
compile_ultra
Information: Performing power optimization. (PWR-850)
Analyzing: "/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db"
Library analysis succeeded.
Information: Evaluating DesignWare library utilization. (UISN-27)

============================================================================
|    DesignWare Building Block Library    |         Version         | Available |
============================================================================
| Basic DW Building Blocks                | S-2021.06-DWBB_202106.0 |     *     |
| Licensed DW Building Blocks             | S-2021.06-DWBB_202106.0 |     *     |
============================================================================

====================================================================================================
| Flow Information                                                                                 |
----------------------------------------------------------------------------------------------------
| Flow                                                       | Design Compiler WLM                 |
| Comand line                                                | compile_ultra                       |
====================================================================================================
| Design Information                                         | Value                               |
====================================================================================================
| Number of Scenarios                                        | 0                                   |
| Leaf Cell Count                                            | 3498                                |
| Number of User Hierarchies                                 | 0                                   |
| Sequential Cell Count                                      | 164                                 |
| Macro Count                                                | 0                                   |
| Number of Power Domains                                    | 0                                   |
| Number of Path Groups                                      | 2                                   |
| Number of VT class                                         | 0                                   |
| Number of Clocks                                           | 1                                   |
| Number of Dont Touch cells                                 | 264                                 |
| Number of Dont Touch nets                                  | 0                                   |
| Number of size only cells                                  | 0                                   |
| Design with UPF Data                                       | false                               |
----------------------------------------------------------------------------------------------------
| Variables                                                  | Value                               |
----------------------------------------------------------------------------------------------------
| set_fix_multiple_port_nets                                 | -all -buffer_constants              |
====================================================================================================
Information: Sequential output inversion is enabled.  SVF file must be used for formal verification. (OPT-1208)

Information: There are 30 potential problems in your design. Please run 'check_design' for more information. (LINT-99)

  Simplifying Design 'SUBWAY'

Loaded alib file './alib-52/slow.db.alib'
  Building model 'DW01_NAND2'
Information: Ungrouping 0 of 1 hierarchies before Pass 1 (OPT-775)
Information: State dependent leakage is now switched from on to off.

  Beginning Pass 1 Mapping
  ------------------------
  Processing 'SUBWAY'
Information: Added key list 'DesignWare' to design 'SUBWAY'. (DDB-72)
 Implement Synthetic for 'SUBWAY'.

  Updating timing information
Information: Updating design information... (UID-85)
Information: The library cell 'HOLDX1' in the library 'slow' is not characterized for internal power. (PWR-536)
Information: The target library(s) contains cell(s), other than black boxes, that are not characterized for internal power. (PWR-24)

  Beginning Mapping Optimizations  (Ultra High effort)
  -------------------------------
Information: There is no timing violation in design SUBWAY. Delay-based auto_ungroup will not be performed. (OPT-780)

                                  TOTAL
   ELAPSED            WORST NEG   SETUP    DESIGN                              LEAKAGE
    TIME      AREA      SLACK     COST    RULE COST         ENDPOINT            POWER
  --------- --------- --------- --------- --------- ------------------------- ---------
    0:00:04   27748.8      0.00       0.0       0.0                           3245796.2500
    0:00:04   27738.8      0.00       0.0       0.0                           3245446.2500

  Beginning Constant Register Removal
  -----------------------------------
    0:00:04   27738.8      0.00       0.0       0.0                           3245446.2500
    0:00:04   27738.8      0.00       0.0       0.0                           3245446.2500

  Beginning Global Optimizations
  ------------------------------
  Numerical Synthesis (Phase 1)
  Numerical Synthesis (Phase 2)
  Global Optimization (Phase 1)
  Global Optimization (Phase 2)
  Global Optimization (Phase 3)
  Global Optimization (Phase 4)
  Global Optimization (Phase 5)
  Global Optimization (Phase 6)
  Global Optimization (Phase 7)
  Global Optimization (Phase 8)
  Global Optimization (Phase 9)
  Global Optimization (Phase 10)
  Global Optimization (Phase 11)
  Global Optimization (Phase 12)
  Global Optimization (Phase 13)
  Global Optimization (Phase 14)
  Global Optimization (Phase 15)
  Global Optimization (Phase 16)
  Global Optimization (Phase 17)
  Global Optimization (Phase 18)
  Global Optimization (Phase 19)
  Global Optimization (Phase 20)
  Global Optimization (Phase 21)
  Global Optimization (Phase 22)
  Global Optimization (Phase 23)
  Global Optimization (Phase 24)
  Global Optimization (Phase 25)
  Global Optimization (Phase 26)
  Global Optimization (Phase 27)
  Global Optimization (Phase 28)
  Global Optimization (Phase 29)
  Global Optimization (Phase 30)

  Beginning Isolate Ports
  -----------------------

  Beginning Delay Optimization
  ----------------------------
    0:00:05   20969.6      0.00       0.0       0.0                           1455825.1250
    0:00:05   20969.6      0.00       0.0       0.0                           1455825.1250
    0:00:05   20969.6      0.00       0.0       0.0                           1455825.1250
    0:00:05   20297.7      0.00       0.0       0.0                           1435287.1250
    0:00:05   20291.0      0.00       0.0       0.0                           1435205.0000
    0:00:05   20291.0      0.00       0.0       0.0                           1435205.0000

  Beginning WLM Backend Optimization
  --------------------------------------
    0:00:05   20291.0      0.00       0.0       0.0                           1429606.6250
    0:00:05   20291.0      0.00       0.0       0.0                           1429606.6250
    0:00:05   20291.0      0.00       0.0       0.0                           1429606.6250
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750


  Beginning Leakage Power Optimization  (max_leakage_power 0)
  ------------------------------------

                                  TOTAL
   ELAPSED            WORST NEG   SETUP    DESIGN                              LEAKAGE
    TIME      AREA      SLACK     COST    RULE COST         ENDPOINT            POWER
  --------- --------- --------- --------- --------- ------------------------- ---------
    0:00:05   20287.7      0.00       0.0       0.0                           1355246.8750
  Global Optimization (Phase 31)
  Global Optimization (Phase 32)
  Global Optimization (Phase 33)
  Global Optimization (Phase 34)
  Global Optimization (Phase 35)
  Global Optimization (Phase 36)
  Global Optimization (Phase 37)
  Global Optimization (Phase 38)
  Global Optimization (Phase 39)
  Global Optimization (Phase 40)
  Global Optimization (Phase 41)
  Global Optimization (Phase 42)
  Global Optimization (Phase 43)
    0:00:05   20440.7      0.00       0.0       0.0                           1230133.5000
    0:00:05   20440.7      0.00       0.0       0.0                           1230133.5000
    0:00:05   20440.7      0.00       0.0       0.0                           1230133.5000
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500

                                  TOTAL
   ELAPSED            WORST NEG   SETUP    DESIGN                              LEAKAGE
    TIME      AREA      SLACK     COST    RULE COST         ENDPOINT            POWER
  --------- --------- --------- --------- --------- ------------------------- ---------
    0:00:06   20347.6      0.00       0.0       0.0                           1235536.2500
    0:00:06   20148.0      0.00       0.0       0.0                           1246586.7500
    0:00:06   20148.0      0.00       0.0       0.0                           1246586.7500
    0:00:06   20148.0      0.00       0.0       0.0                           1246586.7500
    0:00:06   20148.0      0.00       0.0       0.0                           1246586.7500
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
    0:00:06   20227.8      0.00       0.0       0.0                           1241531.3750
Loading db file '/RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db'


Note: Symbol # after min delay cost means estimated hold TNS across all active scenarios


  Optimization Complete
  ---------------------
Information: State dependent leakage is now switched from off to on.
Information: Propagating switching activity (low effort zero delay simulation). (PWR-6)
1
#======================================================
#  Output Reports
#======================================================
check_design > Report/$DESIGN\.check
report_timing > Report/$DESIGN\.timing
report_area > Report/$DESIGN\.area
report_resource > Report/$DESIGN\.resource
#======================================================
#  Change Naming Rule
#======================================================
set bus_inference_style "%s\[%d\]"
%s[%d]
set bus_naming_style "%s\[%d\]"
%s[%d]
set hdlout_internal_busses true
true
change_names -hierarchy -rule verilog
1
define_name_rules name_rule -allowed "a-z A-Z 0-9 _" -max_length 255 -type cell
1
define_name_rules name_rule -allowed "a-z A-Z 0-9 _[]" -max_length 255 -type net
1
define_name_rules name_rule -map {{"\\*cell\\*" "cell"}}
1
change_names -hierarchy -rules name_rule
1
#======================================================
#  Output Results
#======================================================
set verilogout_higher_designs_first true
true
write -format verilog -output Netlist/$DESIGN\_SYN.v -hierarchy
Writing verilog file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/02_SYN/Netlist/SUBWAY_SYN.v'.
1
write_sdf -version 2.1 -context verilog -load_delay cell Netlist/$DESIGN\_SYN.sdf
Information: Writing timing information to file '/RAID2/COURSE/iclab/iclab105/oldarchive/Lab03/Exercise/02_SYN/Netlist/SUBWAY_SYN.sdf'. (WT-3)
1
report_area

****************************************
Report : area
Design : SUBWAY
Version: T-2022.03
Date   : Sun Apr  9 02:23:57 2023
****************************************

Library(s) Used:

    slow (File: /RAID2/COURSE/iclab/iclabta01/umc018/Synthesis/slow.db)

Number of ports:                           16
Number of nets:                           835
Number of cells:                          718
Number of combinational cells:            554
Number of sequential cells:               164
Number of macros/black boxes:               0
Number of buf/inv:                         73
Number of references:                      36

Combinational area:               8236.166493
Buf/Inv area:                      728.481627
Noncombinational area:           11991.672226
Macro/Black Box area:                0.000000
Net Interconnect area:      undefined  (No wire load specified)

Total cell area:                 20227.838719
Total area:                 undefined
1
report_timing

****************************************
Report : timing
        -path full
        -delay max
        -max_paths 1
Design : SUBWAY
Version: T-2022.03
Date   : Sun Apr  9 02:23:57 2023
****************************************

Operating Conditions: slow   Library: slow
Wire Load Model Mode: top

  Startpoint: in_valid (input port clocked by clk)
  Endpoint: map_reg_0__0__0_
            (rising edge-triggered flip-flop clocked by clk)
  Path Group: clk
  Path Type: max

  Point                                    Incr       Path
  -----------------------------------------------------------
  clock clk (rise edge)                    0.00       0.00
  clock network delay (ideal)              0.00       0.00
  input external delay                     5.00       5.00 f
  in_valid (in)                            0.00       5.00 f
  U679/Y (NAND2XL)                         0.13       5.13 r
  U645/Y (OAI31XL)                         0.30       5.43 f
  U614/Y (NOR2XL)                          1.63       7.06 r
  U739/Y (INVXL)                           0.69       7.74 f
  U784/Y (AOI2BB2XL)                       0.47       8.21 f
  map_reg_0__0__0_/D (DFFRX1)              0.00       8.21 f
  data arrival time                                   8.21

  clock clk (rise edge)                   10.00      10.00
  clock network delay (ideal)              0.00      10.00
  map_reg_0__0__0_/CK (DFFRX1)             0.00      10.00 r
  library setup time                      -0.32       9.68
  data required time                                  9.68
  -----------------------------------------------------------
  data required time                                  9.68
  data arrival time                                  -8.21
  -----------------------------------------------------------
  slack (MET)                                         1.46


1
#======================================================
#  Finish and Quit
#======================================================
exit

Memory usage for this session 196 Mbytes.
Memory usage for this session including child processes 241 Mbytes.
CPU usage for this session 57 seconds ( 0.02 hours ).
Elapsed time for this session 62 seconds ( 0.02 hours ).

Thank you...
--------------------------------------------------------------------------------
/src/02_SYN/syn.tcl:
--------------------------------------------------------------------------------
#======================================================
#
# Synopsys Synthesis Scripts (Design Vision dctcl mode)
#
#======================================================

#======================================================
#  Set Libraries
#======================================================
set search_path {./../01_RTL \
                 ~iclabta01/umc018/Synthesis/ \
                 /usr/synthesis/libraries/syn/ }

set synthetic_library {dw_foundation.sldb}
set link_library {* dw_foundation.sldb standard.sldb slow.db}
set target_library {slow.db}

#======================================================
#  Global Parameters
#======================================================
set DESIGN "SUBWAY"
set hdlin_ff_always_sync_set_reset true
set CLK_period 10.0

#======================================================
#  Read RTL Code
#======================================================
read_sverilog $DESIGN.v
current_design $DESIGN

#======================================================
#  Global Setting
#======================================================
set_wire_load_mode top

#======================================================
#  Set Design Constraints
#======================================================
create_clock -name "clk" -period $CLK_period clk
set_input_delay [ expr $CLK_period*0.5 ] -clock clk [all_inputs]
set_output_delay [ expr $CLK_period*0.5 ] -clock clk [all_outputs]
set_input_delay 0 -clock clk clk
set_load 0.05 [all_outputs]

#======================================================
#  Optimization
#======================================================
uniquify
set_fix_multiple_port_nets -all -buffer_constants
compile_ultra

#======================================================
#  Output Reports
#======================================================
check_design > Report/$DESIGN\.check
report_timing > Report/$DESIGN\.timing
report_area > Report/$DESIGN\.area
report_resource > Report/$DESIGN\.resource

#======================================================
#  Change Naming Rule
#======================================================
set bus_inference_style "%s\[%d\]"
set bus_naming_style "%s\[%d\]"
set hdlout_internal_busses true

change_names -hierarchy -rule verilog

define_name_rules name_rule -allowed "a-z A-Z 0-9 _" -max_length 255 -type cell
define_name_rules name_rule -allowed "a-z A-Z 0-9 _[]" -max_length 255 -type net
define_name_rules name_rule -map {{"\\*cell\\*" "cell"}}
change_names -hierarchy -rules name_rule

#======================================================
#  Output Results
#======================================================

set verilogout_higher_designs_first true
write -format verilog -output Netlist/$DESIGN\_SYN.v -hierarchy
write_sdf -version 2.1 -context verilog -load_delay cell Netlist/$DESIGN\_SYN.sdf
report_area
report_timing
#======================================================
#  Finish and Quit
#======================================================
exit

--------------------------------------------------------------------------------
/src/03_GATE_SIM/01_run:
--------------------------------------------------------------------------------
irun -timescale 1ns/1fs -override_precision -sdf_precision 1fs TESTBED.v -define GATE -define FUNC -debug -v ~iclabta01/umc018/Verilog/umc18_neg.v -nontcglitch -loadpli1 debpli:novas_pli_boot

--------------------------------------------------------------------------------
/src/03_GATE_SIM/09_clean_up:
--------------------------------------------------------------------------------
rm -rf INCA_libs nWaveLog
rm -rf *.fsdb
rm -rf *.log
rm -rf *~
rm -rf *.sdf.X
rm -rf *.key
rm -rf *.conf
rm -rf *.rc

--------------------------------------------------------------------------------
/src/03_GATE_SIM/SUBWAY_SYN.sdf.X:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hankshyu/RISC-V_MAC/a99fb15770c8dd553d0b348fadddc2b8b3d4b22b/src/03_GATE_SIM/SUBWAY_SYN.sdf.X
--------------------------------------------------------------------------------