├── Makefile └── README.md ├── README.md ├── examples ├── fft │ ├── .clangd │ ├── .gitignore │ ├── Makefile │ ├── README.md │ ├── Vitis Tutorial.md │ ├── Vitis_HLS Tutorial.md │ ├── common │ │ └── include │ │ │ ├── event_timer.cpp │ │ │ ├── event_timer.hpp │ │ │ ├── logger.hpp │ │ │ ├── utils.hpp │ │ │ ├── xcl2.cpp │ │ │ └── xcl2.hpp │ ├── conn_u280.cfg │ ├── emconfig.json │ ├── global.h │ ├── host │ │ ├── fft_wrapper.hpp │ │ └── host.cpp │ ├── img │ │ ├── 2bde4ea848e911dc307525dd359e70e4.png │ │ ├── abaeeb9056d37a72a2598b4490bd3c41.png │ │ ├── image-20250106154407864.png │ │ ├── image-20250106161357212.png │ │ ├── image-20250106161814852.png │ │ ├── image-20250106164155120.png │ │ ├── image-20250106164530672.png │ │ ├── image-20250106164624970.png │ │ ├── image-20250106164843096.png │ │ ├── image-20250106164909915.png │ │ ├── image-20250106165003683.png │ │ ├── image-20250106165306262.png │ │ ├── image-20250106165321358.png │ │ ├── image-20250106165438505.png │ │ ├── image-20250106194714626.png │ │ └── image-20250108170352415.png │ ├── kernel │ │ ├── fft.cpp │ │ ├── fft.h │ │ ├── fft_test.cpp │ │ └── out.gold.dat │ ├── make_hw.sh │ ├── make_hwe.sh │ ├── make_kernel.sh │ ├── make_run.sh │ ├── make_sw.sh │ ├── remake_host.sh │ └── utils.mk └── vadd_vmul │ ├── Makefile │ ├── common │ └── include │ │ ├── event_timer.cpp │ │ ├── event_timer.hpp │ │ ├── logger.hpp │ │ ├── utils.hpp │ │ ├── xcl2.cpp │ │ └── xcl2.hpp │ ├── conn_u280.cfg │ ├── global.h │ ├── host │ ├── host.cpp │ └── vadd_vmul_wrapper.h │ ├── kernel │ ├── vadd.cpp │ ├── vadd.h │ ├── vmul.cpp │ └── vmul.h │ ├── make_hw.sh │ ├── make_hwe.sh │ ├── make_kernel.sh │ ├── make_run.sh │ ├── make_sw.sh │ ├── remake_host.sh │ └── utils.mk ├── git_push.sh ├── host └── README.md ├── img ├── add_kernel_func.png ├── add_kernel_func_u50.png ├── add_u50.png ├── add_zcu104.jpg ├── app_prj.jpg ├── build.jpg ├── common_image.jpg ├── creat_app_prj.jpg ├── empty_app.jpg ├── host_code.jpg ├── host_run_zcu104.png ├── 
image2sd.png ├── image_cfg.jpg ├── kernel_code.jpg ├── kernel_code_u50.png ├── kernel_setting.png ├── kernel_vitis_hls.png ├── link.png ├── link_setting.png ├── link_u50.png ├── link_vivado.jpg ├── mk_01-dig.png ├── mk_01-link.png ├── mk_01-reg.png ├── mk_01-res.png ├── mk_02-dig.png ├── mk_02-link.png ├── mk_02-reg.png ├── mk_02-res.png ├── mk_03-dig1.png ├── mk_03-dig2.png ├── mk_03-link.png ├── mk_03-res.png ├── putty.png ├── run_u50.png ├── sysroot.jpg ├── u50_platform.png ├── vitis_shell.png ├── vitis_workflow.image ├── workspace.jpg └── zcu104.jpg ├── multi-kernels ├── README.md └── src │ ├── host │ ├── event_timer.cpp │ ├── event_timer.hpp │ ├── host-01.cpp │ ├── host-01.hpp │ ├── host-02.cpp │ ├── host-02.hpp │ ├── host-03.cpp │ ├── host-03.hpp │ ├── op_host-02.hpp │ ├── vadd_host-01.hpp │ ├── vadd_host-03.hpp │ ├── vmul_host-03.hpp │ ├── xcl2.cpp │ └── xcl2.hpp │ └── kernel │ ├── vadd.cpp │ ├── vadd.hpp │ ├── vmul.cpp │ └── vmul.hpp ├── overall ├── README.md ├── src │ ├── host │ │ ├── event_timer.cpp │ │ ├── event_timer.hpp │ │ ├── host.cpp │ │ ├── host.hpp │ │ ├── vadd_host.hpp │ │ ├── xcl2.cpp │ │ └── xcl2.hpp │ └── kernel │ │ ├── vadd.cpp │ │ └── vadd.hpp ├── start_U50.md └── start_ZCU104.md ├── vitis_pid1639.str └── xrc.log /Makefile/README.md: -------------------------------------------------------------------------------- 1 | # Makefile for Vitis project 2 | 3 | TODO 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Vitis_workflow 2 | This tutorial introduces how to use the Vitis tool flow; the figure below gives an overview of the flow. 3 | 4 | FPGA: ZCU104, Alveo U50 5 | 6 | OS: Ubuntu 18.04 7 | 8 | ![Figure 1: Vitis workflow](./img/vitis_workflow.image) 9 | 10 | ## 1 Tutorial contents 11 | 12 | 1. [overall](./overall/README.md): a brief introduction to the overall accelerator deployment flow, with a simple example for reference. [Link](./overall/README.md) 13 | 14 | 2. 
[multi-kernels](./multi-kernels/README.md): multi-kernel accelerator deployment examples, covering a single kernel with multiple instances, multiple kernels with a single instance each, and multiple kernels with multiple instances. [Link](./multi-kernels/README.md) 15 | 16 | 17 | 3. [host](./host/README.md): writing the host-side control code for a Vitis-accelerated application. [Link](./host/README.md) 18 | 19 | 4. [Makefile](./Makefile/README.md): build rules for the Vitis tool flow and a one-command deployment scheme. [Link](./Makefile/README.md) 20 | 21 | 5. [examples](./examples/): accelerator deployment examples with build and run scripts, covering the build steps that deploy a finished accelerator design to a board or an emulator. 22 | 23 | 24 | ## 2 Environment setup 25 | 26 | ### 2.1 Install the required dependencies 27 | ``` 28 | sudo apt-get install ocl-icd-libopencl1 opencl-headers ocl-icd-opencl-dev 29 | sudo add-apt-repository ppa:xorg-edgers/ppa 30 | sudo apt-get update 31 | sudo apt-get install libgl1-mesa-glx 32 | sudo apt-get install libgl1-mesa-dri 33 | sudo apt-get install libgl1-mesa-dev 34 | sudo add-apt-repository --remove ppa:xorg-edgers/ppa 35 | sudo apt install net-tools 36 | sudo apt-get install -y unzip 37 | sudo apt install gcc 38 | sudo apt install g++ 39 | sudo apt install python 40 | sudo ln -s /usr/bin/python2 /usr/bin/python 41 | sudo apt install putty 42 | curl -1sLf \ 43 | 'https://dl.cloudsmith.io/public/balena/etcher/setup.deb.sh' \ 44 | | sudo -E bash 45 | sudo apt-get update 46 | sudo apt-get install balena-etcher-electron 47 | ``` 48 | ### 2.2 Download and install the Vitis tools 49 | 50 | 1. [Install Vitis](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vitis.html). The version downloaded here should match the version of everything downloaded later. 51 | For example, if you download Vitis 2020.2 here, the software and images downloaded later should also be 2020.2 or earlier to ensure compatibility. 52 | 53 | 2. [Install XRT](https://www.xilinx.com/products/design-tools/vitis/xrt.html#gettingstarted). XRT is the runtime library for Xilinx FPGAs. 54 | 55 | 3. Configure the environment 56 | ``` 57 | source /tools/Xilinx/Vitis/2020.2/settings64.sh 58 | source /tools/Xilinx/Vitis_HLS/2020.2/settings64.sh 59 | source /opt/xilinx/xrt/setup.sh 60 | ``` 61 | 62 | 4. 
Download the platform files 63 | 64 | + ZCU104 65 | 66 | - [Download the ZCU104 platform files](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/embedded-platforms.html), extract them, and copy them to `/opt/xilinx/platforms/`. 67 | 68 | PS: if you are using Vitis 2020.2 as in this tutorial, choose the 2020.1 version of the ZCU104 platform files. 69 | 70 | PPS: the 2020.2 platform files do not include the OpenCL domain, so XRT cannot be used to run applications or program the PL. 71 | 72 | - [Download and extract the ZynqMP common image](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/embedded-platforms/archive-vitis-embedded.html); choose the ZYNQMP common image. 73 | 74 | ![Figure 2: ZynqMP common image](./img/common_image.jpg) 75 | 76 | - Unpack the ZynqMP common image. This step produces the image, sysroot file tree, and other content needed by the later project steps. 77 | ``` 78 | cd xilinx-zynqmp-common-v2020.2/ 79 | sudo gunzip ./rootfs.ext4.gz 80 | ./sdk.sh -y -dir ./ -p 81 | ``` 82 | 83 | + Alveo U50 84 | 85 | - [Download the U50 platform files](https://www.xilinx.com/products/boards-and-kits/alveo/u50.html#gettingStarted) and install them. 86 | 87 | - [Download the U50 physical-layer communication driver](https://www.xilinx.com/products/boards-and-kits/alveo/u50.html#gettingStarted) and install it. 88 | 89 | ![Figure 3: U50 development files](./img/u50_platform.png) 90 | -------------------------------------------------------------------------------- /examples/fft/.clangd: -------------------------------------------------------------------------------- 1 | CompileFlags: 2 | Add: [ 3 | "-I/home/hanggu/USTC/Vitis_workflow/examples/fft", 4 | "-I/tools/Xilinx/Vitis_HLS/2023.1/include", 5 | "-I/opt/xilinx/xrt/include", 6 | "-I/home/hanggu/USTC/Vitis_workflow/examples/fft/common/include", 7 | ] -------------------------------------------------------------------------------- /examples/fft/.gitignore: -------------------------------------------------------------------------------- 1 | build_dir*/ 2 | _x_temp*/ 3 | *.jou 4 | *.log 5 | build* 6 | _x* 7 | hls_prj 8 | reports 9 | .* 10 | emconfig.json -------------------------------------------------------------------------------- /examples/fft/README.md: -------------------------------------------------------------------------------- 1 | See [Vitis 
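Tutorial.md](Vitis%20Tutorial.md) and [Vitis_HLS Tutorial.md](Vitis_HLS%20Tutorial.md). 2 | -------------------------------------------------------------------------------- /examples/fft/Vitis Tutorial.md: -------------------------------------------------------------------------------- 1 | # Vitis Tutorial 2 | 3 | Like Vitis_HLS, Vitis can be run either through the GUI or from the command line. The command line is quicker and more convenient, so this tutorial covers the command-line flow. 4 | 5 | The scripts and configuration files are as follows: 6 | 7 | conn_u280.cfg: the global configuration file. The --connectivity.XXX options define the topology of the FPGA binary: they specify the number of CUs, assign them to SLRs, connect kernel ports to global memory, and set up streaming port connections. These options are an integral part of the build process and are essential to defining and constructing the application. 8 | 9 | make_kernel.sh: builds the kernel for the sw_emu, hw_emu, and hw targets 10 | 11 | make_sw.sh: runs the sw_emu flow (builds the kernel for the sw_emu target, then runs the host code in sw_emu mode) 12 | 13 | make_hwe.sh: runs the hw_emu flow (builds the kernel for the hw_emu target, then runs the host code in hw_emu mode) 14 | 15 | make_hw.sh: runs the hw flow (builds the kernel for the hw target, then runs the host code on hardware) 16 | 17 | make_run.sh: runs the host code for sw_emu, hw_emu, and hw 18 | 19 | ![image-20250106194714626](img/image-20250106194714626.png) 20 | 21 | 22 | 23 | The Makefile provides the following clean rules: 24 | 25 | cleanh: removes all host-related files. 26 | cleank: removes all kernel-related files. 27 | cleanall: removes all host-related and kernel-related files. 28 | 29 | Every parameter marked with a TODO comment in the Makefile must be configured, for example: 30 | 31 | ![image-20250108170352415](img/image-20250108170352415.png) 32 | 33 | 34 | 35 | ## Appendix 36 | 37 | Detailed documentation on the connectivity syntax: UG1393, chapter 32: the v++ command, --connectivity options 38 | 39 | 40 | -------------------------------------------------------------------------------- /examples/fft/Vitis_HLS Tutorial.md: -------------------------------------------------------------------------------- 1 | # Vitis_HLS Tutorial 2 | 3 | Vitis_HLS can be driven either by Tcl scripts or through the GUI. The following uses the fast Fourier transform (FFT) as an example to briefly introduce both flows. 4 | 5 | ## 1. 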
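Using a Tcl script 6 | 7 | First, extract the code and place it in a folder, as shown below: 8 | 9 | ![image-20250106154407864](img/image-20250106154407864.png) 10 | 11 | Here, the hls_prj folder stores the vitis_hls project. The script.tcl script creates the vitis_hls project in this directory and runs the four HLS steps: C simulation, synthesis, co-simulation, and design export. Detailed comments are given in the Tcl script. The src folder holds the HLS design code and the testbench. The testbench is used in both csim and cosim: in csim it checks that the code is functionally correct; in cosim it checks that the RTL generated by vitis_hls behaves the same as the software code. out.gold.dat is the ground-truth file used by the testbench. 12 | 13 | Then, in a terminal, enter the hls_prj folder and run the Tcl script (assuming the script runs only csim here): 14 | 15 | ```tcl 16 | vitis_hls -f script.tcl 17 | ``` 18 | 19 | ![image-20250106161357212](img/image-20250106161357212.png) 20 | 21 | csim passes, which shows that the code is functionally correct. The hls_prj folder now also contains the HLS project fft generated by the Tcl script: 22 | 23 | ![2bde4ea848e911dc307525dd359e70e4](img/2bde4ea848e911dc307525dd359e70e4.png) 24 | 25 | To run synthesis, co-simulation, IP export, and so on, uncomment the corresponding lines in the Tcl script. For example, to run csim, synthesis, and cosim in one go, uncomment the first three lines: 26 | 27 | ![image-20250106161814852](img/image-20250106161814852.png) 28 | 29 | Then run the Tcl script from the command line again and wait for the results. 30 | 31 | Next, open the newly created fft project with the vitis_hls command, find the C SYNTHESIS report in the Flow Navigator at the lower left, and click it to open the synthesis report. It shows all the information about the top-level function, including latency, interval, and resource usage. 32 | 33 | ![abaeeb9056d37a72a2598b4490bd3c41](img/abaeeb9056d37a72a2598b4490bd3c41.png) 34 | 35 | To view the detailed synthesis report for an individual module, right-click the module and choose "Open Synthesis Details Report": 36 | 37 | ![image-20250106164155120](img/image-20250106164155120.png) 38 | 39 | The detailed synthesis report shows the module's performance estimates and resource usage. Analyzing this information helps optimize the HLS code. 40 | 41 | ## 2. Using the GUI 42 | 43 | First, launch the GUI with the vitis_hls command: 44 | 45 | ![image-20250106164530672](img/image-20250106164530672.png) 46 | 47 | Click Create Project to create a new HLS project (the project created earlier by the Tcl script is named fft, so this one is named fft2): 48 | 49 | ![image-20250106164624970](img/image-20250106164624970.png) 50 | 51 | Next, add the design files to the project and set the top function: 52 | 53 | ![image-20250106164843096](img/image-20250106164843096.png) 54 | 55 | Set the testbench files: 56 | 57 | ![image-20250106164909915](img/image-20250106164909915.png) 58 | 59 | As the last step, change the target board to U280 and click Finish: 60 | 61 | ![image-20250106165003683](img/image-20250106165003683.png) 62 | 63 | The HLS project is now created. Next, in the Flow Navigator, click in order: C SIMULATION, C 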
SYNTHESIS, COSIMULATION, and IMPLEMENTATION. 64 | 65 | ![image-20250106165321358](img/image-20250106165321358.png) 66 | 67 | Run csim; it passes: 68 | 69 | ![image-20250106165306262](img/image-20250106165306262.png) 70 | 71 | Run C synthesis: 72 | 73 | ![image-20250106165438505](img/image-20250106165438505.png) 74 | 75 | Once this step completes, COSIMULATION and IMPLEMENTATION turn green in the Flow Navigator, meaning the last two steps can now be run; click them in turn. They are not covered further here. 76 | 77 | In the C synthesis results, you can open the Schedule Viewer and the Dataflow Viewer. 78 | 79 | Schedule Viewer: shows how each operation in the hardware execution flow is scheduled; it visualizes the execution order and parallelism of each operation or task along a timeline. 80 | 81 | Dataflow Viewer: shows how data is passed between the modules of a dataflow design; it presents the structure of the data movement and the performance bottlenecks as a module-level graph. 82 | 83 | 84 | 85 | ## Appendix 86 | 87 | For details on HLS, see UG1399, the Vitis High-Level Synthesis User Guide: https://docs.amd.com/r/en-US/ug1399-vitis-hls/Introduction 88 | 89 | For details on using the vitis_hls software, see UG1393, the Vitis Unified Software Platform Documentation: https://docs.amd.com/v/u/zh-CN/ug1393-vitis-application-acceleration 90 | -------------------------------------------------------------------------------- /examples/fft/common/include/event_timer.cpp: -------------------------------------------------------------------------------- 1 | /********** 2 | Copyright (c) 2019, Xilinx, Inc. 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without modification, 6 | are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, 9 | this list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its contributors 16 | may be used to endorse or promote products derived from this software 17 | without specific prior written permission. 
18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 20 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 21 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 22 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 23 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 25 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 27 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | **********/ 29 | 30 | 31 | #include "event_timer.hpp" 32 | 33 | #include <iomanip> 34 | #include <iostream> 35 | 36 | EventTimer::EventTimer() 37 | { 38 | unfinished = false; 39 | event_count = 0; 40 | max_string_length = 0; 41 | } 42 | 43 | float EventTimer::ms_difference(EventTimer::timepoint start, 44 | EventTimer::timepoint end) 45 | { 46 | std::chrono::duration<float, std::milli> duration = end - start; 47 | return duration.count(); 48 | } 49 | 50 | int EventTimer::add(std::string description) 51 | { 52 | // If previously pending event was unfinished, adding a new event 53 | // will terminate it if this function is called 54 | if (unfinished) 55 | finish(); 56 | 57 | unfinished = true; 58 | 59 | event_names.push_back(description); 60 | int length = description.length(); 61 | if (length > max_string_length) 62 | max_string_length = length; 63 | start_times.push_back(std::chrono::high_resolution_clock::now()); 64 | return event_count++; 65 | } 66 | 67 | void EventTimer::finish(void) 68 | { 69 | end_times.push_back(std::chrono::high_resolution_clock::now()); 70 | if (!unfinished) { 71 | end_times.pop_back(); 72 | return; 73 | } 74 | unfinished = false; 75 | } 76 | 77 | void EventTimer::clear(void) 78 | { 79 
| start_times.clear(); 80 | end_times.clear(); 81 | event_names.clear(); 82 | event_count = 0; 83 | unfinished = false; 84 | } 85 | 86 | void EventTimer::print(int id) 87 | { 88 | std::ios_base::fmtflags flags(std::cout.flags()); 89 | if (id >= 0) { 90 | if ((unsigned)id >= end_times.size()) // out of range, or the event has not finished yet 91 | return; 92 | std::cout << event_names[id] << " : " << std::fixed << std::setprecision(3) 93 | << ms_difference(start_times[id], end_times[id]) << std::endl; 94 | } 95 | else { 96 | int printable_events = unfinished ? event_count - 1 : event_count; 97 | for (int i = 0; i < printable_events; i++) { 98 | std::cout << std::left << std::setw(max_string_length) << event_names[i] << " : "; 99 | std::cout << std::right << std::setw(8) << std::fixed << std::setprecision(3) 100 | << ms_difference(start_times[i], end_times[i]) << " ms" 101 | << std::endl; 102 | } 103 | } 104 | std::cout.flags(flags); 105 | } 106 | -------------------------------------------------------------------------------- /examples/fft/common/include/event_timer.hpp: -------------------------------------------------------------------------------- 1 | /********** 2 | Copyright (c) 2019, Xilinx, Inc. 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without modification, 6 | are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, 9 | this list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its contributors 16 | may be used to endorse or promote products derived from this software 17 | without specific prior written permission. 
18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 20 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 21 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 22 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 23 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 25 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 27 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | **********/ 29 | 30 | 31 | #ifndef EVENT_TIMER_HPP__ 32 | #define EVENT_TIMER_HPP__ 33 | 34 | #include <chrono> 35 | #include <string> 36 | #include <vector> 37 | 38 | class EventTimer 39 | { 40 | typedef std::chrono::high_resolution_clock::time_point timepoint; 41 | 42 | private: 43 | std::vector<timepoint> start_times; 44 | std::vector<timepoint> end_times; 45 | std::vector<std::string> event_names; 46 | 47 | bool unfinished; 48 | unsigned int event_count; 49 | int max_string_length; 50 | 51 | float ms_difference(EventTimer::timepoint start, EventTimer::timepoint end); 52 | 53 | public: 54 | EventTimer(void); 55 | int add(std::string description); 56 | void finish(void); 57 | void clear(void); 58 | 59 | void print(int id = -1); 60 | }; 61 | 62 | #endif // EVENT_TIMER_HPP__ 63 | -------------------------------------------------------------------------------- /examples/fft/common/include/utils.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright 2019 Xilinx, Inc. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef XF_GRAPH_UTILS_HPP 18 | #define XF_GRAPH_UTILS_HPP 19 | 20 | #include <sys/time.h> 21 | #include <algorithm> 22 | #include <string> 23 | #include <vector> 24 | #include "ap_int.h" 25 | 26 | class ArgParser { 27 | public: 28 | ArgParser(int& argc, const char** argv) { 29 | for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i])); 30 | } 31 | bool getCmdOption(const std::string option, std::string& value) const { 32 | std::vector<std::string>::const_iterator itr; 33 | itr = std::find(this->mTokens.begin(), this->mTokens.end(), option); 34 | if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) { 35 | value = *itr; 36 | return true; 37 | } 38 | return false; 39 | } 40 | 41 | private: 42 | std::vector<std::string> mTokens; 43 | }; 44 | 45 | // Compute time difference 46 | inline unsigned long tvdiff(struct timeval* tv0, struct timeval* tv1) { 47 | return ((unsigned long)tv1->tv_sec - (unsigned long)tv0->tv_sec) * 1000000UL + 48 | ((unsigned long)tv1->tv_usec - (unsigned long)tv0->tv_usec); 49 | } 50 | 51 | template <typename T> 52 | T* aligned_alloc(std::size_t num) { 53 | void* ptr = NULL; 54 | if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc(); 55 | return reinterpret_cast<T*>(ptr); 56 | } 57 | 58 | #endif // XF_GRAPH_UTILS_HPP 59 | -------------------------------------------------------------------------------- /examples/fft/common/include/xcl2.cpp: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (C) 2019-2021 Xilinx, Inc 3 | * 4 | * Licensed under the Apache License, Version 2.0 
(the "License"). You may 5 | * not use this file except in compliance with the License. A copy of the 6 | * License is located at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 12 | * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 13 | * License for the specific language governing permissions and limitations 14 | * under the License. 15 | */ 16 | 17 | #include "xcl2.hpp" 18 | #include <climits> 19 | #include <iomanip> 20 | #include <sstream> 21 | #include <string> 22 | #include <sys/stat.h> 23 | #if defined(_WINDOWS) 24 | #include <io.h> 25 | #else 26 | #include <unistd.h> 27 | #endif 28 | 29 | namespace xcl { 30 | std::vector<cl::Device> get_devices(const std::string& vendor_name) { 31 | size_t i; 32 | cl_int err; 33 | std::vector<cl::Platform> platforms; 34 | OCL_CHECK(err, err = cl::Platform::get(&platforms)); 35 | cl::Platform platform; 36 | for (i = 0; i < platforms.size(); i++) { 37 | platform = platforms[i]; 38 | OCL_CHECK(err, std::string platformName = platform.getInfo<CL_PLATFORM_NAME>(&err)); 39 | if (!(platformName.compare(vendor_name))) { 40 | std::cout << "Found Platform" << std::endl; 41 | std::cout << "Platform Name: " << platformName.c_str() << std::endl; 42 | break; 43 | } 44 | } 45 | if (i == platforms.size()) { 46 | std::cout << "Error: Failed to find Xilinx platform" << std::endl; 47 | std::cout << "Found the following platforms : " << std::endl; 48 | for (size_t j = 0; j < platforms.size(); j++) { 49 | platform = platforms[j]; 50 | OCL_CHECK(err, std::string platformName = platform.getInfo<CL_PLATFORM_NAME>(&err)); 51 | std::cout << "Platform Name: " << platformName.c_str() << std::endl; 52 | } 53 | exit(EXIT_FAILURE); 54 | } 55 | // Getting ACCELERATOR Devices and selecting 1st such device 56 | std::vector<cl::Device> devices; 57 | OCL_CHECK(err, err = platform.getDevices(CL_DEVICE_TYPE_ACCELERATOR, &devices)); 58 | return devices; 59 | } 60 | 61 | std::vector<cl::Device> get_xil_devices() { 62 | return 
get_devices("Xilinx"); 63 | } 64 | 65 | cl::Device find_device_bdf(const std::vector<cl::Device>& devices, const std::string& bdf) { 66 | char device_bdf[20]; 67 | cl_int err; 68 | cl::Device device; 69 | int cnt = 0; 70 | for (uint32_t i = 0; i < devices.size(); i++) { 71 | OCL_CHECK(err, err = devices[i].getInfo(CL_DEVICE_PCIE_BDF, &device_bdf)); 72 | if (bdf == device_bdf) { 73 | device = devices[i]; 74 | cnt++; 75 | break; 76 | } 77 | } 78 | if (cnt == 0) { 79 | std::cout << "Invalid device bdf. Please check and provide valid bdf\n"; 80 | exit(EXIT_FAILURE); 81 | } 82 | return device; 83 | } 84 | cl_device_id find_device_bdf_c(cl_device_id* devices, const std::string& bdf, cl_uint device_count) { 85 | char device_bdf[20]; 86 | cl_int err; 87 | cl_device_id device; 88 | int cnt = 0; 89 | for (uint32_t i = 0; i < device_count; i++) { 90 | err = clGetDeviceInfo(devices[i], CL_DEVICE_PCIE_BDF, sizeof(device_bdf), device_bdf, 0); 91 | if (err != CL_SUCCESS) { 92 | std::cout << "Unable to extract the device BDF details\n"; 93 | exit(EXIT_FAILURE); 94 | } 95 | if (bdf == device_bdf) { 96 | device = devices[i]; 97 | cnt++; 98 | break; 99 | } 100 | } 101 | if (cnt == 0) { 102 | std::cout << "Invalid device bdf. 
Please check and provide valid bdf\n"; 103 | exit(EXIT_FAILURE); 104 | } 105 | return device; 106 | } 107 | std::vector<unsigned char> read_binary_file(const std::string& xclbin_file_name) { 108 | std::cout << "INFO: Reading " << xclbin_file_name << std::endl; 109 | FILE* fp; 110 | if ((fp = fopen(xclbin_file_name.c_str(), "r")) == nullptr) { 111 | printf("ERROR: %s xclbin not available please build\n", xclbin_file_name.c_str()); 112 | exit(EXIT_FAILURE); 113 | } 114 | // Loading XCL Bin into char buffer 115 | std::cout << "Loading: '" << xclbin_file_name.c_str() << "'\n"; 116 | std::ifstream bin_file(xclbin_file_name.c_str(), std::ifstream::binary); 117 | bin_file.seekg(0, bin_file.end); 118 | auto nb = bin_file.tellg(); 119 | bin_file.seekg(0, bin_file.beg); 120 | std::vector<unsigned char> buf; 121 | buf.resize(nb); 122 | bin_file.read(reinterpret_cast<char*>(buf.data()), nb); 123 | return buf; 124 | } 125 | 126 | bool is_emulation() { 127 | bool ret = false; 128 | char* xcl_mode = getenv("XCL_EMULATION_MODE"); 129 | if (xcl_mode != nullptr) { 130 | ret = true; 131 | } 132 | return ret; 133 | } 134 | 135 | bool is_hw_emulation() { 136 | bool ret = false; 137 | char* xcl_mode = getenv("XCL_EMULATION_MODE"); 138 | if ((xcl_mode != nullptr) && !strcmp(xcl_mode, "hw_emu")) { 139 | ret = true; 140 | } 141 | return ret; 142 | } 143 | double round_off(double n) { 144 | double d = n * 100.0; 145 | int i = d + 0.5; 146 | d = i / 100.0; 147 | return d; 148 | } 149 | 150 | std::string convert_size(size_t size) { 151 | static const char* SIZES[] = {"B", "KB", "MB", "GB"}; 152 | uint32_t div = 0; 153 | size_t rem = 0; 154 | 155 | while (size >= 1024 && div < (sizeof SIZES / sizeof *SIZES) - 1) { 156 | rem = (size % 1024); 157 | div++; 158 | size /= 1024; 159 | } 160 | 161 | double size_d = (float)size + (float)rem / 1024.0; 162 | double size_val = round_off(size_d); 163 | 164 | std::stringstream stream; 165 | stream << std::fixed << std::setprecision(2) << size_val; 166 | std::string size_str = stream.str(); 167 | 
std::string result = size_str + " " + SIZES[div]; 168 | return result; 169 | } 170 | 171 | bool is_xpr_device(const char* device_name) { 172 | const char* output = strstr(device_name, "xpr"); 173 | 174 | if (output == nullptr) { 175 | return false; 176 | } else { 177 | return true; 178 | } 179 | } 180 | }; // namespace xcl 181 | -------------------------------------------------------------------------------- /examples/fft/common/include/xcl2.hpp: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (C) 2019-2021 Xilinx, Inc 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"). You may 5 | * not use this file except in compliance with the License. A copy of the 6 | * License is located at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 12 | * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 13 | * License for the specific language governing permissions and limitations 14 | * under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #define CL_HPP_CL_1_2_DEFAULT_BUILD 20 | #define CL_HPP_TARGET_OPENCL_VERSION 120 21 | #define CL_HPP_MINIMUM_OPENCL_VERSION 120 22 | #define CL_HPP_ENABLE_PROGRAM_CONSTRUCTION_FROM_ARRAY_COMPATIBILITY 1 23 | #define CL_USE_DEPRECATED_OPENCL_1_2_APIS 24 | 25 | // OCL_CHECK doesn't work if call has templatized function call 26 | #define OCL_CHECK(error, call) \ 27 | call; \ 28 | if (error != CL_SUCCESS) { \ 29 | printf("%s:%d Error calling " #call ", error code is: %d\n", __FILE__, __LINE__, error); \ 30 | exit(EXIT_FAILURE); \ 31 | } 32 | 33 | #include <CL/cl2.hpp> 34 | #include <CL/cl_ext_xilinx.h> 35 | #include <fstream> 36 | #include <iostream> 37 | // When creating a buffer with user pointer (CL_MEM_USE_HOST_PTR), under the 38 | // hood 39 | // User ptr is used if and only if it is properly aligned (page aligned). 
When 40 | // not 41 | // aligned, runtime has no choice but to create its own host side buffer that 42 | // backs 43 | // user ptr. This in turn implies that all operations that move data to and from 44 | // device incur an extra memcpy to move data to/from runtime's own host buffer 45 | // from/to user pointer. So it is recommended to use this allocator if user wish 46 | // to 47 | // Create Buffer/Memory Object with CL_MEM_USE_HOST_PTR to align user buffer to 48 | // the 49 | // page boundary. It will ensure that user buffer will be used when user create 50 | // Buffer/Mem Object with CL_MEM_USE_HOST_PTR. 51 | template <typename T> 52 | struct aligned_allocator { 53 | using value_type = T; 54 | 55 | aligned_allocator() {} 56 | 57 | aligned_allocator(const aligned_allocator&) {} 58 | 59 | template <typename U> 60 | aligned_allocator(const aligned_allocator<U>&) {} 61 | 62 | T* allocate(std::size_t num) { 63 | void* ptr = nullptr; 64 | 65 | #if defined(_WINDOWS) 66 | { 67 | ptr = _aligned_malloc(num * sizeof(T), 4096); 68 | if (ptr == nullptr) { 69 | std::cout << "Failed to allocate memory" << std::endl; 70 | exit(EXIT_FAILURE); 71 | } 72 | } 73 | #else 74 | { 75 | if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc(); 76 | } 77 | #endif 78 | return reinterpret_cast<T*>(ptr); 79 | } 80 | void deallocate(T* p, std::size_t num) { 81 | #if defined(_WINDOWS) 82 | _aligned_free(p); 83 | #else 84 | free(p); 85 | #endif 86 | } 87 | }; 88 | 89 | namespace xcl { 90 | std::vector<cl::Device> get_xil_devices(); 91 | std::vector<cl::Device> get_devices(const std::string& vendor_name); 92 | cl::Device find_device_bdf(const std::vector<cl::Device>& devices, const std::string& bdf); 93 | cl_device_id find_device_bdf_c(cl_device_id* devices, const std::string& bdf, cl_uint dev_count); 94 | std::string convert_size(size_t size); 95 | std::vector<unsigned char> read_binary_file(const std::string& xclbin_file_name); 96 | bool is_emulation(); 97 | bool is_hw_emulation(); 98 | bool is_xpr_device(const char* device_name); 99 | class P2P { 100 | 
public: 101 | static decltype(&xclGetMemObjectFd) getMemObjectFd; 102 | static decltype(&xclGetMemObjectFromFd) getMemObjectFromFd; 103 | static void init(const cl_platform_id& platform) { 104 | void* bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetMemObjectFd"); 105 | getMemObjectFd = (decltype(&xclGetMemObjectFd))bar; 106 | bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetMemObjectFromFd"); 107 | getMemObjectFromFd = (decltype(&xclGetMemObjectFromFd))bar; 108 | } 109 | }; 110 | class Ext { 111 | public: 112 | static decltype(&xclGetComputeUnitInfo) getComputeUnitInfo; 113 | static void init(const cl_platform_id& platform) { 114 | void* bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetComputeUnitInfo"); 115 | getComputeUnitInfo = (decltype(&xclGetComputeUnitInfo))bar; 116 | } 117 | }; 118 | } 119 | -------------------------------------------------------------------------------- /examples/fft/conn_u280.cfg: -------------------------------------------------------------------------------- 1 | [connectivity] 2 | sp=fft.X_R:HBM[0] 3 | sp=fft.X_I:HBM[1] 4 | sp=fft.OUT_R:HBM[2] 5 | sp=fft.OUT_I:HBM[3] 6 | 7 | # slr=fft:SLR0 8 | nk=fft:1:fft 9 | -------------------------------------------------------------------------------- /examples/fft/emconfig.json: -------------------------------------------------------------------------------- 1 | { 2 | "Comment": "This file is auto-generated by the tool. 
Do not modify", 3 | "Version": { 4 | "FileVersion": "2.0", 5 | "ToolVersion": "2023.1" 6 | }, 7 | "Platform": { 8 | "Boards": [ 9 | { 10 | "Devices": [ 11 | { 12 | "Name": "xilinx_u280_gen3x16_xdma_1_202211_1", 13 | "DdrBanks": [ 14 | { 15 | "Name": "pfm_dynamic_inst_memory_subsystem_memory_ddr4_mem00", 16 | "Type": "ddr4", 17 | "Size": "8GB", 18 | "AXI_ARBITRATION_SCHEME": "RD_PRI_REG", 19 | "BURST_LENGTH": "8", 20 | "C0": { 21 | "APP_ADDR_WIDTH": "31", 22 | "APP_DATA_WIDTH": "512", 23 | "ControllerType": "DDR4_SDRAM", 24 | "DDR4_ADDR_WIDTH": "17", 25 | "DDR4_AXI_ADDR_WIDTH": "34", 26 | "DDR4_AXI_DATA_WIDTH": "512", 27 | "DDR4_AXI_ID_WIDTH": "1", 28 | "DDR4_AutoPrecharge": "false", 29 | "DDR4_AxiNarrowBurst": "false", 30 | "DDR4_BANK_GROUP_WIDTH": "2", 31 | "DDR4_BANK_WIDTH": "2", 32 | "DDR4_CL": "0", 33 | "DDR4_COLUMN_WIDTH": "10", 34 | "DDR4_CWL": "0", 35 | "DDR4_Mem_Add_Map": "ROW_COLUMN_BANK_INTLV", 36 | "DDR4_Ordering": "Normal", 37 | "DDR4_RANK_WIDTH": "1", 38 | "DDR4_ROW_WIDTH": "17", 39 | "DDR4_tCK": "833", 40 | "DDR4_tCKE": "0", 41 | "DDR4_tFAW": "16", 42 | "DDR4_tMRD": "2", 43 | "DDR4_tRAS": "39", 44 | "DDR4_tRCD": "17", 45 | "DDR4_tREFI": "9363", 46 | "DDR4_tRFC": "421", 47 | "DDR4_tRP": "17", 48 | "DDR4_tRRD_L": "6", 49 | "DDR4_tRRD_S": "4", 50 | "DDR4_tRTP": "10", 51 | "DDR4_tWR": "19", 52 | "DDR4_tWTR_L": "10", 53 | "DDR4_tWTR_S": "4", 54 | "DDR4_tXPR": "109", 55 | "DDR4_tZQCS": "128", 56 | "DDR4_tZQI": "0", 57 | "DDR4_tZQINIT": "256" 58 | }, 59 | "CAS_LATENCY": "17", 60 | "CAS_WRITE_LATENCY": "12", 61 | "DATA_WIDTH": "72", 62 | "MEMORY_PART": "MTA18ASF2G72PZ-2G3", 63 | "MEM_ADDR_MAP": "ROW_COLUMN_BANK_INTLV", 64 | "TIMEPERIOD_PS": "833" 65 | }, 66 | { 67 | "Name": "pfm_dynamic_inst_memory_subsystem_memory_ddr4_mem01", 68 | "Type": "ddr4", 69 | "Size": "8GB", 70 | "AXI_ARBITRATION_SCHEME": "RD_PRI_REG", 71 | "BURST_LENGTH": "8", 72 | "C0": { 73 | "APP_ADDR_WIDTH": "31", 74 | "APP_DATA_WIDTH": "512", 75 | "ControllerType": "DDR4_SDRAM", 76 | 
"DDR4_ADDR_WIDTH": "17", 77 | "DDR4_AXI_ADDR_WIDTH": "34", 78 | "DDR4_AXI_DATA_WIDTH": "512", 79 | "DDR4_AXI_ID_WIDTH": "1", 80 | "DDR4_AutoPrecharge": "false", 81 | "DDR4_AxiNarrowBurst": "false", 82 | "DDR4_BANK_GROUP_WIDTH": "2", 83 | "DDR4_BANK_WIDTH": "2", 84 | "DDR4_CL": "0", 85 | "DDR4_COLUMN_WIDTH": "10", 86 | "DDR4_CWL": "0", 87 | "DDR4_Mem_Add_Map": "ROW_COLUMN_BANK_INTLV", 88 | "DDR4_Ordering": "Normal", 89 | "DDR4_RANK_WIDTH": "1", 90 | "DDR4_ROW_WIDTH": "17", 91 | "DDR4_tCK": "833", 92 | "DDR4_tCKE": "0", 93 | "DDR4_tFAW": "16", 94 | "DDR4_tMRD": "2", 95 | "DDR4_tRAS": "39", 96 | "DDR4_tRCD": "17", 97 | "DDR4_tREFI": "9363", 98 | "DDR4_tRFC": "421", 99 | "DDR4_tRP": "17", 100 | "DDR4_tRRD_L": "6", 101 | "DDR4_tRRD_S": "4", 102 | "DDR4_tRTP": "10", 103 | "DDR4_tWR": "19", 104 | "DDR4_tWTR_L": "10", 105 | "DDR4_tWTR_S": "4", 106 | "DDR4_tXPR": "109", 107 | "DDR4_tZQCS": "128", 108 | "DDR4_tZQI": "0", 109 | "DDR4_tZQINIT": "256" 110 | }, 111 | "CAS_LATENCY": "17", 112 | "CAS_WRITE_LATENCY": "12", 113 | "DATA_WIDTH": "72", 114 | "MEMORY_PART": "MTA18ASF2G72PZ-2G3", 115 | "MEM_ADDR_MAP": "ROW_COLUMN_BANK_INTLV", 116 | "TIMEPERIOD_PS": "833" 117 | } 118 | ], 119 | "PlatformData": { 120 | "plp": { 121 | "slaveBridge": "enabled", 122 | "cdmaBaseAddress1": "0x0", 123 | "ertCmdqBaseAddr": "0x008000000000", 124 | "m2m_address": "0x0", 125 | "cdmaBaseAddress2": "0x0", 126 | "ertVersion": "30", 127 | "dma": "xdma", 128 | "ertBaseAddr": "0x0", 129 | "ert": "enabled", 130 | "peerToPeer": "enabled", 131 | "cdmaBaseAddress0": "0x0", 132 | "m2m": "disabled", 133 | "cdmaBaseAddress3": "0x0", 134 | "numCdma": "0" 135 | }, 136 | "ulp": "" 137 | } 138 | } 139 | ], 140 | "NumBoards": "1" 141 | } 142 | ], 143 | "UnifiedPlatform": "true", 144 | "ExpandedPR": "false" 145 | } 146 | } 147 | -------------------------------------------------------------------------------- /examples/fft/global.h: -------------------------------------------------------------------------------- 1 | 
#pragma once 2 | 3 | #include "xcl2.hpp" 4 | 5 | // HBM channels 6 | #define MAX_HBM_CHANNEL_COUNT 32 7 | #define CHANNEL_NAME(n) ((n) | XCL_MEM_TOPOLOGY) 8 | const int HBM[MAX_HBM_CHANNEL_COUNT] = { 9 | CHANNEL_NAME(0), CHANNEL_NAME(1), CHANNEL_NAME(2), CHANNEL_NAME(3), CHANNEL_NAME(4), CHANNEL_NAME(5), 10 | CHANNEL_NAME(6), CHANNEL_NAME(7), CHANNEL_NAME(8), CHANNEL_NAME(9), CHANNEL_NAME(10), CHANNEL_NAME(11), 11 | CHANNEL_NAME(12), CHANNEL_NAME(13), CHANNEL_NAME(14), CHANNEL_NAME(15), CHANNEL_NAME(16), CHANNEL_NAME(17), 12 | CHANNEL_NAME(18), CHANNEL_NAME(19), CHANNEL_NAME(20), CHANNEL_NAME(21), CHANNEL_NAME(22), CHANNEL_NAME(23), 13 | CHANNEL_NAME(24), CHANNEL_NAME(25), CHANNEL_NAME(26), CHANNEL_NAME(27), CHANNEL_NAME(28), CHANNEL_NAME(29), 14 | CHANNEL_NAME(30), CHANNEL_NAME(31)}; 15 | 16 | const int DDR[2] = {CHANNEL_NAME(32), CHANNEL_NAME(33)}; 17 | 18 | // Use a more modern std container. If you prefer not to use std::vector, the helper in utils.hpp can allocate an aligned array instead: DTYPE* In_R = 19 | // aligned_alloc(SIZE * sizeof(DTYPE)) 20 | // TODO: change the memory allocation type 21 | using vec_t = std::vector<float, aligned_allocator<float> >; // the fft example uses float 22 | 23 | typedef float DTYPE; 24 | #define SIZE 1024 /* SIZE OF FFT */ -------------------------------------------------------------------------------- /examples/fft/host/fft_wrapper.hpp: -------------------------------------------------------------------------------- 1 | #include "event_timer.hpp" 2 | #include "global.h" 3 | 4 | // OpenCL wrapper for the fft kernel 5 | void fft_wrapper(vec_t X_R, vec_t X_I, vec_t& OUT_R, vec_t& OUT_I, std::string xclbin) { 6 | EventTimer et; 7 | cl_int err; 8 | cl::Context context; 9 | cl::CommandQueue q; 10 | // std::vector<cl::Kernel> krnls(NUM_KERNEL); 11 | // cl::Kernel krnl, krnl_read, krnl_write; 12 | cl::Kernel krnl_fft; 13 | 14 | // OPENCL HOST CODE AREA START 15 | et.add("OpenCL Initialization"); 16 | auto devices = xcl::get_xil_devices(); 17 | auto fileBuf = xcl::read_binary_file(xclbin); 18 | cl::Program::Binaries bins{{fileBuf.data(), fileBuf.size()}}; 19 | bool valid_device = false; 20 | for (unsigned
int i = 0; i < devices.size(); i++) { 21 | auto device = devices[i]; 22 | // Creating Context and Command Queue for selected Device 23 | OCL_CHECK(err, context = cl::Context(device, nullptr, nullptr, nullptr, &err)); 24 | OCL_CHECK(err, q = cl::CommandQueue(context, device, 25 | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE, 26 | &err)); 27 | 28 | std::cout << "Trying to program device[" << i << "]: " << device.getInfo<CL_DEVICE_NAME>() << std::endl; 29 | cl::Program program(context, {device}, bins, nullptr, &err); 30 | if (err != CL_SUCCESS) { 31 | std::cout << "Failed to program device[" << i << "] with xclbin file!\n"; 32 | } else { 33 | std::cout << "Device[" << i << "]: program successful!\n"; 34 | 35 | // Creating Kernel object using Compute unit names 36 | OCL_CHECK(err, krnl_fft = cl::Kernel(program, "fft", &err)); 37 | valid_device = true; 38 | break; // we break because we found a valid device 39 | } 40 | } 41 | if (!valid_device) { 42 | std::cout << "Failed to program any device found, exit!\n"; 43 | exit(EXIT_FAILURE); 44 | } 45 | et.finish(); 46 | 47 | /* Host mem flags */ 48 | cl_mem_ext_ptr_t X_R_ext; 49 | cl_mem_ext_ptr_t X_I_ext; 50 | cl_mem_ext_ptr_t OUT_R_ext; 51 | cl_mem_ext_ptr_t OUT_I_ext; 52 | 53 | X_R_ext.obj = X_R.data(); 54 | X_R_ext.param = 0; 55 | X_R_ext.flags = HBM[0]; 56 | 57 | X_I_ext.obj = X_I.data(); 58 | X_I_ext.param = 0; 59 | X_I_ext.flags = HBM[1]; 60 | 61 | OUT_R_ext.obj = OUT_R.data(); 62 | OUT_R_ext.param = 0; 63 | OUT_R_ext.flags = HBM[2]; 64 | 65 | OUT_I_ext.obj = OUT_I.data(); 66 | OUT_I_ext.param = 0; 67 | OUT_I_ext.flags = HBM[3]; 68 | 69 | cl::Buffer X_R_buf; 70 | cl::Buffer X_I_buf; 71 | cl::Buffer OUT_R_buf; 72 | cl::Buffer OUT_I_buf; 73 | 74 | et.add("Map host buffers to OpenCL buffers"); 75 | OCL_CHECK(err, X_R_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 76 | sizeof(DTYPE) * X_R.size(), &X_R_ext, &err)); 77 | OCL_CHECK(err, X_I_buf = cl::Buffer(context,
CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 78 | sizeof(DTYPE) * X_I.size(), &X_I_ext, &err)); 79 | OCL_CHECK(err, OUT_R_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 80 | sizeof(DTYPE) * OUT_R.size(), &OUT_R_ext, &err)); 81 | OCL_CHECK(err, OUT_I_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 82 | sizeof(DTYPE) * OUT_I.size(), &OUT_I_ext, &err)); 83 | et.finish(); 84 | 85 | et.add("Set kernel arguments"); 86 | OCL_CHECK(err, err = krnl_fft.setArg(0, X_R_buf)); 87 | OCL_CHECK(err, err = krnl_fft.setArg(1, X_I_buf)); 88 | OCL_CHECK(err, err = krnl_fft.setArg(2, OUT_R_buf)); 89 | OCL_CHECK(err, err = krnl_fft.setArg(3, OUT_I_buf)); 90 | 91 | et.add("Memory object migration enqueue"); 92 | OCL_CHECK(err, err = q.enqueueMigrateMemObjects({X_R_buf, X_I_buf}, 0 /* 0 means from host*/)); 93 | q.finish(); 94 | 95 | int num_runs = 1; // number of runs (measured: THROUGHPUT peaks at about 475 when num_runs is around 1000) 96 | et.add("OCL Enqueue task and wait for kernel to complete"); 97 | auto t1 = std::chrono::high_resolution_clock::now(); 98 | for (int i = 0; i < num_runs; i++) { 99 | OCL_CHECK(err, err = q.enqueueTask(krnl_fft)); 100 | q.finish(); 101 | } 102 | auto t2 = std::chrono::high_resolution_clock::now(); 103 | 104 | et.add("Read back computation results"); 105 | // Copy Result from Device Global Memory to Host Local Memory 106 | OCL_CHECK(err, err = q.enqueueMigrateMemObjects({OUT_R_buf, OUT_I_buf}, CL_MIGRATE_MEM_OBJECT_HOST)); 107 | q.finish(); 108 | et.finish(); 109 | 110 | // Print the time taken by each stage 111 | et.print(); 112 | 113 | // Print the kernel throughput 114 | float average_time_in_sec = 115 | float(std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000000 / num_runs; 116 | std::cout << "average_time: " << average_time_in_sec * 1000 << " ms" << std::endl; 117 | double throughput = 1; // each run performs one FFT of length SIZE 118 | throughput /= average_time_in_sec; 119 | std::cout << "Compute THROUGHPUT = " << throughput << " /s" << std::endl; 
120 | } 121 | 122 | // Variant that synchronizes with events 123 | // void fft_wrapper(vec_t X_R, vec_t X_I, vec_t& OUT_R, vec_t& OUT_I, std::string xclbin) { 124 | // EventTimer et; 125 | // cl_int err; 126 | // cl::Context context; 127 | // cl::CommandQueue q; 128 | // // std::vector<cl::Kernel> krnls(NUM_KERNEL); 129 | // // cl::Kernel krnl, krnl_read, krnl_write; 130 | // cl::Kernel krnl_fft; 131 | 132 | // // OPENCL HOST CODE AREA START 133 | // et.add("OpenCL Initialization"); 134 | // auto devices = xcl::get_xil_devices(); 135 | // auto fileBuf = xcl::read_binary_file(xclbin); 136 | // cl::Program::Binaries bins{{fileBuf.data(), fileBuf.size()}}; 137 | // bool valid_device = false; 138 | // for (unsigned int i = 0; i < devices.size(); i++) { 139 | // auto device = devices[i]; 140 | // // Creating Context and Command Queue for selected Device 141 | // OCL_CHECK(err, context = cl::Context(device, nullptr, nullptr, nullptr, &err)); 142 | // OCL_CHECK(err, q = cl::CommandQueue(context, device, 143 | // CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE, &err)); 144 | 145 | // std::cout << "Trying to program device[" << i << "]: " << device.getInfo<CL_DEVICE_NAME>() << std::endl; 146 | // cl::Program program(context, {device}, bins, nullptr, &err); 147 | // if (err != CL_SUCCESS) { 148 | // std::cout << "Failed to program device[" << i << "] with xclbin file!\n"; 149 | // } else { 150 | // std::cout << "Device[" << i << "]: program successful!\n"; 151 | 152 | // // Creating Kernel object using Compute unit names 153 | // OCL_CHECK(err, krnl_fft = cl::Kernel(program, "fft", &err)); 154 | // valid_device = true; 155 | // break; // we break because we found a valid device 156 | // } 157 | // } 158 | // if (!valid_device) { 159 | // std::cout << "Failed to program any device found, exit!\n"; 160 | // exit(EXIT_FAILURE); 161 | // } 162 | // et.finish(); 163 | 164 | // /* Host mem flags */ 165 | // cl_mem_ext_ptr_t X_R_ext; 166 | // cl_mem_ext_ptr_t X_I_ext; 167 | // cl_mem_ext_ptr_t OUT_R_ext; 168 | // 
cl_mem_ext_ptr_t OUT_I_ext; 169 | 170 | // X_R_ext.obj = X_R.data(); 171 | // X_R_ext.param = 0; 172 | // X_R_ext.flags = HBM[0]; 173 | 174 | // X_I_ext.obj = X_I.data(); 175 | // X_I_ext.param = 0; 176 | // X_I_ext.flags = HBM[1]; 177 | 178 | // OUT_R_ext.obj = OUT_R.data(); 179 | // OUT_R_ext.param = 0; 180 | // OUT_R_ext.flags = HBM[2]; 181 | 182 | // OUT_I_ext.obj = OUT_I.data(); 183 | // OUT_I_ext.param = 0; 184 | // OUT_I_ext.flags = HBM[3]; 185 | 186 | // cl::Buffer X_R_buf; 187 | // cl::Buffer X_I_buf; 188 | // cl::Buffer OUT_R_buf; 189 | // cl::Buffer OUT_I_buf; 190 | 191 | // et.add("Map host buffers to OpenCL buffers"); 192 | // OCL_CHECK(err, X_R_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 193 | // sizeof(DTYPE) * X_R.size(), &X_R_ext, &err)); 194 | 195 | // OCL_CHECK(err, X_I_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 196 | // sizeof(DTYPE) * X_I.size(), &X_I_ext, &err)); 197 | 198 | // OCL_CHECK(err, OUT_R_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 199 | // sizeof(DTYPE) * OUT_R.size(), &OUT_R_ext, &err)); 200 | 201 | // OCL_CHECK(err, OUT_I_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 202 | // sizeof(DTYPE) * OUT_I.size(), &OUT_I_ext, &err)); 203 | // et.finish(); 204 | 205 | // et.add("Set kernel arguments"); 206 | // OCL_CHECK(err, err = krnl_fft.setArg(0, X_R_buf)); 207 | // OCL_CHECK(err, err = krnl_fft.setArg(1, X_I_buf)); 208 | // OCL_CHECK(err, err = krnl_fft.setArg(2, OUT_R_buf)); 209 | // OCL_CHECK(err, err = krnl_fft.setArg(3, OUT_I_buf)); 210 | 211 | // std::vector<cl::Event> events_write(1); 212 | // std::vector<cl::Event> events_kernel(1); 213 | // std::vector<cl::Event> events_read(1); 214 | 215 | // et.add("Memory object migration enqueue"); 216 | // OCL_CHECK( 217 | // err, err = q.enqueueMigrateMemObjects({X_R_buf, X_I_buf}, 0 /* 0 means from host*/, nullptr, &events_write[0])); 
218 | // clWaitForEvents(1, (const cl_event*)&events_write[0]); 219 | 220 | // int num_runs = 1000; // number of runs (measured: THROUGHPUT peaks at about 475 when num_runs is around 1000) 221 | // et.add("OCL Enqueue task and wait for kernel to complete"); 222 | // auto t1 = std::chrono::high_resolution_clock::now(); 223 | // for (int i = 0; i < num_runs; i++) { 224 | // OCL_CHECK(err, err = q.enqueueTask(krnl_fft, &events_write, &events_kernel[0])); 225 | // clWaitForEvents(1, (const cl_event*)&events_kernel[0]); 226 | // } 227 | // auto t2 = std::chrono::high_resolution_clock::now(); 228 | 229 | // et.add("Read back computation results"); 230 | // // Copy Result from Device Global Memory to Host Local Memory 231 | // OCL_CHECK(err, err = q.enqueueMigrateMemObjects({OUT_R_buf, OUT_I_buf}, CL_MIGRATE_MEM_OBJECT_HOST, &events_kernel, 232 | // &events_read[0])); 233 | // clWaitForEvents(1, (const cl_event*)&events_read[0]); 234 | // et.finish(); 235 | 236 | // q.finish(); 237 | 238 | // // Print the time taken by each stage 239 | // et.print(); 240 | 241 | // // Print the kernel throughput 242 | // float average_time_in_sec = 243 | // float(std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000000 / num_runs; 244 | // std::cout << "average_time: " << average_time_in_sec * 1000 << " ms" << std::endl; 245 | // double throughput = 1; // each run performs one FFT of length SIZE 246 | // throughput /= average_time_in_sec; 247 | // std::cout << "Compute THROUGHPUT = " << throughput << " /s" << std::endl; 248 | // } -------------------------------------------------------------------------------- /examples/fft/host/host.cpp: -------------------------------------------------------------------------------- 1 | #include <math.h> 2 | #include <stdio.h> 3 | #include "fft_wrapper.hpp" 4 | #include "utils.hpp" 5 | using namespace std; 6 | 7 | struct Rmse { 8 | int num_sq; 9 | float sum_sq; 10 | float error; 11 | 12 | Rmse() { 13 | num_sq = 0; 14 | sum_sq = 0; 15 | error = 0; 16 | } 17 | 18 | float add_value(float d_n) { 19 | num_sq++; 20 | sum_sq += (d_n * d_n); 21 | error = sqrtf(sum_sq / num_sq); 22 | return 
error; 23 | } 24 | }; 25 | 26 | Rmse rmse_R, rmse_I; 27 | 28 | // Data must be aligned 29 | vec_t In_R(SIZE), In_I(SIZE); 30 | vec_t Out_R(SIZE), Out_I(SIZE); 31 | 32 | int main(int argc, const char* argv[]) { 33 | std::cout << "\n-------------------START----------------\n"; 34 | ArgParser parser(argc, argv); 35 | std::string tmpStr; 36 | if (!parser.getCmdOption("--xclbin", tmpStr)) { 37 | std::cout << "ERROR: xclbin is not set!\n"; 38 | return 1; 39 | } 40 | 41 | int index; 42 | float gold_R, gold_I; 43 | 44 | FILE* fp = fopen("kernel/out.gold.dat", "r"); 45 | if (fp == NULL) { 46 | fprintf(stderr, "Error: cannot open the golden output file\n"); 47 | return 1; 48 | } 49 | 50 | // Generate input data 51 | for (int i = 0; i < SIZE; i++) { 52 | In_R[i] = i; 53 | In_I[i] = 0.0; 54 | Out_R[i] = i; 55 | Out_I[i] = 1; 56 | } 57 | 58 | // Perform FFT 59 | fft_wrapper(In_R, In_I, Out_R, Out_I, tmpStr); 60 | 61 | // comparing with golden output 62 | for (int i = 0; i < SIZE; i++) { 63 | fscanf(fp, "%d %f %f", &index, &gold_R, &gold_I); 64 | // printf("%f %f\n", Out_R[i], gold_R); 65 | rmse_R.add_value(Out_R[i] - gold_R); 66 | rmse_I.add_value(Out_I[i] - gold_I); 67 | } 68 | fclose(fp); 69 | 70 | // printing error results 71 | printf("----------------------------------------------\n"); 72 | printf(" RMSE(R) RMSE(I)\n"); 73 | printf("%0.15f %0.15f\n", rmse_R.error, rmse_I.error); 74 | printf("----------------------------------------------\n"); 75 | 76 | if (rmse_R.error > 1 || rmse_I.error > 1) { 77 | fprintf(stdout, "*******************************************\n"); 78 | fprintf(stdout, "FAIL: Output DOES NOT match the golden output\n"); 79 | fprintf(stdout, "*******************************************\n"); 80 | return 1; 81 | } else { 82 | fprintf(stdout, "*******************************************\n"); 83 | fprintf(stdout, "PASS: The output matches the golden output!\n"); 84 | fprintf(stdout, "*******************************************\n"); 85 | return 0; 86 | } 87 | } 
88 | -------------------------------------------------------------------------------- /examples/fft/img/2bde4ea848e911dc307525dd359e70e4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/2bde4ea848e911dc307525dd359e70e4.png -------------------------------------------------------------------------------- /examples/fft/img/abaeeb9056d37a72a2598b4490bd3c41.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/abaeeb9056d37a72a2598b4490bd3c41.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106154407864.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106154407864.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106161357212.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106161357212.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106161814852.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106161814852.png -------------------------------------------------------------------------------- 
/examples/fft/img/image-20250106164155120.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106164155120.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106164530672.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106164530672.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106164624970.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106164624970.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106164843096.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106164843096.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106164909915.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106164909915.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106165003683.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106165003683.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106165306262.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106165306262.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106165321358.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106165321358.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106165438505.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106165438505.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250106194714626.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250106194714626.png -------------------------------------------------------------------------------- /examples/fft/img/image-20250108170352415.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/examples/fft/img/image-20250108170352415.png -------------------------------------------------------------------------------- /examples/fft/kernel/fft.h: -------------------------------------------------------------------------------- 1 | 2 | #ifndef FFT_H_ 3 | #define FFT_H_ 4 | 5 | typedef float DTYPE; 6 | typedef int INTTYPE; 7 | 8 | #define M 10 /*Number of Stages = Log2N */ 9 | #define SIZE 1024 /* SIZE OF FFT */ 10 | #define SIZE2 (SIZE>>1) 11 | 12 | //W_real and W_imag are twiddle factors for 1024 size FFT. 13 | //W_real[i]=cos(e*i/SIZE); 14 | //W_imag[i]=sin(e*i/SIZE); 15 | //where i=[0,512) and DTYPE e = -6.283185307178; 16 | // the Omega Value of W[N,k],k in [0,511] to be used in FFT 17 | const DTYPE W_real[]={1.000000, 0.999981,0.999925,0.999831,0.999699,0.999529,0.999322,0.999078,0.998795,0.998476,0.998118,0.997723,0.997290,0.996820,0.996313,0.995767,0.995185, 18 | 19 | 0.994565,0.993907,0.993212,0.992480,0.991710,0.990903,0.990058,0.989177,0.988258,0.987301,0.986308,0.985278,0.984210,0.983105,0.981964,0.980785, 20 | 21 | 0.979570,0.978317,0.977028,0.975702,0.974339,0.972940,0.971504,0.970031,0.968522,0.966976,0.965394,0.963776,0.962121,0.960431,0.958703,0.956940, 22 | 23 | 0.955141,0.953306,0.951435,0.949528,0.947586,0.945607,0.943593,0.941544,0.939459,0.937339,0.935184,0.932993,0.930767,0.928506,0.926210,0.923880, 24 | 25 | 0.921514,0.919114,0.916679,0.914210,0.911706,0.909168,0.906596,0.903989,0.901349,0.898674,0.895966,0.893224,0.890449,0.887640,0.884797,0.881921, 26 | 27 | 0.879012,0.876070,0.873095,0.870087,0.867046,0.863973,0.860867,0.857729,0.854558,0.851355,0.848120,0.844854,0.841555,0.838225,0.834863,0.831470, 28 | 29 | 0.828045,0.824589,0.821102,0.817585,0.814036,0.810457,0.806848,0.803208,0.799537,0.795837,0.792107,0.788346,0.784557,0.780737,0.776888,0.773010, 30 | 31 | 
0.769103,0.765167,0.761202,0.757209,0.753187,0.749136,0.745058,0.740951,0.736817,0.732654,0.728464,0.724247,0.720002,0.715731,0.711432,0.707107, 32 | 33 | 0.702755,0.698376,0.693971,0.689541,0.685084,0.680601,0.676093,0.671559,0.667000,0.662416,0.657807,0.653173,0.648514,0.643832,0.639124,0.634393, 34 | 35 | 0.629638,0.624859,0.620057,0.615232,0.610383,0.605511,0.600616,0.595699,0.590760,0.585798,0.580814,0.575808,0.570781,0.565732,0.560662,0.555570, 36 | 37 | 0.550458,0.545325,0.540171,0.534998,0.529804,0.524590,0.519356,0.514103,0.508830,0.503538,0.498228,0.492898,0.487550,0.482184,0.476799,0.471397, 38 | 39 | 0.465976,0.460539,0.455084,0.449611,0.444122,0.438616,0.433094,0.427555,0.422000,0.416430,0.410843,0.405241,0.399624,0.393992,0.388345,0.382683, 40 | 41 | 0.377007,0.371317,0.365613,0.359895,0.354163,0.348419,0.342661,0.336890,0.331106,0.325310,0.319502,0.313682,0.307850,0.302006,0.296151,0.290285, 42 | 43 | 0.284407,0.278520,0.272621,0.266713,0.260794,0.254866,0.248928,0.242980,0.237024,0.231058,0.225084,0.219101,0.213110,0.207111,0.201105,0.195090, 44 | 45 | 0.189069,0.183040,0.177004,0.170962,0.164913,0.158858,0.152797,0.146730,0.140658,0.134581,0.128498,0.122411,0.116319,0.110222,0.104122,0.098017, 46 | 47 | 0.091909,0.085797,0.079682,0.073565,0.067444,0.061321,0.055195,0.049068,0.042938,0.036807,0.030675,0.024541,0.018407,0.012271,0.006136,-0.000000, 48 | 49 | -0.006136,-0.012272,-0.018407,-0.024541,-0.030675,-0.036807,-0.042938,-0.049068,-0.055195,-0.061321,-0.067444,-0.073565,-0.079682,-0.085797,-0.091909,-0.098017, 50 | 51 | -0.104122,-0.110222,-0.116319,-0.122411,-0.128498,-0.134581,-0.140658,-0.146731,-0.152797,-0.158858,-0.164913,-0.170962,-0.177004,-0.183040,-0.189069,-0.195090, 52 | 53 | -0.201105,-0.207111,-0.213110,-0.219101,-0.225084,-0.231058,-0.237024,-0.242980,-0.248928,-0.254866,-0.260794,-0.266713,-0.272621,-0.278520,-0.284408,-0.290285, 54 | 55 | 
-0.296151,-0.302006,-0.307850,-0.313682,-0.319502,-0.325310,-0.331106,-0.336890,-0.342661,-0.348419,-0.354164,-0.359895,-0.365613,-0.371317,-0.377007,-0.382683, 56 | 57 | -0.388345,-0.393992,-0.399624,-0.405241,-0.410843,-0.416430,-0.422000,-0.427555,-0.433094,-0.438616,-0.444122,-0.449611,-0.455084,-0.460539,-0.465977,-0.471397, 58 | 59 | -0.476799,-0.482184,-0.487550,-0.492898,-0.498228,-0.503538,-0.508830,-0.514103,-0.519356,-0.524590,-0.529804,-0.534998,-0.540172,-0.545325,-0.550458,-0.555570, 60 | 61 | -0.560662,-0.565732,-0.570781,-0.575808,-0.580814,-0.585798,-0.590760,-0.595699,-0.600617,-0.605511,-0.610383,-0.615232,-0.620057,-0.624860,-0.629638,-0.634393, 62 | 63 | -0.639125,-0.643832,-0.648514,-0.653173,-0.657807,-0.662416,-0.667000,-0.671559,-0.676093,-0.680601,-0.685084,-0.689541,-0.693972,-0.698376,-0.702755,-0.707107, 64 | 65 | -0.711432,-0.715731,-0.720003,-0.724247,-0.728464,-0.732654,-0.736817,-0.740951,-0.745058,-0.749136,-0.753187,-0.757209,-0.761202,-0.765167,-0.769103,-0.773010, 66 | 67 | -0.776888,-0.780737,-0.784557,-0.788346,-0.792107,-0.795837,-0.799537,-0.803208,-0.806848,-0.810457,-0.814036,-0.817585,-0.821103,-0.824589,-0.828045,-0.831470, 68 | 69 | -0.834863,-0.838225,-0.841555,-0.844854,-0.848120,-0.851355,-0.854558,-0.857729,-0.860867,-0.863973,-0.867046,-0.870087,-0.873095,-0.876070,-0.879012,-0.881921, 70 | 71 | -0.884797,-0.887640,-0.890449,-0.893224,-0.895966,-0.898674,-0.901349,-0.903989,-0.906596,-0.909168,-0.911706,-0.914210,-0.916679,-0.919114,-0.921514,-0.923880, 72 | 73 | -0.926210,-0.928506,-0.930767,-0.932993,-0.935184,-0.937339,-0.939459,-0.941544,-0.943594,-0.945607,-0.947586,-0.949528,-0.951435,-0.953306,-0.955141,-0.956940, 74 | 75 | -0.958704,-0.960431,-0.962121,-0.963776,-0.965394,-0.966976,-0.968522,-0.970031,-0.971504,-0.972940,-0.974339,-0.975702,-0.977028,-0.978317,-0.979570,-0.980785, 76 | 77 | 
-0.981964,-0.983105,-0.984210,-0.985278,-0.986308,-0.987301,-0.988258,-0.989177,-0.990058,-0.990903,-0.991710,-0.992480,-0.993212,-0.993907,-0.994565,-0.995185, 78 | 79 | -0.995767,-0.996313,-0.996820,-0.997290,-0.997723,-0.998118,-0.998476,-0.998795,-0.999078,-0.999322,-0.999529,-0.999699,-0.999831,-0.999925,-0.999981}; 80 | const DTYPE W_imag[]={-0.000000,-0.006136,-0.012272,-0.018407,-0.024541,-0.030675,-0.036807,-0.042938,-0.049068,-0.055195,-0.061321,-0.067444,-0.073565,-0.079682,-0.085797,-0.091909,-0.098017,-0.104122,-0.110222,-0.116319,-0.122411,-0.128498,-0.134581,-0.140658,-0.146730,-0.152797,-0.158858,-0.164913,-0.170962,-0.177004,-0.183040,-0.189069,-0.195090,-0.201105,-0.207111,-0.213110,-0.219101,-0.225084,-0.231058,-0.237024,-0.242980,-0.248928,-0.254866,-0.260794,-0.266713,-0.272621,-0.278520,-0.284408,-0.290285,-0.296151,-0.302006,-0.307850,-0.313682,-0.319502,-0.325310,-0.331106,-0.336890,-0.342661,-0.348419,-0.354164,-0.359895,-0.365613,-0.371317,-0.377007,-0.382683,-0.388345,-0.393992,-0.399624,-0.405241,-0.410843,-0.416430,-0.422000,-0.427555,-0.433094,-0.438616,-0.444122,-0.449611,-0.455084,-0.460539,-0.465977,-0.471397,-0.476799,-0.482184,-0.487550,-0.492898,-0.498228,-0.503538,-0.508830,-0.514103,-0.519356,-0.524590,-0.529804,-0.534998,-0.540172,-0.545325,-0.550458,-0.555570,-0.560662,-0.565732,-0.570781,-0.575808,-0.580814,-0.585798,-0.590760,-0.595699,-0.600617,-0.605511,-0.610383,-0.615232,-0.620057,-0.624860,-0.629638,-0.634393,-0.639124,-0.643832,-0.648514,-0.653173,-0.657807,-0.662416,-0.667000,-0.671559,-0.676093,-0.680601,-0.685084,-0.689541,-0.693971,-0.698376,-0.702755,-0.707107,-0.711432,-0.715731,-0.720003,-0.724247,-0.728464,-0.732654,-0.736817,-0.740951,-0.745058,-0.749136,-0.753187,-0.757209,-0.761202,-0.765167,-0.769103,-0.773010,-0.776888,-0.780737,-0.784557,-0.788346,-0.792107,-0.795837,-0.799537,-0.803208,-0.806848,-0.810457,-0.814036,-0.817585,-0.821103,-0.824589,-0.828045,-0.831470,-0.834863,-0.838225,-0.841555,-0.844854,
-0.848120,-0.851355,-0.854558,-0.857729,-0.860867,-0.863973,-0.867046,-0.870087,-0.873095,-0.876070,-0.879012,-0.881921,-0.884797,-0.887640,-0.890449,-0.893224,-0.895966,-0.898674,-0.901349,-0.903989,-0.906596,-0.909168,-0.911706,-0.914210,-0.916679,-0.919114,-0.921514,-0.923880,-0.926210,-0.928506,-0.930767,-0.932993,-0.935184,-0.937339,-0.939459,-0.941544,-0.943593,-0.945607,-0.947586,-0.949528,-0.951435,-0.953306,-0.955141,-0.956940,-0.958703,-0.960431,-0.962121,-0.963776,-0.965394,-0.966976,-0.968522,-0.970031,-0.971504,-0.972940,-0.974339,-0.975702,-0.977028,-0.978317,-0.979570,-0.980785,-0.981964,-0.983105,-0.984210,-0.985278,-0.986308,-0.987301,-0.988258,-0.989177,-0.990058,-0.990903,-0.991710,-0.992480,-0.993212,-0.993907,-0.994565,-0.995185,-0.995767,-0.996313,-0.996820,-0.997290,-0.997723,-0.998118,-0.998476,-0.998795,-0.999078,-0.999322,-0.999529,-0.999699,-0.999831,-0.999925,-0.999981,-1.000000,-0.999981,-0.999925,-0.999831,-0.999699,-0.999529,-0.999322,-0.999078,-0.998795,-0.998476,-0.998118,-0.997723,-0.997290,-0.996820,-0.996313,-0.995767,-0.995185,-0.994565,-0.993907,-0.993212,-0.992480,-0.991710,-0.990903,-0.990058,-0.989177,-0.988258,-0.987301,-0.986308,-0.985278,-0.984210,-0.983105,-0.981964,-0.980785,-0.979570,-0.978317,-0.977028,-0.975702,-0.974339,-0.972940,-0.971504,-0.970031,-0.968522,-0.966976,-0.965394,-0.963776,-0.962121,-0.960431,-0.958703,-0.956940,-0.955141,-0.953306,-0.951435,-0.949528,-0.947586,-0.945607,-0.943593,-0.941544,-0.939459,-0.937339,-0.935183,-0.932993,-0.930767,-0.928506,-0.926210,-0.923880,-0.921514,-0.919114,-0.916679,-0.914210,-0.911706,-0.909168,-0.906596,-0.903989,-0.901349,-0.898674,-0.895966,-0.893224,-0.890449,-0.887640,-0.884797,-0.881921,-0.879012,-0.876070,-0.873095,-0.870087,-0.867046,-0.863973,-0.860867,-0.857729,-0.854558,-0.851355,-0.848120,-0.844854,-0.841555,-0.838225,-0.834863,-0.831470,-0.828045,-0.824589,-0.821102,-0.817585,-0.814036,-0.810457,-0.806848,-0.803208,-0.799537,-0.795837,-0.792107,-0.788346,
-0.784557,-0.780737,-0.776888,-0.773010,-0.769103,-0.765167,-0.761202,-0.757209,-0.753187,-0.749136,-0.745058,-0.740951,-0.736817,-0.732654,-0.728464,-0.724247,-0.720002,-0.715731,-0.711432,-0.707107,-0.702755,-0.698376,-0.693971,-0.689541,-0.685084,-0.680601,-0.676093,-0.671559,-0.667000,-0.662416,-0.657807,-0.653173,-0.648514,-0.643831,-0.639124,-0.634393,-0.629638,-0.624859,-0.620057,-0.615232,-0.610383,-0.605511,-0.600616,-0.595699,-0.590760,-0.585798,-0.580814,-0.575808,-0.570781,-0.565732,-0.560661,-0.555570,-0.550458,-0.545325,-0.540171,-0.534998,-0.529804,-0.524590,-0.519356,-0.514103,-0.508830,-0.503538,-0.498228,-0.492898,-0.487550,-0.482184,-0.476799,-0.471397,-0.465976,-0.460539,-0.455084,-0.449611,-0.444122,-0.438616,-0.433094,-0.427555,-0.422000,-0.416429,-0.410843,-0.405241,-0.399624,-0.393992,-0.388345,-0.382683,-0.377007,-0.371317,-0.365613,-0.359895,-0.354163,-0.348419,-0.342661,-0.336890,-0.331106,-0.325310,-0.319502,-0.313682,-0.307850,-0.302006,-0.296151,-0.290285,-0.284407,-0.278520,-0.272621,-0.266713,-0.260794,-0.254866,-0.248928,-0.242980,-0.237024,-0.231058,-0.225084,-0.219101,-0.213110,-0.207111,-0.201105,-0.195090,-0.189069,-0.183040,-0.177004,-0.170962,-0.164913,-0.158858,-0.152797,-0.146730,-0.140658,-0.134581,-0.128498,-0.122411,-0.116319,-0.110222,-0.104122,-0.098017,-0.091909,-0.085797,-0.079682,-0.073564,-0.067444,-0.061321,-0.055195,-0.049068,-0.042938,-0.036807,-0.030675,-0.024541,-0.018407,-0.012271,-0.006136}; 81 | 82 | extern "C" void fft(DTYPE XX_R[SIZE], DTYPE XX_I[SIZE], DTYPE OUT_R[SIZE], DTYPE OUT_I[SIZE]); 83 | 84 | #endif 85 | -------------------------------------------------------------------------------- /examples/fft/kernel/fft_test.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include 3 | #include 4 | #include 5 | #include "fft.h" 6 | 7 | 8 | struct Rmse 9 | { 10 | int num_sq; 11 | float sum_sq; 12 | float error; 13 | 14 | Rmse(){ num_sq = 0; sum_sq = 0; error = 0; } 
15 | 16 | float add_value(float d_n) 17 | { 18 | num_sq++; 19 | sum_sq += (d_n*d_n); 20 | error = sqrtf(sum_sq / num_sq); 21 | return error; 22 | } 23 | 24 | }; 25 | 26 | 27 | Rmse rmse_R, rmse_I; 28 | 29 | DTYPE In_R[SIZE], In_I[SIZE]; 30 | DTYPE Out_R[SIZE], Out_I[SIZE]; 31 | 32 | int main() 33 | { 34 | int index; 35 | DTYPE gold_R, gold_I; 36 | 37 | FILE * fp = fopen("out.gold.dat","r"); 38 | 39 | //Generate input data 40 | for(int i=0; i 1 || rmse_I.error > 1 ) { 68 | fprintf(stdout, "*******************************************\n"); 69 | fprintf(stdout, "FAIL: Output DOES NOT match the golden output\n"); 70 | fprintf(stdout, "*******************************************\n"); 71 | return 1; 72 | } else { 73 | fprintf(stdout, "*******************************************\n"); 74 | fprintf(stdout, "PASS: The output matches the golden output!\n"); 75 | fprintf(stdout, "*******************************************\n"); 76 | return 0; 77 | } 78 | 79 | } 80 | -------------------------------------------------------------------------------- /examples/fft/make_hw.sh: -------------------------------------------------------------------------------- 1 | make build TARGET=hw 2 | make host 3 | make run TARGET=hw 4 | 5 | 6 | -------------------------------------------------------------------------------- /examples/fft/make_hwe.sh: -------------------------------------------------------------------------------- 1 | make build TARGET=hw_emu 2 | make host 3 | make run TARGET=hw_emu 4 | 5 | 6 | -------------------------------------------------------------------------------- /examples/fft/make_kernel.sh: -------------------------------------------------------------------------------- 1 | make cleanall 2 | make build TARGET=sw_emu 3 | make build TARGET=hw_emu 4 | make build TARGET=hw 5 | 6 | -------------------------------------------------------------------------------- /examples/fft/make_run.sh: -------------------------------------------------------------------------------- 1 | 2 | 
make run TARGET=sw_emu 3 | make run TARGET=hw_emu 4 | make run TARGET=hw 5 | 6 | -------------------------------------------------------------------------------- /examples/fft/make_sw.sh: -------------------------------------------------------------------------------- 1 | # make build TARGET=sw_emu DEVICE =XXXX CXXFLAGS=XXXX KERNEL=XXXX 2 | # make host DEVICE =XXXX CXXFLAGS=XXXX KERNEL=XXXX 3 | # make run TARGET=sw_emu 4 | 5 | make build TARGET=sw_emu 6 | make host 7 | make run TARGET=sw_emu 8 | 9 | 10 | -------------------------------------------------------------------------------- /examples/fft/remake_host.sh: -------------------------------------------------------------------------------- 1 | make cleanh 2 | make host 3 | make run TARGET=sw_emu -------------------------------------------------------------------------------- /examples/fft/utils.mk: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019-2021 Xilinx, Inc. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # 16 | #+------------------------------------------------------------------------------- 17 | # The following parameters are assigned with default values. 
These parameters can 18 | # be overridden through the make command line 19 | #+------------------------------------------------------------------------------- 20 | 21 | REPORT := no 22 | PROFILE := no 23 | DEBUG := no 24 | 25 | #'estimate' for estimate report generation 26 | #'system' for system report generation 27 | ifneq ($(REPORT), no) 28 | VPP_LDFLAGS += --report estimate 29 | VPP_LDFLAGS += --report system 30 | endif 31 | 32 | #Generates profile summary report 33 | ifeq ($(PROFILE), yes) 34 | VPP_LDFLAGS += --profile_kernel data:all:all:all 35 | endif 36 | 37 | #Generates debug summary report 38 | ifeq ($(DEBUG), yes) 39 | VPP_LDFLAGS += --dk protocol:all:all:all 40 | endif 41 | 42 | #Checks for XILINX_XRT 43 | ifeq ($(HOST_ARCH), x86) 44 | ifndef XILINX_XRT 45 | XILINX_XRT = /opt/xilinx/xrt 46 | export XILINX_XRT 47 | endif 48 | else 49 | ifndef XILINX_VITIS 50 | XILINX_VITIS = /opt/xilinx/Vitis/$(TOOL_VERSION) 51 | export XILINX_VITIS 52 | endif 53 | endif 54 | 55 | #Checks for Device Family 56 | ifeq ($(HOST_ARCH), aarch32) 57 | DEV_FAM = 7Series 58 | else ifeq ($(HOST_ARCH), aarch64) 59 | DEV_FAM = Ultrascale 60 | endif 61 | 62 | B_NAME = $(shell dirname $(XPLATFORM)) 63 | 64 | #Checks for Correct architecture 65 | ifneq ($(HOST_ARCH), $(filter $(HOST_ARCH),aarch64 aarch32 x86)) 66 | $(error HOST_ARCH variable not set, please set correctly and rerun) 67 | endif 68 | 69 | #Checks for SYSROOT 70 | check_sysroot: 71 | ifneq ($(HOST_ARCH), x86) 72 | ifndef SYSROOT 73 | $(error SYSROOT ENV variable is not set, please set ENV variable correctly and rerun) 74 | endif 75 | endif 76 | 77 | check_version: 78 | ifneq (, $(shell which git)) 79 | ifneq (,$(wildcard $(XFLIB_DIR)/.git)) 80 | @cd $(XFLIB_DIR) && git log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit -n 1 && cd - 81 | endif 82 | endif 83 | 84 | #Checks for g++ 85 | CXX := g++ 86 | ifeq ($(HOST_ARCH), x86) 87 | ifneq ($(shell expr 
$(shell g++ -dumpversion) \>= 5), 1) 88 | ifndef XILINX_VIVADO 89 | $(error [ERROR]: g++ version too old. Please use 5.0 or above) 90 | else 91 | CXX := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/bin/g++ 92 | ifeq ($(LD_LIBRARY_PATH),) 93 | export LD_LIBRARY_PATH := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/lib64 94 | else 95 | export LD_LIBRARY_PATH := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/lib64:$(LD_LIBRARY_PATH) 96 | endif 97 | $(warning [WARNING]: g++ version too old. Using g++ provided by the tool: $(CXX)) 98 | endif 99 | endif 100 | else ifeq ($(HOST_ARCH), aarch64) 101 | CXX := $(XILINX_VITIS)/gnu/aarch64/lin/aarch64-linux/bin/aarch64-linux-gnu-g++ 102 | else ifeq ($(HOST_ARCH), aarch32) 103 | CXX := $(XILINX_VITIS)/gnu/aarch32/lin/gcc-arm-linux-gnueabi/bin/arm-linux-gnueabihf-g++ 104 | endif 105 | 106 | #Setting VPP 107 | VPP := v++ 108 | 109 | #Checks for aiecompiler 110 | 111 | .PHONY: check_vivado 112 | check_vivado: 113 | ifeq (,$(wildcard $(XILINX_VIVADO)/bin/vivado)) 114 | @echo "Cannot locate Vivado installation. Please set XILINX_VIVADO variable." && false 115 | endif 116 | 117 | .PHONY: check_vpp 118 | check_vpp: 119 | ifeq (,$(wildcard $(XILINX_VITIS)/bin/v++)) 120 | @echo "Cannot locate Vitis installation. Please set XILINX_VITIS variable." && false 121 | endif 122 | 123 | .PHONY: check_xrt 124 | check_xrt: 125 | ifeq ($(HOST_ARCH), x86) 126 | ifeq (,$(wildcard $(XILINX_XRT)/lib/libxilinxopencl.so)) 127 | @echo "Cannot locate XRT installation. Please set XILINX_XRT variable." 
&& false 128 | endif 129 | endif 130 | 131 | export PATH := $(XILINX_VITIS)/bin:$(XILINX_XRT)/bin:$(PATH) 132 | ifeq ($(HOST_ARCH), x86) 133 | ifeq (,$(LD_LIBRARY_PATH)) 134 | LD_LIBRARY_PATH := $(XILINX_XRT)/lib 135 | else 136 | LD_LIBRARY_PATH := $(XILINX_XRT)/lib:$(LD_LIBRARY_PATH) 137 | endif 138 | else # aarch64 139 | ifeq (,$(LD_LIBRARY_PATH)) 140 | LD_LIBRARY_PATH := $(SYSROOT)/usr/lib 141 | else 142 | LD_LIBRARY_PATH := $(SYSROOT)/usr/lib:$(LD_LIBRARY_PATH) 143 | endif 144 | endif 145 | 146 | # check target 147 | ifeq ($(filter $(TARGET),sw_emu hw_emu hw),) 148 | $(error TARGET is not sw_emu, hw_emu or hw) 149 | endif 150 | 151 | ifneq (,$(wildcard $(DEVICE))) 152 | # Use DEVICE as a file path 153 | XPLATFORM := $(DEVICE) 154 | else 155 | # Use DEVICE as a file name pattern 156 | # 1. search paths specified by variable 157 | ifneq (,$(PLATFORM_REPO_PATHS)) 158 | # 1.1 as exact name 159 | XPLATFORM := $(strip $(foreach p, $(subst :, ,$(PLATFORM_REPO_PATHS)), $(wildcard $(p)/$(DEVICE)/$(DEVICE).xpfm))) 160 | # 1.2 as a pattern 161 | ifeq (,$(XPLATFORM)) 162 | XPLATFORMS := $(foreach p, $(subst :, ,$(PLATFORM_REPO_PATHS)), $(wildcard $(p)/*/*.xpfm)) 163 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 164 | endif # 1.2 165 | endif # 1 166 | # 2. search Vitis installation 167 | ifeq (,$(XPLATFORM)) 168 | # 2.1 as exact name 169 | XPLATFORM := $(strip $(wildcard $(XILINX_VITIS)/platforms/$(DEVICE)/$(DEVICE).xpfm)) 170 | # 2.2 as a pattern 171 | ifeq (,$(XPLATFORM)) 172 | XPLATFORMS := $(wildcard $(XILINX_VITIS)/platforms/*/*.xpfm) 173 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 174 | endif # 2.2 175 | endif # 2 176 | # 3. 
search default locations 177 | ifeq (,$(XPLATFORM)) 178 | # 3.1 as exact name 179 | XPLATFORM := $(strip $(wildcard /opt/xilinx/platforms/$(DEVICE)/$(DEVICE).xpfm)) 180 | # 3.2 as a pattern 181 | ifeq (,$(XPLATFORM)) 182 | XPLATFORMS := $(wildcard /opt/xilinx/platforms/*/*.xpfm) 183 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 184 | endif # 3.2 185 | endif # 3 186 | endif 187 | 188 | define MSG_PLATFORM 189 | No platform matched pattern '$(DEVICE)'. 190 | Available platforms are: $(XPLATFORMS) 191 | To add more platform directories, set the PLATFORM_REPO_PATHS variable or point DEVICE variable to the full path of platform .xpfm file. 192 | endef 193 | export MSG_PLATFORM 194 | 195 | define MSG_DEVICE 196 | More than one platform matched: $(XPLATFORM) 197 | Please set DEVICE variable more accurately to select only one platform file, or set DEVICE variable to the full path of the platform .xpfm file. 198 | endef 199 | export MSG_DEVICE 200 | 201 | .PHONY: check_platform 202 | check_platform: 203 | ifeq (,$(XPLATFORM)) 204 | @echo "$${MSG_PLATFORM}" && false 205 | endif 206 | ifneq (,$(word 2,$(XPLATFORM))) 207 | @echo "$${MSG_DEVICE}" && false 208 | endif 209 | #Check ends 210 | 211 | # device2xsa - create a filesystem friendly name from device name 212 | # $(1) - full name of device 213 | device2xsa = $(strip $(patsubst %.xpfm, % , $(shell basename $(DEVICE)))) 214 | 215 | # Cleaning stuff 216 | RM = rm -f 217 | RMDIR = rm -rf 218 | 219 | MV = mv -f 220 | CP = cp -rf 221 | ECHO:= @echo 222 | -------------------------------------------------------------------------------- /examples/vadd_vmul/Makefile: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019-2021 Xilinx, Inc. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 
6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 15 | # makefile-generator v1.0.4 16 | # 17 | 18 | # ####################################### Help Section ##################################### 19 | .PHONY: help 20 | 21 | help:: 22 | $(ECHO) "Makefile Usage:" 23 | $(ECHO) " make all TARGET= DEVICE= HOST_ARCH=" 24 | $(ECHO) " Command to generate the design for specified Target and Shell." 25 | $(ECHO) "" 26 | $(ECHO) " make clean " 27 | $(ECHO) " Command to remove the generated non-hardware files." 28 | $(ECHO) "" 29 | $(ECHO) " make cleanall" 30 | $(ECHO) " Command to remove all the generated files." 31 | $(ECHO) "" 32 | $(ECHO) " make run TARGET= DEVICE= HOST_ARCH=" 33 | $(ECHO) " Command to run application in emulation or on board." 34 | $(ECHO) "" 35 | $(ECHO) " make build TARGET= DEVICE= HOST_ARCH=" 36 | $(ECHO) " Command to build xclbin application." 37 | $(ECHO) "" 38 | $(ECHO) " make host HOST_ARCH=" 39 | $(ECHO) " Command to build host application." 
40 | $(ECHO) "" 41 | 42 | # NOTE: This Makefile only supports building a single xclbin (always named fused.xclbin); switching between xclbins is not supported 43 | # ##################### Setting up default value of TARGET ########################## 44 | # TODO: set TARGET 45 | TARGET ?= sw_emu 46 | 47 | # ################### Setting up default value of DEVICE ############################## 48 | # TODO: set the board (allowed values: u280, u50) 49 | BOARD ?= u280 50 | 51 | # Platform names for the supported boards: 52 | # u50: xilinx_u50_gen3x16_xdma_201920_3 53 | # U280: xilinx_u280_gen3x16_xdma_1_202211_1 54 | # U280 (for the final on-board run, i.e. TARGET=hw): xilinx_u280_gen3x16_xdma_base_1 55 | DEVICE ?= xilinx_u280_gen3x16_xdma_1_202211_1 56 | 57 | # Set DEVICE according to the board 58 | ifeq ($(BOARD), u50) 59 | DEVICE := xilinx_u50_gen3x16_xdma_201920_3 60 | else ifeq ($(BOARD), u280) 61 | DEVICE := xilinx_u280_gen3x16_xdma_1_202211_1 62 | else 63 | $(error [ERROR]: Unsupported BOARD=$(BOARD). Please use 'u50' or 'u280'.) 64 | endif 65 | 66 | # Print the current board settings 67 | $(info [INFO]: Using BOARD=$(BOARD), DEVICE=$(DEVICE)) 68 | 69 | # ###################### Setting up default value of HOST_ARCH ####################### 70 | HOST_ARCH ?= x86 71 | 72 | CXXFLAGS := -I${PWD} 73 | 74 | # TODO: set the kernel names (the .cpp file names under the kernel folder; each top function must have the same name as its .cpp file, see "Setting Rules for Binary Containers Building Kernels") 75 | KERNEL_LIST := vmul vadd 76 | 77 | # TODO: set the kernel frequency in MHz (default: 300) 78 | FREQUENCY ?= 300 79 | 80 | # TODO: set the number of parallel jobs (default: 32) 81 | THREADS ?= 32 82 | 83 | # TODO: set the optimization level (default: 3) 84 | OPT_LEVEL ?= 3 85 | 86 | # TODO: set the report level 87 | REPORT_LEVEL ?= 2 88 | 89 | # #################### Checking if DEVICE in blacklist ############################# 90 | ifeq ($(findstring zc, $(DEVICE)), zc) 91 | $(error [ERROR]: This project is not supported for $(DEVICE).) 92 | endif 93 | 94 | # #################### Checking if DEVICE in whitelist ############################ 95 | # ifneq ($(findstring u280, $(DEVICE)), u280) 96 | # $(warning [WARNING]: This project has not been tested for $(DEVICE). It may or may not work.) 
97 | # endif 98 | 99 | ifneq ($(findstring $(BOARD), $(DEVICE)),$(BOARD)) 100 | $(warning [WARNING]: This project has not been tested for $(DEVICE). It may or may not work.) 101 | endif 102 | 103 | # ######################## Setting up Project Variables ################################# 104 | MK_PATH := $(abspath $(lastword $(MAKEFILE_LIST))) 105 | CUR_DIR := $(patsubst %/,%,$(dir $(MK_PATH))) 106 | 107 | # ######################### Include environment variables in utils.mk #################### 108 | include ./utils.mk 109 | XDEVICE := $(call device2xsa, $(DEVICE)) 110 | TEMP_DIR := _x_temp.$(TARGET).$(XDEVICE) 111 | TEMP_REPORT_DIR := $(CUR_DIR)/reports/_x.$(TARGET).$(XDEVICE) 112 | BUILD_DIR := build_dir.$(TARGET).$(XDEVICE) 113 | BUILD_REPORT_DIR := $(CUR_DIR)/reports/_build.$(TARGET).$(XDEVICE) 114 | EMCONFIG_DIR := $(BUILD_DIR) 115 | XCLBIN_DIR := $(CUR_DIR)/$(BUILD_DIR) 116 | export XCL_BINDIR = $(XCLBIN_DIR) 117 | 118 | # ######################### Setting up Host Variables ######################### 119 | # TODO: include the required headers and link the required libraries here 120 | #Include Required Host Source Files 121 | HOST_SRCS += $(CUR_DIR)/common/include/xcl2.cpp 122 | HOST_SRCS += $(CUR_DIR)/common/include/event_timer.cpp 123 | HOST_SRCS += $(CUR_DIR)/host/host.cpp 124 | 125 | CXXFLAGS += -I$(CUR_DIR) 126 | CXXFLAGS += -I$(CUR_DIR)/host 127 | CXXFLAGS += -I$(CUR_DIR)/common/include 128 | 129 | ifeq ($(TARGET),sw_emu) 130 | CXXFLAGS += -D SW_EMU_TEST 131 | endif 132 | 133 | ifeq ($(TARGET),hw_emu) 134 | CXXFLAGS += -D HW_EMU_TEST 135 | endif 136 | 137 | # ######################### Host compiler global settings ############################ 138 | CXXFLAGS += -I$(XILINX_XRT)/include -I$(XILINX_HLS)/include -std=c++11 -O3 -Wall -Wno-unknown-pragmas -Wno-unused-label 139 | LDFLAGS += -L$(XILINX_XRT)/lib -lOpenCL -lpthread -lrt -Wno-unused-label -Wno-narrowing -DVERBOSE 140 | CXXFLAGS += -fmessage-length=0 -O3 141 | CXXFLAGS += -I$(CUR_DIR)/src 142 | 143 | # lIp_floating_point_v7_0_bitacc_cmodel 
only exists in the 2022.1 version of Vitis HLS 144 | # The 2023 version of Vitis HLS uses lIp_floating_point_v7_1_bitacc_cmodel instead 145 | ifeq ($(HOST_ARCH), x86) 146 | LDFLAGS += -L$(XILINX_HLS)/lnx64/tools/fpo_v7_1 -Wl,--as-needed -lgmp -lmpfr -lIp_floating_point_v7_1_bitacc_cmodel 147 | endif 148 | 149 | # ################### Setting package and image directory ####################### 150 | EXE_NAME := host.exe 151 | EXE_FILE := $(BUILD_DIR)/$(EXE_NAME) 152 | # TODO: set the host program arguments (path to the xclbin file) 153 | HOST_ARGS := --xclbin $(BUILD_DIR)/fused.xclbin 154 | #HOST_ARGS := --xclbin $(BUILD_DIR)/fused.xclbin --offset $(CUR_DIR)/data/data-csr-offset.mtx --index $(CUR_DIR)/data/data-csr-indicesweights.mtx 155 | 156 | # ##################### Kernel compiler global settings ########################## 157 | VPP_FLAGS += -t $(TARGET) --platform $(XPLATFORM) --save-temps --optimize $(OPT_LEVEL) 158 | VPP_FLAGS += --hls.jobs $(THREADS) 159 | VPP_LDFLAGS += --vivado.synth.jobs $(THREADS) --vivado.impl.jobs $(THREADS) 160 | 161 | # --------------------- 162 | # !!! The .cfg file is not used at the compile stage !!! 163 | # --------------------- 164 | # Set the cfg file 165 | # ifneq (,$(shell echo $(XPLATFORM) | awk '/$(BOARD)/')) 166 | # VPP_FLAGS += --config $(CUR_DIR)/conn_$(BOARD).cfg 167 | # endif 168 | 169 | VPP_FLAGS += -I$(CUR_DIR)/kernel 170 | 171 | # Define a per-kernel VPP_FLAGS variable for each kernel in KERNEL_LIST 172 | define SET_KERNEL_VPP_FLAGS 173 | $(1)_VPP_FLAGS += -D KERNEL_NAME=$(1) 174 | $(1)_VPP_FLAGS += --hls.clock $(FREQUENCY)000000:$(1) 175 | endef 176 | 177 | # Specify the frequency once, at link time only 178 | VPP_LDFLAGS += --kernel_frequency $(FREQUENCY) 179 | 180 | # Apply to each kernel 181 | $(foreach KERNEL, $(KERNEL_LIST), $(eval $(call SET_KERNEL_VPP_FLAGS,$(KERNEL)))) 182 | 183 | # Set the report level 184 | ifeq ($(REPORT_LEVEL), estimate) 185 | VPP_FLAGS += --report_level estimate 186 | else ifneq ($(filter $(REPORT_LEVEL), 0 1 2),) 187 | VPP_FLAGS += --report_level $(REPORT_LEVEL) 188 | else 189 | $(error [ERROR]: Unsupported REPORT_LEVEL=$(REPORT_LEVEL). 
Please use 0, 1, 2, or estimate.) 190 | endif 191 | 192 | # ############################ Declaring Binary Containers ########################## 193 | # Initialize BINARY_CONTAINERS and their OBJS 194 | # Each BINARY_CONTAINER produces one xclbin file. This Makefile only supports building a single fused.xclbin 195 | BINARY_CONTAINERS += $(BUILD_DIR)/fused.xclbin 196 | 197 | # Kernel object (.xo) files contained in this xclbin 198 | #TODO: to change how the kernels are connected on the FPGA, edit the .cfg file 199 | # BINARY_CONTAINER_$(KERNEL)_OBJS += $(TEMP_DIR)/$(KERNEL).xo 200 | BINARY_CONTAINER_fused_OBJS := $(foreach KERNEL, $(KERNEL_LIST), $(TEMP_DIR)/$(KERNEL).xo) 201 | 202 | # ######################### Setting Targets of Makefile ################################ 203 | DATA_DIR += $(CUR_DIR)/data 204 | 205 | .PHONY: all clean cleanall docs emconfig 206 | ifeq ($(HOST_ARCH), x86) 207 | all: check_version check_vpp check_platform check_xrt $(EXE_FILE) $(BINARY_CONTAINERS) emconfig 208 | else 209 | all: check_version check_vpp check_platform check_sysroot $(EXE_FILE) $(BINARY_CONTAINERS) emconfig sd_card 210 | endif 211 | 212 | .PHONY: host 213 | ifeq ($(HOST_ARCH), x86) 214 | host: check_xrt $(EXE_FILE) 215 | else 216 | host: check_sysroot $(EXE_FILE) 217 | endif 218 | 219 | .PHONY: xclbin 220 | ifeq ($(HOST_ARCH), x86) 221 | xclbin: check_vpp check_xrt $(BINARY_CONTAINERS) 222 | else 223 | xclbin: check_vpp check_sysroot $(BINARY_CONTAINERS) 224 | endif 225 | 226 | .PHONY: build 227 | build: xclbin 228 | 229 | # ################ Setting Rules for Binary Containers (Building Kernels) ################ 230 | # Compile stage: -c only, without a config file 231 | $(TEMP_DIR)/%.xo: $(CUR_DIR)/kernel/%.cpp 232 | $(ECHO) "Compiling Kernel: $*" 233 | mkdir -p $(TEMP_DIR) 234 | $(VPP) -c $($*_VPP_FLAGS) $(VPP_FLAGS) -k $* \ 235 | -I'$( 34 | #include 35 | 36 | EventTimer::EventTimer() 37 | { 38 | unfinished = false; 39 | event_count = 0; 40 | max_string_length = 0; 41 | } 42 | 43 | float EventTimer::ms_difference(EventTimer::timepoint start, 44 | EventTimer::timepoint end) 45 | { 46 | 
std::chrono::duration<float, std::milli> duration = end - start; 47 | return duration.count(); 48 | } 49 | 50 | int EventTimer::add(std::string description) 51 | { 52 | // If the previous event is still pending, adding a new event 53 | // finishes it first 54 | if (unfinished) 55 | finish(); 56 | 57 | unfinished = true; 58 | 59 | event_names.push_back(description); 60 | int length = description.length(); 61 | if (length > max_string_length) 62 | max_string_length = length; 63 | start_times.push_back(std::chrono::high_resolution_clock::now()); 64 | return event_count++; 65 | } 66 | 67 | void EventTimer::finish(void) 68 | { 69 | end_times.push_back(std::chrono::high_resolution_clock::now()); 70 | if (!unfinished) { 71 | end_times.pop_back(); 72 | return; 73 | } 74 | unfinished = false; 75 | } 76 | 77 | void EventTimer::clear(void) 78 | { 79 | start_times.clear(); 80 | end_times.clear(); 81 | event_names.clear(); 82 | event_count = 0; 83 | unfinished = false; 84 | } 85 | 86 | void EventTimer::print(int id) 87 | { 88 | std::ios_base::fmtflags flags(std::cout.flags()); 89 | if (id >= 0) { 90 | if ((unsigned)id >= end_times.size()) // no finished event with this id 91 | return; 92 | std::cout << event_names[id] << " : " << std::fixed << std::setprecision(3) 93 | << ms_difference(start_times[id], end_times[id]) << std::endl; 94 | } 95 | else { 96 | int printable_events = unfinished ? 
event_count - 1 : event_count; 97 | for (int i = 0; i < printable_events; i++) { 98 | std::cout << std::left << std::setw(max_string_length) << event_names[i] << " : "; 99 | std::cout << std::right << std::setw(8) << std::fixed << std::setprecision(3) 100 | << ms_difference(start_times[i], end_times[i]) << " ms" 101 | << std::endl; 102 | } 103 | } 104 | std::cout.flags(flags); 105 | } 106 | -------------------------------------------------------------------------------- /examples/vadd_vmul/common/include/event_timer.hpp: -------------------------------------------------------------------------------- 1 | /********** 2 | Copyright (c) 2019, Xilinx, Inc. 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without modification, 6 | are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, 9 | this list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its contributors 16 | may be used to endorse or promote products derived from this software 17 | without specific prior written permission. 18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 20 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 21 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
22 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 23 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 25 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 27 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | **********/ 29 | 30 | 31 | #ifndef EVENT_TIMER_HPP__ 32 | #define EVENT_TIMER_HPP__ 33 | 34 | #include <chrono> 35 | #include <string> 36 | #include <vector> 37 | 38 | class EventTimer 39 | { 40 | typedef std::chrono::high_resolution_clock::time_point timepoint; 41 | 42 | private: 43 | std::vector<timepoint> start_times; 44 | std::vector<timepoint> end_times; 45 | std::vector<std::string> event_names; 46 | 47 | bool unfinished; 48 | unsigned int event_count; 49 | int max_string_length; 50 | 51 | float ms_difference(EventTimer::timepoint start, EventTimer::timepoint end); 52 | 53 | public: 54 | EventTimer(void); 55 | int add(std::string description); 56 | void finish(void); 57 | void clear(void); 58 | 59 | void print(int id = -1); 60 | }; 61 | 62 | #endif // EVENT_TIMER_HPP__ 63 | -------------------------------------------------------------------------------- /examples/vadd_vmul/common/include/utils.hpp: -------------------------------------------------------------------------------- 1 | /* 2 | * Copyright 2019 Xilinx, Inc. 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"); 5 | * you may not use this file except in compliance with the License. 
6 | * You may obtain a copy of the License at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, 12 | * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | * See the License for the specific language governing permissions and 14 | * limitations under the License. 15 | */ 16 | 17 | #ifndef XF_GRAPH_UTILS_HPP 18 | #define XF_GRAPH_UTILS_HPP 19 | 20 | #include 21 | #include 22 | #include 23 | #include 24 | #include "ap_int.h" 25 | 26 | class ArgParser { 27 | public: 28 | ArgParser(int& argc, const char** argv) { 29 | for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i])); 30 | } 31 | bool getCmdOption(const std::string option, std::string& value) const { 32 | std::vector<std::string>::const_iterator itr; 33 | itr = std::find(this->mTokens.begin(), this->mTokens.end(), option); 34 | if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) { 35 | value = *itr; 36 | return true; 37 | } 38 | return false; 39 | } 40 | 41 | private: 42 | std::vector<std::string> mTokens; 43 | }; 44 | 45 | // Compute time difference in microseconds 46 | inline unsigned long tvdiff(struct timeval* tv0, struct timeval* tv1) { 47 | return ((unsigned long)tv1->tv_sec - (unsigned long)tv0->tv_sec) * 1000000UL + 48 | ((unsigned long)tv1->tv_usec - (unsigned long)tv0->tv_usec); 49 | } 50 | 51 | template <typename T> 52 | T* aligned_alloc(std::size_t num) { 53 | void* ptr = NULL; 54 | if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc(); 55 | return reinterpret_cast<T*>(ptr); 56 | } 57 | 58 | #endif // XF_GRAPH_UTILS_HPP 59 | -------------------------------------------------------------------------------- /examples/vadd_vmul/common/include/xcl2.cpp: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (C) 2019-2021 Xilinx, Inc 3 | * 4 | * Licensed under the Apache License, 
Version 2.0 (the "License"). You may 5 | * not use this file except in compliance with the License. A copy of the 6 | * License is located at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 12 | * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 13 | * License for the specific language governing permissions and limitations 14 | * under the License. 15 | */ 16 | 17 | #include "xcl2.hpp" 18 | #include 19 | #include 20 | #include 21 | #include 22 | #include 23 | #if defined(_WINDOWS) 24 | #include 25 | #else 26 | #include 27 | #endif 28 | 29 | namespace xcl { 30 | std::vector get_devices(const std::string& vendor_name) { 31 | size_t i; 32 | cl_int err; 33 | std::vector platforms; 34 | OCL_CHECK(err, err = cl::Platform::get(&platforms)); 35 | cl::Platform platform; 36 | for (i = 0; i < platforms.size(); i++) { 37 | platform = platforms[i]; 38 | OCL_CHECK(err, std::string platformName = platform.getInfo(&err)); 39 | if (!(platformName.compare(vendor_name))) { 40 | std::cout << "Found Platform" << std::endl; 41 | std::cout << "Platform Name: " << platformName.c_str() << std::endl; 42 | break; 43 | } 44 | } 45 | if (i == platforms.size()) { 46 | std::cout << "Error: Failed to find Xilinx platform" << std::endl; 47 | std::cout << "Found the following platforms : " << std::endl; 48 | for (size_t j = 0; j < platforms.size(); j++) { 49 | platform = platforms[j]; 50 | OCL_CHECK(err, std::string platformName = platform.getInfo(&err)); 51 | std::cout << "Platform Name: " << platformName.c_str() << std::endl; 52 | } 53 | exit(EXIT_FAILURE); 54 | } 55 | // Getting ACCELERATOR Devices and selecting 1st such device 56 | std::vector devices; 57 | OCL_CHECK(err, err = platform.getDevices(CL_DEVICE_TYPE_ACCELERATOR, &devices)); 58 | return devices; 59 | } 60 | 61 | std::vector get_xil_devices() { 
62 | return get_devices("Xilinx"); 63 | } 64 | 65 | cl::Device find_device_bdf(const std::vector& devices, const std::string& bdf) { 66 | char device_bdf[20]; 67 | cl_int err; 68 | cl::Device device; 69 | int cnt = 0; 70 | for (uint32_t i = 0; i < devices.size(); i++) { 71 | OCL_CHECK(err, err = devices[i].getInfo(CL_DEVICE_PCIE_BDF, &device_bdf)); 72 | if (bdf == device_bdf) { 73 | device = devices[i]; 74 | cnt++; 75 | break; 76 | } 77 | } 78 | if (cnt == 0) { 79 | std::cout << "Invalid device bdf. Please check and provide valid bdf\n"; 80 | exit(EXIT_FAILURE); 81 | } 82 | return device; 83 | } 84 | cl_device_id find_device_bdf_c(cl_device_id* devices, const std::string& bdf, cl_uint device_count) { 85 | char device_bdf[20]; 86 | cl_int err; 87 | cl_device_id device; 88 | int cnt = 0; 89 | for (uint32_t i = 0; i < device_count; i++) { 90 | err = clGetDeviceInfo(devices[i], CL_DEVICE_PCIE_BDF, sizeof(device_bdf), device_bdf, 0); 91 | if (err != CL_SUCCESS) { 92 | std::cout << "Unable to extract the device BDF details\n"; 93 | exit(EXIT_FAILURE); 94 | } 95 | if (bdf == device_bdf) { 96 | device = devices[i]; 97 | cnt++; 98 | break; 99 | } 100 | } 101 | if (cnt == 0) { 102 | std::cout << "Invalid device bdf. 
Please check and provide valid bdf\n"; 103 | exit(EXIT_FAILURE); 104 | } 105 | return device; 106 | } 107 | std::vector read_binary_file(const std::string& xclbin_file_name) { 108 | std::cout << "INFO: Reading " << xclbin_file_name << std::endl; 109 | FILE* fp; 110 | if ((fp = fopen(xclbin_file_name.c_str(), "r")) == nullptr) { 111 | printf("ERROR: %s xclbin not available please build\n", xclbin_file_name.c_str()); 112 | exit(EXIT_FAILURE); 113 | } 114 | // Loading XCL Bin into char buffer 115 | std::cout << "Loading: '" << xclbin_file_name.c_str() << "'\n"; 116 | std::ifstream bin_file(xclbin_file_name.c_str(), std::ifstream::binary); 117 | bin_file.seekg(0, bin_file.end); 118 | auto nb = bin_file.tellg(); 119 | bin_file.seekg(0, bin_file.beg); 120 | std::vector buf; 121 | buf.resize(nb); 122 | bin_file.read(reinterpret_cast(buf.data()), nb); 123 | return buf; 124 | } 125 | 126 | bool is_emulation() { 127 | bool ret = false; 128 | char* xcl_mode = getenv("XCL_EMULATION_MODE"); 129 | if (xcl_mode != nullptr) { 130 | ret = true; 131 | } 132 | return ret; 133 | } 134 | 135 | bool is_hw_emulation() { 136 | bool ret = false; 137 | char* xcl_mode = getenv("XCL_EMULATION_MODE"); 138 | if ((xcl_mode != nullptr) && !strcmp(xcl_mode, "hw_emu")) { 139 | ret = true; 140 | } 141 | return ret; 142 | } 143 | double round_off(double n) { 144 | double d = n * 100.0; 145 | int i = d + 0.5; 146 | d = i / 100.0; 147 | return d; 148 | } 149 | 150 | std::string convert_size(size_t size) { 151 | static const char* SIZES[] = {"B", "KB", "MB", "GB"}; 152 | uint32_t div = 0; 153 | size_t rem = 0; 154 | 155 | while (size >= 1024 && div < (sizeof SIZES / sizeof *SIZES)) { 156 | rem = (size % 1024); 157 | div++; 158 | size /= 1024; 159 | } 160 | 161 | double size_d = (float)size + (float)rem / 1024.0; 162 | double size_val = round_off(size_d); 163 | 164 | std::stringstream stream; 165 | stream << std::fixed << std::setprecision(2) << size_val; 166 | std::string size_str = stream.str(); 167 | 
std::string result = size_str + " " + SIZES[div]; 168 | return result; 169 | } 170 | 171 | bool is_xpr_device(const char* device_name) { 172 | const char* output = strstr(device_name, "xpr"); 173 | 174 | if (output == nullptr) { 175 | return false; 176 | } else { 177 | return true; 178 | } 179 | } 180 | }; // namespace xcl 181 | -------------------------------------------------------------------------------- /examples/vadd_vmul/common/include/xcl2.hpp: -------------------------------------------------------------------------------- 1 | /** 2 | * Copyright (C) 2019-2021 Xilinx, Inc 3 | * 4 | * Licensed under the Apache License, Version 2.0 (the "License"). You may 5 | * not use this file except in compliance with the License. A copy of the 6 | * License is located at 7 | * 8 | * http://www.apache.org/licenses/LICENSE-2.0 9 | * 10 | * Unless required by applicable law or agreed to in writing, software 11 | * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 12 | * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 13 | * License for the specific language governing permissions and limitations 14 | * under the License. 15 | */ 16 | 17 | #pragma once 18 | 19 | #define CL_HPP_CL_1_2_DEFAULT_BUILD 20 | #define CL_HPP_TARGET_OPENCL_VERSION 120 21 | #define CL_HPP_MINIMUM_OPENCL_VERSION 120 22 | #define CL_HPP_ENABLE_PROGRAM_CONSTRUCTION_FROM_ARRAY_COMPATIBILITY 1 23 | #define CL_USE_DEPRECATED_OPENCL_1_2_APIS 24 | 25 | // OCL_CHECK doesn't work if call has templatized function call 26 | #define OCL_CHECK(error, call) \ 27 | call; \ 28 | if (error != CL_SUCCESS) { \ 29 | printf("%s:%d Error calling " #call ", error code is: %d\n", __FILE__, __LINE__, error); \ 30 | exit(EXIT_FAILURE); \ 31 | } 32 | 33 | #include 34 | #include 35 | #include 36 | #include 37 | // When creating a buffer with user pointer (CL_MEM_USE_HOST_PTR), under the 38 | // hood 39 | // User ptr is used if and only if it is properly aligned (page aligned). 
When 40 | // not 41 | // aligned, runtime has no choice but to create its own host side buffer that 42 | // backs 43 | // user ptr. This in turn implies that all operations that move data to and from 44 | // device incur an extra memcpy to move data to/from runtime's own host buffer 45 | // from/to user pointer. So it is recommended to use this allocator if user wish 46 | // to 47 | // Create Buffer/Memory Object with CL_MEM_USE_HOST_PTR to align user buffer to 48 | // the 49 | // page boundary. It will ensure that user buffer will be used when user create 50 | // Buffer/Mem Object with CL_MEM_USE_HOST_PTR. 51 | template <typename T> 52 | struct aligned_allocator { 53 | using value_type = T; 54 | 55 | aligned_allocator() {} 56 | 57 | aligned_allocator(const aligned_allocator&) {} 58 | 59 | template <typename U> 60 | aligned_allocator(const aligned_allocator<U>&) {} 61 | 62 | T* allocate(std::size_t num) { 63 | void* ptr = nullptr; 64 | 65 | #if defined(_WINDOWS) 66 | { 67 | ptr = _aligned_malloc(num * sizeof(T), 4096); 68 | if (ptr == nullptr) { 69 | std::cout << "Failed to allocate memory" << std::endl; 70 | exit(EXIT_FAILURE); 71 | } 72 | } 73 | #else 74 | { 75 | if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc(); 76 | } 77 | #endif 78 | return reinterpret_cast<T*>(ptr); 79 | } 80 | void deallocate(T* p, std::size_t num) { 81 | #if defined(_WINDOWS) 82 | _aligned_free(p); 83 | #else 84 | free(p); 85 | #endif 86 | } 87 | }; 88 | 89 | namespace xcl { 90 | std::vector<cl::Device> get_xil_devices(); 91 | std::vector<cl::Device> get_devices(const std::string& vendor_name); 92 | cl::Device find_device_bdf(const std::vector<cl::Device>& devices, const std::string& bdf); 93 | cl_device_id find_device_bdf_c(cl_device_id* devices, const std::string& bdf, cl_uint dev_count); 94 | std::string convert_size(size_t size); 95 | std::vector<unsigned char> read_binary_file(const std::string& xclbin_file_name); 96 | bool is_emulation(); 97 | bool is_hw_emulation(); 98 | bool is_xpr_device(const char* device_name); 99 | class P2P { 100 | 
public: 101 | static decltype(&xclGetMemObjectFd) getMemObjectFd; 102 | static decltype(&xclGetMemObjectFromFd) getMemObjectFromFd; 103 | static void init(const cl_platform_id& platform) { 104 | void* bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetMemObjectFd"); 105 | getMemObjectFd = (decltype(&xclGetMemObjectFd))bar; 106 | bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetMemObjectFromFd"); 107 | getMemObjectFromFd = (decltype(&xclGetMemObjectFromFd))bar; 108 | } 109 | }; 110 | class Ext { 111 | public: 112 | static decltype(&xclGetComputeUnitInfo) getComputeUnitInfo; 113 | static void init(const cl_platform_id& platform) { 114 | void* bar = clGetExtensionFunctionAddressForPlatform(platform, "xclGetComputeUnitInfo"); 115 | getComputeUnitInfo = (decltype(&xclGetComputeUnitInfo))bar; 116 | } 117 | }; 118 | } 119 | -------------------------------------------------------------------------------- /examples/vadd_vmul/conn_u280.cfg: -------------------------------------------------------------------------------- 1 | [connectivity] 2 | sp=vadd.in1:HBM[0] 3 | sp=vadd.in2:HBM[0] 4 | sp=vadd.res:HBM[0] 5 | sp=vmul.in1:HBM[0] 6 | sp=vmul.in2:HBM[0] 7 | sp=vmul.res:HBM[0] 8 | 9 | nk=vadd:1:vadd 10 | nk=vmul:1:vmul 11 | 12 | -------------------------------------------------------------------------------- /examples/vadd_vmul/global.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include "xcl2.hpp" 4 | 5 | // HBM channels 6 | #define MAX_HBM_CHANNEL_COUNT 32 7 | #define CHANNEL_NAME(n) n | XCL_MEM_TOPOLOGY 8 | const int HBM[MAX_HBM_CHANNEL_COUNT] = { 9 | CHANNEL_NAME(0), CHANNEL_NAME(1), CHANNEL_NAME(2), CHANNEL_NAME(3), CHANNEL_NAME(4), CHANNEL_NAME(5), 10 | CHANNEL_NAME(6), CHANNEL_NAME(7), CHANNEL_NAME(8), CHANNEL_NAME(9), CHANNEL_NAME(10), CHANNEL_NAME(11), 11 | CHANNEL_NAME(12), CHANNEL_NAME(13), CHANNEL_NAME(14), CHANNEL_NAME(15), CHANNEL_NAME(16), CHANNEL_NAME(17), 12 | 
CHANNEL_NAME(18), CHANNEL_NAME(19), CHANNEL_NAME(20), CHANNEL_NAME(21), CHANNEL_NAME(22), CHANNEL_NAME(23), 13 | CHANNEL_NAME(24), CHANNEL_NAME(25), CHANNEL_NAME(26), CHANNEL_NAME(27), CHANNEL_NAME(28), CHANNEL_NAME(29), 14 | CHANNEL_NAME(30), CHANNEL_NAME(31)}; 15 | 16 | const int DDR[2] = {CHANNEL_NAME(32), CHANNEL_NAME(33)}; 17 | 18 | // Use the more modern std container. If you would rather not use a vector, you can create an aligned array with the helper provided in utils.hpp, written as: DTYPE* In_R = 19 | // aligned_alloc(SIZE * sizeof(DTYPE)) 20 | // TODO: change the allocated element type 21 | using vec_t = std::vector<int, aligned_allocator<int> >; // vadd and vmul use the int type 22 | 23 | #define N 1000 24 | 25 | typedef int dt; -------------------------------------------------------------------------------- /examples/vadd_vmul/host/host.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include "global.h" 3 | #include "utils.hpp" 4 | #include "vadd_vmul_wrapper.h" 5 | using namespace std; 6 | 7 | int main(int argc, const char* argv[]) { 8 | std::cout << "\n-------------------START----------------\n"; 9 | ArgParser parser(argc, argv); 10 | std::string tmpStr; 11 | if (!parser.getCmdOption("--xclbin", tmpStr)) { 12 | std::cout << "ERROR: xclbin is not set!\n"; 13 | return 1; 14 | } 15 | vec_t in1(N); 16 | vec_t in2(N); 17 | vec_t res1(N); 18 | vec_t res2(N); 19 | vec_t golden_res1(N); 20 | vec_t golden_res2(N); 21 | 22 | std::cout << "data prepared" << std::endl; 23 | for (int i = 0; i < N; i++) { 24 | in1[i] = i; 25 | in2[i] = i; 26 | golden_res1[i] = in1[i] + in2[i]; 27 | golden_res2[i] = in1[i] * in2[i]; 28 | } 29 | 30 | vadd_vmul_wrapper(in1, in2, res1, res2, N, tmpStr); 31 | 32 | unsigned err = 0; 33 | for (int i = 0; i < N; i++) { 34 | if (res1[i] != golden_res1[i] || res2[i] != golden_res2[i]) { 35 | err++; 36 | std::cout << "result1: " << res1[i] << ":" << golden_res1[i]; 37 | std::cout << " result2: " << res2[i] << ":" << golden_res2[i] << std::endl; 38 | } 39 | } 40 | 41 | if (err == 0) { 42 | std::cout << "TEST PASS" << std::endl; 43 | } else { 44 | 
std::cout << "TEST FAIL" << std::endl; 45 | } 46 | return err; 47 | } -------------------------------------------------------------------------------- /examples/vadd_vmul/host/vadd_vmul_wrapper.h: -------------------------------------------------------------------------------- 1 | #include "event_timer.hpp" 2 | #include "global.h" 3 | #include "kernel/vmul.h" 4 | 5 | inline void vadd_vmul_wrapper(vec_t in1, vec_t in2, vec_t& res1, vec_t& res2, int LEN, std::string xclbin) { 6 | EventTimer et; 7 | cl_int err; 8 | cl::Context context; 9 | cl::CommandQueue q; 10 | 11 | // TODO: add kernels 12 | cl::Kernel krnl_vadd; 13 | cl::Kernel krnl_vmul; 14 | 15 | // OPENCL HOST CODE AREA START 16 | et.add("OpenCL Initialization"); 17 | auto devices = xcl::get_xil_devices(); 18 | auto fileBuf = xcl::read_binary_file(xclbin); 19 | cl::Program::Binaries bins{{fileBuf.data(), fileBuf.size()}}; 20 | bool valid_device = false; 21 | for (unsigned int i = 0; i < devices.size(); i++) { 22 | auto device = devices[i]; 23 | // Creating Context and Command Queue for selected Device 24 | OCL_CHECK(err, context = cl::Context(device, nullptr, nullptr, nullptr, &err)); 25 | OCL_CHECK(err, q = cl::CommandQueue(context, device, 26 | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE | CL_QUEUE_PROFILING_ENABLE, &err)); 27 | 28 | std::cout << "Trying to program device[" << i << "]: " << device.getInfo<CL_DEVICE_NAME>() << std::endl; 29 | cl::Program program(context, {device}, bins, nullptr, &err); 30 | if (err != CL_SUCCESS) { 31 | std::cout << "Failed to program device[" << i << "] with xclbin file!\n"; 32 | } else { 33 | std::cout << "Device[" << i << "]: program successful!\n"; 34 | 35 | // Creating Kernel object using Compute unit names 36 | // TODO: register kernels 37 | OCL_CHECK(err, krnl_vadd = cl::Kernel(program, "vadd", &err)); 38 | OCL_CHECK(err, krnl_vmul = cl::Kernel(program, "vmul", &err)); 39 | valid_device = true; 40 | break; // we break because we found a valid device 41 | } 42 | } 43 | if (!valid_device) { 44 | 
std::cout << "Failed to program any device found, exit!\n"; 45 | exit(EXIT_FAILURE); 46 | } 47 | et.finish(); 48 | 49 | /* Host mem flags */ 50 | cl_mem_ext_ptr_t in1_ext; 51 | cl_mem_ext_ptr_t in2_ext; 52 | cl_mem_ext_ptr_t res1_ext; 53 | cl_mem_ext_ptr_t res2_ext; 54 | 55 | in1_ext.obj = in1.data(); 56 | in1_ext.param = 0; 57 | in1_ext.flags = HBM[0]; 58 | 59 | in2_ext.obj = in2.data(); 60 | in2_ext.param = 0; 61 | in2_ext.flags = HBM[0]; 62 | 63 | res1_ext.obj = res1.data(); 64 | res1_ext.param = 0; 65 | res1_ext.flags = HBM[0]; 66 | 67 | res2_ext.obj = res2.data(); 68 | res2_ext.param = 0; 69 | res2_ext.flags = HBM[0]; 70 | 71 | cl::Buffer in1_buf; 72 | cl::Buffer in2_buf; 73 | cl::Buffer res1_buf; 74 | cl::Buffer res2_buf; 75 | 76 | et.add("Map host buffers to OpenCL buffers"); 77 | OCL_CHECK(err, in1_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 78 | sizeof(dt) * in1.size(), &in1_ext, &err)); 79 | OCL_CHECK(err, in2_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 80 | sizeof(dt) * in2.size(), &in2_ext, &err)); 81 | OCL_CHECK(err, res1_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 82 | sizeof(dt) * res1.size(), &res1_ext, &err)); 83 | OCL_CHECK(err, res2_buf = cl::Buffer(context, CL_MEM_EXT_PTR_XILINX | CL_MEM_USE_HOST_PTR | CL_MEM_READ_WRITE, 84 | sizeof(dt) * res2.size(), &res2_ext, &err)); 85 | et.finish(); 86 | 87 | et.add("Set kernel arguments"); 88 | OCL_CHECK(err, err = krnl_vadd.setArg(0, in1_buf)); 89 | OCL_CHECK(err, err = krnl_vadd.setArg(1, in2_buf)); 90 | OCL_CHECK(err, err = krnl_vadd.setArg(2, res1_buf)); 91 | OCL_CHECK(err, err = krnl_vmul.setArg(0, in1_buf)); 92 | OCL_CHECK(err, err = krnl_vmul.setArg(1, in2_buf)); 93 | OCL_CHECK(err, err = krnl_vmul.setArg(2, res2_buf)); 94 | 95 | et.add("Memory object migration enqueue"); 96 | OCL_CHECK(err, err = q.enqueueMigrateMemObjects({in1_buf, in2_buf}, 0 /* 0 means from 
host*/)); 97 | q.finish(); 98 | 99 | int num_runs = 1; // number of runs 100 | et.add("OCL Enqueue task and wait for kernel to complete"); 101 | auto t1 = std::chrono::high_resolution_clock::now(); 102 | for (int i = 0; i < num_runs; i++) { 103 | OCL_CHECK(err, err = q.enqueueTask(krnl_vadd)); 104 | OCL_CHECK(err, err = q.enqueueTask(krnl_vmul)); 105 | q.finish(); 106 | } 107 | auto t2 = std::chrono::high_resolution_clock::now(); 108 | 109 | et.add("Read back computation results"); 110 | // Copy Result from Device Global Memory to Host Local Memory 111 | OCL_CHECK(err, err = q.enqueueMigrateMemObjects({res1_buf, res2_buf}, CL_MIGRATE_MEM_OBJECT_HOST)); 112 | q.finish(); 113 | et.finish(); 114 | 115 | // Print the time taken by each stage 116 | et.print(); 117 | 118 | // Print the compute throughput of the two kernels 119 | float average_time_in_sec = 120 | float(std::chrono::duration_cast<std::chrono::microseconds>(t2 - t1).count()) / 1000000 / num_runs; 121 | std::cout << "average_time: " << average_time_in_sec * 1000 << " ms" << std::endl; 122 | double throughput = res1.size() + res2.size(); // number of elements processed 123 | throughput /= average_time_in_sec; 124 | std::cout << "Compute THROUGHPUT = " << throughput << " /s" << std::endl; 125 | } -------------------------------------------------------------------------------- /examples/vadd_vmul/kernel/vadd.cpp: -------------------------------------------------------------------------------- 1 | #include "vadd.h" 2 | 3 | extern "C" void vadd(dt* in1, dt* in2, dt* res) { 4 | #pragma HLS INTERFACE m_axi depth = 1000 port = in1 offset = slave bundle = inBUS1 5 | #pragma HLS INTERFACE m_axi depth = 1000 port = in2 offset = slave bundle = inBUS2 6 | #pragma HLS INTERFACE m_axi depth = 1000 port = res offset = slave bundle = resBUS 7 | 8 | for (int i = 0; i < N; i++) { 9 | *(res + i) = *(in1 + i) + *(in2 + i); 10 | } 11 | } -------------------------------------------------------------------------------- /examples/vadd_vmul/kernel/vadd.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 
| 3 | // Guess: including the xcl2.hpp header (pulled in through global.h) in the kernel makes vitis_hls fail (thread-local storage is not supported for the 4 | // current target) 5 | // #include "global.h" 6 | 7 | #define N 1000 8 | 9 | typedef int dt; 10 | 11 | extern "C" void vadd(dt* in1, dt* in2, dt* res); -------------------------------------------------------------------------------- /examples/vadd_vmul/kernel/vmul.cpp: -------------------------------------------------------------------------------- 1 | #include "vmul.h" 2 | 3 | extern "C" void vmul(dt* in1, dt* in2, dt* res) { 4 | #pragma HLS INTERFACE m_axi depth = 1000 port = in1 offset = slave bundle = inBUS1 5 | #pragma HLS INTERFACE m_axi depth = 1000 port = in2 offset = slave bundle = inBUS2 6 | #pragma HLS INTERFACE m_axi depth = 1000 port = res offset = slave bundle = resBUS 7 | 8 | for (int i = 0; i < N; i++) { 9 | *(res + i) = in1[i] * in2[i]; 10 | } 11 | } -------------------------------------------------------------------------------- /examples/vadd_vmul/kernel/vmul.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | // #include "global.h" 4 | 5 | #define N 1000 6 | 7 | typedef int dt; 8 | 9 | extern "C" void vmul(dt* in1, dt* in2, dt* res); -------------------------------------------------------------------------------- /examples/vadd_vmul/make_hw.sh: -------------------------------------------------------------------------------- 1 | make build TARGET=hw 2 | make host 3 | make run TARGET=hw 4 | 5 | 6 | -------------------------------------------------------------------------------- /examples/vadd_vmul/make_hwe.sh: -------------------------------------------------------------------------------- 1 | make build TARGET=hw_emu 2 | make host 3 | make run TARGET=hw_emu 4 | 5 | 6 | -------------------------------------------------------------------------------- /examples/vadd_vmul/make_kernel.sh: -------------------------------------------------------------------------------- 1 | make cleanall 2 | 
make build TARGET=sw_emu 3 | make build TARGET=hw_emu 4 | make build TARGET=hw 5 | 6 | -------------------------------------------------------------------------------- /examples/vadd_vmul/make_run.sh: -------------------------------------------------------------------------------- 1 | 2 | make run TARGET=sw_emu 3 | make run TARGET=hw_emu 4 | make run TARGET=hw 5 | 6 | -------------------------------------------------------------------------------- /examples/vadd_vmul/make_sw.sh: -------------------------------------------------------------------------------- 1 | # make build TARGET=sw_emu DEVICE =XXXX CXXFLAGS=XXXX KERNEL=XXXX 2 | # make host DEVICE =XXXX CXXFLAGS=XXXX KERNEL=XXXX 3 | # make run TARGET=sw_emu 4 | 5 | make build TARGET=sw_emu 6 | make host 7 | make run TARGET=sw_emu 8 | 9 | 10 | -------------------------------------------------------------------------------- /examples/vadd_vmul/remake_host.sh: -------------------------------------------------------------------------------- 1 | make cleanh 2 | make host 3 | make run TARGET=sw_emu -------------------------------------------------------------------------------- /examples/vadd_vmul/utils.mk: -------------------------------------------------------------------------------- 1 | # 2 | # Copyright 2019-2021 Xilinx, Inc. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 
15 | # 16 | #+------------------------------------------------------------------------------- 17 | # The following parameters are assigned with default values. These parameters can 18 | # be overridden through the make command line 19 | #+------------------------------------------------------------------------------- 20 | 21 | REPORT := no 22 | PROFILE := no 23 | DEBUG := no 24 | 25 | #'estimate' for estimate report generation 26 | #'system' for system report generation 27 | ifneq ($(REPORT), no) 28 | VPP_LDFLAGS += --report estimate 29 | VPP_LDFLAGS += --report system 30 | endif 31 | 32 | #Generates profile summary report 33 | ifeq ($(PROFILE), yes) 34 | VPP_LDFLAGS += --profile_kernel data:all:all:all 35 | endif 36 | 37 | #Generates debug summary report 38 | ifeq ($(DEBUG), yes) 39 | VPP_LDFLAGS += --dk protocol:all:all:all 40 | endif 41 | 42 | #Checks for XILINX_XRT 43 | ifeq ($(HOST_ARCH), x86) 44 | ifndef XILINX_XRT 45 | XILINX_XRT = /opt/xilinx/xrt 46 | export XILINX_XRT 47 | endif 48 | else 49 | ifndef XILINX_VITIS 50 | XILINX_VITIS = /opt/xilinx/Vitis/$(TOOL_VERSION) 51 | export XILINX_VITIS 52 | endif 53 | endif 54 | 55 | #Checks for Device Family 56 | ifeq ($(HOST_ARCH), aarch32) 57 | DEV_FAM = 7Series 58 | else ifeq ($(HOST_ARCH), aarch64) 59 | DEV_FAM = Ultrascale 60 | endif 61 | 62 | B_NAME = $(shell dirname $(XPLATFORM)) 63 | 64 | #Checks for Correct architecture 65 | ifneq ($(HOST_ARCH), $(filter $(HOST_ARCH),aarch64 aarch32 x86)) 66 | $(error HOST_ARCH variable not set, please set correctly and rerun) 67 | endif 68 | 69 | #Checks for SYSROOT 70 | check_sysroot: 71 | ifneq ($(HOST_ARCH), x86) 72 | ifndef SYSROOT 73 | $(error SYSROOT ENV variable is not set, please set ENV variable correctly and rerun) 74 | endif 75 | endif 76 | 77 | check_version: 78 | ifneq (, $(shell which git)) 79 | ifneq (,$(wildcard $(XFLIB_DIR)/.git)) 80 | @cd $(XFLIB_DIR) && git log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold 
blue)<%an>%Creset' --abbrev-commit -n 1 && cd - 81 | endif 82 | endif 83 | 84 | #Checks for g++ 85 | CXX := g++ 86 | ifeq ($(HOST_ARCH), x86) 87 | ifneq ($(shell expr $(shell g++ -dumpversion) \>= 5), 1) 88 | ifndef XILINX_VIVADO 89 | $(error [ERROR]: g++ version too old. Please use 5.0 or above) 90 | else 91 | CXX := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/bin/g++ 92 | ifeq ($(LD_LIBRARY_PATH),) 93 | export LD_LIBRARY_PATH := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/lib64 94 | else 95 | export LD_LIBRARY_PATH := $(XILINX_VIVADO)/tps/lnx64/gcc-6.2.0/lib64:$(LD_LIBRARY_PATH) 96 | endif 97 | $(warning [WARNING]: g++ version too old. Using g++ provided by the tool: $(CXX)) 98 | endif 99 | endif 100 | else ifeq ($(HOST_ARCH), aarch64) 101 | CXX := $(XILINX_VITIS)/gnu/aarch64/lin/aarch64-linux/bin/aarch64-linux-gnu-g++ 102 | else ifeq ($(HOST_ARCH), aarch32) 103 | CXX := $(XILINX_VITIS)/gnu/aarch32/lin/gcc-arm-linux-gnueabi/bin/arm-linux-gnueabihf-g++ 104 | endif 105 | 106 | #Setting VPP 107 | VPP := v++ 108 | 109 | #Checks for aiecompiler 110 | 111 | .PHONY: check_vivado 112 | check_vivado: 113 | ifeq (,$(wildcard $(XILINX_VIVADO)/bin/vivado)) 114 | @echo "Cannot locate Vivado installation. Please set XILINX_VIVADO variable." && false 115 | endif 116 | 117 | .PHONY: check_vpp 118 | check_vpp: 119 | ifeq (,$(wildcard $(XILINX_VITIS)/bin/v++)) 120 | @echo "Cannot locate Vitis installation. Please set XILINX_VITIS variable." && false 121 | endif 122 | 123 | .PHONY: check_xrt 124 | check_xrt: 125 | ifeq ($(HOST_ARCH), x86) 126 | ifeq (,$(wildcard $(XILINX_XRT)/lib/libxilinxopencl.so)) 127 | @echo "Cannot locate XRT installation. Please set XILINX_XRT variable." 
&& false 128 | endif 129 | endif 130 | 131 | export PATH := $(XILINX_VITIS)/bin:$(XILINX_XRT)/bin:$(PATH) 132 | ifeq ($(HOST_ARCH), x86) 133 | ifeq (,$(LD_LIBRARY_PATH)) 134 | LD_LIBRARY_PATH := $(XILINX_XRT)/lib 135 | else 136 | LD_LIBRARY_PATH := $(XILINX_XRT)/lib:$(LD_LIBRARY_PATH) 137 | endif 138 | else # aarch64 139 | ifeq (,$(LD_LIBRARY_PATH)) 140 | LD_LIBRARY_PATH := $(SYSROOT)/usr/lib 141 | else 142 | LD_LIBRARY_PATH := $(SYSROOT)/usr/lib:$(LD_LIBRARY_PATH) 143 | endif 144 | endif 145 | 146 | # check target 147 | ifeq ($(filter $(TARGET),sw_emu hw_emu hw),) 148 | $(error TARGET is not sw_emu, hw_emu or hw) 149 | endif 150 | 151 | ifneq (,$(wildcard $(DEVICE))) 152 | # Use DEVICE as a file path 153 | XPLATFORM := $(DEVICE) 154 | else 155 | # Use DEVICE as a file name pattern 156 | # 1. search paths specified by variable 157 | ifneq (,$(PLATFORM_REPO_PATHS)) 158 | # 1.1 as exact name 159 | XPLATFORM := $(strip $(foreach p, $(subst :, ,$(PLATFORM_REPO_PATHS)), $(wildcard $(p)/$(DEVICE)/$(DEVICE).xpfm))) 160 | # 1.2 as a pattern 161 | ifeq (,$(XPLATFORM)) 162 | XPLATFORMS := $(foreach p, $(subst :, ,$(PLATFORM_REPO_PATHS)), $(wildcard $(p)/*/*.xpfm)) 163 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 164 | endif # 1.2 165 | endif # 1 166 | # 2. search Vitis installation 167 | ifeq (,$(XPLATFORM)) 168 | # 2.1 as exact name 169 | XPLATFORM := $(strip $(wildcard $(XILINX_VITIS)/platforms/$(DEVICE)/$(DEVICE).xpfm)) 170 | # 2.2 as a pattern 171 | ifeq (,$(XPLATFORM)) 172 | XPLATFORMS := $(wildcard $(XILINX_VITIS)/platforms/*/*.xpfm) 173 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 174 | endif # 2.2 175 | endif # 2 176 | # 3. 
search default locations 177 | ifeq (,$(XPLATFORM)) 178 | # 3.1 as exact name 179 | XPLATFORM := $(strip $(wildcard /opt/xilinx/platforms/$(DEVICE)/$(DEVICE).xpfm)) 180 | # 3.2 as a pattern 181 | ifeq (,$(XPLATFORM)) 182 | XPLATFORMS := $(wildcard /opt/xilinx/platforms/*/*.xpfm) 183 | XPLATFORM := $(strip $(foreach p, $(XPLATFORMS), $(shell echo $(p) | awk '$$1 ~ /$(DEVICE)/'))) 184 | endif # 3.2 185 | endif # 3 186 | endif 187 | 188 | define MSG_PLATFORM 189 | No platform matched pattern '$(DEVICE)'. 190 | Available platforms are: $(XPLATFORMS) 191 | To add more platform directories, set the PLATFORM_REPO_PATHS variable or point DEVICE variable to the full path of platform .xpfm file. 192 | endef 193 | export MSG_PLATFORM 194 | 195 | define MSG_DEVICE 196 | More than one platform matched: $(XPLATFORM) 197 | Please set DEVICE variable more accurately to select only one platform file, or set DEVICE variable to the full path of the platform .xpfm file. 198 | endef 199 | export MSG_DEVICE 200 | 201 | .PHONY: check_platform 202 | check_platform: 203 | ifeq (,$(XPLATFORM)) 204 | @echo "$${MSG_PLATFORM}" && false 205 | endif 206 | ifneq (,$(word 2,$(XPLATFORM))) 207 | @echo "$${MSG_DEVICE}" && false 208 | endif 209 | #Check ends 210 | 211 | # device2xsa - create a filesystem friendly name from device name 212 | # $(1) - full name of device 213 | device2xsa = $(strip $(patsubst %.xpfm, % , $(shell basename $(DEVICE)))) 214 | 215 | # Cleaning stuff 216 | RM = rm -f 217 | RMDIR = rm -rf 218 | 219 | MV = mv -f 220 | CP = cp -rf 221 | ECHO:= @echo 222 | -------------------------------------------------------------------------------- /git_push.sh: -------------------------------------------------------------------------------- 1 | TXT=$1 2 | 3 | git add * 4 | git commit -m "$TXT" 5 | git push -u origin main 6 | -------------------------------------------------------------------------------- /host/README.md: 
-------------------------------------------------------------------------------- 1 | # Programming Model for Vitis project 2 | 3 | TODO 4 | 5 | The structure of a host application can be divided into the following steps: 6 | 7 | 1. Specify the accelerator device ID and load the .xclbin. 8 | 2. Set up the kernels and their arguments. 9 | 3. Transfer data between the host and the kernels. 10 | 4. Run the kernels and return the results. 11 | -------------------------------------------------------------------------------- /img/add_kernel_func.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/add_kernel_func.png -------------------------------------------------------------------------------- /img/add_kernel_func_u50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/add_kernel_func_u50.png -------------------------------------------------------------------------------- /img/add_u50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/add_u50.png -------------------------------------------------------------------------------- /img/add_zcu104.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/add_zcu104.jpg -------------------------------------------------------------------------------- /img/app_prj.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/app_prj.jpg -------------------------------------------------------------------------------- /img/build.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/build.jpg -------------------------------------------------------------------------------- /img/common_image.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/common_image.jpg -------------------------------------------------------------------------------- /img/creat_app_prj.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/creat_app_prj.jpg -------------------------------------------------------------------------------- /img/empty_app.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/empty_app.jpg -------------------------------------------------------------------------------- /img/host_code.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/host_code.jpg -------------------------------------------------------------------------------- /img/host_run_zcu104.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/host_run_zcu104.png -------------------------------------------------------------------------------- /img/image2sd.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/image2sd.png -------------------------------------------------------------------------------- /img/image_cfg.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/image_cfg.jpg -------------------------------------------------------------------------------- /img/kernel_code.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/kernel_code.jpg -------------------------------------------------------------------------------- /img/kernel_code_u50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/kernel_code_u50.png -------------------------------------------------------------------------------- /img/kernel_setting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/kernel_setting.png -------------------------------------------------------------------------------- /img/kernel_vitis_hls.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/kernel_vitis_hls.png -------------------------------------------------------------------------------- /img/link.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/link.png -------------------------------------------------------------------------------- /img/link_setting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/link_setting.png -------------------------------------------------------------------------------- /img/link_u50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/link_u50.png -------------------------------------------------------------------------------- /img/link_vivado.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/link_vivado.jpg -------------------------------------------------------------------------------- /img/mk_01-dig.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_01-dig.png -------------------------------------------------------------------------------- /img/mk_01-link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_01-link.png -------------------------------------------------------------------------------- /img/mk_01-reg.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_01-reg.png -------------------------------------------------------------------------------- /img/mk_01-res.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_01-res.png -------------------------------------------------------------------------------- /img/mk_02-dig.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_02-dig.png -------------------------------------------------------------------------------- /img/mk_02-link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_02-link.png -------------------------------------------------------------------------------- /img/mk_02-reg.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_02-reg.png -------------------------------------------------------------------------------- /img/mk_02-res.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_02-res.png -------------------------------------------------------------------------------- /img/mk_03-dig1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_03-dig1.png -------------------------------------------------------------------------------- /img/mk_03-dig2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_03-dig2.png -------------------------------------------------------------------------------- /img/mk_03-link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_03-link.png -------------------------------------------------------------------------------- /img/mk_03-res.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/mk_03-res.png -------------------------------------------------------------------------------- /img/putty.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/putty.png -------------------------------------------------------------------------------- /img/run_u50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/run_u50.png -------------------------------------------------------------------------------- /img/sysroot.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/sysroot.jpg -------------------------------------------------------------------------------- /img/u50_platform.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/u50_platform.png -------------------------------------------------------------------------------- /img/vitis_shell.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/vitis_shell.png -------------------------------------------------------------------------------- /img/vitis_workflow.image: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/vitis_workflow.image -------------------------------------------------------------------------------- /img/workspace.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/workspace.jpg -------------------------------------------------------------------------------- /img/zcu104.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Reconfigurable-Computing/Vitis_workflow/1c24f9b2d2ff630cd88b8f40df97cb8961e37b04/img/zcu104.jpg -------------------------------------------------------------------------------- /multi-kernels/README.md: -------------------------------------------------------------------------------- 1 | # Multi-kernels implementation 2 | 3 | Examples of multi-kernel accelerator deployment, covering three schemes: single-kernel multi-deployment, multi-kernel single-deployment, and multi-kernel multi-deployment. 4 | 5 | + 
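Single-kernel multi-deployment: multiple instances of one IP core are deployed on chip and scheduled from the host 6 | + Multi-kernel single-deployment: several IP cores of different kinds are deployed on chip and scheduled from the host 7 | + Multi-kernel multi-deployment: a single IP core is deployed on chip at a time, and the host switches between cores at run time 8 | 9 | ## 1 Kernel code 10 | 11 | The source tree provides two example kernels 12 | 13 | + [Element-wise vector addition, VADD IP](./src/kernel/vadd.cpp) 14 | 15 | + [Element-wise vector multiplication, VMUL IP](./src/kernel/vmul.cpp) 16 | 17 | ## 2 U50 deployment examples 18 | 19 | ### 2.1 Single-kernel multi-deployment 20 | 21 | Single-kernel multi-deployment produces one xclbin file; the IP cores are registered in the host code. 22 | 23 | Source files required by the example project: 24 | 25 | + host: 26 | - event_timer.cpp&hpp 27 | - host-01.cpp&hpp 28 | - vadd_host-01.hpp 29 | - xcl2.cpp&hpp 30 | 31 | + kernel: 32 | - vadd.cpp&hpp 33 | 34 | The main differences from an ordinary deployment are as follows 35 | 36 | + In the binary_container settings, change the number of kernel instances; the example uses 2. 37 | 38 | ![Figure: single-kernel multi-deployment link example](../img/mk_01-link.png) 39 | 40 | + To use multiple cores, the host must register every core it needs. The subsequent memory mapping and scheduling must be handled the same way for each core. 41 | 42 | ps: The example only shows two cores running in parallel and synchronized together; other requirements need more elaborate scheduling built on event flags. 43 | 44 | ![Figure: single-kernel multi-deployment registration example](../img/mk_01-reg.png) 45 | 46 | Results 47 | 48 | + On-chip layout 49 | 50 | ![Figure: single-kernel multi-deployment layout](../img/mk_01-dig.png) 51 | 52 | + Run command 53 | ``` 54 | ./test --xclbin ./binary_container_1.xclbin 55 | ``` 56 | ![Figure: single-kernel multi-deployment run result](../img/mk_01-res.png) 57 | 58 | ### 2.2 Multi-kernel single-deployment 59 | 60 | Multi-kernel single-deployment produces one xclbin file; the IP cores are registered in the host code. 61 | 62 | Source files required by the example project: 63 | 64 | + host: 65 | - event_timer.cpp&hpp 66 | - host-02.cpp&hpp 67 | - op_host-02.hpp 68 | - xcl2.cpp&hpp 69 | 70 | + kernel: 71 | - vadd.cpp&hpp 72 | - vmul.cpp&hpp 73 | 74 | The main differences from an ordinary deployment are as follows 75 | 76 | + In the binary_container settings, place the multiple kernels in the same container. 77 | 78 | ![Figure: multi-kernel single-deployment link example](../img/mk_02-link.png) 79 | 80 | + To use multiple cores, the host must register every core it needs. The subsequent memory mapping and scheduling must be handled the same way for each core. 81 | 82 | ps: The example only shows two cores running in parallel and synchronized together; other requirements need more elaborate scheduling built on event flags. 83 | 84 | ![Figure: multi-kernel single-deployment registration example](../img/mk_02-reg.png) 85 | 86 | Results 87 | 88 | + On-chip layout 89 | 90 | ![Figure: multi-kernel single-deployment layout](../img/mk_02-dig.png) 91 | 92 | + Run command 93 | ``` 94 | ./test --xclbin ./binary_container_1.xclbin 95 | ``` 96 | ![Figure: multi-kernel single-deployment run result](../img/mk_02-res.png) 97 | 98 | 99 | ### 2.3 Multi-kernel multi-deployment 100 | 101 | Multi-kernel multi-deployment produces multiple xclbin files; the host code selects which one to run at compute time. 102 | 103 | Source files required by the example project: 104 | 105 | + host: 106 | - event_timer.cpp&hpp 107 | - host-03.cpp&hpp 108 | - 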
vadd_host-03.hpp 109 | - vmul_host-03.hpp 110 | - xcl2.cpp&hpp 111 | 112 | + kernel: 113 | - vadd.cpp&hpp 114 | - vmul.cpp&hpp 115 | 116 | The main differences from an ordinary deployment are as follows 117 | 118 | + Multiple binary_container settings: place the different kernels in separate binary_containers within the same project to generate multiple xclbin files and their corresponding drivers. 119 | 120 | ![Figure: multi-kernel multi-deployment link example](../img/mk_03-link.png) 121 | 122 | + [Write host code for each IP](./src/host/host-03.cpp) 123 | 124 | Results 125 | 126 | + On-chip layout 127 | 128 | ![Figure: multi-kernel multi-deployment layout 1](../img/mk_03-dig1.png) 129 | ![Figure: multi-kernel multi-deployment layout 2](../img/mk_03-dig2.png) 130 | 131 | + The following command runs VADD 132 | ``` 133 | ./test --mode 0 134 | ``` 135 | + The following command runs VMUL 136 | ``` 137 | ./test --mode 1 138 | ``` 139 | 140 | ![Figure: multi-kernel multi-deployment run result](../img/mk_03-res.png) 141 | -------------------------------------------------------------------------------- /multi-kernels/src/host/event_timer.cpp: -------------------------------------------------------------------------------- 1 | /********** 2 | Copyright (c) 2019, Xilinx, Inc. 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without modification, 6 | are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, 9 | this list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its contributors 16 | may be used to endorse or promote products derived from this software 17 | without specific prior written permission. 18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 20 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 21 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
22 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 23 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 25 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 27 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | **********/ 29 | 30 | 31 | #include "event_timer.hpp" 32 | 33 | #include <iomanip> 34 | #include <iostream> 35 | 36 | EventTimer::EventTimer() 37 | { 38 | unfinished = false; 39 | event_count = 0; 40 | max_string_length = 0; 41 | } 42 | 43 | float EventTimer::ms_difference(EventTimer::timepoint start, 44 | EventTimer::timepoint end) 45 | { 46 | std::chrono::duration<float, std::milli> duration = end - start; 47 | return duration.count(); 48 | } 49 | 50 | int EventTimer::add(std::string description) 51 | { 52 | // If previously pending event was unfinished, adding a new event 53 | // will terminate it if this function is called 54 | if (unfinished) 55 | finish(); 56 | 57 | unfinished = true; 58 | 59 | event_names.push_back(description); 60 | int length = description.length(); 61 | if (length > max_string_length) 62 | max_string_length = length; 63 | start_times.push_back(std::chrono::high_resolution_clock::now()); 64 | return event_count++; 65 | } 66 | 67 | void EventTimer::finish(void) 68 | { 69 | end_times.push_back(std::chrono::high_resolution_clock::now()); 70 | if (!unfinished) { 71 | end_times.pop_back(); 72 | return; 73 | } 74 | unfinished = false; 75 | } 76 | 77 | void EventTimer::clear(void) 78 | { 79 | start_times.clear(); 80 | end_times.clear(); 81 | event_names.clear(); 82 | event_count = 0; 83 | unfinished = false; 84 | } 85 | 86 | void EventTimer::print(int id) 87 | { 88 | std::ios_base::fmtflags flags(std::cout.flags()); 89 | if (id >= 0) { 90 | if 
((unsigned)id >= event_names.size()) 91 | return; 92 | std::cout << event_names[id] << " : " << std::fixed << std::setprecision(3) 93 | << ms_difference(start_times[id], end_times[id]) << std::endl; 94 | } 95 | else { 96 | int printable_events = unfinished ? event_count - 1 : event_count; 97 | for (int i = 0; i < printable_events; i++) { 98 | std::cout << std::left << std::setw(max_string_length) << event_names[i] << " : "; 99 | std::cout << std::right << std::setw(8) << std::fixed << std::setprecision(3) 100 | << ms_difference(start_times[i], end_times[i]) << " ms" 101 | << std::endl; 102 | } 103 | } 104 | std::cout.flags(flags); 105 | } 106 | -------------------------------------------------------------------------------- /multi-kernels/src/host/event_timer.hpp: -------------------------------------------------------------------------------- 1 | /********** 2 | Copyright (c) 2019, Xilinx, Inc. 3 | All rights reserved. 4 | 5 | Redistribution and use in source and binary forms, with or without modification, 6 | are permitted provided that the following conditions are met: 7 | 8 | 1. Redistributions of source code must retain the above copyright notice, 9 | this list of conditions and the following disclaimer. 10 | 11 | 2. Redistributions in binary form must reproduce the above copyright notice, 12 | this list of conditions and the following disclaimer in the documentation 13 | and/or other materials provided with the distribution. 14 | 15 | 3. Neither the name of the copyright holder nor the names of its contributors 16 | may be used to endorse or promote products derived from this software 17 | without specific prior written permission. 18 | 19 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 20 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, 21 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
22 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 23 | INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 24 | PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 25 | HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 26 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, 27 | EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 28 | **********/ 29 | 30 | 31 | #ifndef EVENT_TIMER_HPP__ 32 | #define EVENT_TIMER_HPP__ 33 | 34 | #include <chrono> 35 | #include <string> 36 | #include <vector> 37 | 38 | class EventTimer 39 | { 40 | typedef std::chrono::high_resolution_clock::time_point timepoint; 41 | 42 | private: 43 | std::vector<timepoint> start_times; 44 | std::vector<timepoint> end_times; 45 | std::vector<std::string> event_names; 46 | 47 | bool unfinished; 48 | unsigned int event_count; 49 | int max_string_length; 50 | 51 | float ms_difference(EventTimer::timepoint start, EventTimer::timepoint end); 52 | 53 | public: 54 | EventTimer(void); 55 | int add(std::string description); 56 | void finish(void); 57 | void clear(void); 58 | 59 | void print(int id = -1); 60 | }; 61 | 62 | #endif // EVENT_TIMER_HPP__ 63 | -------------------------------------------------------------------------------- /multi-kernels/src/host/host-01.cpp: -------------------------------------------------------------------------------- 1 | #include "host-01.hpp" 2 | 3 | 4 | 5 | int main(int argc, const char* argv[]) { 6 | std::cout << "\n-------------------START----------------\n"; 7 | ArgParser parser(argc, argv); 8 | std::string tmpStr; 9 | if (!parser.getCmdOption("--xclbin", tmpStr)) { 10 | std::cout << "ERROR: xclbin is not set!\n"; 11 | return 1; 12 | } 13 | std::string xclbin_path = tmpStr; 14 | 15 | dt in1[N]; 16 | dt in2[N]; 17 | dt res1[N]; 18 | dt res2[N]; 19 | dt res_golden[N]; 20 | 21 | std::cout << "data prepared" << std::endl; 22 | 
for (int i = 0; i < N; i++) { 23 | in1[i] = i; 24 | in2[i] = i; 25 | 26 | res_golden[i] = i+i; 27 | // std::cout << "Here is add\n"; 28 | 29 | } 30 | 31 | struct timeval start_time, end_time; 32 | gettimeofday(&start_time, 0); 33 | 34 | vadd_op(in1,in2,res1,res2,N,xclbin_path); 35 | gettimeofday(&end_time, 0); 36 | // std::cout << "Kernel ed execution time is: " < 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | #define N 1000 9 | 10 | 11 | 12 | class ArgParser { 13 | public: 14 | ArgParser(int& argc, const char** argv) { 15 | for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i])); 16 | } 17 | bool getCmdOption(const std::string option, std::string& value) const { 18 | std::vector<std::string>::const_iterator itr; 19 | itr = std::find(this->mTokens.begin(), this->mTokens.end(), option); 20 | if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) { 21 | value = *itr; 22 | return true; 23 | } 24 | return false; 25 | } 26 | 27 | private: 28 | std::vector<std::string> mTokens; 29 | }; 30 | -------------------------------------------------------------------------------- /multi-kernels/src/host/host-02.cpp: -------------------------------------------------------------------------------- 1 | #include "host-02.hpp" 2 | 3 | 4 | 5 | 6 | int main(int argc, const char* argv[]) { 7 | std::cout << "\n-------------------START----------------\n"; 8 | ArgParser parser(argc, argv); 9 | std::string tmpStr; 10 | if (!parser.getCmdOption("--xclbin", tmpStr)) { 11 | std::cout << "ERROR: xclbin is not set!\n"; 12 | return 1; 13 | } 14 | std::string xclbin_path = tmpStr; 15 | 16 | dt in1[N]; 17 | dt in2[N]; 18 | dt res1[N]; 19 | dt res2[N]; 20 | dt res_golden1[N]; 21 | dt res_golden2[N]; 22 | 23 | std::cout << "data prepared" << std::endl; 24 | for (int i = 0; i < N; i++) { 25 | in1[i] = i; 26 | in2[i] = i; 27 | 28 | res_golden1[i] = i+i; 29 | res_golden2[i] = i*i; 30 | // std::cout << "Here is add\n"; 31 | 32 | } 33 | 34 | struct timeval start_time, end_time; 35 | 
gettimeofday(&start_time, 0); 36 | 37 | op(in1,in2,res1,res2,N,xclbin_path); 38 | gettimeofday(&end_time, 0); 39 | // std::cout << "Kernel ed execution time is: " < 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include "op_host-02.hpp" 7 | 8 | #define N 1000 9 | 10 | 11 | 12 | class ArgParser { 13 | public: 14 | ArgParser(int& argc, const char** argv) { 15 | for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i])); 16 | } 17 | bool getCmdOption(const std::string option, std::string& value) const { 18 | std::vector<std::string>::const_iterator itr; 19 | itr = std::find(this->mTokens.begin(), this->mTokens.end(), option); 20 | if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) { 21 | value = *itr; 22 | return true; 23 | } 24 | return false; 25 | } 26 | 27 | private: 28 | std::vector<std::string> mTokens; 29 | }; 30 | -------------------------------------------------------------------------------- /multi-kernels/src/host/host-03.cpp: -------------------------------------------------------------------------------- 1 | #include "host.hpp" 2 | 3 | 4 | 5 | int main(int argc, const char* argv[]) { 6 | std::cout << "\n-------------------START----------------\n"; 7 | ArgParser parser(argc, argv); 8 | std::string tmpStr; 9 | if (!parser.getCmdOption("--mode", tmpStr)) { 10 | std::cout << "ERROR: mode is not set!\n"; 11 | return 1; 12 | } 13 | 14 | int mode; 15 | if(tmpStr == "0"){ 16 | mode = 0; 17 | std::cout << "INFO: VADD is selected\n"; 18 | 19 | } 20 | else if(tmpStr == "1"){ 21 | mode = 1; 22 | std::cout << "INFO: VMUL is selected\n"; 23 | } 24 | else{ 25 | std::cout << "ERROR: mode error\n"; return 1; // mode would otherwise be used uninitialized 26 | } 27 | 28 | 29 | dt in1[N]; 30 | dt in2[N]; 31 | dt res[N]; 32 | dt res_golden[N]; 33 | 34 | std::cout << "data prepared" << std::endl; 35 | for (int i = 0; i < N; i++) { 36 | in1[i] = i; 37 | in2[i] = i; 38 | if(mode == 0){ 39 | res_golden[i] = i+i; 40 | // std::cout << "Here is add\n"; 41 | } 42 | else 43 | if(mode == 1){ 44 | res_golden[i] = i*i; 45 | // std::cout 
<< "Here is mul\n"; 46 | } 47 | 48 | } 49 | 50 | struct timeval start_time, end_time; 51 | gettimeofday(&start_time, 0); 52 | switch (mode) 53 | { 54 | case 0: 55 | vadd_op(in1,in2,res,N); 56 | break; 57 | case 1: 58 | vmul_op(in1,in2,res,N); 59 | break; 60 | } 61 | gettimeofday(&end_time, 0); 62 | // std::cout << "Kernel ed execution time is: " < 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | #define N 1000 10 | 11 | 12 | 13 | class ArgParser { 14 | public: 15 | ArgParser(int& argc, const char** argv) { 16 | for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i])); 17 | } 18 | bool getCmdOption(const std::string option, std::string& value) const { 19 | std::vector<std::string>::const_iterator itr; 20 | itr = std::find(this->mTokens.begin(), this->mTokens.end(), option); 21 | if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) { 22 | value = *itr; 23 | return true; 24 | } 25 | return false; 26 | } 27 | 28 | private: 29 | std::vector<std::string> mTokens; 30 | }; 31 | -------------------------------------------------------------------------------- /multi-kernels/src/host/op_host-02.hpp: -------------------------------------------------------------------------------- 1 | #include "event_timer.hpp" 2 | #include "xcl2.hpp" 3 | #include 4 | 5 | typedef int dt; 6 | 7 | void op(dt *in1, dt *in2, dt *res1, dt *res2, int LEN, std::string xclbin){ 8 | EventTimer et; 9 | 10 | cl_int fail; 11 | 12 | //***************************************** Step1 platform related operations 13 | std::vector<cl::Device> devices = xcl::get_xil_devices(); 14 | cl::Device device = devices[0]; //the device 0 15 | 16 | //***************************************** Step2 Creating Context and Command Queue for selected Device 17 | 18 | 19 | et.add("OpenCL Initialization"); 20 | 21 | cl::Context context(device, NULL, NULL, NULL, &fail); //initial OpenCL environment 22 | 23 | cl::CommandQueue q(context, device, CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &fail); //command queue 24 
| 25 | cl::Program::Binaries xclBins = xcl::import_binary_file(xclbin); //load the binary file 26 | devices.resize(1); 27 | cl::Program program(context, devices, xclBins, NULL, &fail); // program the FPGA 28 | 29 | cl::Kernel vadd_kernel,vmul_kernel; 30 | printf("load kernel start\n"); 31 | vadd_kernel = cl::Kernel(program, "vadd_kernel", &fail);// the name "vadd_kernel" must match the kernel function name in the .xo 32 | vmul_kernel = cl::Kernel(program, "vmul_kernel", &fail); 33 | printf("load kernel finish\n"); 34 | et.finish(); 35 | 36 | //***************************************** Step3 create device buffer and map dev buf to host buf 37 | 38 | cl::Buffer In1_buf, In2_buf, Res_buf; 39 | cl::Buffer In1_buf2, In2_buf2, Res_buf2; 40 | et.add("Map host buffers to OpenCL buffers"); 41 | In1_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 42 | sizeof(dt) * LEN, in1); 43 | In2_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 44 | sizeof(dt) * LEN, in2); 45 | 46 | Res_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY), 47 | sizeof(dt) * LEN, res1); 48 | 49 | In1_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 50 | sizeof(dt) * LEN, in1); 51 | In2_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 52 | sizeof(dt) * LEN, in2); 53 | 54 | Res_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY), 55 | sizeof(dt) * LEN, res2); 56 | 57 | et.finish(); 58 | 59 | int j = 0; 60 | et.add("Set kernel arguments"); 61 | vadd_kernel.setArg(j++, In1_buf); 62 | vadd_kernel.setArg(j++, In2_buf); 63 | vadd_kernel.setArg(j++, Res_buf); 64 | 65 | j = 0; 66 | vmul_kernel.setArg(j++, In1_buf2); 67 | vmul_kernel.setArg(j++, In2_buf2); 68 | vmul_kernel.setArg(j++, Res_buf2); 69 | 70 | std::vector<cl::Event> events_write(2); 71 | std::vector<cl::Event> events_kernel(2); 72 | std::vector<cl::Event> events_read(2); 73 | 74 | struct timeval start_time, 
end_time; 75 | gettimeofday(&start_time, 0); 76 | 77 | et.add("Memory object migration enqueue"); 78 | q.enqueueMigrateMemObjects({In1_buf,In2_buf}, 0, nullptr, &events_write[0]); //TODO start transfer, 0: host -> device; 1: opposite 79 | q.enqueueMigrateMemObjects({In1_buf2,In2_buf2}, 0, nullptr, &events_write[1]); //TODO start transfer, 0: host -> device; 1: opposite 80 | 81 | clWaitForEvents(1, (const cl_event *)&events_write[0]); 82 | clWaitForEvents(1, (const cl_event *)&events_write[1]); 83 | 84 | et.add("OCL Enqueue task"); 85 | q.enqueueTask(vadd_kernel, &events_write, &events_kernel[0]); 86 | q.enqueueTask(vmul_kernel, &events_write, &events_kernel[1]); 87 | 88 | et.add("Wait for kernel to complete"); 89 | 90 | clWaitForEvents(1, (const cl_event *)&events_kernel[0]); 91 | clWaitForEvents(1, (const cl_event *)&events_kernel[1]); 92 | 93 | et.add("Read back computation results"); 94 | q.enqueueMigrateMemObjects({Res_buf}, 1, &events_kernel, &events_read[0]); 95 | q.enqueueMigrateMemObjects({Res_buf2}, 1, &events_kernel, &events_read[1]); 96 | 97 | clWaitForEvents(1, (const cl_event *)&events_read[0]); 98 | clWaitForEvents(1, (const cl_event *)&events_read[1]); 99 | et.finish(); 100 | q.finish(); 101 | 102 | // std::cout << "Kernel st execution time is: " < 4 | 5 | typedef int dt; 6 | 7 | void vadd_op(dt *in1, dt *in2, dt *res1, dt *res2, int LEN, std::string xclbin){ 8 | EventTimer et; 9 | 10 | cl_int fail; 11 | 12 | //***************************************** Step1 platform related operations 13 | std::vector<cl::Device> devices = xcl::get_xil_devices(); 14 | cl::Device device = devices[0]; //the device 0 15 | 16 | //***************************************** Step2 Creating Context and Command Queue for selected Device 17 | 18 | 19 | et.add("OpenCL Initialization"); 20 | 21 | cl::Context context(device, NULL, NULL, NULL, &fail); //initial OpenCL environment 22 | 23 | cl::CommandQueue q(context, device, CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, 
&fail); //command queue 24 | 25 | cl::Program::Binaries xclBins = xcl::import_binary_file(xclbin); //load the binary file 26 | devices.resize(1); 27 | cl::Program program(context, devices, xclBins, NULL, &fail); // program the FPGA 28 | 29 | cl::Kernel vadd_kernel,vadd_kernel2; 30 | printf("load kernel start\n"); 31 | vadd_kernel = cl::Kernel(program, "vadd_kernel:{vadd_kernel_1}", &fail);// "kernel:{instance}" selects one compute unit; "vadd_kernel" must match the kernel function name in the .xo 32 | vadd_kernel2 = cl::Kernel(program, "vadd_kernel:{vadd_kernel_2}", &fail); 33 | printf("load kernel finish\n"); 34 | et.finish(); 35 | 36 | //***************************************** Step3 create device buffer and map dev buf to host buf 37 | 38 | cl::Buffer In1_buf, In2_buf, Res_buf; 39 | cl::Buffer In1_buf2, In2_buf2, Res_buf2; 40 | et.add("Map host buffers to OpenCL buffers"); 41 | In1_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 42 | sizeof(dt) * LEN, in1); 43 | In2_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 44 | sizeof(dt) * LEN, in2); 45 | 46 | Res_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY), 47 | sizeof(dt) * LEN, res1); 48 | 49 | In1_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 50 | sizeof(dt) * LEN, in1); 51 | In2_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 52 | sizeof(dt) * LEN, in2); 53 | 54 | Res_buf2 = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY), 55 | sizeof(dt) * LEN, res2); 56 | 57 | et.finish(); 58 | 59 | int j = 0; 60 | et.add("Set kernel arguments"); 61 | vadd_kernel.setArg(j++, In1_buf); 62 | vadd_kernel.setArg(j++, In2_buf); 63 | vadd_kernel.setArg(j++, Res_buf); 64 | 65 | j = 0; 66 | vadd_kernel2.setArg(j++, In1_buf2); 67 | vadd_kernel2.setArg(j++, In2_buf2); 68 | vadd_kernel2.setArg(j++, Res_buf2); 69 | 70 | std::vector<cl::Event> events_write(2); 71 | std::vector<cl::Event> events_kernel(2); 72 | 
std::vector<cl::Event> events_read(2); 73 | 74 | struct timeval start_time, end_time; 75 | gettimeofday(&start_time, 0); 76 | 77 | et.add("Memory object migration enqueue"); 78 | q.enqueueMigrateMemObjects({In1_buf,In2_buf}, 0, nullptr, &events_write[0]); //TODO start transfer, 0: host -> device; 1: opposite 79 | q.enqueueMigrateMemObjects({In1_buf2,In2_buf2}, 0, nullptr, &events_write[1]); //TODO start transfer, 0: host -> device; 1: opposite 80 | 81 | clWaitForEvents(1, (const cl_event *)&events_write[0]); 82 | clWaitForEvents(1, (const cl_event *)&events_write[1]); 83 | 84 | et.add("OCL Enqueue task"); 85 | q.enqueueTask(vadd_kernel, &events_write, &events_kernel[0]); 86 | q.enqueueTask(vadd_kernel2, &events_write, &events_kernel[1]); 87 | 88 | et.add("Wait for kernel to complete"); 89 | 90 | clWaitForEvents(1, (const cl_event *)&events_kernel[0]); 91 | clWaitForEvents(1, (const cl_event *)&events_kernel[1]); 92 | 93 | et.add("Read back computation results"); 94 | q.enqueueMigrateMemObjects({Res_buf}, 1, &events_kernel, &events_read[0]); 95 | q.enqueueMigrateMemObjects({Res_buf2}, 1, &events_kernel, &events_read[1]); 96 | 97 | clWaitForEvents(1, (const cl_event *)&events_read[0]); 98 | clWaitForEvents(1, (const cl_event *)&events_read[1]); 99 | et.finish(); 100 | q.finish(); 101 | 102 | // std::cout << "Kernel st execution time is: " < 4 | 5 | typedef int dt; 6 | #define VADD_XCL "../../test_system_hw_link/Hardware/binary_container_1.xclbin" 7 | 8 | void vadd_op(dt *in1, dt *in2, dt *res, int LEN){ 9 | EventTimer et; 10 | 11 | cl_int fail; 12 | 13 | //***************************************** Step1 platform related operations 14 | std::vector<cl::Device> devices = xcl::get_xil_devices(); 15 | cl::Device device = devices[0]; //the device 0 16 | 17 | //***************************************** Step2 Creating Context and Command Queue for selected Device 18 | 19 | 20 | et.add("OpenCL Initialization"); 21 | 22 | cl::Context context(device, NULL, NULL, NULL, &fail); //initial OpenCL 
environment 23 | 24 | cl::CommandQueue q(context, device, CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &fail); //command queue 25 | 26 | cl::Program::Binaries xclBins = xcl::import_binary_file(VADD_XCL); //load the binary file 27 | devices.resize(1); 28 | cl::Program program(context, devices, xclBins, NULL, &fail); // program the FPGA 29 | 30 | cl::Kernel vadd_kernel; 31 | printf("load xclbin start\n"); 32 | vadd_kernel = cl::Kernel(program, "vadd_kernel", &fail);// the name "vadd_kernel" must match the kernel function name in the .xo 33 | printf("load xclbin finish\n"); 34 | et.finish(); 35 | 36 | //***************************************** Step3 create device buffer and map dev buf to host buf 37 | 38 | cl::Buffer In1_buf, In2_buf, Res_buf; 39 | et.add("Map host buffers to OpenCL buffers"); 40 | In1_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 41 | sizeof(dt) * LEN, in1); 42 | In2_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 43 | sizeof(dt) * LEN, in2); 44 | 45 | Res_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY), 46 | sizeof(dt) * LEN, res); 47 | 48 | et.finish(); 49 | 50 | int j = 0; 51 | et.add("Set kernel arguments"); 52 | vadd_kernel.setArg(j++, In1_buf); 53 | vadd_kernel.setArg(j++, In2_buf); 54 | vadd_kernel.setArg(j++, Res_buf); 55 | 56 | std::vector<cl::Event> events_write(1); 57 | std::vector<cl::Event> events_kernel(1); 58 | std::vector<cl::Event> events_read(1); 59 | 60 | struct timeval start_time, end_time; 61 | gettimeofday(&start_time, 0); 62 | 63 | et.add("Memory object migration enqueue"); 64 | q.enqueueMigrateMemObjects({In1_buf,In2_buf}, 0, nullptr, &events_write[0]); //TODO start transfer, 0: host -> device; 1: opposite 65 | 66 | clWaitForEvents(1, (const cl_event *)&events_write[0]); 67 | 68 | et.add("OCL Enqueue task"); 69 | q.enqueueTask(vadd_kernel, &events_write, &events_kernel[0]); 70 | 71 | et.add("Wait for kernel to complete"); 72 | 73 | 
clWaitForEvents(1, (const cl_event *)&events_kernel[0]); 74 | 75 | et.add("Read back computation results"); 76 | q.enqueueMigrateMemObjects({Res_buf}, 1, &events_kernel, &events_read[0]); 77 | clWaitForEvents(1, (const cl_event *)&events_read[0]); 78 | et.finish(); 79 | q.finish(); 80 | 81 | // std::cout << "Kernel st execution time is: " < 4 | 5 | typedef int dt; 6 | #define VMUL_XCL "../../test_system_hw_link/Hardware/binary_container_2.xclbin" 7 | 8 | 9 | void vmul_op(dt *in1, dt *in2, dt *res, int LEN){ 10 | EventTimer et; 11 | 12 | cl_int fail; 13 | 14 | //***************************************** Step1 platform related operations 15 | std::vector<cl::Device> devices = xcl::get_xil_devices(); 16 | cl::Device device = devices[0]; //the device 0 17 | 18 | //***************************************** Step2 Creating Context and Command Queue for selected Device 19 | 20 | 21 | et.add("OpenCL Initialization"); 22 | 23 | cl::Context context(device, NULL, NULL, NULL, &fail); //initial OpenCL environment 24 | 25 | cl::CommandQueue q(context, device, CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &fail); //command queue 26 | 27 | cl::Program::Binaries xclBins = xcl::import_binary_file(VMUL_XCL); //load the binary file 28 | devices.resize(1); 29 | cl::Program program(context, devices, xclBins, NULL, &fail); // program the FPGA 30 | 31 | cl::Kernel vmul_kernel; 32 | vmul_kernel = cl::Kernel(program, "vmul_kernel", &fail);// the name "vmul_kernel" must match the kernel function name in the .xo 33 | et.finish(); 34 | 35 | //***************************************** Step3 create device buffer and map dev buf to host buf 36 | 37 | cl::Buffer In1_buf, In2_buf, Res_buf; 38 | et.add("Map host buffers to OpenCL buffers"); 39 | In1_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 40 | sizeof(dt) * LEN, in1); 41 | In2_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY), 42 | sizeof(dt) * LEN, in2); 43 | 44 | Res_buf = 
cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY),
                         sizeof(dt) * LEN, res);

    et.finish();

    int j = 0;
    et.add("Set kernel arguments");
    vmul_kernel.setArg(j++, In1_buf);
    vmul_kernel.setArg(j++, In2_buf);
    vmul_kernel.setArg(j++, Res_buf);

    std::vector<cl::Event> events_write(1);
    std::vector<cl::Event> events_kernel(1);
    std::vector<cl::Event> events_read(1);

    struct timeval start_time, end_time;
    gettimeofday(&start_time, 0);

    et.add("Memory object migration enqueue");
    q.enqueueMigrateMemObjects({In1_buf, In2_buf}, 0, nullptr, &events_write[0]); // TODO start transfer, 0: host -> device; 1: device -> host

    clWaitForEvents(1, (const cl_event *)&events_write[0]);

    et.add("OCL Enqueue task");
    q.enqueueTask(vmul_kernel, &events_write, &events_kernel[0]);

    et.add("Wait for kernel to complete");

    clWaitForEvents(1, (const cl_event *)&events_kernel[0]);

    et.add("Read back computation results");
    q.enqueueMigrateMemObjects({Res_buf}, 1, &events_kernel, &events_read[0]);
    clWaitForEvents(1, (const cl_event *)&events_read[0]);
    et.finish();
    q.finish();

    // std::cout << "Kernel st execution time is: " <
#include
#include

// When creating a buffer with user pointer (CL_MEM_USE_HOST_PTR), under the hood
// User ptr is used if and only if it is properly aligned (page aligned). When not
// aligned, runtime has no choice but to create its own host side buffer that backs
// user ptr. This in turn implies that all operations that move data to and from
// device incur an extra memcpy to move data to/from runtime's own host buffer
// from/to user pointer. So it is recommended to use this allocator if user wish to
// Create Buffer/Memory Object with CL_MEM_USE_HOST_PTR to align user buffer to the
// page boundary.
// It ensures that the user buffer is used when the user creates a
// Buffer/Mem Object with CL_MEM_USE_HOST_PTR.
template <typename T>
struct aligned_allocator {
    using value_type = T;
    T* allocate(std::size_t num) {
        void* ptr = nullptr;
        if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc();
        return reinterpret_cast<T*>(ptr);
    }
    void deallocate(T* p, std::size_t num) { free(p); }
};

namespace xcl {
std::vector<cl::Device> get_xil_devices();
std::vector<cl::Device> get_devices(const std::string& vendor_name);
/* find_xclbin_file
 *
 *
 * Description:
 *   Find precompiled program (as commonly created by the Xilinx OpenCL
 *   flow). Using search path below.
 *
 * Search Path:
 *   $XCL_BINDIR/<name>.<target>.<device>.xclbin
 *   $XCL_BINDIR/<name>.<target>.<device_versionless>.xclbin
 *   $XCL_BINDIR/binary_container_1.xclbin
 *   $XCL_BINDIR/<name>.xclbin
 *   xclbin/<name>.<target>.<device>.xclbin
 *   xclbin/<name>.<target>.<device_versionless>.xclbin
 *   xclbin/binary_container_1.xclbin
 *   xclbin/<name>.xclbin
 *   ../<name>.<target>.<device>.xclbin
 *   ../<name>.<target>.<device_versionless>.xclbin
 *   ../binary_container_1.xclbin
 *   ../<name>.xclbin
 *   ./<name>.<target>.<device>.xclbin
 *   ./<name>.<target>.<device_versionless>.xclbin
 *   ./binary_container_1.xclbin
 *   ./<name>.xclbin
 *
 * Inputs:
 *   _device_name - Targeted Device name
 *   xclbin_name - base name of the xclbin to import.
 *
 * Returns:
 *   An opencl program Binaries object that was created from xclbin_name file.
 */
std::string find_binary_file(const std::string& _device_name, const std::string& xclbin_name);
cl::Program::Binaries import_binary_file(std::string xclbin_file_name);
bool is_emulation();
bool is_hw_emulation();
bool is_xpr_device(const char* device_name);
}

--------------------------------------------------------------------------------
/multi-kernels/src/kernel/vadd.cpp:
--------------------------------------------------------------------------------
#include "vadd.hpp"



extern "C"
void vadd_kernel(dt *in1, dt *in2, dt *res){
#pragma HLS TOP name=vadd_kernel

#pragma HLS INTERFACE m_axi depth=1000 port=in1 offset=slave bundle=inBUS1
#pragma HLS INTERFACE m_axi depth=1000 port=in2 offset=slave bundle=inBUS2
#pragma HLS INTERFACE m_axi depth=1000 port=res offset=slave bundle=resBUS


    for(int i=0; i< N; i++){
        *(res+i) = *(in1+i)+*(in2+i);
    }

}

--------------------------------------------------------------------------------
/multi-kernels/src/kernel/vadd.hpp:
--------------------------------------------------------------------------------
#define N 1000


typedef int dt;


extern "C"
void vadd_kernel(dt *in1, dt *in2, dt *res);

--------------------------------------------------------------------------------
/multi-kernels/src/kernel/vmul.cpp:
--------------------------------------------------------------------------------
#include "vmul.hpp"



extern "C"
void vmul_kernel(dt *in1, dt *in2, dt *res){
#pragma HLS TOP name=vmul_kernel

#pragma HLS INTERFACE m_axi depth=1000 port=in1 offset=slave bundle=inBUS1
#pragma HLS INTERFACE m_axi depth=1000 port=in2 offset=slave bundle=inBUS2
#pragma HLS INTERFACE m_axi depth=1000 port=res offset=slave bundle=resBUS


    for(int i=0; i< N; i++){
        *(res+i) = in1[i]*in2[i];
    }

}

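The two kernels above compute an element-wise sum and an element-wise product, so a host program can check the device output against a plain software model. The following is a minimal sketch of such a golden reference, not code from this repository: the names `vadd_ref`, `vmul_ref`, and `count_mismatches` are hypothetical, and only the element type `dt` is taken from the kernel headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using dt = int;  // same element type as `typedef int dt;` in vadd.hpp / vmul.hpp

// Software model of vadd_kernel: res[i] = in1[i] + in2[i].
std::vector<dt> vadd_ref(const std::vector<dt>& in1, const std::vector<dt>& in2) {
    assert(in1.size() == in2.size());
    std::vector<dt> res(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) res[i] = in1[i] + in2[i];
    return res;
}

// Software model of vmul_kernel: res[i] = in1[i] * in2[i].
std::vector<dt> vmul_ref(const std::vector<dt>& in1, const std::vector<dt>& in2) {
    assert(in1.size() == in2.size());
    std::vector<dt> res(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) res[i] = in1[i] * in2[i];
    return res;
}

// Number of positions where the device output differs from the reference.
std::size_t count_mismatches(const std::vector<dt>& got, const std::vector<dt>& expected) {
    assert(got.size() == expected.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (got[i] != expected[i]) ++mismatches;
    }
    return mismatches;
}
```

A host program could build the reference from the same inputs it hands to the accelerator and treat any non-zero `count_mismatches` result as a failed run.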
--------------------------------------------------------------------------------
/multi-kernels/src/kernel/vmul.hpp:
--------------------------------------------------------------------------------
#define N 1000


typedef int dt;


extern "C"
void vmul_kernel(dt *in1, dt *in2, dt *res);

--------------------------------------------------------------------------------
/overall/README.md:
--------------------------------------------------------------------------------
# Overall introduction


## Tutorial contents

1. [ZCU104](./start_ZCU104.md): a simple accelerator design example for embedded boards, using the ZCU104 as the reference. [Link](./start_ZCU104.md)

2. [Alveo U50](./start_U50.md): a simple accelerator design example for data-center cards, using the Alveo U50 as the reference. [Link](./start_U50.md)

## Source code structure

+ host: host-side code
    - event_timer: timing recorder
    - host: main program
    - vadd_host: host-side wrapper for the accelerator
    - xcl2: Xilinx's modified OpenCL helper library

+ kernel: accelerator IP core code
    - vadd: element-wise vector addition IP

--------------------------------------------------------------------------------
/overall/src/host/event_timer.cpp:
--------------------------------------------------------------------------------
/**********
Copyright (c) 2019, Xilinx, Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**********/


#include "event_timer.hpp"

#include <iomanip>
#include <iostream>

EventTimer::EventTimer()
{
    unfinished = false;
    event_count = 0;
    max_string_length = 0;
}

float EventTimer::ms_difference(EventTimer::timepoint start,
                                EventTimer::timepoint end)
{
    std::chrono::duration<float, std::milli> duration = end - start;
    return duration.count();
}

int EventTimer::add(std::string description)
{
    // If previously pending event was unfinished, adding a new event
    // will terminate it if this function is called
    if (unfinished)
        finish();

    unfinished = true;

    event_names.push_back(description);
    int length = description.length();
    if (length > max_string_length)
        max_string_length = length;
    start_times.push_back(std::chrono::high_resolution_clock::now());
    return event_count++;
}

void EventTimer::finish(void)
{
    end_times.push_back(std::chrono::high_resolution_clock::now());
    if (!unfinished) {
        end_times.pop_back();
        return;
    }
    unfinished = false;
}

void EventTimer::clear(void)
{
    start_times.clear();
    end_times.clear();
    event_names.clear();
    event_count = 0;
    unfinished = false;
}

void EventTimer::print(int id)
{
    std::ios_base::fmtflags flags(std::cout.flags());
    if (id >= 0) {
        // Reject ids at or past the end; id == size() would read out of range.
        if ((unsigned)id >= event_names.size())
            return;
        std::cout << event_names[id] << " : " << std::fixed << std::setprecision(3)
                  << ms_difference(start_times[id], end_times[id]) << std::endl;
    }
    else {
        int printable_events = unfinished ? event_count - 1 : event_count;
        for (int i = 0; i < printable_events; i++) {
            std::cout << std::left << std::setw(max_string_length) << event_names[i] << " : ";
            std::cout << std::right << std::setw(8) << std::fixed << std::setprecision(3)
                      << ms_difference(start_times[i], end_times[i]) << " ms"
                      << std::endl;
        }
    }
    std::cout.flags(flags);
}

--------------------------------------------------------------------------------
/overall/src/host/event_timer.hpp:
--------------------------------------------------------------------------------
/**********
Copyright (c) 2019, Xilinx, Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
**********/


#ifndef EVENT_TIMER_HPP__
#define EVENT_TIMER_HPP__

#include <chrono>
#include <string>
#include <vector>

class EventTimer
{
    typedef std::chrono::high_resolution_clock::time_point timepoint;

  private:
    std::vector<timepoint> start_times;
    std::vector<timepoint> end_times;
    std::vector<std::string> event_names;

    bool unfinished;
    unsigned int event_count;
    int max_string_length;

    float ms_difference(EventTimer::timepoint start, EventTimer::timepoint end);

  public:
    EventTimer(void);
    int add(std::string description);
    void finish(void);
    void clear(void);

    void print(int id = -1);
};

#endif // EVENT_TIMER_HPP__

--------------------------------------------------------------------------------
/overall/src/host/host.cpp:
--------------------------------------------------------------------------------
#include "host.hpp"



int main(int argc, const char* argv[]) {
    std::cout << "\n-------------------START----------------\n";
    ArgParser parser(argc, argv);
    std::string tmpStr;
    if (!parser.getCmdOption("--xclbin", tmpStr)) {
        std::cout << "ERROR: xclbin is not set!\n";
        return 1;
    }

    dt in1[N];
    dt in2[N];
    dt res[N];
    dt res_golden[N];

    std::cout << "data prepared" << std::endl;
    for (int i = 0; i < N; i++) {
        in1[i] = i;
        in2[i] = i;
        if(mode == 0){
            res_golden[i] = i+i;
            // std::cout << "Here is add\n";
        }
        else
        if(mode == 1){
            res_golden[i] = i*i;
            // std::cout << "Here is mul\n";
        }

    }

    struct timeval start_time, end_time;
    gettimeofday(&start_time, 0); // record the start time before launching the kernel
    vadd_op(in1,in2,res,N,tmpStr);

    gettimeofday(&end_time, 0);
    // std::cout << "Kernel ed execution time is: " <
#include
#include
#include
#include

#define N 1000



class ArgParser {
   public:
    ArgParser(int& argc, const char** argv) {
        for (int i = 1; i < argc; ++i) mTokens.push_back(std::string(argv[i]));
    }
    bool getCmdOption(const std::string option, std::string& value) const {
        std::vector<std::string>::const_iterator itr;
        itr = std::find(this->mTokens.begin(), this->mTokens.end(), option);
        if (itr != this->mTokens.end() && ++itr != this->mTokens.end()) {
            value = *itr;
            return true;
        }
        return false;
    }

   private:
    std::vector<std::string> mTokens;
};

--------------------------------------------------------------------------------
/overall/src/host/vadd_host.hpp:
--------------------------------------------------------------------------------
#include "event_timer.hpp"
#include "xcl2.hpp"
#include <sys/time.h>

typedef int dt;


void vadd_op(dt *in1, dt *in2, dt *res, int LEN,std::string xclbin){
    EventTimer et;

    cl_int fail;

    //***************************************** Step1 platform related operations
    std::vector<cl::Device> devices = xcl::get_xil_devices();
    cl::Device device = devices[0]; // device 0

    //***************************************** Step2 Creating Context
    // and Command Queue for selected Device


    et.add("OpenCL Initialization");

    cl::Context context(device, NULL, NULL, NULL, &fail); // initialize the OpenCL environment

    cl::CommandQueue q(context, device, CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &fail); // command queue

    cl::Program::Binaries xclBins = xcl::import_binary_file(xclbin); // load the binary file
    devices.resize(1);
    cl::Program program(context, devices, xclBins, NULL, &fail); // program the FPGA

    cl::Kernel vadd_kernel;
    printf("load xclbin start\n");
    vadd_kernel = cl::Kernel(program, "vadd_kernel", &fail); // "vadd_kernel" must match the kernel name in the .xo
    printf("load xclbin finish\n");
    et.finish();

    //***************************************** Step3 create device buffer and map dev buf to host buf

    cl::Buffer In1_buf, In2_buf, Res_buf;
    et.add("Map host buffers to OpenCL buffers");
    In1_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY),
                         sizeof(dt) * LEN, in1);
    In2_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_READ_ONLY),
                         sizeof(dt) * LEN, in2);

    Res_buf = cl::Buffer(context, static_cast<cl_mem_flags>(CL_MEM_USE_HOST_PTR | CL_MEM_WRITE_ONLY),
                         sizeof(dt) * LEN, res);

    et.finish();

    int j = 0;
    et.add("Set kernel arguments");
    vadd_kernel.setArg(j++, In1_buf);
    vadd_kernel.setArg(j++, In2_buf);
    vadd_kernel.setArg(j++, Res_buf);

    std::vector<cl::Event> events_write(1);
    std::vector<cl::Event> events_kernel(1);
    std::vector<cl::Event> events_read(1);

    struct timeval start_time, end_time;
    gettimeofday(&start_time, 0);

    et.add("Memory object migration enqueue");
    q.enqueueMigrateMemObjects({In1_buf, In2_buf}, 0, nullptr, &events_write[0]); // TODO start transfer, 0: host -> device; 1: device -> host

    clWaitForEvents(1, (const cl_event *)&events_write[0]);

    et.add("OCL Enqueue task");
    q.enqueueTask(vadd_kernel, &events_write, &events_kernel[0]);

    et.add("Wait for kernel to complete");

    clWaitForEvents(1, (const cl_event *)&events_kernel[0]);

    et.add("Read back computation results");
    q.enqueueMigrateMemObjects({Res_buf}, 1, &events_kernel, &events_read[0]);
    clWaitForEvents(1, (const cl_event *)&events_read[0]);
    et.finish();
    q.finish();

    // std::cout << "Kernel st execution time is: " <
#include
#include

// When creating a buffer with user pointer (CL_MEM_USE_HOST_PTR), under the hood
// User ptr is used if and only if it is properly aligned (page aligned). When not
// aligned, runtime has no choice but to create its own host side buffer that backs
// user ptr. This in turn implies that all operations that move data to and from
// device incur an extra memcpy to move data to/from runtime's own host buffer
// from/to user pointer. So it is recommended to use this allocator if user wish to
// Create Buffer/Memory Object with CL_MEM_USE_HOST_PTR to align user buffer to the
// page boundary. It ensures that the user buffer is used when the user creates a
// Buffer/Mem Object with CL_MEM_USE_HOST_PTR.
template <typename T>
struct aligned_allocator {
    using value_type = T;
    T* allocate(std::size_t num) {
        void* ptr = nullptr;
        if (posix_memalign(&ptr, 4096, num * sizeof(T))) throw std::bad_alloc();
        return reinterpret_cast<T*>(ptr);
    }
    void deallocate(T* p, std::size_t num) { free(p); }
};

namespace xcl {
std::vector<cl::Device> get_xil_devices();
std::vector<cl::Device> get_devices(const std::string& vendor_name);
/* find_xclbin_file
 *
 *
 * Description:
 *   Find precompiled program (as commonly created by the Xilinx OpenCL
 *   flow). Using search path below.
 *
 * Search Path:
 *   $XCL_BINDIR/<name>.<target>.<device>.xclbin
 *   $XCL_BINDIR/<name>.<target>.<device_versionless>.xclbin
 *   $XCL_BINDIR/binary_container_1.xclbin
 *   $XCL_BINDIR/<name>.xclbin
 *   xclbin/<name>.<target>.<device>.xclbin
 *   xclbin/<name>.<target>.<device_versionless>.xclbin
 *   xclbin/binary_container_1.xclbin
 *   xclbin/<name>.xclbin
 *   ../<name>.<target>.<device>.xclbin
 *   ../<name>.<target>.<device_versionless>.xclbin
 *   ../binary_container_1.xclbin
 *   ../<name>.xclbin
 *   ./<name>.<target>.<device>.xclbin
 *   ./<name>.<target>.<device_versionless>.xclbin
 *   ./binary_container_1.xclbin
 *   ./<name>.xclbin
 *
 * Inputs:
 *   _device_name - Targeted Device name
 *   xclbin_name - base name of the xclbin to import.
 *
 * Returns:
 *   An opencl program Binaries object that was created from xclbin_name file.
 */
std::string find_binary_file(const std::string& _device_name, const std::string& xclbin_name);
cl::Program::Binaries import_binary_file(std::string xclbin_file_name);
bool is_emulation();
bool is_hw_emulation();
bool is_xpr_device(const char* device_name);
}

--------------------------------------------------------------------------------
/overall/src/kernel/vadd.cpp:
--------------------------------------------------------------------------------
#include "vadd.hpp"



extern "C"
void vadd_kernel(dt *in1, dt *in2, dt *res){
#pragma HLS TOP name=vadd_kernel

#pragma HLS INTERFACE m_axi depth=1000 port=in1 offset=slave bundle=inBUS1
#pragma HLS INTERFACE m_axi depth=1000 port=in2 offset=slave bundle=inBUS2
#pragma HLS INTERFACE m_axi depth=1000 port=res offset=slave bundle=resBUS


    for(int i=0; i< N; i++){
        *(res+i) = *(in1+i)+*(in2+i);
    }

}

--------------------------------------------------------------------------------
/overall/src/kernel/vadd.hpp:
--------------------------------------------------------------------------------
#define N 1000


typedef int dt;


extern "C"
void vadd_kernel(dt *in1, dt *in2, dt *res);

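The comment in xcl2.hpp explains that a `CL_MEM_USE_HOST_PTR` buffer avoids an extra host-side copy only when the host pointer is page aligned, whereas host.cpp passes plain stack arrays. The sketch below shows one way to obtain page-aligned host storage with a `std::vector`. It is an illustration under stated assumptions rather than the repository's code: `page_aligned_allocator`, `is_page_aligned`, and `make_input` are hypothetical names, the allocator only mirrors the idea of `aligned_allocator` from xcl2.hpp, and a POSIX system with a 4096-byte page size is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX)
#include <vector>

constexpr std::size_t kPageSize = 4096;  // assumed page size, as in xcl2.hpp

// Hypothetical allocator that returns page-aligned storage.
template <typename T>
struct page_aligned_allocator {
    using value_type = T;

    page_aligned_allocator() = default;
    template <typename U>
    page_aligned_allocator(const page_aligned_allocator<U>&) {}

    T* allocate(std::size_t num) {
        void* ptr = nullptr;
        if (posix_memalign(&ptr, kPageSize, num * sizeof(T))) throw std::bad_alloc();
        return static_cast<T*>(ptr);
    }
    void deallocate(T* p, std::size_t) { free(p); }
};

template <typename T, typename U>
bool operator==(const page_aligned_allocator<T>&, const page_aligned_allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const page_aligned_allocator<T>&, const page_aligned_allocator<U>&) { return false; }

// True when the pointer sits on a page boundary.
inline bool is_page_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kPageSize == 0;
}

using aligned_vec = std::vector<int, page_aligned_allocator<int>>;

// Builds an input vector 0, 1, 2, ... backed by page-aligned memory.
aligned_vec make_input(std::size_t n) {
    assert(n > 0);
    aligned_vec v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i);
    return v;
}
```

With storage like this, `v.data()` could be handed to the buffer constructor in place of a stack array, which should let the runtime use the host buffer directly.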
--------------------------------------------------------------------------------
/overall/start_U50.md:
--------------------------------------------------------------------------------
# Alveo U50 accelerator deployment example

## 1 Creating the application project

### 1.1 Create the project
1. Run `vitis` directly in a terminal and set the workspace directory.

![Figure 3: workspace setup](../img/workspace.jpg)

2. Create a new application project and click Next.

![Figure 4: creating the application project](../img/creat_app_prj.jpg)

3. Select the U50 platform description file that was installed earlier.

![Figure 5: adding the U50 platform](../img/add_u50.png)

4. Enter an application name, for example test.

![Figure 6: new application project](../img/app_prj.jpg)

5. Create an empty application project: select Empty Application, then click Finish.

![Figure 8: creating an empty application project](../img/empty_app.jpg)

### 1.2 Kernel-side configuration

1. Add the kernel code: copy or import your kernel sources into the src folder shown in the figure.

![Figure 9: adding kernel code](../img/kernel_code_u50.png)

2. Open test_kernels.prj (shown in the figure above) to configure the kernel.

3. Register the kernel function: click Add Hardware Function and add the top-level function of the hardware unit.

![Figure 10: registering the kernel function](../img/add_kernel_func_u50.png)

4. Optional configuration
    + Set the kernel compile-time frequency constraint
        - In the Assistant view, right-click the kernel project
        - Click Settings to open the build settings
        - Under Hardware in the kernel directory, select the $YOUR_KERNEL_NAME entry
        - In the v++ compiler options, add `--hls.clock 300000000:$YOUR_KERNEL_NAME`, where 300000000 means 300 MHz

    ![Figure 11: setting the kernel compile-time frequency constraint](../img/kernel_setting.png)

    + Link to vitis_hls (available after the project has been built)
        - Open the test_kernels.prj page again
        - Click the icon highlighted in the figure below to open HLS and debug the kernel code

    ![Figure 12: opening vitis_hls](../img/kernel_vitis_hls.png)

### 1.3 Host-side configuration

1. Add the host code: copy or import your host sources into the src folder shown in the figure.

![Figure 13: adding host code](../img/host_code.jpg)

### 1.4 HW-link configuration

1. Open test_system_hw_link.prj (shown in the figure) to configure the link settings.
2. Click Add Binary Container to create a container.
3. Click Add Hardware Function to add the top function of the hardware unit.

![Figure 14: configuring the link settings](../img/link_u50.png)

4. 
    Optional configuration

    + Set the hardware implementation frequency constraint
        - In the Assistant view, right-click the hw_link project
        - Click Settings to open the build settings
        - Under Hardware in the hw_link directory, select the $YOUR_CONTAINER_NAME entry
        - In the v++ compiler options, add `--kernel_frequency 300`, where 300 means 300 MHz

    + Set the kernel port mapping
        - The Memory options shown in the figure below configure the kernel's port mapping

    ![Figure 15: setting the link-time frequency constraint](../img/link_setting.png)

    + Link to Vivado (available after the project has been built)
        - In the Assistant view, right-click the container under the hw_link project
        - Click Open Vivado Project to open the Vivado project for quick debugging

    ![Figure 16: opening Vivado](../img/link_vivado.jpg)

## 2 Building the application project

Select System in the Explorer view, then click the build button in the menu. There are three build modes:

+ Emulation-SW: software emulation, similar to pure software simulation in HLS; mainly used to verify that the algorithm is correct
+ Emulation-HW: hardware emulation; it simulates the real hardware connections and is used to check hardware linking and memory access problems
+ Hardware: hardware implementation; builds the project files that run on the FPGA

![Figure 17: building the project](../img/build.jpg)

## 3 Hardware deployment

1. Open the Vitis shell.

![Figure 18: shell](../img/vitis_shell.png)

2. Once it has started, run the following command:
```
cd ./test/Hardware/
```

3. Run the host program:

```
./test --xclbin ./binary_container_1.xclbin
```

![Figure 19: run](../img/run_u50.png)
--------------------------------------------------------------------------------
/overall/start_ZCU104.md:
--------------------------------------------------------------------------------
# ZCU104 accelerator deployment example

## 1 Creating the application project

### 1.1 Create the project
1. Run `vitis` directly in a terminal and set the workspace directory.

![Figure 3: workspace setup](../img/workspace.jpg)

2. Create a new application project and click Next.

![Figure 4: creating the application project](../img/creat_app_prj.jpg)

3. Click Add to add the ZCU104 platform description file that was downloaded earlier.

![Figure 5: adding the ZCU104 platform](../img/add_zcu104.jpg)

4. Enter an application name, for example test.

![Figure 6: new application project](../img/app_prj.jpg)

5. Select the image files
    + Sysroot -> /path-to-ZynqMP-common-image/xilinx-zynqmp-common-v2020.2/ir/sysroots/aarch64-xilinx-linux
    + Root FS -> /path-to-ZynqMP-common-image/xilinx-zynqmp-common-v2020.2/rootfs.ext4
    + Kernel Image -> /path-to-ZynqMP-common-image/xilinx-zynqmp-common-v2020.2/Image

![Figure 7: image file configuration](../img/sysroot.jpg)

6. 
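    Create an empty application project: select Empty Application, then click Finish.

![Figure 8: creating an empty application project](../img/empty_app.jpg)

### 1.2 Kernel-side configuration

1. Add the kernel code: copy or import your kernel sources into the src folder shown in the figure.

![Figure 9: adding kernel code](../img/kernel_code.jpg)

2. Open test_kernels.prj (shown in the figure above) to configure the kernel.

3. Register the kernel function: click Add Hardware Function and add the top-level function of the hardware unit.

![Figure 10: registering the kernel function](../img/add_kernel_func.png)

4. Optional configuration

    + Set the kernel compile-time frequency constraint
        - In the Assistant view, right-click the kernel project
        - Click Settings to open the build settings
        - Under Hardware in the kernel directory, select the $YOUR_KERNEL_NAME entry
        - In the v++ compiler options, add `--hls.clock 300000000:$YOUR_KERNEL_NAME`, where 300000000 means 300 MHz

    ![Figure 11: setting the kernel compile-time frequency constraint](../img/kernel_setting.png)

    + Link to vitis_hls (available after the project has been built)
        - Open the test_kernels.prj page again
        - Click the icon highlighted in the figure below to open HLS and debug the kernel code

    ![Figure 12: opening vitis_hls](../img/kernel_vitis_hls.png)

### 1.3 Host-side configuration

1. Add the host code: copy or import your host sources into the src folder shown in the figure.

![Figure 13: adding host code](../img/host_code.jpg)

### 1.4 HW-link configuration

1. Open test_system_hw_link.prj (shown in the figure) to configure the link settings.
2. Click Add Binary Container to create a container.
3. Click Add Hardware Function to add the top function of the hardware unit.

![Figure 14: configuring the link settings](../img/link.png)

4. 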
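    Optional configuration

    + Set the hardware implementation frequency constraint
        - In the Assistant view, right-click the hw_link project
        - Click Settings to open the build settings
        - Under Hardware in the hw_link directory, select the $YOUR_CONTAINER_NAME entry
        - In the v++ compiler options, add `--clock.defaultFreqHz 300000000`, where 300000000 means 300 MHz

    + Set the kernel port mapping
        - The Memory options shown in the figure below configure the kernel's port mapping

    ![Figure 15: setting the link-time frequency constraint](../img/link_setting.png)

    + Link to Vivado (available after the project has been built)
        - In the Assistant view, right-click the container under the hw_link project
        - Click Open Vivado Project to open the Vivado project for quick debugging

    ![Figure 16: opening Vivado](../img/link_vivado.jpg)

## 2 Building the application project

Select System in the Explorer view, then click the build button in the menu. There are three build modes:
+ Emulation-SW: software emulation, similar to pure software simulation in HLS; mainly used to verify that the algorithm is correct
+ Emulation-HW: hardware emulation; it simulates the real hardware connections and is used to check hardware linking and memory access problems
+ Hardware: hardware implementation; builds the project files that run on the FPGA

![Figure 17: building the project](../img/build.jpg)

## 3 Hardware deployment

### 3.1 Flashing the SD card

1. Insert the SD card into the computer.
2. Open the Etcher software.
3. For the image option, select the sd_card.img file found under `/PATH-to-YOUR-WORK/test_system/Hardware/package`.
4. For the device option, select the SD card.
5. Click Flash to write the image.

![Figure 18: flashing](../img/image2sd.png)

### 3.2 ZCU104 serial connection

1. Connect the ZCU104 board to the host machine and insert the SD card that was flashed earlier.

![Figure 19: zcu104](../img/zcu104.jpg)

2. Run `sudo putty` on the command line, then configure PuTTY as shown in the figure. The serial port depends on your setup; in this example it is `ttyUSB1`.

![Figure 20: putty](../img/putty.png)

## 4 Running the design

1. Power on the board.
2. After it boots, run the following commands:

```
cd /mnt/sd-mmcblk0p1/
source ./init.sh
```

3. 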
    Run the host program:

```
./test --xclbin ./binary_container_1.xclbin
```
--------------------------------------------------------------------------------
/vitis_pid1639.str:
--------------------------------------------------------------------------------
/*

Xilinx Vitis v2021.2.0 (64-bit) [Major: 2021, Minor: 2]
SW Build 3363750 on 2021-10-16-13:10:08

Process ID (PID): 1639
License: Customer

Current time: Fri Aug 26 10:27:49 CST 2022
Time zone: China Standard Time (Asia/Shanghai)

OS: Ubuntu
OS Version: 5.4.0-122-generic
OS Architecture: amd64
Available processors (cores): 32

Display: localhost:11.0
Screen size: 1920x1080
Available screens: 2
Available disk space: 332 GB

Java version: 11.0.11 64-bit
Java home: /tools/Xilinx/Vitis/2021.2/tps/lnx64/jre11.0.11_9
Java executable location: /tools/Xilinx/Vitis/2021.2/tps/lnx64/jre11.0.11_9/bin/java
Java initial memory (-Xms): 64 MB
Java maximum memory (-Xmx): 1,024 MB

Java library paths: /tools/Xilinx/Vitis/2021.2/tps/lnx64/javafx-sdk-11.0.2/lib:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o/Ubuntu/18:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o/Ubuntu:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o:/tools/Xilinx/Vitis/2021.2/tps/lnx64/jre11.0.11_9/lib/:/tools/Xilinx/Vitis/2021.2/tps/lnx64/jre11.0.11_9/lib//server:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o/Ubuntu/18:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o/Ubuntu:/tools/Xilinx/Vitis/2021.2/lib/lnx64.o:/tools/Xilinx/Vitis/2021.2/tps/lnx64/python-3.8.3/lib:/tools/Xilinx/Vitis/2021.2/aietools/lib/lnx64.o:/opt/xilinx/xrt/lib::/usr/local/cuda/lib64:/tools/Xilinx/Vitis/2021.2/bin/../lnx64/tools/dot/lib:/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib

Java class paths: /tools/Xilinx/Vitis/2021.2/eclipse/lnx64.o//plugins/org.eclipse.equinox.launcher_1.5.700.v20200207-2156.jar
LD_LIBRARY_PATH: 
/tools/Xilinx/Vitis 32 | 33 | User name: wt 34 | User home directory: /home/wt 35 | User working directory: /home/wt/git/Vitis_workflow 36 | User country: US 37 | User language: en 38 | User locale: en_US 39 | 40 | RDI_BASEROOT: /tools/Xilinx/Vitis 41 | HDI_APPROOT: /tools/Xilinx/Vitis/2021.2 42 | RDI_DATADIR: /tools/Xilinx/Vitis/2021.2/data 43 | RDI_BINDIR: /tools/Xilinx/Vitis/2021.2/bin 44 | 45 | Vitis preferences directory: /home/wt/.Xilinx/Vitis/2021.2/ 46 | Vitis workspace directory: /home/wt/double_kernel 47 | Vitis workspace log file location: /home/wt/double_kernel/.metadata/.log 48 | Engine tmp dir: 49 | 50 | Xilinx Environment Variables 51 | ---------------------------- 52 | XILINX_DSP: 53 | XILINX_HLS: /tools/Xilinx/Vitis_HLS/2021.2 54 | XILINX_PLANAHEAD: /tools/Xilinx/Vitis/2021.2 55 | XILINX_SDK: /tools/Xilinx/Vitis/2021.2 56 | XILINX_VITIS: /tools/Xilinx/Vitis/2021.2 57 | XILINX_VIVADO: /tools/Xilinx/Vivado/2021.2 58 | XILINX_VIVADO_HLS: /tools/Xilinx/Vivado/2021.2 59 | XILINX_XRT: /opt/xilinx/xrt 60 | _RDI_DONT_SET_XILINX_AS_PATH: True 61 | 62 | 63 | Copyright 1986-2020 Xilinx, Inc. All Rights Reserved. 
64 | 65 | */ 66 | 67 | selectTreeTable("Hardware", "test_system > test_kernels", "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 68 | selectTreeTable("test_kernels", "test_system", "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 69 | selectTreeTable("test_system", "test_system", "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 70 | selectTreeTable("test", "test_system", "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 71 | selectTreeTable("test_system", "test_system", "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 72 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 73 | selectTreeTable("src", "test_system > test", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 74 | selectTreeTable("makefile", "test_system > Hardware", EventType.DOUBLE_CLICK, "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 75 | closeView("makefile", "test_system/Hardware/makefile", "MakefileEditor", "CTabItem.MAKEFILE"); 76 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 77 | selectTreeTable("binary_container_2-link.cfg", "test_system > test_system_hw_link > Hardware", EventType.DOUBLE_CLICK, "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 78 | activateView("Assistant", "SDXAssistantView", "CTabItem.ASSISTANT"); 79 | selectTreeTable("binary_container_1.xclbin", "test_system > test_system_hw_link > Hardware", EventType.POPUP_TRIGGER_CLICK, "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 80 | selectMenuItem("Rename", "Explorer", "ProjectExplorer", "MenuItem.RENAME"); 81 | selectButton("Cancel", "Rename Resource", "RefactoringWizardDialog2", "Button.CANCEL"); 82 | selectMenuItem("File", "test_system_hw_link", "WorkbenchWindow", "MenuItem.FILE"); 83 | selectMenuItem("New", "test_system_hw_link", 
"WorkbenchWindow", "MenuItem.NEW"); 84 | selectMenuItem("Application Project", "test_system_hw_link", "WorkbenchWindow", "MenuItem.APPLICATION_PROJECT"); 85 | selectButton("Next", "New Application Project (Create a New Application Project)", "NewAppProjectWizard", "Button.NEXT"); 86 | selectTreeTable("xilinx_u50_gen3x16_xdma_201920_3", "New Application Project (Platform)", "NewAppProjectWizard", "TreeViewer.ArrayTreeContentProvider"); 87 | selectButton("Next", "New Application Project (Platform)", "NewAppProjectWizard", "Button.NEXT"); 88 | selectTable("Create new", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 89 | selectTable("test_system", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 90 | selectTable("Create new", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 91 | setTextField("adddd", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Text.APPLICATION_PROJECT_NAME"); 92 | selectButton("Next", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Button.NEXT"); 93 | selectButton("Back", "New Application Project (Templates)", "NewAppProjectWizard", "Button.BACK"); 94 | selectButton("Back", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Button.BACK"); 95 | selectButton("Next", "New Application Project (Platform)", "NewAppProjectWizard", "Button.NEXT"); 96 | selectTable("test_system", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 97 | selectTable("Create new", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 98 | selectTable("test_system", "New Application Project (Application Project Details)", "NewAppProjectWizard", "Table"); 99 | selectTreeTable("Hardware", "test_system > test_system_hw_link", "Explorer", "ProjectExplorer", 
"TreeViewer.NavigatorContentServiceContentProvider"); 100 | selectTreeTable("Hardware", "test_system", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 101 | selectTreeTable("src", "test_system > test_kernels", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 102 | activateView("test_kernels", "test_kernels/test_kernels.prj", "Settingseditor", "CTabItem.TEST_KERNELS"); 103 | selectTreeTable("vmul_kernel", "test_kernels", "SDXSettingsEditor", "TreeViewer.HwAcceleratorContentProvider"); 104 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 105 | selectTreeTable("vmul_host.hpp", "test_system > test > src", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 106 | activateView("test_kernels", "test_kernels/test_kernels.prj", "SDXSettingsEditor", "CTabItem.TEST_KERNELS"); 107 | selectTreeTable("test_system_hw_link.prj", "test_system > test_system_hw_link", EventType.DOUBLE_CLICK, "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 108 | selectToolBarButton("Vitis Shell", "test_system_hw_link", "WorkbenchWindow", "ToolItem.VITIS_SHELL"); 109 | selectTreeTable("vmul_kernel", "binary_container_2", "test_system_hw_link", "SDXSettingsEditor", "TreeViewer.HwLinkAcceleratorTableContentProvider"); 110 | selectToolBarButton("Vitis Shell", "test_system_hw_link", "WorkbenchWindow", "ToolItem.VITIS_SHELL"); 111 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 112 | selectTreeTable("test_system_hw_link.prj", "test_system > test_system_hw_link", EventType.POPUP_TRIGGER_CLICK, "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 113 | activateView("Assistant", "SDXAssistantView", "CTabItem.ASSISTANT"); 114 | selectTreeTable("test_system", "test_system > test_system_hw_link", EventType.POPUP_TRIGGER_CLICK, "Assistant", "SDXAssistantView", "TreeViewer.AssistantContentProvider"); 115 
| selectMenuItem("Settings", "Assistant", "SDXAssistantView", "MenuItem.SETTINGS"); 116 | selectTreeTable("binary_container_1", "test_system > test_system_hw_link > Hardware", "System Project Settings", "AssistantPreferencesDialog", "TreeViewer.PreferenceContentProvider"); 117 | selectTreeTable("binary_container_1", "test_system > test_system_hw_link > Hardware", EventType.DOUBLE_CLICK, "Binary Container Settings", "AssistantPreferencesDialog", "TreeViewer.PreferenceContentProvider"); 118 | selectTreeTable("vadd_kernel", "binary_container_1", "Binary Container Settings", "AssistantPreferencesDialog", "TreeViewer.ProfilingSettingsContentProvider"); 119 | setSpinner("2", "Binary Container Settings", "AssistantPreferencesDialog", "AssistantPreferencesDialog"); 120 | setSpinner("1", "Binary Container Settings", "AssistantPreferencesDialog", "AssistantPreferencesDialog"); 121 | setSpinner("2", "Binary Container Settings", "AssistantPreferencesDialog", "AssistantPreferencesDialog"); 122 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 123 | selectTreeTable("binary_container_1", "test_system_hw_link", "SDXSettingsEditor", "TreeViewer.HwLinkAcceleratorTableContentProvider"); 124 | activateView("Outline", "ContentOutline", "CTabItem.OUTLINE"); 125 | selectTreeTable("binary_container_1", "test_system_hw_link", "SDXSettingsEditor", "TreeViewer.HwLinkAcceleratorTableContentProvider"); 126 | activateView("Explorer", "ProjectExplorer", "CTabItem.EXPLORER"); 127 | selectTreeTable("event_timer.cpp", "test_system > test > src", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 128 | selectTreeTable("host.hpp", "test_system > test > src", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 129 | selectTreeTable("host.cpp", "test_system > test > src", "Explorer", "ProjectExplorer", "TreeViewer.NavigatorContentServiceContentProvider"); 130 | selectButton("Exit", "Confirm Exit", 
"MessageDialogWithToggle", "Button.EXIT"); 131 | // Exiting Xilinx Vitis with a status code '0' at 8/28/22, 8:39:08 PM CST 132 | // Elapsed time: 58:11:19 133 | 134 | -------------------------------------------------------------------------------- /xrc.log: -------------------------------------------------------------------------------- 1 | Fri Aug 26 10:27:50 2022: Server was asked to start on port: '35783' 2 | Fri Aug 26 10:27:50 2022: Server is using token UUID: 'f376d3b4-5608-4143-9fc1-bfa9895b4ec4' 3 | Fri Aug 26 10:27:50 2022: Attempting to start server on port '35783' 4 | Fri Aug 26 10:27:50 2022: XRC main fifo: created main fifo '/tmp/xrcmainf376d3b4-5608-4143-9fc1-bfa9895b4ec4', fd read = 11 5 | Fri Aug 26 10:27:50 2022: Running Rule Check Server 6 | Fri Aug 26 10:27:50 2022: Version 1.6.1 7 | Fri Aug 26 10:27:51 2022: Rule Check Server: Accepted socket connection from client 8 | Fri Aug 26 10:27:51 2022: EXCHANGE_TOKEN received, server token: f376d3b4-5608-4143-9fc1-bfa9895b4ec4, passed token: f376d3b4-5608-4143-9fc1-bfa9895b4ec4 9 | Sun Aug 28 20:39:08 2022: STOP_SERVER received, server token: f376d3b4-5608-4143-9fc1-bfa9895b4ec4, passed token: f376d3b4-5608-4143-9fc1-bfa9895b4ec4 10 | Sun Aug 28 20:39:08 2022: Socket received request to stop server. 11 | Sun Aug 28 20:39:08 2022: SERVER: destructor for server bound to port 35783 12 | --------------------------------------------------------------------------------