├── .gitignore
├── README.md
├── AMP
│   ├── README.md
│   ├── main.py
│   └── net.py
├── DDP
│   ├── readme.md
│   └── ddp.py
├── ModelConver
│   ├── imgs
│   │   ├── mnn.jpg
│   │   ├── ncnn.jpeg
│   │   └── process.svg
│   ├── readme.md
│   ├── Pytorch->ONNX.md
│   ├── ONNX->MNN.md
│   └── ONNX->NCNN.md
└── TensorRT
    ├── readme.md
    ├── main.py
    ├── trt_com.py
    ├── lenet.py
    └── imgs
        ├── build.svg
        └── infer.svg
/.gitignore:
--------------------------------------------------------------------------------
*.code-workspace
.DS_Store
--------------------------------------------------------------------------------
/ModelConver/imgs/mnn.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bobo0810/PytorchExample/HEAD/ModelConver/imgs/mnn.jpg
--------------------------------------------------------------------------------
/ModelConver/imgs/ncnn.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bobo0810/PytorchExample/HEAD/ModelConver/imgs/ncnn.jpeg
--------------------------------------------------------------------------------
/ModelConver/readme.md:
--------------------------------------------------------------------------------
# Mobile Deployment

Using the [RetinaFace face-detection library](https://github.com/biubug6/Face-Detector-1MB-with-landmark) as an example. NCNN and MNN are the most widely used mobile inference frameworks.

### Examples

- [Pytorch->ONNX](Pytorch->ONNX.md)

- [ONNX->NCNN](ONNX->NCNN.md)

- [ONNX->MNN](ONNX->MNN.md)

### Pipeline

![pipeline](imgs/process.svg)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# PyTorch Minimal Examples

#### Included in [PytorchNetHub](https://github.com/bobo0810/PytorchNetHub)

### [AMP](./AMP/README.md)

- Automatic mixed-precision training

### [DDP](./DDP/readme.md)

- Distributed data parallel (multi-node, multi-GPU)

### [MNN/NCNN Deployment](./ModelConver/readme.md)

- Pytorch->ONNX->NCNN/MNN

### [TensorRT Deployment](./TensorRT/readme.md)

- TensorRT API
- Pytorch->ONNX->TensorRT
--------------------------------------------------------------------------------
/AMP/net.py:
--------------------------------------------------------------------------------
from torch.cuda.amp import autocast
import torch.nn as nn


class MyNet(nn.Module):
    '''
    Custom network
    '''
    def __init__(self, use_amp=False):
        '''
        :param use_amp: True enables mixed-precision training
        '''
        super(MyNet, self).__init__()
        self.use_amp = use_amp

    def forward(self, input):
        if self.use_amp:
            # Enable automatic mixed precision
            with autocast():
                return self.forward_calculation(input)
        else:
            return self.forward_calculation(input)

    def forward_calculation(self, input):
        # Template: the actual forward computation goes here and produces `feature`
        ...
        ...
        return feature
--------------------------------------------------------------------------------
/AMP/README.md:
--------------------------------------------------------------------------------
# AMP: Automatic Mixed Precision

## Notes
- Benefits: faster training with lower memory use, which allows a larger batch size
- Training example: code combining DataParallel with gradient accumulation

## Caveats
- Models saved under AMP are still FP32
- Under AMP the model keeps two copies of the weights:

  the FP16 weights are used for the forward/backward computation (to speed up training), and the parameter updates are applied to the FP32 master weights
- For faster inference, and if the accuracy loss is acceptable, manually call half() on both the image and the model to cast them to FP16; inference is then GPU-only (see the sketch below)
- [Inference issue](https://github.com/jefflomax/pytorch-fizzbuzz-amp/issues/1#issuecomment-719125063)

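A minimal FP16 inference sketch, assuming a trained FP32 `model` and a preprocessed input tensor `img` (both names are placeholders):

```python
import torch

# Cast both the model and the input to FP16; inference must then run on GPU
model = model.half().to("cuda:0").eval()
img = img.half().to("cuda:0")
with torch.no_grad():
    out = model(img)  # FP16 forward pass
```
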
## Environment

| Python | PyTorch | OS |
|--------|---------|--------|
| 3.6 | >=1.6.0 | Ubuntu |


## References
[PyTorch docs: AMP examples](https://pytorch.org/docs/stable/notes/amp_examples.html)

[Automatic Mixed Precision recipe](https://pytorch.org/tutorials/recipes/recipes/amp_recipe.html#advanced-topics)

[Mixed-precision acceleration with Apex](https://zhuanlan.zhihu.com/p/79887894)

[Paper notes: Mixed Precision Training](https://zhuanlan.zhihu.com/p/163493798)
--------------------------------------------------------------------------------
/ModelConver/Pytorch->ONNX.md:
--------------------------------------------------------------------------------
## Pytorch->ONNX

Example repo: [Face-Detector-1MB-with-landmark](https://github.com/biubug6/Face-Detector-1MB-with-landmark)

1. Verify the outputs

In convert_to_onnx.py:

```python
# The RetinaFace network has three outputs: bbox, class confidence, and landmarks.
# Change output_names = ["output0"] to:
output_names = ["bbox", "prob", "landmark"]
```

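For context, the export call then looks roughly like the sketch below; `net`, `dummy_input`, and the input name are placeholders for the values actually used in convert_to_onnx.py:

```python
import torch

torch.onnx.export(net,                       # model to export
                  dummy_input,               # example input that fixes the shapes
                  "faceDetector.onnx",       # output file
                  input_names=["input0"],    # placeholder input name
                  output_names=["bbox", "prob", "landmark"])  # the three renamed outputs
```
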
2. Convert to ONNX

Note: opset_version=11 combined with the ONNX slimming step below leads to abnormal inference results.

```shell
# generates faceDetector.onnx
python convert_to_onnx.py --trained_model ./weights/mobilenet0.25_Final.pth --network mobile0.25
```

3. Slim down the ONNX model

```shell
# install onnx-simplifier
pip3 install -U pip && pip3 install onnx-simplifier
# generates faceDetector_sim.onnx
python3 -m onnxsim faceDetector.onnx faceDetector_sim.onnx
```

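onnx-simplifier can also be driven from Python, which is convenient inside a conversion script; a short sketch:

```python
import onnx
from onnxsim import simplify

model = onnx.load("faceDetector.onnx")
model_simp, check = simplify(model)  # check is True if the simplified model validates
assert check, "Simplified ONNX model could not be validated"
onnx.save(model_simp, "faceDetector_sim.onnx")
```
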
**References**

[onnx-simplifier](https://github.com/daquexian/onnx-simplifier)

--------------------------------------------------------------------------------
/TensorRT/readme.md:
--------------------------------------------------------------------------------

# TensorRT Best Practices


# Examples
- TensorRT API
  - [Minimal example](./lenet.py)
  - See [TensorRTx](https://github.com/wang-xinyu/tensorrtx) for more

- Parsing ONNX
  - [Fixed input shape](./main.py)
  - Dynamic shapes: to be added


# Overall Pipeline
## 1. Build the engine
![build](imgs/build.svg)
## 2. Inference
![infer](imgs/infer.svg)



## Third-Party Libraries
- [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt)
- [TRTorch](https://github.com/NVIDIA/TRTorch)
> These convert Torch models to TRT directly, but support few operators and are not general-purpose; a usage sketch follows below.
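
For illustration, torch2trt's basic usage as documented in its README (a sketch; requires a CUDA GPU and the torch2trt package):

```python
import torch
from torch2trt import torch2trt
from torchvision.models.alexnet import alexnet

# create a regular PyTorch model and an example input
model = alexnet(pretrained=True).eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()

# convert to a TensorRT-backed module and call it like the original model
model_trt = torch2trt(model, [x])
y_trt = model_trt(x)
```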

## References

- [TensorRT deployment](http://zengzeyu.com/2020/07/09/tensorrt_01_installation/)
- [Common TensorRT deployment errors](https://blog.csdn.net/QFJIZHI/article/details/107335865)
- [Accelerating PyTorch with TensorRT](https://blog.csdn.net/leviopku/article/details/112963733)
- [TensorRTx](https://github.com/wang-xinyu/tensorrtx)
- [TensorRT: Deep Learning Inference Acceleration](https://www.nvidia.cn/content/dam/en-zz/zh_cn/assets/webinars/oct16/Gary_TensorRT_GTCChina2019.pdf)
--------------------------------------------------------------------------------
/DDP/readme.md:
--------------------------------------------------------------------------------
# DistributedDataParallel

## Overview
- A minimal implementation of distributed data parallel (DDP)
- Works for single-node multi-GPU and multi-node multi-GPU training

## Usage

### Template
```shell
# Training starts only after every node has executed the command
python ddp.py --nodes NUM_NODES --gpus GPUS_PER_NODE --nr NODE_RANK --ip MASTER_NODE_IP
```

### Single node, multiple GPUs
Node ip=192.168.3.8
```shell
CUDA_VISIBLE_DEVICES=0,1 python ddp.py --nodes 1 --gpus 2 --nr 0 --ip 192.168.3.8
```

### Multiple nodes, multiple GPUs
Master node ip=192.168.3.8
```shell
# master node
CUDA_VISIBLE_DEVICES=0,1 python ddp.py --nodes 2 --gpus 2 --nr 0 --ip 192.168.3.8
# worker node
CUDA_VISIBLE_DEVICES=0,1 python ddp.py --nodes 2 --gpus 2 --nr 1 --ip 192.168.3.8
```


## Common Issues

1. batch_size

> effective batch = per-GPU batch * total number of GPUs

2. Validation and saving

> Validation: make sure different processes write logs with different names, and visualize only the rank=0 log.
> Saving: save the model only on rank=0 (a sketch follows below).

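A minimal sketch of the rank-aware pattern (`rank` and `model` come from the training function, as in ddp.py):

```python
# each process writes its own log file
log_path = "train_rank{}.log".format(rank)

# only the master process saves the checkpoint
if rank == 0:
    torch.save(model.module.state_dict(), "ddp.pth")
```
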
3. Data loading

- If the DataLoader reads from LMDB and you hit the following error:

```python
TypeError: can't pickle Environment objects
```

> Fix: set num_workers=0 in the DataLoader.

- If the DataLoader reads data some other way and you hit the following error:

```python
AttributeError: Can't pickle local object 'DataLoader.__init__.<locals>.<lambda>'
```

> Fix: replace lambda x: Image.fromarray(x) with Image.fromarray, as in the sketch below.

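A concrete sketch of the lambda fix in a torchvision transform pipeline (the transform chain itself is illustrative):

```python
from PIL import Image
from torchvision import transforms

# Fails with num_workers > 0: an inline lambda cannot be pickled
# transform = transforms.Compose([transforms.Lambda(lambda x: Image.fromarray(x)),
#                                 transforms.ToTensor()])

# Works: pass the named function object directly
transform = transforms.Compose([transforms.Lambda(Image.fromarray),
                                transforms.ToTensor()])
```
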
4. Synchronized BatchNorm

```python
# only supported under DDP
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
```

# References
[Distributed data parallel training in Pytorch](https://yangkky.github.io/2019/07/08/distributed-pytorch-tutorial.html) (recommended!)

[distributed_tutorial](https://github.com/yangkky/distributed_tutorial/blob/master/src/mnist-distributed.py)

[A concise tutorial on PyTorch distributed training](https://zhuanlan.zhihu.com/p/113694038)

[PyTorch distributed training](https://zhuanlan.zhihu.com/p/76638962)

[discuss.pytorch](https://discuss.pytorch.org/t/cant-pickle-local-object-dataloader-init-locals-lambda/31857)
--------------------------------------------------------------------------------
/ModelConver/ONNX->MNN.md:
--------------------------------------------------------------------------------
## ONNX->MNN

### Building MNN

1. Install Homebrew

```shell
# macOS 10.15.7
/bin/zsh -c "$(curl -fsSL https://gitee.com/cunkai/HomebrewCN/raw/master/Homebrew.sh)"
```

2. Build MNN

[Official build guide](https://www.yuque.com/mnn/cn/demo_project)

```shell
cd   # enter the MNN root directory
# generate schema (optional)
cd schema && ./generate.sh

# build
cd   # back to the MNN root
mkdir build && cd build
# enable building the demos and the model converter
# https://www.yuque.com/mnn/cn/cmake_opts (prefix each CMake option with the letter D)
cmake -DMNN_BUILD_DEMO=ON -DMNN_BUILD_CONVERTER=ON ..
make -j8
```

### Conversion

1. Convert to mnn

[Official conversion guide](https://www.yuque.com/mnn/cn/model_convert)

```shell
cd /build/
# generates retinaface.mnn
./MNNConvert -f ONNX --modelFile faceDetector_sim.onnx --MNNModel retinaface.mnn --bizCode biz
```

### C++ Inference

1. Verify the outputs

(1) Download the [RetinaFace_MNN](https://github.com/ItchyHiker/RetinaFace_MNN) inference project

(2) Modify the code

- `CMakeLists.txt`

```cmake
# OpenCV path: installed to /usr/local/Cellar/opencv by default
# MNN path: change to your local MNN root directory
set(OpenCV_DIR /usr/local/Cellar/opencv/4.5.2/lib)
set(OpenCV_INCLUDE_DIRS /usr/local/Cellar/opencv/4.5.2/include/opencv4/)
set(MNN_DIR /build/libMNN.dylib)
set(MNN_INCLUDE_DIRS /include)
```

- `main.cpp`: update the model and test-image paths on lines 12 and 13
- `retinaface.cpp`: change the keys on lines 26~29 to "input0", "prob", "bbox", and "landmark" respectively, matching the keys used before conversion.

(3) Optional

- anchor ratios: `retinaface.cpp`, lines 128, 130, 132
- image size: `main.cpp`, line 15

2. Build the project

```shell
cd   # enter the inference project root directory
mkdir -p build
cd build
cmake .. # generate the Makefile
make -j4 # build according to the Makefile
# produces the executable RetinaFace
# verify
./RetinaFace
```

![result](imgs/mnn.jpg)

**References**

[MNN](https://github.com/alibaba/MNN)

[MNN docs](https://www.yuque.com/mnn/cn/cmake_opts)

--------------------------------------------------------------------------------
/AMP/main.py:
--------------------------------------------------------------------------------
from torch.cuda.amp import autocast
from torch.cuda.amp import GradScaler
import torch
from net import MyNet

# NOTE: this file is a training template; optimizer, dataloader_train,
# loss_function, resume_train and checkpoint are assumed to be defined elsewhere.

def start_train():
    '''
    Training
    '''
    use_amp = True
    # Run forward/backward N times before each parameter update.
    # Purpose: enlarge the batch (effective batch = batch_size * N)
    iter_size = 8

    myNet = MyNet(use_amp).to("cuda:0")
    myNet = torch.nn.DataParallel(myNet, device_ids=[0, 1])  # data parallelism
    myNet.train()
    # Initialize the gradient scaler before training starts
    scaler = GradScaler() if use_amp else None

    # Load pretrained weights
    if resume_train:
        scaler.load_state_dict(checkpoint['scaler'])  # needed for AMP
        optimizer.load_state_dict(checkpoint['optimizer'])
        myNet.load_state_dict(checkpoint['model'])


    for epoch in range(1, 100):
        for batch_idx, (input, target) in enumerate(dataloader_train):

            # Move the data to the primary GPU of the data-parallel model
            input = input.to("cuda:0")
            target = target.to("cuda:0")

            # Automatic mixed-precision training
            if use_amp:
                # Autocast automatically casts supported ops to FP16
                with autocast():
                    # extract features
                    feature = myNet(input)
                    losses = loss_function(target, feature)
                    loss = losses / iter_size
                scaler.scale(loss).backward()
            else:
                feature = myNet(input)
                losses = loss_function(target, feature)
                loss = losses / iter_size
                loss.backward()

            # Update parameters only after accumulating gradients
            if (batch_idx + 1) % iter_size == 0:
                # gradient update
                if use_amp:
                    scaler.step(optimizer)
                    scaler.update()
                else:
                    optimizer.step()
                # zero the gradients
                optimizer.zero_grad()
        # The scaler is stateful and must be saved/restored when resuming training
        state = {'model': myNet.state_dict(), 'optimizer': optimizer.state_dict(), 'scaler': scaler.state_dict()}
        torch.save(state, "filename.pth")

def start_test():
    '''
    Testing
    '''
    # Initialize the network and load the pretrained model
    myNet = MyNet().to("cuda:0")
    myNet.eval()
    with torch.no_grad():
        input = input.to("cuda:0")  # input: a preprocessed image tensor, assumed to be prepared elsewhere

        # For faster inference, and if the accuracy loss is acceptable, manually
        # cast both image and model to FP16 with half(); inference is then GPU-only
        # input = input.half()
        # myNet = myNet.half()
        feature = myNet(input)
--------------------------------------------------------------------------------
/TensorRT/main.py:
--------------------------------------------------------------------------------
import onnx
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
import torch
import time
import torchvision
import numpy as np
import os
current_path = os.path.abspath(os.path.dirname(__file__))
from trt_com import Torch_to_ONNX, ONNX_to_TensorRT, Init_TensorRT, Do_Inference


batch_size = 3  # fixed batch size, e.g. 1, 6, 8...

class ONNX_Config():
    '''
    ONNX parameters
    '''
    input_size = [batch_size, 3, 224, 224]  # input shape
    device_id = "cuda:0"
    onnx_path = current_path + "/model.onnx"  # where to save the onnx model

class TensorRT_Config():
    '''
    TensorRT parameters
    '''
    output_size = [batch_size, 1000]  # output shape; resnet18 outputs 1000 classes
    fp16_mode = True  # whether to use FP16; depends on the hardware
    trt_path = current_path + "/model_fp16_{}.trt".format(fp16_mode)  # where to save the TRT engine

if __name__ == "__main__":
    # ============1. Pytorch->ONNX============
    onnx_cfg = ONNX_Config()  # ONNX conversion parameters
    device = torch.device(onnx_cfg.device_id)
    # initialize the PyTorch model
    torch_net = torchvision.models.resnet18(pretrained=True).to(device)
    torch_net.eval()
    # convert to an ONNX model
    Torch_to_ONNX(torch_net, onnx_cfg.input_size, onnx_cfg.onnx_path, device)


    # ============2. ONNX->TensorRT============
    trt_cfg = TensorRT_Config()  # TensorRT conversion parameters
    ONNX_to_TensorRT(trt_cfg.fp16_mode, onnx_cfg.onnx_path, trt_cfg.trt_path)


    # ============3. TRT inference============
    img_np_nchw = np.ones(tuple(onnx_cfg.input_size), dtype=np.float32)  # input data

    [context, inputs, outputs, bindings, stream] = Init_TensorRT(trt_cfg.trt_path)  # load the engine
    inputs[0].host = img_np_nchw.reshape(-1)  # bind the input data as a flat numpy array
    # inputs[1].host = ...  # for networks with multiple inputs

    t0 = time.time()
    output = Do_Inference(context, bindings, inputs, outputs, stream)  # list; len=1 if the network has a single output
    t1 = time.time()
    output = output[0].reshape(*trt_cfg.output_size)  # reshape the flat array back to the output shape

    # ============4. Torch inference============
    input = torch.from_numpy(img_np_nchw).to(device)
    t2 = time.time()
    output_torch = torch_net(input)
    t3 = time.time()

    # ============5. Compute the error============
    mse = np.mean((output - output_torch.cpu().detach().numpy()) ** 2)

    print('MSE Error = {}'.format(mse))
    print("Inference time with the TensorRT engine: {}".format(t1 - t0))
    print("Inference time with the PyTorch model: {}".format(t3 - t2))
    print('All completed!')
--------------------------------------------------------------------------------
/ModelConver/ONNX->NCNN.md:
--------------------------------------------------------------------------------
## 1. ONNX->NCNN

Example repo: [Face-Detector-1MB-with-landmark](https://github.com/biubug6/Face-Detector-1MB-with-landmark)

### Building NCNN

1. Install Homebrew

```shell
# macOS 10.15.7
/bin/zsh -c "$(curl -fsSL https://gitee.com/cunkai/HomebrewCN/raw/master/Homebrew.sh)"
```

2. Install third-party dependencies

```shell
brew install cmake
brew install protobuf
brew install opencv # pulls in many dependencies; default install path /usr/local/Cellar/opencv
```

3. Build NCNN

```shell
cd   # enter the ncnn root directory
mkdir -p build
cd build
cmake .. # generate the Makefile
make # build according to the Makefile
make install # creates the install folder
```

### Conversion

1. Convert to ncnn

```shell
cd /build/tools/onnx
./onnx2ncnn faceDetector_sim.onnx face.param face.bin
```

### C++ Inference

1. Verify the outputs

(1) Copy the files from `/build/install` into the `Face-Detector-1MB-with-landmark/Face_Detector_ncnn/ncnn` directory

(2) Move face.param and face.bin into the `Face_Detector_ncnn/model` directory

(3) In `Face_Detector_ncnn/FaceDetector.cpp`, change the keys on lines 53, 56, and 59 to "bbox", "prob", and "landmark" respectively, matching the keys used before conversion.

(4) In `Face_Detector_ncnn/main.cpp`, change false to true if you are using the retinaface model.

(5) Optional

- anchor ratios: `FaceDetector.cpp`, line 202
- image size: `main.cpp`, line 27

2. Build the project

Set the OpenCV path in `Face_Detector_ncnn/CMakeLists.txt`:

```shell
set(OpenCV_DIR "/usr/local/Cellar/opencv/4.5.2/")
```

Building will then fail with the following errors:

```shell
cmake ..
make -j4 # build with 4 threads

# errors: opencv2 not found on the include path
fatal error: 'opencv2/opencv.hpp' file not found
fatal error: 'opencv2/core/core.hpp' file not found

# cause
# the opencv2 include path is /usr/local/Cellar/opencv/4.5.2/include/opencv4/
```

3. Fix

```cmake
# change two paths in CMakeLists.txt
line 19: ${OpenCV_DIR}/include -> /usr/local/Cellar/opencv/4.5.2/include/opencv4/
line 22: ${OpenCV_DIR}/lib -> /usr/local/Cellar/opencv/4.5.2/lib
```

```shell
# build
cmake ..
make -j4
# produces the executable FaceDetector
# verify
./FaceDetector
```

![result](imgs/ncnn.jpeg)



## 2. NCNN Optimization

Purpose: (1) optimize the model by fusing operators (2) convert FP32->FP16

```shell
cd /build/tools/
# flag: 0 for FP32, 1 for FP16
./ncnnoptimize ncnn.param ncnn.bin new.param new.bin flag
```



**References**

[Building NCNN on macOS](https://www.bilibili.com/read/cv10224407/)

[NCNN](https://github.com/Tencent/ncnn)

[The NCNN Optimize tool](https://www.cnblogs.com/wanggangtao/p/11313705.html)

--------------------------------------------------------------------------------
/ModelConver/imgs/process.svg:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/DDP/ddp.py:
--------------------------------------------------------------------------------
import os
import argparse
import torch.multiprocessing as mp
import torchvision
import torchvision.transforms as transforms
import torch
import torch.nn as nn
import torch.distributed as dist


class ConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=2),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))
        self.fc = nn.Linear(7*7*32, num_classes)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--nodes', default=2, type=int)  # number of nodes
    parser.add_argument('--gpus', default=2, type=int)  # number of GPUs per node
    parser.add_argument('--nr', default=0, type=int)  # rank of this node among all nodes
    parser.add_argument('--batch', default=128, type=int)  # total (effective) batch, split evenly across all GPUs
    parser.add_argument('--ip', default=None, type=str)  # master node ip
    parser.add_argument('--savepath', default=None, type=str)  # optional path to pretrained weights

    args = parser.parse_args()
    args.world_size = args.gpus * args.nodes  # world_size, i.e. total processes == total GPUs (one process per GPU)
    os.environ['MASTER_ADDR'] = args.ip  # master node (master process), used by all processes to synchronize gradients
    os.environ['MASTER_PORT'] = '8886'  # port the master process uses for communication; any free port works

    # Each node launches all of its own processes; each runs train(i, args) with i from 0 to args.gpus-1
    # nprocs: number of processes mp.spawn starts
    # args: extra arguments passed to train
    mp.spawn(train, nprocs=args.gpus, args=(args,))


def train(pid, args):
    '''
    Started via mp.spawn; train receives the node-local subprocess id pid plus the extra arguments
    '''
    # One process per GPU, so the node-local subprocess id == the node-local GPU index
    gpu = pid

    # Compute this process's global rank; every process needs the total process count
    # and its own position to know which GPU to use
    # rank=0 is the master process, used for saving the model and printing logs
    rank = args.nr * args.gpus + gpu

    # Initialize the distributed environment
    # env://: initialize from environment variables; requires MASTER_PORT, MASTER_ADDR, WORLD_SIZE, RANK
    dist.init_process_group(backend='nccl',
                            init_method='env://',
                            world_size=args.world_size,
                            rank=rank)

    torch.manual_seed(0)
    model = ConvNet()

    # Load weights
    if args.savepath:
        print('loading weights')
        pass

    # Before wrapping with DDP, enable synchronized BN (convert the network's BatchNorm layers to SyncBatchNorm)
    model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)

    torch.cuda.set_device(gpu)  # the GPU this process is responsible for
    model.cuda(gpu)
    batch_size = int(args.batch / args.world_size)  # per-GPU batch; total effective batch = per-GPU batch * total processes (total GPUs)

    criterion = nn.CrossEntropyLoss().cuda(gpu)
    optimizer = torch.optim.SGD(model.parameters(), 1e-4)

    # Wrap the GPU model as a DDP model
    model = nn.parallel.DistributedDataParallel(model, device_ids=[gpu])

    # Load the data
    train_dataset = torchvision.datasets.MNIST(root='./data',
                                               train=True,
                                               transform=transforms.ToTensor(),
                                               download=True)
    # Sampler: splits the dataset into world_size chunks and feeds a different chunk to each process
    train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset,
                                                                    num_replicas=args.world_size,
                                                                    rank=rank)
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                               batch_size=batch_size,
                                               shuffle=False,  # ignored under DDP; train_sampler handles shuffling
                                               num_workers=0,  # 0 under DDP, otherwise reading may fail
                                               pin_memory=True,
                                               sampler=train_sampler)  # sampler

    for epoch in range(10):
        # Reshuffle through the sampler each epoch so the data split differs
        train_sampler.set_epoch(epoch)

        for i, (images, labels) in enumerate(train_loader):
            images = images.cuda(non_blocking=True)
            labels = labels.cuda(non_blocking=True)

            outputs = model(images)
            loss = criterion(outputs, labels)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if (i + 1) % 100 == 0 and gpu == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(epoch + 1, 10, i + 1, len(train_loader),
                                                                         loss.item()))

        # ===validation===
        # Make sure each process logs to a differently named file; visualize only the rank=0 log
        # acc=eval()


    # Only the master process saves the model
    if rank == 0:
        # save the underlying module so the weights load without the DDP "module." prefix
        torch.save(model.module.state_dict(), 'ddp.pth')


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
/TensorRT/trt_com.py:
--------------------------------------------------------------------------------
import onnx
import onnxruntime
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
import torch
import numpy as np
import os
'''
Shared helper code
'''

def Init_TensorRT(trt_path):
    '''
    Initialize the TensorRT engine
    trt_path: path to the trt file
    '''
    # load the cuda engine
    engine = load_engine(trt_path)
    # after creating the CudaEngine, set up an execution context for the device
    context = engine.create_execution_context()
    inputs, outputs, bindings, stream = allocate_buffers(engine)  # input, output: host # bindings
    return [context, inputs, outputs, bindings, stream]

def load_engine(trt_path):
    """
    Load the cuda engine
    trt_path: TensorRT engine file
    """
    TRT_LOGGER = trt.Logger()

    # if a serialized engine already exists, deserialize it directly into a cudaEngine
    if os.path.exists(trt_path):
        print("Reading engine from file: {}".format(trt_path))
        with open(trt_path, 'rb') as f, \
                trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())  # deserialize
    else:
        print('Not found: ' + trt_path)
        raise FileNotFoundError(trt_path)


def allocate_buffers(engine):
    '''
    Allocate TRT buffers
    '''
    class HostDeviceMem(object):
        def __init__(self, host_mem, device_mem):
            """
            host_mem: cpu memory
            device_mem: gpu memory
            """
            self.host = host_mem  # host data
            self.device = device_mem  # GPU data

        def __str__(self):
            return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

        def __repr__(self):
            return self.__str__()

    inputs, outputs, bindings = [], [], []
    stream = cuda.Stream()
    for binding in engine:
        # print(binding)  # the bound input/output tensors
        # print(engine.get_binding_shape(binding))  # get_binding_shape returns the binding's shape
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        # volume computes the number of elements for a shape
        # size = trt.volume(engine.get_binding_shape(binding))  # use this line for a fixed-batch onnx model
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # get_binding_dtype returns the binding's data type
        # nptype maps it to the equivalent numpy dtype
        # allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)  # allocate page-locked host memory
        device_mem = cuda.mem_alloc(host_mem.nbytes)  # allocate GPU memory
        # print(int(device_mem))  # the binding's buffer address in the graph
        bindings.append(int(device_mem))
        # append to the appropriate list
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))  # bind input
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))  # bind output
    return inputs, outputs, bindings, stream


def Do_Inference(context, bindings, inputs, outputs, stream):
    '''
    Run inference
    '''
    # htod: host to device; copy the data from the host to the GPU
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]

    # Run inference.
    context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
    # dtoh: device to host
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]

    # Synchronize the stream; results are only available afterwards
    stream.synchronize()

    # return the predictions as flat numpy arrays
    return [out.host for out in outputs]


def Torch_to_ONNX(net, input_size, onnx_path, device):
    '''
    torch->onnx (fixed input shape only)
    input_size: input shape, e.g. [N,3,224,224]
    onnx_path: where to save the onnx weights
    device: "cuda:0"
    '''
    net.to(device)
    net.eval()
    # convert to ONNX
    torch.onnx.export(net,  # the network to convert, including its parameters
                      torch.randn(tuple(input_size), device=device),  # dummy input that fixes the input shape and the shape of every node in the graph
                      onnx_path,  # output file path
                      verbose=False,  # whether to print the graph as a string
                      input_names=["input"],
                      output_names=["output"],  # names of the output nodes
                      opset_version=13,  # onnx operator-set version
                      do_constant_folding=True,  # whether to fold constants
                      )


    # validate the model
    net = onnx.load(onnx_path)  # load the onnx graph
    onnx.checker.check_model(net)  # check that the model file is well-formed
    onnx.helper.printable_graph(net.graph)  # printable form of the onnx graph

    # ONNX inference
    session = onnxruntime.InferenceSession(onnx_path)  # create an inference session
    output = session.run(None, {"input": np.random.rand(*input_size).astype('float32')})  # inputs must be numpy arrays

    print('ONNX file in ' + onnx_path)
    print('============Pytorch->ONNX SUCCESS============')


def ONNX_to_TensorRT(fp16_mode=False, onnx_path=None, trt_path=None, max_batch_size=1):
    """
    Build the cudaEngine and save the engine file (fixed input shape only)

    max_batch_size: defaults to 1; dynamic batch is not supported
    fp16_mode: True for fp16 inference
    onnx_path: path of the onnx weights to load
    trt_path: where to save the trt engine file
    """
    # the logger reports errors, warnings, and info messages
    TRT_LOGGER = trt.Logger()

    explicit_batch = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)


    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(explicit_batch) as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30  # pre-allocated workspace, i.e. the most GPU memory the ICudaEngine may use at execution time
        builder.max_batch_size = max_batch_size  # largest batch size usable at execution time
        builder.fp16_mode = fp16_mode

        # ########parse the onnx file and populate the graph#########
        if not os.path.exists(onnx_path):
            quit("ONNX file {} not found!".format(onnx_path))
        print('loading onnx file from path {} ...'.format(onnx_path))
        with open(onnx_path, 'rb') as model:
            print("Beginning onnx file parsing")
            parser.parse(model.read())  # the OnnxParser builds the network for the network object and fills in the weights
        print("Completed parsing of onnx file")

        # ########build the engine from the graph#########
        print("Building an engine from file {}; this may take a while...".format(onnx_path))
        output_shape = network.get_layer(network.num_layers - 1).get_output(0).shape  # shape of the last layer's output
        # network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))  # mark the output
        engine = builder.build_cuda_engine(network)  # build the engine
        print("Completed creating Engine")

        # save the engine so it can be loaded directly later
        with open(trt_path, 'wb') as f:
            f.write(engine.serialize())  # serialize

        print('TensorRT file in ' + trt_path)
        print('============ONNX->TensorRT SUCCESS============')
/TensorRT/lenet.py:
--------------------------------------------------------------------------------
'''
Minimal lenet example from https://github.com/wang-xinyu/tensorrtx
'''

import argparse
import os
import struct
import sys

import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt

INPUT_H = 32  # input height
INPUT_W = 32  # input width
OUTPUT_SIZE = 10  # output shape: 10 classes
INPUT_BLOB_NAME = "data"  # name of the input blob (binary object)
OUTPUT_BLOB_NAME = "prob"  # name of the output blob

weight_path = "./lenet5.wts"  # binary weights
engine_path = "./lenet5.engine"  # where to save the trt engine

gLogger = trt.Logger(trt.Logger.INFO)  # logger reporting errors/warnings/info (Builder/ICudaEngine/Runtime)


def load_weights(file):
    '''Load the binary weight file'''
    print(f"Loading weights: {file}")

    assert os.path.exists(file), 'Unable to load weight file.'

    weight_map = {}
    with open(file, "r") as f:
        lines = [line.strip() for line in f]
    count = int(lines[0])
    assert count == len(lines) - 1
    for i in range(1, count + 1):  # iterate over every line
        splits = lines[i].split(" ")
        name = splits[0]  # first value: layer name
        cur_count = int(splits[1])  # second value: number of parameters on this line
        assert cur_count + 2 == len(splits)
        values = []  # parameters on this line
        for j in range(2, len(splits)):
            # hex string to bytes to float
            values.append(struct.unpack(">f", bytes.fromhex(splits[j])))
        weight_map[name] = np.array(values, dtype=np.float32)

    return weight_map


def createLenetEngine(maxBatchSize, builder, config, dt):
    '''
    Build the network engine
    dt: fp32 or fp16
    '''


    weight_map = load_weights(weight_path)  # load the binary weights
    network = builder.create_network()  # create the network object

    data = network.add_input(INPUT_BLOB_NAME, dt, (1, INPUT_H, INPUT_W))  # declare the network input name and shape
    assert data
    # ============define the network============
    # convolution
    conv1 = network.add_convolution(input=data,  # input tensor
                                    num_output_maps=6,  # output channels
                                    kernel_shape=(5, 5),  # kernel size
                                    kernel=weight_map["conv1.weight"],  # kernel weights [out_channels, in_channels, kernel_height, kernel_width]
                                    bias=weight_map["conv1.bias"])  # bias weights [out_channels]
    assert conv1
    conv1.stride = (1, 1)  # convolution stride

    # activation
    relu1 = network.add_activation(conv1.get_output(0),  # output of the previous conv layer
                                   type=trt.ActivationType.RELU)
    assert relu1

    # pooling
    pool1 = network.add_pooling(input=relu1.get_output(0),  # output of the previous activation
                                window_size=trt.DimsHW(2, 2),  # pooling window size
                                type=trt.PoolingType.AVERAGE)  # average pooling
    assert pool1
    pool1.stride = (2, 2)  # pooling stride

    conv2 = network.add_convolution(pool1.get_output(0), 16, trt.DimsHW(5, 5),
                                    weight_map["conv2.weight"],
                                    weight_map["conv2.bias"])
    assert conv2
    conv2.stride = (1, 1)

    relu2 = network.add_activation(conv2.get_output(0),
                                   type=trt.ActivationType.RELU)
    assert relu2

    pool2 = network.add_pooling(input=relu2.get_output(0),
                                window_size=trt.DimsHW(2, 2),
                                type=trt.PoolingType.AVERAGE)
    assert pool2
    pool2.stride = (2, 2)

    # fully connected layers
    fc1 = network.add_fully_connected(input=pool2.get_output(0),
                                      num_outputs=120,
                                      kernel=weight_map['fc1.weight'],
                                      bias=weight_map['fc1.bias'])
    assert fc1

    relu3 = network.add_activation(fc1.get_output(0),
                                   type=trt.ActivationType.RELU)
    assert relu3

    fc2 = network.add_fully_connected(input=relu3.get_output(0),
                                      num_outputs=84,
                                      kernel=weight_map['fc2.weight'],
                                      bias=weight_map['fc2.bias'])
    assert fc2

    relu4 = network.add_activation(fc2.get_output(0),
                                   type=trt.ActivationType.RELU)
    assert relu4

    fc3 = network.add_fully_connected(input=relu4.get_output(0),
                                      num_outputs=OUTPUT_SIZE,
                                      kernel=weight_map['fc3.weight'],
                                      bias=weight_map['fc3.bias'])
    assert fc3

    prob = network.add_softmax(fc3.get_output(0))  # softmax
    assert prob

    prob.get_output(0).name = OUTPUT_BLOB_NAME  # name the network output so predictions can be fetched by name
    network.mark_output(prob.get_output(0))  # mark the tensor as an output

    # Build engine
    builder.max_batch_size = maxBatchSize
    # builder.max_workspace_size = 1 << 20
    config.max_workspace_size = 1 << 20
    engine = builder.build_engine(network, config)

    del network
    del weight_map

    return engine


def APIToModel(maxBatchSize):
    '''Convert the binary weights into a trt engine'''
    builder = trt.Builder(gLogger)  # builder object
    config = builder.create_builder_config()  # configuration for the builder
    engine = createLenetEngine(maxBatchSize, builder, config, trt.float32)
    assert engine  # the engine must not be None

    # save as a trt engine file
    with open(engine_path, "wb") as f:
        f.write(engine.serialize())

    del engine
    del builder


def doInference(context, host_in, host_out, batchSize):
    '''
    trt inference

    host_in: input data
    host_out: empty numpy array that receives the output
    '''
    engine = context.engine
    assert engine.num_bindings == 2  # number of bound tensors: 1 input + 1 output

    devide_in = cuda.mem_alloc(host_in.nbytes)  # allocate GPU memory for the input; returns a device allocation handle
    devide_out = cuda.mem_alloc(host_out.nbytes)
    bindings = [int(devide_in), int(devide_out)]
    stream = cuda.Stream()  # multiple streams can run in parallel

    cuda.memcpy_htod_async(devide_in, host_in, stream)  # copy host memory to the GPU; htod = host_to_device
    context.execute_async(bindings=bindings, stream_handle=stream.handle)  # run inference asynchronously on the GPU
    cuda.memcpy_dtoh_async(host_out, devide_out, stream)  # copy GPU data back to host memory; dtoh = device_to_host
    stream.synchronize()  # host_out holds the predictions after the stream synchronizes


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", default=False, action='store_true')
    parser.add_argument("-d", default=False, action='store_true')  # default must be False, otherwise -s ^ -d can never be satisfied
    args = parser.parse_args()

    if not (args.s ^ args.d):
        print("arguments not right!")
        print("python lenet.py -s # serialize model to plan file")  # convert the binary weights into a trt engine
        print("python lenet.py -d # deserialize plan file and run inference")  # load the trt engine and run inference
        sys.exit()

    if args.s:
        APIToModel(1)
    else:
        runtime = trt.Runtime(gLogger)  # create the trt runtime in order to load the engine
        assert runtime

        with open(engine_path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())  # load the trt engine
        assert engine

        context = engine.create_execution_context()  # create the execution context
        assert context

        data = np.ones((INPUT_H * INPUT_W), dtype=np.float32)  # TRT input is flat [1024]; 1024 = 1*32*32
        host_in = cuda.pagelocked_empty(INPUT_H * INPUT_W, dtype=np.float32)  # page-locked input buffer matching the input's size and dtype
        np.copyto(host_in, data.ravel())  # ravel flattens without copying; copyto fills host_in with the data
        host_out = cuda.pagelocked_empty(OUTPUT_SIZE, dtype=np.float32)  # page-locked output buffer matching the output size
        doInference(context, host_in, host_out, 1)  # host_out holds the result after inference

        print(f'Output: {host_out}')
--------------------------------------------------------------------------------
/TensorRT/imgs/build.svg:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
/TensorRT/imgs/infer.svg:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------