├── README.md
├── structure_CN.md
└── tutorials_img
    ├── ADown.svg
    ├── BottleNeck.svg
    ├── GELAN_in_paper.png
    ├── RepNCSP.svg
    ├── RepNCSPELAN4.svg
    ├── SPPELAN.svg
    ├── inference_structure.svg
    ├── structure_in_paper.png
    └── train_structure.svg


/README.md:
--------------------------------------------------------------------------------
# Language 语言

[English](./structure.md) [简体中文](structure_CN.md)

# Paper Summary

* **Auxiliary Reversible Branch (Training only)**

A reversible architecture preserves complete information, but adding a main branch to a reversible architecture greatly increases inference cost.

Being 'reversible' is not the only necessary condition in the inference stage.

* **Multi-level Auxiliary Information (Training only)**

Each feature pyramid should receive information about all target objects. Multi-level auxiliary information aggregates the gradient information containing all target objects and passes it to the main branch, which then updates its parameters.

* **GELAN Block**

GELAN = CSPNet + ELAN

# Model Structure overview

## Train model structure

This structure is based on `models/detect/yolov9.yaml`.

| ![train_structure](tutorials_img/train_structure.svg) | ![structure_in_paper](tutorials_img/structure_in_paper.png) |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| Train model structure | Train model structure (in paper) |

The ***Auxiliary Reversible Branch*** and the ***Multi-level Auxiliary Information*** exist only in training mode; they help the backbone achieve better performance. In this stage, forward propagation produces the outputs [16, 19, 22, 31, 34, 37], which are fed into the Detect head. By comparing the predictions from [31, 34, 37] with the GT labels, the model obtains more detailed gradient information that helps the [#5, #7, #9] blocks update their weights. So although those branches are dropped in inference mode, the backbone ends up with more robust weights.

## Inference model structure

This structure is based on `models/detect/gelan.yaml`. In fact, this model is derived by pruning the train model (`models/detect/yolov9.yaml`).

```python
Note:
models/detect/gelan.yaml <---> models/detect/yolov9.yaml
models/detect/gelan-c.yaml <---> models/detect/yolov9-c.yaml
models/detect/gelan-e.yaml <---> models/detect/yolov9-e.yaml
```

![train_structure](tutorials_img/inference_structure.svg)

In inference mode, the model structure is similar to previous YOLO versions. Note the re-parameterization and GELAN blocks.

The Detect head (mainly NMS plus some other operations) then produces the object detection results.

## Blocks detail

* **Silence** `models.common.Silence`: Does nothing. It is only used to provide the source input data for the Auxiliary Reversible Branch.

* **CBS** `models.common.Conv`: Conv2d + BatchNorm2d + SiLU (default activation)

Note: The BN layer can be re-parameterized at inference time. (ref: [RepVGG](https://openaccess.thecvf.com/content/CVPR2021/papers/Ding_RepVGG_Making_VGG-Style_ConvNets_Great_Again_CVPR_2021_paper.pdf))

* **ELAN** `models.common.RepNCSPELAN4`:

| ![RepNCSPELAN4](tutorials_img/RepNCSPELAN4.svg) | ![GELAN_in_paper](tutorials_img/GELAN_in_paper.png) |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| RepNCSPELAN4 Block | RepNCSPELAN4 (GELAN in paper) |
| ![RepNCSP](tutorials_img/RepNCSP.svg) | ![train_structure](tutorials_img/BottleNeck.svg) |
| RepNCSP Block | RepNBottleNeck |

* **ELAN-SPP** `models.common.SPPELAN`:

![SPPELAN](tutorials_img/SPPELAN.svg)

* **ADown** `models.common.ADown`:

This block replaces some of the `CBS` blocks in `yolov9-c.yaml` and `yolov9-e.yaml`.

![ADown](tutorials_img/ADown.svg)
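
The CBS note above says the BN layer can be re-parameterized at inference time. As a rough illustration of what that folding means, here is a minimal pure-Python sketch for a single pixel of a 1x1 convolution. It is not the code from `models.common`; the helper names (`conv1x1`, `batchnorm`, `fuse_conv_bn`) and all numeric values are hypothetical and chosen only for the example.

```python
import math

def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution to one pixel: a matrix-vector product over channels."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm: normalize each channel with its running statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + b
            for yi, g, b, m, v in zip(y, gamma, beta, mean, var)]

def fuse_conv_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN scale and shift into the conv weight and bias (hypothetical helper)."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    fused_weight = [[w * s for w in row] for row, s in zip(weight, scale)]
    fused_bias = [(b - m) * s + be for b, m, s, be in zip(bias, mean, scale, beta)]
    return fused_weight, fused_bias

# Hypothetical values: 2 input channels, 2 output channels.
weight = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [1.0, 4.0]
x = [1.0, 2.0]

# Conv followed by BN, versus a single conv with folded parameters.
unfused = batchnorm(conv1x1(x, weight, bias), gamma, beta, mean, var)
fused_weight, fused_bias = fuse_conv_bn(weight, bias, gamma, beta, mean, var)
fused = conv1x1(x, fused_weight, fused_bias)

print(max(abs(a - b) for a, b in zip(unfused, fused)))
```

Because inference-mode BN is a per-channel affine transform, the two paths agree up to floating-point rounding, which is why the separate BN layer can be dropped after fusion.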

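
---

If you find any mistakes, please tell me: divided.by.07@gmail.com

--------------------------------------------------------------------------------
/structure_CN.md:
--------------------------------------------------------------------------------
# 语言 Language

[English](./structure.md) [简体中文](structure_CN.md)

# Paper Summary

The paper proposes PGI (Programmable Gradient Information), the idea that the loss of gradient information during backpropagation needs to be addressed. It introduces three key components:

* **Auxiliary Reversible Branch**

Introducing a reversible architecture preserves complete information, but increasing the backbone's parameter count within a reversible architecture adds considerable inference cost. The authors argue that being 'reversible' is not the only necessary condition in the inference stage, so they design an auxiliary reversible branch. During training it helps the backbone receive richer returned gradient information, which improves the backbone's performance; at inference the branch is discarded, so inference incurs no extra time cost. This module is used in **training mode** only.

* **Multi-level Auxiliary Information**

Each feature pyramid should receive gradient information about all target objects. The gradient information containing all target objects is then aggregated as multi-level auxiliary information and passed to the main branch to update its weights. This module is used in **training mode** only, because the gradients it returns come from the auxiliary reversible branch.

* **GELAN Block**

The GELAN block mainly combines the CSPNet and ELAN structures, and also draws on the re-parameterization method.

GELAN = CSPNet + ELAN

# Model Structure Overview

## Train model structure

This structure is based on `models/detect/yolov9.yaml`.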

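| ![train_structure](tutorials_img/train_structure.svg) | ![structure_in_paper](tutorials_img/structure_in_paper.png) |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| Train model structure | Train model structure (in paper) |

The ***Auxiliary Reversible Branch*** and ***Multi-level Auxiliary Information*** exist only in training mode and help the backbone achieve better performance. During training there are six output feature maps, [16, 19, 22, 31, 34, 37] in the figure above, which are fed into the Detect head to produce the predicted labels. Compared with earlier YOLO versions, the additional predictions from the [31, 34, 37] outputs are compared against the GT labels to compute a loss, and the resulting gradient information flows back through the auxiliary reversible path into the [#5, #7, #9] blocks, updating the backbone's weights.

## Inference model structure

This structure is based on `models/detect/gelan.yaml`. In fact, this model is obtained from `models/detect/yolov9.yaml` by removing the auxiliary branches from the structure.

```python
Note:
models/detect/gelan.yaml <---> models/detect/yolov9.yaml
models/detect/gelan-c.yaml <---> models/detect/yolov9-c.yaml
models/detect/gelan-e.yaml <---> models/detect/yolov9-e.yaml
```

![train_structure](tutorials_img/inference_structure.svg)

In inference mode, the model structure is similar to previous YOLO versions. Note the re-parameterization and GELAN blocks.

The Detect head (mainly NMS plus some other operations) then produces the object detection results.

### Blocks detail

* **Silence** `models.common.Silence`: This block's output equals its input, i.e. it does nothing. Its purpose is to let the Auxiliary Reversible Branch access the original image information.

* **CBS** `models.common.Conv`: Conv2d + BatchNorm2d + SiLU (default activation)

Note: At inference time, the BN layer's parameters can be fused into the convolution layer. In the YOLOv9 code, look for the keyword 'fuse', which is generally related to re-parameterization, e.g. `models.common.Conv.forward_fuse`. (ref: [RepVGG](https://openaccess.thecvf.com/content/CVPR2021/papers/Ding_RepVGG_Making_VGG-Style_ConvNets_Great_Again_CVPR_2021_paper.pdf))

* **ELAN** `models.common.RepNCSPELAN4`:

As the block name suggests, its core is Re-parameter + CSPNet + ELAN.

| ![RepNCSPELAN4](tutorials_img/RepNCSPELAN4.svg) | ![GELAN_in_paper](tutorials_img/GELAN_in_paper.png) |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| RepNCSPELAN4 Block | RepNCSPELAN4 (GELAN in paper) |
| ![RepNCSP](tutorials_img/RepNCSP.svg) | ![train_structure](tutorials_img/BottleNeck.svg) |
| RepNCSP Block | RepNBottleNeck |

* **ELAN-SPP** `models.common.SPPELAN`:

This block is essentially the same as the SPPF structure in earlier YOLO versions, as shown in the figure below.

![SPPELAN](tutorials_img/SPPELAN.svg)

* **ADown** `models.common.ADown`:

This block appears in the `yolov9-c.yaml` and `yolov9-e.yaml` structures, where it replaces some of the `CBS` blocks in the model.

![ADown](tutorials_img/ADown.svg)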
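
The SPPELAN entry above compares the block to SPPF. As a rough illustration of the pooling cascade in that kind of block, here is a minimal pure-Python sketch. It assumes the usual SPPF layout of three chained, stride-1, 5-wide max pools whose outputs are concatenated with the input. It works on a 1D list rather than a 2D feature map and leaves out the convolutions, and the helper names (`max_pool_1d`, `sppf_branches`) are hypothetical, not taken from `models.common`.

```python
def max_pool_1d(values, kernel):
    """Stride-1 max pooling with 'same' padding (out-of-range positions are ignored)."""
    half = kernel // 2
    return [max(values[max(0, i - half): i + half + 1]) for i in range(len(values))]

def sppf_branches(values, kernel=5):
    """Return the four branches an SPPF-style block would concatenate (hypothetical helper)."""
    y1 = max_pool_1d(values, kernel)  # first pool
    y2 = max_pool_1d(y1, kernel)      # pool of the pooled output
    y3 = max_pool_1d(y2, kernel)      # and once more
    return [values, y1, y2, y3]

# Hypothetical 1D "feature map".
signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
branches = sppf_branches(signal)

# Chaining 5-wide pools gives the same result as single pools of width 5, 9 and 13.
print(branches[2] == max_pool_1d(signal, 9))
print(branches[3] == max_pool_1d(signal, 13))
```

Chaining small pools reuses each intermediate result, so the block covers several receptive-field sizes without running a separate large-kernel pool for each one.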


--------------------------------------------------------------------------------
/tutorials_img/ADown.svg:
--------------------------------------------------------------------------------
[SVG markup omitted]

--------------------------------------------------------------------------------
/tutorials_img/BottleNeck.svg:
--------------------------------------------------------------------------------
[SVG markup omitted]

--------------------------------------------------------------------------------
/tutorials_img/GELAN_in_paper.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/divided7/yolov9_structure_graph/68cc7dccd13dc8e4d380ad0e3d2b27a97fd66a91/tutorials_img/GELAN_in_paper.png

--------------------------------------------------------------------------------
/tutorials_img/RepNCSP.svg:
--------------------------------------------------------------------------------
[SVG markup omitted]

--------------------------------------------------------------------------------
/tutorials_img/RepNCSPELAN4.svg:
--------------------------------------------------------------------------------
[SVG markup omitted]

--------------------------------------------------------------------------------
/tutorials_img/SPPELAN.svg:
--------------------------------------------------------------------------------
[SVG markup omitted]

--------------------------------------------------------------------------------
/tutorials_img/structure_in_paper.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/divided7/yolov9_structure_graph/68cc7dccd13dc8e4d380ad0e3d2b27a97fd66a91/tutorials_img/structure_in_paper.png

--------------------------------------------------------------------------------