├── README.md
├── structure_CN.md
└── tutorials_img
    ├── ADown.svg
    ├── BottleNeck.svg
    ├── GELAN_in_paper.png
    ├── RepNCSP.svg
    ├── RepNCSPELAN4.svg
    ├── SPPELAN.svg
    ├── inference_structure.svg
    ├── structure_in_paper.png
    └── train_structure.svg
/README.md:
--------------------------------------------------------------------------------
1 | # Language 语言
2 |
3 | [English](./structure.md) [简体中文](structure_CN.md)
4 |
5 | # Paper Summary
6 |
7 | * **Auxiliary Reversible Branch (Training only)**
8 |
9 | A reversible architecture maintains complete information, but adding a main branch to a reversible architecture greatly increases inference cost.
10 |
11 | 'Reversible' is not the only necessary condition in the inference stage, so the reversible branch is used as an auxiliary branch during training and dropped at inference.
12 |
13 | * **Multi-level Auxiliary Information (Training only)**
14 |
15 | Each feature pyramid level should receive information about all target objects. Multi-level auxiliary information aggregates the gradient information containing all target objects and passes it to the main branch, which then updates its parameters.
16 |
17 | * **GELAN Block**
18 |
19 | GELAN = CSPNet + ELAN
20 |
21 | # Model Structure overview
22 |
23 | ## Train model structure
24 |
25 | This structure is based on `models/detect/yolov9.yaml`.
26 |
27 | | ![](tutorials_img/train_structure.svg) | ![](tutorials_img/structure_in_paper.png) |
28 | | :----------------------------------------------------------: | :----------------------------------------------------------: |
29 | | Train model structure | Train model structure (in paper) |
30 |
31 | The ***Auxiliary Reversible Branch*** and the ***Multi-level Auxiliary Information*** exist only in training mode; they help the backbone achieve better performance. In this stage, forward propagation outputs the feature maps [16, 19, 22, 31, 34, 37], which are all fed into the Detect head. From the predictions on [31, 34, 37] and the GT labels, the model obtains more detailed gradient information that helps blocks [#5, #7, #9] update their weights. So although these branches are dropped in inference mode, the backbone ends up with more robust weights.
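
The split between main and auxiliary outputs described above can be sketched in a few lines. This is only an illustration: the layer indices come from the paragraph above, while the names `TRAIN_OUTPUTS`, `AUX_OUTPUTS` and `split_outputs` are hypothetical and do not appear in the YOLOv9 code base. The indices refer to the training graph; the inference config may number its layers differently.

```python
# Output layer indices of the training graph, as listed in the text above.
TRAIN_OUTPUTS = [16, 19, 22, 31, 34, 37]
# Outputs that belong to the training-only auxiliary branch (per the text).
AUX_OUTPUTS = [31, 34, 37]


def split_outputs(train_outputs, aux_outputs):
    """Separate main-branch outputs from auxiliary-branch outputs, keeping order."""
    aux = set(aux_outputs)
    main = [idx for idx in train_outputs if idx not in aux]
    auxiliary = [idx for idx in train_outputs if idx in aux]
    return main, auxiliary


main_idx, aux_idx = split_outputs(TRAIN_OUTPUTS, AUX_OUTPUTS)
print("main branch outputs:", main_idx)
print("auxiliary outputs (training only):", aux_idx)
```

Only the first group corresponds to the part of the network that survives once the auxiliary branches are removed.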
32 |
33 | ## Inference model structure
34 |
35 | This structure is based on `models/detect/gelan.yaml`. This model is derived by pruning the Train model (`models/detect/yolov9.yaml`).
36 |
37 | ```python
38 | Note:
39 | models/detect/gelan.yaml <---> models/detect/yolov9.yaml
40 | models/detect/gelan-c.yaml <---> models/detect/yolov9-c.yaml
41 | models/detect/gelan-e.yaml <---> models/detect/yolov9-e.yaml
42 | ```
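
The pairing in the note above can be kept as a small lookup table. This is a minimal sketch: the paths are copied from the note, while `TRAIN_TO_INFERENCE` and `inference_config_for` are hypothetical names used only for illustration.

```python
# Training config -> paired inference config, copied from the note above.
TRAIN_TO_INFERENCE = {
    "models/detect/yolov9.yaml": "models/detect/gelan.yaml",
    "models/detect/yolov9-c.yaml": "models/detect/gelan-c.yaml",
    "models/detect/yolov9-e.yaml": "models/detect/gelan-e.yaml",
}


def inference_config_for(train_config: str) -> str:
    """Return the GELAN config paired with a YOLOv9 training config."""
    try:
        return TRAIN_TO_INFERENCE[train_config]
    except KeyError:
        # Only the three pairs listed in the note are known to this sketch.
        raise ValueError(f"no paired inference config listed for {train_config!r}") from None


print(inference_config_for("models/detect/yolov9-c.yaml"))
```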
43 |
44 | 
45 |
46 | In inference mode, the model structure is similar to previous YOLO versions. Note the re-parameterization and GELAN blocks.
47 |
48 | The Detect head (mainly NMS and some other operations) produces the object detection results.
49 |
50 | ## Block details
51 |
52 | * **Silence** `models.common.Silence`: Does nothing (output = input). It is only used to provide the source input data for the Auxiliary Reversible Branch.
53 |
54 | * **CBS** `models.common.Conv`: Conv2d + BatchNorm2d + SiLU (Default act)
55 |
56 | Note: The BN layer can be re-parameterized (fused into the convolution) at inference. (ref: [RepVGG](https://openaccess.thecvf.com/content/CVPR2021/papers/Ding_RepVGG_Making_VGG-Style_ConvNets_Great_Again_CVPR_2021_paper.pdf))
57 |
58 | * **ELAN** `models.common.RepNCSPELAN4`:
59 |
60 | | ![](tutorials_img/RepNCSPELAN4.svg) | ![](tutorials_img/GELAN_in_paper.png) |
61 | | :----------------------------------------------------------: | :----------------------------------------------------------: |
62 | | RepNCSPELAN4 Block | RepNCSPELAN4 (GELAN in paper) |
63 | | ![](tutorials_img/RepNCSP.svg) | ![](tutorials_img/BottleNeck.svg) |
64 | | RepNCSP Block | RepNBottleNeck |
65 |
66 | * **ELAN-SPP** `models.common.SPPELAN`:
67 |
68 |
69 |
70 | * **ADown** `models.common.ADown`:
71 |
72 | This block replaces some of the `CBS` blocks in `yolov9-c.yaml` and `yolov9-e.yaml`.
73 |
74 |
75 |
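
The BN re-parameterization mentioned in the CBS note can be checked numerically. The following is a minimal sketch in plain Python, assuming a single output channel evaluated at one location, so the convolution reduces to a dot product. The helper names `conv`, `batch_norm` and `fuse_conv_bn` are hypothetical and are not the functions used in the YOLOv9 code; the numeric values are illustrative only.

```python
import math


def conv(weights, bias, inputs):
    """One output channel at one location: a dot product plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time BatchNorm with fixed running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fuse_conv_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN scale and shift into the convolution weights and bias."""
    scale = gamma / math.sqrt(var + eps)
    fused_weights = [w * scale for w in weights]
    fused_bias = (bias - mean) * scale + beta
    return fused_weights, fused_bias


# Illustrative values only.
weights, bias = [0.5, -1.0, 2.0], 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0
inputs = [1.0, 2.0, 3.0]

unfused = batch_norm(conv(weights, bias, inputs), gamma, beta, mean, var)
fused_w, fused_b = fuse_conv_bn(weights, bias, gamma, beta, mean, var)
fused = conv(fused_w, fused_b, inputs)
print(unfused, fused)  # the two values agree up to floating-point rounding
```

Because BN at inference is an affine map per channel, it can be absorbed into the preceding convolution, which removes one layer from the inference graph without changing its output.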
76 | ---
77 |
78 | If you find any mistakes, please let me know: divided.by.07@gmail.com
79 |
--------------------------------------------------------------------------------
/structure_CN.md:
--------------------------------------------------------------------------------
1 | # 语言 Language
2 |
3 | [English](./structure.md) [简体中文](structure_CN.md)
4 |
5 | # Paper Summary
6 |
7 | The paper proposes PGI (Programmable Gradient Information), which addresses the loss of gradient information during backpropagation. It introduces three key components:
8 |
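9 | * **Auxiliary Reversible Branch**
10 |
11 | A reversible architecture is introduced to preserve complete information, but increasing the backbone's parameters within a reversible architecture incurs a substantial inference cost. The authors argue that 'reversible' is not the only necessary condition in the inference stage, so they design an auxiliary reversible branch. During training it helps the backbone receive richer returned gradient information, which improves the backbone's performance; at inference the branch is discarded, so inference time does not increase. This module is used only in **training mode**.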
12 |
13 |
14 |
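15 | * **Multi-level Auxiliary Information**
16 |
17 | Each feature pyramid level should receive gradient information about all target objects. The gradient information containing all target objects is then aggregated as multi-level auxiliary information and passed to the main branch to update its weights. This module is used only in **training mode**, because the gradients it returns are obtained from the auxiliary reversible branch.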
18 |
19 | * **GELAN Block**
20 |
21 | The GELAN block mainly combines the CSPNet and ELAN structures, and also draws on the re-parameterization method.
22 |
23 | GELAN = CSPNet + ELAN
24 |
25 | # Model Structure overview
26 |
27 | ## Train model structure
28 |
29 | This structure is based on `models/detect/yolov9.yaml`.
30 |
31 | | ![](tutorials_img/train_structure.svg) | ![](tutorials_img/structure_in_paper.png) |
32 | | :----------------------------------------------------------: | :----------------------------------------------------------: |
33 | | Train model structure | Train model structure (in paper) |
34 |
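35 | The ***Auxiliary Reversible Branch*** and the ***Multi-level Auxiliary Information*** exist only in training mode and help the backbone achieve better performance. In the training stage there are six output feature maps, [16, 19, 22, 31, 34, 37] in the figure above; feeding them into the Detect head yields the predicted labels. Compared with earlier YOLO versions, the additional predictions from the extra [31, 34, 37] outputs are used to compute a loss against the GT labels, after which gradient information flows through the auxiliary reversible path into blocks [#5, #7, #9] more effectively and updates the backbone weights.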
36 |
37 | ## Inference model structure
38 |
39 | This structure is based on `models/detect/gelan.yaml`. In fact, this model is obtained from `models/detect/yolov9.yaml` by structurally removing the auxiliary branches.
40 |
41 | ```python
42 | Note:
43 | models/detect/gelan.yaml <---> models/detect/yolov9.yaml
44 | models/detect/gelan-c.yaml <---> models/detect/yolov9-c.yaml
45 | models/detect/gelan-e.yaml <---> models/detect/yolov9-e.yaml
46 | ```
47 |
48 | 
49 |
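50 | In inference mode, the model structure is similar to previous YOLO versions. Note the re-parameterization and GELAN blocks.
51 |
52 | The Detect head (mainly NMS and some other operations) produces the object detection results.
53 |
54 | ### Block details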
55 |
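56 | * **Silence** `models.common.Silence`: The output of this block equals its input, i.e. it does nothing. Its purpose is to let the Auxiliary Reversible Branch access the original image information.
57 |
58 | * **CBS** `models.common.Conv`: Conv2d + BatchNorm2d + SiLU (default activation)
59 |
60 | Note: At inference, the BN layer's parameters can be fused into the convolution layer. In the yolov9 code, look for the keyword 'fuse', which is generally related to re-parameterization, e.g. `models.common.Conv.forward_fuse`. (ref: [RepVGG](https://openaccess.thecvf.com/content/CVPR2021/papers/Ding_RepVGG_Making_VGG-Style_ConvNets_Great_Again_CVPR_2021_paper.pdf))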
61 |
62 | * **ELAN** `models.common.RepNCSPELAN4`:
63 |
64 | As the block name suggests, its core is Re-parameter + CSPNet + ELAN.
65 |
66 | | ![](tutorials_img/RepNCSPELAN4.svg) | ![](tutorials_img/GELAN_in_paper.png) |
67 | | :----------------------------------------------------------: | :----------------------------------------------------------: |
68 | | RepNCSPELAN4 Block | RepNCSPELAN4 (GELAN in paper) |
69 | | ![](tutorials_img/RepNCSP.svg) | ![](tutorials_img/BottleNeck.svg) |
70 | | RepNCSP Block | RepNBottleNeck |
71 |
72 | * **ELAN-SPP** `models.common.SPPELAN`:
73 |
74 | This block is essentially the same as the SPPF structure in earlier YOLO versions, as shown in the figure below.
75 |
76 |
77 |
78 | * **ADown** `models.common.ADown`:
79 |
80 | This block appears in the `yolov9-c.yaml` and `yolov9-e.yaml` structures, where it replaces some of the model's `CBS` blocks.
81 |
82 |
83 |
84 |
85 |
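
The SPPF-style pooling referred to in the ELAN-SPP item can be illustrated on a 1D signal. This is a sketch under stated assumptions, not the `models.common.SPPELAN` implementation: it assumes a pooling window of 5 and three cascaded stride-1 max-pooling stages, works on a plain list instead of a feature map, and omits the convolutions around the pooling. The names `max_pool_1d` and `sppf_like` are hypothetical.

```python
def max_pool_1d(values, window=5):
    """Stride-1 max pooling; windows are clipped at the borders, so length is kept."""
    half = window // 2
    n = len(values)
    return [max(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def sppf_like(values, window=5, stages=3):
    """Return the input plus `stages` successively pooled copies.

    In an SPPF-style block these branches would be concatenated along the
    channel axis; here they are simply returned as a list of lists.
    """
    branches = [list(values)]
    for _ in range(stages):
        branches.append(max_pool_1d(branches[-1], window))
    return branches


signal = [1, 0, 0, 0, 0, 0, 0, 0, 0]
for branch in sppf_like(signal):
    print(branch)
```

Cascading small windows grows the receptive field step by step: two window-5 stages cover the same neighbourhood as a single window-9 pool, which is why the cascaded form is cheaper than running several large pools in parallel.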
--------------------------------------------------------------------------------
/tutorials_img/ADown.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
400 |
--------------------------------------------------------------------------------
/tutorials_img/BottleNeck.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
366 |
--------------------------------------------------------------------------------
/tutorials_img/GELAN_in_paper.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/divided7/yolov9_structure_graph/68cc7dccd13dc8e4d380ad0e3d2b27a97fd66a91/tutorials_img/GELAN_in_paper.png
--------------------------------------------------------------------------------
/tutorials_img/RepNCSP.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
647 |
--------------------------------------------------------------------------------
/tutorials_img/RepNCSPELAN4.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
536 |
--------------------------------------------------------------------------------
/tutorials_img/SPPELAN.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
434 |
--------------------------------------------------------------------------------
/tutorials_img/structure_in_paper.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/divided7/yolov9_structure_graph/68cc7dccd13dc8e4d380ad0e3d2b27a97fd66a91/tutorials_img/structure_in_paper.png
--------------------------------------------------------------------------------