├── .gitignore ├── EfficientDet_CN.md ├── LICENSE ├── README.md ├── backbone.py ├── benchmark └── coco_eval_result ├── coco_eval.py ├── efficientdet-d2.pth ├── efficientdet ├── config.py ├── dataset.py ├── loss.py ├── model.py └── utils.py ├── efficientdet_test.py ├── efficientnet ├── __init__.py ├── model.py ├── utils.py └── utils_extra.py ├── pic ├── data │ ├── img_test1.jpg │ ├── img_test2.jpg │ ├── p0.png │ ├── p1.png │ ├── p2.png │ ├── p3.png │ ├── p4.png │ └── p5.png ├── p0.png ├── p1.png ├── p10.png ├── p11.png ├── p12.png ├── p13.png ├── p14.png ├── p15.png ├── p16.png ├── p17.png ├── p18.png ├── p19.png ├── p2.png ├── p20.png ├── p21.png ├── p3.png ├── p4.png ├── p5.png ├── p6.png ├── p7.png ├── p8.png └── p9.png ├── projects ├── coco.yml ├── shape.yml └── underwater.yml ├── readme_efficientdet.md ├── test ├── img.png ├── img_inferred_d0_official.jpg ├── img_inferred_d0_this_repo.jpg └── img_inferred_d0_this_repo_0.jpg ├── train.py ├── tutorial └── train_shape.ipynb ├── utils ├── sync_batchnorm │ ├── __init__.py │ ├── batchnorm.py │ ├── batchnorm_reimpl.py │ ├── comm.py │ ├── replicate.py │ └── unittest.py └── utils.py └── voc2coco.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | .idea/ 131 | weights/ -------------------------------------------------------------------------------- /EfficientDet_CN.md: -------------------------------------------------------------------------------- 1 | ## EfficientDet: Scalable and Efficient Object Detection 2 | 3 | **Xu Jing** 4 | 5 | 论文地址:https://arxiv.org/abs/1911.09070 6 | 7 | 代码地址:https://github.com/google/automl/tree/master/efficientdet 8 | 9 | 目前Pytorch开源的SOTA的地址:https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch 10 | 11 | 12 | 13 | EfficientDet改进主要是借鉴了RetinaNet网络,是一种Anchor-base的one stage目标检测方法 14 | 15 | **RetinaNet的结构** 16 | 17 | ![](pic/p0.png) 18 | 19 | 是不是感觉很像,EfficientDet网络将ResNet替换成连续的卷积降采样层(EfficientNet),然后将FPN替换成BIFPN(就是改变原有FPN的连接关系,参考 [FPN详解](https://blog.csdn.net/u014380165/article/details/72890275)),最后将feature map 连接一个分类子网络,一个box回归网络。总结如下: 20 | 21 | 1. 提出BiFPN子网络结构,双向的多尺度特征融合网络。 22 | 2. 提出一种扩展网络的方法,就是扩展backbone,BiFPN,box net 和class net,具体包括网络层数,输入尺寸,深度。 23 | 3. 由上面的1,2点,结合得到了EfficientDet一系列网络。 24 | 25 | ### 0.摘要 26 | 27 | 模型效率在计算机视觉中的地位越来越重要。本文系统地研究了用于目标检测的各种神经网络结构设计选择,并提出了几种提高效率的关键优化方法。**首先,提出了一种加权双向特征金字塔网络(BiFPN),该网络允许简单快速的多尺度特征融合;其次,提出了一种复合尺度方法,该方法可以同时均匀地对所有骨干网、特征网络和box/class预测网络的分辨率、深度和宽度进行缩放。基于这些优化,Google开发了一个新的目标检测器家族,称为EfficientDet,它在广泛的资源限制范围内始终比现有技术获得一个数量级更好的效率。特别是,在没有钟声和哨声**(without bells and whistles,就是说没搞那些涨精度的tricks,比如各种调参,数据集增强啥的)的情况下,EfficientDet-D7在COCO数据集上实现了最先进的51.0mAP,它有52M参数和326B FLOPS(网络中乘法和加法的数目),体积小了4倍,使用的FLOPS少了9.3倍,但仍然比以前最好的检测器更精确(+0.3%mAP)。 28 | 29 | 最近在GitHub有大神开源了相对SOTA的Pytorch的实现,推荐尝试。 30 | 31 | ![](pic/p1.png) 32 | 33 | ### 1.介绍 34 | 35 | 近年来,在更精确的目标检测方面取得了巨大的进展,同时,最先进的目标检测器也变得越来越昂贵。例如,最新的基于 AmoebaNet的NAS-FPN检测器[37]需要167M的参数和3045B的FLOPS(比RetinaNet[17]多30x),才能达到最先进的精度。巨大的模型尺寸和昂贵的计算成本阻碍了它们在许多实际应用中的部署,例如机器人和自动驾驶汽车,在这些应用中,模型尺寸和延迟受到高度限制。考虑到这些现实世界的资源约束,模型效率对于目标检测变得越来越重要。 36 | 37 | 以前有许多工作旨在开发更高效的检测器架构,如单级[20、25、26、17]和无锚探测器[14、36、32],或者压缩现有的模型[21,22]。尽管这些方法往往能获得更好的效率,但它们通常会牺牲准确性。此外,以往的研究大多只关注特定或小范围的资源需求,但现实世界中的各种应用,从移动设备到数据中心,往往需要不同的资源约束。 38 | 39 | 一个自然的问题是:是否有可能在广泛的资源限制(例如从3B到300B的FLOPS)中构建一个既具有更高精度又具有更好效率的可扩展检测体系结构?本文旨在通过系统研究探测器结构的各种设计选择来解决这一问题。基于单级检测器范式,我们研究了主干网、特征融合和box/class网络的设计选择,并确定了两个主要挑战: 40 | 41 | **挑战1:高效的多尺度特征融合** 就像在[16]中介绍的那样,FPN在多尺度特征融合中得到了广泛的应用。最近,PANet[19]、NAS-FPN[5]和其他研究[13、12、34]开发了更多的用于跨尺度特征融合的网络结构。在融合不同的输入特征时,以往的研究大多只是简单的归纳,没有区别;然而,由于这些不同的输入特征具有不同的分辨率,我们观察到它们对融合输出特征的贡献往往是不平等的。为了解决这一问题,我们提出了一种简单而高效的加权双向特征金字塔网络(BiFPN),它引入可学习的权值来学习不同输入特征的重要性,同时反复应用自顶向下和自下而上的多尺度特征融合。 42 | **挑战2:模型缩放** 虽然以前的工作主要依赖于更大的主干网络[17、27、26、5]或更大的输入图像大小[8、37]以获得更高的精度,但我们观察到,在兼顾精度和效率的情况下,放大特征网络和box/class预测网络也是至关重要的。在文献[31]的启发下,我们提出了一种目标检测器的复合标度方法,该方法联合提高了所有骨干网、特征网络、box/class预测网络的分辨率/深度/宽度。 43 | 44 | 
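> **补充示意(非论文/本仓库原文):** 为了更直观地理解挑战1中的"加权特征融合",下面给出一个极简的 PyTorch 草图(假设各输入特征已被调整到相同的分辨率和通道数;具体实现请以论文和本仓库 `efficientdet/model.py` 中的 BiFPN 为准):

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """快速归一化加权融合的示意:O = sum_i( w_i / (eps + sum_j w_j) * I_i ),
    其中 w_i 经 ReLU 保证非负,因此无需 softmax 即可把权重归一化到 0~1。"""

    def __init__(self, num_inputs, epsilon=1e-4):
        super().__init__()
        # 每个输入特征图对应一个可学习的标量权重
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.epsilon = epsilon

    def forward(self, inputs):
        w = torch.relu(self.weights)        # 保证 w_i >= 0
        w = w / (w.sum() + self.epsilon)    # 归一化到 0~1,避免数值不稳定
        return sum(wi * x for wi, x in zip(w, inputs))
```

> 例如 `FastNormalizedFusion(2)((p6_in, p7_up))` 就对应 BiFPN 中一个只有两个输入的融合节点(`p6_in`、`p7_up` 仅为示意变量名)。配合挑战2,论文用一个复合系数 $\phi$ 同时缩放 BiFPN 的宽度/深度、box/class 网络的深度以及输入分辨率,具体公式见第4节。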
最后,我们还观察到,最近引入的EfficientNets[31]比以前常用的主干(例如ResNets[9]、ResNeXt[33]和AmoebaNet[24])获得更好的效率。将EfficientNet骨干网与我们提出的BiFPN和复合缩放相结合,我们开发了一个新的目标检测器家族EfficientDet,与以前的目标检测器相比,它在参数量和FLOPS少一个数量级的情况下,始终获得更好的精度。图1和图4显示了COCO数据集上的性能比较[18]。在类似的精度约束下,我们的EfficientDet使用的FLOPS比YOLOv3少28倍[26],比RetinaNet少30倍[17],比最近的NAS-FPN少19倍[5]。特别是,在单模型和单一测试尺度下,我们的EfficientDet-D7在52M参数和326B FLOPS下实现了最先进的51.0 mAP,比以前最好的模型小4倍,使用的FLOPS少9.3倍,但仍然更精确(+0.3% mAP)[37]。我们的EfficientDet模型在GPU上的速度比以前的检测器快3.2倍,在CPU上快8.1倍,如图4和表2所示。
45 | 
46 | ![](pic/p2.png)
47 | 
48 | 我们的贡献可以被总结为:
49 | 
50 | - 提出了一种用于多尺度特征融合的加权双向特征网络BiFPN。
51 | - 我们提出了一种新的复合缩放方法,该方法将骨干网、特征网、box/class网和输入分辨率联合起来进行缩放。
52 | - 基于BiFPN和复合缩放,我们开发了EfficientDet,这是一个新的检测器系列,在广泛的资源约束范围内都具有显著更好的精度和效率。
53 | 
54 | ### 2.相关工作
55 | 
56 | **one-stage检测器:**现有的目标检测器主要根据是否含有候选区域(region proposal)生成步骤来分类:有该步骤的为two-stage检测器[6,27,3,8],没有的为one-stage检测器[28,20,25,17]。虽然two-stage检测器往往更灵活、更精确,但通常认为one-stage检测器通过利用预定义的锚更简单、更高效[11]。近年来,one-stage检测器因其效率高、结构简单等优点引起了人们的广泛关注[14,34,36]。本文主要遵循one-stage检测器的设计思想,通过优化网络结构,达到更高的效率和精度。
57 | **多尺度特征表示:**有效地表示和处理多尺度特征是目标检测的主要难点之一。早期的检测器通常直接根据从骨干网络中提取的金字塔特征层次进行预测[2,20,28]。作为开创性的工作之一,特征金字塔网络(FPN)[16]提出了一种自顶向下的方法来组合多尺度特征。遵循这一思想,PANet[19]在FPN的基础上增加了一个额外的自底向上的路径聚合网络;STDL[35]提出了一个尺度转移模块来利用跨尺度特征;M2det[34]提出了一个U形模块来融合多尺度特征,G-FRNet[1]引入了门单元来控制跨特征的信息流。最近,NAS-FPN[5]利用神经架构搜索自动设计特征网络拓扑。尽管NAS-FPN能获得更好的性能,但它在搜索过程中需要数千个GPU小时,并且生成的特征网络是不规则的,因此很难解释。本文旨在以更直观、更具原则性的方法对多尺度特征融合进行优化。
58 | **模型缩放:**为了获得更好的精度,通常通过使用更大的骨干网络(例如,从移动端大小的模型[30,10]和ResNet[9],到ResNeXt[33]和AmoebaNet[24])或增加输入图像大小(例如,从512x512[17]到1536x1536[37])来放大基线检测器。最近的一些工作[5,37]表明,增加channel大小和重复特征网络也可以带来更高的精度。这些缩放方法主要集中在单个或有限的缩放维度上。最近,[31]显示了通过联合缩放网络宽度、深度和分辨率,图像分类可以获得显著的模型效率。我们提出的目标检测复合缩放方法主要受到了[31]的启发。
59 | 
60 | ### 3.BiFPN
61 | 
62 | 在这一节中,我们首先阐述多尺度特征融合问题,然后介绍我们提出的BiFPN的两个主要思想:**高效的双向跨尺度连接和加权特征融合。**
63 | 
64 | #### 3.1 问题描述
65 | 
66 | 多尺度特征融合的目的是对不同分辨率的特征进行融合。形式上,给定一个多尺度特征列表 $P^{in}=(P^{in}_{l_1}, P^{in}_{l_2}, \dots)$,其中 $P^{in}_{l_i}$ 表示 $l_i$ 级别的特征,我们的目标是找到一个变换 $f$,它可以有效地聚合不同的特征并输出一个新的特征列表:$P^{out}=f(P^{in})$。作为一个具体的例子,图2(a)显示了传统的自上而下的FPN[16]。它采用3-7级输入特征 $P^{in}=(P^{in}_3, \dots, P^{in}_7)$,其中 $P^{in}_i$ 表示分辨率为输入图像 $1/2^i$ 的特征层。例如,如果输入分辨率为640x640,则 $P_3$ 表示分辨率为80x80的特征级别3($640/2^3=80$),而 $P_7$ 表示分辨率为5x5的特征级别7($640/2^7=5$)。传统的FPN以自顶向下的方式聚合多尺度特征:
67 | 
68 | ![](pic/p3.png)
69 | 
70 | 
71 | 
72 | ![](pic/p4.png)
73 | 
74 | #### 3.2 跨尺度连接
75 | 
76 | 传统的自顶向下FPN固有地受到单向信息流的限制。为了解决这个问题,PANet[19]添加了一个额外的自底向上的路径聚合网络,如图2(b)所示。跨尺度连接在[13,12,34]中有进一步的研究。最近,NAS-FPN[5]使用神经架构搜索来搜索更好的跨尺度特征网络拓扑,但它需要数千个GPU小时去搜索,并且最终找到的网络不规则,难以解释或修改,如图2(c)所示。
77 | 
78 | 通过研究这三种网络的性能和效率(表4),我们发现PANet比FPN和NAS-FPN具有更好的精度,但需要花费更多的参数和计算。为了提高模型的效率,本文提出了几种跨尺度连接的优化方法:首先,我们移除那些只有一个输入边的节点。我们的直觉很简单:如果一个节点只有一个输入边而没有特征融合,那么它对以融合不同特征为目的的特征网络的贡献就较小。这将导致一个简化的PANet,如图2(e)所示;第二,如果原始输入节点与输出节点处于同一级别,我们将在它们之间添加一条额外的边,以便在不增加太多成本的情况下融合更多的特征,如图2(f)所示;第三,与只有一条自顶向下和一条自底向上路径的PANet[19]不同,我们将每个双向(自上而下和自下而上)路径视为一个特征网络层,并多次重复同一层,以实现更高层次的特征融合。第4.2节将讨论如何使用复合缩放方法确定不同资源约束下的层数。通过这些优化,我们将新的特征网络命名为双向特征金字塔网络(BiFPN),如图2(f)和图3所示。
79 | 
80 | #### 3.3 加权特征融合
81 | 
82 | 当融合不同分辨率的多个输入特征时,一种常见的方法是先将它们调整为相同的分辨率,然后再相加。金字塔注意网络[15]引入了全局自注意上采样来恢复像素定位,这在[5]中有进一步的研究。
83 | 以往的特征融合方法对所有输入特征一视同仁。然而,我们观察到,由于不同的输入特征具有不同的分辨率,它们对输出特征的贡献通常是不平等的。为了解决这个问题,我们建议在特征融合过程中为每个输入增加一个额外的权重,并让网络学习每个输入特征的重要性。基于此,我们考虑三种加权融合方法:
84 | 
85 | **无界融合(Unbounded fusion):**![O=Σ_i w_i·I_i](pic/p5.png),其中$w_i$是一个可学习的权重,可以是标量(每个特征)、向量(每个通道)或多维张量(每个像素)。我们发现标量权重就可以在最小的计算成本下达到与其他方法相当的精度。然而,由于标量权重是无界的,它可能会导致训练的不稳定。因此,我们使用权重规范化来限定每个权重的值范围。
86 | 
87 | **基于Softmax的融合:**![在这里插入图片描述](pic/p6.png),一个直观的想法是对每个权重应用softmax,这样所有权重都被规范化为0到1之间的概率,表示每个输入的重要性。然而,如我们在第6.3节中的消融研究所示,额外的softmax会导致GPU硬件上的显著减速。为了最小化额外的延迟成本,我们进一步提出了一种快速融合方法。
88 | 
89 | **快速标准化融合:**
![在这里插入图片描述](pic/p7.png),其中$w_i≥0$是通过在每个$w_i$之后应用ReLU来保证的,并且![](pic/p8.png)是一个小值,以避免数值不稳定。类似地,每个规格化权重的值也在0到1之间,但是由于这里没有softmax操作,因此效率更高。我们的消融研究表明,这种快速融合方法与基于softmax的融合具有非常相似的学习行为和准确性,但在GPU上运行速度高了30%(表5)。 90 | 91 | 注:上面最后的图出现的$W_i$,指的是特征图对应得一组权重(可以是标量,向量,作者没有提怎么来的,需要看源码)$I_{i}$代表是FPN的特征图值。 92 | 93 | ![](pic/p10.png) 94 | 95 | 上图中$P{6in}=P{6}$ , 结合下面的公式,此处的输出是中间一层的输出$P6_{out1}$ 96 | 97 | 98 | 99 | ![](pic/p11.png) 100 | 101 | 102 | 103 | 我们的最终BiFPN集成了双向交叉尺度连接和快速规范化融合。作为一个具体的例子,这里我们描述了图2(f)中所示的BiFPN在$P6_{out2}$融合特征: 104 | 105 | ![](pic/p9.png) 106 | 107 | $P_6^{td}$就是加了权重的$P6\_{out1}$, 此处的$P_6^{out}$是最后层的输出$P6\_{out2}$。值得注意的是,为了进一步提高效率,我们使用了可分离depthwise 卷积[4,29]进行特征融合,并在每次卷积后添加了批量规范化和激活。 108 | 109 | ### 4. EfficientDet 110 | 111 | 基于我们的BiFPN,我们开发了一个新的检测模型家族EfficientDet。在这一部分中,我们将讨论网络结构和一种新的用于EfficientDet的复合缩放方法。 112 | 113 | #### 4.1 EfficientDet结构 114 | 115 | 图3显示了EfficientDet的总体架构,它大体上遵循了单级检测器范式[20、25、16、17]。我们采用ImageNet预训练有效网络作为骨干网络。我们提出的BiFPN作为特征网络,它从骨干网络中提取3-7级特征{P3、P4、P5、P6、P7},并反复应用自顶向下和自下而上的双向特征融合。这些融合后的特征分别反馈到类和盒网络中,产生对象类和边界盒预测。与[17]类似,类和框网络权重在所有级别的特征中共享。 116 | 117 | ![](pic/p14.png) 118 | 119 | ![](pic/p12.png) 120 | 121 | ![](pic/p13.png) 122 | 123 | 1.**骨干网络:**我们重用EfficientNet-B0到B6的相同宽度/深度比例系数[31],这样我们就可以轻松地重用他们的ImageNet预训练模型。 124 | 125 | 2.**BiFPN网络:**我们以指数方式增长BiFPN宽度![Wbif pn](pic/p15.png)(#通道)类似于[31],但线性增加深度![Dbif pn](pic/p16.png)(#层),因为深度需要舍入到小整数。形式上,我们使用以下等式: 126 | 127 | ![在这里插入图片描述](pic/p17.png) 128 | 129 | 3.**box/class预测网络:**我们将其宽度固定为始终与BiFPN相同(即![Wpred=Wbif pn](pic/p18.png)),但使用公式线性增加深度(#层): 130 | 131 | ![在这里插入图片描述](pic/p19.png) 132 | 133 | 4.**输入图像分辨率:**由于特征级别3-7用于BiFPN,因此输入分辨率必须可除以2的7次方=128,因此我们使用公式线性增加分辨率: 134 | 135 | ![在这里插入图片描述](pic/p20.png) 136 | 137 | ### 5.实验结果 138 | 139 | EfficientDet-D7在MS COCO数据集上的MAP表现确实很惊人,达到了51.0,但是单次浮点计算量千亿次为326BFLOPS,(BFLOPS,即Billion FLOPS,十亿FLOPS,参考 FLOPS计算) 140 | 141 | EfficientDet-D0与YOLO v3 相比,同样差不多的MAP,EfficientDet-D0的BFLOPS是2.5, YOLO v3是71,相差28倍。 下图的X是倍数的意思,以EfficientDet-D0的所有参数为基准的倍率。 142 | 143 | ![](pic/p21.png) 144 | 145 | > 后面作者列出了几张关于以下参数的对比实验,不太重要,在此不一一列出 146 | > 147 | > 1.参数量,GPU消耗, 148 | > 2.powerful backbone and BiFPN 149 | > 3.Softmax VS fast normalized feature fusion 150 | 151 | 总结: 152 | 1.从实际应用价值考虑,EfficientDet D0-D4比YOLO v3 的map好一点,速度论文没写看出来。 153 | 154 | 2.现在流行anchor-free,anchor-base的方法对遮挡问题的解决效果不好。 155 | 156 | 3.EfficientDet 的训练时间肯定不如轻量级的 fcos-mobilenet和yolo-tiny模型。 157 | 158 | 参考: 159 | 160 | https://blog.csdn.net/weixin_38632246/article/details/103400788 161 | 162 | https://blog.csdn.net/sun_shine56/article/details/104970533 163 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU LESSER GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | 9 | This version of the GNU Lesser General Public License incorporates 10 | the terms and conditions of version 3 of the GNU General Public 11 | License, supplemented by the additional permissions listed below. 12 | 13 | 0. Additional Definitions. 14 | 15 | As used herein, "this License" refers to version 3 of the GNU Lesser 16 | General Public License, and the "GNU GPL" refers to version 3 of the GNU 17 | General Public License. 
18 | 19 | "The Library" refers to a covered work governed by this License, 20 | other than an Application or a Combined Work as defined below. 21 | 22 | An "Application" is any work that makes use of an interface provided 23 | by the Library, but which is not otherwise based on the Library. 24 | Defining a subclass of a class defined by the Library is deemed a mode 25 | of using an interface provided by the Library. 26 | 27 | A "Combined Work" is a work produced by combining or linking an 28 | Application with the Library. The particular version of the Library 29 | with which the Combined Work was made is also called the "Linked 30 | Version". 31 | 32 | The "Minimal Corresponding Source" for a Combined Work means the 33 | Corresponding Source for the Combined Work, excluding any source code 34 | for portions of the Combined Work that, considered in isolation, are 35 | based on the Application, and not on the Linked Version. 36 | 37 | The "Corresponding Application Code" for a Combined Work means the 38 | object code and/or source code for the Application, including any data 39 | and utility programs needed for reproducing the Combined Work from the 40 | Application, but excluding the System Libraries of the Combined Work. 41 | 42 | 1. Exception to Section 3 of the GNU GPL. 43 | 44 | You may convey a covered work under sections 3 and 4 of this License 45 | without being bound by section 3 of the GNU GPL. 46 | 47 | 2. Conveying Modified Versions. 48 | 49 | If you modify a copy of the Library, and, in your modifications, a 50 | facility refers to a function or data to be supplied by an Application 51 | that uses the facility (other than as an argument passed when the 52 | facility is invoked), then you may convey a copy of the modified 53 | version: 54 | 55 | a) under this License, provided that you make a good faith effort to 56 | ensure that, in the event an Application does not supply the 57 | function or data, the facility still operates, and performs 58 | whatever part of its purpose remains meaningful, or 59 | 60 | b) under the GNU GPL, with none of the additional permissions of 61 | this License applicable to that copy. 62 | 63 | 3. Object Code Incorporating Material from Library Header Files. 64 | 65 | The object code form of an Application may incorporate material from 66 | a header file that is part of the Library. You may convey such object 67 | code under terms of your choice, provided that, if the incorporated 68 | material is not limited to numerical parameters, data structure 69 | layouts and accessors, or small macros, inline functions and templates 70 | (ten or fewer lines in length), you do both of the following: 71 | 72 | a) Give prominent notice with each copy of the object code that the 73 | Library is used in it and that the Library and its use are 74 | covered by this License. 75 | 76 | b) Accompany the object code with a copy of the GNU GPL and this license 77 | document. 78 | 79 | 4. Combined Works. 80 | 81 | You may convey a Combined Work under terms of your choice that, 82 | taken together, effectively do not restrict modification of the 83 | portions of the Library contained in the Combined Work and reverse 84 | engineering for debugging such modifications, if you also do each of 85 | the following: 86 | 87 | a) Give prominent notice with each copy of the Combined Work that 88 | the Library is used in it and that the Library and its use are 89 | covered by this License. 
90 | 91 | b) Accompany the Combined Work with a copy of the GNU GPL and this license 92 | document. 93 | 94 | c) For a Combined Work that displays copyright notices during 95 | execution, include the copyright notice for the Library among 96 | these notices, as well as a reference directing the user to the 97 | copies of the GNU GPL and this license document. 98 | 99 | d) Do one of the following: 100 | 101 | 0) Convey the Minimal Corresponding Source under the terms of this 102 | License, and the Corresponding Application Code in a form 103 | suitable for, and under terms that permit, the user to 104 | recombine or relink the Application with a modified version of 105 | the Linked Version to produce a modified Combined Work, in the 106 | manner specified by section 6 of the GNU GPL for conveying 107 | Corresponding Source. 108 | 109 | 1) Use a suitable shared library mechanism for linking with the 110 | Library. A suitable mechanism is one that (a) uses at run time 111 | a copy of the Library already present on the user's computer 112 | system, and (b) will operate properly with a modified version 113 | of the Library that is interface-compatible with the Linked 114 | Version. 115 | 116 | e) Provide Installation Information, but only if you would otherwise 117 | be required to provide such information under section 6 of the 118 | GNU GPL, and only to the extent that such information is 119 | necessary to install and execute a modified version of the 120 | Combined Work produced by recombining or relinking the 121 | Application with a modified version of the Linked Version. (If 122 | you use option 4d0, the Installation Information must accompany 123 | the Minimal Corresponding Source and Corresponding Application 124 | Code. If you use option 4d1, you must provide the Installation 125 | Information in the manner specified by section 6 of the GNU GPL 126 | for conveying Corresponding Source.) 127 | 128 | 5. Combined Libraries. 129 | 130 | You may place library facilities that are a work based on the 131 | Library side by side in a single library together with other library 132 | facilities that are not Applications and are not covered by this 133 | License, and convey such a combined library under terms of your 134 | choice, if you do both of the following: 135 | 136 | a) Accompany the combined library with a copy of the same work based 137 | on the Library, uncombined with any other library facilities, 138 | conveyed under the terms of this License. 139 | 140 | b) Give prominent notice with the combined library that part of it 141 | is a work based on the Library, and explaining where to find the 142 | accompanying uncombined form of the same work. 143 | 144 | 6. Revised Versions of the GNU Lesser General Public License. 145 | 146 | The Free Software Foundation may publish revised and/or new versions 147 | of the GNU Lesser General Public License from time to time. Such new 148 | versions will be similar in spirit to the present version, but may 149 | differ in detail to address new problems or concerns. 150 | 151 | Each version is given a distinguishing version number. If the 152 | Library as you received it specifies that a certain numbered version 153 | of the GNU Lesser General Public License "or any later version" 154 | applies to it, you have the option of following the terms and 155 | conditions either of that published version or of any later version 156 | published by the Free Software Foundation. 
If the Library as you 157 | received it does not specify a version number of the GNU Lesser 158 | General Public License, you may choose any version of the GNU Lesser 159 | General Public License ever published by the Free Software Foundation. 160 | 161 | If the Library as you received it specifies that a proxy can decide 162 | whether future versions of the GNU Lesser General Public License shall 163 | apply, that proxy's public statement of acceptance of any version is 164 | permanent authorization for you to choose that version for the 165 | Library. 166 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## EfficientDet训练自己的数据集 4 | 5 | **Xu Jing** 6 | 7 | Paper: 8 | 9 | Base GitHub Repo: 10 | 11 | Official Repo: 12 | 13 | EfficientDet 算法中文介绍:[EfficientDet_CN.md](./EfficientDet_CN.md) 14 | 15 | > 本项目以一个真实比赛的数据集,Step by Step演示如何训练最近开源的相对SOTA的Pytorch版的EfficientDet的训练,评估,推断的过程。像paper中提到的一样,我们并没有使用任何数据增强或模型融合等后处理的trick来提高模型的精度,如果你想增加数据增强的策略可以在`efficientdet/dataset.py`中实现; 16 | > 17 | > 此外我们并没有采用类似于[UWGAN_UIE](https://github.com/DataXujing/UWGAN_UIE),水质迁移(WQT),DG-YOLO或一些水下去雾算法的办法,预处理水下的图像; 18 | > 19 | > 相信这些trick同样会提高模型识别的精度!!! 20 | 21 | ### 1.数据来源 22 | 23 | 数据来源于[科赛网中的水下目标检测的比赛](https://www.kesci.com/home/competition/5e535a612537a0002ca864ac/content/2): 24 | 25 | ![](pic/data/p0.png) 26 | 27 | **大赛简介** 28 | 29 | 「背景」 随着海洋观测的快速发展,水下物体检测在海军沿海防御任务以及渔业、水产养殖等海洋经济中发挥着越来越重要的作用,而水下图像是海洋信息的重要载体,本次比赛希望参赛者在真实海底图片数据中通过算法检测出不同海产品(海参、海胆、扇贝、海星)的位置。 30 | 31 | ![](pic/data/p1.png) 32 | 33 | 「数据」 训练集是5543张 jpg 格式的水下光学图像与对应标注结果,A榜测试集800张,B榜测试集1200张。 34 | 35 | 「评估指标」 mAP(mean Average Precision) 36 | 37 | > 注:数据由鹏城实验室提供。 38 | 39 | ### 2.据转换 40 | 41 | 我们将数据存放在项目的dataset下: 42 | 43 | ``` 44 | .. 45 | └─underwater 46 | ├─Annotations #xml标注 47 | └─JPEGImages #jpg原图 48 | # 首先划分训练集和验证集:我们采用9:1的随机换分,划分好的数据等待转化为COCO数据 49 | ``` 50 | 51 | 划分训练集和验证集后的数据结构: 52 | 53 | ``` 54 | .. 
55 | ├─train 56 | │ ├─Annotations 57 | │ └─JPEGImages 58 | └─val 59 | ├─Annotations 60 | └─JPEGImages 61 | ``` 62 | 63 | 将VOC转COCO: 64 | 65 | ``` 66 | python voc2coco.py train.txt ./train/Annotations instances_train.json ./train/JPEGImages 67 | python voc2coco.py val.txt ./val/Annotations instances_val.json ./val/JPEGImages 68 | # 生成的json文件存放在了dataset/underwater/annotations/*.jpg 69 | ``` 70 | 71 | 72 | ### 3.修改EfficientDet的项目文件 73 | 74 | 1.新建dataset文件夹用以存放训练和验证数据 75 | 76 | ``` 77 | dataset 78 | └─underwater # 项目数据集名称 79 | ├─annotations # instances_train.json,instances_val.json 80 | ├─train # train jpgs 81 | └─val # val jpgs 82 | ``` 83 | 84 | 2.新建logs文件夹 85 | 86 | logs存放了训练过程中的tensprboardX保存的日志及模型 87 | 88 | 3.修改train.py[训练使用] 89 | 90 | ``` 91 | def get_args(): 92 | parser.add_argument('-p', '--project', type=str, default='underwater', help='project file that contains parameters') 93 | parser.add_argument('--batch_size', type=int, default=16, help='The number of images per batch among all devices') 94 | 95 | ``` 96 | 97 | 4.修改efficientdet_test.py[测试新图像使用] 98 | 99 | ``` 100 | # obj_list = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 101 | # 'fire hydrant', '', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 102 | # 'cow', 'elephant', 'bear', 'zebra', 'giraffe', '', 'backpack', 'umbrella', '', '', 'handbag', 'tie', 103 | # 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 104 | # 'skateboard', 'surfboard', 'tennis racket', 'bottle', '', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 105 | # 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 106 | # 'cake', 'chair', 'couch', 'potted plant', 'bed', '', 'dining table', '', '', 'toilet', '', 'tv', 107 | # 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 108 | # 'refrigerator', '', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 109 | # 'toothbrush'] 110 | 111 | obj_list = ["holothurian","echinus","scallop","starfish"]# 换成自己的 112 | compound_coef = 2 # D0-D6 113 | model.load_state_dict(torch.load("./logs/underwater/efficientdet-d2_122_38106.pth")) # 模型地址 114 | ``` 115 | 116 | 5.修改coco_eval.py[评估模型使用] 117 | 118 | ``` 119 | ap.add_argument('-p', '--project', type=str, default='underwater', help='project file that contains parameters') 120 | ``` 121 | 122 | 6.修改efficientdet/config.py 123 | 124 | ``` 125 | # COCO_CLASSES = ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", 126 | # "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", 127 | # "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", 128 | # "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", 129 | # "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", 130 | # "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", 131 | # "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant", 132 | # "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone", 133 | # "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", 134 | # "teddy bear", "hair drier", "toothbrush"] 135 | COCO_CLASSES = 
["holothurian","echinus","scallop","starfish"] 136 | ``` 137 | 138 | 7.新建yml配置文件(./projects/underwater.yml)[训练的配置文件] 139 | 140 | ``` 141 | project_name: underwater # also the folder name of the dataset that under data_path folder 142 | train_set: train 143 | val_set: val 144 | num_gpus: 1 145 | 146 | # mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco. 147 | mean: [0.485, 0.456, 0.406] 148 | std: [0.229, 0.224, 0.225] 149 | 150 | # this is coco anchors, change it if necessary 151 | anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]' 152 | anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]' 153 | 154 | # must match your dataset's category_id. 155 | # category_id is one_indexed, 156 | # for example, index of 'car' here is 2, while category_id of is 3 157 | obj_list: ["holothurian","echinus","scallop","starfish"] 158 | 159 | ``` 160 | 161 | 162 | 163 | ### 4.训练EfficientDet 164 | 165 | ``` 166 | # 从头训练自己的数据集 EfficientDet-D2 167 | python train.py -c 2 --batch_size 16 --lr 1e4 168 | 169 | # train efficientdet-d2 在自己的数据集上使用预训练的模型(推荐) 170 | python train.py -c 2 --batch_size 8 --lr 1e-5 --num_epochs 10 \ 171 | --load_weights /path/to/your/weights/efficientdet-d2.pth 172 | 173 | # with a coco-pretrained, you can even freeze the backbone and train heads only 174 | # to speed up training and help convergence. 175 | python train.py -c 2 --batch_size 8 --lr 1e-5 --num_epochs 10 \ 176 | --load_weights /path/to/your/weights/efficientdet-d2.pth \ 177 | --head_only True 178 | 179 | # Early stopping 180 | #Ctrl+c, 181 | # the program will catch KeyboardInterrupt 182 | # and stop training, save current checkpoint. 183 | 184 | # 断点训练 185 | python train.py -c 2 --batch_size 8 --lr 1e-5 \ 186 | --load_weights last \ 187 | --head_only True 188 | ``` 189 | 190 | ### 5.测试EfficientDet 191 | 192 | 1.评估模型使用coco的map 193 | 194 | ``` 195 | python coco_eval.py -p underwater -c 2 -w ./logs/underwater/efficientdet-d2_122_38106.pth 196 | ``` 197 | 198 | ``` 199 | # 评价结果 200 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.381 201 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.714 202 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.368 203 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.170 204 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.351 205 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.426 206 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.149 207 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.433 208 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.464 209 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.267 210 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.429 211 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.507 212 | 213 | ``` 214 | 215 | 2.训练过程中的Debug 216 | 217 | ``` 218 | # when you get bad result, you need to debug the training result. 219 | python train.py -c 2 --batch_size 8 --lr 1e-5 --debug True 220 | 221 | # then checkout test/ folder, there you can visualize the predicted boxes during training 222 | # don't panic if you see countless of error boxes, it happens when the training is at early stage. 
223 | # But if you still can't see a normal box after several epoches, not even one in all image, 224 | # then it's possible that either the anchors config is inappropriate or the ground truth is corrupted. 225 | ``` 226 | 227 | 3.推断新的图像 228 | 229 | ``` 230 | python efficientdet_test.py 231 | ``` 232 | 233 | 推断速度基本能达到实时: 234 | 235 | ![](pic/data/p2.png) 236 | 237 | ![](pic/data/img_test1.jpg) 238 | 239 | ![](pic/data/img_test2.jpg) 240 | 241 | 4.Tensorboard展示结果: 242 | 243 | ``` 244 | tensorboard --logdir logs/underwater/tensorboard 245 | ``` 246 | 247 | ![](pic/data/p3.png) 248 | 249 | ![](pic/data/p4.png) 250 | 251 | ![](pic/data/p5.png) -------------------------------------------------------------------------------- /backbone.py: -------------------------------------------------------------------------------- 1 | # Author: Zylo117 2 | 3 | import math 4 | 5 | import torch 6 | from torch import nn 7 | 8 | from efficientdet.model import BiFPN, Regressor, Classifier, EfficientNet 9 | from efficientdet.utils import Anchors 10 | 11 | 12 | class EfficientDetBackbone(nn.Module): 13 | def __init__(self, num_classes=80, compound_coef=0, load_weights=False, **kwargs): 14 | super(EfficientDetBackbone, self).__init__() 15 | self.compound_coef = compound_coef 16 | 17 | self.backbone_compound_coef = [0, 1, 2, 3, 4, 5, 6, 6] 18 | self.fpn_num_filters = [64, 88, 112, 160, 224, 288, 384, 384] 19 | self.fpn_cell_repeats = [3, 4, 5, 6, 7, 7, 8, 8] 20 | self.input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536] 21 | self.box_class_repeats = [3, 3, 3, 4, 4, 4, 5, 5] 22 | self.anchor_scale = [4., 4., 4., 4., 4., 4., 4., 5.] 23 | self.aspect_ratios = kwargs.get('ratios', [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]) 24 | self.num_scales = len(kwargs.get('scales', [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])) 25 | conv_channel_coef = { 26 | # the channels of P3/P4/P5. 
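            # (keyed by compound_coef 0-7; the values are the channel counts of the backbone feature maps fed into BiFPN)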
27 | 0: [40, 112, 320], 28 | 1: [40, 112, 320], 29 | 2: [48, 120, 352], 30 | 3: [48, 136, 384], 31 | 4: [56, 160, 448], 32 | 5: [64, 176, 512], 33 | 6: [72, 200, 576], 34 | 7: [72, 200, 576], 35 | } 36 | 37 | num_anchors = len(self.aspect_ratios) * self.num_scales 38 | 39 | self.bifpn = nn.Sequential( 40 | *[BiFPN(self.fpn_num_filters[self.compound_coef], 41 | conv_channel_coef[compound_coef], 42 | True if _ == 0 else False, 43 | attention=True if compound_coef < 6 else False) 44 | for _ in range(self.fpn_cell_repeats[compound_coef])]) 45 | 46 | self.num_classes = num_classes 47 | self.regressor = Regressor(in_channels=self.fpn_num_filters[self.compound_coef], num_anchors=num_anchors, 48 | num_layers=self.box_class_repeats[self.compound_coef]) 49 | self.classifier = Classifier(in_channels=self.fpn_num_filters[self.compound_coef], num_anchors=num_anchors, 50 | num_classes=num_classes, 51 | num_layers=self.box_class_repeats[self.compound_coef]) 52 | 53 | self.anchors = Anchors(anchor_scale=self.anchor_scale[compound_coef], **kwargs) 54 | 55 | self.backbone_net = EfficientNet(self.backbone_compound_coef[compound_coef], load_weights) 56 | 57 | def freeze_bn(self): 58 | for m in self.modules(): 59 | if isinstance(m, nn.BatchNorm2d): 60 | m.eval() 61 | 62 | def forward(self, inputs): 63 | max_size = inputs.shape[-1] 64 | 65 | _, p3, p4, p5 = self.backbone_net(inputs) 66 | 67 | features = (p3, p4, p5) 68 | features = self.bifpn(features) 69 | 70 | regression = self.regressor(features) 71 | classification = self.classifier(features) 72 | anchors = self.anchors(inputs, inputs.dtype) 73 | 74 | return features, regression, classification, anchors 75 | 76 | def init_backbone(self, path): 77 | state_dict = torch.load(path) 78 | try: 79 | ret = self.load_state_dict(state_dict, strict=False) 80 | print(ret) 81 | except RuntimeError as e: 82 | print('Ignoring ' + str(e) + '"') 83 | -------------------------------------------------------------------------------- /benchmark/coco_eval_result: -------------------------------------------------------------------------------- 1 | efficientdet-d0 2 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.326 3 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.502 4 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.342 5 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.118 6 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.376 7 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.509 8 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.268 9 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.402 10 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.430 11 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.172 12 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.502 13 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.624 14 | 15 | efficientdet-d1 16 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.382 17 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.568 18 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.407 19 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.181 20 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.437 21 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 
] = 0.555 22 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.304 23 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.465 24 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.496 25 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.265 26 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.562 27 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.673 28 | 29 | efficientdet-d2 30 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.415 31 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.603 32 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.440 33 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.226 34 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.471 35 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567 36 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.321 37 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.497 38 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.529 39 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.315 40 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.595 41 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.672 42 | 43 | efficientdet-d3 44 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.449 45 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.637 46 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.480 47 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.272 48 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.491 49 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.602 50 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.342 51 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.533 52 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.567 53 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.383 54 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.615 55 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.710 56 | 57 | efficientdet-d4 58 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.481 59 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.672 60 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.520 61 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.320 62 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.528 63 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.625 64 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.357 65 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.565 66 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.600 67 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.436 68 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.649 69 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.728 70 | 71 | efficientdet-d5 72 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.495 73 | Average 
Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.687 74 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.532 75 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.333 76 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540 77 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.632 78 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.367 79 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.584 80 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.621 81 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.467 82 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.662 83 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.739 84 | 85 | efficientdet-d6 86 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.501 87 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.692 88 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.540 89 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.338 90 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.544 91 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.637 92 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.368 93 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.588 94 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.626 95 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.469 96 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.667 97 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.738 98 | 99 | efficientdet-d7 100 | Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.507 101 | Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.696 102 | Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.545 103 | Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.352 104 | Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.551 105 | Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.638 106 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.370 107 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.588 108 | Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.624 109 | Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.466 110 | Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.663 111 | Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.743 112 | -------------------------------------------------------------------------------- /coco_eval.py: -------------------------------------------------------------------------------- 1 | # Author: Zylo117 2 | 3 | """ 4 | COCO-Style Evaluations 5 | 6 | put images here datasets/your_project_name/annotations/val_set_name/*.jpg 7 | put annotations here datasets/your_project_name/annotations/instances_{val_set_name}.json 8 | put weights here /path/to/your/weights/*.pth 9 | change compound_coef 10 | 11 | """ 12 | 13 | import json 14 | import os 15 | 16 | import argparse 17 | import torch 18 | import yaml 19 | from tqdm import tqdm 20 | from pycocotools.coco import COCO 21 | from pycocotools.cocoeval import COCOeval 
22 | 23 | from backbone import EfficientDetBackbone 24 | from efficientdet.utils import BBoxTransform, ClipBoxes 25 | from utils.utils import preprocess, invert_affine, postprocess 26 | 27 | ap = argparse.ArgumentParser() 28 | ap.add_argument('-p', '--project', type=str, default='underwater', help='project file that contains parameters') 29 | ap.add_argument('-c', '--compound_coef', type=int, default=0, help='coefficients of efficientdet') 30 | ap.add_argument('-w', '--weights', type=str, default=None, help='/path/to/weights') 31 | ap.add_argument('--nms_threshold', type=float, default=0.5, help='nms threshold, don\'t change it if not for testing purposes') 32 | ap.add_argument('--cuda', type=bool, default=True) 33 | ap.add_argument('--device', type=int, default=0) 34 | ap.add_argument('--float16', type=bool, default=False) 35 | args = ap.parse_args() 36 | 37 | compound_coef = args.compound_coef 38 | nms_threshold = args.nms_threshold 39 | use_cuda = args.cuda 40 | gpu = args.device 41 | use_float16 = args.float16 42 | project_name = args.project 43 | weights_path = 'weights/efficientdet-d{}.pth'.format(compound_coef) if args.weights is None else args.weights 44 | 45 | print('running coco-style evaluation on project {}, weights {}...'.format(project_name,weights_path)) 46 | 47 | params = yaml.safe_load(open('projects/{}.yml'.format(project_name))) 48 | obj_list = params['obj_list'] 49 | 50 | input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536] 51 | 52 | 53 | def evaluate_coco(img_path, set_name, image_ids, coco, model, threshold=0.05): 54 | results = [] 55 | processed_image_ids = [] 56 | 57 | regressBoxes = BBoxTransform() 58 | clipBoxes = ClipBoxes() 59 | 60 | for image_id in tqdm(image_ids): 61 | image_info = coco.loadImgs(image_id)[0] 62 | image_path = img_path + image_info['file_name'] 63 | 64 | ori_imgs, framed_imgs, framed_metas = preprocess(image_path, max_size=input_sizes[compound_coef]) 65 | x = torch.from_numpy(framed_imgs[0]) 66 | 67 | if use_cuda: 68 | x = x.cuda(gpu) 69 | if use_float16: 70 | x = x.half() 71 | else: 72 | x = x.float() 73 | else: 74 | x = x.float() 75 | 76 | x = x.unsqueeze(0).permute(0, 3, 1, 2) 77 | features, regression, classification, anchors = model(x) 78 | 79 | preds = postprocess(x, 80 | anchors, regression, classification, 81 | regressBoxes, clipBoxes, 82 | threshold, nms_threshold) 83 | 84 | processed_image_ids.append(image_id) 85 | 86 | if not preds: 87 | continue 88 | 89 | preds = invert_affine(framed_metas, preds)[0] 90 | 91 | scores = preds['scores'] 92 | class_ids = preds['class_ids'] 93 | rois = preds['rois'] 94 | 95 | if rois.shape[0] > 0: 96 | # x1,y1,x2,y2 -> x1,y1,w,h 97 | rois[:, 2] -= rois[:, 0] 98 | rois[:, 3] -= rois[:, 1] 99 | 100 | bbox_score = scores 101 | 102 | for roi_id in range(rois.shape[0]): 103 | score = float(bbox_score[roi_id]) 104 | label = int(class_ids[roi_id]) 105 | box = rois[roi_id, :] 106 | 107 | if score < threshold: 108 | break 109 | image_result = { 110 | 'image_id': image_id, 111 | 'category_id': label + 1, 112 | 'score': float(score), 113 | 'bbox': box.tolist(), 114 | } 115 | 116 | results.append(image_result) 117 | 118 | if not len(results): 119 | raise Exception('the model does not provide any valid output, check model architecture and the data input') 120 | 121 | # write output 122 | json.dump(results, open('{}_bbox_results.json'.format(set_name), 'w'), indent=4) 123 | 124 | return processed_image_ids 125 | 126 | 127 | def _eval(coco_gt, image_ids, pred_json_path): 128 | # load results in COCO evaluation 
tool 129 | coco_pred = coco_gt.loadRes(pred_json_path) 130 | 131 | # run COCO evaluation 132 | print('BBox') 133 | coco_eval = COCOeval(coco_gt, coco_pred, 'bbox') 134 | coco_eval.params.imgIds = image_ids 135 | coco_eval.evaluate() 136 | coco_eval.accumulate() 137 | coco_eval.summarize() 138 | 139 | 140 | if __name__ == '__main__': 141 | SET_NAME = params['val_set'] 142 | VAL_GT = 'dataset/{}/annotations/instances_{}.json'.format(params["project_name"],SET_NAME) 143 | VAL_IMGS = 'dataset/{}/{}/'.format(params["project_name"],SET_NAME) 144 | MAX_IMAGES = 10000 145 | coco_gt = COCO(VAL_GT) 146 | image_ids = coco_gt.getImgIds()[:MAX_IMAGES] 147 | 148 | if not os.path.exists('{}_bbox_results.json'.format(SET_NAME)): 149 | model = EfficientDetBackbone(compound_coef=compound_coef, num_classes=len(obj_list), 150 | ratios=eval(params['anchors_ratios']), scales=eval(params['anchors_scales'])) 151 | model.load_state_dict(torch.load(weights_path, map_location=torch.device('cpu'))) 152 | model.requires_grad_(False) 153 | model.eval() 154 | 155 | if use_cuda: 156 | model.cuda(gpu) 157 | 158 | if use_float16: 159 | model.half() 160 | 161 | image_ids = evaluate_coco(VAL_IMGS, SET_NAME, image_ids, coco_gt, model) 162 | 163 | _eval(coco_gt, image_ids, '{}_bbox_results.json'.format(SET_NAME)) 164 | else: 165 | _eval(coco_gt, image_ids, '{}_bbox_results.json'.format(SET_NAME)) 166 | -------------------------------------------------------------------------------- /efficientdet-d2.pth: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/efficientdet-d2.pth -------------------------------------------------------------------------------- /efficientdet/config.py: -------------------------------------------------------------------------------- 1 | # COCO_CLASSES = ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", 2 | # "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", 3 | # "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", 4 | # "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", 5 | # "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", 6 | # "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", 7 | # "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch", "potted plant", 8 | # "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone", 9 | # "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", 10 | # "teddy bear", "hair drier", "toothbrush"] 11 | COCO_CLASSES = ["holothurian","echinus","scallop","starfish"] 12 | 13 | colors = [(39, 129, 113), (164, 80, 133), (83, 122, 114), (99, 81, 172), (95, 56, 104), (37, 84, 86), (14, 89, 122), 14 | (80, 7, 65), (10, 102, 25), (90, 185, 109), (106, 110, 132), (169, 158, 85), (188, 185, 26), (103, 1, 17), 15 | (82, 144, 81), (92, 7, 184), (49, 81, 155), (179, 177, 69), (93, 187, 158), (13, 39, 73), (12, 50, 60), 16 | (16, 179, 33), (112, 69, 165), (15, 139, 63), (33, 191, 159), (182, 173, 32), (34, 113, 133), (90, 135, 34), 17 | (53, 34, 86), (141, 35, 190), (6, 171, 8), (118, 76, 112), (89, 60, 55), (15, 54, 88), (112, 75, 181), 18 | (42, 147, 38), (138, 52, 63), (128, 65, 149), (106, 103, 24), (168, 33, 45), (28, 136, 
135), (86, 91, 108), 19 | (52, 11, 76), (142, 6, 189), (57, 81, 168), (55, 19, 148), (182, 101, 89), (44, 65, 179), (1, 33, 26), 20 | (122, 164, 26), (70, 63, 134), (137, 106, 82), (120, 118, 52), (129, 74, 42), (182, 147, 112), (22, 157, 50), 21 | (56, 50, 20), (2, 22, 177), (156, 100, 106), (21, 35, 42), (13, 8, 121), (142, 92, 28), (45, 118, 33), 22 | (105, 118, 30), (7, 185, 124), (46, 34, 146), (105, 184, 169), (22, 18, 5), (147, 71, 73), (181, 64, 91), 23 | (31, 39, 184), (164, 179, 33), (96, 50, 18), (95, 15, 106), (113, 68, 54), (136, 116, 112), (119, 139, 130), 24 | (31, 139, 34), (66, 6, 127), (62, 39, 2), (49, 99, 180), (49, 119, 155), (153, 50, 183), (125, 38, 3), 25 | (129, 87, 143), (49, 87, 40), (128, 62, 120), (73, 85, 148), (28, 144, 118), (29, 9, 24), (175, 45, 108), 26 | (81, 175, 64), (178, 19, 157), (74, 188, 190), (18, 114, 2), (62, 128, 96), (21, 3, 150), (0, 6, 95), 27 | (2, 20, 184), (122, 37, 185)] 28 | -------------------------------------------------------------------------------- /efficientdet/dataset.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import numpy as np 4 | 5 | from torch.utils.data import Dataset, DataLoader 6 | from pycocotools.coco import COCO 7 | import cv2 8 | 9 | 10 | class CocoDataset(Dataset): 11 | def __init__(self, root_dir, set='train', transform=None): 12 | 13 | self.root_dir = root_dir 14 | self.set_name = set 15 | self.transform = transform 16 | 17 | self.coco = COCO(os.path.join(self.root_dir, 'annotations', 'instances_' + self.set_name + '.json')) 18 | self.image_ids = self.coco.getImgIds() 19 | 20 | self.load_classes() 21 | 22 | def load_classes(self): 23 | 24 | # load class names (name -> label) 25 | categories = self.coco.loadCats(self.coco.getCatIds()) 26 | categories.sort(key=lambda x: x['id']) 27 | 28 | self.classes = {} 29 | self.coco_labels = {} 30 | self.coco_labels_inverse = {} 31 | for c in categories: 32 | self.coco_labels[len(self.classes)] = c['id'] 33 | self.coco_labels_inverse[c['id']] = len(self.classes) 34 | self.classes[c['name']] = len(self.classes) 35 | 36 | # also load the reverse (label -> name) 37 | self.labels = {} 38 | for key, value in self.classes.items(): 39 | self.labels[value] = key 40 | 41 | def __len__(self): 42 | return len(self.image_ids) 43 | 44 | def __getitem__(self, idx): 45 | 46 | img = self.load_image(idx) 47 | annot = self.load_annotations(idx) 48 | sample = {'img': img, 'annot': annot} 49 | if self.transform: 50 | sample = self.transform(sample) 51 | return sample 52 | 53 | def load_image(self, image_index): 54 | image_info = self.coco.loadImgs(self.image_ids[image_index])[0] 55 | path = os.path.join(self.root_dir, self.set_name, image_info['file_name']) 56 | img = cv2.imread(path) 57 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 58 | 59 | return img.astype(np.float32) / 255. 
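    # Note: COCO stores bboxes as [x, y, w, h]; load_annotations below converts them to [x1, y1, x2, y2] with the class index in column 4.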
60 | 61 | def load_annotations(self, image_index): 62 | # get ground truth annotations 63 | annotations_ids = self.coco.getAnnIds(imgIds=self.image_ids[image_index], iscrowd=False) 64 | annotations = np.zeros((0, 5)) 65 | 66 | # some images appear to miss annotations 67 | if len(annotations_ids) == 0: 68 | return annotations 69 | 70 | # parse annotations 71 | coco_annotations = self.coco.loadAnns(annotations_ids) 72 | for idx, a in enumerate(coco_annotations): 73 | 74 | # some annotations have basically no width / height, skip them 75 | if a['bbox'][2] < 1 or a['bbox'][3] < 1: 76 | continue 77 | 78 | annotation = np.zeros((1, 5)) 79 | annotation[0, :4] = a['bbox'] 80 | annotation[0, 4] = self.coco_label_to_label(a['category_id']) 81 | annotations = np.append(annotations, annotation, axis=0) 82 | 83 | # transform from [x, y, w, h] to [x1, y1, x2, y2] 84 | annotations[:, 2] = annotations[:, 0] + annotations[:, 2] 85 | annotations[:, 3] = annotations[:, 1] + annotations[:, 3] 86 | 87 | return annotations 88 | 89 | def coco_label_to_label(self, coco_label): 90 | return self.coco_labels_inverse[coco_label] 91 | 92 | def label_to_coco_label(self, label): 93 | return self.coco_labels[label] 94 | 95 | 96 | def collater(data): 97 | imgs = [s['img'] for s in data] 98 | annots = [s['annot'] for s in data] 99 | scales = [s['scale'] for s in data] 100 | 101 | imgs = torch.from_numpy(np.stack(imgs, axis=0)) 102 | 103 | max_num_annots = max(annot.shape[0] for annot in annots) 104 | 105 | if max_num_annots > 0: 106 | 107 | annot_padded = torch.ones((len(annots), max_num_annots, 5)) * -1 108 | 109 | if max_num_annots > 0: 110 | for idx, annot in enumerate(annots): 111 | if annot.shape[0] > 0: 112 | annot_padded[idx, :annot.shape[0], :] = annot 113 | else: 114 | annot_padded = torch.ones((len(annots), 1, 5)) * -1 115 | 116 | imgs = imgs.permute(0, 3, 1, 2) 117 | 118 | return {'img': imgs, 'annot': annot_padded, 'scale': scales} 119 | 120 | 121 | class Resizer(object): 122 | """Convert ndarrays in sample to Tensors.""" 123 | 124 | def __init__(self, img_size=512): 125 | self.img_size = img_size 126 | 127 | def __call__(self, sample): 128 | image, annots = sample['img'], sample['annot'] 129 | height, width, _ = image.shape 130 | if height > width: 131 | scale = self.img_size / height 132 | resized_height = self.img_size 133 | resized_width = int(width * scale) 134 | else: 135 | scale = self.img_size / width 136 | resized_height = int(height * scale) 137 | resized_width = self.img_size 138 | 139 | image = cv2.resize(image, (resized_width, resized_height), interpolation=cv2.INTER_LINEAR) 140 | 141 | new_image = np.zeros((self.img_size, self.img_size, 3)) 142 | new_image[0:resized_height, 0:resized_width] = image 143 | 144 | annots[:, :4] *= scale 145 | 146 | return {'img': torch.from_numpy(new_image).to(torch.float32), 'annot': torch.from_numpy(annots), 'scale': scale} 147 | 148 | 149 | class Augmenter(object): 150 | """Convert ndarrays in sample to Tensors.""" 151 | 152 | def __call__(self, sample, flip_x=0.5): 153 | if np.random.rand() < flip_x: 154 | image, annots = sample['img'], sample['annot'] 155 | image = image[:, ::-1, :] 156 | 157 | rows, cols, channels = image.shape 158 | 159 | x1 = annots[:, 0].copy() 160 | x2 = annots[:, 2].copy() 161 | 162 | x_tmp = x1.copy() 163 | 164 | annots[:, 0] = cols - x2 165 | annots[:, 2] = cols - x_tmp 166 | 167 | sample = {'img': image, 'annot': annots} 168 | 169 | return sample 170 | 171 | 172 | class Normalizer(object): 173 | 174 | def __init__(self, mean=[0.485, 
0.456, 0.406], std=[0.229, 0.224, 0.225]): 175 | self.mean = np.array([[mean]]) 176 | self.std = np.array([[std]]) 177 | 178 | def __call__(self, sample): 179 | image, annots = sample['img'], sample['annot'] 180 | 181 | return {'img': ((image.astype(np.float32) - self.mean) / self.std), 'annot': annots} 182 | -------------------------------------------------------------------------------- /efficientdet/loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import cv2 4 | import numpy as np 5 | 6 | from efficientdet.utils import BBoxTransform, ClipBoxes 7 | from utils.utils import postprocess, invert_affine, display 8 | 9 | 10 | def calc_iou(a, b): 11 | # a(anchor) [boxes, (y1, x1, y2, x2)] 12 | # b(gt, coco-style) [boxes, (x1, y1, x2, y2)] 13 | 14 | area = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1]) 15 | iw = torch.min(torch.unsqueeze(a[:, 3], dim=1), b[:, 2]) - torch.max(torch.unsqueeze(a[:, 1], 1), b[:, 0]) 16 | ih = torch.min(torch.unsqueeze(a[:, 2], dim=1), b[:, 3]) - torch.max(torch.unsqueeze(a[:, 0], 1), b[:, 1]) 17 | iw = torch.clamp(iw, min=0) 18 | ih = torch.clamp(ih, min=0) 19 | ua = torch.unsqueeze((a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1]), dim=1) + area - iw * ih 20 | ua = torch.clamp(ua, min=1e-8) 21 | intersection = iw * ih 22 | IoU = intersection / ua 23 | 24 | return IoU 25 | 26 | 27 | class FocalLoss(nn.Module): 28 | def __init__(self): 29 | super(FocalLoss, self).__init__() 30 | 31 | def forward(self, classifications, regressions, anchors, annotations, **kwargs): 32 | alpha = 0.25 33 | gamma = 2.0 34 | batch_size = classifications.shape[0] 35 | classification_losses = [] 36 | regression_losses = [] 37 | 38 | anchor = anchors[0, :, :] # assuming all image sizes are the same, which it is 39 | dtype = anchors.dtype 40 | 41 | anchor_widths = anchor[:, 3] - anchor[:, 1] 42 | anchor_heights = anchor[:, 2] - anchor[:, 0] 43 | anchor_ctr_x = anchor[:, 1] + 0.5 * anchor_widths 44 | anchor_ctr_y = anchor[:, 0] + 0.5 * anchor_heights 45 | 46 | for j in range(batch_size): 47 | 48 | classification = classifications[j, :, :] 49 | regression = regressions[j, :, :] 50 | 51 | bbox_annotation = annotations[j] 52 | bbox_annotation = bbox_annotation[bbox_annotation[:, 4] != -1] 53 | 54 | if bbox_annotation.shape[0] == 0: 55 | if torch.cuda.is_available(): 56 | regression_losses.append(torch.tensor(0).to(dtype).cuda()) 57 | classification_losses.append(torch.tensor(0).to(dtype).cuda()) 58 | else: 59 | regression_losses.append(torch.tensor(0).to(dtype)) 60 | classification_losses.append(torch.tensor(0).to(dtype)) 61 | 62 | continue 63 | 64 | classification = torch.clamp(classification, 1e-4, 1.0 - 1e-4) 65 | 66 | IoU = calc_iou(anchor[:, :], bbox_annotation[:, :4]) 67 | 68 | IoU_max, IoU_argmax = torch.max(IoU, dim=1) 69 | 70 | # compute the loss for classification 71 | targets = torch.ones_like(classification) * -1 72 | if torch.cuda.is_available(): 73 | targets = targets.cuda() 74 | 75 | targets[torch.lt(IoU_max, 0.4), :] = 0 76 | 77 | positive_indices = torch.ge(IoU_max, 0.5) 78 | 79 | num_positive_anchors = positive_indices.sum() 80 | 81 | assigned_annotations = bbox_annotation[IoU_argmax, :] 82 | 83 | targets[positive_indices, :] = 0 84 | targets[positive_indices, assigned_annotations[positive_indices, 4].long()] = 1 85 | 86 | alpha_factor = torch.ones_like(targets) * alpha 87 | if torch.cuda.is_available(): 88 | alpha_factor = alpha_factor.cuda() 89 | 90 | alpha_factor = torch.where(torch.eq(targets, 1.), 
alpha_factor, 1. - alpha_factor) 91 | focal_weight = torch.where(torch.eq(targets, 1.), 1. - classification, classification) 92 | focal_weight = alpha_factor * torch.pow(focal_weight, gamma) 93 | 94 | bce = -(targets * torch.log(classification) + (1.0 - targets) * torch.log(1.0 - classification)) 95 | 96 | cls_loss = focal_weight * bce 97 | 98 | zeros = torch.zeros_like(cls_loss) 99 | if torch.cuda.is_available(): 100 | zeros = zeros.cuda() 101 | cls_loss = torch.where(torch.ne(targets, -1.0), cls_loss, zeros) 102 | 103 | classification_losses.append(cls_loss.sum() / torch.clamp(num_positive_anchors.to(dtype), min=1.0)) 104 | 105 | if positive_indices.sum() > 0: 106 | assigned_annotations = assigned_annotations[positive_indices, :] 107 | 108 | anchor_widths_pi = anchor_widths[positive_indices] 109 | anchor_heights_pi = anchor_heights[positive_indices] 110 | anchor_ctr_x_pi = anchor_ctr_x[positive_indices] 111 | anchor_ctr_y_pi = anchor_ctr_y[positive_indices] 112 | 113 | gt_widths = assigned_annotations[:, 2] - assigned_annotations[:, 0] 114 | gt_heights = assigned_annotations[:, 3] - assigned_annotations[:, 1] 115 | gt_ctr_x = assigned_annotations[:, 0] + 0.5 * gt_widths 116 | gt_ctr_y = assigned_annotations[:, 1] + 0.5 * gt_heights 117 | 118 | # efficientdet style 119 | gt_widths = torch.clamp(gt_widths, min=1) 120 | gt_heights = torch.clamp(gt_heights, min=1) 121 | 122 | targets_dx = (gt_ctr_x - anchor_ctr_x_pi) / anchor_widths_pi 123 | targets_dy = (gt_ctr_y - anchor_ctr_y_pi) / anchor_heights_pi 124 | targets_dw = torch.log(gt_widths / anchor_widths_pi) 125 | targets_dh = torch.log(gt_heights / anchor_heights_pi) 126 | 127 | targets = torch.stack((targets_dy, targets_dx, targets_dh, targets_dw)) 128 | targets = targets.t() 129 | 130 | regression_diff = torch.abs(targets - regression[positive_indices, :]) 131 | 132 | regression_loss = torch.where( 133 | torch.le(regression_diff, 1.0 / 9.0), 134 | 0.5 * 9.0 * torch.pow(regression_diff, 2), 135 | regression_diff - 0.5 / 9.0 136 | ) 137 | regression_losses.append(regression_loss.mean()) 138 | else: 139 | if torch.cuda.is_available(): 140 | regression_losses.append(torch.tensor(0).to(dtype).cuda()) 141 | else: 142 | regression_losses.append(torch.tensor(0).to(dtype)) 143 | 144 | # debug 145 | imgs = kwargs.get('imgs', None) 146 | if imgs is not None: 147 | regressBoxes = BBoxTransform() 148 | clipBoxes = ClipBoxes() 149 | obj_list = kwargs.get('obj_list', None) 150 | out = postprocess(imgs.detach(), 151 | torch.stack([anchors[0]] * imgs.shape[0], 0).detach(), regressions.detach(), classifications.detach(), 152 | regressBoxes, clipBoxes, 153 | 0.5, 0.3) 154 | imgs = imgs.permute(0, 2, 3, 1).cpu().numpy() 155 | imgs = ((imgs * [0.229, 0.224, 0.225] + [0.485, 0.456, 0.406]) * 255).astype(np.uint8) 156 | imgs = [cv2.cvtColor(img, cv2.COLOR_RGB2BGR) for img in imgs] 157 | display(out, imgs, obj_list, imshow=False, imwrite=True) 158 | 159 | return torch.stack(classification_losses).mean(dim=0, keepdim=True), \ 160 | torch.stack(regression_losses).mean(dim=0, keepdim=True) 161 | -------------------------------------------------------------------------------- /efficientdet/model.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | from torchvision.ops.boxes import nms as nms_torch 4 | 5 | from efficientnet import EfficientNet as EffNet 6 | from efficientnet.utils import MemoryEfficientSwish, Swish 7 | from efficientnet.utils_extra import Conv2dStaticSamePadding, 
MaxPool2dStaticSamePadding 8 | 9 | 10 | def nms(dets, thresh): 11 | return nms_torch(dets[:, :4], dets[:, 4], thresh) 12 | 13 | 14 | class SeparableConvBlock(nn.Module): 15 | """ 16 | created by Zylo117 17 | """ 18 | 19 | def __init__(self, in_channels, out_channels=None, norm=True, activation=False, onnx_export=False): 20 | super(SeparableConvBlock, self).__init__() 21 | if out_channels is None: 22 | out_channels = in_channels 23 | 24 | # Q: whether separate conv 25 | # share bias between depthwise_conv and pointwise_conv 26 | # or just pointwise_conv apply bias. 27 | # A: Confirmed, just pointwise_conv applies bias, depthwise_conv has no bias. 28 | 29 | self.depthwise_conv = Conv2dStaticSamePadding(in_channels, in_channels, 30 | kernel_size=3, stride=1, groups=in_channels, bias=False) 31 | self.pointwise_conv = Conv2dStaticSamePadding(in_channels, out_channels, kernel_size=1, stride=1) 32 | 33 | self.norm = norm 34 | if self.norm: 35 | # Warning: pytorch momentum is different from tensorflow's, momentum_pytorch = 1 - momentum_tensorflow 36 | self.bn = nn.BatchNorm2d(num_features=out_channels, momentum=0.01, eps=1e-3) 37 | 38 | self.activation = activation 39 | if self.activation: 40 | self.swish = MemoryEfficientSwish() if not onnx_export else Swish() 41 | 42 | def forward(self, x): 43 | x = self.depthwise_conv(x) 44 | x = self.pointwise_conv(x) 45 | 46 | if self.norm: 47 | x = self.bn(x) 48 | 49 | if self.activation: 50 | x = self.swish(x) 51 | 52 | return x 53 | 54 | 55 | class BiFPN(nn.Module): 56 | """ 57 | modified by Zylo117 58 | """ 59 | 60 | def __init__(self, num_channels, conv_channels, first_time=False, epsilon=1e-4, onnx_export=False, attention=True): 61 | """ 62 | 63 | Args: 64 | num_channels: 65 | conv_channels: 66 | first_time: whether the input comes directly from the efficientnet, 67 | if True, downchannel it first, and downsample P5 to generate P6 then P7 68 | epsilon: epsilon of fast weighted attention sum of BiFPN, not the BN's epsilon 69 | onnx_export: if True, use Swish instead of MemoryEfficientSwish 70 | """ 71 | super(BiFPN, self).__init__() 72 | self.epsilon = epsilon 73 | # Conv layers 74 | self.conv6_up = SeparableConvBlock(num_channels, onnx_export=onnx_export) 75 | self.conv5_up = SeparableConvBlock(num_channels, onnx_export=onnx_export) 76 | self.conv4_up = SeparableConvBlock(num_channels, onnx_export=onnx_export) 77 | self.conv3_up = SeparableConvBlock(num_channels, onnx_export=onnx_export) 78 | self.conv4_down = SeparableConvBlock(num_channels, onnx_export=onnx_export) 79 | self.conv5_down = SeparableConvBlock(num_channels, onnx_export=onnx_export) 80 | self.conv6_down = SeparableConvBlock(num_channels, onnx_export=onnx_export) 81 | self.conv7_down = SeparableConvBlock(num_channels, onnx_export=onnx_export) 82 | 83 | # Feature scaling layers 84 | self.p6_upsample = nn.Upsample(scale_factor=2, mode='nearest') 85 | self.p5_upsample = nn.Upsample(scale_factor=2, mode='nearest') 86 | self.p4_upsample = nn.Upsample(scale_factor=2, mode='nearest') 87 | self.p3_upsample = nn.Upsample(scale_factor=2, mode='nearest') 88 | 89 | self.p4_downsample = MaxPool2dStaticSamePadding(3, 2) 90 | self.p5_downsample = MaxPool2dStaticSamePadding(3, 2) 91 | self.p6_downsample = MaxPool2dStaticSamePadding(3, 2) 92 | self.p7_downsample = MaxPool2dStaticSamePadding(3, 2) 93 | 94 | self.swish = MemoryEfficientSwish() if not onnx_export else Swish() 95 | 96 | self.first_time = first_time 97 | if self.first_time: 98 | self.p5_down_channel = nn.Sequential( 99 | 
Conv2dStaticSamePadding(conv_channels[2], num_channels, 1), 100 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 101 | ) 102 | self.p4_down_channel = nn.Sequential( 103 | Conv2dStaticSamePadding(conv_channels[1], num_channels, 1), 104 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 105 | ) 106 | self.p3_down_channel = nn.Sequential( 107 | Conv2dStaticSamePadding(conv_channels[0], num_channels, 1), 108 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 109 | ) 110 | 111 | self.p5_to_p6 = nn.Sequential( 112 | Conv2dStaticSamePadding(conv_channels[2], num_channels, 1), 113 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 114 | MaxPool2dStaticSamePadding(3, 2) 115 | ) 116 | self.p6_to_p7 = nn.Sequential( 117 | MaxPool2dStaticSamePadding(3, 2) 118 | ) 119 | 120 | self.p4_down_channel_2 = nn.Sequential( 121 | Conv2dStaticSamePadding(conv_channels[1], num_channels, 1), 122 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 123 | ) 124 | self.p5_down_channel_2 = nn.Sequential( 125 | Conv2dStaticSamePadding(conv_channels[2], num_channels, 1), 126 | nn.BatchNorm2d(num_channels, momentum=0.01, eps=1e-3), 127 | ) 128 | 129 | # Weight 130 | self.p6_w1 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True) 131 | self.p6_w1_relu = nn.ReLU() 132 | self.p5_w1 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True) 133 | self.p5_w1_relu = nn.ReLU() 134 | self.p4_w1 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True) 135 | self.p4_w1_relu = nn.ReLU() 136 | self.p3_w1 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True) 137 | self.p3_w1_relu = nn.ReLU() 138 | 139 | self.p4_w2 = nn.Parameter(torch.ones(3, dtype=torch.float32), requires_grad=True) 140 | self.p4_w2_relu = nn.ReLU() 141 | self.p5_w2 = nn.Parameter(torch.ones(3, dtype=torch.float32), requires_grad=True) 142 | self.p5_w2_relu = nn.ReLU() 143 | self.p6_w2 = nn.Parameter(torch.ones(3, dtype=torch.float32), requires_grad=True) 144 | self.p6_w2_relu = nn.ReLU() 145 | self.p7_w2 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True) 146 | self.p7_w2_relu = nn.ReLU() 147 | 148 | self.attention = attention 149 | 150 | def forward(self, inputs): 151 | """ 152 | illustration of a minimal bifpn unit 153 | P7_0 -------------------------> P7_2 --------> 154 | |-------------| ↑ 155 | ↓ | 156 | P6_0 ---------> P6_1 ---------> P6_2 --------> 157 | |-------------|--------------↑ ↑ 158 | ↓ | 159 | P5_0 ---------> P5_1 ---------> P5_2 --------> 160 | |-------------|--------------↑ ↑ 161 | ↓ | 162 | P4_0 ---------> P4_1 ---------> P4_2 --------> 163 | |-------------|--------------↑ ↑ 164 | |--------------↓ | 165 | P3_0 -------------------------> P3_2 --------> 166 | """ 167 | 168 | # downsample channels using same-padding conv2d to target phase's if not the same 169 | # judge: same phase as target, 170 | # if same, pass; 171 | # elif earlier phase, downsample to target phase's by pooling 172 | # elif later phase, upsample to target phase's by nearest interpolation 173 | 174 | if self.attention: 175 | p3_out, p4_out, p5_out, p6_out, p7_out = self._forward_fast_attention(inputs) 176 | else: 177 | p3_out, p4_out, p5_out, p6_out, p7_out = self._forward(inputs) 178 | 179 | return p3_out, p4_out, p5_out, p6_out, p7_out 180 | 181 | def _forward_fast_attention(self, inputs): 182 | if self.first_time: 183 | p3, p4, p5 = inputs 184 | 185 | p6_in = self.p5_to_p6(p5) 186 | p7_in = self.p6_to_p7(p6_in) 187 | 188 | p3_in = self.p3_down_channel(p3) 189 | p4_in = 
self.p4_down_channel(p4) 190 | p5_in = self.p5_down_channel(p5) 191 | 192 | else: 193 | # P3_0, P4_0, P5_0, P6_0 and P7_0 194 | p3_in, p4_in, p5_in, p6_in, p7_in = inputs 195 | 196 | # P7_0 to P7_2 197 | 198 | # Weights for P6_0 and P7_0 to P6_1 199 | p6_w1 = self.p6_w1_relu(self.p6_w1) 200 | weight = p6_w1 / (torch.sum(p6_w1, dim=0) + self.epsilon) 201 | # Connections for P6_0 and P7_0 to P6_1 respectively 202 | p6_up = self.conv6_up(self.swish(weight[0] * p6_in + weight[1] * self.p6_upsample(p7_in))) 203 | 204 | # Weights for P5_0 and P6_0 to P5_1 205 | p5_w1 = self.p5_w1_relu(self.p5_w1) 206 | weight = p5_w1 / (torch.sum(p5_w1, dim=0) + self.epsilon) 207 | # Connections for P5_0 and P6_0 to P5_1 respectively 208 | p5_up = self.conv5_up(self.swish(weight[0] * p5_in + weight[1] * self.p5_upsample(p6_up))) 209 | 210 | # Weights for P4_0 and P5_0 to P4_1 211 | p4_w1 = self.p4_w1_relu(self.p4_w1) 212 | weight = p4_w1 / (torch.sum(p4_w1, dim=0) + self.epsilon) 213 | # Connections for P4_0 and P5_0 to P4_1 respectively 214 | p4_up = self.conv4_up(self.swish(weight[0] * p4_in + weight[1] * self.p4_upsample(p5_up))) 215 | 216 | # Weights for P3_0 and P4_1 to P3_2 217 | p3_w1 = self.p3_w1_relu(self.p3_w1) 218 | weight = p3_w1 / (torch.sum(p3_w1, dim=0) + self.epsilon) 219 | # Connections for P3_0 and P4_1 to P3_2 respectively 220 | p3_out = self.conv3_up(self.swish(weight[0] * p3_in + weight[1] * self.p3_upsample(p4_up))) 221 | 222 | if self.first_time: 223 | p4_in = self.p4_down_channel_2(p4) 224 | p5_in = self.p5_down_channel_2(p5) 225 | 226 | # Weights for P4_0, P4_1 and P3_2 to P4_2 227 | p4_w2 = self.p4_w2_relu(self.p4_w2) 228 | weight = p4_w2 / (torch.sum(p4_w2, dim=0) + self.epsilon) 229 | # Connections for P4_0, P4_1 and P3_2 to P4_2 respectively 230 | p4_out = self.conv4_down( 231 | self.swish(weight[0] * p4_in + weight[1] * p4_up + weight[2] * self.p4_downsample(p3_out))) 232 | 233 | # Weights for P5_0, P5_1 and P4_2 to P5_2 234 | p5_w2 = self.p5_w2_relu(self.p5_w2) 235 | weight = p5_w2 / (torch.sum(p5_w2, dim=0) + self.epsilon) 236 | # Connections for P5_0, P5_1 and P4_2 to P5_2 respectively 237 | p5_out = self.conv5_down( 238 | self.swish(weight[0] * p5_in + weight[1] * p5_up + weight[2] * self.p5_downsample(p4_out))) 239 | 240 | # Weights for P6_0, P6_1 and P5_2 to P6_2 241 | p6_w2 = self.p6_w2_relu(self.p6_w2) 242 | weight = p6_w2 / (torch.sum(p6_w2, dim=0) + self.epsilon) 243 | # Connections for P6_0, P6_1 and P5_2 to P6_2 respectively 244 | p6_out = self.conv6_down( 245 | self.swish(weight[0] * p6_in + weight[1] * p6_up + weight[2] * self.p6_downsample(p5_out))) 246 | 247 | # Weights for P7_0 and P6_2 to P7_2 248 | p7_w2 = self.p7_w2_relu(self.p7_w2) 249 | weight = p7_w2 / (torch.sum(p7_w2, dim=0) + self.epsilon) 250 | # Connections for P7_0 and P6_2 to P7_2 251 | p7_out = self.conv7_down(self.swish(weight[0] * p7_in + weight[1] * self.p7_downsample(p6_out))) 252 | 253 | return p3_out, p4_out, p5_out, p6_out, p7_out 254 | 255 | def _forward(self, inputs): 256 | if self.first_time: 257 | p3, p4, p5 = inputs 258 | 259 | p6_in = self.p5_to_p6(p5) 260 | p7_in = self.p6_to_p7(p6_in) 261 | 262 | p3_in = self.p3_down_channel(p3) 263 | p4_in = self.p4_down_channel(p4) 264 | p5_in = self.p5_down_channel(p5) 265 | 266 | else: 267 | # P3_0, P4_0, P5_0, P6_0 and P7_0 268 | p3_in, p4_in, p5_in, p6_in, p7_in = inputs 269 | 270 | # P7_0 to P7_2 271 | 272 | # Connections for P6_0 and P7_0 to P6_1 respectively 273 | p6_up = self.conv6_up(self.swish(p6_in + self.p6_upsample(p7_in))) 274 | 275 | # 
Connections for P5_0 and P6_0 to P5_1 respectively 276 | p5_up = self.conv5_up(self.swish(p5_in + self.p5_upsample(p6_up))) 277 | 278 | # Connections for P4_0 and P5_0 to P4_1 respectively 279 | p4_up = self.conv4_up(self.swish(p4_in + self.p4_upsample(p5_up))) 280 | 281 | # Connections for P3_0 and P4_1 to P3_2 respectively 282 | p3_out = self.conv3_up(self.swish(p3_in + self.p3_upsample(p4_up))) 283 | 284 | if self.first_time: 285 | p4_in = self.p4_down_channel_2(p4) 286 | p5_in = self.p5_down_channel_2(p5) 287 | 288 | # Connections for P4_0, P4_1 and P3_2 to P4_2 respectively 289 | p4_out = self.conv4_down( 290 | self.swish(p4_in + p4_up + self.p4_downsample(p3_out))) 291 | 292 | # Connections for P5_0, P5_1 and P4_2 to P5_2 respectively 293 | p5_out = self.conv5_down( 294 | self.swish(p5_in + p5_up + self.p5_downsample(p4_out))) 295 | 296 | # Connections for P6_0, P6_1 and P5_2 to P6_2 respectively 297 | p6_out = self.conv6_down( 298 | self.swish(p6_in + p6_up + self.p6_downsample(p5_out))) 299 | 300 | # Connections for P7_0 and P6_2 to P7_2 301 | p7_out = self.conv7_down(self.swish(p7_in + self.p7_downsample(p6_out))) 302 | 303 | return p3_out, p4_out, p5_out, p6_out, p7_out 304 | 305 | 306 | class Regressor(nn.Module): 307 | """ 308 | modified by Zylo117 309 | """ 310 | 311 | def __init__(self, in_channels, num_anchors, num_layers, onnx_export=False): 312 | super(Regressor, self).__init__() 313 | self.num_layers = num_layers 314 | self.num_layers = num_layers 315 | 316 | self.conv_list = nn.ModuleList( 317 | [SeparableConvBlock(in_channels, in_channels, norm=False, activation=False) for i in range(num_layers)]) 318 | self.bn_list = nn.ModuleList( 319 | [nn.ModuleList([nn.BatchNorm2d(in_channels, momentum=0.01, eps=1e-3) for i in range(num_layers)]) for j in 320 | range(5)]) 321 | self.header = SeparableConvBlock(in_channels, num_anchors * 4, norm=False, activation=False) 322 | self.swish = MemoryEfficientSwish() if not onnx_export else Swish() 323 | 324 | def forward(self, inputs): 325 | feats = [] 326 | for feat, bn_list in zip(inputs, self.bn_list): 327 | for i, bn, conv in zip(range(self.num_layers), bn_list, self.conv_list): 328 | feat = conv(feat) 329 | feat = bn(feat) 330 | feat = self.swish(feat) 331 | feat = self.header(feat) 332 | 333 | feat = feat.permute(0, 2, 3, 1) 334 | feat = feat.contiguous().view(feat.shape[0], -1, 4) 335 | 336 | feats.append(feat) 337 | 338 | feats = torch.cat(feats, dim=1) 339 | 340 | return feats 341 | 342 | 343 | class Classifier(nn.Module): 344 | """ 345 | modified by Zylo117 346 | """ 347 | 348 | def __init__(self, in_channels, num_anchors, num_classes, num_layers, onnx_export=False): 349 | super(Classifier, self).__init__() 350 | self.num_anchors = num_anchors 351 | self.num_classes = num_classes 352 | self.num_layers = num_layers 353 | self.conv_list = nn.ModuleList( 354 | [SeparableConvBlock(in_channels, in_channels, norm=False, activation=False) for i in range(num_layers)]) 355 | self.bn_list = nn.ModuleList( 356 | [nn.ModuleList([nn.BatchNorm2d(in_channels, momentum=0.01, eps=1e-3) for i in range(num_layers)]) for j in 357 | range(5)]) 358 | self.header = SeparableConvBlock(in_channels, num_anchors * num_classes, norm=False, activation=False) 359 | self.swish = MemoryEfficientSwish() if not onnx_export else Swish() 360 | 361 | def forward(self, inputs): 362 | feats = [] 363 | for feat, bn_list in zip(inputs, self.bn_list): 364 | for i, bn, conv in zip(range(self.num_layers), bn_list, self.conv_list): 365 | feat = conv(feat) 366 | feat = 
bn(feat) 367 | feat = self.swish(feat) 368 | feat = self.header(feat) 369 | 370 | feat = feat.permute(0, 2, 3, 1) 371 | feat = feat.contiguous().view(feat.shape[0], feat.shape[1], feat.shape[2], self.num_anchors, 372 | self.num_classes) 373 | feat = feat.contiguous().view(feat.shape[0], -1, self.num_classes) 374 | 375 | feats.append(feat) 376 | 377 | feats = torch.cat(feats, dim=1) 378 | feats = feats.sigmoid() 379 | 380 | return feats 381 | 382 | 383 | class EfficientNet(nn.Module): 384 | """ 385 | modified by Zylo117 386 | """ 387 | 388 | def __init__(self, compound_coef, load_weights=False): 389 | super(EfficientNet, self).__init__() 390 | model = EffNet.from_pretrained('efficientnet-b{}'.format(compound_coef), load_weights) 391 | del model._conv_head 392 | del model._bn1 393 | del model._avg_pooling 394 | del model._dropout 395 | del model._fc 396 | self.model = model 397 | 398 | def forward(self, x): 399 | x = self.model._conv_stem(x) 400 | x = self.model._bn0(x) 401 | x = self.model._swish(x) 402 | feature_maps = [] 403 | 404 | # TODO: temporarily storing extra tensor last_x and del it later might not be a good idea, 405 | # try recording stride changing when creating efficientnet, 406 | # and then apply it here. 407 | last_x = None 408 | for idx, block in enumerate(self.model._blocks): 409 | drop_connect_rate = self.model._global_params.drop_connect_rate 410 | if drop_connect_rate: 411 | drop_connect_rate *= float(idx) / len(self.model._blocks) 412 | x = block(x, drop_connect_rate=drop_connect_rate) 413 | 414 | if block._depthwise_conv.stride == [2, 2]: 415 | feature_maps.append(last_x) 416 | elif idx == len(self.model._blocks) - 1: 417 | feature_maps.append(x) 418 | last_x = x 419 | del last_x 420 | return feature_maps[1:] 421 | 422 | 423 | if __name__ == '__main__': 424 | from tensorboardX import SummaryWriter 425 | 426 | 427 | def count_parameters(model): 428 | return sum(p.numel() for p in model.parameters() if p.requires_grad) 429 | -------------------------------------------------------------------------------- /efficientdet/utils.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | 6 | 7 | class BBoxTransform(nn.Module): 8 | def forward(self, anchors, regression): 9 | """ 10 | decode_box_outputs adapted from https://github.com/google/automl/blob/master/efficientdet/anchors.py 11 | 12 | Args: 13 | anchors: [batchsize, boxes, (y1, x1, y2, x2)] 14 | regression: [batchsize, boxes, (dy, dx, dh, dw)] 15 | 16 | Returns: 17 | 18 | """ 19 | y_centers_a = (anchors[..., 0] + anchors[..., 2]) / 2 20 | x_centers_a = (anchors[..., 1] + anchors[..., 3]) / 2 21 | ha = anchors[..., 2] - anchors[..., 0] 22 | wa = anchors[..., 3] - anchors[..., 1] 23 | 24 | w = regression[..., 3].exp() * wa 25 | h = regression[..., 2].exp() * ha 26 | 27 | y_centers = regression[..., 0] * ha + y_centers_a 28 | x_centers = regression[..., 1] * wa + x_centers_a 29 | 30 | ymin = y_centers - h / 2. 31 | xmin = x_centers - w / 2. 32 | ymax = y_centers + h / 2. 33 | xmax = x_centers + w / 2. 
34 | 35 | return torch.stack([xmin, ymin, xmax, ymax], dim=2) 36 | 37 | 38 | class ClipBoxes(nn.Module): 39 | 40 | def __init__(self): 41 | super(ClipBoxes, self).__init__() 42 | 43 | def forward(self, boxes, img): 44 | batch_size, num_channels, height, width = img.shape 45 | 46 | boxes[:, :, 0] = torch.clamp(boxes[:, :, 0], min=0) 47 | boxes[:, :, 1] = torch.clamp(boxes[:, :, 1], min=0) 48 | 49 | boxes[:, :, 2] = torch.clamp(boxes[:, :, 2], max=width - 1) 50 | boxes[:, :, 3] = torch.clamp(boxes[:, :, 3], max=height - 1) 51 | 52 | return boxes 53 | 54 | 55 | class Anchors(nn.Module): 56 | """ 57 | adapted and modified from https://github.com/google/automl/blob/master/efficientdet/anchors.py by Zylo117 58 | """ 59 | 60 | def __init__(self, anchor_scale=4., pyramid_levels=None, **kwargs): 61 | super().__init__() 62 | self.anchor_scale = anchor_scale 63 | 64 | if pyramid_levels is None: 65 | self.pyramid_levels = [3, 4, 5, 6, 7] 66 | 67 | self.strides = kwargs.get('strides', [2 ** x for x in self.pyramid_levels]) 68 | self.scales = np.array(kwargs.get('scales', [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])) 69 | self.ratios = kwargs.get('ratios', [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]) 70 | 71 | self.last_anchors = {} 72 | self.last_shape = None 73 | 74 | def forward(self, image, dtype=torch.float32): 75 | """Generates multiscale anchor boxes. 76 | 77 | Args: 78 | image_size: integer number of input image size. The input image has the 79 | same dimension for width and height. The image_size should be divided by 80 | the largest feature stride 2^max_level. 81 | anchor_scale: float number representing the scale of size of the base 82 | anchor to the feature stride 2^level. 83 | anchor_configs: a dictionary with keys as the levels of anchors and 84 | values as a list of anchor configuration. 85 | 86 | Returns: 87 | anchor_boxes: a numpy array with shape [N, 4], which stacks anchors on all 88 | feature levels. 89 | Raises: 90 | ValueError: input size must be the multiple of largest feature stride. 
91 | """ 92 | image_shape = image.shape[2:] 93 | 94 | if image_shape == self.last_shape and image.device in self.last_anchors: 95 | return self.last_anchors[image.device] 96 | 97 | if self.last_shape is None or self.last_shape != image_shape: 98 | self.last_shape = image_shape 99 | 100 | if dtype == torch.float16: 101 | dtype = np.float16 102 | else: 103 | dtype = np.float32 104 | 105 | boxes_all = [] 106 | for stride in self.strides: 107 | boxes_level = [] 108 | for scale, ratio in itertools.product(self.scales, self.ratios): 109 | if image_shape[1] % stride != 0: 110 | raise ValueError('input size must be divided by the stride.') 111 | base_anchor_size = self.anchor_scale * stride * scale 112 | anchor_size_x_2 = base_anchor_size * ratio[0] / 2.0 113 | anchor_size_y_2 = base_anchor_size * ratio[1] / 2.0 114 | 115 | x = np.arange(stride / 2, image_shape[1], stride) 116 | y = np.arange(stride / 2, image_shape[0], stride) 117 | xv, yv = np.meshgrid(x, y) 118 | xv = xv.reshape(-1) 119 | yv = yv.reshape(-1) 120 | 121 | # y1,x1,y2,x2 122 | boxes = np.vstack((yv - anchor_size_y_2, xv - anchor_size_x_2, 123 | yv + anchor_size_y_2, xv + anchor_size_x_2)) 124 | boxes = np.swapaxes(boxes, 0, 1) 125 | boxes_level.append(np.expand_dims(boxes, axis=1)) 126 | # concat anchors on the same level to the reshape NxAx4 127 | boxes_level = np.concatenate(boxes_level, axis=1) 128 | boxes_all.append(boxes_level.reshape([-1, 4])) 129 | 130 | anchor_boxes = np.vstack(boxes_all) 131 | 132 | anchor_boxes = torch.from_numpy(anchor_boxes.astype(dtype)).to(image.device) 133 | anchor_boxes = anchor_boxes.unsqueeze(0) 134 | 135 | # save it for later use to reduce overhead 136 | self.last_anchors[image.device] = anchor_boxes 137 | return anchor_boxes 138 | -------------------------------------------------------------------------------- /efficientdet_test.py: -------------------------------------------------------------------------------- 1 | # Author: Zylo117 2 | 3 | """ 4 | Simple Inference Script of EfficientDet-Pytorch 5 | """ 6 | import time 7 | 8 | import torch 9 | from torch.backends import cudnn 10 | 11 | from backbone import EfficientDetBackbone 12 | import cv2 13 | import numpy as np 14 | 15 | from efficientdet.utils import BBoxTransform, ClipBoxes 16 | from utils.utils import preprocess, invert_affine, postprocess 17 | 18 | compound_coef = 2 19 | force_input_size = None # set None to use default size 20 | img_path = "dataset/underwater/val/000008.jpg" 21 | 22 | # replace this part with your project's anchor config 23 | anchor_ratios = [(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)] 24 | anchor_scales = [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)] 25 | 26 | threshold = 0.2 27 | iou_threshold = 0.2 28 | 29 | use_cuda = True 30 | use_float16 = False 31 | cudnn.fastest = True 32 | cudnn.benchmark = True 33 | 34 | # obj_list = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 35 | # 'fire hydrant', '', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 36 | # 'cow', 'elephant', 'bear', 'zebra', 'giraffe', '', 'backpack', 'umbrella', '', '', 'handbag', 'tie', 37 | # 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 38 | # 'skateboard', 'surfboard', 'tennis racket', 'bottle', '', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 39 | # 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 40 | # 'cake', 'chair', 'couch', 'potted plant', 'bed', '', 'dining 
table', '', '', 'toilet', '', 'tv', 41 | # 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 42 | # 'refrigerator', '', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 43 | # 'toothbrush'] 44 | 45 | obj_list = ["holothurian","echinus","scallop","starfish"] 46 | 47 | # tf bilinear interpolation is different from any other's, just make do 48 | input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536] 49 | input_size = input_sizes[compound_coef] if force_input_size is None else force_input_size 50 | ori_imgs, framed_imgs, framed_metas = preprocess(img_path, max_size=input_size) 51 | 52 | if use_cuda: 53 | x = torch.stack([torch.from_numpy(fi).cuda() for fi in framed_imgs], 0) 54 | else: 55 | x = torch.stack([torch.from_numpy(fi) for fi in framed_imgs], 0) 56 | 57 | x = x.to(torch.float32 if not use_float16 else torch.float16).permute(0, 3, 1, 2) 58 | 59 | model = EfficientDetBackbone(compound_coef=compound_coef, num_classes=len(obj_list), 60 | ratios=anchor_ratios, scales=anchor_scales) 61 | model.load_state_dict(torch.load("./logs/underwater/efficientdet-d2_122_38106.pth")) # path to the trained model checkpoint 62 | model.requires_grad_(False) 63 | model.eval() 64 | 65 | if use_cuda: 66 | model = model.cuda() 67 | if use_float16: 68 | model = model.half() 69 | 70 | with torch.no_grad(): 71 | features, regression, classification, anchors = model(x) 72 | 73 | regressBoxes = BBoxTransform() 74 | clipBoxes = ClipBoxes() 75 | 76 | out = postprocess(x, 77 | anchors, regression, classification, 78 | regressBoxes, clipBoxes, 79 | threshold, iou_threshold) 80 | 81 | 82 | def display(preds, imgs, imshow=True, imwrite=False): 83 | for i in range(len(imgs)): 84 | if len(preds[i]['rois']) == 0: 85 | continue 86 | 87 | for j in range(len(preds[i]['rois'])): 88 | (x1, y1, x2, y2) = preds[i]['rois'][j].astype(int) 89 | cv2.rectangle(imgs[i], (x1, y1), (x2, y2), (255, 255, 0), 2) 90 | obj = obj_list[preds[i]['class_ids'][j]] 91 | score = float(preds[i]['scores'][j]) 92 | 93 | cv2.putText(imgs[i], '{}, {:.3f}'.format(obj, score), 94 | (x1, y1 + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, 95 | (255, 255, 0), 1) 96 | 97 | if imshow: 98 | cv2.imshow('img', imgs[i]) 99 | cv2.waitKey(0) 100 | 101 | if imwrite: 102 | cv2.imwrite('test/img_inferred_d{}_this_repo_{}.jpg'.format(compound_coef,i), imgs[i]) 103 | 104 | 105 | out = invert_affine(framed_metas, out) 106 | display(out, ori_imgs, imshow=False, imwrite=True) 107 | 108 | print('running speed test...') 109 | with torch.no_grad(): 110 | print('test1: model inferring and postprocessing') 111 | print('inferring image for 10 times...') 112 | t1 = time.time() 113 | for _ in range(10): 114 | _, regression, classification, anchors = model(x) 115 | 116 | out = postprocess(x, 117 | anchors, regression, classification, 118 | regressBoxes, clipBoxes, 119 | threshold, iou_threshold) 120 | out = invert_affine(framed_metas, out) 121 | 122 | t2 = time.time() 123 | tact_time = (t2 - t1) / 10 124 | print('{} seconds, {} FPS, @batch_size 1'.format(tact_time,1 / tact_time)) 125 | 126 | # uncomment this if you want an extreme fps test 127 | # print('test2: model inferring only') 128 | # print('inferring images for batch_size 32 for 10 times...') 129 | # t1 = time.time() 130 | # x = torch.cat([x] * 32, 0) 131 | # for _ in range(10): 132 | # _, regression, classification, anchors = model(x) 133 | # 134 | # t2 = time.time() 135 | # tact_time = (t2 - t1) / 10 136 | # print(f'{tact_time} seconds, {32 / tact_time} FPS, @batch_size 32') 137 |
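
For reuse outside a one-off script, the same flow can be wrapped in a couple of helpers. Below is a minimal sketch, assuming the repo layout above: the helper names (`build_detector`, `detect`) and the checkpoint path are placeholders, while the imports, call signatures, `input_sizes` table, anchor config and underwater class list mirror `efficientdet_test.py` exactly.

```python
# Minimal sketch: wrap the inference flow of efficientdet_test.py into reusable helpers.
# The checkpoint path and helper names are placeholders; imports and call signatures
# come from the script above.
import torch

from backbone import EfficientDetBackbone
from efficientdet.utils import BBoxTransform, ClipBoxes
from utils.utils import preprocess, invert_affine, postprocess

INPUT_SIZES = [512, 640, 768, 896, 1024, 1280, 1280, 1536]  # indexed by compound_coef


def build_detector(weights_path, obj_list, compound_coef=2, device='cpu'):
    """Create an EfficientDet-D{compound_coef} backbone and load trained weights."""
    model = EfficientDetBackbone(
        compound_coef=compound_coef,
        num_classes=len(obj_list),
        ratios=[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)],
        scales=[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)],
    )
    model.load_state_dict(torch.load(weights_path, map_location='cpu'))
    model.requires_grad_(False)
    return model.eval().to(device)


def detect(model, img_path, compound_coef=2, device='cpu', threshold=0.2, iou_threshold=0.2):
    """Run preprocess -> model -> postprocess -> invert_affine on a single image."""
    ori_imgs, framed_imgs, framed_metas = preprocess(img_path, max_size=INPUT_SIZES[compound_coef])
    x = torch.stack([torch.from_numpy(fi) for fi in framed_imgs], 0)
    x = x.to(device, dtype=torch.float32).permute(0, 3, 1, 2)  # NHWC -> NCHW

    with torch.no_grad():
        _, regression, classification, anchors = model(x)
        preds = postprocess(x, anchors, regression, classification,
                            BBoxTransform(), ClipBoxes(),
                            threshold, iou_threshold)

    # map boxes back to the original (un-padded, un-resized) image coordinates
    return invert_affine(framed_metas, preds), ori_imgs


if __name__ == '__main__':
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    classes = ['holothurian', 'echinus', 'scallop', 'starfish']
    det = build_detector('weights/underwater-d2.pth', classes, device=device)  # placeholder path
    preds, imgs = detect(det, 'pic/data/img_test1.jpg', device=device)
    for p in preds:
        print(p['rois'], p['class_ids'], p['scores'])
```

Keeping `threshold` and `iou_threshold` as arguments mirrors the script's score cut-off and NMS IoU (both 0.2 there); these values usually need retuning per dataset.
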
-------------------------------------------------------------------------------- /efficientnet/__init__.py: -------------------------------------------------------------------------------- 1 | __version__ = "0.6.1" 2 | from .model import EfficientNet 3 | from .utils import ( 4 | GlobalParams, 5 | BlockArgs, 6 | BlockDecoder, 7 | efficientnet, 8 | get_model_params, 9 | ) 10 | 11 | -------------------------------------------------------------------------------- /efficientnet/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | from torch.nn import functional as F 4 | 5 | from .utils import ( 6 | round_filters, 7 | round_repeats, 8 | drop_connect, 9 | get_same_padding_conv2d, 10 | get_model_params, 11 | efficientnet_params, 12 | load_pretrained_weights, 13 | Swish, 14 | MemoryEfficientSwish, 15 | ) 16 | 17 | class MBConvBlock(nn.Module): 18 | """ 19 | Mobile Inverted Residual Bottleneck Block 20 | 21 | Args: 22 | block_args (namedtuple): BlockArgs, see above 23 | global_params (namedtuple): GlobalParam, see above 24 | 25 | Attributes: 26 | has_se (bool): Whether the block contains a Squeeze and Excitation layer. 27 | """ 28 | 29 | def __init__(self, block_args, global_params): 30 | super().__init__() 31 | self._block_args = block_args 32 | self._bn_mom = 1 - global_params.batch_norm_momentum 33 | self._bn_eps = global_params.batch_norm_epsilon 34 | self.has_se = (self._block_args.se_ratio is not None) and (0 < self._block_args.se_ratio <= 1) 35 | self.id_skip = block_args.id_skip # skip connection and drop connect 36 | 37 | # Get static or dynamic convolution depending on image size 38 | Conv2d = get_same_padding_conv2d(image_size=global_params.image_size) 39 | 40 | # Expansion phase 41 | inp = self._block_args.input_filters # number of input channels 42 | oup = self._block_args.input_filters * self._block_args.expand_ratio # number of output channels 43 | if self._block_args.expand_ratio != 1: 44 | self._expand_conv = Conv2d(in_channels=inp, out_channels=oup, kernel_size=1, bias=False) 45 | self._bn0 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps) 46 | 47 | # Depthwise convolution phase 48 | k = self._block_args.kernel_size 49 | s = self._block_args.stride 50 | self._depthwise_conv = Conv2d( 51 | in_channels=oup, out_channels=oup, groups=oup, # groups makes it depthwise 52 | kernel_size=k, stride=s, bias=False) 53 | self._bn1 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps) 54 | 55 | # Squeeze and Excitation layer, if desired 56 | if self.has_se: 57 | num_squeezed_channels = max(1, int(self._block_args.input_filters * self._block_args.se_ratio)) 58 | self._se_reduce = Conv2d(in_channels=oup, out_channels=num_squeezed_channels, kernel_size=1) 59 | self._se_expand = Conv2d(in_channels=num_squeezed_channels, out_channels=oup, kernel_size=1) 60 | 61 | # Output phase 62 | final_oup = self._block_args.output_filters 63 | self._project_conv = Conv2d(in_channels=oup, out_channels=final_oup, kernel_size=1, bias=False) 64 | self._bn2 = nn.BatchNorm2d(num_features=final_oup, momentum=self._bn_mom, eps=self._bn_eps) 65 | self._swish = MemoryEfficientSwish() 66 | 67 | def forward(self, inputs, drop_connect_rate=None): 68 | """ 69 | :param inputs: input tensor 70 | :param drop_connect_rate: drop connect rate (float, between 0 and 1) 71 | :return: output of block 72 | """ 73 | 74 | # Expansion and Depthwise Convolution 75 | x = inputs 76 | if 
self._block_args.expand_ratio != 1: 77 | x = self._expand_conv(inputs) 78 | x = self._bn0(x) 79 | x = self._swish(x) 80 | 81 | x = self._depthwise_conv(x) 82 | x = self._bn1(x) 83 | x = self._swish(x) 84 | 85 | # Squeeze and Excitation 86 | if self.has_se: 87 | x_squeezed = F.adaptive_avg_pool2d(x, 1) 88 | x_squeezed = self._se_reduce(x_squeezed) 89 | x_squeezed = self._swish(x_squeezed) 90 | x_squeezed = self._se_expand(x_squeezed) 91 | x = torch.sigmoid(x_squeezed) * x 92 | 93 | x = self._project_conv(x) 94 | x = self._bn2(x) 95 | 96 | # Skip connection and drop connect 97 | input_filters, output_filters = self._block_args.input_filters, self._block_args.output_filters 98 | if self.id_skip and self._block_args.stride == 1 and input_filters == output_filters: 99 | if drop_connect_rate: 100 | x = drop_connect(x, p=drop_connect_rate, training=self.training) 101 | x = x + inputs # skip connection 102 | return x 103 | 104 | def set_swish(self, memory_efficient=True): 105 | """Sets swish function as memory efficient (for training) or standard (for export)""" 106 | self._swish = MemoryEfficientSwish() if memory_efficient else Swish() 107 | 108 | 109 | class EfficientNet(nn.Module): 110 | """ 111 | An EfficientNet model. Most easily loaded with the .from_name or .from_pretrained methods 112 | 113 | Args: 114 | blocks_args (list): A list of BlockArgs to construct blocks 115 | global_params (namedtuple): A set of GlobalParams shared between blocks 116 | 117 | Example: 118 | model = EfficientNet.from_pretrained('efficientnet-b0') 119 | 120 | """ 121 | 122 | def __init__(self, blocks_args=None, global_params=None): 123 | super().__init__() 124 | assert isinstance(blocks_args, list), 'blocks_args should be a list' 125 | assert len(blocks_args) > 0, 'block args must be greater than 0' 126 | self._global_params = global_params 127 | self._blocks_args = blocks_args 128 | 129 | # Get static or dynamic convolution depending on image size 130 | Conv2d = get_same_padding_conv2d(image_size=global_params.image_size) 131 | 132 | # Batch norm parameters 133 | bn_mom = 1 - self._global_params.batch_norm_momentum 134 | bn_eps = self._global_params.batch_norm_epsilon 135 | 136 | # Stem 137 | in_channels = 3 # rgb 138 | out_channels = round_filters(32, self._global_params) # number of output channels 139 | self._conv_stem = Conv2d(in_channels, out_channels, kernel_size=3, stride=2, bias=False) 140 | self._bn0 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps) 141 | 142 | # Build blocks 143 | self._blocks = nn.ModuleList([]) 144 | for block_args in self._blocks_args: 145 | 146 | # Update block input and output filters based on depth multiplier. 147 | block_args = block_args._replace( 148 | input_filters=round_filters(block_args.input_filters, self._global_params), 149 | output_filters=round_filters(block_args.output_filters, self._global_params), 150 | num_repeat=round_repeats(block_args.num_repeat, self._global_params) 151 | ) 152 | 153 | # The first block needs to take care of stride and filter size increase. 
154 | self._blocks.append(MBConvBlock(block_args, self._global_params)) 155 | if block_args.num_repeat > 1: 156 | block_args = block_args._replace(input_filters=block_args.output_filters, stride=1) 157 | for _ in range(block_args.num_repeat - 1): 158 | self._blocks.append(MBConvBlock(block_args, self._global_params)) 159 | 160 | # Head 161 | in_channels = block_args.output_filters # output of final block 162 | out_channels = round_filters(1280, self._global_params) 163 | self._conv_head = Conv2d(in_channels, out_channels, kernel_size=1, bias=False) 164 | self._bn1 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps) 165 | 166 | # Final linear layer 167 | self._avg_pooling = nn.AdaptiveAvgPool2d(1) 168 | self._dropout = nn.Dropout(self._global_params.dropout_rate) 169 | self._fc = nn.Linear(out_channels, self._global_params.num_classes) 170 | self._swish = MemoryEfficientSwish() 171 | 172 | def set_swish(self, memory_efficient=True): 173 | """Sets swish function as memory efficient (for training) or standard (for export)""" 174 | self._swish = MemoryEfficientSwish() if memory_efficient else Swish() 175 | for block in self._blocks: 176 | block.set_swish(memory_efficient) 177 | 178 | 179 | def extract_features(self, inputs): 180 | """ Returns output of the final convolution layer """ 181 | 182 | # Stem 183 | x = self._swish(self._bn0(self._conv_stem(inputs))) 184 | 185 | # Blocks 186 | for idx, block in enumerate(self._blocks): 187 | drop_connect_rate = self._global_params.drop_connect_rate 188 | if drop_connect_rate: 189 | drop_connect_rate *= float(idx) / len(self._blocks) 190 | x = block(x, drop_connect_rate=drop_connect_rate) 191 | # Head 192 | x = self._swish(self._bn1(self._conv_head(x))) 193 | 194 | return x 195 | 196 | def forward(self, inputs): 197 | """ Calls extract_features to extract features, applies final linear layer, and returns logits. """ 198 | bs = inputs.size(0) 199 | # Convolution layers 200 | x = self.extract_features(inputs) 201 | 202 | # Pooling and final linear layer 203 | x = self._avg_pooling(x) 204 | x = x.view(bs, -1) 205 | x = self._dropout(x) 206 | x = self._fc(x) 207 | return x 208 | 209 | @classmethod 210 | def from_name(cls, model_name, override_params=None): 211 | cls._check_model_name_is_valid(model_name) 212 | blocks_args, global_params = get_model_params(model_name, override_params) 213 | return cls(blocks_args, global_params) 214 | 215 | @classmethod 216 | def from_pretrained(cls, model_name, load_weights=True, advprop=True, num_classes=1000, in_channels=3): 217 | model = cls.from_name(model_name, override_params={'num_classes': num_classes}) 218 | if load_weights: 219 | load_pretrained_weights(model, model_name, load_fc=(num_classes == 1000), advprop=advprop) 220 | if in_channels != 3: 221 | Conv2d = get_same_padding_conv2d(image_size = model._global_params.image_size) 222 | out_channels = round_filters(32, model._global_params) 223 | model._conv_stem = Conv2d(in_channels, out_channels, kernel_size=3, stride=2, bias=False) 224 | return model 225 | 226 | @classmethod 227 | def get_image_size(cls, model_name): 228 | cls._check_model_name_is_valid(model_name) 229 | _, _, res, _ = efficientnet_params(model_name) 230 | return res 231 | 232 | @classmethod 233 | def _check_model_name_is_valid(cls, model_name): 234 | """ Validates model name. 
""" 235 | valid_models = ['efficientnet-b'+str(i) for i in range(9)] 236 | if model_name not in valid_models: 237 | raise ValueError('model_name should be one of: ' + ', '.join(valid_models)) 238 | -------------------------------------------------------------------------------- /efficientnet/utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | This file contains helper functions for building the model and for loading model parameters. 3 | These helper functions are built to mirror those in the official TensorFlow implementation. 4 | """ 5 | 6 | import re 7 | import math 8 | import collections 9 | from functools import partial 10 | import torch 11 | from torch import nn 12 | from torch.nn import functional as F 13 | from torch.utils import model_zoo 14 | from .utils_extra import Conv2dStaticSamePadding 15 | 16 | ######################################################################## 17 | ############### HELPERS FUNCTIONS FOR MODEL ARCHITECTURE ############### 18 | ######################################################################## 19 | 20 | 21 | # Parameters for the entire model (stem, all blocks, and head) 22 | 23 | GlobalParams = collections.namedtuple('GlobalParams', [ 24 | 'batch_norm_momentum', 'batch_norm_epsilon', 'dropout_rate', 25 | 'num_classes', 'width_coefficient', 'depth_coefficient', 26 | 'depth_divisor', 'min_depth', 'drop_connect_rate', 'image_size']) 27 | 28 | # Parameters for an individual model block 29 | BlockArgs = collections.namedtuple('BlockArgs', [ 30 | 'kernel_size', 'num_repeat', 'input_filters', 'output_filters', 31 | 'expand_ratio', 'id_skip', 'stride', 'se_ratio']) 32 | 33 | # Change namedtuple defaults 34 | GlobalParams.__new__.__defaults__ = (None,) * len(GlobalParams._fields) 35 | BlockArgs.__new__.__defaults__ = (None,) * len(BlockArgs._fields) 36 | 37 | 38 | class SwishImplementation(torch.autograd.Function): 39 | @staticmethod 40 | def forward(ctx, i): 41 | result = i * torch.sigmoid(i) 42 | ctx.save_for_backward(i) 43 | return result 44 | 45 | @staticmethod 46 | def backward(ctx, grad_output): 47 | i = ctx.saved_variables[0] 48 | sigmoid_i = torch.sigmoid(i) 49 | return grad_output * (sigmoid_i * (1 + i * (1 - sigmoid_i))) 50 | 51 | 52 | class MemoryEfficientSwish(nn.Module): 53 | def forward(self, x): 54 | return SwishImplementation.apply(x) 55 | 56 | 57 | class Swish(nn.Module): 58 | def forward(self, x): 59 | return x * torch.sigmoid(x) 60 | 61 | 62 | def round_filters(filters, global_params): 63 | """ Calculate and round number of filters based on depth multiplier. """ 64 | multiplier = global_params.width_coefficient 65 | if not multiplier: 66 | return filters 67 | divisor = global_params.depth_divisor 68 | min_depth = global_params.min_depth 69 | filters *= multiplier 70 | min_depth = min_depth or divisor 71 | new_filters = max(min_depth, int(filters + divisor / 2) // divisor * divisor) 72 | if new_filters < 0.9 * filters: # prevent rounding by more than 10% 73 | new_filters += divisor 74 | return int(new_filters) 75 | 76 | 77 | def round_repeats(repeats, global_params): 78 | """ Round number of filters based on depth multiplier. """ 79 | multiplier = global_params.depth_coefficient 80 | if not multiplier: 81 | return repeats 82 | return int(math.ceil(multiplier * repeats)) 83 | 84 | 85 | def drop_connect(inputs, p, training): 86 | """ Drop connect. 
""" 87 | if not training: return inputs 88 | batch_size = inputs.shape[0] 89 | keep_prob = 1 - p 90 | random_tensor = keep_prob 91 | random_tensor += torch.rand([batch_size, 1, 1, 1], dtype=inputs.dtype, device=inputs.device) 92 | binary_tensor = torch.floor(random_tensor) 93 | output = inputs / keep_prob * binary_tensor 94 | return output 95 | 96 | 97 | def get_same_padding_conv2d(image_size=None): 98 | """ Chooses static padding if you have specified an image size, and dynamic padding otherwise. 99 | Static padding is necessary for ONNX exporting of models. """ 100 | if image_size is None: 101 | return Conv2dDynamicSamePadding 102 | else: 103 | return partial(Conv2dStaticSamePadding, image_size=image_size) 104 | 105 | 106 | class Conv2dDynamicSamePadding(nn.Conv2d): 107 | """ 2D Convolutions like TensorFlow, for a dynamic image size """ 108 | 109 | def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1, groups=1, bias=True): 110 | super().__init__(in_channels, out_channels, kernel_size, stride, 0, dilation, groups, bias) 111 | self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]] * 2 112 | 113 | def forward(self, x): 114 | ih, iw = x.size()[-2:] 115 | kh, kw = self.weight.size()[-2:] 116 | sh, sw = self.stride 117 | oh, ow = math.ceil(ih / sh), math.ceil(iw / sw) 118 | pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0) 119 | pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0) 120 | if pad_h > 0 or pad_w > 0: 121 | x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2]) 122 | return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups) 123 | 124 | 125 | class Identity(nn.Module): 126 | def __init__(self, ): 127 | super(Identity, self).__init__() 128 | 129 | def forward(self, input): 130 | return input 131 | 132 | 133 | ######################################################################## 134 | ############## HELPERS FUNCTIONS FOR LOADING MODEL PARAMS ############## 135 | ######################################################################## 136 | 137 | 138 | def efficientnet_params(model_name): 139 | """ Map EfficientNet model name to parameter coefficients. """ 140 | params_dict = { 141 | # Coefficients: width,depth,res,dropout 142 | 'efficientnet-b0': (1.0, 1.0, 224, 0.2), 143 | 'efficientnet-b1': (1.0, 1.1, 240, 0.2), 144 | 'efficientnet-b2': (1.1, 1.2, 260, 0.3), 145 | 'efficientnet-b3': (1.2, 1.4, 300, 0.3), 146 | 'efficientnet-b4': (1.4, 1.8, 380, 0.4), 147 | 'efficientnet-b5': (1.6, 2.2, 456, 0.4), 148 | 'efficientnet-b6': (1.8, 2.6, 528, 0.5), 149 | 'efficientnet-b7': (2.0, 3.1, 600, 0.5), 150 | 'efficientnet-b8': (2.2, 3.6, 672, 0.5), 151 | 'efficientnet-l2': (4.3, 5.3, 800, 0.5), 152 | } 153 | return params_dict[model_name] 154 | 155 | 156 | class BlockDecoder(object): 157 | """ Block Decoder for readability, straight from the official TensorFlow repository """ 158 | 159 | @staticmethod 160 | def _decode_block_string(block_string): 161 | """ Gets a block through a string notation of arguments. 
""" 162 | assert isinstance(block_string, str) 163 | 164 | ops = block_string.split('_') 165 | options = {} 166 | for op in ops: 167 | splits = re.split(r'(\d.*)', op) 168 | if len(splits) >= 2: 169 | key, value = splits[:2] 170 | options[key] = value 171 | 172 | # Check stride 173 | assert (('s' in options and len(options['s']) == 1) or 174 | (len(options['s']) == 2 and options['s'][0] == options['s'][1])) 175 | 176 | return BlockArgs( 177 | kernel_size=int(options['k']), 178 | num_repeat=int(options['r']), 179 | input_filters=int(options['i']), 180 | output_filters=int(options['o']), 181 | expand_ratio=int(options['e']), 182 | id_skip=('noskip' not in block_string), 183 | se_ratio=float(options['se']) if 'se' in options else None, 184 | stride=[int(options['s'][0])]) 185 | 186 | @staticmethod 187 | def _encode_block_string(block): 188 | """Encodes a block to a string.""" 189 | args = [ 190 | 'r%d' % block.num_repeat, 191 | 'k%d' % block.kernel_size, 192 | 's%d%d' % (block.strides[0], block.strides[1]), 193 | 'e%s' % block.expand_ratio, 194 | 'i%d' % block.input_filters, 195 | 'o%d' % block.output_filters 196 | ] 197 | if 0 < block.se_ratio <= 1: 198 | args.append('se%s' % block.se_ratio) 199 | if block.id_skip is False: 200 | args.append('noskip') 201 | return '_'.join(args) 202 | 203 | @staticmethod 204 | def decode(string_list): 205 | """ 206 | Decodes a list of string notations to specify blocks inside the network. 207 | 208 | :param string_list: a list of strings, each string is a notation of block 209 | :return: a list of BlockArgs namedtuples of block args 210 | """ 211 | assert isinstance(string_list, list) 212 | blocks_args = [] 213 | for block_string in string_list: 214 | blocks_args.append(BlockDecoder._decode_block_string(block_string)) 215 | return blocks_args 216 | 217 | @staticmethod 218 | def encode(blocks_args): 219 | """ 220 | Encodes a list of BlockArgs to a list of strings. 221 | 222 | :param blocks_args: a list of BlockArgs namedtuples of block args 223 | :return: a list of strings, each string is a notation of block 224 | """ 225 | block_strings = [] 226 | for block in blocks_args: 227 | block_strings.append(BlockDecoder._encode_block_string(block)) 228 | return block_strings 229 | 230 | 231 | def efficientnet(width_coefficient=None, depth_coefficient=None, dropout_rate=0.2, 232 | drop_connect_rate=0.2, image_size=None, num_classes=1000): 233 | """ Creates a efficientnet model. 
""" 234 | 235 | blocks_args = [ 236 | 'r1_k3_s11_e1_i32_o16_se0.25', 'r2_k3_s22_e6_i16_o24_se0.25', 237 | 'r2_k5_s22_e6_i24_o40_se0.25', 'r3_k3_s22_e6_i40_o80_se0.25', 238 | 'r3_k5_s11_e6_i80_o112_se0.25', 'r4_k5_s22_e6_i112_o192_se0.25', 239 | 'r1_k3_s11_e6_i192_o320_se0.25', 240 | ] 241 | blocks_args = BlockDecoder.decode(blocks_args) 242 | 243 | global_params = GlobalParams( 244 | batch_norm_momentum=0.99, 245 | batch_norm_epsilon=1e-3, 246 | dropout_rate=dropout_rate, 247 | drop_connect_rate=drop_connect_rate, 248 | # data_format='channels_last', # removed, this is always true in PyTorch 249 | num_classes=num_classes, 250 | width_coefficient=width_coefficient, 251 | depth_coefficient=depth_coefficient, 252 | depth_divisor=8, 253 | min_depth=None, 254 | image_size=image_size, 255 | ) 256 | 257 | return blocks_args, global_params 258 | 259 | 260 | def get_model_params(model_name, override_params): 261 | """ Get the block args and global params for a given model """ 262 | if model_name.startswith('efficientnet'): 263 | w, d, s, p = efficientnet_params(model_name) 264 | # note: all models have drop connect rate = 0.2 265 | blocks_args, global_params = efficientnet( 266 | width_coefficient=w, depth_coefficient=d, dropout_rate=p, image_size=s) 267 | else: 268 | raise NotImplementedError('model name is not pre-defined: %s' % model_name) 269 | if override_params: 270 | # ValueError will be raised here if override_params has fields not included in global_params. 271 | global_params = global_params._replace(**override_params) 272 | return blocks_args, global_params 273 | 274 | 275 | url_map = { 276 | 'efficientnet-b0': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b0-355c32eb.pth', 277 | 'efficientnet-b1': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b1-f1951068.pth', 278 | 'efficientnet-b2': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b2-8bb594d6.pth', 279 | 'efficientnet-b3': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b3-5fb5a3c3.pth', 280 | 'efficientnet-b4': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b4-6ed6700e.pth', 281 | 'efficientnet-b5': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b5-b6417697.pth', 282 | 'efficientnet-b6': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b6-c76e70fd.pth', 283 | 'efficientnet-b7': 'https://publicmodels.blob.core.windows.net/container/aa/efficientnet-b7-dcc49843.pth', 284 | } 285 | 286 | url_map_advprop = { 287 | 'efficientnet-b0': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b0-b64d5a18.pth', 288 | 'efficientnet-b1': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b1-0f3ce85a.pth', 289 | 'efficientnet-b2': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b2-6e9d97e5.pth', 290 | 'efficientnet-b3': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b3-cdd7c0f4.pth', 291 | 'efficientnet-b4': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b4-44fb3a87.pth', 292 | 'efficientnet-b5': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b5-86493f6b.pth', 293 | 'efficientnet-b6': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b6-ac80338e.pth', 294 | 'efficientnet-b7': 'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b7-4652b6dd.pth', 295 | 'efficientnet-b8': 
'https://publicmodels.blob.core.windows.net/container/advprop/efficientnet-b8-22a8fe65.pth', 296 | } 297 | 298 | 299 | def load_pretrained_weights(model, model_name, load_fc=True, advprop=False): 300 | """ Loads pretrained weights, and downloads if loading for the first time. """ 301 | # AutoAugment or Advprop (different preprocessing) 302 | url_map_ = url_map_advprop if advprop else url_map 303 | state_dict = model_zoo.load_url(url_map_[model_name], map_location=torch.device('cpu')) 304 | # state_dict = torch.load('../../weights/backbone_efficientnetb0.pth') 305 | if load_fc: 306 | ret = model.load_state_dict(state_dict, strict=False) 307 | print(ret) 308 | else: 309 | state_dict.pop('_fc.weight') 310 | state_dict.pop('_fc.bias') 311 | res = model.load_state_dict(state_dict, strict=False) 312 | assert set(res.missing_keys) == set(['_fc.weight', '_fc.bias']), 'issue loading pretrained weights' 313 | print('Loaded pretrained weights for {}'.format(model_name)) 314 | -------------------------------------------------------------------------------- /efficientnet/utils_extra.py: -------------------------------------------------------------------------------- 1 | # Author: Zylo117 2 | 3 | import math 4 | 5 | from torch import nn 6 | import torch.nn.functional as F 7 | 8 | 9 | class Conv2dStaticSamePadding(nn.Module): 10 | """ 11 | created by Zylo117 12 | The real keras/tensorflow conv2d with same padding 13 | """ 14 | 15 | def __init__(self, in_channels, out_channels, kernel_size, stride=1, bias=True, groups=1, dilation=1, **kwargs): 16 | super().__init__() 17 | self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, 18 | bias=bias, groups=groups) 19 | self.stride = self.conv.stride 20 | self.kernel_size = self.conv.kernel_size 21 | self.dilation = self.conv.dilation 22 | 23 | if isinstance(self.stride, int): 24 | self.stride = [self.stride] * 2 25 | elif len(self.stride) == 1: 26 | self.stride = [self.stride[0]] * 2 27 | 28 | if isinstance(self.kernel_size, int): 29 | self.kernel_size = [self.kernel_size] * 2 30 | elif len(self.kernel_size) == 1: 31 | self.kernel_size = [self.kernel_size[0]] * 2 32 | 33 | def forward(self, x): 34 | h, w = x.shape[-2:] 35 | 36 | h_step = math.ceil(w / self.stride[1]) 37 | v_step = math.ceil(h / self.stride[0]) 38 | h_cover_len = self.stride[1] * (h_step - 1) + 1 + (self.kernel_size[1] - 1) 39 | v_cover_len = self.stride[0] * (v_step - 1) + 1 + (self.kernel_size[0] - 1) 40 | 41 | extra_h = h_cover_len - w 42 | extra_v = v_cover_len - h 43 | 44 | left = extra_h // 2 45 | right = extra_h - left 46 | top = extra_v // 2 47 | bottom = extra_v - top 48 | 49 | x = F.pad(x, [left, right, top, bottom]) 50 | 51 | x = self.conv(x) 52 | return x 53 | 54 | 55 | class MaxPool2dStaticSamePadding(nn.Module): 56 | """ 57 | created by Zylo117 58 | The real keras/tensorflow MaxPool2d with same padding 59 | """ 60 | 61 | def __init__(self, *args, **kwargs): 62 | super().__init__() 63 | self.pool = nn.MaxPool2d(*args, **kwargs) 64 | self.stride = self.pool.stride 65 | self.kernel_size = self.pool.kernel_size 66 | 67 | if isinstance(self.stride, int): 68 | self.stride = [self.stride] * 2 69 | elif len(self.stride) == 1: 70 | self.stride = [self.stride[0]] * 2 71 | 72 | if isinstance(self.kernel_size, int): 73 | self.kernel_size = [self.kernel_size] * 2 74 | elif len(self.kernel_size) == 1: 75 | self.kernel_size = [self.kernel_size[0]] * 2 76 | 77 | def forward(self, x): 78 | h, w = x.shape[-2:] 79 | 80 | h_step = math.ceil(w / self.stride[1]) 81 | v_step = 
math.ceil(h / self.stride[0]) 82 | h_cover_len = self.stride[1] * (h_step - 1) + 1 + (self.kernel_size[1] - 1) 83 | v_cover_len = self.stride[0] * (v_step - 1) + 1 + (self.kernel_size[0] - 1) 84 | 85 | extra_h = h_cover_len - w 86 | extra_v = v_cover_len - h 87 | 88 | left = extra_h // 2 89 | right = extra_h - left 90 | top = extra_v // 2 91 | bottom = extra_v - top 92 | 93 | x = F.pad(x, [left, right, top, bottom]) 94 | 95 | x = self.pool(x) 96 | return x 97 | -------------------------------------------------------------------------------- /pic/data/img_test1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/img_test1.jpg -------------------------------------------------------------------------------- /pic/data/img_test2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/img_test2.jpg -------------------------------------------------------------------------------- /pic/data/p0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p0.png -------------------------------------------------------------------------------- /pic/data/p1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p1.png -------------------------------------------------------------------------------- /pic/data/p2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p2.png -------------------------------------------------------------------------------- /pic/data/p3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p3.png -------------------------------------------------------------------------------- /pic/data/p4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p4.png -------------------------------------------------------------------------------- /pic/data/p5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/data/p5.png -------------------------------------------------------------------------------- /pic/p0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p0.png -------------------------------------------------------------------------------- /pic/p1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p1.png 
-------------------------------------------------------------------------------- /pic/p10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p10.png -------------------------------------------------------------------------------- /pic/p11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p11.png -------------------------------------------------------------------------------- /pic/p12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p12.png -------------------------------------------------------------------------------- /pic/p13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p13.png -------------------------------------------------------------------------------- /pic/p14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p14.png -------------------------------------------------------------------------------- /pic/p15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p15.png -------------------------------------------------------------------------------- /pic/p16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p16.png -------------------------------------------------------------------------------- /pic/p17.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p17.png -------------------------------------------------------------------------------- /pic/p18.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p18.png -------------------------------------------------------------------------------- /pic/p19.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p19.png -------------------------------------------------------------------------------- /pic/p2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p2.png -------------------------------------------------------------------------------- /pic/p20.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p20.png 
-------------------------------------------------------------------------------- /pic/p21.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p21.png -------------------------------------------------------------------------------- /pic/p3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p3.png -------------------------------------------------------------------------------- /pic/p4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p4.png -------------------------------------------------------------------------------- /pic/p5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p5.png -------------------------------------------------------------------------------- /pic/p6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p6.png -------------------------------------------------------------------------------- /pic/p7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p7.png -------------------------------------------------------------------------------- /pic/p8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p8.png -------------------------------------------------------------------------------- /pic/p9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/pic/p9.png -------------------------------------------------------------------------------- /projects/coco.yml: -------------------------------------------------------------------------------- 1 | project_name: coco # also the folder name of the dataset that under data_path folder 2 | train_set: train2017 3 | val_set: val2017 4 | num_gpus: 4 5 | 6 | # mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco. 7 | mean: [0.485, 0.456, 0.406] 8 | std: [0.229, 0.224, 0.225] 9 | 10 | # this is coco anchors, change it if necessary 11 | anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]' 12 | anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]' 13 | 14 | # must match your dataset's category_id. 
15 | # category_id is one_indexed, 16 | # for example, index of 'car' here is 2, while category_id of is 3 17 | obj_list: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 18 | 'fire hydrant', '', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 19 | 'cow', 'elephant', 'bear', 'zebra', 'giraffe', '', 'backpack', 'umbrella', '', '', 'handbag', 'tie', 20 | 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 21 | 'skateboard', 'surfboard', 'tennis racket', 'bottle', '', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 22 | 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 23 | 'cake', 'chair', 'couch', 'potted plant', 'bed', '', 'dining table', '', '', 'toilet', '', 'tv', 24 | 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 25 | 'refrigerator', '', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 26 | 'toothbrush'] -------------------------------------------------------------------------------- /projects/shape.yml: -------------------------------------------------------------------------------- 1 | project_name: shape # also the folder name of the dataset that under data_path folder 2 | train_set: train 3 | val_set: val 4 | num_gpus: 1 5 | 6 | # mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco. 7 | mean: [0.485, 0.456, 0.406] 8 | std: [0.229, 0.224, 0.225] 9 | 10 | # this anchor is adapted to the dataset 11 | anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]' 12 | anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]' 13 | 14 | obj_list: ['rectangle', 'circle'] -------------------------------------------------------------------------------- /projects/underwater.yml: -------------------------------------------------------------------------------- 1 | project_name: underwater # also the folder name of the dataset that under data_path folder 2 | train_set: train 3 | val_set: val 4 | num_gpus: 1 5 | 6 | # mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco. 7 | mean: [0.485, 0.456, 0.406] 8 | std: [0.229, 0.224, 0.225] 9 | 10 | # this is coco anchors, change it if necessary 11 | anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]' 12 | anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]' 13 | 14 | # must match your dataset's category_id. 15 | # category_id is one_indexed, 16 | # for example, index of 'car' here is 2, while category_id of is 3 17 | obj_list: ["holothurian","echinus","scallop","starfish"] 18 | -------------------------------------------------------------------------------- /readme_efficientdet.md: -------------------------------------------------------------------------------- 1 | # Yet Another EfficientDet Pytorch 2 | 3 | The pytorch re-implement of the official [EfficientDet](https://github.com/google/automl/efficientdet) with SOTA performance in real time, original paper link: https://arxiv.org/abs/1911.09070 4 | 5 | 6 | # Performance 7 | 8 | ## Pretrained weights and benchmark 9 | 10 | The performance is very close to the paper's, it is still SOTA. 11 | 12 | The speed/FPS test includes the time of post-processing with no jit/data precision trick. 
13 | 14 | | coefficient | pth_download | GPU Mem(MB) | FPS | Extreme FPS (Batchsize 32) | mAP 0.5:0.95(this repo) | mAP 0.5:0.95(paper) | 15 | | :-----: | :-----: | :------: | :------: | :------: | :-----: | :-----: | 16 | | D0 | [efficientdet-d0.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d0.pth) | 1049 | 36.20 | 163.14 | 32.6 | 33.8 17 | | D1 | [efficientdet-d1.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d1.pth) | 1159 | 29.69 | 63.08 | 38.2 | 39.6 18 | | D2 | [efficientdet-d2.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d2.pth) | 1321 | 26.50 | 40.99 | 41.5 | 43.0 19 | | D3 | [efficientdet-d3.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d3.pth) | 1647 | 22.73 | - | 44.9 | 45.8 20 | | D4 | [efficientdet-d4.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d4.pth) | 1903 | 14.75 | - | 48.1 | 49.4 21 | | D5 | [efficientdet-d5.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d5.pth) | 2255 | 7.11 | - | 49.5 | 50.7 22 | | D6 | [efficientdet-d6.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d6.pth) | 2985 | 5.30 | - | 50.1 | 51.7 23 | | D7 | [efficientdet-d7.pth](https://github.com/zylo117/Yet-Another-Efficient-Pytorch/releases/download/1.0/efficientdet-d7.pth) | 3819 | 3.73 | - | 50.7 | 52.2 24 | 25 | ## Speed Test 26 | 27 | This pure-pytorch implement is 26 times faster than the official Tensorflow version without any trick. 28 | 29 | | coefficient | Time | FPS | Ratio | 30 | | :------: | :------: | :------: | :-----: | 31 | | Official D0 (tf postprocess) | 0.713s | 1.40 | 1X | 32 | | Official D0 (numpy postprocess) | 0.477s | 2.09 | 1.49X | 33 | | **_Yet-Another-EfficientDet-D0_** | **_0.028s_** | **_36.20_** | **_25.86X_** | 34 | 35 | 36 | Test method: 37 | 38 | Run this test on 2080Ti, Ubuntu 19.10 x64. 39 | 1. Prepare two image tensor with the same content, size (1,3,512,512)-pytorch, (1,512,512,3)-tensorflow. 40 | 2. Initiate everything by inferring once. 41 | 3. Run 10 times with batchsize 1 and calculate the average time, including post-processing and visualization, to make the test more practical. 42 | 43 | ___ 44 | # Update log 45 | 46 | [2020-04-14] fixed loss function bug. please pull the latest code. 47 | 48 | [2020-04-14] for those who needs help or can't get a good result after several epochs, check out this [tutorial](tutorial/train_shape.ipynb). You can run it on colab with GPU support. 49 | 50 | [2020-04-10] warp the loss function within the training model, so that the memory usage will be balanced when training with multiple gpus, enabling training with bigger batchsize. 51 | 52 | [2020-04-10] add D7 (D6 with larger input size and larger anchor scale) support and test its mAP 53 | 54 | [2020-04-09] allow custom anchor scales and ratios 55 | 56 | [2020-04-08] add D6 support and test its mAP 57 | 58 | [2020-04-08] add training script and its doc; update eval script and simple inference script. 59 | 60 | [2020-04-07] tested D0-D5 mAP, result seems nice, details can be found [here](benchmark/coco_eval_result) 61 | 62 | [2020-04-07] fix anchors strategies. 63 | 64 | [2020-04-06] adapt anchor strategies. 65 | 66 | [2020-04-05] create this repository. 
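The speed test described above boils down to a short timing loop: build the detector, run one warm-up inference, then average ten batch-size-1 forward passes. A minimal sketch of that loop is shown below (forward pass only; the FPS numbers in the tables above also include post-processing). The model construction mirrors train.py, the anchor ratios/scales are the COCO defaults from projects/coco.yml, and num_classes=90 assumes the 90-entry COCO obj_list -- treat this as an illustration, not the exact benchmark script.

    # minimal timing sketch, not the exact benchmark script used for the tables above
    import time
    import torch
    from backbone import EfficientDetBackbone

    model = EfficientDetBackbone(num_classes=90, compound_coef=0,
                                 ratios=[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)],
                                 scales=[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])
    model.requires_grad_(False)
    model = model.eval().cuda()

    x = torch.randn(1, 3, 512, 512).cuda()  # D0 input resolution

    with torch.no_grad():
        _ = model(x)                         # warm-up / initialization run
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(10):
            _ = model(x)
        torch.cuda.synchronize()
    print('average forward time: {:.4f}s'.format((time.time() - start) / 10))
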
67 | 68 | # Demo 69 | 70 | # install requirements 71 | pip install pycocotools numpy opencv-python tqdm tensorboard tensorboardX pyyaml 72 | pip install torch==1.4.0 73 | pip install torchvision==0.5.0 74 | 75 | # run the simple inference script 76 | python efficientdet_test.py 77 | 78 | # Training 79 | 80 | Training EfficientDet is a painful and time-consuming task. You shouldn't expect to get a good result within a day or two. Please be patient. 81 | 82 | Check out this [tutorial](tutorial/train_shape.ipynb) if you are new to this. You can run it on colab with GPU support. 83 | 84 | ## 1. Prepare your dataset 85 | 86 | # your dataset structure should be like this 87 | datasets/ 88 | -your_project_name/ 89 | -train_set_name/ 90 | -*.jpg 91 | -val_set_name/ 92 | -*.jpg 93 | -annotations 94 | -instances_{train_set_name}.json 95 | -instances_{val_set_name}.json 96 | 97 | # for example, coco2017 98 | datasets/ 99 | -coco2017/ 100 | -train2017/ 101 | -000000000001.jpg 102 | -000000000002.jpg 103 | -000000000003.jpg 104 | -val2017/ 105 | -000000000004.jpg 106 | -000000000005.jpg 107 | -000000000006.jpg 108 | -annotations 109 | -instances_train2017.json 110 | -instances_val2017.json 111 | 112 | 113 | ## 2. Manual set project's specific parameters 114 | 115 | # create a yml file {your_project_name}.yml under 'projects'folder 116 | # modify it following 'coco.yml' 117 | 118 | # for example 119 | project_name: coco 120 | train_set: train2017 121 | val_set: val2017 122 | num_gpus: 4 # 0 means using cpu, 1-N means using gpus 123 | 124 | # mean and std in RGB order, actually this part should remain unchanged as long as your dataset is similar to coco. 125 | mean: [0.485, 0.456, 0.406] 126 | std: [0.229, 0.224, 0.225] 127 | 128 | # this is coco anchors, change it if necessary 129 | anchors_scales: '[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]' 130 | anchors_ratios: '[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)]' 131 | 132 | # objects from all labels from your dataset with the order from your annotations. 133 | # its index must match your dataset's category_id. 134 | # category_id is one_indexed, 135 | # for example, index of 'car' here is 2, while category_id of is 3 136 | obj_list: ['person', 'bicycle', 'car', ...] 137 | 138 | 139 | ## 3.a. Train on coco from scratch 140 | 141 | # train efficientdet-d0 on coco from scratch 142 | # with batchsize 12 143 | # This takes time and requires change 144 | # of hyperparameters every few hours. 145 | # If you have months to kill, do it. 146 | # It's not like someone going to achieve 147 | # better score than the one in the paper. 148 | # The first few epoches will be rather unstable, 149 | # it's quite normal when you train from scratch. 150 | 151 | python train.py -c 0 --batch_size 12 152 | 153 | ## 3.b. Train a custom dataset from scratch 154 | 155 | # train efficientdet-d1 on a custom dataset 156 | # with batchsize 8 and learning rate 1e-5 157 | 158 | python train.py -c 1 --batch_size 8 --lr 1e-5 159 | 160 | ## 3.c. Train a custom dataset with pretrained weights (Highly Recommended) 161 | 162 | # train efficientdet-d2 on a custom dataset with pretrained weights 163 | # with batchsize 8 and learning rate 1e-5 for 10 epoches 164 | 165 | python train.py -c 2 --batch_size 8 --lr 1e-5 --num_epochs 10 \ 166 | --load_weights /path/to/your/weights/efficientdet-d2.pth 167 | 168 | # with a coco-pretrained, you can even freeze the backbone and train heads only 169 | # to speed up training and help convergence. 
170 | 171 | python train.py -c 2 --batch_size 8 --lr 1e-5 --num_epochs 10 \ 172 | --load_weights /path/to/your/weights/efficientdet-d2.pth \ 173 | --head_only True 174 | 175 | ## 4. Early stopping a training session 176 | 177 | # while training, press Ctrl+c, the program will catch KeyboardInterrupt 178 | # and stop training, save current checkpoint. 179 | 180 | ## 5. Resume training 181 | 182 | # let says you started a training session like this. 183 | 184 | python train.py -c 2 --batch_size 8 --lr 1e-5 \ 185 | --load_weights /path/to/your/weights/efficientdet-d2.pth \ 186 | --head_only True 187 | 188 | # then you stopped it with a Ctrl+c, it exited with a checkpoint 189 | 190 | # now you want to resume training from the last checkpoint 191 | # simply set load_weights to 'last' 192 | 193 | python train.py -c 2 --batch_size 8 --lr 1e-5 \ 194 | --load_weights last \ 195 | --head_only True 196 | 197 | ## 6. Evaluate model performance 198 | 199 | # eval on your_project, efficientdet-d5 200 | 201 | python coco_eval.py -p your_project_name -c 5 \ 202 | -w /path/to/your/weights 203 | 204 | ## 7. Debug training (optional) 205 | 206 | # when you get bad result, you need to debug the training result. 207 | python train.py -c 2 --batch_size 8 --lr 1e-5 --debug True 208 | 209 | # then checkout test/ folder, there you can visualize the predicted boxes during training 210 | # don't panic if you see countless of error boxes, it happens when the training is at early stage. 211 | # But if you still can't see a normal box after several epoches, not even one in all image, 212 | # then it's possible that either the anchors config is inappropriate or the ground truth is corrupted. 213 | 214 | # TODO 215 | 216 | - [X] re-implement efficientdet 217 | - [X] adapt anchor strategies 218 | - [X] mAP tests 219 | - [X] training-scripts 220 | - [X] efficientdet D6 supports 221 | - [X] efficientdet D7 supports 222 | 223 | # FAQ: 224 | 225 | **Q1. Why implement this while there are several efficientdet pytorch projects already.** 226 | 227 | A1: Because AFAIK none of them fully recovers the true algorithm of the official efficientdet, that's why their communities could not achieve or having a hard time to achieve the same score as the official efficientdet by training from scratch. 228 | 229 | **Q2: What exactly is the difference among this repository and the others?** 230 | 231 | A2: For example, these two are the most popular efficientdet-pytorch, 232 | 233 | https://github.com/toandaominh1997/EfficientDet.Pytorch 234 | 235 | https://github.com/signatrix/efficientdet 236 | 237 | Here is the issues and why these are difficult to achieve the same score as the official one: 238 | 239 | The first one: 240 | 241 | 1. Altered EfficientNet the wrong way, strides have been changed to adapt the BiFPN, but we should be aware that efficientnet's great performance comes from it's specific parameters combinations. Any slight alteration could lead to worse performance. 242 | 243 | The second one: 244 | 245 | 1. Pytorch's BatchNormalization is slightly different from TensorFlow, momentum_pytorch = 1 - momentum_tensorflow. Well I didn't realize this trap if I paid less attentions. signatrix/efficientdet succeeded the parameter from TensorFlow, so the BN will perform badly because running mean and the running variance is being dominated by the new input. 246 | 247 | 2. Mis-implement of Depthwise-Separable Conv2D. 
Depthwise-Separable Conv2D is Depthwise-Conv2D and Pointwise-Conv2D and BiasAdd ,there is only a BiasAdd after two Conv2D, while signatrix/efficientdet has a extra BiasAdd on Depthwise-Conv2D. 248 | 249 | 3. Misunderstand the first parameter of MaxPooling2D, the first parameter is kernel_size, instead of stride. 250 | 251 | 4. Missing BN after downchannel of the feature of the efficientnet output. 252 | 253 | 5. Using the wrong output feature of the efficientnet. This is big one. It takes whatever output that has the conv.stride of 2, but it's wrong. It should be the one whose next conv.stride is 2 or the final output of efficientnet. 254 | 255 | 6. Does not apply same padding on Conv2D and Pooling. 256 | 257 | 7. Missing swish activation after several operations. 258 | 259 | 8. Missing Conv/BN operations in BiFPN, Regressor and Classifier. This one is very tricky, if you don't dig deeper into the official implement, there are some same operations with different weights. 260 | 261 | 262 | illustration of a minimal bifpn unit 263 | P7_0 -------------------------> P7_2 --------> 264 | |-------------| ↑ 265 | ↓ | 266 | P6_0 ---------> P6_1 ---------> P6_2 --------> 267 | |-------------|--------------↑ ↑ 268 | ↓ | 269 | P5_0 ---------> P5_1 ---------> P5_2 --------> 270 | |-------------|--------------↑ ↑ 271 | ↓ | 272 | P4_0 ---------> P4_1 ---------> P4_2 --------> 273 | |-------------|--------------↑ ↑ 274 | |--------------↓ | 275 | P3_0 -------------------------> P3_2 --------> 276 | 277 | For example, P4 will downchannel to P4_0, then it goes P4_1, 278 | anyone may takes it for granted that P4_0 goes to P4_2 directly, right? 279 | 280 | That's why they are wrong, 281 | P4 should downchannel again with a different weights to P4_0_another, 282 | then it goes to P4_2. 283 | 284 | And finally some common issues, their anchor decoder and encoder are different from the original one, but it's not the main reason that it performs badly. 285 | 286 | Also, Conv2dStaticSamePadding from [EfficientNet-PyTorch](https://github.com/lukemelas/EfficientNet-PyTorch) does not perform like TensorFlow, the padding strategy is different. So I implement a real tensorflow-style [Conv2dStaticSamePadding](efficientnet/utils_extra.py#L9) and [MaxPool2dStaticSamePadding](efficientnet/utils_extra.py#L55) myself. 287 | 288 | Despite of the above issues, they are great repositories that enlighten me, hence there is this repository. 289 | 290 | This repository is mainly based on [efficientdet](https://github.com/signatrix/efficientdet), with the changing that makes sure that it performs as closer as possible as the paper. 291 | 292 | Btw, debugging static-graph TensorFlow v1 is really painful. Don't try to export it with automation tools like tf-onnx or mmdnn, they will only cause more problems because of its custom/complex operations. 293 | 294 | And even if you succeeded, like I did, you will have to deal with the crazy messed up machine-generated code under the same class that takes more time to refactor than translating it from scratch. 295 | 296 | **Q3: What should I do when I find a bug?** 297 | 298 | A3: Check out the update log if it's been fixed, then pull the latest code to try again. If it doesn't help, create a new issue and describe it in detail. 299 | 300 | # Known issues 301 | 302 | 1. Official EfficientDet use TensorFlow bilinear interpolation to resize image inputs, while it is different from many other methods (opencv/pytorch), so the output is definitely slightly different from the official one. 
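To make item 2 of the FAQ above concrete: a TensorFlow-style Depthwise-Separable Conv2D is a depthwise convolution followed by a pointwise (1x1) convolution with a single BiasAdd at the end, so the depthwise step must not carry a bias of its own. The minimal PyTorch sketch below only illustrates that bias placement; it is not this repository's actual implementation.

    from torch import nn

    class SeparableConv2d(nn.Module):
        """Depthwise conv + pointwise conv with a single BiasAdd, keras/tf style."""
        def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
            super().__init__()
            # depthwise: one filter per input channel, no bias here
            self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                       padding=padding, groups=in_channels, bias=False)
            # pointwise: 1x1 conv that mixes channels and carries the only bias
            self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=True)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))
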
303 | 304 | # Visual Comparison 305 | 306 | Conclusion: They are providing almost the same precision. 307 | 308 | ## This Repo 309 | 310 | 311 | ## Official EfficientDet 312 | 313 | 314 | ## References 315 | 316 | Appreciate the great work from the following repositories: 317 | - [google/automl](https://github.com/google/automl) 318 | - [lukemelas/EfficientNet-PyTorch](https://github.com/lukemelas/EfficientNet-PyTorch) 319 | - [signatrix/efficientdet](https://github.com/signatrix/efficientdet) 320 | - [vacancy/Synchronized-BatchNorm-PyTorch](https://github.com/vacancy/Synchronized-BatchNorm-PyTorch) 321 | 322 | ## Donation 323 | 324 | If you like this repository, or if you'd like to support the author for any reason, you can donate to the author. Feel free to send me your name or introducing pages, I will make sure your name(s) on the sponsors list. 325 | 326 | 327 | 328 | ## Sponsors 329 | 330 | Sincerely thank you for your generosity. 331 | 332 | [cndylan](https://github.com/cndylan) 333 | [claire-s11](https://github.com/claire-s11) 334 | -------------------------------------------------------------------------------- /test/img.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/test/img.png -------------------------------------------------------------------------------- /test/img_inferred_d0_official.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/test/img_inferred_d0_official.jpg -------------------------------------------------------------------------------- /test/img_inferred_d0_this_repo.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/test/img_inferred_d0_this_repo.jpg -------------------------------------------------------------------------------- /test/img_inferred_d0_this_repo_0.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/DataXujing/EfficientDet_pytorch/b915a3e6a2c4ba6cacbaf9e0d84536cc34c27d59/test/img_inferred_d0_this_repo_0.jpg -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | # original author: signatrix 2 | # adapted from https://github.com/signatrix/efficientdet/blob/master/train.py 3 | # modified by Zylo117 4 | 5 | import datetime 6 | import os 7 | import argparse 8 | import traceback 9 | 10 | import torch 11 | import yaml 12 | from torch import nn 13 | from torch.utils.data import DataLoader 14 | from torchvision import transforms 15 | from efficientdet.dataset import CocoDataset, Resizer, Normalizer, Augmenter, collater 16 | from backbone import EfficientDetBackbone 17 | from tensorboardX import SummaryWriter 18 | import numpy as np 19 | from tqdm.autonotebook import tqdm 20 | 21 | from efficientdet.loss import FocalLoss 22 | from utils.utils import replace_w_sync_bn, CustomDataParallel, get_last_weights, init_weights 23 | 24 | 25 | class Params: 26 | def __init__(self, project_file): 27 | self.params = yaml.safe_load(open(project_file).read()) 28 | 29 | def __getattr__(self, item): 30 | return self.params.get(item, None) 31 | 32 | 33 | def 
get_args(): 34 | parser = argparse.ArgumentParser('Yet Another EfficientDet Pytorch: SOTA object detection network - Zylo117') 35 | parser.add_argument('-p', '--project', type=str, default='underwater', help='project file that contains parameters') 36 | parser.add_argument('-c', '--compound_coef', type=int, default=0, help='coefficients of efficientdet') 37 | parser.add_argument('-n', '--num_workers', type=int, default=12, help='num_workers of dataloader') 38 | parser.add_argument('--batch_size', type=int, default=16, help='The number of images per batch among all devices') 39 | parser.add_argument('--head_only', type=bool, default=False, 40 | help='whether finetunes only the regressor and the classifier, ' 41 | 'useful in early stage convergence or small/easy dataset') 42 | parser.add_argument('--lr', type=float, default=1e-4) 43 | parser.add_argument('--optim', type=str, default='adamw', help='select optimizer for training, ' 44 | 'suggest using \'admaw\' until the' 45 | ' very final stage then switch to \'sgd\'') 46 | parser.add_argument('--alpha', type=float, default=0.25) 47 | parser.add_argument('--gamma', type=float, default=1.5) 48 | parser.add_argument('--num_epochs', type=int, default=500) 49 | parser.add_argument('--val_interval', type=int, default=1, help='Number of epoches between valing phases') 50 | parser.add_argument('--save_interval', type=int, default=500, help='Number of steps between saving') 51 | parser.add_argument('--es_min_delta', type=float, default=0.0, 52 | help='Early stopping\'s parameter: minimum change loss to qualify as an improvement') 53 | parser.add_argument('--es_patience', type=int, default=0, 54 | help='Early stopping\'s parameter: number of epochs with no improvement after which training will be stopped. Set to 0 to disable this technique.') 55 | parser.add_argument('--data_path', type=str, default='dataset/', help='the root folder of dataset') 56 | parser.add_argument('--log_path', type=str, default='logs/') 57 | parser.add_argument('--load_weights', type=str, default=None, 58 | help='whether to load weights from a checkpoint, set None to initialize, set \'last\' to load last checkpoint') 59 | parser.add_argument('--saved_path', type=str, default='logs/') 60 | parser.add_argument('--debug', type=bool, default=False, help='whether visualize the predicted boxes of trainging, ' 61 | 'the output images will be in test/') 62 | 63 | args = parser.parse_args() 64 | return args 65 | 66 | 67 | class ModelWithLoss(nn.Module): 68 | def __init__(self, model, debug=False): 69 | super().__init__() 70 | self.criterion = FocalLoss() 71 | self.model = model 72 | self.debug = debug 73 | 74 | def forward(self, imgs, annotations, obj_list=None): 75 | _, regression, classification, anchors = self.model(imgs) 76 | if self.debug: 77 | cls_loss, reg_loss = self.criterion(classification, regression, anchors, annotations, 78 | imgs=imgs, obj_list=obj_list) 79 | else: 80 | cls_loss, reg_loss = self.criterion(classification, regression, anchors, annotations) 81 | return cls_loss, reg_loss 82 | 83 | 84 | def train(opt): 85 | params = Params('projects/{}.yml'.format(opt.project)) 86 | 87 | if params.num_gpus == 0: 88 | os.environ['CUDA_VISIBLE_DEVICES'] = '-1' 89 | 90 | if torch.cuda.is_available(): 91 | torch.cuda.manual_seed(42) 92 | else: 93 | torch.manual_seed(42) 94 | 95 | opt.saved_path = opt.saved_path + '/{}/'.format(params.project_name) 96 | opt.log_path = opt.log_path + '/{}/tensorboard/'.format(params.project_name) 97 | os.makedirs(opt.log_path, exist_ok=True) 98 | 
os.makedirs(opt.saved_path, exist_ok=True) 99 | 100 | training_params = {'batch_size': opt.batch_size, 101 | 'shuffle': True, 102 | 'drop_last': True, 103 | 'collate_fn': collater, 104 | 'num_workers': opt.num_workers} 105 | 106 | val_params = {'batch_size': opt.batch_size, 107 | 'shuffle': False, 108 | 'drop_last': True, 109 | 'collate_fn': collater, 110 | 'num_workers': opt.num_workers} 111 | 112 | input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536] 113 | training_set = CocoDataset(root_dir=opt.data_path + params.project_name, set=params.train_set, 114 | transform=transforms.Compose([Normalizer(mean=params.mean, std=params.std), 115 | Augmenter(), 116 | Resizer(input_sizes[opt.compound_coef])])) 117 | training_generator = DataLoader(training_set, **training_params) 118 | 119 | val_set = CocoDataset(root_dir=opt.data_path + params.project_name, set=params.val_set, 120 | transform=transforms.Compose([Normalizer(mean=params.mean, std=params.std), 121 | Resizer(input_sizes[opt.compound_coef])])) 122 | val_generator = DataLoader(val_set, **val_params) 123 | 124 | model = EfficientDetBackbone(num_classes=len(params.obj_list), compound_coef=opt.compound_coef, 125 | ratios=eval(params.anchors_ratios), scales=eval(params.anchors_scales)) 126 | 127 | # load last weights 128 | if opt.load_weights is not None: 129 | if opt.load_weights.endswith('.pth'): 130 | weights_path = opt.load_weights 131 | else: 132 | weights_path = get_last_weights(opt.saved_path) 133 | try: 134 | last_step = int(os.path.basename(weights_path).split('_')[-1].split('.')[0]) 135 | except: 136 | last_step = 0 137 | 138 | try: 139 | ret = model.load_state_dict(torch.load(weights_path), strict=False) 140 | except RuntimeError as e: 141 | print('[Warning] Ignoring {}'.format(e)) 142 | print( 143 | '[Warning] Don\'t panic if you see this, this might be because you load a pretrained weights with different number of classes. The rest of the weights should be loaded already.') 144 | 145 | print('[Info] loaded weights: {}, resuming checkpoint from step: {}'.format(os.path.basename(weights_path),last_step)) 146 | else: 147 | last_step = 0 148 | print('[Info] initializing weights...') 149 | init_weights(model) 150 | 151 | # freeze backbone if train head_only 152 | if opt.head_only: 153 | def freeze_backbone(m): 154 | classname = m.__class__.__name__ 155 | for ntl in ['EfficientNet', 'BiFPN']: 156 | if ntl in classname: 157 | for param in m.parameters(): 158 | param.requires_grad = False 159 | 160 | model.apply(freeze_backbone) 161 | print('[Info] freezed backbone') 162 | 163 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 164 | # apply sync_bn when using multiple gpu and batch_size per gpu is lower than 4 165 | # useful when gpu memory is limited. 166 | # because when bn is disable, the training will be very unstable or slow to converge, 167 | # apply sync_bn can solve it, 168 | # by packing all mini-batch across all gpus as one batch and normalize, then send it back to all gpus. 169 | # but it would also slow down the training by a little bit. 
170 | if params.num_gpus > 1 and opt.batch_size // params.num_gpus < 4: 171 | model.apply(replace_w_sync_bn) 172 | 173 | writer = SummaryWriter(opt.log_path + '/{}/'.format(datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))) 174 | 175 | # warp the model with loss function, to reduce the memory usage on gpu0 and speedup 176 | model = ModelWithLoss(model, debug=opt.debug) 177 | 178 | if params.num_gpus > 0: 179 | model = model.cuda() 180 | if params.num_gpus > 1: 181 | model = CustomDataParallel(model, params.num_gpus) 182 | 183 | if opt.optim == 'adamw': 184 | optimizer = torch.optim.AdamW(model.parameters(), opt.lr) 185 | else: 186 | optimizer = torch.optim.SGD(model.parameters(), opt.lr, momentum=0.9, nesterov=True) 187 | 188 | scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3, verbose=True) 189 | 190 | epoch = 0 191 | best_loss = 1e5 192 | best_epoch = 0 193 | step = max(0, last_step) 194 | model.train() 195 | 196 | num_iter_per_epoch = len(training_generator) 197 | 198 | try: 199 | for epoch in range(opt.num_epochs): 200 | last_epoch = step // num_iter_per_epoch 201 | if epoch < last_epoch: 202 | continue 203 | 204 | epoch_loss = [] 205 | progress_bar = tqdm(training_generator) 206 | for iter, data in enumerate(progress_bar): 207 | if iter < step - last_epoch * num_iter_per_epoch: 208 | progress_bar.update() 209 | continue 210 | try: 211 | imgs = data['img'] 212 | annot = data['annot'] 213 | 214 | if params.num_gpus == 1: 215 | # if only one gpu, just send it to cuda:0 216 | # elif multiple gpus, send it to multiple gpus in CustomDataParallel, not here 217 | imgs = imgs.cuda() 218 | annot = annot.cuda() 219 | 220 | optimizer.zero_grad() 221 | cls_loss, reg_loss = model(imgs, annot, obj_list=params.obj_list) 222 | cls_loss = cls_loss.mean() 223 | reg_loss = reg_loss.mean() 224 | 225 | loss = cls_loss + reg_loss 226 | if loss == 0 or not torch.isfinite(loss): 227 | continue 228 | 229 | loss.backward() 230 | # torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1) 231 | optimizer.step() 232 | 233 | epoch_loss.append(float(loss)) 234 | 235 | progress_bar.set_description( 236 | 'Step: {}. Epoch: {}/{}. Iteration: {}/{}. Cls loss: {:.5f}. Reg loss: {:.5f}. 
Total loss: {:.5f}'.format( 237 | step, epoch, opt.num_epochs, iter + 1, num_iter_per_epoch, cls_loss.item(), 238 | reg_loss.item(), loss.item())) 239 | writer.add_scalars('Loss', {'train': loss}, step) 240 | writer.add_scalars('Regression_loss', {'train': reg_loss}, step) 241 | writer.add_scalars('Classfication_loss', {'train': cls_loss}, step) 242 | 243 | # log learning_rate 244 | current_lr = optimizer.param_groups[0]['lr'] 245 | writer.add_scalar('learning_rate', current_lr, step) 246 | 247 | step += 1 248 | 249 | if step % opt.save_interval == 0 and step > 0: 250 | save_checkpoint(model, 'efficientdet-d{}_{}_{}.pth'.format(opt.compound_coef,epoch,step)) 251 | print('checkpoint...') 252 | 253 | except Exception as e: 254 | print('[Error]', traceback.format_exc()) 255 | print(e) 256 | continue 257 | scheduler.step(np.mean(epoch_loss)) 258 | 259 | if epoch % opt.val_interval == 0: 260 | model.eval() 261 | loss_regression_ls = [] 262 | loss_classification_ls = [] 263 | for iter, data in enumerate(val_generator): 264 | with torch.no_grad(): 265 | imgs = data['img'] 266 | annot = data['annot'] 267 | 268 | if params.num_gpus == 1: 269 | imgs = imgs.cuda() 270 | annot = annot.cuda() 271 | 272 | cls_loss, reg_loss = model(imgs, annot, obj_list=params.obj_list) 273 | cls_loss = cls_loss.mean() 274 | reg_loss = reg_loss.mean() 275 | 276 | loss = cls_loss + reg_loss 277 | if loss == 0 or not torch.isfinite(loss): 278 | continue 279 | 280 | loss_classification_ls.append(cls_loss.item()) 281 | loss_regression_ls.append(reg_loss.item()) 282 | 283 | cls_loss = np.mean(loss_classification_ls) 284 | reg_loss = np.mean(loss_regression_ls) 285 | loss = cls_loss + reg_loss 286 | 287 | print( 288 | 'Val. Epoch: {}/{}. Classification loss: {:1.5f}. Regression loss: {:1.5f}. Total loss: {:1.5f}'.format( 289 | epoch, opt.num_epochs, cls_loss, reg_loss, loss)) 290 | writer.add_scalars('Total_loss', {'val': loss}, step) 291 | writer.add_scalars('Regression_loss', {'val': reg_loss}, step) 292 | writer.add_scalars('Classfication_loss', {'val': cls_loss}, step) 293 | 294 | if loss + opt.es_min_delta < best_loss: 295 | best_loss = loss 296 | best_epoch = epoch 297 | 298 | save_checkpoint(model, 'efficientdet-d{}_{}_{}.pth'.format(opt.compound_coef,epoch,step)) 299 | 300 | # onnx export is not tested. 301 | # dummy_input = torch.rand(opt.batch_size, 3, 512, 512) 302 | # if torch.cuda.is_available(): 303 | # dummy_input = dummy_input.cuda() 304 | # if isinstance(model, nn.DataParallel): 305 | # model.module.backbone_net.model.set_swish(memory_efficient=False) 306 | # 307 | # torch.onnx.export(model.module, dummy_input, 308 | # os.path.join(opt.saved_path, 'signatrix_efficientdet_coco.onnx'), 309 | # verbose=False) 310 | # model.module.backbone_net.model.set_swish(memory_efficient=True) 311 | # else: 312 | # model.backbone_net.model.set_swish(memory_efficient=False) 313 | # 314 | # torch.onnx.export(model, dummy_input, 315 | # os.path.join(opt.saved_path, 'signatrix_efficientdet_coco.onnx'), 316 | # verbose=False) 317 | # model.backbone_net.model.set_swish(memory_efficient=True) 318 | 319 | # Early stopping 320 | if epoch - best_epoch > opt.es_patience > 0: 321 | print('[Info] Stop training at epoch {}. 
The lowest loss achieved is {}'.format(epoch, loss)) 322 | break 323 | except KeyboardInterrupt: 324 | save_checkpoint(model, 'efficientdet-d{}_{}_{}.pth'.format(opt.compound_coef,epoch,step)) 325 | writer.close() 326 | writer.close() 327 | 328 | 329 | def save_checkpoint(model, name): 330 | if isinstance(model, CustomDataParallel): 331 | torch.save(model.module.model.state_dict(), os.path.join(opt.saved_path, name)) 332 | else: 333 | torch.save(model.model.state_dict(), os.path.join(opt.saved_path, name)) 334 | 335 | 336 | if __name__ == '__main__': 337 | opt = get_args() 338 | train(opt) 339 | -------------------------------------------------------------------------------- /tutorial/train_shape.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true, 7 | "pycharm": { 8 | "name": "#%% md\n" 9 | } 10 | }, 11 | "source": [ 12 | "# EfficientDet Training On A Custom Dataset\n", 13 | "\n", 14 | "\n", 15 | "\n", 16 | "
\n", 17 | " \n", 18 | " View source on github\n", 19 | " \n", 20 | "\n", 21 | " \n", 22 | " Run in Google Colab\n", 23 | "
" 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "source": [ 29 | "## This tutorial will show you how to train a custom dataset.\n", 30 | "\n", 31 | "## For the sake of simplicity, I generated a dataset of different shapes, like rectangles, triangles, circles.\n", 32 | "\n", 33 | "## Please enable GPU support to accelerate on notebook setting if you are using colab.\n", 34 | "\n", 35 | "### 0. Install Requirements" 36 | ], 37 | "metadata": { 38 | "collapsed": false, 39 | "pycharm": { 40 | "name": "#%% md\n" 41 | } 42 | } 43 | }, 44 | { 45 | "cell_type": "code", 46 | "execution_count": null, 47 | "outputs": [], 48 | "source": [ 49 | "!pip install pycocotools numpy==1.16.0 opencv-python tqdm tensorboard tensorboardX pyyaml matplotlib\n", 50 | "!pip install torch==1.4.0\n", 51 | "!pip install torchvision==0.5.0" 52 | ], 53 | "metadata": { 54 | "collapsed": false, 55 | "pycharm": { 56 | "name": "#%%\n" 57 | } 58 | } 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "source": [ 63 | "### 1. Prepare Custom Dataset/Pretrained Weights (Skip this part if you already have datasets and weights of your own)" 64 | ], 65 | "metadata": { 66 | "collapsed": false, 67 | "pycharm": { 68 | "name": "#%% md\n" 69 | } 70 | } 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "outputs": [], 76 | "source": [ 77 | "import os\n", 78 | "import sys\n", 79 | "if \"projects\" not in os.getcwd():\n", 80 | " !git clone --depth 1 https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch\n", 81 | " os.chdir('Yet-Another-EfficientDet-Pytorch')\n", 82 | " sys.path.append('.')\n", 83 | "else:\n", 84 | " !git pull\n", 85 | "\n", 86 | "# download and unzip dataset\n", 87 | "! mkdir datasets\n", 88 | "! wget https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/releases/download/1.1/dataset_shape.tar.gz\n", 89 | "! tar xzf dataset_shape.tar.gz\n", 90 | "\n", 91 | "# download pretrained weights\n", 92 | "! mkdir weights\n", 93 | "! wget https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/releases/download/1.0/efficientdet-d0.pth -O weights/efficientdet-d0.pth\n", 94 | "\n", 95 | "# prepare project file projects/shape.yml\n", 96 | "# showing its contents here\n", 97 | "! cat projects/shape.yml" 98 | ], 99 | "metadata": { 100 | "collapsed": false, 101 | "pycharm": { 102 | "name": "#%%\n" 103 | } 104 | } 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "source": [ 109 | "### 2. Training" 110 | ], 111 | "metadata": { 112 | "collapsed": false 113 | } 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "outputs": [], 119 | "source": [ 120 | "# consider this is a simple dataset, train head will be enough.\n", 121 | "! python train.py -c 0 -p shape --head_only True --lr 1e-3 --batch_size 32 --load_weights weights/efficientdet-d0.pth --num_epochs 50\n", 122 | "\n", 123 | "# the loss will be high at first\n", 124 | "# don't panic, be patient,\n", 125 | "# just wait for a little bit longer" 126 | ], 127 | "metadata": { 128 | "collapsed": false, 129 | "pycharm": { 130 | "name": "#%%\n" 131 | } 132 | } 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "source": [ 137 | "### 3. Evaluation" 138 | ], 139 | "metadata": { 140 | "collapsed": false 141 | } 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "outputs": [], 147 | "source": [ 148 | "! 
python coco_eval.py -c 0 -p shape -w logs/shape/efficientdet-d0_49_1400.pth" 149 | ], 150 | "metadata": { 151 | "collapsed": false, 152 | "pycharm": { 153 | "name": "#%%\n" 154 | } 155 | } 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "source": [ 160 | "### 4. Visualize" 161 | ], 162 | "metadata": { 163 | "collapsed": false, 164 | "pycharm": { 165 | "name": "#%% md\n" 166 | } 167 | } 168 | }, 169 | { 170 | "cell_type": "code", 171 | "execution_count": 4, 172 | "outputs": [ 173 | { 174 | "data": { 175 | "text/plain": "
", 176 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQYAAAD8CAYAAACVSwr3AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3de7RdVX3o8e9vrrX3PnkgCSRQTECCQAW9ijRCLARQSgX0Ch3FB/WWaNEoYC/Utha9HX1wvR311lFbexWIAsY7rIq2NrnUq+UhJvYK8VB5KYJBwyMDSUCeyck5e635u3/Mufde56xzcp775fl9xtjZa8219l7zZO/123PNNR+iqhhjTJHrdgaMMb3HAoMxpsQCgzGmxAKDMabEAoMxpsQCgzGmpC2BQUTOFpEHRWS7iFzZjmMYY9pH5rodg4gkwEPAWcDjwPeBC1X1R3N6IGNM27SjxHASsF1Vf6qqI8CXgfPacBxjTJukbXjPFcBjhfXHgZP394Jly5bpkUce2YasGGMa7rrrrqdUdflU9m1HYJgSEVkPrAc44ogjGBwc7FZWjJkXROSRqe7bjkuJncDhhfWVMW0UVd2gqqtVdfXy5VMKYsaYDmlHYPg+cIyIrBKRKvBOYHMbjmOMaZM5v5RQ1UxEPgh8C0iA61X1h3N9HGNM+7SljkFVvwF8ox3vbYxpP2v5aIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpscBgjCmxwGCMKbHAYIwpmTQwiMj1IrJLRO4vpB0kIjeLyE/i89KYLiLyKRHZLiL3isiJ7cy8MaY9plJi+Dxw9pi0K4FbVfUY4Na4DnAOcEx8rAeunptsGmM6adLAoKpbgF+MST4P2BiXNwLnF9K/oMEdwBIROWyuMmuM6YyZ1jEcqqpPxOWfA4fG5RXAY4X9Ho9pJSKyXkQGRWRw9+7dM8yGMaYdZl35qKoK6Axet0FVV6vq6uXLl882G8aYOTTTwPBk4xIhPu+K6TuBwwv7rYxpxpg+MtPAsBlYF5fXAZsK6RfFuxNrgOcKlxzGmD6RTraDiHwJOANYJiKPA38O/DVwo4hcDDwCvD3u/g3gXGA7sBd4TxvybIxps0kDg6peOMGmM8fZV4HLZpspY0x3WctHY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMiQUGY0yJBQZjTIkFBmNMyaSBQUQOF5Fvi8iPROSHInJ5TD9IRG4WkZ/E56UxXUTkUyKyXUTuFZET2/1HGGPm1lRKDBnwh6p6PLAGuExEjgeuBG5V1WOAW+M6wDnAMfGxHrh6znNtjGmrSQODqj6hqv8Rl18AHgBWAOcBG+NuG4Hz4/J5wBc0uANYIiKHzXnOjTFtM606BhE5EngtcCdwqKo+ETf9HDg0Lq8AHiu87PGYZozpE1MODCKyGPgn4ApVfb64TVUV0OkcWETWi8igiAzu3r17Oi81xrTZlAKDiFQIQeGLqvrPMfnJxiVCfN4V03cChxdevjKmjaKqG1R1taquXr58+Uzzb4xpg6nclRDgOuABVf3bwqbNwLq4vA7YVEi/KN6dWAM8V7jkMMb0gXQK+5wC/C5wn4jcHdM+Cvw1cKOIXAw8Arw9bvsGcC6wHdgLvGdOc2yMabtJA4OqfheQCTafOc7+Clw2y3wZY7rIWj4aY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMBhjSiwwGGNKLDAYY0osMPQtBQ1P/cx3OwNmXGm3M2BmbtgDCk5oBYhioJDC8nTS5+I9iukyerMWVjIgVajYN7Gn2MfRtxy1pNt5mL1KY0EVZLzIZLph0sAgIgPAFqAW9/+aqv65iKwCvgwcDNwF/K6qjohIDfgC8GvA08A7VHVHm/I/7510+rtBW1eExWvDYjF925brWXPa75XSx9t/oveYaXozTTxOwQvgE7ZtvS78DaddwLbvwOgihgWHbppKHcMw8EZVfQ1wAnC2iKwBPg58UlWPBp4BLo77Xww8E9M/Gfczc8gXTppt3/k8ojlChpChhYcUHkApfduW61Ey7ojP27ZcP+o97ojrY997ovSxxyylqUdVw3MhhGzb8rXRlyCm6yYNDBq8GFcr8aHAG4GvxfSNwPlx+by4Ttx+pojYxz5HxlbWvX7tOpwod275Ak6l+Ri7DoxaL6a9fu26cfcZ7zXF/Wf2cM33bTjtjHe0+X/NTNeU7kqISCIidwO7gJuBh4FnVTWLuzwOrIjLK4DHAOL25wiXG2Pfc72IDIrI4O7du2f3V8wjjjEfmjq8CiefdhEe13ycvPY9o9aBUevjpU223nh8b+vGcdOL28bb53tbw++FF4+XVqlny+1fGXPlYJcR3TalwKCquaqeAKwETgJeMdsDq+oGVV2tqquXL18+27ebZ1q/tr5ZXyeE8sT4j5PXrps0rbwPE77XnVtv2O9xJjyeeNAEJG/+Daec8c4+uZQIAcuTt/6HtPGPtnZRgJzmBVNcR2n9d2hv36qdVjsGVX0W+DbwemCJiDQqL1cCO+PyTuBwgLj9QEIlpGkDp4DM/S/snVs3xpN7fPvbtl/q4uVEX0SCMQR8q9TmIJ7sEgNz1twNHC6XZmWr1yR8Ti4+pLcbEU2aNxFZLiJL4vIC4CzgAUKAuCDutg7YFJc3x3Xi9ttU1cqGfWbGJ/4vM6V50jd//YUYncEjeNFm5bBPwEsG5CAaSxqxyNDjjdOm0o7hMGCjiCSEQHKjqt4kIj8CviwiHwN+AFwX978O+N8ish34BfDONuTbmM5r3EVVCcsS7rkoQiLgNBnVSCzEizScNR6QtHDrttOZn55JA4Oq3gu8dpz0nxLqG8am7wPeNie5M6ZjpnimjtktnWibjFl3o55moLPFC2v5aEzTZCdfuH7YV89wlZwqi6jzGIKSI3j2ABk5i3GMIKSkHAQ4KrwktO6ERlUFAG7SxlxSeO5ccLDAYExJ8SQsLjuGeIwFlcP5qwd+nY8eB//jkXMgGSHxipOFDGcjJBVInEdHBD+0kGq+kIF9v82lq3dSYRlO
azhpNPEa29FkKh1S2s8CQ78ZcxfCS6j0QrSna7mBcG2uAi4Lty0b2nBXZVZUS3l6ke1cPXg+I4t+Bx14DX7xLwD4i5fdz1WPvQIVyH1GmgAevAo4QRYNUWeIfNFzfOLH72Dg+Zfy/pP+hsW8DIeSZ54k1XgL15MxTlsVoNNBoue/S2YyHofixIeWhT5pPYotDqeTPhfvUUwvtHp0MTi4Qv8O8b3UGyx25lJBs3AC7mEH19z3Xv549b+QL3qKP111T7w9CVc9elwr4BWDXaFlJz7hT1fdR33Bs/zhSV/lmgcugthQ3aXgfRI7kAkpMkEbBy082s9KDP1GwDfbJYLzCcTbZJnLS/tO9B5TTp/L9xCPczle/KiOXyo91NTHCzl1nPPsSx9lw+C72HvYW8iWplz1yCtBfAgG0HyejJOcjz12LLic//74cejiXfzV9tdRee63uPTXPsMi91JyRvBkVMhwAp5hCvdHO85KDH3IFT4273Iy5/EuB3RUI+Tir8x00ufiPYrpo37tGiWJwi+q9NgN/YSUjD38w0MX8sKye8jqyoiOhIA2A43/BUf4X8E5RgaeY++yB/n0g2/H8xwJlXDcWHpyWpu7P2gGrMTQl1on1Z1bN+5nv/6g5JPv1CniyeRF/mHb77Hn0GcRUirJCBVfCaWcGQQH7/IQzPNK8xg4hzp4YcHTfPqOy7l0zf8i8Ysb0aPrLDD0nVi89Mopb3gH6itIXkEFsl4qko9HYiUbAJ7EeYa9MJBq8xe12/bI83z2nnfz/LLtJCSAUkeoeIfKDPs3NEpHLms0ksTFTi6qGU+/dBufGLyQK1bfQE2XoeLx6uhmzYv0Qmvl1atX6+DgYLez0Rda979zhjUBZfQXqMeHdmsMVRnr2hjJoSqQJt27ng7CrcK/ueN3GPmVe0lTZVgzyKuhhaPLSXWG+RMN7aMbdUAqzToWyYW620PFH8DAo6v5g1M2kFYcXj1OEsb/UGaYDZG7VHX1VPa1EkOfaf2qJtTGtq7rQ2nzG9jJPyQ0VMpwpIDmgiQwzCMMHfojcJ7cA7RO5hkHBWjdsSi+RyzdaeJIXQVNRhg66nvUKzug/lK8LqBanexvgOb/2/6C8wyy3gulN2M6TECT+KsozfPmM1v+ApKRDmfFg6+AT/Dq+NxdH0Uqbv+D4zb6a6i06nVFQbLYnT2nuWGG8cwCg5mXwriTgkfJJXSX3nfIQ53NhGg4kVVweHwOzw08yjA7EbefS4dSn4w8LGga6nA0aQWOGbLAYOadMIBKY0yEHIlnwd4Fz3Y2IyqhtJCMgMtJq1Bf+CLXbvsQ+X7qFFSV3OeF6ockvl1GsxLHE4aHGAb2xcc0WB2DmXdG/xomrQZjkpe2tp34Zq9sXI5XYWThU4ywnQUcO+5LVDVUTCooikiYoEOylC233YvPhcSlgAOfkCQJIyPTu0SywGDmIWneNiQX8iTUN6TNInnnKkK9KM5X8Aguh0olI0scQwyzYILXOHF4r4gTJBP8MPy/7zyErztgIXhP3StprNmt5xnVyvQaTFlgMPOXZJBAxgPUgEq+kMzV23a4PzvigSnu+WPg1WPSRk/n5Rr3qNPQ3+LUc2efvyILDGZ+EoA0jLIWrx7aGRT6jQUGM78JEJtkq8br9Ta66tHjmiWHxnKjM9bY5dwPk7jGJUCrQcJ3//UhsnpoF5EkrVN47VtfztbND5eWG+vTYXclzPzUGNJdQlMnCIGh3Ron/3g9M4tB4+P3vxHvssLWYjfulDSpUqlU25ZnKzGYeSz0hWj0SnCuh34nfUJSOD23bvpZaAwlvjlEXDtLOBYYzDyWx5GTOtdfaOylxITbj3iAwlzgrD1vVXN566afsfa8VaMuFQC2bn64eckw0SXFVFlgMPOTQKv7WWdPg2JAmCg41J55Gac88UNOPTukbf3XH4fbqHnoQDHRyT5esJiJHio7GdMFCkKs4NMunw6NCX99gqsPoL6Qnzz2p3B57IA1zlBvc5h/CwxmXiqOq5DEOZdTSToaHEbNECoKeWiUlIws5PiD3wDZwJhXSGHkKyk8GklzNx6HBQYzL7lGnwKXkbIMgGEdiUPkdSoP8VkdDo/Wcuo+w9WXUNv5qx3Lx3isjsHMUxL7RqTNE1TxiHZ23CSnLs6InaAjVRKpkTz9K1RePKKj+Sjlq6tHN6bbmnNMwgD7HRmlDcd2o8aR1HQvfgiOX3oqyfBEPSU6wwKDmZ8UINyq1FiBt/DR1aSdHIlVPLgs1C+4jNQv4ICnjmbg4dPZ/0gt7WeBwcw7njhQCwrkzTGcLln7caRU4dd+zif4vAo+4TXpf6HqhJF8uOP5GJWnrh7dmC5oTgGnAqTNiv0ay6ntOoaqr5AgaBIqIp3kZJrgtXDXQsNYBx7wk9zJ8C4Pg8ECCWH6uwxHhiNJElwCiSqLd5zOwJPH4nwVke7OzmWBwcxfpdbEwgfXfJbKU6sYqUOqA7i8iseRuizcyYDQUtLVIRnBST7pbUKX1XCx30NdE7x4quRUNTRrrmf7GNixhtfIhZAoKkp1f0O7dYAFBmMKqhzIpas/waJsKblkca4OT5Kn4Oqjb2fGOxiTnUROfLx0AZDQWMnlgCfzntreg/lPA79N7cVD0DxBcWi9u8N/W2AwZowFejR/8Ks3seD5pXjnqZOQuzyO5pyGywdfDfUC0Lq7MAEvHi+NYeihgifLa4yIY+D5Q/jj479JbfcqMs1RVdLUo5UJ364jphwYRCQRkR+IyE1xfZWI3Cki20XkKyJSjem1uL49bj+yPVk3Zu4571GUhCVcetwNLHryKJKsSpZDPRtGPOR1QZxvDTXf6PU4gUql0pyXwgN5kpFmVV7y7NH811fdgBs+kGpSwavgU08997i8u7OKTafEcDlQHJvq48AnVfVo4Bng4ph+MfBMTP9k3M+YvqAeRD2Cp6Iv54rXbeSQX7yCBdUBBgaESq1OxaWtXplTaBA1tHeY5rVEqiS6iCW/eCWXv/oGFrKKRByJpqS+QhZnLpcu99uY0tFFZCXwZuBzcV2ANwJfi7tsBM6Py+fFdeL2M6Xdw+IYM0ckdXiXoChVHLV8KZec8EUuX7GZdPup6ItLkGQvmsWZpRrTz+3nRE4HErI0BJCFPzuBPzr8Jj74uo1UdGkYIkbjxLduhAU+9IfochXDlEsMfwd8mFafj4OBZ1W1McTM48CKuLwCeAwgbn8u7j+KiKwXkUERGdy9e/cMs2/M3Au3M124a+FAkoSFrOSK0z/NFcdu4oAdJ7No6AAWiIM8xacjSJKhIuRewkkiaQgWmbB4eDEHPHwSAB9a+xlqrAAEL1kcth58Hk4ujWNDuC5PUDxp8yoReQuwS1XvEpEz5urAqroB2ABhUtu5el9j5oaMegKoMUCqjkvWXkPOEBlPc82Wq0gH4te3sheqe5Chg/G5oFrnvWs+huNAakcdDHwexwG0pvb28Rao4CQhx+Mbc1wWmkp3w1TaXZ4CvFVEzgUGgJcAfw8sEZE0lgpWAjvj/juBw4HHRSQFDgSenvOcGzPnJi+/JzJ6dvEPnzbZK74
+Zt01y+nN4noVfv3NE7/DTAdbmY1JA4OqfgT4CEAsMfyRqr5LRL4KXAB8GVgHbIov2RzXvxe336adGGXTmFmZ/Cvqm/sJmkEYoLkxe5WQZRlpHLXZq8c5CZWOLieEEyHMG9cICYL3Hpc5/v2W7fis0Kqyy5cSs6n6/BPgQyKynVCHcF1Mvw44OKZ/CLhydlk0pjc4bZwwOdL8SQ0TyHrApSleGtPNOTw5YRCmpDUgC+CbJZMc58LFRK/9dk6rC5eq3g7cHpd/Cpw0zj77gLfNQd6M6S0KoT7AhZXGaEqiuEZpQNM4uGwdtBoqFltTQoQRqeP7hJ4TYaq5UQPSNodu696tCRuoxZipiiOp+eI6ECoREyAtnM+NmgjFyaidwyti705RqNdBNQwa0yt6JyfG9Lrmr/7YX/JkzD5SSBNK9RdSuIYXIW2MDyMefBpmwFYp9K/oPOsrYcxckvGWJzrDQ7o4WHtmHEY+3qrsZlAACwzGdFejMJEQ+1z0RiWkBQZjuqh5MyIFXDZph6xOscBgTBdJaAIBoqz9jeNRqZMkSdtn3Z6MBQZjeoJABZKKouotMBgzn/lGJYOCSkbuM+r1etcbPFlg6JbCtIMebX5BitOWTec9ptKk1/Qep9Js+yCknHbWq9DKSOiUAYCS1X3o2t3B+gcLDF2SSY4XxWv4cjiVGCCAGCgyQsv6ECxi8FCaffgR8JKThTmUuva3mFloTj8ZorwKvOGsV4HLEIE8z6nVaqFi0qfh0QEWGLrE4WJACBOaeslwKqQaGtCER3H/kOZFSZV4W0vJcaTxfUw/anxwAnhyMqjAqWceTZIkJFKjng3T6gbemRKDtXxsu/F/yRsDdIxuBVfeZ2zkdmP2q0x8iMgiRm+TQmxISFNtJo+wB5ccQOKrqDYuIzrTh8JKDL/0ilOlj30eu5/pOB27HJtQi7L2zOPxMkSmPk7A27mOVVZi6IhicTFWMubKi074/c99lZ35Um75wFkAnHzNVvb5Ybz31Go1sizj7kvO5LVX3w7ikaSGjOQMa8ZRvMDmD57HPp8x4JJR7986XvH445Uexu5jOqp4njcqkyV8jq4Ga3/zWL777ftheEHYuUNjNVhg6Ihw/Qi+WXn4FMLvf/pf2J6+lB984FROvPpW8sQhPkNclQTP4PtOH/UeXhMWZDlDicfh2MGBADxMwiHA8v0ev/TTNGZ5nM4+pkMalw+Fy4r4ebgqrD3zVXz3mw+17kx0IDhYYOgYT6ahP/6ODN73uZt5NlkKcTxdJQkD/QhxDHNCKaFJcUAdxWmYijWJt7QuvvY2lrt9/J/3hz3reU4lcaNeOz4LBr1Bxl0MPJImrD37WL5zy33oyACpJGSZkqTSagylCY4wuc2c5KjbDSkgDAY7ODjY7Wy0ieDxuFwYTuBRhbd95hbEzXaqofANciguAXLPYj/M1z5wNksbwwo2Ws+pWhVCT5hJIA77q5cwUn0G37vlp6Au3s4sliCKpb/R77H2rUcjwl2qunoqR7XKxzZqxG6HJ3NhjP311/4bqQww0zO10R3XKZB7PIK4ChkJz7OID3x2E3vHvrUFhf7UrIwUcq2TkyMVOOVNR1FPn8G7HJ/JmLsVc1OxbIGhjUQbrRkT9gKXX/stnskHyLXOXBThnQsf33A9Ixclr8AjHMgjwNCoSq36rI9luqDwGaZJJXSucoqmntPfdAKnnPVyfHUPKgI+RaRxaRHGkBQJg83OhAWGDhAPF13zTX42lCJpBm7m/+3NhkyieBnVJprcj5CgvPuz/86TozJgH3Nf08IDQdQhCJLAGW9+Jae9ZRVrz3kZVOuIA685xT5YOoNRX6zysY3Eh0kIhgQedgup1sBl9Thy8MyEIb90VH8KFyskRSqMkDCU7ePyDTexaX3YnpHYB93PineUFbzXMDQ9NEsFruY45axjkQzI4fbb7wV15OpJ0+nXZ9n3pQN+LJD4jAVZjaFa7Bfh8xm/n1Mhi0WH1pDmkGceJ0JShR060Np/Npk3vaMRIJJQiek13LZ2zuE1D+1cqg6vcNrZr8RJGJF6JvcXLDC0k4bf9T+55pvU3CKG0ix8gLPoCOOlFRCaaYSTP02UDMVlFZKkFXgsMPSS2dcENz5PV3iriZYRmMnQDhYY2mjYKTVgDwn7CHMPOG1P24EwkSq4XMApvvBtaE6VaLqs200Dph4h7PvSRpX4OezF4fN6OHnb1GrNC2HqdOepKmje+hJYz0szXRYY2shJnFvAJ1TTVuGsHY1ZnYKoCw2vVdARu0VpZs4CQxsNx+eKCPVcC0ODz/1/uwOqGkoOmVNcpXAM7f6ow6a/WGBoo8bpOJwNU1GHxkuIVOf+P74xJJzD4xSKNzR9j8xVYPqHBYY2alzbp2lK3dVJfBLbIfi2XE548aAJPslIkgWtDZJM/CJjxmGBoY1q8dkTSvPOC67Nv94+zsAs2czbSRhjgaEDqkkVcmIzZt++/3R1IfCoIHkrMPRAB1rTZywwtFE91ikkXllYTaknOcNCOIHbcDwvoT7B+YSs0roLklhgMNNkgaGNGuM8JzpMlmUkCok6MqStw707IHeFSwnrdm2myQJDG6V5+KleVXkR0irOJ6Rt/fX2pD50sqr4QjsGCwxmmqYUGERkh4jcJyJ3i8hgTDtIRG4WkZ/E56UxXUTkUyKyXUTuFZET2/kH9LYQBf7yveeT1+vURag7z4C2pzWig+ZdjwO6PPeh6W/TKTG8QVVPKAwNdSVwq6oeA9wa1wHOAY6Jj/XA1XOV2b4Tx104DFgqdbxmqBeGJYsVkbHI7/I4Vl+jL8XEGl3rU8I8A7nzOAmPTIWRxJOrcmx9qPWaNv155pfXbC4lzgM2xuWNwPmF9C9ocAewREQOm8Vx+lZGuM6vAZ+77E0cJDm1TKkmVTJ1LMgctTxMO5b6BC9KPdn/bcY0nuaZpnhNEJ/gNcGrQ0ZykhHlKF7kY5f85+ZrrK+Ema6pBgYF/k1E7hKROPwHh6rqE3H558ChcXkF8FjhtY/HtFFEZL2IDIrI4O7du2eQ9d7XOIkdyvIcjuRZlIysnuNcxlAaGjo5JY6v4EnQ/Xay8upIgVy02QzaSeiLn9YqLK4Oce36t3JwsU2TFRnMNE01MJyqqicSLhMuE5HTihs1DDU9rd8lVd2gqqtVdfXy5RPPiNDP8kKLwwWJ8plLLuAo2cOw1MPMQq4eBvR0ORrbNzifhB6YE/AIHqiRg+SMNO4+iKJ+hOvffw5LkjEfrFU3mGmaUmBQ1Z3xeRfwdeAk4MnGJUJ83hV33wkcXnj5ypg27ySN/14VKngWq+dLl53PcbxAmkOaVck1RUmp+RAUMnXs70xOFTLxDDtAKkie4LTKUq/ceMlvcLRCtT76g53NUHJmfpr0KyMii0TkgMYy8JvA/cBmYF3cbR2wKS5vBi6KdyfWAM8VLjnmmVYhypOQiZCiXP/+3+J49vAStxfv64iHDBd6X8r+C1+ZU5w6Kt4xkCtp6lmZvMAN7zuTVXEUcamA7/qgIKafTWUEp0OBr0u4/ZUC/6iq3xSR7wM3isjFwC
PA2+P+3wDOBbYDe4H3zHmu+4VKHOpfWjNXC7wkgY2XtvfQbtRy5yZDNb8cemImKhF5AXiw2/mYomXAU93OxBT0Sz6hf/LaL/mE8fP6MlWdUoVer4z5+OBUp87qNhEZ7Ie89ks+oX/y2i/5hNnn1aqljDElFhiMMSW9Ehg2dDsD09Avee2XfEL/5LVf8gmzzGtPVD4aY3pLr5QYjDE9pOuBQUTOFpEHYzftKyd/RVvzcr2I7BKR+wtpPdm9XEQOF5Fvi8iPROSHInJ5L+ZXRAZEZJuI3BPz+ZcxfZWI3Bnz8xURqcb0WlzfHrcf2Yl8FvKbiMgPROSmHs9ne4dCUNWuPYAEeBg4CqgC9wDHdzE/pwEnAvcX0v4ncGVcvhL4eFw+F/i/hJZDa4A7O5zXw4AT4/IBwEPA8b2W33i8xXG5AtwZj38j8M6Yfg1wSVy+FLgmLr8T+EqH/18/BPwjcFNc79V87qgn+38AAAIxSURBVACWjUmbs8++Y3/IBH/c64FvFdY/Anyky3k6ckxgeBA4LC4fRmhzAXAtcOF4+3Up35uAs3o5v8BC4D+AkwmNb9Kx3wPgW8Dr43Ia95MO5W8lYWyRNwI3xROp5/IZjzleYJizz77blxJT6qLdZbPqXt4JsRj7WsKvcc/lNxbP7yZ0tLuZUEp8VlWzcfLSzGfc/hxwcCfyCfwd8GFaHdUP7tF8QhuGQijqlZaPfUFVVaS3pnUSkcXAPwFXqOrzUhjSrVfyq6o5cIKILCH0zn1Fl7NUIiJvAXap6l0icka38zMFp6rqThE5BLhZRH5c3Djbz77bJYZ+6KLds93LRaRCCApfVNV/jsk9m19VfRb4NqFIvkREGj9Mxbw08xm3Hwg83YHsnQK8VUR2AF8mXE78fQ/mE2j/UAjdDgzfB46JNb9VQiXO5i7naaye7F4uoWhwHfCAqv5tr+ZXRJbHkgIisoBQD/IAIUBcMEE+G/m/ALhN44VxO6nqR1R1paoeSfge3qaq7+q1fEKHhkLoVGXJfipRziXUqD8M/Lcu5+VLwBNAnXAddjHhuvFW4CfALcBBcV8BPh3zfR+wusN5PZVwnXkvcHd8nNtr+QVeDfwg5vN+4M9i+lHANkL3/K8CtZg+ENe3x+1HdeF7cAatuxI9l8+Yp3vi44eN82YuP3tr+WiMKen2pYQxpgdZYDDGlFhgMMaUWGAwxpRYYDDGlFhgMMaUWGAwxpRYYDDGlPx/iALj2+UtxycAAAAASUVORK5CYII=\n" 177 | }, 178 | "metadata": { 179 | "needs_background": "light" 180 | }, 181 | "output_type": "display_data" 182 | } 183 | ], 184 | "source": [ 185 | "import torch\n", 186 | "from torch.backends import cudnn\n", 187 | "\n", 188 | "from backbone import EfficientDetBackbone\n", 189 | "import cv2\n", 190 | "import matplotlib.pyplot as plt\n", 191 | "import numpy as np\n", 192 | "\n", 193 | "from efficientdet.utils import BBoxTransform, ClipBoxes\n", 194 | "from utils.utils import preprocess, invert_affine, postprocess\n", 195 | "\n", 196 | "compound_coef = 0\n", 197 | "force_input_size = None # set None to use default size\n", 198 | "img_path = 'datasets/shape/val/999.jpg'\n", 199 | "\n", 200 | "threshold = 0.2\n", 201 | "iou_threshold = 0.2\n", 202 | "\n", 203 | "use_cuda = True\n", 204 | "use_float16 = False\n", 205 | "cudnn.fastest = True\n", 206 | "cudnn.benchmark = True\n", 207 | "\n", 208 | "obj_list = ['rectangle', 'circle']\n", 209 | "\n", 210 | "# tf bilinear interpolation is different from any other's, just make do\n", 211 | "input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536]\n", 212 | "input_size = input_sizes[compound_coef] if force_input_size is None else force_input_size\n", 213 | "ori_imgs, framed_imgs, framed_metas = preprocess(img_path, max_size=input_size)\n", 214 | "\n", 215 | "if use_cuda:\n", 216 | " x = torch.stack([torch.from_numpy(fi).cuda() for fi in framed_imgs], 0)\n", 217 | "else:\n", 218 | " x = torch.stack([torch.from_numpy(fi) for fi in framed_imgs], 0)\n", 219 | "\n", 220 | "x = x.to(torch.float32 if not use_float16 else torch.float16).permute(0, 3, 1, 2)\n", 221 | "\n", 222 | "model = EfficientDetBackbone(compound_coef=compound_coef, num_classes=len(obj_list),\n", 223 | "\n", 224 | " # replace this part with your project's anchor config\n", 225 | " ratios=[(1.0, 1.0), (1.4, 0.7), (0.7, 1.4)],\n", 226 | " scales=[2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])\n", 227 | "\n", 228 | "model.load_state_dict(torch.load('logs/shape/efficientdet-d0_49_1400.pth'))\n", 229 | "model.requires_grad_(False)\n", 230 | "model.eval()\n", 231 | "\n", 232 | "if use_cuda:\n", 233 | " model = model.cuda()\n", 234 | "if use_float16:\n", 235 | " model = model.half()\n", 236 | "\n", 237 | "with torch.no_grad():\n", 238 | " features, 
regression, classification, anchors = model(x)\n", 239 | "\n", 240 | " regressBoxes = BBoxTransform()\n", 241 | " clipBoxes = ClipBoxes()\n", 242 | "\n", 243 | " out = postprocess(x,\n", 244 | " anchors, regression, classification,\n", 245 | " regressBoxes, clipBoxes,\n", 246 | " threshold, iou_threshold)\n", 247 | "\n", 248 | "out = invert_affine(framed_metas, out)\n", 249 | "\n", 250 | "for i in range(len(ori_imgs)):\n", 251 | " if len(out[i]['rois']) == 0:\n", 252 | " continue\n", 253 | "\n", 254 | " for j in range(len(out[i]['rois'])):\n", 255 | " (x1, y1, x2, y2) = out[i]['rois'][j].astype(np.int)\n", 256 | " cv2.rectangle(ori_imgs[i], (x1, y1), (x2, y2), (255, 255, 0), 2)\n", 257 | " obj = obj_list[out[i]['class_ids'][j]]\n", 258 | " score = float(out[i]['scores'][j])\n", 259 | "\n", 260 | " cv2.putText(ori_imgs[i], '{}, {:.3f}'.format(obj, score),\n", 261 | " (x1, y1 + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5,\n", 262 | " (255, 255, 0), 1)\n", 263 | "\n", 264 | " plt.imshow(ori_imgs[i])\n", 265 | "\n" 266 | ], 267 | "metadata": { 268 | "collapsed": false, 269 | "pycharm": { 270 | "name": "#%%\n" 271 | } 272 | } 273 | } 274 | ], 275 | "metadata": { 276 | "kernelspec": { 277 | "display_name": "Python 3", 278 | "language": "python", 279 | "name": "python3" 280 | }, 281 | "language_info": { 282 | "codemirror_mode": { 283 | "name": "ipython", 284 | "version": 2 285 | }, 286 | "file_extension": ".py", 287 | "mimetype": "text/x-python", 288 | "name": "python", 289 | "nbconvert_exporter": "python", 290 | "pygments_lexer": "ipython2", 291 | "version": "2.7.6" 292 | } 293 | }, 294 | "nbformat": 4, 295 | "nbformat_minor": 0 296 | } -------------------------------------------------------------------------------- /utils/sync_batchnorm/__init__.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : __init__.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | from .batchnorm import SynchronizedBatchNorm1d, SynchronizedBatchNorm2d, SynchronizedBatchNorm3d 12 | from .batchnorm import patch_sync_batchnorm, convert_model 13 | from .replicate import DataParallelWithCallback, patch_replication_callback 14 | -------------------------------------------------------------------------------- /utils/sync_batchnorm/batchnorm.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : batchnorm.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 
10 | 11 | import collections 12 | import contextlib 13 | 14 | import torch 15 | import torch.nn.functional as F 16 | 17 | from torch.nn.modules.batchnorm import _BatchNorm 18 | 19 | try: 20 | from torch.nn.parallel._functions import ReduceAddCoalesced, Broadcast 21 | except ImportError: 22 | ReduceAddCoalesced = Broadcast = None 23 | 24 | try: 25 | from jactorch.parallel.comm import SyncMaster 26 | from jactorch.parallel.data_parallel import JacDataParallel as DataParallelWithCallback 27 | except ImportError: 28 | from .comm import SyncMaster 29 | from .replicate import DataParallelWithCallback 30 | 31 | __all__ = [ 32 | 'SynchronizedBatchNorm1d', 'SynchronizedBatchNorm2d', 'SynchronizedBatchNorm3d', 33 | 'patch_sync_batchnorm', 'convert_model' 34 | ] 35 | 36 | 37 | def _sum_ft(tensor): 38 | """sum over the first and last dimention""" 39 | return tensor.sum(dim=0).sum(dim=-1) 40 | 41 | 42 | def _unsqueeze_ft(tensor): 43 | """add new dimensions at the front and the tail""" 44 | return tensor.unsqueeze(0).unsqueeze(-1) 45 | 46 | 47 | _ChildMessage = collections.namedtuple('_ChildMessage', ['sum', 'ssum', 'sum_size']) 48 | _MasterMessage = collections.namedtuple('_MasterMessage', ['sum', 'inv_std']) 49 | 50 | 51 | class _SynchronizedBatchNorm(_BatchNorm): 52 | def __init__(self, num_features, eps=1e-5, momentum=0.1, affine=True): 53 | assert ReduceAddCoalesced is not None, 'Can not use Synchronized Batch Normalization without CUDA support.' 54 | 55 | super(_SynchronizedBatchNorm, self).__init__(num_features, eps=eps, momentum=momentum, affine=affine) 56 | 57 | self._sync_master = SyncMaster(self._data_parallel_master) 58 | 59 | self._is_parallel = False 60 | self._parallel_id = None 61 | self._slave_pipe = None 62 | 63 | def forward(self, input): 64 | # If it is not parallel computation or is in evaluation mode, use PyTorch's implementation. 65 | if not (self._is_parallel and self.training): 66 | return F.batch_norm( 67 | input, self.running_mean, self.running_var, self.weight, self.bias, 68 | self.training, self.momentum, self.eps) 69 | 70 | # Resize the input to (B, C, -1). 71 | input_shape = input.size() 72 | input = input.view(input.size(0), self.num_features, -1) 73 | 74 | # Compute the sum and square-sum. 75 | sum_size = input.size(0) * input.size(2) 76 | input_sum = _sum_ft(input) 77 | input_ssum = _sum_ft(input ** 2) 78 | 79 | # Reduce-and-broadcast the statistics. 80 | if self._parallel_id == 0: 81 | mean, inv_std = self._sync_master.run_master(_ChildMessage(input_sum, input_ssum, sum_size)) 82 | else: 83 | mean, inv_std = self._slave_pipe.run_slave(_ChildMessage(input_sum, input_ssum, sum_size)) 84 | 85 | # Compute the output. 86 | if self.affine: 87 | # MJY:: Fuse the multiplication for speed. 88 | output = (input - _unsqueeze_ft(mean)) * _unsqueeze_ft(inv_std * self.weight) + _unsqueeze_ft(self.bias) 89 | else: 90 | output = (input - _unsqueeze_ft(mean)) * _unsqueeze_ft(inv_std) 91 | 92 | # Reshape it. 93 | return output.view(input_shape) 94 | 95 | def __data_parallel_replicate__(self, ctx, copy_id): 96 | self._is_parallel = True 97 | self._parallel_id = copy_id 98 | 99 | # parallel_id == 0 means master device. 
100 | if self._parallel_id == 0: 101 | ctx.sync_master = self._sync_master 102 | else: 103 | self._slave_pipe = ctx.sync_master.register_slave(copy_id) 104 | 105 | def _data_parallel_master(self, intermediates): 106 | """Reduce the sum and square-sum, compute the statistics, and broadcast it.""" 107 | 108 | # Always using same "device order" makes the ReduceAdd operation faster. 109 | # Thanks to:: Tete Xiao (http://tetexiao.com/) 110 | intermediates = sorted(intermediates, key=lambda i: i[1].sum.get_device()) 111 | 112 | to_reduce = [i[1][:2] for i in intermediates] 113 | to_reduce = [j for i in to_reduce for j in i] # flatten 114 | target_gpus = [i[1].sum.get_device() for i in intermediates] 115 | 116 | sum_size = sum([i[1].sum_size for i in intermediates]) 117 | sum_, ssum = ReduceAddCoalesced.apply(target_gpus[0], 2, *to_reduce) 118 | mean, inv_std = self._compute_mean_std(sum_, ssum, sum_size) 119 | 120 | broadcasted = Broadcast.apply(target_gpus, mean, inv_std) 121 | 122 | outputs = [] 123 | for i, rec in enumerate(intermediates): 124 | outputs.append((rec[0], _MasterMessage(*broadcasted[i*2:i*2+2]))) 125 | 126 | return outputs 127 | 128 | def _compute_mean_std(self, sum_, ssum, size): 129 | """Compute the mean and standard-deviation with sum and square-sum. This method 130 | also maintains the moving average on the master device.""" 131 | assert size > 1, 'BatchNorm computes unbiased standard-deviation, which requires size > 1.' 132 | mean = sum_ / size 133 | sumvar = ssum - sum_ * mean 134 | unbias_var = sumvar / (size - 1) 135 | bias_var = sumvar / size 136 | 137 | if hasattr(torch, 'no_grad'): 138 | with torch.no_grad(): 139 | self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean.data 140 | self.running_var = (1 - self.momentum) * self.running_var + self.momentum * unbias_var.data 141 | else: 142 | self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean.data 143 | self.running_var = (1 - self.momentum) * self.running_var + self.momentum * unbias_var.data 144 | 145 | return mean, bias_var.clamp(self.eps) ** -0.5 146 | 147 | 148 | class SynchronizedBatchNorm1d(_SynchronizedBatchNorm): 149 | r"""Applies Synchronized Batch Normalization over a 2d or 3d input that is seen as a 150 | mini-batch. 151 | 152 | .. math:: 153 | 154 | y = \frac{x - mean[x]}{ \sqrt{Var[x] + \epsilon}} * gamma + beta 155 | 156 | This module differs from the built-in PyTorch BatchNorm1d as the mean and 157 | standard-deviation are reduced across all devices during training. 158 | 159 | For example, when one uses `nn.DataParallel` to wrap the network during 160 | training, PyTorch's implementation normalize the tensor on each device using 161 | the statistics only on that device, which accelerated the computation and 162 | is also easy to implement, but the statistics might be inaccurate. 163 | Instead, in this synchronized version, the statistics will be computed 164 | over all training samples distributed on multiple devices. 165 | 166 | Note that, for one-GPU or CPU-only case, this module behaves exactly same 167 | as the built-in PyTorch implementation. 168 | 169 | The mean and standard-deviation are calculated per-dimension over 170 | the mini-batches and gamma and beta are learnable parameter vectors 171 | of size C (where C is the input size). 172 | 173 | During training, this layer keeps a running estimate of its computed mean 174 | and variance. The running sum is kept with a default momentum of 0.1. 
175 | 176 | During evaluation, this running mean/variance is used for normalization. 177 | 178 | Because the BatchNorm is done over the `C` dimension, computing statistics 179 | on `(N, L)` slices, it's common terminology to call this Temporal BatchNorm 180 | 181 | Args: 182 | num_features: num_features from an expected input of size 183 | `batch_size x num_features [x width]` 184 | eps: a value added to the denominator for numerical stability. 185 | Default: 1e-5 186 | momentum: the value used for the running_mean and running_var 187 | computation. Default: 0.1 188 | affine: a boolean value that when set to ``True``, gives the layer learnable 189 | affine parameters. Default: ``True`` 190 | 191 | Shape:: 192 | - Input: :math:`(N, C)` or :math:`(N, C, L)` 193 | - Output: :math:`(N, C)` or :math:`(N, C, L)` (same shape as input) 194 | 195 | Examples: 196 | >>> # With Learnable Parameters 197 | >>> m = SynchronizedBatchNorm1d(100) 198 | >>> # Without Learnable Parameters 199 | >>> m = SynchronizedBatchNorm1d(100, affine=False) 200 | >>> input = torch.autograd.Variable(torch.randn(20, 100)) 201 | >>> output = m(input) 202 | """ 203 | 204 | def _check_input_dim(self, input): 205 | if input.dim() != 2 and input.dim() != 3: 206 | raise ValueError('expected 2D or 3D input (got {}D input)' 207 | .format(input.dim())) 208 | super(SynchronizedBatchNorm1d, self)._check_input_dim(input) 209 | 210 | 211 | class SynchronizedBatchNorm2d(_SynchronizedBatchNorm): 212 | r"""Applies Batch Normalization over a 4d input that is seen as a mini-batch 213 | of 3d inputs 214 | 215 | .. math:: 216 | 217 | y = \frac{x - mean[x]}{ \sqrt{Var[x] + \epsilon}} * gamma + beta 218 | 219 | This module differs from the built-in PyTorch BatchNorm2d as the mean and 220 | standard-deviation are reduced across all devices during training. 221 | 222 | For example, when one uses `nn.DataParallel` to wrap the network during 223 | training, PyTorch's implementation normalize the tensor on each device using 224 | the statistics only on that device, which accelerated the computation and 225 | is also easy to implement, but the statistics might be inaccurate. 226 | Instead, in this synchronized version, the statistics will be computed 227 | over all training samples distributed on multiple devices. 228 | 229 | Note that, for one-GPU or CPU-only case, this module behaves exactly same 230 | as the built-in PyTorch implementation. 231 | 232 | The mean and standard-deviation are calculated per-dimension over 233 | the mini-batches and gamma and beta are learnable parameter vectors 234 | of size C (where C is the input size). 235 | 236 | During training, this layer keeps a running estimate of its computed mean 237 | and variance. The running sum is kept with a default momentum of 0.1. 238 | 239 | During evaluation, this running mean/variance is used for normalization. 240 | 241 | Because the BatchNorm is done over the `C` dimension, computing statistics 242 | on `(N, H, W)` slices, it's common terminology to call this Spatial BatchNorm 243 | 244 | Args: 245 | num_features: num_features from an expected input of 246 | size batch_size x num_features x height x width 247 | eps: a value added to the denominator for numerical stability. 248 | Default: 1e-5 249 | momentum: the value used for the running_mean and running_var 250 | computation. Default: 0.1 251 | affine: a boolean value that when set to ``True``, gives the layer learnable 252 | affine parameters. 
Default: ``True`` 253 | 254 | Shape:: 255 | - Input: :math:`(N, C, H, W)` 256 | - Output: :math:`(N, C, H, W)` (same shape as input) 257 | 258 | Examples: 259 | >>> # With Learnable Parameters 260 | >>> m = SynchronizedBatchNorm2d(100) 261 | >>> # Without Learnable Parameters 262 | >>> m = SynchronizedBatchNorm2d(100, affine=False) 263 | >>> input = torch.autograd.Variable(torch.randn(20, 100, 35, 45)) 264 | >>> output = m(input) 265 | """ 266 | 267 | def _check_input_dim(self, input): 268 | if input.dim() != 4: 269 | raise ValueError('expected 4D input (got {}D input)' 270 | .format(input.dim())) 271 | super(SynchronizedBatchNorm2d, self)._check_input_dim(input) 272 | 273 | 274 | class SynchronizedBatchNorm3d(_SynchronizedBatchNorm): 275 | r"""Applies Batch Normalization over a 5d input that is seen as a mini-batch 276 | of 4d inputs 277 | 278 | .. math:: 279 | 280 | y = \frac{x - mean[x]}{ \sqrt{Var[x] + \epsilon}} * gamma + beta 281 | 282 | This module differs from the built-in PyTorch BatchNorm3d as the mean and 283 | standard-deviation are reduced across all devices during training. 284 | 285 | For example, when one uses `nn.DataParallel` to wrap the network during 286 | training, PyTorch's implementation normalize the tensor on each device using 287 | the statistics only on that device, which accelerated the computation and 288 | is also easy to implement, but the statistics might be inaccurate. 289 | Instead, in this synchronized version, the statistics will be computed 290 | over all training samples distributed on multiple devices. 291 | 292 | Note that, for one-GPU or CPU-only case, this module behaves exactly same 293 | as the built-in PyTorch implementation. 294 | 295 | The mean and standard-deviation are calculated per-dimension over 296 | the mini-batches and gamma and beta are learnable parameter vectors 297 | of size C (where C is the input size). 298 | 299 | During training, this layer keeps a running estimate of its computed mean 300 | and variance. The running sum is kept with a default momentum of 0.1. 301 | 302 | During evaluation, this running mean/variance is used for normalization. 303 | 304 | Because the BatchNorm is done over the `C` dimension, computing statistics 305 | on `(N, D, H, W)` slices, it's common terminology to call this Volumetric BatchNorm 306 | or Spatio-temporal BatchNorm 307 | 308 | Args: 309 | num_features: num_features from an expected input of 310 | size batch_size x num_features x depth x height x width 311 | eps: a value added to the denominator for numerical stability. 312 | Default: 1e-5 313 | momentum: the value used for the running_mean and running_var 314 | computation. Default: 0.1 315 | affine: a boolean value that when set to ``True``, gives the layer learnable 316 | affine parameters. 
Default: ``True`` 317 | 318 | Shape:: 319 | - Input: :math:`(N, C, D, H, W)` 320 | - Output: :math:`(N, C, D, H, W)` (same shape as input) 321 | 322 | Examples: 323 | >>> # With Learnable Parameters 324 | >>> m = SynchronizedBatchNorm3d(100) 325 | >>> # Without Learnable Parameters 326 | >>> m = SynchronizedBatchNorm3d(100, affine=False) 327 | >>> input = torch.autograd.Variable(torch.randn(20, 100, 35, 45, 10)) 328 | >>> output = m(input) 329 | """ 330 | 331 | def _check_input_dim(self, input): 332 | if input.dim() != 5: 333 | raise ValueError('expected 5D input (got {}D input)' 334 | .format(input.dim())) 335 | super(SynchronizedBatchNorm3d, self)._check_input_dim(input) 336 | 337 | 338 | @contextlib.contextmanager 339 | def patch_sync_batchnorm(): 340 | import torch.nn as nn 341 | 342 | backup = nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d 343 | 344 | nn.BatchNorm1d = SynchronizedBatchNorm1d 345 | nn.BatchNorm2d = SynchronizedBatchNorm2d 346 | nn.BatchNorm3d = SynchronizedBatchNorm3d 347 | 348 | yield 349 | 350 | nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d = backup 351 | 352 | 353 | def convert_model(module): 354 | """Traverse the input module and its child recursively 355 | and replace all instance of torch.nn.modules.batchnorm.BatchNorm*N*d 356 | to SynchronizedBatchNorm*N*d 357 | 358 | Args: 359 | module: the input module needs to be convert to SyncBN model 360 | 361 | Examples: 362 | >>> import torch.nn as nn 363 | >>> import torchvision 364 | >>> # m is a standard pytorch model 365 | >>> m = torchvision.models.resnet18(True) 366 | >>> m = nn.DataParallel(m) 367 | >>> # after convert, m is using SyncBN 368 | >>> m = convert_model(m) 369 | """ 370 | if isinstance(module, torch.nn.DataParallel): 371 | mod = module.module 372 | mod = convert_model(mod) 373 | mod = DataParallelWithCallback(mod, device_ids=module.device_ids) 374 | return mod 375 | 376 | mod = module 377 | for pth_module, sync_module in zip([torch.nn.modules.batchnorm.BatchNorm1d, 378 | torch.nn.modules.batchnorm.BatchNorm2d, 379 | torch.nn.modules.batchnorm.BatchNorm3d], 380 | [SynchronizedBatchNorm1d, 381 | SynchronizedBatchNorm2d, 382 | SynchronizedBatchNorm3d]): 383 | if isinstance(module, pth_module): 384 | mod = sync_module(module.num_features, module.eps, module.momentum, module.affine) 385 | mod.running_mean = module.running_mean 386 | mod.running_var = module.running_var 387 | if module.affine: 388 | mod.weight.data = module.weight.data.clone().detach() 389 | mod.bias.data = module.bias.data.clone().detach() 390 | 391 | for name, child in module.named_children(): 392 | mod.add_module(name, convert_model(child)) 393 | 394 | return mod 395 | -------------------------------------------------------------------------------- /utils/sync_batchnorm/batchnorm_reimpl.py: -------------------------------------------------------------------------------- 1 | #! /usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | # File : batchnorm_reimpl.py 4 | # Author : acgtyrant 5 | # Date : 11/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | import torch 12 | import torch.nn as nn 13 | import torch.nn.init as init 14 | 15 | __all__ = ['BatchNorm2dReimpl'] 16 | 17 | 18 | class BatchNorm2dReimpl(nn.Module): 19 | """ 20 | A re-implementation of batch normalization, used for testing the numerical 21 | stability. 
22 | 23 | Author: acgtyrant 24 | See also: 25 | https://github.com/vacancy/Synchronized-BatchNorm-PyTorch/issues/14 26 | """ 27 | def __init__(self, num_features, eps=1e-5, momentum=0.1): 28 | super().__init__() 29 | 30 | self.num_features = num_features 31 | self.eps = eps 32 | self.momentum = momentum 33 | self.weight = nn.Parameter(torch.empty(num_features)) 34 | self.bias = nn.Parameter(torch.empty(num_features)) 35 | self.register_buffer('running_mean', torch.zeros(num_features)) 36 | self.register_buffer('running_var', torch.ones(num_features)) 37 | self.reset_parameters() 38 | 39 | def reset_running_stats(self): 40 | self.running_mean.zero_() 41 | self.running_var.fill_(1) 42 | 43 | def reset_parameters(self): 44 | self.reset_running_stats() 45 | init.uniform_(self.weight) 46 | init.zeros_(self.bias) 47 | 48 | def forward(self, input_): 49 | batchsize, channels, height, width = input_.size() 50 | numel = batchsize * height * width 51 | input_ = input_.permute(1, 0, 2, 3).contiguous().view(channels, numel) 52 | sum_ = input_.sum(1) 53 | sum_of_square = input_.pow(2).sum(1) 54 | mean = sum_ / numel 55 | sumvar = sum_of_square - sum_ * mean 56 | 57 | self.running_mean = ( 58 | (1 - self.momentum) * self.running_mean 59 | + self.momentum * mean.detach() 60 | ) 61 | unbias_var = sumvar / (numel - 1) 62 | self.running_var = ( 63 | (1 - self.momentum) * self.running_var 64 | + self.momentum * unbias_var.detach() 65 | ) 66 | 67 | bias_var = sumvar / numel 68 | inv_std = 1 / (bias_var + self.eps).pow(0.5) 69 | output = ( 70 | (input_ - mean.unsqueeze(1)) * inv_std.unsqueeze(1) * 71 | self.weight.unsqueeze(1) + self.bias.unsqueeze(1)) 72 | 73 | return output.view(channels, batchsize, height, width).permute(1, 0, 2, 3).contiguous() 74 | 75 | -------------------------------------------------------------------------------- /utils/sync_batchnorm/comm.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : comm.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 10 | 11 | import queue 12 | import collections 13 | import threading 14 | 15 | __all__ = ['FutureResult', 'SlavePipe', 'SyncMaster'] 16 | 17 | 18 | class FutureResult(object): 19 | """A thread-safe future implementation. Used only as one-to-one pipe.""" 20 | 21 | def __init__(self): 22 | self._result = None 23 | self._lock = threading.Lock() 24 | self._cond = threading.Condition(self._lock) 25 | 26 | def put(self, result): 27 | with self._lock: 28 | assert self._result is None, 'Previous result has\'t been fetched.' 29 | self._result = result 30 | self._cond.notify() 31 | 32 | def get(self): 33 | with self._lock: 34 | if self._result is None: 35 | self._cond.wait() 36 | 37 | res = self._result 38 | self._result = None 39 | return res 40 | 41 | 42 | _MasterRegistry = collections.namedtuple('MasterRegistry', ['result']) 43 | _SlavePipeBase = collections.namedtuple('_SlavePipeBase', ['identifier', 'queue', 'result']) 44 | 45 | 46 | class SlavePipe(_SlavePipeBase): 47 | """Pipe for master-slave communication.""" 48 | 49 | def run_slave(self, msg): 50 | self.queue.put((self.identifier, msg)) 51 | ret = self.result.get() 52 | self.queue.put(True) 53 | return ret 54 | 55 | 56 | class SyncMaster(object): 57 | """An abstract `SyncMaster` object. 
58 | 59 | - During the replication, as the data parallel will trigger an callback of each module, all slave devices should 60 | call `register(id)` and obtain an `SlavePipe` to communicate with the master. 61 | - During the forward pass, master device invokes `run_master`, all messages from slave devices will be collected, 62 | and passed to a registered callback. 63 | - After receiving the messages, the master device should gather the information and determine to message passed 64 | back to each slave devices. 65 | """ 66 | 67 | def __init__(self, master_callback): 68 | """ 69 | 70 | Args: 71 | master_callback: a callback to be invoked after having collected messages from slave devices. 72 | """ 73 | self._master_callback = master_callback 74 | self._queue = queue.Queue() 75 | self._registry = collections.OrderedDict() 76 | self._activated = False 77 | 78 | def __getstate__(self): 79 | return {'master_callback': self._master_callback} 80 | 81 | def __setstate__(self, state): 82 | self.__init__(state['master_callback']) 83 | 84 | def register_slave(self, identifier): 85 | """ 86 | Register an slave device. 87 | 88 | Args: 89 | identifier: an identifier, usually is the device id. 90 | 91 | Returns: a `SlavePipe` object which can be used to communicate with the master device. 92 | 93 | """ 94 | if self._activated: 95 | assert self._queue.empty(), 'Queue is not clean before next initialization.' 96 | self._activated = False 97 | self._registry.clear() 98 | future = FutureResult() 99 | self._registry[identifier] = _MasterRegistry(future) 100 | return SlavePipe(identifier, self._queue, future) 101 | 102 | def run_master(self, master_msg): 103 | """ 104 | Main entry for the master device in each forward pass. 105 | The messages were first collected from each devices (including the master device), and then 106 | an callback will be invoked to compute the message to be sent back to each devices 107 | (including the master device). 108 | 109 | Args: 110 | master_msg: the message that the master want to send to itself. This will be placed as the first 111 | message when calling `master_callback`. For detailed usage, see `_SynchronizedBatchNorm` for an example. 112 | 113 | Returns: the message to be sent back to the master device. 114 | 115 | """ 116 | self._activated = True 117 | 118 | intermediates = [(0, master_msg)] 119 | for i in range(self.nr_slaves): 120 | intermediates.append(self._queue.get()) 121 | 122 | results = self._master_callback(intermediates) 123 | assert results[0][0] == 0, 'The first result should belongs to the master.' 124 | 125 | for i, res in results: 126 | if i == 0: 127 | continue 128 | self._registry[i].result.put(res) 129 | 130 | for i in range(self.nr_slaves): 131 | assert self._queue.get() is True 132 | 133 | return results[0][1] 134 | 135 | @property 136 | def nr_slaves(self): 137 | return len(self._registry) 138 | -------------------------------------------------------------------------------- /utils/sync_batchnorm/replicate.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : replicate.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch 9 | # Distributed under MIT License. 
10 | 11 | import functools 12 | 13 | from torch.nn.parallel.data_parallel import DataParallel 14 | 15 | __all__ = [ 16 | 'CallbackContext', 17 | 'execute_replication_callbacks', 18 | 'DataParallelWithCallback', 19 | 'patch_replication_callback' 20 | ] 21 | 22 | 23 | class CallbackContext(object): 24 | pass 25 | 26 | 27 | def execute_replication_callbacks(modules): 28 | """ 29 | Execute an replication callback `__data_parallel_replicate__` on each module created by original replication. 30 | 31 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 32 | 33 | Note that, as all modules are isomorphism, we assign each sub-module with a context 34 | (shared among multiple copies of this module on different devices). 35 | Through this context, different copies can share some information. 36 | 37 | We guarantee that the callback on the master copy (the first copy) will be called ahead of calling the callback 38 | of any slave copies. 39 | """ 40 | master_copy = modules[0] 41 | nr_modules = len(list(master_copy.modules())) 42 | ctxs = [CallbackContext() for _ in range(nr_modules)] 43 | 44 | for i, module in enumerate(modules): 45 | for j, m in enumerate(module.modules()): 46 | if hasattr(m, '__data_parallel_replicate__'): 47 | m.__data_parallel_replicate__(ctxs[j], i) 48 | 49 | 50 | class DataParallelWithCallback(DataParallel): 51 | """ 52 | Data Parallel with a replication callback. 53 | 54 | An replication callback `__data_parallel_replicate__` of each module will be invoked after being created by 55 | original `replicate` function. 56 | The callback will be invoked with arguments `__data_parallel_replicate__(ctx, copy_id)` 57 | 58 | Examples: 59 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 60 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 61 | # sync_bn.__data_parallel_replicate__ will be invoked. 62 | """ 63 | 64 | def replicate(self, module, device_ids): 65 | modules = super(DataParallelWithCallback, self).replicate(module, device_ids) 66 | execute_replication_callbacks(modules) 67 | return modules 68 | 69 | 70 | def patch_replication_callback(data_parallel): 71 | """ 72 | Monkey-patch an existing `DataParallel` object. Add the replication callback. 73 | Useful when you have customized `DataParallel` implementation. 74 | 75 | Examples: 76 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 77 | > sync_bn = DataParallel(sync_bn, device_ids=[0, 1]) 78 | > patch_replication_callback(sync_bn) 79 | # this is equivalent to 80 | > sync_bn = SynchronizedBatchNorm1d(10, eps=1e-5, affine=False) 81 | > sync_bn = DataParallelWithCallback(sync_bn, device_ids=[0, 1]) 82 | """ 83 | 84 | assert isinstance(data_parallel, DataParallel) 85 | 86 | old_replicate = data_parallel.replicate 87 | 88 | @functools.wraps(old_replicate) 89 | def new_replicate(module, device_ids): 90 | modules = old_replicate(module, device_ids) 91 | execute_replication_callbacks(modules) 92 | return modules 93 | 94 | data_parallel.replicate = new_replicate 95 | -------------------------------------------------------------------------------- /utils/sync_batchnorm/unittest.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # File : unittest.py 3 | # Author : Jiayuan Mao 4 | # Email : maojiayuan@gmail.com 5 | # Date : 27/01/2018 6 | # 7 | # This file is part of Synchronized-BatchNorm-PyTorch. 
8 | # https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
9 | # Distributed under MIT License.
10 | 
11 | import unittest
12 | import torch
13 | 
14 | 
15 | class TorchTestCase(unittest.TestCase):
16 |     def assertTensorClose(self, x, y):
17 |         adiff = float((x - y).abs().max())
18 |         if (y == 0).all():
19 |             rdiff = 'NaN'
20 |         else:
21 |             rdiff = float((adiff / y).abs().max())
22 | 
23 |         message = (
24 |             'Tensor close check failed\n'
25 |             'adiff={}\n'
26 |             'rdiff={}\n'
27 |         ).format(adiff, rdiff)
28 |         self.assertTrue(torch.allclose(x, y), message)
29 | 
30 | 
--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
1 | # Author: Zylo117
2 | 
3 | import os
4 | 
5 | import cv2
6 | import numpy as np
7 | import torch
8 | from glob import glob
9 | from torch import nn
10 | from torchvision.ops import nms
11 | from typing import Union
12 | import uuid
13 | 
14 | from utils.sync_batchnorm import SynchronizedBatchNorm2d
15 | 
16 | 
17 | def invert_affine(metas: Union[float, list, tuple], preds):
18 |     for i in range(len(preds)):
19 |         if len(preds[i]['rois']) == 0:
20 |             continue
21 |         else:
22 |             if isinstance(metas, float):  # a single scale factor was passed
23 |                 preds[i]['rois'][:, [0, 2]] = preds[i]['rois'][:, [0, 2]] / metas
24 |                 preds[i]['rois'][:, [1, 3]] = preds[i]['rois'][:, [1, 3]] / metas
25 |             else:
26 |                 new_w, new_h, old_w, old_h, padding_w, padding_h = metas[i]
27 |                 preds[i]['rois'][:, [0, 2]] = preds[i]['rois'][:, [0, 2]] / (new_w / old_w)
28 |                 preds[i]['rois'][:, [1, 3]] = preds[i]['rois'][:, [1, 3]] / (new_h / old_h)
29 |     return preds
30 | 
31 | 
32 | def aspectaware_resize_padding(image, width, height, interpolation=None, means=None):
33 |     old_h, old_w, c = image.shape
34 |     if old_w > old_h:
35 |         new_w = width
36 |         new_h = int(width / old_w * old_h)
37 |     else:
38 |         new_w = int(height / old_h * old_w)
39 |         new_h = height
40 | 
41 |     canvas = np.zeros((height, width, c), np.float32)  # pad onto a (height, width) canvas
42 |     if means is not None:
43 |         canvas[...] = means
44 | 
45 |     if new_w != old_w or new_h != old_h:
46 |         if interpolation is None:
47 |             image = cv2.resize(image, (new_w, new_h))
48 |         else:
49 |             image = cv2.resize(image, (new_w, new_h), interpolation=interpolation)
50 | 
51 |     padding_h = height - new_h
52 |     padding_w = width - new_w
53 | 
54 |     if c > 1:
55 |         canvas[:new_h, :new_w] = image
56 |     else:
57 |         if len(image.shape) == 2:
58 |             canvas[:new_h, :new_w, 0] = image
59 |         else:
60 |             canvas[:new_h, :new_w] = image
61 | 
62 |     return canvas, new_w, new_h, old_w, old_h, padding_w, padding_h,
63 | 
64 | 
65 | def preprocess(*image_path, max_size=512, mean=(0.406, 0.456, 0.485), std=(0.225, 0.224, 0.229)):
66 |     ori_imgs = [cv2.imread(img_path) for img_path in image_path]
67 |     normalized_imgs = [(img / 255 - mean) / std for img in ori_imgs]
68 |     imgs_meta = [aspectaware_resize_padding(img[..., ::-1], max_size, max_size,
69 |                                             means=None) for img in normalized_imgs]
70 |     framed_imgs = [img_meta[0] for img_meta in imgs_meta]
71 |     framed_metas = [img_meta[1:] for img_meta in imgs_meta]
72 | 
73 |     return ori_imgs, framed_imgs, framed_metas
74 | 
75 | 
76 | def postprocess(x, anchors, regression, classification, regressBoxes, clipBoxes, threshold, iou_threshold):
77 |     transformed_anchors = regressBoxes(anchors, regression)
78 |     transformed_anchors = clipBoxes(transformed_anchors, x)
79 |     scores = torch.max(classification, dim=2, keepdim=True)[0]
80 |     scores_over_thresh = (scores > threshold)[:, :, 0]
81 |     out = []
82 |     for i in range(x.shape[0]):
83 |         if scores_over_thresh[i].sum() == 0:  # no detections for this image
84 |             out.append({
85 |                 'rois': np.array(()),
86 |                 'class_ids': np.array(()),
87 |                 'scores': np.array(()),
88 |             })
89 |             continue  # keep exactly one output entry per image
90 |         classification_per = classification[i, scores_over_thresh[i, :], ...].permute(1, 0)
91 |         transformed_anchors_per = transformed_anchors[i, scores_over_thresh[i, :], ...]
92 |         scores_per = scores[i, scores_over_thresh[i, :], ...]
93 | anchors_nms_idx = nms(transformed_anchors_per, scores_per[:, 0], iou_threshold=iou_threshold) 94 | 95 | if anchors_nms_idx.shape[0] != 0: 96 | scores_, classes_ = classification_per[:, anchors_nms_idx].max(dim=0) 97 | boxes_ = transformed_anchors_per[anchors_nms_idx, :] 98 | 99 | out.append({ 100 | 'rois': boxes_.cpu().numpy(), 101 | 'class_ids': classes_.cpu().numpy(), 102 | 'scores': scores_.cpu().numpy(), 103 | }) 104 | else: 105 | out.append({ 106 | 'rois': np.array(()), 107 | 'class_ids': np.array(()), 108 | 'scores': np.array(()), 109 | }) 110 | 111 | return out 112 | 113 | 114 | def display(preds, imgs, obj_list, imshow=True, imwrite=False): 115 | for i in range(len(imgs)): 116 | if len(preds[i]['rois']) == 0: 117 | continue 118 | 119 | for j in range(len(preds[i]['rois'])): 120 | (x1, y1, x2, y2) = preds[i]['rois'][j].astype(np.int) 121 | cv2.rectangle(imgs[i], (x1, y1), (x2, y2), (255, 255, 0), 2) 122 | obj = obj_list[preds[i]['class_ids'][j]] 123 | score = float(preds[i]['scores'][j]) 124 | 125 | cv2.putText(imgs[i], '{}, {:.3f}'.format(obj, score), 126 | (x1, y1 + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, 127 | (255, 255, 0), 1) 128 | if imshow: 129 | cv2.imshow('img', imgs[i]) 130 | cv2.waitKey(0) 131 | 132 | if imwrite: 133 | os.makedirs('test/', exist_ok=True) 134 | cv2.imwrite('test/{}.jpg'.format(uuid.uuid4().hex), imgs[i]) 135 | 136 | 137 | def replace_w_sync_bn(m): 138 | for var_name in dir(m): 139 | target_attr = getattr(m, var_name) 140 | if type(target_attr) == torch.nn.BatchNorm2d: 141 | num_features = target_attr.num_features 142 | eps = target_attr.eps 143 | momentum = target_attr.momentum 144 | affine = target_attr.affine 145 | 146 | # get parameters 147 | running_mean = target_attr.running_mean 148 | running_var = target_attr.running_var 149 | if affine: 150 | weight = target_attr.weight 151 | bias = target_attr.bias 152 | 153 | setattr(m, var_name, 154 | SynchronizedBatchNorm2d(num_features, eps, momentum, affine)) 155 | 156 | target_attr = getattr(m, var_name) 157 | # set parameters 158 | target_attr.running_mean = running_mean 159 | target_attr.running_var = running_var 160 | if affine: 161 | target_attr.weight = weight 162 | target_attr.bias = bias 163 | 164 | for var_name, children in m.named_children(): 165 | replace_w_sync_bn(children) 166 | 167 | 168 | class CustomDataParallel(nn.DataParallel): 169 | """ 170 | force splitting data to all gpus instead of sending all data to cuda:0 and then moving around. 171 | """ 172 | 173 | def __init__(self, module, num_gpus): 174 | super().__init__(module) 175 | self.num_gpus = num_gpus 176 | 177 | def scatter(self, inputs, kwargs, device_ids): 178 | # More like scatter and data prep at the same time. The point is we prep the data in such a way 179 | # that no scatter is necessary, and there's no need to shuffle stuff around different GPUs. 
180 | devices = ['cuda:' + str(x) for x in range(self.num_gpus)] 181 | splits = inputs[0].shape[0] // self.num_gpus 182 | 183 | return [(inputs[0][splits * device_idx: splits * (device_idx + 1)].to('cuda:{}'.format(device_idx), non_blocking=True), 184 | inputs[1][splits * device_idx: splits * (device_idx + 1)].to('cuda:{}'.format(device_idx), non_blocking=True)) 185 | for device_idx in range(len(devices))], \ 186 | [kwargs] * len(devices) 187 | 188 | 189 | def get_last_weights(weights_path): 190 | weights_path = glob(weights_path + '/*.pth') 191 | weights_path = sorted(weights_path, 192 | key=lambda x: int(x.rsplit('_')[-1].rsplit('.')[0]), 193 | reverse=True)[0] 194 | print('using weights {}'.format(weights_path)) 195 | return weights_path 196 | 197 | 198 | def init_weights(model): 199 | for name, module in model.named_modules(): 200 | is_conv_layer = isinstance(module, nn.Conv2d) 201 | 202 | if is_conv_layer: 203 | nn.init.kaiming_uniform_(module.weight.data) 204 | 205 | if module.bias is not None: 206 | module.bias.data.zero_() 207 | -------------------------------------------------------------------------------- /voc2coco.py: -------------------------------------------------------------------------------- 1 | # pip install lxml 2 | 3 | import sys 4 | import os 5 | import json 6 | import xml.etree.ElementTree as ET 7 | import cv2 8 | 9 | START_BOUNDING_BOX_ID = 1 10 | PRE_DEFINE_CATEGORIES = {"holothurian":1,"echinus":2,"scallop":3,"starfish":4} 11 | # ["holothurian","echinus","scallop","starfish"] 12 | # If necessary, pre-define category and its id 13 | # PRE_DEFINE_CATEGORIES = {"aeroplane": 1, "bicycle": 2, "bird": 3, "boat": 4, 14 | # "bottle":5, "bus": 6, "car": 7, "cat": 8, "chair": 9, 15 | # "cow": 10, "diningtable": 11, "dog": 12, "horse": 13, 16 | # "motorbike": 14, "person": 15, "pottedplant": 16, 17 | # "sheep": 17, "sofa": 18, "train": 19, "tvmonitor": 20} 18 | 19 | 20 | def get(root, name): 21 | vars = root.findall(name) 22 | return vars 23 | 24 | 25 | def get_and_check(root, name, length): 26 | vars = root.findall(name) 27 | if len(vars) == 0: 28 | raise NotImplementedError('Can not find %s in %s.'%(name, root.tag)) 29 | if length > 0 and len(vars) != length: 30 | raise NotImplementedError('The size of %s is supposed to be %d, but is %d.'%(name, length, len(vars))) 31 | if length == 1: 32 | vars = vars[0] 33 | return vars 34 | 35 | 36 | def get_filename_as_int(filename): 37 | try: 38 | filename = os.path.splitext(filename)[0] 39 | return int(filename) 40 | except: 41 | raise NotImplementedError('Filename %s is supposed to be an integer.'%(filename)) 42 | 43 | 44 | def convert(xml_list, xml_dir, json_file,img_dir): 45 | list_fp = open(xml_list, 'r') 46 | json_dict = {"images":[], "type": "instances", "annotations": [], 47 | "categories": []} 48 | categories = PRE_DEFINE_CATEGORIES 49 | bnd_id = START_BOUNDING_BOX_ID 50 | for line in list_fp: 51 | line_name = line.strip() 52 | line = line_name + ".xml" 53 | print("Processing %s"%(line)) 54 | xml_f = os.path.join(xml_dir, line) 55 | tree = ET.parse(xml_f) 56 | root = tree.getroot() 57 | path = get(root, 'path') 58 | try: 59 | if len(path) == 1: 60 | filename = os.path.basename(path[0].text) 61 | elif len(path) == 0: 62 | filename = get_and_check(root, 'filename', 1).text 63 | except: 64 | filename = line_name + ".jpg" 65 | # raise NotImplementedError('%d paths found in %s'%(len(path), line)) 66 | ## The filename must be a number 67 | image_id = get_filename_as_int(filename) 68 | try: 69 | size = get_and_check(root, 'size', 1) 
70 |             width = int(get_and_check(size, 'width', 1).text)
71 |             height = int(get_and_check(size, 'height', 1).text)
72 |         except:
73 |             img = cv2.imread(img_dir+line_name+".jpg")
74 |             height = img.shape[0]
75 |             width = img.shape[1]
76 | 
77 |         image = {'file_name': filename, 'height': height, 'width': width,
78 |                  'id':image_id}
79 |         json_dict['images'].append(image)
80 |         ## Currently we do not support segmentation
81 |         #  segmented = get_and_check(root, 'segmented', 1).text
82 |         #  assert segmented == '0'
83 |         for obj in get(root, 'object'):
84 |             category = get_and_check(obj, 'name', 1).text
85 |             if category not in categories:
86 |                 # new_id = len(categories)
87 |                 # categories[category] = new_id
88 |                 continue
89 |             category_id = categories[category]
90 |             bndbox = get_and_check(obj, 'bndbox', 1)
91 |             xmin = int(get_and_check(bndbox, 'xmin', 1).text) - 1
92 |             ymin = int(get_and_check(bndbox, 'ymin', 1).text) - 1
93 |             xmax = int(get_and_check(bndbox, 'xmax', 1).text)
94 |             ymax = int(get_and_check(bndbox, 'ymax', 1).text)
95 |             assert(xmax > xmin)
96 |             assert(ymax > ymin)
97 |             o_width = abs(xmax - xmin)
98 |             o_height = abs(ymax - ymin)
99 |             ann = {'area': o_width*o_height, 'iscrowd': 0, 'image_id':
100 |                    image_id, 'bbox':[xmin, ymin, o_width, o_height],
101 |                    'category_id': category_id, 'id': bnd_id, 'ignore': 0,
102 |                    'segmentation': []}
103 |             json_dict['annotations'].append(ann)
104 |             bnd_id = bnd_id + 1
105 | 
106 |     for cate, cid in categories.items():
107 |         cat = {'supercategory': 'none', 'id': cid, 'name': cate}
108 |         json_dict['categories'].append(cat)
109 |     json_fp = open(json_file, 'w')
110 |     json_str = json.dumps(json_dict)
111 |     json_fp.write(json_str)
112 |     json_fp.close()
113 |     list_fp.close()
114 | 
115 | 
116 | if __name__ == '__main__':
117 |     if len(sys.argv) < 5:
118 |         print('4 arguments are needed.')
119 |         print('Usage: %s XML_LIST.txt XML_DIR OUTPUT_JSON.json IMG_DIR'%(sys.argv[0]))
120 |         exit(1)
121 | 
122 |     convert(sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4])
123 | 
124 | # python voc2coco.py xmllist.txt ../Annotations output.json ../JPEGImages
--------------------------------------------------------------------------------
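
A quick way to validate the JSON produced by voc2coco.py before using it for training is to load it with pycocotools. This is a minimal sketch, assuming pycocotools is installed; the 'output.json' path is the illustrative output name from the example command above, not a file shipped with the repo:

# Sanity-check a COCO-format annotation file produced by voc2coco.py.
from pycocotools.coco import COCO

coco = COCO('output.json')  # load and index the converted annotations
img_ids = coco.getImgIds()
cat_names = [c['name'] for c in coco.loadCats(coco.getCatIds())]
print('images: {}, categories: {}'.format(len(img_ids), cat_names))

# Inspect the boxes of the first image; each bbox is [xmin, ymin, width, height].
first = coco.loadImgs(img_ids[:1])[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=[first['id']]))
print(first['file_name'], [ann['bbox'] for ann in anns])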