├── README.md
├── VisDrone2019
│   ├── drone.data
│   └── drone
│       ├── labels.txt
│       ├── train.lists
│       └── val.lists
├── cfg
│   ├── 00-unpruned
│   │   ├── yolov3-spp1.cfg
│   │   ├── yolov3-spp3.cfg
│   │   ├── yolov3-tiny.cfg
│   │   └── yolov3.cfg
│   ├── 1iter-pruned
│   │   ├── prune_0.5.cfg
│   │   ├── prune_0.9.cfg
│   │   └── prune_0.95.cfg
│   ├── 2iter-pruned
│   │   └── prune_0.5_0.5.cfg
│   └── 3iter-pruned
│       └── prune_0.5_0.5_0.7.cfg
├── images
│   └── test.jpg
├── metrics.jpg
├── procedure.jpg
├── prune.py
├── results
│   ├── demo.mp4
│   └── result.jpg
├── sparsity.py
├── table.jpg
└── yolov3
    ├── README.md
    ├── prune.py
    └── sparsity.py

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# SlimYOLOv3: Narrower, Faster and Better for UAV Real-Time Applications

This page accompanies the paper [SlimYOLOv3: Narrower, Faster and Better for UAV Real-Time Applications](http://arxiv.org/abs/1907.11093).

![img](./table.jpg)

![img](./metrics.jpg)

## Abstract

Drones, or general Unmanned Aerial Vehicles (UAVs), endowed with computer vision functions by on-board cameras and embedded systems, have become popular in a wide range of applications. However, real-time scene parsing through object detection running on a UAV platform is very challenging, due to the limited memory and computing power of embedded devices. To deal with these challenges, in this paper we propose to learn efficient deep object detectors by performing channel pruning on convolutional layers. To this end, we enforce channel-level sparsity of convolutional layers by imposing L1 regularization on channel scaling factors and prune less informative feature channels to obtain “slim” object detectors. Based on this approach, we present SlimYOLOv3, which has fewer trainable parameters and floating point operations (FLOPs) than the original YOLOv3 (Joseph Redmon et al., 2018), as a promising solution for real-time object detection on UAVs. We evaluate SlimYOLOv3 on the VisDrone2018-Det benchmark dataset; compelling results are achieved by SlimYOLOv3-SPP3 in comparison with its unpruned counterpart, including a ~90.8% decrease in FLOPs, a ~92.0% decline in parameter size, ~2 times faster inference, and detection accuracy comparable to YOLOv3. Experimental results with different pruning ratios consistently verify that the proposed SlimYOLOv3 with its narrower structure is more efficient, faster and better than YOLOv3, and thus more suitable for real-time object detection on UAVs.

## Demo

[![asciicast](results/result.jpg)](results/demo.mp4)

## Requirements

1. pytorch >= 1.0

2. [darknet](https://pjreddie.com/darknet/yolo/)

3. [ultralytics/yolov3](https://github.com/ultralytics/yolov3)

## Procedure of learning efficient deep object detectors for SlimYOLOv3

![img](./procedure.jpg)

### 1. Normal Training

    ./darknet/darknet detector train VisDrone2019/drone.data cfg/yolov3-spp3.cfg darknet53.conv.74.weights

### 2. Sparsity Training

    python yolov3/train_drone.py --cfg VisDrone2019/yolov3-spp3.cfg --data-cfg VisDrone2019/drone.data -sr --s 0.0001 --alpha 1.0

### 3. Channel Pruning

    python yolov3/prune.py --cfg VisDrone2019/yolov3-spp3.cfg --data-cfg VisDrone2019/drone.data --weights yolov3-spp3_sparsity.weights --overall_ratio 0.5 --perlayer_ratio 0.1
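Step 2 above enforces channel-level sparsity by adding an L1 penalty on the scaling factors (gamma) of the batch-normalization layers during training. The snippet below is a minimal sketch of how such a penalty is typically applied in PyTorch, as a subgradient added to the BN weight gradients after back-propagation; it is not the repository's `train_drone.py`/`sparsity.py`, and the function name and the assumption of an ultralytics-style `nn.Module` model are illustrative only. Here `s` plays the role of the `--s 0.0001` flag.

```
# Minimal sketch (not the repository's code): L1 sparsity on BN scale factors.
# Assumes a PyTorch model whose prunable convolutions are followed by nn.BatchNorm2d.
import torch
import torch.nn as nn


def add_bn_l1_subgradient(model: nn.Module, s: float = 1e-4) -> None:
    """Add the subgradient of s * |gamma| to every BN scaling factor.

    Call this after loss.backward() and before optimizer.step().
    """
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            # d/dgamma of s * |gamma| is s * sign(gamma)
            m.weight.grad.add_(s * torch.sign(m.weight.data))
```

After enough epochs of this penalized training, many gamma values are driven towards zero, and the corresponding channels become candidates for pruning in step 3.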
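Step 3 then removes the channels whose scaling factors ended up closest to zero. The sketch below illustrates the idea behind `--overall_ratio` and `--perlayer_ratio`: a single threshold is taken from the distribution of all BN scale magnitudes, and each layer keeps the channels above it, subject to a per-layer floor. This is a simplified illustration under those assumptions, not the repository's `prune.py`, which additionally has to rebuild the pruned .cfg and carry over the surviving weights.

```
# Minimal sketch (not the repository's prune.py): choosing channels to keep.
import torch
import torch.nn as nn


def global_gamma_threshold(model: nn.Module, overall_ratio: float = 0.5) -> float:
    """Magnitude below which roughly `overall_ratio` of all BN channels fall."""
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    k = int(overall_ratio * (gammas.numel() - 1))
    return torch.sort(gammas).values[k].item()


def channel_keep_masks(model: nn.Module, thresh: float, perlayer_ratio: float = 0.1):
    """Boolean keep-mask per BN layer: gammas above the global threshold,
    but never fewer than `perlayer_ratio` of the layer's channels."""
    masks = []
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            g = m.weight.data.abs()
            keep = g > thresh
            min_keep = max(1, int(perlayer_ratio * g.numel()))
            if int(keep.sum()) < min_keep:
                # fall back to the largest-gamma channels as a per-layer floor
                keep = torch.zeros_like(keep)
                keep[g.topk(min_keep).indices] = True
            masks.append(keep)
    return masks
```

The kept channels define the narrower prune_*.cfg, which is then fine-tuned in step 4 below.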
### 4. Fine-tuning

    ./darknet/darknet detector train VisDrone2019/drone.data cfg/prune_0.5.cfg weights/prune_0.5/prune.weights

## Test

### Pretrained models

### Use darknet for evaluation

    ./darknet/darknet detector valid VisDrone2019/drone.data cfg/prune_0.5.cfg backup_prune_0.5/prune_0_final.weights

### Use pytorch for evaluation

    python3.6 yolov3/test_drone.py --cfg cfg/prune_0.5.cfg --data-cfg VisDrone2019/drone.data --weights backup_prune_0.5/prune_0_final.weights --iou-thres 0.5 --conf-thres 0.1 --nms-thres 0.5 --img-size 608

    python yolov3/test_single_image.py --cfg cfg/yolov3-tiny.cfg --weights ../weights/yolov3-tiny.weights --img_size 608

## Citation

If you find this code useful for your research, please cite our paper:

```
@article{zhang2019slimyolov3,
  author  = {Pengyi Zhang and Yunxin Zhong and Xiaoqiong Li},
  title   = {SlimYOLOv3: Narrower, Faster and Better for Real-Time UAV Applications},
  journal = {CoRR},
  volume  = {abs/1907.11093},
  year    = {2019},
  ee      = {https://arxiv.org/abs/1907.11093}
}
```

--------------------------------------------------------------------------------
/VisDrone2019/drone.data:
--------------------------------------------------------------------------------
classes=10
train = VisDrone2019/drone/train.lists
valid = VisDrone2019/drone/val.lists
labels = VisDrone2019/drone/labels.txt
names = VisDrone2019/drone/labels.txt
backup = backup_drone/
eval=coco

--------------------------------------------------------------------------------
/VisDrone2019/drone/labels.txt:
--------------------------------------------------------------------------------
pedestrian
people
bicycle
car
van
truck
tricycle
awning-tricycle
bus
motor

--------------------------------------------------------------------------------
/VisDrone2019/drone/val.lists:
--------------------------------------------------------------------------------
VisDrone2018-DET-val/data/0000348_03333_d_0000423.jpg
VisDrone2018-DET-val/data/0000295_00000_d_0000021.jpg
VisDrone2018-DET-val/data/0000276_00001_d_0000507.jpg
VisDrone2018-DET-val/data/0000129_02411_d_0000138.jpg
VisDrone2018-DET-val/data/0000162_00401_d_0000001.jpg
VisDrone2018-DET-val/data/0000249_03400_d_0000010.jpg
VisDrone2018-DET-val/data/0000335_04117_d_0000064.jpg
VisDrone2018-DET-val/data/0000330_04401_d_0000822.jpg
VisDrone2018-DET-val/data/0000316_01401_d_0000526.jpg
VisDrone2018-DET-val/data/0000086_00592_d_0000002.jpg
VisDrone2018-DET-val/data/0000335_00981_d_0000048.jpg
VisDrone2018-DET-val/data/0000313_06401_d_0000468.jpg
VisDrone2018-DET-val/data/0000333_02745_d_0000015.jpg
VisDrone2018-DET-val/data/0000103_04122_d_0000033.jpg
VisDrone2018-DET-val/data/0000023_01233_d_0000011.jpg
VisDrone2018-DET-val/data/0000345_07057_d_0000345.jpg
VisDrone2018-DET-val/data/0000263_02401_d_0000179.jpg
VisDrone2018-DET-val/data/0000335_02353_d_0000055.jpg
VisDrone2018-DET-val/data/0000360_00589_d_0000716.jpg
VisDrone2018-DET-val/data/0000293_03601_d_0000940.jpg
VisDrone2018-DET-val/data/0000291_01201_d_0000874.jpg
VisDrone2018-DET-val/data/0000316_00601_d_0000522.jpg
VisDrone2018-DET-val/data/0000280_01201_d_0000618.jpg
VisDrone2018-DET-val/data/0000276_01801_d_0000516.jpg
VisDrone2018-DET-val/data/0000242_00627_d_0000003.jpg
VisDrone2018-DET-val/data/0000256_00715_d_0000019.jpg 27 | VisDrone2018-DET-val/data/0000276_05201_d_0000533.jpg 28 | VisDrone2018-DET-val/data/0000296_00601_d_0000038.jpg 29 | VisDrone2018-DET-val/data/0000276_03201_d_0000523.jpg 30 | VisDrone2018-DET-val/data/0000115_00315_d_0000080.jpg 31 | VisDrone2018-DET-val/data/0000153_01801_d_0000001.jpg 32 | VisDrone2018-DET-val/data/0000356_01765_d_0000638.jpg 33 | VisDrone2018-DET-val/data/0000330_04001_d_0000820.jpg 34 | VisDrone2018-DET-val/data/0000155_01201_d_0000001.jpg 35 | VisDrone2018-DET-val/data/0000289_04001_d_0000831.jpg 36 | VisDrone2018-DET-val/data/0000154_00401_d_0000001.jpg 37 | VisDrone2018-DET-val/data/0000295_02900_d_0000034.jpg 38 | VisDrone2018-DET-val/data/0000287_03001_d_0000774.jpg 39 | VisDrone2018-DET-val/data/0000253_00001_d_0000001.jpg 40 | VisDrone2018-DET-val/data/0000323_02601_d_0000645.jpg 41 | VisDrone2018-DET-val/data/0000316_01801_d_0000528.jpg 42 | VisDrone2018-DET-val/data/0000193_02416_d_0000116.jpg 43 | VisDrone2018-DET-val/data/0000360_03529_d_0000731.jpg 44 | VisDrone2018-DET-val/data/0000271_06401_d_0000404.jpg 45 | VisDrone2018-DET-val/data/0000364_01569_d_0000781.jpg 46 | VisDrone2018-DET-val/data/0000359_01961_d_0000707.jpg 47 | VisDrone2018-DET-val/data/0000289_05801_d_0000840.jpg 48 | VisDrone2018-DET-val/data/0000312_00001_d_0000414.jpg 49 | VisDrone2018-DET-val/data/0000342_00393_d_0000244.jpg 50 | VisDrone2018-DET-val/data/0000213_03478_d_0000242.jpg 51 | VisDrone2018-DET-val/data/0000215_02319_d_0000261.jpg 52 | VisDrone2018-DET-val/data/0000069_02163_d_0000006.jpg 53 | VisDrone2018-DET-val/data/0000335_00001_d_0000043.jpg 54 | VisDrone2018-DET-val/data/0000346_02549_d_0000359.jpg 55 | VisDrone2018-DET-val/data/0000316_00001_d_0000519.jpg 56 | VisDrone2018-DET-val/data/0000359_02353_d_0000708.jpg 57 | VisDrone2018-DET-val/data/0000117_02708_d_0000090.jpg 58 | VisDrone2018-DET-val/data/0000244_06388_d_0000014.jpg 59 | VisDrone2018-DET-val/data/0000153_00401_d_0000001.jpg 60 | VisDrone2018-DET-val/data/0000327_01801_d_0000720.jpg 61 | VisDrone2018-DET-val/data/0000291_01801_d_0000877.jpg 62 | VisDrone2018-DET-val/data/0000269_00601_d_0000351.jpg 63 | VisDrone2018-DET-val/data/0000356_03725_d_0000648.jpg 64 | VisDrone2018-DET-val/data/0000330_00201_d_0000801.jpg 65 | VisDrone2018-DET-val/data/0000330_00801_d_0000804.jpg 66 | VisDrone2018-DET-val/data/0000301_01201_d_0000162.jpg 67 | VisDrone2018-DET-val/data/0000193_00000_d_0000103.jpg 68 | VisDrone2018-DET-val/data/0000244_03500_d_0000008.jpg 69 | VisDrone2018-DET-val/data/0000117_01326_d_0000088.jpg 70 | VisDrone2018-DET-val/data/0000276_02401_d_0000519.jpg 71 | VisDrone2018-DET-val/data/0000364_01177_d_0000799.jpg 72 | VisDrone2018-DET-val/data/0000269_00001_d_0000348.jpg 73 | VisDrone2018-DET-val/data/0000271_05201_d_0000398.jpg 74 | VisDrone2018-DET-val/data/0000249_02468_d_0000008.jpg 75 | VisDrone2018-DET-val/data/0000291_04201_d_0000889.jpg 76 | VisDrone2018-DET-val/data/0000117_00112_d_0000087.jpg 77 | VisDrone2018-DET-val/data/0000194_00998_d_0000124.jpg 78 | VisDrone2018-DET-val/data/0000271_07201_d_0000408.jpg 79 | VisDrone2018-DET-val/data/0000117_03096_d_0000091.jpg 80 | VisDrone2018-DET-val/data/0000333_03529_d_0000019.jpg 81 | VisDrone2018-DET-val/data/0000327_04201_d_0000732.jpg 82 | VisDrone2018-DET-val/data/0000086_01954_d_0000005.jpg 83 | VisDrone2018-DET-val/data/0000289_01401_d_0000818.jpg 84 | VisDrone2018-DET-val/data/0000249_01514_d_0000005.jpg 85 | VisDrone2018-DET-val/data/0000194_00625_d_0000122.jpg 86 | 
VisDrone2018-DET-val/data/0000242_02400_d_0000009.jpg 87 | VisDrone2018-DET-val/data/0000213_03920_d_0000243.jpg 88 | VisDrone2018-DET-val/data/0000277_03201_d_0000554.jpg 89 | VisDrone2018-DET-val/data/0000069_02480_d_0000007.jpg 90 | VisDrone2018-DET-val/data/0000021_00800_d_0000003.jpg 91 | VisDrone2018-DET-val/data/0000117_01731_d_0000089.jpg 92 | VisDrone2018-DET-val/data/0000312_00601_d_0000417.jpg 93 | VisDrone2018-DET-val/data/0000295_00200_d_0000022.jpg 94 | VisDrone2018-DET-val/data/0000280_01601_d_0000620.jpg 95 | VisDrone2018-DET-val/data/0000335_03333_d_0000060.jpg 96 | VisDrone2018-DET-val/data/0000001_03999_d_0000007.jpg 97 | VisDrone2018-DET-val/data/0000301_00201_d_0000157.jpg 98 | VisDrone2018-DET-val/data/0000287_00601_d_0000762.jpg 99 | VisDrone2018-DET-val/data/0000289_03001_d_0000826.jpg 100 | VisDrone2018-DET-val/data/0000360_07253_d_0000750.jpg 101 | VisDrone2018-DET-val/data/0000358_03333_d_0000697.jpg 102 | VisDrone2018-DET-val/data/0000277_04201_d_0000559.jpg 103 | VisDrone2018-DET-val/data/0000280_00001_d_0000612.jpg 104 | VisDrone2018-DET-val/data/0000360_06861_d_0000748.jpg 105 | VisDrone2018-DET-val/data/0000277_00401_d_0000541.jpg 106 | VisDrone2018-DET-val/data/0000356_01177_d_0000635.jpg 107 | VisDrone2018-DET-val/data/0000335_05685_d_0000072.jpg 108 | VisDrone2018-DET-val/data/0000103_00180_d_0000026.jpg 109 | VisDrone2018-DET-val/data/0000287_02001_d_0000769.jpg 110 | VisDrone2018-DET-val/data/0000316_00401_d_0000521.jpg 111 | VisDrone2018-DET-val/data/0000333_02157_d_0000012.jpg 112 | VisDrone2018-DET-val/data/0000026_04978_d_0000034.jpg 113 | VisDrone2018-DET-val/data/0000271_02601_d_0000386.jpg 114 | VisDrone2018-DET-val/data/0000242_05059_d_0000015.jpg 115 | VisDrone2018-DET-val/data/0000333_01765_d_0000010.jpg 116 | VisDrone2018-DET-val/data/0000022_00000_d_0000004.jpg 117 | VisDrone2018-DET-val/data/0000327_01001_d_0000716.jpg 118 | VisDrone2018-DET-val/data/0000280_00401_d_0000614.jpg 119 | VisDrone2018-DET-val/data/0000280_02401_d_0000624.jpg 120 | VisDrone2018-DET-val/data/0000335_01765_d_0000052.jpg 121 | VisDrone2018-DET-val/data/0000153_00801_d_0000001.jpg 122 | VisDrone2018-DET-val/data/0000327_00401_d_0000713.jpg 123 | VisDrone2018-DET-val/data/0000271_00201_d_0000375.jpg 124 | VisDrone2018-DET-val/data/0000280_01001_d_0000617.jpg 125 | VisDrone2018-DET-val/data/0000287_01601_d_0000767.jpg 126 | VisDrone2018-DET-val/data/0000026_01000_d_0000026.jpg 127 | VisDrone2018-DET-val/data/0000276_05601_d_0000535.jpg 128 | VisDrone2018-DET-val/data/0000316_00801_d_0000523.jpg 129 | VisDrone2018-DET-val/data/0000330_01001_d_0000805.jpg 130 | VisDrone2018-DET-val/data/0000194_00000_d_0000119.jpg 131 | VisDrone2018-DET-val/data/0000276_00601_d_0000510.jpg 132 | VisDrone2018-DET-val/data/0000215_03226_d_0000264.jpg 133 | VisDrone2018-DET-val/data/0000271_06001_d_0000402.jpg 134 | VisDrone2018-DET-val/data/0000360_04705_d_0000737.jpg 135 | VisDrone2018-DET-val/data/0000291_00401_d_0000870.jpg 136 | VisDrone2018-DET-val/data/0000335_00589_d_0000046.jpg 137 | VisDrone2018-DET-val/data/0000359_00393_d_0000701.jpg 138 | VisDrone2018-DET-val/data/0000026_00500_d_0000025.jpg 139 | VisDrone2018-DET-val/data/0000360_07057_d_0000749.jpg 140 | VisDrone2018-DET-val/data/0000244_01500_d_0000004.jpg 141 | VisDrone2018-DET-val/data/0000276_04801_d_0000531.jpg 142 | VisDrone2018-DET-val/data/0000165_04325_d_0000105.jpg 143 | VisDrone2018-DET-val/data/0000289_02601_d_0000824.jpg 144 | VisDrone2018-DET-val/data/0000271_07401_d_0000409.jpg 145 | 
VisDrone2018-DET-val/data/0000291_00201_d_0000869.jpg 146 | VisDrone2018-DET-val/data/0000289_00201_d_0000812.jpg 147 | VisDrone2018-DET-val/data/0000287_01201_d_0000765.jpg 148 | VisDrone2018-DET-val/data/0000356_02745_d_0000643.jpg 149 | VisDrone2018-DET-val/data/0000277_01401_d_0000546.jpg 150 | VisDrone2018-DET-val/data/0000242_00963_d_0000005.jpg 151 | VisDrone2018-DET-val/data/0000327_02601_d_0000724.jpg 152 | VisDrone2018-DET-val/data/0000277_02201_d_0000550.jpg 153 | VisDrone2018-DET-val/data/0000103_04513_d_0000034.jpg 154 | VisDrone2018-DET-val/data/0000316_02001_d_0000529.jpg 155 | VisDrone2018-DET-val/data/0000022_01251_d_0000007.jpg 156 | VisDrone2018-DET-val/data/0000244_04400_d_0000010.jpg 157 | VisDrone2018-DET-val/data/0000289_05601_d_0000839.jpg 158 | VisDrone2018-DET-val/data/0000360_00197_d_0000714.jpg 159 | VisDrone2018-DET-val/data/0000335_03137_d_0000059.jpg 160 | VisDrone2018-DET-val/data/0000335_00197_d_0000044.jpg 161 | VisDrone2018-DET-val/data/0000287_00201_d_0000760.jpg 162 | VisDrone2018-DET-val/data/0000287_05001_d_0000782.jpg 163 | VisDrone2018-DET-val/data/0000289_01201_d_0000817.jpg 164 | VisDrone2018-DET-val/data/0000283_01001_d_0000679.jpg 165 | VisDrone2018-DET-val/data/0000276_02201_d_0000518.jpg 166 | VisDrone2018-DET-val/data/0000153_01201_d_0000001.jpg 167 | VisDrone2018-DET-val/data/0000271_03801_d_0000391.jpg 168 | VisDrone2018-DET-val/data/0000244_05400_d_0000012.jpg 169 | VisDrone2018-DET-val/data/0000287_01001_d_0000764.jpg 170 | VisDrone2018-DET-val/data/0000213_04998_d_0000245.jpg 171 | VisDrone2018-DET-val/data/0000115_01031_d_0000082.jpg 172 | VisDrone2018-DET-val/data/0000155_01601_d_0000001.jpg 173 | VisDrone2018-DET-val/data/0000335_04313_d_0000065.jpg 174 | VisDrone2018-DET-val/data/0000242_01475_d_0000007.jpg 175 | VisDrone2018-DET-val/data/0000242_03199_d_0000011.jpg 176 | VisDrone2018-DET-val/data/0000215_00909_d_0000258.jpg 177 | VisDrone2018-DET-val/data/0000289_00401_d_0000813.jpg 178 | VisDrone2018-DET-val/data/0000335_06861_d_0000078.jpg 179 | VisDrone2018-DET-val/data/0000330_01401_d_0000807.jpg 180 | VisDrone2018-DET-val/data/0000223_01802_d_0000005.jpg 181 | VisDrone2018-DET-val/data/0000193_02725_d_0000118.jpg 182 | VisDrone2018-DET-val/data/0000193_01705_d_0000112.jpg 183 | VisDrone2018-DET-val/data/0000244_03000_d_0000007.jpg 184 | VisDrone2018-DET-val/data/0000276_04001_d_0000527.jpg 185 | VisDrone2018-DET-val/data/0000330_01201_d_0000806.jpg 186 | VisDrone2018-DET-val/data/0000220_01242_d_0000004.jpg 187 | VisDrone2018-DET-val/data/0000280_03401_d_0000629.jpg 188 | VisDrone2018-DET-val/data/0000026_02000_d_0000028.jpg 189 | VisDrone2018-DET-val/data/0000289_04601_d_0000834.jpg 190 | VisDrone2018-DET-val/data/0000315_01601_d_0000509.jpg 191 | VisDrone2018-DET-val/data/0000271_00401_d_0000376.jpg 192 | VisDrone2018-DET-val/data/0000116_01338_d_0000086.jpg 193 | VisDrone2018-DET-val/data/0000244_00500_d_0000002.jpg 194 | VisDrone2018-DET-val/data/0000335_00393_d_0000045.jpg 195 | VisDrone2018-DET-val/data/0000069_01878_d_0000005.jpg 196 | VisDrone2018-DET-val/data/0000327_00801_d_0000715.jpg 197 | VisDrone2018-DET-val/data/0000277_02601_d_0000552.jpg 198 | VisDrone2018-DET-val/data/0000103_03349_d_0000031.jpg 199 | VisDrone2018-DET-val/data/0000327_03601_d_0000729.jpg 200 | VisDrone2018-DET-val/data/0000312_01001_d_0000419.jpg 201 | VisDrone2018-DET-val/data/0000327_02801_d_0000725.jpg 202 | VisDrone2018-DET-val/data/0000242_00001_d_0000001.jpg 203 | VisDrone2018-DET-val/data/0000289_07201_d_0000847.jpg 204 | 
VisDrone2018-DET-val/data/0000335_06665_d_0000077.jpg 205 | VisDrone2018-DET-val/data/0000289_01001_d_0000816.jpg 206 | VisDrone2018-DET-val/data/0000291_01401_d_0000875.jpg 207 | VisDrone2018-DET-val/data/0000215_00000_d_0000256.jpg 208 | VisDrone2018-DET-val/data/0000313_06601_d_0000469.jpg 209 | VisDrone2018-DET-val/data/0000216_00520_d_0000001.jpg 210 | VisDrone2018-DET-val/data/0000359_02549_d_0000709.jpg 211 | VisDrone2018-DET-val/data/0000356_03333_d_0000646.jpg 212 | VisDrone2018-DET-val/data/0000244_00001_d_0000001.jpg 213 | VisDrone2018-DET-val/data/0000289_02201_d_0000822.jpg 214 | VisDrone2018-DET-val/data/0000055_00714_d_0000110.jpg 215 | VisDrone2018-DET-val/data/0000356_01961_d_0000639.jpg 216 | VisDrone2018-DET-val/data/0000359_00785_d_0000703.jpg 217 | VisDrone2018-DET-val/data/0000289_02001_d_0000821.jpg 218 | VisDrone2018-DET-val/data/0000024_01000_d_0000014.jpg 219 | VisDrone2018-DET-val/data/0000280_02001_d_0000622.jpg 220 | VisDrone2018-DET-val/data/0000276_04601_d_0000530.jpg 221 | VisDrone2018-DET-val/data/0000069_00483_d_0000002.jpg 222 | VisDrone2018-DET-val/data/0000327_03201_d_0000727.jpg 223 | VisDrone2018-DET-val/data/0000215_00447_d_0000257.jpg 224 | VisDrone2018-DET-val/data/0000276_03601_d_0000525.jpg 225 | VisDrone2018-DET-val/data/0000103_04948_d_0000035.jpg 226 | VisDrone2018-DET-val/data/0000215_01869_d_0000260.jpg 227 | VisDrone2018-DET-val/data/0000333_01961_d_0000011.jpg 228 | VisDrone2018-DET-val/data/0000162_00001_d_0000001.jpg 229 | VisDrone2018-DET-val/data/0000333_02549_d_0000014.jpg 230 | VisDrone2018-DET-val/data/0000001_03499_d_0000006.jpg 231 | VisDrone2018-DET-val/data/0000360_00785_d_0000717.jpg 232 | VisDrone2018-DET-val/data/0000289_00001_d_0000811.jpg 233 | VisDrone2018-DET-val/data/0000280_00801_d_0000616.jpg 234 | VisDrone2018-DET-val/data/0000022_01036_d_0000006.jpg 235 | VisDrone2018-DET-val/data/0000103_02964_d_0000030.jpg 236 | VisDrone2018-DET-val/data/0000276_04401_d_0000529.jpg 237 | VisDrone2018-DET-val/data/0000360_06665_d_0000747.jpg 238 | VisDrone2018-DET-val/data/0000291_02201_d_0000879.jpg 239 | VisDrone2018-DET-val/data/0000271_00601_d_0000377.jpg 240 | VisDrone2018-DET-val/data/0000026_04500_d_0000033.jpg 241 | VisDrone2018-DET-val/data/0000364_00589_d_0000798.jpg 242 | VisDrone2018-DET-val/data/0000155_00801_d_0000001.jpg 243 | VisDrone2018-DET-val/data/0000086_00000_d_0000001.jpg 244 | VisDrone2018-DET-val/data/0000244_04000_d_0000009.jpg 245 | VisDrone2018-DET-val/data/0000330_04601_d_0000823.jpg 246 | VisDrone2018-DET-val/data/0000271_05601_d_0000400.jpg 247 | VisDrone2018-DET-val/data/0000103_01734_d_0000028.jpg 248 | VisDrone2018-DET-val/data/0000335_05097_d_0000069.jpg 249 | VisDrone2018-DET-val/data/0000280_00201_d_0000613.jpg 250 | VisDrone2018-DET-val/data/0000086_01084_d_0000003.jpg 251 | VisDrone2018-DET-val/data/0000154_00001_d_0000001.jpg 252 | VisDrone2018-DET-val/data/0000193_02029_d_0000114.jpg 253 | VisDrone2018-DET-val/data/0000271_04401_d_0000394.jpg 254 | VisDrone2018-DET-val/data/0000289_03801_d_0000830.jpg 255 | VisDrone2018-DET-val/data/0000289_05201_d_0000837.jpg 256 | VisDrone2018-DET-val/data/0000001_04527_d_0000008.jpg 257 | VisDrone2018-DET-val/data/0000327_01401_d_0000718.jpg 258 | VisDrone2018-DET-val/data/0000277_02401_d_0000551.jpg 259 | VisDrone2018-DET-val/data/0000194_00200_d_0000120.jpg 260 | VisDrone2018-DET-val/data/0000276_05801_d_0000536.jpg 261 | VisDrone2018-DET-val/data/0000330_01601_d_0000808.jpg 262 | VisDrone2018-DET-val/data/0000360_03137_d_0000729.jpg 263 | 
VisDrone2018-DET-val/data/0000356_04705_d_0000653.jpg 264 | VisDrone2018-DET-val/data/0000327_03001_d_0000726.jpg 265 | VisDrone2018-DET-val/data/0000280_01401_d_0000619.jpg 266 | VisDrone2018-DET-val/data/0000116_00819_d_0000084.jpg 267 | VisDrone2018-DET-val/data/0000086_01443_d_0000004.jpg 268 | VisDrone2018-DET-val/data/0000276_01001_d_0000512.jpg 269 | VisDrone2018-DET-val/data/0000026_03000_d_0000030.jpg 270 | VisDrone2018-DET-val/data/0000277_02001_d_0000549.jpg 271 | VisDrone2018-DET-val/data/0000280_02201_d_0000623.jpg 272 | VisDrone2018-DET-val/data/0000346_01961_d_0000356.jpg 273 | VisDrone2018-DET-val/data/0000069_00713_d_0000003.jpg 274 | VisDrone2018-DET-val/data/0000237_00001_d_0000001.jpg 275 | VisDrone2018-DET-val/data/0000194_00399_d_0000121.jpg 276 | VisDrone2018-DET-val/data/0000023_00868_d_0000010.jpg 277 | VisDrone2018-DET-val/data/0000155_01871_d_0000001.jpg 278 | VisDrone2018-DET-val/data/0000277_03401_d_0000555.jpg 279 | VisDrone2018-DET-val/data/0000295_01800_d_0000030.jpg 280 | VisDrone2018-DET-val/data/0000256_02173_d_0000030.jpg 281 | VisDrone2018-DET-val/data/0000289_06401_d_0000843.jpg 282 | VisDrone2018-DET-val/data/0000116_01059_d_0000085.jpg 283 | VisDrone2018-DET-val/data/0000081_00000_d_0000001.jpg 284 | VisDrone2018-DET-val/data/0000276_01201_d_0000513.jpg 285 | VisDrone2018-DET-val/data/0000300_03201_d_0000154.jpg 286 | VisDrone2018-DET-val/data/0000069_00001_d_0000001.jpg 287 | VisDrone2018-DET-val/data/0000289_02401_d_0000823.jpg 288 | VisDrone2018-DET-val/data/0000153_01601_d_0000001.jpg 289 | VisDrone2018-DET-val/data/0000249_01100_d_0000004.jpg 290 | VisDrone2018-DET-val/data/0000023_00000_d_0000008.jpg 291 | VisDrone2018-DET-val/data/0000291_02001_d_0000878.jpg 292 | VisDrone2018-DET-val/data/0000242_00500_d_0000002.jpg 293 | VisDrone2018-DET-val/data/0000312_00201_d_0000415.jpg 294 | VisDrone2018-DET-val/data/0000289_06801_d_0000845.jpg 295 | VisDrone2018-DET-val/data/0000295_00800_d_0000025.jpg 296 | VisDrone2018-DET-val/data/0000103_00502_d_0000027.jpg 297 | VisDrone2018-DET-val/data/0000289_04201_d_0000832.jpg 298 | VisDrone2018-DET-val/data/0000194_00800_d_0000123.jpg 299 | VisDrone2018-DET-val/data/0000287_01401_d_0000766.jpg 300 | VisDrone2018-DET-val/data/0000356_01569_d_0000637.jpg 301 | VisDrone2018-DET-val/data/0000244_05900_d_0000013.jpg 302 | VisDrone2018-DET-val/data/0000276_06001_d_0000537.jpg 303 | VisDrone2018-DET-val/data/0000276_03401_d_0000524.jpg 304 | VisDrone2018-DET-val/data/0000330_03201_d_0000816.jpg 305 | VisDrone2018-DET-val/data/0000356_03529_d_0000647.jpg 306 | VisDrone2018-DET-val/data/0000333_00001_d_0000001.jpg 307 | VisDrone2018-DET-val/data/0000193_01876_d_0000113.jpg 308 | VisDrone2018-DET-val/data/0000359_00589_d_0000702.jpg 309 | VisDrone2018-DET-val/data/0000001_02999_d_0000005.jpg 310 | VisDrone2018-DET-val/data/0000194_01192_d_0000125.jpg 311 | VisDrone2018-DET-val/data/0000154_01201_d_0000001.jpg 312 | VisDrone2018-DET-val/data/0000244_02500_d_0000006.jpg 313 | VisDrone2018-DET-val/data/0000116_00351_d_0000083.jpg 314 | VisDrone2018-DET-val/data/0000360_05685_d_0000742.jpg 315 | VisDrone2018-DET-val/data/0000213_02965_d_0000241.jpg 316 | VisDrone2018-DET-val/data/0000360_02745_d_0000727.jpg 317 | VisDrone2018-DET-val/data/0000026_02500_d_0000029.jpg 318 | VisDrone2018-DET-val/data/0000291_00001_d_0000868.jpg 319 | VisDrone2018-DET-val/data/0000330_02001_d_0000810.jpg 320 | VisDrone2018-DET-val/data/0000069_01109_d_0000004.jpg 321 | VisDrone2018-DET-val/data/0000356_00589_d_0000632.jpg 322 | 
VisDrone2018-DET-val/data/0000295_01600_d_0000029.jpg 323 | VisDrone2018-DET-val/data/0000026_04000_d_0000032.jpg 324 | VisDrone2018-DET-val/data/0000291_04001_d_0000888.jpg 325 | VisDrone2018-DET-val/data/0000295_02200_d_0000032.jpg 326 | VisDrone2018-DET-val/data/0000271_01201_d_0000379.jpg 327 | VisDrone2018-DET-val/data/0000330_00001_d_0000800.jpg 328 | VisDrone2018-DET-val/data/0000356_04117_d_0000650.jpg 329 | VisDrone2018-DET-val/data/0000291_01001_d_0000873.jpg 330 | VisDrone2018-DET-val/data/0000055_00000_d_0000109.jpg 331 | VisDrone2018-DET-val/data/0000360_02941_d_0000728.jpg 332 | VisDrone2018-DET-val/data/0000295_01000_d_0000026.jpg 333 | VisDrone2018-DET-val/data/0000242_05570_d_0000016.jpg 334 | VisDrone2018-DET-val/data/0000308_04401_d_0000327.jpg 335 | VisDrone2018-DET-val/data/0000024_00000_d_0000012.jpg 336 | VisDrone2018-DET-val/data/0000276_02601_d_0000520.jpg 337 | VisDrone2018-DET-val/data/0000287_02801_d_0000773.jpg 338 | VisDrone2018-DET-val/data/0000277_03601_d_0000556.jpg 339 | VisDrone2018-DET-val/data/0000271_07001_d_0000407.jpg 340 | VisDrone2018-DET-val/data/0000356_02941_d_0000644.jpg 341 | VisDrone2018-DET-val/data/0000291_03801_d_0000887.jpg 342 | VisDrone2018-DET-val/data/0000283_00401_d_0000676.jpg 343 | VisDrone2018-DET-val/data/0000287_03401_d_0000776.jpg 344 | VisDrone2018-DET-val/data/0000327_00001_d_0000711.jpg 345 | VisDrone2018-DET-val/data/0000356_02353_d_0000641.jpg 346 | VisDrone2018-DET-val/data/0000249_00235_d_0000002.jpg 347 | VisDrone2018-DET-val/data/0000330_02801_d_0000814.jpg 348 | VisDrone2018-DET-val/data/0000024_00500_d_0000013.jpg 349 | VisDrone2018-DET-val/data/0000271_05001_d_0000397.jpg 350 | VisDrone2018-DET-val/data/0000021_00500_d_0000002.jpg 351 | VisDrone2018-DET-val/data/0000360_07645_d_0000752.jpg 352 | VisDrone2018-DET-val/data/0000166_00558_d_0000118.jpg 353 | VisDrone2018-DET-val/data/0000276_04201_d_0000528.jpg 354 | VisDrone2018-DET-val/data/0000303_03401_d_0000200.jpg 355 | VisDrone2018-DET-val/data/0000271_01401_d_0000380.jpg 356 | VisDrone2018-DET-val/data/0000333_03725_d_0000020.jpg 357 | VisDrone2018-DET-val/data/0000196_04512_d_0000144.jpg 358 | VisDrone2018-DET-val/data/0000271_00801_d_0000378.jpg 359 | VisDrone2018-DET-val/data/0000076_02142_d_0000009.jpg 360 | VisDrone2018-DET-val/data/0000287_02401_d_0000771.jpg 361 | VisDrone2018-DET-val/data/0000163_00001_d_0000001.jpg 362 | VisDrone2018-DET-val/data/0000295_00600_d_0000024.jpg 363 | VisDrone2018-DET-val/data/0000289_05401_d_0000838.jpg 364 | VisDrone2018-DET-val/data/0000333_02353_d_0000013.jpg 365 | VisDrone2018-DET-val/data/0000249_02073_d_0000007.jpg 366 | VisDrone2018-DET-val/data/0000242_03668_d_0000012.jpg 367 | VisDrone2018-DET-val/data/0000242_04600_d_0000014.jpg 368 | VisDrone2018-DET-val/data/0000327_03801_d_0000730.jpg 369 | VisDrone2018-DET-val/data/0000360_06273_d_0000745.jpg 370 | VisDrone2018-DET-val/data/0000280_01801_d_0000621.jpg 371 | VisDrone2018-DET-val/data/0000280_03201_d_0000628.jpg 372 | VisDrone2018-DET-val/data/0000316_01201_d_0000525.jpg 373 | VisDrone2018-DET-val/data/0000213_04405_d_0000244.jpg 374 | VisDrone2018-DET-val/data/0000242_06010_d_0000017.jpg 375 | VisDrone2018-DET-val/data/0000289_06601_d_0000844.jpg 376 | VisDrone2018-DET-val/data/0000291_03201_d_0000884.jpg 377 | VisDrone2018-DET-val/data/0000242_06452_d_0000018.jpg 378 | VisDrone2018-DET-val/data/0000276_01401_d_0000514.jpg 379 | VisDrone2018-DET-val/data/0000289_03201_d_0000827.jpg 380 | VisDrone2018-DET-val/data/0000026_01500_d_0000027.jpg 381 | 
VisDrone2018-DET-val/data/0000327_00601_d_0000714.jpg 382 | VisDrone2018-DET-val/data/0000360_03921_d_0000733.jpg 383 | VisDrone2018-DET-val/data/0000289_06201_d_0000842.jpg 384 | VisDrone2018-DET-val/data/0000287_04601_d_0000780.jpg 385 | VisDrone2018-DET-val/data/0000316_01601_d_0000527.jpg 386 | VisDrone2018-DET-val/data/0000364_01765_d_0000782.jpg 387 | VisDrone2018-DET-val/data/0000312_02401_d_0000426.jpg 388 | VisDrone2018-DET-val/data/0000356_04901_d_0000654.jpg 389 | VisDrone2018-DET-val/data/0000271_06601_d_0000405.jpg 390 | VisDrone2018-DET-val/data/0000021_00000_d_0000001.jpg 391 | VisDrone2018-DET-val/data/0000242_01912_d_0000008.jpg 392 | VisDrone2018-DET-val/data/0000301_00001_d_0000156.jpg 393 | VisDrone2018-DET-val/data/0000335_06469_d_0000076.jpg 394 | VisDrone2018-DET-val/data/0000316_02201_d_0000530.jpg 395 | VisDrone2018-DET-val/data/0000215_02667_d_0000262.jpg 396 | VisDrone2018-DET-val/data/0000359_01177_d_0000705.jpg 397 | VisDrone2018-DET-val/data/0000103_03738_d_0000032.jpg 398 | VisDrone2018-DET-val/data/0000295_02400_d_0000033.jpg 399 | VisDrone2018-DET-val/data/0000333_03333_d_0000018.jpg 400 | VisDrone2018-DET-val/data/0000364_01373_d_0000780.jpg 401 | VisDrone2018-DET-val/data/0000249_00001_d_0000001.jpg 402 | VisDrone2018-DET-val/data/0000356_02157_d_0000640.jpg 403 | VisDrone2018-DET-val/data/0000199_01269_d_0000166.jpg 404 | VisDrone2018-DET-val/data/0000360_04117_d_0000734.jpg 405 | VisDrone2018-DET-val/data/0000271_05801_d_0000401.jpg 406 | VisDrone2018-DET-val/data/0000276_00401_d_0000509.jpg 407 | VisDrone2018-DET-val/data/0000244_04900_d_0000011.jpg 408 | VisDrone2018-DET-val/data/0000359_00981_d_0000704.jpg 409 | VisDrone2018-DET-val/data/0000335_00785_d_0000047.jpg 410 | VisDrone2018-DET-val/data/0000291_04401_d_0000890.jpg 411 | VisDrone2018-DET-val/data/0000277_05001_d_0000562.jpg 412 | VisDrone2018-DET-val/data/0000103_02544_d_0000029.jpg 413 | VisDrone2018-DET-val/data/0000026_00000_d_0000024.jpg 414 | VisDrone2018-DET-val/data/0000193_02606_d_0000117.jpg 415 | VisDrone2018-DET-val/data/0000244_01000_d_0000003.jpg 416 | VisDrone2018-DET-val/data/0000312_02201_d_0000425.jpg 417 | VisDrone2018-DET-val/data/0000242_02762_d_0000010.jpg 418 | VisDrone2018-DET-val/data/0000346_00001_d_0000346.jpg 419 | VisDrone2018-DET-val/data/0000154_01601_d_0000001.jpg 420 | VisDrone2018-DET-val/data/0000001_05999_d_0000011.jpg 421 | VisDrone2018-DET-val/data/0000346_07057_d_0000382.jpg 422 | VisDrone2018-DET-val/data/0000249_01635_d_0000006.jpg 423 | VisDrone2018-DET-val/data/0000295_01400_d_0000028.jpg 424 | VisDrone2018-DET-val/data/0000330_04201_d_0000821.jpg 425 | VisDrone2018-DET-val/data/0000276_05001_d_0000532.jpg 426 | VisDrone2018-DET-val/data/0000360_05293_d_0000740.jpg 427 | VisDrone2018-DET-val/data/0000287_00801_d_0000763.jpg 428 | VisDrone2018-DET-val/data/0000155_00401_d_0000001.jpg 429 | VisDrone2018-DET-val/data/0000359_03333_d_0000711.jpg 430 | VisDrone2018-DET-val/data/0000271_01601_d_0000381.jpg 431 | VisDrone2018-DET-val/data/0000287_04801_d_0000781.jpg 432 | VisDrone2018-DET-val/data/0000280_02601_d_0000625.jpg 433 | VisDrone2018-DET-val/data/0000271_03001_d_0000387.jpg 434 | VisDrone2018-DET-val/data/0000001_08414_d_0000013.jpg 435 | VisDrone2018-DET-val/data/0000330_00601_d_0000803.jpg 436 | VisDrone2018-DET-val/data/0000193_02212_d_0000115.jpg 437 | VisDrone2018-DET-val/data/0000276_03001_d_0000522.jpg 438 | VisDrone2018-DET-val/data/0000271_00001_d_0000374.jpg 439 | VisDrone2018-DET-val/data/0000215_01395_d_0000259.jpg 440 | 
VisDrone2018-DET-val/data/0000330_00401_d_0000802.jpg 441 | VisDrone2018-DET-val/data/0000283_00601_d_0000677.jpg 442 | VisDrone2018-DET-val/data/0000277_04401_d_0000560.jpg 443 | VisDrone2018-DET-val/data/0000155_00001_d_0000001.jpg 444 | VisDrone2018-DET-val/data/0000024_01543_d_0000015.jpg 445 | VisDrone2018-DET-val/data/0000359_03529_d_0000712.jpg 446 | VisDrone2018-DET-val/data/0000348_00589_d_0000409.jpg 447 | VisDrone2018-DET-val/data/0000346_03921_d_0000366.jpg 448 | VisDrone2018-DET-val/data/0000289_05001_d_0000836.jpg 449 | VisDrone2018-DET-val/data/0000289_06001_d_0000841.jpg 450 | VisDrone2018-DET-val/data/0000296_01001_d_0000040.jpg 451 | VisDrone2018-DET-val/data/0000242_00843_d_0000004.jpg 452 | VisDrone2018-DET-val/data/0000213_05340_d_0000246.jpg 453 | VisDrone2018-DET-val/data/0000289_00801_d_0000815.jpg 454 | VisDrone2018-DET-val/data/0000330_01801_d_0000809.jpg 455 | VisDrone2018-DET-val/data/0000289_03601_d_0000829.jpg 456 | VisDrone2018-DET-val/data/0000289_04401_d_0000833.jpg 457 | VisDrone2018-DET-val/data/0000327_01201_d_0000717.jpg 458 | VisDrone2018-DET-val/data/0000356_05097_d_0000655.jpg 459 | VisDrone2018-DET-val/data/0000023_00300_d_0000009.jpg 460 | VisDrone2018-DET-val/data/0000330_03001_d_0000815.jpg 461 | VisDrone2018-DET-val/data/0000154_02001_d_0000001.jpg 462 | VisDrone2018-DET-val/data/0000360_00393_d_0000715.jpg 463 | VisDrone2018-DET-val/data/0000277_04601_d_0000561.jpg 464 | VisDrone2018-DET-val/data/0000072_02834_d_0000003.jpg 465 | VisDrone2018-DET-val/data/0000333_03137_d_0000017.jpg 466 | VisDrone2018-DET-val/data/0000295_01200_d_0000027.jpg 467 | VisDrone2018-DET-val/data/0000312_04201_d_0000435.jpg 468 | VisDrone2018-DET-val/data/0000360_02353_d_0000725.jpg 469 | VisDrone2018-DET-val/data/0000291_05201_d_0000893.jpg 470 | VisDrone2018-DET-val/data/0000291_05001_d_0000892.jpg 471 | VisDrone2018-DET-val/data/0000289_04801_d_0000835.jpg 472 | VisDrone2018-DET-val/data/0000271_06201_d_0000403.jpg 473 | VisDrone2018-DET-val/data/0000271_05401_d_0000399.jpg 474 | VisDrone2018-DET-val/data/0000287_00401_d_0000761.jpg 475 | VisDrone2018-DET-val/data/0000280_02801_d_0000626.jpg 476 | VisDrone2018-DET-val/data/0000276_03801_d_0000526.jpg 477 | VisDrone2018-DET-val/data/0000295_02000_d_0000031.jpg 478 | VisDrone2018-DET-val/data/0000153_00001_d_0000001.jpg 479 | VisDrone2018-DET-val/data/0000360_00981_d_0000718.jpg 480 | VisDrone2018-DET-val/data/0000291_02601_d_0000881.jpg 481 | VisDrone2018-DET-val/data/0000249_02900_d_0000009.jpg 482 | VisDrone2018-DET-val/data/0000335_01961_d_0000053.jpg 483 | VisDrone2018-DET-val/data/0000313_07001_d_0000471.jpg 484 | VisDrone2018-DET-val/data/0000312_01201_d_0000420.jpg 485 | VisDrone2018-DET-val/data/0000001_07999_d_0000012.jpg 486 | VisDrone2018-DET-val/data/0000001_05249_d_0000009.jpg 487 | VisDrone2018-DET-val/data/0000242_04116_d_0000013.jpg 488 | VisDrone2018-DET-val/data/0000269_00201_d_0000349.jpg 489 | VisDrone2018-DET-val/data/0000289_01601_d_0000819.jpg 490 | VisDrone2018-DET-val/data/0000290_04001_d_0000867.jpg 491 | VisDrone2018-DET-val/data/0000277_04001_d_0000558.jpg 492 | VisDrone2018-DET-val/data/0000327_00201_d_0000712.jpg 493 | VisDrone2018-DET-val/data/0000163_00359_d_0000001.jpg 494 | VisDrone2018-DET-val/data/0000249_00603_d_0000003.jpg 495 | VisDrone2018-DET-val/data/0000289_00601_d_0000814.jpg 496 | VisDrone2018-DET-val/data/0000287_01801_d_0000768.jpg 497 | VisDrone2018-DET-val/data/0000291_01601_d_0000876.jpg 498 | VisDrone2018-DET-val/data/0000287_02601_d_0000772.jpg 499 | 
VisDrone2018-DET-val/data/0000271_02401_d_0000385.jpg 500 | VisDrone2018-DET-val/data/0000335_02549_d_0000056.jpg 501 | VisDrone2018-DET-val/data/0000244_02000_d_0000005.jpg 502 | VisDrone2018-DET-val/data/0000277_01601_d_0000547.jpg 503 | VisDrone2018-DET-val/data/0000026_03500_d_0000031.jpg 504 | VisDrone2018-DET-val/data/0000249_03900_d_0000011.jpg 505 | VisDrone2018-DET-val/data/0000213_05745_d_0000247.jpg 506 | VisDrone2018-DET-val/data/0000333_02941_d_0000016.jpg 507 | VisDrone2018-DET-val/data/0000271_06801_d_0000406.jpg 508 | VisDrone2018-DET-val/data/0000291_00601_d_0000871.jpg 509 | VisDrone2018-DET-val/data/0000154_00801_d_0000001.jpg 510 | VisDrone2018-DET-val/data/0000327_04001_d_0000731.jpg 511 | VisDrone2018-DET-val/data/0000001_05499_d_0000010.jpg 512 | VisDrone2018-DET-val/data/0000360_00001_d_0000713.jpg 513 | VisDrone2018-DET-val/data/0000316_01001_d_0000524.jpg 514 | VisDrone2018-DET-val/data/0000162_01058_d_0000001.jpg 515 | VisDrone2018-DET-val/data/0000291_03401_d_0000885.jpg 516 | VisDrone2018-DET-val/data/0000356_03921_d_0000649.jpg 517 | VisDrone2018-DET-val/data/0000213_02500_d_0000240.jpg 518 | VisDrone2018-DET-val/data/0000289_07001_d_0000846.jpg 519 | VisDrone2018-DET-val/data/0000335_03725_d_0000062.jpg 520 | VisDrone2018-DET-val/data/0000277_00001_d_0000539.jpg 521 | VisDrone2018-DET-val/data/0000359_01373_d_0000706.jpg 522 | VisDrone2018-DET-val/data/0000271_04801_d_0000396.jpg 523 | VisDrone2018-DET-val/data/0000283_00801_d_0000678.jpg 524 | VisDrone2018-DET-val/data/0000276_02801_d_0000521.jpg 525 | VisDrone2018-DET-val/data/0000335_03921_d_0000063.jpg 526 | VisDrone2018-DET-val/data/0000215_02932_d_0000263.jpg 527 | VisDrone2018-DET-val/data/0000276_02001_d_0000517.jpg 528 | VisDrone2018-DET-val/data/0000356_04313_d_0000651.jpg 529 | VisDrone2018-DET-val/data/0000277_01801_d_0000548.jpg 530 | VisDrone2018-DET-val/data/0000162_00801_d_0000001.jpg 531 | VisDrone2018-DET-val/data/0000301_01001_d_0000161.jpg 532 | VisDrone2018-DET-val/data/0000335_02745_d_0000057.jpg 533 | VisDrone2018-DET-val/data/0000291_03601_d_0000886.jpg 534 | VisDrone2018-DET-val/data/0000287_02201_d_0000770.jpg 535 | VisDrone2018-DET-val/data/0000242_01139_d_0000006.jpg 536 | VisDrone2018-DET-val/data/0000330_02201_d_0000811.jpg 537 | VisDrone2018-DET-val/data/0000280_00601_d_0000615.jpg 538 | VisDrone2018-DET-val/data/0000022_00500_d_0000005.jpg 539 | VisDrone2018-DET-val/data/0000287_00001_d_0000759.jpg 540 | VisDrone2018-DET-val/data/0000360_03333_d_0000730.jpg 541 | VisDrone2018-DET-val/data/0000193_01497_d_0000111.jpg 542 | VisDrone2018-DET-val/data/0000280_03001_d_0000627.jpg 543 | VisDrone2018-DET-val/data/0000289_03401_d_0000828.jpg 544 | VisDrone2018-DET-val/data/0000277_00201_d_0000540.jpg 545 | VisDrone2018-DET-val/data/0000115_00796_d_0000081.jpg 546 | VisDrone2018-DET-val/data/0000327_03401_d_0000728.jpg 547 | VisDrone2018-DET-val/data/0000273_05601_d_0000466.jpg 548 | VisDrone2018-DET-val/data/0000316_00201_d_0000520.jpg 549 | -------------------------------------------------------------------------------- /cfg/00-unpruned/yolov3-spp1.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | #batch=64 7 | #subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 500200 21 | policy=steps 22 | 
steps=400000,450000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | # Downsample 34 | 35 | [convolutional] 36 | batch_normalize=1 37 | filters=64 38 | size=3 39 | stride=2 40 | pad=1 41 | activation=leaky 42 | 43 | [convolutional] 44 | batch_normalize=1 45 | filters=32 46 | size=1 47 | stride=1 48 | pad=1 49 | activation=leaky 50 | 51 | [convolutional] 52 | batch_normalize=1 53 | filters=64 54 | size=3 55 | stride=1 56 | pad=1 57 | activation=leaky 58 | 59 | [shortcut] 60 | from=-3 61 | activation=linear 62 | 63 | # Downsample 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=2 70 | pad=1 71 | activation=leaky 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=64 76 | size=1 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [convolutional] 82 | batch_normalize=1 83 | filters=128 84 | size=3 85 | stride=1 86 | pad=1 87 | activation=leaky 88 | 89 | [shortcut] 90 | from=-3 91 | activation=linear 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=64 96 | size=1 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [convolutional] 102 | batch_normalize=1 103 | filters=128 104 | size=3 105 | stride=1 106 | pad=1 107 | activation=leaky 108 | 109 | [shortcut] 110 | from=-3 111 | activation=linear 112 | 113 | # Downsample 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=256 118 | size=3 119 | stride=2 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | batch_normalize=1 125 | filters=128 126 | size=1 127 | stride=1 128 | pad=1 129 | activation=leaky 130 | 131 | [convolutional] 132 | batch_normalize=1 133 | filters=256 134 | size=3 135 | stride=1 136 | pad=1 137 | activation=leaky 138 | 139 | [shortcut] 140 | from=-3 141 | activation=linear 142 | 143 | [convolutional] 144 | batch_normalize=1 145 | filters=128 146 | size=1 147 | stride=1 148 | pad=1 149 | activation=leaky 150 | 151 | [convolutional] 152 | batch_normalize=1 153 | filters=256 154 | size=3 155 | stride=1 156 | pad=1 157 | activation=leaky 158 | 159 | [shortcut] 160 | from=-3 161 | activation=linear 162 | 163 | [convolutional] 164 | batch_normalize=1 165 | filters=128 166 | size=1 167 | stride=1 168 | pad=1 169 | activation=leaky 170 | 171 | [convolutional] 172 | batch_normalize=1 173 | filters=256 174 | size=3 175 | stride=1 176 | pad=1 177 | activation=leaky 178 | 179 | [shortcut] 180 | from=-3 181 | activation=linear 182 | 183 | [convolutional] 184 | batch_normalize=1 185 | filters=128 186 | size=1 187 | stride=1 188 | pad=1 189 | activation=leaky 190 | 191 | [convolutional] 192 | batch_normalize=1 193 | filters=256 194 | size=3 195 | stride=1 196 | pad=1 197 | activation=leaky 198 | 199 | [shortcut] 200 | from=-3 201 | activation=linear 202 | 203 | 204 | [convolutional] 205 | batch_normalize=1 206 | filters=128 207 | size=1 208 | stride=1 209 | pad=1 210 | activation=leaky 211 | 212 | [convolutional] 213 | batch_normalize=1 214 | filters=256 215 | size=3 216 | stride=1 217 | pad=1 218 | activation=leaky 219 | 220 | [shortcut] 221 | from=-3 222 | activation=linear 223 | 224 | [convolutional] 225 | batch_normalize=1 226 | filters=128 227 | size=1 228 | stride=1 229 | pad=1 230 | activation=leaky 231 | 232 | [convolutional] 233 | batch_normalize=1 234 | filters=256 235 | size=3 236 | stride=1 237 | pad=1 238 | activation=leaky 239 | 240 | [shortcut] 241 | from=-3 242 | activation=linear 243 | 244 | [convolutional] 245 | 
batch_normalize=1 246 | filters=128 247 | size=1 248 | stride=1 249 | pad=1 250 | activation=leaky 251 | 252 | [convolutional] 253 | batch_normalize=1 254 | filters=256 255 | size=3 256 | stride=1 257 | pad=1 258 | activation=leaky 259 | 260 | [shortcut] 261 | from=-3 262 | activation=linear 263 | 264 | [convolutional] 265 | batch_normalize=1 266 | filters=128 267 | size=1 268 | stride=1 269 | pad=1 270 | activation=leaky 271 | 272 | [convolutional] 273 | batch_normalize=1 274 | filters=256 275 | size=3 276 | stride=1 277 | pad=1 278 | activation=leaky 279 | 280 | [shortcut] 281 | from=-3 282 | activation=linear 283 | 284 | # Downsample 285 | 286 | [convolutional] 287 | batch_normalize=1 288 | filters=512 289 | size=3 290 | stride=2 291 | pad=1 292 | activation=leaky 293 | 294 | [convolutional] 295 | batch_normalize=1 296 | filters=256 297 | size=1 298 | stride=1 299 | pad=1 300 | activation=leaky 301 | 302 | [convolutional] 303 | batch_normalize=1 304 | filters=512 305 | size=3 306 | stride=1 307 | pad=1 308 | activation=leaky 309 | 310 | [shortcut] 311 | from=-3 312 | activation=linear 313 | 314 | 315 | [convolutional] 316 | batch_normalize=1 317 | filters=256 318 | size=1 319 | stride=1 320 | pad=1 321 | activation=leaky 322 | 323 | [convolutional] 324 | batch_normalize=1 325 | filters=512 326 | size=3 327 | stride=1 328 | pad=1 329 | activation=leaky 330 | 331 | [shortcut] 332 | from=-3 333 | activation=linear 334 | 335 | 336 | [convolutional] 337 | batch_normalize=1 338 | filters=256 339 | size=1 340 | stride=1 341 | pad=1 342 | activation=leaky 343 | 344 | [convolutional] 345 | batch_normalize=1 346 | filters=512 347 | size=3 348 | stride=1 349 | pad=1 350 | activation=leaky 351 | 352 | [shortcut] 353 | from=-3 354 | activation=linear 355 | 356 | 357 | [convolutional] 358 | batch_normalize=1 359 | filters=256 360 | size=1 361 | stride=1 362 | pad=1 363 | activation=leaky 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=512 368 | size=3 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [shortcut] 374 | from=-3 375 | activation=linear 376 | 377 | [convolutional] 378 | batch_normalize=1 379 | filters=256 380 | size=1 381 | stride=1 382 | pad=1 383 | activation=leaky 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=512 388 | size=3 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [shortcut] 394 | from=-3 395 | activation=linear 396 | 397 | 398 | [convolutional] 399 | batch_normalize=1 400 | filters=256 401 | size=1 402 | stride=1 403 | pad=1 404 | activation=leaky 405 | 406 | [convolutional] 407 | batch_normalize=1 408 | filters=512 409 | size=3 410 | stride=1 411 | pad=1 412 | activation=leaky 413 | 414 | [shortcut] 415 | from=-3 416 | activation=linear 417 | 418 | 419 | [convolutional] 420 | batch_normalize=1 421 | filters=256 422 | size=1 423 | stride=1 424 | pad=1 425 | activation=leaky 426 | 427 | [convolutional] 428 | batch_normalize=1 429 | filters=512 430 | size=3 431 | stride=1 432 | pad=1 433 | activation=leaky 434 | 435 | [shortcut] 436 | from=-3 437 | activation=linear 438 | 439 | [convolutional] 440 | batch_normalize=1 441 | filters=256 442 | size=1 443 | stride=1 444 | pad=1 445 | activation=leaky 446 | 447 | [convolutional] 448 | batch_normalize=1 449 | filters=512 450 | size=3 451 | stride=1 452 | pad=1 453 | activation=leaky 454 | 455 | [shortcut] 456 | from=-3 457 | activation=linear 458 | 459 | # Downsample 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=2 466 | 
pad=1 467 | activation=leaky 468 | 469 | [convolutional] 470 | batch_normalize=1 471 | filters=512 472 | size=1 473 | stride=1 474 | pad=1 475 | activation=leaky 476 | 477 | [convolutional] 478 | batch_normalize=1 479 | filters=1024 480 | size=3 481 | stride=1 482 | pad=1 483 | activation=leaky 484 | 485 | [shortcut] 486 | from=-3 487 | activation=linear 488 | 489 | [convolutional] 490 | batch_normalize=1 491 | filters=512 492 | size=1 493 | stride=1 494 | pad=1 495 | activation=leaky 496 | 497 | [convolutional] 498 | batch_normalize=1 499 | filters=1024 500 | size=3 501 | stride=1 502 | pad=1 503 | activation=leaky 504 | 505 | [shortcut] 506 | from=-3 507 | activation=linear 508 | 509 | [convolutional] 510 | batch_normalize=1 511 | filters=512 512 | size=1 513 | stride=1 514 | pad=1 515 | activation=leaky 516 | 517 | [convolutional] 518 | batch_normalize=1 519 | filters=1024 520 | size=3 521 | stride=1 522 | pad=1 523 | activation=leaky 524 | 525 | [shortcut] 526 | from=-3 527 | activation=linear 528 | 529 | [convolutional] 530 | batch_normalize=1 531 | filters=512 532 | size=1 533 | stride=1 534 | pad=1 535 | activation=leaky 536 | 537 | [convolutional] 538 | batch_normalize=1 539 | filters=1024 540 | size=3 541 | stride=1 542 | pad=1 543 | activation=leaky 544 | 545 | [shortcut] 546 | from=-3 547 | activation=linear 548 | 549 | ###################### 550 | 551 | [convolutional] 552 | batch_normalize=1 553 | filters=512 554 | size=1 555 | stride=1 556 | pad=1 557 | activation=leaky 558 | 559 | [convolutional] 560 | batch_normalize=1 561 | size=3 562 | stride=1 563 | pad=1 564 | filters=1024 565 | activation=leaky 566 | 567 | [convolutional] 568 | batch_normalize=1 569 | filters=512 570 | size=1 571 | stride=1 572 | pad=1 573 | activation=leaky 574 | 575 | ### SPP ### 576 | [maxpool] 577 | stride=1 578 | size=5 579 | 580 | [route] 581 | layers=-2 582 | 583 | [maxpool] 584 | stride=1 585 | size=9 586 | 587 | [route] 588 | layers=-4 589 | 590 | [maxpool] 591 | stride=1 592 | size=13 593 | 594 | [route] 595 | layers=-1,-3,-5,-6 596 | 597 | ### End SPP ### 598 | 599 | [convolutional] 600 | batch_normalize=1 601 | filters=512 602 | size=1 603 | stride=1 604 | pad=1 605 | activation=leaky 606 | 607 | 608 | [convolutional] 609 | batch_normalize=1 610 | size=3 611 | stride=1 612 | pad=1 613 | filters=1024 614 | activation=leaky 615 | 616 | [convolutional] 617 | batch_normalize=1 618 | filters=512 619 | size=1 620 | stride=1 621 | pad=1 622 | activation=leaky 623 | 624 | [convolutional] 625 | batch_normalize=1 626 | size=3 627 | stride=1 628 | pad=1 629 | filters=1024 630 | activation=leaky 631 | 632 | [convolutional] 633 | size=1 634 | stride=1 635 | pad=1 636 | filters=45 637 | activation=linear 638 | 639 | 640 | [yolo] 641 | mask = 6,7,8 642 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 643 | classes=10 644 | num=9 645 | jitter=.3 646 | ignore_thresh = .7 647 | truth_thresh = 1 648 | random=1 649 | 650 | 651 | [route] 652 | layers = -4 653 | 654 | [convolutional] 655 | batch_normalize=1 656 | filters=256 657 | size=1 658 | stride=1 659 | pad=1 660 | activation=leaky 661 | 662 | [upsample] 663 | stride=2 664 | 665 | [route] 666 | layers = -1, 61 667 | 668 | 669 | 670 | [convolutional] 671 | batch_normalize=1 672 | filters=256 673 | size=1 674 | stride=1 675 | pad=1 676 | activation=leaky 677 | 678 | [convolutional] 679 | batch_normalize=1 680 | size=3 681 | stride=1 682 | pad=1 683 | filters=512 684 | activation=leaky 685 | 686 | [convolutional] 687 | 
batch_normalize=1 688 | filters=256 689 | size=1 690 | stride=1 691 | pad=1 692 | activation=leaky 693 | 694 | [convolutional] 695 | batch_normalize=1 696 | size=3 697 | stride=1 698 | pad=1 699 | filters=512 700 | activation=leaky 701 | 702 | [convolutional] 703 | batch_normalize=1 704 | filters=256 705 | size=1 706 | stride=1 707 | pad=1 708 | activation=leaky 709 | 710 | [convolutional] 711 | batch_normalize=1 712 | size=3 713 | stride=1 714 | pad=1 715 | filters=512 716 | activation=leaky 717 | 718 | [convolutional] 719 | size=1 720 | stride=1 721 | pad=1 722 | filters=45 723 | activation=linear 724 | 725 | 726 | [yolo] 727 | mask = 3,4,5 728 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 729 | classes=10 730 | num=9 731 | jitter=.3 732 | ignore_thresh = .7 733 | truth_thresh = 1 734 | random=1 735 | 736 | 737 | 738 | [route] 739 | layers = -4 740 | 741 | [convolutional] 742 | batch_normalize=1 743 | filters=128 744 | size=1 745 | stride=1 746 | pad=1 747 | activation=leaky 748 | 749 | [upsample] 750 | stride=2 751 | 752 | [route] 753 | layers = -1, 36 754 | 755 | 756 | 757 | [convolutional] 758 | batch_normalize=1 759 | filters=128 760 | size=1 761 | stride=1 762 | pad=1 763 | activation=leaky 764 | 765 | [convolutional] 766 | batch_normalize=1 767 | size=3 768 | stride=1 769 | pad=1 770 | filters=256 771 | activation=leaky 772 | 773 | [convolutional] 774 | batch_normalize=1 775 | filters=128 776 | size=1 777 | stride=1 778 | pad=1 779 | activation=leaky 780 | 781 | [convolutional] 782 | batch_normalize=1 783 | size=3 784 | stride=1 785 | pad=1 786 | filters=256 787 | activation=leaky 788 | 789 | [convolutional] 790 | batch_normalize=1 791 | filters=128 792 | size=1 793 | stride=1 794 | pad=1 795 | activation=leaky 796 | 797 | [convolutional] 798 | batch_normalize=1 799 | size=3 800 | stride=1 801 | pad=1 802 | filters=256 803 | activation=leaky 804 | 805 | [convolutional] 806 | size=1 807 | stride=1 808 | pad=1 809 | filters=45 810 | activation=linear 811 | 812 | 813 | [yolo] 814 | mask = 0,1,2 815 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 816 | classes=10 817 | num=9 818 | jitter=.3 819 | ignore_thresh = .1 820 | truth_thresh = 1 821 | random=1 822 | 823 | -------------------------------------------------------------------------------- /cfg/00-unpruned/yolov3-spp3.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | # Downsample 34 | 35 | [convolutional] 36 | batch_normalize=1 37 | filters=64 38 | size=3 39 | stride=2 40 | pad=1 41 | activation=leaky 42 | 43 | [convolutional] 44 | batch_normalize=1 45 | filters=32 46 | size=1 47 | stride=1 48 | pad=1 49 | activation=leaky 50 | 51 | [convolutional] 52 | batch_normalize=1 53 | filters=64 54 | size=3 55 | stride=1 56 | pad=1 57 | activation=leaky 58 | 59 | [shortcut] 60 | from=-3 61 | activation=linear 62 | 63 | # Downsample 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=2 70 | 
pad=1 71 | activation=leaky 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=64 76 | size=1 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [convolutional] 82 | batch_normalize=1 83 | filters=128 84 | size=3 85 | stride=1 86 | pad=1 87 | activation=leaky 88 | 89 | [shortcut] 90 | from=-3 91 | activation=linear 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=64 96 | size=1 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [convolutional] 102 | batch_normalize=1 103 | filters=128 104 | size=3 105 | stride=1 106 | pad=1 107 | activation=leaky 108 | 109 | [shortcut] 110 | from=-3 111 | activation=linear 112 | 113 | # Downsample 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=256 118 | size=3 119 | stride=2 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | batch_normalize=1 125 | filters=128 126 | size=1 127 | stride=1 128 | pad=1 129 | activation=leaky 130 | 131 | [convolutional] 132 | batch_normalize=1 133 | filters=256 134 | size=3 135 | stride=1 136 | pad=1 137 | activation=leaky 138 | 139 | [shortcut] 140 | from=-3 141 | activation=linear 142 | 143 | [convolutional] 144 | batch_normalize=1 145 | filters=128 146 | size=1 147 | stride=1 148 | pad=1 149 | activation=leaky 150 | 151 | [convolutional] 152 | batch_normalize=1 153 | filters=256 154 | size=3 155 | stride=1 156 | pad=1 157 | activation=leaky 158 | 159 | [shortcut] 160 | from=-3 161 | activation=linear 162 | 163 | [convolutional] 164 | batch_normalize=1 165 | filters=128 166 | size=1 167 | stride=1 168 | pad=1 169 | activation=leaky 170 | 171 | [convolutional] 172 | batch_normalize=1 173 | filters=256 174 | size=3 175 | stride=1 176 | pad=1 177 | activation=leaky 178 | 179 | [shortcut] 180 | from=-3 181 | activation=linear 182 | 183 | [convolutional] 184 | batch_normalize=1 185 | filters=128 186 | size=1 187 | stride=1 188 | pad=1 189 | activation=leaky 190 | 191 | [convolutional] 192 | batch_normalize=1 193 | filters=256 194 | size=3 195 | stride=1 196 | pad=1 197 | activation=leaky 198 | 199 | [shortcut] 200 | from=-3 201 | activation=linear 202 | 203 | 204 | [convolutional] 205 | batch_normalize=1 206 | filters=128 207 | size=1 208 | stride=1 209 | pad=1 210 | activation=leaky 211 | 212 | [convolutional] 213 | batch_normalize=1 214 | filters=256 215 | size=3 216 | stride=1 217 | pad=1 218 | activation=leaky 219 | 220 | [shortcut] 221 | from=-3 222 | activation=linear 223 | 224 | [convolutional] 225 | batch_normalize=1 226 | filters=128 227 | size=1 228 | stride=1 229 | pad=1 230 | activation=leaky 231 | 232 | [convolutional] 233 | batch_normalize=1 234 | filters=256 235 | size=3 236 | stride=1 237 | pad=1 238 | activation=leaky 239 | 240 | [shortcut] 241 | from=-3 242 | activation=linear 243 | 244 | [convolutional] 245 | batch_normalize=1 246 | filters=128 247 | size=1 248 | stride=1 249 | pad=1 250 | activation=leaky 251 | 252 | [convolutional] 253 | batch_normalize=1 254 | filters=256 255 | size=3 256 | stride=1 257 | pad=1 258 | activation=leaky 259 | 260 | [shortcut] 261 | from=-3 262 | activation=linear 263 | 264 | [convolutional] 265 | batch_normalize=1 266 | filters=128 267 | size=1 268 | stride=1 269 | pad=1 270 | activation=leaky 271 | 272 | [convolutional] 273 | batch_normalize=1 274 | filters=256 275 | size=3 276 | stride=1 277 | pad=1 278 | activation=leaky 279 | 280 | [shortcut] 281 | from=-3 282 | activation=linear 283 | 284 | # Downsample 285 | 286 | [convolutional] 287 | batch_normalize=1 288 | filters=512 289 | size=3 290 | stride=2 
291 | pad=1 292 | activation=leaky 293 | 294 | [convolutional] 295 | batch_normalize=1 296 | filters=256 297 | size=1 298 | stride=1 299 | pad=1 300 | activation=leaky 301 | 302 | [convolutional] 303 | batch_normalize=1 304 | filters=512 305 | size=3 306 | stride=1 307 | pad=1 308 | activation=leaky 309 | 310 | [shortcut] 311 | from=-3 312 | activation=linear 313 | 314 | 315 | [convolutional] 316 | batch_normalize=1 317 | filters=256 318 | size=1 319 | stride=1 320 | pad=1 321 | activation=leaky 322 | 323 | [convolutional] 324 | batch_normalize=1 325 | filters=512 326 | size=3 327 | stride=1 328 | pad=1 329 | activation=leaky 330 | 331 | [shortcut] 332 | from=-3 333 | activation=linear 334 | 335 | 336 | [convolutional] 337 | batch_normalize=1 338 | filters=256 339 | size=1 340 | stride=1 341 | pad=1 342 | activation=leaky 343 | 344 | [convolutional] 345 | batch_normalize=1 346 | filters=512 347 | size=3 348 | stride=1 349 | pad=1 350 | activation=leaky 351 | 352 | [shortcut] 353 | from=-3 354 | activation=linear 355 | 356 | 357 | [convolutional] 358 | batch_normalize=1 359 | filters=256 360 | size=1 361 | stride=1 362 | pad=1 363 | activation=leaky 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=512 368 | size=3 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [shortcut] 374 | from=-3 375 | activation=linear 376 | 377 | [convolutional] 378 | batch_normalize=1 379 | filters=256 380 | size=1 381 | stride=1 382 | pad=1 383 | activation=leaky 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=512 388 | size=3 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [shortcut] 394 | from=-3 395 | activation=linear 396 | 397 | 398 | [convolutional] 399 | batch_normalize=1 400 | filters=256 401 | size=1 402 | stride=1 403 | pad=1 404 | activation=leaky 405 | 406 | [convolutional] 407 | batch_normalize=1 408 | filters=512 409 | size=3 410 | stride=1 411 | pad=1 412 | activation=leaky 413 | 414 | [shortcut] 415 | from=-3 416 | activation=linear 417 | 418 | 419 | [convolutional] 420 | batch_normalize=1 421 | filters=256 422 | size=1 423 | stride=1 424 | pad=1 425 | activation=leaky 426 | 427 | [convolutional] 428 | batch_normalize=1 429 | filters=512 430 | size=3 431 | stride=1 432 | pad=1 433 | activation=leaky 434 | 435 | [shortcut] 436 | from=-3 437 | activation=linear 438 | 439 | [convolutional] 440 | batch_normalize=1 441 | filters=256 442 | size=1 443 | stride=1 444 | pad=1 445 | activation=leaky 446 | 447 | [convolutional] 448 | batch_normalize=1 449 | filters=512 450 | size=3 451 | stride=1 452 | pad=1 453 | activation=leaky 454 | 455 | [shortcut] 456 | from=-3 457 | activation=linear 458 | 459 | # Downsample 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=2 466 | pad=1 467 | activation=leaky 468 | 469 | [convolutional] 470 | batch_normalize=1 471 | filters=512 472 | size=1 473 | stride=1 474 | pad=1 475 | activation=leaky 476 | 477 | [convolutional] 478 | batch_normalize=1 479 | filters=1024 480 | size=3 481 | stride=1 482 | pad=1 483 | activation=leaky 484 | 485 | [shortcut] 486 | from=-3 487 | activation=linear 488 | 489 | [convolutional] 490 | batch_normalize=1 491 | filters=512 492 | size=1 493 | stride=1 494 | pad=1 495 | activation=leaky 496 | 497 | [convolutional] 498 | batch_normalize=1 499 | filters=1024 500 | size=3 501 | stride=1 502 | pad=1 503 | activation=leaky 504 | 505 | [shortcut] 506 | from=-3 507 | activation=linear 508 | 509 | [convolutional] 510 | batch_normalize=1 511 | 
filters=512 512 | size=1 513 | stride=1 514 | pad=1 515 | activation=leaky 516 | 517 | [convolutional] 518 | batch_normalize=1 519 | filters=1024 520 | size=3 521 | stride=1 522 | pad=1 523 | activation=leaky 524 | 525 | [shortcut] 526 | from=-3 527 | activation=linear 528 | 529 | [convolutional] 530 | batch_normalize=1 531 | filters=512 532 | size=1 533 | stride=1 534 | pad=1 535 | activation=leaky 536 | 537 | [convolutional] 538 | batch_normalize=1 539 | filters=1024 540 | size=3 541 | stride=1 542 | pad=1 543 | activation=leaky 544 | 545 | [shortcut] 546 | from=-3 547 | activation=linear 548 | 549 | ###################### 550 | 551 | [convolutional] 552 | batch_normalize=1 553 | filters=512 554 | size=1 555 | stride=1 556 | pad=1 557 | activation=leaky 558 | 559 | [convolutional] 560 | batch_normalize=1 561 | size=3 562 | stride=1 563 | pad=1 564 | filters=1024 565 | activation=leaky 566 | 567 | [convolutional] 568 | batch_normalize=1 569 | filters=512 570 | size=1 571 | stride=1 572 | pad=1 573 | activation=leaky 574 | 575 | ### SPP ### 576 | [maxpool] 577 | stride=1 578 | size=5 579 | 580 | [route] 581 | layers=-2 582 | 583 | [maxpool] 584 | stride=1 585 | size=9 586 | 587 | [route] 588 | layers=-4 589 | 590 | [maxpool] 591 | stride=1 592 | size=13 593 | 594 | [route] 595 | layers=-1,-3,-5,-6 596 | 597 | ### End SPP ### 598 | 599 | [convolutional] 600 | batch_normalize=1 601 | filters=512 602 | size=1 603 | stride=1 604 | pad=1 605 | activation=leaky 606 | 607 | 608 | [convolutional] 609 | batch_normalize=1 610 | size=3 611 | stride=1 612 | pad=1 613 | filters=1024 614 | activation=leaky 615 | 616 | [convolutional] 617 | batch_normalize=1 618 | filters=512 619 | size=1 620 | stride=1 621 | pad=1 622 | activation=leaky 623 | 624 | [convolutional] 625 | batch_normalize=1 626 | size=3 627 | stride=1 628 | pad=1 629 | filters=1024 630 | activation=leaky 631 | 632 | 633 | [convolutional] 634 | size=1 635 | stride=1 636 | pad=1 637 | filters=45 638 | activation=linear 639 | 640 | 641 | [yolo] 642 | mask = 6,7,8 643 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 644 | classes=10 645 | num=9 646 | jitter=.3 647 | ignore_thresh = .7 648 | truth_thresh = 1 649 | random=1 650 | 651 | 652 | [route] 653 | layers = -4 654 | 655 | [convolutional] 656 | batch_normalize=1 657 | filters=256 658 | size=1 659 | stride=1 660 | pad=1 661 | activation=leaky 662 | 663 | [upsample] 664 | stride=2 665 | 666 | [route] 667 | layers = -1, 61 668 | 669 | 670 | 671 | [convolutional] 672 | batch_normalize=1 673 | filters=256 674 | size=1 675 | stride=1 676 | pad=1 677 | activation=leaky 678 | 679 | [convolutional] 680 | batch_normalize=1 681 | size=3 682 | stride=1 683 | pad=1 684 | filters=512 685 | activation=leaky 686 | 687 | ### SPP ### 688 | [maxpool] 689 | stride=1 690 | size=5 691 | 692 | [route] 693 | layers=-2 694 | 695 | [maxpool] 696 | stride=1 697 | size=9 698 | 699 | [route] 700 | layers=-4 701 | 702 | [maxpool] 703 | stride=1 704 | size=13 705 | 706 | [route] 707 | layers=-1,-3,-5,-6 708 | 709 | ### End SPP ### 710 | 711 | 712 | [convolutional] 713 | batch_normalize=1 714 | filters=256 715 | size=1 716 | stride=1 717 | pad=1 718 | activation=leaky 719 | 720 | [convolutional] 721 | batch_normalize=1 722 | size=3 723 | stride=1 724 | pad=1 725 | filters=512 726 | activation=leaky 727 | 728 | [convolutional] 729 | batch_normalize=1 730 | filters=256 731 | size=1 732 | stride=1 733 | pad=1 734 | activation=leaky 735 | 736 | [convolutional] 737 | batch_normalize=1 738 | 
size=3 739 | stride=1 740 | pad=1 741 | filters=512 742 | activation=leaky 743 | 744 | [convolutional] 745 | size=1 746 | stride=1 747 | pad=1 748 | filters=45 749 | activation=linear 750 | 751 | 752 | [yolo] 753 | mask = 3,4,5 754 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 755 | classes=10 756 | num=9 757 | jitter=.3 758 | ignore_thresh = .7 759 | truth_thresh = 1 760 | random=1 761 | 762 | 763 | 764 | [route] 765 | layers = -4 766 | 767 | [convolutional] 768 | batch_normalize=1 769 | filters=128 770 | size=1 771 | stride=1 772 | pad=1 773 | activation=leaky 774 | 775 | [upsample] 776 | stride=2 777 | 778 | [route] 779 | layers = -1, 36 780 | 781 | 782 | 783 | [convolutional] 784 | batch_normalize=1 785 | filters=128 786 | size=1 787 | stride=1 788 | pad=1 789 | activation=leaky 790 | 791 | [convolutional] 792 | batch_normalize=1 793 | size=3 794 | stride=1 795 | pad=1 796 | filters=256 797 | activation=leaky 798 | 799 | [convolutional] 800 | batch_normalize=1 801 | filters=128 802 | size=1 803 | stride=1 804 | pad=1 805 | activation=leaky 806 | 807 | ### SPP ### 808 | [maxpool] 809 | stride=1 810 | size=5 811 | 812 | [route] 813 | layers=-2 814 | 815 | [maxpool] 816 | stride=1 817 | size=9 818 | 819 | [route] 820 | layers=-4 821 | 822 | [maxpool] 823 | stride=1 824 | size=13 825 | 826 | [route] 827 | layers=-1,-3,-5,-6 828 | 829 | ### End SPP ### 830 | 831 | [convolutional] 832 | batch_normalize=1 833 | size=3 834 | stride=1 835 | pad=1 836 | filters=256 837 | activation=leaky 838 | 839 | [convolutional] 840 | batch_normalize=1 841 | filters=128 842 | size=1 843 | stride=1 844 | pad=1 845 | activation=leaky 846 | 847 | [convolutional] 848 | batch_normalize=1 849 | size=3 850 | stride=1 851 | pad=1 852 | filters=256 853 | activation=leaky 854 | 855 | [convolutional] 856 | size=1 857 | stride=1 858 | pad=1 859 | filters=45 860 | activation=linear 861 | 862 | 863 | [yolo] 864 | mask = 0,1,2 865 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 866 | classes=10 867 | num=9 868 | jitter=.3 869 | ignore_thresh = .7 870 | truth_thresh = 1 871 | random=1 872 | 873 | -------------------------------------------------------------------------------- /cfg/00-unpruned/yolov3-tiny.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=2 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=16 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [maxpool] 34 | size=2 35 | stride=2 36 | 37 | [convolutional] 38 | batch_normalize=1 39 | filters=32 40 | size=3 41 | stride=1 42 | pad=1 43 | activation=leaky 44 | 45 | [maxpool] 46 | size=2 47 | stride=2 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=64 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [maxpool] 58 | size=2 59 | stride=2 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=128 64 | size=3 65 | stride=1 66 | pad=1 67 | activation=leaky 68 | 69 | [maxpool] 70 | size=2 71 | stride=2 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=256 76 | size=3 77 | stride=1 78 | pad=1 79 | activation=leaky 
80 | 81 | [maxpool] 82 | size=2 83 | stride=2 84 | 85 | [convolutional] 86 | batch_normalize=1 87 | filters=512 88 | size=3 89 | stride=1 90 | pad=1 91 | activation=leaky 92 | 93 | [maxpool] 94 | size=2 95 | stride=1 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=1024 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | ########### 106 | 107 | [convolutional] 108 | batch_normalize=1 109 | filters=256 110 | size=1 111 | stride=1 112 | pad=1 113 | activation=leaky 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=512 118 | size=3 119 | stride=1 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | size=1 125 | stride=1 126 | pad=1 127 | filters=45 128 | activation=linear 129 | 130 | 131 | 132 | [yolo] 133 | mask = 3,4,5 134 | anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 135 | classes=10 136 | num=6 137 | jitter=.3 138 | ignore_thresh = .7 139 | truth_thresh = 1 140 | random=1 141 | 142 | [route] 143 | layers = -4 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=128 148 | size=1 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [upsample] 154 | stride=2 155 | 156 | [route] 157 | layers = -1, 8 158 | 159 | [convolutional] 160 | batch_normalize=1 161 | filters=256 162 | size=3 163 | stride=1 164 | pad=1 165 | activation=leaky 166 | 167 | [convolutional] 168 | size=1 169 | stride=1 170 | pad=1 171 | filters=45 172 | activation=linear 173 | 174 | [yolo] 175 | mask = 0,1,2 176 | anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 177 | classes=10 178 | num=6 179 | jitter=.3 180 | ignore_thresh = .7 181 | truth_thresh = 1 182 | random=1 183 | -------------------------------------------------------------------------------- /cfg/00-unpruned/yolov3.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | # Downsample 34 | 35 | [convolutional] 36 | batch_normalize=1 37 | filters=64 38 | size=3 39 | stride=2 40 | pad=1 41 | activation=leaky 42 | 43 | [convolutional] 44 | batch_normalize=1 45 | filters=32 46 | size=1 47 | stride=1 48 | pad=1 49 | activation=leaky 50 | 51 | [convolutional] 52 | batch_normalize=1 53 | filters=64 54 | size=3 55 | stride=1 56 | pad=1 57 | activation=leaky 58 | 59 | [shortcut] 60 | from=-3 61 | activation=linear 62 | 63 | # Downsample 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=2 70 | pad=1 71 | activation=leaky 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=64 76 | size=1 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [convolutional] 82 | batch_normalize=1 83 | filters=128 84 | size=3 85 | stride=1 86 | pad=1 87 | activation=leaky 88 | 89 | [shortcut] 90 | from=-3 91 | activation=linear 92 | 93 | [convolutional] 94 | batch_normalize=1 95 | filters=64 96 | size=1 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [convolutional] 102 | batch_normalize=1 103 | filters=128 104 | size=3 105 | stride=1 106 | pad=1 107 | activation=leaky 108 | 109 | 
[shortcut] 110 | from=-3 111 | activation=linear 112 | 113 | # Downsample 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=256 118 | size=3 119 | stride=2 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | batch_normalize=1 125 | filters=128 126 | size=1 127 | stride=1 128 | pad=1 129 | activation=leaky 130 | 131 | [convolutional] 132 | batch_normalize=1 133 | filters=256 134 | size=3 135 | stride=1 136 | pad=1 137 | activation=leaky 138 | 139 | [shortcut] 140 | from=-3 141 | activation=linear 142 | 143 | [convolutional] 144 | batch_normalize=1 145 | filters=128 146 | size=1 147 | stride=1 148 | pad=1 149 | activation=leaky 150 | 151 | [convolutional] 152 | batch_normalize=1 153 | filters=256 154 | size=3 155 | stride=1 156 | pad=1 157 | activation=leaky 158 | 159 | [shortcut] 160 | from=-3 161 | activation=linear 162 | 163 | [convolutional] 164 | batch_normalize=1 165 | filters=128 166 | size=1 167 | stride=1 168 | pad=1 169 | activation=leaky 170 | 171 | [convolutional] 172 | batch_normalize=1 173 | filters=256 174 | size=3 175 | stride=1 176 | pad=1 177 | activation=leaky 178 | 179 | [shortcut] 180 | from=-3 181 | activation=linear 182 | 183 | [convolutional] 184 | batch_normalize=1 185 | filters=128 186 | size=1 187 | stride=1 188 | pad=1 189 | activation=leaky 190 | 191 | [convolutional] 192 | batch_normalize=1 193 | filters=256 194 | size=3 195 | stride=1 196 | pad=1 197 | activation=leaky 198 | 199 | [shortcut] 200 | from=-3 201 | activation=linear 202 | 203 | 204 | [convolutional] 205 | batch_normalize=1 206 | filters=128 207 | size=1 208 | stride=1 209 | pad=1 210 | activation=leaky 211 | 212 | [convolutional] 213 | batch_normalize=1 214 | filters=256 215 | size=3 216 | stride=1 217 | pad=1 218 | activation=leaky 219 | 220 | [shortcut] 221 | from=-3 222 | activation=linear 223 | 224 | [convolutional] 225 | batch_normalize=1 226 | filters=128 227 | size=1 228 | stride=1 229 | pad=1 230 | activation=leaky 231 | 232 | [convolutional] 233 | batch_normalize=1 234 | filters=256 235 | size=3 236 | stride=1 237 | pad=1 238 | activation=leaky 239 | 240 | [shortcut] 241 | from=-3 242 | activation=linear 243 | 244 | [convolutional] 245 | batch_normalize=1 246 | filters=128 247 | size=1 248 | stride=1 249 | pad=1 250 | activation=leaky 251 | 252 | [convolutional] 253 | batch_normalize=1 254 | filters=256 255 | size=3 256 | stride=1 257 | pad=1 258 | activation=leaky 259 | 260 | [shortcut] 261 | from=-3 262 | activation=linear 263 | 264 | [convolutional] 265 | batch_normalize=1 266 | filters=128 267 | size=1 268 | stride=1 269 | pad=1 270 | activation=leaky 271 | 272 | [convolutional] 273 | batch_normalize=1 274 | filters=256 275 | size=3 276 | stride=1 277 | pad=1 278 | activation=leaky 279 | 280 | [shortcut] 281 | from=-3 282 | activation=linear 283 | 284 | # Downsample 285 | 286 | [convolutional] 287 | batch_normalize=1 288 | filters=512 289 | size=3 290 | stride=2 291 | pad=1 292 | activation=leaky 293 | 294 | [convolutional] 295 | batch_normalize=1 296 | filters=256 297 | size=1 298 | stride=1 299 | pad=1 300 | activation=leaky 301 | 302 | [convolutional] 303 | batch_normalize=1 304 | filters=512 305 | size=3 306 | stride=1 307 | pad=1 308 | activation=leaky 309 | 310 | [shortcut] 311 | from=-3 312 | activation=linear 313 | 314 | 315 | [convolutional] 316 | batch_normalize=1 317 | filters=256 318 | size=1 319 | stride=1 320 | pad=1 321 | activation=leaky 322 | 323 | [convolutional] 324 | batch_normalize=1 325 | filters=512 326 | size=3 327 | stride=1 328 
| pad=1 329 | activation=leaky 330 | 331 | [shortcut] 332 | from=-3 333 | activation=linear 334 | 335 | 336 | [convolutional] 337 | batch_normalize=1 338 | filters=256 339 | size=1 340 | stride=1 341 | pad=1 342 | activation=leaky 343 | 344 | [convolutional] 345 | batch_normalize=1 346 | filters=512 347 | size=3 348 | stride=1 349 | pad=1 350 | activation=leaky 351 | 352 | [shortcut] 353 | from=-3 354 | activation=linear 355 | 356 | 357 | [convolutional] 358 | batch_normalize=1 359 | filters=256 360 | size=1 361 | stride=1 362 | pad=1 363 | activation=leaky 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=512 368 | size=3 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [shortcut] 374 | from=-3 375 | activation=linear 376 | 377 | [convolutional] 378 | batch_normalize=1 379 | filters=256 380 | size=1 381 | stride=1 382 | pad=1 383 | activation=leaky 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=512 388 | size=3 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [shortcut] 394 | from=-3 395 | activation=linear 396 | 397 | 398 | [convolutional] 399 | batch_normalize=1 400 | filters=256 401 | size=1 402 | stride=1 403 | pad=1 404 | activation=leaky 405 | 406 | [convolutional] 407 | batch_normalize=1 408 | filters=512 409 | size=3 410 | stride=1 411 | pad=1 412 | activation=leaky 413 | 414 | [shortcut] 415 | from=-3 416 | activation=linear 417 | 418 | 419 | [convolutional] 420 | batch_normalize=1 421 | filters=256 422 | size=1 423 | stride=1 424 | pad=1 425 | activation=leaky 426 | 427 | [convolutional] 428 | batch_normalize=1 429 | filters=512 430 | size=3 431 | stride=1 432 | pad=1 433 | activation=leaky 434 | 435 | [shortcut] 436 | from=-3 437 | activation=linear 438 | 439 | [convolutional] 440 | batch_normalize=1 441 | filters=256 442 | size=1 443 | stride=1 444 | pad=1 445 | activation=leaky 446 | 447 | [convolutional] 448 | batch_normalize=1 449 | filters=512 450 | size=3 451 | stride=1 452 | pad=1 453 | activation=leaky 454 | 455 | [shortcut] 456 | from=-3 457 | activation=linear 458 | 459 | # Downsample 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=2 466 | pad=1 467 | activation=leaky 468 | 469 | [convolutional] 470 | batch_normalize=1 471 | filters=512 472 | size=1 473 | stride=1 474 | pad=1 475 | activation=leaky 476 | 477 | [convolutional] 478 | batch_normalize=1 479 | filters=1024 480 | size=3 481 | stride=1 482 | pad=1 483 | activation=leaky 484 | 485 | [shortcut] 486 | from=-3 487 | activation=linear 488 | 489 | [convolutional] 490 | batch_normalize=1 491 | filters=512 492 | size=1 493 | stride=1 494 | pad=1 495 | activation=leaky 496 | 497 | [convolutional] 498 | batch_normalize=1 499 | filters=1024 500 | size=3 501 | stride=1 502 | pad=1 503 | activation=leaky 504 | 505 | [shortcut] 506 | from=-3 507 | activation=linear 508 | 509 | [convolutional] 510 | batch_normalize=1 511 | filters=512 512 | size=1 513 | stride=1 514 | pad=1 515 | activation=leaky 516 | 517 | [convolutional] 518 | batch_normalize=1 519 | filters=1024 520 | size=3 521 | stride=1 522 | pad=1 523 | activation=leaky 524 | 525 | [shortcut] 526 | from=-3 527 | activation=linear 528 | 529 | [convolutional] 530 | batch_normalize=1 531 | filters=512 532 | size=1 533 | stride=1 534 | pad=1 535 | activation=leaky 536 | 537 | [convolutional] 538 | batch_normalize=1 539 | filters=1024 540 | size=3 541 | stride=1 542 | pad=1 543 | activation=leaky 544 | 545 | [shortcut] 546 | from=-3 547 | activation=linear 548 | 
549 | ###################### 550 | 551 | [convolutional] 552 | batch_normalize=1 553 | filters=512 554 | size=1 555 | stride=1 556 | pad=1 557 | activation=leaky 558 | 559 | [convolutional] 560 | batch_normalize=1 561 | size=3 562 | stride=1 563 | pad=1 564 | filters=1024 565 | activation=leaky 566 | 567 | [convolutional] 568 | batch_normalize=1 569 | filters=512 570 | size=1 571 | stride=1 572 | pad=1 573 | activation=leaky 574 | 575 | [convolutional] 576 | batch_normalize=1 577 | size=3 578 | stride=1 579 | pad=1 580 | filters=1024 581 | activation=leaky 582 | 583 | [convolutional] 584 | batch_normalize=1 585 | filters=512 586 | size=1 587 | stride=1 588 | pad=1 589 | activation=leaky 590 | 591 | [convolutional] 592 | batch_normalize=1 593 | size=3 594 | stride=1 595 | pad=1 596 | filters=1024 597 | activation=leaky 598 | 599 | [convolutional] 600 | size=1 601 | stride=1 602 | pad=1 603 | filters=45 604 | activation=linear 605 | 606 | 607 | [yolo] 608 | mask = 6,7,8 609 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 610 | classes=10 611 | num=9 612 | jitter=.3 613 | ignore_thresh = .7 614 | truth_thresh = 1 615 | random=1 616 | 617 | 618 | [route] 619 | layers = -4 620 | 621 | [convolutional] 622 | batch_normalize=1 623 | filters=256 624 | size=1 625 | stride=1 626 | pad=1 627 | activation=leaky 628 | 629 | [upsample] 630 | stride=2 631 | 632 | [route] 633 | layers = -1, 61 634 | 635 | 636 | 637 | [convolutional] 638 | batch_normalize=1 639 | filters=256 640 | size=1 641 | stride=1 642 | pad=1 643 | activation=leaky 644 | 645 | [convolutional] 646 | batch_normalize=1 647 | size=3 648 | stride=1 649 | pad=1 650 | filters=512 651 | activation=leaky 652 | 653 | [convolutional] 654 | batch_normalize=1 655 | filters=256 656 | size=1 657 | stride=1 658 | pad=1 659 | activation=leaky 660 | 661 | [convolutional] 662 | batch_normalize=1 663 | size=3 664 | stride=1 665 | pad=1 666 | filters=512 667 | activation=leaky 668 | 669 | [convolutional] 670 | batch_normalize=1 671 | filters=256 672 | size=1 673 | stride=1 674 | pad=1 675 | activation=leaky 676 | 677 | [convolutional] 678 | batch_normalize=1 679 | size=3 680 | stride=1 681 | pad=1 682 | filters=512 683 | activation=leaky 684 | 685 | [convolutional] 686 | size=1 687 | stride=1 688 | pad=1 689 | filters=45 690 | activation=linear 691 | 692 | 693 | [yolo] 694 | mask = 3,4,5 695 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 696 | classes=10 697 | num=9 698 | jitter=.3 699 | ignore_thresh = .7 700 | truth_thresh = 1 701 | random=1 702 | 703 | 704 | 705 | [route] 706 | layers = -4 707 | 708 | [convolutional] 709 | batch_normalize=1 710 | filters=128 711 | size=1 712 | stride=1 713 | pad=1 714 | activation=leaky 715 | 716 | [upsample] 717 | stride=2 718 | 719 | [route] 720 | layers = -1, 36 721 | 722 | 723 | 724 | [convolutional] 725 | batch_normalize=1 726 | filters=128 727 | size=1 728 | stride=1 729 | pad=1 730 | activation=leaky 731 | 732 | [convolutional] 733 | batch_normalize=1 734 | size=3 735 | stride=1 736 | pad=1 737 | filters=256 738 | activation=leaky 739 | 740 | [convolutional] 741 | batch_normalize=1 742 | filters=128 743 | size=1 744 | stride=1 745 | pad=1 746 | activation=leaky 747 | 748 | [convolutional] 749 | batch_normalize=1 750 | size=3 751 | stride=1 752 | pad=1 753 | filters=256 754 | activation=leaky 755 | 756 | [convolutional] 757 | batch_normalize=1 758 | filters=128 759 | size=1 760 | stride=1 761 | pad=1 762 | activation=leaky 763 | 764 | [convolutional] 
765 | batch_normalize=1 766 | size=3 767 | stride=1 768 | pad=1 769 | filters=256 770 | activation=leaky 771 | 772 | [convolutional] 773 | size=1 774 | stride=1 775 | pad=1 776 | filters=45 777 | activation=linear 778 | 779 | 780 | [yolo] 781 | mask = 0,1,2 782 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 783 | classes=10 784 | num=9 785 | jitter=.3 786 | ignore_thresh = .7 787 | truth_thresh = 1 788 | random=1 789 | 790 | -------------------------------------------------------------------------------- /cfg/1iter-pruned/prune_0.5.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | #batch=1 4 | #subdivisions=1 5 | # Training 6 | batch=64 7 | subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=26 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [convolutional] 34 | batch_normalize=1 35 | filters=51 36 | size=3 37 | stride=2 38 | pad=1 39 | activation=leaky 40 | 41 | [convolutional] 42 | batch_normalize=1 43 | filters=26 44 | size=1 45 | stride=1 46 | pad=1 47 | activation=leaky 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=51 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [shortcut] 58 | from=-3 59 | activation=linear 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=122 64 | size=3 65 | stride=2 66 | pad=1 67 | activation=leaky 68 | 69 | [convolutional] 70 | batch_normalize=1 71 | filters=22 72 | size=1 73 | stride=1 74 | pad=1 75 | activation=leaky 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=122 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [shortcut] 86 | from=-3 87 | activation=linear 88 | 89 | [convolutional] 90 | batch_normalize=1 91 | filters=49 92 | size=1 93 | stride=1 94 | pad=1 95 | activation=leaky 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=122 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | [shortcut] 106 | from=-3 107 | activation=linear 108 | 109 | [convolutional] 110 | batch_normalize=1 111 | filters=256 112 | size=3 113 | stride=2 114 | pad=1 115 | activation=leaky 116 | 117 | [convolutional] 118 | batch_normalize=1 119 | filters=61 120 | size=1 121 | stride=1 122 | pad=1 123 | activation=leaky 124 | 125 | [convolutional] 126 | batch_normalize=1 127 | filters=256 128 | size=3 129 | stride=1 130 | pad=1 131 | activation=leaky 132 | 133 | [shortcut] 134 | from=-3 135 | activation=linear 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=115 140 | size=1 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=256 148 | size=3 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [shortcut] 154 | from=-3 155 | activation=linear 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=103 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=256 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [shortcut] 174 | from=-3 175 | activation=linear 176 | 177 | [convolutional] 178 | batch_normalize=1 179 | filters=124 180 | size=1 181 | stride=1 
182 | pad=1 183 | activation=leaky 184 | 185 | [convolutional] 186 | batch_normalize=1 187 | filters=256 188 | size=3 189 | stride=1 190 | pad=1 191 | activation=leaky 192 | 193 | [shortcut] 194 | from=-3 195 | activation=linear 196 | 197 | [convolutional] 198 | batch_normalize=1 199 | filters=121 200 | size=1 201 | stride=1 202 | pad=1 203 | activation=leaky 204 | 205 | [convolutional] 206 | batch_normalize=1 207 | filters=256 208 | size=3 209 | stride=1 210 | pad=1 211 | activation=leaky 212 | 213 | [shortcut] 214 | from=-3 215 | activation=linear 216 | 217 | [convolutional] 218 | batch_normalize=1 219 | filters=124 220 | size=1 221 | stride=1 222 | pad=1 223 | activation=leaky 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | filters=256 228 | size=3 229 | stride=1 230 | pad=1 231 | activation=leaky 232 | 233 | [shortcut] 234 | from=-3 235 | activation=linear 236 | 237 | [convolutional] 238 | batch_normalize=1 239 | filters=90 240 | size=1 241 | stride=1 242 | pad=1 243 | activation=leaky 244 | 245 | [convolutional] 246 | batch_normalize=1 247 | filters=256 248 | size=3 249 | stride=1 250 | pad=1 251 | activation=leaky 252 | 253 | [shortcut] 254 | from=-3 255 | activation=linear 256 | 257 | [convolutional] 258 | batch_normalize=1 259 | filters=128 260 | size=1 261 | stride=1 262 | pad=1 263 | activation=leaky 264 | 265 | [convolutional] 266 | batch_normalize=1 267 | filters=256 268 | size=3 269 | stride=1 270 | pad=1 271 | activation=leaky 272 | 273 | [shortcut] 274 | from=-3 275 | activation=linear 276 | 277 | [convolutional] 278 | batch_normalize=1 279 | filters=512 280 | size=3 281 | stride=2 282 | pad=1 283 | activation=leaky 284 | 285 | [convolutional] 286 | batch_normalize=1 287 | filters=256 288 | size=1 289 | stride=1 290 | pad=1 291 | activation=leaky 292 | 293 | [convolutional] 294 | batch_normalize=1 295 | filters=512 296 | size=3 297 | stride=1 298 | pad=1 299 | activation=leaky 300 | 301 | [shortcut] 302 | from=-3 303 | activation=linear 304 | 305 | [convolutional] 306 | batch_normalize=1 307 | filters=240 308 | size=1 309 | stride=1 310 | pad=1 311 | activation=leaky 312 | 313 | [convolutional] 314 | batch_normalize=1 315 | filters=512 316 | size=3 317 | stride=1 318 | pad=1 319 | activation=leaky 320 | 321 | [shortcut] 322 | from=-3 323 | activation=linear 324 | 325 | [convolutional] 326 | batch_normalize=1 327 | filters=162 328 | size=1 329 | stride=1 330 | pad=1 331 | activation=leaky 332 | 333 | [convolutional] 334 | batch_normalize=1 335 | filters=512 336 | size=3 337 | stride=1 338 | pad=1 339 | activation=leaky 340 | 341 | [shortcut] 342 | from=-3 343 | activation=linear 344 | 345 | [convolutional] 346 | batch_normalize=1 347 | filters=188 348 | size=1 349 | stride=1 350 | pad=1 351 | activation=leaky 352 | 353 | [convolutional] 354 | batch_normalize=1 355 | filters=512 356 | size=3 357 | stride=1 358 | pad=1 359 | activation=leaky 360 | 361 | [shortcut] 362 | from=-3 363 | activation=linear 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=194 368 | size=1 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [convolutional] 374 | batch_normalize=1 375 | filters=512 376 | size=3 377 | stride=1 378 | pad=1 379 | activation=leaky 380 | 381 | [shortcut] 382 | from=-3 383 | activation=linear 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=122 388 | size=1 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [convolutional] 394 | batch_normalize=1 395 | filters=512 396 | size=3 397 | stride=1 398 | pad=1 399 | 
activation=leaky 400 | 401 | [shortcut] 402 | from=-3 403 | activation=linear 404 | 405 | [convolutional] 406 | batch_normalize=1 407 | filters=256 408 | size=1 409 | stride=1 410 | pad=1 411 | activation=leaky 412 | 413 | [convolutional] 414 | batch_normalize=1 415 | filters=512 416 | size=3 417 | stride=1 418 | pad=1 419 | activation=leaky 420 | 421 | [shortcut] 422 | from=-3 423 | activation=linear 424 | 425 | [convolutional] 426 | batch_normalize=1 427 | filters=219 428 | size=1 429 | stride=1 430 | pad=1 431 | activation=leaky 432 | 433 | [convolutional] 434 | batch_normalize=1 435 | filters=512 436 | size=3 437 | stride=1 438 | pad=1 439 | activation=leaky 440 | 441 | [shortcut] 442 | from=-3 443 | activation=linear 444 | 445 | [convolutional] 446 | batch_normalize=1 447 | filters=1024 448 | size=3 449 | stride=2 450 | pad=1 451 | activation=leaky 452 | 453 | [convolutional] 454 | batch_normalize=1 455 | filters=402 456 | size=1 457 | stride=1 458 | pad=1 459 | activation=leaky 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=1 466 | pad=1 467 | activation=leaky 468 | 469 | [shortcut] 470 | from=-3 471 | activation=linear 472 | 473 | [convolutional] 474 | batch_normalize=1 475 | filters=242 476 | size=1 477 | stride=1 478 | pad=1 479 | activation=leaky 480 | 481 | [convolutional] 482 | batch_normalize=1 483 | filters=1024 484 | size=3 485 | stride=1 486 | pad=1 487 | activation=leaky 488 | 489 | [shortcut] 490 | from=-3 491 | activation=linear 492 | 493 | [convolutional] 494 | batch_normalize=1 495 | filters=367 496 | size=1 497 | stride=1 498 | pad=1 499 | activation=leaky 500 | 501 | [convolutional] 502 | batch_normalize=1 503 | filters=1024 504 | size=3 505 | stride=1 506 | pad=1 507 | activation=leaky 508 | 509 | [shortcut] 510 | from=-3 511 | activation=linear 512 | 513 | [convolutional] 514 | batch_normalize=1 515 | filters=285 516 | size=1 517 | stride=1 518 | pad=1 519 | activation=leaky 520 | 521 | [convolutional] 522 | batch_normalize=1 523 | filters=1024 524 | size=3 525 | stride=1 526 | pad=1 527 | activation=leaky 528 | 529 | [shortcut] 530 | from=-3 531 | activation=linear 532 | 533 | [convolutional] 534 | batch_normalize=1 535 | filters=191 536 | size=1 537 | stride=1 538 | pad=1 539 | activation=leaky 540 | 541 | [convolutional] 542 | batch_normalize=1 543 | filters=379 544 | size=3 545 | stride=1 546 | pad=1 547 | activation=leaky 548 | 549 | [convolutional] 550 | batch_normalize=1 551 | filters=170 552 | size=1 553 | stride=1 554 | pad=1 555 | activation=leaky 556 | 557 | [maxpool] 558 | stride=1 559 | size=5 560 | 561 | [route] 562 | layers=-2 563 | 564 | [maxpool] 565 | stride=1 566 | size=9 567 | 568 | [route] 569 | layers=-4 570 | 571 | [maxpool] 572 | stride=1 573 | size=13 574 | 575 | [route] 576 | layers=-1,-3,-5,-6 577 | 578 | [convolutional] 579 | batch_normalize=1 580 | filters=185 581 | size=1 582 | stride=1 583 | pad=1 584 | activation=leaky 585 | 586 | [convolutional] 587 | batch_normalize=1 588 | filters=369 589 | size=3 590 | stride=1 591 | pad=1 592 | activation=leaky 593 | 594 | [convolutional] 595 | batch_normalize=1 596 | filters=181 597 | size=1 598 | stride=1 599 | pad=1 600 | activation=leaky 601 | 602 | [convolutional] 603 | batch_normalize=1 604 | filters=340 605 | size=3 606 | stride=1 607 | pad=1 608 | activation=leaky 609 | 610 | [convolutional] 611 | filters=45 612 | size=1 613 | stride=1 614 | pad=1 615 | activation=linear 616 | 617 | [yolo] 618 | mask = 6,7,8 619 | anchors = 10,13, 16,30, 
33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 620 | classes = 10 621 | num = 9 622 | jitter = .3 623 | ignore_thresh = .7 624 | truth_thresh = 1 625 | random = 1 626 | 627 | [route] 628 | layers=-4 629 | 630 | [convolutional] 631 | batch_normalize=1 632 | filters=80 633 | size=1 634 | stride=1 635 | pad=1 636 | activation=leaky 637 | 638 | [upsample] 639 | stride=2 640 | 641 | [route] 642 | layers=-1, 61 643 | 644 | [convolutional] 645 | batch_normalize=1 646 | filters=99 647 | size=1 648 | stride=1 649 | pad=1 650 | activation=leaky 651 | 652 | [convolutional] 653 | batch_normalize=1 654 | filters=180 655 | size=3 656 | stride=1 657 | pad=1 658 | activation=leaky 659 | 660 | [maxpool] 661 | stride=1 662 | size=5 663 | 664 | [route] 665 | layers=-2 666 | 667 | [maxpool] 668 | stride=1 669 | size=9 670 | 671 | [route] 672 | layers=-4 673 | 674 | [maxpool] 675 | stride=1 676 | size=13 677 | 678 | [route] 679 | layers=-1,-3,-5,-6 680 | 681 | [convolutional] 682 | batch_normalize=1 683 | filters=91 684 | size=1 685 | stride=1 686 | pad=1 687 | activation=leaky 688 | 689 | [convolutional] 690 | batch_normalize=1 691 | filters=169 692 | size=3 693 | stride=1 694 | pad=1 695 | activation=leaky 696 | 697 | [convolutional] 698 | batch_normalize=1 699 | filters=79 700 | size=1 701 | stride=1 702 | pad=1 703 | activation=leaky 704 | 705 | [convolutional] 706 | batch_normalize=1 707 | filters=211 708 | size=3 709 | stride=1 710 | pad=1 711 | activation=leaky 712 | 713 | [convolutional] 714 | filters=45 715 | size=1 716 | stride=1 717 | pad=1 718 | activation=linear 719 | 720 | [yolo] 721 | mask = 3,4,5 722 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 723 | classes = 10 724 | num = 9 725 | jitter = .3 726 | ignore_thresh = .7 727 | truth_thresh = 1 728 | random = 1 729 | 730 | [route] 731 | layers=-4 732 | 733 | [convolutional] 734 | batch_normalize=1 735 | filters=50 736 | size=1 737 | stride=1 738 | pad=1 739 | activation=leaky 740 | 741 | [upsample] 742 | stride=2 743 | 744 | [route] 745 | layers=-1, 36 746 | 747 | [convolutional] 748 | batch_normalize=1 749 | filters=47 750 | size=1 751 | stride=1 752 | pad=1 753 | activation=leaky 754 | 755 | [convolutional] 756 | batch_normalize=1 757 | filters=98 758 | size=3 759 | stride=1 760 | pad=1 761 | activation=leaky 762 | 763 | [convolutional] 764 | batch_normalize=1 765 | filters=34 766 | size=1 767 | stride=1 768 | pad=1 769 | activation=leaky 770 | 771 | [maxpool] 772 | stride=1 773 | size=5 774 | 775 | [route] 776 | layers=-2 777 | 778 | [maxpool] 779 | stride=1 780 | size=9 781 | 782 | [route] 783 | layers=-4 784 | 785 | [maxpool] 786 | stride=1 787 | size=13 788 | 789 | [route] 790 | layers=-1,-3,-5,-6 791 | 792 | [convolutional] 793 | batch_normalize=1 794 | filters=74 795 | size=3 796 | stride=1 797 | pad=1 798 | activation=leaky 799 | 800 | [convolutional] 801 | batch_normalize=1 802 | filters=47 803 | size=1 804 | stride=1 805 | pad=1 806 | activation=leaky 807 | 808 | [convolutional] 809 | batch_normalize=1 810 | filters=115 811 | size=3 812 | stride=1 813 | pad=1 814 | activation=leaky 815 | 816 | [convolutional] 817 | filters=45 818 | size=1 819 | stride=1 820 | pad=1 821 | activation=linear 822 | 823 | [yolo] 824 | mask = 0,1,2 825 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 826 | classes = 10 827 | num = 9 828 | jitter = .3 829 | ignore_thresh = .7 830 | truth_thresh = 1 831 | random = 1 832 | 833 | 
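The pruned configuration listed above (cfg/1iter-pruned/prune_0.5.cfg) differs from the unpruned yolov3-spp3.cfg only in its `filters=` values: each count is the number of channels that survived channel pruning at `--overall_ratio 0.5` with a `--perlayer_ratio 0.1` floor, as produced by prune.py in step 3 of the procedure. Below is a minimal illustrative sketch of that kind of selection rule, assuming the absolute batch-normalization scaling factors (|gamma|) of each prunable layer are already available as plain Python lists; it is not the repository's prune.py, just a hedged example of thresholding channels globally while keeping a per-layer minimum.

```python
# Illustrative sketch (NOT the repository's prune.py): choose a global threshold so
# that roughly `overall_ratio` of all channels are pruned, while every layer keeps
# at least `perlayer_ratio` of its channels. Plain lists of floats for clarity.

def channel_keep_masks(layer_gammas, overall_ratio=0.5, perlayer_ratio=0.1):
    """Return one boolean keep-mask per layer.

    layer_gammas   -- list of lists; |gamma| of each BN channel, per layer
    overall_ratio  -- fraction of channels to prune across the whole network
    perlayer_ratio -- minimum fraction of channels every layer must keep
    """
    # Global threshold: the value below which `overall_ratio` of all gammas fall.
    all_gammas = sorted(g for gammas in layer_gammas for g in gammas)
    cut_index = int(len(all_gammas) * overall_ratio)
    threshold = all_gammas[min(cut_index, len(all_gammas) - 1)]

    masks = []
    for gammas in layer_gammas:
        keep = [g > threshold for g in gammas]
        min_keep = max(1, int(len(gammas) * perlayer_ratio))
        if sum(keep) < min_keep:
            # The global threshold was too aggressive for this layer:
            # fall back to keeping its `min_keep` largest gammas.
            order = sorted(range(len(gammas)), key=lambda i: gammas[i], reverse=True)
            keep = [False] * len(gammas)
            for i in order[:min_keep]:
                keep[i] = True
        masks.append(keep)
    return masks


if __name__ == "__main__":
    # Toy example with three 32-channel layers of random |gamma| values.
    import random
    random.seed(0)
    gammas = [[abs(random.gauss(0, 1)) for _ in range(32)] for _ in range(3)]
    masks = channel_keep_masks(gammas, overall_ratio=0.5, perlayer_ratio=0.1)
    print([sum(m) for m in masks])  # surviving channel count per layer
```

Each layer's surviving channel count (for example, 26 of the original 32 in the first convolution of prune_0.5.cfg) is what gets written back into the pruned .cfg as its new `filters=` value.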
-------------------------------------------------------------------------------- /cfg/1iter-pruned/prune_0.9.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | #batch=1 4 | #subdivisions=1 5 | # Training 6 | batch=64 7 | subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=19 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [convolutional] 34 | batch_normalize=1 35 | filters=40 36 | size=3 37 | stride=2 38 | pad=1 39 | activation=leaky 40 | 41 | [convolutional] 42 | batch_normalize=1 43 | filters=10 44 | size=1 45 | stride=1 46 | pad=1 47 | activation=leaky 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=40 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [shortcut] 58 | from=-3 59 | activation=linear 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=93 64 | size=3 65 | stride=2 66 | pad=1 67 | activation=leaky 68 | 69 | [convolutional] 70 | batch_normalize=1 71 | filters=6 72 | size=1 73 | stride=1 74 | pad=1 75 | activation=leaky 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=93 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [shortcut] 86 | from=-3 87 | activation=linear 88 | 89 | [convolutional] 90 | batch_normalize=1 91 | filters=24 92 | size=1 93 | stride=1 94 | pad=1 95 | activation=leaky 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=93 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | [shortcut] 106 | from=-3 107 | activation=linear 108 | 109 | [convolutional] 110 | batch_normalize=1 111 | filters=231 112 | size=3 113 | stride=2 114 | pad=1 115 | activation=leaky 116 | 117 | [convolutional] 118 | batch_normalize=1 119 | filters=19 120 | size=1 121 | stride=1 122 | pad=1 123 | activation=leaky 124 | 125 | [convolutional] 126 | batch_normalize=1 127 | filters=231 128 | size=3 129 | stride=1 130 | pad=1 131 | activation=leaky 132 | 133 | [shortcut] 134 | from=-3 135 | activation=linear 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=33 140 | size=1 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=231 148 | size=3 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [shortcut] 154 | from=-3 155 | activation=linear 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=20 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=231 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [shortcut] 174 | from=-3 175 | activation=linear 176 | 177 | [convolutional] 178 | batch_normalize=1 179 | filters=19 180 | size=1 181 | stride=1 182 | pad=1 183 | activation=leaky 184 | 185 | [convolutional] 186 | batch_normalize=1 187 | filters=231 188 | size=3 189 | stride=1 190 | pad=1 191 | activation=leaky 192 | 193 | [shortcut] 194 | from=-3 195 | activation=linear 196 | 197 | [convolutional] 198 | batch_normalize=1 199 | filters=12 200 | size=1 201 | stride=1 202 | pad=1 203 | activation=leaky 204 | 205 | [convolutional] 206 | batch_normalize=1 207 | filters=231 208 | size=3 209 | stride=1 210 | pad=1 
211 | activation=leaky 212 | 213 | [shortcut] 214 | from=-3 215 | activation=linear 216 | 217 | [convolutional] 218 | batch_normalize=1 219 | filters=12 220 | size=1 221 | stride=1 222 | pad=1 223 | activation=leaky 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | filters=231 228 | size=3 229 | stride=1 230 | pad=1 231 | activation=leaky 232 | 233 | [shortcut] 234 | from=-3 235 | activation=linear 236 | 237 | [convolutional] 238 | batch_normalize=1 239 | filters=12 240 | size=1 241 | stride=1 242 | pad=1 243 | activation=leaky 244 | 245 | [convolutional] 246 | batch_normalize=1 247 | filters=231 248 | size=3 249 | stride=1 250 | pad=1 251 | activation=leaky 252 | 253 | [shortcut] 254 | from=-3 255 | activation=linear 256 | 257 | [convolutional] 258 | batch_normalize=1 259 | filters=12 260 | size=1 261 | stride=1 262 | pad=1 263 | activation=leaky 264 | 265 | [convolutional] 266 | batch_normalize=1 267 | filters=231 268 | size=3 269 | stride=1 270 | pad=1 271 | activation=leaky 272 | 273 | [shortcut] 274 | from=-3 275 | activation=linear 276 | 277 | [convolutional] 278 | batch_normalize=1 279 | filters=465 280 | size=3 281 | stride=2 282 | pad=1 283 | activation=leaky 284 | 285 | [convolutional] 286 | batch_normalize=1 287 | filters=51 288 | size=1 289 | stride=1 290 | pad=1 291 | activation=leaky 292 | 293 | [convolutional] 294 | batch_normalize=1 295 | filters=465 296 | size=3 297 | stride=1 298 | pad=1 299 | activation=leaky 300 | 301 | [shortcut] 302 | from=-3 303 | activation=linear 304 | 305 | [convolutional] 306 | batch_normalize=1 307 | filters=32 308 | size=1 309 | stride=1 310 | pad=1 311 | activation=leaky 312 | 313 | [convolutional] 314 | batch_normalize=1 315 | filters=465 316 | size=3 317 | stride=1 318 | pad=1 319 | activation=leaky 320 | 321 | [shortcut] 322 | from=-3 323 | activation=linear 324 | 325 | [convolutional] 326 | batch_normalize=1 327 | filters=25 328 | size=1 329 | stride=1 330 | pad=1 331 | activation=leaky 332 | 333 | [convolutional] 334 | batch_normalize=1 335 | filters=465 336 | size=3 337 | stride=1 338 | pad=1 339 | activation=leaky 340 | 341 | [shortcut] 342 | from=-3 343 | activation=linear 344 | 345 | [convolutional] 346 | batch_normalize=1 347 | filters=25 348 | size=1 349 | stride=1 350 | pad=1 351 | activation=leaky 352 | 353 | [convolutional] 354 | batch_normalize=1 355 | filters=465 356 | size=3 357 | stride=1 358 | pad=1 359 | activation=leaky 360 | 361 | [shortcut] 362 | from=-3 363 | activation=linear 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=25 368 | size=1 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [convolutional] 374 | batch_normalize=1 375 | filters=465 376 | size=3 377 | stride=1 378 | pad=1 379 | activation=leaky 380 | 381 | [shortcut] 382 | from=-3 383 | activation=linear 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=25 388 | size=1 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [convolutional] 394 | batch_normalize=1 395 | filters=465 396 | size=3 397 | stride=1 398 | pad=1 399 | activation=leaky 400 | 401 | [shortcut] 402 | from=-3 403 | activation=linear 404 | 405 | [convolutional] 406 | batch_normalize=1 407 | filters=25 408 | size=1 409 | stride=1 410 | pad=1 411 | activation=leaky 412 | 413 | [convolutional] 414 | batch_normalize=1 415 | filters=465 416 | size=3 417 | stride=1 418 | pad=1 419 | activation=leaky 420 | 421 | [shortcut] 422 | from=-3 423 | activation=linear 424 | 425 | [convolutional] 426 | batch_normalize=1 427 | filters=25 428 | 
size=1 429 | stride=1 430 | pad=1 431 | activation=leaky 432 | 433 | [convolutional] 434 | batch_normalize=1 435 | filters=465 436 | size=3 437 | stride=1 438 | pad=1 439 | activation=leaky 440 | 441 | [shortcut] 442 | from=-3 443 | activation=linear 444 | 445 | [convolutional] 446 | batch_normalize=1 447 | filters=789 448 | size=3 449 | stride=2 450 | pad=1 451 | activation=leaky 452 | 453 | [convolutional] 454 | batch_normalize=1 455 | filters=72 456 | size=1 457 | stride=1 458 | pad=1 459 | activation=leaky 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=789 464 | size=3 465 | stride=1 466 | pad=1 467 | activation=leaky 468 | 469 | [shortcut] 470 | from=-3 471 | activation=linear 472 | 473 | [convolutional] 474 | batch_normalize=1 475 | filters=51 476 | size=1 477 | stride=1 478 | pad=1 479 | activation=leaky 480 | 481 | [convolutional] 482 | batch_normalize=1 483 | filters=789 484 | size=3 485 | stride=1 486 | pad=1 487 | activation=leaky 488 | 489 | [shortcut] 490 | from=-3 491 | activation=linear 492 | 493 | [convolutional] 494 | batch_normalize=1 495 | filters=51 496 | size=1 497 | stride=1 498 | pad=1 499 | activation=leaky 500 | 501 | [convolutional] 502 | batch_normalize=1 503 | filters=789 504 | size=3 505 | stride=1 506 | pad=1 507 | activation=leaky 508 | 509 | [shortcut] 510 | from=-3 511 | activation=linear 512 | 513 | [convolutional] 514 | batch_normalize=1 515 | filters=51 516 | size=1 517 | stride=1 518 | pad=1 519 | activation=leaky 520 | 521 | [convolutional] 522 | batch_normalize=1 523 | filters=789 524 | size=3 525 | stride=1 526 | pad=1 527 | activation=leaky 528 | 529 | [shortcut] 530 | from=-3 531 | activation=linear 532 | 533 | [convolutional] 534 | batch_normalize=1 535 | filters=51 536 | size=1 537 | stride=1 538 | pad=1 539 | activation=leaky 540 | 541 | [convolutional] 542 | batch_normalize=1 543 | filters=102 544 | size=3 545 | stride=1 546 | pad=1 547 | activation=leaky 548 | 549 | [convolutional] 550 | batch_normalize=1 551 | filters=51 552 | size=1 553 | stride=1 554 | pad=1 555 | activation=leaky 556 | 557 | [maxpool] 558 | stride=1 559 | size=5 560 | 561 | [route] 562 | layers=-2 563 | 564 | [maxpool] 565 | stride=1 566 | size=9 567 | 568 | [route] 569 | layers=-4 570 | 571 | [maxpool] 572 | stride=1 573 | size=13 574 | 575 | [route] 576 | layers=-1,-3,-5,-6 577 | 578 | [convolutional] 579 | batch_normalize=1 580 | filters=51 581 | size=1 582 | stride=1 583 | pad=1 584 | activation=leaky 585 | 586 | [convolutional] 587 | batch_normalize=1 588 | filters=102 589 | size=3 590 | stride=1 591 | pad=1 592 | activation=leaky 593 | 594 | [convolutional] 595 | batch_normalize=1 596 | filters=51 597 | size=1 598 | stride=1 599 | pad=1 600 | activation=leaky 601 | 602 | [convolutional] 603 | batch_normalize=1 604 | filters=102 605 | size=3 606 | stride=1 607 | pad=1 608 | activation=leaky 609 | 610 | [convolutional] 611 | filters=45 612 | size=1 613 | stride=1 614 | pad=1 615 | activation=linear 616 | 617 | [yolo] 618 | mask = 6,7,8 619 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 620 | classes = 10 621 | num = 9 622 | jitter = .3 623 | ignore_thresh = .7 624 | truth_thresh = 1 625 | random = 1 626 | 627 | [route] 628 | layers=-4 629 | 630 | [convolutional] 631 | batch_normalize=1 632 | filters=25 633 | size=1 634 | stride=1 635 | pad=1 636 | activation=leaky 637 | 638 | [upsample] 639 | stride=2 640 | 641 | [route] 642 | layers=-1, 61 643 | 644 | [convolutional] 645 | batch_normalize=1 646 | filters=25 647 | 
size=1 648 | stride=1 649 | pad=1 650 | activation=leaky 651 | 652 | [convolutional] 653 | batch_normalize=1 654 | filters=51 655 | size=3 656 | stride=1 657 | pad=1 658 | activation=leaky 659 | 660 | [maxpool] 661 | stride=1 662 | size=5 663 | 664 | [route] 665 | layers=-2 666 | 667 | [maxpool] 668 | stride=1 669 | size=9 670 | 671 | [route] 672 | layers=-4 673 | 674 | [maxpool] 675 | stride=1 676 | size=13 677 | 678 | [route] 679 | layers=-1,-3,-5,-6 680 | 681 | [convolutional] 682 | batch_normalize=1 683 | filters=25 684 | size=1 685 | stride=1 686 | pad=1 687 | activation=leaky 688 | 689 | [convolutional] 690 | batch_normalize=1 691 | filters=51 692 | size=3 693 | stride=1 694 | pad=1 695 | activation=leaky 696 | 697 | [convolutional] 698 | batch_normalize=1 699 | filters=25 700 | size=1 701 | stride=1 702 | pad=1 703 | activation=leaky 704 | 705 | [convolutional] 706 | batch_normalize=1 707 | filters=51 708 | size=3 709 | stride=1 710 | pad=1 711 | activation=leaky 712 | 713 | [convolutional] 714 | filters=45 715 | size=1 716 | stride=1 717 | pad=1 718 | activation=linear 719 | 720 | [yolo] 721 | mask = 3,4,5 722 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 723 | classes = 10 724 | num = 9 725 | jitter = .3 726 | ignore_thresh = .7 727 | truth_thresh = 1 728 | random = 1 729 | 730 | [route] 731 | layers=-4 732 | 733 | [convolutional] 734 | batch_normalize=1 735 | filters=12 736 | size=1 737 | stride=1 738 | pad=1 739 | activation=leaky 740 | 741 | [upsample] 742 | stride=2 743 | 744 | [route] 745 | layers=-1, 36 746 | 747 | [convolutional] 748 | batch_normalize=1 749 | filters=12 750 | size=1 751 | stride=1 752 | pad=1 753 | activation=leaky 754 | 755 | [convolutional] 756 | batch_normalize=1 757 | filters=25 758 | size=3 759 | stride=1 760 | pad=1 761 | activation=leaky 762 | 763 | [convolutional] 764 | batch_normalize=1 765 | filters=12 766 | size=1 767 | stride=1 768 | pad=1 769 | activation=leaky 770 | 771 | [maxpool] 772 | stride=1 773 | size=5 774 | 775 | [route] 776 | layers=-2 777 | 778 | [maxpool] 779 | stride=1 780 | size=9 781 | 782 | [route] 783 | layers=-4 784 | 785 | [maxpool] 786 | stride=1 787 | size=13 788 | 789 | [route] 790 | layers=-1,-3,-5,-6 791 | 792 | [convolutional] 793 | batch_normalize=1 794 | filters=25 795 | size=3 796 | stride=1 797 | pad=1 798 | activation=leaky 799 | 800 | [convolutional] 801 | batch_normalize=1 802 | filters=12 803 | size=1 804 | stride=1 805 | pad=1 806 | activation=leaky 807 | 808 | [convolutional] 809 | batch_normalize=1 810 | filters=27 811 | size=3 812 | stride=1 813 | pad=1 814 | activation=leaky 815 | 816 | [convolutional] 817 | filters=45 818 | size=1 819 | stride=1 820 | pad=1 821 | activation=linear 822 | 823 | [yolo] 824 | mask = 0,1,2 825 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 826 | classes = 10 827 | num = 9 828 | jitter = .3 829 | ignore_thresh = .7 830 | truth_thresh = 1 831 | random = 1 832 | 833 | -------------------------------------------------------------------------------- /cfg/1iter-pruned/prune_0.95.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | #batch=1 4 | #subdivisions=1 5 | # Training 6 | batch=64 7 | subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | 
steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=16 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [convolutional] 34 | batch_normalize=1 35 | filters=31 36 | size=3 37 | stride=2 38 | pad=1 39 | activation=leaky 40 | 41 | [convolutional] 42 | batch_normalize=1 43 | filters=5 44 | size=1 45 | stride=1 46 | pad=1 47 | activation=leaky 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=31 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [shortcut] 58 | from=-3 59 | activation=linear 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=80 64 | size=3 65 | stride=2 66 | pad=1 67 | activation=leaky 68 | 69 | [convolutional] 70 | batch_normalize=1 71 | filters=6 72 | size=1 73 | stride=1 74 | pad=1 75 | activation=leaky 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=80 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [shortcut] 86 | from=-3 87 | activation=linear 88 | 89 | [convolutional] 90 | batch_normalize=1 91 | filters=12 92 | size=1 93 | stride=1 94 | pad=1 95 | activation=leaky 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=80 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | [shortcut] 106 | from=-3 107 | activation=linear 108 | 109 | [convolutional] 110 | batch_normalize=1 111 | filters=199 112 | size=3 113 | stride=2 114 | pad=1 115 | activation=leaky 116 | 117 | [convolutional] 118 | batch_normalize=1 119 | filters=12 120 | size=1 121 | stride=1 122 | pad=1 123 | activation=leaky 124 | 125 | [convolutional] 126 | batch_normalize=1 127 | filters=199 128 | size=3 129 | stride=1 130 | pad=1 131 | activation=leaky 132 | 133 | [shortcut] 134 | from=-3 135 | activation=linear 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=12 140 | size=1 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=199 148 | size=3 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [shortcut] 154 | from=-3 155 | activation=linear 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=12 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=199 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [shortcut] 174 | from=-3 175 | activation=linear 176 | 177 | [convolutional] 178 | batch_normalize=1 179 | filters=12 180 | size=1 181 | stride=1 182 | pad=1 183 | activation=leaky 184 | 185 | [convolutional] 186 | batch_normalize=1 187 | filters=199 188 | size=3 189 | stride=1 190 | pad=1 191 | activation=leaky 192 | 193 | [shortcut] 194 | from=-3 195 | activation=linear 196 | 197 | [convolutional] 198 | batch_normalize=1 199 | filters=12 200 | size=1 201 | stride=1 202 | pad=1 203 | activation=leaky 204 | 205 | [convolutional] 206 | batch_normalize=1 207 | filters=199 208 | size=3 209 | stride=1 210 | pad=1 211 | activation=leaky 212 | 213 | [shortcut] 214 | from=-3 215 | activation=linear 216 | 217 | [convolutional] 218 | batch_normalize=1 219 | filters=12 220 | size=1 221 | stride=1 222 | pad=1 223 | activation=leaky 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | filters=199 228 | size=3 229 | stride=1 230 | pad=1 231 | activation=leaky 232 | 233 | [shortcut] 234 | from=-3 235 | activation=linear 236 | 237 | [convolutional] 238 | batch_normalize=1 239 | filters=12 240 | size=1 241 | stride=1 242 | pad=1 243 | activation=leaky 244 
| 245 | [convolutional] 246 | batch_normalize=1 247 | filters=199 248 | size=3 249 | stride=1 250 | pad=1 251 | activation=leaky 252 | 253 | [shortcut] 254 | from=-3 255 | activation=linear 256 | 257 | [convolutional] 258 | batch_normalize=1 259 | filters=12 260 | size=1 261 | stride=1 262 | pad=1 263 | activation=leaky 264 | 265 | [convolutional] 266 | batch_normalize=1 267 | filters=199 268 | size=3 269 | stride=1 270 | pad=1 271 | activation=leaky 272 | 273 | [shortcut] 274 | from=-3 275 | activation=linear 276 | 277 | [convolutional] 278 | batch_normalize=1 279 | filters=405 280 | size=3 281 | stride=2 282 | pad=1 283 | activation=leaky 284 | 285 | [convolutional] 286 | batch_normalize=1 287 | filters=25 288 | size=1 289 | stride=1 290 | pad=1 291 | activation=leaky 292 | 293 | [convolutional] 294 | batch_normalize=1 295 | filters=405 296 | size=3 297 | stride=1 298 | pad=1 299 | activation=leaky 300 | 301 | [shortcut] 302 | from=-3 303 | activation=linear 304 | 305 | [convolutional] 306 | batch_normalize=1 307 | filters=25 308 | size=1 309 | stride=1 310 | pad=1 311 | activation=leaky 312 | 313 | [convolutional] 314 | batch_normalize=1 315 | filters=405 316 | size=3 317 | stride=1 318 | pad=1 319 | activation=leaky 320 | 321 | [shortcut] 322 | from=-3 323 | activation=linear 324 | 325 | [convolutional] 326 | batch_normalize=1 327 | filters=25 328 | size=1 329 | stride=1 330 | pad=1 331 | activation=leaky 332 | 333 | [convolutional] 334 | batch_normalize=1 335 | filters=405 336 | size=3 337 | stride=1 338 | pad=1 339 | activation=leaky 340 | 341 | [shortcut] 342 | from=-3 343 | activation=linear 344 | 345 | [convolutional] 346 | batch_normalize=1 347 | filters=25 348 | size=1 349 | stride=1 350 | pad=1 351 | activation=leaky 352 | 353 | [convolutional] 354 | batch_normalize=1 355 | filters=405 356 | size=3 357 | stride=1 358 | pad=1 359 | activation=leaky 360 | 361 | [shortcut] 362 | from=-3 363 | activation=linear 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=25 368 | size=1 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [convolutional] 374 | batch_normalize=1 375 | filters=405 376 | size=3 377 | stride=1 378 | pad=1 379 | activation=leaky 380 | 381 | [shortcut] 382 | from=-3 383 | activation=linear 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=25 388 | size=1 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [convolutional] 394 | batch_normalize=1 395 | filters=405 396 | size=3 397 | stride=1 398 | pad=1 399 | activation=leaky 400 | 401 | [shortcut] 402 | from=-3 403 | activation=linear 404 | 405 | [convolutional] 406 | batch_normalize=1 407 | filters=25 408 | size=1 409 | stride=1 410 | pad=1 411 | activation=leaky 412 | 413 | [convolutional] 414 | batch_normalize=1 415 | filters=405 416 | size=3 417 | stride=1 418 | pad=1 419 | activation=leaky 420 | 421 | [shortcut] 422 | from=-3 423 | activation=linear 424 | 425 | [convolutional] 426 | batch_normalize=1 427 | filters=25 428 | size=1 429 | stride=1 430 | pad=1 431 | activation=leaky 432 | 433 | [convolutional] 434 | batch_normalize=1 435 | filters=405 436 | size=3 437 | stride=1 438 | pad=1 439 | activation=leaky 440 | 441 | [shortcut] 442 | from=-3 443 | activation=linear 444 | 445 | [convolutional] 446 | batch_normalize=1 447 | filters=507 448 | size=3 449 | stride=2 450 | pad=1 451 | activation=leaky 452 | 453 | [convolutional] 454 | batch_normalize=1 455 | filters=51 456 | size=1 457 | stride=1 458 | pad=1 459 | activation=leaky 460 | 461 | [convolutional] 462 
| batch_normalize=1 463 | filters=507 464 | size=3 465 | stride=1 466 | pad=1 467 | activation=leaky 468 | 469 | [shortcut] 470 | from=-3 471 | activation=linear 472 | 473 | [convolutional] 474 | batch_normalize=1 475 | filters=51 476 | size=1 477 | stride=1 478 | pad=1 479 | activation=leaky 480 | 481 | [convolutional] 482 | batch_normalize=1 483 | filters=507 484 | size=3 485 | stride=1 486 | pad=1 487 | activation=leaky 488 | 489 | [shortcut] 490 | from=-3 491 | activation=linear 492 | 493 | [convolutional] 494 | batch_normalize=1 495 | filters=51 496 | size=1 497 | stride=1 498 | pad=1 499 | activation=leaky 500 | 501 | [convolutional] 502 | batch_normalize=1 503 | filters=507 504 | size=3 505 | stride=1 506 | pad=1 507 | activation=leaky 508 | 509 | [shortcut] 510 | from=-3 511 | activation=linear 512 | 513 | [convolutional] 514 | batch_normalize=1 515 | filters=51 516 | size=1 517 | stride=1 518 | pad=1 519 | activation=leaky 520 | 521 | [convolutional] 522 | batch_normalize=1 523 | filters=507 524 | size=3 525 | stride=1 526 | pad=1 527 | activation=leaky 528 | 529 | [shortcut] 530 | from=-3 531 | activation=linear 532 | 533 | [convolutional] 534 | batch_normalize=1 535 | filters=51 536 | size=1 537 | stride=1 538 | pad=1 539 | activation=leaky 540 | 541 | [convolutional] 542 | batch_normalize=1 543 | filters=102 544 | size=3 545 | stride=1 546 | pad=1 547 | activation=leaky 548 | 549 | [convolutional] 550 | batch_normalize=1 551 | filters=51 552 | size=1 553 | stride=1 554 | pad=1 555 | activation=leaky 556 | 557 | [maxpool] 558 | stride=1 559 | size=5 560 | 561 | [route] 562 | layers=-2 563 | 564 | [maxpool] 565 | stride=1 566 | size=9 567 | 568 | [route] 569 | layers=-4 570 | 571 | [maxpool] 572 | stride=1 573 | size=13 574 | 575 | [route] 576 | layers=-1,-3,-5,-6 577 | 578 | [convolutional] 579 | batch_normalize=1 580 | filters=51 581 | size=1 582 | stride=1 583 | pad=1 584 | activation=leaky 585 | 586 | [convolutional] 587 | batch_normalize=1 588 | filters=102 589 | size=3 590 | stride=1 591 | pad=1 592 | activation=leaky 593 | 594 | [convolutional] 595 | batch_normalize=1 596 | filters=51 597 | size=1 598 | stride=1 599 | pad=1 600 | activation=leaky 601 | 602 | [convolutional] 603 | batch_normalize=1 604 | filters=102 605 | size=3 606 | stride=1 607 | pad=1 608 | activation=leaky 609 | 610 | [convolutional] 611 | filters=45 612 | size=1 613 | stride=1 614 | pad=1 615 | activation=linear 616 | 617 | [yolo] 618 | mask = 6,7,8 619 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 620 | classes = 10 621 | num = 9 622 | jitter = .3 623 | ignore_thresh = .7 624 | truth_thresh = 1 625 | random = 1 626 | 627 | [route] 628 | layers=-4 629 | 630 | [convolutional] 631 | batch_normalize=1 632 | filters=25 633 | size=1 634 | stride=1 635 | pad=1 636 | activation=leaky 637 | 638 | [upsample] 639 | stride=2 640 | 641 | [route] 642 | layers=-1, 61 643 | 644 | [convolutional] 645 | batch_normalize=1 646 | filters=25 647 | size=1 648 | stride=1 649 | pad=1 650 | activation=leaky 651 | 652 | [convolutional] 653 | batch_normalize=1 654 | filters=51 655 | size=3 656 | stride=1 657 | pad=1 658 | activation=leaky 659 | 660 | [maxpool] 661 | stride=1 662 | size=5 663 | 664 | [route] 665 | layers=-2 666 | 667 | [maxpool] 668 | stride=1 669 | size=9 670 | 671 | [route] 672 | layers=-4 673 | 674 | [maxpool] 675 | stride=1 676 | size=13 677 | 678 | [route] 679 | layers=-1,-3,-5,-6 680 | 681 | [convolutional] 682 | batch_normalize=1 683 | filters=25 684 | size=1 685 | 
stride=1 686 | pad=1 687 | activation=leaky 688 | 689 | [convolutional] 690 | batch_normalize=1 691 | filters=51 692 | size=3 693 | stride=1 694 | pad=1 695 | activation=leaky 696 | 697 | [convolutional] 698 | batch_normalize=1 699 | filters=25 700 | size=1 701 | stride=1 702 | pad=1 703 | activation=leaky 704 | 705 | [convolutional] 706 | batch_normalize=1 707 | filters=51 708 | size=3 709 | stride=1 710 | pad=1 711 | activation=leaky 712 | 713 | [convolutional] 714 | filters=45 715 | size=1 716 | stride=1 717 | pad=1 718 | activation=linear 719 | 720 | [yolo] 721 | mask = 3,4,5 722 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 723 | classes = 10 724 | num = 9 725 | jitter = .3 726 | ignore_thresh = .7 727 | truth_thresh = 1 728 | random = 1 729 | 730 | [route] 731 | layers=-4 732 | 733 | [convolutional] 734 | batch_normalize=1 735 | filters=12 736 | size=1 737 | stride=1 738 | pad=1 739 | activation=leaky 740 | 741 | [upsample] 742 | stride=2 743 | 744 | [route] 745 | layers=-1, 36 746 | 747 | [convolutional] 748 | batch_normalize=1 749 | filters=12 750 | size=1 751 | stride=1 752 | pad=1 753 | activation=leaky 754 | 755 | [convolutional] 756 | batch_normalize=1 757 | filters=25 758 | size=3 759 | stride=1 760 | pad=1 761 | activation=leaky 762 | 763 | [convolutional] 764 | batch_normalize=1 765 | filters=12 766 | size=1 767 | stride=1 768 | pad=1 769 | activation=leaky 770 | 771 | [maxpool] 772 | stride=1 773 | size=5 774 | 775 | [route] 776 | layers=-2 777 | 778 | [maxpool] 779 | stride=1 780 | size=9 781 | 782 | [route] 783 | layers=-4 784 | 785 | [maxpool] 786 | stride=1 787 | size=13 788 | 789 | [route] 790 | layers=-1,-3,-5,-6 791 | 792 | [convolutional] 793 | batch_normalize=1 794 | filters=25 795 | size=3 796 | stride=1 797 | pad=1 798 | activation=leaky 799 | 800 | [convolutional] 801 | batch_normalize=1 802 | filters=12 803 | size=1 804 | stride=1 805 | pad=1 806 | activation=leaky 807 | 808 | [convolutional] 809 | batch_normalize=1 810 | filters=25 811 | size=3 812 | stride=1 813 | pad=1 814 | activation=leaky 815 | 816 | [convolutional] 817 | filters=45 818 | size=1 819 | stride=1 820 | pad=1 821 | activation=linear 822 | 823 | [yolo] 824 | mask = 0,1,2 825 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 826 | classes = 10 827 | num = 9 828 | jitter = .3 829 | ignore_thresh = .7 830 | truth_thresh = 1 831 | random = 1 832 | 833 | -------------------------------------------------------------------------------- /cfg/2iter-pruned/prune_0.5_0.5.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=16 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 120200 21 | policy=steps 22 | steps=70000,100000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=17 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [convolutional] 34 | batch_normalize=1 35 | filters=50 36 | size=3 37 | stride=2 38 | pad=1 39 | activation=leaky 40 | 41 | [convolutional] 42 | batch_normalize=1 43 | filters=25 44 | size=1 45 | stride=1 46 | pad=1 47 | activation=leaky 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=50 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | 
[shortcut] 58 | from=-3 59 | activation=linear 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=122 64 | size=3 65 | stride=2 66 | pad=1 67 | activation=leaky 68 | 69 | [convolutional] 70 | batch_normalize=1 71 | filters=20 72 | size=1 73 | stride=1 74 | pad=1 75 | activation=leaky 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=122 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [shortcut] 86 | from=-3 87 | activation=linear 88 | 89 | [convolutional] 90 | batch_normalize=1 91 | filters=44 92 | size=1 93 | stride=1 94 | pad=1 95 | activation=leaky 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=122 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | [shortcut] 106 | from=-3 107 | activation=linear 108 | 109 | [convolutional] 110 | batch_normalize=1 111 | filters=256 112 | size=3 113 | stride=2 114 | pad=1 115 | activation=leaky 116 | 117 | [convolutional] 118 | batch_normalize=1 119 | filters=54 120 | size=1 121 | stride=1 122 | pad=1 123 | activation=leaky 124 | 125 | [convolutional] 126 | batch_normalize=1 127 | filters=256 128 | size=3 129 | stride=1 130 | pad=1 131 | activation=leaky 132 | 133 | [shortcut] 134 | from=-3 135 | activation=linear 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=100 140 | size=1 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=256 148 | size=3 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [shortcut] 154 | from=-3 155 | activation=linear 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=92 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=256 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [shortcut] 174 | from=-3 175 | activation=linear 176 | 177 | [convolutional] 178 | batch_normalize=1 179 | filters=111 180 | size=1 181 | stride=1 182 | pad=1 183 | activation=leaky 184 | 185 | [convolutional] 186 | batch_normalize=1 187 | filters=256 188 | size=3 189 | stride=1 190 | pad=1 191 | activation=leaky 192 | 193 | [shortcut] 194 | from=-3 195 | activation=linear 196 | 197 | [convolutional] 198 | batch_normalize=1 199 | filters=95 200 | size=1 201 | stride=1 202 | pad=1 203 | activation=leaky 204 | 205 | [convolutional] 206 | batch_normalize=1 207 | filters=256 208 | size=3 209 | stride=1 210 | pad=1 211 | activation=leaky 212 | 213 | [shortcut] 214 | from=-3 215 | activation=linear 216 | 217 | [convolutional] 218 | batch_normalize=1 219 | filters=91 220 | size=1 221 | stride=1 222 | pad=1 223 | activation=leaky 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | filters=256 228 | size=3 229 | stride=1 230 | pad=1 231 | activation=leaky 232 | 233 | [shortcut] 234 | from=-3 235 | activation=linear 236 | 237 | [convolutional] 238 | batch_normalize=1 239 | filters=80 240 | size=1 241 | stride=1 242 | pad=1 243 | activation=leaky 244 | 245 | [convolutional] 246 | batch_normalize=1 247 | filters=256 248 | size=3 249 | stride=1 250 | pad=1 251 | activation=leaky 252 | 253 | [shortcut] 254 | from=-3 255 | activation=linear 256 | 257 | [convolutional] 258 | batch_normalize=1 259 | filters=74 260 | size=1 261 | stride=1 262 | pad=1 263 | activation=leaky 264 | 265 | [convolutional] 266 | batch_normalize=1 267 | filters=256 268 | size=3 269 | stride=1 270 | pad=1 271 | activation=leaky 272 | 273 | [shortcut] 274 | from=-3 275 | activation=linear 276 | 277 | 
[convolutional] 278 | batch_normalize=1 279 | filters=509 280 | size=3 281 | stride=2 282 | pad=1 283 | activation=leaky 284 | 285 | [convolutional] 286 | batch_normalize=1 287 | filters=226 288 | size=1 289 | stride=1 290 | pad=1 291 | activation=leaky 292 | 293 | [convolutional] 294 | batch_normalize=1 295 | filters=509 296 | size=3 297 | stride=1 298 | pad=1 299 | activation=leaky 300 | 301 | [shortcut] 302 | from=-3 303 | activation=linear 304 | 305 | [convolutional] 306 | batch_normalize=1 307 | filters=187 308 | size=1 309 | stride=1 310 | pad=1 311 | activation=leaky 312 | 313 | [convolutional] 314 | batch_normalize=1 315 | filters=509 316 | size=3 317 | stride=1 318 | pad=1 319 | activation=leaky 320 | 321 | [shortcut] 322 | from=-3 323 | activation=linear 324 | 325 | [convolutional] 326 | batch_normalize=1 327 | filters=130 328 | size=1 329 | stride=1 330 | pad=1 331 | activation=leaky 332 | 333 | [convolutional] 334 | batch_normalize=1 335 | filters=509 336 | size=3 337 | stride=1 338 | pad=1 339 | activation=leaky 340 | 341 | [shortcut] 342 | from=-3 343 | activation=linear 344 | 345 | [convolutional] 346 | batch_normalize=1 347 | filters=124 348 | size=1 349 | stride=1 350 | pad=1 351 | activation=leaky 352 | 353 | [convolutional] 354 | batch_normalize=1 355 | filters=509 356 | size=3 357 | stride=1 358 | pad=1 359 | activation=leaky 360 | 361 | [shortcut] 362 | from=-3 363 | activation=linear 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=128 368 | size=1 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [convolutional] 374 | batch_normalize=1 375 | filters=509 376 | size=3 377 | stride=1 378 | pad=1 379 | activation=leaky 380 | 381 | [shortcut] 382 | from=-3 383 | activation=linear 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=104 388 | size=1 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [convolutional] 394 | batch_normalize=1 395 | filters=509 396 | size=3 397 | stride=1 398 | pad=1 399 | activation=leaky 400 | 401 | [shortcut] 402 | from=-3 403 | activation=linear 404 | 405 | [convolutional] 406 | batch_normalize=1 407 | filters=210 408 | size=1 409 | stride=1 410 | pad=1 411 | activation=leaky 412 | 413 | [convolutional] 414 | batch_normalize=1 415 | filters=509 416 | size=3 417 | stride=1 418 | pad=1 419 | activation=leaky 420 | 421 | [shortcut] 422 | from=-3 423 | activation=linear 424 | 425 | [convolutional] 426 | batch_normalize=1 427 | filters=68 428 | size=1 429 | stride=1 430 | pad=1 431 | activation=leaky 432 | 433 | [convolutional] 434 | batch_normalize=1 435 | filters=509 436 | size=3 437 | stride=1 438 | pad=1 439 | activation=leaky 440 | 441 | [shortcut] 442 | from=-3 443 | activation=linear 444 | 445 | [convolutional] 446 | batch_normalize=1 447 | filters=1012 448 | size=3 449 | stride=2 450 | pad=1 451 | activation=leaky 452 | 453 | [convolutional] 454 | batch_normalize=1 455 | filters=280 456 | size=1 457 | stride=1 458 | pad=1 459 | activation=leaky 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1012 464 | size=3 465 | stride=1 466 | pad=1 467 | activation=leaky 468 | 469 | [shortcut] 470 | from=-3 471 | activation=linear 472 | 473 | [convolutional] 474 | batch_normalize=1 475 | filters=67 476 | size=1 477 | stride=1 478 | pad=1 479 | activation=leaky 480 | 481 | [convolutional] 482 | batch_normalize=1 483 | filters=1012 484 | size=3 485 | stride=1 486 | pad=1 487 | activation=leaky 488 | 489 | [shortcut] 490 | from=-3 491 | activation=linear 492 | 493 | [convolutional] 
494 | batch_normalize=1 495 | filters=84 496 | size=1 497 | stride=1 498 | pad=1 499 | activation=leaky 500 | 501 | [convolutional] 502 | batch_normalize=1 503 | filters=1012 504 | size=3 505 | stride=1 506 | pad=1 507 | activation=leaky 508 | 509 | [shortcut] 510 | from=-3 511 | activation=linear 512 | 513 | [convolutional] 514 | batch_normalize=1 515 | filters=108 516 | size=1 517 | stride=1 518 | pad=1 519 | activation=leaky 520 | 521 | [convolutional] 522 | batch_normalize=1 523 | filters=1012 524 | size=3 525 | stride=1 526 | pad=1 527 | activation=leaky 528 | 529 | [shortcut] 530 | from=-3 531 | activation=linear 532 | 533 | [convolutional] 534 | batch_normalize=1 535 | filters=118 536 | size=1 537 | stride=1 538 | pad=1 539 | activation=leaky 540 | 541 | [convolutional] 542 | batch_normalize=1 543 | filters=218 544 | size=3 545 | stride=1 546 | pad=1 547 | activation=leaky 548 | 549 | [convolutional] 550 | batch_normalize=1 551 | filters=104 552 | size=1 553 | stride=1 554 | pad=1 555 | activation=leaky 556 | 557 | [maxpool] 558 | stride=1 559 | size=5 560 | 561 | [route] 562 | layers=-2 563 | 564 | [maxpool] 565 | stride=1 566 | size=9 567 | 568 | [route] 569 | layers=-4 570 | 571 | [maxpool] 572 | stride=1 573 | size=13 574 | 575 | [route] 576 | layers=-1,-3,-5,-6 577 | 578 | [convolutional] 579 | batch_normalize=1 580 | filters=96 581 | size=1 582 | stride=1 583 | pad=1 584 | activation=leaky 585 | 586 | [convolutional] 587 | batch_normalize=1 588 | filters=201 589 | size=3 590 | stride=1 591 | pad=1 592 | activation=leaky 593 | 594 | [convolutional] 595 | batch_normalize=1 596 | filters=105 597 | size=1 598 | stride=1 599 | pad=1 600 | activation=leaky 601 | 602 | [convolutional] 603 | batch_normalize=1 604 | filters=244 605 | size=3 606 | stride=1 607 | pad=1 608 | activation=leaky 609 | 610 | [convolutional] 611 | filters=45 612 | size=1 613 | stride=1 614 | pad=1 615 | activation=linear 616 | 617 | [yolo] 618 | mask = 6,7,8 619 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 620 | classes = 10 621 | num = 9 622 | jitter = .3 623 | ignore_thresh = .7 624 | truth_thresh = 1 625 | random = 1 626 | 627 | [route] 628 | layers=-4 629 | 630 | [convolutional] 631 | batch_normalize=1 632 | filters=80 633 | size=1 634 | stride=1 635 | pad=1 636 | activation=leaky 637 | 638 | [upsample] 639 | stride=2 640 | 641 | [route] 642 | layers=-1, 61 643 | 644 | [convolutional] 645 | batch_normalize=1 646 | filters=79 647 | size=1 648 | stride=1 649 | pad=1 650 | activation=leaky 651 | 652 | [convolutional] 653 | batch_normalize=1 654 | filters=107 655 | size=3 656 | stride=1 657 | pad=1 658 | activation=leaky 659 | 660 | [maxpool] 661 | stride=1 662 | size=5 663 | 664 | [route] 665 | layers=-2 666 | 667 | [maxpool] 668 | stride=1 669 | size=9 670 | 671 | [route] 672 | layers=-4 673 | 674 | [maxpool] 675 | stride=1 676 | size=13 677 | 678 | [route] 679 | layers=-1,-3,-5,-6 680 | 681 | [convolutional] 682 | batch_normalize=1 683 | filters=72 684 | size=1 685 | stride=1 686 | pad=1 687 | activation=leaky 688 | 689 | [convolutional] 690 | batch_normalize=1 691 | filters=106 692 | size=3 693 | stride=1 694 | pad=1 695 | activation=leaky 696 | 697 | [convolutional] 698 | batch_normalize=1 699 | filters=58 700 | size=1 701 | stride=1 702 | pad=1 703 | activation=leaky 704 | 705 | [convolutional] 706 | batch_normalize=1 707 | filters=193 708 | size=3 709 | stride=1 710 | pad=1 711 | activation=leaky 712 | 713 | [convolutional] 714 | filters=45 715 | size=1 716 | stride=1 
717 | pad=1 718 | activation=linear 719 | 720 | [yolo] 721 | mask = 3,4,5 722 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 723 | classes = 10 724 | num = 9 725 | jitter = .3 726 | ignore_thresh = .7 727 | truth_thresh = 1 728 | random = 1 729 | 730 | [route] 731 | layers=-4 732 | 733 | [convolutional] 734 | batch_normalize=1 735 | filters=50 736 | size=1 737 | stride=1 738 | pad=1 739 | activation=leaky 740 | 741 | [upsample] 742 | stride=2 743 | 744 | [route] 745 | layers=-1, 36 746 | 747 | [convolutional] 748 | batch_normalize=1 749 | filters=40 750 | size=1 751 | stride=1 752 | pad=1 753 | activation=leaky 754 | 755 | [convolutional] 756 | batch_normalize=1 757 | filters=72 758 | size=3 759 | stride=1 760 | pad=1 761 | activation=leaky 762 | 763 | [convolutional] 764 | batch_normalize=1 765 | filters=23 766 | size=1 767 | stride=1 768 | pad=1 769 | activation=leaky 770 | 771 | [maxpool] 772 | stride=1 773 | size=5 774 | 775 | [route] 776 | layers=-2 777 | 778 | [maxpool] 779 | stride=1 780 | size=9 781 | 782 | [route] 783 | layers=-4 784 | 785 | [maxpool] 786 | stride=1 787 | size=13 788 | 789 | [route] 790 | layers=-1,-3,-5,-6 791 | 792 | [convolutional] 793 | batch_normalize=1 794 | filters=64 795 | size=3 796 | stride=1 797 | pad=1 798 | activation=leaky 799 | 800 | [convolutional] 801 | batch_normalize=1 802 | filters=44 803 | size=1 804 | stride=1 805 | pad=1 806 | activation=leaky 807 | 808 | [convolutional] 809 | batch_normalize=1 810 | filters=82 811 | size=3 812 | stride=1 813 | pad=1 814 | activation=leaky 815 | 816 | [convolutional] 817 | filters=45 818 | size=1 819 | stride=1 820 | pad=1 821 | activation=linear 822 | 823 | [yolo] 824 | mask = 0,1,2 825 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 826 | classes = 10 827 | num = 9 828 | jitter = .3 829 | ignore_thresh = .7 830 | truth_thresh = 1 831 | random = 1 832 | 833 | -------------------------------------------------------------------------------- /cfg/3iter-pruned/prune_0.5_0.5_0.7.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | batch=1 4 | subdivisions=1 5 | # Training 6 | # batch=64 7 | # subdivisions=16 8 | width=608 9 | height=608 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 60200 21 | policy=steps 22 | steps=35000,50000 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=9 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | [convolutional] 34 | batch_normalize=1 35 | filters=50 36 | size=3 37 | stride=2 38 | pad=1 39 | activation=leaky 40 | 41 | [convolutional] 42 | batch_normalize=1 43 | filters=25 44 | size=1 45 | stride=1 46 | pad=1 47 | activation=leaky 48 | 49 | [convolutional] 50 | batch_normalize=1 51 | filters=50 52 | size=3 53 | stride=1 54 | pad=1 55 | activation=leaky 56 | 57 | [shortcut] 58 | from=-3 59 | activation=linear 60 | 61 | [convolutional] 62 | batch_normalize=1 63 | filters=117 64 | size=3 65 | stride=2 66 | pad=1 67 | activation=leaky 68 | 69 | [convolutional] 70 | batch_normalize=1 71 | filters=16 72 | size=1 73 | stride=1 74 | pad=1 75 | activation=leaky 76 | 77 | [convolutional] 78 | batch_normalize=1 79 | filters=117 80 | size=3 81 | stride=1 82 | pad=1 83 | activation=leaky 84 | 85 | [shortcut] 86 | from=-3 87 | activation=linear 88 | 89 | [convolutional] 90 | 
batch_normalize=1 91 | filters=40 92 | size=1 93 | stride=1 94 | pad=1 95 | activation=leaky 96 | 97 | [convolutional] 98 | batch_normalize=1 99 | filters=117 100 | size=3 101 | stride=1 102 | pad=1 103 | activation=leaky 104 | 105 | [shortcut] 106 | from=-3 107 | activation=linear 108 | 109 | [convolutional] 110 | batch_normalize=1 111 | filters=244 112 | size=3 113 | stride=2 114 | pad=1 115 | activation=leaky 116 | 117 | [convolutional] 118 | batch_normalize=1 119 | filters=43 120 | size=1 121 | stride=1 122 | pad=1 123 | activation=leaky 124 | 125 | [convolutional] 126 | batch_normalize=1 127 | filters=244 128 | size=3 129 | stride=1 130 | pad=1 131 | activation=leaky 132 | 133 | [shortcut] 134 | from=-3 135 | activation=linear 136 | 137 | [convolutional] 138 | batch_normalize=1 139 | filters=71 140 | size=1 141 | stride=1 142 | pad=1 143 | activation=leaky 144 | 145 | [convolutional] 146 | batch_normalize=1 147 | filters=244 148 | size=3 149 | stride=1 150 | pad=1 151 | activation=leaky 152 | 153 | [shortcut] 154 | from=-3 155 | activation=linear 156 | 157 | [convolutional] 158 | batch_normalize=1 159 | filters=74 160 | size=1 161 | stride=1 162 | pad=1 163 | activation=leaky 164 | 165 | [convolutional] 166 | batch_normalize=1 167 | filters=244 168 | size=3 169 | stride=1 170 | pad=1 171 | activation=leaky 172 | 173 | [shortcut] 174 | from=-3 175 | activation=linear 176 | 177 | [convolutional] 178 | batch_normalize=1 179 | filters=63 180 | size=1 181 | stride=1 182 | pad=1 183 | activation=leaky 184 | 185 | [convolutional] 186 | batch_normalize=1 187 | filters=244 188 | size=3 189 | stride=1 190 | pad=1 191 | activation=leaky 192 | 193 | [shortcut] 194 | from=-3 195 | activation=linear 196 | 197 | [convolutional] 198 | batch_normalize=1 199 | filters=48 200 | size=1 201 | stride=1 202 | pad=1 203 | activation=leaky 204 | 205 | [convolutional] 206 | batch_normalize=1 207 | filters=244 208 | size=3 209 | stride=1 210 | pad=1 211 | activation=leaky 212 | 213 | [shortcut] 214 | from=-3 215 | activation=linear 216 | 217 | [convolutional] 218 | batch_normalize=1 219 | filters=56 220 | size=1 221 | stride=1 222 | pad=1 223 | activation=leaky 224 | 225 | [convolutional] 226 | batch_normalize=1 227 | filters=244 228 | size=3 229 | stride=1 230 | pad=1 231 | activation=leaky 232 | 233 | [shortcut] 234 | from=-3 235 | activation=linear 236 | 237 | [convolutional] 238 | batch_normalize=1 239 | filters=60 240 | size=1 241 | stride=1 242 | pad=1 243 | activation=leaky 244 | 245 | [convolutional] 246 | batch_normalize=1 247 | filters=244 248 | size=3 249 | stride=1 250 | pad=1 251 | activation=leaky 252 | 253 | [shortcut] 254 | from=-3 255 | activation=linear 256 | 257 | [convolutional] 258 | batch_normalize=1 259 | filters=43 260 | size=1 261 | stride=1 262 | pad=1 263 | activation=leaky 264 | 265 | [convolutional] 266 | batch_normalize=1 267 | filters=244 268 | size=3 269 | stride=1 270 | pad=1 271 | activation=leaky 272 | 273 | [shortcut] 274 | from=-3 275 | activation=linear 276 | 277 | [convolutional] 278 | batch_normalize=1 279 | filters=457 280 | size=3 281 | stride=2 282 | pad=1 283 | activation=leaky 284 | 285 | [convolutional] 286 | batch_normalize=1 287 | filters=89 288 | size=1 289 | stride=1 290 | pad=1 291 | activation=leaky 292 | 293 | [convolutional] 294 | batch_normalize=1 295 | filters=457 296 | size=3 297 | stride=1 298 | pad=1 299 | activation=leaky 300 | 301 | [shortcut] 302 | from=-3 303 | activation=linear 304 | 305 | [convolutional] 306 | batch_normalize=1 307 | filters=74 
308 | size=1 309 | stride=1 310 | pad=1 311 | activation=leaky 312 | 313 | [convolutional] 314 | batch_normalize=1 315 | filters=457 316 | size=3 317 | stride=1 318 | pad=1 319 | activation=leaky 320 | 321 | [shortcut] 322 | from=-3 323 | activation=linear 324 | 325 | [convolutional] 326 | batch_normalize=1 327 | filters=69 328 | size=1 329 | stride=1 330 | pad=1 331 | activation=leaky 332 | 333 | [convolutional] 334 | batch_normalize=1 335 | filters=457 336 | size=3 337 | stride=1 338 | pad=1 339 | activation=leaky 340 | 341 | [shortcut] 342 | from=-3 343 | activation=linear 344 | 345 | [convolutional] 346 | batch_normalize=1 347 | filters=87 348 | size=1 349 | stride=1 350 | pad=1 351 | activation=leaky 352 | 353 | [convolutional] 354 | batch_normalize=1 355 | filters=457 356 | size=3 357 | stride=1 358 | pad=1 359 | activation=leaky 360 | 361 | [shortcut] 362 | from=-3 363 | activation=linear 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=72 368 | size=1 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [convolutional] 374 | batch_normalize=1 375 | filters=457 376 | size=3 377 | stride=1 378 | pad=1 379 | activation=leaky 380 | 381 | [shortcut] 382 | from=-3 383 | activation=linear 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=63 388 | size=1 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [convolutional] 394 | batch_normalize=1 395 | filters=457 396 | size=3 397 | stride=1 398 | pad=1 399 | activation=leaky 400 | 401 | [shortcut] 402 | from=-3 403 | activation=linear 404 | 405 | [convolutional] 406 | batch_normalize=1 407 | filters=38 408 | size=1 409 | stride=1 410 | pad=1 411 | activation=leaky 412 | 413 | [convolutional] 414 | batch_normalize=1 415 | filters=457 416 | size=3 417 | stride=1 418 | pad=1 419 | activation=leaky 420 | 421 | [shortcut] 422 | from=-3 423 | activation=linear 424 | 425 | [convolutional] 426 | batch_normalize=1 427 | filters=65 428 | size=1 429 | stride=1 430 | pad=1 431 | activation=leaky 432 | 433 | [convolutional] 434 | batch_normalize=1 435 | filters=457 436 | size=3 437 | stride=1 438 | pad=1 439 | activation=leaky 440 | 441 | [shortcut] 442 | from=-3 443 | activation=linear 444 | 445 | [convolutional] 446 | batch_normalize=1 447 | filters=864 448 | size=3 449 | stride=2 450 | pad=1 451 | activation=leaky 452 | 453 | [convolutional] 454 | batch_normalize=1 455 | filters=91 456 | size=1 457 | stride=1 458 | pad=1 459 | activation=leaky 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=864 464 | size=3 465 | stride=1 466 | pad=1 467 | activation=leaky 468 | 469 | [shortcut] 470 | from=-3 471 | activation=linear 472 | 473 | [convolutional] 474 | batch_normalize=1 475 | filters=53 476 | size=1 477 | stride=1 478 | pad=1 479 | activation=leaky 480 | 481 | [convolutional] 482 | batch_normalize=1 483 | filters=864 484 | size=3 485 | stride=1 486 | pad=1 487 | activation=leaky 488 | 489 | [shortcut] 490 | from=-3 491 | activation=linear 492 | 493 | [convolutional] 494 | batch_normalize=1 495 | filters=52 496 | size=1 497 | stride=1 498 | pad=1 499 | activation=leaky 500 | 501 | [convolutional] 502 | batch_normalize=1 503 | filters=864 504 | size=3 505 | stride=1 506 | pad=1 507 | activation=leaky 508 | 509 | [shortcut] 510 | from=-3 511 | activation=linear 512 | 513 | [convolutional] 514 | batch_normalize=1 515 | filters=63 516 | size=1 517 | stride=1 518 | pad=1 519 | activation=leaky 520 | 521 | [convolutional] 522 | batch_normalize=1 523 | filters=864 524 | size=3 525 | stride=1 
526 | pad=1 527 | activation=leaky 528 | 529 | [shortcut] 530 | from=-3 531 | activation=linear 532 | 533 | [convolutional] 534 | batch_normalize=1 535 | filters=11 536 | size=1 537 | stride=1 538 | pad=1 539 | activation=leaky 540 | 541 | [convolutional] 542 | batch_normalize=1 543 | filters=33 544 | size=3 545 | stride=1 546 | pad=1 547 | activation=leaky 548 | 549 | [convolutional] 550 | batch_normalize=1 551 | filters=12 552 | size=1 553 | stride=1 554 | pad=1 555 | activation=leaky 556 | 557 | [maxpool] 558 | stride=1 559 | size=5 560 | 561 | [route] 562 | layers=-2 563 | 564 | [maxpool] 565 | stride=1 566 | size=9 567 | 568 | [route] 569 | layers=-4 570 | 571 | [maxpool] 572 | stride=1 573 | size=13 574 | 575 | [route] 576 | layers=-1,-3,-5,-6 577 | 578 | [convolutional] 579 | batch_normalize=1 580 | filters=18 581 | size=1 582 | stride=1 583 | pad=1 584 | activation=leaky 585 | 586 | [convolutional] 587 | batch_normalize=1 588 | filters=20 589 | size=3 590 | stride=1 591 | pad=1 592 | activation=leaky 593 | 594 | [convolutional] 595 | batch_normalize=1 596 | filters=12 597 | size=1 598 | stride=1 599 | pad=1 600 | activation=leaky 601 | 602 | [convolutional] 603 | batch_normalize=1 604 | filters=120 605 | size=3 606 | stride=1 607 | pad=1 608 | activation=leaky 609 | 610 | [convolutional] 611 | filters=45 612 | size=1 613 | stride=1 614 | pad=1 615 | activation=linear 616 | 617 | [yolo] 618 | mask = 6,7,8 619 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 620 | classes = 10 621 | num = 9 622 | jitter = .3 623 | ignore_thresh = .7 624 | truth_thresh = 1 625 | random = 1 626 | 627 | [route] 628 | layers=-4 629 | 630 | [convolutional] 631 | batch_normalize=1 632 | filters=72 633 | size=1 634 | stride=1 635 | pad=1 636 | activation=leaky 637 | 638 | [upsample] 639 | stride=2 640 | 641 | [route] 642 | layers=-1, 61 643 | 644 | [convolutional] 645 | batch_normalize=1 646 | filters=43 647 | size=1 648 | stride=1 649 | pad=1 650 | activation=leaky 651 | 652 | [convolutional] 653 | batch_normalize=1 654 | filters=18 655 | size=3 656 | stride=1 657 | pad=1 658 | activation=leaky 659 | 660 | [maxpool] 661 | stride=1 662 | size=5 663 | 664 | [route] 665 | layers=-2 666 | 667 | [maxpool] 668 | stride=1 669 | size=9 670 | 671 | [route] 672 | layers=-4 673 | 674 | [maxpool] 675 | stride=1 676 | size=13 677 | 678 | [route] 679 | layers=-1,-3,-5,-6 680 | 681 | [convolutional] 682 | batch_normalize=1 683 | filters=42 684 | size=1 685 | stride=1 686 | pad=1 687 | activation=leaky 688 | 689 | [convolutional] 690 | batch_normalize=1 691 | filters=54 692 | size=3 693 | stride=1 694 | pad=1 695 | activation=leaky 696 | 697 | [convolutional] 698 | batch_normalize=1 699 | filters=23 700 | size=1 701 | stride=1 702 | pad=1 703 | activation=leaky 704 | 705 | [convolutional] 706 | batch_normalize=1 707 | filters=157 708 | size=3 709 | stride=1 710 | pad=1 711 | activation=leaky 712 | 713 | [convolutional] 714 | filters=45 715 | size=1 716 | stride=1 717 | pad=1 718 | activation=linear 719 | 720 | [yolo] 721 | mask = 3,4,5 722 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 723 | classes = 10 724 | num = 9 725 | jitter = .3 726 | ignore_thresh = .7 727 | truth_thresh = 1 728 | random = 1 729 | 730 | [route] 731 | layers=-4 732 | 733 | [convolutional] 734 | batch_normalize=1 735 | filters=50 736 | size=1 737 | stride=1 738 | pad=1 739 | activation=leaky 740 | 741 | [upsample] 742 | stride=2 743 | 744 | [route] 745 | layers=-1, 36 746 | 747 | 
[convolutional] 748 | batch_normalize=1 749 | filters=33 750 | size=1 751 | stride=1 752 | pad=1 753 | activation=leaky 754 | 755 | [convolutional] 756 | batch_normalize=1 757 | filters=57 758 | size=3 759 | stride=1 760 | pad=1 761 | activation=leaky 762 | 763 | [convolutional] 764 | batch_normalize=1 765 | filters=15 766 | size=1 767 | stride=1 768 | pad=1 769 | activation=leaky 770 | 771 | [maxpool] 772 | stride=1 773 | size=5 774 | 775 | [route] 776 | layers=-2 777 | 778 | [maxpool] 779 | stride=1 780 | size=9 781 | 782 | [route] 783 | layers=-4 784 | 785 | [maxpool] 786 | stride=1 787 | size=13 788 | 789 | [route] 790 | layers=-1,-3,-5,-6 791 | 792 | [convolutional] 793 | batch_normalize=1 794 | filters=44 795 | size=3 796 | stride=1 797 | pad=1 798 | activation=leaky 799 | 800 | [convolutional] 801 | batch_normalize=1 802 | filters=41 803 | size=1 804 | stride=1 805 | pad=1 806 | activation=leaky 807 | 808 | [convolutional] 809 | batch_normalize=1 810 | filters=63 811 | size=3 812 | stride=1 813 | pad=1 814 | activation=leaky 815 | 816 | [convolutional] 817 | filters=45 818 | size=1 819 | stride=1 820 | pad=1 821 | activation=linear 822 | 823 | [yolo] 824 | mask = 0,1,2 825 | anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326 826 | classes = 10 827 | num = 9 828 | jitter = .3 829 | ignore_thresh = .7 830 | truth_thresh = 1 831 | random = 1 832 | 833 | -------------------------------------------------------------------------------- /images/test.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/images/test.jpg -------------------------------------------------------------------------------- /metrics.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/metrics.jpg -------------------------------------------------------------------------------- /procedure.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/procedure.jpg -------------------------------------------------------------------------------- /prune.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | """ 3 | Pengyi Zhang 4 | 201906 5 | """ 6 | import cv2 7 | 8 | import argparse 9 | import json 10 | import os 11 | 12 | import numpy 13 | 14 | import torch 15 | import torch.nn as nn 16 | 17 | from torch.utils.data import DataLoader 18 | 19 | from models import * 20 | from utils.datasets import * 21 | from utils.utils import * 22 | from utils.parse_config import * 23 | 24 | """ Slim Principle 25 | (1) Use global threshold to control pruning ratio 26 | (2) Use local threshold to keep at least 10% unpruned 27 | """ 28 | 29 | def route_conv(layer_index, module_defs): 30 | """ find the convolutional layers connected by route layer 31 | """ 32 | module_def = module_defs[layer_index] 33 | mtype = module_def['type'] 34 | 35 | before_conv_id = [] 36 | if mtype in ['convolutional', 'shortcut', 'upsample', 'maxpool']: 37 | if module_defs[layer_index-1]['type'] == 'convolutional': 38 | return [layer_index-1] 39 | before_conv_id += route_conv(layer_index-1, module_defs) 40 | 41 | elif mtype == "route": 42 | layer_is = [int(x)+layer_index if int(x) < 0 else int(x) for x in 
module_defs[layer_index]['layers'].split(',')] 43 | for layer_i in layer_is: 44 | if module_defs[layer_i]['type'] == 'convolutional': 45 | before_conv_id += [layer_i] 46 | else: 47 | before_conv_id += route_conv(layer_i, module_defs) 48 | 49 | return before_conv_id 50 | 51 | 52 | def write_model_cfg(old_path, new_path, new_module_defs): 53 | """Parses the yolo-v3 layer configuration file and returns module definitions""" 54 | lines = [] 55 | with open(old_path, 'r') as fp: 56 | old_lines = fp.readlines() 57 | for _line in old_lines: 58 | if "[convolutional]" in _line: 59 | break 60 | lines.append(_line) 61 | 62 | for i, module_def in enumerate(new_module_defs): 63 | 64 | mtype = module_def['type'] 65 | lines.append("[{}]\n".format(mtype)) 66 | print("layer:", i, mtype) 67 | if mtype == "convolutional": 68 | bn = 0 69 | filters = module_def['filters'] 70 | bn = int(module_def['batch_normalize']) 71 | if bn: 72 | lines.append("batch_normalize={}\n".format(bn)) 73 | filters = torch.sum(module_def['mask']).cpu().numpy().astype('int') 74 | lines.append("filters={}\n".format(filters)) 75 | lines.append("size={}\n".format(module_def['size'])) 76 | lines.append("stride={}\n".format(module_def['stride'])) 77 | lines.append("pad={}\n".format(module_def['pad'])) 78 | lines.append("activation={}\n\n".format(module_def['activation'])) 79 | elif mtype == "shortcut": 80 | lines.append("from={}\n".format(module_def['from'])) 81 | lines.append("activation={}\n\n".format(module_def['activation'])) 82 | elif mtype == 'route': 83 | lines.append("layers={}\n\n".format(module_def['layers'])) 84 | 85 | elif mtype == 'upsample': 86 | lines.append("stride={}\n\n".format(module_def['stride'])) 87 | elif mtype == 'maxpool': 88 | lines.append("stride={}\n".format(module_def['stride'])) 89 | lines.append("size={}\n\n".format(module_def['size'])) 90 | elif mtype == 'yolo': 91 | lines.append("mask = {}\n".format(module_def['mask'])) 92 | lines.append("anchors = {}\n".format(module_def['anchors'])) 93 | lines.append("classes = {}\n".format(module_def['classes'])) 94 | lines.append("num = {}\n".format(module_def['num'])) 95 | lines.append("jitter = {}\n".format(module_def['jitter'])) 96 | lines.append("ignore_thresh = {}\n".format(module_def['ignore_thresh'])) 97 | lines.append("truth_thresh = {}\n".format(module_def['truth_thresh'])) 98 | lines.append("random = {}\n\n".format(module_def['random'])) 99 | 100 | with open(new_path, "w") as f: 101 | f.writelines(lines) 102 | 103 | 104 | 105 | def test( 106 | cfg, 107 | weights=None, 108 | img_size=406, 109 | save=None, 110 | overall_ratio=0.5, 111 | perlayer_ratio=0.1 112 | ): 113 | 114 | """prune yolov3 and generate cfg, weights 115 | """ 116 | if save != None: 117 | if not os.path.exists(save): 118 | os.makedirs(save) 119 | device = torch_utils.select_device() 120 | # Initialize model 121 | model = Darknet(cfg, img_size).to(device) 122 | 123 | # Load weights 124 | if weights.endswith('.pt'): # pytorch format 125 | _state_dict = torch.load(weights, map_location=device)['model'] 126 | model.load_state_dict(_state_dict) 127 | else: # darknet format 128 | _ = load_darknet_weights(model, weights) 129 | 130 | ## output a new cfg file 131 | total = 0 132 | for m in model.modules(): 133 | if isinstance(m, nn.BatchNorm2d): 134 | total += m.weight.data.shape[0] # channels numbers 135 | 136 | bn = torch.zeros(total) 137 | index = 0 138 | 139 | for m in model.modules(): 140 | if isinstance(m, nn.BatchNorm2d): 141 | size = m.weight.data.shape[0] 142 | bn[index:(index+size)] = 
m.weight.data.abs().clone() 143 | index += size 144 | 145 | sorted_bn, sorted_index = torch.sort(bn) 146 | thresh_index = int(total*overall_ratio) 147 | thresh = sorted_bn[thresh_index].cuda() 148 | 149 | print("--"*30) 150 | print() 151 | #print(list(model.modules())) 152 | # 153 | proned_module_defs = model.module_defs 154 | for i, (module_def, module) in enumerate(zip(model.module_defs, model.module_list)): 155 | print("layer:", i) 156 | mtype = module_def['type'] 157 | if mtype == 'convolutional': 158 | bn = int(module_def['batch_normalize']) 159 | if bn: 160 | m = getattr(module, 'batch_norm_%d' % i) # batch_norm layer 161 | weight_copy = m.weight.data.abs().clone() 162 | channels = weight_copy.shape[0] # 163 | min_channel_num = int(channels * perlayer_ratio) if int(channels * perlayer_ratio) > 0 else 1 164 | mask = weight_copy.gt(thresh).float().cuda() 165 | 166 | if int(torch.sum(mask)) < min_channel_num: 167 | _, sorted_index_weights = torch.sort(weight_copy,descending=True) 168 | mask[sorted_index_weights[:min_channel_num]]=1. 169 | 170 | proned_module_defs[i]['mask'] = mask.clone() 171 | 172 | print('layer index: {:d} \t total channel: {:d} \t remaining channel: {:d}'. 173 | format(i, mask.shape[0], int(torch.sum(mask)))) 174 | 175 | print("layer:", mtype) 176 | 177 | elif mtype in ['upsample', 'maxpool']: 178 | print("layer:", mtype) 179 | 180 | elif mtype == 'route': 181 | print("layer:", mtype) 182 | # 183 | 184 | elif mtype == 'shortcut': 185 | layer_i = int(module_def['from'])+i 186 | print("from layer ", layer_i) 187 | print("layer:", mtype) 188 | proned_module_defs[i]['is_access'] = False 189 | 190 | 191 | elif mtype == 'yolo': 192 | print("layer:", mtype) 193 | 194 | 195 | layer_number = len(proned_module_defs) 196 | for i in range(layer_number-1, -1, -1): 197 | mtype = proned_module_defs[i]['type'] 198 | if mtype == 'shortcut': 199 | if proned_module_defs[i]['is_access']: 200 | continue 201 | 202 | Merge_masks = [] 203 | layer_i = i 204 | while mtype == 'shortcut': 205 | proned_module_defs[layer_i]['is_access'] = True 206 | 207 | if proned_module_defs[layer_i-1]['type'] == 'convolutional': 208 | bn = int(proned_module_defs[layer_i-1]['batch_normalize']) 209 | if bn: 210 | Merge_masks.append(proned_module_defs[layer_i-1]["mask"].unsqueeze(0)) 211 | 212 | layer_i = int(proned_module_defs[layer_i]['from'])+layer_i 213 | mtype = proned_module_defs[layer_i]['type'] 214 | 215 | if mtype == 'convolutional': 216 | bn = int(proned_module_defs[layer_i]['batch_normalize']) 217 | if bn: 218 | Merge_masks.append(proned_module_defs[layer_i]["mask"].unsqueeze(0)) 219 | 220 | 221 | if len(Merge_masks) > 1: 222 | Merge_masks = torch.cat(Merge_masks, 0) 223 | merge_mask = (torch.sum(Merge_masks, dim=0) > 0).float().cuda() 224 | else: 225 | merge_mask = Merge_masks[0].float().cuda() 226 | 227 | layer_i = i 228 | mtype = 'shortcut' 229 | while mtype == 'shortcut': 230 | 231 | if proned_module_defs[layer_i-1]['type'] == 'convolutional': 232 | bn = int(proned_module_defs[layer_i-1]['batch_normalize']) 233 | if bn: 234 | proned_module_defs[layer_i-1]["mask"] = merge_mask 235 | 236 | layer_i = int(proned_module_defs[layer_i]['from'])+layer_i 237 | mtype = proned_module_defs[layer_i]['type'] 238 | 239 | if mtype == 'convolutional': 240 | bn = int(proned_module_defs[layer_i]['batch_normalize']) 241 | if bn: 242 | proned_module_defs[layer_i]["mask"] = merge_mask 243 | 244 | 245 | 246 | for i, (module_def, module) in enumerate(zip(model.module_defs, model.module_list)): 247 | print("layer:", i) 248 
| mtype = module_def['type'] 249 | if mtype == 'convolutional': 250 | bn = int(module_def['batch_normalize']) 251 | if bn: 252 | 253 | layer_i_1 = i - 1 254 | proned_module_defs[i]['mask_before'] = None 255 | 256 | mask_before = [] 257 | conv_indexs = [] 258 | if i > 0: 259 | conv_indexs = route_conv(i, proned_module_defs) 260 | for conv_index in conv_indexs: 261 | mask_before += proned_module_defs[conv_index]["mask"].clone().cpu().numpy().tolist() 262 | proned_module_defs[i]['mask_before'] = torch.tensor(mask_before).float().cuda() 263 | 264 | 265 | 266 | 267 | output_cfg_path = os.path.join(save, "prune.cfg") 268 | write_model_cfg(cfg, output_cfg_path, proned_module_defs) 269 | 270 | pruned_model = Darknet(output_cfg_path, img_size).to(device) 271 | print(list(pruned_model.modules())) 272 | for i, (module_def, old_module, new_module) in enumerate(zip(proned_module_defs, model.module_list, pruned_model.module_list)): 273 | mtype = module_def['type'] 274 | print("layer: ",i, mtype) 275 | if mtype == 'convolutional': # 276 | bn = int(module_def['batch_normalize']) 277 | if bn: 278 | new_norm = getattr(new_module, 'batch_norm_%d' % i) # batch_norm layer 279 | old_norm = getattr(old_module, 'batch_norm_%d' % i) # batch_norm layer 280 | 281 | new_conv = getattr(new_module, 'conv_%d' % i) # conv layer 282 | old_conv = getattr(old_module, 'conv_%d' % i) # conv layer 283 | 284 | 285 | idx1 = np.squeeze(np.argwhere(np.asarray(module_def['mask'].cpu().numpy()))) 286 | if i > 0: 287 | idx2 = np.squeeze(np.argwhere(np.asarray(module_def['mask_before'].cpu().numpy()))) 288 | new_conv.weight.data = old_conv.weight.data[idx1.tolist()][:, idx2.tolist(), :, :].clone() 289 | 290 | print("idx1: ", len(idx1), ", idx2: ", len(idx2)) 291 | else: 292 | new_conv.weight.data = old_conv.weight.data[idx1.tolist()].clone() 293 | 294 | new_norm.weight.data = old_norm.weight.data[idx1.tolist()].clone() 295 | new_norm.bias.data = old_norm.bias.data[idx1.tolist()].clone() 296 | new_norm.running_mean = old_norm.running_mean[idx1.tolist()].clone() 297 | new_norm.running_var = old_norm.running_var[idx1.tolist()].clone() 298 | 299 | 300 | print('layer index: ', i, 'idx1: ', idx1) 301 | else: 302 | 303 | new_conv = getattr(new_module, 'conv_%d' % i) # batch_norm layer 304 | old_conv = getattr(old_module, 'conv_%d' % i) # batch_norm layer 305 | idx2 = np.squeeze(np.argwhere(np.asarray(proned_module_defs[i-1]['mask'].cpu().numpy()))) 306 | new_conv.weight.data = old_conv.weight.data[:,idx2.tolist(),:,:].clone() 307 | new_conv.bias.data = old_conv.bias.data.clone() 308 | print('layer index: ', i, "entire copy") 309 | 310 | print('--'*30) 311 | print('prune done!') 312 | print('pruned ratio %.3f'%overall_ratio) 313 | prune_weights_path = os.path.join(save, "prune.pt") 314 | _pruned_state_dict = pruned_model.state_dict() 315 | torch.save(_pruned_state_dict, prune_weights_path) 316 | 317 | print("Done!") 318 | 319 | 320 | 321 | # test 322 | pruned_model.eval() 323 | img_path = "test.jpg" 324 | 325 | org_img = cv2.imread(img_path) # BGR 326 | img, ratiow, ratioh, padw, padh = letterbox(org_img, new_shape=[img_size,img_size], mode='rect') 327 | 328 | # Normalize 329 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 330 | img = np.ascontiguousarray(img, dtype=np.float32) # uint8 to float32 331 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 332 | 333 | imgs = torch.from_numpy(img).unsqueeze(0).to(device) 334 | _, _, height, width = imgs.shape # batch size, channels, height, width 335 | 336 | # Run model 337 | inf_out, 
train_out = pruned_model(imgs) # inference and training outputs 338 | # Run NMS 339 | output = non_max_suppression(inf_out, conf_thres=0.005, nms_thres=0.5) 340 | # Statistics per image 341 | for si, pred in enumerate(output): 342 | if pred is None: 343 | continue 344 | if True: 345 | box = pred[:, :4].clone() # xyxy 346 | scale_coords(imgs[si].shape[1:], box, org_img.shape[:2]) # to original shape 347 | for di, d in enumerate(pred): 348 | category_id = int(d[6]) 349 | left, top, right, bot = [float(x) for x in box[di]] 350 | confidence = float(d[4]) 351 | 352 | cv2.rectangle(org_img, (int(left), int(top)), (int(right), int(bot)), 353 | (255, 0, 0), 2) 354 | cv2.putText(org_img, str(category_id) + ":" + str('%.1f' % (float(confidence) * 100)) + "%", (int(left), int(top) - 8), 355 | cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1) 356 | cv2.imshow("result", org_img) 357 | cv2.waitKey(-1) 358 | cv2.imwrite('result_{}'.format(img_path), org_img) 359 | 360 | 361 | # convert pt to weights: 362 | prune_c_weights_path = os.path.join(save, "prune.weights") 363 | save_weights(pruned_model, prune_c_weights_path) 364 | 365 | 366 | if __name__ == '__main__': 367 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 368 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 369 | parser = argparse.ArgumentParser(description='PyTorch Slimming Yolov3 prune') 370 | parser.add_argument('--cfg', type=str, default='VisDrone2019/yolov3-spp3.cfg', help='cfg file path') 371 | parser.add_argument('--weights', type=str, default='yolov3-spp3_final.weights', help='path to weights file') 372 | parser.add_argument('--img_size', type=int, default=608, help='inference size (pixels)') 373 | parser.add_argument('--save', default='prune', type=str, metavar='PATH', help='path to save pruned model (default: none)') 374 | parser.add_argument('--overall_ratio', type=float, default=0.5, help='scale sparse rate (default: 0.5)') 375 | parser.add_argument('--perlayer_ratio', type=float, default=0.1, help='minimal scale sparse rate (default: 0.1) to prevent disconnect') 376 | 377 | opt = parser.parse_args() 378 | opt.save += "_{}_{}".format(opt.overall_ratio, opt.perlayer_ratio) 379 | 380 | print(opt) 381 | 382 | with torch.no_grad(): 383 | test( 384 | opt.cfg, 385 | opt.weights, 386 | opt.img_size, 387 | opt.save, 388 | opt.overall_ratio, 389 | opt.perlayer_ratio, 390 | ) 391 | -------------------------------------------------------------------------------- /results/demo.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/results/demo.mp4 -------------------------------------------------------------------------------- /results/result.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/results/result.jpg -------------------------------------------------------------------------------- /sparsity.py: -------------------------------------------------------------------------------- 1 | # additional subgradient descent on the sparsity-induced penalty term 2 | # x_{k+1} = x_{k} - \alpha_{k} * g^{k} 3 | def updateBN(scale, model): 4 | for m in model.modules(): 5 | if isinstance(m, nn.BatchNorm2d): 6 | m.weight.grad.data.add_(scale*torch.sign(m.weight.data)) # L1 7 | -------------------------------------------------------------------------------- /table.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/PengyiZhang/SlimYOLOv3/ff0e7d0e534ab956b1c0a2c7c38d6c8c233275d2/table.jpg -------------------------------------------------------------------------------- /yolov3/README.md: -------------------------------------------------------------------------------- 1 | **NOTICE**: 2 | 1. TO run sparsity training and channel pruning, [ultralytics/yolov3](https://github.com/ultralytics/yolov3) is required. 3 | 2. We only provide the pruning method for channel pruning (`prune.py`) and subgradient method for sparsity training (`sparsity.py`). 4 | 3. Sparsity training can be done by using `updateBN()` in `sparsity.py` before `optimizer.step()` in `train.py`. 5 | 4. The channel pruning can be done by `prune.py`. -------------------------------------------------------------------------------- /yolov3/prune.py: -------------------------------------------------------------------------------- 1 | 2 | # coding: utf-8 3 | """ 4 | Pengyi Zhang 5 | 201906 6 | """ 7 | import cv2 8 | 9 | import argparse 10 | import json 11 | import os 12 | 13 | import numpy 14 | 15 | import torch 16 | import torch.nn as nn 17 | 18 | from torch.utils.data import DataLoader 19 | 20 | from models import * 21 | from utils.datasets import * 22 | from utils.utils import * 23 | from utils.parse_config import * 24 | 25 | """ Slim Principle 26 | (1) Use global threshold to control pruning ratio 27 | (2) Use local threshold to keep at least 10% unpruned 28 | """ 29 | 30 | def route_conv(layer_index, module_defs): 31 | """ find the convolutional layers connected by route layer 32 | """ 33 | module_def = module_defs[layer_index] 34 | mtype = module_def['type'] 35 | 36 | before_conv_id = [] 37 | if mtype in ['convolutional', 'shortcut', 'upsample', 'maxpool']: 38 | if module_defs[layer_index-1]['type'] == 'convolutional': 39 | return [layer_index-1] 40 | before_conv_id += route_conv(layer_index-1, module_defs) 41 | 42 | elif mtype == "route": 43 | layer_is = [int(x)+layer_index if int(x) < 0 else int(x) for x in module_defs[layer_index]['layers'].split(',')] 44 | for layer_i in layer_is: 45 | if module_defs[layer_i]['type'] == 'convolutional': 46 | before_conv_id += [layer_i] 47 | else: 48 | before_conv_id += route_conv(layer_i, module_defs) 49 | 50 | return before_conv_id 51 | 52 | 53 | def write_model_cfg(old_path, new_path, new_module_defs): 54 | """Parses the yolo-v3 layer configuration file and returns module definitions""" 55 | lines = [] 56 | with open(old_path, 'r') as fp: 57 | old_lines = fp.readlines() 58 | for _line in old_lines: 59 | if "[convolutional]" in _line: 60 | break 61 | lines.append(_line) 62 | 63 | for i, module_def in enumerate(new_module_defs): 64 | 65 | mtype = module_def['type'] 66 | lines.append("[{}]\n".format(mtype)) 67 | print("layer:", i, mtype) 68 | if mtype == "convolutional": 69 | bn = 0 70 | filters = module_def['filters'] 71 | bn = int(module_def['batch_normalize']) 72 | if bn: 73 | lines.append("batch_normalize={}\n".format(bn)) 74 | filters = torch.sum(module_def['mask']).cpu().numpy().astype('int') 75 | lines.append("filters={}\n".format(filters)) 76 | lines.append("size={}\n".format(module_def['size'])) 77 | lines.append("stride={}\n".format(module_def['stride'])) 78 | lines.append("pad={}\n".format(module_def['pad'])) 79 | lines.append("activation={}\n\n".format(module_def['activation'])) 80 | elif mtype == "shortcut": 81 | lines.append("from={}\n".format(module_def['from'])) 82 | 
lines.append("activation={}\n\n".format(module_def['activation'])) 83 | elif mtype == 'route': 84 | lines.append("layers={}\n\n".format(module_def['layers'])) 85 | 86 | elif mtype == 'upsample': 87 | lines.append("stride={}\n\n".format(module_def['stride'])) 88 | elif mtype == 'maxpool': 89 | lines.append("stride={}\n".format(module_def['stride'])) 90 | lines.append("size={}\n\n".format(module_def['size'])) 91 | elif mtype == 'yolo': 92 | lines.append("mask = {}\n".format(module_def['mask'])) 93 | lines.append("anchors = {}\n".format(module_def['anchors'])) 94 | lines.append("classes = {}\n".format(module_def['classes'])) 95 | lines.append("num = {}\n".format(module_def['num'])) 96 | lines.append("jitter = {}\n".format(module_def['jitter'])) 97 | lines.append("ignore_thresh = {}\n".format(module_def['ignore_thresh'])) 98 | lines.append("truth_thresh = {}\n".format(module_def['truth_thresh'])) 99 | lines.append("random = {}\n\n".format(module_def['random'])) 100 | 101 | with open(new_path, "w") as f: 102 | f.writelines(lines) 103 | 104 | 105 | 106 | def test( 107 | cfg, 108 | weights=None, 109 | img_size=406, 110 | save=None, 111 | overall_ratio=0.5, 112 | perlayer_ratio=0.1 113 | ): 114 | 115 | """prune yolov3 and generate cfg, weights 116 | """ 117 | if save != None: 118 | if not os.path.exists(save): 119 | os.makedirs(save) 120 | device = torch_utils.select_device() 121 | # Initialize model 122 | model = Darknet(cfg, img_size).to(device) 123 | 124 | # Load weights 125 | if weights.endswith('.pt'): # pytorch format 126 | _state_dict = torch.load(weights, map_location=device)['model'] 127 | model.load_state_dict(_state_dict) 128 | else: # darknet format 129 | _ = load_darknet_weights(model, weights) 130 | 131 | ## output a new cfg file 132 | total = 0 133 | for m in model.modules(): 134 | if isinstance(m, nn.BatchNorm2d): 135 | total += m.weight.data.shape[0] # channels numbers 136 | 137 | bn = torch.zeros(total) 138 | index = 0 139 | 140 | for m in model.modules(): 141 | if isinstance(m, nn.BatchNorm2d): 142 | size = m.weight.data.shape[0] 143 | bn[index:(index+size)] = m.weight.data.abs().clone() 144 | index += size 145 | 146 | sorted_bn, sorted_index = torch.sort(bn) 147 | thresh_index = int(total*overall_ratio) 148 | thresh = sorted_bn[thresh_index].cuda() 149 | 150 | print("--"*30) 151 | print() 152 | #print(list(model.modules())) 153 | # 154 | proned_module_defs = model.module_defs 155 | for i, (module_def, module) in enumerate(zip(model.module_defs, model.module_list)): 156 | print("layer:", i) 157 | mtype = module_def['type'] 158 | if mtype == 'convolutional': 159 | bn = int(module_def['batch_normalize']) 160 | if bn: 161 | m = getattr(module, 'batch_norm_%d' % i) # batch_norm layer 162 | weight_copy = m.weight.data.abs().clone() 163 | channels = weight_copy.shape[0] # 164 | min_channel_num = int(channels * perlayer_ratio) if int(channels * perlayer_ratio) > 0 else 1 165 | mask = weight_copy.gt(thresh).float().cuda() 166 | 167 | if int(torch.sum(mask)) < min_channel_num: 168 | _, sorted_index_weights = torch.sort(weight_copy,descending=True) 169 | mask[sorted_index_weights[:min_channel_num]]=1. 170 | 171 | proned_module_defs[i]['mask'] = mask.clone() 172 | 173 | print('layer index: {:d} \t total channel: {:d} \t remaining channel: {:d}'. 
174 | format(i, mask.shape[0], int(torch.sum(mask)))) 175 | 176 | print("layer:", mtype) 177 | 178 | elif mtype in ['upsample', 'maxpool']: 179 | print("layer:", mtype) 180 | 181 | elif mtype == 'route': 182 | print("layer:", mtype) 183 | # 184 | 185 | elif mtype == 'shortcut': 186 | layer_i = int(module_def['from'])+i 187 | print("from layer ", layer_i) 188 | print("layer:", mtype) 189 | proned_module_defs[i]['is_access'] = False 190 | 191 | 192 | elif mtype == 'yolo': 193 | print("layer:", mtype) 194 | 195 | 196 | layer_number = len(proned_module_defs) 197 | for i in range(layer_number-1, -1, -1): 198 | mtype = proned_module_defs[i]['type'] 199 | if mtype == 'shortcut': 200 | if proned_module_defs[i]['is_access']: 201 | continue 202 | 203 | Merge_masks = [] 204 | layer_i = i 205 | while mtype == 'shortcut': 206 | proned_module_defs[layer_i]['is_access'] = True 207 | 208 | if proned_module_defs[layer_i-1]['type'] == 'convolutional': 209 | bn = int(proned_module_defs[layer_i-1]['batch_normalize']) 210 | if bn: 211 | Merge_masks.append(proned_module_defs[layer_i-1]["mask"].unsqueeze(0)) 212 | 213 | layer_i = int(proned_module_defs[layer_i]['from'])+layer_i 214 | mtype = proned_module_defs[layer_i]['type'] 215 | 216 | if mtype == 'convolutional': 217 | bn = int(proned_module_defs[layer_i]['batch_normalize']) 218 | if bn: 219 | Merge_masks.append(proned_module_defs[layer_i]["mask"].unsqueeze(0)) 220 | 221 | 222 | if len(Merge_masks) > 1: 223 | Merge_masks = torch.cat(Merge_masks, 0) 224 | merge_mask = (torch.sum(Merge_masks, dim=0) > 0).float().cuda() 225 | else: 226 | merge_mask = Merge_masks[0].float().cuda() 227 | 228 | layer_i = i 229 | mtype = 'shortcut' 230 | while mtype == 'shortcut': 231 | 232 | if proned_module_defs[layer_i-1]['type'] == 'convolutional': 233 | bn = int(proned_module_defs[layer_i-1]['batch_normalize']) 234 | if bn: 235 | proned_module_defs[layer_i-1]["mask"] = merge_mask 236 | 237 | layer_i = int(proned_module_defs[layer_i]['from'])+layer_i 238 | mtype = proned_module_defs[layer_i]['type'] 239 | 240 | if mtype == 'convolutional': 241 | bn = int(proned_module_defs[layer_i]['batch_normalize']) 242 | if bn: 243 | proned_module_defs[layer_i]["mask"] = merge_mask 244 | 245 | 246 | 247 | for i, (module_def, module) in enumerate(zip(model.module_defs, model.module_list)): 248 | print("layer:", i) 249 | mtype = module_def['type'] 250 | if mtype == 'convolutional': 251 | bn = int(module_def['batch_normalize']) 252 | if bn: 253 | 254 | layer_i_1 = i - 1 255 | proned_module_defs[i]['mask_before'] = None 256 | 257 | mask_before = [] 258 | conv_indexs = [] 259 | if i > 0: 260 | conv_indexs = route_conv(i, proned_module_defs) 261 | for conv_index in conv_indexs: 262 | mask_before += proned_module_defs[conv_index]["mask"].clone().cpu().numpy().tolist() 263 | proned_module_defs[i]['mask_before'] = torch.tensor(mask_before).float().cuda() 264 | 265 | 266 | 267 | 268 | output_cfg_path = os.path.join(save, "prune.cfg") 269 | write_model_cfg(cfg, output_cfg_path, proned_module_defs) 270 | 271 | pruned_model = Darknet(output_cfg_path, img_size).to(device) 272 | print(list(pruned_model.modules())) 273 | for i, (module_def, old_module, new_module) in enumerate(zip(proned_module_defs, model.module_list, pruned_model.module_list)): 274 | mtype = module_def['type'] 275 | print("layer: ",i, mtype) 276 | if mtype == 'convolutional': # 277 | bn = int(module_def['batch_normalize']) 278 | if bn: 279 | new_norm = getattr(new_module, 'batch_norm_%d' % i) # batch_norm layer 280 | old_norm = 
getattr(old_module, 'batch_norm_%d' % i) # batch_norm layer 281 | 282 | new_conv = getattr(new_module, 'conv_%d' % i) # conv layer 283 | old_conv = getattr(old_module, 'conv_%d' % i) # conv layer 284 | 285 | 286 | idx1 = np.squeeze(np.argwhere(np.asarray(module_def['mask'].cpu().numpy()))) 287 | if i > 0: 288 | idx2 = np.squeeze(np.argwhere(np.asarray(module_def['mask_before'].cpu().numpy()))) 289 | new_conv.weight.data = old_conv.weight.data[idx1.tolist()][:, idx2.tolist(), :, :].clone() 290 | 291 | print("idx1: ", len(idx1), ", idx2: ", len(idx2)) 292 | else: 293 | new_conv.weight.data = old_conv.weight.data[idx1.tolist()].clone() 294 | 295 | new_norm.weight.data = old_norm.weight.data[idx1.tolist()].clone() 296 | new_norm.bias.data = old_norm.bias.data[idx1.tolist()].clone() 297 | new_norm.running_mean = old_norm.running_mean[idx1.tolist()].clone() 298 | new_norm.running_var = old_norm.running_var[idx1.tolist()].clone() 299 | 300 | 301 | print('layer index: ', i, 'idx1: ', idx1) 302 | else: 303 | 304 | new_conv = getattr(new_module, 'conv_%d' % i) # batch_norm layer 305 | old_conv = getattr(old_module, 'conv_%d' % i) # batch_norm layer 306 | idx2 = np.squeeze(np.argwhere(np.asarray(proned_module_defs[i-1]['mask'].cpu().numpy()))) 307 | new_conv.weight.data = old_conv.weight.data[:,idx2.tolist(),:,:].clone() 308 | new_conv.bias.data = old_conv.bias.data.clone() 309 | print('layer index: ', i, "entire copy") 310 | 311 | print('--'*30) 312 | print('prune done!') 313 | print('pruned ratio %.3f'%overall_ratio) 314 | prune_weights_path = os.path.join(save, "prune.pt") 315 | _pruned_state_dict = pruned_model.state_dict() 316 | torch.save(_pruned_state_dict, prune_weights_path) 317 | 318 | print("Done!") 319 | 320 | 321 | 322 | # test 323 | pruned_model.eval() 324 | img_path = "test.jpg" 325 | 326 | org_img = cv2.imread(img_path) # BGR 327 | img, ratiow, ratioh, padw, padh = letterbox(org_img, new_shape=[img_size,img_size], mode='rect') 328 | 329 | # Normalize 330 | img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 331 | img = np.ascontiguousarray(img, dtype=np.float32) # uint8 to float32 332 | img /= 255.0 # 0 - 255 to 0.0 - 1.0 333 | 334 | imgs = torch.from_numpy(img).unsqueeze(0).to(device) 335 | _, _, height, width = imgs.shape # batch size, channels, height, width 336 | 337 | # Run model 338 | inf_out, train_out = pruned_model(imgs) # inference and training outputs 339 | # Run NMS 340 | output = non_max_suppression(inf_out, conf_thres=0.005, nms_thres=0.5) 341 | # Statistics per image 342 | for si, pred in enumerate(output): 343 | if pred is None: 344 | continue 345 | if True: 346 | box = pred[:, :4].clone() # xyxy 347 | scale_coords(imgs[si].shape[1:], box, org_img.shape[:2]) # to original shape 348 | for di, d in enumerate(pred): 349 | category_id = int(d[6]) 350 | left, top, right, bot = [float(x) for x in box[di]] 351 | confidence = float(d[4]) 352 | 353 | cv2.rectangle(org_img, (int(left), int(top)), (int(right), int(bot)), 354 | (255, 0, 0), 2) 355 | cv2.putText(org_img, str(category_id) + ":" + str('%.1f' % (float(confidence) * 100)) + "%", (int(left), int(top) - 8), 356 | cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1) 357 | cv2.imshow("result", org_img) 358 | cv2.waitKey(-1) 359 | cv2.imwrite('result_{}'.format(img_path), org_img) 360 | 361 | 362 | # convert pt to weights: 363 | prune_c_weights_path = os.path.join(save, "prune.weights") 364 | save_weights(pruned_model, prune_c_weights_path) 365 | 366 | 367 | if __name__ == '__main__': 368 | 
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 369 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 370 | parser = argparse.ArgumentParser(description='PyTorch Slimming Yolov3 prune') 371 | parser.add_argument('--cfg', type=str, default='VisDrone2019/yolov3-spp3.cfg', help='cfg file path') 372 | parser.add_argument('--weights', type=str, default='yolov3-spp3_final.weights', help='path to weights file') 373 | parser.add_argument('--img_size', type=int, default=608, help='inference size (pixels)') 374 | parser.add_argument('--save', default='prune', type=str, metavar='PATH', help='path to save pruned model (default: none)') 375 | parser.add_argument('--overall_ratio', type=float, default=0.5, help='scale sparse rate (default: 0.5)') 376 | parser.add_argument('--perlayer_ratio', type=float, default=0.1, help='minimal scale sparse rate (default: 0.1) to prevent disconnect') 377 | 378 | opt = parser.parse_args() 379 | opt.save += "_{}_{}".format(opt.overall_ratio, opt.perlayer_ratio) 380 | 381 | print(opt) 382 | 383 | with torch.no_grad(): 384 | test( 385 | opt.cfg, 386 | opt.weights, 387 | opt.img_size, 388 | opt.save, 389 | opt.overall_ratio, 390 | opt.perlayer_ratio, 391 | ) 392 | -------------------------------------------------------------------------------- /yolov3/sparsity.py: -------------------------------------------------------------------------------- 1 | # additional subgradient descent on the sparsity-induced penalty term 2 | # x_{k+1} = x_{k} - \alpha_{k} * g^{k} 3 | def updateBN(scale, model): 4 | for m in model.modules(): 5 | if isinstance(m, nn.BatchNorm2d): 6 | m.weight.grad.data.add_(scale*torch.sign(m.weight.data)) # L1 7 | --------------------------------------------------------------------------------
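
The NOTICE in yolov3/README.md above explains that sparsity training amounts to calling `updateBN()` from `sparsity.py` immediately before `optimizer.step()` in the training loop. Below is a minimal, self-contained sketch of that hook; the toy `nn.Sequential` model, the random batch and labels, and the `1e-4` penalty scale are illustrative assumptions only, not the project's Darknet model or its actual training script.

```python
import torch
import torch.nn as nn

def updateBN(scale, model):
    # Subgradient of the L1 penalty on BN scaling factors (gamma), added to the
    # existing gradient so the optimizer step pushes channel scales toward zero.
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.weight.grad.data.add_(scale * torch.sign(m.weight.data))

# Toy stand-in for the detector: conv + BN blocks ending in a 10-way classifier head.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.LeakyReLU(0.1),
    nn.Conv2d(16, 32, 3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(32),
    nn.LeakyReLU(0.1),
    nn.Conv2d(32, 10, 1),
    nn.AdaptiveAvgPool2d(1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for step in range(5):
    imgs = torch.randn(4, 3, 64, 64)             # random batch in place of drone images
    targets = torch.randint(0, 10, (4,))         # random labels in place of detection targets
    logits = model(imgs).view(imgs.size(0), -1)  # (4, 10)
    loss = criterion(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    updateBN(1e-4, model)                        # sparsity penalty, applied before optimizer.step()
    optimizer.step()
```

After enough sparsity-training iterations, many BatchNorm scaling factors shrink toward zero, and `prune.py` then removes the corresponding channels using the global and per-layer thresholds shown in its code above.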