├── .DS_Store
├── LICENSE
├── README.md
├── Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf
├── rpnet
│   ├── .DS_Store
│   ├── demo.py
│   ├── demo
│   │   ├── 0.jpg
│   │   ├── 1.jpg
│   │   ├── 2.jpg
│   │   ├── 3.jpg
│   │   └── 4.jpg
│   ├── load_data.py
│   ├── roi_pooling.py
│   ├── rpnet.py
│   ├── rpnetEval.py
│   └── wR2.py
└── split
    ├── ccpd_blur.txt
    ├── ccpd_challenge.txt
    ├── ccpd_db.txt
    ├── ccpd_fn.txt
    ├── ccpd_rotate.txt
    ├── ccpd_tilt.txt
    ├── test.txt
    ├── train.txt
    └── val.txt
--------------------------------------------------------------------------------
/.DS_Store: https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/.DS_Store
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017 CCPD
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CCPD (Chinese City Parking Dataset, ECCV)
2 |
3 | ## Update on 10/03/2019. The CCPD dataset is now updated. We are confident that the images in the CCPD subsets are much more challenging than before, with over 300k images and refined annotations.
4 |
5 | (If you benefit from this dataset, please cite our paper.)
6 | It can be downloaded from the links below and extracted with `tar xf CCPD2019.tar.xz`:
7 | - [Google Drive](https://drive.google.com/open?id=1rdEsCUcIUaYOVRkx5IMTRNA7PcGMmSgc)
8 |
9 | - [BaiduYun Drive(code: hm0u)](https://pan.baidu.com/s/1i5AOjAbtkwb17Zy-NQGqkw)
10 |
11 |
12 | #### train\val\test split
13 | The split files are available under the 'split/' folder.
14 |
15 | Images in CCPD-Base are split into the train/val sets. The sub-datasets (CCPD-DB, CCPD-Blur, CCPD-FN, CCPD-Rotate, CCPD-Tilt, CCPD-Challenge) of CCPD are used for testing.
16 | ****
17 | ## Update on 16/09/2020. We add a new-energy-vehicle sub-dataset (CCPD-Green), which has eight-character license plate numbers.
18 |
19 | It can be downloaded from:
20 | - [Google Drive](https://drive.google.com/file/d/1m8w1kFxnCEiqz_-t2vTcgrgqNIv986PR/view?usp=sharing)
21 |
22 | - [BaiduYun Drive(code: ol3j)](https://pan.baidu.com/s/1JSpc9BZXFlPkXxRK4qUCyw)
23 |
24 | ### metric
25 | As each image in CCPD contains only a single license plate (LP), we do not consider recall and concentrate on precision. Detectors are allowed to predict only one bounding box for each image.
26 |
27 | - Detection. For each image, the detector outputs only one bounding box. The bounding box is considered correct if and only if its IoU with the ground-truth bounding box is more than 70% (IoU > 0.7; see the sketch below). Also, we compute AP on the test set.
28 |
29 | - Recognition. An LP recognition is correct if and only if all characters of the LP number are correctly recognized.
30 |
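For reference, here is a minimal sketch of the IoU test described above (boxes given as [x1, y1, x2, y2]; the helper below is ours, not part of the released code):

```
def iou(a, b):
    # a, b: [x1, y1, x2, y2]
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

# a detection is counted as correct when IoU(pred, gt) > 0.7
print(iou([100, 100, 200, 150], [110, 105, 205, 150]) > 0.7)  # True (IoU ~ 0.78)
```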
31 | #### benchmark
32 |
33 | If you want to provide more baseline results or have questions about the provided results, please raise an issue.
34 | ##### detection
35 |
36 | |   | FPS | AP | DB | Blur | FN | Rotate | Tilt | Challenge |
37 | |---|---|---|---|---|---|---|---|---|
38 | | Faster-RCNN | 11 | 84.98 | 66.73 | 81.59 | 76.45 | 94.42 | 88.19 | 89.82 |
39 | | SSD300 | 25 | 86.99 | 72.90 | 87.06 | 74.84 | 96.53 | 91.86 | 90.06 |
40 | | SSD512 | 12 | 87.83 | 69.99 | 84.23 | 80.65 | 96.50 | 91.26 | 92.14 |
41 | | YOLOv3-320 | 52 | 87.23 | 71.34 | 82.19 | 82.44 | 96.69 | 89.17 | 91.46 |
42 |
43 | ##### recognition
44 | We provide baseline methods for recognition by appending an LP recognition model, Holistic-CNN (HC) (see the paper 'Holistic recognition of low quality license plates by CNN using track annotated data'), to the detector.
45 |
46 | |   | FPS | AP | DB | Blur | FN | Rotate | Tilt | Challenge |
47 | |---|---|---|---|---|---|---|---|---|
48 | | SSD512+HC | 11 | 43.42 | 34.47 | 25.83 | 45.24 | 52.82 | 52.04 | 44.62 |
49 |
50 | The column 'AP' shows the precision on the whole test set. The test set contains six parts: DB(ccpd_db/), Blur(ccpd_blur), FN(ccpd_fn), Rotate(ccpd_rotate), Tilt(ccpd_tilt), Challenge(ccpd_challenge).
51 |
52 | This repository provides an open-source dataset for license plate detection and recognition, described in _《Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline》_. The dataset is open-source under the MIT license. More details are available in our ECCV 2018 paper (also included in this repository). If you benefit from this dataset, please cite our paper as follows:
53 |
54 | ```
55 | @inproceedings{xu2018towards,
56 |   title={Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline},
57 |   author={Xu, Zhenbo and Yang, Wei and Meng, Ajin and Lu, Nanxue and Huang, Huan},
58 |   booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
59 |   pages={255--271},
60 |   year={2018}
61 | }
62 | ```
63 |
64 |
65 |
66 | ## Specification of the categories above:
67 |
68 | - **rpnet**: The training code for a license plate localization network and an end-to-end network that detects the license plate bounding box and recognizes the corresponding license plate number in a single forward pass. In addition, demo.py and the demo folder are provided for running the demo.
69 |
70 | - **paper.pdf**: Our published ECCV paper.
71 |
72 |
73 | ## Demo
74 |
75 | Demo code and several images are provided under the rpnet/ folder. After you obtain "fh02.pth" by downloading or training, run the demo as follows; the demo code modifies the images in the rpnet/demo folder in place, so you can check the results by opening those images.
76 |
77 | ```
78 |
79 | python demo.py -i [ROOT/rpnet/demo/] -m [***/fh02.pth]
80 |
81 | ```
82 |
83 | ### A nearly well-trained model for testing and fun (short of time, it was trained for only 5 epochs, but that is enough for testing):
84 |
85 | We encourage comparison with SOTA detectors such as FCOS rather than with RPnet, as the architecture of RPnet is very old-fashioned.
86 | - Location module wR2.pth [google_drive](https://drive.google.com/open?id=1l_tIt7D3vmYNYZLOPbwx8qJpPVM82CP-), [baiduyun](https://pan.baidu.com/s/1Q3fPDHFYV5uibWwIQxPEOw)
87 | - rpnet model fh02.pth [google_drive](https://drive.google.com/open?id=1YYVWgbHksj25vV6bnCX_AWokFjhgIMhv), [baiduyun](https://pan.baidu.com/s/1sA-rzn4Mf33uhh1DWNcRhQ).
88 |
89 | ## Training instructions
90 |
91 | Input parameters are well commented in the Python code (Python 2/3 both work; the PyTorch version should be >= 0.3). You can increase the batchSize as long as enough GPU memory is available.
92 |
93 | #### Environment (not so important as long as you can run the code):
94 |
95 | - python: pytorch(0.3.1), numpy(1.14.3), cv2(2.4.9.1).
96 | - system: Cuda(release 9.1, V9.1.85)
97 |
98 | #### For convenience, we provide a trained wR2 model and a trained rpnet model; you can download them from Google Drive or BaiduYun.
99 |
100 |
101 |
102 | First train the localization network (we provide one as before; you can download it from [google drive](https://drive.google.com/open?id=1l_tIt7D3vmYNYZLOPbwx8qJpPVM82CP-) or [baiduyun](https://pan.baidu.com/s/1Q3fPDHFYV5uibWwIQxPEOw)) defined in wR2.py as follows:
103 |
104 | ```
105 |
106 | python wR2.py -i [IMG FOLDERS] -b 4
107 |
108 | ```
109 |
110 | After wR2 is fine-tuned, we train the RPnet (we provide one as before; you can download it from [google drive](https://drive.google.com/open?id=1YYVWgbHksj25vV6bnCX_AWokFjhgIMhv) or [baiduyun](https://pan.baidu.com/s/1sA-rzn4Mf33uhh1DWNcRhQ)) defined in rpnet.py. Please specify the variable wR2Path (the path of the well-trained wR2 model) in rpnet.py.
111 |
112 | ```
113 |
114 | python rpnet.py -i [TRAIN IMG FOLDERS] -b 4 -se 0 -f [MODEL SAVE FOLDER] -t [TEST IMG FOLDERS]
115 |
116 | ```
117 |
118 |
119 |
120 | ## Test instructions
121 |
122 | After fine-tuning RPnet, you need to uncompress a zip folder and select it as the test directory. The argument after -s is a folder for storing failure cases.
123 |
124 | ```
125 |
126 | python rpnetEval.py -m [MODEL PATH, like /**/fh02.pth] -i [TEST DIR] -s [FAILURE SAVE DIR]
127 |
128 | ```
129 |
130 | ## Dataset Annotations
131 |
132 | Annotations are embedded in the file name.
133 |
134 | A sample image name is "025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg". Each name can be split into seven fields, explained as follows (a parsing sketch follows the list).
135 |
136 | - **Area**: Area ratio of the license plate area to the entire picture area.
137 |
138 | - **Tilt degree**: Horizontal tilt degree and vertical tilt degree.
139 |
140 | - **Bounding box coordinates**: The coordinates of the left-up and right-bottom vertices.
141 |
142 | - **Four vertices locations**: The exact (x, y) coordinates of the four vertices of the LP in the whole image. These coordinates start from the right-bottom vertex.
143 |
144 | - **License plate number**: Each image in CCPD has only one LP. Each LP number is composed of a Chinese character, a letter, and five letters or numbers. A valid Chinese license plate consists of seven characters: province (1 character), alphabets (1 character), alphabets+digits (5 characters). "0_0_22_27_27_33_16" gives the index of each character. The three index arrays are defined as follows. The last character of each array is the letter O rather than the digit 0; we use O as a sign of "no character" because O does not appear among Chinese license plate characters.
145 | ```
146 | provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"]
147 | alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
148 |              'X', 'Y', 'Z', 'O']
149 | ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
150 |        'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O']
151 | ```
152 |
153 | - **Brightness**: The brightness of the license plate region.
154 |
155 | - **Blurriness**: The blurriness of the license plate region.
156 |
157 |
158 |
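A minimal sketch of decoding one file name into these seven fields (our own helper snippet, not part of the released code; it assumes the provinces/alphabets/ads lists above are in scope):

```
name = "025-95_113-154&383_386&473-386&473_177&454_154&383_363&402-0_0_22_27_27_33_16-37-15.jpg"
fields = name.rsplit('.', 1)[0].split('-')
area = fields[0]                                                           # area ratio field
tilt = [int(d) for d in fields[1].split('_')]                              # [horizontal, vertical]
box = [[int(c) for c in p.split('&')] for p in fields[2].split('_')]       # [[x1, y1], [x2, y2]]
vertices = [[int(c) for c in p.split('&')] for p in fields[3].split('_')]  # 4 points, from right-bottom
char_idx = [int(c) for c in fields[4].split('_')]                          # indices into the arrays above
brightness, blurriness = fields[5], fields[6]
lp = provinces[char_idx[0]] + alphabets[char_idx[1]] + ''.join(ads[c] for c in char_idx[2:])
print(lp)  # -> 皖AY339S for this sample name
```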
"0_0_22_27_27_33_16" is the index of each character. These three arrays are defined as follows. The last character of each array is letter O rather than a digit 0. We use O as a sign of "no character" because there is no O in Chinese license plate characters. 145 | ``` 146 | provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"] 147 | alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 148 | 'X', 'Y', 'Z', 'O'] 149 | ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 150 | 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O'] 151 | ``` 152 | 153 | - **Brightness**: The brightness of the license plate region. 154 | 155 | - **Blurriness**: The Blurriness of the license plate region. 156 | 157 | 158 | 159 | ## Acknowledgement 160 | 161 | If you have any problems about CCPD, please contact detectrecog@gmail.com. 162 | 163 | 164 | 165 | Please cite the paper _《Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline》_, if you benefit from this dataset. 166 | -------------------------------------------------------------------------------- /Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf -------------------------------------------------------------------------------- /rpnet/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/.DS_Store -------------------------------------------------------------------------------- /rpnet/demo.py: -------------------------------------------------------------------------------- 1 | #encoding:utf-8 2 | import cv2 3 | import torch 4 | from torch.autograd import Variable 5 | import torch.nn as nn 6 | import argparse 7 | import numpy as np 8 | from os import path, mkdir 9 | from load_data import * 10 | from time import time 11 | from roi_pooling import roi_pooling_ims 12 | 13 | ap = argparse.ArgumentParser() 14 | ap.add_argument("-i", "--input", required=True, 15 | help="path to the input folder") 16 | ap.add_argument("-m", "--model", required=True, 17 | help="path to the model file") 18 | args = vars(ap.parse_args()) 19 | 20 | use_gpu = torch.cuda.is_available() 21 | print (use_gpu) 22 | 23 | numClasses = 4 24 | numPoints = 4 25 | imgSize = (480, 480) 26 | batchSize = 8 if use_gpu else 8 27 | resume_file = str(args["model"]) 28 | 29 | provNum, alphaNum, adNum = 38, 25, 35 30 | provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", 31 | "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"] 32 | alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 33 | 'X', 'Y', 'Z', 'O'] 34 | ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 35 | 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O'] 36 | 37 | class wR2(nn.Module): 38 | def __init__(self, num_classes=1000): 39 | super(wR2, self).__init__() 
40 | hidden1 = nn.Sequential( 41 | nn.Conv2d(in_channels=3, out_channels=48, kernel_size=5, padding=2, stride=2), 42 | nn.BatchNorm2d(num_features=48), 43 | nn.ReLU(), 44 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 45 | nn.Dropout(0.2) 46 | ) 47 | hidden2 = nn.Sequential( 48 | nn.Conv2d(in_channels=48, out_channels=64, kernel_size=5, padding=2), 49 | nn.BatchNorm2d(num_features=64), 50 | nn.ReLU(), 51 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 52 | nn.Dropout(0.2) 53 | ) 54 | hidden3 = nn.Sequential( 55 | nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding=2), 56 | nn.BatchNorm2d(num_features=128), 57 | nn.ReLU(), 58 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 59 | nn.Dropout(0.2) 60 | ) 61 | hidden4 = nn.Sequential( 62 | nn.Conv2d(in_channels=128, out_channels=160, kernel_size=5, padding=2), 63 | nn.BatchNorm2d(num_features=160), 64 | nn.ReLU(), 65 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 66 | nn.Dropout(0.2) 67 | ) 68 | hidden5 = nn.Sequential( 69 | nn.Conv2d(in_channels=160, out_channels=192, kernel_size=5, padding=2), 70 | nn.BatchNorm2d(num_features=192), 71 | nn.ReLU(), 72 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 73 | nn.Dropout(0.2) 74 | ) 75 | hidden6 = nn.Sequential( 76 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 77 | nn.BatchNorm2d(num_features=192), 78 | nn.ReLU(), 79 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 80 | nn.Dropout(0.2) 81 | ) 82 | hidden7 = nn.Sequential( 83 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 84 | nn.BatchNorm2d(num_features=192), 85 | nn.ReLU(), 86 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 87 | nn.Dropout(0.2) 88 | ) 89 | hidden8 = nn.Sequential( 90 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 91 | nn.BatchNorm2d(num_features=192), 92 | nn.ReLU(), 93 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 94 | nn.Dropout(0.2) 95 | ) 96 | hidden9 = nn.Sequential( 97 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 98 | nn.BatchNorm2d(num_features=192), 99 | nn.ReLU(), 100 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 101 | nn.Dropout(0.2) 102 | ) 103 | hidden10 = nn.Sequential( 104 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 105 | nn.BatchNorm2d(num_features=192), 106 | nn.ReLU(), 107 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 108 | nn.Dropout(0.2) 109 | ) 110 | self.features = nn.Sequential( 111 | hidden1, 112 | hidden2, 113 | hidden3, 114 | hidden4, 115 | hidden5, 116 | hidden6, 117 | hidden7, 118 | hidden8, 119 | hidden9, 120 | hidden10 121 | ) 122 | self.classifier = nn.Sequential( 123 | nn.Linear(23232, 100), 124 | # nn.ReLU(inplace=True), 125 | nn.Linear(100, 100), 126 | # nn.ReLU(inplace=True), 127 | nn.Linear(100, num_classes), 128 | ) 129 | 130 | def forward(self, x): 131 | x1 = self.features(x) 132 | x11 = x1.view(x1.size(0), -1) 133 | x = self.classifier(x11) 134 | return x 135 | 136 | 137 | class fh02(nn.Module): 138 | def __init__(self, num_points, num_classes, wrPath=None): 139 | super(fh02, self).__init__() 140 | self.load_wR2(wrPath) 141 | self.classifier1 = nn.Sequential( 142 | # nn.Dropout(), 143 | nn.Linear(53248, 128), 144 | # nn.ReLU(inplace=True), 145 | # nn.Dropout(), 146 | nn.Linear(128, provNum), 147 | ) 148 | self.classifier2 = nn.Sequential( 149 | # nn.Dropout(), 150 | nn.Linear(53248, 128), 151 | # nn.ReLU(inplace=True), 152 | # nn.Dropout(), 153 | nn.Linear(128, alphaNum), 154 | ) 155 | 
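        # classifier3-classifier7 below are five more heads, one per alphanumeric character;
        # each consumes the shared ROI feature of size 53248 = (64 + 160 + 192) channels
        # x (16 x 8) cells pooled from three backbone scales in forward().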
self.classifier3 = nn.Sequential( 156 | # nn.Dropout(), 157 | nn.Linear(53248, 128), 158 | # nn.ReLU(inplace=True), 159 | # nn.Dropout(), 160 | nn.Linear(128, adNum), 161 | ) 162 | self.classifier4 = nn.Sequential( 163 | # nn.Dropout(), 164 | nn.Linear(53248, 128), 165 | # nn.ReLU(inplace=True), 166 | # nn.Dropout(), 167 | nn.Linear(128, adNum), 168 | ) 169 | self.classifier5 = nn.Sequential( 170 | # nn.Dropout(), 171 | nn.Linear(53248, 128), 172 | # nn.ReLU(inplace=True), 173 | # nn.Dropout(), 174 | nn.Linear(128, adNum), 175 | ) 176 | self.classifier6 = nn.Sequential( 177 | # nn.Dropout(), 178 | nn.Linear(53248, 128), 179 | # nn.ReLU(inplace=True), 180 | # nn.Dropout(), 181 | nn.Linear(128, adNum), 182 | ) 183 | self.classifier7 = nn.Sequential( 184 | # nn.Dropout(), 185 | nn.Linear(53248, 128), 186 | # nn.ReLU(inplace=True), 187 | # nn.Dropout(), 188 | nn.Linear(128, adNum), 189 | ) 190 | 191 | def load_wR2(self, path): 192 | self.wR2 = wR2(numPoints) 193 | self.wR2 = torch.nn.DataParallel(self.wR2, device_ids=range(torch.cuda.device_count())) 194 | if not path is None: 195 | self.wR2.load_state_dict(torch.load(path)) 196 | # self.wR2 = self.wR2.cuda() 197 | # for param in self.wR2.parameters(): 198 | # param.requires_grad = False 199 | 200 | def forward(self, x): 201 | x0 = self.wR2.module.features[0](x) 202 | _x1 = self.wR2.module.features[1](x0) 203 | x2 = self.wR2.module.features[2](_x1) 204 | _x3 = self.wR2.module.features[3](x2) 205 | x4 = self.wR2.module.features[4](_x3) 206 | _x5 = self.wR2.module.features[5](x4) 207 | 208 | x6 = self.wR2.module.features[6](_x5) 209 | x7 = self.wR2.module.features[7](x6) 210 | x8 = self.wR2.module.features[8](x7) 211 | x9 = self.wR2.module.features[9](x8) 212 | x9 = x9.view(x9.size(0), -1) 213 | boxLoc = self.wR2.module.classifier(x9) 214 | 215 | h1, w1 = _x1.data.size()[2], _x1.data.size()[3] 216 | p1 = Variable(torch.FloatTensor([[w1,0,0,0],[0,h1,0,0],[0,0,w1,0],[0,0,0,h1]]).cuda(), requires_grad=False) 217 | h2, w2 = _x3.data.size()[2], _x3.data.size()[3] 218 | p2 = Variable(torch.FloatTensor([[w2,0,0,0],[0,h2,0,0],[0,0,w2,0],[0,0,0,h2]]).cuda(), requires_grad=False) 219 | h3, w3 = _x5.data.size()[2], _x5.data.size()[3] 220 | p3 = Variable(torch.FloatTensor([[w3,0,0,0],[0,h3,0,0],[0,0,w3,0],[0,0,0,h3]]).cuda(), requires_grad=False) 221 | 222 | # x, y, w, h --> x1, y1, x2, y2 223 | assert boxLoc.data.size()[1] == 4 224 | postfix = Variable(torch.FloatTensor([[1,0,1,0],[0,1,0,1],[-0.5,0,0.5,0],[0,-0.5,0,0.5]]).cuda(), requires_grad=False) 225 | boxNew = boxLoc.mm(postfix).clamp(min=0, max=1) 226 | 227 | # input = Variable(torch.rand(2, 1, 10, 10), requires_grad=True) 228 | # rois = Variable(torch.LongTensor([[0, 1, 2, 7, 8], [0, 3, 3, 8, 8], [1, 3, 3, 8, 8]]), requires_grad=False) 229 | roi1 = roi_pooling_ims(_x1, boxNew.mm(p1), size=(16, 8)) 230 | roi2 = roi_pooling_ims(_x3, boxNew.mm(p2), size=(16, 8)) 231 | roi3 = roi_pooling_ims(_x5, boxNew.mm(p3), size=(16, 8)) 232 | rois = torch.cat((roi1, roi2, roi3), 1) 233 | 234 | _rois = rois.view(rois.size(0), -1) 235 | 236 | y0 = self.classifier1(_rois) 237 | y1 = self.classifier2(_rois) 238 | y2 = self.classifier3(_rois) 239 | y3 = self.classifier4(_rois) 240 | y4 = self.classifier5(_rois) 241 | y5 = self.classifier6(_rois) 242 | y6 = self.classifier7(_rois) 243 | return boxLoc, [y0, y1, y2, y3, y4, y5, y6] 244 | 245 | 246 | def isEqual(labelGT, labelP): 247 | print (labelGT) 248 | print (labelP) 249 | compare = [1 if int(labelGT[i]) == int(labelP[i]) else 0 for i in range(7)] 250 | # 
print(sum(compare)) 251 | return sum(compare) 252 | 253 | 254 | model_conv = fh02(numPoints, numClasses) 255 | model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count())) 256 | model_conv.load_state_dict(torch.load(resume_file)) 257 | model_conv = model_conv.cuda() 258 | model_conv.eval() 259 | 260 | 261 | dst = demoTestDataLoader(args["input"].split(','), imgSize) 262 | trainloader = DataLoader(dst, batch_size=1, shuffle=True, num_workers=1) 263 | 264 | start = time() 265 | for i, (XI, ims) in enumerate(trainloader): 266 | 267 | if use_gpu: 268 | x = Variable(XI.cuda(0)) 269 | else: 270 | x = Variable(XI) 271 | # Forward pass: Compute predicted y by passing x to the model 272 | 273 | fps_pred, y_pred = model_conv(x) 274 | 275 | outputY = [el.data.cpu().numpy().tolist() for el in y_pred] 276 | labelPred = [t[0].index(max(t[0])) for t in outputY] 277 | 278 | [cx, cy, w, h] = fps_pred.data.cpu().numpy()[0].tolist() 279 | 280 | img = cv2.imread(ims[0]) 281 | left_up = [(cx - w/2)*img.shape[1], (cy - h/2)*img.shape[0]] 282 | right_down = [(cx + w/2)*img.shape[1], (cy + h/2)*img.shape[0]] 283 | cv2.rectangle(img, (int(left_up[0]), int(left_up[1])), (int(right_down[0]), int(right_down[1])), (0, 0, 255), 2) 284 | # The first character is Chinese character, can not be printed normally, thus is omitted. 285 | lpn = alphabets[labelPred[1]] + ads[labelPred[2]] + ads[labelPred[3]] + ads[labelPred[4]] + ads[labelPred[5]] + ads[labelPred[6]] 286 | cv2.putText(img, lpn, (int(left_up[0]), int(left_up[1])-20), cv2.FONT_ITALIC, 2, (0, 0, 255)) 287 | cv2.imwrite(ims[0], img) 288 | 289 | -------------------------------------------------------------------------------- /rpnet/demo/0.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/demo/0.jpg -------------------------------------------------------------------------------- /rpnet/demo/1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/demo/1.jpg -------------------------------------------------------------------------------- /rpnet/demo/2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/demo/2.jpg -------------------------------------------------------------------------------- /rpnet/demo/3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/demo/3.jpg -------------------------------------------------------------------------------- /rpnet/demo/4.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/detectRecog/CCPD/02aaea15137c4d2fe662e57d257c6822356e9304/rpnet/demo/4.jpg -------------------------------------------------------------------------------- /rpnet/load_data.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import * 2 | from imutils import paths 3 | import cv2 4 | import numpy as np 5 | 6 | 7 | class labelFpsDataLoader(Dataset): 8 | def __init__(self, img_dir, imgSize, is_transform=None): 9 | self.img_dir = img_dir 10 | self.img_paths = [] 11 | for i in 
range(len(img_dir)): 12 | self.img_paths += [el for el in paths.list_images(img_dir[i])] 13 | # self.img_paths = os.listdir(img_dir) 14 | # print self.img_paths 15 | self.img_size = imgSize 16 | self.is_transform = is_transform 17 | 18 | def __len__(self): 19 | return len(self.img_paths) 20 | 21 | def __getitem__(self, index): 22 | img_name = self.img_paths[index] 23 | img = cv2.imread(img_name) 24 | # img = img.astype('float32') 25 | resizedImage = cv2.resize(img, self.img_size) 26 | resizedImage = np.transpose(resizedImage, (2,0,1)) 27 | resizedImage = resizedImage.astype('float32') 28 | resizedImage /= 255.0 29 | lbl = img_name.split('/')[-1].rsplit('.', 1)[0].split('-')[-3] 30 | 31 | iname = img_name.rsplit('/', 1)[-1].rsplit('.', 1)[0].split('-') 32 | # fps = [[int(eel) for eel in el.split('&')] for el in iname[3].split('_')] 33 | # leftUp, rightDown = [min([fps[el][0] for el in range(4)]), min([fps[el][1] for el in range(4)])], [ 34 | # max([fps[el][0] for el in range(4)]), max([fps[el][1] for el in range(4)])] 35 | [leftUp, rightDown] = [[int(eel) for eel in el.split('&')] for el in iname[2].split('_')] 36 | ori_w, ori_h = [float(int(el)) for el in [img.shape[1], img.shape[0]]] 37 | new_labels = [(leftUp[0] + rightDown[0]) / (2 * ori_w), (leftUp[1] + rightDown[1]) / (2 * ori_h), 38 | (rightDown[0] - leftUp[0]) / ori_w, (rightDown[1] - leftUp[1]) / ori_h] 39 | 40 | return resizedImage, new_labels, lbl, img_name 41 | 42 | 43 | class labelTestDataLoader(Dataset): 44 | def __init__(self, img_dir, imgSize, is_transform=None): 45 | self.img_dir = img_dir 46 | self.img_paths = [] 47 | for i in range(len(img_dir)): 48 | self.img_paths += [el for el in paths.list_images(img_dir[i])] 49 | # self.img_paths = os.listdir(img_dir) 50 | # print self.img_paths 51 | self.img_size = imgSize 52 | self.is_transform = is_transform 53 | 54 | def __len__(self): 55 | return len(self.img_paths) 56 | 57 | def __getitem__(self, index): 58 | img_name = self.img_paths[index] 59 | img = cv2.imread(img_name) 60 | # img = img.astype('float32') 61 | resizedImage = cv2.resize(img, self.img_size) 62 | resizedImage = np.transpose(resizedImage, (2,0,1)) 63 | resizedImage = resizedImage.astype('float32') 64 | resizedImage /= 255.0 65 | lbl = img_name.split('/')[-1].split('.')[0].split('-')[-3] 66 | return resizedImage, lbl, img_name 67 | 68 | 69 | 70 | class ChaLocDataLoader(Dataset): 71 | def __init__(self, img_dir,imgSize, is_transform=None): 72 | self.img_dir = img_dir 73 | self.img_paths = [] 74 | for i in range(len(img_dir)): 75 | self.img_paths += [el for el in paths.list_images(img_dir[i])] 76 | # self.img_paths = os.listdir(img_dir) 77 | # print self.img_paths 78 | self.img_size = imgSize 79 | self.is_transform = is_transform 80 | 81 | def __len__(self): 82 | return len(self.img_paths) 83 | 84 | def __getitem__(self, index): 85 | img_name = self.img_paths[index] 86 | img = cv2.imread(img_name) 87 | resizedImage = cv2.resize(img, self.img_size) 88 | resizedImage = np.reshape(resizedImage, (resizedImage.shape[2], resizedImage.shape[0], resizedImage.shape[1])) 89 | 90 | iname = img_name.rsplit('/', 1)[-1].rsplit('.', 1)[0].split('-') 91 | [leftUp, rightDown] = [[int(eel) for eel in el.split('&')] for el in iname[2].split('_')] 92 | 93 | # tps = [[int(eel) for eel in el.split('&')] for el in iname[2].split('_')] 94 | # for dot in tps: 95 | # cv2.circle(img, (int(dot[0]), int(dot[1])), 2, (0, 0, 255), 2) 96 | # cv2.imwrite("/home/xubb/1_new.jpg", img) 97 | 98 | ori_w, ori_h = float(img.shape[1]), float(img.shape[0]) 
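        # CCPD images have a fixed height of 1160 px (asserted below); the label is the box as
        # normalized [center_x, center_y, width, height], each in [0, 1] of the original image size.
        # (Note: the np.reshape above moves channels first without permuting pixels; the other
        # loaders in this file use np.transpose(img, (2, 0, 1)) for that.)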
99 | assert img.shape[0] == 1160 100 | new_labels = [(leftUp[0] + rightDown[0])/(2*ori_w), (leftUp[1] + rightDown[1])/(2*ori_h), (rightDown[0]-leftUp[0])/ori_w, (rightDown[1]-leftUp[1])/ori_h] 101 | 102 | resizedImage = resizedImage.astype('float32') 103 | # Y = Y.astype('int8') 104 | resizedImage /= 255.0 105 | # lbl = img_name.split('.')[0].rsplit('-',1)[-1].split('_')[:-1] 106 | # lbl = img_name.split('/')[-1].split('.')[0].rsplit('-',1)[-1] 107 | # lbl = map(int, lbl) 108 | # lbl2 = [[el] for el in lbl] 109 | 110 | # resizedImage = torch.from_numpy(resizedImage).float() 111 | return resizedImage, new_labels 112 | 113 | 114 | class demoTestDataLoader(Dataset): 115 | def __init__(self, img_dir, imgSize, is_transform=None): 116 | self.img_dir = img_dir 117 | self.img_paths = [] 118 | for i in range(len(img_dir)): 119 | self.img_paths += [el for el in paths.list_images(img_dir[i])] 120 | # self.img_paths = os.listdir(img_dir) 121 | # print self.img_paths 122 | self.img_size = imgSize 123 | self.is_transform = is_transform 124 | 125 | def __len__(self): 126 | return len(self.img_paths) 127 | 128 | def __getitem__(self, index): 129 | img_name = self.img_paths[index] 130 | img = cv2.imread(img_name) 131 | # img = img.astype('float32') 132 | resizedImage = cv2.resize(img, self.img_size) 133 | resizedImage = np.transpose(resizedImage, (2,0,1)) 134 | resizedImage = resizedImage.astype('float32') 135 | resizedImage /= 255.0 136 | return resizedImage, img_name 137 | -------------------------------------------------------------------------------- /rpnet/roi_pooling.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.autograd as ag 3 | from torch.autograd.function import Function 4 | from torch._thnn import type2backend 5 | 6 | 7 | class AdaptiveMaxPool2d(Function): 8 | def __init__(self, out_w, out_h): 9 | super(AdaptiveMaxPool2d, self).__init__() 10 | self.out_w = out_w 11 | self.out_h = out_h 12 | 13 | def forward(self, input): 14 | output = input.new() 15 | indices = input.new().long() 16 | self.save_for_backward(input) 17 | self.indices = indices 18 | self._backend = type2backend[type(input)] 19 | self._backend.SpatialAdaptiveMaxPooling_updateOutput( 20 | self._backend.library_state, input, output, indices, 21 | self.out_w, self.out_h) 22 | return output 23 | 24 | def backward(self, grad_output): 25 | input, = self.saved_tensors 26 | indices = self.indices 27 | grad_input = grad_output.new() 28 | self._backend.SpatialAdaptiveMaxPooling_updateGradInput( 29 | self._backend.library_state, input, grad_output, grad_input, 30 | indices) 31 | return grad_input, None 32 | 33 | 34 | def adaptive_max_pool(input, size): 35 | return AdaptiveMaxPool2d(size[0], size[1])(input) 36 | 37 | 38 | def roi_pooling(input, rois, size=(7, 7), spatial_scale=1.0): 39 | assert (rois.dim() == 2) 40 | assert (rois.size(1) == 5) 41 | output = [] 42 | rois = rois.data.float() 43 | num_rois = rois.size(0) 44 | 45 | rois[:, 1:].mul_(spatial_scale) 46 | rois = rois.long() 47 | for i in range(num_rois): 48 | roi = rois[i] 49 | im_idx = roi[0] 50 | # im = input.narrow(0, im_idx, 1) 51 | im = input.narrow(0, im_idx, 1)[..., roi[2]:(roi[4] + 1), roi[1]:(roi[3] + 1)] 52 | output.append(adaptive_max_pool(im, size)) 53 | 54 | return torch.cat(output, 0) 55 | 56 | 57 | def roi_pooling_ims(input, rois, size=(7, 7), spatial_scale=1.0): 58 | # written for one roi one image 59 | # size: (w, h) 60 | assert (rois.dim() == 2) 61 | assert len(input) == len(rois) 62 | assert 
(rois.size(1) == 4) 63 | output = [] 64 | rois = rois.data.float() 65 | num_rois = rois.size(0) 66 | 67 | rois[:, 1:].mul_(spatial_scale) 68 | rois = rois.long() 69 | for i in range(num_rois): 70 | roi = rois[i] 71 | # im = input.narrow(0, im_idx, 1) 72 | im = input.narrow(0, i, 1)[..., roi[1]:(roi[3] + 1), roi[0]:(roi[2] + 1)] 73 | output.append(adaptive_max_pool(im, size)) 74 | 75 | return torch.cat(output, 0) 76 | 77 | if __name__ == '__main__': 78 | input = ag.Variable(torch.rand(2, 1, 10, 10), requires_grad=True) 79 | rois = ag.Variable(torch.LongTensor([[1, 2, 7, 8], [3, 3, 8, 8]]), requires_grad=False) 80 | 81 | out = roi_pooling_ims(input, rois, size=(8, 8)) 82 | out.backward(out.data.clone().uniform_()) 83 | 84 | # input = ag.Variable(torch.rand(2, 1, 10, 10), requires_grad=True) 85 | # rois = ag.Variable(torch.LongTensor([[0, 1, 2, 7, 8], [0, 3, 3, 8, 8], [1, 3, 3, 8, 8]]), requires_grad=False) 86 | # rois = ag.Variable(torch.LongTensor([[0,3,3,8,8]]),requires_grad=False) 87 | 88 | # out = adaptive_max_pool(input, (3, 3)) 89 | # out.backward(out.data.clone().uniform_()) 90 | 91 | # out = roi_pooling(input, rois, size=(3, 3)) 92 | # out.backward(out.data.clone().uniform_()) 93 | -------------------------------------------------------------------------------- /rpnet/rpnet.py: -------------------------------------------------------------------------------- 1 | # Compared to fh0.py 2 | # fh02.py remove the redundant ims in model input 3 | from __future__ import print_function, division 4 | import cv2 5 | import torch 6 | import torch.nn as nn 7 | import torch.optim as optim 8 | from torch.autograd import Variable 9 | import numpy as np 10 | import os 11 | import argparse 12 | from time import time 13 | from load_data import * 14 | from roi_pooling import roi_pooling_ims 15 | from torch.optim import lr_scheduler 16 | 17 | 18 | ap = argparse.ArgumentParser() 19 | ap.add_argument("-i", "--images", required=True, 20 | help="path to the input file") 21 | ap.add_argument("-n", "--epochs", default=10000, 22 | help="epochs for train") 23 | ap.add_argument("-b", "--batchsize", default=5, 24 | help="batch size for train") 25 | ap.add_argument("-se", "--start_epoch", required=True, 26 | help="start epoch for train") 27 | ap.add_argument("-t", "--test", required=True, 28 | help="dirs for test") 29 | ap.add_argument("-r", "--resume", default='111', 30 | help="file for re-train") 31 | ap.add_argument("-f", "--folder", required=True, 32 | help="folder to store model") 33 | ap.add_argument("-w", "--writeFile", default='fh02.out', 34 | help="file for output") 35 | args = vars(ap.parse_args()) 36 | 37 | wR2Path = './wR2/wR2.pth2' 38 | use_gpu = torch.cuda.is_available() 39 | print (use_gpu) 40 | 41 | numClasses = 7 42 | numPoints = 4 43 | classifyNum = 35 44 | imgSize = (480, 480) 45 | # lpSize = (128, 64) 46 | provNum, alphaNum, adNum = 38, 25, 35 47 | batchSize = int(args["batchsize"]) if use_gpu else 2 48 | trainDirs = args["images"].split(',') 49 | testDirs = args["test"].split(',') 50 | modelFolder = str(args["folder"]) if str(args["folder"])[-1] == '/' else str(args["folder"]) + '/' 51 | storeName = modelFolder + 'fh02.pth' 52 | if not os.path.isdir(modelFolder): 53 | os.mkdir(modelFolder) 54 | 55 | epochs = int(args["epochs"]) 56 | # initialize the output file 57 | if not os.path.isfile(args['writeFile']): 58 | with open(args['writeFile'], 'wb') as outF: 59 | pass 60 | 61 | 62 | def get_n_params(model): 63 | pp=0 64 | for p in list(model.parameters()): 65 | nn=1 66 | for s in list(p.size()): 67 
| nn = nn*s 68 | pp += nn 69 | return pp 70 | 71 | 72 | class wR2(nn.Module): 73 | def __init__(self, num_classes=1000): 74 | super(wR2, self).__init__() 75 | hidden1 = nn.Sequential( 76 | nn.Conv2d(in_channels=3, out_channels=48, kernel_size=5, padding=2, stride=2), 77 | nn.BatchNorm2d(num_features=48), 78 | nn.ReLU(), 79 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 80 | nn.Dropout(0.2) 81 | ) 82 | hidden2 = nn.Sequential( 83 | nn.Conv2d(in_channels=48, out_channels=64, kernel_size=5, padding=2), 84 | nn.BatchNorm2d(num_features=64), 85 | nn.ReLU(), 86 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 87 | nn.Dropout(0.2) 88 | ) 89 | hidden3 = nn.Sequential( 90 | nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding=2), 91 | nn.BatchNorm2d(num_features=128), 92 | nn.ReLU(), 93 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 94 | nn.Dropout(0.2) 95 | ) 96 | hidden4 = nn.Sequential( 97 | nn.Conv2d(in_channels=128, out_channels=160, kernel_size=5, padding=2), 98 | nn.BatchNorm2d(num_features=160), 99 | nn.ReLU(), 100 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 101 | nn.Dropout(0.2) 102 | ) 103 | hidden5 = nn.Sequential( 104 | nn.Conv2d(in_channels=160, out_channels=192, kernel_size=5, padding=2), 105 | nn.BatchNorm2d(num_features=192), 106 | nn.ReLU(), 107 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 108 | nn.Dropout(0.2) 109 | ) 110 | hidden6 = nn.Sequential( 111 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 112 | nn.BatchNorm2d(num_features=192), 113 | nn.ReLU(), 114 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 115 | nn.Dropout(0.2) 116 | ) 117 | hidden7 = nn.Sequential( 118 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 119 | nn.BatchNorm2d(num_features=192), 120 | nn.ReLU(), 121 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 122 | nn.Dropout(0.2) 123 | ) 124 | hidden8 = nn.Sequential( 125 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 126 | nn.BatchNorm2d(num_features=192), 127 | nn.ReLU(), 128 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 129 | nn.Dropout(0.2) 130 | ) 131 | hidden9 = nn.Sequential( 132 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 133 | nn.BatchNorm2d(num_features=192), 134 | nn.ReLU(), 135 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 136 | nn.Dropout(0.2) 137 | ) 138 | hidden10 = nn.Sequential( 139 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 140 | nn.BatchNorm2d(num_features=192), 141 | nn.ReLU(), 142 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 143 | nn.Dropout(0.2) 144 | ) 145 | self.features = nn.Sequential( 146 | hidden1, 147 | hidden2, 148 | hidden3, 149 | hidden4, 150 | hidden5, 151 | hidden6, 152 | hidden7, 153 | hidden8, 154 | hidden9, 155 | hidden10 156 | ) 157 | self.classifier = nn.Sequential( 158 | nn.Linear(23232, 100), 159 | # nn.ReLU(inplace=True), 160 | nn.Linear(100, 100), 161 | # nn.ReLU(inplace=True), 162 | nn.Linear(100, num_classes), 163 | ) 164 | 165 | def forward(self, x): 166 | x1 = self.features(x) 167 | x11 = x1.view(x1.size(0), -1) 168 | x = self.classifier(x11) 169 | return x 170 | 171 | 172 | class fh02(nn.Module): 173 | def __init__(self, num_points, num_classes, wrPath=None): 174 | super(fh02, self).__init__() 175 | self.load_wR2(wrPath) 176 | self.classifier1 = nn.Sequential( 177 | # nn.Dropout(), 178 | nn.Linear(53248, 128), 179 | # nn.ReLU(inplace=True), 180 | # nn.Dropout(), 181 | nn.Linear(128, provNum), 182 | ) 183 | 
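        # Seven parallel heads over the shared ROI feature: classifier1 -> province
        # (provNum = 38), classifier2 -> letter (alphaNum = 25), and classifier3-classifier7
        # -> the five alphanumeric characters (adNum = 35).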
self.classifier2 = nn.Sequential( 184 | # nn.Dropout(), 185 | nn.Linear(53248, 128), 186 | # nn.ReLU(inplace=True), 187 | # nn.Dropout(), 188 | nn.Linear(128, alphaNum), 189 | ) 190 | self.classifier3 = nn.Sequential( 191 | # nn.Dropout(), 192 | nn.Linear(53248, 128), 193 | # nn.ReLU(inplace=True), 194 | # nn.Dropout(), 195 | nn.Linear(128, adNum), 196 | ) 197 | self.classifier4 = nn.Sequential( 198 | # nn.Dropout(), 199 | nn.Linear(53248, 128), 200 | # nn.ReLU(inplace=True), 201 | # nn.Dropout(), 202 | nn.Linear(128, adNum), 203 | ) 204 | self.classifier5 = nn.Sequential( 205 | # nn.Dropout(), 206 | nn.Linear(53248, 128), 207 | # nn.ReLU(inplace=True), 208 | # nn.Dropout(), 209 | nn.Linear(128, adNum), 210 | ) 211 | self.classifier6 = nn.Sequential( 212 | # nn.Dropout(), 213 | nn.Linear(53248, 128), 214 | # nn.ReLU(inplace=True), 215 | # nn.Dropout(), 216 | nn.Linear(128, adNum), 217 | ) 218 | self.classifier7 = nn.Sequential( 219 | # nn.Dropout(), 220 | nn.Linear(53248, 128), 221 | # nn.ReLU(inplace=True), 222 | # nn.Dropout(), 223 | nn.Linear(128, adNum), 224 | ) 225 | 226 | def load_wR2(self, path): 227 | self.wR2 = wR2(numPoints) 228 | self.wR2 = torch.nn.DataParallel(self.wR2, device_ids=range(torch.cuda.device_count())) 229 | if not path is None: 230 | self.wR2.load_state_dict(torch.load(path)) 231 | # self.wR2 = self.wR2.cuda() 232 | # for param in self.wR2.parameters(): 233 | # param.requires_grad = False 234 | 235 | def forward(self, x): 236 | x0 = self.wR2.module.features[0](x) 237 | _x1 = self.wR2.module.features[1](x0) 238 | x2 = self.wR2.module.features[2](_x1) 239 | _x3 = self.wR2.module.features[3](x2) 240 | x4 = self.wR2.module.features[4](_x3) 241 | _x5 = self.wR2.module.features[5](x4) 242 | 243 | x6 = self.wR2.module.features[6](_x5) 244 | x7 = self.wR2.module.features[7](x6) 245 | x8 = self.wR2.module.features[8](x7) 246 | x9 = self.wR2.module.features[9](x8) 247 | x9 = x9.view(x9.size(0), -1) 248 | boxLoc = self.wR2.module.classifier(x9) 249 | 250 | h1, w1 = _x1.data.size()[2], _x1.data.size()[3] 251 | p1 = Variable(torch.FloatTensor([[w1,0,0,0],[0,h1,0,0],[0,0,w1,0],[0,0,0,h1]]).cuda(), requires_grad=False) 252 | h2, w2 = _x3.data.size()[2], _x3.data.size()[3] 253 | p2 = Variable(torch.FloatTensor([[w2,0,0,0],[0,h2,0,0],[0,0,w2,0],[0,0,0,h2]]).cuda(), requires_grad=False) 254 | h3, w3 = _x5.data.size()[2], _x5.data.size()[3] 255 | p3 = Variable(torch.FloatTensor([[w3,0,0,0],[0,h3,0,0],[0,0,w3,0],[0,0,0,h3]]).cuda(), requires_grad=False) 256 | 257 | # x, y, w, h --> x1, y1, x2, y2 258 | assert boxLoc.data.size()[1] == 4 259 | postfix = Variable(torch.FloatTensor([[1,0,1,0],[0,1,0,1],[-0.5,0,0.5,0],[0,-0.5,0,0.5]]).cuda(), requires_grad=False) 260 | boxNew = boxLoc.mm(postfix).clamp(min=0, max=1) 261 | 262 | # input = Variable(torch.rand(2, 1, 10, 10), requires_grad=True) 263 | # rois = Variable(torch.LongTensor([[0, 1, 2, 7, 8], [0, 3, 3, 8, 8], [1, 3, 3, 8, 8]]), requires_grad=False) 264 | roi1 = roi_pooling_ims(_x1, boxNew.mm(p1), size=(16, 8)) 265 | roi2 = roi_pooling_ims(_x3, boxNew.mm(p2), size=(16, 8)) 266 | roi3 = roi_pooling_ims(_x5, boxNew.mm(p3), size=(16, 8)) 267 | rois = torch.cat((roi1, roi2, roi3), 1) 268 | 269 | _rois = rois.view(rois.size(0), -1) 270 | 271 | y0 = self.classifier1(_rois) 272 | y1 = self.classifier2(_rois) 273 | y2 = self.classifier3(_rois) 274 | y3 = self.classifier4(_rois) 275 | y4 = self.classifier5(_rois) 276 | y5 = self.classifier6(_rois) 277 | y6 = self.classifier7(_rois) 278 | return boxLoc, [y0, y1, y2, y3, y4, y5, y6] 279 | 280 
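# Note on forward() above: boxLoc is the predicted plate box as normalized [cx, cy, w, h];
# right-multiplying by postfix converts it to corners
# [x1, y1, x2, y2] = [cx - w/2, cy - h/2, cx + w/2, cy + h/2], and p1/p2/p3 rescale those
# corners to the pixel grid of each of the three feature maps before ROI pooling.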
281 | epoch_start = int(args["start_epoch"])
282 | resume_file = str(args["resume"])
283 | if not resume_file == '111':
284 |     # epoch_start = int(resume_file[resume_file.find('pth') + 3:]) + 1
285 |     if not os.path.isfile(resume_file):
286 |         print ("failed to load existing model! Exiting ...")
287 |         exit(0)
288 |     print ("Loading existing model! %s" % resume_file)
289 |     model_conv = fh02(numPoints, numClasses)
290 |     model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count()))
291 |     model_conv.load_state_dict(torch.load(resume_file))
292 |     model_conv = model_conv.cuda()
293 | else:
294 |     model_conv = fh02(numPoints, numClasses, wR2Path)
295 |     if use_gpu:
296 |         model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count()))
297 |         model_conv = model_conv.cuda()
298 |
299 | print(model_conv)
300 | print(get_n_params(model_conv))
301 |
302 | criterion = nn.CrossEntropyLoss()
303 | # optimizer_conv = optim.RMSprop(model_conv.parameters(), lr=0.01, momentum=0.9)
304 | optimizer_conv = optim.SGD(model_conv.parameters(), lr=0.001, momentum=0.9)
305 |
306 | dst = labelFpsDataLoader(trainDirs, imgSize)
307 | trainloader = DataLoader(dst, batch_size=batchSize, shuffle=True, num_workers=8)
308 | lrScheduler = lr_scheduler.StepLR(optimizer_conv, step_size=5, gamma=0.1)
309 |
310 |
311 | def isEqual(labelGT, labelP):
312 |     compare = [1 if int(labelGT[i]) == int(labelP[i]) else 0 for i in range(7)]
313 |     # print(sum(compare))
314 |     return sum(compare)
315 |
316 |
317 | def eval(model, test_dirs):
318 |     count, error, correct = 0, 0, 0
319 |     dst = labelTestDataLoader(test_dirs, imgSize)
320 |     testloader = DataLoader(dst, batch_size=1, shuffle=True, num_workers=8)
321 |     start = time()
322 |     for i, (XI, labels, ims) in enumerate(testloader):
323 |         count += 1
324 |         YI = [[int(ee) for ee in el.split('_')[:7]] for el in labels]
325 |         if use_gpu:
326 |             x = Variable(XI.cuda(0))
327 |         else:
328 |             x = Variable(XI)
329 |         # Forward pass: Compute predicted y by passing x to the model
330 |
331 |         fps_pred, y_pred = model(x)
332 |
333 |         outputY = [el.data.cpu().numpy().tolist() for el in y_pred]
334 |         labelPred = [t[0].index(max(t[0])) for t in outputY]
335 |
336 |         # compare YI, outputY
337 |         try:
338 |             if isEqual(labelPred, YI[0]) == 7:
339 |                 correct += 1
340 |             else:
341 |                 pass
342 |         except:
343 |             error += 1
344 |     return count, correct, error, float(correct) / count, (time() - start) / count
345 |
346 |
347 | def train_model(model, criterion, optimizer, num_epochs=25):
348 |     # since = time.time()
349 |     for epoch in range(epoch_start, num_epochs):
350 |         lossAver = []
351 |         model.train(True)
352 |         lrScheduler.step()
353 |         start = time()
354 |
355 |         for i, (XI, Y, labels, ims) in enumerate(trainloader):
356 |             if not len(XI) == batchSize:
357 |                 continue
358 |
359 |             YI = [[int(ee) for ee in el.split('_')[:7]] for el in labels]
360 |             Y = np.array([el.numpy() for el in Y]).T
361 |             if use_gpu:
362 |                 x = Variable(XI.cuda(0))
363 |                 y = Variable(torch.FloatTensor(Y).cuda(0), requires_grad=False)
364 |             else:
365 |                 x = Variable(XI)
366 |                 y = Variable(torch.FloatTensor(Y), requires_grad=False)
367 |             # Forward pass: Compute predicted y by passing x to the model
368 |
369 |             try:
370 |                 fps_pred, y_pred = model(x)
371 |             except:
372 |                 continue
373 |
374 |             # Compute and print loss
375 |             loss = 0.0
376 |             loss += 0.8 * nn.L1Loss().cuda()(fps_pred[:][:2], y[:][:2])
377 |             loss += 0.2 * nn.L1Loss().cuda()(fps_pred[:][2:], y[:][2:])
378 |             for j in range(7):
379 |                 l = 
Variable(torch.LongTensor([el[j] for el in YI]).cuda(0)) 380 | loss += criterion(y_pred[j], l) 381 | 382 | # Zero gradients, perform a backward pass, and update the weights. 383 | optimizer.zero_grad() 384 | loss.backward() 385 | optimizer.step() 386 | 387 | try: 388 | lossAver.append(loss.data[0]) 389 | except: 390 | pass 391 | 392 | if i % 50 == 1: 393 | with open(args['writeFile'], 'a') as outF: 394 | outF.write('train %s images, use %s seconds, loss %s\n' % (i*batchSize, time() - start, sum(lossAver) / len(lossAver) if len(lossAver)>0 else 'NoLoss')) 395 | torch.save(model.state_dict(), storeName) 396 | print ('%s %s %s\n' % (epoch, sum(lossAver) / len(lossAver), time()-start)) 397 | model.eval() 398 | count, correct, error, precision, avgTime = eval(model, testDirs) 399 | with open(args['writeFile'], 'a') as outF: 400 | outF.write('%s %s %s\n' % (epoch, sum(lossAver) / len(lossAver), time() - start)) 401 | outF.write('*** total %s error %s precision %s avgTime %s\n' % (count, error, precision, avgTime)) 402 | torch.save(model.state_dict(), storeName + str(epoch)) 403 | return model 404 | 405 | 406 | model_conv = train_model(model_conv, criterion, optimizer_conv, num_epochs=epochs) 407 | -------------------------------------------------------------------------------- /rpnet/rpnetEval.py: -------------------------------------------------------------------------------- 1 | #encoding:utf-8 2 | import cv2 3 | import torch 4 | from torch.autograd import Variable 5 | import torch.nn as nn 6 | import argparse 7 | import numpy as np 8 | from os import path, mkdir 9 | from load_data import * 10 | from time import time 11 | from roi_pooling import roi_pooling_ims 12 | from shutil import copyfile 13 | 14 | ap = argparse.ArgumentParser() 15 | ap.add_argument("-i", "--input", required=True, 16 | help="path to the input folder") 17 | ap.add_argument("-m", "--model", required=True, 18 | help="path to the model file") 19 | ap.add_argument("-s", "--store", required=True, 20 | help="path to the store folder") 21 | args = vars(ap.parse_args()) 22 | 23 | # N is batch size; D_in is input dimension; 24 | # H is hidden dimension; D_out is output dimension. 
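# rpnetEval.py runs a trained RPnet over a labelled test folder and reports whole-plate
# precision (all seven characters correct), the fraction of plates with at least six of
# seven characters correct, and the average per-image time.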
25 | use_gpu = torch.cuda.is_available() 26 | print (use_gpu) 27 | 28 | numClasses = 4 29 | numPoints = 4 30 | imgSize = (480, 480) 31 | batchSize = 8 if use_gpu else 8 32 | resume_file = str(args["model"]) 33 | 34 | provNum, alphaNum, adNum = 38, 25, 35 35 | provinces = ["皖", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "京", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂", 36 | "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "警", "学", "O"] 37 | alphabets = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 38 | 'X', 'Y', 'Z', 'O'] 39 | ads = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J', 'K', 'L', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 40 | 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'O'] 41 | 42 | class wR2(nn.Module): 43 | def __init__(self, num_classes=1000): 44 | super(wR2, self).__init__() 45 | hidden1 = nn.Sequential( 46 | nn.Conv2d(in_channels=3, out_channels=48, kernel_size=5, padding=2, stride=2), 47 | nn.BatchNorm2d(num_features=48), 48 | nn.ReLU(), 49 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 50 | nn.Dropout(0.2) 51 | ) 52 | hidden2 = nn.Sequential( 53 | nn.Conv2d(in_channels=48, out_channels=64, kernel_size=5, padding=2), 54 | nn.BatchNorm2d(num_features=64), 55 | nn.ReLU(), 56 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 57 | nn.Dropout(0.2) 58 | ) 59 | hidden3 = nn.Sequential( 60 | nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding=2), 61 | nn.BatchNorm2d(num_features=128), 62 | nn.ReLU(), 63 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 64 | nn.Dropout(0.2) 65 | ) 66 | hidden4 = nn.Sequential( 67 | nn.Conv2d(in_channels=128, out_channels=160, kernel_size=5, padding=2), 68 | nn.BatchNorm2d(num_features=160), 69 | nn.ReLU(), 70 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 71 | nn.Dropout(0.2) 72 | ) 73 | hidden5 = nn.Sequential( 74 | nn.Conv2d(in_channels=160, out_channels=192, kernel_size=5, padding=2), 75 | nn.BatchNorm2d(num_features=192), 76 | nn.ReLU(), 77 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 78 | nn.Dropout(0.2) 79 | ) 80 | hidden6 = nn.Sequential( 81 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 82 | nn.BatchNorm2d(num_features=192), 83 | nn.ReLU(), 84 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 85 | nn.Dropout(0.2) 86 | ) 87 | hidden7 = nn.Sequential( 88 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 89 | nn.BatchNorm2d(num_features=192), 90 | nn.ReLU(), 91 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 92 | nn.Dropout(0.2) 93 | ) 94 | hidden8 = nn.Sequential( 95 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2), 96 | nn.BatchNorm2d(num_features=192), 97 | nn.ReLU(), 98 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 99 | nn.Dropout(0.2) 100 | ) 101 | hidden9 = nn.Sequential( 102 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 103 | nn.BatchNorm2d(num_features=192), 104 | nn.ReLU(), 105 | nn.MaxPool2d(kernel_size=2, stride=2, padding=1), 106 | nn.Dropout(0.2) 107 | ) 108 | hidden10 = nn.Sequential( 109 | nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1), 110 | nn.BatchNorm2d(num_features=192), 111 | nn.ReLU(), 112 | nn.MaxPool2d(kernel_size=2, stride=1, padding=1), 113 | nn.Dropout(0.2) 114 | ) 115 | self.features = nn.Sequential( 116 | hidden1, 117 | hidden2, 118 | hidden3, 119 | hidden4, 120 | hidden5, 121 | hidden6, 122 | hidden7, 123 | hidden8, 124 | hidden9, 125 | hidden10 126 
| ) 127 | self.classifier = nn.Sequential( 128 | nn.Linear(23232, 100), 129 | # nn.ReLU(inplace=True), 130 | nn.Linear(100, 100), 131 | # nn.ReLU(inplace=True), 132 | nn.Linear(100, num_classes), 133 | ) 134 | 135 | def forward(self, x): 136 | x1 = self.features(x) 137 | x11 = x1.view(x1.size(0), -1) 138 | x = self.classifier(x11) 139 | return x 140 | 141 | 142 | class fh02(nn.Module): 143 | def __init__(self, num_points, num_classes, wrPath=None): 144 | super(fh02, self).__init__() 145 | self.load_wR2(wrPath) 146 | self.classifier1 = nn.Sequential( 147 | # nn.Dropout(), 148 | nn.Linear(53248, 128), 149 | # nn.ReLU(inplace=True), 150 | # nn.Dropout(), 151 | nn.Linear(128, provNum), 152 | ) 153 | self.classifier2 = nn.Sequential( 154 | # nn.Dropout(), 155 | nn.Linear(53248, 128), 156 | # nn.ReLU(inplace=True), 157 | # nn.Dropout(), 158 | nn.Linear(128, alphaNum), 159 | ) 160 | self.classifier3 = nn.Sequential( 161 | # nn.Dropout(), 162 | nn.Linear(53248, 128), 163 | # nn.ReLU(inplace=True), 164 | # nn.Dropout(), 165 | nn.Linear(128, adNum), 166 | ) 167 | self.classifier4 = nn.Sequential( 168 | # nn.Dropout(), 169 | nn.Linear(53248, 128), 170 | # nn.ReLU(inplace=True), 171 | # nn.Dropout(), 172 | nn.Linear(128, adNum), 173 | ) 174 | self.classifier5 = nn.Sequential( 175 | # nn.Dropout(), 176 | nn.Linear(53248, 128), 177 | # nn.ReLU(inplace=True), 178 | # nn.Dropout(), 179 | nn.Linear(128, adNum), 180 | ) 181 | self.classifier6 = nn.Sequential( 182 | # nn.Dropout(), 183 | nn.Linear(53248, 128), 184 | # nn.ReLU(inplace=True), 185 | # nn.Dropout(), 186 | nn.Linear(128, adNum), 187 | ) 188 | self.classifier7 = nn.Sequential( 189 | # nn.Dropout(), 190 | nn.Linear(53248, 128), 191 | # nn.ReLU(inplace=True), 192 | # nn.Dropout(), 193 | nn.Linear(128, adNum), 194 | ) 195 | 196 | def load_wR2(self, path): 197 | self.wR2 = wR2(numPoints) 198 | self.wR2 = torch.nn.DataParallel(self.wR2, device_ids=range(torch.cuda.device_count())) 199 | if not path is None: 200 | self.wR2.load_state_dict(torch.load(path)) 201 | # self.wR2 = self.wR2.cuda() 202 | # for param in self.wR2.parameters(): 203 | # param.requires_grad = False 204 | 205 | def forward(self, x): 206 | x0 = self.wR2.module.features[0](x) 207 | _x1 = self.wR2.module.features[1](x0) 208 | x2 = self.wR2.module.features[2](_x1) 209 | _x3 = self.wR2.module.features[3](x2) 210 | x4 = self.wR2.module.features[4](_x3) 211 | _x5 = self.wR2.module.features[5](x4) 212 | 213 | x6 = self.wR2.module.features[6](_x5) 214 | x7 = self.wR2.module.features[7](x6) 215 | x8 = self.wR2.module.features[8](x7) 216 | x9 = self.wR2.module.features[9](x8) 217 | x9 = x9.view(x9.size(0), -1) 218 | boxLoc = self.wR2.module.classifier(x9) 219 | 220 | h1, w1 = _x1.data.size()[2], _x1.data.size()[3] 221 | p1 = Variable(torch.FloatTensor([[w1,0,0,0],[0,h1,0,0],[0,0,w1,0],[0,0,0,h1]]).cuda(), requires_grad=False) 222 | h2, w2 = _x3.data.size()[2], _x3.data.size()[3] 223 | p2 = Variable(torch.FloatTensor([[w2,0,0,0],[0,h2,0,0],[0,0,w2,0],[0,0,0,h2]]).cuda(), requires_grad=False) 224 | h3, w3 = _x5.data.size()[2], _x5.data.size()[3] 225 | p3 = Variable(torch.FloatTensor([[w3,0,0,0],[0,h3,0,0],[0,0,w3,0],[0,0,0,h3]]).cuda(), requires_grad=False) 226 | 227 | # x, y, w, h --> x1, y1, x2, y2 228 | assert boxLoc.data.size()[1] == 4 229 | postfix = Variable(torch.FloatTensor([[1,0,1,0],[0,1,0,1],[-0.5,0,0.5,0],[0,-0.5,0,0.5]]).cuda(), requires_grad=False) 230 | boxNew = boxLoc.mm(postfix).clamp(min=0, max=1) 231 | 232 | # input = Variable(torch.rand(2, 1, 10, 10), requires_grad=True) 233 | 
# rois = Variable(torch.LongTensor([[0, 1, 2, 7, 8], [0, 3, 3, 8, 8], [1, 3, 3, 8, 8]]), requires_grad=False) 234 | roi1 = roi_pooling_ims(_x1, boxNew.mm(p1), size=(16, 8)) 235 | roi2 = roi_pooling_ims(_x3, boxNew.mm(p2), size=(16, 8)) 236 | roi3 = roi_pooling_ims(_x5, boxNew.mm(p3), size=(16, 8)) 237 | rois = torch.cat((roi1, roi2, roi3), 1) 238 | 239 | _rois = rois.view(rois.size(0), -1) 240 | 241 | y0 = self.classifier1(_rois) 242 | y1 = self.classifier2(_rois) 243 | y2 = self.classifier3(_rois) 244 | y3 = self.classifier4(_rois) 245 | y4 = self.classifier5(_rois) 246 | y5 = self.classifier6(_rois) 247 | y6 = self.classifier7(_rois) 248 | return boxLoc, [y0, y1, y2, y3, y4, y5, y6] 249 | 250 | 251 | def isEqual(labelGT, labelP): 252 | # print (labelGT) 253 | # print (labelP) 254 | compare = [1 if int(labelGT[i]) == int(labelP[i]) else 0 for i in range(7)] 255 | # print(sum(compare)) 256 | return sum(compare) 257 | 258 | 259 | model_conv = fh02(numPoints, numClasses) 260 | model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count())) 261 | model_conv.load_state_dict(torch.load(resume_file)) 262 | model_conv = model_conv.cuda() 263 | model_conv.eval() 264 | 265 | # efficiency evaluation 266 | # dst = imgDataLoader([args["input"]], imgSize) 267 | # trainloader = DataLoader(dst, batch_size=batchSize, shuffle=True, num_workers=4) 268 | # 269 | # start = time() 270 | # for i, (XI) in enumerate(trainloader): 271 | # x = Variable(XI.cuda(0)) 272 | # y_pred = model_conv(x) 273 | # outputY = y_pred.data.cpu().numpy() 274 | # # assert len(outputY) == batchSize 275 | # print("detect efficiency %s seconds" %(time() - start)) 276 | 277 | 278 | count = 0 279 | correct = 0 280 | error = 0 281 | sixCorrect = 0 282 | sFolder = str(args["store"]) 283 | sFolder = sFolder if sFolder[-1] == '/' else sFolder + '/' 284 | if not path.isdir(sFolder): 285 | mkdir(sFolder) 286 | 287 | dst = labelTestDataLoader(args["input"].split(','), imgSize) 288 | trainloader = DataLoader(dst, batch_size=1, shuffle=True, num_workers=1) 289 | with open('fh0Eval', 'wb') as outF: 290 | pass 291 | 292 | start = time() 293 | for i, (XI, labels, ims) in enumerate(trainloader): 294 | count += 1 295 | YI = [[int(ee) for ee in el.split('_')[:7]] for el in labels] 296 | if use_gpu: 297 | x = Variable(XI.cuda(0)) 298 | else: 299 | x = Variable(XI) 300 | # Forward pass: Compute predicted y by passing x to the model 301 | 302 | fps_pred, y_pred = model_conv(x) 303 | 304 | outputY = [el.data.cpu().numpy().tolist() for el in y_pred] 305 | labelPred = [t[0].index(max(t[0])) for t in outputY] 306 | 307 | # compare YI, outputY 308 | # try: 309 | if isEqual(labelPred, YI[0]) == 7: 310 | correct += 1 311 | sixCorrect += 1 312 | else: 313 | sixCorrect += 1 if isEqual(labelPred, YI[0]) == 6 else 0 314 | 315 | if count % 50 == 0: 316 | print ('total %s correct %s error %s precision %s six %s avg_time %s' % (count, correct, error, float(correct)/count, float(sixCorrect)/count, (time() - start)/count)) 317 | with open('fh0Eval', 'a') as outF: 318 | outF.write('total %s correct %s error %s precision %s avg_time %s' % (count, correct, error, float(correct) / count, (time() - start)/count)) 319 | -------------------------------------------------------------------------------- /rpnet/wR2.py: -------------------------------------------------------------------------------- 1 | # Code in cnn_fn_pytorch.py 2 | from __future__ import print_function, division 3 | import cv2 4 | import torch 5 | import torch.nn as nn 6 | import 
# Code in cnn_fn_pytorch.py
from __future__ import print_function, division
import cv2
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import numpy as np
import os
import argparse
from time import time
from load_data import *
from torch.optim import lr_scheduler


ap = argparse.ArgumentParser()
ap.add_argument("-i", "--images", required=True,
                help="comma-separated paths to the training image folders")
ap.add_argument("-n", "--epochs", default=25,
                help="number of training epochs")
ap.add_argument("-b", "--batchsize", default=4,
                help="batch size for training")
ap.add_argument("-r", "--resume", default='111',
                help="checkpoint file to resume training from ('111' trains from scratch)")
ap.add_argument("-w", "--writeFile", default='wR2.out',
                help="file for logging training progress")
args = vars(ap.parse_args())

use_gpu = torch.cuda.is_available()
print(use_gpu)

numClasses = 4
imgSize = (480, 480)
batchSize = int(args["batchsize"]) if use_gpu else 8
modelFolder = 'wR2/'
storeName = modelFolder + 'wR2.pth'
if not os.path.isdir(modelFolder):
    os.mkdir(modelFolder)

epochs = int(args["epochs"])
# initialize the output file
with open(args['writeFile'], 'wb') as outF:
    pass


def get_n_params(model):
    # count the total number of parameter elements in the model
    # (local names renamed so they no longer shadow the torch.nn import)
    total = 0
    for p in list(model.parameters()):
        n = 1
        for s in list(p.size()):
            n = n * s
        total += n
    return total


class wR2(nn.Module):
    def __init__(self, num_classes=1000):
        super(wR2, self).__init__()
        hidden1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=48, kernel_size=5, padding=2, stride=2),
            nn.BatchNorm2d(num_features=48),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(0.2)
        )
        hidden2 = nn.Sequential(
            nn.Conv2d(in_channels=48, out_channels=64, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=1),
            nn.Dropout(0.2)
        )
        hidden3 = nn.Sequential(
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=128),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(0.2)
        )
        hidden4 = nn.Sequential(
            nn.Conv2d(in_channels=128, out_channels=160, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=160),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=1),
            nn.Dropout(0.2)
        )
        hidden5 = nn.Sequential(
            nn.Conv2d(in_channels=160, out_channels=192, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(0.2)
        )
        hidden6 = nn.Sequential(
            nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=1),
            nn.Dropout(0.2)
        )
        hidden7 = nn.Sequential(
            nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(0.2)
        )
        hidden8 = nn.Sequential(
            nn.Conv2d(in_channels=192, out_channels=192, kernel_size=5, padding=2),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=1),
            nn.Dropout(0.2)
        )
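        # Added note on the pattern above: odd-numbered blocks downsample
        # (pool stride 2) while even-numbered blocks roughly preserve
        # resolution (pool stride 1, padding 1). The two blocks below switch
        # from 5x5 to 3x3 convolutions.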
        hidden9 = nn.Sequential(
            nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            nn.Dropout(0.2)
        )
        hidden10 = nn.Sequential(
            nn.Conv2d(in_channels=192, out_channels=192, kernel_size=3, padding=1),
            nn.BatchNorm2d(num_features=192),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=1),
            nn.Dropout(0.2)
        )
        self.features = nn.Sequential(
            hidden1,
            hidden2,
            hidden3,
            hidden4,
            hidden5,
            hidden6,
            hidden7,
            hidden8,
            hidden9,
            hidden10
        )
        self.classifier = nn.Sequential(
            nn.Linear(23232, 100),  # 23232 = 192 channels x 11 x 11 feature map for 480x480 inputs
            # nn.ReLU(inplace=True),
            nn.Linear(100, 100),
            # nn.ReLU(inplace=True),
            nn.Linear(100, num_classes),
        )

    def forward(self, x):
        x1 = self.features(x)
        x11 = x1.view(x1.size(0), -1)
        x = self.classifier(x11)
        return x


epoch_start = 0
resume_file = str(args["resume"])
if not resume_file == '111':
    # epoch_start = int(resume_file[resume_file.find('pth') + 3:]) + 1
    if not os.path.isfile(resume_file):
        print("failed to load the existing model! Exiting ...")
        exit(1)
    print("Loading existing model %s" % resume_file)
    model_conv = wR2(numClasses)
    model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count()))
    model_conv.load_state_dict(torch.load(resume_file))
    model_conv = model_conv.cuda()
else:
    model_conv = wR2(numClasses)
    if use_gpu:
        model_conv = torch.nn.DataParallel(model_conv, device_ids=range(torch.cuda.device_count()))
        model_conv = model_conv.cuda()

print(model_conv)
print(get_n_params(model_conv))

criterion = nn.MSELoss()  # defined for reference; the training loop below uses weighted L1 losses
optimizer_conv = optim.SGD(model_conv.parameters(), lr=0.001, momentum=0.9)
lrScheduler = lr_scheduler.StepLR(optimizer_conv, step_size=5, gamma=0.1)

# optimizer_conv = optim.Adam(model_conv.parameters(), lr=0.01)

# dst = LocDataLoader([args["images"]], imgSize)
dst = ChaLocDataLoader(args["images"].split(','), imgSize)
trainloader = DataLoader(dst, batch_size=batchSize, shuffle=True, num_workers=4)


def train_model(model, criterion, optimizer, num_epochs=25):
    for epoch in range(epoch_start, num_epochs):
        lossAver = []
        model.train(True)
        lrScheduler.step()
        start = time()

        for i, (XI, YI) in enumerate(trainloader):
            YI = np.array([el.numpy() for el in YI]).T
            if use_gpu:
                x = Variable(XI.cuda(0))
                y = Variable(torch.FloatTensor(YI).cuda(0), requires_grad=False)
            else:
                x = Variable(XI)
                y = Variable(torch.FloatTensor(YI), requires_grad=False)
            # Forward pass: Compute predicted y by passing x to the model
            y_pred = model(x)

            # Compute the loss; skip incomplete final batches so that loss
            # is always a tensor when backward() is called below
            if len(y_pred) != batchSize:
                continue
            # index columns, not rows: [:, :2] is the box position, [:, 2:] its size
            loss = 0.8 * nn.L1Loss().cuda()(y_pred[:, :2], y[:, :2])
            loss += 0.2 * nn.L1Loss().cuda()(y_pred[:, 2:], y[:, 2:])
            lossAver.append(loss.item())
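            # (Added note: the 0.8/0.2 split weights the first two regression
            # targets, the box position, more heavily than the last two, its
            # width and height.)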
            # Zero gradients, perform a backward pass, and update the weights.
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            if i % 50 == 1:
                # checkpoint and log progress periodically instead of every batch
                torch.save(model.state_dict(), storeName)
                with open(args['writeFile'], 'a') as outF:
                    outF.write('train %s images, use %s seconds, loss %s\n' % (i * batchSize, time() - start, sum(lossAver[-50:]) / len(lossAver[-50:])))
        print('%s %s %s\n' % (epoch, sum(lossAver) / len(lossAver), time() - start))
        with open(args['writeFile'], 'a') as outF:
            outF.write('Epoch: %s %s %s\n' % (epoch, sum(lossAver) / len(lossAver), time() - start))
        torch.save(model.state_dict(), storeName + str(epoch))
    return model


model_conv = train_model(model_conv, criterion, optimizer_conv, num_epochs=epochs)
--------------------------------------------------------------------------------