├── .gitattributes
├── .gitignore
├── LICENSE
├── README.md
├── Test_demo
│   ├── .gitignore
│   ├── README.md
│   ├── args.py
│   ├── camera_test.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme.txt
│   │   ├── logs
│   │   │   └── readme
│   │   ├── my_data
│   │   │   ├── Annotations
│   │   │   │   └── readme
│   │   │   ├── ImageSets
│   │   │   │   └── Main
│   │   │   │       └── readme
│   │   │   ├── JPEGImages
│   │   │   │   └── readme
│   │   │   ├── label
│   │   │   │   └── readme
│   │   │   └── readme
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── Train
│   ├── .gitignore
│   ├── README.md
│   ├── README_YOLO_V3.md
│   ├── args.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme
│   │   ├── my_data
│   │   │   ├── Annotations
│   │   │   │   └── readme
│   │   │   ├── ImageSets
│   │   │   │   └── Main
│   │   │   │       ├── test.txt
│   │   │   │       ├── train.txt
│   │   │   │       └── val.txt
│   │   │   ├── JPEGImages
│   │   │   │   └── readme
│   │   │   └── label
│   │   │       ├── test.txt
│   │   │       ├── train.txt
│   │   │       └── val.txt
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── YOLO-V3-Tensorflow-demo
│   ├── .gitignore
│   ├── README.md
│   ├── args.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme.txt
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── detection_result.jpg
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── requirements.txt
│   ├── test.jpg
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── config.py
├── config_window.py
├── data
│   ├── coco.names
│   └── yolo_anchors.txt
├── detection.py
├── eyre.ico
├── eyre.py
├── model.py
├── notification.py
├── requirements.txt
├── sort.py
├── test_notification.py
├── tools
│   ├── freeze_model.py
│   └── generate_detections.py
└── utils
    ├── __init__.py
    ├── data_aug.py
    ├── data_utils.py
    ├── eval_utils.py
    ├── layer_utils.py
    ├── misc_utils.py
    ├── nms_utils.py
    └── plot_utils.py

/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
__pycache__/
.idea/
config.ini
checkpoint/
.vscode/

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# COMP5703-CS70: PPE Detection

To install the required dependencies:

`pip install -r requirements.txt`

Download the three [[checkpoint files](https://drive.google.com/drive/folders/1mOjkvQQBEcLV4ju2N9JYhDiROGwC1MHb?usp=sharing)] below and place them under the `checkpoint` folder.

`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.data-00000-of-00001`
`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.index`
`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.meta`

### Use an Input Video for Detection

The application works with any input video. Demo videos are provided [[here](https://drive.google.com/drive/folders/13c1mlFCWxu7in9Aggz5OWsq_K9duW9M2?usp=sharing)].

In `eyre.py` change
```
def init_camera(self):
    self.camera = cv2.VideoCapture('path_to_video/demo_video.mp4')
```

### Use a Camera Stream
In `eyre.py` change
```
def init_camera(self):
    self.camera = cv2.VideoCapture(0)
```

## Finally, to run the app:

`python eyre.py`



# Data Introduction

After confirming the requirements with the client, we needed to collect images of people wearing helmets, vests, or lab coats. The project required a large amount of data, which came from two main sources: images scraped from websites and open-source datasets. For this project, we found and used two open-source datasets, GDUT-HWD and Pictorv3. The open-source data contains 3995 images with 7865 positive instances and 7672 negative instances. The scraped images are mainly of people wearing lab coats and comprise 1073 images with 3018 positive instances and 287 negative instances. Besides, the dataset extracted from filmed videos contains 1,437 images with 2,734 positive instances and 204 negative instances.

The training dataset can be downloaded here:

`https://drive.google.com/file/d/17T_aHFxhq3BNDn2iTJASKT6HYdUoph3A/view?usp=sharing`

## 1. Data Scraping

We used Selenium together with the matching WebDriver to download our lab coat, helmet, and vest images. Selenium can automate the Chrome browser: clicking buttons, scrolling pages, waiting for content to load, and extracting image URLs. Choosing the keywords for scraping is challenging; queries such as 'construction worker' or 'people wearing safety vest' return far better results than 'Safety Helmet' or 'Safety Vest' alone.
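A minimal sketch of that scraping flow is shown below. The query URL, the fixed waits, and the output folder are illustrative assumptions rather than our exact script, and page selectors change over time:

```python
# Hypothetical Google Images scrape with Selenium (chromedriver must be on PATH).
import os
import time
import urllib.request

from selenium import webdriver
from selenium.webdriver.common.by import By

query = "construction worker"  # broader queries beat 'Safety Helmet' alone
url = "https://www.google.com/search?tbm=isch&q=" + query.replace(" ", "+")

os.makedirs("raw_images", exist_ok=True)
driver = webdriver.Chrome()
driver.get(url)

for _ in range(5):  # scroll to trigger lazy loading, then wait for new images
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)

for i, img in enumerate(driver.find_elements(By.TAG_NAME, "img")):
    src = img.get_attribute("src")
    if src and src.startswith("http"):  # skip inline base64 thumbnails
        urllib.request.urlretrieve(src, "raw_images/{:05d}.jpg".format(i))
driver.quit()
```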
You can choose keywords such as "construction worker".

Besides scraping data from websites, another direct data collection method is to record videos of the actual usage scenes.

## 2. Data Labeling

We used the open-source labelling tool [[LabelImg](https://github.com/tzutalin/labelImg)]. The labelling process simply takes time and effort. Referring to figure 1, our project labels the data directly into five classes (P, PH, PV, PHV, PLC) and then applies the YOLOv3 model to classify detections into these classes.

(Figure 1: examples of the five labelling classes; the original notebook image attachment is not available.)

In order to reduce noise from different backgrounds, we strictly labelled each person from head to knees and from shoulder to shoulder (figure 2). In addition, when a class is labelled incorrectly or a typo is made, we use the ElementTree class in Python to process the annotation XML files and correct the errors, as sketched below.

(Figure 2: the labelling extent, head to knees and shoulder to shoulder; the original notebook image attachment is not available.)
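A minimal sketch of that ElementTree correction pass, assuming a hypothetical `fixes` mapping from wrong to correct label names (the actual typos we fixed varied):

```python
# Rewrite mislabelled <object><name> entries in every VOC annotation file.
import os
import xml.etree.ElementTree as ET

fixes = {"PHV ": "PHV", "pv": "PV"}  # wrong label -> correct label (examples only)
ann_dir = "./data/my_data/Annotations"

for fname in os.listdir(ann_dir):
    if not fname.endswith(".xml"):
        continue
    path = os.path.join(ann_dir, fname)
    tree = ET.parse(path)
    changed = False
    for obj in tree.getroot().findall("object"):
        name = obj.find("name")
        if name.text in fixes:
            name.text = fixes[name.text]
            changed = True
    if changed:
        tree.write(path)  # overwrite the annotation in place
```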
# PPE Detection Model Training Tutorial


### 1. Requirements

Python version: 3.7

Packages:

- tensorflow < 2 (theoretically any version that supports tf.data is ok)
- opencv-python
- tqdm

### 2. Weights download

Download the converted TensorFlow checkpoint file via the [[Google Drive link](https://drive.google.com/drive/folders/1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing)] or the [[Github Release](https://github.com/wizyoung/YOLOv3_TensorFlow/releases/)] and place it in the `./data/darknet_weights/` directory.


### 3. Training

#### 3.1 Data preparation

Put the VOC-format dataset in the `./data/my_data/` directory.

(1) annotation file

Run `python data_pro.py` to generate the `train.txt/val.txt/test.txt` files under the `./data/my_data/` directory. One line per image, in the format `image_index image_absolute_path img_width img_height box_1 box_2 ... box_n`. Box_x format: `label_index x_min y_min x_max y_max`. (The origin of the coordinates is at the top left corner: top left => (xmin, ymin), bottom right => (xmax, ymax).) `image_index` is the line index, starting from zero. `label_index` is in the range [0, class_num - 1].

For example:

```
0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268
1 xxx/xxx/b.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320
...
```

(2) class_names file:

`coco.names` file under the `./data/my_data/` directory. Each line represents a class name.

```
P
PH
PV
PHV
PLC
...
```

(3) prior anchor file:

Use the k-means algorithm to get the prior anchors:

```
python get_kmeans.py
```

You will then get 9 anchors and the average IoU. Save the anchors to `./data/yolo_anchors.txt`.

The yolo anchors computed by the k-means script are on the resized image scale. The default resize method is the letterbox resize, i.e., it keeps the original aspect ratio in the resized image.

#### 3.2 Training

Use `train.py`. The hyper-parameters and the corresponding annotations can be found in `args.py`:

```shell
CUDA_VISIBLE_DEVICES=GPU_ID python train.py
```

Check `args.py` for more details. You should set the parameters yourself for your own specific task.

Our training environment was:

- Ubuntu 16.04
- NVIDIA Tesla P100

### 4. Testing

You can test by running these commands:

Single image test:

```shell
python test_single_image.py ./data/demo_data/test.jpg
```

Video test:

```shell
python video_test.py ./data/demo_data/test.mp4
```

# Installing dependencies on Jetson Nano

Run the following scripts after setting up the Jetson Nano; they install the required dependencies.

### 1. Uninstall unused applications to save space (Optional)
```
sudo apt remove libreoffice* thunderbird shotwell rhythmbox cheese
sudo apt autoremove
```
### 2. Update the system
```
sudo apt-get update
```
Install dependencies
```
sudo apt-get install -y \
    python3-pip \
    build-essential \
    git \
    python3 \
    python3-dev \
    ffmpeg \
    libsdl2-dev \
    libsdl2-image-dev \
    libsdl2-mixer-dev \
    libsdl2-ttf-dev \
    libportmidi-dev \
    libswscale-dev \
    libavformat-dev \
    libavcodec-dev \
    zlib1g-dev
```
### 3. Install the correct Cython version
```
python3 -m pip install Cython==0.29.10
sudo apt-get update
```
### 4. Prerequisites for TensorFlow
```
sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
```
### 5. TensorFlow
```
python3 -m pip install --upgrade --user pip setuptools virtualenv

sudo pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'

sudo pip3 install numba
sudo pip3 install scikit-learn
sudo apt-get install python3-matplotlib
sudo pip3 install filterpy
```
--------------------------------------------------------------------------------
/Test_demo/.gitignore:
--------------------------------------------------------------------------------
/checkpoint/checkpoint
/checkpoint/*.data-00000-of-00001
/checkpoint/*.index
/checkpoint/*.meta
/data/darknet_weights/*.data-00000-of-00001
/data/darknet_weights/*.meta
/data/darknet_weights/*.weights

/data/logs/*.ubuntu
*.xml
/data/myData/ImageSets/Main/*.txt
/data/myData/JPEGImages/*.jpg
/data/myData/label/*.txt

/data/test_image/*.jpg
/data/test_video/*
/data/*.log
/test_res/*.jpg
/执行步骤.txt

--------------------------------------------------------------------------------
/Test_demo/README.md:
--------------------------------------------------------------------------------
## Test demo
### Download the checkpoint from Google Drive
### Save the checkpoint files into the checkpoint directory
### Modify 'restore_path' in video_test.py, test_single_image.py and camera_test.py (it is defined around line 31; see the example below)
## Run
### python video_test.py own.mp4
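For example, with the checkpoint files from the main README placed under `./checkpoint/`, the edited default would look like the sketch below (the checkpoint prefix is an assumption based on the main README's file names; TensorFlow restores from the prefix, without the `.data`/`.index`/`.meta` suffix):

```python
parser.add_argument("--restore_path", type=str,
                    default="./checkpoint/best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05",
                    help="The path of the weights to restore.")
```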
--------------------------------------------------------------------------------
/Test_demo/args.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This file contains the parameters used in train.py
3 | 
4 | from __future__ import division, print_function
5 | 
6 | from utils.misc_utils import parse_anchors, read_class_names
7 | import math
8 | 
9 | ### Some paths
10 | train_file = './data/my_data/label/train.txt'  # The path of the training txt file.
11 | val_file = './data/my_data/label/val.txt'  # The path of the validation txt file.
12 | restore_path = './data/darknet_weights/yolov3.ckpt'  # The path of the weights to restore.
13 | save_dir = './checkpoint/'  # The directory of the weights to save.
14 | log_dir = './data/logs/'  # The directory to store the tensorboard log files.
15 | progress_log_path = './data/progress.log'  # The path to record the training progress.
16 | anchor_path = './data/yolo_anchors.txt'  # The path of the anchor txt file.
17 | class_name_path = './data/coco.names'  # The path of the class names.
18 | 
19 | ### Training related numbers
20 | batch_size = 12  #6 1.12 2.24
21 | img_size = [416, 416]  # Images will be resized to `img_size` and fed to the network, size format: [width, height]
22 | letterbox_resize = True  # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
23 | total_epoches = 200  #500 1.100 2. 200
24 | train_evaluation_step = 50  #100 # Evaluate on the training batch after some steps.
25 | val_evaluation_epoch = 50  #50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
26 | save_epoch = 10  # Save the model after some epochs.
27 | batch_norm_decay = 0.99  # decay in bn ops
28 | weight_decay = 5e-4  # l2 weight decay
29 | global_step = 0  # used when resuming training
30 | 
31 | ### tf.data parameters
32 | num_threads = 10  # Number of threads for image processing used in tf.data pipeline.
33 | prefetech_buffer = 5  # Prefetch buffer used in tf.data pipeline.
34 | 
35 | ### Learning rate and optimizer
36 | optimizer_name = 'momentum'  # Chosen from [sgd, momentum, adam, rmsprop]
37 | save_optimizer = True  # Whether to save the optimizer parameters into the checkpoint file.
38 | learning_rate_init = 1e-4
39 | lr_type = 'piecewise'  # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
40 | lr_decay_epoch = 5  # Epochs after which the learning rate decays. Int or float. Used with the `exponential` and `cosine_decay_restart` lr_types.
41 | lr_decay_factor = 0.96  # The learning rate decay factor. Used with the `exponential` lr_type.
42 | lr_lower_bound = 1e-7  # The minimum learning rate.
43 | # only used in the piecewise lr type
44 | pw_boundaries = [30, 50]  # epoch-based boundaries
45 | pw_values = [learning_rate_init, 3e-5, 1e-5]
46 | 
47 | ### Load and finetune
48 | # Choose the parts whose weights you want to restore. List form.
49 | # restore_include: None, restore_exclude: None => restore the whole model
50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope`
51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 without scope2 (scope1 - scope2)
52 | # choice 1: only restore the darknet body
53 | # restore_include = ['yolov3/darknet53_body']
54 | # restore_exclude = None
55 | # choice 2: restore all layers except the last 3 conv2d layers of the 3 scales
56 | restore_include = None
57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
58 | # Choose the parts you want to finetune. List form.
59 | # Set to None to train the whole model.
60 | update_part = ['yolov3/yolov3_head']
61 | 
62 | ### other training strategies
63 | multi_scale_train = True  # Whether to apply the multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
64 | use_label_smooth = True  # Whether to use the class label smoothing strategy.
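# Note on use_label_smooth (above) and use_focal_loss (below), summarizing the
# standard definitions for reference: label smoothing softens the one-hot class
# target to
#     y_smooth = y_onehot * (1 - eps) + eps / class_num,
# while focal loss rescales the confidence cross-entropy by (1 - p_t) ** gamma
# so that training focuses on hard examples.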
65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 10 # 3 Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.1 #0.45 iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /Test_demo/camera_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | # parser.add_argument("input_video", type=str, 22 | # help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default='./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05', 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | # vid = cv2.VideoCapture(args.input_video) 44 | vid = cv2.VideoCapture(1) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | 
video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 15 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('camera_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45) 65 | 66 | saver = tf.train.Saver() 67 | saver.restore(sess, args.restore_path) 68 | 69 | # for i in range(video_frame_cnt): 70 | while True: 71 | ret, img_ori = vid.read() 72 | if args.letterbox_resize: 73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1]) 74 | else: 75 | height_ori, width_ori = img_ori.shape[:2] 76 | img = cv2.resize(img_ori, tuple(args.new_size)) 77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 78 | img = np.asarray(img, np.float32) 79 | img = img[np.newaxis, :] / 255. 80 | 81 | start_time = time.time() 82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 83 | end_time = time.time() 84 | 85 | # rescale the coordinates to the original image 86 | if args.letterbox_resize: 87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio 88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio 89 | else: 90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0])) 91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1])) 92 | 93 | 94 | for i in range(len(boxes_)): 95 | x0, y0, x1, y1 = boxes_[i] 96 | image_score = scores_[i] * 100 97 | if image_score >= 65: 98 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]]) 99 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0, 100 | fontScale=1, color=(0, 255, 0), thickness=2) 101 | cv2.imshow('image', img_ori) 102 | k = cv2.waitKey(1) 103 | if args.save_video: 104 | videoWriter.write(img_ori) 105 | if k & 0xFF == ord('q'): 106 | break 107 | 108 | vid.release() 109 | if args.save_video: 110 | videoWriter.release() 111 | -------------------------------------------------------------------------------- /Test_demo/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | 
feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /Test_demo/data/coco.names: -------------------------------------------------------------------------------- 1 | P 2 | PH 3 | PV 4 | PHV 5 | 6 | -------------------------------------------------------------------------------- /Test_demo/data/darknet_weights/readme.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/darknet_weights/readme.txt -------------------------------------------------------------------------------- /Test_demo/data/logs/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/logs/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/Annotations/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/Annotations/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/ImageSets/Main/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/ImageSets/Main/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/JPEGImages/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/JPEGImages/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/label/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/label/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/readme: -------------------------------------------------------------------------------- 1 | place your data files here. 
--------------------------------------------------------------------------------
/Test_demo/data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 26,55, 39,92, 54,132, 71,194, 100,160, 110,259, 174,347, 290,560, 579,957
--------------------------------------------------------------------------------
/Test_demo/data_pro.py:
--------------------------------------------------------------------------------
1 | 
2 | import os
3 | import pandas
4 | import shutil
5 | import random
6 | 
7 | 
8 | import cv2
9 | import numpy as np
10 | import xml.etree.ElementTree as ET
11 | 
12 | 
13 | # This part needs to be modified for your own dataset
14 | 
15 | 
16 | class Data_preprocess(object):
17 |     '''
18 |     Parse the XML annotation data
19 |     '''
20 |     def __init__(self,data_path):
21 |         self.data_path = data_path
22 |         self.image_size = 416
23 |         self.batch_size = 32
24 |         self.cell_size = 13
25 |         # TO DO
26 |         self.classes = ["P","PH","PV","PHV"]
27 |         self.num_classes = len(self.classes)
28 |         self.box_per_cell = 5
29 |         self.class_to_ind = dict(zip(self.classes, range(self.num_classes)))
30 | 
31 |         self.count = 0
32 |         self.epoch = 1
33 |         self.count_t = 0
34 | 
35 |     def load_labels(self, model):
36 |         if model == 'train':
37 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt')
38 |         if model == 'test':
39 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt')
40 | 
41 |         if model == "val":
42 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt')
43 | 
44 | 
45 |         with open(txtname, 'r') as f:
46 |             image_ind = [x.strip() for x in f.readlines()]  # file names with the .jpg extension stripped
47 | 
48 | 
49 |         my_index = 0
50 |         for ind in image_ind:
51 |             class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind)
52 | 
53 |             if len(class_inds) == 0:
54 |                 pass
55 |             else:
56 |                 annotation_label = ""
57 |                 #box_x: label_index, x_min,y_min,x_max,y_max
58 |                 for label_i in range(len(class_inds)):
59 | 
60 |                     annotation_label += " " + str(class_inds[label_i])
61 |                     annotation_label += " " + str(x1s[label_i])
62 |                     annotation_label += " " + str(y1s[label_i])
63 |                     annotation_label += " " + str(x2s[label_i])
64 |                     annotation_label += " " + str(y2s[label_i])
65 | 
66 |                 with open("./data/my_data/label/"+model+".txt","a") as f:
67 |                     f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n")
68 | 
69 |                 my_index += 1
70 | 
71 |         print(my_index)
72 | 
73 | 
74 | 
75 |     def load_data(self, index):
76 |         label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes])
77 |         filename = os.path.join(self.data_path, 'Annotations', index + '.xml')
78 |         tree = ET.parse(filename)
79 |         image_size = tree.find('size')
80 |         image_width = int(float(image_size.find('width').text))
81 |         image_height = int(float(image_size.find('height').text))
82 |         # h_ratio = 1.0 * self.image_size / image_height
83 |         # w_ratio = 1.0 * self.image_size / image_width
84 | 
85 |         objects = tree.findall('object')
86 | 
87 |         class_inds = []
88 |         x1s = []
89 |         y1s = []
90 |         x2s = []
91 |         y2s = []
92 | 
93 |         for obj in objects:
94 |             box = obj.find('bndbox')
95 |             x1 = int(float(box.find('xmin').text))
96 |             y1 = int(float(box.find('ymin').text))
97 |             x2 = int(float(box.find('xmax').text))
98 |             y2 = int(float(box.find('ymax').text))
99 |             # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0)
100 |             # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0)
101 |             # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0)
102 |             # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0)
103 |             if obj.find('name').text in self.classes:
104 |                 class_ind = self.class_to_ind[obj.find('name').text]
105 |                 # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()]
106 | 
107 |                 # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)]
108 |                 # cx = 1.0 * boxes[0] * self.cell_size
109 |                 # cy = 1.0 * boxes[1] * self.cell_size
110 |                 # xind = int(np.floor(cx))
111 |                 # yind = int(np.floor(cy))
112 | 
113 |                 # label[yind, xind, :, 0] = 1
114 |                 # label[yind, xind, :, 1:5] = boxes
115 |                 # label[yind, xind, :, 5 + class_ind] = 1
116 | 
117 |                 if x1 >= x2 or y1 >= y2:
118 |                     pass
119 |                 else:
120 |                     class_inds.append(class_ind)
121 |                     x1s.append(x1)
122 |                     y1s.append(y1)
123 |                     x2s.append(x2)
124 |                     y2s.append(y2)
125 | 
126 |         return class_inds, x1s, y1s, x2s, y2s, image_width, image_height
127 | 
128 | 
129 | def data_split(img_path):
130 |     '''
131 |     Split the data into train / val / test
132 |     '''
133 | 
134 |     files = os.listdir(img_path)
135 |     # To do
136 |     test_part = random.sample(files,int(399*0.2))  # NOTE: 399 is the hard-coded image count; update it for your dataset
137 | 
138 |     val_part = random.sample(test_part,int(int(399*0.2)*0.5))
139 | 
140 |     val_index = 0
141 |     test_index = 0
142 |     train_index = 0
143 |     for file in files:
144 |         if file in val_part:
145 | 
146 |             with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f:
147 |                 val_f.write(file[:-4] + "\n" )
148 | 
149 |             val_index += 1
150 | 
151 |         elif file in test_part:
152 |             with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f:
153 |                 test_f.write(file[:-4] + "\n")
154 | 
155 |             test_index += 1
156 | 
157 |         else:
158 |             with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f:
159 |                 train_f.write(file[:-4] + "\n")
160 | 
161 |             train_index += 1
162 | 
163 | 
164 |     print(train_index,test_index,val_index)
165 | 
166 | 
167 | # TO DO
168 | if __name__ == "__main__":
169 | 
170 |     # split into train, val and test
171 |     img_path = "./data/my_data/JPEGImages"
172 |     data_split(img_path)
173 |     print("===========split data finish============")
174 | 
175 |     # build the training annotation files needed by YOLO v3
176 |     base_path = os.getcwd()
177 |     data_path = os.path.join(base_path,"data/my_data")  # absolute path
178 | 
179 |     data_p = Data_preprocess(data_path)
180 |     data_p.load_labels("train")
181 |     data_p.load_labels("test")
182 |     data_p.load_labels("val")
183 |     print("==========data pro finish===========")
184 | 
185 | 
186 | 
187 | 
188 | 
189 | 
190 | 
191 | 
--------------------------------------------------------------------------------
/Test_demo/eval.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | from tqdm import trange
9 | 
10 | from utils.data_utils import get_batch_data
11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter
12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
13 | from utils.nms_utils import gpu_nms
14 | 
15 | from model import yolov3
16 | 
17 | #################
18 | # ArgumentParser
19 | #################
20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.")
21 | # some paths
22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt",
23 |                     help="The path of the validation or test txt file.")
24 | 
25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt",
26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = yolo_model.compute_loss(pred_feature_maps, y_true) 100 | y_pred = yolo_model.predict(pred_feature_maps) 101 | 102 | saver_to_restore = tf.train.Saver() 103 | 104 | with 
tf.Session() as sess: 105 | sess.run([tf.global_variables_initializer()]) 106 | saver_to_restore.restore(sess, args.restore_path) 107 | 108 | print('\n----------- start to eval -----------\n') 109 | 110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \ 111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter() 112 | val_preds = [] 113 | 114 | for j in trange(args.img_cnt): 115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False}) 116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred) 117 | 118 | val_preds.extend(pred_content) 119 | val_loss_total.update(__loss[0]) 120 | val_loss_xy.update(__loss[1]) 121 | val_loss_wh.update(__loss[2]) 122 | val_loss_conf.update(__loss[3]) 123 | val_loss_class.update(__loss[4]) 124 | 125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter() 126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize) 127 | print('mAP eval:') 128 | for ii in range(args.class_num): 129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric) 130 | rec_total.update(rec, npos) 131 | prec_total.update(prec, nd) 132 | ap_total.update(ap, 1) 133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap)) 134 | 135 | mAP = ap_total.average 136 | print('final mAP: {:.4f}'.format(mAP)) 137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average)) 138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format( 139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average 140 | )) 141 | -------------------------------------------------------------------------------- /Test_demo/get_kmeans.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes 3 | 4 | from __future__ import division, print_function 5 | 6 | import numpy as np 7 | 8 | def iou(box, clusters): 9 | """ 10 | Calculates the Intersection over Union (IoU) between a box and k clusters. 11 | param: 12 | box: tuple or array, shifted to the origin (i. e. width and height) 13 | clusters: numpy array of shape (k, 2) where k is the number of clusters 14 | return: 15 | numpy array of shape (k, 0) where k is the number of clusters 16 | """ 17 | x = np.minimum(clusters[:, 0], box[0]) 18 | y = np.minimum(clusters[:, 1], box[1]) 19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0: 20 | raise ValueError("Box has no area") 21 | 22 | intersection = x * y 23 | box_area = box[0] * box[1] 24 | cluster_area = clusters[:, 0] * clusters[:, 1] 25 | 26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10) 27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10) 28 | 29 | return iou_ 30 | 31 | 32 | def avg_iou(boxes, clusters): 33 | """ 34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters. 
35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | print(line) 102 | img_w = int(float(s[2])) 103 | img_h = int(float(s[3])) 104 | s = s[4:] 105 | box_cnt = len(s) // 5 106 | for i in range(box_cnt): 107 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 108 | width = x_max - x_min 109 | height = y_max - y_min 110 | assert width > 0 111 | assert height > 0 112 | # use letterbox resize, i.e. 
keep the original aspect ratio
113 |             # get k-means anchors on the resized target image size
114 |             if target_size is not None:
115 |                 resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h)
116 |                 width *= resize_ratio
117 |                 height *= resize_ratio
118 |                 result.append([width, height])
119 |             # get k-means anchors on the original image size
120 |             else:
121 |                 result.append([width, height])
122 |     result = np.asarray(result)
123 |     return result
124 | 
125 | 
126 | def get_kmeans(anno, cluster_num=9):
127 | 
128 |     anchors = kmeans(anno, cluster_num)
129 |     ave_iou = avg_iou(anno, anchors)
130 | 
131 |     anchors = anchors.astype('int').tolist()
132 | 
133 |     anchors = sorted(anchors, key=lambda x: x[0] * x[1])
134 | 
135 |     return anchors, ave_iou
136 | 
137 | 
138 | if __name__ == '__main__':
139 |     # target resize format: [width, height]
140 |     # if target_resize is specified, the anchors are on the resized image scale
141 |     # if target_resize is set to None, the anchors are on the original image scale
142 |     # target_size = [416, 416]
143 |     target_size = None
144 |     annotation_path = "./data/my_data/label/train.txt"
145 |     anno_result = parse_anno(annotation_path, target_size=target_size)
146 |     anchors, ave_iou = get_kmeans(anno_result, 9)
147 | 
148 |     anchor_string = ''
149 |     for anchor in anchors:
150 |         anchor_string += '{},{}, '.format(anchor[0], anchor[1])
151 |     anchor_string = anchor_string[:-2]
152 | 
153 |     print('anchors are:')
154 |     print(anchor_string)
155 |     print('the average iou is:')
156 |     print(ave_iou)
157 | 
158 | 
--------------------------------------------------------------------------------
/Test_demo/test_single_image.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 | 
10 | from utils.misc_utils import parse_anchors, read_class_names
11 | from utils.nms_utils import gpu_nms
12 | from utils.plot_utils import get_color_table, plot_one_box
13 | 
14 | from model import yolov3
15 | 
16 | 
17 | 
18 | parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.")
19 | parser.add_argument("input_image", type=str,
20 |                     help="The path of the input image.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 |                     help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 |                     help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
26 |                     help="The path of the class names.")
27 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05",
28 |                     help="The path of the weights to restore.")
29 | args = parser.parse_args()
30 | 
31 | args.anchors = parse_anchors(args.anchor_path)
32 | args.classes = read_class_names(args.class_name_path)
33 | args.num_class = len(args.classes)
34 | 
35 | color_table = get_color_table(args.num_class)
36 | 
37 | img_ori = cv2.imread(args.input_image)
38 | height_ori, width_ori = img_ori.shape[:2]
39 | img = cv2.resize(img_ori, tuple(args.new_size))
40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
41 | img = np.asarray(img, np.float32)
42 | img = img[np.newaxis, :] / 255.
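# At this point `img` is an RGB float32 array scaled to [0, 1] with a batch
# axis added, i.e. shape [1, new_size[1], new_size[0], 3], matching the
# `input_data` placeholder below. Note this script uses a plain cv2.resize,
# not the letterbox resize, so the boxes are rescaled per-axis afterwards.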
43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | # saver = tf.train.import_meta_graph('./checkpoint/best_model_Epoch_5_step_42_mAP_0.0735_loss_43.3285_lr_0.0001.meta') 58 | # saver.restore(sess, tf.train.latest_checkpoint("./checkpoint/")) 59 | 60 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 61 | 62 | # rescale the coordinates to the original image 63 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 64 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 65 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 66 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 67 | 68 | print("box coords:") 69 | print(boxes_) 70 | print('*' * 30) 71 | print("scores:") 72 | print(scores_) 73 | print('*' * 30) 74 | print("labels:") 75 | print(labels_) 76 | 77 | for i in range(len(boxes_)): 78 | x0, y0, x1, y1 = boxes_[i] 79 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 80 | cv2.imshow('Detection result', img_ori) 81 | cv2.imwrite('detection_result.jpg', img_ori) 82 | cv2.waitKey(0) 83 | -------------------------------------------------------------------------------- /Test_demo/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/utils/__init__.py -------------------------------------------------------------------------------- /Test_demo/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 | return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in 
range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /Test_demo/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 
76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif args.lr_type == 'cosine_decay_restart': 139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step, 140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0, 141 | name='cosine_decay_learning_rate_restart') 142 | elif args.lr_type == 'fixed': 143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate') 144 | elif args.lr_type == 'piecewise': 145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values, 146 | name='piecewise_learning_rate') 147 | else: 148 | raise ValueError('Unsupported learning rate type!') 149 | 150 | 151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9): 152 | if optimizer_name == 'momentum': 153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum) 154 | elif optimizer_name == 'rmsprop': 155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum) 156 | elif optimizer_name == 'adam': 157 | return tf.train.AdamOptimizer(learning_rate) 158 | elif optimizer_name == 'sgd': 159 | return tf.train.GradientDescentOptimizer(learning_rate) 
160 | else: 161 | raise ValueError('Unsupported optimizer type!') -------------------------------------------------------------------------------- /Test_demo/utils/nms_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | 8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5): 9 | """ 10 | Perform NMS on GPU using TensorFlow. 11 | 12 | params: 13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image 14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob 15 | num_classes: total number of classes 16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50 17 | score_thresh: if [ highest class probability score < score_threshold] 18 | then get rid of the corresponding box 19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering 20 | """ 21 | 22 | boxes_list, label_list, score_list = [], [], [] 23 | max_boxes = tf.constant(max_boxes, dtype='int32') 24 | 25 | # since we do nms for single image, then reshape it 26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't konw the exact number of boxes 27 | score = tf.reshape(scores, [-1, num_classes]) 28 | 29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold". 30 | mask = tf.greater_equal(score, tf.constant(score_thresh)) 31 | # Step 2: Do non_max_suppression for each class 32 | for i in range(num_classes): 33 | # Step 3: Apply the mask to scores, boxes and pick them out 34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i]) 35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i]) 36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes, 37 | scores=filter_score, 38 | max_output_size=max_boxes, 39 | iou_threshold=nms_thresh, name='nms_indices') 40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i) 41 | boxes_list.append(tf.gather(filter_boxes, nms_indices)) 42 | score_list.append(tf.gather(filter_score, nms_indices)) 43 | 44 | boxes = tf.concat(boxes_list, axis=0) 45 | score = tf.concat(score_list, axis=0) 46 | label = tf.concat(label_list, axis=0) 47 | 48 | return boxes, score, label 49 | 50 | 51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5): 52 | """ 53 | Pure Python NMS baseline. 
54 | 55 | Arguments: boxes: shape of [-1, 4], the value of '-1' means that dont know the 56 | exact number of boxes 57 | scores: shape of [-1,] 58 | max_boxes: representing the maximum of boxes to be selected by non_max_suppression 59 | iou_thresh: representing iou_threshold for deciding to keep boxes 60 | """ 61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1 62 | 63 | x1 = boxes[:, 0] 64 | y1 = boxes[:, 1] 65 | x2 = boxes[:, 2] 66 | y2 = boxes[:, 3] 67 | 68 | areas = (x2 - x1) * (y2 - y1) 69 | order = scores.argsort()[::-1] 70 | 71 | keep = [] 72 | while order.size > 0: 73 | i = order[0] 74 | keep.append(i) 75 | xx1 = np.maximum(x1[i], x1[order[1:]]) 76 | yy1 = np.maximum(y1[i], y1[order[1:]]) 77 | xx2 = np.minimum(x2[i], x2[order[1:]]) 78 | yy2 = np.minimum(y2[i], y2[order[1:]]) 79 | 80 | w = np.maximum(0.0, xx2 - xx1 + 1) 81 | h = np.maximum(0.0, yy2 - yy1 + 1) 82 | inter = w * h 83 | ovr = inter / (areas[i] + areas[order[1:]] - inter) 84 | 85 | inds = np.where(ovr <= iou_thresh)[0] 86 | order = order[inds + 1] 87 | 88 | return keep[:max_boxes] 89 | 90 | 91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5): 92 | """ 93 | Perform NMS on CPU. 94 | Arguments: 95 | boxes: shape [1, 10647, 4] 96 | scores: shape [1, 10647, num_classes] 97 | """ 98 | 99 | boxes = boxes.reshape(-1, 4) 100 | scores = scores.reshape(-1, num_classes) 101 | # Picked bounding boxes 102 | picked_boxes, picked_score, picked_label = [], [], [] 103 | 104 | for i in range(num_classes): 105 | indices = np.where(scores[:,i] >= score_thresh) 106 | filter_boxes = boxes[indices] 107 | filter_scores = scores[:,i][indices] 108 | if len(filter_boxes) == 0: 109 | continue 110 | # do non_max_suppression on the cpu 111 | indices = py_nms(filter_boxes, filter_scores, 112 | max_boxes=max_boxes, iou_thresh=iou_thresh) 113 | picked_boxes.append(filter_boxes[indices]) 114 | picked_score.append(filter_scores[indices]) 115 | picked_label.append(np.ones(len(indices), dtype='int32')*i) 116 | if len(picked_boxes) == 0: 117 | return None, None, None 118 | 119 | boxes = np.concatenate(picked_boxes, axis=0) 120 | score = np.concatenate(picked_score, axis=0) 121 | label = np.concatenate(picked_label, axis=0) 122 | 123 | return boxes, score, label -------------------------------------------------------------------------------- /Test_demo/utils/plot_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import cv2 6 | import random 7 | 8 | 9 | def get_color_table(class_num, seed=2): 10 | random.seed(seed) 11 | color_table = {} 12 | for i in range(class_num): 13 | color_table[i] = [random.randint(0, 255) for _ in range(3)] 14 | return color_table 15 | 16 | 17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None): 18 | ''' 19 | coord: [x_min, y_min, x_max, y_max] format coordinates. 20 | img: img to plot on. 21 | label: str. The label name. 22 | color: int. color index. 23 | line_thickness: int. rectangle line thickness. 
24 | ''' 25 | tl = line_thickness or int(round(0.002 * max(img.shape[0:2]))) # line thickness 26 | color = color or [random.randint(0, 255) for _ in range(3)] 27 | c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3])) 28 | cv2.rectangle(img, c1, c2, color, thickness=tl) 29 | if label: 30 | tf = max(tl - 1, 1) # font thickness 31 | t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0] 32 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 33 | cv2.rectangle(img, c1, c2, color, -1) # filled 34 | cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA) 35 | 36 | -------------------------------------------------------------------------------- /Test_demo/video_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | parser.add_argument("input_video", type=str, 22 | help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05", 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | vid = cv2.VideoCapture(args.input_video) 44 | # vid = cv2.VideoCapture(0) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 10 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, 
scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45) 65 | 66 | saver = tf.train.Saver() 67 | saver.restore(sess, args.restore_path) 68 | 69 | for i in range(video_frame_cnt): 70 | # while True: 71 | ret, img_ori = vid.read() 72 | if args.letterbox_resize: 73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1]) 74 | else: 75 | height_ori, width_ori = img_ori.shape[:2] 76 | img = cv2.resize(img_ori, tuple(args.new_size)) 77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 78 | img = np.asarray(img, np.float32) 79 | img = img[np.newaxis, :] / 255. 80 | 81 | start_time = time.time() 82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 83 | end_time = time.time() 84 | 85 | # rescale the coordinates to the original image 86 | if args.letterbox_resize: 87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio 88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio 89 | else: 90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0])) 91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1])) 92 | 93 | for i in range(len(boxes_)): 94 | x0, y0, x1, y1 = boxes_[i] 95 | ######### 96 | 97 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]]) 98 | 99 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0, 100 | fontScale=1, color=(0, 255, 0), thickness=2) 101 | cv2.imshow('image', img_ori) 102 | k = cv2.waitKey(1) 103 | if args.save_video: 104 | videoWriter.write(img_ori) 105 | if k & 0xFF == ord('q'): 106 | break 107 | 108 | vid.release() 109 | if args.save_video: 110 | videoWriter.release() 111 | -------------------------------------------------------------------------------- /Train/.gitignore: -------------------------------------------------------------------------------- 1 | /checkpoint/checkpoint 2 | /checkpoint/*.data-00000-of-00001 3 | /checkpoint/*.index 4 | /checkpoint/*.meta 5 | /data/darknet_weights/*.data-00000-of-00001 6 | /data/darknet_weights/*.meta 7 | /data/darknet_weights/*.weights 8 | 9 | /data/logs/*.ubuntu 10 | *.xml 11 | /data/myData/ImageSets/Main/*.txt 12 | /data/myData/JPEGImages/*.jpg 13 | /data/myData/label/*.txt 14 | 15 | /data/test_image/*.jpg 16 | /data/test_video/* 17 | /data/*.log 18 | /test_res/*.jpg 19 | /执行步骤.txt 20 | 21 | -------------------------------------------------------------------------------- /Train/README.md: -------------------------------------------------------------------------------- 1 | ### 1.Download darknet_weights 2 | 3 | [GitHub Release](https://github.com/DataXujing/YOLO-V3-Tensorflow/releases/tag/1.0) 4 | put it into `./data/darknet_weights/` 5 | 6 | 7 | ### 2.Create data structure 8 | 9 | 10 | (1) annotation files 11 | 12 | put annotation files under ./data/my_data/Annotations directory 13 | put image file under ./data/my_data/JPEGImages directory 14 | 15 | run 16 | 17 | ```shell 18 | python data_pro.py 19 | ``` 20 | Generate train.txt/val.txt/test.txt files under ./data/my_data/label directory. 21 | one row represents one image as `image_index`,`image_absolute_path`, `img_width`, `img_height`,`box_1`,`box_2`,...,`box_n` 22 | 23 | 24 | Example: 25 | 26 | ``` 27 | 0 xxx/xxx/a.jpg 1920,1080,0 453 369 473 391 1 588 245 608 268 28 | 1 xxx/xxx/b.jpg 1920,1080,1 466 403 485 422 2 793 300 809 320 29 | ... 
30 | ``` 31 | 32 | 33 | (2) class_names file: 34 | 35 | Generate the class names file (`coco.names`) under the `./data/` directory. Each line represents a class name. 36 | 37 | ``` 38 | P 39 | PH 40 | ``` 41 | 42 | (3) prior anchor file: 43 | 44 | Use the kmeans algorithm to get the prior anchors: 45 | 46 | ``` 47 | python get_kmeans.py 48 | ``` 49 | Then you will get 9 anchors and the average IoU. Save the anchors to a txt file. 50 | 51 | The COCO dataset anchors offered by YOLO's author are placed at ./data/yolo_anchors.txt; you can use those too. 52 | 53 | The yolo anchors computed by the kmeans script are on the resized image scale. The default resize method is the letterbox resize, i.e., keep the original aspect ratio in the resized image. 54 | 55 | 56 | ### 4.Train 57 | 58 | Modify the parameters in `args.py` as listed below: 59 | 60 |
 61 | 62 | ```
 63 | ### Some paths
 64 | train_file = './data/my_data/label/train.txt'  # The path of the training txt file.
 65 | val_file = './data/my_data/label/val.txt'  # The path of the validation txt file.
 66 | restore_path = './data/darknet_weights/yolov3.ckpt'  # The path of the weights to restore.
 67 | save_dir = './checkpoint/'  # The directory of the weights to save.
 68 | log_dir = './data/logs/'  # The directory to store the tensorboard log files.
 69 | progress_log_path = './data/progress.log'  # The path to record the training progress.
 70 | anchor_path = './data/yolo_anchors.txt'  # The path of the anchor txt file.
 71 | class_name_path = './data/coco.names'  # The path of the class names.
 72 | ### Training related numbers
 73 | batch_size = 32  #6
 74 | img_size = [416, 416]  # Images will be resized to `img_size` and fed to the network, size format: [width, height]
 75 | letterbox_resize = True  # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
 76 | total_epoches = 500
 77 | train_evaluation_step = 100  # Evaluate on the current training batch every this many steps.
 78 | val_evaluation_epoch = 50  # Evaluate on the whole validation dataset every this many epochs. Set to None to evaluate every epoch.
 79 | save_epoch = 10  # Save a checkpoint every this many epochs.
 80 | batch_norm_decay = 0.99  # decay in bn ops
 81 | weight_decay = 5e-4  # l2 weight decay
 82 | global_step = 0  # used when resuming training
 83 | ### tf.data parameters
 84 | num_threads = 10  # Number of threads for image processing used in tf.data pipeline.
 85 | prefetech_buffer = 5  # Prefetch buffer size used in the tf.data pipeline.
 86 | ### Learning rate and optimizer
 87 | optimizer_name = 'momentum'  # Chosen from [sgd, momentum, adam, rmsprop]
 88 | save_optimizer = True  # Whether to save the optimizer parameters into the checkpoint file.
 89 | learning_rate_init = 1e-4
 90 | lr_type = 'piecewise'  # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
 91 | lr_decay_epoch = 5  # Epochs after which the learning rate decays. Int or float. Used with the `exponential` and `cosine_decay_restart` lr types.
 92 | lr_decay_factor = 0.96  # The learning rate decay factor. Used with the `exponential` lr type.
 93 | lr_lower_bound = 1e-6  # The minimum learning rate.
 94 | # only used in piecewise lr type
 95 | pw_boundaries = [30, 50]  # epoch based boundaries
 96 | pw_values = [learning_rate_init, 3e-5, 1e-5]
 97 | ### Load and finetune
 98 | # Choose the parts you want to restore the weights. List form.
 99 | # restore_include: None, restore_exclude: None  => restore the whole model
100 | # restore_include: None, restore_exclude: scope  => restore the whole model except `scope`
 101 | # restore_include: scope1, restore_exclude: scope2  => if scope1 contains scope2, restore scope1 except scope2 (i.e. scope1 - scope2)
 102 | # choice 1: only restore the darknet body
103 | # restore_include = ['yolov3/darknet53_body']
104 | # restore_exclude = None
 105 | # choice 2: restore all layers except the last conv2d layer of each of the 3 detection scales
106 | restore_include = None
107 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
108 | # Choose the parts you want to finetune. List form.
109 | # Set to None to train the whole model.
110 | update_part = ['yolov3/yolov3_head']
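# Note on the three settings above: everything except the three excluded head
# convs is restored from the COCO-pretrained checkpoint; those convs are
# re-initialized because their output channel count depends on class_num
# (5 PPE classes here vs. 80 for COCO), and only variables under
# 'yolov3/yolov3_head' receive gradient updates during finetuning.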
111 | ### other training strategies
112 | multi_scale_train = True  # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
 113 | use_label_smooth = True  # Whether to use the class label smoothing strategy.
114 | use_focal_loss = True  # Whether to apply focal loss on the conf loss.
115 | use_mix_up = True  # Whether to use mix up data augmentation strategy. 
 116 | use_warm_up = True  # Whether to use the warm-up strategy to prevent gradient explosion.
 117 | warm_up_epoch = 3  # Number of warm-up training epochs. Set to a larger value if the gradient explodes.
118 | ### some constants in validation
119 | # nms
 120 | nms_threshold = 0.45  # IoU threshold in the NMS step.
 121 | score_threshold = 0.01  # Class score threshold in the NMS step, i.e. score = pred_confs * pred_probs. Set lower for higher recall.
 122 | nms_topk = 150  # Keep at most nms_topk outputs after NMS.
123 | # mAP eval
 124 | eval_threshold = 0.5  # The IoU threshold applied in mAP evaluation.
 125 | use_voc_07_metric = False  # Whether to use the VOC 2007 evaluation metric, i.e. the 11-point metric.
126 | ### parse some params
127 | anchors = parse_anchors(anchor_path)
128 | classes = read_class_names(class_name_path)
129 | class_num = len(classes)
130 | train_img_cnt = len(open(train_file, 'r').readlines())
131 | val_img_cnt = len(open(val_file, 'r').readlines())
132 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))
133 | lr_decay_freq = int(train_batch_num * lr_decay_epoch)
134 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
 135 | ``` 136 |
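As a concrete illustration of the last two lines of the listing: `tf.train.piecewise_constant` (used by `config_learning_rate` in `utils/misc_utils.py`) expects global-step boundaries and one more value than boundaries, so the epoch-based `pw_boundaries` are rescaled by the number of batches per epoch. A minimal sketch of that arithmetic in plain Python; the image count below is an assumed example, not a value taken from this repo:

```python
import math

# Assumed example values -- the real count is read from train.txt at runtime.
train_img_cnt = 1842   # hypothetical number of training images
batch_size = 32
global_step = 0        # non-zero only when resuming training

# Batches (steps) per epoch, as computed at the end of args.py.
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))  # 58

# Epoch-based boundaries -> global-step boundaries for the piecewise schedule.
pw_boundaries = [float(i) * train_batch_num + global_step for i in [30, 50]]
pw_values = [1e-4, 3e-5, 1e-5]  # len(pw_values) == len(pw_boundaries) + 1

print(pw_boundaries)   # [1740.0, 2900.0]
# The learning rate stays at 1e-4 until step 1740, drops to 3e-5 until
# step 2900, and is 1e-5 afterwards.
```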
137 | 138 | Run: 139 | 140 | 141 | ```shell 142 | python train.py 143 | ``` 144 | 145 | 146 | 147 | ### 5.Test 148 | 149 | 150 | 151 | ``` 152 | python3 test_single_image.py 000002.jpg 153 | ``` 154 | 155 | 156 | 157 | -------------------------------------------------------------------------------- /Train/args.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This file contains the parameter used in train.py 3 | 4 | from __future__ import division, print_function 5 | 6 | from utils.misc_utils import parse_anchors, read_class_names 7 | import math 8 | 9 | ### Some paths 10 | train_file = './data/my_data/label/train.txt' # The path of the training txt file. 11 | val_file = './data/my_data/label/val.txt' # The path of the validation txt file. 12 | restore_path = './data/darknet_weights/yolov3.ckpt' # The path of the weights to restore. 13 | save_dir = './checkpoint/' # The directory of the weights to save. 14 | log_dir = './data/logs/' # The directory to store the tensorboard log files. 15 | progress_log_path = './data/progress.log' # The path to record the training progress. 16 | anchor_path = './data/yolo_anchors.txt' # The path of the anchor txt file. 17 | class_name_path = './data/coco.names' # The path of the class names. 18 | 19 | ### Training releated numbers 20 | batch_size = 12 #6 21 | img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height] 22 | letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image. 23 | total_epoches = 100 #500 24 | train_evaluation_step = 50 #100 # Evaluate on the training batch after some steps. 25 | val_evaluation_epoch = 50 #50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch. 26 | save_epoch = 10 # Save the model after some epochs. 27 | batch_norm_decay = 0.99 # decay in bn ops 28 | weight_decay = 5e-4 # l2 weight decay 29 | global_step = 0 # used when resuming training 30 | 31 | ### tf.data parameters 32 | num_threads = 10 # Number of threads for image processing used in tf.data pipeline. 33 | prefetech_buffer = 5 # Prefetech_buffer used in tf.data pipeline. 34 | 35 | ### Learning rate and optimizer 36 | optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop] 37 | save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file. 38 | learning_rate_init = 1e-4 39 | lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise] 40 | lr_decay_epoch = 5 # Epochs after which learning rate decays. Int or float. Used when chosen `exponential` and `cosine_decay_restart` lr_type. 41 | lr_decay_factor = 0.96 # The learning rate decay factor. Used when chosen `exponential` lr_type. 42 | lr_lower_bound = 1e-6 # The minimum learning rate. 43 | # only used in piecewise lr type 44 | pw_boundaries = [30, 50] # epoch based boundaries 45 | pw_values = [learning_rate_init, 3e-5, 1e-5] 46 | 47 | ### Load and finetune 48 | # Choose the parts you want to restore the weights. List form. 
49 | # restore_include: None, restore_exclude: None => restore the whole model 50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope` 51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2) 52 | # choise 1: only restore the darknet body 53 | # restore_include = ['yolov3/darknet53_body'] 54 | # restore_exclude = None 55 | # choise 2: restore all layers except the last 3 conv2d layers in 3 scale 56 | restore_include = None 57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22'] 58 | # Choose the parts you want to finetune. List form. 59 | # Set to None to train the whole model. 60 | update_part = ['yolov3/yolov3_head'] 61 | 62 | ### other training strategies 63 | multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default. 64 | use_label_smooth = True # Whether to use class label smoothing strategy. 65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 3 # Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.45 # iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. 
the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /Train/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /Train/data/coco.names: -------------------------------------------------------------------------------- 1 | P 2 | PH 3 | PV 4 | PHV 5 | PLC -------------------------------------------------------------------------------- /Train/data/darknet_weights/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/darknet_weights/readme -------------------------------------------------------------------------------- /Train/data/my_data/Annotations/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/Annotations/readme -------------------------------------------------------------------------------- /Train/data/my_data/ImageSets/Main/test.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/test.txt -------------------------------------------------------------------------------- /Train/data/my_data/ImageSets/Main/train.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/train.txt -------------------------------------------------------------------------------- 
/Train/data/my_data/ImageSets/Main/val.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/val.txt -------------------------------------------------------------------------------- /Train/data/my_data/JPEGImages/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/JPEGImages/readme -------------------------------------------------------------------------------- /Train/data/my_data/label/test.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/test.txt -------------------------------------------------------------------------------- /Train/data/my_data/label/train.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/train.txt -------------------------------------------------------------------------------- /Train/data/my_data/label/val.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/val.txt -------------------------------------------------------------------------------- /Train/data/yolo_anchors.txt: -------------------------------------------------------------------------------- 1 | 27,58, 40,97, 56,123, 68,180, 97,152, 102,241, 163,299, 262,481, 587,710 -------------------------------------------------------------------------------- /Train/data_pro.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import pandas 4 | import shutil 5 | import random 6 | 7 | 8 | import cv2 9 | import numpy as np 10 | import xml.etree.ElementTree as ET 11 | 12 | 13 | 14 | 15 | 16 | class Data_preprocess(object): 17 | ''' 18 | 解析xml数据 19 | ''' 20 | def __init__(self,data_path): 21 | self.data_path = data_path 22 | self.image_size = 416 23 | self.batch_size = 32 24 | self.cell_size = 13 25 | # TO DO 26 | self.classes = ["P","PH","PV","PHV","PLC"] 27 | self.num_classes = len(self.classes) 28 | self.box_per_cell = 5 29 | self.class_to_ind = dict(zip(self.classes, range(self.num_classes))) 30 | 31 | self.count = 0 32 | self.epoch = 1 33 | self.count_t = 0 34 | 35 | def load_labels(self, model): 36 | if model == 'train': 37 | txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt') 38 | if model == 'test': 39 | txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt') 40 | 41 | if model == "val": 42 | txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt') 43 | 44 | 45 | with open(txtname, 'r') as f: 46 | image_ind = [x.strip() for x in f.readlines()] # 文件名去掉 .jpg 47 | 48 | 49 | my_index = 0 50 | for ind in image_ind: 51 | class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind) 52 | 53 | if len(class_inds) == 0: 54 | pass 55 | else: 56 | annotation_label = "" 57 | #box_x: label_index, x_min,y_min,x_max,y_max 58 | for label_i in range(len(class_inds)): 59 | 60 | annotation_label += " " + str(class_inds[label_i]) 61 | annotation_label += " " + 
str(x1s[label_i]) 62 | annotation_label += " " + str(y1s[label_i]) 63 | annotation_label += " " + str(x2s[label_i]) 64 | annotation_label += " " + str(y2s[label_i]) 65 | 66 | with open("./data/my_data/label/"+model+".txt","a") as f: 67 | f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n") 68 | 69 | my_index += 1 70 | 71 | print(my_index) 72 | 73 | 74 | 75 | def load_data(self, index): 76 | label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes]) 77 | filename = os.path.join(self.data_path, 'Annotations', index + '.xml') 78 | tree = ET.parse(filename) 79 | image_size = tree.find('size') 80 | image_width = int(float(image_size.find('width').text)) 81 | image_height = int(float(image_size.find('height').text)) 82 | # h_ratio = 1.0 * self.image_size / image_height 83 | # w_ratio = 1.0 * self.image_size / image_width 84 | 85 | objects = tree.findall('object') 86 | 87 | class_inds = [] 88 | x1s = [] 89 | y1s = [] 90 | x2s = [] 91 | y2s = [] 92 | 93 | for obj in objects: 94 | box = obj.find('bndbox') 95 | x1 = int(float(box.find('xmin').text)) 96 | y1 = int(float(box.find('ymin').text)) 97 | x2 = int(float(box.find('xmax').text)) 98 | y2 = int(float(box.find('ymax').text)) 99 | # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0) 100 | # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0) 101 | # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0) 102 | # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0) 103 | if obj.find('name').text in self.classes: 104 | class_ind = self.class_to_ind[obj.find('name').text] 105 | # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()] 106 | 107 | # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)] 108 | # cx = 1.0 * boxes[0] * self.cell_size 109 | # cy = 1.0 * boxes[1] * self.cell_size 110 | # xind = int(np.floor(cx)) 111 | # yind = int(np.floor(cy)) 112 | 113 | # label[yind, xind, :, 0] = 1 114 | # label[yind, xind, :, 1:5] = boxes 115 | # label[yind, xind, :, 5 + class_ind] = 1 116 | 117 | if x1 >= x2 or y1 >= y2: 118 | pass 119 | else: 120 | class_inds.append(class_ind) 121 | x1s.append(x1) 122 | y1s.append(y1) 123 | x2s.append(x2) 124 | y2s.append(y2) 125 | 126 | return class_inds, x1s, y1s, x2s, y2s, image_width, image_height 127 | 128 | 129 | def data_split(img_path): 130 | ''' 131 | 数据分割 132 | ''' 133 | 134 | files = os.listdir(img_path) 135 | # To do 136 | test_part = random.sample(files,int(2302*0.2)) 137 | 138 | val_part = random.sample(test_part,int(int(2302*0.2)*0.5)) 139 | 140 | val_index = 0 141 | test_index = 0 142 | train_index = 0 143 | for file in files: 144 | if file in val_part: 145 | 146 | with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f: 147 | val_f.write(file[:-4] + "\n" ) 148 | 149 | val_index += 1 150 | 151 | elif file in test_part: 152 | with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f: 153 | test_f.write(file[:-4] + "\n") 154 | 155 | test_index += 1 156 | 157 | else: 158 | with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f: 159 | train_f.write(file[:-4] + "\n") 160 | 161 | train_index += 1 162 | 163 | 164 | print(train_index,test_index,val_index) 165 | 166 | 167 | # TO DO 168 | if __name__ == "__main__": 169 | 170 | # split train, val, test 171 | img_path = 
"./data/my_data/JPEGImages" 172 | data_split(img_path) 173 | print("===========split data finish============") 174 | 175 | # create YOLO V3 datasets 176 | base_path = os.getcwd() 177 | data_path = os.path.join(base_path,"data/my_data") # absolute path 178 | 179 | data_p = Data_preprocess(data_path) 180 | data_p.load_labels("train") 181 | data_p.load_labels("test") 182 | data_p.load_labels("val") 183 | print("==========data pro finish===========") 184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | -------------------------------------------------------------------------------- /Train/eval.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | from tqdm import trange 9 | 10 | from utils.data_utils import get_batch_data 11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter 12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec 13 | from utils.nms_utils import gpu_nms 14 | 15 | from model import yolov3 16 | 17 | ################# 18 | # ArgumentParser 19 | ################# 20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.") 21 | # some paths 22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt", 23 | help="The path of the validation or test txt file.") 24 | 25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt", 26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], 
name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = yolo_model.compute_loss(pred_feature_maps, y_true) 100 | y_pred = yolo_model.predict(pred_feature_maps) 101 | 102 | saver_to_restore = tf.train.Saver() 103 | 104 | with tf.Session() as sess: 105 | sess.run([tf.global_variables_initializer()]) 106 | saver_to_restore.restore(sess, args.restore_path) 107 | 108 | print('\n----------- start to eval -----------\n') 109 | 110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \ 111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter() 112 | val_preds = [] 113 | 114 | for j in trange(args.img_cnt): 115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False}) 116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred) 117 | 118 | val_preds.extend(pred_content) 119 | val_loss_total.update(__loss[0]) 120 | val_loss_xy.update(__loss[1]) 121 | val_loss_wh.update(__loss[2]) 122 | val_loss_conf.update(__loss[3]) 123 | val_loss_class.update(__loss[4]) 124 | 125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter() 126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize) 127 | print('mAP eval:') 128 | for ii in range(args.class_num): 129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric) 130 | rec_total.update(rec, npos) 131 | prec_total.update(prec, nd) 132 | ap_total.update(ap, 1) 133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap)) 134 | 135 | mAP = ap_total.average 136 | print('final mAP: {:.4f}'.format(mAP)) 137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average)) 138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format( 139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average 140 | )) 141 | -------------------------------------------------------------------------------- /Train/get_kmeans.py: 
-------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes 3 | 4 | from __future__ import division, print_function 5 | 6 | import numpy as np 7 | 8 | def iou(box, clusters): 9 | """ 10 | Calculates the Intersection over Union (IoU) between a box and k clusters. 11 | param: 12 | box: tuple or array, shifted to the origin (i. e. width and height) 13 | clusters: numpy array of shape (k, 2) where k is the number of clusters 14 | return: 15 | numpy array of shape (k, 0) where k is the number of clusters 16 | """ 17 | x = np.minimum(clusters[:, 0], box[0]) 18 | y = np.minimum(clusters[:, 1], box[1]) 19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0: 20 | raise ValueError("Box has no area") 21 | 22 | intersection = x * y 23 | box_area = box[0] * box[1] 24 | cluster_area = clusters[:, 0] * clusters[:, 1] 25 | 26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10) 27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10) 28 | 29 | return iou_ 30 | 31 | 32 | def avg_iou(boxes, clusters): 33 | """ 34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters. 35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 
62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | img_w = int(float(s[2])) 102 | img_h = int(float(s[3])) 103 | s = s[4:] 104 | box_cnt = len(s) // 5 105 | for i in range(box_cnt): 106 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 107 | width = x_max - x_min 108 | height = y_max - y_min 109 | assert width > 0 110 | assert height > 0 111 | # use letterbox resize, i.e. keep the original aspect ratio 112 | # get k-means anchors on the resized target image size 113 | if target_size is not None: 114 | resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h) 115 | width *= resize_ratio 116 | height *= resize_ratio 117 | result.append([width, height]) 118 | # get k-means anchors on the original image size 119 | else: 120 | result.append([width, height]) 121 | result = np.asarray(result) 122 | return result 123 | 124 | 125 | def get_kmeans(anno, cluster_num=9): 126 | 127 | anchors = kmeans(anno, cluster_num) 128 | ave_iou = avg_iou(anno, anchors) 129 | 130 | anchors = anchors.astype('int').tolist() 131 | 132 | anchors = sorted(anchors, key=lambda x: x[0] * x[1]) 133 | 134 | return anchors, ave_iou 135 | 136 | 137 | if __name__ == '__main__': 138 | # target resize format: [width, height] 139 | # if target_resize is speficied, the anchors are on the resized image scale 140 | # if target_resize is set to None, the anchors are on the original image scale 141 | # target_size = [416, 416] 142 | target_size = None 143 | annotation_path = "./data/my_data/label/train.txt" 144 | anno_result = parse_anno(annotation_path, target_size=target_size) 145 | anchors, ave_iou = get_kmeans(anno_result, 9) 146 | 147 | anchor_string = '' 148 | for anchor in anchors: 149 | anchor_string += '{},{}, '.format(anchor[0], anchor[1]) 150 | anchor_string = anchor_string[:-2] 151 | 152 | print('anchors are:') 153 | print(anchor_string) 154 | print('the average iou is:') 155 | print(ave_iou) 156 | 157 | -------------------------------------------------------------------------------- /Train/test_single_image.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | 10 | from utils.misc_utils import parse_anchors, read_class_names 11 | from utils.nms_utils import gpu_nms 12 | from utils.plot_utils import get_color_table, plot_one_box 13 | 14 | from model import yolov3 15 | 16 | 
tf.compat.v1.train.Saver 17 | 18 | parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.") 19 | parser.add_argument("input_image", type=str, 20 | help="The path of the input image.") 21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 22 | help="The path of the anchor txt file.") 23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 24 | help="Resize the input image with `new_size`, size format: [width, height]") 25 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 26 | help="The path of the class names.") 27 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_90_step_14013_loss_1.4107_lr_1e-05", 28 | help="The path of the weights to restore.") 29 | args = parser.parse_args() 30 | 31 | args.anchors = parse_anchors(args.anchor_path) 32 | args.classes = read_class_names(args.class_name_path) 33 | args.num_class = len(args.classes) 34 | 35 | color_table = get_color_table(args.num_class) 36 | 37 | img_ori = cv2.imread(args.input_image) 38 | height_ori, width_ori = img_ori.shape[:2] 39 | img = cv2.resize(img_ori, tuple(args.new_size)) 40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 41 | img = np.asarray(img, np.float32) 42 | img = img[np.newaxis, :] / 255. 43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | # saver = tf.train.import_meta_graph('./checkpoint/best_model_Epoch_5_step_42_mAP_0.0735_loss_43.3285_lr_0.0001.meta') 58 | # saver.restore(sess, tf.train.latest_checkpoint("./checkpoint/")) 59 | 60 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 61 | 62 | # rescale the coordinates to the original image 63 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 64 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 65 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 66 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 67 | 68 | print("box coords:") 69 | print(boxes_) 70 | print('*' * 30) 71 | print("scores:") 72 | print(scores_) 73 | print('*' * 30) 74 | print("labels:") 75 | print(labels_) 76 | 77 | for i in range(len(boxes_)): 78 | x0, y0, x1, y1 = boxes_[i] 79 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 80 | cv2.imshow('Detection result', img_ori) 81 | cv2.imwrite('detection_result.jpg', img_ori) 82 | cv2.waitKey(0) 83 | -------------------------------------------------------------------------------- /Train/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/utils/__init__.py -------------------------------------------------------------------------------- /Train/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | 
from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 | return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /Train/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 
34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif 
args.lr_type == 'cosine_decay_restart':
139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 | name='cosine_decay_learning_rate_restart')
142 | elif args.lr_type == 'fixed':
143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 | elif args.lr_type == 'piecewise':
145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 | name='piecewise_learning_rate')
147 | else:
148 | raise ValueError('Unsupported learning rate type!')
149 |
150 |
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 | if optimizer_name == 'momentum':
153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 | elif optimizer_name == 'rmsprop':
155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 | elif optimizer_name == 'adam':
157 | return tf.train.AdamOptimizer(learning_rate)
158 | elif optimizer_name == 'sgd':
159 | return tf.train.GradientDescentOptimizer(learning_rate)
160 | else:
161 | raise ValueError('Unsupported optimizer type!')
-------------------------------------------------------------------------------- /Train/utils/nms_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 |
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 | """
10 | Perform NMS on GPU using TensorFlow.
11 |
12 | params:
13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob
15 | num_classes: total number of classes
16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
17 | score_thresh: if the highest class probability score of a box is below this
18 | threshold, the corresponding box is discarded
19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering
20 | """
21 |
22 | boxes_list, label_list, score_list = [], [], []
23 | max_boxes = tf.constant(max_boxes, dtype='int32')
24 |
25 | # since we do NMS for a single image, reshape the tensors first
26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't know the exact number of boxes
27 | score = tf.reshape(scores, [-1, num_classes])
28 |
29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
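# (A tiny worked illustration, assuming score_thresh = 0.5 and two classes:
#  a score row of [0.2, 0.7] yields a mask row of [False, True], so that box
#  survives as a candidate for class 1 only.)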
30 | mask = tf.greater_equal(score, tf.constant(score_thresh))
31 | # Step 2: Do non_max_suppression for each class
32 | for i in range(num_classes):
33 | # Step 3: Apply the mask to scores, boxes and pick them out
34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 | scores=filter_score,
38 | max_output_size=max_boxes,
39 | iou_threshold=nms_thresh, name='nms_indices')
40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 | boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 | score_list.append(tf.gather(filter_score, nms_indices))
43 |
44 | boxes = tf.concat(boxes_list, axis=0)
45 | score = tf.concat(score_list, axis=0)
46 | label = tf.concat(label_list, axis=0)
47 |
48 | return boxes, score, label
49 |
50 |
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 | """
53 | Pure Python NMS baseline.
54 |
55 | Arguments: boxes: shape of [-1, 4], where '-1' means the exact number of
56 | boxes is unknown
57 | scores: shape of [-1,]
58 | max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 | iou_thresh: the IoU threshold for deciding whether to keep a box
60 | """
61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 |
63 | x1 = boxes[:, 0]
64 | y1 = boxes[:, 1]
65 | x2 = boxes[:, 2]
66 | y2 = boxes[:, 3]
67 |
68 | areas = (x2 - x1) * (y2 - y1)
69 | order = scores.argsort()[::-1]
70 |
71 | keep = []
72 | while order.size > 0:
73 | i = order[0]
74 | keep.append(i)
75 | xx1 = np.maximum(x1[i], x1[order[1:]])
76 | yy1 = np.maximum(y1[i], y1[order[1:]])
77 | xx2 = np.minimum(x2[i], x2[order[1:]])
78 | yy2 = np.minimum(y2[i], y2[order[1:]])
79 |
80 | w = np.maximum(0.0, xx2 - xx1 + 1)
81 | h = np.maximum(0.0, yy2 - yy1 + 1)
82 | inter = w * h
83 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 |
85 | inds = np.where(ovr <= iou_thresh)[0]
86 | order = order[inds + 1]
87 |
88 | return keep[:max_boxes]
89 |
90 |
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 | """
93 | Perform NMS on CPU.
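A minimal usage sketch (assuming boxes_np and scores_np were fetched from a
session run of the model outputs, with the shapes listed below):
boxes, score, label = cpu_nms(boxes_np, scores_np, num_classes=2, max_boxes=30, score_thresh=0.4)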
94 | Arguments:
95 | boxes: shape [1, 10647, 4]
96 | scores: shape [1, 10647, num_classes]
97 | """
98 |
99 | boxes = boxes.reshape(-1, 4)
100 | scores = scores.reshape(-1, num_classes)
101 | # Picked bounding boxes
102 | picked_boxes, picked_score, picked_label = [], [], []
103 |
104 | for i in range(num_classes):
105 | indices = np.where(scores[:,i] >= score_thresh)
106 | filter_boxes = boxes[indices]
107 | filter_scores = scores[:,i][indices]
108 | if len(filter_boxes) == 0:
109 | continue
110 | # do non_max_suppression on the cpu
111 | indices = py_nms(filter_boxes, filter_scores,
112 | max_boxes=max_boxes, iou_thresh=iou_thresh)
113 | picked_boxes.append(filter_boxes[indices])
114 | picked_score.append(filter_scores[indices])
115 | picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 | if len(picked_boxes) == 0:
117 | return None, None, None
118 |
119 | boxes = np.concatenate(picked_boxes, axis=0)
120 | score = np.concatenate(picked_score, axis=0)
121 | label = np.concatenate(picked_label, axis=0)
122 |
123 | return boxes, score, label
-------------------------------------------------------------------------------- /Train/utils/plot_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import cv2
6 | import random
7 |
8 |
9 | def get_color_table(class_num, seed=2):
10 | random.seed(seed)
11 | color_table = {}
12 | for i in range(class_num):
13 | color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 | return color_table
15 |
16 |
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 | '''
19 | coord: [x_min, y_min, x_max, y_max] format coordinates.
20 | img: the image to plot on.
21 | label: str. The label name.
22 | color: list of 3 ints, the BGR color to draw with (e.g. an entry from get_color_table).
23 | line_thickness: int. rectangle line thickness.
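e.g. plot_one_box(img, [48, 240, 195, 371], label='hat', color=(0, 255, 0))
draws a green box with a filled label tag above its top-left corner
(the coordinate values here are just hypothetical).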
24 | ''' 25 | tl = line_thickness or int(round(0.002 * max(img.shape[0:2]))) # line thickness 26 | color = color or [random.randint(0, 255) for _ in range(3)] 27 | c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3])) 28 | cv2.rectangle(img, c1, c2, color, thickness=tl) 29 | if label: 30 | tf = max(tl - 1, 1) # font thickness 31 | t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0] 32 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 33 | cv2.rectangle(img, c1, c2, color, -1) # filled 34 | cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA) 35 | 36 | -------------------------------------------------------------------------------- /Train/video_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | parser.add_argument("input_video", type=str, 22 | help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_90_step_14013_loss_1.4107_lr_1e-05", 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | # vid = cv2.VideoCapture(args.input_video) 44 | vid = cv2.VideoCapture(0) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 10 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, scores, 
labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
65 |
66 | saver = tf.train.Saver()
67 | saver.restore(sess, args.restore_path)
68 |
69 | # for i in range(video_frame_cnt):
70 | while True:
71 | ret, img_ori = vid.read()
72 | if args.letterbox_resize:
73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
74 | else:
75 | height_ori, width_ori = img_ori.shape[:2]
76 | img = cv2.resize(img_ori, tuple(args.new_size))
77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
78 | img = np.asarray(img, np.float32)
79 | img = img[np.newaxis, :] / 255.
80 |
81 | start_time = time.time()
82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
83 | end_time = time.time()
84 |
85 | # rescale the coordinates to the original image
86 | if args.letterbox_resize:
87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
89 | else:
90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
92 |
93 |
94 | for i in range(len(boxes_)):
95 | x0, y0, x1, y1 = boxes_[i]
96 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]])
97 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0,
98 | fontScale=1, color=(0, 255, 0), thickness=2)
99 | cv2.imshow('image', img_ori)
100 | k = cv2.waitKey(1)
101 | if args.save_video:
102 | videoWriter.write(img_ori)
103 | if k & 0xFF == ord('q'):
104 | break
105 |
106 | vid.release()
107 | if args.save_video:
108 | videoWriter.release()
109 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/.gitignore: --------------------------------------------------------------------------------
1 | /checkpoint/checkpoint
2 | /checkpoint/*.data-00000-of-00001
3 | /checkpoint/*.index
4 | /checkpoint/*.meta
5 | /data/darknet_weights/*.data-00000-of-00001
6 | /data/darknet_weights/*.meta
7 | /data/darknet_weights/*.weights
8 |
9 | /data/logs/*.ubuntu
10 | *.xml
11 | /data/myData/ImageSets/Main/*.txt
12 | /data/myData/JPEGImages/*.jpg
13 | /data/myData/label/*.txt
14 |
15 | /data/test_image/*.jpg
16 | /data/test_video/*
17 | /data/*.log
18 | /test_res/*.jpg
19 | /执行步骤.txt
20 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/README.md: --------------------------------------------------------------------------------
1 | ## Tensorflow YOLO V3 helmet detection
2 | Trained model download: [GitHub Release](https://github.com/DataXujing/YOLO-V3-Tensorflow/releases/tag/model) -- yolo3_hat.rar
3 |
4 | Put the three files into `./data/darknet_weights`.
5 |
6 | ### Test
7 | ```
8 | python3 test_single_image.py test.jpg
9 | ```
10 |
11 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/args.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This file contains the parameters used in train.py
3 |
4 | from __future__ import division, print_function
5 |
6 | from utils.misc_utils import parse_anchors, read_class_names
7 | import math
8 |
9 | ### Some paths
10 | train_file = './data/my_data/label/train.txt' # The path of the training txt file.
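# (These label files are generated by data_pro.py; each line holds
#  "<index> <image_path> <img_w> <img_h>" followed by one
#  "<label_index> <x_min> <y_min> <x_max> <y_max>" group per box.)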
11 | val_file = './data/my_data/label/val.txt' # The path of the validation txt file.
12 | restore_path = './data/darknet_weights/yolov3.ckpt' # The path of the weights to restore.
13 | save_dir = './checkpoint/' # The directory of the weights to save.
14 | log_dir = './data/logs/' # The directory to store the tensorboard log files.
15 | progress_log_path = './data/progress.log' # The path to record the training progress.
16 | anchor_path = './data/yolo_anchors.txt' # The path of the anchor txt file.
17 | class_name_path = './data/coco.names' # The path of the class names.
18 |
19 | ### Training related numbers
20 | batch_size = 32 #6
21 | img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height]
22 | letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
23 | total_epoches = 500
24 | train_evaluation_step = 100 # Evaluate on the training batch after some steps.
25 | val_evaluation_epoch = 50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
26 | save_epoch = 10 # Save the model after some epochs.
27 | batch_norm_decay = 0.99 # decay in bn ops
28 | weight_decay = 5e-4 # l2 weight decay
29 | global_step = 0 # used when resuming training
30 |
31 | ### tf.data parameters
32 | num_threads = 10 # Number of threads for image processing used in tf.data pipeline.
33 | prefetech_buffer = 5 # Prefetch buffer size used in tf.data pipeline.
34 |
35 | ### Learning rate and optimizer
36 | optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop]
37 | save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file.
38 | learning_rate_init = 1e-4
39 | lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
40 | lr_decay_epoch = 5 # Epochs after which the learning rate decays. Int or float. Used when the `exponential` or `cosine_decay_restart` lr_type is chosen.
41 | lr_decay_factor = 0.96 # The learning rate decay factor. Used when the `exponential` lr_type is chosen.
42 | lr_lower_bound = 1e-6 # The minimum learning rate.
43 | # only used in piecewise lr type
44 | pw_boundaries = [30, 50] # epoch based boundaries
45 | pw_values = [learning_rate_init, 3e-5, 1e-5]
46 |
47 | ### Load and finetune
48 | # Choose the parts whose weights you want to restore. List form.
49 | # restore_include: None, restore_exclude: None => restore the whole model
50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope`
51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 but not scope2 (scope1 - scope2)
52 | # choice 1: only restore the darknet body
53 | # restore_include = ['yolov3/darknet53_body']
54 | # restore_exclude = None
55 | # choice 2: restore all layers except the last 3 conv2d layers of the 3 scales
56 | restore_include = None
57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
58 | # Choose the parts you want to finetune. List form.
59 | # Set to None to train the whole model.
60 | update_part = ['yolov3/yolov3_head']
61 |
62 | ### other training strategies
63 | multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
64 | use_label_smooth = True # Whether to use class label smoothing strategy.
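# (A common label smoothing scheme, not necessarily the exact one implemented
#  in this repo's loss code, softens the one-hot targets as
#  y_smooth = y_onehot * (1 - eps) + eps / class_num for a small eps.)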
65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 3 # Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.45 # iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/coco.names: -------------------------------------------------------------------------------- 1 | hat 2 | person -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/darknet_weights/readme.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/data/darknet_weights/readme.txt -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/yolo_anchors.txt: -------------------------------------------------------------------------------- 1 | 676,197, 
763,250, 684,283, 868,231, 745,273, 544,391, 829,258, 678,316, 713,355
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data_pro.py: --------------------------------------------------------------------------------
1 |
2 | import os
3 | import pandas
4 | import shutil
5 | import random
6 |
7 |
8 | import cv2
9 | import numpy as np
10 | import xml.etree.ElementTree as ET
11 |
12 |
13 | # This part needs to be modified for your own dataset.
14 |
15 |
16 | class Data_preprocess(object):
17 | '''
18 | Parse the VOC-style XML annotation data.
19 | '''
20 | def __init__(self,data_path):
21 | self.data_path = data_path
22 | self.image_size = 416
23 | self.batch_size = 32
24 | self.cell_size = 13
25 | self.classes = ["hat","person"]
26 | self.num_classes = len(self.classes)
27 | self.box_per_cell = 5
28 | self.class_to_ind = dict(zip(self.classes, range(self.num_classes)))
29 |
30 | self.count = 0
31 | self.epoch = 1
32 | self.count_t = 0
33 |
34 | def load_labels(self, model):
35 | if model == 'train':
36 | txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt')
37 | if model == 'test':
38 | txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt')
39 |
40 | if model == "val":
41 | txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt')
42 |
43 |
44 | with open(txtname, 'r') as f:
45 | image_ind = [x.strip() for x in f.readlines()] # file names without the .jpg extension
46 |
47 |
48 | my_index = 0
49 | for ind in image_ind:
50 | class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind)
51 |
52 | if len(class_inds) == 0:
53 | pass
54 | else:
55 | annotation_label = ""
56 | # box_x: label_index, x_min, y_min, x_max, y_max
57 | for label_i in range(len(class_inds)):
58 |
59 | annotation_label += " " + str(class_inds[label_i])
60 | annotation_label += " " + str(x1s[label_i])
61 | annotation_label += " " + str(y1s[label_i])
62 | annotation_label += " " + str(x2s[label_i])
63 | annotation_label += " " + str(y2s[label_i])
64 |
65 | with open("./data/my_data/label/"+model+".txt","a") as f:
66 | f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n")
67 |
68 | my_index += 1
69 |
70 | print(my_index)
71 |
72 |
73 |
74 | def load_data(self, index):
75 | label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes])
76 | filename = os.path.join(self.data_path, 'Annotations', index + '.xml')
77 | tree = ET.parse(filename)
78 | image_size = tree.find('size')
79 | image_width = int(float(image_size.find('width').text))
80 | image_height = int(float(image_size.find('height').text))
81 | # h_ratio = 1.0 * self.image_size / image_height
82 | # w_ratio = 1.0 * self.image_size / image_width
83 |
84 | objects = tree.findall('object')
85 |
86 | class_inds = []
87 | x1s = []
88 | y1s = []
89 | x2s = []
90 | y2s = []
91 |
92 | for obj in objects:
93 | box = obj.find('bndbox')
94 | x1 = int(float(box.find('xmin').text))
95 | y1 = int(float(box.find('ymin').text))
96 | x2 = int(float(box.find('xmax').text))
97 | y2 = int(float(box.find('ymax').text))
98 | # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0)
99 | # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0)
100 | # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0)
101 | # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0)
102 | if obj.find('name').text in self.classes:
103 | class_ind = self.class_to_ind[obj.find('name').text]
104 | # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()]
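# (With self.classes = ["hat", "person"] above, class_to_ind maps
#  "hat" -> 0 and "person" -> 1, so class_inds collects integer labels.)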
105 |
106 | # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)]
107 | # cx = 1.0 * boxes[0] * self.cell_size
108 | # cy = 1.0 * boxes[1] * self.cell_size
109 | # xind = int(np.floor(cx))
110 | # yind = int(np.floor(cy))
111 |
112 | # label[yind, xind, :, 0] = 1
113 | # label[yind, xind, :, 1:5] = boxes
114 | # label[yind, xind, :, 5 + class_ind] = 1
115 |
116 | if x1 >= x2 or y1 >= y2:
117 | pass
118 | else:
119 | class_inds.append(class_ind)
120 | x1s.append(x1)
121 | y1s.append(y1)
122 | x2s.append(x2)
123 | y2s.append(y2)
124 |
125 | return class_inds, x1s, y1s, x2s, y2s, image_width, image_height
126 |
127 |
128 | def data_split(img_path):
129 | '''
130 | Split the image list into train, val and test sets.
131 | '''
132 |
133 | files = os.listdir(img_path)
134 |
135 | test_part = random.sample(files,int(351*0.2))
136 |
137 | val_part = random.sample(test_part,int(int(351*0.2)*0.5))
138 |
139 | val_index = 0
140 | test_index = 0
141 | train_index = 0
142 | for file in files:
143 | if file in val_part:
144 |
145 | with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f:
146 | val_f.write(file[:-4] + "\n" )
147 |
148 | val_index += 1
149 |
150 | elif file in test_part:
151 | with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f:
152 | test_f.write(file[:-4] + "\n")
153 |
154 | test_index += 1
155 |
156 | else:
157 | with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f:
158 | train_f.write(file[:-4] + "\n")
159 |
160 | train_index += 1
161 |
162 |
163 | print(train_index,test_index,val_index)
164 |
165 |
166 |
167 | if __name__ == "__main__":
168 |
169 | # split into train, val and test
170 | # img_path = "./data/my_data/ImageSets"
171 | # data_split(img_path)
172 | print("===========split data finish============")
173 |
174 | # build the label files that YOLO V3 training needs
175 | base_path = os.getcwd()
176 | data_path = os.path.join(base_path,"data/my_data") # absolute path
177 |
178 | data_p = Data_preprocess(data_path)
179 | data_p.load_labels("train")
180 | data_p.load_labels("test")
181 | data_p.load_labels("val")
182 | print("==========data pro finish===========")
183 |
184 |
185 |
186 |
187 |
188 |
189 |
190 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/detection_result.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/detection_result.jpg
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/eval.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | from tqdm import trange
9 |
10 | from utils.data_utils import get_batch_data
11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter
12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
13 | from utils.nms_utils import gpu_nms
14 |
15 | from model import yolov3
16 |
17 | #################
18 | # ArgumentParser
19 | #################
20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.")
21 | # some paths
22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt",
23 |
help="The path of the validation or test txt file.") 24 | 25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt", 26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = 
yolo_model.compute_loss(pred_feature_maps, y_true)
100 | y_pred = yolo_model.predict(pred_feature_maps)
101 |
102 | saver_to_restore = tf.train.Saver()
103 |
104 | with tf.Session() as sess:
105 | sess.run([tf.global_variables_initializer()])
106 | saver_to_restore.restore(sess, args.restore_path)
107 |
108 | print('\n----------- start to eval -----------\n')
109 |
110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \
111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter()
112 | val_preds = []
113 |
114 | for j in trange(args.img_cnt):
115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False})
116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred)
117 |
118 | val_preds.extend(pred_content)
119 | val_loss_total.update(__loss[0])
120 | val_loss_xy.update(__loss[1])
121 | val_loss_wh.update(__loss[2])
122 | val_loss_conf.update(__loss[3])
123 | val_loss_class.update(__loss[4])
124 |
125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter()
126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize)
127 | print('mAP eval:')
128 | for ii in range(args.class_num):
129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric)
130 | rec_total.update(rec, npos)
131 | prec_total.update(prec, nd)
132 | ap_total.update(ap, 1)
133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap))
134 |
135 | mAP = ap_total.average
136 | print('final mAP: {:.4f}'.format(mAP))
137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average))
138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format(
139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average
140 | ))
141 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/get_kmeans.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes
3 |
4 | from __future__ import division, print_function
5 |
6 | import numpy as np
7 |
8 | def iou(box, clusters):
9 | """
10 | Calculates the Intersection over Union (IoU) between a box and k clusters.
11 | param:
12 | box: tuple or array, shifted to the origin (i. e. width and height)
13 | clusters: numpy array of shape (k, 2) where k is the number of clusters
14 | return:
15 | numpy array of shape (k,) where k is the number of clusters
16 | """
17 | x = np.minimum(clusters[:, 0], box[0])
18 | y = np.minimum(clusters[:, 1], box[1])
19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
20 | raise ValueError("Box has no area")
21 |
22 | intersection = x * y
23 | box_area = box[0] * box[1]
24 | cluster_area = clusters[:, 0] * clusters[:, 1]
25 |
26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10)
27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10)
28 |
29 | return iou_
30 |
31 |
32 | def avg_iou(boxes, clusters):
33 | """
34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
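e.g. for boxes = [[10, 20]] and clusters = [[10, 20], [5, 5]], the single
box's best-matching cluster gives IoU 1.0, so the average is 1.0.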
35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | img_w = int(float(s[2])) 102 | img_h = int(float(s[3])) 103 | s = s[4:] 104 | box_cnt = len(s) // 5 105 | for i in range(box_cnt): 106 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 107 | width = x_max - x_min 108 | height = y_max - y_min 109 | assert width > 0 110 | assert height > 0 111 | # use letterbox resize, i.e. 
keep the original aspect ratio
112 | # get k-means anchors on the resized target image size
113 | if target_size is not None:
114 | resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h)
115 | width *= resize_ratio
116 | height *= resize_ratio
117 | result.append([width, height])
118 | # get k-means anchors on the original image size
119 | else:
120 | result.append([width, height])
121 | result = np.asarray(result)
122 | return result
123 |
124 |
125 | def get_kmeans(anno, cluster_num=9):
126 |
127 | anchors = kmeans(anno, cluster_num)
128 | ave_iou = avg_iou(anno, anchors)
129 |
130 | anchors = anchors.astype('int').tolist()
131 |
132 | anchors = sorted(anchors, key=lambda x: x[0] * x[1])
133 |
134 | return anchors, ave_iou
135 |
136 |
137 | if __name__ == '__main__':
138 | # target resize format: [width, height]
139 | # if target_size is specified, the anchors are on the resized image scale
140 | # if target_size is set to None, the anchors are on the original image scale
141 | target_size = [416, 416]
142 | annotation_path = "./data/my_data/label/train.txt"
143 | anno_result = parse_anno(annotation_path, target_size=target_size)
144 | anchors, ave_iou = get_kmeans(anno_result, 9)
145 |
146 | anchor_string = ''
147 | for anchor in anchors:
148 | anchor_string += '{},{}, '.format(anchor[0], anchor[1])
149 | anchor_string = anchor_string[:-2]
150 |
151 | print('anchors are:')
152 | print(anchor_string)
153 | print('the average iou is:')
154 | print(ave_iou)
155 |
156 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/requirements.txt: --------------------------------------------------------------------------------
1 | numpy==1.16.0
2 | absl-py==0.9.0
3 | astor==0.8.1
4 | gast==0.3.3
5 | grpcio==1.27.2
6 | h5py==2.10.0
7 | keras-applications==1.0.8
8 | keras-preprocessing==1.1.0
9 | markdown==3.2.1
10 | mock==4.0.2
11 | protobuf==3.11.3
12 | tensorboard==1.13.1
13 | tensorflow-estimator==1.13.0
14 | tensorflow-gpu==1.13.1+nv19.3
15 | termcolor==1.1.0
16 |
17 |
18 |
19 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/test.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/test.jpg
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/test_single_image.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 |
10 | from utils.misc_utils import parse_anchors, read_class_names
11 | from utils.nms_utils import gpu_nms
12 | from utils.plot_utils import get_color_table, plot_one_box
13 |
14 | from model import yolov3
15 |
16 |
17 |
18 | parser = argparse.ArgumentParser(description="YOLO-V3 single image test procedure.")
19 | parser.add_argument("input_image", type=str,
20 | help="The path of the input image.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 | help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 | help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--class_name_path", type=str,
default="./data/coco.names", 26 | help="The path of the class names.") 27 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/best_model_Epoch_200_step_34370_mAP_0.8121_loss_9.4284_lr_1e-05", 28 | help="The path of the weights to restore.") 29 | args = parser.parse_args() 30 | 31 | args.anchors = parse_anchors(args.anchor_path) 32 | args.classes = read_class_names(args.class_name_path) 33 | args.num_class = len(args.classes) 34 | 35 | color_table = get_color_table(args.num_class) 36 | 37 | img_ori = cv2.imread(args.input_image) 38 | height_ori, width_ori = img_ori.shape[:2] 39 | img = cv2.resize(img_ori, tuple(args.new_size)) 40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 41 | img = np.asarray(img, np.float32) 42 | img = img[np.newaxis, :] / 255. 43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | 58 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 59 | 60 | # rescale the coordinates to the original image 61 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 62 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 63 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 64 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 65 | 66 | print("box coords:") 67 | print(boxes_) 68 | print('*' * 30) 69 | print("scores:") 70 | print(scores_) 71 | print('*' * 30) 72 | print("labels:") 73 | print(labels_) 74 | 75 | for i in range(len(boxes_)): 76 | x0, y0, x1, y1 = boxes_[i] 77 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 78 | # cv2.imshow('Detection result', img_ori) 79 | cv2.imwrite('detection_result.jpg', img_ori) 80 | # cv2.waitKey(0) 81 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/utils/__init__.py -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 
| return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 
34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif 
args.lr_type == 'cosine_decay_restart':
139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 | name='cosine_decay_learning_rate_restart')
142 | elif args.lr_type == 'fixed':
143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 | elif args.lr_type == 'piecewise':
145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 | name='piecewise_learning_rate')
147 | else:
148 | raise ValueError('Unsupported learning rate type!')
149 |
150 |
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 | if optimizer_name == 'momentum':
153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 | elif optimizer_name == 'rmsprop':
155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 | elif optimizer_name == 'adam':
157 | return tf.train.AdamOptimizer(learning_rate)
158 | elif optimizer_name == 'sgd':
159 | return tf.train.GradientDescentOptimizer(learning_rate)
160 | else:
161 | raise ValueError('Unsupported optimizer type!')
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/nms_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 |
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 | """
10 | Perform NMS on GPU using TensorFlow.
11 |
12 | params:
13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob
15 | num_classes: total number of classes
16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
17 | score_thresh: if the highest class probability score of a box is below this
18 | threshold, the corresponding box is discarded
19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering
20 | """
21 |
22 | boxes_list, label_list, score_list = [], [], []
23 | max_boxes = tf.constant(max_boxes, dtype='int32')
24 |
25 | # since we do NMS for a single image, reshape the tensors first
26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't know the exact number of boxes
27 | score = tf.reshape(scores, [-1, num_classes])
28 |
29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
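# (tf.boolean_mask keeps only the rows whose mask entry is True: with three
#  candidate boxes and mask[:, i] = [True, False, True], filter_boxes below
#  would have shape [2, 4].)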
30 | mask = tf.greater_equal(score, tf.constant(score_thresh))
31 | # Step 2: Do non_max_suppression for each class
32 | for i in range(num_classes):
33 | # Step 3: Apply the mask to scores, boxes and pick them out
34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 | scores=filter_score,
38 | max_output_size=max_boxes,
39 | iou_threshold=nms_thresh, name='nms_indices')
40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 | boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 | score_list.append(tf.gather(filter_score, nms_indices))
43 |
44 | boxes = tf.concat(boxes_list, axis=0)
45 | score = tf.concat(score_list, axis=0)
46 | label = tf.concat(label_list, axis=0)
47 |
48 | return boxes, score, label
49 |
50 |
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 | """
53 | Pure Python NMS baseline.
54 |
55 | Arguments: boxes: shape of [-1, 4], where '-1' means the exact number of
56 | boxes is unknown
57 | scores: shape of [-1,]
58 | max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 | iou_thresh: the IoU threshold for deciding whether to keep a box
60 | """
61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 |
63 | x1 = boxes[:, 0]
64 | y1 = boxes[:, 1]
65 | x2 = boxes[:, 2]
66 | y2 = boxes[:, 3]
67 |
68 | areas = (x2 - x1) * (y2 - y1)
69 | order = scores.argsort()[::-1]
70 |
71 | keep = []
72 | while order.size > 0:
73 | i = order[0]
74 | keep.append(i)
75 | xx1 = np.maximum(x1[i], x1[order[1:]])
76 | yy1 = np.maximum(y1[i], y1[order[1:]])
77 | xx2 = np.minimum(x2[i], x2[order[1:]])
78 | yy2 = np.minimum(y2[i], y2[order[1:]])
79 |
80 | w = np.maximum(0.0, xx2 - xx1 + 1)
81 | h = np.maximum(0.0, yy2 - yy1 + 1)
82 | inter = w * h
83 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 |
85 | inds = np.where(ovr <= iou_thresh)[0]
86 | order = order[inds + 1]
87 |
88 | return keep[:max_boxes]
89 |
90 |
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 | """
93 | Perform NMS on CPU.
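(cpu_nms applies the per-class score threshold first and then delegates the
greedy suppression to py_nms above, one class at a time.)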
94 |     Arguments:
95 |         boxes: shape [1, 10647, 4]
96 |         scores: shape [1, 10647, num_classes]
97 |     """
98 | 
99 |     boxes = boxes.reshape(-1, 4)
100 |     scores = scores.reshape(-1, num_classes)
101 |     # Picked bounding boxes
102 |     picked_boxes, picked_score, picked_label = [], [], []
103 | 
104 |     for i in range(num_classes):
105 |         indices = np.where(scores[:,i] >= score_thresh)
106 |         filter_boxes = boxes[indices]
107 |         filter_scores = scores[:,i][indices]
108 |         if len(filter_boxes) == 0:
109 |             continue
110 |         # do non_max_suppression on the cpu
111 |         indices = py_nms(filter_boxes, filter_scores,
112 |                          max_boxes=max_boxes, iou_thresh=iou_thresh)
113 |         picked_boxes.append(filter_boxes[indices])
114 |         picked_score.append(filter_scores[indices])
115 |         picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 |     if len(picked_boxes) == 0:
117 |         return None, None, None
118 | 
119 |     boxes = np.concatenate(picked_boxes, axis=0)
120 |     score = np.concatenate(picked_score, axis=0)
121 |     label = np.concatenate(picked_label, axis=0)
122 | 
123 |     return boxes, score, label
--------------------------------------------------------------------------------
/YOLO-V3-Tensorflow-demo/utils/plot_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import cv2
6 | import random
7 | 
8 | 
9 | def get_color_table(class_num, seed=2):
10 |     random.seed(seed)
11 |     color_table = {}
12 |     for i in range(class_num):
13 |         color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 |     return color_table
15 | 
16 | 
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 |     '''
19 |     coord: [x_min, y_min, x_max, y_max] format coordinates.
20 |     img: img to plot on.
21 |     label: str. The label name.
22 |     color: a list/tuple of 3 ints, the (B, G, R) color of the box.
23 |     line_thickness: int. rectangle line thickness.
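    example (illustrative values): plot_one_box(img, [20, 30, 120, 180], label='PH, 98.00%', color=(0, 255, 0))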
24 |     '''
25 |     tl = line_thickness or int(round(0.002 * max(img.shape[0:2])))  # line thickness
26 |     color = color or [random.randint(0, 255) for _ in range(3)]
27 |     c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3]))
28 |     cv2.rectangle(img, c1, c2, color, thickness=tl)
29 |     if label:
30 |         tf = max(tl - 1, 1)  # font thickness
31 |         t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0]
32 |         c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
33 |         cv2.rectangle(img, c1, c2, color, -1)  # filled
34 |         cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA)
35 | 
36 | 
--------------------------------------------------------------------------------
/YOLO-V3-Tensorflow-demo/video_test.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 | import time
10 | 
11 | from utils.misc_utils import parse_anchors, read_class_names
12 | from utils.nms_utils import gpu_nms
13 | from utils.plot_utils import get_color_table, plot_one_box
14 | from utils.data_aug import letterbox_resize
15 | 
16 | from model import yolov3
17 | 
18 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.")
19 | parser.add_argument("input_video", type=str,
20 |                     help="The path of the input video.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 |                     help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 |                     help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
26 |                     help="Whether to use the letterbox resize.")
27 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
28 |                     help="The path of the class names.")
29 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/best_model_Epoch_200_step_34370_mAP_0.8121_loss_9.4284_lr_1e-05",
30 |                     help="The path of the weights to restore.")
31 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False,
32 |                     help="Whether to save the video detection results.")
33 | args = parser.parse_args()
34 | 
35 | args.anchors = parse_anchors(args.anchor_path)
36 | args.classes = read_class_names(args.class_name_path)
37 | args.num_class = len(args.classes)
38 | 
39 | color_table = get_color_table(args.num_class)
40 | 
41 | vid = cv2.VideoCapture(args.input_video)
42 | video_frame_cnt = int(vid.get(7))   # 7: cv2.CAP_PROP_FRAME_COUNT
43 | video_width = int(vid.get(3))       # 3: cv2.CAP_PROP_FRAME_WIDTH
44 | video_height = int(vid.get(4))      # 4: cv2.CAP_PROP_FRAME_HEIGHT
45 | video_fps = int(vid.get(5))         # 5: cv2.CAP_PROP_FPS
46 | 
47 | if args.save_video:
48 |     fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
49 |     videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height))
50 | 
51 | with tf.Session() as sess:
52 |     input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
53 |     yolo_model = yolov3(args.num_class, args.anchors)
54 |     with tf.variable_scope('yolov3'):
55 |         pred_feature_maps = yolo_model.forward(input_data, False)
56 |     pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
57 | 
58 |     pred_scores = pred_confs * pred_probs
59 | 
60 |     boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
61 | 
62 |     saver = tf.train.Saver()
63 |     saver.restore(sess, args.restore_path)
64 | 
65 |     for i in range(video_frame_cnt):
66 |         ret, img_ori = vid.read()
67 |         if not ret:  # stop early if the stream ends before the reported frame count
68 |             break
69 |         if args.letterbox_resize:
70 |             img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
71 |         else:
72 |             height_ori, width_ori = img_ori.shape[:2]
73 |             img = cv2.resize(img_ori, tuple(args.new_size))
74 |         img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
75 |         img = np.asarray(img, np.float32)
76 |         img = img[np.newaxis, :] / 255.
77 | 
78 |         start_time = time.time()
79 |         boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
80 |         end_time = time.time()
81 | 
82 |         # rescale the coordinates to the original image
83 |         if args.letterbox_resize:
84 |             boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
85 |             boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
86 |         else:
87 |             boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
88 |             boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
89 | 
90 |         for j in range(len(boxes_)):  # 'j' so the frame counter 'i' is not shadowed
91 |             x0, y0, x1, y1 = boxes_[j]
92 |             plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[j]] + ', {:.2f}%'.format(scores_[j] * 100), color=color_table[labels_[j]])
93 |         cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0,
94 |                     fontScale=1, color=(0, 255, 0), thickness=2)
95 |         cv2.imshow('image', img_ori)
96 |         if args.save_video:
97 |             videoWriter.write(img_ori)
98 |         if cv2.waitKey(1) & 0xFF == ord('q'):
99 |             break
100 | 
101 | vid.release()
102 | if args.save_video:
103 |     videoWriter.release()
--------------------------------------------------------------------------------
/config.py:
--------------------------------------------------------------------------------
1 | import configparser
2 | import os
3 | 
4 | class Config:
5 |     """A model for saving settings"""
6 | 
7 |     def __init__(self):
8 |         self.config_path = 'config.ini'
9 |         self.email_receiver = ''
10 |         self.email_server = ''
11 |         self.email_port = '0'
12 |         self.email_username = ''
13 |         self.email_password = ''
14 |         self.email_ssl = False
15 |         self.objects_to_detect = -1
16 |         self.detection_marker_location = 50
17 |         self.detection_marker_direction = 0
18 |         self.mode = ''
19 | 
20 |     def load(self):
21 |         """Load the settings from the file"""
22 | 
23 |         if not os.path.exists(self.config_path):
24 |             return
25 | 
26 |         config_parser = configparser.ConfigParser()
27 |         config_parser.read(self.config_path)
28 | 
29 |         email_config = config_parser['email']
30 |         objects_config = config_parser['objects']
31 |         detection_marker = config_parser['detection_marker']
32 | 
33 |         self.email_receiver = email_config['receiver']
34 |         self.email_server = email_config['server']
35 |         self.email_port = email_config['port']
36 |         self.email_username = email_config['username']
37 |         self.email_password = email_config['password']
38 |         self.email_ssl = email_config.getboolean('ssl')  # bool('False') would be True, so parse the string properly
39 |         self.objects_to_detect = int(objects_config['index'])
40 |         self.detection_marker_location = int(detection_marker['location'])
41 |         self.detection_marker_direction = int(detection_marker['direction'])
42 | 
43 |         if(self.objects_to_detect == 0):
44 |             self.mode = 'PH'
45 |         elif(self.objects_to_detect == 1):
46 |             self.mode = 'PV'
47 |         elif(self.objects_to_detect == 3):
48 |             self.mode = 'PLC'
49 |         else:  # index 2: Helmet & Vest
50 |             self.mode = 'PHV'
51 | 
52 |     def save(self):
53 |         """Save the settings to the file"""
54 |         config_parser = configparser.ConfigParser()
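        # configparser stores every option as a string, so the non-string settings are stringified below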
55 |         config_parser['email'] = {'receiver': self.email_receiver,
56 |                                   'server': self.email_server,
57 |                                   'port': str(self.email_port),
58 |                                   'username': self.email_username,
59 |                                   'password': self.email_password,
60 |                                   'ssl': str(self.email_ssl)
61 |                                   }
62 | 
63 |         config_parser['objects'] = {'index': str(self.objects_to_detect)}
64 | 
65 |         config_parser['detection_marker'] = {'location': str(self.detection_marker_location),
66 |                                              'direction': str(self.detection_marker_direction)}
67 | 
68 |         with open(self.config_path, 'w') as configfile:
69 |             config_parser.write(configfile)
--------------------------------------------------------------------------------
/config_window.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | from tkinter import ttk, Tk, messagebox
3 | import tkinter as tk
4 | from config import Config
5 | from notification import NotificationService
6 | 
7 | class ConfigWindow(tk.Toplevel):
8 |     def __init__(self, master, on_config_save=None, **kwargs):
9 |         super().__init__(master, **kwargs)  # pass the parent window through to Toplevel
10 | 
11 |         self.resizable(False, False)
12 | 
13 |         self.on_config_save = on_config_save
14 |         self.title('Configuration')
15 | 
16 |         self.config = Config()
17 |         self.config.load()
18 | 
19 |         self.var_enable_ssl = tk.BooleanVar()
20 |         self.var_enable_ssl.set(self.config.email_ssl)
21 | 
22 |         self.var_recipient_email = tk.StringVar()
23 |         self.var_recipient_email.set(self.config.email_receiver)
24 | 
25 |         self.var_server = tk.StringVar()
26 |         self.var_server.set(self.config.email_server)
27 | 
28 |         self.var_port = tk.IntVar()
29 |         self.var_port.set(self.config.email_port)
30 | 
31 |         self.var_logon_email = tk.StringVar()
32 |         self.var_logon_email.set(self.config.email_username)
33 | 
34 |         self.var_password = tk.StringVar()
35 |         self.var_password.set(self.config.email_password)
36 | 
37 |         self.var_objects_detect = tk.IntVar()
38 |         self.var_objects_detect.set(self.config.objects_to_detect)
39 | 
40 |         self.var_marker_location = tk.IntVar()
41 |         self.var_marker_location.set(self.config.detection_marker_location)
42 | 
43 |         self.var_marker_direction = tk.IntVar()
44 |         self.var_marker_direction.set(self.config.detection_marker_direction)
45 | 
46 |         self.var_status = tk.StringVar()
47 |         self.var_status.set('')
48 | 
49 |         self.create_page()
50 | 
51 |         self.event_loop = asyncio.get_event_loop()
52 | 
53 |     def create_page(self):
54 |         email_frame = ttk.LabelFrame(self, text="Email settings")
55 |         email_frame.pack(padx=15, pady=15, fill=tk.X)
56 | 
57 |         self.draw_input(email_frame, 'Recipient Email:', "text", 0, self.var_recipient_email)
58 |         self.draw_input(email_frame, 'Server:', "text", 1, self.var_server)
59 |         self.draw_input(email_frame, 'Port:', "text", 2, self.var_port)
60 |         self.draw_input(email_frame, 'Logon Email:', "text", 3, self.var_logon_email)
61 |         self.draw_input(email_frame, 'Password:', "text", 4, self.var_password, True)
62 |         self.draw_input(email_frame, 'Enable SSL', "check", 5, self.var_enable_ssl)
63 | 
64 |         row_status = ttk.Frame(email_frame)
65 |         row_status.pack(anchor='w')
66 |         self.lbl_status = ttk.Label(row_status, textvariable=self.var_status)
67 |         self.lbl_status.pack()
68 | 
69 |         row_test_email = ttk.Frame(email_frame)
70 |         row_test_email.pack(anchor='w')
71 |         btn_test_email = ttk.Button(row_test_email, text='Test Email', command=self.on_email_test)
72 |         btn_test_email.pack()
73 | 
74 |         objects_frame = ttk.LabelFrame(self, text='Objects to detect')
75 |         objects_frame.pack(fill=tk.X, padx=15)
76 | 
77 |         row_objects = ttk.Frame(objects_frame)
78 |         row_objects.pack(anchor='w', expand=1)
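        # the radio values below map onto Config.objects_to_detect: 0 = Helmet, 1 = Vest, 2 = Helmet & Vest, 3 = Lab Coat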
79 | 
80 |         rad_helmet = ttk.Radiobutton(row_objects, text='Helmet', value=0, variable=self.var_objects_detect)
81 |         rad_helmet.pack(side=tk.LEFT, fill=tk.X, expand=1)
82 | 
83 |         rad_vest = ttk.Radiobutton(row_objects, text='Vest', value=1, variable=self.var_objects_detect)
84 |         rad_vest.pack(side=tk.LEFT, fill=tk.X)
85 |         # rad_vest.grid(row=7, column=1)
86 |         rad_helmet_vest = ttk.Radiobutton(row_objects, text='Helmet & Vest', value=2, variable=self.var_objects_detect)
87 |         rad_helmet_vest.pack(side=tk.LEFT, fill=tk.X)
88 | 
89 |         # rad_helmet_vest.grid(row=7, column=2)
90 |         rad_lab_coat = ttk.Radiobutton(row_objects, text='Lab Coat', value=3, variable=self.var_objects_detect)
91 |         rad_lab_coat.pack(side=tk.LEFT, fill=tk.X)
92 | 
93 |         marker_frame = ttk.LabelFrame(self, text="Detection marker position")
94 |         marker_frame.pack(padx=15, pady=15, fill=tk.X)
95 | 
96 |         # objects_frame = ttk.LabelFrame(self, text='Objects to detect')
97 |         # objects_frame.pack(fill=tk.X, padx=15)
98 |         direction_frame = ttk.LabelFrame(marker_frame, text='Direction of the marker')
99 |         direction_frame.pack(fill=tk.X)
100 | 
101 |         row_direction = ttk.Frame(direction_frame)
102 |         row_direction.pack(anchor='w', expand=1)
103 | 
104 |         rad_horizontal = ttk.Radiobutton(row_direction, text='Horizontal', value=0, variable=self.var_marker_direction)
105 |         rad_horizontal.pack(side=tk.LEFT, fill=tk.X, expand=1)
106 | 
107 |         rad_vertical = ttk.Radiobutton(row_direction, text='Vertical', value=1, variable=self.var_marker_direction)
108 |         rad_vertical.pack(side=tk.LEFT, fill=tk.X)
109 | 
110 |         self.draw_input(marker_frame, 'Location %: (0 - 100)', "text", 2, self.var_marker_location)
111 | 
112 |         buttons_frame = ttk.LabelFrame(self)
113 |         buttons_frame.pack(padx=15, pady=15, fill=tk.X)
114 | 
115 |         row_btn_save = ttk.Frame(buttons_frame)
116 |         row_btn_save.pack(anchor='w')
117 | 
118 |         btn_save = ttk.Button(row_btn_save, text="Ok", width=10, command=self.on_save)
119 |         btn_save.pack(side=tk.LEFT)
120 | 
121 |         btn_cancel = ttk.Button(row_btn_save, text="Cancel", command=self.destroy)
122 |         btn_cancel.pack(side=tk.LEFT)
123 | 
124 |     def draw_input(self, master, label, type, index, variable=None, is_password=False):
125 |         row = ttk.Frame(master)
126 |         row.pack(anchor='w')
127 | 
128 |         if type == "text":
129 |             lbl = ttk.Label(row, width=20, text=label)
130 |             lbl.pack(anchor='w')
131 | 
132 |             txt = ttk.Entry(row, width=50, textvariable=variable)
133 | 
134 |             if is_password:
135 |                 txt.config(show="*")
136 |         elif type == "check":
137 |             txt = ttk.Checkbutton(row, variable=variable, text=label)
138 |         elif type == "number":
139 |             lbl = ttk.Label(row, width=20, text=label)
140 |             lbl.pack(anchor='w')
141 |             txt = ttk.Spinbox(row, increment=1)  # parent must be `row`, not `master`, so it packs into this input row
142 | 
143 |         txt.pack(anchor='w')
144 | 
145 |     def on_email_test(self):
146 |         print('on_email_test')
147 | 
148 |         try:
149 |             noti = NotificationService(self.var_server.get(),
150 |                                        self.var_port.get(),
151 |                                        self.var_logon_email.get(),
152 |                                        self.var_password.get(),
153 |                                        self.var_enable_ssl.get())
154 | 
155 |             loop = asyncio.new_event_loop()
156 |             ss = loop.run_until_complete(noti.notify(0, self.var_recipient_email.get(), 'Test Email'))
157 |             loop.close()
158 | 
159 |             self.var_status.set('Test Success')
160 |             self.lbl_status.configure(foreground="green")
161 | 
162 |         except Exception as inst:
163 |             self.lbl_status.configure(foreground="red")
164 |             self.var_status.set(f'Error:{inst}')
165 | 
166 |     def on_save(self):
167 |         config = Config()
168 |         config.load()
169 | 
170 |         config.email_ssl = self.var_enable_ssl.get()
171 |         config.email_receiver = self.var_recipient_email.get()
172 |         config.email_server = self.var_server.get()
173 |         config.email_port = self.var_port.get()
174 |         config.email_username = self.var_logon_email.get()
175 |         config.email_password = self.var_password.get()
176 |         config.objects_to_detect = self.var_objects_detect.get()
177 |         config.detection_marker_direction = self.var_marker_direction.get()
178 |         config.detection_marker_location = self.var_marker_location.get()
179 |         config.save()
180 | 
181 |         if(self.on_config_save is not None):
182 |             self.on_config_save()
183 | 
184 |         self.destroy()
185 | 
186 | if __name__ == '__main__':
187 |     mw = tk.Tk()
188 |     fw = ConfigWindow(mw)
189 |     mw.mainloop()
--------------------------------------------------------------------------------
/data/coco.names:
--------------------------------------------------------------------------------
1 | P
2 | PH
3 | PV
4 | PHV
5 | PLC
--------------------------------------------------------------------------------
/data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 15,32, 27,62, 38,99, 49,141, 69,111, 69,184, 98,223, 143,232, 188,359
--------------------------------------------------------------------------------
/eyre.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/eyre.ico
--------------------------------------------------------------------------------
/notification.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | import smtplib
3 | 
4 | 
5 | class NotificationService:
6 |     username = ''
7 |     password = ''
8 |     host = ''
9 |     port = ''
10 |     ssl = False
11 | 
12 |     def __init__(self, host, port, username, password, ssl):
13 |         self.username = username
14 |         self.password = password
15 |         self.host = host
16 |         self.port = port
17 |         self.ssl = ssl
18 |         self.counter = {}  # per-mode counter used by count()/reset_count(); without this both methods raise AttributeError
19 | 
20 |     def count(self, mode):
21 |         self.counter[mode] = self.counter.get(mode, 0) + 1
22 | 
23 |     def reset_count(self, mode):
24 |         self.counter[mode] = 0
25 | 
26 |     async def notify(self, notification_type, receiver, message):
27 |         if notification_type == 0:
28 |             with smtplib.SMTP(self.host, self.port) as smtp:
29 |                 smtp.ehlo()
30 |                 smtp.starttls()
31 |                 smtp.ehlo()
32 | 
33 |                 smtp.login(self.username, self.password)
34 | 
35 |                 subject = 'Eyre Notification'
36 | 
37 |                 mail = 'Subject: {}\n\n{}'.format(subject, message)
38 | 
39 |                 smtp.sendmail(self.username, receiver, mail)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow-gpu==1.15.2
2 | opencv-python==4.2.0.32
3 | Pillow==7.1.2
4 | scikit-learn==0.21.3
5 | filterpy==1.4.5
--------------------------------------------------------------------------------
/sort.py:
--------------------------------------------------------------------------------
1 | """
2 | SORT: A Simple, Online and Realtime Tracker
3 | Copyright (C) 2016 Alex Bewley alex@dynamicdetection.com
4 | 
5 | This program is free software: you can redistribute it and/or modify
6 | it under the terms of the GNU General Public License as published by
7 | the Free Software Foundation, either version 3 of the License, or
8 | (at your option) any later version.
9 | 
10 | This program is distributed in the hope that it will be useful,
11 | but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | GNU General Public License for more details.
14 | 
15 | You should have received a copy of the GNU General Public License
16 | along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | """
18 | from __future__ import print_function
19 | 
20 | import numpy as np
21 | from sklearn.utils.linear_assignment_ import linear_assignment
22 | from filterpy.kalman import KalmanFilter
23 | 
24 | def iou(bb_test,bb_gt):
25 |   """
26 |   Computes IOU between two bboxes in the form [x1,y1,x2,y2]
27 |   """
28 |   xx1 = np.maximum(bb_test[0], bb_gt[0])
29 |   yy1 = np.maximum(bb_test[1], bb_gt[1])
30 |   xx2 = np.minimum(bb_test[2], bb_gt[2])
31 |   yy2 = np.minimum(bb_test[3], bb_gt[3])
32 |   w = np.maximum(0., xx2 - xx1)
33 |   h = np.maximum(0., yy2 - yy1)
34 |   wh = w * h
35 |   o = wh / ((bb_test[2]-bb_test[0])*(bb_test[3]-bb_test[1])
36 |     + (bb_gt[2]-bb_gt[0])*(bb_gt[3]-bb_gt[1]) - wh)
37 |   return(o)
38 | 
39 | def convert_bbox_to_z(bbox):
40 |   """
41 |   Takes a bounding box in the form [x1,y1,x2,y2] and returns z in the form
42 |   [x,y,s,r] where x,y is the centre of the box, s is the scale/area and r is
43 |   the aspect ratio
44 |   """
45 |   w = bbox[2]-bbox[0]
46 |   h = bbox[3]-bbox[1]
47 |   x = bbox[0]+w/2.
48 |   y = bbox[1]+h/2.
49 |   s = w*h    #scale is just area
50 |   r = w/float(h)
51 |   return np.array([x,y,s,r]).reshape((4,1))
52 | 
53 | def convert_x_to_bbox(x,score=None):
54 |   """
55 |   Takes a bounding box in the centre form [x,y,s,r] and returns it in the form
56 |   [x1,y1,x2,y2] where x1,y1 is the top left and x2,y2 is the bottom right
57 |   """
58 |   w = np.sqrt(x[2]*x[3])
59 |   h = x[2]/w
60 |   if(score is None):
61 |     return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.]).reshape((1,4))
62 |   else:
63 |     return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.,score]).reshape((1,5))
64 | 
65 | class KalmanBoxTracker(object):
66 |   """
67 |   This class represents the internal state of individual tracked objects observed as bbox.
68 |   """
69 |   count = 0
70 |   def __init__(self,bbox):
71 |     """
72 |     Initialises a tracker using an initial bounding box.
73 |     """
74 |     #define constant velocity model
75 |     self.kf = KalmanFilter(dim_x=7, dim_z=4)
76 |     self.kf.F = np.array([[1,0,0,0,1,0,0],[0,1,0,0,0,1,0],[0,0,1,0,0,0,1],[0,0,0,1,0,0,0],[0,0,0,0,1,0,0],[0,0,0,0,0,1,0],[0,0,0,0,0,0,1]])
77 |     self.kf.H = np.array([[1,0,0,0,0,0,0],[0,1,0,0,0,0,0],[0,0,1,0,0,0,0],[0,0,0,1,0,0,0]])
78 | 
79 |     self.kf.R[2:,2:] *= 10.
80 |     self.kf.P[4:,4:] *= 1000.   #give high uncertainty to the unobservable initial velocities
81 |     self.kf.P *= 10.
82 |     self.kf.Q[-1,-1] *= 0.01
83 |     self.kf.Q[4:,4:] *= 0.01
84 | 
85 |     self.kf.x[:4] = convert_bbox_to_z(bbox)
86 |     self.time_since_update = 0
87 |     self.id = KalmanBoxTracker.count
88 |     KalmanBoxTracker.count += 1
89 |     self.history = []
90 |     self.hits = 0
91 |     self.hit_streak = 0
92 |     self.age = 0
93 | 
94 |   def update(self,bbox):
95 |     """
96 |     Updates the state vector with an observed bbox.
97 |     """
98 |     self.time_since_update = 0
99 |     self.history = []
100 |     self.hits += 1
101 |     self.hit_streak += 1
102 |     self.kf.update(convert_bbox_to_z(bbox))
103 | 
104 |   def predict(self):
105 |     """
106 |     Advances the state vector and returns the predicted bounding box estimate.
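    (the motion model assumes constant velocity in the centre coordinates and scale, see self.kf.F above)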
107 | """ 108 | if((self.kf.x[6]+self.kf.x[2])<=0): 109 | self.kf.x[6] *= 0.0 110 | self.kf.predict() 111 | self.age += 1 112 | if(self.time_since_update>0): 113 | self.hit_streak = 0 114 | self.time_since_update += 1 115 | self.history.append(convert_x_to_bbox(self.kf.x)) 116 | return self.history[-1] 117 | 118 | def get_state(self): 119 | """ 120 | Returns the current bounding box estimate. 121 | """ 122 | return convert_x_to_bbox(self.kf.x) 123 | 124 | def associate_detections_to_trackers(detections,trackers,iou_threshold = 0.3): 125 | """ 126 | Assigns detections to tracked object (both represented as bounding boxes) 127 | 128 | Returns 3 lists of matches, unmatched_detections and unmatched_trackers 129 | """ 130 | if(len(trackers)==0) or (len(detections)==0): 131 | return np.empty((0,2),dtype=int), np.arange(len(detections)), np.empty((0,5),dtype=int) 132 | iou_matrix = np.zeros((len(detections),len(trackers)),dtype=np.float32) 133 | 134 | for d,det in enumerate(detections): 135 | for t,trk in enumerate(trackers): 136 | iou_matrix[d,t] = iou(det,trk) 137 | matched_indices = linear_assignment(-iou_matrix) 138 | 139 | unmatched_detections = [] 140 | for d,det in enumerate(detections): 141 | if(d not in matched_indices[:,0]): 142 | unmatched_detections.append(d) 143 | unmatched_trackers = [] 144 | for t,trk in enumerate(trackers): 145 | if(t not in matched_indices[:,1]): 146 | unmatched_trackers.append(t) 147 | 148 | #filter out matched with low IOU 149 | matches = [] 150 | for m in matched_indices: 151 | if(iou_matrix[m[0],m[1]] 0: 202 | trk.update(dets[d,:][0]) 203 | 204 | #create and initialise new trackers for unmatched detections 205 | for i in unmatched_dets: 206 | trk = KalmanBoxTracker(dets[i,:]) 207 | self.trackers.append(trk) 208 | i = len(self.trackers) 209 | for trk in reversed(self.trackers): 210 | d = trk.get_state()[0] 211 | if((trk.time_since_update < 1) and (trk.hit_streak >= self.min_hits or self.frame_count <= self.min_hits)): 212 | ret.append(np.concatenate((d,[trk.id+1])).reshape(1,-1)) # +1 as MOT benchmark requires positive 213 | i -= 1 214 | #remove dead tracklet 215 | if(trk.time_since_update > self.max_age): 216 | self.trackers.pop(i) 217 | if(len(ret)>0): 218 | return np.concatenate(ret) 219 | return np.empty((0,5)) -------------------------------------------------------------------------------- /test_notification.py: -------------------------------------------------------------------------------- 1 | from unittest import TestCase 2 | 3 | from notification import NotificationService 4 | 5 | 6 | class TestNotificationService(TestCase): 7 | def test_notify(self): 8 | username = 'ppe.detection@gmail.com' 9 | password = 'itsdxkjynqhwsgaj' 10 | host = 'smtp.gmail.com' 11 | port = 587 12 | 13 | noti = NotificationService(host, port, username, password, False) 14 | noti.notify(0, 'guneedmts@gmail.com', 'Unit Test') 15 | 16 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/utils/__init__.py -------------------------------------------------------------------------------- /utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def 
10 |     def _fixed_padding(inputs, kernel_size):
11 |         pad_total = kernel_size - 1
12 |         pad_beg = pad_total // 2
13 |         pad_end = pad_total - pad_beg
14 | 
15 |         padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
16 |                                         [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
17 |         return padded_inputs
18 |     if strides > 1:
19 |         inputs = _fixed_padding(inputs, kernel_size)
20 |     inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
21 |                          padding=('SAME' if strides == 1 else 'VALID'))
22 |     return inputs
23 | 
24 | def darknet53_body(inputs):
25 |     def res_block(inputs, filters):
26 |         shortcut = inputs
27 |         net = conv2d(inputs, filters * 1, 1)
28 |         net = conv2d(net, filters * 2, 3)
29 | 
30 |         net = net + shortcut
31 | 
32 |         return net
33 | 
34 |     # first two conv2d layers
35 |     net = conv2d(inputs, 32, 3, strides=1)
36 |     net = conv2d(net, 64, 3, strides=2)
37 | 
38 |     # res_block * 1
39 |     net = res_block(net, 32)
40 | 
41 |     net = conv2d(net, 128, 3, strides=2)
42 | 
43 |     # res_block * 2
44 |     for i in range(2):
45 |         net = res_block(net, 64)
46 | 
47 |     net = conv2d(net, 256, 3, strides=2)
48 | 
49 |     # res_block * 8
50 |     for i in range(8):
51 |         net = res_block(net, 128)
52 | 
53 |     route_1 = net
54 |     net = conv2d(net, 512, 3, strides=2)
55 | 
56 |     # res_block * 8
57 |     for i in range(8):
58 |         net = res_block(net, 256)
59 | 
60 |     route_2 = net
61 |     net = conv2d(net, 1024, 3, strides=2)
62 | 
63 |     # res_block * 4
64 |     for i in range(4):
65 |         net = res_block(net, 512)
66 |     route_3 = net
67 | 
68 |     return route_1, route_2, route_3
69 | 
70 | 
71 | def yolo_block(inputs, filters):
72 |     net = conv2d(inputs, filters * 1, 1)
73 |     net = conv2d(net, filters * 2, 3)
74 |     net = conv2d(net, filters * 1, 1)
75 |     net = conv2d(net, filters * 2, 3)
76 |     net = conv2d(net, filters * 1, 1)
77 |     route = net
78 |     net = conv2d(net, filters * 2, 3)
79 |     return route, net
80 | 
81 | 
82 | def upsample_layer(inputs, out_shape):
83 |     new_height, new_width = out_shape[1], out_shape[2]
84 |     # NOTE: height comes first here
85 |     # TODO: Do we need to set `align_corners` to True?
86 |     inputs = tf.compat.v1.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
87 |     return inputs
--------------------------------------------------------------------------------
/utils/misc_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | import numpy as np
4 | import tensorflow as tf
5 | import random
6 | 
7 | from tensorflow.core.framework import summary_pb2
8 | 
9 | 
10 | def make_summary(name, val):
11 |     return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)])
12 | 
13 | 
14 | class AverageMeter(object):
15 |     def __init__(self):
16 |         self.reset()
17 | 
18 |     def reset(self):
19 |         self.val = 0
20 |         self.average = 0
21 |         self.sum = 0
22 |         self.count = 0
23 | 
24 |     def update(self, val, n=1):
25 |         self.val = val
26 |         self.sum += val * n
27 |         self.count += n
28 |         self.average = self.sum / float(self.count)
29 | 
30 | 
31 | def parse_anchors(anchor_path):
32 |     '''
33 |     parse anchors.
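    the anchor file is a single line of comma-separated values, e.g. the 9 (width, height) pairs in data/yolo_anchors.txt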
34 |     returned data: shape [N, 2], dtype float32
35 |     '''
36 |     anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2])
37 |     return anchors
38 | 
39 | 
40 | def read_class_names(class_name_path):
41 |     names = {}
42 |     with open(class_name_path, 'r') as data:
43 |         for ID, name in enumerate(data):
44 |             names[ID] = name.strip('\n')
45 |     return names
46 | 
47 | 
48 | def shuffle_and_overwrite(file_name):
49 |     content = open(file_name, 'r').readlines()
50 |     random.shuffle(content)
51 |     with open(file_name, 'w') as f:
52 |         for line in content:
53 |             f.write(line)
54 | 
55 | 
56 | def update_dict(ori_dict, new_dict):
57 |     if not ori_dict:
58 |         return new_dict
59 |     for key in ori_dict:
60 |         ori_dict[key] += new_dict[key]
61 |     return ori_dict
62 | 
63 | 
64 | def list_add(ori_list, new_list):
65 |     for i in range(len(ori_list)):
66 |         ori_list[i] += new_list[i]
67 |     return ori_list
68 | 
69 | 
70 | def load_weights(var_list, weights_file):
71 |     """
72 |     Loads and converts pre-trained weights.
73 |     param:
74 |         var_list: list of network variables.
75 |         weights_file: name of the binary file.
76 |     """
77 |     with open(weights_file, "rb") as fp:
78 |         np.fromfile(fp, dtype=np.int32, count=5)
79 |         weights = np.fromfile(fp, dtype=np.float32)
80 | 
81 |     ptr = 0
82 |     i = 0
83 |     assign_ops = []
84 |     while i < len(var_list) - 1:
85 |         var1 = var_list[i]
86 |         var2 = var_list[i + 1]
87 |         # do something only if we process a conv layer
88 |         if 'Conv' in var1.name.split('/')[-2]:
89 |             # check the type of the next layer
90 |             if 'BatchNorm' in var2.name.split('/')[-2]:
91 |                 # load batch norm params
92 |                 gamma, beta, mean, var = var_list[i + 1:i + 5]
93 |                 batch_norm_vars = [beta, gamma, mean, var]
94 |                 for var in batch_norm_vars:
95 |                     shape = var.shape.as_list()
96 |                     num_params = np.prod(shape)
97 |                     var_weights = weights[ptr:ptr + num_params].reshape(shape)
98 |                     ptr += num_params
99 |                     assign_ops.append(tf.assign(var, var_weights, validate_shape=True))
100 |                 # we move the pointer by 4, because we loaded 4 variables
101 |                 i += 4
102 |             elif 'Conv' in var2.name.split('/')[-2]:
103 |                 # load biases
104 |                 bias = var2
105 |                 bias_shape = bias.shape.as_list()
106 |                 bias_params = np.prod(bias_shape)
107 |                 bias_weights = weights[ptr:ptr +
108 |                                        bias_params].reshape(bias_shape)
109 |                 ptr += bias_params
110 |                 assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True))
111 |                 # we loaded 1 variable
112 |                 i += 1
113 |             # we can load the weights of the conv layer
114 |             shape = var1.shape.as_list()
115 |             num_params = np.prod(shape)
116 | 
117 |             var_weights = weights[ptr:ptr + num_params].reshape(
118 |                 (shape[3], shape[2], shape[0], shape[1]))
119 |             # remember to transpose to column-major
120 |             var_weights = np.transpose(var_weights, (2, 3, 1, 0))
121 |             ptr += num_params
122 |             assign_ops.append(
123 |                 tf.assign(var1, var_weights, validate_shape=True))
124 |             i += 1
125 | 
126 |     return assign_ops
127 | 
128 | 
129 | def config_learning_rate(args, global_step):
130 |     if args.lr_type == 'exponential':
131 |         lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq,
132 |                                             args.lr_decay_factor, staircase=True, name='exponential_learning_rate')
133 |         return tf.maximum(lr_tmp, args.lr_lower_bound)
134 |     elif args.lr_type == 'cosine_decay':
135 |         train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num
136 |         return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \
137 |                (1 + tf.cos(global_step / train_steps * np.pi))
138 |     elif args.lr_type == 'cosine_decay_restart':
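        # restarts: each cycle is twice as long as the previous one (t_mul=2.0) and restarts at the full initial LR (m_mul=1.0)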
139 |         return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 |                                                args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 |                                                name='cosine_decay_learning_rate_restart')
142 |     elif args.lr_type == 'fixed':
143 |         return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 |     elif args.lr_type == 'piecewise':
145 |         return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 |                                            name='piecewise_learning_rate')
147 |     else:
148 |         raise ValueError('Unsupported learning rate type!')
149 | 
150 | 
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 |     if optimizer_name == 'momentum':
153 |         return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 |     elif optimizer_name == 'rmsprop':
155 |         return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 |     elif optimizer_name == 'adam':
157 |         return tf.train.AdamOptimizer(learning_rate)
158 |     elif optimizer_name == 'sgd':
159 |         return tf.train.GradientDescentOptimizer(learning_rate)
160 |     else:
161 |         raise ValueError('Unsupported optimizer type!')
--------------------------------------------------------------------------------
/utils/nms_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import numpy as np
6 | import tensorflow as tf
7 | 
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 |     """
10 |     Perform NMS on GPU using TensorFlow.
11 | 
12 |     params:
13 |         boxes: tensor of shape [1, 10647, 4]  # 10647 = (13*13+26*26+52*52)*3 for an input 416*416 image
14 |         scores: tensor of shape [1, 10647, num_classes], score = conf * prob
15 |         num_classes: total number of classes
16 |         max_boxes: integer, the maximum number of predicted boxes you'd like, default is 50
17 |         score_thresh: if [highest class probability score < score_thresh],
18 |                       then get rid of the corresponding box
19 |         nms_thresh: real value, the "intersection over union" threshold used for NMS filtering
20 |     """
21 | 
22 |     boxes_list, label_list, score_list = [], [], []
23 |     max_boxes = tf.constant(max_boxes, dtype='int32')
24 | 
25 |     # since we do NMS for a single image, reshape the tensors first
26 |     boxes = tf.reshape(boxes, [-1, 4])  # '-1' means we don't know the exact number of boxes
27 |     score = tf.reshape(scores, [-1, num_classes])
28 | 
29 |     # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
30 |     mask = tf.greater_equal(score, tf.constant(score_thresh))
31 |     # Step 2: Do non_max_suppression for each class
32 |     for i in range(num_classes):
33 |         # Step 3: Apply the mask to scores and boxes, and pick them out
34 |         filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 |         filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 |         nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 |                                                    scores=filter_score,
38 |                                                    max_output_size=max_boxes,
39 |                                                    iou_threshold=nms_thresh, name='nms_indices')
40 |         label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 |         boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 |         score_list.append(tf.gather(filter_score, nms_indices))
43 | 
44 |     boxes = tf.concat(boxes_list, axis=0)
45 |     score = tf.concat(score_list, axis=0)
46 |     label = tf.concat(label_list, axis=0)
47 | 
48 |     return boxes, score, label
49 | 
50 | 
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 |     """
53 |     Pure Python NMS baseline.
54 | 
55 |     Arguments: boxes: shape of [-1, 4], where '-1' means we don't know the
56 |                       exact number of boxes
57 |                scores: shape of [-1,]
58 |                max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 |                iou_thresh: the IoU threshold for deciding which boxes to keep
60 |     """
61 |     assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 | 
63 |     x1 = boxes[:, 0]
64 |     y1 = boxes[:, 1]
65 |     x2 = boxes[:, 2]
66 |     y2 = boxes[:, 3]
67 | 
68 |     areas = (x2 - x1) * (y2 - y1)
69 |     order = scores.argsort()[::-1]
70 | 
71 |     keep = []
72 |     while order.size > 0:
73 |         i = order[0]
74 |         keep.append(i)
75 |         xx1 = np.maximum(x1[i], x1[order[1:]])
76 |         yy1 = np.maximum(y1[i], y1[order[1:]])
77 |         xx2 = np.minimum(x2[i], x2[order[1:]])
78 |         yy2 = np.minimum(y2[i], y2[order[1:]])
79 | 
80 |         w = np.maximum(0.0, xx2 - xx1 + 1)
81 |         h = np.maximum(0.0, yy2 - yy1 + 1)
82 |         inter = w * h
83 |         ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 | 
85 |         inds = np.where(ovr <= iou_thresh)[0]
86 |         order = order[inds + 1]
87 | 
88 |     return keep[:max_boxes]
89 | 
90 | 
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 |     """
93 |     Perform NMS on CPU.
94 |     Arguments:
95 |         boxes: shape [1, 10647, 4]
96 |         scores: shape [1, 10647, num_classes]
97 |     """
98 | 
99 |     boxes = boxes.reshape(-1, 4)
100 |     scores = scores.reshape(-1, num_classes)
101 |     # Picked bounding boxes
102 |     picked_boxes, picked_score, picked_label = [], [], []
103 | 
104 |     for i in range(num_classes):
105 |         indices = np.where(scores[:,i] >= score_thresh)
106 |         filter_boxes = boxes[indices]
107 |         filter_scores = scores[:,i][indices]
108 |         if len(filter_boxes) == 0:
109 |             continue
110 |         # do non_max_suppression on the cpu
111 |         indices = py_nms(filter_boxes, filter_scores,
112 |                          max_boxes=max_boxes, iou_thresh=iou_thresh)
113 |         picked_boxes.append(filter_boxes[indices])
114 |         picked_score.append(filter_scores[indices])
115 |         picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 |     if len(picked_boxes) == 0:
117 |         return None, None, None
118 | 
119 |     boxes = np.concatenate(picked_boxes, axis=0)
120 |     score = np.concatenate(picked_score, axis=0)
121 |     label = np.concatenate(picked_label, axis=0)
122 | 
123 |     return boxes, score, label
--------------------------------------------------------------------------------
/utils/plot_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import cv2
6 | import random
7 | 
8 | 
9 | def get_color_table(class_num, seed=2):
10 |     random.seed(seed)
11 |     color_table = {}
12 |     for i in range(class_num):
13 |         color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 |     return color_table
15 | 
16 | 
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 |     '''
19 |     coord: [x_min, y_min, x_max, y_max] format coordinates.
20 |     img: img to plot on.
21 |     label: str. The label name.
22 |     color: a list/tuple of 3 ints, the (B, G, R) color of the box.
23 |     line_thickness: int. rectangle line thickness.
24 |     '''
25 |     tl = line_thickness or int(round(0.002 * max(img.shape[0:2])))  # line thickness
26 |     color = color or [random.randint(0, 255) for _ in range(3)]
27 |     c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3]))
28 |     cv2.rectangle(img, c1, c2, color, thickness=tl)
29 |     if label:
30 |         tf = max(tl - 1, 1)  # font thickness
31 |         t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0]
32 |         c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
33 |         cv2.rectangle(img, c1, c2, color, -1)  # filled
34 |         cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA)
35 | 
36 | 
--------------------------------------------------------------------------------
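A minimal sketch of how the utilities above fit together outside a TensorFlow session (illustrative only: the dummy boxes, scores and the blank frame are made-up stand-ins, and it assumes it is run from the repository root; a real pipeline feeds the model's `pred_boxes`/`pred_scores` from `video_test.py` instead):

```python
# coding: utf-8
import numpy as np

from utils.nms_utils import cpu_nms
from utils.plot_utils import get_color_table, plot_one_box

num_classes = 5                                # matches data/coco.names: P, PH, PV, PHV, PLC
classes = {0: 'P', 1: 'PH', 2: 'PV', 3: 'PHV', 4: 'PLC'}
color_table = get_color_table(num_classes)

# dummy predictions shaped like the model output: [1, N, 4] boxes and [1, N, num_classes] scores
boxes = np.array([[[50, 60, 150, 220], [55, 65, 155, 225]]], dtype=np.float32)
scores = np.zeros((1, 2, num_classes), dtype=np.float32)
scores[0, :, 1] = [0.9, 0.6]                   # two heavily overlapping 'PH' candidates; NMS keeps one

boxes_, scores_, labels_ = cpu_nms(boxes, scores, num_classes, score_thresh=0.3, iou_thresh=0.45)

img = np.zeros((416, 416, 3), dtype=np.uint8)  # stand-in for a real video frame
for box, score, label in zip(boxes_, scores_, labels_):
    plot_one_box(img, box, label='{}, {:.2f}%'.format(classes[label], score * 100),
                 color=color_table[label])
```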