├── .gitattributes
├── .gitignore
├── LICENSE
├── README.md
├── Test_demo
│   ├── .gitignore
│   ├── README.md
│   ├── args.py
│   ├── camera_test.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme.txt
│   │   ├── logs
│   │   │   └── readme
│   │   ├── my_data
│   │   │   ├── Annotations
│   │   │   │   └── readme
│   │   │   ├── ImageSets
│   │   │   │   └── Main
│   │   │   │       └── readme
│   │   │   ├── JPEGImages
│   │   │   │   └── readme
│   │   │   ├── label
│   │   │   │   └── readme
│   │   │   └── readme
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── Train
│   ├── .gitignore
│   ├── README.md
│   ├── README_YOLO_V3.md
│   ├── args.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme
│   │   ├── my_data
│   │   │   ├── Annotations
│   │   │   │   └── readme
│   │   │   ├── ImageSets
│   │   │   │   └── Main
│   │   │   │       ├── test.txt
│   │   │   │       ├── train.txt
│   │   │   │       └── val.txt
│   │   │   ├── JPEGImages
│   │   │   │   └── readme
│   │   │   └── label
│   │   │       ├── test.txt
│   │   │       ├── train.txt
│   │   │       └── val.txt
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── YOLO-V3-Tensorflow-demo
│   ├── .gitignore
│   ├── README.md
│   ├── args.py
│   ├── convert_weight.py
│   ├── data
│   │   ├── coco.names
│   │   ├── darknet_weights
│   │   │   └── readme.txt
│   │   └── yolo_anchors.txt
│   ├── data_pro.py
│   ├── detection_result.jpg
│   ├── eval.py
│   ├── get_kmeans.py
│   ├── model.py
│   ├── requirements.txt
│   ├── test.jpg
│   ├── test_single_image.py
│   ├── train.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── data_aug.py
│   │   ├── data_utils.py
│   │   ├── eval_utils.py
│   │   ├── layer_utils.py
│   │   ├── misc_utils.py
│   │   ├── nms_utils.py
│   │   └── plot_utils.py
│   └── video_test.py
├── config.py
├── config_window.py
├── data
│   ├── coco.names
│   └── yolo_anchors.txt
├── detection.py
├── eyre.ico
├── eyre.py
├── model.py
├── notification.py
├── requirements.txt
├── sort.py
├── test_notification.py
├── tools
│   ├── freeze_model.py
│   └── generate_detections.py
└── utils
    ├── __init__.py
    ├── data_aug.py
    ├── data_utils.py
    ├── eval_utils.py
    ├── layer_utils.py
    ├── misc_utils.py
    ├── nms_utils.py
    └── plot_utils.py

/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
__pycache__/
.idea/
config.ini
checkpoint/
.vscode/

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# COMP5703-CS70: PPE Detection

To install the required dependencies:

`pip install -r requirements.txt`

Download the three [[checkpoint files](https://drive.google.com/drive/folders/1mOjkvQQBEcLV4ju2N9JYhDiROGwC1MHb?usp=sharing)] below and place them under the `checkpoint` folder.

`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.data-00000-of-00001`
`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.index`
`best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05.meta`

### Use an Input Video for Detection

The application works with any input video. Demo videos are provided [[here](https://drive.google.com/drive/folders/13c1mlFCWxu7in9Aggz5OWsq_K9duW9M2?usp=sharing)].

In `eyre.py` change
```
def init_camera(self):
    self.camera = cv2.VideoCapture('path_to_video/demo_video.mp4')
```

### Use a Camera Stream
In `eyre.py` change
```
def init_camera(self):
    self.camera = cv2.VideoCapture(0)
```

## Finally, to run the app:

`python eyre.py`



# Data Introduction

After confirming the requirements with the client, we needed to collect images of people wearing helmets, vests, or lab coats. The project required a large amount of data, which came from two main sources: images scraped from websites and open-source datasets. For this project, we found and used two open-source datasets, GDUT-HWD and Pictorv3. The open-source data contains 3995 images with 7865 positive instances and 7672 negative instances. The scraped images are mainly of people wearing lab coats and comprise 1073 images with 3018 positive instances and 287 negative instances. Besides, the dataset extracted from filmed videos contains 1,437 images with 2,734 positive instances and 204 negative instances.

The training dataset can be downloaded here:

`https://drive.google.com/file/d/17T_aHFxhq3BNDn2iTJASKT6HYdUoph3A/view?usp=sharing`

## 1. Data Scraping

We used Selenium together with the matching WebDriver to download our lab coat, helmet, and vest images. Selenium can automate the Chrome browser: clicking buttons, scrolling pages, waiting for content to load, and extracting image URLs. Choosing the keywords for scraping is challenging; queries such as 'construction worker' or 'people wearing safety vest' return far better results than 'Safety Helmet' or 'Safety Vest' alone.
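A minimal sketch of that scraping flow is shown below. The query URL, the fixed waits, and the output folder are illustrative assumptions rather than our exact script, and page selectors change over time:

```python
# Hypothetical Google Images scrape with Selenium (chromedriver must be on PATH).
import os
import time
import urllib.request

from selenium import webdriver
from selenium.webdriver.common.by import By

query = "construction worker"  # broader queries beat 'Safety Helmet' alone
url = "https://www.google.com/search?tbm=isch&q=" + query.replace(" ", "+")

os.makedirs("raw_images", exist_ok=True)
driver = webdriver.Chrome()
driver.get(url)

for _ in range(5):  # scroll to trigger lazy loading, then wait for new images
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)

for i, img in enumerate(driver.find_elements(By.TAG_NAME, "img")):
    src = img.get_attribute("src")
    if src and src.startswith("http"):  # skip inline base64 thumbnails
        urllib.request.urlretrieve(src, "raw_images/{:05d}.jpg".format(i))
driver.quit()
```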
You can choose keywords such as "construction worker".

Besides scraping data from websites, another direct data collection method is to record videos of the actual usage scenes.

## 2. Data Labeling

We used the open-source labelling tool [[LabelImg](https://github.com/tzutalin/labelImg)]. The labelling process simply takes time and effort. Referring to figure 1, our project labels the data directly into five classes (P, PH, PV, PHV, PLC) and then applies the YOLOv3 model to classify detections into these classes.

(Figure 1: examples of the five labelling classes; the original notebook image attachment is not available.)

In order to reduce noise from different backgrounds, we strictly labelled each person from head to knees and from shoulder to shoulder (figure 2). In addition, when a class is labelled incorrectly or a typo is made, we use the ElementTree class in Python to process the annotation XML files and correct the errors, as sketched below.

(Figure 2: the labelling extent, head to knees and shoulder to shoulder; the original notebook image attachment is not available.)
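A minimal sketch of that ElementTree correction pass, assuming a hypothetical `fixes` mapping from wrong to correct label names (the actual typos we fixed varied):

```python
# Rewrite mislabelled <object><name> entries in every VOC annotation file.
import os
import xml.etree.ElementTree as ET

fixes = {"PHV ": "PHV", "pv": "PV"}  # wrong label -> correct label (examples only)
ann_dir = "./data/my_data/Annotations"

for fname in os.listdir(ann_dir):
    if not fname.endswith(".xml"):
        continue
    path = os.path.join(ann_dir, fname)
    tree = ET.parse(path)
    changed = False
    for obj in tree.getroot().findall("object"):
        name = obj.find("name")
        if name.text in fixes:
            name.text = fixes[name.text]
            changed = True
    if changed:
        tree.write(path)  # overwrite the annotation in place
```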
# PPE Detection Model Training Tutorial


### 1. Requirements

Python version: 3.7

Packages:

- tensorflow < 2 (theoretically any version that supports tf.data is ok)
- opencv-python
- tqdm

### 2. Weights download

Download the converted TensorFlow checkpoint file via the [[Google Drive link](https://drive.google.com/drive/folders/1mXbNgNxyXPi7JNsnBaxEv1-nWr7SVoQt?usp=sharing)] or the [[Github Release](https://github.com/wizyoung/YOLOv3_TensorFlow/releases/)] and place it in the `./data/darknet_weights/` directory.


### 3. Training

#### 3.1 Data preparation

Put the VOC-format dataset in the `./data/my_data/` directory.

(1) annotation file

Run `python data_pro.py` to generate the `train.txt/val.txt/test.txt` files under the `./data/my_data/` directory. One line per image, in the format `image_index image_absolute_path img_width img_height box_1 box_2 ... box_n`. Box_x format: `label_index x_min y_min x_max y_max`. (The origin of the coordinates is at the top left corner: top left => (xmin, ymin), bottom right => (xmax, ymax).) `image_index` is the line index, starting from zero. `label_index` is in the range [0, class_num - 1].

For example:

```
0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268
1 xxx/xxx/b.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320
...
```

(2) class_names file:

`coco.names` file under the `./data/my_data/` directory. Each line represents a class name.

```
P
PH
PV
PHV
PLC
...
```

(3) prior anchor file:

Use the k-means algorithm to get the prior anchors:

```
python get_kmeans.py
```

You will then get 9 anchors and the average IoU. Save the anchors to `./data/yolo_anchors.txt`.

The yolo anchors computed by the k-means script are on the resized image scale. The default resize method is the letterbox resize, i.e., it keeps the original aspect ratio in the resized image.

#### 3.2 Training

Use `train.py`. The hyper-parameters and the corresponding annotations can be found in `args.py`:

```shell
CUDA_VISIBLE_DEVICES=GPU_ID python train.py
```

Check `args.py` for more details. You should set the parameters yourself for your own specific task.

Our training environment was:

- Ubuntu 16.04
- NVIDIA Tesla P100

### 4. Testing

You can test by running these commands:

Single image test:

```shell
python test_single_image.py ./data/demo_data/test.jpg
```

Video test:

```shell
python video_test.py ./data/demo_data/test.mp4
```

# Installing dependencies on Jetson Nano

Run the following scripts after setting up the Jetson Nano; they install the required dependencies.

### 1. Uninstall unused applications to save space (Optional)
```
sudo apt remove libreoffice* thunderbird shotwell rhythmbox cheese
sudo apt autoremove
```
### 2. Update the system
```
sudo apt-get update
```
Install dependencies
```
sudo apt-get install -y \
    python3-pip \
    build-essential \
    git \
    python3 \
    python3-dev \
    ffmpeg \
    libsdl2-dev \
    libsdl2-image-dev \
    libsdl2-mixer-dev \
    libsdl2-ttf-dev \
    libportmidi-dev \
    libswscale-dev \
    libavformat-dev \
    libavcodec-dev \
    zlib1g-dev
```
### 3. Install the correct Cython version
```
python3 -m pip install Cython==0.29.10
sudo apt-get update
```
### 4. Prerequisites for TensorFlow
```
sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
```
### 5. TensorFlow
```
python3 -m pip install --upgrade --user pip setuptools virtualenv

sudo pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v44 'tensorflow<2'

sudo pip3 install numba
sudo pip3 install scikit-learn
sudo apt-get install python3-matplotlib
sudo pip3 install filterpy
```
--------------------------------------------------------------------------------
/Test_demo/.gitignore:
--------------------------------------------------------------------------------
/checkpoint/checkpoint
/checkpoint/*.data-00000-of-00001
/checkpoint/*.index
/checkpoint/*.meta
/data/darknet_weights/*.data-00000-of-00001
/data/darknet_weights/*.meta
/data/darknet_weights/*.weights

/data/logs/*.ubuntu
*.xml
/data/myData/ImageSets/Main/*.txt
/data/myData/JPEGImages/*.jpg
/data/myData/label/*.txt

/data/test_image/*.jpg
/data/test_video/*
/data/*.log
/test_res/*.jpg
/执行步骤.txt

--------------------------------------------------------------------------------
/Test_demo/README.md:
--------------------------------------------------------------------------------
## Test demo
### Download the checkpoint from Google Drive
### Save the checkpoint files into the checkpoint directory
### Modify 'restore_path' in video_test.py, test_single_image.py and camera_test.py (it is defined around line 31; see the example below)
## Run
### python video_test.py own.mp4
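For example, with the checkpoint files from the main README placed under `./checkpoint/`, the edited default would look like the sketch below (the checkpoint prefix is an assumption based on the main README's file names; TensorFlow restores from the prefix, without the `.data`/`.index`/`.meta` suffix):

```python
parser.add_argument("--restore_path", type=str,
                    default="./checkpoint/best_model_Epoch_75_step_29487_mAP_0.8609_loss_5.4903_lr_1e-05",
                    help="The path of the weights to restore.")
```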
--------------------------------------------------------------------------------
/Test_demo/args.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This file contains the parameters used in train.py
3 | 
4 | from __future__ import division, print_function
5 | 
6 | from utils.misc_utils import parse_anchors, read_class_names
7 | import math
8 | 
9 | ### Some paths
10 | train_file = './data/my_data/label/train.txt'  # The path of the training txt file.
11 | val_file = './data/my_data/label/val.txt'  # The path of the validation txt file.
12 | restore_path = './data/darknet_weights/yolov3.ckpt'  # The path of the weights to restore.
13 | save_dir = './checkpoint/'  # The directory of the weights to save.
14 | log_dir = './data/logs/'  # The directory to store the tensorboard log files.
15 | progress_log_path = './data/progress.log'  # The path to record the training progress.
16 | anchor_path = './data/yolo_anchors.txt'  # The path of the anchor txt file.
17 | class_name_path = './data/coco.names'  # The path of the class names.
18 | 
19 | ### Training related numbers
20 | batch_size = 12  #6 1.12 2.24
21 | img_size = [416, 416]  # Images will be resized to `img_size` and fed to the network, size format: [width, height]
22 | letterbox_resize = True  # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
23 | total_epoches = 200  #500 1.100 2. 200
24 | train_evaluation_step = 50  #100 # Evaluate on the training batch after some steps.
25 | val_evaluation_epoch = 50  #50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
26 | save_epoch = 10  # Save the model after some epochs.
27 | batch_norm_decay = 0.99  # decay in bn ops
28 | weight_decay = 5e-4  # l2 weight decay
29 | global_step = 0  # used when resuming training
30 | 
31 | ### tf.data parameters
32 | num_threads = 10  # Number of threads for image processing used in tf.data pipeline.
33 | prefetech_buffer = 5  # Prefetch buffer used in tf.data pipeline.
34 | 
35 | ### Learning rate and optimizer
36 | optimizer_name = 'momentum'  # Chosen from [sgd, momentum, adam, rmsprop]
37 | save_optimizer = True  # Whether to save the optimizer parameters into the checkpoint file.
38 | learning_rate_init = 1e-4
39 | lr_type = 'piecewise'  # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
40 | lr_decay_epoch = 5  # Epochs after which the learning rate decays. Int or float. Used with the `exponential` and `cosine_decay_restart` lr_types.
41 | lr_decay_factor = 0.96  # The learning rate decay factor. Used with the `exponential` lr_type.
42 | lr_lower_bound = 1e-7  # The minimum learning rate.
43 | # only used in the piecewise lr type
44 | pw_boundaries = [30, 50]  # epoch-based boundaries
45 | pw_values = [learning_rate_init, 3e-5, 1e-5]
46 | 
47 | ### Load and finetune
48 | # Choose the parts whose weights you want to restore. List form.
49 | # restore_include: None, restore_exclude: None => restore the whole model
50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope`
51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 without scope2 (scope1 - scope2)
52 | # choice 1: only restore the darknet body
53 | # restore_include = ['yolov3/darknet53_body']
54 | # restore_exclude = None
55 | # choice 2: restore all layers except the last 3 conv2d layers of the 3 scales
56 | restore_include = None
57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
58 | # Choose the parts you want to finetune. List form.
59 | # Set to None to train the whole model.
60 | update_part = ['yolov3/yolov3_head']
61 | 
62 | ### other training strategies
63 | multi_scale_train = True  # Whether to apply the multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
64 | use_label_smooth = True  # Whether to use the class label smoothing strategy.
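# Note on use_label_smooth (above) and use_focal_loss (below), summarizing the
# standard definitions for reference: label smoothing softens the one-hot class
# target to
#     y_smooth = y_onehot * (1 - eps) + eps / class_num,
# while focal loss rescales the confidence cross-entropy by (1 - p_t) ** gamma
# so that training focuses on hard examples.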
65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 10 # 3 Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.1 #0.45 iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /Test_demo/camera_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | # parser.add_argument("input_video", type=str, 22 | # help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default='./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05', 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | # vid = cv2.VideoCapture(args.input_video) 44 | vid = cv2.VideoCapture(1) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | 
video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 15 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('camera_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45) 65 | 66 | saver = tf.train.Saver() 67 | saver.restore(sess, args.restore_path) 68 | 69 | # for i in range(video_frame_cnt): 70 | while True: 71 | ret, img_ori = vid.read() 72 | if args.letterbox_resize: 73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1]) 74 | else: 75 | height_ori, width_ori = img_ori.shape[:2] 76 | img = cv2.resize(img_ori, tuple(args.new_size)) 77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 78 | img = np.asarray(img, np.float32) 79 | img = img[np.newaxis, :] / 255. 80 | 81 | start_time = time.time() 82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 83 | end_time = time.time() 84 | 85 | # rescale the coordinates to the original image 86 | if args.letterbox_resize: 87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio 88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio 89 | else: 90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0])) 91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1])) 92 | 93 | 94 | for i in range(len(boxes_)): 95 | x0, y0, x1, y1 = boxes_[i] 96 | image_score = scores_[i] * 100 97 | if image_score >= 65: 98 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]]) 99 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0, 100 | fontScale=1, color=(0, 255, 0), thickness=2) 101 | cv2.imshow('image', img_ori) 102 | k = cv2.waitKey(1) 103 | if args.save_video: 104 | videoWriter.write(img_ori) 105 | if k & 0xFF == ord('q'): 106 | break 107 | 108 | vid.release() 109 | if args.save_video: 110 | videoWriter.release() 111 | -------------------------------------------------------------------------------- /Test_demo/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | 
feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /Test_demo/data/coco.names: -------------------------------------------------------------------------------- 1 | P 2 | PH 3 | PV 4 | PHV 5 | 6 | -------------------------------------------------------------------------------- /Test_demo/data/darknet_weights/readme.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/darknet_weights/readme.txt -------------------------------------------------------------------------------- /Test_demo/data/logs/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/logs/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/Annotations/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/Annotations/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/ImageSets/Main/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/ImageSets/Main/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/JPEGImages/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/JPEGImages/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/label/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/data/my_data/label/readme -------------------------------------------------------------------------------- /Test_demo/data/my_data/readme: -------------------------------------------------------------------------------- 1 | place your data files here. 
--------------------------------------------------------------------------------
/Test_demo/data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 26,55, 39,92, 54,132, 71,194, 100,160, 110,259, 174,347, 290,560, 579,957
--------------------------------------------------------------------------------
/Test_demo/data_pro.py:
--------------------------------------------------------------------------------
1 | 
2 | import os
3 | import pandas
4 | import shutil
5 | import random
6 | 
7 | 
8 | import cv2
9 | import numpy as np
10 | import xml.etree.ElementTree as ET
11 | 
12 | 
13 | # This part needs to be modified for your own dataset
14 | 
15 | 
16 | class Data_preprocess(object):
17 |     '''
18 |     Parse the XML annotation data
19 |     '''
20 |     def __init__(self,data_path):
21 |         self.data_path = data_path
22 |         self.image_size = 416
23 |         self.batch_size = 32
24 |         self.cell_size = 13
25 |         # TO DO
26 |         self.classes = ["P","PH","PV","PHV"]
27 |         self.num_classes = len(self.classes)
28 |         self.box_per_cell = 5
29 |         self.class_to_ind = dict(zip(self.classes, range(self.num_classes)))
30 | 
31 |         self.count = 0
32 |         self.epoch = 1
33 |         self.count_t = 0
34 | 
35 |     def load_labels(self, model):
36 |         if model == 'train':
37 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt')
38 |         if model == 'test':
39 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt')
40 | 
41 |         if model == "val":
42 |             txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt')
43 | 
44 | 
45 |         with open(txtname, 'r') as f:
46 |             image_ind = [x.strip() for x in f.readlines()]  # file names with the .jpg extension stripped
47 | 
48 | 
49 |         my_index = 0
50 |         for ind in image_ind:
51 |             class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind)
52 | 
53 |             if len(class_inds) == 0:
54 |                 pass
55 |             else:
56 |                 annotation_label = ""
57 |                 #box_x: label_index, x_min,y_min,x_max,y_max
58 |                 for label_i in range(len(class_inds)):
59 | 
60 |                     annotation_label += " " + str(class_inds[label_i])
61 |                     annotation_label += " " + str(x1s[label_i])
62 |                     annotation_label += " " + str(y1s[label_i])
63 |                     annotation_label += " " + str(x2s[label_i])
64 |                     annotation_label += " " + str(y2s[label_i])
65 | 
66 |                 with open("./data/my_data/label/"+model+".txt","a") as f:
67 |                     f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n")
68 | 
69 |                 my_index += 1
70 | 
71 |         print(my_index)
72 | 
73 | 
74 | 
75 |     def load_data(self, index):
76 |         label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes])
77 |         filename = os.path.join(self.data_path, 'Annotations', index + '.xml')
78 |         tree = ET.parse(filename)
79 |         image_size = tree.find('size')
80 |         image_width = int(float(image_size.find('width').text))
81 |         image_height = int(float(image_size.find('height').text))
82 |         # h_ratio = 1.0 * self.image_size / image_height
83 |         # w_ratio = 1.0 * self.image_size / image_width
84 | 
85 |         objects = tree.findall('object')
86 | 
87 |         class_inds = []
88 |         x1s = []
89 |         y1s = []
90 |         x2s = []
91 |         y2s = []
92 | 
93 |         for obj in objects:
94 |             box = obj.find('bndbox')
95 |             x1 = int(float(box.find('xmin').text))
96 |             y1 = int(float(box.find('ymin').text))
97 |             x2 = int(float(box.find('xmax').text))
98 |             y2 = int(float(box.find('ymax').text))
99 |             # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0)
100 |             # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0)
101 |             # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0)
102 |             # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0)
103 |             if obj.find('name').text in self.classes:
104 |                 class_ind = self.class_to_ind[obj.find('name').text]
105 |                 # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()]
106 | 
107 |                 # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)]
108 |                 # cx = 1.0 * boxes[0] * self.cell_size
109 |                 # cy = 1.0 * boxes[1] * self.cell_size
110 |                 # xind = int(np.floor(cx))
111 |                 # yind = int(np.floor(cy))
112 | 
113 |                 # label[yind, xind, :, 0] = 1
114 |                 # label[yind, xind, :, 1:5] = boxes
115 |                 # label[yind, xind, :, 5 + class_ind] = 1
116 | 
117 |                 if x1 >= x2 or y1 >= y2:
118 |                     pass
119 |                 else:
120 |                     class_inds.append(class_ind)
121 |                     x1s.append(x1)
122 |                     y1s.append(y1)
123 |                     x2s.append(x2)
124 |                     y2s.append(y2)
125 | 
126 |         return class_inds, x1s, y1s, x2s, y2s, image_width, image_height
127 | 
128 | 
129 | def data_split(img_path):
130 |     '''
131 |     Split the data into train / val / test
132 |     '''
133 | 
134 |     files = os.listdir(img_path)
135 |     # To do
136 |     test_part = random.sample(files,int(399*0.2))  # NOTE: 399 is the hard-coded image count; update it for your dataset
137 | 
138 |     val_part = random.sample(test_part,int(int(399*0.2)*0.5))
139 | 
140 |     val_index = 0
141 |     test_index = 0
142 |     train_index = 0
143 |     for file in files:
144 |         if file in val_part:
145 | 
146 |             with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f:
147 |                 val_f.write(file[:-4] + "\n" )
148 | 
149 |             val_index += 1
150 | 
151 |         elif file in test_part:
152 |             with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f:
153 |                 test_f.write(file[:-4] + "\n")
154 | 
155 |             test_index += 1
156 | 
157 |         else:
158 |             with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f:
159 |                 train_f.write(file[:-4] + "\n")
160 | 
161 |             train_index += 1
162 | 
163 | 
164 |     print(train_index,test_index,val_index)
165 | 
166 | 
167 | # TO DO
168 | if __name__ == "__main__":
169 | 
170 |     # split into train, val and test
171 |     img_path = "./data/my_data/JPEGImages"
172 |     data_split(img_path)
173 |     print("===========split data finish============")
174 | 
175 |     # build the training annotation files needed by YOLO v3
176 |     base_path = os.getcwd()
177 |     data_path = os.path.join(base_path,"data/my_data")  # absolute path
178 | 
179 |     data_p = Data_preprocess(data_path)
180 |     data_p.load_labels("train")
181 |     data_p.load_labels("test")
182 |     data_p.load_labels("val")
183 |     print("==========data pro finish===========")
184 | 
185 | 
186 | 
187 | 
188 | 
189 | 
190 | 
191 | 
--------------------------------------------------------------------------------
/Test_demo/eval.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | from tqdm import trange
9 | 
10 | from utils.data_utils import get_batch_data
11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter
12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
13 | from utils.nms_utils import gpu_nms
14 | 
15 | from model import yolov3
16 | 
17 | #################
18 | # ArgumentParser
19 | #################
20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.")
21 | # some paths
22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt",
23 |                     help="The path of the validation or test txt file.")
24 | 
25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt",
26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = yolo_model.compute_loss(pred_feature_maps, y_true) 100 | y_pred = yolo_model.predict(pred_feature_maps) 101 | 102 | saver_to_restore = tf.train.Saver() 103 | 104 | with 
tf.Session() as sess: 105 | sess.run([tf.global_variables_initializer()]) 106 | saver_to_restore.restore(sess, args.restore_path) 107 | 108 | print('\n----------- start to eval -----------\n') 109 | 110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \ 111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter() 112 | val_preds = [] 113 | 114 | for j in trange(args.img_cnt): 115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False}) 116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred) 117 | 118 | val_preds.extend(pred_content) 119 | val_loss_total.update(__loss[0]) 120 | val_loss_xy.update(__loss[1]) 121 | val_loss_wh.update(__loss[2]) 122 | val_loss_conf.update(__loss[3]) 123 | val_loss_class.update(__loss[4]) 124 | 125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter() 126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize) 127 | print('mAP eval:') 128 | for ii in range(args.class_num): 129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric) 130 | rec_total.update(rec, npos) 131 | prec_total.update(prec, nd) 132 | ap_total.update(ap, 1) 133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap)) 134 | 135 | mAP = ap_total.average 136 | print('final mAP: {:.4f}'.format(mAP)) 137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average)) 138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format( 139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average 140 | )) 141 | -------------------------------------------------------------------------------- /Test_demo/get_kmeans.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes 3 | 4 | from __future__ import division, print_function 5 | 6 | import numpy as np 7 | 8 | def iou(box, clusters): 9 | """ 10 | Calculates the Intersection over Union (IoU) between a box and k clusters. 11 | param: 12 | box: tuple or array, shifted to the origin (i. e. width and height) 13 | clusters: numpy array of shape (k, 2) where k is the number of clusters 14 | return: 15 | numpy array of shape (k, 0) where k is the number of clusters 16 | """ 17 | x = np.minimum(clusters[:, 0], box[0]) 18 | y = np.minimum(clusters[:, 1], box[1]) 19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0: 20 | raise ValueError("Box has no area") 21 | 22 | intersection = x * y 23 | box_area = box[0] * box[1] 24 | cluster_area = clusters[:, 0] * clusters[:, 1] 25 | 26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10) 27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10) 28 | 29 | return iou_ 30 | 31 | 32 | def avg_iou(boxes, clusters): 33 | """ 34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters. 
35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | print(line) 102 | img_w = int(float(s[2])) 103 | img_h = int(float(s[3])) 104 | s = s[4:] 105 | box_cnt = len(s) // 5 106 | for i in range(box_cnt): 107 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 108 | width = x_max - x_min 109 | height = y_max - y_min 110 | assert width > 0 111 | assert height > 0 112 | # use letterbox resize, i.e. 
keep the original aspect ratio
113 |             # get k-means anchors on the resized target image size
114 |             if target_size is not None:
115 |                 resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h)
116 |                 width *= resize_ratio
117 |                 height *= resize_ratio
118 |                 result.append([width, height])
119 |             # get k-means anchors on the original image size
120 |             else:
121 |                 result.append([width, height])
122 |     result = np.asarray(result)
123 |     return result
124 | 
125 | 
126 | def get_kmeans(anno, cluster_num=9):
127 | 
128 |     anchors = kmeans(anno, cluster_num)
129 |     ave_iou = avg_iou(anno, anchors)
130 | 
131 |     anchors = anchors.astype('int').tolist()
132 | 
133 |     anchors = sorted(anchors, key=lambda x: x[0] * x[1])
134 | 
135 |     return anchors, ave_iou
136 | 
137 | 
138 | if __name__ == '__main__':
139 |     # target resize format: [width, height]
140 |     # if target_resize is specified, the anchors are on the resized image scale
141 |     # if target_resize is set to None, the anchors are on the original image scale
142 |     # target_size = [416, 416]
143 |     target_size = None
144 |     annotation_path = "./data/my_data/label/train.txt"
145 |     anno_result = parse_anno(annotation_path, target_size=target_size)
146 |     anchors, ave_iou = get_kmeans(anno_result, 9)
147 | 
148 |     anchor_string = ''
149 |     for anchor in anchors:
150 |         anchor_string += '{},{}, '.format(anchor[0], anchor[1])
151 |     anchor_string = anchor_string[:-2]
152 | 
153 |     print('anchors are:')
154 |     print(anchor_string)
155 |     print('the average iou is:')
156 |     print(ave_iou)
157 | 
158 | 
--------------------------------------------------------------------------------
/Test_demo/test_single_image.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 | 
10 | from utils.misc_utils import parse_anchors, read_class_names
11 | from utils.nms_utils import gpu_nms
12 | from utils.plot_utils import get_color_table, plot_one_box
13 | 
14 | from model import yolov3
15 | 
16 | 
17 | 
18 | parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.")
19 | parser.add_argument("input_image", type=str,
20 |                     help="The path of the input image.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 |                     help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 |                     help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
26 |                     help="The path of the class names.")
27 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05",
28 |                     help="The path of the weights to restore.")
29 | args = parser.parse_args()
30 | 
31 | args.anchors = parse_anchors(args.anchor_path)
32 | args.classes = read_class_names(args.class_name_path)
33 | args.num_class = len(args.classes)
34 | 
35 | color_table = get_color_table(args.num_class)
36 | 
37 | img_ori = cv2.imread(args.input_image)
38 | height_ori, width_ori = img_ori.shape[:2]
39 | img = cv2.resize(img_ori, tuple(args.new_size))
40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
41 | img = np.asarray(img, np.float32)
42 | img = img[np.newaxis, :] / 255.
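# At this point `img` is an RGB float32 array scaled to [0, 1] with a batch
# axis added, i.e. shape [1, new_size[1], new_size[0], 3], matching the
# `input_data` placeholder below. Note this script uses a plain cv2.resize,
# not the letterbox resize, so the boxes are rescaled per-axis afterwards.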
43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | # saver = tf.train.import_meta_graph('./checkpoint/best_model_Epoch_5_step_42_mAP_0.0735_loss_43.3285_lr_0.0001.meta') 58 | # saver.restore(sess, tf.train.latest_checkpoint("./checkpoint/")) 59 | 60 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 61 | 62 | # rescale the coordinates to the original image 63 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 64 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 65 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 66 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 67 | 68 | print("box coords:") 69 | print(boxes_) 70 | print('*' * 30) 71 | print("scores:") 72 | print(scores_) 73 | print('*' * 30) 74 | print("labels:") 75 | print(labels_) 76 | 77 | for i in range(len(boxes_)): 78 | x0, y0, x1, y1 = boxes_[i] 79 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 80 | cv2.imshow('Detection result', img_ori) 81 | cv2.imwrite('detection_result.jpg', img_ori) 82 | cv2.waitKey(0) 83 | -------------------------------------------------------------------------------- /Test_demo/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Test_demo/utils/__init__.py -------------------------------------------------------------------------------- /Test_demo/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 | return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in 
range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /Test_demo/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 
76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif args.lr_type == 'cosine_decay_restart': 139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step, 140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0, 141 | name='cosine_decay_learning_rate_restart') 142 | elif args.lr_type == 'fixed': 143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate') 144 | elif args.lr_type == 'piecewise': 145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values, 146 | name='piecewise_learning_rate') 147 | else: 148 | raise ValueError('Unsupported learning rate type!') 149 | 150 | 151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9): 152 | if optimizer_name == 'momentum': 153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum) 154 | elif optimizer_name == 'rmsprop': 155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum) 156 | elif optimizer_name == 'adam': 157 | return tf.train.AdamOptimizer(learning_rate) 158 | elif optimizer_name == 'sgd': 159 | return tf.train.GradientDescentOptimizer(learning_rate) 
160 | else: 161 | raise ValueError('Unsupported optimizer type!') -------------------------------------------------------------------------------- /Test_demo/utils/nms_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | 8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5): 9 | """ 10 | Perform NMS on GPU using TensorFlow. 11 | 12 | params: 13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image 14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob 15 | num_classes: total number of classes 16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50 17 | score_thresh: if [ highest class probability score < score_threshold] 18 | then get rid of the corresponding box 19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering 20 | """ 21 | 22 | boxes_list, label_list, score_list = [], [], [] 23 | max_boxes = tf.constant(max_boxes, dtype='int32') 24 | 25 | # since we do nms for single image, then reshape it 26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't konw the exact number of boxes 27 | score = tf.reshape(scores, [-1, num_classes]) 28 | 29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold". 30 | mask = tf.greater_equal(score, tf.constant(score_thresh)) 31 | # Step 2: Do non_max_suppression for each class 32 | for i in range(num_classes): 33 | # Step 3: Apply the mask to scores, boxes and pick them out 34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i]) 35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i]) 36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes, 37 | scores=filter_score, 38 | max_output_size=max_boxes, 39 | iou_threshold=nms_thresh, name='nms_indices') 40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i) 41 | boxes_list.append(tf.gather(filter_boxes, nms_indices)) 42 | score_list.append(tf.gather(filter_score, nms_indices)) 43 | 44 | boxes = tf.concat(boxes_list, axis=0) 45 | score = tf.concat(score_list, axis=0) 46 | label = tf.concat(label_list, axis=0) 47 | 48 | return boxes, score, label 49 | 50 | 51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5): 52 | """ 53 | Pure Python NMS baseline. 
54 | 55 | Arguments: boxes: shape of [-1, 4], the value of '-1' means that dont know the 56 | exact number of boxes 57 | scores: shape of [-1,] 58 | max_boxes: representing the maximum of boxes to be selected by non_max_suppression 59 | iou_thresh: representing iou_threshold for deciding to keep boxes 60 | """ 61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1 62 | 63 | x1 = boxes[:, 0] 64 | y1 = boxes[:, 1] 65 | x2 = boxes[:, 2] 66 | y2 = boxes[:, 3] 67 | 68 | areas = (x2 - x1) * (y2 - y1) 69 | order = scores.argsort()[::-1] 70 | 71 | keep = [] 72 | while order.size > 0: 73 | i = order[0] 74 | keep.append(i) 75 | xx1 = np.maximum(x1[i], x1[order[1:]]) 76 | yy1 = np.maximum(y1[i], y1[order[1:]]) 77 | xx2 = np.minimum(x2[i], x2[order[1:]]) 78 | yy2 = np.minimum(y2[i], y2[order[1:]]) 79 | 80 | w = np.maximum(0.0, xx2 - xx1 + 1) 81 | h = np.maximum(0.0, yy2 - yy1 + 1) 82 | inter = w * h 83 | ovr = inter / (areas[i] + areas[order[1:]] - inter) 84 | 85 | inds = np.where(ovr <= iou_thresh)[0] 86 | order = order[inds + 1] 87 | 88 | return keep[:max_boxes] 89 | 90 | 91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5): 92 | """ 93 | Perform NMS on CPU. 94 | Arguments: 95 | boxes: shape [1, 10647, 4] 96 | scores: shape [1, 10647, num_classes] 97 | """ 98 | 99 | boxes = boxes.reshape(-1, 4) 100 | scores = scores.reshape(-1, num_classes) 101 | # Picked bounding boxes 102 | picked_boxes, picked_score, picked_label = [], [], [] 103 | 104 | for i in range(num_classes): 105 | indices = np.where(scores[:,i] >= score_thresh) 106 | filter_boxes = boxes[indices] 107 | filter_scores = scores[:,i][indices] 108 | if len(filter_boxes) == 0: 109 | continue 110 | # do non_max_suppression on the cpu 111 | indices = py_nms(filter_boxes, filter_scores, 112 | max_boxes=max_boxes, iou_thresh=iou_thresh) 113 | picked_boxes.append(filter_boxes[indices]) 114 | picked_score.append(filter_scores[indices]) 115 | picked_label.append(np.ones(len(indices), dtype='int32')*i) 116 | if len(picked_boxes) == 0: 117 | return None, None, None 118 | 119 | boxes = np.concatenate(picked_boxes, axis=0) 120 | score = np.concatenate(picked_score, axis=0) 121 | label = np.concatenate(picked_label, axis=0) 122 | 123 | return boxes, score, label -------------------------------------------------------------------------------- /Test_demo/utils/plot_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import cv2 6 | import random 7 | 8 | 9 | def get_color_table(class_num, seed=2): 10 | random.seed(seed) 11 | color_table = {} 12 | for i in range(class_num): 13 | color_table[i] = [random.randint(0, 255) for _ in range(3)] 14 | return color_table 15 | 16 | 17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None): 18 | ''' 19 | coord: [x_min, y_min, x_max, y_max] format coordinates. 20 | img: img to plot on. 21 | label: str. The label name. 22 | color: int. color index. 23 | line_thickness: int. rectangle line thickness. 
24 | ''' 25 | tl = line_thickness or int(round(0.002 * max(img.shape[0:2]))) # line thickness 26 | color = color or [random.randint(0, 255) for _ in range(3)] 27 | c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3])) 28 | cv2.rectangle(img, c1, c2, color, thickness=tl) 29 | if label: 30 | tf = max(tl - 1, 1) # font thickness 31 | t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0] 32 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 33 | cv2.rectangle(img, c1, c2, color, -1) # filled 34 | cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA) 35 | 36 | -------------------------------------------------------------------------------- /Test_demo/video_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | parser.add_argument("input_video", type=str, 22 | help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_100_step_37268_loss_0.8836_lr_1e-05", 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | vid = cv2.VideoCapture(args.input_video) 44 | # vid = cv2.VideoCapture(0) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 10 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, 
scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45) 65 | 66 | saver = tf.train.Saver() 67 | saver.restore(sess, args.restore_path) 68 | 69 | for i in range(video_frame_cnt): 70 | # while True: 71 | ret, img_ori = vid.read() 72 | if args.letterbox_resize: 73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1]) 74 | else: 75 | height_ori, width_ori = img_ori.shape[:2] 76 | img = cv2.resize(img_ori, tuple(args.new_size)) 77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 78 | img = np.asarray(img, np.float32) 79 | img = img[np.newaxis, :] / 255. 80 | 81 | start_time = time.time() 82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 83 | end_time = time.time() 84 | 85 | # rescale the coordinates to the original image 86 | if args.letterbox_resize: 87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio 88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio 89 | else: 90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0])) 91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1])) 92 | 93 | for i in range(len(boxes_)): 94 | x0, y0, x1, y1 = boxes_[i] 95 | ######### 96 | 97 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]]) 98 | 99 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0, 100 | fontScale=1, color=(0, 255, 0), thickness=2) 101 | cv2.imshow('image', img_ori) 102 | k = cv2.waitKey(1) 103 | if args.save_video: 104 | videoWriter.write(img_ori) 105 | if k & 0xFF == ord('q'): 106 | break 107 | 108 | vid.release() 109 | if args.save_video: 110 | videoWriter.release() 111 | -------------------------------------------------------------------------------- /Train/.gitignore: -------------------------------------------------------------------------------- 1 | /checkpoint/checkpoint 2 | /checkpoint/*.data-00000-of-00001 3 | /checkpoint/*.index 4 | /checkpoint/*.meta 5 | /data/darknet_weights/*.data-00000-of-00001 6 | /data/darknet_weights/*.meta 7 | /data/darknet_weights/*.weights 8 | 9 | /data/logs/*.ubuntu 10 | *.xml 11 | /data/myData/ImageSets/Main/*.txt 12 | /data/myData/JPEGImages/*.jpg 13 | /data/myData/label/*.txt 14 | 15 | /data/test_image/*.jpg 16 | /data/test_video/* 17 | /data/*.log 18 | /test_res/*.jpg 19 | /执行步骤.txt 20 | 21 | -------------------------------------------------------------------------------- /Train/README.md: -------------------------------------------------------------------------------- 1 | ### 1.Download darknet_weights 2 | 3 | [GitHub Release](https://github.com/DataXujing/YOLO-V3-Tensorflow/releases/tag/1.0) 4 | put it into `./data/darknet_weights/` 5 | 6 | 7 | ### 2.Create data structure 8 | 9 | 10 | (1) annotation files 11 | 12 | put annotation files under ./data/my_data/Annotations directory 13 | put image file under ./data/my_data/JPEGImages directory 14 | 15 | run 16 | 17 | ```shell 18 | python data_pro.py 19 | ``` 20 | Generate train.txt/val.txt/test.txt files under ./data/my_data/label directory. 21 | one row represents one image as `image_index`,`image_absolute_path`, `img_width`, `img_height`,`box_1`,`box_2`,...,`box_n` 22 | 23 | 24 | Example: 25 | 26 | ``` 27 | 0 xxx/xxx/a.jpg 1920,1080,0 453 369 473 391 1 588 245 608 268 28 | 1 xxx/xxx/b.jpg 1920,1080,1 466 403 485 422 2 793 300 809 320 29 | ... 
30 | ``` 31 | 32 | 33 | (2) class_names file: 34 | 35 | Generate the class names file (`coco.names`) under the `./data/` directory. Each line represents a class name. 36 | 37 | ``` 38 | P 39 | PH 40 | ``` 41 | 42 | (3) prior anchor file: 43 | 44 | Use the kmeans algorithm to get the prior anchors: 45 | 46 | ``` 47 | python get_kmeans.py 48 | ``` 49 | Then you will get 9 anchors and the average IoU. Save the anchors to a txt file. 50 | 51 | The COCO dataset anchors offered by YOLO's author are placed at ./data/yolo_anchors.txt; you can use those too. 52 | 53 | The yolo anchors computed by the kmeans script are on the resized image scale. The default resize method is the letterbox resize, i.e., keep the original aspect ratio in the resized image. 54 | 55 | 56 | ### 4.Train 57 | 58 | Modify the parameters in `args.py` as listed below: 59 | 60 |
 61 | 62 | ```
 63 | ### Some paths
 64 | train_file = './data/my_data/label/train.txt'  # The path of the training txt file.
 65 | val_file = './data/my_data/label/val.txt'  # The path of the validation txt file.
 66 | restore_path = './data/darknet_weights/yolov3.ckpt'  # The path of the weights to restore.
 67 | save_dir = './checkpoint/'  # The directory of the weights to save.
 68 | log_dir = './data/logs/'  # The directory to store the tensorboard log files.
 69 | progress_log_path = './data/progress.log'  # The path to record the training progress.
 70 | anchor_path = './data/yolo_anchors.txt'  # The path of the anchor txt file.
 71 | class_name_path = './data/coco.names'  # The path of the class names.
 72 | ### Training related numbers
 73 | batch_size = 32  #6
 74 | img_size = [416, 416]  # Images will be resized to `img_size` and fed to the network, size format: [width, height]
 75 | letterbox_resize = True  # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
 76 | total_epoches = 500
 77 | train_evaluation_step = 100  # Evaluate on the current training batch every this many steps.
 78 | val_evaluation_epoch = 50  # Evaluate on the whole validation dataset every this many epochs. Set to None to evaluate every epoch.
 79 | save_epoch = 10  # Save a checkpoint every this many epochs.
 80 | batch_norm_decay = 0.99  # decay in bn ops
 81 | weight_decay = 5e-4  # l2 weight decay
 82 | global_step = 0  # used when resuming training
 83 | ### tf.data parameters
 84 | num_threads = 10  # Number of threads for image processing used in tf.data pipeline.
 85 | prefetech_buffer = 5  # Prefetch buffer size used in the tf.data pipeline.
 86 | ### Learning rate and optimizer
 87 | optimizer_name = 'momentum'  # Chosen from [sgd, momentum, adam, rmsprop]
 88 | save_optimizer = True  # Whether to save the optimizer parameters into the checkpoint file.
 89 | learning_rate_init = 1e-4
 90 | lr_type = 'piecewise'  # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
 91 | lr_decay_epoch = 5  # Epochs after which the learning rate decays. Int or float. Used with the `exponential` and `cosine_decay_restart` lr types.
 92 | lr_decay_factor = 0.96  # The learning rate decay factor. Used with the `exponential` lr type.
 93 | lr_lower_bound = 1e-6  # The minimum learning rate.
 94 | # only used in piecewise lr type
 95 | pw_boundaries = [30, 50]  # epoch based boundaries
 96 | pw_values = [learning_rate_init, 3e-5, 1e-5]
 97 | ### Load and finetune
 98 | # Choose the parts you want to restore the weights. List form.
 99 | # restore_include: None, restore_exclude: None  => restore the whole model
100 | # restore_include: None, restore_exclude: scope  => restore the whole model except `scope`
 101 | # restore_include: scope1, restore_exclude: scope2  => if scope1 contains scope2, restore scope1 except scope2 (i.e. scope1 - scope2)
 102 | # choice 1: only restore the darknet body
103 | # restore_include = ['yolov3/darknet53_body']
104 | # restore_exclude = None
 105 | # choice 2: restore all layers except the last conv2d layer of each of the 3 detection scales
106 | restore_include = None
107 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
108 | # Choose the parts you want to finetune. List form.
109 | # Set to None to train the whole model.
110 | update_part = ['yolov3/yolov3_head']
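# Note on the three settings above: everything except the three excluded head
# convs is restored from the COCO-pretrained checkpoint; those convs are
# re-initialized because their output channel count depends on class_num
# (5 PPE classes here vs. 80 for COCO), and only variables under
# 'yolov3/yolov3_head' receive gradient updates during finetuning.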
111 | ### other training strategies
112 | multi_scale_train = True  # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
 113 | use_label_smooth = True  # Whether to use the class label smoothing strategy.
114 | use_focal_loss = True  # Whether to apply focal loss on the conf loss.
115 | use_mix_up = True  # Whether to use mix up data augmentation strategy. 
 116 | use_warm_up = True  # Whether to use the warm-up strategy to prevent gradient explosion.
 117 | warm_up_epoch = 3  # Number of warm-up training epochs. Set to a larger value if the gradient explodes.
118 | ### some constants in validation
119 | # nms
 120 | nms_threshold = 0.45  # IoU threshold in the NMS step.
 121 | score_threshold = 0.01  # Class score threshold in the NMS step, i.e. score = pred_confs * pred_probs. Set lower for higher recall.
 122 | nms_topk = 150  # Keep at most nms_topk outputs after NMS.
123 | # mAP eval
 124 | eval_threshold = 0.5  # The IoU threshold applied in mAP evaluation.
 125 | use_voc_07_metric = False  # Whether to use the VOC 2007 evaluation metric, i.e. the 11-point metric.
126 | ### parse some params
127 | anchors = parse_anchors(anchor_path)
128 | classes = read_class_names(class_name_path)
129 | class_num = len(classes)
130 | train_img_cnt = len(open(train_file, 'r').readlines())
131 | val_img_cnt = len(open(val_file, 'r').readlines())
132 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))
133 | lr_decay_freq = int(train_batch_num * lr_decay_epoch)
134 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
 135 | ``` 136 |
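As a concrete illustration of the last two lines of the listing: `tf.train.piecewise_constant` (used by `config_learning_rate` in `utils/misc_utils.py`) expects global-step boundaries and one more value than boundaries, so the epoch-based `pw_boundaries` are rescaled by the number of batches per epoch. A minimal sketch of that arithmetic in plain Python; the image count below is an assumed example, not a value taken from this repo:

```python
import math

# Assumed example values -- the real count is read from train.txt at runtime.
train_img_cnt = 1842   # hypothetical number of training images
batch_size = 32
global_step = 0        # non-zero only when resuming training

# Batches (steps) per epoch, as computed at the end of args.py.
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))  # 58

# Epoch-based boundaries -> global-step boundaries for the piecewise schedule.
pw_boundaries = [float(i) * train_batch_num + global_step for i in [30, 50]]
pw_values = [1e-4, 3e-5, 1e-5]  # len(pw_values) == len(pw_boundaries) + 1

print(pw_boundaries)   # [1740.0, 2900.0]
# The learning rate stays at 1e-4 until step 1740, drops to 3e-5 until
# step 2900, and is 1e-5 afterwards.
```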
137 | 138 | Run: 139 | 140 | 141 | ```shell 142 | python train.py 143 | ``` 144 | 145 | 146 | 147 | ### 5.Test 148 | 149 | 150 | 151 | ``` 152 | python3 test_single_image.py 000002.jpg 153 | ``` 154 | 155 | 156 | 157 | -------------------------------------------------------------------------------- /Train/args.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This file contains the parameter used in train.py 3 | 4 | from __future__ import division, print_function 5 | 6 | from utils.misc_utils import parse_anchors, read_class_names 7 | import math 8 | 9 | ### Some paths 10 | train_file = './data/my_data/label/train.txt' # The path of the training txt file. 11 | val_file = './data/my_data/label/val.txt' # The path of the validation txt file. 12 | restore_path = './data/darknet_weights/yolov3.ckpt' # The path of the weights to restore. 13 | save_dir = './checkpoint/' # The directory of the weights to save. 14 | log_dir = './data/logs/' # The directory to store the tensorboard log files. 15 | progress_log_path = './data/progress.log' # The path to record the training progress. 16 | anchor_path = './data/yolo_anchors.txt' # The path of the anchor txt file. 17 | class_name_path = './data/coco.names' # The path of the class names. 18 | 19 | ### Training releated numbers 20 | batch_size = 12 #6 21 | img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height] 22 | letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image. 23 | total_epoches = 100 #500 24 | train_evaluation_step = 50 #100 # Evaluate on the training batch after some steps. 25 | val_evaluation_epoch = 50 #50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch. 26 | save_epoch = 10 # Save the model after some epochs. 27 | batch_norm_decay = 0.99 # decay in bn ops 28 | weight_decay = 5e-4 # l2 weight decay 29 | global_step = 0 # used when resuming training 30 | 31 | ### tf.data parameters 32 | num_threads = 10 # Number of threads for image processing used in tf.data pipeline. 33 | prefetech_buffer = 5 # Prefetech_buffer used in tf.data pipeline. 34 | 35 | ### Learning rate and optimizer 36 | optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop] 37 | save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file. 38 | learning_rate_init = 1e-4 39 | lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise] 40 | lr_decay_epoch = 5 # Epochs after which learning rate decays. Int or float. Used when chosen `exponential` and `cosine_decay_restart` lr_type. 41 | lr_decay_factor = 0.96 # The learning rate decay factor. Used when chosen `exponential` lr_type. 42 | lr_lower_bound = 1e-6 # The minimum learning rate. 43 | # only used in piecewise lr type 44 | pw_boundaries = [30, 50] # epoch based boundaries 45 | pw_values = [learning_rate_init, 3e-5, 1e-5] 46 | 47 | ### Load and finetune 48 | # Choose the parts you want to restore the weights. List form. 
49 | # restore_include: None, restore_exclude: None => restore the whole model 50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope` 51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2) 52 | # choise 1: only restore the darknet body 53 | # restore_include = ['yolov3/darknet53_body'] 54 | # restore_exclude = None 55 | # choise 2: restore all layers except the last 3 conv2d layers in 3 scale 56 | restore_include = None 57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22'] 58 | # Choose the parts you want to finetune. List form. 59 | # Set to None to train the whole model. 60 | update_part = ['yolov3/yolov3_head'] 61 | 62 | ### other training strategies 63 | multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default. 64 | use_label_smooth = True # Whether to use class label smoothing strategy. 65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 3 # Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.45 # iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. 
the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /Train/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /Train/data/coco.names: -------------------------------------------------------------------------------- 1 | P 2 | PH 3 | PV 4 | PHV 5 | PLC -------------------------------------------------------------------------------- /Train/data/darknet_weights/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/darknet_weights/readme -------------------------------------------------------------------------------- /Train/data/my_data/Annotations/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/Annotations/readme -------------------------------------------------------------------------------- /Train/data/my_data/ImageSets/Main/test.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/test.txt -------------------------------------------------------------------------------- /Train/data/my_data/ImageSets/Main/train.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/train.txt -------------------------------------------------------------------------------- 
/Train/data/my_data/ImageSets/Main/val.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/ImageSets/Main/val.txt -------------------------------------------------------------------------------- /Train/data/my_data/JPEGImages/readme: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/JPEGImages/readme -------------------------------------------------------------------------------- /Train/data/my_data/label/test.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/test.txt -------------------------------------------------------------------------------- /Train/data/my_data/label/train.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/train.txt -------------------------------------------------------------------------------- /Train/data/my_data/label/val.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/data/my_data/label/val.txt -------------------------------------------------------------------------------- /Train/data/yolo_anchors.txt: -------------------------------------------------------------------------------- 1 | 27,58, 40,97, 56,123, 68,180, 97,152, 102,241, 163,299, 262,481, 587,710 -------------------------------------------------------------------------------- /Train/data_pro.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import pandas 4 | import shutil 5 | import random 6 | 7 | 8 | import cv2 9 | import numpy as np 10 | import xml.etree.ElementTree as ET 11 | 12 | 13 | 14 | 15 | 16 | class Data_preprocess(object): 17 | ''' 18 | 解析xml数据 19 | ''' 20 | def __init__(self,data_path): 21 | self.data_path = data_path 22 | self.image_size = 416 23 | self.batch_size = 32 24 | self.cell_size = 13 25 | # TO DO 26 | self.classes = ["P","PH","PV","PHV","PLC"] 27 | self.num_classes = len(self.classes) 28 | self.box_per_cell = 5 29 | self.class_to_ind = dict(zip(self.classes, range(self.num_classes))) 30 | 31 | self.count = 0 32 | self.epoch = 1 33 | self.count_t = 0 34 | 35 | def load_labels(self, model): 36 | if model == 'train': 37 | txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt') 38 | if model == 'test': 39 | txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt') 40 | 41 | if model == "val": 42 | txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt') 43 | 44 | 45 | with open(txtname, 'r') as f: 46 | image_ind = [x.strip() for x in f.readlines()] # 文件名去掉 .jpg 47 | 48 | 49 | my_index = 0 50 | for ind in image_ind: 51 | class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind) 52 | 53 | if len(class_inds) == 0: 54 | pass 55 | else: 56 | annotation_label = "" 57 | #box_x: label_index, x_min,y_min,x_max,y_max 58 | for label_i in range(len(class_inds)): 59 | 60 | annotation_label += " " + str(class_inds[label_i]) 61 | annotation_label += " " + 
str(x1s[label_i]) 62 | annotation_label += " " + str(y1s[label_i]) 63 | annotation_label += " " + str(x2s[label_i]) 64 | annotation_label += " " + str(y2s[label_i]) 65 | 66 | with open("./data/my_data/label/"+model+".txt","a") as f: 67 | f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n") 68 | 69 | my_index += 1 70 | 71 | print(my_index) 72 | 73 | 74 | 75 | def load_data(self, index): 76 | label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes]) 77 | filename = os.path.join(self.data_path, 'Annotations', index + '.xml') 78 | tree = ET.parse(filename) 79 | image_size = tree.find('size') 80 | image_width = int(float(image_size.find('width').text)) 81 | image_height = int(float(image_size.find('height').text)) 82 | # h_ratio = 1.0 * self.image_size / image_height 83 | # w_ratio = 1.0 * self.image_size / image_width 84 | 85 | objects = tree.findall('object') 86 | 87 | class_inds = [] 88 | x1s = [] 89 | y1s = [] 90 | x2s = [] 91 | y2s = [] 92 | 93 | for obj in objects: 94 | box = obj.find('bndbox') 95 | x1 = int(float(box.find('xmin').text)) 96 | y1 = int(float(box.find('ymin').text)) 97 | x2 = int(float(box.find('xmax').text)) 98 | y2 = int(float(box.find('ymax').text)) 99 | # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0) 100 | # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0) 101 | # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0) 102 | # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0) 103 | if obj.find('name').text in self.classes: 104 | class_ind = self.class_to_ind[obj.find('name').text] 105 | # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()] 106 | 107 | # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)] 108 | # cx = 1.0 * boxes[0] * self.cell_size 109 | # cy = 1.0 * boxes[1] * self.cell_size 110 | # xind = int(np.floor(cx)) 111 | # yind = int(np.floor(cy)) 112 | 113 | # label[yind, xind, :, 0] = 1 114 | # label[yind, xind, :, 1:5] = boxes 115 | # label[yind, xind, :, 5 + class_ind] = 1 116 | 117 | if x1 >= x2 or y1 >= y2: 118 | pass 119 | else: 120 | class_inds.append(class_ind) 121 | x1s.append(x1) 122 | y1s.append(y1) 123 | x2s.append(x2) 124 | y2s.append(y2) 125 | 126 | return class_inds, x1s, y1s, x2s, y2s, image_width, image_height 127 | 128 | 129 | def data_split(img_path): 130 | ''' 131 | 数据分割 132 | ''' 133 | 134 | files = os.listdir(img_path) 135 | # To do 136 | test_part = random.sample(files,int(2302*0.2)) 137 | 138 | val_part = random.sample(test_part,int(int(2302*0.2)*0.5)) 139 | 140 | val_index = 0 141 | test_index = 0 142 | train_index = 0 143 | for file in files: 144 | if file in val_part: 145 | 146 | with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f: 147 | val_f.write(file[:-4] + "\n" ) 148 | 149 | val_index += 1 150 | 151 | elif file in test_part: 152 | with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f: 153 | test_f.write(file[:-4] + "\n") 154 | 155 | test_index += 1 156 | 157 | else: 158 | with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f: 159 | train_f.write(file[:-4] + "\n") 160 | 161 | train_index += 1 162 | 163 | 164 | print(train_index,test_index,val_index) 165 | 166 | 167 | # TO DO 168 | if __name__ == "__main__": 169 | 170 | # split train, val, test 171 | img_path = 
"./data/my_data/JPEGImages" 172 | data_split(img_path) 173 | print("===========split data finish============") 174 | 175 | # create YOLO V3 datasets 176 | base_path = os.getcwd() 177 | data_path = os.path.join(base_path,"data/my_data") # absolute path 178 | 179 | data_p = Data_preprocess(data_path) 180 | data_p.load_labels("train") 181 | data_p.load_labels("test") 182 | data_p.load_labels("val") 183 | print("==========data pro finish===========") 184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | -------------------------------------------------------------------------------- /Train/eval.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | from tqdm import trange 9 | 10 | from utils.data_utils import get_batch_data 11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter 12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec 13 | from utils.nms_utils import gpu_nms 14 | 15 | from model import yolov3 16 | 17 | ################# 18 | # ArgumentParser 19 | ################# 20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.") 21 | # some paths 22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt", 23 | help="The path of the validation or test txt file.") 24 | 25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt", 26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], 
name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = yolo_model.compute_loss(pred_feature_maps, y_true) 100 | y_pred = yolo_model.predict(pred_feature_maps) 101 | 102 | saver_to_restore = tf.train.Saver() 103 | 104 | with tf.Session() as sess: 105 | sess.run([tf.global_variables_initializer()]) 106 | saver_to_restore.restore(sess, args.restore_path) 107 | 108 | print('\n----------- start to eval -----------\n') 109 | 110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \ 111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter() 112 | val_preds = [] 113 | 114 | for j in trange(args.img_cnt): 115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False}) 116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred) 117 | 118 | val_preds.extend(pred_content) 119 | val_loss_total.update(__loss[0]) 120 | val_loss_xy.update(__loss[1]) 121 | val_loss_wh.update(__loss[2]) 122 | val_loss_conf.update(__loss[3]) 123 | val_loss_class.update(__loss[4]) 124 | 125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter() 126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize) 127 | print('mAP eval:') 128 | for ii in range(args.class_num): 129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric) 130 | rec_total.update(rec, npos) 131 | prec_total.update(prec, nd) 132 | ap_total.update(ap, 1) 133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap)) 134 | 135 | mAP = ap_total.average 136 | print('final mAP: {:.4f}'.format(mAP)) 137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average)) 138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format( 139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average 140 | )) 141 | -------------------------------------------------------------------------------- /Train/get_kmeans.py: 
-------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes 3 | 4 | from __future__ import division, print_function 5 | 6 | import numpy as np 7 | 8 | def iou(box, clusters): 9 | """ 10 | Calculates the Intersection over Union (IoU) between a box and k clusters. 11 | param: 12 | box: tuple or array, shifted to the origin (i. e. width and height) 13 | clusters: numpy array of shape (k, 2) where k is the number of clusters 14 | return: 15 | numpy array of shape (k, 0) where k is the number of clusters 16 | """ 17 | x = np.minimum(clusters[:, 0], box[0]) 18 | y = np.minimum(clusters[:, 1], box[1]) 19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0: 20 | raise ValueError("Box has no area") 21 | 22 | intersection = x * y 23 | box_area = box[0] * box[1] 24 | cluster_area = clusters[:, 0] * clusters[:, 1] 25 | 26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10) 27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10) 28 | 29 | return iou_ 30 | 31 | 32 | def avg_iou(boxes, clusters): 33 | """ 34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters. 35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 
62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | img_w = int(float(s[2])) 102 | img_h = int(float(s[3])) 103 | s = s[4:] 104 | box_cnt = len(s) // 5 105 | for i in range(box_cnt): 106 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 107 | width = x_max - x_min 108 | height = y_max - y_min 109 | assert width > 0 110 | assert height > 0 111 | # use letterbox resize, i.e. keep the original aspect ratio 112 | # get k-means anchors on the resized target image size 113 | if target_size is not None: 114 | resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h) 115 | width *= resize_ratio 116 | height *= resize_ratio 117 | result.append([width, height]) 118 | # get k-means anchors on the original image size 119 | else: 120 | result.append([width, height]) 121 | result = np.asarray(result) 122 | return result 123 | 124 | 125 | def get_kmeans(anno, cluster_num=9): 126 | 127 | anchors = kmeans(anno, cluster_num) 128 | ave_iou = avg_iou(anno, anchors) 129 | 130 | anchors = anchors.astype('int').tolist() 131 | 132 | anchors = sorted(anchors, key=lambda x: x[0] * x[1]) 133 | 134 | return anchors, ave_iou 135 | 136 | 137 | if __name__ == '__main__': 138 | # target resize format: [width, height] 139 | # if target_resize is speficied, the anchors are on the resized image scale 140 | # if target_resize is set to None, the anchors are on the original image scale 141 | # target_size = [416, 416] 142 | target_size = None 143 | annotation_path = "./data/my_data/label/train.txt" 144 | anno_result = parse_anno(annotation_path, target_size=target_size) 145 | anchors, ave_iou = get_kmeans(anno_result, 9) 146 | 147 | anchor_string = '' 148 | for anchor in anchors: 149 | anchor_string += '{},{}, '.format(anchor[0], anchor[1]) 150 | anchor_string = anchor_string[:-2] 151 | 152 | print('anchors are:') 153 | print(anchor_string) 154 | print('the average iou is:') 155 | print(ave_iou) 156 | 157 | -------------------------------------------------------------------------------- /Train/test_single_image.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | 10 | from utils.misc_utils import parse_anchors, read_class_names 11 | from utils.nms_utils import gpu_nms 12 | from utils.plot_utils import get_color_table, plot_one_box 13 | 14 | from model import yolov3 15 | 16 | 
tf.compat.v1.train.Saver 17 | 18 | parser = argparse.ArgumentParser(description="YOLO-V3 test single image test procedure.") 19 | parser.add_argument("input_image", type=str, 20 | help="The path of the input image.") 21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 22 | help="The path of the anchor txt file.") 23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 24 | help="Resize the input image with `new_size`, size format: [width, height]") 25 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 26 | help="The path of the class names.") 27 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_90_step_14013_loss_1.4107_lr_1e-05", 28 | help="The path of the weights to restore.") 29 | args = parser.parse_args() 30 | 31 | args.anchors = parse_anchors(args.anchor_path) 32 | args.classes = read_class_names(args.class_name_path) 33 | args.num_class = len(args.classes) 34 | 35 | color_table = get_color_table(args.num_class) 36 | 37 | img_ori = cv2.imread(args.input_image) 38 | height_ori, width_ori = img_ori.shape[:2] 39 | img = cv2.resize(img_ori, tuple(args.new_size)) 40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 41 | img = np.asarray(img, np.float32) 42 | img = img[np.newaxis, :] / 255. 43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | # saver = tf.train.import_meta_graph('./checkpoint/best_model_Epoch_5_step_42_mAP_0.0735_loss_43.3285_lr_0.0001.meta') 58 | # saver.restore(sess, tf.train.latest_checkpoint("./checkpoint/")) 59 | 60 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 61 | 62 | # rescale the coordinates to the original image 63 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 64 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 65 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 66 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 67 | 68 | print("box coords:") 69 | print(boxes_) 70 | print('*' * 30) 71 | print("scores:") 72 | print(scores_) 73 | print('*' * 30) 74 | print("labels:") 75 | print(labels_) 76 | 77 | for i in range(len(boxes_)): 78 | x0, y0, x1, y1 = boxes_[i] 79 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 80 | cv2.imshow('Detection result', img_ori) 81 | cv2.imwrite('detection_result.jpg', img_ori) 82 | cv2.waitKey(0) 83 | -------------------------------------------------------------------------------- /Train/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/Train/utils/__init__.py -------------------------------------------------------------------------------- /Train/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | 
from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 | return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /Train/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 
34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif 
args.lr_type == 'cosine_decay_restart':
139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 | name='cosine_decay_learning_rate_restart')
142 | elif args.lr_type == 'fixed':
143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 | elif args.lr_type == 'piecewise':
145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 | name='piecewise_learning_rate')
147 | else:
148 | raise ValueError('Unsupported learning rate type!')
149 |
150 |
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 | if optimizer_name == 'momentum':
153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 | elif optimizer_name == 'rmsprop':
155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 | elif optimizer_name == 'adam':
157 | return tf.train.AdamOptimizer(learning_rate)
158 | elif optimizer_name == 'sgd':
159 | return tf.train.GradientDescentOptimizer(learning_rate)
160 | else:
161 | raise ValueError('Unsupported optimizer type!')
-------------------------------------------------------------------------------- /Train/utils/nms_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 |
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 | """
10 | Perform NMS on GPU using TensorFlow.
11 |
12 | params:
13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob
15 | num_classes: total number of classes
16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
17 | score_thresh: if the highest class probability score of a box is below this
18 | threshold, the corresponding box is discarded
19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering
20 | """
21 |
22 | boxes_list, label_list, score_list = [], [], []
23 | max_boxes = tf.constant(max_boxes, dtype='int32')
24 |
25 | # since we do NMS for a single image, reshape the tensors first
26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't know the exact number of boxes
27 | score = tf.reshape(scores, [-1, num_classes])
28 |
29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
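# (A tiny worked illustration, assuming score_thresh = 0.5 and two classes:
#  a score row of [0.2, 0.7] yields a mask row of [False, True], so that box
#  survives as a candidate for class 1 only.)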
30 | mask = tf.greater_equal(score, tf.constant(score_thresh))
31 | # Step 2: Do non_max_suppression for each class
32 | for i in range(num_classes):
33 | # Step 3: Apply the mask to scores, boxes and pick them out
34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 | scores=filter_score,
38 | max_output_size=max_boxes,
39 | iou_threshold=nms_thresh, name='nms_indices')
40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 | boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 | score_list.append(tf.gather(filter_score, nms_indices))
43 |
44 | boxes = tf.concat(boxes_list, axis=0)
45 | score = tf.concat(score_list, axis=0)
46 | label = tf.concat(label_list, axis=0)
47 |
48 | return boxes, score, label
49 |
50 |
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 | """
53 | Pure Python NMS baseline.
54 |
55 | Arguments: boxes: shape of [-1, 4], where '-1' means the exact number of
56 | boxes is unknown
57 | scores: shape of [-1,]
58 | max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 | iou_thresh: the IoU threshold for deciding whether to keep a box
60 | """
61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 |
63 | x1 = boxes[:, 0]
64 | y1 = boxes[:, 1]
65 | x2 = boxes[:, 2]
66 | y2 = boxes[:, 3]
67 |
68 | areas = (x2 - x1) * (y2 - y1)
69 | order = scores.argsort()[::-1]
70 |
71 | keep = []
72 | while order.size > 0:
73 | i = order[0]
74 | keep.append(i)
75 | xx1 = np.maximum(x1[i], x1[order[1:]])
76 | yy1 = np.maximum(y1[i], y1[order[1:]])
77 | xx2 = np.minimum(x2[i], x2[order[1:]])
78 | yy2 = np.minimum(y2[i], y2[order[1:]])
79 |
80 | w = np.maximum(0.0, xx2 - xx1 + 1)
81 | h = np.maximum(0.0, yy2 - yy1 + 1)
82 | inter = w * h
83 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 |
85 | inds = np.where(ovr <= iou_thresh)[0]
86 | order = order[inds + 1]
87 |
88 | return keep[:max_boxes]
89 |
90 |
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 | """
93 | Perform NMS on CPU.
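A minimal usage sketch (assuming boxes_np and scores_np were fetched from a
session run of the model outputs, with the shapes listed below):
boxes, score, label = cpu_nms(boxes_np, scores_np, num_classes=2, max_boxes=30, score_thresh=0.4)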
94 | Arguments:
95 | boxes: shape [1, 10647, 4]
96 | scores: shape [1, 10647, num_classes]
97 | """
98 |
99 | boxes = boxes.reshape(-1, 4)
100 | scores = scores.reshape(-1, num_classes)
101 | # Picked bounding boxes
102 | picked_boxes, picked_score, picked_label = [], [], []
103 |
104 | for i in range(num_classes):
105 | indices = np.where(scores[:,i] >= score_thresh)
106 | filter_boxes = boxes[indices]
107 | filter_scores = scores[:,i][indices]
108 | if len(filter_boxes) == 0:
109 | continue
110 | # do non_max_suppression on the cpu
111 | indices = py_nms(filter_boxes, filter_scores,
112 | max_boxes=max_boxes, iou_thresh=iou_thresh)
113 | picked_boxes.append(filter_boxes[indices])
114 | picked_score.append(filter_scores[indices])
115 | picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 | if len(picked_boxes) == 0:
117 | return None, None, None
118 |
119 | boxes = np.concatenate(picked_boxes, axis=0)
120 | score = np.concatenate(picked_score, axis=0)
121 | label = np.concatenate(picked_label, axis=0)
122 |
123 | return boxes, score, label
-------------------------------------------------------------------------------- /Train/utils/plot_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import cv2
6 | import random
7 |
8 |
9 | def get_color_table(class_num, seed=2):
10 | random.seed(seed)
11 | color_table = {}
12 | for i in range(class_num):
13 | color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 | return color_table
15 |
16 |
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 | '''
19 | coord: [x_min, y_min, x_max, y_max] format coordinates.
20 | img: the image to plot on.
21 | label: str. The label name.
22 | color: list of 3 ints, the BGR color to draw with (e.g. an entry from get_color_table).
23 | line_thickness: int. rectangle line thickness.
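e.g. plot_one_box(img, [48, 240, 195, 371], label='hat', color=(0, 255, 0))
draws a green box with a filled label tag above its top-left corner
(the coordinate values here are just hypothetical).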
24 | ''' 25 | tl = line_thickness or int(round(0.002 * max(img.shape[0:2]))) # line thickness 26 | color = color or [random.randint(0, 255) for _ in range(3)] 27 | c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3])) 28 | cv2.rectangle(img, c1, c2, color, thickness=tl) 29 | if label: 30 | tf = max(tl - 1, 1) # font thickness 31 | t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0] 32 | c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3 33 | cv2.rectangle(img, c1, c2, color, -1) # filled 34 | cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA) 35 | 36 | -------------------------------------------------------------------------------- /Train/video_test.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | import argparse 8 | import cv2 9 | import time 10 | 11 | from utils.misc_utils import parse_anchors, read_class_names 12 | from utils.nms_utils import gpu_nms 13 | from utils.plot_utils import get_color_table, plot_one_box 14 | from utils.data_aug import letterbox_resize 15 | 16 | from model import yolov3 17 | 18 | import warnings 19 | warnings.filterwarnings('ignore') 20 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.") 21 | parser.add_argument("input_video", type=str, 22 | help="The path of the input video.") 23 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 24 | help="The path of the anchor txt file.") 25 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416], 26 | help="Resize the input image with `new_size`, size format: [width, height]") 27 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True, 28 | help="Whether to use the letterbox resize.") 29 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 30 | help="The path of the class names.") 31 | parser.add_argument("--restore_path", type=str, default="./checkpoint/model-epoch_90_step_14013_loss_1.4107_lr_1e-05", 32 | help="The path of the weights to restore.") 33 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False, 34 | help="Whether to save the video detection results.") 35 | args = parser.parse_args() 36 | 37 | args.anchors = parse_anchors(args.anchor_path) 38 | args.classes = read_class_names(args.class_name_path) 39 | args.num_class = len(args.classes) 40 | 41 | color_table = get_color_table(args.num_class) 42 | 43 | # vid = cv2.VideoCapture(args.input_video) 44 | vid = cv2.VideoCapture(0) 45 | video_frame_cnt = int(vid.get(7)) 46 | video_width = int(vid.get(3)) 47 | video_height = int(vid.get(4)) 48 | # video_fps = int(vid.get(5)) 49 | video_fps = 10 50 | 51 | if args.save_video: 52 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 53 | videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height)) 54 | 55 | with tf.Session() as sess: 56 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 57 | yolo_model = yolov3(args.num_class, args.anchors) 58 | with tf.variable_scope('yolov3'): 59 | pred_feature_maps = yolo_model.forward(input_data, False) 60 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 61 | 62 | pred_scores = pred_confs * pred_probs 63 | 64 | boxes, scores, 
labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
65 |
66 | saver = tf.train.Saver()
67 | saver.restore(sess, args.restore_path)
68 |
69 | # for i in range(video_frame_cnt):
70 | while True:
71 | ret, img_ori = vid.read()
72 | if args.letterbox_resize:
73 | img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
74 | else:
75 | height_ori, width_ori = img_ori.shape[:2]
76 | img = cv2.resize(img_ori, tuple(args.new_size))
77 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
78 | img = np.asarray(img, np.float32)
79 | img = img[np.newaxis, :] / 255.
80 |
81 | start_time = time.time()
82 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
83 | end_time = time.time()
84 |
85 | # rescale the coordinates to the original image
86 | if args.letterbox_resize:
87 | boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
88 | boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
89 | else:
90 | boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
91 | boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
92 |
93 |
94 | for i in range(len(boxes_)):
95 | x0, y0, x1, y1 = boxes_[i]
96 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]])
97 | cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0,
98 | fontScale=1, color=(0, 255, 0), thickness=2)
99 | cv2.imshow('image', img_ori)
100 | k = cv2.waitKey(1)
101 | if args.save_video:
102 | videoWriter.write(img_ori)
103 | if k & 0xFF == ord('q'):
104 | break
105 |
106 | vid.release()
107 | if args.save_video:
108 | videoWriter.release()
109 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/.gitignore: --------------------------------------------------------------------------------
1 | /checkpoint/checkpoint
2 | /checkpoint/*.data-00000-of-00001
3 | /checkpoint/*.index
4 | /checkpoint/*.meta
5 | /data/darknet_weights/*.data-00000-of-00001
6 | /data/darknet_weights/*.meta
7 | /data/darknet_weights/*.weights
8 |
9 | /data/logs/*.ubuntu
10 | *.xml
11 | /data/myData/ImageSets/Main/*.txt
12 | /data/myData/JPEGImages/*.jpg
13 | /data/myData/label/*.txt
14 |
15 | /data/test_image/*.jpg
16 | /data/test_video/*
17 | /data/*.log
18 | /test_res/*.jpg
19 | /执行步骤.txt
20 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/README.md: --------------------------------------------------------------------------------
1 | ## Tensorflow YOLO V3 helmet detection
2 | Trained model download: [GitHub Release](https://github.com/DataXujing/YOLO-V3-Tensorflow/releases/tag/model) -- yolo3_hat.rar
3 |
4 | Put the three files into `./data/darknet_weights`.
5 |
6 | ### Test
7 | ```
8 | python3 test_single_image.py test.jpg
9 | ```
10 |
11 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/args.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This file contains the parameters used in train.py
3 |
4 | from __future__ import division, print_function
5 |
6 | from utils.misc_utils import parse_anchors, read_class_names
7 | import math
8 |
9 | ### Some paths
10 | train_file = './data/my_data/label/train.txt' # The path of the training txt file.
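# (These label files are generated by data_pro.py; each line holds
#  "<index> <image_path> <img_w> <img_h>" followed by one
#  "<label_index> <x_min> <y_min> <x_max> <y_max>" group per box.)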
11 | val_file = './data/my_data/label/val.txt' # The path of the validation txt file.
12 | restore_path = './data/darknet_weights/yolov3.ckpt' # The path of the weights to restore.
13 | save_dir = './checkpoint/' # The directory of the weights to save.
14 | log_dir = './data/logs/' # The directory to store the tensorboard log files.
15 | progress_log_path = './data/progress.log' # The path to record the training progress.
16 | anchor_path = './data/yolo_anchors.txt' # The path of the anchor txt file.
17 | class_name_path = './data/coco.names' # The path of the class names.
18 |
19 | ### Training related numbers
20 | batch_size = 32 #6
21 | img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height]
22 | letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
23 | total_epoches = 500
24 | train_evaluation_step = 100 # Evaluate on the training batch after some steps.
25 | val_evaluation_epoch = 50 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
26 | save_epoch = 10 # Save the model after some epochs.
27 | batch_norm_decay = 0.99 # decay in bn ops
28 | weight_decay = 5e-4 # l2 weight decay
29 | global_step = 0 # used when resuming training
30 |
31 | ### tf.data parameters
32 | num_threads = 10 # Number of threads for image processing used in tf.data pipeline.
33 | prefetech_buffer = 5 # Prefetch buffer size used in tf.data pipeline.
34 |
35 | ### Learning rate and optimizer
36 | optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop]
37 | save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file.
38 | learning_rate_init = 1e-4
39 | lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
40 | lr_decay_epoch = 5 # Epochs after which the learning rate decays. Int or float. Used when the `exponential` or `cosine_decay_restart` lr_type is chosen.
41 | lr_decay_factor = 0.96 # The learning rate decay factor. Used when the `exponential` lr_type is chosen.
42 | lr_lower_bound = 1e-6 # The minimum learning rate.
43 | # only used in piecewise lr type
44 | pw_boundaries = [30, 50] # epoch based boundaries
45 | pw_values = [learning_rate_init, 3e-5, 1e-5]
46 |
47 | ### Load and finetune
48 | # Choose the parts whose weights you want to restore. List form.
49 | # restore_include: None, restore_exclude: None => restore the whole model
50 | # restore_include: None, restore_exclude: scope => restore the whole model except `scope`
51 | # restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 but not scope2 (scope1 - scope2)
52 | # choice 1: only restore the darknet body
53 | # restore_include = ['yolov3/darknet53_body']
54 | # restore_exclude = None
55 | # choice 2: restore all layers except the last 3 conv2d layers of the 3 scales
56 | restore_include = None
57 | restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
58 | # Choose the parts you want to finetune. List form.
59 | # Set to None to train the whole model.
60 | update_part = ['yolov3/yolov3_head']
61 |
62 | ### other training strategies
63 | multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
64 | use_label_smooth = True # Whether to use class label smoothing strategy.
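# (A common label smoothing scheme, not necessarily the exact one implemented
#  in this repo's loss code, softens the one-hot targets as
#  y_smooth = y_onehot * (1 - eps) + eps / class_num for a small eps.)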
65 | use_focal_loss = True # Whether to apply focal loss on the conf loss. 66 | use_mix_up = True # Whether to use mix up data augmentation strategy. 67 | use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding. 68 | warm_up_epoch = 3 # Warm up training epoches. Set to a larger value if gradient explodes. 69 | 70 | ### some constants in validation 71 | # nms 72 | nms_threshold = 0.45 # iou threshold in nms operation 73 | score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall. 74 | nms_topk = 150 # keep at most nms_topk outputs after nms 75 | # mAP eval 76 | eval_threshold = 0.5 # the iou threshold applied in mAP evaluation 77 | use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric 78 | 79 | ### parse some params 80 | anchors = parse_anchors(anchor_path) 81 | classes = read_class_names(class_name_path) 82 | class_num = len(classes) 83 | train_img_cnt = len(open(train_file, 'r').readlines()) 84 | val_img_cnt = len(open(val_file, 'r').readlines()) 85 | train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size)) 86 | 87 | lr_decay_freq = int(train_batch_num * lr_decay_epoch) 88 | pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries] -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/convert_weight.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | # for more details about the yolo darknet weights file, refer to 3 | # https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe 4 | 5 | from __future__ import division, print_function 6 | 7 | import os 8 | import sys 9 | import tensorflow as tf 10 | import numpy as np 11 | 12 | from model import yolov3 13 | from utils.misc_utils import parse_anchors, load_weights 14 | 15 | num_class = 80 16 | img_size = 416 17 | weight_path = './data/darknet_weights/yolov3.weights' 18 | save_path = './data/darknet_weights/yolov3.ckpt' 19 | anchors = parse_anchors('./data/yolo_anchors.txt') 20 | 21 | model = yolov3(80, anchors) 22 | with tf.Session() as sess: 23 | inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3]) 24 | 25 | with tf.variable_scope('yolov3'): 26 | feature_map = model.forward(inputs) 27 | 28 | saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3')) 29 | 30 | load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path) 31 | sess.run(load_ops) 32 | saver.save(sess, save_path=save_path) 33 | print('TensorFlow model checkpoint has been saved to {}'.format(save_path)) 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/coco.names: -------------------------------------------------------------------------------- 1 | hat 2 | person -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/darknet_weights/readme.txt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/data/darknet_weights/readme.txt -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data/yolo_anchors.txt: -------------------------------------------------------------------------------- 1 | 676,197, 
763,250, 684,283, 868,231, 745,273, 544,391, 829,258, 678,316, 713,355
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/data_pro.py: --------------------------------------------------------------------------------
1 |
2 | import os
3 | import pandas
4 | import shutil
5 | import random
6 |
7 |
8 | import cv2
9 | import numpy as np
10 | import xml.etree.ElementTree as ET
11 |
12 |
13 | # This part needs to be modified for your own dataset.
14 |
15 |
16 | class Data_preprocess(object):
17 | '''
18 | Parse the VOC-style XML annotation data.
19 | '''
20 | def __init__(self,data_path):
21 | self.data_path = data_path
22 | self.image_size = 416
23 | self.batch_size = 32
24 | self.cell_size = 13
25 | self.classes = ["hat","person"]
26 | self.num_classes = len(self.classes)
27 | self.box_per_cell = 5
28 | self.class_to_ind = dict(zip(self.classes, range(self.num_classes)))
29 |
30 | self.count = 0
31 | self.epoch = 1
32 | self.count_t = 0
33 |
34 | def load_labels(self, model):
35 | if model == 'train':
36 | txtname = os.path.join(self.data_path, 'ImageSets/Main/train.txt')
37 | if model == 'test':
38 | txtname = os.path.join(self.data_path, 'ImageSets/Main/test.txt')
39 |
40 | if model == "val":
41 | txtname = os.path.join(self.data_path, 'ImageSets/Main/val.txt')
42 |
43 |
44 | with open(txtname, 'r') as f:
45 | image_ind = [x.strip() for x in f.readlines()] # file names without the .jpg extension
46 |
47 |
48 | my_index = 0
49 | for ind in image_ind:
50 | class_inds, x1s, y1s, x2s, y2s,img_width,img_height = self.load_data(ind)
51 |
52 | if len(class_inds) == 0:
53 | pass
54 | else:
55 | annotation_label = ""
56 | # box_x: label_index, x_min, y_min, x_max, y_max
57 | for label_i in range(len(class_inds)):
58 |
59 | annotation_label += " " + str(class_inds[label_i])
60 | annotation_label += " " + str(x1s[label_i])
61 | annotation_label += " " + str(y1s[label_i])
62 | annotation_label += " " + str(x2s[label_i])
63 | annotation_label += " " + str(y2s[label_i])
64 |
65 | with open("./data/my_data/label/"+model+".txt","a") as f:
66 | f.write(str(my_index) + " " + data_path+"/JPEGImages/"+ind+".jpg"+" "+str(img_width) +" "+str(img_height)+ annotation_label + "\n")
67 |
68 | my_index += 1
69 |
70 | print(my_index)
71 |
72 |
73 |
74 | def load_data(self, index):
75 | label = np.zeros([self.cell_size, self.cell_size, self.box_per_cell, 5 + self.num_classes])
76 | filename = os.path.join(self.data_path, 'Annotations', index + '.xml')
77 | tree = ET.parse(filename)
78 | image_size = tree.find('size')
79 | image_width = int(float(image_size.find('width').text))
80 | image_height = int(float(image_size.find('height').text))
81 | # h_ratio = 1.0 * self.image_size / image_height
82 | # w_ratio = 1.0 * self.image_size / image_width
83 |
84 | objects = tree.findall('object')
85 |
86 | class_inds = []
87 | x1s = []
88 | y1s = []
89 | x2s = []
90 | y2s = []
91 |
92 | for obj in objects:
93 | box = obj.find('bndbox')
94 | x1 = int(float(box.find('xmin').text))
95 | y1 = int(float(box.find('ymin').text))
96 | x2 = int(float(box.find('xmax').text))
97 | y2 = int(float(box.find('ymax').text))
98 | # x1 = max(min((float(box.find('xmin').text)) * w_ratio, self.image_size), 0)
99 | # y1 = max(min((float(box.find('ymin').text)) * h_ratio, self.image_size), 0)
100 | # x2 = max(min((float(box.find('xmax').text)) * w_ratio, self.image_size), 0)
101 | # y2 = max(min((float(box.find('ymax').text)) * h_ratio, self.image_size), 0)
102 | if obj.find('name').text in self.classes:
103 | class_ind = self.class_to_ind[obj.find('name').text]
104 | # class_ind = self.class_to_ind[obj.find('name').text.lower().strip()]
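# (With self.classes = ["hat", "person"] above, class_to_ind maps
#  "hat" -> 0 and "person" -> 1, so class_inds collects integer labels.)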
105 |
106 | # boxes = [0.5 * (x1 + x2) / self.image_size, 0.5 * (y1 + y2) / self.image_size, np.sqrt((x2 - x1) / self.image_size), np.sqrt((y2 - y1) / self.image_size)]
107 | # cx = 1.0 * boxes[0] * self.cell_size
108 | # cy = 1.0 * boxes[1] * self.cell_size
109 | # xind = int(np.floor(cx))
110 | # yind = int(np.floor(cy))
111 |
112 | # label[yind, xind, :, 0] = 1
113 | # label[yind, xind, :, 1:5] = boxes
114 | # label[yind, xind, :, 5 + class_ind] = 1
115 |
116 | if x1 >= x2 or y1 >= y2:
117 | pass
118 | else:
119 | class_inds.append(class_ind)
120 | x1s.append(x1)
121 | y1s.append(y1)
122 | x2s.append(x2)
123 | y2s.append(y2)
124 |
125 | return class_inds, x1s, y1s, x2s, y2s, image_width, image_height
126 |
127 |
128 | def data_split(img_path):
129 | '''
130 | Split the image list into train, val and test sets.
131 | '''
132 |
133 | files = os.listdir(img_path)
134 |
135 | test_part = random.sample(files,int(351*0.2))
136 |
137 | val_part = random.sample(test_part,int(int(351*0.2)*0.5))
138 |
139 | val_index = 0
140 | test_index = 0
141 | train_index = 0
142 | for file in files:
143 | if file in val_part:
144 |
145 | with open("./data/my_data/ImageSets/Main/val.txt","a") as val_f:
146 | val_f.write(file[:-4] + "\n" )
147 |
148 | val_index += 1
149 |
150 | elif file in test_part:
151 | with open("./data/my_data/ImageSets/Main/test.txt","a") as test_f:
152 | test_f.write(file[:-4] + "\n")
153 |
154 | test_index += 1
155 |
156 | else:
157 | with open("./data/my_data/ImageSets/Main/train.txt","a") as train_f:
158 | train_f.write(file[:-4] + "\n")
159 |
160 | train_index += 1
161 |
162 |
163 | print(train_index,test_index,val_index)
164 |
165 |
166 |
167 | if __name__ == "__main__":
168 |
169 | # split into train, val and test
170 | # img_path = "./data/my_data/ImageSets"
171 | # data_split(img_path)
172 | print("===========split data finish============")
173 |
174 | # build the label files that YOLO V3 training needs
175 | base_path = os.getcwd()
176 | data_path = os.path.join(base_path,"data/my_data") # absolute path
177 |
178 | data_p = Data_preprocess(data_path)
179 | data_p.load_labels("train")
180 | data_p.load_labels("test")
181 | data_p.load_labels("val")
182 | print("==========data pro finish===========")
183 |
184 |
185 |
186 |
187 |
188 |
189 |
190 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/detection_result.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/detection_result.jpg
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/eval.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | from tqdm import trange
9 |
10 | from utils.data_utils import get_batch_data
11 | from utils.misc_utils import parse_anchors, read_class_names, AverageMeter
12 | from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
13 | from utils.nms_utils import gpu_nms
14 |
15 | from model import yolov3
16 |
17 | #################
18 | # ArgumentParser
19 | #################
20 | parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.")
21 | # some paths
22 | parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt",
23 |
help="The path of the validation or test txt file.") 24 | 25 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt", 26 | help="The path of the weights to restore.") 27 | 28 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt", 29 | help="The path of the anchor txt file.") 30 | 31 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names", 32 | help="The path of the class names.") 33 | 34 | # some numbers 35 | parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416], 36 | help="Resize the input image to `img_size`, size format: [width, height]") 37 | 38 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False, 39 | help="Whether to use the letterbox resize, i.e., keep the original image aspect ratio.") 40 | 41 | parser.add_argument("--num_threads", type=int, default=10, 42 | help="Number of threads for image processing used in tf.data pipeline.") 43 | 44 | parser.add_argument("--prefetech_buffer", type=int, default=5, 45 | help="Prefetech_buffer used in tf.data pipeline.") 46 | 47 | parser.add_argument("--nms_threshold", type=float, default=0.45, 48 | help="IOU threshold in nms operation.") 49 | 50 | parser.add_argument("--score_threshold", type=float, default=0.01, 51 | help="Threshold of the probability of the classes in nms operation.") 52 | 53 | parser.add_argument("--nms_topk", type=int, default=400, 54 | help="Keep at most nms_topk outputs after nms.") 55 | 56 | parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False, 57 | help="Whether to use the voc 2007 mAP metrics.") 58 | 59 | args = parser.parse_args() 60 | 61 | # args params 62 | args.anchors = parse_anchors(args.anchor_path) 63 | args.classes = read_class_names(args.class_name_path) 64 | args.class_num = len(args.classes) 65 | args.img_cnt = len(open(args.eval_file, 'r').readlines()) 66 | 67 | # setting placeholders 68 | is_training = tf.placeholder(dtype=tf.bool, name="phase_train") 69 | handle_flag = tf.placeholder(tf.string, [], name='iterator_handle_flag') 70 | pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None]) 71 | pred_scores_flag = tf.placeholder(tf.float32, [1, None, None]) 72 | gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold) 73 | 74 | ################## 75 | # tf.data pipeline 76 | ################## 77 | val_dataset = tf.data.TextLineDataset(args.eval_file) 78 | val_dataset = val_dataset.batch(1) 79 | val_dataset = val_dataset.map( 80 | lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]), 81 | num_parallel_calls=args.num_threads 82 | ) 83 | val_dataset.prefetch(args.prefetech_buffer) 84 | iterator = val_dataset.make_one_shot_iterator() 85 | 86 | image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next() 87 | image_ids.set_shape([None]) 88 | y_true = [y_true_13, y_true_26, y_true_52] 89 | image.set_shape([None, args.img_size[1], args.img_size[0], 3]) 90 | for y in y_true: 91 | y.set_shape([None, None, None, None, None]) 92 | 93 | ################## 94 | # Model definition 95 | ################## 96 | yolo_model = yolov3(args.class_num, args.anchors) 97 | with tf.variable_scope('yolov3'): 98 | pred_feature_maps = yolo_model.forward(image, is_training=is_training) 99 | loss = 
yolo_model.compute_loss(pred_feature_maps, y_true)
100 | y_pred = yolo_model.predict(pred_feature_maps)
101 |
102 | saver_to_restore = tf.train.Saver()
103 |
104 | with tf.Session() as sess:
105 | sess.run([tf.global_variables_initializer()])
106 | saver_to_restore.restore(sess, args.restore_path)
107 |
108 | print('\n----------- start to eval -----------\n')
109 |
110 | val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \
111 | AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter()
112 | val_preds = []
113 |
114 | for j in trange(args.img_cnt):
115 | __image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False})
116 | pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred)
117 |
118 | val_preds.extend(pred_content)
119 | val_loss_total.update(__loss[0])
120 | val_loss_xy.update(__loss[1])
121 | val_loss_wh.update(__loss[2])
122 | val_loss_conf.update(__loss[3])
123 | val_loss_class.update(__loss[4])
124 |
125 | rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter()
126 | gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize)
127 | print('mAP eval:')
128 | for ii in range(args.class_num):
129 | npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric)
130 | rec_total.update(rec, npos)
131 | prec_total.update(prec, nd)
132 | ap_total.update(ap, 1)
133 | print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap))
134 |
135 | mAP = ap_total.average
136 | print('final mAP: {:.4f}'.format(mAP))
137 | print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average))
138 | print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format(
139 | val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average
140 | ))
141 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/get_kmeans.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 | # This script is modified from https://github.com/lars76/kmeans-anchor-boxes
3 |
4 | from __future__ import division, print_function
5 |
6 | import numpy as np
7 |
8 | def iou(box, clusters):
9 | """
10 | Calculates the Intersection over Union (IoU) between a box and k clusters.
11 | param:
12 | box: tuple or array, shifted to the origin (i. e. width and height)
13 | clusters: numpy array of shape (k, 2) where k is the number of clusters
14 | return:
15 | numpy array of shape (k,) where k is the number of clusters
16 | """
17 | x = np.minimum(clusters[:, 0], box[0])
18 | y = np.minimum(clusters[:, 1], box[1])
19 | if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
20 | raise ValueError("Box has no area")
21 |
22 | intersection = x * y
23 | box_area = box[0] * box[1]
24 | cluster_area = clusters[:, 0] * clusters[:, 1]
25 |
26 | iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10)
27 | # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10)
28 |
29 | return iou_
30 |
31 |
32 | def avg_iou(boxes, clusters):
33 | """
34 | Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
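e.g. for boxes = [[10, 20]] and clusters = [[10, 20], [5, 5]], the single
box's best-matching cluster gives IoU 1.0, so the average is 1.0.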
35 | param: 36 | boxes: numpy array of shape (r, 2), where r is the number of rows 37 | clusters: numpy array of shape (k, 2) where k is the number of clusters 38 | return: 39 | average IoU as a single float 40 | """ 41 | return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])]) 42 | 43 | 44 | def translate_boxes(boxes): 45 | """ 46 | Translates all the boxes to the origin. 47 | param: 48 | boxes: numpy array of shape (r, 4) 49 | return: 50 | numpy array of shape (r, 2) 51 | """ 52 | new_boxes = boxes.copy() 53 | for row in range(new_boxes.shape[0]): 54 | new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0]) 55 | new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1]) 56 | return np.delete(new_boxes, [0, 1], axis=1) 57 | 58 | 59 | def kmeans(boxes, k, dist=np.median): 60 | """ 61 | Calculates k-means clustering with the Intersection over Union (IoU) metric. 62 | param: 63 | boxes: numpy array of shape (r, 2), where r is the number of rows 64 | k: number of clusters 65 | dist: distance function 66 | return: 67 | numpy array of shape (k, 2) 68 | """ 69 | rows = boxes.shape[0] 70 | 71 | distances = np.empty((rows, k)) 72 | last_clusters = np.zeros((rows,)) 73 | 74 | np.random.seed() 75 | 76 | # the Forgy method will fail if the whole array contains the same rows 77 | clusters = boxes[np.random.choice(rows, k, replace=False)] 78 | 79 | while True: 80 | for row in range(rows): 81 | distances[row] = 1 - iou(boxes[row], clusters) 82 | 83 | nearest_clusters = np.argmin(distances, axis=1) 84 | 85 | if (last_clusters == nearest_clusters).all(): 86 | break 87 | 88 | for cluster in range(k): 89 | clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0) 90 | 91 | last_clusters = nearest_clusters 92 | 93 | return clusters 94 | 95 | 96 | def parse_anno(annotation_path, target_size=None): 97 | anno = open(annotation_path, 'r') 98 | result = [] 99 | for line in anno: 100 | s = line.strip().split(' ') 101 | img_w = int(float(s[2])) 102 | img_h = int(float(s[3])) 103 | s = s[4:] 104 | box_cnt = len(s) // 5 105 | for i in range(box_cnt): 106 | x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4]) 107 | width = x_max - x_min 108 | height = y_max - y_min 109 | assert width > 0 110 | assert height > 0 111 | # use letterbox resize, i.e. 
keep the original aspect ratio
112 | # get k-means anchors on the resized target image size
113 | if target_size is not None:
114 | resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h)
115 | width *= resize_ratio
116 | height *= resize_ratio
117 | result.append([width, height])
118 | # get k-means anchors on the original image size
119 | else:
120 | result.append([width, height])
121 | result = np.asarray(result)
122 | return result
123 |
124 |
125 | def get_kmeans(anno, cluster_num=9):
126 |
127 | anchors = kmeans(anno, cluster_num)
128 | ave_iou = avg_iou(anno, anchors)
129 |
130 | anchors = anchors.astype('int').tolist()
131 |
132 | anchors = sorted(anchors, key=lambda x: x[0] * x[1])
133 |
134 | return anchors, ave_iou
135 |
136 |
137 | if __name__ == '__main__':
138 | # target resize format: [width, height]
139 | # if target_size is specified, the anchors are on the resized image scale
140 | # if target_size is set to None, the anchors are on the original image scale
141 | target_size = [416, 416]
142 | annotation_path = "./data/my_data/label/train.txt"
143 | anno_result = parse_anno(annotation_path, target_size=target_size)
144 | anchors, ave_iou = get_kmeans(anno_result, 9)
145 |
146 | anchor_string = ''
147 | for anchor in anchors:
148 | anchor_string += '{},{}, '.format(anchor[0], anchor[1])
149 | anchor_string = anchor_string[:-2]
150 |
151 | print('anchors are:')
152 | print(anchor_string)
153 | print('the average iou is:')
154 | print(ave_iou)
155 |
156 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/requirements.txt: --------------------------------------------------------------------------------
1 | numpy==1.16.0
2 | absl-py==0.9.0
3 | astor==0.8.1
4 | gast==0.3.3
5 | grpcio==1.27.2
6 | h5py==2.10.0
7 | keras-applications==1.0.8
8 | keras-preprocessing==1.1.0
9 | markdown==3.2.1
10 | mock==4.0.2
11 | protobuf==3.11.3
12 | tensorboard==1.13.1
13 | tensorflow-estimator==1.13.0
14 | tensorflow-gpu==1.13.1+nv19.3
15 | termcolor==1.1.0
16 |
17 |
18 |
19 |
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/test.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/test.jpg
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/test_single_image.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 |
10 | from utils.misc_utils import parse_anchors, read_class_names
11 | from utils.nms_utils import gpu_nms
12 | from utils.plot_utils import get_color_table, plot_one_box
13 |
14 | from model import yolov3
15 |
16 |
17 |
18 | parser = argparse.ArgumentParser(description="YOLO-V3 single image test procedure.")
19 | parser.add_argument("input_image", type=str,
20 | help="The path of the input image.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 | help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 | help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--class_name_path", type=str,
default="./data/coco.names", 26 | help="The path of the class names.") 27 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/best_model_Epoch_200_step_34370_mAP_0.8121_loss_9.4284_lr_1e-05", 28 | help="The path of the weights to restore.") 29 | args = parser.parse_args() 30 | 31 | args.anchors = parse_anchors(args.anchor_path) 32 | args.classes = read_class_names(args.class_name_path) 33 | args.num_class = len(args.classes) 34 | 35 | color_table = get_color_table(args.num_class) 36 | 37 | img_ori = cv2.imread(args.input_image) 38 | height_ori, width_ori = img_ori.shape[:2] 39 | img = cv2.resize(img_ori, tuple(args.new_size)) 40 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 41 | img = np.asarray(img, np.float32) 42 | img = img[np.newaxis, :] / 255. 43 | 44 | with tf.Session() as sess: 45 | input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data') 46 | yolo_model = yolov3(args.num_class, args.anchors) 47 | with tf.variable_scope('yolov3'): 48 | pred_feature_maps = yolo_model.forward(input_data, False) 49 | pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps) 50 | 51 | pred_scores = pred_confs * pred_probs 52 | 53 | boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=30, score_thresh=0.4, nms_thresh=0.5) 54 | 55 | saver = tf.train.Saver() 56 | saver.restore(sess, args.restore_path) 57 | 58 | boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img}) 59 | 60 | # rescale the coordinates to the original image 61 | boxes_[:, 0] *= (width_ori/float(args.new_size[0])) 62 | boxes_[:, 2] *= (width_ori/float(args.new_size[0])) 63 | boxes_[:, 1] *= (height_ori/float(args.new_size[1])) 64 | boxes_[:, 3] *= (height_ori/float(args.new_size[1])) 65 | 66 | print("box coords:") 67 | print(boxes_) 68 | print('*' * 30) 69 | print("scores:") 70 | print(scores_) 71 | print('*' * 30) 72 | print("labels:") 73 | print(labels_) 74 | 75 | for i in range(len(boxes_)): 76 | x0, y0, x1, y1 = boxes_[i] 77 | plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]], color=color_table[labels_[i]]) 78 | # cv2.imshow('Detection result', img_ori) 79 | cv2.imwrite('detection_result.jpg', img_ori) 80 | # cv2.waitKey(0) 81 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/YOLO-V3-Tensorflow-demo/utils/__init__.py -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def conv2d(inputs, filters, kernel_size, strides=1): 10 | def _fixed_padding(inputs, kernel_size): 11 | pad_total = kernel_size - 1 12 | pad_beg = pad_total // 2 13 | pad_end = pad_total - pad_beg 14 | 15 | padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end], 16 | [pad_beg, pad_end], [0, 0]], mode='CONSTANT') 17 | return padded_inputs 18 | if strides > 1: 19 | inputs = _fixed_padding(inputs, kernel_size) 20 | inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides, 21 | padding=('SAME' if strides == 1 else 'VALID')) 22 
| return inputs 23 | 24 | def darknet53_body(inputs): 25 | def res_block(inputs, filters): 26 | shortcut = inputs 27 | net = conv2d(inputs, filters * 1, 1) 28 | net = conv2d(net, filters * 2, 3) 29 | 30 | net = net + shortcut 31 | 32 | return net 33 | 34 | # first two conv2d layers 35 | net = conv2d(inputs, 32, 3, strides=1) 36 | net = conv2d(net, 64, 3, strides=2) 37 | 38 | # res_block * 1 39 | net = res_block(net, 32) 40 | 41 | net = conv2d(net, 128, 3, strides=2) 42 | 43 | # res_block * 2 44 | for i in range(2): 45 | net = res_block(net, 64) 46 | 47 | net = conv2d(net, 256, 3, strides=2) 48 | 49 | # res_block * 8 50 | for i in range(8): 51 | net = res_block(net, 128) 52 | 53 | route_1 = net 54 | net = conv2d(net, 512, 3, strides=2) 55 | 56 | # res_block * 8 57 | for i in range(8): 58 | net = res_block(net, 256) 59 | 60 | route_2 = net 61 | net = conv2d(net, 1024, 3, strides=2) 62 | 63 | # res_block * 4 64 | for i in range(4): 65 | net = res_block(net, 512) 66 | route_3 = net 67 | 68 | return route_1, route_2, route_3 69 | 70 | 71 | def yolo_block(inputs, filters): 72 | net = conv2d(inputs, filters * 1, 1) 73 | net = conv2d(net, filters * 2, 3) 74 | net = conv2d(net, filters * 1, 1) 75 | net = conv2d(net, filters * 2, 3) 76 | net = conv2d(net, filters * 1, 1) 77 | route = net 78 | net = conv2d(net, filters * 2, 3) 79 | return route, net 80 | 81 | 82 | def upsample_layer(inputs, out_shape): 83 | new_height, new_width = out_shape[1], out_shape[2] 84 | # NOTE: here height is the first 85 | # TODO: Do we need to set `align_corners` as True? 86 | inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled') 87 | return inputs 88 | 89 | 90 | -------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/misc_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | import numpy as np 4 | import tensorflow as tf 5 | import random 6 | 7 | from tensorflow.core.framework import summary_pb2 8 | 9 | 10 | def make_summary(name, val): 11 | return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)]) 12 | 13 | 14 | class AverageMeter(object): 15 | def __init__(self): 16 | self.reset() 17 | 18 | def reset(self): 19 | self.val = 0 20 | self.average = 0 21 | self.sum = 0 22 | self.count = 0 23 | 24 | def update(self, val, n=1): 25 | self.val = val 26 | self.sum += val * n 27 | self.count += n 28 | self.average = self.sum / float(self.count) 29 | 30 | 31 | def parse_anchors(anchor_path): 32 | ''' 33 | parse anchors. 
34 | returned data: shape [N, 2], dtype float32 35 | ''' 36 | anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2]) 37 | return anchors 38 | 39 | 40 | def read_class_names(class_name_path): 41 | names = {} 42 | with open(class_name_path, 'r') as data: 43 | for ID, name in enumerate(data): 44 | names[ID] = name.strip('\n') 45 | return names 46 | 47 | 48 | def shuffle_and_overwrite(file_name): 49 | content = open(file_name, 'r').readlines() 50 | random.shuffle(content) 51 | with open(file_name, 'w') as f: 52 | for line in content: 53 | f.write(line) 54 | 55 | 56 | def update_dict(ori_dict, new_dict): 57 | if not ori_dict: 58 | return new_dict 59 | for key in ori_dict: 60 | ori_dict[key] += new_dict[key] 61 | return ori_dict 62 | 63 | 64 | def list_add(ori_list, new_list): 65 | for i in range(len(ori_list)): 66 | ori_list[i] += new_list[i] 67 | return ori_list 68 | 69 | 70 | def load_weights(var_list, weights_file): 71 | """ 72 | Loads and converts pre-trained weights. 73 | param: 74 | var_list: list of network variables. 75 | weights_file: name of the binary file. 76 | """ 77 | with open(weights_file, "rb") as fp: 78 | np.fromfile(fp, dtype=np.int32, count=5) 79 | weights = np.fromfile(fp, dtype=np.float32) 80 | 81 | ptr = 0 82 | i = 0 83 | assign_ops = [] 84 | while i < len(var_list) - 1: 85 | var1 = var_list[i] 86 | var2 = var_list[i + 1] 87 | # do something only if we process conv layer 88 | if 'Conv' in var1.name.split('/')[-2]: 89 | # check type of next layer 90 | if 'BatchNorm' in var2.name.split('/')[-2]: 91 | # load batch norm params 92 | gamma, beta, mean, var = var_list[i + 1:i + 5] 93 | batch_norm_vars = [beta, gamma, mean, var] 94 | for var in batch_norm_vars: 95 | shape = var.shape.as_list() 96 | num_params = np.prod(shape) 97 | var_weights = weights[ptr:ptr + num_params].reshape(shape) 98 | ptr += num_params 99 | assign_ops.append(tf.assign(var, var_weights, validate_shape=True)) 100 | # we move the pointer by 4, because we loaded 4 variables 101 | i += 4 102 | elif 'Conv' in var2.name.split('/')[-2]: 103 | # load biases 104 | bias = var2 105 | bias_shape = bias.shape.as_list() 106 | bias_params = np.prod(bias_shape) 107 | bias_weights = weights[ptr:ptr + 108 | bias_params].reshape(bias_shape) 109 | ptr += bias_params 110 | assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True)) 111 | # we loaded 1 variable 112 | i += 1 113 | # we can load weights of conv layer 114 | shape = var1.shape.as_list() 115 | num_params = np.prod(shape) 116 | 117 | var_weights = weights[ptr:ptr + num_params].reshape( 118 | (shape[3], shape[2], shape[0], shape[1])) 119 | # remember to transpose to column-major 120 | var_weights = np.transpose(var_weights, (2, 3, 1, 0)) 121 | ptr += num_params 122 | assign_ops.append( 123 | tf.assign(var1, var_weights, validate_shape=True)) 124 | i += 1 125 | 126 | return assign_ops 127 | 128 | 129 | def config_learning_rate(args, global_step): 130 | if args.lr_type == 'exponential': 131 | lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq, 132 | args.lr_decay_factor, staircase=True, name='exponential_learning_rate') 133 | return tf.maximum(lr_tmp, args.lr_lower_bound) 134 | elif args.lr_type == 'cosine_decay': 135 | train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num 136 | return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \ 137 | (1 + tf.cos(global_step / train_steps * np.pi)) 138 | elif 
args.lr_type == 'cosine_decay_restart':
139 | return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 | args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 | name='cosine_decay_learning_rate_restart')
142 | elif args.lr_type == 'fixed':
143 | return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 | elif args.lr_type == 'piecewise':
145 | return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 | name='piecewise_learning_rate')
147 | else:
148 | raise ValueError('Unsupported learning rate type!')
149 |
150 |
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 | if optimizer_name == 'momentum':
153 | return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 | elif optimizer_name == 'rmsprop':
155 | return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 | elif optimizer_name == 'adam':
157 | return tf.train.AdamOptimizer(learning_rate)
158 | elif optimizer_name == 'sgd':
159 | return tf.train.GradientDescentOptimizer(learning_rate)
160 | else:
161 | raise ValueError('Unsupported optimizer type!')
-------------------------------------------------------------------------------- /YOLO-V3-Tensorflow-demo/utils/nms_utils.py: --------------------------------------------------------------------------------
1 | # coding: utf-8
2 |
3 | from __future__ import division, print_function
4 |
5 | import numpy as np
6 | import tensorflow as tf
7 |
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 | """
10 | Perform NMS on GPU using TensorFlow.
11 |
12 | params:
13 | boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
14 | scores: tensor of shape [1, 10647, num_classes], score=conf*prob
15 | num_classes: total number of classes
16 | max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
17 | score_thresh: if the highest class probability score of a box is below this
18 | threshold, the corresponding box is discarded
19 | nms_thresh: real value, "intersection over union" threshold used for NMS filtering
20 | """
21 |
22 | boxes_list, label_list, score_list = [], [], []
23 | max_boxes = tf.constant(max_boxes, dtype='int32')
24 |
25 | # since we do NMS for a single image, reshape the tensors first
26 | boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't know the exact number of boxes
27 | score = tf.reshape(scores, [-1, num_classes])
28 |
29 | # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
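# (tf.boolean_mask keeps only the rows whose mask entry is True: with three
#  candidate boxes and mask[:, i] = [True, False, True], filter_boxes below
#  would have shape [2, 4].)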
30 | mask = tf.greater_equal(score, tf.constant(score_thresh))
31 | # Step 2: Do non_max_suppression for each class
32 | for i in range(num_classes):
33 | # Step 3: Apply the mask to scores, boxes and pick them out
34 | filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 | filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 | nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 | scores=filter_score,
38 | max_output_size=max_boxes,
39 | iou_threshold=nms_thresh, name='nms_indices')
40 | label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 | boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 | score_list.append(tf.gather(filter_score, nms_indices))
43 |
44 | boxes = tf.concat(boxes_list, axis=0)
45 | score = tf.concat(score_list, axis=0)
46 | label = tf.concat(label_list, axis=0)
47 |
48 | return boxes, score, label
49 |
50 |
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 | """
53 | Pure Python NMS baseline.
54 |
55 | Arguments: boxes: shape of [-1, 4], where '-1' means the exact number of
56 | boxes is unknown
57 | scores: shape of [-1,]
58 | max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 | iou_thresh: the IoU threshold for deciding whether to keep a box
60 | """
61 | assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 |
63 | x1 = boxes[:, 0]
64 | y1 = boxes[:, 1]
65 | x2 = boxes[:, 2]
66 | y2 = boxes[:, 3]
67 |
68 | areas = (x2 - x1) * (y2 - y1)
69 | order = scores.argsort()[::-1]
70 |
71 | keep = []
72 | while order.size > 0:
73 | i = order[0]
74 | keep.append(i)
75 | xx1 = np.maximum(x1[i], x1[order[1:]])
76 | yy1 = np.maximum(y1[i], y1[order[1:]])
77 | xx2 = np.minimum(x2[i], x2[order[1:]])
78 | yy2 = np.minimum(y2[i], y2[order[1:]])
79 |
80 | w = np.maximum(0.0, xx2 - xx1 + 1)
81 | h = np.maximum(0.0, yy2 - yy1 + 1)
82 | inter = w * h
83 | ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 |
85 | inds = np.where(ovr <= iou_thresh)[0]
86 | order = order[inds + 1]
87 |
88 | return keep[:max_boxes]
89 |
90 |
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 | """
93 | Perform NMS on CPU.
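(cpu_nms applies the per-class score threshold first and then delegates the
greedy suppression to py_nms above, one class at a time.)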
94 |     Arguments:
95 |         boxes: shape [1, 10647, 4]
96 |         scores: shape [1, 10647, num_classes]
97 |     """
98 | 
99 |     boxes = boxes.reshape(-1, 4)
100 |     scores = scores.reshape(-1, num_classes)
101 |     # Picked bounding boxes
102 |     picked_boxes, picked_score, picked_label = [], [], []
103 | 
104 |     for i in range(num_classes):
105 |         indices = np.where(scores[:,i] >= score_thresh)
106 |         filter_boxes = boxes[indices]
107 |         filter_scores = scores[:,i][indices]
108 |         if len(filter_boxes) == 0:
109 |             continue
110 |         # do non_max_suppression on the cpu
111 |         indices = py_nms(filter_boxes, filter_scores,
112 |                          max_boxes=max_boxes, iou_thresh=iou_thresh)
113 |         picked_boxes.append(filter_boxes[indices])
114 |         picked_score.append(filter_scores[indices])
115 |         picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 |     if len(picked_boxes) == 0:
117 |         return None, None, None
118 | 
119 |     boxes = np.concatenate(picked_boxes, axis=0)
120 |     score = np.concatenate(picked_score, axis=0)
121 |     label = np.concatenate(picked_label, axis=0)
122 | 
123 |     return boxes, score, label
--------------------------------------------------------------------------------
/YOLO-V3-Tensorflow-demo/utils/plot_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import cv2
6 | import random
7 | 
8 | 
9 | def get_color_table(class_num, seed=2):
10 |     random.seed(seed)
11 |     color_table = {}
12 |     for i in range(class_num):
13 |         color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 |     return color_table
15 | 
16 | 
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 |     '''
19 |     coord: [x_min, y_min, x_max, y_max] format coordinates.
20 |     img: img to plot on.
21 |     label: str. The label name.
22 |     color: a list/tuple of 3 ints, the (B, G, R) color of the box.
23 |     line_thickness: int. rectangle line thickness.
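    example (illustrative values): plot_one_box(img, [20, 30, 120, 180], label='PH, 98.00%', color=(0, 255, 0))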
24 |     '''
25 |     tl = line_thickness or int(round(0.002 * max(img.shape[0:2])))  # line thickness
26 |     color = color or [random.randint(0, 255) for _ in range(3)]
27 |     c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3]))
28 |     cv2.rectangle(img, c1, c2, color, thickness=tl)
29 |     if label:
30 |         tf = max(tl - 1, 1)  # font thickness
31 |         t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0]
32 |         c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
33 |         cv2.rectangle(img, c1, c2, color, -1)  # filled
34 |         cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA)
35 | 
36 | 
--------------------------------------------------------------------------------
/YOLO-V3-Tensorflow-demo/video_test.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import tensorflow as tf
6 | import numpy as np
7 | import argparse
8 | import cv2
9 | import time
10 | 
11 | from utils.misc_utils import parse_anchors, read_class_names
12 | from utils.nms_utils import gpu_nms
13 | from utils.plot_utils import get_color_table, plot_one_box
14 | from utils.data_aug import letterbox_resize
15 | 
16 | from model import yolov3
17 | 
18 | parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.")
19 | parser.add_argument("input_video", type=str,
20 |                     help="The path of the input video.")
21 | parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
22 |                     help="The path of the anchor txt file.")
23 | parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
24 |                     help="Resize the input image with `new_size`, size format: [width, height]")
25 | parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
26 |                     help="Whether to use the letterbox resize.")
27 | parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
28 |                     help="The path of the class names.")
29 | parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/best_model_Epoch_200_step_34370_mAP_0.8121_loss_9.4284_lr_1e-05",
30 |                     help="The path of the weights to restore.")
31 | parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False,
32 |                     help="Whether to save the video detection results.")
33 | args = parser.parse_args()
34 | 
35 | args.anchors = parse_anchors(args.anchor_path)
36 | args.classes = read_class_names(args.class_name_path)
37 | args.num_class = len(args.classes)
38 | 
39 | color_table = get_color_table(args.num_class)
40 | 
41 | vid = cv2.VideoCapture(args.input_video)
42 | video_frame_cnt = int(vid.get(7))   # 7: cv2.CAP_PROP_FRAME_COUNT
43 | video_width = int(vid.get(3))       # 3: cv2.CAP_PROP_FRAME_WIDTH
44 | video_height = int(vid.get(4))      # 4: cv2.CAP_PROP_FRAME_HEIGHT
45 | video_fps = int(vid.get(5))         # 5: cv2.CAP_PROP_FPS
46 | 
47 | if args.save_video:
48 |     fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
49 |     videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height))
50 | 
51 | with tf.Session() as sess:
52 |     input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
53 |     yolo_model = yolov3(args.num_class, args.anchors)
54 |     with tf.variable_scope('yolov3'):
55 |         pred_feature_maps = yolo_model.forward(input_data, False)
56 |     pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
57 | 
58 |     pred_scores = pred_confs * pred_probs
59 | 
60 |     boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
61 | 
62 |     saver = tf.train.Saver()
63 |     saver.restore(sess, args.restore_path)
64 | 
65 |     for i in range(video_frame_cnt):
66 |         ret, img_ori = vid.read()
67 |         if not ret:  # stop early if the stream ends before the reported frame count
68 |             break
69 |         if args.letterbox_resize:
70 |             img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
71 |         else:
72 |             height_ori, width_ori = img_ori.shape[:2]
73 |             img = cv2.resize(img_ori, tuple(args.new_size))
74 |         img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
75 |         img = np.asarray(img, np.float32)
76 |         img = img[np.newaxis, :] / 255.
77 | 
78 |         start_time = time.time()
79 |         boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
80 |         end_time = time.time()
81 | 
82 |         # rescale the coordinates to the original image
83 |         if args.letterbox_resize:
84 |             boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
85 |             boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
86 |         else:
87 |             boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
88 |             boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
89 | 
90 |         for j in range(len(boxes_)):  # 'j' so the frame counter 'i' is not shadowed
91 |             x0, y0, x1, y1 = boxes_[j]
92 |             plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[j]] + ', {:.2f}%'.format(scores_[j] * 100), color=color_table[labels_[j]])
93 |         cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0,
94 |                     fontScale=1, color=(0, 255, 0), thickness=2)
95 |         cv2.imshow('image', img_ori)
96 |         if args.save_video:
97 |             videoWriter.write(img_ori)
98 |         if cv2.waitKey(1) & 0xFF == ord('q'):
99 |             break
100 | 
101 | vid.release()
102 | if args.save_video:
103 |     videoWriter.release()
--------------------------------------------------------------------------------
/config.py:
--------------------------------------------------------------------------------
1 | import configparser
2 | import os
3 | 
4 | class Config:
5 |     """A model for saving settings"""
6 | 
7 |     def __init__(self):
8 |         self.config_path = 'config.ini'
9 |         self.email_receiver = ''
10 |         self.email_server = ''
11 |         self.email_port = '0'
12 |         self.email_username = ''
13 |         self.email_password = ''
14 |         self.email_ssl = False
15 |         self.objects_to_detect = -1
16 |         self.detection_marker_location = 50
17 |         self.detection_marker_direction = 0
18 |         self.mode = ''
19 | 
20 |     def load(self):
21 |         """Load the settings from the file"""
22 | 
23 |         if not os.path.exists(self.config_path):
24 |             return
25 | 
26 |         config_parser = configparser.ConfigParser()
27 |         config_parser.read(self.config_path)
28 | 
29 |         email_config = config_parser['email']
30 |         objects_config = config_parser['objects']
31 |         detection_marker = config_parser['detection_marker']
32 | 
33 |         self.email_receiver = email_config['receiver']
34 |         self.email_server = email_config['server']
35 |         self.email_port = email_config['port']
36 |         self.email_username = email_config['username']
37 |         self.email_password = email_config['password']
38 |         self.email_ssl = email_config.getboolean('ssl')  # bool('False') would be True, so parse the string properly
39 |         self.objects_to_detect = int(objects_config['index'])
40 |         self.detection_marker_location = int(detection_marker['location'])
41 |         self.detection_marker_direction = int(detection_marker['direction'])
42 | 
43 |         if(self.objects_to_detect == 0):
44 |             self.mode = 'PH'
45 |         elif(self.objects_to_detect == 1):
46 |             self.mode = 'PV'
47 |         elif(self.objects_to_detect == 3):
48 |             self.mode = 'PLC'
49 |         else:  # index 2: Helmet & Vest
50 |             self.mode = 'PHV'
51 | 
52 |     def save(self):
53 |         """Save the settings to the file"""
54 |         config_parser = configparser.ConfigParser()
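        # configparser stores every option as a string, so the non-string settings are stringified below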
55 |         config_parser['email'] = {'receiver': self.email_receiver,
56 |                                   'server': self.email_server,
57 |                                   'port': str(self.email_port),
58 |                                   'username': self.email_username,
59 |                                   'password': self.email_password,
60 |                                   'ssl': str(self.email_ssl)
61 |                                   }
62 | 
63 |         config_parser['objects'] = {'index': str(self.objects_to_detect)}
64 | 
65 |         config_parser['detection_marker'] = {'location': str(self.detection_marker_location),
66 |                                              'direction': str(self.detection_marker_direction)}
67 | 
68 |         with open(self.config_path, 'w') as configfile:
69 |             config_parser.write(configfile)
--------------------------------------------------------------------------------
/config_window.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | from tkinter import ttk, Tk, messagebox
3 | import tkinter as tk
4 | from config import Config
5 | from notification import NotificationService
6 | 
7 | class ConfigWindow(tk.Toplevel):
8 |     def __init__(self, master, on_config_save=None, **kwargs):
9 |         super().__init__(master, **kwargs)  # pass the parent window through to Toplevel
10 | 
11 |         self.resizable(False, False)
12 | 
13 |         self.on_config_save = on_config_save
14 |         self.title('Configuration')
15 | 
16 |         self.config = Config()
17 |         self.config.load()
18 | 
19 |         self.var_enable_ssl = tk.BooleanVar()
20 |         self.var_enable_ssl.set(self.config.email_ssl)
21 | 
22 |         self.var_recipient_email = tk.StringVar()
23 |         self.var_recipient_email.set(self.config.email_receiver)
24 | 
25 |         self.var_server = tk.StringVar()
26 |         self.var_server.set(self.config.email_server)
27 | 
28 |         self.var_port = tk.IntVar()
29 |         self.var_port.set(self.config.email_port)
30 | 
31 |         self.var_logon_email = tk.StringVar()
32 |         self.var_logon_email.set(self.config.email_username)
33 | 
34 |         self.var_password = tk.StringVar()
35 |         self.var_password.set(self.config.email_password)
36 | 
37 |         self.var_objects_detect = tk.IntVar()
38 |         self.var_objects_detect.set(self.config.objects_to_detect)
39 | 
40 |         self.var_marker_location = tk.IntVar()
41 |         self.var_marker_location.set(self.config.detection_marker_location)
42 | 
43 |         self.var_marker_direction = tk.IntVar()
44 |         self.var_marker_direction.set(self.config.detection_marker_direction)
45 | 
46 |         self.var_status = tk.StringVar()
47 |         self.var_status.set('')
48 | 
49 |         self.create_page()
50 | 
51 |         self.event_loop = asyncio.get_event_loop()
52 | 
53 |     def create_page(self):
54 |         email_frame = ttk.LabelFrame(self, text="Email settings")
55 |         email_frame.pack(padx=15, pady=15, fill=tk.X)
56 | 
57 |         self.draw_input(email_frame, 'Recipient Email:', "text", 0, self.var_recipient_email)
58 |         self.draw_input(email_frame, 'Server:', "text", 1, self.var_server)
59 |         self.draw_input(email_frame, 'Port:', "text", 2, self.var_port)
60 |         self.draw_input(email_frame, 'Logon Email:', "text", 3, self.var_logon_email)
61 |         self.draw_input(email_frame, 'Password:', "text", 4, self.var_password, True)
62 |         self.draw_input(email_frame, 'Enable SSL', "check", 5, self.var_enable_ssl)
63 | 
64 |         row_status = ttk.Frame(email_frame)
65 |         row_status.pack(anchor='w')
66 |         self.lbl_status = ttk.Label(row_status, textvariable=self.var_status)
67 |         self.lbl_status.pack()
68 | 
69 |         row_test_email = ttk.Frame(email_frame)
70 |         row_test_email.pack(anchor='w')
71 |         btn_test_email = ttk.Button(row_test_email, text='Test Email', command=self.on_email_test)
72 |         btn_test_email.pack()
73 | 
74 |         objects_frame = ttk.LabelFrame(self, text='Objects to detect')
75 |         objects_frame.pack(fill=tk.X, padx=15)
76 | 
77 |         row_objects = ttk.Frame(objects_frame)
78 |         row_objects.pack(anchor='w', expand=1)
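        # the radio values below map onto Config.objects_to_detect: 0 = Helmet, 1 = Vest, 2 = Helmet & Vest, 3 = Lab Coat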
79 | 
80 |         rad_helmet = ttk.Radiobutton(row_objects, text='Helmet', value=0, variable=self.var_objects_detect)
81 |         rad_helmet.pack(side=tk.LEFT, fill=tk.X, expand=1)
82 | 
83 |         rad_vest = ttk.Radiobutton(row_objects, text='Vest', value=1, variable=self.var_objects_detect)
84 |         rad_vest.pack(side=tk.LEFT, fill=tk.X)
85 |         # rad_vest.grid(row=7, column=1)
86 |         rad_helmet_vest = ttk.Radiobutton(row_objects, text='Helmet & Vest', value=2, variable=self.var_objects_detect)
87 |         rad_helmet_vest.pack(side=tk.LEFT, fill=tk.X)
88 | 
89 |         # rad_helmet_vest.grid(row=7, column=2)
90 |         rad_lab_coat = ttk.Radiobutton(row_objects, text='Lab Coat', value=3, variable=self.var_objects_detect)
91 |         rad_lab_coat.pack(side=tk.LEFT, fill=tk.X)
92 | 
93 |         marker_frame = ttk.LabelFrame(self, text="Detection marker position")
94 |         marker_frame.pack(padx=15, pady=15, fill=tk.X)
95 | 
96 |         # objects_frame = ttk.LabelFrame(self, text='Objects to detect')
97 |         # objects_frame.pack(fill=tk.X, padx=15)
98 |         direction_frame = ttk.LabelFrame(marker_frame, text='Direction of the marker')
99 |         direction_frame.pack(fill=tk.X)
100 | 
101 |         row_direction = ttk.Frame(direction_frame)
102 |         row_direction.pack(anchor='w', expand=1)
103 | 
104 |         rad_horizontal = ttk.Radiobutton(row_direction, text='Horizontal', value=0, variable=self.var_marker_direction)
105 |         rad_horizontal.pack(side=tk.LEFT, fill=tk.X, expand=1)
106 | 
107 |         rad_vertical = ttk.Radiobutton(row_direction, text='Vertical', value=1, variable=self.var_marker_direction)
108 |         rad_vertical.pack(side=tk.LEFT, fill=tk.X)
109 | 
110 |         self.draw_input(marker_frame, 'Location %: (0 - 100)', "text", 2, self.var_marker_location)
111 | 
112 |         buttons_frame = ttk.LabelFrame(self)
113 |         buttons_frame.pack(padx=15, pady=15, fill=tk.X)
114 | 
115 |         row_btn_save = ttk.Frame(buttons_frame)
116 |         row_btn_save.pack(anchor='w')
117 | 
118 |         btn_save = ttk.Button(row_btn_save, text="Ok", width=10, command=self.on_save)
119 |         btn_save.pack(side=tk.LEFT)
120 | 
121 |         btn_cancel = ttk.Button(row_btn_save, text="Cancel", command=self.destroy)
122 |         btn_cancel.pack(side=tk.LEFT)
123 | 
124 |     def draw_input(self, master, label, type, index, variable=None, is_password=False):
125 |         row = ttk.Frame(master)
126 |         row.pack(anchor='w')
127 | 
128 |         if type == "text":
129 |             lbl = ttk.Label(row, width=20, text=label)
130 |             lbl.pack(anchor='w')
131 | 
132 |             txt = ttk.Entry(row, width=50, textvariable=variable)
133 | 
134 |             if is_password:
135 |                 txt.config(show="*")
136 |         elif type == "check":
137 |             txt = ttk.Checkbutton(row, variable=variable, text=label)
138 |         elif type == "number":
139 |             lbl = ttk.Label(row, width=20, text=label)
140 |             lbl.pack(anchor='w')
141 |             txt = ttk.Spinbox(row, increment=1)  # parent must be `row`, not `master`, so it packs into this input row
142 | 
143 |         txt.pack(anchor='w')
144 | 
145 |     def on_email_test(self):
146 |         print('on_email_test')
147 | 
148 |         try:
149 |             noti = NotificationService(self.var_server.get(),
150 |                                        self.var_port.get(),
151 |                                        self.var_logon_email.get(),
152 |                                        self.var_password.get(),
153 |                                        self.var_enable_ssl.get())
154 | 
155 |             loop = asyncio.new_event_loop()
156 |             ss = loop.run_until_complete(noti.notify(0, self.var_recipient_email.get(), 'Test Email'))
157 |             loop.close()
158 | 
159 |             self.var_status.set('Test Success')
160 |             self.lbl_status.configure(foreground="green")
161 | 
162 |         except Exception as inst:
163 |             self.lbl_status.configure(foreground="red")
164 |             self.var_status.set(f'Error:{inst}')
165 | 
166 |     def on_save(self):
167 |         config = Config()
168 |         config.load()
169 | 
170 |         config.email_ssl = self.var_enable_ssl.get()
171 |         config.email_receiver = self.var_recipient_email.get()
172 |         config.email_server = self.var_server.get()
173 |         config.email_port = self.var_port.get()
174 |         config.email_username = self.var_logon_email.get()
175 |         config.email_password = self.var_password.get()
176 |         config.objects_to_detect = self.var_objects_detect.get()
177 |         config.detection_marker_direction = self.var_marker_direction.get()
178 |         config.detection_marker_location = self.var_marker_location.get()
179 |         config.save()
180 | 
181 |         if(self.on_config_save is not None):
182 |             self.on_config_save()
183 | 
184 |         self.destroy()
185 | 
186 | if __name__ == '__main__':
187 |     mw = tk.Tk()
188 |     fw = ConfigWindow(mw)
189 |     mw.mainloop()
--------------------------------------------------------------------------------
/data/coco.names:
--------------------------------------------------------------------------------
1 | P
2 | PH
3 | PV
4 | PHV
5 | PLC
--------------------------------------------------------------------------------
/data/yolo_anchors.txt:
--------------------------------------------------------------------------------
1 | 15,32, 27,62, 38,99, 49,141, 69,111, 69,184, 98,223, 143,232, 188,359
--------------------------------------------------------------------------------
/eyre.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/eyre.ico
--------------------------------------------------------------------------------
/notification.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | import smtplib
3 | 
4 | 
5 | class NotificationService:
6 |     username = ''
7 |     password = ''
8 |     host = ''
9 |     port = ''
10 |     ssl = False
11 | 
12 |     def __init__(self, host, port, username, password, ssl):
13 |         self.username = username
14 |         self.password = password
15 |         self.host = host
16 |         self.port = port
17 |         self.ssl = ssl
18 |         self.counter = {}  # per-mode counter used by count()/reset_count(); without this both methods raise AttributeError
19 | 
20 |     def count(self, mode):
21 |         self.counter[mode] = self.counter.get(mode, 0) + 1
22 | 
23 |     def reset_count(self, mode):
24 |         self.counter[mode] = 0
25 | 
26 |     async def notify(self, notification_type, receiver, message):
27 |         if notification_type == 0:
28 |             with smtplib.SMTP(self.host, self.port) as smtp:
29 |                 smtp.ehlo()
30 |                 smtp.starttls()
31 |                 smtp.ehlo()
32 | 
33 |                 smtp.login(self.username, self.password)
34 | 
35 |                 subject = 'Eyre Notification'
36 | 
37 |                 mail = 'Subject: {}\n\n{}'.format(subject, message)
38 | 
39 |                 smtp.sendmail(self.username, receiver, mail)
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | tensorflow-gpu==1.15.2
2 | opencv-python==4.2.0.32
3 | Pillow==7.1.2
4 | scikit-learn==0.21.3
5 | filterpy==1.4.5
--------------------------------------------------------------------------------
/sort.py:
--------------------------------------------------------------------------------
1 | """
2 | SORT: A Simple, Online and Realtime Tracker
3 | Copyright (C) 2016 Alex Bewley alex@dynamicdetection.com
4 | 
5 | This program is free software: you can redistribute it and/or modify
6 | it under the terms of the GNU General Public License as published by
7 | the Free Software Foundation, either version 3 of the License, or
8 | (at your option) any later version.
9 | 
10 | This program is distributed in the hope that it will be useful,
11 | but WITHOUT ANY WARRANTY; without even the implied warranty of
12 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
13 | GNU General Public License for more details.
14 | 
15 | You should have received a copy of the GNU General Public License
16 | along with this program.  If not, see <http://www.gnu.org/licenses/>.
17 | """
18 | from __future__ import print_function
19 | 
20 | import numpy as np
21 | from sklearn.utils.linear_assignment_ import linear_assignment
22 | from filterpy.kalman import KalmanFilter
23 | 
24 | def iou(bb_test,bb_gt):
25 |   """
26 |   Computes IOU between two bboxes in the form [x1,y1,x2,y2]
27 |   """
28 |   xx1 = np.maximum(bb_test[0], bb_gt[0])
29 |   yy1 = np.maximum(bb_test[1], bb_gt[1])
30 |   xx2 = np.minimum(bb_test[2], bb_gt[2])
31 |   yy2 = np.minimum(bb_test[3], bb_gt[3])
32 |   w = np.maximum(0., xx2 - xx1)
33 |   h = np.maximum(0., yy2 - yy1)
34 |   wh = w * h
35 |   o = wh / ((bb_test[2]-bb_test[0])*(bb_test[3]-bb_test[1])
36 |     + (bb_gt[2]-bb_gt[0])*(bb_gt[3]-bb_gt[1]) - wh)
37 |   return(o)
38 | 
39 | def convert_bbox_to_z(bbox):
40 |   """
41 |   Takes a bounding box in the form [x1,y1,x2,y2] and returns z in the form
42 |   [x,y,s,r] where x,y is the centre of the box, s is the scale/area and r is
43 |   the aspect ratio
44 |   """
45 |   w = bbox[2]-bbox[0]
46 |   h = bbox[3]-bbox[1]
47 |   x = bbox[0]+w/2.
48 |   y = bbox[1]+h/2.
49 |   s = w*h    #scale is just area
50 |   r = w/float(h)
51 |   return np.array([x,y,s,r]).reshape((4,1))
52 | 
53 | def convert_x_to_bbox(x,score=None):
54 |   """
55 |   Takes a bounding box in the centre form [x,y,s,r] and returns it in the form
56 |   [x1,y1,x2,y2] where x1,y1 is the top left and x2,y2 is the bottom right
57 |   """
58 |   w = np.sqrt(x[2]*x[3])
59 |   h = x[2]/w
60 |   if(score is None):
61 |     return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.]).reshape((1,4))
62 |   else:
63 |     return np.array([x[0]-w/2.,x[1]-h/2.,x[0]+w/2.,x[1]+h/2.,score]).reshape((1,5))
64 | 
65 | class KalmanBoxTracker(object):
66 |   """
67 |   This class represents the internal state of individual tracked objects observed as bbox.
68 |   """
69 |   count = 0
70 |   def __init__(self,bbox):
71 |     """
72 |     Initialises a tracker using an initial bounding box.
73 |     """
74 |     #define constant velocity model
75 |     self.kf = KalmanFilter(dim_x=7, dim_z=4)
76 |     self.kf.F = np.array([[1,0,0,0,1,0,0],[0,1,0,0,0,1,0],[0,0,1,0,0,0,1],[0,0,0,1,0,0,0],[0,0,0,0,1,0,0],[0,0,0,0,0,1,0],[0,0,0,0,0,0,1]])
77 |     self.kf.H = np.array([[1,0,0,0,0,0,0],[0,1,0,0,0,0,0],[0,0,1,0,0,0,0],[0,0,0,1,0,0,0]])
78 | 
79 |     self.kf.R[2:,2:] *= 10.
80 |     self.kf.P[4:,4:] *= 1000.   #give high uncertainty to the unobservable initial velocities
81 |     self.kf.P *= 10.
82 |     self.kf.Q[-1,-1] *= 0.01
83 |     self.kf.Q[4:,4:] *= 0.01
84 | 
85 |     self.kf.x[:4] = convert_bbox_to_z(bbox)
86 |     self.time_since_update = 0
87 |     self.id = KalmanBoxTracker.count
88 |     KalmanBoxTracker.count += 1
89 |     self.history = []
90 |     self.hits = 0
91 |     self.hit_streak = 0
92 |     self.age = 0
93 | 
94 |   def update(self,bbox):
95 |     """
96 |     Updates the state vector with an observed bbox.
97 |     """
98 |     self.time_since_update = 0
99 |     self.history = []
100 |     self.hits += 1
101 |     self.hit_streak += 1
102 |     self.kf.update(convert_bbox_to_z(bbox))
103 | 
104 |   def predict(self):
105 |     """
106 |     Advances the state vector and returns the predicted bounding box estimate.
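    (the motion model assumes constant velocity in the centre coordinates and scale, see self.kf.F above)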
107 | """ 108 | if((self.kf.x[6]+self.kf.x[2])<=0): 109 | self.kf.x[6] *= 0.0 110 | self.kf.predict() 111 | self.age += 1 112 | if(self.time_since_update>0): 113 | self.hit_streak = 0 114 | self.time_since_update += 1 115 | self.history.append(convert_x_to_bbox(self.kf.x)) 116 | return self.history[-1] 117 | 118 | def get_state(self): 119 | """ 120 | Returns the current bounding box estimate. 121 | """ 122 | return convert_x_to_bbox(self.kf.x) 123 | 124 | def associate_detections_to_trackers(detections,trackers,iou_threshold = 0.3): 125 | """ 126 | Assigns detections to tracked object (both represented as bounding boxes) 127 | 128 | Returns 3 lists of matches, unmatched_detections and unmatched_trackers 129 | """ 130 | if(len(trackers)==0) or (len(detections)==0): 131 | return np.empty((0,2),dtype=int), np.arange(len(detections)), np.empty((0,5),dtype=int) 132 | iou_matrix = np.zeros((len(detections),len(trackers)),dtype=np.float32) 133 | 134 | for d,det in enumerate(detections): 135 | for t,trk in enumerate(trackers): 136 | iou_matrix[d,t] = iou(det,trk) 137 | matched_indices = linear_assignment(-iou_matrix) 138 | 139 | unmatched_detections = [] 140 | for d,det in enumerate(detections): 141 | if(d not in matched_indices[:,0]): 142 | unmatched_detections.append(d) 143 | unmatched_trackers = [] 144 | for t,trk in enumerate(trackers): 145 | if(t not in matched_indices[:,1]): 146 | unmatched_trackers.append(t) 147 | 148 | #filter out matched with low IOU 149 | matches = [] 150 | for m in matched_indices: 151 | if(iou_matrix[m[0],m[1]] 0: 202 | trk.update(dets[d,:][0]) 203 | 204 | #create and initialise new trackers for unmatched detections 205 | for i in unmatched_dets: 206 | trk = KalmanBoxTracker(dets[i,:]) 207 | self.trackers.append(trk) 208 | i = len(self.trackers) 209 | for trk in reversed(self.trackers): 210 | d = trk.get_state()[0] 211 | if((trk.time_since_update < 1) and (trk.hit_streak >= self.min_hits or self.frame_count <= self.min_hits)): 212 | ret.append(np.concatenate((d,[trk.id+1])).reshape(1,-1)) # +1 as MOT benchmark requires positive 213 | i -= 1 214 | #remove dead tracklet 215 | if(trk.time_since_update > self.max_age): 216 | self.trackers.pop(i) 217 | if(len(ret)>0): 218 | return np.concatenate(ret) 219 | return np.empty((0,5)) -------------------------------------------------------------------------------- /test_notification.py: -------------------------------------------------------------------------------- 1 | from unittest import TestCase 2 | 3 | from notification import NotificationService 4 | 5 | 6 | class TestNotificationService(TestCase): 7 | def test_notify(self): 8 | username = 'ppe.detection@gmail.com' 9 | password = 'itsdxkjynqhwsgaj' 10 | host = 'smtp.gmail.com' 11 | port = 587 12 | 13 | noti = NotificationService(host, port, username, password, False) 14 | noti.notify(0, 'guneedmts@gmail.com', 'Unit Test') 15 | 16 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ssins/PPE-detection/1e20dd7528166332e191eba7766984a482c734d6/utils/__init__.py -------------------------------------------------------------------------------- /utils/layer_utils.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from __future__ import division, print_function 4 | 5 | import numpy as np 6 | import tensorflow as tf 7 | slim = tf.contrib.slim 8 | 9 | def 
10 |     def _fixed_padding(inputs, kernel_size):
11 |         pad_total = kernel_size - 1
12 |         pad_beg = pad_total // 2
13 |         pad_end = pad_total - pad_beg
14 | 
15 |         padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
16 |                                         [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
17 |         return padded_inputs
18 |     if strides > 1:
19 |         inputs = _fixed_padding(inputs, kernel_size)
20 |     inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
21 |                          padding=('SAME' if strides == 1 else 'VALID'))
22 |     return inputs
23 | 
24 | def darknet53_body(inputs):
25 |     def res_block(inputs, filters):
26 |         shortcut = inputs
27 |         net = conv2d(inputs, filters * 1, 1)
28 |         net = conv2d(net, filters * 2, 3)
29 | 
30 |         net = net + shortcut
31 | 
32 |         return net
33 | 
34 |     # first two conv2d layers
35 |     net = conv2d(inputs, 32, 3, strides=1)
36 |     net = conv2d(net, 64, 3, strides=2)
37 | 
38 |     # res_block * 1
39 |     net = res_block(net, 32)
40 | 
41 |     net = conv2d(net, 128, 3, strides=2)
42 | 
43 |     # res_block * 2
44 |     for i in range(2):
45 |         net = res_block(net, 64)
46 | 
47 |     net = conv2d(net, 256, 3, strides=2)
48 | 
49 |     # res_block * 8
50 |     for i in range(8):
51 |         net = res_block(net, 128)
52 | 
53 |     route_1 = net
54 |     net = conv2d(net, 512, 3, strides=2)
55 | 
56 |     # res_block * 8
57 |     for i in range(8):
58 |         net = res_block(net, 256)
59 | 
60 |     route_2 = net
61 |     net = conv2d(net, 1024, 3, strides=2)
62 | 
63 |     # res_block * 4
64 |     for i in range(4):
65 |         net = res_block(net, 512)
66 |     route_3 = net
67 | 
68 |     return route_1, route_2, route_3
69 | 
70 | 
71 | def yolo_block(inputs, filters):
72 |     net = conv2d(inputs, filters * 1, 1)
73 |     net = conv2d(net, filters * 2, 3)
74 |     net = conv2d(net, filters * 1, 1)
75 |     net = conv2d(net, filters * 2, 3)
76 |     net = conv2d(net, filters * 1, 1)
77 |     route = net
78 |     net = conv2d(net, filters * 2, 3)
79 |     return route, net
80 | 
81 | 
82 | def upsample_layer(inputs, out_shape):
83 |     new_height, new_width = out_shape[1], out_shape[2]
84 |     # NOTE: height comes first here
85 |     # TODO: Do we need to set `align_corners` to True?
86 |     inputs = tf.compat.v1.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
87 |     return inputs
--------------------------------------------------------------------------------
/utils/misc_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | import numpy as np
4 | import tensorflow as tf
5 | import random
6 | 
7 | from tensorflow.core.framework import summary_pb2
8 | 
9 | 
10 | def make_summary(name, val):
11 |     return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)])
12 | 
13 | 
14 | class AverageMeter(object):
15 |     def __init__(self):
16 |         self.reset()
17 | 
18 |     def reset(self):
19 |         self.val = 0
20 |         self.average = 0
21 |         self.sum = 0
22 |         self.count = 0
23 | 
24 |     def update(self, val, n=1):
25 |         self.val = val
26 |         self.sum += val * n
27 |         self.count += n
28 |         self.average = self.sum / float(self.count)
29 | 
30 | 
31 | def parse_anchors(anchor_path):
32 |     '''
33 |     parse anchors.
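    the anchor file is a single line of comma-separated values, e.g. the 9 (width, height) pairs in data/yolo_anchors.txt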
34 |     returned data: shape [N, 2], dtype float32
35 |     '''
36 |     anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2])
37 |     return anchors
38 | 
39 | 
40 | def read_class_names(class_name_path):
41 |     names = {}
42 |     with open(class_name_path, 'r') as data:
43 |         for ID, name in enumerate(data):
44 |             names[ID] = name.strip('\n')
45 |     return names
46 | 
47 | 
48 | def shuffle_and_overwrite(file_name):
49 |     content = open(file_name, 'r').readlines()
50 |     random.shuffle(content)
51 |     with open(file_name, 'w') as f:
52 |         for line in content:
53 |             f.write(line)
54 | 
55 | 
56 | def update_dict(ori_dict, new_dict):
57 |     if not ori_dict:
58 |         return new_dict
59 |     for key in ori_dict:
60 |         ori_dict[key] += new_dict[key]
61 |     return ori_dict
62 | 
63 | 
64 | def list_add(ori_list, new_list):
65 |     for i in range(len(ori_list)):
66 |         ori_list[i] += new_list[i]
67 |     return ori_list
68 | 
69 | 
70 | def load_weights(var_list, weights_file):
71 |     """
72 |     Loads and converts pre-trained weights.
73 |     param:
74 |         var_list: list of network variables.
75 |         weights_file: name of the binary file.
76 |     """
77 |     with open(weights_file, "rb") as fp:
78 |         np.fromfile(fp, dtype=np.int32, count=5)
79 |         weights = np.fromfile(fp, dtype=np.float32)
80 | 
81 |     ptr = 0
82 |     i = 0
83 |     assign_ops = []
84 |     while i < len(var_list) - 1:
85 |         var1 = var_list[i]
86 |         var2 = var_list[i + 1]
87 |         # do something only if we process a conv layer
88 |         if 'Conv' in var1.name.split('/')[-2]:
89 |             # check the type of the next layer
90 |             if 'BatchNorm' in var2.name.split('/')[-2]:
91 |                 # load batch norm params
92 |                 gamma, beta, mean, var = var_list[i + 1:i + 5]
93 |                 batch_norm_vars = [beta, gamma, mean, var]
94 |                 for var in batch_norm_vars:
95 |                     shape = var.shape.as_list()
96 |                     num_params = np.prod(shape)
97 |                     var_weights = weights[ptr:ptr + num_params].reshape(shape)
98 |                     ptr += num_params
99 |                     assign_ops.append(tf.assign(var, var_weights, validate_shape=True))
100 |                 # we move the pointer by 4, because we loaded 4 variables
101 |                 i += 4
102 |             elif 'Conv' in var2.name.split('/')[-2]:
103 |                 # load biases
104 |                 bias = var2
105 |                 bias_shape = bias.shape.as_list()
106 |                 bias_params = np.prod(bias_shape)
107 |                 bias_weights = weights[ptr:ptr +
108 |                                        bias_params].reshape(bias_shape)
109 |                 ptr += bias_params
110 |                 assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True))
111 |                 # we loaded 1 variable
112 |                 i += 1
113 |             # we can load the weights of the conv layer
114 |             shape = var1.shape.as_list()
115 |             num_params = np.prod(shape)
116 | 
117 |             var_weights = weights[ptr:ptr + num_params].reshape(
118 |                 (shape[3], shape[2], shape[0], shape[1]))
119 |             # remember to transpose to column-major
120 |             var_weights = np.transpose(var_weights, (2, 3, 1, 0))
121 |             ptr += num_params
122 |             assign_ops.append(
123 |                 tf.assign(var1, var_weights, validate_shape=True))
124 |             i += 1
125 | 
126 |     return assign_ops
127 | 
128 | 
129 | def config_learning_rate(args, global_step):
130 |     if args.lr_type == 'exponential':
131 |         lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq,
132 |                                             args.lr_decay_factor, staircase=True, name='exponential_learning_rate')
133 |         return tf.maximum(lr_tmp, args.lr_lower_bound)
134 |     elif args.lr_type == 'cosine_decay':
135 |         train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num
136 |         return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \
137 |                (1 + tf.cos(global_step / train_steps * np.pi))
138 |     elif args.lr_type == 'cosine_decay_restart':
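        # restarts: each cycle is twice as long as the previous one (t_mul=2.0) and restarts at the full initial LR (m_mul=1.0)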
139 |         return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
140 |                                                args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
141 |                                                name='cosine_decay_learning_rate_restart')
142 |     elif args.lr_type == 'fixed':
143 |         return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
144 |     elif args.lr_type == 'piecewise':
145 |         return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
146 |                                            name='piecewise_learning_rate')
147 |     else:
148 |         raise ValueError('Unsupported learning rate type!')
149 | 
150 | 
151 | def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
152 |     if optimizer_name == 'momentum':
153 |         return tf.train.MomentumOptimizer(learning_rate, momentum=momentum)
154 |     elif optimizer_name == 'rmsprop':
155 |         return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
156 |     elif optimizer_name == 'adam':
157 |         return tf.train.AdamOptimizer(learning_rate)
158 |     elif optimizer_name == 'sgd':
159 |         return tf.train.GradientDescentOptimizer(learning_rate)
160 |     else:
161 |         raise ValueError('Unsupported optimizer type!')
--------------------------------------------------------------------------------
/utils/nms_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import numpy as np
6 | import tensorflow as tf
7 | 
8 | def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
9 |     """
10 |     Perform NMS on GPU using TensorFlow.
11 | 
12 |     params:
13 |         boxes: tensor of shape [1, 10647, 4]  # 10647 = (13*13+26*26+52*52)*3 for an input 416*416 image
14 |         scores: tensor of shape [1, 10647, num_classes], score = conf * prob
15 |         num_classes: total number of classes
16 |         max_boxes: integer, the maximum number of predicted boxes you'd like, default is 50
17 |         score_thresh: if [highest class probability score < score_thresh],
18 |                       then get rid of the corresponding box
19 |         nms_thresh: real value, the "intersection over union" threshold used for NMS filtering
20 |     """
21 | 
22 |     boxes_list, label_list, score_list = [], [], []
23 |     max_boxes = tf.constant(max_boxes, dtype='int32')
24 | 
25 |     # since we do NMS for a single image, reshape the tensors first
26 |     boxes = tf.reshape(boxes, [-1, 4])  # '-1' means we don't know the exact number of boxes
27 |     score = tf.reshape(scores, [-1, num_classes])
28 | 
29 |     # Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
30 |     mask = tf.greater_equal(score, tf.constant(score_thresh))
31 |     # Step 2: Do non_max_suppression for each class
32 |     for i in range(num_classes):
33 |         # Step 3: Apply the mask to scores and boxes, and pick them out
34 |         filter_boxes = tf.boolean_mask(boxes, mask[:,i])
35 |         filter_score = tf.boolean_mask(score[:,i], mask[:,i])
36 |         nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
37 |                                                    scores=filter_score,
38 |                                                    max_output_size=max_boxes,
39 |                                                    iou_threshold=nms_thresh, name='nms_indices')
40 |         label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
41 |         boxes_list.append(tf.gather(filter_boxes, nms_indices))
42 |         score_list.append(tf.gather(filter_score, nms_indices))
43 | 
44 |     boxes = tf.concat(boxes_list, axis=0)
45 |     score = tf.concat(score_list, axis=0)
46 |     label = tf.concat(label_list, axis=0)
47 | 
48 |     return boxes, score, label
49 | 
50 | 
51 | def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
52 |     """
53 |     Pure Python NMS baseline.
54 | 
55 |     Arguments: boxes: shape of [-1, 4], where '-1' means we don't know the
56 |                       exact number of boxes
57 |                scores: shape of [-1,]
58 |                max_boxes: the maximum number of boxes to be selected by non_max_suppression
59 |                iou_thresh: the IoU threshold for deciding which boxes to keep
60 |     """
61 |     assert boxes.shape[1] == 4 and len(scores.shape) == 1
62 | 
63 |     x1 = boxes[:, 0]
64 |     y1 = boxes[:, 1]
65 |     x2 = boxes[:, 2]
66 |     y2 = boxes[:, 3]
67 | 
68 |     areas = (x2 - x1) * (y2 - y1)
69 |     order = scores.argsort()[::-1]
70 | 
71 |     keep = []
72 |     while order.size > 0:
73 |         i = order[0]
74 |         keep.append(i)
75 |         xx1 = np.maximum(x1[i], x1[order[1:]])
76 |         yy1 = np.maximum(y1[i], y1[order[1:]])
77 |         xx2 = np.minimum(x2[i], x2[order[1:]])
78 |         yy2 = np.minimum(y2[i], y2[order[1:]])
79 | 
80 |         w = np.maximum(0.0, xx2 - xx1 + 1)
81 |         h = np.maximum(0.0, yy2 - yy1 + 1)
82 |         inter = w * h
83 |         ovr = inter / (areas[i] + areas[order[1:]] - inter)
84 | 
85 |         inds = np.where(ovr <= iou_thresh)[0]
86 |         order = order[inds + 1]
87 | 
88 |     return keep[:max_boxes]
89 | 
90 | 
91 | def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
92 |     """
93 |     Perform NMS on CPU.
94 |     Arguments:
95 |         boxes: shape [1, 10647, 4]
96 |         scores: shape [1, 10647, num_classes]
97 |     """
98 | 
99 |     boxes = boxes.reshape(-1, 4)
100 |     scores = scores.reshape(-1, num_classes)
101 |     # Picked bounding boxes
102 |     picked_boxes, picked_score, picked_label = [], [], []
103 | 
104 |     for i in range(num_classes):
105 |         indices = np.where(scores[:,i] >= score_thresh)
106 |         filter_boxes = boxes[indices]
107 |         filter_scores = scores[:,i][indices]
108 |         if len(filter_boxes) == 0:
109 |             continue
110 |         # do non_max_suppression on the cpu
111 |         indices = py_nms(filter_boxes, filter_scores,
112 |                          max_boxes=max_boxes, iou_thresh=iou_thresh)
113 |         picked_boxes.append(filter_boxes[indices])
114 |         picked_score.append(filter_scores[indices])
115 |         picked_label.append(np.ones(len(indices), dtype='int32')*i)
116 |     if len(picked_boxes) == 0:
117 |         return None, None, None
118 | 
119 |     boxes = np.concatenate(picked_boxes, axis=0)
120 |     score = np.concatenate(picked_score, axis=0)
121 |     label = np.concatenate(picked_label, axis=0)
122 | 
123 |     return boxes, score, label
--------------------------------------------------------------------------------
/utils/plot_utils.py:
--------------------------------------------------------------------------------
1 | # coding: utf-8
2 | 
3 | from __future__ import division, print_function
4 | 
5 | import cv2
6 | import random
7 | 
8 | 
9 | def get_color_table(class_num, seed=2):
10 |     random.seed(seed)
11 |     color_table = {}
12 |     for i in range(class_num):
13 |         color_table[i] = [random.randint(0, 255) for _ in range(3)]
14 |     return color_table
15 | 
16 | 
17 | def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
18 |     '''
19 |     coord: [x_min, y_min, x_max, y_max] format coordinates.
20 |     img: img to plot on.
21 |     label: str. The label name.
22 |     color: a list/tuple of 3 ints, the (B, G, R) color of the box.
23 |     line_thickness: int. rectangle line thickness.
24 |     '''
25 |     tl = line_thickness or int(round(0.002 * max(img.shape[0:2])))  # line thickness
26 |     color = color or [random.randint(0, 255) for _ in range(3)]
27 |     c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3]))
28 |     cv2.rectangle(img, c1, c2, color, thickness=tl)
29 |     if label:
30 |         tf = max(tl - 1, 1)  # font thickness
31 |         t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0]
32 |         c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
33 |         cv2.rectangle(img, c1, c2, color, -1)  # filled
34 |         cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA)
35 | 
36 | 
--------------------------------------------------------------------------------
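A minimal sketch of how the utilities above fit together outside a TensorFlow session (illustrative only: the dummy boxes, scores and the blank frame are made-up stand-ins, and it assumes it is run from the repository root; a real pipeline feeds the model's `pred_boxes`/`pred_scores` from `video_test.py` instead):

```python
# coding: utf-8
import numpy as np

from utils.nms_utils import cpu_nms
from utils.plot_utils import get_color_table, plot_one_box

num_classes = 5                                # matches data/coco.names: P, PH, PV, PHV, PLC
classes = {0: 'P', 1: 'PH', 2: 'PV', 3: 'PHV', 4: 'PLC'}
color_table = get_color_table(num_classes)

# dummy predictions shaped like the model output: [1, N, 4] boxes and [1, N, num_classes] scores
boxes = np.array([[[50, 60, 150, 220], [55, 65, 155, 225]]], dtype=np.float32)
scores = np.zeros((1, 2, num_classes), dtype=np.float32)
scores[0, :, 1] = [0.9, 0.6]                   # two heavily overlapping 'PH' candidates; NMS keeps one

boxes_, scores_, labels_ = cpu_nms(boxes, scores, num_classes, score_thresh=0.3, iou_thresh=0.45)

img = np.zeros((416, 416, 3), dtype=np.uint8)  # stand-in for a real video frame
for box, score, label in zip(boxes_, scores_, labels_):
    plot_one_box(img, box, label='{}, {:.2f}%'.format(classes[label], score * 100),
                 color=color_table[label])
```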