├── .gitignore ├── Image ├── HOG.png ├── Normalize Feature.png ├── Normalize Feature_HSV.png ├── Normalize_Feature_HSV.png ├── car_not_car.PNG ├── feature.png ├── find_car.png ├── find_car_yolo.png ├── heatmap.png ├── heatmap1.png ├── yolo-box.PNG ├── yolo-model.PNG ├── yolo.gif └── yolo_result.jpeg ├── LICENSE ├── Project-SVM.py ├── Project-yolo.py ├── README.md ├── __pycache__ ├── helper.cpython-35.pyc └── helper_yolo.cpython-35.pyc ├── dist.p ├── helper.py ├── helper_yolo.py ├── output_videos ├── project_SVM.mp4 ├── project_yolo.mp4 └── test_SVM.mp4 └── test_img ├── 275.png └── cutout1.jpg /.gitignore: -------------------------------------------------------------------------------- 1 | 2 | *.xml 3 | 4 | /.idea/ 5 | /test_img/ 6 | /input_videos/ 7 | /reference/ 8 | /weights/ 9 | *.xlsx -------------------------------------------------------------------------------- /Image/HOG.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/HOG.png -------------------------------------------------------------------------------- /Image/Normalize Feature.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/Normalize Feature.png -------------------------------------------------------------------------------- /Image/Normalize Feature_HSV.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/Normalize Feature_HSV.png -------------------------------------------------------------------------------- /Image/Normalize_Feature_HSV.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/Normalize_Feature_HSV.png -------------------------------------------------------------------------------- /Image/car_not_car.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/car_not_car.PNG -------------------------------------------------------------------------------- /Image/feature.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/feature.png -------------------------------------------------------------------------------- /Image/find_car.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/find_car.png -------------------------------------------------------------------------------- /Image/find_car_yolo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/find_car_yolo.png -------------------------------------------------------------------------------- /Image/heatmap.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/heatmap.png -------------------------------------------------------------------------------- /Image/heatmap1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/heatmap1.png -------------------------------------------------------------------------------- /Image/yolo-box.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/yolo-box.PNG -------------------------------------------------------------------------------- /Image/yolo-model.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/yolo-model.PNG -------------------------------------------------------------------------------- /Image/yolo.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/yolo.gif -------------------------------------------------------------------------------- /Image/yolo_result.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/Image/yolo_result.jpeg -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Chaoqun Shan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
-------------------------------------------------------------------------------- /Project-SVM.py: -------------------------------------------------------------------------------- 1 | # import numpy as np 2 | # import matplotlib.pyplot as plt 3 | # import pickle 4 | # import matplotlib.image as mpimg 5 | from moviepy.editor import VideoFileClip 6 | from collections import deque 7 | from helper import * 8 | 9 | 10 | def process_image(image, heat_thres=2): 11 | global dist_pickle 12 | boxes = [] 13 | 14 | multibox1, img_multibox1 = find_cars(image, dist_pickle, ystart=400, ystop=500, scale=1.0) 15 | multibox2, img_multibox2 = find_cars(image, dist_pickle, ystart=400, ystop=500, scale=1.3) 16 | multibox3, img_multibox3 = find_cars(image, dist_pickle, ystart=420, ystop=556, scale=1.6) 17 | multibox4, img_multibox4 = find_cars(image, dist_pickle, ystart=430, ystop=556, scale=2.0) 18 | multibox5, img_multibox5 = find_cars(image, dist_pickle, ystart=500, ystop=656, scale=3.0) 19 | multibox6, img_multibox6 = find_cars(image, dist_pickle, ystart=410, ystop=500, scale=1.4) 20 | multibox7, img_multibox7 = find_cars(image, dist_pickle, ystart=430, ystop=556, scale=1.8) 21 | multibox8, img_multibox8 = find_cars(image, dist_pickle, ystart=440, ystop=556, scale=1.9) 22 | multibox9, img_multibox9 = find_cars(image, dist_pickle, ystart=400, ystop=556, scale=2.2) 23 | 24 | boxes.extend(multibox1) 25 | boxes.extend(multibox2) 26 | boxes.extend(multibox3) 27 | boxes.extend(multibox4) 28 | boxes.extend(multibox5) 29 | boxes.extend(multibox6) 30 | boxes.extend(multibox7) 31 | boxes.extend(multibox8) 32 | boxes.extend(multibox9) 33 | 34 | heat_zero = np.zeros_like(image[:, :, 0]).astype(np.float) 35 | heat = add_heat(heat_zero, boxes) 36 | heat = apply_threshold(heat, threshold=heat_thres) 37 | current_heatmap = np.clip(heat, 0, 255) 38 | 39 | # HM.current_heat = heatmap 40 | # merged_heat = HM.merge_heat() 41 | # heatmap = apply_threshold(merged_heat, threshold=framenum*heat_thres) 42 | 43 | history.append(current_heatmap) 44 | heatmap = np.zeros_like(current_heatmap).astype(np.float) 45 | for heat in history: 46 | heatmap += heat 47 | 48 | labels = label(heatmap) 49 | draw_img = draw_labeled_bboxes(np.copy(image), labels) 50 | 51 | return draw_img 52 | 53 | dist_pickle = pickle.load(open("dist.p", "rb")) 54 | history = deque(maxlen=8) 55 | 56 | input_path = './input_videos/project.mp4' 57 | video_output = './output_videos/project_SVM.mp4' 58 | 59 | # input_path = './input_videos/test_video.mp4' 60 | # video_output = './output_videos/test_SVM.mp4' 61 | 62 | clip1 = VideoFileClip(input_path) 63 | # clip1 = VideoFileClip(input_path).subclip(4, 16) 64 | 65 | t = time.time() 66 | final_clip = clip1.fl_image(process_image) 67 | final_clip.write_videofile(video_output, audio=False) 68 | t2 = time.time() 69 | print(round(t2 - t, 2), 'Seconds to process video...') 70 | 71 | # image = mpimg.imread('./test_img/test5.jpg') 72 | # plt.figure() 73 | # plt.imshow(process_image(image, heat_thres=1)) 74 | # plt.show() 75 | -------------------------------------------------------------------------------- /Project-yolo.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import cv2 4 | import glob 5 | from moviepy.editor import VideoFileClip 6 | from IPython.display import HTML 7 | import keras 8 | from keras.models import Sequential 9 | from keras.layers.convolutional import Convolution2D, MaxPooling2D 10 | from 
keras.layers.advanced_activations import LeakyReLU 11 | from keras.layers.core import Flatten, Dense, Activation, Reshape 12 | import time 13 | from helper_yolo import * 14 | 15 | keras.backend.set_image_dim_ordering('th') 16 | 17 | model = Sequential() 18 | model.add(Convolution2D(16, 3, 3,input_shape=(3, 448, 448), border_mode='same', subsample=(1, 1))) 19 | model.add(LeakyReLU(alpha=0.1)) 20 | model.add(MaxPooling2D(pool_size=(2, 2))) 21 | model.add(Convolution2D(32, 3, 3, border_mode='same')) 22 | model.add(LeakyReLU(alpha=0.1)) 23 | model.add(MaxPooling2D(pool_size=(2, 2), border_mode='valid')) 24 | model.add(Convolution2D(64, 3, 3, border_mode='same')) 25 | model.add(LeakyReLU(alpha=0.1)) 26 | model.add(MaxPooling2D(pool_size=(2, 2), border_mode='valid')) 27 | model.add(Convolution2D(128, 3, 3, border_mode='same')) 28 | model.add(LeakyReLU(alpha=0.1)) 29 | model.add(MaxPooling2D(pool_size=(2, 2), border_mode='valid')) 30 | model.add(Convolution2D(256, 3, 3, border_mode='same')) 31 | model.add(LeakyReLU(alpha=0.1)) 32 | model.add(MaxPooling2D(pool_size=(2, 2), border_mode='valid')) 33 | model.add(Convolution2D(512, 3, 3, border_mode='same')) 34 | model.add(LeakyReLU(alpha=0.1)) 35 | model.add(MaxPooling2D(pool_size=(2, 2), border_mode='valid')) 36 | model.add(Convolution2D(1024, 3, 3, border_mode='same')) 37 | model.add(LeakyReLU(alpha=0.1)) 38 | model.add(Convolution2D(1024, 3, 3, border_mode='same')) 39 | model.add(LeakyReLU(alpha=0.1)) 40 | model.add(Convolution2D(1024, 3, 3, border_mode='same')) 41 | model.add(LeakyReLU(alpha=0.1)) 42 | model.add(Flatten()) 43 | model.add(Dense(256)) 44 | model.add(Dense(4096)) 45 | model.add(LeakyReLU(alpha=0.1)) 46 | model.add(Dense(1470)) 47 | 48 | print('model loaded.') 49 | 50 | # model.summary() 51 | 52 | load_weights(model, './weights/yolo-tiny.weights') 53 | print('weight loaded.') 54 | 55 | # predict = draw_test_img('./test_img/test*.jpg', model) 56 | 57 | 58 | def process_image(image): 59 | crop = image[300:650, 500:, :] 60 | resized = cv2.resize(crop, (448, 448)) 61 | batch = np.array([resized[:, :, 0], resized[:, :, 1], resized[:, :, 2]]) 62 | batch = 2 * (batch/255.) 
- 1
63 |     batch = np.expand_dims(batch, axis=0)
64 |     out = model.predict(batch)
65 |     boxes = yolo_boxes(out[0], threshold=0.2)
66 | 
67 |     # draw result
68 |     img_cp = np.copy(image)
69 |     img_cp = draw_background_highlight(img_cp, image)
70 |     img_cp = draw_box(boxes, np.copy(img_cp), [[500, 1280], [300, 650]])
71 | 
72 |     return img_cp
73 | 
74 | input_path = './input_videos/project.mp4'
75 | video_output = './output_videos/project_yolo_0.2.mp4'
76 | 
77 | # input_path = './input_videos/test_video.mp4'
78 | # video_output = './output_videos/test_yolo.mp4'
79 | 
80 | clip1 = VideoFileClip(input_path)
81 | # clip1 = VideoFileClip(input_path).subclip(29, 39)
82 | 
83 | t = time.time()
84 | final_clip = clip1.fl_image(process_image)
85 | final_clip.write_videofile(video_output, audio=False)
86 | t2 = time.time()
87 | print(round(t2 - t, 2), 'Seconds to process video...')
88 | 
89 | # image = mpimg.imread('./test_img/test5.jpg')
90 | # plt.figure()
91 | # plt.imshow(process_image(image))
92 | # plt.show()
93 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # **Vehicle Detection Project**
2 | 
3 | ## Overview
4 | 
5 | This vehicle detection project applies machine learning and computer vision techniques, combined with [advanced lane detection](https://github.com/uranus4ever/Advanced-Lane-Detection) techniques.
6 | 
7 | ![yolo-gif][gif]
8 | 
9 | I applied two different methods for detection. The steps of this project are the following:
10 | 
11 | **1) SVM Algorithm**
12 | 
13 | - Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a Linear SVM classifier.
14 | - Implement an efficient sliding-window technique and use the trained SVM classifier to search for vehicles in images.
15 | - Run a pipeline on a video stream and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles. [watch full video][video-SVM]
16 | 
17 | **2) YOLO Algorithm**
18 | 
19 | - Construct a Keras-based neural network and use a pre-trained model to predict on images.
20 | - Run a pipeline on a video stream and create a console to monitor lane status and detections. [watch full video][video-yolo]
21 | 
22 | ### Usage
23 | 
24 | - `Project-SVM.py` and `helper.py` contain the code for the SVM classifier structure and pipeline.
25 | - `dist.p` contains an SVM classifier trained on YUV color features and HOG features from 17,000+ car and not-car pictures.
26 | - `Project-yolo.py` and `helper_yolo.py` contain the code for the Keras network and pipeline.
27 | 
28 | ### Dependencies
29 | - Numpy
30 | - cv2
31 | - sklearn
32 | - scipy
33 | - skimage
34 | - keras
35 | 
36 | ---
37 | 
38 | ### **1) SVM Algorithm**
39 | 
40 | SVM (Support Vector Machine) is a powerful machine learning technique. In this project it is trained and used to classify image patches as car versus not-car.
41 | 
42 | #### 1. Collecting Data
43 | My main training data is downloaded from the [GTI vehicle image database](http://www.gti.ssr.upm.es/data/Vehicle_database.html) and [KITTI vision benchmark](http://www.cvlibs.net/datasets/kitti/) websites, which together contain about 8,700 pictures of cars and 8,900 of not-cars. In addition, to increase detection accuracy, I created about 20 not-car pictures from the video.
44 | 
45 | ![car and not-car][img1]
46 | 
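As a rough sketch of how the car/not-car file lists are assembled (the glob pattern mirrors `combine_feature()` in `helper.py`; the exact on-disk dataset layout is an assumption):

```
import glob

# The dataset images are 64x64 .png files; every not-car sample has
# 'non-vehicles' somewhere in its path, so the path itself is the label.
images = glob.glob('./test_img/*/*/*.png')
cars = [p for p in images if 'non-vehicles' not in p]
notcars = [p for p in images if 'non-vehicles' in p]
print('Car pic number = {}'.format(len(cars)))
print('Not Car pic number = {}'.format(len(notcars)))
```
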
47 | #### 2. Extracting Features
48 | 
49 | I explored different color spaces and different `skimage.hog()` parameters (`orientations`, `pixels_per_cell`, and `cells_per_block`) and made a comparison.
50 | 
51 | | Color Space | Accuracy | Training Time (CPU) |
52 | |:--:|:--:|:--:|
53 | | YUV | 97.75% | 65 s |
54 | | YCrCb | 98.11% | 51 s |
55 | | LUV | 98.23% | 59 s |
56 | | HLS | 98% | 60 s |
57 | | HSV | 97.8% | 112 s |
58 | 
59 | 
60 | The table above indicates that accuracy is almost the same across the different color spaces. To reduce false positives, I chose `YUV` to extract color features.
61 | 
62 | Here is an example using the `YUV` color space and HOG parameters of `orientations=15`, `pixels_per_cell=(8, 8)` and `cells_per_block=(2, 2)`:
63 | 
64 | ![feature][img2]
65 | 
66 | #### 3. Training classifier
67 | 
68 | I trained a linear SVM using the following code:
69 | 
70 | ```
71 | car_features = extract_features(cars, color_space, orient, pix_per_cell, cell_per_block, spatial_feat=False, hist_feat=False, hog_channel=hog_channel)
72 | notcar_features = extract_features(notcars, color_space, orient, pix_per_cell, cell_per_block, spatial_feat=False, hist_feat=False, hog_channel=hog_channel)
73 | # Create an array stack of feature vectors
74 | X = np.vstack((car_features, notcar_features)).astype(np.float64)
75 | # Fit a per-column scaler
76 | X_scaler = StandardScaler().fit(X)
77 | # Apply the scaler to X
78 | scaled_X = X_scaler.transform(X)
79 | 
80 | # Define the labels vector
81 | y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))
82 | # Split up data into randomized training and test sets
83 | rand_state = np.random.randint(0, 100)
84 | X_train, X_test, y_train, y_test = train_test_split(
85 |     scaled_X, y, test_size=0.2, random_state=rand_state)
86 | # Use a linear SVC
87 | svc = LinearSVC()
88 | svc.fit(X_train, y_train)
89 | ```
90 | 
91 | Note that feature normalization through `sklearn.preprocessing.StandardScaler` is one of the key steps before training. Here is a comparison:
92 | 
93 | ![normalization][img3]
94 | 
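Once trained, the classifier and the fitted scaler are saved together so that the search step can reload them. A minimal sketch (the dictionary keys mirror the `dist` returned by `SVM_combine_classify()` in `helper.py` and read back by `find_cars()`; `spatial` and `histbin` are the variable names used in that helper):

```
import pickle

# Bundle everything find_cars() needs: the classifier, the per-column
# scaler, and the feature parameters that were used at training time.
dist = {'svc': svc, 'X_scaler': X_scaler, 'orient': orient,
        'pix_per_cell': pix_per_cell, 'cell_per_block': cell_per_block,
        'spatial_size': (spatial, spatial), 'hist_bins': histbin}
with open('dist.p', 'wb') as f:
    pickle.dump(dist, f)
```

`Project-SVM.py` then restores it with `dist_pickle = pickle.load(open("dist.p", "rb"))`.
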
95 | #### 4. Sliding window
96 | 
97 | An efficient sliding-window search is applied, one that only needs to extract the HOG features once per frame.
98 | 
99 | ```
100 | def find_cars(img, ystart=400, ystop=656, scale=1.5):
101 | 
102 |     draw_img = np.copy(img)
103 |     img = img.astype(np.float32) / 255
104 | 
105 |     img_tosearch = img[ystart:ystop, :, :]
106 |     ctrans_tosearch = convert_color(img_tosearch, conv='RGB2YUV')
107 |     cspace = 'YUV'
108 |     if scale != 1:
109 |         imshape = ctrans_tosearch.shape
110 |         ctrans_tosearch = cv2.resize(ctrans_tosearch, (np.int(imshape[1] / scale), np.int(imshape[0] / scale)))
111 | 
112 |     ch1 = ctrans_tosearch[:, :, 0]
113 |     ch2 = ctrans_tosearch[:, :, 1]
114 |     ch3 = ctrans_tosearch[:, :, 2]
115 | 
116 |     # Define blocks and steps as above
117 |     nxblocks = (ch1.shape[1] // pix_per_cell) - cell_per_block + 1
118 |     nyblocks = (ch1.shape[0] // pix_per_cell) - cell_per_block + 1
119 |     nfeat_per_block = orient * cell_per_block ** 2
120 | 
121 |     # 64 was the original sampling rate, with 8 cells and 8 pix per cell
122 |     window = 64
123 |     nblocks_per_window = (window // pix_per_cell) - cell_per_block + 1
124 |     cells_per_step = 2  # Instead of overlap, define how many cells to step
125 |     nxsteps = (nxblocks - nblocks_per_window) // cells_per_step
126 |     nysteps = (nyblocks - nblocks_per_window) // cells_per_step
127 | 
128 |     # Compute individual channel HOG features for the entire image
129 |     hog1 = get_hog_features(ch1, orient, pix_per_cell, cell_per_block, feature_vec=False)
130 |     hog2 = get_hog_features(ch2, orient, pix_per_cell, cell_per_block, feature_vec=False)
131 |     hog3 = get_hog_features(ch3, orient, pix_per_cell, cell_per_block, feature_vec=False)
132 | 
133 |     bbox_list = []
134 | 
135 |     for xb in range(nxsteps):
136 |         for yb in range(nysteps):
137 |             ypos = yb * cells_per_step
138 |             xpos = xb * cells_per_step
139 |             # Extract HOG for this patch
140 |             hog_feat1 = hog1[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
141 |             hog_feat2 = hog2[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
142 |             hog_feat3 = hog3[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
143 |             hog_features = np.hstack((hog_feat1, hog_feat2, hog_feat3))
144 | 
145 |             xleft = xpos * pix_per_cell
146 |             ytop = ypos * pix_per_cell
147 | 
148 |             # Extract the image patch
149 |             subimg = cv2.resize(ctrans_tosearch[ytop:ytop + window, xleft:xleft + window], (64, 64))
150 | 
151 |             # Get color features
152 |             spatial_features = bin_spatial(subimg, color_space=cspace, size=spatial_size)
153 |             hist_features = color_hist(subimg, nbins=hist_bins)
154 | 
155 |             # Scale features and make a prediction
156 |             test_features = X_scaler.transform(
157 |                 np.hstack((spatial_features, hist_features, hog_features)).reshape(1, -1))
158 |             # test_features = X_scaler.transform(np.hstack((shape_feat, hist_feat)).reshape(1, -1))
159 |             test_prediction = svc.predict(test_features)
160 | 
161 |             if test_prediction == 1:
162 |                 xbox_left = np.int(xleft * scale)
163 |                 ytop_draw = np.int(ytop * scale)
164 |                 win_draw = np.int(window * scale)
165 |                 cv2.rectangle(draw_img, (xbox_left, ytop_draw + ystart),
166 |                               (xbox_left + win_draw, ytop_draw + win_draw + ystart), (0, 0, 255), 6)
167 |                 bbox_list.append(((xbox_left, ytop_draw + ystart), (xbox_left + win_draw, ytop_draw + win_draw + ystart)))
168 | 
169 |     return bbox_list, draw_img
170 | ```
171 | 
172 | Additionally, the search is run at multiple window scales, with different scale values for different image regions.
173 | 
174 | ![multi box][img4]
175 | 
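The nine per-scale searches in `Project-SVM.py` can be condensed into one loop over `(ystart, ystop, scale)` triples. A sketch using the same parameters as the script:

```
search_params = [(400, 500, 1.0), (400, 500, 1.3), (410, 500, 1.4),
                 (420, 556, 1.6), (430, 556, 1.8), (440, 556, 1.9),
                 (430, 556, 2.0), (400, 556, 2.2), (500, 656, 3.0)]

boxes = []
for ystart, ystop, scale in search_params:
    # Small scales search near the horizon; large scales near the camera.
    multibox, _ = find_cars(image, dist_pickle, ystart=ystart, ystop=ystop, scale=scale)
    boxes.extend(multibox)
```
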
176 | #### 5. Filtering False Positives by Heatmap
177 | 
178 | A heatmap with a suitable threshold is an effective way to filter false positives and to merge multiple overlapping detections. I then used `scipy.ndimage.measurements.label()` to identify individual blobs in the heatmap.
179 | 
180 | ```
181 | def add_heat(heatmap, bbox_list):
182 |     # Iterate through list of bboxes
183 |     for box in bbox_list:
184 |         # Add += 1 for all pixels inside each bbox
185 |         # Assuming each "box" takes the form ((x1, y1), (x2, y2))
186 |         heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1
187 | 
188 |     return heatmap
189 | ```
190 | ```
191 | heatmap = apply_threshold(heatmap, 2)
192 | labels = label(heatmap)
193 | ```
194 | 
195 | ![heatmap filter][img5]
196 | 
197 | ![label][img6]
198 | 
199 | #### 6. Video Implementation
200 | 
201 | The video stream is processed as a series of images using the techniques above. To make detections smoother between frames, I built a simple queue of historical heatmaps that links consecutive frames together.
202 | 
203 | ```
204 | history = deque(maxlen=8)
205 | current_heatmap = np.clip(heat, 0, 255)
206 | history.append(current_heatmap)
207 | ```
208 | 
209 | ### **2) YOLO Algorithm**
210 | 
211 | [YOLO](https://arxiv.org/pdf/1506.02640) (You Only Look Once) is a popular end-to-end **Real-time Object Detection** algorithm based on deep learning. Compared with other object recognition methods, such as Fast R-CNN, YOLO integrates target localization and object classification into a single neural network. Its most outstanding point is its speed combined with reasonably high accuracy: nearly 45 fps for the base version and up to 155 fps for Fast YOLO, which is quite favourable for real-time applications, for example the computer vision of a self-driving car.
212 | 
213 | #### 1. Principle
214 | 
215 | YOLO uses a single unified neural network, which makes full use of whole-image information for bounding-box identification and classification. It divides the image into an *S×S* grid, and for each grid cell predicts *B* bounding boxes, confidence scores for those boxes, and *C* class probabilities. With *S=7*, *B=2* and *C=20* as used here, the output is a 1470-dimensional vector, 7×7×(20 + 2×5), containing class probabilities, confidences and box coordinates.
216 | ![model][img8]
217 | 
218 | It has the following 20 classes:
219 | ```
220 | classes = ["aeroplane", "bicycle", "bird", "boat", "bottle",
221 |            "bus", "car", "cat", "chair", "cow",
222 |            "diningtable", "dog", "horse", "motorbike", "person",
223 |            "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
224 | ```
225 | 
226 | In this project, I used tiny YOLO v1 as it is easy to implement and impressively fast.
227 | 
228 | #### 2. Pipeline
229 | 
230 | First, construct a convolutional neural network architecture in Keras, containing 9 convolutional layers and 3 fully connected layers.
231 | ```
232 | ____________________________________________________________________________________________________
233 | Layer (type)                     Output Shape          Param #     Connected to
234 | ====================================================================================================
235 | convolution2d_1 (Convolution2D)  (None, 16, 448, 448)  448         convolution2d_input_1[0][0]
236 | ____________________________________________________________________________________________________
237 | leakyrelu_1 (LeakyReLU)          (None, 16, 448, 448)  0           convolution2d_1[0][0]
238 | ____________________________________________________________________________________________________
239 | maxpooling2d_1 (MaxPooling2D)    (None, 16, 224, 224)  0           leakyrelu_1[0][0]
240 | ____________________________________________________________________________________________________
241 | convolution2d_2 (Convolution2D)  (None, 32, 224, 224)  4640        maxpooling2d_1[0][0]
242 | ____________________________________________________________________________________________________
243 | leakyrelu_2 (LeakyReLU)          (None, 32, 224, 224)  0           convolution2d_2[0][0]
244 | ____________________________________________________________________________________________________
245 | maxpooling2d_2 (MaxPooling2D)    (None, 32, 112, 112)  0           leakyrelu_2[0][0]
246 | ____________________________________________________________________________________________________
247 | convolution2d_3 (Convolution2D)  (None, 64, 112, 112)  18496       maxpooling2d_2[0][0]
248 | ____________________________________________________________________________________________________
249 | leakyrelu_3 (LeakyReLU)          (None, 64, 112, 112)  0           convolution2d_3[0][0]
250 | ____________________________________________________________________________________________________
251 | maxpooling2d_3 (MaxPooling2D)    (None, 64, 56, 56)    0           leakyrelu_3[0][0]
252 | ____________________________________________________________________________________________________
253 | convolution2d_4 (Convolution2D)  (None, 128, 56, 56)   73856       maxpooling2d_3[0][0]
254 | ____________________________________________________________________________________________________
255 | leakyrelu_4 (LeakyReLU)          (None, 128, 56, 56)   0           convolution2d_4[0][0]
256 | ____________________________________________________________________________________________________
257 | maxpooling2d_4 (MaxPooling2D)    (None, 128, 28, 28)   0           leakyrelu_4[0][0]
258 | ____________________________________________________________________________________________________
259 | convolution2d_5 (Convolution2D)  (None, 256, 28, 28)   295168      maxpooling2d_4[0][0]
260 | ____________________________________________________________________________________________________
261 | leakyrelu_5 (LeakyReLU)          (None, 256, 28, 28)   0           convolution2d_5[0][0]
262 | ____________________________________________________________________________________________________
263 | maxpooling2d_5 (MaxPooling2D)    (None, 256, 14, 14)   0           leakyrelu_5[0][0]
264 | ____________________________________________________________________________________________________
265 | convolution2d_6 (Convolution2D)  (None, 512, 14, 14)   1180160     maxpooling2d_5[0][0]
266 | ____________________________________________________________________________________________________
267 | leakyrelu_6 (LeakyReLU)          (None, 512, 14, 14)   0           convolution2d_6[0][0]
268 | ____________________________________________________________________________________________________
269 | maxpooling2d_6 (MaxPooling2D)    (None, 512, 7, 7)     0           leakyrelu_6[0][0]
270 | ____________________________________________________________________________________________________
271 | convolution2d_7 (Convolution2D)  (None, 1024, 7, 7)    4719616     maxpooling2d_6[0][0]
272 | ____________________________________________________________________________________________________
273 | leakyrelu_7 (LeakyReLU)          (None, 1024, 7, 7)    0           convolution2d_7[0][0]
274 | ____________________________________________________________________________________________________
275 | convolution2d_8 (Convolution2D)  (None, 1024, 7, 7)    9438208     leakyrelu_7[0][0]
276 | ____________________________________________________________________________________________________
277 | leakyrelu_8 (LeakyReLU)          (None, 1024, 7, 7)    0           convolution2d_8[0][0]
278 | ____________________________________________________________________________________________________
279 | convolution2d_9 (Convolution2D)  (None, 1024, 7, 7)    9438208     leakyrelu_8[0][0]
280 | ____________________________________________________________________________________________________
281 | leakyrelu_9 (LeakyReLU)          (None, 1024, 7, 7)    0           convolution2d_9[0][0]
282 | ____________________________________________________________________________________________________
283 | flatten_1 (Flatten)              (None, 50176)         0           leakyrelu_9[0][0]
284 | ____________________________________________________________________________________________________
285 | dense_1 (Dense)                  (None, 256)           12845312    flatten_1[0][0]
286 | ____________________________________________________________________________________________________
287 | dense_2 (Dense)                  (None, 4096)          1052672     dense_1[0][0]
288 | ____________________________________________________________________________________________________
289 | leakyrelu_10 (LeakyReLU)         (None, 4096)          0           dense_2[0][0]
290 | ____________________________________________________________________________________________________
291 | dense_3 (Dense)                  (None, 1470)          6022590     leakyrelu_10[0][0]
292 | ====================================================================================================
293 | Total params: 45,089,374
294 | Trainable params: 45,089,374
295 | Non-trainable params: 0
296 | ____________________________________________________________________________________________________
297 | ```
298 | 
299 | Then, load the pre-trained weights (172 MB, [link](https://drive.google.com/file/d/0B1tW_VtY7onibmdQWE1zVERxcjQ/view?usp=sharing)), since training the network from scratch is really time-consuming.
300 | 
301 | After the weights are loaded, detected bounding boxes can be drawn onto the images, and the pipeline is finally applied to the video stream with a confidence *threshold=0.2*.
302 | 
303 | ![find_car][img9]
304 | 
305 | 
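For reference, here is the per-frame pipeline from `Project-yolo.py`, condensed (the console background highlight drawn by `draw_background_highlight()` is omitted for brevity): crop the road region, resize to the 448×448 network input, reorder to a channel-first batch scaled to [-1, 1], predict, then decode the 1470-dimensional output with `yolo_boxes()`.

```
def process_image(image):
    crop = image[300:650, 500:, :]           # keep only the road region
    resized = cv2.resize(crop, (448, 448))   # tiny-YOLO input size
    batch = np.array([resized[:, :, 0], resized[:, :, 1], resized[:, :, 2]])
    batch = 2 * (batch / 255.) - 1           # scale pixels to [-1, 1]
    batch = np.expand_dims(batch, axis=0)
    out = model.predict(batch)
    boxes = yolo_boxes(out[0], threshold=0.2)
    # Map the boxes from the cropped 448x448 frame back onto the full image
    return draw_box(boxes, np.copy(image), [[500, 1280], [300, 650]])
```
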
306 | ## Reflection
307 | 
308 | ### 1. Discussion
309 | 
310 | SVM achieves acceptable detection accuracy; however, it has two shortcomings:
311 | 
312 | 1. Because of the heatmap thresholding, the bounding box often appears unstable and smaller than the actual size of the detected car.
313 | 2. The processing speed is only about 2 fps on account of the sliding-window search. Even with GPU parallel computing, it is not favorable for real-time application.
314 | 
315 | YOLO is much preferable because its strengths address exactly SVM's shortcomings:
316 | 
317 | 1. Better detection and more stable bounding-box positions.
318 | 2. Real-time processing speed, nearly 40 fps.
319 | 
320 | Note that YOLO has limitations of its own:
321 | 
322 | 1. Only two boxes and one class are predicted per grid cell, which decreases detection accuracy for adjacent objects.
323 | 2. YOLO learns from its training data; as a result, it performs poorly on new object types or unusual view angles.
324 | 
325 | ### 2. Next Plan
326 | 
327 | 1. As YOLO and Fast R-CNN produce different types of false detections, the two models could be integrated to enhance performance.
328 | 2. Apply YOLO to more videos and try classes other than cars.
329 | 
330 | ## Reference
331 | 1. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, [You Only Look Once: Unified, Real-Time Object Detection](https://arxiv.org/pdf/1506.02640), arXiv:1506.02640 (2015).
332 | 2. [darkflow](https://github.com/thtrieu/darkflow)
333 | 3. [yolo_tensorflow](https://github.com/hizhangp/yolo_tensorflow)
334 | 4. [YOLO Introduction](https://zhuanlan.zhihu.com/p/25045711)
335 | 
336 | [//]: # (Image References)
337 | [img1]: ./Image/car_not_car.PNG
338 | [img2]: ./Image/feature.png
339 | [img3]: ./Image/Normalize_Feature_HSV.png
340 | [img4]: ./Image/find_car.png
341 | [img5]: ./Image/heatmap1.png
342 | [img6]: ./Image/heatmap.png
343 | [img8]: ./Image/yolo-box.PNG
344 | [img9]: ./Image/find_car_yolo.png
345 | [gif]: ./Image/yolo.gif
346 | [video-SVM]: ./output_videos/project_SVM.mp4
347 | [video-yolo]: ./output_videos/project_yolo.mp4
348 | 
--------------------------------------------------------------------------------
/__pycache__/helper.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/__pycache__/helper.cpython-35.pyc
--------------------------------------------------------------------------------
/__pycache__/helper_yolo.cpython-35.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/__pycache__/helper_yolo.cpython-35.pyc
--------------------------------------------------------------------------------
/dist.p:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/dist.p
--------------------------------------------------------------------------------
/helper.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import cv2
3 | import matplotlib.pyplot as plt
4 | import matplotlib.image as mpimg
5 | from mpl_toolkits.mplot3d import Axes3D
6 | from skimage.feature import hog
7 | from sklearn.preprocessing import StandardScaler
8 | from sklearn.model_selection import train_test_split
9 | import time
10 | from sklearn.svm import LinearSVC
11 | from scipy.ndimage.measurements import label
12 | import glob
13 | import pickle
14 | from sklearn.utils import shuffle
15 | 
16 | 
17 | def draw_boxes(img, bboxes, color=(0, 0, 255), thickness=6):
18 |     imgcopy = np.copy(img)
19 |     for bbox in bboxes:
20 |         cv2.rectangle(imgcopy, bbox[0], bbox[1], color, thickness)
21 |     return imgcopy
22 | 
23 | 
24 | def color_hist(img, nbins=32, bins_range=(0, 256), plot=False):
25 |     # Compute the histogram of the color channels separately
26 |     channel1_hist = np.histogram(img[:, :, 0], bins=nbins, range=bins_range)
27 |     channel2_hist = np.histogram(img[:, :, 1], bins=nbins, range=bins_range)
28 |     channel3_hist = np.histogram(img[:, :, 2], bins=nbins, range=bins_range)
29 |     # Concatenate the histograms into a single feature vector
30 |     hist_features = np.concatenate((channel1_hist[0],
channel2_hist[0], channel3_hist[0])) 31 | 32 | bin_edges = channel1_hist[1] 33 | bin_centers = (bin_edges[1:] + bin_edges[0:len(bin_edges) - 1]) / 2 34 | 35 | if plot is True: 36 | fig = plt.figure(figsize=(12, 3)) 37 | plt.subplot(131) 38 | plt.bar(bin_centers, channel1_hist[0]) 39 | plt.xlim(0, 256) 40 | plt.title('Channel 1 Histogram') 41 | plt.subplot(132) 42 | plt.bar(bin_centers, channel2_hist[0]) 43 | plt.xlim(0, 256) 44 | plt.title('Channel 2 Histogram') 45 | plt.subplot(133) 46 | plt.bar(bin_centers, channel3_hist[0]) 47 | plt.xlim(0, 256) 48 | plt.title('Channel 3 Histogram') 49 | 50 | # Return the individual histograms, bin_centers and feature vector 51 | return hist_features 52 | 53 | 54 | def plot3d(pixels, colors_rgb, axis_labels=list("RGB"), 55 | axis_limits=[(0, 255), (0, 255), (0, 255)], plot=False): 56 | """Plot pixels in 3D.""" 57 | 58 | # Create figure and 3D axes 59 | fig = plt.figure(figsize=(8, 8)) 60 | ax = Axes3D(fig) 61 | 62 | # Set axis limits 63 | ax.set_xlim(*axis_limits[0]) 64 | ax.set_ylim(*axis_limits[1]) 65 | ax.set_zlim(*axis_limits[2]) 66 | 67 | # Set axis labels and sizes 68 | ax.tick_params(axis='both', which='major', labelsize=14, pad=8) 69 | ax.set_xlabel(axis_labels[0], fontsize=16, labelpad=16) 70 | ax.set_ylabel(axis_labels[1], fontsize=16, labelpad=16) 71 | ax.set_zlabel(axis_labels[2], fontsize=16, labelpad=16) 72 | 73 | # Plot pixel values with colors given in colors_rgb 74 | ax.scatter( 75 | pixels[:, :, 0].ravel(), 76 | pixels[:, :, 1].ravel(), 77 | pixels[:, :, 2].ravel(), 78 | c=colors_rgb.reshape((-1, 3)), edgecolors='none') 79 | 80 | if plot: 81 | # Read a color image 82 | img = cv2.imread("275.png") 83 | 84 | # Select a small fraction of pixels to plot by subsampling it 85 | scale = max(img.shape[0], img.shape[1], 64) / 64 # at most 64 rows and columns 86 | img_small = cv2.resize(img, (np.int(img.shape[1] / scale), np.int(img.shape[0] / scale)), 87 | interpolation=cv2.INTER_NEAREST) 88 | 89 | # Convert subsampled image to desired color space(s) 90 | img_small_RGB = cv2.cvtColor(img_small, cv2.COLOR_BGR2RGB) # OpenCV uses BGR, matplotlib likes RGB 91 | img_small_HSV = cv2.cvtColor(img_small, cv2.COLOR_BGR2HSV) 92 | img_small_rgb = img_small_RGB / 255. 
# scaled to [0, 1], only for plotting 93 | 94 | # Plot and show 95 | plot3d(img_small_RGB, img_small_rgb) 96 | plt.show() 97 | 98 | plot3d(img_small_HSV, img_small_rgb, axis_labels=list("HSV")) 99 | plt.show() 100 | 101 | return ax # return Axes3D object for further manipulation 102 | 103 | 104 | def bin_spatial(img, color_space='RGB', size=(32, 32)): 105 | # Convert image to new color space (if specified) 106 | if color_space != 'RGB': 107 | if color_space == 'HSV': 108 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HSV) 109 | elif color_space == 'LUV': 110 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2LUV) 111 | elif color_space == 'HLS': 112 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HLS) 113 | elif color_space == 'YUV': 114 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YUV) 115 | elif color_space == 'YCrCb': 116 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb) 117 | else: 118 | feature_image = np.copy(img) 119 | # Use cv2.resize().ravel() to create the feature vector 120 | features = cv2.resize(feature_image, size).ravel() 121 | # Return the feature vector 122 | return features 123 | 124 | 125 | # Define a function to return some characteristics of the dataset 126 | def data_look(car_list, notcar_list): 127 | data_dict = {} 128 | # Define a key in data_dict "n_cars" and store the number of car images 129 | data_dict["n_cars"] = len(car_list) 130 | # Define a key "n_notcars" and store the number of notcar images 131 | data_dict["n_notcars"] = len(notcar_list) 132 | # Read in a test image, either car or notcar 133 | example_img = mpimg.imread(car_list[0]) 134 | # Define a key "image_shape" and store the test image shape 3-tuple 135 | data_dict["image_shape"] = example_img.shape 136 | # Define a key "data_type" and store the data type of the test image. 
137 | data_dict["data_type"] = example_img.dtype 138 | # Return data_dict 139 | return data_dict 140 | 141 | 142 | def convert_color(img, conv='RGB2YUV'): 143 | if conv == 'RGB2YUV': 144 | return cv2.cvtColor(img, cv2.COLOR_RGB2YUV) 145 | if conv == 'BGR2YCrCb': 146 | return cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb) 147 | if conv == 'RGB2LUV': 148 | return cv2.cvtColor(img, cv2.COLOR_RGB2LUV) 149 | 150 | 151 | # Define a function to return HOG features and visualization 152 | def get_hog_features(ch, orient=9, pix_per_cell=8, cell_per_block=2, vis=False, feature_vec=True): 153 | # image is a channel of image 154 | if vis is True: 155 | features, hog_image = hog(ch, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell), 156 | cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=False, 157 | visualise=True, feature_vector=False, block_norm="L2-Hys") 158 | plt.figure() 159 | plt.subplot(121) 160 | plt.imshow(ch, cmap='gray') 161 | plt.title('L channel') 162 | plt.subplot(122) 163 | plt.imshow(hog_image, cmap='gray') 164 | plt.title('HOG') 165 | plt.show() 166 | 167 | return features, hog_image 168 | else: 169 | features = hog(ch, orientations=orient, pixels_per_cell=(pix_per_cell, pix_per_cell), 170 | cells_per_block=(cell_per_block, cell_per_block), transform_sqrt=False, 171 | visualise=False, feature_vector=feature_vec, block_norm="L2-Hys") 172 | return features 173 | 174 | 175 | # Define a function to extract features from a list of images 176 | # Have this function call bin_spatial() and color_hist() 177 | def extract_features(imgs, color_space='RGB', spatial_size=(32, 32), 178 | hist_bins=32, orient=9, 179 | pix_per_cell=8, cell_per_block=2, hog_channel=0, 180 | spatial_feat=True, hist_feat=True, hog_feat=True): 181 | # Create a list to append feature vectors to 182 | features = [] 183 | # Iterate through the list of images 184 | for file in imgs: 185 | file_features = [] 186 | # Read in each one by one 187 | image = mpimg.imread(file) 188 | # apply color conversion if other than 'RGB' 189 | if color_space != 'RGB': 190 | if color_space == 'HSV': 191 | feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV) 192 | elif color_space == 'LUV': 193 | feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV) 194 | elif color_space == 'HLS': 195 | feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS) 196 | elif color_space == 'YUV': 197 | feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV) 198 | elif color_space == 'YCrCb': 199 | feature_image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb) 200 | else: 201 | feature_image = np.copy(image) 202 | 203 | if spatial_feat is True: 204 | spatial_features = bin_spatial(feature_image, size=spatial_size) 205 | file_features.append(spatial_features) 206 | if hist_feat is True: 207 | # Apply color_hist() 208 | hist_features = color_hist(feature_image, nbins=hist_bins) 209 | file_features.append(hist_features) 210 | if hog_feat is True: 211 | # Call get_hog_features() with vis=False, feature_vec=True 212 | if hog_channel == 3: # All channels 213 | hog_features = [] 214 | for channel in range(feature_image.shape[2]): 215 | hog_features.append(get_hog_features(feature_image[:, :, channel], 216 | orient, pix_per_cell, cell_per_block, 217 | vis=False, feature_vec=True)) 218 | hog_features = np.ravel(hog_features) 219 | else: 220 | hog_features = get_hog_features(feature_image[:, :, hog_channel], orient, 221 | pix_per_cell, cell_per_block, vis=False, feature_vec=True) 222 | # Append the new feature vector to the features list 223 | 
file_features.append(hog_features) 224 | features.append(np.concatenate(file_features)) 225 | # Return list of feature vectors 226 | return features 227 | 228 | 229 | def combine_feature(cspace='LUV', samples=100, plot=False): 230 | # Divide up into cars and notcars 231 | # cars and notcars are png pic 232 | # mpimg.imread returns 0-1; cv2.imread returns 0-255 233 | images = glob.glob('./test_img/*/*/*.png') 234 | cars = [] 235 | notcars = [] 236 | for image in images: 237 | if 'non-vehicles' in image: 238 | notcars.append(image) 239 | else: 240 | cars.append(image) 241 | print('Not Car pic number = {}'.format(len(notcars))) 242 | print('Car pic number = {}'.format(len(cars))) 243 | car_features = extract_features(cars[:samples], cspace, spatial_size=(32, 32), 244 | hist_bins=32) 245 | notcar_features = extract_features(notcars[:samples], cspace, spatial_size=(32, 32), 246 | hist_bins=32) 247 | 248 | if len(car_features) > 0: 249 | # Create an array stack of feature vectors 250 | X = np.vstack((car_features, notcar_features)).astype(np.float64) 251 | # Fit a per-column scaler 252 | X_scaler = StandardScaler().fit(X) 253 | # Apply the scaler to X 254 | scaled_X = X_scaler.transform(X) 255 | car_ind = np.random.randint(0, len(cars)) 256 | if plot is True: 257 | # Plot an example of raw and scaled features 258 | fig = plt.figure(figsize=(12, 4)) 259 | plt.subplot(131) 260 | plt.imshow(mpimg.imread(cars[10])) 261 | plt.title('Original Image') 262 | plt.subplot(132) 263 | plt.plot(X[10]) 264 | plt.title('Raw Features') 265 | plt.subplot(133) 266 | plt.plot(scaled_X[10]) 267 | plt.title('Normalized Features') 268 | fig.tight_layout() 269 | else: 270 | print('Your function only returns empty feature vectors...') 271 | 272 | return scaled_X, cars, notcars 273 | 274 | 275 | def SVM_color_classify(cars, notcars, samples=300): 276 | spatial = 32 277 | histbin = 32 278 | 279 | car_features = extract_features(cars[:samples], color_space='LUV', spatial_size=(spatial, spatial), 280 | hist_bins=histbin, hog_feat=False) 281 | notcar_features = extract_features(notcars[:samples], color_space='LUV', spatial_size=(spatial, spatial), 282 | hist_bins=histbin, hog_feat=False) 283 | 284 | # Create an array stack of feature vectors 285 | X = np.vstack((car_features, notcar_features)).astype(np.float64) 286 | # Fit a per-column scaler 287 | X_scaler = StandardScaler().fit(X) 288 | # Apply the scaler to X 289 | scaled_X = X_scaler.transform(X) 290 | 291 | # Define the labels vector. 1 - Cars; 0 - Not Cars. 
292 | y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features)))) 293 | 294 | # Split up data into randomized training and test sets 295 | rand_state = np.random.randint(0, 100) 296 | X_train, X_test, y_train, y_test = train_test_split( 297 | scaled_X, y, test_size=0.2, random_state=rand_state) 298 | 299 | print('Using spatial binning of:', spatial, 300 | 'and', histbin, 'histogram bins') 301 | print('Feature vector length:', len(X_train[0])) 302 | # Use a linear SVC 303 | svc = LinearSVC() 304 | # Check the training time for the SVC 305 | t = time.time() 306 | svc.fit(X_train, y_train) 307 | t2 = time.time() 308 | print(round(t2 - t, 2), 'Seconds to train SVC...') 309 | # Check the score of the SVC 310 | print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4)) 311 | # Check the prediction time for a single sample 312 | t = time.time() 313 | n_predict = 10 314 | print('My SVC predicts: ', svc.predict(X_test[0:n_predict])) 315 | print('For these', n_predict, 'labels: ', y_test[0:n_predict]) 316 | t2 = time.time() 317 | print(round(t2 - t, 5), 'Seconds to predict', n_predict, 'labels with SVC') 318 | 319 | return svc, spatial, histbin 320 | 321 | 322 | def SVM_HOG_classify(cars, notcars, samples=300): 323 | # Reduce the sample size because HOG features are slow to compute 324 | # The quiz evaluator times out after 13s of CPU time 325 | 326 | cars = cars[0:samples] 327 | notcars = notcars[0:samples] 328 | 329 | colorspace = 'YUV' # Can be RGB, HSV, LUV, HLS, YUV, YCrCb 330 | orient = 9 331 | pix_per_cell = 8 332 | cell_per_block = 2 333 | hog_channel = 'ALL' # Can be 0, 1, 2, or "ALL" 334 | 335 | t = time.time() 336 | car_features = extract_features(cars, color_space=colorspace, orient=orient, 337 | pix_per_cell=pix_per_cell, cell_per_block=cell_per_block, 338 | spatial_feat=False, hist_feat=False, 339 | hog_channel=hog_channel) 340 | notcar_features = extract_features(notcars, color_space=colorspace, orient=orient, 341 | pix_per_cell=pix_per_cell, cell_per_block=cell_per_block, 342 | spatial_feat=False, hist_feat=False, 343 | hog_channel=hog_channel) 344 | t2 = time.time() 345 | print(round(t2 - t, 2), 'Seconds to extract HOG features...') 346 | # Create an array stack of feature vectors 347 | X = np.vstack((car_features, notcar_features)).astype(np.float64) 348 | # Fit a per-column scaler 349 | X_scaler = StandardScaler().fit(X) 350 | # Apply the scaler to X 351 | scaled_X = X_scaler.transform(X) 352 | 353 | # Define the labels vector 354 | y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features)))) 355 | 356 | # Split up data into randomized training and test sets 357 | rand_state = np.random.randint(0, 100) 358 | X_train, X_test, y_train, y_test = train_test_split( 359 | scaled_X, y, test_size=0.2, random_state=rand_state) 360 | 361 | print('Using:', orient, 'orientations', pix_per_cell, 362 | 'pixels per cell and', cell_per_block, 'cells per block') 363 | print('Feature vector length:', len(X_train[0])) 364 | # Use a linear SVC 365 | svc = LinearSVC() 366 | # Check the training time for the SVC 367 | t = time.time() 368 | svc.fit(X_train, y_train) 369 | t2 = time.time() 370 | print(round(t2 - t, 2), 'Seconds to train SVC...') 371 | # Check the score of the SVC 372 | print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4)) 373 | # Check the prediction time for a single sample 374 | t = time.time() 375 | n_predict = 10 376 | print('My SVC predicts: ', svc.predict(X_test[0:n_predict])) 377 | print('For these', n_predict, 'labels: ', 
y_test[0:n_predict]) 378 | t2 = time.time() 379 | print(round(t2 - t, 5), 'Seconds to predict', n_predict, 'labels with SVC') 380 | 381 | dist = {'svc_HOG': svc, 382 | 'scaled_X': scaled_X, 383 | 'orient': orient, 384 | 'pix_per_cell': pix_per_cell, 385 | 'cell_per_block': cell_per_block} 386 | 387 | return dist 388 | 389 | 390 | def SVM_combine_classify(cars, notcars, csapce='LUV', samples=300): 391 | spatial = 32 392 | histbin = 32 393 | color_space = csapce # Can be RGB, HSV, LUV, HLS, YUV, YCrCb 394 | orient = 15 395 | pix_per_cell = 8 396 | cell_per_block = 2 397 | 398 | t = time.time() 399 | car_features = extract_features(cars[:samples], color_space, spatial_size=(spatial, spatial), 400 | hist_bins=histbin, orient=orient, pix_per_cell=pix_per_cell, 401 | cell_per_block=cell_per_block, hog_channel=3) 402 | notcar_features = extract_features(notcars[:samples], color_space, spatial_size=(spatial, spatial), 403 | hist_bins=histbin, orient=orient, pix_per_cell=pix_per_cell, 404 | cell_per_block=cell_per_block, hog_channel=3) 405 | 406 | t2 = time.time() 407 | print(round(t2 - t, 2), 'Seconds to extract features...') 408 | 409 | # Create an array stack of feature vectors 410 | X = np.vstack((car_features, notcar_features)).astype(np.float64) 411 | # Fit a per-column scaler 412 | X_scaler = StandardScaler().fit(X) 413 | # Apply the scaler to X 414 | scaled_X = X_scaler.transform(X) 415 | 416 | # Define the labels vector. 1 - Cars; 0 - Not Cars. 417 | y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features)))) 418 | 419 | # Split up data into randomized training and test sets 420 | rand_state = np.random.randint(0, 100) 421 | scaled_X, y = shuffle(scaled_X, y) 422 | X_train, X_test, y_train, y_test = train_test_split( 423 | scaled_X, y, test_size=0.2, random_state=rand_state) 424 | 425 | print('Training pic num: ', len(y_train)) 426 | print('Using spatial binning of:', spatial, 427 | 'and', histbin, 'histogram bins') 428 | print('Using:', orient, 'orientations', pix_per_cell, 429 | 'pixels per cell and', cell_per_block, 'cells per block') 430 | print('Feature vector length:', len(X_train[0])) 431 | # Use a linear SVC 432 | svc = LinearSVC() 433 | # Check the training time for the SVC 434 | t = time.time() 435 | svc.fit(X_train, y_train) 436 | t2 = time.time() 437 | print(round(t2 - t, 2), 'Seconds to train SVC...') 438 | # Check the score of the SVC 439 | print('Test Accuracy of SVC = ', round(svc.score(X_test, y_test), 4)) 440 | # Check the prediction time for a single sample 441 | t3 = time.time() 442 | n_predict = 15 443 | print('My SVC predicts: ', svc.predict(X_test[0:n_predict])) 444 | print('For these', n_predict, 'labels: ', y_test[0:n_predict]) 445 | t4 = time.time() 446 | print(round(t4 - t3, 5), 'Seconds to predict', n_predict, 'labels with SVC') 447 | 448 | dist = {'svc': svc, 449 | 'X_scaler': X_scaler, 450 | 'orient': orient, 451 | 'pix_per_cell': pix_per_cell, 452 | 'cell_per_block': cell_per_block, 453 | 'spatial_size': (spatial, spatial), 454 | 'hist_bins': histbin, 455 | 'Test Accuracy': round(svc.score(X_test, y_test), 4), 456 | 'Training Time': round(t2 - t, 2), 457 | 'color_space': color_space} 458 | 459 | return dist 460 | 461 | 462 | # Define a function that takes an image, 463 | # start and stop positions in both x and y, 464 | # window size (x and y dimensions), and overlap fraction (for both x and y) 465 | def slide_window(img, x_start_stop=[None, None], y_start_stop=[None, None], 466 | xy_window=(64, 64), xy_overlap=(0.5, 0.5)): 467 | # If x and/or 
y start/stop positions not defined, set to image size 468 | if x_start_stop[0] == None: 469 | x_start_stop[0] = 0 470 | if x_start_stop[1] == None: 471 | x_start_stop[1] = img.shape[1] 472 | if y_start_stop[0] == None: 473 | y_start_stop[0] = 0 474 | if y_start_stop[1] == None: 475 | y_start_stop[1] = img.shape[0] 476 | # Compute the span of the region to be searched 477 | xspan = x_start_stop[1] - x_start_stop[0] 478 | yspan = y_start_stop[1] - y_start_stop[0] 479 | # Compute the number of pixels per step in x/y 480 | nx_pix_per_step = np.int(xy_window[0] * (1 - xy_overlap[0])) 481 | ny_pix_per_step = np.int(xy_window[1] * (1 - xy_overlap[1])) 482 | # Compute the number of windows in x/y 483 | nx_buffer = np.int(xy_window[0] * (xy_overlap[0])) 484 | ny_buffer = np.int(xy_window[1] * (xy_overlap[1])) 485 | nx_windows = np.int((xspan - nx_buffer) / nx_pix_per_step) 486 | ny_windows = np.int((yspan - ny_buffer) / ny_pix_per_step) 487 | # Initialize a list to append window positions to 488 | window_list = [] 489 | # Loop through finding x and y window positions 490 | # Note: you could vectorize this step, but in practice 491 | # you'll be considering windows one by one with your 492 | # classifier, so looping makes sense 493 | for ys in range(ny_windows): 494 | for xs in range(nx_windows): 495 | # Calculate window position 496 | startx = xs * nx_pix_per_step + x_start_stop[0] 497 | endx = startx + xy_window[0] 498 | starty = ys * ny_pix_per_step + y_start_stop[0] 499 | endy = starty + xy_window[1] 500 | # Append window position to list 501 | window_list.append(((startx, starty), (endx, endy))) 502 | # Return the list of windows 503 | return window_list 504 | 505 | 506 | # Define a function to extract features from a single image window 507 | # This function is very similar to extract_features() 508 | # just for a single image rather than list of images 509 | def single_img_features(img, color_space='RGB', spatial_size=(32, 32), 510 | hist_bins=32, orient=9, 511 | pix_per_cell=8, cell_per_block=2, hog_channel=0, 512 | spatial_feat=True, hist_feat=True, hog_feat=True): 513 | # 1) Define an empty list to receive features 514 | img_features = [] 515 | # 2) Apply color conversion if other than 'RGB' 516 | if color_space != 'RGB': 517 | if color_space == 'HSV': 518 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HSV) 519 | elif color_space == 'LUV': 520 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2LUV) 521 | elif color_space == 'HLS': 522 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2HLS) 523 | elif color_space == 'YUV': 524 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YUV) 525 | elif color_space == 'YCrCb': 526 | feature_image = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb) 527 | else: feature_image = np.copy(img) 528 | # 3) Compute spatial features if flag is set 529 | if spatial_feat is True: 530 | spatial_features = bin_spatial(feature_image, size=spatial_size) 531 | # 4) Append features to list 532 | img_features.append(spatial_features) 533 | # 5) Compute histogram features if flag is set 534 | if hist_feat is True: 535 | hist_features = color_hist(feature_image, nbins=hist_bins) 536 | # 6) Append features to list 537 | img_features.append(hist_features) 538 | # 7) Compute HOG features if flag is set 539 | if hog_feat is True: 540 | if hog_channel == 'ALL': 541 | hog_features = [] 542 | for channel in range(feature_image.shape[2]): 543 | hog_features.extend(get_hog_features(feature_image[:, :, channel], 544 | orient, pix_per_cell, cell_per_block, 545 | vis=False, feature_vec=True)) 
546 | else: 547 | hog_features = get_hog_features(feature_image[:, :, hog_channel], orient, 548 | pix_per_cell, cell_per_block, vis=False, feature_vec=True) 549 | # 8) Append features to list 550 | img_features.append(hog_features) 551 | 552 | # 9) Return concatenated array of features 553 | return np.concatenate(img_features) 554 | 555 | 556 | # Define a function you will pass an image 557 | # and the list of windows to be searched (output of slide_windows()) 558 | def search_windows(img, windows, clf, scaler, color_space='RGB', 559 | spatial_size=(32, 32), hist_bins=32, 560 | hist_range=(0, 256), orient=9, 561 | pix_per_cell=8, cell_per_block=2, 562 | hog_channel=0, spatial_feat=True, 563 | hist_feat=True, hog_feat=True): 564 | # 1) Create an empty list to receive positive detection windows 565 | on_windows = [] 566 | # 2) Iterate over all windows in the list 567 | for window in windows: 568 | # 3) Extract the test window from original image 569 | test_img = cv2.resize(img[window[0][1]:window[1][1], window[0][0]:window[1][0]], (64, 64)) 570 | # 4) Extract features for that window using single_img_features() 571 | features = single_img_features(test_img, color_space=color_space, 572 | spatial_size=spatial_size, hist_bins=hist_bins, 573 | orient=orient, pix_per_cell=pix_per_cell, 574 | cell_per_block=cell_per_block, 575 | hog_channel=hog_channel, spatial_feat=spatial_feat, 576 | hist_feat=hist_feat, hog_feat=hog_feat) 577 | # 5) Scale extracted features to be fed to classifier 578 | test_features = scaler.transform(np.array(features).reshape(1, -1)) 579 | # 6) Predict using your classifier 580 | prediction = clf.predict(test_features) 581 | # 7) If positive (prediction == 1) then save the window 582 | if prediction == 1: 583 | on_windows.append(window) 584 | # 8) Return windows for positive detections 585 | return on_windows 586 | 587 | 588 | # Define a single function that can extract features using hog sub-sampling and make predictions 589 | def find_cars(img, dist_pickle, ystart=400, ystop=656, scale=1.5, plot=False): 590 | 591 | svc = dist_pickle["svc"] 592 | X_scaler = dist_pickle["X_scaler"] 593 | orient = dist_pickle["orient"] 594 | pix_per_cell = dist_pickle["pix_per_cell"] 595 | cell_per_block = dist_pickle["cell_per_block"] 596 | spatial_size = dist_pickle["spatial_size"] 597 | hist_bins = dist_pickle["hist_bins"] 598 | 599 | draw_img = np.copy(img) 600 | img = img.astype(np.float32) / 255 601 | 602 | img_tosearch = img[ystart:ystop, :, :] 603 | ctrans_tosearch = convert_color(img_tosearch, conv='RGB2YUV') 604 | cspace = 'YUV' 605 | if scale != 1: 606 | imshape = ctrans_tosearch.shape 607 | ctrans_tosearch = cv2.resize(ctrans_tosearch, (np.int(imshape[1] / scale), np.int(imshape[0] / scale))) 608 | 609 | ch1 = ctrans_tosearch[:, :, 0] 610 | ch2 = ctrans_tosearch[:, :, 1] 611 | ch3 = ctrans_tosearch[:, :, 2] 612 | 613 | # Define blocks and steps as above 614 | nxblocks = (ch1.shape[1] // pix_per_cell) - cell_per_block + 1 615 | nyblocks = (ch1.shape[0] // pix_per_cell) - cell_per_block + 1 616 | nfeat_per_block = orient * cell_per_block ** 2 617 | 618 | # 64 was the orginal sampling rate, with 8 cells and 8 pix per cell 619 | window = 64 620 | nblocks_per_window = (window // pix_per_cell) - cell_per_block + 1 621 | cells_per_step = 2 # Instead of overlap, define how many cells to step 622 | nxsteps = (nxblocks - nblocks_per_window) // cells_per_step 623 | nysteps = (nyblocks - nblocks_per_window) // cells_per_step 624 | 625 | # Compute individual channel HOG features for the 
626 |     hog1 = get_hog_features(ch1, orient, pix_per_cell, cell_per_block, feature_vec=False)
627 |     hog2 = get_hog_features(ch2, orient, pix_per_cell, cell_per_block, feature_vec=False)
628 |     hog3 = get_hog_features(ch3, orient, pix_per_cell, cell_per_block, feature_vec=False)
629 | 
630 |     bbox_list = []
631 | 
632 |     for xb in range(nxsteps):
633 |         for yb in range(nysteps):
634 |             ypos = yb * cells_per_step
635 |             xpos = xb * cells_per_step
636 |             # Extract HOG for this patch
637 |             hog_feat1 = hog1[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
638 |             hog_feat2 = hog2[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
639 |             hog_feat3 = hog3[ypos:ypos + nblocks_per_window, xpos:xpos + nblocks_per_window].ravel()
640 |             hog_features = np.hstack((hog_feat1, hog_feat2, hog_feat3))
641 | 
642 |             xleft = xpos * pix_per_cell
643 |             ytop = ypos * pix_per_cell
644 | 
645 |             # Extract the image patch
646 |             subimg = cv2.resize(ctrans_tosearch[ytop:ytop + window, xleft:xleft + window], (64, 64))
647 | 
648 |             # Get color features
649 |             spatial_features = bin_spatial(subimg, color_space=cspace, size=spatial_size)
650 |             hist_features = color_hist(subimg, nbins=hist_bins)
651 | 
652 |             # Scale features and make a prediction
653 |             test_features = X_scaler.transform(
654 |                 np.hstack((spatial_features, hist_features, hog_features)).reshape(1, -1))
655 |             # test_features = X_scaler.transform(np.hstack((shape_feat, hist_feat)).reshape(1, -1))
656 |             test_prediction = svc.predict(test_features)
657 | 
658 |             if test_prediction == 1:
659 |                 xbox_left = int(xleft * scale)
660 |                 ytop_draw = int(ytop * scale)
661 |                 win_draw = int(window * scale)
662 |                 cv2.rectangle(draw_img, (xbox_left, ytop_draw + ystart),
663 |                               (xbox_left + win_draw, ytop_draw + win_draw + ystart), (0, 0, 255), 6)
664 |                 bbox_list.append(((xbox_left, ytop_draw + ystart), (xbox_left + win_draw, ytop_draw + win_draw + ystart)))
665 | 
666 |     if plot is True:
667 |         plt.figure()
668 |         plt.imshow(draw_img)
669 |         plt.show()
670 |     return bbox_list, draw_img
671 | 
672 | 
673 | def draw_labeled_bboxes(img, labels):
674 |     # Iterate through all detected cars
675 |     for car_number in range(1, labels[1] + 1):
676 |         # Find pixels with each car_number label value
677 |         nonzero = (labels[0] == car_number).nonzero()
678 |         # Identify x and y values of those pixels
679 |         nonzeroy = np.array(nonzero[0])
680 |         nonzerox = np.array(nonzero[1])
681 |         # Define a bounding box based on min/max x and y
682 |         bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))
683 |         # Draw the box on the image
684 |         cv2.rectangle(img, bbox[0], bbox[1], (0, 0, 255), 6)
685 | 
686 |     return img
687 | 
688 | 
689 | def add_heat(heatmap, bbox_list):
690 |     # Iterate through the list of bboxes
691 |     for box in bbox_list:
692 |         # Add += 1 for all pixels inside each bbox,
693 |         # assuming each "box" takes the form ((x1, y1), (x2, y2))
694 |         heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1
695 | 
696 |     # Return updated heatmap
697 |     return heatmap
698 | 
699 | 
700 | def apply_threshold(heatmap, threshold):
701 |     # Zero out pixels below the threshold
702 |     heatmap[heatmap <= threshold] = 0
703 |     # Return the thresholded map
704 |     return heatmap
705 | 
706 | 
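# --- Added usage sketch (illustrative): chaining the helpers above into the
# detection -> heatmap -> threshold -> label -> draw pipeline. The threshold
# value of 1 is an assumption, not the project's tuned setting.
def _heatmap_pipeline_demo(img, dist_pickle, threshold=1):
    bbox_list, _ = find_cars(img, dist_pickle)
    heat = np.zeros_like(img[:, :, 0]).astype(float)
    heat = add_heat(heat, bbox_list)
    heat = apply_threshold(heat, threshold)  # reject windows with little support
    labels = label(np.clip(heat, 0, 255))
    return draw_labeled_bboxes(np.copy(img), labels)

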
707 | def visualize(img, dist_pickle):
708 |     bbox_list, img_multibox = find_cars(img, dist_pickle, plot=False)
709 | 
710 |     heat = np.zeros_like(img[:, :, 0]).astype(float)
711 |     # Add heat to each box in the box list
712 |     heat = add_heat(heat, bbox_list)
713 |     # Visualize the heatmap when displaying
714 |     heatmap = np.clip(heat, 0, 255)
715 |     # Find final boxes from the heatmap using the label function
716 |     labels = label(heatmap)
717 |     draw_img = draw_labeled_bboxes(np.copy(img), labels)
718 | 
719 |     fig = plt.figure(figsize=(10, 4))
720 |     plt.subplot(131)
721 |     plt.imshow(img_multibox)
722 |     plt.title('Multi Detections')
723 |     plt.subplot(132)
724 |     plt.imshow(heatmap, cmap='hot')
725 |     plt.title('Heat Map')
726 |     plt.subplot(133)
727 |     plt.imshow(draw_img)
728 |     plt.title('Car Positions')
729 |     fig.tight_layout()
730 | 
731 |     return
732 | 
733 | 
734 | def multi_heatmap(dist_pickle):
735 |     img1 = mpimg.imread('./test_img/test4.jpg')
736 |     img2 = mpimg.imread('./test_img/test5.jpg')
737 |     imgs = [img1, img2]
738 |     drawings = []
739 |     for img in imgs:
740 |         bbox_list, img_multibox = find_cars(img, dist_pickle, plot=False)
741 | 
742 |         heat = np.zeros_like(img[:, :, 0]).astype(float)
743 |         # Add heat to each box in the box list
744 |         heat = add_heat(heat, bbox_list)
745 |         # Visualize the heatmap when displaying
746 |         heatmap = np.clip(heat, 0, 255)
747 | 
748 |         drawings.append(heatmap)
749 | 
750 |     merge_heat = drawings[0] + drawings[1]
751 | 
752 |     fig = plt.figure(figsize=(10, 4))
753 |     plt.subplot(131)
754 |     plt.imshow(drawings[0], cmap='hot')
755 |     plt.title('test4_heat')
756 |     plt.subplot(132)
757 |     plt.imshow(drawings[1], cmap='hot')
758 |     plt.title('test5_heat')
759 |     plt.subplot(133)
760 |     plt.imshow(merge_heat, cmap='hot')
761 |     plt.title('Merge')
762 |     fig.tight_layout()
763 | 
764 | 
765 | def add_non_car():
766 |     input_path = './test_img/add_non-car_source_size/'
767 |     output_path = './test_img/add_non-car_64/'
768 |     images = glob.glob(input_path + '*.png')
769 |     flags = []
770 | 
771 |     for img_path in images:
772 |         img = cv2.imread(img_path)
773 |         img_resize = cv2.resize(img, (64, 64))
774 |         filename = img_path.split('\\')[-1]  # note: assumes Windows path separators
775 |         flag = cv2.imwrite(output_path + filename, img_resize)
776 |         flags.append(flag)
777 |     return flags
778 | 
779 | 
780 | def draw_feature(car, notcar, cspace='RGB2YUV'):
781 |     car_pic = convert_color(mpimg.imread(car), conv=cspace)
782 |     notcar_pic = convert_color(mpimg.imread(notcar), conv=cspace)
783 | 
784 |     orient = 15
785 |     pix_per_cell = 8
786 |     cell_per_block = 2
787 |     spatial_size = (32, 32)
788 | 
789 |     car_chs = [car_pic[:, :, 0], car_pic[:, :, 1], car_pic[:, :, 2]]
790 |     notcar_chs = [notcar_pic[:, :, 0], notcar_pic[:, :, 1], notcar_pic[:, :, 2]]
791 | 
792 |     f, hog1_car = get_hog_features(car_chs[0], orient, pix_per_cell, cell_per_block, vis=True, feature_vec=False)
793 |     f, hog1_notcar = get_hog_features(notcar_chs[0], orient, pix_per_cell, cell_per_block, vis=True, feature_vec=False)
794 | 
795 |     car_hist = np.histogram(car_pic[:, :, 0], bins=32, range=(0, 256))
796 |     notcar_hist = np.histogram(notcar_pic[:, :, 0], bins=32, range=(0, 256))
797 |     bin_edges = car_hist[1]
798 |     bin_centers = (bin_edges[1:] + bin_edges[0:len(bin_edges) - 1]) / 2
799 | 
800 |     car_ch1_features = cv2.resize(car_chs[0], spatial_size)
801 |     car_ch2_features = cv2.resize(car_chs[1], spatial_size)
802 |     car_ch3_features = cv2.resize(car_chs[2], spatial_size)
803 |     notcar_ch1_features = cv2.resize(notcar_chs[0], spatial_size)
804 |     notcar_ch2_features = cv2.resize(notcar_chs[1], spatial_size)
805 |     notcar_ch3_features = cv2.resize(notcar_chs[2], spatial_size)
806 | 
807 |     titles = ['CAR CH-1', 'CAR CH-1 HOG', 'NOT CAR CH-1', 'NOT CAR CH-1 HOG',
808 |               'CAR CH-1', 'CAR CH-1 FEATURES', 'NOT CAR CH-1', 'NOT CAR CH-1 FEATURES',
809 |               'CAR CH-2', 'CAR CH-2 FEATURES', 'NOT CAR CH-2', 'NOT CAR CH-2 FEATURES',
810 |               'CAR CH-3', 'CAR CH-3 FEATURES', 'NOT CAR CH-3', 'NOT CAR CH-3 FEATURES']
811 |     pic = [car_chs[0], hog1_car, notcar_chs[0], hog1_notcar,
812 |            car_chs[0], car_ch1_features, notcar_chs[0], notcar_ch1_features,
813 |            car_chs[1], car_ch2_features, notcar_chs[1], notcar_ch2_features,
814 |            car_chs[2], car_ch3_features, notcar_chs[2], notcar_ch3_features]
815 | 
816 |     f, axes = plt.subplots(4, 4, figsize=(10, 8))
817 |     f.tight_layout()
818 |     for idx, ax in enumerate(np.hstack(axes)):
819 |         ax.imshow(pic[idx])
820 |         ax.set_title(titles[idx])
821 |         ax.axis('off')
822 |     return
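
# Added example call (paths are placeholders; any car / non-car crops work):
#     draw_feature('./test_img/cutout1.jpg', './test_img/275.png', cspace='RGB2YUV')
#     plt.show()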
823 | 
824 | if __name__ == "__main__":
825 |     img = mpimg.imread('./test_img/test6.jpg')
826 | 
827 |     scaled_X, cars, notcars = combine_feature(cspace='YUV', samples=20)
828 |     #
829 |     # dist = SVM_combine_classify(cars, notcars, csapce='YUV', samples=-1)
830 |     # pickle.dump(dist, open("./dist.p", "wb"))
831 |     # print('pickle saved!')
832 | 
833 |     # dist_pickle = pickle.load(open("dist.p", "rb"))
834 | 
835 |     # feature_vec = bin_spatial(image, color_space='RGB', size=(32, 32))
836 |     #
837 |     # # Plot features
838 |     # plt.plot(feature_vec)
839 |     # plt.title('Spatially Binned Features')
840 | 
--------------------------------------------------------------------------------
/helper_yolo.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 | import matplotlib.image as mpimg
4 | import cv2
5 | import glob
6 | 
7 | 
8 | def draw_test_img(imgs_path, model):
9 |     images = [plt.imread(file) for file in glob.glob(imgs_path)]
10 |     batch = np.array([np.transpose(cv2.resize(image[300:650, 500:, :], (448, 448)), (2, 0, 1))
11 |                       for image in images])
12 |     batch = 2 * (batch / 255.) - 1
13 |     out = model.predict(batch)
14 |     f, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(8, 6))
15 |     for i, ax in zip(range(len(batch)), [ax1, ax2, ax3, ax4]):
16 |         boxes = yolo_boxes(out[i], threshold=0.17)
17 |         ax.imshow(draw_box(boxes, images[i], [[500, 1280], [300, 650]]))
18 | 
19 |     return out
20 | 
21 | 
22 | def load_weights(model, yolo_weight_file):
23 |     data = np.fromfile(yolo_weight_file, np.float32)
24 |     data = data[4:]  # skip the header values at the start of the weights file
25 | 
26 |     index = 0
27 |     for layer in model.layers:
28 |         shape = [w.shape for w in layer.get_weights()]
29 |         if shape != []:
30 |             kshape, bshape = shape
31 |             bia = data[index:index + np.prod(bshape)].reshape(bshape)
32 |             index += np.prod(bshape)
33 |             ker = data[index:index + np.prod(kshape)].reshape(kshape)
34 |             index += np.prod(kshape)
35 |             layer.set_weights([ker, bia])
36 | 
37 | 
38 | class Box:
39 |     def __init__(self):
40 |         self.x, self.y = float(), float()
41 |         self.w, self.h = float(), float()
42 |         self.c = float()
43 |         self.prob = float()
44 | 
45 | 
46 | def overlap(x1, w1, x2, w2):
47 |     l1 = x1 - w1 / 2.
48 |     l2 = x2 - w2 / 2.
49 |     left = max(l1, l2)
50 |     r1 = x1 + w1 / 2.
51 |     r2 = x2 + w2 / 2.
52 |     right = min(r1, r2)
53 |     return right - left
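
# Added sanity check for overlap() (illustrative values): two unit-width
# segments centred at 0.4 and 0.6 share a length of roughly 0.8:
#     overlap(0.4, 1.0, 0.6, 1.0)  # ~0.8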
54 | 
55 | 
56 | def box_intersection(a, b):
57 |     w = overlap(a.x, a.w, b.x, b.w)
58 |     h = overlap(a.y, a.h, b.y, b.h)
59 |     if w < 0 or h < 0:
60 |         return 0
61 |     area = w * h
62 |     return area
63 | 
64 | 
65 | def box_union(a, b):
66 |     i = box_intersection(a, b)
67 |     u = a.w * a.h + b.w * b.h - i
68 |     return u
69 | 
70 | 
71 | def box_iou(a, b):
72 |     return box_intersection(a, b) / box_union(a, b)
73 | 
74 | 
75 | def yolo_boxes(net_out, threshold=0.2, sqrt=1.8, C=20, B=2, S=7):
76 |     classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair",
77 |                "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant",
78 |                "sheep", "sofa", "train", "tvmonitor"]
79 |     class_num = 6  # index of "car" in the classes list above
80 |     boxes = []
81 |     SS = S * S  # number of grid cells
82 |     prob_size = SS * C  # class probabilities
83 |     conf_size = SS * B  # confidences for each grid cell
84 | 
85 |     probs = net_out[0: prob_size]
86 |     confs = net_out[prob_size: (prob_size + conf_size)]
87 |     cords = net_out[(prob_size + conf_size):]
88 |     probs = probs.reshape([SS, C])
89 |     confs = confs.reshape([SS, B])
90 |     cords = cords.reshape([SS, B, 4])
91 | 
92 |     for grid in range(SS):
93 |         for b in range(B):
94 |             bx = Box()
95 |             bx.c = confs[grid, b]
96 |             bx.x = (cords[grid, b, 0] + grid % S) / S
97 |             bx.y = (cords[grid, b, 1] + grid // S) / S
98 |             bx.w = cords[grid, b, 2] ** sqrt
99 |             bx.h = cords[grid, b, 3] ** sqrt
100 |             p = probs[grid, :] * bx.c
101 | 
102 |             if p[class_num] >= threshold:
103 |                 bx.prob = p[class_num]
104 |                 boxes.append(bx)
105 | 
106 |     # Non-maximum suppression: drop any box that overlaps a higher-probability box with IoU >= 0.4
107 |     boxes.sort(key=lambda b: b.prob, reverse=True)
108 |     for i in range(len(boxes)):
109 |         boxi = boxes[i]
110 |         if boxi.prob == 0:
111 |             continue
112 |         for j in range(i + 1, len(boxes)):
113 |             boxj = boxes[j]
114 |             if box_iou(boxi, boxj) >= .4:
115 |                 boxes[j].prob = 0.
116 |     boxes = [b for b in boxes if b.prob > 0.]
117 | 
118 |     return boxes
119 | 
120 | 
121 | def draw_box(boxes, im, crop_dim):
122 |     imgcv = np.copy(im)
123 |     [xmin, xmax] = crop_dim[0]
124 |     [ymin, ymax] = crop_dim[1]
125 |     for i, b in enumerate(boxes, 1):
126 |         h, w, _ = imgcv.shape
127 |         left = int((b.x - b.w / 2.) * w)
128 |         right = int((b.x + b.w / 2.) * w)
129 |         top = int((b.y - b.h / 2.) * h)
130 |         bot = int((b.y + b.h / 2.) * h)
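        # b.x/b.y/b.w/b.h are relative to the cropped, resized region the
        # network saw: the lines above scale them into pixel units, and the
        # lines below map them into the crop span and shift by the crop
        # origin (xmin, ymin) so each box lands on the full frame.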
131 |         left = int(left * (xmax - xmin) / w + xmin)
132 |         right = int(right * (xmax - xmin) / w + xmin)
133 |         top = int(top * (ymax - ymin) / h + ymin)
134 |         bot = int(bot * (ymax - ymin) / h + ymin)
135 | 
136 |         left = max(left, 0)
137 |         right = min(right, w - 1)
138 |         top = max(top, 0)
139 |         bot = min(bot, h - 1)
140 | 
141 |         cv2.rectangle(imgcv, (left, top), (right, bot), (0, 0, 255), thickness=3)
142 | 
143 |         # draw label
144 |         label = 'car ' + str(i)
145 |         cv2.rectangle(imgcv, (left, top - 30), (right, top), (125, 125, 125), -1)
146 |         cv2.putText(imgcv, label, (left + 5, top - 7), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 1)
147 | 
148 |         # draw thumbnail in the highlight banner
149 |         thumbnail = im[top:bot, left:right]
150 |         vehicle_thumb = cv2.resize(thumbnail, dsize=(120, 80))  # width=120, height=80
151 |         start_x = 750 + (i - 1) * 30 + (i - 1) * 120  # offset=30
152 |         imgcv[60:60 + 80, start_x:start_x + 120, :] = vehicle_thumb
153 | 
154 |     cv2.putText(imgcv, 'Lane', (280, 35), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 0), 2, cv2.LINE_AA)
155 |     cv2.putText(imgcv, 'Detected Vehicles', (800, 35), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 0), 2,
156 |                 cv2.LINE_AA)
157 | 
158 |     return imgcv
159 | 
160 | 
161 | def draw_background_highlight(image, draw_img, w=1280):
162 | 
163 |     mask = cv2.rectangle(np.copy(image), (0, 0), (w, 155), (0, 0, 0), thickness=cv2.FILLED)
164 | 
165 |     return cv2.addWeighted(src1=mask, alpha=0.3, src2=draw_img, beta=0.8, gamma=0)
166 | 
167 | 
168 | def draw_thumbnails(img_cp, img, window_list, thumb_w=120, thumb_h=80, off_x=30, off_y=30):
169 |     cv2.putText(img_cp, 'Lane', (280, 35), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 0), 2, cv2.LINE_AA)
170 |     cv2.putText(img_cp, 'Detected Vehicles', (600, 35), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 0), 2, cv2.LINE_AA)
171 |     for i, bbox in enumerate(window_list):
172 |         thumbnail = img[bbox[0][1]:bbox[1][1], bbox[0][0]:bbox[1][0]]
173 |         vehicle_thumb = cv2.resize(thumbnail, dsize=(thumb_w, thumb_h))
174 |         start_x = 640 + (i + 1) * off_x + i * thumb_w
175 |         img_cp[off_y + 30:off_y + thumb_h + 30, start_x:start_x + thumb_w, :] = vehicle_thumb
176 | 
177 | 
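# --- Added end-to-end sketch (illustrative): running one frame through a YOLO
# model and drawing the result. `model` is assumed to be the Keras network that
# emits the flat S*S*(C + B*5) = 1470 output vector consumed by yolo_boxes();
# the crop and threshold values mirror draw_test_img() above.
def _yolo_frame_demo(model, img_path):
    image = plt.imread(img_path)
    crop = image[300:650, 500:, :]                        # keep the road region
    batch = np.array([np.transpose(cv2.resize(crop, (448, 448)), (2, 0, 1))])
    batch = 2 * (batch / 255.) - 1                        # scale to [-1, 1]
    out = model.predict(batch)
    boxes = yolo_boxes(out[0], threshold=0.17)
    return draw_box(boxes, image, [[500, image.shape[1]], [300, 650]])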
--------------------------------------------------------------------------------
/output_videos/project_SVM.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/output_videos/project_SVM.mp4
--------------------------------------------------------------------------------
/output_videos/project_yolo.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/output_videos/project_yolo.mp4
--------------------------------------------------------------------------------
/output_videos/test_SVM.mp4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/output_videos/test_SVM.mp4
--------------------------------------------------------------------------------
/test_img/275.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/test_img/275.png
--------------------------------------------------------------------------------
/test_img/cutout1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/uranus4ever/Vehicle-Detection/43742aa970ca57b83543e4c09129e8b5fe60164c/test_img/cutout1.jpg
--------------------------------------------------------------------------------