├── LICENSE ├── README.md ├── YOLO_face_tf.py ├── YOLO_small_tf.py ├── YOLO_tiny_tf.py ├── YOLO_weight_extractor ├── Readme.md └── YOLO_weight_extractor.tar.gz ├── test └── person.jpg └── weights └── put_weight_file_here.txt /LICENSE: -------------------------------------------------------------------------------- 1 | YOLO_tensorflow LICENSE 2 | Version 0.1, FEB 15 2016 3 | 4 | ACCORDING TO ORIGINAL CODE'S LICENSE, 5 | 6 | DO NOT USE THIS ON COMMERCIAL! 7 | I OR ORIGINAL AUTHOR DO NOT HOLD LIABILITY FOR ANY DAMAGES! 8 | 9 | 10 | BELOW IS THE ORIGINAL CODE'S LICENSE 11 | { 12 | THIS SOFTWARE LICENSE IS PROVIDED "ALL CAPS" SO THAT YOU KNOW IT IS SUPER 13 | SERIOUS AND YOU DON'T MESS AROUND WITH COPYRIGHT LAW BECAUSE YOU WILL GET IN 14 | TROUBLE HERE ARE SOME OTHER BUZZWORDS COMMONLY IN THESE THINGS WARRANTIES 15 | LIABILITY CONTRACT TORT LIABLE CLAIMS RESTRICTION MERCHANTABILITY SUBJECT TO 16 | THE FOLLOWING CONDITIONS: 17 | 18 | 1. #yolo 19 | 2. #swag 20 | 3. #blazeit 21 | } 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # YOLO_tensorflow 2 | 3 | (Version 0.3, Last updated: 2017.02.21) 4 | 5 | ### 1.Introduction 6 | 7 | This is a TensorFlow implementation of YOLO: Real-Time Object Detection. 8 | 9 | For now it can only run predictions using the pretrained YOLO_small & YOLO_tiny networks. 10 | 11 | (+ YOLO_face detector from https://github.com/quanhua92/darknet ) 12 | 13 | I extracted the weight values from darknet's (.weights) files. 14 | 15 | My code does not support training. Use darknet for training. 
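For orientation, here is a rough sketch of how the flat prediction vector is laid out, based on the slicing that `interpret_output()` performs in the scripts of this repository. `split_output` is a hypothetical helper written only for this illustration (it is not part of the code), and it uses plain Python lists instead of NumPy arrays to stay self-contained.

```python
# Sketch only: mirrors the slicing in interpret_output(), where the output is
# read as [class probabilities | box confidences ("scales") | box coordinates].
# split_output() is a hypothetical helper, not a function from this repository.

def split_output(output, grid_size=7, num_class=20, num_box=2):
    """Split a flat prediction vector into (class_probs, scales, boxes)."""
    cells = grid_size * grid_size
    n_probs = cells * num_class       # 7*7*20  = 980 for YOLO_small / YOLO_tiny
    n_scales = cells * num_box        # 7*7*2   = 98
    n_boxes = cells * num_box * 4     # 7*7*2*4 = 392 (x, y, w, h per box)
    expected = n_probs + n_scales + n_boxes
    if len(output) != expected:
        raise ValueError("expected %d values, got %d" % (expected, len(output)))
    class_probs = output[:n_probs]
    scales = output[n_probs:n_probs + n_scales]
    boxes = output[n_probs + n_scales:]
    return class_probs, scales, boxes


# A dummy vector with the size of the last layer of YOLO_small / YOLO_tiny.
class_probs, scales, boxes = split_output([0.0] * 1470)
print("%d %d %d" % (len(class_probs), len(scales), len(boxes)))
```

The same split with `grid_size=11, num_class=1` gives the 1331 values produced by the last layer of YOLO_face.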
16 | 17 | Original code (C implementation) & paper : http://pjreddie.com/darknet/yolo/ 18 | 19 | ### 2.Install 20 | (1) Download code 21 | 22 | (2) Download YOLO weight file from 23 | 24 | YOLO_small : https://drive.google.com/file/d/0B2JbaJSrWLpza08yS2FSUnV2dlE/view?usp=sharing 25 | 26 | YOLO_tiny : https://drive.google.com/file/d/0B2JbaJSrWLpza0FtQlc3ejhMTTA/view?usp=sharing 27 | 28 | YOLO_face : https://drive.google.com/file/d/0B2JbaJSrWLpzMzR5eURGN2dMTk0/view?usp=sharing 29 | 30 | (3) Put the 'YOLO_(version).ckpt' in the 'weights' folder of the downloaded code 31 | 32 | ### 3.Usage 33 | 34 | (1) direct usage with default settings (display on console, show output image, no output file writing) 35 | 36 | python YOLO_(small or tiny)_tf.py -fromfile (input image filename) 37 | 38 | (2) direct usage with custom settings 39 | 40 | python YOLO_(small or tiny)_tf.py argvs 41 | 42 | where argvs are 43 | 44 | -fromfile (input image filename) : input image file 45 | -disp_console (0 or 1) : whether to display results on the terminal or not 46 | -imshow (0 or 1) : whether to display the result image or not 47 | -tofile_img (output image filename) : output image file 48 | -tofile_txt (output txt filename) : output text file (contains class, x, y, w, h, probability) 49 | 50 | (3) import in other scripts 51 | 52 | import YOLO_(small or tiny)_tf 53 | yolo = YOLO_(small or tiny)_tf.YOLO_TF() 54 | 55 | yolo.disp_console = (True or False, default = True) 56 | yolo.imshow = (True or False, default = True) 57 | yolo.tofile_img = (output image filename) 58 | yolo.tofile_txt = (output txt filename) 59 | yolo.filewrite_img = (True or False, default = False) 60 | yolo.filewrite_txt = (True or False, default = False) 61 | 62 | yolo.detect_from_file(filename) 63 | yolo.detect_from_cvmat(cvmat) 64 | 65 | ### 4.Requirements 66 | 67 | - Tensorflow 68 | - Opencv2 69 | 70 | ### 5.Copyright 71 | 72 | According to the LICENSE file of the original code, 73 | - The original author and I hold no liability for any 
damages 74 | - Do not use this commercially! 75 | 76 | ### 6.Changelog 77 | 2016/02/15 : First upload! 78 | 79 | 2016/02/16 : Added YOLO_tiny, fixed a bug that ignored one of the boxes in a grid cell when both boxes detected valid objects 80 | 81 | 2016/08/26 : Uploaded weight file converter! (darknet weight -> tensorflow ckpt) 82 | 83 | 2017/02/21 : Added YOLO_face (Thanks https://github.com/quanhua92/darknet) 84 | -------------------------------------------------------------------------------- /YOLO_face_tf.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import tensorflow as tf 4 | import time 5 | import sys 6 | 7 | class YOLO_TF: 8 | fromfile = None 9 | tofile_img = 'test/output.jpg' 10 | tofile_txt = 'test/output.txt' 11 | imshow = True 12 | filewrite_img = False 13 | filewrite_txt = False 14 | disp_console = True 15 | weights_file = 'weights/YOLO_face' 16 | alpha = 0.1 17 | threshold = 0.2 18 | iou_threshold = 0.5 19 | num_class = 1 20 | num_box = 2 21 | grid_size = 11 22 | classes = ["face"] 23 | 24 | w_img = 640 25 | h_img = 480 26 | 27 | def __init__(self,argvs = []): 28 | self.argv_parser(argvs) 29 | self.build_networks() 30 | if self.fromfile is not None: self.detect_from_file(self.fromfile) 31 | def argv_parser(self,argvs): 32 | for i in range(1,len(argvs),2): 33 | if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1] 34 | if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True 35 | if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True 36 | if argvs[i] == '-imshow' : 37 | if argvs[i+1] == '1' :self.imshow = True 38 | else : self.imshow = False 39 | if argvs[i] == '-disp_console' : 40 | if argvs[i+1] == '1' :self.disp_console = True 41 | else : self.disp_console = False 42 | 43 | def build_networks(self): 44 | if self.disp_console : print "Building YOLO_face graph..." 
45 | self.x = tf.placeholder('float32',[None,448,448,3]) 46 | self.conv_1 = self.conv_layer(1,self.x,16,3,1) 47 | self.pool_2 = self.pooling_layer(2,self.conv_1,2,2) 48 | self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1) 49 | self.pool_4 = self.pooling_layer(4,self.conv_3,2,2) 50 | self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1) 51 | self.pool_6 = self.pooling_layer(6,self.conv_5,2,2) 52 | self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1) 53 | self.pool_8 = self.pooling_layer(8,self.conv_7,2,2) 54 | self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1) 55 | self.pool_10 = self.pooling_layer(10,self.conv_9,2,2) 56 | self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1) 57 | self.pool_12 = self.pooling_layer(12,self.conv_11,2,2) 58 | self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1) 59 | self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1) 60 | self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1) 61 | self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False) 62 | self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False) 63 | #skip dropout_18 64 | self.fc_19 = self.fc_layer(19,self.fc_17,1331,flat=False,linear=True) 65 | self.sess = tf.Session() 66 | self.sess.run(tf.initialize_all_variables()) 67 | self.saver = tf.train.Saver() 68 | self.saver.restore(self.sess,self.weights_file) 69 | if self.disp_console : print "Loading complete!" 
+ '\n' 70 | 71 | def conv_layer(self,idx,inputs,filters,size,stride): 72 | channels = inputs.get_shape()[3] 73 | weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1)) 74 | biases = tf.Variable(tf.constant(0.1, shape=[filters])) 75 | 76 | pad_size = size//2 77 | pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]]) 78 | inputs_pad = tf.pad(inputs,pad_mat) 79 | 80 | conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv') 81 | conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased') 82 | if self.disp_console : print ' Layer %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels)) 83 | return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu') 84 | 85 | def pooling_layer(self,idx,inputs,size,stride): 86 | if self.disp_console : print ' Layer %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride) 87 | return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool') 88 | 89 | def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False): 90 | input_shape = inputs.get_shape().as_list() 91 | if flat: 92 | dim = input_shape[1]*input_shape[2]*input_shape[3] 93 | inputs_transposed = tf.transpose(inputs,(0,3,1,2)) 94 | inputs_processed = tf.reshape(inputs_transposed, [-1,dim]) 95 | else: 96 | dim = input_shape[1] 97 | inputs_processed = inputs 98 | weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1)) 99 | biases = tf.Variable(tf.constant(0.1, shape=[hiddens])) 100 | if self.disp_console : print ' Layer %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear)) 101 | if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc') 102 | ip = 
tf.add(tf.matmul(inputs_processed,weight),biases) 103 | return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc') 104 | 105 | def detect_from_cvmat(self,img): 106 | s = time.time() 107 | self.h_img,self.w_img,_ = img.shape 108 | img_resized = cv2.resize(img, (448, 448)) 109 | img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB) 110 | img_resized_np = np.asarray( img_RGB ) 111 | inputs = np.zeros((1,448,448,3),dtype='float32') 112 | inputs[0] = (img_resized_np/255.0)*2.0-1.0 113 | in_dict = {self.x: inputs} 114 | net_output = self.sess.run(self.fc_19,feed_dict=in_dict) 115 | self.result = self.interpret_output(net_output[0]) 116 | self.show_results(img,self.result) 117 | strtime = str(time.time()-s) 118 | if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n' 119 | 120 | def detect_from_file(self,filename): 121 | if self.disp_console : print 'Detect from ' + filename 122 | img = cv2.imread(filename) 123 | #img = misc.imread(filename) 124 | self.detect_from_cvmat(img) 125 | 126 | def detect_from_crop_sample(self): 127 | self.w_img = 640 128 | self.h_img = 420 129 | f = np.array(open('person_crop.txt','r').readlines(),dtype='float32') 130 | inputs = np.zeros((1,448,448,3),dtype='float32') 131 | for c in range(3): 132 | for y in range(448): 133 | for x in range(448): 134 | inputs[0,y,x,c] = f[c*448*448+y*448+x] 135 | 136 | in_dict = {self.x: inputs} 137 | net_output = self.sess.run(self.fc_19,feed_dict=in_dict) 138 | self.boxes, self.probs = self.interpret_output(net_output[0]) 139 | img = cv2.imread('person.jpg') 140 | self.show_results(self.boxes,img) 141 | 142 | def interpret_output(self,output): 143 | prob_range = [0,self.grid_size*self.grid_size*self.num_class] 144 | scales_range = [prob_range[1],prob_range[1]+self.grid_size*self.grid_size*self.num_box] 145 | boxes_range = [scales_range[1],scales_range[1]+self.grid_size*self.grid_size*self.num_box*4] 146 | 147 | probs = np.zeros((self.grid_size,self.grid_size,self.num_box,self.num_class)) 148 | 
class_probs = np.reshape(output[0:prob_range[1]],(self.grid_size,self.grid_size,self.num_class)) 149 | scales = np.reshape(output[scales_range[0]:scales_range[1]],(self.grid_size,self.grid_size,self.num_box)) 150 | boxes = np.reshape(output[boxes_range[0]:],(self.grid_size,self.grid_size,self.num_box,4)) 151 | offset = np.transpose(np.reshape(np.array([np.arange(self.grid_size)]*(2*self.grid_size)),(2,self.grid_size,self.grid_size)),(1,2,0)) 152 | 153 | boxes[:,:,:,0] += offset 154 | boxes[:,:,:,1] += np.transpose(offset,(1,0,2)) 155 | boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / float(self.grid_size) 156 | boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2]) 157 | boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3]) 158 | 159 | boxes[:,:,:,0] *= self.w_img 160 | boxes[:,:,:,1] *= self.h_img 161 | boxes[:,:,:,2] *= self.w_img 162 | boxes[:,:,:,3] *= self.h_img 163 | 164 | for i in range(self.num_box): 165 | for j in range(self.num_class): 166 | probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i]) 167 | 168 | filter_mat_probs = np.array(probs>=self.threshold,dtype='bool') 169 | filter_mat_boxes = np.nonzero(filter_mat_probs) 170 | boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 171 | probs_filtered = probs[filter_mat_probs] 172 | classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 173 | 174 | argsort = np.array(np.argsort(probs_filtered))[::-1] 175 | boxes_filtered = boxes_filtered[argsort] 176 | probs_filtered = probs_filtered[argsort] 177 | classes_num_filtered = classes_num_filtered[argsort] 178 | 179 | for i in range(len(boxes_filtered)): 180 | if probs_filtered[i] == 0 : continue 181 | for j in range(i+1,len(boxes_filtered)): 182 | if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 183 | probs_filtered[j] = 0.0 184 | 185 | filter_iou = np.array(probs_filtered>0.0,dtype='bool') 186 | boxes_filtered = 
boxes_filtered[filter_iou] 187 | probs_filtered = probs_filtered[filter_iou] 188 | classes_num_filtered = classes_num_filtered[filter_iou] 189 | 190 | result = [] 191 | for i in range(len(boxes_filtered)): 192 | result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]]) 193 | 194 | return result 195 | 196 | def show_results(self,img,results): 197 | img_cp = img.copy() 198 | if self.filewrite_txt : 199 | ftxt = open(self.tofile_txt,'w') 200 | for i in range(len(results)): 201 | x = int(results[i][1]) 202 | y = int(results[i][2]) 203 | w = int(results[i][3])//2 204 | h = int(results[i][4])//2 205 | if self.disp_console : print ' class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5]) 206 | if self.filewrite_img or self.imshow: 207 | cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2) 208 | cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1) 209 | cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1) 210 | if self.filewrite_txt : 211 | ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n') 212 | if self.filewrite_img : 213 | if self.disp_console : print ' image file writed : ' + self.tofile_img 214 | cv2.imwrite(self.tofile_img,img_cp) 215 | if self.imshow : 216 | cv2.imshow('YOLO_face detection',img_cp) 217 | cv2.waitKey(1) 218 | if self.filewrite_txt : 219 | if self.disp_console : print ' txt file writed : ' + self.tofile_txt 220 | ftxt.close() 221 | 222 | def iou(self,box1,box2): 223 | tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2]) 224 | lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3]) 225 | if tb < 0 or lr < 0 : intersection = 
0 226 | else : intersection = tb*lr 227 | return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection) 228 | 229 | def training(self): #TODO add training function! 230 | return None 231 | 232 | 233 | 234 | 235 | def main(argvs): 236 | yolo = YOLO_TF(argvs) 237 | cv2.waitKey(1000) 238 | 239 | 240 | if __name__=='__main__': 241 | main(sys.argv) 242 | -------------------------------------------------------------------------------- /YOLO_small_tf.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import tensorflow as tf 4 | import time 5 | import sys 6 | import os 7 | 8 | class YOLO_TF: 9 | fromfile = None 10 | tofile_img = 'test/output.jpg' 11 | tofile_txt = 'test/output.txt' 12 | imshow = True 13 | filewrite_img = False 14 | filewrite_txt = False 15 | disp_console = True 16 | weights_file = 'weights/YOLO_small.ckpt' 17 | alpha = 0.1 18 | threshold = 0.2 19 | iou_threshold = 0.5 20 | num_class = 20 21 | num_box = 2 22 | grid_size = 7 23 | classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"] 24 | 25 | w_img = 640 26 | h_img = 480 27 | 28 | def __init__(self,argvs = []): 29 | self.detected = 0 30 | self.overall_pics = 0 31 | self.argv_parser(argvs) 32 | self.build_networks() 33 | if self.fromfile is not None: self.detect_from_file(self.fromfile) 34 | if self.fromfolder is not None: 35 | filename_list = os.listdir(self.fromfolder) 36 | for filename in filename_list: 37 | self.overall_pics+=1 38 | self.detect_from_file(self.fromfolder+"/"+filename) 39 | print("Fooling_rate:",(self.overall_pics-self.detected)/self.overall_pics) 40 | 41 | def argv_parser(self,argvs): 42 | for i in range(1,len(argvs),2): 43 | if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1] 44 | if argvs[i] == '-fromfolder' : 45 | self.fromfolder = argvs[i+1] 46 | 
else: 47 | self.fromfolder = None 48 | if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True 49 | if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True 50 | if argvs[i] == '-imshow' : 51 | if argvs[i+1] == '1' :self.imshow = True 52 | else : self.imshow = False 53 | if argvs[i] == '-disp_console' : 54 | if argvs[i+1] == '1' :self.disp_console = True 55 | else : self.disp_console = False 56 | 57 | def build_networks(self): 58 | if self.disp_console : print "Building YOLO_small graph..." 59 | self.x = tf.placeholder('float32',[None,448,448,3]) 60 | self.conv_1 = self.conv_layer(1,self.x,64,7,2) 61 | self.pool_2 = self.pooling_layer(2,self.conv_1,2,2) 62 | self.conv_3 = self.conv_layer(3,self.pool_2,192,3,1) 63 | self.pool_4 = self.pooling_layer(4,self.conv_3,2,2) 64 | self.conv_5 = self.conv_layer(5,self.pool_4,128,1,1) 65 | self.conv_6 = self.conv_layer(6,self.conv_5,256,3,1) 66 | self.conv_7 = self.conv_layer(7,self.conv_6,256,1,1) 67 | self.conv_8 = self.conv_layer(8,self.conv_7,512,3,1) 68 | self.pool_9 = self.pooling_layer(9,self.conv_8,2,2) 69 | self.conv_10 = self.conv_layer(10,self.pool_9,256,1,1) 70 | self.conv_11 = self.conv_layer(11,self.conv_10,512,3,1) 71 | self.conv_12 = self.conv_layer(12,self.conv_11,256,1,1) 72 | self.conv_13 = self.conv_layer(13,self.conv_12,512,3,1) 73 | self.conv_14 = self.conv_layer(14,self.conv_13,256,1,1) 74 | self.conv_15 = self.conv_layer(15,self.conv_14,512,3,1) 75 | self.conv_16 = self.conv_layer(16,self.conv_15,256,1,1) 76 | self.conv_17 = self.conv_layer(17,self.conv_16,512,3,1) 77 | self.conv_18 = self.conv_layer(18,self.conv_17,512,1,1) 78 | self.conv_19 = self.conv_layer(19,self.conv_18,1024,3,1) 79 | self.pool_20 = self.pooling_layer(20,self.conv_19,2,2) 80 | self.conv_21 = self.conv_layer(21,self.pool_20,512,1,1) 81 | self.conv_22 = self.conv_layer(22,self.conv_21,1024,3,1) 82 | self.conv_23 = self.conv_layer(23,self.conv_22,512,1,1) 83 | 
self.conv_24 = self.conv_layer(24,self.conv_23,1024,3,1) 84 | self.conv_25 = self.conv_layer(25,self.conv_24,1024,3,1) 85 | self.conv_26 = self.conv_layer(26,self.conv_25,1024,3,2) 86 | self.conv_27 = self.conv_layer(27,self.conv_26,1024,3,1) 87 | self.conv_28 = self.conv_layer(28,self.conv_27,1024,3,1) 88 | self.fc_29 = self.fc_layer(29,self.conv_28,512,flat=True,linear=False) 89 | self.fc_30 = self.fc_layer(30,self.fc_29,4096,flat=False,linear=False) 90 | #skip dropout_31 91 | self.fc_32 = self.fc_layer(32,self.fc_30,1470,flat=False,linear=True) 92 | self.sess = tf.Session() 93 | self.sess.run(tf.initialize_all_variables()) 94 | self.saver = tf.train.Saver() 95 | self.saver.restore(self.sess,self.weights_file) 96 | if self.disp_console : print "Loading complete!" + '\n' 97 | 98 | def conv_layer(self,idx,inputs,filters,size,stride): 99 | channels = inputs.get_shape()[3] 100 | weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1)) 101 | biases = tf.Variable(tf.constant(0.1, shape=[filters])) 102 | 103 | pad_size = size//2 104 | pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]]) 105 | inputs_pad = tf.pad(inputs,pad_mat) 106 | 107 | conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv') 108 | conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased') 109 | if self.disp_console : print ' Layer %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels)) 110 | return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu') 111 | 112 | def pooling_layer(self,idx,inputs,size,stride): 113 | if self.disp_console : print ' Layer %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride) 114 | return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool') 115 | 116 | def 
fc_layer(self,idx,inputs,hiddens,flat = False,linear = False): 117 | input_shape = inputs.get_shape().as_list() 118 | if flat: 119 | dim = input_shape[1]*input_shape[2]*input_shape[3] 120 | inputs_transposed = tf.transpose(inputs,(0,3,1,2)) 121 | inputs_processed = tf.reshape(inputs_transposed, [-1,dim]) 122 | else: 123 | dim = input_shape[1] 124 | inputs_processed = inputs 125 | weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1)) 126 | biases = tf.Variable(tf.constant(0.1, shape=[hiddens])) 127 | if self.disp_console : print ' Layer %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear)) 128 | if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc') 129 | ip = tf.add(tf.matmul(inputs_processed,weight),biases) 130 | return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc') 131 | 132 | def detect_from_cvmat(self,img): 133 | s = time.time() 134 | self.h_img,self.w_img,_ = img.shape 135 | img_resized = cv2.resize(img, (448, 448)) 136 | img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB) 137 | img_resized_np = np.asarray( img_RGB ) 138 | inputs = np.zeros((1,448,448,3),dtype='float32') 139 | inputs[0] = (img_resized_np/255.0)*2.0-1.0 140 | in_dict = {self.x: inputs} 141 | net_output = self.sess.run(self.fc_32,feed_dict=in_dict) 142 | self.result = self.interpret_output(net_output[0]) 143 | self.show_results(img,self.result) 144 | strtime = str(time.time()-s) 145 | if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n' 146 | 147 | def detect_from_file(self,filename): 148 | if self.disp_console : print 'Detect from ' + filename 149 | img = cv2.imread(filename) 150 | #img = misc.imread(filename) 151 | self.detect_from_cvmat(img) 152 | 153 | def detect_from_crop_sample(self): 154 | self.w_img = 640 155 | self.h_img = 420 156 | f = np.array(open('person_crop.txt','r').readlines(),dtype='float32') 157 | inputs = 
np.zeros((1,448,448,3),dtype='float32') 158 | for c in range(3): 159 | for y in range(448): 160 | for x in range(448): 161 | inputs[0,y,x,c] = f[c*448*448+y*448+x] 162 | 163 | in_dict = {self.x: inputs} 164 | net_output = self.sess.run(self.fc_32,feed_dict=in_dict) 165 | self.boxes, self.probs = self.interpret_output(net_output[0]) 166 | img = cv2.imread('person.jpg') 167 | self.show_results(self.boxes,img) 168 | 169 | def interpret_output(self,output): 170 | probs = np.zeros((7,7,2,20)) 171 | class_probs = np.reshape(output[0:980],(7,7,20)) 172 | scales = np.reshape(output[980:1078],(7,7,2)) 173 | boxes = np.reshape(output[1078:],(7,7,2,4)) 174 | offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0)) 175 | 176 | boxes[:,:,:,0] += offset 177 | boxes[:,:,:,1] += np.transpose(offset,(1,0,2)) 178 | boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0 179 | boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2]) 180 | boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3]) 181 | 182 | boxes[:,:,:,0] *= self.w_img 183 | boxes[:,:,:,1] *= self.h_img 184 | boxes[:,:,:,2] *= self.w_img 185 | boxes[:,:,:,3] *= self.h_img 186 | 187 | for i in range(2): 188 | for j in range(20): 189 | probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i]) 190 | 191 | filter_mat_probs = np.array(probs>=self.threshold,dtype='bool') 192 | filter_mat_boxes = np.nonzero(filter_mat_probs) 193 | boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 194 | probs_filtered = probs[filter_mat_probs] 195 | classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 196 | 197 | argsort = np.array(np.argsort(probs_filtered))[::-1] 198 | boxes_filtered = boxes_filtered[argsort] 199 | probs_filtered = probs_filtered[argsort] 200 | classes_num_filtered = classes_num_filtered[argsort] 201 | 202 | for i in range(len(boxes_filtered)): 203 | if probs_filtered[i] == 0 : continue 204 | for j in 
range(i+1,len(boxes_filtered)): 205 | if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 206 | probs_filtered[j] = 0.0 207 | 208 | filter_iou = np.array(probs_filtered>0.0,dtype='bool') 209 | boxes_filtered = boxes_filtered[filter_iou] 210 | probs_filtered = probs_filtered[filter_iou] 211 | classes_num_filtered = classes_num_filtered[filter_iou] 212 | 213 | result = [] 214 | for i in range(len(boxes_filtered)): 215 | result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]]) 216 | 217 | return result 218 | 219 | def show_results(self,img,results): 220 | img_cp = img.copy() 221 | if self.filewrite_txt : 222 | ftxt = open(self.tofile_txt,'w') 223 | class_results_set = set() 224 | for i in range(len(results)): 225 | x = int(results[i][1]) 226 | y = int(results[i][2]) 227 | w = int(results[i][3])//2 228 | h = int(results[i][4])//2 229 | class_results_set.add(results[i][0]) 230 | if self.disp_console : print ' class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5]) 231 | if self.filewrite_img or self.imshow: 232 | cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2) 233 | cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1) 234 | cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1) 235 | if self.filewrite_txt : 236 | ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n') 237 | if "person" in class_results_set: 238 | self.detected+=1 239 | # new_img_path=self.fromfolder[:-14]+"test7/selected_ImageNet_person/"+str(self.detected)+"_white_margin_orgin_pic.jpg" 240 | # cv2.imwrite(new_img_path,img_cp) 241 | if self.filewrite_img : 242 | if self.disp_console : print ' image file writed : ' + self.tofile_img 243 
| is_saved = cv2.imwrite(self.tofile_img,img_cp) 244 | if is_saved == True: 245 | print("Saved under:",self.tofile_img) 246 | else: 247 | print("Saving error!s") 248 | if self.imshow : 249 | cv2.imshow('YOLO_small detection',img_cp) 250 | cv2.waitKey(1) 251 | if self.filewrite_txt : 252 | if self.disp_console : print ' txt file writed : ' + self.tofile_txt 253 | ftxt.close() 254 | 255 | def iou(self,box1,box2): 256 | tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2]) 257 | lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3]) 258 | if tb < 0 or lr < 0 : intersection = 0 259 | else : intersection = tb*lr 260 | return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection) 261 | 262 | def training(self): #TODO add training function! 263 | return None 264 | 265 | 266 | 267 | 268 | def main(argvs): 269 | yolo = YOLO_TF(argvs) 270 | cv2.waitKey(1000) 271 | 272 | 273 | if __name__=='__main__': 274 | main(sys.argv) 275 | -------------------------------------------------------------------------------- /YOLO_tiny_tf.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import tensorflow as tf 4 | import time 5 | import sys 6 | import os 7 | import pdb 8 | 9 | class YOLO_TF: 10 | fromfile = None 11 | tofile_img = 'test/output.jpg' 12 | tofile_txt = 'test/output.txt' 13 | imshow = False 14 | filewrite_img = False 15 | filewrite_txt = False 16 | disp_console = True 17 | weights_file = 'weights/YOLO_tiny.ckpt' 18 | alpha = 0.1 19 | threshold = 0.2 20 | iou_threshold = 0.5 21 | num_class = 20 22 | num_box = 2 23 | grid_size = 7 24 | classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"] 25 | 26 | w_img = 640 27 | h_img = 480 28 | 29 | def __init__(self,argvs = []): 
30 | self.detected = 0 31 | self.overall_pics = 0 32 | self.argv_parser(argvs) 33 | self.build_networks() 34 | if self.fromfile is not None: self.detect_from_file(self.fromfile) 35 | print(self.fromfolder) 36 | if self.fromfolder is not None: 37 | filename_list = os.listdir(self.fromfolder) 38 | for filename in filename_list: 39 | print("Pics number:",self.overall_pics) 40 | self.overall_pics+=1 41 | self.detect_from_file(self.fromfolder+"/"+filename) 42 | print("Accuracy:", float(self.detected)/self.overall_pics) 43 | 44 | def argv_parser(self,argvs): 45 | for i in range(1,len(argvs),2): 46 | if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1] 47 | if argvs[i] == '-fromfolder' : 48 | self.fromfolder = argvs[i+1] 49 | else: 50 | self.fromfolder = None 51 | if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True 52 | if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True 53 | if argvs[i] == '-imshow' : 54 | if argvs[i+1] == '1' :self.imshow = True 55 | else : self.imshow = False 56 | if argvs[i] == '-disp_console' : 57 | if argvs[i+1] == '1' :self.disp_console = True 58 | else : self.disp_console = False 59 | 60 | def build_networks(self): 61 | if self.disp_console : print "Building YOLO_tiny graph..." 
        self.x = tf.placeholder('float32',[None,448,448,3])
        self.conv_1 = self.conv_layer(1,self.x,16,3,1)
        self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
        self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1)
        self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
        self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1)
        self.pool_6 = self.pooling_layer(6,self.conv_5,2,2)
        self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1)
        self.pool_8 = self.pooling_layer(8,self.conv_7,2,2)
        self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1)
        self.pool_10 = self.pooling_layer(10,self.conv_9,2,2)
        self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1)
        self.pool_12 = self.pooling_layer(12,self.conv_11,2,2)
        self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1)
        self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1)
        self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1)
        self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False)
        self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False)
        #skip dropout_18
        self.fc_19 = self.fc_layer(19,self.fc_17,1470,flat=False,linear=True)
        self.sess = tf.Session()
        self.sess.run(tf.initialize_all_variables())
        self.saver = tf.train.Saver()
        self.saver.restore(self.sess,self.weights_file)
        if self.disp_console : print "Loading complete!" + '\n'

    def conv_layer(self,idx,inputs,filters,size,stride):
        channels = inputs.get_shape()[3]
        weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
        biases = tf.Variable(tf.constant(0.1, shape=[filters]))

        # pad explicitly by size//2 on height and width, then convolve with 'VALID'
        pad_size = size//2
        pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
        inputs_pad = tf.pad(inputs,pad_mat)

        conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')
        conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')
        if self.disp_console : print ' Layer %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
        # leaky ReLU: max(alpha*x, x)
        return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')

    def pooling_layer(self,idx,inputs,size,stride):
        if self.disp_console : print ' Layer %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
        return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')

    def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
        input_shape = inputs.get_shape().as_list()
        if flat:
            # reorder NHWC -> NCHW, then flatten to [batch, dim]
            dim = input_shape[1]*input_shape[2]*input_shape[3]
            inputs_transposed = tf.transpose(inputs,(0,3,1,2))
            inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
        else:
            dim = input_shape[1]
            inputs_processed = inputs
        weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
        biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))
        if self.disp_console : print ' Layer %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))
        if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
        ip = tf.add(tf.matmul(inputs_processed,weight),biases)
        return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')

    def detect_from_cvmat(self,img):
        s = time.time()
        self.h_img,self.w_img,_ = img.shape
        img_resized = cv2.resize(img, (448, 448))
        img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
        img_resized_np = np.asarray( img_RGB )
        inputs = np.zeros((1,448,448,3),dtype='float32')
        # scale pixel values from [0,255] to [-1,1]
        inputs[0] = (img_resized_np/255.0)*2.0-1.0
        in_dict = {self.x: inputs}
        net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
        self.result = self.interpret_output(net_output[0])
        self.show_results(img,self.result)
        strtime = str(time.time()-s)
        if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'

    def detect_from_file(self,filename):
        if self.disp_console : print 'Detect from ' + filename
        img = cv2.imread(filename)
        #img = misc.imread(filename)
        self.detect_from_cvmat(img)

    def detect_from_crop_sample(self):
        self.w_img = 640
        self.h_img = 420
        f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
        inputs = np.zeros((1,448,448,3),dtype='float32')
        for c in range(3):
            for y in range(448):
                for x in range(448):
                    inputs[0,y,x,c] = f[c*448*448+y*448+x]

        in_dict = {self.x: inputs}
        net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
        # interpret_output returns a single result list, and show_results expects (img, results)
        self.result = self.interpret_output(net_output[0])
        img = cv2.imread('person.jpg')
        self.show_results(img,self.result)

    def interpret_output(self,output):
        # 1470 outputs = 7*7*20 class probabilities + 7*7*2 box confidences + 7*7*2*4 box coordinates
        probs = np.zeros((7,7,2,20))
        class_probs = np.reshape(output[0:980],(7,7,20))
        scales = np.reshape(output[980:1078],(7,7,2))
        boxes = np.reshape(output[1078:],(7,7,2,4))
        offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0))

        # convert cell-relative centers to image fractions; width and height are predicted as square roots
        boxes[:,:,:,0] += offset
        boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
        boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0
        boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
        boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])

        boxes[:,:,:,0] *= self.w_img
        boxes[:,:,:,1] *= self.h_img
        boxes[:,:,:,2] *= self.w_img
        boxes[:,:,:,3] *= self.h_img

        for i in range(2):
            for j in range(20):
                probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])

        filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
        filter_mat_boxes = np.nonzero(filter_mat_probs)
        boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
        probs_filtered = probs[filter_mat_probs]
        classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]

        # sort by descending probability
        argsort = np.array(np.argsort(probs_filtered))[::-1]
        boxes_filtered = boxes_filtered[argsort]
        probs_filtered = probs_filtered[argsort]
        classes_num_filtered = classes_num_filtered[argsort]

        # non-maximum suppression: zero out lower-scoring boxes that overlap a kept box
        for i in range(len(boxes_filtered)):
            if probs_filtered[i] == 0 : continue
            for j in range(i+1,len(boxes_filtered)):
                if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold :
                    probs_filtered[j] = 0.0

        filter_iou = np.array(probs_filtered>0.0,dtype='bool')
        boxes_filtered = boxes_filtered[filter_iou]
        probs_filtered = probs_filtered[filter_iou]
        classes_num_filtered = classes_num_filtered[filter_iou]

        result = []
        for i in range(len(boxes_filtered)):
            result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])

        return result

    def show_results(self,img,results):
        img_cp = img.copy()
        if self.filewrite_txt :
            ftxt = open(self.tofile_txt,'w')
        class_results_set = set()
        for i in range(len(results)):
            # x,y are the box center; w,h are half the box width and height
            x = int(results[i][1])
            y = int(results[i][2])
            w = int(results[i][3])//2
            h = int(results[i][4])//2
            class_results_set.add(results[i][0])
            if self.disp_console : print ' class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
            if self.filewrite_img or self.imshow:
                cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
                cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
                cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
            if self.filewrite_txt :
                ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n')
        if "person" in class_results_set:
            self.detected+=1
            # new_img_path=self.fromfolder[:-14]+"test7/selected_ImageNet_person/"+str(self.detected)+"_white_margin_orgin_pic.jpg"
            # cv2.imwrite(new_img_path,img_cp)
        if self.filewrite_img :
            if self.disp_console : print ' image file written : ' + self.tofile_img
            # cv2.imwrite returns False when the image could not be written
            is_saved = cv2.imwrite(self.tofile_img,img_cp)
            if is_saved:
                print 'Saved under: ' + self.tofile_img
            else:
                print 'Saving error!'
        if self.imshow :
            cv2.imshow('YOLO_tiny detection',img_cp)
            cv2.waitKey(1)
        if self.filewrite_txt :
            if self.disp_console : print ' txt file written : ' + self.tofile_txt
            ftxt.close()

    def iou(self,box1,box2):
        # boxes are (x_center, y_center, width, height); tb and lr are the overlap extents along x and y
        tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
        lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
        if tb < 0 or lr < 0 : intersection = 0
        else : intersection = tb*lr
        return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)

    def training(self): #TODO add training function!
        return None




def main(argvs):
    yolo = YOLO_TF(argvs)
    cv2.waitKey(1000)


if __name__=='__main__':
    main(sys.argv)

--------------------------------------------------------------------------------
/YOLO_weight_extractor/Readme.md:
--------------------------------------------------------------------------------
# YOLO weight converter (darknet -> tensorflow)

1. Usage

(1) download this modified version of darknet

(2) put your darknet weight file (made by pjreddie or by you) in the folder that contains the darknet executable

(3) run yolo in test mode (ex -> ./darknet yolo test cfg/yolo-small.cfg yolo-small.weights)

(4) the modified yolo will write txt files to the folder 'cjy'

(5) exit yolo when you see 'enter image path:'

(6) open the builder python file (YOLO_full_builder.py or YOLO_small_builder.py or YOLO_tiny_builder.py)

(7) change weights_dir in line 6 (the folder that contains the extracted txt files)

(8) change the path in the last line of function 'build_networks' (this is the path where the ckpt file will be stored)

(9) run the builder python script

2. Copyright

I modified pjreddie's darknet code. (https://github.com/pjreddie/darknet)

--------------------------------------------------------------------------------
/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz

--------------------------------------------------------------------------------
/test/person.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/test/person.jpg

--------------------------------------------------------------------------------
/weights/put_weight_file_here.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/weights/put_weight_file_here.txt
--------------------------------------------------------------------------------
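Note on the suppression step used by `interpret_output` and `iou` in YOLO_tiny_tf.py: the two methods together amount to a greedy non-maximum suppression over boxes stored as (x_center, y_center, width, height). The sketch below is a standalone illustration of that idea, not the repository's code. The function names `iou_center` and `greedy_nms`, the sample boxes and scores, and the 0.5 threshold are all assumptions made for this example; the actual value of `self.iou_threshold` is set elsewhere in the class.

```python
import numpy as np


def iou_center(box1, box2):
    """IoU of two boxes given as (x_center, y_center, width, height).

    Hypothetical helper that mirrors the arithmetic of the `iou` method.
    """
    # Overlap along x: smallest right edge minus largest left edge.
    overlap_w = (min(box1[0] + 0.5 * box1[2], box2[0] + 0.5 * box2[2])
                 - max(box1[0] - 0.5 * box1[2], box2[0] - 0.5 * box2[2]))
    # Overlap along y: smallest bottom edge minus largest top edge.
    overlap_h = (min(box1[1] + 0.5 * box1[3], box2[1] + 0.5 * box2[3])
                 - max(box1[1] - 0.5 * box1[3], box2[1] - 0.5 * box2[3]))
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    intersection = overlap_w * overlap_h
    union = box1[2] * box1[3] + box2[2] * box2[3] - intersection
    return intersection / union


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that survive, highest score first."""
    order = np.argsort(scores)[::-1]
    suppressed = np.zeros(len(boxes), dtype=bool)
    keep = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(int(i))
        # Drop every lower-scoring box that overlaps the kept box too much.
        for j in order[pos + 1:]:
            if not suppressed[j] and iou_center(boxes[i], boxes[j]) > iou_threshold:
                suppressed[j] = True
    return keep


# Illustrative data: boxes 0 and 1 overlap heavily, box 2 is far away.
boxes = np.array([
    [100.0, 100.0, 50.0, 50.0],
    [105.0, 100.0, 50.0, 50.0],
    [300.0, 300.0, 40.0, 40.0],
])
scores = np.array([0.9, 0.6, 0.8])
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
print(kept)
```

With this sample data, box 1 is suppressed by the higher-scoring box 0, and the distant box 2 is kept. Unlike this sketch, the class method marks suppressed boxes by zeroing their probability and filters them afterwards.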