├── LICENSE
├── README.md
├── YOLO_face_tf.py
├── YOLO_small_tf.py
├── YOLO_tiny_tf.py
├── YOLO_weight_extractor
│   ├── Readme.md
│   └── YOLO_weight_extractor.tar.gz
├── test
│   └── person.jpg
└── weights
    └── put_weight_file_here.txt


/LICENSE:
--------------------------------------------------------------------------------
 1 |                               YOLO_tensorflow LICENSE
 2 |                              Version 0.1, FEB 15 2016
 3 | 
 4 | ACCORDING TO ORIGINAL CODE'S LICENSE,
 5 | 
 6 | DO NOT USE THIS ON COMMERCIAL!
 7 | I OR ORIGINAL AUTHOR DO NOT HOLD LIABILITY FOR ANY DAMAGES!
 8 | 
 9 | 
10 | BELOW IS THE ORIGINAL CODE'S LICENSE
11 | {
12 | THIS SOFTWARE LICENSE IS PROVIDED "ALL CAPS" SO THAT YOU KNOW IT IS SUPER
13 | SERIOUS AND YOU DON'T MESS AROUND WITH COPYRIGHT LAW BECAUSE YOU WILL GET IN
14 | TROUBLE HERE ARE SOME OTHER BUZZWORDS COMMONLY IN THESE THINGS WARRANTIES
15 | LIABILITY CONTRACT TORT LIABLE CLAIMS RESTRICTION MERCHANTABILITY SUBJECT TO
16 | THE FOLLOWING CONDITIONS:
17 | 
18 | 1. #yolo
19 | 2. #swag
20 | 3. #blazeit
21 | }
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # YOLO_tensorflow
 2 | 
 3 | (Version 0.3, Last updated :2017.02.21)
 4 | 
 5 | ### 1.Introduction
 6 | 
 7 | This is a TensorFlow implementation of YOLO: Real-Time Object Detection.
 8 | 
 9 | For now it can only run predictions with the pretrained YOLO_small and YOLO_tiny networks.
10 | 
11 | (+ YOLO_face detector from https://github.com/quanhua92/darknet )
12 | 
13 | I extracted weight values from darknet's (.weight) files.
14 | 
15 | My code does not support training. Use darknet for training.
16 | 
17 | Original code(C implementation) & paper : http://pjreddie.com/darknet/yolo/
18 | 
19 | ### 2.Install
20 | (1) Download code
21 | 
22 | (2) Download YOLO weight file from
23 | 
24 | YOLO_small : https://drive.google.com/file/d/0B2JbaJSrWLpza08yS2FSUnV2dlE/view?usp=sharing
25 | 
26 | YOLO_tiny  : https://drive.google.com/file/d/0B2JbaJSrWLpza0FtQlc3ejhMTTA/view?usp=sharing
27 | 
28 | YOLO_face : https://drive.google.com/file/d/0B2JbaJSrWLpzMzR5eURGN2dMTk0/view?usp=sharing
29 | 
30 | (3) Put the 'YOLO_(version).ckpt' file in the 'weights' folder of the downloaded code
31 | 
32 | ### 3.Usage
33 | 
34 | (1) direct usage with default settings (display on console, show output image, no output file writing)
35 | 
36 | 	python YOLO_(small or tiny)_tf.py -fromfile (input image filename)
37 | 
38 | (2) direct usage with custom settings
39 | 
40 | 	python YOLO_(small or tiny)_tf.py argvs
41 | 
42 | 	where argvs are
43 | 
44 | 	-fromfile (input image filename) : input image file
45 | 	-disp_console (0 or 1) : whether to display results on the terminal
46 | 	-imshow (0 or 1) : whether to display the result image
47 | 	-tofile_img (output image filename) : output image file
48 | 	-tofile_txt (output txt filename) : output text file (contains class, x, y, w, h, probability)
49 | 
50 | (3) import on other scripts
51 | 
52 | 	import YOLO_(small or tiny)_tf
53 | 	yolo = YOLO_(small or tiny)_tf.YOLO_TF()
54 | 
55 | 	yolo.disp_console = (True or False, default = True)
56 | 	yolo.imshow = (True or False, default = True)
57 | 	yolo.tofile_img = (output image filename)
58 | 	yolo.tofile_txt = (output txt filename)
59 | 	yolo.filewrite_img = (True or False, default = False)
60 | 	yolo.filewrite_txt = (True or False, default = False)
61 | 
62 | 	yolo.detect_from_file(filename)
63 | 	yolo.detect_from_cvmat(cvmat)
64 | 
65 | ### 4.Requirements
66 | 
67 | - Tensorflow
68 | - Opencv2
69 | 
70 | ### 5.Copyright
71 | 
72 | According to the LICENSE file of the original code:
73 | - Neither I nor the original author holds any liability for damages.
74 | - Do not use this commercially!
75 | 
76 | ### 6.Changelog
77 | 2016/02/15 : First upload!
78 | 
79 | 2016/02/16 : Added YOLO_tiny, Fixed bug that ignores one of the boxes in grid when both boxes detected valid objects
80 | 
81 | 2016/08/26 : Uploaded weight file converter! (darknet weight -> tensorflow ckpt)
82 | 
83 | 2017/02/21 : Added YOLO_face (Thanks https://github.com/quanhua92/darknet)
84 | 
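The `-tofile_txt` option described in the README writes one detection per line as `class,x,y,w,h,probability`. The following is a minimal Python 3 sketch for reading such a file back. It is not part of the repository: the `Detection` tuple, the `parse_detections` helper, and the sample rows are hypothetical and only illustrate the format. Note that `show_results` as listed writes the halved width and height, so `w` and `h` may need doubling to recover the full box size.

```python
import csv
import io
from typing import List, NamedTuple


class Detection(NamedTuple):
    """One row of the text output (hypothetical container, not in the repo)."""
    label: str
    x: int
    y: int
    w: int
    h: int
    confidence: float


def parse_detections(lines) -> List[Detection]:
    """Parse rows of the form class,x,y,w,h,probability."""
    detections = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        label, x, y, w, h, conf = row
        detections.append(
            Detection(label, int(x), int(y), int(w), int(h), float(conf))
        )
    return detections


# Made-up sample rows standing in for a file written with -tofile_txt.
sample = io.StringIO("person,320,240,60,110,0.83\ndog,100,300,40,35,0.41\n")
detections = parse_detections(sample)
for d in detections:
    print(d.label, d.x, d.y, d.w, d.h, d.confidence)
```

To read a real output file, pass an open file object instead of the `StringIO` sample, for example `parse_detections(open('test/output.txt'))`.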


--------------------------------------------------------------------------------
/YOLO_face_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import cv2
  3 | import tensorflow as tf
  4 | import time
  5 | import sys
  6 | 
  7 | class YOLO_TF:
  8 | 	fromfile = None
  9 | 	tofile_img = 'test/output.jpg'
 10 | 	tofile_txt = 'test/output.txt'
 11 | 	imshow = True
 12 | 	filewrite_img = False
 13 | 	filewrite_txt = False
 14 | 	disp_console = True
 15 | 	weights_file = 'weights/YOLO_face'
 16 | 	alpha = 0.1
 17 | 	threshold = 0.2
 18 | 	iou_threshold = 0.5
 19 | 	num_class = 1
 20 | 	num_box = 2
 21 | 	grid_size = 11
 22 | 	classes =  ["face"]
 23 | 
 24 | 	w_img = 640
 25 | 	h_img = 480
 26 | 
 27 | 	def __init__(self,argvs = []):
 28 | 		self.argv_parser(argvs)
 29 | 		self.build_networks()
 30 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 31 | 	def argv_parser(self,argvs):
 32 | 		for i in range(1,len(argvs),2):
 33 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 34 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 35 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 36 | 			if argvs[i] == '-imshow' :
 37 | 				if argvs[i+1] == '1' :self.imshow = True
 38 | 				else : self.imshow = False
 39 | 			if argvs[i] == '-disp_console' :
 40 | 				if argvs[i+1] == '1' :self.disp_console = True
 41 | 				else : self.disp_console = False
 42 | 				
 43 | 	def build_networks(self):
 44 | 		if self.disp_console : print "Building YOLO_face graph..."
 45 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 46 | 		self.conv_1 = self.conv_layer(1,self.x,16,3,1)
 47 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 48 | 		self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1)
 49 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 50 | 		self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1)
 51 | 		self.pool_6 = self.pooling_layer(6,self.conv_5,2,2)
 52 | 		self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1)
 53 | 		self.pool_8 = self.pooling_layer(8,self.conv_7,2,2)
 54 | 		self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1)
 55 | 		self.pool_10 = self.pooling_layer(10,self.conv_9,2,2)
 56 | 		self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1)
 57 | 		self.pool_12 = self.pooling_layer(12,self.conv_11,2,2)
 58 | 		self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1)
 59 | 		self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1)
 60 | 		self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1)
 61 | 		self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False)
 62 | 		self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False)
 63 | 		#skip dropout_18
 64 | 		self.fc_19 = self.fc_layer(19,self.fc_17,1331,flat=False,linear=True)
 65 | 		self.sess = tf.Session()
 66 | 		self.sess.run(tf.initialize_all_variables())
 67 | 		self.saver = tf.train.Saver()
 68 | 		self.saver.restore(self.sess,self.weights_file)
 69 | 		if self.disp_console : print "Loading complete!" + '\n'
 70 | 
 71 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 72 | 		channels = inputs.get_shape()[3]
 73 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
 74 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
 75 | 
 76 | 		pad_size = size//2
 77 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
 78 | 		inputs_pad = tf.pad(inputs,pad_mat)
 79 | 
 80 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
 81 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
 82 | 		if self.disp_console : print '    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
 83 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
 84 | 
 85 | 	def pooling_layer(self,idx,inputs,size,stride):
 86 | 		if self.disp_console : print '    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
 87 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
 88 | 
 89 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
 90 | 		input_shape = inputs.get_shape().as_list()		
 91 | 		if flat:
 92 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
 93 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
 94 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
 95 | 		else:
 96 | 			dim = input_shape[1]
 97 | 			inputs_processed = inputs
 98 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
 99 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
100 | 		if self.disp_console : print '    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))	
101 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
102 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
103 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
104 | 
105 | 	def detect_from_cvmat(self,img):
106 | 		s = time.time()
107 | 		self.h_img,self.w_img,_ = img.shape
108 | 		img_resized = cv2.resize(img, (448, 448))
109 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
110 | 		img_resized_np = np.asarray( img_RGB )
111 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
112 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
113 | 		in_dict = {self.x: inputs}
114 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
115 | 		self.result = self.interpret_output(net_output[0])
116 | 		self.show_results(img,self.result)
117 | 		strtime = str(time.time()-s)
118 | 		if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'
119 | 
120 | 	def detect_from_file(self,filename):
121 | 		if self.disp_console : print 'Detect from ' + filename
122 | 		img = cv2.imread(filename)
123 | 		#img = misc.imread(filename)
124 | 		self.detect_from_cvmat(img)
125 | 
126 | 	def detect_from_crop_sample(self):
127 | 		self.w_img = 640
128 | 		self.h_img = 420
129 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
130 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
131 | 		for c in range(3):
132 | 			for y in range(448):
133 | 				for x in range(448):
134 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
135 | 
136 | 		in_dict = {self.x: inputs}
137 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
138 | 		self.result = self.interpret_output(net_output[0])
139 | 		img = cv2.imread('person.jpg')
140 | 		self.show_results(img,self.result)
141 | 
142 | 	def interpret_output(self,output):
143 | 		prob_range = [0,self.grid_size*self.grid_size*self.num_class]
144 | 		scales_range = [prob_range[1],prob_range[1]+self.grid_size*self.grid_size*self.num_box]
145 | 		boxes_range = [scales_range[1],scales_range[1]+self.grid_size*self.grid_size*self.num_box*4]
146 | 
147 | 		probs = np.zeros((self.grid_size,self.grid_size,self.num_box,self.num_class))
148 | 		class_probs = np.reshape(output[0:prob_range[1]],(self.grid_size,self.grid_size,self.num_class))
149 | 		scales = np.reshape(output[scales_range[0]:scales_range[1]],(self.grid_size,self.grid_size,self.num_box))
150 | 		boxes = np.reshape(output[boxes_range[0]:],(self.grid_size,self.grid_size,self.num_box,4))
151 | 		offset = np.transpose(np.reshape(np.array([np.arange(self.grid_size)]*(2*self.grid_size)),(2,self.grid_size,self.grid_size)),(1,2,0))
152 | 
153 | 		boxes[:,:,:,0] += offset
154 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
155 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / float(self.grid_size)
156 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
157 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
158 | 		
159 | 		boxes[:,:,:,0] *= self.w_img
160 | 		boxes[:,:,:,1] *= self.h_img
161 | 		boxes[:,:,:,2] *= self.w_img
162 | 		boxes[:,:,:,3] *= self.h_img
163 | 
164 | 		for i in range(self.num_box):
165 | 			for j in range(self.num_class):
166 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
167 | 
168 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
169 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
170 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
171 | 		probs_filtered = probs[filter_mat_probs]
172 | 		classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 
173 | 
174 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
175 | 		boxes_filtered = boxes_filtered[argsort]
176 | 		probs_filtered = probs_filtered[argsort]
177 | 		classes_num_filtered = classes_num_filtered[argsort]
178 | 		
179 | 		for i in range(len(boxes_filtered)):
180 | 			if probs_filtered[i] == 0 : continue
181 | 			for j in range(i+1,len(boxes_filtered)):
182 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
183 | 					probs_filtered[j] = 0.0
184 | 		
185 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
186 | 		boxes_filtered = boxes_filtered[filter_iou]
187 | 		probs_filtered = probs_filtered[filter_iou]
188 | 		classes_num_filtered = classes_num_filtered[filter_iou]
189 | 
190 | 		result = []
191 | 		for i in range(len(boxes_filtered)):
192 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
193 | 
194 | 		return result
195 | 
196 | 	def show_results(self,img,results):
197 | 		img_cp = img.copy()
198 | 		if self.filewrite_txt :
199 | 			ftxt = open(self.tofile_txt,'w')
200 | 		for i in range(len(results)):
201 | 			x = int(results[i][1])
202 | 			y = int(results[i][2])
203 | 			w = int(results[i][3])//2
204 | 			h = int(results[i][4])//2
205 | 			if self.disp_console : print '    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
206 | 			if self.filewrite_img or self.imshow:
207 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
208 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
209 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
210 | 			if self.filewrite_txt :				
211 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n')
212 | 		if self.filewrite_img : 
213 | 			if self.disp_console : print '    image file written : ' + self.tofile_img
214 | 			cv2.imwrite(self.tofile_img,img_cp)			
215 | 		if self.imshow :
216 | 			cv2.imshow('YOLO_face detection',img_cp)
217 | 			cv2.waitKey(1)
218 | 		if self.filewrite_txt : 
219 | 			if self.disp_console : print '    txt file written : ' + self.tofile_txt
220 | 			ftxt.close()
221 | 
222 | 	def iou(self,box1,box2):
223 | 		tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
224 | 		lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
225 | 		if tb < 0 or lr < 0 : intersection = 0
226 | 		else : intersection =  tb*lr
227 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
228 | 
229 | 	def training(self): #TODO add training function!
230 | 		return None
231 | 
232 | 	
233 | 			
234 | 
235 | def main(argvs):
236 | 	yolo = YOLO_TF(argvs)
237 | 	cv2.waitKey(1000)
238 | 
239 | 
240 | if __name__=='__main__':	
241 | 	main(sys.argv)
242 | 
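The `iou` method and the double loop at the end of `interpret_output` above amount to a greedy non-maximum suppression over boxes given as `(x_center, y_center, width, height)`. The following is a standalone Python 3 sketch of that idea under stated assumptions: the function names and the example boxes are hypothetical, and a guard for a zero-area union is added that the listed code does not have. Like the listed code, it suppresses overlapping boxes regardless of class.

```python
def iou(box1, box2):
    """Intersection over union for (x_center, y_center, width, height) boxes."""
    overlap_w = (min(box1[0] + 0.5 * box1[2], box2[0] + 0.5 * box2[2])
                 - max(box1[0] - 0.5 * box1[2], box2[0] - 0.5 * box2[2]))
    overlap_h = (min(box1[1] + 0.5 * box1[3], box2[1] + 0.5 * box2[3])
                 - max(box1[1] - 0.5 * box1[3], box2[1] - 0.5 * box2[3]))
    if overlap_w < 0 or overlap_h < 0:
        return 0.0  # boxes do not overlap
    intersection = overlap_w * overlap_h
    union = box1[2] * box1[3] + box2[2] * box2[3] - intersection
    # Added guard (assumption): avoid dividing by zero for degenerate boxes.
    return intersection / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    suppressed = set()
    for pos, i in enumerate(order):
        if i in suppressed:
            continue
        keep.append(i)
        # Drop every lower-scoring box that overlaps the kept one too much.
        for j in order[pos + 1:]:
            if j not in suppressed and iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed.add(j)
    return keep


# Made-up example: the first two boxes overlap heavily, the third is far away.
boxes = [(100, 100, 50, 50), (105, 100, 50, 50), (300, 300, 40, 40)]
scores = [0.9, 0.6, 0.7]
keep = non_max_suppression(boxes, scores)
print(keep)
```

The listed code marks suppressed boxes by zeroing their probability and filtering afterwards; the sketch tracks them in a set instead, which keeps the original scores intact.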


--------------------------------------------------------------------------------
/YOLO_small_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import cv2
  3 | import tensorflow as tf
  4 | import time
  5 | import sys
  6 | import os
  7 | 
  8 | class YOLO_TF:
  9 | 	fromfile = None
 10 | 	tofile_img = 'test/output.jpg'
 11 | 	tofile_txt = 'test/output.txt'
 12 | 	imshow = True
 13 | 	filewrite_img = False
 14 | 	filewrite_txt = False
 15 | 	disp_console = True
 16 | 	weights_file = 'weights/YOLO_small.ckpt'
 17 | 	alpha = 0.1
 18 | 	threshold = 0.2
 19 | 	iou_threshold = 0.5
 20 | 	num_class = 20
 21 | 	num_box = 2
 22 | 	grid_size = 7
 23 | 	classes =  ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"]
 24 | 
 25 | 	w_img = 640
 26 | 	h_img = 480
 27 | 
 28 | 	def __init__(self,argvs = []):
 29 | 		self.detected = 0
 30 | 		self.overall_pics = 0
 31 | 		self.argv_parser(argvs)
 32 | 		self.build_networks()
 33 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 34 | 		if self.fromfolder is not None:
 35 | 			filename_list = os.listdir(self.fromfolder)
 36 | 			for filename in filename_list:
 37 | 				self.overall_pics+=1
 38 | 				self.detect_from_file(self.fromfolder+"/"+filename)
 39 | 			if self.overall_pics > 0 : print "Fooling_rate: " + str((self.overall_pics-self.detected)/float(self.overall_pics))
 40 | 
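 41 | 	def argv_parser(self,argvs):
 42 | 		# Default to None so the attribute exists even when no arguments are passed
 43 | 		self.fromfolder = None
 44 | 		for i in range(1,len(argvs),2):
 45 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 46 | 			# Set only when the flag is present; an else branch here would reset it on every other argument
 47 | 			if argvs[i] == '-fromfolder' : self.fromfolder = argvs[i+1]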
 48 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 49 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 50 | 			if argvs[i] == '-imshow' :
 51 | 				if argvs[i+1] == '1' :self.imshow = True
 52 | 				else : self.imshow = False
 53 | 			if argvs[i] == '-disp_console' :
 54 | 				if argvs[i+1] == '1' :self.disp_console = True
 55 | 				else : self.disp_console = False
 56 | 				
 57 | 	def build_networks(self):
 58 | 		if self.disp_console : print "Building YOLO_small graph..."
 59 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 60 | 		self.conv_1 = self.conv_layer(1,self.x,64,7,2)
 61 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 62 | 		self.conv_3 = self.conv_layer(3,self.pool_2,192,3,1)
 63 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 64 | 		self.conv_5 = self.conv_layer(5,self.pool_4,128,1,1)
 65 | 		self.conv_6 = self.conv_layer(6,self.conv_5,256,3,1)
 66 | 		self.conv_7 = self.conv_layer(7,self.conv_6,256,1,1)
 67 | 		self.conv_8 = self.conv_layer(8,self.conv_7,512,3,1)
 68 | 		self.pool_9 = self.pooling_layer(9,self.conv_8,2,2)
 69 | 		self.conv_10 = self.conv_layer(10,self.pool_9,256,1,1)
 70 | 		self.conv_11 = self.conv_layer(11,self.conv_10,512,3,1)
 71 | 		self.conv_12 = self.conv_layer(12,self.conv_11,256,1,1)
 72 | 		self.conv_13 = self.conv_layer(13,self.conv_12,512,3,1)
 73 | 		self.conv_14 = self.conv_layer(14,self.conv_13,256,1,1)
 74 | 		self.conv_15 = self.conv_layer(15,self.conv_14,512,3,1)
 75 | 		self.conv_16 = self.conv_layer(16,self.conv_15,256,1,1)
 76 | 		self.conv_17 = self.conv_layer(17,self.conv_16,512,3,1)
 77 | 		self.conv_18 = self.conv_layer(18,self.conv_17,512,1,1)
 78 | 		self.conv_19 = self.conv_layer(19,self.conv_18,1024,3,1)
 79 | 		self.pool_20 = self.pooling_layer(20,self.conv_19,2,2)
 80 | 		self.conv_21 = self.conv_layer(21,self.pool_20,512,1,1)
 81 | 		self.conv_22 = self.conv_layer(22,self.conv_21,1024,3,1)
 82 | 		self.conv_23 = self.conv_layer(23,self.conv_22,512,1,1)
 83 | 		self.conv_24 = self.conv_layer(24,self.conv_23,1024,3,1)
 84 | 		self.conv_25 = self.conv_layer(25,self.conv_24,1024,3,1)
 85 | 		self.conv_26 = self.conv_layer(26,self.conv_25,1024,3,2)
 86 | 		self.conv_27 = self.conv_layer(27,self.conv_26,1024,3,1)
 87 | 		self.conv_28 = self.conv_layer(28,self.conv_27,1024,3,1)
 88 | 		self.fc_29 = self.fc_layer(29,self.conv_28,512,flat=True,linear=False)
 89 | 		self.fc_30 = self.fc_layer(30,self.fc_29,4096,flat=False,linear=False)
 90 | 		#skip dropout_31
 91 | 		self.fc_32 = self.fc_layer(32,self.fc_30,1470,flat=False,linear=True)
 92 | 		self.sess = tf.Session()
 93 | 		self.sess.run(tf.initialize_all_variables())
 94 | 		self.saver = tf.train.Saver()
 95 | 		self.saver.restore(self.sess,self.weights_file)
 96 | 		if self.disp_console : print "Loading complete!" + '\n'
 97 | 
 98 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 99 | 		channels = inputs.get_shape()[3]
100 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
101 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
102 | 
103 | 		pad_size = size//2
104 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
105 | 		inputs_pad = tf.pad(inputs,pad_mat)
106 | 
107 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
108 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
109 | 		if self.disp_console : print '    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
110 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
111 | 
112 | 	def pooling_layer(self,idx,inputs,size,stride):
113 | 		if self.disp_console : print '    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
114 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
115 | 
116 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
117 | 		input_shape = inputs.get_shape().as_list()		
118 | 		if flat:
119 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
120 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
121 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
122 | 		else:
123 | 			dim = input_shape[1]
124 | 			inputs_processed = inputs
125 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
126 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
127 | 		if self.disp_console : print '    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))	
128 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
129 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
130 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
131 | 
132 | 	def detect_from_cvmat(self,img):
133 | 		s = time.time()
134 | 		self.h_img,self.w_img,_ = img.shape
135 | 		img_resized = cv2.resize(img, (448, 448))
136 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
137 | 		img_resized_np = np.asarray( img_RGB )
138 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
139 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
140 | 		in_dict = {self.x: inputs}
141 | 		net_output = self.sess.run(self.fc_32,feed_dict=in_dict)
142 | 		self.result = self.interpret_output(net_output[0])
143 | 		self.show_results(img,self.result)
144 | 		strtime = str(time.time()-s)
145 | 		if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'
146 | 
147 | 	def detect_from_file(self,filename):
148 | 		if self.disp_console : print 'Detect from ' + filename
149 | 		img = cv2.imread(filename)
150 | 		#img = misc.imread(filename)
151 | 		self.detect_from_cvmat(img)
152 | 
153 | 	def detect_from_crop_sample(self):
154 | 		self.w_img = 640
155 | 		self.h_img = 420
156 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
157 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
158 | 		for c in range(3):
159 | 			for y in range(448):
160 | 				for x in range(448):
161 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
162 | 
163 | 		in_dict = {self.x: inputs}
164 | 		net_output = self.sess.run(self.fc_32,feed_dict=in_dict)
165 | 		self.result = self.interpret_output(net_output[0])
166 | 		img = cv2.imread('person.jpg')
167 | 		self.show_results(img,self.result)
168 | 
169 | 	def interpret_output(self,output):
170 | 		probs = np.zeros((7,7,2,20))
171 | 		class_probs = np.reshape(output[0:980],(7,7,20))
172 | 		scales = np.reshape(output[980:1078],(7,7,2))
173 | 		boxes = np.reshape(output[1078:],(7,7,2,4))
174 | 		offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0))
175 | 
176 | 		boxes[:,:,:,0] += offset
177 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
178 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0
179 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
180 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
181 | 		
182 | 		boxes[:,:,:,0] *= self.w_img
183 | 		boxes[:,:,:,1] *= self.h_img
184 | 		boxes[:,:,:,2] *= self.w_img
185 | 		boxes[:,:,:,3] *= self.h_img
186 | 
187 | 		for i in range(2):
188 | 			for j in range(20):
189 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
190 | 
191 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
192 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
193 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
194 | 		probs_filtered = probs[filter_mat_probs]
195 | 		classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 
196 | 
197 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
198 | 		boxes_filtered = boxes_filtered[argsort]
199 | 		probs_filtered = probs_filtered[argsort]
200 | 		classes_num_filtered = classes_num_filtered[argsort]
201 | 		
202 | 		for i in range(len(boxes_filtered)):
203 | 			if probs_filtered[i] == 0 : continue
204 | 			for j in range(i+1,len(boxes_filtered)):
205 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
206 | 					probs_filtered[j] = 0.0
207 | 		
208 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
209 | 		boxes_filtered = boxes_filtered[filter_iou]
210 | 		probs_filtered = probs_filtered[filter_iou]
211 | 		classes_num_filtered = classes_num_filtered[filter_iou]
212 | 
213 | 		result = []
214 | 		for i in range(len(boxes_filtered)):
215 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
216 | 
217 | 		return result
218 | 
219 | 	def show_results(self,img,results):
220 | 		img_cp = img.copy()
221 | 		if self.filewrite_txt :
222 | 			ftxt = open(self.tofile_txt,'w')
223 | 		class_results_set = set()
224 | 		for i in range(len(results)):
225 | 			x = int(results[i][1])
226 | 			y = int(results[i][2])
227 | 			w = int(results[i][3])//2
228 | 			h = int(results[i][4])//2
229 | 			class_results_set.add(results[i][0])
230 | 			if self.disp_console : print '    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
231 | 			if self.filewrite_img or self.imshow:
232 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
233 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
234 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
235 | 			if self.filewrite_txt :				
236 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n')
237 | 		if "person" in class_results_set:
238 | 			self.detected+=1
239 | 			# new_img_path=self.fromfolder[:-14]+"test7/selected_ImageNet_person/"+str(self.detected)+"_white_margin_orgin_pic.jpg"
240 | 			# cv2.imwrite(new_img_path,img_cp)
241 | 		if self.filewrite_img : 
242 | 			if self.disp_console : print '    image file written : ' + self.tofile_img
243 | 			is_saved = cv2.imwrite(self.tofile_img,img_cp)
244 | 			if is_saved :
245 | 				print "Saved under: " + self.tofile_img
246 | 			else:
247 | 				print "Saving error!"
248 | 		if self.imshow :
249 | 			cv2.imshow('YOLO_small detection',img_cp)
250 | 			cv2.waitKey(1)
251 | 		if self.filewrite_txt : 
252 | 			if self.disp_console : print '    txt file written : ' + self.tofile_txt
253 | 			ftxt.close()
254 | 
255 | 	def iou(self,box1,box2):
256 | 		tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
257 | 		lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
258 | 		if tb < 0 or lr < 0 : intersection = 0
259 | 		else : intersection =  tb*lr
260 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
261 | 
262 | 	def training(self): #TODO add training function!
263 | 		return None
264 | 
265 | 	
266 | 			
267 | 
268 | def main(argvs):
269 | 	yolo = YOLO_TF(argvs)
270 | 	cv2.waitKey(1000)
271 | 
272 | 
273 | if __name__=='__main__':	
274 | 	main(sys.argv)
275 | 
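The hard-coded numbers in `YOLO_small_tf.py` above (the 7x7 grid, the 1470-wide `fc_32` output, and the 980 and 1078 split points in `interpret_output`) follow from the layer list and the grid layout. The following Python 3 sketch re-derives them with plain arithmetic. It is an illustration only: the helper names are hypothetical, and the layer table is transcribed by hand from `build_networks`.

```python
def conv_out(size, kernel, stride):
    """Spatial size after conv_layer: pad kernel//2 per side, then a VALID conv."""
    pad = kernel // 2
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, stride):
    """Spatial size after pooling_layer with SAME padding: ceil(size / stride)."""
    return -(-size // stride)


# (kind, kernel size, stride) for layers 1-28, transcribed from build_networks.
YOLO_SMALL_LAYERS = (
    [("conv", 7, 2), ("pool", 2, 2), ("conv", 3, 1), ("pool", 2, 2)]
    + [("conv", 1, 1), ("conv", 3, 1)] * 2 + [("pool", 2, 2)]
    + [("conv", 1, 1), ("conv", 3, 1)] * 5 + [("pool", 2, 2)]
    + [("conv", 1, 1), ("conv", 3, 1)] * 2
    + [("conv", 3, 1), ("conv", 3, 2), ("conv", 3, 1), ("conv", 3, 1)]
)


def spatial_size(input_size, layers):
    """Propagate the side length of a square input through the layer table."""
    size = input_size
    for kind, kernel, stride in layers:
        size = conv_out(size, kernel, stride) if kind == "conv" else pool_out(size, stride)
    return size


def output_layout(grid, boxes, classes):
    """Return (end of class probs, end of box confidences, total length)."""
    class_end = grid * grid * classes
    scale_end = class_end + grid * grid * boxes
    total = scale_end + grid * grid * boxes * 4
    return class_end, scale_end, total


side = spatial_size(448, YOLO_SMALL_LAYERS)
flat_dim = side * side * 1024  # input dimension of fc_29 (conv_28 has 1024 filters)
print(side, flat_dim)
print(output_layout(7, 2, 20))   # YOLO_small / YOLO_tiny: grid 7, 2 boxes, 20 classes
print(output_layout(11, 2, 1))   # YOLO_face: grid 11, 2 boxes, 1 class
```

The same `output_layout` arithmetic is what `YOLO_face_tf.py` computes at run time through `prob_range`, `scales_range`, and `boxes_range`, whereas `YOLO_small_tf.py` writes the resulting constants out by hand.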


--------------------------------------------------------------------------------
/YOLO_tiny_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import cv2
  3 | import tensorflow as tf
  4 | import time
  5 | import sys
  6 | import os
  7 | import pdb
  8 | 
  9 | class YOLO_TF:
 10 | 	fromfile = None
 11 | 	tofile_img = 'test/output.jpg'
 12 | 	tofile_txt = 'test/output.txt'
 13 | 	imshow = False
 14 | 	filewrite_img = False
 15 | 	filewrite_txt = False
 16 | 	disp_console = True
 17 | 	weights_file = 'weights/YOLO_tiny.ckpt'
 18 | 	alpha = 0.1
 19 | 	threshold = 0.2
 20 | 	iou_threshold = 0.5
 21 | 	num_class = 20
 22 | 	num_box = 2
 23 | 	grid_size = 7
 24 | 	classes =  ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"]
 25 | 
 26 | 	w_img = 640
 27 | 	h_img = 480
 28 | 
 29 | 	def __init__(self,argvs = []):
 30 | 		self.detected = 0
 31 | 		self.overall_pics = 0
 32 | 		self.argv_parser(argvs)
 33 | 		self.build_networks()
 34 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 35 | 		print(self.fromfolder)
 36 | 		if self.fromfolder is not None:
 37 | 			filename_list = os.listdir(self.fromfolder)
 38 | 			for filename in filename_list:
 39 | 				print("Pics number:",self.overall_pics)
 40 | 				self.overall_pics+=1
 41 | 				self.detect_from_file(self.fromfolder+"/"+filename)
 42 | 			if self.overall_pics > 0 : print "Accuracy: " + str(self.detected/float(self.overall_pics))
 43 | 
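 44 | 	def argv_parser(self,argvs):
 45 | 		# Default to None so the attribute exists even when no arguments are passed
 46 | 		self.fromfolder = None
 47 | 		for i in range(1,len(argvs),2):
 48 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 49 | 			# Set only when the flag is present; an else branch here would reset it on every other argument
 50 | 			if argvs[i] == '-fromfolder' : self.fromfolder = argvs[i+1]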
 51 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 52 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 53 | 			if argvs[i] == '-imshow' :
 54 | 				if argvs[i+1] == '1' :self.imshow = True
 55 | 				else : self.imshow = False
 56 | 			if argvs[i] == '-disp_console' :
 57 | 				if argvs[i+1] == '1' :self.disp_console = True
 58 | 				else : self.disp_console = False
 59 | 				
 60 | 	def build_networks(self):
 61 | 		if self.disp_console : print "Building YOLO_tiny graph..."
 62 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 63 | 		self.conv_1 = self.conv_layer(1,self.x,16,3,1)
 64 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 65 | 		self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1)
 66 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 67 | 		self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1)
 68 | 		self.pool_6 = self.pooling_layer(6,self.conv_5,2,2)
 69 | 		self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1)
 70 | 		self.pool_8 = self.pooling_layer(8,self.conv_7,2,2)
 71 | 		self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1)
 72 | 		self.pool_10 = self.pooling_layer(10,self.conv_9,2,2)
 73 | 		self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1)
 74 | 		self.pool_12 = self.pooling_layer(12,self.conv_11,2,2)
 75 | 		self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1)
 76 | 		self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1)
 77 | 		self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1)
 78 | 		self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False)
 79 | 		self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False)
 80 | 		#skip dropout_18
 81 | 		self.fc_19 = self.fc_layer(19,self.fc_17,1470,flat=False,linear=True)
 82 | 		self.sess = tf.Session()
 83 | 		self.sess.run(tf.initialize_all_variables())
 84 | 		self.saver = tf.train.Saver()
 85 | 		self.saver.restore(self.sess,self.weights_file)
 86 | 		if self.disp_console : print("Loading complete!" + '\n')
 87 | 
 88 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 89 | 		channels = inputs.get_shape()[3]
 90 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
 91 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
 92 | 
 93 | 		pad_size = size//2
 94 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
 95 | 		inputs_pad = tf.pad(inputs,pad_mat)
 96 | 
 97 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
 98 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
 99 | 		if self.disp_console : print('    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels)))
100 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
101 | 
102 | 	def pooling_layer(self,idx,inputs,size,stride):
103 | 		if self.disp_console : print('    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride))
104 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
105 | 
106 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
107 | 		input_shape = inputs.get_shape().as_list()		
108 | 		if flat:
109 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
110 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
111 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
112 | 		else:
113 | 			dim = input_shape[1]
114 | 			inputs_processed = inputs
115 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
116 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
117 | 		if self.disp_console : print('    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear)))
118 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
119 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
120 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
121 | 
122 | 	def detect_from_cvmat(self,img):
123 | 		s = time.time()
124 | 		self.h_img,self.w_img,_ = img.shape
125 | 		img_resized = cv2.resize(img, (448, 448))
126 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
127 | 		img_resized_np = np.asarray( img_RGB )
128 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
129 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
130 | 		in_dict = {self.x: inputs}
131 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
132 | 		self.result = self.interpret_output(net_output[0])
133 | 		self.show_results(img,self.result)
134 | 		strtime = str(time.time()-s)
135 | 		if self.disp_console : print('Elapsed time : ' + strtime + ' secs' + '\n')
136 | 
137 | 	def detect_from_file(self,filename):
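138 | 		if self.disp_console : print('Detect from ' + filename)
139 | 		img = cv2.imread(filename)
140 | 		if img is None : print('    Could not read image : ' + filename) ; return  # cv2.imread returns None for unreadable or non-image files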
141 | 		self.detect_from_cvmat(img)
142 | 
143 | 	def detect_from_crop_sample(self):
144 | 		self.w_img = 640
145 | 		self.h_img = 420
146 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
147 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
148 | 		for c in range(3):
149 | 			for y in range(448):
150 | 				for x in range(448):
151 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
152 | 
153 | 		in_dict = {self.x: inputs}
154 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
155 | 		self.result = self.interpret_output(net_output[0])  # interpret_output returns a single list of detections
156 | 		img = cv2.imread('person.jpg')
157 | 		self.show_results(img,self.result)  # argument order matches show_results(img,results)
158 | 
159 | 	def interpret_output(self,output):
160 | 		probs = np.zeros((7,7,2,20))
161 | 		class_probs = np.reshape(output[0:980],(7,7,20))
162 | 		scales = np.reshape(output[980:1078],(7,7,2))
163 | 		boxes = np.reshape(output[1078:],(7,7,2,4))
164 | 		offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0))
165 | 
166 | 		boxes[:,:,:,0] += offset
167 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
168 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0
169 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
170 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
171 | 		
172 | 		boxes[:,:,:,0] *= self.w_img
173 | 		boxes[:,:,:,1] *= self.h_img
174 | 		boxes[:,:,:,2] *= self.w_img
175 | 		boxes[:,:,:,3] *= self.h_img
176 | 
177 | 		for i in range(2):
178 | 			for j in range(20):
179 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
180 | 
181 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
182 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
183 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
184 | 		probs_filtered = probs[filter_mat_probs]
185 | 		classes_num_filtered = filter_mat_boxes[3]  # class index of every entry above threshold, in the same order as probs_filtered
186 | 
187 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
188 | 		boxes_filtered = boxes_filtered[argsort]
189 | 		probs_filtered = probs_filtered[argsort]
190 | 		classes_num_filtered = classes_num_filtered[argsort]
191 | 		
192 | 		for i in range(len(boxes_filtered)):
193 | 			if probs_filtered[i] == 0 : continue
194 | 			for j in range(i+1,len(boxes_filtered)):
195 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
196 | 					probs_filtered[j] = 0.0
197 | 		
198 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
199 | 		boxes_filtered = boxes_filtered[filter_iou]
200 | 		probs_filtered = probs_filtered[filter_iou]
201 | 		classes_num_filtered = classes_num_filtered[filter_iou]
202 | 
203 | 		result = []
204 | 		for i in range(len(boxes_filtered)):
205 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
206 | 
207 | 		return result
208 | 
209 | 	def show_results(self,img,results):
210 | 		img_cp = img.copy()
211 | 		if self.filewrite_txt :
212 | 			ftxt = open(self.tofile_txt,'w')
213 | 		class_results_set = set()
214 | 		for i in range(len(results)):
215 | 			x = int(results[i][1])
216 | 			y = int(results[i][2])
217 | 			w = int(results[i][3])//2
218 | 			h = int(results[i][4])//2
219 | 			class_results_set.add(results[i][0])
220 | 			if self.disp_console : print('    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5]))
221 | 			if self.filewrite_img or self.imshow:
222 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
223 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
224 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
225 | 			if self.filewrite_txt :				
226 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(w) + ',' + str(h)+',' + str(results[i][5]) + '\n')
227 | 		if "person" in class_results_set:
228 | 			self.detected+=1
229 | 			# new_img_path=self.fromfolder[:-14]+"test7/selected_ImageNet_person/"+str(self.detected)+"_white_margin_orgin_pic.jpg"
230 | 			# cv2.imwrite(new_img_path,img_cp)
231 | 		if self.filewrite_img : 
232 | 			if self.disp_console : print('    image file written : ' + self.tofile_img)
233 | 			is_saved = cv2.imwrite(self.tofile_img,img_cp)
234 | 			if is_saved:
235 | 				print("Saved under:",self.tofile_img)
236 | 			else:
237 | 				print("Saving error!")
238 | 		if self.imshow :
239 | 			cv2.imshow('YOLO_tiny detection',img_cp)
240 | 			cv2.waitKey(1)
241 | 		if self.filewrite_txt : 
242 | 			if self.disp_console : print('    txt file written : ' + self.tofile_txt)
243 | 			ftxt.close()
244 | 
245 | 	def iou(self,box1,box2):
246 | 		tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
247 | 		lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
248 | 		if tb < 0 or lr < 0 : intersection = 0
249 | 		else : intersection =  tb*lr
250 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
251 | 
252 | 	def training(self): #TODO add training function!
253 | 		return None
254 | 
255 | 	
256 | 			
257 | 
258 | def main(argvs):
259 | 	yolo = YOLO_TF(argvs)
260 | 	cv2.waitKey(1000)
261 | 
262 | 
263 | if __name__=='__main__':	
264 | 	main(sys.argv)
265 | 
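
The `iou` method and the suppression loop in `interpret_output` above can be illustrated outside TensorFlow. The block below is a minimal pure-Python sketch, not the repository's code: `greedy_nms` is a hypothetical helper name, boxes are assumed to be `(x_center, y_center, width, height)` as in `iou`, the `0.5` threshold is an assumed value (the class reads it from `self.iou_threshold`), and the zero-union guard is an addition.

```python
def iou(box1, box2):
    """Intersection over union of two (x_center, y_center, width, height) boxes."""
    # Overlap extent along each axis; a negative value means no overlap.
    overlap_w = (min(box1[0] + 0.5 * box1[2], box2[0] + 0.5 * box2[2])
                 - max(box1[0] - 0.5 * box1[2], box2[0] - 0.5 * box2[2]))
    overlap_h = (min(box1[1] + 0.5 * box1[3], box2[1] + 0.5 * box2[3])
                 - max(box1[1] - 0.5 * box1[3], box2[1] - 0.5 * box2[3]))
    if overlap_w < 0 or overlap_h < 0:
        return 0.0
    intersection = overlap_w * overlap_h
    union = box1[2] * box1[3] + box2[2] * box2[3] - intersection
    # Guard added for this sketch: two zero-area boxes have a zero union.
    return intersection / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes that survive suppression, best score first.

    Hypothetical helper: visits boxes in descending score order and drops
    every lower-scored box that overlaps a kept box by more than the threshold.
    """
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    suppressed = [False] * len(boxes)
    keep = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        for j in order[pos + 1:]:
            if not suppressed[j] and iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed[j] = True
    return keep


# Two heavily overlapping boxes and one box far away from both.
boxes = [
    (100.0, 100.0, 50.0, 50.0),
    (105.0, 100.0, 50.0, 50.0),
    (300.0, 300.0, 40.0, 40.0),
]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of about 0.82, so only the first and third boxes are kept. Like the loop in `interpret_output`, the sketch compares boxes regardless of class.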


--------------------------------------------------------------------------------
/YOLO_weight_extractor/Readme.md:
--------------------------------------------------------------------------------
 1 | # YOLO weight converter (darknet -> tensorflow)
 2 | 
 3 | 1. Usage
 4 | 
 5 |    (1) download this modified version of darknet
 6 | 
 7 | 
 8 |    (2) put your darknet weight file(made by pjreddie or you) in the folder that contains darknet executable
 9 | 
10 | 
11 |    (3) run yolo in test mode (ex -> ./darknet yolo test cfg/yolo-small.cfg yolo-small.weights)
12 | 
13 | 
14 |    (4) modified yolo will write txt files in folder 'cjy'
15 | 
16 | 
17 |    (5) exit yolo when you see 'enter image path:'
18 | 
19 | 
20 |    (6) open builder python file (YOLO_full_builder.py or YOLO_small_builder.py or YOLO_tiny_builder.py)
21 | 
22 | 
23 |    (7) change weights_dir in line 6 (the folder that contains extracted txt files)
24 | 
25 | 
26 |    (8) change path in the last line of function 'build_networks' (this is the path that will store ckpt file.)
27 | 
28 | 
29 |    (9) run builder python script
30 | 
31 | 2. Copyright
32 | 
33 |    
34 |     I modified pjreddie's darknet code. (https://github.com/pjreddie/darknet)
35 | 


--------------------------------------------------------------------------------
/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz


--------------------------------------------------------------------------------
/test/person.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/test/person.jpg


--------------------------------------------------------------------------------
/weights/put_weight_file_here.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gliese581gg/YOLO_tensorflow/fd83f13b5f8f1a7b1eb7c38b143ed6da4922834a/weights/put_weight_file_here.txt


--------------------------------------------------------------------------------