├── LICENSE
├── README.md
├── YOLO_face_tf.py
├── YOLO_small_tf.py
├── YOLO_tiny_tf.py
├── YOLO_weight_extractor
│   ├── Readme.md
│   └── YOLO_weight_extractor.tar.gz
├── test
│   └── person.jpg
└── weights
    └── put_weight_file_here.txt


/LICENSE:
--------------------------------------------------------------------------------
 1 |                               YOLO_tensorflow LICENSE
 2 |                              Version 0.1, FEB 15 2016
 3 | 
 4 | ACCORDING TO ORIGINAL CODE'S LICENSE,
 5 | 
 6 | DO NOT USE THIS ON COMMERCIAL!
 7 | I OR ORIGINAL AUTHOR DO NOT HOLD LIABILITY FOR ANY DAMAGES!
 8 | 
 9 | 
10 | BELOW IS THE ORIGINAL CODE'S LICENSE
11 | {
12 | THIS SOFTWARE LICENSE IS PROVIDED "ALL CAPS" SO THAT YOU KNOW IT IS SUPER
13 | SERIOUS AND YOU DON'T MESS AROUND WITH COPYRIGHT LAW BECAUSE YOU WILL GET IN
14 | TROUBLE HERE ARE SOME OTHER BUZZWORDS COMMONLY IN THESE THINGS WARRANTIES
15 | LIABILITY CONTRACT TORT LIABLE CLAIMS RESTRICTION MERCHANTABILITY SUBJECT TO
16 | THE FOLLOWING CONDITIONS:
17 | 
18 | 1. #yolo
19 | 2. #swag
20 | 3. #blazeit
21 | }
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # YOLO_tensorflow
 2 | 
 3 | (Version 0.3, last updated: 2017.02.21)
 4 | 
 5 | ### 1. Introduction
 6 | 
 7 | This is a TensorFlow implementation of YOLO: Real-Time Object Detection.
 8 | 
 9 | For now, it can only run predictions using the pretrained YOLO_small and YOLO_tiny networks
10 | 
11 | (plus the YOLO_face detector from https://github.com/quanhua92/darknet ).
12 | 
13 | I extracted the weight values from darknet's (.weights) files.
14 | 
15 | My code does not support training. Use darknet for training.
16 | 
17 | Original code (C implementation) & paper: http://pjreddie.com/darknet/yolo/
18 | 
19 | ### 2. Install
20 | (1) Download code
21 | 
22 | (2) Download YOLO weight file from
23 | 
24 | YOLO_small : https://drive.google.com/file/d/0B2JbaJSrWLpza08yS2FSUnV2dlE/view?usp=sharing
25 | 
26 | YOLO_tiny  : https://drive.google.com/file/d/0B2JbaJSrWLpza0FtQlc3ejhMTTA/view?usp=sharing
27 | 
28 | YOLO_face : https://drive.google.com/file/d/0B2JbaJSrWLpzMzR5eURGN2dMTk0/view?usp=sharing
29 | 
30 | (3) Put 'YOLO_(version).ckpt' in the 'weights' folder of the downloaded code
31 | 
32 | ### 3. Usage
33 | 
34 | (1) direct usage with default settings (display on console, show output image, no output file writing)
35 | 
36 | 	python YOLO_(small or tiny)_tf.py -fromfile (input image filename)
37 | 
38 | (2) direct usage with custom settings
39 | 
40 | 	python YOLO_(small or tiny)_tf.py argvs
41 | 
42 | 	where argvs are
43 | 
44 | 	-fromfile (input image filename) : input image file
45 | 	-disp_console (0 or 1) : whether to display results on the terminal
46 | 	-imshow (0 or 1) : whether to display the result image
47 | 	-tofile_img (output image filename) : output image file
48 | 	-tofile_txt (output txt filename) : output text file (contains class, x, y, w, h, probability)
49 | 
50 | (3) import from other scripts
51 | 
52 | 	import YOLO_(small or tiny)_tf
53 | 	yolo = YOLO_(small or tiny)_tf.YOLO_TF()
54 | 
55 | 	yolo.disp_console = (True or False, default = True)
56 | 	yolo.imshow = (True or False, default = True)
57 | 	yolo.tofile_img = (output image filename)
58 | 	yolo.tofile_txt = (output txt filename)
59 | 	yolo.filewrite_img = (True or False, default = False)
60 | 	yolo.filewrite_txt = (True or False, default = False)
61 | 
62 | 	yolo.detect_from_file(filename)
63 | 	yolo.detect_from_cvmat(cvmat)
64 | 
65 | ### 4. Requirements
66 | 
67 | - TensorFlow
68 | - OpenCV2
69 | 
70 | ### 5. Copyright
71 | 
72 | According to the LICENSE file of the original code,
73 | - Neither I nor the original author holds liability for any damages
74 | - Do not use this commercially!
75 | 
76 | ### 6. Changelog
77 | 2016/02/15 : First upload!
78 | 
79 | 2016/02/16 : Added YOLO_tiny, Fixed bug that ignores one of the boxes in grid when both boxes detected valid objects
80 | 
81 | 2016/08/26 : Uploaded weight file converter! (darknet weight -> tensorflow ckpt)
82 | 
83 | 2017/02/21 : Added YOLO_face (Thanks https://github.com/quanhua92/darknet)
84 | 

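The `-tofile_txt` option described in the README writes one comma-separated line per detection. The following is a minimal sketch of reading such a file back, assuming Python 3 and the field order stated in the README (class, x, y, w, h, probability). `Detection` and `parse_detections` are hypothetical names that do not exist in the repository, and the sample lines are made up for illustration.

```python
from collections import namedtuple

# Hypothetical container; the repository itself only writes plain text lines.
Detection = namedtuple("Detection", "label x y w h prob")

def parse_detections(lines):
    """Parse lines of the form 'class,x,y,w,h,probability'."""
    detections = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, x, y, w, h, prob = line.split(",")
        detections.append(
            Detection(label, int(x), int(y), int(w), int(h), float(prob))
        )
    return detections

# Made-up sample lines, not real detector output.
sample = ["person,320,240,100,200,0.83\n", "dog,50,60,20,30,0.41\n"]
dets = parse_detections(sample)
```

To read a real output file, the same function can be fed an open file object, for example `parse_detections(open('test/output.txt'))`.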

--------------------------------------------------------------------------------
/YOLO_face_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import tensorflow as tf
  3 | import cv2
  4 | import time
  5 | import sys
  6 | 
  7 | class YOLO_TF:
  8 | 	fromfile = None
  9 | 	tofile_img = 'test/output.jpg'
 10 | 	tofile_txt = 'test/output.txt'
 11 | 	imshow = True
 12 | 	filewrite_img = False
 13 | 	filewrite_txt = False
 14 | 	disp_console = True
 15 | 	weights_file = 'weights/YOLO_face'
 16 | 	alpha = 0.1
 17 | 	threshold = 0.2
 18 | 	iou_threshold = 0.5
 19 | 	num_class = 1
 20 | 	num_box = 2
 21 | 	grid_size = 11
 22 | 	classes =  ["face"]
 23 | 
 24 | 	w_img = 640
 25 | 	h_img = 480
 26 | 
 27 | 	def __init__(self,argvs = []):
 28 | 		self.argv_parser(argvs)
 29 | 		self.build_networks()
 30 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 31 | 	def argv_parser(self,argvs):
 32 | 		for i in range(1,len(argvs),2):
 33 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 34 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 35 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 36 | 			if argvs[i] == '-imshow' :
 37 | 				if argvs[i+1] == '1' :self.imshow = True
 38 | 				else : self.imshow = False
 39 | 			if argvs[i] == '-disp_console' :
 40 | 				if argvs[i+1] == '1' :self.disp_console = True
 41 | 				else : self.disp_console = False
 42 | 				
 43 | 	def build_networks(self):
 44 | 		if self.disp_console : print "Building YOLO_face graph..."
 45 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 46 | 		self.conv_1 = self.conv_layer(1,self.x,16,3,1)
 47 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 48 | 		self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1)
 49 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 50 | 		self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1)
 51 | 		self.pool_6 = self.pooling_layer(6,self.conv_5,2,2)
 52 | 		self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1)
 53 | 		self.pool_8 = self.pooling_layer(8,self.conv_7,2,2)
 54 | 		self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1)
 55 | 		self.pool_10 = self.pooling_layer(10,self.conv_9,2,2)
 56 | 		self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1)
 57 | 		self.pool_12 = self.pooling_layer(12,self.conv_11,2,2)
 58 | 		self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1)
 59 | 		self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1)
 60 | 		self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1)
 61 | 		self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False)
 62 | 		self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False)
 63 | 		#skip dropout_18
 64 | 		self.fc_19 = self.fc_layer(19,self.fc_17,1331,flat=False,linear=True)
 65 | 		self.sess = tf.Session()
 66 | 		self.sess.run(tf.initialize_all_variables())
 67 | 		self.saver = tf.train.Saver()
 68 | 		self.saver.restore(self.sess,self.weights_file)
 69 | 		if self.disp_console : print "Loading complete!" + '\n'
 70 | 
 71 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 72 | 		channels = inputs.get_shape()[3]
 73 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
 74 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
 75 | 
 76 | 		pad_size = size//2
 77 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
 78 | 		inputs_pad = tf.pad(inputs,pad_mat)
 79 | 
 80 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
 81 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
 82 | 		if self.disp_console : print '    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
 83 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
 84 | 
 85 | 	def pooling_layer(self,idx,inputs,size,stride):
 86 | 		if self.disp_console : print '    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
 87 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
 88 | 
 89 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
 90 | 		input_shape = inputs.get_shape().as_list()		
 91 | 		if flat:
 92 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
 93 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
 94 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
 95 | 		else:
 96 | 			dim = input_shape[1]
 97 | 			inputs_processed = inputs
 98 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
 99 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
100 | 		if self.disp_console : print '    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))	
101 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
102 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
103 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
104 | 
105 | 	def detect_from_cvmat(self,img):
106 | 		s = time.time()
107 | 		self.h_img,self.w_img,_ = img.shape
108 | 		img_resized = cv2.resize(img, (448, 448))
109 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
110 | 		img_resized_np = np.asarray( img_RGB )
111 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
112 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
113 | 		in_dict = {self.x: inputs}
114 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
115 | 		self.result = self.interpret_output(net_output[0])
116 | 		self.show_results(img,self.result)
117 | 		strtime = str(time.time()-s)
118 | 		if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'
119 | 
120 | 	def detect_from_file(self,filename):
121 | 		if self.disp_console : print 'Detect from ' + filename
122 | 		img = cv2.imread(filename)
123 | 		#img = misc.imread(filename)
124 | 		self.detect_from_cvmat(img)
125 | 
126 | 	def detect_from_crop_sample(self):
127 | 		self.w_img = 640
128 | 		self.h_img = 420
129 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
130 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
131 | 		for c in range(3):
132 | 			for y in range(448):
133 | 				for x in range(448):
134 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
135 | 
136 | 		in_dict = {self.x: inputs}
137 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
138 | 		self.result = self.interpret_output(net_output[0])
139 | 		img = cv2.imread('person.jpg')
140 | 		self.show_results(img,self.result)
141 | 
142 | 	def interpret_output(self,output):
143 | 		prob_range = [0,self.grid_size*self.grid_size*self.num_class]
144 | 		scales_range = [prob_range[1],prob_range[1]+self.grid_size*self.grid_size*self.num_box]
145 | 		boxes_range = [scales_range[1],scales_range[1]+self.grid_size*self.grid_size*self.num_box*4]
146 | 
147 | 		probs = np.zeros((self.grid_size,self.grid_size,self.num_box,self.num_class))
148 | 		class_probs = np.reshape(output[0:prob_range[1]],(self.grid_size,self.grid_size,self.num_class))
149 | 		scales = np.reshape(output[scales_range[0]:scales_range[1]],(self.grid_size,self.grid_size,self.num_box))
150 | 		boxes = np.reshape(output[boxes_range[0]:],(self.grid_size,self.grid_size,self.num_box,4))
151 | 		offset = np.transpose(np.reshape(np.array([np.arange(self.grid_size)]*(2*self.grid_size)),(2,self.grid_size,self.grid_size)),(1,2,0))
152 | 
153 | 		boxes[:,:,:,0] += offset
154 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
155 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / float(self.grid_size)
156 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
157 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
158 | 		
159 | 		boxes[:,:,:,0] *= self.w_img
160 | 		boxes[:,:,:,1] *= self.h_img
161 | 		boxes[:,:,:,2] *= self.w_img
162 | 		boxes[:,:,:,3] *= self.h_img
163 | 
164 | 		for i in range(self.num_box):
165 | 			for j in range(self.num_class):
166 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
167 | 
168 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
169 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
170 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
171 | 		probs_filtered = probs[filter_mat_probs]
172 | 		classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 
173 | 
174 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
175 | 		boxes_filtered = boxes_filtered[argsort]
176 | 		probs_filtered = probs_filtered[argsort]
177 | 		classes_num_filtered = classes_num_filtered[argsort]
178 | 		
179 | 		for i in range(len(boxes_filtered)):
180 | 			if probs_filtered[i] == 0 : continue
181 | 			for j in range(i+1,len(boxes_filtered)):
182 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
183 | 					probs_filtered[j] = 0.0
184 | 		
185 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
186 | 		boxes_filtered = boxes_filtered[filter_iou]
187 | 		probs_filtered = probs_filtered[filter_iou]
188 | 		classes_num_filtered = classes_num_filtered[filter_iou]
189 | 
190 | 		result = []
191 | 		for i in range(len(boxes_filtered)):
192 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
193 | 
194 | 		return result
195 | 
196 | 	def show_results(self,img,results):
197 | 		img_cp = img.copy()
198 | 		if self.filewrite_txt :
199 | 			ftxt = open(self.tofile_txt,'w')
200 | 		for i in range(len(results)):
201 | 			x = int(results[i][1])
202 | 			y = int(results[i][2])
203 | 			w = int(results[i][3])//2
204 | 			h = int(results[i][4])//2
205 | 			if self.disp_console : print '    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
206 | 			if self.filewrite_img or self.imshow:
207 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
208 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
209 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
210 | 			if self.filewrite_txt :				
211 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+',' + str(results[i][5]) + '\n')
212 | 		if self.filewrite_img : 
213 | 			if self.disp_console : print '    image file written : ' + self.tofile_img
214 | 			cv2.imwrite(self.tofile_img,img_cp)			
215 | 		if self.imshow :
216 | 			cv2.imshow('YOLO_face detection',img_cp)
217 | 			cv2.waitKey(1)
218 | 		if self.filewrite_txt : 
219 | 			if self.disp_console : print '    txt file written : ' + self.tofile_txt
220 | 			ftxt.close()
221 | 
222 | 	def iou(self,box1,box2):
223 | 		tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
224 | 		lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
225 | 		if tb < 0 or lr < 0 : intersection = 0
226 | 		else : intersection =  tb*lr
227 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
228 | 
229 | 	def training(self): #TODO add training function!
230 | 		return None
231 | 
232 | 	
233 | 			
234 | 
235 | def main(argvs):
236 | 	yolo = YOLO_TF(argvs)
237 | 	cv2.waitKey(1000)
238 | 
239 | 
240 | if __name__=='__main__':	
241 | 	main(sys.argv)
242 | 

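`interpret_output` in YOLO_face_tf.py splits the flat network output into three consecutive segments: class probabilities, per-box confidence scales, and box coordinates. The sketch below reproduces only that index arithmetic in plain Python 3, so the segment boundaries can be checked without TensorFlow. `output_layout` is a hypothetical helper name, not a function in the repository.

```python
def output_layout(grid_size, num_class, num_box):
    """Return (start, end) index ranges of the three segments of the flat output."""
    cells = grid_size * grid_size
    prob_range = (0, cells * num_class)
    scales_range = (prob_range[1], prob_range[1] + cells * num_box)
    boxes_range = (scales_range[1], scales_range[1] + cells * num_box * 4)
    return prob_range, scales_range, boxes_range

# Face detector: 11x11 grid, 1 class, 2 boxes per cell.
face = output_layout(11, 1, 2)
# VOC detectors (YOLO_small / YOLO_tiny): 7x7 grid, 20 classes, 2 boxes per cell.
voc = output_layout(7, 20, 2)
```

The final end index equals the size of the last fully connected layer: 1331 for the face network and 1470 for the 20-class networks, which matches the hard-coded slices 980 and 1078 in YOLO_small_tf.py.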

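The filtering loop in `interpret_output` sorts candidate boxes by probability and zeroes out any box that overlaps a higher-scoring one by more than `iou_threshold`, which is a class-agnostic form of non-maximum suppression. The sketch below shows the same idea in plain Python 3, assuming boxes are given as (center_x, center_y, width, height) as in the `iou` method. `nms` is a hypothetical name, and the example boxes are made up.

```python
def iou(box1, box2):
    """Intersection over union of two (center_x, center_y, width, height) boxes."""
    overlap_w = (min(box1[0] + 0.5 * box1[2], box2[0] + 0.5 * box2[2])
                 - max(box1[0] - 0.5 * box1[2], box2[0] - 0.5 * box2[2]))
    overlap_h = (min(box1[1] + 0.5 * box1[3], box2[1] + 0.5 * box2[3])
                 - max(box1[1] - 0.5 * box1[3], box2[1] - 0.5 * box2[3]))
    if overlap_w < 0 or overlap_h < 0:
        return 0.0
    intersection = overlap_w * overlap_h
    return intersection / (box1[2] * box1[3] + box2[2] * box2[3] - intersection)

def nms(boxes, probs, iou_threshold=0.5):
    """Return indices of the boxes kept, highest probability first."""
    order = sorted(range(len(boxes)), key=lambda i: probs[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Made-up boxes: the first two overlap heavily, the third is far away.
boxes = [(100, 100, 50, 50), (102, 100, 50, 50), (300, 300, 40, 40)]
probs = [0.9, 0.8, 0.7]
kept = nms(boxes, probs)
```

Here the second box is dropped because it overlaps the first by more than the threshold, while the distant third box survives.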
--------------------------------------------------------------------------------
/YOLO_small_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import tensorflow as tf
  3 | import cv2
  4 | import time
  5 | import sys
  6 | 
  7 | class YOLO_TF:
  8 | 	fromfile = None
  9 | 	tofile_img = 'test/output.jpg'
 10 | 	tofile_txt = 'test/output.txt'
 11 | 	imshow = True
 12 | 	filewrite_img = False
 13 | 	filewrite_txt = False
 14 | 	disp_console = True
 15 | 	weights_file = 'weights/YOLO_small.ckpt'
 16 | 	alpha = 0.1
 17 | 	threshold = 0.2
 18 | 	iou_threshold = 0.5
 19 | 	num_class = 20
 20 | 	num_box = 2
 21 | 	grid_size = 7
 22 | 	classes =  ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"]
 23 | 
 24 | 	w_img = 640
 25 | 	h_img = 480
 26 | 
 27 | 	def __init__(self,argvs = []):
 28 | 		self.argv_parser(argvs)
 29 | 		self.build_networks()
 30 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 31 | 	def argv_parser(self,argvs):
 32 | 		for i in range(1,len(argvs),2):
 33 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 34 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 35 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 36 | 			if argvs[i] == '-imshow' :
 37 | 				if argvs[i+1] == '1' :self.imshow = True
 38 | 				else : self.imshow = False
 39 | 			if argvs[i] == '-disp_console' :
 40 | 				if argvs[i+1] == '1' :self.disp_console = True
 41 | 				else : self.disp_console = False
 42 | 	#build yolo model
 43 | 	def build_networks(self):
 44 | 		if self.disp_console : print "Building YOLO_small graph..."
 45 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 46 | 		self.conv_1 = self.conv_layer(1,self.x,64,7,2)
 47 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 48 | 		self.conv_3 = self.conv_layer(3,self.pool_2,192,3,1)
 49 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 50 | 		self.conv_5 = self.conv_layer(5,self.pool_4,128,1,1)
 51 | 		self.conv_6 = self.conv_layer(6,self.conv_5,256,3,1)
 52 | 		self.conv_7 = self.conv_layer(7,self.conv_6,256,1,1)
 53 | 		self.conv_8 = self.conv_layer(8,self.conv_7,512,3,1)
 54 | 		self.pool_9 = self.pooling_layer(9,self.conv_8,2,2)
 55 | 		self.conv_10 = self.conv_layer(10,self.pool_9,256,1,1)
 56 | 		self.conv_11 = self.conv_layer(11,self.conv_10,512,3,1)
 57 | 		self.conv_12 = self.conv_layer(12,self.conv_11,256,1,1)
 58 | 		self.conv_13 = self.conv_layer(13,self.conv_12,512,3,1)
 59 | 		self.conv_14 = self.conv_layer(14,self.conv_13,256,1,1)
 60 | 		self.conv_15 = self.conv_layer(15,self.conv_14,512,3,1)
 61 | 		self.conv_16 = self.conv_layer(16,self.conv_15,256,1,1)
 62 | 		self.conv_17 = self.conv_layer(17,self.conv_16,512,3,1)
 63 | 		self.conv_18 = self.conv_layer(18,self.conv_17,512,1,1)
 64 | 		self.conv_19 = self.conv_layer(19,self.conv_18,1024,3,1)
 65 | 		self.pool_20 = self.pooling_layer(20,self.conv_19,2,2)
 66 | 		self.conv_21 = self.conv_layer(21,self.pool_20,512,1,1)
 67 | 		self.conv_22 = self.conv_layer(22,self.conv_21,1024,3,1)
 68 | 		self.conv_23 = self.conv_layer(23,self.conv_22,512,1,1)
 69 | 		self.conv_24 = self.conv_layer(24,self.conv_23,1024,3,1)
 70 | 		self.conv_25 = self.conv_layer(25,self.conv_24,1024,3,1)
 71 | 		self.conv_26 = self.conv_layer(26,self.conv_25,1024,3,2)
 72 | 		self.conv_27 = self.conv_layer(27,self.conv_26,1024,3,1)
 73 | 		self.conv_28 = self.conv_layer(28,self.conv_27,1024,3,1)
 74 | 		self.fc_29 = self.fc_layer(29,self.conv_28,512,flat=True,linear=False)
 75 | 		self.fc_30 = self.fc_layer(30,self.fc_29,4096,flat=False,linear=False)
 76 | 		#skip dropout_31
 77 | 		self.fc_32 = self.fc_layer(32,self.fc_30,1470,flat=False,linear=True)
 78 | 		self.sess = tf.Session()
 79 | 		self.sess.run(tf.initialize_all_variables())
 80 | 		self.saver = tf.train.Saver()
 81 | 		self.saver.restore(self.sess,self.weights_file)
 82 | 		if self.disp_console : print "Loading complete!" + '\n'
 83 | 	#redesign the basic layers
 84 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 85 | 		channels = inputs.get_shape()[3]
 86 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
 87 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
 88 | 
 89 | 		pad_size = size//2
 90 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
 91 | 		inputs_pad = tf.pad(inputs,pad_mat)
 92 | 
 93 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
 94 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
 95 | 		if self.disp_console : print '    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
 96 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
 97 | 
 98 | 	def pooling_layer(self,idx,inputs,size,stride):
 99 | 		if self.disp_console : print '    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
100 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
101 | 
102 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
103 | 		input_shape = inputs.get_shape().as_list()		
104 | 		if flat:
105 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
106 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
107 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
108 | 		else:
109 | 			dim = input_shape[1]
110 | 			inputs_processed = inputs
111 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
112 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
113 | 		if self.disp_console : print '    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))	
114 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
115 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
116 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
117 | 
118 | 	def detect_from_cvmat(self,img):
119 | 		s = time.time()
120 | 		self.h_img,self.w_img,_ = img.shape
121 | 		img_resized = cv2.resize(img, (448, 448))
122 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
123 | 		img_resized_np = np.asarray( img_RGB )
124 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
125 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
126 | 		in_dict = {self.x: inputs}
127 | 		net_output = self.sess.run(self.fc_32,feed_dict=in_dict)
128 | 		self.result = self.interpret_output(net_output[0])
129 | 		self.show_results(img,self.result)
130 | 		strtime = str(time.time()-s)
131 | 		if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'
132 | 
133 | 	def detect_from_file(self,filename):
134 | 		if self.disp_console : print 'Detect from ' + filename
135 | 		img = cv2.imread(filename)
136 | 		#img = misc.imread(filename)
137 | 		self.detect_from_cvmat(img)
138 | 
139 | 	def detect_from_crop_sample(self):
140 | 		self.w_img = 640
141 | 		self.h_img = 420
142 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
143 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
144 | 		for c in range(3):
145 | 			for y in range(448):
146 | 				for x in range(448):
147 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
148 | 
149 | 		in_dict = {self.x: inputs}
150 | 		net_output = self.sess.run(self.fc_32,feed_dict=in_dict)
151 | 		self.result = self.interpret_output(net_output[0])
152 | 		img = cv2.imread('person.jpg')
153 | 		self.show_results(img,self.result)
154 | 	#decode bounding boxes from the fc output features
155 | 	def interpret_output(self,output):
156 | 		probs = np.zeros((7,7,2,20))
157 | 		#class score
158 | 		class_probs = np.reshape(output[0:980],(7,7,20))
159 | 		#scale score
160 | 		scales = np.reshape(output[980:1078],(7,7,2))
161 | 		#bbox positions
162 | 		boxes = np.reshape(output[1078:],(7,7,2,4))
163 | 		#grid cell index offsets
164 | 		offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0))
165 | 
166 | 		boxes[:,:,:,0] += offset
167 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
168 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0
169 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
170 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
171 | 		
172 | 		boxes[:,:,:,0] *= self.w_img
173 | 		boxes[:,:,:,1] *= self.h_img
174 | 		boxes[:,:,:,2] *= self.w_img
175 | 		boxes[:,:,:,3] *= self.h_img
176 | 
177 | 		for i in range(2):
178 | 			for j in range(20):
179 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
180 | 		#keep only the boxes whose class probability reaches the threshold
181 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
182 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
183 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
184 | 		probs_filtered = probs[filter_mat_probs]
185 | 		classes_num_filtered = np.argmax(filter_mat_probs,axis=3)[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]] 
186 | 
187 | 		#sort by descending probability, then suppress overlapping boxes (non-maximum suppression)
188 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
189 | 		boxes_filtered = boxes_filtered[argsort]
190 | 		probs_filtered = probs_filtered[argsort]
191 | 		classes_num_filtered = classes_num_filtered[argsort]
192 | 		
193 | 		for i in range(len(boxes_filtered)):
194 | 			if probs_filtered[i] == 0 : continue
195 | 			for j in range(i+1,len(boxes_filtered)):
196 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
197 | 					probs_filtered[j] = 0.0
198 | 		
199 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
200 | 		boxes_filtered = boxes_filtered[filter_iou]
201 | 		probs_filtered = probs_filtered[filter_iou]
202 | 		classes_num_filtered = classes_num_filtered[filter_iou]
203 | 
204 | 		result = []
205 | 		for i in range(len(boxes_filtered)):
206 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
207 | 
208 | 		return result
209 | 
210 | 	def show_results(self,img,results):
211 | 		img_cp = img.copy()
212 | 		if self.filewrite_txt :
213 | 			ftxt = open(self.tofile_txt,'w')
214 | 		for i in range(len(results)):
215 | 			x = int(results[i][1])
216 | 			y = int(results[i][2])
217 | 			w = int(results[i][3])//2
218 | 			h = int(results[i][4])//2
219 | 			if self.disp_console : print '    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
220 | 			if self.filewrite_img or self.imshow:
221 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
222 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
223 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
224 | 			if self.filewrite_txt :				
225 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+',' + str(results[i][5]) + '\n')
226 | 		if self.filewrite_img : 
227 | 			if self.disp_console : print '    image file written : ' + self.tofile_img
228 | 			cv2.imwrite(self.tofile_img,img_cp)			
229 | 		if self.imshow :
230 | 			cv2.imshow('YOLO_small detection',img_cp)
231 | 			cv2.waitKey(1)
232 | 		if self.filewrite_txt : 
233 | 			if self.disp_console : print '    txt file written : ' + self.tofile_txt
234 | 			ftxt.close()
235 | 
236 | 	def iou(self,box1,box2):
237 | 		tb = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2])
238 | 		lr = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
239 | 		if tb < 0 or lr < 0 : intersection = 0
240 | 		else : intersection =  tb*lr
241 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
242 | 
243 | 	def training(self): #TODO add training function!
244 | 		return None
245 | 
246 | 	
247 | 			
248 | 
249 | def main(argvs):
250 | 	yolo = YOLO_TF(argvs)
251 | 	cv2.waitKey(1000)
252 | 
253 | 
254 | if __name__=='__main__':	
255 | 	main(sys.argv)
256 | 

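`conv_layer` pads the input by `size//2` on each side and then convolves with 'VALID' padding, while `pooling_layer` uses 'SAME' padding. The sketch below traces the spatial size through the layer lists of YOLO_small and YOLO_tiny in plain Python 3 to show why both end in a 7x7 feature map. It relies on 'SAME' pooling producing ceil(input / stride) outputs. `conv_out`, `pool_out` and `trace` are hypothetical helper names, not part of the repository.

```python
def conv_out(size_in, kernel, stride):
    """Output size of conv_layer: explicit pad of kernel//2, then a VALID convolution."""
    pad = kernel // 2
    return (size_in + 2 * pad - kernel) // stride + 1

def pool_out(size_in, stride):
    """Output size of a SAME-padded max pool: ceil(size_in / stride)."""
    return -(-size_in // stride)

def trace(size_in, layers):
    """Apply (kind, kernel, stride) layers in order and return the final size."""
    size = size_in
    for kind, kernel, stride in layers:
        if kind == "conv":
            size = conv_out(size, kernel, stride)
        else:
            size = pool_out(size, stride)
    return size

# Layer lists transcribed from build_networks (filter counts omitted).
small_layers = (
    [("conv", 7, 2), ("pool", 2, 2), ("conv", 3, 1), ("pool", 2, 2),
     ("conv", 1, 1), ("conv", 3, 1), ("conv", 1, 1), ("conv", 3, 1), ("pool", 2, 2)]
    + [("conv", 1, 1), ("conv", 3, 1)] * 5 + [("pool", 2, 2)]
    + [("conv", 1, 1), ("conv", 3, 1)] * 2
    + [("conv", 3, 1), ("conv", 3, 2), ("conv", 3, 1), ("conv", 3, 1)]
)
tiny_layers = [("conv", 3, 1), ("pool", 2, 2)] * 6 + [("conv", 3, 1)] * 3

small_side = trace(448, small_layers)
tiny_side = trace(448, tiny_layers)
# Flattened input dimension of the first fully connected layer (1024 channels).
fc_input_dim = small_side * small_side * 1024
```

Both networks reduce a 448x448 input to 7x7, so the first fully connected layer sees 7 * 7 * 1024 flattened inputs.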

--------------------------------------------------------------------------------
/YOLO_tiny_tf.py:
--------------------------------------------------------------------------------
  1 | import numpy as np
  2 | import tensorflow as tf
  3 | import cv2
  4 | import time
  5 | import sys
  6 | 
  7 | class YOLO_TF:
  8 | 	fromfile = None
  9 | 	tofile_img = 'test/output.jpg'
 10 | 	tofile_txt = 'test/output.txt'
 11 | 	imshow = True
 12 | 	filewrite_img = False
 13 | 	filewrite_txt = False
 14 | 	disp_console = True
 15 | 	weights_file = 'weights/YOLO_tiny.ckpt'
 16 | 	alpha = 0.1
 17 | 	threshold = 0.2
 18 | 	iou_threshold = 0.5
 19 | 	num_class = 20
 20 | 	num_box = 2
 21 | 	grid_size = 7
 22 | 	classes =  ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train","tvmonitor"]
 23 | 
 24 | 	w_img = 640
 25 | 	h_img = 480
 26 | 
 27 | 	def __init__(self,argvs = []):
 28 | 		self.argv_parser(argvs)
 29 | 		self.build_networks()
 30 | 		if self.fromfile is not None: self.detect_from_file(self.fromfile)
 31 | 	def argv_parser(self,argvs):
 32 | 		for i in range(1,len(argvs),2):
 33 | 			if argvs[i] == '-fromfile' : self.fromfile = argvs[i+1]
 34 | 			if argvs[i] == '-tofile_img' : self.tofile_img = argvs[i+1] ; self.filewrite_img = True
 35 | 			if argvs[i] == '-tofile_txt' : self.tofile_txt = argvs[i+1] ; self.filewrite_txt = True
 36 | 			if argvs[i] == '-imshow' :
 37 | 				if argvs[i+1] == '1' :self.imshow = True
 38 | 				else : self.imshow = False
 39 | 			if argvs[i] == '-disp_console' :
 40 | 				if argvs[i+1] == '1' :self.disp_console = True
 41 | 				else : self.disp_console = False
 42 | 				
 43 | 	def build_networks(self):
 44 | 		if self.disp_console : print "Building YOLO_tiny graph..."
 45 | 		self.x = tf.placeholder('float32',[None,448,448,3])
 46 | 		self.conv_1 = self.conv_layer(1,self.x,16,3,1)
 47 | 		self.pool_2 = self.pooling_layer(2,self.conv_1,2,2)
 48 | 		self.conv_3 = self.conv_layer(3,self.pool_2,32,3,1)
 49 | 		self.pool_4 = self.pooling_layer(4,self.conv_3,2,2)
 50 | 		self.conv_5 = self.conv_layer(5,self.pool_4,64,3,1)
 51 | 		self.pool_6 = self.pooling_layer(6,self.conv_5,2,2)
 52 | 		self.conv_7 = self.conv_layer(7,self.pool_6,128,3,1)
 53 | 		self.pool_8 = self.pooling_layer(8,self.conv_7,2,2)
 54 | 		self.conv_9 = self.conv_layer(9,self.pool_8,256,3,1)
 55 | 		self.pool_10 = self.pooling_layer(10,self.conv_9,2,2)
 56 | 		self.conv_11 = self.conv_layer(11,self.pool_10,512,3,1)
 57 | 		self.pool_12 = self.pooling_layer(12,self.conv_11,2,2)
 58 | 		self.conv_13 = self.conv_layer(13,self.pool_12,1024,3,1)
 59 | 		self.conv_14 = self.conv_layer(14,self.conv_13,1024,3,1)
 60 | 		self.conv_15 = self.conv_layer(15,self.conv_14,1024,3,1)
 61 | 		self.fc_16 = self.fc_layer(16,self.conv_15,256,flat=True,linear=False)
 62 | 		self.fc_17 = self.fc_layer(17,self.fc_16,4096,flat=False,linear=False)
 63 | 		#skip dropout_18
 64 | 		self.fc_19 = self.fc_layer(19,self.fc_17,1470,flat=False,linear=True)
 65 | 		self.sess = tf.Session()
 66 | 		self.sess.run(tf.initialize_all_variables())
 67 | 		self.saver = tf.train.Saver()
 68 | 		self.saver.restore(self.sess,self.weights_file)
 69 | 		if self.disp_console : print "Loading complete!" + '\n'
 70 | 
 71 | 	def conv_layer(self,idx,inputs,filters,size,stride):
 72 | 		channels = inputs.get_shape()[3]
 73 | 		weight = tf.Variable(tf.truncated_normal([size,size,int(channels),filters], stddev=0.1))
 74 | 		biases = tf.Variable(tf.constant(0.1, shape=[filters]))
 75 | 
 76 | 		pad_size = size//2
 77 | 		pad_mat = np.array([[0,0],[pad_size,pad_size],[pad_size,pad_size],[0,0]])
 78 | 		inputs_pad = tf.pad(inputs,pad_mat)
 79 | 
 80 | 		conv = tf.nn.conv2d(inputs_pad, weight, strides=[1, stride, stride, 1], padding='VALID',name=str(idx)+'_conv')	
 81 | 		conv_biased = tf.add(conv,biases,name=str(idx)+'_conv_biased')	
 82 | 		if self.disp_console : print '    Layer  %d : Type = Conv, Size = %d * %d, Stride = %d, Filters = %d, Input channels = %d' % (idx,size,size,stride,filters,int(channels))
 83 | 		return tf.maximum(self.alpha*conv_biased,conv_biased,name=str(idx)+'_leaky_relu')
 84 | 
 85 | 	def pooling_layer(self,idx,inputs,size,stride):
 86 | 		if self.disp_console : print '    Layer  %d : Type = Pool, Size = %d * %d, Stride = %d' % (idx,size,size,stride)
 87 | 		return tf.nn.max_pool(inputs, ksize=[1, size, size, 1],strides=[1, stride, stride, 1], padding='SAME',name=str(idx)+'_pool')
 88 | 
 89 | 	def fc_layer(self,idx,inputs,hiddens,flat = False,linear = False):
 90 | 		input_shape = inputs.get_shape().as_list()		
 91 | 		if flat:
 92 | 			dim = input_shape[1]*input_shape[2]*input_shape[3]
 93 | 			inputs_transposed = tf.transpose(inputs,(0,3,1,2))
 94 | 			inputs_processed = tf.reshape(inputs_transposed, [-1,dim])
 95 | 		else:
 96 | 			dim = input_shape[1]
 97 | 			inputs_processed = inputs
 98 | 		weight = tf.Variable(tf.truncated_normal([dim,hiddens], stddev=0.1))
 99 | 		biases = tf.Variable(tf.constant(0.1, shape=[hiddens]))	
100 | 		if self.disp_console : print '    Layer  %d : Type = Full, Hidden = %d, Input dimension = %d, Flat = %d, Activation = %d' % (idx,hiddens,int(dim),int(flat),1-int(linear))	
101 | 		if linear : return tf.add(tf.matmul(inputs_processed,weight),biases,name=str(idx)+'_fc')
102 | 		ip = tf.add(tf.matmul(inputs_processed,weight),biases)
103 | 		return tf.maximum(self.alpha*ip,ip,name=str(idx)+'_fc')
104 | 
105 | 	def detect_from_cvmat(self,img):
106 | 		s = time.time()
107 | 		self.h_img,self.w_img,_ = img.shape
108 | 		img_resized = cv2.resize(img, (448, 448))
109 | 		img_RGB = cv2.cvtColor(img_resized,cv2.COLOR_BGR2RGB)
110 | 		img_resized_np = np.asarray( img_RGB )
111 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
112 | 		inputs[0] = (img_resized_np/255.0)*2.0-1.0
113 | 		in_dict = {self.x: inputs}
114 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
115 | 		self.result = self.interpret_output(net_output[0])
116 | 		self.show_results(img,self.result)
117 | 		strtime = str(time.time()-s)
118 | 		if self.disp_console : print 'Elapsed time : ' + strtime + ' secs' + '\n'
119 | 
120 | 	def detect_from_file(self,filename):
121 | 		if self.disp_console : print 'Detect from ' + filename
122 | 		img = cv2.imread(filename)
123 | 		#img = misc.imread(filename)
124 | 		self.detect_from_cvmat(img)
125 | 
126 | 	def detect_from_crop_sample(self):
127 | 		self.w_img = 640
128 | 		self.h_img = 420
129 | 		f = np.array(open('person_crop.txt','r').readlines(),dtype='float32')
130 | 		inputs = np.zeros((1,448,448,3),dtype='float32')
131 | 		for c in range(3):
132 | 			for y in range(448):
133 | 				for x in range(448):
134 | 					inputs[0,y,x,c] = f[c*448*448+y*448+x]
135 | 
136 | 		in_dict = {self.x: inputs}
137 | 		net_output = self.sess.run(self.fc_19,feed_dict=in_dict)
138 | 		self.result = self.interpret_output(net_output[0])
139 | 		img = cv2.imread('person.jpg')
140 | 		self.show_results(img,self.result)
141 | 
142 | 	def interpret_output(self,output):
143 | 		probs = np.zeros((7,7,2,20))
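144 | 		class_probs = np.reshape(output[0:980],(7,7,20)) # 7*7 cells x 20 class probabilities
145 | 		scales = np.reshape(output[980:1078],(7,7,2)) # 7*7 cells x 2 box confidences
146 | 		boxes = np.reshape(output[1078:],(7,7,2,4)) # 7*7 cells x 2 boxes x [x,y,sqrt(w),sqrt(h)]
147 | 		offset = np.transpose(np.reshape(np.array([np.arange(7)]*14),(2,7,7)),(1,2,0)) # offset[row,col,box] = col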
148 | 
149 | 		boxes[:,:,:,0] += offset
150 | 		boxes[:,:,:,1] += np.transpose(offset,(1,0,2))
151 | 		boxes[:,:,:,0:2] = boxes[:,:,:,0:2] / 7.0
152 | 		boxes[:,:,:,2] = np.multiply(boxes[:,:,:,2],boxes[:,:,:,2])
153 | 		boxes[:,:,:,3] = np.multiply(boxes[:,:,:,3],boxes[:,:,:,3])
154 | 		
155 | 		boxes[:,:,:,0] *= self.w_img
156 | 		boxes[:,:,:,1] *= self.h_img
157 | 		boxes[:,:,:,2] *= self.w_img
158 | 		boxes[:,:,:,3] *= self.h_img
159 | 
160 | 		for i in range(2):
161 | 			for j in range(20):
162 | 				probs[:,:,i,j] = np.multiply(class_probs[:,:,j],scales[:,:,i])
163 | 
164 | 		filter_mat_probs = np.array(probs>=self.threshold,dtype='bool')
165 | 		filter_mat_boxes = np.nonzero(filter_mat_probs)
166 | 		boxes_filtered = boxes[filter_mat_boxes[0],filter_mat_boxes[1],filter_mat_boxes[2]]
167 | 		probs_filtered = probs[filter_mat_probs]
168 | 		classes_num_filtered = filter_mat_boxes[3] # class index of each entry that passed the threshold
169 | 
170 | 		argsort = np.array(np.argsort(probs_filtered))[::-1]
171 | 		boxes_filtered = boxes_filtered[argsort]
172 | 		probs_filtered = probs_filtered[argsort]
173 | 		classes_num_filtered = classes_num_filtered[argsort]
174 | 		
175 | 		for i in range(len(boxes_filtered)):
176 | 			if probs_filtered[i] == 0 : continue
177 | 			for j in range(i+1,len(boxes_filtered)):
178 | 				if self.iou(boxes_filtered[i],boxes_filtered[j]) > self.iou_threshold : 
179 | 					probs_filtered[j] = 0.0
180 | 		
181 | 		filter_iou = np.array(probs_filtered>0.0,dtype='bool')
182 | 		boxes_filtered = boxes_filtered[filter_iou]
183 | 		probs_filtered = probs_filtered[filter_iou]
184 | 		classes_num_filtered = classes_num_filtered[filter_iou]
185 | 
186 | 		result = []
187 | 		for i in range(len(boxes_filtered)):
188 | 			result.append([self.classes[classes_num_filtered[i]],boxes_filtered[i][0],boxes_filtered[i][1],boxes_filtered[i][2],boxes_filtered[i][3],probs_filtered[i]])
189 | 
190 | 		return result
191 | 
192 | 	def show_results(self,img,results):
193 | 		img_cp = img.copy()
194 | 		if self.filewrite_txt :
195 | 			ftxt = open(self.tofile_txt,'w')
196 | 		for i in range(len(results)):
197 | 			x = int(results[i][1])
198 | 			y = int(results[i][2])
199 | 			w = int(results[i][3])//2
200 | 			h = int(results[i][4])//2
201 | 			if self.disp_console : print '    class : ' + results[i][0] + ' , [x,y,w,h]=[' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+'], Confidence = ' + str(results[i][5])
202 | 			if self.filewrite_img or self.imshow:
203 | 				cv2.rectangle(img_cp,(x-w,y-h),(x+w,y+h),(0,255,0),2)
204 | 				cv2.rectangle(img_cp,(x-w,y-h-20),(x+w,y-h),(125,125,125),-1)
205 | 				cv2.putText(img_cp,results[i][0] + ' : %.2f' % results[i][5],(x-w+5,y-h-7),cv2.FONT_HERSHEY_SIMPLEX,0.5,(0,0,0),1)
206 | 			if self.filewrite_txt :				
207 | 				ftxt.write(results[i][0] + ',' + str(x) + ',' + str(y) + ',' + str(int(results[i][3])) + ',' + str(int(results[i][4]))+',' + str(results[i][5]) + '\n')
208 | 		if self.filewrite_img : 
209 | 			if self.disp_console : print '    image file written : ' + self.tofile_img
210 | 			cv2.imwrite(self.tofile_img,img_cp)			
211 | 		if self.imshow :
212 | 			cv2.imshow('YOLO_tiny detection',img_cp)
213 | 			cv2.waitKey(1)
214 | 		if self.filewrite_txt : 
215 | 			if self.disp_console : print '    txt file written : ' + self.tofile_txt
216 | 			ftxt.close()
217 | 
218 | 	def iou(self,box1,box2):
219 | 		overlap_w = min(box1[0]+0.5*box1[2],box2[0]+0.5*box2[2])-max(box1[0]-0.5*box1[2],box2[0]-0.5*box2[2]) # boxes are [x_center,y_center,w,h]
220 | 		overlap_h = min(box1[1]+0.5*box1[3],box2[1]+0.5*box2[3])-max(box1[1]-0.5*box1[3],box2[1]-0.5*box2[3])
221 | 		if overlap_w < 0 or overlap_h < 0 : intersection = 0
222 | 		else : intersection =  overlap_w*overlap_h
223 | 		return intersection / (box1[2]*box1[3] + box2[2]*box2[3] - intersection)
224 | 
225 | 	def training(self): #TODO add training function!
226 | 		return None
227 | 
228 | 	
229 | 			
230 | 
231 | def main(argvs):
232 | 	yolo = YOLO_TF(argvs)
233 | 	cv2.waitKey(1000)
234 | 
235 | 
236 | if __name__=='__main__':	
237 | 	main(sys.argv)
238 | 


--------------------------------------------------------------------------------
/YOLO_weight_extractor/Readme.md:
--------------------------------------------------------------------------------
 1 | #YOLO weight converter (darknet -> tensorflow)
 2 | 
 3 | 1. Usage
 4 | 
 5 |    (1) download this modified version of darknet
 6 | 
 7 | 
 8 |    (2) put your darknet weight file(made by pjreddie or you) in the folder that contains darknet executable
 9 | 
10 | 
11 |    (3) run yolo in test mode (ex -> ./darknet yolo test cfg/yolo-small.cfg yolo-small.weights)
12 | 
13 | 
14 |    (4) modified yolo will write txt files in folder 'cjy'
15 | 
16 | 
17 |    (5) exit yolo when you see 'enter image path:'
18 | 
19 | 
20 |    (6) open builder python file (YOLO_full_builder.py or YOLO_small_builder.py or YOLO_tiny_builder.py)
21 | 
22 | 
23 |    (7) change weights_dir in line 6 (the folder that contains extracted txt files)
24 | 
25 | 
26 |    (8) change path in the last line of function 'build_networks' (this is the path that will store ckpt file.)
27 | 
28 | 
29 |    (9) run builder python script
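
Before running a builder script in step (9), it can help to confirm that steps (3) to (5) actually produced text files and that the checkpoint destination exists. The following is a minimal sketch and is not part of this repository: the helper name `check_extractor_paths` and the example paths are hypothetical, and because the names of the files written into 'cjy' are not documented here, it only checks that at least one file ending in `.txt` is present.

```python
import os


def check_extractor_paths(weights_dir, ckpt_path):
    """Return a list of problems found; an empty list means the paths look usable.

    weights_dir: folder holding the txt files written by the modified darknet.
    ckpt_path:   file path where the builder script will store the ckpt file.
    """
    problems = []

    if not os.path.isdir(weights_dir):
        problems.append("weights_dir does not exist: " + weights_dir)
    else:
        # Only the extension is checked; the real file names are not assumed.
        txt_files = [n for n in os.listdir(weights_dir) if n.endswith(".txt")]
        if not txt_files:
            problems.append("no .txt files found in: " + weights_dir)

    # A bare file name means the checkpoint goes into the current folder.
    ckpt_dir = os.path.dirname(ckpt_path) or "."
    if not os.path.isdir(ckpt_dir):
        problems.append("checkpoint folder does not exist: " + ckpt_dir)

    return problems


if __name__ == "__main__":
    # Hypothetical locations; substitute the values you set in steps (7) and (8).
    for problem in check_extractor_paths("cjy", "weights/YOLO_small.ckpt"):
        print(problem)
```

If the script prints nothing, both paths passed these basic checks. It does not verify that the txt files contain valid weights.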
30 | 
31 | 2. Copyright
32 | 
33 |    
 34 |     I modified pjreddie's darknet code. (https://github.com/pjreddie/darknet)
35 | 


--------------------------------------------------------------------------------
/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wiibrew/YOLO_tensorflow/2ea09ea71c75312e070340baaf6090e4425641ab/YOLO_weight_extractor/YOLO_weight_extractor.tar.gz


--------------------------------------------------------------------------------
/test/person.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wiibrew/YOLO_tensorflow/2ea09ea71c75312e070340baaf6090e4425641ab/test/person.jpg


--------------------------------------------------------------------------------
/weights/put_weight_file_here.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wiibrew/YOLO_tensorflow/2ea09ea71c75312e070340baaf6090e4425641ab/weights/put_weight_file_here.txt


--------------------------------------------------------------------------------