├── requirements.txt ├── README.md ├── source.py └── yolov3-custom.cfg /requirements.txt: -------------------------------------------------------------------------------- 1 | imutils==0.5.4 2 | numpy==1.24.3 3 | opencv_python==4.6.0.66 4 | Pillow==10.0.1 5 | tensorflow==2.13.0 6 | tensorflow_intel==2.13.0 7 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Helmet-and-Number-Plate-Detection-and-Recognition 2 | Motorcycle accidents have been rising rapidly over the years in many countries, and the helmet is a motorcyclist's main piece of safety equipment. There is therefore a need for an automated system that monitors motorcycles, detects whether riders are wearing helmets, and detects number plates. 3 | 4 | This project proposes an automated system for detecting motorcyclists who are not wearing a helmet and for retrieving motorcycle number plates. The model first checks whether the rider is wearing a helmet; if not, it detects the vehicle's number plate. Object detection is performed with a deep learning algorithm based on a CNN (Convolutional Neural Network), which recognizes specific objects in videos, images, or live feeds. 5 | 6 | This project is a Streamlit-based application for detecting bikes and helmets and recognizing number plates in a video stream. It uses the YOLOv3 object detection model to detect bikes and a CNN model to classify whether a helmet is worn. Number plates are also recognized in the video in real time. 7 | 8 | # Installation 9 | 1. Clone the Repository 10 | 11 | Clone this repository to your local machine. 12 | 2. Install Dependencies: 13 | 14 | Navigate to the project directory and install the required dependencies listed in the requirements.txt file using pip: 15 | 16 | pip install -r requirements.txt 17 | 3. Download the YOLO Weights and Configuration: 18 | 19 | Download the YOLOv3 weights (yolov3-custom_7000.weights) and configuration (yolov3-custom.cfg) files. You can obtain these files from your own YOLOv3 training run or from a pre-trained YOLOv3 model. Place these files in the project directory. 20 | 4. Download the Helmet Detection Model: 21 | 22 | Download the helmet detection model (helmet-nonhelmet_cnn.h5) and place it in the project directory. You can train this model on your own dataset or use a pre-trained one. 23 | 24 | # Usage 25 | 1. Run the Streamlit App 26 | 27 | To run the Streamlit app, use the following command: 28 | 29 | streamlit run source.py 30 | 31 | This will start the Streamlit development server and open the app in your default web browser. 32 | 33 | 2. Upload a Video File 34 | 35 | On the Streamlit app, use the file uploader to select a video file (e.g., MP4 or AVI) that you want to process. 36 | 37 | 3. View the Detection and Recognition Results 38 | 39 | The app will display the video with real-time detections of bikes and riders. Each detected rider is labeled "Helmet" or "No Helmet" based on the helmet detection model's prediction. Number plates, if present, will also be recognized and displayed. 40 | 41 | 4. Interact with the App 42 | 43 | You can pause, resume, and navigate through the video using the app's interface. Observe the real-time results as the video plays. 44 | 45 | # File Structure 46 | 47 | * source.py: The main Streamlit app code for helmet, bike, and number plate detection and recognition.
48 | * requirements.txt: A list of required Python packages and their versions. 49 | * yolov3-custom_7000.weights: YOLOv3 custom-trained weights for object detection. 50 | `https://drive.google.com/file/d/17DWQ1WfYHxYD_wab2OQybHwRaDDGN54K/view?usp=sharing` 51 | * yolov3-custom.cfg: YOLOv3 custom model configuration file. 52 | * helmet-nonhelmet_cnn.h5: Helmet detection CNN model weights. 53 | `https://drive.google.com/file/d/1QW5Fw3sWHqSiJIpzxkYLpREjO_OmqX8W/view?usp=sharing` 54 | 55 | # Screenshots 56 | ![ss](https://github.com/FatimaSidra/Helmet-and-Number-Plate-Detection-and-Recognition/assets/112679516/dc00805f-2ce6-457b-b152-5b97f4b497bd) 57 | ![ss2](https://github.com/FatimaSidra/Helmet-and-Number-Plate-Detection-and-Recognition/assets/112679516/3254988d-1fd7-4cba-a53e-bd3efea3ce12) 58 | ![ss3](https://github.com/FatimaSidra/Helmet-and-Number-Plate-Detection-and-Recognition/assets/112679516/a98592c1-06e0-4933-ac5d-5fb4110a7190) 59 | 60 | # Sample Output Video 61 | https://drive.google.com/file/d/1L4BRoO4WndLfTzfy4bOi-Oa7RpTWwOsU/view?usp=sharing 62 | 63 | # Acknowledgements 64 | * This project uses YOLOv3 for object detection. You can find more information about YOLOv3 at the link below. 65 | 66 | https://pjreddie.com/darknet/yolo/ 67 | 68 | * The helmet detection model is a CNN-based classifier used for detecting helmets on bike riders. Number plate recognition is performed in real time to identify and display number plates. 69 | * Special thanks to the Streamlit community for creating an easy-to-use web framework for data science applications. Visit Streamlit's official website for more information. 70 | 71 | Feel free to customize and extend this project to suit your specific needs or explore other object detection and recognition tasks using Streamlit. 72 | -------------------------------------------------------------------------------- /source.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | import cv2 3 | import numpy as np 4 | import os 5 | from PIL import Image 6 | import time 7 | import imutils 8 | from tensorflow.keras.models import load_model 9 | 10 | # Load the custom-trained YOLOv3 model used for bike and number plate detection 11 | net = cv2.dnn.readNet("yolov3-custom_7000.weights", "yolov3-custom.cfg") 12 | net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)  # requires an OpenCV build with CUDA support 13 | net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA) 14 | 15 | # Names of the unconnected output layers (the three YOLO detection heads) 16 | output_layers = net.getUnconnectedOutLayersNames() 17 | 18 | # Load helmet detection model 19 | model = load_model('helmet-nonhelmet_cnn.h5') 20 | st.write('Model loaded!!!') 21 | 22 | st.title("Bike, Helmet and Number Plate Detection and Recognition") 23 | 24 | uploaded_file = st.file_uploader("Choose a video file", type=["mp4", "avi"]) 25 | if uploaded_file is not None: 26 | # Save the uploaded file to a temporary directory 27 | os.makedirs("temp", exist_ok=True) 28 | temp_file_path = os.path.join("temp", uploaded_file.name) 29 | with open(temp_file_path, "wb") as temp_file: 30 | temp_file.write(uploaded_file.read()) 31 | 32 | video = cv2.VideoCapture(temp_file_path) 33 | 34 | if not video.isOpened(): 35 | st.error("Error: Could not open video file.") 36 | else: 37 | stframe = st.empty() 38 | 39 | while True: 40 | ret, frame = video.read() 41 | 42 | if not ret: 43 | break 44 | 45 | frame = imutils.resize(frame, height=500) 46 | height, width = frame.shape[:2] 47 | 48 | blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False) 49 | net.setInput(blob) 50 | 51 | # Run a forward pass through all YOLO output layers 52 | outs = net.forward(output_layers)
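53 | # Note on the output format (descriptive comment added for clarity): each row of a YOLO output is 54 | # [center_x, center_y, width, height, objectness, class scores...], with the box coordinates 55 | # normalised to the input size, so they are scaled back to pixel values in the loop below.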
56 | 57 | confidences = [] 58 | boxes = [] 59 | classIds = [] 60 | 61 | for out in outs: 62 | for detection in out: 63 | scores = detection[5:] 64 | class_id = np.argmax(scores) 65 | confidence = scores[class_id] 66 | if confidence > 0.3: 67 | center_x = int(detection[0] * width) 68 | center_y = int(detection[1] * height) 69 | w = int(detection[2] * width) 70 | h = int(detection[3] * height) 71 | x = int(center_x - w / 2) 72 | y = int(center_y - h / 2) 73 | boxes.append([x, y, w, h]) 74 | confidences.append(float(confidence)) 75 | classIds.append(class_id) 76 | 77 | indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4) 78 | 79 | for i in range(len(boxes)): 80 | if i in indexes: 81 | x, y, w, h = boxes[i] 82 | if classIds[i] == 0: # bike 83 | helmet_roi = frame[max(0, y):max(0, y) + max(0, h) // 4, max(0, x):max(0, x) + max(0, w)] # the rider's head is assumed to lie in the top quarter of the bike box 84 | if helmet_roi.shape[0] > 0 and helmet_roi.shape[1] > 0: 85 | helmet_roi = cv2.resize(helmet_roi, (224, 224)) 86 | helmet_roi = np.array(helmet_roi, dtype='float32') 87 | helmet_roi = helmet_roi.reshape(1, 224, 224, 3) 88 | helmet_roi = helmet_roi / 255.0 89 | prediction = int(model.predict(helmet_roi)[0][0] > 0.5) # threshold the CNN output at 0.5 (0 = helmet) instead of truncating it 90 | if prediction == 0: 91 | frame = cv2.putText(frame, 'Helmet', (x, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.7, 92 | (0, 255, 0), 2) 93 | else: 94 | frame = cv2.putText(frame, 'No Helmet', (x, y - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.7, 95 | (0, 0, 255), 2) 96 | cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2) 97 | 98 | stframe.image(frame, channels="BGR", use_column_width=True) 99 | 100 | video.release() 101 | # Remove the temporary video file 102 | os.remove(temp_file_path) 103 | -------------------------------------------------------------------------------- /yolov3-custom.cfg: -------------------------------------------------------------------------------- 1 | [net] 2 | # Testing 3 | #batch=64 4 | #subdivisions=16 5 | # Training 6 | batch=64 7 | subdivisions=16 8 | width=416 9 | height=416 10 | channels=3 11 | momentum=0.9 12 | decay=0.0005 13 | angle=0 14 | saturation = 1.5 15 | exposure = 1.5 16 | hue=.1 17 | 18 | learning_rate=0.001 19 | burn_in=1000 20 | max_batches = 6000 21 | policy=steps 22 | steps=4800,5400 23 | scales=.1,.1 24 | 25 | [convolutional] 26 | batch_normalize=1 27 | filters=32 28 | size=3 29 | stride=1 30 | pad=1 31 | activation=leaky 32 | 33 | # Downsample 34 | 35 | [convolutional] 36 | batch_normalize=1 37 | filters=64 38 | size=3 39 | stride=2 40 | pad=1 41 | activation=leaky 42 | 43 | [convolutional] 44 | batch_normalize=1 45 | filters=32 46 | size=1 47 | stride=1 48 | pad=1 49 | activation=leaky 50 | 51 | [convolutional] 52 | batch_normalize=1 53 | filters=64 54 | size=3 55 | stride=1 56 | pad=1 57 | activation=leaky 58 | 59 | [shortcut] 60 | from=-3 61 | activation=linear 62 | 63 | # Downsample 64 | 65 | [convolutional] 66 | batch_normalize=1 67 | filters=128 68 | size=3 69 | stride=2 70 | pad=1 71 | activation=leaky 72 | 73 | [convolutional] 74 | batch_normalize=1 75 | filters=64 76 | size=1 77 | stride=1 78 | pad=1 79 | activation=leaky 80 | 81 | [convolutional] 82 | batch_normalize=1 83 | filters=128 84 | size=3 85 | stride=1 86 | pad=1 87 | activation=leaky 88 | 89 | [shortcut] 90 | from=-3 91 | activation=linear 92 | 93
| [convolutional] 94 | batch_normalize=1 95 | filters=64 96 | size=1 97 | stride=1 98 | pad=1 99 | activation=leaky 100 | 101 | [convolutional] 102 | batch_normalize=1 103 | filters=128 104 | size=3 105 | stride=1 106 | pad=1 107 | activation=leaky 108 | 109 | [shortcut] 110 | from=-3 111 | activation=linear 112 | 113 | # Downsample 114 | 115 | [convolutional] 116 | batch_normalize=1 117 | filters=256 118 | size=3 119 | stride=2 120 | pad=1 121 | activation=leaky 122 | 123 | [convolutional] 124 | batch_normalize=1 125 | filters=128 126 | size=1 127 | stride=1 128 | pad=1 129 | activation=leaky 130 | 131 | [convolutional] 132 | batch_normalize=1 133 | filters=256 134 | size=3 135 | stride=1 136 | pad=1 137 | activation=leaky 138 | 139 | [shortcut] 140 | from=-3 141 | activation=linear 142 | 143 | [convolutional] 144 | batch_normalize=1 145 | filters=128 146 | size=1 147 | stride=1 148 | pad=1 149 | activation=leaky 150 | 151 | [convolutional] 152 | batch_normalize=1 153 | filters=256 154 | size=3 155 | stride=1 156 | pad=1 157 | activation=leaky 158 | 159 | [shortcut] 160 | from=-3 161 | activation=linear 162 | 163 | [convolutional] 164 | batch_normalize=1 165 | filters=128 166 | size=1 167 | stride=1 168 | pad=1 169 | activation=leaky 170 | 171 | [convolutional] 172 | batch_normalize=1 173 | filters=256 174 | size=3 175 | stride=1 176 | pad=1 177 | activation=leaky 178 | 179 | [shortcut] 180 | from=-3 181 | activation=linear 182 | 183 | [convolutional] 184 | batch_normalize=1 185 | filters=128 186 | size=1 187 | stride=1 188 | pad=1 189 | activation=leaky 190 | 191 | [convolutional] 192 | batch_normalize=1 193 | filters=256 194 | size=3 195 | stride=1 196 | pad=1 197 | activation=leaky 198 | 199 | [shortcut] 200 | from=-3 201 | activation=linear 202 | 203 | 204 | [convolutional] 205 | batch_normalize=1 206 | filters=128 207 | size=1 208 | stride=1 209 | pad=1 210 | activation=leaky 211 | 212 | [convolutional] 213 | batch_normalize=1 214 | filters=256 215 | size=3 216 | stride=1 217 | pad=1 218 | activation=leaky 219 | 220 | [shortcut] 221 | from=-3 222 | activation=linear 223 | 224 | [convolutional] 225 | batch_normalize=1 226 | filters=128 227 | size=1 228 | stride=1 229 | pad=1 230 | activation=leaky 231 | 232 | [convolutional] 233 | batch_normalize=1 234 | filters=256 235 | size=3 236 | stride=1 237 | pad=1 238 | activation=leaky 239 | 240 | [shortcut] 241 | from=-3 242 | activation=linear 243 | 244 | [convolutional] 245 | batch_normalize=1 246 | filters=128 247 | size=1 248 | stride=1 249 | pad=1 250 | activation=leaky 251 | 252 | [convolutional] 253 | batch_normalize=1 254 | filters=256 255 | size=3 256 | stride=1 257 | pad=1 258 | activation=leaky 259 | 260 | [shortcut] 261 | from=-3 262 | activation=linear 263 | 264 | [convolutional] 265 | batch_normalize=1 266 | filters=128 267 | size=1 268 | stride=1 269 | pad=1 270 | activation=leaky 271 | 272 | [convolutional] 273 | batch_normalize=1 274 | filters=256 275 | size=3 276 | stride=1 277 | pad=1 278 | activation=leaky 279 | 280 | [shortcut] 281 | from=-3 282 | activation=linear 283 | 284 | # Downsample 285 | 286 | [convolutional] 287 | batch_normalize=1 288 | filters=512 289 | size=3 290 | stride=2 291 | pad=1 292 | activation=leaky 293 | 294 | [convolutional] 295 | batch_normalize=1 296 | filters=256 297 | size=1 298 | stride=1 299 | pad=1 300 | activation=leaky 301 | 302 | [convolutional] 303 | batch_normalize=1 304 | filters=512 305 | size=3 306 | stride=1 307 | pad=1 308 | activation=leaky 309 | 310 | [shortcut] 311 | from=-3 312 
| activation=linear 313 | 314 | 315 | [convolutional] 316 | batch_normalize=1 317 | filters=256 318 | size=1 319 | stride=1 320 | pad=1 321 | activation=leaky 322 | 323 | [convolutional] 324 | batch_normalize=1 325 | filters=512 326 | size=3 327 | stride=1 328 | pad=1 329 | activation=leaky 330 | 331 | [shortcut] 332 | from=-3 333 | activation=linear 334 | 335 | 336 | [convolutional] 337 | batch_normalize=1 338 | filters=256 339 | size=1 340 | stride=1 341 | pad=1 342 | activation=leaky 343 | 344 | [convolutional] 345 | batch_normalize=1 346 | filters=512 347 | size=3 348 | stride=1 349 | pad=1 350 | activation=leaky 351 | 352 | [shortcut] 353 | from=-3 354 | activation=linear 355 | 356 | 357 | [convolutional] 358 | batch_normalize=1 359 | filters=256 360 | size=1 361 | stride=1 362 | pad=1 363 | activation=leaky 364 | 365 | [convolutional] 366 | batch_normalize=1 367 | filters=512 368 | size=3 369 | stride=1 370 | pad=1 371 | activation=leaky 372 | 373 | [shortcut] 374 | from=-3 375 | activation=linear 376 | 377 | [convolutional] 378 | batch_normalize=1 379 | filters=256 380 | size=1 381 | stride=1 382 | pad=1 383 | activation=leaky 384 | 385 | [convolutional] 386 | batch_normalize=1 387 | filters=512 388 | size=3 389 | stride=1 390 | pad=1 391 | activation=leaky 392 | 393 | [shortcut] 394 | from=-3 395 | activation=linear 396 | 397 | 398 | [convolutional] 399 | batch_normalize=1 400 | filters=256 401 | size=1 402 | stride=1 403 | pad=1 404 | activation=leaky 405 | 406 | [convolutional] 407 | batch_normalize=1 408 | filters=512 409 | size=3 410 | stride=1 411 | pad=1 412 | activation=leaky 413 | 414 | [shortcut] 415 | from=-3 416 | activation=linear 417 | 418 | 419 | [convolutional] 420 | batch_normalize=1 421 | filters=256 422 | size=1 423 | stride=1 424 | pad=1 425 | activation=leaky 426 | 427 | [convolutional] 428 | batch_normalize=1 429 | filters=512 430 | size=3 431 | stride=1 432 | pad=1 433 | activation=leaky 434 | 435 | [shortcut] 436 | from=-3 437 | activation=linear 438 | 439 | [convolutional] 440 | batch_normalize=1 441 | filters=256 442 | size=1 443 | stride=1 444 | pad=1 445 | activation=leaky 446 | 447 | [convolutional] 448 | batch_normalize=1 449 | filters=512 450 | size=3 451 | stride=1 452 | pad=1 453 | activation=leaky 454 | 455 | [shortcut] 456 | from=-3 457 | activation=linear 458 | 459 | # Downsample 460 | 461 | [convolutional] 462 | batch_normalize=1 463 | filters=1024 464 | size=3 465 | stride=2 466 | pad=1 467 | activation=leaky 468 | 469 | [convolutional] 470 | batch_normalize=1 471 | filters=512 472 | size=1 473 | stride=1 474 | pad=1 475 | activation=leaky 476 | 477 | [convolutional] 478 | batch_normalize=1 479 | filters=1024 480 | size=3 481 | stride=1 482 | pad=1 483 | activation=leaky 484 | 485 | [shortcut] 486 | from=-3 487 | activation=linear 488 | 489 | [convolutional] 490 | batch_normalize=1 491 | filters=512 492 | size=1 493 | stride=1 494 | pad=1 495 | activation=leaky 496 | 497 | [convolutional] 498 | batch_normalize=1 499 | filters=1024 500 | size=3 501 | stride=1 502 | pad=1 503 | activation=leaky 504 | 505 | [shortcut] 506 | from=-3 507 | activation=linear 508 | 509 | [convolutional] 510 | batch_normalize=1 511 | filters=512 512 | size=1 513 | stride=1 514 | pad=1 515 | activation=leaky 516 | 517 | [convolutional] 518 | batch_normalize=1 519 | filters=1024 520 | size=3 521 | stride=1 522 | pad=1 523 | activation=leaky 524 | 525 | [shortcut] 526 | from=-3 527 | activation=linear 528 | 529 | [convolutional] 530 | batch_normalize=1 531 | filters=512 
532 | size=1 533 | stride=1 534 | pad=1 535 | activation=leaky 536 | 537 | [convolutional] 538 | batch_normalize=1 539 | filters=1024 540 | size=3 541 | stride=1 542 | pad=1 543 | activation=leaky 544 | 545 | [shortcut] 546 | from=-3 547 | activation=linear 548 | 549 | ###################### 550 | 551 | [convolutional] 552 | batch_normalize=1 553 | filters=512 554 | size=1 555 | stride=1 556 | pad=1 557 | activation=leaky 558 | 559 | [convolutional] 560 | batch_normalize=1 561 | size=3 562 | stride=1 563 | pad=1 564 | filters=1024 565 | activation=leaky 566 | 567 | [convolutional] 568 | batch_normalize=1 569 | filters=512 570 | size=1 571 | stride=1 572 | pad=1 573 | activation=leaky 574 | 575 | [convolutional] 576 | batch_normalize=1 577 | size=3 578 | stride=1 579 | pad=1 580 | filters=1024 581 | activation=leaky 582 | 583 | [convolutional] 584 | batch_normalize=1 585 | filters=512 586 | size=1 587 | stride=1 588 | pad=1 589 | activation=leaky 590 | 591 | [convolutional] 592 | batch_normalize=1 593 | size=3 594 | stride=1 595 | pad=1 596 | filters=1024 597 | activation=leaky 598 | 599 | [convolutional] 600 | size=1 601 | stride=1 602 | pad=1 603 | filters=21 604 | activation=linear 605 | 606 | 607 | [yolo] 608 | mask = 6,7,8 609 | anchors = 41,176, 75,283, 123,320, 194,348, 273,373,0,0,0,0,0,0,0,0 610 | classes=2 611 | num=9 612 | jitter=.3 613 | ignore_thresh = .7 614 | truth_thresh = 1 615 | random=1 616 | 617 | 618 | [route] 619 | layers = -4 620 | 621 | [convolutional] 622 | batch_normalize=1 623 | filters=256 624 | size=1 625 | stride=1 626 | pad=1 627 | activation=leaky 628 | 629 | [upsample] 630 | stride=2 631 | 632 | [route] 633 | layers = -1, 61 634 | 635 | 636 | 637 | [convolutional] 638 | batch_normalize=1 639 | filters=256 640 | size=1 641 | stride=1 642 | pad=1 643 | activation=leaky 644 | 645 | [convolutional] 646 | batch_normalize=1 647 | size=3 648 | stride=1 649 | pad=1 650 | filters=512 651 | activation=leaky 652 | 653 | [convolutional] 654 | batch_normalize=1 655 | filters=256 656 | size=1 657 | stride=1 658 | pad=1 659 | activation=leaky 660 | 661 | [convolutional] 662 | batch_normalize=1 663 | size=3 664 | stride=1 665 | pad=1 666 | filters=512 667 | activation=leaky 668 | 669 | [convolutional] 670 | batch_normalize=1 671 | filters=256 672 | size=1 673 | stride=1 674 | pad=1 675 | activation=leaky 676 | 677 | [convolutional] 678 | batch_normalize=1 679 | size=3 680 | stride=1 681 | pad=1 682 | filters=512 683 | activation=leaky 684 | 685 | [convolutional] 686 | size=1 687 | stride=1 688 | pad=1 689 | filters=21 690 | activation=linear 691 | 692 | 693 | [yolo] 694 | mask = 3,4,5 695 | anchors = 41,176, 75,283, 123,320, 194,348, 273,373,0,0,0,0,0,0,0,0 696 | classes=2 697 | num=9 698 | jitter=.3 699 | ignore_thresh = .7 700 | truth_thresh = 1 701 | random=1 702 | 703 | 704 | 705 | [route] 706 | layers = -4 707 | 708 | [convolutional] 709 | batch_normalize=1 710 | filters=128 711 | size=1 712 | stride=1 713 | pad=1 714 | activation=leaky 715 | 716 | [upsample] 717 | stride=2 718 | 719 | [route] 720 | layers = -1, 36 721 | 722 | 723 | 724 | [convolutional] 725 | batch_normalize=1 726 | filters=128 727 | size=1 728 | stride=1 729 | pad=1 730 | activation=leaky 731 | 732 | [convolutional] 733 | batch_normalize=1 734 | size=3 735 | stride=1 736 | pad=1 737 | filters=256 738 | activation=leaky 739 | 740 | [convolutional] 741 | batch_normalize=1 742 | filters=128 743 | size=1 744 | stride=1 745 | pad=1 746 | activation=leaky 747 | 748 | [convolutional] 749 | batch_normalize=1 
750 | size=3 751 | stride=1 752 | pad=1 753 | filters=256 754 | activation=leaky 755 | 756 | [convolutional] 757 | batch_normalize=1 758 | filters=128 759 | size=1 760 | stride=1 761 | pad=1 762 | activation=leaky 763 | 764 | [convolutional] 765 | batch_normalize=1 766 | size=3 767 | stride=1 768 | pad=1 769 | filters=256 770 | activation=leaky 771 | 772 | [convolutional] 773 | size=1 774 | stride=1 775 | pad=1 776 | filters=21 777 | activation=linear 778 | 779 | 780 | [yolo] 781 | mask = 0,1,2 782 | anchors = 41,176, 75,283, 123,320, 194,348, 273,373,0,0,0,0,0,0,0,0 783 | classes=2 784 | num=9 785 | jitter=.3 786 | ignore_thresh = .7 787 | truth_thresh = 1 788 | random=1 789 | 790 | --------------------------------------------------------------------------------