├── .gitignore ├── README.md ├── RVD_result_log.csv ├── projectRVD ppt_eng.pdf ├── raspberry ├── Readme.md ├── Violence_detection.py ├── client.py ├── note_video.py ├── output_raspberry.mp4 ├── output_raspberry_test.mp4 ├── output_webcam.mp4 ├── output_webcam_test.mp4 ├── server.py ├── server2.py ├── server2_font.py ├── server3.py ├── streaming_pred.py ├── video_maker.py └── video_maker1.py └── ver_jupyter ├── 01_video-to-numpy-save.ipynb ├── 02_create-numpy-datasets_training-test.ipynb ├── 03_MobileNet.ipynb ├── 04_MobileNet+LSTM_model_training.ipynb ├── 05_Apply-model-to-Video.ipynb ├── 06_labtop_webcam_streaming.ipynb └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | .eggs/ 17 | lib/ 18 | lib64/ 19 | parts/ 20 | sdist/ 21 | var/ 22 | wheels/ 23 | pip-wheel-metadata/ 24 | share/python-wheels/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | MANIFEST 29 | 30 | # PyInstaller 31 | # Usually these files are written by a python script from a template 32 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 33 | *.manifest 34 | *.spec 35 | 36 | # Installer logs 37 | pip-log.txt 38 | pip-delete-this-directory.txt 39 | 40 | # Unit test / coverage reports 41 | htmlcov/ 42 | .tox/ 43 | .nox/ 44 | .coverage 45 | .coverage.* 46 | .cache 47 | nosetests.xml 48 | coverage.xml 49 | *.cover 50 | *.py,cover 51 | .hypothesis/ 52 | .pytest_cache/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | target/ 76 | 77 | # Jupyter Notebook 78 | .ipynb_checkpoints 79 | 80 | # IPython 81 | profile_default/ 82 | ipython_config.py 83 | 84 | # pyenv 85 | .python-version 86 | 87 | # pipenv 88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 91 | # install all needed dependencies. 92 | #Pipfile.lock 93 | 94 | # PEP 582; used by e.g. 
github.com/David-OConnor/pyflow 95 | __pypackages__/ 96 | 97 | # Celery stuff 98 | celerybeat-schedule 99 | celerybeat.pid 100 | 101 | # SageMath parsed files 102 | *.sage.py 103 | 104 | # Environments 105 | .env 106 | .venv 107 | env/ 108 | venv/ 109 | ENV/ 110 | env.bak/ 111 | venv.bak/ 112 | 113 | # Spyder project settings 114 | .spyderproject 115 | .spyproject 116 | 117 | # Rope project settings 118 | .ropeproject 119 | 120 | # mkdocs documentation 121 | /site 122 | 123 | # mypy 124 | .mypy_cache/ 125 | .dmypy.json 126 | dmypy.json 127 | 128 | # Pyre type checker 129 | .pyre/ 130 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Project RVD : Realtime Violence Detection with Raspberry Pi 2 | 3 | * Client: J.Oh 4 | * Starts at: 2021/04/08 5 | * Ends at: 2021/05/17 6 | * Members: [Sukwang Lee](https://github.com/SookwangLee), [Hyewon Rho](https://github.com/rhohye22), [Jeonghun Park](https://github.com/berojung), [Jimin Chae](https://github.com/regenesis90), [Yeong-Min Choi](https://github.com/kmccym) 7 | 8 | # *Introduction* 9 | 10 | * Violence detection by using MobileNet + LSTM (Binary classification : Violence / Non-Violence) 11 | * Add captions of violence dection result on video screen (video file or realtime video streaming) **[>> Fast Link](https://github.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/tree/main/ver_jupyter)** 12 | * Using Raspberry Pi & camera module for realtime video streaming **[>> Fast Link](https://github.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/tree/main/raspberry)** 13 | 14 | # *Environment* 15 | 16 | - Linux OS 17 | - Python ≥ 3.8 18 | - OpenCV ≥ 4 19 | - Tensorflow ≥ 2.4.0 20 | - Keras 2.4.0 21 | 22 | # *Tools* 23 | 24 | - Jupyterlab ≥ 3.0 25 | - Pycharm ≥ 2 26 | - Raspberry Pi +3 27 | - Camera Module 28 | 29 | # *Result* 30 | 31 | ## Examples 32 | 33 | ![ezgif-4-3dc74782d191](https://user-images.githubusercontent.com/75024126/117774684-b9e9f480-b274-11eb-978a-060f21ffd1af.gif) 34 | ![ezgif-4-15e71033eab4](https://user-images.githubusercontent.com/75024126/117774703-bce4e500-b274-11eb-8e3c-14f54d7a8743.gif) 35 | ![ezgif-4-7a9fd72fa7c9](https://user-images.githubusercontent.com/75024126/117774858-dd14a400-b274-11eb-941a-aaf8e45eb8a7.gif) 36 | ![ezgif-4-2946841c66b2](https://user-images.githubusercontent.com/75024126/117777516-a429fe80-b277-11eb-81b0-1da2b6a0ef41.gif) 37 | 38 | ![KakaoTalk_20210514_164231168](https://user-images.githubusercontent.com/76435473/118238485-8a84f300-b4d3-11eb-89f8-8bda9e57c397.gif) 39 | 40 | 41 | ## Change of Accuracy & Loss of Model 42 | 43 | * **Result Log** : **[View Table](https://github.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/blob/main/RVD_result_log.csv)** 44 | 45 | ![RVD_result_model_comparison](https://user-images.githubusercontent.com/75024126/117956567-21c33c80-b354-11eb-9768-aac0ed1fc5ef.png) 46 | ![RVD_result_log](https://user-images.githubusercontent.com/75024126/117956574-238d0000-b354-11eb-81ff-de111fa69851.png) 47 | 48 | # *Reference* 49 | 50 | ## Datasets 51 | 52 | * RWF2000 : https://github.com/mchengny/RWF2000-Video-Database-for-Violence-Detection 53 | * Hocky : http://visilab.etsii.uclm.es/personas/oscar/FightDetection/index.html 54 | * raw.zip : https://github.com/niyazed/violence-video-classification 55 | * cam1, cam2 : 
https://github.com/airtlab/A-Dataset-for-Automatic-Violence-Detection-in-Videos/tree/master/violence-detection-dataset/non-violent/cam1 56 | * RVD : - 57 | 58 | ## Examples 59 | * UCF Anomaly Detection Datasets : https://webpages.uncc.edu/cchen62/dataset.html 60 | * Those video files were used only for making output videos & GIFs. 61 | 62 | ## Projects That Inspired Us 63 | 64 | * [Pedro Frodenas's Github](https://github.com/pedrofrodenas/Violence-Detection-CNN-LSTM/blob/master/violence_detection.ipynb) 65 | * [Souhaiel BenSalem's Github](https://github.com/shouhaiel1/CNN-LSTM-Violence-detection) 66 | -------------------------------------------------------------------------------- /RVD_result_log.csv: -------------------------------------------------------------------------------- 1 | Date,Accuracy,Loss,Base_Model 2 | 2021.04.14,0.6183,0.6372,VGG16 3 | 2021.04.15,0.7574,0.1766,VGG16 4 | 2021.04.19,0.7635,0.2006,VGG19 5 | 2021.04.20,0.7986,0.1549,VGG19 6 | 2021.04.23,0.8302,0.4121,VGG19 7 | 2021.04.29,0.8,0.1576,VGG19 8 | 2021.05.06,0.7895,0.1899,VGG19 9 | 2021.05.07,0.7852,0.1935,VGG19 10 | 2021.05.09,0.7852,0.1935,VGG19 11 | 2021.05.09,0.7824,0.189,InceptionResNetV2 12 | 2021.05.09,0.749,0.2004,InceptionV3 13 | 2021.05.09,0.8033,0.1732,DenseNet 14 | 2021.05.09,0.8159,0.1539,MobileNet 15 | 2021.05.09,0.8145,0.1617,MobileNetV2 16 | 2021.05.12,0.8319,0.139,MobileNet 17 | -------------------------------------------------------------------------------- /projectRVD ppt_eng.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/969de595a953aaa103334ece819600e2530d91e2/projectRVD ppt_eng.pdf -------------------------------------------------------------------------------- /raspberry/Readme.md: -------------------------------------------------------------------------------- 1 | 2 | # 21.05.13 Code Update 3 | 4 | ## Server(.py) 5 | **Run it on your desktop** 6 | * server2.py must be run together with client.py 7 | * server3.py can be used on its own if only a local camera device such as a laptop webcam is connected. 8 | 9 | ## Additional files to make the server scripts work 10 | **Font File :** https://fonts.google.com/specimen/Raleway?query=ralewa#license 11 | * If you want to use a different font, modify the 'ImageFont.truetype' part accordingly. 12 | 13 | **Weight File(.h5) :** https://drive.google.com/u/0/uc?export=download&confirm=r7yc&id=1z4UKAkhyItFCJ7aue3FNfCVN5jfsdkkU 14 | * The deep learning model used is MobileNet. For more information, refer to the ver_jupyter directory. 15 | 16 | ## Client(.py) 17 | **Run it on your Raspberry Pi** 18 | * **Board used** : Raspberry Pi 3+ 19 | 20 | ## OpenCV must be installed on the Raspberry Pi. 21 | * This is the OpenCV (ver. 4.1.2) installation guide we referenced. 22 | **Download Link :** https://make.e4ds.com/make/learn_guide_view.asp?idx=116 23 | 24 | ## Imagezmq 25 | * Transfers image frames from the Raspberry Pi to the desktop in real time; a minimal sender/receiver sketch is shown below.
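* The pairing between client.py and the server scripts follows the standard imagezmq request/reply pattern. The sketch below is a minimal example, not a drop-in copy of the repository scripts: the hostname `my-desktop` is a placeholder for your desktop's hostname or IP, and note that client.py as committed never assigns `rpi_name` and still points `connect_to` at the sample host `jeff-macbook`, so adjust both as shown.

```python
# Runs on the Raspberry Pi (sender side) -- minimal sketch, assumes imagezmq and imutils are installed
import socket
import time

import imagezmq
from imutils.video import VideoStream

rpi_name = socket.gethostname()      # label sent along with every frame
sender = imagezmq.ImageSender(connect_to='tcp://my-desktop:5555')  # placeholder hostname

picam = VideoStream(usePiCamera=True).start()
time.sleep(2.0)                      # allow the camera sensor to warm up
while True:                          # stream until Ctrl-C
    sender.send_image(rpi_name, picam.read())  # blocks until the hub replies
```

* The matching receiver loop on the desktop (essentially what server2.py and video_maker1.py do before any prediction):

```python
# Runs on the desktop (receiver side)
import cv2
import imagezmq

image_hub = imagezmq.ImageHub()      # listens on tcp://*:5555 by default
while True:
    rpi_name, frame = image_hub.recv_image()
    cv2.imshow(rpi_name, frame)      # one window per sending Pi
    cv2.waitKey(1)
    image_hub.send_reply(b'OK')      # REQ/REP: the Pi blocks until this reply arrives
```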
26 | **Download Link :** https://pypi.org/project/imagezmq/ 27 | 28 | 29 | ## Simulation 30 | ![KakaoTalk_20210511_174831663](https://user-images.githubusercontent.com/76435473/117787354-5a461600-b281-11eb-971d-c89878ce3e85.gif) 31 | -------------------------------------------------------------------------------- /raspberry/Violence_detection.py: -------------------------------------------------------------------------------- 1 | # import 2 | from __future__ import absolute_import 3 | from __future__ import division 4 | from __future__ import print_function 5 | import numpy as np 6 | import tensorflow as tf 7 | from collections import deque 8 | import argparse 9 | from skimage.io import imread 10 | from skimage.transform import resize 11 | import cv2 12 | import numpy as np 13 | import os 14 | from PIL import Image 15 | from io import BytesIO 16 | import time 17 | 18 | from tensorflow.keras.models import Sequential, Model, load_model 19 | from tensorflow.keras.layers import Dropout, Dense, Flatten, Input 20 | from tensorflow.keras.optimizers import Adam 21 | from tensorflow.keras.metrics import categorical_crossentropy 22 | from tensorflow.keras.preprocessing.image import ImageDataGenerator 23 | 24 | from tensorflow.python.client import device_lib 25 | print(device_lib.list_local_devices()) 26 | 27 | def souhaiel_model(tf,wgts='fightw.hdfs'): 28 | layers = tf.keras.layers 29 | models = tf.keras.models 30 | losses = tf.keras.losses 31 | optimizers = tf.keras.optimizers 32 | metrics = tf.keras.metrics 33 | num_classes = 2 34 | cnn = models.Sequential() 35 | 36 | #cnn.add(base_model) 37 | input_shapes=(160,160,3) 38 | np.random.seed(1234) 39 | vg19 = tf.keras.applications.vgg19.VGG19 40 | base_model = vg19(include_top=False, weights='imagenet', input_shape=(160, 160, 3)) 41 | # Freeze the layers except the last 4 layers (we will only use the base model to extract features) 42 | cnn = models.Sequential() 43 | cnn.add(base_model) 44 | cnn.add(layers.Flatten()) 45 | model = models.Sequential() 46 | model.add(layers.TimeDistributed(cnn, input_shape=(30, 160, 160, 3))) 47 | model.add(layers.LSTM(30, return_sequences= True)) 48 | model.add(layers.TimeDistributed(layers.Dense(90))) 49 | model.add(layers.Dropout(0.1)) 50 | model.add(layers.GlobalAveragePooling1D()) 51 | model.add(layers.Dense(512, activation='relu')) 52 | model.add(layers.Dropout(0.3)) 53 | model.add(layers.Dense(num_classes, activation="sigmoid")) 54 | adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08) 55 | model.load_weights(wgts) 56 | rms = optimizers.RMSprop() 57 | model.compile(loss='binary_crossentropy', optimizer=adam, metrics=["accuracy"]) 58 | return model 59 | 60 | model1 = souhaiel_model(tf) 61 | print(model1.summary()) 62 | 63 | from tensorflow.keras.utils import plot_model 64 | model1= souhaiel_model(tf) 65 | 66 | np.random.seed(1234) 67 | model1 = souhaiel_model(tf) 68 | 69 | graph = tf.compat.v1.get_default_graph 70 | graph 71 | 72 | 73 | def video_reader(cv2,filename): 74 | frames = np.zeros((30, 160, 160, 3), dtype=np.float) 75 | i=0 76 | print(frames.shape) 77 | vc = cv2.VideoCapture(filename) 78 | if vc.isOpened(): 79 | rval , frame = vc.read() 80 | else: 81 | rval = False 82 | frm = resize(frame,(160,160,3)) 83 | frm = np.expand_dims(frm,axis=0) 84 | if(np.max(frm)>1): 85 | frm = frm/255.0 86 | frames[i][:] = frm 87 | i +=1 88 | print("reading video") 89 | while i < 30: 90 | rval, frame = vc.read() 91 | frm = resize(frame,(160,160,3)) 92 | frm = np.expand_dims(frm,axis=0) 93 | 
if(np.max(frm)>1): 94 | frm = frm/255.0 95 | frames[i][:] = frm 96 | i +=1 97 | return frames 98 | 99 | def pred_fight(model,video,acuracy=0.9): 100 | pred_test = model.predict(video) 101 | if pred_test[0][1] >=acuracy: 102 | return True , pred_test[0][1] 103 | else: 104 | return False , pred_test[0][1] 105 | 106 | def main_fight(vidoss): 107 | vid = video_reader(cv2,vidoss) 108 | datav = np.zeros((1, 30, 160, 160, 3), dtype=np.float) 109 | datav[0][:][:] = vid 110 | millis = int(round(time.time() * 1000)) 111 | print(millis) 112 | f , precent = pred_fight(model1,datav,acuracy=0.65) 113 | millis2 = int(round(time.time() * 1000)) 114 | print(millis2) 115 | res_fight = {'violence':f ,'violenceestimation':str(precent)} 116 | res_fight['processing_time'] = str(millis2-millis) 117 | return res_fight 118 | 119 | 120 | from collections import deque 121 | import argparse 122 | from skimage.transform import resize 123 | from easydict import EasyDict 124 | args=EasyDict() 125 | args.size=128 126 | print(args) 127 | 128 | 129 | # load the trained model and label binarizer from disk 130 | print("[INFO] loading model and label binarizer...") 131 | model = model1 132 | # initialize the image mean for mean subtraction along with the 133 | # predictions queue 134 | mean = np.array([123.68, 116.779, 103.939][::1], dtype="float32") 135 | Q = deque(maxlen=args['size']) 136 | 137 | 138 | # initialize the video stream, pointer to output video file, andframe dimensions 139 | 140 | ####### vs=cv2.VideoCapture(args["input"]) 141 | vs=cv2.VideoCapture('C:\lab\pythonProject\KakaoTalk_20210426_104932603.mp4') 142 | #vc=cv2.VideoCapture('Fighting013_x264.mp4') 143 | fps = vs.get(cv2.CAP_PROP_FPS) 144 | writer = None 145 | (W, H) = (None, None) 146 | #client = Client("ACea4cecca40ebb1bf4594098d5cef4541", "32789639585561088d5937514694e115") #update from twilio 147 | prelabel = '' 148 | ok = 'Normal' 149 | okk='violence' 150 | i=0 151 | frames = np.zeros((30, 160, 160, 3), dtype=np.float) 152 | datav = np.zeros((1, 30, 160, 160, 3), dtype=np.float) 153 | frame_counter=0 154 | 155 | while True: 156 | # read the next frame from the file 157 | (grabbed, frm) = vs.read() 158 | # if the frame was not grabbed, then we have reached the end of the stream 159 | if not grabbed: 160 | break 161 | # if the frame dimensions are empty, grab them 162 | if W is None or H is None: 163 | (H, W) = frm.shape[:2] 164 | #framecount = framecount+1 165 | # clone the output frame, then convert it from BGR to RGB ordering, resize the frame to a fixed 224x224, 166 | # and then perform mean subtraction 167 | output=frm.copy() 168 | while i < 30: 169 | rval, frame = vs.read() 170 | frame_counter +=1 171 | if frame_counter == vs.get(cv2.CAP_PROP_FRAME_COUNT): 172 | frame_counter = 0 #Or whatever as long as it is the same as next line 173 | vs.set(cv2.CAP_PROP_POS_FRAMES, 0) 174 | frame = resize(frame,(160,160,3)) 175 | frame = np.expand_dims(frame,axis=0) 176 | if(np.max(frame)>1): 177 | frame = frame/255.0 178 | frames[i][:] = frame 179 | i +=1 180 | 181 | datav[0][:][:] =frames 182 | frames -= mean 183 | 184 | # make predictions on the frame and then update the predictions 185 | # queue 186 | preds = model1.predict(datav) 187 | # print('Preds = :', preds) 188 | 189 | # total = (preds[0]+ preds[1]+preds[2] + preds[3]+ preds[4]+preds[5]) 190 | # maximum = max(preds) 191 | # rest = total - maximum 192 | 193 | # diff = (.8*maximum) - (.1*rest) 194 | # print('Difference of prob ', diff) 195 | # th = 100 196 | # if diff > .60: 197 | # th = diff 198 | # 
print('Old threshold = ', th) 199 | prediction = preds.argmax(axis=0) 200 | Q.append(preds) 201 | 202 | # perform prediction averaging over the current history of 203 | # previous predictions 204 | results = np.array(Q).mean(axis=0) 205 | print('Results = ', results) 206 | maxprob = np.max(results) 207 | print('Maximun Probability = ', maxprob) 208 | i = np.argmax(results) 209 | rest = 1 - maxprob 210 | 211 | diff = (maxprob) - (rest) 212 | print('Difference of prob ', diff) 213 | th = 100 214 | if diff > .80: 215 | th = diff 216 | 217 | if (preds[0][1]) < th: 218 | text = "Alert : {} - {:.2f}%".format((ok), 100 - (maxprob * 100)) 219 | cv2.putText(output, text, (35, 50), cv2.FONT_HERSHEY_TRIPLEX , 1.25, (0, 255, 0), 5) 220 | else: 221 | text = "Alert : {} - {:.2f}%".format((okk), maxprob * 100) 222 | cv2.putText(output, text, (35, 50), cv2.FONT_HERSHEY_TRIPLEX , 1.25, (0, 0, 255), 5) 223 | # if label != prelabel: 224 | # client.messages.create(to="<+country code>< receiver mobile number>", #for example +918255555555 225 | # from_="+180840084XX", #sender number can be coped from twilo 226 | # body='\n'+ str(text) +'\n Satellite: ' + str(camid) + '\n Orbit: ' + location) 227 | 228 | # check if the video writer is None 229 | if writer is None: 230 | # initialize our video writer 231 | fourcc=cv2.VideoWriter_fourcc(*"FMP4") 232 | writer=cv2.VideoWriter('C:\lab\pythonProject\KakaoTalk_20210426_104932603_output.mp4', 233 | fourcc, 27.0, (W, H), True) 234 | 235 | # write the output frame to disk 236 | writer.write(output) 237 | 238 | # show the output image 239 | cv2.imshow("This is Output", output) 240 | cv2.waitKey(0) 241 | 242 | #print('Frame count', framecount) 243 | # release the file pointers 244 | print("[INFO] cleaning up...") 245 | 246 | # 작업 완료 후 해제 247 | writer.release() 248 | vs.release() 249 | #vc.release() 250 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /raspberry/client.py: -------------------------------------------------------------------------------- 1 | # run this program on each RPi to send a labelled image stream 2 | import socket 3 | import time 4 | from imutils.video import VideoStream 5 | import imagezmq 6 | 7 | sender = imagezmq.ImageSender(connect_to='tcp://jeff-macbook:5555') 8 | 9 | picam = VideoStream(usePiCamera=True).start() 10 | time.sleep(2.0) # allow camera sensor to warm up 11 | while True: # send images as stream until Ctrl-C 12 | image = picam.read() 13 | sender.send_image(rpi_name, image) 14 | -------------------------------------------------------------------------------- /raspberry/note_video.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | 3 | cap = cv2.VideoCapture(0) 4 | fourcc = cv2.VideoWriter_fourcc(*'XVID') 5 | out = cv2.VideoWriter('C:/Users/leesookwang/Desktop/project/sample_video0504.avi',fourcc, 20.0, (160,160)) 6 | while(cap.isOpened()): 7 | ret, frame = cap.read() 8 | out.write(frame) 9 | cv2.imshow('frame',frame) 10 | if cv2.waitKey(1) & 0xFF == ord('q'): 11 | break 12 | cap.release() 13 | out.release() 14 | cv2.destroyAllWindows() 15 | 16 | 17 | while(cap.isOpened()): 18 | ret, frame = cap.read() 19 | if ret==True: 20 | frame = cv2.flip(frame,0) 21 | 22 | # write the flipped frame 23 | out.write(frame) 24 | 25 | cv2.imshow('frame',frame) 26 | if cv2.waitKey(1) & 0xFF == ord('q'): 27 | break 28 | else: 29 | break 30 | 31 | # Release everything if job is finished 32 | cap.release() 33 | out.release() 34 | 
cv2.destroyAllWindows() -------------------------------------------------------------------------------- /raspberry/output_raspberry.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/969de595a953aaa103334ece819600e2530d91e2/raspberry/output_raspberry.mp4 -------------------------------------------------------------------------------- /raspberry/output_raspberry_test.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/969de595a953aaa103334ece819600e2530d91e2/raspberry/output_raspberry_test.mp4 -------------------------------------------------------------------------------- /raspberry/output_webcam.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/969de595a953aaa103334ece819600e2530d91e2/raspberry/output_webcam.mp4 -------------------------------------------------------------------------------- /raspberry/output_webcam_test.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/projectRVD/Real-Time-Violence-Detection-with-raspberry-pi/969de595a953aaa103334ece819600e2530d91e2/raspberry/output_webcam_test.mp4 -------------------------------------------------------------------------------- /raspberry/server.py: -------------------------------------------------------------------------------- 1 | # Import 2 | from __future__ import absolute_import 3 | from __future__ import division 4 | from __future__ import print_function 5 | import tensorflow as tf 6 | import numpy as np 7 | from skimage.io import imread 8 | from skimage.transform import resize 9 | import cv2 10 | import numpy as np 11 | import imagezmq 12 | import os 13 | from PIL import Image 14 | from io import BytesIO 15 | import time 16 | from tensorflow.keras.models import Sequential, Model, load_model 17 | from tensorflow.keras.layers import Dropout, Dense, Flatten, Input 18 | from tensorflow.keras.optimizers import Adam 19 | from tensorflow.keras.metrics import categorical_crossentropy 20 | from tensorflow.keras.preprocessing.image import ImageDataGenerator 21 | 22 | # CNN + LSTM 모델 생성 함수(souhaiel_model) 선언 23 | def souhaiel_model(tf,wgts='fightw.hdfs'): 24 | layers = tf.keras.layers 25 | models = tf.keras.models 26 | losses = tf.keras.losses 27 | optimizers = tf.keras.optimizers 28 | metrics = tf.keras.metrics 29 | num_classes = 2 30 | cnn = models.Sequential() 31 | #cnn.add(base_model) 32 | input_shapes=(160,160,3) 33 | np.random.seed(1234) 34 | vg19 = tf.keras.applications.vgg19.VGG19 35 | base_model = vg19(include_top=False, weights='imagenet', input_shape=(160, 160, 3)) 36 | # Freeze the layers except the last 4 layers (we will only use the base model to extract features) 37 | cnn = models.Sequential() 38 | cnn.add(base_model) 39 | cnn.add(layers.Flatten()) 40 | model = models.Sequential() 41 | model.add(layers.TimeDistributed(cnn, input_shape=(30, 160, 160, 3))) 42 | model.add(layers.LSTM(30, return_sequences= True)) 43 | model.add(layers.TimeDistributed(layers.Dense(90))) 44 | model.add(layers.Dropout(0.1)) 45 | model.add(layers.GlobalAveragePooling1D()) 46 | model.add(layers.Dense(512, activation='relu')) 47 | model.add(layers.Dropout(0.3)) 48 | model.add(layers.Dense(num_classes, 
activation="sigmoid")) 49 | adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08) 50 | model.load_weights(wgts) 51 | rms = optimizers.RMSprop() 52 | model.compile(loss='binary_crossentropy', optimizer=adam, metrics=["accuracy"]) 53 | return model 54 | 55 | # CNN + LSTM 모델 생성(model1) 56 | model1 = souhaiel_model(tf) 57 | print(model1.summary()) 58 | 59 | 60 | writer = None 61 | (W, H) = (None, None) 62 | i = 0 # 비디오 초 번호. While loop를 돌아가는 회차 63 | Q = deque(maxlen=128) 64 | 65 | video_frm_ar = np.zeros((1, int(fps), 160, 160, 3), dtype=np.float) # frames 66 | frame_counter = 0 # 1초당 프레임 번호. 1~30 67 | frame_list = [] 68 | preds = None 69 | maxprob = None 70 | 71 | # client(라즈베이 파이)에서 영상 데이터 수신 72 | image_hub = imagezmq.ImageHub() 73 | while True: # show streamed images until Ctrl-C 74 | frame_counter += 1 75 | rpi_name, frm = image_hub.recv_image() 76 | cv2.imshow(rpi_name, frm) # 1 window for each RPi 77 | cv2.waitKey(1) 78 | image_hub.send_reply(b'OK') 79 | 80 | if W is None or H is None: # 프레임 이미지 폭(W), 높이(H)를 동영상에서 81 | (H, W) = frm.shape[:2] 82 | 83 | output = frm.copy() # 비디오 프레임을 그대로 복사. 저장/출력할 .mp4 파일로 84 | 85 | frame = resize(frm, (160, 160, 3)) # > input 배열을 (160, 160, 3)으로 변환 86 | frame_list.append(frame) # 각 프레임 배열 (160, 160, 3)이 append 된다. 87 | 88 | if frame_counter == 30: # 프레임 카운터가 30이 된 순간. len(frame_list)==30이 된 순간. 89 | # . ----- 1초=30프레임마다 묶어서 예측(.predict) ----- 90 | # . ----- 1초 동안 (1, 30, 160, 160, 3) 배열을 만들어 모델에 투입 --- 91 | # . ----- 예측 결과(1초)를 output에 씌워 준다. ----- 92 | # 그러기 위해서는 30개씩 append한 리스트를 넘파이화 -> 예측 -> 리스트 초기화 과정이 필요 93 | frame_ar = np.array(frame_list, dtype=np.float16) # > 30개의 원소가 든 리스트를 변환. (30, 160, 160, 3) 94 | frame_list = [] # 30프레임이 채워질 때마다 넘파이 배열로 변환을 마친 프레임 리스트는 초기화 해 준다. 95 | 96 | if (np.max(frame_ar) > 1): # 넘파이 배열의 RGB 값을 스케일링 97 | frame_ar = frame_ar / 255.0 98 | 99 | # video_frm_ar[i][:]=frame_ar #> (i, fps, 160, 160, 3). i번째 1초짜리 영상(30프레임) 배열 파일이 된다. 100 | # print(video_frm_ar.shape) 101 | 102 | # VGG19로 초당 프레임 이미지 배열로부터 특성 추출 : (1*30, 5, 5, 512) 103 | pred_imgarr = base_model.predict(frame_ar) # > (30, 5, 5, 512) 104 | # 추출된 특성 배열들을 1차원으로 변환 : (1, 30, 5*5*512) 105 | pred_imgarr_dim = pred_imgarr.reshape(1, pred_imgarr.shape[0], 5 * 5 * 512) # > (1, 30, 12800) 106 | # 각 프레임 폭력 여부 예측값을 0에 저장 107 | preds = model.predict(pred_imgarr_dim) # > (True, 0.99) : (폭력여부, 폭력확률) 108 | print(f'preds:{preds}') 109 | Q.append(preds) # > Deque Q에 리스트처럼 예측값을 추가함 110 | 111 | # 지난 5초간의 폭력 확률 평균을 result로 한다. 112 | if i < 5: 113 | results = np.array(Q)[:i].mean(axis=0) 114 | else: 115 | results = np.array(Q)[(i - 5):i].mean(axis=0) 116 | # results=np.array(Q).mean(axis=0) 117 | print(f'Results = {results}') # > ex : (0.6, 0.650) 118 | 119 | # 예측 결과에서 최대폭력확률값 120 | maxprob = np.max(results) # > 가장 높은 값을 선택함 121 | print(f'Maximum Probability : {maxprob}') 122 | print('') 123 | 124 | rest = 1 - maxprob # 폭력이 아닐 확률 125 | diff = maxprob - rest # 폭력일 확률과 폭력이 아닐 확률의 차이 126 | th = 100 127 | 128 | if diff > 0.80: # 폭력일 확률과 아닐 확률의 차이가 0.8 이상이면 129 | th = diff # 근거가? 130 | 131 | frame_counter = 0 # > 1초(30프레임)가 경과했으므로 frame_counter=0으로 리셋 132 | i += 1 # > 1초 경과 의미 133 | 134 | # frame_counter==30이 되면 0으로 돌아가 위 루프를 반복해 준다. 
135 | 136 | # ----- output에 씌울 자막 설정하기 ----- 137 | # 30프레임(1초)마다 갱신된 값이 output에 씌워지게 된다 138 | scale = 1 139 | fontScale = min(W, H) / (300 / scale) 140 | 141 | if preds is not None and maxprob is not None: # 예측값이 발생한 후부터 142 | if (preds[0][1]) < th: # > 폭력일 확률이 th보다 작으면 정상 143 | text1_1 = 'Normal' 144 | text1_2 = '{:.2f}%'.format(100 - (maxprob * 100)) 145 | cv2.putText(output, text1_1, (int(0.025 * W), int(0.1 * H)), 146 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 147 | cv2.putText(output, text1_2, (int(0.025 * W), int(0.2 * H)), 148 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 149 | # cv2.putText(이미지파일, 출력문자, 시작위치좌표(좌측하단), 폰트, 폰트크기, 폰트색상, 폰트두께) 150 | 151 | else: # > 폭력일 확률이 th보다 크면 폭력 취급 152 | text2_1 = 'Violence Alert!' 153 | text2_2 = '{:.2f}%'.format(maxprob * 100) 154 | cv2.putText(output, text2_1, (int(0.025 * W), int(0.1 * H)), 155 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 156 | cv2.putText(output, text2_2, (int(0.025 * W), int(0.2 * H)), 157 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 158 | 159 | # 자막이 씌워진 동영상을 writer로 저장함 160 | if writer is None: 161 | writer = cv2.VideoWriter(output_path, -1, 30, (W, H), True) 162 | 163 | # 아웃풋을 새창으로 열어 보여주기 164 | cv2.imshow('This is output', output) 165 | writer.write(output) # output_path로 output 객체를 저장함 166 | 167 | key = cv2.waitKey(round(1000 / fps)) # 프레임-다음 프레임 사이 간격 168 | if key == 27: # esc 키를 누르면 루프로부터 벗어나고 output 파일이 저장됨 169 | print('ESC키를 눌렀습니다. 녹화를 종료합니다.') 170 | break 171 | 172 | print('종료 처리되었습니다. 메모리를 해제합니다.') 173 | writer.release() 174 | vc.release() 175 | 176 | 177 | 178 | # 비디오 스트리밍 179 | vs = cv2.VideoCapture() 180 | vs 181 | 182 | 183 | # 이미지 프레임 30개씩 묶음 저장 후 딥러닝 모델에 input(shape = (1,30,160,160,3)) 후 예측값 출력 184 | i = 1 185 | images = [] 186 | while i <= 30 : 187 | images.append(image) 188 | i += 1 189 | 190 | images = np.array(images).reshape((1, 30, 160, 160, 3)) 191 | predict_result = model1.predict(images) 192 | print(predict_result) 193 | 194 | 195 | -------------------------------------------------------------------------------- /raspberry/server2.py: -------------------------------------------------------------------------------- 1 | # Import 2 | import cv2 # openCV 4.5.1 3 | import numpy as np # numpy 배열 4 | from tensorflow import keras # 케라스 5 | import imagezmq 6 | from skimage.transform import resize # 이미지 리사이즈 7 | from collections import deque #비디오 영상에 텍스트 씌워 저장하기에 사용 8 | 9 | 10 | # 모델 불러오기 11 | 12 | #이미지를 투입할 베이스 모델 13 | base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3), 14 | include_top=False, 15 | weights='imagenet', classes=2) 16 | 17 | # LSTM 모델 불러오기 18 | model=keras.models.load_model('210508_vc_MobileNet_model_epoch100.h5') 19 | 20 | # 산출 파일 지정 21 | output_path='output04.mp4' #저장할 결과 동영상 파일 이름 22 | 23 | # 초기 설정 24 | writer = None 25 | (W, H) = (None, None) 26 | i = 0 # 비디오 초 번호. While loop를 돌아가는 회차 27 | Q = deque(maxlen=128) 28 | 29 | video_frm_ar = np.zeros((1, 30, 160, 160, 3), dtype=np.float) # frames 30 | frame_counter = 0 # 1초당 프레임 번호. 1~30 31 | frame_list = [] 32 | preds = None 33 | maxprob = None 34 | 35 | # client(라즈베이 파이)에서 영상 데이터 수신 36 | image_hub = imagezmq.ImageHub() 37 | 38 | while True: # show streamed images until Ctrl-C 39 | frame_counter += 1 40 | rpi_name, frm = image_hub.recv_image() 41 | cv2.waitKey(1) 42 | image_hub.send_reply(b'OK') 43 | 44 | if W is None or H is None: # 프레임 이미지 폭(W), 높이(H)를 동영상에서 45 | (H, W) = frm.shape[:2] 46 | 47 | output = frm.copy() # 비디오 프레임을 그대로 복사. 
저장/출력할 .mp4 파일로 48 | 49 | frame = resize(frm, (160, 160, 3)) # > input 배열을 (160, 160, 3)으로 변환 50 | frame_list.append(frame) # 각 프레임 배열 (160, 160, 3)이 append 된다. 51 | 52 | if frame_counter == 30: # 프레임 카운터가 30이 된 순간. len(frame_list)==30이 된 순간. 53 | # . ----- 1초=30프레임마다 묶어서 예측(.predict) ----- 54 | # . ----- 1초 동안 (1, 30, 160, 160, 3) 배열을 만들어 모델에 투입 --- 55 | # . ----- 예측 결과(1초)를 output에 씌워 준다. ----- 56 | # 그러기 위해서는 30개씩 append한 리스트를 넘파이화 -> 예측 -> 리스트 초기화 과정이 필요 57 | frame_ar = np.array(frame_list, dtype=np.float16) # > 30개의 원소가 든 리스트를 변환. (30, 160, 160, 3) 58 | frame_list = [] # 30프레임이 채워질 때마다 넘파이 배열로 변환을 마친 프레임 리스트는 초기화 해 준다. 59 | 60 | if (np.max(frame_ar) > 1): # 넘파이 배열의 RGB 값을 스케일링 61 | frame_ar = frame_ar / 255.0 62 | 63 | # video_frm_ar[i][:]=frame_ar #> (i, fps, 160, 160, 3). i번째 1초짜리 영상(30프레임) 배열 파일이 된다. 64 | # print(video_frm_ar.shape) 65 | 66 | # VGG19로 초당 프레임 이미지 배열로부터 특성 추출 : (1*30, 5, 5, 512) 67 | pred_imgarr = base_model.predict(frame_ar) # > (30, 5, 5, 512) 68 | # 추출된 특성 배열들을 1차원으로 변환 : (1, 30, 5*5*512) 69 | pred_imgarr_dim = pred_imgarr.reshape(1, pred_imgarr.shape[0], 5 * 5 * 1024) # > (1, 30, 12800) 70 | # 각 프레임 폭력 여부 예측값을 0에 저장 71 | preds = model.predict(pred_imgarr_dim) # > (True, 0.99) : (폭력여부, 폭력확률) 72 | print(f'preds:{preds}') 73 | Q.append(preds) # > Deque Q에 리스트처럼 예측값을 추가함 74 | 75 | # 지난 5초간의 폭력 확률 평균을 result로 한다. 76 | if i < 5: 77 | results = np.array(Q)[:i].mean(axis=0) 78 | else: 79 | results = np.array(Q)[(i - 5):i].mean(axis=0) 80 | # results=np.array(Q).mean(axis=0) 81 | print(f'Results = {results}') # > ex : (0.6, 0.650) 82 | 83 | # 예측 결과에서 최대폭력확률값 84 | maxprob = np.max(results) # > 가장 높은 값을 선택함 85 | print(f'Maximum Probability : {maxprob}') 86 | print('') 87 | 88 | rest = 1 - maxprob # 폭력이 아닐 확률 89 | diff = maxprob - rest # 폭력일 확률과 폭력이 아닐 확률의 차이 90 | th = 100 91 | 92 | if diff > 0.80: # 폭력일 확률과 아닐 확률의 차이가 0.8 이상이면 93 | th = diff # 근거가? 94 | 95 | frame_counter = 0 # > 1초(30프레임)가 경과했으므로 frame_counter=0으로 리셋 96 | i += 1 # > 1초 경과 의미 97 | 98 | # frame_counter==30이 되면 0으로 돌아가 위 루프를 반복해 준다. 99 | 100 | # ----- output에 씌울 자막 설정하기 ----- 101 | # 30프레임(1초)마다 갱신된 값이 output에 씌워지게 된다 102 | scale = 1 103 | fontScale = min(W, H) / (300 / scale) 104 | 105 | if preds is not None and maxprob is not None: # 예측값이 발생한 후부터 106 | if (preds[0][1]) < th: # > 폭력일 확률이 th보다 작으면 정상 107 | text1_1 = 'Normal' 108 | text1_2 = '{:.2f}%'.format(100 - (maxprob * 100)) 109 | cv2.putText(output, text1_1, (int(0.025 * W), int(0.1 * H)), 110 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 111 | cv2.putText(output, text1_2, (int(0.025 * W), int(0.2 * H)), 112 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 113 | # cv2.putText(이미지파일, 출력문자, 시작위치좌표(좌측하단), 폰트, 폰트크기, 폰트색상, 폰트두께) 114 | 115 | else: # > 폭력일 확률이 th보다 크면 폭력 취급 116 | text2_1 = 'Violence Alert!' 117 | text2_2 = '{:.2f}%'.format(maxprob * 100) 118 | cv2.putText(output, text2_1, (int(0.025 * W), int(0.1 * H)), 119 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 120 | cv2.putText(output, text2_2, (int(0.025 * W), int(0.2 * H)), 121 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 122 | 123 | # 자막이 씌워진 동영상을 writer로 저장함 124 | if writer is None: 125 | writer = cv2.VideoWriter(output_path, -1, 30, (W, H), True) 126 | 127 | # 아웃풋을 새창으로 열어 보여주기 128 | cv2.imshow('This is output', output) 129 | writer.write(output) # output_path로 output 객체를 저장함 130 | 131 | key = cv2.waitKey(round(1000 / 30)) # 프레임-다음 프레임 사이 간격 132 | if key == 27: # esc 키를 누르면 루프로부터 벗어나고 output 파일이 저장됨 133 | print('ESC키를 눌렀습니다. 
녹화를 종료합니다.') 134 | break 135 | 136 | print('종료 처리되었습니다. 메모리를 해제합니다.') 137 | writer.release() 138 | -------------------------------------------------------------------------------- /raspberry/server2_font.py: -------------------------------------------------------------------------------- 1 | # Import 2 | import cv2 # openCV 4.5.1 3 | import numpy as np # numpy 배열 4 | from tensorflow import keras # 케라스 5 | import imagezmq 6 | from skimage.transform import resize # 이미지 리사이즈 7 | from collections import deque #비디오 영상에 텍스트 씌워 저장하기에 사용 8 | from PIL import Image, ImageFont, ImageDraw #자막 투입 목적 9 | 10 | 11 | # 모델 불러오기 12 | 13 | #이미지를 투입할 베이스 모델 14 | base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3), 15 | include_top=False, 16 | weights='imagenet', classes=2) 17 | 18 | # LSTM 모델 불러오기 19 | model=keras.models.load_model('210512_MobileNet_model_epoch100.h5') 20 | 21 | # 산출 파일 지정 22 | output_path='output04.mp4' #저장할 결과 동영상 파일 이름 23 | 24 | # 초기 설정 25 | writer = None 26 | (W, H) = (None, None) 27 | i = 0 # 비디오 초 번호. While loop를 돌아가는 회차 28 | Q = deque(maxlen=128) 29 | 30 | video_frm_ar = np.zeros((1, 30, 160, 160, 3), dtype=np.float) # frames 31 | frame_counter = 0 # 1초당 프레임 번호. 1~30 32 | frame_list = [] 33 | preds = None 34 | maxprob = None 35 | 36 | # client(라즈베이 파이)에서 영상 데이터 수신 37 | image_hub = imagezmq.ImageHub() 38 | 39 | while True: # show streamed images until Ctrl-C 40 | frame_counter += 1 41 | rpi_name, frm = image_hub.recv_image() 42 | cv2.waitKey(1) 43 | image_hub.send_reply(b'OK') 44 | 45 | if W is None or H is None: # 프레임 이미지 폭(W), 높이(H)를 동영상에서 46 | (H, W) = frm.shape[:2] 47 | 48 | output = frm.copy() # 비디오 프레임을 그대로 복사. 저장/출력할 .mp4 파일로 49 | 50 | frame = resize(frm, (160, 160, 3)) # > input 배열을 (160, 160, 3)으로 변환 51 | frame_list.append(frame) # 각 프레임 배열 (160, 160, 3)이 append 된다. 52 | 53 | if frame_counter >= 30: # 프레임 카운터가 30이 된 순간. len(frame_list)==30이 된 순간. 54 | # . ----- 1초=30프레임마다 묶어서 예측(.predict) ----- 55 | # . ----- 1초 동안 (1, 30, 160, 160, 3) 배열을 만들어 모델에 투입 --- 56 | # . ----- 예측 결과(1초)를 output에 씌워 준다. ----- 57 | # 그러기 위해서는 30개씩 append한 리스트를 넘파이화 -> 예측 -> 리스트 초기화 과정이 필요 58 | frame_ar = np.array(frame_list, dtype=np.float16) # > 30개의 원소가 든 리스트를 변환. (30, 160, 160, 3) 59 | frame_list = [] # 30프레임이 채워질 때마다 넘파이 배열로 변환을 마친 프레임 리스트는 초기화 해 준다. 60 | 61 | if (np.max(frame_ar) > 1): # 넘파이 배열의 RGB 값을 스케일링 62 | frame_ar = frame_ar / 255.0 63 | 64 | # video_frm_ar[i][:]=frame_ar #> (i, fps, 160, 160, 3). i번째 1초짜리 영상(30프레임) 배열 파일이 된다. 65 | # print(video_frm_ar.shape) 66 | 67 | # MobileNet로 초당 프레임 이미지 배열로부터 특성 추출 : (1*30, 5, 5, 1024) 68 | pred_imgarr = base_model.predict(frame_ar) # > (30, 5, 5, 1024) 69 | # 추출된 특성 배열들을 1차원으로 변환 : (1, 30, 5*5*1024) 70 | pred_imgarr_dim = pred_imgarr.reshape(1, pred_imgarr.shape[0], 5 * 5 * 1024) # > (1, 30, 25600) 71 | # 각 프레임 폭력 여부 예측값을 0에 저장 72 | preds = model.predict(pred_imgarr_dim) # > (True, 0.99) : (폭력여부, 폭력확률) 73 | print(f'preds:{preds}') 74 | Q.append(preds) # > Deque Q에 리스트처럼 예측값을 추가함 75 | 76 | # 지난 5초간의 폭력 확률 평균을 result로 한다. 
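    # Q holds one (1, 2) prediction per processed second; the slice below averages the
    # most recent five entries to smooth the on-screen caption. On the very first pass
    # i is still 0, so np.array(Q)[:0] is empty and the averaged result for that second
    # comes out as NaN.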
77 | if i < 5: 78 | results = np.array(Q)[:i].mean(axis=0) 79 | else: 80 | results = np.array(Q)[(i - 5):i].mean(axis=0) 81 | # results=np.array(Q).mean(axis=0) 82 | print(f'Results = {results}') # > ex : (0.6, 0.650) 83 | 84 | # 예측 결과에서 최대폭력확률값 85 | maxprob = np.max(results) # > 가장 높은 값을 선택함 86 | print(f'Maximum Probability : {maxprob}') 87 | print('') 88 | 89 | rest = 1 - maxprob # 폭력이 아닐 확률 90 | diff = maxprob - rest # 폭력일 확률과 폭력이 아닐 확률의 차이 91 | th = 100 92 | 93 | if diff > 0.80: # 폭력일 확률과 아닐 확률의 차이가 0.8 이상이면 94 | th = diff # 근거가? 95 | 96 | frame_counter = 0 # > 1초(30프레임)가 경과했으므로 frame_counter=0으로 리셋 97 | i += 1 # > 1초 경과 의미 98 | 99 | # frame_counter==30이 되면 0으로 돌아가 위 루프를 반복해 준다. 100 | 101 | # ----- output에 씌울 자막 설정하기 ----- 102 | # 30프레임(1초)마다 갱신된 값이 output에 씌워지게 된다 103 | font1=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', int(0.05*W)) 104 | font2=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', int(0.1*W)) 105 | 106 | if preds is not None and maxprob is not None: # 예측값이 발생한 후부터 107 | if (preds[0][1]) < th: # > 폭력일 확률이 th보다 작으면 정상 108 | text1_1 = 'Normal' 109 | text1_2 = '{:.2f}%'.format(100 - (maxprob * 100)) 110 | img_pil=Image.fromarray(output) 111 | draw=ImageDraw.Draw(img_pil) 112 | draw.text((int(0.025*W), int(0.025*H)), text1_1, font=font1, fill=(0,255,0,0)) 113 | draw.text((int(0.025*W), int(0.095*H)), text1_2, font=font2, fill=(0,255,0,0)) 114 | output=np.array(img_pil) 115 | 116 | else: # > 폭력일 확률이 th보다 크면 폭력 취급 117 | text2_1 = 'Violence Alert!' 118 | text2_2 = '{:.2f}%'.format(maxprob * 100) 119 | img_pil=Image.fromarray(output) 120 | draw=ImageDraw.Draw(img_pil) 121 | draw.text((int(0.025*W), int(0.025*H)), text2_1, font=font1, fill=(0,0,255,0)) 122 | draw.text((int(0.025*W), int(0.095*H)), text2_2, font=font2, fill=(0,0,255,0)) 123 | output=np.array(img_pil) 124 | 125 | # 자막이 씌워진 동영상을 writer로 저장함 126 | if writer is None: 127 | writer = cv2.VideoWriter(output_path, -1, 30, (W, H), True) 128 | 129 | # 아웃풋을 새창으로 열어 보여주기 130 | cv2.imshow('This is output', output) 131 | writer.write(output) # output_path로 output 객체를 저장함 132 | 133 | key = cv2.waitKey(round(1000 / 30)) # 프레임-다음 프레임 사이 간격 134 | if key == 27: # esc 키를 누르면 루프로부터 벗어나고 output 파일이 저장됨 135 | print('ESC키를 눌렀습니다. 녹화를 종료합니다.') 136 | break 137 | 138 | print('종료 처리되었습니다. 메모리를 해제합니다.') 139 | writer.release() 140 | -------------------------------------------------------------------------------- /raspberry/server3.py: -------------------------------------------------------------------------------- 1 | import cv2 # openCV 4.5.1 2 | import numpy as np # numpy 배열 3 | import os # 파일 및 폴더의 경로 지정을 위한 모듈 4 | import tensorflow as tf # 텐서플로우 5 | from tensorflow import keras # 케라스 6 | import time #프로세스 소요시간 표시 목적 7 | 8 | from skimage.io import imread #이미지 보이기 9 | from skimage.transform import resize # 이미지 리사이즈 10 | 11 | from PIL import Image, ImageFont, ImageDraw #자막 투입 목적 12 | from io import BytesIO 13 | 14 | from collections import deque #비디오 영상에 텍스트 씌워 저장하기에 사용 15 | 16 | base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3), 17 | include_top=False, 18 | weights='imagenet', classes=2) 19 | 20 | model=keras.models.load_model('210508_vc_MobileNet_model_epoch100.h5') 21 | 22 | input_path=0 # 노트북 웹캠을 비디오 인풋으로 설정함 23 | output_path='output_webcam_test.mp4' #저장할 결과 동영상 파일 이름 24 | 25 | # . ----- 비디오 스트리밍을 불러오기 & 초기설정 ----- 26 | vc = cv2.VideoCapture(input_path) 27 | fps = vc.get(cv2.CAP_PROP_FPS) # input_path의 초당 프레임 수 인식. 
fps=30.0 28 | print(f'fps : {fps}') 29 | 30 | writer = None 31 | (W, H) = (None, None) 32 | i = 0 # 비디오 초 번호. While loop를 돌아가는 회차 33 | Q = deque(maxlen=128) 34 | 35 | video_frm_ar = np.zeros((1, int(fps), 160, 160, 3), dtype=np.float) # frames 36 | frame_counter = 0 # 1초당 프레임 번호. 1~30 37 | frame_list = [] 38 | preds = None 39 | maxprob = None 40 | 41 | # . While loop : 스트리밍이 끝날 때까지 프레임 추출 반복문 시작 42 | # ----- 스트리밍 동영상들을 (30, 160, 160, 3)으로 저장하기 시작 ----- 43 | while True: 44 | frame_counter += 1 45 | grabbed, frm = vc.read() # 비디오를 1개 프레임씩 읽는다. 46 | # > grabbed=True, frm=프레임별 넘파이 배열. (240, 320, 3) 47 | 48 | if not grabbed: # 프레임이 안 잡힐 경우 49 | print('프레임이 없습니다. 스트리밍을 종료합니다.') 50 | break 51 | 52 | if fps != 30: # 비디오 fps가 30이 아니면 루프를 돌지 않기로 한다. 53 | print('비디오의 초당 프레임이 30이 아닙니다. fps=30으로 맞춰주세요.') 54 | break 55 | 56 | if W is None or H is None: # 프레임 이미지 폭(W), 높이(H)를 동영상에서 57 | (H, W) = frm.shape[:2] 58 | 59 | output = frm.copy() # 비디오 프레임을 그대로 복사. 저장/출력할 .mp4 파일로 60 | 61 | frame = resize(frm, (160, 160, 3)) # > input 배열을 (160, 160, 3)으로 변환 62 | frame_list.append(frame) # 각 프레임 배열 (160, 160, 3)이 append 된다. 63 | 64 | if frame_counter >= fps: # 프레임 카운터가 30이 된 순간. len(frame_list)==30이 된 순간. 65 | # . ----- 1초=30프레임마다 묶어서 예측(.predict) ----- 66 | # . ----- 1초 동안 (1, 30, 160, 160, 3) 배열을 만들어 모델에 투입 --- 67 | # . ----- 예측 결과(1초)를 output에 씌워 준다. ----- 68 | # 그러기 위해서는 30개씩 append한 리스트를 넘파이화 -> 예측 -> 리스트 초기화 과정이 필요 69 | frame_ar = np.array(frame_list, dtype=np.float16) # > 30개의 원소가 든 리스트를 변환. (30, 160, 160, 3) 70 | frame_list = [] # 30프레임이 채워질 때마다 넘파이 배열로 변환을 마친 프레임 리스트는 초기화 해 준다. 71 | 72 | if (np.max(frame_ar) > 1): # 넘파이 배열의 RGB 값을 스케일링 73 | frame_ar = frame_ar / 255.0 74 | 75 | # video_frm_ar[i][:]=frame_ar #> (i, fps, 160, 160, 3). i번째 1초짜리 영상(30프레임) 배열 파일이 된다. 76 | # print(video_frm_ar.shape) 77 | 78 | # MobileNet로 초당 프레임 이미지 배열로부터 특성 추출 : (1*30, 5, 5, 1024) 79 | pred_imgarr = base_model.predict(frame_ar) # > (30, 5, 5, 1024) 80 | # 추출된 특성 배열들을 1차원으로 변환 : (1, 30, 5*5*1024) 81 | pred_imgarr_dim = pred_imgarr.reshape(1, pred_imgarr.shape[0], 5 * 5 * 1024) # > (1, 30, 25600) 82 | # 각 프레임 폭력 여부 예측값을 0에 저장 83 | preds = model.predict(pred_imgarr_dim) # > (True, 0.99) : (폭력여부, 폭력확률) 84 | print(f'preds:{preds}') 85 | Q.append(preds) # > Deque Q에 리스트처럼 예측값을 추가함 86 | 87 | # 지난 5초간의 폭력 확률 평균을 result로 한다. 88 | if i < 5: 89 | results = np.array(Q)[:i].mean(axis=0) 90 | else: 91 | results = np.array(Q)[(i - 5):i].mean(axis=0) 92 | # results=np.array(Q).mean(axis=0) 93 | print(f'Results = {results}') # > ex : (0.6, 0.650) 94 | 95 | # 예측 결과에서 최대폭력확률값 96 | maxprob = np.max(results) # > 가장 높은 값을 선택함 97 | print(f'Maximum Probability : {maxprob}') 98 | print('') 99 | 100 | rest = 1 - maxprob # 폭력이 아닐 확률 101 | diff = maxprob - rest # 폭력일 확률과 폭력이 아닐 확률의 차이 102 | th = 100 103 | 104 | if diff > 0.80: # 폭력일 확률과 아닐 확률의 차이가 0.8 이상이면 105 | th = diff # 근거가? 106 | 107 | frame_counter = 0 # > 1초(30프레임)가 경과했으므로 frame_counter=0으로 리셋 108 | i += 1 # > 1초 경과 의미 109 | # frame_counter==30이 되면 0으로 돌아가 위 루프를 반복해 준다. 
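    # In this webcam-only variant the PIL caption block below is left commented out, so
    # the frames written to output_webcam_test.mp4 carry no 'Normal' / 'Violence Alert!'
    # overlay. Uncomment it (and place the Raleway .ttf noted in the Readme under a
    # fonts/ folder) to burn the captions into the output, as server2_font.py does.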
110 | 111 | # ----- output에 씌울 자막 설정하기 ----- 112 | # 30프레임(1초)마다 갱신된 값이 output에 씌워지게 된다 113 | # font1 = ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', 24) 114 | # font2 = ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', 48) 115 | # 116 | # if preds is not None and maxprob is not None: # 예측값이 발생한 후부터 117 | # if (preds[0][1]) < th: # > 폭력일 확률이 th보다 작으면 정상 118 | # text1_1 = 'Normal' 119 | # text1_2 = '{:.2f}%'.format(100 - (maxprob * 100)) 120 | # img_pil = Image.fromarray(output) 121 | # draw = ImageDraw.Draw(img_pil) 122 | # draw.text((int(0.025 * W), int(0.025 * H)), text1_1, font=font1, fill=(0, 255, 0, 0)) 123 | # draw.text((int(0.025 * W), int(0.095 * H)), text1_2, font=font2, fill=(0, 255, 0, 0)) 124 | # output = np.array(img_pil) 125 | # # cv2.putText(이미지파일, 출력문자, 시작위치좌표(좌측하단), 폰트, 폰트크기, 폰트색상, 폰트두께) 126 | # 127 | # else: # > 폭력일 확률이 th보다 크면 폭력 취급 128 | # text2_1 = 'Violence Alert!' 129 | # text2_2 = '{:.2f}%'.format(maxprob * 100) 130 | # img_pil = Image.fromarray(output) 131 | # draw = ImageDraw.Draw(img_pil) 132 | # draw.text((int(0.025 * W), int(0.025 * H)), text2_1, font=font1, fill=(0, 0, 255, 0)) 133 | # draw.text((int(0.025 * W), int(0.095 * H)), text2_2, font=font2, fill=(0, 0, 255, 0)) 134 | # output = np.array(img_pil) 135 | # 136 | # # 자막이 씌워진 동영상을 writer로 저장함 137 | if writer is None: 138 | fourcc = cv2.VideoWriter_fourcc(*'DIVX') 139 | writer = cv2.VideoWriter(output_path, fourcc, 30, (W, H), True) 140 | 141 | # 아웃풋을 새창으로 열어 보여주기 142 | cv2.imshow('This is output', output) 143 | writer.write(output) # output_path로 output 객체를 저장함 144 | 145 | key = cv2.waitKey(round(1000 / fps)) # 프레임-다음 프레임 사이 간격 146 | if key == 27: # esc 키를 누르면 루프로부터 벗어나고 output 파일이 저장됨 147 | print('ESC키를 눌렀습니다. 녹화를 종료합니다.') 148 | break 149 | 150 | print('종료 처리되었습니다. 메모리를 해제합니다.') 151 | writer.release() 152 | vc.release() 153 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /raspberry/streaming_pred.py: -------------------------------------------------------------------------------- 1 | # Imports 2 | 3 | import cv2 # openCV 4.5.1 4 | import numpy as np # numpy 배열 5 | import os # 파일 및 폴더의 경로 지정을 위한 모듈 6 | import tensorflow as tf # 텐서플로우 7 | from tensorflow import keras # 케라스 8 | import time # 프로세스 소요시간 표시 목적 9 | 10 | from skimage.io import imread # 이미지 보이기 11 | from skimage.transform import resize # 이미지 리사이즈 12 | 13 | from PIL import Image # 이미지 열기 14 | from io import BytesIO 15 | 16 | from collections import deque # 비디오 영상에 텍스트 씌워 저장하기에 사용 17 | 18 | # 모델 불러오기 19 | 20 | ## 이미지를 투입할 베이스 모델(VGG19) 21 | 22 | base_model = keras.applications.VGG19(include_top=False, input_shape=(160, 160, 3), weights='imagenet') 23 | 24 | ## 과거 훈련시킨 LSTM 모델(.h5)불러오기 25 | 26 | model = keras.models.load_model('210508_vc_MobileNet_model_epoch100.h5') 27 | 28 | # 비디오 영상에 텍스트 씌워 스트리밍 29 | 30 | ## 투입 스트리밍, 산출 파일 지정 31 | input_path = 0 # 노트북 웹캠을 비디오 인풋으로 설정함 32 | output_path = 'output04.mp4' # 저장할 결과 동영상 파일 이름 33 | 34 | ## 실행! 35 | 36 | # . ----- 비디오 스트리밍을 불러오기 & 초기설정 ----- 37 | vc = cv2.VideoCapture(input_path) 38 | fps = vc.get(cv2.CAP_PROP_FPS) # input_path의 초당 프레임 수 인식. fps=30.0 39 | print(f'fps : {fps}') 40 | 41 | writer = None 42 | (W, H) = (None, None) 43 | i = 0 # 비디오 초 번호. While loop를 돌아가는 회차 44 | Q = deque(maxlen=128) 45 | 46 | video_frm_ar = np.zeros((1, int(fps), 160, 160, 3), dtype=np.float) # frames 47 | frame_counter = 0 # 1초당 프레임 번호. 1~30 48 | frame_list = [] 49 | preds = None 50 | maxprob = None 51 | 52 | # . 
While loop : 스트리밍이 끝날 때까지 프레임 추출 반복문 시작 53 | # ----- 스트리밍 동영상들을 (30, 160, 160, 3)으로 저장하기 시작 ----- 54 | while True: 55 | frame_counter += 1 56 | grabbed, frm = vc.read() # 비디오를 1개 프레임씩 읽는다. 57 | # > grabbed=True, frm=프레임별 넘파이 배열. (240, 320, 3) 58 | 59 | if not grabbed: # 프레임이 안 잡힐 경우 60 | print('프레임이 없습니다. 스트리밍을 종료합니다.') 61 | break 62 | 63 | if fps != 30: # 비디오 fps가 30이 아니면 루프를 돌지 않기로 한다. 64 | print('비디오의 초당 프레임이 30이 아닙니다. fps=30으로 맞춰주세요.') 65 | break 66 | 67 | if W is None or H is None: # 프레임 이미지 폭(W), 높이(H)를 동영상에서 68 | (H, W) = frm.shape[:2] 69 | 70 | output = frm.copy() # 비디오 프레임을 그대로 복사. 저장/출력할 .mp4 파일로 71 | 72 | frame = resize(frm, (160, 160, 3)) # > input 배열을 (160, 160, 3)으로 변환 73 | frame_list.append(frame) # 각 프레임 배열 (160, 160, 3)이 append 된다. 74 | 75 | if frame_counter == fps: # 프레임 카운터가 30이 된 순간. len(frame_list)==30이 된 순간. 76 | # . ----- 1초=30프레임마다 묶어서 예측(.predict) ----- 77 | # . ----- 1초 동안 (1, 30, 160, 160, 3) 배열을 만들어 모델에 투입 --- 78 | # . ----- 예측 결과(1초)를 output에 씌워 준다. ----- 79 | # 그러기 위해서는 30개씩 append한 리스트를 넘파이화 -> 예측 -> 리스트 초기화 과정이 필요 80 | frame_ar = np.array(frame_list, dtype=np.float16) # > 30개의 원소가 든 리스트를 변환. (30, 160, 160, 3) 81 | frame_list = [] # 30프레임이 채워질 때마다 넘파이 배열로 변환을 마친 프레임 리스트는 초기화 해 준다. 82 | 83 | if (np.max(frame_ar) > 1): # 넘파이 배열의 RGB 값을 스케일링 84 | frame_ar = frame_ar / 255.0 85 | 86 | # video_frm_ar[i][:]=frame_ar #> (i, fps, 160, 160, 3). i번째 1초짜리 영상(30프레임) 배열 파일이 된다. 87 | # print(video_frm_ar.shape) 88 | 89 | # VGG19로 초당 프레임 이미지 배열로부터 특성 추출 : (1*30, 5, 5, 512) 90 | pred_imgarr = base_model.predict(frame_ar) # > (30, 5, 5, 512) 91 | # 추출된 특성 배열들을 1차원으로 변환 : (1, 30, 5*5*512) 92 | pred_imgarr_dim = pred_imgarr.reshape(1, pred_imgarr.shape[0], 5 * 5 * 512) # > (1, 30, 12800) 93 | # 각 프레임 폭력 여부 예측값을 0에 저장 94 | preds = model.predict(pred_imgarr_dim) # > (True, 0.99) : (폭력여부, 폭력확률) 95 | print(f'preds:{preds}') 96 | Q.append(preds) # > Deque Q에 리스트처럼 예측값을 추가함 97 | 98 | # 지난 5초간의 폭력 확률 평균을 result로 한다. 99 | if i < 5: 100 | results = np.array(Q)[:i].mean(axis=0) 101 | else: 102 | results = np.array(Q)[(i - 5):i].mean(axis=0) 103 | # results=np.array(Q).mean(axis=0) 104 | print(f'Results = {results}') # > ex : (0.6, 0.650) 105 | 106 | # 예측 결과에서 최대폭력확률값 107 | maxprob = np.max(results) # > 가장 높은 값을 선택함 108 | print(f'Maximum Probability : {maxprob}') 109 | print('') 110 | 111 | rest = 1 - maxprob # 폭력이 아닐 확률 112 | diff = maxprob - rest # 폭력일 확률과 폭력이 아닐 확률의 차이 113 | th = 100 114 | 115 | if diff > 0.80: # 폭력일 확률과 아닐 확률의 차이가 0.8 이상이면 116 | th = diff # 근거가? 117 | 118 | frame_counter = 0 # > 1초(30프레임)가 경과했으므로 frame_counter=0으로 리셋 119 | i += 1 # > 1초 경과 의미 120 | 121 | # frame_counter==30이 되면 0으로 돌아가 위 루프를 반복해 준다. 122 | 123 | # ----- output에 씌울 자막 설정하기 ----- 124 | # 30프레임(1초)마다 갱신된 값이 output에 씌워지게 된다 125 | scale = 1 126 | fontScale = min(W, H) / (300 / scale) 127 | 128 | if preds is not None and maxprob is not None: # 예측값이 발생한 후부터 129 | if (preds[0][1]) < th: # > 폭력일 확률이 th보다 작으면 정상 130 | text1_1 = 'Normal' 131 | text1_2 = '{:.2f}%'.format(100 - (maxprob * 100)) 132 | cv2.putText(output, text1_1, (int(0.025 * W), int(0.1 * H)), 133 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 134 | cv2.putText(output, text1_2, (int(0.025 * W), int(0.2 * H)), 135 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 255, 0), 2) 136 | # cv2.putText(이미지파일, 출력문자, 시작위치좌표(좌측하단), 폰트, 폰트크기, 폰트색상, 폰트두께) 137 | 138 | else: # > 폭력일 확률이 th보다 크면 폭력 취급 139 | text2_1 = 'Violence Alert!' 
140 | text2_2 = '{:.2f}%'.format(maxprob * 100) 141 | cv2.putText(output, text2_1, (int(0.025 * W), int(0.1 * H)), 142 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 143 | cv2.putText(output, text2_2, (int(0.025 * W), int(0.2 * H)), 144 | cv2.FONT_HERSHEY_TRIPLEX, fontScale, (0, 0, 255), 2) 145 | 146 | # 자막이 씌워진 동영상을 writer로 저장함 147 | if writer is None: 148 | fourcc = cv2.VideoWriter_fourcc(*'DIVX') 149 | writer = cv2.VideoWriter(output_path, fourcc, 30, (W, H), True) 150 | 151 | # 아웃풋을 새창으로 열어 보여주기 152 | cv2.imshow('This is output', output) 153 | writer.write(output) # output_path로 output 객체를 저장함 154 | 155 | key = cv2.waitKey(round(1000 / fps)) # 프레임-다음 프레임 사이 간격 156 | if key == 27: # esc 키를 누르면 루프로부터 벗어나고 output 파일이 저장됨 157 | print('ESC키를 눌렀습니다. 녹화를 종료합니다.') 158 | break 159 | 160 | print('종료 처리되었습니다. 메모리를 해제합니다.') 161 | writer.release() 162 | vc.release() 163 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /raspberry/video_maker.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import imagezmq 4 | 5 | image_hub = imagezmq.ImageHub() 6 | fourcc = cv2.VideoWriter_fourcc(*'XVID') 7 | out = cv2.VideoWriter('C:/Users/leesookwang/Desktop/project/fight06.mp4', -1, 50.0, (160,160)) 8 | while True: # show streamed images until Ctrl-C 9 | cv2.waitKey(1) 10 | rpi_name, image = image_hub.recv_image() 11 | 12 | if rpi_name == 'raspberrypi': 13 | out.write(image) 14 | cv2.imshow('frame',image) 15 | 16 | if cv2.waitKey(1) & 0xFF == ord('q'): 17 | break 18 | 19 | else : 20 | break 21 | image_hub.send_reply(b'OK') 22 | 23 | out.release() 24 | cv2.destroyAllWindows() -------------------------------------------------------------------------------- /raspberry/video_maker1.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import imagezmq 4 | 5 | image_hub = imagezmq.ImageHub() 6 | while True: # show streamed images until Ctrl-C 7 | rpi_name, image = image_hub.recv_image() 8 | cv2.imshow(rpi_name, image) # 1 window for each RPi 9 | cv2.waitKey(1) 10 | image_hub.send_reply(b'OK') 11 | cap = cv2.VideoCapture() 12 | 13 | # 열렸는지 확인 14 | if not cap.isOpened(): 15 | print("Camera open failed!") 16 | sys.exit() 17 | 18 | # 웹캠의 속성 값을 받아오기 19 | # 정수 형태로 변환하기 위해 round 20 | w = round(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 21 | h = round(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 22 | fps = cap.get(cv2.CAP_PROP_FPS) # 카메라에 따라 값이 정상적, 비정상적 23 | 24 | # fourcc 값 받아오기, *는 문자를 풀어쓰는 방식, *'DIVX' == 'D', 'I', 'V', 'X' 25 | fourcc = cv2.VideoWriter_fourcc(*'DIVX') 26 | 27 | # 1프레임과 다음 프레임 사이의 간격 설정 28 | delay = round(1000/fps) 29 | 30 | # 웹캠으로 찰영한 영상을 저장하기 31 | # cv2.VideoWriter 객체 생성, 기존에 받아온 속성값 입력 32 | out = cv2.VideoWriter('C:/Users/leesookwang/Desktop/project/sample_video10.avi', fourcc, fps, (w, h)) 33 | 34 | # 제대로 열렸는지 확인 35 | if not out.isOpened(): 36 | print('File open failed!') 37 | cap.release() 38 | sys.exit() -------------------------------------------------------------------------------- /ver_jupyter/01_video-to-numpy-save.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 01. 
Transforming Video file(.avi) to 160*160 size, 30 frames Numpy array(.npy)\n", 8 | "* **Purpose of this Code** : Making train set & test set to train and evaluate Model\n", 9 | "* **`Please!`** **Before Run this code, You should separate your videos to Fight folder / NonFight Folder**\n", 10 | "* **`Output=.pickle`** : Very Large file.(Mabye about 30~50GB at each one?)\n", 11 | " * Please check your drive capacity, before run this code!" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# Imports" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 1, 24 | "metadata": { 25 | "execution": { 26 | "iopub.execute_input": "2021-05-12T01:35:46.747215Z", 27 | "iopub.status.busy": "2021-05-12T01:35:46.747215Z", 28 | "iopub.status.idle": "2021-05-12T01:35:53.734738Z", 29 | "shell.execute_reply": "2021-05-12T01:35:53.734738Z", 30 | "shell.execute_reply.started": "2021-05-12T01:35:46.747215Z" 31 | }, 32 | "id": "zo19mACbtiji", 33 | "tags": [] 34 | }, 35 | "outputs": [], 36 | "source": [ 37 | "import os\n", 38 | "import pickle # save list as .pickle\n", 39 | "import numpy as np\n", 40 | "from tqdm import tqdm\n", 41 | "import cv2 # read video file\n", 42 | "from skimage.transform import resize # resizing images" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "metadata": {}, 48 | "source": [ 49 | "# 01-A. Transform video files to Numpy array" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 2, 55 | "metadata": { 56 | "execution": { 57 | "iopub.execute_input": "2021-05-12T01:35:55.841611Z", 58 | "iopub.status.busy": "2021-05-12T01:35:55.841611Z", 59 | "iopub.status.idle": "2021-05-12T01:35:55.861609Z", 60 | "shell.execute_reply": "2021-05-12T01:35:55.861609Z", 61 | "shell.execute_reply.started": "2021-05-12T01:35:55.841611Z" 62 | }, 63 | "id": "badw1e8Cs3xa", 64 | "tags": [] 65 | }, 66 | "outputs": [], 67 | "source": [ 68 | "def Save2Npy(file_dir, save_dir):\n", 69 | " \"\"\"This function load videos, and transform each of them to Numpy array, and save them in selected folder.\n", 70 | " :: file_dir :: This folder have original video files\n", 71 | " :: save_dir :: You'll save transformed Numpy array in this folder.\n", 72 | " \"\"\"\n", 73 | " if not os.path.exists(save_dir): # If there is not save_dir folder, then create new folder at there.\n", 74 | " os.makedirs(save_dir)\n", 75 | " \n", 76 | " file_list=os.listdir(file_dir) # Make list of video file(in file_dir folder)'s name.\n", 77 | " \n", 78 | " for file in tqdm(file_list):\n", 79 | " frames=np.zeros((30, 160, 160, 3), dtype=np.float)\n", 80 | " i=0\n", 81 | " \n", 82 | " vid=cv2.VideoCapture(os.path.join(file_dir, file)) # Create cv2.VideoCapture() Object of each video file.\n", 83 | " \n", 84 | " if vid.isOpened():\n", 85 | " grabbed, frame=vid.read()\n", 86 | " else:\n", 87 | " grabbed=False\n", 88 | " \n", 89 | " frm=resize(frame, (160, 160, 3))\n", 90 | " frm=np.expand_dims(frm, axis=0)\n", 91 | " \n", 92 | " if(np.max(frm)>1):\n", 93 | " frm=frm/255.0\n", 94 | " frames[i][:]=frm\n", 95 | " i+=1\n", 96 | " \n", 97 | " while i<30:\n", 98 | " grabbed, frame=vid.read()\n", 99 | " frm=resize(frame, (160, 160, 3))\n", 100 | " frm=np.expand_dims(frm, axis=0)\n", 101 | " if(np.max(frm)>1):\n", 102 | " frm=frm/255.0\n", 103 | " frames[i][:] = frm\n", 104 | " i+=1\n", 105 | "\n", 106 | " video_name=file.split('.')[0]\n", 107 | " video_path=os.path.join(file_dir, file)\n", 108 | " save_path=os.path.join(save_dir, video_name+'.npy')\n", 109 | "\n", 110 
| " np.save(save_path, frames)" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 3, 116 | "metadata": { 117 | "execution": { 118 | "iopub.execute_input": "2021-05-12T01:36:00.098085Z", 119 | "iopub.status.busy": "2021-05-12T01:36:00.098085Z", 120 | "iopub.status.idle": "2021-05-12T01:36:00.103099Z", 121 | "shell.execute_reply": "2021-05-12T01:36:00.103099Z", 122 | "shell.execute_reply.started": "2021-05-12T01:36:00.098085Z" 123 | }, 124 | "id": "QAQIBMa0twfe", 125 | "tags": [] 126 | }, 127 | "outputs": [], 128 | "source": [ 129 | "file_dir='D:/datasets/itwill/Fight_210511' # Folder that have videos\n", 130 | "save_dir='D:/datasets/AllVideo_Fight_Numpy'" 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": 4, 136 | "metadata": { 137 | "colab": { 138 | "base_uri": "https://localhost:8080/" 139 | }, 140 | "execution": { 141 | "iopub.execute_input": "2021-05-12T01:36:03.919857Z", 142 | "iopub.status.busy": "2021-05-12T01:36:03.919857Z", 143 | "iopub.status.idle": "2021-05-12T01:36:12.416115Z", 144 | "shell.execute_reply": "2021-05-12T01:36:12.416115Z", 145 | "shell.execute_reply.started": "2021-05-12T01:36:03.919857Z" 146 | }, 147 | "executionInfo": { 148 | "elapsed": 43169, 149 | "status": "ok", 150 | "timestamp": 1619685813838, 151 | "user": { 152 | "displayName": "노혜원", 153 | "photoUrl": "", 154 | "userId": "17297828991528988184" 155 | }, 156 | "user_tz": -540 157 | }, 158 | "id": "ffcnzDeCyN3r", 159 | "outputId": "f39de57e-e854-46ce-936e-85579f2b4ef5", 160 | "tags": [] 161 | }, 162 | "outputs": [ 163 | { 164 | "name": "stderr", 165 | "output_type": "stream", 166 | "text": [ 167 | "100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [00:08<00:00, 1.89it/s]\n" 168 | ] 169 | } 170 | ], 171 | "source": [ 172 | "Save2Npy(file_dir=file_dir, save_dir=save_dir)" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": 5, 178 | "metadata": { 179 | "execution": { 180 | "iopub.execute_input": "2021-05-07T01:56:51.559948Z", 181 | "iopub.status.busy": "2021-05-07T01:56:51.559948Z", 182 | "iopub.status.idle": "2021-05-07T01:56:51.575818Z", 183 | "shell.execute_reply": "2021-05-07T01:56:51.575818Z", 184 | "shell.execute_reply.started": "2021-05-07T01:56:51.559948Z" 185 | }, 186 | "tags": [] 187 | }, 188 | "outputs": [], 189 | "source": [ 190 | "file_dir='D:/datasets/itwill/NonFight'\n", 191 | "save_dir='D:/datasets/AllVideo_NonFight_Numpy'" 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": 6, 197 | "metadata": { 198 | "execution": { 199 | "iopub.execute_input": "2021-05-07T01:56:51.576851Z", 200 | "iopub.status.busy": "2021-05-07T01:56:51.576851Z", 201 | "iopub.status.idle": "2021-05-07T01:59:48.116783Z", 202 | "shell.execute_reply": "2021-05-07T01:59:48.116783Z", 203 | "shell.execute_reply.started": "2021-05-07T01:56:51.576851Z" 204 | }, 205 | "tags": [] 206 | }, 207 | "outputs": [ 208 | { 209 | "name": "stderr", 210 | "output_type": "stream", 211 | "text": [ 212 | "100%|██████████████████████████████████████████████████████████████████████████████████| 55/55 [02:56<00:00, 3.21s/it]\n" 213 | ] 214 | } 215 | ], 216 | "source": [ 217 | "Save2Npy(file_dir=file_dir, save_dir=save_dir)" 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "metadata": { 223 | "id": "DbDTCyEP1Sox" 224 | }, 225 | "source": [ 226 | "# 01-B. Make list of Numpy arrays" 227 | ] 228 | }, 229 | { 230 | "cell_type": "markdown", 231 | "metadata": {}, 232 | "source": [ 233 | "## 1. 
Fight Videos" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 5, 239 | "metadata": { 240 | "colab": { 241 | "base_uri": "https://localhost:8080/" 242 | }, 243 | "execution": { 244 | "iopub.execute_input": "2021-05-12T01:37:07.779720Z", 245 | "iopub.status.busy": "2021-05-12T01:37:07.778723Z", 246 | "iopub.status.idle": "2021-05-12T01:43:05.549673Z", 247 | "shell.execute_reply": "2021-05-12T01:43:05.548333Z", 248 | "shell.execute_reply.started": "2021-05-12T01:37:07.779720Z" 249 | }, 250 | "executionInfo": { 251 | "elapsed": 2548, 252 | "status": "ok", 253 | "timestamp": 1619686496769, 254 | "user": { 255 | "displayName": "노혜원", 256 | "photoUrl": "", 257 | "userId": "17297828991528988184" 258 | }, 259 | "user_tz": -540 260 | }, 261 | "id": "2NpyhDVO0raO", 262 | "outputId": "38d35dc1-6af4-4c71-baa4-65a99f10f85d", 263 | "tags": [] 264 | }, 265 | "outputs": [ 266 | { 267 | "name": "stdout", 268 | "output_type": "stream", 269 | "text": [ 270 | "1857\n" 271 | ] 272 | } 273 | ], 274 | "source": [ 275 | "Fight_dir='D:/datasets/AllVideo_Fight_Numpy' # Folder that contains Fight(Violence) Video files\n", 276 | "file_list_npy = os.listdir(Fight_dir) # File name list\n", 277 | "\n", 278 | "data_Fight=[]\n", 279 | "for file in file_list_npy:\n", 280 | " file_path=os.path.join(Fight_dir, file)\n", 281 | " x=np.load(file_path)\n", 282 | " data_Fight.append(x)\n", 283 | "\n", 284 | "print(len(data_Fight))" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": [ 291 | "## 2. NonFight Videos" 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": 8, 297 | "metadata": { 298 | "execution": { 299 | "iopub.execute_input": "2021-05-07T02:05:35.848276Z", 300 | "iopub.status.busy": "2021-05-07T02:05:35.847278Z", 301 | "iopub.status.idle": "2021-05-07T02:11:00.733880Z", 302 | "shell.execute_reply": "2021-05-07T02:11:00.729890Z", 303 | "shell.execute_reply.started": "2021-05-07T02:05:35.847278Z" 304 | }, 305 | "tags": [] 306 | }, 307 | "outputs": [ 308 | { 309 | "name": "stdout", 310 | "output_type": "stream", 311 | "text": [ 312 | "1741\n" 313 | ] 314 | } 315 | ], 316 | "source": [ 317 | "NonFight_dir='D:/datasets/AllVideo_NonFight_Numpy'\n", 318 | "file_list_npy=os.listdir(NonFight_dir)\n", 319 | "\n", 320 | "data_NonFight=[]\n", 321 | "for file in file_list_npy:\n", 322 | " file_path=os.path.join(NonFight_dir, file)\n", 323 | " x=np.load(file_path)\n", 324 | " data_NonFight.append(x)\n", 325 | "\n", 326 | "print(len(data_NonFight))" 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "# 01-C. 
Save list as .pickle" 334 | ] 335 | }, 336 | { 337 | "cell_type": "code", 338 | "execution_count": 6, 339 | "metadata": { 340 | "execution": { 341 | "iopub.execute_input": "2021-05-12T01:43:05.565294Z", 342 | "iopub.status.busy": "2021-05-12T01:43:05.565294Z", 343 | "iopub.status.idle": "2021-05-12T01:48:43.113500Z", 344 | "shell.execute_reply": "2021-05-12T01:48:43.099352Z", 345 | "shell.execute_reply.started": "2021-05-12T01:43:05.565294Z" 346 | }, 347 | "tags": [] 348 | }, 349 | "outputs": [], 350 | "source": [ 351 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_data_Fight_210512.pickle\",\"wb\") as fw:\n", 352 | " pickle.dump(data_Fight, fw, protocol=pickle.HIGHEST_PROTOCOL)" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": 10, 358 | "metadata": { 359 | "execution": { 360 | "iopub.execute_input": "2021-05-07T02:17:01.175596Z", 361 | "iopub.status.busy": "2021-05-07T02:17:01.175596Z", 362 | "iopub.status.idle": "2021-05-07T02:21:59.396154Z", 363 | "shell.execute_reply": "2021-05-07T02:21:59.385184Z", 364 | "shell.execute_reply.started": "2021-05-07T02:17:01.175596Z" 365 | }, 366 | "tags": [] 367 | }, 368 | "outputs": [], 369 | "source": [ 370 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_data_NonFight_210507.pickle\",\"wb\") as fw:\n", 371 | " pickle.dump(data_NonFight, fw, protocol=pickle.HIGHEST_PROTOCOL)" 372 | ] 373 | }, 374 | { 375 | "cell_type": "markdown", 376 | "metadata": {}, 377 | "source": [ 378 | "# 01-D. Make & Save Label(Real Values) as .pickle\n", 379 | "> Memo\n", 380 | "```\n", 381 | "prediction=preds.argmax(axis=0):[0 0]\n", 382 | "Results = [[0.09166703 0.9092251 ]]\n", 383 | "Maximun Probability = 0.9092251\n", 384 | "Difference of prob 0.8184502124786377\n", 385 | "Alert : violence - 90.92%\n", 386 | "```\n", 387 | "> labels1=[] : list of each video file's lavel(Fight/NonFight). \n", 388 | "> * Violence(Fight) : [0,1]\n", 389 | "> * Non-Violence(NonFight) : [1,0]" 390 | ] 391 | }, 392 | { 393 | "cell_type": "markdown", 394 | "metadata": {}, 395 | "source": [ 396 | "## 1. 
Make Label list" 397 | ] 398 | }, 399 | { 400 | "cell_type": "code", 401 | "execution_count": 7, 402 | "metadata": { 403 | "execution": { 404 | "iopub.execute_input": "2021-05-12T01:48:43.131808Z", 405 | "iopub.status.busy": "2021-05-12T01:48:43.131070Z", 406 | "iopub.status.idle": "2021-05-12T01:48:43.173974Z", 407 | "shell.execute_reply": "2021-05-12T01:48:43.172976Z", 408 | "shell.execute_reply.started": "2021-05-12T01:48:43.131808Z" 409 | }, 410 | "tags": [] 411 | }, 412 | "outputs": [], 413 | "source": [ 414 | "label_Fight_per_video=np.array([0,1])\n", 415 | "label_Fight=[label_Fight_per_video]*len(data_Fight) # As amount as count of Violence(Fight) Video" 416 | ] 417 | }, 418 | { 419 | "cell_type": "code", 420 | "execution_count": 12, 421 | "metadata": { 422 | "execution": { 423 | "iopub.execute_input": "2021-05-07T02:21:59.445413Z", 424 | "iopub.status.busy": "2021-05-07T02:21:59.445413Z", 425 | "iopub.status.idle": "2021-05-07T02:21:59.463108Z", 426 | "shell.execute_reply": "2021-05-07T02:21:59.463108Z", 427 | "shell.execute_reply.started": "2021-05-07T02:21:59.445413Z" 428 | }, 429 | "tags": [] 430 | }, 431 | "outputs": [], 432 | "source": [ 433 | "label_NonFight_per_video=np.array([1,0])\n", 434 | "label_NonFight=[label_NonFight_per_video]*len(data_NonFight) # As amount as count of Non-Violence(NonFight) Video" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": 13, 440 | "metadata": { 441 | "execution": { 442 | "iopub.execute_input": "2021-05-07T02:21:59.464122Z", 443 | "iopub.status.busy": "2021-05-07T02:21:59.464122Z", 444 | "iopub.status.idle": "2021-05-07T02:21:59.584647Z", 445 | "shell.execute_reply": "2021-05-07T02:21:59.583490Z", 446 | "shell.execute_reply.started": "2021-05-07T02:21:59.464122Z" 447 | }, 448 | "tags": [] 449 | }, 450 | "outputs": [ 451 | { 452 | "data": { 453 | "text/plain": [ 454 | "(1841, 1741)" 455 | ] 456 | }, 457 | "execution_count": 13, 458 | "metadata": {}, 459 | "output_type": "execute_result" 460 | } 461 | ], 462 | "source": [ 463 | "len(label_Fight), len(label_NonFight)" 464 | ] 465 | }, 466 | { 467 | "cell_type": "code", 468 | "execution_count": 14, 469 | "metadata": { 470 | "execution": { 471 | "iopub.execute_input": "2021-05-07T02:21:59.586606Z", 472 | "iopub.status.busy": "2021-05-07T02:21:59.586606Z", 473 | "iopub.status.idle": "2021-05-07T02:21:59.614788Z", 474 | "shell.execute_reply": "2021-05-07T02:21:59.614788Z", 475 | "shell.execute_reply.started": "2021-05-07T02:21:59.586606Z" 476 | }, 477 | "tags": [] 478 | }, 479 | "outputs": [ 480 | { 481 | "data": { 482 | "text/plain": [ 483 | "(array([0, 1]), array([1, 0]))" 484 | ] 485 | }, 486 | "execution_count": 14, 487 | "metadata": {}, 488 | "output_type": "execute_result" 489 | } 490 | ], 491 | "source": [ 492 | "label_Fight[0], label_NonFight[0]" 493 | ] 494 | }, 495 | { 496 | "cell_type": "markdown", 497 | "metadata": {}, 498 | "source": [ 499 | "## 2. 
Save Label list as .pickle" 500 | ] 501 | }, 502 | { 503 | "cell_type": "code", 504 | "execution_count": 8, 505 | "metadata": { 506 | "execution": { 507 | "iopub.execute_input": "2021-05-12T01:48:43.179021Z", 508 | "iopub.status.busy": "2021-05-12T01:48:43.179021Z", 509 | "iopub.status.idle": "2021-05-12T01:48:43.205109Z", 510 | "shell.execute_reply": "2021-05-12T01:48:43.205109Z", 511 | "shell.execute_reply.started": "2021-05-12T01:48:43.179021Z" 512 | }, 513 | "tags": [] 514 | }, 515 | "outputs": [], 516 | "source": [ 517 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_label_Fight_210512.pickle\",\"wb\") as fw:\n", 518 | " pickle.dump(label_Fight, fw)" 519 | ] 520 | }, 521 | { 522 | "cell_type": "code", 523 | "execution_count": 16, 524 | "metadata": { 525 | "execution": { 526 | "iopub.execute_input": "2021-05-07T02:21:59.631745Z", 527 | "iopub.status.busy": "2021-05-07T02:21:59.631745Z", 528 | "iopub.status.idle": "2021-05-07T02:21:59.661692Z", 529 | "shell.execute_reply": "2021-05-07T02:21:59.661692Z", 530 | "shell.execute_reply.started": "2021-05-07T02:21:59.631745Z" 531 | }, 532 | "tags": [] 533 | }, 534 | "outputs": [], 535 | "source": [ 536 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_label_NonFight_210507.pickle\",\"wb\") as fw:\n", 537 | " pickle.dump(label_NonFight, fw)" 538 | ] 539 | } 540 | ], 541 | "metadata": { 542 | "colab": { 543 | "collapsed_sections": [], 544 | "name": "Save2Npy.ipynb의 사본", 545 | "provenance": [ 546 | { 547 | "file_id": "1QzVuS31z0_YnwKgemxRNuTYqpiZXI6X5", 548 | "timestamp": 1619686640536 549 | } 550 | ] 551 | }, 552 | "kernelspec": { 553 | "display_name": "Python 3", 554 | "language": "python", 555 | "name": "python3" 556 | }, 557 | "language_info": { 558 | "codemirror_mode": { 559 | "name": "ipython", 560 | "version": 3 561 | }, 562 | "file_extension": ".py", 563 | "mimetype": "text/x-python", 564 | "name": "python", 565 | "nbconvert_exporter": "python", 566 | "pygments_lexer": "ipython3", 567 | "version": "3.8.5" 568 | } 569 | }, 570 | "nbformat": 4, 571 | "nbformat_minor": 4 572 | } 573 | -------------------------------------------------------------------------------- /ver_jupyter/02_create-numpy-datasets_training-test.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 02. 
Making Train set & Test set\n", 8 | "* Before running this file, please check the following:\n", 9 | " * 01_video-to-numpy-save.ipynb\n", 10 | " * **`Do those files exist there?`** They were created by 01_video-to-numpy-save.ipynb\n", 11 | " * 01_data_Fight_210512.pickle\n", 12 | " * 01_label_Fight_210512.pickle\n", 13 | " * 01_data_NonFight_210507.pickle\n", 14 | " * 01_label_NonFight_210507.pickle" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "# Imports" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "metadata": { 28 | "execution": { 29 | "iopub.execute_input": "2021-05-12T02:53:14.097084Z", 30 | "iopub.status.busy": "2021-05-12T02:53:14.097084Z", 31 | "iopub.status.idle": "2021-05-12T02:53:16.374164Z", 32 | "shell.execute_reply": "2021-05-12T02:53:16.374164Z", 33 | "shell.execute_reply.started": "2021-05-12T02:53:14.097084Z" 34 | }, 35 | "tags": [] 36 | }, 37 | "outputs": [], 38 | "source": [ 39 | "import numpy as np\n", 40 | "import pickle\n", 41 | "from random import shuffle" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "# 02-A. Load Fight / NonFight Video Pickle Files\n", 49 | "* These files were saved by 21XXXX_01_video-to-numpy-save.ipynb\n", 50 | " * **data_Fight.pickle** : List of frame-image Numpy arrays for Fight videos\n", 51 | " * **data_NonFight.pickle** : List of frame-image Numpy arrays for NonFight videos\n", 52 | " * **label_Fight.pickle** : List of label Numpy arrays for Fight videos\n", 53 | " * **label_NonFight.pickle** : List of label Numpy arrays for NonFight videos\n", 54 | "* **`The reason why I repeatedly save and load .pickle files`** :\n", 55 | " * Simply because of RAM & memory constraints.\n", 56 | " * When I ran this code, my desktop had 16 GB of RAM, about 100 GB free on C:, and a 1 TB D: drive." 57 | ] 58 | }, 59 | { 60 | "cell_type": "code", 61 | "execution_count": null, 62 | "metadata": { 63 | "tags": [] 64 | }, 65 | "outputs": [], 66 | "source": [ 67 | "# Fight video frames Numpy array list\n", 68 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_data_Fight_210512.pickle\",\"rb\") as fr:\n", 69 | " data_Fight=pickle.load(fr)\n", 70 | "print(len(data_Fight))" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": { 77 | "tags": [] 78 | }, 79 | "outputs": [], 80 | "source": [ 81 | "# Fight label Numpy array list\n", 82 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_label_Fight_210512.pickle\",\"rb\") as fr:\n", 83 | " label_Fight=pickle.load(fr)\n", 84 | "print(len(label_Fight))" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": null, 90 | "metadata": { 91 | "tags": [] 92 | }, 93 | "outputs": [], 94 | "source": [ 95 | "# NonFight video frames Numpy array list\n", 96 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_data_NonFight_210507.pickle\",\"rb\") as fr:\n", 97 | " data_NonFight=pickle.load(fr)\n", 98 | "print(len(data_NonFight))" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": { 105 | "tags": [] 106 | }, 107 | "outputs": [], 108 | "source": [ 109 | "# NonFight label Numpy array list\n", 110 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/01_label_NonFight_210507.pickle\",\"rb\") as fr:\n", 111 | " label_NonFight=pickle.load(fr)\n", 112 | "print(len(label_NonFight))" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "# 02-B. 
Merge data & Random Shuffle" 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": {}, 125 | "source": [ 126 | "## 1. Merge data : Fight + NonFight" 127 | ] 128 | }, 129 | { 130 | "cell_type": "code", 131 | "execution_count": null, 132 | "metadata": { 133 | "tags": [] 134 | }, 135 | "outputs": [], 136 | "source": [ 137 | "data_total=data_Fight+data_NonFight\n", 138 | "print(len(data_total))" 139 | ] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": { 145 | "tags": [] 146 | }, 147 | "outputs": [], 148 | "source": [ 149 | "label_total=label_Fight+label_NonFight\n", 150 | "print(len(label_total))" 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "metadata": {}, 156 | "source": [ 157 | "## 2. Shuffle merged dataset" 158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": null, 163 | "metadata": { 164 | "tags": [] 165 | }, 166 | "outputs": [], 167 | "source": [ 168 | "np.random.seed(42)" 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": { 175 | "tags": [] 176 | }, 177 | "outputs": [], 178 | "source": [ 179 | "c=list(zip(data_total, label_total)) # zip \n", 180 | "shuffle(c) # Random Shuffle\n", 181 | "data_total, label_total=zip(*c) # unpacking" 182 | ] 183 | }, 184 | { 185 | "cell_type": "markdown", 186 | "metadata": {}, 187 | "source": [ 188 | "## 3. save shuffled dataset as .pickle\n", 189 | "* **`pickle.dump(protocol=pickle.HIGHEST_PROTOCOL)`** : You can solve lack of memory issue when pickle save process" 190 | ] 191 | }, 192 | { 193 | "cell_type": "code", 194 | "execution_count": null, 195 | "metadata": { 196 | "tags": [] 197 | }, 198 | "outputs": [], 199 | "source": [ 200 | "# Save data\n", 201 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_total_210512.pickle\",\"wb\") as fw:\n", 202 | " pickle.dump(data_total, fw, protocol=pickle.HIGHEST_PROTOCOL)" 203 | ] 204 | }, 205 | { 206 | "cell_type": "code", 207 | "execution_count": null, 208 | "metadata": { 209 | "tags": [] 210 | }, 211 | "outputs": [], 212 | "source": [ 213 | "# Save label\n", 214 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_total_210512.pickle\",\"wb\") as fw:\n", 215 | " pickle.dump(label_total, fw)" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "# 02-C. Split training set / test set" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": {}, 228 | "source": [ 229 | "## 1. Load shuffled dataset(.pickle)\n", 230 | "* **`The reason why I repeated saving and loading .pickle is`** :\n", 231 | " * Just, because of RAM & memory issues." 
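One detail worth flagging in 02-B: `np.random.seed(42)` seeds NumPy's generator only, so the `random.shuffle(c)` call above is not actually made reproducible by it. A minimal sketch of a reproducible shuffle, using toy stand-ins for `data_total` / `label_total` (hypothetical sizes, not the real dataset):

```python
import random
import numpy as np

# Toy stand-ins for data_total / label_total from 02-B (hypothetical, 4 clips only).
data_total = [np.zeros((30, 160, 160, 3), dtype=np.float16) for _ in range(4)]
label_total = [np.array([0, 1]), np.array([0, 1]), np.array([1, 0]), np.array([1, 0])]

# np.random.seed() does not seed Python's random module, so random.shuffle(c)
# is only reproducible if random.seed() is called as well ...
random.seed(42)
c = list(zip(data_total, label_total))
random.shuffle(c)
data_total, label_total = map(list, zip(*c))

# ... or keep everything in NumPy and shuffle both lists with one seeded permutation.
rng = np.random.default_rng(42)
idx = rng.permutation(len(data_total))
data_total = [data_total[i] for i in idx]
label_total = [label_total[i] for i in idx]
```

Either way, data and labels stay paired because they are shuffled together.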
232 | ] 233 | }, 234 | { 235 | "cell_type": "code", 236 | "execution_count": 2, 237 | "metadata": { 238 | "execution": { 239 | "iopub.execute_input": "2021-05-12T02:22:24.526848Z", 240 | "iopub.status.busy": "2021-05-12T02:22:24.526848Z", 241 | "iopub.status.idle": "2021-05-12T02:31:59.775642Z", 242 | "shell.execute_reply": "2021-05-12T02:31:59.744435Z", 243 | "shell.execute_reply.started": "2021-05-12T02:22:24.526848Z" 244 | }, 245 | "tags": [] 246 | }, 247 | "outputs": [], 248 | "source": [ 249 | "# load data\n", 250 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_total_210512.pickle\",\"rb\") as fr:\n", 251 | " data_total=pickle.load(fr)" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": 3, 257 | "metadata": { 258 | "execution": { 259 | "iopub.execute_input": "2021-05-12T02:31:59.810624Z", 260 | "iopub.status.busy": "2021-05-12T02:31:59.810081Z", 261 | "iopub.status.idle": "2021-05-12T02:31:59.958227Z", 262 | "shell.execute_reply": "2021-05-12T02:31:59.958227Z", 263 | "shell.execute_reply.started": "2021-05-12T02:31:59.810624Z" 264 | }, 265 | "tags": [] 266 | }, 267 | "outputs": [], 268 | "source": [ 269 | "# load label\n", 270 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_total_210512.pickle\",\"rb\") as fr:\n", 271 | " label_total=pickle.load(fr)" 272 | ] 273 | }, 274 | { 275 | "cell_type": "markdown", 276 | "metadata": {}, 277 | "source": [ 278 | "## 2. Split dataset as training set / test set (8:2 ratio)" 279 | ] 280 | }, 281 | { 282 | "cell_type": "markdown", 283 | "metadata": {}, 284 | "source": [ 285 | "### 1) The number of training set, test set" 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": 4, 291 | "metadata": { 292 | "execution": { 293 | "iopub.execute_input": "2021-05-12T02:31:59.961047Z", 294 | "iopub.status.busy": "2021-05-12T02:31:59.961047Z", 295 | "iopub.status.idle": "2021-05-12T02:32:00.003690Z", 296 | "shell.execute_reply": "2021-05-12T02:32:00.003690Z", 297 | "shell.execute_reply.started": "2021-05-12T02:31:59.961047Z" 298 | }, 299 | "tags": [] 300 | }, 301 | "outputs": [], 302 | "source": [ 303 | "training_set=int(len(data_total)*0.8)\n", 304 | "test_set=int(len(data_total)*0.2)" 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": 5, 310 | "metadata": { 311 | "execution": { 312 | "iopub.execute_input": "2021-05-12T02:32:00.029779Z", 313 | "iopub.status.busy": "2021-05-12T02:32:00.029779Z", 314 | "iopub.status.idle": "2021-05-12T02:32:00.091169Z", 315 | "shell.execute_reply": "2021-05-12T02:32:00.090171Z", 316 | "shell.execute_reply.started": "2021-05-12T02:32:00.029779Z" 317 | }, 318 | "tags": [] 319 | }, 320 | "outputs": [], 321 | "source": [ 322 | "data_training=data_total[0:training_set] # Training set data\n", 323 | "data_test=data_total[training_set:] # Test set data\n", 324 | "\n", 325 | "label_training=label_total[0:training_set] # Training set label\n", 326 | "label_test=label_total[training_set:] # Test set label" 327 | ] 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": 6, 332 | "metadata": { 333 | "execution": { 334 | "iopub.execute_input": "2021-05-12T02:32:00.092166Z", 335 | "iopub.status.busy": "2021-05-12T02:32:00.092166Z", 336 | "iopub.status.idle": "2021-05-12T02:32:00.178477Z", 337 | "shell.execute_reply": "2021-05-12T02:32:00.177480Z", 338 | "shell.execute_reply.started": "2021-05-12T02:32:00.092166Z" 339 | }, 340 | "tags": [] 341 | }, 342 | "outputs": [ 343 | { 344 | "data": { 345 | "text/plain": [ 346 | "(2878, 
2878, 720, 720)" 347 | ] 348 | }, 349 | "execution_count": 6, 350 | "metadata": {}, 351 | "output_type": "execute_result" 352 | } 353 | ], 354 | "source": [ 355 | "len(data_training), len(label_training), len(data_test), len(label_test)" 356 | ] 357 | }, 358 | { 359 | "cell_type": "markdown", 360 | "metadata": {}, 361 | "source": [ 362 | "### 2) Check the shape of elements" 363 | ] 364 | }, 365 | { 366 | "cell_type": "code", 367 | "execution_count": 7, 368 | "metadata": { 369 | "execution": { 370 | "iopub.execute_input": "2021-05-12T02:32:00.179475Z", 371 | "iopub.status.busy": "2021-05-12T02:32:00.179475Z", 372 | "iopub.status.idle": "2021-05-12T02:32:00.193437Z", 373 | "shell.execute_reply": "2021-05-12T02:32:00.193437Z", 374 | "shell.execute_reply.started": "2021-05-12T02:32:00.179475Z" 375 | }, 376 | "tags": [] 377 | }, 378 | "outputs": [ 379 | { 380 | "data": { 381 | "text/plain": [ 382 | "((30, 160, 160, 3), (2,))" 383 | ] 384 | }, 385 | "execution_count": 7, 386 | "metadata": {}, 387 | "output_type": "execute_result" 388 | } 389 | ], 390 | "source": [ 391 | "data_training[0].shape, label_training[0].shape" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "execution_count": 8, 397 | "metadata": { 398 | "execution": { 399 | "iopub.execute_input": "2021-05-12T02:32:00.194434Z", 400 | "iopub.status.busy": "2021-05-12T02:32:00.194434Z", 401 | "iopub.status.idle": "2021-05-12T02:32:00.224355Z", 402 | "shell.execute_reply": "2021-05-12T02:32:00.224355Z", 403 | "shell.execute_reply.started": "2021-05-12T02:32:00.194434Z" 404 | }, 405 | "tags": [] 406 | }, 407 | "outputs": [ 408 | { 409 | "data": { 410 | "text/plain": [ 411 | "array([[0., 0., 0., ..., 0., 0., 0.],\n", 412 | " [0., 0., 0., ..., 0., 0., 0.],\n", 413 | " [0., 0., 0., ..., 0., 0., 0.],\n", 414 | " ...,\n", 415 | " [0., 0., 0., ..., 0., 0., 0.],\n", 416 | " [0., 0., 0., ..., 0., 0., 0.],\n", 417 | " [0., 0., 0., ..., 0., 0., 0.]])" 418 | ] 419 | }, 420 | "execution_count": 8, 421 | "metadata": {}, 422 | "output_type": "execute_result" 423 | } 424 | ], 425 | "source": [ 426 | "data_training[0][0, :, :, 0]" 427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "metadata": {}, 432 | "source": [ 433 | "## 3. Save training set & test set as .pickle file\n", 434 | "* **`The reason why I repeated saving and loading .pickle is`** :\n", 435 | " * Just, because of RAM & memory issues." 
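Because `int(len(data_total)*0.2)` truncates, the `test_set` variable computed above comes out one short of the slice that is actually used (`data_total[training_set:]`). Computing the test count as the remainder avoids the mismatch; a short sketch of the arithmetic, using the clip counts from this run as an example:

```python
# Clip counts from this run: 1857 Fight + 1741 NonFight videos.
n_total = 1857 + 1741             # 3598
n_train = int(n_total * 0.8)      # 2878
n_test = n_total - n_train        # 720 -- note int(n_total * 0.2) would give 719

# Slicing an already-shuffled list keeps data and labels aligned.
train_idx = range(0, n_train)
test_idx = range(n_train, n_total)
assert len(train_idx) + len(test_idx) == n_total
print(n_train, n_test)            # 2878 720
```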
436 | ] 437 | }, 438 | { 439 | "cell_type": "code", 440 | "execution_count": 9, 441 | "metadata": { 442 | "execution": { 443 | "iopub.execute_input": "2021-05-12T02:32:00.226349Z", 444 | "iopub.status.busy": "2021-05-12T02:32:00.226349Z", 445 | "iopub.status.idle": "2021-05-12T02:40:50.977546Z", 446 | "shell.execute_reply": "2021-05-12T02:40:50.962049Z", 447 | "shell.execute_reply.started": "2021-05-12T02:32:00.226349Z" 448 | }, 449 | "tags": [] 450 | }, 451 | "outputs": [], 452 | "source": [ 453 | "# training set, data\n", 454 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_training_210512.pickle\",\"wb\") as fw:\n", 455 | " pickle.dump(data_training, fw, protocol=pickle.HIGHEST_PROTOCOL)" 456 | ] 457 | }, 458 | { 459 | "cell_type": "code", 460 | "execution_count": 10, 461 | "metadata": { 462 | "execution": { 463 | "iopub.execute_input": "2021-05-12T02:40:50.993168Z", 464 | "iopub.status.busy": "2021-05-12T02:40:50.993168Z", 465 | "iopub.status.idle": "2021-05-12T02:40:51.101726Z", 466 | "shell.execute_reply": "2021-05-12T02:40:51.101726Z", 467 | "shell.execute_reply.started": "2021-05-12T02:40:50.993168Z" 468 | }, 469 | "tags": [] 470 | }, 471 | "outputs": [], 472 | "source": [ 473 | "# training set, label\n", 474 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_training_210512.pickle\",\"wb\") as fw:\n", 475 | " pickle.dump(label_training, fw)" 476 | ] 477 | }, 478 | { 479 | "cell_type": "code", 480 | "execution_count": 11, 481 | "metadata": { 482 | "execution": { 483 | "iopub.execute_input": "2021-05-12T02:40:51.163303Z", 484 | "iopub.status.busy": "2021-05-12T02:40:51.162305Z", 485 | "iopub.status.idle": "2021-05-12T02:42:57.569113Z", 486 | "shell.execute_reply": "2021-05-12T02:42:57.569113Z", 487 | "shell.execute_reply.started": "2021-05-12T02:40:51.163303Z" 488 | }, 489 | "tags": [] 490 | }, 491 | "outputs": [], 492 | "source": [ 493 | "# test set, data\n", 494 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_test_210512.pickle\",\"wb\") as fw:\n", 495 | " pickle.dump(data_test, fw, protocol=pickle.HIGHEST_PROTOCOL)" 496 | ] 497 | }, 498 | { 499 | "cell_type": "code", 500 | "execution_count": 12, 501 | "metadata": { 502 | "execution": { 503 | "iopub.execute_input": "2021-05-12T02:42:57.594584Z", 504 | "iopub.status.busy": "2021-05-12T02:42:57.593586Z", 505 | "iopub.status.idle": "2021-05-12T02:42:57.780007Z", 506 | "shell.execute_reply": "2021-05-12T02:42:57.780007Z", 507 | "shell.execute_reply.started": "2021-05-12T02:42:57.594584Z" 508 | }, 509 | "tags": [] 510 | }, 511 | "outputs": [], 512 | "source": [ 513 | "# test set, label\n", 514 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_test_210512.pickle\",\"wb\") as fw:\n", 515 | " pickle.dump(label_test, fw)" 516 | ] 517 | }, 518 | { 519 | "cell_type": "markdown", 520 | "metadata": {}, 521 | "source": [ 522 | "# 02-D. Transform dataset as Numpy array" 523 | ] 524 | }, 525 | { 526 | "cell_type": "markdown", 527 | "metadata": {}, 528 | "source": [ 529 | "## 1. 
Load training set & test set (.pickle)" 530 | ] 531 | }, 532 | { 533 | "cell_type": "markdown", 534 | "metadata": {}, 535 | "source": [ 536 | "### 1) Training set : data, label" 537 | ] 538 | }, 539 | { 540 | "cell_type": "code", 541 | "execution_count": null, 542 | "metadata": { 543 | "tags": [] 544 | }, 545 | "outputs": [], 546 | "source": [ 547 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_training_210512.pickle\",\"rb\") as fr:\n", 548 | " data_training=pickle.load(fr)" 549 | ] 550 | }, 551 | { 552 | "cell_type": "code", 553 | "execution_count": null, 554 | "metadata": { 555 | "tags": [] 556 | }, 557 | "outputs": [], 558 | "source": [ 559 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_training_210512.pickle\",\"rb\") as fr:\n", 560 | " label_training=pickle.load(fr)" 561 | ] 562 | }, 563 | { 564 | "cell_type": "markdown", 565 | "metadata": {}, 566 | "source": [ 567 | "### 2) Test set : data, label" 568 | ] 569 | }, 570 | { 571 | "cell_type": "code", 572 | "execution_count": 2, 573 | "metadata": { 574 | "execution": { 575 | "iopub.execute_input": "2021-05-12T02:53:20.799883Z", 576 | "iopub.status.busy": "2021-05-12T02:53:20.798885Z", 577 | "iopub.status.idle": "2021-05-12T02:55:22.019540Z", 578 | "shell.execute_reply": "2021-05-12T02:55:22.003919Z", 579 | "shell.execute_reply.started": "2021-05-12T02:53:20.799883Z" 580 | }, 581 | "tags": [] 582 | }, 583 | "outputs": [], 584 | "source": [ 585 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_data_test_210512.pickle\",\"rb\") as fr:\n", 586 | " data_test=pickle.load(fr)" 587 | ] 588 | }, 589 | { 590 | "cell_type": "code", 591 | "execution_count": 3, 592 | "metadata": { 593 | "execution": { 594 | "iopub.execute_input": "2021-05-12T02:55:22.019540Z", 595 | "iopub.status.busy": "2021-05-12T02:55:22.019540Z", 596 | "iopub.status.idle": "2021-05-12T02:55:22.165156Z", 597 | "shell.execute_reply": "2021-05-12T02:55:22.165156Z", 598 | "shell.execute_reply.started": "2021-05-12T02:55:22.019540Z" 599 | }, 600 | "tags": [] 601 | }, 602 | "outputs": [], 603 | "source": [ 604 | "with open(\"D:/datasets/AllVideo_numpy_list_pickle/02_label_test_210512.pickle\",\"rb\") as fr:\n", 605 | " label_test=pickle.load(fr)" 606 | ] 607 | }, 608 | { 609 | "cell_type": "markdown", 610 | "metadata": {}, 611 | "source": [ 612 | "## 2. 
Transform training set & test set as Numpy array, and save them (.npy)" 613 | ] 614 | }, 615 | { 616 | "cell_type": "markdown", 617 | "metadata": {}, 618 | "source": [ 619 | "### 1) Training set" 620 | ] 621 | }, 622 | { 623 | "cell_type": "code", 624 | "execution_count": 13, 625 | "metadata": { 626 | "execution": { 627 | "iopub.execute_input": "2021-05-12T02:42:57.865035Z", 628 | "iopub.status.busy": "2021-05-12T02:42:57.865035Z", 629 | "iopub.status.idle": "2021-05-12T02:48:35.108898Z", 630 | "shell.execute_reply": "2021-05-12T02:48:35.083939Z", 631 | "shell.execute_reply.started": "2021-05-12T02:42:57.865035Z" 632 | }, 633 | "tags": [] 634 | }, 635 | "outputs": [], 636 | "source": [ 637 | "data_training_ar=np.array(data_training, dtype=np.float16) #> (2878, 30, 160, 160, 3)" 638 | ] 639 | }, 640 | { 641 | "cell_type": "code", 642 | "execution_count": 14, 643 | "metadata": { 644 | "execution": { 645 | "iopub.execute_input": "2021-05-12T02:48:35.137795Z", 646 | "iopub.status.busy": "2021-05-12T02:48:35.136798Z", 647 | "iopub.status.idle": "2021-05-12T02:50:21.815294Z", 648 | "shell.execute_reply": "2021-05-12T02:50:21.815294Z", 649 | "shell.execute_reply.started": "2021-05-12T02:48:35.137795Z" 650 | }, 651 | "tags": [] 652 | }, 653 | "outputs": [], 654 | "source": [ 655 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/02_data_training_Numpy_210512.npy', data_training_ar)" 656 | ] 657 | }, 658 | { 659 | "cell_type": "code", 660 | "execution_count": 15, 661 | "metadata": { 662 | "execution": { 663 | "iopub.execute_input": "2021-05-12T02:50:21.815294Z", 664 | "iopub.status.busy": "2021-05-12T02:50:21.815294Z", 665 | "iopub.status.idle": "2021-05-12T02:50:21.834224Z", 666 | "shell.execute_reply": "2021-05-12T02:50:21.831205Z", 667 | "shell.execute_reply.started": "2021-05-12T02:50:21.815294Z" 668 | }, 669 | "tags": [] 670 | }, 671 | "outputs": [], 672 | "source": [ 673 | "label_training_ar=np.array(label_training) #> (2878, 2)" 674 | ] 675 | }, 676 | { 677 | "cell_type": "code", 678 | "execution_count": 16, 679 | "metadata": { 680 | "execution": { 681 | "iopub.execute_input": "2021-05-12T02:50:21.838185Z", 682 | "iopub.status.busy": "2021-05-12T02:50:21.838185Z", 683 | "iopub.status.idle": "2021-05-12T02:50:21.910593Z", 684 | "shell.execute_reply": "2021-05-12T02:50:21.909595Z", 685 | "shell.execute_reply.started": "2021-05-12T02:50:21.838185Z" 686 | }, 687 | "tags": [] 688 | }, 689 | "outputs": [], 690 | "source": [ 691 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/02_label_training_Numpy_210512.npy', label_training_ar)" 692 | ] 693 | }, 694 | { 695 | "cell_type": "code", 696 | "execution_count": 17, 697 | "metadata": { 698 | "execution": { 699 | "iopub.execute_input": "2021-05-12T02:50:21.912478Z", 700 | "iopub.status.busy": "2021-05-12T02:50:21.912478Z", 701 | "iopub.status.idle": "2021-05-12T02:50:22.028847Z", 702 | "shell.execute_reply": "2021-05-12T02:50:22.027848Z", 703 | "shell.execute_reply.started": "2021-05-12T02:50:21.912478Z" 704 | }, 705 | "tags": [] 706 | }, 707 | "outputs": [ 708 | { 709 | "data": { 710 | "text/plain": [ 711 | "((2878, 30, 160, 160, 3), (2878, 2))" 712 | ] 713 | }, 714 | "execution_count": 17, 715 | "metadata": {}, 716 | "output_type": "execute_result" 717 | } 718 | ], 719 | "source": [ 720 | "data_training_ar.shape, label_training_ar.shape" 721 | ] 722 | }, 723 | { 724 | "cell_type": "markdown", 725 | "metadata": {}, 726 | "source": [ 727 | "### 2) Test set" 728 | ] 729 | }, 730 | { 731 | "cell_type": "code", 732 | "execution_count": 4, 733 | 
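Casting to `float16` here is what keeps the arrays manageable on disk and in RAM. A back-of-the-envelope size check (a sketch; the shapes are the ones printed in this notebook, and `gib` is a small helper introduced only for this estimate):

```python
import numpy as np

# Shapes as printed in this notebook's outputs.
train_shape = (2878, 30, 160, 160, 3)
test_shape = (720, 30, 160, 160, 3)

def gib(shape, dtype):
    """Array size in GiB for a given shape and dtype (helper for this estimate only)."""
    return np.prod(shape, dtype=np.int64) * np.dtype(dtype).itemsize / 2**30

print(f"training set, float16 : {gib(train_shape, np.float16):.1f} GiB")  # ~12.4 GiB
print(f"training set, float64 : {gib(train_shape, np.float64):.1f} GiB")  # ~49.4 GiB
print(f"test set, float16     : {gib(test_shape, np.float16):.1f} GiB")   # ~3.1 GiB
```

So the float16 cast cuts the training array from roughly 49 GiB to about 12 GiB, which is why it precedes `np.save`.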
"metadata": { 734 | "execution": { 735 | "iopub.execute_input": "2021-05-12T02:55:22.173140Z", 736 | "iopub.status.busy": "2021-05-12T02:55:22.165156Z", 737 | "iopub.status.idle": "2021-05-12T02:56:46.750762Z", 738 | "shell.execute_reply": "2021-05-12T02:56:46.735038Z", 739 | "shell.execute_reply.started": "2021-05-12T02:55:22.173140Z" 740 | }, 741 | "tags": [] 742 | }, 743 | "outputs": [], 744 | "source": [ 745 | "data_test_ar=np.array(data_test, dtype=np.float16) #> (720, 30, 160, 160, 3)" 746 | ] 747 | }, 748 | { 749 | "cell_type": "code", 750 | "execution_count": 5, 751 | "metadata": { 752 | "execution": { 753 | "iopub.execute_input": "2021-05-12T02:56:46.750762Z", 754 | "iopub.status.busy": "2021-05-12T02:56:46.750762Z", 755 | "iopub.status.idle": "2021-05-12T02:57:09.502823Z", 756 | "shell.execute_reply": "2021-05-12T02:57:09.502823Z", 757 | "shell.execute_reply.started": "2021-05-12T02:56:46.750762Z" 758 | }, 759 | "tags": [] 760 | }, 761 | "outputs": [], 762 | "source": [ 763 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/02_data_test_Numpy_210512.npy', data_test_ar)" 764 | ] 765 | }, 766 | { 767 | "cell_type": "code", 768 | "execution_count": 6, 769 | "metadata": { 770 | "execution": { 771 | "iopub.execute_input": "2021-05-12T02:57:09.502823Z", 772 | "iopub.status.busy": "2021-05-12T02:57:09.502823Z", 773 | "iopub.status.idle": "2021-05-12T02:57:09.535582Z", 774 | "shell.execute_reply": "2021-05-12T02:57:09.534624Z", 775 | "shell.execute_reply.started": "2021-05-12T02:57:09.502823Z" 776 | }, 777 | "tags": [] 778 | }, 779 | "outputs": [], 780 | "source": [ 781 | "label_test_ar=np.array(label_test) #> (720, 2)" 782 | ] 783 | }, 784 | { 785 | "cell_type": "code", 786 | "execution_count": 7, 787 | "metadata": { 788 | "execution": { 789 | "iopub.execute_input": "2021-05-12T02:57:09.537560Z", 790 | "iopub.status.busy": "2021-05-12T02:57:09.537560Z", 791 | "iopub.status.idle": "2021-05-12T02:57:09.585803Z", 792 | "shell.execute_reply": "2021-05-12T02:57:09.585803Z", 793 | "shell.execute_reply.started": "2021-05-12T02:57:09.537560Z" 794 | }, 795 | "tags": [] 796 | }, 797 | "outputs": [], 798 | "source": [ 799 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/02_label_test_Numpy_210512.npy', label_test_ar)" 800 | ] 801 | }, 802 | { 803 | "cell_type": "code", 804 | "execution_count": 8, 805 | "metadata": { 806 | "execution": { 807 | "iopub.execute_input": "2021-05-12T02:57:09.585803Z", 808 | "iopub.status.busy": "2021-05-12T02:57:09.585803Z", 809 | "iopub.status.idle": "2021-05-12T02:57:09.806466Z", 810 | "shell.execute_reply": "2021-05-12T02:57:09.806466Z", 811 | "shell.execute_reply.started": "2021-05-12T02:57:09.585803Z" 812 | }, 813 | "tags": [] 814 | }, 815 | "outputs": [ 816 | { 817 | "data": { 818 | "text/plain": [ 819 | "((720, 30, 160, 160, 3), (720, 2))" 820 | ] 821 | }, 822 | "execution_count": 8, 823 | "metadata": {}, 824 | "output_type": "execute_result" 825 | } 826 | ], 827 | "source": [ 828 | "data_test_ar.shape, label_test_ar.shape" 829 | ] 830 | } 831 | ], 832 | "metadata": { 833 | "kernelspec": { 834 | "display_name": "Python 3", 835 | "language": "python", 836 | "name": "python3" 837 | }, 838 | "language_info": { 839 | "codemirror_mode": { 840 | "name": "ipython", 841 | "version": 3 842 | }, 843 | "file_extension": ".py", 844 | "mimetype": "text/x-python", 845 | "name": "python", 846 | "nbconvert_exporter": "python", 847 | "pygments_lexer": "ipython3", 848 | "version": "3.8.5" 849 | } 850 | }, 851 | "nbformat": 4, 852 | "nbformat_minor": 4 853 | } 854 | 
-------------------------------------------------------------------------------- /ver_jupyter/03_MobileNet.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 03. Extract features from the dataset using MobileNet (a pre-trained model)\n", 8 | "* We will make **`X_train_reshaped_XXXXXX.npy`** and **`X_test_reshaped_XXXXXX.npy`** with this code.\n", 9 | "* **`Before running this file, please check these notebooks`**:\n", 10 | " * 01_video-to-numpy-save.ipynb\n", 11 | " * 02_create-numpy-datasets_training-test.ipynb\n", 12 | "* **`Do those files exist there?`** They were created by notebooks 01~02.\n", 13 | " * 02_data_training_Numpy_210512.npy\n", 14 | " * 02_data_test_Numpy_210512.npy" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "# Imports " 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "metadata": { 28 | "execution": { 29 | "iopub.execute_input": "2021-05-12T04:07:11.401915Z", 30 | "iopub.status.busy": "2021-05-12T04:07:11.401915Z", 31 | "iopub.status.idle": "2021-05-12T04:07:28.556267Z", 32 | "shell.execute_reply": "2021-05-12T04:07:28.556267Z", 33 | "shell.execute_reply.started": "2021-05-12T04:07:11.401915Z" 34 | }, 35 | "tags": [] 36 | }, 37 | "outputs": [], 38 | "source": [ 39 | "import numpy as np\n", 40 | "import os\n", 41 | "import tensorflow as tf\n", 42 | "from tensorflow import keras\n", 43 | "import random" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "# 03-A. Load dataset arrays (.npy) : training set & test set\n", 51 | "* shape : `(# of Video file, # of frame img, img height, img width, RGB)`" 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": 2, 57 | "metadata": { 58 | "execution": { 59 | "iopub.execute_input": "2021-05-12T04:07:28.556267Z", 60 | "iopub.status.busy": "2021-05-12T04:07:28.556267Z", 61 | "iopub.status.idle": "2021-05-12T04:09:21.213807Z", 62 | "shell.execute_reply": "2021-05-12T04:09:21.213807Z", 63 | "shell.execute_reply.started": "2021-05-12T04:07:28.556267Z" 64 | }, 65 | "tags": [] 66 | }, 67 | "outputs": [ 68 | { 69 | "data": { 70 | "text/plain": [ 71 | "(2878, 30, 160, 160, 3)" 72 | ] 73 | }, 74 | "execution_count": 2, 75 | "metadata": {}, 76 | "output_type": "execute_result" 77 | } 78 | ], 79 | "source": [ 80 | "data_training_ar=np.load('D:/datasets/AllVideo_numpy_list_pickle/02_data_training_Numpy_210512.npy') \n", 81 | "data_training_ar.shape #> (2878, 30, 160, 160, 3) " 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": 3, 87 | "metadata": { 88 | "execution": { 89 | "iopub.execute_input": "2021-05-12T04:09:21.231613Z", 90 | "iopub.status.busy": "2021-05-12T04:09:21.231613Z", 91 | "iopub.status.idle": "2021-05-12T04:09:51.413334Z", 92 | "shell.execute_reply": "2021-05-12T04:09:51.412336Z", 93 | "shell.execute_reply.started": "2021-05-12T04:09:21.231613Z" 94 | }, 95 | "tags": [] 96 | }, 97 | "outputs": [ 98 | { 99 | "data": { 100 | "text/plain": [ 101 | "(720, 30, 160, 160, 3)" 102 | ] 103 | }, 104 | "execution_count": 3, 105 | "metadata": {}, 106 | "output_type": "execute_result" 107 | } 108 | ], 109 | "source": [ 110 | "data_test_ar=np.load('D:/datasets/AllVideo_numpy_list_pickle/02_data_test_Numpy_210512.npy') \n", 111 | "data_test_ar.shape #> (720, 30, 160, 160, 3)" 112 | ] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": {}, 
117 | "source": [ 118 | "# 03-B. Reshape data\n", 119 | "* `(# of Video file, # of frame img, img height, img width, RGB)` to `((# of Video file)*(# of frame img), img height, img width, RGB)`" 120 | ] 121 | }, 122 | { 123 | "cell_type": "code", 124 | "execution_count": 4, 125 | "metadata": { 126 | "execution": { 127 | "iopub.execute_input": "2021-05-12T04:09:51.415358Z", 128 | "iopub.status.busy": "2021-05-12T04:09:51.415358Z", 129 | "iopub.status.idle": "2021-05-12T04:09:51.444252Z", 130 | "shell.execute_reply": "2021-05-12T04:09:51.444252Z", 131 | "shell.execute_reply.started": "2021-05-12T04:09:51.415358Z" 132 | }, 133 | "tags": [] 134 | }, 135 | "outputs": [ 136 | { 137 | "data": { 138 | "text/plain": [ 139 | "(86340, 160, 160, 3)" 140 | ] 141 | }, 142 | "execution_count": 4, 143 | "metadata": {}, 144 | "output_type": "execute_result" 145 | } 146 | ], 147 | "source": [ 148 | "data_training_ar=data_training_ar.reshape(data_training_ar.shape[0]*30, 160, 160, 3) \n", 149 | "data_training_ar.shape #> (86340, 160, 160, 3)" 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": 5, 155 | "metadata": { 156 | "execution": { 157 | "iopub.execute_input": "2021-05-12T04:09:51.458214Z", 158 | "iopub.status.busy": "2021-05-12T04:09:51.458214Z", 159 | "iopub.status.idle": "2021-05-12T04:09:51.475176Z", 160 | "shell.execute_reply": "2021-05-12T04:09:51.474177Z", 161 | "shell.execute_reply.started": "2021-05-12T04:09:51.458214Z" 162 | }, 163 | "tags": [] 164 | }, 165 | "outputs": [ 166 | { 167 | "data": { 168 | "text/plain": [ 169 | "(21600, 160, 160, 3)" 170 | ] 171 | }, 172 | "execution_count": 5, 173 | "metadata": {}, 174 | "output_type": "execute_result" 175 | } 176 | ], 177 | "source": [ 178 | "data_test_ar=data_test_ar.reshape(data_test_ar.shape[0]*30, 160, 160, 3) \n", 179 | "data_test_ar.shape #> (21600, 160, 160, 3)" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "# 03-C. 
Create Base Model : MobileNet" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": 6, 192 | "metadata": { 193 | "execution": { 194 | "iopub.execute_input": "2021-05-12T04:09:51.492476Z", 195 | "iopub.status.busy": "2021-05-12T04:09:51.489484Z", 196 | "iopub.status.idle": "2021-05-12T04:09:52.889151Z", 197 | "shell.execute_reply": "2021-05-12T04:09:52.889031Z", 198 | "shell.execute_reply.started": "2021-05-12T04:09:51.492476Z" 199 | }, 200 | "tags": [] 201 | }, 202 | "outputs": [], 203 | "source": [ 204 | "base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3),\n", 205 | " include_top=False,\n", 206 | " weights='imagenet', classes=2)" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": 7, 212 | "metadata": { 213 | "execution": { 214 | "iopub.execute_input": "2021-05-12T04:09:52.891179Z", 215 | "iopub.status.busy": "2021-05-12T04:09:52.891179Z", 216 | "iopub.status.idle": "2021-05-12T04:09:52.935587Z", 217 | "shell.execute_reply": "2021-05-12T04:09:52.935587Z", 218 | "shell.execute_reply.started": "2021-05-12T04:09:52.891179Z" 219 | }, 220 | "tags": [] 221 | }, 222 | "outputs": [ 223 | { 224 | "name": "stdout", 225 | "output_type": "stream", 226 | "text": [ 227 | "Model: \"mobilenet_1.00_160\"\n", 228 | "_________________________________________________________________\n", 229 | "Layer (type) Output Shape Param # \n", 230 | "=================================================================\n", 231 | "input_1 (InputLayer) [(None, 160, 160, 3)] 0 \n", 232 | "_________________________________________________________________\n", 233 | "conv1 (Conv2D) (None, 80, 80, 32) 864 \n", 234 | "_________________________________________________________________\n", 235 | "conv1_bn (BatchNormalization (None, 80, 80, 32) 128 \n", 236 | "_________________________________________________________________\n", 237 | "conv1_relu (ReLU) (None, 80, 80, 32) 0 \n", 238 | "_________________________________________________________________\n", 239 | "conv_dw_1 (DepthwiseConv2D) (None, 80, 80, 32) 288 \n", 240 | "_________________________________________________________________\n", 241 | "conv_dw_1_bn (BatchNormaliza (None, 80, 80, 32) 128 \n", 242 | "_________________________________________________________________\n", 243 | "conv_dw_1_relu (ReLU) (None, 80, 80, 32) 0 \n", 244 | "_________________________________________________________________\n", 245 | "conv_pw_1 (Conv2D) (None, 80, 80, 64) 2048 \n", 246 | "_________________________________________________________________\n", 247 | "conv_pw_1_bn (BatchNormaliza (None, 80, 80, 64) 256 \n", 248 | "_________________________________________________________________\n", 249 | "conv_pw_1_relu (ReLU) (None, 80, 80, 64) 0 \n", 250 | "_________________________________________________________________\n", 251 | "conv_pad_2 (ZeroPadding2D) (None, 81, 81, 64) 0 \n", 252 | "_________________________________________________________________\n", 253 | "conv_dw_2 (DepthwiseConv2D) (None, 40, 40, 64) 576 \n", 254 | "_________________________________________________________________\n", 255 | "conv_dw_2_bn (BatchNormaliza (None, 40, 40, 64) 256 \n", 256 | "_________________________________________________________________\n", 257 | "conv_dw_2_relu (ReLU) (None, 40, 40, 64) 0 \n", 258 | "_________________________________________________________________\n", 259 | "conv_pw_2 (Conv2D) (None, 40, 40, 128) 8192 \n", 260 | "_________________________________________________________________\n", 261 | "conv_pw_2_bn (BatchNormaliza 
(None, 40, 40, 128) 512 \n", 262 | "_________________________________________________________________\n", 263 | "conv_pw_2_relu (ReLU) (None, 40, 40, 128) 0 \n", 264 | "_________________________________________________________________\n", 265 | "conv_dw_3 (DepthwiseConv2D) (None, 40, 40, 128) 1152 \n", 266 | "_________________________________________________________________\n", 267 | "conv_dw_3_bn (BatchNormaliza (None, 40, 40, 128) 512 \n", 268 | "_________________________________________________________________\n", 269 | "conv_dw_3_relu (ReLU) (None, 40, 40, 128) 0 \n", 270 | "_________________________________________________________________\n", 271 | "conv_pw_3 (Conv2D) (None, 40, 40, 128) 16384 \n", 272 | "_________________________________________________________________\n", 273 | "conv_pw_3_bn (BatchNormaliza (None, 40, 40, 128) 512 \n", 274 | "_________________________________________________________________\n", 275 | "conv_pw_3_relu (ReLU) (None, 40, 40, 128) 0 \n", 276 | "_________________________________________________________________\n", 277 | "conv_pad_4 (ZeroPadding2D) (None, 41, 41, 128) 0 \n", 278 | "_________________________________________________________________\n", 279 | "conv_dw_4 (DepthwiseConv2D) (None, 20, 20, 128) 1152 \n", 280 | "_________________________________________________________________\n", 281 | "conv_dw_4_bn (BatchNormaliza (None, 20, 20, 128) 512 \n", 282 | "_________________________________________________________________\n", 283 | "conv_dw_4_relu (ReLU) (None, 20, 20, 128) 0 \n", 284 | "_________________________________________________________________\n", 285 | "conv_pw_4 (Conv2D) (None, 20, 20, 256) 32768 \n", 286 | "_________________________________________________________________\n", 287 | "conv_pw_4_bn (BatchNormaliza (None, 20, 20, 256) 1024 \n", 288 | "_________________________________________________________________\n", 289 | "conv_pw_4_relu (ReLU) (None, 20, 20, 256) 0 \n", 290 | "_________________________________________________________________\n", 291 | "conv_dw_5 (DepthwiseConv2D) (None, 20, 20, 256) 2304 \n", 292 | "_________________________________________________________________\n", 293 | "conv_dw_5_bn (BatchNormaliza (None, 20, 20, 256) 1024 \n", 294 | "_________________________________________________________________\n", 295 | "conv_dw_5_relu (ReLU) (None, 20, 20, 256) 0 \n", 296 | "_________________________________________________________________\n", 297 | "conv_pw_5 (Conv2D) (None, 20, 20, 256) 65536 \n", 298 | "_________________________________________________________________\n", 299 | "conv_pw_5_bn (BatchNormaliza (None, 20, 20, 256) 1024 \n", 300 | "_________________________________________________________________\n", 301 | "conv_pw_5_relu (ReLU) (None, 20, 20, 256) 0 \n", 302 | "_________________________________________________________________\n", 303 | "conv_pad_6 (ZeroPadding2D) (None, 21, 21, 256) 0 \n", 304 | "_________________________________________________________________\n", 305 | "conv_dw_6 (DepthwiseConv2D) (None, 10, 10, 256) 2304 \n", 306 | "_________________________________________________________________\n", 307 | "conv_dw_6_bn (BatchNormaliza (None, 10, 10, 256) 1024 \n", 308 | "_________________________________________________________________\n", 309 | "conv_dw_6_relu (ReLU) (None, 10, 10, 256) 0 \n", 310 | "_________________________________________________________________\n", 311 | "conv_pw_6 (Conv2D) (None, 10, 10, 512) 131072 \n", 312 | "_________________________________________________________________\n", 313 | 
"conv_pw_6_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 314 | "_________________________________________________________________\n", 315 | "conv_pw_6_relu (ReLU) (None, 10, 10, 512) 0 \n", 316 | "_________________________________________________________________\n", 317 | "conv_dw_7 (DepthwiseConv2D) (None, 10, 10, 512) 4608 \n", 318 | "_________________________________________________________________\n", 319 | "conv_dw_7_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 320 | "_________________________________________________________________\n", 321 | "conv_dw_7_relu (ReLU) (None, 10, 10, 512) 0 \n", 322 | "_________________________________________________________________\n", 323 | "conv_pw_7 (Conv2D) (None, 10, 10, 512) 262144 \n", 324 | "_________________________________________________________________\n", 325 | "conv_pw_7_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 326 | "_________________________________________________________________\n", 327 | "conv_pw_7_relu (ReLU) (None, 10, 10, 512) 0 \n", 328 | "_________________________________________________________________\n", 329 | "conv_dw_8 (DepthwiseConv2D) (None, 10, 10, 512) 4608 \n", 330 | "_________________________________________________________________\n", 331 | "conv_dw_8_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 332 | "_________________________________________________________________\n", 333 | "conv_dw_8_relu (ReLU) (None, 10, 10, 512) 0 \n", 334 | "_________________________________________________________________\n", 335 | "conv_pw_8 (Conv2D) (None, 10, 10, 512) 262144 \n", 336 | "_________________________________________________________________\n", 337 | "conv_pw_8_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 338 | "_________________________________________________________________\n", 339 | "conv_pw_8_relu (ReLU) (None, 10, 10, 512) 0 \n", 340 | "_________________________________________________________________\n", 341 | "conv_dw_9 (DepthwiseConv2D) (None, 10, 10, 512) 4608 \n", 342 | "_________________________________________________________________\n", 343 | "conv_dw_9_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 344 | "_________________________________________________________________\n", 345 | "conv_dw_9_relu (ReLU) (None, 10, 10, 512) 0 \n", 346 | "_________________________________________________________________\n", 347 | "conv_pw_9 (Conv2D) (None, 10, 10, 512) 262144 \n", 348 | "_________________________________________________________________\n", 349 | "conv_pw_9_bn (BatchNormaliza (None, 10, 10, 512) 2048 \n", 350 | "_________________________________________________________________\n", 351 | "conv_pw_9_relu (ReLU) (None, 10, 10, 512) 0 \n", 352 | "_________________________________________________________________\n", 353 | "conv_dw_10 (DepthwiseConv2D) (None, 10, 10, 512) 4608 \n", 354 | "_________________________________________________________________\n", 355 | "conv_dw_10_bn (BatchNormaliz (None, 10, 10, 512) 2048 \n", 356 | "_________________________________________________________________\n", 357 | "conv_dw_10_relu (ReLU) (None, 10, 10, 512) 0 \n", 358 | "_________________________________________________________________\n", 359 | "conv_pw_10 (Conv2D) (None, 10, 10, 512) 262144 \n", 360 | "_________________________________________________________________\n", 361 | "conv_pw_10_bn (BatchNormaliz (None, 10, 10, 512) 2048 \n", 362 | "_________________________________________________________________\n", 363 | "conv_pw_10_relu (ReLU) (None, 10, 10, 512) 0 \n", 364 | 
"_________________________________________________________________\n", 365 | "conv_dw_11 (DepthwiseConv2D) (None, 10, 10, 512) 4608 \n", 366 | "_________________________________________________________________\n", 367 | "conv_dw_11_bn (BatchNormaliz (None, 10, 10, 512) 2048 \n", 368 | "_________________________________________________________________\n", 369 | "conv_dw_11_relu (ReLU) (None, 10, 10, 512) 0 \n", 370 | "_________________________________________________________________\n", 371 | "conv_pw_11 (Conv2D) (None, 10, 10, 512) 262144 \n", 372 | "_________________________________________________________________\n", 373 | "conv_pw_11_bn (BatchNormaliz (None, 10, 10, 512) 2048 \n", 374 | "_________________________________________________________________\n", 375 | "conv_pw_11_relu (ReLU) (None, 10, 10, 512) 0 \n", 376 | "_________________________________________________________________\n", 377 | "conv_pad_12 (ZeroPadding2D) (None, 11, 11, 512) 0 \n", 378 | "_________________________________________________________________\n", 379 | "conv_dw_12 (DepthwiseConv2D) (None, 5, 5, 512) 4608 \n", 380 | "_________________________________________________________________\n", 381 | "conv_dw_12_bn (BatchNormaliz (None, 5, 5, 512) 2048 \n", 382 | "_________________________________________________________________\n", 383 | "conv_dw_12_relu (ReLU) (None, 5, 5, 512) 0 \n", 384 | "_________________________________________________________________\n", 385 | "conv_pw_12 (Conv2D) (None, 5, 5, 1024) 524288 \n", 386 | "_________________________________________________________________\n", 387 | "conv_pw_12_bn (BatchNormaliz (None, 5, 5, 1024) 4096 \n", 388 | "_________________________________________________________________\n", 389 | "conv_pw_12_relu (ReLU) (None, 5, 5, 1024) 0 \n", 390 | "_________________________________________________________________\n", 391 | "conv_dw_13 (DepthwiseConv2D) (None, 5, 5, 1024) 9216 \n", 392 | "_________________________________________________________________\n", 393 | "conv_dw_13_bn (BatchNormaliz (None, 5, 5, 1024) 4096 \n", 394 | "_________________________________________________________________\n", 395 | "conv_dw_13_relu (ReLU) (None, 5, 5, 1024) 0 \n", 396 | "_________________________________________________________________\n", 397 | "conv_pw_13 (Conv2D) (None, 5, 5, 1024) 1048576 \n", 398 | "_________________________________________________________________\n", 399 | "conv_pw_13_bn (BatchNormaliz (None, 5, 5, 1024) 4096 \n", 400 | "_________________________________________________________________\n", 401 | "conv_pw_13_relu (ReLU) (None, 5, 5, 1024) 0 \n", 402 | "=================================================================\n", 403 | "Total params: 3,228,864\n", 404 | "Trainable params: 3,206,976\n", 405 | "Non-trainable params: 21,888\n", 406 | "_________________________________________________________________\n" 407 | ] 408 | } 409 | ], 410 | "source": [ 411 | "base_model.summary() #> (None, 5, 5, 1024)" 412 | ] 413 | }, 414 | { 415 | "cell_type": "markdown", 416 | "metadata": {}, 417 | "source": [ 418 | "# 03-D. 
Predict(Extract features) : Insert datasets to MobileNet base_model\n", 419 | "* create **`X_train`**, **`X_test`**\n", 420 | "* **output shape** : **`((# of Video file)*(# of frame img), 5, 5, 1024)`**" 421 | ] 422 | }, 423 | { 424 | "cell_type": "code", 425 | "execution_count": 8, 426 | "metadata": { 427 | "execution": { 428 | "iopub.execute_input": "2021-05-12T04:09:52.936561Z", 429 | "iopub.status.busy": "2021-05-12T04:09:52.936561Z", 430 | "iopub.status.idle": "2021-05-12T04:09:52.951591Z", 431 | "shell.execute_reply": "2021-05-12T04:09:52.951591Z", 432 | "shell.execute_reply.started": "2021-05-12T04:09:52.936561Z" 433 | }, 434 | "tags": [] 435 | }, 436 | "outputs": [], 437 | "source": [ 438 | "np.random.seed(42)" 439 | ] 440 | }, 441 | { 442 | "cell_type": "code", 443 | "execution_count": 9, 444 | "metadata": { 445 | "execution": { 446 | "iopub.execute_input": "2021-05-12T04:09:52.955603Z", 447 | "iopub.status.busy": "2021-05-12T04:09:52.954558Z", 448 | "iopub.status.idle": "2021-05-12T04:22:48.277548Z", 449 | "shell.execute_reply": "2021-05-12T04:22:48.277548Z", 450 | "shell.execute_reply.started": "2021-05-12T04:09:52.955603Z" 451 | }, 452 | "tags": [] 453 | }, 454 | "outputs": [ 455 | { 456 | "data": { 457 | "text/plain": [ 458 | "(86340, 5, 5, 1024)" 459 | ] 460 | }, 461 | "execution_count": 9, 462 | "metadata": {}, 463 | "output_type": "execute_result" 464 | } 465 | ], 466 | "source": [ 467 | "X_train=base_model.predict(data_training_ar)\n", 468 | "X_train.shape #> (86340, 5, 5, 1024)" 469 | ] 470 | }, 471 | { 472 | "cell_type": "code", 473 | "execution_count": 10, 474 | "metadata": { 475 | "execution": { 476 | "iopub.execute_input": "2021-05-12T04:22:48.293142Z", 477 | "iopub.status.busy": "2021-05-12T04:22:48.293142Z", 478 | "iopub.status.idle": "2021-05-12T04:25:13.093066Z", 479 | "shell.execute_reply": "2021-05-12T04:25:13.090042Z", 480 | "shell.execute_reply.started": "2021-05-12T04:22:48.293142Z" 481 | }, 482 | "tags": [] 483 | }, 484 | "outputs": [ 485 | { 486 | "data": { 487 | "text/plain": [ 488 | "(21600, 5, 5, 1024)" 489 | ] 490 | }, 491 | "execution_count": 10, 492 | "metadata": {}, 493 | "output_type": "execute_result" 494 | } 495 | ], 496 | "source": [ 497 | "X_test=base_model.predict(data_test_ar)\n", 498 | "X_test.shape #> (21600, 5, 5, 1024)" 499 | ] 500 | }, 501 | { 502 | "cell_type": "markdown", 503 | "metadata": {}, 504 | "source": [ 505 | "# 03-E. 
Reshape predict result to insert LSTM\n", 506 | "* **Output Shape** : **`(# of Video File, # of frame img, 5*5*1024)`**" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": 13, 512 | "metadata": { 513 | "execution": { 514 | "iopub.execute_input": "2021-05-12T04:25:13.208724Z", 515 | "iopub.status.busy": "2021-05-12T04:25:13.208724Z", 516 | "iopub.status.idle": "2021-05-12T04:25:13.346427Z", 517 | "shell.execute_reply": "2021-05-12T04:25:13.346427Z", 518 | "shell.execute_reply.started": "2021-05-12T04:25:13.208724Z" 519 | }, 520 | "tags": [] 521 | }, 522 | "outputs": [], 523 | "source": [ 524 | "X_train_reshaped = X_train.reshape(int(X_train.shape[0]/30), 30, 5*5*1024) #> (2878, 30, 25600) ndarray" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": 14, 530 | "metadata": { 531 | "execution": { 532 | "iopub.execute_input": "2021-05-12T04:25:13.347390Z", 533 | "iopub.status.busy": "2021-05-12T04:25:13.347390Z", 534 | "iopub.status.idle": "2021-05-12T04:25:13.361565Z", 535 | "shell.execute_reply": "2021-05-12T04:25:13.361565Z", 536 | "shell.execute_reply.started": "2021-05-12T04:25:13.347390Z" 537 | }, 538 | "tags": [] 539 | }, 540 | "outputs": [], 541 | "source": [ 542 | "X_test_reshaped = X_test.reshape(int(X_test.shape[0]/30), 30, 5*5*1024) #> (720, 30, 25600) ndarray" 543 | ] 544 | }, 545 | { 546 | "cell_type": "code", 547 | "execution_count": 15, 548 | "metadata": { 549 | "execution": { 550 | "iopub.execute_input": "2021-05-12T04:25:13.365547Z", 551 | "iopub.status.busy": "2021-05-12T04:25:13.365547Z", 552 | "iopub.status.idle": "2021-05-12T04:25:13.377101Z", 553 | "shell.execute_reply": "2021-05-12T04:25:13.377101Z", 554 | "shell.execute_reply.started": "2021-05-12T04:25:13.365547Z" 555 | }, 556 | "tags": [] 557 | }, 558 | "outputs": [ 559 | { 560 | "data": { 561 | "text/plain": [ 562 | "((2878, 30, 25600), (720, 30, 25600))" 563 | ] 564 | }, 565 | "execution_count": 15, 566 | "metadata": {}, 567 | "output_type": "execute_result" 568 | } 569 | ], 570 | "source": [ 571 | "X_train_reshaped.shape, X_test_reshaped.shape" 572 | ] 573 | }, 574 | { 575 | "cell_type": "markdown", 576 | "metadata": {}, 577 | "source": [ 578 | "# 03-F. 
Save reshaped result file(.npy)\n", 579 | "* **`MobileNet_x_train_reshaped_210512.npy`**, **`MobileNet_x_test_reshaped_210512.npy`**" 580 | ] 581 | }, 582 | { 583 | "cell_type": "code", 584 | "execution_count": 16, 585 | "metadata": { 586 | "execution": { 587 | "iopub.execute_input": "2021-05-12T04:25:13.378099Z", 588 | "iopub.status.busy": "2021-05-12T04:25:13.378099Z", 589 | "iopub.status.idle": "2021-05-12T04:26:35.838146Z", 590 | "shell.execute_reply": "2021-05-12T04:26:35.838146Z", 591 | "shell.execute_reply.started": "2021-05-12T04:25:13.378099Z" 592 | }, 593 | "tags": [] 594 | }, 595 | "outputs": [], 596 | "source": [ 597 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/MobileNet_x_train_reshaped_210512.npy', X_train_reshaped)" 598 | ] 599 | }, 600 | { 601 | "cell_type": "code", 602 | "execution_count": 17, 603 | "metadata": { 604 | "execution": { 605 | "iopub.execute_input": "2021-05-12T04:26:35.838146Z", 606 | "iopub.status.busy": "2021-05-12T04:26:35.838146Z", 607 | "iopub.status.idle": "2021-05-12T04:26:50.779646Z", 608 | "shell.execute_reply": "2021-05-12T04:26:50.779646Z", 609 | "shell.execute_reply.started": "2021-05-12T04:26:35.838146Z" 610 | }, 611 | "tags": [] 612 | }, 613 | "outputs": [], 614 | "source": [ 615 | "np.save('D:/datasets/AllVideo_numpy_list_pickle/MobileNet_x_test_reshaped_210512.npy', X_test_reshaped)" 616 | ] 617 | } 618 | ], 619 | "metadata": { 620 | "kernelspec": { 621 | "display_name": "Python 3", 622 | "language": "python", 623 | "name": "python3" 624 | }, 625 | "language_info": { 626 | "codemirror_mode": { 627 | "name": "ipython", 628 | "version": 3 629 | }, 630 | "file_extension": ".py", 631 | "mimetype": "text/x-python", 632 | "name": "python", 633 | "nbconvert_exporter": "python", 634 | "pygments_lexer": "ipython3", 635 | "version": "3.8.5" 636 | } 637 | }, 638 | "nbformat": 4, 639 | "nbformat_minor": 4 640 | } 641 | -------------------------------------------------------------------------------- /ver_jupyter/05_Apply-model-to-Video.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 05. 
Actual Use : Violence detection for .mp4 video file\n", 8 | "* By using MobileNet base model & trained LSTM model, we can detect violent behavior in any video files(.avi, .mp4) and save output file.\n", 9 | "* **`Before run this file, Please check this`**:\n", 10 | " * 01_video-to-numpy-save.ipynb\n", 11 | " * 02_create-numpy-datasets_training-test.ipynb\n", 12 | " * 03_MobileNet.ipynb\n", 13 | " * 04_MobileNet_LSTM_model.ipynb\n", 14 | "* **`Are those files exist on there?`** Those files were made by 01~04_MobileNet.ipynb files.\n", 15 | " * Trained LSTM model : 210512_MobileNet_model_epoch100.h5" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "# Imports" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "metadata": { 29 | "execution": { 30 | "iopub.execute_input": "2021-05-11T02:24:41.707649Z", 31 | "iopub.status.busy": "2021-05-11T02:24:41.707649Z", 32 | "iopub.status.idle": "2021-05-11T02:25:05.586551Z", 33 | "shell.execute_reply": "2021-05-11T02:25:05.586551Z", 34 | "shell.execute_reply.started": "2021-05-11T02:24:41.707649Z" 35 | }, 36 | "tags": [] 37 | }, 38 | "outputs": [], 39 | "source": [ 40 | "import cv2 # openCV 4.5.1\n", 41 | "import numpy as np\n", 42 | "import os\n", 43 | "import tensorflow as tf\n", 44 | "from tensorflow import keras\n", 45 | "import time \n", 46 | "\n", 47 | "from skimage.io import imread\n", 48 | "from skimage.transform import resize \n", 49 | "from PIL import Image, ImageFont, ImageDraw # add caption by using custom font\n", 50 | "\n", 51 | "from collections import deque" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "# 05-A. Load Model Files\n", 59 | "* **`base_model`** : MobileNet\n", 60 | "* **`model`** : trained LSTM model file. `210512_MobileNet_model_epoch100.h5`" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## 1. base_model : MobileNet" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": { 74 | "execution": { 75 | "iopub.execute_input": "2021-05-11T02:25:05.587429Z", 76 | "iopub.status.busy": "2021-05-11T02:25:05.587429Z", 77 | "iopub.status.idle": "2021-05-11T02:25:06.200785Z", 78 | "shell.execute_reply": "2021-05-11T02:25:06.200785Z", 79 | "shell.execute_reply.started": "2021-05-11T02:25:05.587429Z" 80 | }, 81 | "tags": [] 82 | }, 83 | "outputs": [], 84 | "source": [ 85 | "base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3),\n", 86 | " include_top=False,\n", 87 | " weights='imagenet', classes=2)" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "## 2. model : trained LSTM model(.h5)" 95 | ] 96 | }, 97 | { 98 | "cell_type": "code", 99 | "execution_count": 3, 100 | "metadata": { 101 | "execution": { 102 | "iopub.execute_input": "2021-05-11T02:25:06.201803Z", 103 | "iopub.status.busy": "2021-05-11T02:25:06.201803Z", 104 | "iopub.status.idle": "2021-05-11T02:25:08.747439Z", 105 | "shell.execute_reply": "2021-05-11T02:25:08.746474Z", 106 | "shell.execute_reply.started": "2021-05-11T02:25:06.201803Z" 107 | }, 108 | "tags": [] 109 | }, 110 | "outputs": [], 111 | "source": [ 112 | "model=keras.models.load_model('210512_MobileNet_model_epoch100.h5')" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": [ 119 | "# 05-B. 
Define functions" 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": {}, 125 | "source": [ 126 | "## 1. Function : video_reader()\n", 127 | "* Load video file >> Scaling, Resizing >> Transform to Numpy array >> return Numpy array" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 4, 133 | "metadata": { 134 | "execution": { 135 | "iopub.execute_input": "2021-05-11T02:25:08.748467Z", 136 | "iopub.status.busy": "2021-05-11T02:25:08.748467Z", 137 | "iopub.status.idle": "2021-05-11T02:25:08.763424Z", 138 | "shell.execute_reply": "2021-05-11T02:25:08.762399Z", 139 | "shell.execute_reply.started": "2021-05-11T02:25:08.748467Z" 140 | }, 141 | "tags": [] 142 | }, 143 | "outputs": [], 144 | "source": [ 145 | "def video_reader(cv2, filename):\n", 146 | " \"\"\"Load 1 video file. Next, read each frame image and resize as (fps, 160, 160, 3) shape and return frame Numpy array.\"\"\"\n", 147 | " \n", 148 | " frames=np.zeros((30, 160, 160, 3), dtype=np.float) #> (fps, img size, img size, RGB)\n", 149 | " \n", 150 | " i=0\n", 151 | " print(frames.shape)\n", 152 | " vid=cv2.VideoCapture(filename) # read frame img from video file.\n", 153 | " \n", 154 | " if vid.isOpened():\n", 155 | " grabbed, frame=vid.read() \n", 156 | " else:\n", 157 | " grabbed=False\n", 158 | " \n", 159 | " frm=resize(frame,(160, 160, 3))\n", 160 | " frm=np.expand_dims(frm, axis=0)\n", 161 | " \n", 162 | " if(np.max(frm)>1):\n", 163 | " frm=frm/255.0 # Scaling\n", 164 | " frames[i][:]=frm\n", 165 | " i+=1\n", 166 | " print('Reading Video')\n", 167 | " \n", 168 | " while i<30:\n", 169 | " grabbed, frame=vid.read()\n", 170 | " frm=resize(frame, (160, 160, 3)) \n", 171 | " frm=np.expand_dims(frm, axis=0)\n", 172 | " if(np.max(frm)>1):\n", 173 | " frm=frm/255.0\n", 174 | " frames[i][:]=frm\n", 175 | " i+=1\n", 176 | " \n", 177 | " return frames" 178 | ] 179 | }, 180 | { 181 | "cell_type": "markdown", 182 | "metadata": {}, 183 | "source": [ 184 | "## 2. 
Function : create_pred_imgarr()\n", 185 | "* Extract features of each frame img by using base_model(MobileNet)\n", 186 | "* Reshape features Numpy array to insert LSTM model" 187 | ] 188 | }, 189 | { 190 | "cell_type": "code", 191 | "execution_count": 5, 192 | "metadata": { 193 | "execution": { 194 | "iopub.execute_input": "2021-05-11T02:25:08.764428Z", 195 | "iopub.status.busy": "2021-05-11T02:25:08.764428Z", 196 | "iopub.status.idle": "2021-05-11T02:25:08.779354Z", 197 | "shell.execute_reply": "2021-05-11T02:25:08.778373Z", 198 | "shell.execute_reply.started": "2021-05-11T02:25:08.764428Z" 199 | }, 200 | "tags": [] 201 | }, 202 | "outputs": [], 203 | "source": [ 204 | "def create_pred_imgarr(base_model, video_frm_ar):\n", 205 | " \"\"\"Insert base_model(MobileNet) and result of video_reader() function.\n", 206 | " This function extract features from each frame img by using base_model.\n", 207 | " And reshape Numpy array to insert LSTM model : (1, 30, 25600)\"\"\"\n", 208 | " video_frm_ar_dim=np.zeros((1, 30, 160, 160, 3), dtype=np.float)\n", 209 | " video_frm_ar_dim[0][:][:]=video_frm_ar #> (1, 30, 160, 160, 3)\n", 210 | " \n", 211 | " # Extract features from each frame img by using base_model(MobileNet)\n", 212 | " pred_imgarr=base_model.predict(video_frm_ar)\n", 213 | " # Reshape features array : (1, fps, 25600)\n", 214 | " pred_imgarr=pred_imgarr.reshape(1, pred_imgarr.shape[0], 5*5*1024)\n", 215 | " \n", 216 | " return pred_imgarr #> ex : (1, 30, 25600)" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": [ 223 | "## 3. Function : pred_fight()\n", 224 | "* Distinguish Violence(Fight) / Non-Violence(NonFight)\n", 225 | "* Insert reshaped-features-array to trained LSTM model" 226 | ] 227 | }, 228 | { 229 | "cell_type": "code", 230 | "execution_count": 6, 231 | "metadata": { 232 | "execution": { 233 | "iopub.execute_input": "2021-05-11T02:25:08.780350Z", 234 | "iopub.status.busy": "2021-05-11T02:25:08.780350Z", 235 | "iopub.status.idle": "2021-05-11T02:25:08.795311Z", 236 | "shell.execute_reply": "2021-05-11T02:25:08.794314Z", 237 | "shell.execute_reply.started": "2021-05-11T02:25:08.780350Z" 238 | }, 239 | "tags": [] 240 | }, 241 | "outputs": [], 242 | "source": [ 243 | "def pred_fight(model, pred_imgarr, acuracy=0.9):\n", 244 | " \"\"\"If accuracy>=input value(ex:0.9), return (Violence)'True'. else, return 'False'.\n", 245 | " ::model:: trained LSTM model (We already load .h5 file in the above.)\n", 246 | " ::pred_imgarr:: (1, 30, 25600) shaped Numpy array. Extracted features.\n", 247 | " ::accuracy:: default value is 0.9\"\"\"\n", 248 | "\n", 249 | " pred_test=model.predict(pred_imgarr) #> Violence(Fight) : [0,1]. Non-Violence(NonFight) : [1,0]\n", 250 | " \n", 251 | " if pred_test[0][1] >= acuracy:\n", 252 | " return True, pred_test[0][1] #> True, Probability of Violence\n", 253 | " \n", 254 | " else:\n", 255 | " return False, pred_test[0][1] #> False, Probability of Violence" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "## 4. 
Check above functions doing well" 263 | ] 264 | }, 265 | { 266 | "cell_type": "markdown", 267 | "metadata": {}, 268 | "source": [ 269 | "### 1) Load any video File" 270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": 7, 275 | "metadata": { 276 | "execution": { 277 | "iopub.execute_input": "2021-05-11T02:25:08.796307Z", 278 | "iopub.status.busy": "2021-05-11T02:25:08.796307Z", 279 | "iopub.status.idle": "2021-05-11T02:25:08.811273Z", 280 | "shell.execute_reply": "2021-05-11T02:25:08.810270Z", 281 | "shell.execute_reply.started": "2021-05-11T02:25:08.796307Z" 282 | }, 283 | "tags": [] 284 | }, 285 | "outputs": [], 286 | "source": [ 287 | "video_file='Fight_itwill_210506_01.mp4'" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "### 2) Check function's operation" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 8, 300 | "metadata": { 301 | "execution": { 302 | "iopub.execute_input": "2021-05-11T02:25:08.812265Z", 303 | "iopub.status.busy": "2021-05-11T02:25:08.812265Z", 304 | "iopub.status.idle": "2021-05-11T02:25:11.219381Z", 305 | "shell.execute_reply": "2021-05-11T02:25:11.219381Z", 306 | "shell.execute_reply.started": "2021-05-11T02:25:08.812265Z" 307 | }, 308 | "tags": [] 309 | }, 310 | "outputs": [ 311 | { 312 | "name": "stdout", 313 | "output_type": "stream", 314 | "text": [ 315 | "(30, 160, 160, 3)\n", 316 | "Reading Video\n" 317 | ] 318 | } 319 | ], 320 | "source": [ 321 | "video_frm_ar=video_reader(cv2, video_file)" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": 9, 327 | "metadata": { 328 | "execution": { 329 | "iopub.execute_input": "2021-05-11T02:25:11.220378Z", 330 | "iopub.status.busy": "2021-05-11T02:25:11.220378Z", 331 | "iopub.status.idle": "2021-05-11T02:25:11.885930Z", 332 | "shell.execute_reply": "2021-05-11T02:25:11.884932Z", 333 | "shell.execute_reply.started": "2021-05-11T02:25:11.220378Z" 334 | }, 335 | "tags": [] 336 | }, 337 | "outputs": [ 338 | { 339 | "data": { 340 | "text/plain": [ 341 | "(1, 30, 25600)" 342 | ] 343 | }, 344 | "execution_count": 9, 345 | "metadata": {}, 346 | "output_type": "execute_result" 347 | } 348 | ], 349 | "source": [ 350 | "pred_imgarr=create_pred_imgarr(base_model, video_frm_ar)\n", 351 | "pred_imgarr.shape" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": 10, 357 | "metadata": { 358 | "execution": { 359 | "iopub.execute_input": "2021-05-11T02:25:11.886929Z", 360 | "iopub.status.busy": "2021-05-11T02:25:11.886929Z", 361 | "iopub.status.idle": "2021-05-11T02:25:12.471363Z", 362 | "shell.execute_reply": "2021-05-11T02:25:12.471363Z", 363 | "shell.execute_reply.started": "2021-05-11T02:25:11.886929Z" 364 | }, 365 | "tags": [] 366 | }, 367 | "outputs": [ 368 | { 369 | "data": { 370 | "text/plain": [ 371 | "(True, 0.99869007)" 372 | ] 373 | }, 374 | "execution_count": 10, 375 | "metadata": {}, 376 | "output_type": "execute_result" 377 | } 378 | ], 379 | "source": [ 380 | "preds=pred_fight(model, pred_imgarr, 0.9)\n", 381 | "preds #> (Violence True or False, Probability of Violence)" 382 | ] 383 | }, 384 | { 385 | "cell_type": "markdown", 386 | "metadata": {}, 387 | "source": [ 388 | "# 05-C. 
Define all-in-one function\n", 389 | "* It contains `video_reader()`, `create_pred_imgarr()`, `pred_fight()` as all-in-one.\n", 390 | "* Input : 1 Video file\n", 391 | "* Output : Violence True or False / Probability of Violence" 392 | ] 393 | }, 394 | { 395 | "cell_type": "markdown", 396 | "metadata": {}, 397 | "source": [ 398 | "## 1. Define detect_violence()" 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "execution_count": 11, 404 | "metadata": { 405 | "execution": { 406 | "iopub.execute_input": "2021-05-11T02:25:12.472361Z", 407 | "iopub.status.busy": "2021-05-11T02:25:12.472361Z", 408 | "iopub.status.idle": "2021-05-11T02:25:12.486323Z", 409 | "shell.execute_reply": "2021-05-11T02:25:12.486323Z", 410 | "shell.execute_reply.started": "2021-05-11T02:25:12.472361Z" 411 | }, 412 | "tags": [] 413 | }, 414 | "outputs": [], 415 | "source": [ 416 | "def detect_violence(video):\n", 417 | " \"\"\" It contains video_reader(), create_pred_imgarr(), pred_fight() function as all-in-one.\n", 418 | " ::video:: video file (.mp4, .avi, ...)\n", 419 | " \n", 420 | " video_reader() : Read each frame img by using openCV. Resize Numpy array\n", 421 | " create_pred_imgarr() : Extract features from frame img array by using base model(MobileNet)\n", 422 | " pred_fight() : Decide Violence True or False by using trained LSTM model\"\"\"\n", 423 | " \n", 424 | " video_frm_ar=video_reader(cv2, video) \n", 425 | " pred_imgarr=create_pred_imgarr(base_model, video_frm_ar)\n", 426 | " \n", 427 | " time1=int(round(time.time()*1000))\n", 428 | "\n", 429 | " f, precent=pred_fight(model, pred_imgarr, acuracy=0.65)\n", 430 | " \n", 431 | " time2=int(round(time.time()*1000))\n", 432 | " \n", 433 | " result={'Violence': f, #> True(Violence), False(Non-Violence)\n", 434 | " 'Violence Estimation': str(precent), # Probability of Violence\n", 435 | " 'Processing Time' : str(time2-time1)} \n", 436 | " \n", 437 | " return result" 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "metadata": {}, 443 | "source": [ 444 | "## 2. Test function: detect_violence()" 445 | ] 446 | }, 447 | { 448 | "cell_type": "code", 449 | "execution_count": 12, 450 | "metadata": { 451 | "execution": { 452 | "iopub.execute_input": "2021-05-11T02:25:12.487321Z", 453 | "iopub.status.busy": "2021-05-11T02:25:12.487321Z", 454 | "iopub.status.idle": "2021-05-11T02:25:15.067789Z", 455 | "shell.execute_reply": "2021-05-11T02:25:15.067789Z", 456 | "shell.execute_reply.started": "2021-05-11T02:25:12.487321Z" 457 | }, 458 | "tags": [] 459 | }, 460 | "outputs": [ 461 | { 462 | "name": "stdout", 463 | "output_type": "stream", 464 | "text": [ 465 | "(30, 160, 160, 3)\n", 466 | "Reading Video\n" 467 | ] 468 | }, 469 | { 470 | "data": { 471 | "text/plain": [ 472 | "{'Violence': True,\n", 473 | " 'Violence Estimation': '0.99869007',\n", 474 | " 'Processing Time': '148'}" 475 | ] 476 | }, 477 | "execution_count": 12, 478 | "metadata": {}, 479 | "output_type": "execute_result" 480 | } 481 | ], 482 | "source": [ 483 | "video_file='Fight_itwill_210506_01.mp4'\n", 484 | "detect_violence(video_file)" 485 | ] 486 | }, 487 | { 488 | "cell_type": "markdown", 489 | "metadata": {}, 490 | "source": [ 491 | "# 05-D. Add caption & Save output video file\n", 492 | "* **`Add Captions on video file`**\n", 493 | " * Violence True or False\n", 494 | " * Probability of violence\n", 495 | "* **`View & Save output video`**" 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "## 1. 
Setting : Input path & Output path\n", 503 | "* **`input_path`** : input video file\n", 504 | "* **`output_path`** : You'll save output video file in output_path." 505 | ] 506 | }, 507 | { 508 | "cell_type": "code", 509 | "execution_count": 23, 510 | "metadata": { 511 | "execution": { 512 | "iopub.execute_input": "2021-05-11T04:21:59.807766Z", 513 | "iopub.status.busy": "2021-05-11T04:21:59.807766Z", 514 | "iopub.status.idle": "2021-05-11T04:21:59.820731Z", 515 | "shell.execute_reply": "2021-05-11T04:21:59.820731Z", 516 | "shell.execute_reply.started": "2021-05-11T04:21:59.807766Z" 517 | }, 518 | "tags": [] 519 | }, 520 | "outputs": [], 521 | "source": [ 522 | "input_path='Fight_itwill_210506_05.mp4'" 523 | ] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": null, 528 | "metadata": {}, 529 | "outputs": [], 530 | "source": [ 531 | "output_path=f'{input_path}+output.mp4'" 532 | ] 533 | }, 534 | { 535 | "cell_type": "markdown", 536 | "metadata": {}, 537 | "source": [ 538 | "## 2. Distinguish Violence True or False & Add caption on Video file" 539 | ] 540 | }, 541 | { 542 | "cell_type": "code", 543 | "execution_count": 24, 544 | "metadata": { 545 | "execution": { 546 | "iopub.execute_input": "2021-05-11T04:22:00.687564Z", 547 | "iopub.status.busy": "2021-05-11T04:22:00.686566Z", 548 | "iopub.status.idle": "2021-05-11T04:23:40.674141Z", 549 | "shell.execute_reply": "2021-05-11T04:23:40.674141Z", 550 | "shell.execute_reply.started": "2021-05-11T04:22:00.687564Z" 551 | }, 552 | "tags": [] 553 | }, 554 | "outputs": [ 555 | { 556 | "name": "stdout", 557 | "output_type": "stream", 558 | "text": [ 559 | "fps : 30.061038063436285\n", 560 | "preds:[[2.2413046e-04 9.9977583e-01]]\n", 561 | "Results = [[nan nan]]\n", 562 | "Maximum Probability : nan\n", 563 | "\n" 564 | ] 565 | }, 566 | { 567 | "name": "stderr", 568 | "output_type": "stream", 569 | "text": [ 570 | ":65: RuntimeWarning: Mean of empty slice.\n", 571 | " results=np.array(Q)[:i].mean(axis=0)\n" 572 | ] 573 | }, 574 | { 575 | "name": "stdout", 576 | "output_type": "stream", 577 | "text": [ 578 | "preds:[[0.00478598 0.995214 ]]\n", 579 | "Results = [[2.2413046e-04 9.9977583e-01]]\n", 580 | "Maximum Probability : 0.9997758269309998\n", 581 | "\n", 582 | "preds:[[7.171974e-04 9.992828e-01]]\n", 583 | "Results = [[0.00250505 0.99749494]]\n", 584 | "Maximum Probability : 0.9974949359893799\n", 585 | "\n", 586 | "preds:[[1.7574173e-04 9.9982435e-01]]\n", 587 | "Results = [[0.0019091 0.9980909]]\n", 588 | "Maximum Probability : 0.998090922832489\n", 589 | "\n", 590 | "preds:[[5.1806617e-04 9.9948198e-01]]\n", 591 | "Results = [[0.00147576 0.99852425]]\n", 592 | "Maximum Probability : 0.9985242486000061\n", 593 | "\n", 594 | "preds:[[0.00106501 0.9989349 ]]\n", 595 | "Results = [[0.00128422 0.99871576]]\n", 596 | "Maximum Probability : 0.9987157583236694\n", 597 | "\n", 598 | "preds:[[0.0015095 0.99849045]]\n", 599 | "Results = [[0.0014524 0.99854755]]\n", 600 | "Maximum Probability : 0.9985475540161133\n", 601 | "\n", 602 | "preds:[[4.4945168e-04 9.9955052e-01]]\n", 603 | "Results = [[7.9710456e-04 9.9920291e-01]]\n", 604 | "Maximum Probability : 0.9992029070854187\n", 605 | "\n", 606 | "preds:[[7.523823e-05 9.999248e-01]]\n", 607 | "Results = [[7.4355537e-04 9.9925643e-01]]\n", 608 | "Maximum Probability : 0.999256432056427\n", 609 | "\n", 610 | "preds:[[5.567938e-05 9.999443e-01]]\n", 611 | "Results = [[7.2345464e-04 9.9927652e-01]]\n", 612 | "Maximum Probability : 0.9992765188217163\n", 613 | "\n", 614 | "preds:[[5.4174510e-04 
9.9945825e-01]]\n", 615 | "Results = [[6.3097727e-04 9.9936897e-01]]\n", 616 | "Maximum Probability : 0.9993689656257629\n", 617 | "\n", 618 | "preds:[[1.6276044e-04 9.9983716e-01]]\n", 619 | "Results = [[5.2632357e-04 9.9947369e-01]]\n", 620 | "Maximum Probability : 0.9994736909866333\n", 621 | "\n", 622 | "preds:[[0.00466873 0.9953312 ]]\n", 623 | "Results = [[2.5697495e-04 9.9974310e-01]]\n", 624 | "Maximum Probability : 0.9997431039810181\n", 625 | "\n", 626 | "preds:[[0.00164434 0.9983557 ]]\n", 627 | "Results = [[0.00110083 0.99889916]]\n", 628 | "Maximum Probability : 0.9988991618156433\n", 629 | "\n", 630 | "preds:[[0.00338764 0.99661237]]\n", 631 | "Results = [[0.00141465 0.99858534]]\n", 632 | "Maximum Probability : 0.9985853433609009\n", 633 | "\n", 634 | "preds:[[0.00443199 0.995568 ]]\n", 635 | "Results = [[0.00208104 0.99791896]]\n", 636 | "Maximum Probability : 0.997918963432312\n", 637 | "\n", 638 | "preds:[[6.745212e-04 9.993255e-01]]\n", 639 | "Results = [[0.00285909 0.9971409 ]]\n", 640 | "Maximum Probability : 0.9971408843994141\n", 641 | "\n", 642 | "preds:[[0.02375322 0.9762467 ]]\n", 643 | "Results = [[0.00296145 0.99703854]]\n", 644 | "Maximum Probability : 0.9970385432243347\n", 645 | "\n", 646 | "preds:[[0.06798141 0.9320186 ]]\n", 647 | "Results = [[0.00677834 0.99322164]]\n", 648 | "Maximum Probability : 0.993221640586853\n", 649 | "\n", 650 | "preds:[[3.8083564e-04 9.9961913e-01]]\n", 651 | "Results = [[0.02004576 0.97995424]]\n", 652 | "Maximum Probability : 0.9799542427062988\n", 653 | "\n", 654 | "preds:[[5.607391e-04 9.994393e-01]]\n", 655 | "Results = [[0.01944439 0.98055565]]\n", 656 | "Maximum Probability : 0.9805556535720825\n", 657 | "\n", 658 | "preds:[[0.01694237 0.9830577 ]]\n", 659 | "Results = [[0.01867015 0.9813298 ]]\n", 660 | "Maximum Probability : 0.9813297986984253\n", 661 | "\n", 662 | "preds:[[0.00746863 0.99253136]]\n", 663 | "Results = [[0.02192372 0.9780763 ]]\n", 664 | "Maximum Probability : 0.9780762791633606\n", 665 | "\n", 666 | "프레임이 없습니다. 스트리밍을 종료합니다.\n", 667 | "종료 처리되었습니다. 메모리를 해제합니다.\n" 668 | ] 669 | } 670 | ], 671 | "source": [ 672 | "vid=cv2.VideoCapture(input_path)\n", 673 | "fps=vid.get(cv2.CAP_PROP_FPS) # recognize frames per secone(fps) of input_path video file.\n", 674 | "print(f'fps : {fps}') # print fps.\n", 675 | "\n", 676 | "writer=None\n", 677 | "(W, H)=(None, None)\n", 678 | "i=0 # number of seconds in video = The number of times that how many operated while loop .\n", 679 | "Q=deque(maxlen=128) \n", 680 | "\n", 681 | "video_frm_ar=np.zeros((1, int(fps), 160, 160, 3), dtype=np.float) #frames\n", 682 | "frame_counter=0 # frame number in 1 second. 1~30\n", 683 | "frame_list=[] \n", 684 | "preds=None\n", 685 | "maxprob=None\n", 686 | "\n", 687 | "#. While loop : Until the end of input video, it read frame, extract features, predict violence True or False.\n", 688 | "# ----- Reshape & Save frame img as (30, 160, 160, 3) Numpy array -----\n", 689 | "while True: \n", 690 | " frame_counter+=1\n", 691 | " grabbed, frm=vid.read() # read each frame img. grabbed=True, frm=frm img. ex: (240, 320, 3)\n", 692 | " \n", 693 | " if not grabbed:\n", 694 | " print('There is no frame. 
Streaming ends.')\n", 695 | " break\n", 696 | "\n", 697 | " if W is None or H is None: # W: width, H: height of frame img\n", 698 | " (H, W)=frm.shape[:2]\n", 699 | " \n", 700 | " output=frm.copy() # It is necessary for streaming captioned output video, and to save that.\n", 701 | " \n", 702 | " frame=resize(frm, (160, 160, 3)) #> Resize frame img array to (160, 160, 3)\n", 703 | " frame_list.append(frame) # Append each frame img Numpy array : element is (160, 160, 3) Numpy array.\n", 704 | " \n", 705 | " if frame_counter>=fps: # fps=30 et al\n", 706 | " #. ----- we'll predict violence True or False every 30 frame -----\n", 707 | " #. ----- Insert (1, 30, 160, 160, 3) Numpy array to LSTM model ---\n", 708 | " #. ----- We'll renew predict result caption on output video every 1 second. -----\n", 709 | " # 30-element-appended list -> Transform to Numpy array -> Predict -> Initialize list (repeat)\n", 710 | " frame_ar=np.array(frame_list, dtype=np.float16) #> (30, 160, 160, 3)\n", 711 | " frame_list=[] # Initialize frame list when frame_counter is same or exceed 30, after transforming to Numpy array.\n", 712 | " \n", 713 | " if(np.max(frame_ar)>1): \n", 714 | " frame_ar=frame_ar/255.0 # Scaling RGB value in Numpy array\n", 715 | " \n", 716 | " pred_imgarr=base_model.predict(frame_ar) #> Extract features from each frame img by using MobileNet. (30, 5, 5, 1024)\n", 717 | " pred_imgarr_dim=pred_imgarr.reshape(1, pred_imgarr.shape[0], 5*5*1024)#> (1, 30, 25600)\n", 718 | "\n", 719 | " preds=model.predict(pred_imgarr_dim) #> (True, 0.99) : (Violence True or False, Probability of Violence)\n", 720 | " print(f'preds:{preds}')\n", 721 | " Q.append(preds)\n", 722 | " \n", 723 | " # Predict Result : Average of Violence probability in last 5 second\n", 724 | " if i<5:\n", 725 | " results=np.array(Q)[:i].mean(axis=0)\n", 726 | " else:\n", 727 | " results=np.array(Q)[(i-5):i].mean(axis=0)\n", 728 | " \n", 729 | " print(f'Results = {results}') #> ex : (0.6, 0.650)\n", 730 | " \n", 731 | " maxprob=np.max(results) #> Select Maximum Probability\n", 732 | " print(f'Maximum Probability : {maxprob}')\n", 733 | " print('')\n", 734 | " \n", 735 | " rest=1-maxprob # Probability of Non-Violence\n", 736 | " diff=maxprob-rest # Difference between Probability of Violence and Non-Violence's\n", 737 | " th=100\n", 738 | " \n", 739 | " if diff>0.80: \n", 740 | " th=diff # ?? 
What is the supporting basis for this threshold change?\n",
741 |     "        \n",
742 |     "        frame_counter=0 #> Initialize frame_counter to 0\n",
743 |     "        i+=1 #> 1 second elapsed\n",
744 |     "        \n",
745 |     "        # When frame_counter>=30, initialize frame_counter to 0 and repeat the while loop above.\n",
746 |     "    \n",
747 |     "    # ----- Set caption options of the output video -----\n",
748 |     "    # A renewed caption is added every 30 frames (if fps=30, that is every 1 second).\n",
749 |     "    font1=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', int(0.05*W)) # font option\n",
750 |     "    font2=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', int(0.1*W)) # font option\n",
751 |     "    \n",
752 |     "    if preds is not None and maxprob is not None:\n",
753 |     "        if (preds[0][1]) < th: #> if violence probability < th, Violence=False (Normal, Green Caption)\n",
754 |     "            text1_1='Normal'\n",
755 |     "            text1_2='{:.2f}%'.format(100-(maxprob*100))\n",
756 |     "            img_pil=Image.fromarray(output)\n",
757 |     "            draw=ImageDraw.Draw(img_pil)\n",
758 |     "            draw.text((int(0.025*W), int(0.025*H)), text1_1, font=font1, fill=(0, 255, 0, 0))\n",
759 |     "            draw.text((int(0.025*W), int(0.095*H)), text1_2, font=font2, fill=(0, 255, 0, 0))\n",
760 |     "            output=np.array(img_pil)\n",
761 |     "            \n",
762 |     "        else : #> if violence probability > th, Violence=True (Violence Alert!, Red Caption)\n",
763 |     "            text2_1='Violence Alert!'\n",
764 |     "            text2_2='{:.2f}%'.format(maxprob*100)\n",
765 |     "            img_pil=Image.fromarray(output)\n",
766 |     "            draw=ImageDraw.Draw(img_pil)\n",
767 |     "            draw.text((int(0.025*W), int(0.025*H)), text2_1, font=font1, fill=(0, 0, 255, 0))\n",
768 |     "            draw.text((int(0.025*W), int(0.095*H)), text2_2, font=font2, fill=(0, 0, 255, 0))\n",
769 |     "            output=np.array(img_pil)\n",
770 |     "    \n",
771 |     "    # Save the captioned video file by using 'writer'\n",
772 |     "    if writer is None:\n",
773 |     "        fourcc=cv2.VideoWriter_fourcc(*'DIVX')\n",
774 |     "        writer=cv2.VideoWriter(output_path, fourcc, 30, (W, H), True)\n",
775 |     "    \n",
776 |     "    cv2.imshow('This is output', output) # View output in a new window.\n",
777 |     "    writer.write(output) # Save output in output_path\n",
778 |     "    \n",
779 |     "    key=cv2.waitKey(round(1000/fps)) # time gap between one frame and the next\n",
780 |     "    if key==27: # If you press the ESC key, the while loop will break and the output file will be saved.\n",
781 |     "        print('ESC is pressed. Video recording ends.')\n",
782 |     "        break\n",
783 |     "    \n",
784 |     "print('Video recording ends. Release Memory.') # Output file will be saved.\n",
785 |     "writer.release()\n",
786 |     "vid.release()\n",
787 |     "cv2.destroyAllWindows()"
788 |    ]
789 |   }
790 |  ],
791 |  "metadata": {
792 |   "kernelspec": {
793 |    "display_name": "Python 3",
794 |    "language": "python",
795 |    "name": "python3"
796 |   },
797 |   "language_info": {
798 |    "codemirror_mode": {
799 |     "name": "ipython",
800 |     "version": 3
801 |    },
802 |    "file_extension": ".py",
803 |    "mimetype": "text/x-python",
804 |    "name": "python",
805 |    "nbconvert_exporter": "python",
806 |    "pygments_lexer": "ipython3",
807 |    "version": "3.8.5"
808 |   }
809 |  },
810 |  "nbformat": 4,
811 |  "nbformat_minor": 4
812 | }
813 | 
--------------------------------------------------------------------------------
/ver_jupyter/06_labtop_webcam_streaming.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "markdown",
5 |    "metadata": {},
6 |    "source": [
7 |     "# 06. 
Actual Use : Violence detection for laptop webcam streaming\n", 8 | "* By using MobileNet base model & trained LSTM model, we can detect violent behavior of streaming video(laptop webcam)\n", 9 | "* **`Before run this file, Please check this`**:\n", 10 | " * 01_video-to-numpy-save.ipynb\n", 11 | " * 02_create-numpy-datasets_training-test.ipynb\n", 12 | " * 03_MobileNet.ipynb\n", 13 | " * 04_MobileNet_LSTM_model.ipynb\n", 14 | "* **`Are those files exist on there?`** Those files were made by 01~04_MobileNet.ipynb files.\n", 15 | " * Trained LSTM model : 210512_MobileNet_model_epoch100.h5" 16 | ] 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": [ 22 | "# Imports" 23 | ] 24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 1, 28 | "metadata": { 29 | "execution": { 30 | "iopub.execute_input": "2021-05-10T15:10:53.236116Z", 31 | "iopub.status.busy": "2021-05-10T15:10:53.236116Z", 32 | "iopub.status.idle": "2021-05-10T15:11:20.320255Z", 33 | "shell.execute_reply": "2021-05-10T15:11:20.314374Z", 34 | "shell.execute_reply.started": "2021-05-10T15:10:53.236116Z" 35 | }, 36 | "tags": [] 37 | }, 38 | "outputs": [], 39 | "source": [ 40 | "import cv2 # openCV 4.5.1\n", 41 | "import numpy as np\n", 42 | "import os\n", 43 | "import tensorflow as tf\n", 44 | "from tensorflow import keras\n", 45 | "import time \n", 46 | "\n", 47 | "from skimage.io import imread\n", 48 | "from skimage.transform import resize \n", 49 | "from PIL import Image, ImageFont, ImageDraw # add caption by using custom font\n", 50 | "\n", 51 | "from collections import deque" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "# 06-A. Load Model Files\n", 59 | "* **`base_model`** : MobileNet\n", 60 | "* **`model`** : trained LSTM model file. `210512_MobileNet_model_epoch100.h5`" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "## 1. base_model : MobileNet" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 2, 73 | "metadata": { 74 | "execution": { 75 | "iopub.execute_input": "2021-05-10T15:11:20.323246Z", 76 | "iopub.status.busy": "2021-05-10T15:11:20.323246Z", 77 | "iopub.status.idle": "2021-05-10T15:11:26.430709Z", 78 | "shell.execute_reply": "2021-05-10T15:11:26.428302Z", 79 | "shell.execute_reply.started": "2021-05-10T15:11:20.323246Z" 80 | }, 81 | "tags": [] 82 | }, 83 | "outputs": [ 84 | { 85 | "name": "stdout", 86 | "output_type": "stream", 87 | "text": [ 88 | "Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_160_tf_no_top.h5\n", 89 | "17227776/17225924 [==============================] - 2s 0us/step\n" 90 | ] 91 | } 92 | ], 93 | "source": [ 94 | "base_model=keras.applications.mobilenet.MobileNet(input_shape=(160, 160, 3),\n", 95 | " include_top=False,\n", 96 | " weights='imagenet', classes=2)" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": {}, 102 | "source": [ 103 | "## 2. 
model : trained LSTM model(.h5)" 104 | ] 105 | }, 106 | { 107 | "cell_type": "code", 108 | "execution_count": 3, 109 | "metadata": { 110 | "execution": { 111 | "iopub.execute_input": "2021-05-10T15:11:26.438541Z", 112 | "iopub.status.busy": "2021-05-10T15:11:26.437543Z", 113 | "iopub.status.idle": "2021-05-10T15:11:30.997695Z", 114 | "shell.execute_reply": "2021-05-10T15:11:30.996809Z", 115 | "shell.execute_reply.started": "2021-05-10T15:11:26.438541Z" 116 | }, 117 | "tags": [] 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "model=keras.models.load_model('210512_MobileNet_model_epoch100.h5')" 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "# 06-B. Add caption to streaming screen & Save output video file" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "## 1. Setting : Input path & Output path\n", 136 | "* **`input_path`** : input laptop webcam\n", 137 | "* **`output_path`** : You'll save output video file in output_path." 138 | ] 139 | }, 140 | { 141 | "cell_type": "code", 142 | "execution_count": 4, 143 | "metadata": { 144 | "execution": { 145 | "iopub.execute_input": "2021-05-10T15:11:31.006670Z", 146 | "iopub.status.busy": "2021-05-10T15:11:31.005672Z", 147 | "iopub.status.idle": "2021-05-10T15:11:31.030604Z", 148 | "shell.execute_reply": "2021-05-10T15:11:31.027613Z", 149 | "shell.execute_reply.started": "2021-05-10T15:11:31.006670Z" 150 | }, 151 | "tags": [] 152 | }, 153 | "outputs": [], 154 | "source": [ 155 | "input_path=0" 156 | ] 157 | }, 158 | { 159 | "cell_type": "code", 160 | "execution_count": null, 161 | "metadata": {}, 162 | "outputs": [], 163 | "source": [ 164 | "output_path='output04.mp4'" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "## 2. 
Distinguish Violence True or False & Add caption on Video file" 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": 7, 177 | "metadata": { 178 | "execution": { 179 | "iopub.execute_input": "2021-05-10T15:14:45.397034Z", 180 | "iopub.status.busy": "2021-05-10T15:14:45.397034Z", 181 | "iopub.status.idle": "2021-05-10T15:15:54.929754Z", 182 | "shell.execute_reply": "2021-05-10T15:15:54.927760Z", 183 | "shell.execute_reply.started": "2021-05-10T15:14:45.397034Z" 184 | }, 185 | "tags": [] 186 | }, 187 | "outputs": [ 188 | { 189 | "name": "stdout", 190 | "output_type": "stream", 191 | "text": [ 192 | "fps : 30.0\n", 193 | "preds:[[0.00126779 0.99873227]]\n", 194 | "Results = [[nan nan]]\n", 195 | "Maximum Probability : nan\n", 196 | "\n" 197 | ] 198 | }, 199 | { 200 | "name": "stderr", 201 | "output_type": "stream", 202 | "text": [ 203 | ":65: RuntimeWarning: Mean of empty slice.\n", 204 | " results=np.array(Q)[:i].mean(axis=0)\n" 205 | ] 206 | }, 207 | { 208 | "name": "stdout", 209 | "output_type": "stream", 210 | "text": [ 211 | "preds:[[0.00121148 0.99878854]]\n", 212 | "Results = [[0.00126779 0.99873227]]\n", 213 | "Maximum Probability : 0.9987322688102722\n", 214 | "\n", 215 | "preds:[[0.01842158 0.98157835]]\n", 216 | "Results = [[0.00123964 0.9987604 ]]\n", 217 | "Maximum Probability : 0.9987604022026062\n", 218 | "\n", 219 | "preds:[[0.00956469 0.99043536]]\n", 220 | "Results = [[0.00696695 0.9930331 ]]\n", 221 | "Maximum Probability : 0.9930331110954285\n", 222 | "\n", 223 | "preds:[[0.00848091 0.99151903]]\n", 224 | "Results = [[0.00761638 0.99238366]]\n", 225 | "Maximum Probability : 0.9923836588859558\n", 226 | "\n", 227 | "preds:[[0.00283711 0.9971629 ]]\n", 228 | "Results = [[0.00778929 0.99221075]]\n", 229 | "Maximum Probability : 0.9922107458114624\n", 230 | "\n", 231 | "preds:[[0.00244105 0.99755895]]\n", 232 | "Results = [[0.00810315 0.9918968 ]]\n", 233 | "Maximum Probability : 0.9918968081474304\n", 234 | "\n", 235 | "preds:[[0.01611286 0.98388714]]\n", 236 | "Results = [[0.00834907 0.9916509 ]]\n", 237 | "Maximum Probability : 0.9916508793830872\n", 238 | "\n", 239 | "preds:[[0.0139117 0.98608834]]\n", 240 | "Results = [[0.00788732 0.99211264]]\n", 241 | "Maximum Probability : 0.9921126365661621\n", 242 | "\n", 243 | "ESC키를 눌렀습니다. 녹화를 종료합니다.\n", 244 | "종료 처리되었습니다. 메모리를 해제합니다.\n" 245 | ] 246 | } 247 | ], 248 | "source": [ 249 | "vid=cv2.VideoCapture(input_path)\n", 250 | "fps=vid.get(cv2.CAP_PROP_FPS) # recognize frames per secone(fps) of input_path video file.\n", 251 | "print(f'fps : {fps}') # print fps.\n", 252 | "\n", 253 | "writer=None\n", 254 | "(W, H)=(None, None)\n", 255 | "i=0 # number of seconds in video = The number of times that how many operated while loop .\n", 256 | "Q=deque(maxlen=128) \n", 257 | "\n", 258 | "video_frm_ar=np.zeros((1, int(fps), 160, 160, 3), dtype=np.float) #frames\n", 259 | "frame_counter=0 # frame number in 1 second. 1~30\n", 260 | "frame_list=[] \n", 261 | "preds=None\n", 262 | "maxprob=None\n", 263 | "\n", 264 | "#. While loop : Until the end of input video, it read frame, extract features, predict violence True or False.\n", 265 | "# ----- Reshape & Save frame img as (30, 160, 160, 3) Numpy array -----\n", 266 | "while True: \n", 267 | " frame_counter+=1\n", 268 | " grabbed, frm=vid.read() # read each frame img. grabbed=True, frm=frm img. ex: (240, 320, 3)\n", 269 | " \n", 270 | " if not grabbed:\n", 271 | " print('There is no frame. 
Streaming ends.')\n", 272 | " break\n", 273 | " \n", 274 | " if fps!=30: \n", 275 | " print('Please set fps=30')\n", 276 | " break\n", 277 | " \n", 278 | " if W is None or H is None: # W: width, H: height of frame img\n", 279 | " (H, W)=frm.shape[:2]\n", 280 | " \n", 281 | " output=frm.copy() # It is necessary for streaming captioned output video, and to save that.\n", 282 | " \n", 283 | " frame=resize(frm, (160, 160, 3)) #> Resize frame img array to (160, 160, 3)\n", 284 | " frame_list.append(frame) # Append each frame img Numpy array : element is (160, 160, 3) Numpy array.\n", 285 | " \n", 286 | " if frame_counter>=fps: # fps=30 et al\n", 287 | " #. ----- we'll predict violence True or False every 30 frame -----\n", 288 | " #. ----- Insert (1, 30, 160, 160, 3) Numpy array to LSTM model ---\n", 289 | " #. ----- We'll renew predict result caption on output video every 1 second. -----\n", 290 | " # 30-element-appended list -> Transform to Numpy array -> Predict -> Initialize list (repeat)\n", 291 | " frame_ar=np.array(frame_list, dtype=np.float16) #> (30, 160, 160, 3)\n", 292 | " frame_list=[] # Initialize frame list when frame_counter is same or exceed 30, after transforming to Numpy array.\n", 293 | " \n", 294 | " if(np.max(frame_ar)>1): # Scaling RGB value in Numpy array\n", 295 | " frame_ar=frame_ar/255.0\n", 296 | " \n", 297 | " pred_imgarr=base_model.predict(frame_ar) #> Extract features from each frame img by using MobileNet. (30, 5, 5, 1024)\n", 298 | " pred_imgarr_dim=pred_imgarr.reshape(1, pred_imgarr.shape[0], 5*5*1024)#> (1, 30, 25600)\n", 299 | " \n", 300 | " preds=model.predict(pred_imgarr_dim) #> (True, 0.99) : (Violence True or False, Probability of Violence)\n", 301 | " print(f'preds:{preds}')\n", 302 | " Q.append(preds) #> Deque Q\n", 303 | " \n", 304 | " # Predict Result : Average of Violence probability in last 5 second\n", 305 | " if i<5:\n", 306 | " results=np.array(Q)[:i].mean(axis=0)\n", 307 | " else:\n", 308 | " results=np.array(Q)[(i-5):i].mean(axis=0)\n", 309 | " \n", 310 | " print(f'Results = {results}') #> ex : (0.6, 0.650)\n", 311 | " \n", 312 | " maxprob=np.max(results) #> Select Maximum Probability\n", 313 | " print(f'Maximum Probability : {maxprob}')\n", 314 | " print('')\n", 315 | " \n", 316 | " rest=1-maxprob # Probability of Non-Violence\n", 317 | " diff=maxprob-rest # Difference between Probability of Violence and Non-Violence's\n", 318 | " th=100\n", 319 | " \n", 320 | " if diff>0.80:\n", 321 | " th=diff # ?? 
What is the supporting basis for this threshold change?\n",
322 |     "        \n",
323 |     "        frame_counter=0 #> Initialize frame_counter to 0\n",
324 |     "        i+=1 #> 1 second elapsed\n",
325 |     "        \n",
326 |     "        # When frame_counter>=30, initialize frame_counter to 0 and repeat the while loop above.\n",
327 |     "    \n",
328 |     "    # ----- Set caption options of the output video -----\n",
329 |     "    # A renewed caption is added every 30 frames (if fps=30, that is every 1 second).\n",
330 |     "    font1=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', 24) # font option\n",
331 |     "    font2=ImageFont.truetype('fonts/Raleway-ExtraBold.ttf', 48) # font option\n",
332 |     "    \n",
333 |     "    if preds is not None and maxprob is not None:\n",
334 |     "        if (preds[0][1]) < th: #> if violence probability < th, Violence=False (Normal, Green Caption)\n",
335 |     "            text1_1='Normal'\n",
336 |     "            text1_2='{:.2f}%'.format(100-(maxprob*100))\n",
337 |     "            img_pil=Image.fromarray(output)\n",
338 |     "            draw=ImageDraw.Draw(img_pil)\n",
339 |     "            draw.text((int(0.025*W), int(0.025*H)), text1_1, font=font1, fill=(0,255,0,0))\n",
340 |     "            draw.text((int(0.025*W), int(0.095*H)), text1_2, font=font2, fill=(0,255,0,0))\n",
341 |     "            output=np.array(img_pil)\n",
342 |     "            \n",
343 |     "        else : #> if violence probability > th, Violence=True (Violence Alert!, Red Caption)\n",
344 |     "            text2_1='Violence Alert!'\n",
345 |     "            text2_2='{:.2f}%'.format(maxprob*100)\n",
346 |     "            img_pil=Image.fromarray(output)\n",
347 |     "            draw=ImageDraw.Draw(img_pil)\n",
348 |     "            draw.text((int(0.025*W), int(0.025*H)), text2_1, font=font1, fill=(0,0,255,0))\n",
349 |     "            draw.text((int(0.025*W), int(0.095*H)), text2_2, font=font2, fill=(0,0,255,0))\n",
350 |     "            output=np.array(img_pil) \n",
351 |     "    \n",
352 |     "    # Save the captioned video file by using 'writer'\n",
353 |     "    if writer is None:\n",
354 |     "        fourcc=cv2.VideoWriter_fourcc(*'DIVX')\n",
355 |     "        writer=cv2.VideoWriter(output_path, fourcc, 30, (W, H), True)\n",
356 |     "    \n",
357 |     "    cv2.imshow('This is output', output) # View output in a new window.\n",
358 |     "    writer.write(output) # Save output in output_path\n",
359 |     "    \n",
360 |     "    key=cv2.waitKey(round(1000/fps)) # time gap between one frame and the next\n",
361 |     "    if key==27: # If you press the ESC key, the while loop will break and the output file will be saved.\n",
362 |     "        print('ESC is pressed. Video recording ends.')\n",
363 |     "        break\n",
364 |     "    \n",
365 |     "print('Video recording ends. 
Release Memory.') #Output file will be saved.\n", 366 | "writer.release()\n", 367 | "vid.release()\n", 368 | "cv2.destroyAllWindows()" 369 | ] 370 | } 371 | ], 372 | "metadata": { 373 | "kernelspec": { 374 | "display_name": "Python 3", 375 | "language": "python", 376 | "name": "python3" 377 | }, 378 | "language_info": { 379 | "codemirror_mode": { 380 | "name": "ipython", 381 | "version": 3 382 | }, 383 | "file_extension": ".py", 384 | "mimetype": "text/x-python", 385 | "name": "python", 386 | "nbconvert_exporter": "python", 387 | "pygments_lexer": "ipython3", 388 | "version": "3.8.5" 389 | } 390 | }, 391 | "nbformat": 4, 392 | "nbformat_minor": 4 393 | } 394 | -------------------------------------------------------------------------------- /ver_jupyter/README.md: -------------------------------------------------------------------------------- 1 | # 21.05.13 CODE Updated 2 | 3 | ## Usage of each .ipynb file 4 | * 01 : Video file data preprocessing 5 | * 02 : Making Training set & Test set 6 | * 03 : Extracting features from video files by using MobileNet(pre-trained model) 7 | * 04 : Training LSTM model & Evaluation 8 | * 05 : Actual use for video file(.mp4, .avi) 9 | * 06 : Actual use for laptop webcam streaming(Realtime) 10 | 11 | ## Our LSTM Model file(.h5) 12 | * **Download Link** : https://drive.google.com/file/d/1z4UKAkhyItFCJ7aue3FNfCVN5jfsdkkU/view?usp=sharing 13 | 14 | ## Final Accuracy & Loss (21.05.12.) 15 | * Accuracy : 0.831944465637207 16 | * Loss : 0.13901832699775696 17 | 18 | ![210512_MobileNet_model_accuracy_epoch100](https://user-images.githubusercontent.com/75024126/118084494-c8aee380-b3fb-11eb-9a23-6980f9a3d8e1.jpg) 19 | ![210512_MobileNet_model_loss_epoch100](https://user-images.githubusercontent.com/75024126/118084500-ca78a700-b3fb-11eb-8b4e-ab842173c565.jpg) 20 | 21 | ## How to download Raleway Font? 22 | * **Download at Google Font** : https://fonts.google.com/specimen/Raleway?query=ralewa#license 23 | --------------------------------------------------------------------------------
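
## Appendix: minimal single-clip usage sketch

The sketch below condenses the pipeline of notebooks 03 and 05 into one script: read 30 frames from a clip, extract (5, 5, 1024) features per frame with MobileNet, reshape them to (1, 30, 25600), and classify with the trained LSTM. It assumes the model file `210512_MobileNet_model_epoch100.h5` (download link above) is in the working directory; the clip name `sample.mp4`, the helper `read_clip()`, and the 0.9 decision threshold are illustrative choices, not fixed by this repository.

```python
import cv2
import numpy as np
from skimage.transform import resize
from tensorflow import keras

# Feature extractor (MobileNet without its top layers) and the trained LSTM classifier.
base_model = keras.applications.mobilenet.MobileNet(
    input_shape=(160, 160, 3), include_top=False, weights='imagenet')
model = keras.models.load_model('210512_MobileNet_model_epoch100.h5')

def read_clip(path, n_frames=30):
    """Read the first n_frames frames, resize to 160x160, and scale pixels to [0, 1]."""
    frames = np.zeros((n_frames, 160, 160, 3), dtype=np.float32)
    vid = cv2.VideoCapture(path)
    for i in range(n_frames):
        grabbed, frame = vid.read()
        if not grabbed:
            break  # short clip: remaining frames stay zero-padded
        frm = resize(frame, (160, 160, 3))  # skimage returns floats already in [0, 1]
        frames[i] = frm / 255.0 if frm.max() > 1 else frm
    vid.release()
    return frames

clip = read_clip('sample.mp4')                    # (30, 160, 160, 3)
features = base_model.predict(clip)               # (30, 5, 5, 1024)
features = features.reshape(1, 30, 5 * 5 * 1024)  # (1, 30, 25600), the LSTM input shape
probs = model.predict(features)[0]                # [P(NonFight), P(Fight)]
print('Violence' if probs[1] >= 0.9 else 'Normal', f'P(Fight)={probs[1]:.3f}')
```

For streaming or per-second captioning, see notebooks 05 and 06, which wrap these same three steps in a frame-counting loop.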