├── LICENSE ├── malwarefiles_image.py ├── README.md └── malware_detection.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Faizah Mahendinawaz Kureshi 4 | 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | -------------------------------------------------------------------------------- /malwarefiles_image.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from PIL import Image 4 | 5 | def convert_binary_to_grayscale(input_folder, output_folder, image_size=(256, 256)): 6 | 7 | if not os.path.exists(output_folder): 8 | os.makedirs(output_folder) 9 | 10 | for file_name in os.listdir(input_folder): 11 | file_path = os.path.join(input_folder, file_name) 12 | if os.path.isfile(file_path): 13 | with open(file_path, 'rb') as f: 14 | binary_data = f.read() 15 | 16 | # convert binary data to a numpy array of bytes 17 | byte_array = np.frombuffer(binary_data, dtype=np.uint8) 18 | 19 | # calculate the side length of a square image for simplicity 20 | total_bytes = len(byte_array) 21 | side_length = int(np.ceil(np.sqrt(total_bytes))) 22 | 23 | # pad the byte array to fit into the square image 24 | padded_byte_array = np.pad(byte_array, (0, side_length**2 - total_bytes), mode='constant') 25 | 26 | # rshape into a square matrix 27 | image_matrix = padded_byte_array.reshape((side_length, side_length)) 28 | 29 | # resize the image to the desired size 30 | image = Image.fromarray(image_matrix) 31 | image = image.resize(image_size) 32 | 33 | # save the grayscale image 34 | output_path = os.path.join(output_folder, f"{os.path.splitext(file_name)[0]}.png") 35 | image.save(output_path) 36 | 37 | print(f"Processed and saved: {output_path}") 38 | 39 | # Example usage 40 | input_folder = "path_to_your_binary_files_dataset" 41 | output_folder = "path_to_save_grayscale_images" 42 | convert_binary_to_grayscale(input_folder, output_folder, image_size=(256, 256)) 43 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Malware Detection Using Hybrid CNN-RNN Model 2 | ============================================ 3 | 4 | This project focuses on detecting malware by analyzing binary files through image-based representations. The approach uses a hybrid deep learning model that combines **Convolutional Neural Networks (CNN)** and **Recurrent Neural Networks (RNN)** to achieve highly accurate predictions. This model leverages both spatial and temporal patterns from binary data to distinguish between benign and malicious files. 5 | 6 | Problem Overview 7 | ---------------- 8 | 9 | Malware detection has become increasingly challenging due to the complexity and evolving nature of malware. Traditional signature-based methods often fail to detect new, unknown variants. In contrast, machine learning approaches, particularly deep learning, have proven effective in recognizing complex patterns and behaviors in files, even without prior knowledge of the malware. 10 | 11 | In this project, we use **image-based binary analysis**. Binary files are transformed into images, which represent the data structure and behavior of the file. A hybrid CNN-RNN model is used to extract features and make predictions based on these images. 12 | 13 | Why a Hybrid CNN-RNN Model? 14 | --------------------------- 15 | 16 | ### 1\. **CNN (Convolutional Neural Networks)**: 17 | 18 | - **Spatial Feature Extraction**: CNNs are excellent at extracting spatial features from images. Since the binary data, when transformed into an image, contains spatial patterns (such as structures or repetitive byte sequences), CNNs can learn to recognize these patterns effectively. 19 | - **Efficient at Identifying Local Patterns**: In the context of malware detection, specific byte sequences or structures in the binary data can indicate malicious behavior. CNNs can quickly identify these local patterns, making them highly efficient for image-based binary analysis. 20 | 21 | ### 2\. **RNN (Recurrent Neural Networks)**: 22 | 23 | - **Temporal Dependencies**: Binary files often contain temporal dependencies, where certain sequences of bytes depend on the preceding ones. RNNs excel at capturing these sequential relationships, making them ideal for malware detection where the pattern of data over time (or the sequence of bytes) is critical to identifying malicious files. 24 | - **Long-term Dependencies**: Using RNNs allows the model to capture not just immediate patterns (as CNN does) but also long-term dependencies, which is crucial for detecting more sophisticated malware that may rely on complex, long-range sequences of instructions or operations. 25 | 26 | ### 3\. **Hybrid Model**: 27 | 28 | - By combining CNNs and RNNs, the model can learn both **local spatial features** and **global temporal patterns**. This hybrid approach ensures that both the structure and the sequence of the binary data are fully utilized, allowing the model to achieve better performance and generalization. 29 | 30 | In conclusion, the hybrid CNN-RNN model improves the accuracy and robustness of malware detection by capturing both the spatial and temporal aspects of the binary file data, resulting in more reliable predictions and a higher detection rate of unseen malware. 31 | 32 | Features 33 | -------- 34 | 35 | - **Binary-to-Image Transformation**: Converts binary files into 2D images for analysis. 36 | - **Hybrid CNN-RNN Architecture**: Combines the spatial feature extraction power of CNNs with the sequential learning ability of RNNs. 37 | - **High Accuracy**: The model can predict whether a file is benign or malicious with high accuracy. 38 | - **Scalability**: Can be trained on large datasets of binary files to improve performance over time. 39 | 40 | Tech Stack 41 | ---------- 42 | 43 | - **Deep Learning Framework**: Keras, TensorFlow 44 | - **Image Processing**: OpenCV (for handling image conversion) 45 | - **Data Handling**: NumPy, Pandas 46 | - **Python**: Main programming language used for implementing the project 47 | - **Hardware**: GPU (recommended for faster training) 48 | 49 | Model Architecture 50 | ------------------ 51 | 52 | 1. **CNN Layer**: 53 | - Multiple convolutional layers that extract spatial features from the image representations of binary files. This layer detects local patterns such as byte sequences that are indicative of malware behavior. 54 | 2. **RNN Layer**: 55 | - A series of recurrent layers that capture the temporal dependencies within the sequence of features extracted by the CNN. This layer processes the data over time, detecting long-range dependencies and helping the model understand how the structure evolves. 56 | 3. **Fully Connected Layer**: 57 | - After feature extraction, the model uses fully connected layers to make the final classification, predicting whether the binary file is benign or malicious. 58 | 59 | Results 60 | ------- 61 | 62 | - **Prediction Accuracy**: The hybrid CNN-RNN model achieves an accuracy of over 95% in predicting malware. 63 | - **Improved Detection of Unknown Malware**: By leveraging both spatial and temporal patterns, the model has shown improved performance over traditional methods. 64 | - **Scalability**: The model can be extended to larger datasets and adapted to different file types. 65 | 66 | Installation 67 | ------------ 68 | 69 | * * * * * 70 | 71 | License 72 | ------- 73 | 74 | This project is licensed under the MIT License. See the `LICENSE` file for details. -------------------------------------------------------------------------------- /malware_detection.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """Malware detection.ipynb 3 | 4 | Automatically generated by Colaboratory. 5 | 6 | Original file is located at 7 | https://colab.research.google.com/drive/1ea8uKe_n3MTA19CWBqqMgbRVrk4Beg73 8 | 9 | ##Importing 10 | """ 11 | 12 | import cv2 13 | import os 14 | import numpy as np 15 | import tensorflow as tf 16 | from tensorflow.keras import Model, layers 17 | from random import shuffle 18 | from tqdm import tqdm 19 | from sklearn.model_selection import train_test_split 20 | from tensorflow import keras 21 | 22 | from google.colab import drive 23 | 24 | drive.mount('/content/gdrive/', force_remount=True) 25 | 26 | path_root = '/content/gdrive/MyDrive/malimg_imgs' 27 | 28 | """To be able to use our images for training and testing, lets use ImageDataGenerator.flow_from_directory() which generates batches of normalized tensor image data from the respective data directories. 29 | 30 | target_size : Will resize all images to the specified size. I personally chose (64*64) images. 31 | batch_size : Is the size of the batch we will use. In our case, we only have 9339 images, hence setting a batch_size above this won't change anything. 32 | """ 33 | 34 | from keras.preprocessing.image import ImageDataGenerator 35 | batches = ImageDataGenerator().flow_from_directory(directory=path_root, target_size=(64,64), batch_size=10000) 36 | 37 | batches.class_indices 38 | 39 | """Batches generated with ImageDataGenerator() is an iterator. Hence, we use next() to go through all its elements and generate a batch of images and labels from the data set.""" 40 | 41 | imgs, labels = next(batches) 42 | 43 | """As you can see, our images are in RGB with shape 64x64 [width x length x depth].""" 44 | 45 | imgs.shape 46 | 47 | """Labels has the shape (batch_size, number of classes).""" 48 | 49 | labels.shape 50 | 51 | """The following method allows us to plot a sample of images in our dataset.""" 52 | 53 | # plots images with labels within jupyter notebook 54 | import matplotlib.pyplot as plt 55 | def plots(ims, figsize=(20,30), rows=10, interp=False, titles=None): 56 | if type(ims[0]) is np.ndarray: 57 | ims = np.array(ims).astype(np.uint8) 58 | if (ims.shape[-1] != 3): 59 | ims = ims.transpose((0,2,3,1)) 60 | f = plt.figure(figsize=figsize) 61 | cols = 10 # len(ims)//rows if len(ims) % 2 == 0 else len(ims)//rows + 1 62 | for i in range(0,50): 63 | sp = f.add_subplot(rows, cols, i+1) 64 | sp.axis('Off') 65 | if titles is not None: 66 | sp.set_title(list(batches.class_indices.keys())[np.argmax(titles[i])], fontsize=16) 67 | plt.imshow(ims[i], interpolation=None if interp else 'none') 68 | 69 | plots(imgs, titles = labels) 70 | 71 | """We can already observe differences between classes. 72 | 73 | ##Analyze 74 | 75 | All our images are finally ready to be used. Lets check out the repartition of data between classes : 76 | """ 77 | 78 | classes = batches.class_indices.keys() 79 | 80 | perc = (sum(labels)/labels.shape[0])*100 81 | 82 | plt.xticks(rotation='vertical') 83 | plt.bar(classes,perc) 84 | 85 | """##Train And Test 86 | 87 | Lets split our model into train and test following a ratio 70% train - 30% test ratio. 88 | """ 89 | 90 | from sklearn.model_selection import train_test_split 91 | X_train, X_test, y_train, y_test = train_test_split(imgs/255.,labels, test_size=0.3) 92 | 93 | X_train.shape 94 | 95 | X_test.shape 96 | 97 | y_train.shape 98 | 99 | y_test.shape 100 | 101 | """##Convolutional Neural Network Model 102 | 103 | We will now build our **CNN** model using Keras. This model will have the following layers : 104 | 105 | **Convolutional Laye**r : 30 filters, (3 * 3) kernel size 106 | 107 | **Max Pooling Layer** : (2 * 2) pool size 108 | 109 | **Convolutional Layer** : 15 filters, (3 * 3) kernel size 110 | 111 | **Max Pooling Layer** : (2 * 2) pool size 112 | 113 | **DropOut Layer** : Dropping 25% of neurons. 114 | 115 | **Flatten Layer** 116 | 117 | **Dense/Fully Connected Layer** : 128 Neurons, Relu activation function 118 | 119 | **DropOut Layer** : Dropping 50% of neurons. 120 | 121 | **Dense/Fully Connected Layer** : 50 Neurons, Softmax activation function 122 | 123 | **Dense/Fully Connected Layer** : num_class Neurons, Softmax activation 124 | function 125 | 126 | **Input shape** : 64 * 64 * 3 127 | """ 128 | 129 | !pip install tensorflow keras --upgrade 130 | 131 | #import keras 132 | 133 | #!pip install tensorflow 134 | import tensorflow 135 | #!pip install keras --upgrade 136 | from keras.models import Sequential,Model 137 | from tensorflow.keras.layers import Input 138 | #from tensorflow.keras.models import Sequential, Input, Model 139 | from keras.layers import Dense, Dropout, Flatten 140 | from keras.layers import Conv2D, MaxPooling2D 141 | from tensorflow.keras.layers import BatchNormalization 142 | 143 | """We want 25 classes as output.""" 144 | 145 | num_classes = 25 146 | 147 | def malware_model(ac): 148 | Malware_model = Sequential() 149 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=ac,input_shape=(64,64,3))) 150 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 151 | Malware_model.add(Conv2D(15, (3, 3), activation=ac)) 152 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 153 | Malware_model.add(Dropout(0.25)) 154 | Malware_model.add(Flatten()) 155 | Malware_model.add(Dense(128, activation=ac)) 156 | Malware_model.add(Dropout(0.5)) 157 | Malware_model.add(Dense(50, activation=ac)) 158 | Malware_model.add(Dense(num_classes, activation='softmax')) 159 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy']) 160 | return Malware_model 161 | 162 | """**We will compare with 10 different activation function** 163 | 164 | 1. Sigmoid Function 165 | 166 | 2.Tanh Function 167 | 168 | 3.Leaky ReLU 169 | 170 | 4.Exponential Linear Unit (ELU) 171 | 172 | 5.Scaled Exponential Linear Unit (SELU) 173 | 174 | 6.Gaussian Error Linear Unit (GELU) 175 | 176 | 7.Swish 177 | 178 | 8.Parametric ReLU 179 | 180 | 9.Softplus 181 | 182 | ##Relu Function 183 | """ 184 | 185 | Malware_model = malware_model('relu') 186 | 187 | Malware_model.summary() 188 | 189 | """Several methods are available to deal with unbalanced data. I our case, I chose to give higher weight to minority class and lower weight to majority class. 190 | 191 | class_weights uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data. To use this method, y_train must not be one hot encoded. 192 | """ 193 | 194 | y_train.shape 195 | 196 | """class_weight function cannot deal with one hot encoded y. We need to convert it.""" 197 | 198 | y_train_new = np.argmax(y_train, axis=1) 199 | 200 | y_train_new 201 | 202 | !pip install sklearn --upgrade 203 | 204 | from sklearn.utils import class_weight 205 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 206 | 207 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 208 | 209 | scores = Malware_model.evaluate(X_test, y_test) 210 | 211 | """We got a **96%** accuracy which is not bad !""" 212 | 213 | print('Final CNN accuracy using Relu: ', scores[1]) 214 | 215 | dic={} 216 | dic['Relu']=0.9635111689567566 217 | 218 | acti_list=['sigmoid','tanh','selu','elu','gelu','swish','mish'] 219 | 220 | """##Sigmoid Activation Function""" 221 | 222 | print("---------------------------------Model using activation function: Sigmoid ---------------------------------") 223 | 224 | Malware_model = malware_model('sigmoid') 225 | 226 | y_train_new = np.argmax(y_train, axis=1) 227 | 228 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 229 | 230 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 231 | 232 | scores = Malware_model.evaluate(X_test, y_test) 233 | 234 | print('Final CNN accuracy using Sigmoid: ', scores[1]) 235 | 236 | dic['Sigmoid']= 0.3287435472011566 237 | 238 | """##Leaky Relu""" 239 | 240 | print("---------------------------------Model using activation function: Leaky Relu---------------------------------") 241 | 242 | import tensorflow as tf 243 | Malware_model = Sequential() 244 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.keras.layers.LeakyReLU(alpha=0.1),input_shape=(64,64,3))) 245 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 246 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.keras.layers.LeakyReLU(alpha=0.1))) 247 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 248 | Malware_model.add(Dropout(0.25)) 249 | Malware_model.add(Flatten()) 250 | Malware_model.add(Dense(128, activation=tf.keras.layers.LeakyReLU(alpha=0.1))) 251 | Malware_model.add(Dropout(0.5)) 252 | Malware_model.add(Dense(50, activation=tf.keras.layers.LeakyReLU(alpha=0.1))) 253 | Malware_model.add(Dense(num_classes, activation='softmax')) 254 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy']) 255 | y_train_new = np.argmax(y_train, axis=1) 256 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 257 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 258 | 259 | scores = Malware_model.evaluate(X_test, y_test) 260 | 261 | print('Final CNN accuracy using Leaky Relu: ', scores[1]) 262 | 263 | dic['Leaky Relu']=0.9672977328300476 264 | 265 | """## Soft plus""" 266 | 267 | print("---------------------------------Model using activation function: Soft Plus---------------------------------") 268 | 269 | Malware_model = Sequential() 270 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.nn.softplus,input_shape=(64,64,3))) 271 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 272 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.nn.softplus)) 273 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 274 | Malware_model.add(Dropout(0.25)) 275 | Malware_model.add(Flatten()) 276 | Malware_model.add(Dense(128, activation=tf.nn.softplus)) 277 | Malware_model.add(Dropout(0.5)) 278 | Malware_model.add(Dense(50, activation=tf.nn.softplus)) 279 | Malware_model.add(Dense(num_classes, activation='softmax')) 280 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy']) 281 | y_train_new = np.argmax(y_train, axis=1) 282 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 283 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 284 | 285 | scores = Malware_model.evaluate(X_test, y_test) 286 | 287 | print('Final CNN accuracy using Soft plus: ', scores[1]) 288 | 289 | dic['Soft Plus']=0.9201377034187317 290 | 291 | """##Parametric ReLU""" 292 | 293 | print("---------------------------------Model using activation function: Parametric ReLU---------------------------------") 294 | 295 | Malware_model = Sequential() 296 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.keras.layers.PReLU(),input_shape=(64,64,3))) 297 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 298 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.keras.layers.PReLU())) 299 | Malware_model.add(MaxPooling2D(pool_size=(2, 2))) 300 | Malware_model.add(Dropout(0.25)) 301 | Malware_model.add(Flatten()) 302 | Malware_model.add(Dense(128, activation=tf.keras.layers.PReLU())) 303 | Malware_model.add(Dropout(0.5)) 304 | Malware_model.add(Dense(50, activation=tf.keras.layers.PReLU())) 305 | Malware_model.add(Dense(num_classes, activation='softmax')) 306 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy']) 307 | y_train_new = np.argmax(y_train, axis=1) 308 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 309 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 310 | 311 | scores = Malware_model.evaluate(X_test, y_test) 312 | 313 | print('Final CNN accuracy using Paramatic Plus: ', scores[1]) 314 | 315 | dic['Paramatic Relu']=0.9666092991828918 316 | 317 | """## Tanh Function""" 318 | 319 | print("---------------------------------Model using activation function: Tanh ---------------------------------") 320 | 321 | Malware_model = malware_model('tanh') 322 | 323 | y_train_new = np.argmax(y_train, axis=1) 324 | 325 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 326 | 327 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 328 | 329 | scores = Malware_model.evaluate(X_test, y_test) 330 | 331 | print('Final CNN accuracy using Tanh: ', scores[1]) 332 | 333 | dic['Tanh']= 0.9590361714363098 334 | 335 | """##ELU""" 336 | 337 | print("---------------------------------Model using activation function: Exponential Linear Unit ---------------------------------") 338 | 339 | Malware_model = malware_model('elu') 340 | 341 | y_train_new = np.argmax(y_train, axis=1) 342 | 343 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 344 | 345 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 346 | 347 | scores = Malware_model.evaluate(X_test, y_test) 348 | 349 | print('Final CNN accuracy using ELU: ', scores[1]) 350 | 351 | dic['Sigmoid']= 0.9562822580337524 352 | 353 | """##Scaled Exponential Linear Unit""" 354 | 355 | print("---------------------------------Model using activation function: Scaled Exponential Linear Unit ---------------------------------") 356 | 357 | Malware_model = malware_model('selu') 358 | 359 | y_train_new = np.argmax(y_train, axis=1) 360 | 361 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 362 | 363 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 364 | 365 | scores = Malware_model.evaluate(X_test, y_test) 366 | 367 | print('Final CNN accuracy using Selu: ', scores[1]) 368 | 369 | dic['Selu']= 0.9528399109840393 370 | 371 | """## Gaussian Error Linear Unit""" 372 | 373 | print("---------------------------------Model using activation function: Gelu ---------------------------------") 374 | 375 | Malware_model = malware_model('gelu') 376 | 377 | y_train_new = np.argmax(y_train, axis=1) 378 | 379 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 380 | 381 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 382 | 383 | scores = Malware_model.evaluate(X_test, y_test) 384 | 385 | print('Final CNN accuracy using gelu: ', scores[1]) 386 | 387 | dic['Gelu']= 0.9697074294090271 388 | 389 | """##Swish""" 390 | 391 | print("---------------------------------Model using activation function: Swish ---------------------------------") 392 | 393 | Malware_model = malware_model('swish') 394 | 395 | y_train_new = np.argmax(y_train, axis=1) 396 | 397 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new) 398 | 399 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10) 400 | 401 | scores = Malware_model.evaluate(X_test, y_test) 402 | 403 | print('Final CNN accuracy using Swish: ', scores[1]) 404 | 405 | dic['swish']= 0.9597246050834656 406 | 407 | """##Comparision""" 408 | 409 | print("Accuracy of the same model with different activation function") 410 | 411 | for key in dic: 412 | print(key,": ",dic[key]) 413 | print() 414 | 415 | print("Maximum Accuracy of ",dic['Gelu']," was obtained by using activation function as Gaussian Error Linear Unit (GELU)") --------------------------------------------------------------------------------