├── LICENSE
├── malwarefiles_image.py
├── README.md
└── malware_detection.py


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2025 Faizah Mahendinawaz Kureshi
 4 | 
 5 | 
 6 | Permission is hereby granted, free of charge, to any person obtaining a copy
 7 | of this software and associated documentation files (the "Software"), to deal
 8 | in the Software without restriction, including without limitation the rights
 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10 | copies of the Software, and to permit persons to whom the Software is
11 | furnished to do so, subject to the following conditions:
12 | 
13 | The above copyright notice and this permission notice shall be included in all
14 | copies or substantial portions of the Software.
15 | 
16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22 | SOFTWARE.
23 | 


--------------------------------------------------------------------------------
/malwarefiles_image.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import numpy as np
 3 | from PIL import Image
 4 | 
 5 | def convert_binary_to_grayscale(input_folder, output_folder, image_size=(256, 256)):
 6 |     """Convert each file in input_folder into a square grayscale PNG saved in output_folder."""
 7 |     if not os.path.exists(output_folder):
 8 |         os.makedirs(output_folder)
 9 | 
10 |     for file_name in os.listdir(input_folder):
11 |         file_path = os.path.join(input_folder, file_name)
12 |         if os.path.isfile(file_path):
13 |             with open(file_path, 'rb') as f:
14 |                 binary_data = f.read()
15 | 
16 |             # convert binary data to a numpy array of bytes
17 |             byte_array = np.frombuffer(binary_data, dtype=np.uint8)
18 | 
19 |             # calculate the side length of a square image for simplicity
20 |             total_bytes = len(byte_array)
21 |             side_length = int(np.ceil(np.sqrt(total_bytes)))
22 | 
23 |             # pad the byte array to fit into the square image
24 |             padded_byte_array = np.pad(byte_array, (0, side_length**2 - total_bytes), mode='constant')
25 | 
26 |             # reshape into a square matrix
27 |             image_matrix = padded_byte_array.reshape((side_length, side_length))
28 | 
29 |             # resize the image to the desired size
30 |             image = Image.fromarray(image_matrix)
31 |             image = image.resize(image_size)
32 | 
33 |             # save the grayscale image
34 |             output_path = os.path.join(output_folder, f"{os.path.splitext(file_name)[0]}.png")
35 |             image.save(output_path)
36 | 
37 |             print(f"Processed and saved: {output_path}")
38 | 
39 | # Example usage (runs only when the script is executed directly, not on import)
40 | if __name__ == "__main__":
41 |     input_folder = "path_to_your_binary_files_dataset"
42 |     output_folder = "path_to_save_grayscale_images"
43 |     convert_binary_to_grayscale(input_folder, output_folder, image_size=(256, 256))


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | Malware Detection Using Hybrid CNN-RNN Model
 2 | ============================================
 3 | 
 4 | This project focuses on detecting malware by analyzing binary files through image-based representations. The approach uses a hybrid deep learning model that combines **Convolutional Neural Networks (CNN)** and **Recurrent Neural Networks (RNN)** to achieve highly accurate predictions. This model leverages both spatial and temporal patterns from binary data to distinguish between benign and malicious files.
 5 | 
 6 | Problem Overview
 7 | ----------------
 8 | 
 9 | Malware detection has become increasingly challenging due to the complexity and evolving nature of malware. Traditional signature-based methods often fail to detect new, unknown variants. In contrast, machine learning approaches, particularly deep learning, have proven effective in recognizing complex patterns and behaviors in files, even without prior knowledge of the malware.
10 | 
11 | In this project, we use **image-based binary analysis**. Binary files are transformed into images, which represent the data structure and behavior of the file. A hybrid CNN-RNN model is used to extract features and make predictions based on these images.
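
The transformation described above can be sketched in a few lines. This is a minimal, dependency-free illustration that assumes the scheme used in `malwarefiles_image.py` (zero-pad the byte stream to the next perfect square, then reshape row by row). The function name `bytes_to_square` is hypothetical, and the final resize step is omitted.

```python
import math

def bytes_to_square(data):
    """Lay out raw bytes as a square grid of 0-255 grayscale pixel values."""
    if not data:
        return []
    # Smallest side length whose square holds every byte (ceiling of the square root).
    side = math.isqrt(len(data) - 1) + 1
    # Zero-pad so the byte stream fills the square exactly.
    padded = data + bytes(side * side - len(data))
    # Each image row is one consecutive slice of the padded stream.
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]

grid = bytes_to_square(bytes(range(1, 11)))  # 10 bytes -> 4x4 grid with 6 padding zeros
for row in grid:
    print(row)
```

Because rows are consecutive slices of the file, bytes that are adjacent in the binary stay adjacent horizontally in the image, which is what lets a convolution pick up local byte patterns.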
12 | 
13 | Why a Hybrid CNN-RNN Model?
14 | ---------------------------
15 | 
16 | ### 1\. **CNN (Convolutional Neural Networks)**:
17 | 
18 | -   **Spatial Feature Extraction**: CNNs are excellent at extracting spatial features from images. Since the binary data, when transformed into an image, contains spatial patterns (such as structures or repetitive byte sequences), CNNs can learn to recognize these patterns effectively.
19 | -   **Efficient at Identifying Local Patterns**: In the context of malware detection, specific byte sequences or structures in the binary data can indicate malicious behavior. CNNs can quickly identify these local patterns, making them highly efficient for image-based binary analysis.
20 | 
21 | ### 2\. **RNN (Recurrent Neural Networks)**:
22 | 
23 | -   **Temporal Dependencies**: Binary files often contain temporal dependencies, where certain sequences of bytes depend on the preceding ones. RNNs excel at capturing these sequential relationships, making them ideal for malware detection where the pattern of data over time (or the sequence of bytes) is critical to identifying malicious files.
24 | -   **Long-term Dependencies**: Using RNNs allows the model to capture not just immediate patterns (as CNN does) but also long-term dependencies, which is crucial for detecting more sophisticated malware that may rely on complex, long-range sequences of instructions or operations.
25 | 
26 | ### 3\. **Hybrid Model**:
27 | 
28 | -   By combining CNNs and RNNs, the model can learn both **local spatial features** and **global temporal patterns**. This hybrid approach ensures that both the structure and the sequence of the binary data are fully utilized, allowing the model to achieve better performance and generalization.
29 | 
30 | In conclusion, the hybrid CNN-RNN model improves the accuracy and robustness of malware detection by capturing both the spatial and temporal aspects of the binary file data, resulting in more reliable predictions and a higher detection rate of unseen malware.
31 | 
32 | Features
33 | --------
34 | 
35 | -   **Binary-to-Image Transformation**: Converts binary files into 2D images for analysis.
36 | -   **Hybrid CNN-RNN Architecture**: Combines the spatial feature extraction power of CNNs with the sequential learning ability of RNNs.
37 | -   **High Accuracy**: The model can predict whether a file is benign or malicious with high accuracy.
38 | -   **Scalability**: Can be trained on large datasets of binary files to improve performance over time.
39 | 
40 | Tech Stack
41 | ----------
42 | 
43 | -   **Deep Learning Framework**: Keras, TensorFlow
44 | -   **Image Processing**: Pillow (for binary-to-image conversion), OpenCV
45 | -   **Data Handling**: NumPy, Pandas
46 | -   **Python**: Main programming language used for implementing the project
47 | -   **Hardware**: GPU (recommended for faster training)
48 | 
49 | Model Architecture
50 | ------------------
51 | 
52 | 1.  **CNN Layer**:
53 |     -   Multiple convolutional layers that extract spatial features from the image representations of binary files. This layer detects local patterns such as byte sequences that are indicative of malware behavior.
54 | 2.  **RNN Layer**:
55 |     -   A series of recurrent layers that capture the temporal dependencies within the sequence of features extracted by the CNN. This layer processes the data over time, detecting long-range dependencies and helping the model understand how the structure evolves.
56 | 3.  **Fully Connected Layer**:
57 |     -   After feature extraction, the model uses fully connected layers to make the final classification, predicting whether the binary file is benign or malicious.
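
One way the three stages above could be wired together in Keras is sketched below. It assumes TensorFlow is installed. The layer sizes, the choice of an LSTM, the row-wise reshape of the feature map into a sequence, and the name `build_hybrid_cnn_rnn` are all illustrative assumptions, not the trained configuration.

```python
from tensorflow.keras import layers, models

def build_hybrid_cnn_rnn(input_shape=(64, 64, 3), num_classes=25):
    inputs = layers.Input(shape=input_shape)
    # CNN stage: extract local spatial features.
    x = layers.Conv2D(30, (3, 3), activation="relu")(inputs)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(15, (3, 3), activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    # Treat each row of the feature map as one time step for the RNN.
    height, width, channels = x.shape[1], x.shape[2], x.shape[3]
    x = layers.Reshape((height, width * channels))(x)
    # RNN stage: read the rows in order and summarise them in one vector.
    x = layers.LSTM(64)(x)
    x = layers.Dropout(0.5)(x)
    # Fully connected stage: final classification.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_hybrid_cnn_rnn().summary()
```

Reading the feature map row by row keeps the sequence in the same order as the bytes in the original file, which is the assumption behind treating it as temporal data.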
58 | 
59 | Results
60 | -------
61 | 
62 | -   **Prediction Accuracy**: The hybrid CNN-RNN model achieves an accuracy of over 95% in predicting malware.
63 | -   **Improved Detection of Unknown Malware**: By leveraging both spatial and temporal patterns, the model has shown improved performance over traditional methods.
64 | -   **Scalability**: The model can be extended to larger datasets and adapted to different file types.
65 | 
66 | Installation
67 | ------------
68 | 
69 | * * * * *
70 | 
71 | License
72 | -------
73 | 
74 | This project is licensed under the MIT License. See the `LICENSE` file for details.


--------------------------------------------------------------------------------
/malware_detection.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """Malware detection.ipynb
  3 | 
  4 | Automatically generated by Colaboratory.
  5 | 
  6 | Original file is located at
  7 |     https://colab.research.google.com/drive/1ea8uKe_n3MTA19CWBqqMgbRVrk4Beg73
  8 | 
  9 | ##Importing
 10 | """
 11 | 
 12 | import cv2
 13 | import os
 14 | import numpy as np
 15 | import tensorflow as tf
 16 | from tensorflow.keras import Model, layers
 17 | from random import shuffle
 18 | from tqdm import tqdm
 19 | from sklearn.model_selection import train_test_split
 20 | from tensorflow import keras
 21 | 
 22 | from google.colab import drive
 23 | 
 24 | drive.mount('/content/gdrive/', force_remount=True)
 25 | 
 26 | path_root  = '/content/gdrive/MyDrive/malimg_imgs'
 27 | 
 28 | """To use our images for training and testing, let's use ImageDataGenerator.flow_from_directory(), which generates batches of tensor image data from the class subdirectories of the data directory.
 29 | 
 30 | target_size : Resizes all images to the specified size. I chose 64x64.
 31 | batch_size : The number of images per batch. We only have 9339 images, so any batch_size at or above that returns the whole dataset in a single batch.
 32 | """
 33 | 
 34 | from keras.preprocessing.image import ImageDataGenerator
 35 | batches = ImageDataGenerator().flow_from_directory(directory=path_root, target_size=(64,64), batch_size=10000)
 36 | 
 37 | batches.class_indices
 38 | 
 39 | """Batches generated with ImageDataGenerator() is an iterator. Hence, we use next() to go through all its elements and generate a batch of images and labels from the data set."""
 40 | 
 41 | imgs, labels = next(batches)
 42 | 
 43 | """As you can see, our images are in RGB with shape 64x64x3 [height x width x channels]."""
 44 | 
 45 | imgs.shape
 46 | 
 47 | """Labels has the shape (batch_size, number of classes)."""
 48 | 
 49 | labels.shape
 50 | 
 51 | """The following method allows us to plot a sample of images in our dataset."""
 52 | 
 53 | # plots images with labels within jupyter notebook
 54 | import matplotlib.pyplot as plt
 55 | def plots(ims, figsize=(20,30), rows=10, interp=False, titles=None):
 56 |   if type(ims[0]) is np.ndarray:
 57 |       ims = np.array(ims).astype(np.uint8)
 58 |       if (ims.shape[-1] != 3):
 59 |         ims = ims.transpose((0,2,3,1))
 60 |   f = plt.figure(figsize=figsize)
 61 |   cols = 10 # len(ims)//rows if len(ims) % 2 == 0 else len(ims)//rows + 1
 62 |   for i in range(0,50):
 63 |       sp = f.add_subplot(rows, cols, i+1)
 64 |       sp.axis('Off')
 65 |       if titles is not None:
 66 |           sp.set_title(list(batches.class_indices.keys())[np.argmax(titles[i])], fontsize=16)
 67 |       plt.imshow(ims[i], interpolation=None if interp else 'none')
 68 | 
 69 | plots(imgs, titles = labels)
 70 | 
 71 | """We can already observe differences between classes.
 72 | 
 73 | ##Analyze
 74 | 
 75 | All our images are finally ready to be used. Let's check the distribution of data across classes:
 76 | """
 77 | 
 78 | classes = batches.class_indices.keys()
 79 | 
 80 | perc = (sum(labels)/labels.shape[0])*100
 81 | 
 82 | plt.xticks(rotation='vertical')
 83 | plt.bar(classes,perc)
 84 | 
 85 | """##Train And Test
 86 | 
 87 | Let's split our data into train and test sets using a 70% train / 30% test ratio.
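
Because the classes are unbalanced, a purely random split can leave rare families under-represented in the test set. A stratified split avoids this. The sketch below is a hypothetical, dependency-free illustration that splits sample indices class by class; scikit-learn's `train_test_split` offers the same idea through its `stratify` argument.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.3, seed=0):
    # Group sample indices by integer class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Each class contributes the same share of its samples to the test set.
        n_test = round(len(idxs) * test_size)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

train_idx, test_idx = stratified_split([0] * 10 + [1] * 10)
print(len(train_idx), len(test_idx))
```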
 88 | """
 89 | 
 90 | from sklearn.model_selection import train_test_split
 91 | X_train, X_test, y_train, y_test = train_test_split(imgs/255.,labels, test_size=0.3)
 92 | 
 93 | X_train.shape
 94 | 
 95 | X_test.shape
 96 | 
 97 | y_train.shape
 98 | 
 99 | y_test.shape
100 | 
101 | """##Convolutional Neural Network Model
102 | 
103 | We will now build our **CNN** model using Keras. This model will have the following layers :
104 | 
105 | **Convolutional Layer** : 30 filters, (3 * 3) kernel size
106 | 
107 | **Max Pooling Layer** : (2 * 2) pool size
108 | 
109 | **Convolutional Layer** : 15 filters, (3 * 3) kernel size
110 | 
111 | **Max Pooling Layer** : (2 * 2) pool size
112 | 
113 | **DropOut Layer** : Dropping 25% of neurons.
114 | 
115 | **Flatten Layer**
116 | 
117 | **Dense/Fully Connected Layer** : 128 Neurons, Relu activation function
118 | 
119 | **DropOut Layer** : Dropping 50% of neurons.
120 | 
121 | **Dense/Fully Connected Layer** : 50 Neurons, Relu activation function
122 | 
123 | **Dense/Fully Connected Layer** : num_classes Neurons, Softmax activation function
125 | 
126 | **Input shape** : 64 * 64 * 3
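
As a sanity check on the layer list above, the feature-map sizes can be worked out by hand. The sketch below assumes the Keras defaults that apply when no other arguments are given ('valid' padding and stride 1 for the convolutions, pool stride equal to the pool size). The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    # A 'valid' convolution with stride 1 trims kernel - 1 pixels per dimension.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping max pooling floors the division.
    return size // pool

side = 64
side = conv_out(side)    # first convolution
side = pool_out(side)    # first max pooling
side = conv_out(side)    # second convolution
side = pool_out(side)    # second max pooling
flat = side * side * 15  # 15 filters in the second convolution feed the Flatten layer
print(side, flat)
```

Under these assumptions the spatial size goes 64, 62, 31, 29, 14, so the Flatten layer hands 14 * 14 * 15 = 2940 values to the first dense layer.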
127 | """
128 | 
129 | # !pip install tensorflow keras --upgrade  # notebook-only shell command
130 | 
131 | #import keras
132 | 
133 | #!pip install tensorflow
134 | import tensorflow 
135 | #!pip install keras --upgrade
136 | from keras.models import Sequential,Model
137 | from tensorflow.keras.layers import Input
138 | #from tensorflow.keras.models import Sequential, Input, Model
139 | from keras.layers import Dense, Dropout, Flatten
140 | from keras.layers import Conv2D, MaxPooling2D
141 | from tensorflow.keras.layers import BatchNormalization
142 | 
143 | """We want 25 classes as output."""
144 | 
145 | num_classes = 25
146 | 
147 | def malware_model(ac):
148 |     Malware_model = Sequential()
149 |     Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=ac,input_shape=(64,64,3)))
150 |     Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
151 |     Malware_model.add(Conv2D(15, (3, 3), activation=ac))
152 |     Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
153 |     Malware_model.add(Dropout(0.25))
154 |     Malware_model.add(Flatten())
155 |     Malware_model.add(Dense(128, activation=ac))
156 |     Malware_model.add(Dropout(0.5))
157 |     Malware_model.add(Dense(50, activation=ac))
158 |     Malware_model.add(Dense(num_classes, activation='softmax'))
159 |     Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
160 |     return Malware_model
161 | 
162 | """**We will compare 10 different activation functions: ReLU plus the 9 below**
163 |     
164 |     1. Sigmoid Function
165 | 
166 |     2. Tanh Function
167 | 
168 |     3. Leaky ReLU
169 | 
170 |     4. Exponential Linear Unit (ELU)
171 | 
172 |     5. Scaled Exponential Linear Unit (SELU)
173 | 
174 |     6. Gaussian Error Linear Unit (GELU)
175 | 
176 |     7. Swish
177 | 
178 |     8. Parametric ReLU
179 | 
180 |     9. Softplus
181 | 
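
For reference, the scalar definitions of several of these functions can be written out directly. This is a plain-Python sketch for intuition only, since training relies on the Keras built-ins. The default `alpha` values are assumptions chosen for illustration (0.1 mirrors the `LeakyReLU(alpha=0.1)` setting used in this script), and the GELU shown is the exact Gaussian-CDF form.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def leaky_relu(x, alpha=0.1):
    # Small slope alpha for negative inputs instead of a hard zero.
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # Smoothly saturates to -alpha for large negative inputs.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def softplus(x):
    # Smooth approximation of ReLU: log(1 + e^x).
    return math.log1p(math.exp(x))

def swish(x):
    return x * sigmoid(x)

def gelu(x):
    # Input scaled by the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

for f in (relu, sigmoid, math.tanh, leaky_relu, elu, softplus, swish, gelu):
    print(f.__name__, round(f(-1.0), 4), round(f(1.0), 4))
```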
182 | ##Relu Function
183 | """
184 | 
185 | Malware_model = malware_model('relu')
186 | 
187 | Malware_model.summary()
188 | 
189 | """Several methods are available to deal with unbalanced data. In our case, I chose to give a higher weight to minority classes and a lower weight to majority classes.
190 | 
191 | class_weights uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data. To use this method, y_train must not be one hot encoded.
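
The 'balanced' heuristic is simple enough to reproduce by hand: each class gets `n_samples / (n_classes * count)`. The sketch below is a dependency-free illustration of that formula, and the helper name `balanced_class_weights` is hypothetical. It returns a dict keyed by class index, which is the form that the `class_weight` argument of Keras `fit()` accepts.

```python
from collections import Counter

def balanced_class_weights(y):
    # y: iterable of integer class labels (not one-hot encoded).
    counts = Counter(y)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    # Rare classes get weights above 1, frequent classes below 1.
    return {c: n_samples / (n_classes * n) for c, n in sorted(counts.items())}

weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```

Here class 1 appears once out of four samples and receives weight 2.0, while class 0 receives 2/3.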
192 | """
193 | 
194 | y_train.shape
195 | 
196 | """class_weight function cannot deal with one hot encoded y. We need to convert it."""
197 | 
198 | y_train_new = np.argmax(y_train, axis=1)
199 | 
200 | y_train_new
201 | 
202 | # !pip install scikit-learn --upgrade  # notebook-only shell command; the PyPI package is scikit-learn, not sklearn
203 | 
204 | from sklearn.utils import class_weight
205 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
206 | 
207 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
208 | 
209 | scores = Malware_model.evaluate(X_test, y_test)
210 | 
211 | """We got **96%** accuracy, which is not bad!"""
212 | 
213 | print('Final CNN accuracy using Relu: ', scores[1])
214 | 
215 | dic={}
216 | dic['Relu']=0.9635111689567566
217 | 
218 | acti_list=['sigmoid','tanh','selu','elu','gelu','swish','mish']
219 | 
220 | """##Sigmoid Activation Function"""
221 | 
222 | print("---------------------------------Model using activation function: Sigmoid  ---------------------------------")
223 | 
224 | Malware_model = malware_model('sigmoid')
225 | 
226 | y_train_new = np.argmax(y_train, axis=1)
227 | 
228 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
229 | 
230 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
231 | 
232 | scores = Malware_model.evaluate(X_test, y_test)
233 | 
234 | print('Final CNN accuracy using Sigmoid: ', scores[1])
235 | 
236 | dic['Sigmoid']= 0.3287435472011566
237 | 
238 | """##Leaky Relu"""
239 | 
240 | print("---------------------------------Model using activation function: Leaky Relu---------------------------------")
241 | 
242 | import tensorflow as tf
243 | Malware_model = Sequential()
244 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.keras.layers.LeakyReLU(alpha=0.1),input_shape=(64,64,3)))
245 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
246 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.keras.layers.LeakyReLU(alpha=0.1)))
247 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
248 | Malware_model.add(Dropout(0.25))
249 | Malware_model.add(Flatten())
250 | Malware_model.add(Dense(128, activation=tf.keras.layers.LeakyReLU(alpha=0.1)))
251 | Malware_model.add(Dropout(0.5))
252 | Malware_model.add(Dense(50, activation=tf.keras.layers.LeakyReLU(alpha=0.1)))
253 | Malware_model.add(Dense(num_classes, activation='softmax'))
254 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
255 | y_train_new = np.argmax(y_train, axis=1)
256 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
257 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
258 | 
259 | scores = Malware_model.evaluate(X_test, y_test)
260 | 
261 | print('Final CNN accuracy using Leaky Relu: ', scores[1])
262 | 
263 | dic['Leaky Relu']=0.9672977328300476
264 | 
265 | """## Soft plus"""
266 | 
267 | print("---------------------------------Model using activation function: Soft Plus---------------------------------")
268 | 
269 | Malware_model = Sequential()
270 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.nn.softplus,input_shape=(64,64,3)))
271 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
272 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.nn.softplus))
273 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
274 | Malware_model.add(Dropout(0.25))
275 | Malware_model.add(Flatten())
276 | Malware_model.add(Dense(128, activation=tf.nn.softplus))
277 | Malware_model.add(Dropout(0.5))
278 | Malware_model.add(Dense(50, activation=tf.nn.softplus))
279 | Malware_model.add(Dense(num_classes, activation='softmax'))
280 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
281 | y_train_new = np.argmax(y_train, axis=1)
282 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
283 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
284 | 
285 | scores = Malware_model.evaluate(X_test, y_test)
286 | 
287 | print('Final CNN accuracy using Soft plus: ', scores[1])
288 | 
289 | dic['Soft Plus']=0.9201377034187317
290 | 
291 | """##Parametric ReLU"""
292 | 
293 | print("---------------------------------Model using activation function: Parametric ReLU---------------------------------")
294 | 
295 | Malware_model = Sequential()
296 | Malware_model.add(Conv2D(30, kernel_size=(3, 3),activation=tf.keras.layers.PReLU(),input_shape=(64,64,3)))
297 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
298 | Malware_model.add(Conv2D(15, (3, 3), activation=tf.keras.layers.PReLU()))
299 | Malware_model.add(MaxPooling2D(pool_size=(2, 2)))
300 | Malware_model.add(Dropout(0.25))
301 | Malware_model.add(Flatten())
302 | Malware_model.add(Dense(128, activation=tf.keras.layers.PReLU()))
303 | Malware_model.add(Dropout(0.5))
304 | Malware_model.add(Dense(50, activation=tf.keras.layers.PReLU()))
305 | Malware_model.add(Dense(num_classes, activation='softmax'))
306 | Malware_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics=['accuracy'])
307 | y_train_new = np.argmax(y_train, axis=1)
308 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
309 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
310 | 
311 | scores = Malware_model.evaluate(X_test, y_test)
312 | 
313 | print('Final CNN accuracy using Parametric ReLU: ', scores[1])
314 | 
315 | dic['Parametric Relu']=0.9666092991828918
316 | 
317 | """## Tanh Function"""
318 | 
319 | print("---------------------------------Model using activation function: Tanh ---------------------------------")
320 | 
321 | Malware_model = malware_model('tanh')
322 | 
323 | y_train_new = np.argmax(y_train, axis=1)
324 | 
325 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
326 | 
327 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
328 | 
329 | scores = Malware_model.evaluate(X_test, y_test)
330 | 
331 | print('Final CNN accuracy using Tanh: ', scores[1])
332 | 
333 | dic['Tanh']= 0.9590361714363098
334 | 
335 | """##ELU"""
336 | 
337 | print("---------------------------------Model using activation function: Exponential Linear Unit ---------------------------------")
338 | 
339 | Malware_model = malware_model('elu')
340 | 
341 | y_train_new = np.argmax(y_train, axis=1)
342 | 
343 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
344 | 
345 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
346 | 
347 | scores = Malware_model.evaluate(X_test, y_test)
348 | 
349 | print('Final CNN accuracy using ELU: ', scores[1])
350 | 
351 | dic['Elu']= 0.9562822580337524
352 | 
353 | """##Scaled Exponential Linear Unit"""
354 | 
355 | print("---------------------------------Model using activation function: Scaled Exponential Linear Unit ---------------------------------")
356 | 
357 | Malware_model = malware_model('selu')
358 | 
359 | y_train_new = np.argmax(y_train, axis=1)
360 | 
361 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
362 | 
363 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
364 | 
365 | scores = Malware_model.evaluate(X_test, y_test)
366 | 
367 | print('Final CNN accuracy using Selu: ', scores[1])
368 | 
369 | dic['Selu']= 0.9528399109840393
370 | 
371 | """## Gaussian Error Linear Unit"""
372 | 
373 | print("---------------------------------Model using activation function: Gelu ---------------------------------")
374 | 
375 | Malware_model = malware_model('gelu')
376 | 
377 | y_train_new = np.argmax(y_train, axis=1)
378 | 
379 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
380 | 
381 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
382 | 
383 | scores = Malware_model.evaluate(X_test, y_test)
384 | 
385 | print('Final CNN accuracy using gelu: ', scores[1])
386 | 
387 | dic['Gelu']= 0.9697074294090271
388 | 
389 | """##Swish"""
390 | 
391 | print("---------------------------------Model using activation function: Swish ---------------------------------")
392 | 
393 | Malware_model = malware_model('swish')
394 | 
395 | y_train_new = np.argmax(y_train, axis=1)
396 | 
397 | class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.unique(y_train_new),y=y_train_new)
398 | 
399 | Malware_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10)
400 | 
401 | scores = Malware_model.evaluate(X_test, y_test)
402 | 
403 | print('Final CNN accuracy using Swish: ', scores[1])
404 | 
405 | dic['swish']= 0.9597246050834656
406 | 
407 | """##Comparison"""
408 | 
409 | print("Accuracy of the same model with different activation functions")
410 | 
411 | for key in dic:
412 |    print(key,": ",dic[key])
413 |    print()
414 | 
415 | print("Maximum Accuracy of ",dic['Gelu']," was obtained by using activation function as Gaussian Error Linear Unit (GELU)")


--------------------------------------------------------------------------------