├── CIC IDS 2018
    ├── FFNN_BC
    │   ├── keras_metadata.pb
    │   ├── saved_model.pb
    │   └── variables
    │   │   ├── variables.data-00000-of-00001
    │   │   └── variables.index
    ├── FFNN_MC
    │   ├── keras_metadata.pb
    │   ├── saved_model.pb
    │   └── variables
    │   │   ├── variables.data-00000-of-00001
    │   │   └── variables.index
    ├── Group6_cicids_FFNN.ipynb
    ├── Group6_cicids_LSTM.ipynb
    ├── LSTM_BC
    │   ├── keras_metadata.pb
    │   └── saved_model.pb
    ├── LSTM_MC_L2
    │   ├── keras_metadata.pb
    │   └── saved_model.pb
    └── README.md
├── LICENSE
├── NSL-KDD
    ├── Best_CNN
    │   └── checkpoint.hdf5
    ├── Best_NN
    │   └── checkpoint.hdf5
    ├── Data
    │   ├── KDDTest+.txt
    │   ├── KDDTest-21.txt
    │   ├── KDDTrain+.txt
    │   └── KDDTrain+_20Percent.txt
    ├── Group_6_NSL_KDD.ipynb
    ├── README.md
    ├── img
    │   ├── FNN-arch.drawio
    │   ├── FNN-arch.png
    │   ├── acc.png
    │   ├── acc_CNN.png
    │   ├── acc_nn.png
    │   ├── arch_CNN.png
    │   ├── attacks_table.png
    │   ├── cm_CNN.png
    │   ├── cm_NN.png
    │   ├── cm_RF.png
    │   ├── flags.png
    │   ├── flags_hist.png
    │   ├── fpr.png
    │   ├── macro_category.png
    │   ├── services.png
    │   ├── services_hist.png
    │   └── tpr.png
    └── resources
    │   ├── column_info.pdf
    │   ├── flag_info.jpg
    │   └── lib_info.png
└── README.md

/CIC IDS 2018/FFNN_BC/keras_metadata.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_BC/keras_metadata.pb
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_BC/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_BC/saved_model.pb
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_BC/variables/variables.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_BC/variables/variables.data-00000-of-00001
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_BC/variables/variables.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_BC/variables/variables.index
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_MC/keras_metadata.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_MC/keras_metadata.pb
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_MC/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_MC/saved_model.pb
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_MC/variables/variables.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_MC/variables/variables.data-00000-of-00001
--------------------------------------------------------------------------------
/CIC IDS 2018/FFNN_MC/variables/variables.index:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/FFNN_MC/variables/variables.index -------------------------------------------------------------------------------- /CIC IDS 2018/Group6_cicids_FFNN.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "include_colab_link": true 8 | }, 9 | "kernelspec": { 10 | "name": "python3", 11 | "display_name": "Python 3" 12 | }, 13 | "language_info": { 14 | "name": "python" 15 | } 16 | }, 17 | "cells": [ 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "id": "view-in-github", 22 | "colab_type": "text" 23 | }, 24 | "source": [ 25 | "\"Open" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "source": [ 31 | "\n", 32 | "## Intrusion Detection System (CIC IDS 2018)\n", 33 | "\n", 34 | "---\n", 35 | "\n", 36 | "\n", 37 | "### Group 6 - Sleety, Thejus, Tejas and Rahul \n", 38 | "#### Rajagiri School of Engineering and Technology (KTU 2019 Scheme)" 39 | ], 40 | "metadata": { 41 | "id": "CP6dTDOCPhCg" 42 | } 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": { 47 | "id": "vHveTnF7vBef" 48 | }, 49 | "source": [ 50 | "##**IDS**" 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "metadata": { 56 | "id": "NLezzTSoMcI8" 57 | }, 58 | "source": [ 59 | "import tensorflow as tf\n", 60 | "import pandas as pd\n", 61 | "import numpy as np\n", 62 | "import sklearn\n", 63 | "from keras.models import Sequential, load_model\n", 64 | "from keras.utils import np_utils\n", 65 | "from sklearn.model_selection import train_test_split\n", 66 | "from sklearn.linear_model import LogisticRegression\n", 67 | "from sklearn import preprocessing\n", 68 | "from sklearn.preprocessing import LabelEncoder, 
StandardScaler\n", 69 | "from sklearn.metrics import confusion_matrix, precision_score, recall_score\n", 70 | "import seaborn as sn" 71 | ], 72 | "execution_count": 25, 73 | "outputs": [] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "metadata": { 78 | "colab": { 79 | "base_uri": "https://localhost:8080/" 80 | }, 81 | "id": "9TWNodDe1kHu", 82 | "outputId": "a4e73f3a-1f62-405f-af01-63d0f9281ac6" 83 | }, 84 | "source": [ 85 | "from google.colab import drive\n", 86 | "drive.mount('/content/drive')" 87 | ], 88 | "execution_count": 2, 89 | "outputs": [ 90 | { 91 | "output_type": "stream", 92 | "name": "stdout", 93 | "text": [ 94 | "Mounted at /content/drive\n" 95 | ] 96 | } 97 | ] 98 | }, 99 | { 100 | "cell_type": "code", 101 | "metadata": { 102 | "id": "KzFzH3CAMmxQ", 103 | "colab": { 104 | "base_uri": "https://localhost:8080/" 105 | }, 106 | "outputId": "f49092d6-6d2a-4e1c-b6e5-d72a3e85f1aa" 107 | }, 108 | "source": [ 109 | "df = pd.read_csv('/content/drive/My Drive/cicids/cic/02-16-2018.csv')\n", 110 | "df.drop(df.loc[df['Label'] == 'Label'].index, inplace=True)" 111 | ], 112 | "execution_count": 3, 113 | "outputs": [ 114 | { 115 | "output_type": "stream", 116 | "name": "stderr", 117 | "text": [ 118 | "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py:3326: DtypeWarning: Columns (0,1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78) have mixed types.Specify dtype option on import or set low_memory=False.\n", 119 | " exec(code_obj, self.user_global_ns, self.user_ns)\n" 120 | ] 121 | } 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "metadata": { 127 | "colab": { 128 | "base_uri": "https://localhost:8080/", 129 | "height": 338 130 | }, 131 | "id": "oQdjN1DTBLWl", 132 | "outputId": "3161be4a-3155-4104-98a3-17f9ae436f0a" 133 | }, 134 | "source": [ 135 | "df.describe()" 
136 | ], 137 | "execution_count": 4, 138 | "outputs": [ 139 | { 140 | "output_type": "execute_result", 141 | "data": { 142 | "text/plain": [ 143 | " Dst Port Protocol Timestamp Flow Duration Tot Fwd Pkts \\\n", 144 | "count 1048574 1048574 1048574 1048574 1048574 \n", 145 | "unique 14463 6 3177 453875 42 \n", 146 | "top 80 6 16/02/2018 01:45:28 2 5 \n", 147 | "freq 461655 1040250 8403 58706 426407 \n", 148 | "\n", 149 | " Tot Bwd Pkts TotLen Fwd Pkts TotLen Bwd Pkts Fwd Pkt Len Max \\\n", 150 | "count 1048574 1048574 1048574 1048574 \n", 151 | "unique 36 480 861 181 \n", 152 | "top 0 0 0 0 \n", 153 | "freq 438014 572790 572823 572790 \n", 154 | "\n", 155 | " Fwd Pkt Len Min ... Fwd Seg Size Min Active Mean Active Std \\\n", 156 | "count 1048574 ... 1048574 1048574 1048574.0 \n", 157 | "unique 6 ... 10 5695 18.0 \n", 158 | "top 0 ... 32 0 0.0 \n", 159 | "freq 1040366 ... 905621 1031324 1040366.0 \n", 160 | "\n", 161 | " Active Max Active Min Idle Mean Idle Std Idle Max Idle Min \\\n", 162 | "count 1048574 1048574 1048574.0 1048574.0 1048574 1048574 \n", 163 | "unique 5696 5695 46001.0 137.0 46001 46001 \n", 164 | "top 0 0 0.0 0.0 0 0 \n", 165 | "freq 1031324 1031324 982182.0 1040244.0 982182 982182 \n", 166 | "\n", 167 | " Label \n", 168 | "count 1048574 \n", 169 | "unique 3 \n", 170 | "top DoS attacks-Hulk \n", 171 | "freq 461912 \n", 172 | "\n", 173 | "[4 rows x 80 columns]" 174 | ], 175 | "text/html": [ 176 | "\n", 177 | "
\n", 396 | " " 397 | ] 398 | }, 399 | "metadata": {}, 400 | "execution_count": 4 401 | } 402 | ] 403 | }, 404 | { 405 | "cell_type": "code", 406 | "metadata": { 407 | "id": "YZFMTRPvM9ts", 408 | "colab": { 409 | "base_uri": "https://localhost:8080/", 410 | "height": 438 411 | }, 412 | "outputId": "319a5821-3304-4da8-cc75-1e7b464c579b" 413 | }, 414 | "source": [ 415 | "df.head()" 416 | ], 417 | "execution_count": 5, 418 | "outputs": [ 419 | { 420 | "output_type": "execute_result", 421 | "data": { 422 | "text/plain": [ 423 | " Dst Port Protocol Timestamp Flow Duration Tot Fwd Pkts \\\n", 424 | "0 0 0 16/02/2018 08:27:23 112640768 3 \n", 425 | "1 0 0 16/02/2018 08:30:12 112641773 3 \n", 426 | "2 35605 6 16/02/2018 08:26:55 20784143 23 \n", 427 | "3 0 0 16/02/2018 08:33:01 112640836 3 \n", 428 | "4 23 6 16/02/2018 08:27:59 20 1 \n", 429 | "\n", 430 | " Tot Bwd Pkts TotLen Fwd Pkts TotLen Bwd Pkts Fwd Pkt Len Max \\\n", 431 | "0 0 0 0 0 \n", 432 | "1 0 0 0 0 \n", 433 | "2 44 2416 1344 240 \n", 434 | "3 0 0 0 0 \n", 435 | "4 1 0 0 0 \n", 436 | "\n", 437 | " Fwd Pkt Len Min ... Fwd Seg Size Min Active Mean Active Std Active Max \\\n", 438 | "0 0 ... 0 0 0.0 0 \n", 439 | "1 0 ... 0 0 0.0 0 \n", 440 | "2 64 ... 20 2624734 0.0 2624734 \n", 441 | "3 0 ... 0 0 0.0 0 \n", 442 | "4 0 ... 20 0 0.0 0 \n", 443 | "\n", 444 | " Active Min Idle Mean Idle Std Idle Max Idle Min Label \n", 445 | "0 0 56300000.0 138.592929 56300000 56300000 Benign \n", 446 | "1 0 56300000.0 263.750829 56300000 56300000 Benign \n", 447 | "2 2624734 9058214.0 0.0 9058214 9058214 Benign \n", 448 | "3 0 56300000.0 82.024387 56300000 56300000 Benign \n", 449 | "4 0 0.0 0.0 0 0 Benign \n", 450 | "\n", 451 | "[5 rows x 80 columns]" 452 | ], 453 | "text/html": [ 454 | "\n", 455 | "
\n", 698 | " " 699 | ] 700 | }, 701 | "metadata": {}, 702 | "execution_count": 5 703 | } 704 | ] 705 | }, 706 | { 707 | "cell_type": "code", 708 | "metadata": { 709 | "id": "HixtSvMpM_Fq" 710 | }, 711 | "source": [ 712 | "metadata = ['fl_dur' #Flow duration\n", 713 | ",'tot_fw_pk' #Total packets in the forward direction\n", 714 | ",'tot_bw_pk' #Total packets in the backward direction\n", 715 | ",'tot_l_fw_pkt' #Total size of packet in forward direction\n", 716 | ",'fw_pkt_l_max' #Maximum size of packet in forward direction\n", 717 | ",'fw_pkt_l_min' #Minimum size of packet in forward direction\n", 718 | ",'fw_pkt_l_avg' #Average size of packet in forward direction\n", 719 | ",'fw_pkt_l_std' #Standard deviation size of packet in forward direction\n", 720 | ",'Bw_pkt_l_max' #Maximum size of packet in backward direction\n", 721 | ",'Bw_pkt_l_min' #Minimum size of packet in backward direction\n", 722 | ",'Bw_pkt_l_avg' #Mean size of packet in backward direction\n", 723 | ",'Bw_pkt_l_std' #Standard deviation size of packet in backward direction\n", 724 | ",'fl_byt_s' #flow byte rate that is number of bytes transferred per second\n", 725 | ",'fl_pkt_s' #flow packets rate that is number of packets transferred per second\n", 726 | ",'fl_iat_avg' #Average time between two flows\n", 727 | ",'fl_iat_std' #Standard deviation of time between two flows\n", 728 | ",'fl_iat_max' #Maximum time between two flows\n", 729 | ",'fl_iat_min' #Minimum time between two flows\n", 730 | ",'fw_iat_tot' #Total time between two packets sent in the forward direction\n", 731 | ",'fw_iat_avg' #Mean time between two packets sent in the forward direction\n", 732 | ",'fw_iat_std' #Standard deviation time between two packets sent in the forward direction\n", 733 | ",'fw_iat_max' #Maximum time between two packets sent in the forward direction\n", 734 | ",'fw_iat_min' #Minimum time between two packets sent in the forward direction\n", 735 | ",'bw_iat_tot' #Total time between two packets sent in the backward 
direction\n", 736 | ",'bw_iat_avg' #Mean time between two packets sent in the backward direction\n", 737 | ",'bw_iat_std' #Standard deviation time between two packets sent in the backward direction\n", 738 | ",'bw_iat_max' #Maximum time between two packets sent in the backward direction\n", 739 | ",'bw_iat_min' #Minimum time between two packets sent in the backward direction\n", 740 | ",'fw_psh_flag' #Number of times the PSH flag was set in packets travelling in the forward direction (0 for UDP)\n", 741 | ",'bw_psh_flag' #Number of times the PSH flag was set in packets travelling in the backward direction (0 for UDP)\n", 742 | ",'fw_urg_flag' #Number of times the URG flag was set in packets travelling in the forward direction (0 for UDP)\n", 743 | ",'bw_urg_flag' #Number of times the URG flag was set in packets travelling in the backward direction (0 for UDP)\n", 744 | ",'fw_hdr_len' #Total bytes used for headers in the forward direction\n", 745 | ",'bw_hdr_len' #Total bytes used for headers in the backward direction\n", 746 | ",'fw_pkt_s' #Number of forward packets per second\n", 747 | ",'bw_pkt_s' #Number of backward packets per second\n", 748 | ",'pkt_len_min' #Minimum length of a flow\n", 749 | ",'pkt_len_max' #Maximum length of a flow\n", 750 | ",'pkt_len_avg' #Mean length of a flow\n", 751 | ",'pkt_len_std' #Standard deviation length of a flow\n", 752 | ",'pkt_len_va' #Variance of packet length\n", 753 | ",'fin_cnt' #Number of packets with FIN\n", 754 | ",'syn_cnt' #Number of packets with SYN\n", 755 | ",'rst_cnt' #Number of packets with RST\n", 756 | ",'pst_cnt' #Number of packets with PUSH\n", 757 | ",'ack_cnt' #Number of packets with ACK\n", 758 | ",'urg_cnt' #Number of packets with URG\n", 759 | ",'cwe_cnt' #Number of packets with CWE\n", 760 | ",'ece_cnt' #Number of packets with ECE\n", 761 | ",'down_up_ratio' #Download and upload ratio\n", 762 | ",'pkt_size_avg' #Average size of packet\n", 763 | ",'fw_seg_avg' #Average size observed in the 
forward direction\n", 764 | ",'bw_seg_avg' #Average size observed in the backward direction\n", 765 | ",'fw_byt_blk_avg' #Average number of bytes bulk rate in the forward direction\n", 766 | ",'fw_pkt_blk_avg' #Average number of packets bulk rate in the forward direction\n", 767 | ",'fw_blk_rate_avg' #Average number of bulk rate in the forward direction\n", 768 | ",'bw_byt_blk_avg' #Average number of bytes bulk rate in the backward direction\n", 769 | ",'bw_pkt_blk_avg' #Average number of packets bulk rate in the backward direction\n", 770 | ",'bw_blk_rate_avg' #Average number of bulk rate in the backward direction\n", 771 | ",'subfl_fw_pk' #The average number of packets in a sub flow in the forward direction\n", 772 | ",'subfl_fw_byt' #The average number of bytes in a sub flow in the forward direction\n", 773 | ",'subfl_bw_pkt' #The average number of packets in a sub flow in the backward direction\n", 774 | ",'subfl_bw_byt' #The average number of bytes in a sub flow in the backward direction\n", 775 | ",'fw_win_byt' #Number of bytes sent in initial window in the forward direction\n", 776 | ",'bw_win_byt' ## of bytes sent in initial window in the backward direction\n", 777 | ",'Fw_act_pkt' ## of packets with at least 1 byte of TCP data payload in the forward direction\n", 778 | ",'fw_seg_min' #Minimum segment size observed in the forward direction\n", 779 | ",'atv_avg' #Mean time a flow was active before becoming idle\n", 780 | ",'atv_std' #Standard deviation time a flow was active before becoming idle\n", 781 | ",'atv_max' #Maximum time a flow was active before becoming idle\n", 782 | ",'atv_min' #Minimum time a flow was active before becoming idle\n", 783 | ",'idl_avg' #Mean time a flow was idle before becoming active\n", 784 | ",'idl_std' #Standard deviation time a flow was idle before becoming active\n", 785 | ",'idl_max' #Maximum time a flow was idle before becoming active\n", 786 | ",'idl_min' #Minimum time a flow was idle before becoming active\n", 787 | "]" 
788 | ], 789 | "execution_count": 6, 790 | "outputs": [] 791 | }, 792 | { 793 | "cell_type": "code", 794 | "metadata": { 795 | "id": "s9ZjigalNCIZ", 796 | "colab": { 797 | "base_uri": "https://localhost:8080/" 798 | }, 799 | "outputId": "1b2a9372-efc0-42f9-ee37-cd496b7eaffe" 800 | }, 801 | "source": [ 802 | "df.columns" 803 | ], 804 | "execution_count": 7, 805 | "outputs": [ 806 | { 807 | "output_type": "execute_result", 808 | "data": { 809 | "text/plain": [ 810 | "Index(['Dst Port', 'Protocol', 'Timestamp', 'Flow Duration', 'Tot Fwd Pkts',\n", 811 | " 'Tot Bwd Pkts', 'TotLen Fwd Pkts', 'TotLen Bwd Pkts', 'Fwd Pkt Len Max',\n", 812 | " 'Fwd Pkt Len Min', 'Fwd Pkt Len Mean', 'Fwd Pkt Len Std',\n", 813 | " 'Bwd Pkt Len Max', 'Bwd Pkt Len Min', 'Bwd Pkt Len Mean',\n", 814 | " 'Bwd Pkt Len Std', 'Flow Byts/s', 'Flow Pkts/s', 'Flow IAT Mean',\n", 815 | " 'Flow IAT Std', 'Flow IAT Max', 'Flow IAT Min', 'Fwd IAT Tot',\n", 816 | " 'Fwd IAT Mean', 'Fwd IAT Std', 'Fwd IAT Max', 'Fwd IAT Min',\n", 817 | " 'Bwd IAT Tot', 'Bwd IAT Mean', 'Bwd IAT Std', 'Bwd IAT Max',\n", 818 | " 'Bwd IAT Min', 'Fwd PSH Flags', 'Bwd PSH Flags', 'Fwd URG Flags',\n", 819 | " 'Bwd URG Flags', 'Fwd Header Len', 'Bwd Header Len', 'Fwd Pkts/s',\n", 820 | " 'Bwd Pkts/s', 'Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',\n", 821 | " 'Pkt Len Std', 'Pkt Len Var', 'FIN Flag Cnt', 'SYN Flag Cnt',\n", 822 | " 'RST Flag Cnt', 'PSH Flag Cnt', 'ACK Flag Cnt', 'URG Flag Cnt',\n", 823 | " 'CWE Flag Count', 'ECE Flag Cnt', 'Down/Up Ratio', 'Pkt Size Avg',\n", 824 | " 'Fwd Seg Size Avg', 'Bwd Seg Size Avg', 'Fwd Byts/b Avg',\n", 825 | " 'Fwd Pkts/b Avg', 'Fwd Blk Rate Avg', 'Bwd Byts/b Avg',\n", 826 | " 'Bwd Pkts/b Avg', 'Bwd Blk Rate Avg', 'Subflow Fwd Pkts',\n", 827 | " 'Subflow Fwd Byts', 'Subflow Bwd Pkts', 'Subflow Bwd Byts',\n", 828 | " 'Init Fwd Win Byts', 'Init Bwd Win Byts', 'Fwd Act Data Pkts',\n", 829 | " 'Fwd Seg Size Min', 'Active Mean', 'Active Std', 'Active Max',\n", 830 | " 'Active Min', 'Idle Mean', 
'Idle Std', 'Idle Max', 'Idle Min', 'Label'],\n", 831 | " dtype='object')" 832 | ] 833 | }, 834 | "metadata": {}, 835 | "execution_count": 7 836 | } 837 | ] 838 | }, 839 | { 840 | "cell_type": "markdown", 841 | "metadata": { 842 | "id": "yGdYt-UEvj2J" 843 | }, 844 | "source": [ 845 | "## **FFNN**" 846 | ] 847 | }, 848 | { 849 | "cell_type": "markdown", 850 | "metadata": { 851 | "id": "_lhN6l71KkhQ" 852 | }, 853 | "source": [ 854 | "**Binary Class**" 855 | ] 856 | }, 857 | { 858 | "cell_type": "code", 859 | "metadata": { 860 | "id": "TZXxDViHNFt7" 861 | }, 862 | "source": [ 863 | "features = ['Flow Duration', 'Tot Fwd Pkts',\n", 864 | " 'Tot Bwd Pkts', 'TotLen Fwd Pkts', 'TotLen Bwd Pkts', 'Fwd Pkt Len Max',\n", 865 | " 'Fwd Pkt Len Min', 'Fwd Pkt Len Mean', 'Fwd Pkt Len Std',\n", 866 | " 'Bwd Pkt Len Max', 'Bwd Pkt Len Min', 'Bwd Pkt Len Mean',\n", 867 | " 'Bwd Pkt Len Std', 'Flow Byts/s', 'Flow Pkts/s', 'Flow IAT Mean',\n", 868 | " 'Flow IAT Std', 'Flow IAT Max', 'Flow IAT Min', 'Fwd IAT Tot',\n", 869 | " 'Fwd IAT Mean', 'Fwd IAT Std', 'Fwd IAT Max', 'Fwd IAT Min',\n", 870 | " 'Bwd IAT Tot', 'Bwd IAT Mean', 'Bwd IAT Std', 'Bwd IAT Max',\n", 871 | " 'Bwd IAT Min', 'Fwd PSH Flags', 'Bwd PSH Flags', 'Fwd URG Flags',\n", 872 | " 'Bwd URG Flags', 'Fwd Header Len', 'Bwd Header Len', 'Fwd Pkts/s',\n", 873 | " 'Bwd Pkts/s', 'Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',\n", 874 | " 'Pkt Len Std', 'Pkt Len Var', 'FIN Flag Cnt', 'SYN Flag Cnt',\n", 875 | " 'RST Flag Cnt', 'PSH Flag Cnt', 'ACK Flag Cnt', 'URG Flag Cnt',\n", 876 | " 'CWE Flag Count', 'ECE Flag Cnt', 'Down/Up Ratio', 'Pkt Size Avg',\n", 877 | " 'Fwd Seg Size Avg', 'Bwd Seg Size Avg', 'Fwd Byts/b Avg',\n", 878 | " 'Fwd Pkts/b Avg', 'Fwd Blk Rate Avg', 'Bwd Byts/b Avg',\n", 879 | " 'Bwd Pkts/b Avg', 'Bwd Blk Rate Avg', 'Subflow Fwd Pkts',\n", 880 | " 'Subflow Fwd Byts', 'Subflow Bwd Pkts', 'Subflow Bwd Byts',\n", 881 | " 'Init Fwd Win Byts', 'Init Bwd Win Byts', 'Fwd Act Data Pkts',\n", 882 | " 'Fwd Seg Size 
Min', 'Active Mean', 'Active Std', 'Active Max',\n", 883 | " 'Active Min', 'Idle Mean', 'Idle Std', 'Idle Max', 'Idle Min']" 884 | ], 885 | "execution_count": 8, 886 | "outputs": [] 887 | }, 888 | { 889 | "cell_type": "code", 890 | "metadata": { 891 | "id": "17a7f5rWNNXZ" 892 | }, 893 | "source": [ 894 | "def targetify(s):\n", 895 | " if s == 'Benign':\n", 896 | " return 0\n", 897 | " else:\n", 898 | " return 1" 899 | ], 900 | "execution_count": 9, 901 | "outputs": [] 902 | }, 903 | { 904 | "cell_type": "code", 905 | "metadata": { 906 | "id": "ALq0oAMFNE3x", 907 | "colab": { 908 | "base_uri": "https://localhost:8080/" 909 | }, 910 | "outputId": "eff0f19a-b710-4db8-bc2b-c4b937d4fd26" 911 | }, 912 | "source": [ 913 | "X = df[features]\n", 914 | "X[features] = X[features].apply(pd.to_numeric, errors='coerce', axis=1)\n", 915 | "X = X.fillna(0)\n", 916 | "labels = df['Label'] #For multiclass classification\n", 917 | "df['Target']=df['Label'].apply(targetify)\n", 918 | "y = df['Target']" 919 | ], 920 | "execution_count": 10, 921 | "outputs": [ 922 | { 923 | "output_type": "stream", 924 | "name": "stderr", 925 | "text": [ 926 | "/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3641: SettingWithCopyWarning: \n", 927 | "A value is trying to be set on a copy of a slice from a DataFrame.\n", 928 | "Try using .loc[row_indexer,col_indexer] = value instead\n", 929 | "\n", 930 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 931 | " self[k1] = value[k2]\n" 932 | ] 933 | } 934 | ] 935 | }, 936 | { 937 | "cell_type": "code", 938 | "metadata": { 939 | "id": "8nWwSwSoNExM" 940 | }, 941 | "source": [ 942 | "min_max_scaler = preprocessing.MinMaxScaler()\n", 943 | "x_scaled = min_max_scaler.fit_transform(X.values)\n", 944 | "X = pd.DataFrame(x_scaled,columns=features)" 945 | ], 946 | "execution_count": 11, 947 | "outputs": [] 948 | }, 949 | { 950 | "cell_type": "code", 951 | 
"metadata": { 952 | "id": "GG4_M7ObNbKH", 953 | "colab": { 954 | "base_uri": "https://localhost:8080/" 955 | }, 956 | "outputId": "4b2fe5d7-bc99-422c-84a2-dd8b6c803806" 957 | }, 958 | "source": [ 959 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", 960 | "print (X_train.shape, y_train.shape)\n", 961 | "print( X_test.shape, y_test.shape)" 962 | ], 963 | "execution_count": 12, 964 | "outputs": [ 965 | { 966 | "output_type": "stream", 967 | "name": "stdout", 968 | "text": [ 969 | "(838859, 76) (838859,)\n", 970 | "(209715, 76) (209715,)\n" 971 | ] 972 | } 973 | ] 974 | }, 975 | { 976 | "cell_type": "code", 977 | "metadata": { 978 | "id": "maEm0IWENq_f" 979 | }, 980 | "source": [ 981 | "model = tf.keras.models.Sequential([\n", 982 | " tf.keras.layers.Dense(128, activation='relu'),\n", 983 | " tf.keras.layers.Dropout(0.2),\n", 984 | " tf.keras.layers.Dense(64, activation='relu'),\n", 985 | " tf.keras.layers.Dropout(0.4),\n", 986 | " tf.keras.layers.Dense(2, activation='softmax')\n", 987 | "])" 988 | ], 989 | "execution_count": 13, 990 | "outputs": [] 991 | }, 992 | { 993 | "cell_type": "code", 994 | "metadata": { 995 | "id": "3Tdxjo5VNq1U", 996 | "colab": { 997 | "base_uri": "https://localhost:8080/" 998 | }, 999 | "outputId": "128485cc-a2ee-488e-bf36-77c54c5a6178" 1000 | }, 1001 | "source": [ 1002 | "model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])\n", 1003 | "model.fit(X_train.values, y_train.values, epochs=5)\n", 1004 | "model.save('drive/MyDrive/cicids/FFNN_BC')" 1005 | ], 1006 | "execution_count": 14, 1007 | "outputs": [ 1008 | { 1009 | "output_type": "stream", 1010 | "name": "stdout", 1011 | "text": [ 1012 | "Epoch 1/5\n", 1013 | "26215/26215 [==============================] - 55s 2ms/step - loss: 0.0138 - accuracy: 0.9979\n", 1014 | "Epoch 2/5\n", 1015 | "26215/26215 [==============================] - 60s 2ms/step - loss: 0.0117 - accuracy: 0.9983\n", 1016 | "Epoch 3/5\n", 1017 | 
"26215/26215 [==============================] - 54s 2ms/step - loss: 0.0115 - accuracy: 0.9983\n", 1018 | "Epoch 4/5\n", 1019 | "26215/26215 [==============================] - 55s 2ms/step - loss: 0.0115 - accuracy: 0.9983\n", 1020 | "Epoch 5/5\n", 1021 | "26215/26215 [==============================] - 55s 2ms/step - loss: 0.0113 - accuracy: 0.9983\n" 1022 | ] 1023 | } 1024 | ] 1025 | }, 1026 | { 1027 | "cell_type": "code", 1028 | "metadata": { 1029 | "id": "i8qkqLZfNqou", 1030 | "colab": { 1031 | "base_uri": "https://localhost:8080/" 1032 | }, 1033 | "outputId": "cb15357b-1db1-4641-d468-6fdaf38dad22" 1034 | }, 1035 | "source": [ 1036 | "predictions = model.predict(X_test.values)[:,1]\n", 1037 | "predictions = [int(round(x)) for x in predictions]\n", 1038 | "np.sum(predictions == y_test.values) / len(y_test.values)" 1039 | ], 1040 | "execution_count": 15, 1041 | "outputs": [ 1042 | { 1043 | "output_type": "stream", 1044 | "name": "stdout", 1045 | "text": [ 1046 | "6554/6554 [==============================] - 9s 1ms/step\n" 1047 | ] 1048 | }, 1049 | { 1050 | "output_type": "execute_result", 1051 | "data": { 1052 | "text/plain": [ 1053 | "0.9983119948501538" 1054 | ] 1055 | }, 1056 | "metadata": {}, 1057 | "execution_count": 15 1058 | } 1059 | ] 1060 | }, 1061 | { 1062 | "cell_type": "code", 1063 | "metadata": { 1064 | "colab": { 1065 | "base_uri": "https://localhost:8080/" 1066 | }, 1067 | "id": "Eg8wI42LHPPV", 1068 | "outputId": "49338ac2-6674-46f4-8969-ecca8771d866" 1069 | }, 1070 | "source": [ 1071 | "confMat = confusion_matrix(y_test.values, predictions)\n", 1072 | "confMat" 1073 | ], 1074 | "execution_count": 16, 1075 | "outputs": [ 1076 | { 1077 | "output_type": "execute_result", 1078 | "data": { 1079 | "text/plain": [ 1080 | "array([[ 88959, 354],\n", 1081 | " [ 0, 120402]])" 1082 | ] 1083 | }, 1084 | "metadata": {}, 1085 | "execution_count": 16 1086 | } 1087 | ] 1088 | }, 1089 | { 1090 | "cell_type": "code", 1091 | "metadata": { 1092 | "id": "HxGIS0PHUv04", 
1093 | "colab": { 1094 | "base_uri": "https://localhost:8080/" 1095 | }, 1096 | "outputId": "217744a9-cf8c-415f-fb5a-b4fe8dea650a" 1097 | }, 1098 | "source": [ 1099 | "precision_score(y_test, predictions)" 1100 | ], 1101 | "execution_count": 17, 1102 | "outputs": [ 1103 | { 1104 | "output_type": "execute_result", 1105 | "data": { 1106 | "text/plain": [ 1107 | "0.9970684686475206" 1108 | ] 1109 | }, 1110 | "metadata": {}, 1111 | "execution_count": 17 1112 | } 1113 | ] 1114 | }, 1115 | { 1116 | "cell_type": "code", 1117 | "metadata": { 1118 | "id": "FUqP8hetU3ym" 1119 | }, 1120 | "source": [ 1121 | "#recall_score(y_test.values, predictions)" 1122 | ], 1123 | "execution_count": 61, 1124 | "outputs": [] 1125 | }, 1126 | { 1127 | "cell_type": "code", 1128 | "metadata": { 1129 | "id": "mU5wAR5r6c-J", 1130 | "outputId": "d5cbbf57-e34f-4a2e-8eb3-7eb8e48c025b", 1131 | "colab": { 1132 | "base_uri": "https://localhost:8080/", 1133 | "height": 282 1134 | } 1135 | }, 1136 | "source": [ 1137 | "cf_matrix = confusion_matrix(y_test.values, predictions)\n", 1138 | "sn.heatmap(cf_matrix / np.sum(cf_matrix), annot=True, fmt='.2%', cmap='Blues')" 1139 | ], 1140 | "execution_count": 19, 1141 | "outputs": [ 1142 | { 1143 | "output_type": "execute_result", 1144 | "data": { 1145 | "text/plain": [ 1146 | "" 1147 | ] 1148 | }, 1149 | "metadata": {}, 1150 | "execution_count": 19 1151 | }, 1152 | { 1153 | "output_type": "display_data", 1154 | "data": { 1155 | "text/plain": [ 1156 | "
" 1157 | ], 1158 | "image/png": "\n" 1159 | }, 1160 | "metadata": { 1161 | "needs_background": "light" 1162 | } 1163 | } 1164 | ] 1165 | }, 1166 | { 1167 | "cell_type": "markdown", 1168 | "metadata": { 1169 | "id": "SnS1tL8eQMGp" 1170 | }, 1171 | "source": [ 1172 | "**Multiclass**" 1173 | ] 1174 | }, 1175 | { 1176 | "cell_type": "code", 1177 | "metadata": { 1178 | "id": "V_Pe0g0YUVbY" 1179 | }, 1180 | "source": [ 1181 | "categories = ['Benign', 'FTP-BruteForce', 'SSH-Bruteforce',\n", 1182 | " 'DoS attacks-GoldenEye', 'DoS attacks-Slowloris', 'DoS attacks-SlowHTTPTest',\n", 1183 | " 'DoS attacks-Hulk', 'Brute Force -Web', 'Brute Force -XSS',\n", 1184 | " 'SQL Injection', 'Infiltration', 'Bot']" 1185 | ], 1186 | "execution_count": 20, 1187 | "outputs": [] 1188 | }, 1189 | { 1190 | "cell_type": "code", 1191 | "metadata": { 1192 | "id": "K1uUFkSq0ZEv" 1193 | }, 1194 | "source": [ 1195 | "labels = df['Label']" 1196 | ], 1197 | "execution_count": 21, 1198 | "outputs": [] 1199 | }, 1200 | { 1201 | "cell_type": "code", 1202 | "metadata": { 1203 | "id": "VcKnKgcmT7x2" 1204 | }, 1205 | "source": [ 1206 | "encoder = LabelEncoder()\n", 1207 | "encoder.fit(categories)\n", 1208 | "y = encoder.transform(labels)\n", 1209 | "y = np_utils.to_categorical(y, num_classes=12)" 1210 | ], 1211 | "execution_count": 22, 1212 | "outputs": [] 1213 | }, 1214 | { 1215 | "cell_type": "code", 1216 | "metadata": { 1217 | "id": "ZXWj5nY5HRkC", 1218 | "outputId": "b75c075a-bae4-44a7-e6b0-736515bbc87f", 1219 | "colab": { 1220 | "base_uri": "https://localhost:8080/" 1221 | } 1222 | }, 1223 | "source": [ 1224 | "y.shape" 1225 | ], 1226 | "execution_count": 23, 1227 | "outputs": [ 1228 | { 1229 | "output_type": "execute_result", 1230 | "data": { 1231 | "text/plain": [ 1232 | "(1048574, 12)" 1233 | ] 1234 | }, 1235 | "metadata": {}, 1236 | "execution_count": 23 1237 | } 1238 | ] 1239 | }, 1240 | { 1241 | "cell_type": "code", 1242 | "metadata": { 1243 | "id": "rkGTxRFQIRRI", 1244 | "outputId": 
"f92afb8c-0917-46d0-ae25-3ecf9333f042", 1245 | "colab": { 1246 | "base_uri": "https://localhost:8080/" 1247 | } 1248 | }, 1249 | "source": [ 1250 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", 1251 | "print (X_train.shape, y_train.shape)\n", 1252 | "print( X_test.shape, y_test.shape)" 1253 | ], 1254 | "execution_count": 24, 1255 | "outputs": [ 1256 | { 1257 | "output_type": "stream", 1258 | "name": "stdout", 1259 | "text": [ 1260 | "(838859, 76) (838859, 12)\n", 1261 | "(209715, 76) (209715, 12)\n" 1262 | ] 1263 | } 1264 | ] 1265 | }, 1266 | { 1267 | "cell_type": "code", 1268 | "metadata": { 1269 | "id": "0bvqaOJyxFLP" 1270 | }, 1271 | "source": [ 1272 | "model = tf.keras.models.Sequential([\n", 1273 | " tf.keras.layers.Dense(128, activation='relu'),\n", 1274 | " tf.keras.layers.Dropout(0.2),\n", 1275 | " tf.keras.layers.Dense(64, activation='relu'),\n", 1276 | " tf.keras.layers.Dropout(0.4),\n", 1277 | " tf.keras.layers.Dense(12, activation='softmax')\n", 1278 | "])" 1279 | ], 1280 | "execution_count": 28, 1281 | "outputs": [] 1282 | }, 1283 | { 1284 | "cell_type": "code", 1285 | "metadata": { 1286 | "id": "P3HeMf-SxH2s", 1287 | "colab": { 1288 | "base_uri": "https://localhost:8080/" 1289 | }, 1290 | "outputId": "0f23f570-b5ae-4161-828c-783b2bce718c" 1291 | }, 1292 | "source": [ 1293 | "model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", 1294 | "model.fit(X_train, y_train, epochs=5)\n", 1295 | "model.save('drive/MyDrive/cicids/FFNN_MC')" 1296 | ], 1297 | "execution_count": 29, 1298 | "outputs": [ 1299 | { 1300 | "output_type": "stream", 1301 | "name": "stdout", 1302 | "text": [ 1303 | "Epoch 1/5\n", 1304 | "26215/26215 [==============================] - 59s 2ms/step - loss: 0.0174 - accuracy: 0.9971\n", 1305 | "Epoch 2/5\n", 1306 | "26215/26215 [==============================] - 58s 2ms/step - loss: 0.0117 - accuracy: 0.9983\n", 1307 | "Epoch 3/5\n", 1308 | "26215/26215 
[==============================] - 61s 2ms/step - loss: 0.0115 - accuracy: 0.9983\n", 1309 | "Epoch 4/5\n", 1310 | "26215/26215 [==============================] - 57s 2ms/step - loss: 0.0114 - accuracy: 0.9983\n", 1311 | "Epoch 5/5\n", 1312 | "26215/26215 [==============================] - 59s 2ms/step - loss: 0.0114 - accuracy: 0.9983\n" 1313 | ] 1314 | } 1315 | ] 1316 | }, 1317 | { 1318 | "cell_type": "code", 1319 | "metadata": { 1320 | "id": "njwo7PmSxNcW", 1321 | "colab": { 1322 | "base_uri": "https://localhost:8080/" 1323 | }, 1324 | "outputId": "76f5e0cc-6f73-4954-d644-5ca04ac5abf2" 1325 | }, 1326 | "source": [ 1327 | "predictions = model.predict(X_test.values)[:,:1] # NOTE: [:,:1] keeps only the first class column, so the argmax below is always 0; drop the slice to get per-class predictions\n", 1328 | "predictions = predictions.argmax(axis=1)\n", 1329 | "y_test = y_test.argmax(axis=1)" 1330 | ], 1331 | "execution_count": 30, 1332 | "outputs": [ 1333 | { 1334 | "output_type": "stream", 1335 | "name": "stdout", 1336 | "text": [ 1337 | "6554/6554 [==============================] - 8s 1ms/step\n" 1338 | ] 1339 | } 1340 | ] 1341 | }, 1342 | { 1343 | "cell_type": "code", 1344 | "metadata": { 1345 | "id": "IjJTIqmmPbMR", 1346 | "outputId": "3379df0f-7229-4e97-bb0a-0f1e2c00d36b", 1347 | "colab": { 1348 | "base_uri": "https://localhost:8080/" 1349 | } 1350 | }, 1351 | "source": [ 1352 | "predictions.shape\n", 1353 | "y_test.shape" 1354 | ], 1355 | "execution_count": 31, 1356 | "outputs": [ 1357 | { 1358 | "output_type": "execute_result", 1359 | "data": { 1360 | "text/plain": [ 1361 | "(209715,)" 1362 | ] 1363 | }, 1364 | "metadata": {}, 1365 | "execution_count": 31 1366 | } 1367 | ] 1368 | }, 1369 | { 1370 | "cell_type": "code", 1371 | "metadata": { 1372 | "id": "jaxlBVw8MBhn", 1373 | "outputId": "ea8acb77-9010-4172-80f9-10070e89cf58", 1374 | "colab": { 1375 | "base_uri": "https://localhost:8080/" 1376 | } 1377 | }, 1378 | "source": [ 1379 | "np.sum(predictions == y_test) / len(y_test)" 1380 | ], 1381 | "execution_count": 32, 1382 | "outputs": [ 1383 | { 1384 | "output_type": 
"execute_result", 1385 | "data": { 1386 | "text/plain": [ 1387 | "0.42555849605416873" 1388 | ] 1389 | }, 1390 | "metadata": {}, 1391 | "execution_count": 32 1392 | } 1393 | ] 1394 | }, 1395 | { 1396 | "cell_type": "code", 1397 | "metadata": { 1398 | "id": "9oQAaNYgxP5A", 1399 | "colab": { 1400 | "base_uri": "https://localhost:8080/" 1401 | }, 1402 | "outputId": "c484151e-9403-4ecf-9744-97bcbed8a49f" 1403 | }, 1404 | "source": [ 1405 | "confMat = confusion_matrix(y_test, predictions)\n", 1406 | "confMat" 1407 | ], 1408 | "execution_count": 33, 1409 | "outputs": [ 1410 | { 1411 | "output_type": "execute_result", 1412 | "data": { 1413 | "text/plain": [ 1414 | "array([[89246, 0, 0],\n", 1415 | " [92371, 0, 0],\n", 1416 | " [28098, 0, 0]])" 1417 | ] 1418 | }, 1419 | "metadata": {}, 1420 | "execution_count": 33 1421 | } 1422 | ] 1423 | }, 1424 | { 1425 | "cell_type": "code", 1426 | "metadata": { 1427 | "id": "KKlZORPDWv5y", 1428 | "outputId": "357a35a7-078b-4c1b-95b5-347bda7022ad", 1429 | "colab": { 1430 | "base_uri": "https://localhost:8080/" 1431 | } 1432 | }, 1433 | "source": [ 1434 | "precision_score(y_test, predictions, average='weighted')" 1435 | ], 1436 | "execution_count": 34, 1437 | "outputs": [ 1438 | { 1439 | "output_type": "stream", 1440 | "name": "stderr", 1441 | "text": [ 1442 | "/usr/local/lib/python3.8/dist-packages/sklearn/metrics/_classification.py:1318: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
Use `zero_division` parameter to control this behavior.\n", 1443 | " _warn_prf(average, modifier, msg_start, len(result))\n" 1444 | ] 1445 | }, 1446 | { 1447 | "output_type": "execute_result", 1448 | "data": { 1449 | "text/plain": [ 1450 | "0.18110003356388596" 1451 | ] 1452 | }, 1453 | "metadata": {}, 1454 | "execution_count": 34 1455 | } 1456 | ] 1457 | }, 1458 | { 1459 | "cell_type": "code", 1460 | "metadata": { 1461 | "id": "J_4fMPJTXAto", 1462 | "outputId": "63230c51-c283-40ac-da9e-71c3d41fc318", 1463 | "colab": { 1464 | "base_uri": "https://localhost:8080/" 1465 | } 1466 | }, 1467 | "source": [ 1468 | "recall_score(y_test, predictions, average='weighted')" 1469 | ], 1470 | "execution_count": 35, 1471 | "outputs": [ 1472 | { 1473 | "output_type": "execute_result", 1474 | "data": { 1475 | "text/plain": [ 1476 | "0.42555849605416873" 1477 | ] 1478 | }, 1479 | "metadata": {}, 1480 | "execution_count": 35 1481 | } 1482 | ] 1483 | }, 1484 | { 1485 | "cell_type": "code", 1486 | "metadata": { 1487 | "id": "21H96b-dQ1m2", 1488 | "outputId": "1c1dd5d2-2ae8-4acb-9604-0fd8d30615e5", 1489 | "colab": { 1490 | "base_uri": "https://localhost:8080/", 1491 | "height": 282 1492 | } 1493 | }, 1494 | "source": [ 1495 | "cf_matrix = confusion_matrix(y_test, predictions)\n", 1496 | "sn.heatmap(cf_matrix / np.sum(cf_matrix), annot=True, fmt='.2%', cmap='Blues')" 1497 | ], 1498 | "execution_count": 36, 1499 | "outputs": [ 1500 | { 1501 | "output_type": "execute_result", 1502 | "data": { 1503 | "text/plain": [ 1504 | "" 1505 | ] 1506 | }, 1507 | "metadata": {}, 1508 | "execution_count": 36 1509 | }, 1510 | { 1511 | "output_type": "display_data", 1512 | "data": { 1513 | "text/plain": [ 1514 | "
" 1515 | ], 1516 | "image/png": "\n" 1517 | }, 1518 | "metadata": { 1519 | "needs_background": "light" 1520 | } 1521 | } 1522 | ] 1523 | }, 1524 | { 1525 | "cell_type": "code", 1526 | "metadata": { 1527 | "id": "5WoxcdTi-AdC", 1528 | "colab": { 1529 | "base_uri": "https://localhost:8080/" 1530 | }, 1531 | "outputId": "947c938d-4015-4d81-bb74-351bdf885299" 1532 | }, 1533 | "source": [ 1534 | "! pip3 install keras\n", 1535 | "! pip3 install ann_visualizer\n", 1536 | "! pip install graphviz\n", 1537 | "! pip install h5py" 1538 | ], 1539 | "execution_count": 37, 1540 | "outputs": [ 1541 | { 1542 | "output_type": "stream", 1543 | "name": "stdout", 1544 | "text": [ 1545 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 1546 | "Requirement already satisfied: keras in /usr/local/lib/python3.8/dist-packages (2.9.0)\n", 1547 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 1548 | "Collecting ann_visualizer\n", 1549 | " Downloading ann_visualizer-2.5.tar.gz (4.7 kB)\n", 1550 | "Building wheels for collected packages: ann-visualizer\n", 1551 | " Building wheel for ann-visualizer (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", 1552 | " Created wheel for ann-visualizer: filename=ann_visualizer-2.5-py3-none-any.whl size=4168 sha256=10a9e6a4d80111e731016f7bf03dbcfe2ca23f9ef4c632b4b9bc8625cff74b52\n", 1553 | " Stored in directory: /root/.cache/pip/wheels/4b/ef/77/9b8c4ae2f9a11de19957b80bc5c684accd99114bb8dc6b374c\n", 1554 | "Successfully built ann-visualizer\n", 1555 | "Installing collected packages: ann-visualizer\n", 1556 | "Successfully installed ann-visualizer-2.5\n", 1557 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 1558 | "Requirement already satisfied: graphviz in /usr/local/lib/python3.8/dist-packages (0.10.1)\n", 1559 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 1560 | "Requirement already satisfied: h5py in /usr/local/lib/python3.8/dist-packages (3.1.0)\n", 1561 | "Requirement already satisfied: numpy>=1.17.5 in /usr/local/lib/python3.8/dist-packages (from h5py) (1.21.6)\n" 1562 | ] 1563 | } 1564 | ] 1565 | }, 1566 | { 1567 | "cell_type": "code", 1568 | "metadata": { 1569 | "id": "GTWJZBBm2rUP" 1570 | }, 1571 | "source": [ 1572 | "from ann_visualizer.visualize import ann_viz;\n", 1573 | "ann_viz(model, title=\"Artificial Neural network - Model Visualization\")" 1574 | ], 1575 | "execution_count": 38, 1576 | "outputs": [] 1577 | }, 1578 | { 1579 | "cell_type": "code", 1580 | "source": [ 1581 | "! 
pip3 install keras-visualizer" 1582 | ], 1583 | "metadata": { 1584 | "colab": { 1585 | "base_uri": "https://localhost:8080/" 1586 | }, 1587 | "id": "FUP8Xmm-JPWr", 1588 | "outputId": "ec2c90f3-43c8-4b7d-8f5e-b2f4714eac09" 1589 | }, 1590 | "execution_count": 46, 1591 | "outputs": [ 1592 | { 1593 | "output_type": "stream", 1594 | "name": "stdout", 1595 | "text": [ 1596 | "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", 1597 | "Collecting keras-visualizer\n", 1598 | " Downloading keras_visualizer-2.4-py3-none-any.whl (5.4 kB)\n", 1599 | "Installing collected packages: keras-visualizer\n", 1600 | "Successfully installed keras-visualizer-2.4\n" 1601 | ] 1602 | } 1603 | ] 1604 | }, 1605 | { 1606 | "cell_type": "code", 1607 | "source": [ 1608 | "from keras_visualizer import visualizer" 1609 | ], 1610 | "metadata": { 1611 | "id": "d1rxPbzXJLVc" 1612 | }, 1613 | "execution_count": 47, 1614 | "outputs": [] 1615 | }, 1616 | { 1617 | "cell_type": "code", 1618 | "metadata": { 1619 | "id": "RaPSKJxr2-rO" 1620 | }, 1621 | "source": [ 1622 | "visualizer(model, format='png', view=True)" 1623 | ], 1624 | "execution_count": 49, 1625 | "outputs": [] 1626 | } 1627 | ] 1628 | } -------------------------------------------------------------------------------- /CIC IDS 2018/Group6_cicids_LSTM.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "collapsed_sections": [ 8 | "oEBYbD0jw2PI", 9 | "DER9DuHSa5hH" 10 | ], 11 | "include_colab_link": true 12 | }, 13 | "kernelspec": { 14 | "name": "python3", 15 | "display_name": "Python 3" 16 | }, 17 | "language_info": { 18 | "name": "python" 19 | } 20 | }, 21 | "cells": [ 22 | { 23 | "cell_type": "markdown", 24 | "metadata": { 25 | "id": "view-in-github", 26 | "colab_type": "text" 27 | }, 28 | "source": [ 29 | "\"Open" 30 | ] 31 | }, 32 | { 33 | 
"cell_type": "markdown", 34 | "source": [ 35 | "\n", 36 | "## Intrusion Detection System (CIC IDS 2018)\n", 37 | "\n", 38 | "---\n", 39 | "\n", 40 | "\n", 41 | "### Group 6 - Sleety, Thejus, Tejas and Rahul \n", 42 | "#### Rajagiri School of Engineering and Technology (KTU 2019 Scheme)" 43 | ], 44 | "metadata": { 45 | "id": "siJMdbALhwDm" 46 | } 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": { 51 | "id": "oEBYbD0jw2PI" 52 | }, 53 | "source": [ 54 | "## **IDS**" 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "metadata": { 60 | "id": "NLezzTSoMcI8" 61 | }, 62 | "source": [ 63 | "import tensorflow as tf\n", 64 | "import pandas as pd\n", 65 | "import numpy as np\n", 66 | "import sklearn\n", 67 | "from keras.models import Sequential, load_model\n", 68 | "from sklearn.model_selection import train_test_split\n", 69 | "from sklearn.linear_model import LogisticRegression\n", 70 | "from sklearn import preprocessing\n", 71 | "from sklearn.preprocessing import LabelEncoder, StandardScaler\n", 72 | "from sklearn.metrics import confusion_matrix\n", 73 | "from keras.utils import np_utils" 74 | ], 75 | "execution_count": null, 76 | "outputs": [] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "metadata": { 81 | "colab": { 82 | "base_uri": "https://localhost:8080/" 83 | }, 84 | "id": "9TWNodDe1kHu", 85 | "outputId": "887c20b7-c6a9-46d7-f419-ba689e8f6ae4" 86 | }, 87 | "source": [ 88 | "from google.colab import drive\n", 89 | "drive.mount('/content/drive')" 90 | ], 91 | "execution_count": null, 92 | "outputs": [ 93 | { 94 | "output_type": "stream", 95 | "name": "stdout", 96 | "text": [ 97 | "Mounted at /content/drive\n" 98 | ] 99 | } 100 | ] 101 | }, 102 | { 103 | "cell_type": "code", 104 | "metadata": { 105 | "id": "KzFzH3CAMmxQ", 106 | "colab": { 107 | "base_uri": "https://localhost:8080/" 108 | }, 109 | "outputId": "6e9226bc-f84e-4540-f08b-5a01e0454f63" 110 | }, 111 | "source": [ 112 | "df = pd.read_csv('/content/drive/My Drive/cicids/cic/02-16-2018.csv')\n", 
113 | "df.drop(df.loc[df['Label'] == 'Label'].index, inplace=True)" 114 | ], 115 | "execution_count": null, 116 | "outputs": [ 117 | { 118 | "output_type": "stream", 119 | "name": "stderr", 120 | "text": [ 121 | "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py:3326: DtypeWarning: Columns (0,1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78) have mixed types.Specify dtype option on import or set low_memory=False.\n", 122 | " exec(code_obj, self.user_global_ns, self.user_ns)\n" 123 | ] 124 | } 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "metadata": { 130 | "id": "HixtSvMpM_Fq" 131 | }, 132 | "source": [ 133 | "metadata = ['fl_dur' #Flow duration\n", 134 | ",'tot_fw_pk' #Total packets in the forward direction\n", 135 | ",'tot_bw_pk' #Total packets in the backward direction\n", 136 | ",'tot_l_fw_pkt' #Total size of packet in forward direction\n", 137 | ",'fw_pkt_l_max' #Maximum size of packet in forward direction\n", 138 | ",'fw_pkt_l_min' #Minimum size of packet in forward direction\n", 139 | ",'fw_pkt_l_avg' #Average size of packet in forward direction\n", 140 | ",'fw_pkt_l_std' #Standard deviation size of packet in forward direction\n", 141 | ",'Bw_pkt_l_max' #Maximum size of packet in backward direction\n", 142 | ",'Bw_pkt_l_min' #Minimum size of packet in backward direction\n", 143 | ",'Bw_pkt_l_avg' #Mean size of packet in backward direction\n", 144 | ",'Bw_pkt_l_std' #Standard deviation size of packet in backward direction\n", 145 | ",'fl_byt_s' #flow byte rate that is number of bytes transferred per second\n", 146 | ",'fl_pkt_s' #flow packets rate that is number of packets transferred per second\n", 147 | ",'fl_iat_avg' #Average time between two flows\n", 148 | ",'fl_iat_std' #Standard deviation time between two flows\n", 149 | ",'fl_iat_max' #Maximum time 
between two flows\n", 150 | ",'fl_iat_min' #Minimum time between two flows\n", 151 | ",'fw_iat_tot' #Total time between two packets sent in the forward direction\n", 152 | ",'fw_iat_avg' #Mean time between two packets sent in the forward direction\n", 153 | ",'fw_iat_std' #Standard deviation time between two packets sent in the forward direction\n", 154 | ",'fw_iat_max' #Maximum time between two packets sent in the forward direction\n", 155 | ",'fw_iat_min' #Minimum time between two packets sent in the forward direction\n", 156 | ",'bw_iat_tot' #Total time between two packets sent in the backward direction\n", 157 | ",'bw_iat_avg' #Mean time between two packets sent in the backward direction\n", 158 | ",'bw_iat_std' #Standard deviation time between two packets sent in the backward direction\n", 159 | ",'bw_iat_max' #Maximum time between two packets sent in the backward direction\n", 160 | ",'bw_iat_min' #Minimum time between two packets sent in the backward direction\n", 161 | ",'fw_psh_flag' #Number of times the PSH flag was set in packets travelling in the forward direction (0 for UDP)\n", 162 | ",'bw_psh_flag' #Number of times the PSH flag was set in packets travelling in the backward direction (0 for UDP)\n", 163 | ",'fw_urg_flag' #Number of times the URG flag was set in packets travelling in the forward direction (0 for UDP)\n", 164 | ",'bw_urg_flag' #Number of times the URG flag was set in packets travelling in the backward direction (0 for UDP)\n", 165 | ",'fw_hdr_len' #Total bytes used for headers in the forward direction\n", 166 | ",'bw_hdr_len' #Total bytes used for headers in the backward direction\n", 167 | ",'fw_pkt_s' #Number of forward packets per second\n", 168 | ",'bw_pkt_s' #Number of backward packets per second\n", 169 | ",'pkt_len_min' #Minimum length of a flow\n", 170 | ",'pkt_len_max' #Maximum length of a flow\n", 171 | ",'pkt_len_avg' #Mean length of a flow\n", 172 | ",'pkt_len_std' #Standard deviation length of a flow\n", 173 | ",'pkt_len_va' 
#Variance of packet length\n", 174 | ",'fin_cnt' #Number of packets with FIN\n", 175 | ",'syn_cnt' #Number of packets with SYN\n", 176 | ",'rst_cnt' #Number of packets with RST\n", 177 | ",'pst_cnt' #Number of packets with PUSH\n", 178 | ",'ack_cnt' #Number of packets with ACK\n", 179 | ",'urg_cnt' #Number of packets with URG\n", 180 | ",'cwe_cnt' #Number of packets with CWE\n", 181 | ",'ece_cnt' #Number of packets with ECE\n", 182 | ",'down_up_ratio' #Download and upload ratio\n", 183 | ",'pkt_size_avg' #Average size of packet\n", 184 | ",'fw_seg_avg' #Average size observed in the forward direction\n", 185 | ",'bw_seg_avg' #Average size observed in the backward direction\n", 186 | ",'fw_byt_blk_avg' #Average number of bytes bulk rate in the forward direction\n", 187 | ",'fw_pkt_blk_avg' #Average number of packets bulk rate in the forward direction\n", 188 | ",'fw_blk_rate_avg' #Average number of bulk rate in the forward direction\n", 189 | ",'bw_byt_blk_avg' #Average number of bytes bulk rate in the backward direction\n", 190 | ",'bw_pkt_blk_avg' #Average number of packets bulk rate in the backward direction\n", 191 | ",'bw_blk_rate_avg' #Average number of bulk rate in the backward direction\n", 192 | ",'subfl_fw_pk' #The average number of packets in a sub flow in the forward direction\n", 193 | ",'subfl_fw_byt' #The average number of bytes in a sub flow in the forward direction\n", 194 | ",'subfl_bw_pkt' #The average number of packets in a sub flow in the backward direction\n", 195 | ",'subfl_bw_byt' #The average number of bytes in a sub flow in the backward direction\n", 196 | ",'fw_win_byt' #Number of bytes sent in initial window in the forward direction\n", 197 | ",'bw_win_byt' #Number of bytes sent in initial window in the backward direction\n", 198 | ",'Fw_act_pkt' #Number of packets with at least 1 byte of TCP data payload in the forward direction\n", 199 | ",'fw_seg_min' #Minimum segment size observed in the forward direction\n", 200 | ",'atv_avg' #Mean 
time a flow was active before becoming idle\n", 201 | ",'atv_std' #Standard deviation time a flow was active before becoming idle\n", 202 | ",'atv_max' #Maximum time a flow was active before becoming idle\n", 203 | ",'atv_min' #Minimum time a flow was active before becoming idle\n", 204 | ",'idl_avg' #Mean time a flow was idle before becoming active\n", 205 | ",'idl_std' #Standard deviation time a flow was idle before becoming active\n", 206 | ",'idl_max' #Maximum time a flow was idle before becoming active\n", 207 | ",'idl_min' #Minimum time a flow was idle before becoming active\n", 208 | "]" 209 | ], 210 | "execution_count": null, 211 | "outputs": [] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "metadata": { 216 | "id": "s9ZjigalNCIZ", 217 | "colab": { 218 | "base_uri": "https://localhost:8080/" 219 | }, 220 | "outputId": "07e772aa-03b6-40dc-a892-cfd7323292f3" 221 | }, 222 | "source": [ 223 | "df.columns" 224 | ], 225 | "execution_count": null, 226 | "outputs": [ 227 | { 228 | "output_type": "execute_result", 229 | "data": { 230 | "text/plain": [ 231 | "Index(['Dst Port', 'Protocol', 'Timestamp', 'Flow Duration', 'Tot Fwd Pkts',\n", 232 | " 'Tot Bwd Pkts', 'TotLen Fwd Pkts', 'TotLen Bwd Pkts', 'Fwd Pkt Len Max',\n", 233 | " 'Fwd Pkt Len Min', 'Fwd Pkt Len Mean', 'Fwd Pkt Len Std',\n", 234 | " 'Bwd Pkt Len Max', 'Bwd Pkt Len Min', 'Bwd Pkt Len Mean',\n", 235 | " 'Bwd Pkt Len Std', 'Flow Byts/s', 'Flow Pkts/s', 'Flow IAT Mean',\n", 236 | " 'Flow IAT Std', 'Flow IAT Max', 'Flow IAT Min', 'Fwd IAT Tot',\n", 237 | " 'Fwd IAT Mean', 'Fwd IAT Std', 'Fwd IAT Max', 'Fwd IAT Min',\n", 238 | " 'Bwd IAT Tot', 'Bwd IAT Mean', 'Bwd IAT Std', 'Bwd IAT Max',\n", 239 | " 'Bwd IAT Min', 'Fwd PSH Flags', 'Bwd PSH Flags', 'Fwd URG Flags',\n", 240 | " 'Bwd URG Flags', 'Fwd Header Len', 'Bwd Header Len', 'Fwd Pkts/s',\n", 241 | " 'Bwd Pkts/s', 'Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',\n", 242 | " 'Pkt Len Std', 'Pkt Len Var', 'FIN Flag Cnt', 'SYN Flag Cnt',\n", 243 | " 'RST 
Flag Cnt', 'PSH Flag Cnt', 'ACK Flag Cnt', 'URG Flag Cnt',\n", 244 | " 'CWE Flag Count', 'ECE Flag Cnt', 'Down/Up Ratio', 'Pkt Size Avg',\n", 245 | " 'Fwd Seg Size Avg', 'Bwd Seg Size Avg', 'Fwd Byts/b Avg',\n", 246 | " 'Fwd Pkts/b Avg', 'Fwd Blk Rate Avg', 'Bwd Byts/b Avg',\n", 247 | " 'Bwd Pkts/b Avg', 'Bwd Blk Rate Avg', 'Subflow Fwd Pkts',\n", 248 | " 'Subflow Fwd Byts', 'Subflow Bwd Pkts', 'Subflow Bwd Byts',\n", 249 | " 'Init Fwd Win Byts', 'Init Bwd Win Byts', 'Fwd Act Data Pkts',\n", 250 | " 'Fwd Seg Size Min', 'Active Mean', 'Active Std', 'Active Max',\n", 251 | " 'Active Min', 'Idle Mean', 'Idle Std', 'Idle Max', 'Idle Min', 'Label'],\n", 252 | " dtype='object')" 253 | ] 254 | }, 255 | "metadata": {}, 256 | "execution_count": 5 257 | } 258 | ] 259 | }, 260 | { 261 | "cell_type": "markdown", 262 | "metadata": { 263 | "id": "Hi48qgaewUOb" 264 | }, 265 | "source": [ 266 | "## **LSTM**" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "metadata": { 272 | "id": "H84qHurZa0Kt" 273 | }, 274 | "source": [ 275 | "### **Binary Class**" 276 | ] 277 | }, 278 | { 279 | "cell_type": "code", 280 | "metadata": { 281 | "id": "xO_0J1yQOAVY" 282 | }, 283 | "source": [ 284 | "features = ['Timestamp', 'Fwd Pkt Len Std', 'Fwd Pkt Len Mean',\n", 285 | " 'Fwd Pkt Len Max', 'Fwd Seg Size Avg', 'Pkt Len Std', 'Flow IAT Std',\n", 286 | " 'Bwd Pkt Len Std', 'Bwd Seg Size Avg', 'Pkt Size Avg',\n", 287 | " 'Subflow Fwd Byts']" 288 | ], 289 | "execution_count": null, 290 | "outputs": [] 291 | }, 292 | { 293 | "cell_type": "code", 294 | "metadata": { 295 | "id": "17a7f5rWNNXZ" 296 | }, 297 | "source": [ 298 | "def targetify(s):\n", 299 | " if s == 'Benign':\n", 300 | " return 0\n", 301 | " else:\n", 302 | " return 1" 303 | ], 304 | "execution_count": null, 305 | "outputs": [] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "metadata": { 310 | "id": "Jyv3ec_KwtFz", 311 | "colab": { 312 | "base_uri": "https://localhost:8080/" 313 | }, 314 | "outputId": 
"f7df6c86-db4e-4006-9dcf-be82835a8364" 315 | }, 316 | "source": [ 317 | "X = df[features]\n", 318 | "X[features] = X[features].apply(pd.to_numeric, errors='coerce', axis=1)\n", 319 | "X = X.fillna(0)\n", 320 | "labels = df['Label'] #For multiclass classification\n" 321 | ], 322 | "execution_count": null, 323 | "outputs": [ 324 | { 325 | "output_type": "stream", 326 | "name": "stderr", 327 | "text": [ 328 | "/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py:3641: SettingWithCopyWarning: \n", 329 | "A value is trying to be set on a copy of a slice from a DataFrame.\n", 330 | "Try using .loc[row_indexer,col_indexer] = value instead\n", 331 | "\n", 332 | "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", 333 | " self[k1] = value[k2]\n" 334 | ] 335 | } 336 | ] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "metadata": { 341 | "id": "gCbbLe-fsFll" 342 | }, 343 | "source": [ 344 | "df['Target']=df['Label'].apply(targetify)\n", 345 | "y = df['Target']" 346 | ], 347 | "execution_count": null, 348 | "outputs": [] 349 | }, 350 | { 351 | "cell_type": "markdown", 352 | "metadata": { 353 | "id": "ExRZ3yLNyswE" 354 | }, 355 | "source": [ 356 | "Normal Execution" 357 | ] 358 | }, 359 | { 360 | "cell_type": "code", 361 | "metadata": { 362 | "id": "5ZupR410ws7J", 363 | "colab": { 364 | "base_uri": "https://localhost:8080/" 365 | }, 366 | "outputId": "2178ab47-c2bb-47be-b9b4-85330337a948" 367 | }, 368 | "source": [ 369 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)\n", 370 | "print (X_train.shape, y_train.shape)\n", 371 | "print( X_test.shape, y_test.shape)" 372 | ], 373 | "execution_count": null, 374 | "outputs": [ 375 | { 376 | "output_type": "stream", 377 | "name": "stdout", 378 | "text": [ 379 | "(838859, 11) (838859,)\n", 380 | "(209715, 11) (209715,)\n" 381 | ] 382 | } 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 
388 | "id": "vpr26iooyw4v" 389 | }, 390 | "source": [ 391 | "Faster Execution (10% rows)" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "metadata": { 397 | "id": "MP40NJBNyKL6", 398 | "colab": { 399 | "base_uri": "https://localhost:8080/" 400 | }, 401 | "outputId": "043d8d13-017f-43f7-8b34-5b43e109ec79" 402 | }, 403 | "source": [ 404 | "# sample 10% of the rows at random (frac=0.1, with replacement) for a shorter running time\n", 405 | "\n", 406 | "X_train = X_train.sample(frac=0.1, replace=True, random_state=1)\n", 407 | "y_train = y_train.sample(frac=0.1, replace=True, random_state=1)\n", 408 | "X_test = X_test.sample(frac=0.1, replace=True, random_state=1)\n", 409 | "y_test = y_test.sample(frac=0.1, replace=True, random_state=1)\n", 410 | "print (X_train.shape, y_train.shape)\n", 411 | "print( X_test.shape, y_test.shape)" 412 | ], 413 | "execution_count": null, 414 | "outputs": [ 415 | { 416 | "output_type": "stream", 417 | "name": "stdout", 418 | "text": [ 419 | "(83886, 11) (83886,)\n", 420 | "(20972, 11) (20972,)\n" 421 | ] 422 | } 423 | ] 424 | }, 425 | { 426 | "cell_type": "code", 427 | "metadata": { 428 | "id": "h4TWHdVGyQTz" 429 | }, 430 | "source": [ 431 | "min_max_scaler = preprocessing.MinMaxScaler()\n", 432 | "x_scaled = min_max_scaler.fit_transform(X_train.values)\n", 433 | "X_train = pd.DataFrame(x_scaled,columns=features)\n", 434 | "x_scaled_test = min_max_scaler.fit_transform(X_test.values)\n", 435 | "X_test = pd.DataFrame(x_scaled_test,columns=features)" 436 | ], 437 | "execution_count": null, 438 | "outputs": [] 439 | }, 440 | { 441 | "cell_type": "code", 442 | "metadata": { 443 | "id": "Rf2vrS4pwsuh" 444 | }, 445 | "source": [ 446 | "model = tf.keras.Sequential([\n", 447 | " tf.keras.layers.Embedding(100000, 64), # since it doesn't consider \"words,\" the embedding doesn't really matter\n", 448 | " tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),\n", 449 | " tf.keras.layers.Dense(64, activation='relu'),\n", 450 | " tf.keras.layers.Dense(1, 
activation='sigmoid')\n", 451 | " # tf.keras.layers.Dense(1, activation='softmax') # loss too big\n", 452 | "])" 453 | ], 454 | "execution_count": null, 455 | "outputs": [] 456 | }, 457 | { 458 | "cell_type": "code", 459 | "metadata": { 460 | "id": "iS-VFQNxwslT", 461 | "colab": { 462 | "base_uri": "https://localhost:8080/" 463 | }, 464 | "outputId": "4b7cdcd2-1bcc-4a15-f653-db6be8bcdc04" 465 | }, 466 | "source": [ 467 | "model.compile(loss='binary_crossentropy',\n", 468 | " # optimizer='sgd', # almost same\n", 469 | " optimizer=tf.keras.optimizers.Adam(1e-4),\n", 470 | " metrics=['accuracy'])\n", 471 | "history = model.fit(X_train.values, y_train.values, epochs=3)\n", 472 | "model.save('drive/MyDrive/cicids/LSTM_BC')" 473 | ], 474 | "execution_count": null, 475 | "outputs": [ 476 | { 477 | "output_type": "stream", 478 | "name": "stdout", 479 | "text": [ 480 | "Epoch 1/3\n", 481 | "2622/2622 [==============================] - 244s 91ms/step - loss: 0.1005 - accuracy: 0.9488\n", 482 | "Epoch 2/3\n", 483 | "2622/2622 [==============================] - 246s 94ms/step - loss: 0.0249 - accuracy: 0.9959\n", 484 | "Epoch 3/3\n", 485 | "2622/2622 [==============================] - 252s 96ms/step - loss: 0.0248 - accuracy: 0.9959\n" 486 | ] 487 | }, 488 | { 489 | "output_type": "stream", 490 | "name": "stderr", 491 | "text": [ 492 | "WARNING:absl:Found untraced functions such as lstm_cell_1_layer_call_fn, lstm_cell_1_layer_call_and_return_conditional_losses, lstm_cell_2_layer_call_fn, lstm_cell_2_layer_call_and_return_conditional_losses while saving (showing 4 of 4). 
These functions will not be directly callable after loading.\n" 493 | ] 494 | } 495 | ] 496 | }, 497 | { 498 | "cell_type": "code", 499 | "metadata": { 500 | "id": "ax68X_PzxEWC", 501 | "colab": { 502 | "base_uri": "https://localhost:8080/" 503 | }, 504 | "outputId": "91c8802c-0efa-424f-8e0e-da4b51443f92" 505 | }, 506 | "source": [ 507 | "pred_class = model.predict(X_test.values[:])\n", 508 | "predictions = [int(round(x[0])) for x in pred_class]\n", 509 | "true_class = list(y_test)\n", 510 | "np.sum(predictions == y_test.values) / len(y_test.values)" 511 | ], 512 | "execution_count": null, 513 | "outputs": [ 514 | { 515 | "output_type": "stream", 516 | "name": "stdout", 517 | "text": [ 518 | "656/656 [==============================] - 5s 6ms/step\n" 519 | ] 520 | }, 521 | { 522 | "output_type": "execute_result", 523 | "data": { 524 | "text/plain": [ 525 | "0.9957562464238031" 526 | ] 527 | }, 528 | "metadata": {}, 529 | "execution_count": 15 530 | } 531 | ] 532 | }, 533 | { 534 | "cell_type": "code", 535 | "metadata": { 536 | "id": "KufqWIFZxEQ_" 537 | }, 538 | "source": [ 539 | "def myRound(x, r):\n", 540 | " if x>r/float(1000):\n", 541 | " return 1\n", 542 | " else:\n", 543 | " return 0" 544 | ], 545 | "execution_count": null, 546 | "outputs": [] 547 | }, 548 | { 549 | "cell_type": "code", 550 | "metadata": { 551 | "id": "y-rksHvxxEML" 552 | }, 553 | "source": [ 554 | "compdf = pd.DataFrame({'pred_class':predictions, 'true_class':true_class})\n", 555 | "compdf = compdf.sort_values('pred_class', ascending=True)\n", 556 | "predictions = list(compdf['pred_class'].apply(myRound, r=225))\n", 557 | "true_class = list(compdf['true_class'])" 558 | ], 559 | "execution_count": null, 560 | "outputs": [] 561 | }, 562 | { 563 | "cell_type": "code", 564 | "metadata": { 565 | "id": "SRrWsEQUxD20", 566 | "colab": { 567 | "base_uri": "https://localhost:8080/" 568 | }, 569 | "outputId": "c2e064e1-f6be-403c-99c9-4d727c427003" 570 | }, 571 | "source": [ 572 | "confm = 
confusion_matrix(true_class, predictions)\n", 573 | "confm" 574 | ], 575 | "execution_count": null, 576 | "outputs": [ 577 | { 578 | "output_type": "execute_result", 579 | "data": { 580 | "text/plain": [ 581 | "array([[ 8660, 89],\n", 582 | " [ 0, 12223]])" 583 | ] 584 | }, 585 | "metadata": {}, 586 | "execution_count": 18 587 | } 588 | ] 589 | }, 590 | { 591 | "cell_type": "markdown", 592 | "metadata": { 593 | "id": "DER9DuHSa5hH" 594 | }, 595 | "source": [ 596 | "### **Multi Class**" 597 | ] 598 | }, 599 | { 600 | "cell_type": "code", 601 | "metadata": { 602 | "id": "T0bKOdGia9Oa" 603 | }, 604 | "source": [ 605 | "categories = ['Benign', 'FTP-BruteForce', 'SSH-Bruteforce',\n", 606 | " 'DoS attacks-GoldenEye', 'DoS attacks-Slowloris', 'DoS attacks-SlowHTTPTest',\n", 607 | " 'DoS attacks-Hulk', 'Brute Force -Web', 'Brute Force -XSS',\n", 608 | " 'SQL Injection', 'Infiltration', 'Bot']" 609 | ], 610 | "execution_count": null, 611 | "outputs": [] 612 | }, 613 | { 614 | "cell_type": "code", 615 | "metadata": { 616 | "id": "9G0WRcoMa-uw" 617 | }, 618 | "source": [ 619 | "encoder = LabelEncoder()\n", 620 | "encoder.fit(categories)\n", 621 | "y = encoder.transform(labels)\n", 622 | "y = np_utils.to_categorical(y, num_classes=12)" 623 | ], 624 | "execution_count": null, 625 | "outputs": [] 626 | }, 627 | { 628 | "cell_type": "code", 629 | "metadata": { 630 | "id": "r3kN1kPwbHg7" 631 | }, 632 | "source": [ 633 | "model = tf.keras.Sequential([\n", 634 | " tf.keras.layers.Embedding(100000, 64, embeddings_regularizer='l2'), # since it doesn't consider \"words,\" the embedding doesn't really matter\n", 635 | " tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),\n", 636 | " tf.keras.layers.Dropout(0.4),\n", 637 | " tf.keras.layers.Dense(64, activation='relu', use_bias=True, bias_regularizer='l2'),\n", 638 | " tf.keras.layers.Dropout(0.2),\n", 639 | " tf.keras.layers.Dense(12, activation='sigmoid')\n", 640 | " # tf.keras.layers.Dense(1, activation='softmax') # loss too 
big\n", 641 | "])" 642 | ], 643 | "execution_count": null, 644 | "outputs": [] 645 | }, 646 | { 647 | "cell_type": "code", 648 | "metadata": { 649 | "id": "dAWdPRbXolXO", 650 | "colab": { 651 | "base_uri": "https://localhost:8080/" 652 | }, 653 | "outputId": "6c7085bc-68fe-49c4-dd83-ed91c010b16b" 654 | }, 655 | "source": [ 656 | "X_train.shape, y_train.shape" 657 | ], 658 | "execution_count": null, 659 | "outputs": [ 660 | { 661 | "output_type": "execute_result", 662 | "data": { 663 | "text/plain": [ 664 | "((83886, 11), (83886,))" 665 | ] 666 | }, 667 | "metadata": {}, 668 | "execution_count": 22 669 | } 670 | ] 671 | }, 672 | { 673 | "cell_type": "code", 674 | "metadata": { 675 | "id": "qw5yKr6-bQe6", 676 | "colab": { 677 | "base_uri": "https://localhost:8080/" 678 | }, 679 | "outputId": "0949958f-3f39-4778-c507-d457224eb3e7" 680 | }, 681 | "source": [ 682 | "model.compile(loss='sparse_categorical_crossentropy',\n", 683 | " # optimizer='sgd', # almost same\n", 684 | " optimizer=tf.keras.optimizers.Adam(1e-4),\n", 685 | " metrics=['accuracy'])\n", 686 | "history = model.fit(X_train.values, y_train.values, epochs=3)\n", 687 | "model.save('drive/MyDrive/cicids/LSTM_MC_L2')" 688 | ], 689 | "execution_count": null, 690 | "outputs": [ 691 | { 692 | "output_type": "stream", 693 | "name": "stdout", 694 | "text": [ 695 | "Epoch 1/3\n", 696 | "2622/2622 [==============================] - 220s 82ms/step - loss: 3.3247 - accuracy: 0.9006\n", 697 | "Epoch 2/3\n", 698 | "2622/2622 [==============================] - 203s 77ms/step - loss: 0.0320 - accuracy: 0.9955\n", 699 | "Epoch 3/3\n", 700 | "2622/2622 [==============================] - 203s 77ms/step - loss: 0.0291 - accuracy: 0.9959\n" 701 | ] 702 | }, 703 | { 704 | "output_type": "stream", 705 | "name": "stderr", 706 | "text": [ 707 | "WARNING:absl:Found untraced functions such as lstm_cell_4_layer_call_fn, lstm_cell_4_layer_call_and_return_conditional_losses, lstm_cell_5_layer_call_fn, 
lstm_cell_5_layer_call_and_return_conditional_losses while saving (showing 4 of 4). These functions will not be directly callable after loading.\n" 708 | ] 709 | } 710 | ] 711 | }, 712 | { 713 | "cell_type": "code", 714 | "metadata": { 715 | "id": "Y2HCh4UhbheN", 716 | "colab": { 717 | "base_uri": "https://localhost:8080/" 718 | }, 719 | "outputId": "57eb16c6-834f-4e6e-bcfd-b4ad11c442f3" 720 | }, 721 | "source": [ 722 | "pred_class = model.predict(X_test.values[:])\n", 723 | "predictions = [int(round(x[0])) for x in pred_class]\n", 724 | "true_class = list(y_test)\n", 725 | "np.sum(predictions == y_test.values) / len(y_test.values)" 726 | ], 727 | "execution_count": null, 728 | "outputs": [ 729 | { 730 | "output_type": "stream", 731 | "name": "stdout", 732 | "text": [ 733 | "656/656 [==============================] - 5s 6ms/step\n" 734 | ] 735 | }, 736 | { 737 | "output_type": "execute_result", 738 | "data": { 739 | "text/plain": [ 740 | "0.5828247186725157" 741 | ] 742 | }, 743 | "metadata": {}, 744 | "execution_count": 24 745 | } 746 | ] 747 | }, 748 | { 749 | "cell_type": "code", 750 | "metadata": { 751 | "id": "4S6potN6buR9" 752 | }, 753 | "source": [ 754 | "compdf = pd.DataFrame({'pred_class':predictions, 'true_class':true_class})\n", 755 | "compdf = compdf.sort_values('pred_class', ascending=True)\n", 756 | "predictions = list(compdf['pred_class'].apply(myRound, r=225))\n", 757 | "true_class = list(compdf['true_class'])" 758 | ], 759 | "execution_count": null, 760 | "outputs": [] 761 | }, 762 | { 763 | "cell_type": "code", 764 | "metadata": { 765 | "id": "WTq1Vkoybyl2", 766 | "colab": { 767 | "base_uri": "https://localhost:8080/" 768 | }, 769 | "outputId": "f70d4cbf-6c55-4ac6-ba8e-464e1bca19d4" 770 | }, 771 | "source": [ 772 | "confm = confusion_matrix(true_class, predictions)\n", 773 | "confm" 774 | ], 775 | "execution_count": null, 776 | "outputs": [ 777 | { 778 | "output_type": "execute_result", 779 | "data": { 780 | "text/plain": [ 781 | "array([[ 0, 
8749],\n", 782 | " [ 0, 12223]])" 783 | ] 784 | }, 785 | "metadata": {}, 786 | "execution_count": 26 787 | } 788 | ] 789 | }, 790 | { 791 | "cell_type": "code", 792 | "metadata": { 793 | "id": "7ilu8oBWEoBp", 794 | "colab": { 795 | "base_uri": "https://localhost:8080/" 796 | }, 797 | "outputId": "8ece0740-5977-47a3-827f-705c79657cbf" 798 | }, 799 | "source": [ 800 | "tf.keras.utils.plot_model(model, to_file='model.png', show_shapes=True, show_layer_names=False)" 801 | ], 802 | "execution_count": null, 803 | "outputs": [ 804 | { 805 | "output_type": "execute_result", 806 | "data": { 807 | "image/png": "\n", 808 | "text/plain": [ 809 | "" 810 | ] 811 | }, 812 | "metadata": {}, 813 | "execution_count": 27 814 | } 815 | ] 816 | }, 817 | { 818 | "cell_type": "code", 819 | "metadata": { 820 | "id": "j4PODlF4E0_d" 821 | }, 822 | "source": [], 823 | "execution_count": null, 824 | "outputs": [] 825 | } 826 | ] 827 | } -------------------------------------------------------------------------------- /CIC IDS 2018/LSTM_BC/keras_metadata.pb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/LSTM_BC/keras_metadata.pb -------------------------------------------------------------------------------- /CIC IDS 2018/LSTM_BC/saved_model.pb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/LSTM_BC/saved_model.pb -------------------------------------------------------------------------------- /CIC IDS 2018/LSTM_MC_L2/keras_metadata.pb: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/LSTM_MC_L2/keras_metadata.pb -------------------------------------------------------------------------------- /CIC IDS 2018/LSTM_MC_L2/saved_model.pb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/CIC IDS 2018/LSTM_MC_L2/saved_model.pb -------------------------------------------------------------------------------- /CIC IDS 2018/README.md: -------------------------------------------------------------------------------- 1 | ## 2. CIC IDS 2018: The successor of the CIC IDS 2017 [Refer](https://www.unb.ca/cic/datasets/ids-2018.html) 2 |
The [CICIDS2017 dataset](https://www.unb.ca/cic/datasets/ids-2017.html) contains benign traffic and the most up-to-date common attacks, and it resembles true real-world data (PCAPs). It also includes the results of network traffic analysis using CICFlowMeter, with flows labeled by timestamp, source and destination IPs, source and destination ports, protocol, and attack (CSV files).
3 |
The CSE-CIC-IDS2018 dataset uses the notion of profiles to generate data in a systematic manner. Profiles contain detailed descriptions of intrusions and abstract distribution models for applications, protocols, or lower-level network entities. Agents or human operators can use these profiles to generate events on the network. Because the generated profiles are abstract, they can be applied to a diverse range of network protocols with different topologies, and they can be combined to generate a dataset for specific needs.
4 | * B-Profiles: Encapsulate the entity behaviours of users using various machine learning and statistical analysis techniques (such as K-Means, Random Forest, SVM, and J48). The encapsulated features are distributions of packet sizes of a protocol, number of packets per flow, certain patterns in the payload, size of payload, and request time distribution of a protocol. The following protocols are simulated in the testbed environment: HTTPS, HTTP, SMTP, POP3, IMAP, SSH, and FTP. Based on initial observations, the majority of traffic is HTTP and HTTPS. 5 | 6 | * M-Profiles: Attempt to describe an attack scenario in an unambiguous manner. In the simplest case, humans can interpret these profiles and subsequently carry them out. Ideally, autonomous agents along with compilers would be employed to interpret and execute these scenarios. 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 Sleety Matt George 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /NSL-KDD/Best_CNN/checkpoint.hdf5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/Best_CNN/checkpoint.hdf5 -------------------------------------------------------------------------------- /NSL-KDD/Best_NN/checkpoint.hdf5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/Best_NN/checkpoint.hdf5 -------------------------------------------------------------------------------- /NSL-KDD/README.md: -------------------------------------------------------------------------------- 1 | ## 1. NSL KDD: The enhanced KDD Cup 99 [Refer](https://www.unb.ca/cic/datasets/nsl.html) 2 |
[The KDD Cup 99 dataset](https://www.tensorflow.org/datasets/catalog/kddcup99) contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment. 3 | The competition task was to build a network intrusion detector, a predictive model capable of distinguishing between "bad" connections, called intrusions or attacks, and "good" normal connections. 4 | ### Improvements to the KDD'99 dataset 5 | The NSL-KDD data set has the following advantages over the original KDD data set: 6 | 7 | * It does not include redundant records in the train set, so the classifiers will not be biased towards more frequent records. 8 | * There are no duplicate records in the proposed test sets; therefore, the performance of the learners is not biased by the methods which have better detection rates on the frequent records. 9 | * The number of selected records from each difficulty-level group is inversely proportional to the percentage of records in the original KDD data set. As a result, the classification rates of distinct machine learning methods vary in a wider range, which makes it more efficient to have an accurate evaluation of different learning techniques. 10 | * The number of records in the train and test sets is reasonable, which makes it affordable to run the experiments on the complete set without the need to randomly select a small portion. Consequently, evaluation results of different research works will be consistent and comparable. 
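The no-redundant-records property listed above can be spot-checked on any header-less CSV file. The following is a minimal sketch using only the standard library; the helper name `count_duplicate_rows` and the sample rows are hypothetical illustrations, not part of this repository and not real NSL-KDD records.

```python
import csv
import io
from collections import Counter

# Hypothetical toy rows in a header-less CSV layout (not real NSL-KDD data).
SAMPLE = """0,tcp,http,SF,181,5450,normal
0,tcp,http,SF,181,5450,normal
0,udp,domain_u,SF,45,44,normal
"""

def count_duplicate_rows(lines):
    """Return (total_rows, duplicate_rows) for an iterable of CSV lines."""
    # Count how often each complete row occurs; blank lines are skipped.
    counts = Counter(tuple(row) for row in csv.reader(lines) if row)
    total = sum(counts.values())
    # Every occurrence beyond the first of a given row is a duplicate.
    return total, total - len(counts)

total, duplicates = count_duplicate_rows(io.StringIO(SAMPLE))
print(f"{total} rows, {duplicates} duplicate(s)")
```

To apply the same check to a real file, pass an open file handle instead of the in-memory sample, for example `open("NSL-KDD/Data/KDDTrain+.txt")`. Note that if a file carries an extra trailing column (such as a per-record difficulty score), rows that differ only in that column count as distinct under this sketch.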
11 | -------------------------------------------------------------------------------- /NSL-KDD/img/FNN-arch.drawio: -------------------------------------------------------------------------------- 1 | 7Vldb5swFP01PK4KNibJY5ukXadUmxp1S/oyueEWmAAjxySwXz8TTIC4+aimFjb1JfE9vra5x/dc82HgUZjecBp7d8yBwEA9JzXw2EDIROZQ/uVIViBkiAvA5b6jnCpg5v8GBfYUmvgOrBqOgrFA+HETXLIogqVoYJRztmm6PbOguWpMXdCA2ZIGOvrDd4RXoAPUr/DP4LteubJpq4BDWjqrSFYeddimBuGJgUecMVG0wnQEQU5eyUsx7vpA7+7COETinAGrKFuTye3TT/9xbvYGIfTxw6eBujaRlQGDI+NXJuPCYy6LaDCp0CvOksiBfNaetCqfKWOxBE0J/gIhMrWZNBFMQp4IA9UrL5hn83z8BSnNhZpua4zThpUpS49YkbBiCV/CkTDLzKHcBXHEDxV+OQe1BRSfN8BCkNcjHTgEVPjrZo5QlWruzq/aDdlQG/KKzVHzrmmQqJVuozgREprSDLiB7EAGc/WUt9y8RSzZZ+BL+Zs393a2uW8bzxcwi+mWt40Ub3OPDnK9Bi4gPcqO6rVU4melopW9qXRkluLwahqye2/Fp87Je2R76ot5rb2oEl9aVarnRlbP+3ndqI16E4UgXSGJdXk/R0+P6EsfvqeLSXZ/d13mZEcUgjSFjCFawVkKQcTW++9h+tA14ezOwvaE09M4+RCO4gafebRYnRIO1oXDWczyw2VfElud1DXTsjrIsHPyMDVSPuTRTPuT8iCdkof12nNFE0knD5Y+6Zx0UJvSqeSyqPWcks57PbSQM6Vjd0o65N89WYZm1+Rha2R+TcTRR8A6p0TvnrFnEdK0c1Rj1DbVJvk4xA9yc+5NLu5UKTJfuMv9H45xC7VYp469FWip6GvMvcDvQTIR7hqZrd4SvV8h2isSJyvTscQ7WZiKUvAXhUgN/cb8SFTJg+295MHDC9KcpAhBjUO11/N7U1n7eahPVUSpTbVNsl1ML+WdNKvvDIV79bUGT/4A -------------------------------------------------------------------------------- /NSL-KDD/img/FNN-arch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/FNN-arch.png -------------------------------------------------------------------------------- /NSL-KDD/img/acc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/acc.png -------------------------------------------------------------------------------- /NSL-KDD/img/acc_CNN.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/acc_CNN.png -------------------------------------------------------------------------------- /NSL-KDD/img/acc_nn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/acc_nn.png -------------------------------------------------------------------------------- /NSL-KDD/img/arch_CNN.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/arch_CNN.png -------------------------------------------------------------------------------- /NSL-KDD/img/attacks_table.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/attacks_table.png -------------------------------------------------------------------------------- /NSL-KDD/img/cm_CNN.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/cm_CNN.png -------------------------------------------------------------------------------- /NSL-KDD/img/cm_NN.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/cm_NN.png 
-------------------------------------------------------------------------------- /NSL-KDD/img/cm_RF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/cm_RF.png -------------------------------------------------------------------------------- /NSL-KDD/img/flags.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/flags.png -------------------------------------------------------------------------------- /NSL-KDD/img/flags_hist.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/flags_hist.png -------------------------------------------------------------------------------- /NSL-KDD/img/fpr.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/fpr.png -------------------------------------------------------------------------------- /NSL-KDD/img/macro_category.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/macro_category.png -------------------------------------------------------------------------------- /NSL-KDD/img/services.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/services.png -------------------------------------------------------------------------------- /NSL-KDD/img/services_hist.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/services_hist.png -------------------------------------------------------------------------------- /NSL-KDD/img/tpr.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/img/tpr.png -------------------------------------------------------------------------------- /NSL-KDD/resources/column_info.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/resources/column_info.pdf -------------------------------------------------------------------------------- /NSL-KDD/resources/flag_info.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/resources/flag_info.jpg -------------------------------------------------------------------------------- /NSL-KDD/resources/lib_info.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sleetymattgeorge/Deep-Learning-Evaluation-of-IDS-Datasets/3a3c3083c17865874c41d127d1dc1984dd57c759/NSL-KDD/resources/lib_info.png 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | > **Status:** :heavy_check_mark: 2 | # Deep-Learning-Evaluation-of-IDS-Datasets 3 | Deep learning model evaluation for intrusion detection on the NSL KDD and CIC IDS 2018 datasets. 4 | 5 | # Usage 6 | - Install Jupyter Notebook. 7 | ```bash 8 | pip install notebook 9 | ``` 10 | - Get the .ipynb files via `git clone` or by downloading the `zip`. 11 | ```bash 12 | git clone https://github.com/WhiteHatCyberus/Deep-Learning-Evaluation-of-IDS-Datasets.git 13 | ``` 14 | - Run the `jupyter-notebook` command in the cloned (or extracted) repository directory. 15 | - All components of the notebooks are found in the folder. 16 | 17 | ## Credits 18 | Every developer deserves credit for the work and time they put in. 19 | 20 | Thank you [Joule Effect](https://github.com/jouleffect) for your contribution to the evaluation of Deep Learning models on the NSL-KDD IDS dataset. 21 | --------------------------------------------------------------------------------