├── PR_TEMPLATE.md ├── README.md ├── data ├── APA-DDoS-Dataset.csv ├── Apoorv012.txt ├── CONTRIBUTING.md ├── Jarviss77.txt ├── analysis_tonyStark-Jr.ipynb ├── k7aditya.csv ├── kushal7201.ipynb ├── network_data.csv ├── sarthakvermaa.txt ├── tonyStark-Jr.csv └── tonyStark-Jr.txt ├── model ├── main_model.py └── sarthakvermaa.txt └── reports ├── data ├── Mihir.ipynb ├── Mihir.pdf ├── analysis.pdf ├── analysis_tonyStark-Jr.ipynb ├── sarthakvermaa.ipynb └── sarthakvermaa.pdf └── model └── model_reports.pdf /PR_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Issue: #`ISSUE NUMBER` 2 | 3 | 4 | 5 | 6 | #### Short description of what this resolves: 7 | 8 | #### Changes proposed in this pull request and/or Screenshots of changes: 9 | 10 | - 11 | - 12 | - 13 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # YADD 2 | 3 | YADD, aka Yet Another DDoS Detector, is an open-source AI-based DDoS detection system designed to safeguard networks from malicious distributed denial-of-service attacks. 4 | 5 | ## Features 6 | 7 | - Utilizes advanced machine learning algorithms for real-time DDoS detection. 8 | - Provides accurate threat analysis and mitigation strategies. 9 | - Scalable and adaptable to various network infrastructures. 10 | - Continuous monitoring and updates to adapt to evolving attack patterns. 11 | 12 | ## Timeline 13 | 14 | ### Week 1: Data Collection and Formatting 15 | 16 | - Gather relevant datasets for training and testing the model. 17 | - Preprocess and format the data to make it suitable for training. 18 | 19 | ### Week 2: Model Development 20 | 21 | - Design and implement a robust AI model for DDoS detection. 22 | - Train the model on the collected dataset to learn patterns of normal and malicious network behavior. 
23 | 24 | ### Week 3: Model Tuning, Evaluation, Testing, and Deployment 25 | 26 | - Fine-tune the model parameters for optimal performance. 27 | - Evaluate the model's accuracy, precision, recall, and other relevant metrics. 28 | - Test the model extensively to ensure its reliability and effectiveness. 29 | - Prepare the model for deployment. 30 | 31 | ### Week 4: Deploying on Real-time Network 32 | 33 | - Deploy the trained model in a real-time network environment. 34 | - Integrate the DDoS detection system into the network infrastructure. 35 | - Conduct thorough testing in a live environment to validate the system's functionality. 36 | 37 | ## Claim an issue 38 | Comment on the issue to claim it. If there is no activity on the issue after 2 days, it will be reassigned. If you have difficulty approaching the issue, feel free to ask on our Discord channel. 39 | ## Communication 40 | Whether you are working on a new feature or have a question, feel free to ask us on our [Discord](https://discord.com/channels/885149696249708635/1182981039564525579) channel. We will be happy to help you out. 41 | 42 | ## About the Mentor 43 | 44 | **Devam Desai** 45 | 46 | `DRedDevil04` 47 | 48 | *Developer by day, Hacker by night* 49 | 50 | 51 | ## Guidelines 52 | Please help us follow best practices to make things easy for both the reviewer and the contributor. We want to focus on code quality more than on managing pull request ethics. 53 | 54 | - People before code: If any of the following rules are violated, the pull request must not be rejected. This is to create an easy and joyful onboarding process for new programmers and first-time contributors. 55 | 56 | - Use a single commit per pull request and give the commit a meaningful name, for example: Adding <-your-name-> in students/mentors section. 57 | 58 | - Reference the issue numbers in the commit message if it resolves an open issue. 
Follow the PR Template. 59 | 60 | - Provide the link to live gh-pages from your forked repository or a relevant screenshot for easier review. 61 | 62 | - Pull requests older than 3 days with no response from the contributor will be closed. 63 | 64 | - Do not open a PR that is not related to an issue. You can create an issue and work on it once we approve it. 65 | 66 | - Avoid duplicate PRs. If needed, comment on the older PR with the number of the follow-up (new) PR and close the obsolete PR yourself. 67 | 68 | - Be polite: Be polite to other community members. 69 | -------------------------------------------------------------------------------- /data/Apoorv012.txt: -------------------------------------------------------------------------------- 1 | https://drive.google.com/file/d/1k_CIJFl0WSTBSCD8n2aB2G4QHDr5e47b/view?usp=drive_link 2 | 3 | This dataset has 2 categories: 4 | 1. BENIGN 5 | 2. NetBIOS 6 | 7 | The columns are as follows: 8 | 1. Unnamed: 0 9 | 2. Flow ID 10 | 3. Source IP 11 | 4. Source Port 12 | 5. Destination IP 13 | 6. Destination Port 14 | 7. Protocol 15 | 8. Timestamp 16 | 9. Flow Duration 17 | 10. Total Fwd Packets 18 | 11. Total Backward Packets 19 | 12. Total Length of Fwd Packets 20 | 13. Total Length of Bwd Packets 21 | 14. Fwd Packet Length Max 22 | 15. Fwd Packet Length Min 23 | 16. Fwd Packet Length Mean 24 | 17. Fwd Packet Length Std 25 | 18. Bwd Packet Length Max 26 | 19. Bwd Packet Length Min 27 | 20. Bwd Packet Length Mean 28 | 21. Bwd Packet Length Std 29 | 22. Flow Bytes/s 30 | 23. Flow Packets/s 31 | 24. Flow IAT Mean 32 | 25. Flow IAT Std 33 | 26. Flow IAT Max 34 | 27. Flow IAT Min 35 | 28. Fwd IAT Total 36 | 29. Fwd IAT Mean 37 | 30. Fwd IAT Std 38 | 31. Fwd IAT Max 39 | 32. Fwd IAT Min 40 | 33. Bwd IAT Total 41 | 34. Bwd IAT Mean 42 | 35. Bwd IAT Std 43 | 36. Bwd IAT Max 44 | 37. Bwd IAT Min 45 | 38. Fwd PSH Flags 46 | 39. Bwd PSH Flags 47 | 40. Fwd URG Flags 48 | 41. Bwd URG Flags 49 | 42. Fwd Header Length 50 | 43. 
Bwd Header Length 51 | 44. Fwd Packets/s 52 | 45. Bwd Packets/s 53 | 46. Min Packet Length 54 | 47. Max Packet Length 55 | 48. Packet Length Mean 56 | 49. Packet Length Std 57 | 50. Packet Length Variance 58 | 51. FIN Flag Count 59 | 52. SYN Flag Count 60 | 53. RST Flag Count 61 | 54. PSH Flag Count 62 | 55. ACK Flag Count 63 | 56. URG Flag Count 64 | 57. CWE Flag Count 65 | 58. ECE Flag Count 66 | 59. Down/Up Ratio 67 | 60. Average Packet Size 68 | 61. Avg Fwd Segment Size 69 | 62. Avg Bwd Segment Size 70 | 63. Fwd Header Length.1 71 | 64. Fwd Avg Bytes/Bulk 72 | 65. Fwd Avg Packets/Bulk 73 | 66. Fwd Avg Bulk Rate 74 | 67. Bwd Avg Bytes/Bulk 75 | 68. Bwd Avg Packets/Bulk 76 | 69. Bwd Avg Bulk Rate 77 | 70. Subflow Fwd Packets 78 | 71. Subflow Fwd Bytes 79 | 72. Subflow Bwd Packets 80 | 73. Subflow Bwd Bytes 81 | 74. Init_Win_bytes_forward 82 | 75. Init_Win_bytes_backward 83 | 76. act_data_pkt_fwd 84 | 77. min_seg_size_forward 85 | 78. Active Mean 86 | 79. Active Std 87 | 80. Active Max 88 | 81. Active Min 89 | 82. Idle Mean 90 | 83. Idle Std 91 | 84. Idle Max 92 | 85. Idle Min 93 | 86. SimillarHTTP 94 | 87. Inbound 95 | 88. Label 96 | -------------------------------------------------------------------------------- /data/CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Dataset Contribution Guidelines 2 | 3 | Thank you for considering contributing to our dataset repository! Please follow the guidelines below to ensure a smooth contribution process. 4 | 5 | ## Dataset Format 6 | 7 | Our dataset follows a specific format with the following columns: 8 | 9 | 1. `ip.src` 10 | 2. `ip.dst` 11 | 3. `tcp.srcport` 12 | 4. `tcp.dstport` 13 | 5. `ip.proto` 14 | 6. `frame.len` 15 | 7. `tcp.flags.syn` 16 | 8. `tcp.flags.reset` 17 | 9. `tcp.flags.push` 18 | 10. `tcp.flags.ack` 19 | 11. `ip.flags.mf` 20 | 12. `ip.flags.df` 21 | 13. `ip.flags.rb` 22 | 14. `tcp.seq` 23 | 15. `tcp.ack` 24 | 16. `frame.time` 25 | 17. 
`Packets` 26 | 18. `Bytes` 27 | 19. `Tx Packets` 28 | 20. `Tx Bytes` 29 | 21. `Rx Packets` 30 | 22. `Rx Bytes` 31 | 23. `Label` 32 | 33 | Ensure that your dataset adheres to this column structure for consistency. 34 | 35 | If your data is missing some of these columns, please contact us on the Discord channel linked in README.md. 36 | 37 | Also check out `APA-DDoS-Dataset.csv` for reference. 38 | 39 | ## Contribution Process 40 | 41 | 1. **Fork the Repository:** Start by forking the repository to your GitHub account. 42 | 43 | 2. **Clone the Repository:** Clone the forked repository to your local machine. 44 | 45 | ```bash 46 | git clone https://github.com//YADD.git 47 | ``` 48 | ## Feel free to post questions on the OpenCode Discord Server! 49 | 50 | 51 | -------------------------------------------------------------------------------- /data/Jarviss77.txt: -------------------------------------------------------------------------------- 1 | link: https://drive.google.com/file/d/1GLc2eKQstrDiVgTJ1A1Fw0Z4vYKhcR-3/view?usp=share_link -------------------------------------------------------------------------------- /data/kushal7201.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 3, 6 | "metadata": {}, 7 | "outputs": [ 8 | { 9 | "name": "stdout", 10 | "output_type": "stream", 11 | "text": [ 12 | "Index(['Unnamed: 0.1', 'Unnamed: 0', 'Flow ID', 'Src IP', 'Src Port', 'Dst IP',\n", 13 | " 'Dst Port', 'Protocol', 'Timestamp', 'Flow Duration', 'Tot Fwd Pkts',\n", 14 | " 'Tot Bwd Pkts', 'TotLen Fwd Pkts', 'TotLen Bwd Pkts', 'Fwd Pkt Len Max',\n", 15 | " 'Fwd Pkt Len Min', 'Fwd Pkt Len Mean', 'Fwd Pkt Len Std',\n", 16 | " 'Bwd Pkt Len Max', 'Bwd Pkt Len Min', 'Bwd Pkt Len Mean',\n", 17 | " 'Bwd Pkt Len Std', 'Flow Byts/s', 'Flow Pkts/s', 'Flow IAT Mean',\n", 18 | " 'Flow IAT Std', 'Flow IAT Max', 'Flow IAT Min', 'Fwd IAT Tot',\n", 19 | " 'Fwd IAT 
Mean', 'Fwd IAT Std', 'Fwd IAT Max', 'Fwd IAT Min',\n", 20 | " 'Bwd IAT Tot', 'Bwd IAT Mean', 'Bwd IAT Std', 'Bwd IAT Max',\n", 21 | " 'Bwd IAT Min', 'Fwd PSH Flags', 'Bwd PSH Flags', 'Fwd URG Flags',\n", 22 | " 'Bwd URG Flags', 'Fwd Header Len', 'Bwd Header Len', 'Fwd Pkts/s',\n", 23 | " 'Bwd Pkts/s', 'Pkt Len Min', 'Pkt Len Max', 'Pkt Len Mean',\n", 24 | " 'Pkt Len Std', 'Pkt Len Var', 'FIN Flag Cnt', 'SYN Flag Cnt',\n", 25 | " 'RST Flag Cnt', 'PSH Flag Cnt', 'ACK Flag Cnt', 'URG Flag Cnt',\n", 26 | " 'CWE Flag Count', 'ECE Flag Cnt', 'Down/Up Ratio', 'Pkt Size Avg',\n", 27 | " 'Fwd Seg Size Avg', 'Bwd Seg Size Avg', 'Fwd Byts/b Avg',\n", 28 | " 'Fwd Pkts/b Avg', 'Fwd Blk Rate Avg', 'Bwd Byts/b Avg',\n", 29 | " 'Bwd Pkts/b Avg', 'Bwd Blk Rate Avg', 'Subflow Fwd Pkts',\n", 30 | " 'Subflow Fwd Byts', 'Subflow Bwd Pkts', 'Subflow Bwd Byts',\n", 31 | " 'Init Fwd Win Byts', 'Init Bwd Win Byts', 'Fwd Act Data Pkts',\n", 32 | " 'Fwd Seg Size Min', 'Active Mean', 'Active Std', 'Active Max',\n", 33 | " 'Active Min', 'Idle Mean', 'Idle Std', 'Idle Max', 'Idle Min', 'Label'],\n", 34 | " dtype='object')\n", 35 | "Index(['ip.src', 'ip.dst', 'tcp.srcport', 'tcp.dstport', 'ip.proto',\n", 36 | " 'frame.len', 'tcp.flags.syn', 'tcp.flags.reset', 'tcp.flags.push',\n", 37 | " 'tcp.flags.ack', 'ip.flags.mf', 'ip.flags.df', 'ip.flags.rb', 'tcp.seq',\n", 38 | " 'tcp.ack', 'frame.time', 'Packets', 'Bytes', 'Tx Packets', 'Tx Bytes',\n", 39 | " 'Rx Packets', 'Rx Bytes', 'Label'],\n", 40 | " dtype='object')\n", 41 | "Index(['dt', 'switch', 'src', 'dst', 'pktcount', 'bytecount', 'dur',\n", 42 | " 'dur_nsec', 'tot_dur', 'flows', 'packetins', 'pktperflow',\n", 43 | " 'byteperflow', 'pktrate', 'Pairflow', 'Protocol', 'port_no', 'tx_bytes',\n", 44 | " 'rx_bytes', 'tx_kbps', 'rx_kbps', 'tot_kbps', 'label'],\n", 45 | " dtype='object')\n" 46 | ] 47 | } 48 | ], 49 | "source": [ 50 | "# Importing necessary libraries...\n", 51 | "import pandas as pd\n", 52 | "from sklearn.preprocessing import 
StandardScaler\n", 53 | "\n", 54 | "# Define a function to read data from a CSV file\n", 55 | "def read_data(file_path):\n", 56 | " return pd.read_csv(file_path)\n", 57 | "\n", 58 | "# Read data from the three datasets\n", 59 | "dataset1 = read_data('k7aditya.csv')\n", 60 | "dataset2 = read_data('APA-DDoS-Dataset.csv')\n", 61 | "dataset3 = read_data('tonyStark-Jr.csv')\n", 62 | "print(dataset1.columns)\n", 63 | "print(dataset2.columns)\n", 64 | "print(dataset3.columns)" 65 | ] 66 | }, 67 | { 68 | "cell_type": "code", 69 | "execution_count": 4, 70 | "metadata": {}, 71 | "outputs": [ 72 | { 73 | "name": "stdout", 74 | "output_type": "stream", 75 | "text": [ 76 | " unnamed: 0.1 unnamed: 0 src port dst port protocol flow duration \\\n", 77 | "0 -1.732039 -1.732039 1.660090 0.0 0.0 -0.481011 \n", 78 | "1 -1.732016 -1.732016 1.659866 0.0 0.0 -0.483635 \n", 79 | "2 -1.731993 -1.731993 1.659642 0.0 0.0 -0.489148 \n", 80 | "3 -1.731970 -1.731970 1.660314 0.0 0.0 -0.494445 \n", 81 | "4 -1.731947 -1.731947 -1.564743 0.0 0.0 -0.469008 \n", 82 | "... ... ... ... ... ... ... \n", 83 | "149995 1.731947 1.731947 -1.774386 0.0 0.0 -0.486475 \n", 84 | "149996 1.731970 1.731970 -1.774610 0.0 0.0 -0.489124 \n", 85 | "149997 1.731993 1.731993 -1.774834 0.0 0.0 -0.490007 \n", 86 | "149998 1.732016 1.732016 1.433200 0.0 0.0 -0.500506 \n", 87 | "149999 1.732039 1.732039 1.432976 0.0 0.0 -0.498692 \n", 88 | "\n", 89 | " tot fwd pkts tot bwd pkts totlen fwd pkts totlen bwd pkts ... \\\n", 90 | "0 0.0 0.0 0.0 0.0 ... \n", 91 | "1 0.0 0.0 0.0 0.0 ... \n", 92 | "2 0.0 0.0 0.0 0.0 ... \n", 93 | "3 0.0 0.0 0.0 0.0 ... \n", 94 | "4 0.0 0.0 0.0 0.0 ... \n", 95 | "... ... ... ... ... ... \n", 96 | "149995 0.0 0.0 0.0 0.0 ... \n", 97 | "149996 0.0 0.0 0.0 0.0 ... \n", 98 | "149997 0.0 0.0 0.0 0.0 ... \n", 99 | "149998 0.0 0.0 0.0 0.0 ... \n", 100 | "149999 0.0 0.0 0.0 0.0 ... 
\n", 101 | "\n", 102 | " fwd seg size min active mean active std active max active min \\\n", 103 | "0 0.0 0.0 0.0 0.0 0.0 \n", 104 | "1 0.0 0.0 0.0 0.0 0.0 \n", 105 | "2 0.0 0.0 0.0 0.0 0.0 \n", 106 | "3 0.0 0.0 0.0 0.0 0.0 \n", 107 | "4 0.0 0.0 0.0 0.0 0.0 \n", 108 | "... ... ... ... ... ... \n", 109 | "149995 0.0 0.0 0.0 0.0 0.0 \n", 110 | "149996 0.0 0.0 0.0 0.0 0.0 \n", 111 | "149997 0.0 0.0 0.0 0.0 0.0 \n", 112 | "149998 0.0 0.0 0.0 0.0 0.0 \n", 113 | "149999 0.0 0.0 0.0 0.0 0.0 \n", 114 | "\n", 115 | " idle mean idle std idle max idle min label \n", 116 | "0 0.0 0.0 0.0 0.0 0.0 \n", 117 | "1 0.0 0.0 0.0 0.0 0.0 \n", 118 | "2 0.0 0.0 0.0 0.0 0.0 \n", 119 | "3 0.0 0.0 0.0 0.0 0.0 \n", 120 | "4 0.0 0.0 0.0 0.0 0.0 \n", 121 | "... ... ... ... ... ... \n", 122 | "149995 0.0 0.0 0.0 0.0 0.0 \n", 123 | "149996 0.0 0.0 0.0 0.0 0.0 \n", 124 | "149997 0.0 0.0 0.0 0.0 0.0 \n", 125 | "149998 0.0 0.0 0.0 0.0 0.0 \n", 126 | "149999 0.0 0.0 0.0 0.0 0.0 \n", 127 | "\n", 128 | "[150000 rows x 82 columns]\n", 129 | " src port dst port protocol frame.len tcp.flags.syn \\\n", 130 | "0 -1.271463 0.0 0.0 -0.632141 0.0 \n", 131 | "1 -1.271412 0.0 0.0 -0.632141 0.0 \n", 132 | "2 -1.271361 0.0 0.0 -0.632141 0.0 \n", 133 | "3 -1.271310 0.0 0.0 -0.632141 0.0 \n", 134 | "4 -1.271259 0.0 0.0 -0.632141 0.0 \n", 135 | "... ... ... ... ... ... \n", 136 | "151195 0.508437 0.0 0.0 -0.463664 0.0 \n", 137 | "151196 0.508538 0.0 0.0 -0.463664 0.0 \n", 138 | "151197 0.508640 0.0 0.0 -0.463664 0.0 \n", 139 | "151198 0.508742 0.0 0.0 -0.463664 0.0 \n", 140 | "151199 0.508844 0.0 0.0 -0.463664 0.0 \n", 141 | "\n", 142 | " tcp.flags.reset tcp.flags.push tcp.flags.ack ip.flags.mf \\\n", 143 | "0 0.0 1.0 0.0 0.0 \n", 144 | "1 0.0 1.0 0.0 0.0 \n", 145 | "2 0.0 1.0 0.0 0.0 \n", 146 | "3 0.0 1.0 0.0 0.0 \n", 147 | "4 0.0 1.0 0.0 0.0 \n", 148 | "... ... ... ... ... 
\n", 149 | "151195 0.0 -1.0 0.0 0.0 \n", 150 | "151196 0.0 -1.0 0.0 0.0 \n", 151 | "151197 0.0 -1.0 0.0 0.0 \n", 152 | "151198 0.0 -1.0 0.0 0.0 \n", 153 | "151199 0.0 -1.0 0.0 0.0 \n", 154 | "\n", 155 | " ip.flags.df ip.flags.rb tcp.seq tcp.ack packets totlen fwd pkts \\\n", 156 | "0 -1.0 0.0 0.0 0.0 -0.508386 -0.983051 \n", 157 | "1 -1.0 0.0 0.0 0.0 0.430752 -0.694094 \n", 158 | "2 -1.0 0.0 0.0 0.0 1.369890 -0.405137 \n", 159 | "3 -1.0 0.0 0.0 0.0 0.430752 -0.694094 \n", 160 | "4 -1.0 0.0 0.0 0.0 -1.447524 -1.272008 \n", 161 | "... ... ... ... ... ... ... \n", 162 | "151195 1.0 0.0 0.0 0.0 0.430752 0.927277 \n", 163 | "151196 1.0 0.0 0.0 0.0 0.430752 0.940655 \n", 164 | "151197 1.0 0.0 0.0 0.0 0.430752 0.921926 \n", 165 | "151198 1.0 0.0 0.0 0.0 0.430752 1.004867 \n", 166 | "151199 1.0 0.0 0.0 0.0 0.430752 0.927277 \n", 167 | "\n", 168 | " tx packets tx bytes tot bwd pkts totlen bwd pkts label \n", 169 | "0 -0.774686 -0.985676 -0.035187 -0.977848 1.0 \n", 170 | "1 -0.036029 -0.680974 1.003123 -0.703883 1.0 \n", 171 | "2 0.702627 -0.376273 2.041433 -0.429917 1.0 \n", 172 | "3 -0.036029 -0.680974 1.003123 -0.703883 1.0 \n", 173 | "4 -1.513342 -1.290377 -1.073497 -1.251814 1.0 \n", 174 | "... ... ... ... ... ... 
\n", 175 | "151195 0.702627 0.955386 -0.035187 0.899322 -1.0 \n", 176 | "151196 0.702627 0.955386 -0.035187 0.924689 -1.0 \n", 177 | "151197 0.702627 0.955386 -0.035187 0.889175 -1.0 \n", 178 | "151198 0.702627 0.955386 -0.035187 1.046452 -1.0 \n", 179 | "151199 0.702627 0.955386 -0.035187 0.899322 -1.0 \n", 180 | "\n", 181 | "[151200 rows x 20 columns]\n", 182 | " timestamp switch pktcount bytes dur dur_nsec tot_dur \\\n", 183 | "0 -0.542890 -1.643016 -0.145262 0.207217 -0.781249 0.919175 -0.779412 \n", 184 | "1 -0.527862 -1.643016 1.413491 1.979416 -0.146367 0.984157 -0.144270 \n", 185 | "2 -0.542890 -1.643016 0.720298 1.191301 -0.428537 1.020258 -0.426555 \n", 186 | "3 -0.542890 -1.643016 0.720298 1.191301 -0.428537 1.020258 -0.426555 \n", 187 | "4 -0.542890 -1.643016 0.720298 1.191301 -0.428537 1.020258 -0.426555 \n", 188 | "... ... ... ... ... ... ... ... \n", 189 | "104340 -1.057435 -0.620687 -1.014589 -0.782718 -0.848265 1.374048 -0.847012 \n", 190 | "104341 -1.057435 -0.620687 -1.014589 -0.782718 -0.848265 1.374048 -0.847012 \n", 191 | "104342 -1.057435 -0.620687 -1.015512 -0.782815 -1.024621 1.240474 -1.023570 \n", 192 | "104343 -1.057435 -0.620687 -1.015512 -0.782815 -1.024621 1.240474 -1.023570 \n", 193 | "104344 -1.057435 -0.620687 -1.015512 -0.782815 -1.024621 1.240474 -1.023570 \n", 194 | "\n", 195 | " flows packets pktperflow totlen bwd pkts pktrate pairflow \\\n", 196 | "0 -0.899733 -0.619631 0.966041 1.284664 0.967330 -1.227267 \n", 197 | "1 -1.238714 -0.619631 0.965501 1.284100 0.967330 -1.227267 \n", 198 | "2 -0.899733 -0.619631 0.965906 1.284523 0.967330 -1.227267 \n", 199 | "3 -0.899733 -0.619631 0.965906 1.284523 0.967330 -1.227267 \n", 200 | "4 -0.899733 -0.619631 0.965906 1.284523 0.967330 -1.227267 \n", 201 | "... ... ... ... ... ... ... 
\n", 202 | "104340 -0.221772 -0.987332 -0.857925 -0.623447 -0.859661 -1.227267 \n", 203 | "104341 -0.221772 -0.987332 -0.857925 -0.623447 -0.859661 -1.227267 \n", 204 | "104342 -0.221772 -0.987332 -0.857790 -0.623434 -0.855610 -1.227267 \n", 205 | "104343 -0.221772 -0.987332 -0.857790 -0.623434 -0.855610 -1.227267 \n", 206 | "104344 -0.221772 -0.987332 -0.857790 -0.623434 -0.855610 -1.227267 \n", 207 | "\n", 208 | " src port tx_bytes rx_bytes flow_bytes_per_sec \\\n", 209 | "0 0.616885 0.333532 -0.701328 -0.412179 \n", 210 | "1 1.539116 -0.613732 -0.701331 -0.412179 \n", 211 | "2 -1.227575 -0.613733 -0.701348 -0.412179 \n", 212 | "3 -0.305345 -0.613733 -0.701346 -0.412179 \n", 213 | "4 0.616885 -0.613735 -0.701330 -0.412179 \n", 214 | "... ... ... ... ... \n", 215 | "104340 -1.227575 -0.613657 -0.701262 -0.411767 \n", 216 | "104341 0.616885 -0.613658 -0.701247 -0.411767 \n", 217 | "104342 -0.305345 -0.613735 -0.701329 -0.412179 \n", 218 | "104343 -1.227575 -0.613657 -0.701262 -0.411767 \n", 219 | "104344 0.616885 -0.613658 -0.701247 -0.411767 \n", 220 | "\n", 221 | " flow_packets_per_sec tot_kbps label \n", 222 | "0 -0.488502 -0.638457 0.0 \n", 223 | "1 -0.488502 -0.638457 0.0 \n", 224 | "2 -0.488502 -0.638457 0.0 \n", 225 | "3 -0.488502 -0.638457 0.0 \n", 226 | "4 -0.488502 -0.638457 0.0 \n", 227 | "... ... ... ... 
\n", 228 | "104340 -0.488015 -0.637821 0.0 \n", 229 | "104341 -0.488015 -0.637821 0.0 \n", 230 | "104342 -0.488502 -0.638457 0.0 \n", 231 | "104343 -0.488015 -0.637821 0.0 \n", 232 | "104344 -0.488015 -0.637821 0.0 \n", 233 | "\n", 234 | "[104345 rows x 20 columns]\n" 235 | ] 236 | } 237 | ], 238 | "source": [ 239 | "\n", 240 | "# Convert all column names to lowercase\n", 241 | "dataset1.columns = dataset1.columns.str.lower()\n", 242 | "dataset2.columns = dataset2.columns.str.lower()\n", 243 | "dataset3.columns = dataset3.columns.str.lower()\n", 244 | "\n", 245 | "# Standardize labels in each dataset\n", 246 | "dataset1['label'] = dataset1['label'].astype(str).str.lower().apply(lambda x: 1 if 'ddos' in x else 0)\n", 247 | "dataset2['label'] = dataset2['label'].astype(str).str.lower().apply(lambda x: 1 if 'ddos' in x else 0)\n", 248 | "dataset3['label'] = dataset3['label'].astype(str).str.lower().apply(lambda x: 1 if 'ddos' in x else 0)\n", 249 | "\n", 250 | "\n", 251 | "\n", 252 | "# Convert all column names to lowercase in DATASET1\n", 253 | "dataset1.columns = dataset1.columns.str.lower()\n", 254 | "\n", 255 | "# Rename features in DATASET1\n", 256 | "rename_mapping_dataset1 = {\n", 257 | " 'fwd pkt len max': 'max_fwd_pkt_len',\n", 258 | " 'fwd pkt len min': 'min_fwd_pkt_len',\n", 259 | " 'fwd pkt len mean': 'mean_fwd_pkt_len',\n", 260 | " 'fwd pkt len std': 'std_fwd_pkt_len',\n", 261 | " 'bwd pkt len max': 'max_bwd_pkt_len',\n", 262 | " 'bwd pkt len min': 'min_bwd_pkt_len',\n", 263 | " 'bwd pkt len mean': 'mean_bwd_pkt_len',\n", 264 | " 'bwd pkt len std': 'std_bwd_pkt_len',\n", 265 | " 'flow byts/s': 'flow_bytes_per_sec',\n", 266 | " 'flow pkts/s': 'flow_packets_per_sec',\n", 267 | " 'flow iat mean': 'flow_iat_mean',\n", 268 | " 'flow iat std': 'flow_iat_std',\n", 269 | " 'flow iat max': 'flow_iat_max',\n", 270 | " 'flow iat min': 'flow_iat_min'\n", 271 | "}\n", 272 | "\n", 273 | "dataset1.rename(columns=rename_mapping_dataset1, inplace=True)\n", 274 | "\n", 275 
| "# Convert all column names to lowercase in DATASET2\n", 276 | "dataset2.columns = dataset2.columns.str.lower()\n", 277 | "\n", 278 | "# Rename features in DATASET2\n", 279 | "rename_mapping_dataset2 = {\n", 280 | " 'ip.src': 'src ip',\n", 281 | " 'tcp.srcport': 'src port',\n", 282 | " 'ip.dst': 'dst ip',\n", 283 | " 'tcp.dstport': 'dst port',\n", 284 | " 'ip.proto': 'protocol',\n", 285 | " 'frame.time': 'timestamp',\n", 286 | " 'bytes': 'totlen fwd pkts',\n", 287 | " 'rx packets': 'tot bwd pkts',\n", 288 | " 'rx bytes': 'totlen bwd pkts',\n", 289 | "}\n", 290 | "\n", 291 | "dataset2.rename(columns=rename_mapping_dataset2, inplace=True)\n", 292 | "\n", 293 | "# Convert all column names to lowercase in DATASET3\n", 294 | "dataset3.columns = dataset3.columns.str.lower()\n", 295 | "\n", 296 | "# Rename features in DATASET3\n", 297 | "rename_mapping_dataset3 = {\n", 298 | " 'src': 'src ip',\n", 299 | " 'port_no': 'src port',\n", 300 | " 'dst': 'dst ip',\n", 301 | " 'protocol': 'ip.proto',\n", 302 | " 'dt': 'timestamp',\n", 303 | " 'packetins': 'packets',\n", 304 | " 'bytecount': 'bytes',\n", 305 | " 'byteperflow': 'totlen bwd pkts',\n", 306 | " 'packetperflow': 'tot bwd pkts',\n", 307 | " 'tx_kbps': 'flow_bytes_per_sec',\n", 308 | " 'rx_kbps': 'flow_packets_per_sec',\n", 309 | "}\n", 310 | "dataset3.rename(columns=rename_mapping_dataset3, inplace=True)\n", 311 | "\n", 312 | "\n", 313 | "# Drop columns that have no entries\n", 314 | "dataset1 = dataset1.dropna(axis=1, how='all')\n", 315 | "dataset2 = dataset2.dropna(axis=1, how='all')\n", 316 | "dataset3 = dataset3.dropna(axis=1, how='all')\n", 317 | "\n", 318 | "# Select only numeric columns from each dataset\n", 319 | "numeric_features1 = dataset1.select_dtypes(include=['float64', 'int64'])\n", 320 | "numeric_features2 = dataset2.select_dtypes(include=['float64', 'int64'])\n", 321 | "numeric_features3 = dataset3.select_dtypes(include=['float64', 'int64'])\n", 322 | "\n", 323 | "# Initialize a StandardScaler\n", 324 
| "scaler = StandardScaler()\n", 325 | "\n", 326 | "# Apply the scaler to the numeric columns of each dataset\n", 327 | "dataset1[numeric_features1.columns] = scaler.fit_transform(numeric_features1)\n", 328 | "dataset2[numeric_features2.columns] = scaler.fit_transform(numeric_features2)\n", 329 | "dataset3[numeric_features3.columns] = scaler.fit_transform(numeric_features3)\n", 330 | "\n", 331 | "# Print the standardized data\n", 332 | "print(dataset1[numeric_features1.columns])\n", 333 | "print(dataset2[numeric_features2.columns])\n", 334 | "print(dataset3[numeric_features3.columns])\n", 335 | "\n" 336 | ] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "execution_count": 5, 341 | "metadata": {}, 342 | "outputs": [ 343 | { 344 | "name": "stdout", 345 | "output_type": "stream", 346 | "text": [ 347 | "['src ip', 'dst ip', 'totlen bwd pkts', 'label', 'src port', 'timestamp']\n" 348 | ] 349 | } 350 | ], 351 | "source": [ 352 | "# Identify common features in all datasets\n", 353 | "common_features = list(set(dataset1.columns).intersection(set(dataset2.columns), set(dataset3.columns)))\n", 354 | "print(common_features)" 355 | ] 356 | }, 357 | { 358 | "cell_type": "code", 359 | "execution_count": 6, 360 | "metadata": {}, 361 | "outputs": [ 362 | { 363 | "ename": "", 364 | "evalue": "", 365 | "output_type": "error", 366 | "traceback": [ 367 | "\u001b[1;31mThe Kernel crashed while executing code in the the current cell or a previous cell. Please review the code in the cell(s) to identify a possible cause of the failure. Click here for more info. View Jupyter log for further details." 
368 | ] 369 | } 370 | ], 371 | "source": [ 372 | "\n", 373 | "# Drop uncommon features from each dataset\n", 374 | "dataset1 = dataset1.drop(columns=[col for col in dataset1.columns if col not in common_features])\n", 375 | "dataset2 = dataset2.drop(columns=[col for col in dataset2.columns if col not in common_features])\n", 376 | "dataset3 = dataset3.drop(columns=[col for col in dataset3.columns if col not in common_features])\n", 377 | "\n", 378 | "# Concatenate the three datasets and rebuild the row index so it is unique\n", 379 | "dataset = pd.concat([dataset1, dataset2, dataset3], ignore_index=True)\n", 380 | "\n", 381 | "# Fill missing values in numeric columns with the column mean (string columns such as IPs are left as-is)\n", 382 | "dataset = dataset.fillna(dataset.mean(numeric_only=True))\n", 383 | "\n", 384 | "# Handle outliers: drop rows where any numeric feature (excluding the label) is below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR\n", 385 | "num_cols = dataset.select_dtypes(include='number').columns.drop('label', errors='ignore')\n", 386 | "Q1, Q3 = dataset[num_cols].quantile(0.25), dataset[num_cols].quantile(0.75)\n", 387 | "IQR = Q3 - Q1\n", 388 | "dataset = dataset[~((dataset[num_cols] < (Q1 - 1.5 * IQR)) | (dataset[num_cols] > (Q3 + 1.5 * IQR))).any(axis=1)]\n", 389 | "\n", 390 | "# Assign features to variable X and labels to variable Y\n", 391 | "X = dataset.drop('label', axis=1)\n", 392 | "Y = dataset['label']\n", 393 | "\n", 394 | "print(X)\n", 395 | "print(Y)" 396 | ] 397 | } 398 | ], 399 | "metadata": { 400 | "kernelspec": { 401 | "display_name": "Python 3", 402 | "language": "python", 403 | "name": "python3" 404 | }, 405 | "language_info": { 406 | "codemirror_mode": { 407 | "name": "ipython", 408 | "version": 3 409 | }, 410 | "file_extension": ".py", 411 | "mimetype": "text/x-python", 412 | "name": "python", 413 | "nbconvert_exporter": "python", 414 | "pygments_lexer": "ipython3", 415 | "version": "3.12.0" 416 | } 417 | }, 418 | "nbformat": 4, 419 | "nbformat_minor": 2 420 | } 421 | -------------------------------------------------------------------------------- /data/network_data.csv: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/data/network_data.csv -------------------------------------------------------------------------------- /data/sarthakvermaa.txt: -------------------------------------------------------------------------------- 1 | https://drive.google.com/file/d/1N9ZKFQjyIZYMTV6yv-3XVuD9KgR7NOMa/view?usp=sharing 2 | 3 | This dataset has been labelled into: Benign and Portmap 4 | 5 | 88 Features: 6 | Unnamed 7 | Flow ID 8 | Source IP 9 | Source Port 10 | Destination IP 11 | Destination Port 12 | Protocol 13 | Timestamp 14 | Flow Duration 15 | Total Fwd Packets 16 | Total Backward Packets 17 | Total Length of Fwd Packets 18 | Total Length of Bwd Packets 19 | Fwd Packet Length Max 20 | Fwd Packet Length Min 21 | Fwd Packet Length Mean 22 | Fwd Packet Length Std 23 | Bwd Packet Length Max 24 | Bwd Packet Length Min 25 | Bwd Packet Length Mean 26 | Bwd Packet Length Std 27 | Flow Bytes/s 28 | Flow Packets/s 29 | Flow IAT Mean 30 | Flow IAT Std 31 | Flow IAT Max 32 | Flow IAT Min 33 | Fwd IAT Total 34 | Fwd IAT Mean 35 | Fwd IAT Std 36 | Fwd IAT Max 37 | Fwd IAT Min 38 | Bwd IAT Total 39 | Bwd IAT Mean 40 | Bwd IAT Std 41 | Bwd IAT Max 42 | Bwd IAT Min 43 | Fwd PSH Flags 44 | Bwd PSH Flags 45 | Fwd URG Flags 46 | Bwd URG Flags 47 | Fwd Header Length 48 | Bwd Header Length 49 | Fwd Packets/s 50 | Bwd Packets/s 51 | Min Packet Length 52 | Max Packet Length 53 | Packet Length Mean 54 | Packet Length Std 55 | Packet Length Variance 56 | FIN Flag Count 57 | SYN Flag Count 58 | RST Flag Count 59 | PSH Flag Count 60 | ACK Flag Count 61 | URG Flag Count 62 | CWE Flag Count 63 | ECE Flag Count 64 | Down/Up Ratio 65 | Average Packet Size 66 | Avg Fwd Segment Size 67 | Avg Bwd Segment Size 68 | Fwd Header Length.1 69 | Fwd Avg Bytes/Bulk 70 | Fwd Avg Packets/Bulk 71 | Fwd Avg Bulk Rate 72 | Bwd Avg Bytes/Bulk 73 | Bwd Avg Packets/Bulk 74 | Bwd Avg Bulk Rate 75 | Subflow Fwd Packets 76 | Subflow Fwd Bytes 77 
| Subflow Bwd Packets 78 | Subflow Bwd Bytes 79 | Init_Win_bytes_forward 80 | Init_Win_bytes_backward 81 | act_data_pkt_fwd 82 | min_seg_size_forward 83 | Active Mean 84 | Active Std 85 | Active Max 86 | Active Min 87 | Idle Mean 88 | Idle Std 89 | Idle Max 90 | Idle Min 91 | SimillarHTTP 92 | Inbound 93 | Label 94 | -------------------------------------------------------------------------------- /data/tonyStark-Jr.txt: -------------------------------------------------------------------------------- 1 | Dataset Link: https://drive.google.com/file/d/1-7B7w6PARCTiWnoYPe73422V_l-nC-v7/view?usp=sharing 2 | 3 | Each record is classified into one of two categories: benign or LDAP-DDoS attack. 4 | 5 | Total Features: 84 6 | 7 | Features: 8 | 'Flow ID', 9 | ' Source IP', 10 | ' Source Port', 11 | ' Destination IP', 12 | ' Destination Port', 13 | ' Protocol', 14 | ' Timestamp', 15 | ' Flow Duration', 16 | ' Total Fwd Packets', 17 | ' Total Backward Packets', 18 | 'Total Length of Fwd Packets', 19 | ' Total Length of Bwd Packets', 20 | ' Fwd Packet Length Max', 21 | ' Fwd Packet Length Min', 22 | ' Fwd Packet Length Mean', 23 | ' Fwd Packet Length Std', 24 | 'Bwd Packet Length Max', 25 | ' Bwd Packet Length Min', 26 | ' Bwd Packet Length Mean', 27 | ' Bwd Packet Length Std', 28 | 'Flow Bytes/s', 29 | ' Flow Packets/s', 30 | ' Flow IAT Mean', 31 | ' Flow IAT Std', 32 | ' Flow IAT Max', 33 | ' Flow IAT Min', 34 | 'Fwd IAT Total', 35 | ' Fwd IAT Mean', 36 | ' Fwd IAT Std', 37 | ' Fwd IAT Max', 38 | ' Fwd IAT Min', 39 | 'Bwd IAT Total', 40 | ' Bwd IAT Mean', 41 | ' Bwd IAT Std', 42 | ' Bwd IAT Max', 43 | ' Bwd IAT Min', 44 | 'Fwd PSH Flags', 45 | ' Bwd PSH Flags', 46 | ' Fwd URG Flags', 47 | ' Bwd URG Flags', 48 | ' Fwd Header Length', 49 | ' Bwd Header Length', 50 | 'Fwd Packets/s', 51 | ' Bwd Packets/s', 52 | ' Min Packet Length', 53 | ' Max Packet Length', 54 | ' Packet Length Mean', 55 | ' Packet Length Std', 56 | ' Packet Length Variance', 57 | 'FIN Flag Count', 58 | ' SYN Flag Count', 59 | ' 
RST Flag Count', 60 | ' PSH Flag Count', 61 | ' ACK Flag Count', 62 | ' URG Flag Count', 63 | ' CWE Flag Count', 64 | ' ECE Flag Count', 65 | ' Down/Up Ratio', 66 | ' Average Packet Size', 67 | ' Avg Fwd Segment Size', 68 | ' Avg Bwd Segment Size', 69 | ' Fwd Header Length.1', 70 | 'Fwd Avg Bytes/Bulk', 71 | ' Fwd Avg Packets/Bulk', 72 | ' Fwd Avg Bulk Rate', 73 | ' Bwd Avg Bytes/Bulk', 74 | ' Bwd Avg Packets/Bulk', 75 | 'Bwd Avg Bulk Rate', 76 | 'Subflow Fwd Packets', 77 | ' Subflow Fwd Bytes', 78 | ' Subflow Bwd Packets', 79 | ' Subflow Bwd Bytes', 80 | 'Init_Win_bytes_forward', 81 | ' Init_Win_bytes_backward', 82 | ' act_data_pkt_fwd', 83 | ' min_seg_size_forward', 84 | 'Active Mean', 85 | ' Active Std', 86 | ' Active Max', 87 | ' Active Min', 88 | 'Idle Mean', 89 | ' Idle Std', 90 | ' Idle Max', 91 | ' Idle Min', 92 | 'SimillarHTTP', 93 | ' Inbound', 94 | ' Label' 95 | 96 | 97 | 98 | -------------------------------------------------------------------------------- /model/main_model.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/model/main_model.py -------------------------------------------------------------------------------- /model/sarthakvermaa.txt: -------------------------------------------------------------------------------- 1 | Link shared with email iit2022035@iiita.ac.in 2 | 3 | https://colab.research.google.com/drive/1PQkMDiVLMAb92T6aVU-OcAybGwYxQSxd?usp=sharing 4 | 5 | model_tonyStark-Jr 6 | https://colab.research.google.com/drive/1R1Pq56YIEg_EV6QoSCtpBjxfCLkXBlj6 7 | -------------------------------------------------------------------------------- /reports/data/Mihir.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "id": "96078db8-a6be-4ae3-a23a-52129e351766", 7 | "metadata": {}, 8 | "outputs": [], 9 | 
"source": [ 10 | "import numpy as np\n", 11 | "import pandas as pd\n", 12 | "import seaborn as sns\n", 13 | "import matplotlib.pyplot as plt" 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | "execution_count": 2, 19 | "id": "21b9cb29-8af5-425a-b5b6-dcfdd4aeb7ab", 20 | "metadata": {}, 21 | "outputs": [], 22 | "source": [ 23 | "dataset= pd.read_csv(\"data/APA-DDoS-Dataset.csv\")" 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": 3, 29 | "id": "9253fb56-c939-4931-8a63-2bdef897ca7a", 30 | "metadata": {}, 31 | "outputs": [ 32 | { 33 | "data": { 34 | "text/html": [ 35 | "
              ip.src        ip.dst  tcp.srcport  tcp.dstport  ip.proto  frame.len  tcp.flags.syn  tcp.flags.reset  tcp.flags.push  tcp.flags.ack  ...  tcp.seq  tcp.ack                                        frame.time  Packets  Bytes  Tx Packets  Tx Bytes  Rx Packets  Rx Bytes         Label
     0   192.168.1.1  192.168.23.2         2412         8000         6         54              0                0               1              1  ...        1        1  16-Jun 2020 20:18:15.071112000 Mountain Dayli...        8    432           4       216           4       216  DDoS-PSH-ACK
     1   192.168.1.1  192.168.23.2         2413         8000         6         54              0                0               1              1  ...        1        1  16-Jun 2020 20:18:15.071138000 Mountain Dayli...       10    540           5       270           5       270  DDoS-PSH-ACK
     2   192.168.1.1  192.168.23.2         2414         8000         6         54              0                0               1              1  ...        1        1  16-Jun 2020 20:18:15.071146000 Mountain Dayli...       12    648           6       324           6       324  DDoS-PSH-ACK
     3   192.168.1.1  192.168.23.2         2415         8000         6         54              0                0               1              1  ...        1        1  16-Jun 2020 20:18:15.071152000 Mountain Dayli...       10    540           5       270           5       270  DDoS-PSH-ACK
     4   192.168.1.1  192.168.23.2         2416         8000         6         54              0                0               1              1  ...        1        1  16-Jun 2020 20:18:15.071159000 Mountain Dayli...        6    324           3       162           3       162  DDoS-PSH-ACK
   ...           ...           ...          ...          ...       ...        ...            ...              ...             ...            ...  ...      ...      ...                                               ...      ...    ...         ...       ...         ...       ...           ...
151195  192.168.19.1  192.168.23.2        37360         8000         6         66              0                0               0              1  ...        1        1  16-Jun 2020 22:10:46.923006000 Mountain Dayli...       10   1146           6       560           4       586        Benign
151196  192.168.19.1  192.168.23.2        37362         8000         6         66              0                0               0              1  ...        1        1  16-Jun 2020 22:10:46.935672000 Mountain Dayli...       10   1151           6       560           4       591        Benign
151197  192.168.19.1  192.168.23.2        37364         8000         6         66              0                0               0              1  ...        1        1  16-Jun 2020 22:10:46.957469000 Mountain Dayli...       10   1144           6       560           4       584        Benign
151198  192.168.19.1  192.168.23.2        37366         8000         6         66              0                0               0              1  ...        1        1  16-Jun 2020 22:10:46.970971000 Mountain Dayli...       10   1175           6       560           4       615        Benign
151199  192.168.19.1  192.168.23.2        37368         8000         6         66              0                0               0              1  ...        1        1  16-Jun 2020 22:10:46.984798000 Mountain Dayli...       10   1146           6       560           4       586        Benign
\n", 343 | "

151200 rows × 23 columns

\n", 344 | "
" 345 | ], 346 | "text/plain": [ 347 | " ip.src ip.dst tcp.srcport tcp.dstport ip.proto \\\n", 348 | "0 192.168.1.1 192.168.23.2 2412 8000 6 \n", 349 | "1 192.168.1.1 192.168.23.2 2413 8000 6 \n", 350 | "2 192.168.1.1 192.168.23.2 2414 8000 6 \n", 351 | "3 192.168.1.1 192.168.23.2 2415 8000 6 \n", 352 | "4 192.168.1.1 192.168.23.2 2416 8000 6 \n", 353 | "... ... ... ... ... ... \n", 354 | "151195 192.168.19.1 192.168.23.2 37360 8000 6 \n", 355 | "151196 192.168.19.1 192.168.23.2 37362 8000 6 \n", 356 | "151197 192.168.19.1 192.168.23.2 37364 8000 6 \n", 357 | "151198 192.168.19.1 192.168.23.2 37366 8000 6 \n", 358 | "151199 192.168.19.1 192.168.23.2 37368 8000 6 \n", 359 | "\n", 360 | " frame.len tcp.flags.syn tcp.flags.reset tcp.flags.push \\\n", 361 | "0 54 0 0 1 \n", 362 | "1 54 0 0 1 \n", 363 | "2 54 0 0 1 \n", 364 | "3 54 0 0 1 \n", 365 | "4 54 0 0 1 \n", 366 | "... ... ... ... ... \n", 367 | "151195 66 0 0 0 \n", 368 | "151196 66 0 0 0 \n", 369 | "151197 66 0 0 0 \n", 370 | "151198 66 0 0 0 \n", 371 | "151199 66 0 0 0 \n", 372 | "\n", 373 | " tcp.flags.ack ... tcp.seq tcp.ack \\\n", 374 | "0 1 ... 1 1 \n", 375 | "1 1 ... 1 1 \n", 376 | "2 1 ... 1 1 \n", 377 | "3 1 ... 1 1 \n", 378 | "4 1 ... 1 1 \n", 379 | "... ... ... ... ... \n", 380 | "151195 1 ... 1 1 \n", 381 | "151196 1 ... 1 1 \n", 382 | "151197 1 ... 1 1 \n", 383 | "151198 1 ... 1 1 \n", 384 | "151199 1 ... 1 1 \n", 385 | "\n", 386 | " frame.time Packets Bytes \\\n", 387 | "0 16-Jun 2020 20:18:15.071112000 Mountain Dayli... 8 432 \n", 388 | "1 16-Jun 2020 20:18:15.071138000 Mountain Dayli... 10 540 \n", 389 | "2 16-Jun 2020 20:18:15.071146000 Mountain Dayli... 12 648 \n", 390 | "3 16-Jun 2020 20:18:15.071152000 Mountain Dayli... 10 540 \n", 391 | "4 16-Jun 2020 20:18:15.071159000 Mountain Dayli... 6 324 \n", 392 | "... ... ... ... \n", 393 | "151195 16-Jun 2020 22:10:46.923006000 Mountain Dayli... 10 1146 \n", 394 | "151196 16-Jun 2020 22:10:46.935672000 Mountain Dayli... 
10 1151 \n", 395 | "151197 16-Jun 2020 22:10:46.957469000 Mountain Dayli... 10 1144 \n", 396 | "151198 16-Jun 2020 22:10:46.970971000 Mountain Dayli... 10 1175 \n", 397 | "151199 16-Jun 2020 22:10:46.984798000 Mountain Dayli... 10 1146 \n", 398 | "\n", 399 | " Tx Packets Tx Bytes Rx Packets Rx Bytes Label \n", 400 | "0 4 216 4 216 DDoS-PSH-ACK \n", 401 | "1 5 270 5 270 DDoS-PSH-ACK \n", 402 | "2 6 324 6 324 DDoS-PSH-ACK \n", 403 | "3 5 270 5 270 DDoS-PSH-ACK \n", 404 | "4 3 162 3 162 DDoS-PSH-ACK \n", 405 | "... ... ... ... ... ... \n", 406 | "151195 6 560 4 586 Benign \n", 407 | "151196 6 560 4 591 Benign \n", 408 | "151197 6 560 4 584 Benign \n", 409 | "151198 6 560 4 615 Benign \n", 410 | "151199 6 560 4 586 Benign \n", 411 | "\n", 412 | "[151200 rows x 23 columns]" 413 | ] 414 | }, 415 | "execution_count": 3, 416 | "metadata": {}, 417 | "output_type": "execute_result" 418 | } 419 | ], 420 | "source": [ 421 | "dataset" 422 | ] 423 | }, 424 | { 425 | "cell_type": "code", 426 | "execution_count": 4, 427 | "id": "5920a89c-29e1-41d4-b6e7-03e9b7ddbb29", 428 | "metadata": {}, 429 | "outputs": [ 430 | { 431 | "name": "stdout", 432 | "output_type": "stream", 433 | "text": [ 434 | "(151200, 23)\n" 435 | ] 436 | } 437 | ], 438 | "source": [ 439 | "print(dataset.shape)" 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": 5, 445 | "id": "22699c59-6340-48c1-84e4-9bc1d35aed49", 446 | "metadata": {}, 447 | "outputs": [ 448 | { 449 | "data": { 450 | "text/plain": [ 451 | "Index(['ip.src', 'ip.dst', 'tcp.srcport', 'tcp.dstport', 'ip.proto',\n", 452 | " 'frame.len', 'tcp.flags.syn', 'tcp.flags.reset', 'tcp.flags.push',\n", 453 | " 'tcp.flags.ack', 'ip.flags.mf', 'ip.flags.df', 'ip.flags.rb', 'tcp.seq',\n", 454 | " 'tcp.ack', 'frame.time', 'Packets', 'Bytes', 'Tx Packets', 'Tx Bytes',\n", 455 | " 'Rx Packets', 'Rx Bytes', 'Label'],\n", 456 | " dtype='object')" 457 | ] 458 | }, 459 | "execution_count": 5, 460 | "metadata": {}, 461 | "output_type": 
"execute_result" 462 | } 463 | ], 464 | "source": [ 465 | "dataset.columns" 466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": 6, 471 | "id": "b1557903-3e07-4ff7-b9a4-37fe80636414", 472 | "metadata": {}, 473 | "outputs": [ 474 | { 475 | "name": "stdout", 476 | "output_type": "stream", 477 | "text": [ 478 | "no such feature\n" 479 | ] 480 | } 481 | ], 482 | "source": [ 483 | "## to capture any nan values in the dataset\n", 484 | "features_nan=[feature for feature in dataset.columns if dataset[feature].isnull().sum()>1 and dataset[feature].dtypes=='O']\n", 485 | "if len(features_nan) !=0:\n", 486 | " for feature in features_nan:\n", 487 | " print(\"{} has: {}% missing values\".format(feature,np.round(dataset[feature].isnull().mean(),4)*100)) \n", 488 | "else :\n", 489 | " print(\"no such feature\")" 490 | ] 491 | }, 492 | { 493 | "cell_type": "code", 494 | "execution_count": 7, 495 | "id": "8f305fc9-ab1d-4154-9173-c0c4124d624b", 496 | "metadata": {}, 497 | "outputs": [ 498 | { 499 | "name": "stdout", 500 | "output_type": "stream", 501 | "text": [ 502 | "number of numerical variables 19\n" 503 | ] 504 | } 505 | ], 506 | "source": [ 507 | "numerical_features=[feature for feature in dataset.columns if dataset[feature].dtypes!='O']\n", 508 | "print(\"number of numerical variables \",len(numerical_features))" 509 | ] 510 | }, 511 | { 512 | "cell_type": "code", 513 | "execution_count": 8, 514 | "id": "f7978539-814f-4f35-b2a1-57d55f33f31f", 515 | "metadata": {}, 516 | "outputs": [ 517 | { 518 | "data": { 519 | "text/html": [ 520 | "
   tcp.srcport  tcp.dstport  ip.proto  frame.len  tcp.flags.syn  tcp.flags.reset  tcp.flags.push  tcp.flags.ack  ip.flags.mf  ip.flags.df  ip.flags.rb  tcp.seq  tcp.ack  Packets  Bytes  Tx Packets  Tx Bytes  Rx Packets  Rx Bytes
0         2412         8000         6         54              0                0               1              1            0            0            0        1        1        8    432           4       216           4       216
1         2413         8000         6         54              0                0               1              1            0            0            0        1        1       10    540           5       270           5       270
2         2414         8000         6         54              0                0               1              1            0            0            0        1        1       12    648           6       324           6       324
3         2415         8000         6         54              0                0               1              1            0            0            0        1        1       10    540           5       270           5       270
4         2416         8000         6         54              0                0               1              1            0            0            0        1        1        6    324           3       162           3       162
\n", 672 | "
" 673 | ], 674 | "text/plain": [ 675 | " tcp.srcport tcp.dstport ip.proto frame.len tcp.flags.syn \\\n", 676 | "0 2412 8000 6 54 0 \n", 677 | "1 2413 8000 6 54 0 \n", 678 | "2 2414 8000 6 54 0 \n", 679 | "3 2415 8000 6 54 0 \n", 680 | "4 2416 8000 6 54 0 \n", 681 | "\n", 682 | " tcp.flags.reset tcp.flags.push tcp.flags.ack ip.flags.mf ip.flags.df \\\n", 683 | "0 0 1 1 0 0 \n", 684 | "1 0 1 1 0 0 \n", 685 | "2 0 1 1 0 0 \n", 686 | "3 0 1 1 0 0 \n", 687 | "4 0 1 1 0 0 \n", 688 | "\n", 689 | " ip.flags.rb tcp.seq tcp.ack Packets Bytes Tx Packets Tx Bytes \\\n", 690 | "0 0 1 1 8 432 4 216 \n", 691 | "1 0 1 1 10 540 5 270 \n", 692 | "2 0 1 1 12 648 6 324 \n", 693 | "3 0 1 1 10 540 5 270 \n", 694 | "4 0 1 1 6 324 3 162 \n", 695 | "\n", 696 | " Rx Packets Rx Bytes \n", 697 | "0 4 216 \n", 698 | "1 5 270 \n", 699 | "2 6 324 \n", 700 | "3 5 270 \n", 701 | "4 3 162 " 702 | ] 703 | }, 704 | "execution_count": 8, 705 | "metadata": {}, 706 | "output_type": "execute_result" 707 | } 708 | ], 709 | "source": [ 710 | "dataset[numerical_features].head()" 711 | ] 712 | }, 713 | { 714 | "cell_type": "code", 715 | "execution_count": 9, 716 | "id": "19d74934-031d-4482-a9e1-a13fdad3c74c", 717 | "metadata": {}, 718 | "outputs": [], 719 | "source": [ 720 | "#values of some features has same value throughout the dataset and does not affect labels,so dropping them\n", 721 | "dataset=dataset.drop(columns=[\"tcp.dstport\",\"ip.proto\",\"tcp.flags.syn\",\"tcp.flags.reset\",\"tcp.flags.ack\",\"ip.flags.mf\",\"ip.flags.rb\",\"tcp.seq\",\"tcp.ack\",\"frame.time\"])" 722 | ] 723 | }, 724 | { 725 | "cell_type": "code", 726 | "execution_count": 10, 727 | "id": "acd67db1-c920-4467-a72b-bc5bf2136f32", 728 | "metadata": {}, 729 | "outputs": [ 730 | { 731 | "data": { 732 | "text/plain": [ 733 | "Index(['ip.src', 'ip.dst', 'tcp.srcport', 'frame.len', 'tcp.flags.push',\n", 734 | " 'ip.flags.df', 'Packets', 'Bytes', 'Tx Packets', 'Tx Bytes',\n", 735 | " 'Rx Packets', 'Rx Bytes', 'Label'],\n", 736 | " 
dtype='object')" 737 | ] 738 | }, 739 | "execution_count": 10, 740 | "metadata": {}, 741 | "output_type": "execute_result" 742 | } 743 | ], 744 | "source": [ 745 | "dataset.columns" 746 | ] 747 | }, 748 | { 749 | "cell_type": "code", 750 | "execution_count": 11, 751 | "id": "08502a70-3bca-48fb-a126-fecc37418cce", 752 | "metadata": {}, 753 | "outputs": [ 754 | { 755 | "data": { 756 | "text/html": [ 757 | "
              ip.src        ip.dst  tcp.srcport  frame.len  tcp.flags.push  ip.flags.df  Packets  Bytes  Tx Packets  Tx Bytes  Rx Packets  Rx Bytes   Label
151195  192.168.19.1  192.168.23.2        37360         66               0            1       10   1146           6       560           4       586  Benign
151196  192.168.19.1  192.168.23.2        37362         66               0            1       10   1151           6       560           4       591  Benign
151197  192.168.19.1  192.168.23.2        37364         66               0            1       10   1144           6       560           4       584  Benign
151198  192.168.19.1  192.168.23.2        37366         66               0            1       10   1175           6       560           4       615  Benign
151199  192.168.19.1  192.168.23.2        37368         66               0            1       10   1146           6       560           4       586  Benign
\n", 873 | "
" 874 | ], 875 | "text/plain": [ 876 | " ip.src ip.dst tcp.srcport frame.len tcp.flags.push \\\n", 877 | "151195 192.168.19.1 192.168.23.2 37360 66 0 \n", 878 | "151196 192.168.19.1 192.168.23.2 37362 66 0 \n", 879 | "151197 192.168.19.1 192.168.23.2 37364 66 0 \n", 880 | "151198 192.168.19.1 192.168.23.2 37366 66 0 \n", 881 | "151199 192.168.19.1 192.168.23.2 37368 66 0 \n", 882 | "\n", 883 | " ip.flags.df Packets Bytes Tx Packets Tx Bytes Rx Packets \\\n", 884 | "151195 1 10 1146 6 560 4 \n", 885 | "151196 1 10 1151 6 560 4 \n", 886 | "151197 1 10 1144 6 560 4 \n", 887 | "151198 1 10 1175 6 560 4 \n", 888 | "151199 1 10 1146 6 560 4 \n", 889 | "\n", 890 | " Rx Bytes Label \n", 891 | "151195 586 Benign \n", 892 | "151196 591 Benign \n", 893 | "151197 584 Benign \n", 894 | "151198 615 Benign \n", 895 | "151199 586 Benign " 896 | ] 897 | }, 898 | "execution_count": 11, 899 | "metadata": {}, 900 | "output_type": "execute_result" 901 | } 902 | ], 903 | "source": [ 904 | "dataset.tail()" 905 | ] 906 | }, 907 | { 908 | "cell_type": "code", 909 | "execution_count": 12, 910 | "id": "bc4511e2-190a-4278-9c5f-03bbc05c9534", 911 | "metadata": {}, 912 | "outputs": [], 913 | "source": [ 914 | "label_dummy=pd.get_dummies(dataset['Label'])" 915 | ] 916 | }, 917 | { 918 | "cell_type": "code", 919 | "execution_count": 13, 920 | "id": "a083974b-dec4-41ec-bfdf-8397de60d302", 921 | "metadata": {}, 922 | "outputs": [], 923 | "source": [ 924 | "dataset=pd.concat([dataset,label_dummy], axis=1)" 925 | ] 926 | }, 927 | { 928 | "cell_type": "code", 929 | "execution_count": 14, 930 | "id": "43fc72eb-f75f-4061-b21d-d7f3de39fdc9", 931 | "metadata": {}, 932 | "outputs": [], 933 | "source": [ 934 | "from sklearn.preprocessing import LabelEncoder" 935 | ] 936 | }, 937 | { 938 | "cell_type": "code", 939 | "execution_count": 15, 940 | "id": "f59d68b7-dca5-4c8d-99a0-f37716840837", 941 | "metadata": {}, 942 | "outputs": [], 943 | "source": [ 944 | "le=LabelEncoder()" 945 | ] 946 | }, 947 | { 948 | 
"cell_type": "code", 949 | "execution_count": 16, 950 | "id": "f4bad7d1-a5e8-4b8f-b9e9-6969592e5dec", 951 | "metadata": {}, 952 | "outputs": [], 953 | "source": [ 954 | "dataset['DDoS-PSH-ACK']=le.fit_transform(dataset['DDoS-PSH-ACK'])\n", 955 | "dataset['Benign']=le.fit_transform(dataset['Benign'])\n", 956 | "dataset['DDoS-ACK']=le.fit_transform(dataset['DDoS-ACK'])" 957 | ] 958 | }, 959 | { 960 | "cell_type": "code", 961 | "execution_count": 17, 962 | "id": "2639d707-f0dc-4e20-86fc-eb76c5819068", 963 | "metadata": {}, 964 | "outputs": [ 965 | { 966 | "data": { 967 | "text/html": [ 968 | "
              ip.src        ip.dst  tcp.srcport  frame.len  tcp.flags.push  ip.flags.df  Packets  Bytes  Tx Packets  Tx Bytes  Rx Packets  Rx Bytes   Label  Benign  DDoS-ACK  DDoS-PSH-ACK
151195  192.168.19.1  192.168.23.2        37360         66               0            1       10   1146           6       560           4       586  Benign       1         0             0
151196  192.168.19.1  192.168.23.2        37362         66               0            1       10   1151           6       560           4       591  Benign       1         0             0
151197  192.168.19.1  192.168.23.2        37364         66               0            1       10   1144           6       560           4       584  Benign       1         0             0
151198  192.168.19.1  192.168.23.2        37366         66               0            1       10   1175           6       560           4       615  Benign       1         0             0
151199  192.168.19.1  192.168.23.2        37368         66               0            1       10   1146           6       560           4       586  Benign       1         0             0
\n", 1102 | "
" 1103 | ], 1104 | "text/plain": [ 1105 | " ip.src ip.dst tcp.srcport frame.len tcp.flags.push \\\n", 1106 | "151195 192.168.19.1 192.168.23.2 37360 66 0 \n", 1107 | "151196 192.168.19.1 192.168.23.2 37362 66 0 \n", 1108 | "151197 192.168.19.1 192.168.23.2 37364 66 0 \n", 1109 | "151198 192.168.19.1 192.168.23.2 37366 66 0 \n", 1110 | "151199 192.168.19.1 192.168.23.2 37368 66 0 \n", 1111 | "\n", 1112 | " ip.flags.df Packets Bytes Tx Packets Tx Bytes Rx Packets \\\n", 1113 | "151195 1 10 1146 6 560 4 \n", 1114 | "151196 1 10 1151 6 560 4 \n", 1115 | "151197 1 10 1144 6 560 4 \n", 1116 | "151198 1 10 1175 6 560 4 \n", 1117 | "151199 1 10 1146 6 560 4 \n", 1118 | "\n", 1119 | " Rx Bytes Label Benign DDoS-ACK DDoS-PSH-ACK \n", 1120 | "151195 586 Benign 1 0 0 \n", 1121 | "151196 591 Benign 1 0 0 \n", 1122 | "151197 584 Benign 1 0 0 \n", 1123 | "151198 615 Benign 1 0 0 \n", 1124 | "151199 586 Benign 1 0 0 " 1125 | ] 1126 | }, 1127 | "execution_count": 17, 1128 | "metadata": {}, 1129 | "output_type": "execute_result" 1130 | } 1131 | ], 1132 | "source": [ 1133 | "dataset.tail()" 1134 | ] 1135 | }, 1136 | { 1137 | "cell_type": "code", 1138 | "execution_count": 18, 1139 | "id": "476778b6-7ef6-4863-84ab-11b37db80364", 1140 | "metadata": {}, 1141 | "outputs": [], 1142 | "source": [ 1143 | "feature_scale=[feature for feature in dataset.columns if feature not in ['ip.src','ip.dst','Benign','DDoS-ACK','DDoS-PSH-ACK','Label']]\n" 1144 | ] 1145 | }, 1146 | { 1147 | "cell_type": "code", 1148 | "execution_count": 19, 1149 | "id": "8a9976ce-dcc5-4b5c-a666-be845acede1c", 1150 | "metadata": {}, 1151 | "outputs": [], 1152 | "source": [ 1153 | "from sklearn.preprocessing import StandardScaler" 1154 | ] 1155 | }, 1156 | { 1157 | "cell_type": "code", 1158 | "execution_count": 20, 1159 | "id": "82196ac7-1e64-4ad3-abd2-1f2f1d79d6d6", 1160 | "metadata": {}, 1161 | "outputs": [], 1162 | "source": [ 1163 | "scaler=StandardScaler()" 1164 | ] 1165 | }, 1166 | { 1167 | "cell_type": "code", 1168 | 
"execution_count": 21, 1169 | "id": "263f317f-5c60-4871-a596-c8ec3fbdc6c3", 1170 | "metadata": {}, 1171 | "outputs": [ 1172 | { 1173 | "data": { 1174 | "text/html": [ 1175 | "
StandardScaler()
" 1176 | ], 1177 | "text/plain": [ 1178 | "StandardScaler()" 1179 | ] 1180 | }, 1181 | "execution_count": 21, 1182 | "metadata": {}, 1183 | "output_type": "execute_result" 1184 | } 1185 | ], 1186 | "source": [ 1187 | "scaler.fit(dataset[feature_scale])" 1188 | ] 1189 | }, 1190 | { 1191 | "cell_type": "code", 1192 | "execution_count": 22, 1193 | "id": "5ec20821-66ad-4027-bdea-59108dba542c", 1194 | "metadata": {}, 1195 | "outputs": [ 1196 | { 1197 | "data": { 1198 | "text/plain": [ 1199 | "array([[-1.27146315, -0.63214064, 1. , ..., -0.98567572,\n", 1200 | " -0.03518717, -0.97784837],\n", 1201 | " [-1.27141222, -0.63214064, 1. , ..., -0.68097412,\n", 1202 | " 1.00312277, -0.70388293],\n", 1203 | " [-1.27136129, -0.63214064, 1. , ..., -0.37627252,\n", 1204 | " 2.0414327 , -0.4299175 ],\n", 1205 | " ...,\n", 1206 | " [ 0.50864024, -0.46366387, -1. , ..., 0.95538634,\n", 1207 | " -0.03518717, 0.88917535],\n", 1208 | " [ 0.5087421 , -0.46366387, -1. , ..., 0.95538634,\n", 1209 | " -0.03518717, 1.0464518 ],\n", 1210 | " [ 0.50884396, -0.46366387, -1. , ..., 0.95538634,\n", 1211 | " -0.03518717, 0.89932222]])" 1212 | ] 1213 | }, 1214 | "execution_count": 22, 1215 | "metadata": {}, 1216 | "output_type": "execute_result" 1217 | } 1218 | ], 1219 | "source": [ 1220 | "scaler.transform(dataset[feature_scale])" 1221 | ] 1222 | }, 1223 | { 1224 | "cell_type": "code", 1225 | "execution_count": 23, 1226 | "id": "27703530-3c60-4bcc-8900-6acf1ac325fb", 1227 | "metadata": {}, 1228 | "outputs": [ 1229 | { 1230 | "data": { 1231 | "text/html": [ 1232 | "
        ip.src        ip.dst  tcp.srcport  frame.len  tcp.flags.push  ip.flags.df  Packets  Bytes  Tx Packets  Tx Bytes  Rx Packets  Rx Bytes         Label  Benign  DDoS-ACK  DDoS-PSH-ACK
0  192.168.1.1  192.168.23.2         2412         54               1            0        8    432           4       216           4       216  DDoS-PSH-ACK       0         0             1
1  192.168.1.1  192.168.23.2         2413         54               1            0       10    540           5       270           5       270  DDoS-PSH-ACK       0         0             1
2  192.168.1.1  192.168.23.2         2414         54               1            0       12    648           6       324           6       324  DDoS-PSH-ACK       0         0             1
3  192.168.1.1  192.168.23.2         2415         54               1            0       10    540           5       270           5       270  DDoS-PSH-ACK       0         0             1
4  192.168.1.1  192.168.23.2         2416         54               1            0        6    324           3       162           3       162  DDoS-PSH-ACK       0         0             1
\n", 1366 | "
" 1367 | ], 1368 | "text/plain": [ 1369 | " ip.src ip.dst tcp.srcport frame.len tcp.flags.push \\\n", 1370 | "0 192.168.1.1 192.168.23.2 2412 54 1 \n", 1371 | "1 192.168.1.1 192.168.23.2 2413 54 1 \n", 1372 | "2 192.168.1.1 192.168.23.2 2414 54 1 \n", 1373 | "3 192.168.1.1 192.168.23.2 2415 54 1 \n", 1374 | "4 192.168.1.1 192.168.23.2 2416 54 1 \n", 1375 | "\n", 1376 | " ip.flags.df Packets Bytes Tx Packets Tx Bytes Rx Packets Rx Bytes \\\n", 1377 | "0 0 8 432 4 216 4 216 \n", 1378 | "1 0 10 540 5 270 5 270 \n", 1379 | "2 0 12 648 6 324 6 324 \n", 1380 | "3 0 10 540 5 270 5 270 \n", 1381 | "4 0 6 324 3 162 3 162 \n", 1382 | "\n", 1383 | " Label Benign DDoS-ACK DDoS-PSH-ACK \n", 1384 | "0 DDoS-PSH-ACK 0 0 1 \n", 1385 | "1 DDoS-PSH-ACK 0 0 1 \n", 1386 | "2 DDoS-PSH-ACK 0 0 1 \n", 1387 | "3 DDoS-PSH-ACK 0 0 1 \n", 1388 | "4 DDoS-PSH-ACK 0 0 1 " 1389 | ] 1390 | }, 1391 | "execution_count": 23, 1392 | "metadata": {}, 1393 | "output_type": "execute_result" 1394 | } 1395 | ], 1396 | "source": [ 1397 | "dataset.head()" 1398 | ] 1399 | }, 1400 | { 1401 | "cell_type": "code", 1402 | "execution_count": 24, 1403 | "id": "2163dd06-2fc7-4a99-b215-9fc42237f680", 1404 | "metadata": {}, 1405 | "outputs": [], 1406 | "source": [ 1407 | "dataset=pd.concat([dataset[['ip.src','ip.dst','Benign','DDoS-ACK','DDoS-PSH-ACK','Label']].reset_index(drop=True),pd.DataFrame(scaler.transform(dataset[feature_scale]),columns=feature_scale)],\n", 1408 | " axis=1)" 1409 | ] 1410 | }, 1411 | { 1412 | "cell_type": "code", 1413 | "execution_count": 25, 1414 | "id": "9c520b85-0bed-4e32-8d74-88b2b9d069a4", 1415 | "metadata": {}, 1416 | "outputs": [ 1417 | { 1418 | "data": { 1419 | "text/html": [ 1420 | "
        ip.src        ip.dst  Benign  DDoS-ACK  DDoS-PSH-ACK         Label  tcp.srcport  frame.len  tcp.flags.push  ip.flags.df    Packets      Bytes  Tx Packets   Tx Bytes  Rx Packets   Rx Bytes
0  192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271463  -0.632141             1.0         -1.0  -0.508386  -0.983051   -0.774686  -0.985676   -0.035187  -0.977848
1  192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271412  -0.632141             1.0         -1.0   0.430752  -0.694094   -0.036029  -0.680974    1.003123  -0.703883
2  192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271361  -0.632141             1.0         -1.0   1.369890  -0.405137    0.702627  -0.376273    2.041433  -0.429917
3  192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271310  -0.632141             1.0         -1.0   0.430752  -0.694094   -0.036029  -0.680974    1.003123  -0.703883
4  192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271259  -0.632141             1.0         -1.0  -1.447524  -1.272008   -1.513342  -1.290377   -1.073497  -1.251814
\n", 1554 | "
" 1555 | ], 1556 | "text/plain": [ 1557 | " ip.src ip.dst Benign DDoS-ACK DDoS-PSH-ACK Label \\\n", 1558 | "0 192.168.1.1 192.168.23.2 0 0 1 DDoS-PSH-ACK \n", 1559 | "1 192.168.1.1 192.168.23.2 0 0 1 DDoS-PSH-ACK \n", 1560 | "2 192.168.1.1 192.168.23.2 0 0 1 DDoS-PSH-ACK \n", 1561 | "3 192.168.1.1 192.168.23.2 0 0 1 DDoS-PSH-ACK \n", 1562 | "4 192.168.1.1 192.168.23.2 0 0 1 DDoS-PSH-ACK \n", 1563 | "\n", 1564 | " tcp.srcport frame.len tcp.flags.push ip.flags.df Packets Bytes \\\n", 1565 | "0 -1.271463 -0.632141 1.0 -1.0 -0.508386 -0.983051 \n", 1566 | "1 -1.271412 -0.632141 1.0 -1.0 0.430752 -0.694094 \n", 1567 | "2 -1.271361 -0.632141 1.0 -1.0 1.369890 -0.405137 \n", 1568 | "3 -1.271310 -0.632141 1.0 -1.0 0.430752 -0.694094 \n", 1569 | "4 -1.271259 -0.632141 1.0 -1.0 -1.447524 -1.272008 \n", 1570 | "\n", 1571 | " Tx Packets Tx Bytes Rx Packets Rx Bytes \n", 1572 | "0 -0.774686 -0.985676 -0.035187 -0.977848 \n", 1573 | "1 -0.036029 -0.680974 1.003123 -0.703883 \n", 1574 | "2 0.702627 -0.376273 2.041433 -0.429917 \n", 1575 | "3 -0.036029 -0.680974 1.003123 -0.703883 \n", 1576 | "4 -1.513342 -1.290377 -1.073497 -1.251814 " 1577 | ] 1578 | }, 1579 | "execution_count": 25, 1580 | "metadata": {}, 1581 | "output_type": "execute_result" 1582 | } 1583 | ], 1584 | "source": [ 1585 | "dataset.head()" 1586 | ] 1587 | }, 1588 | { 1589 | "cell_type": "code", 1590 | "execution_count": 26, 1591 | "id": "6c4d5c3c-b07c-4dcd-9b6e-abdbdec18317", 1592 | "metadata": {}, 1593 | "outputs": [], 1594 | "source": [ 1595 | "from sklearn.model_selection import train_test_split\n", 1596 | "rest_data, sampled_data = train_test_split(dataset, test_size=0.005, stratify=dataset['Label'], random_state=42)" 1597 | ] 1598 | }, 1599 | { 1600 | "cell_type": "code", 1601 | "execution_count": 27, 1602 | "id": "a33a78d0-8f56-42a8-8042-c064023f8fb5", 1603 | "metadata": {}, 1604 | "outputs": [ 1605 | { 1606 | "data": { 1607 | "image/png": "", 1608 | "text/plain": [ 1609 | "
" 1610 | ] 1611 | }, 1612 | "metadata": {}, 1613 | "output_type": "display_data" 1614 | }, 1615 | { 1616 | "data": { 1617 | "image/png": "", 1618 | "text/plain": [ 1619 | "
" 1620 | ] 1621 | }, 1622 | "metadata": {}, 1623 | "output_type": "display_data" 1624 | }, 1625 | { 1626 | "data": { 1627 | "image/png": "", 1628 | "text/plain": [ 1629 | "
" 1630 | ] 1631 | }, 1632 | "metadata": {}, 1633 | "output_type": "display_data" 1634 | } 1635 | ], 1636 | "source": [ 1637 | "plt.figure(figsize=(8, 6))\n", 1638 | "sns.violinplot(data= sampled_data, x='Label', y='frame.len')\n", 1639 | "plt.title('Distribution of frame length across Labels')\n", 1640 | "plt.xlabel('Label')\n", 1641 | "plt.ylabel('frame.len')\n", 1642 | "plt.show()\n", 1643 | "\n", 1644 | "plt.figure(figsize=(8, 6))\n", 1645 | "sns.violinplot(data= sampled_data, x='Label', y='Packets')\n", 1646 | "plt.title('Distribution of Packets across Labels')\n", 1647 | "plt.xlabel('Label')\n", 1648 | "plt.ylabel('Packets')\n", 1649 | "plt.show()\n", 1650 | "\n", 1651 | "plt.figure(figsize=(8, 6))\n", 1652 | "sns.violinplot(data= sampled_data, x='Label', y='Rx Bytes')\n", 1653 | "plt.title('Distribution of Rx Bytes across Labels')\n", 1654 | "plt.xlabel('Label')\n", 1655 | "plt.ylabel('Rx Bytes')\n", 1656 | "plt.show()" 1657 | ] 1658 | }, 1659 | { 1660 | "cell_type": "code", 1661 | "execution_count": 28, 1662 | "id": "8b2a9560-0253-4f11-8b16-ec31f3ee8f99", 1663 | "metadata": {}, 1664 | "outputs": [ 1665 | { 1666 | "data": { 1667 | "text/html": [ 1668 | "
\n", 1669 | "\n", 1682 | "\n", 1683 | " \n", 1684 | " \n", 1685 | " \n", 1686 | " \n", 1687 | " \n", 1688 | " \n", 1689 | " \n", 1690 | " \n", 1691 | " \n", 1692 | " \n", 1693 | " \n", 1694 | " \n", 1695 | " \n", 1696 | " \n", 1697 | " \n", 1698 | " \n", 1699 | " \n", 1700 | " \n", 1701 | " \n", 1702 | " \n", 1703 | " \n", 1704 | " \n", 1705 | " \n", 1706 | " \n", 1707 | " \n", 1708 | " \n", 1709 | " \n", 1710 | " \n", 1711 | " \n", 1712 | " \n", 1713 | " \n", 1714 | " \n", 1715 | " \n", 1716 | " \n", 1717 | " \n", 1718 | " \n", 1719 | " \n", 1720 | " \n", 1721 | " \n", 1722 | " \n", 1723 | " \n", 1724 | " \n", 1725 | " \n", 1726 | " \n", 1727 | " \n", 1728 | " \n", 1729 | " \n", 1730 | " \n", 1731 | " \n", 1732 | " \n", 1733 | " \n", 1734 | " \n", 1735 | " \n", 1736 | " \n", 1737 | " \n", 1738 | " \n", 1739 | " \n", 1740 | " \n", 1741 | " \n", 1742 | " \n", 1743 | " \n", 1744 | " \n", 1745 | " \n", 1746 | " \n", 1747 | " \n", 1748 | " \n", 1749 | " \n", 1750 | " \n", 1751 | " \n", 1752 | " \n", 1753 | " \n", 1754 | " \n", 1755 | " \n", 1756 | " \n", 1757 | " \n", 1758 | " \n", 1759 | " \n", 1760 | " \n", 1761 | " \n", 1762 | " \n", 1763 | " \n", 1764 | " \n", 1765 | " \n", 1766 | " \n", 1767 | " \n", 1768 | " \n", 1769 | " \n", 1770 | " \n", 1771 | " \n", 1772 | " \n", 1773 | " \n", 1774 | " \n", 1775 | " \n", 1776 | " \n", 1777 | " \n", 1778 | " \n", 1779 | " \n", 1780 | " \n", 1781 | " \n", 1782 | " \n", 1783 | " \n", 1784 | " \n", 1785 | " \n", 1786 | " \n", 1787 | " \n", 1788 | " \n", 1789 | " \n", 1790 | " \n", 1791 | " \n", 1792 | " \n", 1793 | " \n", 1794 | " \n", 1795 | " \n", 1796 | " \n", 1797 | " \n", 1798 | " \n", 1799 | " \n", 1800 | " \n", 1801 | " \n", 1802 | " \n", 1803 | " \n", 1804 | " \n", 1805 | " \n", 1806 | " \n", 1807 | " \n", 1808 | " \n", 1809 | " \n", 1810 | " \n", 1811 | " \n", 1812 | " \n", 1813 | " \n", 1814 | " \n", 1815 | " \n", 1816 | " \n", 1817 | " \n", 1818 | " \n", 1819 | " \n", 1820 | " \n", 1821 | " \n", 1822 | " \n", 1823 | " 
\n", 1824 | " \n", 1825 | " \n", 1826 | " \n", 1827 | " \n", 1828 | " \n", 1829 | " \n", 1830 | " \n", 1831 | " \n", 1832 | " \n", 1833 | " \n", 1834 | " \n", 1835 | " \n", 1836 | " \n", 1837 | " \n", 1838 | " \n", 1839 | " \n", 1840 | " \n", 1841 | " \n", 1842 | " \n", 1843 | " \n", 1844 | " \n", 1845 | " \n", 1846 | " \n", 1847 | " \n", 1848 | " \n", 1849 | " \n", 1850 | " \n", 1851 | " \n", 1852 | " \n", 1853 | " \n", 1854 | " \n", 1855 | " \n", 1856 | " \n", 1857 | " \n", 1858 | " \n", 1859 | " \n", 1860 | " \n", 1861 | " \n", 1862 | " \n", 1863 | " \n", 1864 | " \n", 1865 | " \n", 1866 | " \n", 1867 | " \n", 1868 | " \n", 1869 | " \n", 1870 | " \n", 1871 | " \n", 1872 | " \n", 1873 | " \n", 1874 | " \n", 1875 | " \n", 1876 | " \n", 1877 | " \n", 1878 | " \n", 1879 | " \n", 1880 | " \n", 1881 | " \n", 1882 | " \n", 1883 | " \n", 1884 | " \n", 1885 | " \n", 1886 | " \n", 1887 | " \n", 1888 | " \n", 1889 | " \n", 1890 | " \n", 1891 | " \n", 1892 | " \n", 1893 | " \n", 1894 | " \n", 1895 | " \n", 1896 | " \n", 1897 | " \n", 1898 | " \n", 1899 | " \n", 1900 | " \n", 1901 | " \n", 1902 | " \n", 1903 | " \n", 1904 | " \n", 1905 | " \n", 1906 | " \n", 1907 | " \n", 1908 | " \n", 1909 | " \n", 1910 | " \n", 1911 | " \n", 1912 | " \n", 1913 | " \n", 1914 | " \n", 1915 | "
              ip.src        ip.dst  Benign  DDoS-ACK  DDoS-PSH-ACK         Label  tcp.srcport  frame.len  tcp.flags.push  ip.flags.df    Packets      Bytes  Tx Packets   Tx Bytes  Rx Packets   Rx Bytes
0        192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271463  -0.632141             1.0         -1.0  -0.508386  -0.983051   -0.774686  -0.985676   -0.035187  -0.977848
1        192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271412  -0.632141             1.0         -1.0   0.430752  -0.694094   -0.036029  -0.680974    1.003123  -0.703883
2        192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271361  -0.632141             1.0         -1.0   1.369890  -0.405137    0.702627  -0.376273    2.041433  -0.429917
3        192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271310  -0.632141             1.0         -1.0   0.430752  -0.694094   -0.036029  -0.680974    1.003123  -0.703883
4        192.168.1.1  192.168.23.2       0         0             1  DDoS-PSH-ACK    -1.271259  -0.632141             1.0         -1.0  -1.447524  -1.272008   -1.513342  -1.290377   -1.073497  -1.251814
...              ...           ...     ...       ...           ...           ...          ...        ...             ...          ...        ...        ...         ...        ...         ...        ...
151195  192.168.19.1  192.168.23.2       1         0             0        Benign     0.508437  -0.463664            -1.0          1.0   0.430752   0.927277    0.702627   0.955386   -0.035187   0.899322
151196  192.168.19.1  192.168.23.2       1         0             0        Benign     0.508538  -0.463664            -1.0          1.0   0.430752   0.940655    0.702627   0.955386   -0.035187   0.924689
151197  192.168.19.1  192.168.23.2       1         0             0        Benign     0.508640  -0.463664            -1.0          1.0   0.430752   0.921926    0.702627   0.955386   -0.035187   0.889175
151198  192.168.19.1  192.168.23.2       1         0             0        Benign     0.508742  -0.463664            -1.0          1.0   0.430752   1.004867    0.702627   0.955386   -0.035187   1.046452
151199  192.168.19.1  192.168.23.2       1         0             0        Benign     0.508844  -0.463664            -1.0          1.0   0.430752   0.927277    0.702627   0.955386   -0.035187   0.899322
\n", 1916 | "

151200 rows × 16 columns

\n", 1917 | "
" 1918 | ], 1919 | "text/plain": [ 1920 | " ip.src ip.dst Benign DDoS-ACK DDoS-PSH-ACK \\\n", 1921 | "0 192.168.1.1 192.168.23.2 0 0 1 \n", 1922 | "1 192.168.1.1 192.168.23.2 0 0 1 \n", 1923 | "2 192.168.1.1 192.168.23.2 0 0 1 \n", 1924 | "3 192.168.1.1 192.168.23.2 0 0 1 \n", 1925 | "4 192.168.1.1 192.168.23.2 0 0 1 \n", 1926 | "... ... ... ... ... ... \n", 1927 | "151195 192.168.19.1 192.168.23.2 1 0 0 \n", 1928 | "151196 192.168.19.1 192.168.23.2 1 0 0 \n", 1929 | "151197 192.168.19.1 192.168.23.2 1 0 0 \n", 1930 | "151198 192.168.19.1 192.168.23.2 1 0 0 \n", 1931 | "151199 192.168.19.1 192.168.23.2 1 0 0 \n", 1932 | "\n", 1933 | " Label tcp.srcport frame.len tcp.flags.push ip.flags.df \\\n", 1934 | "0 DDoS-PSH-ACK -1.271463 -0.632141 1.0 -1.0 \n", 1935 | "1 DDoS-PSH-ACK -1.271412 -0.632141 1.0 -1.0 \n", 1936 | "2 DDoS-PSH-ACK -1.271361 -0.632141 1.0 -1.0 \n", 1937 | "3 DDoS-PSH-ACK -1.271310 -0.632141 1.0 -1.0 \n", 1938 | "4 DDoS-PSH-ACK -1.271259 -0.632141 1.0 -1.0 \n", 1939 | "... ... ... ... ... ... \n", 1940 | "151195 Benign 0.508437 -0.463664 -1.0 1.0 \n", 1941 | "151196 Benign 0.508538 -0.463664 -1.0 1.0 \n", 1942 | "151197 Benign 0.508640 -0.463664 -1.0 1.0 \n", 1943 | "151198 Benign 0.508742 -0.463664 -1.0 1.0 \n", 1944 | "151199 Benign 0.508844 -0.463664 -1.0 1.0 \n", 1945 | "\n", 1946 | " Packets Bytes Tx Packets Tx Bytes Rx Packets Rx Bytes \n", 1947 | "0 -0.508386 -0.983051 -0.774686 -0.985676 -0.035187 -0.977848 \n", 1948 | "1 0.430752 -0.694094 -0.036029 -0.680974 1.003123 -0.703883 \n", 1949 | "2 1.369890 -0.405137 0.702627 -0.376273 2.041433 -0.429917 \n", 1950 | "3 0.430752 -0.694094 -0.036029 -0.680974 1.003123 -0.703883 \n", 1951 | "4 -1.447524 -1.272008 -1.513342 -1.290377 -1.073497 -1.251814 \n", 1952 | "... ... ... ... ... ... ... 
\n", 1953 | "151195 0.430752 0.927277 0.702627 0.955386 -0.035187 0.899322 \n", 1954 | "151196 0.430752 0.940655 0.702627 0.955386 -0.035187 0.924689 \n", 1955 | "151197 0.430752 0.921926 0.702627 0.955386 -0.035187 0.889175 \n", 1956 | "151198 0.430752 1.004867 0.702627 0.955386 -0.035187 1.046452 \n", 1957 | "151199 0.430752 0.927277 0.702627 0.955386 -0.035187 0.899322 \n", 1958 | "\n", 1959 | "[151200 rows x 16 columns]" 1960 | ] 1961 | }, 1962 | "execution_count": 28, 1963 | "metadata": {}, 1964 | "output_type": "execute_result" 1965 | } 1966 | ], 1967 | "source": [ 1968 | "dataset" 1969 | ] 1970 | }, 1971 | { 1972 | "cell_type": "code", 1973 | "execution_count": null, 1974 | "id": "6baaa1b8-32e9-4d0b-af37-49114f78d0fb", 1975 | "metadata": {}, 1976 | "outputs": [], 1977 | "source": [] 1978 | } 1979 | ], 1980 | "metadata": { 1981 | "kernelspec": { 1982 | "display_name": "Python 3 (ipykernel)", 1983 | "language": "python", 1984 | "name": "python3" 1985 | }, 1986 | "language_info": { 1987 | "codemirror_mode": { 1988 | "name": "ipython", 1989 | "version": 3 1990 | }, 1991 | "file_extension": ".py", 1992 | "mimetype": "text/x-python", 1993 | "name": "python", 1994 | "nbconvert_exporter": "python", 1995 | "pygments_lexer": "ipython3", 1996 | "version": "3.11.4" 1997 | } 1998 | }, 1999 | "nbformat": 4, 2000 | "nbformat_minor": 5 2001 | } 2002 | -------------------------------------------------------------------------------- /reports/data/Mihir.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/reports/data/Mihir.pdf -------------------------------------------------------------------------------- /reports/data/analysis.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/reports/data/analysis.pdf 
-------------------------------------------------------------------------------- /reports/data/sarthakvermaa.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/reports/data/sarthakvermaa.pdf -------------------------------------------------------------------------------- /reports/model/model_reports.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/opencodeiiita/YADD/66bfff366d33bc082201cbe4f73cbc93fa282330/reports/model/model_reports.pdf --------------------------------------------------------------------------------