├── ANN_and_SMOTE_Sampling
│   └── ANN_and_SMOTE_Sampling.ipynb
├── Artificial_Neural_Network_with_Keras
│   └── Artificial_Neural_Network_with_Keras.ipynb
├── Black_Friday_EDA
│   └── black_friday_eda.ipynb
├── Cohort Analysis and Customer Segmentation
│   └── Cohort_Analysis.ipynb
├── Convolutional_Neural_Network
│   └── Convolutional_Neural_Network.ipynb
├── Global_Terrorism_Database
│   └── Terrorism_in_the_World_and_Turkey.ipynb
├── Heart_Disease_Machine_Learning
│   └── heart-disease-classifications-machine-learning.ipynb
├── History_Of_Olympics_EDA
│   └── History_Of_Olympics_Who_Are_The_Most_Successful.ipynb
├── Last_Words_NLP
│   └── Last_Words_Natural_Language_Process.ipynb
├── Mall_Customers_KMeans
│   └── Mall_Customers_KMeans.ipynb
├── README.md
├── Support_Vector_Machine
│   └── What_is_Support_Vector_Machine.ipynb
├── Time Series Forecasting - ARIMA, LSTM, Prophet
│   └── Time_Series_Forecasting.ipynb
└── Wine_Reviews_NLP
    └── Wine_Reviews_NLP.ipynb

/ANN_and_SMOTE_Sampling/ANN_and_SMOTE_Sampling.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [ 3 | { 4 | "metadata": { 5 | "_uuid": "47145f29fe9bc06d173a7bf66881132305df6df0" 6 | }, 7 | "cell_type": "markdown", 8 | "source": "# Credit Card Fraud Detection - Neural Networks and SMOTE Sampling" 9 | }, 10 | { 11 | "metadata": { 12 | "_uuid": "e70f7881f5fe0af181834092f41886e9d5eae48f" 13 | }, 14 | "cell_type": "markdown", 15 | "source": "## About Dataset\n
\n\"The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.\n\nIt contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.\"\n
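The imbalance quoted above can be checked with a few lines of arithmetic. This is only a sketch using the two counts from the description; the variable names are illustrative and not part of the original notebook. It also shows how accurate a classifier would be if it never predicted fraud at all, which is a useful reference point for the accuracy figures reported later.

```python
# Counts quoted in the dataset description above.
n_total = 284_807
n_fraud = 492

fraud_rate = n_fraud / n_total
# Accuracy of a trivial classifier that always predicts "not fraud".
majority_baseline_accuracy = 1 - fraud_rate

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"Always-'not fraud' accuracy: {majority_baseline_accuracy:.3%}")
```

The fraud rate comes out at roughly 0.173%, so a model has to beat about 99.83% accuracy before accuracy says anything about fraud detection.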
\n" 16 | }, 17 | { 18 | "metadata": { 19 | "_uuid": "2d7afc30946595f6c618f68d3b2a1f12184fa685" 20 | }, 21 | "cell_type": "markdown", 22 | "source": "## Import Libraries" 23 | }, 24 | { 25 | "metadata": { 26 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5", 27 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 28 | "trusted": true 29 | }, 30 | "cell_type": "code", 31 | "source": "# This Python 3 environment comes with many helpful analytics libraries installed\n# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python\n# For example, here's several helpful packages to load in \n\nimport numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\nimport keras\nimport matplotlib.pyplot as plt\nimport seaborn as sns\n\n# Input data files are available in the \"../input/\" directory.\n# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory\n\nimport os\nprint(os.listdir(\"../input\"))\n\n# Any results you write to the current directory are saved as output.", 32 | "execution_count": 1, 33 | "outputs": [ 34 | { 35 | "output_type": "stream", 36 | "text": "Using TensorFlow backend.\n", 37 | "name": "stderr" 38 | }, 39 | { 40 | "output_type": "stream", 41 | "text": "['creditcard.csv']\n", 42 | "name": "stdout" 43 | } 44 | ] 45 | }, 46 | { 47 | "metadata": { 48 | "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0", 49 | "collapsed": true, 50 | "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a", 51 | "trusted": false 52 | }, 53 | "cell_type": "markdown", 54 | "source": "## Read and Explore Data" 55 | }, 56 | { 57 | "metadata": { 58 | "trusted": true, 59 | "_uuid": "b5bb132b5af2dc1237ddb324cf0e805a70fc745b" 60 | }, 61 | "cell_type": "code", 62 | "source": "df = pd.read_csv(\"../input/creditcard.csv\")", 63 | "execution_count": 2, 64 | "outputs": [] 65 | }, 66 | { 67 | "metadata": { 68 | "trusted": true, 69 | "_uuid": 
"2b7118ba27d3ee17bd401db920beda447691ef71" 70 | }, 71 | "cell_type": "code", 72 | "source": "# First 5 rows of data\ndf.head()", 73 | "execution_count": 3, 74 | "outputs": [ 75 | { 76 | "output_type": "execute_result", 77 | "execution_count": 3, 78 | "data": { 79 | "text/plain": " Time V1 V2 V3 ... V27 V28 Amount Class\n0 0.0 -1.359807 -0.072781 2.536347 ... 0.133558 -0.021053 149.62 0\n1 0.0 1.191857 0.266151 0.166480 ... -0.008983 0.014724 2.69 0\n2 1.0 -1.358354 -1.340163 1.773209 ... -0.055353 -0.059752 378.66 0\n3 1.0 -0.966272 -0.185226 1.792993 ... 0.062723 0.061458 123.50 0\n4 2.0 -1.158233 0.877737 1.548718 ... 0.219422 0.215153 69.99 0\n\n[5 rows x 31 columns]", 80 | "text/html": "
   Time         V1         V2         V3         V4         V5         V6         V7         V8         V9        V10        V11        V12        V13        V14        V15        V16        V17        V18        V19        V20        V21        V22        V23        V24        V25        V26        V27        V28  Amount  Class
0   0.0  -1.359807  -0.072781   2.536347   1.378155  -0.338321   0.462388   0.239599   0.098698   0.363787   0.090794  -0.551600  -0.617801  -0.991390  -0.311169   1.468177  -0.470401   0.207971   0.025791   0.403993   0.251412  -0.018307   0.277838  -0.110474   0.066928   0.128539  -0.189115   0.133558  -0.021053  149.62      0
1   0.0   1.191857   0.266151   0.166480   0.448154   0.060018  -0.082361  -0.078803   0.085102  -0.255425  -0.166974   1.612727   1.065235   0.489095  -0.143772   0.635558   0.463917  -0.114805  -0.183361  -0.145783  -0.069083  -0.225775  -0.638672   0.101288  -0.339846   0.167170   0.125895  -0.008983   0.014724    2.69      0
2   1.0  -1.358354  -1.340163   1.773209   0.379780  -0.503198   1.800499   0.791461   0.247676  -1.514654   0.207643   0.624501   0.066084   0.717293  -0.165946   2.345865  -2.890083   1.109969  -0.121359  -2.261857   0.524980   0.247998   0.771679   0.909412  -0.689281  -0.327642  -0.139097  -0.055353  -0.059752  378.66      0
3   1.0  -0.966272  -0.185226   1.792993  -0.863291  -0.010309   1.247203   0.237609   0.377436  -1.387024  -0.054952  -0.226487   0.178228   0.507757  -0.287924  -0.631418  -1.059647  -0.684093   1.965775  -1.232622  -0.208038  -0.108300   0.005274  -0.190321  -1.175575   0.647376  -0.221929   0.062723   0.061458  123.50      0
4   2.0  -1.158233   0.877737   1.548718   0.403034  -0.407193   0.095921   0.592941  -0.270533   0.817739   0.753074  -0.822843   0.538196   1.345852  -1.119670   0.175121  -0.451449  -0.237033  -0.038195   0.803487   0.408542  -0.009431   0.798278  -0.137458   0.141267  -0.206010   0.502292   0.219422   0.215153   69.99      0
" 81 | }, 82 | "metadata": {} 83 | } 84 | ] 85 | }, 86 | { 87 | "metadata": { 88 | "trusted": true, 89 | "_uuid": "aaa4432873a2ee852da4ec0a57a92333dae2b3da" 90 | }, 91 | "cell_type": "code", 92 | "source": "df.info()", 93 | "execution_count": 4, 94 | "outputs": [ 95 | { 96 | "output_type": "stream", 97 | "text": "\nRangeIndex: 284807 entries, 0 to 284806\nData columns (total 31 columns):\nTime 284807 non-null float64\nV1 284807 non-null float64\nV2 284807 non-null float64\nV3 284807 non-null float64\nV4 284807 non-null float64\nV5 284807 non-null float64\nV6 284807 non-null float64\nV7 284807 non-null float64\nV8 284807 non-null float64\nV9 284807 non-null float64\nV10 284807 non-null float64\nV11 284807 non-null float64\nV12 284807 non-null float64\nV13 284807 non-null float64\nV14 284807 non-null float64\nV15 284807 non-null float64\nV16 284807 non-null float64\nV17 284807 non-null float64\nV18 284807 non-null float64\nV19 284807 non-null float64\nV20 284807 non-null float64\nV21 284807 non-null float64\nV22 284807 non-null float64\nV23 284807 non-null float64\nV24 284807 non-null float64\nV25 284807 non-null float64\nV26 284807 non-null float64\nV27 284807 non-null float64\nV28 284807 non-null float64\nAmount 284807 non-null float64\nClass 284807 non-null int64\ndtypes: float64(30), int64(1)\nmemory usage: 67.4 MB\n", 98 | "name": "stdout" 99 | } 100 | ] 101 | }, 102 | { 103 | "metadata": { 104 | "trusted": true, 105 | "_uuid": "50ad158e98e3bf56e2a8866247ba2980b65900fa" 106 | }, 107 | "cell_type": "code", 108 | "source": "df.columns", 109 | "execution_count": 5, 110 | "outputs": [ 111 | { 112 | "output_type": "execute_result", 113 | "execution_count": 5, 114 | "data": { 115 | "text/plain": "Index(['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10',\n 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20',\n 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount',\n 'Class'],\n dtype='object')" 116 | }, 117 | "metadata": {} 
118 | } 119 | ] 120 | }, 121 | { 122 | "metadata": { 123 | "_uuid": "90dcf68dad6d9d8794d927b245a5d5bb7d302226" 124 | }, 125 | "cell_type": "markdown", 126 | "source": "## Normalize 'Amount'" 127 | }, 128 | { 129 | "metadata": { 130 | "trusted": true, 131 | "_uuid": "16563402e84bca3c52c6e15686baeedbd047d7f3" 132 | }, 133 | "cell_type": "code", 134 | "source": "from sklearn.preprocessing import StandardScaler", 135 | "execution_count": 6, 136 | "outputs": [] 137 | }, 138 | { 139 | "metadata": { 140 | "trusted": true, 141 | "_uuid": "4be099e30389dad238e5df626ca22ee6fce4b775" 142 | }, 143 | "cell_type": "code", 144 | "source": "df['Amount(Normalized)'] = StandardScaler().fit_transform(df['Amount'].values.reshape(-1,1))", 145 | "execution_count": 7, 146 | "outputs": [] 147 | }, 148 | { 149 | "metadata": { 150 | "trusted": true, 151 | "_uuid": "2682eaf1517fad0e011b00fa191064d5b1f24077" 152 | }, 153 | "cell_type": "code", 154 | "source": "df.iloc[:,[29,31]].head()", 155 | "execution_count": 26, 156 | "outputs": [ 157 | { 158 | "output_type": "execute_result", 159 | "execution_count": 26, 160 | "data": { 161 | "text/plain": " Amount Amount(Normalized)\n0 149.62 0.244964\n1 2.69 -0.342475\n2 378.66 1.160686\n3 123.50 0.140534\n4 69.99 -0.073403", 162 | "text/html": "
   Amount  Amount(Normalized)
0  149.62            0.244964
1    2.69           -0.342475
2  378.66            1.160686
3  123.50            0.140534
4   69.99           -0.073403
" 163 | }, 164 | "metadata": {} 165 | } 166 | ] 167 | }, 168 | { 169 | "metadata": { 170 | "trusted": true, 171 | "_uuid": "9552b3529bd0a4140e2199a13e23f3d9c61d594d" 172 | }, 173 | "cell_type": "code", 174 | "source": "df = df.drop(columns=['Amount', 'Time']) # The raw 'Amount' and 'Time' columns are no longer needed.", 175 | "execution_count": 27, 176 | "outputs": [] 177 | }, 178 | { 179 | "metadata": { 180 | "_uuid": "c34606b2d10e70c0a019caa8fbbf1f4c0cd91388" 181 | }, 182 | "cell_type": "markdown", 183 | "source": "## Data Preprocessing" 184 | }, 185 | { 186 | "metadata": { 187 | "trusted": true, 188 | "_uuid": "f8efd0d4c63e7725a6497e548c58f5832a3bac25" 189 | }, 190 | "cell_type": "code", 191 | "source": "X = df.drop('Class', axis=1)\n\ny = df['Class']", 192 | "execution_count": 28, 193 | "outputs": [] 194 | }, 195 | { 196 | "metadata": { 197 | "trusted": true, 198 | "_uuid": "f413b5226569c7d9fa095fec96695f41e38d125f" 199 | }, 200 | "cell_type": "markdown", 201 | "source": "## Train-Test Split" 202 | }, 203 | { 204 | "metadata": { 205 | "trusted": true, 206 | "_uuid": "21accac85a230d04427938a63b2259ad5828df25" 207 | }, 208 | "cell_type": "code", 209 | "source": "from sklearn.model_selection import train_test_split", 210 | "execution_count": 29, 211 | "outputs": [] 212 | }, 213 | { 214 | "metadata": { 215 | "trusted": true, 216 | "_uuid": "65cc0fd95307e851b935e157da63c9c17843b200" 217 | }, 218 | "cell_type": "code", 219 | "source": "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)", 220 | "execution_count": 30, 221 | "outputs": [] 222 | }, 223 | { 224 | "metadata": { 225 | "trusted": true, 226 | "_uuid": "b9336ab97555dc7ad2c2d5d86e0f689e0b3e1c77" 227 | }, 228 | "cell_type": "code", 229 | "source": "# Convert the data to NumPy arrays for use with Keras\nX_train = np.array(X_train)\nX_test = np.array(X_test)\ny_train = np.array(y_train)\ny_test = np.array(y_test)", 230 | "execution_count": 31, 231 | "outputs": [] 232 | 
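The split above is purely random, and `StandardScaler` was fit on the whole `Amount` column before splitting. With so few fraud rows, one possible refinement (an assumption on my part, not what the notebook does) is to stratify the split on the label and fit the scaler on the training rows only. The sketch below uses small synthetic arrays as stand-ins; `amount`, `labels` and the other names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the notebook's data (hypothetical values):
# 1,000 rows, 20 of them "fraud", and one 'Amount'-like column.
rng = np.random.default_rng(0)
amount = rng.exponential(scale=90.0, size=(1000, 1))
labels = np.zeros(1000, dtype=int)
labels[:20] = 1

# stratify=labels keeps the fraud ratio the same in both splits.
amount_train, amount_test, labels_train, labels_test = train_test_split(
    amount, labels, test_size=0.3, random_state=42, stratify=labels
)

# Fit the scaler on the training rows only, then reuse it for the test rows,
# so no test-set statistics leak into preprocessing.
scaler = StandardScaler().fit(amount_train)
amount_train_scaled = scaler.transform(amount_train)
amount_test_scaled = scaler.transform(amount_test)

print(labels_train.sum(), labels_test.sum())
```

With a 70/30 split of 20 positives, stratification places 14 in the training set and 6 in the test set, whereas an unstratified split can drift from that ratio by chance.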
}, 233 | { 234 | "metadata": { 235 | "_uuid": "a134bfec108e763d2d8ed20a0f7af718b1880464" 236 | }, 237 | "cell_type": "markdown", 238 | "source": "## Artificial Neural Networks" 239 | }, 240 | { 241 | "metadata": { 242 | "trusted": true, 243 | "_uuid": "c30de6c8087a9669f3903ffc75508b8f6b47ac6d" 244 | }, 245 | "cell_type": "code", 246 | "source": "from keras.models import Sequential\nfrom keras.layers import Dense, Dropout ", 247 | "execution_count": 32, 248 | "outputs": [] 249 | }, 250 | { 251 | "metadata": { 252 | "trusted": true, 253 | "_uuid": "f5f322a8dee5c88a0d8d6f116914db3b968121ea" 254 | }, 255 | "cell_type": "code", 256 | "source": "model = Sequential([\n Dense(units=20, input_dim = X_train.shape[1], activation='relu'),\n Dense(units=24,activation='relu'),\n Dropout(0.5),\n Dense(units=20,activation='relu'),\n Dense(units=24,activation='relu'),\n Dense(1, activation='sigmoid')\n])", 257 | "execution_count": 33, 258 | "outputs": [] 259 | }, 260 | { 261 | "metadata": { 262 | "trusted": true, 263 | "_uuid": "fb1ec3e1840912594c69e76e37dd5aba2ce5c450" 264 | }, 265 | "cell_type": "code", 266 | "source": "model.summary()", 267 | "execution_count": 34, 268 | "outputs": [ 269 | { 270 | "output_type": "stream", 271 | "text": "_________________________________________________________________\nLayer (type) Output Shape Param # \n=================================================================\ndense_1 (Dense) (None, 20) 600 \n_________________________________________________________________\ndense_2 (Dense) (None, 24) 504 \n_________________________________________________________________\ndropout_1 (Dropout) (None, 24) 0 \n_________________________________________________________________\ndense_3 (Dense) (None, 20) 500 \n_________________________________________________________________\ndense_4 (Dense) (None, 24) 504 \n_________________________________________________________________\ndense_5 (Dense) (None, 1) 25 
\n=================================================================\nTotal params: 2,133\nTrainable params: 2,133\nNon-trainable params: 0\n_________________________________________________________________\n", 272 | "name": "stdout" 273 | } 274 | ] 275 | }, 276 | { 277 | "metadata": { 278 | "trusted": true, 279 | "_uuid": "9d1f5f2fb9707ddf1d4db8216fc59273c6149145" 280 | }, 281 | "cell_type": "code", 282 | "source": "model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])\nmodel.fit(X_train, y_train, batch_size=30, epochs=5)", 283 | "execution_count": 35, 284 | "outputs": [ 285 | { 286 | "output_type": "stream", 287 | "text": "Epoch 1/5\n199364/199364 [==============================] - 16s 80us/step - loss: 0.0087 - acc: 0.9989\nEpoch 2/5\n199364/199364 [==============================] - 12s 63us/step - loss: 0.0042 - acc: 0.9993\nEpoch 3/5\n199364/199364 [==============================] - 12s 62us/step - loss: 0.0036 - acc: 0.9994\nEpoch 4/5\n199364/199364 [==============================] - 12s 61us/step - loss: 0.0034 - acc: 0.9993\nEpoch 5/5\n199364/199364 [==============================] - 13s 64us/step - loss: 0.0034 - acc: 0.9994\n", 288 | "name": "stdout" 289 | }, 290 | { 291 | "output_type": "execute_result", 292 | "execution_count": 35, 293 | "data": { 294 | "text/plain": "" 295 | }, 296 | "metadata": {} 297 | } 298 | ] 299 | }, 300 | { 301 | "metadata": { 302 | "trusted": true, 303 | "_uuid": "74469650aca497655658457a6a24ce9f895c7403" 304 | }, 305 | "cell_type": "code", 306 | "source": "score = model.evaluate(X_test, y_test)\nprint('Test Accuracy: {:.2f}%\\nTest Loss: {}'.format(score[1]*100,score[0]))", 307 | "execution_count": 36, 308 | "outputs": [ 309 | { 310 | "output_type": "stream", 311 | "text": "85443/85443 [==============================] - 2s 29us/step\nTest Accuracy: 99.94%\nTest Loss: 0.002909836608232323\n", 312 | "name": "stdout" 313 | } 314 | ] 315 | }, 316 | { 317 | "metadata": { 318 | "trusted": true, 319 | 
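Because fraud is so rare, a test accuracy near 99.9% says little about how many frauds are actually caught. The sketch below uses hypothetical labels (not the notebook's predictions) to show how precision and recall on the fraud class can be far lower than accuracy. It relies only on standard `sklearn.metrics` functions; the arrays `y_true` and `y_hat` are made up for illustration.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical labels (not the notebook's results): 990 legitimate, 10 fraud.
y_true = np.array([0] * 990 + [1] * 10)

# A hypothetical model that catches 5 of the 10 frauds and raises 1 false alarm.
y_hat = np.zeros(1000, dtype=int)
y_hat[0] = 1          # one false positive
y_hat[990:995] = 1    # five true positives

print(confusion_matrix(y_true, y_hat))
print("accuracy :", accuracy_score(y_true, y_hat))
print("precision:", precision_score(y_true, y_hat))
print("recall   :", recall_score(y_true, y_hat))
```

In this toy case accuracy is 0.994 even though half of the frauds are missed, which is why the fraud-class recall and precision are worth reporting alongside accuracy.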
"_uuid": "14687c39822039864c4989d559d1068297064299" 320 | }, 321 | "cell_type": "code", 322 | "source": "from sklearn.metrics import confusion_matrix, classification_report", 323 | "execution_count": 37, 324 | "outputs": [] 325 | }, 326 | { 327 | "metadata": { 328 | "trusted": true, 329 | "_uuid": "6547bd9a0727f524bb61abe029d59a2dc90188b0" 330 | }, 331 | "cell_type": "code", 332 | "source": "y_pred = model.predict(X_test)\ny_test = pd.DataFrame(y_test)", 333 | "execution_count": 38, 334 | "outputs": [] 335 | }, 336 | { 337 | "metadata": { 338 | "trusted": true, 339 | "_uuid": "74b69f9a922413ed81213d0890ba4dab90cd0166" 340 | }, 341 | "cell_type": "code", 342 | "source": "cm = confusion_matrix(y_test, y_pred.round())", 343 | "execution_count": 39, 344 | "outputs": [] 345 | }, 346 | { 347 | "metadata": { 348 | "trusted": true, 349 | "_uuid": "c6c9b56d51bce77b78081f61a3cfa53a01ad4a95" 350 | }, 351 | "cell_type": "code", 352 | "source": "sns.heatmap(cm, annot=True, fmt='.0f', cmap='cividis_r')\nplt.show()", 353 | "execution_count": 43, 354 | "outputs": [ 355 | { 356 | "output_type": "display_data", 357 | "data": { 358 | "text/plain": "
", 359 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAD8CAYAAAC8TPVwAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAF1xJREFUeJzt3Xuc1mWd//HXewYGEeXgmcCSlOSBtRKlorbBaurolpCVi2WSsWHl4Vdrhu6umaU9bKs1+f2U/aGSh0x0WUsykp9Zmq4nSEnFQ45HQALjqCiHmfn8/ri/0A3O4Z51Zu65Lt7Px+N6NPfne7iv28fwnqvre33vryICMzNLQ021O2BmZpVzaJuZJcShbWaWEIe2mVlCHNpmZglxaJuZJcShbWaWEIe2mVlCHNpmZgnp1dVvoP3H+5ZLe5vmB1ZUuwvWA2nvB/WOz9GBzInnb3/H79fdPNI2M0tIl4+0zcy6lZIbPHeIQ9vM8lKTd6zl/enMbAfkkbaZWTqU96U6h7aZ5cWhbWaWEF+INDNLiUfaZmbp8OoRM7OUeKRtZpYOz2mbmSXEq0fMzFLi0DYzS4cvRJqZpcQjbTOzdPhCpJlZQnwh0swsJR5pm5mlI+/MdmibWV7kOW0zs3T0yjuzHdpmlpdaKn4Ye5LyvsxqZjucXqq8tUXSgZIWlrV1kr4m6duSlpbVTyg75gJJDZKelXRcWb2+qDVIOr+sPkzSw0X9Fkl17X0+h7aZZaVWlbe2RMSzETEqIkYBHwLeBH5ebL58y7aImAsgaSQwETgIqAeuklQrqRa4EjgeGAmcUuwL8P3iXAcAq4HJ7X0+h7aZZaW3Km8dcDTwfES83MY+44FZEbExIl4EGoBDi9YQES9ExCZgFjBepSumRwGzi+OvBya01xGHtpllpUaVtw6YCNxc9vosSY9LmilpUFEbAiwu22dJUWutvjuwJiIat6u3/fk61G0zsx6uhqi4SZoiaUFZm7L9+Yp55hOB/yxK04H9gVHAMuBH3fbh8OoRM8tMR0bQETEDmNHObscDj0bE8uKY5Vs2SLoauKN4uRTYt+y4oUWNVuorgYGSehWj7fL9W+WRtpllpQumR06hbGpE0uCybZ8Enix+ngNMlNRH0jBgOPAIMB8YXqwUqaM01TInIgL4HfDp4vhJwO3tdcYjbTPLSgfnqtskqR9wDHBGWfnfJI0CAnhpy7aIWCTpVuApoBE4MyKaivOcBcwDaoGZEbGoONdUYJakS4DHgGvb65ND28yyUqvOu7kmItZTumBYXvt8G/tfClzaQn0uMLeF+guUVpdUzKFtZlnpzJF2T+TQNrOsZJ7ZDm0zy4tH2mZmCcn8m1kd2maWF4+0zcwSUtOJq0d6Ioe2mWXFI20zs4RkntkObTPLiy9EmpklxNMjZmYJcWibmSVEXj1iZpaOzAfaDm0zy4svRJqZJcRz2mZmCfFI28wsIR5pm5klxCNte5uvnX4i/3jyMQTBE8++zOnfnMZ/XPIVxh76fta+vh6AL3xzGn98+kU+e+JYpp5xEhK8vn4DX7lwOo8/8xLvGzaEW6Z9Y+s537vvPnzrxz/jiut+yXe+/lnGf+wwmpubWbFyLV/45jSWrVhVrY9rnWjjxuDUs9exaTM0NcGx43pzzhd33rr9kivWc9vcjTw6b7cq9jJtmWc2Kj0QuAvfYP/xWS2afNfeu3H/LZcx8riz2LBxE7dMO4+59/6BcYe9nzt+u4D/uvOBbfY/fPQInm5YzJp166kfO5pvn3MKYz513jb71NTUsPSBmRx20nm88upr7LpLX15/4y0Azp70cUYesC9fuXB6t33G7tD8wIpqd6EqIoI334J+O4vNjcHnzlzHP5/Tj1EH9eKJZxq5cfYGfnPfph02tLX3g+84c8d+/OyKM+feO/53chnf7khb0ghgPDCkKC2l9Pj3p7uyYz1Zr1619N2pj
[remainder of base64-encoded PNG omitted: confusion-matrix heatmap for the model trained on the original, imbalanced data]" 360 | }, 361 | "metadata": {} 362 | } 363 | ] 364 | }, 365 | { 366 | "metadata": { 367 | "_uuid": "6ac50a9ad2d6a5e8ae3f535310284dfde746d842" 368 | }, 369 | "cell_type": "markdown", 370 | "source": "These results look good, but accuracy alone is misleading here. Because the dataset is highly imbalanced (492 frauds out of 284,807 transactions), a model that always predicts 'not fraud' would already be about 99.8% accurate. To address this we will use SMOTE (Synthetic Minority Over-sampling Technique), which balances the classes by generating synthetic samples of the minority class.\nFor a brief explanation, see: http://rikunert.com/SMOTE_explained" 371 | }, 372 | { 373 | "metadata": { 374 | "_uuid": "fb89c4981d8cff55e2662f2a32a960d92ec0b6f7" 375 | }, 376 | "cell_type": "markdown", 377 | "source": "## SMOTE Sampling" 378 | }, 379 | { 380 | "metadata": { 381 | "trusted": true, 382 | "_uuid": "c38001f1a2d0a9974ac029511629ea1258021b17" 383 | }, 384 | "cell_type": "code", 385 | "source": "from imblearn.over_sampling import SMOTE", 386 | "execution_count": 44, 387 | "outputs": [] 388 | }, 389 | { 390 | "metadata": { 391 | "trusted": true, 392 | "_uuid": "8f4ebef239d5247a14ad0bf978fb5ba4da4a3137" 393 | }, 394 | "cell_type": "code", 395 | "source": "X_smote, y_smote = SMOTE().fit_resample(X, y)", 396 | "execution_count": 45, 397 | "outputs": [] 398 | }, 399 | { 400 | "metadata": { 401 | "trusted": true, 402 | "_uuid": "41e1ea49f6c180d580830700048e3a200e5777de" 403 | }, 404 | "cell_type": "code", 405 | "source": "X_smote = pd.DataFrame(X_smote)\ny_smote = pd.DataFrame(y_smote)", 406 | "execution_count": 47, 407 | "outputs": [] 408 | }, 409 | { 410 | "metadata": { 411 | "trusted": true, 412 | "_uuid": "bb60cc67cd27d236879e84796880526c06eaba45" 413 | }, 414 | "cell_type": "code", 415 | "source": "y_smote.iloc[:,0].value_counts()", 416 | "execution_count": 60, 417 | "outputs": [ 418 | { 419 
| "output_type": "execute_result", 420 | "execution_count": 60, 421 | "data": { 422 | "text/plain": "1 284315\n0 284315\nName: 0, dtype: int64" 423 | }, 424 | "metadata": {} 425 | } 426 | ] 427 | }, 428 | { 429 | "metadata": { 430 | "trusted": true, 431 | "_uuid": "fdc118acddfcaaa6aeb3b949892139371ceb8b92" 432 | }, 433 | "cell_type": "code", 434 | "source": "X_train, X_test, y_train, y_test = train_test_split(X_smote, y_smote, test_size=0.3, random_state=0)", 435 | "execution_count": 61, 436 | "outputs": [] 437 | }, 438 | { 439 | "metadata": { 440 | "trusted": true, 441 | "_uuid": "300ef9012937ececd99402f78c9f18d72c616966" 442 | }, 443 | "cell_type": "code", 444 | "source": "X_train = np.array(X_train)\nX_test = np.array(X_test)\ny_train = np.array(y_train)\ny_test = np.array(y_test)", 445 | "execution_count": 62, 446 | "outputs": [] 447 | }, 448 | { 449 | "metadata": { 450 | "trusted": true, 451 | "_uuid": "cf7c51906744f2eef4f4c3ad8a0f5d1165aa3391" 452 | }, 453 | "cell_type": "code", 454 | "source": "model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])\nmodel.fit(X_train, y_train, batch_size = 30, epochs = 5)", 455 | "execution_count": 63, 456 | "outputs": [ 457 | { 458 | "output_type": "stream", 459 | "text": "Epoch 1/5\n398041/398041 [==============================] - 25s 64us/step - loss: 0.0319 - acc: 0.9892\nEpoch 2/5\n398041/398041 [==============================] - 27s 67us/step - loss: 0.0118 - acc: 0.9968\nEpoch 3/5\n398041/398041 [==============================] - 24s 61us/step - loss: 0.0096 - acc: 0.9976\nEpoch 4/5\n398041/398041 [==============================] - 24s 61us/step - loss: 0.0080 - acc: 0.9980\nEpoch 5/5\n398041/398041 [==============================] - 24s 61us/step - loss: 0.0073 - acc: 0.9982\n", 460 | "name": "stdout" 461 | }, 462 | { 463 | "output_type": "execute_result", 464 | "execution_count": 63, 465 | "data": { 466 | "text/plain": "" 467 | }, 468 | "metadata": {} 469 | } 470 | ] 471 | }, 472 | { 473 | 
"metadata": { 474 | "trusted": true, 475 | "_uuid": "0b2f76611e4b791ba050efc730f20c3d56e2c788" 476 | }, 477 | "cell_type": "code", 478 | "source": "score = model.evaluate(X_test, y_test)\nprint('Test Accuracy: {:.2f}%\\nTest Loss: {}'.format(score[1]*100,score[0]))", 479 | "execution_count": 64, 480 | "outputs": [ 481 | { 482 | "output_type": "stream", 483 | "text": "170589/170589 [==============================] - 5s 30us/step\nTest Accuracy: 99.76%\nTest Loss: 0.009045598387554149\n", 484 | "name": "stdout" 485 | } 486 | ] 487 | }, 488 | { 489 | "metadata": { 490 | "trusted": true, 491 | "_uuid": "37cbbd1de1f5f8cbf260e76b15db11e9edb8bb9b" 492 | }, 493 | "cell_type": "code", 494 | "source": "y_pred = model.predict(X_test)\ny_test = pd.DataFrame(y_test)\ncm = confusion_matrix(y_test, y_pred.round())\nsns.heatmap(cm, annot=True, fmt='.0f')\nplt.show()", 495 | "execution_count": 65, 496 | "outputs": [ 497 | { 498 | "output_type": "display_data", 499 | "data": { 500 | "text/plain": "
", 501 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAD8CAYAAAC8TPVwAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAGfBJREFUeJzt3XmYFdW19/HvohFEIpMQwmDikFbECUUQB5RJJmMaElQwEUUMEcW8xiFi3ijR6A0YvQpRiSCDRKQBJ5CACIKKiUwKMjhcO5hcIYwyKXOfXvePLjoH7OF06O7Tu/x9fPbjOat21dn1PLhc7NpVZe6OiIiEoUq6ByAiIqlT0hYRCYiStohIQJS0RUQCoqQtIhIQJW0RkYAoaYuIBERJW0QkIEraIiIBqVreP3BgyxrdcilfU6Nx23QPQSqh3P3r7EiPUZqcc1T9k4749yqaKm0RkYCUe6UtIlKh8hLpHkG5UtIWkXhJ5KZ7BOVKSVtEYsU9L91DKFdK2iISL3lK2iIi4VClLSISEF2IFBEJiCptEZFwuFaPiIgERBciRUQCoukREZGA6EKkiEhAVGmLiAREFyJFRAKiC5EiIuFw15y2iEg4NKctIhIQTY+IiARElbaISEASB9I9gnKlpC0i8aLpERGRgMR8ekRvYxeReMnLS70Vw8xONbPlSW2nmd1mZr81s3VJ8e5J+9xjZjlm9omZdUmKd41iOWY2OCl+opktiuKTzaxaSaenpC0i8VJGSdvdP3H3Fu7eAmgJ7AZejjY/dnCbu88EMLPmQG/gdKAr8JSZZZhZBvAk0A1oDvSJ+gIMi471fWAb0L+k01PSFpFY8cSBlFspdAT+7u7/LKZPFpDt7vvc/TMgB2gdtRx3X+Pu+4FsIMvMDOgAvBDt/yzQo6SBKGmLSLx4Xuotdb2BSUnfB5nZCjMba2Z1o1gT4POkPmujWFHx44Dt7p57WLxYStoiEi+lmB4xswFmtjSpDTj8cNE88w+BqVFoJHAy0AJYDzxaYeeGVo+ISNyUooJ291HAqBK6dQPed/eN0T4bD24ws9HAjOjrOuD4pP2aRjGKiH8B1DGzqlG1ndy/SKq0RSReyuhCZJI+JE2NmFmjpG09gVXR5+lAbzOrbmYnApnAYmAJkBmtFKlG/lTLdHd3YD7QK9r/OmBaSYNRpS0i8VKG67TNrCZwGfDzpPDDZtYCcOAfB7e5+2ozmwJ8COQCt3j0yEEzGwTMBjKAse6+OjrW3UC2mT0ILAPGlDim/GRffg5sWVO+PyBBqtG4bbqHIJVQ7v51dqTH2POXx1POOTUuv+2If6+iqdIWkXiJ+R2RStoiEi969oiISEBUaYuIBESVtohIQFRpi4gEJDe35D4BU9IWkXgp52XM6aakLSLxojltEZGAKGmLiAREFyJFRAKSSKR7BOVKSVtE4kXTIyIiAVHSFhEJiOa0RUTC4Xlapy0iEg5Nj4iIBESrR0REAhLzSlsv9v0PTMh+mayf/JweP72Ju4YMZd++/QXb/uuxkbTq1LPg+782bKT/LwbTs+9Arh/0KzZs2nzIsb7atYuOPX7KQ48+VRA7cOAAvx02nMt738gVfX7GnPnvlP9JSblo2rQxc1+fyooP5vPB8nncOqg/AMN+/xtWrXyL99+bwwtTn6F27VoF+5x55mm88/Z0Plg+j2Xvz6V69erpGn6Yyv7FvpWKknYpbdy8hYkvTGPy2BG88tyfyMvLY9bctwBY9dH/sPPLrw7p/8gTz/DDrh15ecJIBva7hsf/NP6Q7X8c/WdatjjzkNjTz2ZTr24d/pL9DNMmPs155xy6XcKRm5vLXb+6n7PObs9FF1/BwIHXc9ppmcx9423ObtGBc1texqefrmHw3YMAyMjI4NnxI7h50GDObtGBjp2u5MCBA2k+i8C4p94CVGLSNrNmZna3mY2I2t1mdlpFDK6yyk0k2LdvP
[remainder of base64-encoded PNG data omitted]\n" 502 | }, 503 | "metadata": {} 504 | } 505 | ] 506 | }, 507 | { 508 | "metadata": { 509 | "_uuid": "045f75649fe2aae0a7d441bf345d20d71b757d16" 510 | }, 511 | "cell_type": "markdown", 512 | "source": "This is not the true result, because we used SMOTE-resampled data, in which the numbers of class 0 and class 1 samples are equal. That's why we'll now use the whole dataset we imported at the beginning." 513 | }, 514 | { 515 | "metadata": { 516 | "trusted": true, 517 | "_uuid": "e5454011a11a4f6413323472e1a51634835ce93d" 518 | }, 519 | "cell_type": "code", 520 | "source": "y_pred2 = model.predict(X)\ny_test2 = pd.DataFrame(y)\ncm2 = confusion_matrix(y_test2, y_pred2.round())\nsns.heatmap(cm2, annot=True, fmt='.0f', cmap='coolwarm')\nplt.show()", 521 | "execution_count": 67, 522 | "outputs": [ 523 | { 524 | "output_type": "display_data", 525 | "data": { 526 | "text/plain": "
", 527 | "image/png": "\n" 528 | }, 529 | "metadata": {} 530 | } 531 | ] 532 | }, 533 | { 534 | "metadata": { 535 | "trusted": true, 536 | "_uuid": "deb48a791a49b53930b9cafb9cf424212af24af2" 537 | }, 538 | "cell_type": "code", 539 | "source": "scoreNew = model.evaluate(X, y)\nprint('Test Accuracy: {:.2f}%\\nTest Loss: {}'.format(scoreNew[1]*100,scoreNew[0]))", 540 | "execution_count": 76, 541 | "outputs": [ 542 | { 543 | "output_type": "stream", 544 | "text": "284807/284807 [==============================] - 9s 32us/step\nTest Accuracy: 99.75%\nTest Loss: 0.010284884318468747\n", 545 | "name": "stdout" 546 | } 547 | ] 548 | }, 549 | { 550 | "metadata": { 551 | "trusted": true, 552 | "_uuid": "c753dd537b11d553d2832fffa36a240aebdfd954" 553 | }, 554 | "cell_type": "code", 555 | "source": "print(classification_report(y_test2, y_pred2.round()))", 556 | "execution_count": 72, 557 | "outputs": [ 558 | { 559 | "output_type": "stream", 560 | "text": " precision recall f1-score support\n\n 0 1.00 1.00 1.00 284315\n 1 0.41 0.99 0.58 492\n\n micro avg 1.00 1.00 1.00 284807\n macro avg 0.71 0.99 0.79 284807\nweighted avg 1.00 1.00 1.00 284807\n\n", 561 | "name": "stdout" 562 | } 563 | ] 564 | }, 565 | { 566 | "metadata": { 567 | "_uuid": "5e06badbb6abb16540231a2b61d37f5615fab08a" 568 | }, 569 | "cell_type": "markdown", 570 | "source": "**Thank you, if you like it please upvote and make a comment.**" 571 | } 572 | ], 573 | "metadata": { 574 | "kernelspec": { 575 | "display_name": "Python 3", 576 | "language": "python", 577 | "name": "python3" 578 | }, 579 | "language_info": { 580 | "name": "python", 581 | "version": "3.6.6", 582 | "mimetype": "text/x-python", 583 | "codemirror_mode": { 584 | "name": "ipython", 585 | "version": 3 586 | }, 587 | "pygments_lexer": "ipython3", 588 | "nbconvert_exporter": "python", 589 | "file_extension": ".py" 590 | } 591 | }, 592 | "nbformat": 4, 593 | "nbformat_minor": 1 594 | } 595 | 
-------------------------------------------------------------------------------- /Artificial_Neural_Network_with_Keras/Artificial_Neural_Network_with_Keras.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "metadata": { 5 | "_uuid": "f598221755016c3dae2ac6b9c8ef4012a6c9db9f" 6 | }, 7 | "cell_type": "markdown", 8 | "source": "What is an Artificial Neural Network?\n
\nArtificial Neural Networks (ANNs) are one of the main tools used in machine learning. The \"neural\" part of the name comes from the fact that these systems try to learn in a way loosely modeled on the human brain. An artificial network contains units called neurons, and these neurons form a network by connecting to each other. Such networks are capable of learning, storing, and discovering relationships in data, much like a human!
\nFor example, they can learn to identify images that contain cars by analyzing example images. Once the learning phase is complete, if you give the algorithm an image and ask 'Is it a car?', it can answer because it has seen other car images and learned what a car looks like.
\nNeural networks have input and output layers, and in most cases they also have hidden layers between them. We usually describe how 'deep' a network is by its number of hidden layers.\n
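To make the layer structure concrete, here is a minimal NumPy sketch of a forward pass through a network with two hidden layers. The layer sizes (784, 8, 6, 1), the random weights, and the helper names relu, sigmoid, and forward are illustrative assumptions; this is not the trained Keras model built in this notebook.

```python
import numpy as np

def relu(z):
    # Rectified linear unit: negative values become zero
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squash any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    # layers is a list of (weights, bias) pairs, one pair per layer.
    # Every layer except the last uses ReLU; the last uses a sigmoid.
    activation = x
    for weights, bias in layers[:-1]:
        activation = relu(activation @ weights + bias)
    weights, bias = layers[-1]
    return sigmoid(activation @ weights + bias)

rng = np.random.default_rng(0)
sizes = [784, 8, 6, 1]  # input, two hidden layers, output (assumed sizes)
layers = [
    (rng.normal(scale=0.01, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

batch = rng.random((5, 784))        # five fake flattened images
probabilities = forward(batch, layers)
n_hidden = len(layers) - 1          # depth, counted as number of hidden layers
print(probabilities.shape, n_hidden)
```

Counting hidden layers this way gives a depth of 2 for this sketch; adding more (weights, bias) pairs before the output layer would make the network deeper.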
\n
\n
\n\n
\n
\nNow we'll try this algorithm on a dataset that contains images from 10 different classes of fashion items. \n
\n
\n**CONTENT**\n1. [Information About Data](#1)\n1. [Reading Data](#2)\n1. [ANN with Keras](#3)" 9 | }, 10 | { 11 | "metadata": { 12 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5", 13 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 14 | "trusted": true 15 | }, 16 | "cell_type": "code", 17 | "source": "import numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\nimport matplotlib.pyplot as plt\n\nimport os\nprint(os.listdir(\"../input\"))\n\n# Any results you write to the current directory are saved as output.", 18 | "execution_count": 1, 19 | "outputs": [ 20 | { 21 | "output_type": "stream", 22 | "text": "['train-images-idx3-ubyte', 'train-labels-idx1-ubyte', 't10k-labels-idx1-ubyte', 'fashion-mnist_train.csv', 't10k-images-idx3-ubyte', 'fashion-mnist_test.csv']\n", 23 | "name": "stdout" 24 | } 25 | ] 26 | }, 27 | { 28 | "metadata": { 29 | "_uuid": "a156a3c39bb74ae571501dc65bdc00dcf980e924" 30 | }, 31 | "cell_type": "markdown", 32 | "source": "**About Dataset**
\n
\nThe dataset consists of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.\n - Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels.\n - Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255.\n - The training and test data sets have 785 columns. The first column consists of the class labels and represents the class of clothing. The rest of the columns contain the pixel-values of the associated image.\n
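As a rough sketch of the layout described above, a single 785-value row can be split into its label and a 28x28 image as follows. The row here is synthetic (a hypothetical stand-in, not real Fashion-MNIST data), and the row-major pixel order is an assumption consistent with the reshape(28,28) call used later in this notebook.

```python
import numpy as np

# Hypothetical stand-in for one CSV row: a label followed by 784 pixel values
row = np.concatenate(([1], np.arange(784) % 256))

label = int(row[0])              # first column: class label
pixels = row[1:]                 # remaining 784 columns: pixel values
image = pixels.reshape(28, 28)   # row-major: 28 rows of 28 pixels each

print(label, pixels.size, image.shape)
```

With this layout, pixel number 29 in the flat row (index 28) becomes the first pixel of the second image row.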
\nEach training and test example is assigned to one of the following labels:\n\n - 0 T-shirt/top\n - 1 Trouser\n - 2 Pullover\n - 3 Dress\n - 4 Coat\n - 5 Sandal\n - 6 Shirt\n - 7 Sneaker\n - 8 Bag\n - 9 Ankle boot \n\nSince we want binary classification we'll just choose 0 and 1 for our data." 33 | }, 34 | { 35 | "metadata": { 36 | "_uuid": "16f7156c839160f581dfe32b95e42164ea08c482" 37 | }, 38 | "cell_type": "markdown", 39 | "source": "**Read Data**" 40 | }, 41 | { 42 | "metadata": { 43 | "trusted": true, 44 | "_uuid": "ae9f1b80920b1089b6b7445eec733c0c7061e8a0" 45 | }, 46 | "cell_type": "code", 47 | "source": "dfAll = pd.read_csv(\"../input/fashion-mnist_train.csv\")\ndf = dfAll[((dfAll.label == 0) | (dfAll.label == 1))]", 48 | "execution_count": 2, 49 | "outputs": [] 50 | }, 51 | { 52 | "metadata": { 53 | "trusted": true, 54 | "_uuid": "6c674a3247e5ff11d389fe4a7d71cc7296e48cc7" 55 | }, 56 | "cell_type": "code", 57 | "source": "df.head()", 58 | "execution_count": 3, 59 | "outputs": [ 60 | { 61 | "output_type": "execute_result", 62 | "execution_count": 3, 63 | "data": { 64 | "text/plain": " label pixel1 pixel2 ... pixel782 pixel783 pixel784\n3 0 0 0 ... 0 0 0\n10 0 0 0 ... 0 0 0\n13 0 0 0 ... 0 0 0\n24 0 0 0 ... 0 0 0\n29 1 0 0 ... 0 0 0\n\n[5 rows x 785 columns]", 65 | "text/html": "
[HTML rendering of df.head() omitted: 5 rows x 785 columns (label, pixel1 ... pixel784), duplicating the text/plain output]
" 66 | }, 67 | "metadata": {} 68 | } 69 | ] 70 | }, 71 | { 72 | "metadata": { 73 | "trusted": true, 74 | "_uuid": "78b247e075f487d6449d75dbefe1b45ceb803ede" 75 | }, 76 | "cell_type": "code", 77 | "source": "df.info()", 78 | "execution_count": 4, 79 | "outputs": [ 80 | { 81 | "output_type": "stream", 82 | "text": "\nInt64Index: 12000 entries, 3 to 59996\nColumns: 785 entries, label to pixel784\ndtypes: int64(785)\nmemory usage: 72.0 MB\n", 83 | "name": "stdout" 84 | } 85 | ] 86 | }, 87 | { 88 | "metadata": { 89 | "trusted": true, 90 | "_uuid": "37bcc80e854cd984b88bcadf07865fa42df8efb0" 91 | }, 92 | "cell_type": "code", 93 | "source": "# We will split our data\n\nfrom sklearn.model_selection import train_test_split\n\nX = df.drop([\"label\"], axis=1)\nY = df.label\nx_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.30, random_state=42)", 94 | "execution_count": 5, 95 | "outputs": [] 96 | }, 97 | { 98 | "metadata": { 99 | "trusted": true, 100 | "_uuid": "f005cd4dfc0c97057c286d1177426058019c466d" 101 | }, 102 | "cell_type": "code", 103 | "source": "# Example Images\n\nplt.figure(figsize=(8,8))\nfor i in range(4):\n plt.subplot(2,2,i+1)\n plt.axis('off')\n plt.imshow(x_train.head().values[i].reshape(28,28), cmap='gray', interpolation='none')", 104 | "execution_count": 6, 105 | "outputs": [ 106 | { 107 | "output_type": "display_data", 108 | "data": { 109 | "text/plain": "
", 110 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAd4AAAHVCAYAAABfWZoAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAGQdJREFUeJzt3V1onnf5B/AnTZM0afpe166jMFeFbtOt6rY61KGgYx6IiKBnAw98OfFA3IGeCIJH6okiqAwERSg4EXwpygRRNoVatzFx1tptLat9f0uTNM1bm/9h+YvXle6X5uqT5PM5/XI/z912d77ekK+/nrm5uQ4AUGPV7b4BAFhJFC8AFFK8AFBI8QJAIcULAIUULwAUUrwAUEjxAkAhxQsAhVZXfllPT4//m6wGTz/9dJidPXs2zCYmJsLs+vXrYbZu3bowm5ycDLPR0dEw2759e5h95StfCbOVYG5urud238ObtVSe5Z6e+K/2dvy/9n31q18Nsx07doTZH/7whzDr7+8Ps0cffTTMZmdnw+ypp54Ks8WQ/Tt1Orfn36rFzT7L3ngBoJDiBYBCihcACileACikeAGgkOIFgEI9lb+mvVQmCLfD+973vjB77rnnwuzIkSNhls2CMtPT02HW29sbZoODg2G2ZcuWMNuzZ0+Yvfzyy2G2XJgT3dT3hdntmJo88cQTYfb5z38+zB5//PEwW7NmTZhlf/75pjiRqampMPvFL34RZr/+9a/D7He/+12YXbp06eZubAkzJwKALqR4AaCQ4gWAQooXAAopXgAopHgBoJA5UZfYt29fmD344INhduHChTDLTiC6evVqmA0NDTV9Zpbt2rUrzL73ve+F2be+9a0wWy7MiRbPRz/60TD7whe+kF77yCOPhNkdd9wRZocPHw6zsbGxMLvnnnvCbHx8PMyy5254eLjpM7Ppz7ve9a6mz3z11VfD7Otf/3qYdTqdzq9+9as07xbmRADQhRQvABRSvABQSPECQCHFCwCFFC8AFDIn6hJnz54Ns2yekJ0wsmnTpqbrWk9CybLNmzeH2alTp8LsscceC7PlwpxoYbLJ2VNPPRVm2fSl08kndzMzM2G2evXqMHvhhRfC7AMf+ECYZZOh7DSx7GdA9ty9/vrrYbZ79+6me1m7dm2Ybdy4Mcw6nU7nO9/5Tph96UtfSq+tZE4EAF1I8QJAIcULAIUULwAUUrwAUEjxAkAhc6JC2a/TX758OcxeeumlMJuYmAiz1ulPq97e3jDbsGFDmGUThOwklOXCnGh+2YTlwIEDYZZNgrKJTqeTPyP9/f1hdu3atabrsolS9mzNzs6GWetzvmpV/E6WfV/2d5p1TfZ31unkPz/e/e53h1l2ItJiMCcCgC6keAGgkOIFgEKKFwAKKV4AKKR4AaBQfIwGt1w2m8lOC8p+tT87fWRycjLMsl/fz6YLAwMDYdY6h5hv1gHZKUPr168Ps2xul50i1Onk85fW/2azeVP2nGf3kv3s6OvrC7Psz5A9r62yz5xv1rpmzZow+9rXvhZmTz755Pw3dht44wWAQooXAAopXgAopHgBoJDiBYBCihcACpkTFRoaGgqzbGZw4sSJMMtOPMrmCcPDw2HWOqPITkLJZg3nz58PM+h0Op2HH344zLI5TTZhyZ6PTief3GXPSJa1nhaUXTffnyOSPZPZFDE7YSn7+87+PuebdmX38/jjj6fXdiNvvABQSPECQCHFCwCFFC8AFFK8AFBI8QJAIXOiQg8++GDTddnJHNu3bw+zt7zlLWGWnTJ07Nixm7qv/5adTjQ4OBhmY2NjTd/HyvHAAw+E2fj4eJhlk5ls3tLptE9/MtnUKJsFtd5LNv/LpogvvvhimO3cuTPM7rzzzjDLZl/ZvXQ6+b/Vtm3b0mu7kTdeACikeAGgkOIFgEKKFwAKKV4AKKR4AaCQ4gWAQna8hXbv3h1m2bGA2
[remainder of base64-encoded PNG data omitted: 2x2 grid of example training images]\n" 111 | }, 112 | "metadata": {} 113 | } 114 | ] 115 | }, 116 | { 117 | "metadata": { 118 | "_uuid": "ca4572528683c5e8e41b5972d7e93af4d115a3ab" 119 | }, 120 | "cell_type": "markdown", 121 | "source": "**ANN with Keras**" 122 | }, 123 | { 124 | "metadata": { 125 | "trusted": true, 126 | "_uuid": "ab079ea2fb6b3b7b1338ebceb760d6a8075ef1c5" 127 | }, 128 | "cell_type": "code", 129 | "source": "x_train = x_train.values.T\ny_train = y_train.values.reshape(8400,1).T\nx_test = x_test.values.T\ny_test = y_test.values.reshape(3600,1).T", 130 | "execution_count": 7, 131 | "outputs": [] 132 | }, 133 | { 134 | "metadata": { 135 | "trusted": true, 136 | "_uuid": "e57d82a55fda90e1fb6b2ec778632965cd6a6a02" 137 | }, 138 | "cell_type": "code", 139 | "source": "from keras.wrappers.scikit_learn import KerasClassifier\nfrom sklearn.model_selection import cross_val_score\nfrom keras.models import Sequential\nfrom keras.layers import Dense\n\ndef buildClassifier():\n classifier = Sequential()\n classifier.add(Dense(units=8, kernel_initializer=\"uniform\", activation=\"relu\", input_dim=x_train.shape[0])) # Hidden Layer 1 with 8 nodes\n classifier.add(Dense(units=6, kernel_initializer=\"uniform\", activation=\"relu\")) # 
Hidden Layer 2 with 6 nodes\n classifier.add(Dense(units=1, kernel_initializer=\"uniform\", activation=\"sigmoid\")) # Output Layer\n classifier.compile(optimizer = \"adam\", loss = \"binary_crossentropy\", metrics = [\"accuracy\"])\n return classifier\n\n\nclassifier = KerasClassifier(build_fn=buildClassifier, epochs = 100)\naccuracies = cross_val_score(estimator = classifier, X = x_train.T, y = y_train.T, cv=3)\nmean = accuracies.mean()\nvariance = accuracies.std()\n\nprint(\"Accuracy Mean is {:.2f}%\".format(mean*100))\nprint(\"Accuracy Variance is {}\".format(variance))", 140 | "execution_count": 10, 141 | "outputs": [ 142 | { 143 | "output_type": "stream", 144 | "text": "Epoch 1/100\n5600/5600 [==============================] - 1s 127us/step - loss: 0.1261 - acc: 0.9571\nEpoch 2/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0534 - acc: 0.9812\nEpoch 3/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0462 - acc: 0.9841\nEpoch 4/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0411 - acc: 0.9868\nEpoch 5/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0384 - acc: 0.9863\nEpoch 6/100\n5600/5600 [==============================] - 0s 60us/step - loss: 0.0335 - acc: 0.9871\nEpoch 7/100\n5600/5600 [==============================] - 0s 61us/step - loss: 0.0314 - acc: 0.9895\nEpoch 8/100\n5600/5600 [==============================] - 0s 59us/step - loss: 0.0274 - acc: 0.9916\nEpoch 9/100\n5600/5600 [==============================] - 0s 58us/step - loss: 0.0314 - acc: 0.9896\nEpoch 10/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0209 - acc: 0.9927\nEpoch 11/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0224 - acc: 0.9916\nEpoch 12/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0292 - acc: 0.9889\nEpoch 13/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0223 - acc: 
0.9916\nEpoch 14/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0200 - acc: 0.9923\nEpoch 15/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0203 - acc: 0.9923\nEpoch 16/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0190 - acc: 0.9921\nEpoch 17/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0149 - acc: 0.9945\nEpoch 18/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0208 - acc: 0.9930\nEpoch 19/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0235 - acc: 0.9918\nEpoch 20/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0094 - acc: 0.9959\nEpoch 21/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0131 - acc: 0.9948\nEpoch 22/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0117 - acc: 0.9955\nEpoch 23/100\n5600/5600 [==============================] - 0s 43us/step - loss: 0.0165 - acc: 0.9941\nEpoch 24/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0127 - acc: 0.9952\nEpoch 25/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0093 - acc: 0.9964\nEpoch 26/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0128 - acc: 0.9948\nEpoch 27/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0111 - acc: 0.9955\nEpoch 28/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0229 - acc: 0.9918\nEpoch 29/100\n5600/5600 [==============================] - 0s 43us/step - loss: 0.0106 - acc: 0.9954\nEpoch 30/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0083 - acc: 0.9970\nEpoch 31/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0092 - acc: 0.9961\nEpoch 32/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0141 - acc: 0.9952\nEpoch 33/100\n5600/5600 
[==============================] - 0s 44us/step - loss: 0.0093 - acc: 0.9966\nEpoch 34/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0097 - acc: 0.9964\nEpoch 35/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0142 - acc: 0.9943\nEpoch 36/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0053 - acc: 0.9968\nEpoch 37/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0098 - acc: 0.9973\nEpoch 38/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0150 - acc: 0.9955\nEpoch 39/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0051 - acc: 0.9982\nEpoch 40/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0054 - acc: 0.9982\nEpoch 41/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0030 - acc: 0.9988\nEpoch 42/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0048 - acc: 0.9971\nEpoch 43/100\n5600/5600 [==============================] - 0s 43us/step - loss: 0.0113 - acc: 0.9961\nEpoch 44/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0086 - acc: 0.9957\nEpoch 45/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0027 - acc: 0.9980\nEpoch 46/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0339 - acc: 0.9848\nEpoch 47/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0226 - acc: 0.9854\nEpoch 48/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0190 - acc: 0.9875\nEpoch 49/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0129 - acc: 0.9888\nEpoch 50/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0104 - acc: 0.9929\nEpoch 51/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0105 - acc: 0.9980\nEpoch 52/100\n5600/5600 [==============================] - 0s 48us/step - loss: 
0.0115 - acc: 0.9970\nEpoch 53/100\n5600/5600 [==============================] - 0s 52us/step - loss: 0.0107 - acc: 0.9982\nEpoch 54/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0232 - acc: 0.9950\nEpoch 55/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0113 - acc: 0.9968\nEpoch 56/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0092 - acc: 0.9982\nEpoch 57/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0099 - acc: 0.9977\nEpoch 58/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0075 - acc: 0.9982\nEpoch 59/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0074 - acc: 0.9986\nEpoch 60/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0054 - acc: 0.9988\nEpoch 61/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0055 - acc: 0.9989\nEpoch 62/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0118 - acc: 0.9968\nEpoch 63/100\n5600/5600 [==============================] - 0s 52us/step - loss: 0.0076 - acc: 0.9979\nEpoch 64/100\n5600/5600 [==============================] - 0s 52us/step - loss: 0.0055 - acc: 0.9989\nEpoch 65/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0045 - acc: 0.9993\nEpoch 66/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0043 - acc: 0.9993\nEpoch 67/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0044 - acc: 0.9993\nEpoch 68/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0090 - acc: 0.9984\nEpoch 69/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0166 - acc: 0.9932\nEpoch 70/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0115 - acc: 0.9964\nEpoch 71/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0054 - acc: 0.9988\nEpoch 72/100\n5600/5600 
[==============================] - 0s 52us/step - loss: 0.0046 - acc: 0.9989\nEpoch 73/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0032 - acc: 0.9995\nEpoch 74/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0030 - acc: 0.9995\nEpoch 75/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0081 - acc: 0.9975\nEpoch 76/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0087 - acc: 0.9977\nEpoch 77/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0035 - acc: 0.9995\nEpoch 78/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0033 - acc: 0.9993\nEpoch 79/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0054 - acc: 0.9986\nEpoch 80/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0028 - acc: 0.9995\nEpoch 81/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0028 - acc: 0.9995\nEpoch 82/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0027 - acc: 0.9995\nEpoch 83/100\n", 145 | "name": "stdout" 146 | }, 147 | { 148 | "output_type": "stream", 149 | "text": "5600/5600 [==============================] - 0s 47us/step - loss: 0.0026 - acc: 0.9995\nEpoch 84/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0025 - acc: 0.9995\nEpoch 85/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0156 - acc: 0.9966\nEpoch 86/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0037 - acc: 0.9991\nEpoch 87/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0026 - acc: 0.9995\nEpoch 88/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0025 - acc: 0.9995\nEpoch 89/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0024 - acc: 0.9995\nEpoch 90/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0024 - acc: 
0.9995\nEpoch 91/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0144 - acc: 0.9973\nEpoch 92/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0162 - acc: 0.9959\nEpoch 93/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0064 - acc: 0.9973\nEpoch 94/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0063 - acc: 0.9980\nEpoch 95/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0045 - acc: 0.9988\nEpoch 96/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0040 - acc: 0.9988\nEpoch 97/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0024 - acc: 0.9995\nEpoch 98/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0023 - acc: 0.9995\nEpoch 99/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0024 - acc: 0.9995\nEpoch 100/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0024 - acc: 0.9995\n2800/2800 [==============================] - 0s 54us/step\nEpoch 1/100\n5600/5600 [==============================] - 1s 124us/step - loss: 0.1124 - acc: 0.9654\nEpoch 2/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0437 - acc: 0.9823\nEpoch 3/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0407 - acc: 0.9848\nEpoch 4/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0338 - acc: 0.9886\nEpoch 5/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0319 - acc: 0.9891\nEpoch 6/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0241 - acc: 0.9913\nEpoch 7/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0222 - acc: 0.9929\nEpoch 8/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0211 - acc: 0.9930\nEpoch 9/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0226 - acc: 
0.9918\nEpoch 10/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0184 - acc: 0.9941\nEpoch 11/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0202 - acc: 0.9927\nEpoch 12/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0181 - acc: 0.9927\nEpoch 13/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0198 - acc: 0.9925\nEpoch 14/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0204 - acc: 0.9923\nEpoch 15/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0121 - acc: 0.9955\nEpoch 16/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0177 - acc: 0.9930\nEpoch 17/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0152 - acc: 0.9950\nEpoch 18/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0194 - acc: 0.9930\nEpoch 19/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0167 - acc: 0.9952\nEpoch 20/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0091 - acc: 0.9970\nEpoch 21/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0083 - acc: 0.9970\nEpoch 22/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0122 - acc: 0.9941\nEpoch 23/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0080 - acc: 0.9968\nEpoch 24/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0068 - acc: 0.9973\nEpoch 25/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0064 - acc: 0.9977\nEpoch 26/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0172 - acc: 0.9945\nEpoch 27/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0092 - acc: 0.9966\nEpoch 28/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0092 - acc: 0.9961\nEpoch 29/100\n5600/5600 
[==============================] - 0s 50us/step - loss: 0.0065 - acc: 0.9973\nEpoch 30/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0092 - acc: 0.9964\nEpoch 31/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0096 - acc: 0.9962\nEpoch 32/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0084 - acc: 0.9973\nEpoch 33/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0065 - acc: 0.9982\nEpoch 34/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0055 - acc: 0.9984\nEpoch 35/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0049 - acc: 0.9979\nEpoch 36/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0115 - acc: 0.9964\nEpoch 37/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0048 - acc: 0.9979\nEpoch 38/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0100 - acc: 0.9968\nEpoch 39/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0024 - acc: 0.9989\nEpoch 40/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0023 - acc: 0.9988\nEpoch 41/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0019 - acc: 0.9989\nEpoch 42/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0023 - acc: 0.9988\nEpoch 43/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0021 - acc: 0.9988\nEpoch 44/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0015 - acc: 0.9991\nEpoch 45/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0015 - acc: 0.9989\nEpoch 46/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0015 - acc: 0.9989\nEpoch 47/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0014 - acc: 0.9991\nEpoch 48/100\n5600/5600 [==============================] - 0s 45us/step - loss: 
0.0064 - acc: 0.9977\nEpoch 49/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0170 - acc: 0.9939\nEpoch 50/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0092 - acc: 0.9975\nEpoch 51/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0076 - acc: 0.9966\nEpoch 52/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0074 - acc: 0.9970\nEpoch 53/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0063 - acc: 0.9973\nEpoch 54/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0035 - acc: 0.9984\nEpoch 55/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0018 - acc: 0.9989\nEpoch 56/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0112 - acc: 0.9962\nEpoch 57/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0126 - acc: 0.9952\nEpoch 58/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0053 - acc: 0.9980\nEpoch 59/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0030 - acc: 0.9988\nEpoch 60/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0032 - acc: 0.9988\nEpoch 61/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0091 - acc: 0.9968\nEpoch 62/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0032 - acc: 0.9988\nEpoch 63/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0046 - acc: 0.9984\nEpoch 64/100\n", 150 | "name": "stdout" 151 | }, 152 | { 153 | "output_type": "stream", 154 | "text": "5600/5600 [==============================] - 0s 47us/step - loss: 0.0026 - acc: 0.9988\nEpoch 65/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0015 - acc: 0.9993\nEpoch 66/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0033 - acc: 0.9982\nEpoch 67/100\n5600/5600 
[==============================] - 0s 47us/step - loss: 0.0141 - acc: 0.9950\nEpoch 68/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0034 - acc: 0.9984\nEpoch 69/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0038 - acc: 0.9984\nEpoch 70/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0021 - acc: 0.9989\nEpoch 71/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0026 - acc: 0.9986\nEpoch 72/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0015 - acc: 0.9991\nEpoch 73/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0011 - acc: 0.9991\nEpoch 74/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0011 - acc: 0.9993\nEpoch 75/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0040 - acc: 0.9984\nEpoch 76/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0028 - acc: 0.9986\nEpoch 77/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0058 - acc: 0.9982\nEpoch 78/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0030 - acc: 0.9988\nEpoch 79/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0147 - acc: 0.9961\nEpoch 80/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0058 - acc: 0.9982\nEpoch 81/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0035 - acc: 0.9980\nEpoch 82/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0023 - acc: 0.9989\nEpoch 83/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0025 - acc: 0.9988\nEpoch 84/100\n5600/5600 [==============================] - 0s 45us/step - loss: 6.3361e-04 - acc: 0.9995\nEpoch 85/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0046 - acc: 0.9982\nEpoch 86/100\n5600/5600 [==============================] - 0s 45us/step - loss: 
0.0027 - acc: 0.9988\nEpoch 87/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0126 - acc: 0.9966\nEpoch 88/100\n5600/5600 [==============================] - 0s 44us/step - loss: 0.0021 - acc: 0.9989\nEpoch 89/100\n5600/5600 [==============================] - 0s 45us/step - loss: 9.9522e-04 - acc: 0.9995\nEpoch 90/100\n5600/5600 [==============================] - 0s 45us/step - loss: 7.5599e-04 - acc: 0.9995\nEpoch 91/100\n5600/5600 [==============================] - 0s 50us/step - loss: 6.3645e-04 - acc: 0.9995\nEpoch 92/100\n5600/5600 [==============================] - 0s 45us/step - loss: 5.4229e-04 - acc: 0.9995\nEpoch 93/100\n5600/5600 [==============================] - 0s 47us/step - loss: 4.9193e-04 - acc: 0.9995\nEpoch 94/100\n5600/5600 [==============================] - 0s 47us/step - loss: 4.6516e-04 - acc: 0.9995\nEpoch 95/100\n5600/5600 [==============================] - 0s 46us/step - loss: 4.4648e-04 - acc: 0.9995\nEpoch 96/100\n5600/5600 [==============================] - 0s 45us/step - loss: 4.3313e-04 - acc: 0.9995\nEpoch 97/100\n5600/5600 [==============================] - 0s 46us/step - loss: 4.2222e-04 - acc: 0.9995\nEpoch 98/100\n5600/5600 [==============================] - 0s 45us/step - loss: 4.1194e-04 - acc: 0.9995\nEpoch 99/100\n5600/5600 [==============================] - 0s 46us/step - loss: 4.0341e-04 - acc: 0.9995\nEpoch 100/100\n5600/5600 [==============================] - 0s 44us/step - loss: 3.9419e-04 - acc: 0.9995\n2800/2800 [==============================] - 0s 61us/step\nEpoch 1/100\n5600/5600 [==============================] - 1s 134us/step - loss: 0.2026 - acc: 0.9218\nEpoch 2/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0472 - acc: 0.9836\nEpoch 3/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0457 - acc: 0.9839\nEpoch 4/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0383 - acc: 0.9855\nEpoch 5/100\n5600/5600 
[==============================] - 0s 47us/step - loss: 0.0316 - acc: 0.9889\nEpoch 6/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0325 - acc: 0.9879\nEpoch 7/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0263 - acc: 0.9891\nEpoch 8/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0326 - acc: 0.9868\nEpoch 9/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0217 - acc: 0.9920\nEpoch 10/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0220 - acc: 0.9921\nEpoch 11/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0201 - acc: 0.9929\nEpoch 12/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0191 - acc: 0.9932\nEpoch 13/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0177 - acc: 0.9929\nEpoch 14/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0186 - acc: 0.9930\nEpoch 15/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0160 - acc: 0.9939\nEpoch 16/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0129 - acc: 0.9955\nEpoch 17/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0181 - acc: 0.9936\nEpoch 18/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0152 - acc: 0.9954\nEpoch 19/100\n5600/5600 [==============================] - 0s 45us/step - loss: 0.0092 - acc: 0.9973\nEpoch 20/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0110 - acc: 0.9954\nEpoch 21/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0100 - acc: 0.9968\nEpoch 22/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0053 - acc: 0.9980\nEpoch 23/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0106 - acc: 0.9962\nEpoch 24/100\n5600/5600 [==============================] - 0s 46us/step - loss: 0.0068 - 
acc: 0.9977\nEpoch 25/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0096 - acc: 0.9959\nEpoch 26/100\n5600/5600 [==============================] - 0s 48us/step - loss: 0.0146 - acc: 0.9941\nEpoch 27/100\n5600/5600 [==============================] - 0s 50us/step - loss: 0.0117 - acc: 0.9957\nEpoch 28/100\n5600/5600 [==============================] - 0s 47us/step - loss: 0.0105 - acc: 0.9964\nEpoch 29/100\n5600/5600 [==============================] - 0s 49us/step - loss: 0.0056 - acc: 0.9975\nEpoch 30/100\n5600/5600 [==============================] - 0s 54us/step - loss: 0.0047 - acc: 0.9982\nEpoch 31/100\n5600/5600 [==============================] - 0s 51us/step - loss: 0.0149 - acc: 0.9939\nEpoch 32/100\n5600/5600 [==============================] - 0s 52us/step - loss: 0.0076 - acc: 0.9970\nEpoch 33/100\n5600/5600 [==============================] - 0s 54us/step - loss: 0.0044 - acc: 0.9982\nEpoch 34/100\n5600/5600 [==============================] - 0s 58us/step - loss: 0.0019 - acc: 0.9993\nEpoch 35/100\n5600/5600 [==============================] - 0s 57us/step - loss: 0.0040 - acc: 0.9988\nEpoch 36/100\n5600/5600 [==============================] - 0s 59us/step - loss: 0.0220 - acc: 0.9948\nEpoch 37/100\n5600/5600 [==============================] - 0s 55us/step - loss: 0.0140 - acc: 0.9941\nEpoch 38/100\n5600/5600 [==============================] - 0s 57us/step - loss: 0.0075 - acc: 0.9970\nEpoch 39/100\n5600/5600 [==============================] - 0s 55us/step - loss: 0.0049 - acc: 0.9986\nEpoch 40/100\n5600/5600 [==============================] - 0s 54us/step - loss: 0.0022 - acc: 0.9989\nEpoch 41/100\n5600/5600 [==============================] - 0s 57us/step - loss: 0.0048 - acc: 0.9988\nEpoch 42/100\n5600/5600 [==============================] - 0s 56us/step - loss: 0.0093 - acc: 0.9966\nEpoch 43/100\n5600/5600 [==============================] - 0s 60us/step - loss: 0.0038 - acc: 0.9989\nEpoch 44/100\n5600/5600 
[==============================] - 0s 57us/step - loss: 0.0016 - acc: 0.9996\nEpoch 45/100\n", 155 | "name": "stdout" 156 | }, 157 | { 158 | "output_type": "stream", 159 | "text": "5600/5600 [==============================] - 0s 52us/step - loss: 0.0091 - acc: 0.9962\nEpoch 46/100\n5600/5600 [==============================] - 0s 53us/step - loss: 0.0021 - acc: 0.9995\nEpoch 47/100\n5600/5600 [==============================] - 0s 55us/step - loss: 0.0031 - acc: 0.9991\nEpoch 48/100\n5600/5600 [==============================] - 0s 58us/step - loss: 6.8306e-04 - acc: 0.9998\nEpoch 49/100\n5600/5600 [==============================] - 0s 57us/step - loss: 0.0019 - acc: 0.9996\nEpoch 50/100\n5600/5600 [==============================] - 0s 57us/step - loss: 0.0010 - acc: 0.9998\nEpoch 51/100\n5600/5600 [==============================] - 0s 56us/step - loss: 0.0070 - acc: 0.9971\nEpoch 52/100\n5600/5600 [==============================] - 0s 55us/step - loss: 5.4300e-04 - acc: 1.0000\nEpoch 53/100\n5600/5600 [==============================] - 0s 56us/step - loss: 3.4355e-04 - acc: 1.0000\nEpoch 54/100\n5600/5600 [==============================] - 0s 55us/step - loss: 2.7773e-04 - acc: 1.0000\nEpoch 55/100\n5600/5600 [==============================] - 0s 54us/step - loss: 2.1845e-04 - acc: 1.0000\nEpoch 56/100\n5600/5600 [==============================] - 0s 53us/step - loss: 1.8711e-04 - acc: 1.0000\nEpoch 57/100\n5600/5600 [==============================] - 0s 55us/step - loss: 1.7280e-04 - acc: 1.0000\nEpoch 58/100\n5600/5600 [==============================] - 0s 53us/step - loss: 1.5216e-04 - acc: 1.0000\nEpoch 59/100\n5600/5600 [==============================] - 0s 54us/step - loss: 1.4954e-04 - acc: 1.0000\nEpoch 60/100\n5600/5600 [==============================] - 0s 60us/step - loss: 1.3480e-04 - acc: 1.0000\nEpoch 61/100\n5600/5600 [==============================] - 0s 59us/step - loss: 1.2653e-04 - acc: 1.0000\nEpoch 62/100\n5600/5600 
[==============================] - 0s 56us/step - loss: 1.1453e-04 - acc: 1.0000\nEpoch 63/100\n5600/5600 [==============================] - 0s 48us/step - loss: 1.0568e-04 - acc: 1.0000\nEpoch 64/100\n5600/5600 [==============================] - 0s 52us/step - loss: 1.0247e-04 - acc: 1.0000\nEpoch 65/100\n5600/5600 [==============================] - 0s 64us/step - loss: 9.9977e-05 - acc: 1.0000\nEpoch 66/100\n5600/5600 [==============================] - 0s 65us/step - loss: 9.7851e-05 - acc: 1.0000\nEpoch 67/100\n5600/5600 [==============================] - 0s 57us/step - loss: 1.4496e-04 - acc: 1.0000\nEpoch 68/100\n5600/5600 [==============================] - 0s 56us/step - loss: 1.0999e-04 - acc: 1.0000\nEpoch 69/100\n5600/5600 [==============================] - 0s 54us/step - loss: 9.4551e-05 - acc: 1.0000\nEpoch 70/100\n5600/5600 [==============================] - 0s 56us/step - loss: 9.1887e-05 - acc: 1.0000\nEpoch 71/100\n5600/5600 [==============================] - 0s 57us/step - loss: 8.9751e-05 - acc: 1.0000\nEpoch 72/100\n5600/5600 [==============================] - 0s 53us/step - loss: 8.7814e-05 - acc: 1.0000\nEpoch 73/100\n5600/5600 [==============================] - 0s 55us/step - loss: 8.6029e-05 - acc: 1.0000\nEpoch 74/100\n5600/5600 [==============================] - 0s 57us/step - loss: 8.4248e-05 - acc: 1.0000\nEpoch 75/100\n5600/5600 [==============================] - 0s 54us/step - loss: 8.2469e-05 - acc: 1.0000\nEpoch 76/100\n5600/5600 [==============================] - 0s 58us/step - loss: 8.0752e-05 - acc: 1.0000\nEpoch 77/100\n5600/5600 [==============================] - 0s 62us/step - loss: 7.8986e-05 - acc: 1.0000\nEpoch 78/100\n5600/5600 [==============================] - 0s 56us/step - loss: 7.7275e-05 - acc: 1.0000\nEpoch 79/100\n5600/5600 [==============================] - 0s 54us/step - loss: 7.5559e-05 - acc: 1.0000\nEpoch 80/100\n5600/5600 [==============================] - 0s 53us/step - loss: 7.3830e-05 - acc: 1.0000\nEpoch 
81/100\n5600/5600 [==============================] - 0s 53us/step - loss: 7.2151e-05 - acc: 1.0000\nEpoch 82/100\n5600/5600 [==============================] - 0s 47us/step - loss: 7.0484e-05 - acc: 1.0000\nEpoch 83/100\n5600/5600 [==============================] - 0s 46us/step - loss: 6.8726e-05 - acc: 1.0000\nEpoch 84/100\n5600/5600 [==============================] - 0s 49us/step - loss: 6.7041e-05 - acc: 1.0000\nEpoch 85/100\n5600/5600 [==============================] - 0s 47us/step - loss: 6.5363e-05 - acc: 1.0000\nEpoch 86/100\n5600/5600 [==============================] - 0s 47us/step - loss: 6.3666e-05 - acc: 1.0000\nEpoch 87/100\n5600/5600 [==============================] - 0s 45us/step - loss: 6.2060e-05 - acc: 1.0000\nEpoch 88/100\n5600/5600 [==============================] - 0s 46us/step - loss: 6.0405e-05 - acc: 1.0000\nEpoch 89/100\n5600/5600 [==============================] - 0s 46us/step - loss: 5.8834e-05 - acc: 1.0000\nEpoch 90/100\n5600/5600 [==============================] - 0s 46us/step - loss: 5.7154e-05 - acc: 1.0000\nEpoch 91/100\n5600/5600 [==============================] - 0s 47us/step - loss: 5.5522e-05 - acc: 1.0000\nEpoch 92/100\n5600/5600 [==============================] - 0s 46us/step - loss: 5.3944e-05 - acc: 1.0000\nEpoch 93/100\n5600/5600 [==============================] - 0s 48us/step - loss: 5.2393e-05 - acc: 1.0000\nEpoch 94/100\n5600/5600 [==============================] - 0s 46us/step - loss: 5.0840e-05 - acc: 1.0000\nEpoch 95/100\n5600/5600 [==============================] - 0s 45us/step - loss: 4.9364e-05 - acc: 1.0000\nEpoch 96/100\n5600/5600 [==============================] - 0s 47us/step - loss: 4.7910e-05 - acc: 1.0000\nEpoch 97/100\n5600/5600 [==============================] - 0s 48us/step - loss: 4.6376e-05 - acc: 1.0000\nEpoch 98/100\n5600/5600 [==============================] - 0s 47us/step - loss: 4.4902e-05 - acc: 1.0000\nEpoch 99/100\n5600/5600 [==============================] - 0s 48us/step - loss: 4.3426e-05 - acc: 
1.0000\nEpoch 100/100\n5600/5600 [==============================] - 0s 45us/step - loss: 4.2012e-05 - acc: 1.0000\n2800/2800 [==============================] - 0s 68us/step\nAccuracy Mean is 99.08%\nAccuracy Variance is 0.0014384578539993634\n", 160 | "name": "stdout" 161 | } 162 | ] 163 | }, 164 | { 165 | "metadata": { 166 | "trusted": true, 167 | "_uuid": "6e34edab4f7630979b8066240cdb8820ec49184f" 168 | }, 169 | "cell_type": "markdown", 170 | "source": "**The model reaches a mean cross-validated accuracy of approximately 99%.**" 171 | }, 172 | { 173 | "metadata": { 174 | "_uuid": "e60a25a598358170cb2fc644ed5960139bae8c3e" 175 | }, 176 | "cell_type": "markdown", 177 | "source": "Thanks for your time!" 178 | } 179 | ], 180 | "metadata": { 181 | "kernelspec": { 182 | "display_name": "Python 3", 183 | "language": "python", 184 | "name": "python3" 185 | }, 186 | "language_info": { 187 | "name": "python", 188 | "version": "3.6.6", 189 | "mimetype": "text/x-python", 190 | "codemirror_mode": { 191 | "name": "ipython", 192 | "version": 3 193 | }, 194 | "pygments_lexer": "ipython3", 195 | "nbconvert_exporter": "python", 196 | "file_extension": ".py" 197 | } 198 | }, 199 | "nbformat": 4, 200 | "nbformat_minor": 1 201 | } 202 | -------------------------------------------------------------------------------- /Black_Friday_EDA/black_friday_eda.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "_uuid": "052e075e513664cc0821fd5a1de9693dfb011e34" 7 | }, 8 | "source": [ 9 | "# BLACK FRIDAY\n", 10 | "The shopping day after Thanksgiving is called **Black Friday**. On Black Friday most retailers open very early and offer promotional sales. As a result, many people go shopping on this day, and it sometimes causes chaos. That's why it's called Black Friday.\n", 11 | "At first the day had no name. 
In 1961, shoppers in Philadelphia caused traffic accidents and violence while trying to take advantage of promotions and sales, and after that the day came to be called Black Friday. \n", 12 | "\n", 13 | "The dataset contains 550,000 observations about Black Friday in a retail store\n", 14 | "![](https://i0.wp.com/ares.shiftdelete.net/580x330/original/2017/11/black-friday-nedir-sdn-1-1.jpg)" 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": null, 20 | "metadata": { 21 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 22 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5" 23 | }, 24 | "outputs": [], 25 | "source": [ 26 | "import numpy as np # linear algebra\n", 27 | "import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n", 28 | "import matplotlib.pyplot as plt\n", 29 | "%matplotlib inline\n", 30 | "# plotly\n", 31 | "import plotly.plotly as py\n", 32 | "from plotly.offline import init_notebook_mode, iplot\n", 33 | "init_notebook_mode(connected=True)\n", 34 | "import plotly.graph_objs as go\n", 35 | "from plotly import tools\n" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": { 42 | "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0", 43 | "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a" 44 | }, 45 | "outputs": [], 46 | "source": [ 47 | "df = pd.read_csv(\"BlackFriday.csv\")" 48 | ] 49 | }, 50 | { 51 | "cell_type": "code", 52 | "execution_count": null, 53 | "metadata": { 54 | "_uuid": "0981ff8ceaff0f30d13e3b906cc59e010507394d" 55 | }, 56 | "outputs": [], 57 | "source": [ 58 | "df.info()" 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "metadata": { 65 | "_uuid": "66734ab0f5d1157e41c4d7a238cbb997dff2731c" 66 | }, 67 | "outputs": [], 68 | "source": [ 69 | "df.head()" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": null, 75 | "metadata": { 76 | "_uuid": "0c15db5d7439dec2540ddbc174894e3e5c43f707" 77 | }, 78 | "outputs": [], 79 | "source": [ 80 | 
"df.columns" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": { 86 | "_uuid": "e5fabbf7a129e7b52a33040b23f23073978a58a9" 87 | }, 88 | "source": [ 89 | "People aged 26-35 tend to buy more products and spend more money. This may be explained by financial status: people in this age range are usually better off financially than younger people. They are also more interested in technology and promotions than older people." 90 | ] 91 | }, 92 | { 93 | "cell_type": "code", 94 | "execution_count": null, 95 | "metadata": { 96 | "_uuid": "5d380a3a6d6893ccb3339b6b9473a5e076c5271d" 97 | }, 98 | "outputs": [], 99 | "source": [ 100 | "ageData = sorted(list(zip(df.Age.value_counts().index, df.Age.value_counts().values)))\n", 101 | "age, productBuy = zip(*ageData)\n", 102 | "age, productBuy = list(age), list(productBuy)\n", 103 | "ageSeries = pd.Series((i for i in age))\n", 104 | "\n", 105 | "data = [go.Bar(x=age, \n", 106 | " y=productBuy, \n", 107 | " name=\"How many products were sold\",\n", 108 | " marker = dict(color=['#EA4A28', '#D3EA28', '#28EA4E', '#28EAE2', '#2008B9', '#E511E1', '#C4061D'],\n", 109 | " line = dict(color='#7C7C7C', width = .5)),\n", 110 | " text=\"Age: \" + ageSeries)]\n", 111 | "layout = go.Layout(title= \"How many products were sold by ages\")\n", 112 | "fig = go.Figure(data=data, layout=layout)\n", 113 | "iplot(fig)" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": { 120 | "_uuid": "74fbbd7b448fb665fc63187e790195dcc412288e" 121 | }, 122 | "outputs": [], 123 | "source": [ 124 | "age0_17 = len(df[df.Age == \"0-17\"].User_ID.unique())\n", 125 | "age18_25 = len(df[df.Age == \"18-25\"].User_ID.unique())\n", 126 | "age26_35 = len(df[df.Age == \"26-35\"].User_ID.unique())\n", 127 | "age36_45 = len(df[df.Age == \"36-45\"].User_ID.unique())\n", 128 | "age46_50 = len(df[df.Age == \"46-50\"].User_ID.unique())\n", 129 | "age51_55 = len(df[df.Age == \"51-55\"].User_ID.unique())\n", 130 | 
"age55 = len(df[df.Age == \"55+\"].User_ID.unique())\n", 131 | "agesBuyerCount = [age0_17,age18_25,age26_35,age36_45,age46_50,age51_55,age55]\n", 132 | " \n", 133 | "trace1 = go.Bar(x = age,\n", 134 | " y = agesBuyerCount,\n", 135 | " name = \"People count\",\n", 136 | " marker = dict(color=['#F3B396', '#F3F196', '#A7F9AD', '#D5F0EF', '#AAADEE', '#EAC1E8', '#DF8787'],\n", 137 | " line = dict(color='#7C7C7C', width = 1)),\n", 138 | " text = \"Age: \" + ageSeries)\n", 139 | "data = [trace1]\n", 140 | "layout = go.Layout(title= \"How many people did shopping by ages\")\n", 141 | "fig = go.Figure(data=data, layout=layout)\n", 142 | "iplot(fig)" 143 | ] 144 | }, 145 | { 146 | "cell_type": "code", 147 | "execution_count": null, 148 | "metadata": { 149 | "_uuid": "1c30fcbf94a2c4c567d89c9d9843789f61137dc5" 150 | }, 151 | "outputs": [], 152 | "source": [ 153 | "totalPurchase = df.Purchase.sum()" 154 | ] 155 | }, 156 | { 157 | "cell_type": "code", 158 | "execution_count": null, 159 | "metadata": { 160 | "_uuid": "3939baa69e0b2adcbdaaecf1b3d550d4f0b88a12" 161 | }, 162 | "outputs": [], 163 | "source": [ 164 | "y_percentage = [i/totalPurchase*100 for i in df.groupby(\"Age\").sum()[\"Purchase\"].values]\n", 165 | "y_purchase = [i for i in df.groupby(\"Age\").sum()[\"Purchase\"].values]\n", 166 | "x_percentage = [i for i in age]\n", 167 | "x_purchase = [i for i in age]\n", 168 | "\n", 169 | "trace0 = go.Bar(x = y_percentage,\n", 170 | " y = x_percentage,\n", 171 | " marker = dict(color = '#BF5959', \n", 172 | " line = dict(\n", 173 | " color = '#FFFEA6', width = 1) \n", 174 | " ),\n", 175 | " name = \"Percentage of purchases amount in dollars\",\n", 176 | " orientation = \"h\"\n", 177 | " )\n", 178 | "trace1 = go.Scatter(x = y_purchase,\n", 179 | " y = x_purchase,\n", 180 | " mode='lines+markers',\n", 181 | " line=dict(\n", 182 | " color='#5079DC'),\n", 183 | " name='Purchases amount in dollars ',)\n", 184 | "layout = dict(\n", 185 | " title='Purchases in $',\n", 186 | " 
yaxis=dict(showticklabels=True,domain=[0, 0.85]),\n", 187 | " yaxis2=dict(showline=True,showticklabels=False,linecolor='rgba(102, 102, 102, 0.8)',linewidth=2,domain=[0, 0.85]),\n", 188 | " xaxis=dict(zeroline=False,showline=False,showticklabels=True,showgrid=True,domain=[0, 0.42]),\n", 189 | " xaxis2=dict(zeroline=False,showline=False,showticklabels=True,showgrid=True,domain=[0.47, 1],side='top',dtick=25),\n", 190 | " legend=dict(x=0.029,y=1.038,font=dict(size=10) ),\n", 191 | " margin=dict(l=200, r=20,t=70,b=70),\n", 192 | " paper_bgcolor='rgb(248, 248, 255)',\n", 193 | " plot_bgcolor='rgb(248, 248, 255)',\n", 194 | ")\n", 195 | "\n", 196 | "annotations = []\n", 197 | "\n", 198 | "y_s = np.round(y_percentage, decimals=2)\n", 199 | "y_nw = np.rint(y_purchase)\n", 200 | "\n", 201 | "for ydn, yd, xd in zip(y_nw, y_s, x_percentage):\n", 202 | " annotations.append(dict(xref='x2', yref='y2',\n", 203 | " y=xd, x=ydn - 20000,\n", 204 | " text='{:,}'.format(ydn),\n", 205 | " font=dict(family='Arial', size=12,\n", 206 | " color='rgb(128, 0, 128)'),\n", 207 | " showarrow=False))\n", 208 | " annotations.append(dict(xref='x1', yref='y1',\n", 209 | " y=xd, x=yd + 3,\n", 210 | " text=str(yd) + '%',\n", 211 | " font=dict(family='Arial', size=12,\n", 212 | " color='rgb(50, 171, 96)'),\n", 213 | " showarrow=False))\n", 214 | "annotations.append(dict(xref='paper', yref='paper',\n", 215 | " x=-0.2, y=-0.109,\n", 216 | " font=dict(family='Arial', size=10,\n", 217 | " color='rgb(150,150,150)'),\n", 218 | " showarrow=False))\n", 219 | "\n", 220 | "layout['annotations'] = annotations\n", 221 | "\n", 222 | "fig = tools.make_subplots(rows=1, cols=2, specs=[[{}, {}]], shared_xaxes=True,\n", 223 | " shared_yaxes=False, vertical_spacing=0.001)\n", 224 | "\n", 225 | "fig.append_trace(trace0, 1, 1)\n", 226 | "fig.append_trace(trace1, 1, 2)\n", 227 | "\n", 228 | "fig['layout'].update(layout)\n", 229 | "iplot(fig)" 230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 
235 | "metadata": { 236 | "_kg_hide-input": true, 237 | "_kg_hide-output": false, 238 | "_uuid": "2830b727051e458b9c51839851ade3c54889ab59" 239 | }, 240 | "outputs": [], 241 | "source": [ 242 | "data = [go.Box(y = df.Purchase, marker = dict(\n", 243 | " color = 'rgb(0, 128, 128)',\n", 244 | " ))]\n", 245 | "iplot(data)\n" 246 | ] 247 | }, 248 | { 249 | "cell_type": "markdown", 250 | "metadata": { 251 | "_uuid": "433effd3180d3c543f3edbdabbcb096213c31f23" 252 | }, 253 | "source": [ 254 | "**Purchase Amount in US Dollars Grouped by City Category**" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": null, 260 | "metadata": { 261 | "_kg_hide-input": true, 262 | "_uuid": "0ac014179e4346b76ea95992d3a234ff7339b0df", 263 | "scrolled": true 264 | }, 265 | "outputs": [], 266 | "source": [ 267 | "labels = sorted(df.City_Category.unique())\n", 268 | "values = df.groupby(\"City_Category\").sum()[\"Purchase\"]\n", 269 | "colors = ['#FEBFB3', '#E1396C', '#96D38C', '#D0F9B1']\n", 270 | "\n", 271 | "trace = go.Pie(labels=labels, values=values,\n", 272 | " hoverinfo='label+percent', textinfo='value', \n", 273 | " textfont=dict(size=20),\n", 274 | " marker=dict(colors=colors, \n", 275 | " line=dict(color='#000000', width=2)))\n", 276 | "iplot([trace])" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": { 282 | "_uuid": "d24e46d72fce9c13fdc495dceabb8131c03e2fd1" 283 | }, 284 | "source": [ 285 | "**Percentage of People by Years Stayed in Current City**" 286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": null, 291 | "metadata": { 292 | "_uuid": "31f60b7b6be44de1a78481b4c42c9b8b8ad31c15" 293 | }, 294 | "outputs": [], 295 | "source": [ 296 | "labels = sorted(df.Stay_In_Current_City_Years.unique())\n", 297 | "values = df.Stay_In_Current_City_Years.value_counts().sort_index()\n", 298 | "\n", 299 | "trace = go.Pie(labels=labels, values=values)\n", 300 | "\n", 301 | "iplot([trace])" 302 | ] 303 | }, 304 | { 305 | 
"cell_type": "code", 306 | "execution_count": null, 307 | "metadata": { 308 | "_uuid": "02f7c7a447c4f18ab1edb2f1e4b5653a9aa50476" 309 | }, 310 | "outputs": [], 311 | "source": [ 312 | "data = [go.Bar(\n", 313 | " x=df.Product_Category_1.value_counts().sort_index().index,\n", 314 | " y=df.Product_Category_1.value_counts().sort_index().values\n", 315 | " )]\n", 316 | "layout = go.Layout(\n", 317 | " title='Most Purchased Product Category',\n", 318 | ")\n", 319 | "\n", 320 | "fig = go.Figure(data=data, layout=layout)\n", 321 | "iplot(fig)" 322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "metadata": { 327 | "_uuid": "de2343cc304f5020e1a406416886f7f7e2fd07a4" 328 | }, 329 | "source": [ 330 | "As we can see below single people and males spent more money than married people and females." 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": null, 336 | "metadata": { 337 | "_uuid": "bbfab106c82a44b1dac411cebee7ba5c70c981b6" 338 | }, 339 | "outputs": [], 340 | "source": [ 341 | "x_MaritalStatus = ['Single', 'Married']\n", 342 | "y_PurchaseAmountAccordingToMaritalStatus = [int(df[df.Marital_Status == 0].Purchase.sum()), int(df[df.Marital_Status == 1].Purchase.sum())]\n", 343 | "\n", 344 | "data = [go.Bar(x = x_MaritalStatus, \n", 345 | " y = y_PurchaseAmountAccordingToMaritalStatus,\n", 346 | " marker = dict(color=['#0239E3','#E36502']))]\n", 347 | "layout = go.Layout(title = 'Purchased Amount According To Marital Status (in US Dollars)')\n", 348 | "fig = go.Figure(data=data, layout=layout)\n", 349 | "iplot(fig)" 350 | ] 351 | }, 352 | { 353 | "cell_type": "code", 354 | "execution_count": null, 355 | "metadata": { 356 | "_uuid": "9bacf0dbda79c22b07b03aa0b2ffa977d265c272" 357 | }, 358 | "outputs": [], 359 | "source": [] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "metadata": { 365 | "_uuid": "a73800e9b84f90731ea6e1c836a29bd2eeae3fe8" 366 | }, 367 | "outputs": [], 368 | "source": [ 369 | "x_Gender = 
['Male', 'Female']\n", 370 | "y_PurchaseAmountAccordingToGender = [df[df.Gender == 'M'].Purchase.sum(), df[df.Gender == 'F'].Purchase.sum()]\n", 371 | "\n", 372 | "data = [go.Bar(x = x_Gender, \n", 373 | " y = y_PurchaseAmountAccordingToGender,\n", 374 | " marker = dict(color=['#A6000D','#2ECEE7']))]\n", 375 | "layout = go.Layout(title = 'Purchased Amount According To Gender (in US Dollars)')\n", 376 | "fig = go.Figure(data=data, layout=layout)\n", 377 | "iplot(fig)" 378 | ] 379 | }, 380 | { 381 | "cell_type": "code", 382 | "execution_count": null, 383 | "metadata": { 384 | "_uuid": "74bb528474307f6dc710ae91ebd74e33d7722bf4" 385 | }, 386 | "outputs": [], 387 | "source": [ 388 | "x_Status = ['Single & Male', 'Single & Female', 'Married & Male', 'Married & Female']\n", 389 | "y_Purchases = [df[(df.Gender == 'M') & (df.Marital_Status == 0)].Purchase.sum(),\n", 390 | " df[(df.Gender == 'F') & (df.Marital_Status == 0)].Purchase.sum(),\n", 391 | " df[(df.Gender == 'M') & (df.Marital_Status == 1)].Purchase.sum(),\n", 392 | " df[(df.Gender == 'F') & (df.Marital_Status == 1)].Purchase.sum()]\n", 393 | "\n", 394 | "data = [go.Bar(x = x_Status, \n", 395 | " y = y_Purchases,\n", 396 | " marker = dict(color=['rgb(0,212,65)','rgb(54,10,95)','rgb(5,22,205)','rgb(50,62,1)']))]\n", 397 | "layout = go.Layout(title = 'Purchased Amount According To Gender and Marital Status (in US Dollars)')\n", 398 | "fig = go.Figure(data=data, layout=layout)\n", 399 | "iplot(fig)" 400 | ] 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": { 405 | "_uuid": "689664ae9473a2e0e3642e50a0b9ebe6ffa22dac" 406 | }, 407 | "source": [ 408 | "Thanks for your time!" 
409 | ] 410 | } 411 | ], 412 | "metadata": { 413 | "kernelspec": { 414 | "display_name": "Python 3", 415 | "language": "python", 416 | "name": "python3" 417 | }, 418 | "language_info": { 419 | "codemirror_mode": { 420 | "name": "ipython", 421 | "version": 3 422 | }, 423 | "file_extension": ".py", 424 | "mimetype": "text/x-python", 425 | "name": "python", 426 | "nbconvert_exporter": "python", 427 | "pygments_lexer": "ipython3", 428 | "version": "3.7.0" 429 | } 430 | }, 431 | "nbformat": 4, 432 | "nbformat_minor": 1 433 | } 434 | -------------------------------------------------------------------------------- /Global_Terrorism_Database/Terrorism_in_the_World_and_Turkey.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "metadata": { 5 | "_uuid": "63b60abfd1ed4af52aae9e73250ab8df018c17c4" 6 | }, 7 | "cell_type": "markdown", 8 | "source": "# INTRODUCTION\nTerrorism is the systematic use of violence to spread fear, practiced mostly by political organizations and religious groups. It has become one of the biggest problems in our world.\nAccording to research, global terrorism is one of the problems that must be solved most urgently. I analyse terrorist attacks that occurred worldwide between 1970 and 2017, and take a closer look at the attacks that happened in Turkey.\n\nThis is my first kernel and my first attempt to analyse data. 
Please comment with your thoughts and any recommendations for improving this and my future work.\n\nThanks.\n\n![](https://ukdj.imgix.net/2017/12/terrorism.jpg?auto=compress%2Cformat&fit=crop&h=580&ixlib=php-1.2.1&q=80&w=1021&wpsize=td_1021x580&s=deebed2357a05031b8cd2b6ad3ce1d5e)" 9 | }, 10 | { 11 | "metadata": { 12 | "_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5", 13 | "_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19", 14 | "trusted": true 15 | }, 16 | "cell_type": "code", 17 | "source": "# This Python 3 environment comes with many helpful analytics libraries installed\n# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python\n# For example, here's several helpful packages to load in \n\nimport numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\nimport seaborn as sns\nimport matplotlib.pyplot as plt\n\n# Input data files are available in the \"../input/\" directory.\n# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory\n\nimport os\nprint(os.listdir(\"../input\"))\n\n# Any results you write to the current directory are saved as output.", 18 | "execution_count": null, 19 | "outputs": [] 20 | }, 21 | { 22 | "metadata": { 23 | "_cell_guid": "79c7e3d0-c299-4dcb-8224-4455121ee9b0", 24 | "_uuid": "d629ff2d2480ee46fbb7e2d37f6b5fab8052498a", 25 | "trusted": true, 26 | "_kg_hide-output": true, 27 | "_kg_hide-input": true 28 | }, 29 | "cell_type": "code", 30 | "source": "# Read the global dataset\ndf = pd.read_csv('../input/globalterrorismdb_0718dist.csv', encoding='windows-1252')", 31 | "execution_count": null, 32 | "outputs": [] 33 | }, 34 | { 35 | "metadata": { 36 | "trusted": true, 37 | "_uuid": "fb63d690651633eacf35efa888b01df228eed24a" 38 | }, 39 | "cell_type": "code", 40 | "source": "list(df.columns)", 41 | "execution_count": null, 42 | "outputs": [] 43 | }, 44 | { 45 | "metadata": { 46 | "trusted": true, 47 | "_uuid": 
"9710f14b8dfef967013c5edcbb8e280a393fefad" 48 | }, 49 | "cell_type": "code", 50 | "source": "df.head()", 51 | "execution_count": null, 52 | "outputs": [] 53 | }, 54 | { 55 | "metadata": { 56 | "trusted": true, 57 | "_uuid": "97069c01006e7c095e5b40e23bf68b073ac932b3" 58 | }, 59 | "cell_type": "code", 60 | "source": "turkeyDf = df[(df.country_txt == 'Turkey')] # Data for Turkey", 61 | "execution_count": null, 62 | "outputs": [] 63 | }, 64 | { 65 | "metadata": { 66 | "trusted": true, 67 | "_uuid": "b5882f2e1eb2b64ae5aabf2a45050c33f5cfb479" 68 | }, 69 | "cell_type": "code", 70 | "source": "# 30 countries that the most terrorrist attacks occured\nmost30Global = df.country_txt.value_counts()[:31]\nplt.figure(figsize = (15,10))\nax = sns.barplot(x=most30Global,y=most30Global.index, palette=\"rocket\")\nplt.title('30 Countries Where Most of the Terrorist Attacks were Shown')\nplt.xlabel('Numbers of Terrorirst Attacks (1970 - 2017)')\nplt.xticks(np.arange(0,26500,2000))\nplt.show()", 71 | "execution_count": null, 72 | "outputs": [] 73 | }, 74 | { 75 | "metadata": { 76 | "trusted": true, 77 | "_uuid": "2952822b45d5da737ec82c88f1838c2d538b54d8" 78 | }, 79 | "cell_type": "code", 80 | "source": "df.region_txt.value_counts()\ndfRegion = df.groupby('region_txt').sum()\n\ndfRegionAndPeopleKilled = sorted(list(zip(dfRegion['nkill'].values, dfRegion['nkill'].index)), reverse=True)\npeopleWereKilledList, regionList = zip(*dfRegionAndPeopleKilled)\npeopleWereKilledList, regionList = list(peopleWereKilledList), list(regionList)\n\nplt.figure(figsize=(15,10))\nsns.barplot(x=regionList, y=peopleWereKilledList, palette= sns.color_palette(\"YlOrRd_r\", 10))\nplt.xticks(rotation=45, ha='right')\nplt.title(\"Numbers of people killed\")\nplt.show()\n", 81 | "execution_count": null, 82 | "outputs": [] 83 | }, 84 | { 85 | "metadata": { 86 | "_uuid": "2b2b6b1020a8e27615883a3d663d57f5f23963aa" 87 | }, 88 | "cell_type": "markdown", 89 | "source": "Apparently Middle East & North Africa and South 
Asia suffered more from terrorist attacks than the other regions.\nAlso, as you can see below, 58% of attacks targeted the Middle East & North Africa and South Asia regions." 90 | }, 91 | { 92 | "metadata": { 93 | "trusted": true, 94 | "_uuid": "d6a3d8f4f5599a178498f2cf4e6fdb2aec8574e2" 95 | }, 96 | "cell_type": "code", 97 | "source": "# Percentage of Attacks By Regions\n# Note: slice sizes are based on people killed (nkill), not on the number of attacks\nlabels = dfRegion.index\ncolors=['#17A50E', '#D905FA', '#D0E02F', '#2FD0E0', '#FF0000', '#00FFE8','#CAC8C2', '#6AF394', '#C08CEC', '#FAA6E5', '#37528A', '#DF650C']\nexplode=[0,0,0,0,0,0,0,0,0,0,0,0]\nsizes = dfRegion['nkill'].values\n\nplt.figure(figsize=(15,15))\nplt.pie(sizes, explode=explode, labels=labels, colors=colors, autopct='%1.1f%%', pctdistance=0.5)\nplt.title('Terrorist Attacks by Regions')\nplt.show()", 98 | "execution_count": null, 99 | "outputs": [] 100 | }, 101 | { 102 | "metadata": { 103 | "trusted": true, 104 | "_uuid": "761705c386d08eac111ea676da4147febc809d34", 105 | "_kg_hide-input": false 106 | }, 107 | "cell_type": "code", 108 | "source": "# Terrorist Incidents by Years\ndfYear = list(zip(df.iyear.value_counts().index, df.iyear.value_counts().values))\ndfYear = sorted(dfYear)\nterrorByYear = []\nterrorByCounts = []\nfor i in dfYear:\n terrorByYear.append(i[0])\n terrorByCounts.append(i[1])\n \nplt.figure(figsize=(20,5))\nsns.barplot(x=terrorByYear, y=terrorByCounts, palette=sns.color_palette(\"hls\", len(terrorByYear)))\nplt.xticks(rotation=90)\nplt.xlabel('Year')\nplt.ylabel('Number of Attacks')\nplt.show()", 109 | "execution_count": null, 110 | "outputs": [] 111 | }, 112 | { 113 | "metadata": { 114 | "_kg_hide-input": true, 115 | "trusted": true, 116 | "_uuid": "caa6b2542a79abd5fafa450ebf19e8cdf69aa7c4" 117 | }, 118 | "cell_type": "code", 119 | "source": "totalKilled = df[df['iyear'] <= 2007].groupby('gname').sum()['nkill'].values.sum()\ntotalKilledAfter2007 = df[df['iyear'] > 2007].groupby('gname').sum()['nkill'].values.sum()\nprint(\"Total people killed by terrorists 
between 1970 and 2007:\",int(totalKilled))\nprint(\"Total people killed by terrorists between 2008 and 2017:\",int(totalKilledAfter2007))\n", 120 | "execution_count": null, 121 | "outputs": [] 122 | }, 123 | { 124 | "metadata": { 125 | "_kg_hide-output": false, 126 | "_kg_hide-input": true, 127 | "trusted": true, 128 | "_uuid": "c3d8acce60dfeaf75b32fae879c8cf3d4707c3a3" 129 | }, 130 | "cell_type": "code", 131 | "source": "df[df['iyear'] > 2007].groupby('gname').sum()[['nkill']].sort_values(['nkill'],ascending=False)", 132 | "execution_count": null, 133 | "outputs": [] 134 | }, 135 | { 136 | "metadata": { 137 | "_uuid": "7549803a7cdb923af014aec1f99cb73993de393b" 138 | }, 139 | "cell_type": "markdown", 140 | "source": "As we can see, more people were killed after 2007 than between 1970 and 2007. Most of these people were killed by:\n - Islamic State of Iraq and the Levant (ISIL): 55697\n - Taliban: 27457\n - Boko Haram: 20328\n \n Looking at these groups, we can see that ISIL is mostly active in Iraq and Syria, the Taliban in Afghanistan and Boko Haram in Nigeria." 141 | }, 142 | { 143 | "metadata": { 144 | "trusted": true, 145 | "_uuid": "8ce19bded4cfb8590aece444fdbcd0fe8239e537", 146 | "_kg_hide-output": false, 147 | "_kg_hide-input": false 148 | }, 149 | "cell_type": "code", 150 | "source": "byYear = df.groupby(\"iyear\").sum()\nplt.figure(figsize=(20,5))\nbyYear.nkill.plot(color=\"#4F098E\", label = \"Number of People Killed\", linewidth = 4, alpha = 0.8, grid = True)\nplt.legend(loc='upper left')\nplt.xticks(np.arange(byYear.index.min(),2018,5))\nplt.yticks(np.arange(0,50000,5000))\nplt.xlabel('Year')\nplt.ylabel('People Killed')\nplt.title('Number of People Killed')\nplt.show()", 151 | "execution_count": null, 152 | "outputs": [] 153 | }, 154 | { 155 | "metadata": { 156 | "trusted": true, 157 | "_uuid": "582deca57b20b1676f32b2f1c53676e83bccee5d" 158 | }, 159 | "cell_type": "code", 160 | "source": "df.head()", 161 | "execution_count": null, 162 | "outputs": [] 163 | }, 164 
| { 165 | "metadata": { 166 | "trusted": true, 167 | "_uuid": "569235ab07ec26b39e11583105a27932815efc3f", 168 | "_kg_hide-input": false, 169 | "scrolled": true 170 | }, 171 | "cell_type": "code", 172 | "source": "regionAndYear = pd.crosstab(df.iyear, df.region_txt)\nregionAndYear.plot(color=sns.color_palette(\"bright\", 12),grid=True, linewidth = 2)\nfigure=plt.gcf()\nfigure.set_size_inches(20,6)\nplt.xlabel(\"Year\")\nplt.ylabel(\"Number of Terrorist Attacks\")\nplt.title(\"Terrorist Attacks by Year and Region\")\nplt.show()", 173 | "execution_count": null, 174 | "outputs": [] 175 | }, 176 | { 177 | "metadata": { 178 | "trusted": true, 179 | "_uuid": "77cc58a19aaa38063c0087f905a6393d3ba514b9" 180 | }, 181 | "cell_type": "markdown", 182 | "source": "After 2010, terrorist attacks rose sharply in the Middle East & North Africa and South Asia." 183 | }, 184 | { 185 | "metadata": { 186 | "trusted": true, 187 | "_uuid": "f1c3f10984f0b65a94ca0d7c6f95cb36c401a0fc" 188 | }, 189 | "cell_type": "code", 190 | "source": "df['attacktype1_txt'].value_counts()\nplt.figure(figsize=(20,8))\nsns.set(style=\"darkgrid\", context=\"talk\")\nsns.barplot(x=df['attacktype1_txt'].value_counts().index, y=df['attacktype1_txt'].value_counts().values, palette=\"GnBu_r\")\nplt.xlabel(\"Methods of Attack\")\nplt.ylabel(\"Number of Attacks\")\nplt.xticks(rotation=45, ha=\"right\")\nplt.title(\"Attack Types\")\nplt.show()\n", 191 | "execution_count": null, 192 | "outputs": [] 193 | }, 194 | { 195 | "metadata": { 196 | "_uuid": "2cc68f9608a6ecddb4d18d39736f791a816515ec" 197 | }, 198 | "cell_type": "markdown", 199 | "source": "## TURKEY" 200 | }, 201 | { 202 | "metadata": { 203 | "trusted": true, 204 | "_uuid": "625ddeeff6607827593fe2dbb8941723bc4ef4b4" 205 | }, 206 | "cell_type": "code", 207 | "source": "turkeyDf.head()", 208 | "execution_count": null, 209 | "outputs": [] 210 | }, 211 | { 212 | "metadata": { 213 | "_uuid": "0af572880fbe74e70f6e84ed5a2e636966c74582" 214 | }, 215 | "cell_type": 
"markdown", 216 | "source": "### Terrorist Attacks Year by Year" 217 | }, 218 | { 219 | "metadata": { 220 | "trusted": true, 221 | "_uuid": "5c948309f2d0db1653277be1c3fcc6a81628d809" 222 | }, 223 | "cell_type": "code", 224 | "source": "yearOfAttacksTurkey = turkeyDf['iyear'].value_counts()\nlistOfAttacksYear = sorted(list(zip(yearOfAttacksTurkey.index, yearOfAttacksTurkey.values)))\nattacksYear, attacksCount = zip(*listOfAttacksYear)\nattacksYear, attacksCount = list(attacksYear), list(attacksCount)\n\nsns.set(style=\"white\", context=\"talk\")\nf, ax1 = plt.subplots(figsize=(20,10))\nsns.pointplot(x=attacksYear, y=attacksCount, color=\"orange\")\nplt.xlabel(\"Year\")\nplt.xticks(rotation=90)\nplt.ylabel(\"Numbers of Terrorist Attacks\")\nplt.grid()\n\n", 225 | "execution_count": null, 226 | "outputs": [] 227 | }, 228 | { 229 | "metadata": { 230 | "trusted": true, 231 | "_uuid": "87dc6091f0e225a4917ac45003c7158bd9f9a7fe" 232 | }, 233 | "cell_type": "code", 234 | "source": "mostActiveGroupsTen = turkeyDf[turkeyDf['gname'].isin(turkeyDf['gname'].value_counts()[:11].index)]\nsns.set(style=\"whitegrid\", context=\"talk\")\npd.crosstab(mostActiveGroupsTen.iyear,mostActiveGroupsTen.gname).plot(color=sns.color_palette('Paired',12))\nplt.xlabel(\"Year\")\nfig=plt.gcf()\nfig.set_size_inches(20,6)\nplt.show()", 235 | "execution_count": null, 236 | "outputs": [] 237 | }, 238 | { 239 | "metadata": { 240 | "_uuid": "2c6bb49ab543527b9f180c600b64da85cba81d4d" 241 | }, 242 | "cell_type": "markdown", 243 | "source": "Between late 80's and early 90's terrorist attacks were raised and it reached its peak. We can see Kurdistan Workers' Party (PKK)'s attacks are correlated with that. After their leader Abdullah Ocalan was caught by Turkish Army their activity drop down. After 2014 we can observe same scenario: aggressively raise and sharply down." 
244 | }, 245 | { 246 | "metadata": { 247 | "trusted": true, 248 | "_uuid": "93337ebb37ddea6576bb7378a203f3b44888421a" 249 | }, 250 | "cell_type": "code", 251 | "source": "terrorGroupsTurkey = turkeyDf.gname.value_counts()\nterrorGroupsList15 = sorted(list(zip(terrorGroupsTurkey.values[:15], terrorGroupsTurkey.index[:15])), reverse=True)\nattacksInTurkey15, terroristsInTurkey15 = zip(*terrorGroupsList15)\nattacksInTurkey15, terroristsInTurkey15 = list(attacksInTurkey15), list(terroristsInTurkey15)\n\nplt.figure(figsize=(20,6))\nsns.barplot(x=terroristsInTurkey15, y=attacksInTurkey15, palette = sns.cubehelix_palette(len(terroristsInTurkey15)))\nplt.xticks(rotation=70, ha=\"right\")\nplt.xlabel(\"Terrorist Organisations\")\nplt.ylabel(\"Attacks\")\nplt.show()", 252 | "execution_count": null, 253 | "outputs": [] 254 | }, 255 | { 256 | "metadata": { 257 | "_uuid": "254db24ae41cd275d01d6f5ba8424f82b28f1b11" 258 | }, 259 | "cell_type": "markdown", 260 | "source": "**Top 30 Cities Attacked by Terrorists**" 261 | }, 262 | { 263 | "metadata": { 264 | "trusted": true, 265 | "_uuid": "ff97daf92bf3b95e6f215ea033b2d6e072fa6e52" 266 | }, 267 | "cell_type": "code", 268 | "source": "citiesAttacked = turkeyDf.provstate.value_counts()\ncitiesAttackedList = sorted(list(zip(citiesAttacked.values[:30], citiesAttacked.index[:30])), reverse=True)\ncitiesAttacked30, citiesCount30 = zip(*citiesAttackedList)\ncitiesAttacked30, citiesCount30 = list(citiesAttacked30), list(citiesCount30)\n\nplt.figure(figsize=(20,10))\nsns.barplot(x=citiesAttacked30, y=citiesCount30)\nplt.xticks(np.arange(0,1201,100))\nplt.xlabel(\"Number of Attacks\")\nplt.show()", 269 | "execution_count": null, 270 | "outputs": [] 271 | }, 272 | { 273 | "metadata": { 274 | "_uuid": "6af952dd33251dbc21cb734d4f3d0d771fa14819" 275 | }, 276 | "cell_type": "markdown", 277 | "source": "**Attack Types of Terrorists**" 278 | }, 279 | { 280 | "metadata": { 281 | "trusted": true, 282 | "_uuid": 
"ab6e1d67301c86792b3fe8fcc2769c19d23059ba" 283 | }, 284 | "cell_type": "code", 285 | "source": "attackType = turkeyDf.attacktype1_txt.value_counts()\nattackTypeList = sorted(list(zip(attackType.values, attackType.index)), reverse=True)\nattackCount, attackMethod = zip(*attackTypeList)\nattackCount, attackMethod = list(attackCount), list(attackMethod)\n\nplt.figure(figsize=(20,8))\nsns.barplot(x=attackMethod, y=attackCount, palette = sns.color_palette(\"YlOrBr_r\", 9))\nplt.xticks(rotation=35, ha=\"right\")\nplt.xlabel(\"Attack Types of Terrorists\")\nplt.title(\"Attack Types\")\nplt.ylabel(\"Attacks\")\nplt.show()", 286 | "execution_count": null, 287 | "outputs": [] 288 | }, 289 | { 290 | "metadata": { 291 | "trusted": true, 292 | "_uuid": "97b002aa48371f64cd17a40ab240c7d8f72b1750" 293 | }, 294 | "cell_type": "code", 295 | "source": "targetType = turkeyDf.targtype1_txt.value_counts()\ntargetTypeList = sorted(list(zip(targetType.values, targetType.index)), reverse=True)\ntargetCount, target = zip(*targetTypeList)\ntargetCount, target = list(targetCount), list(target)\n\nplt.figure(figsize=(20,8))\nsns.barplot(x=target, y=targetCount, palette = sns.color_palette(\"YlGn_r\", 21))\nplt.xticks(rotation=35, ha=\"right\")\nplt.xlabel(\"Targets of Terrorists\")\nplt.ylabel(\"Attacks\")\nplt.show()", 296 | "execution_count": null, 297 | "outputs": [] 298 | }, 299 | { 300 | "metadata": { 301 | "trusted": true, 302 | "scrolled": true, 303 | "_uuid": "2d7aba8eacd507f49c54db2b95457f714e622fca" 304 | }, 305 | "cell_type": "code", 306 | "source": "terrorists=turkeyDf['gname'].value_counts()[:5].to_frame()\nterrorists.columns=['gname']\nkill=turkeyDf.groupby('gname')['nkill'].sum().to_frame()\nterrorists.merge(kill,left_index=True,right_index=True,how='left').plot.bar(width=0.9)\nfig=plt.gcf()\nfig.set_size_inches(20,6)\nplt.show()", 307 | "execution_count": null, 308 | "outputs": [] 309 | } 310 | ], 311 | "metadata": { 312 | "kernelspec": { 313 | "display_name": "Python 3", 314 | 
"language": "python", 315 | "name": "python3" 316 | }, 317 | "language_info": { 318 | "name": "python", 319 | "version": "3.6.6", 320 | "mimetype": "text/x-python", 321 | "codemirror_mode": { 322 | "name": "ipython", 323 | "version": 3 324 | }, 325 | "pygments_lexer": "ipython3", 326 | "nbconvert_exporter": "python", 327 | "file_extension": ".py" 328 | } 329 | }, 330 | "nbformat": 4, 331 | "nbformat_minor": 1 332 | } 333 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Follow my projects on Kaggle: 2 | https://www.kaggle.com/cdabakoglu 3 | --------------------------------------------------------------------------------