├── .gitignore
├── Untitled.ipynb
├── Datasets
│   ├── readme.txt
│   ├── shakes-mapping.json
│   └── Churn Data Description.txt
├── README.md
├── 6. Credit Card Fraud Detection
│   └── Machine_Learning_06_Credit_Card_Fraud_Detection.ipynb
└── 4. Minimizing Churn Rate Through Analysis of Financial Habits
    └── Machine_Learning_04_Minimizing_Churn_Rate_Through_Analysis_of_Financial_Habits (Part 2).ipynb
/.gitignore:
--------------------------------------------------------------------------------
1 | *.csv
2 | *.xls
3 | *.xlsx
4 |
5 | #folders
6 |
7 | .ipynb_checkpoints/
8 | .ipynb_checkpoints
9 |
--------------------------------------------------------------------------------
/Untitled.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [],
3 | "metadata": {},
4 | "nbformat": 4,
5 | "nbformat_minor": 2
6 | }
7 |
--------------------------------------------------------------------------------
/Datasets/readme.txt:
--------------------------------------------------------------------------------
1 | The datasets were downloaded from: https://www.superdatascience.com/pages/machine-learning-practical
--------------------------------------------------------------------------------
/Datasets/shakes-mapping.json:
--------------------------------------------------------------------------------
1 | {
2 | "mappings" : {
3 | "_default_" : {
4 | "properties" : {
5 | "speaker" : {"type": "string", "index" : "not_analyzed" },
6 | "play_name" : {"type": "string", "index" : "not_analyzed" },
7 | "line_id" : { "type" : "integer" },
8 | "speech_number" : { "type" : "integer" }
9 | }
10 | }
11 | }
12 | }
13 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Machine Learning Practical: 6 Real-World Applications
2 |
3 | This repository contains the code for 6 practical, real-world cases solved with machine learning. The content follows the course [Machine Learning Practical: 6 Real-World Applications](https://www.udemy.com/machine-learning-practical/) created and taught by Kirill Eremenko, Hadelin de Ponteves, Dr. Ryan Ahmed, Ph.D., MBA, the SuperDataScience Team and Rony Sulca.
4 |
5 | List of solved cases included in this repo:
6 | - diagnosing diabetes in the early stages
7 | - directing customers to subscription products with app usage analysis (Linear Regression with L1 regularization)
8 | - minimizing churn rate in finance
9 | - predicting customer location with GPS data
10 | - forecasting future currency exchange rates
11 | - classifying fashion (Deep Learning)
12 | - predicting breast cancer (Support Vector Machine - SVM)
13 |
14 | ## Datasets
15 |
16 | All the datasets used in this course were downloaded from the course's [Datasets & Code page](https://www.superdatascience.com/pages/machine-learning-practical).
17 |
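18 | The notebooks read their CSVs from this shared `Datasets` folder, which sits one directory above each project notebook. Below is a minimal sketch of the loading pattern used in the fraud-detection notebook (the file name `creditcard.csv` comes from that notebook; swap in the CSV for the case you are running):
19 |
20 | ```python
21 | import os
22 | import pandas as pd
23 |
24 | # the Datasets folder is one level above the notebook's working directory
25 | datasets_dir = os.path.join(os.path.dirname(os.getcwd()), "Datasets")
26 | dataset = pd.read_csv(os.path.join(datasets_dir, "creditcard.csv"))
27 | print(dataset.shape)
28 | ```
29 |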
--------------------------------------------------------------------------------
/Datasets/Churn Data Description.txt:
--------------------------------------------------------------------------------
1 | Description of each column
2 | userid - MongoDB userid
3 | churn - Active = No | Suspended < 30 = No | Else Churn = Yes
4 | age - age of the customer
5 | city - city of the customer
6 | state - state where the customer lives
7 | postal_code - zip code of the customer
8 | zodiac_sign - zodiac sign of the customer
9 | rent_or_own - does the customer rent or own a house
10 | more_than_one_mobile_device - does the customer use more than one mobile device
11 | payFreq - pay frequency of the customer
12 | in_collections - is the customer in collections
13 | loan_pending - is the loan pending
14 | withdrawn_application - has the customer withdrawn the loan application
15 | paid_off_loan - has the customer paid off the loan
16 | did_not_accept_funding - customer did not accept funding
17 | cash_back_engagement - sum of cash back dollars received by a customer / number of days in the app
18 | cash_back_amount - sum of cash back dollars received by a customer
19 | used_ios - has the user used an iPhone
20 | used_android - has the user used an Android-based phone
21 | has_used_mobile_and_web - has the user used both the mobile and web platforms
22 | has_used_web - has the user used the MoneyLion web app
23 | has_used_mobile - has the user used the MoneyLion mobile app
24 | has_reffered - has the user referred anyone
25 | cards_clicked - how many times a user has clicked the cards
26 | cards_not_helpful - how many times a user marked the cards as not helpful
27 | cards_helpful - how many times a user marked the cards as helpful
28 | cards_viewed - how many times a user viewed the cards
29 | cards_share - how many times a user shared their cards
30 | trivia_view_results - how many times a user viewed trivia results
31 | trivia_view_unlocked - how many times a user viewed the trivia unlocked screen
32 | trivia_view_locked - how many times a user viewed the trivia locked screen
33 | trivia_shared_results - how many times a user shared trivia results
34 | trivia_played - how many times a user played trivia
35 | re_linked_account - has the user re-linked their account
36 | un_linked_account - has the user unlinked their account
37 | credit_score - customer's credit score
--------------------------------------------------------------------------------
/6. Credit Card Fraud Detection/Machine_Learning_06_Credit_Card_Fraud_Detection.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 37,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "## importing libraries & data\n",
10 | "\n",
11 | "# import libraries\n",
12 | "import pandas as pd\n",
13 | "import numpy as np\n",
14 | "import seaborn as sns\n",
15 | "import matplotlib.pyplot as plt\n",
16 | "import random\n",
17 | "import time\n",
18 | "\n",
19 | "# TensorFlow and tf.keras\n",
20 | "import tensorflow as tf\n",
21 | "from tensorflow import keras\n",
22 | "\n",
23 | "np.random.seed(2)"
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": 2,
29 | "metadata": {},
30 | "outputs": [],
31 | "source": [
32 | "path = \"../Datasets\"\n",
33 | "\n",
34 | "import os\n",
35 | "#Actual absolute path\n",
36 | "cwd = os.getcwd()\n",
37 | "#print(cwd)"
38 | ]
39 | },
40 | {
41 | "cell_type": "code",
42 | "execution_count": 3,
43 | "metadata": {},
44 | "outputs": [],
45 | "source": [
46 | "# one level up directory. Datasets directory\n",
47 | "os.chdir(path)\n",
48 | "#Actual absolute path\n",
49 | "cwd = os.getcwd()\n",
50 | "#print(cwd)"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": 4,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "# path file current dir + file name\n",
60 | "path_file = cwd + '/creditcard.csv'\n",
61 | "# print(path_file)"
62 | ]
63 | },
64 | {
65 | "cell_type": "code",
66 | "execution_count": 5,
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "data": {
71 | "text/html": [
72 | "
\n",
73 | "\n",
86 | "
\n",
87 | " \n",
88 | " \n",
89 | " \n",
90 | " Time \n",
91 | " V1 \n",
92 | " V2 \n",
93 | " V3 \n",
94 | " V4 \n",
95 | " V5 \n",
96 | " V6 \n",
97 | " V7 \n",
98 | " V8 \n",
99 | " V9 \n",
100 | " ... \n",
101 | " V21 \n",
102 | " V22 \n",
103 | " V23 \n",
104 | " V24 \n",
105 | " V25 \n",
106 | " V26 \n",
107 | " V27 \n",
108 | " V28 \n",
109 | " Amount \n",
110 | " Class \n",
111 | " \n",
112 | " \n",
113 | " \n",
114 | " \n",
115 | " 0 \n",
116 | " 0.0 \n",
117 | " -1.359807 \n",
118 | " -0.072781 \n",
119 | " 2.536347 \n",
120 | " 1.378155 \n",
121 | " -0.338321 \n",
122 | " 0.462388 \n",
123 | " 0.239599 \n",
124 | " 0.098698 \n",
125 | " 0.363787 \n",
126 | " ... \n",
127 | " -0.018307 \n",
128 | " 0.277838 \n",
129 | " -0.110474 \n",
130 | " 0.066928 \n",
131 | " 0.128539 \n",
132 | " -0.189115 \n",
133 | " 0.133558 \n",
134 | " -0.021053 \n",
135 | " 149.62 \n",
136 | " 0 \n",
137 | " \n",
138 | " \n",
139 | " 1 \n",
140 | " 0.0 \n",
141 | " 1.191857 \n",
142 | " 0.266151 \n",
143 | " 0.166480 \n",
144 | " 0.448154 \n",
145 | " 0.060018 \n",
146 | " -0.082361 \n",
147 | " -0.078803 \n",
148 | " 0.085102 \n",
149 | " -0.255425 \n",
150 | " ... \n",
151 | " -0.225775 \n",
152 | " -0.638672 \n",
153 | " 0.101288 \n",
154 | " -0.339846 \n",
155 | " 0.167170 \n",
156 | " 0.125895 \n",
157 | " -0.008983 \n",
158 | " 0.014724 \n",
159 | " 2.69 \n",
160 | " 0 \n",
161 | " \n",
162 | " \n",
163 | " 2 \n",
164 | " 1.0 \n",
165 | " -1.358354 \n",
166 | " -1.340163 \n",
167 | " 1.773209 \n",
168 | " 0.379780 \n",
169 | " -0.503198 \n",
170 | " 1.800499 \n",
171 | " 0.791461 \n",
172 | " 0.247676 \n",
173 | " -1.514654 \n",
174 | " ... \n",
175 | " 0.247998 \n",
176 | " 0.771679 \n",
177 | " 0.909412 \n",
178 | " -0.689281 \n",
179 | " -0.327642 \n",
180 | " -0.139097 \n",
181 | " -0.055353 \n",
182 | " -0.059752 \n",
183 | " 378.66 \n",
184 | " 0 \n",
185 | " \n",
186 | " \n",
187 | " 3 \n",
188 | " 1.0 \n",
189 | " -0.966272 \n",
190 | " -0.185226 \n",
191 | " 1.792993 \n",
192 | " -0.863291 \n",
193 | " -0.010309 \n",
194 | " 1.247203 \n",
195 | " 0.237609 \n",
196 | " 0.377436 \n",
197 | " -1.387024 \n",
198 | " ... \n",
199 | " -0.108300 \n",
200 | " 0.005274 \n",
201 | " -0.190321 \n",
202 | " -1.175575 \n",
203 | " 0.647376 \n",
204 | " -0.221929 \n",
205 | " 0.062723 \n",
206 | " 0.061458 \n",
207 | " 123.50 \n",
208 | " 0 \n",
209 | " \n",
210 | " \n",
211 | " 4 \n",
212 | " 2.0 \n",
213 | " -1.158233 \n",
214 | " 0.877737 \n",
215 | " 1.548718 \n",
216 | " 0.403034 \n",
217 | " -0.407193 \n",
218 | " 0.095921 \n",
219 | " 0.592941 \n",
220 | " -0.270533 \n",
221 | " 0.817739 \n",
222 | " ... \n",
223 | " -0.009431 \n",
224 | " 0.798278 \n",
225 | " -0.137458 \n",
226 | " 0.141267 \n",
227 | " -0.206010 \n",
228 | " 0.502292 \n",
229 | " 0.219422 \n",
230 | " 0.215153 \n",
231 | " 69.99 \n",
232 | " 0 \n",
233 | " \n",
234 | " \n",
235 | "
\n",
236 | "
5 rows × 31 columns
\n",
237 | "
"
238 | ],
239 | "text/plain": [
240 | " Time V1 V2 V3 V4 V5 V6 V7 \\\n",
241 | "0 0.0 -1.359807 -0.072781 2.536347 1.378155 -0.338321 0.462388 0.239599 \n",
242 | "1 0.0 1.191857 0.266151 0.166480 0.448154 0.060018 -0.082361 -0.078803 \n",
243 | "2 1.0 -1.358354 -1.340163 1.773209 0.379780 -0.503198 1.800499 0.791461 \n",
244 | "3 1.0 -0.966272 -0.185226 1.792993 -0.863291 -0.010309 1.247203 0.237609 \n",
245 | "4 2.0 -1.158233 0.877737 1.548718 0.403034 -0.407193 0.095921 0.592941 \n",
246 | "\n",
247 | " V8 V9 ... V21 V22 V23 V24 V25 \\\n",
248 | "0 0.098698 0.363787 ... -0.018307 0.277838 -0.110474 0.066928 0.128539 \n",
249 | "1 0.085102 -0.255425 ... -0.225775 -0.638672 0.101288 -0.339846 0.167170 \n",
250 | "2 0.247676 -1.514654 ... 0.247998 0.771679 0.909412 -0.689281 -0.327642 \n",
251 | "3 0.377436 -1.387024 ... -0.108300 0.005274 -0.190321 -1.175575 0.647376 \n",
252 | "4 -0.270533 0.817739 ... -0.009431 0.798278 -0.137458 0.141267 -0.206010 \n",
253 | "\n",
254 | " V26 V27 V28 Amount Class \n",
255 | "0 -0.189115 0.133558 -0.021053 149.62 0 \n",
256 | "1 0.125895 -0.008983 0.014724 2.69 0 \n",
257 | "2 -0.139097 -0.055353 -0.059752 378.66 0 \n",
258 | "3 -0.221929 0.062723 0.061458 123.50 0 \n",
259 | "4 0.502292 0.219422 0.215153 69.99 0 \n",
260 | "\n",
261 | "[5 rows x 31 columns]"
262 | ]
263 | },
264 | "execution_count": 5,
265 | "metadata": {},
266 | "output_type": "execute_result"
267 | }
268 | ],
269 | "source": [
270 | "### EDA ###\n",
271 | "\n",
272 | "# import dataset\n",
273 | "\n",
274 | "dataset = pd.read_csv(path_file)\n",
275 | "dataset.head()\n",
276 | "\n",
277 | "# If the Class column is 1 the transaction is Fraudulent\n",
278 | "# The IP Address was deleted from the dataset to preserve the anonymity"
279 | ]
280 | },
281 | {
282 | "cell_type": "code",
283 | "execution_count": 6,
284 | "metadata": {},
285 | "outputs": [
286 | {
287 | "data": {
288 | "text/html": [
289 | "\n",
290 | "\n",
303 | "
\n",
304 | " \n",
305 | " \n",
306 | " \n",
307 | " Time \n",
308 | " V1 \n",
309 | " V2 \n",
310 | " V3 \n",
311 | " V4 \n",
312 | " V5 \n",
313 | " V6 \n",
314 | " V7 \n",
315 | " V8 \n",
316 | " V9 \n",
317 | " ... \n",
318 | " V21 \n",
319 | " V22 \n",
320 | " V23 \n",
321 | " V24 \n",
322 | " V25 \n",
323 | " V26 \n",
324 | " V27 \n",
325 | " V28 \n",
326 | " Amount \n",
327 | " Class \n",
328 | " \n",
329 | " \n",
330 | " \n",
331 | " \n",
332 | " count \n",
333 | " 284807.000000 \n",
334 | " 2.848070e+05 \n",
335 | " 2.848070e+05 \n",
336 | " 2.848070e+05 \n",
337 | " 2.848070e+05 \n",
338 | " 2.848070e+05 \n",
339 | " 2.848070e+05 \n",
340 | " 2.848070e+05 \n",
341 | " 2.848070e+05 \n",
342 | " 2.848070e+05 \n",
343 | " ... \n",
344 | " 2.848070e+05 \n",
345 | " 2.848070e+05 \n",
346 | " 2.848070e+05 \n",
347 | " 2.848070e+05 \n",
348 | " 2.848070e+05 \n",
349 | " 2.848070e+05 \n",
350 | " 2.848070e+05 \n",
351 | " 2.848070e+05 \n",
352 | " 284807.000000 \n",
353 | " 284807.000000 \n",
354 | " \n",
355 | " \n",
356 | " mean \n",
357 | " 94813.859575 \n",
358 | " 3.919560e-15 \n",
359 | " 5.688174e-16 \n",
360 | " -8.769071e-15 \n",
361 | " 2.782312e-15 \n",
362 | " -1.552563e-15 \n",
363 | " 2.010663e-15 \n",
364 | " -1.694249e-15 \n",
365 | " -1.927028e-16 \n",
366 | " -3.137024e-15 \n",
367 | " ... \n",
368 | " 1.537294e-16 \n",
369 | " 7.959909e-16 \n",
370 | " 5.367590e-16 \n",
371 | " 4.458112e-15 \n",
372 | " 1.453003e-15 \n",
373 | " 1.699104e-15 \n",
374 | " -3.660161e-16 \n",
375 | " -1.206049e-16 \n",
376 | " 88.349619 \n",
377 | " 0.001727 \n",
378 | " \n",
379 | " \n",
380 | " std \n",
381 | " 47488.145955 \n",
382 | " 1.958696e+00 \n",
383 | " 1.651309e+00 \n",
384 | " 1.516255e+00 \n",
385 | " 1.415869e+00 \n",
386 | " 1.380247e+00 \n",
387 | " 1.332271e+00 \n",
388 | " 1.237094e+00 \n",
389 | " 1.194353e+00 \n",
390 | " 1.098632e+00 \n",
391 | " ... \n",
392 | " 7.345240e-01 \n",
393 | " 7.257016e-01 \n",
394 | " 6.244603e-01 \n",
395 | " 6.056471e-01 \n",
396 | " 5.212781e-01 \n",
397 | " 4.822270e-01 \n",
398 | " 4.036325e-01 \n",
399 | " 3.300833e-01 \n",
400 | " 250.120109 \n",
401 | " 0.041527 \n",
402 | " \n",
403 | " \n",
404 | " min \n",
405 | " 0.000000 \n",
406 | " -5.640751e+01 \n",
407 | " -7.271573e+01 \n",
408 | " -4.832559e+01 \n",
409 | " -5.683171e+00 \n",
410 | " -1.137433e+02 \n",
411 | " -2.616051e+01 \n",
412 | " -4.355724e+01 \n",
413 | " -7.321672e+01 \n",
414 | " -1.343407e+01 \n",
415 | " ... \n",
416 | " -3.483038e+01 \n",
417 | " -1.093314e+01 \n",
418 | " -4.480774e+01 \n",
419 | " -2.836627e+00 \n",
420 | " -1.029540e+01 \n",
421 | " -2.604551e+00 \n",
422 | " -2.256568e+01 \n",
423 | " -1.543008e+01 \n",
424 | " 0.000000 \n",
425 | " 0.000000 \n",
426 | " \n",
427 | " \n",
428 | " 25% \n",
429 | " 54201.500000 \n",
430 | " -9.203734e-01 \n",
431 | " -5.985499e-01 \n",
432 | " -8.903648e-01 \n",
433 | " -8.486401e-01 \n",
434 | " -6.915971e-01 \n",
435 | " -7.682956e-01 \n",
436 | " -5.540759e-01 \n",
437 | " -2.086297e-01 \n",
438 | " -6.430976e-01 \n",
439 | " ... \n",
440 | " -2.283949e-01 \n",
441 | " -5.423504e-01 \n",
442 | " -1.618463e-01 \n",
443 | " -3.545861e-01 \n",
444 | " -3.171451e-01 \n",
445 | " -3.269839e-01 \n",
446 | " -7.083953e-02 \n",
447 | " -5.295979e-02 \n",
448 | " 5.600000 \n",
449 | " 0.000000 \n",
450 | " \n",
451 | " \n",
452 | " 50% \n",
453 | " 84692.000000 \n",
454 | " 1.810880e-02 \n",
455 | " 6.548556e-02 \n",
456 | " 1.798463e-01 \n",
457 | " -1.984653e-02 \n",
458 | " -5.433583e-02 \n",
459 | " -2.741871e-01 \n",
460 | " 4.010308e-02 \n",
461 | " 2.235804e-02 \n",
462 | " -5.142873e-02 \n",
463 | " ... \n",
464 | " -2.945017e-02 \n",
465 | " 6.781943e-03 \n",
466 | " -1.119293e-02 \n",
467 | " 4.097606e-02 \n",
468 | " 1.659350e-02 \n",
469 | " -5.213911e-02 \n",
470 | " 1.342146e-03 \n",
471 | " 1.124383e-02 \n",
472 | " 22.000000 \n",
473 | " 0.000000 \n",
474 | " \n",
475 | " \n",
476 | " 75% \n",
477 | " 139320.500000 \n",
478 | " 1.315642e+00 \n",
479 | " 8.037239e-01 \n",
480 | " 1.027196e+00 \n",
481 | " 7.433413e-01 \n",
482 | " 6.119264e-01 \n",
483 | " 3.985649e-01 \n",
484 | " 5.704361e-01 \n",
485 | " 3.273459e-01 \n",
486 | " 5.971390e-01 \n",
487 | " ... \n",
488 | " 1.863772e-01 \n",
489 | " 5.285536e-01 \n",
490 | " 1.476421e-01 \n",
491 | " 4.395266e-01 \n",
492 | " 3.507156e-01 \n",
493 | " 2.409522e-01 \n",
494 | " 9.104512e-02 \n",
495 | " 7.827995e-02 \n",
496 | " 77.165000 \n",
497 | " 0.000000 \n",
498 | " \n",
499 | " \n",
500 | " max \n",
501 | " 172792.000000 \n",
502 | " 2.454930e+00 \n",
503 | " 2.205773e+01 \n",
504 | " 9.382558e+00 \n",
505 | " 1.687534e+01 \n",
506 | " 3.480167e+01 \n",
507 | " 7.330163e+01 \n",
508 | " 1.205895e+02 \n",
509 | " 2.000721e+01 \n",
510 | " 1.559499e+01 \n",
511 | " ... \n",
512 | " 2.720284e+01 \n",
513 | " 1.050309e+01 \n",
514 | " 2.252841e+01 \n",
515 | " 4.584549e+00 \n",
516 | " 7.519589e+00 \n",
517 | " 3.517346e+00 \n",
518 | " 3.161220e+01 \n",
519 | " 3.384781e+01 \n",
520 | " 25691.160000 \n",
521 | " 1.000000 \n",
522 | " \n",
523 | " \n",
524 | "
\n",
525 | "
8 rows × 31 columns
\n",
526 | "
"
527 | ],
528 | "text/plain": [
529 | " Time V1 V2 V3 V4 \\\n",
530 | "count 284807.000000 2.848070e+05 2.848070e+05 2.848070e+05 2.848070e+05 \n",
531 | "mean 94813.859575 3.919560e-15 5.688174e-16 -8.769071e-15 2.782312e-15 \n",
532 | "std 47488.145955 1.958696e+00 1.651309e+00 1.516255e+00 1.415869e+00 \n",
533 | "min 0.000000 -5.640751e+01 -7.271573e+01 -4.832559e+01 -5.683171e+00 \n",
534 | "25% 54201.500000 -9.203734e-01 -5.985499e-01 -8.903648e-01 -8.486401e-01 \n",
535 | "50% 84692.000000 1.810880e-02 6.548556e-02 1.798463e-01 -1.984653e-02 \n",
536 | "75% 139320.500000 1.315642e+00 8.037239e-01 1.027196e+00 7.433413e-01 \n",
537 | "max 172792.000000 2.454930e+00 2.205773e+01 9.382558e+00 1.687534e+01 \n",
538 | "\n",
539 | " V5 V6 V7 V8 V9 \\\n",
540 | "count 2.848070e+05 2.848070e+05 2.848070e+05 2.848070e+05 2.848070e+05 \n",
541 | "mean -1.552563e-15 2.010663e-15 -1.694249e-15 -1.927028e-16 -3.137024e-15 \n",
542 | "std 1.380247e+00 1.332271e+00 1.237094e+00 1.194353e+00 1.098632e+00 \n",
543 | "min -1.137433e+02 -2.616051e+01 -4.355724e+01 -7.321672e+01 -1.343407e+01 \n",
544 | "25% -6.915971e-01 -7.682956e-01 -5.540759e-01 -2.086297e-01 -6.430976e-01 \n",
545 | "50% -5.433583e-02 -2.741871e-01 4.010308e-02 2.235804e-02 -5.142873e-02 \n",
546 | "75% 6.119264e-01 3.985649e-01 5.704361e-01 3.273459e-01 5.971390e-01 \n",
547 | "max 3.480167e+01 7.330163e+01 1.205895e+02 2.000721e+01 1.559499e+01 \n",
548 | "\n",
549 | " ... V21 V22 V23 V24 \\\n",
550 | "count ... 2.848070e+05 2.848070e+05 2.848070e+05 2.848070e+05 \n",
551 | "mean ... 1.537294e-16 7.959909e-16 5.367590e-16 4.458112e-15 \n",
552 | "std ... 7.345240e-01 7.257016e-01 6.244603e-01 6.056471e-01 \n",
553 | "min ... -3.483038e+01 -1.093314e+01 -4.480774e+01 -2.836627e+00 \n",
554 | "25% ... -2.283949e-01 -5.423504e-01 -1.618463e-01 -3.545861e-01 \n",
555 | "50% ... -2.945017e-02 6.781943e-03 -1.119293e-02 4.097606e-02 \n",
556 | "75% ... 1.863772e-01 5.285536e-01 1.476421e-01 4.395266e-01 \n",
557 | "max ... 2.720284e+01 1.050309e+01 2.252841e+01 4.584549e+00 \n",
558 | "\n",
559 | " V25 V26 V27 V28 Amount \\\n",
560 | "count 2.848070e+05 2.848070e+05 2.848070e+05 2.848070e+05 284807.000000 \n",
561 | "mean 1.453003e-15 1.699104e-15 -3.660161e-16 -1.206049e-16 88.349619 \n",
562 | "std 5.212781e-01 4.822270e-01 4.036325e-01 3.300833e-01 250.120109 \n",
563 | "min -1.029540e+01 -2.604551e+00 -2.256568e+01 -1.543008e+01 0.000000 \n",
564 | "25% -3.171451e-01 -3.269839e-01 -7.083953e-02 -5.295979e-02 5.600000 \n",
565 | "50% 1.659350e-02 -5.213911e-02 1.342146e-03 1.124383e-02 22.000000 \n",
566 | "75% 3.507156e-01 2.409522e-01 9.104512e-02 7.827995e-02 77.165000 \n",
567 | "max 7.519589e+00 3.517346e+00 3.161220e+01 3.384781e+01 25691.160000 \n",
568 | "\n",
569 | " Class \n",
570 | "count 284807.000000 \n",
571 | "mean 0.001727 \n",
572 | "std 0.041527 \n",
573 | "min 0.000000 \n",
574 | "25% 0.000000 \n",
575 | "50% 0.000000 \n",
576 | "75% 0.000000 \n",
577 | "max 1.000000 \n",
578 | "\n",
579 | "[8 rows x 31 columns]"
580 | ]
581 | },
582 | "execution_count": 6,
583 | "metadata": {},
584 | "output_type": "execute_result"
585 | }
586 | ],
587 | "source": [
588 | "dataset.describe()"
589 | ]
590 | },
591 | {
592 | "cell_type": "code",
593 | "execution_count": 7,
594 | "metadata": {},
595 | "outputs": [
596 | {
597 | "data": {
598 | "text/plain": [
599 | "Index(['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6', 'V7', 'V8', 'V9', 'V10',\n",
600 | " 'V11', 'V12', 'V13', 'V14', 'V15', 'V16', 'V17', 'V18', 'V19', 'V20',\n",
601 | " 'V21', 'V22', 'V23', 'V24', 'V25', 'V26', 'V27', 'V28', 'Amount',\n",
602 | " 'Class'],\n",
603 | " dtype='object')"
604 | ]
605 | },
606 | "execution_count": 7,
607 | "metadata": {},
608 | "output_type": "execute_result"
609 | }
610 | ],
611 | "source": [
612 | "dataset.columns"
613 | ]
614 | },
615 | {
616 | "cell_type": "code",
617 | "execution_count": 12,
618 | "metadata": {},
619 | "outputs": [
620 | {
621 | "data": {
622 | "text/html": [
623 | "\n",
624 | "\n",
637 | "
\n",
638 | " \n",
639 | " \n",
640 | " \n",
641 | " V1 \n",
642 | " V2 \n",
643 | " V3 \n",
644 | " V4 \n",
645 | " V5 \n",
646 | " V6 \n",
647 | " V7 \n",
648 | " V8 \n",
649 | " V9 \n",
650 | " V10 \n",
651 | " ... \n",
652 | " V21 \n",
653 | " V22 \n",
654 | " V23 \n",
655 | " V24 \n",
656 | " V25 \n",
657 | " V26 \n",
658 | " V27 \n",
659 | " V28 \n",
660 | " Class \n",
661 | " normalizedAmount \n",
662 | " \n",
663 | " \n",
664 | " \n",
665 | " \n",
666 | " 0 \n",
667 | " -1.359807 \n",
668 | " -0.072781 \n",
669 | " 2.536347 \n",
670 | " 1.378155 \n",
671 | " -0.338321 \n",
672 | " 0.462388 \n",
673 | " 0.239599 \n",
674 | " 0.098698 \n",
675 | " 0.363787 \n",
676 | " 0.090794 \n",
677 | " ... \n",
678 | " -0.018307 \n",
679 | " 0.277838 \n",
680 | " -0.110474 \n",
681 | " 0.066928 \n",
682 | " 0.128539 \n",
683 | " -0.189115 \n",
684 | " 0.133558 \n",
685 | " -0.021053 \n",
686 | " 0 \n",
687 | " 0.244964 \n",
688 | " \n",
689 | " \n",
690 | " 1 \n",
691 | " 1.191857 \n",
692 | " 0.266151 \n",
693 | " 0.166480 \n",
694 | " 0.448154 \n",
695 | " 0.060018 \n",
696 | " -0.082361 \n",
697 | " -0.078803 \n",
698 | " 0.085102 \n",
699 | " -0.255425 \n",
700 | " -0.166974 \n",
701 | " ... \n",
702 | " -0.225775 \n",
703 | " -0.638672 \n",
704 | " 0.101288 \n",
705 | " -0.339846 \n",
706 | " 0.167170 \n",
707 | " 0.125895 \n",
708 | " -0.008983 \n",
709 | " 0.014724 \n",
710 | " 0 \n",
711 | " -0.342475 \n",
712 | " \n",
713 | " \n",
714 | " 2 \n",
715 | " -1.358354 \n",
716 | " -1.340163 \n",
717 | " 1.773209 \n",
718 | " 0.379780 \n",
719 | " -0.503198 \n",
720 | " 1.800499 \n",
721 | " 0.791461 \n",
722 | " 0.247676 \n",
723 | " -1.514654 \n",
724 | " 0.207643 \n",
725 | " ... \n",
726 | " 0.247998 \n",
727 | " 0.771679 \n",
728 | " 0.909412 \n",
729 | " -0.689281 \n",
730 | " -0.327642 \n",
731 | " -0.139097 \n",
732 | " -0.055353 \n",
733 | " -0.059752 \n",
734 | " 0 \n",
735 | " 1.160686 \n",
736 | " \n",
737 | " \n",
738 | " 3 \n",
739 | " -0.966272 \n",
740 | " -0.185226 \n",
741 | " 1.792993 \n",
742 | " -0.863291 \n",
743 | " -0.010309 \n",
744 | " 1.247203 \n",
745 | " 0.237609 \n",
746 | " 0.377436 \n",
747 | " -1.387024 \n",
748 | " -0.054952 \n",
749 | " ... \n",
750 | " -0.108300 \n",
751 | " 0.005274 \n",
752 | " -0.190321 \n",
753 | " -1.175575 \n",
754 | " 0.647376 \n",
755 | " -0.221929 \n",
756 | " 0.062723 \n",
757 | " 0.061458 \n",
758 | " 0 \n",
759 | " 0.140534 \n",
760 | " \n",
761 | " \n",
762 | " 4 \n",
763 | " -1.158233 \n",
764 | " 0.877737 \n",
765 | " 1.548718 \n",
766 | " 0.403034 \n",
767 | " -0.407193 \n",
768 | " 0.095921 \n",
769 | " 0.592941 \n",
770 | " -0.270533 \n",
771 | " 0.817739 \n",
772 | " 0.753074 \n",
773 | " ... \n",
774 | " -0.009431 \n",
775 | " 0.798278 \n",
776 | " -0.137458 \n",
777 | " 0.141267 \n",
778 | " -0.206010 \n",
779 | " 0.502292 \n",
780 | " 0.219422 \n",
781 | " 0.215153 \n",
782 | " 0 \n",
783 | " -0.073403 \n",
784 | " \n",
785 | " \n",
786 | "
\n",
787 | "
5 rows × 30 columns
\n",
788 | "
"
789 | ],
790 | "text/plain": [
791 | " V1 V2 V3 V4 V5 V6 V7 \\\n",
792 | "0 -1.359807 -0.072781 2.536347 1.378155 -0.338321 0.462388 0.239599 \n",
793 | "1 1.191857 0.266151 0.166480 0.448154 0.060018 -0.082361 -0.078803 \n",
794 | "2 -1.358354 -1.340163 1.773209 0.379780 -0.503198 1.800499 0.791461 \n",
795 | "3 -0.966272 -0.185226 1.792993 -0.863291 -0.010309 1.247203 0.237609 \n",
796 | "4 -1.158233 0.877737 1.548718 0.403034 -0.407193 0.095921 0.592941 \n",
797 | "\n",
798 | " V8 V9 V10 ... V21 V22 V23 V24 \\\n",
799 | "0 0.098698 0.363787 0.090794 ... -0.018307 0.277838 -0.110474 0.066928 \n",
800 | "1 0.085102 -0.255425 -0.166974 ... -0.225775 -0.638672 0.101288 -0.339846 \n",
801 | "2 0.247676 -1.514654 0.207643 ... 0.247998 0.771679 0.909412 -0.689281 \n",
802 | "3 0.377436 -1.387024 -0.054952 ... -0.108300 0.005274 -0.190321 -1.175575 \n",
803 | "4 -0.270533 0.817739 0.753074 ... -0.009431 0.798278 -0.137458 0.141267 \n",
804 | "\n",
805 | " V25 V26 V27 V28 Class normalizedAmount \n",
806 | "0 0.128539 -0.189115 0.133558 -0.021053 0 0.244964 \n",
807 | "1 0.167170 0.125895 -0.008983 0.014724 0 -0.342475 \n",
808 | "2 -0.327642 -0.139097 -0.055353 -0.059752 0 1.160686 \n",
809 | "3 0.647376 -0.221929 0.062723 0.061458 0 0.140534 \n",
810 | "4 -0.206010 0.502292 0.219422 0.215153 0 -0.073403 \n",
811 | "\n",
812 | "[5 rows x 30 columns]"
813 | ]
814 | },
815 | "execution_count": 12,
816 | "metadata": {},
817 | "output_type": "execute_result"
818 | }
819 | ],
820 | "source": [
821 | "## Prepocessing \n",
822 | "from sklearn.preprocessing import StandardScaler\n",
823 | "\n",
824 | "# Create a new copy\n",
825 | "dataset2 = dataset\n",
826 | "# Normalizace the Amount between -1,1\n",
827 | "dataset2['normalizedAmount'] = StandardScaler().fit_transform(dataset2['Amount'].values.reshape(-1,1))\n",
828 | "\n",
829 | "dataset2 = dataset2.drop(columns = ['Amount','Time'])\n",
830 | "dataset2.head()"
831 | ]
832 | },
833 | {
834 | "cell_type": "code",
835 | "execution_count": 14,
836 | "metadata": {},
837 | "outputs": [],
838 | "source": [
839 | "## Define the training data and the responde variable from the dataset\n",
840 | "y = dataset2['Class']\n",
841 | "X = dataset2.drop(columns = ['Class'])"
842 | ]
843 | },
844 | {
845 | "cell_type": "code",
846 | "execution_count": 17,
847 | "metadata": {},
848 | "outputs": [],
849 | "source": [
850 | "# Generate the Split train/test datasets\n",
851 | "from sklearn.model_selection import train_test_split\n",
852 | "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state=0)"
853 | ]
854 | },
855 | {
856 | "cell_type": "code",
857 | "execution_count": 18,
858 | "metadata": {},
859 | "outputs": [
860 | {
861 | "data": {
862 | "text/plain": [
863 | "(199364, 29)"
864 | ]
865 | },
866 | "execution_count": 18,
867 | "metadata": {},
868 | "output_type": "execute_result"
869 | }
870 | ],
871 | "source": [
872 | "X_train.shape"
873 | ]
874 | },
875 | {
876 | "cell_type": "code",
877 | "execution_count": 19,
878 | "metadata": {},
879 | "outputs": [
880 | {
881 | "data": {
882 | "text/plain": [
883 | "(85443, 29)"
884 | ]
885 | },
886 | "execution_count": 19,
887 | "metadata": {},
888 | "output_type": "execute_result"
889 | }
890 | ],
891 | "source": [
892 | "X_test.shape"
893 | ]
894 | },
895 | {
896 | "cell_type": "code",
897 | "execution_count": 20,
898 | "metadata": {},
899 | "outputs": [],
900 | "source": [
901 | "# convert split sets to arrays\n",
902 | "X_train = np.array(X_train)\n",
903 | "X_test = np.array(X_test)\n",
904 | "y_train = np.array(y_train)\n",
905 | "y_test = np.array(y_test)"
906 | ]
907 | },
908 | {
909 | "cell_type": "code",
910 | "execution_count": 39,
911 | "metadata": {},
912 | "outputs": [],
913 | "source": [
914 | "### Deep Neural Network\n",
915 | "\n",
916 | "# import keras layers, optimizer & callbacks\n",
917 | "from tensorflow.keras import Sequential\n",
918 | "from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout"
919 | ]
920 | },
921 | {
922 | "cell_type": "code",
923 | "execution_count": 47,
924 | "metadata": {},
925 | "outputs": [
926 | {
927 | "name": "stdout",
928 | "output_type": "stream",
929 | "text": [
930 | "Model: \"sequential_5\"\n",
931 | "_________________________________________________________________\n",
932 | "Layer (type) Output Shape Param # \n",
933 | "=================================================================\n",
934 | "dense_19 (Dense) (None, 16) 480 \n",
935 | "_________________________________________________________________\n",
936 | "dense_20 (Dense) (None, 24) 408 \n",
937 | "_________________________________________________________________\n",
938 | "dropout_3 (Dropout) (None, 24) 0 \n",
939 | "_________________________________________________________________\n",
940 | "dense_21 (Dense) (None, 20) 500 \n",
941 | "_________________________________________________________________\n",
942 | "dense_22 (Dense) (None, 24) 504 \n",
943 | "_________________________________________________________________\n",
944 | "dense_23 (Dense) (None, 1) 25 \n",
945 | "=================================================================\n",
946 | "Total params: 1,917\n",
947 | "Trainable params: 1,917\n",
948 | "Non-trainable params: 0\n",
949 | "_________________________________________________________________\n"
950 | ]
951 | }
952 | ],
953 | "source": [
954 | "# Create DNN model\n",
955 | "dnn_model = Sequential()\n",
956 | "\n",
957 | "# First stage of Dense layers\n",
958 | "dnn_model.add(Dense(units=16, input_dim=29,activation='relu'))\n",
959 | "dnn_model.add(Dense(units=24, activation='relu'))\n",
960 | "\n",
961 | "# Dropout to avoid oferfitting\n",
962 | "dnn_model.add(Dropout(rate=0.4))\n",
963 | "\n",
964 | "# Second stage of Dense layers\n",
965 | "dnn_model.add(Dense(units=20,activation='relu'))\n",
966 | "dnn_model.add(Dense(units=24,activation='relu'))\n",
967 | "# Last layer with sigmoid activation. Binary decision\n",
968 | "dnn_model.add(Dense(units=1,activation='sigmoid'))\n",
969 | "\n",
970 | "#Summary the model\n",
971 | "dnn_model.summary() \n"
972 | ]
973 | },
974 | {
975 | "cell_type": "code",
976 | "execution_count": 50,
977 | "metadata": {},
978 | "outputs": [],
979 | "source": [
980 | "### Training ###\n",
981 | "\n",
982 | "# Define the compile trainning params\n",
983 | "dnn_model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])"
984 | ]
985 | },
986 | {
987 | "cell_type": "code",
988 | "execution_count": 53,
989 | "metadata": {},
990 | "outputs": [
991 | {
992 | "name": "stdout",
993 | "output_type": "stream",
994 | "text": [
995 | "Epoch 1/5\n",
996 | "199364/199364 [==============================] - 15s 76us/sample - loss: 0.0038 - acc: 0.9993\n",
997 | "Epoch 2/5\n",
998 | "199364/199364 [==============================] - 16s 80us/sample - loss: 0.0036 - acc: 0.9994\n",
999 | "Epoch 3/5\n",
1000 | "199364/199364 [==============================] - 16s 81us/sample - loss: 0.0034 - acc: 0.9994\n",
1001 | "Epoch 4/5\n",
1002 | "199364/199364 [==============================] - 16s 81us/sample - loss: 0.0032 - acc: 0.9994\n",
1003 | "Epoch 5/5\n",
1004 | "199364/199364 [==============================] - 15s 77us/sample - loss: 0.0032 - acc: 0.9994\n"
1005 | ]
1006 | },
1007 | {
1008 | "data": {
1009 | "text/plain": [
1010 | ""
1011 | ]
1012 | },
1013 | "execution_count": 53,
1014 | "metadata": {},
1015 | "output_type": "execute_result"
1016 | }
1017 | ],
1018 | "source": [
1019 | "# train the model\n",
1020 | "dnn_model.fit(X_train, \n",
1021 | " y_train, \n",
1022 | " batch_size= 15,\n",
1023 | " epochs = 5)"
1024 | ]
1025 | },
1026 | {
1027 | "cell_type": "code",
1028 | "execution_count": 54,
1029 | "metadata": {},
1030 | "outputs": [
1031 | {
1032 | "name": "stdout",
1033 | "output_type": "stream",
1034 | "text": [
1035 | "85443/85443 [==============================] - 2s 18us/sample - loss: 0.0039 - acc: 0.9994\n"
1036 | ]
1037 | }
1038 | ],
1039 | "source": [
1040 | "## DNN model evaluation\n",
1041 | "score = dnn_model.evaluate(X_test,y_test)"
1042 | ]
1043 | },
1044 | {
1045 | "cell_type": "code",
1046 | "execution_count": 55,
1047 | "metadata": {},
1048 | "outputs": [
1049 | {
1050 | "name": "stdout",
1051 | "output_type": "stream",
1052 | "text": [
1053 | "[0.003928133984745635, 0.99939144]\n"
1054 | ]
1055 | }
1056 | ],
1057 | "source": [
1058 | "# Print eval result \n",
1059 | "print(score)"
1060 | ]
1061 | },
1062 | {
1063 | "cell_type": "code",
1064 | "execution_count": null,
1065 | "metadata": {},
1066 | "outputs": [],
1067 | "source": [
1068 | "## Metrics -->\n",
1069 | "\n",
1070 | "# Accuracy: (TRUE Positives + TRUE Negatives) / Total\n",
1071 | "# Precision: TRUE Positives / (TRUE Positives + FALSE Positives)\n",
1072 | "# Specificity: TRUE Negatives / (FALSE Positives + TRUE Negatives)\n",
1073 | "# Recall: TRUE Positives / (TRUE Positives + FALSE Negatives)\n",
1074 | "# F1-score:"
1075 | ]
1076 | },
1077 | {
1078 | "cell_type": "code",
1079 | "execution_count": 75,
1080 | "metadata": {},
1081 | "outputs": [],
1082 | "source": [
1083 | "# Getting the prediction from the test dataset\n",
1084 | "y_pred = dnn_model.predict(X_test).round()\n",
1085 | "# convert array to dataframe\n",
1086 | "y_test = pd.DataFrame(y_test)"
1087 | ]
1088 | },
1089 | {
1090 | "cell_type": "code",
1091 | "execution_count": 78,
1092 | "metadata": {},
1093 | "outputs": [],
1094 | "source": [
1095 | "# Evaluate the quality of the model\n",
1096 | "from sklearn.metrics import confusion_matrix, accuracy_score, f1_score, precision_score, recall_score\n",
1097 | "\n",
1098 | "cnf_matrix = confusion_matrix(y_test, y_pred)"
1099 | ]
1100 | },
1101 | {
1102 | "cell_type": "code",
1103 | "execution_count": 79,
1104 | "metadata": {},
1105 | "outputs": [],
1106 | "source": [
1107 | "# Accuracy score\n",
1108 | "acc = accuracy_score(y_test, y_pred)"
1109 | ]
1110 | },
1111 | {
1112 | "cell_type": "code",
1113 | "execution_count": 80,
1114 | "metadata": {},
1115 | "outputs": [],
1116 | "source": [
1117 | "# precision score (When is 0 and should be 1 and the other way round)\n",
1118 | "pre = precision_score(y_test, y_pred)"
1119 | ]
1120 | },
1121 | {
1122 | "cell_type": "code",
1123 | "execution_count": 81,
1124 | "metadata": {},
1125 | "outputs": [],
1126 | "source": [
1127 | "# recall score\n",
1128 | "rec = recall_score(y_test, y_pred)"
1129 | ]
1130 | },
1131 | {
1132 | "cell_type": "code",
1133 | "execution_count": 82,
1134 | "metadata": {},
1135 | "outputs": [],
1136 | "source": [
1137 | "f1 = f1_score(y_test, y_pred)"
1138 | ]
1139 | },
1140 | {
1141 | "cell_type": "code",
1142 | "execution_count": 83,
1143 | "metadata": {},
1144 | "outputs": [
1145 | {
1146 | "data": {
1147 | "text/html": [
1148 | "\n",
1149 | "\n",
1162 | "
\n",
1163 | " \n",
1164 | " \n",
1165 | " \n",
1166 | " Model \n",
1167 | " Accuracy \n",
1168 | " Precision \n",
1169 | " Recall \n",
1170 | " F1 score \n",
1171 | " \n",
1172 | " \n",
1173 | " \n",
1174 | " \n",
1175 | " 0 \n",
1176 | " Deep Neural Network \n",
1177 | " 0.999391 \n",
1178 | " 0.836879 \n",
1179 | " 0.802721 \n",
1180 | " 0.819444 \n",
1181 | " \n",
1182 | " \n",
1183 | "
\n",
1184 | "
"
1185 | ],
1186 | "text/plain": [
1187 | " Model Accuracy Precision Recall F1 score\n",
1188 | "0 Deep Neural Network 0.999391 0.836879 0.802721 0.819444"
1189 | ]
1190 | },
1191 | "execution_count": 83,
1192 | "metadata": {},
1193 | "output_type": "execute_result"
1194 | }
1195 | ],
1196 | "source": [
1197 | "# Metrics\n",
1198 | "results = pd.DataFrame([['Deep Neural Network', acc, pre, rec, f1]],\n",
1199 | " columns = ['Model','Accuracy','Precision','Recall','F1 score'])\n",
1200 | "\n",
1201 | "results"
1202 | ]
1203 | },
1204 | {
1205 | "cell_type": "code",
1206 | "execution_count": 84,
1207 | "metadata": {},
1208 | "outputs": [
1209 | {
1210 | "name": "stdout",
1211 | "output_type": "stream",
1212 | "text": [
1213 | "Accuracy 0.9994\n"
1214 | ]
1215 | },
1216 | {
1217 | "data": {
1218 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAa4AAAE0CAYAAAB0CNe/AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAgAElEQVR4nO3daZgV1bn28f/dIIpBRXFGPIpCnM6ROMUhMUYNglEhJlGMUaImJDnOY0RNnGOSk8QRTYigeBIF9WjEEX2JJGoccFZEpREVBFEEJyYBn/dDrdYNNt29m950V+37d111sWvVqqpnY9sPa6hVigjMzMzyoqa1AzAzMyuHE5eZmeWKE5eZmeWKE5eZmeWKE5eZmeWKE5eZmeWKE5e1aZI6SrpT0geSblmB6xwu6f6WjK21SPq6pFdaOw6z1iI/x2UtQdIPgFOArYCPgGeBiyPi4RW87hHA8cDuEbF4hQNt4yQF0CMials7FrO2yi0uW2GSTgEuA34NbABsClwN9GuBy/8H8Go1JK2mkNS+tWMwa21OXLZCJK0FXAAcGxG3RcTciFgUEXdGxOmpzqqSLpM0PW2XSVo1HdtL0jRJp0p6R9IMSUelY+cDvwIOlfSxpGMknSfpryX330xS1P1Cl/QjSa9J+kjSFEmHl5Q/XHLe7pLGpy7I8ZJ2Lzk2TtKFkh5J17lf0rrL+f518Z9REn9/SftLelXSbElnldTfRdKjkt5Pda+S1CEd+1eq9lz6voeWXP8Xkt4GrqsrS+dske6xQ9rfWNIsSXut0H9YszbMictW1G7AasDtDdQ5G9gV6AVsD+wCnFNyfENgLaArcAwwRNLaEXEuWStuVER0iohhDQUi6UvAFUDfiFgD2J2sy3LZeusAd6e6XYA/AndL6lJS7QfAUcD6QAfgtAZuvSHZ30FXskT7F+CHwI7A14FfSeqe6i4BTgbWJfu72wf4b4CI2DPV2T5931El11+HrPU5qPTGETEZ+AXwN0mrA9cB10fEuAbiNcs1Jy5bUV2AWY105R0OXBAR70TEu8D5wBElxxel44si4h7gY+DLzYznU2A7SR0jYkZETKinzreBSRHxvxGxOCJuAl4GDiypc11EvBoR84GbyZLu8iwiG89bBIwkS0qXR8RH6f4TgP8CiIinIuKxdN/XgT8D32jCdzo3IhameJYSEX8BJgGPAxuR/UPBrLCcuGxFvQes28jYy8bAGyX7b6Syz66xTOKbB3QqN5CImAscCvwMmCHpbklbNSGeupi6luy/XUY870XEkvS5LrHMLDk+v+58ST0l3SXpbUkfkrUo6+2GLPFuRCxopM5fgO2AKyNiYSN1zXLNictW1KPAAqB/A3Wmk3Vz1dk0lTXHXGD1kv0NSw9GxJiI+BZZy+Nlsl/ojcVTF9NbzYypHNeQxdUjItYEzgLUyDkNTv2V1Ilscsww4LzUFWpWWE5ctkIi4gOycZ0haVLC6pJWkdRX0u9StZuAcyStlyY5/Ar46/Ku2YhngT0lbZomhgyuOyBpA0kHpbGuhWRdjkvqucY9QE9JP5DUXtKhwDbAXc2MqRxrAB8CH6fW4M+XOT4T6P6Fsxp2OfBURPyYbOzuTyscpVkb5sRlKywi/kj2DNc5wLvAVOA44O+pykXAk8DzwAvA06msOfd6ABiVrvUUSyebGuBUshbVbLKxo/+u5xrvAQekuu8BZwAHRMSs5sRUptPIJn58RNYaHLXM8fOAEWnW4SGNXUxSP6APWfcoZP8ddqibTWlWRH4A2czMcsUtLjMzyxUnLjMzyxUnLjMzyxUnLjMzyxUnLjMzyxUnLms1kpZIelbSi5JuSWvtNfdae0m6K30+SNKZDdTtLOkL0+SbcI/zJH1hzcLllS9T53pJ3yvjXptJerHcGM2qgROXtab5EdErIrYDPuHzZ5EAUKbsn9GIGB0Rv2mgSmfqeb7LzPLBicvaioeALVNLY6Kkq8keVO4mqXd6FcjTqWVWt+5fH0kvp9eVHFx3ofQKk6vS5w0k3S7pubTtDvwG2CK19v4n1Ts9vd7keWWvU6m71tmSXpH0/2jCwr+SfpKu85yk/1umFbmvpIfS604OSPXbSfqfknv/dEX/Is2KzonLWl1aoLcv2aoakCWIGyLiK2RrE54D7BsRO5CtwHGKpNXIVp44kOzVIRt+4cKZK4B/RsT2wA5kK7WfCUxOrb3TJfUGepC9bqUXsKOkPSXtCAwAvkKWGHduwte5LSJ2TvebSPaaljqbka3m8W3gT+k7HAN8EBE7p+v/RNLmTbiPWdXy21StNXWUVPe+rIfIFondGHgjIh5L5buSrSP4iCTI3o31KLAVMCUiJgEoe7nkUu+qSvYGjgRIK7h/IGntZer0Ttszab8TWSJbA7g9Iuale4xuwnfaTtJFZN2RnYAxJcdujohPgUmSXkvfoTfwXyXjX2ule7/ahHuZVSUnLmtN8yNiqfdcpeQ0t7QIeCAiDlumXi8aWTW9DAIuiYg/L3OPk5pxj+uB/hHxnKQfAXuVHFv2WpHufXxElCY4JG1W5n3Nqoa7Cq2tewzYQ9KWAGn1+Z5krwbZXNIWqd5hyzl/LGkF9jSetCbZArdrlNQZAxxdMnbWVdL6wL+A70jqKGkNln7R5PKsQfYusFXIXqBZ6vuSalLM3YFX0r1/nurXva/rS024j1nVcovL2rSIeDe1XG6StGoqPiciXpU0CLhb0izgYbIXKS7rRGCopGPIXnHy84h4VNIjabr5vWmca2vg0dTi+xj4YUQ8LWkU2atU3iDrzmzML8neRPwG2ZhdaYJ8BfgnsAHws4hYIOlasrGvp5Xd/F0afreZWdXz6vBmZpYr7io0M7NcceIyM7NcabNjXB03Pcx9mLZSzX/z/MYrmbW4nmrJq5X7u3P+mze16P1XBre4zMwsV9psi8vMzMrXjOU9c8eJy8ysQFQFHWlOXGZmBeIWl5mZ5YoTl5mZ5Upa/aXQnLjMzArFLS4zM8sRdxWamVmuOHGZmVmueDq8mZnliltcZmaWK05cZmaWK05cZmaWK8LPcZmZWY64xWVmZrlSU1P8X+vF/4ZmZlXFLS4zM8sRdxWamVmuOHGZmVmueOUMMzPLFbe4zMwsV/w+LjMzyxW3uMzMLFc8xmVmZrniFpeZmeVKNSSu4n9DM7MqImrK2hq9nvRlSc+WbB9KOknSeZLeKinfv+ScwZJqJb0iab+S8j6prFbSmSXlm0t6XNIkSaMkdWgoJicuM7MiUU15WyMi4pWI6BURvYAdgXnA7enwpXXHIuIeAEnbAAOAbYE+wNWS2klqBwwB+gLbAIelugC/TdfqAcwBjmkoJicuM7MCkWrK2sq0DzA5It5ooE4/YGRELIyIKUAtsEvaaiPitYj4BBgJ9FM2f39v4NZ0/gigf0NBOHGZmRWIpHK3QZKeLNkGNXD5AcBNJfvHSXpe0nBJa6eyrsDUkjrTUtnyyrsA70fE4mXKl8uJy8ysQMod44qIoRGxU8k2tN7rZuNOBwG3pK
JrgC2AXsAM4A+fhfBF0Yzy5fKsQjOzAqngrMK+wNMRMROg7s/snvoLcFfanQZ0KzlvE2B6+lxf+Sygs6T2qdVVWr9ebnGZmRWJVN7WdIdR0k0oaaOSY98BXkyfRwMDJK0qaXOgB/AEMB7okWYQdiDrdhwdEQE8CHwvnT8QuKOhQNziMjMrkgo0RyStDnwL+GlJ8e8k9SLr1nu97lhETJB0M/ASsBg4NiKWpOscB4wB2gHDI2JCutYvgJGSLgKeAYY1FI8Tl5lZkVRgkd2ImEc2iaK07IgG6l8MXFxP+T3APfWUv0Y267BJnLjMzIrEq8ObmVmuVMHMBScuM7MCCbe4zMwsV4qft5y4zMwKpab4mcuJy8ysSNxVaGZmuVL8vOXEZWZWKO4qNDOzXHFXoZmZ5Urx85YTl5lZobir0MzMcqX4ecuJy8ysSLxyhpmZ5Yu7Cs3MLFeKn7ecuMzMCsVdhWZmlivuKjQzs1wpft5y4jIzK5Sa4r9J0onLzKxIip+3nLjMzArFkzPMzCxXip+3nLjMzIokPKvQWtPxx/TlR4ftTUQw4eWpDDrtT1z562P4+le35oOP5gEw6NQ/8fxLbzCg/x6c8vODAJg7dwEnnD2MFya+SY/uG/G/Q0747Jqbb7o+F/7xVq4adi+/OvX7HNB7Jz799FPefe9DBp36J2bMnNMq39XyY8aMdznjjEuZNWsONTXikEP6MHDgQVx22V8ZO/ZxampEly5rccklJ7HBBl1aO9zqUwVdhYqI1o6hXh03PaxtBraSbLzB2oz9v/P4yj6nsWDhIv569Ync949n2HO3bbh37NPcfs8TS9XfdccevFw7nfc/mEvvvbbnnJO/x579frlUnZoaMfmJq/lGv1/y5luzWKNTRz76eD4A/33UfmzVYxNOOGvYSvuObc38N89v7RBy4Z13ZvPuu7PZdtst+fjjeXz3uyczZMjZbLjhunTqtDoAN9wwmtraqVxwwbGtHG0e9GzRTLPF4TeV9btz8t8Oy12mq1iLS9JWQD+gKxDAdGB0REys1D2Lpn37dnRcrQOLFi+hY8cODbaGHntq0mefn3imlq4brfOFOt/cYzumvDmTN9+aBfBZ0gJYffXVaKv/iLG2Zf3112H99bOfr06dVqd7927MnPkeW2656Wd15s9fiKrgX/5tUhV0FVZk4qSkXwAjyYYJnwDGp883STqzEvcsmukz53DZ0Lt49bGrmPLkNXz44TzGPvQCAOedfihPjPktv/vVEXTo8MV/e/zo0L0Y8+CzXyj//kG7c/Md/16q7LzTD2HSY1cxoP8eXPiHWyrzZaywpk2bycSJk9l++y8DcOmlN/CNbxzFnXeO48QTD2/l6KqUVN6WQxXpKpT0KrBtRCxaprwDMCEieiznvEHAIID2a++0Y/tOW7Z4bHnRea0vcdOfTuaIYy/n/Q/nceM1J3LbPU8w7pEXefud9+nQoT1DfvMTXntjJpdcfttn5+252zZcftHR7PPd85j9/sefla+ySjteG38NO+57Ou/M+uAL9zvt2H6stuoqXPTHW1fK92uL3FVYnrlz53PEEYP52c8OoXfv3Zc69uc/38LChZ9wwglOXo1r4a7CgaPK6yoccWjuslelHlX7FNi4nvKN0rF6RcTQiNgpInaq5qQFsPfXtuP1qe8wa/ZHLF68hL/fN55dd+zJ2++8D8AnnyzmhpvHsVOvLT47Z7utNuWa3w3i+z/+/VJJC2C/vXrx7ItT6k1aADf//RH6992lcl/ICmXRosWccMIlHHjgXl9IWgAHHPAN7r//3/WcaRVXo/K2HKrUGNdJwFhJk4CpqWxTYEvguArds1CmvjWLXXboQcfVOjB/wSd8c4/tePr519hw/c6fJa+D9tuZl17J/nq7bdyFkUNP5piThlA75e0vXO+Qfl/sJtxisw2Z/HpW99vf2pFXJ0+v8LeyIogIzj77Crp378ZRR/X/rPz116ez2WbZv1f/8Y/H6d59k9YKsbrlNBmVoyKJKyLuk9QT2IVscoaAacD4iFhSiXsWzfhnJ3P7PY/z6D2/ZvGST3luwusMu3Esd4w4k3W7rIEknp/wBsefdS0Ag088mHXW7sRlFx0NwOIln/K1A84GoONqHdj76//JcYOvXeoeF505gB5bbMynnwZvvvUuJwyu3hmF1nRPPfUSd9zxID17bka/ftmjFqecciS33no/U6a8hVRD167rcf75nlHYGqL4ecvT4c3qeIzLWkfLjnF1H3RrWb87Xxv6vdylOj+AbGZWJDmdKVgOJy4zsyLxGJeZmeVKFbzWpAq+oplZFanAA8iSOku6VdLLkiZK2k3SOpIekDQp/bl2qitJV0iqlfS8pB1KrjMw1Z8kaWBJ+Y6SXkjnXKFGll1x4jIzK5LKPMd1OXBfRGwFbA9MBM4ExqYFJcamfYC+QI+0DQKuAZC0DnAu8FWyGefn1iW7VGdQyXl9GvyKTY3azMzavpDK2hojaU1gT2AYQER8EhHvk61FOyJVGwHUPdTXD7ghMo8BnSVtBOwHPBARsyNiDvAA0CcdWzMiHo1smvsNJdeqlxOXmVmR1JS3SRok6cmSbdAyV+wOvAtcJ+kZSddK+hKwQUTMAEh/rp/qd+XzhScge4a3ayPl0+opXy5PzjAzK5IyZxVGxFBgaANV2gM7AMdHxOOSLufzbsH61BdANKN8udziMjMrkpafnDENmBYRj6f9W8kS2czUzUf6852S+t1Kzt+E7LVWDZVvUk/5cjlxmZkVSQtPzoiIt4Gpkr6civYBXgJGA3UzAwcCd6TPo4Ej0+zCXYEPUlfiGKC3pLXTpIzewJh07CNJu6bZhEeWXKte7io0MyuSyjx/fDzwt/RqqteAo8gaPjdLOgZ4E/h+qnsPsD9QC8xLdYmI2ZIuJHs/I8AFETE7ff45cD3QEbg3bcvlxGVmViBRgZUzIuJZYKd6Du1TT90A6l1hOSKGA8PrKX8S2K6p8ThxmZkViZd8MjOzXPEiu2ZmlitVMOXOicvMrEjc4jIzs1zxGJeZmeWKE5eZmeVJUxbOzTsnLjOzIvHkDDMzyxW3uMzMLFc8xmVmZrnixGVmZrlS/LzlxGVmViTRrvizM5y4zMyKxF2FZmaWK8XPW05cZmZFUlP8nkInLjOzIqmCx7iWn7gkrdPQiSWvXDYzszaiqhMX8BQQ1N9jGkD3ikRkZmbNpirIXMtNXBGx+coMxMzMVlwV5K3Gl2NU5oeSfpn2N5W0S+VDMzOzcknlbXnUlPknVwO7AT9I+x8BQyoWkZmZNZtqytvyqCmzCr8aETtIegYgIuZI6lDhuMzMrBny2ooqR1MS1yJJ7cgmZCBpPeDTikZlZmbNUgULZzSpq/AK4HZgA0kXAw8Dv65oVGZm1izVMMbVaIsrIv4m6Slgn1TUPyImVjYsMzNrjrwmo3I0deWM1YG67sKOlQvHzMxWRDU8x9WU6fC/AkYA6wDrAtdJOqfSgZmZWfk8qzBzGPCViFgAIOk3wNPARZUMzMzMylcFDa4mJa7XgdWABWl/VWBypQIyM7Pmq+rEJ
elKsjGthcAESQ+k/W+RzSw0M7M2pqoTF/Bk+vMpsunwdcZVLBozM1sh1fAcV0OL7I5YmYGYmdmKq/YWFwCSegCXANuQjXUBEBF+rYmZWRvjxJW5DjgXuBT4JnAU9b+jy8zMWpmqoK+wKbP4O0bEWEAR8UZEnAfsXdmwzMysOSqx5JOkdpKekXRX2r9e0hRJz6atVyqXpCsk1Up6XtIOJdcYKGlS2gaWlO8o6YV0zhVqwhPUTWlxLZBUA0ySdBzwFrB+076umZmtTBXqKjwRmAisWVJ2ekTcuky9vkCPtH0VuAb4qqR1yHrudiKbnf6UpNERMSfVGQQ8BtwD9AHubSiYprS4TiJb8ukEYEfgCGBgg2eYmVmraOkWl6RNgG8D1zbh9v2AGyLzGNBZ0kbAfsADETE7JasHgD7p2JoR8WhEBHAD0L+xmzRlkd3x6ePHZONbZmbWRpU7xCVpEFmLp87QiBhasn8ZcAawxjKnXpyWBBwLnBkRC4GuwNSSOtNSWUPl0+opb1BDDyDfSXoHV30i4qDGLm5mZitXuV2FKUkNre+YpAOAdyLiKUl7lRwaDLwNdEjn/gK4gPon7kUzyhvUUIvr942dbGZmbUsLL5y7B3CQpP3JHodaU9JfI+KH6fhCSdcBp6X9aUC3kvM3Aaan8r2WKR+Xyjepp36DGnoA+Z+NnWxmZm1LS07OiIjBZK0rUovrtIj4oaSNImJGmgHYH3gxnTIaOE7SSLLJGR+kemOAX0taO9XrDQyOiNmSPpK0K/A4cCRwZWNxNfV9XGZmlgMr6X1cf5O0HllX37PAz1L5PcD+QC0wjzQvIiWoC4G6ORMXRMTs9PnnwPVk73q8l0ZmFIITl5lZoVQqb0XEONJatRFR77O8aWbgscs5NhwYXk/5k8B25cTixGVmViBVveRTa88qnP/m+ZW8vJlZIVV14sKzCs3McqcKlir0rEIzsyKp6sRVx681MTPLjxo1+vxu7vm1JmZmBdK+Cn47+7UmZmYFUqMoa8sjv9bEzKxAqmGMy681MTMrkJoytzzya03MzAqkGlpcTZlV+CD1PIi8vCU/zMys9Sin41blaMoY12kln1cDvgssrkw4Zma2ItziAiLiqWWKHpHkh5PNzNqgvI5blaMpXYXrlOzWkE3Q2LBiEZmZWbPldYp7OZrSVfgUn79ieTEwBTimkkGZmVnzuKsws3VELCgtkLRqheIxM7MVUA1dhU35jv+up+zRlg7EzMxWXI3K2/KoofdxbQh0BTpK+gqfr0+4JtkDyWZm1sZU+xjXfsCPgE2AP/B54voQOKuyYZmZWXPktRVVjobexzUCGCHpuxHxfysxJjMzayaPcWV2lNS5bkfS2pIuqmBMZmbWTNWwOnxTElffiHi/bici5gD7Vy4kMzNrrqqenFGinaRVI2IhgKSOgKfDm5m1QXlNRuVoSuL6KzBW0nVkDyIfDdxQ0ajMzKxZqmGMqylrFf5O0vPAvmQzCy+MiDEVj8zMzMqW13GrcjSlxUVE3AfcByBpD0lDIuLYikZmZmZlc1dhIqkXcBhwKNlahbdVMigzM2uequ4qlNQTGECWsN4DRgGKiG+upNjMzKxM1d7iehl4CDgwImoBJJ28UqIyM7NmqYY3IDfUqvwu8DbwoKS/SNqHz5d9MjOzNqganuNabuKKiNsj4lBgK2AccDKwgaRrJPVeSfGZmVkZasrc8qjRuCNibkT8LSIOIFtw91ngzIpHZmZmZauGJZ+aNKuwTkTMBv6cNjMza2Py2v1XjrISl5mZtW1OXGZmlivtWjuAlSCvY3NmZlaPlh7jkrSapCckPSdpgqTzU/nmkh6XNEnSKEkdUvmqab82Hd+s5FqDU/krkvYrKe+TymolNTqHwonLzKxAKjAdfiGwd0RsD/QC+kjaFfgtcGlE9ADmAMek+scAcyJiS+DSVA9J25AtarEt0Ae4WlI7Se2AIUBfYBvgsFR3+d+xnL8QMzNr21o6cUXm47S7StoC2Bu4NZWPAPqnz/3SPun4PpKUykdGxMKImALUArukrTYiXouIT4CRqe7yv2OT/ibMzCwX2qm8TdIgSU+WbIOWvWZqGT0LvAM8AEwG3o+IxanKNKBr+twVmAqQjn8AdCktX+ac5ZUvlydnmJkVSLmzCiNiKDC0kTpLgF6SOgO3A1vXVy39WV8E0UB5fQ2oBgffnLjMzAqkkg8VR8T7ksYBuwKdJbVPrapNgOmp2jSgGzBNUntgLWB2SXmd0nOWV14vdxWamRVIS49xSVovtbSQ1JHspcITgQeB76VqA4E70ufRaZ90/B8REal8QJp1uDnQA3gCGA/0SLMUO5BN4BjdUExucZmZFUgFnuPaCBiRZv/VADdHxF2SXgJGSroIeAYYluoPA/5XUi1ZS2sAQERMkHQz8BKwGDg2dUEi6ThgTAp/eERMaCggZYmwLXq1rQZmZtaCerboWhdDXx5T1u/OQVvtl7u1NtziMjMrkHa5S0Plc+IyMysQr1VoZma54sRlZma54sRlZma50i6nL4cshxOXmVmBVMPDuU5cZmYF4q5CMzPLFScuMzPLFY9xmZlZrrjFZWZmueLEZWZmueLEZWZmueK1Cs3MLFcq+SLJtsKJy8ysQPwAsrVpM2a8yxlnXMqsWXOoqRGHHNKHgQMP4uWXp3DuuUOYN28BXbuuz+9/fxqdOq3e2uFajg0efDnjxo2nS5e1uOuuIQDce+/DXHXVjUyePI1bbvkD//mfPQBYtGgx55xzJS+9NJnFi5fQv//e/PSn32/N8KtKNYxxVUNyLqx27dpx5plHc++91zBq1O+58ca7qa19k7PPvoJTTx3InXdexb777sa1197W2qFazh188D5ce+15S5X17PkfXHnlWey887ZLld9338N88ski7rzzKm677VJGjbqPadNmrsRoq1s7lbflkRNXjq2//jpsu+2WAHTqtDrdu3dj5sz3mDLlLXbeeTsA9tijF/ff/+/WDNMKYOedt2OttdZYqmyLLbrRvfsmX6grifnzF7B48RIWLPiEVVZp7xb/SlSjKGvLIyeugpg2bSYTJ05m++2/TM+e/8HYsY8DcN99jzBjxqxWjs6qyX777UHHjqvxta8dyTe/eTRHH/0dOndeo/ETrUXUqLwtj1Z64pJ0VAPHBkl6UtKTQ4eOWplh5drcufM54YRLOOusn9Cp0+pcfPEJ3Hjj3Rx88EnMnTufDh08lGkrz/PPv0pNTQ0PPTSCsWOvZfjwvzN16tutHVbVqIbE1Rq/0c4HrqvvQEQMBYZme6/msw27ki1atJgTTriEAw/ci969dweyLpzhwy8EYMqUtxg3bnxrhmhV5q67/snXv74Dq6zSni5dOrPDDlvzwguT6NZtw9YOrSpUQzdaRb6jpOeXs70AbFCJe1ajiODss6+ge/duHHVU/8/K33vvfQA+/fRTrrlmFAMG9G2tEK0KbbTRejz++PNEBPPmLeC5516pdyzMKkMqb8sjRbR8w0bSTGA/YM6yh4B/R8TGjV/FLa7GPPnkBA4//Ex69tyMmtTmP+WUI3n9
9enceOPdAHzrW7tx6qkDUV5/Qq1NOOWU/+GJJ15gzpwP6dKlM8cf/wM6d16DCy/8M7Nnf8Caa3Zi6603Z9iwC5g7dz6DB1/O5MlvEgEHH7wvP/7xwa39Fdqwni36P+f4d+8u63fnzut9O3e/HCqVuIYB10XEw/UcuzEiftD4VZy4zKwatGzienJWeYlrp3Xzl7gqMsYVEcc0cKwJScvMzJqjGsa4PN3MzKxAlNNns8rhxGVmViC56/drBicuM7MCqYZ5WE5cZmYFUgV5y4nLzKxI8roaRjmcuMzMCqQK8pYTl5lZkXiMy8zMcqUK8pYTl5lZkThxmZlZrlTD5IxqWB3EzKxqqMyt0etJwyW9I+nFkrLzJL0l6dm07V9ybLCkWkmvSNqvpLxPKquVdGZJ+eaSHpc0SdIoSR0ai8mJy8ysQKQoa2uC64E+9ZRfGhG90nZPdm9tAwwAtk3nXC2pnaR2wI16onMAAAPDSURBVBCgL7ANcFiqC/DbdK0eZG8UWe5at3WcuMzMCqSl34AcEf8CZjfx9v2AkRGxMCKmALXALmmrjYjXIuITYCTQT9n7lvYGbk3njwD613Pdpb9jE4MxM7McqClzkzRI0pMl26Am3uq49ILg4ZLWTmVdgakldaalsuWVdwHej4jFy5Q3+h3NzKwgyn0DckQMjYidSrahTbjNNcAWQC9gBvCHutvXUzeaUd4gzyo0MyuQlTGpMCJmfnY/6S/AXWl3GtCtpOomwPT0ub7yWUBnSe1Tq6u0/nK5xWVmViDltriadw9tVLL7HaBuxuFoYICkVSVtDvQAngDGAz3SDMIOZBM4RkdEAA8C30vnDwTuaOz+bnGZmRVIS7e4JN0E7AWsK2kacC6wl6ReZN16rwM/BYiICZJuBl4CFgPHRsSSdJ3jgDFAO2B4RExIt/gFMFLSRcAzwLBGY8oSXlv0alsNzMysBfVs0Vwzfd6dZf3u3Hj1A3P3yLJbXGZmBZK7LNQMTlxmZgXSxIeKc82Jy8ysQNziMjOzXPH7uMzMLFeqIG85cZmZFUk1PJzrxGVmViDuKjQzs5wpfuZy4jIzKxA5cZmZWZ5IxR/lcuIyMysUt7jMzCxH3FVoZmY548RlZmY54jEuMzPLGbe4zMwsRzzGZWZmueLEZWZmOeMxLjMzyxFVwWKFTlxmZoXixGVmZjniMS4zM8sZj3GZmVmOuMVlZma54skZZmaWM05cZmaWI/IYl5mZ5YtbXGZmliMe4zIzs5xx4jIzsxzxGJeZmeWMW1xmZpYjNX4DspmZ5YsTl5mZ5YiXfDIzs5xx4jIzsxzxc1xmZpYzHuMyM7McqYYxLkVEa8dgLUzSoIgY2tpxWPXwz5ytTMVvU1anQa0dgFUd/8zZSuPEZWZmueLEZWZmueLEVUwea7CVzT9zttJ4coaZmeWKW1xmZpYrTlxmZpYrTlwFIqmPpFck1Uo6s7XjseKTNFzSO5JebO1YrHo4cRWEpHbAEKAvsA1wmKRtWjcqqwLXA31aOwirLk5cxbELUBsRr0XEJ8BIoF8rx2QFFxH/Ama3dhxWXZy4iqMrMLVkf1oqMzMrFCeu4qhvZU0/62BmhePEVRzTgG4l+5sA01spFjOzinHiKo7xQA9Jm0vqAAwARrdyTGZmLc6JqyAiYjFwHDAGmAjcHBETWjcqKzpJNwGPAl+WNE3SMa0dkxWfl3wyM7NccYvLzMxyxYnLzMxyxYnLzMxyxYnLzMxyxYnLzMxyxYnLzMxyxYnLzMxy5f8DiStLtpMFplUAAAAASUVORK5CYII=\n",
1219 | "text/plain": [
1220 | ""
1221 | ]
1222 | },
1223 | "metadata": {
1224 | "needs_background": "light"
1225 | },
1226 | "output_type": "display_data"
1227 | }
1228 | ],
1229 | "source": [
1230 | "# Plot the confussion Matrix\n",
1231 | "\n",
1232 | "class_names=[0,1] # name of classes\n",
1233 | "fig, ax = plt.subplots()\n",
1234 | "tick_marks = np.arange(len(class_names))\n",
1235 | "plt.xticks(tick_marks, class_names)\n",
1236 | "plt.yticks(tick_marks, class_names)\n",
1237 | "\n",
1238 | "# create heatmap\n",
1239 | "sns.heatmap(pd.DataFrame(cnf_matrix), annot=True, cmap=\"YlGnBu\" ,fmt='g')\n",
1240 | "ax.xaxis.set_label_position(\"top\")\n",
1241 | "plt.tight_layout()\n",
1242 | "plt.title('Confusion matrix', y=1.1)\n",
1243 | "plt.ylabel('Actual label')\n",
1244 | "plt.xlabel('Predicted label')\n",
1245 | "print(\"Accuracy %0.4f\" % accuracy_score(y_test, y_pred))"
1246 | ]
1247 | },
1248 | {
1249 | "cell_type": "code",
1250 | "execution_count": 87,
1251 | "metadata": {},
1252 | "outputs": [],
1253 | "source": [
1254 | "## Round 2: Random Forest Classification\n",
1255 | "from sklearn.ensemble import RandomForestClassifier"
1256 | ]
1257 | },
1258 | {
1259 | "cell_type": "code",
1260 | "execution_count": 89,
1261 | "metadata": {},
1262 | "outputs": [
1263 | {
1264 | "data": {
1265 | "text/plain": [
1266 | "RandomForestClassifier(bootstrap=True, class_weight=None, criterion='entropy',\n",
1267 | " max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
1268 | " min_impurity_decrease=0.0, min_impurity_split=None,\n",
1269 | " min_samples_leaf=1, min_samples_split=2,\n",
1270 | " min_weight_fraction_leaf=0.0, n_estimators=100,\n",
1271 | " n_jobs=None, oob_score=False, random_state=0, verbose=0,\n",
1272 | " warm_start=False)"
1273 | ]
1274 | },
1275 | "execution_count": 89,
1276 | "metadata": {},
1277 | "output_type": "execute_result"
1278 | }
1279 | ],
1280 | "source": [
1281 | "# set parameters of RF and Fit\n",
1282 | "random_forest = RandomForestClassifier(n_estimators = 100)\n",
1283 | "\n",
1284 | "classifier = RandomForestClassifier(random_state=0, n_estimators = 100,\n",
1285 | " criterion = 'entropy')\n",
1286 | "classifier.fit(X_train, y_train)"
1287 | ]
1288 | },
1289 | {
1290 | "cell_type": "code",
1291 | "execution_count": 90,
1292 | "metadata": {},
1293 | "outputs": [],
1294 | "source": [
1295 | "# Evaluating Test set\n",
1296 | "y_pred = classifier.predict(X_test)"
1297 | ]
1298 | },
1299 | {
1300 | "cell_type": "code",
1301 | "execution_count": 91,
1302 | "metadata": {},
1303 | "outputs": [],
1304 | "source": [
1305 | "# Calculation of metrics\n",
1306 | "\n",
1307 | "# Accuracy score\n",
1308 | "acc = accuracy_score(y_test, y_pred)\n",
1309 | "\n",
1310 | "# precision score (When is 0 and should be 1 and the other way round)\n",
1311 | "pre = precision_score(y_test, y_pred)\n",
1312 | "\n",
1313 | "# recall score\n",
1314 | "rec = recall_score(y_test, y_pred)\n",
1315 | "\n",
1316 | "f1 = f1_score(y_test, y_pred)"
1317 | ]
1318 | },
1319 | {
1320 | "cell_type": "code",
1321 | "execution_count": 92,
1322 | "metadata": {},
1323 | "outputs": [],
1324 | "source": [
1325 | "# Metrics \n",
1326 | "rf_results = pd.DataFrame([['Random Forest (n=100, GSx2 + Entropy)', acc, pre, rec, f1]],\n",
1327 | " columns = ['Model','Accuracy','Precision','Recall','F1 score'])"
1328 | ]
1329 | },
1330 | {
1331 | "cell_type": "code",
1332 | "execution_count": 93,
1333 | "metadata": {
1334 | "scrolled": true
1335 | },
1336 | "outputs": [
1337 | {
1338 | "data": {
1339 | "text/html": [
1340 | "\n",
1341 | "\n",
1354 | "
\n",
1355 | " \n",
1356 | " \n",
1357 | " \n",
1358 | " Model \n",
1359 | " Accuracy \n",
1360 | " Precision \n",
1361 | " Recall \n",
1362 | " F1 score \n",
1363 | " \n",
1364 | " \n",
1365 | " \n",
1366 | " \n",
1367 | " 0 \n",
1368 | " Deep Neural Network \n",
1369 | " 0.999391 \n",
1370 | " 0.836879 \n",
1371 | " 0.802721 \n",
1372 | " 0.819444 \n",
1373 | " \n",
1374 | " \n",
1375 | " 1 \n",
1376 | " Random Forest (n=100, GSx2 + Entropy) \n",
1377 | " 0.999520 \n",
1378 | " 0.941667 \n",
1379 | " 0.768707 \n",
1380 | " 0.846442 \n",
1381 | " \n",
1382 | " \n",
1383 | "
\n",
1384 | "
"
1385 | ],
1386 | "text/plain": [
1387 | " Model Accuracy Precision Recall \\\n",
1388 | "0 Deep Neural Network 0.999391 0.836879 0.802721 \n",
1389 | "1 Random Forest (n=100, GSx2 + Entropy) 0.999520 0.941667 0.768707 \n",
1390 | "\n",
1391 | " F1 score \n",
1392 | "0 0.819444 \n",
1393 | "1 0.846442 "
1394 | ]
1395 | },
1396 | "execution_count": 93,
1397 | "metadata": {},
1398 | "output_type": "execute_result"
1399 | }
1400 | ],
1401 | "source": [
1402 | "# Showing the comparizon between the Models\n",
1403 | "\n",
1404 | "results = results.append(rf_results, ignore_index = True)\n",
1405 | "\n",
1406 | "results "
1407 | ]
1408 | },
1409 | {
1410 | "cell_type": "code",
1411 | "execution_count": 94,
1412 | "metadata": {},
1413 | "outputs": [],
1414 | "source": [
1415 | "## Round 3: Decision Tree\n",
1416 | "from sklearn.tree import DecisionTreeClassifier"
1417 | ]
1418 | },
1419 | {
1420 | "cell_type": "code",
1421 | "execution_count": 95,
1422 | "metadata": {},
1423 | "outputs": [
1424 | {
1425 | "data": {
1426 | "text/plain": [
1427 | "DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n",
1428 | " max_features=None, max_leaf_nodes=None,\n",
1429 | " min_impurity_decrease=0.0, min_impurity_split=None,\n",
1430 | " min_samples_leaf=1, min_samples_split=2,\n",
1431 | " min_weight_fraction_leaf=0.0, presort=False,\n",
1432 | " random_state=None, splitter='best')"
1433 | ]
1434 | },
1435 | "execution_count": 95,
1436 | "metadata": {},
1437 | "output_type": "execute_result"
1438 | }
1439 | ],
1440 | "source": [
1441 | "# set parameters of DecisionTree and Fit\n",
1442 | "classifier = DecisionTreeClassifier()\n",
1443 | "classifier.fit(X_train, y_train)"
1444 | ]
1445 | },
1446 | {
1447 | "cell_type": "code",
1448 | "execution_count": 96,
1449 | "metadata": {},
1450 | "outputs": [],
1451 | "source": [
1452 | "# Evaluating Test set\n",
1453 | "y_pred = classifier.predict(X_test)"
1454 | ]
1455 | },
1456 | {
1457 | "cell_type": "code",
1458 | "execution_count": 97,
1459 | "metadata": {},
1460 | "outputs": [],
1461 | "source": [
1462 | "# Calculation of metrics\n",
1463 | "\n",
1464 | "# Accuracy score\n",
1465 | "acc = accuracy_score(y_test, y_pred)\n",
1466 | "\n",
1467 | "# precision score (When is 0 and should be 1 and the other way round)\n",
1468 | "pre = precision_score(y_test, y_pred)\n",
1469 | "\n",
1470 | "# recall score\n",
1471 | "rec = recall_score(y_test, y_pred)\n",
1472 | "\n",
1473 | "f1 = f1_score(y_test, y_pred)"
1474 | ]
1475 | },
1476 | {
1477 | "cell_type": "code",
1478 | "execution_count": 99,
1479 | "metadata": {},
1480 | "outputs": [],
1481 | "source": [
1482 | "# Metrics \n",
1483 | "dt_results = pd.DataFrame([['Decision Trees', acc, pre, rec, f1]],\n",
1484 | " columns = ['Model','Accuracy','Precision','Recall','F1 score'])"
1485 | ]
1486 | },
1487 | {
1488 | "cell_type": "code",
1489 | "execution_count": 100,
1490 | "metadata": {
1491 | "scrolled": true
1492 | },
1493 | "outputs": [
1494 | {
1495 | "data": {
1496 | "text/html": [
1497 | "\n",
1498 | "\n",
1511 | "
\n",
1512 | " \n",
1513 | " \n",
1514 | " \n",
1515 | " Model \n",
1516 | " Accuracy \n",
1517 | " Precision \n",
1518 | " Recall \n",
1519 | " F1 score \n",
1520 | " \n",
1521 | " \n",
1522 | " \n",
1523 | " \n",
1524 | " 0 \n",
1525 | " Deep Neural Network \n",
1526 | " 0.999391 \n",
1527 | " 0.836879 \n",
1528 | " 0.802721 \n",
1529 | " 0.819444 \n",
1530 | " \n",
1531 | " \n",
1532 | " 1 \n",
1533 | " Random Forest (n=100, GSx2 + Entropy) \n",
1534 | " 0.999520 \n",
1535 | " 0.941667 \n",
1536 | " 0.768707 \n",
1537 | " 0.846442 \n",
1538 | " \n",
1539 | " \n",
1540 | " 2 \n",
1541 | " Decision Trees \n",
1542 | " 0.999216 \n",
1543 | " 0.785714 \n",
1544 | " 0.748299 \n",
1545 | " 0.766551 \n",
1546 | " \n",
1547 | " \n",
1548 | "
\n",
1549 | "
"
1550 | ],
1551 | "text/plain": [
1552 | " Model Accuracy Precision Recall \\\n",
1553 | "0 Deep Neural Network 0.999391 0.836879 0.802721 \n",
1554 | "1 Random Forest (n=100, GSx2 + Entropy) 0.999520 0.941667 0.768707 \n",
1555 | "2 Decision Trees 0.999216 0.785714 0.748299 \n",
1556 | "\n",
1557 | " F1 score \n",
1558 | "0 0.819444 \n",
1559 | "1 0.846442 \n",
1560 | "2 0.766551 "
1561 | ]
1562 | },
1563 | "execution_count": 100,
1564 | "metadata": {},
1565 | "output_type": "execute_result"
1566 | }
1567 | ],
1568 | "source": [
1569 | "# Showing the comparizon between the Models\n",
1570 | "\n",
1571 | "results = results.append(dt_results, ignore_index = True)\n",
1572 | "\n",
1573 | "results "
1574 | ]
1575 | },
1576 | {
1577 | "cell_type": "code",
1578 | "execution_count": 104,
1579 | "metadata": {},
1580 | "outputs": [],
1581 | "source": [
1582 | "# Round 4: BALANCING THE DATASET\n",
1583 | "\n",
1584 | "# Splitting into train and Test set\n",
1585 | "from sklearn.model_selection import train_test_split\n",
1586 | "\n",
1587 | "X_train, X_test, y_train, y_test = train_test_split(X,\n",
1588 | " y,\n",
1589 | " test_size = 0.3,\n",
1590 | " random_state = 0)"
1591 | ]
1592 | },
1593 | {
1594 | "cell_type": "code",
1595 | "execution_count": 105,
1596 | "metadata": {},
1597 | "outputs": [
1598 | {
1599 | "data": {
1600 | "text/plain": [
1601 | "0 199019\n",
1602 | "1 345\n",
1603 | "Name: Class, dtype: int64"
1604 | ]
1605 | },
1606 | "execution_count": 105,
1607 | "metadata": {},
1608 | "output_type": "execute_result"
1609 | }
1610 | ],
1611 | "source": [
1612 | "# Round 4 \n",
1613 | "\n",
1614 | "#BALANCING THE DATASET:\n",
1615 | "\n",
1616 | "# # To Avoid the skewed on the total \n",
1617 | "\n",
1618 | "# Counting and watching the Y_train distribution \n",
1619 | "y_train.value_counts()"
1620 | ]
1621 | },
1622 | {
1623 | "cell_type": "code",
1624 | "execution_count": 106,
1625 | "metadata": {},
1626 | "outputs": [],
1627 | "source": [
1628 | "# The expecting distribution to avoid a Bias is to have a 50/50 (0/1)\n",
1629 | "#This balancing ensure that the model is accuracy\n",
1630 | "\n",
1631 | "#pos and neg index values\n",
1632 | "pos_index = y_train[y_train.values == 1].index\n",
1633 | "neg_index = y_train[y_train.values == 0].index\n",
1634 | "\n",
1635 | "if len(pos_index) > len(neg_index):\n",
1636 | " higher = pos_index\n",
1637 | " lower = neg_index\n",
1638 | "else:\n",
1639 | " lower = pos_index\n",
1640 | " higher = neg_index\n",
1641 | " \n",
1642 | "#Create random index selection\n",
1643 | "random.seed(0)\n",
1644 | "# random select index record up to the same size of lower index\n",
1645 | "higher = np.random.choice(higher, size = len(lower))\n",
1646 | "# select the lower index and convert to a numpy array\n",
1647 | "lower = np.asarray(lower)\n",
1648 | "# create the new index as a combination \n",
1649 | "new_indexes = np.concatenate((lower,higher))\n",
1650 | "\n",
1651 | "# Reselect the X_train, y_train dataset using the new indexes\n",
1652 | "X_train = X_train.loc[new_indexes, ]\n",
1653 | "y_train = y_train[new_indexes]"
1654 | ]
1655 | },
1656 | {
1657 | "cell_type": "code",
1658 | "execution_count": 107,
1659 | "metadata": {},
1660 | "outputs": [
1661 | {
1662 | "data": {
1663 | "text/plain": [
1664 | "1 345\n",
1665 | "0 345\n",
1666 | "Name: Class, dtype: int64"
1667 | ]
1668 | },
1669 | "execution_count": 107,
1670 | "metadata": {},
1671 | "output_type": "execute_result"
1672 | }
1673 | ],
1674 | "source": [
1675 | "# checking the balancing of the training Set\n",
1676 | "\n",
1677 | "#Counting and watching the Y_train distribution \n",
1678 | "y_train.value_counts()"
1679 | ]
1680 | },
1681 | {
1682 | "cell_type": "code",
1683 | "execution_count": 108,
1684 | "metadata": {},
1685 | "outputs": [
1686 | {
1687 | "name": "stdout",
1688 | "output_type": "stream",
1689 | "text": [
1690 | "Epoch 1/5\n",
1691 | "690/690 [==============================] - 0s 109us/sample - loss: 0.2002 - acc: 0.9319\n",
1692 | "Epoch 2/5\n",
1693 | "690/690 [==============================] - 0s 94us/sample - loss: 0.1285 - acc: 0.9435\n",
1694 | "Epoch 3/5\n",
1695 | "690/690 [==============================] - 0s 88us/sample - loss: 0.1055 - acc: 0.9493\n",
1696 | "Epoch 4/5\n",
1697 | "690/690 [==============================] - 0s 87us/sample - loss: 0.1052 - acc: 0.9493\n",
1698 | "Epoch 5/5\n",
1699 | "690/690 [==============================] - 0s 108us/sample - loss: 0.0964 - acc: 0.9536\n"
1700 | ]
1701 | },
1702 | {
1703 | "data": {
1704 | "text/plain": [
1705 | ""
1706 | ]
1707 | },
1708 | "execution_count": 108,
1709 | "metadata": {},
1710 | "output_type": "execute_result"
1711 | }
1712 | ],
1713 | "source": [
1714 | "# train the DNN model \n",
1715 | "dnn_model.fit(X_train, \n",
1716 | " y_train, \n",
1717 | " batch_size= 15,\n",
1718 | " epochs = 5)"
1719 | ]
1720 | },
1721 | {
1722 | "cell_type": "code",
1723 | "execution_count": 109,
1724 | "metadata": {},
1725 | "outputs": [
1726 | {
1727 | "name": "stdout",
1728 | "output_type": "stream",
1729 | "text": [
1730 | "85443/85443 [==============================] - 1s 16us/sample - loss: 0.0857 - acc: 0.9971\n"
1731 | ]
1732 | }
1733 | ],
1734 | "source": [
1735 | "## DNN model evaluation\n",
1736 | "score = dnn_model.evaluate(X_test,y_test)"
1737 | ]
1738 | },
1739 | {
1740 | "cell_type": "code",
1741 | "execution_count": 110,
1742 | "metadata": {},
1743 | "outputs": [
1744 | {
1745 | "name": "stdout",
1746 | "output_type": "stream",
1747 | "text": [
1748 | "[0.08572872194509909, 0.99707407]\n"
1749 | ]
1750 | }
1751 | ],
1752 | "source": [
1753 | "# Print eval result \n",
1754 | "print(score)"
1755 | ]
1756 | },
1757 | {
1758 | "cell_type": "code",
1759 | "execution_count": 112,
1760 | "metadata": {},
1761 | "outputs": [],
1762 | "source": [
1763 | "## Metrics -->\n",
1764 | "\n",
1765 | "# Accuracy: (TRUE Positives + TRUE Negatives) / Total\n",
1766 | "# Precision: TRUE Positives / (TRUE Positives + FALSE Positives)\n",
1767 | "# Specificity: TRUE Negatives / (FALSE Positives + TRUE Negatives)\n",
1768 | "# Recall: TRUE Positives / (TRUE Positives + FALSE Negatives)\n",
1769 | "# F1-score:"
1770 | ]
1771 | },
1772 | {
1773 | "cell_type": "code",
1774 | "execution_count": 113,
1775 | "metadata": {},
1776 | "outputs": [],
1777 | "source": [
1778 | "# Getting the prediction from the test dataset\n",
1779 | "y_pred = dnn_model.predict(X_test).round()\n",
1780 | "# convert array to dataframe\n",
1781 | "y_test = pd.DataFrame(y_test)"
1782 | ]
1783 | },
1784 | {
1785 | "cell_type": "code",
1786 | "execution_count": 114,
1787 | "metadata": {},
1788 | "outputs": [],
1789 | "source": [
1790 | "# Evaluating Test set\n",
1791 | "y_pred = classifier.predict(X_test)"
1792 | ]
1793 | },
1794 | {
1795 | "cell_type": "code",
1796 | "execution_count": 116,
1797 | "metadata": {},
1798 | "outputs": [],
1799 | "source": [
1800 | "# Calculation of metrics\n",
1801 | "\n",
1802 | "# Accuracy score\n",
1803 | "acc = accuracy_score(y_test, y_pred)\n",
1804 | "\n",
1805 | "# precision score (When is 0 and should be 1 and the other way round)\n",
1806 | "pre = precision_score(y_test, y_pred)\n",
1807 | "\n",
1808 | "# recall score\n",
1809 | "rec = recall_score(y_test, y_pred)\n",
1810 | "\n",
1811 | "f1 = f1_score(y_test, y_pred)"
1812 | ]
1813 | },
1814 | {
1815 | "cell_type": "code",
1816 | "execution_count": 117,
1817 | "metadata": {},
1818 | "outputs": [],
1819 | "source": [
1820 | "# Metrics \n",
1821 | "dnn_results = pd.DataFrame([['Deep Neural Network (Undersampling & balanced)', acc, pre, rec, f1]],\n",
1822 | " columns = ['Model','Accuracy','Precision','Recall','F1 score'])"
1823 | ]
1824 | },
1825 | {
1826 | "cell_type": "code",
1827 | "execution_count": 118,
1828 | "metadata": {
1829 | "scrolled": true
1830 | },
1831 | "outputs": [
1832 | {
1833 | "data": {
1834 | "text/html": [
1835 | "\n",
1836 | "\n",
1849 | "
\n",
1850 | " \n",
1851 | " \n",
1852 | " \n",
1853 | " Model \n",
1854 | " Accuracy \n",
1855 | " Precision \n",
1856 | " Recall \n",
1857 | " F1 score \n",
1858 | " \n",
1859 | " \n",
1860 | " \n",
1861 | " \n",
1862 | " 0 \n",
1863 | " Deep Neural Network \n",
1864 | " 0.999391 \n",
1865 | " 0.836879 \n",
1866 | " 0.802721 \n",
1867 | " 0.819444 \n",
1868 | " \n",
1869 | " \n",
1870 | " 1 \n",
1871 | " Random Forest (n=100, GSx2 + Entropy) \n",
1872 | " 0.999520 \n",
1873 | " 0.941667 \n",
1874 | " 0.768707 \n",
1875 | " 0.846442 \n",
1876 | " \n",
1877 | " \n",
1878 | " 2 \n",
1879 | " Decision Trees \n",
1880 | " 0.999216 \n",
1881 | " 0.785714 \n",
1882 | " 0.748299 \n",
1883 | " 0.766551 \n",
1884 | " \n",
1885 | " \n",
1886 | " 3 \n",
1887 | " Deep Neural Network (Undersampling & balanced) \n",
1888 | " 0.999216 \n",
1889 | " 0.785714 \n",
1890 | " 0.748299 \n",
1891 | " 0.766551 \n",
1892 | " \n",
1893 | " \n",
1894 | "
\n",
1895 | "
"
1896 | ],
1897 | "text/plain": [
1898 | " Model Accuracy Precision \\\n",
1899 | "0 Deep Neural Network 0.999391 0.836879 \n",
1900 | "1 Random Forest (n=100, GSx2 + Entropy) 0.999520 0.941667 \n",
1901 | "2 Decision Trees 0.999216 0.785714 \n",
1902 | "3 Deep Neural Network (Undersampling & balanced) 0.999216 0.785714 \n",
1903 | "\n",
1904 | " Recall F1 score \n",
1905 | "0 0.802721 0.819444 \n",
1906 | "1 0.768707 0.846442 \n",
1907 | "2 0.748299 0.766551 \n",
1908 | "3 0.748299 0.766551 "
1909 | ]
1910 | },
1911 | "execution_count": 118,
1912 | "metadata": {},
1913 | "output_type": "execute_result"
1914 | }
1915 | ],
1916 | "source": [
1917 | "# Showing the comparizon between the Models\n",
1918 | "\n",
1919 | "results = results.append(dnn_results, ignore_index = True)\n",
1920 | "\n",
1921 | "results "
1922 | ]
1923 | },
1924 | {
1925 | "cell_type": "markdown",
1926 | "metadata": {},
1927 | "source": [
1928 | "## Final Remarks\n",
1929 | "\n",
1930 | "On this case study we try to create a model for the detection of Fraudulent Card transaction. We tried severals Models and the best of thoughs model was the Random Forest model with this particular Metrics score:\n",
1931 | "\n",
1932 | "- Accuracy score: 0.999391\n",
1933 | "- Precision score: 0.836879\n",
1934 | "- Recall score: 0.802721\n",
1935 | "- F1 score: 0.819444\n",
1936 | "\n",
1937 | "\n"
1938 | ]
1939 | }
1940 | ],
1941 | "metadata": {
1942 | "kernelspec": {
1943 | "display_name": "Python 3",
1944 | "language": "python",
1945 | "name": "python3"
1946 | },
1947 | "language_info": {
1948 | "codemirror_mode": {
1949 | "name": "ipython",
1950 | "version": 3
1951 | },
1952 | "file_extension": ".py",
1953 | "mimetype": "text/x-python",
1954 | "name": "python",
1955 | "nbconvert_exporter": "python",
1956 | "pygments_lexer": "ipython3",
1957 | "version": "3.7.4"
1958 | }
1959 | },
1960 | "nbformat": 4,
1961 | "nbformat_minor": 2
1962 | }
1963 |
--------------------------------------------------------------------------------
/4. Minimizing Churn Rate Through Analysis of Financial Habits/Machine_Learning_04_Minimizing_Churn_Rate_Through_Analysis_of_Financial_Habits (Part 2).ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "## importing lbiraries & data\n",
10 | "\n",
11 | "# import libraries\n",
12 | "import pandas as pd\n",
13 | "import random\n",
14 | "import numpy as np\n",
15 | "import seaborn as sns\n",
16 | "import matplotlib.pyplot as plt"
17 | ]
18 | },
19 | {
20 | "cell_type": "code",
21 | "execution_count": 2,
22 | "metadata": {},
23 | "outputs": [],
24 | "source": [
25 | "path = \"../Datasets\"\n",
26 | "\n",
27 | "import os\n",
28 | "#Actual absolute path\n",
29 | "cwd = os.getcwd()\n",
30 | "#print(cwd)"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 3,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "# one level up directory. Datasets directory\n",
40 | "os.chdir(path)\n",
41 | "#Actual absolute path\n",
42 | "cwd = os.getcwd()\n",
43 | "#print(cwd)"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": 4,
49 | "metadata": {},
50 | "outputs": [],
51 | "source": [
52 | "# path file current dir + file name\n",
53 | "path_file = cwd + '/new_churn_data.csv'\n",
54 | "#print(path_file)"
55 | ]
56 | },
57 | {
58 | "cell_type": "code",
59 | "execution_count": 5,
60 | "metadata": {},
61 | "outputs": [
62 | {
63 | "data": {
64 | "text/html": [
65 | "\n",
66 | "\n",
79 | "
\n",
80 | " \n",
81 | " \n",
82 | " \n",
83 | " user \n",
84 | " churn \n",
85 | " age \n",
86 | " housing \n",
87 | " deposits \n",
88 | " withdrawal \n",
89 | " purchases_partners \n",
90 | " purchases \n",
91 | " cc_taken \n",
92 | " cc_recommended \n",
93 | " ... \n",
94 | " payment_type \n",
95 | " waiting_4_loan \n",
96 | " cancelled_loan \n",
97 | " received_loan \n",
98 | " rejected_loan \n",
99 | " zodiac_sign \n",
100 | " left_for_two_month_plus \n",
101 | " left_for_one_month \n",
102 | " reward_rate \n",
103 | " is_referred \n",
104 | " \n",
105 | " \n",
106 | " \n",
107 | " \n",
108 | " 0 \n",
109 | " 55409 \n",
110 | " 0 \n",
111 | " 37.0 \n",
112 | " na \n",
113 | " 0 \n",
114 | " 0 \n",
115 | " 0 \n",
116 | " 0 \n",
117 | " 0 \n",
118 | " 0 \n",
119 | " ... \n",
120 | " Bi-Weekly \n",
121 | " 0 \n",
122 | " 0 \n",
123 | " 0 \n",
124 | " 0 \n",
125 | " Leo \n",
126 | " 1 \n",
127 | " 0 \n",
128 | " 0.00 \n",
129 | " 0 \n",
130 | " \n",
131 | " \n",
132 | " 1 \n",
133 | " 23547 \n",
134 | " 0 \n",
135 | " 28.0 \n",
136 | " R \n",
137 | " 0 \n",
138 | " 0 \n",
139 | " 1 \n",
140 | " 0 \n",
141 | " 0 \n",
142 | " 96 \n",
143 | " ... \n",
144 | " Weekly \n",
145 | " 0 \n",
146 | " 0 \n",
147 | " 0 \n",
148 | " 0 \n",
149 | " Leo \n",
150 | " 0 \n",
151 | " 0 \n",
152 | " 1.47 \n",
153 | " 1 \n",
154 | " \n",
155 | " \n",
156 | " 2 \n",
157 | " 58313 \n",
158 | " 0 \n",
159 | " 35.0 \n",
160 | " R \n",
161 | " 47 \n",
162 | " 2 \n",
163 | " 86 \n",
164 | " 47 \n",
165 | " 0 \n",
166 | " 285 \n",
167 | " ... \n",
168 | " Semi-Monthly \n",
169 | " 0 \n",
170 | " 0 \n",
171 | " 0 \n",
172 | " 0 \n",
173 | " Capricorn \n",
174 | " 1 \n",
175 | " 0 \n",
176 | " 2.17 \n",
177 | " 0 \n",
178 | " \n",
179 | " \n",
180 | " 3 \n",
181 | " 8095 \n",
182 | " 0 \n",
183 | " 26.0 \n",
184 | " R \n",
185 | " 26 \n",
186 | " 3 \n",
187 | " 38 \n",
188 | " 25 \n",
189 | " 0 \n",
190 | " 74 \n",
191 | " ... \n",
192 | " Bi-Weekly \n",
193 | " 0 \n",
194 | " 0 \n",
195 | " 0 \n",
196 | " 0 \n",
197 | " Capricorn \n",
198 | " 0 \n",
199 | " 0 \n",
200 | " 1.10 \n",
201 | " 1 \n",
202 | " \n",
203 | " \n",
204 | " 4 \n",
205 | " 61353 \n",
206 | " 1 \n",
207 | " 27.0 \n",
208 | " na \n",
209 | " 0 \n",
210 | " 0 \n",
211 | " 2 \n",
212 | " 0 \n",
213 | " 0 \n",
214 | " 0 \n",
215 | " ... \n",
216 | " Bi-Weekly \n",
217 | " 0 \n",
218 | " 0 \n",
219 | " 0 \n",
220 | " 0 \n",
221 | " Aries \n",
222 | " 1 \n",
223 | " 0 \n",
224 | " 0.03 \n",
225 | " 0 \n",
226 | " \n",
227 | " \n",
228 | "
\n",
229 | "
5 rows × 28 columns
\n",
230 | "
"
231 | ],
232 | "text/plain": [
233 | " user churn age housing deposits withdrawal purchases_partners \\\n",
234 | "0 55409 0 37.0 na 0 0 0 \n",
235 | "1 23547 0 28.0 R 0 0 1 \n",
236 | "2 58313 0 35.0 R 47 2 86 \n",
237 | "3 8095 0 26.0 R 26 3 38 \n",
238 | "4 61353 1 27.0 na 0 0 2 \n",
239 | "\n",
240 | " purchases cc_taken cc_recommended ... payment_type \\\n",
241 | "0 0 0 0 ... Bi-Weekly \n",
242 | "1 0 0 96 ... Weekly \n",
243 | "2 47 0 285 ... Semi-Monthly \n",
244 | "3 25 0 74 ... Bi-Weekly \n",
245 | "4 0 0 0 ... Bi-Weekly \n",
246 | "\n",
247 | " waiting_4_loan cancelled_loan received_loan rejected_loan zodiac_sign \\\n",
248 | "0 0 0 0 0 Leo \n",
249 | "1 0 0 0 0 Leo \n",
250 | "2 0 0 0 0 Capricorn \n",
251 | "3 0 0 0 0 Capricorn \n",
252 | "4 0 0 0 0 Aries \n",
253 | "\n",
254 | " left_for_two_month_plus left_for_one_month reward_rate is_referred \n",
255 | "0 1 0 0.00 0 \n",
256 | "1 0 0 1.47 1 \n",
257 | "2 1 0 2.17 0 \n",
258 | "3 0 0 1.10 1 \n",
259 | "4 1 0 0.03 0 \n",
260 | "\n",
261 | "[5 rows x 28 columns]"
262 | ]
263 | },
264 | "execution_count": 5,
265 | "metadata": {},
266 | "output_type": "execute_result"
267 | }
268 | ],
269 | "source": [
270 | "# import dataset\n",
271 | "\n",
272 | "dataset = pd.read_csv(path_file)\n",
273 | "dataset.head()"
274 | ]
275 | },
276 | {
277 | "cell_type": "code",
278 | "execution_count": 6,
279 | "metadata": {},
280 | "outputs": [
281 | {
282 | "data": {
283 | "text/plain": [
284 | "Index(['user', 'churn', 'age', 'housing', 'deposits', 'withdrawal',\n",
285 | " 'purchases_partners', 'purchases', 'cc_taken', 'cc_recommended',\n",
286 | " 'cc_disliked', 'cc_liked', 'cc_application_begin', 'app_downloaded',\n",
287 | " 'web_user', 'ios_user', 'android_user', 'registered_phones',\n",
288 | " 'payment_type', 'waiting_4_loan', 'cancelled_loan', 'received_loan',\n",
289 | " 'rejected_loan', 'zodiac_sign', 'left_for_two_month_plus',\n",
290 | " 'left_for_one_month', 'reward_rate', 'is_referred'],\n",
291 | " dtype='object')"
292 | ]
293 | },
294 | "execution_count": 6,
295 | "metadata": {},
296 | "output_type": "execute_result"
297 | }
298 | ],
299 | "source": [
300 | "dataset.columns"
301 | ]
302 | },
303 | {
304 | "cell_type": "code",
305 | "execution_count": 7,
306 | "metadata": {},
307 | "outputs": [
308 | {
309 | "data": {
310 | "text/plain": [
311 | "(26996, 28)"
312 | ]
313 | },
314 | "execution_count": 7,
315 | "metadata": {},
316 | "output_type": "execute_result"
317 | }
318 | ],
319 | "source": [
320 | "dataset.shape # 31 x columns - 27000 records"
321 | ]
322 | },
323 | {
324 | "cell_type": "code",
325 | "execution_count": 8,
326 | "metadata": {},
327 | "outputs": [
328 | {
329 | "data": {
330 | "text/html": [
331 | "\n",
332 | "\n",
345 | "
\n",
346 | " \n",
347 | " \n",
348 | " \n",
349 | " churn \n",
350 | " age \n",
351 | " housing \n",
352 | " deposits \n",
353 | " withdrawal \n",
354 | " purchases_partners \n",
355 | " purchases \n",
356 | " cc_taken \n",
357 | " cc_recommended \n",
358 | " cc_disliked \n",
359 | " ... \n",
360 | " payment_type \n",
361 | " waiting_4_loan \n",
362 | " cancelled_loan \n",
363 | " received_loan \n",
364 | " rejected_loan \n",
365 | " zodiac_sign \n",
366 | " left_for_two_month_plus \n",
367 | " left_for_one_month \n",
368 | " reward_rate \n",
369 | " is_referred \n",
370 | " \n",
371 | " \n",
372 | " \n",
373 | " \n",
374 | " 0 \n",
375 | " 0 \n",
376 | " 37.0 \n",
377 | " na \n",
378 | " 0 \n",
379 | " 0 \n",
380 | " 0 \n",
381 | " 0 \n",
382 | " 0 \n",
383 | " 0 \n",
384 | " 0 \n",
385 | " ... \n",
386 | " Bi-Weekly \n",
387 | " 0 \n",
388 | " 0 \n",
389 | " 0 \n",
390 | " 0 \n",
391 | " Leo \n",
392 | " 1 \n",
393 | " 0 \n",
394 | " 0.00 \n",
395 | " 0 \n",
396 | " \n",
397 | " \n",
398 | " 1 \n",
399 | " 0 \n",
400 | " 28.0 \n",
401 | " R \n",
402 | " 0 \n",
403 | " 0 \n",
404 | " 1 \n",
405 | " 0 \n",
406 | " 0 \n",
407 | " 96 \n",
408 | " 0 \n",
409 | " ... \n",
410 | " Weekly \n",
411 | " 0 \n",
412 | " 0 \n",
413 | " 0 \n",
414 | " 0 \n",
415 | " Leo \n",
416 | " 0 \n",
417 | " 0 \n",
418 | " 1.47 \n",
419 | " 1 \n",
420 | " \n",
421 | " \n",
422 | " 2 \n",
423 | " 0 \n",
424 | " 35.0 \n",
425 | " R \n",
426 | " 47 \n",
427 | " 2 \n",
428 | " 86 \n",
429 | " 47 \n",
430 | " 0 \n",
431 | " 285 \n",
432 | " 0 \n",
433 | " ... \n",
434 | " Semi-Monthly \n",
435 | " 0 \n",
436 | " 0 \n",
437 | " 0 \n",
438 | " 0 \n",
439 | " Capricorn \n",
440 | " 1 \n",
441 | " 0 \n",
442 | " 2.17 \n",
443 | " 0 \n",
444 | " \n",
445 | " \n",
446 | " 3 \n",
447 | " 0 \n",
448 | " 26.0 \n",
449 | " R \n",
450 | " 26 \n",
451 | " 3 \n",
452 | " 38 \n",
453 | " 25 \n",
454 | " 0 \n",
455 | " 74 \n",
456 | " 0 \n",
457 | " ... \n",
458 | " Bi-Weekly \n",
459 | " 0 \n",
460 | " 0 \n",
461 | " 0 \n",
462 | " 0 \n",
463 | " Capricorn \n",
464 | " 0 \n",
465 | " 0 \n",
466 | " 1.10 \n",
467 | " 1 \n",
468 | " \n",
469 | " \n",
470 | " 4 \n",
471 | " 1 \n",
472 | " 27.0 \n",
473 | " na \n",
474 | " 0 \n",
475 | " 0 \n",
476 | " 2 \n",
477 | " 0 \n",
478 | " 0 \n",
479 | " 0 \n",
480 | " 0 \n",
481 | " ... \n",
482 | " Bi-Weekly \n",
483 | " 0 \n",
484 | " 0 \n",
485 | " 0 \n",
486 | " 0 \n",
487 | " Aries \n",
488 | " 1 \n",
489 | " 0 \n",
490 | " 0.03 \n",
491 | " 0 \n",
492 | " \n",
493 | " \n",
494 | " 5 \n",
495 | " 1 \n",
496 | " 32.0 \n",
497 | " R \n",
498 | " 5 \n",
499 | " 3 \n",
500 | " 111 \n",
501 | " 5 \n",
502 | " 0 \n",
503 | " 227 \n",
504 | " 0 \n",
505 | " ... \n",
506 | " Bi-Weekly \n",
507 | " 0 \n",
508 | " 0 \n",
509 | " 0 \n",
510 | " 0 \n",
511 | " Taurus \n",
512 | " 0 \n",
513 | " 0 \n",
514 | " 1.83 \n",
515 | " 0 \n",
516 | " \n",
517 | " \n",
518 | " 6 \n",
519 | " 0 \n",
520 | " 21.0 \n",
521 | " na \n",
522 | " 0 \n",
523 | " 0 \n",
524 | " 4 \n",
525 | " 0 \n",
526 | " 0 \n",
527 | " 0 \n",
528 | " 0 \n",
529 | " ... \n",
530 | " Bi-Weekly \n",
531 | " 0 \n",
532 | " 0 \n",
533 | " 0 \n",
534 | " 0 \n",
535 | " Cancer \n",
536 | " 0 \n",
537 | " 0 \n",
538 | " 0.07 \n",
539 | " 0 \n",
540 | " \n",
541 | " \n",
542 | " 7 \n",
543 | " 0 \n",
544 | " 24.0 \n",
545 | " na \n",
546 | " 0 \n",
547 | " 0 \n",
548 | " 2 \n",
549 | " 0 \n",
550 | " 0 \n",
551 | " 0 \n",
552 | " 0 \n",
553 | " ... \n",
554 | " na \n",
555 | " 0 \n",
556 | " 0 \n",
557 | " 0 \n",
558 | " 0 \n",
559 | " Leo \n",
560 | " 0 \n",
561 | " 0 \n",
562 | " 0.11 \n",
563 | " 0 \n",
564 | " \n",
565 | " \n",
566 | " 8 \n",
567 | " 0 \n",
568 | " 28.0 \n",
569 | " R \n",
570 | " 0 \n",
571 | " 0 \n",
572 | " 0 \n",
573 | " 0 \n",
574 | " 2 \n",
575 | " 47 \n",
576 | " 1 \n",
577 | " ... \n",
578 | " Bi-Weekly \n",
579 | " 0 \n",
580 | " 0 \n",
581 | " 0 \n",
582 | " 0 \n",
583 | " Sagittarius \n",
584 | " 0 \n",
585 | " 0 \n",
586 | " 0.87 \n",
587 | " 1 \n",
588 | " \n",
589 | " \n",
590 | " 9 \n",
591 | " 0 \n",
592 | " 23.0 \n",
593 | " na \n",
594 | " 1 \n",
595 | " 0 \n",
596 | " 87 \n",
597 | " 1 \n",
598 | " 0 \n",
599 | " 125 \n",
600 | " 0 \n",
601 | " ... \n",
602 | " Bi-Weekly \n",
603 | " 0 \n",
604 | " 0 \n",
605 | " 0 \n",
606 | " 0 \n",
607 | " Aquarius \n",
608 | " 0 \n",
609 | " 0 \n",
610 | " 1.07 \n",
611 | " 0 \n",
612 | " \n",
613 | " \n",
614 | " 10 \n",
615 | " 0 \n",
616 | " 32.0 \n",
617 | " R \n",
618 | " 0 \n",
619 | " 0 \n",
620 | " 30 \n",
621 | " 0 \n",
622 | " 0 \n",
623 | " 70 \n",
624 | " 0 \n",
625 | " ... \n",
626 | " Monthly \n",
627 | " 0 \n",
628 | " 0 \n",
629 | " 0 \n",
630 | " 0 \n",
631 | " Virgo \n",
632 | " 0 \n",
633 | " 0 \n",
634 | " 0.83 \n",
635 | " 1 \n",
636 | " \n",
637 | " \n",
638 | " 11 \n",
639 | " 0 \n",
640 | " 41.0 \n",
641 | " O \n",
642 | " 14 \n",
643 | " 0 \n",
644 | " 116 \n",
645 | " 13 \n",
646 | " 0 \n",
647 | " 217 \n",
648 | " 0 \n",
649 | " ... \n",
650 | " Semi-Monthly \n",
651 | " 0 \n",
652 | " 0 \n",
653 | " 0 \n",
654 | " 0 \n",
655 | " Virgo \n",
656 | " 0 \n",
657 | " 0 \n",
658 | " 2.17 \n",
659 | " 0 \n",
660 | " \n",
661 | " \n",
662 | " 12 \n",
663 | " 0 \n",
664 | " 20.0 \n",
665 | " na \n",
666 | " 0 \n",
667 | " 0 \n",
668 | " 135 \n",
669 | " 0 \n",
670 | " 0 \n",
671 | " 286 \n",
672 | " 0 \n",
673 | " ... \n",
674 | " Bi-Weekly \n",
675 | " 0 \n",
676 | " 0 \n",
677 | " 0 \n",
678 | " 0 \n",
679 | " Libra \n",
680 | " 0 \n",
681 | " 0 \n",
682 | " 2.20 \n",
683 | " 1 \n",
684 | " \n",
685 | " \n",
686 | " 13 \n",
687 | " 0 \n",
688 | " 27.0 \n",
689 | " O \n",
690 | " 2 \n",
691 | " 1 \n",
692 | " 48 \n",
693 | " 2 \n",
694 | " 0 \n",
695 | " 127 \n",
696 | " 0 \n",
697 | " ... \n",
698 | " Weekly \n",
699 | " 0 \n",
700 | " 0 \n",
701 | " 0 \n",
702 | " 0 \n",
703 | " Gemini \n",
704 | " 0 \n",
705 | " 0 \n",
706 | " 1.00 \n",
707 | " 0 \n",
708 | " \n",
709 | " \n",
710 | " 14 \n",
711 | " 0 \n",
712 | " 47.0 \n",
713 | " na \n",
714 | " 11 \n",
715 | " 1 \n",
716 | " 20 \n",
717 | " 11 \n",
718 | " 0 \n",
719 | " 30 \n",
720 | " 0 \n",
721 | " ... \n",
722 | " Bi-Weekly \n",
723 | " 0 \n",
724 | " 0 \n",
725 | " 0 \n",
726 | " 0 \n",
727 | " Virgo \n",
728 | " 0 \n",
729 | " 0 \n",
730 | " 0.69 \n",
731 | " 1 \n",
732 | " \n",
733 | " \n",
734 | " 15 \n",
735 | " 1 \n",
736 | " 44.0 \n",
737 | " na \n",
738 | " 0 \n",
739 | " 0 \n",
740 | " 0 \n",
741 | " 0 \n",
742 | " 1 \n",
743 | " 4 \n",
744 | " 0 \n",
745 | " ... \n",
746 | " Semi-Monthly \n",
747 | " 0 \n",
748 | " 0 \n",
749 | " 0 \n",
750 | " 0 \n",
751 | " Libra \n",
752 | " 0 \n",
753 | " 0 \n",
754 | " 0.07 \n",
755 | " 0 \n",
756 | " \n",
757 | " \n",
758 | " 16 \n",
759 | " 1 \n",
760 | " 24.0 \n",
761 | " na \n",
762 | " 6 \n",
763 | " 2 \n",
764 | " 24 \n",
765 | " 6 \n",
766 | " 0 \n",
767 | " 55 \n",
768 | " 0 \n",
769 | " ... \n",
770 | " Weekly \n",
771 | " 0 \n",
772 | " 0 \n",
773 | " 0 \n",
774 | " 0 \n",
775 | " Capricorn \n",
776 | " 0 \n",
777 | " 0 \n",
778 | " 0.90 \n",
779 | " 0 \n",
780 | " \n",
781 | " \n",
782 | " 17 \n",
783 | " 0 \n",
784 | " 31.0 \n",
785 | " R \n",
786 | " 17 \n",
787 | " 2 \n",
788 | " 62 \n",
789 | " 17 \n",
790 | " 0 \n",
791 | " 180 \n",
792 | " 0 \n",
793 | " ... \n",
794 | " na \n",
795 | " 0 \n",
796 | " 0 \n",
797 | " 0 \n",
798 | " 0 \n",
799 | " Libra \n",
800 | " 0 \n",
801 | " 0 \n",
802 | " 1.43 \n",
803 | " 0 \n",
804 | " \n",
805 | " \n",
806 | " 18 \n",
807 | " 0 \n",
808 | " 20.0 \n",
809 | " na \n",
810 | " 0 \n",
811 | " 0 \n",
812 | " 0 \n",
813 | " 0 \n",
814 | " 0 \n",
815 | " 0 \n",
816 | " 0 \n",
817 | " ... \n",
818 | " Monthly \n",
819 | " 0 \n",
820 | " 0 \n",
821 | " 0 \n",
822 | " 0 \n",
823 | " Sagittarius \n",
824 | " 1 \n",
825 | " 0 \n",
826 | " 0.00 \n",
827 | " 0 \n",
828 | " \n",
829 | " \n",
830 | " 19 \n",
831 | " 1 \n",
832 | " 30.0 \n",
833 | " na \n",
834 | " 2 \n",
835 | " 1 \n",
836 | " 0 \n",
837 | " 2 \n",
838 | " 0 \n",
839 | " 9 \n",
840 | " 0 \n",
841 | " ... \n",
842 | " Bi-Weekly \n",
843 | " 0 \n",
844 | " 0 \n",
845 | " 0 \n",
846 | " 0 \n",
847 | " Libra \n",
848 | " 1 \n",
849 | " 0 \n",
850 | " 0.07 \n",
851 | " 0 \n",
852 | " \n",
853 | " \n",
854 | " 20 \n",
855 | " 1 \n",
856 | " 59.0 \n",
857 | " na \n",
858 | " 0 \n",
859 | " 0 \n",
860 | " 0 \n",
861 | " 0 \n",
862 | " 0 \n",
863 | " 0 \n",
864 | " 0 \n",
865 | " ... \n",
866 | " Bi-Weekly \n",
867 | " 0 \n",
868 | " 0 \n",
869 | " 0 \n",
870 | " 0 \n",
871 | " Gemini \n",
872 | " 0 \n",
873 | " 0 \n",
874 | " 0.00 \n",
875 | " 0 \n",
876 | " \n",
877 | " \n",
878 | " 21 \n",
879 | " 1 \n",
880 | " 21.0 \n",
881 | " na \n",
882 | " 0 \n",
883 | " 0 \n",
884 | " 0 \n",
885 | " 0 \n",
886 | " 0 \n",
887 | " 3 \n",
888 | " 0 \n",
889 | " ... \n",
890 | " Bi-Weekly \n",
891 | " 0 \n",
892 | " 0 \n",
893 | " 0 \n",
894 | " 0 \n",
895 | " Sagittarius \n",
896 | " 0 \n",
897 | " 0 \n",
898 | " 0.00 \n",
899 | " 0 \n",
900 | " \n",
901 | " \n",
902 | " 22 \n",
903 | " 0 \n",
904 | " 37.0 \n",
905 | " R \n",
906 | " 0 \n",
907 | " 0 \n",
908 | " 39 \n",
909 | " 0 \n",
910 | " 0 \n",
911 | " 74 \n",
912 | " 0 \n",
913 | " ... \n",
914 | " na \n",
915 | " 0 \n",
916 | " 1 \n",
917 | " 0 \n",
918 | " 0 \n",
919 | " Scorpio \n",
920 | " 0 \n",
921 | " 0 \n",
922 | " 1.33 \n",
923 | " 0 \n",
924 | " \n",
925 | " \n",
926 | " 23 \n",
927 | " 1 \n",
928 | " 30.0 \n",
929 | " R \n",
930 | " 6 \n",
931 | " 0 \n",
932 | " 11 \n",
933 | " 6 \n",
934 | " 0 \n",
935 | " 20 \n",
936 | " 0 \n",
937 | " ... \n",
938 | " Bi-Weekly \n",
939 | " 0 \n",
940 | " 0 \n",
941 | " 0 \n",
942 | " 0 \n",
943 | " Gemini \n",
944 | " 0 \n",
945 | " 0 \n",
946 | " 0.40 \n",
947 | " 0 \n",
948 | " \n",
949 | " \n",
950 | " 24 \n",
951 | " 0 \n",
952 | " 36.0 \n",
953 | " na \n",
954 | " 0 \n",
955 | " 0 \n",
956 | " 2 \n",
957 | " 0 \n",
958 | " 0 \n",
959 | " 0 \n",
960 | " 0 \n",
961 | " ... \n",
962 | " Bi-Weekly \n",
963 | " 0 \n",
964 | " 0 \n",
965 | " 0 \n",
966 | " 0 \n",
967 | " Aquarius \n",
968 | " 0 \n",
969 | " 0 \n",
970 | " 0.50 \n",
971 | " 0 \n",
972 | " \n",
973 | " \n",
974 | " 25 \n",
975 | " 1 \n",
976 | " 44.0 \n",
977 | " O \n",
978 | " 0 \n",
979 | " 0 \n",
980 | " 0 \n",
981 | " 0 \n",
982 | " 3 \n",
983 | " 4 \n",
984 | " 3 \n",
985 | " ... \n",
986 | " Monthly \n",
987 | " 0 \n",
988 | " 0 \n",
989 | " 0 \n",
990 | " 0 \n",
991 | " Scorpio \n",
992 | " 0 \n",
993 | " 1 \n",
994 | " 0.17 \n",
995 | " 0 \n",
996 | " \n",
997 | " \n",
998 | " 26 \n",
999 | " 1 \n",
1000 | " 52.0 \n",
1001 | " na \n",
1002 | " 0 \n",
1003 | " 0 \n",
1004 | " 0 \n",
1005 | " 0 \n",
1006 | " 0 \n",
1007 | " 74 \n",
1008 | " 0 \n",
1009 | " ... \n",
1010 | " Bi-Weekly \n",
1011 | " 0 \n",
1012 | " 0 \n",
1013 | " 0 \n",
1014 | " 0 \n",
1015 | " Pisces \n",
1016 | " 1 \n",
1017 | " 0 \n",
1018 | " 1.30 \n",
1019 | " 0 \n",
1020 | " \n",
1021 | " \n",
1022 | " 27 \n",
1023 | " 1 \n",
1024 | " 25.0 \n",
1025 | " na \n",
1026 | " 0 \n",
1027 | " 0 \n",
1028 | " 0 \n",
1029 | " 0 \n",
1030 | " 0 \n",
1031 | " 32 \n",
1032 | " 0 \n",
1033 | " ... \n",
1034 | " Monthly \n",
1035 | " 0 \n",
1036 | " 0 \n",
1037 | " 0 \n",
1038 | " 0 \n",
1039 | " Virgo \n",
1040 | " 0 \n",
1041 | " 0 \n",
1042 | " 0.47 \n",
1043 | " 0 \n",
1044 | " \n",
1045 | " \n",
1046 | " 28 \n",
1047 | " 1 \n",
1048 | " 22.0 \n",
1049 | " na \n",
1050 | " 0 \n",
1051 | " 0 \n",
1052 | " 0 \n",
1053 | " 0 \n",
1054 | " 1 \n",
1055 | " 67 \n",
1056 | " 1 \n",
1057 | " ... \n",
1058 | " Bi-Weekly \n",
1059 | " 0 \n",
1060 | " 0 \n",
1061 | " 0 \n",
1062 | " 0 \n",
1063 | " Pisces \n",
1064 | " 0 \n",
1065 | " 0 \n",
1066 | " 0.90 \n",
1067 | " 0 \n",
1068 | " \n",
1069 | " \n",
1070 | " 29 \n",
1071 | " 0 \n",
1072 | " 38.0 \n",
1073 | " na \n",
1074 | " 0 \n",
1075 | " 0 \n",
1076 | " 27 \n",
1077 | " 0 \n",
1078 | " 0 \n",
1079 | " 30 \n",
1080 | " 0 \n",
1081 | " ... \n",
1082 | " Bi-Weekly \n",
1083 | " 0 \n",
1084 | " 0 \n",
1085 | " 0 \n",
1086 | " 0 \n",
1087 | " Libra \n",
1088 | " 0 \n",
1089 | " 0 \n",
1090 | " 1.00 \n",
1091 | " 0 \n",
1092 | " \n",
1093 | " \n",
1094 | " ... \n",
1095 | " ... \n",
1096 | " ... \n",
1097 | " ... \n",
1098 | " ... \n",
1099 | " ... \n",
1100 | " ... \n",
1101 | " ... \n",
1102 | " ... \n",
1103 | " ... \n",
1104 | " ... \n",
1105 | " ... \n",
1106 | " ... \n",
1107 | " ... \n",
1108 | " ... \n",
1109 | " ... \n",
1110 | " ... \n",
1111 | " ... \n",
1112 | " ... \n",
1113 | " ... \n",
1114 | " ... \n",
1115 | " ... \n",
1116 | " \n",
1117 | " \n",
1118 | " 26966 \n",
1119 | " 1 \n",
1120 | " 22.0 \n",
1121 | " R \n",
1122 | " 0 \n",
1123 | " 0 \n",
1124 | " 0 \n",
1125 | " 0 \n",
1126 | " 1 \n",
1127 | " 146 \n",
1128 | " 0 \n",
1129 | " ... \n",
1130 | " Weekly \n",
1131 | " 0 \n",
1132 | " 0 \n",
1133 | " 0 \n",
1134 | " 0 \n",
1135 | " Sagittarius \n",
1136 | " 0 \n",
1137 | " 0 \n",
1138 | " 1.23 \n",
1139 | " 0 \n",
1140 | " \n",
1141 | " \n",
1142 | " 26967 \n",
1143 | " 1 \n",
1144 | " 25.0 \n",
1145 | " na \n",
1146 | " 0 \n",
1147 | " 0 \n",
1148 | " 4 \n",
1149 | " 0 \n",
1150 | " 0 \n",
1151 | " 0 \n",
1152 | " 0 \n",
1153 | " ... \n",
1154 | " Bi-Weekly \n",
1155 | " 0 \n",
1156 | " 0 \n",
1157 | " 0 \n",
1158 | " 0 \n",
1159 | " Virgo \n",
1160 | " 0 \n",
1161 | " 0 \n",
1162 | " 0.00 \n",
1163 | " 0 \n",
1164 | " \n",
1165 | " \n",
1166 | " 26968 \n",
1167 | " 0 \n",
1168 | " 24.0 \n",
1169 | " na \n",
1170 | " 0 \n",
1171 | " 0 \n",
1172 | " 1 \n",
1173 | " 0 \n",
1174 | " 0 \n",
1175 | " 0 \n",
1176 | " 0 \n",
1177 | " ... \n",
1178 | " Bi-Weekly \n",
1179 | " 0 \n",
1180 | " 0 \n",
1181 | " 0 \n",
1182 | " 0 \n",
1183 | " Gemini \n",
1184 | " 0 \n",
1185 | " 0 \n",
1186 | " 0.03 \n",
1187 | " 0 \n",
1188 | " \n",
1189 | " \n",
1190 | " 26969 \n",
1191 | " 1 \n",
1192 | " 25.0 \n",
1193 | " R \n",
1194 | " 29 \n",
1195 | " 1 \n",
1196 | " 70 \n",
1197 | " 29 \n",
1198 | " 0 \n",
1199 | " 192 \n",
1200 | " 0 \n",
1201 | " ... \n",
1202 | " Bi-Weekly \n",
1203 | " 0 \n",
1204 | " 0 \n",
1205 | " 0 \n",
1206 | " 0 \n",
1207 | " Scorpio \n",
1208 | " 0 \n",
1209 | " 0 \n",
1210 | " 1.00 \n",
1211 | " 0 \n",
1212 | " \n",
1213 | " \n",
1214 | " 26970 \n",
1215 | " 0 \n",
1216 | " 26.0 \n",
1217 | " na \n",
1218 | " 0 \n",
1219 | " 0 \n",
1220 | " 3 \n",
1221 | " 0 \n",
1222 | " 0 \n",
1223 | " 0 \n",
1224 | " 0 \n",
1225 | " ... \n",
1226 | " Bi-Weekly \n",
1227 | " 0 \n",
1228 | " 0 \n",
1229 | " 0 \n",
1230 | " 0 \n",
1231 | " Libra \n",
1232 | " 0 \n",
1233 | " 0 \n",
1234 | " 0.10 \n",
1235 | " 1 \n",
1236 | " \n",
1237 | " \n",
1238 | " 26971 \n",
1239 | " 0 \n",
1240 | " 32.0 \n",
1241 | " na \n",
1242 | " 0 \n",
1243 | " 0 \n",
1244 | " 0 \n",
1245 | " 0 \n",
1246 | " 0 \n",
1247 | " 0 \n",
1248 | " 0 \n",
1249 | " ... \n",
1250 | " Bi-Weekly \n",
1251 | " 0 \n",
1252 | " 0 \n",
1253 | " 0 \n",
1254 | " 0 \n",
1255 | " Leo \n",
1256 | " 0 \n",
1257 | " 0 \n",
1258 | " 0.00 \n",
1259 | " 0 \n",
1260 | " \n",
1261 | " \n",
1262 | " 26972 \n",
1263 | " 0 \n",
1264 | " 52.0 \n",
1265 | " R \n",
1266 | " 0 \n",
1267 | " 0 \n",
1268 | " 10 \n",
1269 | " 0 \n",
1270 | " 0 \n",
1271 | " 203 \n",
1272 | " 0 \n",
1273 | " ... \n",
1274 | " Bi-Weekly \n",
1275 | " 0 \n",
1276 | " 0 \n",
1277 | " 0 \n",
1278 | " 0 \n",
1279 | " Leo \n",
1280 | " 0 \n",
1281 | " 0 \n",
1282 | " 2.03 \n",
1283 | " 0 \n",
1284 | " \n",
1285 | " \n",
1286 | " 26973 \n",
1287 | " 1 \n",
1288 | " 22.0 \n",
1289 | " R \n",
1290 | " 0 \n",
1291 | " 0 \n",
1292 | " 2 \n",
1293 | " 0 \n",
1294 | " 1 \n",
1295 | " 134 \n",
1296 | " 0 \n",
1297 | " ... \n",
1298 | " Weekly \n",
1299 | " 0 \n",
1300 | " 0 \n",
1301 | " 0 \n",
1302 | " 0 \n",
1303 | " Pisces \n",
1304 | " 0 \n",
1305 | " 0 \n",
1306 | " 1.00 \n",
1307 | " 1 \n",
1308 | " \n",
1309 | " \n",
1310 | " 26974 \n",
1311 | " 0 \n",
1312 | " 54.0 \n",
1313 | " R \n",
1314 | " 0 \n",
1315 | " 0 \n",
1316 | " 68 \n",
1317 | " 0 \n",
1318 | " 0 \n",
1319 | " 206 \n",
1320 | " 0 \n",
1321 | " ... \n",
1322 | " Bi-Weekly \n",
1323 | " 0 \n",
1324 | " 0 \n",
1325 | " 0 \n",
1326 | " 0 \n",
1327 | " Cancer \n",
1328 | " 1 \n",
1329 | " 0 \n",
1330 | " 1.63 \n",
1331 | " 1 \n",
1332 | " \n",
1333 | " \n",
1334 | " 26975 \n",
1335 | " 1 \n",
1336 | " 33.0 \n",
1337 | " na \n",
1338 | " 0 \n",
1339 | " 0 \n",
1340 | " 3 \n",
1341 | " 0 \n",
1342 | " 0 \n",
1343 | " 0 \n",
1344 | " 0 \n",
1345 | " ... \n",
1346 | " Monthly \n",
1347 | " 0 \n",
1348 | " 0 \n",
1349 | " 0 \n",
1350 | " 0 \n",
1351 | " Cancer \n",
1352 | " 0 \n",
1353 | " 0 \n",
1354 | " 1.00 \n",
1355 | " 0 \n",
1356 | " \n",
1357 | " \n",
1358 | " 26976 \n",
1359 | " 0 \n",
1360 | " 33.0 \n",
1361 | " na \n",
1362 | " 0 \n",
1363 | " 0 \n",
1364 | " 0 \n",
1365 | " 0 \n",
1366 | " 0 \n",
1367 | " 0 \n",
1368 | " 0 \n",
1369 | " ... \n",
1370 | " Bi-Weekly \n",
1371 | " 0 \n",
1372 | " 0 \n",
1373 | " 0 \n",
1374 | " 0 \n",
1375 | " na \n",
1376 | " 0 \n",
1377 | " 0 \n",
1378 | " 0.00 \n",
1379 | " 0 \n",
1380 | " \n",
1381 | " \n",
1382 | " 26977 \n",
1383 | " 0 \n",
1384 | " 48.0 \n",
1385 | " na \n",
1386 | " 0 \n",
1387 | " 0 \n",
1388 | " 24 \n",
1389 | " 0 \n",
1390 | " 0 \n",
1391 | " 32 \n",
1392 | " 0 \n",
1393 | " ... \n",
1394 | " Weekly \n",
1395 | " 0 \n",
1396 | " 0 \n",
1397 | " 0 \n",
1398 | " 0 \n",
1399 | " Leo \n",
1400 | " 0 \n",
1401 | " 0 \n",
1402 | " 0.43 \n",
1403 | " 0 \n",
1404 | " \n",
1405 | " \n",
1406 | " 26978 \n",
1407 | " 0 \n",
1408 | " 38.0 \n",
1409 | " O \n",
1410 | " 2 \n",
1411 | " 0 \n",
1412 | " 0 \n",
1413 | " 2 \n",
1414 | " 0 \n",
1415 | " 40 \n",
1416 | " 0 \n",
1417 | " ... \n",
1418 | " Bi-Weekly \n",
1419 | " 0 \n",
1420 | " 0 \n",
1421 | " 0 \n",
1422 | " 0 \n",
1423 | " Pisces \n",
1424 | " 0 \n",
1425 | " 0 \n",
1426 | " 0.63 \n",
1427 | " 1 \n",
1428 | " \n",
1429 | " \n",
1430 | " 26979 \n",
1431 | " 1 \n",
1432 | " 34.0 \n",
1433 | " na \n",
1434 | " 0 \n",
1435 | " 0 \n",
1436 | " 0 \n",
1437 | " 0 \n",
1438 | " 0 \n",
1439 | " 0 \n",
1440 | " 0 \n",
1441 | " ... \n",
1442 | " Monthly \n",
1443 | " 0 \n",
1444 | " 0 \n",
1445 | " 0 \n",
1446 | " 0 \n",
1447 | " Scorpio \n",
1448 | " 0 \n",
1449 | " 0 \n",
1450 | " 0.00 \n",
1451 | " 0 \n",
1452 | " \n",
1453 | " \n",
1454 | " 26980 \n",
1455 | " 0 \n",
1456 | " 25.0 \n",
1457 | " R \n",
1458 | " 0 \n",
1459 | " 0 \n",
1460 | " 0 \n",
1461 | " 0 \n",
1462 | " 0 \n",
1463 | " 210 \n",
1464 | " 0 \n",
1465 | " ... \n",
1466 | " Weekly \n",
1467 | " 0 \n",
1468 | " 0 \n",
1469 | " 1 \n",
1470 | " 0 \n",
1471 | " Aquarius \n",
1472 | " 0 \n",
1473 | " 0 \n",
1474 | " 2.00 \n",
1475 | " 0 \n",
1476 | " \n",
1477 | " \n",
1478 | " 26981 \n",
1479 | " 0 \n",
1480 | " 23.0 \n",
1481 | " na \n",
1482 | " 0 \n",
1483 | " 0 \n",
1484 | " 48 \n",
1485 | " 0 \n",
1486 | " 0 \n",
1487 | " 124 \n",
1488 | " 0 \n",
1489 | " ... \n",
1490 | " Weekly \n",
1491 | " 0 \n",
1492 | " 0 \n",
1493 | " 0 \n",
1494 | " 0 \n",
1495 | " Aries \n",
1496 | " 0 \n",
1497 | " 0 \n",
1498 | " 1.60 \n",
1499 | " 1 \n",
1500 | " \n",
1501 | " \n",
1502 | " 26982 \n",
1503 | " 0 \n",
1504 | " 35.0 \n",
1505 | " na \n",
1506 | " 0 \n",
1507 | " 0 \n",
1508 | " 57 \n",
1509 | " 0 \n",
1510 | " 0 \n",
1511 | " 156 \n",
1512 | " 0 \n",
1513 | " ... \n",
1514 | " Weekly \n",
1515 | " 0 \n",
1516 | " 0 \n",
1517 | " 0 \n",
1518 | " 0 \n",
1519 | " Leo \n",
1520 | " 0 \n",
1521 | " 0 \n",
1522 | " 1.47 \n",
1523 | " 1 \n",
1524 | " \n",
1525 | " \n",
1526 | " 26983 \n",
1527 | " 0 \n",
1528 | " 23.0 \n",
1529 | " na \n",
1530 | " 0 \n",
1531 | " 0 \n",
1532 | " 50 \n",
1533 | " 0 \n",
1534 | " 0 \n",
1535 | " 152 \n",
1536 | " 0 \n",
1537 | " ... \n",
1538 | " Bi-Weekly \n",
1539 | " 0 \n",
1540 | " 0 \n",
1541 | " 0 \n",
1542 | " 0 \n",
1543 | " Leo \n",
1544 | " 0 \n",
1545 | " 0 \n",
1546 | " 1.07 \n",
1547 | " 1 \n",
1548 | " \n",
1549 | " \n",
1550 | " 26984 \n",
1551 | " 0 \n",
1552 | " 24.0 \n",
1553 | " R \n",
1554 | " 0 \n",
1555 | " 0 \n",
1556 | " 62 \n",
1557 | " 0 \n",
1558 | " 0 \n",
1559 | " 136 \n",
1560 | " 0 \n",
1561 | " ... \n",
1562 | " Bi-Weekly \n",
1563 | " 0 \n",
1564 | " 0 \n",
1565 | " 0 \n",
1566 | " 0 \n",
1567 | " Virgo \n",
1568 | " 0 \n",
1569 | " 0 \n",
1570 | " 1.60 \n",
1571 | " 0 \n",
1572 | " \n",
1573 | " \n",
1574 | " 26985 \n",
1575 | " 0 \n",
1576 | " 31.0 \n",
1577 | " R \n",
1578 | " 0 \n",
1579 | " 0 \n",
1580 | " 22 \n",
1581 | " 0 \n",
1582 | " 0 \n",
1583 | " 138 \n",
1584 | " 0 \n",
1585 | " ... \n",
1586 | " Monthly \n",
1587 | " 0 \n",
1588 | " 0 \n",
1589 | " 0 \n",
1590 | " 0 \n",
1591 | " Virgo \n",
1592 | " 0 \n",
1593 | " 0 \n",
1594 | " 1.63 \n",
1595 | " 0 \n",
1596 | " \n",
1597 | " \n",
1598 | " 26986 \n",
1599 | " 0 \n",
1600 | " 29.0 \n",
1601 | " R \n",
1602 | " 1 \n",
1603 | " 1 \n",
1604 | " 70 \n",
1605 | " 1 \n",
1606 | " 0 \n",
1607 | " 147 \n",
1608 | " 0 \n",
1609 | " ... \n",
1610 | " Bi-Weekly \n",
1611 | " 0 \n",
1612 | " 0 \n",
1613 | " 0 \n",
1614 | " 0 \n",
1615 | " Cancer \n",
1616 | " 0 \n",
1617 | " 0 \n",
1618 | " 1.07 \n",
1619 | " 1 \n",
1620 | " \n",
1621 | " \n",
1622 | " 26987 \n",
1623 | " 0 \n",
1624 | " 30.0 \n",
1625 | " na \n",
1626 | " 0 \n",
1627 | " 0 \n",
1628 | " 2 \n",
1629 | " 0 \n",
1630 | " 0 \n",
1631 | " 4 \n",
1632 | " 0 \n",
1633 | " ... \n",
1634 | " Bi-Weekly \n",
1635 | " 0 \n",
1636 | " 0 \n",
1637 | " 0 \n",
1638 | " 0 \n",
1639 | " Leo \n",
1640 | " 0 \n",
1641 | " 0 \n",
1642 | " 0.03 \n",
1643 | " 0 \n",
1644 | " \n",
1645 | " \n",
1646 | " 26988 \n",
1647 | " 1 \n",
1648 | " 20.0 \n",
1649 | " R \n",
1650 | " 0 \n",
1651 | " 0 \n",
1652 | " 2 \n",
1653 | " 0 \n",
1654 | " 0 \n",
1655 | " 5 \n",
1656 | " 0 \n",
1657 | " ... \n",
1658 | " Bi-Weekly \n",
1659 | " 0 \n",
1660 | " 0 \n",
1661 | " 0 \n",
1662 | " 0 \n",
1663 | " Leo \n",
1664 | " 1 \n",
1665 | " 0 \n",
1666 | " 0.13 \n",
1667 | " 0 \n",
1668 | " \n",
1669 | " \n",
1670 | " 26989 \n",
1671 | " 0 \n",
1672 | " 29.0 \n",
1673 | " na \n",
1674 | " 1 \n",
1675 | " 1 \n",
1676 | " 5 \n",
1677 | " 1 \n",
1678 | " 0 \n",
1679 | " 5 \n",
1680 | " 0 \n",
1681 | " ... \n",
1682 | " Bi-Weekly \n",
1683 | " 0 \n",
1684 | " 0 \n",
1685 | " 0 \n",
1686 | " 0 \n",
1687 | " Scorpio \n",
1688 | " 1 \n",
1689 | " 0 \n",
1690 | " 0.03 \n",
1691 | " 0 \n",
1692 | " \n",
1693 | " \n",
1694 | " 26990 \n",
1695 | " 1 \n",
1696 | " 28.0 \n",
1697 | " R \n",
1698 | " 0 \n",
1699 | " 0 \n",
1700 | " 26 \n",
1701 | " 0 \n",
1702 | " 0 \n",
1703 | " 31 \n",
1704 | " 0 \n",
1705 | " ... \n",
1706 | " Monthly \n",
1707 | " 0 \n",
1708 | " 0 \n",
1709 | " 0 \n",
1710 | " 0 \n",
1711 | " Virgo \n",
1712 | " 0 \n",
1713 | " 0 \n",
1714 | " 0.60 \n",
1715 | " 0 \n",
1716 | " \n",
1717 | " \n",
1718 | " 26991 \n",
1719 | " 1 \n",
1720 | " 24.0 \n",
1721 | " R \n",
1722 | " 0 \n",
1723 | " 0 \n",
1724 | " 0 \n",
1725 | " 0 \n",
1726 | " 0 \n",
1727 | " 81 \n",
1728 | " 0 \n",
1729 | " ... \n",
1730 | " Weekly \n",
1731 | " 0 \n",
1732 | " 0 \n",
1733 | " 0 \n",
1734 | " 0 \n",
1735 | " Leo \n",
1736 | " 0 \n",
1737 | " 0 \n",
1738 | " 1.07 \n",
1739 | " 1 \n",
1740 | " \n",
1741 | " \n",
1742 | " 26992 \n",
1743 | " 1 \n",
1744 | " 26.0 \n",
1745 | " na \n",
1746 | " 0 \n",
1747 | " 0 \n",
1748 | " 2 \n",
1749 | " 0 \n",
1750 | " 0 \n",
1751 | " 1 \n",
1752 | " 0 \n",
1753 | " ... \n",
1754 | " Bi-Weekly \n",
1755 | " 0 \n",
1756 | " 0 \n",
1757 | " 0 \n",
1758 | " 1 \n",
1759 | " Cancer \n",
1760 | " 1 \n",
1761 | " 0 \n",
1762 | " 0.67 \n",
1763 | " 0 \n",
1764 | " \n",
1765 | " \n",
1766 | " 26993 \n",
1767 | " 0 \n",
1768 | " 22.0 \n",
1769 | " na \n",
1770 | " 0 \n",
1771 | " 0 \n",
1772 | " 37 \n",
1773 | " 0 \n",
1774 | " 0 \n",
1775 | " 98 \n",
1776 | " 0 \n",
1777 | " ... \n",
1778 | " Bi-Weekly \n",
1779 | " 0 \n",
1780 | " 0 \n",
1781 | " 0 \n",
1782 | " 0 \n",
1783 | " Taurus \n",
1784 | " 0 \n",
1785 | " 0 \n",
1786 | " 0.93 \n",
1787 | " 0 \n",
1788 | " \n",
1789 | " \n",
1790 | " 26994 \n",
1791 | " 1 \n",
1792 | " 46.0 \n",
1793 | " na \n",
1794 | " 2 \n",
1795 | " 0 \n",
1796 | " 16 \n",
1797 | " 2 \n",
1798 | " 0 \n",
1799 | " 58 \n",
1800 | " 0 \n",
1801 | " ... \n",
1802 | " Semi-Monthly \n",
1803 | " 0 \n",
1804 | " 0 \n",
1805 | " 0 \n",
1806 | " 0 \n",
1807 | " Aries \n",
1808 | " 1 \n",
1809 | " 0 \n",
1810 | " 0.90 \n",
1811 | " 1 \n",
1812 | " \n",
1813 | " \n",
1814 | " 26995 \n",
1815 | " 1 \n",
1816 | " 34.0 \n",
1817 | " na \n",
1818 | " 0 \n",
1819 | " 0 \n",
1820 | " 4 \n",
1821 | " 0 \n",
1822 | " 0 \n",
1823 | " 11 \n",
1824 | " 0 \n",
1825 | " ... \n",
1826 | " na \n",
1827 | " 0 \n",
1828 | " 0 \n",
1829 | " 0 \n",
1830 | " 0 \n",
1831 | " Cancer \n",
1832 | " 0 \n",
1833 | " 0 \n",
1834 | " 0.13 \n",
1835 | " 0 \n",
1836 | " \n",
1837 | " \n",
1838 | "
\n",
1839 | "
26996 rows × 27 columns
\n",
1840 | "
"
1841 | ],
1842 | "text/plain": [
1843 | " churn age housing deposits withdrawal purchases_partners \\\n",
1844 | "0 0 37.0 na 0 0 0 \n",
1845 | "1 0 28.0 R 0 0 1 \n",
1846 | "2 0 35.0 R 47 2 86 \n",
1847 | "3 0 26.0 R 26 3 38 \n",
1848 | "4 1 27.0 na 0 0 2 \n",
1849 | "5 1 32.0 R 5 3 111 \n",
1850 | "6 0 21.0 na 0 0 4 \n",
1851 | "7 0 24.0 na 0 0 2 \n",
1852 | "8 0 28.0 R 0 0 0 \n",
1853 | "9 0 23.0 na 1 0 87 \n",
1854 | "10 0 32.0 R 0 0 30 \n",
1855 | "11 0 41.0 O 14 0 116 \n",
1856 | "12 0 20.0 na 0 0 135 \n",
1857 | "13 0 27.0 O 2 1 48 \n",
1858 | "14 0 47.0 na 11 1 20 \n",
1859 | "15 1 44.0 na 0 0 0 \n",
1860 | "16 1 24.0 na 6 2 24 \n",
1861 | "17 0 31.0 R 17 2 62 \n",
1862 | "18 0 20.0 na 0 0 0 \n",
1863 | "19 1 30.0 na 2 1 0 \n",
1864 | "20 1 59.0 na 0 0 0 \n",
1865 | "21 1 21.0 na 0 0 0 \n",
1866 | "22 0 37.0 R 0 0 39 \n",
1867 | "23 1 30.0 R 6 0 11 \n",
1868 | "24 0 36.0 na 0 0 2 \n",
1869 | "25 1 44.0 O 0 0 0 \n",
1870 | "26 1 52.0 na 0 0 0 \n",
1871 | "27 1 25.0 na 0 0 0 \n",
1872 | "28 1 22.0 na 0 0 0 \n",
1873 | "29 0 38.0 na 0 0 27 \n",
1874 | "... ... ... ... ... ... ... \n",
1875 | "26966 1 22.0 R 0 0 0 \n",
1876 | "26967 1 25.0 na 0 0 4 \n",
1877 | "26968 0 24.0 na 0 0 1 \n",
1878 | "26969 1 25.0 R 29 1 70 \n",
1879 | "26970 0 26.0 na 0 0 3 \n",
1880 | "26971 0 32.0 na 0 0 0 \n",
1881 | "26972 0 52.0 R 0 0 10 \n",
1882 | "26973 1 22.0 R 0 0 2 \n",
1883 | "26974 0 54.0 R 0 0 68 \n",
1884 | "26975 1 33.0 na 0 0 3 \n",
1885 | "26976 0 33.0 na 0 0 0 \n",
1886 | "26977 0 48.0 na 0 0 24 \n",
1887 | "26978 0 38.0 O 2 0 0 \n",
1888 | "26979 1 34.0 na 0 0 0 \n",
1889 | "26980 0 25.0 R 0 0 0 \n",
1890 | "26981 0 23.0 na 0 0 48 \n",
1891 | "26982 0 35.0 na 0 0 57 \n",
1892 | "26983 0 23.0 na 0 0 50 \n",
1893 | "26984 0 24.0 R 0 0 62 \n",
1894 | "26985 0 31.0 R 0 0 22 \n",
1895 | "26986 0 29.0 R 1 1 70 \n",
1896 | "26987 0 30.0 na 0 0 2 \n",
1897 | "26988 1 20.0 R 0 0 2 \n",
1898 | "26989 0 29.0 na 1 1 5 \n",
1899 | "26990 1 28.0 R 0 0 26 \n",
1900 | "26991 1 24.0 R 0 0 0 \n",
1901 | "26992 1 26.0 na 0 0 2 \n",
1902 | "26993 0 22.0 na 0 0 37 \n",
1903 | "26994 1 46.0 na 2 0 16 \n",
1904 | "26995 1 34.0 na 0 0 4 \n",
1905 | "\n",
1906 | " purchases cc_taken cc_recommended cc_disliked ... \\\n",
1907 | "0 0 0 0 0 ... \n",
1908 | "1 0 0 96 0 ... \n",
1909 | "2 47 0 285 0 ... \n",
1910 | "3 25 0 74 0 ... \n",
1911 | "4 0 0 0 0 ... \n",
1912 | "5 5 0 227 0 ... \n",
1913 | "6 0 0 0 0 ... \n",
1914 | "7 0 0 0 0 ... \n",
1915 | "8 0 2 47 1 ... \n",
1916 | "9 1 0 125 0 ... \n",
1917 | "10 0 0 70 0 ... \n",
1918 | "11 13 0 217 0 ... \n",
1919 | "12 0 0 286 0 ... \n",
1920 | "13 2 0 127 0 ... \n",
1921 | "14 11 0 30 0 ... \n",
1922 | "15 0 1 4 0 ... \n",
1923 | "16 6 0 55 0 ... \n",
1924 | "17 17 0 180 0 ... \n",
1925 | "18 0 0 0 0 ... \n",
1926 | "19 2 0 9 0 ... \n",
1927 | "20 0 0 0 0 ... \n",
1928 | "21 0 0 3 0 ... \n",
1929 | "22 0 0 74 0 ... \n",
1930 | "23 6 0 20 0 ... \n",
1931 | "24 0 0 0 0 ... \n",
1932 | "25 0 3 4 3 ... \n",
1933 | "26 0 0 74 0 ... \n",
1934 | "27 0 0 32 0 ... \n",
1935 | "28 0 1 67 1 ... \n",
1936 | "29 0 0 30 0 ... \n",
1937 | "... ... ... ... ... ... \n",
1938 | "26966 0 1 146 0 ... \n",
1939 | "26967 0 0 0 0 ... \n",
1940 | "26968 0 0 0 0 ... \n",
1941 | "26969 29 0 192 0 ... \n",
1942 | "26970 0 0 0 0 ... \n",
1943 | "26971 0 0 0 0 ... \n",
1944 | "26972 0 0 203 0 ... \n",
1945 | "26973 0 1 134 0 ... \n",
1946 | "26974 0 0 206 0 ... \n",
1947 | "26975 0 0 0 0 ... \n",
1948 | "26976 0 0 0 0 ... \n",
1949 | "26977 0 0 32 0 ... \n",
1950 | "26978 2 0 40 0 ... \n",
1951 | "26979 0 0 0 0 ... \n",
1952 | "26980 0 0 210 0 ... \n",
1953 | "26981 0 0 124 0 ... \n",
1954 | "26982 0 0 156 0 ... \n",
1955 | "26983 0 0 152 0 ... \n",
1956 | "26984 0 0 136 0 ... \n",
1957 | "26985 0 0 138 0 ... \n",
1958 | "26986 1 0 147 0 ... \n",
1959 | "26987 0 0 4 0 ... \n",
1960 | "26988 0 0 5 0 ... \n",
1961 | "26989 1 0 5 0 ... \n",
1962 | "26990 0 0 31 0 ... \n",
1963 | "26991 0 0 81 0 ... \n",
1964 | "26992 0 0 1 0 ... \n",
1965 | "26993 0 0 98 0 ... \n",
1966 | "26994 2 0 58 0 ... \n",
1967 | "26995 0 0 11 0 ... \n",
1968 | "\n",
1969 | " payment_type waiting_4_loan cancelled_loan received_loan \\\n",
1970 | "0 Bi-Weekly 0 0 0 \n",
1971 | "1 Weekly 0 0 0 \n",
1972 | "2 Semi-Monthly 0 0 0 \n",
1973 | "3 Bi-Weekly 0 0 0 \n",
1974 | "4 Bi-Weekly 0 0 0 \n",
1975 | "5 Bi-Weekly 0 0 0 \n",
1976 | "6 Bi-Weekly 0 0 0 \n",
1977 | "7 na 0 0 0 \n",
1978 | "8 Bi-Weekly 0 0 0 \n",
1979 | "9 Bi-Weekly 0 0 0 \n",
1980 | "10 Monthly 0 0 0 \n",
1981 | "11 Semi-Monthly 0 0 0 \n",
1982 | "12 Bi-Weekly 0 0 0 \n",
1983 | "13 Weekly 0 0 0 \n",
1984 | "14 Bi-Weekly 0 0 0 \n",
1985 | "15 Semi-Monthly 0 0 0 \n",
1986 | "16 Weekly 0 0 0 \n",
1987 | "17 na 0 0 0 \n",
1988 | "18 Monthly 0 0 0 \n",
1989 | "19 Bi-Weekly 0 0 0 \n",
1990 | "20 Bi-Weekly 0 0 0 \n",
1991 | "21 Bi-Weekly 0 0 0 \n",
1992 | "22 na 0 1 0 \n",
1993 | "23 Bi-Weekly 0 0 0 \n",
1994 | "24 Bi-Weekly 0 0 0 \n",
1995 | "25 Monthly 0 0 0 \n",
1996 | "26 Bi-Weekly 0 0 0 \n",
1997 | "27 Monthly 0 0 0 \n",
1998 | "28 Bi-Weekly 0 0 0 \n",
1999 | "29 Bi-Weekly 0 0 0 \n",
2000 | "... ... ... ... ... \n",
2001 | "26966 Weekly 0 0 0 \n",
2002 | "26967 Bi-Weekly 0 0 0 \n",
2003 | "26968 Bi-Weekly 0 0 0 \n",
2004 | "26969 Bi-Weekly 0 0 0 \n",
2005 | "26970 Bi-Weekly 0 0 0 \n",
2006 | "26971 Bi-Weekly 0 0 0 \n",
2007 | "26972 Bi-Weekly 0 0 0 \n",
2008 | "26973 Weekly 0 0 0 \n",
2009 | "26974 Bi-Weekly 0 0 0 \n",
2010 | "26975 Monthly 0 0 0 \n",
2011 | "26976 Bi-Weekly 0 0 0 \n",
2012 | "26977 Weekly 0 0 0 \n",
2013 | "26978 Bi-Weekly 0 0 0 \n",
2014 | "26979 Monthly 0 0 0 \n",
2015 | "26980 Weekly 0 0 1 \n",
2016 | "26981 Weekly 0 0 0 \n",
2017 | "26982 Weekly 0 0 0 \n",
2018 | "26983 Bi-Weekly 0 0 0 \n",
2019 | "26984 Bi-Weekly 0 0 0 \n",
2020 | "26985 Monthly 0 0 0 \n",
2021 | "26986 Bi-Weekly 0 0 0 \n",
2022 | "26987 Bi-Weekly 0 0 0 \n",
2023 | "26988 Bi-Weekly 0 0 0 \n",
2024 | "26989 Bi-Weekly 0 0 0 \n",
2025 | "26990 Monthly 0 0 0 \n",
2026 | "26991 Weekly 0 0 0 \n",
2027 | "26992 Bi-Weekly 0 0 0 \n",
2028 | "26993 Bi-Weekly 0 0 0 \n",
2029 | "26994 Semi-Monthly 0 0 0 \n",
2030 | "26995 na 0 0 0 \n",
2031 | "\n",
2032 | " rejected_loan zodiac_sign left_for_two_month_plus left_for_one_month \\\n",
2033 | "0 0 Leo 1 0 \n",
2034 | "1 0 Leo 0 0 \n",
2035 | "2 0 Capricorn 1 0 \n",
2036 | "3 0 Capricorn 0 0 \n",
2037 | "4 0 Aries 1 0 \n",
2038 | "5 0 Taurus 0 0 \n",
2039 | "6 0 Cancer 0 0 \n",
2040 | "7 0 Leo 0 0 \n",
2041 | "8 0 Sagittarius 0 0 \n",
2042 | "9 0 Aquarius 0 0 \n",
2043 | "10 0 Virgo 0 0 \n",
2044 | "11 0 Virgo 0 0 \n",
2045 | "12 0 Libra 0 0 \n",
2046 | "13 0 Gemini 0 0 \n",
2047 | "14 0 Virgo 0 0 \n",
2048 | "15 0 Libra 0 0 \n",
2049 | "16 0 Capricorn 0 0 \n",
2050 | "17 0 Libra 0 0 \n",
2051 | "18 0 Sagittarius 1 0 \n",
2052 | "19 0 Libra 1 0 \n",
2053 | "20 0 Gemini 0 0 \n",
2054 | "21 0 Sagittarius 0 0 \n",
2055 | "22 0 Scorpio 0 0 \n",
2056 | "23 0 Gemini 0 0 \n",
2057 | "24 0 Aquarius 0 0 \n",
2058 | "25 0 Scorpio 0 1 \n",
2059 | "26 0 Pisces 1 0 \n",
2060 | "27 0 Virgo 0 0 \n",
2061 | "28 0 Pisces 0 0 \n",
2062 | "29 0 Libra 0 0 \n",
2063 | "... ... ... ... ... \n",
2064 | "26966 0 Sagittarius 0 0 \n",
2065 | "26967 0 Virgo 0 0 \n",
2066 | "26968 0 Gemini 0 0 \n",
2067 | "26969 0 Scorpio 0 0 \n",
2068 | "26970 0 Libra 0 0 \n",
2069 | "26971 0 Leo 0 0 \n",
2070 | "26972 0 Leo 0 0 \n",
2071 | "26973 0 Pisces 0 0 \n",
2072 | "26974 0 Cancer 1 0 \n",
2073 | "26975 0 Cancer 0 0 \n",
2074 | "26976 0 na 0 0 \n",
2075 | "26977 0 Leo 0 0 \n",
2076 | "26978 0 Pisces 0 0 \n",
2077 | "26979 0 Scorpio 0 0 \n",
2078 | "26980 0 Aquarius 0 0 \n",
2079 | "26981 0 Aries 0 0 \n",
2080 | "26982 0 Leo 0 0 \n",
2081 | "26983 0 Leo 0 0 \n",
2082 | "26984 0 Virgo 0 0 \n",
2083 | "26985 0 Virgo 0 0 \n",
2084 | "26986 0 Cancer 0 0 \n",
2085 | "26987 0 Leo 0 0 \n",
2086 | "26988 0 Leo 1 0 \n",
2087 | "26989 0 Scorpio 1 0 \n",
2088 | "26990 0 Virgo 0 0 \n",
2089 | "26991 0 Leo 0 0 \n",
2090 | "26992 1 Cancer 1 0 \n",
2091 | "26993 0 Taurus 0 0 \n",
2092 | "26994 0 Aries 1 0 \n",
2093 | "26995 0 Cancer 0 0 \n",
2094 | "\n",
2095 | " reward_rate is_referred \n",
2096 | "0 0.00 0 \n",
2097 | "1 1.47 1 \n",
2098 | "2 2.17 0 \n",
2099 | "3 1.10 1 \n",
2100 | "4 0.03 0 \n",
2101 | "5 1.83 0 \n",
2102 | "6 0.07 0 \n",
2103 | "7 0.11 0 \n",
2104 | "8 0.87 1 \n",
2105 | "9 1.07 0 \n",
2106 | "10 0.83 1 \n",
2107 | "11 2.17 0 \n",
2108 | "12 2.20 1 \n",
2109 | "13 1.00 0 \n",
2110 | "14 0.69 1 \n",
2111 | "15 0.07 0 \n",
2112 | "16 0.90 0 \n",
2113 | "17 1.43 0 \n",
2114 | "18 0.00 0 \n",
2115 | "19 0.07 0 \n",
2116 | "20 0.00 0 \n",
2117 | "21 0.00 0 \n",
2118 | "22 1.33 0 \n",
2119 | "23 0.40 0 \n",
2120 | "24 0.50 0 \n",
2121 | "25 0.17 0 \n",
2122 | "26 1.30 0 \n",
2123 | "27 0.47 0 \n",
2124 | "28 0.90 0 \n",
2125 | "29 1.00 0 \n",
2126 | "... ... ... \n",
2127 | "26966 1.23 0 \n",
2128 | "26967 0.00 0 \n",
2129 | "26968 0.03 0 \n",
2130 | "26969 1.00 0 \n",
2131 | "26970 0.10 1 \n",
2132 | "26971 0.00 0 \n",
2133 | "26972 2.03 0 \n",
2134 | "26973 1.00 1 \n",
2135 | "26974 1.63 1 \n",
2136 | "26975 1.00 0 \n",
2137 | "26976 0.00 0 \n",
2138 | "26977 0.43 0 \n",
2139 | "26978 0.63 1 \n",
2140 | "26979 0.00 0 \n",
2141 | "26980 2.00 0 \n",
2142 | "26981 1.60 1 \n",
2143 | "26982 1.47 1 \n",
2144 | "26983 1.07 1 \n",
2145 | "26984 1.60 0 \n",
2146 | "26985 1.63 0 \n",
2147 | "26986 1.07 1 \n",
2148 | "26987 0.03 0 \n",
2149 | "26988 0.13 0 \n",
2150 | "26989 0.03 0 \n",
2151 | "26990 0.60 0 \n",
2152 | "26991 1.07 1 \n",
2153 | "26992 0.67 0 \n",
2154 | "26993 0.93 0 \n",
2155 | "26994 0.90 1 \n",
2156 | "26995 0.13 0 \n",
2157 | "\n",
2158 | "[26996 rows x 27 columns]"
2159 | ]
2160 | },
2161 | "execution_count": 8,
2162 | "metadata": {},
2163 | "output_type": "execute_result"
2164 | }
2165 | ],
2166 | "source": [
2167 | "## Data preparation \n",
2168 | "user_identifier = dataset['user']\n",
2169 | "\n",
2170 | "#Remove the user from the dataset\n",
2171 | "dataset.drop(columns = ['user'])"
2172 | ]
2173 | },
2174 | {
2175 | "cell_type": "code",
2176 | "execution_count": 9,
2177 | "metadata": {},
2178 | "outputs": [],
2179 | "source": [
2180 | "# One-hot Encoding\n",
2181 | "dataset = pd.get_dummies(dataset)\n",
2182 | "\n",
2183 | "#dataset.housing.value_counts()"
2184 | ]
2185 | },
2186 | {
2187 | "cell_type": "code",
2188 | "execution_count": 10,
2189 | "metadata": {},
2190 | "outputs": [
2191 | {
2192 | "data": {
2193 | "text/plain": [
2194 | "Index(['user', 'churn', 'age', 'deposits', 'withdrawal', 'purchases_partners',\n",
2195 | " 'purchases', 'cc_taken', 'cc_recommended', 'cc_disliked', 'cc_liked',\n",
2196 | " 'cc_application_begin', 'app_downloaded', 'web_user', 'ios_user',\n",
2197 | " 'android_user', 'registered_phones', 'waiting_4_loan', 'cancelled_loan',\n",
2198 | " 'received_loan', 'rejected_loan', 'left_for_two_month_plus',\n",
2199 | " 'left_for_one_month', 'reward_rate', 'is_referred', 'housing_O',\n",
2200 | " 'housing_R', 'housing_na', 'payment_type_Bi-Weekly',\n",
2201 | " 'payment_type_Monthly', 'payment_type_Semi-Monthly',\n",
2202 | " 'payment_type_Weekly', 'payment_type_na', 'zodiac_sign_Aquarius',\n",
2203 | " 'zodiac_sign_Aries', 'zodiac_sign_Cancer', 'zodiac_sign_Capricorn',\n",
2204 | " 'zodiac_sign_Gemini', 'zodiac_sign_Leo', 'zodiac_sign_Libra',\n",
2205 | " 'zodiac_sign_Pisces', 'zodiac_sign_Sagittarius', 'zodiac_sign_Scorpio',\n",
2206 | " 'zodiac_sign_Taurus', 'zodiac_sign_Virgo', 'zodiac_sign_na'],\n",
2207 | " dtype='object')"
2208 | ]
2209 | },
2210 | "execution_count": 10,
2211 | "metadata": {},
2212 | "output_type": "execute_result"
2213 | }
2214 | ],
2215 | "source": [
2216 | "# shows how many new columns have been created using the get_dummies function\n",
2217 | "dataset.columns"
2218 | ]
2219 | },
2220 | {
2221 | "cell_type": "code",
2222 | "execution_count": 11,
2223 | "metadata": {},
2224 | "outputs": [],
2225 | "source": [
2226 | "# To avoid correlated variables we drop some columns\n",
2227 | "dataset = dataset.drop(columns = ['housing_na','zodiac_sign_na','payment_type_na'])"
2228 | ]
2229 | },
2230 | {
2231 | "cell_type": "code",
2232 | "execution_count": 12,
2233 | "metadata": {},
2234 | "outputs": [],
2235 | "source": [
2236 | "# spliting the dataset into Training and Test set\n",
2237 | "from sklearn.model_selection import train_test_split\n",
2238 | "\n",
2239 | "# generate the datasets for training. Test 20%\n",
2240 | "X_train, X_test, y_train, y_test = train_test_split(dataset.drop(columns = 'churn'),\n",
2241 | " dataset['churn'],\n",
2242 | " test_size = 0.2,\n",
2243 | " random_state = 0)"
2244 | ]
2245 | },
2246 | {
2247 | "cell_type": "code",
2248 | "execution_count": 13,
2249 | "metadata": {},
2250 | "outputs": [
2251 | {
2252 | "data": {
2253 | "text/plain": [
2254 | "0 12656\n",
2255 | "1 8940\n",
2256 | "Name: churn, dtype: int64"
2257 | ]
2258 | },
2259 | "execution_count": 13,
2260 | "metadata": {},
2261 | "output_type": "execute_result"
2262 | }
2263 | ],
2264 | "source": [
2265 | "# balancing the training Set\n",
2266 | "\n",
2267 | "#Counting and watching the Y_train distribution \n",
2268 | "y_train.value_counts()"
2269 | ]
2270 | },
2271 | {
2272 | "cell_type": "code",
2273 | "execution_count": 14,
2274 | "metadata": {},
2275 | "outputs": [],
2276 | "source": [
2277 | "# The expecting distribution to avoid a Bias is to have a 50/50 (0/1)\n",
2278 | "#This balancing ensure that the model is accuracy\n",
2279 | "\n",
2280 | "#pos and neg index values\n",
2281 | "pos_index = y_train[y_train.values == 1].index\n",
2282 | "neg_index = y_train[y_train.values == 0].index\n",
2283 | "\n",
2284 | "if len(pos_index) > len(neg_index):\n",
2285 | " higher = pos_index\n",
2286 | " lower = neg_index\n",
2287 | "else:\n",
2288 | " lower = pos_index\n",
2289 | " higher = neg_index\n",
2290 | " \n",
2291 | "#Create random index selection\n",
2292 | "random.seed(0)\n",
2293 | "# random select index record up to the same size of lower index\n",
2294 | "higher = np.random.choice(higher, size = len(lower))\n",
2295 | "# select the lower index and convert to a numpy array\n",
2296 | "lower = np.asarray(lower)\n",
2297 | "# create the new index as a combination \n",
2298 | "new_indexes = np.concatenate((lower,higher))\n",
2299 | "\n",
2300 | "# Reselect the X_train, y_train dataset using the new indexes\n",
2301 | "X_train = X_train.loc[new_indexes, ]\n",
2302 | "y_train = y_train[new_indexes]"
2303 | ]
2304 | },
2305 | {
2306 | "cell_type": "code",
2307 | "execution_count": 15,
2308 | "metadata": {},
2309 | "outputs": [
2310 | {
2311 | "name": "stderr",
2312 | "output_type": "stream",
2313 | "text": [
2314 | "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\preprocessing\\data.py:625: DataConversionWarning: Data with input dtype uint8, int64, float64 were all converted to float64 by StandardScaler.\n",
2315 | " return self.partial_fit(X, y)\n",
2316 | "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\base.py:462: DataConversionWarning: Data with input dtype uint8, int64, float64 were all converted to float64 by StandardScaler.\n",
2317 | " return self.fit(X, **fit_params).transform(X)\n",
2318 | "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\preprocessing\\data.py:625: DataConversionWarning: Data with input dtype uint8, int64, float64 were all converted to float64 by StandardScaler.\n",
2319 | " return self.partial_fit(X, y)\n",
2320 | "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\sklearn\\base.py:462: DataConversionWarning: Data with input dtype uint8, int64, float64 were all converted to float64 by StandardScaler.\n",
2321 | " return self.fit(X, **fit_params).transform(X)\n"
2322 | ]
2323 | }
2324 | ],
2325 | "source": [
2326 | "# Feature scaling\n",
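"# StandardScaler rescales each column to zero mean and unit variance: z = (x - mean) / std\n",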
2327 | "from sklearn.preprocessing import StandardScaler\n",
2328 | "# create a StandardScaler instance\n",
2329 | "sc_X = StandardScaler()\n",
2330 | "\n",
2331 | "# StandardScaler returns a NumPy array, so we convert back to a DataFrame\n",
2332 | "X_train2 = pd.DataFrame(sc_X.fit_transform(X_train))\n",
2333 | "X_test2 = pd.DataFrame(sc_X.transform(X_test))  # transform only, so the test set is scaled with the training-set statistics\n",
2334 | "\n",
2335 | "# copy the column names to the new training and test DataFrames\n",
2336 | "X_train2.columns = X_train.columns.values\n",
2337 | "X_test2.columns = X_test.columns.values\n",
2338 | "\n",
2339 | "# copy the index to the new training and test DataFrames\n",
2340 | "X_train2.index = X_train.index.values\n",
2341 | "X_test2.index = X_test.index.values\n",
2342 | "\n",
2343 | "# reassign the scaled copies back to the original variables\n",
2344 | "X_train = X_train2\n",
2345 | "X_test = X_test2"
2346 | ]
2347 | },
2348 | {
2349 | "cell_type": "code",
2350 | "execution_count": 16,
2351 | "metadata": {},
2352 | "outputs": [
2353 | {
2354 | "data": {
2355 | "text/plain": [
2356 | "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
2357 | " intercept_scaling=1, max_iter=100, multi_class='warn',\n",
2358 | " n_jobs=None, penalty='l2', random_state=0, solver='lbfgs',\n",
2359 | " tol=0.0001, verbose=0, warm_start=False)"
2360 | ]
2361 | },
2362 | "execution_count": 16,
2363 | "metadata": {},
2364 | "output_type": "execute_result"
2365 | }
2366 | ],
2367 | "source": [
2368 | "### Model building ###\n",
2369 | "\n",
2370 | "# Fitting the model to the training set\n",
2371 | "from sklearn.linear_model import LogisticRegression\n",
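"# logistic regression models the churn probability as sigmoid(w.x + b); the fitted coefficients are inspected further down\n",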
2372 | "classifier = LogisticRegression(random_state=0, solver='lbfgs')\n",
2373 | "classifier.fit(X_train, y_train)"
2374 | ]
2375 | },
2376 | {
2377 | "cell_type": "code",
2378 | "execution_count": 17,
2379 | "metadata": {},
2380 | "outputs": [],
2381 | "source": [
2382 | "# Predicting on the test set\n",
2383 | "y_pred = classifier.predict(X_test)"
2384 | ]
2385 | },
2386 | {
2387 | "cell_type": "code",
2388 | "execution_count": 18,
2389 | "metadata": {},
2390 | "outputs": [],
2391 | "source": [
2392 | "# Evaluating the results\n",
2393 | "from sklearn.metrics import confusion_matrix, accuracy_score, f1_score, precision_score, recall_score\n",
2394 | "cnf_matrix = confusion_matrix(y_test, y_pred)"
2395 | ]
2396 | },
2397 | {
2398 | "cell_type": "code",
2399 | "execution_count": 19,
2400 | "metadata": {},
2401 | "outputs": [
2402 | {
2403 | "data": {
2404 | "text/plain": [
2405 | "0.612037037037037"
2406 | ]
2407 | },
2408 | "execution_count": 19,
2409 | "metadata": {},
2410 | "output_type": "execute_result"
2411 | }
2412 | ],
2413 | "source": [
2414 | "# Accuracy score\n",
2415 | "accuracy_score(y_test, y_pred)"
2416 | ]
2417 | },
2418 | {
2419 | "cell_type": "code",
2420 | "execution_count": 20,
2421 | "metadata": {},
2422 | "outputs": [
2423 | {
2424 | "data": {
2425 | "text/plain": [
2426 | "0.5213911972914743"
2427 | ]
2428 | },
2429 | "execution_count": 20,
2430 | "metadata": {},
2431 | "output_type": "execute_result"
2432 | }
2433 | ],
2434 | "source": [
2435 | "# precision score (of the customers predicted to churn, the fraction that actually churned)\n",
2436 | "precision_score(y_test, y_pred)"
2437 | ]
2438 | },
2439 | {
2440 | "cell_type": "code",
2441 | "execution_count": 21,
2442 | "metadata": {},
2443 | "outputs": [
2444 | {
2445 | "data": {
2446 | "text/plain": [
2447 | "0.7582811101163832"
2448 | ]
2449 | },
2450 | "execution_count": 21,
2451 | "metadata": {},
2452 | "output_type": "execute_result"
2453 | }
2454 | ],
2455 | "source": [
2456 | "# recall score (of the customers who actually churned, the fraction the model caught)\n",
2457 | "recall_score(y_test, y_pred)"
2458 | ]
2459 | },
2460 | {
2461 | "cell_type": "code",
2462 | "execution_count": 22,
2463 | "metadata": {},
2464 | "outputs": [
2465 | {
2466 | "data": {
2467 | "text/plain": [
2468 | "0.617909903337589"
2469 | ]
2470 | },
2471 | "execution_count": 22,
2472 | "metadata": {},
2473 | "output_type": "execute_result"
2474 | }
2475 | ],
2476 | "source": [
2477 | "f1_score(y_test, y_pred)"
2478 | ]
2479 | },
2480 | {
2481 | "cell_type": "code",
2482 | "execution_count": 23,
2483 | "metadata": {},
2484 | "outputs": [
2485 | {
2486 | "name": "stdout",
2487 | "output_type": "stream",
2488 | "text": [
2489 | "Accuracy 0.6120\n"
2490 | ]
2491 | },
2492 | {
2493 | "data": {
2494 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAExCAYAAAAp2zZLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xm8VVXdx/HP914UwYlJQUEDEufHCUHzSTMMxRGyQbDBjLoNmlqZWpqYSvpoZmlOlCAOIWpiOKQiORSJMigmOZHjVZRJ0UBk+j1/7H3pcL3DOXfc7Pt9+9ovzll77b3Wxvu6P9aw11JEYGZmljVlrV0BMzOzmjhAmZlZJjlAmZlZJjlAmZlZJjlAmZlZJjlAmZlZJjlAWaZJ6iDpbklLJd3eiPt8RdKDTVm31iLpQEkvtHY9zJqb/B6UNQVJxwM/AnYGPgCeBkZHxN8bed+vAT8ADoiI1Y2uaMZJCqBfRMxr7bqYtTa3oKzRJP0I+A3wS6A7sD1wNTC0CW7/CeDFthCciiGpXWvXwaylOEBZo0jaEjgfOCki7oyIZRGxKiLujoifpHnaS/qNpLfS4zeS2qfnDpZUKenHkhZImi/pxPTcL4BzgeMk/UfSSEnnSbq5oPzekqLqF7ekb0h6WdIHkl6R9JWC9L8XXHeApBlp1+EMSQcUnHtE0gWSpqX3eVBSt1qev6r+ZxTUf5ikIyS9KGmJpJ8V5B8o6XFJ76V5fydp4/TcY2m2OenzHldw/zMlvQ2Mq0pLr/lkWsY+6fdtJS2SdHCj/seaZYADlDXWp4BNgEl15Dkb2B/YC9gTGAicU3C+B7Al0BMYCVwlqXNEjCJplU2MiM0i4vq6KiJpU+AK4PCI2Bw4gKSrsXq+LsC9ad6uwK+BeyV1Lch2PHAisDWwMXB6HUX3IPk76EkSUH8PfBXoDxwInCupb5p3DfBDoBvJ390hwPcBIuKgNM+e6fNOLLh/F5LWZEVhwRHxb+BM4BZJHYFxwA0R8Ugd9TXbIDhAWWN1BRbV0wX3FeD8iFgQEQuBXwBfKzi/Kj2/KiLuA/4D7NTA+qwFdpfUISLmR8TcGvIcCbwUETdFxOqImAA8DxxdkGdcRLwYER8Ct5EE19qsIhlvWwXcShJ8fhsRH6TlzwX2AIiIWRExPS33VeA64DNFPNOoiPgorc96IuL3wEvAE8A2JP8gMNvgOUBZYy0GutUzNrIt8FrB99fStHX3qBbglgOblVqRiFgGHAd8F5gv6V5JOxdRn6o69Sz4/nYJ9VkcEWvSz1UB5J2C8x9WXS9pR0n3SHpb0vskLcQauw8LLIyIFfXk+T2wO3BlRHxUT16zDYIDlDXW48AKYFgded4i6Z6qsn2a1hDLgI4F33sUnoyIByJiMElL4nmSX9z11aeqTm82sE6luIakXv0iYgvgZ4DquabOqbaSNiOZpHI9cF7ahWm2wXOAskaJiKUk4y5XpZMDOkraSNLhki5Js00AzpG0VTrZ4Fzg5truWY+ngYMkbZ9O0Php1QlJ3SUdk45FfUTSVbimhnvcB+wo6XhJ7SQdB+wK3NPAOpVic+B94D9p6+571c6/A/T92FV1+y0wKyK+RTK2dm2ja2mWAQ5Q1mgR8WuSd6DOARYCbwAnA3elWS4EZgLPAP8EZqdpDSlrCjAxvdcs1g8qZcCPSVpIS0jGdr5fwz0WA0eleRcDZwBHRcSihtSpRKeTTMD4gKR1N7Ha+fOA8eksvy/XdzNJQ4EhJN2akPx/2Kdq9qLZhswv6pqZWSa5BWVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGWtRtIaSU9LelbS7elacg2918GS7kk/HyPprDrydpL0sennRZRxnqSPrclXW3q1PDdI+mIJZfWW9GypdTTLEwcoa00fRsReEbE7sJL/vssDgBIl/4xGxOSIuLiOLJ2o4f0oM8sWByjLir8BO6Qth+ckXU3yQu92kg5Nt6iYnba0qta1GyLp+XQbjWOrbpRurfG79HN3SZMkzUmPA4CLgU+mrbdL03w/SbfdeEbJNh9V9zpb0guSHqKIBWwlfTu9zxxJf6rWKvycpL+l23AcleYvl3RpQdnfaexfpFleOEBZq0sXmj2cZJUJSALBjRGxN8nae+cAn4uIfUhWpPiRpE1IVmI4mmRLix4fu3HiCuDRiNgT2IdkZfGzgH+nrbefSDoU6EeyDcheQH9JB0nqDwwH9iYJgAOKeJw7I2JAWt5zJNuHVOlNsrrFkcC16TOMBJZGxID0/t+W1KeIcsxyz7tzWmvqIKlqv6a/kSx2ui3wWkRMT9P3J1knb5okSPZmepxka/lXIuIlACWbGK63V1JqEPB1gHTF8aWSOlfLc2h6PJV+34wkYG0OTIqI5WkZk4t4pt0lXUjSjbgZ8EDBudsiYi3wkqSX02c4FNijYHxqy7TsF4soyyzXHKCsNX0YEevts5QGoWWFScCUiBhRLd9e1LPKdwkEXBQR11Ur47QGlHEDMCwi5kj6BnBwwbnq94q07B9ERGEgQ1LvEss1yx138VnWTQf+V9IOAOlq6TuSbFnRR9In03wjarl+KumK4el4zxYkC7VuXpDnAeCbBWNbPSVtDTwGfF5SB0mbs/6GhrXZnGQvqo1INmos9CVJZWmd+wIvpGV/L81ftV/UpkWUY5Z7bkFZpkXEwrQlMkFS+zT5nIh4UVIFyVbti4C/k2zYV92pwBhJI0m23vheRDwuaVo6jfsv6TjULsDjaQvuP8BXI2K2pIkkW3y8RtINWZ+fk+xs+xrJmFphIHwBeBToDnw3IlZI+gPJ2NRsJYUvpO69tczaDK9mbmZmmeQuPjMzyyQHKDMzyyQHKDMzy6TMTpLYZrezPThmLWrTDlu3dhWsDZo381Q15f06bD+ipN+dH74+oUnLb0qZDVBmZla6BixfmVkOUGZmOaIcjdw4QJmZ5YhbUGZmlkkOUGZmlknpaii54ABlZpYrbkGZmVkGuYvPzMwyyQHKzMwyydPMzcwsk9yCMjOzTHKAMjOzTHKAMjOzTBJ+D8rMzDLILSgzM8ukPAWo/DyJmZkhlZV01H8/jZW0QNKz1dJ/IOkFSXMlXVKQ/lNJ89JzhxWkD0nT5kk6q5hncQvKzCxXmrzdcQPwO+DGqgRJnwWGAntExEeStk7TdwWGA7sB2wIPSdoxvewqYDBQCcyQNDki/lVXwQ5QZmY50tRdfBHxmKTe1ZK/B1wcER+leRak6UOBW9P0VyTNAwam5+ZFxMtJHXVrmrfOAOUuPjOzHGnqLr5a7AgcKOkJSY9KGpCm9wTeKMhXmabVll4nt6DMzHKk1KWOJFUAFQVJYyJiTD2XtQM6A/sDA4DbJPWFGue4BzU3hqK+ujlAmZnlSKmtojQY1ReQqqsE7oyIAJ6UtBbolqZvV5CvF/BW+rm29Fq5i8/MLEfKyspLOhroLmAQQDoJYmNgETAZGC6pvaQ+QD/gSWAG0E9SH0kbk0ykmFxfIW5BmZnlSFOvZi5pAn
Aw0E1SJTAKGAuMTaeerwROSFtTcyXdRjL5YTVwUkSsSe9zMvAAUA6MjYi59ZXtAGVmliPNMItvRC2nvlpL/tHA6BrS7wPuK6VsBygzsxzJ00oSDlBmZjniDQvNzCyb3IIyM7MschefmZllkuT9oMzMLIM8BmVmZpnkLj4zM8smd/GZmVkm5acB5QBlZpYrbkGZmVkmOUCZmVkmuYvPzMyyKNyCMjOzTMpPfHKAMjPLlbL8RCgHKDOzPHEXn5mZZVJ+4pMDlJlZrriLz8zMMsldfGZmlkn5iU8OUGZmueIuPjMzy6T8xCcHKDOzPIny/Kx15ABlZpYnbkGZmVkmeRafmZllkidJmJlZJuUnPjlAmZnlirv4zMwskxygzMwsk/Izy9wByswsV9yCMjOzTMpPfHKA2hD8+oJjGfyZnVi0ZBmfHXbFuvRvHr8/Jx6/P2vWrOWhx17gwsseoPOWHfj9b45nr917MvGupzh79N3r8p91ymC+eMxedNqyAzsMOL81HsU2IBed+zkGfboPi99dzhHH3QLAKRX78eVhu7Pk3Q8BuOzqf/DotFfpuc3mPHD713n5tXcBePrZtzn3or8CcMt1X2CrbpuyYsVqAL5x8qR111vTC08zt5Z0212zGffH6Vxx0RfXpR0wsA+HDdqFQz5/JStXraFrl00BWLFyNZdc+RA779Cdnfp1X+8+Dz7yPGP/OJ1//OWHLVp/2zDdefe/uHniHC49/9D10sf98Smuv3n2x/K//uZ7HPOVP9Z4rx+dcz/PPregWepp1eSoi6/ZhtMk7SzpTElXSPpt+nmX5iovz6bPepV3ly5fL+2E4/bjd394jJWr1gCweMkyAD78cBVPzn6NFStXfew+s595gwWLPmj+ClsuzHjqLd57f0VrV8NKpRKP+m4njZW0QNKzNZw7XVJI6pZ+V/o7f56kZyTtU5D3BEkvpccJxTxKswQoSWcCt5I8/pPAjPTzBElnNUeZbU3f3t3Yr39v7p3wXe684VvsuXvP1q6StRFf+/Ke3DPhK1x07ufYYvP269J7bbslk28ZwR+v+wL77rXtetf836jBTL7leE4aObClq9v2lKm0o343AEOqJ0raDhgMvF6QfDjQLz0qgGvSvF2AUcB+wEBglKTO9T5KMbVrgJHAgIi4OCJuTo+L04qNrO0iSRWSZkqaufzdp5qpavnQrryMLbfYhCNHXMv5l93PmMuGt3aVrA245Y5/MmjYDRx9/C0sXLSMn/7wQAAWLlrOQUeN5ZivTGD05X/j8guHsNmmGwNJ996Rw29hxLdvZ8De2zLsyJ1b8xHyTyrtqEdEPAYsqeHU5cAZQBSkDQVujMR0oJOkbYDDgCkRsSQi3gWmUEPQq665AtRaYNsa0rdJz9UoIsZExL4RsW/Hzns3U9XyYf47S7nvoX8B8PQ/K1m7NujauWMr18rybvGS5axdG0TAxEnPsuduyTjnylVreG9p0h049/kFvP7mUnpv3wmAdxYm3c/Llq9i8v0vsOduPVqn8m1FE3fx1ViEdAzwZkTMqXaqJ/BGwffKNK229Do11ySJ04Cpkl4qqNT2wA7Ayc1UZpty/9Tn+PR+fXl8xiv0/URXNtqonMXvLq//QrNG2KprRxYuTn7ODv3sDrz478UAdOnUgffeX8HatcF2PbfgE9t14o03l1JeLrbYrD3vLl1Bu/IyBh3Yh2lPvlFXEdZYJc7ik1RB0h1XZUxEjKkjf0fgbODQmk7XkBZ1pNepWQJURNwvaUeSLr2eJJWrBGZExJrmKDPPrr70yxwwoC9dOnVk1tQz+NVVU5kwaRaXX3AsD991CqtWreHUs/+0Lv+TD57OZpu1Z+ONyhkyaBdGVIzjxX8v5JwfH8bnj9iTDptsxKypZ/DHP83ksqv/2opPZll2+egh7Ne/F507bcLf7/0mvx3zBPv178kuO25FBLw5/33OGT0VgAH79OS07+zP6jVrWbs2OPeiv7L0/Y/osEk7xv1uGO3alVNeJqY9+ToTJ31srN2aUokBKg1GtQakGnwS6APMUdJF2AuYLWkgye/57Qry9gLeStMPrpb+SH0FKaLeINYqttnt7GxWzHJr0w5bt3YVrA2aN/PUJp0X3vdbt5f0u/PlP3yp3vIl9QbuiYjdazj3KrBvRCySdCRJL9kRJBMiroiIgekkiVlA1ay+2UD/iKhpbGsdvwdlZpYnTfyirqQJJK2fbpIqgVERcX0t2e8jCU7zgOXAiQARsUTSBSQzugHOry84gQOUmVm+NPGLuhExop7zvQs+B3BSLfnGAmNLKdsByswsT7zUkZmZZZK32zAzs0zK0Vp8DlBmZjkS5flpQjlAmZnlSX7ikwOUmVmueJKEmZllksegzMwsk9yCMjOzTMpPfHKAMjPLk3ALyszMMskByszMMsmTJMzMLJP8HpSZmWWSW1BmZpZJHoMyM7NMcoAyM7MsCnfxmZlZJnmShJmZZZJbUGZmlkkegzIzs0xygDIzs0zKT3xygDIzyxMvFmtmZtnkSRJmZpZJbkGZmVkm5Sc+OUCZmeVJWVt4UVdSl7oujIglTV8dMzNrjDYRoIBZQFBzgzGAvs1SIzMzazC1hUkSEdGnJStiZmaNl6P4VP+ygkp8VdLP0+/bSxrY/FUzM7NSSaUdWVZMb+XVwKeA49PvHwBXNVuNzMyswVRW2pFlxczi2y8i9pH0FEBEvCtp42aul5mZNUDWW0WlKCZ+rpJUTjIxAklbAWubtVZmZtYgZSrtqI+ksZIWSHq2IO1SSc9LekbSJEmdCs79VNI8SS9IOqwgfUiaNk/SWUU9SxF5rgAmAd0ljQb+DvyymJubmVnLaoYxqBuAIdXSpgC7R8QewIvAT5OytSswHNgtveZqSeVpI+cq4HBgV2BEmrdO9XbxRcQtkmYBh6RJwyLiuWKeyszMWlZTd/FFxGOSeldLe7Dg63Tgi+nnocCtEfER8IqkeUDVpLp5EfFyUkfdmub9V11lFztE1hEoT/N3KPIaMzNrYZJKOprAN4G/pJ97Am8UnKtM02pLr1Mx08zPBcYDXYBuwDhJ5xRVbTMza1GlzuKTVCFpZsFRUXRZ0tnAauCWqqQastW14EOdipnFNwLYOyJWpBW6GJgNXFjEtWZm1oJKbRRFxBhgTOnl6ATgKOCQiKgKNpXAdgXZegFvpZ9rS69VMV18rwKbFHxvD/y7iOvMzKyFtcSLupKGAGcCx0TE8oJTk4HhktpL6gP0A54EZgD9JPVJX1ManuatU12LxV5J0gT7CJgraUr6fTDJTD4zM8uYpp4kIWkCcDDQTVIlMIpk1l57YEo6jjU9Ir4bEXMl3UYy+WE1cFJErEnvczLwAMl8hrERMbe+suvq4puZ/jmLZJp5lUeKfzQzM2tJTb1fYUSMqCH5+jryjwZG15B+H3BfKWXXtVjs+FJuZGZmrS9PK0nUO0lCUj/gIpKXq9aNRUWEt9swM8uYNhWggHEkfY6XA58FTiRXmwqbmeWHmrqPrxUVM4uvQ0RMBRQRr0XEecCg5q2WmZk1RJ622yimBbVCUhnwU
joL401g6+atlpmZNUTWg04pimlBnUay1NEpQH/ga8AJzVkpMzNrmDbVgoqIGenH/5CMP5mZWUblaAiqzhd176aOtZIi4phmqZGZmTVY1ltFpairBfWrFquFmZk1iaxv416Kul7UfbQlK2JmZo3XVlpQZma2gSnL0SCUA5SZWY64BdUC5s/1THZrWR22H9XaVbA26dQmvVubCFCexWdmtuHJUQ+fZ/GZmeVJmwhQnsVnZrbhKVOtHV8bHG+3YWaWI3lqQRXzStc44BqS7Xs/C9wI3NSclTIzs4YpK/HIMm+3YWaWI2WKko4s83YbZmY50ta6+LzdhpnZBiJPXXzebsPMLEfy1IIqZhbfw9Twwm5EeBzKzCxjlPFxpVIUMwZ1esHnTYAvkMzoMzOzjGlTLaiImFUtaZokv8RrZpZBWR9XKkUxXXxdCr6WkUyU6NFsNTIzswbL+tTxUhTTxTeLZAxKJF17rwAjm7NSZmbWMG2qiw/YJSJWFCZIat9M9TEzs0bIUxdfMc/yjxrSHm/qipiZWeOVqbQjy+raD6oH0BPoIGlvki4+gC1IXtw1M7OMaStjUIcB3wB6AZfx3wD1PvCz5q2WmZk1RNZbRaWoaz+o8cB4SV+IiD+1YJ3MzKyB2toYVH9Jnaq+SOos6cJmrJOZmTVQnlYzLyZAHR4R71V9iYh3gSOar0pmZtZQeZokUUyAKi+cVi6pA+Bp5mZmGdROpR31kTRW0gJJzxakdZE0RdJL6Z+d03RJukLSPEnPSNqn4JoT0vwvSSpqR4xiAtTNwFRJIyV9E5hCsquumZllTDN08d0ADKmWdhYwNSL6AVPT7wCHA/3So4JkN/aqFYlGAfsBA4FRVUGtLsWsxXeJpGeAz5HM5LsgIh6o/5nMzKylNXW3XUQ8Jql3teShwMHp5/HAI8CZafqNERHAdEmdJG2T5p0SEUsAJE0hCXoT6iq7mJUkiIj7gfvTG/+vpKsi4qRirjUzs5bTQrP4ukfEfICImC+papf1nsAbBfkq07Ta0utUVICStBcwAjiOZC2+O4u5zszMWlapLShJFSTdcVXGRMSYBhZfU+lRR3qd6lpJYkdgOElgWgxMBBQRny2unmZm1tJK3bAwDUalBqR3JG2Ttp62ARak6ZXAdgX5egFvpekHV0t/pL5C6moNPg8cAhwdEZ+OiCuBNUVX38zMWlwLTTOfDFTNxDsB+HNB+tfT2Xz7A0vTrsAHgEPT92g7A4emaXWqq4vvCyQtqIcl3Q/cSs3NNDMzy4imHoOSNIGk9dNNUiXJbLyLgdskjQReB76UZr+P5D3ZecBy4ESAiFgi6QJgRprv/KoJE3Wpa6mjScAkSZsCw4AfAt0lXQNMiogHS31QMzNrXk29OkREjKjl1CE15A2gxgl0ETEWGFtK2fUG24hYFhG3RMRRJP2GT/PfOe9mZpYheVpJoqhZfFXSJtl16WFmZhmT9aBTipIClJmZZVt5a1egCTlAmZnlSNZXKC+FA5SZWY64i8/MzDLJAcrMzDKp3AHKzMyyyC0oMzPLJE+SMDOzTHILyszMMsnvQZmZWSa5BWVmZpnkMSgzM8skTzM3M7NMchefmZllUrum3rGwFTlAmZnlSLnHoMzMLIty1IBygDIzyxOPQZmZWSY5QJmZWSZ5DMrMzDLJLSgzM8skBygzM8skBygzM8skL3VkZmaZ5MVizcwsk/L0om6enqXNGDRoJEcffTJDh57Cscf+cL1z119/JzvtdDRLliwFICK48MLrGDy4gqOP/gFz585rjSrbBujaS7/Da7OvZeaUS9ZL/943DmPOw5cx66FLGf2z4wHYaKNyrvvVd5jx4P/xxP0Xc+D+u3zsfrdff/rH7mVNr0ylHVnmFtQGavz40XTpsuV6afPnL+Qf/3iabbfdal3aY4/N4tVX3+LBB69jzpwXOO+8a7j99staurq2Abrp9ke5dvwD/OHy769LO+hTu3LUof0ZcNiZrFy5mq26bgHAN0cMAmDAoWeyVdctuOvGM/n0UecQkXQ3DR0ygGXLVrT8Q7RBeRqDcgsqRy666A/85CcnIv33J3Tq1OkMGzYISey11868//4yFixY0oq1tA3FtCefZ8l7/1kvreJrg/nV1ZNZuXI1AAsXvw/Azv168fC0uevSlr6/nP579AVg047tOeXbR3DxlZNasPZtV5mipCPLWjxASTqxpcvMo5Ejz+XYY09j4sT7AZg69Qm23rorO+/cZ71877yzmB49uq373qNHV955Z3GL1tXyY4c+PfjfgTvz2J8v4MHbzl0XhP753GscfWh/ysvL+MR2W7H37n3otW1XAEad/mV+O+Zeln/4UWtWvc1wF1/j/AIYV9MJSRVABcB1151PRcVxLVmvDcaECZfQvXtXFi9+jxNP/Dl9+/bi2mtvY+zY8z+WN2r4B1JhC8usFO3aldN5y005aOjP2XfPT3Lz1aeyy6dPZfzER9h5h55Mu2c0r7+5iOmzXmT16jXssesn6Nu7O2ecfxPb9+pWfwHWaFkPOqVolgAl6ZnaTgHda7suIsYAY5JvL2a77dmKundP/mXatWsnBg/+FE8++SyVle8wdOgpALz99iKOPfY0br/91/To0ZW331607tq3317M1lt3aZV624bvzflLuOsvTwIwc86/WRtBty6bs2jJB5xx/k3r8j185y+Y9+rbHLjfLuzzP315ftoVtGtXxlZdt+SBiT/nsOMuaK1HyL08jds0VwuqO3AY8G61dAH/aKYy24Tly1ewdu1aNtusI8uXr2DatKf4/veH8/jjN6/LM2jQSO6449d06bIlgwbtx80338ORRx7EnDkvsPnmHR2grMHufnAmBx+wG3+b/hw79OnBxhu1Y9GSD+iwycZIYvmHHzHowP9h9Zo1PP/Smzz/0pv8/uaHANi+VzfuHHeGg1Mzy1MHSXMFqHuAzSLi6eonJD3STGW2CYsXv8dJJ40GYM2aNRx11Gc46KD+teb/zGf25dFHZzJ4cAUdOrTnl788taWqahu48Vf+gAM/tQvdOm/OvCd+xwW/voPxEx/muku/y8wpl7By5Wq+9aNrANiq2xbcfdNPWbs2eOudJYw87epWrn3b1RzxSdIPgW8BAfwTOBHYBrgV6ALMBr4WESsltQduBPoDi4HjIuLVBpUbNQ1SZIK7+Kxlddh+VGtXwdqgD1+f0KQxZeaie0v63blvtyPrLF9ST+DvwK4R8aGk24D7gCOAOyPiVknXAnMi4hpJ3wf2iIjvShoOfD4iGjShIE/dlWZmbV5ZiUeR2gEdJLUDOgLzgUHAHen58cCw9PPQ9Dvp+UPUwJlZDlBmZjkiRYmHKiTNLDgqCu8XEW8CvwJeJwlMS4FZwHsRsTrNVgn0TD/3BN5Ir12d5u/akGfxShJmZjlSalNl/dnTNdxP6kzSKuoDvAfcDhxe063qqEKDhmzcgjIzyxGptKMInwNeiYiFEbEKuBM4AOiUdvkB9ALeSj9XAtsldVE7YEugQcvXOECZmeVIuUo7ivA6sL+kjulY0iHAv4CHgS+meU4A/px+npx+
Jz3/12jgbDx38ZmZ5UhTTzOPiCck3UEylXw18BRJl+C9wK2SLkzTrk8vuR64SdI8kpbT8IaW7QBlZpYjzfGibkSMAqq/h/EyMLCGvCuALzVFuQ5QZmY5kqOFJBygzMzyxAHKzMwyyauZm5lZJuUoPjlAmZnliTK+S24pHKDMzHLELSgzM8sk7wdlZmaZlKflgRygzMxyxC0oMzPLpBzFJwcoM7M8cQvKzMwyKUfxyQHKzCxPvJKEmZllUo7ikwOUmVmeeCUJMzPLJLegzMwskzyLz8zMMilH8ckByswsT7zUkZmZZZK7+MzMLKPyE6EcoMzMcqRM5a1dhSbjAGVmlituQZmZWQbJAcrMzLLJAcrMzDJIys9EcwcoM7NccQvKzMwyyGNQZmaWSQ5QZmaWUR6DMjOzDFKO1jpygDIzyxUHKDMzy6A8jUHlp7PSzMxIfq2XctRPUidJd0h6XtJzkj4lqYuxFXy/AAAC6ElEQVSkKZJeSv/snOaVpCskzZP0jKR9GvMkZmaWEyrxvyL9Frg/InYG9gSeA84CpkZEP2Bq+h3gcKBfelQA1zT0WRygzMxyRFJJRxH32wI4CLgeICJWRsR7wFBgfJptPDAs/TwUuDES04FOkrZpyLM4QJmZ5YpKPOrVF1gIjJP0lKQ/SNoU6B4R8wHSP7dO8/cE3ii4vjJNK5kDlJlZjoiy0g6pQtLMgqOi2i3bAfsA10TE3sAy/tudV3MVPi4a8iyexWdmliulzeKLiDHAmDqyVAKVEfFE+v0OkgD1jqRtImJ+2oW3oCD/dgXX9wLeKqlSKbegzMxypKnHoCLibeANSTulSYcA/wImAyekaScAf04/Twa+ns7m2x9YWtUVWCq3oMzMcqVZ3oP6AXCLpI2Bl4ETSRo4t0kaCbwOfCnNex9wBDAPWJ7mbRAHKDOzHFEzdIxFxNPAvjWcOqSGvAGc1BTlOkCZmeVKflaScIAyM8uRPC115ABlZpYjXs3czMwySZS3dhWajAOUmVmuuAVlZmYZ5C4+MzPLqPysv+AAZWaWI3maxafknSrLE0kV6fpaZi3CP3PWHPLTFrRC1VcjNmtu/pmzJucAZWZmmeQAZWZmmeQAlU8eC7CW5p85a3KeJGFmZpnkFpSZmWWSA1SOSBoi6QVJ8ySd1dr1sfyTNFbSAknPtnZdLH8coHJCUjlwFXA4sCswQtKurVsrawNuAIa0diUsnxyg8mMgMC8iXo6IlcCtwNBWrpPlXEQ8Bixp7XpYPjlA5UdP4I2C75VpmpnZBskBKj9qWoDLUzTNbIPlAJUflcB2Bd97AW+1Ul3MzBrNASo/ZgD9JPWRtDEwHJjcynUyM2swB6iciIjVwMnAA8BzwG0RMbd1a2V5J2kC8Diwk6RKSSNbu06WH15JwszMMsktKDMzyyQHKDMzyyQHKDMzyyQHKDMzyyQHKDMzyyQHKDMzyyQHKDMzyyQHKDMzy6T/B1NiiusqWRqoAAAAAElFTkSuQmCC\n",
2495 | "text/plain": [
2496 | ""
2497 | ]
2498 | },
2499 | "metadata": {
2500 | "needs_background": "light"
2501 | },
2502 | "output_type": "display_data"
2503 | }
2504 | ],
2505 | "source": [
2506 | "class_names=[0,1] # name of classes\n",
2507 | "fig, ax = plt.subplots()\n",
2508 | "tick_marks = np.arange(len(class_names))\n",
2509 | "plt.xticks(tick_marks, class_names)\n",
2510 | "plt.yticks(tick_marks, class_names)\n",
2511 | "# create heatmap\n",
2512 | "sns.heatmap(pd.DataFrame(cnf_matrix), annot=True, cmap=\"YlGnBu\" ,fmt='g')\n",
2513 | "ax.xaxis.set_label_position(\"top\")\n",
2514 | "plt.tight_layout()\n",
2515 | "plt.title('Confusion matrix', y=1.1)\n",
2516 | "plt.ylabel('Actual label')\n",
2517 | "plt.xlabel('Predicted label')\n",
2518 | "print(\"Accuracy %0.4f\" % accuracy_score(y_test, y_pred))"
2519 | ]
2520 | },
2521 | {
2522 | "cell_type": "code",
2523 | "execution_count": 24,
2524 | "metadata": {},
2525 | "outputs": [
2526 | {
2527 | "data": {
2528 | "text/plain": [
2529 | "array([0.65995526, 0.66778523, 0.65380313, 0.63199105, 0.6442953 ,\n",
2530 | " 0.66331096, 0.6582774 , 0.65659955, 0.64597315, 0.64261745])"
2531 | ]
2532 | },
2533 | "execution_count": 24,
2534 | "metadata": {},
2535 | "output_type": "execute_result"
2536 | }
2537 | ],
2538 | "source": [
2539 | "# Applying k-fold cross-validation\n",
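"# the balanced training set is split into 10 folds; each fold is used once for validation while the other 9 train the model\n",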
2540 | "from sklearn.model_selection import cross_val_score\n",
2541 | "accuracies = cross_val_score(estimator = classifier,\n",
2542 | " X = X_train,\n",
2543 | " y = y_train,\n",
2544 | " cv = 10)\n",
2545 | "\n",
2546 | "# show the per-fold accuracies to validate the model estimate\n",
2547 | "accuracies"
2548 | ]
2549 | },
2550 | {
2551 | "cell_type": "code",
2552 | "execution_count": 25,
2553 | "metadata": {},
2554 | "outputs": [
2555 | {
2556 | "data": {
2557 | "text/plain": [
2558 | "0.6524608501118567"
2559 | ]
2560 | },
2561 | "execution_count": 25,
2562 | "metadata": {},
2563 | "output_type": "execute_result"
2564 | }
2565 | ],
2566 | "source": [
2567 | "accuracies.mean()"
2568 | ]
2569 | },
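{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# (added sketch) spread of the 10 per-fold accuracies around the mean reported above;\n",
"# assumes the accuracies array from the cross_val_score cell is still in memory\n",
"accuracies.std()"
]
},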
2570 | {
2571 | "cell_type": "code",
2572 | "execution_count": 26,
2573 | "metadata": {},
2574 | "outputs": [
2575 | {
2576 | "data": {
2577 | "text/html": [
2578 | "\n",
2579 | "\n",
2592 | "
\n",
2593 | " \n",
2594 | " \n",
2595 | " \n",
2596 | " features \n",
2597 | " coef \n",
2598 | " \n",
2599 | " \n",
2600 | " \n",
2601 | " \n",
2602 | " 0 \n",
2603 | " user \n",
2604 | " -0.117347 \n",
2605 | " \n",
2606 | " \n",
2607 | " 1 \n",
2608 | " age \n",
2609 | " -0.168296 \n",
2610 | " \n",
2611 | " \n",
2612 | " 2 \n",
2613 | " deposits \n",
2614 | " 0.448547 \n",
2615 | " \n",
2616 | " \n",
2617 | " 3 \n",
2618 | " withdrawal \n",
2619 | " 0.077879 \n",
2620 | " \n",
2621 | " \n",
2622 | " 4 \n",
2623 | " purchases_partners \n",
2624 | " -0.724259 \n",
2625 | " \n",
2626 | " \n",
2627 | " 5 \n",
2628 | " purchases \n",
2629 | " -0.603786 \n",
2630 | " \n",
2631 | " \n",
2632 | " 6 \n",
2633 | " cc_taken \n",
2634 | " 0.074629 \n",
2635 | " \n",
2636 | " \n",
2637 | " 7 \n",
2638 | " cc_recommended \n",
2639 | " -0.003146 \n",
2640 | " \n",
2641 | " \n",
2642 | " 8 \n",
2643 | " cc_disliked \n",
2644 | " 0.003458 \n",
2645 | " \n",
2646 | " \n",
2647 | " 9 \n",
2648 | " cc_liked \n",
2649 | " 0.009891 \n",
2650 | " \n",
2651 | " \n",
2652 | " 10 \n",
2653 | " cc_application_begin \n",
2654 | " 0.027898 \n",
2655 | " \n",
2656 | " \n",
2657 | " 11 \n",
2658 | " app_downloaded \n",
2659 | " -0.017686 \n",
2660 | " \n",
2661 | " \n",
2662 | " 12 \n",
2663 | " web_user \n",
2664 | " 0.143757 \n",
2665 | " \n",
2666 | " \n",
2667 | " 13 \n",
2668 | " ios_user \n",
2669 | " 0.081419 \n",
2670 | " \n",
2671 | " \n",
2672 | " 14 \n",
2673 | " android_user \n",
2674 | " -0.019784 \n",
2675 | " \n",
2676 | " \n",
2677 | " 15 \n",
2678 | " registered_phones \n",
2679 | " 0.102284 \n",
2680 | " \n",
2681 | " \n",
2682 | " 16 \n",
2683 | " waiting_4_loan \n",
2684 | " -0.028444 \n",
2685 | " \n",
2686 | " \n",
2687 | " 17 \n",
2688 | " cancelled_loan \n",
2689 | " 0.078268 \n",
2690 | " \n",
2691 | " \n",
2692 | " 18 \n",
2693 | " received_loan \n",
2694 | " 0.103427 \n",
2695 | " \n",
2696 | " \n",
2697 | " 19 \n",
2698 | " rejected_loan \n",
2699 | " 0.103366 \n",
2700 | " \n",
2701 | " \n",
2702 | " 20 \n",
2703 | " left_for_two_month_plus \n",
2704 | " 0.063179 \n",
2705 | " \n",
2706 | " \n",
2707 | " 21 \n",
2708 | " left_for_one_month \n",
2709 | " 0.046261 \n",
2710 | " \n",
2711 | " \n",
2712 | " 22 \n",
2713 | " reward_rate \n",
2714 | " -0.223845 \n",
2715 | " \n",
2716 | " \n",
2717 | " 23 \n",
2718 | " is_referred \n",
2719 | " 0.013839 \n",
2720 | " \n",
2721 | " \n",
2722 | " 24 \n",
2723 | " housing_O \n",
2724 | " -0.034252 \n",
2725 | " \n",
2726 | " \n",
2727 | " 25 \n",
2728 | " housing_R \n",
2729 | " 0.036926 \n",
2730 | " \n",
2731 | " \n",
2732 | " 26 \n",
2733 | " payment_type_Bi-Weekly \n",
2734 | " -0.048699 \n",
2735 | " \n",
2736 | " \n",
2737 | " 27 \n",
2738 | " payment_type_Monthly \n",
2739 | " -0.002895 \n",
2740 | " \n",
2741 | " \n",
2742 | " 28 \n",
2743 | " payment_type_Semi-Monthly \n",
2744 | " -0.036323 \n",
2745 | " \n",
2746 | " \n",
2747 | " 29 \n",
2748 | " payment_type_Weekly \n",
2749 | " 0.039683 \n",
2750 | " \n",
2751 | " \n",
2752 | " 30 \n",
2753 | " zodiac_sign_Aquarius \n",
2754 | " -0.016194 \n",
2755 | " \n",
2756 | " \n",
2757 | " 31 \n",
2758 | " zodiac_sign_Aries \n",
2759 | " 0.004264 \n",
2760 | " \n",
2761 | " \n",
2762 | " 32 \n",
2763 | " zodiac_sign_Cancer \n",
2764 | " 0.026928 \n",
2765 | " \n",
2766 | " \n",
2767 | " 33 \n",
2768 | " zodiac_sign_Capricorn \n",
2769 | " 0.023683 \n",
2770 | " \n",
2771 | " \n",
2772 | " 34 \n",
2773 | " zodiac_sign_Gemini \n",
2774 | " -0.011197 \n",
2775 | " \n",
2776 | " \n",
2777 | " 35 \n",
2778 | " zodiac_sign_Leo \n",
2779 | " 0.003228 \n",
2780 | " \n",
2781 | " \n",
2782 | " 36 \n",
2783 | " zodiac_sign_Libra \n",
2784 | " -0.014704 \n",
2785 | " \n",
2786 | " \n",
2787 | " 37 \n",
2788 | " zodiac_sign_Pisces \n",
2789 | " 0.024610 \n",
2790 | " \n",
2791 | " \n",
2792 | " 38 \n",
2793 | " zodiac_sign_Sagittarius \n",
2794 | " 0.021215 \n",
2795 | " \n",
2796 | " \n",
2797 | " 39 \n",
2798 | " zodiac_sign_Scorpio \n",
2799 | " -0.023309 \n",
2800 | " \n",
2801 | " \n",
2802 | " 40 \n",
2803 | " zodiac_sign_Taurus \n",
2804 | " 0.011366 \n",
2805 | " \n",
2806 | " \n",
2807 | " 41 \n",
2808 | " zodiac_sign_Virgo \n",
2809 | " 0.015451 \n",
2810 | " \n",
2811 | " \n",
2812 | "
\n",
2813 | "
"
2814 | ],
2815 | "text/plain": [
2816 | " features coef\n",
2817 | "0 user -0.117347\n",
2818 | "1 age -0.168296\n",
2819 | "2 deposits 0.448547\n",
2820 | "3 withdrawal 0.077879\n",
2821 | "4 purchases_partners -0.724259\n",
2822 | "5 purchases -0.603786\n",
2823 | "6 cc_taken 0.074629\n",
2824 | "7 cc_recommended -0.003146\n",
2825 | "8 cc_disliked 0.003458\n",
2826 | "9 cc_liked 0.009891\n",
2827 | "10 cc_application_begin 0.027898\n",
2828 | "11 app_downloaded -0.017686\n",
2829 | "12 web_user 0.143757\n",
2830 | "13 ios_user 0.081419\n",
2831 | "14 android_user -0.019784\n",
2832 | "15 registered_phones 0.102284\n",
2833 | "16 waiting_4_loan -0.028444\n",
2834 | "17 cancelled_loan 0.078268\n",
2835 | "18 received_loan 0.103427\n",
2836 | "19 rejected_loan 0.103366\n",
2837 | "20 left_for_two_month_plus 0.063179\n",
2838 | "21 left_for_one_month 0.046261\n",
2839 | "22 reward_rate -0.223845\n",
2840 | "23 is_referred 0.013839\n",
2841 | "24 housing_O -0.034252\n",
2842 | "25 housing_R 0.036926\n",
2843 | "26 payment_type_Bi-Weekly -0.048699\n",
2844 | "27 payment_type_Monthly -0.002895\n",
2845 | "28 payment_type_Semi-Monthly -0.036323\n",
2846 | "29 payment_type_Weekly 0.039683\n",
2847 | "30 zodiac_sign_Aquarius -0.016194\n",
2848 | "31 zodiac_sign_Aries 0.004264\n",
2849 | "32 zodiac_sign_Cancer 0.026928\n",
2850 | "33 zodiac_sign_Capricorn 0.023683\n",
2851 | "34 zodiac_sign_Gemini -0.011197\n",
2852 | "35 zodiac_sign_Leo 0.003228\n",
2853 | "36 zodiac_sign_Libra -0.014704\n",
2854 | "37 zodiac_sign_Pisces 0.024610\n",
2855 | "38 zodiac_sign_Sagittarius 0.021215\n",
2856 | "39 zodiac_sign_Scorpio -0.023309\n",
2857 | "40 zodiac_sign_Taurus 0.011366\n",
2858 | "41 zodiac_sign_Virgo 0.015451"
2859 | ]
2860 | },
2861 | "execution_count": 26,
2862 | "metadata": {},
2863 | "output_type": "execute_result"
2864 | }
2865 | ],
2866 | "source": [
2867 | "# Analyzing coefficients\n",
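"# a positive coefficient pushes the prediction toward churn (class 1), a negative one toward staying\n",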
2868 | "pd.concat([pd.DataFrame(X_train.columns, columns = [\"features\"]),\n",
2869 | " pd.DataFrame(np.transpose(classifier.coef_), columns = [\"coef\"])],\n",
2870 | "axis = 1)"
2871 | ]
2872 | },
2873 | {
2874 | "cell_type": "code",
2875 | "execution_count": 27,
2876 | "metadata": {},
2877 | "outputs": [
2878 | {
2879 | "data": {
2880 | "text/plain": [
2881 | "(17880, 42)"
2882 | ]
2883 | },
2884 | "execution_count": 27,
2885 | "metadata": {},
2886 | "output_type": "execute_result"
2887 | }
2888 | ],
2889 | "source": [
2890 | "### Feature selection ###\n",
2891 | "\n",
2892 | "# how many columns does the X_train dataset have?\n",
2893 | "\n",
2894 | "X_train.shape\n"
2895 | ]
2896 | },
2897 | {
2898 | "cell_type": "code",
2899 | "execution_count": 28,
2900 | "metadata": {},
2901 | "outputs": [],
2902 | "source": [
2903 | "## Feature selection\n",
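"# RFE (recursive feature elimination) repeatedly fits the estimator and drops the weakest feature until the requested number remains\n",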
2904 | "\n",
2905 | "from sklearn.feature_selection import RFE\n",
2906 | "from sklearn.linear_model import LogisticRegression"
2907 | ]
2908 | },
2909 | {
2910 | "cell_type": "code",
2911 | "execution_count": 29,
2912 | "metadata": {},
2913 | "outputs": [],
2914 | "source": [
2915 | "# Model to test\n",
2916 | "classifier = LogisticRegression(solver = 'lbfgs')\n",
2917 | "# RFE(estimator, N): N is the total number of features to keep for the classification\n",
2918 | "rfe = RFE(classifier, 20)\n",
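"# note: newer scikit-learn versions expect the keyword form RFE(classifier, n_features_to_select = 20)\n",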
2919 | "rfe = rfe.fit(X_train, y_train)\n"
2920 | ]
2921 | },
2922 | {
2923 | "cell_type": "code",
2924 | "execution_count": 30,
2925 | "metadata": {},
2926 | "outputs": [
2927 | {
2928 | "name": "stdout",
2929 | "output_type": "stream",
2930 | "text": [
2931 | "[ True True True True True True True False False False False False\n",
2932 | " True True True True False True True True True True True False\n",
2933 | " False True True False False True False False False False False False\n",
2934 | " False False False False False False]\n"
2935 | ]
2936 | }
2937 | ],
2938 | "source": [
2939 | "# summarize the selection of the attributes\n",
2940 | "print(rfe.support_)"
2941 | ]
2942 | },
2943 | {
2944 | "cell_type": "code",
2945 | "execution_count": 33,
2946 | "metadata": {},
2947 | "outputs": [
2948 | {
2949 | "data": {
2950 | "text/plain": [
2951 | "Index(['user', 'age', 'deposits', 'withdrawal', 'purchases_partners',\n",
2952 | " 'purchases', 'cc_taken', 'web_user', 'ios_user', 'android_user',\n",
2953 | " 'registered_phones', 'cancelled_loan', 'received_loan', 'rejected_loan',\n",
2954 | " 'left_for_two_month_plus', 'left_for_one_month', 'reward_rate',\n",
2955 | " 'housing_R', 'payment_type_Bi-Weekly', 'payment_type_Weekly'],\n",
2956 | " dtype='object')"
2957 | ]
2958 | },
2959 | "execution_count": 33,
2960 | "metadata": {},
2961 | "output_type": "execute_result"
2962 | }
2963 | ],
2964 | "source": [
2965 | "# show the names of the columns selected by RFE\n",
2966 | "X_train.columns[rfe.support_]"
2967 | ]
2968 | },
2969 | {
2970 | "cell_type": "code",
2971 | "execution_count": 35,
2972 | "metadata": {},
2973 | "outputs": [
2974 | {
2975 | "data": {
2976 | "text/plain": [
2977 | "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n",
2978 | " intercept_scaling=1, max_iter=100, multi_class='warn',\n",
2979 | " n_jobs=None, penalty='l2', random_state=0, solver='lbfgs',\n",
2980 | " tol=0.0001, verbose=0, warm_start=False)"
2981 | ]
2982 | },
2983 | "execution_count": 35,
2984 | "metadata": {},
2985 | "output_type": "execute_result"
2986 | }
2987 | ],
2988 | "source": [
2989 | "# Model building with feature selection\n",
2990 | "# Fitting the model to the training set\n",
2991 | "from sklearn.linear_model import LogisticRegression\n",
2992 | "classifier = LogisticRegression(random_state=0, solver='lbfgs')\n",
2993 | "classifier.fit(X_train[X_train.columns[rfe.support_]], y_train)"
2994 | ]
2995 | },
2996 | {
2997 | "cell_type": "code",
2998 | "execution_count": 37,
2999 | "metadata": {},
3000 | "outputs": [],
3001 | "source": [
3002 | "# Predicting on the test set with the selected features\n",
3003 | "y_pred = classifier.predict(X_test[X_test.columns[rfe.support_]])"
3004 | ]
3005 | },
3006 | {
3007 | "cell_type": "code",
3008 | "execution_count": 38,
3009 | "metadata": {},
3010 | "outputs": [],
3011 | "source": [
3012 | "# Evaluating the results with feature selection\n",
3013 | "from sklearn.metrics import confusion_matrix, accuracy_score, f1_score, precision_score, recall_score\n",
3014 | "cnf_matrix = confusion_matrix(y_test, y_pred)"
3015 | ]
3016 | },
3017 | {
3018 | "cell_type": "code",
3019 | "execution_count": 39,
3020 | "metadata": {},
3021 | "outputs": [
3022 | {
3023 | "name": "stdout",
3024 | "output_type": "stream",
3025 | "text": [
3026 | "Accuracy 0.6085\n"
3027 | ]
3028 | },
3029 | {
3030 | "data": {
3031 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAagAAAExCAYAAAAp2zZLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xm8FmX9//HX+0AI4gKKAoKGGm6ZCyKh5ZKmAqH4q0xscYmi0qwsM0u/4pqm5b6FiUoaoKiJiiKRSy4gqEiYKEdxOQKyKQq4AZ/fHzMHb45nue/DWYY576ePeZx7rrlmrmvg9ny4rrnmuhQRmJmZZU1Zc1fAzMysOg5QZmaWSQ5QZmaWSQ5QZmaWSQ5QZmaWSQ5QZmaWSQ5QlmmS2km6V9JSSXesw3W+J+mhhqxbc5G0n6SXmrseZo1Nfg/KGoKk7wK/BnYC3gemAxdExOPreN0fACcD+0bEynWuaMZJCqBnRJQ3d13MmptbULbOJP0auBz4I9AZ2Aa4FhjUAJf/PPBySwhOxZDUurnrYNZUHKBsnUjaFDgXOCki7oqI5RHxSUTcGxG/TfNsIOlySXPT7XJJG6THDpRUIek3khZImifphPTYOcBZwNGSlkkaIulsSbcWlN9DUlT+4pZ0vKRXJb0vaY6k7xWkP15w3r6SpqZdh1Ml7Vtw7BFJ50l6Ir3OQ5I61XD/lfU/raD+R0oaIOllSUsk/aEgfx9JT0l6N817taQ26bHH0mzPp/d7dMH1fydpPnBTZVp6zvZpGb3S/a0kLZJ04Dr9xZplgAOUrat9gLbA3bXkOQPoC+wB7A70Ac4sON4F2BToBgwBrpHUMSKGkbTKxkTERhFxY20VkdQeuBLoHxEbA/uSdDVWzbcZcH+ad3PgUuB+SZsXZPsucAKwJdAGOLWWoruQ/Bl0IwmoNwDfB/YC9gPOkrRdmncVcArQieTP7mDgRICI2D/Ns3t6v2MKrr8ZSWtyaGHBEfEK8DvgNkkbAjcBN0fEI7XU12y94ABl62pzYFEdXXDfA86NiAURsRA4B/hBwfFP0uOfRMR4YBmwYz3rsxrYVVK7iJgXES9Uk+cbwOyI+HtErIyIUcAs4PCCPDdFxMsR8QFwO0lwrcknJM/bPgFGkwSfKyLi/bT8F4DdACLimYiYnJb7GvBX4IAi7mlYRHyU1mctEXEDMBuYAnQl+QeB2XrPAcrW1WKgUx3PRrYCXi/Yfz1NW3ONKgFuBbBRqRWJiOXA0cBPgXmS7pe0UxH1qaxTt4L9+SXUZ3FErEo/VwaQtwuOf1B5vqQdJN0nab6k90haiNV2HxZYGBEf1pHnBmBX4KqI+KiOvGbrBQcoW1dPAR8CR9aSZy5J91SlbdK0+lgObFiw36XwYERMiIhDSFoSs0h+cddVn8o6vVXPOpXiOpJ69YyITYA/AKrjnFqH2kraiGSQyo3A2WkXptl6zwHK1klELCV57nJNOjhgQ0mfk9Rf0sVptlHAmZK2SAcbnAXcWtM16zAd2F/SNukAjd9XHpDUWdIR6bOoj0i6CldVc43xwA6SviuptaSjgV2A++pZp1JsDLwHLEtbdz+rcvxtYLvPnFW7K4BnIuJHJM/Wrl/nWpplgAOUrbOIuJTkHagzgYXAm8DPgX+mWc4HpgEzgP8Cz6Zp9SlrIjAmvdYzrB1UyoDfkLSQlpA82zmxmmssBgameRcDpwEDI2JRfepUolNJBmC8T9K6G1Pl+NnALekov+/UdTFJg4B+JN2akPw99KocvWi2PvOLumZmlkluQZmZWSY5QJmZWSY5QJmZWSY5QJmZWSY5QJmZWSY5QFmzkbRK0nRJMyXdkc4lV99rHSjpvvTzEZJOryVvB0mfGX5eRBlnS/rMnHw1pVfJc7Okb5dQVg9JM0uto1meOEBZc/ogIvaIiF2Bj/n0XR4AlCj5OxoR4yLiolqydKCa96PMLFscoCwr/gN8IW05vCjpWpIXereWdGi6RMWzaUurcl67fpJmpctofLPyQunSGlennztLulvS8+m2L3ARsH3aerskzffbdNmNGUqW+ai81hmSXpL0L4qYwFbSj9PrPC/pziqtwq9L+k+6DMfANH8rSZcUlP2Tdf2DNMsLByhrdulEs/1JZpmAJBCMjIg9SebeOxP4ekT0IpmR4teS2pLMxHA4yZIWXT5z4cSVwKMRsTvQi2Rm8dOBV9LW228lHQr0JFkGZA9gL0n7S9oLGAzsSRIA9y7idu6KiL3T8l4kWT6kUg+S2S2+AVyf3sMQYGlE7J1e/8eSti2iHLPc8+qc1pzaSapcr+k/JJOdbgW8HhGT0/S+JPPkPSEJkrWZniJZWn5ORMwGULKI4VprJaUOAo4FSGccXyqpY5U8h6bbc+n+RiQBa2Pg7ohYkZYxroh72lXS+STdiBsBEwqO3R4Rq4HZkl5N7+FQYLeC51ObpmW/XERZZrnmAGXN6YOIWGudpTQILS9MAiZGxDFV8u1BHbN8l0DAhRHx1ypl/KoeZdwMHBkRz0s6Hjiw4FjVa0Va9skRURjIkNSjxHLNcsddfJZ1k4GvSPoCQDpb+g4kS1ZsK2n7NN8xNZw/iXTG8PR5zyYkE7VuXJBnAvDDgmdb3SRtCTwG/D9J7SRtzNoLGtZkY5K1qD5HslBjoaMklaV13g54KS37Z2n+yvWi2hdRjlnuuQVlmRYRC9OWyChJG6TJZ0bEy5KGkizVvgh4nGTBvqp+CQyXNIRk6Y2fRcRTkp5Ih3E/kD6H2hl4Km3BLQO+HxHPShpDssTH6yTdkHX5P5KVbV8neaZWGAhfAh4FOgM/jYgPJf2N5NnUs0oKX0jta2uZtRiezdzMzDLJXXxmZpZJDlBmZpZJDlBmZpZJmR0k0WOPi/xwzJpUPWZVMltnc547TQ15vXbbHFPS784P3hjVoOU3pMwGKDMzK12e/qHlAGVmliPK0ZMbBygzsxxxC8rMzDLJAcrMzDIpnQ0lFxygzMxyxS0oMzPLIHfxmZlZJjlAmZlZJnmYuZmZZZJbUGZmlkkOUGZmlkkOUGZmlknC70GZmVkGuQVlZmaZ5ABlZmaZ5ABlZmYZlZ8AlZ87MTMzpLKStrqvpxGSFkiaWSX9ZEkvSXpB0sUF6b+XVJ4eO6wgvV+aVi7p9GLuxS0oM7McaYQuvpuBq4GRn5ahrwGDgN0i4iNJW6bpuwCDgS8CWwH/krRDeto1wCFABTBV0riI+F9tBTtAmZnlSENPdRQRj0nqUSX5Z8BFEfFRmmdBmj4IGJ2mz5FUDvRJj5VHxKsAkkaneWsNUO7iMzPLkVK7+CQNlTStYBtaRDE7APtJmiLpUUl7p+ndgDcL8lWkaTWl18otKDOzHCkra1VS/ogYDgwvsZjWQEegL7A3cLuk7aDat4SD6htDUUwhZmaWE000m3kFcFdEBPC0pNVApzR964J83YG56eea0mvkLj4zsxxp6FF8NfgncFBSnnYA2gCLgHHAYEkbSNoW6Ak8DUwFekraVlIbkoEU4+oqxC0oM7
McaehRfJJGAQcCnSRVAMOAEcCIdOj5x8BxaWvqBUm3kwx+WAmcFBGr0uv8HJgAtAJGRMQLdZXtAGVmliONMIrvmBoOfb+G/BcAF1STPh4YX0rZDlBmZnniqY7MzCyLPBefmZllkuT1oMzMLIOaaJh5k3CAMjPLEXfxmZlZNrmLz8zMMik/DSgHKDOzXHELyszMMskByszMMsldfGZmlkXhFpSZmWVSfuKTA5SZWa6U5SdCOUCZmeWJu/jMzCyT8hOfHKDMzHLFXXxmZpZJ7uIzM7NMyk98coAyM8sVd/GZmVkm5Sc+OUCZmeVJtMrPXEcOUGZmeeIWlJmZZZJH8ZmZWSZ5kISZmWVSfuKTA5SZWa64i8/MzDLJAcrMzDIpP6PMHaDMzHLFLSgzM8uk/MQnB6j1wcVnD+Cg/bdn8ZIVHPbtGwH41U+/yuBv7s6Sd1Ykea56lEcef5XWrcv407D+fHGnzrRuVcZd983k2hGT11yrrEzc+4/jmb/gfYb8Ymyz3I+tH/40rN+a712/o24C4Jc/+QqDv7nbmu/dJVf/Z8337qKz+n36vbt/JteNmELXzhvzl/O+wRabt2d1BKPufJ6bRz3TnLeVe+Fh5taUxo77L7eMfoZLzx+4VvqNt07lhpFPr5U24JCdaPO5VvQ7agRt27bmX3f9mHEPvkjF3KUAnPDd3pTPWcRG7Tdosvrb+unOe2cycsxz/OW8AWulj7h1Gjf8fepaaQO+viNt2rSi/3duom3b1ky8cwjjHniRjz9ZxQWXPswLs96m/YZtuPcfx/L4lNcof3VxU95Ky5KjLr5Ge5wmaSdJv5N0paQr0s87N1Z5efb0s2+y9L0Pi8scQbt2bWjVSrTdoDUff7KK95d9BECXLTfmoP22Z/RdMxqxtpYXTz9bwbtLPygqbwAbtv3cmu/dJ5+sYtnyj1m4aDkvzHobgOUrPqZ8zmK6bLFRI9baUIlbhjVKgJL0O2A0ye0/DUxNP4+SdHpjlNkSHTd4Lx64/YdcfPYANtk4aRGN/9dLfPDBxzw98WSefPBEbhg5ZU1wO+u3B3Ph5Q8TEc1ZbVvPHTu4Fw+MOZ4/Deu35nv3wL9eYsWHnzBl4kk88cBPuWHk1M/8o6pb103YZcfOTJ85rzmq3XKUqbStDpJGSFogaWY1x06VFJI6pftKGyXlkmZI6lWQ9zhJs9PtuKJupYTbLsUQYO+IuCgibk23i4A+6bFqSRoqaZqkae8vfrqmbAbcevuz7D/wegYcPYIFi5Zx5m8OBmD3XbuyanXw5UOvZr8B1/OjH/Rh626bctB+27P4nRXMfPHtZq65rc9uu+M5Djh8OAMG38zCRcs549dfA2D3L3Zl1aqg76HXsv83hvOjH+zN1t02XXPehu0+x3V/PpLz/jyJZcs/bq7qtwxSaVvdbgb6fbYYbQ0cArxRkNwf6JluQ4Hr0rybAcOAL5PEgWGSOtZVcGMFqNXAVtWkd02PVSsihkdE74jovfHmfRqpavmwaMkKVq8OImD0Xc+z+65dARjUfxcefeJVVq5czeJ3VvDM9LfY7Ytd6b1Hd75+wBd4fPzPuOqiI9h3789z2QUD6yjFbG2F37tRa33vduaxJz/93k2bXsFuu3QBoHXrMq7785Hc88D/mPDv2c1Z/Zahgbv4IuIxYEk1hy4DTiPp4a00CBgZiclAB0ldgcOAiRGxJCLeASZSTdCrqrEC1K+ASZIekDQ83R4EJgG/bKQyW5QtOrVf8/mwg3bg5fKFAMyd9x779vk8AO3afo49v7QVr8xZzMVXPco+h13LVwdcx8mnj+PJqa9zyhn3NUvdbf31me/dK4sAeGv+e+yzd8H3breteOW15Hfan4b1o3zOYm68dVrTV7glKrGLr7DnKt2G1lWEpCOAtyLi+SqHugFvFuxXpGk1pdeqUUbxRcSDknYgacp1I4nTFcDUiFjVGGXm2ZUXHkHf3tvQsUM7nppwIpdd9zh9e2/DLjtuSQRUzF3KH85/EICRY57lknO/wUN3DkGIO8bNYNbshc18B7Y+uuLCw+m719Z07NCOJx/8GZdf/zh999qGnXfcEiKomPcefzh/AgB/H/Mcl5zTnwljf4gEY++ZyazZC+m9Rze+OXBXZr28gPtHJ48dKoemWyMpcZh5RAwHhhebX9KGwBnAodUdrq6IWtJrLyurD8x77HFRNitmuSXlaI4YW2/Mee60Bh1Lt92P7ijpd+erfzuqzvIl9QDui4hdJX2JpDdsRXq4OzCXpEFyDvBIRIxKz3sJOLByi4ifpOl/LcxXE/8faWaWJw08iq+qiPhvRGwZET0iogdJ71iviJgPjAOOTUfz9QWWRsQ8YAJwqKSO6eCIQ9O0WvlFXTOzPGngF3UljSJpAXWSVAEMi4gba8g+HhgAlJO0sE4AiIglks4jeeUI4NyIqG7gxVocoMzM8qSBpzqKiGPqON6j4HMAJ9WQbwQwopSyHaDMzPIkRw9uHKDMzPIkR3PxOUCZmeVItMpPE8oByswsT/ITnxygzMxyxetBmZlZJvkZlJmZZZJbUGZmlkn5iU8OUGZmeRJuQZmZWSY5QJmZWSZ5kISZmWWS34MyM7NMcgvKzMwyyc+gzMwskxygzMwsi8JdfGZmlkkeJGFmZpnkFpSZmWWSn0GZmVkmOUCZmVkm5Sc+OUCZmeWJJ4s1M7Ns8iAJMzPLJLegzMwsk/ITnxygzMzypKwlvKgrabPaToyIJQ1fHTMzWxctIkABzwBB9Q3GALZrlBqZmVm9qSUMkoiIbZuyImZmtu5yFJ/qnlZQie9L+r90fxtJfRq/amZmViqptC3LiumtvBbYB/huuv8+cE2j1cjMzOpNZaVtWVbMKL4vR0QvSc8BRMQ7kto0cr3MzKwest4qKkUxAeoTSa1IBkYgaQtgdaPWyszM6iVH7+kW1cV3JXA30FnSBcDjwB8btVZmZlYvDf0MStIISQskzSxIu0TSLEkzJN0tqUPBsd9LKpf0kqTDCtL7pWnlkk4v5l7qDFARcRtwGklQmgscGRF3FHNxMzNrWo0wSOJmoF+VtInArhGxG/Ay8PukbO0CDAa+mJ5zraRWaS/cNUB/YBfgmDRvrYp9RLYh0CrN367Ic8zMrIlJKmmrS0Q8BiypkvZQRKxMdycD3dPPg4DREfFRRMwByoE+6VYeEa9GxMfA6DRvrYoZZn4WcAuwGdAJuEnSmXXelZmZNblSR/FJGippWsE2tMQifwg8kH7uBrxZcKwiTaspvVbFDJI4BtgzIj4EkHQR8CxwfhHnmplZEyp1FF9EDAeG168snQGsBG6rTKquCKpvDEVd1y8mQL0GtAU+TPc3AF4p4jwzM2tiTTXMXNJxwEDg4IioDDYVwNYF2bqTjF2glvQa1TZZ7FUkEe4j4AVJE9P9Q0hG8pmZWcY0RYCS1A/4HXBARKwoODQO+IekS4GtgJ7A0yQtq56StgXeIhlI8V3qUFsLalr68xmSYeaVHinyHszMrIk19HtQkkYBBwKdJFUAw0hG7W0ATEwHWkyOiJ9GxAuSbgf+R9L1d1JEr
Eqv83NgAsmAuxER8UJdZdc2Wewt63RXZmbW5Bq6BRURx1STfGMt+S8ALqgmfTwwvpSy63wGJakncCHJ2PW2BYV5uQ0zs4xpaVMd3UTSpLsM+BpwArlaVNjMLD+Uo7mOinlRt11ETAIUEa9HxNnAQY1bLTMzq488LbdRTAvqQ0llwOz0IddbwJaNWy0zM6uPrAedUhTTgvoVyVRHvwD2An4AHNeYlTIzs/ppUS2oiJiaflxG8vzJzMwyKkePoGp9UfdeapmKIiKOaJQamZlZvWW9VVSK2lpQf26yWpiZWYPI+jLupajtRd1Hm7IiZma27lpKC8rMzNYzZTl6COUAZWaWI25BNYHXpn+zuatgLUy7bYY1dxXM1lmLCFAexWdmtv7JUQ+fR/GZmeVJiwhQHsVnZrb+KVOdK6mvN7zchplZjuSpBVXMK103AdeRrI74NWAk8PfGrJSZmdVPWYlblnm5DTOzHClTlLRlmZfbMDPLkZbWxeflNszM1hN56uLzchtmZjmSpxZUMaP4HqaaF3Yjws+hzMwyRhl/rlSKYp5BnVrwuS3wLZIRfWZmljEtqgUVEc9USXpCkl/iNTPLoKw/VypFMV18mxXslpEMlOjSaDUyM7N6y/rQ8VIU08X3DMkzKJF07c0BhjRmpczMrH5aVBcfsHNEfFiYIGmDRqqPmZmtgzx18RVzL09Wk/ZUQ1fEzMzWXZlK27KstvWgugDdgHaS9iTp4gPYhOTFXTMzy5iW8gzqMOB4oDvwFz4NUO8Bf2jcapmZWX1kvVVUitrWg7oFuEXStyLiziask5mZ1VNLewa1l6QOlTuSOko6vxHrZGZm9ZSn2cyLCVD9I+Ldyp2IeAcY0HhVMjOz+srTIIliAlSrwmHlktoBHmZuZpZBrVXaVhdJIyQtkDSzIG0zSRMlzU5/dkzTJelKSeWSZkjqVXDOcWn+2ZKKWhGjmAB1KzBJ0hBJPwQmkqyqa2ZmGdMIXXw3A/2qpJ0OTIqInsCkdB+gP9Az3YaSrMZeOSPRMODLQB9gWGVQq00xc/FdLGkG8HWSkXznRcSEuu/JzMyaWkN320XEY5J6VEkeBByYfr4FeAT4XZo+MiICmCypg6Suad6JEbEEQNJEkqA3qrayixrwEREPRsSpEfEbYJmka4o5z8zMmlapCxZKGippWsE2tIhiOkfEPID0Z+Uq692ANwvyVaRpNaXXqpipjpC0B3AMcDTJXHx3FXOemZk1rVJbUBExHBjeQMVXV3rUkl6r2maS2AEYTBKYFgNjAEXE14qrp5mZNbUmWrDwbUldI2Je2oW3IE2vALYuyNcdmJumH1gl/ZG6Cqmti28WcDBweER8NSKuAlYVXX0zM2tyTTTMfBxQORLvOOCegvRj09F8fYGlaRfgBODQ9D3ajsChaVqtauvi+xZJC+phSQ8Co6m+mWZmZhnR0DNJSBpF0vrpJKmCZDTeRcDtkoYAbwBHpdnHk7wnWw6sAE4AiIglks4Dpqb5zq0cMFGb2qY6uhu4W1J74EjgFKCzpOuAuyPioVJv1MzMGldDzw4REcfUcOjgavIGcFIN1xkBjCil7DqDbUQsj4jbImIgSb/hdD4d825mZhmSp5kkihrFVyltkv013czMLGOyHnRKUVKAMjOzbGvV3BVoQA5QZmY5kvUZykvhAGVmliPu4jMzs0xygDIzs0xq5QBlZmZZ5BaUmZllkgdJmJlZJrkFZWZmmeT3oMzMLJPcgjIzs0zyMygzM8skDzM3M7NMchefmZllUuuGXrGwGTlAmZnlSCs/gzIzsyzKUQPKAcrMLE/8DMrMzDLJAcrMzDLJz6DMzCyT3IIyM7NMcoAyM7NMcoAyM7NM8lRHZmaWSZ4s1szMMskv6lqzOuigIbRv346ysjJatWrFXXddxuWX38qkSVMoKxObb74pF174Kzp33pxx4x7hhhvuBKB9+7acffaJ7LTTts18B7Y+uP6Sn9D/4D1ZuPg9eh9yGgB/v+YX9NyuKwAdNmnPu+8tp2//3wNw6kmDOP7oA1m1ajW/GXYL/3psBt27bsbfLjuRzlt0YHUEI/4xiWtGPNhs99QS5OkZlCKy2hx8OasVa3YHHTSEsWMvZbPNNl2TtmzZCjbaaEMARo4cR3n5m5x77kk8++yLbL/91my66UY8+ug0rr56FHfc8ZfmqnqmtdtmWHNXIVO+0mcnlq/4kL9dduKaAFXoojO/z9L3V3DhFXexU89u3HLVyex3xJl07dyR8f84gy8dcApbdtqULlt2YPrM19iofVuevP+PfOfHf2HW7Lea4Y6y6YM3RjVoSHl03viSfnce0HVAZkNanlqDLVplcAL44IOPkJLvXK9eO7PpphsBsMceOzF//qJmqZ+tf554ehZL3l1W4/FvDezL7fc8CcDAQ3tzx71P8fHHK3n9zYW88tp89t7jC8xf8C7TZ74GwLLlHzKr/C226rJZU1S/xSpTlLRlWZN38Uk6ISJuaupy82bIkLOQxNFH9+Poo/sBcNllI/nnPx9m4403ZOTIP37mnLFjH2L//fdq6qpaDn2lz068vWgpr7w2H4BunTsy5bnyNcffmreErbp0XOucbbp3Yo8v9mBqQT5reHnq4muOFtQ5NR2QNFTSNEnThg8f05R1Wq+MGnUxd999BTfccDa33XY/U6fOBOCUU47l0Udv4vDDD+TWW+9b65zJk2cwduxETj31+GaoseXNdwbtyx1p6wkAffa3YuHTg/YbbsCov57Cb88ZyfvLPmiCGrZcZSpty7JGCVCSZtSw/RfoXNN5ETE8InpHRO+hQ49ujKrlQufOmwOw+eYdOOSQfZgx4+W1jg8ceAAPPfTpL49Zs+Zw5plXce21Z9Kx4yZNWlfLn1atyhjUrw9j731qTdpb85fQfavN1+x367oZ895+B4DWrVsx6q+nMObuJ7jnwalNXt+WpqzELcsaq36dgWOBw6vZFjdSmS3CihUfsmzZijWfn3jiOXr2/DyvvTZ3TZ5//3sK223XHYC5cxdw8skXcvHFv2bbbbs1S50tXw766pd4+ZW5vDV/yZq0+yc+w1GH70ObNq35/NZb8IVtuzB1etKVd/0lQ3mpfC5X/m18c1W5RZFK24q7pk6R9IKkmZJGSWoraVtJUyTNljRGUps07wbpfnl6vEd976WxnkHdB2wUEdOrHpD0SCOV2SIsXvwuJ510AQCrVq1i4MAD2H//vTj55D8yZ85bSGV067YF55xzEgDXXDOad999j3POuQ5gzbB0s7rcctXJ7LfPznTquDHlU67mvEvHcsuYRzjqiH24fdyTa+V98eUK7rxvMs9N+jMrV67iV2fexOrVwb5778j3vrU//33xDSY/cCEAwy4ew4SHP/OrwRpIQ/faSeoG/ALYJSI+kHQ7MBgYAFwWEaMlXQ8MAa5Lf74TEV+QNBj4E1CvLjEPMzdLeZi5NYeGHmY+bdH9Jf3u7N3pG7WWnwaoycDuwHvAP4GrgNuALhGxUtI+wNkRcZikCennpyS1BuYDW0Q9gk3WuyDNzKwEpT6DKhyclm5DC68XEW8B
fwbeAOYBS4FngHcjYmWarQKofIbQDXgzPXdlmn9z6sEzSZiZ5YhKfLcpIoYDw2u+njoCg4BtgXeBO4D+1V2q8pRajpXELSgzsxxRiVsRvg7MiYiFEfEJcBewL9Ah7cID6A5UjtSqALYGSI9vCiyhHhygzMxypBFG8b0B9JW0oZIpag4G/gc8DHw7zXMccE/6eVy6T3r83/V5/gTu4jMzy5WGXg8qIqZIGgs8C6wEniPpErwfGC3p/DTtxvSUG4G/SyonaTkNrm/ZDlBmZjnSGJNDRMQwoOow11eBPtXk/RA4qiHKdYAyM8uRYl++XR84QJmZ5UiO4pMDlJlZnjhAmZlZJmV9hvJSOECZmeVIjuKTA5SZWZ6UOpNEljlAmZnliFtQZmaWSR5mbmZmmZSn+escoMzMcsQtKDMzy6QcxScHKDOzPHELyszMMilH8ckByswsTzyThJmZZVKO4pMDlJlZnnh/SZxaAAADgklEQVQmCTMzyyS3oMzMLJM8is/MzDIpR/HJAcrMLE881ZGZmWWSu/jMzCyj8hOhHKDMzHKkTK2auwoNxgHKzCxX3IIyM7MMkgOUmZllkwOUmZllkJSfgeYOUGZmueIWlJmZZZCfQZmZWSY5QJmZWUb5GZSZmWWQcjTXkQOUmVmu5CdA5actaGZmqMT/irqm1EHSWEmzJL0oaR9Jm0maKGl2+rNjmleSrpRULmmGpF71vRcHKDOzXCkrcSvKFcCDEbETsDvwInA6MCkiegKT0n2A/kDPdBsKXLcud2JmZjnR0C0oSZsA+wM3AkTExxHxLjAIuCXNdgtwZPp5EDAyEpOBDpK61udeHKDMzHJEUklbEbYDFgI3SXpO0t8ktQc6R8Q8gPTnlmn+bsCbBedXpGklc4AyM8sVlbRJGippWsE2tMoFWwO9gOsiYk9gOZ9259VUgaqiPnfiUXxmZjmiEtsdETEcGF5LlgqgIiKmpPtjSQLU25K6RsS8tAtvQUH+rQvO7w7MLalSKbegzMxypbQWVF0iYj7wpqQd06SDgf8B44Dj0rTjgHvSz+OAY9PRfH2BpZVdgaVyC8rMLEca6UXdk4HbJLUBXgVOIGng3C5pCPAGcFSadzwwACgHVqR568UByswsVxo+QEXEdKB3NYcOriZvACc1RLkOUGZmOVLqM6gsc4AyM8uV/Ex15ABlZpYjXm7DzMwyybOZm5lZJolWzV2FBuMAZWaWK25BmZlZBrmLz8zMMsrDzM3MLIPyNIpPyUu/lieShqYTQJo1CX/nrDHkpy1ohapOl2/W2PydswbnAGVmZpnkAGVmZpnkAJVPfhZgTc3fOWtwHiRhZmaZ5BaUmZllkgNUjkjqJ+klSeWSTm/u+lj+SRohaYGkmc1dF8sfB6ickNQKuAboD+wCHCNpl+atlbUANwP9mrsSlk8OUPnRByiPiFcj4mNgNDCometkORcRjwFLmrselk8OUPnRDXizYL8iTTMzWy85QOVHdRNweYimma23HKDyowLYumC/OzC3mepiZrbOHKDyYyrQU9K2ktoAg4FxzVwnM7N6c4DKiYhYCfwcmAC8CNweES80b60s7ySNAp4CdpRUIWlIc9fJ8sMzSZiZWSa5BWVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGVmZpnkAGVmZpn0/wF2zw5LSueQCgAAAABJRU5ErkJggg==\n",
3032 | "text/plain": [
3033 | ""
3034 | ]
3035 | },
3036 | "metadata": {
3037 | "needs_background": "light"
3038 | },
3039 | "output_type": "display_data"
3040 | }
3041 | ],
3042 | "source": [
3043 | "# draw the confusion matrix\n",
3044 | "class_names=[0,1] # name of classes\n",
3045 | "fig, ax = plt.subplots()\n",
3046 | "tick_marks = np.arange(len(class_names))\n",
3047 | "plt.xticks(tick_marks, class_names)\n",
3048 | "plt.yticks(tick_marks, class_names)\n",
3049 | "# create heatmap\n",
3050 | "sns.heatmap(pd.DataFrame(cnf_matrix), annot=True, cmap=\"YlGnBu\" ,fmt='g')\n",
3051 | "ax.xaxis.set_label_position(\"top\")\n",
3052 | "plt.tight_layout()\n",
3053 | "plt.title('Confusion matrix', y=1.1)\n",
3054 | "plt.ylabel('Actual label')\n",
3055 | "plt.xlabel('Predicted label')\n",
3056 | "print(\"Accuracy %0.4f\" % accuracy_score(y_test, y_pred))"
3057 | ]
3058 | }
3059 | ],
3060 | "metadata": {
3061 | "kernelspec": {
3062 | "display_name": "Python 3",
3063 | "language": "python",
3064 | "name": "python3"
3065 | },
3066 | "language_info": {
3067 | "codemirror_mode": {
3068 | "name": "ipython",
3069 | "version": 3
3070 | },
3071 | "file_extension": ".py",
3072 | "mimetype": "text/x-python",
3073 | "name": "python",
3074 | "nbconvert_exporter": "python",
3075 | "pygments_lexer": "ipython3",
3076 | "version": "3.7.3"
3077 | }
3078 | },
3079 | "nbformat": 4,
3080 | "nbformat_minor": 2
3081 | }
3082 |
--------------------------------------------------------------------------------