├── .github ├── ISSUE_TEMPLATE │ ├── config.yml │ ├── doubt.md │ └── helper-issue.md └── PULL_REQUEST_TEMPLATE.md ├── README.md ├── main.ipynb ├── requirements.txt └── submissions ├── .ipynb_checkpoints └── Arpit-Agarwal-checkpoint.ipynb ├── Abhijit-Singh.ipynb ├── Anusha-Verma-C.ipynb ├── Arpit-Agarwal.ipynb └── Dhanesh-Shetty.ipynb /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | blank_issues_enabled: false -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/doubt.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Doubt 3 | about: Ask us a doubt if something is not clear. 4 | title: '[DOUBT] <"Your doubt goes here"> ' 5 | labels: doubt 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the doubt** 11 | A clear and concise description of what the doubt is. 12 | 13 | **Where do you need help** 14 | Where exactly do you need our help? 15 | 16 | **Additional context** 17 | Add any other context about the doubt here. 18 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/helper-issue.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Helper Issue 3 | about: Make a helper issue for participants 4 | title: '' 5 | labels: good first issue, hacktoberfest, helper issue 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Task** 11 | Describe the task here 12 | 13 | **Function to Implement** 14 | Write function name, if any 15 | -------------------------------------------------------------------------------- /.github/PULL_REQUEST_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | Fixes #[Add issue number here. If you do not solve the issue entirely, please change the message e.g. 
"First steps for issue #IssueNumber"] 2 | 3 | Changes: [Tell whether you have completed the entire task or some modules/functions] -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

Kicking Off Hacktoberfest with ACM-VIT!

2 |

3 | 4 |

5 | 6 |

Good Client, Bad Client

7 | 8 |

9 | Help us build a Credit Card Approval System - Using Machine Learning! 10 |

11 | 12 |

13 | 14 | 15 | [badges: made-by-acm, license, stars, forks] 19 | 20 | 21 |

22 | 23 | ## Overview 24 | 25 | The main aim of the project is to build a machine learning model that predicts whether an applicant is a 'good' or 'bad' client. Unlike most classification tasks, the definition of 'good' and 'bad' is not given; defining the label is part of the problem.

26 | Credit scorecards are a common risk-control method in the financial industry. They use personal information and data submitted by credit card applicants to predict the probability of future defaults and credit card borrowing, so the bank can decide whether to issue a credit card to the applicant. Credit scores objectively quantify the magnitude of risk.

27 | In the dataset, application_record.csv is the file that holds socio-economic information about each customer, and credit_record.csv is the file that holds the monthly payment/default records for each client.
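The two files can be loaded with pandas. The sketch below is self-contained: the inline sample rows only mirror the column layout of the real files (values taken from the dataset's first rows), and in practice you would point `pd.read_csv` at the downloaded CSVs instead.

```python
import io
import pandas as pd

# In practice:
#   application_record = pd.read_csv("application_record.csv")
#   credit_record = pd.read_csv("credit_record.csv")
# The inline samples below just mirror the column layout of the two files.
application_csv = io.StringIO(
    "ID,CODE_GENDER,FLAG_OWN_CAR,FLAG_OWN_REALTY,AMT_INCOME_TOTAL,OCCUPATION_TYPE\n"
    "5008804,M,Y,Y,427500.0,\n"
    "5008806,M,Y,Y,112500.0,Security staff\n"
)
credit_csv = io.StringIO(
    "ID,MONTHS_BALANCE,STATUS\n"
    "5008804,0,C\n"
    "5008804,-1,0\n"
)

application_record = pd.read_csv(application_csv)  # one row per applicant
credit_record = pd.read_csv(credit_csv)            # one row per applicant-month

print(application_record.head())
print(credit_record.head())
```

The two tables join on the `ID` column, which is how the payment history gets attached to each application.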
28 | 29 | 30 | --- 31 | 32 | ## Usage 33 | Run the following command to install all the required packages for this project 34 |
pip install -r requirements.txt
35 | 36 | Let's get started! 37 |

 38 |  git remote add upstream [UPSTREAM-HTTPS-ADDRESS]
 39 |  git fetch upstream
 40 |  git merge upstream/[BRANCH-NAME]
41 | ## Dataset 42 | 43 | Link to the data set is [here](https://drive.google.com/drive/folders/1ltq08WdYxd-r9wnY60o78VBgN5FlMcKk?usp=sharing). 44 | 45 | --- 46 | ## Submitting a Pull Request 47 | 48 | * Fork the repository by clicking the fork button on top right corner of the page 49 | * Clone the target repository. To clone, click on the clone button and copy the https address. Then run 50 |
git clone [HTTPS-ADDRESS]
51 | * Go to the cloned directory by running 52 |
cd [NAME-OF-REPO]
53 | * Create a new branch. Use 54 |
 git checkout -b [YOUR-BRANCH-NAME]
55 | * Make your changes to the code. Add changes to your branch by using 56 |
git add .
57 | * Commit the changes by executing 58 |
git commit -m "your msg"
59 | * Push to remote. To do this, run 60 |
git push origin [YOUR-BRANCH-NAME]
61 | * Create a pull request. Go to the target repository and click on the "Compare & pull request" button. **Make sure your PR description mentions which issues you're solving.** 62 | 63 | * Wait for your request to be accepted. 64 | 65 | --- 66 | ## Guidelines for Pull Requests 67 | 68 | * Avoid pull requests that: 69 | * are automated or scripted 70 | * are plagiarized from someone else's branch 71 | * Do not spam 72 | * The project maintainers' decision on the validity of a PR is final. 73 | 74 | For additional guidelines, refer to the [participation rules](https://hacktoberfest.digitalocean.com/details#rules) 75 | 76 | --- 77 | 78 | ## What counts as a PR? 79 | 80 | Check out our [issues](https://github.com/ACM-VIT/Good-Client-Bad-Client/issues) and try to solve them! 81 | 82 | 83 | --- 84 | 85 | ## Interacting with Issues 86 | 87 | * There are helper issues that detail all you have to do to complete the project. 88 | * Read the helper issues and work on the corresponding code in your fork of the repo. 89 | * If you have a doubt regarding the 'help' given, comment below the issue. 90 | * If you have a doubt not related to any open 'helper issue/s', open up a new issue, select doubt and fill in the template. 91 | * If you want to provide some extra help to fellow participants, open up a new helper issue. Don't include any solution/code! 92 | * Do not spam 93 | 94 | --- 95 | 96 | ## Authors 97 | 98 | **Authors:** [Aryan Vats](https://github.com/avats101), [Aditya Nalini](https://github.com/adinalini), [Varun Srinivasan](https://github.com/DEV-VarunSrinivasan) 99 |
100 | -------------------------------------------------------------------------------- /main.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "main.ipynb", 7 | "provenance": [], 8 | "toc_visible": true 9 | }, 10 | "kernelspec": { 11 | "name": "python3", 12 | "display_name": "Python 3" 13 | } 14 | }, 15 | "cells": [ 16 | { 17 | "cell_type": "markdown", 18 | "metadata": { 19 | "id": "f17yT-ge6H5l" 20 | }, 21 | "source": [ 22 | "# **Data**" 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": { 28 | "id": "y6JmMMyn6T_6" 29 | }, 30 | "source": [ 31 | "\n", 32 | "Two files are available: the application data, and the monthly credit card account status information.\n", 33 | "\n", 34 | "The application data will be used for feature creation, and the status (credit payment status) will be required for defining the labels - which of the applicants have paid back their dues and which of them turn out to be bad accounts." 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "metadata": { 40 | "id": "HQkocJqb5BxU" 41 | }, 42 | "source": [ 43 | "### 1. Application\n", 44 | "\n", 45 | "For a credit card, customers fill up a form - online or physical. The application information is used for assessing the creditworthiness of the customer. In addition to the application information, the Credit Bureau Score e.g.
FICO Score in the US, CIBIL Score in India, and other internal information about the applicants are used for the decision.\n", 46 | "\n", 47 | "Also, banks are gradually considering a lot of external data to improve the quality of credit decisions.\n", 48 | "\n", 49 | "Now, we expect to read and explore the application sample data file provided.\n" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "metadata": { 55 | "id": "YTu89Y6e5AlQ" 56 | }, 57 | "source": [ 58 | "def read_app_data():\n", 59 | "    # Reading the application data" 60 | ], 61 | "execution_count": null, 62 | "outputs": [] 63 | }, 64 | { 65 | "cell_type": "markdown", 66 | "metadata": { 67 | "id": "v9RGpm697m6E" 68 | }, 69 | "source": [ 70 | "### 2. Credit Status\n", 71 | "\n", 72 | "Once a credit card is issued, the customer uses it for purchases, a statement is generated for the dues, and the customer makes a payment by the due date. This is a typical credit card cycle.\n", 73 | "\n", 74 | "If a customer is not able to pay the minimum due amount, the customer is considered past due for that month.\n", 75 | "If the non-payment continues for a period, the customer is considered a defaulter and the due amount is written off and becomes bad debt. Of course, the bank puts in a lot of effort and steps to recover the due amount; this falls under the collection process.\n", 76 | "\n", 77 | "With the modeling process, the aim is to learn about the customers who were not able to pay back their dues, and not to approve applications from customers who look similar to them.\n", 78 | "Of course, we do not know how many of the rejected applications were actually good customers. This is not in the scope of this exercise.\n", 79 | "\n", 80 | "For this exercise, the credit status file is given. In this file, a status value is given for each of the approved applications."
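The labelling idea described above can be sketched in pandas. This is only an illustration: it assumes the STATUS encoding commonly shipped with this dataset (`C` = paid off that month, `X` = no loan that month, `0`-`5` = increasingly overdue), and it marks a client as 'bad' once any month reaches status `2` (60+ days past due). The cut-off is a modelling choice, not something fixed by the data.

```python
import pandas as pd

# Illustrative monthly status records: one row per client per month.
credit_record = pd.DataFrame({
    "ID":             [1,   1,   1,   2,   2,   3],
    "MONTHS_BALANCE": [0,  -1,  -2,   0,  -1,   0],
    "STATUS":         ["C", "0", "2", "C", "C", "X"],
})

# Assumed cut-off: any month at status 2-5 (60+ days past due) marks a bad account.
BAD_STATUSES = {"2", "3", "4", "5"}

labels = (
    credit_record.assign(is_bad=credit_record["STATUS"].isin(BAD_STATUSES))
    .groupby("ID")["is_bad"]
    .any()            # bad if *any* month was severely overdue
    .astype(int)      # 1 = bad client, 0 = good client
    .rename("LABEL")
)
print(labels)
```

Merging `labels` back onto the application data by `ID` then gives the training set for a supervised model.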
81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "metadata": { 86 | "id": "CLupt4Kd7vu5" 87 | }, 88 | "source": [ 89 | "" 90 | ], 91 | "execution_count": null, 92 | "outputs": [] 93 | }, 94 | { 95 | "cell_type": "markdown", 96 | "metadata": { 97 | "id": "3DhzXpbe8qbG" 98 | }, 99 | "source": [ 100 | "## Feature Variable Creation" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "metadata": { 106 | "id": "rIDdu8xb8-KA" 107 | }, 108 | "source": [ 109 | "def feature_creation():" 110 | ], 111 | "execution_count": null, 112 | "outputs": [] 113 | }, 114 | { 115 | "cell_type": "markdown", 116 | "metadata": { 117 | "id": "ZbZ7W1kK8_Pp" 118 | }, 119 | "source": [ 120 | "## Data Exploration " 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": { 126 | "id": "dxYpOQt_-O6L" 127 | }, 128 | "source": [ 129 | "Let's check whether any of the variables have missing values. (Hint: there are missing values :) )" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "metadata": { 135 | "id": "7sTRgiMW9e6U" 136 | }, 137 | "source": [ 138 | "def missing_values_table(df):" 139 | ], 140 | "execution_count": null, 141 | "outputs": [] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "metadata": { 146 | "id": "S3BnPgZt-H7m" 147 | }, 148 | "source": [ 149 | "One way to solve the missing values problem is to treat the missing entries as a separate class. Next, we would want to do bivariate analysis - the analysis between the label variable and each of the feature variables. Since the analysis differs with the analytical type of each feature variable, we can first find the analytical type of every feature. We have written a function definition to find the analytical type of variables. You can also try to solve the missing values by any other method."
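The notebook only stubs out `missing_values_table` and the missing-value fix. One possible sketch of both helpers follows; treating missing categorical entries as a separate 'Missing' class is just the approach mentioned above, and the tiny example frame is illustrative (in the real data, OCCUPATION_TYPE is the column with gaps).

```python
import numpy as np
import pandas as pd

def missing_values_table(df):
    """Return count and percentage of missing values per column, worst first."""
    missing = df.isnull().sum()
    pct = 100 * missing / len(df)
    table = pd.DataFrame({"Missing Values": missing, "% of Total": pct})
    return table[table["Missing Values"] > 0].sort_values("% of Total", ascending=False)

def solution_missing_values(df):
    """One possible fix: fill missing categoricals with a separate 'Missing' class."""
    out = df.copy()
    categorical = out.select_dtypes(include="object").columns  # categorical features
    out[categorical] = out[categorical].fillna("Missing")
    return out

# Tiny illustrative frame in the shape of the application data.
df = pd.DataFrame({
    "AMT_INCOME_TOTAL": [427500.0, 112500.0, 270000.0],     # continuous feature
    "OCCUPATION_TYPE": [np.nan, "Security staff", np.nan],  # categorical with gaps
})
print(missing_values_table(df))
filled = solution_missing_values(df)
```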
150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "metadata": { 155 | "id": "n5ZaKk-k-HLE" 156 | }, 157 | "source": [ 158 | "def solution_missing_values(df):\n", 159 | " # Find Continuous and Categorical Features" 160 | ], 161 | "execution_count": null, 162 | "outputs": [] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": { 167 | "id": "O0t6Ni2a-8zi" 168 | }, 169 | "source": [ 170 | "### Observations\n" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": { 176 | "id": "-glozoHTJ-HQ" 177 | }, 178 | "source": [ 179 | "# Model Creation" 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": { 185 | "id": "3GnJPaiuKBq3" 186 | }, 187 | "source": [ 188 | "## Importing packages for the model" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "metadata": { 194 | "id": "fQGD5KnFKayL" 195 | }, 196 | "source": [ 197 | "" 198 | ], 199 | "execution_count": null, 200 | "outputs": [] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": { 205 | "id": "5fS-PM02KUVY" 206 | }, 207 | "source": [ 208 | "## Prepare Data" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "metadata": { 214 | "id": "tYnZ1xNPAZtC" 215 | }, 216 | "source": [ 217 | "" 218 | ], 219 | "execution_count": null, 220 | "outputs": [] 221 | }, 222 | { 223 | "cell_type": "markdown", 224 | "metadata": { 225 | "id": "-8eLrkCCKeMA" 226 | }, 227 | "source": [ 228 | "## Split Sample to Train and Test Samples" 229 | ] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "metadata": { 234 | "id": "sSMCsfs9Kfr4" 235 | }, 236 | "source": [ 237 | "" 238 | ], 239 | "execution_count": null, 240 | "outputs": [] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": { 245 | "id": "k7uKgsSJKgbU" 246 | }, 247 | "source": [ 248 | "## Model Definition " 249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "metadata": { 254 | "id": "p9hmYoJQKv_R" 255 | }, 256 | "source": [ 257 | "" 258 | ], 259 | "execution_count": null, 260 | 
"outputs": [] 261 | }, 262 | { 263 | "cell_type": "markdown", 264 | "metadata": { 265 | "id": "uBzMnlMVKwRd" 266 | }, 267 | "source": [ 268 | "## Fitting Model" 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "metadata": { 274 | "id": "fC_JjEB4OmvC" 275 | }, 276 | "source": [ 277 | "" 278 | ], 279 | "execution_count": null, 280 | "outputs": [] 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "metadata": { 285 | "id": "ZpFwD14iOnWq" 286 | }, 287 | "source": [ 288 | "## Predict using Fitted Model" 289 | ] 290 | }, 291 | { 292 | "cell_type": "code", 293 | "metadata": { 294 | "id": "B_2uKNb9Qflr" 295 | }, 296 | "source": [ 297 | "" 298 | ], 299 | "execution_count": null, 300 | "outputs": [] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": { 305 | "id": "5hIjbQ5aSAqW" 306 | }, 307 | "source": [ 308 | "## Model Evaluation\n", 309 | "Now, we may want to compare the predicted and observed label classes to see the actual accuracy. A confusion matrix can be useful." 310 | ] 311 | }, 312 | { 313 | "cell_type": "code", 314 | "metadata": { 315 | "id": "Nsfw4Ty_SL69" 316 | }, 317 | "source": [ 318 | "" 319 | ], 320 | "execution_count": null, 321 | "outputs": [] 322 | }, 323 | { 324 | "cell_type": "markdown", 325 | "metadata": { 326 | "id": "14r3giw8SMNF" 327 | }, 328 | "source": [ 329 | "## Model Parameter Tuning\n", 330 | "Since we have a relatively small amount of data and features, we are setting a large number of parameters for tuning. If the data were large, it could take quite some time to get the results."
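As one way to carry out the tuning step, the sketch below runs scikit-learn's `GridSearchCV` over a random-forest classifier on synthetic data. The model choice, the grid, and the data are all assumptions for demonstration; note that scikit-learn is not pinned in requirements.txt, so it would need to be installed separately.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the prepared feature matrix and 0/1 labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Small illustrative grid; with real data a larger grid takes proportionally longer.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)  # refits the best combination on the full training set

print("best parameters:", search.best_params_)
print("test accuracy:", search.best_estimator_.score(X_test, y_test))
```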
331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "metadata": { 336 | "id": "W6f3otA6Th6o" 337 | }, 338 | "source": [ 339 | "" 340 | ], 341 | "execution_count": null, 342 | "outputs": [] 343 | }, 344 | { 345 | "cell_type": "markdown", 346 | "metadata": { 347 | "id": "mLKFRDjrTicw" 348 | }, 349 | "source": [ 350 | "\n", 351 | "## Optimized Model Classifier\n", 352 | "\n", 353 | "Parameter tuning has helped us get the best combination of the parameters. Now, we will fit the model with these sets of parameters and see the improvement in the accuracy of the model.\n" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "metadata": { 359 | "id": "6GOeMkqJTm5n" 360 | }, 361 | "source": [ 362 | "" 363 | ], 364 | "execution_count": null, 365 | "outputs": [] 366 | }, 367 | { 368 | "cell_type": "markdown", 369 | "metadata": { 370 | "id": "H_mHegtJTnIt" 371 | }, 372 | "source": [ 373 | "## Optimized Model Evaluation" 374 | ] 375 | }, 376 | { 377 | "cell_type": "code", 378 | "metadata": { 379 | "id": "V7C1-zk2TvXi" 380 | }, 381 | "source": [ 382 | "" 383 | ], 384 | "execution_count": null, 385 | "outputs": [] 386 | }, 387 | { 388 | "cell_type": "markdown", 389 | "metadata": { 390 | "id": "SXkBI0rOTv4r" 391 | }, 392 | "source": [ 393 | "## Model Evaluation for the Optimized Model on Testing Sample\n" 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "metadata": { 399 | "id": "-Ls2zBJiVSFG" 400 | }, 401 | "source": [ 402 | "" 403 | ], 404 | "execution_count": null, 405 | "outputs": [] 406 | } 407 | ] 408 | } -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy==1.19.2 2 | matplotlib==3.3.2 3 | pandas==1.1.2 4 | seaborn==0.11.0 5 | -------------------------------------------------------------------------------- /submissions/.ipynb_checkpoints/Arpit-Agarwal-checkpoint.ipynb: 
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Importing required libraries\n", 10 | "import numpy as np\n", 11 | "import pandas as pd" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "## Import and read app data\n", 19 | "\n", 20 | "The link for downloading the daatset is " 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 2, 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "def read_app_data():\n", 30 | " application_record=pd.read_csv(r'C:\\Users\\akhil\\Documents\\drive-download-20201009T155012Z-001\\application_record.csv')\n", 31 | " return application_record\n", 32 | "application_record=read_app_data()" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": null, 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [] 41 | } 42 | ], 43 | "metadata": { 44 | "kernelspec": { 45 | "display_name": "Python3.6Test", 46 | "language": "python", 47 | "name": "python3.6test" 48 | }, 49 | "language_info": { 50 | "codemirror_mode": { 51 | "name": "ipython", 52 | "version": 3 53 | }, 54 | "file_extension": ".py", 55 | "mimetype": "text/x-python", 56 | "name": "python", 57 | "nbconvert_exporter": "python", 58 | "pygments_lexer": "ipython3", 59 | "version": "3.6.5" 60 | } 61 | }, 62 | "nbformat": 4, 63 | "nbformat_minor": 4 64 | } 65 | -------------------------------------------------------------------------------- /submissions/Abhijit-Singh.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# importing pandas library as pd\n", 10 | "\n", 11 | "import pandas as pd" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | 
"metadata": {}, 17 | "source": [ 18 | "# importing application_record dataset " 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "# Reading application_record.csv file.\n", 28 | "# This application_record.csv could be downloaded from the below provided link\n", 29 | "# https://drive.google.com/file/d/1EJ454SyXT-RpEAfhqYu72bCyvZrY_ehg/view?usp=sharing\n", 30 | "def read_app_data():\n", 31 | " application_data=pd.read_csv('application_record.csv')\n", 32 | " return application_data" 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": {}, 38 | "source": [ 39 | "## importing credit_card dataset " 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 3, 45 | "metadata": {}, 46 | "outputs": [], 47 | "source": [ 48 | "# This credit_record.csv could be downloaded from the below provided link\n", 49 | "# https://drive.google.com/file/d/1LvjYFEztJJUYNhSa1eznfseCNHz3DgZb/view?usp=sharing\n", 50 | "credit_record=pd.read_csv('credit_record.csv')" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | "## Feature creation" 58 | ] 59 | }, 60 | { 61 | "cell_type": "code", 62 | "execution_count": 4, 63 | "metadata": {}, 64 | "outputs": [ 65 | { 66 | "data": { 67 | "text/html": [ 68 | "
\n", 69 | "\n", 82 | "\n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | "
IDCODE_GENDERFLAG_OWN_CARFLAG_OWN_REALTYCNT_CHILDRENAMT_INCOME_TOTALNAME_INCOME_TYPENAME_EDUCATION_TYPENAME_FAMILY_STATUSNAME_HOUSING_TYPEDAYS_BIRTHDAYS_EMPLOYEDFLAG_MOBILFLAG_WORK_PHONEFLAG_PHONEFLAG_EMAILOCCUPATION_TYPECNT_FAM_MEMBERS
05008804MYY0427500.0WorkingHigher educationCivil marriageRented apartment-12005-45421100NaN2.0
15008805MYY0427500.0WorkingHigher educationCivil marriageRented apartment-12005-45421100NaN2.0
25008806MYY0112500.0WorkingSecondary / secondary specialMarriedHouse / apartment-21474-11341000Security staff2.0
35008808FNY0270000.0Commercial associateSecondary / secondary specialSingle / not marriedHouse / apartment-19110-30511011Sales staff1.0
45008809FNY0270000.0Commercial associateSecondary / secondary specialSingle / not marriedHouse / apartment-19110-30511011Sales staff1.0
\n", 214 | "
" 215 | ], 216 | "text/plain": [ 217 | " ID CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN \\\n", 218 | "0 5008804 M Y Y 0 \n", 219 | "1 5008805 M Y Y 0 \n", 220 | "2 5008806 M Y Y 0 \n", 221 | "3 5008808 F N Y 0 \n", 222 | "4 5008809 F N Y 0 \n", 223 | "\n", 224 | " AMT_INCOME_TOTAL NAME_INCOME_TYPE NAME_EDUCATION_TYPE \\\n", 225 | "0 427500.0 Working Higher education \n", 226 | "1 427500.0 Working Higher education \n", 227 | "2 112500.0 Working Secondary / secondary special \n", 228 | "3 270000.0 Commercial associate Secondary / secondary special \n", 229 | "4 270000.0 Commercial associate Secondary / secondary special \n", 230 | "\n", 231 | " NAME_FAMILY_STATUS NAME_HOUSING_TYPE DAYS_BIRTH DAYS_EMPLOYED \\\n", 232 | "0 Civil marriage Rented apartment -12005 -4542 \n", 233 | "1 Civil marriage Rented apartment -12005 -4542 \n", 234 | "2 Married House / apartment -21474 -1134 \n", 235 | "3 Single / not married House / apartment -19110 -3051 \n", 236 | "4 Single / not married House / apartment -19110 -3051 \n", 237 | "\n", 238 | " FLAG_MOBIL FLAG_WORK_PHONE FLAG_PHONE FLAG_EMAIL OCCUPATION_TYPE \\\n", 239 | "0 1 1 0 0 NaN \n", 240 | "1 1 1 0 0 NaN \n", 241 | "2 1 0 0 0 Security staff \n", 242 | "3 1 0 1 1 Sales staff \n", 243 | "4 1 0 1 1 Sales staff \n", 244 | "\n", 245 | " CNT_FAM_MEMBERS \n", 246 | "0 2.0 \n", 247 | "1 2.0 \n", 248 | "2 2.0 \n", 249 | "3 1.0 \n", 250 | "4 1.0 " 251 | ] 252 | }, 253 | "execution_count": 4, 254 | "metadata": {}, 255 | "output_type": "execute_result" 256 | } 257 | ], 258 | "source": [ 259 | "application_data=read_app_data()\n", 260 | "application_data.head()" 261 | ] 262 | }, 263 | { 264 | "cell_type": "code", 265 | "execution_count": 5, 266 | "metadata": {}, 267 | "outputs": [ 268 | { 269 | "name": "stdout", 270 | "output_type": "stream", 271 | "text": [ 272 | "\n", 273 | "RangeIndex: 438557 entries, 0 to 438556\n", 274 | "Data columns (total 18 columns):\n", 275 | " # Column Non-Null Count Dtype \n", 276 | "--- ------ 
-------------- ----- \n", 277 | " 0 ID 438557 non-null int64 \n", 278 | " 1 CODE_GENDER 438557 non-null object \n", 279 | " 2 FLAG_OWN_CAR 438557 non-null object \n", 280 | " 3 FLAG_OWN_REALTY 438557 non-null object \n", 281 | " 4 CNT_CHILDREN 438557 non-null int64 \n", 282 | " 5 AMT_INCOME_TOTAL 438557 non-null float64\n", 283 | " 6 NAME_INCOME_TYPE 438557 non-null object \n", 284 | " 7 NAME_EDUCATION_TYPE 438557 non-null object \n", 285 | " 8 NAME_FAMILY_STATUS 438557 non-null object \n", 286 | " 9 NAME_HOUSING_TYPE 438557 non-null object \n", 287 | " 10 DAYS_BIRTH 438557 non-null int64 \n", 288 | " 11 DAYS_EMPLOYED 438557 non-null int64 \n", 289 | " 12 FLAG_MOBIL 438557 non-null int64 \n", 290 | " 13 FLAG_WORK_PHONE 438557 non-null int64 \n", 291 | " 14 FLAG_PHONE 438557 non-null int64 \n", 292 | " 15 FLAG_EMAIL 438557 non-null int64 \n", 293 | " 16 OCCUPATION_TYPE 304354 non-null object \n", 294 | " 17 CNT_FAM_MEMBERS 438557 non-null float64\n", 295 | "dtypes: float64(2), int64(8), object(8)\n", 296 | "memory usage: 60.2+ MB\n" 297 | ] 298 | } 299 | ], 300 | "source": [ 301 | "application_data.info()" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "### Here, columns like ID and DAYS_BIRTH are not useful as features, so we can drop these columns" 309 | ] 310 | }, 311 | { 312 | "cell_type": "code", 313 | "execution_count": 6, 314 | "metadata": {}, 315 | "outputs": [], 316 | "source": [ 317 | "def feature_creation():\n", 318 | " application_data.drop(['ID','DAYS_BIRTH'],axis=1,inplace=True)\n", 319 | " \n", 320 | "# Checking for columns with data type as 'object'\n", 321 | "\n", 322 | " object_type_columns=[col for col in application_data.columns if application_data[col].dtype=='object'] \n", 323 | " \n", 324 | "# Converting the object type data to categorical form.\n", 325 | " \n", 326 | " for i,column in enumerate(object_type_columns):\n", 327 |
application_data[column]=pd.Categorical(application_data[column]).codes\n", 328 | " features_column=application_data.columns\n", 329 | " features=pd.DataFrame(application_data,columns=features_column)\n", 330 | " return features" 331 | ] 332 | }, 333 | { 334 | "cell_type": "code", 335 | "execution_count": 7, 336 | "metadata": {}, 337 | "outputs": [], 338 | "source": [ 339 | "features_for_model=feature_creation()" 340 | ] 341 | }, 342 | { 343 | "cell_type": "code", 344 | "execution_count": 8, 345 | "metadata": {}, 346 | "outputs": [ 347 | { 348 | "data": { 349 | "text/html": [ 350 | "
\n", 351 | "\n", 364 | "\n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 | " \n", 384 | " \n", 385 | " \n", 386 | " \n", 387 | " \n", 388 | " \n", 389 | " \n", 390 | " \n", 391 | " \n", 392 | " \n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | " \n", 403 | " \n", 404 | " \n", 405 | " \n", 406 | " \n", 407 | " \n", 408 | " \n", 409 | " \n", 410 | " \n", 411 | " \n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | "
CODE_GENDERFLAG_OWN_CARFLAG_OWN_REALTYCNT_CHILDRENAMT_INCOME_TOTALNAME_INCOME_TYPENAME_EDUCATION_TYPENAME_FAMILY_STATUSNAME_HOUSING_TYPEDAYS_EMPLOYEDFLAG_MOBILFLAG_WORK_PHONEFLAG_PHONEFLAG_EMAILOCCUPATION_TYPECNT_FAM_MEMBERS
01110427500.04104-45421100-12.0
11110427500.04104-45421100-12.0
21110112500.04411-11341000162.0
30010270000.00431-30511011141.0
40010270000.00431-30511011141.0
\n", 484 | "
" 485 | ], 486 | "text/plain": [ 487 | " CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN AMT_INCOME_TOTAL \\\n", 488 | "0 1 1 1 0 427500.0 \n", 489 | "1 1 1 1 0 427500.0 \n", 490 | "2 1 1 1 0 112500.0 \n", 491 | "3 0 0 1 0 270000.0 \n", 492 | "4 0 0 1 0 270000.0 \n", 493 | "\n", 494 | " NAME_INCOME_TYPE NAME_EDUCATION_TYPE NAME_FAMILY_STATUS \\\n", 495 | "0 4 1 0 \n", 496 | "1 4 1 0 \n", 497 | "2 4 4 1 \n", 498 | "3 0 4 3 \n", 499 | "4 0 4 3 \n", 500 | "\n", 501 | " NAME_HOUSING_TYPE DAYS_EMPLOYED FLAG_MOBIL FLAG_WORK_PHONE FLAG_PHONE \\\n", 502 | "0 4 -4542 1 1 0 \n", 503 | "1 4 -4542 1 1 0 \n", 504 | "2 1 -1134 1 0 0 \n", 505 | "3 1 -3051 1 0 1 \n", 506 | "4 1 -3051 1 0 1 \n", 507 | "\n", 508 | " FLAG_EMAIL OCCUPATION_TYPE CNT_FAM_MEMBERS \n", 509 | "0 0 -1 2.0 \n", 510 | "1 0 -1 2.0 \n", 511 | "2 0 16 2.0 \n", 512 | "3 1 14 1.0 \n", 513 | "4 1 14 1.0 " 514 | ] 515 | }, 516 | "execution_count": 8, 517 | "metadata": {}, 518 | "output_type": "execute_result" 519 | } 520 | ], 521 | "source": [ 522 | "# Observing first 5 values of feature\n", 523 | "\n", 524 | "features_for_model.head()" 525 | ] 526 | } 527 | ], 528 | "metadata": { 529 | "kernelspec": { 530 | "display_name": "Python 3", 531 | "language": "python", 532 | "name": "python3" 533 | }, 534 | "language_info": { 535 | "codemirror_mode": { 536 | "name": "ipython", 537 | "version": 3 538 | }, 539 | "file_extension": ".py", 540 | "mimetype": "text/x-python", 541 | "name": "python", 542 | "nbconvert_exporter": "python", 543 | "pygments_lexer": "ipython3", 544 | "version": "3.7.6" 545 | } 546 | }, 547 | "nbformat": 4, 548 | "nbformat_minor": 4 549 | } 550 | -------------------------------------------------------------------------------- /submissions/Anusha-Verma-C.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "metadata": { 3 | "language_info": { 4 | "codemirror_mode": { 5 | "name": "ipython", 6 | "version": 3 7 | }, 8 | "file_extension": ".py", 9 | 
"mimetype": "text/x-python", 10 | "name": "python", 11 | "nbconvert_exporter": "python", 12 | "pygments_lexer": "ipython3", 13 | "version": "3.7.6-final" 14 | }, 15 | "orig_nbformat": 2, 16 | "kernelspec": { 17 | "name": "Python 3.7.6 64-bit ('base': conda)", 18 | "display_name": "Python 3.7.6 64-bit ('base': conda)", 19 | "metadata": { 20 | "interpreter": { 21 | "hash": "18f47364f2f4870763990e46b7154981c710d71482bd8194938a3829d09494e5" 22 | } 23 | } 24 | } 25 | }, 26 | "nbformat": 4, 27 | "nbformat_minor": 2, 28 | "cells": [ 29 | { 30 | "cell_type": "code", 31 | "execution_count": 15, 32 | "metadata": {}, 33 | "outputs": [], 34 | "source": [ 35 | "import numpy as np\n", 36 | "import matplotlib.pyplot as plt\n", 37 | "import pandas as pd\n", 38 | "\n", 39 | "\n" 40 | ] 41 | }, 42 | { 43 | "source": [ 44 | "## Import and read app data\n", 45 | "I have imported the given dataset into a folder called dataset.\n", 46 | "
\n" 47 | ], 48 | "cell_type": "markdown", 49 | "metadata": {} 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": 16, 54 | "metadata": {}, 55 | "outputs": [], 56 | "source": [ 57 | "def read_app_data():\n", 58 | " application_record=pd.read_csv(r'C:\\Users\\anusha\\Desktop\\dataset\\application_record.csv')\n", 59 | " return application_record\n", 60 | "application_record=read_app_data()\n" 61 | ] 62 | }, 63 | { 64 | "source": [ 65 | "## Import and read credit_record data" 66 | ], 67 | "cell_type": "markdown", 68 | "metadata": {} 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 17, 73 | "metadata": {}, 74 | "outputs": [], 75 | "source": [ 76 | "credit_record=pd.read_csv(r'C:\\Users\\anusha\\Desktop\\dataset\\credit_record.csv')\n" 77 | ] 78 | }, 79 | { 80 | "source": [ 81 | "## Feature Creation\n" 82 | ], 83 | "cell_type": "markdown", 84 | "metadata": {} 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 18, 89 | "metadata": {}, 90 | "outputs": [], 91 | "source": [ 92 | "#dropping ID\n", 93 | "#dropping Flag Mobil as the entire column has the same values\n", 94 | "\n", 95 | "def feature_creation():\n", 96 | " application_record.drop(['ID','FLAG_MOBIL'],axis=1,inplace=True)\n", 97 | " features=application_record.columns\n", 98 | " return features\n", 99 | "\n", 100 | "features=feature_creation()\n" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 19, 106 | "metadata": {}, 107 | "outputs": [], 108 | "source": [ 109 | "features1=['CODE_GENDER','FLAG_OWN_CAR','FLAG_OWN_REALTY','OCCUPATION_TYPE']\n", 110 | "import sklearn\n", 111 | "from sklearn.preprocessing import LabelEncoder\n", 112 | "le = LabelEncoder()\n", 113 | "for a in features1:\n", 114 | " application_record[a] = le.fit_transform(application_record[a].astype(str))\n" 115 | ] 116 | }, 117 | { 118 | "cell_type": "code", 119 | "execution_count": 20, 120 | "metadata": {}, 121 | "outputs": [ 122 | { 123 | "output_type": "stream", 124 | "name": 
"stdout", 125 | "text": [ 126 | " CODE_GENDER FLAG_OWN_CAR FLAG_OWN_REALTY CNT_CHILDREN \\\n0 1 1 1 0 \n1 1 1 1 0 \n2 1 1 1 0 \n3 0 0 1 0 \n4 0 0 1 0 \n... ... ... ... ... \n438552 1 0 1 0 \n438553 0 0 0 0 \n438554 0 0 0 0 \n438555 0 0 1 0 \n438556 0 0 1 0 \n\n AMT_INCOME_TOTAL DAYS_BIRTH DAYS_EMPLOYED FLAG_WORK_PHONE \\\n0 5.630936 -12005 -4542 1 \n1 5.630936 -12005 -4542 1 \n2 5.051153 -21474 -1134 0 \n3 5.431364 -19110 -3051 0 \n4 5.431364 -19110 -3051 0 \n... ... ... ... ... \n438552 5.130334 -22717 365243 0 \n438553 5.014940 -15939 -3007 0 \n438554 4.732394 -8169 -372 1 \n438555 4.857332 -21673 365243 0 \n438556 5.084576 -18858 -1201 0 \n\n FLAG_PHONE FLAG_EMAIL ... NAME_FAMILY_STATUS_Married \\\n0 0 0 ... 0 \n1 0 0 ... 0 \n2 0 0 ... 1 \n3 1 1 ... 0 \n4 1 1 ... 0 \n... ... ... ... ... \n438552 0 0 ... 0 \n438553 0 0 ... 0 \n438554 0 0 ... 0 \n438555 0 0 ... 1 \n438556 1 0 ... 1 \n\n NAME_FAMILY_STATUS_Separated NAME_FAMILY_STATUS_Single / not married \\\n0 0 0 \n1 0 0 \n2 0 0 \n3 0 1 \n4 0 1 \n... ... ... \n438552 1 0 \n438553 0 1 \n438554 0 1 \n438555 0 0 \n438556 0 0 \n\n NAME_FAMILY_STATUS_Widow NAME_HOUSING_TYPE_Co-op apartment \\\n0 0 0 \n1 0 0 \n2 0 0 \n3 0 0 \n4 0 0 \n... ... ... \n438552 0 0 \n438553 0 0 \n438554 0 0 \n438555 0 0 \n438556 0 0 \n\n NAME_HOUSING_TYPE_House / apartment \\\n0 0 \n1 0 \n2 1 \n3 1 \n4 1 \n... ... \n438552 1 \n438553 1 \n438554 0 \n438555 1 \n438556 1 \n\n NAME_HOUSING_TYPE_Municipal apartment \\\n0 0 \n1 0 \n2 0 \n3 0 \n4 0 \n... ... \n438552 0 \n438553 0 \n438554 0 \n438555 0 \n438556 0 \n\n NAME_HOUSING_TYPE_Office apartment \\\n0 0 \n1 0 \n2 0 \n3 0 \n4 0 \n... ... \n438552 0 \n438553 0 \n438554 0 \n438555 0 \n438556 0 \n\n NAME_HOUSING_TYPE_Rented apartment NAME_HOUSING_TYPE_With parents \n0 1 0 \n1 1 0 \n2 0 0 \n3 0 0 \n4 0 0 \n... ... ... 
\n438552                                   0                               0   \n438553                                   0                               0   \n438554                                   0                               1   \n438555                                   0                               0   \n438556                                   0                               0   \n\n[438557 rows x 33 columns]\n" 127 | ] 128 | } 129 | ], 130 | "source": [ 131 | "from sklearn.compose import ColumnTransformer\n", 132 | "from sklearn.preprocessing import OneHotEncoder\n", 133 | "application_record= pd.get_dummies(data=application_record,columns=['NAME_INCOME_TYPE','NAME_EDUCATION_TYPE','NAME_FAMILY_STATUS','NAME_HOUSING_TYPE'])\n", 134 | "application_record['AMT_INCOME_TOTAL']=np.log10(application_record['AMT_INCOME_TOTAL']) \n", 135 | "\n", 136 | "print(application_record)\n", 137 | "\n" 138 | ] 139 | } 140 | ] 141 | } -------------------------------------------------------------------------------- /submissions/Arpit-Agarwal.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "# Importing required libraries\n", 10 | "import numpy as np\n", 11 | "import pandas as pd" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "## Import and read app data\n", 19 | "\n", 20 | "The link for downloading the dataset is " 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": 2, 26 | "metadata": {}, 27 | "outputs": [], 28 | "source": [ 29 | "def read_app_data():\n", 30 | "    application_record=pd.read_csv(r'C:\\Users\\akhil\\Documents\\drive-download-20201009T155012Z-001\\application_record.csv')\n", 31 | "    return application_record\n", 32 | "application_record=read_app_data()" 33 | ] 34 | }, 35 | { 36 | "cell_type": "code", 37 | "execution_count": null, 38 | "metadata": {}, 39 | "outputs": [], 40 | "source": [] 41 | } 42 | ], 43 | "metadata": { 44 | "kernelspec": { 45 | "display_name": "Python3.6Test", 46 | "language": "python", 47 | "name": "python3.6test" 48 | }, 49 | "language_info": { 50 | "codemirror_mode": { 51 | "name": "ipython", 52 | "version": 3 53 | }, 54 
| "file_extension": ".py", 55 | "mimetype": "text/x-python", 56 | "name": "python", 57 | "nbconvert_exporter": "python", 58 | "pygments_lexer": "ipython3", 59 | "version": "3.6.5" 60 | } 61 | }, 62 | "nbformat": 4, 63 | "nbformat_minor": 4 64 | } 65 | -------------------------------------------------------------------------------- /submissions/Dhanesh-Shetty.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Import and Read Application Data" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 1, 13 | "metadata": {}, 14 | "outputs": [], 15 | "source": [ 16 | "import pandas as pd" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 2, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "def read_app_data():\n", 26 | " application_records=pd.read_csv(\"/home/dhanesh/Documents/Credit Card Approval/application_record.csv\")\n", 27 | " return application_records" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "metadata": {}, 33 | "source": [ 34 | "# Import The Credit Card Dataset" 35 | ] 36 | }, 37 | { 38 | "cell_type": "code", 39 | "execution_count": 3, 40 | "metadata": {}, 41 | "outputs": [], 42 | "source": [ 43 | "#dataset downloaded from https://drive.google.com/drive/folders/1ltq08WdYxd-r9wnY60o78VBgN5FlMcKk?usp=sharing\n", 44 | "credit_card_data=pd.read_csv(\"/home/dhanesh/Documents/Credit Card Approval/credit_record.csv\")" 45 | ] 46 | } 47 | ], 48 | "metadata": { 49 | "kernelspec": { 50 | "display_name": "py3-TF", 51 | "language": "python", 52 | "name": "py3-tf" 53 | }, 54 | "language_info": { 55 | "codemirror_mode": { 56 | "name": "ipython", 57 | "version": 3 58 | }, 59 | "file_extension": ".py", 60 | "mimetype": "text/x-python", 61 | "name": "python", 62 | "nbconvert_exporter": "python", 63 | "pygments_lexer": "ipython3", 64 | "version": "3.8.3" 65 | } 66 | }, 67 | 
"nbformat": 4, 68 | "nbformat_minor": 4 69 | } 70 | --------------------------------------------------------------------------------
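The submissions above share the same preprocessing recipe for the credit-card application data (drop `ID` and the constant `FLAG_MOBIL` column, label-encode the binary/text columns, one-hot encode the multi-category `NAME_*` columns, and log10-transform `AMT_INCOME_TOTAL`), but each one hardcodes an absolute local path. A minimal, path-agnostic sketch of that pipeline is below; the column names come from the dataset itself, while `data_dir` is a placeholder for wherever `application_record.csv` was downloaded:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

def read_app_data(data_dir):
    # data_dir is assumed to point at the folder holding the downloaded CSVs
    return pd.read_csv(f"{data_dir}/application_record.csv")

def preprocess(application_record):
    # ID is only an identifier; FLAG_MOBIL holds a single constant value
    df = application_record.drop(columns=["ID", "FLAG_MOBIL"])
    # Binary / low-cardinality text columns -> integer codes
    for col in ["CODE_GENDER", "FLAG_OWN_CAR", "FLAG_OWN_REALTY", "OCCUPATION_TYPE"]:
        df[col] = LabelEncoder().fit_transform(df[col].astype(str))
    # Multi-category columns -> one-hot indicator columns
    df = pd.get_dummies(df, columns=["NAME_INCOME_TYPE", "NAME_EDUCATION_TYPE",
                                     "NAME_FAMILY_STATUS", "NAME_HOUSING_TYPE"])
    # Income is heavily right-skewed, so compress it with log10
    df["AMT_INCOME_TOTAL"] = np.log10(df["AMT_INCOME_TOTAL"])
    return df
```

With the shared Drive download unzipped into `data_dir`, `preprocess(read_app_data(data_dir))` should reproduce the 33-column frame printed in the Anusha-Verma-C submission without any machine-specific paths in the code.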