├── Analyzing US Economic Data and Building Dashboard.ipynb
├── Final Project - Data Visualization.ipynb
├── First Notebook.ipynb
├── House Sales in King County, USA Project.ipynb
├── Machine Learning Final Project.ipynb
├── Neighborhoods in Mumbai to Open a Restaurant.ipynb
├── README.md
├── SQL Assignment - Chicago.ipynb
└── Segmenting and Clustering Neighborhoods in Toronto.ipynb

/Analyzing US Economic Data and Building Dashboard.ipynb:
--------------------------------------------------------------------------------
1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": "" 7 | }, 8 | { 9 | "cell_type": "markdown", 10 | "metadata": {}, 11 | "source": "

Analyzing US Economic Data and Building a Dashboard

\n

Description

\n" 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": "Extracting essential data from a dataset and displaying it is a necessary part of data science, because it lets individuals make correct decisions based on the data. In this assignment, you will extract some essential economic indicators from the data and then display them in a dashboard. You can then share the dashboard via a URL.\n

\n Gross domestic product (GDP) is a measure of the market value of all the final goods and services produced in a period. GDP is an indicator of how well the economy is doing. A drop in GDP indicates the economy is producing less; similarly, an increase in GDP suggests the economy is performing better. In this lab, you will examine how changes in GDP impact the unemployment rate. You will take screenshots of every step, then share the notebook and the URL pointing to the dashboard.
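As a worked illustration of how a yearly GDP change figure can be derived from GDP levels, the sketch below computes a percent change in plain Python. The function name `percent_change` is hypothetical, and the two inputs are the first two `level-current` values displayed later in this notebook. How the dataset itself aligns each change value with a year is not stated here, so treat this as arithmetic only.

```python
def percent_change(previous, current):
    """Return the change from previous to current as a percentage of previous."""
    return (current - previous) / previous * 100.0


# First two level-current values shown in the GDP table of this notebook.
change = percent_change(274.8, 272.8)
print(round(change, 1))  # prints -0.7
```

A negative result corresponds to the "economy is producing less" case described above, and a positive one to growth.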

" 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": "

Table of Contents

\n
\n \n

\n Estimated Time Needed: 180 min

\n
\n\n
" 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": "

Define Function that Makes a Dashboard

" 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": "We will import the following libraries." 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 83, 36 | "metadata": {}, 37 | "outputs": [ 38 | { 39 | "data": { 40 | "text/html": "\n
\n \n Loading BokehJS ...\n
" 41 | }, 42 | "metadata": {}, 43 | "output_type": "display_data" 44 | }, 45 | { 46 | "data": { 47 | "application/javascript": "\n(function(root) {\n function now() {\n return new Date();\n }\n\n var force = true;\n\n if (typeof (root._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n var JS_MIME_TYPE = 'application/javascript';\n var HTML_MIME_TYPE = 'text/html';\n var EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n var CLASS_NAME = 'output_bokeh rendered_html';\n\n /**\n * Render data to the DOM node\n */\n function render(props, node) {\n var script = document.createElement(\"script\");\n node.appendChild(script);\n }\n\n /**\n * Handle when an output is cleared or removed\n */\n function handleClearOutput(event, handle) {\n var cell = handle.cell;\n\n var id = cell.output_area._bokeh_element_id;\n var server_id = cell.output_area._bokeh_server_id;\n // Clean up Bokeh references\n if (id != null && id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n\n if (server_id !== undefined) {\n // Clean up Bokeh references\n var cmd = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n cell.notebook.kernel.execute(cmd, {\n iopub: {\n output: function(msg) {\n var id = msg.content.text.trim();\n if (id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n }\n }\n });\n // Destroy server and session\n var cmd = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n cell.notebook.kernel.execute(cmd);\n }\n }\n\n /**\n * Handle when a new output is added\n */\n function handleAddOutput(event, handle) {\n var output_area = handle.output_area;\n var output = handle.output;\n\n // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n if ((output.output_type != 
\"display_data\") || (!output.data.hasOwnProperty(EXEC_MIME_TYPE))) {\n return\n }\n\n var toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n\n if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n toinsert[toinsert.length - 1].firstChild.textContent = output.data[JS_MIME_TYPE];\n // store reference to embed id on output_area\n output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n }\n if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n var bk_div = document.createElement(\"div\");\n bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n var script_attrs = bk_div.children[0].attributes;\n for (var i = 0; i < script_attrs.length; i++) {\n toinsert[toinsert.length - 1].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n }\n // store reference to server id on output_area\n output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n }\n }\n\n function register_renderer(events, OutputArea) {\n\n function append_mime(data, metadata, element) {\n // create a DOM node to render to\n var toinsert = this.create_output_subarea(\n metadata,\n CLASS_NAME,\n EXEC_MIME_TYPE\n );\n this.keyboard_manager.register_events(toinsert);\n // Render to node\n var props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n render(props, toinsert[toinsert.length - 1]);\n element.append(toinsert);\n return toinsert\n }\n\n /* Handle when an output is cleared or removed */\n events.on('clear_output.CodeCell', handleClearOutput);\n events.on('delete.Cell', handleClearOutput);\n\n /* Handle when a new output is added */\n events.on('output_added.OutputArea', handleAddOutput);\n\n /**\n * Register the mime type and append_mime function with output_area\n */\n OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n /* Is output safe? 
*/\n safe: true,\n /* Index of renderer in `output_area.display_order` */\n index: 0\n });\n }\n\n // register the mime type if in Jupyter Notebook environment and previously unregistered\n if (root.Jupyter !== undefined) {\n var events = require('base/js/events');\n var OutputArea = require('notebook/js/outputarea').OutputArea;\n\n if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n register_renderer(events, OutputArea);\n }\n }\n\n \n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n var NB_LOAD_WARNING = {'data': {'text/html':\n \"
\\n\"+\n \"

\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"

\\n\"+\n \"\\n\"+\n \"\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"
\"}};\n\n function display_loaded() {\n var el = document.getElementById(\"1940\");\n if (el != null) {\n el.textContent = \"BokehJS is loading...\";\n }\n if (root.Bokeh !== undefined) {\n if (el != null) {\n el.textContent = \"BokehJS \" + root.Bokeh.version + \" successfully loaded.\";\n }\n } else if (Date.now() < root._bokeh_timeout) {\n setTimeout(display_loaded, 100)\n }\n }\n\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n }\n finally {\n delete root._bokeh_onload_callbacks\n }\n console.info(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(js_urls, callback) {\n root._bokeh_onload_callbacks.push(callback);\n if (root._bokeh_is_loading > 0) {\n console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls == null || js_urls.length === 0) {\n run_callbacks();\n return null;\n }\n console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n root._bokeh_is_loading = js_urls.length;\n for (var i = 0; i < js_urls.length; i++) {\n var url = js_urls[i];\n var s = document.createElement('script');\n s.src = url;\n s.async = false;\n s.onreadystatechange = s.onload = function() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: all BokehJS libraries loaded\");\n run_callbacks()\n }\n };\n s.onerror = function() {\n console.warn(\"failed to load library \" + url);\n };\n console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.getElementsByTagName(\"head\")[0].appendChild(s);\n }\n };var element = document.getElementById(\"1940\");\n if (element == null) {\n console.log(\"Bokeh: ERROR: autoload.js configured with elementid '1940' but no matching script tag was found. 
\")\n return false;\n }\n\n var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-gl-1.0.4.min.js\"];\n\n var inline_js = [\n function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\n \n function(Bokeh) {\n \n },\n function(Bokeh) {\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.css\");\n }\n ];\n\n function run_inline_js() {\n \n if ((root.Bokeh !== undefined) || (force === true)) {\n for (var i = 0; i < inline_js.length; i++) {\n inline_js[i].call(root, root.Bokeh);\n }if (force === true) {\n display_loaded();\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n } else if (force !== true) {\n var cell = $(document.getElementById(\"1940\")).parents('.cell').data().cell;\n cell.output_area.append_execute_result(NB_LOAD_WARNING)\n }\n\n }\n\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n run_inline_js();\n } else {\n load_libs(js_urls, function() {\n console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n run_inline_js();\n });\n }\n}(window));", 48 | 
"application/vnd.bokehjs_load.v0+json": "\n(function(root) {\n function now() {\n return new Date();\n }\n\n var force = true;\n\n if (typeof (root._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n \n\n \n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n var NB_LOAD_WARNING = {'data': {'text/html':\n \"
\\n\"+\n \"

\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"

\\n\"+\n \"\\n\"+\n \"\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"
\"}};\n\n function display_loaded() {\n var el = document.getElementById(\"1940\");\n if (el != null) {\n el.textContent = \"BokehJS is loading...\";\n }\n if (root.Bokeh !== undefined) {\n if (el != null) {\n el.textContent = \"BokehJS \" + root.Bokeh.version + \" successfully loaded.\";\n }\n } else if (Date.now() < root._bokeh_timeout) {\n setTimeout(display_loaded, 100)\n }\n }\n\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n }\n finally {\n delete root._bokeh_onload_callbacks\n }\n console.info(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(js_urls, callback) {\n root._bokeh_onload_callbacks.push(callback);\n if (root._bokeh_is_loading > 0) {\n console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls == null || js_urls.length === 0) {\n run_callbacks();\n return null;\n }\n console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n root._bokeh_is_loading = js_urls.length;\n for (var i = 0; i < js_urls.length; i++) {\n var url = js_urls[i];\n var s = document.createElement('script');\n s.src = url;\n s.async = false;\n s.onreadystatechange = s.onload = function() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: all BokehJS libraries loaded\");\n run_callbacks()\n }\n };\n s.onerror = function() {\n console.warn(\"failed to load library \" + url);\n };\n console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.getElementsByTagName(\"head\")[0].appendChild(s);\n }\n };var element = document.getElementById(\"1940\");\n if (element == null) {\n console.log(\"Bokeh: ERROR: autoload.js configured with elementid '1940' but no matching script tag was found. 
\")\n return false;\n }\n\n var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-gl-1.0.4.min.js\"];\n\n var inline_js = [\n function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\n \n function(Bokeh) {\n \n },\n function(Bokeh) {\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-1.0.4.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-1.0.4.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-tables-1.0.4.min.css\");\n }\n ];\n\n function run_inline_js() {\n \n if ((root.Bokeh !== undefined) || (force === true)) {\n for (var i = 0; i < inline_js.length; i++) {\n inline_js[i].call(root, root.Bokeh);\n }if (force === true) {\n display_loaded();\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n } else if (force !== true) {\n var cell = $(document.getElementById(\"1940\")).parents('.cell').data().cell;\n cell.output_area.append_execute_result(NB_LOAD_WARNING)\n }\n\n }\n\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n run_inline_js();\n } else {\n load_libs(js_urls, function() {\n console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n run_inline_js();\n });\n }\n}(window));" 49 | }, 50 | "metadata": {}, 51 | 
"output_type": "display_data" 52 | } 53 | ], 54 | "source": "import pandas as pd\nfrom bokeh.plotting import figure, output_file, show,output_notebook\noutput_notebook()" 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": "In this section, we define the function make_dashboard. \nYou don't have to know how the function works, you should only care about the inputs. The function will produce a dashboard as well as an html file. You can then use this html file to share your dashboard. If you do not know what an html file is don't worry everything you need to know will be provided in the lab. " 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 84, 64 | "metadata": {}, 65 | "outputs": [], 66 | "source": "def make_dashboard(x, gdp_change, unemployment, title, file_name):\n output_file(file_name)\n p = figure(title=title, x_axis_label='year', y_axis_label='%')\n p.line(x.squeeze(), gdp_change.squeeze(), color=\"firebrick\", line_width=4, legend=\"% GDP change\")\n p.line(x.squeeze(), unemployment.squeeze(), line_width=4, legend=\"% unemployed\")\n show(p)" 67 | }, 68 | { 69 | "cell_type": "markdown", 70 | "metadata": {}, 71 | "source": "The dictionary links contain the CSV files with all the data. The value for the key GDP is the file that contains the GDP data. The value for the key unemployment contains the unemployment data." 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 85, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": "links={'GDP':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_gdp.csv',\\\n 'unemployment':'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/projects/coursera_project/clean_unemployment.csv'}" 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": "

Question 1: Create a dataframe that contains the GDP data and display the first five rows of the dataframe.

" 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": "Use the dictionary links and the function pd.read_csv to create a Pandas dataframes that contains the GDP data." 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": "Hint: links[\"GDP\"] contains the path or name of the file." 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": 86, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": "# Type your code here\ndf_gdp = pd.read_csv(links['GDP'])" 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": "Use the method head() to display the first five rows of the GDP data, then take a screen-shot." 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": 87, 110 | "metadata": {}, 111 | "outputs": [ 112 | { 113 | "data": { 114 | "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
   date  level-current  level-chained  change-current  change-chained
0  1948          274.8         2020.0            -0.7            -0.6
1  1949          272.8         2008.9            10.0             8.7
2  1950          300.2         2184.0            15.7             8.0
3  1951          347.3         2360.0             5.9             4.1
4  1952          367.7         2456.1             6.0             4.7
\n
", 115 | "text/plain": " date level-current level-chained change-current change-chained\n0 1948 274.8 2020.0 -0.7 -0.6\n1 1949 272.8 2008.9 10.0 8.7\n2 1950 300.2 2184.0 15.7 8.0\n3 1951 347.3 2360.0 5.9 4.1\n4 1952 367.7 2456.1 6.0 4.7" 116 | }, 117 | "execution_count": 87, 118 | "metadata": {}, 119 | "output_type": "execute_result" 120 | } 121 | ], 122 | "source": "# Type your code here\ndf_gdp.head(5)" 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": "

Question 2: Create a dataframe that contains the unemployment data. Display the first five rows of the dataframe.

" 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "metadata": {}, 132 | "source": "Use the dictionary links and the function pd.read_csv to create a Pandas dataframes that contains the unemployment data." 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": 88, 137 | "metadata": {}, 138 | "outputs": [], 139 | "source": "# Type your code here\ndf_unemployment = pd.read_csv(links['unemployment'])" 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": "Use the method head() to display the first five rows of the GDP data, then take a screen-shot." 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": 89, 149 | "metadata": {}, 150 | "outputs": [ 151 | { 152 | "data": { 153 | "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
   date  unemployment
0  1948      3.750000
1  1949      6.050000
2  1950      5.208333
3  1951      3.283333
4  1952      3.025000
\n
", 154 | "text/plain": " date unemployment\n0 1948 3.750000\n1 1949 6.050000\n2 1950 5.208333\n3 1951 3.283333\n4 1952 3.025000" 155 | }, 156 | "execution_count": 89, 157 | "metadata": {}, 158 | "output_type": "execute_result" 159 | } 160 | ], 161 | "source": "# Type your code here\ndf_unemployment.head(5)" 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": "

Question 3: Display a dataframe where unemployment was greater than 8.5%. Take a screen-shot.
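One common way to do this in pandas is boolean indexing. The sketch below is illustrative only: the small `sample` frame is hypothetical, with values copied from rows displayed in this notebook, and it stands in for the full unemployment dataframe.

```python
import pandas as pd

# Small illustrative sample; values copied from rows displayed in this notebook.
sample = pd.DataFrame({
    "date": [1948, 1949, 1982, 2009],
    "unemployment": [3.75, 6.05, 9.708333, 9.283333],
})

# The comparison yields a boolean Series with one True/False per row.
mask = sample["unemployment"] > 8.5

# Indexing with the mask keeps only the rows where it is True.
high = sample[mask]
print(high)
```

The filtered frame keeps the original row labels, which is why the notebook's own result shows index values such as 34 and 61 rather than 0, 1, 2.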

" 167 | }, 168 | { 169 | "cell_type": "code", 170 | "execution_count": 90, 171 | "metadata": {}, 172 | "outputs": [ 173 | { 174 | "data": { 175 | "text/html": "
\n\n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
    date  unemployment
34  1982      9.708333
35  1983      9.600000
61  2009      9.283333
62  2010      9.608333
63  2011      8.933333
\n
", 176 | "text/plain": " date unemployment\n34 1982 9.708333\n35 1983 9.600000\n61 2009 9.283333\n62 2010 9.608333\n63 2011 8.933333" 177 | }, 178 | "execution_count": 90, 179 | "metadata": {}, 180 | "output_type": "execute_result" 181 | } 182 | ], 183 | "source": "# Type your code here\ndf_unemployment[df_unemployment['unemployment']>8.5]" 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": "

Question 4: Use the function make_dashboard to make a dashboard
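Because make_dashboard plots both series against a single x, it implicitly relies on the GDP and unemployment dataframes listing the same years in the same order. The sketch below shows one possible check before plotting, assuming both frames have a `date` column as in the outputs above. The helper name `dates_aligned` and the two small frames are hypothetical and are not part of the assignment.

```python
import pandas as pd


def dates_aligned(left, right, key="date"):
    """Return True if both frames hold identical key values in the same order."""
    left_keys = left[key].reset_index(drop=True)
    right_keys = right[key].reset_index(drop=True)
    return left_keys.equals(right_keys)


# Hypothetical miniature versions of the two dataframes.
gdp = pd.DataFrame({"date": [1948, 1949, 1950], "change-current": [-0.7, 10.0, 15.7]})
unemp = pd.DataFrame({"date": [1948, 1949, 1950], "unemployment": [3.75, 6.05, 5.208333]})

print(dates_aligned(gdp, unemp))
```

If the check failed, merging the two frames on `date` with `DataFrame.merge` before extracting the columns would be one option.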

" 189 | }, 190 | { 191 | "cell_type": "markdown", 192 | "metadata": {}, 193 | "source": "In this section, you will call the function make_dashboard , to produce a dashboard. We will use the convention of giving each variable the same name as the function parameter." 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {}, 198 | "source": "Create a new dataframe with the column 'date' called x from the dataframe that contains the GDP data." 199 | }, 200 | { 201 | "cell_type": "code", 202 | "execution_count": 91, 203 | "metadata": {}, 204 | "outputs": [], 205 | "source": "x = df_gdp['date'] # Create your dataframe with column date" 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": "Create a new dataframe with the column 'change-current' called gdp_change from the dataframe that contains the GDP data." 211 | }, 212 | { 213 | "cell_type": "code", 214 | "execution_count": 92, 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": "gdp_change = df_gdp['change-current'] # Create your dataframe with column change-current" 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": "Create a new dataframe with the column 'unemployment' called unemployment from the dataframe that contains the unemployment data." 
223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": 93, 227 | "metadata": {}, 228 | "outputs": [], 229 | "source": "unemployment = df_unemployment['unemployment'] # Create your dataframe with column unemployment" 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": "Give your dashboard a string title, and assign it to the variable title." 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": 94, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": "title = 'Dashboard for Unemployment and GDP Change' # Give your dashboard a string title" 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": "Finally, the function make_dashboard will output an .html file in your directory, just like a CSV file. The name of the file is \"index.html\" and it will be stored in the variable file_name." 247 | }, 248 | { 249 | "cell_type": "code", 250 | "execution_count": 95, 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": "file_name = \"index.html\"" 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": "Call the function make_dashboard to produce a dashboard. Assign the parameter values accordingly, then take a screenshot of the dashboard and submit it." 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 96, 263 | "metadata": {}, 264 | "outputs": [ 265 | { 266 | "data": { 267 | "text/html": "\n\n\n\n\n\n
\n" 268 | }, 269 | "metadata": {}, 270 | "output_type": "display_data" 271 | }, 272 | { 273 | "data": { 274 | "application/javascript": "(function(root) {\n function embed_document(root) {\n \n var docs_json = {\"7e9ee6d3-2eb7-4009-aecf-268a588f8538\":{\"roots\":{\"references\":[{\"attributes\":{\"below\":[{\"id\":\"1952\",\"type\":\"LinearAxis\"}],\"left\":[{\"id\":\"1957\",\"type\":\"LinearAxis\"}],\"renderers\":[{\"id\":\"1952\",\"type\":\"LinearAxis\"},{\"id\":\"1956\",\"type\":\"Grid\"},{\"id\":\"1957\",\"type\":\"LinearAxis\"},{\"id\":\"1961\",\"type\":\"Grid\"},{\"id\":\"1970\",\"type\":\"BoxAnnotation\"},{\"id\":\"1988\",\"type\":\"Legend\"},{\"id\":\"1980\",\"type\":\"GlyphRenderer\"},{\"id\":\"1993\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"1941\",\"type\":\"Title\"},\"toolbar\":{\"id\":\"1968\",\"type\":\"Toolbar\"},\"x_range\":{\"id\":\"1944\",\"type\":\"DataRange1d\"},\"x_scale\":{\"id\":\"1948\",\"type\":\"LinearScale\"},\"y_range\":{\"id\":\"1946\",\"type\":\"DataRange1d\"},\"y_scale\":{\"id\":\"1950\",\"type\":\"LinearScale\"}},\"id\":\"1942\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{},\"id\":\"1948\",\"type\":\"LinearScale\"},{\"attributes\":{},\"id\":\"2052\",\"type\":\"Selection\"},{\"attributes\":{\"plot\":null,\"text\":\"Dashboard for Unemployment and GDP 
Change\"},\"id\":\"1941\",\"type\":\"Title\"},{\"attributes\":{\"data_source\":{\"id\":\"1977\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"1978\",\"type\":\"Line\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"1979\",\"type\":\"Line\"},\"selection_glyph\":null,\"view\":{\"id\":\"1981\",\"type\":\"CDSView\"}},\"id\":\"1980\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"source\":{\"id\":\"1977\",\"type\":\"ColumnDataSource\"}},\"id\":\"1981\",\"type\":\"CDSView\"},{\"attributes\":{\"callback\":null,\"data\":{\"x\":[1948,1949,1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016],\"y\":{\"__ndarray__\":\"ZmZmZmZm5r8AAAAAAAAkQGZmZmZmZi9AmpmZmZmZF0AAAAAAAAAYQDMzMzMzM9M/zczMzMzMIUBmZmZmZmYWQAAAAAAAABZAAAAAAAAA+D/NzMzMzMwgQAAAAAAAABBAmpmZmZmZDUCamZmZmZkdQGZmZmZmZhZAmpmZmZmZHUDNzMzMzMwgQDMzMzMzMyNAzczMzMzMFkDNzMzMzMwiQGZmZmZmZiBAAAAAAAAAFkAAAAAAAAAhQJqZmZmZmSNAzczMzMzMJkDNzMzMzMwgQAAAAAAAACJAZmZmZmZmJkAzMzMzMzMmQAAAAAAAACpAZmZmZmZmJ0CamZmZmZkhQGZmZmZmZihAMzMzMzMzEUBmZmZmZmYhQDMzMzMzMyZAAAAAAAAAHkAAAAAAAAAWQAAAAAAAABhAmpmZmZmZH0DNzMzMzMweQM3MzMzMzBZAZmZmZmZmCkCamZmZmZkXQM3MzMzMzBRAMzMzMzMzGUAzMzMzMzMTQM3MzMzMzBZAzczMzMzMGEDNzMzMzMwWQDMzMzMzMxlAAAAAAAAAGkCamZmZmZkJQDMzMzMzMwtAMzMzMzMzE0BmZmZmZmYaQM3MzMzMzBpAAAAAAAAAGEBmZmZmZmYSQM3MzMzMzPw/zczMzMzM/L9mZmZmZmYOQJqZmZmZmQ1AzczMzMzMEEDNzMzMzMwMQJqZmZmZmRFAAAAAAAAAEECamZmZmZkFQM3MzMzMzBBA\",\"dtype\":\"float64\",\"shape\":[69]}},\"selected\":{\"id\":\"2001\",\"type\":\"Selection\"},\"selection_policy\":{\"id\":\"2002\",\"type\":\"UnionRenderers\"}},\"id\":\"1977\",\"type\":\"ColumnDataSource\"},{\"attributes\":{\"axis_label\":\"%\",\"formatter\":{\"id\":\"1985\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"19
42\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"1958\",\"type\":\"BasicTicker\"}},\"id\":\"1957\",\"type\":\"LinearAxis\"},{\"attributes\":{},\"id\":\"2053\",\"type\":\"UnionRenderers\"},{\"attributes\":{},\"id\":\"1985\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{},\"id\":\"1950\",\"type\":\"LinearScale\"},{\"attributes\":{},\"id\":\"1958\",\"type\":\"BasicTicker\"},{\"attributes\":{\"items\":[{\"id\":\"1989\",\"type\":\"LegendItem\"},{\"id\":\"2003\",\"type\":\"LegendItem\"}],\"plot\":{\"id\":\"1942\",\"subtype\":\"Figure\",\"type\":\"Plot\"}},\"id\":\"1988\",\"type\":\"Legend\"},{\"attributes\":{},\"id\":\"1953\",\"type\":\"BasicTicker\"},{\"attributes\":{},\"id\":\"2001\",\"type\":\"Selection\"},{\"attributes\":{\"axis_label\":\"year\",\"formatter\":{\"id\":\"1983\",\"type\":\"BasicTickFormatter\"},\"plot\":{\"id\":\"1942\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"1953\",\"type\":\"BasicTicker\"}},\"id\":\"1952\",\"type\":\"LinearAxis\"},{\"attributes\":{\"label\":{\"value\":\"% GDP 
change\"},\"renderers\":[{\"id\":\"1980\",\"type\":\"GlyphRenderer\"}]},\"id\":\"1989\",\"type\":\"LegendItem\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_inspect\":\"auto\",\"active_multi\":null,\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"1962\",\"type\":\"PanTool\"},{\"id\":\"1963\",\"type\":\"WheelZoomTool\"},{\"id\":\"1964\",\"type\":\"BoxZoomTool\"},{\"id\":\"1965\",\"type\":\"SaveTool\"},{\"id\":\"1966\",\"type\":\"ResetTool\"},{\"id\":\"1967\",\"type\":\"HelpTool\"}]},\"id\":\"1968\",\"type\":\"Toolbar\"},{\"attributes\":{},\"id\":\"1983\",\"type\":\"BasicTickFormatter\"},{\"attributes\":{},\"id\":\"1962\",\"type\":\"PanTool\"},{\"attributes\":{},\"id\":\"1963\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"callback\":null,\"data\":{\"x\":[1948,1949,1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016],\"y\":{\"__ndarray__\":\"AAAAAAAADkAzMzMzMzMYQFVVVVVV1RRARURERERECkA0MzMzMzMIQGdmZmZmZgdA3d3d3d1dFkB4d3d3d3cRQAAAAAAAgBBAMzMzMzMzEUDd3d3d3V0bQM3MzMzMzBVArKqqqqoqFkBERERERMQaQEVERERERBZAEhERERGRFkAiIiIiIqIUQImIiIiICBJAVVVVVVVVDkC5u7u7u7sOQHd3d3d3dwxA7+7u7u7uC0Dv7u7u7u4TQM3MzMzMzBdAZ2ZmZmZmFkDu7u7u7m4TQBIRERERkRZAMzMzMzPzIEDLzMzMzMweQDUzMzMzMxxAREREREREGEBnZmZmZmYXQDUzMzMzsxxAeHd3d3d3HkCqqqqqqmojQDMzMzMzMyNAiYiIiIgIHkBERERERMQcQAAAAAAAABxANTMzMzOzGEB3d3d3d/cVQIiIiIiICBVAd3d3d3d3FkBlZmZmZmYbQHh3d3d39x1AISIiIiKiG0BnZmZmZmYYQN/d3d3dXRZAIyIiIiKiFUBERERERMQTQAAAAAAAABJA393d3d3dEEC5u7u7u7sPQHd3d3d39xJAIyIiIiIiF0B4d3d3d/cXQKyqqqqqKhZAVVVVVVVVFEDv7u7u7m4SQHh3d3d3dxJAMzMzMzMzF0AREREREZEiQHd3d3d3NyNA3t3d3d3dIUBnZmZmZiYgQO/u7u7ubh1AIyIiIiKiGECamZmZmRkVQAAAAAAAgBNA\",\"dtype\":\"float64\",\"shape\":[69]}},\"selected\":{\"id\":\"2052\",\"type\":\"Selection\"},\"
selection_policy\":{\"id\":\"2053\",\"type\":\"UnionRenderers\"}},\"id\":\"1990\",\"type\":\"ColumnDataSource\"},{\"attributes\":{\"overlay\":{\"id\":\"1970\",\"type\":\"BoxAnnotation\"}},\"id\":\"1964\",\"type\":\"BoxZoomTool\"},{\"attributes\":{\"line_color\":\"#1f77b4\",\"line_width\":4,\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"1991\",\"type\":\"Line\"},{\"attributes\":{},\"id\":\"1965\",\"type\":\"SaveTool\"},{\"attributes\":{\"line_alpha\":0.1,\"line_color\":\"#1f77b4\",\"line_width\":4,\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"1992\",\"type\":\"Line\"},{\"attributes\":{},\"id\":\"1966\",\"type\":\"ResetTool\"},{\"attributes\":{\"data_source\":{\"id\":\"1990\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"1991\",\"type\":\"Line\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"1992\",\"type\":\"Line\"},\"selection_glyph\":null,\"view\":{\"id\":\"1994\",\"type\":\"CDSView\"}},\"id\":\"1993\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"1942\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"1958\",\"type\":\"BasicTicker\"}},\"id\":\"1961\",\"type\":\"Grid\"},{\"attributes\":{\"callback\":null},\"id\":\"1944\",\"type\":\"DataRange1d\"},{\"attributes\":{},\"id\":\"1967\",\"type\":\"HelpTool\"},{\"attributes\":{\"plot\":{\"id\":\"1942\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"1953\",\"type\":\"BasicTicker\"}},\"id\":\"1956\",\"type\":\"Grid\"},{\"attributes\":{\"source\":{\"id\":\"1990\",\"type\":\"ColumnDataSource\"}},\"id\":\"1994\",\"type\":\"CDSView\"},{\"attributes\":{},\"id\":\"2002\",\"type\":\"UnionRenderers\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css
\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"1970\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"label\":{\"value\":\"% unemployed\"},\"renderers\":[{\"id\":\"1993\",\"type\":\"GlyphRenderer\"}]},\"id\":\"2003\",\"type\":\"LegendItem\"},{\"attributes\":{\"line_color\":\"firebrick\",\"line_width\":4,\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"1978\",\"type\":\"Line\"},{\"attributes\":{\"line_alpha\":0.1,\"line_color\":\"#1f77b4\",\"line_width\":4,\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y\"}},\"id\":\"1979\",\"type\":\"Line\"},{\"attributes\":{\"callback\":null},\"id\":\"1946\",\"type\":\"DataRange1d\"}],\"root_ids\":[\"1942\"]},\"title\":\"Bokeh Application\",\"version\":\"1.0.4\"}};\n var render_items = [{\"docid\":\"7e9ee6d3-2eb7-4009-aecf-268a588f8538\",\"roots\":{\"1942\":\"e7891f1e-642b-4997-aa2f-10cb71655477\"}}];\n root.Bokeh.embed.embed_items_notebook(docs_json, render_items);\n\n }\n if (root.Bokeh !== undefined) {\n embed_document(root);\n } else {\n var attempts = 0;\n var timer = setInterval(function(root) {\n if (root.Bokeh !== undefined) {\n embed_document(root);\n clearInterval(timer);\n }\n attempts++;\n if (attempts > 100) {\n console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n clearInterval(timer);\n }\n }, 10, root)\n }\n})(window);", 275 | "application/vnd.bokehjs_exec.v0+json": "" 276 | }, 277 | "metadata": { 278 | "application/vnd.bokehjs_exec.v0+json": { 279 | "id": "1942" 280 | } 281 | }, 282 | "output_type": "display_data" 283 | } 284 | ], 285 | "source": "# Fill up the parameters in the following function:\nmake_dashboard(x=x, gdp_change=gdp_change, unemployment=unemployment, title=title, file_name=file_name)" 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": "

(Optional, not marked) Save the dashboard on IBM Cloud and display it

" 291 | }, 292 | { 293 | "cell_type": "markdown", 294 | "metadata": {}, 295 | "source": "From the tutorial PROVISIONING AN OBJECT STORAGE INSTANCE ON IBM CLOUD copy the JSON object containing the credentials you created. You\u2019ll want to store everything you see in a credentials variable like the one below (obviously, replace the placeholder values with your own). Take special note of your access_key_id and secret_access_key. Do not delete # @hidden_cell as this will not allow people to see your credentials when you share your notebook. " 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": {}, 300 | "source": "\ncredentials = {
\n   \"apikey\": \"your-api-key\",
\n   \"cos_hmac_keys\": {
\n   \"access_key_id\": \"your-access-key-here\",
\n   \"secret_access_key\": \"your-secret-access-key-here\"
\n   },
\n
\n\n  \"endpoints\": \"your-endpoints\",
\n   \"iam_apikey_description\": \"your-iam_apikey_description\",
\n   \"iam_apikey_name\": \"your-iam_apikey_name\",
\n   \"iam_role_crn\": \"your-iam_role_crn\",
\n   \"iam_serviceid_crn\": \"your-iam_serviceid_crn\",
\n  \"resource_instance_id\": \"your-resource_instance_id\"
\n}\n
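The later cells read credentials['cos_hmac_keys'] directly, so a credentials object without the HMAC block only fails when the boto3 resource is created. A minimal sketch of an up-front check, assuming the dictionary layout of the placeholder; the helper name missing_hmac_keys is hypothetical and is not part of the lab or of boto3:

```python
# Hypothetical helper (not part of the lab or boto3): report which HMAC
# fields are absent or empty in a credentials dictionary shaped like the
# placeholder in this cell.
REQUIRED_HMAC_FIELDS = ('access_key_id', 'secret_access_key')


def missing_hmac_keys(credentials):
    '''Return the required HMAC fields that are missing or empty.'''
    hmac_keys = credentials.get('cos_hmac_keys') or {}
    return [field for field in REQUIRED_HMAC_FIELDS if not hmac_keys.get(field)]


# Placeholder values only; real keys should stay in the hidden cell.
example = {
    'apikey': 'your-api-key',
    'cos_hmac_keys': {
        'access_key_id': 'your-access-key-here',
        'secret_access_key': 'your-secret-access-key-here',
    },
}

print(missing_hmac_keys(example))          # nothing is missing here
print(missing_hmac_keys({'apikey': 'x'}))  # the whole HMAC block is absent
```

Running a check like this before building the resource turns a late KeyError into an explicit list of what still has to be copied from the tutorial.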
" 301 | }, 302 | { 303 | "cell_type": "code", 304 | "execution_count": 97, 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": "# The code was removed by Watson Studio for sharing." 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": "You will need the endpoint make sure the setting are the same as PROVISIONING AN OBJECT STORAGE INSTANCE ON IBM CLOUD assign the name of your bucket to the variable bucket_name " 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 98, 317 | "metadata": {}, 318 | "outputs": [], 319 | "source": "endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'" 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "metadata": {}, 324 | "source": "From the tutorial PROVISIONING AN OBJECT STORAGE INSTANCE ON IBM CLOUD assign the name of your bucket to the variable bucket_name " 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": 99, 329 | "metadata": {}, 330 | "outputs": [], 331 | "source": "bucket_name = 'python-for-ds-and-ai-bucket' # Type your bucket name on IBM Cloud" 332 | }, 333 | { 334 | "cell_type": "markdown", 335 | "metadata": {}, 336 | "source": "We can access IBM Cloud Object Storage with Python useing the boto3 library, which we\u2019ll import below:" 337 | }, 338 | { 339 | "cell_type": "code", 340 | "execution_count": 100, 341 | "metadata": {}, 342 | "outputs": [], 343 | "source": "import boto3" 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "metadata": {}, 348 | "source": "We can interact with IBM Cloud Object Storage through a boto3 resource object." 
349 | }, 350 | { 351 | "cell_type": "code", 352 | "execution_count": 101, 353 | "metadata": {}, 354 | "outputs": [], 355 | "source": "resource = boto3.resource(\n 's3',\n aws_access_key_id = credentials[\"cos_hmac_keys\"]['access_key_id'],\n aws_secret_access_key = credentials[\"cos_hmac_keys\"][\"secret_access_key\"],\n endpoint_url = endpoint,\n)" 356 | }, 357 | { 358 | "cell_type": "markdown", 359 | "metadata": {}, 360 | "source": "We are going to use open to create a file object. To get the path of the file, concatenate the directory stored in the variable directory with the name of the file stored in the variable file_name using the + operator, and assign the result to the variable html_path. We will use the function getcwd() to find the current working directory." 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": 102, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": "import os\n\ndirectory = os.getcwd()\nhtml_path = directory + \"/\" + file_name" 368 | }, 369 | { 370 | "cell_type": "markdown", 371 | "metadata": {}, 372 | "source": "Now you must read the HTML file. Use f = open(html_path, mode) to create a file object and assign it to the variable f. The parameter file should be the variable html_path, and the mode should be \"r\" for read. " 373 | }, 374 | { 375 | "cell_type": "code", 376 | "execution_count": 103, 377 | "metadata": {}, 378 | "outputs": [], 379 | "source": "# Type your code here\nf = open(file=html_path, mode='r')" 380 | }, 381 | { 382 | "cell_type": "markdown", 383 | "metadata": {}, 384 | "source": "To upload the HTML file to the bucket we will use the method put_object. Set the parameter name to the name of the bucket, the parameter Key to the name of the HTML file, and the parameter Body to f.read()." 
385 | }, 386 | { 387 | "cell_type": "code", 388 | "execution_count": 104, 389 | "metadata": {}, 390 | "outputs": [ 391 | { 392 | "data": { 393 | "text/plain": "s3.Object(bucket_name='python-for-ds-and-ai-bucket', key='index.html')" 394 | }, 395 | "execution_count": 104, 396 | "metadata": {}, 397 | "output_type": "execute_result" 398 | } 399 | ], 400 | "source": "# Fill up the parameters in the following function:\nresource.Bucket(name=bucket_name).put_object(Key=file_name, Body=f.read())" 401 | }, 402 | { 403 | "cell_type": "markdown", 404 | "metadata": {}, 405 | "source": "In the dictionary Params, provide the bucket name as the value for the key 'Bucket' and the name of the HTML file as the value for the key 'Key'. Both values should be strings." 406 | }, 407 | { 408 | "cell_type": "code", 409 | "execution_count": 105, 410 | "metadata": {}, 411 | "outputs": [], 412 | "source": "# Fill in the value for each key\nParams = {'Bucket': bucket_name,'Key': file_name}" 413 | }, 414 | { 415 | "cell_type": "markdown", 416 | "metadata": {}, 417 | "source": "The following lines of code will generate a URL to share your dashboard. The URL only lasts seven days, but don't worry: you will get full marks if the URL is visible in your notebook. 
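The seven-day lifetime comes from the ExpiresIn argument, which is given in seconds. A small sketch of that arithmetic, and of reading the expiry back out of a signed URL, using only the standard library. It assumes the URL carries an Expires query parameter holding a Unix timestamp, as the URL printed in this notebook does; the function name and the example URL are illustrative and not part of the lab:

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse

# Same value as the expression 7*24*60**2 used in the notebook.
SEVEN_DAYS_IN_SECONDS = 7 * 24 * 60 ** 2


def expiry_from_url(url):
    '''Return the UTC expiry of a signed URL that has an Expires parameter.'''
    query = parse_qs(urlparse(url).query)
    return datetime.fromtimestamp(int(query['Expires'][0]), tz=timezone.utc)


# Made-up URL in the same shape as the notebook output (no real keys).
example_url = 'https://example.invalid/my-bucket/index.html?Signature=abc&Expires=1596791728'

print(SEVEN_DAYS_IN_SECONDS)
print(expiry_from_url(example_url).isoformat())
```

Checking the expiry this way is a quick test of whether a shared link is still valid before submitting it.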
" 418 | }, 419 | { 420 | "cell_type": "code", 421 | "execution_count": 106, 422 | "metadata": {}, 423 | "outputs": [ 424 | { 425 | "name": "stdout", 426 | "output_type": "stream", 427 | "text": "https://s3-api.us-geo.objectstorage.softlayer.net/python-for-ds-and-ai-bucket/index.html?AWSAccessKeyId=58a4eaefd7364ccbba025153bff5738b&Signature=ZnxTgAFOI3kUNeJhfHTRejpMFy8%3D&Expires=1596791728\n" 428 | } 429 | ], 430 | "source": "import sys\ntime = 7*24*60**2\nclient = boto3.client(\n 's3',\n aws_access_key_id = credentials[\"cos_hmac_keys\"]['access_key_id'],\n aws_secret_access_key = credentials[\"cos_hmac_keys\"][\"secret_access_key\"],\n endpoint_url=endpoint,\n\n)\nurl = client.generate_presigned_url('get_object',Params=Params,ExpiresIn=time)\nprint(url)" 431 | }, 432 | { 433 | "cell_type": "markdown", 434 | "metadata": {}, 435 | "source": "

How to submit

" 436 | }, 437 | { 438 | "cell_type": "markdown", 439 | "metadata": {}, 440 | "source": "

Once you complete your notebook you will have to share it to be marked. Select the icon on the top right, marked in red in the image below. A dialogue box should open; select the option all content excluding sensitive code cells.

\n\n

\"share

\n

\n\n

You can then share the notebook via a URL by scrolling down as shown in the following image:

\n

\"share

" 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": {}, 445 | "source": "
\n

Copyright © 2019 IBM Developer Skills Network. This notebook and its source code are released under the terms of the MIT License.

" 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": "

About the Authors:

\n\nJoseph Santarcangelo has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.\n

\nOther contributors: Yi leng Yao, Mavis Zhou \n

" 451 | }, 452 | { 453 | "cell_type": "markdown", 454 | "metadata": {}, 455 | "source": "

References :

" 456 | }, 457 | { 458 | "cell_type": "markdown", 459 | "metadata": {}, 460 | "source": "\n" 461 | } 462 | ], 463 | "metadata": { 464 | "kernelspec": { 465 | "display_name": "Python 3.6", 466 | "language": "python", 467 | "name": "python3" 468 | }, 469 | "language_info": { 470 | "codemirror_mode": { 471 | "name": "ipython", 472 | "version": 3 473 | }, 474 | "file_extension": ".py", 475 | "mimetype": "text/x-python", 476 | "name": "python", 477 | "nbconvert_exporter": "python", 478 | "pygments_lexer": "ipython3", 479 | "version": "3.6.9" 480 | } 481 | }, 482 | "nbformat": 4, 483 | "nbformat_minor": 2 484 | } -------------------------------------------------------------------------------- /First Notebook.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "collapsed": true 7 | }, 8 | "source": "# My Jupyter Notebook on IBM Data Science Experience" 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": "**Raunak Bhutoria** \nChemical Engineering Student" 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": "_I am interested in data science because I want to develop data analysis and data visualization skills as well as interpret data to gain various insights and predict future decisions_" 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": "The code below is supposed to print \"Hello World\"." 
24 | }, 25 | { 26 | "cell_type": "code", 27 | "execution_count": 3, 28 | "metadata": {}, 29 | "outputs": [ 30 | { 31 | "name": "stdout", 32 | "output_type": "stream", 33 | "text": "Hello World\n" 34 | } 35 | ], 36 | "source": "print(\"Hello World\")" 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "metadata": {}, 41 | "source": "**Horizontal Rule:**\n\nHi everyone!\n\n---\n\nHope you all are doing well!\n\n***\n\nAll the best for the course!\n\n**Hyperlink:**\n\nhttps://en.wikipedia.org/wiki/Data_science\n\n**Image:**\n\n![alt text](https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png)" 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": null, 46 | "metadata": {}, 47 | "outputs": [], 48 | "source": "" 49 | } 50 | ], 51 | "metadata": { 52 | "kernelspec": { 53 | "display_name": "Python 3.6 with Spark", 54 | "language": "python3", 55 | "name": "python36" 56 | }, 57 | "language_info": { 58 | "codemirror_mode": { 59 | "name": "ipython", 60 | "version": 3 61 | }, 62 | "file_extension": ".py", 63 | "mimetype": "text/x-python", 64 | "name": "python", 65 | "nbconvert_exporter": "python", 66 | "pygments_lexer": "ipython3", 67 | "version": "3.6.8" 68 | } 69 | }, 70 | "nbformat": 4, 71 | "nbformat_minor": 1 72 | } -------------------------------------------------------------------------------- /House Sales in King County, USA Project.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": " \n\n

Data Analysis with Python

" 7 | }, 8 | { 9 | "cell_type": "markdown", 10 | "metadata": {}, 11 | "source": "# House Sales in King County, USA" 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": "This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015." 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": "id : A notation for a house\n\n date: Date house was sold\n\n\nprice: Price is prediction target\n\n\nbedrooms: Number of bedrooms\n\n\nbathrooms: Number of bathrooms\n\nsqft_living: Square footage of the home\n\nsqft_lot: Square footage of the lot\n\n\nfloors :Total floors (levels) in house\n\n\nwaterfront :House which has a view to a waterfront\n\n\nview: Has been viewed\n\n\ncondition :How good the condition is overall\n\ngrade: overall grade given to the housing unit, based on King County grading system\n\n\nsqft_above : Square footage of house apart from basement\n\n\nsqft_basement: Square footage of the basement\n\nyr_built : Built Year\n\n\nyr_renovated : Year when house was renovated\n\nzipcode: Zip code\n\n\nlat: Latitude coordinate\n\nlong: Longitude coordinate\n\nsqft_living15 : Living room area in 2015(implies-- some renovations) This might or might not have affected the lotsize area\n\n\nsqft_lot15 : LotSize area in 2015(implies-- some renovations)" 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": "You will require the following libraries: " 27 | }, 28 | { 29 | "cell_type": "code", 30 | "execution_count": 7, 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": "import pandas as pd\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport seaborn as sns\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import StandardScaler,PolynomialFeatures\nfrom sklearn.linear_model import LinearRegression\n%matplotlib inline" 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "metadata": {}, 38 | 
"source": "# Module 1: Importing Data Sets " 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": " Load the csv: " 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 8, 48 | "metadata": { 49 | "jupyter": { 50 | "outputs_hidden": false 51 | } 52 | }, 53 | "outputs": [], 54 | "source": "file_name='https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DA0101EN/coursera/project/kc_house_data_NaN.csv'\ndf=pd.read_csv(file_name)" 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": "\nWe use the method head to display the first 5 columns of the dataframe." 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": 9, 64 | "metadata": {}, 65 | "outputs": [ 66 | { 67 | "data": { 68 | "text/html": "
|   | Unnamed: 0 | id | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | ... | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 7129300520 | 20141013T000000 | 221900.0 | 3.0 | 1.00 | 1180 | 5650 | 1.0 | 0 | ... | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650 |
| 1 | 1 | 6414100192 | 20141209T000000 | 538000.0 | 3.0 | 2.25 | 2570 | 7242 | 2.0 | 0 | ... | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.7210 | -122.319 | 1690 | 7639 |
| 2 | 2 | 5631500400 | 20150225T000000 | 180000.0 | 2.0 | 1.00 | 770 | 10000 | 1.0 | 0 | ... | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062 |
| 3 | 3 | 2487200875 | 20141209T000000 | 604000.0 | 4.0 | 3.00 | 1960 | 5000 | 1.0 | 0 | ... | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000 |
| 4 | 4 | 1954400510 | 20150218T000000 | 510000.0 | 3.0 | 2.00 | 1680 | 8080 | 1.0 | 0 | ... | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503 |

5 rows \u00d7 22 columns
", 69 | "text/plain": " Unnamed: 0 id date price bedrooms bathrooms \\\n0 0 7129300520 20141013T000000 221900.0 3.0 1.00 \n1 1 6414100192 20141209T000000 538000.0 3.0 2.25 \n2 2 5631500400 20150225T000000 180000.0 2.0 1.00 \n3 3 2487200875 20141209T000000 604000.0 4.0 3.00 \n4 4 1954400510 20150218T000000 510000.0 3.0 2.00 \n\n sqft_living sqft_lot floors waterfront ... grade sqft_above \\\n0 1180 5650 1.0 0 ... 7 1180 \n1 2570 7242 2.0 0 ... 7 2170 \n2 770 10000 1.0 0 ... 6 770 \n3 1960 5000 1.0 0 ... 7 1050 \n4 1680 8080 1.0 0 ... 8 1680 \n\n sqft_basement yr_built yr_renovated zipcode lat long \\\n0 0 1955 0 98178 47.5112 -122.257 \n1 400 1951 1991 98125 47.7210 -122.319 \n2 0 1933 0 98028 47.7379 -122.233 \n3 910 1965 0 98136 47.5208 -122.393 \n4 0 1987 0 98074 47.6168 -122.045 \n\n sqft_living15 sqft_lot15 \n0 1340 5650 \n1 1690 7639 \n2 2720 8062 \n3 1360 5000 \n4 1800 7503 \n\n[5 rows x 22 columns]" 70 | }, 71 | "execution_count": 9, 72 | "metadata": {}, 73 | "output_type": "execute_result" 74 | } 75 | ], 76 | "source": "df.head()" 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": "### Question 1 \nDisplay the data types of each column using the attribute dtype, then take a screenshot and submit it, include your code in the image. 
" 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 10, 86 | "metadata": { 87 | "jupyter": { 88 | "outputs_hidden": false 89 | } 90 | }, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "text/plain": "Unnamed: 0 int64\nid int64\ndate object\nprice float64\nbedrooms float64\nbathrooms float64\nsqft_living int64\nsqft_lot int64\nfloors float64\nwaterfront int64\nview int64\ncondition int64\ngrade int64\nsqft_above int64\nsqft_basement int64\nyr_built int64\nyr_renovated int64\nzipcode int64\nlat float64\nlong float64\nsqft_living15 int64\nsqft_lot15 int64\ndtype: object" 95 | }, 96 | "execution_count": 10, 97 | "metadata": {}, 98 | "output_type": "execute_result" 99 | } 100 | ], 101 | "source": "df.dtypes" 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "metadata": {}, 106 | "source": "We use the method describe to obtain a statistical summary of the dataframe." 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": 11, 111 | "metadata": { 112 | "jupyter": { 113 | "outputs_hidden": false 114 | } 115 | }, 116 | "outputs": [ 117 | { 118 | "data": { 119 | "text/html": "
|   | Unnamed: 0 | id | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | ... | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 21613.00000 | 2.161300e+04 | 2.161300e+04 | 21600.000000 | 21603.000000 | 21613.000000 | 2.161300e+04 | 21613.000000 | 21613.000000 | 21613.000000 | ... | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 |
| mean | 10806.00000 | 4.580302e+09 | 5.400881e+05 | 3.372870 | 2.115736 | 2079.899736 | 1.510697e+04 | 1.494309 | 0.007542 | 0.234303 | ... | 7.656873 | 1788.390691 | 291.509045 | 1971.005136 | 84.402258 | 98077.939805 | 47.560053 | -122.213896 | 1986.552492 | 12768.455652 |
| std | 6239.28002 | 2.876566e+09 | 3.671272e+05 | 0.926657 | 0.768996 | 918.440897 | 4.142051e+04 | 0.539989 | 0.086517 | 0.766318 | ... | 1.175459 | 828.090978 | 442.575043 | 29.373411 | 401.679240 | 53.505026 | 0.138564 | 0.140828 | 685.391304 | 27304.179631 |
| min | 0.00000 | 1.000102e+06 | 7.500000e+04 | 1.000000 | 0.500000 | 290.000000 | 5.200000e+02 | 1.000000 | 0.000000 | 0.000000 | ... | 1.000000 | 290.000000 | 0.000000 | 1900.000000 | 0.000000 | 98001.000000 | 47.155900 | -122.519000 | 399.000000 | 651.000000 |
| 25% | 5403.00000 | 2.123049e+09 | 3.219500e+05 | 3.000000 | 1.750000 | 1427.000000 | 5.040000e+03 | 1.000000 | 0.000000 | 0.000000 | ... | 7.000000 | 1190.000000 | 0.000000 | 1951.000000 | 0.000000 | 98033.000000 | 47.471000 | -122.328000 | 1490.000000 | 5100.000000 |
| 50% | 10806.00000 | 3.904930e+09 | 4.500000e+05 | 3.000000 | 2.250000 | 1910.000000 | 7.618000e+03 | 1.500000 | 0.000000 | 0.000000 | ... | 7.000000 | 1560.000000 | 0.000000 | 1975.000000 | 0.000000 | 98065.000000 | 47.571800 | -122.230000 | 1840.000000 | 7620.000000 |
| 75% | 16209.00000 | 7.308900e+09 | 6.450000e+05 | 4.000000 | 2.500000 | 2550.000000 | 1.068800e+04 | 2.000000 | 0.000000 | 0.000000 | ... | 8.000000 | 2210.000000 | 560.000000 | 1997.000000 | 0.000000 | 98118.000000 | 47.678000 | -122.125000 | 2360.000000 | 10083.000000 |
| max | 21612.00000 | 9.900000e+09 | 7.700000e+06 | 33.000000 | 8.000000 | 13540.000000 | 1.651359e+06 | 3.500000 | 1.000000 | 4.000000 | ... | 13.000000 | 9410.000000 | 4820.000000 | 2015.000000 | 2015.000000 | 98199.000000 | 47.777600 | -121.315000 | 6210.000000 | 871200.000000 |

8 rows \u00d7 21 columns
", 120 | "text/plain": " Unnamed: 0 id price bedrooms bathrooms \\\ncount 21613.00000 2.161300e+04 2.161300e+04 21600.000000 21603.000000 \nmean 10806.00000 4.580302e+09 5.400881e+05 3.372870 2.115736 \nstd 6239.28002 2.876566e+09 3.671272e+05 0.926657 0.768996 \nmin 0.00000 1.000102e+06 7.500000e+04 1.000000 0.500000 \n25% 5403.00000 2.123049e+09 3.219500e+05 3.000000 1.750000 \n50% 10806.00000 3.904930e+09 4.500000e+05 3.000000 2.250000 \n75% 16209.00000 7.308900e+09 6.450000e+05 4.000000 2.500000 \nmax 21612.00000 9.900000e+09 7.700000e+06 33.000000 8.000000 \n\n sqft_living sqft_lot floors waterfront view \\\ncount 21613.000000 2.161300e+04 21613.000000 21613.000000 21613.000000 \nmean 2079.899736 1.510697e+04 1.494309 0.007542 0.234303 \nstd 918.440897 4.142051e+04 0.539989 0.086517 0.766318 \nmin 290.000000 5.200000e+02 1.000000 0.000000 0.000000 \n25% 1427.000000 5.040000e+03 1.000000 0.000000 0.000000 \n50% 1910.000000 7.618000e+03 1.500000 0.000000 0.000000 \n75% 2550.000000 1.068800e+04 2.000000 0.000000 0.000000 \nmax 13540.000000 1.651359e+06 3.500000 1.000000 4.000000 \n\n ... grade sqft_above sqft_basement yr_built \\\ncount ... 21613.000000 21613.000000 21613.000000 21613.000000 \nmean ... 7.656873 1788.390691 291.509045 1971.005136 \nstd ... 1.175459 828.090978 442.575043 29.373411 \nmin ... 1.000000 290.000000 0.000000 1900.000000 \n25% ... 7.000000 1190.000000 0.000000 1951.000000 \n50% ... 7.000000 1560.000000 0.000000 1975.000000 \n75% ... 8.000000 2210.000000 560.000000 1997.000000 \nmax ... 
13.000000 9410.000000 4820.000000 2015.000000 \n\n yr_renovated zipcode lat long sqft_living15 \\\ncount 21613.000000 21613.000000 21613.000000 21613.000000 21613.000000 \nmean 84.402258 98077.939805 47.560053 -122.213896 1986.552492 \nstd 401.679240 53.505026 0.138564 0.140828 685.391304 \nmin 0.000000 98001.000000 47.155900 -122.519000 399.000000 \n25% 0.000000 98033.000000 47.471000 -122.328000 1490.000000 \n50% 0.000000 98065.000000 47.571800 -122.230000 1840.000000 \n75% 0.000000 98118.000000 47.678000 -122.125000 2360.000000 \nmax 2015.000000 98199.000000 47.777600 -121.315000 6210.000000 \n\n sqft_lot15 \ncount 21613.000000 \nmean 12768.455652 \nstd 27304.179631 \nmin 651.000000 \n25% 5100.000000 \n50% 7620.000000 \n75% 10083.000000 \nmax 871200.000000 \n\n[8 rows x 21 columns]" 121 | }, 122 | "execution_count": 11, 123 | "metadata": {}, 124 | "output_type": "execute_result" 125 | } 126 | ], 127 | "source": "df.describe()" 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "metadata": {}, 132 | "source": "# Module 2: Data Wrangling" 133 | }, 134 | { 135 | "cell_type": "markdown", 136 | "metadata": {}, 137 | "source": "### Question 2 \nDrop the columns \"id\" and \"Unnamed: 0\" from axis 1 using the method drop(), then use the method describe() to obtain a statistical summary of the data. Take a screenshot and submit it, make sure the inplace parameter is set to True" 138 | }, 139 | { 140 | "cell_type": "code", 141 | "execution_count": 12, 142 | "metadata": { 143 | "jupyter": { 144 | "outputs_hidden": false 145 | } 146 | }, 147 | "outputs": [ 148 | { 149 | "data": { 150 | "text/html": "
|   | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | condition | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 2.161300e+04 | 21600.000000 | 21603.000000 | 21613.000000 | 2.161300e+04 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 | 21613.000000 |
| mean | 5.400881e+05 | 3.372870 | 2.115736 | 2079.899736 | 1.510697e+04 | 1.494309 | 0.007542 | 0.234303 | 3.409430 | 7.656873 | 1788.390691 | 291.509045 | 1971.005136 | 84.402258 | 98077.939805 | 47.560053 | -122.213896 | 1986.552492 | 12768.455652 |
| std | 3.671272e+05 | 0.926657 | 0.768996 | 918.440897 | 4.142051e+04 | 0.539989 | 0.086517 | 0.766318 | 0.650743 | 1.175459 | 828.090978 | 442.575043 | 29.373411 | 401.679240 | 53.505026 | 0.138564 | 0.140828 | 685.391304 | 27304.179631 |
| min | 7.500000e+04 | 1.000000 | 0.500000 | 290.000000 | 5.200000e+02 | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 1.000000 | 290.000000 | 0.000000 | 1900.000000 | 0.000000 | 98001.000000 | 47.155900 | -122.519000 | 399.000000 | 651.000000 |
| 25% | 3.219500e+05 | 3.000000 | 1.750000 | 1427.000000 | 5.040000e+03 | 1.000000 | 0.000000 | 0.000000 | 3.000000 | 7.000000 | 1190.000000 | 0.000000 | 1951.000000 | 0.000000 | 98033.000000 | 47.471000 | -122.328000 | 1490.000000 | 5100.000000 |
| 50% | 4.500000e+05 | 3.000000 | 2.250000 | 1910.000000 | 7.618000e+03 | 1.500000 | 0.000000 | 0.000000 | 3.000000 | 7.000000 | 1560.000000 | 0.000000 | 1975.000000 | 0.000000 | 98065.000000 | 47.571800 | -122.230000 | 1840.000000 | 7620.000000 |
| 75% | 6.450000e+05 | 4.000000 | 2.500000 | 2550.000000 | 1.068800e+04 | 2.000000 | 0.000000 | 0.000000 | 4.000000 | 8.000000 | 2210.000000 | 560.000000 | 1997.000000 | 0.000000 | 98118.000000 | 47.678000 | -122.125000 | 2360.000000 | 10083.000000 |
| max | 7.700000e+06 | 33.000000 | 8.000000 | 13540.000000 | 1.651359e+06 | 3.500000 | 1.000000 | 4.000000 | 5.000000 | 13.000000 | 9410.000000 | 4820.000000 | 2015.000000 | 2015.000000 | 98199.000000 | 47.777600 | -121.315000 | 6210.000000 | 871200.000000 |
", 151 | "text/plain": " price bedrooms bathrooms sqft_living sqft_lot \\\ncount 2.161300e+04 21600.000000 21603.000000 21613.000000 2.161300e+04 \nmean 5.400881e+05 3.372870 2.115736 2079.899736 1.510697e+04 \nstd 3.671272e+05 0.926657 0.768996 918.440897 4.142051e+04 \nmin 7.500000e+04 1.000000 0.500000 290.000000 5.200000e+02 \n25% 3.219500e+05 3.000000 1.750000 1427.000000 5.040000e+03 \n50% 4.500000e+05 3.000000 2.250000 1910.000000 7.618000e+03 \n75% 6.450000e+05 4.000000 2.500000 2550.000000 1.068800e+04 \nmax 7.700000e+06 33.000000 8.000000 13540.000000 1.651359e+06 \n\n floors waterfront view condition grade \\\ncount 21613.000000 21613.000000 21613.000000 21613.000000 21613.000000 \nmean 1.494309 0.007542 0.234303 3.409430 7.656873 \nstd 0.539989 0.086517 0.766318 0.650743 1.175459 \nmin 1.000000 0.000000 0.000000 1.000000 1.000000 \n25% 1.000000 0.000000 0.000000 3.000000 7.000000 \n50% 1.500000 0.000000 0.000000 3.000000 7.000000 \n75% 2.000000 0.000000 0.000000 4.000000 8.000000 \nmax 3.500000 1.000000 4.000000 5.000000 13.000000 \n\n sqft_above sqft_basement yr_built yr_renovated zipcode \\\ncount 21613.000000 21613.000000 21613.000000 21613.000000 21613.000000 \nmean 1788.390691 291.509045 1971.005136 84.402258 98077.939805 \nstd 828.090978 442.575043 29.373411 401.679240 53.505026 \nmin 290.000000 0.000000 1900.000000 0.000000 98001.000000 \n25% 1190.000000 0.000000 1951.000000 0.000000 98033.000000 \n50% 1560.000000 0.000000 1975.000000 0.000000 98065.000000 \n75% 2210.000000 560.000000 1997.000000 0.000000 98118.000000 \nmax 9410.000000 4820.000000 2015.000000 2015.000000 98199.000000 \n\n lat long sqft_living15 sqft_lot15 \ncount 21613.000000 21613.000000 21613.000000 21613.000000 \nmean 47.560053 -122.213896 1986.552492 12768.455652 \nstd 0.138564 0.140828 685.391304 27304.179631 \nmin 47.155900 -122.519000 399.000000 651.000000 \n25% 47.471000 -122.328000 1490.000000 5100.000000 \n50% 47.571800 -122.230000 1840.000000 7620.000000 \n75% 
47.678000 -122.125000 2360.000000 10083.000000 \nmax 47.777600 -121.315000 6210.000000 871200.000000 " 152 | }, 153 | "execution_count": 12, 154 | "metadata": {}, 155 | "output_type": "execute_result" 156 | } 157 | ], 158 | "source": "df.drop(['id', 'Unnamed: 0'], axis=1, inplace=True)\ndf.describe()" 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "metadata": {}, 163 | "source": "We can see we have missing values for the columns bedrooms and bathrooms " 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": 13, 168 | "metadata": { 169 | "jupyter": { 170 | "outputs_hidden": false 171 | } 172 | }, 173 | "outputs": [ 174 | { 175 | "name": "stdout", 176 | "output_type": "stream", 177 | "text": "number of NaN values for the column bedrooms : 13\nnumber of NaN values for the column bathrooms : 10\n" 178 | } 179 | ], 180 | "source": "print(\"number of NaN values for the column bedrooms :\", df['bedrooms'].isnull().sum())\nprint(\"number of NaN values for the column bathrooms :\", df['bathrooms'].isnull().sum())\n" 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": "\nWe can replace the missing values of the column 'bedrooms' with the mean of the column 'bedrooms' using the method replace(). Don't forget to set the inplace parameter to True" 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 14, 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": "mean=df['bedrooms'].mean()\ndf['bedrooms'].replace(np.nan,mean, inplace=True)" 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "metadata": {}, 197 | "source": "\nWe also replace the missing values of the column 'bathrooms' with the mean of the column 'bathrooms' using the method replace(). 
Don't forget to set the inplace parameter to True " 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": 15, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": "mean=df['bathrooms'].mean()\ndf['bathrooms'].replace(np.nan,mean, inplace=True)" 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 16, 209 | "metadata": { 210 | "jupyter": { 211 | "outputs_hidden": false 212 | } 213 | }, 214 | "outputs": [ 215 | { 216 | "name": "stdout", 217 | "output_type": "stream", 218 | "text": "number of NaN values for the column bedrooms : 0\nnumber of NaN values for the column bathrooms : 0\n" 219 | } 220 | ], 221 | "source": "print(\"number of NaN values for the column bedrooms :\", df['bedrooms'].isnull().sum())\nprint(\"number of NaN values for the column bathrooms :\", df['bathrooms'].isnull().sum())" 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": "# Module 3: Exploratory Data Analysis" 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "metadata": {}, 231 | "source": "### Question 3\nUse the method value_counts to count the number of houses with each unique floor value, then use the method .to_frame() to convert the result to a dataframe.\n" 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": 28, 236 | "metadata": { 237 | "jupyter": { 238 | "outputs_hidden": false 239 | } 240 | }, 241 | "outputs": [ 242 | { 243 | "data": { 244 | "text/html": "
 floors\n1.0 10680\n2.0 8241\n1.5 1910\n3.0 613\n2.5 161\n3.5 8
\n
", 245 | "text/plain": " floors\n1.0 10680\n2.0 8241\n1.5 1910\n3.0 613\n2.5 161\n3.5 8" 246 | }, 247 | "execution_count": 28, 248 | "metadata": {}, 249 | "output_type": "execute_result" 250 | } 251 | ], 252 | "source": "floor_count = df['floors'].value_counts().to_frame()\nfloor_count" 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "metadata": {}, 257 | "source": "### Question 4\nUse the function boxplot in the seaborn library to determine whether houses with a waterfront view or without a waterfront view have more price outliers." 258 | }, 259 | { 260 | "cell_type": "code", 261 | "execution_count": 30, 262 | "metadata": { 263 | "jupyter": { 264 | "outputs_hidden": false 265 | } 266 | }, 267 | "outputs": [ 268 | { 269 | "data": { 270 | "text/plain": "" 271 | }, 272 | "execution_count": 30, 273 | "metadata": {}, 274 | "output_type": "execute_result" 275 | }, 276 | { 277 | "data": { 278 | "image/png": "[PNG image data omitted: seaborn boxplot of price by waterfront]", 279 | "text/plain": "
" 280 | }, 281 | "metadata": { 282 | "needs_background": "light" 283 | }, 284 | "output_type": "display_data" 285 | } 286 | ], 287 | "source": "sns.boxplot(x='waterfront', y='price', data=df)" 288 | }, 289 | { 290 | "cell_type": "markdown", 291 | "metadata": {}, 292 | "source": "### Question 5\nUse the function regplot in the seaborn library to determine if the feature sqft_above is negatively or positively correlated with price." 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": 31, 297 | "metadata": { 298 | "jupyter": { 299 | "outputs_hidden": false 300 | } 301 | }, 302 | "outputs": [ 303 | { 304 | "data": { 305 | "text/plain": "" 306 | }, 307 | "execution_count": 31, 308 | "metadata": {}, 309 | "output_type": "execute_result" 310 | }, 311 | { 312 | "data": { 313 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaEAAAELCAYAAABwLzlKAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzsvXt4XPd53/l5z5kb7gBJgKR4iUSbEi3ZsmOxlrz1Kqzj2JLbSGkfu7GyqdVUXWoTp86lyUruk9pZOe1K2z51rNR1xI3bSE0ixdXGNTcrS5Gt0Ex2RVukbFmWRIkUSYngDQAJAjOY+znv/nHOGQwGA2AAYjC4vJ/nwTMzvznn/M6A4O87v/f3/b2vqCqGYRiG0QqcVt+AYRiGsXYxETIMwzBahomQYRiG0TJMhAzDMIyWYSJkGIZhtAwTIcMwDKNlmAgZhmEYLcNEyDAMw2gZTRUhEfkNEXlFRH4sIo+LSEpErhGR74nIMRH5cxFJhMcmw9fHw/evrrrO58L210XkY1Xtt4Vtx0Xk/qr2efdhGIZhLD3SrIwJIrIF+FvgelXNicjXgaeAjwN/oapPiMgfAi+p6ldF5FeAG1X1fxGRTwH/UFV/XkSuBx4HPgBcBXwbuDbs5g3gZ4BB4AXgLlV9Neyr4T5m+xwbNmzQq6++elF/N4ZhGKudI0eOjKhq/1zHxZp8HzGgTURKQDtwDvgw8Avh+48Cvwt8FbgzfA7wJPAfRUTC9idUtQCcFJHjBIIEcFxVTwCIyBPAnSLy2nz70FmU+Oqrr+bw4cML/PiGYRhrExF5q5HjmhaOU9UzwL8H3iYQnzHgCHBZVcvhYYPAlvD5FuB0eG45PH59dXvNOTO1r19AH4ZhGEYLaJoIiUgfwczjGoIwWgdwe51Do1mIzPDeYrXP1scURGSviBwWkcPDw8N1TjEMwzAWg2YaEz4CnFTVYVUtAX8B/A9Ar4hEYcCtwNnw+SCwDSB8vwe4VN1ec85M7SML6GMKqrpPVXer6u7+/jlDmoZhGMYCaaYIvQ3cIiLt4drOTwOvAn8NfCI85m7gm+Hz/eFrwvefC9dq9gOfCp1t1wA7ge8TGBF2hk64BPApYH94znz7MAzDMFpA04wJqvo9EXkSeBEoAz8A9gH/D/CEiPxe2Pa18JSvAf81NB5cI
hAVVPWV0O32anidz6iqByAivwo8A7jAf1bVV8Jr3TefPgzDMIzW0DSL9mph9+7dau44wzBayYGjQzxy8ASnR7Ns62vn3lt3sGfXQKtva1ZE5Iiq7p7rOMuYYBiGsYw5cHSIz+9/haF0nt62OEPpPJ/f/woHjg61+tYWBRMhwzCMZcwjB08Qd4X2RAyR4DHuCo8cPNHqW1sUTIQMwzCWMadHs7TF3SltbXGXwdFsi+5ocTERMgzDWMZs62snV/KmtOVKHlv72lt0R4uLiZBhGMYy5t5bd1DylGyxjGrwWPKUe2/d0epbWxRMhAzDMJYxe3YN8MAdNzDQlWIsV2KgK8UDd9yw7N1xjdLsBKaGYRjGFbJn18CqEZ1abCZkGIZhtAwTIcMwDKNlmAgZhmEYLcNEyDAMw2gZJkKGYRhGyzARMgzDMFqGiZBhGIbRMkyEDMMwjJZhImQYhmG0DMuYYBiG0SArsbjccqdpMyERuU5Eflj1My4ivy4i60TkWRE5Fj72hceLiDwsIsdF5Eci8v6qa90dHn9MRO6uar9JRF4Oz3lYRCRsn3cfhmEYs7Hai8u1iqaJkKq+rqrvU9X3ATcBWeAbwP3Ad1R1J/Cd8DXA7cDO8Gcv8FUIBAX4AnAz8AHgC5GohMfsrTrvtrB9Xn0YhmHMxWovLtcqlmpN6KeBN1X1LeBO4NGw/VHg58LndwKPacAhoFdENgMfA55V1UuqOgo8C9wWvtetqs+rqgKP1VxrPn0YhmHMymovLtcqlmpN6FPA4+Hzjap6DkBVz4lIFFDdApyuOmcwbJutfbBO+0L6OFd9syKyl2CmxPbt2+f1QQ3DWJ1s62tnKJ2nPTE5bC6n4nIrdb2q6TMhEUkAdwD/ba5D67TpAtoX0sfUBtV9qrpbVXf39/fPcUnDMNYCy7m43Eper1qKcNztwIuqeiF8fSEKgYWP0W9pENhWdd5W4Owc7VvrtC+kD8MwjFlZzsXlVvJ61VKE4+5iMhQHsB+4G3gwfPxmVfuvisgTBCaEsTCU9gzwb6vMCB8FPqeql0QkLSK3AN8DPg38wUL6WPRPbBjGqmS5Fpc7PZqlty0+pW2lrFc1VYREpB34GeDequYHga+LyD3A28Anw/angI8DxwmcdL8EEIrNF4EXwuMeUNVL4fNfBv4YaAO+Ff7Muw/DMIyVzHJfr5oNCYxlxkzs3r1bDx8+3OrbMAzDmJFoTSjuCm1xl1zJo+RpS8OFInJEVXfPdZyl7TEMw1jhLOf1qrmwtD2GYRirgOW6XjUXNhMyDMMwWoaJkGEYhtEyTIQMwzCMlmEiZBiGYbQMEyHDMAyjZZgIGYZhGC3DRMgwDMNoGSZChmEYRsswETIMwzBahomQYRiG0TJMhAzDMIyWYSJkGIZhtAwTIcMwDKNlmAgZhmEYLaOpIiQivSLypIgcFZHXROSDIrJORJ4VkWPhY194rIjIwyJyXER+JCLvr7rO3eHxx0Tk7qr2m0Tk5fCch0VEwvZ592EYhmEsPc2eCX0ZeFpVdwHvBV4D7ge+o6o7ge+ErwFuB3aGP3uBr0IgKMAXgJuBDwBfiEQlPGZv1Xm3he3z6sMwDMNoDU0TIRHpBm4FvgagqkVVvQzcCTwaHvYo8HPh8zuBxzTgENArIpuBjwHPquolVR0FngVuC9/rVtXnNahR/ljNtebTh2EYhtECmjkT2gEMA/9FRH4gIn8kIh3ARlU9BxA+RqUAtwCnq84fDNtmax+s084C+jAMwzBaQDNFKAa8H/iqqv4kMMFkWKweUqdNF9A+Gw2dIyJ7ReSwiBweHh6e45KGYRjGQmmmCA0Cg6r6vfD1kwSidCEKgYWPQ1XHb6s6fytwdo72rXXaWUAfU1DVfaq6W1V39/f3N/yBDcMwjPnRNBFS1fPAaRG5Lmz6aeBVYD8QOdzuBr4ZPt8PfDp0sN0CjIWhtGeAj4pIX2hI+CjwTPheWkRuCV1xn6651nz6MAzDMFpArMnX/xfAn
4pIAjgB/BKB8H1dRO4B3gY+GR77FPBx4DiQDY9FVS+JyBeBF8LjHlDVS+HzXwb+GGgDvhX+ADw4nz4MwzCM1iCBscyYid27d+vhw4dbfRuGYRgrChE5oqq75zrOMiYYhmEYLcNEyDAMw2gZJkKGYRhGy2i2McEwDAOAA0eHeOTgCU6PZtnW1869t+5gz66BuU80VjU2EzIMo+kcODrE5/e/wlA6T29bnKF0ns/vf4UDR4fmPtlY1ZgIGYbRdB45eIK4K7QnYogEj3FXeOTgiVbfmtFiTIQMw2g6p0eztMXdKW1tcZfB0WyL7shYLpgIGYbRdLb1tZMreVPaciWPrX3tLbojY7lgImQYRtO599YdlDwlWyyjGjyWPOXeW3e0+taMFmPuOMMwms6eXQM8QLA2NDiaZesC3XHmsFt9mAgZxipgJQzOe3YNXNE9RQ67uCtTHHYPhNc2ViYWjjOMFc5asT+bw251YiJkGCuctTI4m8NudWIiZBgrnLUyOJvDbnViImQYK5y1Mjibw251YiJkGCuc5TQ4Hzg6xF37DvGhh57jrn2HFnVdas+uAR644wYGulKM5UoMdKV44I4bzJSwwrGidnNgRe2MlUDkjrsS+/Ni3EPkXmuLu+RKHiVPTSjWKI0WtWuqRVtETgFpwAPKqrpbRNYBfw5cDZwC/rGqjoqIAF8mKL+dBf6pqr4YXudu4HfCy/6eqj4att/EZHnvp4BfU1VdSB+GsZK5UvvzYlBtkABoT8TIFss8cvBEy+/NWL4sRTju76nq+6oU8X7gO6q6E/hO+BrgdmBn+LMX+CpAKChfAG4GPgB8QUT6wnO+Gh4bnXfbQvowDOPKWSsGCWNxacWa0J3Ao+HzR4Gfq2p/TAMOAb0ishn4GPCsql5S1VHgWeC28L1uVX1eg5jiYzXXmk8fhmFcIWvFIGEsLs0WIQX+SkSOiMjesG2jqp4DCB+jefoW4HTVuYNh22ztg3XaF9KHYRhXyHIySBgrh2an7fm7qnpWRAaAZ0Xk6CzHSp02XUD7bDR0TiiYewG2b98+xyUNw4DFyw9nrC2aKkKqejZ8HBKRbxCs6VwQkc2qei4MhUUezkFgW9XpW4GzYfuemvYDYfvWOsezgD5q73sfsA8Cd9x8PrNhrGWWg0HCWFk0LRwnIh0i0hU9Bz4K/BjYD9wdHnY38M3w+X7g0xJwCzAWhtKeAT4qIn2hIeGjwDPhe2kRuSV0vX265lrz6cMwDMNoAc2cCW0EvhHoAzHgz1T1aRF5Afi6iNwDvA18Mjz+KQLr9HEC+/QvAajqJRH5IvBCeNwDqnopfP7LTFq0vxX+ADw4nz4MwzCM1mCbVefANqsahmHMn2WxWdUwjMVlJdQNMoz5YLnjDGOFsFbqBhlrCxMhw1ghrJW6QcbawkTIMFYIlhbHWI3YmpBhrBC29bUzlM5XEoTCyk+LY2tchs2EDGOFsNrS4lSvcbkCPzg9yj2PHeb23z9o61xrCBMhw1ghrLaibtEaV9lTzo7lUR9cgZMjE2a4WENYOM4wVhCrKS3O6dEsvW1xTo5N4CA4jqCA52vFcLFaPqsxMzYTMgyjJUSlH4qej4SphVUh4TpmuFhDmAgZhtESojUu1xF8VXxVVKG/K7niDRdG45gIGYbREqI1rqvXteOpIsDmniSuIyvacGHMD1sTMgyjZURrXJFVe3A0y0BXyqzaawgTIcMwptCKvTuryXBhzA8LxxmGUcHy0xlLTcMiJCI/ISIfCZ+3RQXrDMNYPVh+OmOpaUiEROR/Bp4EHgmbtgL/vVk3ZRhGa7D8dMZS0+hM6DPA3wXGAVT1GGABXMNYZUR7d6oxu7TRTBoVoYKqFqMXIhIDGirJKiKuiPxARP4yfH2NiHxPRI6JyJ+LSCJsT4avj4fvX111jc+F7a+LyMeq2m8L246LyP1V7fPuwzCMufPTHTg6xF37DvGhh57jrn2HbK3IuGIaFaHvisi/AtpE5GeA/wb83w2e+
2vAa1WvHwK+pKo7gVHgnrD9HmBUVd8JfCk8DhG5HvgUcANwG/CfQmFzga8AtwPXA3eFx867D8MwAmbLT2emBaMZiOrcExoRcQgG8I8CAjwD/JHOcbKIbAUeBf4N8JvAzwLDwCZVLYvIB4HfVdWPicgz4fPnw5nWeaAfuB9AVf/38JrPAL8bdvG7qvqxsP1zYduD8+1jts+xe/duPXz48Jy/I8NY7dy179C0UhLZYpmBrhSP772lhXdmLEdE5Iiq7p7ruEb3CbUB/1lV/8/w4m7YNtdq5e8D/ysQOenWA5dVtRy+HgS2hM+3AKcBQvEYC4/fAhyqumb1Oadr2m9eYB8j1TctInuBvQDbt2+f4yMaxtogSjhajZkWjCul0XDcdwhEJ6IN+PZsJ4jIPwCGVPVIdXOdQ3WO9xarfa7+JxtU96nqblXd3d/fX+cUw1h7bOtr5+JEgRPDGY6eH+fEcIaLEwUzLRhXRKMzoZSqZqIXqpoRkbn+8v4ucIeIfBxIAd0EM6NeEYmFM5WtwNnw+EFgGzAYhsp6gEtV7RHV59RrH1lAH4axJCxmNoKlzmzwwR3r+P6pSzgCjkDR8xlKF7nr76xrWp/G6qfRmdCEiLw/eiEiNwG52U5Q1c+p6lZVvZrAWPCcqv5PwF8DnwgPuxv4Zvh8f/ia8P3nwrWa/cCnQmfbNcBO4PvAC8DO0AmXCPvYH54z3z4Mo+ks5sJ+K0wCz5+4RH9ngoTr4IclF/o7Ezx/wr7HGQun0ZnQrwP/TUSiGcVm4OcX2Od9wBMi8nvAD4Cvhe1fA/6riBwnmJ18CkBVXxGRrwOvAmXgM6rqAYjIrxKYJFyCNatXFtKHYSwF1dkIANoTMbLF8oKKty3mtRrl9GiWDZ1J+rtSlTZVtTUh44poSIRU9QUR2QVcR7CuclRVS412oqoHgAPh8xPAB+ockwc+OcP5/4bAYVfb/hTwVJ32efdhrH5akZizmsVc2G+FSWBbX/s0d5xtZDWulFlFSEQ+rKrPicg/qnlrp4igqn/RxHszjEUjCl/FXZkSvnoAFiREkaC9cWGckqckYg47B7pmFbbFGsQPHB1iPFfi3FiOVMylvytJVyredEG499YdfH7/K2SLZdriLrmSZ3V/jCtmrjWhnwoff7bOzz9o4n0ZxqKymIk5I0E7OZJhPF8mV/IYy5Y4dTEz67rMXNkI5tN3e8LFEaHo+ZwZzTGSyV+xIMyVDWG2jayGsVDm3KwablT9hKp+fWluaXlhm1VXBx966Dl62+KIBC79dL7E0Hiegqd84Op1fHDHOp4/camhUF20afP8WJ6ypzhheeqYI2zqSc26ebO6eNvWBYQEqzeMjudKjGQKFMo+7QmXhz/1k1fktItmitWzHBMZY6Es2mZVVfVDA8CaFCFjdVAdCkvnS5y9nEdRUjGHkyMZvn/qEgNdCdZ3JOcM1UXrMUXPxw1FTULL8lzrMldavK16Lai7LU53WxxVZSxXuqLrtsLoYBjQuEX7WRH5LRHZJiLrop+m3plhLCLVobCh8UCAADZ0JknnyzgC47lyQ6G6KNN0wnWIAgkaWpabvS7TrCzXVsLBaBWNitA/A34F+C5wuOrHMFYE1esZBU9JuA5X9bTRHc5oos2XEbMNwJGgdaVi+Chl38f3le62WNMX6hdjXakeVsLBaBWN7hO6nkCEPkSQ5uZvgD9s1k0ZRjOIQmG1iTgTrkPR80m4k9/JZhuA9+wa4AGCEFbZG6cYuuOuXt/ZdNt3dd8LXVeq5cDRIUYnCpy6OEHccdjYnSTmOuZ8M5aERrNof52goN2fhk13Ab2q+o+beG/LAjMmLE/ms+en9tgP7ljHky+eqSzCj2QKDGeKlTWh6kV5oKV7i5pNtSGh7PlcSBcoecq1A53cd9uuVfVZjaWlUWNCoyL0kqq+d6621YiJ0PKj1sk1kikwmi3RlYpN26szk+vrE+/fwvMnLlVmE5E7rnp2Aax6x5iVZzCaxWKXcviBi
NyiqofCi98M/L9XcoPG2uZKshdUO7nGcyUuTgRFf7OF8jRn20yur+dPXJo2yH62pp+79h1qumNsNWVxqKXVn63VrPXP3yiNGhNuBv4/ETklIqeA54GfEpGXReRHTbs7Y1Vypck3q51cI5kCDoLrCCVfpznbrsT11SzHWLQp9KYv/hX3/skRTl3MtKxSabMMCWu9Cuta//zzoVERug24hiCDwk+Fzz9OkDXhZ5tza8Zq5UqzF1QPnEXPR2TSIg1TheJKBtlmDNDVg1O+5OOrcjFTIlMoX1EWh4XSLLfdYmaoWIms9c8/HxpNYPpWs2/EWDvMJwRUL6RRncMscrYJQn9XEpgqFHPlO5stZNKMXGnVg1O02VWB4XSBrlR8zt/DfHLVzfV7jNyCi+22A6vCutY//3xodE3IMGZkvrHvRhN5zph09I4beOCOG3jk4AnGskXKvrKuI05nMjbtm/xsg+xcSU2bMUBXD04J16HsKeJM7lGa7fdQLHuM54Oq9bmiV8lVN1cS1kY+52KvVaz1jNtr/fPPBxMh44pYSHbqRmcYs6WSeXzvLVMccLMJxUyDbCOpamY6d6GLztWD04bOJGfHcvhlxVd49dwYMcfhzvdeVfc+L2bKOEglV914rsymntg0o0TtvY1OFKZ8zrKnDKXz3PsnR3j/9r6mLJiv9Yzba/3zzwcTIeOKaGQgrzdgRzOZaov0IwdP8Dvf/HHlmEZDGgv9Jh9dP50vMZwuUPR84o4wlpu9VNaVlIWoHpy6UjE6ci6Xc2VcgVTMpSsV48kXz3Dj1t7KteaTq67evZ26OMHW3jYAxnMlzo7lEMBXveKSFjPRrDDfSmGtf/750DQREpEUcBBIhv08qapfCEt0PwGsA14E/omqFkUkCTwG3ARcBH5eVU+F1/occA/gAZ9V1WfC9tuALxNUVv0jVX0wbJ93H8bCmEsoZgupRRbpmY7pSsbIlbymhTS29bVz6mKGi5kSIlQcdul8mQNHh2YcMK4k2Wft4OQrbO5JsqFzslpp7bWi2VMlfCcz56qrd29xx+FCukB3W6LiJkQg6TpNTVTajDDfSmKtf/5GadQdtxAKwIfDDa3vA24TkVuAh4AvqepOYJRAXAgfR1X1ncCXwuMQkesJynDfQODS+08i4oqIC3wFuJ0grdBd4bHMtw9j4czlIGvEJTTTMaraFOcWBMI3OJrl3FgwAyqUfcpesHG7rz0+o4vp4W+/waGTF3lzeIJXzo5xYSwH1J+hzVSfZ8+uAR7fewt/c9+H6W6Ls74jOeW82mvNlavugzvWVfp58e1RylU58AA2dicrv8ei56MoqlSMHK1YMJ+rdpGxdmjaTEiDVAyZ8GU8/FHgw8AvhO2PAr8LfBW4M3wO8CTwHyUo/nIn8ISqFoCTInKcydLdx8NS3ojIE8CdIvLafPvQRtJGGHWZK/bdSEhtpmPGciW+eOe75wxpzLQ+M1v7bz/5EqPZqWG3sq8MdCbY0JmsOyg//O03+PJzxyuZs32FoUywUbarLT5lRjLT7O4Tg5en1C3qTLjTZnsXJwpMFDw+9NBz08KXUa46gImCh68lvnLgTdZ1BGI2ki5w5nIeELrD32nMdbh2oJPe9gSDo0EoblNPiq5U8P5SL5gvdpVbY2XT1DWhcLZyBHgnwazlTeCyqpbDQwaBLeHzLcBpAFUti8gYsD5sP1R12epzTte03xyeM98+Rq74w65R5op9N+ISmu2YuUIa0YBW8oLqpufGcrz49igff/dGjrw9Vnege+TgCdL5Mm64wK8afDsSgYmiN+3+IjE7dPIiqhBzhLI/+b1lKFPEB/7137++0lYvLDaSyfOVA2+yta+tck/juRLRldriLhcnCgyli/R3JqaK1/uDP+FUIsaGhMvFiSLdbXHOj+Xx1edipkQy5rKpJ8XgaI4L6TxdqSCcOZYr0d+Z5PRolh0bOhjOFHCdYKbZigVzq11kVNNUEVJVD3ifiPQC3wDeVe+w8FFme
G+m9nqhxNmOn62PKYjIXmAvwPbt2+ucYlQzm1A04hK6EifRIwdPUPK8yrpO3HXwfOW/v3SOTd1JetqCtZbqge70aJay7xNzHWKOQykMX6lCvjx9H1H0rT2aAVULUET1H9aBo0O8+PYonu+TjLn0dyXpSsUZy5bwwqwOkRkiX/ZIui79nUnGciUmCh79nQn6uybvu1a8jg9nKHtKR3LqXqPzY/mKsBZKyvnxPBs6EgiBiaG3LU6u5CFQMWA0e8G83mzU9tAY1SyJO05VL4vIAeAWoFdEYuFMZStwNjxsENgGDIpIDOgBLlW1R1SfU699ZAF91N7vPmAfBAlMr+Cjr3kacQk16iSaaUAbywYC5ITOMVegpMpYtjRlwT8a6Lb1tTOSLqAamBFgUog6ErEpCUqrv7U7EoTgIkQADR672ybXkT6//5XK/ZR95ezlPFf1QsHzScXcSmVXkWBWVfR8JooeX7zz3fzON388bYCuFi8Az1ccCTa4RmYFRSl4SjLmEHMCG3e0xtbdFp8y6wDo60jy9G80N0HpTGG3eiFI20OzdmmmO64fKIUC1AZ8hMAI8NfAJwjca3cD3wxP2R++fj58/zlVVRHZD/yZiPwH4CpgJ/B9gi+fO0Mn3BkC88IvhOfMq49m/Q6MgEZcQo2G3eo56M6N5YhX1QJSBUeCQb+aaKC799YdlTUhleCf33WE9rjDVT2pwCZ+cPq39g0dicoaUNBRMI0e6ExWBC4SrY1dqcAKrcGB58fyxByHrlSM4XShIlK+D8mYVMwa9UKTkXhFJNxANIuez1U9bZwdy1HydMo9bexKEXOFEyMT7BzonPJ7WKpZx0xhNxGh5Pm2h8YAmuuO2wz8dZjg9AXgWVX9S+A+4DdDg8F64Gvh8V8D1oftvwncD6CqrwBfB14FngY+o6peOMv5VeAZ4DXg6+GxzLcPY/nzyMETFMse58fyvH4hzfmxPMWyh6oSc4IQnKri+4qP0p2KEXOcus66PbsG+HefeC87BzoREUSETd1JEnGXkq91v7UDbOxpY6AzUbknEdjYlWSgO1URuCjpaXdbnKt62oi5QahMgc/seQeJmEu+5FEs++RKHgXPJ1sMPtexC+N1c7lF4hWxoTOJr+CK0JWKsb4juCfXEWKuVCrGRslXW1UxdaYEsJlCuVLldixXYqArtarKYxjzo6F6QmsZqye0PLjpi3/FeD7IGBDtk/FRelIxPv3Bq/nKgTfx/CAc1ZWKkYi502oGzbb2MVNdnYTrMFH0ptQUGsuVEKgM9NX1jIpln/aEW1nTia4T1ed5+Ntv8KXvHKP2v53rQMxxeOQXbwKmhiZri/BVmw0yhTJb+9q5nC1S9Pxp9x93hGzJX1BNpCstRWC1itYuJc8nEXMXtZ6QYbSUKNzkOJMZA3xfKXrKZz9yLTdu7a0M3B0JFxHhsUNvUSwHA/BczGYT/+RNW/mjvz1JphCEktrjDlt62xARzo/lSBe8ikU6crdBMGOpDjUdODrEH/3tScKlpOBzhI++D+u64tNSEkVUf76tfe38679//bRUPfXMHZFjb7479xfDRr2WUtes5dpB5XCfXfDjUSj54VprY5gIGSuCRMwhV/TwdTJjABq0w+SaUq1lG4FcCU6O1E/2GQ0ew+kCI+nCtP0zHQmXJ188Q0fSJVf0QCBf9rmcK1H2lWzJp+z7jOfKJGNuxQgxUfCmuM8gMCxMFMvEXaFQDmRICdavHBHWdySnpeBpdGCby9wx3wFxMWzUayV1zVra96SqgdiUQsEp+xVTz6WJIkfPj/P6+TTHhjJzXGkSEyGjKSz2N8OdA12cuphhPBfs+k+4Dt0dca5eP3XRfUqyT0fCxf8gFU+U7DM67thQmnS+TF97nE3dSc5czjM4mmNLrxJznaBkguvUvd5o6Fgr+0oyNtUFt74jScy96X28AAAgAElEQVQp8Tf3fbhyX1GV1lTMpewrjgRJS53QVh5zZMpazUIGtsVME7NYNuq1k
LpmNe97KpYnxaZQ9imWfVSVdL7E6+fTvH4hzdHzaV4/n2ak2rQzD0yEjEXnSr4ZziReUWinuy2wLOfLHuUJ5a6/s27K+VOSfVaF7qJkn8cujFfuLVsoB0XlJopc1dPGlt42LqTznB8vcM36dhKuwxtDGZKuUPC0EtYTgUIY5hMADTNbowyng42gtQv/x4bSZAtlip5WLNYQWL59X+nuiE9JwfPi26OVzAZRKqOlHNisFEHjrJZ9T56vlXBaPnz0ww3Nxy4EQnP0fJo3LmQ4czlX9xoxR3hHfyfv2tzFlxvs10TIWHQW+s2wnnj99pMvsb4jQaboge8zkimiGmScjrvCVw68yWOH3qoUeJuS7NOfnuyz6Ck9ruD5Sq7kV9Zm3roUrCVt7EoyUShXFvNj4cI+BGIRd50pG1MTbiA++IAo+bI/bd3jwNEh0vkyZd+vZGeIHNWuI/S0BzO6agOCr4pAZXY1W8G72t/hQmagtedF97IW1nOulJUo2JWwWtU6TskLZjonRyYqs5vXL6R56+IEdfZn4whsX9fOdZu62LWpi+s2dbFjQyeJmEMq7poIGYvLfAa3hZZIqBUvLwx7pQtl3tnfyfHhDKqwta8NVTgbJg/NFsqcHMlwz6MvVAZ3CBb9Y+HspSsVZzxXCorBjQT/qWr/XxXKPmcu54m7Qk97grKnlXg3BMcHMyyIOcEsJrqXkUyBQlkrm10hCMGdHs0ynisRd6BQZhq/9uF38tmPXFs5Pvr80SZUZLLq6lwD20JnoPXOe/LFM/NyF8527dW+YL8SDBilyDxQ8siHYbWy5/PWxYlQbDK8fj7NiZHM5J6zGrb0tnFdKDbXbexk50AXbQmXuOuQjDskXZdk3CHhOhUDUSOYCBlzMt/BbaElEmrDGsPpAo4EYhRscAxCWW9dzFY2e8ZcIV/2yacL076thftEUYKQm6pWNojWw/MVN8wL1xZ3OTk2Qdxx8FXxdHINR4B39HdycaKI6wTW55grFeszMOX3dX4sXzfVD8BXv/smXz8yyLa+do4NpelIuJwYzpAreZXPUwCOXUjTlYpNyU9XS70Z6HA6z2ef+AHdbfEZRWCmmevzJy5dkZV6rSzYLzcDhu9rZYaTDw0EZc/n7OV8MMO5EJoHLmTIl+v/Z9jQmZic4WwMhKcrFb9iwamHiZAxJw89fZSh8TyeBgv1/V3Jyg7/ev/R7r11B/f+yREUxUHQ8O88KpEw03/O2rBG0QsWQYOqo+N4VQO5r0FRNoW6oYKIsio7NnQweClLwYc6qQKnsK49znCmyNHz6YpwRJZqJxRUR4T7bw/SINYbeKpnNDA94Wk1uZLPhfE8I5kCxbLP6ESQ9dqtSRGEzHXn00V8PFfi4kQRX5Xt69pnFIFmrWms5gX7WqrTPJ0OM2dUtzcLVa2UIsmXgrBasewxEv4Nvx661V6/kCFTbyoO9LTFuW5jZ9Usp4sNXSnirpCIOSRjLsmYQzLmIHJlglMPEyFjVg4cHeLo+TQQDIJl3+P0pSxb+9pmHKT27BqgM+mSL/kVJ9uGzhRdqdisA1ttWEOAkg+xcDYUEQmCr1Pb66EKpy5m5zwu7gquCJeyJeLhmlF1fxD052gwK6stzFdN7aA+14ZwVwKhjrqMKVPCiqmYw86Brjkr1tbmZBvJFMLz3VnNDc1a01gtC/aNsFSzvrLnkw/DatGazuWJIkfD2U1kHLg0Ud+p1p5wuXZjZzi76WbXpi429aQqYpMIxaZZglMPEyFjVh781mtTvoGrBuVtz17Osfvq9TOed+3G7rq75Wcb2GrDGjHXoex5uK5DuexXZiS1/zdqE4vW4s8hQBBshhU3ONaNOXh1hEMAz4etfSlcZ3ImOJcYzBD9q1CoCYmUfB8/WA4KBDG8l7kq1taWhSiEv7OoeF3tNSKataaxEhfsF0ozZn3VYbVob85YrsgbF9JT1nHOj+frnh93hXcOBIITGQe2r+8gFXdJhGG1hLu0glMPEyFjV
k5ezBJzgsE3QglmKPUGqWhAfuXsGOlCGVFoS7iVVDpzDWzVm07v/ZMjiCOT5gCBuASLMrs2dfPKmbH6xTnqUJ2loB6BkCniMK0yaTWuI3Sl4qgqxy6Mc9uXvsux4Qxxx6E7FePIW5cqRecSrrC5J1Wp2DoTte/GHadiMRcR4mHMfaaKtTCZHTvhOmHxuiztCZeOpFvZfFt7jYhmrWmshAX7xWIxZn3VYlMoe6RzJd4cDp1qofCcvpSt+3fsCOzYEIXUAuHZ0d9JezJWEZxkLBCdRgRnKQ0lJkLGnDgiOK5ULMYQ1KOpV2ohylaQK3qVGUq26FH2lc/s2d6wXfjz+19BCMoyiAQzE1WlFM5qXj4zBkBH3GFHfxdvnB+nUGewb/T73VU9Kc6NF+YM2yXDDA0jmQLpgkemmMWVwHgxVLNZr+gpb12qv5+i+v6mF7QKBpVoHWlTd3JaafPZ0gx969dvBWZO5VNPBJqxqXQ+4rbSXXTznfXVprqZyJc5UW2NPp/m5MWJGf8et/W1TbFG7xzooqstHghNbH6CU8tSG0pMhIxZ2bGhg2NDGVwnSJGjGvwHisecSvnpD+5Yx/MnLlU2WEKQ4y0mDr4oMVfY1JPiqZfPTSltHQ2GtYNP9C1/U0+Ks5fzwaK8r9RqjAATRZ9XzozNOsuJnGuzMXg5z1wmHyVwDWWLZS5OFBGopN9ZKLVnC4GdNhp7Nvek8BUGulLzrli7HFxbjYjbanDRzTbrq011kyt5nBye4OiFScE5PpyhOINTbWN3MhCc0KV23aZu1nUkJsUmXM9ZLJbaUGIiZMzKfbft4reffCnYbOn5OGEVz+5UjN62OKcuZvj+qUv0dyYqGyyLnpIIpjCVbAVlz+fUxRxXV5VK+K0nX6JY8ij6gfV6JFPgt558CVVlc0+QILSv3eNCulD33iqGgVnuXwlmbdF9zYbrCP4sxyRdwdfgep6vQfG4Odaj6jElgWmUB6/qfhOuE9jJw3BcZPt+5OCJoNZR1WbS4XSedL5MoRyE7+5871VT+loJaXNWg4uuWvBPX5pgc28b/+Tmn+AdAx0cOnGpklPt6Pk0xy6kmSh6da/T1x6vONSu29TF9Zu7GehOTRGbKF9is1hqQ4mVcpgDK+UwGSoZHM0ylitNKVVwYjhTccABlL3AMooErizfD2ZCXpgqYOfGrsp1j54bo+RD0g324pSqRvO4A33tCS7nypSqwoALxXUE9XVOk8Bs9Hcm+HefeC+PHDzBD06Pho62qfe9oHuTqW64pOvgo5V6RPXKSZQ85abtPTz14wuUfZ+k69DTHqT+6e9MMhxavuOucO3G7mUdBvvQQ8/R2xafEjpSVcZyU/PvLUeqU90Uyj5nL2d59dw4b5zPcPRCmjfOp7k8wybtzmSsyhrdzbs2dbF1XVtgiQ7NA80WnHosVgkOEbFSDsbiUP1tOhowIoqejxPOdqIqn64DZR/K4a7QrlSw92Zrb2rKdaPoQ2B0mDqQl3wYyhRxa2YKC8XzteH1oXo4Aul8mYeePkq6UGZjV5KzY/mwpPiV3aCngZMpChnGXGFDZ4ruttAAMZRha1/btJnCd44OT2kfz5UYTucCl5zSUAbx5RAGWykuutpUNyPpAq+cnZzhvH4+zXCm/qw9GXPYOdBZWce5/qpurtnQURGcZMyZUh24lSy1ocREyJgXtQNGwnUms1q3xckVy4yEexR8hfa4wzUbOulrD4quVRMN3eWZUhiE15gr5BWJy0yHxMOMDQuVCiFYD/N95Y2hDNcOdFZEdyRToKwyq6GhdqZTj4Tr4Ergy97RP5kZPKqKWq9C6UTRY3tV+0gmyDBR8pREzKmbQbxaXJZLGGy5uuiqU92MZku8enaM16qMA3Ml8YzS27zrqm52buykIxGvrOMsF8Gpx1KvJTZNhERkG/AYsIkgbL9PVb8sIuuAPweuBk4B/1hVRyWYi38Z+DiQBf6pqr4YXutu4HfCS/+eqj4att8E/DHQB
jwF/Jqq6kL6MBqjdsDobosxlC7SlYoxnityKVvCEWFbX6pSDqG6nk71QBN3hbKnswqMI7C1r50zl3MzZh0Iv/TXJeE6uI7gqdeQoEVUz2+CpKVBUtFgZpLGdRzWdcS5ZkMHuZLHSKbARKEMSKW6azkUgELZn7b2U0u26LGhM4HCtMH4mvXtU/YdwWSto+r2oudXfg9RZKs6g3htTH+5bCZdDgaK6j056XyZV8+N8erZuZN4CvAT6yeTeL5rczfXb+6hMxWrbPqMLWPBmYmlXEts5kyoDPxLVX1RRLqAIyLyLPBPge+o6oMicj9wP3AfcDuwM/y5GfgqcHMoKF8AdhOMC0dEZL+qjobH7AUOEYjQbcC3wms23EcTfwfLloWuBdQOGFev7+SuvzPpjou5wsauIJQEMJzO8yt/9iKer3i+T8J1aU+67BzoYqAzwfMnR2ftzycIT23pTXFuLE/RU+KO8I7+Du6//V38yp+9SDZc5I2Kw0Vi5UiUrSCsA6SK1oS+qqkWnujRDe3SJc+n7CtRkdb2hMO5sQLnxoLwS8IRfvbGzZwfL04ZSB/81mscH54AghRCMxE475Ls2tTJd44OM1EMROaff+gabtzaW3em8NO7+nnqxxcqZc1RxQ9FUJVpGcRrw1vLKQy2lINelOomX/LJFsscu5Dhx2fGgrDahTQnhmdO4rm5J1WxRV+/uYd3b+mmr33SqbYSBafVNE2EVPUccC58nhaR14AtwJ3AnvCwR4EDBAJxJ/CYBqPGIRHpFZHN4bHPquolgFDIbhORA0C3qj4ftj8G/ByBCM2rj/Be1wzVawGuwA/eHuWex15gZ38n99/+rjkzLtcTr88SrBe5EoSFzo7lcESCIlgEGzcdEYqeTzsuH9yxjj/46+Nz3mvSDezJxy6ME3MdPN+jrMrx4QwPPX0UR4I6P044U6leW5rML+ehGokSeDMMMPVaPQUvXLxyCPK6CTCWnZqHq+gr3/jhObb1tU25Tqbo0R4XxgtzWyIujOd5YyhDf2eC7euC2c+TL57hxq29PHDHDVNmCpE7bl1HvFJfyREhEeapi5yArgN9qWTd8NZSh8FaZYKIUt3ki8FenJfPjHH0XOhUG0qTL9X/t+lOxXjP1h6u2xgIznu29jDQlaxkGjDBWRyWZE1IRK4GfhL4HrAxGvRV9ZyIRH+FW4DTVacNhm2ztQ/WaWcBfawpEYrWAjxfOTdWCDJdi3DqUnbWhelIvIphyOL8WJ4X3x7l4+/eyNHzGc6M5iZDWE4gOJUZhRP8hxVfyRTKfPW7b865dwfAkSDLwq/86ZFKXR+AssLr59O4rtCRcCl5ykSx/mBSvf1iY1eSTKFM2VMKs2RGqIcPFWv2TGeeHs2RjDlcGMvxvZMX52XfHs2WKgaI/q7UlDWax/feMuXfJEqS2tOWqpQUH8nkGckUp4QmPT8ou9yVmqwqW13ye6nCYFdqgmhUwKrDaqdHs/zo9BivnhsPc6oFlXTr0R538TSYUXYmY2E4U/j0zT/BR27YVCmQaCw+TRchEekE/i/g11V1fJYdvPXemCncP1v7rLfTyDkispcgzMf27dvnuOTKI1oLODkyUSmJoAQDbL3s2NEA8OLbo/i+j48Qk6DgW8nz+cYPp2t49Ywk+qV7flCfR73J9rn+wfJlj3/x+ItTBCjCB9QLRO0n1nfwZhj2mo1oz1Eq5jTUfy3lsKT3bNTmgmsUJZh5ZYse47kS3W0zF7Krt54zFpYcdx2ZYnsvekEl1x+cHuWexw5z7UAn9922qxICW4rZyJWYIGYTsA++cz2Fss+FsTwvDV7mlTOTgnNxhiSebfEwieemLm64qocbt/bwe3/5KpeyRToSsYpVPFss88fPv8XH3rN5UX8XxlSaKkIiEicQoD9V1b8Imy9EIbAw3DYUtg8C26pO3wqcDdv31LQfCNu31jl+IX1MQVX3Afsg2CfU8AdeIURrAdUlsKO1g9pBr3oA8Hwfzw/q87hut
M4yd3+RwJWqZkZReyOkC/U39kXXiDaQzoeZ6qgsBxQYvJxjK8F6WL01mq5kjONDmSnlNaKZnUNQbrxUlfR1JFMKHXhwcmRiya3Ys5kg5prlRKVEyn7gwuzrSFD2fP7VN17m2o1dvH4hzbmxuZN4Xn9VNzdu6eHajd20JYJNn9Hf/4V0YdpepdWa8Xu50Ux3nABfA15T1f9Q9dZ+4G7gwfDxm1XtvyoiTxCYBcZCEXkG+Lci0hce91Hgc6p6SUTSInILQZjv08AfLKSPxf7sy51oLcANLbwC+CjtiVileuld+w5NSaHTnoiRjLmVnd5l38d13IaFpNae3SiNiFzMEfo6kkBmQX0sRzxfOXM5x1W9bdPWaA4cHWI4U6ik9yl5HhMXs7gOoJPOuNpMDI5Tf8a7FGs1M5kg1Pe590+OVMwVZc/n8/tf4X9T5YPv3MDTL5/j6Pl05TNlSz7Zy5OCc7ZKfByBazZ0BDOczT28Z2vgVOtIBk612YqvLSeTxlqjaRkTRORDwN8ALzOZWeVfEQjG14HtwNvAJ0NBEeA/EjjcssAvqerh8Fr/LDwX4N+o6n8J23czadH+FvAvQov2+vn2MROrNWPCgaNDPPT0Ud4YyhB3ha6ky2i42L6ld9JenS2W2dSdQkRI50u8dTFbiYUmYs6CQ0+LzZVvGV2e/OZHdlbKf0fcte8QJ0cyDKULhIkokPDHDe17riMUypOZJhyBZMzF12BN65oNHYzlSnzxzndXZrrV5oQH7rhhUYWoekYd9TOWK1VKvjsE6zk+Qdb1ZMylryPB8aGZv1gkYw7/484NXL+5mxu39fDuq3rpbY8vqNpnvftrxu9hLdFoxgRL2zMHq1WEIqrXewTY1JOqpP7PFssMpwu0J1zS+TJFz8fzZ9/X0wxiEhgRVhPzyTnXmXR4z5a+ygzlQw89x+hEkbKnlcFWCcKdm7uSDE+UKPs+sdCu7mlgNY+5gXX7qt6gHtJAmHppMVK0NEJlxnVpgs09bQylC7x9KVuxkjeyoBtpi6/wyC/exM/csHHRauFUp6dqdYnu1YCJ0CKx2kUoYqb8XceHxpnBUGQsgIXO2HrbYsRch/7OJCdGJiiWfWIOxNwgY0LJCzbjigg7+zsQETKFMp3JGOlckXPpAnHHYWN3sjLLfeCOG/idb/647r/7+fE8Owe65hWiqxfW+6nr+smXPE5dzPLDt0d5+cw4r54bnzWJJwT26E/ctJWDb4xQKHsIgfW/FBovrl7XztO/8VML+E0aS4XljjPmRWfC5fhwJnBXiYQb+hae6mYxEYJ1jqWegTWDhX6Ey7kyAlzMFIP9TgT59cALy64He4KSbpCiqOT5fPHOd08rBT44mp1SFmLbwelrIRcnCqTzZYbS+Ybt1FE4K+ZAW8zhzeEMv/r4i2zpbefCeH7GJJ7VGR6ifVyOwJd//n38vXdtnBIm29HfWQmT3X/7uxb4mzSWGyZCa4xoLejESGBnvmZ9Ox9/z2YuhuEdUPIN7N9ZSpRgcFqt6z6NUp3NIfpdlPxg0HYdcMVhoHtyf9FDTx+d03BQb8PqpYkSfe3xWe3UB44O8YfffZNTFyfoTsW5OFEgWwwEotrC/vqFdOV53BWuC1Pb3Lilh7Ln87W/PUnZ9xnPlSl4PjHX4TN73oGIcNe+Q5wezdKVjFWyaluYbPVhIrSGOHB0iN9+8qXKpkiANy5keP3CMRwJcqTNVXOnlSzfO1taojE+NMMhIiRdYaB7cj0vqN+U5er17eGesAz3/skRulIxdg50TSkoOFEoVZKe7hzo4nK2yIbO5JQ+2+Iub13M8N03hvjGi2d45pXzFcE5P14/c3Qt3UmXf/mRa/l779pYabtmQ+e0dRhgyr6gaPZTPbMzVg+2JjQHK31NqDpOP54rkS95iAR7fMqef8W1cIzWETkU37+9j6F0nrIXFAaMDCSuwK7NPYznSpwdCzI+xx1hc28bY7kSQrAPaSxbCmYhTjALeerlc5wYy
[remaining base64-encoded PNG data omitted]\n", 314 | "text/plain": "
" 315 | }, 316 | "metadata": { 317 | "needs_background": "light" 318 | }, 319 | "output_type": "display_data" 320 | } 321 | ], 322 | "source": "sns.regplot(x='sqft_above', y='price', data=df)" 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "metadata": {}, 327 | "source": "\nWe can use the Pandas method corr() to find the feature other than price that is most correlated with price." 328 | }, 329 | { 330 | "cell_type": "code", 331 | "execution_count": 32, 332 | "metadata": { 333 | "jupyter": { 334 | "outputs_hidden": false 335 | } 336 | }, 337 | "outputs": [ 338 | { 339 | "data": { 340 | "text/plain": "zipcode -0.053203\nlong 0.021626\ncondition 0.036362\nyr_built 0.054012\nsqft_lot15 0.082447\nsqft_lot 0.089661\nyr_renovated 0.126434\nfloors 0.256794\nwaterfront 0.266369\nlat 0.307003\nbedrooms 0.308797\nsqft_basement 0.323816\nview 0.397293\nbathrooms 0.525738\nsqft_living15 0.585379\nsqft_above 0.605567\ngrade 0.667434\nsqft_living 0.702035\nprice 1.000000\nName: price, dtype: float64" 341 | }, 342 | "execution_count": 32, 343 | "metadata": {}, 344 | "output_type": "execute_result" 345 | } 346 | ], 347 | "source": "df.corr()['price'].sort_values()" 348 | }, 349 | { 350 | "cell_type": "markdown", 351 | "metadata": {}, 352 | "source": "# Module 4: Model Development" 353 | }, 354 | { 355 | "cell_type": "markdown", 356 | "metadata": {}, 357 | "source": "\nWe can fit a linear regression model using the longitude feature 'long' and calculate the R^2." 
358 | }, 359 | { 360 | "cell_type": "code", 361 | "execution_count": 33, 362 | "metadata": { 363 | "jupyter": { 364 | "outputs_hidden": false 365 | } 366 | }, 367 | "outputs": [ 368 | { 369 | "data": { 370 | "text/plain": "0.00046769430149007363" 371 | }, 372 | "execution_count": 33, 373 | "metadata": {}, 374 | "output_type": "execute_result" 375 | } 376 | ], 377 | "source": "X = df[['long']]\nY = df['price']\nlm = LinearRegression()\nlm.fit(X,Y)\nlm.score(X, Y)" 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": {}, 382 | "source": "### Question 6\nFit a linear regression model to predict the 'price' using the feature 'sqft_living' then calculate the R^2. Take a screenshot of your code and the value of the R^2." 383 | }, 384 | { 385 | "cell_type": "code", 386 | "execution_count": 35, 387 | "metadata": { 388 | "jupyter": { 389 | "outputs_hidden": false 390 | } 391 | }, 392 | "outputs": [ 393 | { 394 | "data": { 395 | "text/plain": "0.49285321790379316" 396 | }, 397 | "execution_count": 35, 398 | "metadata": {}, 399 | "output_type": "execute_result" 400 | } 401 | ], 402 | "source": "X1 = df[['sqft_living']]\nY1 = df[['price']]\nlm1 = LinearRegression().fit(X1, Y1)\nlm1.score(X1, Y1)" 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": {}, 407 | "source": "### Question 7\nFit a linear regression model to predict the 'price' using the list of features:" 408 | }, 409 | { 410 | "cell_type": "code", 411 | "execution_count": 37, 412 | "metadata": {}, 413 | "outputs": [ 414 | { 415 | "data": { 416 | "text/plain": "0.657679183672129" 417 | }, 418 | "execution_count": 37, 419 | "metadata": {}, 420 | "output_type": "execute_result" 421 | } 422 | ], 423 | "source": "features =[\"floors\", \"waterfront\",\"lat\" ,\"bedrooms\" ,\"sqft_basement\" ,\"view\" ,\"bathrooms\",\"sqft_living15\",\"sqft_above\",\"grade\",\"sqft_living\"]\nlm2 = LinearRegression().fit(df[features], df[['price']])\nlm2.score(df[features], df[['price']])" 424 | }, 425 | { 426 | 
"cell_type": "markdown", 427 | "metadata": {}, 428 | "source": "Then calculate the R^2. Take a screenshot of your code." 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "metadata": { 433 | "jupyter": { 434 | "outputs_hidden": false 435 | } 436 | }, 437 | "source": "" 438 | }, 439 | { 440 | "cell_type": "markdown", 441 | "metadata": {}, 442 | "source": "### This will help with Question 8\n\nCreate a list of tuples. The first element in each tuple contains the name of the estimator:\n\n'scale'\n\n'polynomial'\n\n'model'\n\nThe second element in each tuple contains the model constructor:\n\nStandardScaler()\n\nPolynomialFeatures(include_bias=False)\n\nLinearRegression()\n" 443 | }, 444 | { 445 | "cell_type": "code", 446 | "execution_count": 38, 447 | "metadata": {}, 448 | "outputs": [], 449 | "source": "Input=[('scale',StandardScaler()),('polynomial', PolynomialFeatures(include_bias=False)),('model',LinearRegression())]" 450 | }, 451 | { 452 | "cell_type": "markdown", 453 | "metadata": {}, 454 | "source": "### Question 8\nUse the list to create a pipeline object to predict the 'price', fit the object using the features in the list features, and calculate the R^2." 
455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": 40, 459 | "metadata": { 460 | "jupyter": { 461 | "outputs_hidden": false 462 | } 463 | }, 464 | "outputs": [ 465 | { 466 | "name": "stderr", 467 | "output_type": "stream", 468 | "text": "/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.\n return self.partial_fit(X, y)\n/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/base.py:467: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.\n return self.fit(X, y, **fit_params).transform(X)\n/opt/conda/envs/Python36/lib/python3.6/site-packages/sklearn/pipeline.py:511: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.\n Xt = transform.transform(Xt)\n" 469 | }, 470 | { 471 | "data": { 472 | "text/plain": "0.7513408553309376" 473 | }, 474 | "execution_count": 40, 475 | "metadata": {}, 476 | "output_type": "execute_result" 477 | } 478 | ], 479 | "source": "pipe = Pipeline(Input)\npipe.fit(df[features], df[['price']])\npipe.score(df[features], df[['price']])" 480 | }, 481 | { 482 | "cell_type": "markdown", 483 | "metadata": {}, 484 | "source": "# Module 5: Model Evaluation and Refinement" 485 | }, 486 | { 487 | "cell_type": "markdown", 488 | "metadata": {}, 489 | "source": "Import the necessary modules:" 490 | }, 491 | { 492 | "cell_type": "code", 493 | "execution_count": 41, 494 | "metadata": { 495 | "jupyter": { 496 | "outputs_hidden": false 497 | } 498 | }, 499 | "outputs": [ 500 | { 501 | "name": "stdout", 502 | "output_type": "stream", 503 | "text": "done\n" 504 | } 505 | ], 506 | "source": "from sklearn.model_selection import cross_val_score\nfrom sklearn.model_selection import train_test_split\nprint(\"done\")" 507 | }, 508 | { 509 | "cell_type": "markdown", 510 | "metadata": {}, 511 | 
"source": "We will split the data into training and testing sets:" 512 | }, 513 | { 514 | "cell_type": "code", 515 | "execution_count": 42, 516 | "metadata": { 517 | "jupyter": { 518 | "outputs_hidden": false 519 | } 520 | }, 521 | "outputs": [ 522 | { 523 | "name": "stdout", 524 | "output_type": "stream", 525 | "text": "number of test samples: 3242\nnumber of training samples: 18371\n" 526 | } 527 | ], 528 | "source": "features =[\"floors\", \"waterfront\",\"lat\" ,\"bedrooms\" ,\"sqft_basement\" ,\"view\" ,\"bathrooms\",\"sqft_living15\",\"sqft_above\",\"grade\",\"sqft_living\"] \nX = df[features]\nY = df['price']\n\nx_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.15, random_state=1)\n\n\nprint(\"number of test samples:\", x_test.shape[0])\nprint(\"number of training samples:\",x_train.shape[0])" 529 | }, 530 | { 531 | "cell_type": "markdown", 532 | "metadata": {}, 533 | "source": "### Question 9\nCreate and fit a Ridge regression object using the training data, set the regularization parameter to 0.1, and calculate the R^2 using the test data. \n" 534 | }, 535 | { 536 | "cell_type": "code", 537 | "execution_count": 43, 538 | "metadata": {}, 539 | "outputs": [], 540 | "source": "from sklearn.linear_model import Ridge" 541 | }, 542 | { 543 | "cell_type": "code", 544 | "execution_count": 44, 545 | "metadata": { 546 | "jupyter": { 547 | "outputs_hidden": false 548 | } 549 | }, 550 | "outputs": [ 551 | { 552 | "data": { 553 | "text/plain": "0.6478759163939121" 554 | }, 555 | "execution_count": 44, 556 | "metadata": {}, 557 | "output_type": "execute_result" 558 | } 559 | ], 560 | "source": "RR = Ridge(alpha=0.1).fit(x_train, y_train)\nRR.score(x_test, y_test)" 561 | }, 562 | { 563 | "cell_type": "markdown", 564 | "metadata": {}, 565 | "source": "### Question 10\nPerform a second order polynomial transform on both the training data and testing data. 
Create and fit a Ridge regression object using the training data, set the regularisation parameter to 0.1, and calculate the R^2 utilising the test data provided. Take a screenshot of your code and the R^2." 566 | }, 567 | { 568 | "cell_type": "code", 569 | "execution_count": 51, 570 | "metadata": { 571 | "jupyter": { 572 | "outputs_hidden": false 573 | } 574 | }, 575 | "outputs": [ 576 | { 577 | "data": { 578 | "text/plain": "0.7002744279699229" 579 | }, 580 | "execution_count": 51, 581 | "metadata": {}, 582 | "output_type": "execute_result" 583 | } 584 | ], 585 | "source": "from sklearn.preprocessing import PolynomialFeatures\npoly = PolynomialFeatures(degree=2)\nx_train_poly = poly.fit_transform(x_train)\nx_test_poly = poly.transform(x_test)\nRR1 = Ridge(alpha=0.1).fit(x_train_poly, y_train)\nRR1.score(x_test_poly, y_test)" 586 | }, 587 | { 588 | "cell_type": "markdown", 589 | "metadata": {}, 590 | "source": "
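The R^2 values reported by the score calls in the cells above are coefficients of determination, that is, 1 minus the ratio of the residual sum of squares to the total sum of squares. As a minimal sketch using only the standard library (the function name r_squared and the toy numbers are hypothetical and are not part of the original notebook), the quantity can be computed by hand roughly as follows:

```python
def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Toy numbers for illustration only, not taken from the King County data.
actual = [3.0, 5.0, 7.0, 9.0]
perfect = [3.0, 5.0, 7.0, 9.0]    # predictions that match exactly
baseline = [6.0, 6.0, 6.0, 6.0]   # always predicts the mean of actual

print(r_squared(actual, perfect))   # 1.0
print(r_squared(actual, baseline))  # 0.0
```

A model that always predicts the mean scores 0, which is why the value of roughly 0.0005 for the 'long' feature indicates almost no explanatory power, while the values near 0.65 to 0.75 for the multi-feature models indicate a much better fit.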

Once you complete your notebook, you will have to share it. Select the icon on the top right, as marked in red in the image below. A dialogue box should open; select the option all content excluding sensitive code cells.

\n


\n

\n

You can then share the notebook via a URL by scrolling down as shown in the following image:

\n


\n

 

" 591 | }, 592 | { 593 | "cell_type": "markdown", 594 | "metadata": {}, 595 | "source": "

About the Authors:

\n\nJoseph Santarcangelo has a PhD in Electrical Engineering; his research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD." 596 | }, 597 | { 598 | "cell_type": "markdown", 599 | "metadata": {}, 600 | "source": "Other contributors: Michelle Carey, Mavis Zhou " 601 | }, 602 | { 603 | "cell_type": "code", 604 | "execution_count": null, 605 | "metadata": {}, 606 | "outputs": [], 607 | "source": "" 608 | } 609 | ], 610 | "metadata": { 611 | "kernelspec": { 612 | "display_name": "Python 3.6", 613 | "language": "python", 614 | "name": "python3" 615 | }, 616 | "language_info": { 617 | "codemirror_mode": { 618 | "name": "ipython", 619 | "version": 3 620 | }, 621 | "file_extension": ".py", 622 | "mimetype": "text/x-python", 623 | "name": "python", 624 | "nbconvert_exporter": "python", 625 | "pygments_lexer": "ipython3", 626 | "version": "3.6.9" 627 | }, 628 | "widgets": { 629 | "state": {}, 630 | "version": "1.1.2" 631 | } 632 | }, 633 | "nbformat": 4, 634 | "nbformat_minor": 4 635 | } -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # IBM Data Science Professional Certificate Projects 2 | 3 | This repository contains the projects/assignments for courses in the *IBM Data Science Professional Certificate* on Coursera. The professional certificate contains 9 courses. These are as follows: 4 | 1. What is Data Science? 5 | 2. Tools for Data Science 6 | 3. Data Science Methodology 7 | 4. Python for Data Science and AI 8 | 5. Databases and SQL for Data Science 9 | 6. Data Analysis with Python 10 | 7. Data Visualization with Python 11 | 8. Machine Learning with Python 12 | 9. Applied Data Science Capstone 13 | 14 | Project/assignment notebooks for courses 2, 4, 5, 6, 7, 8 and 9 are included in this repository. 
Courses 1 and 3 only have quizzes as part of their assignments. Hence, there are no notebooks for them. 15 | 16 | The last course, *Applied Data Science Capstone*, has multiple submissions. These include: 17 | 1. A basic introduction to the capstone project notebook for week 1 of the course 18 | 2. An assignment notebook for week 3 of the course 19 | 3. A PDF document highlighting the introduction and data collection for the final project for week 4 of the course 20 | 4. The final project notebook, project report, and project presentation for week 5 of the course 21 | 22 | All of the files mentioned above can be found at https://github.com/raunakbhutoria/Coursera_Capstone. The Coursera_Capstone repository was the one used for making all submissions for the capstone course. Thus, in this repository I will only be including the Week 3 assignment notebook and the final project code notebook. In order to view the report, presentation, and other materials of the capstone course, please visit my Coursera_Capstone repository. 23 | 24 | # Thank You! 25 | -------------------------------------------------------------------------------- /SQL Assignment - Chicago.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": "\n\n

Assignment: Notebook for Peer Assignment

" 7 | }, 8 | { 9 | "cell_type": "markdown", 10 | "metadata": {}, 11 | "source": "# Introduction\n\nUsing this Python notebook you will:\n1. Understand 3 Chicago datasets \n1. Load the 3 datasets into 3 tables in a Db2 database\n1. Execute SQL queries to answer assignment questions " 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": "## Understand the datasets \nTo complete the assignment problems in this notebook you will be using three datasets that are available on the city of Chicago's Data Portal:\n1. Socioeconomic Indicators in Chicago\n1. Chicago Public Schools\n1. Chicago Crime Data\n\n### 1. Socioeconomic Indicators in Chicago\nThis dataset contains a selection of six socioeconomic indicators of public health significance and a \u201chardship index,\u201d for each Chicago community area, for the years 2008 \u2013 2012.\n\nFor this assignment you will use a snapshot of this dataset which can be downloaded from:\nhttps://ibm.box.com/shared/static/05c3415cbfbtfnr2fx4atenb2sd361ze.csv\n\nA detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\nhttps://data.cityofchicago.org/Health-Human-Services/Census-Data-Selected-socioeconomic-indicators-in-C/kn9c-c2s2\n\n\n\n### 2. Chicago Public Schools\n\nThis dataset shows all school level performance data used to create CPS School Report Cards for the 2011-2012 school year. This dataset is provided by the city of Chicago's Data Portal.\n\nFor this assignment you will use a snapshot of this dataset which can be downloaded from:\nhttps://ibm.box.com/shared/static/f9gjvj1gjmxxzycdhplzt01qtz0s7ew7.csv\n\nA detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\nhttps://data.cityofchicago.org/Education/Chicago-Public-Schools-Progress-Report-Cards-2011-/9xs2-f89t\n\n\n\n\n### 3. 
Chicago Crime Data \n\nThis dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. \n\nThis dataset is quite large - over 1.5GB in size with over 6.5 million rows. For the purposes of this assignment we will use a much smaller sample of this dataset which can be downloaded from:\nhttps://ibm.box.com/shared/static/svflyugsr9zbqy5bmowgswqemfpm1x7f.csv\n\nA detailed description of this dataset and the original dataset can be obtained from the Chicago Data Portal at:\nhttps://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2\n" 17 | }, 18 | { 19 | "cell_type": "markdown", 20 | "metadata": {}, 21 | "source": "### Download the datasets\nIn many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. Click on the links below to download and save the datasets (.CSV files):\n1. __CENSUS_DATA:__ https://ibm.box.com/shared/static/05c3415cbfbtfnr2fx4atenb2sd361ze.csv\n1. __CHICAGO_PUBLIC_SCHOOLS__ https://ibm.box.com/shared/static/f9gjvj1gjmxxzycdhplzt01qtz0s7ew7.csv\n1. __CHICAGO_CRIME_DATA:__ https://ibm.box.com/shared/static/svflyugsr9zbqy5bmowgswqemfpm1x7f.csv\n\n__NOTE:__ Ensure you have downloaded the datasets using the links above instead of directly from the Chicago Data Portal. The versions linked here are subsets of the original datasets and have some of the column names modified to be more database friendly which will make it easier to complete this assignment." 
 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": "### Store the datasets in database tables\nTo analyze the data using SQL, it first needs to be stored in the database.\n\nWhile it is easier to read the dataset into a Pandas dataframe and then PERSIST it into the database as we saw in Week 3 Lab 3, it results in mapping to default datatypes which may not be optimal for SQL querying. For example, a long textual field may map to a CLOB instead of a VARCHAR. \n\nTherefore, __it is highly recommended to manually load the table using the database console LOAD tool, as indicated in Week 2 Lab 1 Part II__. The only difference with that lab is that in Step 5 of the instructions you will need to click on create \"(+) New Table\" and specify the name of the table you want to create and then click \"Next\". \n\n\n\n##### Now open the Db2 console, open the LOAD tool, select or drag the .CSV file for the first dataset, then create a New Table, and follow the on-screen instructions to load the data. Name the new tables as follows:\n1. __CENSUS_DATA__\n1. __CHICAGO_PUBLIC_SCHOOLS__\n1. __CHICAGO_CRIME_DATA__" 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": "### Connect to the database \nLet us first load the SQL extension and establish a connection with the database" 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": 1, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": "%load_ext sql" 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": "In the next cell enter your db2 connection string. Recall you created Service Credentials for your Db2 instance in the first lab in Week 3. 
From the __uri__ field of your Db2 service credentials copy everything after db2:// (except the double quote at the end) and paste it in the cell below after ibm_db_sa://\n\n" 44 | }, 45 | { 46 | "cell_type": "code", 47 | "execution_count": 2, 48 | "metadata": {}, 49 | "outputs": [ 50 | { 51 | "data": { 52 | "text/plain": "'Connected: pgq43854@BLUDB'" 53 | }, 54 | "execution_count": 2, 55 | "metadata": {}, 56 | "output_type": "execute_result" 57 | } 58 | ], 59 | "source": "# Remember the connection string is of the format:\n# %sql ibm_db_sa://my-username:my-password@my-hostname:my-port/my-db-name\n# Enter the connection string for your Db2 on Cloud database instance below\n%sql ibm_db_sa://pgq43854:3js4p2w4dzrc85%5Ev@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB" 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": "## Problems\nNow write and execute SQL queries to solve assignment problems\n\n### Problem 1\n\n##### Find the total number of crimes recorded in the CRIME table" 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 10, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "name": "stdout", 73 | "output_type": "stream", 74 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 75 | }, 76 | { 77 | "data": { 78 | "text/html": "\n \n \n \n \n \n \n
count_of_crimes
533
", 79 | "text/plain": "[(Decimal('533'),)]" 80 | }, 81 | "execution_count": 10, 82 | "metadata": {}, 83 | "output_type": "execute_result" 84 | } 85 | ], 86 | "source": "# Rows in Crime table\n%sql SELECT COUNT(*) as count_of_crimes FROM CHICAGO_CRIME_DATA" 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": {}, 91 | "source": "### Problem 2\n\n##### Retrieve first 10 rows from the CRIME table\n" 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 11, 96 | "metadata": {}, 97 | "outputs": [ 98 | { 99 | "name": "stdout", 100 | "output_type": "stream", 101 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 102 | }, 103 | { 104 | "data": { 105 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
id | case_number | DATE | block | iucr | primary_type | description | location_description | arrest | domestic | beat | district | ward | community_area_number | fbicode | x_coordinate | y_coordinate | YEAR | updatedon | latitude | longitude | location
3512276 | HK587712 | 08/28/2004 05:50:56 PM | 047XX S KEDZIE AVE | 890 | THEFT | FROM BUILDING | SMALL RETAIL STORE | FALSE | FALSE | 911 | 9 | 14 | 58 | 6 | 1155838 | 1873050 | 2004 | 2018-02-10 15:50:01 | 41.80744050 | -87.70395585 | (41.8074405, -87.703955849)
3406613 | HK456306 | 06/26/2004 12:40:00 PM | 009XX N CENTRAL PARK AVE | 820 | THEFT | $500 AND UNDER | OTHER | FALSE | FALSE | 1112 | 11 | 27 | 23 | 6 | 1152206 | 1906127 | 2004 | 2018-02-28 15:56:25 | 41.89827996 | -87.71640551 | (41.898279962, -87.716405505)
8002131 | HT233595 | 04/04/2011 05:45:00 AM | 043XX S WABASH AVE | 820 | THEFT | $500 AND UNDER | NURSING HOME/RETIREMENT HOME | FALSE | FALSE | 221 | 2 | 3 | 38 | 6 | 1177436 | 1876313 | 2011 | 2018-02-10 15:50:01 | 41.81593313 | -87.62464213 | (41.815933131, -87.624642127)
7903289 | HT133522 | 12/30/2010 04:30:00 PM | 083XX S KINGSTON AVE | 840 | THEFT | FINANCIAL ID THEFT: OVER $300 | RESIDENCE | FALSE | FALSE | 423 | 4 | 7 | 46 | 6 | 1194622 | 1850125 | 2010 | 2018-02-10 15:50:01 | 41.74366532 | -87.56246276 | (41.743665322, -87.562462756)
10402076 | HZ138551 | 02/02/2016 07:30:00 PM | 033XX W 66TH ST | 820 | THEFT | $500 AND UNDER | ALLEY | FALSE | FALSE | 831 | 8 | 15 | 66 | 6 | 1155240 | 1860661 | 2016 | 2018-02-10 15:50:01 | 41.77345530 | -87.70648047 | (41.773455295, -87.706480471)
7732712 | HS540106 | 09/29/2010 07:59:00 AM | 006XX W CHICAGO AVE | 810 | THEFT | OVER $500 | PARKING LOT/GARAGE(NON.RESID.) | FALSE | FALSE | 1323 | 12 | 27 | 24 | 6 | 1171668 | 1905607 | 2010 | 2018-02-10 15:50:01 | 41.89644677 | -87.64493868 | (41.896446772, -87.644938678)
10769475 | HZ534771 | 11/30/2016 01:15:00 AM | 050XX N KEDZIE AVE | 810 | THEFT | OVER $500 | STREET | FALSE | FALSE | 1713 | 17 | 33 | 14 | 6 | 1154133 | 1933314 | 2016 | 2018-02-10 15:50:01 | 41.97284491 | -87.70860008 | (41.972844913, -87.708600079)
4494340 | HL793243 | 12/16/2005 04:45:00 PM | 005XX E PERSHING RD | 860 | THEFT | RETAIL THEFT | GROCERY FOOD STORE | TRUE | FALSE | 213 | 2 | 3 | 38 | 6 | 1180448 | 1879234 | 2005 | 2018-02-28 15:56:25 | 41.82387989 | -87.61350386 | (41.823879885, -87.613503857)
3778925 | HL149610 | 01/28/2005 05:00:00 PM | 100XX S WASHTENAW AVE | 810 | THEFT | OVER $500 | STREET | FALSE | FALSE | 2211 | 22 | 19 | 72 | 6 | 1160129 | 1838040 | 2005 | 2018-02-28 15:56:25 | 41.71128051 | -87.68917910 | (41.711280513, -87.689179097)
3324217 | HK361551 | 05/13/2004 02:15:00 PM | 033XX W BELMONT AVE | 820 | THEFT | $500 AND UNDER | SMALL RETAIL STORE | FALSE | FALSE | 1733 | 17 | 35 | 21 | 6 | 1153590 | 1921084 | 2004 | 2018-02-28 15:56:25 | 41.93929582 | -87.71092344 | (41.939295821, -87.710923442)
", 106 | "text/plain": "[(3512276, 'HK587712', '08/28/2004 05:50:56 PM', '047XX S KEDZIE AVE', '890', 'THEFT', 'FROM BUILDING', 'SMALL RETAIL STORE', 'FALSE', 'FALSE', 911, 9, 14, 58, '6', 1155838, 1873050, 2004, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.80744050'), Decimal('-87.70395585'), '(41.8074405, -87.703955849)'),\n (3406613, 'HK456306', '06/26/2004 12:40:00 PM', '009XX N CENTRAL PARK AVE', '820', 'THEFT', '$500 AND UNDER', 'OTHER', 'FALSE', 'FALSE', 1112, 11, 27, 23, '6', 1152206, 1906127, 2004, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.89827996'), Decimal('-87.71640551'), '(41.898279962, -87.716405505)'),\n (8002131, 'HT233595', '04/04/2011 05:45:00 AM', '043XX S WABASH AVE', '820', 'THEFT', '$500 AND UNDER', 'NURSING HOME/RETIREMENT HOME', 'FALSE', 'FALSE', 221, 2, 3, 38, '6', 1177436, 1876313, 2011, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.81593313'), Decimal('-87.62464213'), '(41.815933131, -87.624642127)'),\n (7903289, 'HT133522', '12/30/2010 04:30:00 PM', '083XX S KINGSTON AVE', '840', 'THEFT', 'FINANCIAL ID THEFT: OVER $300', 'RESIDENCE', 'FALSE', 'FALSE', 423, 4, 7, 46, '6', 1194622, 1850125, 2010, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.74366532'), Decimal('-87.56246276'), '(41.743665322, -87.562462756)'),\n (10402076, 'HZ138551', '02/02/2016 07:30:00 PM', '033XX W 66TH ST', '820', 'THEFT', '$500 AND UNDER', 'ALLEY', 'FALSE', 'FALSE', 831, 8, 15, 66, '6', 1155240, 1860661, 2016, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.77345530'), Decimal('-87.70648047'), '(41.773455295, -87.706480471)'),\n (7732712, 'HS540106', '09/29/2010 07:59:00 AM', '006XX W CHICAGO AVE', '810', 'THEFT', 'OVER $500', 'PARKING LOT/GARAGE(NON.RESID.)', 'FALSE', 'FALSE', 1323, 12, 27, 24, '6', 1171668, 1905607, 2010, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.89644677'), Decimal('-87.64493868'), '(41.896446772, -87.644938678)'),\n (10769475, 'HZ534771', '11/30/2016 01:15:00 AM', '050XX N KEDZIE 
AVE', '810', 'THEFT', 'OVER $500', 'STREET', 'FALSE', 'FALSE', 1713, 17, 33, 14, '6', 1154133, 1933314, 2016, datetime.datetime(2018, 2, 10, 15, 50, 1), Decimal('41.97284491'), Decimal('-87.70860008'), '(41.972844913, -87.708600079)'),\n (4494340, 'HL793243', '12/16/2005 04:45:00 PM', '005XX E PERSHING RD', '860', 'THEFT', 'RETAIL THEFT', 'GROCERY FOOD STORE', 'TRUE', 'FALSE', 213, 2, 3, 38, '6', 1180448, 1879234, 2005, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.82387989'), Decimal('-87.61350386'), '(41.823879885, -87.613503857)'),\n (3778925, 'HL149610', '01/28/2005 05:00:00 PM', '100XX S WASHTENAW AVE', '810', 'THEFT', 'OVER $500', 'STREET', 'FALSE', 'FALSE', 2211, 22, 19, 72, '6', 1160129, 1838040, 2005, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.71128051'), Decimal('-87.68917910'), '(41.711280513, -87.689179097)'),\n (3324217, 'HK361551', '05/13/2004 02:15:00 PM', '033XX W BELMONT AVE', '820', 'THEFT', '$500 AND UNDER', 'SMALL RETAIL STORE', 'FALSE', 'FALSE', 1733, 17, 35, 21, '6', 1153590, 1921084, 2004, datetime.datetime(2018, 2, 28, 15, 56, 25), Decimal('41.93929582'), Decimal('-87.71092344'), '(41.939295821, -87.710923442)')]" 107 | }, 108 | "execution_count": 11, 109 | "metadata": {}, 110 | "output_type": "execute_result" 111 | } 112 | ], 113 | "source": "%sql SELECT * FROM CHICAGO_CRIME_DATA LIMIT 10" 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "metadata": {}, 118 | "source": "### Problem 3\n\n##### How many crimes involve an arrest?" 119 | }, 120 | { 121 | "cell_type": "code", 122 | "execution_count": 12, 123 | "metadata": {}, 124 | "outputs": [ 125 | { 126 | "name": "stdout", 127 | "output_type": "stream", 128 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 129 | }, 130 | { 131 | "data": { 132 | "text/html": "\n \n \n \n \n \n \n
count_of_arrest
163
", 133 | "text/plain": "[(Decimal('163'),)]" 134 | }, 135 | "execution_count": 12, 136 | "metadata": {}, 137 | "output_type": "execute_result" 138 | } 139 | ], 140 | "source": "%sql SELECT COUNT(*) as count_of_arrest FROM CHICAGO_CRIME_DATA WHERE ARREST = 'TRUE'" 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": {}, 145 | "source": "### Problem 4\n\n##### Which unique types of crimes have been recorded at GAS STATION locations?\n" 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 64, 150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "name": "stdout", 154 | "output_type": "stream", 155 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 156 | }, 157 | { 158 | "data": { 159 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
unique_types_of_crimes
CRIMINAL TRESPASS
NARCOTICS
ROBBERY
THEFT
", 160 | "text/plain": "[('CRIMINAL TRESPASS',), ('NARCOTICS',), ('ROBBERY',), ('THEFT',)]" 161 | }, 162 | "execution_count": 64, 163 | "metadata": {}, 164 | "output_type": "execute_result" 165 | } 166 | ], 167 | "source": "%%sql\nSELECT PRIMARY_TYPE as unique_types_of_crimes FROM CHICAGO_CRIME_DATA \nWHERE LOCATION_DESCRIPTION = 'GAS STATION' GROUP BY PRIMARY_TYPE" 168 | }, 169 | { 170 | "cell_type": "markdown", 171 | "metadata": {}, 172 | "source": "Hint: Which column lists types of crimes e.g. THEFT?" 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": {}, 177 | "source": "### Problem 5\n\n##### In the CENSUS_DATA table list all Community Areas whose names start with the letter \u2018B\u2019." 178 | }, 179 | { 180 | "cell_type": "code", 181 | "execution_count": 16, 182 | "metadata": {}, 183 | "outputs": [ 184 | { 185 | "name": "stdout", 186 | "output_type": "stream", 187 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 188 | }, 189 | { 190 | "data": { 191 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
community_area_name
Belmont Cragin
Burnside
Brighton Park
Bridgeport
Beverly
", 192 | "text/plain": "[('Belmont Cragin',),\n ('Burnside',),\n ('Brighton Park',),\n ('Bridgeport',),\n ('Beverly',)]" 193 | }, 194 | "execution_count": 16, 195 | "metadata": {}, 196 | "output_type": "execute_result" 197 | } 198 | ], 199 | "source": "%sql SELECT COMMUNITY_AREA_NAME FROM CENSUS_DATA WHERE COMMUNITY_AREA_NAME LIKE 'B%'" 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": {}, 204 | "source": "### Problem 6\n\n##### Which schools in Community Areas 10 to 15 are healthy school certified?" 205 | }, 206 | { 207 | "cell_type": "code", 208 | "execution_count": 29, 209 | "metadata": {}, 210 | "outputs": [ 211 | { 212 | "name": "stdout", 213 | "output_type": "stream", 214 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 215 | }, 216 | { 217 | "data": { 218 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n \n \n
community_area_numbercommunity_area_namename_of_schoolhealthy_school_certified
10NORWOOD PARKRufus M Hitch Elementary SchoolYes
", 219 | "text/plain": "[(10, 'NORWOOD PARK', 'Rufus M Hitch Elementary School', 'Yes')]" 220 | }, 221 | "execution_count": 29, 222 | "metadata": {}, 223 | "output_type": "execute_result" 224 | } 225 | ], 226 | "source": "%%sql\nSELECT CPS.COMMUNITY_AREA_NUMBER, CPS.COMMUNITY_AREA_NAME, NAME_OF_SCHOOL, CPS.HEALTHY_SCHOOL_CERTIFIED \n FROM CHICAGO_PUBLIC_SCHOOLS as CPS JOIN CENSUS_DATA as CD \n ON CD.COMMUNITY_AREA_NUMBER = CPS.COMMUNITY_AREA_NUMBER \n WHERE CD.COMMUNITY_AREA_NUMBER BETWEEN 10 AND 15 AND CPS.HEALTHY_SCHOOL_CERTIFIED = 'Yes'" 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "metadata": {}, 231 | "source": "### Problem 7\n\n##### What is the average school Safety Score? " 232 | }, 233 | { 234 | "cell_type": "code", 235 | "execution_count": 31, 236 | "metadata": {}, 237 | "outputs": [ 238 | { 239 | "name": "stdout", 240 | "output_type": "stream", 241 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 242 | }, 243 | { 244 | "data": { 245 | "text/html": "\n \n \n \n \n \n \n
average_school_safety_score
49.504873
", 246 | "text/plain": "[(Decimal('49.504873'),)]" 247 | }, 248 | "execution_count": 31, 249 | "metadata": {}, 250 | "output_type": "execute_result" 251 | } 252 | ], 253 | "source": "%sql SELECT AVG(SAFETY_SCORE) as average_school_safety_score FROM CHICAGO_PUBLIC_SCHOOLS" 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": "### Problem 8\n\n##### List the top 5 Community Areas by average College Enrollment [number of students] " 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 68, 263 | "metadata": {}, 264 | "outputs": [ 265 | { 266 | "name": "stdout", 267 | "output_type": "stream", 268 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 269 | }, 270 | { 271 | "data": { 272 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n
community_area_nameavg_college_enrollment
ARCHER HEIGHTS2411.500000
MONTCLARE1317.000000
WEST ELSDON1233.333333
BRIGHTON PARK1205.875000
BELMONT CRAGIN1198.833333
", 273 | "text/plain": "[('ARCHER HEIGHTS', Decimal('2411.500000')),\n ('MONTCLARE', Decimal('1317.000000')),\n ('WEST ELSDON', Decimal('1233.333333')),\n ('BRIGHTON PARK', Decimal('1205.875000')),\n ('BELMONT CRAGIN', Decimal('1198.833333'))]" 274 | }, 275 | "execution_count": 68, 276 | "metadata": {}, 277 | "output_type": "execute_result" 278 | } 279 | ], 280 | "source": "%%sql\nSELECT COMMUNITY_AREA_NAME, AVG(COLLEGE_ENROLLMENT) as AVG_COLLEGE_ENROLLMENT FROM \nCHICAGO_PUBLIC_SCHOOLS GROUP BY COMMUNITY_AREA_NAME ORDER BY AVG_COLLEGE_ENROLLMENT DESC LIMIT 5" 281 | }, 282 | { 283 | "cell_type": "markdown", 284 | "metadata": {}, 285 | "source": "### Problem 9\n\n##### Use a sub-query to determine which Community Area has the lowest school Safety Score. " 286 | }, 287 | { 288 | "cell_type": "code", 289 | "execution_count": 55, 290 | "metadata": {}, 291 | "outputs": [ 292 | { 293 | "name": "stdout", 294 | "output_type": "stream", 295 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 296 | }, 297 | { 298 | "data": { 299 | "text/html": "\n \n \n \n \n \n \n \n \n
community_area_namesafety_score
WASHINGTON PARK1
", 300 | "text/plain": "[('WASHINGTON PARK', 1)]" 301 | }, 302 | "execution_count": 55, 303 | "metadata": {}, 304 | "output_type": "execute_result" 305 | } 306 | ], 307 | "source": "%%sql\nSELECT COMMUNITY_AREA_NAME, SAFETY_SCORE FROM CHICAGO_PUBLIC_SCHOOLS\nWHERE SAFETY_SCORE = (SELECT MIN(SAFETY_SCORE) FROM CHICAGO_PUBLIC_SCHOOLS)" 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": "### Problem 10\n\n##### [Without using an explicit JOIN operator] Find the Per Capita Income of the Community Area which has a school Safety Score of 1." 313 | }, 314 | { 315 | "cell_type": "code", 316 | "execution_count": 59, 317 | "metadata": {}, 318 | "outputs": [ 319 | { 320 | "name": "stdout", 321 | "output_type": "stream", 322 | "text": " * ibm_db_sa://pgq43854:***@dashdb-txn-sbox-yp-lon02-02.services.eu-gb.bluemix.net:50000/BLUDB\nDone.\n" 323 | }, 324 | { 325 | "data": { 326 | "text/html": "\n \n \n \n \n \n \n \n \n \n \n
community_area_namesafety_scoreper_capita_income
WASHINGTON PARK113785
", 327 | "text/plain": "[('WASHINGTON PARK', 1, 13785)]" 328 | }, 329 | "execution_count": 59, 330 | "metadata": {}, 331 | "output_type": "execute_result" 332 | } 333 | ], 334 | "source": "%%sql\nSELECT CPS.COMMUNITY_AREA_NAME, CPS.SAFETY_SCORE, CD.PER_CAPITA_INCOME \n FROM CENSUS_DATA as CD, CHICAGO_PUBLIC_SCHOOLS as CPS \n WHERE CPS.SAFETY_SCORE = 1 AND CPS.COMMUNITY_AREA_NUMBER = CD.COMMUNITY_AREA_NUMBER" 335 | }, 336 | { 337 | "cell_type": "markdown", 338 | "metadata": {}, 339 | "source": "Copyright © 2018 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).\n" 340 | } 341 | ], 342 | "metadata": { 343 | "kernelspec": { 344 | "display_name": "Python", 345 | "language": "python", 346 | "name": "conda-env-python-py" 347 | }, 348 | "language_info": { 349 | "codemirror_mode": { 350 | "name": "ipython", 351 | "version": 3 352 | }, 353 | "file_extension": ".py", 354 | "mimetype": "text/x-python", 355 | "name": "python", 356 | "nbconvert_exporter": "python", 357 | "pygments_lexer": "ipython3", 358 | "version": "3.6.11" 359 | }, 360 | "widgets": { 361 | "state": {}, 362 | "version": "1.1.2" 363 | } 364 | }, 365 | "nbformat": 4, 366 | "nbformat_minor": 4 367 | } --------------------------------------------------------------------------------