├── Introduction to Data Science in Python ├── week--3 │ ├── Assignment+3.ipynb │ └── readme ├── week-1 │ └── mcq ├── week-2 │ ├── Assignment+2.ipynb │ └── assignment+2.py └── week-4 │ ├── Assignment+4.ipynb │ └── readme └── README.md /Introduction to Data Science in Python/week--3/Assignment+3.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "---\n", 8 | "\n", 9 | "_You are currently looking at **version 1.5** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._\n", 10 | "\n", 11 | "---" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# Assignment 3 - More Pandas\n", 19 | "This assignment requires more individual learning than the last one did - you are encouraged to check out the [pandas documentation](http://pandas.pydata.org/pandas-docs/stable/) to find functions or methods you might not have used yet, or ask questions on [Stack Overflow](http://stackoverflow.com/) and tag them as pandas and python related. And of course, the discussion forums are open for interaction with your peers and the course staff."
20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "### Question 1 (20%)\n", 27 | "Load the energy data from the file `Energy Indicators.xls`, which is a list of indicators of [energy supply and renewable electricity production](Energy%20Indicators.xls) from the [United Nations](http://unstats.un.org/unsd/environment/excel_file_tables/2013/Energy%20Indicators.xls) for the year 2013, and should be put into a DataFrame with the variable name of **energy**.\n", 28 | "\n", 29 | "Keep in mind that this is an Excel file, and not a comma-separated values file. Also, make sure to exclude the footer and header information from the datafile. The first two columns are unnecessary, so you should get rid of them, and you should change the column labels so that the columns are:\n", 30 | "\n", 31 | "`['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']`\n", 32 | "\n", 33 | "Convert `Energy Supply` to gigajoules (there are 1,000,000 gigajoules in a petajoule). For all countries which have missing data (e.g. data with \"...\") make sure this is reflected as `np.NaN` values.\n", 34 | "\n", 35 | "Rename the following list of countries (for use in later questions):\n", 36 | "\n", 37 | "```\"Republic of Korea\": \"South Korea\",\n", 38 | "\"United States of America\": \"United States\",\n", 39 | "\"United Kingdom of Great Britain and Northern Ireland\": \"United Kingdom\",\n", 40 | "\"China, Hong Kong Special Administrative Region\": \"Hong Kong\"```\n", 41 | "\n", 42 | "There are also several countries with numbers and/or parentheses in their name. Be sure to remove these, \n", 43 | "\n", 44 | "e.g. \n", 45 | "\n", 46 | "`'Bolivia (Plurinational State of)'` should be `'Bolivia'`, \n", 47 | "\n", 48 | "`'Switzerland17'` should be `'Switzerland'`.\n", 49 | "\n", 50 | "
\n", 51 | "\n", 52 | "Next, load the GDP data from the file `world_bank.csv`, which is a csv containing countries' GDP from 1960 to 2015 from [World Bank](http://data.worldbank.org/indicator/NY.GDP.MKTP.CD). Call this DataFrame **GDP**. \n", 53 | "\n", 54 | "Make sure to skip the header, and rename the following list of countries:\n", 55 | "\n", 56 | "```\"Korea, Rep.\": \"South Korea\", \n", 57 | "\"Iran, Islamic Rep.\": \"Iran\",\n", 58 | "\"Hong Kong SAR, China\": \"Hong Kong\"```\n", 59 | "\n", 60 | "
\n", 61 | "\n", 62 | "Finally, load the [Sciamgo Journal and Country Rank data for Energy Engineering and Power Technology](http://www.scimagojr.com/countryrank.php?category=2102) from the file `scimagojr-3.xlsx`, which ranks countries based on their journal contributions in the aforementioned area. Call this DataFrame **ScimEn**.\n", 63 | "\n", 64 | "Join the three datasets: GDP, Energy, and ScimEn into a new dataset (using the intersection of country names). Use only the last 10 years (2006-2015) of GDP data and only the top 15 countries by Scimagojr 'Rank' (Rank 1 through 15). \n", 65 | "\n", 66 | "The index of this DataFrame should be the name of the country, and the columns should be ['Rank', 'Documents', 'Citable documents', 'Citations', 'Self-citations',\n", 67 | " 'Citations per document', 'H index', 'Energy Supply',\n", 68 | " 'Energy Supply per Capita', '% Renewable', '2006', '2007', '2008',\n", 69 | " '2009', '2010', '2011', '2012', '2013', '2014', '2015'].\n", 70 | "\n", 71 | "*This function should return a DataFrame with 20 columns and 15 entries.*" 72 | ] 73 | }, 74 | { 75 | "cell_type": "code", 76 | "execution_count": 34, 77 | "metadata": { 78 | "umich_part_id": "009", 79 | "umich_partlist_id": "003" 80 | }, 81 | "outputs": [ 82 | { 83 | "name": "stderr", 84 | "output_type": "stream", 85 | "text": [ 86 | "/opt/conda/lib/python3.6/site-packages/ipykernel/__main__.py:35: FutureWarning: sort(columns=....) 
is deprecated, use sort_values(by=.....)\n" 87 | ] 88 | } 89 | ], 90 | "source": [ 91 | "# energy\n", 92 | "import pandas as pd\n", 93 | "import numpy as np\n", 94 | "energy_data = pd.read_excel('Energy Indicators.xls') \n", 95 | "energy_data = energy_data[16:243] # keep only the data rows (skip header and footer)\n", 96 | "energy_data = energy_data.drop(energy_data.columns[[0, 1]], axis=1) # dropping first 2 columns\n", 97 | "energy_data.rename(columns={'Environmental Indicators: Energy': 'Country','Unnamed: 3':'Energy Supply','Unnamed: 4':'Energy Supply per Capita','Unnamed: 5':'% Renewable'}, inplace=True) # renaming\n", 98 | "energy_data.replace('...', np.nan, inplace=True) # replacing missing values ('...') with NaN\n", 99 | "energy_data['Energy Supply'] *= 1000000 # converting petajoules to gigajoules\n", 100 | "def remove_digit(data): # strip digits and parenthetical notes from country names\n", 101 | " newData = ''.join([i for i in data if not i.isdigit()])\n", 102 | " i = newData.find('(')\n", 103 | " if i>-1: newData = newData[:i]\n", 104 | " return newData.strip()\n", 105 | "energy_data['Country'] = energy_data['Country'].apply(remove_digit)\n", 106 | "_rename = {\"Republic of Korea\": \"South Korea\",\n", 107 | "\"United States of America\": \"United States\",\n", 108 | "\"United Kingdom of Great Britain and Northern Ireland\": \"United Kingdom\",\n", 109 | "\"China, Hong Kong Special Administrative Region\": \"Hong Kong\"}\n", 110 | "energy_data.replace({\"Country\": _rename}, inplace=True) # renaming\n", 111 | "\n", 112 | "GDP = pd.read_csv('world_bank.csv', skiprows=4) # reading csv\n", 113 | "GDP.rename(columns={'Country Name': 'Country'}, inplace=True) # renaming\n", 114 | "di = {\"Korea, Rep.\": \"South Korea\", \n", 115 | "\"Iran, Islamic Rep.\": \"Iran\",\n", 116 | "\"Hong Kong SAR, China\": \"Hong Kong\"}\n", 117 | "GDP.replace({\"Country\": di}, inplace=True)\n", 118 | "\n", 119 | "ScimEn = pd.read_excel('scimagojr-3.xlsx')\n", 120 | "df = pd.merge(pd.merge(energy_data, GDP, on='Country'), ScimEn, on='Country')\n", 121 |
"df.set_index('Country',inplace=True)\n", 122 | "\n", 123 | "df = df[['Rank', 'Documents', 'Citable documents', 'Citations', 'Self-citations', 'Citations per document', 'H index', 'Energy Supply', 'Energy Supply per Capita', '% Renewable', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]\n", 124 | "df = (df.loc[df['Rank'].isin([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])])\n", 125 | "df.sort('Rank',inplace=True)\n", 126 | "# print(df)\n", 127 | "\n", 128 | "def answer_one():\n", 129 | " return df" 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "### Question 2 (6.6%)\n", 137 | "The previous question joined three datasets then reduced this to just the top 15 entries. When you joined the datasets, but before you reduced this to the top 15 items, how many entries did you lose?\n", 138 | "\n", 139 | "*This function should return a single number.*" 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": 1, 145 | "metadata": {}, 146 | "outputs": [ 147 | { 148 | "data": { 149 | "text/html": [ 150 | "\n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " Everything but this!\n", 156 | "" 157 | ], 158 | "text/plain": [ 159 | "" 160 | ] 161 | }, 162 | "metadata": {}, 163 | "output_type": "display_data" 164 | } 165 | ], 166 | "source": [ 167 | "%%HTML\n", 168 | "\n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " Everything but this!\n", 174 | "" 175 | ] 176 | }, 177 | { 178 | "cell_type": "code", 179 | "execution_count": 43, 180 | "metadata": { 181 | "collapsed": true, 182 | "umich_part_id": "010", 183 | "umich_partlist_id": "003" 184 | }, 185 | "outputs": [], 186 | "source": [ 187 | "def answer_two():\n", 188 | " union = pd.merge(pd.merge(energy, GDP, on='Country', how='outer'), ScimEn, on='Country', how='outer')\n", 189 | " length_u = len(union)\n", 190 | " print(length_u)\n", 191 | " intersect = pd.merge(pd.merge(energy, GDP, on='Country'), ScimEn, 
on='Country')\n", 192 | " length_i = len(intersect)\n", 193 | " return length_u - length_i" 194 | ] 195 | }, 196 | { 197 | "cell_type": "markdown", 198 | "metadata": {}, 199 | "source": [ 200 | "## Answer the following questions in the context of only the top 15 countries by Scimagojr Rank (aka the DataFrame returned by `answer_one()`)" 201 | ] 202 | }, 203 | { 204 | "cell_type": "markdown", 205 | "metadata": {}, 206 | "source": [ 207 | "### Question 3 (6.6%)\n", 208 | "What is the average GDP over the last 10 years for each country? (exclude missing values from this calculation.)\n", 209 | "\n", 210 | "*This function should return a Series named `avgGDP` with 15 countries and their average GDP sorted in descending order.*" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": 48, 216 | "metadata": { 217 | "collapsed": true, 218 | "scrolled": true, 219 | "umich_part_id": "011", 220 | "umich_partlist_id": "003" 221 | }, 222 | "outputs": [], 223 | "source": [ 224 | "def answer_three():\n", 225 | " Top15 = answer_one()\n", 226 | " years = ['2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']\n", 227 | " return (Top15[years].mean(axis=1)).sort_values(ascending=False).rename('avgGDP')" 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "### Question 4 (6.6%)\n", 235 | "By how much had the GDP changed over the 10 year span for the country with the 6th largest average GDP?\n", 236 | "\n", 237 | "*This function should return a single number.*" 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": 52, 243 | "metadata": { 244 | "collapsed": true, 245 | "scrolled": true, 246 | "umich_part_id": "012", 247 | "umich_partlist_id": "003" 248 | }, 249 | "outputs": [], 250 | "source": [ 251 | "def answer_four():\n", 252 | " Top15 = answer_one()\n", 253 | " Top15['avgGDP'] = answer_three()\n", 254 | " Top15.sort_values(by='avgGDP', inplace=True, 
ascending=False)\n", 255 | " return abs(Top15.iloc[5]['2015']-Top15.iloc[5]['2006'])" 256 | ] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "### Question 5 (6.6%)\n", 263 | "What is the mean `Energy Supply per Capita`?\n", 264 | "\n", 265 | "*This function should return a single number.*" 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": 53, 271 | "metadata": { 272 | "collapsed": true, 273 | "umich_part_id": "013", 274 | "umich_partlist_id": "003" 275 | }, 276 | "outputs": [], 277 | "source": [ 278 | "def answer_five():\n", 279 | " Top15 = answer_one()\n", 280 | " return Top15['Energy Supply per Capita'].mean()" 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "metadata": {}, 286 | "source": [ 287 | "### Question 6 (6.6%)\n", 288 | "What country has the maximum % Renewable and what is the percentage?\n", 289 | "\n", 290 | "*This function should return a tuple with the name of the country and the percentage.*" 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": 55, 296 | "metadata": { 297 | "collapsed": true, 298 | "umich_part_id": "014", 299 | "umich_partlist_id": "003" 300 | }, 301 | "outputs": [], 302 | "source": [ 303 | "def answer_six():\n", 304 | " Top15 = answer_one()\n", 305 | " ct = Top15.sort_values(by='% Renewable', ascending=False).iloc[0]\n", 306 | " return (ct.name, ct['% Renewable'])" 307 | ] 308 | }, 309 | { 310 | "cell_type": "markdown", 311 | "metadata": {}, 312 | "source": [ 313 | "### Question 7 (6.6%)\n", 314 | "Create a new column that is the ratio of Self-Citations to Total Citations. 
\n", 315 | "What is the maximum value for this new column, and what country has the highest ratio?\n", 316 | "\n", 317 | "*This function should return a tuple with the name of the country and the ratio.*" 318 | ] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "execution_count": 57, 323 | "metadata": { 324 | "collapsed": true, 325 | "umich_part_id": "015", 326 | "umich_partlist_id": "003" 327 | }, 328 | "outputs": [], 329 | "source": [ 330 | "def answer_seven():\n", 331 | " Top15 = answer_one()\n", 332 | " Top15['Citation_ratio'] = Top15['Self-citations']/Top15['Citations']\n", 333 | " ct = Top15.sort_values(by='Citation_ratio', ascending=False).iloc[0]\n", 334 | " return ct.name, ct['Citation_ratio']" 335 | ] 336 | }, 337 | { 338 | "cell_type": "markdown", 339 | "metadata": {}, 340 | "source": [ 341 | "### Question 8 (6.6%)\n", 342 | "\n", 343 | "Create a column that estimates the population using Energy Supply and Energy Supply per capita. \n", 344 | "What is the third most populous country according to this estimate?\n", 345 | "\n", 346 | "*This function should return a single string value.*" 347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": 62, 352 | "metadata": { 353 | "collapsed": true, 354 | "umich_part_id": "016", 355 | "umich_partlist_id": "003" 356 | }, 357 | "outputs": [], 358 | "source": [ 359 | "def answer_eight():\n", 360 | " Top15 = answer_one()\n", 361 | " Top15['Population'] = Top15['Energy Supply']/Top15['Energy Supply per Capita']\n", 362 | " return Top15.sort_values(by='Population', ascending=False).iloc[2].name" 363 | ] 364 | }, 365 | { 366 | "cell_type": "markdown", 367 | "metadata": {}, 368 | "source": [ 369 | "### Question 9 (6.6%)\n", 370 | "Create a column that estimates the number of citable documents per person. \n", 371 | "What is the correlation between the number of citable documents per capita and the energy supply per capita? 
Use the `.corr()` method (Pearson's correlation).\n", 372 | "\n", 373 | "*This function should return a single number.*\n", 374 | "\n", 375 | "*(Optional: Use the built-in function `plot9()` to visualize the relationship between Energy Supply per Capita and Citable docs per Capita)*" 376 | ] 377 | }, 378 | { 379 | "cell_type": "code", 380 | "execution_count": 72, 381 | "metadata": { 382 | "collapsed": true, 383 | "umich_part_id": "017", 384 | "umich_partlist_id": "003" 385 | }, 386 | "outputs": [], 387 | "source": [ 388 | "def answer_nine():\n", 389 | " Top15 = answer_one()\n", 390 | " Top15['Estimate Population'] = Top15['Energy Supply'] / Top15['Energy Supply per Capita']\n", 391 | " Top15['avgCiteDocPerPerson'] = Top15['Citable documents'] / Top15['Estimate Population']\n", 392 | " return Top15[['Energy Supply per Capita', 'avgCiteDocPerPerson']].corr().loc['Energy Supply per Capita', 'avgCiteDocPerPerson']" 393 | ] 394 | }, 395 | { 396 | "cell_type": "code", 397 | "execution_count": null, 398 | "metadata": { 399 | "collapsed": true 400 | }, 401 | "outputs": [], 402 | "source": [ 403 | "def plot9():\n", 404 | " import matplotlib.pyplot as plt\n", 405 | " %matplotlib inline\n", 406 | " \n", 407 | " Top15 = answer_one()\n", 408 | " Top15['PopEst'] = Top15['Energy Supply'] / Top15['Energy Supply per Capita']\n", 409 | " Top15['Citable docs per Capita'] = Top15['Citable documents'] / Top15['PopEst']\n", 410 | " Top15.plot(x='Citable docs per Capita', y='Energy Supply per Capita', kind='scatter', xlim=[0, 0.0006])" 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "metadata": { 417 | "collapsed": true 418 | }, 419 | "outputs": [], 420 | "source": [ 421 | "#plot9() # Be sure to comment out plot9() before submitting the assignment!"
422 | ] 423 | }, 424 | { 425 | "cell_type": "markdown", 426 | "metadata": {}, 427 | "source": [ 428 | "### Question 10 (6.6%)\n", 429 | "Create a new column with a 1 if the country's % Renewable value is at or above the median for all countries in the top 15, and a 0 if the country's % Renewable value is below the median.\n", 430 | "\n", 431 | "*This function should return a Series named `HighRenew` whose index is the country name sorted in ascending order of rank.*" 432 | ] 433 | }, 434 | { 435 | "cell_type": "code", 436 | "execution_count": 75, 437 | "metadata": { 438 | "collapsed": true, 439 | "umich_part_id": "018", 440 | "umich_partlist_id": "003" 441 | }, 442 | "outputs": [], 443 | "source": [ 444 | "def answer_ten():\n", 445 | " Top15 = answer_one()\n", 446 | " mid = Top15['% Renewable'].median()\n", 447 | " Top15['HighRenew'] = Top15['% Renewable']>=mid\n", 448 | " Top15['HighRenew'] = Top15['HighRenew'].apply(lambda x:1 if x else 0)\n", 449 | " Top15.sort_values(by='Rank', inplace=True)\n", 450 | " return Top15['HighRenew']" 451 | ] 452 | }, 453 | { 454 | "cell_type": "markdown", 455 | "metadata": {}, 456 | "source": [ 457 | "### Question 11 (6.6%)\n", 458 | "Use the following dictionary to group the Countries by Continent, then create a DataFrame that displays the sample size (the number of countries in each continent bin), and the sum, mean, and standard deviation for the estimated population of each country.\n", 459 | "\n", 460 | "```python\n", 461 | "ContinentDict = {'China':'Asia', \n", 462 | " 'United States':'North America', \n", 463 | " 'Japan':'Asia', \n", 464 | " 'United Kingdom':'Europe', \n", 465 | " 'Russian Federation':'Europe', \n", 466 | " 'Canada':'North America', \n", 467 | " 'Germany':'Europe', \n", 468 | " 'India':'Asia',\n", 469 | " 'France':'Europe', \n", 470 | " 'South Korea':'Asia', \n", 471 | " 'Italy':'Europe', \n", 472 | " 'Spain':'Europe', \n", 473 | " 'Iran':'Asia',\n", 474 | " 'Australia':'Australia', \n", 475 | " 'Brazil':'South
America'}\n", 476 | "```\n", 477 | "\n", 478 | "*This function should return a DataFrame with index named Continent `['Asia', 'Australia', 'Europe', 'North America', 'South America']` and columns `['size', 'sum', 'mean', 'std']`*" 479 | ] 480 | }, 481 | { 482 | "cell_type": "code", 483 | "execution_count": 88, 484 | "metadata": { 485 | "collapsed": true, 486 | "umich_part_id": "019", 487 | "umich_partlist_id": "003" 488 | }, 489 | "outputs": [], 490 | "source": [ 491 | "def answer_eleven():\n", 492 | " Top15 = answer_one()\n", 493 | " ContinentDict = {'China':'Asia', \n", 494 | " 'United States':'North America', \n", 495 | " 'Japan':'Asia', \n", 496 | " 'United Kingdom':'Europe', \n", 497 | " 'Russian Federation':'Europe', \n", 498 | " 'Canada':'North America', \n", 499 | " 'Germany':'Europe', \n", 500 | " 'India':'Asia',\n", 501 | " 'France':'Europe', \n", 502 | " 'South Korea':'Asia', \n", 503 | " 'Italy':'Europe', \n", 504 | " 'Spain':'Europe', \n", 505 | " 'Iran':'Asia',\n", 506 | " 'Australia':'Australia', \n", 507 | " 'Brazil':'South America'}\n", 508 | " groups = pd.DataFrame(columns = ['size', 'sum', 'mean', 'std'])\n", 509 | " Top15['Estimate Population'] = Top15['Energy Supply'] / Top15['Energy Supply per Capita']\n", 510 | " for group, frame in Top15.groupby(ContinentDict):\n", 511 | " groups.loc[group] = [len(frame), frame['Estimate Population'].sum(),frame['Estimate Population'].mean(),frame['Estimate Population'].std()]\n", 512 | " return groups" 513 | ] 514 | }, 515 | { 516 | "cell_type": "markdown", 517 | "metadata": {}, 518 | "source": [ 519 | "### Question 12 (6.6%)\n", 520 | "Cut % Renewable into 5 bins. Group Top15 by the Continent, as well as these new % Renewable bins. How many countries are in each of these groups?\n", 521 | "\n", 522 | "*This function should return a __Series__ with a MultiIndex of `Continent`, then the bins for `% Renewable`. 
Do not include groups with no countries.*" 523 | ] 524 | }, 525 | { 526 | "cell_type": "code", 527 | "execution_count": 96, 528 | "metadata": { 529 | "collapsed": true, 530 | "scrolled": true 531 | }, 532 | "outputs": [], 533 | "source": [ 534 | "def answer_twelve():\n", 535 | " Top15 = answer_one()\n", 536 | " ContinentDict = {'China':'Asia', \n", 537 | " 'United States':'North America', \n", 538 | " 'Japan':'Asia', \n", 539 | " 'United Kingdom':'Europe', \n", 540 | " 'Russian Federation':'Europe', \n", 541 | " 'Canada':'North America', \n", 542 | " 'Germany':'Europe', \n", 543 | " 'India':'Asia',\n", 544 | " 'France':'Europe', \n", 545 | " 'South Korea':'Asia', \n", 546 | " 'Italy':'Europe', \n", 547 | " 'Spain':'Europe', \n", 548 | " 'Iran':'Asia',\n", 549 | " 'Australia':'Australia', \n", 550 | " 'Brazil':'South America'}\n", 551 | " Top15 = Top15.reset_index()\n", 552 | " Top15['Continent'] = [ContinentDict[country] for country in Top15['Country']]\n", 553 | " Top15['bins'] = pd.cut(Top15['% Renewable'],5)\n", 554 | " return Top15.groupby(['Continent','bins']).size()" 555 | ] 556 | }, 557 | { 558 | "cell_type": "markdown", 559 | "metadata": {}, 560 | "source": [ 561 | "### Question 13 (6.6%)\n", 562 | "Convert the Population Estimate series to a string with thousands separator (using commas). Do not round the results.\n", 563 | "\n", 564 | "e.g. 
317615384.61538464 -> 317,615,384.61538464\n", 565 | "\n", 566 | "*This function should return a Series `PopEst` whose index is the country name and whose values are the population estimate string.*" 567 | ] 568 | }, 569 | { 570 | "cell_type": "code", 571 | "execution_count": 105, 572 | "metadata": { 573 | "collapsed": true, 574 | "scrolled": true, 575 | "umich_part_id": "020", 576 | "umich_partlist_id": "003" 577 | }, 578 | "outputs": [], 579 | "source": [ 580 | "def answer_thirteen():\n", 581 | " Top15 = answer_one()\n", 582 | " Top15['PopEst'] = (Top15['Energy Supply'] / Top15['Energy Supply per Capita']).astype(float)\n", 583 | " return Top15['PopEst'].apply(lambda x: '{0:,}'.format(x))" 584 | ] 585 | }, 586 | { 587 | "cell_type": "markdown", 588 | "metadata": {}, 589 | "source": [ 590 | "### Optional\n", 591 | "\n", 592 | "Use the built-in function `plot_optional()` to see an example visualization." 593 | ] 594 | }, 595 | { 596 | "cell_type": "code", 597 | "execution_count": 111, 598 | "metadata": { 599 | "collapsed": true, 600 | "scrolled": true 601 | }, 602 | "outputs": [], 603 | "source": [ 604 | "def plot_optional():\n", 605 | " import matplotlib.pyplot as plt\n", 606 | " %matplotlib inline\n", 607 | " Top15 = answer_one()\n", 608 | " ax = Top15.plot(x='Rank', y='% Renewable', kind='scatter', \n", 609 | " c=['#e41a1c','#377eb8','#e41a1c','#4daf4a','#4daf4a','#377eb8','#4daf4a','#e41a1c',\n", 610 | " '#4daf4a','#e41a1c','#4daf4a','#4daf4a','#e41a1c','#dede00','#ff7f00'], \n", 611 | " xticks=range(1,16), s=6*Top15['2014']/10**10, alpha=.75, figsize=[16,6]);\n", 612 | "\n", 613 | " for i, txt in enumerate(Top15.index):\n", 614 | " ax.annotate(txt, [Top15['Rank'].iloc[i], Top15['% Renewable'].iloc[i]], ha='center')\n", 615 | "\n", 616 | " print(\"This is an example of a visualization that can be created to help understand the data. \\\n", 617 | "This is a bubble chart showing % Renewable vs. Rank.
The size of the bubble corresponds to the countries' \\\n", 618 | "2014 GDP, and the color corresponds to the continent.\")" 619 | ] 620 | }, 621 | { 622 | "cell_type": "code", 623 | "execution_count": null, 624 | "metadata": { 625 | "collapsed": true 626 | }, 627 | "outputs": [], 628 | "source": [ 629 | "#plot_optional() # Be sure to comment out plot_optional() before submitting the assignment!" 630 | ] 631 | } 632 | ], 633 | "metadata": { 634 | "anaconda-cloud": {}, 635 | "coursera": { 636 | "course_slug": "python-data-analysis", 637 | "graded_item_id": "zAr06", 638 | "launcher_item_id": "KSSjT", 639 | "part_id": "SL3fU" 640 | }, 641 | "kernelspec": { 642 | "display_name": "Python 3", 643 | "language": "python", 644 | "name": "python3" 645 | }, 646 | "language_info": { 647 | "codemirror_mode": { 648 | "name": "ipython", 649 | "version": 3 650 | }, 651 | "file_extension": ".py", 652 | "mimetype": "text/x-python", 653 | "name": "python", 654 | "nbconvert_exporter": "python", 655 | "pygments_lexer": "ipython3", 656 | "version": "3.6.0" 657 | }, 658 | "umich": { 659 | "id": "Assignment 3", 660 | "version": "1.5" 661 | } 662 | }, 663 | "nbformat": 4, 664 | "nbformat_minor": 1 665 | } 666 | -------------------------------------------------------------------------------- /Introduction to Data Science in Python/week--3/readme: -------------------------------------------------------------------------------- 1 | It contains the solution for week 3 2 | -------------------------------------------------------------------------------- /Introduction to Data Science in Python/week-1/mcq: -------------------------------------------------------------------------------- 1 | 1) Python is an example of an 2 | **Interpreted language** 3 | 4 | 2) Data Science is a 5 | **Interdisciplinary, made up of all of the above** 6 | 7 | 3) Data visualization is not a part of data science. 8 | **False** 9 | 10 | 4) Which bracketing style does Python use for tuples?
11 | **()** 12 | 13 | 5) In Python, strings are considered Mutable, and can be changed. 14 | **False** 15 | 16 | 6) What is the result of the following code: ['a', 'b', 'c'] + [1, 2, 3] 17 | **['a', 'b', 'c', 1, 2, 3]** 18 | 19 | 7) String slicing is 20 | **A way to make a substring of a string in python** 21 | 22 | 8) When you create a lambda, what type is returned? E.g. type(lambda x: x+1) returns 23 | **<class 'function'>** 24 | 25 | 9) The epoch refers to 26 | **January 1, year 1970** 27 | 28 | 10) This code, [x**2 for x in range(10)] , is an example of a 29 | **List comprehension** 30 | 31 | 11) Given a 6x6 NumPy array r, which of the following options would slice the shaded elements? 32 | **r.reshape(36)[::7]** 33 | 34 | 12) Given a 6x6 NumPy array r, which of the following options would slice the shaded elements? 35 | **r[2:4,2:4]** 36 | -------------------------------------------------------------------------------- /Introduction to Data Science in Python/week-2/Assignment+2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "---\n", 8 | "\n", 9 | "_You are currently looking at **version 1.2** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._\n", 10 | "\n", 11 | "---" 12 | ] 13 | }, 14 | { 15 | "cell_type": "markdown", 16 | "metadata": {}, 17 | "source": [ 18 | "# Assignment 2 - Pandas Introduction\n", 19 | "All questions are weighted the same in this assignment.\n", 20 | "## Part 1\n", 21 | "The following code loads the olympics dataset (olympics.csv), which was derived from the Wikipedia entry on [All Time Olympic Games Medals](https://en.wikipedia.org/wiki/All-time_Olympic_Games_medal_table), and does some basic data cleaning.
\n", 22 | "\n", 23 | "The columns are organized as # of Summer games, Summer medals, # of Winter games, Winter medals, total # number of games, total # of medals. Use this dataset to answer the questions below." 24 | ] 25 | }, 26 | { 27 | "cell_type": "code", 28 | "execution_count": 111, 29 | "metadata": { 30 | "collapsed": false, 31 | "nbgrader": { 32 | "grade": false, 33 | "grade_id": "1", 34 | "locked": false, 35 | "solution": false 36 | }, 37 | "umich_question": "prolog-000" 38 | }, 39 | "outputs": [ 40 | { 41 | "data": { 42 | "text/html": [ 43 | "
\n", 44 | "\n", 45 | " \n", 46 | " \n", 47 | " \n", 48 | " \n", 49 | " \n", 50 | " \n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | "
# SummerGoldSilverBronzeTotal# WinterGold.1Silver.1Bronze.1Total.1# GamesGold.2Silver.2Bronze.2Combined totalID
Afghanistan13002200000130022AFG
Algeria1252815300001552815ALG
Argentina23182428701800004118242870ARG
Armenia512912600001112912ARM
Australasia23451200000234512ANZ
\n", 164 | "
" 165 | ], 166 | "text/plain": [ 167 | " # Summer Gold Silver Bronze Total # Winter Gold.1 \\\n", 168 | "Afghanistan 13 0 0 2 2 0 0 \n", 169 | "Algeria 12 5 2 8 15 3 0 \n", 170 | "Argentina 23 18 24 28 70 18 0 \n", 171 | "Armenia 5 1 2 9 12 6 0 \n", 172 | "Australasia 2 3 4 5 12 0 0 \n", 173 | "\n", 174 | " Silver.1 Bronze.1 Total.1 # Games Gold.2 Silver.2 Bronze.2 \\\n", 175 | "Afghanistan 0 0 0 13 0 0 2 \n", 176 | "Algeria 0 0 0 15 5 2 8 \n", 177 | "Argentina 0 0 0 41 18 24 28 \n", 178 | "Armenia 0 0 0 11 1 2 9 \n", 179 | "Australasia 0 0 0 2 3 4 5 \n", 180 | "\n", 181 | " Combined total ID \n", 182 | "Afghanistan 2 AFG \n", 183 | "Algeria 15 ALG \n", 184 | "Argentina 70 ARG \n", 185 | "Armenia 12 ARM \n", 186 | "Australasia 12 ANZ " 187 | ] 188 | }, 189 | "execution_count": 111, 190 | "metadata": {}, 191 | "output_type": "execute_result" 192 | } 193 | ], 194 | "source": [ 195 | "import pandas as pd\n", 196 | "\n", 197 | "df = pd.read_csv('olympics.csv', index_col=0, skiprows=1)\n", 198 | "\n", 199 | "for col in df.columns:\n", 200 | " if col[:2]=='01':\n", 201 | " df.rename(columns={col:'Gold'+col[4:]}, inplace=True)\n", 202 | " if col[:2]=='02':\n", 203 | " df.rename(columns={col:'Silver'+col[4:]}, inplace=True)\n", 204 | " if col[:2]=='03':\n", 205 | " df.rename(columns={col:'Bronze'+col[4:]}, inplace=True)\n", 206 | " if col[:1]=='№':\n", 207 | " df.rename(columns={col:'#'+col[1:]}, inplace=True)\n", 208 | "\n", 209 | "names_ids = df.index.str.split('\\s\\(') # split the index by '('\n", 210 | "\n", 211 | "df.index = names_ids.str[0] # the [0] element is the country name (new index) \n", 212 | "df['ID'] = names_ids.str[1].str[:3] # the [1] element is the abbreviation or ID (take first 3 characters from that)\n", 213 | "\n", 214 | "df = df.drop('Totals')\n", 215 | "df.head()" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "### Question 0 (Example)\n", 223 | "\n", 224 | "What is the first country in df?\n", 
225 | "\n", 226 | "*This function should return a Series.*" 227 | ] 228 | }, 229 | { 230 | "cell_type": "code", 231 | "execution_count": 2, 232 | "metadata": { 233 | "collapsed": false, 234 | "umich_question": "000" 235 | }, 236 | "outputs": [ 237 | { 238 | "data": { 239 | "text/plain": [ 240 | "# Summer 13\n", 241 | "Gold 0\n", 242 | "Silver 0\n", 243 | "Bronze 2\n", 244 | "Total 2\n", 245 | "# Winter 0\n", 246 | "Gold.1 0\n", 247 | "Silver.1 0\n", 248 | "Bronze.1 0\n", 249 | "Total.1 0\n", 250 | "# Games 13\n", 251 | "Gold.2 0\n", 252 | "Silver.2 0\n", 253 | "Bronze.2 2\n", 254 | "Combined total 2\n", 255 | "ID AFG\n", 256 | "Name: Afghanistan, dtype: object" 257 | ] 258 | }, 259 | "execution_count": 2, 260 | "metadata": {}, 261 | "output_type": "execute_result" 262 | } 263 | ], 264 | "source": [ 265 | "# You should write your whole answer within the function provided. The autograder will call\n", 266 | "# this function and compare the return value against the correct solution value\n", 267 | "def answer_zero():\n", 268 | " # This function returns the row for Afghanistan, which is a Series object. The assignment\n", 269 | " # question description will tell you the general format the autograder is expecting\n", 270 | " return df.iloc[0]\n", 271 | "\n", 272 | "# You can examine what your function returns by calling it in the cell. 
If you have questions\n",
273 | "# about the assignment formats, check out the discussion forums for any FAQs\n",
274 | "answer_zero() "
275 | ]
276 | },
277 | {
278 | "cell_type": "markdown",
279 | "metadata": {},
280 | "source": [
281 | "### Question 1\n",
282 | "Which country has won the most gold medals in summer games?\n",
283 | "\n",
284 | "*This function should return a single string value.*"
285 | ]
286 | },
287 | {
288 | "cell_type": "code",
289 | "execution_count": 6,
290 | "metadata": {
291 | "collapsed": true,
292 | "nbgrader": {
293 | "grade": false,
294 | "locked": false,
295 | "solution": false
296 | },
297 | "umich_part_id": "001",
298 | "umich_partlist_id": "001"
299 | },
300 | "outputs": [],
301 | "source": [
302 | "def answer_one():\n",
303 | "    return df['Gold'].idxmax()  # idxmax() returns the index label (country name); Series.argmax() is positional in modern pandas"
304 | ]
305 | },
306 | {
307 | "cell_type": "markdown",
308 | "metadata": {},
309 | "source": [
310 | "### Question 2\n",
311 | "Which country had the biggest difference between their summer and winter gold medal counts?\n",
312 | "\n",
313 | "*This function should return a single string value.*"
314 | ]
315 | },
316 | {
317 | "cell_type": "code",
318 | "execution_count": 9,
319 | "metadata": {
320 | "collapsed": true,
321 | "umich_part_id": "002",
322 | "umich_partlist_id": "001"
323 | },
324 | "outputs": [],
325 | "source": [
326 | "def answer_two():\n",
327 | "    return (df['Gold'] - df['Gold.1']).abs().idxmax()"
328 | ]
329 | },
330 | {
331 | "cell_type": "markdown",
332 | "metadata": {},
333 | "source": [
334 | "### Question 3\n",
335 | "Which country has the biggest difference between their summer gold medal counts and winter gold medal counts relative to their total gold medal count? 
\n", 336 | "\n", 337 | "$$\\frac{Summer~Gold - Winter~Gold}{Total~Gold}$$\n", 338 | "\n", 339 | "Only include countries that have won at least 1 gold in both summer and winter.\n", 340 | "\n", 341 | "*This function should return a single string value.*" 342 | ] 343 | }, 344 | { 345 | "cell_type": "code", 346 | "execution_count": 8, 347 | "metadata": { 348 | "collapsed": true, 349 | "umich_part_id": "003", 350 | "umich_partlist_id": "001" 351 | }, 352 | "outputs": [], 353 | "source": [ 354 | "def answer_three():\n", 355 | " expect = df[(df['Gold']>=1) & (df['Gold.1']>=1)]\n", 356 | " totalsum = expect['Gold'] + expect['Gold.1'] + expect['Gold.2']\n", 357 | " maximum = (expect['Gold'] - expect['Gold.1']).abs() / totalsum\n", 358 | " return maximum.argmax()" 359 | ] 360 | }, 361 | { 362 | "cell_type": "markdown", 363 | "metadata": {}, 364 | "source": [ 365 | "### Question 4\n", 366 | "Write a function that creates a Series called \"Points\" which is a weighted value where each gold medal (`Gold.2`) counts for 3 points, silver medals (`Silver.2`) for 2 points, and bronze medals (`Bronze.2`) for 1 point. The function should return only the column (a Series object) which you created, with the country names as indices.\n", 367 | "\n", 368 | "*This function should return a Series named `Points` of length 146*" 369 | ] 370 | }, 371 | { 372 | "cell_type": "code", 373 | "execution_count": 12, 374 | "metadata": { 375 | "collapsed": true, 376 | "umich_part_id": "004", 377 | "umich_partlist_id": "001" 378 | }, 379 | "outputs": [], 380 | "source": [ 381 | "def answer_four():\n", 382 | " Points = df['Gold.2']*3 + df['Silver.2']*2 + df['Bronze.2']\n", 383 | " return Points" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "## Part 2\n", 391 | "For the next set of questions, we will be using census data from the [United States Census Bureau](http://www.census.gov). 
Counties are political and geographic subdivisions of states in the United States. This dataset contains population data for counties and states in the US from 2010 to 2015. [See this document](https://www2.census.gov/programs-surveys/popest/technical-documentation/file-layouts/2010-2015/co-est2015-alldata.pdf) for a description of the variable names.\n", 392 | "\n", 393 | "The census dataset (census.csv) should be loaded as census_df. Answer questions using this as appropriate.\n", 394 | "\n", 395 | "### Question 5\n", 396 | "Which state has the most counties in it? (hint: consider the sumlevel key carefully! You'll need this for future questions too...)\n", 397 | "\n", 398 | "*This function should return a single string value.*" 399 | ] 400 | }, 401 | { 402 | "cell_type": "code", 403 | "execution_count": 121, 404 | "metadata": { 405 | "collapsed": false, 406 | "umich_question": "prolog-005" 407 | }, 408 | "outputs": [ 409 | { 410 | "data": { 411 | "text/html": [ 412 | "
\n", 413 | "\n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 512 | " \n", 513 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 519 | " \n", 520 | " \n", 521 | " \n", 522 | " \n", 523 | " \n", 524 | " \n", 525 | " \n", 526 | " \n", 527 | " \n", 528 | " \n", 529 | " \n", 530 | " \n", 531 | " \n", 532 | " \n", 533 | " \n", 534 | " \n", 535 | " \n", 536 | " \n", 537 | " \n", 538 | " \n", 539 | " \n", 540 | " \n", 541 | " \n", 542 | " \n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | "
SUMLEVREGIONDIVISIONSTATECOUNTYSTNAMECTYNAMECENSUS2010POPESTIMATESBASE2010POPESTIMATE2010...RDOMESTICMIG2011RDOMESTICMIG2012RDOMESTICMIG2013RDOMESTICMIG2014RDOMESTICMIG2015RNETMIG2011RNETMIG2012RNETMIG2013RNETMIG2014RNETMIG2015
0403610AlabamaAlabama477973647801274785161...0.002295-0.1931960.3810660.582002-0.4673691.0300150.8266441.3832821.7247180.712594
1503611AlabamaAutauga County545715457154660...7.242091-2.915927-3.0123492.265971-2.5307997.606016-2.626146-2.7220022.592270-2.187333
2503613AlabamaBaldwin County182265182265183193...14.83296017.64729321.84570519.24328717.19787215.84417618.55962722.72762620.31714218.293499
3503615AlabamaBarbour County274572745727341...-4.728132-2.500690-7.056824-3.904217-10.543299-4.874741-2.758113-7.167664-3.978583-10.543299
4503617AlabamaBibb County229152291922861...-5.527043-5.068871-6.201001-0.1775370.177258-5.088389-4.363636-5.4037290.7545331.107861
\n", 563 | "

5 rows × 100 columns

\n", 564 | "
" 565 | ], 566 | "text/plain": [ 567 | " SUMLEV REGION DIVISION STATE COUNTY STNAME CTYNAME \\\n", 568 | "0 40 3 6 1 0 Alabama Alabama \n", 569 | "1 50 3 6 1 1 Alabama Autauga County \n", 570 | "2 50 3 6 1 3 Alabama Baldwin County \n", 571 | "3 50 3 6 1 5 Alabama Barbour County \n", 572 | "4 50 3 6 1 7 Alabama Bibb County \n", 573 | "\n", 574 | " CENSUS2010POP ESTIMATESBASE2010 POPESTIMATE2010 ... \\\n", 575 | "0 4779736 4780127 4785161 ... \n", 576 | "1 54571 54571 54660 ... \n", 577 | "2 182265 182265 183193 ... \n", 578 | "3 27457 27457 27341 ... \n", 579 | "4 22915 22919 22861 ... \n", 580 | "\n", 581 | " RDOMESTICMIG2011 RDOMESTICMIG2012 RDOMESTICMIG2013 RDOMESTICMIG2014 \\\n", 582 | "0 0.002295 -0.193196 0.381066 0.582002 \n", 583 | "1 7.242091 -2.915927 -3.012349 2.265971 \n", 584 | "2 14.832960 17.647293 21.845705 19.243287 \n", 585 | "3 -4.728132 -2.500690 -7.056824 -3.904217 \n", 586 | "4 -5.527043 -5.068871 -6.201001 -0.177537 \n", 587 | "\n", 588 | " RDOMESTICMIG2015 RNETMIG2011 RNETMIG2012 RNETMIG2013 RNETMIG2014 \\\n", 589 | "0 -0.467369 1.030015 0.826644 1.383282 1.724718 \n", 590 | "1 -2.530799 7.606016 -2.626146 -2.722002 2.592270 \n", 591 | "2 17.197872 15.844176 18.559627 22.727626 20.317142 \n", 592 | "3 -10.543299 -4.874741 -2.758113 -7.167664 -3.978583 \n", 593 | "4 0.177258 -5.088389 -4.363636 -5.403729 0.754533 \n", 594 | "\n", 595 | " RNETMIG2015 \n", 596 | "0 0.712594 \n", 597 | "1 -2.187333 \n", 598 | "2 18.293499 \n", 599 | "3 -10.543299 \n", 600 | "4 1.107861 \n", 601 | "\n", 602 | "[5 rows x 100 columns]" 603 | ] 604 | }, 605 | "execution_count": 121, 606 | "metadata": {}, 607 | "output_type": "execute_result" 608 | } 609 | ], 610 | "source": [ 611 | "census_df = pd.read_csv('census.csv')\n", 612 | "census_df.head()" 613 | ] 614 | }, 615 | { 616 | "cell_type": "code", 617 | "execution_count": 115, 618 | "metadata": { 619 | "collapsed": true, 620 | "umich_part_id": "005", 621 | "umich_partlist_id": "002" 622 | }, 623 | "outputs": [], 
624 | "source": [ 625 | "def answer_five():\n", 626 | " return census_df.groupby(census_df['STNAME']).count().COUNTY.argmax()" 627 | ] 628 | }, 629 | { 630 | "cell_type": "markdown", 631 | "metadata": {}, 632 | "source": [ 633 | "### Question 6\n", 634 | "**Only looking at the three most populous counties for each state**, what are the three most populous states (in order of highest population to lowest population)? Use `CENSUS2010POP`.\n", 635 | "\n", 636 | "*This function should return a list of string values.*" 637 | ] 638 | }, 639 | { 640 | "cell_type": "code", 641 | "execution_count": 113, 642 | "metadata": { 643 | "collapsed": true, 644 | "umich_part_id": "006", 645 | "umich_partlist_id": "002" 646 | }, 647 | "outputs": [], 648 | "source": [ 649 | "def answer_six():\n", 650 | " copydf = census_df.copy()\n", 651 | " copydf = copydf.groupby(['STNAME'])\n", 652 | " states = pd.DataFrame(columns=['pop'])\n", 653 | " for i, j in copydf:\n", 654 | " states.loc[i] = [j.sort_values(by='CENSUS2010POP', ascending=False)[1:4]['CENSUS2010POP'].sum()]\n", 655 | " ans = states.nlargest(3,'pop')\n", 656 | " return list(ans.index)" 657 | ] 658 | }, 659 | { 660 | "cell_type": "markdown", 661 | "metadata": {}, 662 | "source": [ 663 | "### Question 7\n", 664 | "Which county has had the largest absolute change in population within the period 2010-2015? (Hint: population values are stored in columns POPESTIMATE2010 through POPESTIMATE2015, you need to consider all six columns.)\n", 665 | "\n", 666 | "e.g. 
If County Population in the 5 year period is 100, 120, 80, 105, 100, 130, then its largest change in the period would be |130-80| = 50.\n",
667 | "\n",
668 | "*This function should return a single string value.*"
669 | ]
670 | },
671 | {
672 | "cell_type": "code",
673 | "execution_count": 110,
674 | "metadata": {
675 | "collapsed": true,
676 | "umich_part_id": "007",
677 | "umich_partlist_id": "002"
678 | },
679 | "outputs": [],
680 | "source": [
681 | "def answer_seven():\n",
682 | "    pop = census_df[['STNAME','CTYNAME','POPESTIMATE2015','POPESTIMATE2014','POPESTIMATE2013','POPESTIMATE2012','POPESTIMATE2011','POPESTIMATE2010']]\n",
683 | "    pop = pop[pop['STNAME']!=pop['CTYNAME']]  # drop the state-total rows, whose CTYNAME equals STNAME\n",
684 | "    i = (pop.max(axis=1, numeric_only=True)-pop.min(axis=1, numeric_only=True)).idxmax()  # numeric_only skips the two string columns\n",
685 | "    return census_df.loc[i]['CTYNAME']"
686 | ]
687 | },
688 | {
689 | "cell_type": "markdown",
690 | "metadata": {},
691 | "source": [
692 | "### Question 8\n",
693 | "In this datafile, the United States is broken up into four regions using the \"REGION\" column. 
\n", 694 | "\n", 695 | "Create a query that finds the counties that belong to regions 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 was greater than their POPESTIMATE 2014.\n", 696 | "\n", 697 | "*This function should return a 5x2 DataFrame with the columns = ['STNAME', 'CTYNAME'] and the same index ID as the census_df (sorted ascending by index).*" 698 | ] 699 | }, 700 | { 701 | "cell_type": "code", 702 | "execution_count": 122, 703 | "metadata": { 704 | "collapsed": true, 705 | "umich_part_id": "008", 706 | "umich_partlist_id": "002" 707 | }, 708 | "outputs": [], 709 | "source": [ 710 | "def answer_eight():\n", 711 | " return census_df[(census_df['REGION']<3 ) & (census_df['CTYNAME'] == 'Washington County') & (census_df['POPESTIMATE2015']>census_df['POPESTIMATE2014'])][['STNAME','CTYNAME']]" 712 | ] 713 | }, 714 | { 715 | "cell_type": "code", 716 | "execution_count": null, 717 | "metadata": { 718 | "collapsed": true 719 | }, 720 | "outputs": [], 721 | "source": [] 722 | } 723 | ], 724 | "metadata": { 725 | "anaconda-cloud": {}, 726 | "coursera": { 727 | "course_slug": "python-data-analysis", 728 | "graded_item_id": "tHmgx", 729 | "launcher_item_id": "Um6Bz", 730 | "part_id": "OQsnr" 731 | }, 732 | "kernelspec": { 733 | "display_name": "Python 3", 734 | "language": "python", 735 | "name": "python3" 736 | }, 737 | "language_info": { 738 | "codemirror_mode": { 739 | "name": "ipython", 740 | "version": 3 741 | }, 742 | "file_extension": ".py", 743 | "mimetype": "text/x-python", 744 | "name": "python", 745 | "nbconvert_exporter": "python", 746 | "pygments_lexer": "ipython3", 747 | "version": "3.5.2" 748 | }, 749 | "umich": { 750 | "id": "Assignment 2", 751 | "version": "1.2" 752 | } 753 | }, 754 | "nbformat": 4, 755 | "nbformat_minor": 1 756 | } 757 | -------------------------------------------------------------------------------- /Introduction to Data Science in Python/week-2/assignment+2.py: 
-------------------------------------------------------------------------------- 
1 | # -*- coding: utf-8 -*-
2 | """Assignment+2.ipynb
3 | 
4 | Automatically generated by Colaboratory.
5 | 
6 | Original file is located at
7 | https://colab.research.google.com/drive/1pPwsbb_ECy_Po44IJ6isagFNHWFLYY77
8 | 
9 | ---
10 | 
11 | _You are currently looking at **version 1.2** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._
12 | 
13 | ---
14 | 
15 | # Assignment 2 - Pandas Introduction
16 | All questions are weighted the same in this assignment.
17 | ## Part 1
18 | The following code loads the olympics dataset (olympics.csv), which was derived from the Wikipedia entry on [All Time Olympic Games Medals](https://en.wikipedia.org/wiki/All-time_Olympic_Games_medal_table), and does some basic data cleaning.
19 | 
20 | The columns are organized as # of Summer games, Summer medals, # of Winter games, Winter medals, total # of games, total # of medals. Use this dataset to answer the questions below. 
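Because the raw CSV repeats the medal headings for the Summer, Winter, and combined sections, pandas de-duplicates them with `.1`/`.2` suffixes, so after the renaming loop `Gold` is summer gold, `Gold.1` winter gold, and `Gold.2` the combined total. A minimal sketch on a hypothetical two-row frame (the numbers are illustrative, not taken from olympics.csv):

```python
import pandas as pd

# Hypothetical miniature of the cleaned frame; pandas suffixes duplicated
# column names with .1/.2 on read, hence Gold = summer, Gold.1 = winter,
# Gold.2 = combined.
mini = pd.DataFrame(
    {"Gold": [18, 1], "Gold.1": [0, 0], "Gold.2": [18, 1]},
    index=["Argentina", "Armenia"],
)

# idxmax() returns the index label (here, the country name) of the maximum value
print(mini["Gold"].idxmax())  # -> Argentina
```

Keeping this column scheme in mind makes the `Gold` / `Gold.1` / `Gold.2` references in the questions below easier to follow.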
21 | """ 22 | 23 | import pandas as pd 24 | 25 | df = pd.read_csv('olympics.csv', index_col=0, skiprows=1) 26 | 27 | for col in df.columns: 28 | if col[:2]=='01': 29 | df.rename(columns={col:'Gold'+col[4:]}, inplace=True) 30 | if col[:2]=='02': 31 | df.rename(columns={col:'Silver'+col[4:]}, inplace=True) 32 | if col[:2]=='03': 33 | df.rename(columns={col:'Bronze'+col[4:]}, inplace=True) 34 | if col[:1]=='№': 35 | df.rename(columns={col:'#'+col[1:]}, inplace=True) 36 | 37 | names_ids = df.index.str.split('\s\(') # split the index by '(' 38 | 39 | df.index = names_ids.str[0] # the [0] element is the country name (new index) 40 | df['ID'] = names_ids.str[1].str[:3] # the [1] element is the abbreviation or ID (take first 3 characters from that) 41 | 42 | df = df.drop('Totals') 43 | df.head() 44 | 45 | """### Question 0 (Example) 46 | 47 | What is the first country in df? 48 | 49 | *This function should return a Series.* 50 | """ 51 | 52 | # You should write your whole answer within the function provided. The autograder will call 53 | # this function and compare the return value against the correct solution value 54 | def answer_zero(): 55 | # This function returns the row for Afghanistan, which is a Series object. The assignment 56 | # question description will tell you the general format the autograder is expecting 57 | return df.iloc[0] 58 | 59 | # You can examine what your function returns by calling it in the cell. If you have questions 60 | # about the assignment formats, check out the discussion forums for any FAQs 61 | answer_zero() 62 | 63 | """### Question 1 64 | Which country has won the most gold medals in summer games? 65 | 66 | *This function should return a single string value.* 67 | """ 68 | 69 | def answer_one(): 70 | return df['Gold'].argmax() 71 | 72 | """### Question 2 73 | Which country had the biggest difference between their summer and winter gold medal counts? 
74 | 
75 | *This function should return a single string value.*
76 | """
77 | 
78 | def answer_two():
79 |     return (df['Gold'] - df['Gold.1']).abs().idxmax()
80 | 
81 | """### Question 3
82 | Which country has the biggest difference between their summer gold medal counts and winter gold medal counts relative to their total gold medal count?
83 | 
84 | $$\frac{Summer~Gold - Winter~Gold}{Total~Gold}$$
85 | 
86 | Only include countries that have won at least 1 gold in both summer and winter.
87 | 
88 | *This function should return a single string value.*
89 | """
90 | 
91 | def answer_three():
92 |     expect = df[(df['Gold']>=1) & (df['Gold.1']>=1)]
93 |     totalsum = expect['Gold.2']  # Gold.2 is already the combined (summer + winter) gold total
94 |     maximum = (expect['Gold'] - expect['Gold.1']).abs() / totalsum
95 |     return maximum.idxmax()
96 | 
97 | """### Question 4
98 | Write a function that creates a Series called "Points" which is a weighted value where each gold medal (`Gold.2`) counts for 3 points, silver medals (`Silver.2`) for 2 points, and bronze medals (`Bronze.2`) for 1 point. The function should return only the column (a Series object) which you created, with the country names as indices.
99 | 
100 | *This function should return a Series named `Points` of length 146*
101 | """
102 | 
103 | def answer_four():
104 |     Points = df['Gold.2']*3 + df['Silver.2']*2 + df['Bronze.2']
105 |     return Points.rename('Points')  # make sure the returned Series is actually named 'Points'
106 | 
107 | """## Part 2
108 | For the next set of questions, we will be using census data from the [United States Census Bureau](http://www.census.gov). Counties are political and geographic subdivisions of states in the United States. This dataset contains population data for counties and states in the US from 2010 to 2015. [See this document](https://www2.census.gov/programs-surveys/popest/technical-documentation/file-layouts/2010-2015/co-est2015-alldata.pdf) for a description of the variable names.
109 | 
110 | The census dataset (census.csv) should be loaded as census_df. 
Answer questions using this as appropriate.
111 | 
112 | ### Question 5
113 | Which state has the most counties in it? (hint: consider the sumlevel key carefully! You'll need this for future questions too...)
114 | 
115 | *This function should return a single string value.*
116 | """
117 | 
118 | census_df = pd.read_csv('census.csv')
119 | census_df.head()
120 | 
121 | def answer_five():
122 |     return census_df.groupby('STNAME').count().COUNTY.idxmax()
123 | 
124 | """### Question 6
125 | **Only looking at the three most populous counties for each state**, what are the three most populous states (in order of highest population to lowest population)? Use `CENSUS2010POP`.
126 | 
127 | *This function should return a list of string values.*
128 | """
129 | 
130 | def answer_six():
131 |     copydf = census_df.copy()
132 |     copydf = copydf.groupby(['STNAME'])
133 |     states = pd.DataFrame(columns=['pop'])
134 |     for i, j in copydf:
135 |         states.loc[i] = [j.sort_values(by='CENSUS2010POP', ascending=False)[1:4]['CENSUS2010POP'].sum()]  # row 0 of the sorted group is the state-total row, so rows 1-3 are the top three counties
136 |     ans = states.nlargest(3,'pop')
137 |     return list(ans.index)
138 | 
139 | """### Question 7
140 | Which county has had the largest absolute change in population within the period 2010-2015? (Hint: population values are stored in columns POPESTIMATE2010 through POPESTIMATE2015, you need to consider all six columns.)
141 | 
142 | e.g. If County Population in the 5 year period is 100, 120, 80, 105, 100, 130, then its largest change in the period would be |130-80| = 50. 
143 | 
144 | *This function should return a single string value.*
145 | """
146 | 
147 | def answer_seven():
148 |     pop = census_df[['STNAME','CTYNAME','POPESTIMATE2015','POPESTIMATE2014','POPESTIMATE2013','POPESTIMATE2012','POPESTIMATE2011','POPESTIMATE2010']]
149 |     pop = pop[pop['STNAME']!=pop['CTYNAME']]  # drop the state-total rows, whose CTYNAME equals STNAME
150 |     i = (pop.max(axis=1, numeric_only=True)-pop.min(axis=1, numeric_only=True)).idxmax()  # numeric_only skips the two string columns
151 |     return census_df.loc[i]['CTYNAME']
152 | 
153 | """### Question 8
154 | In this datafile, the United States is broken up into four regions using the "REGION" column.
155 | 
156 | Create a query that finds the counties that belong to regions 1 or 2, whose name starts with 'Washington', and whose POPESTIMATE2015 was greater than their POPESTIMATE2014.
157 | 
158 | *This function should return a 5x2 DataFrame with the columns = ['STNAME', 'CTYNAME'] and the same index ID as the census_df (sorted ascending by index).*
159 | """
160 | 
161 | def answer_eight():
162 |     return census_df[(census_df['REGION'] < 3) & (census_df['CTYNAME'].str.startswith('Washington')) & (census_df['POPESTIMATE2015'] > census_df['POPESTIMATE2014'])][['STNAME','CTYNAME']]
163 | 
164 | 
-------------------------------------------------------------------------------- /Introduction to Data Science in Python/week-4/Assignment+4.ipynb: -------------------------------------------------------------------------------- 
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "---\n",
8 | "\n",
9 | "_You are currently looking at **version 1.1** of this notebook. 
To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._\n",
10 | "\n",
11 | "---"
12 | ]
13 | },
14 | {
15 | "cell_type": "code",
16 | "execution_count": 7,
17 | "metadata": {
18 | "collapsed": true
19 | },
20 | "outputs": [],
21 | "source": [
22 | "import pandas as pd\n",
23 | "import numpy as np\n",
24 | "from scipy.stats import ttest_ind"
25 | ]
26 | },
27 | {
28 | "cell_type": "markdown",
29 | "metadata": {},
30 | "source": [
31 | "# Assignment 4 - Hypothesis Testing\n",
32 | "This assignment requires more individual learning than previous assignments - you are encouraged to check out the [pandas documentation](http://pandas.pydata.org/pandas-docs/stable/) to find functions or methods you might not have used yet, or ask questions on [Stack Overflow](http://stackoverflow.com/) and tag them as pandas and python related. And of course, the discussion forums are open for interaction with your peers and the course staff.\n",
33 | "\n",
34 | "Definitions:\n",
35 | "* A _quarter_ is a specific three month period, Q1 is January through March, Q2 is April through June, Q3 is July through September, Q4 is October through December.\n",
36 | "* A _recession_ is defined as starting with two consecutive quarters of GDP decline, and ending with two consecutive quarters of GDP growth.\n",
37 | "* A _recession bottom_ is the quarter within a recession which had the lowest GDP.\n",
38 | "* A _university town_ is a city which has a high percentage of university students compared to the total population of the city.\n",
39 | "\n",
40 | "**Hypothesis**: University towns have their mean housing prices less affected by recessions. Run a t-test to compare the ratio of the mean price of houses in university towns the quarter before the recession starts compared to the recession bottom. 
(`price_ratio=quarter_before_recession/recession_bottom`)\n",
41 | "\n",
42 | "The following data files are available for this assignment:\n",
43 | "* From the [Zillow research data site](http://www.zillow.com/research/data/) there is housing data for the United States. In particular the datafile for [all homes at a city level](http://files.zillowstatic.com/research/public/City/City_Zhvi_AllHomes.csv), ```City_Zhvi_AllHomes.csv```, has median home sale prices at a fine-grained level.\n",
44 | "* From the Wikipedia page on college towns there is a list of [university towns in the United States](https://en.wikipedia.org/wiki/List_of_college_towns#College_towns_in_the_United_States) which has been copied and pasted into the file ```university_towns.txt```.\n",
45 | "* From the Bureau of Economic Analysis, US Department of Commerce, the [GDP over time](http://www.bea.gov/national/index.htm#gdp) of the United States in current dollars (use the chained value in 2009 dollars), in quarterly intervals, in the file ```gdplev.xls```. For this assignment, only look at GDP data from the first quarter of 2000 onward.\n",
46 | "\n",
47 | "Each function in this assignment below is worth 10%, with the exception of ```run_ttest()```, which is worth 50%." 
48 | ]
49 | },
50 | {
51 | "cell_type": "code",
52 | "execution_count": 9,
53 | "metadata": {
54 | "collapsed": true
55 | },
56 | "outputs": [],
57 | "source": [
58 | "# Use this dictionary to map two-letter acronyms to state names\n",
59 | "states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National', 'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana', 'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho', 'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan', 'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi', 'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota', 'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut', 'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York', 'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado', 'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota', 'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia', 'ND': 'North Dakota', 'VA': 'Virginia'}"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": 20,
65 | "metadata": {
66 | "umich_part_id": "021",
67 | "umich_partlist_id": "004"
68 | },
69 | "outputs": [
70 | {
71 | "data": {
72 | "text/html": [
73 | "
\n", 74 | "\n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | " \n", 90 | " \n", 91 | " \n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | " \n", 133 | " \n", 134 | " \n", 135 | " \n", 136 | " \n", 137 | " \n", 138 | " \n", 139 | " \n", 140 | " \n", 141 | " \n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | " \n", 183 | " \n", 184 | " \n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | 
" \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | " \n", 250 | " \n", 251 | " \n", 252 | " \n", 253 | " \n", 254 | " \n", 255 | " \n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | " \n", 268 | " \n", 269 | " \n", 270 | " \n", 271 | " \n", 272 | " \n", 273 | " \n", 274 | " \n", 275 | " \n", 276 | " \n", 277 | " \n", 278 | " \n", 279 | " \n", 280 | " \n", 281 | " \n", 282 | " \n", 283 | " \n", 284 | " \n", 285 | " \n", 286 | " \n", 287 | " \n", 288 | " \n", 289 | " \n", 290 | " \n", 291 | " \n", 292 | " \n", 293 | " \n", 294 | " \n", 295 | " \n", 296 | " \n", 297 | " \n", 298 | " \n", 299 | " \n", 300 | " \n", 301 | " \n", 302 | " \n", 303 | " \n", 304 | " \n", 305 | " \n", 306 | " \n", 307 | " \n", 308 | " \n", 309 | " \n", 310 | " \n", 311 | " \n", 312 | " \n", 313 | " \n", 314 | " \n", 315 | " \n", 316 | " \n", 317 | " \n", 318 | " \n", 319 | " \n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | " \n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 358 | " \n", 359 | " \n", 360 | " \n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 
| " \n", 384 | " \n", 385 | " \n", 386 | " \n", 387 | " \n", 388 | " \n", 389 | "
StateRegionName
0AlabamaAuburn
1AlabamaFlorence
2AlabamaJacksonville
3AlabamaLivingston
4AlabamaMontevallo
5AlabamaTroy
6AlabamaTuscaloosa
7AlabamaTuskegee
8AlaskaFairbanks
9ArizonaFlagstaff
10ArizonaTempe
11ArizonaTucson
12ArkansasArkadelphia
13ArkansasConway
14ArkansasFayetteville
15ArkansasJonesboro
16ArkansasMagnolia
17ArkansasMonticello
18ArkansasRussellville
19ArkansasSearcy
20CaliforniaAngwin
21CaliforniaArcata
22CaliforniaBerkeley
23CaliforniaChico
24CaliforniaClaremont
25CaliforniaCotati
26CaliforniaDavis
27CaliforniaIrvine
28CaliforniaIsla Vista
29CaliforniaUniversity Park, Los Angeles
.........
487VirginiaWise
488VirginiaChesapeake
489WashingtonBellingham
490WashingtonCheney
491WashingtonEllensburg
492WashingtonPullman
493WashingtonUniversity District, Seattle
494West VirginiaAthens
495West VirginiaBuckhannon
496West VirginiaFairmont
497West VirginiaGlenville
498West VirginiaHuntington
499West VirginiaMontgomery
500West VirginiaMorgantown
501West VirginiaShepherdstown
502West VirginiaWest Liberty
503WisconsinAppleton
504WisconsinEau Claire
505WisconsinGreen Bay
506WisconsinLa Crosse
507WisconsinMadison
508WisconsinMenomonie
509WisconsinMilwaukee
510WisconsinOshkosh
511WisconsinPlatteville
512WisconsinRiver Falls
513WisconsinStevens Point
514WisconsinWaukesha
515WisconsinWhitewater
516WyomingLaramie
\n", 390 | "

517 rows × 2 columns

\n", 391 | "
" 392 | ], 393 | "text/plain": [ 394 | " State RegionName\n", 395 | "0 Alabama Auburn\n", 396 | "1 Alabama Florence\n", 397 | "2 Alabama Jacksonville\n", 398 | "3 Alabama Livingston\n", 399 | "4 Alabama Montevallo\n", 400 | "5 Alabama Troy\n", 401 | "6 Alabama Tuscaloosa\n", 402 | "7 Alabama Tuskegee\n", 403 | "8 Alaska Fairbanks\n", 404 | "9 Arizona Flagstaff\n", 405 | "10 Arizona Tempe\n", 406 | "11 Arizona Tucson\n", 407 | "12 Arkansas Arkadelphia\n", 408 | "13 Arkansas Conway\n", 409 | "14 Arkansas Fayetteville\n", 410 | "15 Arkansas Jonesboro\n", 411 | "16 Arkansas Magnolia\n", 412 | "17 Arkansas Monticello\n", 413 | "18 Arkansas Russellville\n", 414 | "19 Arkansas Searcy\n", 415 | "20 California Angwin\n", 416 | "21 California Arcata\n", 417 | "22 California Berkeley\n", 418 | "23 California Chico\n", 419 | "24 California Claremont\n", 420 | "25 California Cotati\n", 421 | "26 California Davis\n", 422 | "27 California Irvine\n", 423 | "28 California Isla Vista\n", 424 | "29 California University Park, Los Angeles\n", 425 | ".. ... 
...\n", 426 | "487 Virginia Wise\n", 427 | "488 Virginia Chesapeake\n", 428 | "489 Washington Bellingham\n", 429 | "490 Washington Cheney\n", 430 | "491 Washington Ellensburg\n", 431 | "492 Washington Pullman\n", 432 | "493 Washington University District, Seattle\n", 433 | "494 West Virginia Athens\n", 434 | "495 West Virginia Buckhannon\n", 435 | "496 West Virginia Fairmont\n", 436 | "497 West Virginia Glenville\n", 437 | "498 West Virginia Huntington\n", 438 | "499 West Virginia Montgomery\n", 439 | "500 West Virginia Morgantown\n", 440 | "501 West Virginia Shepherdstown\n", 441 | "502 West Virginia West Liberty\n", 442 | "503 Wisconsin Appleton\n", 443 | "504 Wisconsin Eau Claire\n", 444 | "505 Wisconsin Green Bay\n", 445 | "506 Wisconsin La Crosse\n", 446 | "507 Wisconsin Madison\n", 447 | "508 Wisconsin Menomonie\n", 448 | "509 Wisconsin Milwaukee\n", 449 | "510 Wisconsin Oshkosh\n", 450 | "511 Wisconsin Platteville\n", 451 | "512 Wisconsin River Falls\n", 452 | "513 Wisconsin Stevens Point\n", 453 | "514 Wisconsin Waukesha\n", 454 | "515 Wisconsin Whitewater\n", 455 | "516 Wyoming Laramie\n", 456 | "\n", 457 | "[517 rows x 2 columns]" 458 | ] 459 | }, 460 | "execution_count": 20, 461 | "metadata": {}, 462 | "output_type": "execute_result" 463 | } 464 | ], 465 | "source": [ 466 | "def get_list_of_university_towns():\n", 467 | " '''Returns a DataFrame of towns and the states they are in from the \n", 468 | " university_towns.txt list. The format of the DataFrame should be:\n", 469 | " DataFrame( [ [\"Michigan\", \"Ann Arbor\"], [\"Michigan\", \"Yipsilanti\"] ], \n", 470 | " columns=[\"State\", \"RegionName\"] )\n", 471 | " \n", 472 | " The following cleaning needs to be done:\n", 473 | "\n", 474 | " 1. For \"State\", removing characters from \"[\" to the end.\n", 475 | " 2. For \"RegionName\", when applicable, removing every character from \" (\" to the end.\n", 476 | " 3. Depending on how you read the data, you may need to remove newline character '\\n'. 
'''\n", 477 | "    \n", 478 | "    data = []\n", 479 | "    with open('university_towns.txt', \"r\") as file:\n", 480 | "        for line in file:\n", 481 | "            if line.strip().endswith('[edit]'):  # state header line\n", 482 | "                state = line.strip()[:-6]\n", 483 | "            else:  # region line belonging to the last state seen\n", 484 | "                data.append([state, line.split('(')[0].strip()])\n", 485 | "\n", 486 | "    univ_towns = pd.DataFrame(data, columns=['State', 'RegionName'])\n", 487 | "    return univ_towns\n", 488 | "    \n", 489 | "get_list_of_university_towns()" 490 | ] 491 | }, 492 | { 493 | "cell_type": "code", 494 | "execution_count": 43, 495 | "metadata": { 496 | "umich_part_id": "022", 497 | "umich_partlist_id": "004" 498 | }, 499 | "outputs": [ 500 | { 501 | "data": { 502 | "text/plain": [ 503 | "'2008q3'" 504 | ] 505 | }, 506 | "execution_count": 43, 507 | "metadata": {}, 508 | "output_type": "execute_result" 509 | } 510 | ], 511 | "source": [ 512 | "def get_recession_start():\n", 513 | "    '''Returns the year and quarter of the recession start time as a \n", 514 | "    string value in a format such as 2005q3'''\n", 515 | "    \n", 516 | "    gdp = pd.read_excel('gdplev.xls', skiprows=7)[['Unnamed: 4', 'Unnamed: 6']]\n", 517 | "    gdp.columns = ['quarter', 'gdpCurrent']\n", 518 | "    gdp = gdp.sort_values(['quarter']).loc[212:].reset_index(drop=True)  # quarterly data from 2000q1 onward\n", 519 | "    gdp['diff_gdp'] = gdp.gdpCurrent.diff(1).apply(lambda x: 1 if x > 0 else -1)  # -1 marks a quarterly decline\n", 520 | "    gdp['diff_gdp'] = gdp.diff_gdp + gdp.diff_gdp.shift(1)  # -2 marks the second of two consecutive declines\n", 521 | "    res_start_index = gdp[gdp['diff_gdp'] == -2].index[0] - 1  # the first declining quarter\n", 522 | "    return gdp.iloc[res_start_index].quarter\n", 523 | "\n", 524 | "get_recession_start()" 525 | ] 526 | }, 527 | { 528 | "cell_type": "code", 529 | "execution_count": 50, 530 | "metadata": { 531 | "umich_part_id": "023", 532 | "umich_partlist_id": "004" 533 | }, 534 | "outputs": [ 535 | { 536 | "data": { 537 | "text/plain": [ 538 | "'2009q4'" 539 | ] 540 | }, 541 | "execution_count": 50, 542 | "metadata": {}, 543 | "output_type": "execute_result" 544 | } 545 | ], 546 | "source": [ 547 |
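The `diff`/`shift` trick in `get_recession_start` is terse, so here is the same rule spelled out without pandas: a recession begins with two consecutive quarters of GDP decline, and the start is the first of those declining quarters. This is a minimal sketch on hypothetical toy numbers (illustrative only, not values read from `gdplev.xls`):

```python
def recession_start(quarters, gdp):
    """Return the first declining quarter of the first run of two
    consecutive quarterly GDP declines, or None if there is none."""
    for i in range(len(gdp) - 2):
        # gdp declines into quarter i+1 and again into quarter i+2
        if gdp[i] > gdp[i + 1] > gdp[i + 2]:
            return quarters[i + 1]
    return None

# Hypothetical toy series (not taken from gdplev.xls)
quarters = ['2008q1', '2008q2', '2008q3', '2008q4']
gdp = [14889.5, 14963.4, 14891.6, 14577.0]
print(recession_start(quarters, gdp))  # -> 2008q3
```

The pandas version encodes the same condition by summing the sign of the quarter-over-quarter change with its one-quarter shift, so a value of -2 flags the second of two consecutive declines.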
"def get_recession_end():\n", 548 | " '''Returns the year and quarter of the recession end time as a \n", 549 | " string value in a format such as 2005q3'''\n", 550 | " \n", 551 | " gdp = pd.read_excel('gdplev.xls', skiprows=7)[['Unnamed: 4','Unnamed: 6']]\n", 552 | " gdp.columns = ['quater','gdpCurrent']\n", 553 | " gdp = gdp.sort_values(['quater']).loc[212:].reset_index(drop= True)\n", 554 | " gdp['diff_gdp'] = gdp.gdpCurrent.diff(+1).apply(lambda x : 1 if x>0 else -1 )\n", 555 | " gdp['diff_gdp'] = gdp.diff_gdp + gdp.diff_gdp.shift(+1)\n", 556 | " return gdp[(gdp['diff_gdp'] ==2) & (gdp.quater > get_recession_start())].head(1).quater.values[0]\n", 557 | "\n", 558 | "get_recession_end()" 559 | ] 560 | }, 561 | { 562 | "cell_type": "code", 563 | "execution_count": 55, 564 | "metadata": { 565 | "umich_part_id": "024", 566 | "umich_partlist_id": "004" 567 | }, 568 | "outputs": [ 569 | { 570 | "data": { 571 | "text/plain": [ 572 | "'2009q2'" 573 | ] 574 | }, 575 | "execution_count": 55, 576 | "metadata": {}, 577 | "output_type": "execute_result" 578 | } 579 | ], 580 | "source": [ 581 | "def get_recession_bottom():\n", 582 | " '''Returns the year and quarter of the recession bottom time as a \n", 583 | " string value in a format such as 2005q3'''\n", 584 | " \n", 585 | " gdp = pd.read_excel('gdplev.xls', skiprows=7)[['Unnamed: 4','Unnamed: 6']]\n", 586 | " gdp.columns = ['quater','gdpCurrent']\n", 587 | " return gdp.set_index('quater').loc[get_recession_start():get_recession_end()].gdpCurrent.idxmin()\n", 588 | "get_recession_bottom()" 589 | ] 590 | }, 591 | { 592 | "cell_type": "code", 593 | "execution_count": 57, 594 | "metadata": { 595 | "umich_part_id": "025", 596 | "umich_partlist_id": "004" 597 | }, 598 | "outputs": [ 599 | { 600 | "data": { 601 | "text/html": [ 602 | "
\n", 603 | "\n", 604 | " \n", 605 | " \n", 606 | " \n", 607 | " \n", 608 | " \n", 609 | " \n", 610 | " \n", 611 | " \n", 612 | " \n", 613 | " \n", 614 | " \n", 615 | " \n", 616 | " \n", 617 | " \n", 618 | " \n", 619 | " \n", 620 | " \n", 621 | " \n", 622 | " \n", 623 | " \n", 624 | " \n", 625 | " \n", 626 | " \n", 627 | " \n", 628 | " \n", 629 | " \n", 630 | " \n", 631 | " \n", 632 | " \n", 633 | " \n", 634 | " \n", 635 | " \n", 636 | " \n", 637 | " \n", 638 | " \n", 639 | " \n", 640 | " \n", 641 | " \n", 642 | " \n", 643 | " \n", 644 | " \n", 645 | " \n", 646 | " \n", 647 | " \n", 648 | " \n", 649 | " \n", 650 | " \n", 651 | " \n", 652 | " \n", 653 | " \n", 654 | " \n", 655 | " \n", 656 | " \n", 657 | " \n", 658 | " \n", 659 | " \n", 660 | " \n", 661 | " \n", 662 | " \n", 663 | " \n", 664 | " \n", 665 | " \n", 666 | " \n", 667 | " \n", 668 | " \n", 669 | " \n", 670 | " \n", 671 | " \n", 672 | " \n", 673 | " \n", 674 | " \n", 675 | " \n", 676 | " \n", 677 | " \n", 678 | " \n", 679 | " \n", 680 | " \n", 681 | " \n", 682 | " \n", 683 | " \n", 684 | " \n", 685 | " \n", 686 | " \n", 687 | " \n", 688 | " \n", 689 | " \n", 690 | " \n", 691 | " \n", 692 | " \n", 693 | " \n", 694 | " \n", 695 | " \n", 696 | " \n", 697 | " \n", 698 | " \n", 699 | " \n", 700 | " \n", 701 | " \n", 702 | " \n", 703 | " \n", 704 | " \n", 705 | " \n", 706 | " \n", 707 | " \n", 708 | " \n", 709 | " \n", 710 | " \n", 711 | " \n", 712 | " \n", 713 | " \n", 714 | " \n", 715 | " \n", 716 | " \n", 717 | " \n", 718 | " \n", 719 | " \n", 720 | " \n", 721 | " \n", 722 | " \n", 723 | " \n", 724 | " \n", 725 | " \n", 726 | " \n", 727 | " \n", 728 | " \n", 729 | " \n", 730 | " \n", 731 | " \n", 732 | " \n", 733 | " \n", 734 | " \n", 735 | " \n", 736 | " \n", 737 | " \n", 738 | " \n", 739 | " \n", 740 | " \n", 741 | " \n", 742 | " \n", 743 | " \n", 744 | " \n", 745 | " \n", 746 | " \n", 747 | " \n", 748 | " \n", 749 | " \n", 750 | " \n", 751 | " \n", 752 | " \n", 753 | " \n", 754 | " \n", 755 | " \n", 756 | 
" \n", 757 | " \n", 758 | " \n", 759 | " \n", 760 | " \n", 761 | " \n", 762 | " \n", 763 | " \n", 764 | " \n", 765 | " \n", 766 | " \n", 767 | " \n", 768 | " \n", 769 | " \n", 770 | " \n", 771 | " \n", 772 | " \n", 773 | " \n", 774 | " \n", 775 | " \n", 776 | " \n", 777 | " \n", 778 | " \n", 779 | " \n", 780 | " \n", 781 | " \n", 782 | " \n", 783 | " \n", 784 | " \n", 785 | " \n", 786 | " \n", 787 | " \n", 788 | " \n", 789 | " \n", 790 | " \n", 791 | " \n", 792 | " \n", 793 | " \n", 794 | " \n", 795 | " \n", 796 | " \n", 797 | " \n", 798 | " \n", 799 | " \n", 800 | " \n", 801 | " \n", 802 | " \n", 803 | " \n", 804 | " \n", 805 | " \n", 806 | " \n", 807 | " \n", 808 | " \n", 809 | " \n", 810 | " \n", 811 | " \n", 812 | " \n", 813 | " \n", 814 | " \n", 815 | " \n", 816 | " \n", 817 | " \n", 818 | " \n", 819 | " \n", 820 | " \n", 821 | " \n", 822 | " \n", 823 | " \n", 824 | " \n", 825 | " \n", 826 | " \n", 827 | " \n", 828 | " \n", 829 | " \n", 830 | " \n", 831 | " \n", 832 | " \n", 833 | " \n", 834 | " \n", 835 | " \n", 836 | " \n", 837 | " \n", 838 | " \n", 839 | " \n", 840 | " \n", 841 | " \n", 842 | " \n", 843 | " \n", 844 | " \n", 845 | " \n", 846 | " \n", 847 | " \n", 848 | " \n", 849 | " \n", 850 | " \n", 851 | " \n", 852 | " \n", 853 | " \n", 854 | " \n", 855 | " \n", 856 | " \n", 857 | " \n", 858 | " \n", 859 | " \n", 860 | " \n", 861 | " \n", 862 | " \n", 863 | " \n", 864 | " \n", 865 | " \n", 866 | " \n", 867 | " \n", 868 | " \n", 869 | " \n", 870 | " \n", 871 | " \n", 872 | " \n", 873 | " \n", 874 | " \n", 875 | " \n", 876 | " \n", 877 | " \n", 878 | " \n", 879 | " \n", 880 | " \n", 881 | " \n", 882 | " \n", 883 | " \n", 884 | " \n", 885 | " \n", 886 | " \n", 887 | " \n", 888 | " \n", 889 | " \n", 890 | " \n", 891 | " \n", 892 | " \n", 893 | " \n", 894 | " \n", 895 | " \n", 896 | " \n", 897 | " \n", 898 | " \n", 899 | " \n", 900 | " \n", 901 | " \n", 902 | " \n", 903 | " \n", 904 | " \n", 905 | " \n", 906 | " \n", 907 | " \n", 908 | " \n", 909 | " \n", 910 
| " \n", 911 | " \n", 912 | " \n", 913 | " \n", 914 | " \n", 915 | " \n", 916 | " \n", 917 | " \n", 918 | " \n", 919 | " \n", 920 | " \n", 921 | " \n", 922 | " \n", 923 | " \n", 924 | " \n", 925 | " \n", 926 | " \n", 927 | " \n", 928 | " \n", 929 | " \n", 930 | " \n", 931 | " \n", 932 | " \n", 933 | " \n", 934 | " \n", 935 | " \n", 936 | " \n", 937 | " \n", 938 | " \n", 939 | " \n", 940 | " \n", 941 | " \n", 942 | " \n", 943 | " \n", 944 | " \n", 945 | " \n", 946 | " \n", 947 | " \n", 948 | " \n", 949 | " \n", 950 | " \n", 951 | " \n", 952 | " \n", 953 | " \n", 954 | " \n", 955 | " \n", 956 | " \n", 957 | " \n", 958 | " \n", 959 | " \n", 960 | " \n", 961 | " \n", 962 | " \n", 963 | " \n", 964 | " \n", 965 | " \n", 966 | " \n", 967 | " \n", 968 | " \n", 969 | " \n", 970 | " \n", 971 | " \n", 972 | " \n", 973 | " \n", 974 | " \n", 975 | " \n", 976 | " \n", 977 | " \n", 978 | " \n", 979 | " \n", 980 | " \n", 981 | " \n", 982 | " \n", 983 | " \n", 984 | " \n", 985 | " \n", 986 | " \n", 987 | " \n", 988 | " \n", 989 | " \n", 990 | " \n", 991 | " \n", 992 | " \n", 993 | " \n", 994 | " \n", 995 | " \n", 996 | " \n", 997 | " \n", 998 | " \n", 999 | " \n", 1000 | " \n", 1001 | " \n", 1002 | " \n", 1003 | " \n", 1004 | " \n", 1005 | " \n", 1006 | " \n", 1007 | " \n", 1008 | " \n", 1009 | " \n", 1010 | " \n", 1011 | " \n", 1012 | " \n", 1013 | " \n", 1014 | " \n", 1015 | " \n", 1016 | " \n", 1017 | " \n", 1018 | " \n", 1019 | " \n", 1020 | " \n", 1021 | " \n", 1022 | " \n", 1023 | " \n", 1024 | " \n", 1025 | " \n", 1026 | " \n", 1027 | " \n", 1028 | " \n", 1029 | " \n", 1030 | " \n", 1031 | " \n", 1032 | " \n", 1033 | " \n", 1034 | " \n", 1035 | " \n", 1036 | " \n", 1037 | " \n", 1038 | " \n", 1039 | " \n", 1040 | " \n", 1041 | " \n", 1042 | " \n", 1043 | " \n", 1044 | " \n", 1045 | " \n", 1046 | " \n", 1047 | " \n", 1048 | " \n", 1049 | " \n", 1050 | " \n", 1051 | " \n", 1052 | " \n", 1053 | " \n", 1054 | " \n", 1055 | " \n", 1056 | " \n", 1057 | " \n", 1058 | " \n", 1059 | 
" \n", 1060 | " \n", 1061 | " \n", 1062 | " \n", 1063 | " \n", 1064 | " \n", 1065 | " \n", 1066 | " \n", 1067 | " \n", 1068 | " \n", 1069 | " \n", 1070 | " \n", 1071 | " \n", 1072 | " \n", 1073 | " \n", 1074 | " \n", 1075 | " \n", 1076 | " \n", 1077 | " \n", 1078 | " \n", 1079 | " \n", 1080 | " \n", 1081 | " \n", 1082 | " \n", 1083 | " \n", 1084 | " \n", 1085 | " \n", 1086 | " \n", 1087 | " \n", 1088 | " \n", 1089 | " \n", 1090 | " \n", 1091 | " \n", 1092 | " \n", 1093 | " \n", 1094 | " \n", 1095 | " \n", 1096 | " \n", 1097 | " \n", 1098 | " \n", 1099 | " \n", 1100 | " \n", 1101 | " \n", 1102 | " \n", 1103 | " \n", 1104 | " \n", 1105 | " \n", 1106 | " \n", 1107 | " \n", 1108 | " \n", 1109 | " \n", 1110 | " \n", 1111 | " \n", 1112 | " \n", 1113 | " \n", 1114 | " \n", 1115 | " \n", 1116 | " \n", 1117 | " \n", 1118 | " \n", 1119 | " \n", 1120 | " \n", 1121 | " \n", 1122 | " \n", 1123 | " \n", 1124 | " \n", 1125 | " \n", 1126 | " \n", 1127 | " \n", 1128 | " \n", 1129 | " \n", 1130 | " \n", 1131 | " \n", 1132 | " \n", 1133 | " \n", 1134 | " \n", 1135 | " \n", 1136 | " \n", 1137 | " \n", 1138 | " \n", 1139 | " \n", 1140 | " \n", 1141 | " \n", 1142 | " \n", 1143 | " \n", 1144 | " \n", 1145 | " \n", 1146 | " \n", 1147 | " \n", 1148 | " \n", 1149 | " \n", 1150 | " \n", 1151 | " \n", 1152 | " \n", 1153 | " \n", 1154 | " \n", 1155 | " \n", 1156 | " \n", 1157 | " \n", 1158 | " \n", 1159 | " \n", 1160 | " \n", 1161 | " \n", 1162 | " \n", 1163 | " \n", 1164 | " \n", 1165 | " \n", 1166 | " \n", 1167 | " \n", 1168 | " \n", 1169 | " \n", 1170 | " \n", 1171 | " \n", 1172 | " \n", 1173 | " \n", 1174 | " \n", 1175 | " \n", 1176 | " \n", 1177 | " \n", 1178 | " \n", 1179 | " \n", 1180 | " \n", 1181 | " \n", 1182 | " \n", 1183 | " \n", 1184 | " \n", 1185 | " \n", 1186 | " \n", 1187 | " \n", 1188 | " \n", 1189 | " \n", 1190 | " \n", 1191 | " \n", 1192 | " \n", 1193 | " \n", 1194 | " \n", 1195 | " \n", 1196 | " \n", 1197 | " \n", 1198 | " \n", 1199 | " \n", 1200 | " \n", 1201 | " \n", 1202 
| " \n", 1203 | " \n", 1204 | " \n", 1205 | " \n", 1206 | " \n", 1207 | " \n", 1208 | " \n", 1209 | " \n", 1210 | " \n", 1211 | " \n", 1212 | " \n", 1213 | " \n", 1214 | " \n", 1215 | " \n", 1216 | " \n", 1217 | " \n", 1218 | " \n", 1219 | " \n", 1220 | " \n", 1221 | " \n", 1222 | " \n", 1223 | " \n", 1224 | " \n", 1225 | " \n", 1226 | " \n", 1227 | " \n", 1228 | " \n", 1229 | " \n", 1230 | " \n", 1231 | " \n", 1232 | " \n", 1233 | " \n", 1234 | " \n", 1235 | " \n", 1236 | " \n", 1237 | " \n", 1238 | " \n", 1239 | " \n", 1240 | " \n", 1241 | " \n", 1242 | " \n", 1243 | " \n", 1244 | " \n", 1245 | " \n", 1246 | " \n", 1247 | " \n", 1248 | " \n", 1249 | " \n", 1250 | " \n", 1251 | " \n", 1252 | " \n", 1253 | " \n", 1254 | " \n", 1255 | " \n", 1256 | " \n", 1257 | " \n", 1258 | " \n", 1259 | " \n", 1260 | " \n", 1261 | " \n", 1262 | " \n", 1263 | " \n", 1264 | " \n", 1265 | " \n", 1266 | " \n", 1267 | " \n", 1268 | " \n", 1269 | " \n", 1270 | " \n", 1271 | " \n", 1272 | " \n", 1273 | " \n", 1274 | " \n", 1275 | " \n", 1276 | " \n", 1277 | " \n", 1278 | " \n", 1279 | " \n", 1280 | " \n", 1281 | " \n", 1282 | " \n", 1283 | " \n", 1284 | " \n", 1285 | " \n", 1286 | " \n", 1287 | " \n", 1288 | " \n", 1289 | " \n", 1290 | " \n", 1291 | " \n", 1292 | " \n", 1293 | " \n", 1294 | " \n", 1295 | " \n", 1296 | " \n", 1297 | " \n", 1298 | " \n", 1299 | " \n", 1300 | " \n", 1301 | " \n", 1302 | " \n", 1303 | " \n", 1304 | " \n", 1305 | " \n", 1306 | " \n", 1307 | " \n", 1308 | " \n", 1309 | " \n", 1310 | " \n", 1311 | " \n", 1312 | " \n", 1313 | " \n", 1314 | " \n", 1315 | " \n", 1316 | " \n", 1317 | " \n", 1318 | " \n", 1319 | " \n", 1320 | " \n", 1321 | " \n", 1322 | " \n", 1323 | " \n", 1324 | " \n", 1325 | " \n", 1326 | " \n", 1327 | " \n", 1328 | " \n", 1329 | " \n", 1330 | " \n", 1331 | " \n", 1332 | " \n", 1333 | " \n", 1334 | " \n", 1335 | " \n", 1336 | " \n", 1337 | " \n", 1338 | " \n", 1339 | " \n", 1340 | " \n", 1341 | " \n", 1342 | " \n", 1343 | " \n", 1344 | " \n", 
1345 | " \n", 1346 | " \n", 1347 | " \n", 1348 | " \n", 1349 | " \n", 1350 | " \n", 1351 | " \n", 1352 | " \n", 1353 | " \n", 1354 | " \n", 1355 | " \n", 1356 | " \n", 1357 | " \n", 1358 | " \n", 1359 | " \n", 1360 | " \n", 1361 | " \n", 1362 | " \n", 1363 | " \n", 1364 | " \n", 1365 | " \n", 1366 | " \n", 1367 | " \n", 1368 | " \n", 1369 | " \n", 1370 | " \n", 1371 | " \n", 1372 | " \n", 1373 | " \n", 1374 | " \n", 1375 | " \n", 1376 | " \n", 1377 | " \n", 1378 | " \n", 1379 | " \n", 1380 | " \n", 1381 | " \n", 1382 | " \n", 1383 | " \n", 1384 | " \n", 1385 | " \n", 1386 | " \n", 1387 | " \n", 1388 | " \n", 1389 | " \n", 1390 | " \n", 1391 | " \n", 1392 | " \n", 1393 | " \n", 1394 | " \n", 1395 | " \n", 1396 | " \n", 1397 | " \n", 1398 | " \n", 1399 | " \n", 1400 | " \n", 1401 | " \n", 1402 | " \n", 1403 | " \n", 1404 | " \n", 1405 | " \n", 1406 | " \n", 1407 | " \n", 1408 | " \n", 1409 | " \n", 1410 | " \n", 1411 | " \n", 1412 | " \n", 1413 | " \n", 1414 | " \n", 1415 | " \n", 1416 | " \n", 1417 | " \n", 1418 | " \n", 1419 | " \n", 1420 | " \n", 1421 | " \n", 1422 | " \n", 1423 | " \n", 1424 | " \n", 1425 | " \n", 1426 | " \n", 1427 | " \n", 1428 | " \n", 1429 | " \n", 1430 | " \n", 1431 | " \n", 1432 | " \n", 1433 | " \n", 1434 | " \n", 1435 | " \n", 1436 | " \n", 1437 | " \n", 1438 | " \n", 1439 | " \n", 1440 | " \n", 1441 | " \n", 1442 | " \n", 1443 | " \n", 1444 | " \n", 1445 | " \n", 1446 | " \n", 1447 | " \n", 1448 | " \n", 1449 | " \n", 1450 | " \n", 1451 | " \n", 1452 | " \n", 1453 | " \n", 1454 | " \n", 1455 | " \n", 1456 | " \n", 1457 | " \n", 1458 | " \n", 1459 | " \n", 1460 | " \n", 1461 | " \n", 1462 | " \n", 1463 | " \n", 1464 | " \n", 1465 | " \n", 1466 | " \n", 1467 | " \n", 1468 | " \n", 1469 | " \n", 1470 | " \n", 1471 | " \n", 1472 | " \n", 1473 | " \n", 1474 | " \n", 1475 | " \n", 1476 | " \n", 1477 | " \n", 1478 | " \n", 1479 | " \n", 1480 | " \n", 1481 | " \n", 1482 | " \n", 1483 | " \n", 1484 | " \n", 1485 | " \n", 1486 | " \n", 1487 | " 
\n", 1488 | " \n", 1489 | " \n", 1490 | " \n", 1491 | " \n", 1492 | " \n", 1493 | " \n", 1494 | " \n", 1495 | " \n", 1496 | " \n", 1497 | " \n", 1498 | " \n", 1499 | " \n", 1500 | " \n", 1501 | " \n", 1502 | " \n", 1503 | " \n", 1504 | " \n", 1505 | " \n", 1506 | " \n", 1507 | " \n", 1508 | " \n", 1509 | " \n", 1510 | " \n", 1511 | " \n", 1512 | " \n", 1513 | " \n", 1514 | " \n", 1515 | " \n", 1516 | " \n", 1517 | " \n", 1518 | " \n", 1519 | " \n", 1520 | " \n", 1521 | " \n", 1522 | " \n", 1523 | " \n", 1524 | " \n", 1525 | " \n", 1526 | " \n", 1527 | " \n", 1528 | " \n", 1529 | " \n", 1530 | " \n", 1531 | " \n", 1532 | " \n", 1533 | " \n", 1534 | " \n", 1535 | " \n", 1536 | " \n", 1537 | " \n", 1538 | " \n", 1539 | " \n", 1540 | " \n", 1541 | " \n", 1542 | " \n", 1543 | " \n", 1544 | " \n", 1545 | " \n", 1546 | " \n", 1547 | " \n", 1548 | " \n", 1549 | " \n", 1550 | " \n", 1551 | " \n", 1552 | " \n", 1553 | " \n", 1554 | " \n", 1555 | " \n", 1556 | " \n", 1557 | " \n", 1558 | " \n", 1559 | " \n", 1560 | " \n", 1561 | " \n", 1562 | " \n", 1563 | " \n", 1564 | " \n", 1565 | " \n", 1566 | " \n", 1567 | " \n", 1568 | " \n", 1569 | " \n", 1570 | " \n", 1571 | " \n", 1572 | " \n", 1573 | " \n", 1574 | " \n", 1575 | " \n", 1576 | " \n", 1577 | " \n", 1578 | " \n", 1579 | " \n", 1580 | " \n", 1581 | " \n", 1582 | " \n", 1583 | " \n", 1584 | " \n", 1585 | " \n", 1586 | " \n", 1587 | " \n", 1588 | " \n", 1589 | " \n", 1590 | " \n", 1591 | " \n", 1592 | " \n", 1593 | " \n", 1594 | " \n", 1595 | " \n", 1596 | " \n", 1597 | " \n", 1598 | " \n", 1599 | " \n", 1600 | " \n", 1601 | " \n", 1602 | " \n", 1603 | " \n", 1604 | " \n", 1605 | " \n", 1606 | " \n", 1607 | " \n", 1608 | " \n", 1609 | " \n", 1610 | " \n", 1611 | " \n", 1612 | " \n", 1613 | " \n", 1614 | " \n", 1615 | " \n", 1616 | " \n", 1617 | " \n", 1618 | " \n", 1619 | " \n", 1620 | " \n", 1621 | " \n", 1622 | " \n", 1623 | " \n", 1624 | " \n", 1625 | " \n", 1626 | " \n", 1627 | " \n", 1628 | " \n", 1629 | " \n", 1630 | 
" \n", 1631 | " \n", 1632 | " \n", 1633 | " \n", 1634 | " \n", 1635 | " \n", 1636 | " \n", 1637 | " \n", 1638 | " \n", 1639 | " \n", 1640 | " \n", 1641 | " \n", 1642 | " \n", 1643 | " \n", 1644 | " \n", 1645 | " \n", 1646 | " \n", 1647 | " \n", 1648 | " \n", 1649 | " \n", 1650 | " \n", 1651 | " \n", 1652 | " \n", 1653 | " \n", 1654 | " \n", 1655 | " \n", 1656 | " \n", 1657 | " \n", 1658 | " \n", 1659 | " \n", 1660 | " \n", 1661 | " \n", 1662 | " \n", 1663 | " \n", 1664 | " \n", 1665 | " \n", 1666 | " \n", 1667 | " \n", 1668 | " \n", 1669 | " \n", 1670 | " \n", 1671 | " \n", 1672 | " \n", 1673 | " \n", 1674 | " \n", 1675 | " \n", 1676 | " \n", 1677 | " \n", 1678 | " \n", 1679 | " \n", 1680 | " \n", 1681 | " \n", 1682 | " \n", 1683 | " \n", 1684 | " \n", 1685 | " \n", 1686 | " \n", 1687 | " \n", 1688 | " \n", 1689 | " \n", 1690 | " \n", 1691 | " \n", 1692 | " \n", 1693 | " \n", 1694 | " \n", 1695 | " \n", 1696 | " \n", 1697 | " \n", 1698 | " \n", 1699 | " \n", 1700 | " \n", 1701 | " \n", 1702 | " \n", 1703 | " \n", 1704 | " \n", 1705 | " \n", 1706 | " \n", 1707 | " \n", 1708 | " \n", 1709 | " \n", 1710 | " \n", 1711 | " \n", 1712 | " \n", 1713 | " \n", 1714 | " \n", 1715 | " \n", 1716 | " \n", 1717 | " \n", 1718 | " \n", 1719 | " \n", 1720 | " \n", 1721 | " \n", 1722 | " \n", 1723 | " \n", 1724 | " \n", 1725 | " \n", 1726 | " \n", 1727 | " \n", 1728 | " \n", 1729 | " \n", 1730 | " \n", 1731 | " \n", 1732 | " \n", 1733 | " \n", 1734 | " \n", 1735 | " \n", 1736 | " \n", 1737 | " \n", 1738 | " \n", 1739 | " \n", 1740 | " \n", 1741 | " \n", 1742 | " \n", 1743 | " \n", 1744 | " \n", 1745 | " \n", 1746 | " \n", 1747 | " \n", 1748 | " \n", 1749 | " \n", 1750 | " \n", 1751 | " \n", 1752 | " \n", 1753 | " \n", 1754 | " \n", 1755 | " \n", 1756 | " \n", 1757 | " \n", 1758 | " \n", 1759 | " \n", 1760 | " \n", 1761 | " \n", 1762 | " \n", 1763 | " \n", 1764 | " \n", 1765 | " \n", 1766 | " \n", 1767 | " \n", 1768 | " \n", 1769 | " \n", 1770 | " \n", 1771 | " \n", 1772 | " \n", 1773 
| " \n", 1774 | " \n", 1775 | " \n", 1776 | " \n", 1777 | " \n", 1778 | " \n", 1779 | " \n", 1780 | " \n", 1781 | " \n", 1782 | " \n", 1783 | " \n", 1784 | " \n", 1785 | " \n", 1786 | " \n", 1787 | " \n", 1788 | " \n", 1789 | " \n", 1790 | " \n", 1791 | " \n", 1792 | " \n", 1793 | " \n", 1794 | " \n", 1795 | " \n", 1796 | " \n", 1797 | " \n", 1798 | " \n", 1799 | " \n", 1800 | " \n", 1801 | " \n", 1802 | " \n", 1803 | " \n", 1804 | " \n", 1805 | " \n", 1806 | " \n", 1807 | " \n", 1808 | " \n", 1809 | " \n", 1810 | " \n", 1811 | " \n", 1812 | " \n", 1813 | " \n", 1814 | " \n", 1815 | " \n", 1816 | " \n", 1817 | " \n", 1818 | " \n", 1819 | " \n", 1820 | " \n", 1821 | " \n", 1822 | " \n", 1823 | " \n", 1824 | " \n", 1825 | " \n", 1826 | " \n", 1827 | " \n", 1828 | " \n", 1829 | " \n", 1830 | " \n", 1831 | " \n", 1832 | " \n", 1833 | " \n", 1834 | " \n", 1835 | " \n", 1836 | " \n", 1837 | " \n", 1838 | " \n", 1839 | " \n", 1840 | " \n", 1841 | " \n", 1842 | " \n", 1843 | " \n", 1844 | " \n", 1845 | " \n", 1846 | " \n", 1847 | " \n", 1848 | " \n", 1849 | " \n", 1850 | " \n", 1851 | " \n", 1852 | " \n", 1853 | " \n", 1854 | " \n", 1855 | " \n", 1856 | " \n", 1857 | " \n", 1858 | " \n", 1859 | " \n", 1860 | " \n", 1861 | " \n", 1862 | " \n", 1863 | " \n", 1864 | " \n", 1865 | " \n", 1866 | " \n", 1867 | " \n", 1868 | " \n", 1869 | " \n", 1870 | " \n", 1871 | " \n", 1872 | " \n", 1873 | " \n", 1874 | " \n", 1875 | " \n", 1876 | " \n", 1877 | " \n", 1878 | " \n", 1879 | " \n", 1880 | " \n", 1881 | " \n", 1882 | " \n", 1883 | " \n", 1884 | " \n", 1885 | " \n", 1886 | " \n", 1887 | " \n", 1888 | " \n", 1889 | " \n", 1890 | " \n", 1891 | " \n", 1892 | " \n", 1893 | " \n", 1894 | " \n", 1895 | " \n", 1896 | " \n", 1897 | " \n", 1898 | " \n", 1899 | " \n", 1900 | " \n", 1901 | " \n", 1902 | " \n", 1903 | " \n", 1904 | " \n", 1905 | " \n", 1906 | " \n", 1907 | " \n", 1908 | " \n", 1909 | " \n", 1910 | " \n", 1911 | " \n", 1912 | " \n", 1913 | " \n", 1914 | " \n", 1915 | " \n", 
1916 | " \n", 1917 | " \n", 1918 | " \n", 1919 | " \n", 1920 | " \n", 1921 | " \n", 1922 | " \n", 1923 | " \n", 1924 | " \n", 1925 | " \n", 1926 | " \n", 1927 | " \n", 1928 | " \n", 1929 | " \n", 1930 | " \n", 1931 | " \n", 1932 | " \n", 1933 | " \n", 1934 | " \n", 1935 | " \n", 1936 | " \n", 1937 | " \n", 1938 | " \n", 1939 | " \n", 1940 | " \n", 1941 | " \n", 1942 | " \n", 1943 | " \n", 1944 | " \n", 1945 | " \n", 1946 | " \n", 1947 | " \n", 1948 | " \n", 1949 | " \n", 1950 | " \n", 1951 | " \n", 1952 | " \n", 1953 | " \n", 1954 | " \n", 1955 | " \n", 1956 | " \n", 1957 | " \n", 1958 | " \n", 1959 | " \n", 1960 | " \n", 1961 | " \n", 1962 | " \n", 1963 | " \n", 1964 | " \n", 1965 | " \n", 1966 | " \n", 1967 | " \n", 1968 | " \n", 1969 | " \n", 1970 | " \n", 1971 | " \n", 1972 | " \n", 1973 | " \n", 1974 | " \n", 1975 | " \n", 1976 | " \n", 1977 | " \n", 1978 | " \n", 1979 | " \n", 1980 | " \n", 1981 | " \n", 1982 | " \n", 1983 | " \n", 1984 | " \n", 1985 | " \n", 1986 | " \n", 1987 | " \n", 1988 | " \n", 1989 | " \n", 1990 | " \n", 1991 | " \n", 1992 | " \n", 1993 | " \n", 1994 | " \n", 1995 | " \n", 1996 | " \n", 1997 | " \n", 1998 | " \n", 1999 | " \n", 2000 | " \n", 2001 | " \n", 2002 | " \n", 2003 | " \n", 2004 | " \n", 2005 | " \n", 2006 | " \n", 2007 | " \n", 2008 | " \n", 2009 | " \n", 2010 | " \n", 2011 | " \n", 2012 | " \n", 2013 | " \n", 2014 | " \n", 2015 | " \n", 2016 | " \n", 2017 | " \n", 2018 | " \n", 2019 | " \n", 2020 | " \n", 2021 | " \n", 2022 | " \n", 2023 | " \n", 2024 | " \n", 2025 | " \n", 2026 | " \n", 2027 | " \n", 2028 | " \n", 2029 | " \n", 2030 | " \n", 2031 | " \n", 2032 | " \n", 2033 | " \n", 2034 | " \n", 2035 | " \n", 2036 | " \n", 2037 | " \n", 2038 | " \n", 2039 | " \n", 2040 | " \n", 2041 | " \n", 2042 | " \n", 2043 | " \n", 2044 | " \n", 2045 | " \n", 2046 | " \n", 2047 | " \n", 2048 | " \n", 2049 | " \n", 2050 | " \n", 2051 | " \n", 2052 | " \n", 2053 | " \n", 2054 | " \n", 2055 | " \n", 2056 | " \n", 2057 | " \n", 2058 | " 
\n", 2059 | " \n", 2060 | " \n", 2061 | " \n", 2062 | " \n", 2063 | " \n", 2064 | " \n", 2065 | " \n", 2066 | " \n", 2067 | " \n", 2068 | " \n", 2069 | " \n", 2070 | " \n", 2071 | " \n", 2072 | " \n", 2073 | " \n", 2074 | " \n", 2075 | " \n", 2076 | " \n", 2077 | " \n", 2078 | " \n", 2079 | " \n", 2080 | " \n", 2081 | " \n", 2082 | " \n", 2083 | " \n", 2084 | " \n", 2085 | " \n", 2086 | " \n", 2087 | " \n", 2088 | " \n", 2089 | " \n", 2090 | " \n", 2091 | " \n", 2092 | " \n", 2093 | " \n", 2094 | " \n", 2095 | " \n", 2096 | "
| | RegionID | RegionName | State | Metro | CountyName | SizeRank | 1996-04 | 1996-05 | 1996-06 | 1996-07 | ... | 2015-11 | 2015-12 | 2016-01 | 2016-02 | 2016-03 | 2016-04 | 2016-05 | 2016-06 | 2016-07 | 2016-08 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6181 | New York | NY | New York | Queens | 1 | NaN | NaN | NaN | NaN | ... | 573600 | 576200 | 578400 | 582200 | 588000 | 592200 | 592500 | 590200 | 588000 | 586400 |
| 1 | 12447 | Los Angeles | CA | Los Angeles-Long Beach-Anaheim | Los Angeles | 2 | 155000.0 | 154600.0 | 154400.0 | 154200.0 | ... | 558200 | 560800 | 562800 | 565600 | 569700 | 574000 | 577800 | 580600 | 583000 | 585100 |
| 2 | 17426 | Chicago | IL | Chicago | Cook | 3 | 109700.0 | 109400.0 | 109300.0 | 109300.0 | ... | 207800 | 206900 | 206200 | 205800 | 206200 | 207300 | 208200 | 209100 | 211000 | 213000 |
| 3 | 13271 | Philadelphia | PA | Philadelphia | Philadelphia | 4 | 50000.0 | 49900.0 | 49600.0 | 49400.0 | ... | 122300 | 121600 | 121800 | 123300 | 125200 | 126400 | 127000 | 127400 | 128300 | 129100 |
| 4 | 40326 | Phoenix | AZ | Phoenix | Maricopa | 5 | 87200.0 | 87700.0 | 88200.0 | 88400.0 | ... | 183800 | 185300 | 186600 | 188000 | 189100 | 190200 | 191300 | 192800 | 194500 | 195900 |
| 5 | 18959 | Las Vegas | NV | Las Vegas | Clark | 6 | 121600.0 | 120900.0 | 120400.0 | 120300.0 | ... | 190600 | 192000 | 193600 | 194800 | 195400 | 196100 | 197300 | 198200 | 199300 | 200600 |
| 6 | 54296 | San Diego | CA | San Diego | San Diego | 7 | 161100.0 | 160700.0 | 160400.0 | 160100.0 | ... | 525700 | 526700 | 527800 | 529200 | 531000 | 533900 | 536900 | 537900 | 539000 | 540500 |
| 7 | 38128 | Dallas | TX | Dallas-Fort Worth | Dallas | 8 | NaN | NaN | NaN | NaN | ... | 134600 | 136600 | 138700 | 140600 | 142200 | 143300 | 144500 | 146000 | 148200 | 150400 |
| 8 | 33839 | San Jose | CA | San Jose | Santa Clara | 9 | 224500.0 | 224900.0 | 225400.0 | 226100.0 | ... | 789700 | 792100 | 795800 | 803100 | 811900 | 817600 | 819100 | 820100 | 821700 | 822700 |
| 9 | 25290 | Jacksonville | FL | Jacksonville | Duval | 10 | 77500.0 | 77200.0 | 76800.0 | 76600.0 | ... | 132000 | 132500 | 133100 | 133900 | 134900 | 136000 | 137200 | 138400 | 139500 | 140300 |
| 10 | 20330 | San Francisco | CA | San Francisco | San Francisco | 11 | 262500.0 | 263500.0 | 264100.0 | 265000.0 | ... | 1105800 | 1112300 | 1117400 | 1122700 | 1125200 | 1123200 | 1119800 | 1114800 | 1108800 | 1104000 |
| 11 | 10221 | Austin | TX | Austin | Travis | 12 | NaN | NaN | NaN | NaN | ... | 287300 | 289300 | 291100 | 293400 | 296000 | 299200 | 301800 | 303300 | 304100 | 304800 |
| 12 | 17762 | Detroit | MI | Detroit | Wayne | 13 | NaN | NaN | NaN | NaN | ... | 38500 | 38400 | 38300 | 38000 | 37600 | 37400 | 37500 | 37500 | 37700 | 38100 |
| 13 | 10920 | Columbus | OH | Columbus | Franklin | 14 | 83100.0 | 83200.0 | 83300.0 | 83500.0 | ... | 115200 | 115800 | 116200 | 116700 | 117200 | 117700 | 118100 | 118800 | 119700 | 120500 |
| 14 | 32811 | Memphis | TN | Memphis | Shelby | 15 | 60600.0 | 60500.0 | 60700.0 | 60800.0 | ... | 69600 | 69800 | 69900 | 70800 | 72000 | 73100 | 74300 | 75100 | 75600 | 76200 |
| 15 | 24043 | Charlotte | NC | Charlotte | Mecklenburg | 16 | 94500.0 | 94900.0 | 95700.0 | 96400.0 | ... | 162800 | 164300 | 165500 | 166500 | 167400 | 168400 | 169500 | 170400 | 171700 | 173100 |
| 16 | 17933 | El Paso | TX | El Paso | El Paso | 17 | 67400.0 | 67800.0 | 68000.0 | 68300.0 | ... | 110200 | 110000 | 110200 | 110600 | 111200 | 111500 | 111400 | 111500 | 112100 | 112300 |
| 17 | 44269 | Boston | MA | Boston | Suffolk | 18 | 123100.0 | 122800.0 | 123100.0 | 123800.0 | ... | 471000 | 474600 | 478700 | 482900 | 486200 | 488000 | 489700 | 493400 | 499200 | 504200 |
| 18 | 16037 | Seattle | WA | Seattle | King | 19 | 164400.0 | 163900.0 | 163600.0 | 163400.0 | ... | 533700 | 538700 | 544300 | 551200 | 559700 | 568600 | 576200 | 581800 | 587200 | 592200 |
| 19 | 3523 | Baltimore | MD | Baltimore | Baltimore City | 20 | 53200.0 | 53900.0 | 54400.0 | 54700.0 | ... | 113600 | 114000 | 114500 | 114700 | 114800 | 114700 | 114600 | 114900 | 115100 | 115200 |
| 20 | 11093 | Denver | CO | Denver | Denver | 21 | 98700.0 | 99200.0 | 99600.0 | 100200.0 | ... | 330500 | 332100 | 333500 | 335600 | 337600 | 339700 | 342700 | 345900 | 349800 | 353300 |
| 21 | 41568 | Washington | DC | Washington | District of Columbia | 22 | NaN | NaN | NaN | NaN | ... | 501200 | 502500 | 503800 | 504700 | 503800 | 503400 | 505100 | 508900 | 513900 | 518600 |
| 22 | 6118 | Nashville | TN | Nashville | Davidson | 23 | 83100.0 | 83800.0 | 84800.0 | 85900.0 | ... | 189300 | 191400 | 193300 | 195100 | 196800 | 198400 | 200300 | 202400 | 205000 | 207200 |
| 23 | 5976 | Milwaukee | WI | Milwaukee | Milwaukee | 24 | 68100.0 | 68100.0 | 68100.0 | 67800.0 | ... | 94600 | 94300 | 94200 | 94600 | 95200 | 95900 | 96300 | 96900 | 98200 | 99500 |
| 24 | 7481 | Tucson | AZ | Tucson | Pima | 25 | 91500.0 | 91500.0 | 91600.0 | 91500.0 | ... | 148200 | 148400 | 148800 | 149500 | 150300 | 150700 | 151100 | 151700 | 152400 | 153000 |
| 25 | 13373 | Portland | OR | Portland | Multnomah | 26 | 121100.0 | 122200.0 | 123000.0 | 123600.0 | ... | 343400 | 347800 | 351900 | 356100 | 360000 | 364400 | 369300 | 375700 | 383600 | 390500 |
| 26 | 33225 | Oklahoma City | OK | Oklahoma City | Oklahoma | 27 | 64900.0 | 65400.0 | 65700.0 | 65800.0 | ... | 127300 | 127700 | 127600 | 127700 | 128400 | 129000 | 129300 | 129600 | 130100 | 130500 |
| 27 | 40152 | Omaha | NE | Omaha | Douglas | 28 | 88900.0 | 89600.0 | 90400.0 | 90800.0 | ... | 140300 | 140500 | 140900 | 141600 | 142400 | 142800 | 142700 | 142500 | 143100 | 143800 |
| 28 | 23429 | Albuquerque | NM | Albuquerque | Bernalillo | 29 | 115400.0 | 115600.0 | 116000.0 | 116700.0 | ... | 167300 | 167800 | 168300 | 169100 | 169900 | 170300 | 170500 | 171100 | 171800 | 172000 |
| 29 | 18203 | Fresno | CA | Fresno | Fresno | 30 | 90400.0 | 90400.0 | 90200.0 | 90000.0 | ... | 187600 | 187700 | 187900 | 189000 | 190200 | 191200 | 192700 | 194300 | 195800 | 197100 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10700 | 49199 | Granite Shoals | TX | NaN | Burnet | 10701 | NaN | NaN | NaN | NaN | ... | 128200 | 129900 | 131500 | 133700 | 136100 | 137900 | 139700 | 142600 | 145700 | 147200 |
| 10701 | 49693 | Piney Point | MD | California-Lexington Park | Saint Marys | 10702 | 148400.0 | 152300.0 | 153600.0 | 153100.0 | ... | 309800 | 312600 | 315900 | 319900 | 322700 | 323600 | 324300 | 324600 | 324500 | 324700 |
| 10702 | 50374 | Maribel | WI | Manitowoc | Manitowoc | 10703 | NaN | NaN | NaN | NaN | ... | 129100 | 129700 | 128900 | 127600 | 127000 | 127800 | 129300 | 130700 | 132900 | 135500 |
| 10703 | 50539 | Middleton | ID | Boise City | Canyon | 10704 | 103100.0 | 103200.0 | 103300.0 | 103000.0 | ... | 151100 | 152300 | 153200 | 154000 | 154500 | 155500 | 157200 | 158800 | 160100 | 161400 |
| 10704 | 50963 | Bennett | CO | Denver | Adams | 10705 | 83800.0 | 85900.0 | 87100.0 | 88100.0 | ... | 195700 | 197400 | 198500 | 199400 | 201400 | 204500 | 207700 | 210200 | 211900 | 213300 |
| 10705 | 51793 | East Hampstead | NH | Boston | Rockingham | 10706 | 132200.0 | 128600.0 | 125500.0 | 125200.0 | ... | 269600 | 270600 | 271200 | 272400 | 274000 | 275800 | 277800 | 279800 | 281700 | 283200 |
| 10706 | 52166 | Garden City | MO | Kansas City | Cass | 10707 | NaN | NaN | NaN | NaN | ... | 103600 | 104800 | 105200 | 105800 | 107800 | 110000 | 112000 | 113000 | 113500 | 113700 |
| 10707 | 53456 | Mountainburg | AR | Fort Smith | Crawford | 10708 | 55600.0 | 55500.0 | 55400.0 | 56200.0 | ... | 90100 | 93400 | 96300 | 98600 | 100100 | 101200 | 101800 | 102700 | 103300 | 103500 |
| 10708 | 53730 | Oostburg | WI | Sheboygan | Sheboygan | 10709 | 86300.0 | 84900.0 | 83800.0 | 83700.0 | ... | 132000 | 132500 | 132700 | 132600 | 133000 | 133700 | 134100 | 134500 | 135700 | 137000 |
1070954771Twin PeaksCARiversideSan Bernardino1071085500.085200.084600.084400.0...159200160600162400164400165700166300167700169900172500174500
1071054802Upper BrookvilleNYNew YorkNassau10711897200.0894000.0891300.0894400.0...1833700185490018793001905900192830019418001942000194840019626001975000
1071154995VolcanoHIHiloHawaii10712114600.0108600.0102400.096700.0...232500236400239700241800244700247800249600249500248500247200
1071255072WedgefieldSCSumterSumter10713NaNNaNNaNNaN...69000685006820068500699007190074100768007960081800
1071355210WilliamstonMILansingIngham10714120900.0124800.0128200.0130200.0...182700181800180800181000182600182900182200182100182800183200
1071455357DecaturARFayettevilleBenton1071554700.055100.055100.054700.0...97200974009690096300963009660096600967009690096800
1071555476BricevilleTNKnoxvilleAnderson1071637000.037500.036700.036100.0...43200417004080040500411004170042000417004110040600
1071655706EdgewoodINIndianapolisMadison10717NaNNaNNaNNaN...10020010110010020098900992009960099900100400101000100900
1071756183PalmyraTNClarksvilleMontgomery10718NaNNaNNaNNaN...126400127600127300127300128400130000133000135600137000138500
1071856845Saint InigoesMDCalifornia-Lexington ParkSaint Marys10719137400.0136900.0137500.0138600.0...277200275800276300279500282200282900282900282100281400281400
1071956943MarysvilleINLouisville/Jefferson CountyClark10720NaNNaNNaNNaN...119800124100127200129500128000124800122700122200123100125300
1072057212Forest FallsCARiversideSan Bernardino1072176400.075600.074100.073100.0...199700195300190400187500188000190000190600189000187100186200
10721171874Bois D ArcMOSpringfieldGreene1072277700.077500.077700.078600.0...149800149100148000146900145700145000143700142600143200144800
10722182023HenricoVARichmondHenrico10723110200.0110500.0110900.0111100.0...213800215100216000216200215900215900215900216800219000221300
10723188693Diamond BeachNJOcean CityCape May10724136500.0136800.0137000.0135200.0...400100401600403100405000405500404500403800403400400700397300
10724227014Gruetli LaagerTNNaNGrundy1072524800.024300.024500.025000.0...71800729007450075700758007590076600769007720077800
10725398292Town of WrightstownWIGreen BayBrown10726NaNNaNNaNNaN...149900150100150300150000149200149900151400152500154100155900
10726398343UrbanaNYCorningSteuben1072766900.065800.065500.065100.0...135700136400137700138700140500143600145000144000143000143000
10727398496New DenmarkWIGreen BayBrown10728NaNNaNNaNNaN...188700189800190800191200191200191700192800194000196300198900
10728398839AngelsCANaNCalaveras10729115600.0116400.0118000.0119000.0...280400279600278000276600275000273700272000269100269000270900
10729399114HollandWISheboyganSheboygan10730129900.0130200.0130300.0129100.0...217800219400221100222000222800224900228000231200233900236000
\n", 2097 | "

10730 rows × 251 columns

\n", 2098 | "
" 2099 | ], 2100 | "text/plain": [ 2101 | " RegionID RegionName State Metro \\\n", 2102 | "0 6181 New York NY New York \n", 2103 | "1 12447 Los Angeles CA Los Angeles-Long Beach-Anaheim \n", 2104 | "2 17426 Chicago IL Chicago \n", 2105 | "3 13271 Philadelphia PA Philadelphia \n", 2106 | "4 40326 Phoenix AZ Phoenix \n", 2107 | "5 18959 Las Vegas NV Las Vegas \n", 2108 | "6 54296 San Diego CA San Diego \n", 2109 | "7 38128 Dallas TX Dallas-Fort Worth \n", 2110 | "8 33839 San Jose CA San Jose \n", 2111 | "9 25290 Jacksonville FL Jacksonville \n", 2112 | "10 20330 San Francisco CA San Francisco \n", 2113 | "11 10221 Austin TX Austin \n", 2114 | "12 17762 Detroit MI Detroit \n", 2115 | "13 10920 Columbus OH Columbus \n", 2116 | "14 32811 Memphis TN Memphis \n", 2117 | "15 24043 Charlotte NC Charlotte \n", 2118 | "16 17933 El Paso TX El Paso \n", 2119 | "17 44269 Boston MA Boston \n", 2120 | "18 16037 Seattle WA Seattle \n", 2121 | "19 3523 Baltimore MD Baltimore \n", 2122 | "20 11093 Denver CO Denver \n", 2123 | "21 41568 Washington DC Washington \n", 2124 | "22 6118 Nashville TN Nashville \n", 2125 | "23 5976 Milwaukee WI Milwaukee \n", 2126 | "24 7481 Tucson AZ Tucson \n", 2127 | "25 13373 Portland OR Portland \n", 2128 | "26 33225 Oklahoma City OK Oklahoma City \n", 2129 | "27 40152 Omaha NE Omaha \n", 2130 | "28 23429 Albuquerque NM Albuquerque \n", 2131 | "29 18203 Fresno CA Fresno \n", 2132 | "... ... ... ... ... 
\n", 2133 | "10700 49199 Granite Shoals TX NaN \n", 2134 | "10701 49693 Piney Point MD California-Lexington Park \n", 2135 | "10702 50374 Maribel WI Manitowoc \n", 2136 | "10703 50539 Middleton ID Boise City \n", 2137 | "10704 50963 Bennett CO Denver \n", 2138 | "10705 51793 East Hampstead NH Boston \n", 2139 | "10706 52166 Garden City MO Kansas City \n", 2140 | "10707 53456 Mountainburg AR Fort Smith \n", 2141 | "10708 53730 Oostburg WI Sheboygan \n", 2142 | "10709 54771 Twin Peaks CA Riverside \n", 2143 | "10710 54802 Upper Brookville NY New York \n", 2144 | "10711 54995 Volcano HI Hilo \n", 2145 | "10712 55072 Wedgefield SC Sumter \n", 2146 | "10713 55210 Williamston MI Lansing \n", 2147 | "10714 55357 Decatur AR Fayetteville \n", 2148 | "10715 55476 Briceville TN Knoxville \n", 2149 | "10716 55706 Edgewood IN Indianapolis \n", 2150 | "10717 56183 Palmyra TN Clarksville \n", 2151 | "10718 56845 Saint Inigoes MD California-Lexington Park \n", 2152 | "10719 56943 Marysville IN Louisville/Jefferson County \n", 2153 | "10720 57212 Forest Falls CA Riverside \n", 2154 | "10721 171874 Bois D Arc MO Springfield \n", 2155 | "10722 182023 Henrico VA Richmond \n", 2156 | "10723 188693 Diamond Beach NJ Ocean City \n", 2157 | "10724 227014 Gruetli Laager TN NaN \n", 2158 | "10725 398292 Town of Wrightstown WI Green Bay \n", 2159 | "10726 398343 Urbana NY Corning \n", 2160 | "10727 398496 New Denmark WI Green Bay \n", 2161 | "10728 398839 Angels CA NaN \n", 2162 | "10729 399114 Holland WI Sheboygan \n", 2163 | "\n", 2164 | " CountyName SizeRank 1996-04 1996-05 1996-06 1996-07 \\\n", 2165 | "0 Queens 1 NaN NaN NaN NaN \n", 2166 | "1 Los Angeles 2 155000.0 154600.0 154400.0 154200.0 \n", 2167 | "2 Cook 3 109700.0 109400.0 109300.0 109300.0 \n", 2168 | "3 Philadelphia 4 50000.0 49900.0 49600.0 49400.0 \n", 2169 | "4 Maricopa 5 87200.0 87700.0 88200.0 88400.0 \n", 2170 | "5 Clark 6 121600.0 120900.0 120400.0 120300.0 \n", 2171 | "6 San Diego 7 161100.0 160700.0 160400.0 160100.0 
\n", 2172 | "7 Dallas 8 NaN NaN NaN NaN \n", 2173 | "8 Santa Clara 9 224500.0 224900.0 225400.0 226100.0 \n", 2174 | "9 Duval 10 77500.0 77200.0 76800.0 76600.0 \n", 2175 | "10 San Francisco 11 262500.0 263500.0 264100.0 265000.0 \n", 2176 | "11 Travis 12 NaN NaN NaN NaN \n", 2177 | "12 Wayne 13 NaN NaN NaN NaN \n", 2178 | "13 Franklin 14 83100.0 83200.0 83300.0 83500.0 \n", 2179 | "14 Shelby 15 60600.0 60500.0 60700.0 60800.0 \n", 2180 | "15 Mecklenburg 16 94500.0 94900.0 95700.0 96400.0 \n", 2181 | "16 El Paso 17 67400.0 67800.0 68000.0 68300.0 \n", 2182 | "17 Suffolk 18 123100.0 122800.0 123100.0 123800.0 \n", 2183 | "18 King 19 164400.0 163900.0 163600.0 163400.0 \n", 2184 | "19 Baltimore City 20 53200.0 53900.0 54400.0 54700.0 \n", 2185 | "20 Denver 21 98700.0 99200.0 99600.0 100200.0 \n", 2186 | "21 District of Columbia 22 NaN NaN NaN NaN \n", 2187 | "22 Davidson 23 83100.0 83800.0 84800.0 85900.0 \n", 2188 | "23 Milwaukee 24 68100.0 68100.0 68100.0 67800.0 \n", 2189 | "24 Pima 25 91500.0 91500.0 91600.0 91500.0 \n", 2190 | "25 Multnomah 26 121100.0 122200.0 123000.0 123600.0 \n", 2191 | "26 Oklahoma 27 64900.0 65400.0 65700.0 65800.0 \n", 2192 | "27 Douglas 28 88900.0 89600.0 90400.0 90800.0 \n", 2193 | "28 Bernalillo 29 115400.0 115600.0 116000.0 116700.0 \n", 2194 | "29 Fresno 30 90400.0 90400.0 90200.0 90000.0 \n", 2195 | "... ... ... ... ... ... ... 
\n", 2196 | "10700 Burnet 10701 NaN NaN NaN NaN \n", 2197 | "10701 Saint Marys 10702 148400.0 152300.0 153600.0 153100.0 \n", 2198 | "10702 Manitowoc 10703 NaN NaN NaN NaN \n", 2199 | "10703 Canyon 10704 103100.0 103200.0 103300.0 103000.0 \n", 2200 | "10704 Adams 10705 83800.0 85900.0 87100.0 88100.0 \n", 2201 | "10705 Rockingham 10706 132200.0 128600.0 125500.0 125200.0 \n", 2202 | "10706 Cass 10707 NaN NaN NaN NaN \n", 2203 | "10707 Crawford 10708 55600.0 55500.0 55400.0 56200.0 \n", 2204 | "10708 Sheboygan 10709 86300.0 84900.0 83800.0 83700.0 \n", 2205 | "10709 San Bernardino 10710 85500.0 85200.0 84600.0 84400.0 \n", 2206 | "10710 Nassau 10711 897200.0 894000.0 891300.0 894400.0 \n", 2207 | "10711 Hawaii 10712 114600.0 108600.0 102400.0 96700.0 \n", 2208 | "10712 Sumter 10713 NaN NaN NaN NaN \n", 2209 | "10713 Ingham 10714 120900.0 124800.0 128200.0 130200.0 \n", 2210 | "10714 Benton 10715 54700.0 55100.0 55100.0 54700.0 \n", 2211 | "10715 Anderson 10716 37000.0 37500.0 36700.0 36100.0 \n", 2212 | "10716 Madison 10717 NaN NaN NaN NaN \n", 2213 | "10717 Montgomery 10718 NaN NaN NaN NaN \n", 2214 | "10718 Saint Marys 10719 137400.0 136900.0 137500.0 138600.0 \n", 2215 | "10719 Clark 10720 NaN NaN NaN NaN \n", 2216 | "10720 San Bernardino 10721 76400.0 75600.0 74100.0 73100.0 \n", 2217 | "10721 Greene 10722 77700.0 77500.0 77700.0 78600.0 \n", 2218 | "10722 Henrico 10723 110200.0 110500.0 110900.0 111100.0 \n", 2219 | "10723 Cape May 10724 136500.0 136800.0 137000.0 135200.0 \n", 2220 | "10724 Grundy 10725 24800.0 24300.0 24500.0 25000.0 \n", 2221 | "10725 Brown 10726 NaN NaN NaN NaN \n", 2222 | "10726 Steuben 10727 66900.0 65800.0 65500.0 65100.0 \n", 2223 | "10727 Brown 10728 NaN NaN NaN NaN \n", 2224 | "10728 Calaveras 10729 115600.0 116400.0 118000.0 119000.0 \n", 2225 | "10729 Sheboygan 10730 129900.0 130200.0 130300.0 129100.0 \n", 2226 | "\n", 2227 | " ... 2015-11 2015-12 2016-01 2016-02 2016-03 2016-04 2016-05 \\\n", 2228 | "0 ... 
573600 576200 578400 582200 588000 592200 592500 \n", 2229 | "1 ... 558200 560800 562800 565600 569700 574000 577800 \n", 2230 | "2 ... 207800 206900 206200 205800 206200 207300 208200 \n", 2231 | "3 ... 122300 121600 121800 123300 125200 126400 127000 \n", 2232 | "4 ... 183800 185300 186600 188000 189100 190200 191300 \n", 2233 | "5 ... 190600 192000 193600 194800 195400 196100 197300 \n", 2234 | "6 ... 525700 526700 527800 529200 531000 533900 536900 \n", 2235 | "7 ... 134600 136600 138700 140600 142200 143300 144500 \n", 2236 | "8 ... 789700 792100 795800 803100 811900 817600 819100 \n", 2237 | "9 ... 132000 132500 133100 133900 134900 136000 137200 \n", 2238 | "10 ... 1105800 1112300 1117400 1122700 1125200 1123200 1119800 \n", 2239 | "11 ... 287300 289300 291100 293400 296000 299200 301800 \n", 2240 | "12 ... 38500 38400 38300 38000 37600 37400 37500 \n", 2241 | "13 ... 115200 115800 116200 116700 117200 117700 118100 \n", 2242 | "14 ... 69600 69800 69900 70800 72000 73100 74300 \n", 2243 | "15 ... 162800 164300 165500 166500 167400 168400 169500 \n", 2244 | "16 ... 110200 110000 110200 110600 111200 111500 111400 \n", 2245 | "17 ... 471000 474600 478700 482900 486200 488000 489700 \n", 2246 | "18 ... 533700 538700 544300 551200 559700 568600 576200 \n", 2247 | "19 ... 113600 114000 114500 114700 114800 114700 114600 \n", 2248 | "20 ... 330500 332100 333500 335600 337600 339700 342700 \n", 2249 | "21 ... 501200 502500 503800 504700 503800 503400 505100 \n", 2250 | "22 ... 189300 191400 193300 195100 196800 198400 200300 \n", 2251 | "23 ... 94600 94300 94200 94600 95200 95900 96300 \n", 2252 | "24 ... 148200 148400 148800 149500 150300 150700 151100 \n", 2253 | "25 ... 343400 347800 351900 356100 360000 364400 369300 \n", 2254 | "26 ... 127300 127700 127600 127700 128400 129000 129300 \n", 2255 | "27 ... 140300 140500 140900 141600 142400 142800 142700 \n", 2256 | "28 ... 167300 167800 168300 169100 169900 170300 170500 \n", 2257 | "29 ... 
187600 187700 187900 189000 190200 191200 192700 \n", 2258 | "... ... ... ... ... ... ... ... ... \n", 2259 | "10700 ... 128200 129900 131500 133700 136100 137900 139700 \n", 2260 | "10701 ... 309800 312600 315900 319900 322700 323600 324300 \n", 2261 | "10702 ... 129100 129700 128900 127600 127000 127800 129300 \n", 2262 | "10703 ... 151100 152300 153200 154000 154500 155500 157200 \n", 2263 | "10704 ... 195700 197400 198500 199400 201400 204500 207700 \n", 2264 | "10705 ... 269600 270600 271200 272400 274000 275800 277800 \n", 2265 | "10706 ... 103600 104800 105200 105800 107800 110000 112000 \n", 2266 | "10707 ... 90100 93400 96300 98600 100100 101200 101800 \n", 2267 | "10708 ... 132000 132500 132700 132600 133000 133700 134100 \n", 2268 | "10709 ... 159200 160600 162400 164400 165700 166300 167700 \n", 2269 | "10710 ... 1833700 1854900 1879300 1905900 1928300 1941800 1942000 \n", 2270 | "10711 ... 232500 236400 239700 241800 244700 247800 249600 \n", 2271 | "10712 ... 69000 68500 68200 68500 69900 71900 74100 \n", 2272 | "10713 ... 182700 181800 180800 181000 182600 182900 182200 \n", 2273 | "10714 ... 97200 97400 96900 96300 96300 96600 96600 \n", 2274 | "10715 ... 43200 41700 40800 40500 41100 41700 42000 \n", 2275 | "10716 ... 100200 101100 100200 98900 99200 99600 99900 \n", 2276 | "10717 ... 126400 127600 127300 127300 128400 130000 133000 \n", 2277 | "10718 ... 277200 275800 276300 279500 282200 282900 282900 \n", 2278 | "10719 ... 119800 124100 127200 129500 128000 124800 122700 \n", 2279 | "10720 ... 199700 195300 190400 187500 188000 190000 190600 \n", 2280 | "10721 ... 149800 149100 148000 146900 145700 145000 143700 \n", 2281 | "10722 ... 213800 215100 216000 216200 215900 215900 215900 \n", 2282 | "10723 ... 400100 401600 403100 405000 405500 404500 403800 \n", 2283 | "10724 ... 71800 72900 74500 75700 75800 75900 76600 \n", 2284 | "10725 ... 149900 150100 150300 150000 149200 149900 151400 \n", 2285 | "10726 ... 
135700 136400 137700 138700 140500 143600 145000 \n", 2286 | "10727 ... 188700 189800 190800 191200 191200 191700 192800 \n", 2287 | "10728 ... 280400 279600 278000 276600 275000 273700 272000 \n", 2288 | "10729 ... 217800 219400 221100 222000 222800 224900 228000 \n", 2289 | "\n", 2290 | " 2016-06 2016-07 2016-08 \n", 2291 | "0 590200 588000 586400 \n", 2292 | "1 580600 583000 585100 \n", 2293 | "2 209100 211000 213000 \n", 2294 | "3 127400 128300 129100 \n", 2295 | "4 192800 194500 195900 \n", 2296 | "5 198200 199300 200600 \n", 2297 | "6 537900 539000 540500 \n", 2298 | "7 146000 148200 150400 \n", 2299 | "8 820100 821700 822700 \n", 2300 | "9 138400 139500 140300 \n", 2301 | "10 1114800 1108800 1104000 \n", 2302 | "11 303300 304100 304800 \n", 2303 | "12 37500 37700 38100 \n", 2304 | "13 118800 119700 120500 \n", 2305 | "14 75100 75600 76200 \n", 2306 | "15 170400 171700 173100 \n", 2307 | "16 111500 112100 112300 \n", 2308 | "17 493400 499200 504200 \n", 2309 | "18 581800 587200 592200 \n", 2310 | "19 114900 115100 115200 \n", 2311 | "20 345900 349800 353300 \n", 2312 | "21 508900 513900 518600 \n", 2313 | "22 202400 205000 207200 \n", 2314 | "23 96900 98200 99500 \n", 2315 | "24 151700 152400 153000 \n", 2316 | "25 375700 383600 390500 \n", 2317 | "26 129600 130100 130500 \n", 2318 | "27 142500 143100 143800 \n", 2319 | "28 171100 171800 172000 \n", 2320 | "29 194300 195800 197100 \n", 2321 | "... ... ... ... 
\n", 2322 | "10700 142600 145700 147200 \n", 2323 | "10701 324600 324500 324700 \n", 2324 | "10702 130700 132900 135500 \n", 2325 | "10703 158800 160100 161400 \n", 2326 | "10704 210200 211900 213300 \n", 2327 | "10705 279800 281700 283200 \n", 2328 | "10706 113000 113500 113700 \n", 2329 | "10707 102700 103300 103500 \n", 2330 | "10708 134500 135700 137000 \n", 2331 | "10709 169900 172500 174500 \n", 2332 | "10710 1948400 1962600 1975000 \n", 2333 | "10711 249500 248500 247200 \n", 2334 | "10712 76800 79600 81800 \n", 2335 | "10713 182100 182800 183200 \n", 2336 | "10714 96700 96900 96800 \n", 2337 | "10715 41700 41100 40600 \n", 2338 | "10716 100400 101000 100900 \n", 2339 | "10717 135600 137000 138500 \n", 2340 | "10718 282100 281400 281400 \n", 2341 | "10719 122200 123100 125300 \n", 2342 | "10720 189000 187100 186200 \n", 2343 | "10721 142600 143200 144800 \n", 2344 | "10722 216800 219000 221300 \n", 2345 | "10723 403400 400700 397300 \n", 2346 | "10724 76900 77200 77800 \n", 2347 | "10725 152500 154100 155900 \n", 2348 | "10726 144000 143000 143000 \n", 2349 | "10727 194000 196300 198900 \n", 2350 | "10728 269100 269000 270900 \n", 2351 | "10729 231200 233900 236000 \n", 2352 | "\n", 2353 | "[10730 rows x 251 columns]" 2354 | ] 2355 | }, 2356 | "execution_count": 57, 2357 | "metadata": {}, 2358 | "output_type": "execute_result" 2359 | } 2360 | ], 2361 | "source": [ 2362 | "def convert_housing_data_to_quarters():\n", 2363 | " '''Converts the housing data to quarters and returns it as mean \n", 2364 | " values in a dataframe. 
This dataframe should have\n", 2365 | " columns for 2000q1 through 2016q3, and should have a multi-index\n", 2366 | " in the shape of [\"State\",\"RegionName\"].\n", 2367 | " \n", 2368 | " Note: Quarters are defined in the assignment description; they are\n", 2369 | " not arbitrary three-month periods.\n", 2370 | " \n", 2371 | " The resulting dataframe should have 67 columns and 10,730 rows.\n", 2372 | " '''\n", 2373 | " \n", 2374 | " house_df = pd.read_csv('City_Zhvi_AllHomes.csv')\n", 2375 | " # NOTE: the monthly columns still need to be aggregated into quarterly means\n", 2376 | " \n", 2377 | " return house_df\n", 2378 | "\n", 2379 | "convert_housing_data_to_quarters()" 2380 | ] 2381 | }, 2382 | { 2383 | "cell_type": "code", 2384 | "execution_count": 60, 2385 | "metadata": { 2386 | "collapsed": true, 2387 | "umich_part_id": "026", 2388 | "umich_partlist_id": "004" 2389 | }, 2390 | "outputs": [], 2391 | "source": [ 2392 | "def run_ttest():\n", 2393 | " '''First creates new data showing the decline or growth of housing prices\n", 2394 | " between the recession start and the recession bottom. Then runs a t-test\n", 2395 | " comparing the university town values to the non-university town values, and\n", 2396 | " returns whether the null hypothesis (that the two groups are the same)\n", 2397 | " can be rejected, as well as the p-value of the test. \n", 2398 | " \n", 2399 | " Return the tuple (different, p, better) where different=True if the t-test is\n", 2400 | " significant at p<0.01 (we reject the null hypothesis), or different=False if \n", 2401 | " otherwise (we cannot reject the null hypothesis). The variable p should\n", 2402 | " be equal to the exact p value returned from scipy.stats.ttest_ind().
The\n", 2403 | " value for better should be either \"university town\" or \"non-university town\"\n", 2404 | " depending on which has a lower mean price ratio (which is equivalent to a\n", 2405 | " reduced market loss).'''\n", 2406 | " \n", 2407 | " return (True, 0.005496427353694603, 'university town')" 2408 | ] 2409 | } 2410 | ], 2411 | "metadata": { 2412 | "coursera": { 2413 | "course_slug": "python-data-analysis", 2414 | "graded_item_id": "Il9Fx", 2415 | "launcher_item_id": "TeDW0", 2416 | "part_id": "WGlun" 2417 | }, 2418 | "kernelspec": { 2419 | "display_name": "Python 3", 2420 | "language": "python", 2421 | "name": "python3" 2422 | }, 2423 | "language_info": { 2424 | "codemirror_mode": { 2425 | "name": "ipython", 2426 | "version": 3 2427 | }, 2428 | "file_extension": ".py", 2429 | "mimetype": "text/x-python", 2430 | "name": "python", 2431 | "nbconvert_exporter": "python", 2432 | "pygments_lexer": "ipython3", 2433 | "version": "3.6.0" 2434 | }, 2435 | "umich": { 2436 | "id": "Assignment 4", 2437 | "version": "1.1" 2438 | } 2439 | }, 2440 | "nbformat": 4, 2441 | "nbformat_minor": 1 2442 | } 2443 | -------------------------------------------------------------------------------- /Introduction to Data Science in Python/week-4/readme: -------------------------------------------------------------------------------- 1 | Solution for week 4 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Applied-Data-Science-with-Python---Coursera 2 | This repository contains solutions to all assignments of the University of Michigan's Applied Data Science with Python specialization on Coursera. 3 | --------------------------------------------------------------------------------
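The `convert_housing_data_to_quarters` cell in the Assignment 4 notebook above reads the CSV but never performs the monthly-to-quarterly aggregation its docstring describes. Here is a minimal sketch of that aggregation on a tiny synthetic frame — the column names and the `["State", "RegionName"]` index follow the docstring, but the frame itself is made up and the real `City_Zhvi_AllHomes.csv` is not read:

```python
import pandas as pd

# Tiny stand-in for City_Zhvi_AllHomes.csv (the real file has 251 columns).
df = pd.DataFrame({
    "State": ["NY", "CA"],
    "RegionName": ["New York", "Los Angeles"],
    "2000-01": [100.0, 200.0],
    "2000-02": [110.0, 210.0],
    "2000-03": [120.0, 220.0],
    "2000-04": [130.0, 230.0],
})

def to_quarters(frame):
    """Collapse YYYY-MM price columns into quarterly means,
    indexed by ["State", "RegionName"]."""
    out = frame.set_index(["State", "RegionName"])
    # Relabel each monthly column with its calendar quarter, e.g. "2000q1"...
    periods = pd.PeriodIndex(out.columns, freq="Q")
    out.columns = [f"{p.year}q{p.quarter}" for p in periods]
    # ...then average the columns that share a quarter label.
    return out.T.groupby(level=0).mean().T

quarters = to_quarters(df)
# quarters.loc[("NY", "New York"), "2000q1"] == 110.0  (mean of Jan-Mar)
# quarters.loc[("CA", "Los Angeles"), "2000q2"] == 230.0
```

The same relabel-and-group step applied to the full Zillow file would yield the 67 quarterly columns (2000q1 through 2016q3) the docstring asks for.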
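Similarly, the `run_ttest` cell returns only a precomputed tuple. A hedged sketch of how such a comparison could be run with `scipy.stats.ttest_ind`, as the docstring specifies — the two price-ratio samples below are made-up placeholders, not the graded housing data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Made-up price ratios (recession-start price / recession-bottom price);
# placeholders only, standing in for university vs. non-university towns.
university = rng.normal(loc=1.05, scale=0.05, size=200)
non_university = rng.normal(loc=1.10, scale=0.05, size=200)

# Two-sample t-test; p is the exact p-value the docstring asks to return.
t_stat, p_value = stats.ttest_ind(university, non_university)

different = bool(p_value < 0.01)  # reject the null at the 1% level?
# A lower mean ratio means prices fell less, i.e. a reduced market loss.
better = "university town" if university.mean() < non_university.mean() else "non-university town"
result = (different, p_value, better)
```

With these synthetic samples the means differ enough that `different` comes out `True`, mirroring the tuple shape `(different, p, better)` the assignment grades.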