├── presentation.pdf ├── The Battle of Cities.pdf ├── CapstoneProject.ipynb ├── README.md ├── Tbilisi-project-intro.ipynb ├── Toronto-project-final-PART2.ipynb └── Toronto-project-final-PART1.ipynb /presentation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bexxmodd/Coursera_Capstone/master/presentation.pdf -------------------------------------------------------------------------------- /The Battle of Cities.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/bexxmodd/Coursera_Capstone/master/The Battle of Cities.pdf -------------------------------------------------------------------------------- /CapstoneProject.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "## Welcome to Coursera Capstone Project\n", 8 | "\n", 9 | "* This notebook will be used to create the final capstone project for the coursera data science certificate program\n", 10 | "* Program was built and taught by the IBM Data science team" 11 | ] 12 | }, 13 | { 14 | "cell_type": "code", 15 | "execution_count": 6, 16 | "metadata": {}, 17 | "outputs": [ 18 | { 19 | "name": "stdout", 20 | "output_type": "stream", 21 | "text": [ 22 | "Hello Capstone Project Course!\n" 23 | ] 24 | } 25 | ], 26 | "source": [ 27 | "import pandas as pd\n", 28 | "import numpy as np\n", 29 | "\n", 30 | "print('Hello Capstone Project Course!')" 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [] 39 | } 40 | ], 41 | "metadata": { 42 | "kernelspec": { 43 | "display_name": "Python", 44 | "language": "python", 45 | "name": "conda-env-python-py" 46 | }, 47 | "language_info": { 48 | "codemirror_mode": { 49 | "name": "ipython", 50 | "version": 3 51 | }, 52 | "file_extension": 
".py", 53 | "mimetype": "text/x-python", 54 | "name": "python", 55 | "nbconvert_exporter": "python", 56 | "pygments_lexer": "ipython3", 57 | "version": "3.6.7" 58 | } 59 | }, 60 | "nbformat": 4, 61 | "nbformat_minor": 4 62 | } 63 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

The Battle of Cities: Tbilisi vs Berlin

2 |

Introduction:

3 | 4 | Tbilisi, the capital of the post-Soviet country of Georgia, is one of the oldest cities and was rediscovered by the world in recent years. For the past 10 years, the whole country has been experiencing immense growth in tourism, and it’s especially noticeable in Tbilisi. People come here to experience the natural beauty of the Caucasus mountains, the greenness of the forests and the wildlife preserved there, or simply to enjoy summer on the beaches of the Adjara region. Tbilisi itself, however, has become popular for its night clubs, bars, and restaurants. As time goes by, Tbilisi is building a reputation for mesmerizing nightlife and rave culture. In several articles, Forbes, The Guardian and VICE have pronounced Tbilisi the center of nightlife, putting it in front of such huge cities as Berlin and London. Last year Forbes called Tbilisi “This Year's Most Exciting City” [1], and it’s only the beginning. 5 | On the other side of the story, we have Berlin, which has a well-established reputation and worldwide popularity thanks to its outstanding night clubs and bars and the experience the city offers people who are into nightlife. Since the fall of the Berlin Wall, the city has turned itself into the benchmark for clubbing and nightlife. However, Tbilisi’s growth, the conversations that followed, and the increasing demand for Georgian DJs around the globe brought me to a question: is Tbilisi in competition with Berlin for the title of best nightlife city? Does Tbilisi compare to Berlin in terms of night clubs and bars, and is it worth tapping into that business? 6 | To elaborate on the topic, we need to dive into the culture and investigate important metrics, which I will address in the next paragraph. To experience nightlife fully, especially in a foreign city, having a good night club is not enough; it takes a combination of other venues, services and freedom of expression. 
The Guardian, in its article about Tbilisi, wrote that “These cities share the magic ingredients that allowed clubbing to thrive in east Berlin: cheap rents, plenty of space, often in the form of unused communist-era buildings, and creative, open-minded young people” [2]. We’ll look at factors such as taxi services, hotels, a measure of personal freedom, and crime rates, together with the number of clubs and bars, their average reviews, and how easy it is to travel to Tbilisi. 7 | -------------------------------------------------------------------------------- /Tbilisi-project-intro.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "

The Battle of Cities: Tbilisi vs Berlin

\n", 8 | "\n", 9 | "### 1.0 Introduction:\n", 10 | "\n", 11 | "Tbilisi, the capital of the post-Soviet country of Georgia, is one of the oldest cities and was rediscovered by the world in recent years. For the past 10 years, the whole country has been experiencing immense growth in tourism, and it’s especially noticeable in Tbilisi. People come here to experience the natural beauty of the Caucasus mountains, the greenness of the forests and the wildlife preserved there, or simply to enjoy summer on the beaches of the Adjara region. Tbilisi itself, however, has become popular for its night clubs, bars, and restaurants. As time goes by, Tbilisi is building a reputation for mesmerizing nightlife and rave culture. In several articles, Forbes, The Guardian and VICE have pronounced Tbilisi the center of nightlife, putting it in front of such huge cities as Berlin and London. Last year Forbes called Tbilisi “This Year's Most Exciting City” [1], and it’s only the beginning. \n", 12 | "\n", 13 | "\n", 14 | "On the other side of the story, we have Berlin, which has a well-established reputation and worldwide popularity thanks to its outstanding night clubs and bars and the experience the city offers people who are into nightlife. Since the fall of the Berlin Wall, the city has turned itself into the benchmark for clubbing and nightlife. However, Tbilisi’s growth, the conversations that followed, and the increasing demand for Georgian DJs around the globe brought me to a question: is Tbilisi in competition with Berlin for the title of best nightlife city? Does Tbilisi compare to Berlin in terms of night clubs and bars, and is it worth tapping into that business?\n", 15 | "\n", 16 | "\n", 17 | "To elaborate on the topic, we need to dive into the culture and investigate important metrics, which I will address in the next paragraph. 
To experience nightlife fully, especially in a foreign city, having a good night club is not enough; it takes a combination of other venues, services and freedom of expression. The Guardian, in its article about Tbilisi, wrote that “These cities share the magic ingredients that allowed clubbing to thrive in east Berlin: cheap rents, plenty of space, often in the form of unused communist-era buildings, and creative, open-minded young people” [2]. We’ll look at factors such as taxi services, hotels, a measure of personal freedom, and crime rates, together with the number of clubs and bars, their average reviews, and how easy it is to travel to Tbilisi.\n", 18 | "\n", 19 | "\n", 20 | "### 1.1 Data:\n", 21 | "I will use the Foursquare API to collect the top 25 venues in Tbilisi, Georgia, and Berlin, Germany, searchable under the categories “Night Life” and \"Night Clubs.\" Using the same portal and the same techniques, I will also collect the top 25 venues searchable under the categories \"Hotel\" and \"Hostel\". I will extract the average ratings of the venues, along with the number of reviews, to analyze their reputation and the impressions of visitors.\n", 22 | "\n", 23 | "\n", 24 | "I will use other sources (full data collection and cleaning details are available in the data collection and wrangling section) to obtain information about travel trends, average hotel room prices, and taxi prices to portray how enjoyable the nightlife is in the two cities and to draw some parallels. I will also look at crime rates, overall security, and the degree to which each jurisdiction allows people to express personal freedom and exercise their legal rights.\n", 25 | "\n", 26 | "--------\n", 27 | "
References:
\n", 28 | "\n", 29 | "[1] Wilson, B. (2019, September 25). Berlin Is Out, Tbilisi Is In: Georgia's Capital Is This Year's Most Exciting City. Retrieved from https://www.forbes.com/sites/breannawilson/2018/09/05/berlin-is-out-tbilisi-is-in-georgias-capital-is-this-years-most-exciting-city/#435a0d80479d.\n", 30 | "\n", 31 | "[2] Ravens, C. (2019, January 22). Bassiani: the Tbilisi techno mecca shaking off post-Soviet repression. Retrieved from https://www.theguardian.com/music/2019/jan/22/bassiani-tbilisi-techno-nightclub-georgia.\n" 32 | ] 33 | }, 34 | { 35 | "cell_type": "code", 36 | "execution_count": null, 37 | "metadata": {}, 38 | "outputs": [], 39 | "source": [] 40 | } 41 | ], 42 | "metadata": { 43 | "kernelspec": { 44 | "display_name": "Python 3", 45 | "language": "python", 46 | "name": "python3" 47 | }, 48 | "language_info": { 49 | "codemirror_mode": { 50 | "name": "ipython", 51 | "version": 3 52 | }, 53 | "file_extension": ".py", 54 | "mimetype": "text/x-python", 55 | "name": "python", 56 | "nbconvert_exporter": "python", 57 | "pygments_lexer": "ipython3", 58 | "version": "3.7.4" 59 | } 60 | }, 61 | "nbformat": 4, 62 | "nbformat_minor": 2 63 | } 64 | -------------------------------------------------------------------------------- /Toronto-project-final-PART2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Part 2 of Neighborhoods project\n", 8 | "\n", 9 | "--------\n", 10 | "### Finalizing our data" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": [ 17 | "I'm importing the CSV files for the Toronto locations and for the geo-coordinates so I can work with them and combine them" 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": 1, 23 | "metadata": {}, 24 | "outputs": [ 25 | { 26 | "data": { 27 | "text/html": [ 28 | "
\n", 29 | "\n", 42 | "\n", 43 | " \n", 44 | " \n", 45 | " \n", 46 | " \n", 47 | " \n", 48 | " \n", 49 | " \n", 50 | " \n", 51 | " \n", 52 | " \n", 53 | " \n", 54 | " \n", 55 | " \n", 56 | " \n", 57 | " \n", 58 | " \n", 59 | " \n", 60 | " \n", 61 | " \n", 62 | " \n", 63 | " \n", 64 | " \n", 65 | " \n", 66 | " \n", 67 | " \n", 68 | " \n", 69 | " \n", 70 | " \n", 71 | " \n", 72 | " \n", 73 | " \n", 74 | " \n", 75 | " \n", 76 | " \n", 77 | " \n", 78 | " \n", 79 | " \n", 80 | " \n", 81 | " \n", 82 | " \n", 83 | " \n", 84 | " \n", 85 | " \n", 86 | " \n", 87 | " \n", 88 | " \n", 89 | "
   Unnamed: 0  Postcode  Borough      Neighborhood
0  0           M1B       Scarborough  Rouge,Malvern
1  1           M1C       Scarborough  Highland Creek,Rouge Hill,Port Union
2  2           M1E       Scarborough  Guildwood,Morningside,West Hill
3  3           M1G       Scarborough  Woburn
4  4           M1H       Scarborough  Cedarbrae
\n", 90 | "
" 91 | ], 92 | "text/plain": [ 93 | " Unnamed: 0 Postcode Borough Neighborhood\n", 94 | "0 0 M1B Scarborough Rouge,Malvern\n", 95 | "1 1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n", 96 | "2 2 M1E Scarborough Guildwood,Morningside,West Hill\n", 97 | "3 3 M1G Scarborough Woburn\n", 98 | "4 4 M1H Scarborough Cedarbrae" 99 | ] 100 | }, 101 | "execution_count": 1, 102 | "metadata": {}, 103 | "output_type": "execute_result" 104 | } 105 | ], 106 | "source": [ 107 | "import pandas as pd\n", 108 | "df = pd.read_csv('tor-loc.csv')\n", 109 | "df.head()" 110 | ] 111 | }, 112 | { 113 | "cell_type": "markdown", 114 | "metadata": {}, 115 | "source": [ 116 | "I'm dropping the unneeded column" 117 | ] 118 | }, 119 | { 120 | "cell_type": "code", 121 | "execution_count": 2, 122 | "metadata": {}, 123 | "outputs": [ 124 | { 125 | "data": { 126 | "text/html": [ 127 | "
\n", 128 | "\n", 141 | "\n", 142 | " \n", 143 | " \n", 144 | " \n", 145 | " \n", 146 | " \n", 147 | " \n", 148 | " \n", 149 | " \n", 150 | " \n", 151 | " \n", 152 | " \n", 153 | " \n", 154 | " \n", 155 | " \n", 156 | " \n", 157 | " \n", 158 | " \n", 159 | " \n", 160 | " \n", 161 | " \n", 162 | " \n", 163 | " \n", 164 | " \n", 165 | " \n", 166 | " \n", 167 | " \n", 168 | " \n", 169 | " \n", 170 | " \n", 171 | " \n", 172 | " \n", 173 | " \n", 174 | " \n", 175 | " \n", 176 | " \n", 177 | " \n", 178 | " \n", 179 | " \n", 180 | " \n", 181 | " \n", 182 | "
   Postcode  Borough      Neighborhood
0  M1B       Scarborough  Rouge,Malvern
1  M1C       Scarborough  Highland Creek,Rouge Hill,Port Union
2  M1E       Scarborough  Guildwood,Morningside,West Hill
3  M1G       Scarborough  Woburn
4  M1H       Scarborough  Cedarbrae
\n", 183 | "
" 184 | ], 185 | "text/plain": [ 186 | " Postcode Borough Neighborhood\n", 187 | "0 M1B Scarborough Rouge,Malvern\n", 188 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n", 189 | "2 M1E Scarborough Guildwood,Morningside,West Hill\n", 190 | "3 M1G Scarborough Woburn\n", 191 | "4 M1H Scarborough Cedarbrae" 192 | ] 193 | }, 194 | "execution_count": 2, 195 | "metadata": {}, 196 | "output_type": "execute_result" 197 | } 198 | ], 199 | "source": [ 200 | "df = df.drop(['Unnamed: 0'], axis=1)\n", 201 | "df.head()" 202 | ] 203 | }, 204 | { 205 | "cell_type": "code", 206 | "execution_count": 3, 207 | "metadata": {}, 208 | "outputs": [ 209 | { 210 | "data": { 211 | "text/html": [ 212 | "
\n", 213 | "\n", 226 | "\n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | " \n", 250 | " \n", 251 | " \n", 252 | " \n", 253 | " \n", 254 | " \n", 255 | " \n", 256 | " \n", 257 | " \n", 258 | " \n", 259 | " \n", 260 | " \n", 261 | " \n", 262 | " \n", 263 | " \n", 264 | " \n", 265 | " \n", 266 | " \n", 267 | "
   Postal Code  Latitude   Longitude
0  M1B          43.806686  -79.194353
1  M1C          43.784535  -79.160497
2  M1E          43.763573  -79.188711
3  M1G          43.770992  -79.216917
4  M1H          43.773136  -79.239476
\n", 268 | "
" 269 | ], 270 | "text/plain": [ 271 | " Postal Code Latitude Longitude\n", 272 | "0 M1B 43.806686 -79.194353\n", 273 | "1 M1C 43.784535 -79.160497\n", 274 | "2 M1E 43.763573 -79.188711\n", 275 | "3 M1G 43.770992 -79.216917\n", 276 | "4 M1H 43.773136 -79.239476" 277 | ] 278 | }, 279 | "execution_count": 3, 280 | "metadata": {}, 281 | "output_type": "execute_result" 282 | } 283 | ], 284 | "source": [ 285 | "import pandas as pd\n", 286 | "df1 = pd.read_csv('Geospatial_Coordinates.csv')\n", 287 | "df1.head()" 288 | ] 289 | }, 290 | { 291 | "cell_type": "markdown", 292 | "metadata": {}, 293 | "source": [ 294 | "I'm renaming 'Postal Code' column as 'Postcode' which later will be used to merge two dataframes together" 295 | ] 296 | }, 297 | { 298 | "cell_type": "code", 299 | "execution_count": 4, 300 | "metadata": {}, 301 | "outputs": [ 302 | { 303 | "data": { 304 | "text/html": [ 305 | "
\n", 306 | "\n", 319 | "\n", 320 | " \n", 321 | " \n", 322 | " \n", 323 | " \n", 324 | " \n", 325 | " \n", 326 | " \n", 327 | " \n", 328 | " \n", 329 | " \n", 330 | " \n", 331 | " \n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | " \n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 358 | " \n", 359 | " \n", 360 | "
   Postcode  Latitude   Longitude
0  M1B       43.806686  -79.194353
1  M1C       43.784535  -79.160497
2  M1E       43.763573  -79.188711
3  M1G       43.770992  -79.216917
4  M1H       43.773136  -79.239476
\n", 361 | "
" 362 | ], 363 | "text/plain": [ 364 | " Postcode Latitude Longitude\n", 365 | "0 M1B 43.806686 -79.194353\n", 366 | "1 M1C 43.784535 -79.160497\n", 367 | "2 M1E 43.763573 -79.188711\n", 368 | "3 M1G 43.770992 -79.216917\n", 369 | "4 M1H 43.773136 -79.239476" 370 | ] 371 | }, 372 | "execution_count": 4, 373 | "metadata": {}, 374 | "output_type": "execute_result" 375 | } 376 | ], 377 | "source": [ 378 | "df1 = df1.rename(columns={'Postal Code': 'Postcode'})\n", 379 | "df1.head()" 380 | ] 381 | }, 382 | { 383 | "cell_type": "markdown", 384 | "metadata": {}, 385 | "source": [ 386 | "I'm adding geolocations to the postcode and neighborhoods by merging two dataframes with the \"Postcode\" being the key" 387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": 5, 392 | "metadata": {}, 393 | "outputs": [ 394 | { 395 | "data": { 396 | "text/html": [ 397 | "
\n", 398 | "\n", 411 | "\n", 412 | " \n", 413 | " \n", 414 | " \n", 415 | " \n", 416 | " \n", 417 | " \n", 418 | " \n", 419 | " \n", 420 | " \n", 421 | " \n", 422 | " \n", 423 | " \n", 424 | " \n", 425 | " \n", 426 | " \n", 427 | " \n", 428 | " \n", 429 | " \n", 430 | " \n", 431 | " \n", 432 | " \n", 433 | " \n", 434 | " \n", 435 | " \n", 436 | " \n", 437 | " \n", 438 | " \n", 439 | " \n", 440 | " \n", 441 | " \n", 442 | " \n", 443 | " \n", 444 | " \n", 445 | " \n", 446 | " \n", 447 | " \n", 448 | " \n", 449 | " \n", 450 | " \n", 451 | " \n", 452 | " \n", 453 | " \n", 454 | " \n", 455 | " \n", 456 | " \n", 457 | " \n", 458 | " \n", 459 | " \n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | "
   Postcode  Borough      Neighborhood                                      Latitude   Longitude
0  M1B       Scarborough  Rouge,Malvern                                     43.806686  -79.194353
1  M1C       Scarborough  Highland Creek,Rouge Hill,Port Union              43.784535  -79.160497
2  M1E       Scarborough  Guildwood,Morningside,West Hill                   43.763573  -79.188711
3  M1G       Scarborough  Woburn                                            43.770992  -79.216917
4  M1H       Scarborough  Cedarbrae                                         43.773136  -79.239476
5  M1J       Scarborough  Scarborough Village                               43.744734  -79.239476
6  M1K       Scarborough  East Birchmount Park,Ionview,Kennedy Park         43.727929  -79.262029
7  M1L       Scarborough  Clairlea,Golden Mile,Oakridge                     43.711112  -79.284577
8  M1M       Scarborough  Cliffcrest,Cliffside,Scarborough Village West     43.716316  -79.239476
9  M1N       Scarborough  Birch Cliff,Cliffside West                        43.692657  -79.264848
\n", 505 | "
" 506 | ], 507 | "text/plain": [ 508 | " Postcode Borough Neighborhood \\\n", 509 | "0 M1B Scarborough Rouge,Malvern \n", 510 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union \n", 511 | "2 M1E Scarborough Guildwood,Morningside,West Hill \n", 512 | "3 M1G Scarborough Woburn \n", 513 | "4 M1H Scarborough Cedarbrae \n", 514 | "5 M1J Scarborough Scarborough Village \n", 515 | "6 M1K Scarborough East Birchmount Park,Ionview,Kennedy Park \n", 516 | "7 M1L Scarborough Clairlea,Golden Mile,Oakridge \n", 517 | "8 M1M Scarborough Cliffcrest,Cliffside,Scarborough Village West \n", 518 | "9 M1N Scarborough Birch Cliff,Cliffside West \n", 519 | "\n", 520 | " Latitude Longitude \n", 521 | "0 43.806686 -79.194353 \n", 522 | "1 43.784535 -79.160497 \n", 523 | "2 43.763573 -79.188711 \n", 524 | "3 43.770992 -79.216917 \n", 525 | "4 43.773136 -79.239476 \n", 526 | "5 43.744734 -79.239476 \n", 527 | "6 43.727929 -79.262029 \n", 528 | "7 43.711112 -79.284577 \n", 529 | "8 43.716316 -79.239476 \n", 530 | "9 43.692657 -79.264848 " 531 | ] 532 | }, 533 | "execution_count": 5, 534 | "metadata": {}, 535 | "output_type": "execute_result" 536 | } 537 | ], 538 | "source": [ 539 | "dft = pd.merge(df, df1, on='Postcode')\n", 540 | "dft.head(10)" 541 | ] 542 | }, 543 | { 544 | "cell_type": "code", 545 | "execution_count": 6, 546 | "metadata": {}, 547 | "outputs": [], 548 | "source": [ 549 | "dft.to_csv('toronto-df.csv')" 550 | ] 551 | } 552 | ], 553 | "metadata": { 554 | "kernelspec": { 555 | "display_name": "Python", 556 | "language": "python", 557 | "name": "conda-env-python-py" 558 | }, 559 | "language_info": { 560 | "codemirror_mode": { 561 | "name": "ipython", 562 | "version": 3 563 | }, 564 | "file_extension": ".py", 565 | "mimetype": "text/x-python", 566 | "name": "python", 567 | "nbconvert_exporter": "python", 568 | "pygments_lexer": "ipython3", 569 | "version": "3.6.7" 570 | } 571 | }, 572 | "nbformat": 4, 573 | "nbformat_minor": 4 574 | } 575 | 
-------------------------------------------------------------------------------- /Toronto-project-final-PART1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Part 1 of Neighborhoods project\n", 8 | "\n", 9 | "--------\n", 10 | "### Extracting initial data" 11 | ] 12 | }, 13 | { 14 | "cell_type": "markdown", 15 | "metadata": {}, 16 | "source": [ 17 | "## Data preparation\n", 18 | "In this notebook I'll import, clean, and prepare the dataframe for Toronto" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 1, 24 | "metadata": {}, 25 | "outputs": [ 26 | { 27 | "name": "stdout", 28 | "output_type": "stream", 29 | "text": [ 30 | "Libraries imported.\n" 31 | ] 32 | } 33 | ], 34 | "source": [ 35 | "import numpy as np # library to handle data in a vectorized manner\n", 36 | "\n", 37 | "import pandas as pd # library for data analysis\n", 38 | "pd.set_option('display.max_columns', None)\n", 39 | "pd.set_option('display.max_rows', None)\n", 40 | "\n", 41 | "import json # library to handle JSON files\n", 42 | "\n", 43 | "#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab\n", 44 | "from geopy.geocoders import Nominatim # convert an address into latitude and longitude values\n", 45 | "\n", 46 | "import requests # library to handle requests\n", 47 | "from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe\n", 48 | "\n", 49 | "# Matplotlib and associated plotting modules\n", 50 | "import matplotlib.cm as cm\n", 51 | "import matplotlib.colors as colors\n", 52 | "\n", 53 | "# import k-means from clustering stage\n", 54 | "from sklearn.cluster import KMeans\n", 55 | "\n", 56 | "#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab\n", 57 | "import folium # map rendering 
library\n", 58 | "\n", 59 | "print('Libraries imported.')" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "I import the CSV file after scraping it from https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": 2, 72 | "metadata": {}, 73 | "outputs": [ 74 | { 75 | "data": { 76 | "text/html": [ 77 | "
\n", 78 | "\n", 91 | "\n", 92 | " \n", 93 | " \n", 94 | " \n", 95 | " \n", 96 | " \n", 97 | " \n", 98 | " \n", 99 | " \n", 100 | " \n", 101 | " \n", 102 | " \n", 103 | " \n", 104 | " \n", 105 | " \n", 106 | " \n", 107 | " \n", 108 | " \n", 109 | " \n", 110 | " \n", 111 | " \n", 112 | " \n", 113 | " \n", 114 | " \n", 115 | " \n", 116 | " \n", 117 | " \n", 118 | " \n", 119 | " \n", 120 | " \n", 121 | " \n", 122 | " \n", 123 | " \n", 124 | " \n", 125 | " \n", 126 | " \n", 127 | " \n", 128 | " \n", 129 | " \n", 130 | " \n", 131 | " \n", 132 | "
   Postcode  Borough           Neighborhood
0  M1A       Not assigned      Not assigned
1  M2A       Not assigned      Not assigned
2  M3A       North York        Parkwoods
3  M4A       North York        Victoria Village
4  M5A       Downtown Toronto  Harbourfront
\n", 133 | "
" 134 | ], 135 | "text/plain": [ 136 | " Postcode Borough Neighborhood\n", 137 | "0 M1A Not assigned Not assigned\n", 138 | "1 M2A Not assigned Not assigned\n", 139 | "2 M3A North York Parkwoods\n", 140 | "3 M4A North York Victoria Village\n", 141 | "4 M5A Downtown Toronto Harbourfront" 142 | ] 143 | }, 144 | "execution_count": 2, 145 | "metadata": {}, 146 | "output_type": "execute_result" 147 | } 148 | ], 149 | "source": [ 150 | "# import the CSV file after scraping it from the Wikipedia page\n", 151 | "df = pd.read_csv('toronto.csv')\n", 152 | "df.head()" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "I'm dropping rows whose Borough value is 'Not assigned'" 160 | ] 161 | }, 162 | { 163 | "cell_type": "code", 164 | "execution_count": 3, 165 | "metadata": {}, 166 | "outputs": [ 167 | { 168 | "data": { 169 | "text/html": [ 170 | "
\n", 171 | "\n", 184 | "\n", 185 | " \n", 186 | " \n", 187 | " \n", 188 | " \n", 189 | " \n", 190 | " \n", 191 | " \n", 192 | " \n", 193 | " \n", 194 | " \n", 195 | " \n", 196 | " \n", 197 | " \n", 198 | " \n", 199 | " \n", 200 | " \n", 201 | " \n", 202 | " \n", 203 | " \n", 204 | " \n", 205 | " \n", 206 | " \n", 207 | " \n", 208 | " \n", 209 | " \n", 210 | " \n", 211 | " \n", 212 | " \n", 213 | " \n", 214 | " \n", 215 | " \n", 216 | " \n", 217 | " \n", 218 | " \n", 219 | " \n", 220 | " \n", 221 | " \n", 222 | " \n", 223 | " \n", 224 | " \n", 225 | " \n", 226 | " \n", 227 | " \n", 228 | " \n", 229 | " \n", 230 | " \n", 231 | " \n", 232 | " \n", 233 | " \n", 234 | " \n", 235 | " \n", 236 | " \n", 237 | " \n", 238 | " \n", 239 | " \n", 240 | " \n", 241 | " \n", 242 | " \n", 243 | " \n", 244 | " \n", 245 | " \n", 246 | " \n", 247 | " \n", 248 | " \n", 249 | " \n", 250 | " \n", 251 | " \n", 252 | " \n", 253 | " \n", 254 | " \n", 255 | "
    Postcode  Borough           Neighborhood
2   M3A       North York        Parkwoods
3   M4A       North York        Victoria Village
4   M5A       Downtown Toronto  Harbourfront
5   M6A       North York        Lawrence Heights
6   M6A       North York        Lawrence Manor
7   M7A       Queen's Park      Not assigned
9   M9A       Downtown Toronto  Queen's Park
10  M1B       Scarborough       Rouge
11  M1B       Scarborough       Malvern
13  M3B       North York        Don Mills North
\n", 256 | "
" 257 | ], 258 | "text/plain": [ 259 | " Postcode Borough Neighborhood\n", 260 | "2 M3A North York Parkwoods\n", 261 | "3 M4A North York Victoria Village\n", 262 | "4 M5A Downtown Toronto Harbourfront\n", 263 | "5 M6A North York Lawrence Heights\n", 264 | "6 M6A North York Lawrence Manor\n", 265 | "7 M7A Queen's Park Not assigned\n", 266 | "9 M9A Downtown Toronto Queen's Park\n", 267 | "10 M1B Scarborough Rouge\n", 268 | "11 M1B Scarborough Malvern\n", 269 | "13 M3B North York Don Mills North" 270 | ] 271 | }, 272 | "execution_count": 3, 273 | "metadata": {}, 274 | "output_type": "execute_result" 275 | } 276 | ], 277 | "source": [ 278 | "df = df[df.Borough != 'Not assigned']\n", 279 | "df.head(10)" 280 | ] 281 | }, 282 | { 283 | "cell_type": "code", 284 | "execution_count": 4, 285 | "metadata": {}, 286 | "outputs": [ 287 | { 288 | "data": { 289 | "text/plain": [ 290 | "(210, 3)" 291 | ] 292 | }, 293 | "execution_count": 4, 294 | "metadata": {}, 295 | "output_type": "execute_result" 296 | } 297 | ], 298 | "source": [ 299 | "df.shape" 300 | ] 301 | }, 302 | { 303 | "cell_type": "markdown", 304 | "metadata": {}, 305 | "source": [ 306 | "I'm filling 'Not assigned' Neighborhood values from the corresponding Borough column" 307 | ] 308 | }, 309 | { 310 | "cell_type": "code", 311 | "execution_count": 5, 312 | "metadata": {}, 313 | "outputs": [ 314 | { 315 | "data": { 316 | "text/html": [ 317 | "
\n", 318 | "\n", 331 | "\n", 332 | " \n", 333 | " \n", 334 | " \n", 335 | " \n", 336 | " \n", 337 | " \n", 338 | " \n", 339 | " \n", 340 | " \n", 341 | " \n", 342 | " \n", 343 | " \n", 344 | " \n", 345 | " \n", 346 | " \n", 347 | " \n", 348 | " \n", 349 | " \n", 350 | " \n", 351 | " \n", 352 | " \n", 353 | " \n", 354 | " \n", 355 | " \n", 356 | " \n", 357 | " \n", 358 | " \n", 359 | " \n", 360 | " \n", 361 | " \n", 362 | " \n", 363 | " \n", 364 | " \n", 365 | " \n", 366 | " \n", 367 | " \n", 368 | " \n", 369 | " \n", 370 | " \n", 371 | " \n", 372 | " \n", 373 | " \n", 374 | " \n", 375 | " \n", 376 | " \n", 377 | " \n", 378 | " \n", 379 | " \n", 380 | " \n", 381 | " \n", 382 | " \n", 383 | " \n", 384 | " \n", 385 | " \n", 386 | " \n", 387 | " \n", 388 | " \n", 389 | " \n", 390 | " \n", 391 | " \n", 392 | " \n", 393 | " \n", 394 | " \n", 395 | " \n", 396 | " \n", 397 | " \n", 398 | " \n", 399 | " \n", 400 | " \n", 401 | " \n", 402 | "
    Postcode  Borough           Neighborhood
2   M3A       North York        Parkwoods
3   M4A       North York        Victoria Village
4   M5A       Downtown Toronto  Harbourfront
5   M6A       North York        Lawrence Heights
6   M6A       North York        Lawrence Manor
7   M7A       Queen's Park      Queen's Park
9   M9A       Downtown Toronto  Queen's Park
10  M1B       Scarborough       Rouge
11  M1B       Scarborough       Malvern
13  M3B       North York        Don Mills North
\n", 403 | "
" 404 | ], 405 | "text/plain": [ 406 | " Postcode Borough Neighborhood\n", 407 | "2 M3A North York Parkwoods\n", 408 | "3 M4A North York Victoria Village\n", 409 | "4 M5A Downtown Toronto Harbourfront\n", 410 | "5 M6A North York Lawrence Heights\n", 411 | "6 M6A North York Lawrence Manor\n", 412 | "7 M7A Queen's Park Queen's Park\n", 413 | "9 M9A Downtown Toronto Queen's Park\n", 414 | "10 M1B Scarborough Rouge\n", 415 | "11 M1B Scarborough Malvern\n", 416 | "13 M3B North York Don Mills North" 417 | ] 418 | }, 419 | "execution_count": 5, 420 | "metadata": {}, 421 | "output_type": "execute_result" 422 | } 423 | ], 424 | "source": [ 425 | "\n", 426 | "df = df.mask(df == 'Not assigned').ffill(axis=1)\n", 427 | "df.head(10)" 428 | ] 429 | }, 430 | { 431 | "cell_type": "markdown", 432 | "metadata": {}, 433 | "source": [ 434 | "I'm combining Neighborhoods based on Postcode" 435 | ] 436 | }, 437 | { 438 | "cell_type": "code", 439 | "execution_count": 6, 440 | "metadata": {}, 441 | "outputs": [ 442 | { 443 | "data": { 444 | "text/html": [ 445 | "
\n", 446 | "\n", 459 | "\n", 460 | " \n", 461 | " \n", 462 | " \n", 463 | " \n", 464 | " \n", 465 | " \n", 466 | " \n", 467 | " \n", 468 | " \n", 469 | " \n", 470 | " \n", 471 | " \n", 472 | " \n", 473 | " \n", 474 | " \n", 475 | " \n", 476 | " \n", 477 | " \n", 478 | " \n", 479 | " \n", 480 | " \n", 481 | " \n", 482 | " \n", 483 | " \n", 484 | " \n", 485 | " \n", 486 | " \n", 487 | " \n", 488 | " \n", 489 | " \n", 490 | " \n", 491 | " \n", 492 | " \n", 493 | " \n", 494 | " \n", 495 | " \n", 496 | " \n", 497 | " \n", 498 | " \n", 499 | " \n", 500 | " \n", 501 | " \n", 502 | " \n", 503 | " \n", 504 | " \n", 505 | " \n", 506 | " \n", 507 | " \n", 508 | " \n", 509 | " \n", 510 | " \n", 511 | " \n", 512 | " \n", 513 | " \n", 514 | " \n", 515 | " \n", 516 | " \n", 517 | " \n", 518 | " \n", 519 | " \n", 520 | " \n", 521 | " \n", 522 | " \n", 523 | " \n", 524 | " \n", 525 | " \n", 526 | " \n", 527 | " \n", 528 | " \n", 529 | " \n", 530 | " \n", 531 | " \n", 532 | " \n", 533 | " \n", 534 | " \n", 535 | " \n", 536 | " \n", 537 | " \n", 538 | " \n", 539 | " \n", 540 | " \n", 541 | " \n", 542 | " \n", 543 | " \n", 544 | " \n", 545 | " \n", 546 | " \n", 547 | " \n", 548 | " \n", 549 | " \n", 550 | " \n", 551 | " \n", 552 | " \n", 553 | " \n", 554 | " \n", 555 | " \n", 556 | " \n", 557 | " \n", 558 | " \n", 559 | " \n", 560 | " \n", 561 | " \n", 562 | " \n", 563 | " \n", 564 | " \n", 565 | " \n", 566 | " \n", 567 | " \n", 568 | " \n", 569 | " \n", 570 | " \n", 571 | " \n", 572 | " \n", 573 | " \n", 574 | " \n", 575 | " \n", 576 | " \n", 577 | " \n", 578 | " \n", 579 | " \n", 580 | " \n", 581 | " \n", 582 | " \n", 583 | " \n", 584 | " \n", 585 | " \n", 586 | " \n", 587 | " \n", 588 | " \n", 589 | " \n", 590 | "
    Postcode  Borough      Neighborhood
0   M1B       Scarborough  Rouge,Malvern
1   M1C       Scarborough  Highland Creek,Rouge Hill,Port Union
2   M1E       Scarborough  Guildwood,Morningside,West Hill
3   M1G       Scarborough  Woburn
4   M1H       Scarborough  Cedarbrae
5   M1J       Scarborough  Scarborough Village
6   M1K       Scarborough  East Birchmount Park,Ionview,Kennedy Park
7   M1L       Scarborough  Clairlea,Golden Mile,Oakridge
8   M1M       Scarborough  Cliffcrest,Cliffside,Scarborough Village West
9   M1N       Scarborough  Birch Cliff,Cliffside West
10  M1P       Scarborough  Dorset Park,Scarborough Town Centre,Wexford He...
11  M1R       Scarborough  Maryvale,Wexford
12  M1S       Scarborough  Agincourt
13  M1T       Scarborough  Clarks Corners,Sullivan,Tam O'Shanter
14  M1V       Scarborough  Agincourt North,L'Amoreaux East,Milliken,Steel...
15  M1W       Scarborough  L'Amoreaux West
16  M1X       Scarborough  Upper Rouge
17  M2H       North York   Hillcrest Village
18  M2J       North York   Fairview,Henry Farm,Oriole
19  M2K       North York   Bayview Village
\n", 591 | "
" 592 | ], 593 | "text/plain": [ 594 | " Postcode Borough Neighborhood\n", 595 | "0 M1B Scarborough Rouge,Malvern\n", 596 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n", 597 | "2 M1E Scarborough Guildwood,Morningside,West Hill\n", 598 | "3 M1G Scarborough Woburn\n", 599 | "4 M1H Scarborough Cedarbrae\n", 600 | "5 M1J Scarborough Scarborough Village\n", 601 | "6 M1K Scarborough East Birchmount Park,Ionview,Kennedy Park\n", 602 | "7 M1L Scarborough Clairlea,Golden Mile,Oakridge\n", 603 | "8 M1M Scarborough Cliffcrest,Cliffside,Scarborough Village West\n", 604 | "9 M1N Scarborough Birch Cliff,Cliffside West\n", 605 | "10 M1P Scarborough Dorset Park,Scarborough Town Centre,Wexford He...\n", 606 | "11 M1R Scarborough Maryvale,Wexford\n", 607 | "12 M1S Scarborough Agincourt\n", 608 | "13 M1T Scarborough Clarks Corners,Sullivan,Tam O'Shanter\n", 609 | "14 M1V Scarborough Agincourt North,L'Amoreaux East,Milliken,Steel...\n", 610 | "15 M1W Scarborough L'Amoreaux West\n", 611 | "16 M1X Scarborough Upper Rouge\n", 612 | "17 M2H North York Hillcrest Village\n", 613 | "18 M2J North York Fairview,Henry Farm,Oriole\n", 614 | "19 M2K North York Bayview Village" 615 | ] 616 | }, 617 | "execution_count": 6, 618 | "metadata": {}, 619 | "output_type": "execute_result" 620 | } 621 | ], 622 | "source": [ 623 | "df = df.groupby(['Postcode','Borough'])['Neighborhood'].apply(','.join).reset_index()\n", 624 | "df.head(20)" 625 | ] 626 | }, 627 | { 628 | "cell_type": "markdown", 629 | "metadata": {}, 630 | "source": [ 631 | "I'm checking for the shape of the dataframe" 632 | ] 633 | }, 634 | { 635 | "cell_type": "code", 636 | "execution_count": 7, 637 | "metadata": {}, 638 | "outputs": [ 639 | { 640 | "data": { 641 | "text/plain": [ 642 | "(103, 3)" 643 | ] 644 | }, 645 | "execution_count": 7, 646 | "metadata": {}, 647 | "output_type": "execute_result" 648 | } 649 | ], 650 | "source": [ 651 | "df.shape" 652 | ] 653 | }, 654 | { 655 | "cell_type": "code", 656 | 
"execution_count": 8, 657 | "metadata": {}, 658 | "outputs": [], 659 | "source": [ 660 | "df.to_csv('tor-loc.csv')" 661 | ] 662 | } 663 | ], 664 | "metadata": { 665 | "kernelspec": { 666 | "display_name": "Python", 667 | "language": "python", 668 | "name": "conda-env-python-py" 669 | }, 670 | "language_info": { 671 | "codemirror_mode": { 672 | "name": "ipython", 673 | "version": 3 674 | }, 675 | "file_extension": ".py", 676 | "mimetype": "text/x-python", 677 | "name": "python", 678 | "nbconvert_exporter": "python", 679 | "pygments_lexer": "ipython3", 680 | "version": "3.6.7" 681 | } 682 | }, 683 | "nbformat": 4, 684 | "nbformat_minor": 4 685 | } 686 | --------------------------------------------------------------------------------
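The three cleaning steps in Part 1 (drop unassigned boroughs, fill unassigned neighborhoods from the borough, join neighborhoods per postcode) can be sketched as one function. This is a minimal sketch under stated assumptions, not the notebook's own code: `clean_postcodes` is a hypothetical helper name, the toy rows are copied from the outputs above, and the fill uses an explicit boolean mask in place of the notebook's `mask(...).ffill(axis=1)` idiom.

```python
import pandas as pd

# Toy stand-in for toronto.csv; the rows are copied from the outputs above.
raw = pd.DataFrame({
    "Postcode": ["M1A", "M3A", "M6A", "M6A", "M7A"],
    "Borough": ["Not assigned", "North York", "North York",
                "North York", "Queen's Park"],
    "Neighborhood": ["Not assigned", "Parkwoods", "Lawrence Heights",
                     "Lawrence Manor", "Not assigned"],
})


def clean_postcodes(df):
    """Apply the three Part 1 cleaning steps (hypothetical helper)."""
    # 1. Drop rows that have no borough.
    out = df[df["Borough"] != "Not assigned"].copy()
    # 2. Where the neighborhood is unassigned, reuse the borough name.
    #    (Explicit mask; swapped in for the notebook's row-wise ffill.)
    missing = out["Neighborhood"] == "Not assigned"
    out.loc[missing, "Neighborhood"] = out.loc[missing, "Borough"]
    # 3. Collapse to one row per postcode, joining neighborhoods with commas.
    return (out.groupby(["Postcode", "Borough"])["Neighborhood"]
               .agg(",".join)
               .reset_index())


clean = clean_postcodes(raw)
print(clean)
```

If the result is saved with `clean.to_csv('tor-loc.csv', index=False)`, the index is not written, so Part 2 would have no `Unnamed: 0` column to drop; the notebooks as recorded save the index and drop it later.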