├── presentation.pdf
├── The Battle of Cities.pdf
├── CapstoneProject.ipynb
├── README.md
├── Tbilisi-project-intro.ipynb
├── Toronto-project-final-PART2.ipynb
└── Toronto-project-final-PART1.ipynb
/presentation.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bexxmodd/Coursera_Capstone/master/presentation.pdf
--------------------------------------------------------------------------------
/The Battle of Cities.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/bexxmodd/Coursera_Capstone/master/The Battle of Cities.pdf
--------------------------------------------------------------------------------
/CapstoneProject.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Welcome to the Coursera Capstone Project\n",
8 | "\n",
9 | "* This notebook will be used to create the final capstone project for the Coursera data science certificate program\n",
10 | "* The program was built and taught by the IBM Data Science team"
11 | ]
12 | },
13 | {
14 | "cell_type": "code",
15 | "execution_count": 6,
16 | "metadata": {},
17 | "outputs": [
18 | {
19 | "name": "stdout",
20 | "output_type": "stream",
21 | "text": [
22 | "Hello Capstone Project Course!\n"
23 | ]
24 | }
25 | ],
26 | "source": [
27 | "import pandas as pd\n",
28 | "import numpy as np\n",
29 | "\n",
30 | "print('Hello Capstone Project Course!')"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": null,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": []
39 | }
40 | ],
41 | "metadata": {
42 | "kernelspec": {
43 | "display_name": "Python",
44 | "language": "python",
45 | "name": "conda-env-python-py"
46 | },
47 | "language_info": {
48 | "codemirror_mode": {
49 | "name": "ipython",
50 | "version": 3
51 | },
52 | "file_extension": ".py",
53 | "mimetype": "text/x-python",
54 | "name": "python",
55 | "nbconvert_exporter": "python",
56 | "pygments_lexer": "ipython3",
57 | "version": "3.6.7"
58 | }
59 | },
60 | "nbformat": 4,
61 | "nbformat_minor": 4
62 | }
63 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | The Battle of Cities: Tbilisi vs Berlin
2 | Introduction:
3 |
4 | Tbilisi, the capital of the post-Soviet country of Georgia, is one of the oldest cities, and the world has rediscovered it in recent years. Over the past 10 years the whole country has experienced immense growth in tourism, and the growth is especially noticeable in Tbilisi. People come to Georgia for the natural beauty of the Caucasus mountains, for the green forests and the wildlife preserved there, or simply to enjoy summer on the beaches of the Adjara region. Tbilisi itself, however, has become popular for its night clubs, bars, and restaurants, and it is steadily building a reputation for mesmerizing nightlife and rave culture. In several articles, Forbes, The Guardian, and VICE have named Tbilisi a center of nightlife, putting it in front of such huge cities as Berlin and London. Last year Forbes called Tbilisi “This Year's Most Exciting City” [1], and this is only the beginning.
5 | On the other side of the story, we have Berlin, which has a well-established reputation and worldwide popularity thanks to its outstanding night clubs and bars and the experience the city offers to people who are into nightlife. Since the fall of the Berlin Wall, the city has turned itself into the benchmark for clubbing and nightlife. However, Tbilisi’s growth, the recent conversation around it, and the increasing demand for Georgian DJs around the globe brought me to a question: is Tbilisi in competition with Berlin for the title of best nightlife city? Does Tbilisi compare to Berlin in terms of night clubs and bars, and is it worth tapping into that business?
6 | To explore the topic properly, we need to dive into the culture and investigate the important metrics, which I will address in the next paragraph. To experience nightlife fully, especially in a foreign city, a good night club alone is not enough; what matters is the combination of other venues, services, and freedom of expression. The Guardian, in its article about Tbilisi, wrote that “These cities share the magic ingredients that allowed clubbing to thrive in east Berlin: cheap rents, plenty of space, often in the form of unused communist-era buildings, and creative, open-minded young people” [2]. We’ll look at factors such as taxi services, hotels, a measure of personal freedom, and crime rates, together with the number of clubs and bars, their average reviews, and how easy it is to travel to Tbilisi.
7 |
--------------------------------------------------------------------------------
/Tbilisi-project-intro.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "The Battle of Cities: Tbilisi vs Berlin\n",
8 | "\n",
9 | "### 1.0 Introduction:\n",
10 | "\n",
11 | "Tbilisi, the capital of the post-Soviet country of Georgia, is one of the oldest cities, and the world has rediscovered it in recent years. Over the past 10 years the whole country has experienced immense growth in tourism, and the growth is especially noticeable in Tbilisi. People come to Georgia for the natural beauty of the Caucasus mountains, for the green forests and the wildlife preserved there, or simply to enjoy summer on the beaches of the Adjara region. Tbilisi itself, however, has become popular for its night clubs, bars, and restaurants, and it is steadily building a reputation for mesmerizing nightlife and rave culture. In several articles, Forbes, The Guardian, and VICE have named Tbilisi a center of nightlife, putting it in front of such huge cities as Berlin and London. Last year Forbes called Tbilisi “This Year's Most Exciting City” [1], and this is only the beginning.\n",
12 | "\n",
13 | "\n",
14 | "On the other side of the story, we have Berlin, which has a well-established reputation and worldwide popularity thanks to its outstanding night clubs and bars and the experience the city offers to people who are into nightlife. Since the fall of the Berlin Wall, the city has turned itself into the benchmark for clubbing and nightlife. However, Tbilisi’s growth, the recent conversation around it, and the increasing demand for Georgian DJs around the globe brought me to a question: is Tbilisi in competition with Berlin for the title of best nightlife city? Does Tbilisi compare to Berlin in terms of night clubs and bars, and is it worth tapping into that business?\n",
15 | "\n",
16 | "\n",
17 | "To explore the topic properly, we need to dive into the culture and investigate the important metrics, which I will address in the next section. To experience nightlife fully, especially in a foreign city, a good night club alone is not enough; what matters is the combination of other venues, services, and freedom of expression. The Guardian, in its article about Tbilisi, wrote that “These cities share the magic ingredients that allowed clubbing to thrive in east Berlin: cheap rents, plenty of space, often in the form of unused communist-era buildings, and creative, open-minded young people” [2]. We’ll look at factors such as taxi services, hotels, a measure of personal freedom, and crime rates, together with the number of clubs and bars, their average reviews, and how easy it is to travel to Tbilisi.\n",
18 | "\n",
19 | "\n",
20 | "### 1.1 Data:\n",
21 | "I will use the Foursquare API to collect the top 25 venues in Tbilisi, Georgia, and in Berlin, Germany, that are searchable under the categories \"Night Life\" and \"Night Clubs\". Using the same API and the same techniques, I will also collect the top 25 venues searchable under the categories \"Hotel\" and \"Hostel\". I will extract the venues' average ratings and the number of reviews to analyze each venue's reputation and the impressions of its visitors.\n",
22 | "\n",
23 | "\n",
24 | "I will use other sources (full details of data collection and cleaning are in the data collection and wrangling section) to obtain information about travel trends, average hotel room prices, and taxi prices, in order to portray how enjoyable the nightlife is in the two cities and to draw some parallels. I will also look at crime rates, overall security, and the degree to which each jurisdiction allows people to express personal freedom and exercise their legal rights.\n",
25 | "\n",
26 | "--------\n",
27 | "References:\n",
28 | "\n",
29 | "[1] Wilson, B. (2019, September 25). Berlin Is Out, Tbilisi Is In: Georgia's Capital Is This Year's Most Exciting City. Retrieved from https://www.forbes.com/sites/breannawilson/2018/09/05/berlin-is-out-tbilisi-is-in-georgias-capital-is-this-years-most-exciting-city/#435a0d80479d.\n",
30 | "\n",
31 | "[2] Ravens, C. (2019, January 22). Bassiani: the Tbilisi techno mecca shaking off post-Soviet repression. Retrieved from https://www.theguardian.com/music/2019/jan/22/bassiani-tbilisi-techno-nightclub-georgia.\n"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": []
40 | }
41 | ],
42 | "metadata": {
43 | "kernelspec": {
44 | "display_name": "Python 3",
45 | "language": "python",
46 | "name": "python3"
47 | },
48 | "language_info": {
49 | "codemirror_mode": {
50 | "name": "ipython",
51 | "version": 3
52 | },
53 | "file_extension": ".py",
54 | "mimetype": "text/x-python",
55 | "name": "python",
56 | "nbconvert_exporter": "python",
57 | "pygments_lexer": "ipython3",
58 | "version": "3.7.4"
59 | }
60 | },
61 | "nbformat": 4,
62 | "nbformat_minor": 2
63 | }
64 |
--------------------------------------------------------------------------------
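The data-collection plan in section 1.1 of the notebook above could start from a request like the one sketched here. This is only an illustration under assumptions: the notebook does not show which Foursquare endpoint, API version, or search terms the project used, so the legacy v2 `venues/explore` endpoint, the version date, the query strings, and the helper name `build_explore_url` are all placeholders. The sketch only assembles the request URLs; it does not call the API.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint. The notebook does
# not say which endpoint or API version the project actually used.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(city, query, client_id, client_secret,
                      limit=25, version="20180605"):
    """Return a URL asking for up to `limit` venues matching `query` in `city`.

    Hypothetical helper written for this illustration only.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,    # API version date in YYYYMMDD form
        "near": city,    # free-text place name, geocoded by the API
        "query": query,  # search term, e.g. "Nightclub" or "Hotel"
        "limit": limit,  # the plan above asks for the top 25 venues
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# One URL per (city, search term) pair; the credentials are placeholders.
urls = {
    (city, query): build_explore_url(city, query, "YOUR_ID", "YOUR_SECRET")
    for city in ("Tbilisi, Georgia", "Berlin, Germany")
    for query in ("Nightclub", "Hotel")
}
```

Each URL could then be fetched with an HTTP client and the JSON response flattened into a dataframe; keeping URL construction separate makes it easy to check the parameters before spending API quota.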
/Toronto-project-final-PART2.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Part 2 of Neighborhoods project\n",
8 | "\n",
9 | "--------\n",
10 | "### Finalizing our data"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {},
16 | "source": [
17 | "I'm importing the CSV files with the Toronto locations and the geographic coordinates so that I can combine them"
18 | ]
19 | },
20 | {
21 | "cell_type": "code",
22 | "execution_count": 1,
23 | "metadata": {},
24 | "outputs": [
25 | {
26 | "data": {
92 | "text/plain": [
93 | " Unnamed: 0 Postcode Borough Neighborhood\n",
94 | "0 0 M1B Scarborough Rouge,Malvern\n",
95 | "1 1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n",
96 | "2 2 M1E Scarborough Guildwood,Morningside,West Hill\n",
97 | "3 3 M1G Scarborough Woburn\n",
98 | "4 4 M1H Scarborough Cedarbrae"
99 | ]
100 | },
101 | "execution_count": 1,
102 | "metadata": {},
103 | "output_type": "execute_result"
104 | }
105 | ],
106 | "source": [
107 | "import pandas as pd\n",
108 | "df = pd.read_csv('tor-loc.csv')\n",
109 | "df.head()"
110 | ]
111 | },
112 | {
113 | "cell_type": "markdown",
114 | "metadata": {},
115 | "source": [
116 | "I'm dropping the column that is not needed"
117 | ]
118 | },
119 | {
120 | "cell_type": "code",
121 | "execution_count": 2,
122 | "metadata": {},
123 | "outputs": [
124 | {
125 | "data": {
185 | "text/plain": [
186 | " Postcode Borough Neighborhood\n",
187 | "0 M1B Scarborough Rouge,Malvern\n",
188 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n",
189 | "2 M1E Scarborough Guildwood,Morningside,West Hill\n",
190 | "3 M1G Scarborough Woburn\n",
191 | "4 M1H Scarborough Cedarbrae"
192 | ]
193 | },
194 | "execution_count": 2,
195 | "metadata": {},
196 | "output_type": "execute_result"
197 | }
198 | ],
199 | "source": [
200 | "df = df.drop(['Unnamed: 0'], axis=1)\n",
201 | "df.head()"
202 | ]
203 | },
204 | {
205 | "cell_type": "code",
206 | "execution_count": 3,
207 | "metadata": {},
208 | "outputs": [
209 | {
210 | "data": {
270 | "text/plain": [
271 | " Postal Code Latitude Longitude\n",
272 | "0 M1B 43.806686 -79.194353\n",
273 | "1 M1C 43.784535 -79.160497\n",
274 | "2 M1E 43.763573 -79.188711\n",
275 | "3 M1G 43.770992 -79.216917\n",
276 | "4 M1H 43.773136 -79.239476"
277 | ]
278 | },
279 | "execution_count": 3,
280 | "metadata": {},
281 | "output_type": "execute_result"
282 | }
283 | ],
284 | "source": [
285 | "import pandas as pd\n",
286 | "df1 = pd.read_csv('Geospatial_Coordinates.csv')\n",
287 | "df1.head()"
288 | ]
289 | },
290 | {
291 | "cell_type": "markdown",
292 | "metadata": {},
293 | "source": [
294 | "I'm renaming the 'Postal Code' column to 'Postcode'; it will later serve as the key for merging the two dataframes"
295 | ]
296 | },
297 | {
298 | "cell_type": "code",
299 | "execution_count": 4,
300 | "metadata": {},
301 | "outputs": [
302 | {
303 | "data": {
363 | "text/plain": [
364 | " Postcode Latitude Longitude\n",
365 | "0 M1B 43.806686 -79.194353\n",
366 | "1 M1C 43.784535 -79.160497\n",
367 | "2 M1E 43.763573 -79.188711\n",
368 | "3 M1G 43.770992 -79.216917\n",
369 | "4 M1H 43.773136 -79.239476"
370 | ]
371 | },
372 | "execution_count": 4,
373 | "metadata": {},
374 | "output_type": "execute_result"
375 | }
376 | ],
377 | "source": [
378 | "df1 = df1.rename(columns={'Postal Code': 'Postcode'})\n",
379 | "df1.head()"
380 | ]
381 | },
382 | {
383 | "cell_type": "markdown",
384 | "metadata": {},
385 | "source": [
386 | "I'm adding coordinates to the postcodes and neighborhoods by merging the two dataframes on the \"Postcode\" key"
387 | ]
388 | },
389 | {
390 | "cell_type": "code",
391 | "execution_count": 5,
392 | "metadata": {},
393 | "outputs": [
394 | {
395 | "data": {
507 | "text/plain": [
508 | " Postcode Borough Neighborhood \\\n",
509 | "0 M1B Scarborough Rouge,Malvern \n",
510 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union \n",
511 | "2 M1E Scarborough Guildwood,Morningside,West Hill \n",
512 | "3 M1G Scarborough Woburn \n",
513 | "4 M1H Scarborough Cedarbrae \n",
514 | "5 M1J Scarborough Scarborough Village \n",
515 | "6 M1K Scarborough East Birchmount Park,Ionview,Kennedy Park \n",
516 | "7 M1L Scarborough Clairlea,Golden Mile,Oakridge \n",
517 | "8 M1M Scarborough Cliffcrest,Cliffside,Scarborough Village West \n",
518 | "9 M1N Scarborough Birch Cliff,Cliffside West \n",
519 | "\n",
520 | " Latitude Longitude \n",
521 | "0 43.806686 -79.194353 \n",
522 | "1 43.784535 -79.160497 \n",
523 | "2 43.763573 -79.188711 \n",
524 | "3 43.770992 -79.216917 \n",
525 | "4 43.773136 -79.239476 \n",
526 | "5 43.744734 -79.239476 \n",
527 | "6 43.727929 -79.262029 \n",
528 | "7 43.711112 -79.284577 \n",
529 | "8 43.716316 -79.239476 \n",
530 | "9 43.692657 -79.264848 "
531 | ]
532 | },
533 | "execution_count": 5,
534 | "metadata": {},
535 | "output_type": "execute_result"
536 | }
537 | ],
538 | "source": [
539 | "dft = pd.merge(df, df1, on='Postcode')\n",
540 | "dft.head(10)"
541 | ]
542 | },
543 | {
544 | "cell_type": "code",
545 | "execution_count": 6,
546 | "metadata": {},
547 | "outputs": [],
548 | "source": [
549 | "dft.to_csv('toronto-df.csv')"
550 | ]
551 | }
552 | ],
553 | "metadata": {
554 | "kernelspec": {
555 | "display_name": "Python",
556 | "language": "python",
557 | "name": "conda-env-python-py"
558 | },
559 | "language_info": {
560 | "codemirror_mode": {
561 | "name": "ipython",
562 | "version": 3
563 | },
564 | "file_extension": ".py",
565 | "mimetype": "text/x-python",
566 | "name": "python",
567 | "nbconvert_exporter": "python",
568 | "pygments_lexer": "ipython3",
569 | "version": "3.6.7"
570 | }
571 | },
572 | "nbformat": 4,
573 | "nbformat_minor": 4
574 | }
575 |
--------------------------------------------------------------------------------
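The rename-and-merge steps in the notebook above can be reproduced on a small scale as follows. This is a minimal sketch, not the notebook's own code: the three rows are copied from the outputs shown above, while the variable names, the `validate="one_to_one"` check, and the `index=False` option are additions made for illustration.

```python
import io

import pandas as pd

# Small stand-ins for tor-loc.csv and Geospatial_Coordinates.csv; the values
# are copied from the notebook outputs above.
neighborhoods = pd.DataFrame({
    "Postcode": ["M1B", "M1C", "M1E"],
    "Borough": ["Scarborough", "Scarborough", "Scarborough"],
    "Neighborhood": [
        "Rouge,Malvern",
        "Highland Creek,Rouge Hill,Port Union",
        "Guildwood,Morningside,West Hill",
    ],
})
coordinates = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M1E"],
    "Latitude": [43.806686, 43.784535, 43.763573],
    "Longitude": [-79.194353, -79.160497, -79.188711],
})

# Align the key name, then merge. validate="one_to_one" (an addition, not in
# the notebook) makes pandas raise MergeError if a postcode is duplicated on
# either side.
coordinates = coordinates.rename(columns={"Postal Code": "Postcode"})
merged = pd.merge(neighborhoods, coordinates, on="Postcode", how="inner",
                  validate="one_to_one")

# Writing with index=False (also an addition) keeps the row index out of the
# CSV, so reading it back does not produce an "Unnamed: 0" column.
buffer = io.StringIO()
merged.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)
```

The notebook writes its CSV with the default index, which is one way an `Unnamed: 0` column like the one dropped earlier can end up in a file; passing `index=False` when writing avoids that clean-up step.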
/Toronto-project-final-PART1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Part 1 of Neighborhoods project\n",
8 | "\n",
9 | "--------\n",
10 | "### Extracting initial data"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "metadata": {},
16 | "source": [
17 | "## Data preparation\n",
18 | "In this notebook I'll import, clean, and prepare the dataframe for Toronto"
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 1,
24 | "metadata": {},
25 | "outputs": [
26 | {
27 | "name": "stdout",
28 | "output_type": "stream",
29 | "text": [
30 | "Libraries imported.\n"
31 | ]
32 | }
33 | ],
34 | "source": [
35 | "import numpy as np # library to handle data in a vectorized manner\n",
36 | "\n",
37 | "import pandas as pd # library for data analysis\n",
38 | "pd.set_option('display.max_columns', None)\n",
39 | "pd.set_option('display.max_rows', None)\n",
40 | "\n",
41 | "import json # library to handle JSON files\n",
42 | "\n",
43 | "#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab\n",
44 | "from geopy.geocoders import Nominatim # convert an address into latitude and longitude values\n",
45 | "\n",
46 | "import requests # library to handle requests\n",
47 | "from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe\n",
48 | "\n",
49 | "# Matplotlib and associated plotting modules\n",
50 | "import matplotlib.cm as cm\n",
51 | "import matplotlib.colors as colors\n",
52 | "\n",
53 | "# import k-means from clustering stage\n",
54 | "from sklearn.cluster import KMeans\n",
55 | "\n",
56 | "#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab\n",
57 | "import folium # map rendering library\n",
58 | "\n",
59 | "print('Libraries imported.')"
60 | ]
61 | },
62 | {
63 | "cell_type": "markdown",
64 | "metadata": {},
65 | "source": [
66 | "I import the CSV file after scraping it from https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": 2,
72 | "metadata": {},
73 | "outputs": [
74 | {
75 | "data": {
135 | "text/plain": [
136 | " Postcode Borough Neighborhood\n",
137 | "0 M1A Not assigned Not assigned\n",
138 | "1 M2A Not assigned Not assigned\n",
139 | "2 M3A North York Parkwoods\n",
140 | "3 M4A North York Victoria Village\n",
141 | "4 M5A Downtown Toronto Harbourfront"
142 | ]
143 | },
144 | "execution_count": 2,
145 | "metadata": {},
146 | "output_type": "execute_result"
147 | }
148 | ],
149 | "source": [
150 | "# import the CSV file after scraping it from the wiki page\n",
151 | "df = pd.read_csv('toronto.csv')\n",
152 | "df.head()"
153 | ]
154 | },
155 | {
156 | "cell_type": "markdown",
157 | "metadata": {},
158 | "source": [
159 | "I'm dropping the rows whose Borough value is 'Not assigned'"
160 | ]
161 | },
162 | {
163 | "cell_type": "code",
164 | "execution_count": 3,
165 | "metadata": {},
166 | "outputs": [
167 | {
168 | "data": {
258 | "text/plain": [
259 | " Postcode Borough Neighborhood\n",
260 | "2 M3A North York Parkwoods\n",
261 | "3 M4A North York Victoria Village\n",
262 | "4 M5A Downtown Toronto Harbourfront\n",
263 | "5 M6A North York Lawrence Heights\n",
264 | "6 M6A North York Lawrence Manor\n",
265 | "7 M7A Queen's Park Not assigned\n",
266 | "9 M9A Downtown Toronto Queen's Park\n",
267 | "10 M1B Scarborough Rouge\n",
268 | "11 M1B Scarborough Malvern\n",
269 | "13 M3B North York Don Mills North"
270 | ]
271 | },
272 | "execution_count": 3,
273 | "metadata": {},
274 | "output_type": "execute_result"
275 | }
276 | ],
277 | "source": [
278 | "df = df[df.Borough != 'Not assigned']\n",
279 | "df.head(10)"
280 | ]
281 | },
282 | {
283 | "cell_type": "code",
284 | "execution_count": 4,
285 | "metadata": {},
286 | "outputs": [
287 | {
288 | "data": {
289 | "text/plain": [
290 | "(210, 3)"
291 | ]
292 | },
293 | "execution_count": 4,
294 | "metadata": {},
295 | "output_type": "execute_result"
296 | }
297 | ],
298 | "source": [
299 | "df.shape"
300 | ]
301 | },
302 | {
303 | "cell_type": "markdown",
304 | "metadata": {},
305 | "source": [
306 | "I'm filling 'Not assigned' Neighborhood values from the corresponding Borough column"
307 | ]
308 | },
309 | {
310 | "cell_type": "code",
311 | "execution_count": 5,
312 | "metadata": {},
313 | "outputs": [
314 | {
315 | "data": {
405 | "text/plain": [
406 | " Postcode Borough Neighborhood\n",
407 | "2 M3A North York Parkwoods\n",
408 | "3 M4A North York Victoria Village\n",
409 | "4 M5A Downtown Toronto Harbourfront\n",
410 | "5 M6A North York Lawrence Heights\n",
411 | "6 M6A North York Lawrence Manor\n",
412 | "7 M7A Queen's Park Queen's Park\n",
413 | "9 M9A Downtown Toronto Queen's Park\n",
414 | "10 M1B Scarborough Rouge\n",
415 | "11 M1B Scarborough Malvern\n",
416 | "13 M3B North York Don Mills North"
417 | ]
418 | },
419 | "execution_count": 5,
420 | "metadata": {},
421 | "output_type": "execute_result"
422 | }
423 | ],
424 | "source": [
425 | "\n",
426 | "df = df.mask(df == 'Not assigned').ffill(axis=1)\n",
427 | "df.head(10)"
428 | ]
429 | },
430 | {
431 | "cell_type": "markdown",
432 | "metadata": {},
433 | "source": [
434 | "I'm combining Neighborhoods based on Postcode"
435 | ]
436 | },
437 | {
438 | "cell_type": "code",
439 | "execution_count": 6,
440 | "metadata": {},
441 | "outputs": [
442 | {
443 | "data": {
593 | "text/plain": [
594 | " Postcode Borough Neighborhood\n",
595 | "0 M1B Scarborough Rouge,Malvern\n",
596 | "1 M1C Scarborough Highland Creek,Rouge Hill,Port Union\n",
597 | "2 M1E Scarborough Guildwood,Morningside,West Hill\n",
598 | "3 M1G Scarborough Woburn\n",
599 | "4 M1H Scarborough Cedarbrae\n",
600 | "5 M1J Scarborough Scarborough Village\n",
601 | "6 M1K Scarborough East Birchmount Park,Ionview,Kennedy Park\n",
602 | "7 M1L Scarborough Clairlea,Golden Mile,Oakridge\n",
603 | "8 M1M Scarborough Cliffcrest,Cliffside,Scarborough Village West\n",
604 | "9 M1N Scarborough Birch Cliff,Cliffside West\n",
605 | "10 M1P Scarborough Dorset Park,Scarborough Town Centre,Wexford He...\n",
606 | "11 M1R Scarborough Maryvale,Wexford\n",
607 | "12 M1S Scarborough Agincourt\n",
608 | "13 M1T Scarborough Clarks Corners,Sullivan,Tam O'Shanter\n",
609 | "14 M1V Scarborough Agincourt North,L'Amoreaux East,Milliken,Steel...\n",
610 | "15 M1W Scarborough L'Amoreaux West\n",
611 | "16 M1X Scarborough Upper Rouge\n",
612 | "17 M2H North York Hillcrest Village\n",
613 | "18 M2J North York Fairview,Henry Farm,Oriole\n",
614 | "19 M2K North York Bayview Village"
615 | ]
616 | },
617 | "execution_count": 6,
618 | "metadata": {},
619 | "output_type": "execute_result"
620 | }
621 | ],
622 | "source": [
623 | "df = df.groupby(['Postcode','Borough'])['Neighborhood'].apply(','.join).reset_index()\n",
624 | "df.head(20)"
625 | ]
626 | },
627 | {
628 | "cell_type": "markdown",
629 | "metadata": {},
630 | "source": [
631 | "I'm checking the shape of the dataframe, which now has one row per Postcode and Borough pair"
632 | ]
633 | },
634 | {
635 | "cell_type": "code",
636 | "execution_count": 7,
637 | "metadata": {},
638 | "outputs": [
639 | {
640 | "data": {
641 | "text/plain": [
642 | "(103, 3)"
643 | ]
644 | },
645 | "execution_count": 7,
646 | "metadata": {},
647 | "output_type": "execute_result"
648 | }
649 | ],
650 | "source": [
651 | "df.shape"
652 | ]
653 | },
654 | {
655 | "cell_type": "code",
656 | "execution_count": 8,
657 | "metadata": {},
658 | "outputs": [],
659 | "source": [
660 | "df.to_csv('tor-loc.csv')"
661 | ]
662 | }
663 | ],
664 | "metadata": {
665 | "kernelspec": {
666 | "display_name": "Python",
667 | "language": "python",
668 | "name": "conda-env-python-py"
669 | },
670 | "language_info": {
671 | "codemirror_mode": {
672 | "name": "ipython",
673 | "version": 3
674 | },
675 | "file_extension": ".py",
676 | "mimetype": "text/x-python",
677 | "name": "python",
678 | "nbconvert_exporter": "python",
679 | "pygments_lexer": "ipython3",
680 | "version": "3.6.7"
681 | }
682 | },
683 | "nbformat": 4,
684 | "nbformat_minor": 4
685 | }
686 |
--------------------------------------------------------------------------------