├── 001_Read_and_Analyze_GPX_Strava_Route.ipynb
├── 002_Visualize_GPX_Strava_Routes_with_Folium.ipynb
├── 003_Elevation_and_Distance_Between_Points.ipynb
├── 004_Calculate_and_Visualize_Gradients.ipynb
└── 005_GradientProfile.ipynb
/001_Read_and_Analyze_GPX_Strava_Route.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "2ce70558-8243-4ff4-b548-6112b3f05537",
6 | "metadata": {},
7 | "source": [
8 | "# Data Science for Cycling #1 - How To Read GPX Strava Routes With Python\n",
9 | "- Notebook 1/6\n",
10 | "- Make sure to have `gpxpy` installed:\n",
11 | "\n",
12 | "\n",
13 | "```\n",
14 | "pip install gpxpy\n",
15 | "```\n",
16 | "\n",
17 | "- Let's import the libraries and tweak Matplotlib's default stylings:"
18 | ]
19 | },
20 | {
21 | "cell_type": "code",
22 | "execution_count": 1,
23 | "id": "fd92ae56-c120-4081-96a8-958327826c26",
24 | "metadata": {},
25 | "outputs": [],
26 | "source": [
27 | "import gpxpy\n",
28 | "import gpxpy.gpx\n",
29 | "\n",
30 | "import pandas as pd\n",
31 | "import matplotlib.pyplot as plt\n",
32 | "plt.rcParams['axes.spines.top'] = False\n",
33 | "plt.rcParams['axes.spines.right'] = False"
34 | ]
35 | },
36 | {
37 | "cell_type": "markdown",
38 | "id": "7ab807b5-13a5-40d9-a7b8-f62f6ef385c5",
39 | "metadata": {},
40 | "source": [
41 | "- You can read GPX files with Python's context manager syntax:"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": 3,
47 | "id": "a5c4e1aa-8589-4ac4-ba50-6ddb473b934d",
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "with open('../src_code/Zg288.gpx', 'r') as gpx_file:\n",
52 | " gpx = gpxpy.parse(gpx_file)"
53 | ]
54 | },
55 | {
56 | "cell_type": "markdown",
57 | "id": "483d3c00-01c2-4e47-bb14-2ce6b5c89df0",
58 | "metadata": {},
59 | "source": [
60 | "- It's a specific GPX object:"
61 | ]
62 | },
63 | {
64 | "cell_type": "code",
65 | "execution_count": 10,
66 | "id": "d521e73c-5f9f-48a4-a117-3b46da950e2e",
67 | "metadata": {},
68 | "outputs": [
69 | {
70 | "data": {
71 | "text/plain": [
72 | "GPX(tracks=[GPXTrack(name='Zg288', segments=[GPXTrackSegment(points=[...])])])"
73 | ]
74 | },
75 | "execution_count": 10,
76 | "metadata": {},
77 | "output_type": "execute_result"
78 | }
79 | ],
80 | "source": [
81 | "gpx"
82 | ]
83 | },
84 | {
85 | "cell_type": "markdown",
86 | "id": "f82304af-1de4-43f5-b713-3714eaeb2073",
87 | "metadata": {},
88 | "source": [
89 | "- Get the number of data points (number of times geolocation was taken):"
90 | ]
91 | },
92 | {
93 | "cell_type": "code",
94 | "execution_count": 22,
95 | "id": "6a7272df-f0df-4025-b65c-dec0bdff4b1d",
96 | "metadata": {},
97 | "outputs": [
98 | {
99 | "data": {
100 | "text/plain": [
101 | "835"
102 | ]
103 | },
104 | "execution_count": 22,
105 | "metadata": {},
106 | "output_type": "execute_result"
107 | }
108 | ],
109 | "source": [
110 | "gpx.get_track_points_no()"
111 | ]
112 | },
113 | {
114 | "cell_type": "markdown",
115 | "id": "d16ac89b-3049-4aa0-b4ce-d07cfd61c19a",
116 | "metadata": {},
117 | "source": [
118 | "- Get the minimum and maximum altitudes:"
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": 17,
124 | "id": "877183c5-51a8-474b-b3ea-f7fa18808694",
125 | "metadata": {},
126 | "outputs": [
127 | {
128 | "data": {
129 | "text/plain": [
130 | "MinimumMaximum(minimum=113.96000000000001, maximum=239.16)"
131 | ]
132 | },
133 | "execution_count": 17,
134 | "metadata": {},
135 | "output_type": "execute_result"
136 | }
137 | ],
138 | "source": [
139 | "gpx.get_elevation_extremes()"
140 | ]
141 | },
142 | {
143 | "cell_type": "markdown",
144 | "id": "d1a08fcd-2315-408b-89fc-ced315f44126",
145 | "metadata": {},
146 | "source": [
147 | "- Get the total meters of uphill and downhill riding\n",
148 | "- It's a round trip, so the numbers should be almost identical"
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": 18,
154 | "id": "1f774255-cf8c-4594-8a5d-6f2dde900c84",
155 | "metadata": {},
156 | "outputs": [
157 | {
158 | "data": {
159 | "text/plain": [
160 | "UphillDownhill(uphill=295.7459999999997, downhill=295.7260000000002)"
161 | ]
162 | },
163 | "execution_count": 18,
164 | "metadata": {},
165 | "output_type": "execute_result"
166 | }
167 | ],
168 | "source": [
169 | "gpx.get_uphill_downhill()"
170 | ]
171 | },
172 | {
173 | "cell_type": "markdown",
174 | "id": "e4e5027f-6f33-41af-9b88-1ad038e27082",
175 | "metadata": {},
176 | "source": [
177 | "- You can dump the entire GPX file to XML\n",
178 | "- Here are the first 1000 characters:"
179 | ]
180 | },
181 | {
182 | "cell_type": "code",
183 | "execution_count": 9,
184 | "id": "a0858ff9-8c98-4c5b-bc53-94da96050d44",
185 | "metadata": {},
186 | "outputs": [
187 | {
188 | "data": {
189 | "text/plain": [
190 | "'\\n\\n \\n Zg288\\n \\n Dario Radečić\\n \\n \\n \\n \\n 2020\\n https://www.openstreetmap.org/copyright\\n \\n \\n \\n \\n \\n Zg288\\n \\n \\n Ride\\n \\n \\n 113.96000000000001\\n \\n \\n '"
191 | ]
192 | },
193 | "execution_count": 9,
194 | "metadata": {},
195 | "output_type": "execute_result"
196 | }
197 | ],
198 | "source": [
199 | "gpx.to_xml()[:1000]"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "id": "321d8e18-43c5-48f8-9aca-18745633f886",
205 | "metadata": {},
206 | "source": [
207 | "\n",
208 | "\n",
209 | "## Basic analysis\n",
210 | "- There's only one track available in the file\n",
211 | "- Access it with Python's list indexing syntax:"
212 | ]
213 | },
214 | {
215 | "cell_type": "code",
216 | "execution_count": 11,
217 | "id": "f17aa437-8c29-4be8-9b73-09aacb2bbcf5",
218 | "metadata": {},
219 | "outputs": [
220 | {
221 | "data": {
222 | "text/plain": [
223 | "GPXTrack(name='Zg288', segments=[GPXTrackSegment(points=[...])])"
224 | ]
225 | },
226 | "execution_count": 11,
227 | "metadata": {},
228 | "output_type": "execute_result"
229 | }
230 | ],
231 | "source": [
232 | "gpx.tracks[0]"
233 | ]
234 | },
235 | {
236 | "cell_type": "markdown",
237 | "id": "1c017826-2595-490f-89fc-fd503d84250c",
238 | "metadata": {},
239 | "source": [
240 | "- The track has only one segment - access it the same way:"
241 | ]
242 | },
243 | {
244 | "cell_type": "code",
245 | "execution_count": 12,
246 | "id": "7c7b5f59-e86f-493a-b67b-c89ab4620904",
247 | "metadata": {},
248 | "outputs": [
249 | {
250 | "data": {
251 | "text/plain": [
252 | "GPXTrackSegment(points=[...])"
253 | ]
254 | },
255 | "execution_count": 12,
256 | "metadata": {},
257 | "output_type": "execute_result"
258 | }
259 | ],
260 | "source": [
261 | "gpx.tracks[0].segments[0]"
262 | ]
263 | },
264 | {
265 | "cell_type": "markdown",
266 | "id": "fc95d72d-6a96-48a0-9c62-670ee87aa8f3",
267 | "metadata": {},
268 | "source": [
269 | "- The segment has 835 data points\n",
270 | "- Here are the first 10:"
271 | ]
272 | },
273 | {
274 | "cell_type": "code",
275 | "execution_count": 23,
276 | "id": "c7a5c1f9-735a-4d50-acd3-b2f6146e0c87",
277 | "metadata": {},
278 | "outputs": [
279 | {
280 | "data": {
281 | "text/plain": [
282 | "[GPXTrackPoint(45.77248, 15.95804, elevation=113.96000000000001),\n",
283 | " GPXTrackPoint(45.77277, 15.959090000000002, elevation=115.82000000000001),\n",
284 | " GPXTrackPoint(45.77327000046602, 15.958795002593812, elevation=116.15),\n",
285 | " GPXTrackPoint(45.773770000000006, 15.9585, elevation=116.12000000000002),\n",
286 | " GPXTrackPoint(45.77423500296469, 15.95933499290041, elevation=115.98000000000002),\n",
287 | " GPXTrackPoint(45.7747, 15.960170000000002, elevation=115.10000000000001),\n",
288 | " GPXTrackPoint(45.77487000000001, 15.960220000000001, elevation=115.25),\n",
289 | " GPXTrackPoint(45.77533000042586, 15.960000001712027, elevation=116.14),\n",
290 | " GPXTrackPoint(45.77579, 15.959780000000002, elevation=116.07000000000001),\n",
291 | " GPXTrackPoint(45.776320000000005, 15.959510000000002, elevation=115.78)]"
292 | ]
293 | },
294 | "execution_count": 23,
295 | "metadata": {},
296 | "output_type": "execute_result"
297 | }
298 | ],
299 | "source": [
300 | "gpx.tracks[0].segments[0].points[:10]"
301 | ]
302 | },
303 | {
304 | "cell_type": "markdown",
305 | "id": "9e361b78-d193-4677-9faa-98af2166d24c",
306 | "metadata": {},
307 | "source": [
308 | "- Let's now extract all data points\n",
309 | "- Store latitude, longitude, and elevation as a list of dicts"
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": 24,
315 | "id": "a0ead523-faaf-4587-be3a-a9982066bef4",
316 | "metadata": {},
317 | "outputs": [],
318 | "source": [
319 | "route_info = []\n",
320 | "\n",
321 | "for track in gpx.tracks:\n",
322 | " for segment in track.segments:\n",
323 | " for point in segment.points:\n",
324 | " route_info.append({\n",
325 | " 'latitude': point.latitude,\n",
326 | " 'longitude': point.longitude,\n",
327 | " 'elevation': point.elevation\n",
328 | " })"
329 | ]
330 | },
331 | {
332 | "cell_type": "code",
333 | "execution_count": 25,
334 | "id": "71d033ad-2998-415b-ab5a-fa42b0dfc959",
335 | "metadata": {},
336 | "outputs": [
337 | {
338 | "data": {
339 | "text/plain": [
340 | "[{'latitude': 45.77248,\n",
341 | " 'longitude': 15.95804,\n",
342 | " 'elevation': 113.96000000000001},\n",
343 | " {'latitude': 45.77277,\n",
344 | " 'longitude': 15.959090000000002,\n",
345 | " 'elevation': 115.82000000000001},\n",
346 | " {'latitude': 45.77327000046602,\n",
347 | " 'longitude': 15.958795002593812,\n",
348 | " 'elevation': 116.15}]"
349 | ]
350 | },
351 | "execution_count": 25,
352 | "metadata": {},
353 | "output_type": "execute_result"
354 | }
355 | ],
356 | "source": [
357 | "route_info[:3]"
358 | ]
359 | },
360 | {
361 | "cell_type": "markdown",
362 | "id": "be715000-2a30-4373-b8bc-51c15abb3e98",
363 | "metadata": {},
364 | "source": [
365 | "- Convert it to a Pandas DataFrame for faster and easier analysis"
366 | ]
367 | },
368 | {
369 | "cell_type": "code",
370 | "execution_count": 26,
371 | "id": "939968ff-e834-4603-b39e-6ebab08958aa",
372 | "metadata": {},
373 | "outputs": [
374 | {
375 | "data": {
435 | "text/plain": [
436 | " latitude longitude elevation\n",
437 | "0 45.772480 15.958040 113.96\n",
438 | "1 45.772770 15.959090 115.82\n",
439 | "2 45.773270 15.958795 116.15\n",
440 | "3 45.773770 15.958500 116.12\n",
441 | "4 45.774235 15.959335 115.98"
442 | ]
443 | },
444 | "execution_count": 26,
445 | "metadata": {},
446 | "output_type": "execute_result"
447 | }
448 | ],
449 | "source": [
450 | "route_df = pd.DataFrame(route_info)\n",
451 | "route_df.head()"
452 | ]
453 | },
454 | {
455 | "cell_type": "markdown",
456 | "id": "d325daf4-9057-4705-83ee-096467891134",
457 | "metadata": {},
458 | "source": [
459 | "- Save it to CSV for later use:"
460 | ]
461 | },
462 | {
463 | "cell_type": "code",
464 | "execution_count": 37,
465 | "id": "6121d955-82a8-41ac-b51c-b7a5c599b6ec",
466 | "metadata": {},
467 | "outputs": [],
468 | "source": [
469 | "route_df.to_csv('../data/route_df.csv', index=False)"
470 | ]
471 | },
472 | {
473 | "cell_type": "markdown",
474 | "id": "fe96a890-5947-48c5-894a-eb5889bf22b5",
475 | "metadata": {},
476 | "source": [
477 | "\n",
478 | "\n",
479 | "## Basic visualization\n",
480 | "- You can use Matplotlib to visualize all data points\n",
481 | "- It won't show the map, but you should still see what the route looks like:"
482 | ]
483 | },
484 | {
485 | "cell_type": "code",
486 | "execution_count": 35,
487 | "id": "56d0931e-a147-47cc-9772-036b01bee8db",
488 | "metadata": {},
489 | "outputs": [
490 | {
491 | "data": {
492 | "image/png": "\n",
493 | "text/plain": [
494 | ""
495 | ]
496 | },
497 | "metadata": {
498 | "needs_background": "light"
499 | },
500 | "output_type": "display_data"
501 | }
502 | ],
503 | "source": [
504 | "plt.figure(figsize=(14, 8))\n",
505 | "plt.scatter(route_df['longitude'], route_df['latitude'], color='#101010')\n",
506 | "plt.title('Route latitude and longitude points', size=20);"
507 | ]
508 | },
509 | {
510 | "cell_type": "markdown",
511 | "id": "5f91006b-b7f5-4c28-9c46-10482c8871a7",
512 | "metadata": {},
513 | "source": [
514 | "- You'll see in the following notebook how to visualize the route on a map with Folium"
515 | ]
516 | }
517 | ],
518 | "metadata": {
519 | "kernelspec": {
520 | "display_name": "Python 3 (ipykernel)",
521 | "language": "python",
522 | "name": "python3"
523 | },
524 | "language_info": {
525 | "codemirror_mode": {
526 | "name": "ipython",
527 | "version": 3
528 | },
529 | "file_extension": ".py",
530 | "mimetype": "text/x-python",
531 | "name": "python",
532 | "nbconvert_exporter": "python",
533 | "pygments_lexer": "ipython3",
534 | "version": "3.9.4"
535 | }
536 | },
537 | "nbformat": 4,
538 | "nbformat_minor": 5
539 | }
540 |
--------------------------------------------------------------------------------
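Companion note for notebook 001: the notebook relies on `gpxpy` to walk tracks, segments, and points. The sketch below shows roughly the same extraction using only the standard library, which can help when inspecting what a GPX file actually contains. It assumes a GPX 1.1 file (namespace `http://www.topografix.com/GPX/1/1`); the `SAMPLE_GPX` string and the `read_track_points` helper are hypothetical names made up for this illustration, and the three points are copied from the notebook output with rounded coordinates. It is not how `gpxpy` is implemented.

```python
import xml.etree.ElementTree as ET

# Namespace used by GPX 1.1 files (assumed here; older GPX 1.0 files differ).
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Hypothetical three-point sample; values taken from the notebook output.
SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="example">
  <trk>
    <name>Zg288</name>
    <trkseg>
      <trkpt lat="45.77248" lon="15.95804"><ele>113.96</ele></trkpt>
      <trkpt lat="45.77277" lon="15.95909"><ele>115.82</ele></trkpt>
      <trkpt lat="45.77327" lon="15.958795"><ele>116.15</ele></trkpt>
    </trkseg>
  </trk>
</gpx>"""


def read_track_points(gpx_text):
    """Return a list of dicts with latitude, longitude, and elevation."""
    root = ET.fromstring(gpx_text)
    points = []
    # Walk every track -> segment -> point, like the nested loop in the notebook.
    for trkpt in root.findall("gpx:trk/gpx:trkseg/gpx:trkpt", GPX_NS):
        ele = trkpt.find("gpx:ele", GPX_NS)
        points.append({
            "latitude": float(trkpt.get("lat")),
            "longitude": float(trkpt.get("lon")),
            # Elevation is optional in GPX, so fall back to None when missing.
            "elevation": float(ele.text) if ele is not None else None,
        })
    return points


sample_points = read_track_points(SAMPLE_GPX)
print(sample_points[0])
```

The resulting list of dicts has the same shape as `route_info` in the notebook, so it could be passed to `pd.DataFrame` in the same way.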
/003_Elevation_and_Distance_Between_Points.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "79181b70-9da9-4726-af5e-d5b9769eab9d",
6 | "metadata": {},
7 | "source": [
8 | "# Data Science for Cycling #3 - How To Calculate Elevation Difference and Distance From a GPX Route File\n",
9 | "- Notebook 3/6\n",
10 | "- Make sure to have the `haversine` package installed\n",
11 | "\n",
12 | " pip install haversine\n",
13 | " \n",
14 | "- Let's import the libraries and tweak Matplotlib's default stylings:"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "id": "1f87a665-8046-4cfe-ae98-7e32fcf8e7e1",
21 | "metadata": {},
22 | "outputs": [],
23 | "source": [
24 | "import numpy as np\n",
25 | "import pandas as pd\n",
26 | "import matplotlib.pyplot as plt\n",
27 | "import haversine as hs\n",
28 | "\n",
29 | "plt.rcParams['figure.figsize'] = (16, 6)\n",
30 | "plt.rcParams['axes.spines.top'] = False\n",
31 | "plt.rcParams['axes.spines.right'] = False"
32 | ]
33 | },
34 | {
35 | "cell_type": "markdown",
36 | "id": "43fa1bdf-f1c7-4015-b372-b880d364271e",
37 | "metadata": {},
38 | "source": [
39 | "- Let's read in the dataset\n",
40 | "- We've saved it to a CSV file in the first notebook:"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 2,
46 | "id": "d4f7f190-46ba-44bc-8566-7a9a1bfe69e4",
47 | "metadata": {},
48 | "outputs": [
49 | {
50 | "data": {
110 | "text/plain": [
111 | " latitude longitude elevation\n",
112 | "0 45.772480 15.958040 113.96\n",
113 | "1 45.772770 15.959090 115.82\n",
114 | "2 45.773270 15.958795 116.15\n",
115 | "3 45.773770 15.958500 116.12\n",
116 | "4 45.774235 15.959335 115.98"
117 | ]
118 | },
119 | "execution_count": 2,
120 | "metadata": {},
121 | "output_type": "execute_result"
122 | }
123 | ],
124 | "source": [
125 | "route_df = pd.read_csv('../data/route_df.csv')\n",
126 | "route_df.head()"
127 | ]
128 | },
129 | {
130 | "cell_type": "markdown",
131 | "id": "9df4ae0b-268e-4923-8b10-3150b33c0b8a",
132 | "metadata": {},
133 | "source": [
134 | "\n",
135 | "\n",
136 | "## Calculate elevation difference\n",
137 | "- Put simply, you can use Pandas' `diff()` method to difference a column\n",
138 | "- If you apply it to the `elevation` column, you get the elevation difference between point T and the point before it, T-1\n",
139 | "- The first differenced value will be NaN, as there's no data point before it:"
140 | ]
141 | },
142 | {
143 | "cell_type": "code",
144 | "execution_count": 3,
145 | "id": "70b6b1db-27bd-4f46-ab85-18f049a87c96",
146 | "metadata": {},
147 | "outputs": [
148 | {
149 | "data": {
215 | "text/plain": [
216 | " latitude longitude elevation elevation_diff\n",
217 | "0 45.772480 15.958040 113.96 NaN\n",
218 | "1 45.772770 15.959090 115.82 1.86\n",
219 | "2 45.773270 15.958795 116.15 0.33\n",
220 | "3 45.773770 15.958500 116.12 -0.03\n",
221 | "4 45.774235 15.959335 115.98 -0.14"
222 | ]
223 | },
224 | "execution_count": 3,
225 | "metadata": {},
226 | "output_type": "execute_result"
227 | }
228 | ],
229 | "source": [
230 | "route_df['elevation_diff'] = route_df['elevation'].diff()\n",
231 | "\n",
232 | "route_df.head()"
233 | ]
234 | },
235 | {
236 | "cell_type": "markdown",
237 | "id": "c9aacdce-adad-4e02-be5e-af529a9f7ae6",
238 | "metadata": {},
239 | "source": [
240 | "\n",
241 | "\n",
242 | "## Calculate distance between geolocations\n",
243 | "- Calculating distance is tricky in this case\n",
244 | "- It's best to use Google's API\n",
245 | " - But it's not free\n",
246 | " - And it only applies to roads, which is limiting for, say, mountain bike trails\n",
247 | "- The best free option is to use **haversine distance**\n",
248 | " - It calculates a great circle distance between two points on a sphere given their latitudes and longitudes\n",
249 | " - Learn more: https://en.wikipedia.org/wiki/Haversine_formula\n",
250 | "- The GPX route measures 36.4 kilometers and has 835 data points\n",
251 | " - Which means there are on average 43.6 meters between two points\n",
252 | " - Haversine distance will be off, but hopefully not much\n",
253 | "- Let's first define a function to calculate haversine distance:"
254 | ]
255 | },
256 | {
257 | "cell_type": "code",
258 | "execution_count": 4,
259 | "id": "cc860cc9-7b65-4952-ad30-73ec73e8eef2",
260 | "metadata": {
261 | "tags": []
262 | },
263 | "outputs": [],
264 | "source": [
265 | "def haversine_distance(lat1, lon1, lat2, lon2) -> float:\n",
266 | " distance = hs.haversine(\n",
267 | " point1=(lat1, lon1),\n",
268 | " point2=(lat2, lon2),\n",
269 | " unit=hs.Unit.METERS\n",
270 | " )\n",
271 | " return np.round(distance, 2)"
272 | ]
273 | },
274 | {
275 | "cell_type": "markdown",
276 | "id": "50eda61d-be83-43ae-9ec3-2966d2f161ca",
277 | "metadata": {},
278 | "source": [
279 | "- Let's test it\n",
280 | "- The example below measures the distance between the first and the second point:"
281 | ]
282 | },
283 | {
284 | "cell_type": "code",
285 | "execution_count": 5,
286 | "id": "8c794c26-ede1-49cb-8106-93cdb46d3e9f",
287 | "metadata": {},
288 | "outputs": [
289 | {
290 | "data": {
291 | "text/plain": [
292 | "87.59"
293 | ]
294 | },
295 | "execution_count": 5,
296 | "metadata": {},
297 | "output_type": "execute_result"
298 | }
299 | ],
300 | "source": [
301 | "haversine_distance(\n",
302 | " lat1=route_df.iloc[0]['latitude'],\n",
303 | " lon1=route_df.iloc[0]['longitude'],\n",
304 | " lat2=route_df.iloc[1]['latitude'],\n",
305 | " lon2=route_df.iloc[1]['longitude']\n",
306 | ")"
307 | ]
308 | },
309 | {
310 | "cell_type": "markdown",
311 | "id": "e8023c63-73f3-41db-9f2d-fdaabaf531b9",
312 | "metadata": {},
313 | "source": [
314 | "- It's 87.59 meters, but we can't verify it through Strava\n",
315 | "- Let's calculate the distances between all data points\n",
316 | " - We have to skip the first row as there's no data point before it:"
317 | ]
318 | },
319 | {
320 | "cell_type": "code",
321 | "execution_count": 6,
322 | "id": "90284425-1392-439f-96b0-992eb9728bf2",
323 | "metadata": {},
324 | "outputs": [
325 | {
326 | "data": {
398 | "text/plain": [
399 | " latitude longitude elevation elevation_diff distance\n",
400 | "0 45.772480 15.958040 113.96 NaN NaN\n",
401 | "1 45.772770 15.959090 115.82 1.86 87.59\n",
402 | "2 45.773270 15.958795 116.15 0.33 60.12\n",
403 | "3 45.773770 15.958500 116.12 -0.03 60.12\n",
404 | "4 45.774235 15.959335 115.98 -0.14 82.87"
405 | ]
406 | },
407 | "execution_count": 6,
408 | "metadata": {},
409 | "output_type": "execute_result"
410 | }
411 | ],
412 | "source": [
413 | "distances = [np.nan]\n",
414 | "\n",
415 | "for i in range(len(route_df)):\n",
416 | " if i == 0:\n",
417 | " continue\n",
418 | " else:\n",
419 | " distances.append(haversine_distance(\n",
420 | " lat1=route_df.iloc[i - 1]['latitude'],\n",
421 | " lon1=route_df.iloc[i - 1]['longitude'],\n",
422 | " lat2=route_df.iloc[i]['latitude'],\n",
423 | " lon2=route_df.iloc[i]['longitude']\n",
424 | " ))\n",
425 | " \n",
426 | "route_df['distance'] = distances\n",
427 | "route_df.head()"
428 | ]
429 | },
430 | {
431 | "cell_type": "markdown",
432 | "id": "c069b292-2e8d-435d-a153-54b93541bf97",
433 | "metadata": {},
434 | "source": [
435 | "### Total Uphill\n",
436 | "- How to calculate it?\n",
437 | "- Simple - subset the DataFrame so only the rows with positive `elevation_diff` are kept, and then sum the elevation differences:"
438 | ]
439 | },
440 | {
441 | "cell_type": "code",
442 | "execution_count": 7,
443 | "id": "0e8fc860-0c4d-401f-8f7a-e2420682d93d",
444 | "metadata": {},
445 | "outputs": [
446 | {
447 | "data": {
448 | "text/plain": [
449 | "311.97999999999996"
450 | ]
451 | },
452 | "execution_count": 7,
453 | "metadata": {},
454 | "output_type": "execute_result"
455 | }
456 | ],
457 | "source": [
458 | "route_df[route_df['elevation_diff'] >= 0]['elevation_diff'].sum()"
459 | ]
460 | },
461 | {
462 | "cell_type": "markdown",
463 | "id": "924decb9-0bbf-4638-bd25-2161aedee92c",
464 | "metadata": {},
465 | "source": [
466 | "- The official Strava route states there's 288 meters of elevation gain, so our 312 meters is about 8% too high"
467 | ]
468 | },
469 | {
470 | "cell_type": "markdown",
471 | "id": "71df9dc0-2bde-4934-9268-7bf553fcf90f",
472 | "metadata": {},
473 | "source": [
474 | "### Total Distance\n",
475 | "- To calculate the total distance, simply sum the `distance` column:"
476 | ]
477 | },
478 | {
479 | "cell_type": "code",
480 | "execution_count": 8,
481 | "id": "46462286-22d7-4a5c-809b-34731c85fec1",
482 | "metadata": {},
483 | "outputs": [
484 | {
485 | "data": {
486 | "text/plain": [
487 | "36449.990000000005"
488 | ]
489 | },
490 | "execution_count": 8,
491 | "metadata": {},
492 | "output_type": "execute_result"
493 | }
494 | ],
495 | "source": [
496 | "route_df['distance'].sum()"
497 | ]
498 | },
499 | {
500 | "cell_type": "markdown",
501 | "id": "674046ac-74ea-43d0-a823-8d36ad2b6252",
502 | "metadata": {},
503 | "source": [
504 | "- The official Strava route states 36.4 km - our 36.45 km matches it closely, even with the simple haversine distance calculation!"
505 | ]
506 | },
507 | {
508 | "cell_type": "markdown",
509 | "id": "2a078176-9b0f-4697-9a69-3b5fc1ee881a",
510 | "metadata": {},
511 | "source": [
512 | "\n",
513 | "\n",
514 | "## Visualize the Elevation profile\n",
515 | "- Let's see if our calculations make sense by visualizing the elevation profile\n",
516 | " - Shows meters of climbing at different distances\n",
517 | "- To make things simpler, we'll calculate a cumulative sum for elevation and distance"
518 | ]
519 | },
520 | {
521 | "cell_type": "code",
522 | "execution_count": 9,
523 | "id": "22527405-8ae9-48ad-b9b9-7cb94d89ec34",
524 | "metadata": {},
525 | "outputs": [
526 | {
527 | "data": {
611 | "text/plain": [
612 | " latitude longitude elevation elevation_diff distance cum_elevation \\\n",
613 | "0 45.772480 15.958040 113.96 NaN NaN NaN \n",
614 | "1 45.772770 15.959090 115.82 1.86 87.59 1.86 \n",
615 | "2 45.773270 15.958795 116.15 0.33 60.12 2.19 \n",
616 | "3 45.773770 15.958500 116.12 -0.03 60.12 2.16 \n",
617 | "4 45.774235 15.959335 115.98 -0.14 82.87 2.02 \n",
618 | "\n",
619 | " cum_distance \n",
620 | "0 NaN \n",
621 | "1 87.59 \n",
622 | "2 147.71 \n",
623 | "3 207.83 \n",
624 | "4 290.70 "
625 | ]
626 | },
627 | "execution_count": 9,
628 | "metadata": {},
629 | "output_type": "execute_result"
630 | }
631 | ],
632 | "source": [
633 | "route_df['cum_elevation'] = route_df['elevation_diff'].cumsum()\n",
634 | "route_df['cum_distance'] = route_df['distance'].cumsum()\n",
635 | "\n",
636 | "route_df.head()"
637 | ]
638 | },
639 | {
640 | "cell_type": "markdown",
641 | "id": "8aabdbb9-e59d-410e-96a0-8e76fbb55b6f",
642 | "metadata": {},
643 | "source": [
644 | "- Let's get rid of the NaNs\n",
645 | "- We'll fill the NaNs with zeros, as that makes more sense for this dataset:"
646 | ]
647 | },
648 | {
649 | "cell_type": "code",
650 | "execution_count": 10,
651 | "id": "3b152671-7b3c-4e7f-8116-3e52a1f5e00b",
652 | "metadata": {},
653 | "outputs": [
654 | {
655 | "data": {
739 | "text/plain": [
740 | " latitude longitude elevation elevation_diff distance cum_elevation \\\n",
741 | "0 45.772480 15.958040 113.96 0.00 0.00 0.00 \n",
742 | "1 45.772770 15.959090 115.82 1.86 87.59 1.86 \n",
743 | "2 45.773270 15.958795 116.15 0.33 60.12 2.19 \n",
744 | "3 45.773770 15.958500 116.12 -0.03 60.12 2.16 \n",
745 | "4 45.774235 15.959335 115.98 -0.14 82.87 2.02 \n",
746 | "\n",
747 | " cum_distance \n",
748 | "0 0.00 \n",
749 | "1 87.59 \n",
750 | "2 147.71 \n",
751 | "3 207.83 \n",
752 | "4 290.70 "
753 | ]
754 | },
755 | "execution_count": 10,
756 | "metadata": {},
757 | "output_type": "execute_result"
758 | }
759 | ],
760 | "source": [
761 | "route_df = route_df.fillna(0)\n",
762 | "\n",
763 | "route_df.head()"
764 | ]
765 | },
766 | {
767 | "cell_type": "markdown",
768 | "id": "f85108d7-73ba-401a-8187-d7a40266d057",
769 | "metadata": {},
770 | "source": [
771 | "- We'll save this dataset for the upcoming notebooks:"
772 | ]
773 | },
774 | {
775 | "cell_type": "code",
776 | "execution_count": 11,
777 | "id": "87578b62-af38-406b-9dae-2fdf6e06ddc0",
778 | "metadata": {},
779 | "outputs": [],
780 | "source": [
781 | "route_df.to_csv('../data/route_df_elevation_distance.csv', index=False)"
782 | ]
783 | },
784 | {
785 | "cell_type": "markdown",
786 | "id": "036d18e8-08bf-4d18-baf4-a20b54ab5317",
787 | "metadata": {},
788 | "source": [
789 | "- Finally, let's use Matplotlib to plot the elevation profile:"
790 | ]
791 | },
792 | {
793 | "cell_type": "code",
794 | "execution_count": 14,
795 | "id": "1050d8cc-550e-43cd-a8e4-da1430f4549e",
796 | "metadata": {},
797 | "outputs": [
798 | {
799 | "data": {
800 | "image/png": "\n",
801 | "text/plain": [
802 | ""
803 | ]
804 | },
805 | "metadata": {
806 | "needs_background": "light"
807 | },
808 | "output_type": "display_data"
809 | }
810 | ],
811 | "source": [
812 | "plt.plot(route_df['cum_distance'], route_df['cum_elevation'], color='#101010', lw=3)\n",
813 | "plt.title('Route elevation profile', size=20)\n",
814 | "plt.xlabel('Distance in meters', size=14)\n",
815 | "plt.ylabel('Elevation in meters', size=14);\n",
816 | "plt.savefig('fig.jpg', dpi=300, bbox_inches='tight')"
817 | ]
818 | },
819 | {
820 | "cell_type": "markdown",
821 | "id": "34ae3aa8-5722-42fd-86aa-bd214dbba852",
822 | "metadata": {},
823 | "source": [
824 | "- It's difficult to say if it looks exactly like the one on Strava, but it looks almost identical!\n",
825 | "- In the next notebook, you'll learn how to calculate gradients based on the elevation difference and distance between the data points."
826 | ]
827 | }
828 | ],
829 | "metadata": {
830 | "kernelspec": {
831 | "display_name": "Python 3 (ipykernel)",
832 | "language": "python",
833 | "name": "python3"
834 | },
835 | "language_info": {
836 | "codemirror_mode": {
837 | "name": "ipython",
838 | "version": 3
839 | },
840 | "file_extension": ".py",
841 | "mimetype": "text/x-python",
842 | "name": "python",
843 | "nbconvert_exporter": "python",
844 | "pygments_lexer": "ipython3",
845 | "version": "3.9.4"
846 | }
847 | },
848 | "nbformat": 4,
849 | "nbformat_minor": 5
850 | }
851 |
--------------------------------------------------------------------------------
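Companion note for notebook 003: the notebook delegates the distance calculation to the `haversine` package. To make the formula itself visible, here is a minimal sketch in plain Python that computes the great-circle distance, then sums per-step distances and positive elevation differences the way the notebook does with `diff()` and `sum()`. The Earth radius constant is an assumption (a commonly used mean radius), `haversine_m` is a hypothetical helper name, and the four points are the first rows of the notebook's table with rounded coordinates, so results will only approximately match the notebook.

```python
import math

# Assumed mean Earth radius in meters; other sources use slightly different values.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# (latitude, longitude, elevation) for the first four rows shown in the notebook.
points = [
    (45.77248, 15.95804, 113.96),
    (45.77277, 15.95909, 115.82),
    (45.77327, 15.958795, 116.15),
    (45.77377, 15.9585, 116.12),
]

# Pair each point with its successor, mirroring the row-by-row loop in the notebook.
pairs = list(zip(points, points[1:]))
distances = [haversine_m(p[0], p[1], q[0], q[1]) for p, q in pairs]
elevation_diffs = [q[2] - p[2] for p, q in pairs]

total_distance = sum(distances)
# Only positive differences count as climbing.
total_uphill = sum(d for d in elevation_diffs if d > 0)

print(round(distances[0], 2), round(total_distance, 2), round(total_uphill, 2))
```

The first step should come out close to the 87.59 meters the notebook reports. Pairing points with `zip` avoids the special case for the first row, because there is simply no pair that starts before it.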