├── .gitignore
├── README.md
├── chapter1.md
├── chapter2.md
├── chapter3.md
├── chapter4.md
├── course.yml
├── courses-introduction-to-python.Rproj
├── datasets
    ├── baseball.csv
    ├── fifa.csv
    └── references.md
├── img
    └── shield_image.png
├── intro-to-python-keynotes.zip
├── requirements.sh
├── scripts
    ├── chapter1_script.md
    ├── chapter2_script.md
    ├── chapter3_script.md
    └── chapter4_script.md
└── slides
    ├── ch4_slides.pdf
    ├── chapter_1_433dcfcfedaee070cbf440491c402e3b.md
    ├── chapter_1_d8fcd4c930027fa4e1c3870c7e7e0ff1.md
    ├── chapter_2_355ed52d2fb0d67508c6a311b7cbc6d3.md
    ├── chapter_2_a0530c4542f10988847b2dbb91f717c3.md
    ├── chapter_2_fc15ba5cb9485456df8589130b519ea3.md
    ├── chapter_3_1204d914b0e53100529827e07441ee6c.md
    ├── chapter_3_8e387776f3a264a745128b68aa8d8f83.md
    ├── chapter_3_cedcfb34350be8545599768f96695cdd.md
    ├── chapter_4_34495ba457d74296794d2a122c9b6e19.md
    ├── chapter_4_a0487c26210f6b71ea98f917734cea3a.md
    ├── chapter_4_ae3238dcc7feb9adecfee0c395fc8dc8.md
    └── timings.json


/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj.user/*
2 | .Rproj.user
3 | .cache
4 | .DS_STORE
5 | .Rhistory
6 | *.html
7 | *.Rproj
8 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Intro to Python for Data Science
 2 | 
 3 | - Teach: https://www.datacamp.com/teach/repositories/288
 4 | - Campus: https://www.datacamp.com/courses/intro-to-python-for-data-science
 5 | - Docs: https://instructor-support.datacamp.com
 6 | 
 7 | This repository contains the source files for the interactive course "Intro to Python for Data Science", hosted at www.datacamp.com. Feel free to suggest improvements!
 8 | 
 9 | Want to create your own DataCamp course? Everybody can teach on DataCamp! Visit https://www.datacamp.com/teach.
10 | 


--------------------------------------------------------------------------------
/chapter1.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title_meta: Chapter 1
  3 | title: Python Basics
  4 | description: >-
  5 |   An introduction to the basic concepts of Python. Learn how to use Python
  6 |   interactively and by using a script. Create your first variables and acquaint
  7 |   yourself with Python's basic data types.
  8 | attachments:
  9 |   slides_link: 'https://projector-video-pdf-converter.datacamp.com/735/chapter1.pdf'
 10 | free_preview: true
 11 | lessons:
 12 |   - nb_of_exercises: 3
 13 |     title: Hello Python!
 14 |   - nb_of_exercises: 5
 15 |     title: Variables and Types
 16 | ---
 17 | 
 18 | ## Hello Python!
 19 | 
 20 | ```yaml
 21 | type: VideoExercise
 22 | key: f644a48d5d
 23 | xp: 50
 24 | ```
 25 | 
 26 | `@projector_key`
 27 | d8fcd4c930027fa4e1c3870c7e7e0ff1
 28 | 
 29 | ---
 30 | 
 31 | ## Your first Python code
 32 | 
 33 | ```yaml
 34 | type: NormalExercise
 35 | key: bdc52f0e19
 36 | lang: python
 37 | xp: 100
 38 | skills:
 39 |   - 2
 40 | ```
 41 | 
 42 | It's time to run your first Python code!
 43 | 
 44 | Head to the code and hit the run code button to see the output.
 45 | 
 46 | `@instructions`
 47 | - Hit the run code button to see the output of `print(5 / 8)`.
 48 | 
 49 | `@hint`
 50 | - Run the code first before submitting your answer so you have time to explore the output.
 51 | 
 52 | `@pre_exercise_code`
 53 | ```{python}
 54 | 
 55 | ```
 56 | 
 57 | `@sample_code`
 58 | ```{python}
 59 | # Hit run code to see the output!
 60 | print(5 / 8)
 61 | ```
 62 | 
 63 | `@solution`
 64 | ```{python}
 65 | # Hit run code to see the output!
 66 | print(5 / 8)
 67 | ```
 68 | 
 69 | `@sct`
 70 | ```{python}
 71 | Ex().has_printout(0, not_printed_msg = "__JINJA__:Have you used `{{sol_call}}` to print out `5 / 8`?")
 72 | success_msg("Great! On to the next one!")
 73 | ```
 74 | 
 75 | ---
 76 | 
 77 | ## Python as a calculator
 78 | 
 79 | ```yaml
 80 | type: NormalExercise
 81 | key: 0f7c039428
 82 | lang: python
 83 | xp: 100
 84 | skills:
 85 |   - 2
 86 | ```
 87 | 
 88 | Python is perfectly suited to do basic calculations. It can do addition, subtraction, multiplication and division.
 89 | 
 90 | The code in the script gives some examples.
 91 | 
 92 | Now it's your turn to practice by writing some code yourself.
 93 | 
 94 | `@instructions`
 95 | - Print the result of subtracting `5` from `5` under `# Subtraction` using `print()`.
 96 | - Print the result of multiplying `3` by `5` under `# Multiplication`.
 97 | 
 98 | `@hint`
 99 | - You'll need to use `print()` to generate an output.
100 | - You can subtract with `-` and multiply with `*`.
101 | 
102 | `@pre_exercise_code`
103 | ```{python}
104 |  
105 | ```
106 | 
107 | `@sample_code`
108 | ```{python}
109 | # Addition and division
110 | print(4 + 5)
111 | print(10 / 2)
112 | 
113 | # Subtraction
114 | print()
115 | 
116 | # Multiplication
117 | 
118 | ```
119 | 
120 | `@solution`
121 | ```{python}
122 | # Addition and division
123 | print(4 + 5)
124 | print(10 / 2)
125 | 
126 | # Subtraction
127 | print(5 - 5)
128 | 
129 | # Multiplication
130 | print(3 * 5)
131 | ```
132 | 
133 | `@sct`
134 | ```{python}
135 | Ex().has_printout(0, not_printed_msg = "Have you used `print(4 + 5)` to print out the result of your sum?")
136 | 
137 | Ex().has_printout(1, not_printed_msg = "Have you used `print(10 / 2)` to print out the result of your division?")
138 | 
139 | Ex().has_printout(2, not_printed_msg = "Have you used `print(5 - 5)` to print out the result of your subtraction?")
140 | 
141 | Ex().has_printout(3, not_printed_msg = "Have you used `print(3 * 5)` to print out the result of your multiplication?")
142 | 
143 | success_msg("That's correct! Python can help you do the math, a characteristic that will be helpful for analysis as we grow our data skills.")
144 | ```
145 | 
146 | ---
147 | 
148 | ## Variables and Types
149 | 
150 | ```yaml
151 | type: VideoExercise
152 | key: c2e396792e
153 | xp: 50
154 | ```
155 | 
156 | `@projector_key`
157 | 433dcfcfedaee070cbf440491c402e3b
158 | 
159 | ---
160 | 
161 | ## Variable Assignment
162 | 
163 | ```yaml
164 | type: NormalExercise
165 | key: 4bf65ad83e
166 | lang: python
167 | xp: 100
168 | skills:
169 |   - 2
170 | ```
171 | 
172 | In Python, a variable allows you to refer to a value with a name. To create a variable `x` with a value of `5`, you use `=`, like this example:
173 | 
174 | ```
175 | x = 5
176 | ```
177 | 
178 | You can now use the name of this variable, `x`, instead of the actual value, `5`.
179 | 
180 | Remember, `=` in Python means _assignment_; it doesn't test equality! Try it in the exercise by replacing `____` with your code.
181 | 
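As a minimal sketch of the difference between assignment and an equality test: the `==` operator used below is standard Python but is not introduced in this exercise, and the names `is_five` and `is_six` are hypothetical, chosen only for illustration.

```python
# Hypothetical names, used only for this illustration.
x = 5               # assignment: the name x now refers to the value 5

is_five = (x == 5)  # comparison: == tests equality and produces a bool
is_six = (x == 6)

print(x)        # 5
print(is_five)  # True
print(is_six)   # False
```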
182 | `@instructions`
183 | - Create a variable `savings` with the value of `100`.
184 | - Check out this variable by typing `print(savings)` in the script.
185 | 
186 | `@hint`
187 | - Type `savings = 100` to create the variable `savings`.
188 | - After creating the variable `savings`, you can type `print(savings)`.
189 | - Your final code should not include any `____`.
190 | 
191 | `@pre_exercise_code`
192 | ```{python}
193 |  
194 | ```
195 | 
196 | `@sample_code`
197 | ```{python}
198 | # Create a variable savings
199 | ____
200 | 
201 | # Print out savings
202 | ____
203 | ```
204 | 
205 | `@solution`
206 | ```{python}
207 | # Create a variable savings
208 | savings = 100
209 | 
210 | # Print out savings
211 | print(savings)
212 | ```
213 | 
214 | `@sct`
215 | ```{python}
216 | Ex().check_object("savings").has_equal_value(incorrect_msg="Assign `100` to the variable `savings`.")
217 | Ex().has_printout(0, not_printed_msg = "Print out `savings`, the variable you created, with `print(savings)`.")
218 | success_msg("Great! Let's try to do some calculations with this variable now!")
219 | ```
220 | 
221 | ---
222 | 
223 | ## Calculations with variables
224 | 
225 | ```yaml
226 | type: NormalExercise
227 | key: ff06cedeb4
228 | lang: python
229 | xp: 100
230 | skills:
231 |   - 2
232 | ```
233 | 
234 | You've now created a savings variable, so let's start saving!
235 | 
236 | Instead of calculating with the actual values, you can use variables instead.
237 | 
238 | How much money would you have saved four months from now, if you saved $10 each month?
239 | 
240 | `@instructions`
241 | - Create a variable `monthly_savings` equal to `10` and a variable `num_months` equal to `4`.
242 | - Multiply `monthly_savings` by `num_months` and assign it to `new_savings`.
243 | - Print the value of `new_savings`.
244 | 
245 | `@hint`
246 | - You can do calculations with variables the same way as with numbers: instead of `10 * 4`, replace the numbers with the variables!
247 | - Use `print()` to see the amount in `new_savings`.
248 | - Take care to spell the variables correctly!
249 | 
250 | `@pre_exercise_code`
251 | ```{python}
252 | 
253 | ```
254 | 
255 | `@sample_code`
256 | ```{python}
257 | # Create the variables monthly_savings and num_months
258 | 
259 | 
260 | 
261 | # Multiply monthly_savings and num_months
262 | new_savings = ____
263 | 
264 | # Print new_savings
265 | 
266 | ```
267 | 
268 | `@solution`
269 | ```{python}
270 | # Create the variables monthly_savings and num_months
271 | monthly_savings = 10
272 | num_months = 4
273 | 
274 | # Multiply monthly_savings and num_months
275 | new_savings = monthly_savings * num_months
276 | 
277 | # Print new_savings
278 | print(new_savings)
279 | ```
280 | 
281 | `@sct`
282 | ```{python}
283 | Ex().check_object("monthly_savings").has_equal_value(incorrect_msg = "Did you save `10` to `monthly_savings` using `monthly_savings = 10`?")
284 | Ex().check_object("num_months").has_equal_value(incorrect_msg = "Did you save `4` to `num_months` using `num_months = 4`?")
285 | Ex().check_object("new_savings").has_equal_value(incorrect_msg = "Did you use the correct variables and symbols to multiply? Expected `monthly_savings * num_months` but got something else.")
286 | # Ex().check_object("total_savings").has_equal_value(incorrect_msg = "Did you use the correct variables and symbols to add? Expected `savings + new_savings` but got something else.")
287 | 
288 | Ex().has_printout(0, not_printed_msg="Remember to print out `new_savings` at the end of your script.")
289 | 
290 | success_msg("You have $40 in new savings!")
291 | ```
292 | 
293 | ---
294 | 
295 | ## Other variable types
296 | 
297 | ```yaml
298 | type: NormalExercise
299 | key: 006b48561f
300 | lang: python
301 | xp: 100
302 | skills:
303 |   - 2
304 | ```
305 | 
306 | In the previous exercise, you worked with the integer Python data type:
307 | 
308 | - `int`, or integer: a number without a fractional part. `savings`, with the value `100`, is an example of an integer.
309 | 
310 | Besides `int`, there are three other very common data types:
311 | 
312 | - `float`, or floating point: a number that has both an integer and a fractional part, separated by a point. `1.1` is an example of a float.
313 | - `str`, or string: a type to represent text. You can use single or double quotes to build a string.
314 | - `bool`, or boolean: a type to represent logical values. It can only be `True` or `False` (the capitalization is important!).
315 | 
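The four types listed above can be checked with the built-in `type()` function. This is a small sketch; the variable names and values are assumptions picked for illustration and are not the ones the exercise asks for.

```python
# Illustrative values only; the exercise below uses different ones.
count = 100        # int: no fractional part
ratio = 1.1        # float: has a fractional part
greeting = "Hi"    # str: text in single or double quotes
is_ready = False   # bool: either True or False (capitalized)

print(type(count))     # <class 'int'>
print(type(ratio))     # <class 'float'>
print(type(greeting))  # <class 'str'>
print(type(is_ready))  # <class 'bool'>
```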
316 | `@instructions`
317 | - Create a new float, `half`, with the value `0.5`.
318 | - Create a new string, `intro`, with the value `"Hello! How are you?"`.
319 | - Create a new boolean, `is_good`, with the value `True`.
320 | 
321 | `@hint`
322 | - To create a variable in Python, use `=`. Make sure to wrap your string in single or double quotes.
323 | - Only two boolean values exist in Python: `True` and `False`. `TRUE`, `true`, `FALSE`, `false` and other versions will not be accepted.
324 | 
325 | `@pre_exercise_code`
326 | ```{python}
327 | 
328 | ```
329 | 
330 | `@sample_code`
331 | ```{python}
332 | # Create a variable half
333 | 
334 | 
335 | # Create a variable intro
336 | 
337 | 
338 | # Create a variable is_good
339 | 
340 | ```
341 | 
342 | `@solution`
343 | ```{python}
344 | # Create a variable half
345 | half = 0.5
346 | 
347 | # Create a variable intro
348 | intro = "Hello! How are you?"
349 | 
350 | # Create a variable is_good
351 | is_good = True
352 | ```
353 | 
354 | `@sct`
355 | ```{python}
356 | Ex().check_object("half").has_equal_value(incorrect_msg = "Did you save the float, `0.5` to `half`?")
357 | 
358 | Ex().check_object("intro").has_equal_value(incorrect_msg = "Hmm, something is incorrect in your `intro` variable. Double check the spelling and make sure you've used quotation marks.")
359 | 
360 | Ex().check_object("is_good").has_equal_value(incorrect_msg = "Did you capitalize the boolean value? Remember you don't need to use quotation marks here.")
361 | 
362 | success_msg("Nice!")
363 | ```
364 | 
365 | ---
366 | 
367 | ## Operations with other types
368 | 
369 | ```yaml
370 | type: BulletExercise
371 | key: 4d0d83cc02
372 | xp: 100
373 | ```
374 | 
375 | Variables come in different types in Python. You can see the type of a variable by using `type()`. For example, to see the type of `a`, execute `type(a)`.
376 | 
377 | Different types behave differently in Python. When you sum two strings, for example, you'll get different behavior than when you sum two integers or two booleans.
378 | 
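To make the point about type-dependent behavior concrete, here is a hedged sketch of what `+` does for integers, strings, and booleans. The names and values are hypothetical and deliberately differ from the ones in the exercises.

```python
# Illustrative values only; not part of the exercise.
int_sum = 2 + 3          # integers are added numerically
str_sum = "ab" + "cd"    # strings are concatenated
bool_sum = True + False  # booleans act like 1 and 0 in arithmetic

print(int_sum)         # 5
print(str_sum)         # abcd
print(bool_sum)        # 1
print(type(bool_sum))  # <class 'int'>
```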
379 | Time for you to test this out.
380 | 
381 | `@pre_exercise_code`
382 | ```{python}
383 | 
384 | ```
385 | 
386 | ***
387 | 
388 | ```yaml
389 | type: NormalExercise
390 | key: f4e91c4ae9
391 | xp: 50
392 | ```
393 | 
394 | `@instructions`
395 | - Add `savings` and `new_savings` and assign it to `total_savings`.
396 | - Use `type()` to print the resulting type of `total_savings`.
397 | 
398 | `@hint`
399 | - Assign `savings + new_savings` to a new variable, `total_savings`.
400 | - To print the type of a variable `x`, use `print(type(x))`.
401 | 
402 | `@sample_code`
403 | ```{python}
404 | savings = 100
405 | new_savings = 40
406 | 
407 | # Calculate total_savings using savings and new_savings
408 | ____
409 | print(total_savings)
410 | 
411 | # Print the type of total_savings
412 | print(____)
413 | ```
414 | 
415 | `@solution`
416 | ```{python}
417 | savings = 100
418 | new_savings = 40
419 | 
420 | # Calculate total_savings using savings and new_savings
421 | total_savings = savings + new_savings
422 | print(total_savings)
423 | 
424 | # Print the type of total_savings
425 | print(type(total_savings))
426 | ```
427 | 
428 | `@sct`
429 | ```{python}
430 | # predefined
431 | msg = "You don't have to change or remove the predefined variables."
432 | 
433 | Ex().multi(
434 |     check_object('savings', missing_msg=msg).has_equal_value(incorrect_msg=msg),
435 |     check_object('new_savings', missing_msg=msg).has_equal_value(incorrect_msg=msg)
436 | )
437 | 
438 | Ex().multi(
439 |     check_object("total_savings").has_equal_value(incorrect_msg="Add `savings` and `new_savings` to create the `total_savings` variable."),
440 |     has_printout(1, not_printed_msg = "__JINJA__:Use `{{sol_call}}` to print out the type of `total_savings`.")
441 | )
442 | ```
443 | 
444 | ***
445 | 
446 | ```yaml
447 | type: NormalExercise
448 | key: f54fbf9bd9
449 | xp: 50
450 | ```
451 | 
452 | `@instructions`
453 | - Calculate the sum of `intro` and `intro` and assign the result to `doubleintro`.
454 | - Print out `doubleintro`. Did you expect this?
455 | 
456 | `@hint`
457 | - Assign `intro + intro` to a new variable, `doubleintro`.
458 | - To print a variable `x`, write `print(x)` in the script.
459 | 
460 | `@sample_code`
461 | ```{python}
462 | intro = "Hello! How are you?"
463 | 
464 | # Assign sum of intro and intro to doubleintro
465 | ____
466 | 
467 | # Print out doubleintro
468 | print(____)
469 | ```
470 | 
471 | `@solution`
472 | ```{python}
473 | intro = "Hello! How are you?"
474 | 
475 | # Assign sum of intro and intro to doubleintro
476 | doubleintro = intro + intro
477 | 
478 | # Print out doubleintro
479 | print(doubleintro)
480 | ```
481 | 
482 | `@sct`
483 | ```{python}
484 | # predefined
485 | msg = "You don't have to change or remove the predefined variables."
486 | 
487 | Ex().check_object('intro', missing_msg=msg).has_equal_value(incorrect_msg=msg)
488 | 
489 | Ex().multi(
490 |     check_object("doubleintro").has_equal_value(incorrect_msg  = "Have you stored the result of `intro + intro` in `doubleintro`?"),
491 |     has_printout(0, not_printed_msg = "Don't forget to print out `doubleintro`.")
492 | )
493 | 
494 | success_msg("Nice. Notice how `intro + intro` causes `\"Hello! How are you?\"` and `\"Hello! How are you?\"` to be pasted together.")
495 | ```
496 | 


--------------------------------------------------------------------------------
/chapter2.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title_meta: Chapter 2
  3 | title: Python Lists
  4 | description: >-
  5 |   Learn to store, access, and manipulate data in lists: the first step toward
  6 |   efficiently working with huge amounts of data.
  7 | attachments:
  8 |   slides_link: 'https://projector-video-pdf-converter.datacamp.com/735/chapter2.pdf'
  9 | lessons:
 10 |   - nb_of_exercises: 4
 11 |     title: Python Lists
 12 |   - nb_of_exercises: 4
 13 |     title: Subsetting Lists
 14 |   - nb_of_exercises: 5
 15 |     title: Manipulating Lists
 16 | ---
 17 | 
 18 | ## Python Lists
 19 | 
 20 | ```yaml
 21 | type: VideoExercise
 22 | key: a5886d213f
 23 | xp: 50
 24 | ```
 25 | 
 26 | `@projector_key`
 27 | a0530c4542f10988847b2dbb91f717c3
 28 | 
 29 | ---
 30 | 
 31 | ## Create a list
 32 | 
 33 | ```yaml
 34 | type: NormalExercise
 35 | key: e6c527bf41
 36 | lang: python
 37 | xp: 100
 38 | skills:
 39 |   - 2
 40 | ```
 41 | 
 42 | A list is a **compound data type**; you can group values together, like this:
 43 | 
 44 | ```
 45 | a = "is"
 46 | b = "nice"
 47 | my_list = ["my", "list", a, b]
 48 | ```
 49 | 
 50 | After measuring the height of your family, you decide to collect some information on the house you're living in. The areas of the different parts of your house are stored in separate variables in the exercise.
 51 | 
 52 | `@instructions`
 53 | - Create a list, `areas`, that contains the area of the hallway (`hall`), kitchen (`kit`), living room (`liv`), bedroom (`bed`) and bathroom (`bath`), in this order. Use the predefined variables.
 54 | - Print `areas` with the `print()` function.
 55 | 
 56 | `@hint`
 57 | - You can use the variables that have already been created to build the list: `areas = [hall, kit, ...]`.
 58 | - Make sure to use square brackets `[]` rather than parentheses `()`.
 59 | - You don't need to use quotation marks when storing variables within a list.
 60 | - Type `print(areas)` to print out the list when submitting.
 61 | 
 62 | `@pre_exercise_code`
 63 | ```{python}
 64 | 
 65 | ```
 66 | 
 67 | `@sample_code`
 68 | ```{python}
 69 | hall = 11.25
 70 | kit = 18.0
 71 | liv = 20.0
 72 | bed = 10.75
 73 | bath = 9.50
 74 | 
 75 | # Create list areas
 76 | 
 77 | 
 78 | # Print areas
 79 | 
 80 | ```
 81 | 
 82 | `@solution`
 83 | ```{python}
 84 | hall = 11.25
 85 | kit = 18.0
 86 | liv = 20.0
 87 | bed = 10.75
 88 | bath = 9.50
 89 | 
 90 | # Create list areas
 91 | areas = [hall, kit, liv, bed, bath]
 92 | 
 93 | # Print areas
 94 | print(areas)
 95 | ```
 96 | 
 97 | `@sct`
 98 | ```{python}
 99 | predef_msg = "Don't remove or edit the predefined variables!"
100 | areas_msg = "Define `areas` as the list containing all the area variables, in the correct order: `[hall, kit, liv, bed, bath]`. Watch out for typos. The list shouldn't contain anything else!"
101 | 
102 | Ex().check_correct(
103 |     has_printout(0, not_printed_msg = "__JINJA__:Have you used `{{sol_call}}` to print out the `areas` list at the end of your script?"),
104 |     check_correct(
105 |         check_object("areas").has_equal_value(incorrect_msg = areas_msg),
106 |         multi(
107 |             check_object('hall', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
108 |             check_object('kit', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
109 |             check_object('liv', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
110 |             check_object('bed', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
111 |             check_object('bath', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg)
112 |         )
113 |     )
114 | )
115 | 
116 | success_msg("Nice! A list is way better here, isn't it?")
117 | ```
118 | 
119 | ---
120 | 
121 | ## Create lists with different types
122 | 
123 | ```yaml
124 | type: NormalExercise
125 | key: 1702a8bcdc
126 | lang: python
127 | xp: 100
128 | skills:
129 |   - 2
130 | ```
131 | 
132 | Although it's not really common, a list can also contain a mix of Python types including strings, floats, and booleans.
133 | 
134 | You're now going to add the room names to your list, so you can easily see both the room name and size together.
135 | 
136 | Some of the code has been provided for you to get you started. Pay attention here! `"bathroom"` is a string, while `bath` is a variable that represents the float `9.50` you specified earlier.
137 | 
138 | `@instructions`
139 | - Finish the code that creates the `areas` list. Build the list so that the list first contains the name of each room as a string and then its area. In other words, add the strings `"hallway"`, `"kitchen"` and `"bedroom"` at the appropriate locations.
140 | - Print `areas` again; is the printout more informative this time?
141 | 
142 | `@hint`
143 | - The first four elements of the list `areas` are coded as `["hallway", hall, "kitchen", kit, ...]`.
144 | - A string will need to be in quotation marks `""`.
145 | 
146 | `@pre_exercise_code`
147 | ```{python}
148 | 
149 | ```
150 | 
151 | `@sample_code`
152 | ```{python}
153 | hall = 11.25
154 | kit = 18.0
155 | liv = 20.0
156 | bed = 10.75
157 | bath = 9.50
158 | 
159 | # Adapt list areas
160 | areas = [____, hall, ____, kit, "living room", liv, ____, bed, "bathroom", bath]
161 | 
162 | # Print areas
163 | ____
164 | ```
165 | 
166 | `@solution`
167 | ```{python}
168 | hall = 11.25
169 | kit = 18.0
170 | liv = 20.0
171 | bed = 10.75
172 | bath = 9.50
173 | 
174 | # Adapt list areas
175 | areas = ["hallway", hall, "kitchen", kit, "living room", liv, "bedroom", bed, "bathroom", bath]
176 | 
177 | # Print areas
178 | print(areas)
179 | ```
180 | 
181 | `@sct`
182 | ```{python}
183 | objs = ["hall", "kit", "liv", "bed", "bath"]
184 | predef_msg = "Don't remove or edit the predefined variables!"
185 | areas_msg = "You didn't assign the correct value to `areas`. Have another look at the instructions. Make sure to place the room name before the variable containing the area each time. The order matters here! Watch out for typos."
186 | 
187 | Ex().check_correct(
188 |   check_object("areas").has_equal_value(incorrect_msg = areas_msg),
189 |   multi([ check_object(obj, missing_msg = predef_msg).has_equal_value(incorrect_msg = predef_msg) for obj in objs])
190 | )
191 | 
192 | Ex().has_printout(0, not_printed_msg = "__JINJA__:Have you used `{{sol_call}}` to print out the `areas` list at the end of your script?")
193 | 
194 | success_msg("Nice! This list contains both strings and floats, but that's not a problem for Python!")
195 | ```
196 | 
197 | ---
198 | 
199 | ## List of lists
200 | 
201 | ```yaml
202 | type: NormalExercise
203 | key: 9158c577b0
204 | lang: python
205 | xp: 100
206 | skills:
207 |   - 2
208 | ```
209 | 
210 | As a data scientist, you'll often be dealing with a lot of data, and it will make sense to group some of this data.
211 | 
212 | Instead of creating a list containing strings and floats, representing the names and areas of the rooms in your house, you can create a list of lists.
213 | 
214 | Remember: `"hallway"` is a string, while `hall` is a variable that represents the float `11.25` you specified earlier.
215 | 
216 | `@instructions`
217 | - Finish the list of lists so that it also contains the bedroom and bathroom data. Make sure you enter these in order!
218 | - Print out `house`; does this way of structuring your data make more sense?
219 | 
220 | `@hint`
221 | - Add _sublists_ to the `house` list by adding `["bedroom", bed]` and `["bathroom", bath]` inside the square brackets.
222 | - Remember to include a comma `,` after each sublist.
223 | - To print a variable `x`, write `print(x)` on a new line.
224 | 
225 | `@pre_exercise_code`
226 | ```{python}
227 | 
228 | ```
229 | 
230 | `@sample_code`
231 | ```{python}
232 | hall = 11.25
233 | kit = 18.0
234 | liv = 20.0
235 | bed = 10.75
236 | bath = 9.50
237 | 
238 | # House information as list of lists
239 | house = [["hallway", hall],
240 |          ["kitchen", kit],
241 |          ["living room", liv],
242 |          ____,
243 |          ____]
244 | 
245 | # Print out house
246 | ____
247 | ```
248 | 
249 | `@solution`
250 | ```{python}
251 | hall = 11.25
252 | kit = 18.0
253 | liv = 20.0
254 | bed = 10.75
255 | bath = 9.50
256 | 
257 | # House information as list of lists
258 | house = [["hallway", hall],
259 |          ["kitchen", kit],
260 |          ["living room", liv],
261 |          ["bedroom", bed],
262 |          ["bathroom", bath]]
263 | 
264 | # Print out house
265 | print(house)
266 | ```
267 | 
268 | `@sct`
269 | ```{python}
270 | predef_msg = "Don't remove or edit the predefined variables!"
271 | house_msg = "You didn't assign the correct value to `house`. Have another look at the instructions. Extend the list of lists so it incorporates a list for each pair of room name and room area. Mind the order and typos!"
272 | 
273 | Ex().check_correct(
274 |     check_object("house").has_equal_value(incorrect_msg = house_msg),
275 |     multi(
276 |         check_object('hall', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
277 |         check_object('kit', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
278 |         check_object('liv', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
279 |         check_object('bed', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
280 |         check_object('bath', missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg)
281 |     )
282 | )
283 | 
284 | Ex().has_printout(0, not_printed_msg = "__JINJA__:Have you used `{{sol_call}}` to print out the contents of `house`?")
285 | 
286 | success_msg("Great! Get ready to learn about list subsetting!")
287 | ```
288 | 
289 | ---
290 | 
291 | ## Subsetting Lists
292 | 
293 | ```yaml
294 | type: VideoExercise
295 | key: c076b5a69c
296 | xp: 50
297 | ```
298 | 
299 | `@projector_key`
300 | fc15ba5cb9485456df8589130b519ea3
301 | 
302 | ---
303 | 
304 | ## Subset and conquer
305 | 
306 | ```yaml
307 | type: NormalExercise
308 | key: c3ce582e32
309 | lang: python
310 | xp: 100
311 | skills:
312 |   - 2
313 | ```
314 | 
315 | Subsetting Python lists is a piece of cake. Take the code sample below, which creates a list `x` and then selects "b" from it. Remember that this is the second element, so it has index 1. You can also use negative indexing.
316 | 
317 | ```
318 | x = ["a", "b", "c", "d"]
319 | x[1]
320 | x[-3] # same result!
321 | ```
322 | 
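The relationship between positive and negative indexes can be sketched as follows. It reuses the list `x` from the sample above; the other names are hypothetical and added only for illustration.

```python
x = ["a", "b", "c", "d"]

# Positive indexes count from the front, starting at 0.
second = x[1]
# Negative indexes count from the back, starting at -1.
same_element = x[-3]
last = x[-1]

# For a list of length n, x[-k] refers to the same element as x[n - k].
also_last = x[len(x) - 1]

print(second, same_element)  # b b
print(last, also_last)       # d d
```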
323 | Remember the `areas` list from before, containing both strings and floats? Its definition is already in the script. Can you add the correct code to do some Python subsetting?
324 | 
325 | `@instructions`
326 | - Print out the second element from the `areas` list (it has the value `11.25`).
327 | - Subset and print out the last element of `areas`, being `9.50`. Using a negative index makes sense here!
328 | - Select the number representing the area of the living room (`20.0`) and print it out.
329 | 
330 | `@hint`
331 | - Use `x[1]` to select the second element of a list `x`.
332 | - Use `x[-1]` to select the last element of a list `x`.
333 | - Make sure to wrap your subsetting operations in a `print()` call.
334 | - The number representing the area of the living room is the 6th element in the list, so you'll need `[5]` here. `areas[4]` would show the string!
335 | 
336 | `@pre_exercise_code`
337 | ```{python}
338 | 
339 | ```
340 | 
341 | `@sample_code`
342 | ```{python}
343 | # Create the areas list
344 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
345 | 
346 | # Print out second element from areas
347 | print(areas[____])
348 | 
349 | # Print out last element from areas
350 | print(areas[____])
351 | 
352 | # Print out the area of the living room
353 | print(areas[____])
354 | ```
355 | 
356 | `@solution`
357 | ```{python}
358 | # Create the areas list
359 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
360 | 
361 | # Print out second element from areas
362 | print(areas[1])
363 | 
364 | # Print out last element from areas
365 | print(areas[-1])
366 | 
367 | # Print out the area of the living room
368 | print(areas[5])
369 | ```
370 | 
371 | `@sct`
372 | ```{python}
373 | msg = "Don't remove or edit the predefined `areas` list."
374 | Ex().check_object("areas", missing_msg = msg).has_equal_value(incorrect_msg = msg)
375 | Ex().has_printout(0, not_printed_msg = "Have another look at your code to print out the second element in `areas`, which is at index `1`.")
376 | Ex().has_printout(1, not_printed_msg = "Have another look at your code to print out the last element in `areas`, which is at index `-1`.")
377 | Ex().has_printout(2, not_printed_msg = "Have another look at your code to print out the area of the living room. It's at index `5`.")
378 | success_msg("Good job!")
379 | ```
380 | 
381 | ---
382 | 
383 | ## Slicing and dicing
384 | 
385 | ```yaml
386 | type: NormalExercise
387 | key: 7f08642d18
388 | lang: python
389 | xp: 100
390 | skills:
391 |   - 2
392 | ```
393 | 
394 | Selecting single values from a list is just one part of the story. It's also possible to _slice_ your list, which means selecting multiple elements from your list. Use the following syntax:
395 | 
396 | ```
397 | my_list[start:end]
398 | ```
399 | 
400 | The `start` index will be included, while the `end` index is _not_. However, it's also possible not to specify these indexes. If you don't specify the `start` index, Python figures out that you want to start your slice at the beginning of your list. Likewise, if you don't specify the `end` index, the slice runs through the end of your list.
401 | 
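A short sketch of the slicing rules described above, using a hypothetical list `letters` rather than the `areas` list from the exercise:

```python
# Hypothetical list, used only to illustrate the slicing rules.
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:3]  # start index 1 is included, end index 3 is not
head = letters[:2]     # omitted start: the slice begins at the first element
tail = letters[2:]     # omitted end: the slice runs through the last element

print(middle)  # ['b', 'c']
print(head)    # ['a', 'b']
print(tail)    # ['c', 'd', 'e']
```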
402 | `@instructions`
403 | - Use slicing to create a list, `downstairs`, that contains the first 6 elements of `areas`.
404 | - Create `upstairs`, as the last `4` elements of `areas`. This time, simplify the slicing by omitting the `end` index.
405 | - Print both `downstairs` and `upstairs` using `print()`.
406 | 
407 | `@hint`
408 | - Use the brackets `[0:6]` to get the first six elements of a list.
409 | - To get everything except the first 5 elements of a list, `l`, you would use `l[5:]`.
410 | - Add two `print()` calls to print out `downstairs` and `upstairs`.
411 | 
412 | `@pre_exercise_code`
413 | ```{python}
414 | 
415 | ```
416 | 
417 | `@sample_code`
418 | ```{python}
419 | # Create the areas list
420 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
421 | 
422 | # Use slicing to create downstairs
423 | downstairs = areas[____]
424 | 
425 | # Use slicing to create upstairs
426 | upstairs = areas[____]
427 | 
428 | # Print out downstairs and upstairs
429 | ____
430 | ____
431 | ```
432 | 
433 | `@solution`
434 | ```{python}
435 | # Create the areas list
436 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
437 | 
438 | # Use slicing to create downstairs
439 | downstairs = areas[0:6]
440 | 
441 | # Use slicing to create upstairs
442 | upstairs = areas[6:]
443 | 
444 | # Print out downstairs and upstairs
445 | print(downstairs)
446 | print(upstairs)
447 | ```
448 | 
449 | `@sct`
450 | ```{python}
451 | msg = "Don't remove or edit the predefined `areas` list."
452 | Ex().check_object("areas", missing_msg = msg).has_equal_value(incorrect_msg = msg)
453 | 
454 | patt = "`%s` is incorrect. Use `areas[%s]` and slicing to select the elements you want, or something equivalent."
455 | Ex().check_object("downstairs").has_equal_value(incorrect_msg = patt % ('downstairs', '0:6'))
456 | Ex().check_object("upstairs").has_equal_value(incorrect_msg = patt % ("upstairs", "6:"))
457 | 
458 | Ex().has_printout(0, not_printed_msg="Have you printed out `downstairs` after calculating it?")
459 | Ex().has_printout(1, not_printed_msg="Have you printed out `upstairs` after calculating it?")
460 | 
461 | success_msg("Great!")
462 | ```
463 | 
464 | ---
465 | 
466 | ## Subsetting lists of lists
467 | 
468 | ```yaml
469 | type: NormalExercise
470 | key: dbbbd306cf
471 | xp: 100
472 | ```
473 | 
474 | A Python list can also contain other lists.
475 | 
476 | To subset lists of lists, you can use the same technique as before: square brackets. This would look something like this for a list, `house`:
477 | 
478 | ```
479 | house[2][0]
480 | ```
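
To see how the two pairs of brackets work one after the other, consider this small sketch. The list `pets` and its values are made up for illustration and do not appear in the exercise:

```python
# Hypothetical list of lists: each sublist holds a name and a weight
pets = [["cat", 4.5],
        ["dog", 12.0],
        ["rabbit", 2.1]]

# The first pair of brackets selects a sublist
second_pet = pets[1]

# The second pair of brackets selects an element inside that sublist
name = pets[1][0]
weight = pets[1][1]

print(second_pet)
print(name)
print(weight)
```

Reading from left to right, `pets[1]` is evaluated first, and `[0]` or `[1]` is then applied to its result.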
481 | 
482 | `@instructions`
483 | - Subset the `house` list to get the float `9.5`.
484 | 
485 | `@hint`
486 | - Break this down step by step. First you want to get to the last element of the list, `["bathroom", 9.50]`. Recall the index of the last element is `-1`.
487 | - Next you want to get the second element of `["bathroom", 9.50]`, which is at index `1`.
488 | 
489 | `@pre_exercise_code`
490 | ```{python}
491 | 
492 | ```
493 | 
494 | `@sample_code`
495 | ```{python}
496 | house = [["hallway", 11.25],
497 |          ["kitchen", 18.0],
498 |          ["living room", 20.0],
499 |          ["bedroom", 10.75],
500 |          ["bathroom", 9.50]]
501 | 
502 | # Subset the house list
503 | house___
504 | ```
505 | 
506 | `@solution`
507 | ```{python}
508 | house = [["hallway", 11.25],
509 |          ["kitchen", 18.0],
510 |          ["living room", 20.0],
511 |          ["bedroom", 10.75],
512 |          ["bathroom", 9.50]]
513 | 
514 | # Subset the house list
515 | house[-1][1]
516 | ```
517 | 
518 | `@sct`
519 | ```{python}
520 | Ex().check_or(
521 |   has_code("house[-1][1]", pattern=False),
522 |   has_code("house[4][1]", pattern=False)
523 | )
524 | 
525 | success_msg("Correctomundo! The last piece of the list puzzle is manipulation.")
526 | ```
527 | 
528 | ---
529 | 
530 | ## Manipulating Lists
531 | 
532 | ```yaml
533 | type: VideoExercise
534 | key: d7fe818b3a
535 | xp: 50
536 | ```
537 | 
538 | `@projector_key`
539 | 355ed52d2fb0d67508c6a311b7cbc6d3
540 | 
541 | ---
542 | 
543 | ## Replace list elements
544 | 
545 | ```yaml
546 | type: NormalExercise
547 | key: 4e1bba1b55
548 | lang: python
549 | xp: 100
550 | skills:
551 |   - 2
552 | ```
553 | 
554 | To replace list elements, you subset the list and assign new values to the subset. You can select single elements or you can change entire list slices at once.
555 | 
556 | For this and the following exercises, you'll continue working on the `areas` list that contains the names and areas of different rooms in a house.
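
The sketch below shows both kinds of replacement on a made-up list. The name `scores` and its contents are assumptions chosen for this example, not part of the exercise:

```python
# Hypothetical list of subjects and scores
scores = ["math", 7, "physics", 6, "art", 9]

# Replace a single element by assigning to its index
scores[1] = 8

# Replace a whole slice (indexes 2 and 3) with another list
scores[2:4] = ["chemistry", 5]

print(scores)
```

After both assignments, the list reads `["math", 8, "chemistry", 5, "art", 9]`.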
557 | 
558 | `@instructions`
559 | - Update the area of the bathroom to be `10.50` square meters instead of `9.50` using negative indexing.
560 | - Make the `areas` list more trendy! Change `"living room"` to `"chill zone"`. Don't use negative indexing this time.
561 | 
562 | `@hint`
563 | - To update the bathroom area, identify the subset of the bathroom area (it's the last item of the list!).
564 | - Then, replace the value with the new bathroom area by assigning it to this subset.
565 | - Do the same to update the `"living room"` name, which is at index 4.
566 | 
567 | `@pre_exercise_code`
568 | ```{python}
569 | 
570 | ```
571 | 
572 | `@sample_code`
573 | ```{python}
574 | # Create the areas list
575 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
576 | 
577 | # Correct the bathroom area
578 | 
579 | 
580 | # Change "living room" to "chill zone"
581 | 
582 | ```
583 | 
584 | `@solution`
585 | ```{python}
586 | # Create the areas list
587 | areas = ["hallway", 11.25, "kitchen", 18.0, "living room", 20.0, "bedroom", 10.75, "bathroom", 9.50]
588 | 
589 | # Correct the bathroom area
590 | areas[-1] = 10.50
591 | 
592 | # Change "living room" to "chill zone"
593 | areas[4] = "chill zone"
594 | ```
595 | 
596 | `@sct`
597 | ```{python}
598 | bathroom_msg = 'You can use `areas[-1] = 10.50` to update the bathroom area.'
599 | chillzone_msg = 'You can use `areas[4] = "chill zone"` to update the living room name.'
600 | Ex().check_correct(
601 |   check_object('areas').has_equal_value(incorrect_msg = 'Your changes to `areas` did not result in the correct list. Are you sure you used the correct subset operations? When in doubt, you can use a hint!'),
602 |   multi(
603 |     has_equal_value(expr_code='areas[-1]', override=10.50, incorrect_msg = bathroom_msg),
604 |     has_equal_value(expr_code='areas[4]', override='chill zone', incorrect_msg = chillzone_msg),
605 |   )
606 | )
607 | success_msg('Sweet! As the exercise description mentioned, you can also slice a list and replace it with another list to update multiple elements in a single command.')
608 | ```
609 | 
610 | ---
611 | 
612 | ## Extend a list
613 | 
614 | ```yaml
615 | type: NormalExercise
616 | key: ff0fe8d967
617 | lang: python
618 | xp: 100
619 | skills:
620 |   - 2
621 | ```
622 | 
623 | If you can change elements in a list, you sure want to be able to add elements to it, right? You can use the `+` operator:
624 | 
625 | ```
626 | x = ["a", "b", "c", "d"]
627 | y = x + ["e", "f"]
628 | ```
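
One detail worth checking for yourself is that `+` builds a new list and leaves the original alone. The following sketch reuses the `x` and `y` names from the snippet above:

```python
x = ["a", "b", "c", "d"]

# + creates a new list; x itself is not modified
y = x + ["e", "f"]

print(x)
print(y)
print(len(x), len(y))
```

Here `x` still has four elements, while `y` has six.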
629 | 
630 | You just won the lottery, awesome! You decide to build a poolhouse and a garage. Can you add the information to the `areas` list?
631 | 
632 | `@instructions`
633 | - Use the `+` operator to paste the list `["poolhouse", 24.5]` to the end of the `areas` list. Store the resulting list as `areas_1`.
634 | - Further extend `areas_1` by adding data on your garage. Add the string `"garage"` and float `15.45`. Name the resulting list `areas_2`.
635 | 
636 | `@hint`
637 | - Follow the code sample in the assignment. `x` is `areas` here, and `["e", "f"]` is `["poolhouse", 24.5]`.
638 | - To add more elements to `areas_1`, use `areas_1 + ["element", 123]`.
639 | 
640 | `@pre_exercise_code`
641 | ```{python}
642 | 
643 | ```
644 | 
645 | `@sample_code`
646 | ```{python}
647 | # Create the areas list and make some changes
648 | areas = ["hallway", 11.25, "kitchen", 18.0, "chill zone", 20.0,
649 |          "bedroom", 10.75, "bathroom", 10.50]
650 | 
651 | # Add poolhouse data to areas, new list is areas_1
652 | areas_1 = ____
653 | 
654 | # Add garage data to areas_1, new list is areas_2
655 | areas_2 = ____
656 | ```
657 | 
658 | `@solution`
659 | ```{python}
660 | # Create the areas list (updated version)
661 | areas = ["hallway", 11.25, "kitchen", 18.0, "chill zone", 20.0,
662 |          "bedroom", 10.75, "bathroom", 10.50]
663 | 
664 | # Add poolhouse data to areas, new list is areas_1
665 | areas_1 = areas + ["poolhouse", 24.5]
666 | 
667 | # Add garage data to areas_1, new list is areas_2
668 | areas_2 = areas_1 + ["garage", 15.45]
669 | ```
670 | 
671 | `@sct`
672 | ```{python}
673 | msg = "Don't remove or edit the predefined `areas` list."
674 | Ex().check_object("areas", missing_msg = msg).has_equal_value(incorrect_msg = msg)
675 | Ex().check_object("areas_1").has_equal_value(incorrect_msg = "Use `areas + [\"poolhouse\", 24.5]` to create `areas_1`. Watch out for typos!")
676 | Ex().check_object("areas_2").has_equal_value(incorrect_msg = "Use `areas_1 + [\"garage\", 15.45]` to create `areas_2`. Watch out for typos!")
677 | success_msg("Cool! The list is shaping up nicely!")
678 | ```
679 | 
680 | ---
681 | 
682 | ## Delete list elements
683 | 
684 | ```yaml
685 | type: NormalExercise
686 | key: 85f792356e
687 | xp: 100
688 | ```
689 | 
690 | Finally, you can also remove elements from your list. You can do this with the `del` statement:
691 | 
692 | ```
693 | x = ["a", "b", "c", "d"]
694 | del x[1]
695 | ```
696 | 
697 | Pay attention here: as soon as you remove an element from a list, the indexes of the elements that come after the deleted element all change!
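
A minimal sketch of this index shift, reusing the list `x` from the snippet above:

```python
x = ["a", "b", "c", "d"]

# Before deleting, "c" sits at index 2
position_before = x.index("c")

# Remove the element at index 1 ("b")
del x[1]

# Everything after the deleted element moves one position to the left
position_after = x.index("c")

print(x)
print(position_before, position_after)
```

The element `"c"` moves from index 2 to index 1, so any index you worked out before the deletion may now point at a different element.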
698 | 
699 | Unfortunately, the amount you won with the lottery is not that big after all and it looks like the poolhouse isn't going to happen. You'll need to remove it from the list. You decide to remove the corresponding string and float from the `areas` list.
700 | 
701 | `@instructions`
702 | - Delete the string and float for the `"poolhouse"` from your `areas` list.
703 | - Print the updated `areas` list.
704 | 
705 | `@hint`
706 | - You'll need to use `del` twice to delete two elements. Be careful about changing indexes though!
707 | 
708 | `@pre_exercise_code`
709 | ```{python}
710 | 
711 | ```
712 | 
713 | `@sample_code`
714 | ```{python}
715 | areas = ["hallway", 11.25, "kitchen", 18.0,
716 |          "chill zone", 20.0, "bedroom", 10.75,
717 |          "bathroom", 10.50, "poolhouse", 24.5,
718 |          "garage", 15.45]
719 | 
720 | # Delete the poolhouse items from the list
721 | 
722 | 
723 | # Print the updated list
724 | 
725 | ```
726 | 
727 | `@solution`
728 | ```{python}
729 | areas = ["hallway", 11.25, "kitchen", 18.0,
730 |          "chill zone", 20.0, "bedroom", 10.75,
731 |          "bathroom", 10.50, "poolhouse", 24.5,
732 |          "garage", 15.45]
733 | 
734 | # Delete the poolhouse items from the list
735 | del areas[10]
736 | del areas[10]
737 | 
738 | # Print the updated list
739 | print(areas)
740 | ```
741 | 
742 | `@sct`
743 | ```{python}
744 | Ex().check_or(
745 |   multi(
746 |     has_code("del areas[10]", pattern=False),
747 |     has_code("del areas[10]", pattern=False)
748 |   ),
749 |   has_code("del areas[-4:-2]", pattern=False),
750 |   has_code("del(areas[-4:-2])", pattern=False),
751 |   multi(
752 |     has_code("del(areas[10])", pattern=False),
753 |     has_code("del(areas[10])", pattern=False)
754 |   ),
755 |   has_code("del areas[10:12]", pattern=False),
756 |   has_code("del(areas[10:12])", pattern=False),
757 |   multi(
758 |     has_code("del areas[-4]", pattern=False),
759 |     has_code("del areas[-3]", pattern=False)
760 |   ),
761 |   multi(
762 |     has_code("del(areas[-4])", pattern=False),
763 |     has_code("del(areas[-3])", pattern=False)
764 |   )
765 | )
766 | 
767 | Ex().has_printout(0, not_printed_msg="Have you printed out `areas` after removing the poolhouse string and float?")
768 | success_msg("Correct! You'll learn about easier ways to remove specific elements from Python lists later on.")
769 | ```
770 | 
771 | ---
772 | 
773 | ## Inner workings of lists
774 | 
775 | ```yaml
776 | type: NormalExercise
777 | key: af72db9915
778 | lang: python
779 | xp: 100
780 | skills:
781 |   - 2
782 | ```
783 | 
784 | Some code has been provided for you in this exercise: a list with the name `areas` and a copy named `areas_copy`.
785 | 
786 | Currently, the first element in the `areas_copy` list is changed and the `areas` list is printed out. If you hit the run code button you'll see that, although you've changed `areas_copy`, the change also takes effect in the `areas` list. That's because `areas` and `areas_copy` point to the same list.
787 | 
788 | If you want to prevent changes in `areas_copy` from also taking effect in `areas`, you'll have to do a more explicit copy of the `areas` list with `list()` or by using `[:]`.
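
The difference can be sketched on a small made-up list. The names `original`, `alias`, `copy_a`, and `copy_b` are hypothetical and used only for this illustration:

```python
# Hypothetical list for illustration
original = [1, 2, 3]

# Plain assignment: both names refer to the same list
alias = original

# Two ways to make an explicit copy
copy_a = list(original)
copy_b = original[:]

# Changing the list through alias also changes original
alias[0] = 99

print(original)
print(copy_a)
print(copy_b)

# `is` checks whether two names refer to the same object
print(alias is original)
print(copy_a is original)
```

Both `list()` and `[:]` create a new list object holding the same elements, which is all you need for a list of numbers like this one.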
789 | 
790 | `@instructions`
791 | - Change the second command, that creates the variable `areas_copy`, such that `areas_copy` is an explicit copy of `areas`. After your edit, changes made to `areas_copy` shouldn't affect `areas`. Submit the answer to check this.
792 | 
793 | `@hint`
794 | - Change the `areas_copy = areas` call. Instead of assigning `areas`, you can assign `list(areas)` or `areas[:]`.
795 | 
796 | `@pre_exercise_code`
797 | ```{python}
798 | 
799 | ```
800 | 
801 | `@sample_code`
802 | ```{python}
803 | # Create list areas
804 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
805 | 
806 | # Change this command
807 | areas_copy = areas
808 | 
809 | # Change areas_copy
810 | areas_copy[0] = 5.0
811 | 
812 | # Print areas
813 | print(areas)
814 | ```
815 | 
816 | `@solution`
817 | ```{python}
818 | # Create list areas
819 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
820 | 
821 | # Change this command
822 | areas_copy = list(areas)
823 | 
824 | # Change areas_copy
825 | areas_copy[0] = 5.0
826 | 
827 | # Print areas
828 | print(areas)
829 | ```
830 | 
831 | `@sct`
832 | ```{python}
833 | Ex().check_correct(
834 |   check_object("areas_copy").has_equal_value(incorrect_msg = "It seems that `areas_copy` has not been updated correctly."),
835 |   check_function("list", missing_msg = "Make sure to use `list(areas)` to create an `areas_copy`.")
836 | )
837 | 
838 | mmsg = "Don't remove the predefined `areas` list."
839 | imsg = "Be sure to edit ONLY the copy, not the original `areas` list. Have another look at the exercise description if you're unsure how to create a copy."
840 | Ex().check_correct(
841 |   check_object("areas", missing_msg = mmsg).has_equal_value(incorrect_msg = imsg),
842 |   check_function("list", missing_msg = "Make sure to use `list(areas)` to create an `areas_copy`.")
843 | )
844 | 
845 | success_msg("Nice! The difference between explicit and reference-based copies is subtle, but can be really important. Try to keep in mind how a list is stored in the computer's memory.")
846 | ```
847 | 


--------------------------------------------------------------------------------
/chapter3.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title_meta: Chapter 3
  3 | title: Functions and Packages
  4 | description: >-
  5 |   You'll learn how to use functions, methods, and packages to efficiently
  6 |   leverage the code that brilliant Python developers have written. The goal is
  7 |   to reduce the amount of code you need to solve challenging problems!
  8 | attachments:
  9 |   slides_link: 'https://projector-video-pdf-converter.datacamp.com/735/chapter3.pdf'
 10 | lessons:
 11 |   - nb_of_exercises: 4
 12 |     title: Functions
 13 |   - nb_of_exercises: 4
 14 |     title: Methods
 15 |   - nb_of_exercises: 4
 16 |     title: Packages
 17 | ---
 18 | 
 19 | ## Functions
 20 | 
 21 | ```yaml
 22 | type: VideoExercise
 23 | key: 5c5f365930
 24 | xp: 50
 25 | ```
 26 | 
 27 | `@projector_key`
 28 | 1204d914b0e53100529827e07441ee6c
 29 | 
 30 | ---
 31 | 
 32 | ## Familiar functions
 33 | 
 34 | ```yaml
 35 | type: NormalExercise
 36 | key: c422ee929b
 37 | lang: python
 38 | xp: 100
 39 | skills:
 40 |   - 2
 41 | ```
 42 | 
 43 | Out of the box, Python offers a bunch of built-in functions to make your life as a data scientist easier. You already know two such functions: `print()` and `type()`. Functions like `str()`, `int()`, `bool()` and `float()`, which switch between data types, are built-in functions as well. You can find out about them [here](https://docs.python.org/3/library/functions.html).
 44 | 
 45 | Calling a function is easy. To get the type of `3.0` and store the output as a new variable, `result`, you can use the following:
 46 | 
 47 | ```
 48 | result = type(3.0)
 49 | ```
 50 | 
 51 | `@instructions`
 52 | - Use `print()` in combination with `type()` to print out the type of `var1`.
 53 | - Use `len()` to get the [length of the list](https://docs.python.org/3/library/functions.html#len) `var1`. Wrap it in a `print()` call to directly print it out.
 54 | - Use `int()` to convert `var2` to an [integer](https://docs.python.org/3/library/functions.html#int). Store the output as `out2`.
 55 | 
 56 | `@hint`
 57 | - Call the `type()` function like this: `type(var1)`.
 58 | - Call `print()` like you did so many times before. Simply put the variable you want to print in parentheses.
 59 | - `int(x)` will convert `x` to an integer.
 60 | 
 61 | `@pre_exercise_code`
 62 | ```{python}
 63 | 
 64 | ```
 65 | 
 66 | `@sample_code`
 67 | ```{python}
 68 | # Create variables var1 and var2
 69 | var1 = [1, 2, 3, 4]
 70 | var2 = True
 71 | 
 72 | # Print out type of var1
 73 | ____
 74 | 
 75 | # Print out length of var1
 76 | ____
 77 | 
 78 | # Convert var2 to an integer: out2
 79 | out2 = ____
 80 | ```
 81 | 
 82 | `@solution`
 83 | ```{python}
 84 | # Create variables var1 and var2
 85 | var1 = [1, 2, 3, 4]
 86 | var2 = True
 87 | 
 88 | # Print out type of var1
 89 | print(type(var1))
 90 | 
 91 | # Print out length of var1
 92 | print(len(var1))
 93 | 
 94 | # Convert var2 to an integer: out2
 95 | out2 = int(var2)
 96 | ```
 97 | 
 98 | `@sct`
 99 | ```{python}
100 | msg = "You don't have to change or remove the predefined variables."
101 | Ex().check_object("var1", missing_msg=msg).has_equal_value(incorrect_msg=msg)
102 | Ex().check_object("var2", missing_msg=msg).has_equal_value(incorrect_msg=msg)
103 | 
104 | patt = "__JINJA__:Make sure to print out the %s of `var1` with `{{sol_call}}`."
105 | Ex().has_printout(0, not_printed_msg = patt % 'type')
106 | Ex().has_printout(1, not_printed_msg = patt % 'length')
107 | 
108 | int_miss_msg = "Have you used `int()` to make an integer of `var2`?"
109 | int_incorr_msg = "Have you passed `var2` to `int()`?"
110 | Ex().check_correct(
111 |   check_object("out2").has_equal_value(incorrect_msg="You called `int()` correctly; now make sure to assign the result of this call to `out2`."),
112 |   check_function("int", missing_msg=int_miss_msg).has_equal_value(incorrect_msg=int_incorr_msg)
113 | )
114 | success_msg("Great job! The `len()` function is extremely useful; it also works on strings to count the number of characters!")
115 | ```
116 | 
117 | ---
118 | 
119 | ## Help!
120 | 
121 | ```yaml
122 | type: MultipleChoiceExercise
123 | key: 679b852978
124 | lang: python
125 | xp: 50
126 | skills:
127 |   - 2
128 | ```
129 | 
130 | Maybe you already know the name of a Python function, but you still have to figure out how to use it. Ironically, you have to ask for information about a function with another function: `help()`. In IPython specifically, you can also use `?` before the function name.
131 | 
132 | To get help on the `max()` function, for example, you can use one of these calls:
133 | 
134 | ```
135 | help(max)
136 | ?max
137 | ```
138 | 
139 | Use the IPython Shell to open up the [documentation](https://docs.python.org/3/library/functions.html#pow) on `pow()`. Do this by typing `?pow` or `help(pow)` and hitting **Enter**.
140 | 
141 | Which of the following statements is true?
142 | 
143 | `@possible_answers`
144 | - `pow()` takes three arguments: `base`, `exp`, and `mod`. Without `mod`, the function will return an error.
145 | - `pow()` takes three required arguments: `base`, `exp`, and `None`.
146 | - `pow()` requires `base` and `exp` arguments; `mod` is optional.
147 | - `pow()` takes two arguments: `exp` and `mod`. Missing `exp` results in an error.
148 | 
149 | `@hint`
150 | - Optional arguments are set `=` to a default value, which the function will use if that argument is not specified.
151 | 
152 | `@pre_exercise_code`
153 | ```{python}
154 | 
155 | ```
156 | 
157 | `@sct`
158 | ```{python}
159 | msg1 = "Not quite. `mod` has a default value that will be used if you don't specify a value."
160 | msg2 = "Incorrect. `None` is the default value for the `mod` argument."
161 | msg3 = "Perfect! Using `help()` can help you understand how functions work, unleashing their full potential!"
162 | msg4 = "Incorrect. `pow()` takes three arguments, one of which has a default value."
163 | Ex().has_chosen(3, [msg1, msg2, msg3, msg4])
164 | ```
165 | 
166 | ---
167 | 
168 | ## Multiple arguments
169 | 
170 | ```yaml
171 | type: NormalExercise
172 | key: e30486d7c1
173 | lang: python
174 | xp: 100
175 | skills:
176 |   - 2
177 | ```
178 | 
179 | In the previous exercise, you identified optional arguments by viewing the documentation with `help()`. You'll now apply this to change the behavior of the `sorted()` function.
180 | 
181 | Have a look at the [documentation](https://docs.python.org/3/library/functions.html#sorted) of `sorted()` by typing `help(sorted)` in the IPython Shell.
182 | 
183 | You'll see that `sorted()` takes three arguments: `iterable`, `key`, and `reverse`. In this exercise, you'll only have to specify `iterable` and `reverse`, not `key`.
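
As a small sketch of how the optional `reverse` argument changes the result, consider the made-up list `values` below (a hypothetical name, not one of the lists in the exercise):

```python
# Hypothetical list for illustration
values = [3, 1, 2]

# Without reverse, sorted() uses its default: ascending order
ascending = sorted(values)

# Passing reverse=True by name flips the order
descending = sorted(values, reverse=True)

print(ascending)
print(descending)

# sorted() returns a new list; values keeps its original order
print(values)
```

Because `key` comes before `reverse` in the signature, naming the argument (`reverse=True`) lets you skip `key` entirely.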
184 | 
185 | Two lists have been created for you.
186 | 
187 | Can you paste them together and sort them in descending order?
188 | 
189 | `@instructions`
190 | - Use `+` to merge the contents of `first` and `second` into a new list: `full`.
191 | - Call `sorted()` on `full` and specify the `reverse` argument to be `True`. Save the sorted list as `full_sorted`.
192 | - Finish off by printing out `full_sorted`.
193 | 
194 | `@hint`
195 | - Sum `first` and `second` as if they are two numbers and assign the result to `full`.
196 | - Use `sorted()` with two inputs: `full` and `reverse=True`.
197 | - To print out a variable, use `print()`.
198 | 
199 | `@pre_exercise_code`
200 | ```{python}
201 | 
202 | ```
203 | 
204 | `@sample_code`
205 | ```{python}
206 | # Create lists first and second
207 | first = [11.25, 18.0, 20.0]
208 | second = [10.75, 9.50]
209 | 
210 | # Paste together first and second: full
211 | full = ____ + ____
212 | 
213 | # Sort full in descending order: full_sorted
214 | full_sorted = ____
215 | 
216 | # Print out full_sorted
217 | ____
218 | ```
219 | 
220 | `@solution`
221 | ```{python}
222 | # Create lists first and second
223 | first = [11.25, 18.0, 20.0]
224 | second = [10.75, 9.50]
225 | 
226 | # Paste together first and second: full
227 | full = first + second
228 | 
229 | # Sort full in descending order: full_sorted
230 | full_sorted = sorted(full, reverse=True)
231 | 
232 | # Print out full_sorted
233 | print(full_sorted)
234 | ```
235 | 
236 | `@sct`
237 | ```{python}
238 | msg = "You don't have to change or remove the predefined variables `first` and `second`."
239 | Ex().multi(
240 |   check_object("first", missing_msg=msg).has_equal_value(incorrect_msg=msg),
241 |   check_object("second", missing_msg=msg).has_equal_value(incorrect_msg=msg)
242 | )
243 | Ex().check_correct(
244 |   check_object("full_sorted").has_equal_value(incorrect_msg="Make sure you assign the result of calling `sorted()` to `full_sorted`."),
245 |   check_function("sorted").multi(
246 |     check_args(0).has_equal_value(),
247 |     check_args('reverse').has_equal_value()
248 |   )
249 | )
250 | 
251 | success_msg("Cool! Head over to the video on Python methods.")
252 | ```
253 | 
254 | ---
255 | 
256 | ## Methods
257 | 
258 | ```yaml
259 | type: VideoExercise
260 | key: 2b66cb66b1
261 | xp: 50
262 | ```
263 | 
264 | `@projector_key`
265 | 8e387776f3a264a745128b68aa8d8f83
266 | 
267 | ---
268 | 
269 | ## String Methods
270 | 
271 | ```yaml
272 | type: NormalExercise
273 | key: 4039302ee0
274 | lang: python
275 | xp: 100
276 | skills:
277 |   - 2
278 | ```
279 | 
280 | Strings come with a bunch of methods. Follow the instructions closely to discover some of them. If you want to discover them in more detail, you can always type `help(str)` in the IPython Shell.
281 | 
282 | A string `place` has already been created for you to experiment with.
283 | 
284 | `@instructions`
285 | - Use the `.upper()` [method](https://docs.python.org/3/library/stdtypes.html#str.upper) on `place` and store the result in `place_up`. Use the syntax for calling methods that you learned in the previous video.
286 | - Print out `place` and `place_up`. Did both change?
287 | - Print out the number of o's in the variable `place` by calling `.count()` on `place` and passing the letter `'o'` as an input to the method. We're talking about the variable `place`, not the word `"place"`!
288 | 
289 | `@hint`
290 | - You can call the `.upper()` method on `place` without any additional inputs.
291 | - To print out a variable `x`, you can write `print(x)`.
292 | - Make sure to wrap your `place.count(____)` call in a `print()` function so that you print it out.
293 | 
294 | `@pre_exercise_code`
295 | ```{python}
296 | 
297 | ```
298 | 
299 | `@sample_code`
300 | ```{python}
301 | # string to experiment with: place
302 | place = "poolhouse"
303 | 
304 | # Use upper() on place
305 | place_up = 
306 | 
307 | # Print out place and place_up
308 | 
309 | 
310 | 
311 | # Print out the number of o's in place
312 | 
313 | ```
314 | 
315 | `@solution`
316 | ```{python}
317 | # string to experiment with: place
318 | place = "poolhouse"
319 | 
320 | # Use upper() on place
321 | place_up = place.upper()
322 | 
323 | # Print out place and place_up
324 | print(place)
325 | print(place_up)
326 | 
327 | # Print out the number of o's in place
328 | print(place.count('o'))
329 | ```
330 | 
331 | `@sct`
332 | ```{python}
333 | msg = "You don't have to change or remove the predefined variables."
334 | Ex().check_object("place", missing_msg=msg).has_equal_value(incorrect_msg=msg)
335 | 
336 | patt = "Don't forget to print out `%s`."
337 | Ex().has_printout(0, not_printed_msg=patt % "place")
338 | Ex().check_correct(
339 |     has_printout(1, not_printed_msg=patt % "place_up"),
340 |     check_correct(
341 |         check_object("place_up").has_equal_value(incorrect_msg="Assign the result of your `place.upper()` call to `place_up`."),
342 |         check_function("place.upper", signature=False)
343 |     )
344 | )    
345 | 
346 | # check count of place
347 | Ex().check_correct(
348 |   has_printout(2, not_printed_msg = "You have calculated the number of o's in `place` fine; now make sure to wrap `place.count('o')` call in a `print()` function to print out the result."),
349 |   check_function("place.count", signature=False).check_args(0).has_equal_value()
350 | )
351 | 
352 | success_msg("Nice! Notice from the printouts that the `upper()` method does not change the object it is called on. This will be different for lists in the next exercise!")
353 | ```
354 | 
355 | ---
356 | 
357 | ## List Methods
358 | 
359 | ```yaml
360 | type: NormalExercise
361 | key: 0dbe8ed695
362 | lang: python
363 | xp: 100
364 | skills:
365 |   - 2
366 | ```
367 | 
368 | Strings are not the only Python types that have methods associated with them. Lists, floats, integers and booleans are also types that come packaged with a bunch of useful methods. In this exercise, you'll be experimenting with:
369 | 
370 | - `.index()`, to get the index of the first element of a list that matches its input, and
371 | - `.count()`, to get the number of times an element appears in a list.
372 | 
373 | You'll be working on the list with the area of different parts of a house: `areas`.
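
Here is a minimal sketch of both methods on a made-up list. The name `grades` and its contents are assumptions for this example only:

```python
# Hypothetical list with a repeated element
grades = ["b", "a", "c", "a"]

# .index() returns the position of the first match only
first_a = grades.index("a")

# .count() returns how many times the element appears
n_a = grades.count("a")

# Counting an element that is not in the list gives 0
n_z = grades.count("z")

print(first_a)
print(n_a)
print(n_z)
```

Unlike `.count()`, calling `.index()` with an element that is not in the list raises a `ValueError`.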
374 | 
375 | `@instructions`
376 | - Use the `.index()` method to get the index of the element in `areas` that is equal to `20.0`. Print out this index.
377 | - Call `.count()` on `areas` to find out how many times `9.50` appears in the list. Again, simply print out this number.
378 | 
379 | `@hint`
380 | - To print out the index, wrap the `areas.index(___)` call in a `print()` function.
381 | - To print out the number of times an element `x` occurs in the list, wrap the `areas.count(___)` call in a `print()` function.
382 | 
383 | `@pre_exercise_code`
384 | ```{python}
385 | 
386 | ```
387 | 
388 | `@sample_code`
389 | ```{python}
390 | # Create list areas
391 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
392 | 
393 | # Print out the index of the element 20.0
394 | 
395 | 
396 | # Print out how often 9.50 appears in areas
397 | 
398 | ```
399 | 
400 | `@solution`
401 | ```{python}
402 | # Create list areas
403 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
404 | 
405 | # Print out the index of the element 20.0
406 | print(areas.index(20.0))
407 | 
408 | # Print out how often 9.50 appears in areas
409 | print(areas.count(9.50))
410 | ```
411 | 
412 | `@sct`
413 | ```{python}
414 | predef_msg = "You don't have to change or remove the predefined list `areas`."
415 | 
416 | Ex().check_object("areas", missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg)
417 | 
418 | Ex().check_function("print", index=0).check_args(0).check_function('areas.index', signature=False).check_args(0).has_equal_value()
419 | 
420 | 
421 | Ex().check_function("print", index=1).check_args(0).check_function('areas.count', signature=False).has_equal_value()
422 | 
423 | success_msg("Nice! These were examples of `list` methods that did not change the list they were called on.")
424 | ```
425 | 
426 | ---
427 | 
428 | ## List Methods (2)
429 | 
430 | ```yaml
431 | type: NormalExercise
432 | key: 1fbeab82d0
433 | lang: python
434 | xp: 100
435 | skills:
436 |   - 2
437 | ```
438 | 
439 | Most list methods will change the list they're called on. Examples are:
440 | 
441 | - `.append()`, that adds an element to the list it is called on,
442 | - `.remove()`, that [removes](https://docs.python.org/3/library/stdtypes.html#typesseq-mutable) the first element of a list that matches the input, and
443 | - `.reverse()`, that [reverses](https://docs.python.org/3/library/stdtypes.html#typesseq-mutable) the order of the elements in the list it is called on.
444 | 
445 | You'll be working on the list with the area of different parts of the house: `areas`.
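
The following sketch walks through the three methods on a made-up list. The name `todo` and its contents are hypothetical and not part of the exercise:

```python
# Hypothetical list for illustration
todo = ["wash", "cook", "wash"]

# .append() adds one element to the end of the list
todo.append("shop")
after_append = list(todo)

# .remove() deletes only the first matching element
todo.remove("wash")
after_remove = list(todo)

# .reverse() flips the order in place
todo.reverse()
after_reverse = list(todo)

# These methods change the list itself and return None
result = todo.append("rest")

print(after_append)
print(after_remove)
print(after_reverse)
print(result)
```

Since the methods return `None`, writing something like `todo = todo.append("rest")` would replace your list with `None`; call the method on its own line instead.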
446 | 
447 | `@instructions`
448 | - Use `.append()` twice to add the size of the poolhouse and the garage again: `24.5` and `15.45`, respectively. Make sure to add them in this order.
449 | - Print out `areas`.
450 | - Use the `.reverse()` method to reverse the order of the elements in `areas`.
451 | - Print out `areas` once more.
452 | 
453 | `@hint`
454 | - For the first instruction, use the `areas.append(___)` call twice.
455 | - To print out a variable `x`, simply write `print(x)`.
456 | - The `.reverse()` method does not require additional inputs; just use the dot notation and empty parentheses: `.reverse()`.
457 | - To print out a variable `x`, simply write `print(x)`.
458 | 
459 | `@pre_exercise_code`
460 | ```{python}
461 | 
462 | ```
463 | 
464 | `@sample_code`
465 | ```{python}
466 | # Create list areas
467 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
468 | 
469 | # Use append twice to add poolhouse and garage size
470 | 
471 | 
472 | 
473 | # Print out areas
474 | 
475 | 
476 | # Reverse the order of the elements in areas
477 | 
478 | 
479 | # Print out areas
480 | 
481 | ```
482 | 
483 | `@solution`
484 | ```{python}
485 | # Create list areas
486 | areas = [11.25, 18.0, 20.0, 10.75, 9.50]
487 | 
488 | # Use append twice to add poolhouse and garage size
489 | areas.append(24.5)
490 | areas.append(15.45)
491 | 
492 | # Print out areas
493 | print(areas)
494 | 
495 | # Reverse the order of the elements in areas
496 | areas.reverse()
497 | 
498 | # Print out areas
499 | print(areas)
500 | ```
501 | 
502 | `@sct`
503 | ```{python}
504 | Ex().multi(
505 |   check_function("areas.append", index=0, signature=False).check_args(0).has_equal_value(),
506 |   check_function("areas.append", index=1, signature=False).check_args(0).has_equal_value(),
507 |   check_function("print", index=0).check_args(0).has_equal_ast(),
508 |   check_function("areas.reverse", index=0, signature=False),
509 |   check_function("print", index=1).check_args(0).has_equal_ast()
510 | )
511 | 
512 | success_msg("Great!")
513 | ```
514 | 
515 | ---
516 | 
517 | ## Packages
518 | 
519 | ```yaml
520 | type: VideoExercise
521 | key: ab96a17c5e
522 | xp: 50
523 | ```
524 | 
525 | `@projector_key`
526 | cedcfb34350be8545599768f96695cdd
527 | 
528 | ---
529 | 
530 | ## Import package
531 | 
532 | ```yaml
533 | type: NormalExercise
534 | key: 7432a6376f
535 | lang: python
536 | xp: 100
537 | skills:
538 |   - 2
539 | ```
540 | 
541 | Let's say you wanted to calculate the circumference and area of a circle. Here's what those formulas look like:
542 | 
543 | $$C = 2 \pi r$$
544 | $$A = \pi r^2 $$
545 | 
546 | Rather than typing the number for `pi`, you can use the `math` package, which provides it as `math.pi`.
547 | 
548 | For reference, `**` is the exponentiation operator. For example, `3**4` is `3` to the power of `4`, which gives `81`.
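
As a quick illustration of these two building blocks on a different problem, the sketch below combines `math.pi` with `**` to compute the volume of a sphere. The radius value and the names `r` and `V` are made up for this example and are not part of the exercise.

```python
import math

# Hypothetical radius, chosen only for illustration
r = 2

# Volume of a sphere: V = 4/3 * pi * r^3
V = 4 / 3 * math.pi * r ** 3

print(3 ** 4)  # exponentiation: 3 to the power of 4
print(V)
```

Note that `**` binds more tightly than `*`, so `r ** 3` is evaluated before the multiplication.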
549 | 
550 | `@instructions`
551 | - Import the `math` package.
552 | - Use `math.pi` to calculate the circumference of the circle and store it in `C`.
553 | - Use `math.pi` to calculate the area of the circle and store it in `A`.
554 | 
555 | `@hint`
556 | - You can simply use `import math`, and then refer to `pi` with `math.pi`.
557 | - Use the equation in the assignment text to find `C`. Use `*`.
558 | - Use the equation in the assignment text to find `A`. Use `*` and `**`.
559 | 
560 | `@pre_exercise_code`
561 | ```{python}
562 | 
563 | ```
564 | 
565 | `@sample_code`
566 | ```{python}
567 | # Import the math package
568 | import ____
569 | 
570 | # Calculate C
571 | C = 2 * 0.43 * ____
572 | 
573 | # Calculate A
574 | A = ____ * 0.43 ** 2
575 | 
576 | print("Circumference: " + str(C))
577 | print("Area: " + str(A))
578 | ```
579 | 
580 | `@solution`
581 | ```{python}
582 | # Import the math package
583 | import math
584 | 
585 | # Calculate C
586 | C = 2 * 0.43 * math.pi
587 | 
588 | # Calculate A
589 | A = math.pi * 0.43 ** 2
590 | 
591 | print("Circumference: " + str(C))
592 | print("Area: " + str(A))
593 | ```
594 | 
595 | `@sct`
596 | ```{python}
597 | patt = "Your calculation of `%s` is not quite correct. Make sure to use `math.pi`."
598 | Ex().multi(
599 |   has_import('math', same_as=False),
600 |   check_object('C').has_equal_value(incorrect_msg=patt%'C'),
601 |   check_object('A').has_equal_value(incorrect_msg=patt%'A')
602 | )
603 | 
604 | Ex().multi(
605 |   has_printout(0, not_printed_msg = "__JINJA__:Keep `{{sol_call}}` in there to print out the circumference."),
606 |   has_printout(1, not_printed_msg = "__JINJA__:Keep `{{sol_call}}` in there to print out the area.")
607 | )
608 | 
609 | success_msg("Nice! If you know how to deal with functions from packages, the power of a lot of Python programmers is at your fingertips!")
610 | ```
611 | 
612 | ---
613 | 
614 | ## Selective import
615 | 
616 | ```yaml
617 | type: NormalExercise
618 | key: fe65eff50a
619 | lang: python
620 | xp: 100
621 | skills:
622 |   - 2
623 | ```
624 | 
625 | General imports, like `import math`, make **all** functionality from the `math` package available to you. However, if you decide to only use a specific part of a package, you can always make your import more selective:
626 | 
627 | ```
628 | from math import pi
629 | ```
630 | 
631 | Try the same thing again, but this time only use `pi`.
632 | 
633 | `@instructions`
634 | - Perform a selective import from the `math` package where you only import `pi`.
635 | - Use `pi` to calculate the circumference of the circle and store it in `C`.
636 | - Use `pi` to calculate the area of the circle and store it in `A`.
637 | 
638 | `@hint`
639 | - Use `from math import pi` to do the selective import.
640 | - Now, you can use `pi` on its own!
641 | 
642 | `@pre_exercise_code`
643 | ```{python}
644 | 
645 | ```
646 | 
647 | `@sample_code`
648 | ```{python}
649 | # Import pi from the math package
650 | from math import ____
651 | 
652 | # Calculate C
653 | C = 2 * 0.43 * ____
654 | 
655 | # Calculate A
656 | A = ____ * 0.43 ** 2
657 | 
658 | print("Circumference: " + str(C))
659 | print("Area: " + str(A))
660 | ```
661 | 
662 | `@solution`
663 | ```{python}
664 | # Import pi from the math package
665 | from math import pi
666 | 
667 | # Calculate C
668 | C = 2 * 0.43 * pi
669 | 
670 | # Calculate A
671 | A = pi * 0.43 ** 2
672 | 
673 | print("Circumference: " + str(C))
674 | print("Area: " + str(A))
675 | ```
676 | 
677 | `@sct`
678 | ```{python}
679 | patt = "Your calculation of `%s` is not quite correct. Make sure to use only `pi`."
680 | 
681 | Ex().has_import("math.pi", not_imported_msg = "Be sure to import `pi` from the `math` package. You should use the `from ___ import ___` notation.",)
682 | 
683 | Ex().multi(
684 |   check_object('C').has_equal_value(incorrect_msg=patt%'C'),
685 |   check_object('A').has_equal_value(incorrect_msg=patt%'A')
686 | )
687 | 
688 | Ex().multi(
689 |   has_printout(0, not_printed_msg = "__JINJA__:Keep `{{sol_call}}` in there to print out the circumference."),
690 |   has_printout(1, not_printed_msg = "__JINJA__:Keep `{{sol_call}}` in there to print out the area.")
691 | )
692 | 
693 | success_msg("Nice! Head over to the next exercise.")
694 | ```
695 | 
696 | ---
697 | 
698 | ## Different ways of importing
699 | 
700 | ```yaml
701 | type: MultipleChoiceExercise
702 | key: f1b2675a2a
703 | lang: python
704 | xp: 50
705 | skills:
706 |   - 2
707 | ```
708 | 
709 | There are several ways to import packages and modules into Python. Depending on the import call, you'll have to use different Python code.
710 | 
711 | Suppose you want to use the [function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html) `inv()`, which is in the `linalg` subpackage of the `scipy` package. You want to be able to use this function as follows:
712 | 
713 | ```
714 | my_inv([[1,2], [3,4]])
715 | ```
716 | 
717 | Which `import` statement will you need in order to run the above code without an error?
718 | 
719 | `@possible_answers`
720 | - `import scipy`
721 | - `import scipy.linalg`
722 | - `from scipy.linalg import my_inv`
723 | - `from scipy.linalg import inv as my_inv`
724 | 
725 | `@hint`
726 | - Try the different import statements in the IPython shell and see which one causes the line `my_inv([[1, 2], [3, 4]])` to run without errors. Hit **enter** to run the code you have typed.
727 | 
728 | `@pre_exercise_code`
729 | ```{python}
730 | 
731 | ```
732 | 
733 | `@sct`
734 | ```{python}
735 | msg1 = msg2 = msg3 = "Incorrect, try again. Try the different import statements in the IPython shell and see which one causes the line `my_inv([[1, 2], [3, 4]])` to run without errors."
736 | msg4 = "Correct! The `as` keyword allows you to create a local name for the function you're importing: `inv()` is now available as `my_inv()`."
737 | Ex().has_chosen(4, [msg1, msg2, msg3, msg4])
738 | ```
739 | 


--------------------------------------------------------------------------------
/chapter4.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title_meta: Chapter 4
  3 | title: NumPy
  4 | description: >-
  5 |   NumPy is a fundamental Python package to efficiently practice data science.
  6 |   Learn to work with powerful tools in the NumPy array, and get started with
  7 |   data exploration.
  8 | attachments:
  9 |   slides_link: 'https://projector-video-pdf-converter.datacamp.com/735/chapter4.pdf'
 10 | lessons:
 11 |   - nb_of_exercises: 5
 12 |     title: Numpy
 13 |   - nb_of_exercises: 5
 14 |     title: 2D Numpy Arrays
 15 |   - nb_of_exercises: 3
 16 |     title: 'Numpy: Basic Statistics'
 17 | ---
 18 | 
 19 | ## NumPy
 20 | 
 21 | ```yaml
 22 | type: VideoExercise
 23 | key: f4545baa53
 24 | xp: 50
 25 | ```
 26 | 
 27 | `@projector_key`
 28 | a0487c26210f6b71ea98f917734cea3a
 29 | 
 30 | ---
 31 | 
 32 | ## Your First NumPy Array
 33 | 
 34 | ```yaml
 35 | type: NormalExercise
 36 | key: 84cab9d170
 37 | lang: python
 38 | xp: 100
 39 | skills:
 40 |   - 2
 41 | ```
 42 | 
 43 | You're now going to dive into the world of baseball. Along the way, you'll get comfortable with the basics of `numpy`, a powerful package to do data science.
 44 | 
 45 | A list `baseball` has already been defined in the Python script, representing the height of some baseball players in centimeters. Can you add some code to create a `numpy` array from it?
 46 | 
 47 | `@instructions`
 48 | - Import the `numpy` package as `np`, so that you can refer to `numpy` with `np`.
 49 | - Use `np.array()` to create a `numpy` array from `baseball`. Name this array `np_baseball`.
 50 | - Print out the type of `np_baseball` to check that you got it right.
 51 | 
 52 | `@hint`
 53 | - `import numpy as np` will do the trick. Now, you have to use `np.fun_name()` whenever you want to use a `numpy` function.
 54 | - `np.array()` should take `baseball` as input. Assign the result of the function call to `np_baseball`.
 55 | - To print out the type of a variable `x`, simply type `print(type(x))`.
 56 | 
 57 | `@pre_exercise_code`
 58 | ```{python}
 59 | import numpy as np
 60 | ```
 61 | 
 62 | `@sample_code`
 63 | ```{python}
 64 | # Import the numpy package as np
 65 | 
 66 | 
 67 | baseball = [180, 215, 210, 210, 188, 176, 209, 200]
 68 | 
 69 | # Create a numpy array from baseball: np_baseball
 70 | 
 71 | 
 72 | # Print out type of np_baseball
 73 | 
 74 | ```
 75 | 
 76 | `@solution`
 77 | ```{python}
 78 | # Import the numpy package as np
 79 | import numpy as np
 80 | 
 81 | baseball = [180, 215, 210, 210, 188, 176, 209, 200]
 82 | 
 83 | # Create a NumPy array from baseball: np_baseball
 84 | np_baseball = np.array(baseball)
 85 | 
 86 | # Print out type of np_baseball
 87 | print(type(np_baseball))
 88 | ```
 89 | 
 90 | `@sct`
 91 | ```{python}
 92 | predef_msg = "You don't have to change or remove the predefined variables."
 93 | Ex().has_import("numpy")
 94 | Ex().check_correct(
 95 |   check_object("np_baseball"),
 96 |   multi(
 97 |     check_object("baseball", missing_msg=predef_msg).has_equal_value(incorrect_msg=predef_msg),
 98 |     check_function("numpy.array").check_args(0).has_equal_ast()
 99 |   )
100 | )
101 | 
102 | Ex().has_printout(0)
103 | success_msg("Great job!")
104 | ```
105 | 
106 | ---
107 | 
108 | ## Baseball players' height
109 | 
110 | ```yaml
111 | type: NormalExercise
112 | key: e7e25a89ea
113 | lang: python
114 | xp: 100
115 | skills:
116 |   - 2
117 | ```
118 | 
119 | You are a huge baseball fan. You decide to call the MLB (Major League Baseball) and ask around for some more statistics on the height of the main players. They pass along data on more than a thousand players, which is stored as a regular Python list: `height_in`. The height is expressed in inches. Can you make a `numpy` array out of it and convert the units to meters?
120 | 
121 | `height_in` is already available and the `numpy` package is loaded, so you can start straight away (Source: stat.ucla.edu).
122 | 
123 | `@instructions`
124 | - Create a `numpy` array from `height_in`. Name this new array `np_height_in`.
125 | - Print `np_height_in`.
126 | - Multiply `np_height_in` with `0.0254` to convert all height measurements from inches to meters. Store the new values in a new array, `np_height_m`.
127 | - Print out `np_height_m` and check if the output makes sense.
128 | 
129 | `@hint`
130 | - Use `np.array()` and pass it `height_in`. Store the result in `np_height_in`.
131 | - To print out a variable `x`, type `print(x)` in the Python script.
132 | - Perform calculations as if `np_height_in` is a single number: `np_height_in * conversion_factor` is part of the answer.
133 | - To print out a variable `x`, type `print(x)` in the Python script.
134 | 
135 | `@pre_exercise_code`
136 | ```{python}
137 | import pandas as pd
138 | mlb = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")
139 | height_in = mlb['Height'].tolist()
140 | import numpy as np
141 | ```
142 | 
143 | `@sample_code`
144 | ```{python}
145 | # Import numpy
146 | import numpy as np
147 | 
148 | # Create a numpy array from height_in: np_height_in
149 | 
150 | 
151 | # Print out np_height_in
152 | 
153 | 
154 | # Convert np_height_in to m: np_height_m
155 | 
156 | 
157 | # Print np_height_m
158 | 
159 | ```
160 | 
161 | `@solution`
162 | ```{python}
163 | # Import numpy
164 | import numpy as np
165 | 
166 | # Create a numpy array from height_in: np_height_in
167 | np_height_in = np.array(height_in)
168 | 
169 | # Print out np_height_in
170 | print(np_height_in)
171 | 
172 | # Convert np_height_in to m: np_height_m
173 | np_height_m = np_height_in * 0.0254
174 | 
175 | # Print np_height_m
176 | print(np_height_m)
177 | ```
178 | 
179 | `@sct`
180 | ```{python}
181 | Ex().has_import("numpy", same_as = False)
182 | 
183 | Ex().check_correct(
184 |   has_printout(0),
185 |   check_correct(
186 |     check_object('np_height_in').has_equal_value(),
187 |     check_function('numpy.array').check_args(0).has_equal_ast()
188 |   )
189 | )
190 | 
191 | Ex().check_correct(
192 |   has_printout(1),
193 |   check_object("np_height_m").has_equal_value(incorrect_msg = "Use `np_height_in * 0.0254` to calculate `np_height_m`.")
194 | )
195 | 
196 | success_msg("Nice! In the blink of an eye, `numpy` performs multiplications on more than 1000 height measurements.")
197 | ```
198 | 
199 | ---
200 | 
201 | ## NumPy Side Effects
202 | 
203 | ```yaml
204 | type: MultipleChoiceExercise
205 | key: 3662ff6637
206 | lang: python
207 | xp: 50
208 | skills:
209 |   - 2
210 | ```
211 | 
212 | `numpy` is great for doing vector arithmetic. If you compare its functionality with regular Python lists, however, some things have changed.
213 | 
214 | First of all, `numpy` arrays cannot contain elements with different types. 
215 | Second, the typical arithmetic operators, such as `+`, `-`, `*` and `/` have a different meaning for regular Python lists and `numpy` arrays.
216 | 
217 | Some lines of code have been provided for you. Try these out and select the one that would match this:
218 | 
219 | ```
220 | np.array([True, 1, 2]) + np.array([3, 4, False])
221 | ```
222 | 
223 | The `numpy` package is already imported as `np`.
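
To make the two differences described above concrete, here is a minimal sketch with values that differ from the ones in the question. The names `py_list`, `np_arr` and `mixed` are made up for illustration.

```python
import numpy as np

# Illustrative values, not the ones from the question above
py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# For regular lists, + concatenates
list_sum = py_list + py_list

# For numpy arrays, + adds element by element
arr_sum = np_arr + np_arr

# Mixed types are coerced to a single type: here the integer becomes a float
mixed = np.array([1, 2.5])

print(list_sum)
print(arr_sum)
print(mixed.dtype)
```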
224 | 
225 | `@possible_answers`
226 | - `np.array([True, 1, 2, 3, 4, False])`
227 | - `np.array([4, 3, 0]) + np.array([0, 2, 2])`
228 | - `np.array([1, 1, 2]) + np.array([3, 4, -1])`
229 | - `np.array([0, 1, 2, 3, 4, 5])`
230 | 
231 | `@hint`
232 | - Copy the different code chunks and paste them in the IPython Shell. Hit **enter** to run the code and see which output matches the one generated by `np.array([True, 1, 2]) + np.array([3, 4, False])`.
233 | 
234 | `@pre_exercise_code`
235 | ```{python}
236 | import numpy as np
237 | ```
238 | 
239 | `@sct`
240 | ```{python}
241 | msg1 = msg3 = msg4 = "Incorrect. Try out the different code chunks and see which one matches the target code chunk."
242 | msg2 = "Great job! `True` is converted to 1, `False` is converted to 0."
243 | Ex().has_chosen(2, [msg1, msg2, msg3, msg4])
244 | ```
245 | 
246 | ---
247 | 
248 | ## Subsetting NumPy Arrays
249 | 
250 | ```yaml
251 | type: NormalExercise
252 | key: fcb2a9007b
253 | lang: python
254 | xp: 100
255 | skills:
256 |   - 2
257 | ```
258 | 
259 | Subsetting (using the square bracket notation on lists or arrays) works exactly the same with both lists and arrays.
260 | 
261 | This exercise already has two lists, `height_in` and `weight_lb`, loaded in the background for you. These contain the height and weight of the MLB players as regular lists. It also has two `numpy` arrays, `np_weight_lb` and `np_height_in`, prepared for you.
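
The following sketch shows the same square-bracket syntax applied to a list and to an array. The values and the names `heights` and `np_heights` are invented for this example; the exercise itself uses the MLB data.

```python
import numpy as np

# Made-up values for illustration
heights = [74, 72, 75, 69, 71]
np_heights = np.array(heights)

# Single element: same syntax for both
first_from_list = heights[0]
first_from_array = np_heights[0]

# Slice: the start index is included, the end index is not
slice_from_list = heights[1:4]
slice_from_array = np_heights[1:4]

print(slice_from_list)
print(slice_from_array)
```

Because the end index is excluded, a slice that should include index `n` has to end at `n + 1`.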
262 | 
263 | `@instructions`
264 | - Subset `np_weight_lb` by printing out the element at index 50.
265 | - Print out a sub-array of `np_height_in` that contains the elements at index 100 up to **and including** index 110.
266 | 
267 | `@hint`
268 | - Make sure to wrap a `print()` call around your subsetting operations.
269 | - Use `[100:111]` to get the elements from index 100 up to and including index 110.
270 | 
271 | `@pre_exercise_code`
272 | ```{python}
273 | import pandas as pd
274 | mlb = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")
275 | height_in = mlb['Height'].tolist()
276 | weight_lb = mlb['Weight'].tolist()
277 | ```
278 | 
279 | `@sample_code`
280 | ```{python}
281 | import numpy as np
282 | 
283 | np_weight_lb = np.array(weight_lb)
284 | np_height_in = np.array(height_in)
285 | 
286 | # Print out the weight at index 50
287 | 
288 | 
289 | # Print out sub-array of np_height_in: index 100 up to and including index 110
290 | 
291 | ```
292 | 
293 | `@solution`
294 | ```{python}
295 | import numpy as np
296 | 
297 | np_weight_lb = np.array(weight_lb)
298 | np_height_in = np.array(height_in)
299 | 
300 | # Print out the weight at index 50
301 | print(np_weight_lb[50])
302 | 
303 | # Print out sub-array of np_height_in: index 100 up to and including index 110
304 | print(np_height_in[100:111])
305 | ```
306 | 
307 | `@sct`
308 | ```{python}
309 | Ex().has_import("numpy", same_as=False)
310 | msg = "You don't have to change or remove the predefined variables."
311 | Ex().multi(
312 |     check_object("np_height_in", missing_msg=msg).has_equal_value(incorrect_msg = msg),
313 |     check_object("np_weight_lb", missing_msg=msg).has_equal_value(incorrect_msg = msg)
314 | )
315 | 
316 | Ex().has_printout(0)
317 | Ex().has_printout(1)
318 | 
319 | success_msg("Nice! Time to learn something new: 2D NumPy arrays!")
320 | ```
321 | 
322 | ---
323 | 
324 | ## 2D NumPy Arrays
325 | 
326 | ```yaml
327 | type: VideoExercise
328 | key: 1241efac7a
329 | xp: 50
330 | ```
331 | 
332 | `@projector_key`
333 | ae3238dcc7feb9adecfee0c395fc8dc8
334 | 
335 | ---
336 | 
337 | ## Your First 2D NumPy Array
338 | 
339 | ```yaml
340 | type: NormalExercise
341 | key: 5cb045bb13
342 | lang: python
343 | xp: 100
344 | skills:
345 |   - 2
346 | ```
347 | 
348 | Before working on the actual MLB data, let's try to create a 2D `numpy` array from a small list of lists.
349 | 
350 | In this exercise, `baseball` is a list of lists. The main list contains 4 elements, one for each of 4 baseball players. Each of these elements is a list containing the height and the weight of that player, in this order. `baseball` is already coded for you in the script.
351 | 
352 | `@instructions`
353 | - Use `np.array()` to create a 2D `numpy` array from `baseball`. Name it `np_baseball`.
354 | - Print out the type of `np_baseball`.
355 | - Print out the `shape` attribute of `np_baseball`. Use `np_baseball.shape`.
356 | 
357 | `@hint`
358 | - `baseball` is already coded for you in the script. Call `np.array()` on it and store the resulting 2D `numpy` array in `np_baseball`.
359 | - Use `print()` in combination with `type()` for the second instruction.
360 | - `np_baseball.shape` will give you the dimensions of `np_baseball`. Make sure to wrap a `print()` call around it.
361 | 
362 | `@pre_exercise_code`
363 | ```{python}
364 | 
365 | ```
366 | 
367 | `@sample_code`
368 | ```{python}
369 | import numpy as np
370 | 
371 | baseball = [[180, 78.4],
372 |             [215, 102.7],
373 |             [210, 98.5],
374 |             [188, 75.2]]
375 | 
376 | # Create a 2D numpy array from baseball: np_baseball
377 | 
378 | 
379 | # Print out the type of np_baseball
380 | 
381 | 
382 | # Print out the shape of np_baseball
383 | 
384 | ```
385 | 
386 | `@solution`
387 | ```{python}
388 | import numpy as np
389 | 
390 | baseball = [[180, 78.4],
391 |             [215, 102.7],
392 |             [210, 98.5],
393 |             [188, 75.2]]
394 | 
395 | # Create a 2D numpy array from baseball: np_baseball
396 | np_baseball = np.array(baseball)
397 | 
398 | # Print out the type of np_baseball
399 | print(type(np_baseball))
400 | 
401 | # Print out the shape of np_baseball
402 | print(np_baseball.shape)
403 | ```
404 | 
405 | `@sct`
406 | ```{python}
407 | msg = "You don't have to change or remove the predefined variables."
408 | Ex().check_object("baseball", missing_msg=msg).has_equal_value(incorrect_msg = msg)
409 | Ex().has_import("numpy", same_as = False)
410 | 
411 | Ex().check_correct(
412 |     multi(
413 |         has_printout(0),
414 |         has_printout(1)
415 |     ),
416 |     check_correct(
417 |         check_object('np_baseball').has_equal_value(),
418 |         check_function('numpy.array').check_args(0).has_equal_ast()
419 |     )
420 | )
421 | 
422 | success_msg("Great! You're ready to convert the actual MLB data to a 2D `numpy` array now!")
423 | ```
424 | 
425 | ---
426 | 
427 | ## Baseball data in 2D form
428 | 
429 | ```yaml
430 | type: NormalExercise
431 | key: 5df25d0b7b
432 | lang: python
433 | xp: 100
434 | skills:
435 |   - 2
436 | ```
437 | 
438 | You realize that it makes more sense to restructure all this information in a 2D `numpy` array.
439 | 
440 | You have a Python list of lists. In this list of lists, each sublist represents the height and weight of a single baseball player. The name of this list is `baseball` and it has been loaded for you already (although you can't see it).
441 | 
442 | Store the data as a 2D array to unlock `numpy`'s extra functionality.
443 | 
444 | `@instructions`
445 | - Use `np.array()` to create a 2D `numpy` array from `baseball`. Name it `np_baseball`.
446 | - Print out the `shape` attribute of `np_baseball`.
447 | 
448 | `@hint`
449 | - `baseball` is already available in the Python environment. Call `np.array()` on it and store the resulting 2D `numpy` array in `np_baseball`.
450 | - `np_baseball.shape` will give the dimensions of `np_baseball`. Make sure to wrap a `print()` call around it.
451 | 
452 | `@pre_exercise_code`
453 | ```{python}
454 | import pandas as pd
455 | baseball = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")[['Height', 'Weight']].to_numpy().tolist()
456 | import numpy as np
457 | ```
458 | 
459 | `@sample_code`
460 | ```{python}
461 | import numpy as np
462 | 
463 | # Create a 2D numpy array from baseball: np_baseball
464 | np_baseball = 
465 | 
466 | # Print out the shape of np_baseball
467 | 
468 | ```
469 | 
470 | `@solution`
471 | ```{python}
472 | import numpy as np
473 | 
474 | # Create a 2D numpy array from baseball: np_baseball
475 | np_baseball = np.array(baseball)
476 | 
477 | # Print out the shape of np_baseball
478 | print(np_baseball.shape)
479 | ```
480 | 
481 | `@sct`
482 | ```{python}
483 | Ex().has_import("numpy", same_as = False)
484 | 
485 | Ex().check_correct(
486 |     has_printout(0),
487 |     check_correct(
488 |         check_object('np_baseball').has_equal_value(),
489 |         check_function('numpy.array').check_args(0).has_equal_ast()
490 |     )
491 | )
492 | 
493 | success_msg("Slick! Time to show off some killer features of multi-dimensional `numpy` arrays!")
494 | ```
495 | 
496 | ---
497 | 
498 | ## Subsetting 2D NumPy Arrays
499 | 
500 | ```yaml
501 | type: NormalExercise
502 | key: aeca4977f0
503 | lang: python
504 | xp: 100
505 | skills:
506 |   - 2
507 | ```
508 | 
509 | If your 2D `numpy` array has a regular structure, i.e. each row and column has a fixed number of values, complicated ways of subsetting become very easy. Have a look at the code below where the elements `"a"` and `"c"` are extracted from a list of lists.
510 | 
511 | ```
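512 | import numpy as np
513 | x = [["a", "b"], ["c", "d"]]  # example list of lists (assumed for illustration)
514 | np_x = np.array(x)
515 | np_x[:, 0]  # numpy: selects the first column, "a" and "c"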
516 | ```
517 | 
518 | The indexes before the comma refer to the rows, while those after the comma refer to the columns. The `:` is for slicing; in this example, it tells Python to include all rows.
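
The row and column positions described above can be sketched on a small array. The array `np_demo` and its values are hypothetical and only serve to show the three common patterns: a whole row, a whole column, and a single element.

```python
import numpy as np

# Small made-up 2D array: 3 rows, 2 columns
np_demo = np.array([[1, 2],
                    [3, 4],
                    [5, 6]])

second_row = np_demo[1, :]  # row index 1, all columns
first_col = np_demo[:, 0]   # all rows, column index 0
single = np_demo[2, 1]      # row index 2, column index 1

print(second_row)
print(first_col)
print(single)
```

Indexes start at zero, so the n-th row or column sits at index `n - 1`.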
519 | 
520 | `@instructions`
521 | - Print out the 50th row of `np_baseball`.
522 | - Make a new variable, `np_weight_lb`, containing the entire second column of `np_baseball`.
523 | - Select the height (first column) of the 124th baseball player in `np_baseball` and print it out.
524 | 
525 | `@hint`
526 | - You need row index 49 in the first instruction! More specifically, you'll want to use `[49, :]`.
527 | - To select the entire second column, you'll need `[:, 1]`.
528 | - For the last instruction, use `[123, 0]`; don't forget to wrap it all in a `print()` statement.
529 | 
530 | `@pre_exercise_code`
531 | ```{python}
532 | import pandas as pd
533 | baseball = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")[['Height', 'Weight']].to_numpy().tolist()
534 | import numpy as np
535 | ```
536 | 
537 | `@sample_code`
538 | ```{python}
539 | import numpy as np
540 | 
541 | np_baseball = np.array(baseball)
542 | 
543 | # Print out the 50th row of np_baseball
544 | 
545 | 
546 | # Select the entire second column of np_baseball: np_weight_lb
547 | 
548 | 
549 | # Print out height of 124th player
550 | 
551 | ```
552 | 
553 | `@solution`
554 | ```{python}
555 | import numpy as np
556 | 
557 | np_baseball = np.array(baseball)
558 | 
559 | # Print out the 50th row of np_baseball
560 | print(np_baseball[49,:])
561 | 
562 | # Select the entire second column of np_baseball: np_weight_lb
563 | np_weight_lb = np_baseball[:,1]
564 | 
565 | # Print out height of 124th player
566 | print(np_baseball[123, 0])
567 | ```
568 | 
569 | `@sct`
570 | ```{python}
571 | msg = "You don't have to change or remove the predefined variables."
572 | Ex().multi(
573 |     has_import("numpy", same_as = False),
574 |     check_object("np_baseball", missing_msg=msg).has_equal_value(incorrect_msg = msg)
575 | )
576 | 
577 | Ex().has_printout(0)
578 | 
579 | Ex().check_object('np_weight_lb').has_equal_value(incorrect_msg = "You can use `np_baseball[:,1]` to define `np_weight_lb`. This will select the entire second column.")
580 | 
581 | Ex().has_printout(1)
582 | 
583 | success_msg("This is going well!")
584 | ```
585 | 
586 | ---
587 | 
588 | ## 2D Arithmetic
589 | 
590 | ```yaml
591 | type: NormalExercise
592 | key: 1c2378b677
593 | lang: python
594 | xp: 100
595 | skills:
596 |   - 2
597 | ```
598 | 
599 | Like 1D `numpy` arrays, 2D `numpy` arrays perform calculations element by element.
600 | 
601 | `np_baseball` is coded for you; it's again a 2D `numpy` array with 3 columns representing height (in inches), weight (in pounds) and age (in years). `baseball` is available as a regular list of lists and `updated` is available as a 2D `numpy` array.
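
As a rough sketch of element-wise 2D arithmetic, the example below uses a tiny made-up array with two columns. The names `np_demo`, `change` and `factors`, and all values, are assumptions for illustration only. When a 2D array is combined with a 1D array that has one value per column, `numpy` applies that 1D array to every row (this is called broadcasting).

```python
import numpy as np

# Made-up data: 2 rows, 2 columns
np_demo = np.array([[70, 180],
                    [75, 200]])

# Same shape as np_demo: added element by element
change = np.array([[1, 2],
                   [0, -5]])
added = np_demo + change

# One factor per column: applied to every row
factors = np.array([2, 10])
scaled = np_demo * factors

print(added)
print(scaled)
```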
602 | 
603 | `@instructions`
604 | - You managed to get hold of the changes in height, weight and age of all baseball players. It is available as a 2D `numpy` array, `updated`. Add `np_baseball` and `updated` and print out the result.
605 | - You want to convert the units of height and weight to metric (meters and kilograms, respectively). As a first step, create a `numpy` array with three values: `0.0254`, `0.453592` and `1`. Name this array `conversion`.
606 | - Multiply `np_baseball` with `conversion` and print out the result.
607 | 
608 | `@hint`
609 | - `np_baseball + updated` will do an element-wise summation of the two `numpy` arrays.
610 | - Create a `numpy` array with `np.array()`; the input is a regular Python list with three elements.
611 | - `np_baseball * conversion` will work, without extra work. Try it out! Make sure to wrap it in a `print()` call.
612 | 
613 | `@pre_exercise_code`
614 | ```{python}
615 | import pandas as pd
616 | import numpy as np
617 | baseball = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")[['Height', 'Weight', 'Age']].to_numpy().tolist()
618 | n = len(baseball)
619 | updated = np.array(pd.read_csv("https://assets.datacamp.com/course/intro_to_python/update.csv", header = None))
620 | import numpy as np
621 | ```
622 | 
623 | `@sample_code`
624 | ```{python}
625 | import numpy as np
626 | 
627 | np_baseball = np.array(baseball)
628 | 
629 | # Print out addition of np_baseball and updated
630 | 
631 | 
632 | # Create numpy array: conversion
633 | 
634 | 
635 | # Print out product of np_baseball and conversion
636 | 
637 | ```
638 | 
639 | `@solution`
640 | ```{python}
641 | import numpy as np
642 | 
643 | np_baseball = np.array(baseball)
644 | 
645 | # Print out addition of np_baseball and updated
646 | print(np_baseball + updated)
647 | 
648 | # Create numpy array: conversion
649 | conversion = np.array([0.0254, 0.453592, 1])
650 | 
651 | # Print out product of np_baseball and conversion
652 | print(np_baseball * conversion)
653 | ```
654 | 
655 | `@sct`
656 | ```{python}
657 | Ex().has_import("numpy")
658 | 
659 | msg = "You don't have to change or remove the predefined variables."
660 | Ex().check_object("np_baseball", missing_msg=msg).has_equal_value(incorrect_msg = msg)
661 | 
662 | Ex().has_printout(0)
663 | 
664 | Ex().check_correct(
665 |     has_printout(1),
666 |     check_correct(
667 |         check_object('conversion').has_equal_value(),
668 |         check_function('numpy.array', index = 1).check_args(0).has_equal_value()
669 |     )    
670 | )
671 | 
672 | success_msg("Great job! Notice how with very little code, you can change all values in your `numpy` data structure in a very specific way. This will be very useful in your future as a data scientist!")
673 | ```
674 | 
675 | ---
676 | 
677 | ## NumPy: Basic Statistics
678 | 
679 | ```yaml
680 | type: VideoExercise
681 | key: 287995e488
682 | xp: 50
683 | ```
684 | 
685 | `@projector_key`
686 | 34495ba457d74296794d2a122c9b6e19
687 | 
688 | ---
689 | 
690 | ## Average versus median
691 | 
692 | ```yaml
693 | type: NormalExercise
694 | key: 509c588eb6
695 | lang: python
696 | xp: 100
697 | skills:
698 |   - 2
699 | ```
700 | 
701 | You now know how to use `numpy` functions to get a better feeling for your data. 
702 | 
703 | The baseball data is available as a 2D `numpy` array with 3 columns (height, weight, age) and 1015 rows. The name of this `numpy` array is `np_baseball`. After restructuring the data, however, you notice that some height values are abnormally high. Follow the instructions and discover which summary statistic is best suited if you're dealing with so-called _outliers_. `np_baseball` is available.
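
To see why outliers matter for summary statistics, consider this small sketch. The arrays `values` and `with_outlier` are invented; the second one differs from the first only in its last element.

```python
import numpy as np

# Made-up measurements
values = np.array([70, 72, 74, 76, 78])

# Same data, but the last value is off by a factor of 1000
with_outlier = np.array([70, 72, 74, 76, 78000])

mean_clean = np.mean(values)
median_clean = np.median(values)
mean_outlier = np.mean(with_outlier)
median_outlier = np.median(with_outlier)

print(mean_clean, median_clean)
print(mean_outlier, median_outlier)
```

A single extreme value pulls the mean far away from the bulk of the data, while the median stays where it was.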
704 | 
705 | `@instructions`
706 | - Create a `numpy` array `np_height_in` that is equal to the first column of `np_baseball`.
707 | - Print out the mean of `np_height_in`.
708 | - Print out the median of `np_height_in`.
709 | 
710 | `@hint`
711 | - Use 2D `numpy` subsetting: `[:,0]` is a part of the solution.
712 | - If `numpy` is imported as `np`, you can use `np.mean()` to get the mean of a NumPy array. Don't forget to throw in a `print()` call.
713 | - For the last instruction, use `np.median()`.
714 | 
715 | `@pre_exercise_code`
716 | ```{python}
717 | import pandas as pd
718 | np_baseball = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")[['Height', 'Weight', 'Age']].to_numpy()
719 | np_baseball[slice(0, 1015, 50), 0] = np_baseball[slice(0, 1015, 50), 0]*1000
720 | import numpy as np
721 | ```
722 | 
723 | `@sample_code`
724 | ```{python}
725 | import numpy as np
726 | 
727 | # Create np_height_in from np_baseball
728 | 
729 | 
730 | # Print out the mean of np_height_in
731 | 
732 | 
733 | # Print out the median of np_height_in
734 | 
735 | ```
736 | 
737 | `@solution`
738 | ```{python}
739 | import numpy as np
740 | 
741 | # Create np_height_in from np_baseball
742 | np_height_in = np_baseball[:,0]
743 | 
744 | # Print out the mean of np_height_in
745 | print(np.mean(np_height_in))
746 | 
747 | # Print out the median of np_height_in
748 | print(np.median(np_height_in))
749 | ```
750 | 
751 | `@sct`
752 | ```{python}
753 | Ex().has_import("numpy", same_as = False)
754 | 
755 | Ex().check_object("np_height_in").has_equal_value(incorrect_msg = "You can use `np_baseball[:,0]` to select the first column from `np_baseball`")
756 | 
757 | Ex().check_correct(
758 |     has_printout(0),
759 |     check_function('numpy.mean').has_equal_value()
760 | )
761 | 
762 | Ex().check_correct(
763 |     has_printout(1),
764 |     check_function('numpy.median').has_equal_value()
765 | )
766 | 
767 | success_msg("An average height of 1586 inches, that doesn't sound right, does it? However, the median does not seem affected by the outliers: 74 inches makes perfect sense. It's always a good idea to check both the median and the mean, to get an idea about the overall distribution of the entire dataset.")
768 | ```
769 | 
770 | ---
771 | 
772 | ## Explore the baseball data
773 | 
774 | ```yaml
775 | type: NormalExercise
776 | key: '4409948807'
777 | lang: python
778 | xp: 100
779 | skills:
780 |   - 2
781 | ```
782 | 
783 | Because the mean and median are so far apart, you decide to complain to the MLB. They find the error and send the corrected data over to you. It's again available as a 2D NumPy array `np_baseball`, with three columns.
784 | 
785 | The Python script in the editor already includes code to print out informative messages with the different summary statistics and `numpy` is already loaded as `np`. Can you finish the job? `np_baseball` is available.
786 | 
787 | `@instructions`
788 | - The code to print out the mean height is already included. Complete the code for the median height.
789 | - Use `np.std()` on the first column of `np_baseball` to calculate `stddev`. 
790 | - Do big players tend to be heavier? Use `np.corrcoef()` to store the correlation between the first and second column of `np_baseball` in `corr`.
791 | 
792 | `@hint`
793 | - Use `np.median()` to calculate the median. Make sure to select the correct column first!
794 | - Subset the same column when calculating the standard deviation with `np.std()`.
795 | - Use `np_baseball[:, 0]` and `np_baseball[:, 1]` to select the first and second columns; these are the inputs to `np.corrcoef()`.
796 | 
797 | `@pre_exercise_code`
798 | ```{python}
799 | import pandas as pd
800 | np_baseball = pd.read_csv("https://assets.datacamp.com/course/intro_to_python/baseball.csv")[['Height', 'Weight', 'Age']].to_numpy()
801 | import numpy as np
802 | ```
803 | 
804 | `@sample_code`
805 | ```{python}
806 | avg = np.mean(np_baseball[:,0])
807 | print("Average: " + str(avg))
808 | 
809 | # Print median height
810 | med = ____
811 | print("Median: " + str(med))
812 | 
813 | # Print out the standard deviation on height
814 | stddev = ____
815 | print("Standard Deviation: " + str(stddev))
816 | 
817 | # Print out correlation between first and second column
818 | corr = ____
819 | print("Correlation: " + str(corr))
820 | ```
821 | 
822 | `@solution`
823 | ```{python}
824 | avg = np.mean(np_baseball[:,0])
825 | print("Average: " + str(avg))
826 | 
827 | # Print median height
828 | med = np.median(np_baseball[:,0])
829 | print("Median: " + str(med))
830 | 
831 | # Print out the standard deviation on height
832 | stddev = np.std(np_baseball[:,0])
833 | print("Standard Deviation: " + str(stddev))
834 | 
835 | # Print out correlation between first and second column
836 | corr = np.corrcoef(np_baseball[:,0], np_baseball[:,1])
837 | print("Correlation: " + str(corr))
838 | ```
839 | 
840 | `@sct`
841 | ```{python}
842 | msg = "You shouldn't change or remove the predefined `avg` variable."
843 | Ex().check_object("avg", missing_msg=msg).has_equal_value(incorrect_msg=msg)
844 | 
845 | missing = "Have you used `np.median()` to calculate the median?"
846 | incorrect = "To calculate `med`, pass the first column of `np_baseball` to `numpy.median()`. The example of `np.mean()` shows how it's done."
847 | Ex().check_correct(
848 |   check_object("med").has_equal_value(),
849 |   check_function("numpy.median", index=0, missing_msg=missing).check_args(0).has_equal_value(incorrect_msg=incorrect)
850 | )
851 | 
852 | missing = "Have you used `np.std()` to calculate the standard deviation?"
853 | incorrect = "To calculate `stddev`, pass the first column of `np_baseball` to `numpy.std()`. The example of `np.mean()` shows how it's done."
854 | Ex().check_correct(
855 |   check_object("stddev").has_equal_value(),
856 |   check_function("numpy.std", index=0, missing_msg=missing).check_args(0).has_equal_value(incorrect_msg=incorrect)
857 | )
858 | 
859 | missing = "Have you used `np.corrcoef()` to calculate the correlation?"
860 | incorrect1 = "To calculate `corr`, the first argument to `np.corrcoef()` should be the first column of `np_baseball`, similar to how you did it before."
861 | incorrect2 = "To calculate `corr`, the second argument to `np.corrcoef()` should be the second column of `np_baseball`. Instead of `[:,0]`, use `[:,1]` this time."
862 | Ex().check_correct(
863 |   check_object("corr").has_equal_value(),
864 |   check_function("numpy.corrcoef", index=0, missing_msg=missing).multi(
865 |     check_args(0, missing_msg=incorrect1).has_equal_value(incorrect_msg=incorrect1),
866 |     check_args(1, missing_msg=incorrect2).has_equal_value(incorrect_msg=incorrect2)
867 |   )
868 | )
869 | 
870 | success_msg("Great work! You've built a solid foundation - now it's time to use all of your new data science skills to solve more challenges and make an impact.")
871 | ```
872 | 


--------------------------------------------------------------------------------
/course.yml:
--------------------------------------------------------------------------------
 1 | id: 735
 2 | title: Introduction to Python
 3 | programming_language: python
 4 | description: >-
 5 |   Python is a general-purpose programming language that is becoming ever more
 6 |   popular for data science. Companies worldwide are using Python to harvest
 7 |   insights from their data and gain a competitive edge. Unlike other Python
 8 |   tutorials, this course focuses on Python specifically for data science. In our
 9 |   Introduction to Python course, you’ll learn about powerful ways to store and
10 |   manipulate data, and helpful data science tools to begin conducting your own
11 |   analyses. Start DataCamp’s online Python curriculum now.
12 | from: 'python-base-prod:v2.0.0'
13 | practice_pool_id: 107
14 | datasets:
15 |   baseball.csv: MLB (baseball)
16 |   fifa.csv: FIFA (soccer)
17 | 


--------------------------------------------------------------------------------
/courses-introduction-to-python.Rproj:
--------------------------------------------------------------------------------
 1 | Version: 1.0
 2 | 
 3 | RestoreWorkspace: Default
 4 | SaveWorkspace: Default
 5 | AlwaysSaveHistory: Default
 6 | 
 7 | EnableCodeIndexing: Yes
 8 | UseSpacesForTab: Yes
 9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 


--------------------------------------------------------------------------------
/datasets/references.md:
--------------------------------------------------------------------------------
1 | # Sources of datasets
2 | 
3 | - MLB data: http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_MLB_HeightsWeights
4 | - Soccer data: https://github.com/jokecamp/FootballData


--------------------------------------------------------------------------------
/img/shield_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp/courses-introduction-to-python/4deee0de7c9fb66ce9687590847413b01862a7a0/img/shield_image.png


--------------------------------------------------------------------------------
/intro-to-python-keynotes.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp/courses-introduction-to-python/4deee0de7c9fb66ce9687590847413b01862a7a0/intro-to-python-keynotes.zip


--------------------------------------------------------------------------------
/requirements.sh:
--------------------------------------------------------------------------------
1 | # pip3 install numpy==1.11.0
2 | # pip3 install scipy==0.18.1
3 | 
4 | 


--------------------------------------------------------------------------------
/scripts/chapter1_script.md:
--------------------------------------------------------------------------------
 1 | --- video_exercise_key:d5509896f7
 2 | 
 3 | ## Hello Python!
 4 | 
 5 | Hi, my name is Filip and I'll be your host for Introduction to Python for Data Science. It's a long name, but that's to stress something: this is not just another Python tutorial. Instead, the focus will be on using Python specifically for data science. By the end of this course, you'll know about powerful ways to store and manipulate data and to deploy cool data science tools for your own analyses.
 6 | 
 7 | You will learn Python for Data Science through video lessons, like this one, and interactive exercises. You get your own Python session where you can experiment and try to come up with the correct code to solve the instructions. You're learning by doing, while receiving customized and instant feedback on your work.
 8 | 
  9 | Python was conceived by Guido van Rossum. What started as a hobby project soon became a general-purpose programming language: nowadays, you can use Python to build practically any piece of software. But how did this happen? Well, first of all, Python is open source. It's free to use. Second, it's very easy to build packages in Python, which are pieces of code that you can share with other people to solve specific problems. Over time, more and more of these packages have been developed specifically for data science. Suppose you want to make some fancy visualizations of your company's sales. There's a package for that. Or what about connecting to a database to analyze sensor measurements? There's also a package for that.
10 | 
11 | Currently, there are two common versions of Python, version 2.7 and 3.5 and later. Apart from some syntactical differences, they are pretty similar, but as support for version 2 will fade over time, our courses focus on Python 3. To install Python 3 on your own system, follow the steps at this URL.
12 | 
13 | Now that you're all eyes and ears for Python, let's start experimenting. I'll start with the Python shell, a place where you can type Python code and immediately see the results. In DataCamp's exercise interface, this shell is embedded here. Let's start off simple and use Python as a calculator. Let me type 4 + 5 and hit Enter. Python interprets what you typed and prints the result of your calculation, 9. The Python shell that's used here is actually not the original one; we're using IPython, short for Interactive Python, which is some kind of juiced up version of regular Python that'll be useful later on.
14 | 
 15 | Apart from interactively working with Python, you can also have Python run so-called Python scripts. These Python scripts are simply text files with the extension (dot) py. A script is basically a list of Python commands that are executed, almost as if you were typing the commands in the shell yourself, line by line. Let's put the command from before in a script now, that can be found here in DataCamp's interface. The next step is executing the script, by clicking 'Submit Answer'.
16 | 
17 | If you execute this script in the DataCamp interface, there's nothing in the output pane. That's because you have to explicitly use print() inside scripts if you want to generate output during execution. Let's wrap our previous calculation in a print() call, and rerun the script. This time, the same output as before is generated, great!
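The difference between a bare expression and a `print()` call in a script can be sketched as follows. This is a minimal illustration, not the exact script shown in the video; the variable name `result` is an assumption introduced here.

```python
# A bare expression in a script is evaluated, but its value is not displayed.
4 + 5

# Wrapping the calculation in print() makes the result appear in the output.
result = 4 + 5  # 'result' is a hypothetical name used for illustration
print(result)
```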
18 | 
19 | Putting your code in Python scripts instead of manually retyping every step interactively will help you to keep structure and avoid retyping everything over and over again if you want to make a change; you simply make the change in the script, and rerun the entire thing.
20 | 
21 | Now that you've got an idea about different ways of working with Python, I suggest you head over to the exercises. Use the IPython Shell for experimentation, and use the Python script editor to code the actual answer. If you click Submit Answer, your script will be executed and checked for correctness. Have fun!
22 | 
23 | --- video_exercise_key:ef8356fb92
24 | 
25 | ## Variables and types
26 | 
27 | It's clear that Python is a great calculator. If you want to do more complex calculations though, you will want to "save" values while you're coding along. You can do this by defining a variable, with a specific, case-sensitive name. Once you create (or declare) such a variable, you can later call up its value by typing the variable name.
28 | 
29 | Suppose you measure your height and weight, in metric units: you are 1.79 meters tall, and weigh 68.7 kilograms. You can assign these values to two variables, named height and weight, with an equals sign:
30 | 
31 | If you now type the name of the variable, height,
32 | 
33 | Python looks for the variable name, retrieves its value, and prints it out.
34 | 
 35 | Let's now calculate the Body Mass Index, or BMI, which is calculated as follows, with weight in kilograms and height in meters. You can do this with the actual values, but you can just as well use the variables height and weight, like in here. Every time you type the variable's name, you are asking Python to replace it with the actual value of the variable. weight corresponds to 68.7, and height to 1.79.
36 | 
37 | Finally, this version has Python store the result in a new variable, bmi. bmi now contains the same value as the one you calculated earlier.
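The steps described so far can be sketched in a few lines. The values come from the script above; the formula used is the standard BMI definition (weight divided by height squared), which the video shows on a slide rather than in this transcript.

```python
# Measurements in metric units, as given in the script
height = 1.79  # meters
weight = 68.7  # kilograms

# BMI is weight divided by the square of height
bmi = weight / height ** 2
print(bmi)
```

Changing `weight` and rerunning these lines recalculates `bmi` without touching the formula.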
38 | 
 39 | In Python, variables are used all the time. They help to make your code reproducible. Suppose the code to create the height, weight and bmi variables is in a script, like this. If you now want to recalculate the bmi for another weight, you can simply change the declaration of the weight variable, and rerun the script. The bmi changes accordingly, because the value of the variable weight has changed as well.
40 | 
41 | So far, we've only worked with numerical values, such as height and weight. In Python, these numbers all have a specific type. You can check out the type of a value with the type() function. To see the type of our bmi value, simply write type and then bmi inside parentheses. You can see that it's a float, which is python's way of representing a real number, so a number which can have both an integer part and a fractional part. Python also has a type for integers: int, like this example.
42 | 
43 | To do data science, you'll need more than ints and floats, though. Python features tons of other data types. The most common ones are strings and booleans. 
44 | 
45 | A string is Python's way to represent text. You can use both double and single quotes to build a string, as you can see from these examples. If you print the type of the last variable here, you see that it's str, short for string.
46 | 
47 | The Boolean is a type that can either be True or False. You can think of it as 'Yes' and 'No' in everyday language. Booleans will be very useful in the future, to perform filtering operations on your data for example.
48 | 
49 | There's something special about Python data types. Have a look at this line of code, that sums two integers, and then this line of code, that sums two strings. 
50 | 
51 | For the integers, the values were summed, while for the strings, the strings were pasted together. The plus operator behaved differently for different data types. This is a general principle: how the code behaves depends on the types you're working with.
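A small sketch of this type-dependent behavior follows. The operands and the names `int_sum` and `str_sum` are illustrative assumptions, not necessarily those used in the video.

```python
# With integers, + performs arithmetic addition
int_sum = 2 + 3
print(int_sum)        # 5
print(type(int_sum))  # <class 'int'>

# With strings, + pastes the two strings together
str_sum = "ab" + "cd"
print(str_sum)        # abcd
print(type(str_sum))  # <class 'str'>
```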
52 | 
53 | In the exercises that follow, you'll create your first variables and experiment with some of Python's data types. I'll see you in the next video to explain all about lists.
54 | 


--------------------------------------------------------------------------------
/scripts/chapter2_script.md:
--------------------------------------------------------------------------------
 1 | --- video_exercise_key:f366e876d8
 2 | 
 3 | ## Lists
 4 | 
 5 | By now, you've played around with different data types. On the numbers side, there's the float, to represent a real number, and the int, to represent an integer. Next, we also have str, short for string, to represent text in Python, and bool, which can be either True or False. You can save these values as a variable, like these examples show. Each variable then represents a single value.
 6 | 
  7 | As a data scientist, you'll often want to work with many data points. If, for example, you want to measure the height of everybody in your family and store this information in Python, it would be inconvenient to create a new Python variable for each point you collected, right?
 8 | 
 9 | What you can do instead, is store all this information in a Python list. You can build such a list with square brackets. Suppose you asked your two sisters and parents for their height, in meters. You can build the list as follows:
10 | 
 11 | Of course, this data structure can also be referenced with a variable. Simply put the variable name and the equals sign in front, like here.
12 | 
13 | A list is a way to give a single name to a collection of values. These values, or elements, can have any type; they can be floats, integer, booleans, strings, but also more advanced Python types, even lists.
14 | 
15 | It's perfectly possible for a list to contain different types as well. Suppose, for example, that you want to add the names of your sisters and parents to the list, so that you know which height belongs to who. You can throw in some strings without issues.
16 | 
17 | But that's not all. I just told you that lists can also contain lists themselves. Instead of putting the strings in between the numbers, you can create little sublists for each member of the family. One for liz, one for emma and so on. Now, you can tell Python that these sublists are the elements of another list, that I named fam2: the little lists are wrapped in square brackets and separated with commas. If you now print out fam2, you see that we have a list of lists. The main list contains 4 sub-lists.
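The list variants described above might look like the sketch below. The names and heights are taken from the later parts of this script; the name `fam_heights` is an assumption added here to distinguish the numbers-only list.

```python
# A list holding only the heights, in meters
fam_heights = [1.73, 1.68, 1.71, 1.89]

# A list mixing strings and floats: each name is followed by a height
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# A list of lists: one sublist per family member
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]

print(fam2)
print(len(fam2))   # 4 sublists
print(type(fam2))  # <class 'list'>
```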
18 | 
19 | We're dealing with a new Python type here, next to the strings, booleans, integers and floats you already know about: the list. These calls show that both fam and fam2 are lists. Remember that I told you that each type has specific functionality and behavior associated? Well, for lists, this is also true. Python lists host a bunch of tools to subset and adapt them. But let's take this step by step, and have you experiment with list creation first!
20 | 
21 | --- video_exercise_key:9e15e5b8a0
22 | 
23 | ## Subsetting lists
24 | 
25 | After you've created your very own Python list, you might wonder how you can access information in the list. Python uses the index to do this. Have a look at the fam list again here. The first element in the list has index 0, the second element has index 1, and so on. Suppose that you want to select the height of emma, the float 1.68. It's the fourth element, so it has index 3. To select it, you use 3 inside square brackets.
26 | 
27 | Similarly, to select the string "dad" from the list, which is the seventh element in the list, you'll need to put the index 6 inside square brackets.
28 | 
29 | You can also count backwards, using negative indexes. This is useful if you want to get some elements at the end of your list. To get your dad's height, for example, you'll need the index -1. These are the negative indexes for all list elements.
30 | 
31 | This means that this line and this line, return the exact same result.
32 | 
33 | Apart from indexing, there's also something called slicing, which allows you to select multiple elements from a list, thus creating a new list. You can do this by specifying a range, using a colon. Let's first have another look at the list, and then try this piece of code.
34 | 
 35 | Can you guess what it'll return? A list with the float 1.68, the string "mom", and the float 1.71, corresponding to the 4th, 5th and 6th element in the list maybe? Let's see what the output is.
36 | 
37 | Apparently, only the elements with index 3 and 4, get returned. The element with index 5 is not included. In general, this is the syntax: the index you specify before the colon, so where the slice starts, is included, while the index you specify after the colon, where the slice ends, is not.
38 | 
39 | With this in mind, can you tell what this call will return?
40 | 
41 | You probably guessed correctly that this call gives you a list with three elements, corresponding to the elements with index 1, 2 and 3 of the fam list. 
42 | 
43 | You can also choose to just leave out the index before or after the colon. If you leave out the index where the slice should begin, you're telling Python to start the slice from index 0, like this example.
44 | 
45 | If you leave out the index where the slice should end, you include all elements up to and including the last element in the list, like here.
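The indexing and slicing rules from this section can be collected in one sketch. The `fam` list follows the script; the exact slices used for the "leave out an index" cases (`fam[:4]` and `fam[5:]`) are assumptions chosen for illustration.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# Indexing: zero-based from the front, negative from the back
print(fam[3])    # 1.68, emma's height
print(fam[6])    # dad
print(fam[-1])   # 1.89, same element as fam[7]

# Slicing: the start index is included, the end index is not
print(fam[3:5])  # [1.68, 'mom']
print(fam[1:4])  # [1.73, 'emma', 1.68]

# Omitting the start begins at index 0; omitting the end runs to the last element
print(fam[:4])   # ['liz', 1.73, 'emma', 1.68]
print(fam[5:])   # [1.71, 'dad', 1.89]
```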
46 | 
47 | Now it's time to head over to the exercises, where you will continue to work on the list you've created yourself before. You'll use different subsetting methods to get exactly the piece of information you need!
48 | 
49 | --- video_exercise_key:fbdaaec22a
50 | 
51 | ## Manipulating lists
52 | 
53 | After creation and subsetting, the final piece of the Python lists puzzle is manipulation, so ways to change elements in your list, or to add elements to and remove elements from your list.
54 | 
55 | Changing list elements is pretty straightforward. You use the same square brackets that we've used to subset lists, and then assign new elements to it using the equals sign. Suppose that after another look at `fam`, you realize that your dad's height is not up to date anymore, as he's shrinking with age. Instead of 1.89 meters, it should be 1.86 meters. To change this list element, which is at index 7, you can use this line of code.
56 | 
57 | If you now check out fam, you'll see that the value is updated.
58 | 
59 | You can even change an entire list slice at once. To change the elements "liz" and 1.73, you access the first two elements with 0:2, and then assign a new list to it.
60 | 
61 | Do you still remember how the plus operator was different for strings and integers? Well, it's again different for lists. If you use the plus sign with two lists, Python simply pastes together their contents in a single list. Suppose you want to add your own name and height to the fam height list. This will do the trick.
62 | 
63 | Of course, you can also store this new list in a variable, `fam_ext` for example.
64 | 
65 | Finally, deleting elements from a list is also pretty straightforward, you'll have to use `del` here. Take this line, for example, that deletes the element with index 2, so "emma", from the list.
66 | 
67 | If you check out fam now, you'll see that the "emma" string is gone. Because you've removed an index, all elements that came after "emma" scooted over by one index. If you again run the same line, you're again removing the element at index 2, which is emma's height, 1.68 meters now.
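The manipulations described above can be sketched as one sequence. The replacement values `"lisa"` and `1.74` and the added pair `"me"`, `1.79` are assumptions for illustration; the transcript does not state what the video uses.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# Change a single element: dad's height sits at index 7
fam[7] = 1.86

# Change a slice: replace the first two elements at once
fam[0:2] = ["lisa", 1.74]  # hypothetical replacement values

# The + operator pastes two lists together into a new list
fam_ext = fam + ["me", 1.79]  # hypothetical name and height

# Delete the element at index 2 ("emma"); later elements shift left by one
del fam[2]
print(fam)

# Running the same line again now removes emma's height, 1.68
del fam[2]
print(fam)
```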
68 | 
69 | Understanding how Python lists actually work behind the scenes becomes pretty important now. What actually happens when you create a new list, `x`, like this?
70 | 
71 | Well, in a simplified sense, you're storing a list in your computer memory, and store the 'address' of that list, so where the list is in your computer memory, in `x`. This means that `x` does not actually contain all the list elements, it rather contains a reference to the list. For basic operations, the difference is not that important, but it becomes more so when you start copying lists. Let me clarify this with an example.
72 | 
73 | Let's store the list `x` as a new variable `y`, by simply using the equals sign.
74 | 
75 | Let's now change the element with index one in the list `y`, as follows.
76 | 
77 | The funky thing is that if you now check out `x` again, also here the second element was changed.
78 | 
 79 | That's because when you copied x to y with the equals sign, you copied the reference to the list, not the actual values themselves. When you're updating an element in the list, it's one and the same list in the computer memory you're changing. Both `x` and `y` point to this list, so the update is visible from both variables.
80 | 
81 | If you want to create a list `y` that points to a new list in the memory with the same values, you'll need to use something else than the equals sign. You can use the `list()` function, like this, or use slicing to select all list elements explicitly.
82 | 
83 | If you now make a change to the list `y` points to, `x` is not affected.
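The difference between copying a reference and copying the values can be sketched as follows. The list contents and the names `y_copy` and `y_slice` are assumptions for illustration.

```python
x = ["a", "b", "c"]

# y = x copies the reference: both names point to the same list
y = x
y[1] = "z"
print(x)  # ['a', 'z', 'c'], the change is visible through x as well

# list(x) and x[:] each build a new list with the same values
y_copy = list(x)
y_slice = x[:]
y_copy[1] = "b"
y_slice[0] = "q"
print(x)  # ['a', 'z', 'c'], x is unaffected by changes to the copies
```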
84 | 
85 | If this was a bit too much to take in, don't worry. The exercises will help you understand list manipulation and the subtle inner workings of lists. I'm sure you'll do great!
86 | 


--------------------------------------------------------------------------------
/scripts/chapter3_script.md:
--------------------------------------------------------------------------------
  1 | --- video_exercise_key:2dde2f90b8
  2 | 
  3 | ## Functions, what are they?
  4 | 
  5 | In this video, I'm going to introduce you to functions. Functions aren't entirely new for you actually: you've already used them. type(), for example, is a function that returns the type of a value. But what is a function? Simply put, a function is a piece of reusable code, aimed at solving a particular task. You can call functions instead of having to write code yourself. Maybe an example can clarify things here.
  6 | 
  7 | Suppose you have the list containing only the heights of your family, fam:
  8 | 
  9 | Say that you want to get the maximum value in this list. Instead of writing your own piece of Python code that goes through the list and finds the highest value, you can also use Python's max() function. This is one of Python's built-in functions, just like type(). We simply pass fam to max() inside parentheses.
 10 | 
 11 | The output makes sense: 1.89, the highest number in the list. 
 12 | 
 13 | max() worked kind of like a black box here: you passed it a list, then the implementation of `max()`, that you don't know, did its magic, and produced an output. How max() actually did this, is not important to you, it just does what it's supposed to, and you didn't have to write your own code, which made your life easier.
 14 | 
 15 | Of course, it's possible to also assign the result of a function call to a new variable, like here. Now `tallest` is just like any other variable; you can use it to continue your fancy calculations.
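A minimal sketch of the `max()` call described above, using the family heights mentioned earlier in the course:

```python
# fam holds only the heights here, in meters
fam = [1.73, 1.68, 1.71, 1.89]

# max() returns the largest value in the list
tallest = max(fam)
print(tallest)  # 1.89
```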
 16 | 
 17 | Another one of these built-in functions is round(). It takes two inputs: first, a number you want to round, and second, the precision with which to round, so how many digits behind the decimal point you want to keep. Say you want to round 1.68 to one decimal place. The first input is 1.68, the second input is 1. You separate the inputs with a comma.
 18 | 
 19 | But there's more. It's perfectly possible to call the round() function with only one input, like this. This time, Python figured out that you didn't specify the second input, and automatically chooses to round the number to the closest integer. 
 20 | 
 21 | To understand why both approaches work, let's open up the documentation. You can do this with yet another function, `help`, as follows.
 22 | 
 23 | It appears that round() takes two inputs. In Python, these inputs, also called arguments, have names: number and ndigits. When you call the function round(), with these two inputs, Python matches the inputs to the arguments: number is set to 1.68 and ndigits is set to 1. Next, The round() function does its calculations with number and ndigits as if they are variables in a Python script. We don't know exactly what code Python executes. What is important, though, is that the function produces an output, namely the number 1.68 rounded to 1 decimal place.
 24 | 
 25 | If you call the function round() with only one input, Python again tries to match the inputs to the arguments. There's no input to match to the ndigits argument though. Luckily, the internal machinery of the round() function knows how to handle this. When ndigits is not specified, the function simply rounds to the closest integer and returns that integer. That's why we got the number 2.
 26 | 
 27 | How was I so sure that calling the function with a single input would work? Well, in the documentation, there are square brackets around the comma and the ndigits here. This tells us that you can call round() in this form, as well as in this one. In other words, ndigits is an optional argument. Actually, Python offers yet another way to show that a function has optional arguments, but that's something for the exercises.
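Both ways of calling `round()` can be sketched as below. The variable names are assumptions for illustration; `help(round)` is left as a comment because it prints the full documentation text.

```python
# Two inputs: the number to round and the number of decimals to keep
rounded_one = round(1.68, 1)
print(rounded_one)         # 1.7

# One input: ndigits is omitted, so the result is the closest integer
rounded_none = round(1.68)
print(rounded_none)        # 2
print(type(rounded_none))  # <class 'int'>

# help(round)  # uncomment to display the documentation for round()
```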
 28 | 
 29 | By now, you have an idea about how to use max() and round(), but how could you know that a function such as round() exists in Python in the first place? Well, this is something you will learn with time. Whenever you are doing a rather standard task in Python, you can be pretty sure that there's already a function that can do this for you. In that case, you should definitely use it! Just do a quick internet search and you'll find the function you need with a nice usage example. And there is of course DataCamp, where you'll also learn about powerful functions and how to use them. Get straight to it in the interactive exercises!
 30 | 
 31 | --- video_exercise_key:e1aaeb300b
 32 | 
 33 | ## Methods
 34 | 
  35 | Built-in functions are only one part of the Python story. You already know about functions such as max(), to get the maximum of a list, len(), to get the length of a list or a string, and so on. But what about other basic things, such as getting the index of a specific element in the list, or reversing a list? You can look very hard for built-in functions that do this, but you won't find them.
 36 | 
  37 | In the past exercises, you've already created a bunch of variables. Among other Python types, you've created strings, floats and lists, like the ones you see here. Each one of these values or data structures is a so-called Python object. This string is an object, this float is an object, but this list is also an object. These objects have a specific type, that you already know: string, float, and list, and of course they represent the values you gave them, such as "liz", 1.73 and an entire list. But next to that, Python objects also come with a bunch of so-called "methods". You can think of methods as functions that "belong to" Python objects. A Python object of type string has methods, such as capitalize and replace, but also objects of type float and list have specific methods depending on the type.
 38 | 
  39 | Enough for the theory now; let's try to use a method! Suppose you want to get the index of the string "mom" in the fam list. fam is a Python object with the type list, and has a method named index(). To call the method, you use the dot notation, like this. The only input is the string "mom", the element you want to get the index for. 
 40 | 
 41 | Python returns 4, which indeed is the index of the string "mom". I called the index() method "on" the fam list here, and the output was 4. Similarly, I can use the count() method on the fam list to count the number of times 1.73 occurs in the list.
 42 | 
 43 | Python gives me 1, which makes sense, because only liz is 1.73 meters tall. 
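The two list method calls described above can be sketched as follows, using the `fam` list from the earlier chapters. The names `mom_index` and `liz_height_count` are assumptions for illustration.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# index() returns the position of the first matching element
mom_index = fam.index("mom")
print(mom_index)         # 4

# count() returns how many times a value occurs in the list
liz_height_count = fam.count(1.73)
print(liz_height_count)  # 1
```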
 44 | 
 45 | 
 46 | But lists are not the only Python objects that have methods associated. Also floats, integers, booleans and strings are Python objects that have specific methods associated with them. Take the variable `sister` for example, that represents a string.
 47 | 
 48 | You can call the method capitalize() on sister, without any inputs. It returns a string where the first letter is capitalized now.
 49 | 
 50 | Or what if you want to replace some parts of the string with other parts? Not a problem. Just call the method replace on sister, with two appropriate inputs.
 51 | 
 52 | In the output, "z" is replaced with "sa".
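A short sketch of the two string methods, assuming `sister` holds the string `"liz"` (the transcript does not show the assignment itself):

```python
sister = "liz"  # assumed value, based on the names used in this course

# capitalize() returns a new string with the first letter in upper case
capitalized = sister.capitalize()
print(capitalized)  # Liz

# replace() returns a new string with "z" replaced by "sa"
replaced = sister.replace("z", "sa")
print(replaced)     # lisa

# Neither call changes the original string
print(sister)       # liz
```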
 53 | 
 54 | I guess it's clear by now: in Python, everything is an object, and each object has specific methods associated. Depending on the type of the object, list, string, float, whatever, the available methods are different. A string object like sister has a replace method, but a list like fam doesn't have this, as you can see from this error. Objects of different types can have methods with the same name: Take the index() method. It's available for both strings and lists. If you call it on a string, you get the index of the letters in the string; If you call it on a list, you get the index of the element in the list. This means that, depending on the type of the object, the methods behave differently.
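The type-dependent behavior of methods can be sketched as below. The value of `sister` and the arguments passed to `replace()` are assumptions for illustration; the error is caught here only so that the sketch runs to completion.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# index() exists on both types, but means something different for each
letter_index = sister.index("z")  # position of a character in the string
element_index = fam.index("mom")  # position of an element in the list
print(letter_index)   # 2
print(element_index)  # 4

# Lists have no replace() method, so this call raises an AttributeError
try:
    fam.replace("mom", "mommy")
except AttributeError as error:
    message = str(error)
    print(message)
```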
 55 | 
 56 | Before I unleash you on some exercises on methods, there's one more thing I want to tell you. Some methods can change the objects they are called on. Let's retake the fam list, and call the append() method on it. As the input, we pass a string we want to add to the list.
 57 | 
 58 | Python doesn't generate an output, but if we check the `fam` list again, we see that it has been extended with the string "me".
 59 | 
 60 | Let's do this again, this time to add my height to the list.
 61 | 
 62 | Again, the fam list was extended.
 63 | 
 64 | This is pretty cool, because you can write very concise code to update your data structures on the fly, but it can also be pretty dangerous. Some method calls don't change the object they're called on, while others do, so watch out.
 65 | 
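To make the difference concrete, here is a small sketch of a method that changes its object next to one that does not. The list contents and the height 1.79 are assumptions for illustration.

```python
# Assumed starting list
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# append() changes the list in place and returns None
result = fam.append("me")
fam.append(1.79)
print(result)  # None
print(fam)     # the list now ends with 'me', 1.79

# capitalize() leaves the original string untouched and returns a new one
name = "me"
capitalized = name.capitalize()
print(name)         # me
print(capitalized)  # Me
```

A quick way to tell the two kinds apart is to check whether the call produced an output: methods that modify their object in place, such as append(), typically return None.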
 66 | Let's take a step back here and summarise this. You have Python functions, like type(), max() and round(), that you can call like this.
 67 | There are also methods, which are functions that are specific to Python objects. Depending on the type of the Python object you're dealing with, you'll be able to use different methods, and they behave differently. You can call methods on the objects with the dot notation, like this, for example.
 68 | 
 69 | There's much more to tell about Python objects, methods and how Python works internally, but for now, let's stick to what I've talked about here. It's time to get some exercises and add methods to your ever-growing skillset!
 70 | 
 71 | --- video_exercise_key:2b89c5a9d8
 72 | 
 73 | ## Packages
 74 | 
 75 | By now, I hope you're convinced that Python functions and methods are extremely powerful: you can basically use other people's code to solve your own problems. However, adding all functions and methods that have been written up to now to the same Python distribution would be a mess. There would be tons and tons of code in there that you'll never use. Also, maintaining all of this code would be a real pain.
 76 | 
 77 | This is where packages come into play. You can think of a package as a directory of Python scripts. Each such script is a so-called module. These modules specify functions, methods and new Python types aimed at solving particular problems. There are thousands of Python packages available from the internet. Among them are packages for data science: there's numpy to efficiently work with arrays, matplotlib for data visualization, and scikit-learn for machine learning.
 78 | 
 79 | Not all these packages are available in Python by default. To use Python packages, you'll first have to install them on your own system, and then put code in your script to tell Python that you want to use these packages.
 80 | 
 81 | DataCamp already has all necessary packages installed for you, but if you want to install them on your own system, you'll want to use pip, a package management system for Python. If you go to this URL, you can download the file get-pip.py. Next, you go to the terminal, and execute python3 get-pip.py. Now you can use pip to actually install a Python package of your choosing. Suppose we want to install the numpy package, which you'll learn about in the next chapter. You type pip3 install numpy. You have to use the commands python3 and pip3 here to tell your system that you're working with Python version 3.
 82 | 
 83 | Now that the package is installed, you can actually start using it in one of your Python scripts. Before you can do this, you should import the package, or a specific module of the package. You can do this with the import statement.
 84 | 
 85 | To import the entire numpy package, you can do import numpy, like this.
 86 | 
 87 | A commonly used function in Numpy is array(). It takes a list as input. Simply calling the array function like this will generate an error, because Python doesn't know where to find it.
 88 | 
 89 | To refer to the array function from the numpy package, you'll need this.
 90 | 
 91 | This time it works. The Numpy array is very useful to do data science, but more on that later.
 92 | 
 93 | Using this numpy dot prefix all the time can become pretty tiring, so you can also import the package and refer to it with a different name. You can do this by extending your import statement with as, like this.
 94 | 
 95 | Now, instead of numpy.array(), you'll have to use np.array() to use Numpy's array function.
 96 | 
 97 | There are cases in which you only need one specific function of a package. Python allows you to make this explicit in your code. Suppose that we only want to use the array() function from the Numpy package. Instead of doing import numpy, you can instead do from numpy import array, like this.
 98 | 
 99 | This time, you can simply call the array function like this, no need to use numpy dot here. 
100 | 
101 | This from ... import form, which pulls in only specific parts of a package, can be useful to limit the amount of typing, but you also lose some of the context. Suppose you're working in a long Python script. You import the array function from numpy at the very top, and way later, you actually use this array function. Somebody else who's reading your code might have forgotten that this array function is a specific Numpy function; it's not clear from the function call. In that respect, the more standard import numpy call is preferred: in this case, your function call is numpy.array(), making it very clear that you're working with Numpy. At the end of the day, it's a matter of personal preference; it's up to you to decide what you think is most convenient!
102 | 
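The import variants discussed in this video, side by side in one sketch. It assumes numpy is installed; the list `[1, 2, 3]` is just an example input, and the `try`/`except` only keeps the script running after the expected error.

```python
import numpy

# A bare array() call fails here: only the name numpy has been imported
bare_call_failed = False
try:
    array([1, 2, 3])
except NameError:
    bare_call_failed = True
print(bare_call_failed)  # True

# Refer to the function through the package name
a = numpy.array([1, 2, 3])

# Import the package under a shorter name
import numpy as np
b = np.array([1, 2, 3])

# Import one specific function; no prefix is needed afterwards
from numpy import array
c = array([1, 2, 3])

print(a, b, c)
```

All three calls create the same array; they differ only in how much of the origin of array() stays visible at the call site.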
103 | Off to the exercises now, where you can practice on different ways of importing packages and modules yourself!
104 | 


--------------------------------------------------------------------------------
/scripts/chapter4_script.md:
--------------------------------------------------------------------------------
 1 | --- video_exercise_key:ed471f4b00
 2 | 
 3 | ## Intro to Numpy
 4 | 
 5 | By now, you are aware that the Python list is pretty powerful. A list can hold any type and can hold different types at the same time. You can also change, add and remove elements. This is wonderful, but one feature is missing, a feature that is super important for aspiring data scientists like yourself. When analyzing data, you'll often want to carry out operations over entire collections of values, and you want to do this fast. With lists, this is a problem.
 6 | 
 7 | Let's retake the heights of your family and yourself. Suppose you've also asked for everybody's weight. It's not very polite, but everything for science, right? You end up with two lists, height, and weight. The first person is 1.73 meters tall and weighs 65.4 kilograms.
 8 | 
 9 | If you now want to calculate the Body Mass Index for each family member, you'd hope that this call can work, making the calculations element-wise.
10 | 
11 | Unfortunately, Python throws an error, because it has no idea how to do calculations on lists. You could solve this by going through each list element one after the other, and calculating the BMI for each person separately, but this is terribly inefficient and tiresome to write.
12 | 
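A sketch of the problem with plain lists. Only the first height and weight (1.73 and 65.4) come from the script; the other values are assumptions for illustration. The loop shows the element-by-element workaround mentioned above.

```python
# Only the first pair of values is from the script; the rest are assumed
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Lists do not support element-wise arithmetic, so this raises a TypeError
list_math_failed = False
try:
    weight / height ** 2
except TypeError as err:
    list_math_failed = True
    print("TypeError:", err)

# The workaround: compute the BMI for each person separately
bmi = []
for w, h in zip(weight, height):
    bmi.append(w / h ** 2)
print(bmi)
```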
13 | A way more elegant solution is to use NumPy, or Numerical Python. It's a Python package that, among other things, provides an alternative to the regular Python list: the Numpy array. The Numpy array is pretty similar to the list, but has one additional feature: you can perform calculations over entire arrays. It's really easy, and super-fast as well.
14 | 
15 | The Numpy package is already installed on DataCamp's servers, but if you want to work with it on your own system, go to the command line and execute pip3 install numpy.
16 | 
17 | Next, to actually use Numpy in your Python session, you can import the numpy package, like this.
18 | 
19 | Let's start with creating a numpy array. You do this with Numpy's array() function: the input is a regular Python list. I'm using array() twice here, to create Numpy versions of the height and weight lists from before: np_height and np_weight:
20 | 
21 | Let's try to calculate everybody's BMI with a single call again.
22 | 
23 | This time, it worked fine: the calculations were performed element-wise. The first person's BMI was calculated by dividing the first element in np_weight by the square of the first element in np_height, the second person's BMI was calculated with the second height and weight elements, and so on.
24 | 
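The same calculation with Numpy arrays, as a minimal sketch. The height and weight values beyond the first pair are assumptions; `np_height`, `np_weight` and `bmi` follow the names used in the script.

```python
import numpy as np

# Only the first pair of values is from the script; the rest are assumed
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Convert the regular lists to Numpy arrays
np_height = np.array(height)
np_weight = np.array(weight)

# The calculation is now performed element-wise, one BMI per person
bmi = np_weight / np_height ** 2
print(bmi)
```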
25 | Let's do a quick comparison here. First, we tried to do calculations with regular lists, like this, but this gave us an error, because Python doesn't know how to do calculations with lists like we want it to. Next, these regular lists were converted to Numpy arrays. The same operations now work without any problem: Numpy knows how to work with arrays as if they are single values, which is pretty awesome if you ask me.
26 | 
27 | You should still pay attention, though. First of all, Numpy can do all of this so easily because it assumes that your Numpy array can only contain values of a single type. It's either an array of floats, or an array of booleans, and so on. If you do try to create an array with different types, like this for example, the resulting Numpy array will contain a single type, string in this case. The boolean and the float were both converted to strings.
28 | 
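A small sketch of this type coercion. The three input values are assumptions, chosen to mix a float, a string and a boolean.

```python
import numpy as np

# Assumed example input mixing a float, a string and a boolean
mixed = np.array([1.0, "is", True])

# Every element has been converted to a string
print(mixed)

# The dtype kind "U" stands for unicode strings
print(mixed.dtype.kind)  # U
```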
29 | Second, you should know that a Numpy array is simply a new kind of Python type, like the float, string and list types from before. This means that it comes with its own methods, which can behave differently than you'd expect. Take this Python list and this numpy array, for example.
30 | 
31 | If you do python_list + python_list, the list elements are pasted together, generating a list with 6 elements. If you do this with the numpy arrays, on the other hand, Python will do an element-wise sum of the arrays.
32 | 
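The difference between the two `+` operations, sketched with assumed three-element inputs (the script only says that the resulting list has 6 elements).

```python
import numpy as np

# Assumed example values
python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

# For lists, + pastes the two lists together
list_sum = python_list + python_list
print(list_sum)  # [1, 2, 3, 1, 2, 3]

# For Numpy arrays, + adds the elements pairwise
array_sum = numpy_array + numpy_array
print(array_sum)  # [2 4 6]
```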
33 | Just make sure to pay attention when you're juggling around with different Python types, because the outcomes can differ a lot! 
34 | 
35 | Apart from these subtleties, you can work with Numpy arrays pretty much the same as you can with regular Python lists. When you want to get elements from your array, for example, you can use square brackets. Suppose you want to get the `bmi` for the second person, so at index 1. This will do the trick.
36 | 
37 | Specifically for Numpy, there's also another way to do subsetting: using an array of booleans. Say you want to get all BMI values in the bmi array that are over 23. A first step is using the greater than sign, like this:
38 | 
39 | The result is a Numpy array containing booleans: True if the corresponding bmi is above 23, False if it isn't. Next, you can use this boolean array inside square brackets to do subsetting. Only the elements in bmi that are above 23, so for which the corresponding boolean value is True, are selected. There's only one BMI that's above 23, so we end up with a Numpy array with a single value, that specific BMI.
40 | 
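Both kinds of subsetting in one sketch. The height and weight values are assumptions (only the first pair appears in the script), chosen so that exactly one BMI ends up above 23; the names `is_high` and `high_bmi` are made up for this example.

```python
import numpy as np

# Assumed data; only the first height and weight are from the script
np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
bmi = np_weight / np_height ** 2

# Square brackets with an index select a single element
second_bmi = bmi[1]
print(second_bmi)

# A comparison returns an array of booleans
is_high = bmi > 23
print(is_high)  # [False False False  True False]

# The boolean array selects only the elements where it is True
high_bmi = bmi[is_high]
print(high_bmi)
```

In practice the two steps are often combined into a single expression, `bmi[bmi > 23]`.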
41 | Using the result of a comparison to make a selection of your data is a very common way to get surprising insights. Learn all about it and the other Numpy basics in the exercises!
42 | 
43 | --- video_exercise_key:84e9f3c38d
44 | 
45 | ## 2D Numpy arrays
46 | 
47 | Let's recreate the numpy arrays from the previous video.
48 | 
49 | If you ask for the type of these arrays, Python tells you that they are numpy.ndarray. numpy dot tells you it's a type that was defined in the numpy package. ndarray stands for n-dimensional array. The arrays np_height and np_weight are one-dimensional arrays, but it's perfectly possible to create 2 dimensional, three dimensional, heck even seven dimensional arrays! Let's stick to 2 in this video though.
50 | 
51 | You can create a 2D numpy array from a regular Python list of lists. Let's try to create one numpy array for all height and weight data of your family, like this.
52 | 
53 | If you print out np_2d now, you'll see that it is a rectangular data structure: each sublist in the list corresponds to a row in the two-dimensional numpy array. From np_2d.shape, you can see that we indeed have 2 rows and 5 columns. shape is a so-called attribute of the np_2d array, that can give you more information about what the data structure looks like.
54 | 
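A minimal sketch of creating the 2D array and inspecting it. The values are assumptions, apart from the first height and weight; `np_2d` follows the name used in the script.

```python
import numpy as np

# Assumed data: first sublist holds heights, second sublist holds weights
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

# The type is numpy.ndarray, whatever the number of dimensions
print(type(np_2d))

# shape is an attribute, so there are no parentheses after it
print(np_2d.shape)  # (2, 5)
```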
55 | Also for 2D arrays, the Numpy rule applies: an array can only contain a single type. If you change one float to be a string, all the array elements will be coerced to strings, to end up with a homogeneous array.
56 | 
57 | You can think of the 2D numpy array as an improved list of lists: you can perform calculations on the arrays, like I showed before, and you can do more advanced ways of subsetting.
58 | 
59 | Suppose you want the first row, and then the third element in that row. To select the row, you need the index 0 in square brackets.
60 | 
61 | To then select the third element, you can extend the same call with another pair of brackets, this time with the index 2, like this. Basically you're selecting the row, and then from that row do another selection.
62 | 
63 | There's also an alternative way of subsetting, using single square brackets and a comma. This call returns the exact same value as before. The value before the comma specifies the row, the value after the comma specifies the column. The intersection of the rows and columns you specified is returned.
64 | 
65 | Once you get used to it, this syntax is more intuitive and opens up more possibilities. Suppose you want to select the height and weight of the second and third family members. You want both rows, so you put in a colon before the comma. You only want the second and third columns, so you put in the slice 1:3 after the comma. Remember that the end index, 3, is not included here. The intersection gives us a 2D array with 2 rows and 2 columns:
66 | 
67 | Similarly, you can select the weight of all family members like this: you only want the second row, so put 1 before the comma. You want all columns, so you use a colon after the comma. The intersection gives us the entire second row.
68 | 
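The subsetting calls from this video, collected in one sketch. The data values are assumptions apart from the first height and weight, and the result names (`chained`, `with_comma`, `middle_members`, `all_weights`) are made up for this example.

```python
import numpy as np

# Assumed data: row 0 holds heights, row 1 holds weights
np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

# First row, then the third element of that row
chained = np_2d[0][2]

# The same selection with a single pair of brackets and a comma
with_comma = np_2d[0, 2]
print(chained, with_comma)  # 1.71 1.71

# Both rows, columns 1 and 2 (the end index 3 is not included)
middle_members = np_2d[:, 1:3]
print(middle_members.shape)  # (2, 2)

# Second row, all columns: everybody's weight
all_weights = np_2d[1, :]
print(all_weights)
```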
69 | Finally, 2D numpy arrays enable you to do element-wise calculations, the same way you did it with 1D numpy arrays. That's something you can experiment with in the exercises, along with creating and subsetting 2D numpy arrays! Exciting...
70 | 
71 | --- video_exercise_key:16403c5a74
72 | 
73 | ## Basic Statistics with Numpy
74 | 
75 | A typical first step in analyzing your data is getting to know your data in the first place. For the Numpy arrays from before, this is pretty easy, because it isn't a lot of data. However, as a data scientist, you'll be crunching thousands, if not millions or billions of numbers.
76 | 
77 | Imagine you conduct a city-wide survey where you ask 5000 adults about their height and weight. You end up with something like this: a 2D numpy array, which I named np_city, that has 5000 rows, corresponding to the 5000 people, and two columns, corresponding to the height and the weight.
78 | 
79 | Simply staring at these numbers like a zombie won't give you any insights. What you can do, though, is generate summarizing statistics about your data. Aside from an efficient data structure for number crunching, it happens that Numpy is also good at doing these kinds of things. 
80 | 
81 | For starters, you can try to find out the average height of these 5000 people, with Numpy's mean function. Because it's a function from the Numpy package, don't forget to start with the np. prefix.
82 | 
83 | Of course, I first had to do a subsetting operation to get the height column from the 2D array. It appears that on average, people are 1.75 meters tall. What about the median height? This is the height of the middle person if you sort all persons from shortest to tallest. Instead of writing complicated Python code to figure this out, you can simply use Numpy's median() function:
84 | 
85 | You can do similar things for the weight column in np_city. Often, these summarizing statistics will provide you with a "sanity check" of your data. If you end up with an average weight of 2000 kilograms, your measurements are most likely incorrect.
86 | 
87 | Apart from mean() and median(), there are also other functions, like corrcoef() to check if for example height and weight are correlated,
88 | 
89 | and std(), for standard deviation. 
90 | 
91 | Numpy also features more basic functions, such as sum() and sort(), which also exist in the basic Python distribution. However, the big difference here is speed. Because Numpy enforces a single data type in an array, it can drastically speed up the calculations.
92 | 
93 | Just a sidenote here: If you're wondering how I came up with the data in this video: I simulated it with Numpy functions! I sampled two random distributions 5000 times to create the height and weight arrays, and then used column_stack to paste them together as two columns. Another thing that Numpy can do!
94 | 
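A sketch of how such a dataset could be simulated and summarized. The distribution parameters, the seed and the use of `default_rng` are assumptions for illustration, not the values behind the video; only `column_stack` and the statistics functions are named in the script.

```python
import numpy as np

# Assumed parameters: normal distributions with made-up means and spreads
rng = np.random.default_rng(seed=42)  # seed chosen arbitrarily
height = np.round(rng.normal(1.75, 0.20, 5000), 2)
weight = np.round(rng.normal(60.32, 15, 5000), 2)

# Paste the two 1D arrays together as two columns
np_city = np.column_stack((height, weight))
print(np_city.shape)  # (5000, 2)

# Summary statistics on the height column
mean_height = np.mean(np_city[:, 0])
median_height = np.median(np_city[:, 0])
std_height = np.std(np_city[:, 0])
print(mean_height, median_height, std_height)

# Correlation matrix between the height and weight columns
correlation = np.corrcoef(np_city[:, 0], np_city[:, 1])
print(correlation)
```

Because the two columns are sampled independently in this sketch, the off-diagonal correlation will be close to zero; real height and weight data would normally show a positive correlation.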
95 | Another great tool to get some sense of your data is to visualize it, but that's something for later. First, head over to the exercises to learn how to explore your Numpy arrays!
96 | 


--------------------------------------------------------------------------------
/slides/ch4_slides.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp/courses-introduction-to-python/4deee0de7c9fb66ce9687590847413b01862a7a0/slides/ch4_slides.pdf


--------------------------------------------------------------------------------
/slides/chapter_1_433dcfcfedaee070cbf440491c402e3b.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: 433dcfcfedaee070cbf440491c402e3b
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch1_2.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch1_2.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Variables and Types
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: dc8b62f1c8
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Well done and welcome back! It's clear that Python is a great calculator. If you want to do more complex calculations though, you will want to "save" values while you're coding along.
 27 | 
 28 | ---
 29 | 
 30 | ## Variable
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 36ec318b41
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Specific, case-sensitive name
 39 | 
 40 | - Call up value through variable name{{1}}
 41 | 
 42 | - 1.79 m - 68.7 kg{{2}}
 43 | 
 44 | ```py
 45 | height = 1.79
 46 | weight = 68.7
 47 | ```{{3}}
 48 | ```py
 49 | height
 50 | ```{{4}}
 51 | 
 52 | ```out
 53 | 1.79
 54 | ```{{4}}
 55 | 
 56 | `@script`
 57 | You can do this by defining a variable, with a specific, case-sensitive name. Once you create (or declare) such a variable, you can later call up its value by typing the variable name.
 58 | 
 59 | Suppose you measure your height and weight, in metric units: you are 1.79 meters tall, and weigh 68.7 kilograms. You can assign these values to two variables, named height and weight, with an equals sign:
 60 | 
 61 | If you now type the name of the variable, height,
 62 | 
 63 | Python looks for the variable name, retrieves its value, and prints it out.
 64 | 
 65 | ---
 66 | 
 67 | ## Calculate BMI
 68 | 
 69 | ```yaml
 70 | type: TwoColumns
 71 | key: fe1b10a93b
 72 | code_zoom: 80
 73 | ```
 74 | 
 75 | `@part1`
 76 | ```py
 77 | height = 1.79
 78 | weight = 68.7
 79 | ```
 80 | ```py
 81 | height
 82 | ```
 83 | 
 84 | ```out
 85 | 1.79
 86 | ```
 87 | 
 88 | $$ \text{BMI} = \frac{\text{weight}}{\text{height}^2} $${{1}}
 89 | 
 90 | `@part2`
 91 | ```py
 92 | 68.7 / 1.79 ** 2
 93 | ```{{2}}
 94 | 
 95 | ```out
 96 | 21.4413
 97 | ```{{2}}
 98 | 
 99 | ```py
100 | weight / height ** 2
101 | ```{{3}}
102 | 
103 | ```out
104 | 21.4413
105 | ```{{3}}
106 | 
107 | ```py
108 | bmi = weight / height ** 2
109 | bmi
110 | ```{{4}}
111 | 
112 | ```out
113 | 21.4413
114 | ```{{4}}
115 | 
116 | `@script`
117 | Let's now calculate the Body Mass Index, or BMI, which is calculated as follows, with weight in kilograms and height in meters. You can do this with the actual values, but you can just as well use the variables height and weight, like here. Every time you type the variable's name, you are asking Python to replace it with the actual value of the variable. weight corresponds to 68.7, and height to 1.79.
118 | 
119 | Finally, this version has Python store the result in a new variable, bmi. bmi now contains the same value as the one you calculated earlier.
120 | 
121 | In Python, variables are used all the time. They help to make your code reproducible.
122 | 
123 | ---
124 | 
125 | ## Reproducibility
126 | 
127 | ```yaml
128 | type: FullSlide
129 | key: 9980f47f9d
130 | ```
131 | 
132 | `@part1`
133 | ```py
134 | height = 1.79
135 | weight = 68.7
136 | bmi = weight / height ** 2
137 | print(bmi)
138 | ```
139 | 
140 | ```out
141 | 21.4413
142 | ```
143 | 
144 | `@script`
145 | Suppose the code to create the height, weight and bmi variables is in a script, like this. If you now want to recalculate the bmi for another weight,
146 | 
147 | ---
148 | 
149 | ## Reproducibility
150 | 
151 | ```yaml
152 | type: FullSlide
153 | key: a4e899f00f
154 | disable_transition: true
155 | ```
156 | 
157 | `@part1`
158 | ```py
159 | height = 1.79
160 | weight = 74.2 # <-
161 | bmi = weight / height ** 2
162 | print(bmi)
163 | ```
164 | 
165 | ```out
166 | 23.1578
167 | ```
168 | 
169 | `@script`
170 | you can simply change the declaration of the weight variable, and rerun the script. The bmi changes accordingly, because the value of the variable weight has changed as well.
171 | 
172 | So far, we've only worked with numerical values, such as height and weight.
173 | 
174 | ---
175 | 
176 | ## Python Types
177 | 
178 | ```yaml
179 | type: FullSlide
180 | key: 9d86084ad4
181 | ```
182 | 
183 | `@part1`
184 | ```py
185 | type(bmi)
186 | ```{{1}}
187 | 
188 | ```out
189 | float
190 | ```{{1}}
191 | 
192 | ```py
193 | day_of_week = 5
194 | type(day_of_week)
195 | ```{{2}}
196 | 
197 | ```out
198 | int
199 | ```{{2}}
200 | 
201 | `@script`
202 | In Python, these numbers all have a specific type. You can check out the type of a value with the type function. To see the type of our bmi value, simply write type and then bmi inside parentheses. You can see that it's a float, which is Python's way of representing a real number, so a number which can have both an integer part and a fractional part. Python also has a type for integers: int, like in this example.
203 | 
204 | To do data science, you'll need more than ints and floats, though.
205 | 
206 | ---
207 | 
208 | ## Python Types (2)
209 | 
210 | ```yaml
211 | type: FullSlide
212 | key: d971d34e6a
213 | ```
214 | 
215 | `@part1`
216 | ```py
217 | x = "body mass index"
218 | y = 'this works too'
219 | ```{{1}}
220 | ```py
221 | type(y)
222 | ```{{2}}
223 | 
224 | ```out
225 | str
226 | ```{{2}}
227 | 
228 | ```py
229 | z = True
230 | type(z)
231 | ```{{3}}
232 | 
233 | ```out
234 | bool
235 | ```{{3}}
236 | 
237 | `@script`
238 | Python features tons of other data types. The most common ones are strings and booleans.
239 | 
240 | A string is Python's way to represent text. You can use both double and single quotes to build a string, as you can see from these examples. If you print the type of the last variable here, you see that it's str, short for string.
241 | 
242 | The Boolean is a type that can either be True or False. You can think of it as 'Yes' and 'No' in everyday language. Booleans will be very useful in the future, to perform filtering operations on your data for example.
243 | 
244 | There's something special about Python data types.
245 | 
246 | ---
247 | 
248 | ## Python Types (3)
249 | 
250 | ```yaml
251 | type: FullSlide
252 | key: 24601e2af0
253 | ```
254 | 
255 | `@part1`
256 | ```py
257 | 2 + 3
258 | ```{{1}}
259 | 
260 | ```out
261 | 5
262 | ```{{1}}
263 | 
264 | ```py
265 | 'ab' + 'cd'
266 | ```{{2}}
267 | 
268 | ```out
269 | 'abcd'
270 | ```{{2}}
271 | 
272 | - Different type = different behavior!{{3}}
273 | 
274 | `@script`
275 | Have a look at this line of code, that sums two integers, and then this line of code, that sums two strings.
276 | 
277 | For the integers, the values were summed, while for the strings, the strings were pasted together. The plus operator behaved differently for different data types. This is a general principle: how the code behaves depends on the types you're working with.
278 | 
279 | In the exercises that follow, you'll create your first variables and experiment with some of Python's data types. I'll see you in the next video to explain all about lists.
280 | 
281 | ---
282 | 
283 | ## Let's practice!
284 | 
285 | ```yaml
286 | type: FinalSlide
287 | key: b7fc40db4d
288 | ```
289 | 
290 | `@script`
291 | Let's get you coding and I can't wait to see you in the next chapter where you'll build even more awesome python charts.
292 | 


--------------------------------------------------------------------------------
/slides/chapter_1_d8fcd4c930027fa4e1c3870c7e7e0ff1.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: d8fcd4c930027fa4e1c3870c7e7e0ff1
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v8/735_ch1_1.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v8/hls-735_ch1_1.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Hello Python!
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: f743ca8c41
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Hi, my name is Hugo and I'll be your host for Introduction to Python for Data Science.
 27 | 
 28 | I'm a data scientist and educator at DataCamp.
 29 | 
 30 | ---
 31 | 
 32 | ## How you will learn
 33 | 
 34 | ```yaml
 35 | type: FullSlide
 36 | key: 30ee08a725
 37 | disable_transition: true
 38 | ```
 39 | 
 40 | `@part1`
 41 | ![DataCamp Interface](https://assets.datacamp.com/production/repositories/288/datasets/729574d2168960686381caefe79baf5978e27d0d/liveexercise.gif)
 42 | 
 43 | `@script`
 44 | In this course, you will learn Python for Data Science through video lessons, like this one, and interactive exercises. You get your own Python session where you can experiment and try to come up with the correct code to solve the instructions. You're learning by doing, while receiving customized and instant feedback on your work.
 45 | 
 46 | ---
 47 | 
 48 | ## Python
 49 | 
 50 | ```yaml
 51 | type: FullSlide
 52 | key: 3f23b93572
 53 | ```
 54 | 
 55 | `@part1`
 56 | ![guido-hba.png](https://assets.datacamp.com/production/repositories/288/datasets/fb3e4b8dc114529dafffb37d33f2b2244210d40f/guido-hba.png = 38){{1}}
 57 | 
 58 | - General purpose: build anything{{2}}
 59 | 
 60 | - Open source! Free!{{3}}
 61 | 
 62 | - Python packages, also for data science{{4}}
 63 | 
 64 | 	- Many applications and fields{{5}}
 65 | 
 66 | `@script`
 67 | Python was conceived by Guido van Rossum. Here, you can see a photo of me with Guido. What started as a hobby project soon became a general purpose programming language: nowadays, you can use Python to build practically any piece of software. But how did this happen? Well, first of all, Python is open source. It's free to use. Second, it's very easy to build packages in Python, which is code that you can share with other people to solve specific problems. Over time, more and more of these packages specifically built for data science have been developed. Suppose you want to make some fancy visualizations of your company's sales. There's a package for that. Or what about connecting to a database to analyze sensor measurements? There's also a package for that.
 68 | People often refer to Python as the Swiss Army knife of programming languages, as you can do almost anything with it.
 69 | In this course, we'll start to build up your data science coding skills bit by bit, so make sure to stick around to see how powerful the language can be.
 70 | 
 71 | ---
 72 | 
 73 | ## IPython Shell
 74 | 
 75 | ```yaml
 76 | type: FullSlide
 77 | key: 43a91a7217
 78 | ```
 79 | 
 80 | `@part1`
 81 | **Execute Python commands**
 82 | 
 83 | ![ipython_shell.png](https://assets.datacamp.com/production/repositories/288/datasets/a9e8440bb8fbd49e4a73e4c36ef1cd677c0dd55f/pyexercise.png = 95)
 84 | 
 85 | `@script`
 86 | Now that you're all eyes and ears for Python, let's start experimenting. I'll start with the
 87 | 
 88 | ---
 89 | 
 90 | ## IPython Shell
 91 | 
 92 | ```yaml
 93 | type: FullSlide
 94 | key: 9c51ee700d
 95 | disable_transition: true
 96 | ```
 97 | 
 98 | `@part1`
 99 | **Execute Python commands**
100 | 
101 | ![ipython_shell_highlighted.png](https://assets.datacamp.com/production/repositories/288/datasets/dd43cc0183b15b43a072eb0fbab4caa72dee9250/pyexercise_shell.jpg = 95)
102 | 
103 | `@script`
104 | Python shell, a place where you can type Python code and immediately see the results. In DataCamp's exercise interface, this shell is embedded here. Let's start off simple and use Python as a calculator.
105 | 
106 | ---
107 | 
108 | ## IPython Shell
109 | 
110 | ```yaml
111 | type: FullSlide
112 | key: 524e4c20a7
113 | disable_transition: true
114 | ```
115 | 
116 | `@part1`
117 | &nbsp;
118 | 
119 | ![Calculations in DataCamp's IPython shell](https://assets.datacamp.com/production/repositories/288/datasets/cee32b788a62e4b9a1234ccde56ac9ebb49cfa72/shelladdition.gif = 95)
120 | 
121 | `@script`
122 | Let me type 4 + 5, and hit Enter. Python interprets what you typed and prints the result of your calculation, 9. The Python shell that's used here is actually not the original one; we're using IPython, short for Interactive Python, which is some kind of juiced up version of regular Python that'll be useful later on.
123 | 
124 | IPython was created by Fernando Pérez and is part of the broader Jupyter ecosystem. Apart from interactively working with Python, you can also have Python run so-called
125 | 
126 | ---
127 | 
128 | ## Python Script
129 | 
130 | ```yaml
131 | type: FullSlide
132 | key: 78ef256bc0
133 | ```
134 | 
135 | `@part1`
136 | - Text files - `.py`{{1}}
137 | 
138 | - List of Python commands{{2}}
139 | 
140 | - Similar to typing in IPython Shell{{3}}
141 | 
142 | ![Python script in DataCamp](https://assets.datacamp.com/production/repositories/288/datasets/59f196e96536543a4fb8801228019fc4106f3791/pyexercise_script.jpg = 78){{3}}
143 | 
144 | `@script`
145 | Python scripts. These Python scripts are simply text files with the extension (dot) py. It's basically a list of Python commands that are executed, almost as if you were typing the commands in the shell yourself, line by line.
146 | 
147 | ---
148 | 
149 | ## Python Script
150 | 
151 | ```yaml
152 | type: FullSlide
153 | key: 717d124175
154 | disable_transition: true
155 | ```
156 | 
157 | `@part1`
158 | ![GIF: typing 4 + 5 in the script and hitting submit answer. No output is shown.](https://assets.datacamp.com/production/repositories/288/datasets/2f96e979012e15329cc158d1e0f496aac3539f45/scriptnoprint.gif = 95)
159 | 
160 | `@script`
161 | Let's put the command from before in a script now, which can be found here in DataCamp's interface. The next step is executing the script, by clicking 'Submit Answer'. If you execute this script in the DataCamp interface, there's nothing in the output pane. That's because you have to explicitly use print inside scripts if you want to generate output during execution.
162 | 
163 | ---
164 | 
165 | ## Python Script
166 | 
167 | ```yaml
168 | type: FullSlide
169 | key: c7a9d02fb6
170 | disable_transition: true
171 | code_zoom: 90
172 | ```
173 | 
174 | `@part1`
175 | ![python_script_print.gif](https://assets.datacamp.com/production/repositories/288/datasets/8b13d046bb54dcb11aa49f0da7363781129d1561/scriptwithprint.gif = 95)
176 | 
177 | - Use `print()` to generate output from script
178 | 
179 | `@script`
180 | Let's wrap our previous calculation in a print call, and rerun the script. This time, the same output as before is generated, great! Putting your code in Python scripts instead of manually retyping every step interactively will help you to keep structure and avoid retyping everything over and over again if you want to make a change; you simply make the change in the script, and rerun the entire thing.
181 | 
182 | ---
183 | 
184 | ## DataCamp Interface
185 | 
186 | ```yaml
187 | type: FullSlide
188 | key: 693ba1cd14
189 | ```
190 | 
191 | `@part1`
192 | ![Screenshot of DataCamp interface](https://assets.datacamp.com/production/repositories/288/datasets/a9e8440bb8fbd49e4a73e4c36ef1cd677c0dd55f/pyexercise.png)
193 | 
194 | `@script`
195 | Now that you've got an idea about different ways of working with Python, I suggest you head over to the exercises. Use the IPython Shell for experimentation, and use the Python script editor to code the actual answer. If you click Submit Answer, your script will be executed and checked for correctness.
196 | 
197 | ---
198 | 
199 | ## Let's practice!
200 | 
201 | ```yaml
202 | type: FinalSlide
203 | key: 7445cd202e
204 | ```
205 | 
206 | `@script`
207 | Get coding and don't forget to have fun!
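
As a small recap of the script behavior described earlier, here is a minimal sketch you could paste into the script editor. The variable name `result` is only an illustrative choice and is not part of the course material.

```python
# In a script, a bare expression is evaluated but its value is not shown.
4 + 5

# Wrapping the value in print() makes it appear in the output.
result = 4 + 5
print(result)
```

Only the `print()` call generates output when this runs as a script. In the IPython Shell, typing `4 + 5` on its own would display the result directly.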
208 | 


--------------------------------------------------------------------------------
/slides/chapter_2_355ed52d2fb0d67508c6a311b7cbc6d3.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: 355ed52d2fb0d67508c6a311b7cbc6d3
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch2_3.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch2_3.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Manipulating Lists
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 6484e4d1f6
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Wow, you're doing super well. So now, after creation and subsetting, the final piece of the Python lists puzzle is
 27 | 
 28 | ---
 29 | 
 30 | ## List Manipulation
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 5b83249ee9
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Change list elements{{1}}
 39 | 
 40 | - Add list elements{{2}}
 41 | 
 42 | - Remove list elements{{3}}
 43 | 
 44 | `@script`
 45 | manipulation, so ways to change elements in your list, or to add elements to and remove elements from your list.
 46 | 
 47 | ---
 48 | 
 49 | ## Changing list elements
 50 | 
 51 | ```yaml
 52 | type: FullSlide
 53 | key: c1d58a3c4c
 54 | code_zoom: 64
 55 | ```
 56 | 
 57 | `@part1`
 58 | ```py
 59 | fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
 60 | fam
 61 | ```
 62 | 
 63 | ```out
 64 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
 65 | ```
 66 | 
 67 | ```py
 68 | fam[7] = 1.86
 69 | fam
 70 | ```{{1}}
 71 | 
 72 | ```out
 73 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]
 74 | ```{{1}}
 75 | 
 76 | ```py
 77 | fam[0:2] = ["lisa", 1.74]
 78 | fam
 79 | ```{{2}}
 80 | 
 81 | ```out
 82 | ['lisa', 1.74, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86]
 83 | ```{{2}}
 84 | 
 85 | `@script`
 86 | Changing list elements is pretty straightforward. You use the same square brackets that we've used to subset lists, and then assign new elements to it using the equals sign. Suppose that after another look at fam, you realize that your dad's height is not up to date anymore, as he's shrinking with age. Instead of 1.89 meters, it should be 1.86 meters. To change this list element, which is at index 7, you can use this line of code.
 87 | 
 88 | If you now check out fam, you'll see that the value is updated.
 89 | 
 90 | You can even change an entire list slice at once. To change the elements "liz" and 1.73, you access the first two elements with 0:2, and then assign a new list to it.
 91 | 
 92 | Do you still remember how the plus operator was different for strings and integers?
 93 | 
 94 | ---
 95 | 
 96 | ## Adding and removing elements
 97 | 
 98 | ```yaml
 99 | type: FullSlide
100 | key: a66d56cb46
101 | code_zoom: 74
102 | ```
103 | 
104 | `@part1`
105 | ```py
106 | fam + ["me", 1.79]
107 | ```{{1}}
108 | 
109 | ```out
110 | ['lisa', 1.74, 'emma', 1.68, 'mom', 1.71, 'dad', 1.86, 'me', 1.79]
111 | ```{{1}}
112 | 
113 | ```py
114 | fam_ext = fam + ["me", 1.79]
115 | ```{{2}}
116 | ```py
117 | del fam[2]
118 | ```{{3}}
119 | ```py
120 | fam
121 | ```{{4}}
122 | 
123 | ```out
124 | ['lisa', 1.74, 1.68, 'mom', 1.71, 'dad', 1.86]
125 | ```{{4}}
126 | 
127 | `@script`
128 | Well, it's again different for lists. If you use the plus sign with two lists, Python simply pastes together their contents in a single list. Suppose you want to add your own name and height to the fam height list. This will do the trick.
129 | 
130 | Of course, you can also store this new list in a variable, fam_ext for example.
131 | 
132 | Finally, deleting elements from a list is also pretty straightforward: you use del. Take this line, for example, which deletes the element with index 2, so "emma", from the list.
133 | 
134 | If you check out fam now, you'll see that the "emma" string is gone. Because you've removed an element, all elements that came after "emma" scooted over by one index. If you run the same line again, you're again removing the element at index 2, which is now emma's height, 1.68 meters.
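
The adding and deleting steps above can be sketched in one runnable piece. This is only an illustration that reuses the `fam` and `fam_ext` names from the slide; it runs `del fam[2]` twice to show how the remaining elements shift.

```python
fam = ["lisa", 1.74, "emma", 1.68, "mom", 1.71, "dad", 1.86]

# The plus operator builds a new list; fam itself is left unchanged.
fam_ext = fam + ["me", 1.79]

del fam[2]  # removes "emma"; every later element moves down one index
del fam[2]  # index 2 now holds 1.68, so emma's height is removed too

print(fam)
print(fam_ext)
```

After both deletions, `fam` no longer contains emma's name or height, while `fam_ext` still holds all ten elements because it was created before anything was deleted.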
135 | 
136 | Understanding how Python lists actually work
137 | 
138 | ---
139 | 
140 | ## Behind the scenes (1)
141 | 
142 | ```yaml
143 | type: TwoColumns
144 | key: ef5370967a
145 | code_zoom: 100
146 | ```
147 | 
148 | `@part1`
149 | ```py
150 | x = ["a", "b", "c"]
151 | ```{{1}}
152 | 
153 | `@part2`
154 | ![ch_2_3_slides.024.png](https://assets.datacamp.com/production/repositories/288/datasets/e91761036b6647fa635fe8493b4ff3379587f5d5/ch_2_3_slides.024.png = 70){{2}}
155 | 
156 | `@script`
157 | behind the scenes becomes pretty important now. What actually happens when you create a new list, x, like this?
158 | 
159 | Well, in a simplified sense, you're storing a list in your computer memory, and storing the 'address' of that list, so
160 | 
161 | ---
162 | 
163 | ## Behind the scenes (1)
164 | 
165 | ```yaml
166 | type: TwoColumns
167 | key: 4d48163f25
168 | disable_transition: true
169 | code_zoom: 100
170 | ```
171 | 
172 | `@part1`
173 | ```py
174 | x = ["a", "b", "c"]
175 | ```
176 | ```py
177 | y = x
178 | ```{{1}}
179 | ```py
180 | y[1] = "z"
181 | y
182 | ```{{2}}
183 | 
184 | ```out
185 | ['a', 'z', 'c']
186 | ```{{2}}
187 | 
188 | ```py
189 | x
190 | ```{{3}}
191 | 
192 | ```out
193 | ['a', 'z', 'c']
194 | ```{{3}}
195 | 
196 | `@part2`
197 | ![ch_2_3_slides.025.png](https://assets.datacamp.com/production/repositories/288/datasets/03d95d40b2e0d631ea89f07cadf12e66babd3693/ch_2_3_slides.025.png = 70)
198 | 
199 | `@script`
200 | where the list is in your computer memory, in x. This means that x does not actually contain all the list elements, it rather contains a reference to the list. For basic operations, the difference is not that important, but it becomes more so when you start copying lists. Let me clarify this with an example.
201 | 
202 | Let's store the list x as a new variable y, by simply using the equals sign.
203 | 
204 | Let's now change the element with index one in the list y, like this.
205 | 
206 | The funky thing is that if you now check out x again, the second element was changed here as well.
207 | 
208 | That's because when you copied x to y with the equals sign,
209 | 
210 | ---
211 | 
212 | ## Behind the scenes (1)
213 | 
214 | ```yaml
215 | type: TwoColumns
216 | key: 4a5827f664
217 | disable_transition: true
218 | code_zoom: 100
219 | ```
220 | 
221 | `@part1`
222 | ```py
223 | x = ["a", "b", "c"]
224 | ```
225 | ```py
226 | y = x
227 | ```
228 | ```py
229 | y[1] = "z"
230 | y
231 | ```
232 | 
233 | ```out
234 | ['a', 'z', 'c']
235 | ```
236 | 
237 | ```py
238 | x
239 | ```
240 | 
241 | ```out
242 | ['a', 'z', 'c']
243 | ```
244 | 
245 | `@part2`
246 | ![ch_2_3_slides.030.png](https://assets.datacamp.com/production/repositories/288/datasets/cee01ad8680d8cd824bab998aed4c5e5f74521bb/ch_2_3_slides.030.png = 70)
247 | 
248 | `@script`
249 | you copied the reference to the list, not the actual values themselves.
250 | 
251 | ---
252 | 
253 | ## Behind the scenes (1)
254 | 
255 | ```yaml
256 | type: TwoColumns
257 | key: ef3476e2fc
258 | disable_transition: true
259 | ```
260 | 
261 | `@part1`
262 | ```py
263 | x = ["a", "b", "c"]
264 | ```
265 | ```py
266 | y = x
267 | ```
268 | ```py
269 | y[1] = "z"
270 | y
271 | ```
272 | 
273 | ```out
274 | ['a', 'z', 'c']
275 | ```
276 | 
277 | ```py
278 | x
279 | ```
280 | 
281 | ```out
282 | ['a', 'z', 'c']
283 | ```
284 | 
285 | `@part2`
286 | ![ch_2_3_slides.031.png](https://assets.datacamp.com/production/repositories/288/datasets/fff4d255ec69a9a6e4d64394bdb92464390498c4/ch_2_3_slides.031.png = 70)
287 | 
288 | `@script`
289 | When you're updating an element of the list, it's one and the same list in the computer memory that you're changing. Both x and y point to this list, so the update is visible from both variables.
290 | 
291 | If you want to create a list y that points to a new list in the memory with the same values,
292 | 
293 | ---
294 | 
295 | ## Behind the scenes (2)
296 | 
297 | ```yaml
298 | type: TwoColumns
299 | key: 05f37e881d
300 | code_zoom: 100
301 | ```
302 | 
303 | `@part1`
304 | ```py
305 | x = ["a", "b", "c"]
306 | ```
307 | 
308 | `@part2`
309 | ![ch_2_3_slides.033.png](https://assets.datacamp.com/production/repositories/288/datasets/97dc873ce995a7fb3cf83305c56a6a9b4f23de51/ch_2_3_slides.033.png)
310 | 
311 | `@script`
312 | you'll need to use something other than the equals sign. You can use the list function,
313 | 
314 | ---
315 | 
316 | ## Behind the scenes (2)
317 | 
318 | ```yaml
319 | type: TwoColumns
320 | key: 678dfa958a
321 | disable_transition: true
322 | code_zoom: 100
323 | ```
324 | 
325 | `@part1`
326 | ```py
327 | x = ["a", "b", "c"]
328 | ```
329 | ```py
330 | y = list(x)
331 | y = x[:]
332 | ```
333 | 
334 | `@part2`
335 | ![ch_2_3_slides.034.png](https://assets.datacamp.com/production/repositories/288/datasets/ec9a50129117c16795d74c53b070c34c0015f6d1/ch_2_3_slides.034.png)
336 | 
337 | `@script`
338 | like this, or use slicing to select all list elements explicitly.
339 | 
340 | If you now
341 | 
342 | ---
343 | 
344 | ## Behind the scenes (2)
345 | 
346 | ```yaml
347 | type: TwoColumns
348 | key: d211be5714
349 | disable_transition: true
350 | code_zoom: 100
351 | ```
352 | 
353 | `@part1`
354 | ```py
355 | x = ["a", "b", "c"]
356 | ```
357 | ```py
358 | y = list(x)
359 | y = x[:]
360 | ```
361 | ```py
362 | y[1] = "z"
363 | x
364 | ```
365 | 
366 | ```out
367 | ['a', 'b', 'c']
368 | ```
369 | 
370 | `@part2`
371 | ![ch_2_3_slides.036.png](https://assets.datacamp.com/production/repositories/288/datasets/3f6b4d36a70007385ff752d07fa842a1e3a7f878/ch_2_3_slides.036.png)
372 | 
373 | `@script`
374 | make a change to the list y points to, x is not affected.
375 | 
376 | If this was a bit too much to take in, don't worry.
377 | 
378 | ---
379 | 
380 | ## Let's practice!
381 | 
382 | ```yaml
383 | type: FinalSlide
384 | key: 934a5be348
385 | ```
386 | 
387 | `@script`
388 | The exercises will help you understand list manipulation and the subtle inner workings of lists. I'm sure you'll do great!
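
If you'd like to check the reference behavior yourself before the exercises, here is a minimal sketch. The names `alias`, `copy_a` and `copy_b` are illustrative and do not appear in the slides, and the `is` operator (which tests whether two names point to the same object) is an addition that the video does not cover.

```python
x = ["a", "b", "c"]

alias = x         # copies the reference: both names point to one list
copy_a = list(x)  # builds a new list with the same elements
copy_b = x[:]     # slicing everything also builds a new list

alias[1] = "z"    # changes the single list that x and alias share

print(x)            # ['a', 'z', 'c']
print(copy_a)       # ['a', 'b', 'c']
print(copy_b)       # ['a', 'b', 'c']
print(alias is x)   # True
print(copy_a is x)  # False
```

Both `list(x)` and `x[:]` copy only the outer list. For a list of simple values like these strings, that is all you need.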
389 | 


--------------------------------------------------------------------------------
/slides/chapter_2_a0530c4542f10988847b2dbb91f717c3.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: a0530c4542f10988847b2dbb91f717c3
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch2_1.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch2_1.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Python Lists
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 30d2c57d4e
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Welcome back aspiring Pythonista. By now, you've played around with different data types, and I hope you've had as much fun as I have.
 27 | 
 28 | ---
 29 | 
 30 | ## Python Data Types
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 2b9e2d1529
 35 | ```
 36 | 
 37 | `@part1`
 38 | - float - real numbers{{1}}
 39 | 
 40 | - int - integer numbers{{2}}
 41 | 
 42 | - str - string, text{{3}}
 43 | 
 44 | - bool - True, False{{4}}
 45 | 
 46 | ```py
 47 | height = 1.73
 48 | tall = True
 49 | ```{{5}}
 50 | 
 51 | - Each variable represents single value{{6}}
 52 | 
 53 | `@script`
 54 | On the numbers side, there's the float, to represent a real number, and the int, to represent an integer. Next, we also have str, short for string, to represent text in Python, and bool, which can be either True or False. You can save these values as a variable, like these examples show. Each variable then represents a single value.
 55 | 
 56 | As a data scientist,
 57 | 
 58 | ---
 59 | 
 60 | ## Problem
 61 | 
 62 | ```yaml
 63 | type: FullSlide
 64 | key: a6e5aa6c25
 65 | ```
 66 | 
 67 | `@part1`
 68 | - Data Science: many data points{{1}}
 69 | 
 70 | - Height of entire family{{2}}
 71 | 
 72 | ```py
 73 | height1 = 1.73
 74 | height2 = 1.68
 75 | height3 = 1.71
 76 | height4 = 1.89
 77 | ```{{3}}
 78 | 
 79 | - Inconvenient{{4}}
 80 | 
 81 | `@script`
 82 | you'll often want to work with many data points. If, for example, you want to measure the height of everybody in your family and store this information in Python, it would be inconvenient to create a new Python variable for each point you collected, right?
 83 | 
 84 | What you can do instead, is store all this information in a Python list.
 85 | 
 86 | ---
 87 | 
 88 | ## Python List
 89 | 
 90 | ```yaml
 91 | type: FullSlide
 92 | key: e0a7e67ef6
 93 | code_zoom: 66
 94 | ```
 95 | 
 96 | `@part1`
 97 | - `[a, b, c]`
 98 | 
 99 | 
100 | ```py
101 | [1.73, 1.68, 1.71, 1.89]
102 | ```{{1}}
103 | 
104 | ```out
105 | [1.73, 1.68, 1.71, 1.89]
106 | ```{{1}}
107 | 
108 | ```py
109 | fam = [1.73, 1.68, 1.71, 1.89]
110 | fam
111 | ```{{2}}
112 | 
113 | ```out
114 | [1.73, 1.68, 1.71, 1.89]
115 | ```{{2}}
116 | 
117 | - Name a collection of values{{3}}
118 | 
119 | - Contain any type{{4}}
120 | 
121 | - Contain different types{{5}}
122 | 
123 | `@script`
124 | You can build such a list with square brackets. Suppose you asked your two sisters and parents for their height, in meters. You can build the list as follows:
125 | 
126 | Of course, this data structure can also be referenced with a variable. Simply put the variable name and the equals sign in front, like here.
127 | 
128 | A list is a way to give a single name to a collection of values. These values, or elements, can have any type; they can be floats, integers, booleans, strings, but also more advanced Python types, even lists.
129 | 
130 | It's perfectly possible for a list to contain different types as well.
131 | 
132 | ---
133 | 
134 | ## Python List
135 | 
136 | ```yaml
137 | type: FullSlide
138 | key: 35d6825cd6
139 | code_zoom: 68
140 | ```
141 | 
142 | `@part1`
143 | - `[a, b, c]`
144 | 
145 | ```py
146 | fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
147 | ```
148 | ```py
149 | fam
150 | ```
151 | 
152 | ```out
153 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
154 | ```
155 | 
156 | ```py
157 | fam2 = [["liz", 1.73],
158 | 		["emma", 1.68],
159 | 		["mom", 1.71],
160 | 		["dad", 1.89]]
161 | ```{{1}}
162 | ```py
163 | fam2
164 | ```{{2}}
165 | 
166 | ```out
167 | [['liz', 1.73], ['emma', 1.68], ['mom', 1.71], ['dad', 1.89]]
168 | ```{{2}}
169 | 
170 | `@script`
171 | Suppose, for example, that you want to add the names of your sisters and parents to the list, so that you know which height belongs to whom. You can throw in some strings without issues.
172 | 
173 | But that's not all. I just told you that lists can also contain lists themselves. Instead of putting the strings in between the numbers, you can create little sublists for each member of the family. One for liz, one for emma and so on. Now, you can tell Python that these sublists are the elements of another list, that I named fam2: the little lists are wrapped in square brackets and separated with commas. If you now print out fam2, you see that we have a list of lists. The main list contains 4 sub-lists.
174 | 
175 | We're dealing with a new Python type here, next to the strings, booleans, integers and floats you already know about:
176 | 
177 | ---
178 | 
179 | ## List type
180 | 
181 | ```yaml
182 | type: FullSlide
183 | key: 2dd9765326
184 | code_zoom: 80
185 | ```
186 | 
187 | `@part1`
188 | ```py
189 | type(fam)
190 | ```
191 | 
192 | ```out
193 | list
194 | ```
195 | 
196 | ```py
197 | type(fam2)
198 | ```
199 | 
200 | ```out
201 | list
202 | ```
203 | 
204 | - Specific functionality{{1}}
205 | 
206 | - Specific behavior{{1}}
207 | 
208 | `@script`
209 | the list. These calls show that both fam and fam2 are lists. Remember that I told you that each type has specific functionality and behavior associated with it? Well, for lists, this is also true. Python lists host a bunch of tools to subset and adapt them. But let's take this step by step,
210 | 
211 | ---
212 | 
213 | ## Let's practice!
214 | 
215 | ```yaml
216 | type: FinalSlide
217 | key: de08280f5e
218 | ```
219 | 
220 | `@script`
221 | and have you experiment with list creation first!
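
As a warm-up, here is a small sketch that pulls together the three kinds of lists from this video. The variable name `heights` is an illustrative choice; `fam` and `fam2` are the names used on the slides.

```python
# A list holding a single type.
heights = [1.73, 1.68, 1.71, 1.89]

# A list mixing strings and floats.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# A list of lists: one sublist per family member.
fam2 = [["liz", 1.73],
        ["emma", 1.68],
        ["mom", 1.71],
        ["dad", 1.89]]

print(type(heights))
print(type(fam))
print(type(fam2))
```

Run as a script, each `print(type(...))` call shows `<class 'list'>`. In the IPython Shell, typing `type(fam)` on its own displays the shorter `list` that you saw on the slide.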
222 | 


--------------------------------------------------------------------------------
/slides/chapter_2_fc15ba5cb9485456df8589130b519ea3.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: fc15ba5cb9485456df8589130b519ea3
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch2_2.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch2_2.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Subsetting Lists
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: e4c1e2cc21
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | After you've created your very own Python list, you'll need to know how you can access information in the list.
 27 | 
 28 | ---
 29 | 
 30 | ## Subsetting lists
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 3c299aff4c
 35 | code_zoom: 70
 36 | ```
 37 | 
 38 | `@part1`
 39 | ```py
 40 | fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]
 41 | fam
 42 | ```{{1}}
 43 | 
 44 | ```out
 45 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
 46 | ```{{1}}
 47 | 
 48 | ```py
 49 | fam[3]
 50 | ```{{2}}
 51 | 
 52 | ```out
 53 | 1.68
 54 | ```{{2}}
 55 | 
 56 | `@script`
 57 | Python uses the index to do this. Have a look at the fam list again here. The first element in the list has index 0, the second element has index 1, and so on. Suppose that you want to select the height of emma, the float 1.68. It's the fourth element, so it has index 3. To select it, you use 3 inside square brackets.
 58 | 
 59 | Similarly, to select the string "dad" from the list,
 60 | 
 61 | ---
 62 | 
 63 | ## Subsetting lists
 64 | 
 65 | ```yaml
 66 | type: FullSlide
 67 | key: e036a40a08
 68 | code_zoom: 70
 69 | ```
 70 | 
 71 | `@part1`
 72 | ```out
 73 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
 74 | ```
 75 | 
 76 | ```py
 77 | fam[6]
 78 | ```{{1}}
 79 | 
 80 | ```out
 81 | 'dad'
 82 | ```{{1}}
 83 | 
 84 | ```py
 85 | fam[-1]
 86 | ```{{2}}
 87 | 
 88 | ```out
 89 | 1.89
 90 | ```{{2}}
 91 | 
 92 | ```py
 93 | fam[7]
 94 | ```{{3}}
 95 | 
 96 | ```out
 97 | 1.89
 98 | ```{{3}}
 99 | 
100 | `@script`
101 | which is the seventh element in the list, you'll need to put the index 6 inside square brackets.
102 | 
103 | You can also count backwards, using negative indexes. This is useful if you want to get some elements at the end of your list. To get your dad's height, for example, you'll need the index -1. These are the negative indexes for all list elements.
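
To make the two ways of counting concrete, here is a minimal sketch on the same `fam` list. The comparisons with `==` are only there to show that both indexes reach the same element.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

print(fam[6])   # 'dad', counting from the front
print(fam[-2])  # 'dad', counting from the back
print(fam[7])   # 1.89, the last element
print(fam[-1])  # 1.89, also the last element

# Index -1 always refers to the last element, however long the list is.
print(fam[-1] == fam[7])  # True
```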
104 | 
105 | ---
106 | 
107 | ## Subsetting lists
108 | 
109 | ```yaml
110 | type: FullSlide
111 | key: 06e85623c2
112 | disable_transition: true
113 | code_zoom: 70
114 | ```
115 | 
116 | `@part1`
117 | ```out
118 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
119 | ```
120 | 
121 | ```py
122 | fam[6]
123 | ```
124 | 
125 | ```out
126 | 'dad'
127 | ```
128 | 
129 | ```py
130 | fam[-1]  # <-
131 | ```
132 | 
133 | ```out
134 | 1.89
135 | ```
136 | 
137 | ```py
138 | fam[7] # <-
139 | ```
140 | 
141 | ```out
142 | 1.89
143 | ```
144 | 
145 | `@script`
146 | This means that both these lines return the exact same result.
147 | 
148 | Apart from indexing, there's also something called slicing,
149 | 
150 | ---
151 | 
152 | ## List slicing
153 | 
154 | ```yaml
155 | type: FullSlide
156 | key: 125c4cb6c9
157 | code_zoom: 70
158 | ```
159 | 
160 | `@part1`
161 | ```py
162 | fam
163 | ```
164 | 
165 | ```out
166 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
167 | ```
168 | 
169 | ```py
170 | fam[3:5]
171 | ```{{1}}
172 | 
173 | ```out
174 | [1.68, 'mom']
175 | ```{{2}}
176 | 
177 | ```py
178 | fam[1:4]
179 | ```{{4}}
180 | 
181 | ```out
182 | [1.73, 'emma', 1.68]
183 | ```{{5}}
184 | 
185 | ![The slicing syntax for Python lists, showing that the start value is included in the subset, while the stop value is excluded.](https://assets.datacamp.com/production/repositories/288/datasets/83dd2f807be0d4d08a187935eed11667c18fcfe3/slicing-syntax.png = 40){{3}}
186 | 
187 | `@script`
188 | which allows you to select multiple elements from a list, thus creating a new list. You can do this by specifying a range, using a colon. Let's first have another look at the list, and then try this piece of code.
189 | 
190 | Can you guess what it'll return? A list with the float 1.68, the string "mom", and the float 1.71, corresponding to the 4th, 5th and 6th element in the list maybe? Let's see what the output is.
191 | 
192 | Apparently, only the elements with index 3 and 4 get returned. The element with index 5 is not included. In general, this is the syntax: the index you specify before the colon, so where the slice starts, is included, while the index you specify after the colon, where the slice ends, is not.
193 | 
194 | With this in mind, can you tell what this call will return?
195 | 
196 | You probably guessed correctly that this call gives you a list with three elements, corresponding to the elements with index 1, 2 and 3 of the fam list.
197 | 
198 | You can also choose to just leave out the index before or after the colon.
199 | 
200 | ---
201 | 
202 | ## List slicing
203 | 
204 | ```yaml
205 | type: FullSlide
206 | key: 8207b3255e
207 | disable_transition: true
208 | code_zoom: 70
209 | ```
210 | 
211 | `@part1`
212 | ```py
213 | fam
214 | ```
215 | 
216 | ```out
217 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
218 | ```
219 | 
220 | ```py
221 | fam[:4]
222 | ```{{1}}
223 | 
224 | ```out
225 | ['liz', 1.73, 'emma', 1.68]
226 | ```{{1}}
227 | 
228 | ```py
229 | fam[5:]
230 | ```{{2}}
231 | 
232 | ```out
233 | [1.71, 'dad', 1.89]
234 | ```{{2}}
235 | 
236 | `@script`
237 | If you leave out the index where the slice should begin, you're telling Python to start the slice from index 0, like this example.
238 | 
239 | If you leave out the index where the slice should end, you include all elements up to and including the last element in the list, like here.
240 | 
241 | Now it's time to head over to the exercises,
242 | 
243 | ---
244 | 
245 | ## Let's practice!
246 | 
247 | ```yaml
248 | type: FinalSlide
249 | key: 048b2b774f
250 | ```
251 | 
252 | `@script`
253 | where you will continue to work on the list you've created yourself before. You'll use different subsetting methods to get exactly the piece of information you need!
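
Before you start, here is a compact sketch of the slicing rules from this video, using the same `fam` list. The variable names on the left are illustrative only.

```python
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

middle = fam[3:5]     # start index 3 is included, end index 5 is not
first_four = fam[:4]  # no start index: the slice begins at index 0
last_three = fam[5:]  # no end index: the slice runs through the last element

print(middle)      # [1.68, 'mom']
print(first_four)  # ['liz', 1.73, 'emma', 1.68]
print(last_three)  # [1.71, 'dad', 1.89]
```

Each slice is a new list, so `fam` itself still holds all eight elements afterwards.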
254 | 


--------------------------------------------------------------------------------
/slides/chapter_3_1204d914b0e53100529827e07441ee6c.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: 1204d914b0e53100529827e07441ee6c
  4 | transformations:
  5 |   translateX: 50
  6 |   translateY: 0
  7 |   scale: 1
  8 | video_link:
  9 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v8/735_ch3_1.mp4'
 10 |   hls: >-
 11 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v8/hls-735_ch3_1.master.m3u8
 12 | ---
 13 | 
 14 | ## Functions
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 6d7066bcd2
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | In this video, I'm going to introduce you to functions. Once you learn about them you won't be able to stop using them. I sure can't.
 27 | 
 28 | ---
 29 | 
 30 | ## Functions
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 5f508018d7
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Nothing new!{{1}}
 39 | 
 40 | - `type()`{{2}}
 41 | 
 42 | - Piece of reusable code{{3}}
 43 | 
 44 | - Solves particular task{{4}}
 45 | 
 46 | - Call function instead of writing code yourself{{5}}
 47 | 
 48 | `@script`
 49 | Functions aren't entirely new for you actually: you've already used them. type, for example, is a function that returns the type of a value. But what is a function? Simply put, a function is a piece of reusable code, aimed at solving a particular task. You can call functions instead of having to write code yourself. Maybe an example can clarify things here.
 50 | 
 51 | ---
 52 | 
 53 | ## Example
 54 | 
 55 | ```yaml
 56 | type: FullSlide
 57 | key: c2afbb6435
 58 | code_zoom: 75
 59 | ```
 60 | 
 61 | `@part1`
 62 | ```py
 63 | fam = [1.73, 1.68, 1.71, 1.89]
 64 | fam
 65 | ```
 66 | 
 67 | ```out
 68 | [1.73, 1.68, 1.71, 1.89]
 69 | ```
 70 | 
 71 | ```py
 72 | max(fam)
 73 | ```{{1}}
 74 | 
 75 | ```out
 76 | 1.89
 77 | ```{{1}}
 78 | 
 79 | ![ch_3_1_slides.012.png](https://assets.datacamp.com/production/repositories/288/datasets/efef98eb50aba2b36df52166f7c4b18fd89c62e1/ch_3_1_slides.012.png){{2}}
 80 | 
 81 | `@script`
 82 | Suppose you have the list containing only the heights of your family, fam:
 83 | 
 84 | Say that you want to get the maximum value in this list. Instead of writing your own piece of Python code that goes through the list and finds the highest value, you can also use Python's max function. This is one of Python's built-in functions, just like type. We simply pass fam to max inside parentheses.
 85 | 
 86 | The output makes sense: 1.89, the highest number in the list.
 87 | 
 88 | max worked kind of like a black box here:
 89 | 
 90 | ---
 91 | 
 92 | ## Example
 93 | 
 94 | ```yaml
 95 | type: FullSlide
 96 | key: 46af509641
 97 | disable_transition: true
 98 | code_zoom: 75
 99 | ```
100 | 
101 | `@part1`
102 | ```py
103 | fam = [1.73, 1.68, 1.71, 1.89]
104 | fam
105 | ```
106 | 
107 | ```out
108 | [1.73, 1.68, 1.71, 1.89]
109 | ```
110 | 
111 | ```py
112 | max(fam)
113 | ```
114 | 
115 | ```out
116 | 1.89
117 | ```
118 | 
119 | ![ch_3_1_slides.013.png](https://assets.datacamp.com/production/repositories/288/datasets/65f70092ec124c8f29a082f9409e473496806aaa/ch_3_1_slides.013.png)
120 | 
121 | `@script`
122 | you passed it a list, then the implementation of max, that you don't know, did its magic,
123 | 
124 | ---
125 | 
126 | ## Example
127 | 
128 | ```yaml
129 | type: FullSlide
130 | key: c575524d98
131 | disable_transition: true
132 | code_zoom: 75
133 | ```
134 | 
135 | `@part1`
136 | ```py
137 | fam = [1.73, 1.68, 1.71, 1.89]
138 | fam
139 | ```
140 | 
141 | ```out
142 | [1.73, 1.68, 1.71, 1.89]
143 | ```
144 | 
145 | ```py
146 | max(fam)
147 | ```
148 | 
149 | ```out
150 | 1.89
151 | ```
152 | 
153 | ![ch_3_1_slides.014.png](https://assets.datacamp.com/production/repositories/288/datasets/404545609ab865031039dcfd81ea2d2962126f72/ch_3_1_slides.014.png)
154 | 
155 | `@script`
156 | and produced an output. How max actually did this is not important to you: it just does what it's supposed to, and you didn't have to write your own code, which made your life easier.
157 | 
158 | ---
159 | 
160 | ## Example
161 | 
162 | ```yaml
163 | type: FullSlide
164 | key: bed6186ee9
165 | disable_transition: true
166 | code_zoom: 75
167 | ```
168 | 
169 | `@part1`
170 | ```py
171 | fam = [1.73, 1.68, 1.71, 1.89]
172 | fam
173 | ```
174 | 
175 | ```out
176 | [1.73, 1.68, 1.71, 1.89]
177 | ```
178 | 
179 | ```py
180 | max(fam)
181 | ```
182 | 
183 | ```out
184 | 1.89
185 | ```
186 | 
187 | ```py
188 | tallest = max(fam)
189 | tallest
190 | ```{{1}}
191 | 
192 | ```out
193 | 1.89
194 | ```{{1}}
195 | 
196 | `@script`
197 | Of course, it's possible to also assign the result of a function call to a new variable, like here. Now tallest is just like any other variable; you can use it to continue your fancy calculations.
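
To see what max saves you from writing, here is a rough sketch of a hand-written alternative next to the built-in call. The loop is only an illustration, not how max is actually implemented, and the name `highest` is a made-up helper variable.

```python
fam = [1.73, 1.68, 1.71, 1.89]

# Hand-written version: walk through the list and remember the largest value.
highest = fam[0]
for height in fam:
    if height > highest:
        highest = height

# Built-in version: one call, with the result stored in a variable.
tallest = max(fam)

print(highest)
print(tallest)
print(highest == tallest)  # True
```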
198 | 
199 | ---
200 | 
201 | ## round()
202 | 
203 | ```yaml
204 | type: FullSlide
205 | key: b6626f6bff
206 | code_zoom: 62
207 | ```
208 | 
209 | `@part1`
210 | ```py
211 | round(1.68, 1)
212 | ```{{1}}
213 | 
214 | ```out
215 | 1.7
216 | ```{{1}}
217 | 
218 | ```py
219 | round(1.68)
220 | ```{{2}}
221 | 
222 | ```out
223 | 2
224 | ```{{2}}
225 | 
226 | ```py
227 | help(round) # Open up documentation
228 | ```{{3}}
229 | 
230 | ```out
231 | Help on built-in function round in module builtins:
232 | 
233 | round(number, ndigits=None)
234 |     Round a number to a given precision in decimal digits.
235 |     
236 |     The return value is an integer if ndigits is omitted or None. 
237 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
238 | ```{{3}}
239 | 
240 | `@script`
241 | Another one of these built-in functions is round. It takes two inputs: first, a number you want to round, and second, the precision with which to round, which is how many digits after the decimal point you want to keep. Say you want to round 1.68 to one decimal place. The first input is 1.68, the second input is 1. You separate the inputs with a comma.
242 | 
243 | But there's more. It's perfectly possible to call the round function with only one input, like this. This time, Python figures out that you didn't specify the second input, and automatically chooses to round the number to the closest integer.
244 | 
245 | To understand why both approaches work, let's open up the documentation. You can do this with yet another function, help, like this.
246 | 
247 | It appears that round takes two inputs.
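
The behavior described in the documentation can be checked with a short sketch. The variable names are illustrative, and the call with a negative second input is an extra example based on the "ndigits may be negative" remark in the help text.

```python
one_decimal = round(1.68, 1)     # two inputs: keep one digit after the decimal point
nearest_int = round(1.68)        # one input: round to the closest integer
to_hundreds = round(1234, -2)    # negative second input: round to the nearest hundred

print(one_decimal)        # 1.7
print(nearest_int)        # 2
print(type(nearest_int))  # <class 'int'>
print(to_hundreds)        # 1200
```

With only one input, the result is an integer. With a second input, the result keeps the type of the number you passed in.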
248 | 
249 | ---
250 | 
251 | ## round()
252 | 
253 | ```yaml
254 | type: FullSlide
255 | key: c8119a3588
256 | code_zoom: 63
257 | ```
258 | 
259 | `@part1`
260 | ```py
261 | help(round)
262 | ```
263 | 
264 | ```out
265 | Help on built-in function round in module builtins:
266 | 
267 | round(number, ndigits=None)
268 |     Round a number to a given precision in decimal digits.
269 |     
270 |     The return value is an integer if ndigits is omitted or None. 
271 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
272 | ```
273 | 
274 | 
275 | 
276 | ![ch_3_1_slides.026.png](https://assets.datacamp.com/production/repositories/288/datasets/27ffd63d62347e84e5471ee64a8652616e575616/ch_3_1_slides.026.png){{1}}
277 | 
278 | `@script`
279 | In Python, these inputs, also called arguments, have names: number and ndigits. When you call the function round,
280 | 
281 | ---
282 | 
283 | ## round()
284 | 
285 | ```yaml
286 | type: FullSlide
287 | key: 8aacabb9b1
288 | disable_transition: true
289 | code_zoom: 63
290 | ```
291 | 
292 | `@part1`
293 | ```py
294 | help(round)
295 | ```
296 | 
297 | ```out
298 | Help on built-in function round in module builtins:
299 | 
300 | round(number, ndigits=None)
301 |     Round a number to a given precision in decimal digits.
302 |     
303 |     The return value is an integer if ndigits is omitted or None. 
304 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
305 | ```
306 | 
307 | ![ch_3_1_slides.027.png](https://assets.datacamp.com/production/repositories/288/datasets/0b07066836b79b6c2539ddda423da6ff6352ddf6/ch_3_1_slides.027.png)
308 | 
309 | `@script`
310 | with these two inputs, Python matches the inputs to the arguments:
311 | 
312 | ---
313 | 
314 | ## round()
315 | 
316 | ```yaml
317 | type: FullSlide
318 | key: 0ae8191d5a
319 | disable_transition: true
320 | code_zoom: 63
321 | ```
322 | 
323 | `@part1`
324 | ```py
325 | help(round)
326 | ```
327 | 
328 | ```out
329 | Help on built-in function round in module builtins:
330 | 
331 | round(number, ndigits=None)
332 |     Round a number to a given precision in decimal digits.
333 |     
334 |     The return value is an integer if ndigits is omitted or None. 
335 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
336 | ```
337 | 
338 | ![ch_3_1_slides.028.png](https://assets.datacamp.com/production/repositories/288/datasets/4c257fe9ca0994487db6141be7376a370a81d25f/ch_3_1_slides.028.png)
339 | 
340 | `@script`
341 | number is set to 1.68 and
342 | 
343 | ---
344 | 
345 | ## round()
346 | 
347 | ```yaml
348 | type: FullSlide
349 | key: 061bc680d8
350 | disable_transition: true
351 | code_zoom: 63
352 | ```
353 | 
354 | `@part1`
355 | ```py
356 | help(round)
357 | ```
358 | 
359 | ```out
360 | Help on built-in function round in module builtins:
361 | 
362 | round(number, ndigits=None)
363 |     Round a number to a given precision in decimal digits.
364 |     
365 |     The return value is an integer if ndigits is omitted or None. 
366 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
367 | ```
368 | 
369 | ![ch_3_1_slides.029.png](https://assets.datacamp.com/production/repositories/288/datasets/26344efb7eb778da4d8c3350f79dadd82a8a6fd1/ch_3_1_slides.029.png)
370 | 
371 | `@script`
372 | ndigits is set to 1. Next,
373 | 
374 | ---
375 | 
376 | ## round()
377 | 
378 | ```yaml
379 | type: FullSlide
380 | key: 7289eaeb61
381 | disable_transition: true
382 | code_zoom: 63
383 | ```
384 | 
385 | `@part1`
386 | ```py
387 | help(round)
388 | ```
389 | 
390 | ```out
391 | Help on built-in function round in module builtins:
392 | 
393 | round(number, ndigits=None)
394 |     Round a number to a given precision in decimal digits.
395 |     
396 |     The return value is an integer if ndigits is omitted or None. 
397 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
398 | ```
399 | 
400 | ![ch_3_1_slides.030.png](https://assets.datacamp.com/production/repositories/288/datasets/a7d825885a0519ffec79b0763684cd8b16822d6e/ch_3_1_slides.030.png)
401 | 
402 | `@script`
403 | The round function does its calculations with number and ndigits as if they are Python variables in a script. We don't know exactly what code Python executes. What is important, though, is that the function produces an output,
404 | 
405 | ---
406 | 
407 | ## round()
408 | 
409 | ```yaml
410 | type: FullSlide
411 | key: b5ef829b0c
412 | disable_transition: true
413 | code_zoom: 63
414 | ```
415 | 
416 | `@part1`
417 | ```py
418 | help(round)
419 | ```
420 | 
421 | ```out
422 | Help on built-in function round in module builtins:
423 | 
424 | round(number, ndigits=None)
425 |     Round a number to a given precision in decimal digits.
426 |     
427 |     The return value is an integer if ndigits is omitted or None. 
428 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
429 | ```
430 | 
431 | ![ch_3_1_slides.031.png](https://assets.datacamp.com/production/repositories/288/datasets/c4e016f38f0612354160324d2e2abe0ce922a4f3/ch_3_1_slides.031.png)
432 | 
433 | `@script`
434 | namely the number 1.68 rounded to 1 decimal place.
435 | 
436 | ---
437 | 
438 | ## round()
439 | 
440 | ```yaml
441 | type: FullSlide
442 | key: c02d6edac7
443 | disable_transition: true
444 | code_zoom: 63
445 | ```
446 | 
447 | `@part1`
448 | ```py
449 | help(round)
450 | ```
451 | 
452 | ```out
453 | Help on built-in function round in module builtins:
454 | 
455 | round(number, ndigits=None)
456 |     Round a number to a given precision in decimal digits.
457 |     
458 |     The return value is an integer if ndigits is omitted or None. 
459 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
460 | ```
461 | 
462 | 
463 | 
464 | ![ch_3_1_slides.032.png](https://assets.datacamp.com/production/repositories/288/datasets/27ffd63d62347e84e5471ee64a8652616e575616/ch_3_1_slides.032.png)
465 | 
466 | `@script`
467 | If you call the function round with only one input,
468 | 
469 | ---
470 | 
471 | ## round()
472 | 
473 | ```yaml
474 | type: FullSlide
475 | key: 7c246e950e
476 | disable_transition: true
477 | code_zoom: 63
478 | ```
479 | 
480 | `@part1`
481 | ```py
482 | help(round)
483 | ```
484 | 
485 | ```out
486 | Help on built-in function round in module builtins:
487 | 
488 | round(number, ndigits=None)
489 |     Round a number to a given precision in decimal digits.
490 |     
491 |     The return value is an integer if ndigits is omitted or None. 
492 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
493 | ```
494 | 
495 | ![ch_3_1_slides.033.png](https://assets.datacamp.com/production/repositories/288/datasets/29fb0a5fe3ca2ea269fc4c82815591b9bca55d5e/ch_3_1_slides.033.png)
496 | 
497 | `@script`
498 | Python again tries to
499 | 
500 | ---
501 | 
502 | ## round()
503 | 
504 | ```yaml
505 | type: FullSlide
506 | key: 51e45534c3
507 | disable_transition: true
508 | code_zoom: 63
509 | ```
510 | 
511 | `@part1`
512 | ```py
513 | help(round)
514 | ```
515 | 
516 | ```out
517 | Help on built-in function round in module builtins:
518 | 
519 | round(number, ndigits=None)
520 |     Round a number to a given precision in decimal digits.
521 |     
522 |     The return value is an integer if ndigits is omitted or None. 
523 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
524 | ```
525 | 
526 | ![ch_3_1_slides.034.png](https://assets.datacamp.com/production/repositories/288/datasets/4167c94aecf6b66c78efaf5f8ac9232187fb23df/ch_3_1_slides.034.png)
527 | 
528 | `@script`
529 | match the inputs to
530 | 
531 | ---
532 | 
533 | ## round()
534 | 
535 | ```yaml
536 | type: FullSlide
537 | key: e33598e422
538 | disable_transition: true
539 | code_zoom: 63
540 | ```
541 | 
542 | `@part1`
543 | ```py
544 | help(round)
545 | ```
546 | 
547 | ```out
548 | Help on built-in function round in module builtins:
549 | 
550 | round(number, ndigits=None)
551 |     Round a number to a given precision in decimal digits.
552 |     
553 |     The return value is an integer if ndigits is omitted or None. 
554 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
555 | ```
556 | 
557 | ![ch_3_1_slides.035.png](https://assets.datacamp.com/production/repositories/288/datasets/1218fd4989e4f6dfd8471d5cf0f88da0189efc27/ch_3_1_slides.035.png)
558 | 
559 | `@script`
560 | the arguments. There's no input to match to the ndigits argument though. Luckily,
561 | 
562 | ---
563 | 
564 | ## round()
565 | 
566 | ```yaml
567 | type: FullSlide
568 | key: 767966a5a9
569 | disable_transition: true
570 | code_zoom: 63
571 | ```
572 | 
573 | `@part1`
574 | ```py
575 | help(round)
576 | ```
577 | 
578 | ```out
579 | Help on built-in function round in module builtins:
580 | 
581 | round(number, ndigits=None)
582 |     Round a number to a given precision in decimal digits.
583 |     
584 |     The return value is an integer if ndigits is omitted or None. 
585 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
586 | ```
587 | 
588 | ![ch_3_1_slides.036.png](https://assets.datacamp.com/production/repositories/288/datasets/b8f1b94ac3acfdd400bbdf1fca652f772cee4ae6/ch_3_1_slides.036.png)
589 | 
590 | `@script`
591 | the internal machinery of the round function knows how to handle this. When ndigits is not specified, the function simply rounds to the closest integer and
592 | 
593 | ---
594 | 
595 | ## round()
596 | 
597 | ```yaml
598 | type: FullSlide
599 | key: 93b669c9cb
600 | disable_transition: true
601 | code_zoom: 63
602 | ```
603 | 
604 | `@part1`
605 | ```py
606 | help(round)
607 | ```
608 | 
609 | ```out
610 | Help on built-in function round in module builtins:
611 | 
612 | round(number, ndigits=None)
613 |     Round a number to a given precision in decimal digits.
614 |     
615 |     The return value is an integer if ndigits is omitted or None. 
616 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
617 | ```
618 | 
619 | ![ch_3_1_slides.037.png](https://assets.datacamp.com/production/repositories/288/datasets/0a9ca09bc0b46f05f77483d00fb1eadadfc75033/ch_3_1_slides.037.png)
620 | 
621 | `@script`
622 | returns that integer. That's why we got the number 2.
623 | 
624 | ---
625 | 
626 | ## round()
627 | 
628 | ```yaml
629 | type: FullSlide
630 | key: eed1e60402
631 | disable_transition: true
632 | code_zoom: 63
633 | ```
634 | 
635 | `@part1`
636 | ```py
637 | help(round)
638 | ```
639 | 
640 | ```out
641 | Help on built-in function round in module builtins:
642 | 
643 | round(number, ndigits=None)
644 |     Round a number to a given precision in decimal digits.
645 |     
646 |     The return value is an integer if ndigits is omitted or None. 
647 |     Otherwise the return value has the same type as the number. ndigits may be negative. 
648 | ```
649 | 
650 | - `round(number)`{{1}}
651 | - `round(number, ndigits)`{{2}}
652 | 
653 | `@script`
654 | In other words, ndigits is an optional argument. This tells us that you can call round in this form, as well as in this one.
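
As a minimal sketch of the two call forms above, the snippet below calls round with and without ndigits. The keyword form and the negative ndigits value are additions for illustration; the help text on the slide only states that ndigits may be negative.

```python
# Two inputs: number is matched to 1.68, ndigits to 1
one_decimal = round(1.68, 1)

# One input: ndigits falls back to its default, None, so an integer comes back
nearest_int = round(1.68)

# Arguments can also be passed by name (illustrative addition)
by_name = round(1.68, ndigits=1)

# A negative ndigits rounds to the left of the decimal point
tens = round(168, -1)

print(one_decimal, nearest_int, by_name, tens)
```

Passing ndigits by name makes the call easier to read when the meaning of the second input is not obvious from context.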
655 | 
656 | ---
657 | 
658 | ## Find functions
659 | 
660 | ```yaml
661 | type: FullSlide
662 | key: a9853a9d66
663 | ```
664 | 
665 | `@part1`
666 | - How to know?{{1}}
667 | 
668 | - Standard task -> probably function exists!{{2}}
669 | 
670 | - The internet is your friend{{3}}
671 | 
672 | `@script`
673 | By now, you have an idea about how to use max and round, but how could you know that a function such as round exists in Python in the first place? Well, this is something you will learn with time. Whenever you are doing a rather standard task in Python, you can be pretty sure that there's already a function that can do this for you. In that case, you should definitely use it! Just do a quick internet search and you'll find the function you need with a nice usage example. And there is of course DataCamp, where you'll also learn about powerful functions and how to use them.
674 | 
675 | ---
676 | 
677 | ## Let's practice!
678 | 
679 | ```yaml
680 | type: FinalSlide
681 | key: dbac5490bd
682 | ```
683 | 
684 | `@script`
685 | Get straight to it in the interactive exercises, and I'll see you back here soon!
686 | 


--------------------------------------------------------------------------------
/slides/chapter_3_8e387776f3a264a745128b68aa8d8f83.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: 8e387776f3a264a745128b68aa8d8f83
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch3_2.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch3_2.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Methods
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: c536df1034
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Built-in functions are only
 27 | 
 28 | ---
 29 | 
 30 | ## Built-in Functions
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 45877294bd
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Maximum of list: max(){{1}}
 39 | 
 40 | - Length of list or string: len(){{2}}
 41 | 
 42 | - Get index in list: ?{{3}}
 43 | 
 44 | - Reversing a list: ?{{4}}
 45 | 
 46 | `@script`
 47 | one part of the Python story. You already know about functions such as max, to get the maximum of a list, len, to get the length of a list or a string, and so on. But what about other basic things, such as getting the index of a specific element in the list, or reversing a list? You can look very hard for built-in functions that do this, but you won't find them.
 48 | 
 49 | ---
 50 | 
 51 | ## Back 2 Basics
 52 | 
 53 | ```yaml
 54 | type: TwoColumns
 55 | key: a3e45f6524
 56 | code_zoom: 75
 57 | ```
 58 | 
 59 | `@part1`
 60 | &nbsp;
 61 | 
 62 | ```py
 63 | sister = "liz"
 64 | ```
 65 | 
 66 | ```py
 67 | height = 1.73
 68 | ```{{1}}
 69 | 
 70 | 
 71 | ```py
 72 | fam = ["liz", 1.73, "emma", 1.68,
 73 |        "mom", 1.71, "dad", 1.89]
 74 | ```{{2}}
 75 | 
 76 | `@part2`
 77 | ![ch_3_2_slides.020.png](https://assets.datacamp.com/production/repositories/288/datasets/c7a9757fa49f8396eb025ef221823441d6e66ced/ch_3_2_slides.020.png = 85){{3}}
 78 | 
 79 | `@script`
 80 | In the past exercises, you've already created a bunch of variables. Among other Python types, you've created strings, floats and lists, like the ones you see here. Each one of these values or data structures is a so-called Python object. This string is an object, this float is an object, but this list is also, you got it, an object. These objects have a specific type that you already know:
 81 | 
 82 | ---
 83 | 
 84 | ## Back 2 Basics
 85 | 
 86 | ```yaml
 87 | type: TwoColumns
 88 | key: 6a5eddd6ea
 89 | disable_transition: true
 90 | code_zoom: 75
 91 | ```
 92 | 
 93 | `@part1`
 94 | &nbsp;
 95 | 
 96 | ```py
 97 | sister = "liz"
 98 | ```
 99 | 
100 | ```py
101 | height = 1.73
102 | ```
103 | 
104 | 
105 | ```py
106 | fam = ["liz", 1.73, "emma", 1.68,
107 |        "mom", 1.71, "dad", 1.89]
108 | ```
109 | 
110 | - Methods: Functions that belong to objects{{1}}
111 | 
112 | `@part2`
113 | ![ch_3_2_slides.024.png](https://assets.datacamp.com/production/repositories/288/datasets/6d444348823438f856363d02d093318f2ed457a3/ch_3_2_slides.024.png = 85)
114 | 
115 | `@script`
116 | string, float, and list, and of course they represent the values you gave them, such as "liz", 1.73 and an entire list. But in addition to this, Python objects also come with a bunch of so-called "methods". You can think of methods as functions that "belong to" Python objects. A Python object of type string has methods,
117 | 
118 | ---
119 | 
120 | ## Back 2 Basics
121 | 
122 | ```yaml
123 | type: TwoColumns
124 | key: ff540e522c
125 | disable_transition: true
126 | code_zoom: 75
127 | ```
128 | 
129 | `@part1`
130 | &nbsp;
131 | 
132 | ```py
133 | sister = "liz"
134 | ```
135 | 
136 | ```py
137 | height = 1.73
138 | ```
139 | 
140 | 
141 | ```py
142 | fam = ["liz", 1.73, "emma", 1.68,
143 |        "mom", 1.71, "dad", 1.89]
144 | ```
145 | 
146 | - Methods: Functions that belong to objects
147 | 
148 | `@part2`
149 | ![ch_3_2_slides.028.png](https://assets.datacamp.com/production/repositories/288/datasets/80891dbbb1a9b4f759540c5d601cbfb661a894d9/ch_3_2_slides.028.png = 85)
150 | 
151 | `@script`
152 | such as capitalize and replace, but also objects of type float and list have specific methods depending on the type.
153 | 
154 | Enough for the theory now; let's try to use a method!
155 | 
156 | ---
157 | 
158 | ## list methods
159 | 
160 | ```yaml
161 | type: FullSlide
162 | key: 431cae8707
163 | code_zoom: 85
164 | ```
165 | 
166 | `@part1`
167 | ```py
168 | fam
169 | ```
170 | 
171 | ```out
172 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
173 | ```
174 | 
175 | ```py
176 | fam.index("mom") # "Call method index() on fam"
177 | ```{{1}}
178 | 
179 | ```out
180 | 4
181 | ```{{2}}
182 | 
183 | ```py
184 | fam.count(1.73)
185 | ```{{3}}
186 | 
187 | ```out
188 | 1
189 | ```{{4}}
190 | 
191 | `@script`
192 | Suppose you want to get the index of the string "mom" in the fam list. fam is a Python object with the type list, and has a method named index. To call the method, you use the dot notation, like this. The only input is the string "mom", the element you want to get the index for.
193 | 
194 | Python returns 4, which indeed is the index of the string "mom". I called the index method "on" the fam list here, and the output was 4. Similarly, I can use the count method on the fam list to count the number of times 1.73 occurs in the list.
195 | 
196 | Python gives me 1, which makes sense, because only liz is 1.73 meters tall.
197 | 
198 | But lists are not the only Python objects that have methods associated. Also floats, integers, booleans and strings
199 | 
200 | ---
201 | 
202 | ## str methods
203 | 
204 | ```yaml
205 | type: FullSlide
206 | key: 73c6a6ff3a
207 | code_zoom: 80
208 | ```
209 | 
210 | `@part1`
211 | ```py
212 | sister
213 | ```{{1}}
214 | 
215 | ```out
216 | 'liz'
217 | ```{{1}}
218 | 
219 | ```py
220 | sister.capitalize()
221 | ```{{2}}
222 | 
223 | ```out
224 | 'Liz'
225 | ```{{3}}
226 | 
227 | ```py
228 | sister.replace("z", "sa")
229 | ```{{4}}
230 | 
231 | ```out
232 | 'lisa'
233 | ```{{5}}
234 | 
235 | `@script`
236 | are Python objects that have specific methods associated with them. Take the variable sister for example, that represents a string.
237 | 
238 | You can call the method capitalize on sister, without any inputs. It returns a string where the first letter is capitalized now.
239 | 
240 | Or what if you want to replace some parts of the string with other parts? Not a problem. Just call the method replace on sister, with two appropriate inputs.
241 | 
242 | In the output, "z" is replaced with "sa".
243 | 
244 | ---
245 | 
246 | ## Methods
247 | 
248 | ```yaml
249 | type: FullSlide
250 | key: 346697c688
251 | code_zoom: 80
252 | ```
253 | 
254 | `@part1`
255 | - Everything = object{{1}}
256 | 
257 | - Objects have methods associated, depending on type{{2}}
258 | 
259 | ```py
260 | sister.replace("z", "sa")
261 | ```{{3}}
262 | 
263 | ```out
264 | 'lisa'
265 | ```{{3}}
266 | 
267 | ```py
268 | fam.replace("mom", "mommy")
269 | ```{{4}}
270 | 
271 | ```out
272 | AttributeError: 'list' object has no attribute 'replace'
273 | ```{{4}}
274 | 
275 | `@script`
276 | To be absolutely clear: in Python, everything is an object, and each object has specific methods associated. Depending on the type of the object, list, string, float, whatever, the available methods are different. A string object like sister has a replace method, but a list like fam doesn't have this, as you can see from this error.
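
One way to check which methods a type offers, sketched here as an illustration rather than as part of the original slides, is the built-in hasattr function. The variable names str_has_replace, list_has_replace and error_message are made up for this example.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# hasattr() reports whether an object has a given attribute, such as a method
str_has_replace = hasattr(sister, "replace")   # strings have replace
list_has_replace = hasattr(fam, "replace")     # lists do not

# Calling a missing method raises AttributeError, which can be caught
try:
    fam.replace("mom", "mommy")
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(str_has_replace, list_has_replace)
print(error_message)
```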
277 | 
278 | ---
279 | 
280 | ## Methods
281 | 
282 | ```yaml
283 | type: FullSlide
284 | key: c0100c8d69
285 | disable_transition: true
286 | code_zoom: 80
287 | ```
288 | 
289 | `@part1`
290 | ```py
291 | sister.index("z")
292 | ```{{1}}
293 | 
294 | ```out
295 | 2
296 | ```{{1}}
297 | 
298 | ```py
299 | fam.index("mom")
300 | ```{{1}}
301 | 
302 | ```out
303 | 4
304 | ```{{1}}
305 | 
306 | `@script`
307 | Objects of different types can have methods with the same name: take the index method. It's available for both strings and lists. If you call it on a string, you get the index of the letter in the string; if you call it on a list, you get the index of the element in the list. This means that, depending on the type of the object, the methods behave differently.
308 | 
309 | Before I unleash you on some exercises on methods,
310 | 
311 | ---
312 | 
313 | ## Methods (2)
314 | 
315 | ```yaml
316 | type: FullSlide
317 | key: f03ac21e34
318 | code_zoom: 75
319 | ```
320 | 
321 | `@part1`
322 | ```py
323 | fam
324 | ```{{1}}
325 | 
326 | ```out
327 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89]
328 | ```{{1}}
329 | 
330 | ```py
331 | fam.append("me")
332 | ```{{2}}
333 | ```py
334 | fam
335 | ```{{3}}
336 | 
337 | ```out
338 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me']
339 | ```{{3}}
340 | 
341 | ```py
342 | fam.append(1.79)
343 | fam
344 | ```{{4}}
345 | 
346 | ```out
347 | ['liz', 1.73, 'emma', 1.68, 'mom', 1.71, 'dad', 1.89, 'me', 1.79]
348 | ```{{4}}
349 | 
350 | `@script`
351 | there's one more thing I want to tell you. Some methods can change the objects they are called on. Let's take the fam list again, and call the append method on it. As the input, we pass a string we want to add to the list.
352 | 
353 | Python doesn't generate an output, but if we check the fam list again, we see that it has been extended with the string "me".
354 | 
355 | Let's do this again, this time to add my height to the list.
356 | 
357 | Again, the fam list was extended.
358 | 
359 | This is pretty cool, because you can write very concise code to update your data structures on the fly, but it can also be pretty dangerous. Some method calls don't change the object they're called on, while others do, so watch out.
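
To make the difference concrete, here is a small sketch that contrasts a method that leaves its object alone with one that changes it. The names renamed and result are introduced only for this illustration.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

# str.replace returns a new string; the original is left untouched
renamed = sister.replace("z", "sa")

# list.append changes the list in place and returns None
result = fam.append("me")

print(sister, renamed)
print(result, fam[-1], len(fam))
```

Because append returns None, writing fam = fam.append("me") would replace the list with None, which is a common slip worth watching for.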
360 | 
361 | ---
362 | 
363 | ## Summary
364 | 
365 | ```yaml
366 | type: FullSlide
367 | key: eecd826650
368 | code_zoom: 80
369 | ```
370 | 
371 | `@part1`
372 | Functions{{1}}
373 | 
374 | ```py
375 | type(fam)
376 | ```{{2}}
377 | 
378 | ```out
379 | list
380 | ```{{2}}
381 | 
382 | Methods: call functions on objects{{3}}
383 | 
384 | ```py
385 | fam.index("dad")
386 | ```{{4}}
387 | 
388 | ```out
389 | 6
390 | ```{{4}}
391 | 
392 | `@script`
393 | Let's take a step back here and summarize this. You have Python functions, like type, max and round, that you can call like this. There are also methods, which are functions that are specific to Python objects. Depending on the type of the Python object you're dealing with, you'll be able to use different methods and they behave differently. You can call methods on the objects with the dot notation, like this, for example.
394 | 
395 | There's much more to tell about Python objects, methods and how Python works internally,
396 | 
397 | ---
398 | 
399 | ## Let's practice!
400 | 
401 | ```yaml
402 | type: FinalSlide
403 | key: cefb86a284
404 | ```
405 | 
406 | `@script`
407 | but for now, let's stick to what I've talked about here. It's time to get some exercises and add methods to your ever-growing skillset!
408 | 


--------------------------------------------------------------------------------
/slides/chapter_3_cedcfb34350be8545599768f96695cdd.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: cedcfb34350be8545599768f96695cdd
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch3_3.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch3_3.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## Packages
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: de661f5035
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | By now, I hope you're convinced
 27 | 
 28 | ---
 29 | 
 30 | ## Motivation
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 63ee37e52b
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Functions and methods are powerful{{1}}
 39 | 
 40 | - All code in Python distribution?{{2}}
 41 | 
 42 |   - Huge code base: messy{{2}}
 43 | 
 44 |   - Lots of code you won’t use{{3}}
 45 | 
 46 |   - Maintenance problem{{4}}
 47 | 
 48 | `@script`
 49 | that Python functions and methods are extremely powerful: you can basically use other people's code to solve your own problems. That's amazing! However, adding all functions and methods that have been written up to now to the same Python distribution would be a mess. There would be tons and tons of code in there that you'll never use. Also, maintaining all of this code would be a real pain.
 50 | 
 51 | ---
 52 | 
 53 | ## Packages
 54 | 
 55 | ```yaml
 56 | type: TwoColumns
 57 | key: fe3a37611e
 58 | ```
 59 | 
 60 | `@part1`
 61 | - Directory of Python Scripts{{1}}
 62 | 
 63 | - Each script = module{{2}}
 64 | 
 65 | - Specify functions, methods, types{{3}}
 66 | 
 67 | - Thousands of packages available{{4}}
 68 | 
 69 |   - NumPy{{5}}
 70 | 
 71 |   - Matplotlib{{6}}
 72 | 
 73 |   - scikit-learn{{7}}
 74 | 
 75 | `@part2`
 76 | ![Screen Shot 2019-09-08 at 9.18.56 AM.png](https://assets.datacamp.com/production/repositories/288/datasets/4763cadd79023a264f2e25c85c8344817ec13c55/Screen%20Shot%202019-09-08%20at%209.18.56%20AM.png = 60)
 77 | 
 78 | `@script`
 79 | This is where packages come into play. You can think of packages as a directory of Python scripts. Each such script is a so-called module. These modules specify functions, methods and new Python types aimed at solving particular problems. There are thousands of Python packages available from the internet. Among them are packages for data science: there's NumPy to efficiently work with arrays, Matplotlib for data visualization, and scikit-learn for machine learning.
 80 | 
 81 | Not all these packages are available in Python by default.
 82 | 
 83 | ---
 84 | 
 85 | ## Install package
 86 | 
 87 | ```yaml
 88 | type: FullSlide
 89 | key: a198cbb666
 90 | ```
 91 | 
 92 | `@part1`
 93 | - https://pip.pypa.io/en/stable/installation/{{1}}
 94 | 
 95 | - Download `get-pip.py`{{2}}
 96 | 
 97 | - Terminal:{{3}}
 98 | 
 99 | 	- `python3 get-pip.py`{{4}}
100 | 
101 | 	- `pip3 install numpy`{{5}}
102 | 
103 | `@script`
104 | To use Python packages, you'll first have to install them on your own system, and then put code in your script to tell Python that you want to use these packages.
105 | 
106 | DataCamp already has all necessary packages installed for you, but if you want to install them on your own system, you'll want to use pip, a package management system for Python. If you go to this URL, you can download the file get-pip.py. Next, you go to the terminal, and execute python3 get-pip.py. Now you can use pip to actually install a Python package of your choosing. Suppose we want to install the numpy package, which you'll learn about in the next chapter. You type pip3 install numpy. You have to use the commands python3 and pip3 here to tell our system that we're working with Python version 3.
107 | 
108 | Now that the package is installed, you can actually start using it in one of your Python scripts.
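
If you are unsure whether an installation succeeded, one possible check, given here as a sketch and not as part of the original course material, uses importlib.util.find_spec from the standard library. The helper name is_installed is hypothetical, and it is only meant for top-level names such as numpy.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate a top-level package or module with this name."""
    return importlib.util.find_spec(package_name) is not None


# Prints True or False depending on the system this runs on
print("numpy installed:", is_installed("numpy"))
```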
109 | 
110 | ---
111 | 
112 | ## Import package
113 | 
114 | ```yaml
115 | type: TwoColumns
116 | key: d87a9581e9
117 | code_zoom: 68
118 | ```
119 | 
120 | `@part1`
121 | ```py
122 | import numpy
123 | ```{{1}}
124 | ```py
125 | array([1, 2, 3])
126 | ```{{2}}
127 | 
128 | ```out
129 | NameError: name 'array' is not defined
130 | ```{{3}}
131 | 
132 | ```py
133 | numpy.array([1, 2, 3])
134 | ```{{4}}
135 | 
136 | ```out
137 | array([1, 2, 3])
138 | ```{{5}}
139 | 
140 | `@part2`
141 | ```py
142 | import numpy as np
143 | ```{{6}}
144 | ```py
145 | np.array([1, 2, 3])
146 | ```{{7}}
147 | 
148 | ```out
149 | array([1, 2, 3])
150 | ```{{8}}
151 | 
152 | ```py
153 | from numpy import array
154 | ```{{9}}
155 | ```py
156 | array([1, 2, 3])
157 | ```{{10}}
158 | 
159 | ```out
160 | array([1, 2, 3])
161 | ```{{11}}
162 | 
163 | `@script`
164 | Before you can do this, you should import the package, or a specific module of the package. You can do this with the import statement.
165 | 
166 | To import the entire numpy package, you can do import numpy, like this.
167 | 
168 | A commonly used function in NumPy is array. It takes a list as input. Simply calling the array function like this will generate an error.
169 | 
170 | To refer to the array function from the numpy package, you'll need this.
171 | 
172 | This time it works. The NumPy array is very useful to do data science, but more on that later.
173 | 
174 | Using this numpy dot prefix all the time can become pretty tiring, so you can also import the package and refer to it with a different name. You can do this by extending your import statement with as, like this.
175 | 
176 | Now, instead of numpy.array, you'll have to use np.array to use NumPy's array function.
177 | 
178 | There are cases in which you only need one specific function of a package. Python allows you to make this explicit in your code. Suppose that we only want to use the array function from the NumPy package. Instead of doing import numpy, you can instead do from numpy import array, like this.
179 | 
180 | This time, you can simply call the array function like this, no need to use numpy dot here.
181 | 
182 | This from-import form, which uses only specific parts of a package, can be useful to limit the amount of coding, but you're also losing some of the context.
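
The three import styles can be compared side by side. In the sketch below, the standard-library math module stands in for numpy (an assumption made so the example runs without installing anything); the same pattern applies to numpy and its array function.

```python
# Standard-library math stands in for numpy here, so this runs without installing anything

import math                  # whole module; calls need the math. prefix
root_a = math.sqrt(16)

import math as m             # same module under a shorter alias
root_b = m.sqrt(16)

from math import sqrt        # only the sqrt name; no prefix needed
root_c = sqrt(16)

print(root_a, root_b, root_c)
```

All three calls reach the same function; the difference is only how much of its origin stays visible at the call site.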
183 | 
184 | ---
185 | 
186 | ## from numpy import array
187 | 
188 | ```yaml
189 | type: FullSlide
190 | key: e17caa7b57
191 | code_zoom: 70
192 | ```
193 | 
194 | `@part1`
195 | - `my_script.py`
196 | 
197 | ```py
198 | from numpy import array
199 | ```
200 | ```py
201 | 
202 | fam = ["liz", 1.73, "emma", 1.68, 
203 | 	"mom", 1.71, "dad", 1.89]
204 | 
205 | ...
206 | ```
207 | ```py
208 | fam_ext = fam + ["me", 1.79]
209 | 
210 | ...
211 | ```
212 | ```py
213 | print(str(len(fam_ext)) + " elements in fam_ext")
214 | 
215 | ...
216 | ```
217 | ```py
218 | np_fam = array(fam_ext) 
219 | ```{{1}}
220 | 
221 | - Using NumPy, but not very clear{{2}}
222 | 
223 | `@script`
224 | Suppose you're working in a long Python script. You import the array function from numpy at the very top, and way later, you actually use this array function. Somebody else who's reading your code might have forgotten that this array function is a specific NumPy function; it's not clear from the function call.
225 | 
226 | ---
227 | 
228 | ## import numpy
229 | 
230 | ```yaml
231 | type: FullSlide
232 | key: b287cdae79
233 | code_zoom: 70
234 | ```
235 | 
236 | `@part1`
237 | ```py
238 | import numpy as np
239 | 
240 | fam = ["liz", 1.73, "emma", 1.68, 
241 | 	"mom", 1.71, "dad", 1.89]
242 | 
243 | ...
244 | ```
245 | ```py
246 | fam_ext = fam + ["me", 1.79]
247 | 
248 | ...
249 | ```
250 | ```py
251 | print(str(len(fam_ext)) + " elements in fam_ext")
252 | 
253 | ...
254 | ```
255 | ```py
256 | np_fam = np.array(fam_ext) # Clearly using NumPy
257 | ```{{1}}
258 | 
259 | `@script`
260 | In that respect, the more standard import numpy as np call is preferred: in this case, your function call is np.array, making it very clear that you're working with NumPy.
261 | 
262 | ---
263 | 
264 | ## Let's practice!
265 | 
266 | ```yaml
267 | type: FinalSlide
268 | key: 570affae26
269 | ```
270 | 
271 | `@script`
272 | Off to the exercises now, where you can practice different ways of importing packages and modules yourself. You're well on your way to becoming a pythonista data science ninja.
273 | 


--------------------------------------------------------------------------------
/slides/chapter_4_34495ba457d74296794d2a122c9b6e19.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: 34495ba457d74296794d2a122c9b6e19
  4 | video_link:
  5 |   hls: >-
  6 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch4_3.master.m3u8
  7 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch4_3.mp4'
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## NumPy: Basic Statistics
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 5d21c4b49f
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | A typical first step in analyzing your data,
 27 | 
 28 | ---
 29 | 
 30 | ## Data analysis
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 32899f8a31
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Get to know your data{{1}}
 39 | 
 40 | - Little data -> simply look at it{{2}}
 41 | 
 42 | - Big data -> ?{{3}}
 43 | 
 44 | `@script`
 45 | is getting to know your data in the first place. For the NumPy arrays from before, this is pretty easy, because it isn't a lot of data. However, as a data scientist, you'll be crunching thousands, if not millions or billions of numbers.
 46 | 
 47 | ---
 48 | 
 49 | ## City-wide survey
 50 | 
 51 | ```yaml
 52 | type: FullSlide
 53 | key: df02059657
 54 | ```
 55 | 
 56 | `@part1`
 57 | ```py
 58 | import numpy as np
 59 | np_city = ... # Implementation left out
 60 | np_city
 61 | ```{{1}}
 62 | 
 63 | ```out
 64 | array([[1.64, 71.78],
 65 |        [1.37, 63.35],
 66 |        [1.6 , 55.09],
 67 |        ...,
 68 |        [2.04, 74.85],
 69 |        [2.04, 68.72],
 70 |        [2.01, 73.57]])
 71 | ```{{1}}
 72 | 
 73 | `@script`
 74 | Imagine you conduct a city-wide survey where you ask 5000 adults about their height and weight. You end up with something like this: a 2D numpy array, which I named np_city, that has 5000 rows, corresponding to the 5000 people, and two columns, corresponding to the height and the weight.
 75 | 
 76 | Simply staring at these numbers like a zombie won't give you any insights. What you can do, though, is generate summarizing statistics about your data.
 77 | 
 78 | ---
 79 | 
 80 | ## NumPy
 81 | 
 82 | ```yaml
 83 | type: FullSlide
 84 | key: d3c991b91f
 85 | code_zoom: 90
 86 | ```
 87 | 
 88 | `@part1`
 89 | ```py
 90 | np.mean(np_city[:, 0])
 91 | ```{{1}}
 92 | 
 93 | ```out
 94 | 1.7472
 95 | ```{{1}}
 96 | 
 97 | ```py
 98 | np.median(np_city[:, 0])
 99 | ```{{2}}
100 | 
101 | ```out
102 | 1.75
103 | ```{{2}}
104 | 
105 | `@script`
106 | Aside from an efficient data structure for number crunching, it happens that NumPy is also good at doing these kinds of things.
107 | 
108 | For starters, you can try to find out the average height of these 5000 people, with NumPy's mean function. Because it's a function from the NumPy package, don't forget to start with the np. prefix.
109 | 
110 | Of course, I first had to do a subsetting operation to get the height column from the 2D array. It appears that on average, people are about 1.75 meters tall. What about the median height? This is the height of the middle person if you sort all persons from shortest to tallest. Instead of writing complicated Python code to figure this out, you can simply use NumPy's median function.
111 | 
112 | You can do similar things for the weight column in np_city. Often, these summarizing statistics will provide you with a "sanity check" of your data. If you end up with an average weight of 2000 kilograms, your measurements are most likely incorrect.
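
As a rough sketch of such a sanity check, take a tiny made-up survey in which one weight was entered incorrectly. The array `np_small` and its values are hypothetical, not the course data.

```python
import numpy as np

# Hypothetical mini survey: one row per person, columns are height (m) and weight (kg)
np_small = np.array([[1.60, 55.0],
                     [1.70, 65.0],
                     [1.80, 75.0],
                     [1.90, 2000.0]])  # the last weight is a deliberate data-entry error

mean_height = np.mean(np_small[:, 0])
median_height = np.median(np_small[:, 0])

mean_weight = np.mean(np_small[:, 1])      # pulled far upwards by the bad value
median_weight = np.median(np_small[:, 1])  # barely affected by it
```

A mean weight in the hundreds of kilograms next to a median of 70 is a strong hint that something is off in the weight column.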
113 | 
114 | Apart from mean and median, there are also other functions,
115 | 
116 | ---
117 | 
118 | ## NumPy
119 | 
120 | ```yaml
121 | type: FullSlide
122 | key: a66131c711
123 | ```
124 | 
125 | `@part1`
126 | ```py
127 | np.corrcoef(np_city[:, 0], np_city[:, 1])
128 | ```
129 | 
130 | ```out
131 | array([[ 1.     , -0.01802],
132 |        [-0.01802,  1.     ]])
133 | ```
134 | 
135 | ```py
136 | np.std(np_city[:, 0])
137 | ```{{1}}
138 | 
139 | ```out
140 | 0.1992
141 | ```{{1}}
142 | 
143 | - sum(), sort(), ...{{2}}
144 | 
145 | - Enforce single data type: speed!{{3}}
146 | 
147 | `@script`
148 | like corrcoef, to check whether, for example, height and weight are correlated,
149 | 
150 | and std, for standard deviation.
151 | 
152 | NumPy also features more basic functions, such as sum and sort, which also exist in the basic Python distribution. However, the big difference here is speed. Because NumPy enforces a single data type in an array, it can drastically speed up the calculations.
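
Here is a small sketch of these functions on made-up data. The values are hypothetical and chosen so that weight increases perfectly linearly with height, which should give a correlation of 1.

```python
import numpy as np

# Hypothetical data: weight rises by 10 kg for every 0.10 m of height
height = np.array([1.60, 1.70, 1.80, 1.90])
weight = np.array([55.0, 65.0, 75.0, 85.0])

corr = np.corrcoef(height, weight)  # 2x2 correlation matrix
spread = np.std(height)             # standard deviation of the heights
total = np.sum(weight)              # NumPy's version of sum
ordered = np.sort(weight)[::-1]     # sorted ascending, then reversed
```

The correlation matrix is symmetric: `corr[0, 1]` and `corr[1, 0]` both hold the correlation between height and weight, and the diagonal is always 1.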
153 | 
154 | ---
155 | 
156 | ## Generate data
157 | 
158 | ```yaml
159 | type: FullSlide
160 | key: 0c27803967
161 | code_zoom: 80
162 | ```
163 | 
164 | `@part1`
165 | - Arguments for `np.random.normal()` {{1}}
166 |     - distribution mean{{1}}
167 |     - distribution standard deviation{{1}}
168 |     - number of samples{{1}}
169 | 
170 | ```py
171 | height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
172 | 
173 | weight = np.round(np.random.normal(60.32, 15, 5000), 2)
174 | 
175 | ```{{1}}
176 | ```py
177 | np_city = np.column_stack((height, weight))
178 | ```{{2}}
179 | 
180 | `@script`
181 | Just a sidenote here: if you're wondering how I came up with the data in this video, I simulated it with NumPy functions! I sampled two normal distributions 5000 times to create the height and weight arrays, and then used column_stack to paste them together as two columns. Another awesome thing that NumPy can do!
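
If you want to try this yourself, a sketch along these lines should work. The seed value is an assumption added here only to make the run reproducible; it is not part of the original simulation, so your numbers will differ from the ones in the video.

```python
import numpy as np

np.random.seed(123)  # assumed seed, only so that repeated runs give the same data

# Arguments: distribution mean, standard deviation, number of samples
height = np.round(np.random.normal(1.75, 0.20, 5000), 2)
weight = np.round(np.random.normal(60.32, 15, 5000), 2)

# Paste the two 1D arrays together as the columns of a 2D array
np_city = np.column_stack((height, weight))

shape = np_city.shape  # 5000 rows, 2 columns
```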
182 | 
183 | Another great tool to get some sense of your data is to visualize it, but that's something for the next course.
184 | 
185 | ---
186 | 
187 | ## Let's practice!
188 | 
189 | ```yaml
190 | type: FinalSlide
191 | key: c4df18cfc1
192 | ```
193 | 
194 | `@script`
195 | First, head over to the exercises to learn how to explore your NumPy arrays!
196 | 


--------------------------------------------------------------------------------
/slides/chapter_4_a0487c26210f6b71ea98f917734cea3a.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: a0487c26210f6b71ea98f917734cea3a
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch4_1.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch4_1.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## NumPy
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 1062fb4e4c
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Wow, you've done well and by now, you are aware
 27 | 
 28 | ---
 29 | 
 30 | ## Lists Recap
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 819dc4dd09
 35 | ```
 36 | 
 37 | `@part1`
 38 | - Powerful{{1}}
 39 | 
 40 | - Collection of values{{2}}
 41 | 
 42 | - Hold different types{{3}}
 43 | 
 44 | - Change, add, remove{{4}}
 45 | 
 46 | - Need for Data Science{{5}}
 47 | 
 48 |   - Mathematical operations over collections{{6}}
 49 | 
 50 |   - Speed{{7}}
 51 | 
 52 | `@script`
 53 | that the Python list is pretty powerful. A list can hold any type and can hold different types at the same time. You can also change, add and remove elements. This is wonderful, but one feature is missing, a feature that is super important for aspiring data scientists like yourself. When analyzing data, you'll often want to carry out operations over entire collections of values, and you want to do this fast. With lists, this is a problem.
 54 | 
 55 | ---
 56 | 
 57 | ## Illustration
 58 | 
 59 | ```yaml
 60 | type: FullSlide
 61 | key: c038185807
 62 | code_zoom: 64
 63 | ```
 64 | 
 65 | `@part1`
 66 | ```py
 67 | height = [1.73, 1.68, 1.71, 1.89, 1.79]
 68 | height
 69 | ```
 70 | 
 71 | ```out
 72 | [1.73, 1.68, 1.71, 1.89, 1.79]
 73 | ```
 74 | 
 75 | ```py
 76 | weight = [65.4, 59.2, 63.6, 88.4, 68.7]
 77 | weight
 78 | ```{{1}}
 79 | 
 80 | ```out
 81 | [65.4, 59.2, 63.6, 88.4, 68.7]
 82 | ```{{1}}
 83 | 
 84 | ```py
 85 | weight / height ** 2
 86 | ```{{2}}
 87 | 
 88 | ```out
 89 | TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'
 90 | ```{{3}}
 91 | 
 92 | `@script`
 93 | Let's return to the heights of your family and yourself. Suppose you've also asked for everybody's weight. It's not very polite, but anything for science, right? You end up with two lists, height and weight. The first person is 1.73 meters tall and weighs 65.4 kilograms.
 94 | 
 95 | If you now want to calculate the Body Mass Index for each family member, you'd hope that this call would work, making the calculations element-wise.
 96 | 
 97 | Unfortunately, Python throws an error, because it has no idea how to do calculations on lists. You could solve this by going through each list element one after the other, and calculating the BMI for each person separately, but this is terribly inefficient and tiresome to write.
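
For reference, the loop-based workaround could look roughly like this. It is a sketch using only plain Python lists and the height and weight values from the slide.

```python
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Walk through both lists in parallel and compute each BMI one by one
bmi = []
for w, h in zip(weight, height):
    bmi.append(w / h ** 2)
```

It works, but you have to spell out the loop every time you want to do a calculation like this.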
 98 | 
 99 | ---
100 | 
101 | ## Solution: NumPy
102 | 
103 | ```yaml
104 | type: FullSlide
105 | key: 7d3d0276cb
106 | ```
107 | 
108 | `@part1`
109 | - Numerical Python{{1}}
110 | 
111 | - Alternative to Python List: NumPy Array{{2}}
112 | 
113 | - Calculations over entire arrays{{3}}
114 | 
115 | - Easy and Fast{{4}}
116 | 
117 | - Installation{{5}}
118 | 
119 | 	- In the terminal: `pip3 install numpy`{{6}}
120 | 
121 | `@script`
122 | A way more elegant solution is to use NumPy, or Numerical Python. It's a Python package that, among other things, provides an alternative to the regular Python list: the NumPy array. The NumPy array is pretty similar to the list, but has one additional feature: you can perform calculations over entire arrays. It's really easy, and super-fast as well.
123 | 
124 | The NumPy package is already installed on DataCamp's servers, but if you want to work with it on your own system, go to the command line and execute pip3 install numpy.
125 | 
126 | Next,
127 | 
128 | ---
129 | 
130 | ## NumPy
131 | 
132 | ```yaml
133 | type: FullSlide
134 | key: b227a9dc4f
135 | code_zoom: 75
136 | ```
137 | 
138 | `@part1`
139 | ```py
140 | import numpy as np
141 | ```
142 | ```py
143 | np_height = np.array(height)
144 | np_height
145 | ```{{1}}
146 | 
147 | ```out
148 | array([1.73, 1.68, 1.71, 1.89, 1.79])
149 | ```{{1}}
150 | 
151 | ```py
152 | np_weight = np.array(weight)
153 | np_weight
154 | ```{{1}}
155 | 
156 | ```out
157 | array([65.4, 59.2, 63.6, 88.4, 68.7])
158 | ```{{1}}
159 | 
160 | ```py
161 | bmi = np_weight / np_height ** 2
162 | bmi
163 | ```{{2}}
164 | 
165 | ```out
166 | array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
167 | ```{{2}}
168 | 
169 | `@script`
170 | to actually use NumPy in your Python session, you can import the numpy package, like this.
171 | 
172 | Let's start with creating a numpy array. You do this with NumPy's array function: the input is a regular Python list. I'm using array twice here, to create NumPy versions of the height and weight lists from before: np_height and np_weight:
173 | 
174 | Let's try to calculate everybody's BMI with a single call again.
175 | 
176 | This time, it worked fine: the calculations were performed element-wise. The first person's BMI was calculated by dividing the first element in np_weight by the square of the first element in np_height, the second person's BMI was calculated with the second height and weight elements, and so on.
177 | 
178 | ---
179 | 
180 | ## Comparison
181 | 
182 | ```yaml
183 | type: FullSlide
184 | key: b0247dd81c
185 | code_zoom: 77
186 | ```
187 | 
188 | `@part1`
189 | ```py
190 | height = [1.73, 1.68, 1.71, 1.89, 1.79]
191 | weight = [65.4, 59.2, 63.6, 88.4, 68.7]
192 | weight / height ** 2
193 | ```{{1}}
194 | 
195 | ```out
196 | TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'
197 | ```{{1}}
198 | 
199 | ```py
200 | np_height = np.array(height)
201 | np_weight = np.array(weight)
202 | np_weight / np_height ** 2
203 | ```{{2}}
204 | 
205 | ```out
206 | array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
207 | ```{{2}}
208 | 
209 | `@script`
210 | Let's do a quick comparison here. First, we tried to do calculations with regular lists, like this, but this gave us an error, because Python doesn't know how to do calculations with lists the way we want it to. Next, these regular lists were converted to NumPy arrays. The same operations now work without any problem: NumPy knows how to work with arrays as if they are single values, which is pretty awesome if you ask me.
211 | 
212 | ---
213 | 
214 | ## NumPy: remarks
215 | 
216 | ```yaml
217 | type: FullSlide
218 | key: f9882b091b
219 | code_zoom: 90
220 | ```
221 | 
222 | `@part1`
223 | ```py
224 | np.array([1.0, "is", True])
225 | ```{{1}}
226 | 
227 | ```out
228 | array(['1.0', 'is', 'True'], dtype='<U32')
229 | ```{{1}}
230 | 
231 | - NumPy arrays: contain only one type{{1}}
232 | 
233 | `@script`
234 | You should still pay attention, though. First of all, NumPy can do all of this so easily because it assumes that your NumPy array can only contain values of a single type. It's either an array of floats, or an array of booleans, and so on. If you do try to create an array with different types, like this for example, the resulting NumPy array will contain a single type, string in this case. The boolean and the float were both converted to strings.
235 | 
236 | Second, you should know that a NumPy array is simply a new kind of Python type, like the float, string and list types from before. This means that it comes with its own methods, which can behave differently than you'd expect.
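
One way to see the single-type rule in action is to inspect the `dtype` attribute of an array. This is a small sketch; the variable names are only for illustration.

```python
import numpy as np

floats = np.array([1.0, 2.5, 3.0])
mixed = np.array([1.0, "is", True])  # float, string and boolean together

float_kind = floats.dtype.kind  # 'f' means floating point
mixed_kind = mixed.dtype.kind   # 'U' means (unicode) string

first = mixed[0]  # the float 1.0 has become the string '1.0'
```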
237 | 
238 | ---
239 | 
240 | ## NumPy: remarks
241 | 
242 | ```yaml
243 | type: FullSlide
244 | key: 4da6149ced
245 | code_zoom: 80
246 | ```
247 | 
248 | `@part1`
249 | ```py
250 | python_list = [1, 2, 3]
251 | numpy_array = np.array([1, 2, 3])
252 | ```
253 | 
254 | ```py
255 | python_list + python_list
256 | ```{{1}}
257 | 
258 | ```out
259 | [1, 2, 3, 1, 2, 3]
260 | ```{{1}}
261 | 
262 | ```py
263 | numpy_array + numpy_array
264 | ```{{2}}
265 | 
266 | ```out
267 | array([2, 4, 6])
268 | ```{{2}}
269 | 
270 | - Different types: different behavior!{{3}}
271 | 
272 | `@script`
273 | Take this Python list and this numpy array, for example.
274 | 
275 | If you do python_list + python_list, the list elements are pasted together, generating a list with 6 elements. If you do this with the numpy arrays, on the other hand, Python will do an element-wise sum of the arrays.
276 | 
277 | Just make sure to pay attention when you're juggling around with different Python types, because the outcomes can differ a lot!
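
The same kind of difference shows up with other operators. As a sketch, multiplying by 2 repeats a list but doubles every element of an array:

```python
import numpy as np

python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

list_times_two = python_list * 2    # repetition: the list is pasted to itself
array_times_two = numpy_array * 2   # element-wise: every value is doubled
```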
278 | 
279 | Apart from these subtleties,
280 | 
281 | ---
282 | 
283 | ## NumPy Subsetting
284 | 
285 | ```yaml
286 | type: FullSlide
287 | key: c1f3774f83
288 | code_zoom: 71
289 | ```
290 | 
291 | `@part1`
292 | ```py
293 | bmi
294 | ```{{1}}
295 | 
296 | ```out
297 | array([21.85171573, 20.97505669, 21.75028214, 24.7473475 , 21.44127836])
298 | ```{{1}}
299 | 
300 | ```py
301 | bmi[1]
302 | ```{{2}}
303 | 
304 | ```out
305 | 20.975
306 | ```{{2}}
307 | 
308 | ```py
309 | bmi > 23
310 | ```{{3}}
311 | 
312 | ```out
313 | array([False, False, False,  True, False])
314 | ```{{3}}
315 | 
316 | ```py
317 | bmi[bmi > 23]
318 | ```{{4}}
319 | 
320 | ```out
321 | array([24.7473475])
322 | ```{{4}}
323 | 
324 | `@script`
325 | you can work with NumPy arrays pretty much the same as you can with regular Python lists. When you want to get elements from your array, for example, you can use square brackets. Suppose you want to get the bmi for the second person, so at index 1. This will do the trick.
326 | 
327 | Specifically for NumPy, there's also another way to do subsetting: using an array of booleans. Say you want to get all BMI values in the bmi array that are over 23. A first step is using the greater than sign, like this:
328 | 
329 | The result is a NumPy array containing booleans: True if the corresponding bmi is above 23, False otherwise. Next, you can use this boolean array inside square brackets to do subsetting. Only the elements in bmi that are above 23, so for which the corresponding boolean value is True, are selected. There's only one BMI that's above 23, so we end up with a NumPy array with a single value, that specific BMI.
330 | 
331 | Using the result of a comparison to make a selection of your data is a very common way to get surprising insights.
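
Here is a short sketch of the two steps, using rounded versions of the BMI values from the slide. Storing the boolean array in a variable called `mask` is just a naming choice made here.

```python
import numpy as np

bmi = np.array([21.85, 20.98, 21.75, 24.75, 21.44])  # rounded values

mask = bmi > 23    # boolean array, one True/False per element
high = bmi[mask]   # keeps only the elements where mask is True

# True counts as 1, so summing the mask counts the matching elements
n_high = mask.sum()
```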
332 | 
333 | ---
334 | 
335 | ## Let's practice!
336 | 
337 | ```yaml
338 | type: FinalSlide
339 | key: 1138fd29b8
340 | ```
341 | 
342 | `@script`
343 | Learn all about it and the other NumPy basics in the exercises!
344 | 


--------------------------------------------------------------------------------
/slides/chapter_4_ae3238dcc7feb9adecfee0c395fc8dc8.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Insert title here
  3 | key: ae3238dcc7feb9adecfee0c395fc8dc8
  4 | video_link:
  5 |   mp4: 'https://videos.datacamp.com/raw/735_intro_to_python/v6/735_ch4_2.mp4'
  6 |   hls: >-
  7 |     https://videos.datacamp.com/transcoded/735_intro_to_python/v6/hls-735_ch4_2.master.m3u8
  8 | transformations:
  9 |   translateX: 50
 10 |   translateY: 0
 11 |   scale: 1
 12 | ---
 13 | 
 14 | ## 2D NumPy Arrays
 15 | 
 16 | ```yaml
 17 | type: TitleSlide
 18 | key: 0cc8abf493
 19 | ```
 20 | 
 21 | `@lower_third`
 22 | name: Hugo Bowne-Anderson
 23 | title: Data Scientist at DataCamp
 24 | 
 25 | `@script`
 26 | Well done you legend! Let's now recreate the numpy arrays from the previous video.
 27 | 
 28 | ---
 29 | 
 30 | ## Type of NumPy Arrays
 31 | 
 32 | ```yaml
 33 | type: FullSlide
 34 | key: 1b9db47fd2
 35 | code_zoom: 100
 36 | ```
 37 | 
 38 | `@part1`
 39 | ```py
 40 | import numpy as np
 41 | np_height = np.array([1.73, 1.68, 1.71, 1.89, 1.79])
 42 | np_weight = np.array([65.4, 59.2, 63.6, 88.4, 68.7])
 43 | ```
 44 | 
 45 | ```py
 46 | type(np_height)
 47 | ```
 48 | 
 49 | ```out
 50 | numpy.ndarray
 51 | ```
 52 | 
 53 | ```py
 54 | type(np_weight)
 55 | ```
 56 | 
 57 | ```out
 58 | numpy.ndarray
 59 | ```
 60 | 
 61 | `@script`
 62 | If you ask for the type of these arrays, Python tells you that they are numpy.ndarray. The numpy. prefix tells you it's a type that was defined in the numpy package. ndarray stands for n-dimensional array. The arrays np_height and np_weight are one-dimensional arrays, but it's perfectly possible to create two-dimensional, three-dimensional, heck even seven-dimensional arrays! Let's stick to two in this video though.
 63 | 
 64 | ---
 65 | 
 66 | ## 2D NumPy Arrays
 67 | 
 68 | ```yaml
 69 | type: FullSlide
 70 | key: ebb550dcba
 71 | code_zoom: 71
 72 | ```
 73 | 
 74 | `@part1`
 75 | ```py
 76 | np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
 77 |                   [65.4, 59.2, 63.6, 88.4, 68.7]])
 78 | ```{{1}}
 79 | ```py
 80 | np_2d
 81 | ```{{2}}
 82 | 
 83 | ```out
 84 | array([[ 1.73,  1.68,  1.71,  1.89,  1.79],
 85 |        [65.4 , 59.2 , 63.6 , 88.4 , 68.7 ]])
 86 | ```{{2}}
 87 | 
 88 | ```py
 89 | np_2d.shape
 90 | ```{{3}}
 91 | 
 92 | ```out
 93 | (2, 5) # 2 rows, 5 columns
 94 | ```{{3}}
 95 | 
 96 | ```py
 97 | np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
 98 |           [65.4, 59.2, 63.6, 88.4, "68.7"]])
 99 | ```{{4}}
100 | 
101 | ```out
102 | array([['1.73', '1.68', '1.71', '1.89', '1.79'],
103 |        ['65.4', '59.2', '63.6', '88.4', '68.7']], dtype='<U32')
104 | ```{{4}}
105 | 
106 | `@script`
107 | You can create a 2D numpy array from a regular Python list of lists. Let's try to create one numpy array for all height and weight data of your family, like this.
108 | 
109 | If you print out np_2d now, you'll see that it is a rectangular data structure: each sublist in the list corresponds to a row in the two-dimensional numpy array. From np_2d.shape, you can see that we indeed have 2 rows and 5 columns. shape is a so-called attribute of the np_2d array, which can give you more information about what the data structure looks like.
110 | 
111 | Note that the syntax for accessing an attribute looks a bit like calling a method, but they are not the same! Remember that methods have round brackets after them, and, as you can see here, attributes do not.
112 | 
113 | Also for 2D arrays, the NumPy rule applies: an array can only contain a single type. If you change one float to be a string, all the array elements will be coerced to strings, to end up with a homogeneous array.
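
To illustrate the difference between attributes and methods, here is a small sketch. `shape` and `ndim` are attributes of a NumPy array, while `mean` is a method; which ones to show here is simply a choice made for this example.

```python
import numpy as np

np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

# Attributes: no round brackets
shape = np_2d.shape  # (rows, columns)
ndim = np_2d.ndim    # number of dimensions

# Method: called with round brackets
overall_mean = np_2d.mean()  # mean of all ten values
```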
114 | 
115 | ---
116 | 
117 | ## Subsetting
118 | 
119 | ```yaml
120 | type: FullSlide
121 | key: e71d2fc8b8
122 | code_zoom: 80
123 | ```
124 | 
125 | `@part1`
126 | ```out
127 |            0       1       2       3       4
128 |            
129 | array([[  1.73,   1.68,   1.71,   1.89,   1.79],     0
130 |        [  65.4,   59.2,   63.6,   88.4,   68.7]])    1
131 | ```
132 | 
133 | ```py
134 | np_2d[0]
135 | ```
136 | 
137 | ```out
138 | array([1.73, 1.68, 1.71, 1.89, 1.79])
139 | ```
140 | 
141 | `@script`
142 | You can think of the 2D numpy array as an improved list of lists: you can perform calculations on the arrays, like I showed before, and you can do more advanced ways of subsetting.
143 | 
144 | Suppose you want the first row, and then the third element in that row. To select the row, you need the index 0 in square brackets. Don't forget about zero indexing.
145 | 
146 | To then select the third element, you can extend the same call with another pair of brackets, this time with the index 2,
147 | 
148 | ---
149 | 
150 | ## Subsetting
151 | 
152 | ```yaml
153 | type: FullSlide
154 | key: 57a1fb1581
155 | disable_transition: true
156 | code_zoom: 80
157 | ```
158 | 
159 | `@part1`
160 | ```out
161 |            0       1       2       3       4
162 |            
163 | array([[  1.73,   1.68,   1.71,   1.89,   1.79],     0
164 |        [  65.4,   59.2,   63.6,   88.4,   68.7]])    1
165 | ```
166 | 
167 | ```py
168 | np_2d[0][2]
169 | ```
170 | 
171 | ```out
172 | 1.71
173 | ```
174 | 
175 | ```py
176 | np_2d[0, 2]
177 | ```{{1}}
178 | 
179 | ```out
180 | 1.71
181 | ```{{1}}
182 | 
183 | `@script`
184 | like this. Basically you're selecting the row, and then from that row do another selection.
185 | 
186 | There's also an alternative way of subsetting, using single square brackets and a comma. This call returns the exact same value as before. The value before the comma specifies the row, the value after the comma specifies the column. The intersection of the rows and columns you specified is returned. Once you get used to it, this syntax is more intuitive and opens up more possibilities.
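
As a quick sketch, you can check that both ways of subsetting give the same value, and that the comma syntax is something a regular list of lists does not support. The `list_2d` variable is introduced here only for the comparison.

```python
import numpy as np

np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])
list_2d = np_2d.tolist()  # plain list of lists with the same values

chained = np_2d[0][2]  # select row 0, then element 2 of that row
comma = np_2d[0, 2]    # row 0, column 2 in a single step

# A regular list of lists rejects the comma syntax with a TypeError
try:
    list_2d[0, 2]
    list_accepts_comma = True
except TypeError:
    list_accepts_comma = False
```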
187 | 
188 | ---
189 | 
190 | ## Subsetting
191 | 
192 | ```yaml
193 | type: FullSlide
194 | key: feb75c975c
195 | disable_transition: true
196 | code_zoom: 80
197 | ```
198 | 
199 | `@part1`
200 | ```out
201 |            0       1       2       3       4
202 |            
203 | array([[  1.73,   1.68,   1.71,   1.89,   1.79],     0
204 |        [  65.4,   59.2,   63.6,   88.4,   68.7]])    1
205 | ```
206 | 
207 | ```py
208 | np_2d[:, 1:3]
209 | ```{{1}}
210 | 
211 | ```out
212 | array([[ 1.68,  1.71],
213 |        [59.2 , 63.6 ]])
214 | ```{{1}}
215 | 
216 | ```py
217 | np_2d[1, :]
218 | ```{{2}}
219 | 
220 | ```out
221 | array([65.4, 59.2, 63.6, 88.4, 68.7])
222 | ```{{2}}
223 | 
224 | `@script`
225 | Suppose you want to select the height and weight of the second and third family member. You want both rows, so you put in a colon before the comma. You only want the second and third column, so you put in the indices 1 to 3 after the comma. Remember that the end index, 3, is not included here. The intersection gives us a 2D array with 2 rows and 2 columns:
226 | 
227 | Similarly, you can select the weight of all family members like this: you only want the second row, so put 1 before the comma. You want all columns, so you use a colon after the comma. The intersection gives us the entire second row.
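
If you want to convince yourself of what each selection returns, a sketch like this one checks the shapes. The variable names `middle` and `weights` are only illustrative.

```python
import numpy as np

np_2d = np.array([[1.73, 1.68, 1.71, 1.89, 1.79],
                  [65.4, 59.2, 63.6, 88.4, 68.7]])

middle = np_2d[:, 1:3]  # all rows, columns 1 and 2 (index 3 is excluded)
weights = np_2d[1, :]   # row 1, all columns

middle_shape = middle.shape    # still a 2D array
weights_shape = weights.shape  # a single row comes back as a 1D array
```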
228 | 
229 | Finally, 2D numpy arrays enable you to do element-wise calculations, the same way you did it with 1D numpy arrays. That's something
230 | 
231 | ---
232 | 
233 | ## Let's practice!
234 | 
235 | ```yaml
236 | type: FinalSlide
237 | key: 6047b27c09
238 | ```
239 | 
240 | `@script`
241 | you can experiment with in the exercises, along with creating and subsetting 2D numpy arrays! Exciting!
242 | 


--------------------------------------------------------------------------------