├── .gitignore
├── CITATION.md
├── CONTRIBUTING.md
├── LICENSE.md
├── Makefile
├── README.md
├── bin
    └── make-seasonal-data.py
├── chapter1.md
├── chapter2.md
├── chapter3.md
├── chapter4.md
├── chapter5.md
├── course.yml
├── datasets
    ├── filesys.zip
    └── solutions.zip
├── design
    ├── concept.dot
    └── concept.svg
├── filesys
    ├── course.txt
    ├── people
    │   └── agarwal.txt
    └── seasonal
        ├── autumn.csv
        ├── spring.csv
        ├── summer.csv
        └── winter.csv
├── img
    └── shield_image.png
├── requirements.sh
├── rules.yml
├── solutions
    ├── count-records-start.sh
    ├── count-records.sh
    ├── current-time.sh
    ├── date-range-start.sh
    ├── date-range.sh
    ├── dates.sh
    ├── get-lines-solution.sh
    ├── get-lines.sh
    ├── lines.sh
    ├── names.txt
    ├── num-records.out
    ├── range-1.sh
    ├── range-2.sh
    ├── range-3.sh
    ├── range-start-1.sh
    ├── teeth-start.sh
    ├── teeth.out
    └── teeth.sh
└── unused
    └── permissions.md


/.gitignore:
--------------------------------------------------------------------------------
1 | *~
2 | .DS_Store
3 | 


--------------------------------------------------------------------------------
/CITATION.md:
--------------------------------------------------------------------------------
1 | # Citation
2 | 
3 | Please cite this lesson as:
4 | 
5 | ```
6 | Greg Wilson: "Introduction to the Unix Shell for Data Science".  DataCamp, 2017, https://www.datacamp.com/courses/intro-to-unix-shell.
7 | ```
8 | 


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
 1 | # Contributing
 2 | 
 3 | [DataCamp](https://www.datacamp.com/) welcomes bug reports, feature requests, and fixes. By contributing to this lesson, you agree that we may redistribute your work under [our license](LICENSE.md). In exchange, we will address your issues and/or assess your proposed changes as promptly as we can.
 4 | 
 5 | The easiest way to get started is to file an issue to tell us about a spelling mistake, some awkward wording, or a factual error.
 6 | 
 7 | 1.  You can submit suggestions directly through our web-based learning interface.  This will get the fastest response.
 8 | 
 9 | 2.  If you have a [GitHub](https://github.com/) account, you can file an issue in this lesson's repository, or file a pull request to propose an improvement.
10 | 
11 | 3.  We also welcome comments by email, but will be able to respond more quickly if you use one of the other methods described above.
12 | 
13 | The repository for this lesson is <https://github.com/datacamp/courses-intro-to-unix-shell>.
14 | 


--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
 1 | ---
 2 | ---
 3 | # License
 4 | 
 5 | Copyright (c) DataCamp 2017.
 6 | 
 7 | This work is made available under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
 8 | This is a human-readable summary of (and not a substitute for) the [full license](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
 9 | 
10 | ## You are free to:
11 | 
12 | * **Share** - copy and redistribute the material in any medium or format.
13 | 
14 | * **Adapt** - remix, transform, and build upon the material.
15 | 
16 | The licensor cannot revoke these freedoms as long as you follow the license terms.
17 | 
18 | ## Under the following terms:
19 | 
20 | * **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
21 | 
22 | * **NonCommercial** - You may not use the material for commercial purposes.
23 | 
24 | **No additional restrictions** - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
25 | 
26 | ## Notices:
27 | 
28 | You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
29 | 
30 | No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
31 | 


--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
 1 | FILESYS=$(wildcard filesys/*.txt) $(wildcard filesys/*/*.txt) $(wildcard filesys/*/*.csv)
 2 | SOLUTIONS=$(wildcard solutions/*.sh) $(wildcard solutions/*.out)
 3 | 
 4 | all : datasets/filesys.zip datasets/solutions.zip
 5 | 
 6 | datasets/filesys.zip : ${FILESYS}
 7 | 	cd filesys && zip -r ../$@ .
 8 | 
 9 | datasets/solutions.zip : ${SOLUTIONS}
10 | 	zip -r $@ solutions
11 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Introduction to Shell
 2 | 
 3 | - Teach: https://www.datacamp.com/teach/repositories/1395
 4 | - Campus: https://www.datacamp.com/courses/introduction-to-shell-for-data-science
 5 | - Docs: https://instructor-support.datacamp.com
 6 | 
 7 | ## Description
 8 | 
 9 | The Unix command line has survived and thrived for almost fifty years because it lets people do complex things with just a few keystrokes. It helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds that may be halfway around the world. This lesson will introduce its key elements and show you how to use them efficiently.
10 | 
11 | ## Learning objectives
12 | 
13 | - Explain the similarities and differences between the Unix shell and graphical user interfaces.
14 | - Use core Unix commands to create, rename, and remove files and directories.
15 | - Explain what files and directories are.
16 | - Match files and directories to relative and absolute paths.
17 | - Use core data manipulation commands to filter and sort textual data by position and value.
18 | - Find and interpret help.
19 | - Predict the paths matched by wildcards and specify wildcards to match sets of paths.
20 | - Combine programs using pipes to process large data sets.
21 | - Set and use variables to record information.
22 | - Use loops to run the same commands on many different files.
23 | 
24 | ## Prerequisites
25 | 
26 | - None
27 | 


--------------------------------------------------------------------------------
/bin/make-seasonal-data.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | 
 3 | '''Make CSV data files in filesys/seasonal/*.csv.'''
 4 | 
 5 | import sys
 6 | import csv
 7 | import random
 8 | from datetime import datetime, timedelta
 9 | 
10 | FILENAME_TEMPLATE = './filesys/seasonal/{}.csv'
11 | FILENAME_STEMS = 'spring summer autumn winter'.split()
12 | HEADER = 'Date Tooth'.split()
13 | TEETH = 'incisor canine bicuspid molar wisdom'.split()
14 | MIN_RECORDS = 20
15 | MAX_RECORDS = 30
16 | DATE_FORMAT = '%Y-%m-%d'
17 | START_DATE = datetime.strptime('2017-01-01', DATE_FORMAT)
18 | ONE_DAY = timedelta(days=1)
19 | DATE_RANGE = 250
20 | 
21 | def main():
22 |     random.seed(20170101)
23 |     for stem in FILENAME_STEMS:
24 |         path = FILENAME_TEMPLATE.format(stem)
25 |         records = [[datetime.strftime(START_DATE + random.randrange(DATE_RANGE) * ONE_DAY, DATE_FORMAT),
26 |                     random.choice(TEETH)]
27 |                    for r in range(random.randint(MIN_RECORDS, MAX_RECORDS))]
28 |         records.sort()
29 |         records.insert(0, HEADER)
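30 |         # Use a context manager so the file is flushed and closed promptly.
31 |         with open(path, 'w') as handle:
32 |             csv.writer(handle, lineterminator='\n').writerows(records)
33 | 
34 | if __name__ == '__main__':
35 |     main()
36 | 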


--------------------------------------------------------------------------------
/chapter1.md:
--------------------------------------------------------------------------------
   1 | ---
   2 | title: Manipulating files and directories
   3 | description: >-
   4 |   This chapter is a brief introduction to the Unix shell. You'll learn why it is
   5 |   still in use after almost 50 years, how it compares to the graphical tools you
   6 |   may be more familiar with, how to move around in the shell, and how to create,
   7 |   modify, and delete files and folders.
   8 | free_preview: true
   9 | lessons:
  10 |   - nb_of_exercises: 12
  11 |     title: How does the shell compare to a desktop interface?
  12 | ---
  13 | 
  14 | ## How does the shell compare to a desktop interface?
  15 | 
  16 | ```yaml
  17 | type: PureMultipleChoiceExercise
  18 | key: badd717ea4
  19 | xp: 50
  20 | ```
  21 | 
   22 | An operating system like Windows, Linux, or macOS is a special kind of program.
  23 | It controls the computer's processor, hard drive, and network connection,
  24 | but its most important job is to run other programs.
  25 | 
  26 | Since human beings aren't digital,
  27 | they need an interface to interact with the operating system.
  28 | The most common one these days is a graphical file explorer,
  29 | which translates clicks and double-clicks into commands to open files and run programs.
  30 | Before computers had graphical displays,
  31 | though,
  32 | people typed instructions into a program called a **command-line shell**.
  33 | Each time a command is entered,
  34 | the shell runs some other programs,
  35 | prints their output in human-readable form,
  36 | and then displays a *prompt* to signal that it's ready to accept the next command.
  37 | (Its name comes from the notion that it's the "outer shell" of the computer.)
  38 | 
  39 | Typing commands instead of clicking and dragging may seem clumsy at first,
  40 | but as you will see,
  41 | once you start spelling out what you want the computer to do,
  42 | you can combine old commands to create new ones
  43 | and automate repetitive operations
  44 | with just a few keystrokes.
  45 | 
  46 | <hr>
  47 | What is the relationship between the graphical file explorer that most people use and the command-line shell?
  48 | 
  49 | `@hint`
  50 | Remember that a user can only interact with an operating system through a program.
  51 | 
  52 | `@possible_answers`
  53 | - The file explorer lets you view and edit files, while the shell lets you run programs.
  54 | - The file explorer is built on top of the shell.
  55 | - The shell is part of the operating system, while the file explorer is separate.
  56 | - [They are both interfaces for issuing commands to the operating system.]
  57 | 
  58 | `@feedback`
  59 | - Both allow you to view and edit files and run programs.
  60 | - Graphical file explorers and the shell both call the same underlying operating system functions.
  61 | - The shell and the file explorer are both programs that translate user commands (typed or clicked) into calls to the operating system.
  62 | - Correct! Both take the user's commands (whether typed or clicked) and send them to the operating system.
  63 | 
  64 | ---
  65 | 
  66 | ## Where am I?
  67 | 
  68 | ```yaml
  69 | type: MultipleChoiceExercise
  70 | key: 7c1481dbd3
  71 | xp: 50
  72 | ```
  73 | 
  74 | The **filesystem** manages files and directories (or folders).
  75 | Each is identified by an **absolute path**
  76 | that shows how to reach it from the filesystem's **root directory**:
  77 | `/home/repl` is the directory `repl` in the directory `home`,
  78 | while `/home/repl/course.txt` is a file `course.txt` in that directory,
  79 | and `/` on its own is the root directory.
  80 | 
  81 | To find out where you are in the filesystem,
  82 | run the command `pwd`
  83 | (short for "**p**rint **w**orking **d**irectory").
  84 | This prints the absolute path of your **current working directory**,
  85 | which is where the shell runs commands and looks for files by default.
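
As a small sketch of what this looks like in practice, the commands below create a scratch directory, move into it, and print the working directory. The scratch directory comes from `mktemp` and the variable names `scratch` and `here` are illustrative assumptions, not part of the course filesystem.

```shell
# Create a throwaway directory so the example does not depend on /home/repl.
scratch=$(mktemp -d)
cd "$scratch"

# pwd prints the absolute path of the current working directory.
here=$(pwd)
echo "$here"
```

Whatever path is printed, it begins with `/`, because `pwd` reports an absolute path.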
  86 | 
  87 | <hr>
  88 | Run `pwd`.
  89 | Where are you right now?
  90 | 
  91 | `@possible_answers`
  92 | - `/home`
  93 | - `/repl`
  94 | - `/home/repl`
  95 | 
  96 | `@hint`
  97 | Unix systems typically place all users' home directories underneath `/home`.
  98 | 
  99 | `@pre_exercise_code`
 100 | ```{python}
 101 | 
 102 | ```
 103 | 
 104 | `@sct`
 105 | ```{python}
 106 | err = "That is not the correct path."
 107 | correct = "Correct - you are in `/home/repl`."
 108 | 
 109 | Ex().has_chosen(3, [err, err, correct])
 110 | ```
 111 | 
 112 | ---
 113 | 
 114 | ## How can I identify files and directories?
 115 | 
 116 | ```yaml
 117 | type: MultipleChoiceExercise
 118 | key: f5b0499835
 119 | xp: 50
 120 | ```
 121 | 
 122 | `pwd` tells you where you are.
 123 | To find out what's there,
 124 | type `ls` (which is short for "**l**i**s**ting") and press the enter key.
 125 | On its own,
 126 | `ls` lists the contents of your current directory
 127 | (the one displayed by `pwd`).
 128 | If you add the names of some files,
 129 | `ls` will list them,
 130 | and if you add the names of directories,
 131 | it will list their contents.
 132 | For example,
 133 | `ls /home/repl` shows you what's in your starting directory
 134 | (usually called your **home directory**).
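
The three ways of calling `ls` described above can be tried out in a sandbox. This is only a sketch: the scratch directory and the variable names `demo` and `listing` are assumptions made for illustration, and the file names merely mimic the course layout.

```shell
# Build a tiny stand-in for the course filesystem in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/seasonal"
touch "$demo/course.txt" "$demo/seasonal/winter.csv"
cd "$demo"

ls               # no arguments: contents of the current directory
ls course.txt    # a file name: lists just that file
ls seasonal      # a directory name: lists what is inside it

# Capture the last listing so it can be inspected.
listing=$(ls seasonal)
```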
 135 | 
 136 | <hr>
 137 | Use `ls` with an appropriate argument to list the files in the directory `/home/repl/seasonal`
 138 | (which holds information on dental surgeries by date, broken down by season).
 139 | Which of these files is *not* in that directory?
 140 | 
 141 | `@possible_answers`
 142 | - `autumn.csv`
 143 | - `fall.csv`
 144 | - `spring.csv`
 145 | - `winter.csv`
 146 | 
 147 | `@hint`
 148 | If you give `ls` a path, it shows what's in that path.
 149 | 
 150 | `@pre_exercise_code`
 151 | ```{python}
 152 | 
 153 | ```
 154 | 
 155 | `@sct`
 156 | ```{python}
 157 | err = "That file is in the `seasonal` directory."
 158 | correct = "Correct - that file is *not* in the `seasonal` directory."
 159 | 
 160 | Ex().has_chosen(2, [err, correct, err, err])
 161 | ```
 162 | 
 163 | ---
 164 | 
 165 | ## How else can I identify files and directories?
 166 | 
 167 | ```yaml
 168 | type: BulletConsoleExercise
 169 | key: a766184b59
 170 | xp: 100
 171 | ```
 172 | 
 173 | An absolute path is like a latitude and longitude: it has the same value no matter where you are. A **relative path**, on the other hand, specifies a location starting from where you are: it's like saying "20 kilometers north".
 174 | 
 175 | As examples:
 176 | - If you are in the directory `/home/repl`, the **relative** path `seasonal` specifies the same directory as the **absolute** path `/home/repl/seasonal`. 
 177 | - If you are in the directory `/home/repl/seasonal`, the **relative** path `winter.csv` specifies the same file as the **absolute** path `/home/repl/seasonal/winter.csv`.
 178 | 
 179 | The shell decides if a path is absolute or relative by looking at its first character: If it begins with `/`, it is absolute. If it *does not* begin with `/`, it is relative.
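
The first-character rule is simple enough to write down as a few lines of shell. The function name `describe_path` is hypothetical and exists only for this sketch; it is not a standard command.

```shell
# Classify a path the same way the rule above does:
# a leading "/" means absolute, anything else means relative.
describe_path() {
    case "$1" in
        /*) echo "absolute" ;;
        *)  echo "relative" ;;
    esac
}

describe_path /home/repl/seasonal    # prints: absolute
describe_path seasonal/winter.csv    # prints: relative
```

Note that the function only inspects the text of the path; it does not check whether the file or directory actually exists.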
 180 | 
 181 | `@pre_exercise_code`
 182 | ```{python}
 183 | 
 184 | ```
 185 | 
 186 | ***
 187 | 
 188 | ```yaml
 189 | type: ConsoleExercise
 190 | key: 9db1ed7afd
 191 | xp: 35
 192 | ```
 193 | 
 194 | `@instructions`
 195 | You are in `/home/repl`. Use `ls` with a **relative path** to list the file that has an absolute path of `/home/repl/course.txt` (and only that file).
 196 | 
 197 | `@hint`
 198 | You can often construct the relative path to a file or directory below your current location
 199 | by subtracting the absolute path of your current location
 200 | from the absolute path of the thing you want.
 201 | 
 202 | `@solution`
 203 | ```{shell}
 204 | ls course.txt
 205 | 
 206 | ```
 207 | 
 208 | `@sct`
 209 | ```{python}
 210 | Ex().multi(
 211 |     has_cwd("/home/repl"),
 212 |     has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), # to prevent `echo "course.txt"`
 213 |     check_correct(
 214 |       has_expr_output(strict=True),
 215 |       has_code("ls +course.txt", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/course.txt`.")
 216 |     )
 217 | )
 218 | 
 219 | ```
 220 | 
 221 | ***
 222 | 
 223 | ```yaml
 224 | type: ConsoleExercise
 225 | key: 4165425bf6
 226 | xp: 35
 227 | ```
 228 | 
 229 | `@instructions`
 230 | You are in `/home/repl`.
 231 | Use `ls` with a **relative** path
 232 | to list the file `/home/repl/seasonal/summer.csv` (and only that file).
 233 | 
 234 | `@hint`
 235 | Relative paths do *not* start with a leading '/'.
 236 | 
 237 | `@solution`
 238 | ```{shell}
 239 | ls seasonal/summer.csv
 240 | 
 241 | ```
 242 | 
 243 | `@sct`
 244 | ```{python}
 245 | Ex().multi(
 246 |     has_cwd("/home/repl"),
 247 |     has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), 
 248 |     check_correct(
 249 |       has_expr_output(strict=True),
 250 |       has_code("ls +seasonal/summer.csv", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/seasonal/summer.csv`.")
 251 |     )
 252 | )
 253 | ```
 254 | 
 255 | ***
 256 | 
 257 | ```yaml
 258 | type: ConsoleExercise
 259 | key: b5e66d3741
 260 | xp: 30
 261 | ```
 262 | 
 263 | `@instructions`
 264 | You are in `/home/repl`.
 265 | Use `ls` with a **relative** path
 266 | to list the contents of the directory `/home/repl/people`.
 267 | 
 268 | `@hint`
 269 | Relative paths do not start with a leading '/'.
 270 | 
 271 | `@solution`
 272 | ```{shell}
 273 | ls people
 274 | 
 275 | ```
 276 | 
 277 | `@sct`
 278 | ```{python}
 279 | Ex().multi(
 280 |     has_cwd("/home/repl"),
 281 |     has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), 
 282 |     check_correct(
 283 |       has_expr_output(strict=True),
 284 |       has_code("ls +people", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/people`.")
 285 |     )
 286 | )
 287 | Ex().success_msg("Well done. Now that you know about listing files and directories, let's see how you can move around the filesystem!")
 288 | 
 289 | ```
 290 | 
 291 | ---
 292 | 
 293 | ## How can I move to another directory?
 294 | 
 295 | ```yaml
 296 | type: BulletConsoleExercise
 297 | key: dbdaec5610
 298 | xp: 100
 299 | ```
 300 | 
 301 | Just as you can move around in a file browser by double-clicking on folders,
 302 | you can move around in the filesystem using the command `cd`
 303 | (which stands for "change directory").
 304 | 
 305 | If you type `cd seasonal` and then type `pwd`,
 306 | the shell will tell you that you are now in `/home/repl/seasonal`.
 307 | If you then run `ls` on its own,
 308 | it shows you the contents of `/home/repl/seasonal`,
 309 | because that's where you are.
 310 | If you want to get back to your home directory `/home/repl`,
 311 | you can use the command `cd /home/repl`.
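
The round trip described above might look like the following sketch. It uses a scratch directory as a stand-in for `/home/repl`; the variable names `home_dir`, `inside`, and `back` are assumptions made for illustration.

```shell
# Scratch stand-in for /home/repl with a seasonal subdirectory.
home_dir=$(mktemp -d)
mkdir "$home_dir/seasonal"
touch "$home_dir/seasonal/autumn.csv"

cd "$home_dir"
cd seasonal         # relative path: move down into seasonal
pwd                 # the printed path now ends in /seasonal
inside=$(ls)        # ls on its own lists the directory you are in
echo "$inside"

cd "$home_dir"      # absolute path: works from anywhere
back=$(pwd)
```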
 312 | 
 313 | `@pre_exercise_code`
 314 | ```{python}
 315 | 
 316 | ```
 317 | 
 318 | ***
 319 | 
 320 | ```yaml
 321 | type: ConsoleExercise
 322 | key: 3d0bfdd77d
 323 | xp: 35
 324 | ```
 325 | 
 326 | `@instructions`
  327 | You are in `/home/repl`.
 328 | Change directory to `/home/repl/seasonal` using a relative path.
 329 | 
 330 | `@hint`
 331 | Remember that `cd` stands for "change directory" and that relative paths do not start with a leading '/'.
 332 | 
 333 | `@solution`
 334 | ```{shell}
 335 | cd seasonal
 336 | 
 337 | ```
 338 | 
 339 | `@sct`
 340 | ```{python}
 341 | Ex().check_correct(
 342 |   has_cwd('/home/repl/seasonal'),
 343 |   has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.")
 344 | )
 345 | 
 346 | ```
 347 | 
 348 | ***
 349 | 
 350 | ```yaml
 351 | type: ConsoleExercise
 352 | key: e69c8eac15
 353 | xp: 35
 354 | ```
 355 | 
 356 | `@instructions`
 357 | Use `pwd` to check that you're there.
 358 | 
 359 | `@hint`
 360 | Remember to press "enter" or "return" after entering the command.
 361 | 
 362 | `@solution`
 363 | ```{shell}
 364 | pwd
 365 | 
 366 | ```
 367 | 
 368 | `@sct`
 369 | ```{python}
 370 | Ex().multi(
 371 |     has_cwd('/home/repl/seasonal'),
 372 |     check_correct(
 373 |       has_expr_output(),
 374 |       has_code('pwd')
 375 |     )
 376 | )
 377 | 
 378 | ```
 379 | 
 380 | ***
 381 | 
 382 | ```yaml
 383 | type: ConsoleExercise
 384 | key: f6b265bd7f
 385 | xp: 30
 386 | ```
 387 | 
 388 | `@instructions`
 389 | Use `ls` without any paths to see what's in that directory.
 390 | 
 391 | `@hint`
 392 | Remember to press "enter" or "return" after the command.
 393 | 
 394 | `@solution`
 395 | ```{shell}
 396 | ls
 397 | 
 398 | ```
 399 | 
 400 | `@sct`
 401 | ```{python}
 402 | Ex().multi(
 403 |     has_cwd('/home/repl/seasonal'),
 404 |     check_correct(
 405 |       has_expr_output(),
 406 |       has_code('ls', incorrect_msg="Your command did not generate the correct output. Have you used `ls` with no paths to show the contents of the current directory?")
 407 |     )
 408 | )
 409 | 
 410 | Ex().success_msg("Neat! This was about navigating down to subdirectories. What about moving up? Let's find out!")
 411 | 
 412 | ```
 413 | 
 414 | ---
 415 | 
 416 | ## How can I move up a directory?
 417 | 
 418 | ```yaml
 419 | type: PureMultipleChoiceExercise
 420 | key: 09c717ef76
 421 | xp: 50
 422 | ```
 423 | 
 424 | The **parent** of a directory is the directory above it.
 425 | For example, `/home` is the parent of `/home/repl`,
 426 | and `/home/repl` is the parent of `/home/repl/seasonal`.
 427 | You can always give the absolute path of your parent directory to commands like `cd` and `ls`.
 428 | More often,
 429 | though,
 430 | you will take advantage of the fact that the special path `..`
 431 | (two dots with no spaces) means "the directory above the one I'm currently in".
 432 | If you are in `/home/repl/seasonal`,
 433 | then `cd ..` moves you up to `/home/repl`.
 434 | If you use `cd ..` once again,
 435 | it puts you in `/home`.
 436 | One more `cd ..` puts you in the *root directory* `/`,
 437 | which is the very top of the filesystem.
 438 | (Remember to put a space between `cd` and `..` - it is a command and a path, not a single four-letter command.)
 439 | 
 440 | A single dot on its own, `.`, always means "the current directory",
 441 | so `ls` on its own and `ls .` do the same thing,
 442 | while `cd .` has no effect
 443 | (because it moves you into the directory you're currently in).
 444 | 
 445 | One final special path is `~` (the tilde character),
 446 | which means "your home directory",
 447 | such as `/home/repl`.
 448 | No matter where you are,
 449 | `ls ~` will always list the contents of your home directory,
 450 | and `cd ~` will always take you home.
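
To see the three special paths one at a time, you could run something like the sketch below. It builds a scratch copy of the `repl/seasonal` layout and, as an assumption for this example only, temporarily sets `HOME` inside a subshell so that `~` points at the scratch `repl` directory. The names `top` and `trace` are illustrative.

```shell
# Scratch stand-in for the directory that holds the home directory.
top=$(mktemp -d)
mkdir -p "$top/repl/seasonal"

# Run the walk in a subshell so the HOME override and the cd calls
# do not leak into the calling shell.
trace=$(
    # Assumption for this sketch: pretend the scratch repl directory is home.
    HOME="$top/repl"
    cd "$top/repl/seasonal"
    # Two dots: up one level, so this prints repl.
    cd .. && basename "$(pwd)"
    # One dot: no movement, so this prints repl again.
    cd . && basename "$(pwd)"
    # Go back down, then use the tilde to jump straight home.
    cd seasonal
    cd ~ && basename "$(pwd)"
)
echo "$trace"
```

Each of the three steps ends in the `repl` directory, which is why the same name is printed three times.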
 451 | 
 452 | <hr>
 453 | If you are in `/home/repl/seasonal`,
 454 | where does `cd ~/../.` take you?
 455 | 
 456 | `@hint`
 457 | Trace the path one directory at a time.
 458 | 
 459 | `@possible_answers`
 460 | - `/home/repl`
 461 | - [`/home`]
 462 | - `/home/repl/seasonal`
 463 | - `/` (the root directory)
 464 | 
 465 | `@feedback`
 466 | - No, but either `~` or `..` on its own would take you there.
 467 | - Correct! The path means 'home directory', 'up a level', 'here'.
 468 | - No, but `.` on its own would do that.
 469 | - No, the final part of the path is `.` (meaning "here") rather than `..` (meaning "up").
 470 | 
 471 | ---
 472 | 
 473 | ## How can I copy files?
 474 | 
 475 | ```yaml
 476 | type: BulletConsoleExercise
 477 | key: 832de9e74c
 478 | xp: 100
 479 | ```
 480 | 
 481 | You will often want to copy files,
 482 | move them into other directories to organize them,
 483 | or rename them.
 484 | One command to do this is `cp`, which is short for "copy".
 485 | If `original.txt` is an existing file,
 486 | then:
 487 | 
 488 | ```{shell}
 489 | cp original.txt duplicate.txt
 490 | ```
 491 | 
 492 | creates a copy of `original.txt` called `duplicate.txt`.
 493 | If there already was a file called `duplicate.txt`,
 494 | it is overwritten.
 495 | If the last parameter to `cp` is an existing directory,
 496 | then a command like:
 497 | 
 498 | ```{shell}
 499 | cp seasonal/autumn.csv seasonal/winter.csv backup
 500 | ```
 501 | 
 502 | copies *all* of the files into that directory.
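
Both forms of `cp` can be tried safely in a sandbox. In this sketch the scratch directory, the variable name `demo`, and the file contents are assumptions made for illustration; only the file names echo the course data.

```shell
# Scratch directory with a seasonal folder and an empty backup folder.
demo=$(mktemp -d)
mkdir "$demo/seasonal" "$demo/backup"
echo "autumn data" > "$demo/seasonal/autumn.csv"
echo "winter data" > "$demo/seasonal/winter.csv"
cd "$demo"

# Two file paths: copy the first to the second,
# replacing the second if it already exists.
cp seasonal/autumn.csv duplicate.csv

# Last argument is an existing directory:
# every file listed before it is copied into that directory.
cp seasonal/autumn.csv seasonal/winter.csv backup

ls backup      # the two copies
ls seasonal    # the originals are untouched
```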
 503 | 
 504 | `@pre_exercise_code`
 505 | ```{python}
 506 | 
 507 | ```
 508 | 
 509 | ***
 510 | 
 511 | ```yaml
 512 | type: ConsoleExercise
 513 | key: 6ab3fb1e25
 514 | xp: 50
 515 | ```
 516 | 
 517 | `@instructions`
 518 | Make a copy of `seasonal/summer.csv` in the `backup` directory (which is also in `/home/repl`),
 519 | calling the new file `summer.bck`.
 520 | 
 521 | `@hint`
 522 | Combine the name of the destination directory and the name of the copied file
 523 | to create a relative path for the new file.
 524 | 
 525 | `@solution`
 526 | ```{shell}
 527 | cp seasonal/summer.csv backup/summer.bck
 528 | 
 529 | ```
 530 | 
 531 | `@sct`
 532 | ```{python}
 533 | Ex().check_correct(
 534 |     check_file('/home/repl/backup/summer.bck', missing_msg="`summer.bck` doesn't appear to exist in the `backup` directory. Provide two paths to `cp`: the existing file (`seasonal/summer.csv`) and the destination file (`backup/summer.bck`)."),
 535 |     has_cwd('/home/repl')
 536 | )
 537 | 
 538 | ```
 539 | 
 540 | ***
 541 | 
 542 | ```yaml
 543 | type: ConsoleExercise
 544 | key: d9e1214bb0
 545 | xp: 50
 546 | ```
 547 | 
 548 | `@instructions`
 549 | Copy `spring.csv` and `summer.csv` from the `seasonal` directory into the `backup` directory
 550 | *without* changing your current working directory (`/home/repl`).
 551 | 
 552 | `@hint`
 553 | Use `cp` with the names of the files you want to copy
 554 | and *then* the name of the directory to copy them to.
 555 | 
 556 | `@solution`
 557 | ```{shell}
 558 | cp seasonal/spring.csv seasonal/summer.csv backup
 559 | 
 560 | ```
 561 | 
 562 | `@sct`
 563 | ```{python}
 564 | patt = "`%s` doesn't appear to have been copied into the `backup` directory. Provide two filenames and a directory name to `cp`."
 565 | Ex().multi(
 566 |     has_cwd('/home/repl', incorrect_msg="Make sure to copy the files while in `{{dir}}`! Use `cd {{dir}}` to navigate back there."),
 567 |     check_file('/home/repl/backup/spring.csv', missing_msg=patt%'spring.csv'),
 568 |     check_file('/home/repl/backup/summer.csv', missing_msg=patt%'summer.csv')
 569 | )
 570 | Ex().success_msg("Good job. Other than copying, we should also be able to move files from one directory to another. Learn about it in the next exercise!")
 571 | ```
 572 | 
 573 | ---
 574 | 
 575 | ## How can I move a file?
 576 | 
 577 | ```yaml
 578 | type: ConsoleExercise
 579 | key: 663a083a3c
 580 | xp: 100
 581 | ```
 582 | 
 583 | While `cp` copies a file,
 584 | `mv` moves it from one directory to another,
 585 | just as if you had dragged it in a graphical file browser.
 586 | It handles its parameters the same way as `cp`,
 587 | so the command:
 588 | 
 589 | ```{shell}
 590 | mv autumn.csv winter.csv ..
 591 | ```
 592 | 
 593 | moves the files `autumn.csv` and `winter.csv` from the current working directory
 594 | up one level to its parent directory
 595 | (because `..` always refers to the directory above your current location).
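
Here is a sketch of that command in a sandbox, so that nothing in your real directories is moved. The scratch directory and the variable name `demo` are assumptions made for illustration.

```shell
# Scratch directory containing a seasonal folder with two files.
demo=$(mktemp -d)
mkdir "$demo/seasonal"
touch "$demo/seasonal/autumn.csv" "$demo/seasonal/winter.csv"
cd "$demo/seasonal"

# Move both files up one level, into the parent directory.
mv autumn.csv winter.csv ..

ls       # prints nothing: the files are no longer here
ls ..    # the parent now holds both files plus the seasonal directory
```

Unlike `cp`, nothing is left behind in the original location.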
 596 | 
 597 | `@instructions`
 598 | You are in `/home/repl`, which has sub-directories `seasonal` and `backup`.
 599 | Using a single command, move `spring.csv` and `summer.csv` from `seasonal` to `backup`.
 600 | 
 601 | `@hint`
  602 | Give `mv` the paths of the files you want to move first, and the destination directory last.
 603 | 
 604 | `@pre_exercise_code`
 605 | ```{python}
 606 | 
 607 | ```
 608 | 
 609 | `@solution`
 610 | ```{shell}
 611 | mv seasonal/spring.csv seasonal/summer.csv backup
 612 | ```
 613 | 
 614 | `@sct`
 615 | ```{python}
 616 | backup_patt="The file `%s` is not in the `backup` directory. Have you used `mv` correctly? Use two filenames and a directory as parameters to `mv`."
 617 | seasonal_patt="The file `%s` is still in the `seasonal` directory. Make sure to move the files with `mv` rather than copying them with `cp`!"
 618 | Ex().multi(
 619 |     check_file('/home/repl/backup/spring.csv', missing_msg=backup_patt%'spring.csv'),
 620 |     check_file('/home/repl/backup/summer.csv', missing_msg=backup_patt%'summer.csv'),
 621 |     check_not(check_file('/home/repl/seasonal/spring.csv'), incorrect_msg=seasonal_patt%'spring.csv'),
 622 |     check_not(check_file('/home/repl/seasonal/summer.csv'), incorrect_msg=seasonal_patt%'summer.csv')
 623 | )
 624 | Ex().success_msg("Well done, let's keep this shell train going!")
 625 | ```
 626 | 
 627 | ---
 628 | 
 629 | ## How can I rename files?
 630 | 
 631 | ```yaml
 632 | type: BulletConsoleExercise
 633 | key: 001801a652
 634 | xp: 100
 635 | ```
 636 | 
 637 | `mv` can also be used to rename files. If you run:
 638 | 
 639 | ```{shell}
 640 | mv course.txt old-course.txt
 641 | ```
 642 | 
 643 | then the file `course.txt` in the current working directory is "moved" to the file `old-course.txt`.
 644 | This is different from the way file browsers work,
 645 | but is often handy.
 646 | 
 647 | One warning:
 648 | just like `cp`,
 649 | `mv` will overwrite existing files.
 650 | If,
 651 | for example,
 652 | you already have a file called `old-course.txt`,
 653 | then the command shown above will replace it with whatever is in `course.txt`.
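
The overwrite warning is easy to verify in a sandbox. The sketch below uses a scratch directory and made-up file contents; the variable names `demo` and `after` are assumptions for illustration.

```shell
# Scratch directory with a current file and an older file of the same target name.
demo=$(mktemp -d)
cd "$demo"
echo "new notes" > course.txt
echo "old notes" > old-course.txt

# Rename course.txt; the existing old-course.txt is replaced without warning.
mv course.txt old-course.txt

after=$(cat old-course.txt)
echo "$after"    # prints: new notes
```

If silent replacement worries you, `mv -i` asks for confirmation before overwriting an existing file.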
 654 | 
 655 | `@pre_exercise_code`
 656 | ```{python}
 657 | 
 658 | ```
 659 | 
 660 | ***
 661 | 
 662 | ```yaml
 663 | type: ConsoleExercise
 664 | key: 710187c8c7
 665 | xp: 35
 666 | ```
 667 | 
 668 | `@instructions`
 669 | Go into the `seasonal` directory.
 670 | 
 671 | `@hint`
 672 | Remember that `cd` stands for "change directory" and that relative paths do not start with a leading '/'.
 673 | 
 674 | `@solution`
 675 | ```{shell}
 676 | cd seasonal
 677 | 
 678 | ```
 679 | 
 680 | `@sct`
 681 | ```{python}
 682 | Ex().check_correct(
 683 |   has_cwd('/home/repl/seasonal'),
 684 |   has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.")
 685 | )
 686 | 
 687 | ```
 688 | 
 689 | ***
 690 | 
 691 | ```yaml
 692 | type: ConsoleExercise
 693 | key: ed5fe1df23
 694 | xp: 35
 695 | ```
 696 | 
 697 | `@instructions`
 698 | Rename the file `winter.csv` to be `winter.csv.bck`.
 699 | 
 700 | `@hint`
 701 | Use `mv` with the current name of the file and the name you want it to have in that order.
 702 | 
 703 | `@solution`
 704 | ```{shell}
 705 | mv winter.csv winter.csv.bck
 706 | 
 707 | ```
 708 | 
 709 | `@sct`
 710 | ```{python}
 711 | hint = " Use `mv` with two arguments: the file you want to rename (`winter.csv`) and the new name for the file (`winter.csv.bck`)."
 712 | Ex().multi(
 713 |     has_cwd('/home/repl/seasonal'),
 714 |     multi(
 715 |         check_file('/home/repl/seasonal/winter.csv.bck', missing_msg="We expected to find `winter.csv.bck` in the directory." + hint),
 716 |         check_not(check_file('/home/repl/seasonal/winter.csv'), incorrect_msg="We were no longer expecting `winter.csv` to be in the directory." + hint)
 717 |     )
 718 | )
 719 | 
 720 | ```
 721 | 
 722 | ***
 723 | 
 724 | ```yaml
 725 | type: ConsoleExercise
 726 | key: 1deee4c768
 727 | xp: 30
 728 | ```
 729 | 
 730 | `@instructions`
 731 | Run `ls` to check that everything has worked.
 732 | 
 733 | `@hint`
 734 | Remember to press "enter" or "return" to run the command.
 735 | 
 736 | `@solution`
 737 | ```{shell}
 738 | ls
 739 | 
 740 | ```
 741 | 
 742 | `@sct`
 743 | ```{python}
 744 | Ex().multi(
 745 |     has_cwd('/home/repl/seasonal'),
 746 |     has_expr_output(incorrect_msg="Have you used `ls` to list the contents of your current working directory?")
 747 | )
 748 | Ex().multi(
 749 |     has_cwd("/home/repl/seasonal"),
 750 |     check_correct(
 751 |       has_expr_output(strict=True),
 752 |       has_code("ls", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` without arguments to list the contents of your current working directory.")
 753 |     )
 754 | )
 755 | Ex().success_msg("Copying, moving, renaming: you've got it all figured out! Next up: deleting files.")
 756 | 
 757 | ```
 758 | 
 759 | ---
 760 | 
 761 | ## How can I delete files?
 762 | 
 763 | ```yaml
 764 | type: BulletConsoleExercise
 765 | key: '2734680614'
 766 | xp: 100
 767 | ```
 768 | 
 769 | We can copy files and move them around;
 770 | to delete them,
 771 | we use `rm`,
 772 | which stands for "remove".
 773 | As with `cp` and `mv`,
 774 | you can give `rm` the names of as many files as you'd like, so:
 775 | 
 776 | ```{shell}
 777 | rm thesis.txt backup/thesis-2017-08.txt
 778 | ```
 779 | 
 780 | removes both `thesis.txt` and `backup/thesis-2017-08.txt`.
 781 | 
 782 | `rm` does exactly what its name says,
 783 | and it does it right away:
 784 | unlike graphical file browsers,
 785 | the shell doesn't have a trash can,
 786 | so when you type the command above,
 787 | your thesis is gone for good.
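
A minimal sketch of this behavior, run in a throwaway directory so that nothing real is lost. The file names are made up for illustration, and the sketch assumes `mktemp -d` and `touch` are available (they are on most Linux and macOS systems, but neither has been introduced in this course).

```shell
# Work in a throwaway directory so that nothing important is deleted.
scratch=$(mktemp -d)
cd "$scratch"

# Create two empty practice files (hypothetical names).
mkdir backup
touch thesis.txt backup/thesis-2017-08.txt

# Remove both files with a single command.
rm thesis.txt backup/thesis-2017-08.txt

# List what is left: the files are gone, and only 'backup' remains.
ls
```

There is no confirmation prompt and no way to undo the `rm`, which is why practicing in a scratch directory is a sensible habit.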
 788 | 
 789 | `@pre_exercise_code`
 790 | ```{python}
 791 | 
 792 | ```
 793 | 
 794 | ***
 795 | 
 796 | ```yaml
 797 | type: ConsoleExercise
 798 | key: d7580f7bd4
 799 | xp: 25
 800 | ```
 801 | 
 802 | `@instructions`
 803 | You are in `/home/repl`.
 804 | Go into the `seasonal` directory.
 805 | 
 806 | `@hint`
 807 | Remember that `cd` stands for "change directory" and that a relative path does not start with a leading '/'.
 808 | 
 809 | `@solution`
 810 | ```{shell}
 811 | cd seasonal
 812 | 
 813 | ```
 814 | 
 815 | `@sct`
 816 | ```{python}
 817 | Ex().has_cwd('/home/repl/seasonal')
 818 | 
 819 | ```
 820 | 
 821 | ***
 822 | 
 823 | ```yaml
 824 | type: ConsoleExercise
 825 | key: 1c21cc7039
 826 | xp: 25
 827 | ```
 828 | 
 829 | `@instructions`
 830 | Remove `autumn.csv`.
 831 | 
 832 | `@hint`
 833 | Remember that `rm` stands for "remove".
 834 | 
 835 | `@solution`
 836 | ```{shell}
 837 | rm autumn.csv
 838 | 
 839 | ```
 840 | 
 841 | `@sct`
 842 | ```{python}
 843 | Ex().multi(
 844 |     has_cwd('/home/repl/seasonal'),
 845 |     check_not(check_file('/home/repl/seasonal/autumn.csv'), incorrect_msg="We weren't expecting `autumn.csv` to still be in the `seasonal` directory. Use `rm` with the path to the file you want to remove."),
 846 |     has_code('rm', incorrect_msg = 'Use `rm` to remove the file, rather than moving it.')
 847 | )
 848 | 
 849 | ```
 850 | 
 851 | ***
 852 | 
 853 | ```yaml
 854 | type: ConsoleExercise
 855 | key: 09f2d105cd
 856 | xp: 25
 857 | ```
 858 | 
 859 | `@instructions`
 860 | Go back to your home directory.
 861 | 
 862 | `@hint`
 863 | If you use `cd` without any paths, it takes you home.
 864 | 
 865 | `@solution`
 866 | ```{shell}
 867 | cd
 868 | 
 869 | ```
 870 | 
 871 | `@sct`
 872 | ```{python}
 873 | Ex().has_cwd('/home/repl', incorrect_msg="Use `cd ..` or `cd ~` to return to the home directory.")
 874 | 
 875 | ```
 876 | 
 877 | ***
 878 | 
 879 | ```yaml
 880 | type: ConsoleExercise
 881 | key: 9eaf49744c
 882 | xp: 25
 883 | ```
 884 | 
 885 | `@instructions`
 886 | Remove `seasonal/summer.csv` without changing directories again.
 887 | 
 888 | `@hint`
 889 | Remember that `rm` stands for "remove".
 890 | 
 891 | `@solution`
 892 | ```{shell}
 893 | rm seasonal/summer.csv
 894 | 
 895 | ```
 896 | 
 897 | `@sct`
 898 | ```{python}
 899 | Ex().multi(
 900 |     has_cwd('/home/repl'),
 901 |     check_not(check_file('/home/repl/seasonal/summer.csv'), incorrect_msg="We weren't expecting `summer.csv` to still be in the `seasonal` directory. Use `rm` with the path to the file you want to remove."),
 902 |     has_code('rm', incorrect_msg = 'Use `rm` to remove the file, rather than moving it.')
 903 | )
 904 | Ex().success_msg("Impressive stuff! Off to the next one!")
 905 | 
 906 | ```
 907 | 
 908 | ---
 909 | 
 910 | ## How can I create and delete directories?
 911 | 
 912 | ```yaml
 913 | type: BulletConsoleExercise
 914 | key: 63e8fbd0c2
 915 | xp: 100
 916 | ```
 917 | 
 918 | `mv` treats directories the same way it treats files:
 919 | if you are in your home directory and run `mv seasonal by-season`,
 920 | for example,
 921 | `mv` changes the name of the `seasonal` directory to `by-season`.
 922 | However,
 923 | `rm` works differently.
 924 | 
 925 | If you try to `rm` a directory,
 926 | the shell prints an error message telling you it can't do that,
 927 | primarily to stop you from accidentally deleting an entire directory full of work.
 928 | Instead,
 929 | you can use a separate command called `rmdir`.
 930 | For added safety,
 931 | it only works when the directory is empty,
 932 | so you must delete the files in a directory *before* you delete the directory.
 933 | (Experienced users can use the `-r` option to `rm` to get the same effect;
 934 | we will discuss command options in the next chapter.)
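
The safety check can be seen in a short sketch like the one below. It works in a throwaway directory created with `mktemp -d` (an assumption, not part of this course), and the `if`/`else` construct and the `first_try` variable are only there to record what happened.

```shell
# Build a small directory containing one file, in a throwaway location.
scratch=$(mktemp -d)
mkdir "$scratch/people"
touch "$scratch/people/agarwal.txt"

# rmdir refuses to delete a directory that still contains files.
if rmdir "$scratch/people" 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi
echo "First attempt: $first_try"

# Delete the contents first; after that, rmdir succeeds.
rm "$scratch/people/agarwal.txt"
rmdir "$scratch/people"
```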
 935 | 
 936 | `@pre_exercise_code`
 937 | ```{python}
 938 | 
 939 | ```
 940 | 
 941 | ***
 942 | 
 943 | ```yaml
 944 | type: ConsoleExercise
 945 | key: 5a81bb8589
 946 | xp: 25
 947 | ```
 948 | 
 949 | `@instructions`
 950 | Without changing directories,
 951 | delete the file `agarwal.txt` in the `people` directory.
 952 | 
 953 | `@hint`
 954 | Remember that `rm` stands for "remove" and that a relative path does not start with a leading '/'.
 955 | 
 956 | `@solution`
 957 | ```{shell}
 958 | rm people/agarwal.txt
 959 | 
 960 | ```
 961 | 
 962 | `@sct`
 963 | ```{python}
 964 | Ex().multi(
 965 |     has_cwd('/home/repl'),
 966 |     check_not(check_file('/home/repl/people/agarwal.txt'), incorrect_msg="`agarwal.txt` should no longer be in `/home/repl/people`. Have you used `rm` correctly?"),
 967 |     has_expr_output(expr = 'ls people', output = '', incorrect_msg = 'There are still files in the `people` directory. If you simply moved `agarwal.txt`, or created new files, delete them all.')
 968 | )
 969 | 
 970 | ```
 971 | 
 972 | ***
 973 | 
 974 | ```yaml
 975 | type: ConsoleExercise
 976 | key: 661633e531
 977 | xp: 25
 978 | ```
 979 | 
 980 | `@instructions`
 981 | Now that the `people` directory is empty,
 982 | use a single command to delete it.
 983 | 
 984 | `@hint`
 985 | Remember that `rm` only works on files.
 986 | 
 987 | `@solution`
 988 | ```{shell}
 989 | rmdir people
 990 | 
 991 | ```
 992 | 
 993 | `@sct`
 994 | ```{python}
 995 | Ex().multi(
 996 |     has_cwd('/home/repl'),
 997 |     check_not(has_dir('/home/repl/people'),
 998 |               incorrect_msg = "The 'people' directory should no longer be in your home directory. Use `rmdir` to remove it!")
 999 | )
1000 | 
1001 | ```
1002 | 
1003 | ***
1004 | 
1005 | ```yaml
1006 | type: ConsoleExercise
1007 | key: 89f7ffc1da
1008 | xp: 25
1009 | ```
1010 | 
1011 | `@instructions`
1012 | Since a directory is not a file,
1013 | you must use the command `mkdir directory_name`
1014 | to create a new (empty) directory.
1015 | Use this command to create a new directory called `yearly` below your home directory.
1016 | 
1017 | `@hint`
1018 | Run `mkdir` with the name of the directory you want to create.
1019 | 
1020 | `@solution`
1021 | ```{shell}
1022 | mkdir yearly
1023 | 
1024 | ```
1025 | 
1026 | `@sct`
1027 | ```{python}
1028 | Ex().multi(
1029 |     has_cwd('/home/repl'),
1030 |     has_dir('/home/repl/yearly', msg="There is no `yearly` directory in your home directory. Use `mkdir yearly` to make one!")
1031 | )
1032 | 
1033 | ```
1034 | 
1035 | ***
1036 | 
1037 | ```yaml
1038 | type: ConsoleExercise
1039 | key: 013a5ff2dc
1040 | xp: 25
1041 | ```
1042 | 
1043 | `@instructions`
1044 | Now that `yearly` exists,
1045 | create another directory called `2017` inside it
1046 | *without* leaving your home directory.
1047 | 
1048 | `@hint`
1049 | Use a relative path for the sub-directory you want to create.
1050 | 
1051 | `@solution`
1052 | ```{shell}
1053 | mkdir yearly/2017
1054 | 
1055 | ```
1056 | 
1057 | `@sct`
1058 | ```{python}
1059 | Ex().multi(
1060 |     has_cwd('/home/repl'),
1061 |     has_dir('/home/repl/yearly/2017',
1062 |             msg="Cannot find a '2017' directory in '/home/repl/yearly'. You can make this directory using the relative path `yearly/2017`.")
1063 | )
1064 | Ex().success_msg("Cool! Let's wrap up this chapter with an exercise that repeats some of its concepts!")
1065 | 
1066 | ```
1067 | 
1068 | ---
1069 | 
1070 | ## Wrapping up
1071 | 
1072 | ```yaml
1073 | type: BulletConsoleExercise
1074 | key: b1990e9a42
1075 | xp: 100
1076 | ```
1077 | 
1078 | You will often create intermediate files when analyzing data.
1079 | Rather than storing them in your home directory,
1080 | you can put them in `/tmp`,
1081 | which is where people and programs often keep files they only need briefly.
1082 | (Note that `/tmp` is immediately below the root directory `/`,
1083 | *not* below your home directory.)
1084 | This wrap-up exercise will show you how to do that.
1085 | 
1086 | `@pre_exercise_code`
1087 | ```{python}
1088 | 
1089 | ```
1090 | 
1091 | ***
1092 | 
1093 | ```yaml
1094 | type: ConsoleExercise
1095 | key: 59781bc43b
1096 | xp: 25
1097 | ```
1098 | 
1099 | `@instructions`
1100 | Use `cd` to go into `/tmp`.
1101 | 
1102 | `@hint`
1103 | Remember that `cd` stands for "change directory" and that an absolute path starts with a '/'.
1104 | 
1105 | `@solution`
1106 | ```{shell}
1107 | cd /tmp
1108 | 
1109 | ```
1110 | 
1111 | `@sct`
1112 | ```{python}
1113 | Ex().check_correct(
1114 |   has_cwd('/tmp'),
1115 |   has_code('cd +/tmp', incorrect_msg = 'You are in the wrong directory. Use `cd` to change directory to `/tmp`.')
1116 | )
1117 | 
1118 | ```
1119 | 
1120 | ***
1121 | 
1122 | ```yaml
1123 | type: ConsoleExercise
1124 | key: 7e6ada440d
1125 | xp: 25
1126 | ```
1127 | 
1128 | `@instructions`
1129 | List the contents of `/tmp` *without* typing a directory name.
1130 | 
1131 | `@hint`
1132 | If you don't tell `ls` what to list, it shows you what's in your current directory.
1133 | 
1134 | `@solution`
1135 | ```{shell}
1136 | ls
1137 | 
1138 | ```
1139 | 
1140 | `@sct`
1141 | ```{python}
1142 | Ex().multi(
1143 |     has_cwd("/tmp"),
1144 |     has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."),
1145 |     check_correct(
1146 |       has_expr_output(strict=True),
1147 |       has_code(r"^\s*ls\s*$", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` without arguments.")
1148 |     )
1149 | )
1150 | 
1151 | ```
1152 | 
1153 | ***
1154 | 
1155 | ```yaml
1156 | type: ConsoleExercise
1157 | key: edaf1bcf96
1158 | xp: 25
1159 | ```
1160 | 
1161 | `@instructions`
1162 | Make a new directory inside `/tmp` called `scratch`.
1163 | 
1164 | `@hint`
1165 | Use `mkdir` to make directories.
1166 | 
1167 | `@solution`
1168 | ```{shell}
1169 | mkdir scratch
1170 | 
1171 | ```
1172 | 
1173 | `@sct`
1174 | ```{python}
1175 | Ex().multi(
1176 |     has_cwd('/tmp'),
1177 |     check_correct(
1178 |       has_dir('/tmp/scratch'),
1179 |       has_code('mkdir +scratch', incorrect_msg="Cannot find a 'scratch' directory under '/tmp'. Make sure to use `mkdir` correctly.")
1180 |     )
1181 | )
1182 | 
1183 | ```
1184 | 
1185 | ***
1186 | 
1187 | ```yaml
1188 | type: ConsoleExercise
1189 | key: a904a3a719
1190 | xp: 25
1191 | ```
1192 | 
1193 | `@instructions`
1194 | Move `/home/repl/people/agarwal.txt` into `/tmp/scratch`.
1195 | We suggest you use the `~` shortcut for your home directory in the first path and a relative path for the second, rather than absolute paths.
1196 | 
1197 | `@hint`
1198 | 
1199 | 
1200 | `@solution`
1201 | ```{shell}
1202 | mv ~/people/agarwal.txt scratch
1203 | 
1204 | ```
1205 | 
1206 | `@sct`
1207 | ```{python}
1208 | Ex().multi(
1209 |     has_cwd('/tmp'),
1210 |     check_file('/tmp/scratch/agarwal.txt', missing_msg="Cannot find 'agarwal.txt' in '/tmp/scratch'. Use `mv` with `~/people/agarwal.txt` as the first parameter and `scratch` as the second.")
1211 | )
1212 | Ex().success_msg("This concludes Chapter 1 of Introduction to Shell! Rush over to the next chapter to learn more about manipulating data!")
1213 | 
1214 | ```
1215 | 


--------------------------------------------------------------------------------
/chapter2.md:
--------------------------------------------------------------------------------
   1 | ---
   2 | title: Manipulating data
   3 | description: >-
   4 |   The commands you saw in the previous chapter allowed you to move things around
   5 |   in the filesystem. This chapter will show you how to work with the data in
   6 |   those files. The tools we’ll use are fairly simple, but are solid building
   7 |   blocks.
   8 | lessons:
   9 |   - nb_of_exercises: 12
  10 |     title: How can I view a file's contents?
  11 | ---
  12 | 
  13 | ## How can I view a file's contents?
  14 | 
  15 | ```yaml
  16 | type: ConsoleExercise
  17 | key: 8acc09ede3
  18 | xp: 100
  19 | ```
  20 | 
  21 | Before you rename or delete files,
  22 | you may want to have a look at their contents.
  23 | The simplest way to do this is with `cat`,
  24 | which just prints the contents of files onto the screen.
  25 | (Its name is short for "concatenate", meaning "to link things together",
  26 | since it will print all the files whose names you give it, one after the other.)
  27 | 
  28 | ```{shell}
  29 | cat agarwal.txt
  30 | ```
  31 | ```
  32 | name: Agarwal, Jasmine
  33 | position: RCT2
  34 | start: 2017-04-01
  35 | benefits: full
  36 | ```
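
To see the "one after the other" behavior, here is a small sketch with two made-up files. It assumes `mktemp -d` is available and uses `printf` with `>` to create the files, neither of which has been covered yet in this course.

```shell
# Create two tiny files in a throwaway directory (hypothetical names).
scratch=$(mktemp -d)
printf 'first file\n' > "$scratch/a.txt"
printf 'second file\n' > "$scratch/b.txt"

# cat prints the files in the order they are named on the command line.
combined=$(cat "$scratch/a.txt" "$scratch/b.txt")
echo "$combined"
```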
  37 | 
  38 | `@instructions`
  39 | Print the contents of `course.txt` to the screen.
  40 | 
  41 | `@hint`
  42 | 
  43 | 
  44 | `@pre_exercise_code`
  45 | ```{python}
  46 | 
  47 | ```
  48 | 
  49 | `@solution`
  50 | ```{bash}
  51 | cat course.txt
  52 | ```
  53 | 
  54 | `@sct`
  55 | ```{python}
  56 | Ex().multi(
  57 |     has_cwd('/home/repl'),
  58 |     has_expr_output(incorrect_msg="Your command didn't generate the right output. Have you used `cat` followed by the name of the file, `course.txt`?")
  59 | )
  60 | Ex().success_msg("Nice! Let's look at other ways to view a file's contents.")
  61 | ```
  62 | 
  63 | ---
  64 | 
  65 | ## How can I view a file's contents piece by piece?
  66 | 
  67 | ```yaml
  68 | type: ConsoleExercise
  69 | key: d8a30a3f81
  70 | xp: 100
  71 | ```
  72 | 
  73 | You can use `cat` to print large files and then scroll through the output,
  74 | but it is usually more convenient to **page** the output.
  75 | The original command for doing this was called `more`,
  76 | but it has been superseded by a more powerful command called `less`.
  77 | (This kind of naming is what passes for humor in the Unix world.)
  78 | When you `less` a file,
  79 | one page is displayed at a time;
  80 | you can press spacebar to page down or type `q` to quit.
  81 | 
  82 | If you give `less` the names of several files,
  83 | you can type `:n` (colon and a lower-case 'n') to move to the next file,
  84 | `:p` to go back to the previous one,
  85 | or `:q` to quit.
  86 | 
  87 | Note: If you view solutions to exercises that use `less`,
  88 | you will see an extra command at the end that turns paging *off*
  89 | so that we can test your solutions efficiently.
  90 | 
  91 | `@instructions`
  92 | Use `less seasonal/spring.csv seasonal/summer.csv` to view those two files in that order.
  93 | Press spacebar to page down, `:n` to go to the second file, and `:q` to quit.
  94 | 
  95 | `@hint`
  96 | 
  97 | 
  98 | `@pre_exercise_code`
  99 | ```{python}
 100 | 
 101 | ```
 102 | 
 103 | `@solution`
 104 | ```{bash}
 105 | # You can leave out the '| cat' part here:
 106 | less seasonal/spring.csv seasonal/summer.csv | cat
 107 | ```
 108 | 
 109 | `@sct`
 110 | ```{python}
 111 | Ex().multi(
 112 |     has_cwd('/home/repl'),
 113 |     check_or(
 114 |         has_code(r'\s*less\s+seasonal/spring\.csv\s+seasonal/summer\.csv\s*',
 115 |                  incorrect_msg='Use `less` and the filenames. Remember that `:n` moves you to the next file.'),
 116 |         has_code(r'\s*less\s+seasonal/summer\.csv\s+seasonal/spring\.csv\s*')
 117 |     )
 118 | )
 119 | ```
 120 | 
 121 | ---
 122 | 
 123 | ## How can I look at the start of a file?
 124 | 
 125 | ```yaml
 126 | type: MultipleChoiceExercise
 127 | key: 82bdc9af65
 128 | lang: shell
 129 | xp: 50
 130 | skills:
 131 |   - 1
 132 | ```
 133 | 
 134 | The first thing most data scientists do when given a new dataset to analyze is
 135 | figure out what fields it contains and what values those fields have.
 136 | If the dataset has been exported from a database or spreadsheet,
 137 | it will often be stored as **comma-separated values** (CSV).
 138 | A quick way to figure out what it contains is to look at the first few rows.
 139 | 
 140 | We can do this in the shell using a command called `head`.
 141 | As its name suggests,
 142 | it prints the first few lines of a file
 143 | (where "a few" means 10),
 144 | so the command:
 145 | 
 146 | ```{shell}
 147 | head seasonal/summer.csv
 148 | ```
 149 | 
 150 | displays:
 151 | 
 152 | ```
 153 | Date,Tooth
 154 | 2017-01-11,canine
 155 | 2017-01-18,wisdom
 156 | 2017-01-21,bicuspid
 157 | 2017-02-02,molar
 158 | 2017-02-27,wisdom
 159 | 2017-02-27,wisdom
 160 | 2017-03-07,bicuspid
 161 | 2017-03-15,wisdom
 162 | 2017-03-20,canine
 163 | ```
 164 | 
 165 | <hr>
 166 | 
 167 | What does `head` do if there aren't 10 lines in the file?
 168 | (To find out, use it to look at the top of `people/agarwal.txt`.)
 169 | 
 170 | `@possible_answers`
 171 | - Print an error message because the file is too short.
 172 | - Display as many lines as there are.
 173 | - Display enough blank lines to bring the total to 10.
 174 | 
 175 | `@hint`
 176 | What is the most useful thing it could do?
 177 | 
 178 | `@pre_exercise_code`
 179 | ```{python}
 180 | 
 181 | ```
 182 | 
 183 | `@sct`
 184 | ```{python}
 185 | Ex().has_chosen(2, ["Incorrect: that isn't the most useful thing it could do.",
 186 |                     "Correct!",
 187 |                     "Incorrect: that would be impossible to distinguish from a file that ended with a bunch of blank lines."])
 188 | ```
 189 | 
 190 | ---
 191 | 
 192 | ## How can I type less?
 193 | 
 194 | ```yaml
 195 | type: BulletConsoleExercise
 196 | key: 0b7b8ca8f7
 197 | xp: 100
 198 | ```
 199 | 
 200 | One of the shell's power tools is **tab completion**.
 201 | If you start typing the name of a file and then press the tab key,
 202 | the shell will do its best to auto-complete the path.
 203 | For example,
 204 | if you type `sea` and press tab,
 205 | it will fill in the directory name `seasonal/` (with a trailing slash).
 206 | If you then type `a` and tab,
 207 | it will complete the path as `seasonal/autumn.csv`.
 208 | 
 209 | If the path is ambiguous,
 210 | such as `seasonal/s`,
 211 | pressing tab a second time will display a list of possibilities.
 212 | Typing another character or two to make your path more specific
 213 | and then pressing tab
 214 | will fill in the rest of the name.
 215 | 
 216 | `@pre_exercise_code`
 217 | ```{python}
 218 | 
 219 | ```
 220 | 
 221 | ***
 222 | 
 223 | ```yaml
 224 | type: ConsoleExercise
 225 | key: 4e30296c27
 226 | xp: 50
 227 | ```
 228 | 
 229 | `@instructions`
 230 | Run `head seasonal/autumn.csv` without typing the full filename.
 231 | 
 232 | `@hint`
 233 | Type as much of the path as you need to, then press tab, and repeat.
 234 | 
 235 | `@solution`
 236 | ```{shell}
 237 | head seasonal/autumn.csv
 238 | 
 239 | ```
 240 | 
 241 | `@sct`
 242 | ```{python}
 243 | Ex().multi(
 244 |     has_cwd('/home/repl'),
 245 |     has_expr_output(incorrect_msg="The checker couldn't find the right output in your command. Are you sure you called `head` on `seasonal/autumn.csv`?")
 246 | )
 247 | 
 248 | ```
 249 | 
 250 | ***
 251 | 
 252 | ```yaml
 253 | type: ConsoleExercise
 254 | key: e249266733
 255 | xp: 50
 256 | ```
 257 | 
 258 | `@instructions`
 259 | Run `head seasonal/spring.csv` without typing the full filename.
 260 | 
 261 | `@hint`
 262 | Type as much of the path as you need to, then press tab, and repeat.
 263 | 
 264 | `@solution`
 265 | ```{shell}
 266 | head seasonal/spring.csv
 267 | 
 268 | ```
 269 | 
 270 | `@sct`
 271 | ```{python}
 272 | Ex().multi(
 273 |     has_cwd('/home/repl'),
 274 |     has_expr_output(incorrect_msg="The checker couldn't find the right output in your command. Are you sure you called `head` on `seasonal/spring.csv`?")
 275 | )
 276 | Ex().success_msg("Good work! Once you get used to using tab completion, it will save you a lot of time!")
 277 | 
 278 | ```
 279 | 
 280 | ---
 281 | 
 282 | ## How can I control what commands do?
 283 | 
 284 | ```yaml
 285 | type: ConsoleExercise
 286 | key: 9eb608f6c9
 287 | xp: 100
 288 | ```
 289 | 
 290 | You won't always want to look at the first 10 lines of a file,
 291 | so the shell lets you change `head`'s behavior
 292 | by giving it a **command-line flag** (or just "flag" for short).
 293 | If you run the command:
 294 | 
 295 | ```{shell}
 296 | head -n 3 seasonal/summer.csv
 297 | ```
 298 | 
 299 | `head` will only display the first three lines of the file.
 300 | If you run `head -n 100`,
 301 | it will display the first 100 (assuming there are that many),
 302 | and so on.
 303 | 
 304 | A flag's name usually indicates its purpose
 305 | (for example, `-n` is meant to signal "**n**umber of lines").
 306 | Command flags don't have to be a `-` followed by a single letter,
 307 | but it's a widely-used convention.
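
As a rough illustration, the sketch below runs `head -n` on a five-line file. The file name and contents are invented for this example, and `mktemp -d` and `printf` with `>` are assumed to be available.

```shell
# Create a five-line sample file in a throwaway directory.
scratch=$(mktemp -d)
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > "$scratch/sample.txt"

# Ask for only the first two lines.
first_two=$(head -n 2 "$scratch/sample.txt")
echo "$first_two"

# Asking for more lines than the file has prints the whole file.
all_lines=$(head -n 100 "$scratch/sample.txt")
echo "$all_lines"
```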
 308 | 
 309 | Note: it's considered good style to put all flags *before* any filenames,
 310 | so in this course,
 311 | we only accept answers that do that.
 312 | 
 313 | `@instructions`
 314 | Display the first 5 lines of `winter.csv` in the `seasonal` directory.
 315 | 
 316 | `@hint`
 317 | 
 318 | 
 319 | `@pre_exercise_code`
 320 | ```{python}
 321 | 
 322 | ```
 323 | 
 324 | `@solution`
 325 | ```{shell}
 326 | head -n 5 seasonal/winter.csv
 327 | ```
 328 | 
 329 | `@sct`
 330 | ```{python}
 331 | Ex().multi(
 332 |     has_cwd('/home/repl'),
 333 |     check_correct(
 334 |         has_expr_output(incorrect_msg="Are you sure you're calling `head` on the `seasonal/winter.csv` file?"),
 335 |         has_expr_output(strict=True, incorrect_msg="Are you sure you used the flag `-n 5`?")
 336 |     ),
 337 |     check_not(has_output("2017-02-17,incisor"), incorrect_msg = "Are you sure you used the flag `-n 5`?")
 338 | )
 339 | Ex().success_msg("Nice! With this technique, you can keep your shell from blowing up when you want to look at larger text files.")
 340 | ```
 341 | 
 342 | ---
 343 | 
 344 | ## How can I list everything below a directory?
 345 | 
 346 | ```yaml
 347 | type: ConsoleExercise
 348 | key: f830d46419
 349 | xp: 100
 350 | ```
 351 | 
 352 | In order to see everything underneath a directory,
 353 | no matter how deeply nested it is,
 354 | you can give `ls` the flag `-R`
 355 | (which means "recursive").
 356 | If you use `ls -R` in your home directory,
 357 | you will see something like this:
 358 | 
 359 | ```
 360 | backup          course.txt      people          seasonal
 361 | 
 362 | ./backup:
 363 | 
 364 | ./people:
 365 | agarwal.txt
 366 | 
 367 | ./seasonal:
 368 | autumn.csv      spring.csv      summer.csv      winter.csv
 369 | ```
 370 | 
 371 | This shows every file and directory in the current level,
 372 | then everything in each sub-directory,
 373 | and so on.
 374 | 
 375 | `@instructions`
 376 | To help you know what is what,
 377 | `ls` has another flag `-F` that prints a `/` after the name of every directory
 378 | and a `*` after the name of every runnable program.
 379 | Run `ls` with the two flags, `-R` and `-F`, and the absolute path to your home directory
 380 | to see everything it contains.
 381 | (The order of the flags doesn't matter, but the directory name must come last.)
 382 | 
 383 | `@hint`
 384 | Your home directory can be specified using `~` or `.` or its absolute path.
 385 | 
 386 | `@pre_exercise_code`
 387 | ```{python}
 388 | 
 389 | ```
 390 | 
 391 | `@solution`
 392 | ```{shell}
 393 | ls -R -F /home/repl
 394 | ```
 395 | 
 396 | `@sct`
 397 | ```{python}
 398 | Ex().check_or(
 399 |   has_expr_output(incorrect_msg='Use either `ls -R -F` or `ls -F -R` and the path `/home/repl`.'),
 400 |   has_expr_output(expr = "ls -R -F .", incorrect_msg='Use either `ls -R -F` or `ls -F -R` and the path `/home/repl`.')
 401 | )
 402 | Ex().success_msg("That's a pretty neat overview, isn't it?")
 403 | ```
 404 | 
 405 | ---
 406 | 
 407 | ## How can I get help for a command?
 408 | 
 409 | ```yaml
 410 | type: BulletConsoleExercise
 411 | key: 7b90b8a7cd
 412 | xp: 100
 413 | ```
 414 | 
 415 | To find out what commands do,
 416 | people used to use the `man` command
 417 | (short for "manual").
 418 | For example,
 419 | the command `man head` brings up this information:
 420 | 
 421 | ```
 422 | HEAD(1)               BSD General Commands Manual              HEAD(1)
 423 | 
 424 | NAME
 425 |      head -- display first lines of a file
 426 | 
 427 | SYNOPSIS
 428 |      head [-n count | -c bytes] [file ...]
 429 | 
 430 | DESCRIPTION
 431 |      This filter displays the first count lines or bytes of each of
 432 |      the specified files, or of the standard input if no files are
 433 |      specified.  If count is omitted it defaults to 10.
 434 | 
 435 |      If more than a single file is specified, each file is preceded by
 436 |      a header consisting of the string ``==> XXX <=='' where ``XXX''
 437 |      is the name of the file.
 438 | 
 439 | SEE ALSO
 440 |      tail(1)
 441 | ```
 442 | 
 443 | `man` automatically invokes `less`,
 444 | so you may need to press spacebar to page through the information
 445 | and `:q` to quit.
 446 | 
 447 | The one-line description under `NAME` tells you briefly what the command does,
 448 | and the summary under `SYNOPSIS` lists all the flags it understands.
 449 | Anything that is optional is shown in square brackets `[...]`,
 450 | either/or alternatives are separated by `|`,
 451 | and things that can be repeated are shown by `...`,
 452 | so `head`'s manual page is telling you that you can *either* give a line count with `-n`
 453 | or a byte count with `-c`,
 454 | and that you can give it any number of filenames.
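
The two alternatives in that synopsis can be tried side by side. This is only a sketch: the file `notes.txt` is made up, and `mktemp -d` and `printf` with `>` are assumed to be available.

```shell
# Create a two-line sample file in a throwaway directory.
scratch=$(mktemp -d)
printf 'abcdefghij\nsecond line\n' > "$scratch/notes.txt"

# -n counts lines: this keeps the first line only.
by_lines=$(head -n 1 "$scratch/notes.txt")
echo "$by_lines"

# -c counts bytes: this keeps the first five bytes only.
by_bytes=$(head -c 5 "$scratch/notes.txt")
echo "$by_bytes"
```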
 455 | 
 456 | The problem with the Unix manual is that you have to know what you're looking for.
 457 | If you don't,
 458 | you can search [Stack Overflow](https://stackoverflow.com/),
 459 | ask a question on DataCamp's Slack channels,
 460 | or look at the `SEE ALSO` sections of the commands you already know.
 461 | 
 462 | `@pre_exercise_code`
 463 | ```{python}
 464 | 
 465 | ```
 466 | 
 467 | ***
 468 | 
 469 | ```yaml
 470 | type: ConsoleExercise
 471 | key: 52d629048a
 472 | xp: 50
 473 | ```
 474 | 
 475 | `@instructions`
 476 | Read the manual page for the `tail` command to find out
 477 | what putting a `+` sign in front of the number used with the `-n` flag does.
 478 | (Remember to press spacebar to page down and/or type `q` to quit.)
 479 | 
 480 | `@hint`
 481 | Remember: `man` is short for "manual".
 482 | 
 483 | `@solution`
 484 | ```{shell}
 485 | # Run the following command *without* '| cat':
 486 | man tail | cat
 487 | 
 488 | ```
 489 | 
 490 | `@sct`
 491 | ```{python}
 492 | Ex().has_code(r'\s*man\s+tail.*', incorrect_msg='Use `man` and the command name.')
 493 | 
 494 | ```
 495 | 
 496 | ***
 497 | 
 498 | ```yaml
 499 | type: ConsoleExercise
 500 | key: 6a07958ae0
 501 | xp: 50
 502 | ```
 503 | 
 504 | `@instructions`
 505 | Use `tail` with the flag `-n +7` to display all *but* the first six lines of `seasonal/spring.csv`.
 506 | 
 507 | `@hint`
 508 | Use a plus sign '+' in front of the number of the line you want the output to start from.
 509 | 
 510 | `@solution`
 511 | ```{shell}
 512 | tail -n +7 seasonal/spring.csv
 513 | 
 514 | ```
 515 | 
 516 | `@sct`
 517 | ```{python}
 518 | Ex().multi(
 519 |     has_cwd('/home/repl'),
 520 |     has_output('2017-09-07,molar', incorrect_msg="Are you calling `tail` on `seasonal/spring.csv`?"),
 521 |     has_expr_output(strict=True, incorrect_msg="Are you sure you used the flag `-n +7`?")
 522 | )
 523 | 
 524 | ```
 525 | 
 526 | ---
 527 | 
 528 | ## How can I select columns from a file?
 529 | 
 530 | ```yaml
 531 | type: MultipleChoiceExercise
 532 | key: 925e9d645a
 533 | xp: 50
 534 | ```
 535 | 
 536 | `head` and `tail` let you select rows from a text file.
 537 | If you want to select columns,
 538 | you can use the command `cut`.
 539 | It has several options (use `man cut` to explore them),
 540 | but the most common is something like:
 541 | 
 542 | ```{shell}
 543 | cut -f 2-5,8 -d , values.csv
 544 | ```
 545 | 
 546 | which means
 547 | "select columns 2 through 5 and column 8,
 548 | using comma as the separator".
 549 | `cut` uses `-f` (meaning "fields") to specify columns
 550 | and `-d` (meaning "delimiter") to specify the separator.
 551 | You need to specify the latter because some files may use spaces, tabs, or colons to separate columns.
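
Here is a small sketch of a field list that mixes a range with a single column. The file and its contents are invented for illustration, and `mktemp -d` and `printf` with `>` are assumed to be available.

```shell
# Create a small five-column CSV file in a throwaway directory.
scratch=$(mktemp -d)
printf 'id,name,city,age,score\n1,Ann,Oslo,30,88\n2,Bo,Lima,25,92\n' > "$scratch/values.csv"

# Select columns 2 through 3 and column 5, using comma as the separator.
selected=$(cut -f 2-3,5 -d , "$scratch/values.csv")
echo "$selected"
```

The header row is treated like any other line, so it is cut in the same way as the data rows.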
 552 | 
 553 | <hr>
 554 | 
 555 | What command will select the first column (containing dates) from the file `spring.csv`?
 556 | 
 557 | `@possible_answers`
 558 | - `cut -d , -f 1 seasonal/spring.csv`
 559 | - `cut -d, -f1 seasonal/spring.csv`
 560 | - Either of the above.
 561 | - Neither of the above, because `-f` must come before `-d`.
 562 | 
 563 | `@hint`
 564 | The order of the flags doesn't matter.
 565 | 
 566 | `@pre_exercise_code`
 567 | ```{python}
 568 | 
 569 | ```
 570 | 
 571 | `@sct`
 572 | ```{python}
 573 | Ex().has_chosen(3, ['Yes, but that is not all', 'Yes, but that is not all', 'Correct! Adding a space after the flag is good style, but not compulsory.', 'No, flag order doesn\'t matter'])
 574 | ```
 575 | 
 576 | ---
 577 | 
 578 | ## What can't cut do?
 579 | 
 580 | ```yaml
 581 | type: MultipleChoiceExercise
 582 | key: b9bb10ae87
 583 | xp: 50
 584 | ```
 585 | 
 586 | `cut` is a simple-minded command.
 587 | In particular,
 588 | it doesn't understand quoted strings.
 589 | If, for example, your file is:
 590 | 
 591 | ```
 592 | Name,Age
 593 | "Johel,Ranjit",28
 594 | "Sharma,Rupinder",26
 595 | ```
 596 | 
 597 | then:
 598 | 
 599 | ```{shell}
 600 | cut -f 2 -d , everyone.csv
 601 | ```
 602 | 
 603 | will produce:
 604 | 
 605 | ```
 606 | Age
 607 | Ranjit"
 608 | Rupinder"
 609 | ```
 610 | 
 611 | rather than everyone's age,
 612 | because it will think the comma between last and first names is a column separator.
 613 | 
 614 | <hr>
 615 | 
 616 | What is the output of `cut -d : -f 2-4` on the line:
 617 | 
 618 | ```
 619 | first:second:third:
 620 | ```
 621 | 
 622 | (Note the trailing colon.)
 623 | 
 624 | `@possible_answers`
 625 | - `second`
 626 | - `second:third`
 627 | - `second:third:`
 628 | - None of the above, because there aren't four fields.
 629 | 
 630 | `@hint`
 631 | Pay attention to the trailing colon.
 632 | 
 633 | `@pre_exercise_code`
 634 | ```{python}
 635 | 
 636 | ```
 637 | 
 638 | `@sct`
 639 | ```{python}
 640 | Ex().has_chosen(3, ['No, there is more.', 'No, there is more.', 'Correct! The trailing colon creates an empty fourth field.', 'No, `cut` does the best it can.'])
 641 | ```
 642 | 
 643 | ---
 644 | 
 645 | ## How can I repeat commands?
 646 | 
 647 | ```yaml
 648 | type: TabConsoleExercise
 649 | key: 32c0d30049
 650 | xp: 100
 651 | ```
 652 | 
 653 | One of the biggest advantages of using the shell is that
 654 | it makes it easy for you to do things over again.
 655 | If you run some commands,
 656 | you can then press the up-arrow key to cycle back through them.
 657 | You can also use the left and right arrow keys and the delete key to edit them.
 658 | Pressing return will then run the modified command.
 659 | 
 660 | Even better, `history` will print a list of commands you have run recently.
 661 | Each one is preceded by a serial number to make it easy to re-run particular commands:
 662 | just type `!55` to re-run the 55th command in your history (if you have that many).
 663 | You can also re-run a command by typing an exclamation mark followed by the command's name,
 664 | such as `!head` or `!cut`,
 665 | which will re-run the most recent use of that command.
 666 | 
 667 | `@pre_exercise_code`
 668 | ```{python}
 669 | 
 670 | ```
 671 | 
 672 | ***
 673 | 
 674 | ```yaml
 675 | type: ConsoleExercise
 676 | key: 188a2fab38
 677 | xp: 20
 678 | ```
 679 | 
 680 | `@instructions`
 681 | Run `head summer.csv` in your home directory (which should fail).
 682 | 
 683 | `@hint`
 684 | Tab completion won't work if there isn't a matching filename.
 685 | 
 686 | `@solution`
 687 | ```{shell}
 688 | head summer.csv
 689 | 
 690 | ```
 691 | 
 692 | `@sct`
 693 | ```{python}
 694 | Ex().multi(
 695 |     has_cwd('/home/repl'),
 696 |     has_code(r'\s*head\s+summer.csv\s*', incorrect_msg="Use `head` and a filename, `summer.csv`. Don't worry if it fails. It should.")
 697 | )
 698 | 
 699 | ```
 700 | 
 701 | ***
 702 | 
 703 | ```yaml
 704 | type: ConsoleExercise
 705 | key: cba6bf99a5
 706 | xp: 20
 707 | ```
 708 | 
 709 | `@instructions`
 710 | Change directory to `seasonal`.
 711 | 
 712 | `@hint`
 713 | Remember that `cd` stands for "change directory".
 714 | 
 715 | `@solution`
 716 | ```{shell}
 717 | cd seasonal
 718 | 
 719 | ```
 720 | 
 721 | `@sct`
 722 | ```{python}
 723 | Ex().check_correct(
 724 |   has_cwd('/home/repl/seasonal'),
 725 |   has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.")
 726 | )
 727 | 
 728 | ```
 729 | 
 730 | ***
 731 | 
 732 | ```yaml
 733 | type: ConsoleExercise
 734 | key: 74f5c8d2fc
 735 | xp: 20
 736 | ```
 737 | 
 738 | `@instructions`
 739 | Re-run the `head` command with `!head`.
 740 | 
 741 | `@hint`
 742 | Do not type any spaces between `!` and what follows.
 743 | 
 744 | `@solution`
 745 | ```{shell}
 746 | !head
 747 | 
 748 | ```
 749 | 
 750 | `@sct`
 751 | ```{python}
 752 | # !head is expanded into head summer.csv by the terminal, so manually specify expression
 753 | # This won't work for the validator though, so we have to use check_or to satisfy it.
 754 | Ex().multi(
 755 |     has_cwd('/home/repl/seasonal'),
 756 |     check_or(
 757 |         has_expr_output(expr = 'head summer.csv',
 758 |                         incorrect_msg='Use `!head` to repeat the `head` command.'),
 759 |         has_code('!head')
 760 |     )
 761 | )
 762 | 
 763 | ```
 764 | 
 765 | ***
 766 | 
 767 | ```yaml
 768 | type: ConsoleExercise
 769 | key: a28555575a
 770 | xp: 20
 771 | ```
 772 | 
 773 | `@instructions`
 774 | Use `history` to look at what you have done.
 775 | 
 776 | `@hint`
 777 | Notice that `history` shows the most recent commands last, so that they are left on your screen when it finishes running.
 778 | 
 779 | `@solution`
 780 | ```{shell}
 781 | history
 782 | 
 783 | ```
 784 | 
 785 | `@sct`
 786 | ```{python}
 787 | Ex().has_code(r'history', incorrect_msg='Use `history` without flags to get a list of previous commands.')
 788 | 
 789 | ```
 790 | 
 791 | ***
 792 | 
 793 | ```yaml
 794 | type: ConsoleExercise
 795 | key: 0629b2adf3
 796 | xp: 20
 797 | ```
 798 | 
 799 | `@instructions`
 800 | Re-run `head` again using `!` followed by a command number.
 801 | 
 802 | `@hint`
 803 | Do *not* type any spaces between `!` and what follows.
 804 | 
 805 | `@solution`
 806 | ```{shell}
 807 | !3
 808 | 
 809 | ```
 810 | 
 811 | `@sct`
 812 | ```{python}
 813 | # !3 is expanded into head summer.csv by the terminal, so manually specify expression
 814 | # This won't work for the validator though, so we have to use check_or to satisfy it.
 815 | Ex().multi(
 816 |     has_cwd('/home/repl/seasonal'),
 817 |     check_or(
 818 |         has_expr_output(expr = 'head summer.csv',
 819 |                         incorrect_msg='Have you used `!<a_number>` to rerun the last `head` from the history?'),
 820 |         # The head cmd should appear twice, at positions 1 and 3, though this will change 
 821 |         # if the student typed a wrong answer.
 822 |         # Since we're also checking output, this should be niche enough to ignore.
 823 |         has_code(r'!3'),
 824 |         has_code(r'!1') 
 825 |     )
 826 | )
 827 | Ex().success_msg("Well done! To the next one!")
 828 | 
 829 | ```
 830 | 
 831 | ---
 832 | 
 833 | ## How can I select lines containing specific values?
 834 | 
 835 | ```yaml
 836 | type: BulletConsoleExercise
 837 | key: adf1516acf
 838 | xp: 100
 839 | ```
 840 | 
 841 | `head` and `tail` select rows,
 842 | `cut` selects columns,
 843 | and `grep` selects lines according to what they contain.
 844 | In its simplest form,
 845 | `grep` takes a piece of text followed by one or more filenames
 846 | and prints all of the lines in those files that contain that text.
 847 | For example,
 848 | `grep bicuspid seasonal/winter.csv`
 849 | prints lines from `winter.csv` that contain "bicuspid".
 850 | 
 851 | `grep` can search for patterns as well;
 852 | we will explore those in the next course.
 853 | What's more important right now is some of `grep`'s more common flags:
 854 | 
 855 | - `-c`: print a count of matching lines rather than the lines themselves
 856 | - `-h`: do *not* print the names of files when searching multiple files
 857 | - `-i`: ignore case (e.g., treat "Regression" and "regression" as matches)
 858 | - `-l`: print the names of files that contain matches, not the matches
 859 | - `-n`: print line numbers for matching lines
 860 | - `-v`: invert the match, i.e., only show lines that *don't* match
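
As a rough illustration of how some of these flags behave, the sketch below runs `grep` on a three-line file. The file `notes.txt`, its contents, and the variable names are made up for this example and are not part of the course data.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Regression results' 'no match here' 'linear regression' > notes.txt

# -c with -i: count matching lines, ignoring case.
count=$(grep -c -i regression notes.txt)
# -n: show each matching line with its line number (case-sensitive here).
numbered=$(grep -n regression notes.txt)
# -v with -i: keep only the lines that do NOT match.
inverted=$(grep -v -i regression notes.txt)

echo "$count"
echo "$numbered"
echo "$inverted"
```

Flags can be combined freely, which is why `-c -i` counts two lines while the case-sensitive `-n` search reports only line 3.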
 861 | 
 862 | `@pre_exercise_code`
 863 | ```{python}
 864 | 
 865 | ```
 866 | 
 867 | ***
 868 | 
 869 | ```yaml
 870 | type: ConsoleExercise
 871 | key: 0d7ef2baa0
 872 | xp: 35
 873 | ```
 874 | 
 875 | `@instructions`
 876 | Print the contents of all of the lines containing the word `molar` in `seasonal/autumn.csv`
 877 | by running a single command while in your home directory. Don't use any flags.
 878 | 
 879 | `@hint`
 880 | Use `grep` with the word you are searching for and the name of the file(s) to search in.
 881 | 
 882 | `@solution`
 883 | ```{shell}
 884 | grep molar seasonal/autumn.csv
 885 | 
 886 | ```
 887 | 
 888 | `@sct`
 889 | ```{python}
 890 | Ex().multi(
 891 |   has_cwd('/home/repl'),
 892 |   check_correct(
 893 |     has_expr_output(),
 894 |     multi(
 895 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
 896 |       has_code("molar", incorrect_msg = "Did you search for `molar`?"),
 897 |       has_code("seasonal/autumn.csv", incorrect_msg = "Did you search the `seasonal/autumn.csv` file?")
 898 |     )
 899 |   )
 900 | )
 901 | 
 902 | ```
 903 | 
 904 | ***
 905 | 
 906 | ```yaml
 907 | type: ConsoleExercise
 908 | key: a0eee34d1e
 909 | xp: 35
 910 | ```
 911 | 
 912 | `@instructions`
 913 | Invert the match to find all of the lines that *don't* contain the word `molar` in `seasonal/spring.csv`, and show their line numbers.
 914 | Remember, it's considered good style to put all of the flags *before* other values like filenames or the search term "molar".
 915 | 
 916 | `@hint`
 917 | Use `-v` to invert the match and `-n` to show line numbers.
 918 | 
 919 | `@solution`
 920 | ```{shell}
 921 | grep -v -n molar seasonal/spring.csv
 922 | 
 923 | ```
 924 | 
 925 | `@sct`
 926 | ```{python}
 927 | Ex().multi(
 928 |   has_cwd('/home/repl'),
 929 |   check_correct(
 930 |     has_expr_output(),
 931 |     multi(
 932 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
 933 |       has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"),
 934 |       has_code("-n", incorrect_msg = "Did you show line numbers with `-n`?"),
 935 |       has_code("molar", incorrect_msg = "Did you search for `molar`?"),
 936 |       has_code("seasonal/spring.csv", incorrect_msg = "Did you search the `seasonal/spring.csv` file?")
 937 |     )
 938 |   )
 939 | )
 940 | 
 941 | ```
 942 | 
 943 | ***
 944 | 
 945 | ```yaml
 946 | type: ConsoleExercise
 947 | key: f5641234fe
 948 | xp: 30
 949 | ```
 950 | 
 951 | `@instructions`
 952 | Count how many lines contain the word `incisor` in each of `seasonal/autumn.csv` and `seasonal/winter.csv`.
 953 | (Again, run a single command from your home directory.)
 954 | 
 955 | `@hint`
 956 | Remember to use `-c` with `grep` to count lines.
 957 | 
 958 | `@solution`
 959 | ```{shell}
 960 | grep -c incisor seasonal/autumn.csv seasonal/winter.csv
 961 | 
 962 | ```
 963 | 
 964 | `@sct`
 965 | ```{python}
 966 | Ex().multi(
 967 |   has_cwd('/home/repl'),
 968 |   check_correct(
 969 |     has_expr_output(),
 970 |     multi(
 971 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
 972 |       has_code("-c", incorrect_msg = "Did you get counts with `-c`?"),
 973 |       has_code("incisor", incorrect_msg = "Did you search for `incisor`?"),
 974 |       has_code("seasonal/autumn.csv", incorrect_msg = "Did you search the `seasonal/autumn.csv` file?"),
 975 |       has_code("seasonal/winter.csv", incorrect_msg = "Did you search the `seasonal/winter.csv` file?")
 976 |     )
 977 |   )
 978 | )
 979 | 
 980 | ```
 981 | 
 982 | ---
 983 | 
 984 | ## Why isn't it always safe to treat data as text?
 985 | 
 986 | ```yaml
 987 | type: MultipleChoiceExercise
 988 | key: 11914639fc
 989 | xp: 50
 990 | ```
 991 | 
 992 | The `SEE ALSO` section of the manual page for `cut` refers to a command called `paste`
 993 | that can be used to combine data files instead of cutting them up.
 994 | 
 995 | <hr>
 996 | 
 997 | Read the manual page for `paste`,
 998 | and then run `paste` to combine the autumn and winter data files in a single table
 999 | using a comma as a separator.
1000 | What's wrong with the output from a data analysis point of view?
1001 | 
1002 | `@possible_answers`
1003 | - The column headers are repeated.
1004 | - The last few rows have the wrong number of columns.
1005 | - Some of the data from `winter.csv` is missing.
1006 | 
1007 | `@hint`
1008 | If you `cut` the output of `paste` using commas as a separator,
1009 | would it produce the right answer?
1010 | 
1011 | `@pre_exercise_code`
1012 | ```{python}
1013 | 
1014 | ```
1015 | 
1016 | `@sct`
1017 | ```{python}
1018 | err1 = 'True, but it is not necessarily an error.'
1019 | correct2 = 'Correct: joining the lines with commas creates only one empty column at the start, not two.'
1020 | err3 = 'No, all of the winter data is there.'
1021 | Ex().has_chosen(2, [err1, correct2, err3])
1022 | ```
1023 | 


--------------------------------------------------------------------------------
/chapter3.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Combining tools
  3 | description: >-
  4 |   The real power of the Unix shell lies not in the individual commands, but in
  5 |   how easily they can be combined to do new things. This chapter will show you
  6 |   how to use this power to select the data you want, and introduce commands for
  7 |   sorting values and removing duplicates.
  8 | lessons:
  9 |   - nb_of_exercises: 12
 10 |     title: How can I store a command's output in a file?
 11 | ---
 12 | 
 13 | ## How can I store a command's output in a file?
 14 | 
 15 | ```yaml
 16 | type: ConsoleExercise
 17 | key: 07a427d50c
 18 | xp: 100
 19 | ```
 20 | 
 21 | All of the tools you have seen so far let you name input files.
 22 | Most don't have an option for naming an output file because they don't need one.
 23 | Instead,
 24 | you can use **redirection** to save any command's output anywhere you want.
 25 | If you run this command:
 26 | 
 27 | ```{shell}
 28 | head -n 5 seasonal/summer.csv
 29 | ```
 30 | 
 31 | it prints the first 5 lines of the summer data on the screen.
 32 | If you run this command instead:
 33 | 
 34 | ```{shell}
 35 | head -n 5 seasonal/summer.csv > top.csv
 36 | ```
 37 | 
 38 | nothing appears on the screen.
 39 | Instead,
 40 | `head`'s output is put in a new file called `top.csv`.
 41 | You can take a look at that file's contents using `cat`:
 42 | 
 43 | ```{shell}
 44 | cat top.csv
 45 | ```
 46 | 
 47 | The greater-than sign `>` tells the shell to redirect `head`'s output to a file.
 48 | It isn't part of the `head` command;
 49 | instead,
 50 | it works with every shell command that produces output.
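
The following sketch shows redirection on a tiny file. The file `sample.csv` and its rows are invented for this illustration; only `head`, `tail`, `cat`, and `>` come from the text above. It also shows that redirecting to an existing file replaces its contents.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Date,Tooth' '2017-01-05,canine' '2017-01-17,wisdom' '2017-02-01,molar' > sample.csv

# Nothing is printed here: the two lines go into top.csv instead.
head -n 2 sample.csv > top.csv
first_version=$(cat top.csv)
echo "$first_version"

# Redirecting to the same name again replaces the old contents.
tail -n 1 sample.csv > top.csv
second_version=$(cat top.csv)
echo "$second_version"
```

Because `>` overwrites without asking, it is worth checking the target filename before pressing return.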
 51 | 
 52 | `@instructions`
 53 | Combine `tail` with redirection to save the last 5 lines of `seasonal/winter.csv` in a file called `last.csv`.
 54 | 
 55 | `@hint`
 56 | Use `tail -n 5` to get the last 5 lines.
 57 | 
 58 | `@pre_exercise_code`
 59 | ```{python}
 60 | 
 61 | ```
 62 | 
 63 | `@solution`
 64 | ```{shell}
 65 | tail -n 5 seasonal/winter.csv > last.csv
 66 | ```
 67 | 
 68 | `@sct`
 69 | ```{python}
 70 | patt = "The line `%s` should be in the file `last.csv`, but it isn't. Redirect the output of `tail -n 5 seasonal/winter.csv` to `last.csv` with `>`."
 71 | Ex().multi(
 72 |     has_cwd('/home/repl'),
 73 |     check_file('/home/repl/last.csv').multi(
 74 |         check_not(has_code('2017-07-01,incisor'), incorrect_msg='`last.csv` has too many lines. Did you use the flag `-n 5` with `tail`?'),
 75 |         has_code('2017-07-17,canine', incorrect_msg=patt%'2017-07-17,canine'),
 76 |         has_code('2017-08-13,canine', incorrect_msg=patt%'2017-08-13,canine')
 77 |     )
 78 | )
 79 | Ex().success_msg("Nice! Let's practice some more!")
 80 | ```
 81 | 
 82 | ---
 83 | 
 84 | ## How can I use a command's output as an input?
 85 | 
 86 | ```yaml
 87 | type: BulletConsoleExercise
 88 | key: f47d337593
 89 | xp: 100
 90 | ```
 91 | 
 92 | Suppose you want to get lines from the middle of a file.
 93 | More specifically,
 94 | suppose you want to get lines 3-5 from one of our data files.
 95 | You can start by using `head` to get the first 5 lines
 96 | and redirect that to a file,
 97 | and then use `tail` to select the last 3:
 98 | 
 99 | ```{shell}
100 | head -n 5 seasonal/winter.csv > top.csv
101 | tail -n 3 top.csv
102 | ```
103 | 
104 | A quick check confirms that this is lines 3-5 of our original file,
105 | because it is the last 3 lines of the first 5.
106 | 
107 | `@pre_exercise_code`
108 | ```{python}
109 | 
110 | ```
111 | 
112 | ***
113 | 
114 | ```yaml
115 | type: ConsoleExercise
116 | key: 35bbb5520e
117 | xp: 50
118 | ```
119 | 
120 | `@instructions`
121 | Select the last two lines from `seasonal/winter.csv`
122 | and save them in a file called `bottom.csv`.
123 | 
124 | `@hint`
125 | Use `tail` to select lines and `>` to redirect `tail`'s output.
126 | 
127 | `@solution`
128 | ```{shell}
129 | tail -n 2 seasonal/winter.csv > bottom.csv
130 | 
131 | ```
132 | 
133 | `@sct`
134 | ```{python}
135 | patt="The line `%s` should be in the file `bottom.csv`, but it isn't. Redirect the output of `tail -n 2 seasonal/winter.csv` to `bottom.csv` with `>`."
136 | Ex().multi(
137 |     has_cwd('/home/repl'),
138 |     check_file('/home/repl/bottom.csv').multi(
139 |         check_not(has_code('2017-08-11,bicuspid'), incorrect_msg = '`bottom.csv` has too many lines. Did you use the flag `-n 2` with `tail`?'),
140 |         has_code('2017-08-11,wisdom', incorrect_msg=patt%"2017-08-11,wisdom"),
141 |         has_code('2017-08-13,canine', incorrect_msg=patt%"2017-08-13,canine")
142 |     )
143 | )
144 | 
145 | ```
146 | 
147 | ***
148 | 
149 | ```yaml
150 | type: ConsoleExercise
151 | key: c94d3936a7
152 | xp: 50
153 | ```
154 | 
155 | `@instructions`
156 | Select the first line from `bottom.csv`
157 | in order to get the second-to-last line of the original file.
158 | 
159 | `@hint`
160 | Use `head` to select the line you want.
161 | 
162 | `@solution`
163 | ```{shell}
164 | head -n 1 bottom.csv
165 | 
166 | ```
167 | 
168 | `@sct`
169 | ```{python}
170 | Ex().multi(
171 |     has_cwd('/home/repl'),
172 |     check_file('/home/repl/bottom.csv').has_code('2017-08-11,wisdom', incorrect_msg="There's something wrong with the `bottom.csv` file. Make sure you don't change it!"),
173 |     has_expr_output(strict=True, incorrect_msg="Have you used `head` correctly on `bottom.csv`? Make sure to use the `-n` flag correctly.")
174 | )
175 | 
176 | Ex().success_msg("Well done. Head over to the next exercise to find out about better ways to combine commands.")                             
177 | 
178 | ```
179 | 
180 | ---
181 | 
182 | ## What's a better way to combine commands?
183 | 
184 | ```yaml
185 | type: ConsoleExercise
186 | key: b36aea9a1e
187 | xp: 100
188 | ```
189 | 
190 | Using redirection to combine commands has two drawbacks:
191 | 
192 | 1. It leaves a lot of intermediate files lying around (like `top.csv`).
193 | 2. The commands to produce your final result are scattered across several lines of history.
194 | 
 195 | The shell provides another tool, called a **pipe**, that solves both of these problems at once.
196 | Once again,
197 | start by running `head`:
198 | 
199 | ```{shell}
200 | head -n 5 seasonal/summer.csv
201 | ```
202 | 
203 | Instead of sending `head`'s output to a file,
204 | add a vertical bar and the `tail` command *without* a filename:
205 | 
206 | ```{shell}
207 | head -n 5 seasonal/summer.csv | tail -n 3
208 | ```
209 | 
210 | The pipe symbol tells the shell to use the output of the command on the left
211 | as the input to the command on the right.
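
To see that a pipe gives the same answer as the two-step approach with an intermediate file, here is a small sketch. The file `sample.txt`, its seven lines, and the variable names are assumptions made for this example.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' line1 line2 line3 line4 line5 line6 line7 > sample.txt

# Two steps, leaving an intermediate file behind.
head -n 5 sample.txt > top.txt
via_file=$(tail -n 3 top.txt)

# One step with a pipe: tail reads head's output directly.
via_pipe=$(head -n 5 sample.txt | tail -n 3)

echo "$via_pipe"
```

Both variables hold lines 3 to 5, but only the first approach leaves `top.txt` lying around.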
212 | 
213 | `@instructions`
 214 | Use `cut` to select all of the tooth names from column 2 of the comma-delimited file `seasonal/summer.csv`, then pipe the result to `grep`, with an inverted match, to exclude the header line containing the word "Tooth". *`cut` and `grep` were covered in detail in Chapter 2, exercises 8 and 11 respectively.*
215 | 
216 | `@hint`
217 | - The first part of the command takes the form `cut -d field_delimiter -f column_number filename`.
218 | - The second part of the command takes the form `grep -v thing_to_match`.
219 | 
220 | `@pre_exercise_code`
221 | ```{python}
222 | 
223 | ```
224 | 
225 | `@solution`
226 | ```{shell}
227 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth
228 | ```
229 | 
230 | `@sct`
231 | ```{python}
232 | Ex().multi(
233 |     has_cwd('/home/repl'),
234 |     has_expr_output(incorrect_msg = 'Have you piped the result of `cut -d , -f 2 seasonal/summer.csv` into `grep -v Tooth` with `|`?'),
235 |     check_not(has_output("Tooth"), incorrect_msg = 'Did you exclude the `"Tooth"` header line using `grep`?')
236 | )
237 | Ex().success_msg("Perfect piping! This may be the first time you used `|`, but it's definitely not the last!")
238 | ```
239 | 
240 | ---
241 | 
242 | ## How can I combine many commands?
243 | 
244 | ```yaml
245 | type: ConsoleExercise
246 | key: b8753881d6
247 | xp: 100
248 | ```
249 | 
250 | You can chain any number of commands together.
251 | For example,
252 | this command:
253 | 
254 | ```{shell}
255 | cut -d , -f 1 seasonal/spring.csv | grep -v Date | head -n 10
256 | ```
257 | 
258 | will:
259 | 
260 | 1. select the first column from the spring data;
261 | 2. remove the header line containing the word "Date"; and
262 | 3. select the first 10 lines of actual data.
263 | 
264 | `@instructions`
265 | In the previous exercise, you used the following command to select all the tooth names from column 2 of `seasonal/summer.csv`:
266 | 
267 | ```
268 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth
269 | ```
270 | 
271 | Extend this pipeline with a `head` command to only select the very first tooth name.
272 | 
273 | `@hint`
274 | Copy and paste the code in the instructions, append a pipe, then call `head` with the `-n` flag.
275 | 
276 | `@pre_exercise_code`
277 | ```{python}
278 | 
279 | ```
280 | 
281 | `@solution`
282 | ```{shell}
283 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth | head -n 1
284 | ```
285 | 
286 | `@sct`
287 | ```{python}
288 | Ex().multi(
289 |     has_cwd('/home/repl'),
290 |     # for some reason has_expr_output with strict=True does not work here...
 291 |     has_output(r'^\s*canine\s*$', incorrect_msg = "Have you used `|` to extend the pipeline with a `head` command? Make sure to set the `-n` flag correctly."),
292 |     # by coincidence, tail -n 1 returns the same as head -n 1, so check that head was called
293 |     has_code("head", "Have you used `|` to extend the pipeline with a `head` command?")
294 | )
295 | Ex().success_msg("Cheerful chaining! By chaining several commands together, you can build powerful data manipulation pipelines.")
296 | ```
297 | 
298 | ---
299 | 
300 | ## How can I count the records in a file?
301 | 
302 | ```yaml
303 | type: ConsoleExercise
304 | key: ae6a48d6aa
305 | xp: 100
306 | ```
307 | 
 308 | The command `wc` (short for "word count") prints the number of **c**haracters, **w**ords, and **l**ines in a file.
 309 | You can make it print only one of these using `-c`, `-w`, or `-l` respectively (strictly speaking, `-c` counts bytes, which is the same as characters for plain ASCII text).
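
A quick sketch of the three flags on a two-line file follows. The file `sample.txt` and the variable names are made up for this illustration, and `tr -d ' '` is only there because some versions of `wc` pad the number with spaces.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'one two\nthree\n' > sample.txt

# Piping the file into wc keeps the filename out of wc's output.
lines=$(cat sample.txt | wc -l | tr -d ' ')
words=$(cat sample.txt | wc -w | tr -d ' ')
chars=$(cat sample.txt | wc -c | tr -d ' ')

echo "lines=$lines words=$words chars=$chars"
```

The character count includes the two newline characters, which is why it is 14 rather than 12.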
310 | 
311 | `@instructions`
312 | Count how many records in `seasonal/spring.csv` have dates in July 2017 (`2017-07`). 
313 | - To do this, use `grep` with a partial date to select the lines and pipe this result into `wc` with an appropriate flag to count the lines.
314 | 
315 | `@hint`
316 | - Use `head seasonal/spring.csv` to remind yourself of the date format.
317 | - The first part of the command takes the form `grep thing_to_match filename`.
318 | - After the pipe, `|`, call `wc` with the `-l` flag.
319 | 
320 | `@pre_exercise_code`
321 | ```{python}
322 | 
323 | ```
324 | 
325 | `@solution`
326 | ```{shell}
327 | grep 2017-07 seasonal/spring.csv | wc -l
328 | ```
329 | 
330 | `@sct`
331 | ```{python}
332 | Ex().multi(
333 |   has_cwd('/home/repl'),
334 |   check_correct(
335 |     has_expr_output(strict=True),
336 |     multi(
337 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
338 |       has_code("2017-07", incorrect_msg = "Did you search for `2017-07`?"),
339 |       has_code("seasonal/spring.csv", incorrect_msg = "Did you search the `seasonal/spring.csv` file?"),
340 |       has_code("|", incorrect_msg = "Did you pipe to `wc` using `|`?"),      
341 |       has_code("wc", incorrect_msg = "Did you call `wc`?"),
342 |       has_code("-l", incorrect_msg = "Did you count lines with `-l`?")
343 |     )
344 |   )
345 | )
346 | Ex().success_msg("Careful counting! Determining how much data you have is a great first step in any data analysis.")
347 | ```
348 | 
349 | ---
350 | 
351 | ## How can I specify many files at once?
352 | 
353 | ```yaml
354 | type: ConsoleExercise
355 | key: 602d47e70c
356 | xp: 100
357 | ```
358 | 
359 | Most shell commands will work on multiple files if you give them multiple filenames.
360 | For example,
361 | you can get the first column from all of the seasonal data files at once like this:
362 | 
363 | ```{shell}
364 | cut -d , -f 1 seasonal/winter.csv seasonal/spring.csv seasonal/summer.csv seasonal/autumn.csv
365 | ```
366 | 
367 | But typing the names of many files over and over is a bad idea:
368 | it wastes time,
369 | and sooner or later you will either leave a file out or repeat a file's name.
370 | To make your life better,
371 | the shell allows you to use **wildcards** to specify a list of files with a single expression.
372 | The most common wildcard is `*`,
373 | which means "match zero or more characters".
374 | Using it,
375 | we can shorten the `cut` command above to this:
376 | 
377 | ```{shell}
378 | cut -d , -f 1 seasonal/*
379 | ```
380 | 
381 | or:
382 | 
383 | ```{shell}
384 | cut -d , -f 1 seasonal/*.csv
385 | ```
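
One way to check what a wildcard will match before using it in a real command is to hand it to `echo`, which simply prints its arguments. The sketch below assumes a made-up directory `data` with three empty files; `mkdir`, `touch`, and `echo` are used here only to set up and display the example.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir data
touch data/a.csv data/b.csv data/notes.txt

# The shell expands each pattern into a list of names before echo runs.
all_files=$(echo data/*)
csv_only=$(echo data/*.csv)

echo "$all_files"
echo "$csv_only"
```

The command never sees the `*` itself: it receives the already-expanded list of filenames, exactly as if you had typed them out.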
386 | 
387 | `@instructions`
388 | Write a single command using `head` to get the first three lines from both `seasonal/spring.csv` and `seasonal/summer.csv`, a total of six lines of data, but *not* from the autumn or winter data files.
389 | Use a wildcard instead of spelling out the files' names in full.
390 | 
391 | `@hint`
392 | - The command takes the form `head -n number_of_lines filename_pattern`.
393 | - You could match files in directory `a`, starting with `b`, using `a/b*`, for example.
394 | 
395 | `@pre_exercise_code`
396 | ```{python}
397 | 
398 | ```
399 | 
400 | `@solution`
401 | ```{shell}
402 | head -n 3 seasonal/s* # ...or seasonal/s*.csv, or even s*/s*.csv
403 | ```
404 | 
405 | `@sct`
406 | ```{python}
407 | Ex().multi(
408 |     has_cwd('/home/repl'),
409 |     has_expr_output(incorrect_msg = "You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`. Make sure to only include the first three lines of each file with the `-n` flag!"),
410 |     check_not(has_output('==> seasonal/autumn.csv <=='), incorrect_msg = "Don't include the output for `seasonal/autumn.csv`. You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`"),
411 |     check_not(has_output('==> seasonal/winter.csv <=='), incorrect_msg = "Don't include the output for `seasonal/winter.csv`. You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`")
412 | )
413 | Ex().success_msg("Wild wildcard work! This becomes even more important if your directory contains hundreds or thousands of files.")
414 | ```
415 | 
416 | ---
417 | 
418 | ## What other wildcards can I use?
419 | 
420 | ```yaml
421 | type: PureMultipleChoiceExercise
422 | key: f8feeacd8c
423 | xp: 50
424 | ```
425 | 
426 | The shell has other wildcards as well,
427 | though they are less commonly used:
428 | 
429 | - `?` matches a single character, so `201?.txt` will match `2017.txt` or `2018.txt`, but not `2017-01.txt`.
430 | - `[...]` matches any one of the characters inside the square brackets, so `201[78].txt` matches `2017.txt` or `2018.txt`, but not `2016.txt`.
 431 | - `{...}` matches any of the comma-separated patterns inside the curly brackets, so `{*.txt,*.csv}` matches any file whose name ends with `.txt` or `.csv`, but not files whose names end with `.pdf`. Do not put spaces after the commas, or the shell will treat the pieces as separate words.
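
The first two wildcards in the list above can be tried out with `echo`, as in this sketch. The four empty files are created only for the illustration, and the variable names are arbitrary. Curly brackets are left out because not every shell supports them.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 2016.txt 2017.txt 2018.txt 2017-01.txt

# ? stands for exactly one character, so 2017-01.txt is not matched.
one_char=$(echo 201?.txt)
# [78] stands for one character that must be 7 or 8.
one_of=$(echo 201[78].txt)

echo "$one_char"
echo "$one_of"
```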
432 | 
433 | <hr/>
434 | 
435 | Which expression would match `singh.pdf` and `johel.txt` but *not* `sandhu.pdf` or `sandhu.txt`?
436 | 
437 | `@hint`
438 | Match each expression against each filename in turn.
439 | 
440 | `@possible_answers`
 441 | - `[sj]*.{.pdf,.txt}`
 442 | - `{s*.pdf,j*.txt}`
 443 | - `[singh,johel]{*.pdf,*.txt}`
 444 | - [`{singh.pdf,j*.txt}`]
445 | 
446 | `@feedback`
447 | - No: `.pdf` and `.txt` are not filenames.
448 | - No: this will match `sandhu.pdf`.
449 | - No: the expression in square brackets matches only one character, not entire words.
450 | - Correct!
451 | 
452 | ---
453 | 
454 | ## How can I sort lines of text?
455 | 
456 | ```yaml
457 | type: ConsoleExercise
458 | key: f06d9e310e
459 | xp: 100
460 | ```
461 | 
462 | As its name suggests,
463 | `sort` puts data in order.
464 | By default it does this in ascending alphabetical order,
465 | but the flags `-n` and `-r` can be used to sort numerically and reverse the order of its output,
466 | while `-b` tells it to ignore leading blanks
467 | and `-f` tells it to **f**old case (i.e., be case-insensitive).
468 | Pipelines often use `grep` to get rid of unwanted records
469 | and then `sort` to put the remaining records in order.
470 | 
471 | `@instructions`
472 | Remember the combination of `cut` and `grep` to select all the tooth names from column 2 of `seasonal/summer.csv`?
473 | 
474 | ```
475 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth
476 | ```
477 | 
478 | Starting from this recipe, sort the names of the teeth in `seasonal/winter.csv` (not `summer.csv`) in descending alphabetical order. To do this, extend the pipeline with a `sort` step.
479 | 
480 | `@hint`
481 | Copy and paste the command in the instructions, change the filename, append a pipe, then call `sort` with the `-r` flag.
482 | 
483 | `@pre_exercise_code`
484 | ```{python}
485 | 
486 | ```
487 | 
488 | `@solution`
489 | ```{shell}
490 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort -r
491 | ```
492 | 
493 | `@sct`
494 | ```{python}
495 | Ex().multi(
496 |   has_cwd('/home/repl'),
497 |   check_correct(
498 |     has_expr_output(strict=True),
499 |     multi(
500 |       has_code("cut", incorrect_msg = "Did you call `cut`?"),
501 |       has_code("-d", incorrect_msg = "Did you specify a field delimiter with `-d`?"),
502 |       has_code("seasonal/winter.csv", incorrect_msg = "Did you get data from the `seasonal/winter.csv` file?"),
503 |       has_code("|", incorrect_msg = "Did you pipe from `cut` to `grep` to `sort` using `|`?"),      
504 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
505 |       has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"),
506 |       has_code("Tooth", incorrect_msg = "Did you search for `Tooth`?"),
507 |       has_code("sort", incorrect_msg = "Did you call `sort`?"),
508 |       has_code("-r", incorrect_msg = "Did you reverse the sort order with `-r`?")
509 |     )
510 |   )
511 | )
 512 | Ex().success_msg("Sorted! `sort` has many uses. For example, piping `sort -n -r` to `head` shows you the largest values.")
513 | ```
514 | 
515 | ---
516 | 
517 | ## How can I remove duplicate lines?
518 | 
519 | ```yaml
520 | type: ConsoleExercise
521 | key: ed77aed337
522 | xp: 100
523 | ```
524 | 
525 | Another command that is often used with `sort` is `uniq`,
526 | whose job is to remove duplicated lines.
527 | More specifically,
528 | it removes *adjacent* duplicated lines.
529 | If a file contains:
530 | 
531 | ```
532 | 2017-07-03
533 | 2017-07-03
534 | 2017-08-03
535 | 2017-08-03
536 | ```
537 | 
538 | then `uniq` will produce:
539 | 
540 | ```
541 | 2017-07-03
542 | 2017-08-03
543 | ```
544 | 
545 | but if it contains:
546 | 
547 | ```
548 | 2017-07-03
549 | 2017-08-03
550 | 2017-07-03
551 | 2017-08-03
552 | ```
553 | 
554 | then `uniq` will print all four lines.
555 | The reason is that `uniq` is built to work with very large files.
 556 | In order to remove non-adjacent duplicate lines from a file,
557 | it would have to keep the whole file in memory
558 | (or at least,
559 | all the unique lines seen so far).
560 | By only removing adjacent duplicates,
561 | it only has to keep the most recent unique line in memory.
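
The second example above can be checked with a short sketch. It writes the four interleaved dates to a made-up file `dates.txt` and compares `uniq` on its own with `uniq` after `sort`; the variable names are arbitrary.

```shell
# Scratch directory, so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 2017-07-03 2017-08-03 2017-07-03 2017-08-03 > dates.txt

# No two adjacent lines are equal, so uniq removes nothing.
unsorted=$(uniq dates.txt)
# Sorting first puts equal lines next to each other.
sorted=$(sort dates.txt | uniq)

echo "$unsorted"
echo "$sorted"
```

This is why `uniq` is so often found directly after `sort` in a pipeline.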
562 | 
563 | `@instructions`
564 | Write a pipeline to:
565 | 
 566 | - get the second column from `seasonal/winter.csv`;
 567 | - remove the header line containing the word "Tooth" so that only tooth names are displayed;
 568 | - sort the output so that all occurrences of a particular tooth name are adjacent; and
 569 | - display each tooth name once along with a count of how often it occurs.
570 | 
571 | The start of your pipeline is the same as the previous exercise:
572 | 
573 | ```
574 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth
575 | ```
576 | 
577 | Extend it with a `sort` command, and use `uniq -c` to display unique lines with a count of how often each occurs rather than using `uniq` and `wc`.
578 | 
579 | `@hint`
580 | Copy and paste the command in the instructions, pipe to `sort` without flags, then pipe again to `uniq` with a `-c` flag.
581 | 
582 | `@pre_exercise_code`
583 | ```{python}
584 | 
585 | ```
586 | 
587 | `@solution`
588 | ```{shell}
589 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort | uniq -c
590 | ```
591 | 
592 | `@sct`
593 | ```{python}
594 | Ex().multi(
595 |     has_cwd('/home/repl'),
596 |     check_correct(
597 |         has_expr_output(),
598 |         multi(
 599 |             has_code(r'cut\s+-d\s+,\s+-f\s+2\s+seasonal/winter.csv\s+\|\s+grep\s+-v\s+Tooth',
 600 |                      incorrect_msg="You should start from this command: `cut -d , -f 2 seasonal/winter.csv | grep -v Tooth`. Now extend it!"),
 601 |             has_code(r'\|\s+sort', incorrect_msg="Have you extended the command with `| sort`?"),
 602 |             has_code(r'\|\s+uniq', incorrect_msg="Have you extended the command with `| uniq`?"),
603 |             has_code('-c', incorrect_msg="Have you included counts with `-c`?")
604 |         )
605 |     )
606 | )
607 | Ex().success_msg("Great! After all of this work on a pipe, it would be nice if we could store the result, no?")
608 | ```
609 | 
610 | ---
611 | 
612 | ## How can I save the output of a pipe?
613 | 
614 | ```yaml
615 | type: MultipleChoiceExercise
616 | key: 4115aa25b2
617 | xp: 50
618 | ```
619 | 
620 | The shell lets us redirect the output of a sequence of piped commands:
621 | 
622 | ```{shell}
623 | cut -d , -f 2 seasonal/*.csv | grep -v Tooth > teeth-only.txt
624 | ```
625 | 
626 | However, `>` must appear at the end of the pipeline:
627 | if we try to use it in the middle, like this:
628 | 
629 | ```{shell}
630 | cut -d , -f 2 seasonal/*.csv > teeth-only.txt | grep -v Tooth
631 | ```
632 | 
633 | then all of the output from `cut` is written to `teeth-only.txt`,
634 | so there is nothing left for `grep`:
635 | it receives no input and prints nothing.
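
A small experiment shows where the output ends up in each case. The sketch below uses a throwaway two-line file (`sample.csv` and the other names are made up for illustration) and counts the lines that reach the last stage when `>` sits in the middle of the pipeline.

```shell
# Work in a temporary directory so no real data is touched.
workdir=$(mktemp -d)
printf 'Date,Tooth\n2017-01-25,wisdom\n' > "$workdir/sample.csv"

# Redirection at the end: the filtered output lands in the file.
cut -d , -f 2 "$workdir/sample.csv" | grep -v Tooth > "$workdir/teeth-only.txt"
saved=$(cat "$workdir/teeth-only.txt")
echo "saved to file: $saved"

# Redirection in the middle: cut's output goes to the file,
# so no lines travel down the pipe to grep and wc.
reached_end=$(cut -d , -f 2 "$workdir/sample.csv" > "$workdir/middle.txt" | grep -v Tooth | wc -l)
echo "lines reaching the end of the pipeline: $reached_end"

rm -r "$workdir"
```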
636 | 
637 | <hr>
638 | 
639 | What happens if we put redirection at the front of a command, as in:
640 | 
641 | ```{shell}
642 | > result.txt head -n 3 seasonal/winter.csv
643 | ```
644 | 
645 | `@possible_answers`
646 | - [The command's output is redirected to the file as usual.]
647 | - The shell reports it as an error.
648 | - The shell waits for input forever.
649 | 
650 | `@hint`
651 | Try it out in the shell.
652 | 
653 | `@pre_exercise_code`
654 | ```{python}
655 | 
656 | ```
657 | 
658 | `@sct`
659 | ```{python}
660 | Ex().has_chosen(1, ['Correct!', 'No; the shell can actually execute this.', 'No; the shell can actually execute this.'])
661 | ```
662 | 
663 | ---
664 | 
665 | ## How can I stop a running program?
666 | 
667 | ```yaml
668 | type: ConsoleExercise
669 | key: d1694dbdcd
670 | xp: 100
671 | ```
672 | 
673 | The commands and scripts that you have run so far have all executed quickly,
674 | but some tasks will take minutes, hours, or even days to complete.
675 | You may also mistakenly start a command that waits for input that never arrives,
676 | causing it to hang.
677 | If you decide that you don't want a program to keep running,
678 | you can type `Ctrl` + `C` to end it.
679 | This is often written `^C` in Unix documentation;
680 | note that the 'c' can be lower-case.
681 | 
682 | `@instructions`
683 | Run the command:
684 | 
685 | ```{shell}
686 | head
687 | ```
688 | 
689 | with no arguments (so that it waits for input that will never come)
690 | and then stop it by typing `Ctrl` + `C`.
691 | 
692 | `@hint`
693 | Simply type head, hit Enter and exit the running program with `Ctrl` + `C`.
694 | 
695 | `@pre_exercise_code`
696 | ```{python}
697 | 
698 | ```
699 | 
700 | `@solution`
701 | ```{shell}
702 | # Simply type head, hit Enter and exit the running program with `Ctrl` + `C`.
703 | ```
704 | 
705 | `@sct`
706 | ```{python}
707 | Ex().has_code(r'\s*head\s*', fixed=False, incorrect_msg="Have you used `head`?")
708 | ```
709 | 
710 | ---
711 | 
712 | ## Wrapping up
713 | 
714 | ```yaml
715 | type: BulletConsoleExercise
716 | key: 659d3caa48
717 | xp: 100
718 | ```
719 | 
720 | To wrap up,
721 | you will build a pipeline to find out how many records are in the shortest of the seasonal data files.
722 | 
723 | `@pre_exercise_code`
724 | ```{python}
725 | 
726 | ```
727 | 
728 | ***
729 | 
730 | ```yaml
731 | type: ConsoleExercise
732 | key: b1f9c8ff84
733 | xp: 35
734 | ```
735 | 
736 | `@instructions`
737 | Use `wc` with appropriate parameters to list the number of lines in all of the seasonal data files.
738 | (Use a wildcard for the filenames instead of typing them all in by hand.)
739 | 
740 | `@hint`
741 | Use `-l` to list only the lines and `*` to match filenames.
742 | 
743 | `@solution`
744 | ```{shell}
745 | wc -l seasonal/*.csv
746 | 
747 | ```
748 | 
749 | `@sct`
750 | ```{python}
751 | Ex().multi(
752 |   has_cwd('/home/repl'),
753 |   check_correct(
754 |     has_expr_output(strict=True),
755 |     multi(
756 |       has_code("wc", incorrect_msg = "Did you call `wc`?"),
757 |       has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"),
758 |       has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?")
759 |     )
760 |   )
761 | )
762 | 
763 | ```
764 | 
765 | ***
766 | 
767 | ```yaml
768 | type: ConsoleExercise
769 | key: 7f94acc679
770 | xp: 35
771 | ```
772 | 
773 | `@instructions`
774 | Add another command to the previous one using a pipe to remove the line containing the word "total".
775 | 
776 | `@hint`
777 | Pipe the output of the previous command to `grep -v total`.
778 | 
779 | `@solution`
780 | ```{shell}
781 | wc -l seasonal/*.csv | grep -v total
782 | 
783 | ```
784 | 
785 | `@sct`
786 | ```{python}
787 | Ex().multi(
788 |   has_cwd('/home/repl'),
789 |   check_correct(
790 |     has_expr_output(strict=True),
791 |     multi(
792 |       has_code("wc", incorrect_msg = "Did you call `wc`?"),
793 |       has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"),
794 |       has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?"),
795 |       has_code(r"\|", incorrect_msg = "Did you pipe from `wc` to `grep` using `|`?"),
796 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
797 |       has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"),
798 |       has_code("total", incorrect_msg = "Did you search for `total`?")
799 |     )
800 |   )
801 | )
802 | 
803 | ```
804 | 
805 | ***
806 | 
807 | ```yaml
808 | type: ConsoleExercise
809 | key: c5f55bff6b
810 | xp: 30
811 | ```
812 | 
813 | `@instructions`
814 | Add two more stages to the pipeline that use `sort -n` and `head -n 1` to find the file containing the fewest lines.
815 | 
816 | `@hint`
817 | - Use `sort`'s `-n` flag to sort numerically.
818 | - Use `head`'s `-n` flag to limit to keeping 1 line.
819 | 
820 | `@solution`
821 | ```{shell}
822 | wc -l seasonal/*.csv | grep -v total | sort -n | head -n 1
823 | 
824 | ```
825 | 
826 | `@sct`
827 | ```{python}
828 | Ex().multi(
829 |   has_cwd('/home/repl'),
830 |   check_correct(
831 |     has_expr_output(strict=True),
832 |     multi(
833 |       has_code("wc", incorrect_msg = "Did you call `wc`?"),
834 |       has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"),
835 |       has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?"),
836 |       has_code(r"\|", incorrect_msg = "Did you pipe from `wc` to `grep` to `sort` to `head` using `|`?"),
837 |       has_code("grep", incorrect_msg = "Did you call `grep`?"),
838 |       has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"),
839 |       has_code("total", incorrect_msg = "Did you search for `total`?"),
840 |       has_code("sort", incorrect_msg = "Did you call `sort`?"),
841 |       has_code("-n", incorrect_msg = "Did you specify the number of lines to keep with `-n`?"),
842 |       has_code("1", incorrect_msg = "Did you specify 1 line to keep with `-n 1`?")
843 |     )
844 |   )
845 | )
846 | Ex().success_msg("Great! It turns out `autumn.csv` is the file with the fewest lines. Rush over to chapter 4 to learn more about batch processing!")
847 | 
848 | ```
849 | 


--------------------------------------------------------------------------------
/chapter4.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: Batch processing
  3 | description: >-
  4 |   Most shell commands will process many files at once. This chapter shows you
  5 |   how to make your own pipelines do that. Along the way, you will see how the
  6 |   shell uses variables to store information.
  7 | lessons:
  8 |   - nb_of_exercises: 10
  9 |     title: How does the shell store information?
 10 | ---
 11 | 
 12 | ## How does the shell store information?
 13 | 
 14 | ```yaml
 15 | type: MultipleChoiceExercise
 16 | key: e4d5f4adea
 17 | xp: 50
 18 | ```
 19 | 
 20 | Like other programs, the shell stores information in variables.
 21 | Some of these,
 22 | called **environment variables**,
 23 | are available all the time.
 24 | Environment variables' names are conventionally written in upper case,
 25 | and a few of the more commonly-used ones are shown below.
 26 | 
 27 | | Variable | Purpose                           | Value                 |
 28 | |----------|-----------------------------------|-----------------------|
 29 | | `HOME`   | User's home directory             | `/home/repl`          |
 30 | | `PWD`    | Present working directory         | Same as `pwd` command |
 31 | | `SHELL`  | Which shell program is being used | `/bin/bash`           |
 32 | | `USER`   | User's ID                         | `repl`                |
 33 | 
 34 | To get a complete list (which is quite long),
 35 | you can type `set` in the shell.
 36 | 
 37 | <hr>
 38 | 
 39 | Use `set` and `grep` with a pipe to display the value of `HISTFILESIZE`,
 40 | which determines how many old commands are stored in your command history.
 41 | What is its value?
 42 | 
 43 | `@possible_answers`
 44 | - 10
 45 | - 500
 46 | - [2000]
 47 | - The variable is not there.
 48 | 
 49 | `@hint`
 50 | Use `set | grep HISTFILESIZE` to get the line you need.
 51 | 
 52 | `@pre_exercise_code`
 53 | ```{python}
 54 | 
 55 | ```
 56 | 
 57 | `@sct`
 58 | ```{python}
 59 | err1 = "No: the shell records more history than that."
 60 | err2 = "No: the shell records more history than that."
 61 | correct3 = "Correct: the shell saves 2000 old commands by default on this system."
 62 | err4 = "No: the variable `HISTFILESIZE` is there."
 63 | Ex().has_chosen(3, [err1, err2, correct3, err4])
 64 | ```
 65 | 
 66 | ---
 67 | 
 68 | ## How can I print a variable's value?
 69 | 
 70 | ```yaml
 71 | type: ConsoleExercise
 72 | key: afae0f33a7
 73 | xp: 100
 74 | ```
 75 | 
 76 | A simpler way to find a variable's value is to use a command called `echo`, which prints its arguments. Typing
 77 | 
 78 | ```{shell}
 79 | echo hello DataCamp!
 80 | ```
 81 | 
 82 | prints
 83 | 
 84 | ```
 85 | hello DataCamp!
 86 | ```
 87 | 
 88 | If you try to use it to print a variable's value like this:
 89 | 
 90 | ```{shell}
 91 | echo USER
 92 | ```
 93 | 
 94 | it will print the variable's name, `USER`.
 95 | 
 96 | To get the variable's value, you must put a dollar sign `$` in front of it. Typing 
 97 | 
 98 | ```{shell}
 99 | echo $USER
100 | ```
101 | 
102 | prints
103 | 
104 | ```
105 | repl
106 | ```
107 | 
108 | This is true everywhere:
109 | to get the value of a variable called `X`,
110 | you must write `$X`.
111 | (This is so that the shell can tell whether you mean "a file named X"
112 | or "the value of a variable named X".)
113 | 
114 | `@instructions`
115 | The variable `OSTYPE` holds the name of the kind of operating system you are using.
116 | Display its value using `echo`.
117 | 
118 | `@hint`
119 | Call `echo` with the variable `OSTYPE` preceded by `$`.
120 | 
121 | `@pre_exercise_code`
122 | ```{python}
123 | 
124 | ```
125 | 
126 | `@solution`
127 | ```{shell}
128 | echo $OSTYPE
129 | ```
130 | 
131 | `@sct`
132 | ```{python}
133 | Ex().multi(
134 |     has_cwd('/home/repl'),
135 |     check_correct(
136 |         has_expr_output(strict = True),
137 |         multi(
138 |             has_code('echo', incorrect_msg="Did you call `echo`?"),
139 |             has_code('OSTYPE', incorrect_msg="Did you print the `OSTYPE` environment variable?"),
140 |             has_code(r'\$OSTYPE', incorrect_msg="Make sure to put a `$` in front of `OSTYPE`.")
141 |         )
142 |     )
143 | )
144 | Ex().success_msg("Excellent echoing of environment variables! You're off to a good start. Let's carry on!")
145 | ```
146 | 
147 | ---
148 | 
149 | ## How else does the shell store information?
150 | 
151 | ```yaml
152 | type: BulletConsoleExercise
153 | key: e925da48e4
154 | xp: 100
155 | ```
156 | 
157 | The other kind of variable is called a **shell variable**,
158 | which is like a local variable in a programming language.
159 | 
160 | To create a shell variable,
161 | you simply assign a value to a name:
162 | 
163 | ```{shell}
164 | training=seasonal/summer.csv
165 | ```
166 | 
167 | *without* any spaces before or after the `=` sign.
168 | Once you have done this,
169 | you can check the variable's value with:
170 | 
171 | ```{shell}
172 | echo $training
173 | ```
174 | ```
175 | seasonal/summer.csv
176 | ```
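
The rule about spaces matters because the shell reads a line with spaces around `=` as a command followed by arguments. A minimal sketch, assuming no program called `training` is installed on your system:

```shell
# Correct: no spaces around =, so this is an assignment.
training=seasonal/summer.csv
echo "$training"

# Incorrect: with spaces, the shell tries to run a command named `training`
# with the arguments `=` and `seasonal/summer.csv`, which fails.
# (Assumes no command called `training` exists on this system.)
if training = seasonal/summer.csv 2>/dev/null; then
    assignment_with_spaces=worked
else
    assignment_with_spaces=failed
fi
echo "assignment with spaces: $assignment_with_spaces"
```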
177 | 
178 | `@pre_exercise_code`
179 | ```{python}
180 | 
181 | ```
182 | 
183 | ***
184 | 
185 | ```yaml
186 | type: ConsoleExercise
187 | key: 78f7fd446f
188 | xp: 50
189 | ```
190 | 
191 | `@instructions`
192 | Define a variable called `testing` with the value `seasonal/winter.csv`.
193 | 
194 | `@hint`
195 | There should *not* be spaces between the variable's name and its value.
196 | 
197 | `@solution`
198 | ```{shell}
199 | testing=seasonal/winter.csv
200 | 
201 | ```
202 | 
203 | `@sct`
204 | ```{python}
205 | # For some reason, testing the shell variable directly always passes, so we can't do the following.
206 | # Ex().multi(
207 | #     has_cwd('/home/repl'),
208 | #     has_expr_output(
209 | #         expr='echo $testing',
210 | #         output='seasonal/winter.csv',
211 | #         incorrect_msg="Have you used `testing=seasonal/winter.csv` to define the `testing` variable?"
212 | #     )
213 | # )
214 | Ex().multi(
215 |     has_cwd('/home/repl'),
216 |     multi(
217 |         has_code('testing', incorrect_msg='Did you define a shell variable named `testing`?'),
218 |         has_code('testing=', incorrect_msg='Did you write `=` directly after testing, with no spaces?'),
219 |         has_code('=seasonal/winter\.csv', incorrect_msg='Did you set the value of `testing` to `seasonal/winter.csv`?')
220 |     )
221 | )
222 | 
223 | ```
224 | 
225 | ***
226 | 
227 | ```yaml
228 | type: ConsoleExercise
229 | key: d5e7224f55
230 | xp: 50
231 | ```
232 | 
233 | `@instructions`
234 | Use `head -n 1 SOMETHING` to get the first line from `seasonal/winter.csv`
235 | using the value of the variable `testing` instead of the name of the file.
236 | 
237 | `@hint`
238 | Remember to use `$testing` rather than just `testing`
239 | (the `$` is needed to get the value of the variable).
240 | 
241 | `@solution`
242 | ```{shell}
243 | # We need to re-set the variable for testing purposes in this exercise;
244 | # you should only run "head -n 1 $testing".
245 | testing=seasonal/winter.csv
246 | head -n 1 $testing
247 | 
248 | ```
249 | 
250 | `@sct`
251 | ```{python}
252 | Ex().multi(
253 |     has_cwd('/home/repl'),
254 |     has_code(r'\$testing', incorrect_msg="Did you reference the shell variable using `$testing`?"),
255 |     check_correct(
256 |         has_output('^Date,Tooth\s*$'),
257 |         multi(
258 |             has_code('head', incorrect_msg="Did you call `head`?"),
259 |             has_code('-n', incorrect_msg="Did you limit the number of lines with `-n`?"),
260 |             has_code(r'-n\s+1', incorrect_msg="Did you elect to keep 1 line with `-n 1`?")     
261 |         )
262 |     )
263 | )
264 | Ex().success_msg("Stellar! Let's see how you can repeat commands easily.")
265 | 
266 | ```
267 | 
268 | ---
269 | 
270 | ## How can I repeat a command many times?
271 | 
272 | ```yaml
273 | type: ConsoleExercise
274 | key: 920d1887e3
275 | xp: 100
276 | ```
277 | 
278 | Shell variables are also used in **loops**,
279 | which repeat commands many times.
280 | If we run this command:
281 | 
282 | ```{shell}
283 | for filetype in gif jpg png; do echo $filetype; done
284 | ```
285 | 
286 | it produces:
287 | 
288 | ```
289 | gif
290 | jpg
291 | png
292 | ```
293 | 
294 | Notice these things about the loop:
295 | 
296 | 1. The structure is `for` ...variable... `in` ...list... `; do` ...body... `; done`
297 | 2. The list of things the loop is to process (in our case, the words `gif`, `jpg`, and `png`).
298 | 3. The variable that keeps track of which thing the loop is currently processing (in our case, `filetype`).
299 | 4. The body of the loop that does the processing (in our case, `echo $filetype`).
300 | 
301 | Notice that the body uses `$filetype` to get the variable's value instead of just `filetype`,
302 | just like it does with any other shell variable.
303 | Also notice where the semi-colons go:
304 | the first one comes between the list and the keyword `do`,
305 | and the second comes between the body and the keyword `done`.
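
The same loop can also be spread over several lines, in which case line breaks take the place of the semi-colons. This layout is standard shell syntax, not something the exercises in this course require. It is shown here only as an alternative way to read the structure.

```shell
# Each line break below stands where a semi-colon stood in the one-line version.
for filetype in gif jpg png
do
    echo $filetype
done
```

The one-line form is handy at the prompt, while the multi-line form is often easier to read in a file.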
306 | 
307 | `@instructions`
308 | Modify the loop so that it prints:
309 | 
310 | ```
311 | docx
312 | odt
313 | pdf
314 | ```
315 | 
316 | Please use `filetype` as the name of the loop variable.
317 | 
318 | `@hint`
319 | Use the code structure in the introductory text, swapping the image file types for document file types.
320 | 
321 | `@pre_exercise_code`
322 | ```{python}
323 | 
324 | ```
325 | 
326 | `@solution`
327 | ```{shell}
328 | for filetype in docx odt pdf; do echo $filetype; done
329 | ```
330 | 
331 | `@sct`
332 | ```{python}
333 | Ex().multi(
334 |   has_cwd('/home/repl'),
335 |   check_correct(
336 |     has_expr_output(),
337 |     multi(
338 |       has_code('for', incorrect_msg='Did you call `for`?'),
339 |       has_code('filetype', incorrect_msg='Did you use `filetype` as the loop variable?'),
340 |       has_code('in', incorrect_msg='Did you use `in` before the list of file types?'),
341 |       has_code('docx odt pdf', incorrect_msg='Did you loop over `docx`, `odt` and `pdf` in that order?'),
342 |       has_code(r'pdf\s*;', incorrect_msg='Did you put a semi-colon after the last loop element?'),
343 |       has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'),
344 |       has_code('echo', incorrect_msg='Did you call `echo`?'),
345 |       has_code(r'\$filetype', incorrect_msg='Did you echo `$filetype`?'),
346 |       has_code(r'filetype\s*;', incorrect_msg='Did you put a semi-colon after the loop body?'),
347 |       has_code('; done', incorrect_msg='Did you finish with `done`?')
348 |     )
349 |   )
350 | )
351 | Ex().success_msg("First-rate for looping! Loops are brilliant if you want to do the same thing hundreds or thousands of times.")
352 | ```
353 | 
354 | ---
355 | 
356 | ## How can I repeat a command once for each file?
357 | 
358 | ```yaml
359 | type: ConsoleExercise
360 | key: 8468b70a71
361 | xp: 100
362 | ```
363 | 
364 | You can always type in the names of the files you want to process when writing the loop,
365 | but it's usually better to use wildcards.
366 | Try running this loop in the console:
367 | 
368 | ```{shell}
369 | for filename in seasonal/*.csv; do echo $filename; done
370 | ```
371 | 
372 | It prints:
373 | 
374 | ```
375 | seasonal/autumn.csv
376 | seasonal/spring.csv
377 | seasonal/summer.csv
378 | seasonal/winter.csv
379 | ```
380 | 
381 | because the shell expands `seasonal/*.csv` to be a list of four filenames
382 | before it runs the loop.
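
You can watch the expansion select files by looping over ones you create yourself. This sketch builds a scratch directory with three made-up files (`a.csv`, `b.csv`, and `notes.txt`) and loops over the `.csv` ones only.

```shell
# Create a scratch directory holding two CSV files and one text file.
scratch=$(mktemp -d)
touch "$scratch/a.csv" "$scratch/b.csv" "$scratch/notes.txt"

# The shell replaces the wildcard with the matching names before the loop starts,
# so the body runs for a.csv and b.csv but never sees notes.txt.
for filename in "$scratch"/*.csv; do echo "$filename"; done

# After the loop, the variable still holds the last name it was given.
last_name=$(basename "$filename")
rm -r "$scratch"
```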
383 | 
384 | `@instructions`
385 | Modify the wildcard expression to `people/*`
386 | so that the loop prints the names of the files in the `people` directory
387 | regardless of what suffix they do or don't have.
388 | Please use `filename` as the name of your loop variable.
389 | 
390 | `@hint`
391 | 
392 | 
393 | `@pre_exercise_code`
394 | ```{python}
395 | 
396 | ```
397 | 
398 | `@solution`
399 | ```{bash}
400 | for filename in people/*; do echo $filename; done
401 | ```
402 | 
403 | `@sct`
404 | ```{python}
405 | Ex().multi(
406 |   has_cwd('/home/repl'),
407 |   check_correct(
408 |     has_expr_output(),
409 |     multi(
410 |       has_code('for', incorrect_msg='Did you call `for`?'),
411 |       has_code('filename', incorrect_msg='Did you use `filename` as the loop variable?'),
412 |       has_code('in', incorrect_msg='Did you use `in` before the list of files?'),
413 |       has_code('people/\*', incorrect_msg='Did you specify a list of files with `people/*`?'),
414 |       has_code(r'people/\*\s*;', incorrect_msg='Did you put a semi-colon after the list of files?'),
415 |       has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'),
416 |       has_code('echo', incorrect_msg='Did you call `echo`?'),
417 |       has_code(r'\$filename', incorrect_msg='Did you echo `$filename`?'),
418 |       has_code(r'filename\s*;', incorrect_msg='Did you put a semi-colon after the loop body?'),
419 |       has_code('; done', incorrect_msg='Did you finish with `done`?')
420 |     )
421 |   )
422 | )
423 | Ex().success_msg("Loopy looping! Wildcards and loops make a powerful combination.")
424 | ```
425 | 
426 | ---
427 | 
428 | ## How can I record the names of a set of files?
429 | 
430 | ```yaml
431 | type: MultipleChoiceExercise
432 | key: 153ca10317
433 | xp: 50
434 | ```
435 | 
436 | People often set a variable using a wildcard expression to record a list of filenames.
437 | For example,
438 | if you define `datasets` like this:
439 | 
440 | ```{shell}
441 | datasets=seasonal/*.csv
442 | ```
443 | 
444 | you can display the files' names later using:
445 | 
446 | ```{shell}
447 | for filename in $datasets; do echo $filename; done
448 | ```
449 | 
450 | This saves typing and makes errors less likely.
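
It helps to know when the wildcard is expanded. In bash, an assignment stores the pattern itself, and the shell expands it only when `$datasets` is later used without quotes. The sketch below demonstrates this with a scratch directory and two empty placeholder files, both invented for the example.

```shell
# Build a scratch copy of the directory layout with two empty files.
scratch=$(mktemp -d)
mkdir "$scratch/seasonal"
touch "$scratch/seasonal/autumn.csv" "$scratch/seasonal/spring.csv"

# The assignment stores the pattern as written; no filenames are looked up yet.
datasets=$scratch/seasonal/*.csv

# In double quotes the value is printed as stored, wildcard included.
echo "$datasets"

# Without quotes the shell expands the pattern into the matching filenames.
for filename in $datasets; do echo $filename; done

last_name=$(basename "$filename")
rm -r "$scratch"
```

One consequence is that files added to the directory after the assignment are still picked up the next time `$datasets` is used.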
451 | 
452 | <hr>
453 | 
454 | If you run these two commands in your home directory,
455 | how many lines of output will they print?
456 | 
457 | ```{shell}
458 | files=seasonal/*.csv
459 | for f in $files; do echo $f; done
460 | ```
461 | 
462 | `@possible_answers`
463 | - None: since `files` is defined on a separate line, it has no value in the second line.
464 | - One: the word "files".
465 | - [Four: the names of all four seasonal data files.]
466 | 
467 | `@hint`
468 | Remember that `X` on its own is just "X", while `$X` is the value of the variable `X`.
469 | 
470 | `@pre_exercise_code`
471 | ```{python}
472 | 
473 | ```
474 | 
475 | `@sct`
476 | ```{python}
477 | err1 = "No: you do not have to define a variable on the same line you use it."
478 | err2 = "No: the loop uses `$files`, so the shell substitutes the variable's value rather than the word `files`."
479 | correct3 = "Correct. The command is equivalent to `for f in seasonal/*.csv; do echo $f; done`."
480 | Ex().has_chosen(3, [err1, err2, correct3])
481 | ```
482 | 
483 | ---
484 | 
485 | ## A variable's name versus its value
486 | 
487 | ```yaml
488 | type: PureMultipleChoiceExercise
489 | key: 4fcfb63c4f
490 | xp: 50
491 | ```
492 | 
493 | A common mistake is to forget to use `$` before the name of a variable.
494 | When you do this,
495 | the shell uses the name you have typed
496 | rather than the value of that variable.
497 | 
498 | A more common mistake for experienced users is to mis-type the variable's name.
499 | For example,
500 | if you define `datasets` like this:
501 | 
502 | ```{shell}
503 | datasets=seasonal/*.csv
504 | ```
505 | 
506 | and then type:
507 | 
508 | ```{shell}
509 | echo $datsets
510 | ```
511 | 
512 | the shell doesn't print anything,
513 | because `datsets` (without the second "a") isn't defined.
514 | 
515 | <hr>
516 | 
517 | If you were to run these two commands in your home directory,
518 | what output would be printed?
519 | 
520 | ```{shell}
521 | files=seasonal/*.csv
522 | for f in files; do echo $f; done
523 | ```
524 | 
525 | (Read the first part of the loop carefully before answering.)
526 | 
527 | `@hint`
528 | Remember that `X` on its own is just "X", while `$X` is the value of the variable `X`.
529 | 
530 | `@possible_answers`
531 | - [One line: the word "files".]
532 | - Four lines: the names of all four seasonal data files.
533 | - Four blank lines: the variable `f` isn't assigned a value.
534 | 
535 | `@feedback`
536 | - Correct: the loop uses `files` instead of `$files`, so the list consists of the word "files".
537 | - No: the loop uses `files` instead of `$files`, so the list consists of the word "files" rather than the expansion of `files`.
538 | - No: the variable `f` is defined automatically by the `for` loop.
539 | 
540 | ---
541 | 
542 | ## How can I run many commands in a single loop?
543 | 
544 | ```yaml
545 | type: ConsoleExercise
546 | key: 39b5dcf81a
547 | xp: 100
548 | ```
549 | 
550 | Printing filenames is useful for debugging,
551 | but the real purpose of loops is to do things with multiple files.
552 | This loop prints the second line of each data file:
553 | 
554 | ```{shell}
555 | for file in seasonal/*.csv; do head -n 2 $file | tail -n 1; done
556 | ```
557 | 
558 | It has the same structure as the other loops you have already seen:
559 | all that's different is that its body is a pipeline of two commands instead of a single command.
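
To try the pipeline body without the course data, here is a sketch that runs the same loop over two small files created on the spot. Their names and contents are invented for illustration.

```shell
# Create two tiny CSV files in a scratch directory.
scratch=$(mktemp -d)
printf 'Date,Tooth\n2017-01-05,canine\n2017-01-17,wisdom\n' > "$scratch/first.csv"
printf 'Date,Tooth\n2017-01-25,incisor\n2017-02-19,molar\n' > "$scratch/second.csv"

# head keeps the first two lines of each file; tail then keeps the last of those,
# which is the file's second line.
second_lines=$(for file in "$scratch"/*.csv; do head -n 2 "$file" | tail -n 1; done)
echo "$second_lines"

rm -r "$scratch"
```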
560 | 
561 | `@instructions`
562 | Write a loop that prints the last entry from July 2017 (`2017-07`) in every seasonal file. It should produce output similar to:
563 | 
564 | ```{shell}
565 | grep 2017-07 seasonal/winter.csv | tail -n 1
566 | ```
567 | 
568 | but for **_each_** seasonal file separately. Please use `file` as the name of the loop variable, and remember to loop through the list of files `seasonal/*.csv` (_instead of 'seasonal/winter.csv' as in the example_).
569 | 
570 | `@hint`
571 | The loop body is the grep command shown in the instructions, with `seasonal/winter.csv` replaced by `$file`.
572 | 
573 | `@pre_exercise_code`
574 | ```{python}
575 | 
576 | ```
577 | 
578 | `@solution`
579 | ```{bash}
580 | for file in seasonal/*.csv; do grep 2017-07 $file | tail -n 1; done
581 | ```
582 | 
583 | `@sct`
584 | ```{python}
585 | Ex().multi(
586 |   has_cwd('/home/repl'),
587 |   # Enforce use of for loop, so students can't just use grep -h 2017-07 seasonal/*.csv
588 |   has_code('for', incorrect_msg='Did you call `for`?'),
589 |   check_correct(
590 |     has_expr_output(),
591 |     multi(
592 |       has_code('file', incorrect_msg='Did you use `file` as the loop variable?'),
593 |       has_code('in', incorrect_msg='Did you use `in` before the list of files?'),
594 |       has_code('seasonal/\*', incorrect_msg='Did you specify a list of files with `seasonal/*`?'),
595 |       has_code(r'seasonal\/\*\.csv\s*;', incorrect_msg='Did you put a semi-colon after the list of files?'),
596 |       has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'),
597 |       has_code('grep', incorrect_msg='Did you call `grep`?'),
598 |       has_code('2017-07', incorrect_msg='Did you match on `2017-07`?'),
599 |       has_code(r'\$file', incorrect_msg='Did you refer to the loop variable as `$file`?'),
600 |       has_code(r'file\s*\|', incorrect_msg='Did you use a pipe to connect your second command?'),
601 |       has_code(r'tail\s*-n\s*1', incorrect_msg='Did you use `tail -n 1` to print the last entry of each search in your second command?'),
602 |       has_code('; done', incorrect_msg='Did you finish with `done`?')
603 |     )
604 |   )
605 | )
606 | 
607 | Ex().success_msg("Loopy looping! Wildcards and loops make a powerful combination.")
608 | ```
609 | 
610 | ---
611 | 
612 | ## Why shouldn't I use spaces in filenames?
613 | 
614 | ```yaml
615 | type: PureMultipleChoiceExercise
616 | key: b974b7f45a
617 | xp: 50
618 | ```
619 | 
620 | It's easy and sensible to give files multi-word names like `July 2017.csv`
621 | when you are using a graphical file explorer.
622 | However,
623 | this causes problems when you are working in the shell.
624 | For example,
625 | suppose you wanted to rename `July 2017.csv` to be `2017 July data.csv`.
626 | You cannot type:
627 | 
628 | ```{shell}
629 | mv July 2017.csv 2017 July data.csv
630 | ```
631 | 
632 | because it looks to the shell as though you are trying to move
633 | four files called `July`, `2017.csv`, `2017`, and `July` (again)
634 | into a directory called `data.csv`.
635 | Instead,
636 | you have to quote the files' names
637 | so that the shell treats each one as a single parameter:
638 | 
639 | ```{shell}
640 | mv 'July 2017.csv' '2017 July data.csv'
641 | ```
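
Quoting matters for variables too: when a variable holds a name that contains spaces, writing `"$file"` keeps the name in one piece. A sketch using a scratch directory and the filenames from the example above (the `scratch` and `renamed` variables are made up for the demonstration):

```shell
# Create a file whose name contains a space.
scratch=$(mktemp -d)
touch "$scratch/July 2017.csv"

# Quotes make each multi-word name a single argument to mv.
mv "$scratch/July 2017.csv" "$scratch/2017 July data.csv"

# Double quotes around $file keep the space-containing name intact inside a loop.
for file in "$scratch"/*.csv; do ls "$file"; done

renamed=$(basename "$file")
rm -r "$scratch"
```

Without the quotes around `$file`, `ls` would be handed three separate words and would report that none of them exist.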
642 | 
643 | <hr>
644 | 
645 | If you have two files called `current.csv` and `last year.csv`
646 | (with a space in its name)
647 | and you type:
648 | 
649 | ```{shell}
650 | rm current.csv last year.csv
651 | ```
652 | 
653 | what will happen?
654 | 
655 | `@hint`
656 | What would you think was going to happen if someone showed you the command and you didn't know what files existed?
657 | 
658 | `@possible_answers`
659 | - The shell will print an error message because `last` and `year.csv` do not exist.
660 | - The shell will delete `current.csv`.
661 | - [Both of the above.]
662 | - Nothing.
663 | 
664 | `@feedback`
665 | - Yes, but that's not all.
666 | - Yes, but that's not all.
667 | - Correct. You can use single quotes, `'`, or double quotes, `"`, around the file names.
668 | - Unfortunately not.
669 | 
670 | ---
671 | 
672 | ## How can I do many things in a single loop?
673 | 
674 | ```yaml
675 | type: MultipleChoiceExercise
676 | key: f6d0530991
677 | xp: 50
678 | ```
679 | 
680 | The loops you have seen so far all have a single command or pipeline in their body,
681 | but a loop can contain any number of commands.
682 | To tell the shell where one ends and the next begins,
683 | you must separate them with semi-colons:
684 | 
685 | ```{shell}
686 | for f in seasonal/*.csv; do echo $f; head -n 2 $f | tail -n 1; done
687 | ```
688 | 
689 | ```
690 | seasonal/autumn.csv
691 | 2017-01-05,canine
692 | seasonal/spring.csv
693 | 2017-01-25,wisdom
694 | seasonal/summer.csv
695 | 2017-01-11,canine
696 | seasonal/winter.csv
697 | 2017-01-03,bicuspid
698 | ```
699 | 
700 | <hr>
701 | 
702 | Suppose you forget the semi-colon between the `echo` and `head` commands in the previous loop,
703 | so that you ask the shell to run:
704 | 
705 | ```{shell}
706 | for f in seasonal/*.csv; do echo $f head -n 2 $f | tail -n 1; done
707 | ```
708 | 
709 | What will the shell do?
710 | 
711 | `@possible_answers`
712 | - Print an error message.
713 | - [Print one line for each of the four files.]
714 | - Print one line for `autumn.csv` (the first file).
715 | - Print the last line of each file.
716 | 
717 | `@hint`
718 | You can pipe the output of `echo` to `tail`.
719 | 
720 | `@pre_exercise_code`
721 | ```{python}
722 | 
723 | ```
724 | 
725 | `@sct`
726 | ```{python}
727 | err1 = "No: the loop will run, it just won't do something sensible."
728 | correct2 = "Yes: `echo` produces one line that includes the filename twice, which `tail` then copies."
729 | err3 = "No: the loop runs once for each of the four filenames."
730 | err4 = "No: the input of `tail` is the output of `echo` for each filename."
731 | Ex().has_chosen(2, [err1, correct2, err3, err4])
732 | ```
733 | 


--------------------------------------------------------------------------------
/chapter5.md:
--------------------------------------------------------------------------------
   1 | ---
   2 | title: Creating new tools
   3 | description: >-
   4 |   History lets you repeat things with just a few keystrokes, and pipes let you
   5 |   combine existing commands to create new ones. In this chapter, you will see
   6 |   how to go one step further and create new commands of your own.
   7 | lessons:
   8 |   - nb_of_exercises: 9
   9 |     title: How can I edit a file?
  10 | ---
  11 | 
  12 | ## How can I edit a file?
  13 | 
  14 | ```yaml
  15 | type: ConsoleExercise
  16 | key: 39eee3cfc0
  17 | xp: 100
  18 | ```
  19 | 
  20 | Unix has a bewildering variety of text editors.
  21 | For this course,
  22 | we will use a simple one called Nano.
  23 | If you type `nano filename`,
  24 | it will open `filename` for editing
  25 | (or create it if it doesn't already exist).
  26 | You can move around with the arrow keys,
  27 | delete characters using backspace,
  28 | and do other operations with control-key combinations:
  29 | 
  30 | - `Ctrl` + `K`: delete a line.
  31 | - `Ctrl` + `U`: un-delete a line.
  32 | - `Ctrl` + `O`: save the file ('O' stands for 'output'). _You will also need to press Enter to confirm the filename!_
  33 | - `Ctrl` + `X`: exit the editor.
  34 | 
  35 | `@instructions`
  36 | Run `nano names.txt` to edit a new file in your home directory
  37 | and enter the following four lines:
  38 | 
  39 | ```
  40 | Lovelace
  41 | Hopper
  42 | Johnson
  43 | Wilson
  44 | ```
  45 | 
  46 | To save what you have written,
  47 | type `Ctrl` + `O` to write the file out,
  48 | then Enter to confirm the filename,
  49 | then `Ctrl` + `X` to exit the editor.
  50 | 
  51 | `@hint`
  52 | 
  53 | 
  54 | `@pre_exercise_code`
  55 | ```{python}
  56 | 
  57 | ```
  58 | 
  59 | `@solution`
  60 | ```{shell}
  61 | # This solution uses `cp` instead of `nano`
  62 | # because our automated tests can't edit files interactively.
  63 | cp /solutions/names.txt /home/repl
  64 | ```
  65 | 
  66 | `@sct`
  67 | ```{python}
  68 | patt = "Have you included the line `%s` in the `names.txt` file? Use `nano names.txt` again to update your file. Use `Ctrl` + `O` to save and `Ctrl` + `X` to exit."
  69 | Ex().multi(
  70 |     has_cwd('/home/repl'),
  71 |     check_file('/home/repl/names.txt').multi(
  72 |         has_code(r'Lovelace', incorrect_msg=patt%'Lovelace'),
  73 |         has_code(r'Hopper', incorrect_msg=patt%'Hopper'),
  74 |         has_code(r'Johnson', incorrect_msg=patt%'Johnson'),
  75 |         has_code(r'Wilson', incorrect_msg=patt%'Wilson')
  76 |     )
  77 | )
  78 | Ex().success_msg("Well done! Off to the next one!")
  79 | ```
  80 | 
  81 | ---
  82 | 
  83 | ## How can I record what I just did?
  84 | 
  85 | ```yaml
  86 | type: BulletConsoleExercise
  87 | key: 80c3532985
  88 | xp: 100
  89 | ```
  90 | 
  91 | When you are doing a complex analysis,
  92 | you will often want to keep a record of the commands you used.
  93 | You can do this with the tools you have already seen:
  94 | 
  95 | 1. Run `history`.
  96 | 2. Pipe its output to `tail -n 10` (or however many recent steps you want to save).
  97 | 3. Redirect that to a file called something like `figure-5.history`.
  98 | 
  99 | This is better than writing things down in a lab notebook
 100 | because it is guaranteed not to miss any steps.
 101 | It also illustrates the central idea of the shell:
 102 | simple tools that produce and consume lines of text
 103 | can be combined in a wide variety of ways
 104 | to solve a broad range of problems.
 105 | 
 106 | `@pre_exercise_code`
 107 | ```{python}
 108 | 
 109 | ```
 110 | 
 111 | ***
 112 | 
 113 | ```yaml
 114 | type: ConsoleExercise
 115 | key: 144ca955ca
 116 | xp: 35
 117 | ```
 118 | 
 119 | `@instructions`
 120 | Copy the files `seasonal/spring.csv` and `seasonal/summer.csv` to your home directory.
 121 | 
 122 | `@hint`
 123 | Use `cp` to copy and `~` as a shortcut for the path to your home directory.
 124 | 
 125 | `@solution`
 126 | ```{shell}
 127 | cp seasonal/s* ~
 128 | 
 129 | ```
 130 | 
 131 | `@sct`
 132 | ```{python}
 133 | msg="Have you used `cp seasonal/s* ~` to copy the required files to your home directory?"
 134 | Ex().multi(
 135 |     has_cwd('/home/repl'),
 136 |     check_file('/home/repl/spring.csv', missing_msg=msg).\
 137 |         has_code(r'2017-01-25,wisdom', incorrect_msg=msg),
 138 |     check_file('/home/repl/summer.csv', missing_msg=msg).\
 139 |         has_code(r'2017-01-11,canine', incorrect_msg=msg)
 140 | )
  141 | Ex().success_msg("Remarkable record-keeping! If you mistyped any commands, you can always use `nano` to clean up the saved history file afterwards.")
 142 | 
 143 | ```
 144 | 
 145 | ***
 146 | 
 147 | ```yaml
 148 | type: ConsoleExercise
 149 | key: 09a432e4df
 150 | xp: 35
 151 | ```
 152 | 
 153 | `@instructions`
 154 | Use `grep` with the `-h` flag (to stop it from printing filenames)
 155 | and `-v Tooth` (to select lines that *don't* match the header line)
 156 | to select the data records from `spring.csv` and `summer.csv` in that order
 157 | and redirect the output to `temp.csv`.
 158 | 
 159 | `@hint`
 160 | Put the flags before the filenames.
 161 | 
 162 | `@solution`
 163 | ```{shell}
 164 | grep -h -v Tooth spring.csv summer.csv > temp.csv
 165 | 
 166 | ```
 167 | 
 168 | `@sct`
 169 | ```{python}
 170 | msg1 = "Make sure you redirect the output of the `grep` command to `temp.csv` with `>`!"
 171 | msg2 = "Have you used `grep -h -v ___ ___ ___` (fill in the blanks) to populate `temp.csv`?"
 172 | Ex().multi(
 173 |     has_cwd('/home/repl'),
 174 |     check_file('/home/repl/temp.csv', missing_msg=msg1).multi(
 175 |         has_code(r'2017-08-04,canine', incorrect_msg=msg2),
 176 |         has_code(r'2017-03-14,incisor', incorrect_msg=msg2),
 177 |         has_code(r'2017-03-12,wisdom', incorrect_msg=msg2)
 178 |     )
 179 | )
 180 | 
 181 | ```
 182 | 
 183 | ***
 184 | 
 185 | ```yaml
 186 | type: ConsoleExercise
 187 | key: c40348c1e5
 188 | xp: 30
 189 | ```
 190 | 
 191 | `@instructions`
 192 | Pipe `history` into `tail -n 3`
 193 | and redirect the output to `steps.txt`
 194 | to save the last three commands in a file.
 195 | (You need to save three instead of just two
 196 | because the `history` command itself will be in the list.)
 197 | 
 198 | `@hint`
 199 | Remember that redirection with `>` comes at the end of the sequence of piped commands.
 200 | 
 201 | `@solution`
 202 | ```{shell}
 203 | history | tail -n 3 > steps.txt
 204 | 
 205 | ```
 206 | 
 207 | `@sct`
 208 | ```{python}
 209 | msg1="Make sure to redirect the output of your command to `steps.txt`."
 210 | msg2="Have you used `history | tail ___ ___` (fill in the blanks) to populate `steps.txt`?"
 211 | Ex().multi(
 212 |     has_cwd('/home/repl'),
 213 |     # When run by the validator, solution3 doesn't pass, so including a has_code for that
 214 |     check_or(
 215 |         check_file('/home/repl/steps.txt', missing_msg=msg1).multi(
 216 |             has_code(r'\s+1\s+', incorrect_msg=msg2),
 217 |             has_code(r'\s+3\s+history', incorrect_msg=msg2)
 218 |         ),
  219 |         has_code(r'history\s+\|\s+tail\s+-n\s+3\s+>\s+steps\.txt')
 220 |     )
 221 | )
 222 | Ex().success_msg("Well done! Let's step it up!")
 223 | 
 224 | ```
 225 | 
 226 | ---
 227 | 
 228 | ## How can I save commands to re-run later?
 229 | 
 230 | ```yaml
 231 | type: BulletConsoleExercise
 232 | key: 4507a0dbd8
 233 | xp: 100
 234 | ```
 235 | 
 236 | You have been using the shell interactively so far.
 237 | But since the commands you type in are just text,
 238 | you can store them in files for the shell to run over and over again.
 239 | To start exploring this powerful capability,
 240 | put the following command in a file called `headers.sh`:
 241 | 
 242 | ```{shell}
 243 | head -n 1 seasonal/*.csv
 244 | ```
 245 | 
 246 | This command selects the first row from each of the CSV files in the `seasonal` directory.
 247 | Once you have created this file,
 248 | you can run it by typing:
 249 | 
 250 | ```{shell}
 251 | bash headers.sh
 252 | ```
 253 | 
 254 | This tells the shell (which is just a program called `bash`)
 255 | to run the commands contained in the file `headers.sh`,
 256 | which produces the same output as running the commands directly.
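
A minimal sketch of this workflow, run in a scratch directory so that it does not touch the course files. The sample CSV contents and the variable names (`demo`, `from_script`, `typed_directly`) are made up for illustration; only `headers.sh` and the `head` command come from the text above.

```shell
# Work in a throwaway directory with hypothetical sample data.
demo=$(mktemp -d)
cd "$demo"
mkdir seasonal
printf 'Date,Tooth\n2017-01-25,wisdom\n' > seasonal/spring.csv
printf 'Date,Tooth\n2017-01-11,canine\n' > seasonal/summer.csv

# Single quotes keep the wildcard literal, so the script stores the command as typed.
echo 'head -n 1 seasonal/*.csv' > headers.sh

# Running the script and typing the command directly give the same output.
from_script=$(bash headers.sh)
typed_directly=$(head -n 1 seasonal/*.csv)
echo "$from_script"
```

Because the wildcard is stored unexpanded, it is expanded each time the script runs, so the script picks up any CSV files added to `seasonal` later.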
 257 | 
 258 | `@pre_exercise_code`
 259 | ```{python}
 260 | 
 261 | ```
 262 | 
 263 | ***
 264 | 
 265 | ```yaml
 266 | type: ConsoleExercise
 267 | key: 316ad2fec6
 268 | xp: 50
 269 | ```
 270 | 
 271 | `@instructions`
 272 | Use `nano dates.sh` to create a file called `dates.sh`
 273 | that contains this command:
 274 | 
 275 | ```{shell}
 276 | cut -d , -f 1 seasonal/*.csv
 277 | ```
 278 | 
 279 | to extract the first column from all of the CSV files in `seasonal`.
 280 | 
 281 | `@hint`
 282 | Put the commands shown into the file without extra blank lines or spaces.
 283 | 
 284 | `@solution`
 285 | ```{shell}
 286 | # This solution uses `cp` instead of `nano`
 287 | # because our automated tests can't edit files interactively.
 288 | cp /solutions/dates.sh ~
 289 | 
 290 | ```
 291 | 
 292 | `@sct`
 293 | ```{python}
 294 | msg = "Have you included the line `cut -d , -f 1 seasonal/*.csv` in the `dates.sh` file? Use `nano dates.sh` again to update your file. Use `Ctrl` + `O` to save and `Ctrl` + `X` to exit."
 295 | Ex().multi(
 296 |     has_cwd('/home/repl'),
 297 |     check_file('/home/repl/dates.sh').\
 298 |         has_code('cut -d *, *-f +1 +seasonal\/\*\.csv', incorrect_msg=msg)
 299 | )
 300 | 
 301 | ```
 302 | 
 303 | ***
 304 | 
 305 | ```yaml
 306 | type: ConsoleExercise
 307 | key: 30a8fa953e
 308 | xp: 50
 309 | ```
 310 | 
 311 | `@instructions`
 312 | Use `bash` to run the file `dates.sh`.
 313 | 
 314 | `@hint`
 315 | Use `bash filename` to run the file.
 316 | 
 317 | `@solution`
 318 | ```{shell}
 319 | bash dates.sh
 320 | 
 321 | ```
 322 | 
 323 | `@sct`
 324 | ```{python}
 325 | Ex().multi(
 326 |   has_cwd('/home/repl'),
 327 |   check_correct(
 328 |     has_expr_output(),
 329 |     multi(
 330 |       has_code("bash", incorrect_msg = 'Did you call `bash`?'),
 331 |       has_code("dates.sh", incorrect_msg = 'Did you specify the `dates.sh` file?')
 332 |     )
 333 |   )
 334 | )
 335 | 
 336 | ```
 337 | 
 338 | ---
 339 | 
 340 | ## How can I re-use pipes?
 341 | 
 342 | ```yaml
 343 | type: BulletConsoleExercise
 344 | key: da13667750
 345 | xp: 100
 346 | ```
 347 | 
  348 | A file full of shell commands is called a **shell script**,
 349 | or sometimes just a "script" for short. Scripts don't have to have names ending in `.sh`,
 350 | but this lesson will use that convention
 351 | to help you keep track of which files are scripts.
 352 | 
 353 | Scripts can also contain pipes.
 354 | For example,
 355 | if `all-dates.sh` contains this line:
 356 | 
 357 | ```{shell}
 358 | cut -d , -f 1 seasonal/*.csv | grep -v Date | sort | uniq
 359 | ```
 360 | 
 361 | then:
 362 | 
 363 | ```{shell}
 364 | bash all-dates.sh > dates.out
 365 | ```
 366 | 
 367 | will extract the unique dates from the seasonal data files
 368 | and save them in `dates.out`.
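
The example above can be tried end to end with a sketch like the following. It assumes a scratch directory and invented sample records rather than the real `seasonal` files; `all-dates.sh` and `dates.out` are the names used in the text.

```shell
# Hypothetical sample data in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
mkdir seasonal
printf 'Date,Tooth\n2017-02-01,canine\n2017-01-05,molar\n' > seasonal/spring.csv
printf 'Date,Tooth\n2017-01-05,incisor\n2017-03-09,wisdom\n' > seasonal/summer.csv

# Single quotes keep the pipes and the wildcard literal so they end up inside the script.
echo 'cut -d , -f 1 seasonal/*.csv | grep -v Date | sort | uniq' > all-dates.sh

# The redirection applies to the output of the whole script, not just its last command.
bash all-dates.sh > dates.out
cat dates.out
```

The date `2017-01-05` occurs in both sample files but appears only once in `dates.out`, because `sort` places the duplicates next to each other and `uniq` then collapses them.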
 369 | 
 370 | `@pre_exercise_code`
 371 | ```{python}
 372 | import shutil
 373 | shutil.copyfile('/solutions/teeth-start.sh', 'teeth.sh')
 374 | ```
 375 | 
 376 | ***
 377 | 
 378 | ```yaml
 379 | type: ConsoleExercise
 380 | key: 6fae90f320
 381 | xp: 35
 382 | ```
 383 | 
 384 | `@instructions`
 385 | A file `teeth.sh` in your home directory has been prepared for you, but contains some blanks.
 386 | Use Nano to edit the file and replace the two `____` placeholders
 387 | with `seasonal/*.csv` and `-c` so that this script prints a count of the
 388 | number of times each tooth name appears in the CSV files in the `seasonal` directory.
 389 | 
 390 | `@hint`
 391 | Use `nano teeth.sh` to edit the file.
 392 | 
 393 | `@solution`
 394 | ```{shell}
 395 | # This solution uses `cp` instead of `nano`
 396 | # because our automated tests can't edit files interactively.
 397 | cp /solutions/teeth.sh ~
 398 | 
 399 | ```
 400 | 
 401 | `@sct`
 402 | ```{python}
  403 | msg="Have you replaced the blanks properly so the command in `teeth.sh` reads `cut -d , -f 2 seasonal/*.csv | grep -v Tooth | sort | uniq -c`? Use `nano teeth.sh` again to make the required changes."
 404 | Ex().multi(
 405 |     has_cwd('/home/repl'),
 406 |     check_file('/home/repl/teeth.sh').\
 407 |         has_code(r'cut\s+-d\s+,\s+-f\s+2\s+seasonal/\*\.csv\s+\|\s+grep\s+-v\s+Tooth\s+\|\s+sort\s+\|\s+uniq\s+-c', incorrect_msg=msg)
 408 | )
 409 | 
 410 | ```
 411 | 
 412 | ***
 413 | 
 414 | ```yaml
 415 | type: ConsoleExercise
 416 | key: dcfccb51e2
 417 | xp: 35
 418 | ```
 419 | 
 420 | `@instructions`
 421 | Use `bash` to run `teeth.sh` and `>` to redirect its output to `teeth.out`.
 422 | 
 423 | `@hint`
 424 | Remember that `> teeth.out` must come *after* the command that is producing output.
 425 | 
 426 | `@solution`
 427 | ```{shell}
 428 | # We need to use 'cp' below to satisfy our automated tests.
 429 | # You should only use the last line that runs 'bash'.
 430 | cp /solutions/teeth.sh .
 431 | bash teeth.sh > teeth.out
 432 | 
 433 | ```
 434 | 
 435 | `@sct`
 436 | ```{python}
 437 | msg="Have you correctly redirected the result of `bash teeth.sh` to `teeth.out` with the `>`?"
 438 | Ex().multi(
 439 |   has_cwd('/home/repl'),
 440 |   check_correct(
 441 |     check_file('/home/repl/teeth.out').multi(
 442 |       has_code(r'31 canine', incorrect_msg=msg),
 443 |       has_code(r'17 wisdom', incorrect_msg=msg)
 444 |     ),
 445 |     multi(
 446 |       has_code("bash", incorrect_msg = 'Did you call `bash`?'),
 447 |       has_code("bash\s+teeth.sh", incorrect_msg = 'Did you run the `teeth.sh` file?'),
 448 |       has_code(">\s+teeth.out", incorrect_msg = 'Did you redirect to the `teeth.out` file?')
 449 |     )
 450 |   )
 451 | )
 452 | 
 453 | ```
 454 | 
 455 | ***
 456 | 
 457 | ```yaml
 458 | type: ConsoleExercise
 459 | key: c8c9a11e3c
 460 | xp: 30
 461 | ```
 462 | 
 463 | `@instructions`
 464 | Run `cat teeth.out` to inspect your results.
 465 | 
 466 | `@hint`
 467 | Remember, you can type the first few characters of a filename and then press the tab key to auto-complete.
 468 | 
 469 | `@solution`
 470 | ```{shell}
 471 | cat teeth.out
 472 | 
 473 | ```
 474 | 
 475 | `@sct`
 476 | ```{python}
 477 | Ex().multi(
 478 |   has_cwd('/home/repl'),
 479 |   check_correct(
 480 |     has_expr_output(),
 481 |     multi(
 482 |       has_code("cat", incorrect_msg = 'Did you call `cat`?'),
 483 |       has_code("teeth.out", incorrect_msg = 'Did you specify the `teeth.out` file?')
 484 |     )
 485 |   )
 486 | )
  487 | Ex().success_msg("Nice! This all may feel contrived at first, but the nice thing is that you are automating parts of your workflow step by step, which comes in really handy as a data scientist!")
 488 | 
 489 | ```
 490 | 
 491 | ---
 492 | 
 493 | ## How can I pass filenames to scripts?
 494 | 
 495 | ```yaml
 496 | type: BulletConsoleExercise
 497 | key: c2623b9c14
 498 | xp: 100
 499 | ```
 500 | 
 501 | A script that processes specific files is useful as a record of what you did, but one that allows you to process any files you want is more useful.
 502 | To support this,
 503 | you can use the special expression `$@` (dollar sign immediately followed by at-sign)
 504 | to mean "all of the command-line parameters given to the script".
 505 | 
 506 | For example, if `unique-lines.sh` contains `sort $@ | uniq`, when you run:
 507 | 
 508 | ```{shell}
 509 | bash unique-lines.sh seasonal/summer.csv
 510 | ```
 511 | 
 512 | the shell replaces `$@` with `seasonal/summer.csv` and processes one file. If you run this:
 513 | 
 514 | ```{shell}
 515 | bash unique-lines.sh seasonal/summer.csv seasonal/autumn.csv
 516 | ```
 517 | 
 518 | it processes two data files, and so on.
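
A small sketch of `$@` in action, assuming two invented text files (`first.txt` and `second.txt`) in a scratch directory; `unique-lines.sh` is the script described above.

```shell
# Hypothetical input files in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
printf 'b\na\nb\n' > first.txt
printf 'c\na\n' > second.txt

# Single quotes stop the current shell from expanding $@ before it reaches the file.
echo 'sort $@ | uniq' > unique-lines.sh

# $@ becomes whatever filenames follow the script name.
one_file=$(bash unique-lines.sh first.txt)
two_files=$(bash unique-lines.sh first.txt second.txt)
echo "$one_file"
echo "$two_files"
```

One caveat: an unquoted `$@` splits filenames that contain spaces into separate words. Writing `"$@"` (with double quotes) inside the script avoids that, although the course's exercises use the unquoted form.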
 519 | 
 520 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._
 521 | 
 522 | `@pre_exercise_code`
 523 | ```{python}
 524 | import shutil
 525 | shutil.copyfile('/solutions/count-records-start.sh', 'count-records.sh')
 526 | ```
 527 | 
 528 | ***
 529 | 
 530 | ```yaml
 531 | type: ConsoleExercise
 532 | key: 7a893623af
 533 | xp: 50
 534 | ```
 535 | 
 536 | `@instructions`
 537 | Edit the script `count-records.sh` with Nano and fill in the two `____` placeholders
 538 | with `$@` and `-l` (_the letter_) respectively so that it counts the number of lines in one or more files,
 539 | excluding the first line of each.
 540 | 
 541 | `@hint`
  542 | * Use `nano count-records.sh` to edit the file.
 543 | * Make sure you are specifying the _letter_ `-l`, and not the number one.
 544 | 
 545 | `@solution`
 546 | ```{shell}
 547 | # This solution uses `cp` instead of `nano`
 548 | # because our automated tests can't edit files interactively.
 549 | cp /solutions/count-records.sh ~
 550 | 
 551 | ```
 552 | 
 553 | `@sct`
 554 | ```{python}
  555 | msg="Have you replaced the blanks properly so the command in `count-records.sh` reads `tail -q -n +2 $@ | wc -l`? Use `nano count-records.sh` again to make the required changes."
 556 | Ex().multi(
 557 |     has_cwd('/home/repl'),
 558 |     check_file('/home/repl/count-records.sh').\
 559 |         has_code('tail\s+-q\s+-n\s+\+2\s+\$\@\s+\|\s+wc\s+-l', incorrect_msg=msg)
 560 | )
 561 | 
 562 | ```
 563 | 
 564 | ***
 565 | 
 566 | ```yaml
 567 | type: ConsoleExercise
 568 | key: d0da324516
 569 | xp: 50
 570 | ```
 571 | 
 572 | `@instructions`
 573 | Run `count-records.sh` on `seasonal/*.csv`
 574 | and redirect the output to `num-records.out` using `>`.
 575 | 
 576 | `@hint`
 577 | Use `>` to redirect the output.
 578 | 
 579 | `@solution`
 580 | ```{shell}
 581 | bash count-records.sh seasonal/*.csv > num-records.out
 582 | 
 583 | ```
 584 | 
 585 | `@sct`
 586 | ```{python}
 587 | Ex().multi(
 588 |   has_cwd('/home/repl'),
 589 |   check_correct(
 590 |     check_file('/home/repl/num-records.out').has_code(r'92'),
 591 |     multi(
 592 |       has_code("bash", incorrect_msg = 'Did you call `bash`?'),
 593 |       has_code("bash\s+count-records.sh", incorrect_msg = 'Did you run the `count-records.sh` file?'),
 594 |       has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'),
 595 |       has_code(">\s+num-records.out", incorrect_msg = 'Did you redirect to the `num-records.out` file?')
 596 |     )
 597 |   )
 598 | )
 599 | Ex().success_msg("A job well done! Your shell power is ever-expanding!")
 600 | 
 601 | ```
 602 | 
 603 | ---
 604 | 
 605 | ## How can I process a single argument?
 606 | 
 607 | ```yaml
 608 | type: PureMultipleChoiceExercise
 609 | key: 4092cb4cda
 610 | xp: 50
 611 | ```
 612 | 
 613 | As well as `$@`,
 614 | the shell lets you use `$1`, `$2`, and so on to refer to specific command-line parameters.
 615 | You can use this to write commands that feel simpler or more natural than the shell's.
 616 | For example,
 617 | you can create a script called `column.sh` that selects a single column from a CSV file
 618 | when the user provides the filename as the first parameter and the column as the second:
 619 | 
 620 | ```{shell}
 621 | cut -d , -f $2 $1
 622 | ```
 623 | 
 624 | and then run it using:
 625 | 
 626 | ```{shell}
 627 | bash column.sh seasonal/autumn.csv 1
 628 | ```
 629 | 
 630 | Notice how the script uses the two parameters in reverse order.
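
To see the positional parameters at work, here is a sketch that builds `column.sh` in a scratch directory and runs it against an invented `autumn.csv`; the sample rows are assumptions, not the course data.

```shell
# Hypothetical sample file in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
printf 'Date,Tooth\n2017-01-05,canine\n2017-01-17,wisdom\n' > autumn.csv

# $1 is the first parameter (the filename); $2 is the second (the column number).
echo 'cut -d , -f $2 $1' > column.sh

first_column=$(bash column.sh autumn.csv 1)
second_column=$(bash column.sh autumn.csv 2)
echo "$first_column"
echo "$second_column"
```

The caller always gives the filename first and the column second; the script is free to hand them to `cut` in whatever order `cut` expects.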
 631 | 
 632 | <hr>
 633 | 
 634 | The script `get-field.sh` is supposed to take a filename,
 635 | the number of the row to select,
 636 | the number of the column to select,
 637 | and print just that field from a CSV file.
 638 | For example:
 639 | 
 640 | ```
 641 | bash get-field.sh seasonal/summer.csv 4 2
 642 | ```
 643 | 
 644 | should select the second field from line 4 of `seasonal/summer.csv`.
 645 | Which of the following commands should be put in `get-field.sh` to do that?
 646 | 
 647 | `@hint`
 648 | Remember that command-line parameters are numbered left to right.
 649 | 
 650 | `@possible_answers`
 651 | - `head -n $1 $2 | tail -n 1 | cut -d , -f $3`
 652 | - [`head -n $2 $1 | tail -n 1 | cut -d , -f $3`]
 653 | - `head -n $3 $1 | tail -n 1 | cut -d , -f $2`
 654 | - `head -n $2 $3 | tail -n 1 | cut -d , -f $1`
 655 | 
 656 | `@feedback`
 657 | - No: that will try to use the filename as the number of lines to select with `head`.
 658 | - Correct!
 659 | - No: that will try to use the column number as the line number and vice versa.
 660 | - No: that will use the field number as the filename and vice versa.
 661 | 
 662 | ---
 663 | 
 664 | ## How can one shell script do many things?
 665 | 
 666 | ```yaml
 667 | type: TabConsoleExercise
 668 | key: 846bc70e9d
 669 | xp: 100
 670 | ```
 671 | 
  672 | Our shell scripts so far have had a single command or pipe, but a script can contain many lines of commands. For example, you can create one that tells you how many records are in the shortest and longest of your data files, i.e., the range of your datasets' lengths.
 673 | 
  674 | Note that in Nano, "copy and paste" is achieved by navigating to the line you want to copy, pressing `Ctrl` + `K` to cut the line, then `Ctrl` + `U` twice to paste two copies of it.
 675 | 
 676 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._
 677 | 
 678 | `@pre_exercise_code`
 679 | ```{python}
 680 | import shutil
 681 | shutil.copyfile('/solutions/range-start-1.sh', 'range.sh')
 682 | ```
 683 | 
 684 | ***
 685 | 
 686 | ```yaml
 687 | type: ConsoleExercise
 688 | key: a1e55487fb
 689 | xp: 25
 690 | ```
 691 | 
 692 | `@instructions`
 693 | Use Nano to edit the script `range.sh`
 694 | and replace the two `____` placeholders
 695 | with `$@` and `-v`
 696 | so that it lists the names and number of lines in all of the files given on the command line
 697 | *without* showing the total number of lines in all files.
 698 | (Do not try to subtract the column header lines from the files.)
 699 | 
 700 | `@hint`
 701 | Use `wc -l $@` to count lines in all the files given on the command line.
 702 | 
 703 | `@solution`
 704 | ```{shell}
 705 | # This solution uses `cp` instead of `nano`
 706 | # because our automated tests can't edit files interactively.
 707 | cp /solutions/range-1.sh range.sh
 708 | 
 709 | ```
 710 | 
 711 | `@sct`
 712 | ```{python}
  713 | msg="Have you replaced the blanks properly so the command in `range.sh` reads `wc -l $@ | grep -v total`? Use `nano range.sh` again to make the required changes."
 714 | Ex().multi(
 715 |     has_cwd('/home/repl'),
 716 |     check_file('/home/repl/range.sh').\
 717 |         has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total', incorrect_msg=msg)
 718 | )
 719 | 
 720 | ```
 721 | 
 722 | ***
 723 | 
 724 | ```yaml
 725 | type: ConsoleExercise
 726 | key: e8ece27fe7
 727 | xp: 25
 728 | ```
 729 | 
 730 | `@instructions`
 731 | Use Nano again to add `sort -n` and `head -n 1` in that order
 732 | to the pipeline in `range.sh`
 733 | to display the name and line count of the shortest file given to it.
 734 | 
 735 | `@hint`
 736 | 
 737 | 
 738 | `@solution`
 739 | ```{shell}
 740 | # This solution uses `cp` instead of `nano`
 741 | # because our automated tests can't edit files interactively.
 742 | cp /solutions/range-2.sh range.sh
 743 | 
 744 | ```
 745 | 
 746 | `@sct`
 747 | ```{python}
 748 | msg="Have you added `sort -n` and `head -n 1` with pipes to the `range.sh` file? Use `nano range.sh` again to make the required changes."
 749 | Ex().multi(
 750 |     has_cwd('/home/repl'),
 751 |     check_file('/home/repl/range.sh').\
  752 |         has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total\s+\|\s+sort\s+-n\s+\|\s+head\s+-n\s+1', incorrect_msg=msg)
 753 | )
 754 | 
 755 | ```
 756 | 
 757 | ***
 758 | 
 759 | ```yaml
 760 | type: ConsoleExercise
 761 | key: a3b36a746e
 762 | xp: 25
 763 | ```
 764 | 
 765 | `@instructions`
 766 | Again using Nano, add a second line to `range.sh` to print the name and record count of
 767 | the *longest* file in the directory *as well as* the shortest.
 768 | This line should be a duplicate of the one you have already written,
 769 | but with `sort -n -r` rather than `sort -n`.
 770 | 
 771 | `@hint`
 772 | Copy the first line and modify the sorting order.
 773 | 
 774 | `@solution`
 775 | ```{shell}
 776 | # This solution uses `cp` instead of `nano`
 777 | # because our automated tests can't edit files interactively.
 778 | cp /solutions/range-3.sh range.sh
 779 | 
 780 | ```
 781 | 
 782 | `@sct`
 783 | ```{python}
 784 | msg1="Keep the first line in the `range.sh` file: `wc -l $@ | grep -v total | sort -n | head -n 1`"
 785 | msg2="Have you duplicated the first line in `range.sh` and made a small change? `sort -n -r` instead of `sort -n`!"
 786 | Ex().multi(
 787 |     has_cwd('/home/repl'),
 788 |     check_file('/home/repl/range.sh').multi(
 789 |         has_code("wc -l $@ | grep -v total | sort -n | head -n 1", fixed=True, incorrect_msg = msg1),
  790 |         has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total\s+\|\s+sort\s+-n\s+-r\s+\|\s+head\s+-n\s+1', incorrect_msg=msg2)
 791 |     )
 792 | )
 793 | 
 794 | ```
 795 | 
 796 | ***
 797 | 
 798 | ```yaml
 799 | type: ConsoleExercise
 800 | key: cba93a77c3
 801 | xp: 25
 802 | ```
 803 | 
 804 | `@instructions`
 805 | Run the script on the files in the `seasonal` directory
 806 | using `seasonal/*.csv` to match all of the files
 807 | and redirect the output using `>`
 808 | to a file called `range.out` in your home directory.
 809 | 
 810 | `@hint`
 811 | Use `bash range.sh` to run your script, `seasonal/*.csv` to specify files, and `> range.out` to redirect the output.
 812 | 
 813 | `@solution`
 814 | ```{shell}
 815 | bash range.sh seasonal/*.csv > range.out
 816 | 
 817 | ```
 818 | 
 819 | `@sct`
 820 | ```{python}
 821 | msg="Have you correctly redirected the result of `bash range.sh seasonal/*.csv` to `range.out` with the `>`?"
 822 | Ex().multi(
  823 |   has_cwd('/home/repl'),
  824 |   multi(
  825 |     has_code("bash", incorrect_msg = 'Did you call `bash`?'),
  826 |     has_code("bash\s+range.sh", incorrect_msg = 'Did you run the `range.sh` file?'),
  827 |     has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'),
  828 |     has_code(">\s+range.out", incorrect_msg = 'Did you redirect to the `range.out` file?')
  829 |   )
 830 | )
 831 | 
 832 | Ex().success_msg("This is going well. Head over to the next exercise to learn about writing loops!")
 833 | 
 834 | ```
 835 | 
 836 | ---
 837 | 
 838 | ## How can I write loops in a shell script?
 839 | 
 840 | ```yaml
 841 | type: BulletConsoleExercise
 842 | key: 6be8ca6009
 843 | xp: 100
 844 | ```
 845 | 
 846 | Shell scripts can also contain loops. You can write them using semi-colons, or split them across lines without semi-colons to make them more readable:
 847 | 
 848 | ```{shell}
 849 | # Print the first and last data records of each file.
 850 | for filename in $@
 851 | do
 852 |     head -n 2 $filename | tail -n 1
 853 |     tail -n 1 $filename
 854 | done
 855 | ```
 856 | 
 857 | (You don't have to indent the commands inside the loop, but doing so makes things clearer.)
 858 | 
 859 | The first line of this script is a **comment** to tell readers what the script does. Comments start with the `#` character and run to the end of the line. Your future self will thank you for adding brief explanations like the one shown here to every script you write.
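
The script above shows only the multi-line form. As a sketch of how the semi-colon form compares, the following writes the same loop both ways and runs each on two invented CSV files; the filenames `multi-line.sh`, `one-line.sh`, `one.csv`, and `two.csv` are hypothetical.

```shell
# Hypothetical sample data in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
printf 'Date,Tooth\n2017-01-05,canine\n2017-02-11,molar\n2017-03-20,wisdom\n' > one.csv
printf 'Date,Tooth\n2017-04-02,incisor\n2017-05-30,canine\n' > two.csv

# Multi-line form, as in the script above (the quoted 'EOF' keeps $@ and $filename literal).
cat > multi-line.sh << 'EOF'
# Print the first and last data records of each file.
for filename in $@
do
    head -n 2 $filename | tail -n 1
    tail -n 1 $filename
done
EOF

# One-line form: semi-colons take the place of the line breaks.
echo 'for filename in $@; do head -n 2 $filename | tail -n 1; tail -n 1 $filename; done' > one-line.sh

multi=$(bash multi-line.sh one.csv two.csv)
single=$(bash one-line.sh one.csv two.csv)
echo "$multi"
```

Both scripts print the same four records. Note that there is no semi-colon directly after `do`.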
 860 | 
 861 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._
 862 | 
 863 | `@pre_exercise_code`
 864 | ```{python}
 865 | import shutil
 866 | shutil.copyfile('/solutions/date-range-start.sh', '/home/repl/date-range.sh')
 867 | ```
 868 | 
 869 | ***
 870 | 
 871 | ```yaml
 872 | type: ConsoleExercise
 873 | key: 8ca2adb6c4
 874 | xp: 35
 875 | ```
 876 | 
 877 | `@instructions`
 878 | Fill in the placeholders in the script `date-range.sh`
 879 | with `$filename` (twice), `head`, and `tail`
 880 | so that it prints the first and last date from one or more files.
 881 | 
 882 | `@hint`
 883 | Remember to use `$filename` to get the current value of the loop variable.
 884 | 
 885 | `@solution`
 886 | ```{shell}
 887 | # This solution uses `cp` instead of `nano`
 888 | # because our automated tests can't edit files interactively.
 889 | cp /solutions/date-range.sh date-range.sh
 890 | 
 891 | ```
 892 | 
 893 | `@sct`
 894 | ```{python}
 895 | msgpatt="In `date-range.sh`, have you changed the %s line in the loop to be `%s`? Use `nano date-range.sh` to make changes."
 896 | cmdpatt = 'cut -d , -f 1 $filename | grep -v Date | sort | %s -n 1'
 897 | msg1=msgpatt%('first', cmdpatt%'head')
 898 | msg2=msgpatt%('second', cmdpatt%'tail')
 899 | patt='cut\s+-d\s+,\s+-f\s+1\s+\$filename\s+\|\s+grep\s+-v\s+Date\s+\|\s+sort\s+\|\s+%s\s+-n\s+1'
 900 | patt1 = patt%'head'
 901 | patt2 = patt%'tail'
 902 | Ex().multi(
 903 |     has_cwd('/home/repl'),
 904 |     check_file('/home/repl/date-range.sh').multi(
 905 |         has_code(patt1, incorrect_msg=msg1),
 906 |         has_code(patt2, incorrect_msg=msg2)
 907 |     )
 908 | )
 909 | 
 910 | ```
 911 | 
 912 | ***
 913 | 
 914 | ```yaml
 915 | type: ConsoleExercise
 916 | key: ec1271356d
 917 | xp: 35
 918 | ```
 919 | 
 920 | `@instructions`
 921 | Run `date-range.sh` on all four of the seasonal data files
 922 | using `seasonal/*.csv` to match their names.
 923 | 
 924 | `@hint`
 925 | The wildcard expression should start with the directory name.
 926 | 
 927 | `@solution`
 928 | ```{shell}
 929 | bash date-range.sh seasonal/*.csv
 930 | 
 931 | ```
 932 | 
 933 | `@sct`
 934 | ```{python}
 935 | Ex().multi(
 936 |   has_cwd('/home/repl'),
 937 |   check_correct(
 938 |     has_expr_output(),
 939 |     multi(
 940 |       has_code("bash", incorrect_msg = 'Did you call `bash`?'),
 941 |       has_code("bash\s+date-range.sh", incorrect_msg = 'Did you run the `date-range.sh` file?'),
 942 |       has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?')
 943 |     )
 944 |   )
 945 | )
 946 | 
 947 | ```
 948 | 
 949 | ***
 950 | 
 951 | ```yaml
 952 | type: ConsoleExercise
 953 | key: 0323c7d68d
 954 | xp: 30
 955 | ```
 956 | 
 957 | `@instructions`
 958 | Run `date-range.sh` on all four of the seasonal data files using `seasonal/*.csv` to match their names,
 959 | and pipe its output to `sort` to see that your scripts can be used just like Unix's built-in commands.
 960 | 
 961 | `@hint`
 962 | Use the same wildcard expression you used earlier.
 963 | 
 964 | `@solution`
 965 | ```{shell}
 966 | bash date-range.sh seasonal/*.csv | sort
 967 | 
 968 | ```
 969 | 
 970 | `@sct`
 971 | ```{python}
 972 | Ex().multi(
 973 |   has_cwd('/home/repl'),
 974 |   check_correct(
 975 |     has_expr_output(),
 976 |     multi(
 977 |       has_code("bash", incorrect_msg = 'Did you call `bash`?'),
 978 |       has_code("bash\s+date-range.sh", incorrect_msg = 'Did you run the `date-range.sh` file?'),
 979 |       has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'),
  980 |       has_code("|", fixed=True, incorrect_msg = 'Did you pipe from the script output to `sort`?'),
 981 |       has_code("sort", incorrect_msg = 'Did you call `sort`?')
 982 |     )
 983 |   )
 984 | )
 985 | Ex().success_msg("Magic! Notice how composable all the things we've learned are.")
 986 | 
 987 | ```
 988 | 
 989 | ---
 990 | 
 991 | ## What happens when I don't provide filenames?
 992 | 
 993 | ```yaml
 994 | type: MultipleChoiceExercise
 995 | key: 8a162c4d54
 996 | xp: 50
 997 | ```
 998 | 
 999 | A common mistake in shell scripts (and interactive commands) is to put filenames in the wrong place.
1000 | If you type:
1001 | 
1002 | ```{shell}
1003 | tail -n 3
1004 | ```
1005 | 
1006 | then since `tail` hasn't been given any filenames,
1007 | it waits to read input from your keyboard.
1008 | This means that if you type:
1009 | 
1010 | ```{shell}
1011 | head -n 5 | tail -n 3 somefile.txt
1012 | ```
1013 | 
1014 | then `tail` goes ahead and prints the last three lines of `somefile.txt`,
1015 | but `head` waits forever for keyboard input,
1016 | since it wasn't given a filename and there isn't anything ahead of it in the pipeline.
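
The difference between reading a named file and reading standard input can be sketched without any keyboard interaction. The sample text and the variable names `from_pipe` and `from_empty` are made up for illustration.

```shell
# With no filename, tail reads standard input; here a pipe supplies it.
from_pipe=$(printf 'a\nb\nc\nd\n' | tail -n 3)
echo "$from_pipe"

# Redirecting from /dev/null gives head an empty input,
# so it finishes immediately instead of waiting for the keyboard.
from_empty=$(head -n 5 < /dev/null)
echo "characters read from empty input: ${#from_empty}"
```

In an interactive shell, a command that reads the keyboard keeps waiting until its input ends, which is why a command given no filename and nothing upstream appears to hang.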
1017 | 
1018 | <hr>
1019 | 
1020 | Suppose you do accidentally type:
1021 | 
1022 | ```{shell}
1023 | head -n 5 | tail -n 3 somefile.txt
1024 | ```
1025 | 
1026 | What should you do next?
1027 | 
1028 | `@possible_answers`
1029 | - Wait 10 seconds for `head` to time out.
1030 | - Type `somefile.txt` and press Enter to give `head` some input.
1031 | - Use `Ctrl` + `C` to stop the running `head` program.
1032 | 
1033 | `@hint`
1034 | What does `head` do if it doesn't have a filename and nothing is upstream from it?
1035 | 
1036 | `@pre_exercise_code`
1037 | ```{python}
1038 | 
1039 | ```
1040 | 
1041 | `@sct`
1042 | ```{python}
1043 | a1 = 'No, commands will not time out.'
1044 | a2 = 'No, that will give `head` the text `somefile.txt` to process, but then it will hang, waiting for still more input.'
1045 | a3 = "Yes! You should use `Ctrl` + `C` to stop a running program. This concludes this introductory course! If you're interested in learning more command-line tools, we thoroughly recommend taking our free intro to Git course!"
1046 | Ex().has_chosen(3, [a1, a2, a3])
1047 | ```
1048 | 


--------------------------------------------------------------------------------
/course.yml:
--------------------------------------------------------------------------------
 1 | title: Introduction to Shell
 2 | description: >-
 3 |   The Unix command line has survived and thrived for almost 50 years because it
 4 |   lets people do complex things with just a few keystrokes. Sometimes called
 5 |   "the universal glue of programming," it helps users combine existing programs
 6 |   in new ways, automate repetitive tasks, and run programs on clusters and
 7 |   clouds that may be halfway around the world. This course will introduce its
 8 |   key elements and show you how to use them efficiently.
 9 | programming_language: shell
10 | from: 'shell-base-prod:v1.1.3'
11 | 


--------------------------------------------------------------------------------
/datasets/filesys.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/datasets/filesys.zip


--------------------------------------------------------------------------------
/datasets/solutions.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/datasets/solutions.zip


--------------------------------------------------------------------------------
/design/concept.dot:
--------------------------------------------------------------------------------
 1 | digraph conda_concepts {
 2 |     node [shape = rectangle];
 3 |     history
 4 |     pipe
 5 |     loop
 6 |     command
 7 |     shell
 8 |     program
 9 |     script
10 |     file
11 |     directory
12 |     filesystem
13 |     path
14 |     wildcard
15 | 
16 |     {
17 |       rank=same
18 |       rankdir=LR
19 |       pipe
20 |       loop
21 |       script
22 |       program
23 |       filesystem
24 |     }
25 | 
26 |     i01 [shape=point, width=0, height=0]
27 |     i02 [shape=point, width=0, height=0]
28 | 
29 |     loop -> script [style="invis"]
30 | 
31 |     shell -> command [label="runs"]
32 |     shell -> program [label="runs"]
33 |     shell -> history [label="records"]
34 |     shell -> wildcard [label="expands"]
35 |     wildcard -> path [label="matches"]
36 |     history -> command [label="records"]
37 |     command -> pipe [label="combined\nusing"]
38 |     command -> loop [label="repeated\nusing"]
39 |     command -> history [label="repeated\nusing"]
40 |     command -> script [label="stored\nin"]
41 |     script -> program [label="is a"]
42 |     program -> i01 [dir="none", label="manipulates"]
43 |       i01 -> file
44 |       i01 -> directory
45 |     filesystem -> i02 [dir="none", label="contains"]
46 |       i02 -> file
47 |       i02 -> directory
48 |     path -> filesystem [label="identifies\nparts\nof"]
49 | }
50 | 


--------------------------------------------------------------------------------
/design/concept.svg:
--------------------------------------------------------------------------------
  1 | <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  2 | <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
  3 |  "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
  4 | <!-- Generated by graphviz version 2.40.1 (20161225.0304)
  5 |  -->
  6 | <!-- Title: conda_concepts Pages: 1 -->
  7 | <svg width="426pt" height="467pt"
  8 |  viewBox="0.00 0.00 426.33 467.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
  9 | <g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 463)">
 10 | <title>conda_concepts</title>
 11 | <polygon fill="#ffffff" stroke="transparent" points="-4,4 -4,-463 422.3311,-463 422.3311,4 -4,4"/>
 12 | <!-- history -->
 13 | <g id="node1" class="node">
 14 | <title>history</title>
 15 | <polygon fill="none" stroke="#000000" points="247.3898,-373 192.6102,-373 192.6102,-337 247.3898,-337 247.3898,-373"/>
 16 | <text text-anchor="middle" x="220" y="-350.8" font-family="Times,serif" font-size="14.00" fill="#000000">history</text>
 17 | </g>
 18 | <!-- command -->
 19 | <g id="node4" class="node">
 20 | <title>command</title>
 21 | <polygon fill="none" stroke="#000000" points="155.7073,-273 84.2927,-273 84.2927,-237 155.7073,-237 155.7073,-273"/>
 22 | <text text-anchor="middle" x="120" y="-250.8" font-family="Times,serif" font-size="14.00" fill="#000000">command</text>
 23 | </g>
 24 | <!-- history&#45;&gt;command -->
 25 | <g id="edge7" class="edge">
 26 | <title>history&#45;&gt;command</title>
 27 | <path fill="none" stroke="#000000" d="M217.6748,-336.7556C215.0384,-322.7548 209.4999,-303.7283 198,-291 189.1638,-281.2199 177.1328,-273.9021 165.1659,-268.512"/>
 28 | <polygon fill="#000000" stroke="#000000" points="166.2242,-265.1627 155.6447,-264.5923 163.5594,-271.6357 166.2242,-265.1627"/>
 29 | <text text-anchor="middle" x="233.6001" y="-300.8" font-family="Times,serif" font-size="14.00" fill="#000000">records</text>
 30 | </g>
 31 | <!-- pipe -->
 32 | <g id="node2" class="node">
 33 | <title>pipe</title>
 34 | <polygon fill="none" stroke="#000000" points="54,-159 0,-159 0,-123 54,-123 54,-159"/>
 35 | <text text-anchor="middle" x="27" y="-136.8" font-family="Times,serif" font-size="14.00" fill="#000000">pipe</text>
 36 | </g>
 37 | <!-- loop -->
 38 | <g id="node3" class="node">
 39 | <title>loop</title>
 40 | <polygon fill="none" stroke="#000000" points="126,-159 72,-159 72,-123 126,-123 126,-159"/>
 41 | <text text-anchor="middle" x="99" y="-136.8" font-family="Times,serif" font-size="14.00" fill="#000000">loop</text>
 42 | </g>
 43 | <!-- script -->
 44 | <g id="node7" class="node">
 45 | <title>script</title>
 46 | <polygon fill="none" stroke="#000000" points="216,-159 162,-159 162,-123 216,-123 216,-159"/>
 47 | <text text-anchor="middle" x="189" y="-136.8" font-family="Times,serif" font-size="14.00" fill="#000000">script</text>
 48 | </g>
 49 | <!-- loop&#45;&gt;script -->
 50 | <!-- command&#45;&gt;history -->
 51 | <g id="edge10" class="edge">
 52 | <title>command&#45;&gt;history</title>
 53 | <path fill="none" stroke="#000000" d="M123.847,-273.2038C127.6052,-287.1823 134.563,-306.1984 146.5928,-319 156.43,-329.4683 169.9616,-337.286 182.7293,-342.9128"/>
 54 | <polygon fill="#000000" stroke="#000000" points="181.6884,-346.2679 192.2701,-346.7962 184.3274,-339.7844 181.6884,-346.2679"/>
 55 | <text text-anchor="middle" x="170.7036" y="-307.8" font-family="Times,serif" font-size="14.00" fill="#000000">repeated</text>
 56 | <text text-anchor="middle" x="170.7036" y="-293.8" font-family="Times,serif" font-size="14.00" fill="#000000">using</text>
 57 | </g>
 58 | <!-- command&#45;&gt;pipe -->
 59 | <g id="edge8" class="edge">
 60 | <title>command&#45;&gt;pipe</title>
 61 | <path fill="none" stroke="#000000" d="M84.1673,-244.5747C70.6902,-238.9936 56.3355,-230.7729 46.793,-219 35.3911,-204.933 30.461,-185.1165 28.3721,-168.9779"/>
 62 | <polygon fill="#000000" stroke="#000000" points="31.852,-168.6013 27.3702,-159.0011 24.887,-169.3008 31.852,-168.6013"/>
 63 | <text text-anchor="middle" x="74.6035" y="-200.8" font-family="Times,serif" font-size="14.00" fill="#000000">combined</text>
 64 | <text text-anchor="middle" x="74.6035" y="-186.8" font-family="Times,serif" font-size="14.00" fill="#000000">using</text>
 65 | </g>
 66 | <!-- command&#45;&gt;loop -->
 67 | <g id="edge9" class="edge">
 68 | <title>command&#45;&gt;loop</title>
 69 | <path fill="none" stroke="#000000" d="M116.6725,-236.9364C113.3284,-218.7827 108.125,-190.536 104.229,-169.3859"/>
 70 | <polygon fill="#000000" stroke="#000000" points="107.6353,-168.5571 102.3815,-159.3566 100.7511,-169.8253 107.6353,-168.5571"/>
 71 | <text text-anchor="middle" x="137.7036" y="-200.8" font-family="Times,serif" font-size="14.00" fill="#000000">repeated</text>
 72 | <text text-anchor="middle" x="137.7036" y="-186.8" font-family="Times,serif" font-size="14.00" fill="#000000">using</text>
 73 | </g>
 74 | <!-- command&#45;&gt;script -->
 75 | <g id="edge11" class="edge">
 76 | <title>command&#45;&gt;script</title>
 77 | <path fill="none" stroke="#000000" d="M148.1802,-236.734C154.4161,-231.6156 160.4987,-225.6182 165,-219 175.0996,-204.1509 181.0816,-184.8849 184.5476,-169.2159"/>
 78 | <polygon fill="#000000" stroke="#000000" points="188.0531,-169.531 186.5648,-159.0412 181.1868,-168.1696 188.0531,-169.531"/>
 79 | <text text-anchor="middle" x="200.1069" y="-200.8" font-family="Times,serif" font-size="14.00" fill="#000000">stored</text>
 80 | <text text-anchor="middle" x="200.1069" y="-186.8" font-family="Times,serif" font-size="14.00" fill="#000000">in</text>
 81 | </g>
 82 | <!-- shell -->
 83 | <g id="node5" class="node">
 84 | <title>shell</title>
 85 | <polygon fill="none" stroke="#000000" points="283,-459 229,-459 229,-423 283,-423 283,-459"/>
 86 | <text text-anchor="middle" x="256" y="-436.8" font-family="Times,serif" font-size="14.00" fill="#000000">shell</text>
 87 | </g>
 88 | <!-- shell&#45;&gt;history -->
 89 | <g id="edge4" class="edge">
 90 | <title>shell&#45;&gt;history</title>
 91 | <path fill="none" stroke="#000000" d="M239.504,-422.63C235.4411,-417.2662 231.489,-411.1767 228.7998,-405 225.8344,-398.189 223.8678,-390.4433 222.5637,-383.1165"/>
 92 | <polygon fill="#000000" stroke="#000000" points="226.0219,-382.5748 221.1262,-373.1789 219.094,-383.5769 226.0219,-382.5748"/>
 93 | <text text-anchor="middle" x="249.6001" y="-393.8" font-family="Times,serif" font-size="14.00" fill="#000000">records</text>
 94 | </g>
 95 | <!-- shell&#45;&gt;command -->
 96 | <g id="edge2" class="edge">
 97 | <title>shell&#45;&gt;command</title>
 98 | <path fill="none" stroke="#000000" d="M228.9529,-426.5172C208.121,-414.3447 179.6376,-395.3735 159.8896,-373 141.8777,-352.5933 139.8389,-344.7437 131,-319 127.0881,-307.6064 124.5254,-294.5742 122.8672,-283.2563"/>
 99 | <polygon fill="#000000" stroke="#000000" points="126.2991,-282.5055 121.5399,-273.0398 119.3575,-283.4074 126.2991,-282.5055"/>
100 | <text text-anchor="middle" x="172.0552" y="-350.8" font-family="Times,serif" font-size="14.00" fill="#000000">runs</text>
101 | </g>
102 | <!-- program -->
103 | <g id="node6" class="node">
104 | <title>program</title>
105 | <polygon fill="none" stroke="#000000" points="315.9292,-159 252.0708,-159 252.0708,-123 315.9292,-123 315.9292,-159"/>
106 | <text text-anchor="middle" x="284" y="-136.8" font-family="Times,serif" font-size="14.00" fill="#000000">program</text>
107 | </g>
108 | <!-- shell&#45;&gt;program -->
109 | <g id="edge3" class="edge">
110 | <title>shell&#45;&gt;program</title>
111 | <path fill="none" stroke="#000000" d="M264.4242,-422.6626C266.6232,-417.1015 268.7304,-410.9003 270,-405 288.1313,-320.734 287.195,-218.0305 285.4092,-169.2874"/>
112 | <polygon fill="#000000" stroke="#000000" points="288.9019,-169.0344 284.9973,-159.1853 281.9077,-169.3196 288.9019,-169.0344"/>
113 | <text text-anchor="middle" x="297.0552" y="-300.8" font-family="Times,serif" font-size="14.00" fill="#000000">runs</text>
114 | </g>
115 | <!-- wildcard -->
116 | <g id="node12" class="node">
117 | <title>wildcard</title>
118 | <polygon fill="none" stroke="#000000" points="380.4795,-373 315.5205,-373 315.5205,-337 380.4795,-337 380.4795,-373"/>
119 | <text text-anchor="middle" x="348" y="-350.8" font-family="Times,serif" font-size="14.00" fill="#000000">wildcard</text>
120 | </g>
121 | <!-- shell&#45;&gt;wildcard -->
122 | <g id="edge5" class="edge">
123 | <title>shell&#45;&gt;wildcard</title>
124 | <path fill="none" stroke="#000000" d="M277.6442,-422.9976C284.2697,-417.341 291.5173,-411.0032 298,-405 306.2417,-397.368 314.9794,-388.8041 322.8056,-380.9496"/>
125 | <polygon fill="#000000" stroke="#000000" points="325.7351,-382.9646 330.2759,-373.3922 320.7567,-378.0436 325.7351,-382.9646"/>
126 | <text text-anchor="middle" x="335.938" y="-393.8" font-family="Times,serif" font-size="14.00" fill="#000000">expands</text>
127 | </g>
128 | <!-- i01 -->
129 | <g id="node13" class="node">
130 | <title>i01</title>
131 | <ellipse fill="#000000" stroke="#000000" cx="288" cy="-73" rx="0" ry="0"/>
132 | </g>
133 | <!-- program&#45;&gt;i01 -->
134 | <g id="edge13" class="edge">
135 | <title>program&#45;&gt;i01</title>
136 | <path fill="none" stroke="#000000" d="M282.5559,-122.8796C282.1015,-113.3548 282.0168,-101.5067 283.3516,-91 284.3278,-83.3157 287.5591,-74.2192 287.9592,-73.1121"/>
137 | <text text-anchor="middle" x="317.8242" y="-93.8" font-family="Times,serif" font-size="14.00" fill="#000000">manipulates</text>
138 | </g>
139 | <!-- script&#45;&gt;program -->
140 | <g id="edge12" class="edge">
141 | <title>script&#45;&gt;program</title>
142 | <path fill="none" stroke="#000000" d="M216.2232,-141C224.2625,-141 233.2557,-141 241.9972,-141"/>
143 | <polygon fill="#000000" stroke="#000000" points="242.0711,-144.5001 252.0711,-141 242.0711,-137.5001 242.0711,-144.5001"/>
144 | <text text-anchor="middle" x="234.1431" y="-146.8" font-family="Times,serif" font-size="14.00" fill="#000000">is a</text>
145 | </g>
146 | <!-- file -->
147 | <g id="node8" class="node">
148 | <title>file</title>
149 | <polygon fill="none" stroke="#000000" points="315,-36 261,-36 261,0 315,0 315,-36"/>
150 | <text text-anchor="middle" x="288" y="-13.8" font-family="Times,serif" font-size="14.00" fill="#000000">file</text>
151 | </g>
152 | <!-- directory -->
153 | <g id="node9" class="node">
154 | <title>directory</title>
155 | <polygon fill="none" stroke="#000000" points="399.0329,-36 332.9671,-36 332.9671,0 399.0329,0 399.0329,-36"/>
156 | <text text-anchor="middle" x="366" y="-13.8" font-family="Times,serif" font-size="14.00" fill="#000000">directory</text>
157 | </g>
158 | <!-- filesystem -->
159 | <g id="node10" class="node">
160 | <title>filesystem</title>
161 | <polygon fill="none" stroke="#000000" points="408.2796,-159 335.7204,-159 335.7204,-123 408.2796,-123 408.2796,-159"/>
162 | <text text-anchor="middle" x="372" y="-136.8" font-family="Times,serif" font-size="14.00" fill="#000000">filesystem</text>
163 | </g>
164 | <!-- i02 -->
165 | <g id="node14" class="node">
166 | <title>i02</title>
167 | <ellipse fill="#000000" stroke="#000000" cx="366" cy="-73" rx="0" ry="0"/>
168 | </g>
169 | <!-- filesystem&#45;&gt;i02 -->
170 | <g id="edge16" class="edge">
171 | <title>filesystem&#45;&gt;i02</title>
172 | <path fill="none" stroke="#000000" d="M370.3921,-122.7772C368.6704,-103.2643 366.1626,-74.8429 366.0076,-73.0856"/>
173 | <text text-anchor="middle" x="391.3276" y="-93.8" font-family="Times,serif" font-size="14.00" fill="#000000">contains</text>
174 | </g>
175 | <!-- path -->
176 | <g id="node11" class="node">
177 | <title>path</title>
178 | <polygon fill="none" stroke="#000000" points="384,-273 330,-273 330,-237 384,-237 384,-273"/>
179 | <text text-anchor="middle" x="357" y="-250.8" font-family="Times,serif" font-size="14.00" fill="#000000">path</text>
180 | </g>
181 | <!-- path&#45;&gt;filesystem -->
182 | <g id="edge19" class="edge">
183 | <title>path&#45;&gt;filesystem</title>
184 | <path fill="none" stroke="#000000" d="M359.3768,-236.9364C361.7654,-218.7827 365.4821,-190.536 368.265,-169.3859"/>
185 | <polygon fill="#000000" stroke="#000000" points="371.7501,-169.7278 369.5847,-159.3566 364.8099,-168.8145 371.7501,-169.7278"/>
186 | <text text-anchor="middle" x="392.6655" y="-207.8" font-family="Times,serif" font-size="14.00" fill="#000000">identifies</text>
187 | <text text-anchor="middle" x="392.6655" y="-193.8" font-family="Times,serif" font-size="14.00" fill="#000000">parts</text>
188 | <text text-anchor="middle" x="392.6655" y="-179.8" font-family="Times,serif" font-size="14.00" fill="#000000">of</text>
189 | </g>
190 | <!-- wildcard&#45;&gt;path -->
191 | <g id="edge6" class="edge">
192 | <title>wildcard&#45;&gt;path</title>
193 | <path fill="none" stroke="#000000" d="M349.6507,-336.6585C350.9955,-321.7164 352.917,-300.3665 354.458,-283.2446"/>
194 | <polygon fill="#000000" stroke="#000000" points="357.9492,-283.4988 355.3597,-273.2253 350.9773,-282.8712 357.9492,-283.4988"/>
195 | <text text-anchor="middle" x="375.9346" y="-300.8" font-family="Times,serif" font-size="14.00" fill="#000000">matches</text>
196 | </g>
197 | <!-- i01&#45;&gt;file -->
198 | <g id="edge14" class="edge">
199 | <title>i01&#45;&gt;file</title>
200 | <path fill="none" stroke="#000000" d="M288,-72.8422C288,-71.3261 288,-59.1219 288,-46.6107"/>
201 | <polygon fill="#000000" stroke="#000000" points="291.5001,-46.2569 288,-36.2569 284.5001,-46.2569 291.5001,-46.2569"/>
202 | </g>
203 | <!-- i01&#45;&gt;directory -->
204 | <g id="edge15" class="edge">
205 | <title>i01&#45;&gt;directory</title>
206 | <path fill="none" stroke="#000000" d="M288.0565,-72.9601C289.3215,-72.0681 311.5896,-56.3663 332,-41.9743"/>
207 | <polygon fill="#000000" stroke="#000000" points="334.1151,-44.7656 340.2708,-36.1424 330.0812,-39.0448 334.1151,-44.7656"/>
208 | </g>
209 | <!-- i02&#45;&gt;file -->
210 | <g id="edge17" class="edge">
211 | <title>i02&#45;&gt;file</title>
212 | <path fill="none" stroke="#000000" d="M365.9435,-72.9601C364.6785,-72.0681 342.4104,-56.3663 322,-41.9743"/>
213 | <polygon fill="#000000" stroke="#000000" points="323.9188,-39.0448 313.7292,-36.1424 319.8849,-44.7656 323.9188,-39.0448"/>
214 | </g>
215 | <!-- i02&#45;&gt;directory -->
216 | <g id="edge18" class="edge">
217 | <title>i02&#45;&gt;directory</title>
218 | <path fill="none" stroke="#000000" d="M366,-72.8422C366,-71.3261 366,-59.1219 366,-46.6107"/>
219 | <polygon fill="#000000" stroke="#000000" points="369.5001,-46.2569 366,-36.2569 362.5001,-46.2569 369.5001,-46.2569"/>
220 | </g>
221 | </g>
222 | </svg>
223 | 


--------------------------------------------------------------------------------
/filesys/course.txt:
--------------------------------------------------------------------------------
 1 | Introduction to the Unix Shell for Data Science
 2 | 
 3 | The Unix command line has survived and thrived for almost fifty years
 4 | because it lets people to do complex things with just a few
 5 | keystrokes. Sometimes called "the duct tape of programming", it helps
 6 | users combine existing programs in new ways, automate repetitive
 7 | tasks, and run programs on clusters and clouds that may be halfway
 8 | around the world. This lesson will introduce its key elements and show
 9 | you how to use them efficiently.
10 | 


--------------------------------------------------------------------------------
/filesys/people/agarwal.txt:
--------------------------------------------------------------------------------
1 | name: Agarwal, Jasmine
2 | position: RCT2
3 | start: 2017-04-01
4 | benefits: full
5 | 


--------------------------------------------------------------------------------
/filesys/seasonal/autumn.csv:
--------------------------------------------------------------------------------
 1 | Date,Tooth
 2 | 2017-01-05,canine
 3 | 2017-01-17,wisdom
 4 | 2017-01-18,canine
 5 | 2017-02-01,molar
 6 | 2017-02-22,bicuspid
 7 | 2017-03-10,canine
 8 | 2017-03-13,canine
 9 | 2017-04-30,incisor
10 | 2017-05-02,canine
11 | 2017-05-10,canine
12 | 2017-05-19,bicuspid
13 | 2017-05-25,molar
14 | 2017-06-22,wisdom
15 | 2017-06-25,canine
16 | 2017-07-10,incisor
17 | 2017-07-10,wisdom
18 | 2017-07-20,incisor
19 | 2017-07-21,bicuspid
20 | 2017-08-09,canine
21 | 2017-08-16,canine
22 | 


--------------------------------------------------------------------------------
/filesys/seasonal/spring.csv:
--------------------------------------------------------------------------------
 1 | Date,Tooth
 2 | 2017-01-25,wisdom
 3 | 2017-02-19,canine
 4 | 2017-02-24,canine
 5 | 2017-02-28,wisdom
 6 | 2017-03-04,incisor
 7 | 2017-03-12,wisdom
 8 | 2017-03-14,incisor
 9 | 2017-03-21,molar
10 | 2017-04-29,wisdom
11 | 2017-05-08,canine
12 | 2017-05-20,canine
13 | 2017-05-21,canine
14 | 2017-05-25,canine
15 | 2017-06-04,molar
16 | 2017-06-13,bicuspid
17 | 2017-06-14,canine
18 | 2017-07-10,incisor
19 | 2017-07-16,bicuspid
20 | 2017-07-23,bicuspid
21 | 2017-08-13,bicuspid
22 | 2017-08-13,incisor
23 | 2017-08-13,wisdom
24 | 2017-09-07,molar
25 | 


--------------------------------------------------------------------------------
/filesys/seasonal/summer.csv:
--------------------------------------------------------------------------------
 1 | Date,Tooth
 2 | 2017-01-11,canine
 3 | 2017-01-18,wisdom
 4 | 2017-01-21,bicuspid
 5 | 2017-02-02,molar
 6 | 2017-02-27,wisdom
 7 | 2017-02-27,wisdom
 8 | 2017-03-07,bicuspid
 9 | 2017-03-15,wisdom
10 | 2017-03-20,canine
11 | 2017-03-23,molar
12 | 2017-04-02,bicuspid
13 | 2017-04-22,wisdom
14 | 2017-05-07,canine
15 | 2017-05-09,canine
16 | 2017-05-11,incisor
17 | 2017-05-14,incisor
18 | 2017-05-19,canine
19 | 2017-05-23,incisor
20 | 2017-05-24,incisor
21 | 2017-06-18,incisor
22 | 2017-07-25,canine
23 | 2017-08-02,canine
24 | 2017-08-03,bicuspid
25 | 2017-08-04,canine
26 | 


--------------------------------------------------------------------------------
/filesys/seasonal/winter.csv:
--------------------------------------------------------------------------------
 1 | Date,Tooth
 2 | 2017-01-03,bicuspid
 3 | 2017-01-05,incisor
 4 | 2017-01-21,wisdom
 5 | 2017-02-05,molar
 6 | 2017-02-17,incisor
 7 | 2017-02-25,bicuspid
 8 | 2017-03-12,incisor
 9 | 2017-03-25,molar
10 | 2017-03-26,incisor
11 | 2017-04-04,canine
12 | 2017-04-18,canine
13 | 2017-04-26,canine
14 | 2017-04-26,molar
15 | 2017-04-26,wisdom
16 | 2017-04-27,canine
17 | 2017-05-08,molar
18 | 2017-05-13,bicuspid
19 | 2017-05-14,wisdom
20 | 2017-06-17,canine
21 | 2017-07-01,incisor
22 | 2017-07-17,canine
23 | 2017-08-10,incisor
24 | 2017-08-11,bicuspid
25 | 2017-08-11,wisdom
26 | 2017-08-13,canine
27 | 


--------------------------------------------------------------------------------
/img/shield_image.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/img/shield_image.png


--------------------------------------------------------------------------------
/requirements.sh:
--------------------------------------------------------------------------------
 1 | # Definitions.
 2 | HOME_DIR=/home/repl
 3 | HOME_COPY=/.course_home
 4 | USER_GROUP=repl:repl
 5 | COURSE_ID=course_5065
 6 | FILESYS=filesys.zip
 7 | SOLUTIONS=solutions.zip
 8 | 
 9 | # Report start.
10 | echo ''
11 | echo '----------------------------------------'
12 | echo 'START TIME:' $(date)
13 | echo 'HOME_DIR: ' ${HOME_DIR}
14 | echo 'USER_GROUP: ' ${USER_GROUP}
15 | echo 'COURSE_ID: ' ${COURSE_ID}
16 | echo 'FILESYS: ' ${FILESYS}
17 | echo
18 | 
19 | # Make sure we're in the home directory.
20 | cd ${HOME_DIR}
21 | 
22 | # Get the zip files.
23 | wget https://s3.amazonaws.com/assets.datacamp.com/production/${COURSE_ID}/datasets/${FILESYS}
24 | wget https://s3.amazonaws.com/assets.datacamp.com/production/${COURSE_ID}/datasets/${SOLUTIONS}
25 | 
26 | # Make sure we have nano and unzip.
27 | apt-get update
28 | apt-get -y install nano
29 | apt-get -y install unzip
30 | 
31 | # Unminimize the docker image so the man command is available
32 | yes | unminimize
33 | 
34 | # Unpack to the local directory.
35 | unzip ./${FILESYS}
36 | 
37 | # Remove the zip file.
38 | rm -f ./${FILESYS}
39 | 
40 | # Make the `backup` and `bin` directories (which start off empty, so are not in Git).
41 | mkdir ./backup
42 | mkdir ./bin
43 | 
44 | # Change ownership.
45 | chown -R ${USER_GROUP} .
46 | 
47 | # Unpack the solutions (used for file comparison in SCTs).
48 | unzip -d / ./${SOLUTIONS}
49 | rm -f ./${SOLUTIONS}
50 | 
51 | # Change prompt.
52 | echo "export PS1='\$ '" >> ${HOME_DIR}/.bashrc
53 | 
54 | # Ensure ~/bin is on the user's path.
55 | echo 'export PATH=$PATH:$HOME/bin' >> ${HOME_DIR}/.bashrc
56 | 
57 | # Make copy for resetting exercises.
58 | # Files there will replace /home/repl each exercise.
59 | # IMPORTANT: Trailing slashes after directory names force rsync to do the right thing.
60 | rsync -a ${HOME_DIR}/ ${HOME_COPY}/
61 | chown -R ${USER_GROUP} ${HOME_COPY}
62 | 
63 | # Show what's been done where.
64 | echo 'Installed in home directory:'
65 | ls -R ${HOME_DIR}/*
66 | echo
67 | echo 'Last 10 lines of .bashrc'
68 | tail -n 10 ${HOME_DIR}/.bashrc
69 | 
70 | echo 'home backup directory:'
71 | ls -R ${HOME_COPY}
72 | 
73 | echo 'solutions directory'
74 | ls -lR /solutions
75 | 
76 | # Report end of installation.
77 | echo
78 | echo 'ENDING requirements.sh'
79 | echo '----------------------------------------'
80 | echo ''
81 | 


--------------------------------------------------------------------------------
/rules.yml:
--------------------------------------------------------------------------------
 1 | rules:
 2 |     course_pct_ex_assignment_lte_reco:
 3 |         min: 0
 4 |         max: 1
 5 |     course_pct_ex_code_sample_lte_reco:
 6 |         min: 0
 7 |         max: 1
 8 |     course_pct_ex_code_solution_lte_reco:
 9 |         min: 0
10 |         max: 1
11 |     mce_num_chars_assignment:
12 |         min: 0
13 |         max: 2000
14 | 


--------------------------------------------------------------------------------
/solutions/count-records-start.sh:
--------------------------------------------------------------------------------
1 | tail -q -n +2 ____ | wc ____
2 | 


--------------------------------------------------------------------------------
/solutions/count-records.sh:
--------------------------------------------------------------------------------
1 | tail -q -n +2 $@ | wc -l
2 | 


--------------------------------------------------------------------------------
/solutions/current-time.sh:
--------------------------------------------------------------------------------
1 | # Print the date and time at one-second intervals until stopped.
2 | while true
3 | do
4 |     date
5 |     sleep 1
6 | done
7 | 


--------------------------------------------------------------------------------
/solutions/date-range-start.sh:
--------------------------------------------------------------------------------
1 | # Print the first and last date from each data file.
2 | for filename in $@
3 | do
4 |     cut -d , -f 1 ____ | grep -v Date | sort | ____ -n 1
5 |     cut -d , -f 1 ____ | grep -v Date | sort | ____ -n 1
6 | done
7 | 


--------------------------------------------------------------------------------
/solutions/date-range.sh:
--------------------------------------------------------------------------------
1 | # Print the first and last date from each data file.
2 | for filename in $@
3 | do
4 |     cut -d , -f 1 $filename | grep -v Date | sort | head -n 1
5 |     cut -d , -f 1 $filename | grep -v Date | sort | tail -n 1
6 | done
7 | 


--------------------------------------------------------------------------------
/solutions/dates.sh:
--------------------------------------------------------------------------------
1 | cut -d , -f 1 seasonal/*.csv
2 | 


--------------------------------------------------------------------------------
/solutions/get-lines-solution.sh:
--------------------------------------------------------------------------------
1 | head -n $2 $1 | tail -n $3
2 | 


--------------------------------------------------------------------------------
/solutions/get-lines.sh:
--------------------------------------------------------------------------------
1 | head -n ____ ____ | tail -n ____
2 | 


--------------------------------------------------------------------------------
/solutions/lines.sh:
--------------------------------------------------------------------------------
1 | wc -l $@ | grep -v total
2 | 


--------------------------------------------------------------------------------
/solutions/names.txt:
--------------------------------------------------------------------------------
1 | Lovelace
2 | Hopper
3 | Johnson
4 | Wilson
5 | 


--------------------------------------------------------------------------------
/solutions/num-records.out:
--------------------------------------------------------------------------------
1 | 92
2 | 


--------------------------------------------------------------------------------
/solutions/range-1.sh:
--------------------------------------------------------------------------------
1 | wc -l $@ | grep -v total
2 | 


--------------------------------------------------------------------------------
/solutions/range-2.sh:
--------------------------------------------------------------------------------
1 | wc -l $@ | grep -v total | sort -n | head -n 1
2 | 


--------------------------------------------------------------------------------
/solutions/range-3.sh:
--------------------------------------------------------------------------------
1 | wc -l $@ | grep -v total | sort -n | head -n 1
2 | wc -l $@ | grep -v total | sort -n -r | head -n 1
3 | 


--------------------------------------------------------------------------------
/solutions/range-start-1.sh:
--------------------------------------------------------------------------------
1 | wc -l ____ | grep ____ total
2 | 


--------------------------------------------------------------------------------
/solutions/teeth-start.sh:
--------------------------------------------------------------------------------
1 | cut -d , -f 2 ____ | grep -v Tooth | sort | uniq ____
2 | 


--------------------------------------------------------------------------------
/solutions/teeth.out:
--------------------------------------------------------------------------------
1 | 15 bicuspid
2 | 31 canine
3 | 18 incisor
4 | 11 molar
5 | 17 wisdom
6 | 


--------------------------------------------------------------------------------
/solutions/teeth.sh:
--------------------------------------------------------------------------------
1 | cut -d , -f 2 seasonal/*.csv | grep -v Tooth | sort | uniq -c
2 | 


--------------------------------------------------------------------------------
/unused/permissions.md:
--------------------------------------------------------------------------------
  1 | --- type:MultipleChoiceExercise lang:shell xp:100 skills:1 key:59f0e1cf33
  2 | ## How can I get detailed information about a file?
  3 | 
  4 | In order to take the next step with scripting,
  5 | you need to know more about how Unix manages files.
  6 | First,
  7 | Unix stores a set of properties for each file along with its contents.
  8 | `ls` with the `-l` flag will display these.
  9 | For example,
 10 | `ls -l seasonal` displays something like this:
 11 | 
 12 | ```
 13 | -rw-r--r--  1 repl  staff  399 18 Aug 09:27 autumn.csv
 14 | -rw-r--r--  1 repl  staff  458 18 Aug 09:27 spring.csv
 15 | -rw-r--r--  1 repl  staff  479 18 Aug 09:27 summer.csv
 16 | -rw-r--r--  1 repl  staff  497 18 Aug 09:27 winter.csv
 17 | ```
 18 | 
 19 | Ignoring the first two columns for now,
 20 | this listing shows that the files are owned by a user named `repl`
 21 | who belongs to a group named `staff`,
 22 | that they range in size from 399 to 497 bytes,
 23 | and that they were last modified on August 18 at 9:27 in the morning.
 24 | 
 25 | <hr>
 26 | 
 27 | How many bytes are in the file `course.txt`?
 28 | 
 29 | *** =instructions
 30 | - 1
 31 | - 18
 32 | - 485
 33 | 
 34 | *** =hint
 35 | 
 36 | Use the same command shown in the lesson.
 37 | 
 38 | *** =sct
 39 | ```{python}
 40 | err = "No - you are looking at the wrong column."
 41 | correct = "That's correct!"
 42 | Ex().has_chosen(3, [err, err, correct])
 43 | ```
 44 | 
 45 | --- type:MultipleChoiceExercise lang:shell xp:50 skills:1 key:3061b5a818
 46 | ## How does Unix control who can do what with a file?
 47 | 
 48 | Unix keeps track of who can do what to files and directories
 49 | by storing a set of **permissions** for each one.
 50 | The three permissions are *read*, *write*, and *execute* (i.e., run as a program).
 51 | These are often written `rwx` with dashes for permissions that are missing,
 52 | so `rw-` means "can read and write but not execute"
 53 | and `r-x` means "can read and execute but not modify".
 54 | 
 55 | Unix is a multi-user operating system,
 56 | so it stores three sets of permissions for each file or directory:
 57 | one for the owner,
 58 | a second for other people in the owner's group,
 59 | and a third for everyone else.
 60 | When `ls -l seasonal` displays this:
 61 | 
 62 | ```
 63 | -rw-r--r--  1 repl  staff  399 18 Aug 09:27 autumn.csv
 64 | -rw-r--r--  1 repl  staff  458 18 Aug 09:27 spring.csv
 65 | -rw-r--r--  1 repl  staff  479 18 Aug 09:27 summer.csv
 66 | -rw-r--r--  1 repl  staff  497 18 Aug 09:27 winter.csv
 67 | ```
 68 | 
 69 | it means that each file can be read and written by its owner (the first `rw-`),
 70 | read by other people in the `staff` group (`r--`),
 71 | and also read by everyone else on the machine (`r--`).
 72 | (The first character on each line is "-" for files and "d" for directories.)
 73 | 
 74 | <hr>
 75 | 
 76 | What can users who *aren't* members of your group do with the file `course.txt`?
 77 | 
 78 | *** =instructions
 79 | - Read.
 80 | - Read and write.
 81 | - Read and execute.
 82 | - None of the above.
 83 | 
 84 | *** =hint
 85 | 
 86 | Use `ls -l` and read the permissions in groups of three characters.
 87 | 
 88 | *** =sct
 89 | ```{python}
 90 | a1 = 'Correct!'
 91 | a2 = 'No: the third group of characters does not contain a "w".'
 92 | a3 = 'No: the third group of characters does not contain an "x".'
 93 | a4 = 'No: the third group of characters contains an "r".'
 94 | Ex().has_chosen(1, [a1, a2, a3, a4])
 95 | ```
 96 | 
 97 | --- type:ConsoleExercise lang:shell xp:100 skills:1 key:f1988ccaf6
 98 | ## How can I change a file's permissions?
 99 | 
100 | You can change a file's permissions using `chmod`
101 | (which stands for "change mode").
102 | Its first parameter describes what permissions you want the file to have;
103 | the other parameters should be the names of files.
104 | 
105 | To describe permissions,
106 | you write an expression like `u=rw` or `g=rwx`.
107 | The first letter is "u" for "user" (i.e., you),
108 | "g" for "group" (other people in your group),
109 | or "o" for "other" (everyone else).
110 | The letters after the equals sign specify the permissions you want to give the file.
111 | Thus,
112 | to stop yourself from accidentally editing `course.txt`
113 | you would write:
114 | 
115 | ```{shell}
116 | chmod u=r course.txt
117 | ```
118 | 
119 | *** =instructions
120 | 
121 | Set the permissions on `people/agarwal.txt` so that you can read it
122 | but not write to it or execute it.
123 | 
124 | *** =hint
125 | 
126 | *** =pre_exercise_code
127 | ```{python}
128 | import os
129 | os.system('chmod 000 people/agarwal.txt')
130 | ```
131 | 
132 | *** =sample_code
133 | ```{shell}
134 | 
135 | ```
136 | 
137 | *** =solution
138 | ```{shell}
139 | chmod u=r people/agarwal.txt
140 | ```
141 | 
142 | *** =sct
143 | ```{python}
144 | # Ex() >> test_file_perms('people/agarwal.txt', 'r', 'is not readable.')
145 | ```
146 | 
147 | --- type:BulletConsoleExercise key:6445630844
148 | ## How can I use my scripts like other commands?
149 | 
150 | As you use the shell to work with data,
151 | you will build up your own toolbox of useful scripts.
152 | Most users put these in a directory called `bin` under their home directory.
153 | If a script is there, if that directory is on your `PATH`,
154 | and if the script has execute permission,
155 | the shell will run it when you type its name *without* also typing `bash`.
156 | 
157 | *** =pre_exercise_code
158 | ```{python}
159 | import shutil
160 | shutil.copyfile('/solutions/lines.sh', 'bin/lines.sh')
161 | ```
162 | 
163 | *** =type1: ConsoleExercise
164 | *** =key1: d0173a85f4
165 | 
166 | *** =xp1: 10
167 | 
168 | *** =instructions1
169 | 
170 | The script `bin/lines.sh`
171 | reports the number of lines in one or more files
172 | without reporting the total number of lines.
173 | Use `chmod` to change its permissions
174 | so that you can read, write, and execute it.
175 | 
176 | *** =hint1
177 | 
178 | Use `u=rwx` as the permission.
179 | 
180 | *** =sample_code1
181 | ```{shell}
182 | ```
183 | 
184 | *** =solution1
185 | ```{shell}
186 | cp /solutions/lines.sh bin
187 | chmod u=rwx bin/lines.sh
188 | ```
189 | 
190 | *** =sct1
191 | ```{python}
192 | #Ex() >> test_file_perms('bin/lines.sh', 'x', 'is not executable (did you forget `chmod`?).')
193 | ```
194 | 
195 | *** =type2: ConsoleExercise
196 | *** =key2: 4925a72bc2
197 | 
198 | *** =xp2: 10
199 | 
200 | *** =instructions2
201 | 
202 | Run the script on `seasonal/*.csv` *without* typing the command `bash`
203 | *or* the word `bin`.
204 | 
205 | *** =hint2
206 | 
207 | *** =sample_code2
208 | ```{shell}
209 | ```
210 | 
211 | *** =solution2
212 | ```{shell}
213 | cp /solutions/lines.sh bin
214 | chmod u=rwx bin/lines.sh
215 | lines.sh seasonal/*.csv
216 | ```
217 | 
218 | *** =sct2
219 | ```{python}
220 | Ex().has_code(r'\s*lines\.sh\s+seasonal/\*\.csv\s*',
221 |               fixed=False, incorrect_msg='Type the name of the script and the wildcard pattern for the files.')
222 | ```
223 | 
224 | 


--------------------------------------------------------------------------------