├── .gitignore ├── CITATION.md ├── CONTRIBUTING.md ├── LICENSE.md ├── Makefile ├── README.md ├── bin └── make-seasonal-data.py ├── chapter1.md ├── chapter2.md ├── chapter3.md ├── chapter4.md ├── chapter5.md ├── course.yml ├── datasets ├── filesys.zip └── solutions.zip ├── design ├── concept.dot └── concept.svg ├── filesys ├── course.txt ├── people │ └── agarwal.txt └── seasonal │ ├── autumn.csv │ ├── spring.csv │ ├── summer.csv │ └── winter.csv ├── img └── shield_image.png ├── requirements.sh ├── rules.yml ├── solutions ├── count-records-start.sh ├── count-records.sh ├── current-time.sh ├── date-range-start.sh ├── date-range.sh ├── dates.sh ├── get-lines-solution.sh ├── get-lines.sh ├── lines.sh ├── names.txt ├── num-records.out ├── range-1.sh ├── range-2.sh ├── range-3.sh ├── range-start-1.sh ├── teeth-start.sh ├── teeth.out └── teeth.sh └── unused └── permissions.md /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | .DS_Store 3 | -------------------------------------------------------------------------------- /CITATION.md: -------------------------------------------------------------------------------- 1 | # Citation 2 | 3 | Please cite this lesson as: 4 | 5 | ``` 6 | Greg Wilson: "Introduction to the Unix Shell for Data Science". DataCamp, 2017, https://www.datacamp.com/courses/intro-to-unix-shell. 7 | ``` 8 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | [DataCamp](https://www.datacamp.com/) welcomes bug reports, feature requests, and fixes. By contributing to this lesson, you agree that we may redistribute your work under [our license](LICENSE.md). In exchange, we will address your issues and/or assess your proposed changes as promptly as we can. 
4 | 5 | The easiest way to get started is to file an issue to tell us about a spelling mistake, some awkward wording, or a factual error. 6 | 7 | 1. You can submit suggestions directly through our web-based learning interface. This will get the fastest response. 8 | 9 | 2. If you have a [GitHub](https://github.com/) account, you can file an issue in this lesson's repository, or file a pull request to propose an improvement. 10 | 11 | 3. We also welcome comments by email, but will be able to respond more quickly if you use one of the other methods described above. 12 | 13 | The repository for this lesson is . 14 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | --- 3 | # License 4 | 5 | Copyright (c) DataCamp 2017. 6 | 7 | This work is made available under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. 8 | This is a human-readable summary of (and not a substitute for) the [full license](https://creativecommons.org/licenses/by-nc/4.0/legalcode). 9 | 10 | ## You are free to: 11 | 12 | * **Share** - copy and redistribute the material in any medium or format. 13 | 14 | * **Adapt** - remix, transform, and build upon the material. 15 | 16 | The licensor cannot revoke these freedoms as long as you follow the license terms. 17 | 18 | ## Under the following terms: 19 | 20 | * **Attribution** - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. 21 | 22 | * **NonCommercial** - You may not use the material for commercial purposes. 23 | 24 | **No additional restrictions** - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
25 | 26 | ## Notices: 27 | 28 | You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation. 29 | 30 | No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material. 31 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | FILESYS=$(wildcard filesys/*.txt) $(wildcard filesys/*/*.txt) $(wildcard filesys/*/*.csv) 2 | SOLUTIONS=$(wildcard solutions/*.sh) $(wildcard solutions/*.out) 3 | 4 | all : datasets/filesys.zip datasets/solutions.zip 5 | 6 | datasets/filesys.zip : ${FILESYS} 7 | cd filesys && zip -r ../$@ . 8 | 9 | datasets/solutions.zip : ${SOLUTIONS} 10 | zip -r $@ solutions 11 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Introduction to Shell 2 | 3 | - Teach: https://www.datacamp.com/teach/repositories/1395 4 | - Campus: https://www.datacamp.com/courses/introduction-to-shell-for-data-science 5 | - Docs: https://instructor-support.datacamp.com 6 | 7 | ## Description 8 | 9 | The Unix command line has survived and thrived for almost fifty years because it lets people do complex things with just a few keystrokes. It helps users combine existing programs in new ways, automate repetitive tasks, and run programs on clusters and clouds that may be halfway around the world. This lesson will introduce its key elements and show you how to use them efficiently. 10 | 11 | ## Learning objectives 12 | 13 | - Explain the similarities and differences between the Unix shell and graphical user interfaces.
14 | - Use core Unix commands to create, rename, and remove files and directories. 15 | - Explain what files and directories are. 16 | - Match files and directories to relative and absolute paths. 17 | - Use core data manipulation commands to filter and sort textual data by position and value. 18 | - Find and interpret help. 19 | - Predict the paths matched by wildcards and specify wildcards to match sets of paths. 20 | - Combine programs using pipes to process large data sets. 21 | - Set and use variables to record information. 22 | - Use loops to run the same commands on many different files. 23 | 24 | ## Prerequisites 25 | 26 | - None 27 | -------------------------------------------------------------------------------- /bin/make-seasonal-data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | '''Make CSV data files in filesys/seasonal/*.csv.''' 4 | 5 | import sys 6 | import csv 7 | import random 8 | from datetime import datetime, timedelta 9 | 10 | FILENAME_TEMPLATE = './filesys/seasonal/{}.csv' 11 | FILENAME_STEMS = 'spring summer autumn winter'.split() 12 | HEADER = 'Date Tooth'.split() 13 | TEETH = 'incisor canine bicuspid molar wisdom'.split() 14 | MIN_RECORDS = 20 15 | MAX_RECORDS = 30 16 | DATE_FORMAT = '%Y-%m-%d' 17 | START_DATE = datetime.strptime('2017-01-01', DATE_FORMAT) 18 | ONE_DAY = timedelta(days=1) 19 | DATE_RANGE = 250 20 | 21 | def main(): 22 | random.seed(20170101) 23 | for stem in FILENAME_STEMS: 24 | path = FILENAME_TEMPLATE.format(stem) 25 | records = [[datetime.strftime(START_DATE + random.randrange(DATE_RANGE) * ONE_DAY, DATE_FORMAT), 26 | random.choice(TEETH)] 27 | for r in range(random.randint(MIN_RECORDS, MAX_RECORDS))] 28 | records.sort() 29 | records.insert(0, HEADER) 30 | csv.writer(open(path, 'w'), lineterminator='\n').writerows(records) 31 | 32 | if __name__ == '__main__': 33 | main() 34 | 
-------------------------------------------------------------------------------- /chapter1.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Manipulating files and directories 3 | description: >- 4 | This chapter is a brief introduction to the Unix shell. You'll learn why it is 5 | still in use after almost 50 years, how it compares to the graphical tools you 6 | may be more familiar with, how to move around in the shell, and how to create, 7 | modify, and delete files and folders. 8 | free_preview: true 9 | lessons: 10 | - nb_of_exercises: 12 11 | title: How does the shell compare to a desktop interface? 12 | --- 13 | 14 | ## How does the shell compare to a desktop interface? 15 | 16 | ```yaml 17 | type: PureMultipleChoiceExercise 18 | key: badd717ea4 19 | xp: 50 20 | ``` 21 | 22 | An operating system like Windows, Linux, or Mac OS is a special kind of program. 23 | It controls the computer's processor, hard drive, and network connection, 24 | but its most important job is to run other programs. 25 | 26 | Since human beings aren't digital, 27 | they need an interface to interact with the operating system. 28 | The most common one these days is a graphical file explorer, 29 | which translates clicks and double-clicks into commands to open files and run programs. 30 | Before computers had graphical displays, 31 | though, 32 | people typed instructions into a program called a **command-line shell**. 33 | Each time a command is entered, 34 | the shell runs some other programs, 35 | prints their output in human-readable form, 36 | and then displays a *prompt* to signal that it's ready to accept the next command. 37 | (Its name comes from the notion that it's the "outer shell" of the computer.) 
38 | 39 | Typing commands instead of clicking and dragging may seem clumsy at first, 40 | but as you will see, 41 | once you start spelling out what you want the computer to do, 42 | you can combine old commands to create new ones 43 | and automate repetitive operations 44 | with just a few keystrokes. 45 | 46 |
47 | What is the relationship between the graphical file explorer that most people use and the command-line shell? 48 | 49 | `@hint` 50 | Remember that a user can only interact with an operating system through a program. 51 | 52 | `@possible_answers` 53 | - The file explorer lets you view and edit files, while the shell lets you run programs. 54 | - The file explorer is built on top of the shell. 55 | - The shell is part of the operating system, while the file explorer is separate. 56 | - [They are both interfaces for issuing commands to the operating system.] 57 | 58 | `@feedback` 59 | - Both allow you to view and edit files and run programs. 60 | - Graphical file explorers and the shell both call the same underlying operating system functions. 61 | - The shell and the file explorer are both programs that translate user commands (typed or clicked) into calls to the operating system. 62 | - Correct! Both take the user's commands (whether typed or clicked) and send them to the operating system. 63 | 64 | --- 65 | 66 | ## Where am I? 67 | 68 | ```yaml 69 | type: MultipleChoiceExercise 70 | key: 7c1481dbd3 71 | xp: 50 72 | ``` 73 | 74 | The **filesystem** manages files and directories (or folders). 75 | Each is identified by an **absolute path** 76 | that shows how to reach it from the filesystem's **root directory**: 77 | `/home/repl` is the directory `repl` in the directory `home`, 78 | while `/home/repl/course.txt` is a file `course.txt` in that directory, 79 | and `/` on its own is the root directory. 80 | 81 | To find out where you are in the filesystem, 82 | run the command `pwd` 83 | (short for "**p**rint **w**orking **d**irectory"). 84 | This prints the absolute path of your **current working directory**, 85 | which is where the shell runs commands and looks for files by default. 86 | 87 |
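As a minimal sketch of the idea above, the commands below print the working directory and then save the same path for inspection. The variable name `where` is an illustrative assumption, not something the course defines.

```shell
# Print the absolute path of the current working directory.
pwd

# Save the same path in a variable (the name `where` is illustrative)
# and show it again. An absolute path always begins with "/".
where=$(pwd)
echo "You are in: $where"
```

Whatever directory you start in, the printed path begins with `/`, because `pwd` always reports an absolute path.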
88 | Run `pwd`. 89 | Where are you right now? 90 | 91 | `@possible_answers` 92 | - `/home` 93 | - `/repl` 94 | - `/home/repl` 95 | 96 | `@hint` 97 | Unix systems typically place all users' home directories underneath `/home`. 98 | 99 | `@pre_exercise_code` 100 | ```{python} 101 | 102 | ``` 103 | 104 | `@sct` 105 | ```{python} 106 | err = "That is not the correct path." 107 | correct = "Correct - you are in `/home/repl`." 108 | 109 | Ex().has_chosen(3, [err, err, correct]) 110 | ``` 111 | 112 | --- 113 | 114 | ## How can I identify files and directories? 115 | 116 | ```yaml 117 | type: MultipleChoiceExercise 118 | key: f5b0499835 119 | xp: 50 120 | ``` 121 | 122 | `pwd` tells you where you are. 123 | To find out what's there, 124 | type `ls` (which is short for "**l**i**s**ting") and press the enter key. 125 | On its own, 126 | `ls` lists the contents of your current directory 127 | (the one displayed by `pwd`). 128 | If you add the names of some files, 129 | `ls` will list them, 130 | and if you add the names of directories, 131 | it will list their contents. 132 | For example, 133 | `ls /home/repl` shows you what's in your starting directory 134 | (usually called your **home directory**). 135 | 136 |
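The three behaviors of `ls` described above can be sketched in a throwaway directory. The setup lines use `mktemp`, `mkdir`, `touch`, and `cd` only to build a small stand-in for the course files; the `scratch` variable and the file names are assumptions made for illustration.

```shell
# Setup only: build a small scratch directory so the example
# does not depend on the course filesystem.
scratch=$(mktemp -d)
mkdir "$scratch/seasonal"
touch "$scratch/course.txt" "$scratch/seasonal/spring.csv" "$scratch/seasonal/winter.csv"
cd "$scratch"

# No arguments: list the contents of the current directory.
ls

# A file name: list just that file.
ls course.txt

# A directory name: list what is inside that directory.
ls seasonal
```

The same pattern applies with absolute paths: giving `ls` the full path of a directory lists that directory's contents no matter where you currently are.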
137 | Use `ls` with an appropriate argument to list the files in the directory `/home/repl/seasonal` 138 | (which holds information on dental surgeries by date, broken down by season). 139 | Which of these files is *not* in that directory? 140 | 141 | `@possible_answers` 142 | - `autumn.csv` 143 | - `fall.csv` 144 | - `spring.csv` 145 | - `winter.csv` 146 | 147 | `@hint` 148 | If you give `ls` a path, it shows what's in that path. 149 | 150 | `@pre_exercise_code` 151 | ```{python} 152 | 153 | ``` 154 | 155 | `@sct` 156 | ```{python} 157 | err = "That file is in the `seasonal` directory." 158 | correct = "Correct - that file is *not* in the `seasonal` directory." 159 | 160 | Ex().has_chosen(2, [err, correct, err, err]) 161 | ``` 162 | 163 | --- 164 | 165 | ## How else can I identify files and directories? 166 | 167 | ```yaml 168 | type: BulletConsoleExercise 169 | key: a766184b59 170 | xp: 100 171 | ``` 172 | 173 | An absolute path is like a latitude and longitude: it has the same value no matter where you are. A **relative path**, on the other hand, specifies a location starting from where you are: it's like saying "20 kilometers north". 174 | 175 | As examples: 176 | - If you are in the directory `/home/repl`, the **relative** path `seasonal` specifies the same directory as the **absolute** path `/home/repl/seasonal`. 177 | - If you are in the directory `/home/repl/seasonal`, the **relative** path `winter.csv` specifies the same file as the **absolute** path `/home/repl/seasonal/winter.csv`. 178 | 179 | The shell decides if a path is absolute or relative by looking at its first character: If it begins with `/`, it is absolute. If it *does not* begin with `/`, it is relative. 180 | 181 | `@pre_exercise_code` 182 | ```{python} 183 | 184 | ``` 185 | 186 | *** 187 | 188 | ```yaml 189 | type: ConsoleExercise 190 | key: 9db1ed7afd 191 | xp: 35 192 | ``` 193 | 194 | `@instructions` 195 | You are in `/home/repl`. 
Use `ls` with a **relative path** to list the file that has an absolute path of `/home/repl/course.txt` (and only that file). 196 | 197 | `@hint` 198 | You can often construct the relative path to a file or directory below your current location 199 | by subtracting the absolute path of your current location 200 | from the absolute path of the thing you want. 201 | 202 | `@solution` 203 | ```{shell} 204 | ls course.txt 205 | 206 | ``` 207 | 208 | `@sct` 209 | ```{python} 210 | Ex().multi( 211 | has_cwd("/home/repl"), 212 | has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), # to prevent `echo "course.txt"` 213 | check_correct( 214 | has_expr_output(strict=True), 215 | has_code("ls +course.txt", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/course.txt`.") 216 | ) 217 | ) 218 | 219 | ``` 220 | 221 | *** 222 | 223 | ```yaml 224 | type: ConsoleExercise 225 | key: 4165425bf6 226 | xp: 35 227 | ``` 228 | 229 | `@instructions` 230 | You are in `/home/repl`. 231 | Use `ls` with a **relative** path 232 | to list the file `/home/repl/seasonal/summer.csv` (and only that file). 233 | 234 | `@hint` 235 | Relative paths do *not* start with a leading '/'. 236 | 237 | `@solution` 238 | ```{shell} 239 | ls seasonal/summer.csv 240 | 241 | ``` 242 | 243 | `@sct` 244 | ```{python} 245 | Ex().multi( 246 | has_cwd("/home/repl"), 247 | has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), 248 | check_correct( 249 | has_expr_output(strict=True), 250 | has_code("ls +seasonal/summer.csv", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/seasonal/summer.csv`.") 251 | ) 252 | ) 253 | ``` 254 | 255 | *** 256 | 257 | ```yaml 258 | type: ConsoleExercise 259 | key: b5e66d3741 260 | xp: 30 261 | ``` 262 | 263 | `@instructions` 264 | You are in `/home/repl`. 
265 | Use `ls` with a **relative** path 266 | to list the contents of the directory `/home/repl/people`. 267 | 268 | `@hint` 269 | Relative paths do not start with a leading '/'. 270 | 271 | `@solution` 272 | ```{shell} 273 | ls people 274 | 275 | ``` 276 | 277 | `@sct` 278 | ```{python} 279 | Ex().multi( 280 | has_cwd("/home/repl"), 281 | has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), 282 | check_correct( 283 | has_expr_output(strict=True), 284 | has_code("ls +people", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` followed by a relative path to `/home/repl/people`.") 285 | ) 286 | ) 287 | Ex().success_msg("Well done. Now that you know about listing files and directories, let's see how you can move around the filesystem!") 288 | 289 | ``` 290 | 291 | --- 292 | 293 | ## How can I move to another directory? 294 | 295 | ```yaml 296 | type: BulletConsoleExercise 297 | key: dbdaec5610 298 | xp: 100 299 | ``` 300 | 301 | Just as you can move around in a file browser by double-clicking on folders, 302 | you can move around in the filesystem using the command `cd` 303 | (which stands for "change directory"). 304 | 305 | If you type `cd seasonal` and then type `pwd`, 306 | the shell will tell you that you are now in `/home/repl/seasonal`. 307 | If you then run `ls` on its own, 308 | it shows you the contents of `/home/repl/seasonal`, 309 | because that's where you are. 310 | If you want to get back to your home directory `/home/repl`, 311 | you can use the command `cd /home/repl`. 312 | 313 | `@pre_exercise_code` 314 | ```{python} 315 | 316 | ``` 317 | 318 | *** 319 | 320 | ```yaml 321 | type: ConsoleExercise 322 | key: 3d0bfdd77d 323 | xp: 35 324 | ``` 325 | 326 | `@instructions` 327 | You are in `/home/repl`. 328 | Change directory to `/home/repl/seasonal` using a relative path.
329 | 330 | `@hint` 331 | Remember that `cd` stands for "change directory" and that relative paths do not start with a leading '/'. 332 | 333 | `@solution` 334 | ```{shell} 335 | cd seasonal 336 | 337 | ``` 338 | 339 | `@sct` 340 | ```{python} 341 | Ex().check_correct( 342 | has_cwd('/home/repl/seasonal'), 343 | has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.") 344 | ) 345 | 346 | ``` 347 | 348 | *** 349 | 350 | ```yaml 351 | type: ConsoleExercise 352 | key: e69c8eac15 353 | xp: 35 354 | ``` 355 | 356 | `@instructions` 357 | Use `pwd` to check that you're there. 358 | 359 | `@hint` 360 | Remember to press "enter" or "return" after entering the command. 361 | 362 | `@solution` 363 | ```{shell} 364 | pwd 365 | 366 | ``` 367 | 368 | `@sct` 369 | ```{python} 370 | Ex().multi( 371 | has_cwd('/home/repl/seasonal'), 372 | check_correct( 373 | has_expr_output(), 374 | has_code('pwd') 375 | ) 376 | ) 377 | 378 | ``` 379 | 380 | *** 381 | 382 | ```yaml 383 | type: ConsoleExercise 384 | key: f6b265bd7f 385 | xp: 30 386 | ``` 387 | 388 | `@instructions` 389 | Use `ls` without any paths to see what's in that directory. 390 | 391 | `@hint` 392 | Remember to press "enter" or "return" after the command. 393 | 394 | `@solution` 395 | ```{shell} 396 | ls 397 | 398 | ``` 399 | 400 | `@sct` 401 | ```{python} 402 | Ex().multi( 403 | has_cwd('/home/repl/seasonal'), 404 | check_correct( 405 | has_expr_output(), 406 | has_code('ls', incorrect_msg="Your command did not generate the correct output. Have you used `ls` with no paths to show the contents of the current directory?") 407 | ) 408 | ) 409 | 410 | Ex().success_msg("Neat! This was about navigating down to subdirectories. What about moving up? Let's find out!") 411 | 412 | ``` 413 | 414 | --- 415 | 416 | ## How can I move up a directory? 
417 | 418 | ```yaml 419 | type: PureMultipleChoiceExercise 420 | key: 09c717ef76 421 | xp: 50 422 | ``` 423 | 424 | The **parent** of a directory is the directory above it. 425 | For example, `/home` is the parent of `/home/repl`, 426 | and `/home/repl` is the parent of `/home/repl/seasonal`. 427 | You can always give the absolute path of your parent directory to commands like `cd` and `ls`. 428 | More often, 429 | though, 430 | you will take advantage of the fact that the special path `..` 431 | (two dots with no spaces) means "the directory above the one I'm currently in". 432 | If you are in `/home/repl/seasonal`, 433 | then `cd ..` moves you up to `/home/repl`. 434 | If you use `cd ..` once again, 435 | it puts you in `/home`. 436 | One more `cd ..` puts you in the *root directory* `/`, 437 | which is the very top of the filesystem. 438 | (Remember to put a space between `cd` and `..` - it is a command and a path, not a single four-letter command.) 439 | 440 | A single dot on its own, `.`, always means "the current directory", 441 | so `ls` on its own and `ls .` do the same thing, 442 | while `cd .` has no effect 443 | (because it moves you into the directory you're currently in). 444 | 445 | One final special path is `~` (the tilde character), 446 | which means "your home directory", 447 | such as `/home/repl`. 448 | No matter where you are, 449 | `ls ~` will always list the contents of your home directory, 450 | and `cd ~` will always take you home. 451 | 452 |
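The special paths described above can be tried out one at a time. This is a minimal sketch in a scratch directory tree; `base`, `demo`, `after_up`, and `after_dot` are illustrative names, and `mktemp` and `mkdir -p` are used only for setup.

```shell
# Setup only: build a scratch tree to experiment in.
base=$(mktemp -d)
mkdir -p "$base/demo/seasonal"
cd "$base/demo/seasonal"

# ".." is the parent directory, so this moves up one level.
cd ..
after_up=$(pwd)
echo "$after_up"      # the path now ends in /demo

# "." is the current directory, so this changes nothing.
cd .
after_dot=$(pwd)
echo "$after_dot"     # same path as the line before

# "~" is your home directory, wherever you currently are.
cd ~
pwd
```

Each special path can also be one component of a longer path, in which case the shell resolves it one step at a time from left to right.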
453 | If you are in `/home/repl/seasonal`, 454 | where does `cd ~/../.` take you? 455 | 456 | `@hint` 457 | Trace the path one directory at a time. 458 | 459 | `@possible_answers` 460 | - `/home/repl` 461 | - [`/home`] 462 | - `/home/repl/seasonal` 463 | - `/` (the root directory) 464 | 465 | `@feedback` 466 | - No, but either `~` or `..` on its own would take you there. 467 | - Correct! The path means 'home directory', 'up a level', 'here'. 468 | - No, but `.` on its own would do that. 469 | - No, the final part of the path is `.` (meaning "here") rather than `..` (meaning "up"). 470 | 471 | --- 472 | 473 | ## How can I copy files? 474 | 475 | ```yaml 476 | type: BulletConsoleExercise 477 | key: 832de9e74c 478 | xp: 100 479 | ``` 480 | 481 | You will often want to copy files, 482 | move them into other directories to organize them, 483 | or rename them. 484 | One command to do this is `cp`, which is short for "copy". 485 | If `original.txt` is an existing file, 486 | then: 487 | 488 | ```{shell} 489 | cp original.txt duplicate.txt 490 | ``` 491 | 492 | creates a copy of `original.txt` called `duplicate.txt`. 493 | If there already was a file called `duplicate.txt`, 494 | it is overwritten. 495 | If the last parameter to `cp` is an existing directory, 496 | then a command like: 497 | 498 | ```{shell} 499 | cp seasonal/autumn.csv seasonal/winter.csv backup 500 | ``` 501 | 502 | copies *all* of the files into that directory. 503 | 504 | `@pre_exercise_code` 505 | ```{python} 506 | 507 | ``` 508 | 509 | *** 510 | 511 | ```yaml 512 | type: ConsoleExercise 513 | key: 6ab3fb1e25 514 | xp: 50 515 | ``` 516 | 517 | `@instructions` 518 | Make a copy of `seasonal/summer.csv` in the `backup` directory (which is also in `/home/repl`), 519 | calling the new file `summer.bck`. 520 | 521 | `@hint` 522 | Combine the name of the destination directory and the name of the copied file 523 | to create a relative path for the new file. 
524 | 525 | `@solution` 526 | ```{shell} 527 | cp seasonal/summer.csv backup/summer.bck 528 | 529 | ``` 530 | 531 | `@sct` 532 | ```{python} 533 | Ex().check_correct( 534 | check_file('/home/repl/backup/summer.bck', missing_msg="`summer.bck` doesn't appear to exist in the `backup` directory. Provide two paths to `cp`: the existing file (`seasonal/summer.csv`) and the destination file (`backup/summer.bck`)."), 535 | has_cwd('/home/repl') 536 | ) 537 | 538 | ``` 539 | 540 | *** 541 | 542 | ```yaml 543 | type: ConsoleExercise 544 | key: d9e1214bb0 545 | xp: 50 546 | ``` 547 | 548 | `@instructions` 549 | Copy `spring.csv` and `summer.csv` from the `seasonal` directory into the `backup` directory 550 | *without* changing your current working directory (`/home/repl`). 551 | 552 | `@hint` 553 | Use `cp` with the names of the files you want to copy 554 | and *then* the name of the directory to copy them to. 555 | 556 | `@solution` 557 | ```{shell} 558 | cp seasonal/spring.csv seasonal/summer.csv backup 559 | 560 | ``` 561 | 562 | `@sct` 563 | ```{python} 564 | patt = "`%s` doesn't appear to have been copied into the `backup` directory. Provide two filenames and a directory name to `cp`." 565 | Ex().multi( 566 | has_cwd('/home/repl', incorrect_msg="Make sure to copy the files while in `{{dir}}`! Use `cd {{dir}}` to navigate back there."), 567 | check_file('/home/repl/backup/spring.csv', missing_msg=patt%'spring.csv'), 568 | check_file('/home/repl/backup/summer.csv', missing_msg=patt%'summer.csv') 569 | ) 570 | Ex().success_msg("Good job. Other than copying, we should also be able to move files from one directory to another. Learn about it in the next exercise!") 571 | ``` 572 | 573 | --- 574 | 575 | ## How can I move a file? 576 | 577 | ```yaml 578 | type: ConsoleExercise 579 | key: 663a083a3c 580 | xp: 100 581 | ``` 582 | 583 | While `cp` copies a file, 584 | `mv` moves it from one directory to another, 585 | just as if you had dragged it in a graphical file browser. 
586 | It handles its parameters the same way as `cp`, 587 | so the command: 588 | 589 | ```{shell} 590 | mv autumn.csv winter.csv .. 591 | ``` 592 | 593 | moves the files `autumn.csv` and `winter.csv` from the current working directory 594 | up one level to its parent directory 595 | (because `..` always refers to the directory above your current location). 596 | 597 | `@instructions` 598 | You are in `/home/repl`, which has sub-directories `seasonal` and `backup`. 599 | Using a single command, move `spring.csv` and `summer.csv` from `seasonal` to `backup`. 600 | 601 | `@hint` 602 | 603 | 604 | `@pre_exercise_code` 605 | ```{python} 606 | 607 | ``` 608 | 609 | `@solution` 610 | ```{shell} 611 | mv seasonal/spring.csv seasonal/summer.csv backup 612 | ``` 613 | 614 | `@sct` 615 | ```{python} 616 | backup_patt="The file `%s` is not in the `backup` directory. Have you used `mv` correctly? Use two filenames and a directory as parameters to `mv`." 617 | seasonal_patt="The file `%s` is still in the `seasonal` directory. Make sure to move the files with `mv` rather than copying them with `cp`!" 618 | Ex().multi( 619 | check_file('/home/repl/backup/spring.csv', missing_msg=backup_patt%'spring.csv'), 620 | check_file('/home/repl/backup/summer.csv', missing_msg=backup_patt%'summer.csv'), 621 | check_not(check_file('/home/repl/seasonal/spring.csv'), incorrect_msg=seasonal_patt%'spring.csv'), 622 | check_not(check_file('/home/repl/seasonal/summer.csv'), incorrect_msg=seasonal_patt%'summer.csv') 623 | ) 624 | Ex().success_msg("Well done, let's keep this shell train going!") 625 | ``` 626 | 627 | --- 628 | 629 | ## How can I rename files? 630 | 631 | ```yaml 632 | type: BulletConsoleExercise 633 | key: 001801a652 634 | xp: 100 635 | ``` 636 | 637 | `mv` can also be used to rename files. If you run: 638 | 639 | ```{shell} 640 | mv course.txt old-course.txt 641 | ``` 642 | 643 | then the file `course.txt` in the current working directory is "moved" to the file `old-course.txt`. 
644 | This is different from the way file browsers work, 645 | but is often handy. 646 | 647 | One warning: 648 | just like `cp`, 649 | `mv` will overwrite existing files. 650 | If, 651 | for example, 652 | you already have a file called `old-course.txt`, 653 | then the command shown above will replace it with whatever is in `course.txt`. 654 | 655 | `@pre_exercise_code` 656 | ```{python} 657 | 658 | ``` 659 | 660 | *** 661 | 662 | ```yaml 663 | type: ConsoleExercise 664 | key: 710187c8c7 665 | xp: 35 666 | ``` 667 | 668 | `@instructions` 669 | Go into the `seasonal` directory. 670 | 671 | `@hint` 672 | Remember that `cd` stands for "change directory" and that relative paths do not start with a leading '/'. 673 | 674 | `@solution` 675 | ```{shell} 676 | cd seasonal 677 | 678 | ``` 679 | 680 | `@sct` 681 | ```{python} 682 | Ex().check_correct( 683 | has_cwd('/home/repl/seasonal'), 684 | has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.") 685 | ) 686 | 687 | ``` 688 | 689 | *** 690 | 691 | ```yaml 692 | type: ConsoleExercise 693 | key: ed5fe1df23 694 | xp: 35 695 | ``` 696 | 697 | `@instructions` 698 | Rename the file `winter.csv` to be `winter.csv.bck`. 699 | 700 | `@hint` 701 | Use `mv` with the current name of the file and the name you want it to have in that order. 702 | 703 | `@solution` 704 | ```{shell} 705 | mv winter.csv winter.csv.bck 706 | 707 | ``` 708 | 709 | `@sct` 710 | ```{python} 711 | hint = " Use `mv` with two arguments: the file you want to rename (`winter.csv`) and the new name for the file (`winter.csv.bck`)." 712 | Ex().multi( 713 | has_cwd('/home/repl/seasonal'), 714 | multi( 715 | check_file('/home/repl/seasonal/winter.csv.bck', missing_msg="We expected to find `winter.csv.bck` in the directory." 
+ hint), 716 | check_not(check_file('/home/repl/seasonal/winter.csv'), incorrect_msg="We were no longer expecting `winter.csv` to be in the directory." + hint) 717 | ) 718 | ) 719 | 720 | ``` 721 | 722 | *** 723 | 724 | ```yaml 725 | type: ConsoleExercise 726 | key: 1deee4c768 727 | xp: 30 728 | ``` 729 | 730 | `@instructions` 731 | Run `ls` to check that everything has worked. 732 | 733 | `@hint` 734 | Remember to press "enter" or "return" to run the command. 735 | 736 | `@solution` 737 | ```{shell} 738 | ls 739 | 740 | ``` 741 | 742 | `@sct` 743 | ```{python} 744 | Ex().multi( 745 | has_cwd('/home/repl/seasonal'), 746 | has_expr_output(incorrect_msg="Have you used `ls` to list the contents of your current working directory?") 747 | ) 748 | Ex().multi( 749 | has_cwd("/home/repl/seasonal"), 750 | check_correct( 751 | has_expr_output(strict=True), 752 | has_code("ls", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` without arguments to list the contents of your current working directory.") 753 | ) 754 | ) 755 | Ex().success_msg("Copying, moving, renaming, you've got it all figured out! Next up: deleting files.") 756 | 757 | ``` 758 | 759 | --- 760 | 761 | ## How can I delete files? 762 | 763 | ```yaml 764 | type: BulletConsoleExercise 765 | key: '2734680614' 766 | xp: 100 767 | ``` 768 | 769 | We can copy files and move them around; 770 | to delete them, 771 | we use `rm`, 772 | which stands for "remove". 773 | As with `cp` and `mv`, 774 | you can give `rm` the names of as many files as you'd like, so: 775 | 776 | ```{shell} 777 | rm thesis.txt backup/thesis-2017-08.txt 778 | ``` 779 | 780 | removes both `thesis.txt` and `backup/thesis-2017-08.txt`. 781 | 782 | `rm` does exactly what its name says, 783 | and it does it right away: 784 | unlike graphical file browsers, 785 | the shell doesn't have a trash can, 786 | so when you type the command above, 787 | your thesis is gone for good.
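To see this behavior without risking real work, the sketch below repeats the `rm` command in a throwaway directory. The setup lines (`mktemp`, `mkdir`, `touch`) and the extra file `notes.txt` are assumptions added for illustration; only the `rm` line comes from the text above.

```shell
# Setup only: work in a scratch directory so nothing real is deleted.
scratch=$(mktemp -d)
cd "$scratch"
mkdir backup
touch thesis.txt backup/thesis-2017-08.txt notes.txt

# Remove two files with one command. There is no confirmation and no undo.
rm thesis.txt backup/thesis-2017-08.txt

# Only what was not named on the rm line is left.
ls          # lists backup and notes.txt
ls backup   # lists nothing: the directory is now empty
```

Note that the `backup` directory itself survives: `rm` was given the names of files inside it, not the directory.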
788 | 789 | `@pre_exercise_code` 790 | ```{python} 791 | 792 | ``` 793 | 794 | *** 795 | 796 | ```yaml 797 | type: ConsoleExercise 798 | key: d7580f7bd4 799 | xp: 25 800 | ``` 801 | 802 | `@instructions` 803 | You are in `/home/repl`. 804 | Go into the `seasonal` directory. 805 | 806 | `@hint` 807 | Remember that `cd` stands for "change directory" and that a relative path does not start with a leading '/'. 808 | 809 | `@solution` 810 | ```{shell} 811 | cd seasonal 812 | 813 | ``` 814 | 815 | `@sct` 816 | ```{python} 817 | Ex().has_cwd('/home/repl/seasonal') 818 | 819 | ``` 820 | 821 | *** 822 | 823 | ```yaml 824 | type: ConsoleExercise 825 | key: 1c21cc7039 826 | xp: 25 827 | ``` 828 | 829 | `@instructions` 830 | Remove `autumn.csv`. 831 | 832 | `@hint` 833 | Remember that `rm` stands for "remove". 834 | 835 | `@solution` 836 | ```{shell} 837 | rm autumn.csv 838 | 839 | ``` 840 | 841 | `@sct` 842 | ```{python} 843 | Ex().multi( 844 | has_cwd('/home/repl/seasonal'), 845 | check_not(check_file('/home/repl/seasonal/autumn.csv'), incorrect_msg="We weren't expecting `autumn.csv` to still be in the `seasonal` directory. Use `rm` with the path to the file you want to remove."), 846 | has_code('rm', incorrect_msg = 'Use `rm` to remove the file, rather than moving it.') 847 | ) 848 | 849 | ``` 850 | 851 | *** 852 | 853 | ```yaml 854 | type: ConsoleExercise 855 | key: 09f2d105cd 856 | xp: 25 857 | ``` 858 | 859 | `@instructions` 860 | Go back to your home directory. 861 | 862 | `@hint` 863 | If you use `cd` without any paths, it takes you home. 864 | 865 | `@solution` 866 | ```{shell} 867 | cd 868 | 869 | ``` 870 | 871 | `@sct` 872 | ```{python} 873 | Ex().has_cwd('/home/repl', incorrect_msg="Use `cd ..` or `cd ~` to return to the home directory.") 874 | 875 | ``` 876 | 877 | *** 878 | 879 | ```yaml 880 | type: ConsoleExercise 881 | key: 9eaf49744c 882 | xp: 25 883 | ``` 884 | 885 | `@instructions` 886 | Remove `seasonal/summer.csv` without changing directories again. 
887 | 888 | `@hint` 889 | Remember that `rm` stands for "remove". 890 | 891 | `@solution` 892 | ```{shell} 893 | rm seasonal/summer.csv 894 | 895 | ``` 896 | 897 | `@sct` 898 | ```{python} 899 | Ex().multi( 900 | has_cwd('/home/repl'), 901 | check_not(check_file('/home/repl/seasonal/summer.csv'), incorrect_msg="We weren't expecting `summer.csv` to still be in the `seasonal` directory. Use `rm` with the path to the file you want to remove."), 902 | has_code('rm', incorrect_msg = 'Use `rm` to remove the file, rather than moving it.') 903 | ) 904 | Ex().success_msg("Impressive stuff! Off to the next one!") 905 | 906 | ``` 907 | 908 | --- 909 | 910 | ## How can I create and delete directories? 911 | 912 | ```yaml 913 | type: BulletConsoleExercise 914 | key: 63e8fbd0c2 915 | xp: 100 916 | ``` 917 | 918 | `mv` treats directories the same way it treats files: 919 | if you are in your home directory and run `mv seasonal by-season`, 920 | for example, 921 | `mv` changes the name of the `seasonal` directory to `by-season`. 922 | However, 923 | `rm` works differently. 924 | 925 | If you try to `rm` a directory, 926 | the shell prints an error message telling you it can't do that, 927 | primarily to stop you from accidentally deleting an entire directory full of work. 928 | Instead, 929 | you can use a separate command called `rmdir`. 930 | For added safety, 931 | it only works when the directory is empty, 932 | so you must delete the files in a directory *before* you delete the directory. 933 | (Experienced users can use the `-r` option to `rm` to get the same effect; 934 | we will discuss command options in the next chapter.) 935 | 936 | `@pre_exercise_code` 937 | ```{python} 938 | 939 | ``` 940 | 941 | *** 942 | 943 | ```yaml 944 | type: ConsoleExercise 945 | key: 5a81bb8589 946 | xp: 25 947 | ``` 948 | 949 | `@instructions` 950 | Without changing directories, 951 | delete the file `agarwal.txt` in the `people` directory. 
952 | 953 | `@hint` 954 | Remember that `rm` stands for "remove" and that a relative path does not start with a leading '/'. 955 | 956 | `@solution` 957 | ```{shell} 958 | rm people/agarwal.txt 959 | 960 | ``` 961 | 962 | `@sct` 963 | ```{python} 964 | Ex().multi( 965 | has_cwd('/home/repl'), 966 | check_not(check_file('/home/repl/people/agarwal.txt'), incorrect_msg="`agarwal.txt` should no longer be in `/home/repl/people`. Have you used `rm` correctly?"), 967 | has_expr_output(expr = 'ls people', output = '', incorrect_msg = 'There are still files in the `people` directory. If you simply moved `agarwal.txt`, or created new files, delete them all.') 968 | ) 969 | 970 | ``` 971 | 972 | *** 973 | 974 | ```yaml 975 | type: ConsoleExercise 976 | key: 661633e531 977 | xp: 25 978 | ``` 979 | 980 | `@instructions` 981 | Now that the `people` directory is empty, 982 | use a single command to delete it. 983 | 984 | `@hint` 985 | Remember that `rm` only works on files. 986 | 987 | `@solution` 988 | ```{shell} 989 | rmdir people 990 | 991 | ``` 992 | 993 | `@sct` 994 | ```{python} 995 | Ex().multi( 996 | has_cwd('/home/repl'), 997 | check_not(has_dir('/home/repl/people'), 998 | incorrect_msg = "The 'people' directory should no longer be in your home directory. Use `rmdir` to remove it!") 999 | ) 1000 | 1001 | ``` 1002 | 1003 | *** 1004 | 1005 | ```yaml 1006 | type: ConsoleExercise 1007 | key: 89f7ffc1da 1008 | xp: 25 1009 | ``` 1010 | 1011 | `@instructions` 1012 | Since a directory is not a file, 1013 | you must use the command `mkdir directory_name` 1014 | to create a new (empty) directory. 1015 | Use this command to create a new directory called `yearly` below your home directory. 1016 | 1017 | `@hint` 1018 | Run `mkdir` with the name of the directory you want to create. 
1019 | 1020 | `@solution` 1021 | ```{shell} 1022 | mkdir yearly 1023 | 1024 | ``` 1025 | 1026 | `@sct` 1027 | ```{python} 1028 | Ex().multi( 1029 | has_cwd('/home/repl'), 1030 | has_dir('/home/repl/yearly', msg="There is no `yearly` directory in your home directory. Use `mkdir yearly` to make one!") 1031 | ) 1032 | 1033 | ``` 1034 | 1035 | *** 1036 | 1037 | ```yaml 1038 | type: ConsoleExercise 1039 | key: 013a5ff2dc 1040 | xp: 25 1041 | ``` 1042 | 1043 | `@instructions` 1044 | Now that `yearly` exists, 1045 | create another directory called `2017` inside it 1046 | *without* leaving your home directory. 1047 | 1048 | `@hint` 1049 | Use a relative path for the sub-directory you want to create. 1050 | 1051 | `@solution` 1052 | ```{shell} 1053 | mkdir yearly/2017 1054 | 1055 | ``` 1056 | 1057 | `@sct` 1058 | ```{python} 1059 | Ex().multi( 1060 | has_cwd('/home/repl'), 1061 | has_dir('/home/repl/yearly/2017', 1062 | msg="Cannot find a '2017' directory in '/home/repl/yearly'. You can make this directory using the relative path `yearly/2017`.") 1063 | ) 1064 | Ex().success_msg("Cool! Let's wrap up this chapter with an exercise that repeats some of its concepts!") 1065 | 1066 | ``` 1067 | 1068 | --- 1069 | 1070 | ## Wrapping up 1071 | 1072 | ```yaml 1073 | type: BulletConsoleExercise 1074 | key: b1990e9a42 1075 | xp: 100 1076 | ``` 1077 | 1078 | You will often create intermediate files when analyzing data. 1079 | Rather than storing them in your home directory, 1080 | you can put them in `/tmp`, 1081 | which is where people and programs often keep files they only need briefly. 1082 | (Note that `/tmp` is immediately below the root directory `/`, 1083 | *not* below your home directory.) 1084 | This wrap-up exercise will show you how to do that. 
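
As a rough sketch of the idea, the commands below visit `/tmp` by its absolute path, create a scratch directory there, and remove it again. The directory name `example-scratch` is hypothetical, and the `start` variable is only a convenience for getting back to where you began.

```shell
# Remember where we started so we can come back afterwards.
start=$(pwd)

# /tmp is an absolute path: it begins at the root directory, not at home.
cd /tmp
pwd

# Make a scratch area for intermediate files, then remove it again.
# (The name example-scratch is hypothetical.)
mkdir example-scratch
rmdir example-scratch

# Return to the starting directory.
cd "$start"
```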
1085 | 1086 | `@pre_exercise_code` 1087 | ```{python} 1088 | 1089 | ``` 1090 | 1091 | *** 1092 | 1093 | ```yaml 1094 | type: ConsoleExercise 1095 | key: 59781bc43b 1096 | xp: 25 1097 | ``` 1098 | 1099 | `@instructions` 1100 | Use `cd` to go into `/tmp`. 1101 | 1102 | `@hint` 1103 | Remember that `cd` stands for "change directory" and that an absolute path starts with a '/'. 1104 | 1105 | `@solution` 1106 | ```{shell} 1107 | cd /tmp 1108 | 1109 | ``` 1110 | 1111 | `@sct` 1112 | ```{python} 1113 | Ex().check_correct( 1114 | has_cwd('/tmp'), 1115 | has_code('cd +/tmp', incorrect_msg = 'You are in the wrong directory. Use `cd` to change directory to `/tmp`.') 1116 | ) 1117 | 1118 | ``` 1119 | 1120 | *** 1121 | 1122 | ```yaml 1123 | type: ConsoleExercise 1124 | key: 7e6ada440d 1125 | xp: 25 1126 | ``` 1127 | 1128 | `@instructions` 1129 | List the contents of `/tmp` *without* typing a directory name. 1130 | 1131 | `@hint` 1132 | If you don't tell `ls` what to list, it shows you what's in your current directory. 1133 | 1134 | `@solution` 1135 | ```{shell} 1136 | ls 1137 | 1138 | ``` 1139 | 1140 | `@sct` 1141 | ```{python} 1142 | Ex().multi( 1143 | has_cwd("/tmp"), 1144 | has_code("ls", incorrect_msg = "You didn't call `ls` to generate the file listing."), 1145 | check_correct( 1146 | has_expr_output(strict=True), 1147 | has_code("^\s*ls\s*$", incorrect_msg = "Your command didn't generate the correct file listing. Use `ls` without any arguments.") 1148 | ) 1149 | ) 1150 | 1151 | ``` 1152 | 1153 | *** 1154 | 1155 | ```yaml 1156 | type: ConsoleExercise 1157 | key: edaf1bcf96 1158 | xp: 25 1159 | ``` 1160 | 1161 | `@instructions` 1162 | Make a new directory inside `/tmp` called `scratch`. 1163 | 1164 | `@hint` 1165 | Use `mkdir` to make directories.
1166 | 1167 | `@solution` 1168 | ```{shell} 1169 | mkdir scratch 1170 | 1171 | ``` 1172 | 1173 | `@sct` 1174 | ```{python} 1175 | Ex().multi( 1176 | has_cwd('/tmp'), 1177 | check_correct( 1178 | has_dir('/tmp/scratch'), 1179 | has_code('mkdir +scratch', incorrect_msg="Cannot find a 'scratch' directory under '/tmp'. Make sure to use `mkdir` correctly.") 1180 | ) 1181 | ) 1182 | 1183 | ``` 1184 | 1185 | *** 1186 | 1187 | ```yaml 1188 | type: ConsoleExercise 1189 | key: a904a3a719 1190 | xp: 25 1191 | ``` 1192 | 1193 | `@instructions` 1194 | Move `/home/repl/people/agarwal.txt` into `/tmp/scratch`. 1195 | We suggest you use the `~` shortcut for your home directory in the first path, and a relative path rather than an absolute one for the second. 1196 | 1197 | `@hint` 1198 | 1199 | 1200 | `@solution` 1201 | ```{shell} 1202 | mv ~/people/agarwal.txt scratch 1203 | 1204 | ``` 1205 | 1206 | `@sct` 1207 | ```{python} 1208 | Ex().multi( 1209 | has_cwd('/tmp'), 1210 | check_file('/tmp/scratch/agarwal.txt', missing_msg="Cannot find 'agarwal.txt' in '/tmp/scratch'. Use `mv` with `~/people/agarwal.txt` as the first parameter and `scratch` as the second.") 1211 | ) 1212 | Ex().success_msg("This concludes Chapter 1 of Introduction to Shell! Rush over to the next chapter to learn more about manipulating data!") 1213 | 1214 | ``` 1215 | -------------------------------------------------------------------------------- /chapter2.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Manipulating data 3 | description: >- 4 | The commands you saw in the previous chapter allowed you to move things around 5 | in the filesystem. This chapter will show you how to work with the data in 6 | those files. The tools we’ll use are fairly simple, but are solid building 7 | blocks. 8 | lessons: 9 | - nb_of_exercises: 12 10 | title: How can I view a file's contents? 11 | --- 12 | 13 | ## How can I view a file's contents?
14 | 15 | ```yaml 16 | type: ConsoleExercise 17 | key: 8acc09ede3 18 | xp: 100 19 | ``` 20 | 21 | Before you rename or delete files, 22 | you may want to have a look at their contents. 23 | The simplest way to do this is with `cat`, 24 | which just prints the contents of files onto the screen. 25 | (Its name is short for "concatenate", meaning "to link things together", 26 | since it will print all the files whose names you give it, one after the other.) 27 | 28 | ```{shell} 29 | cat agarwal.txt 30 | ``` 31 | ``` 32 | name: Agarwal, Jasmine 33 | position: RCT2 34 | start: 2017-04-01 35 | benefits: full 36 | ``` 37 | 38 | `@instructions` 39 | Print the contents of `course.txt` to the screen. 40 | 41 | `@hint` 42 | 43 | 44 | `@pre_exercise_code` 45 | ```{python} 46 | 47 | ``` 48 | 49 | `@solution` 50 | ```{bash} 51 | cat course.txt 52 | ``` 53 | 54 | `@sct` 55 | ```{python} 56 | Ex().multi( 57 | has_cwd('/home/repl'), 58 | has_expr_output(incorrect_msg="Your command didn't generate the right output. Have you used `cat` followed by the name of the file, `course.txt`?") 59 | ) 60 | Ex().success_msg("Nice! Let's look at other ways to view a file's contents.") 61 | ``` 62 | 63 | --- 64 | 65 | ## How can I view a file's contents piece by piece? 66 | 67 | ```yaml 68 | type: ConsoleExercise 69 | key: d8a30a3f81 70 | xp: 100 71 | ``` 72 | 73 | You can use `cat` to print large files and then scroll through the output, 74 | but it is usually more convenient to **page** the output. 75 | The original command for doing this was called `more`, 76 | but it has been superseded by a more powerful command called `less`. 77 | (This kind of naming is what passes for humor in the Unix world.) 78 | When you `less` a file, 79 | one page is displayed at a time; 80 | you can press spacebar to page down or type `q` to quit. 
81 | 82 | If you give `less` the names of several files, 83 | you can type `:n` (colon and a lower-case 'n') to move to the next file, 84 | `:p` to go back to the previous one, 85 | or `:q` to quit. 86 | 87 | Note: If you view solutions to exercises that use `less`, 88 | you will see an extra command at the end that turns paging *off* 89 | so that we can test your solutions efficiently. 90 | 91 | `@instructions` 92 | Use `less seasonal/spring.csv seasonal/summer.csv` to view those two files in that order. 93 | Press spacebar to page down, `:n` to go to the second file, and `:q` to quit. 94 | 95 | `@hint` 96 | 97 | 98 | `@pre_exercise_code` 99 | ```{python} 100 | 101 | ``` 102 | 103 | `@solution` 104 | ```{bash} 105 | # You can leave out the '| cat' part here: 106 | less seasonal/spring.csv seasonal/summer.csv | cat 107 | ``` 108 | 109 | `@sct` 110 | ```{python} 111 | Ex().multi( 112 | has_cwd('/home/repl'), 113 | check_or( 114 | has_code(r'\s*less\s+seasonal/spring\.csv\s+seasonal/summer\.csv\s*', 115 | incorrect_msg='Use `less` and the filenames. Remember that `:n` moves you to the next file.'), 116 | has_code(r'\s*less\s+seasonal/summer\.csv\s+seasonal/spring\.csv\s*') 117 | ) 118 | ) 119 | ``` 120 | 121 | --- 122 | 123 | ## How can I look at the start of a file? 124 | 125 | ```yaml 126 | type: MultipleChoiceExercise 127 | key: 82bdc9af65 128 | lang: shell 129 | xp: 50 130 | skills: 131 | - 1 132 | ``` 133 | 134 | The first thing most data scientists do when given a new dataset to analyze is 135 | figure out what fields it contains and what values those fields have. 136 | If the dataset has been exported from a database or spreadsheet, 137 | it will often be stored as **comma-separated values** (CSV). 138 | A quick way to figure out what it contains is to look at the first few rows. 139 | 140 | We can do this in the shell using a command called `head`. 
141 | As its name suggests, 142 | it prints the first few lines of a file 143 | (where "a few" means 10), 144 | so the command: 145 | 146 | ```{shell} 147 | head seasonal/summer.csv 148 | ``` 149 | 150 | displays: 151 | 152 | ``` 153 | Date,Tooth 154 | 2017-01-11,canine 155 | 2017-01-18,wisdom 156 | 2017-01-21,bicuspid 157 | 2017-02-02,molar 158 | 2017-02-27,wisdom 159 | 2017-02-27,wisdom 160 | 2017-03-07,bicuspid 161 | 2017-03-15,wisdom 162 | 2017-03-20,canine 163 | ``` 164 | 165 |
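
To check the "10 lines" default on a file whose length you know, you could try something like the sketch below. It assumes `mktemp` and `seq` are installed, and it uses `>` to save output in a file, which this course has not covered yet. The file name `numbers.txt` is hypothetical.

```shell
# Build a 15-line file in a throwaway directory.
# (seq 1 15 prints the numbers 1 to 15, one per line; > saves them in a file.)
scratch=$(mktemp -d)
seq 1 15 > "$scratch/numbers.txt"

# head stops after the first 10 lines, so the last line shown is 10.
head "$scratch/numbers.txt"
```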
166 | 167 | What does `head` do if there aren't 10 lines in the file? 168 | (To find out, use it to look at the top of `people/agarwal.txt`.) 169 | 170 | `@possible_answers` 171 | - Print an error message because the file is too short. 172 | - Display as many lines as there are. 173 | - Display enough blank lines to bring the total to 10. 174 | 175 | `@hint` 176 | What is the most useful thing it could do? 177 | 178 | `@pre_exercise_code` 179 | ```{python} 180 | 181 | ``` 182 | 183 | `@sct` 184 | ```{shell} 185 | Ex().has_chosen(2, ["Incorrect: that isn't the most useful thing it could do.", 186 | "Correct!", 187 | "Incorrect: that would be impossible to distinguish from a file that ended with a bunch of blank lines."]) 188 | ``` 189 | 190 | --- 191 | 192 | ## How can I type less? 193 | 194 | ```yaml 195 | type: BulletConsoleExercise 196 | key: 0b7b8ca8f7 197 | xp: 100 198 | ``` 199 | 200 | One of the shell's power tools is **tab completion**. 201 | If you start typing the name of a file and then press the tab key, 202 | the shell will do its best to auto-complete the path. 203 | For example, 204 | if you type `sea` and press tab, 205 | it will fill in the directory name `seasonal/` (with a trailing slash). 206 | If you then type `a` and tab, 207 | it will complete the path as `seasonal/autumn.csv`. 208 | 209 | If the path is ambiguous, 210 | such as `seasonal/s`, 211 | pressing tab a second time will display a list of possibilities. 212 | Typing another character or two to make your path more specific 213 | and then pressing tab 214 | will fill in the rest of the name. 215 | 216 | `@pre_exercise_code` 217 | ```{python} 218 | 219 | ``` 220 | 221 | *** 222 | 223 | ```yaml 224 | type: ConsoleExercise 225 | key: 4e30296c27 226 | xp: 50 227 | ``` 228 | 229 | `@instructions` 230 | Run `head seasonal/autumn.csv` without typing the full filename. 231 | 232 | `@hint` 233 | Type as much of the path as you need to, then press tab, and repeat. 
234 | 235 | `@solution` 236 | ```{shell} 237 | head seasonal/autumn.csv 238 | 239 | ``` 240 | 241 | `@sct` 242 | ```{python} 243 | Ex().multi( 244 | has_cwd('/home/repl'), 245 | has_expr_output(incorrect_msg="The checker couldn't find the right output in your command. Are you sure you called `head` on `seasonal/autumn.csv`?") 246 | ) 247 | 248 | ``` 249 | 250 | *** 251 | 252 | ```yaml 253 | type: ConsoleExercise 254 | key: e249266733 255 | xp: 50 256 | ``` 257 | 258 | `@instructions` 259 | Run `head seasonal/spring.csv` without typing the full filename. 260 | 261 | `@hint` 262 | Type as much of the path as you need to, then press tab, and repeat. 263 | 264 | `@solution` 265 | ```{shell} 266 | head seasonal/spring.csv 267 | 268 | ``` 269 | 270 | `@sct` 271 | ```{python} 272 | Ex().multi( 273 | has_cwd('/home/repl'), 274 | has_expr_output(incorrect_msg="The checker couldn't find the right output in your command. Are you sure you called `head` on `seasonal/spring.csv`?") 275 | ) 276 | Ex().success_msg("Good work! Once you get used to using tab completion, it will save you a lot of time!") 277 | 278 | ``` 279 | 280 | --- 281 | 282 | ## How can I control what commands do? 283 | 284 | ```yaml 285 | type: ConsoleExercise 286 | key: 9eb608f6c9 287 | xp: 100 288 | ``` 289 | 290 | You won't always want to look at the first 10 lines of a file, 291 | so the shell lets you change `head`'s behavior 292 | by giving it a **command-line flag** (or just "flag" for short). 293 | If you run the command: 294 | 295 | ```{shell} 296 | head -n 3 seasonal/summer.csv 297 | ``` 298 | 299 | `head` will only display the first three lines of the file. 300 | If you run `head -n 100`, 301 | it will display the first 100 (assuming there are that many), 302 | and so on. 303 | 304 | A flag's name usually indicates its purpose 305 | (for example, `-n` is meant to signal "**n**umber of lines"). 
306 | Command flags don't have to be a `-` followed by a single letter, 307 | but it's a widely used convention. 308 | 309 | Note: it's considered good style to put all flags *before* any filenames, 310 | so in this course, 311 | we only accept answers that do that. 312 | 313 | `@instructions` 314 | Display the first 5 lines of `winter.csv` in the `seasonal` directory. 315 | 316 | `@hint` 317 | 318 | 319 | `@pre_exercise_code` 320 | ```{python} 321 | 322 | ``` 323 | 324 | `@solution` 325 | ```{shell} 326 | head -n 5 seasonal/winter.csv 327 | ``` 328 | 329 | `@sct` 330 | ```{python} 331 | Ex().multi( 332 | has_cwd('/home/repl'), 333 | check_correct( 334 | has_expr_output(incorrect_msg="Are you sure you're calling `head` on the `seasonal/winter.csv` file?"), 335 | has_expr_output(strict=True, incorrect_msg="Are you sure you used the flag `-n 5`?") 336 | ), 337 | check_not(has_output("2017-02-17,incisor"), incorrect_msg = "Are you sure you used the flag `-n 5`?") 338 | ) 339 | Ex().success_msg("Nice! With this technique, you can keep your terminal from being flooded when you look at larger text files.") 340 | ``` 341 | 342 | --- 343 | 344 | ## How can I list everything below a directory? 345 | 346 | ```yaml 347 | type: ConsoleExercise 348 | key: f830d46419 349 | xp: 100 350 | ``` 351 | 352 | In order to see everything underneath a directory, 353 | no matter how deeply nested it is, 354 | you can give `ls` the flag `-R` 355 | (which means "recursive"). 356 | If you use `ls -R` in your home directory, 357 | you will see something like this: 358 | 359 | ``` 360 | backup course.txt people seasonal 361 | 362 | ./backup: 363 | 364 | ./people: 365 | agarwal.txt 366 | 367 | ./seasonal: 368 | autumn.csv spring.csv summer.csv winter.csv 369 | ``` 370 | 371 | This shows every file and directory in the current level, 372 | then everything in each sub-directory, 373 | and so on.
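
To see the recursion on a structure you control, the sketch below builds a small nested tree and lists it. The directory and file names (`outer`, `inner`, `top.txt`, and so on) are hypothetical, and `mktemp` and `touch` are assumed to be available.

```shell
# Build a small nested tree in a throwaway directory.
scratch=$(mktemp -d)
mkdir "$scratch/outer"
mkdir "$scratch/outer/inner"
touch "$scratch/top.txt" "$scratch/outer/middle.txt" "$scratch/outer/inner/bottom.txt"

# List everything below the scratch directory, one level after another.
cd "$scratch"
ls -R
```

The listing starts with the top level (`outer` and `top.txt`), then shows `./outer:` and `./outer/inner:` with their contents.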
374 | 375 | `@instructions` 376 | To help you know what is what, 377 | `ls` has another flag `-F` that prints a `/` after the name of every directory 378 | and a `*` after the name of every runnable program. 379 | Run `ls` with the two flags, `-R` and `-F`, and the absolute path to your home directory 380 | to see everything it contains. 381 | (The order of the flags doesn't matter, but the directory name must come last.) 382 | 383 | `@hint` 384 | Your home directory can be specified using `~` or `.` or its absolute path. 385 | 386 | `@pre_exercise_code` 387 | ```{python} 388 | 389 | ``` 390 | 391 | `@solution` 392 | ```{shell} 393 | ls -R -F /home/repl 394 | ``` 395 | 396 | `@sct` 397 | ```{python} 398 | Ex().check_or( 399 | has_expr_output(incorrect_msg='Use either `ls -R -F` or `ls -F -R` and the path `/home/repl`.'), 400 | has_expr_output(expr = "ls -R -F .", incorrect_msg='Use either `ls -R -F` or `ls -F -R` and the path `/home/repl`.') 401 | ) 402 | Ex().success_msg("That's a pretty neat overview, isn't it?") 403 | ``` 404 | 405 | --- 406 | 407 | ## How can I get help for a command? 408 | 409 | ```yaml 410 | type: BulletConsoleExercise 411 | key: 7b90b8a7cd 412 | xp: 100 413 | ``` 414 | 415 | To find out what commands do, 416 | people used to use the `man` command 417 | (short for "manual"). 418 | For example, 419 | the command `man head` brings up this information: 420 | 421 | ``` 422 | HEAD(1) BSD General Commands Manual HEAD(1) 423 | 424 | NAME 425 | head -- display first lines of a file 426 | 427 | SYNOPSIS 428 | head [-n count | -c bytes] [file ...] 429 | 430 | DESCRIPTION 431 | This filter displays the first count lines or bytes of each of 432 | the specified files, or of the standard input if no files are 433 | specified. If count is omitted it defaults to 10. 434 | 435 | If more than a single file is specified, each file is preceded by 436 | a header consisting of the string ``==> XXX <=='' where ``XXX'' 437 | is the name of the file. 
438 | 439 | SEE ALSO 440 | tail(1) 441 | ``` 442 | 443 | `man` automatically invokes `less`, 444 | so you may need to press spacebar to page through the information 445 | and `:q` to quit. 446 | 447 | The one-line description under `NAME` tells you briefly what the command does, 448 | and the summary under `SYNOPSIS` lists all the flags it understands. 449 | Anything that is optional is shown in square brackets `[...]`, 450 | either/or alternatives are separated by `|`, 451 | and things that can be repeated are shown by `...`, 452 | so `head`'s manual page is telling you that you can *either* give a line count with `-n` 453 | or a byte count with `-c`, 454 | and that you can give it any number of filenames. 455 | 456 | The problem with the Unix manual is that you have to know what you're looking for. 457 | If you don't, 458 | you can search [Stack Overflow](https://stackoverflow.com/), 459 | ask a question on DataCamp's Slack channels, 460 | or look at the `SEE ALSO` sections of the commands you already know. 461 | 462 | `@pre_exercise_code` 463 | ```{python} 464 | 465 | ``` 466 | 467 | *** 468 | 469 | ```yaml 470 | type: ConsoleExercise 471 | key: 52d629048a 472 | xp: 50 473 | ``` 474 | 475 | `@instructions` 476 | Read the manual page for the `tail` command to find out 477 | what putting a `+` sign in front of the number used with the `-n` flag does. 478 | (Remember to press spacebar to page down and/or type `q` to quit.) 479 | 480 | `@hint` 481 | Remember: `man` is short for "manual". 
482 | 483 | `@solution` 484 | ```{shell} 485 | # Run the following command *without* '| cat': 486 | man tail | cat 487 | 488 | ``` 489 | 490 | `@sct` 491 | ```{python} 492 | Ex().has_code(r'\s*man\s+tail.*', incorrect_msg='Use `man` and the command name.') 493 | 494 | ``` 495 | 496 | *** 497 | 498 | ```yaml 499 | type: ConsoleExercise 500 | key: 6a07958ae0 501 | xp: 50 502 | ``` 503 | 504 | `@instructions` 505 | Use `tail` with the flag `-n +7` to display all *but* the first six lines of `seasonal/spring.csv`. 506 | 507 | `@hint` 508 | Use a plus sign '+' in front of the number of the line you want the output to start from. 509 | 510 | `@solution` 511 | ```{shell} 512 | tail -n +7 seasonal/spring.csv 513 | 514 | ``` 515 | 516 | `@sct` 517 | ```{python} 518 | Ex().multi( 519 | has_cwd('/home/repl'), 520 | has_output('2017-09-07,molar', incorrect_msg="Are you calling `tail` on `seasonal/spring.csv`?"), 521 | has_expr_output(strict=True, incorrect_msg="Are you sure you used the flag `-n +7`?") 522 | ) 523 | 524 | ``` 525 | 526 | --- 527 | 528 | ## How can I select columns from a file? 529 | 530 | ```yaml 531 | type: MultipleChoiceExercise 532 | key: 925e9d645a 533 | xp: 50 534 | ``` 535 | 536 | `head` and `tail` let you select rows from a text file. 537 | If you want to select columns, 538 | you can use the command `cut`. 539 | It has several options (use `man cut` to explore them), 540 | but the most common is something like: 541 | 542 | ```{shell} 543 | cut -f 2-5,8 -d , values.csv 544 | ``` 545 | 546 | which means 547 | "select columns 2 through 5 and column 8, 548 | using comma as the separator". 549 | `cut` uses `-f` (meaning "fields") to specify columns 550 | and `-d` (meaning "delimiter") to specify the separator. 551 | You need to specify the latter because some files may use spaces, tabs, or colons to separate columns. 552 | 553 |
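
Here is a small sketch of that `cut` command on made-up data. The contents of `values.csv` are hypothetical (the course does not provide this file), and `mktemp`, `printf`, and the `>` redirection are assumptions that have not been introduced yet.

```shell
# Create a two-line CSV file with nine columns in a throwaway directory.
scratch=$(mktemp -d)
printf 'a,b,c,d,e,f,g,h,i\n1,2,3,4,5,6,7,8,9\n' > "$scratch/values.csv"

# Keep fields 2 through 5 and field 8, splitting each line on commas.
cut -f 2-5,8 -d , "$scratch/values.csv"
# prints:
# b,c,d,e,h
# 2,3,4,5,8
```

Notice that `cut` joins the selected fields with the same delimiter it used to split them.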
554 | 555 | What command will select the first column (containing dates) from the file `spring.csv`? 556 | 557 | `@possible_answers` 558 | - `cut -d , -f 1 seasonal/spring.csv` 559 | - `cut -d, -f1 seasonal/spring.csv` 560 | - Either of the above. 561 | - Neither of the above, because `-f` must come before `-d`. 562 | 563 | `@hint` 564 | The order of the flags doesn't matter. 565 | 566 | `@pre_exercise_code` 567 | ```{python} 568 | 569 | ``` 570 | 571 | `@sct` 572 | ```{python} 573 | Ex().has_chosen(3, ['Yes, but that is not all', 'Yes, but that is not all', 'Correct! Adding a space after the flag is good style, but not compulsory.', 'No, flag order doesn\'t matter']) 574 | ``` 575 | 576 | --- 577 | 578 | ## What can't cut do? 579 | 580 | ```yaml 581 | type: MultipleChoiceExercise 582 | key: b9bb10ae87 583 | xp: 50 584 | ``` 585 | 586 | `cut` is a simple-minded command. 587 | In particular, 588 | it doesn't understand quoted strings. 589 | If, for example, your file is: 590 | 591 | ``` 592 | Name,Age 593 | "Johel,Ranjit",28 594 | "Sharma,Rupinder",26 595 | ``` 596 | 597 | then: 598 | 599 | ```{shell} 600 | cut -f 2 -d , everyone.csv 601 | ``` 602 | 603 | will produce: 604 | 605 | ``` 606 | Age 607 | Ranjit" 608 | Rupinder" 609 | ``` 610 | 611 | rather than everyone's age, 612 | because it will think the comma between last and first names is a column separator. 613 | 614 |
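
If you want to reproduce this yourself, the sketch below recreates `everyone.csv` with the contents shown above and runs the same `cut` command. It assumes `mktemp` and `printf` are available and uses `>` to write the file, none of which this course has introduced yet.

```shell
# Recreate the example file in a throwaway directory.
scratch=$(mktemp -d)
printf 'Name,Age\n"Johel,Ranjit",28\n"Sharma,Rupinder",26\n' > "$scratch/everyone.csv"

# cut splits on every comma, including the ones inside the quotes,
# so field 2 is the second half of each quoted name rather than the age.
cut -f 2 -d , "$scratch/everyone.csv"
```

For files like this, a tool that understands CSV quoting is a safer choice than `cut`.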
615 | 616 | What is the output of `cut -d : -f 2-4` on the line: 617 | 618 | ``` 619 | first:second:third: 620 | ``` 621 | 622 | (Note the trailing colon.) 623 | 624 | `@possible_answers` 625 | - `second` 626 | - `second:third` 627 | - `second:third:` 628 | - None of the above, because there aren't four fields. 629 | 630 | `@hint` 631 | Pay attention to the trailing colon. 632 | 633 | `@pre_exercise_code` 634 | ```{python} 635 | 636 | ``` 637 | 638 | `@sct` 639 | ```{python} 640 | Ex().has_chosen(3, ['No, there is more.', 'No, there is more.', 'Correct! The trailing colon creates an empty fourth field.', 'No, `cut` does the best it can.']) 641 | ``` 642 | 643 | --- 644 | 645 | ## How can I repeat commands? 646 | 647 | ```yaml 648 | type: TabConsoleExercise 649 | key: 32c0d30049 650 | xp: 100 651 | ``` 652 | 653 | One of the biggest advantages of using the shell is that 654 | it makes it easy for you to do things over again. 655 | If you run some commands, 656 | you can then press the up-arrow key to cycle back through them. 657 | You can also use the left and right arrow keys and the delete key to edit them. 658 | Pressing return will then run the modified command. 659 | 660 | Even better, `history` will print a list of commands you have run recently. 661 | Each one is preceded by a serial number to make it easy to re-run particular commands: 662 | just type `!55` to re-run the 55th command in your history (if you have that many). 663 | You can also re-run a command by typing an exclamation mark followed by the command's name, 664 | such as `!head` or `!cut`, 665 | which will re-run the most recent use of that command. 666 | 667 | `@pre_exercise_code` 668 | ```{python} 669 | 670 | ``` 671 | 672 | *** 673 | 674 | ```yaml 675 | type: ConsoleExercise 676 | key: 188a2fab38 677 | xp: 20 678 | ``` 679 | 680 | `@instructions` 681 | Run `head summer.csv` in your home directory (which should fail). 
682 | 683 | `@hint` 684 | Tab completion won't work if there isn't a matching filename. 685 | 686 | `@solution` 687 | ```{shell} 688 | head summer.csv 689 | 690 | ``` 691 | 692 | `@sct` 693 | ```{python} 694 | Ex().multi( 695 | has_cwd('/home/repl'), 696 | has_code(r'\s*head\s+summer.csv\s*', incorrect_msg="Use `head` and a filename, `summer.csv`. Don't worry if it fails. It should.") 697 | ) 698 | 699 | ``` 700 | 701 | *** 702 | 703 | ```yaml 704 | type: ConsoleExercise 705 | key: cba6bf99a5 706 | xp: 20 707 | ``` 708 | 709 | `@instructions` 710 | Change directory to `seasonal`. 711 | 712 | `@hint` 713 | Remember that `cd` stands for "change directory". 714 | 715 | `@solution` 716 | ```{shell} 717 | cd seasonal 718 | 719 | ``` 720 | 721 | `@sct` 722 | ```{python} 723 | Ex().check_correct( 724 | has_cwd('/home/repl/seasonal'), 725 | has_code('cd +seasonal', incorrect_msg="If your current working directory (find out with `pwd`) is `/home/repl`, you can move to the `seasonal` folder with `cd seasonal`.") 726 | ) 727 | 728 | ``` 729 | 730 | *** 731 | 732 | ```yaml 733 | type: ConsoleExercise 734 | key: 74f5c8d2fc 735 | xp: 20 736 | ``` 737 | 738 | `@instructions` 739 | Re-run the `head` command with `!head`. 740 | 741 | `@hint` 742 | Do not type any spaces between `!` and what follows. 743 | 744 | `@solution` 745 | ```{shell} 746 | !head 747 | 748 | ``` 749 | 750 | `@sct` 751 | ```{python} 752 | # !head is expanded into head summer.csv by the terminal, so manually specify expression 753 | # This won't work for the validator though, so we have to use check_or to satisfy it. 
754 | Ex().multi( 755 | has_cwd('/home/repl/seasonal'), 756 | check_or( 757 | has_expr_output(expr = 'head summer.csv', 758 | incorrect_msg='Use `!head` to repeat the `head` command.'), 759 | has_code('!head') 760 | ) 761 | ) 762 | 763 | ``` 764 | 765 | *** 766 | 767 | ```yaml 768 | type: ConsoleExercise 769 | key: a28555575a 770 | xp: 20 771 | ``` 772 | 773 | `@instructions` 774 | Use `history` to look at what you have done. 775 | 776 | `@hint` 777 | Notice that `history` shows the most recent commands last, so that they are left on your screen when it finishes running. 778 | 779 | `@solution` 780 | ```{shell} 781 | history 782 | 783 | ``` 784 | 785 | `@sct` 786 | ```{python} 787 | Ex().has_code(r'history', incorrect_msg='Use `history` without flags to get a list of previous commands.') 788 | 789 | ``` 790 | 791 | *** 792 | 793 | ```yaml 794 | type: ConsoleExercise 795 | key: 0629b2adf3 796 | xp: 20 797 | ``` 798 | 799 | `@instructions` 800 | Re-run `head` again using `!` followed by a command number. 801 | 802 | `@hint` 803 | Do *not* type any spaces between `!` and what follows. 804 | 805 | `@solution` 806 | ```{shell} 807 | !3 808 | 809 | ``` 810 | 811 | `@sct` 812 | ```{python} 813 | # !3 is expanded into head summer.csv by the terminal, so manually specify expression 814 | # This won't work for the validator though, so we have to use check_or to satisfy it. 815 | Ex().multi( 816 | has_cwd('/home/repl/seasonal'), 817 | check_or( 818 | has_expr_output(expr = 'head summer.csv', 819 | incorrect_msg='Have you used `!` to rerun the last `head` from the history?'), 820 | # The head cmd should appear twice, at positions 1 and 3, though this will change 821 | # if the student typed a wrong answer. 822 | # Since we're also checking output, this should be niche enough to ignore. 823 | has_code(r'!3'), 824 | has_code(r'!1') 825 | ) 826 | ) 827 | Ex().success_msg("Well done! 
To the next one!") 828 | 829 | ``` 830 | 831 | --- 832 | 833 | ## How can I select lines containing specific values? 834 | 835 | ```yaml 836 | type: BulletConsoleExercise 837 | key: adf1516acf 838 | xp: 100 839 | ``` 840 | 841 | `head` and `tail` select rows, 842 | `cut` selects columns, 843 | and `grep` selects lines according to what they contain. 844 | In its simplest form, 845 | `grep` takes a piece of text followed by one or more filenames 846 | and prints all of the lines in those files that contain that text. 847 | For example, 848 | `grep bicuspid seasonal/winter.csv` 849 | prints lines from `winter.csv` that contain "bicuspid". 850 | 851 | `grep` can search for patterns as well; 852 | we will explore those in the next course. 853 | What's more important right now is some of `grep`'s more common flags: 854 | 855 | - `-c`: print a count of matching lines rather than the lines themselves 856 | - `-h`: do *not* print the names of files when searching multiple files 857 | - `-i`: ignore case (e.g., treat "Regression" and "regression" as matches) 858 | - `-l`: print the names of files that contain matches, not the matches 859 | - `-n`: print line numbers for matching lines 860 | - `-v`: invert the match, i.e., only show lines that *don't* match 861 | 862 | `@pre_exercise_code` 863 | ```{python} 864 | 865 | ``` 866 | 867 | *** 868 | 869 | ```yaml 870 | type: ConsoleExercise 871 | key: 0d7ef2baa0 872 | xp: 35 873 | ``` 874 | 875 | `@instructions` 876 | Print the contents of all of the lines containing the word `molar` in `seasonal/autumn.csv` 877 | by running a single command while in your home directory. Don't use any flags. 878 | 879 | `@hint` 880 | Use `grep` with the word you are searching for and the name of the file(s) to search in. 
881 | 882 | `@solution` 883 | ```{shell} 884 | grep molar seasonal/autumn.csv 885 | 886 | ``` 887 | 888 | `@sct` 889 | ```{python} 890 | Ex().multi( 891 | has_cwd('/home/repl'), 892 | check_correct( 893 | has_expr_output(), 894 | multi( 895 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 896 | has_code("molar", incorrect_msg = "Did you search for `molar`?"), 897 | has_code("seasonal/autumn.csv", incorrect_msg = "Did you search the `seasonal/autumn.csv` file?") 898 | ) 899 | ) 900 | ) 901 | 902 | ``` 903 | 904 | *** 905 | 906 | ```yaml 907 | type: ConsoleExercise 908 | key: a0eee34d1e 909 | xp: 35 910 | ``` 911 | 912 | `@instructions` 913 | Invert the match to find all of the lines that *don't* contain the word `molar` in `seasonal/spring.csv`, and show their line numbers. 914 | Remember, it's considered good style to put all of the flags *before* other values like filenames or the search term "molar". 915 | 916 | `@hint` 917 | 918 | 919 | `@solution` 920 | ```{shell} 921 | grep -v -n molar seasonal/spring.csv 922 | 923 | ``` 924 | 925 | `@sct` 926 | ```{python} 927 | Ex().multi( 928 | has_cwd('/home/repl'), 929 | check_correct( 930 | has_expr_output(), 931 | multi( 932 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 933 | has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"), 934 | has_code("-n", incorrect_msg = "Did you show line numbers with `-n`?"), 935 | has_code("molar", incorrect_msg = "Did you search for `molar`?"), 936 | has_code("seasonal/spring.csv", incorrect_msg = "Did you search the `seasonal/spring.csv` file?") 937 | ) 938 | ) 939 | ) 940 | 941 | ``` 942 | 943 | *** 944 | 945 | ```yaml 946 | type: ConsoleExercise 947 | key: f5641234fe 948 | xp: 30 949 | ``` 950 | 951 | `@instructions` 952 | Count how many lines contain the word `incisor` in `autumn.csv` and `winter.csv` combined. 953 | (Again, run a single command from your home directory.) 
954 | 955 | `@hint` 956 | Remember to use `-c` with `grep` to count lines. 957 | 958 | `@solution` 959 | ```{shell} 960 | grep -c incisor seasonal/autumn.csv seasonal/winter.csv 961 | 962 | ``` 963 | 964 | `@sct` 965 | ```{python} 966 | Ex().multi( 967 | has_cwd('/home/repl'), 968 | check_correct( 969 | has_expr_output(), 970 | multi( 971 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 972 | has_code("-c", incorrect_msg = "Did you get counts with `-c`?"), 973 | has_code("incisor", incorrect_msg = "Did you search for `incisor`?"), 974 | has_code("seasonal/autumn.csv", incorrect_msg = "Did you search the `seasonal/autumn.csv` file?"), 975 | has_code("seasonal/winter.csv", incorrect_msg = "Did you search the `seasonal/winter.csv` file?") 976 | ) 977 | ) 978 | ) 979 | 980 | ``` 981 | 982 | --- 983 | 984 | ## Why isn't it always safe to treat data as text? 985 | 986 | ```yaml 987 | type: MultipleChoiceExercise 988 | key: 11914639fc 989 | xp: 50 990 | ``` 991 | 992 | The `SEE ALSO` section of the manual page for `cut` refers to a command called `paste` 993 | that can be used to combine data files instead of cutting them up. 994 | 995 |
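As a rough illustration of what `paste` does, the sketch below joins two tiny throwaway files line by line. The file names `left.csv` and `right.csv` and their contents are made up for this example and are not part of the course data; the only assumption about `paste` itself is its standard `-d` flag for choosing the separator.

```shell
# Work in a throwaway directory so the course files are left untouched.
tmpdir=$(mktemp -d)
printf 'a,1\nb,2\n' > "$tmpdir/left.csv"
printf 'x,9\ny,8\n' > "$tmpdir/right.csv"

# -d , joins corresponding lines with a comma instead of the default tab.
pasted=$(paste -d , "$tmpdir/left.csv" "$tmpdir/right.csv")
echo "$pasted"

rm -r "$tmpdir"
```

Both example files have the same number of lines, so every output row has four fields. Keep that in mind when you look at what happens with the seasonal files.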
996 | 997 | Read the manual page for `paste`, 998 | and then run `paste` to combine the autumn and winter data files in a single table 999 | using a comma as a separator. 1000 | What's wrong with the output from a data analysis point of view? 1001 | 1002 | `@possible_answers` 1003 | - The column headers are repeated. 1004 | - The last few rows have the wrong number of columns. 1005 | - Some of the data from `winter.csv` is missing. 1006 | 1007 | `@hint` 1008 | If you `cut` the output of `paste` using commas as a separator, 1009 | would it produce the right answer? 1010 | 1011 | `@pre_exercise_code` 1012 | ```{python} 1013 | 1014 | ``` 1015 | 1016 | `@sct` 1017 | ```{python} 1018 | err1 = 'True, but it is not necessarily an error.' 1019 | correct2 = 'Correct: joining the lines with commas creates only one empty column at the start, not two.' 1020 | err3 = 'No, all of the winter data is there.' 1021 | Ex().has_chosen(2, [err1, correct2, err3]) 1022 | ``` 1023 | -------------------------------------------------------------------------------- /chapter3.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Combining tools 3 | description: >- 4 | The real power of the Unix shell lies not in the individual commands, but in 5 | how easily they can be combined to do new things. This chapter will show you 6 | how to use this power to select the data you want, and introduce commands for 7 | sorting values and removing duplicates. 8 | lessons: 9 | - nb_of_exercises: 12 10 | title: How can I store a command's output in a file? 11 | --- 12 | 13 | ## How can I store a command's output in a file? 14 | 15 | ```yaml 16 | type: ConsoleExercise 17 | key: 07a427d50c 18 | xp: 100 19 | ``` 20 | 21 | All of the tools you have seen so far let you name input files. 22 | Most don't have an option for naming an output file because they don't need one.
23 | Instead, 24 | you can use **redirection** to save any command's output anywhere you want. 25 | If you run this command: 26 | 27 | ```{shell} 28 | head -n 5 seasonal/summer.csv 29 | ``` 30 | 31 | it prints the first 5 lines of the summer data on the screen. 32 | If you run this command instead: 33 | 34 | ```{shell} 35 | head -n 5 seasonal/summer.csv > top.csv 36 | ``` 37 | 38 | nothing appears on the screen. 39 | Instead, 40 | `head`'s output is put in a new file called `top.csv`. 41 | You can take a look at that file's contents using `cat`: 42 | 43 | ```{shell} 44 | cat top.csv 45 | ``` 46 | 47 | The greater-than sign `>` tells the shell to redirect `head`'s output to a file. 48 | It isn't part of the `head` command; 49 | instead, 50 | it works with every shell command that produces output. 51 | 52 | `@instructions` 53 | Combine `tail` with redirection to save the last 5 lines of `seasonal/winter.csv` in a file called `last.csv`. 54 | 55 | `@hint` 56 | Use `tail -n 5` to get the last 5 lines. 57 | 58 | `@pre_exercise_code` 59 | ```{python} 60 | 61 | ``` 62 | 63 | `@solution` 64 | ```{shell} 65 | tail -n 5 seasonal/winter.csv > last.csv 66 | ``` 67 | 68 | `@sct` 69 | ```{python} 70 | patt = "The line `%s` should be in the file `last.csv`, but it isn't. Redirect the output of `tail -n 5 seasonal/winter.csv` to `last.csv` with `>`." 71 | Ex().multi( 72 | has_cwd('/home/repl'), 73 | check_file('/home/repl/last.csv').multi( 74 | check_not(has_code('2017-07-01,incisor'), incorrect_msg='`last.csv` has too many lines. Did you use the flag `-n 5` with `tail`?'), 75 | has_code('2017-07-17,canine', incorrect_msg=patt%'2017-07-17,canine'), 76 | has_code('2017-08-13,canine', incorrect_msg=patt%'2017-08-13,canine') 77 | ) 78 | ) 79 | Ex().success_msg("Nice! Let's practice some more!") 80 | ``` 81 | 82 | --- 83 | 84 | ## How can I use a command's output as an input? 
85 | 86 | ```yaml 87 | type: BulletConsoleExercise 88 | key: f47d337593 89 | xp: 100 90 | ``` 91 | 92 | Suppose you want to get lines from the middle of a file. 93 | More specifically, 94 | suppose you want to get lines 3-5 from one of our data files. 95 | You can start by using `head` to get the first 5 lines 96 | and redirect that to a file, 97 | and then use `tail` to select the last 3: 98 | 99 | ```{shell} 100 | head -n 5 seasonal/winter.csv > top.csv 101 | tail -n 3 top.csv 102 | ``` 103 | 104 | A quick check confirms that this is lines 3-5 of our original file, 105 | because it is the last 3 lines of the first 5. 106 | 107 | `@pre_exercise_code` 108 | ```{python} 109 | 110 | ``` 111 | 112 | *** 113 | 114 | ```yaml 115 | type: ConsoleExercise 116 | key: 35bbb5520e 117 | xp: 50 118 | ``` 119 | 120 | `@instructions` 121 | Select the last two lines from `seasonal/winter.csv` 122 | and save them in a file called `bottom.csv`. 123 | 124 | `@hint` 125 | Use `tail` to select lines and `>` to redirect `tail`'s output. 126 | 127 | `@solution` 128 | ```{shell} 129 | tail -n 2 seasonal/winter.csv > bottom.csv 130 | 131 | ``` 132 | 133 | `@sct` 134 | ```{python} 135 | patt="The line `%s` should be in the file `bottom.csv`, but it isn't. Redirect the output of `tail -n 2 seasonal/winter.csv` to `bottom.csv` with `>`." 136 | Ex().multi( 137 | has_cwd('/home/repl'), 138 | check_file('/home/repl/bottom.csv').multi( 139 | check_not(has_code('2017-08-11,bicuspid'), incorrect_msg = '`bottom.csv` has too many lines. 
Did you use the flag `-n 2` with `tail`?'), 140 | has_code('2017-08-11,wisdom', incorrect_msg=patt%"2017-08-11,wisdom"), 141 | has_code('2017-08-13,canine', incorrect_msg=patt%"2017-08-13,canine") 142 | ) 143 | ) 144 | 145 | ``` 146 | 147 | *** 148 | 149 | ```yaml 150 | type: ConsoleExercise 151 | key: c94d3936a7 152 | xp: 50 153 | ``` 154 | 155 | `@instructions` 156 | Select the first line from `bottom.csv` 157 | in order to get the second-to-last line of the original file. 158 | 159 | `@hint` 160 | Use `head` to select the line you want. 161 | 162 | `@solution` 163 | ```{shell} 164 | head -n 1 bottom.csv 165 | 166 | ``` 167 | 168 | `@sct` 169 | ```{python} 170 | Ex().multi( 171 | has_cwd('/home/repl'), 172 | check_file('/home/repl/bottom.csv').has_code('2017-08-11,wisdom', incorrect_msg="There's something wrong with the `bottom.csv` file. Make sure you don't change it!"), 173 | has_expr_output(strict=True, incorrect_msg="Have you used `head` correctly on `bottom.csv`? Make sure to use the `-n` flag correctly.") 174 | ) 175 | 176 | Ex().success_msg("Well done. Head over to the next exercise to find out about better ways to combine commands.") 177 | 178 | ``` 179 | 180 | --- 181 | 182 | ## What's a better way to combine commands? 183 | 184 | ```yaml 185 | type: ConsoleExercise 186 | key: b36aea9a1e 187 | xp: 100 188 | ``` 189 | 190 | Using redirection to combine commands has two drawbacks: 191 | 192 | 1. It leaves a lot of intermediate files lying around (like `top.csv`). 193 | 2. The commands to produce your final result are scattered across several lines of history. 194 | 195 | The shell provides another tool that solves both of these problems at once called a **pipe**. 
196 | Once again, 197 | start by running `head`: 198 | 199 | ```{shell} 200 | head -n 5 seasonal/summer.csv 201 | ``` 202 | 203 | Instead of sending `head`'s output to a file, 204 | add a vertical bar and the `tail` command *without* a filename: 205 | 206 | ```{shell} 207 | head -n 5 seasonal/summer.csv | tail -n 3 208 | ``` 209 | 210 | The pipe symbol tells the shell to use the output of the command on the left 211 | as the input to the command on the right. 212 | 213 | `@instructions` 214 | Use `cut` to select all of the tooth names from column 2 of the comma delimited file `seasonal/summer.csv`, then pipe the result to `grep`, with an inverted match, to exclude the header line containing the word "Tooth". *`cut` and `grep` were covered in detail in Chapter 2, exercises 8 and 11 respectively.* 215 | 216 | `@hint` 217 | - The first part of the command takes the form `cut -d field_delimiter -f column_number filename`. 218 | - The second part of the command takes the form `grep -v thing_to_match`. 219 | 220 | `@pre_exercise_code` 221 | ```{python} 222 | 223 | ``` 224 | 225 | `@solution` 226 | ```{shell} 227 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth 228 | ``` 229 | 230 | `@sct` 231 | ```{python} 232 | Ex().multi( 233 | has_cwd('/home/repl'), 234 | has_expr_output(incorrect_msg = 'Have you piped the result of `cut -d , -f 2 seasonal/summer.csv` into `grep -v Tooth` with `|`?'), 235 | check_not(has_output("Tooth"), incorrect_msg = 'Did you exclude the `"Tooth"` header line using `grep`?') 236 | ) 237 | Ex().success_msg("Perfect piping! This may be the first time you used `|`, but it's definitely not the last!") 238 | ``` 239 | 240 | --- 241 | 242 | ## How can I combine many commands? 243 | 244 | ```yaml 245 | type: ConsoleExercise 246 | key: b8753881d6 247 | xp: 100 248 | ``` 249 | 250 | You can chain any number of commands together. 
251 | For example, 252 | this command: 253 | 254 | ```{shell} 255 | cut -d , -f 1 seasonal/spring.csv | grep -v Date | head -n 10 256 | ``` 257 | 258 | will: 259 | 260 | 1. select the first column from the spring data; 261 | 2. remove the header line containing the word "Date"; and 262 | 3. select the first 10 lines of actual data. 263 | 264 | `@instructions` 265 | In the previous exercise, you used the following command to select all the tooth names from column 2 of `seasonal/summer.csv`: 266 | 267 | ``` 268 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth 269 | ``` 270 | 271 | Extend this pipeline with a `head` command to only select the very first tooth name. 272 | 273 | `@hint` 274 | Copy and paste the code in the instructions, append a pipe, then call `head` with the `-n` flag. 275 | 276 | `@pre_exercise_code` 277 | ```{python} 278 | 279 | ``` 280 | 281 | `@solution` 282 | ```{shell} 283 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth | head -n 1 284 | ``` 285 | 286 | `@sct` 287 | ```{python} 288 | Ex().multi( 289 | has_cwd('/home/repl'), 290 | # for some reason has_expr_output with strict=True does not work here... 291 | has_output('^\s*canine\s*$', incorrect_msg = "Have you used `|` to extend the pipeline with a `head` command? Make sure to set the `-n` flag correctly."), 292 | # by coincidence, tail -n 1 returns the same as head -n 1, so check that head was called 293 | has_code("head", "Have you used `|` to extend the pipeline with a `head` command?") 294 | ) 295 | Ex().success_msg("Cheerful chaining! By chaining several commands together, you can build powerful data manipulation pipelines.") 296 | ``` 297 | 298 | --- 299 | 300 | ## How can I count the records in a file? 301 | 302 | ```yaml 303 | type: ConsoleExercise 304 | key: ae6a48d6aa 305 | xp: 100 306 | ``` 307 | 308 | The command `wc` (short for "word count") prints the number of **c**haracters, **w**ords, and **l**ines in a file. 
309 | You can make it print only one of these using `-c`, `-w`, or `-l` respectively. 310 | 311 | `@instructions` 312 | Count how many records in `seasonal/spring.csv` have dates in July 2017 (`2017-07`). 313 | - To do this, use `grep` with a partial date to select the lines and pipe this result into `wc` with an appropriate flag to count the lines. 314 | 315 | `@hint` 316 | - Use `head seasonal/spring.csv` to remind yourself of the date format. 317 | - The first part of the command takes the form `grep thing_to_match filename`. 318 | - After the pipe, `|`, call `wc` with the `-l` flag. 319 | 320 | `@pre_exercise_code` 321 | ```{python} 322 | 323 | ``` 324 | 325 | `@solution` 326 | ```{shell} 327 | grep 2017-07 seasonal/spring.csv | wc -l 328 | ``` 329 | 330 | `@sct` 331 | ```{python} 332 | Ex().multi( 333 | has_cwd('/home/repl'), 334 | check_correct( 335 | has_expr_output(strict=True), 336 | multi( 337 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 338 | has_code("2017-07", incorrect_msg = "Did you search for `2017-07`?"), 339 | has_code("seasonal/spring.csv", incorrect_msg = "Did you search the `seasonal/spring.csv` file?"), 340 | has_code("|", incorrect_msg = "Did you pipe to `wc` using `|`?"), 341 | has_code("wc", incorrect_msg = "Did you call `wc`?"), 342 | has_code("-l", incorrect_msg = "Did you count lines with `-l`?") 343 | ) 344 | ) 345 | ) 346 | Ex().success_msg("Careful counting! Determining how much data you have is a great first step in any data analysis.") 347 | ``` 348 | 349 | --- 350 | 351 | ## How can I specify many files at once? 352 | 353 | ```yaml 354 | type: ConsoleExercise 355 | key: 602d47e70c 356 | xp: 100 357 | ``` 358 | 359 | Most shell commands will work on multiple files if you give them multiple filenames. 
360 | For example, 361 | you can get the first column from all of the seasonal data files at once like this: 362 | 363 | ```{shell} 364 | cut -d , -f 1 seasonal/winter.csv seasonal/spring.csv seasonal/summer.csv seasonal/autumn.csv 365 | ``` 366 | 367 | But typing the names of many files over and over is a bad idea: 368 | it wastes time, 369 | and sooner or later you will either leave a file out or repeat a file's name. 370 | To make your life better, 371 | the shell allows you to use **wildcards** to specify a list of files with a single expression. 372 | The most common wildcard is `*`, 373 | which means "match zero or more characters". 374 | Using it, 375 | we can shorten the `cut` command above to this: 376 | 377 | ```{shell} 378 | cut -d , -f 1 seasonal/* 379 | ``` 380 | 381 | or: 382 | 383 | ```{shell} 384 | cut -d , -f 1 seasonal/*.csv 385 | ``` 386 | 387 | `@instructions` 388 | Write a single command using `head` to get the first three lines from both `seasonal/spring.csv` and `seasonal/summer.csv`, a total of six lines of data, but *not* from the autumn or winter data files. 389 | Use a wildcard instead of spelling out the files' names in full. 390 | 391 | `@hint` 392 | - The command takes the form `head -n number_of_lines filename_pattern`. 393 | - You could match files in directory `a`, starting with `b`, using `a/b*`, for example. 394 | 395 | `@pre_exercise_code` 396 | ```{python} 397 | 398 | ``` 399 | 400 | `@solution` 401 | ```{shell} 402 | head -n 3 seasonal/s* # ...or seasonal/s*.csv, or even s*/s*.csv 403 | ``` 404 | 405 | `@sct` 406 | ```{python} 407 | Ex().multi( 408 | has_cwd('/home/repl'), 409 | has_expr_output(incorrect_msg = "You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`. Make sure to only include the first three lines of each file with the `-n` flag!"), 410 | check_not(has_output('==> seasonal/autumn.csv <=='), incorrect_msg = "Don't include the output for `seasonal/autumn.csv`. 
You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`"), 411 | check_not(has_output('==> seasonal/winter.csv <=='), incorrect_msg = "Don't include the output for `seasonal/winter.csv`. You can use `seasonal/s*` to select `seasonal/spring.csv` and `seasonal/summer.csv`") 412 | ) 413 | Ex().success_msg("Wild wildcard work! This becomes even more important if your directory contains hundreds or thousands of files.") 414 | ``` 415 | 416 | --- 417 | 418 | ## What other wildcards can I use? 419 | 420 | ```yaml 421 | type: PureMultipleChoiceExercise 422 | key: f8feeacd8c 423 | xp: 50 424 | ``` 425 | 426 | The shell has other wildcards as well, 427 | though they are less commonly used: 428 | 429 | - `?` matches a single character, so `201?.txt` will match `2017.txt` or `2018.txt`, but not `2017-01.txt`. 430 | - `[...]` matches any one of the characters inside the square brackets, so `201[78].txt` matches `2017.txt` or `2018.txt`, but not `2016.txt`. 431 | - `{...}` matches any of the comma-separated patterns inside the curly brackets (written without spaces around the commas), so `{*.txt,*.csv}` matches any file whose name ends with `.txt` or `.csv`, but not files whose names end with `.pdf`. 432 | 433 |
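To see these wildcards in action, here is a small sketch you could try in a scratch directory. The file names are invented for the example, and it assumes you are running `bash`, since curly-bracket lists are a `bash` feature and must be written without spaces after the commas.

```shell
# Create a throwaway directory holding a few hypothetical files.
tmpdir=$(mktemp -d)
(cd "$tmpdir" && touch 2016.txt 2017.txt 2018.txt 2017-01.txt notes.csv report.pdf)

# ? stands for exactly one character, so 2017-01.txt is not matched.
single=$(cd "$tmpdir" && echo 201?.txt)
# [78] stands for one character that is either 7 or 8.
bracket=$(cd "$tmpdir" && echo 201[78].txt)
# {a,b} tries each pattern in turn; note there is no space after the comma.
braces=$(cd "$tmpdir" && echo {*.txt,*.csv})

echo "$single"
echo "$bracket"
echo "$braces"

rm -r "$tmpdir"
```

Using `echo` like this is a handy way to preview what a wildcard will match before you hand it to a command that changes or deletes files.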
434 | 435 | Which expression would match `singh.pdf` and `johel.txt` but *not* `sandhu.pdf` or `sandhu.txt`? 436 | 437 | `@hint` 438 | Match each expression against each filename in turn. 439 | 440 | `@possible_answers` 441 | - `[sj]*.{.pdf,.txt}` 442 | - `{s*.pdf,j*.txt}` 443 | - `[singh,johel]{*.pdf,*.txt}` 444 | - [`{singh.pdf,j*.txt}`] 445 | 446 | `@feedback` 447 | - No: `.pdf` and `.txt` are not filenames. 448 | - No: this will match `sandhu.pdf`. 449 | - No: the expression in square brackets matches only one character, not entire words. 450 | - Correct! 451 | 452 | --- 453 | 454 | ## How can I sort lines of text? 455 | 456 | ```yaml 457 | type: ConsoleExercise 458 | key: f06d9e310e 459 | xp: 100 460 | ``` 461 | 462 | As its name suggests, 463 | `sort` puts data in order. 464 | By default it does this in ascending alphabetical order, 465 | but the flags `-n` and `-r` can be used to sort numerically and reverse the order of its output, 466 | while `-b` tells it to ignore leading blanks 467 | and `-f` tells it to **f**old case (i.e., be case-insensitive). 468 | Pipelines often use `grep` to get rid of unwanted records 469 | and then `sort` to put the remaining records in order. 470 | 471 | `@instructions` 472 | Remember the combination of `cut` and `grep` to select all the tooth names from column 2 of `seasonal/summer.csv`? 473 | 474 | ``` 475 | cut -d , -f 2 seasonal/summer.csv | grep -v Tooth 476 | ``` 477 | 478 | Starting from this recipe, sort the names of the teeth in `seasonal/winter.csv` (not `summer.csv`) in descending alphabetical order. To do this, extend the pipeline with a `sort` step. 479 | 480 | `@hint` 481 | Copy and paste the command in the instructions, change the filename, append a pipe, then call `sort` with the `-r` flag.
482 | 483 | `@pre_exercise_code` 484 | ```{python} 485 | 486 | ``` 487 | 488 | `@solution` 489 | ```{shell} 490 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort -r 491 | ``` 492 | 493 | `@sct` 494 | ```{python} 495 | Ex().multi( 496 | has_cwd('/home/repl'), 497 | check_correct( 498 | has_expr_output(strict=True), 499 | multi( 500 | has_code("cut", incorrect_msg = "Did you call `cut`?"), 501 | has_code("-d", incorrect_msg = "Did you specify a field delimiter with `-d`?"), 502 | has_code("seasonal/winter.csv", incorrect_msg = "Did you get data from the `seasonal/winter.csv` file?"), 503 | has_code("|", incorrect_msg = "Did you pipe from `cut` to `grep` to `sort` using `|`?"), 504 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 505 | has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"), 506 | has_code("Tooth", incorrect_msg = "Did you search for `Tooth`?"), 507 | has_code("sort", incorrect_msg = "Did you call `sort`?"), 508 | has_code("-r", incorrect_msg = "Did you reverse the sort order with `-r`?") 509 | ) 510 | ) 511 | ) 512 | Ex().success_msg("Sorted! `sort` has many uses. For example, piping `sort -n -r` to `head` shows you the largest values.") 513 | ``` 514 | 515 | --- 516 | 517 | ## How can I remove duplicate lines? 518 | 519 | ```yaml 520 | type: ConsoleExercise 521 | key: ed77aed337 522 | xp: 100 523 | ``` 524 | 525 | Another command that is often used with `sort` is `uniq`, 526 | whose job is to remove duplicated lines. 527 | More specifically, 528 | it removes *adjacent* duplicated lines. 529 | If a file contains: 530 | 531 | ``` 532 | 2017-07-03 533 | 2017-07-03 534 | 2017-08-03 535 | 2017-08-03 536 | ``` 537 | 538 | then `uniq` will produce: 539 | 540 | ``` 541 | 2017-07-03 542 | 2017-08-03 543 | ``` 544 | 545 | but if it contains: 546 | 547 | ``` 548 | 2017-07-03 549 | 2017-08-03 550 | 2017-07-03 551 | 2017-08-03 552 | ``` 553 | 554 | then `uniq` will print all four lines.
555 | The reason is that `uniq` is built to work with very large files. 556 | In order to remove non-adjacent duplicate lines from a file, 557 | it would have to keep the whole file in memory 558 | (or at least, 559 | all the unique lines seen so far). 560 | By only removing adjacent duplicates, 561 | it only has to keep the most recent unique line in memory. 562 | 563 | `@instructions` 564 | Write a pipeline to: 565 | 566 | - get the second column from `seasonal/winter.csv`, 567 | - remove the header line containing the word "Tooth" so that only tooth names are displayed, 568 | - sort the output so that all occurrences of a particular tooth name are adjacent, and 569 | - display each tooth name once along with a count of how often it occurs. 570 | 571 | The start of your pipeline is the same as the previous exercise: 572 | 573 | ``` 574 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth 575 | ``` 576 | 577 | Extend it with a `sort` command, and use `uniq -c` to display unique lines with a count of how often each occurs rather than using `uniq` and `wc`. 578 | 579 | `@hint` 580 | Copy and paste the command in the instructions, pipe to `sort` without flags, then pipe again to `uniq` with a `-c` flag. 581 | 582 | `@pre_exercise_code` 583 | ```{python} 584 | 585 | ``` 586 | 587 | `@solution` 588 | ```{shell} 589 | cut -d , -f 2 seasonal/winter.csv | grep -v Tooth | sort | uniq -c 590 | ``` 591 | 592 | `@sct` 593 | ```{python} 594 | Ex().multi( 595 | has_cwd('/home/repl'), 596 | check_correct( 597 | has_expr_output(), 598 | multi( 599 | has_code(r'cut\s+-d\s+,\s+-f\s+2\s+seasonal/winter.csv\s+\|\s+grep\s+-v\s+Tooth', 600 | incorrect_msg="You should start from this command: `cut -d , -f 2 seasonal/winter.csv | grep -v Tooth`. 
Now extend it!"), 601 | has_code(r'\|\s+sort', incorrect_msg="Have you extended the command with `| sort`?"), 602 | has_code(r'\|\s+uniq', incorrect_msg="Have you extended the command with `| uniq`?"), 603 | has_code('-c', incorrect_msg="Have you included counts with `-c`?") 604 | ) 605 | ) 606 | ) 607 | Ex().success_msg("Great! After all of this work on a pipe, it would be nice if we could store the result, no?") 608 | ``` 609 | 610 | --- 611 | 612 | ## How can I save the output of a pipe? 613 | 614 | ```yaml 615 | type: MultipleChoiceExercise 616 | key: 4115aa25b2 617 | xp: 50 618 | ``` 619 | 620 | The shell lets us redirect the output of a sequence of piped commands: 621 | 622 | ```{shell} 623 | cut -d , -f 2 seasonal/*.csv | grep -v Tooth > teeth-only.txt 624 | ``` 625 | 626 | However, `>` must appear at the end of the pipeline: 627 | if we try to use it in the middle, like this: 628 | 629 | ```{shell} 630 | cut -d , -f 2 seasonal/*.csv > teeth-only.txt | grep -v Tooth 631 | ``` 632 | 633 | then all of the output from `cut` is written to `teeth-only.txt`, 634 | so there is nothing left for `grep`: 635 | in `bash`, it receives no input and prints nothing. 636 | 637 |
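Here is a minimal sketch of saving a pipeline's result, using a tiny made-up file so that it can be run anywhere. The name `sample.csv` and its three lines are invented for this example; only `teeth-only.txt` comes from the text above.

```shell
# Build a small stand-in data file in a throwaway directory.
tmpdir=$(mktemp -d)
printf 'Date,Tooth\n2017-01-25,wisdom\n2017-02-19,canine\n' > "$tmpdir/sample.csv"

# The redirection comes last, so the file receives the output of grep,
# which is the final command in the pipeline.
cut -d , -f 2 "$tmpdir/sample.csv" | grep -v Tooth > "$tmpdir/teeth-only.txt"

# Read the saved result back to confirm what was stored.
saved=$(cat "$tmpdir/teeth-only.txt")
echo "$saved"

rm -r "$tmpdir"
```

Nothing is printed by the pipeline itself; the tooth names only appear once the saved file is read back.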
638 | 639 | What happens if we put redirection at the front of a pipeline as in: 640 | 641 | ```{shell} 642 | > result.txt head -n 3 seasonal/winter.csv 643 | ``` 644 | 645 | `@possible_answers` 646 | - [The command's output is redirected to the file as usual.] 647 | - The shell reports it as an error. 648 | - The shell waits for input forever. 649 | 650 | `@hint` 651 | Try it out in the shell. 652 | 653 | `@pre_exercise_code` 654 | ```{python} 655 | 656 | ``` 657 | 658 | `@sct` 659 | ```{python} 660 | Ex().has_chosen(1, ['Correct!', 'No; the shell can actually execute this.', 'No; the shell can actually execute this.']) 661 | ``` 662 | 663 | --- 664 | 665 | ## How can I stop a running program? 666 | 667 | ```yaml 668 | type: ConsoleExercise 669 | key: d1694dbdcd 670 | xp: 100 671 | ``` 672 | 673 | The commands and scripts that you have run so far have all executed quickly, 674 | but some tasks will take minutes, hours, or even days to complete. 675 | You may also mistakenly run a command without giving it any input, 676 | leaving it waiting forever. 677 | If you decide that you don't want a program to keep running, 678 | you can type `Ctrl` + `C` to end it. 679 | This is often written `^C` in Unix documentation; 680 | note that the 'c' can be lower-case. 681 | 682 | `@instructions` 683 | Run the command: 684 | 685 | ```{shell} 686 | head 687 | ``` 688 | 689 | with no arguments (so that it waits for input that will never come) 690 | and then stop it by typing `Ctrl` + `C`. 691 | 692 | `@hint` 693 | Simply type `head`, hit Enter and exit the running program with `Ctrl` + `C`. 694 | 695 | `@pre_exercise_code` 696 | ```{python} 697 | 698 | ``` 699 | 700 | `@solution` 701 | ```{shell} 702 | # Simply type head, hit Enter and exit the running program with `Ctrl` + `C`.
703 | ``` 704 | 705 | `@sct` 706 | ```{python} 707 | Ex().has_code(r'\s*head\s*', fixed=False, incorrect_msg="Have you used `head`?") 708 | ``` 709 | 710 | --- 711 | 712 | ## Wrapping up 713 | 714 | ```yaml 715 | type: BulletConsoleExercise 716 | key: 659d3caa48 717 | xp: 100 718 | ``` 719 | 720 | To wrap up, 721 | you will build a pipeline to find out how many records are in the shortest of the seasonal data files. 722 | 723 | `@pre_exercise_code` 724 | ```{python} 725 | 726 | ``` 727 | 728 | *** 729 | 730 | ```yaml 731 | type: ConsoleExercise 732 | key: b1f9c8ff84 733 | xp: 35 734 | ``` 735 | 736 | `@instructions` 737 | Use `wc` with appropriate parameters to list the number of lines in all of the seasonal data files. 738 | (Use a wildcard for the filenames instead of typing them all in by hand.) 739 | 740 | `@hint` 741 | Use `-l` to list only the lines and `*` to match filenames. 742 | 743 | `@solution` 744 | ```{shell} 745 | wc -l seasonal/*.csv 746 | 747 | ``` 748 | 749 | `@sct` 750 | ```{python} 751 | Ex().multi( 752 | has_cwd('/home/repl'), 753 | check_correct( 754 | has_expr_output(strict=True), 755 | multi( 756 | has_code("wc", incorrect_msg = "Did you call `wc`?"), 757 | has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"), 758 | has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?") 759 | ) 760 | ) 761 | ) 762 | 763 | ``` 764 | 765 | *** 766 | 767 | ```yaml 768 | type: ConsoleExercise 769 | key: 7f94acc679 770 | xp: 35 771 | ``` 772 | 773 | `@instructions` 774 | Add another command to the previous one using a pipe to remove the line containing the word "total". 
775 | 776 | `@hint` 777 | 778 | 779 | `@solution` 780 | ```{shell} 781 | wc -l seasonal/*.csv | grep -v total 782 | 783 | ``` 784 | 785 | `@sct` 786 | ```{python} 787 | Ex().multi( 788 | has_cwd('/home/repl'), 789 | check_correct( 790 | has_expr_output(strict=True), 791 | multi( 792 | has_code("wc", incorrect_msg = "Did you call `wc`?"), 793 | has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"), 794 | has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?"), 795 | has_code("|", incorrect_msg = "Did you pipe from `wc` to `grep` using `|`?"), 796 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 797 | has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"), 798 | has_code("total", incorrect_msg = "Did you search for `total`?") 799 | ) 800 | ) 801 | ) 802 | 803 | ``` 804 | 805 | *** 806 | 807 | ```yaml 808 | type: ConsoleExercise 809 | key: c5f55bff6b 810 | xp: 30 811 | ``` 812 | 813 | `@instructions` 814 | Add two more stages to the pipeline that use `sort -n` and `head -n 1` to find the file containing the fewest lines. 815 | 816 | `@hint` 817 | - Use `sort`'s `-n` flag to sort numerically. 818 | - Use `head`'s `-n` flag to limit to keeping 1 line. 
819 | 820 | `@solution` 821 | ```{shell} 822 | wc -l seasonal/*.csv | grep -v total | sort -n | head -n 1 823 | 824 | ``` 825 | 826 | `@sct` 827 | ```{python} 828 | Ex().multi( 829 | has_cwd('/home/repl'), 830 | check_correct( 831 | has_expr_output(strict=True), 832 | multi( 833 | has_code("wc", incorrect_msg = "Did you call `wc`?"), 834 | has_code("-l", incorrect_msg = "Did you count the number of lines with `-l`?"), 835 | has_code("seasonal/\*", incorrect_msg = "Did you get data from all `seasonal/*` files?"), 836 | has_code("|", incorrect_msg = "Did you pipe from `wc` to `grep` to `sort` to `head` using `|`?"), 837 | has_code("grep", incorrect_msg = "Did you call `grep`?"), 838 | has_code("-v", incorrect_msg = "Did you invert the match with `-v`?"), 839 | has_code("total", incorrect_msg = "Did you search for `total`?"), 840 | has_code("sort", incorrect_msg = "Did you call `sort`?"), 841 | has_code("-n", incorrect_msg = "Did you specify the number of lines to keep with `-n`?"), 842 | has_code("1", incorrect_msg = "Did you specify 1 line to keep with `-n 1`?") 843 | ) 844 | ) 845 | ) 846 | Ex().success_msg("Great! It turns out `autumn.csv` is the file with the fewest lines. Rush over to chapter 4 to learn more about batch processing!") 847 | 848 | ``` 849 | -------------------------------------------------------------------------------- /chapter4.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Batch processing 3 | description: >- 4 | Most shell commands will process many files at once. This chapter shows you 5 | how to make your own pipelines do that. Along the way, you will see how the 6 | shell uses variables to store information. 7 | lessons: 8 | - nb_of_exercises: 10 9 | title: How does the shell store information? 10 | --- 11 | 12 | ## How does the shell store information? 
13 | 14 | ```yaml 15 | type: MultipleChoiceExercise 16 | key: e4d5f4adea 17 | xp: 50 18 | ``` 19 | 20 | Like other programs, the shell stores information in variables. 21 | Some of these, 22 | called **environment variables**, 23 | are available all the time. 24 | Environment variables' names are conventionally written in upper case, 25 | and a few of the more commonly used ones are shown below. 26 | 27 | | Variable | Purpose | Value | 28 | |----------|-----------------------------------|-----------------------| 29 | | `HOME` | User's home directory | `/home/repl` | 30 | | `PWD` | Present working directory | Same as `pwd` command | 31 | | `SHELL` | Which shell program is being used | `/bin/bash` | 32 | | `USER` | User's ID | `repl` | 33 | 34 | To get a complete list (which is quite long), 35 | you can type `set` in the shell. 36 | 37 |
38 | 39 | Use `set` and `grep` with a pipe to display the value of `HISTFILESIZE`, 40 | which determines how many old commands are stored in your command history. 41 | What is its value? 42 | 43 | `@possible_answers` 44 | - 10 45 | - 500 46 | - [2000] 47 | - The variable is not there. 48 | 49 | `@hint` 50 | Use `set | grep HISTFILESIZE` to get the line you need. 51 | 52 | `@pre_exercise_code` 53 | ```{python} 54 | 55 | ``` 56 | 57 | `@sct` 58 | ```{python} 59 | err1 = "No: the shell records more history than that." 60 | err2 = "No: the shell records more history than that." 61 | correct3 = "Correct: the shell saves 2000 old commands by default on this system." 62 | err4 = "No: the variable `HISTFILESIZE` is there." 63 | Ex().has_chosen(3, [err1, err2, correct3, err4]) 64 | ``` 65 | 66 | --- 67 | 68 | ## How can I print a variable's value? 69 | 70 | ```yaml 71 | type: ConsoleExercise 72 | key: afae0f33a7 73 | xp: 100 74 | ``` 75 | 76 | A simpler way to find a variable's value is to use a command called `echo`, which prints its arguments. Typing 77 | 78 | ```{shell} 79 | echo hello DataCamp! 80 | ``` 81 | 82 | prints 83 | 84 | ``` 85 | hello DataCamp! 86 | ``` 87 | 88 | If you try to use it to print a variable's value like this: 89 | 90 | ```{shell} 91 | echo USER 92 | ``` 93 | 94 | it will print the variable's name, `USER`. 95 | 96 | To get the variable's value, you must put a dollar sign `$` in front of it. Typing 97 | 98 | ```{shell} 99 | echo $USER 100 | ``` 101 | 102 | prints 103 | 104 | ``` 105 | repl 106 | ``` 107 | 108 | This is true everywhere: 109 | to get the value of a variable called `X`, 110 | you must write `$X`. 111 | (This is so that the shell can tell whether you mean "a file named X" 112 | or "the value of a variable named X".) 113 | 114 | `@instructions` 115 | The variable `OSTYPE` holds the name of the kind of operating system you are using. 116 | Display its value using `echo`. 
117 | 118 | `@hint` 119 | Call `echo` with the variable `OSTYPE` prepended by `$`. 120 | 121 | `@pre_exercise_code` 122 | ```{python} 123 | 124 | ``` 125 | 126 | `@solution` 127 | ```{shell} 128 | echo $OSTYPE 129 | ``` 130 | 131 | `@sct` 132 | ```{python} 133 | Ex().multi( 134 | has_cwd('/home/repl'), 135 | check_correct( 136 | has_expr_output(strict = True), 137 | multi( 138 | has_code('echo', incorrect_msg="Did you call `echo`?"), 139 | has_code('OSTYPE', incorrect_msg="Did you print the `OSTYPE` environment variable?"), 140 | has_code(r'\$OSTYPE', incorrect_msg="Make sure to prepend `OSTYPE` by a `$`.") 141 | ) 142 | ) 143 | ) 144 | Ex().success_msg("Excellent echoing of environment variables! You're off to a good start. Let's carry on!") 145 | ``` 146 | 147 | --- 148 | 149 | ## How else does the shell store information? 150 | 151 | ```yaml 152 | type: BulletConsoleExercise 153 | key: e925da48e4 154 | xp: 100 155 | ``` 156 | 157 | The other kind of variable is called a **shell variable**, 158 | which is like a local variable in a programming language. 159 | 160 | To create a shell variable, 161 | you simply assign a value to a name: 162 | 163 | ```{shell} 164 | training=seasonal/summer.csv 165 | ``` 166 | 167 | *without* any spaces before or after the `=` sign. 168 | Once you have done this, 169 | you can check the variable's value with: 170 | 171 | ```{shell} 172 | echo $training 173 | ``` 174 | ``` 175 | seasonal/summer.csv 176 | ``` 177 | 178 | `@pre_exercise_code` 179 | ```{python} 180 | 181 | ``` 182 | 183 | *** 184 | 185 | ```yaml 186 | type: ConsoleExercise 187 | key: 78f7fd446f 188 | xp: 50 189 | ``` 190 | 191 | `@instructions` 192 | Define a variable called `testing` with the value `seasonal/winter.csv`. 193 | 194 | `@hint` 195 | There should *not* be spaces between the variable's name and its value. 
196 | 197 | `@solution` 198 | ```{shell} 199 | testing=seasonal/winter.csv 200 | 201 | ``` 202 | 203 | `@sct` 204 | ```{python} 205 | # For some reason, testing the shell variable directly always passes, so we can't do the following. 206 | # Ex().multi( 207 | # has_cwd('/home/repl'), 208 | # has_expr_output( 209 | # expr='echo $testing', 210 | # output='seasonal/winter.csv', 211 | # incorrect_msg="Have you used `testing=seasonal/winter.csv` to define the `testing` variable?" 212 | # ) 213 | # ) 214 | Ex().multi( 215 | has_cwd('/home/repl'), 216 | multi( 217 | has_code('testing', incorrect_msg='Did you define a shell variable named `testing`?'), 218 | has_code('testing=', incorrect_msg='Did you write `=` directly after testing, with no spaces?'), 219 | has_code('=seasonal/winter\.csv', incorrect_msg='Did you set the value of `testing` to `seasonal/winter.csv`?') 220 | ) 221 | ) 222 | 223 | ``` 224 | 225 | *** 226 | 227 | ```yaml 228 | type: ConsoleExercise 229 | key: d5e7224f55 230 | xp: 50 231 | ``` 232 | 233 | `@instructions` 234 | Use `head -n 1 SOMETHING` to get the first line from `seasonal/winter.csv` 235 | using the value of the variable `testing` instead of the name of the file. 236 | 237 | `@hint` 238 | Remember to use `$testing` rather than just `testing` 239 | (the `$` is needed to get the value of the variable). 
240 | 241 | `@solution` 242 | ```{shell} 243 | # We need to re-set the variable for testing purposes for this exercise 244 | # you should only run "head -n 1 $testing" 245 | testing=seasonal/winter.csv 246 | head -n 1 $testing 247 | 248 | ``` 249 | 250 | `@sct` 251 | ```{python} 252 | Ex().multi( 253 | has_cwd('/home/repl'), 254 | has_code(r'\$testing', incorrect_msg="Did you reference the shell variable using `$testing`?"), 255 | check_correct( 256 | has_output('^Date,Tooth\s*$'), 257 | multi( 258 | has_code('head', incorrect_msg="Did you call `head`?"), 259 | has_code('-n', incorrect_msg="Did you limit the number of lines with `-n`?"), 260 | has_code(r'-n\s+1', incorrect_msg="Did you elect to keep 1 line with `-n 1`?") 261 | ) 262 | ) 263 | ) 264 | Ex().success_msg("Stellar! Let's see how you can repeat commands easily.") 265 | 266 | ``` 267 | 268 | --- 269 | 270 | ## How can I repeat a command many times? 271 | 272 | ```yaml 273 | type: ConsoleExercise 274 | key: 920d1887e3 275 | xp: 100 276 | ``` 277 | 278 | Shell variables are also used in **loops**, 279 | which repeat commands many times. 280 | If we run this command: 281 | 282 | ```{shell} 283 | for filetype in gif jpg png; do echo $filetype; done 284 | ``` 285 | 286 | it produces: 287 | 288 | ``` 289 | gif 290 | jpg 291 | png 292 | ``` 293 | 294 | Notice these things about the loop: 295 | 296 | 1. The structure is `for` ...variable... `in` ...list... `; do` ...body... `; done` 297 | 2. The list of things the loop is to process (in our case, the words `gif`, `jpg`, and `png`). 298 | 3. The variable that keeps track of which thing the loop is currently processing (in our case, `filetype`). 299 | 4. The body of the loop that does the processing (in our case, `echo $filetype`). 300 | 301 | Notice that the body uses `$filetype` to get the variable's value instead of just `filetype`, 302 | just like it does with any other shell variable. 
303 | Also notice where the semi-colons go: 304 | the first one comes between the list and the keyword `do`, 305 | and the second comes between the body and the keyword `done`. 306 | 307 | `@instructions` 308 | Modify the loop so that it prints: 309 | 310 | ``` 311 | docx 312 | odt 313 | pdf 314 | ``` 315 | 316 | Please use `filetype` as the name of the loop variable. 317 | 318 | `@hint` 319 | Use the code structure in the introductory text, swapping the image file types for document file types. 320 | 321 | `@pre_exercise_code` 322 | ```{python} 323 | 324 | ``` 325 | 326 | `@solution` 327 | ```{shell} 328 | for filetype in docx odt pdf; do echo $filetype; done 329 | ``` 330 | 331 | `@sct` 332 | ```{python} 333 | Ex().multi( 334 | has_cwd('/home/repl'), 335 | check_correct( 336 | has_expr_output(), 337 | multi( 338 | has_code('for', incorrect_msg='Did you call `for`?'), 339 | has_code('filetype', incorrect_msg='Did you use `filetype` as the loop variable?'), 340 | has_code('in', incorrect_msg='Did you use `in` before the list of file types?'), 341 | has_code('docx odt pdf', incorrect_msg='Did you loop over `docx`, `odt` and `pdf` in that order?'), 342 | has_code(r'pdf\s*;', incorrect_msg='Did you put a semi-colon after the last loop element?'), 343 | has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'), 344 | has_code('echo', incorrect_msg='Did you call `echo`?'), 345 | has_code(r'\$filetype', incorrect_msg='Did you echo `$filetype`?'), 346 | has_code(r'filetype\s*;', incorrect_msg='Did you put a semi-colon after the loop body?'), 347 | has_code('; done', incorrect_msg='Did you finish with `done`?') 348 | ) 349 | ) 350 | ) 351 | Ex().success_msg("First-rate for looping! Loops are brilliant if you want to do the same thing hundreds or thousands of times.") 352 | ``` 353 | 354 | --- 355 | 356 | ## How can I repeat a command once for each file? 
357 | 358 | ```yaml 359 | type: ConsoleExercise 360 | key: 8468b70a71 361 | xp: 100 362 | ``` 363 | 364 | You can always type in the names of the files you want to process when writing the loop, 365 | but it's usually better to use wildcards. 366 | Try running this loop in the console: 367 | 368 | ```{shell} 369 | for filename in seasonal/*.csv; do echo $filename; done 370 | ``` 371 | 372 | It prints: 373 | 374 | ``` 375 | seasonal/autumn.csv 376 | seasonal/spring.csv 377 | seasonal/summer.csv 378 | seasonal/winter.csv 379 | ``` 380 | 381 | because the shell expands `seasonal/*.csv` to be a list of four filenames 382 | before it runs the loop. 383 | 384 | `@instructions` 385 | Modify the wildcard expression to `people/*` 386 | so that the loop prints the names of the files in the `people` directory 387 | regardless of what suffix they do or don't have. 388 | Please use `filename` as the name of your loop variable. 389 | 390 | `@hint` 391 | 392 | 393 | `@pre_exercise_code` 394 | ```{python} 395 | 396 | ``` 397 | 398 | `@solution` 399 | ```{bash} 400 | for filename in people/*; do echo $filename; done 401 | ``` 402 | 403 | `@sct` 404 | ```{python} 405 | Ex().multi( 406 | has_cwd('/home/repl'), 407 | check_correct( 408 | has_expr_output(), 409 | multi( 410 | has_code('for', incorrect_msg='Did you call `for`?'), 411 | has_code('filename', incorrect_msg='Did you use `filename` as the loop variable?'), 412 | has_code('in', incorrect_msg='Did you use `in` before the list of files?'), 413 | has_code(r'people/\*', incorrect_msg='Did you specify a list of files with `people/*`?'), 414 | has_code(r'people/\*\s*;', incorrect_msg='Did you put a semi-colon after the list of files?'), 415 | has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'), 416 | has_code('echo', incorrect_msg='Did you call `echo`?'), 417 | has_code(r'\$filename', incorrect_msg='Did you echo `$filename`?'), 418 | has_code(r'filename\s*;', incorrect_msg='Did you put a semi-colon
after the loop body?'), 419 | has_code('; done', incorrect_msg='Did you finish with `done`?') 420 | ) 421 | ) 422 | ) 423 | Ex().success_msg("Loopy looping! Wildcards and loops make a powerful combination.") 424 | ``` 425 | 426 | --- 427 | 428 | ## How can I record the names of a set of files? 429 | 430 | ```yaml 431 | type: MultipleChoiceExercise 432 | key: 153ca10317 433 | xp: 50 434 | ``` 435 | 436 | People often set a variable using a wildcard expression to record a list of filenames. 437 | For example, 438 | if you define `datasets` like this: 439 | 440 | ```{shell} 441 | datasets=seasonal/*.csv 442 | ``` 443 | 444 | you can display the files' names later using: 445 | 446 | ```{shell} 447 | for filename in $datasets; do echo $filename; done 448 | ``` 449 | 450 | This saves typing and makes errors less likely. 451 | 452 |
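A minimal sketch of why this works, assuming the shell is bash: the assignment stores the pattern itself, and the shell expands it into filenames each time the unquoted variable is used. The scratch directory and the files `a.csv` and `b.csv` are made up for this illustration and are not part of the course data.

```shell
# Work in a throwaway directory so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
touch demo/a.csv demo/b.csv

# The assignment stores the pattern text, not the list of files.
datasets=demo/*.csv
stored=$(echo "$datasets")   # quoted: the pattern is printed as written
expanded=$(echo $datasets)   # unquoted: the shell expands the wildcard

echo "$stored"
echo "$expanded"

# The loop therefore sees one word per matching file.
count=0
for filename in $datasets; do count=$((count + 1)); done
echo "$count"
```

Because the wildcard is expanded at the point of use, files added to the directory after the assignment are picked up the next time the loop runs.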
453 | 454 | If you run these two commands in your home directory, 455 | how many lines of output will they print? 456 | 457 | ```{shell} 458 | files=seasonal/*.csv 459 | for f in $files; do echo $f; done 460 | ``` 461 | 462 | `@possible_answers` 463 | - None: since `files` is defined on a separate line, it has no value in the second line. 464 | - One: the word "files". 465 | - Four: the names of all four seasonal data files. 466 | 467 | `@hint` 468 | Remember that `X` on its own is just "X", while `$X` is the value of the variable `X`. 469 | 470 | `@pre_exercise_code` 471 | ```{python} 472 | 473 | ``` 474 | 475 | `@sct` 476 | ```{python} 477 | err1 = "No: you do not have to define a variable on the same line you use it." 478 | err2 = "No: this example defines and uses the variable `files` in the same shell." 479 | correct3 = "Correct. The command is equivalent to `for f in seasonal/*.csv; do echo $f; done`." 480 | Ex().has_chosen(3, [err1, err2, correct3]) 481 | ``` 482 | 483 | --- 484 | 485 | ## A variable's name versus its value 486 | 487 | ```yaml 488 | type: PureMultipleChoiceExercise 489 | key: 4fcfb63c4f 490 | xp: 50 491 | ``` 492 | 493 | A common mistake is to forget to use `$` before the name of a variable. 494 | When you do this, 495 | the shell uses the name you have typed 496 | rather than the value of that variable. 497 | 498 | A more common mistake for experienced users is to mis-type the variable's name. 499 | For example, 500 | if you define `datasets` like this: 501 | 502 | ```{shell} 503 | datasets=seasonal/*.csv 504 | ``` 505 | 506 | and then type: 507 | 508 | ```{shell} 509 | echo $datsets 510 | ``` 511 | 512 | the shell doesn't print anything, 513 | because `datsets` (without the second "a") isn't defined. 514 | 515 |
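One way to catch this kind of typo, sketched here on the assumption that the shell is bash: the `-u` option makes the shell report an error when an unset variable is used, instead of silently substituting nothing. The commands run in child `bash` processes so the option does not affect the current session.

```shell
# Without -u, the misspelled name expands to nothing and echo prints an empty line.
plain=$(bash -c 'datasets=seasonal/*.csv; echo $datsets')

# With -u, bash refuses to expand the unset variable and exits with a non-zero status.
if bash -u -c 'datasets=seasonal/*.csv; echo $datsets' 2>/dev/null; then
  strict=ok
else
  strict=failed
fi

echo "plain output: [$plain]"
echo "strict run: $strict"
```

Whether to turn this option on is a matter of taste; it is shown only to make the silent failure visible.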
516 | 517 | If you were to run these two commands in your home directory, 518 | what output would be printed? 519 | 520 | ```{shell} 521 | files=seasonal/*.csv 522 | for f in files; do echo $f; done 523 | ``` 524 | 525 | (Read the first part of the loop carefully before answering.) 526 | 527 | `@hint` 528 | Remember that `X` on its own is just "X", while `$X` is the value of the variable `X`. 529 | 530 | `@possible_answers` 531 | - [One line: the word "files".] 532 | - Four lines: the names of all four seasonal data files. 533 | - Four blank lines: the variable `f` isn't assigned a value. 534 | 535 | `@feedback` 536 | - Correct: the loop uses `files` instead of `$files`, so the list consists of the word "files". 537 | - No: the loop uses `files` instead of `$files`, so the list consists of the word "files" rather than the expansion of `files`. 538 | - No: the variable `f` is defined automatically by the `for` loop. 539 | 540 | --- 541 | 542 | ## How can I run many commands in a single loop? 543 | 544 | ```yaml 545 | type: ConsoleExercise 546 | key: 39b5dcf81a 547 | xp: 100 548 | ``` 549 | 550 | Printing filenames is useful for debugging, 551 | but the real purpose of loops is to do things with multiple files. 552 | This loop prints the second line of each data file: 553 | 554 | ```{shell} 555 | for file in seasonal/*.csv; do head -n 2 $file | tail -n 1; done 556 | ``` 557 | 558 | It has the same structure as the other loops you have already seen: 559 | all that's different is that its body is a pipeline of two commands instead of a single command. 560 | 561 | `@instructions` 562 | Write a loop that prints the last entry from July 2017 (`2017-07`) in every seasonal file. It should produce a similar output to: 563 | 564 | ```{shell} 565 | grep 2017-07 seasonal/winter.csv | tail -n 1 566 | ``` 567 | 568 | but for **_each_** seasonal file separately. 
Please use `file` as the name of the loop variable, and remember to loop through the list of files `seasonal/*.csv` (_instead of 'seasonal/winter.csv' as in the example_). 569 | 570 | `@hint` 571 | The loop body is the grep command shown in the instructions, with `seasonal/winter.csv` replaced by `$file`. 572 | 573 | `@pre_exercise_code` 574 | ```{python} 575 | 576 | ``` 577 | 578 | `@solution` 579 | ```{bash} 580 | for file in seasonal/*.csv; do grep 2017-07 $file | tail -n 1; done 581 | ``` 582 | 583 | `@sct` 584 | ```{python} 585 | Ex().multi( 586 | has_cwd('/home/repl'), 587 | # Enforce use of for loop, so students can't just use grep -h 2017-07 seasonal/*.csv 588 | has_code('for', incorrect_msg='Did you call `for`?'), 589 | check_correct( 590 | has_expr_output(), 591 | multi( 592 | has_code('file', incorrect_msg='Did you use `file` as the loop variable?'), 593 | has_code('in', incorrect_msg='Did you use `in` before the list of files?'), 594 | has_code(r'seasonal/\*', incorrect_msg='Did you specify a list of files with `seasonal/*`?'), 595 | has_code(r'seasonal\/\*\.csv\s*;', incorrect_msg='Did you put a semi-colon after the list of files?'), 596 | has_code(r';\s*do', incorrect_msg='Did you use `do` after the first semi-colon?'), 597 | has_code('grep', incorrect_msg='Did you call `grep`?'), 598 | has_code('2017-07', incorrect_msg='Did you match on `2017-07`?'), 599 | has_code(r'\$file', incorrect_msg='Did you use `$file` to get the value of the loop variable?'), 600 | has_code(r'file\s*\|', incorrect_msg='Did you use a pipe to connect your second command?'), 601 | has_code(r'tail\s*-n\s*1', incorrect_msg='Did you use `tail -n 1` to print the last entry of each search in your second command?'), 602 | has_code('; done', incorrect_msg='Did you finish with `done`?') 603 | ) 604 | ) 605 | ) 606 | 607 | Ex().success_msg("Loopy looping! Wildcards and loops make a powerful combination.") 608 | ``` 609 | 610 | --- 611 | 612 | ## Why shouldn't I use spaces in filenames?
613 | 614 | ```yaml 615 | type: PureMultipleChoiceExercise 616 | key: b974b7f45a 617 | xp: 50 618 | ``` 619 | 620 | It's easy and sensible to give files multi-word names like `July 2017.csv` 621 | when you are using a graphical file explorer. 622 | However, 623 | this causes problems when you are working in the shell. 624 | For example, 625 | suppose you wanted to rename `July 2017.csv` to be `2017 July data.csv`. 626 | You cannot type: 627 | 628 | ```{shell} 629 | mv July 2017.csv 2017 July data.csv 630 | ``` 631 | 632 | because it looks to the shell as though you are trying to move 633 | four files called `July`, `2017.csv`, `2017`, and `July` (again) 634 | into a directory called `data.csv`. 635 | Instead, 636 | you have to quote the files' names 637 | so that the shell treats each one as a single parameter: 638 | 639 | ```{shell} 640 | mv 'July 2017.csv' '2017 July data.csv' 641 | ``` 642 | 643 |
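To see how the shell splits these names into separate arguments, here is a small sketch. The helper function `count_args` is hypothetical: it is defined only for this illustration and simply prints how many arguments it received.

```shell
# Hypothetical helper: report the number of arguments the shell passed in.
count_args() {
  echo "$#"
}

# Unquoted, the two filenames are split at every space into five words.
unquoted=$(count_args July 2017.csv 2017 July data.csv)

# Quoted, each filename reaches the command as a single argument.
quoted=$(count_args 'July 2017.csv' '2017 July data.csv')

echo "$unquoted"   # prints 5
echo "$quoted"     # prints 2
```

The same applies to variables: writing `"$file"` rather than `$file` keeps a name that contains spaces in one piece.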
644 | 645 | If you have two files called `current.csv` and `last year.csv` 646 | (with a space in its name) 647 | and you type: 648 | 649 | ```{shell} 650 | rm current.csv last year.csv 651 | ``` 652 | 653 | what will happen: 654 | 655 | `@hint` 656 | What would you think was going to happen if someone showed you the command and you didn't know what files existed? 657 | 658 | `@possible_answers` 659 | - The shell will print an error message because `last` and `year.csv` do not exist. 660 | - The shell will delete `current.csv`. 661 | - [Both of the above.] 662 | - Nothing. 663 | 664 | `@feedback` 665 | - Yes, but that's not all. 666 | - Yes, but that's not all. 667 | - Correct. You can use single quotes, `'`, or double quotes, `"`, around the file names. 668 | - Unfortunately not. 669 | 670 | --- 671 | 672 | ## How can I do many things in a single loop? 673 | 674 | ```yaml 675 | type: MultipleChoiceExercise 676 | key: f6d0530991 677 | xp: 50 678 | ``` 679 | 680 | The loops you have seen so far all have a single command or pipeline in their body, 681 | but a loop can contain any number of commands. 682 | To tell the shell where one ends and the next begins, 683 | you must separate them with semi-colons: 684 | 685 | ```{shell} 686 | for f in seasonal/*.csv; do echo $f; head -n 2 $f | tail -n 1; done 687 | ``` 688 | 689 | ``` 690 | seasonal/autumn.csv 691 | 2017-01-05,canine 692 | seasonal/spring.csv 693 | 2017-01-25,wisdom 694 | seasonal/summer.csv 695 | 2017-01-11,canine 696 | seasonal/winter.csv 697 | 2017-01-03,bicuspid 698 | ``` 699 | 700 |
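The same kind of loop can also be spread over several lines, with line breaks taking the place of the semi-colons. A minimal sketch, assuming bash, that uses a scratch directory and two made-up files (`one.csv` and `two.csv`) rather than the course data:

```shell
# Create two small sample files in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Date,Tooth\n2017-01-05,canine\n' > one.csv
printf 'Date,Tooth\n2017-01-25,wisdom\n' > two.csv

# Each command in the body sits on its own line, so no semi-colons are needed.
output=$(
  for f in *.csv
  do
    echo "$f"
    head -n 2 "$f" | tail -n 1
  done
)

echo "$output"
```

This layout is mostly useful in files of saved commands, where a long one-line loop becomes hard to read.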
701 | 702 | Suppose you forget the semi-colon between the `echo` and `head` commands in the previous loop, 703 | so that you ask the shell to run: 704 | 705 | ```{shell} 706 | for f in seasonal/*.csv; do echo $f head -n 2 $f | tail -n 1; done 707 | ``` 708 | 709 | What will the shell do? 710 | 711 | `@possible_answers` 712 | - Print an error message. 713 | - Print one line for each of the four files. 714 | - Print one line for `autumn.csv` (the first file). 715 | - Print the last line of each file. 716 | 717 | `@hint` 718 | You can pipe the output of `echo` to `tail`. 719 | 720 | `@pre_exercise_code` 721 | ```{python} 722 | 723 | ``` 724 | 725 | `@sct` 726 | ```{python} 727 | err1 = "No: the loop will run, it just won't do anything sensible." 728 | correct2 = "Yes: `echo` produces one line that includes the filename twice, which `tail` then copies." 729 | err3 = "No: the loop runs once for each of the four filenames." 730 | err4 = "No: the input of `tail` is the output of `echo` for each filename." 731 | Ex().has_chosen(2, [err1, correct2, err3, err4]) 732 | ``` 733 | -------------------------------------------------------------------------------- /chapter5.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Creating new tools 3 | description: >- 4 | History lets you repeat things with just a few keystrokes, and pipes let you 5 | combine existing commands to create new ones. In this chapter, you will see 6 | how to go one step further and create new commands of your own. 7 | lessons: 8 | - nb_of_exercises: 9 9 | title: How can I edit a file? 10 | --- 11 | 12 | ## How can I edit a file? 13 | 14 | ```yaml 15 | type: ConsoleExercise 16 | key: 39eee3cfc0 17 | xp: 100 18 | ``` 19 | 20 | Unix has a bewildering variety of text editors. 21 | For this course, 22 | we will use a simple one called Nano. 23 | If you type `nano filename`, 24 | it will open `filename` for editing 25 | (or create it if it doesn't already exist).
26 | You can move around with the arrow keys, 27 | delete characters using backspace, 28 | and do other operations with control-key combinations: 29 | 30 | - `Ctrl` + `K`: delete a line. 31 | - `Ctrl` + `U`: un-delete a line. 32 | - `Ctrl` + `O`: save the file ('O' stands for 'output'). _You will also need to press Enter to confirm the filename!_ 33 | - `Ctrl` + `X`: exit the editor. 34 | 35 | `@instructions` 36 | Run `nano names.txt` to edit a new file in your home directory 37 | and enter the following four lines: 38 | 39 | ``` 40 | Lovelace 41 | Hopper 42 | Johnson 43 | Wilson 44 | ``` 45 | 46 | To save what you have written, 47 | type `Ctrl` + `O` to write the file out, 48 | then Enter to confirm the filename, 49 | then `Ctrl` + `X` to exit the editor. 50 | 51 | `@hint` 52 | 53 | 54 | `@pre_exercise_code` 55 | ```{python} 56 | 57 | ``` 58 | 59 | `@solution` 60 | ```{shell} 61 | # This solution uses `cp` instead of `nano` 62 | # because our automated tests can't edit files interactively. 63 | cp /solutions/names.txt /home/repl 64 | ``` 65 | 66 | `@sct` 67 | ```{python} 68 | patt = "Have you included the line `%s` in the `names.txt` file? Use `nano names.txt` again to update your file. Use `Ctrl` + `O` to save and `Ctrl` + `X` to exit." 69 | Ex().multi( 70 | has_cwd('/home/repl'), 71 | check_file('/home/repl/names.txt').multi( 72 | has_code(r'Lovelace', incorrect_msg=patt%'Lovelace'), 73 | has_code(r'Hopper', incorrect_msg=patt%'Hopper'), 74 | has_code(r'Johnson', incorrect_msg=patt%'Johnson'), 75 | has_code(r'Wilson', incorrect_msg=patt%'Wilson') 76 | ) 77 | ) 78 | Ex().success_msg("Well done! Off to the next one!") 79 | ``` 80 | 81 | --- 82 | 83 | ## How can I record what I just did? 84 | 85 | ```yaml 86 | type: BulletConsoleExercise 87 | key: 80c3532985 88 | xp: 100 89 | ``` 90 | 91 | When you are doing a complex analysis, 92 | you will often want to keep a record of the commands you used. 93 | You can do this with the tools you have already seen: 94 | 95 | 1. 
Run `history`. 96 | 2. Pipe its output to `tail -n 10` (or however many recent steps you want to save). 97 | 3. Redirect that to a file called something like `figure-5.history`. 98 | 99 | This is better than writing things down in a lab notebook 100 | because it is guaranteed not to miss any steps. 101 | It also illustrates the central idea of the shell: 102 | simple tools that produce and consume lines of text 103 | can be combined in a wide variety of ways 104 | to solve a broad range of problems. 105 | 106 | `@pre_exercise_code` 107 | ```{python} 108 | 109 | ``` 110 | 111 | *** 112 | 113 | ```yaml 114 | type: ConsoleExercise 115 | key: 144ca955ca 116 | xp: 35 117 | ``` 118 | 119 | `@instructions` 120 | Copy the files `seasonal/spring.csv` and `seasonal/summer.csv` to your home directory. 121 | 122 | `@hint` 123 | Use `cp` to copy and `~` as a shortcut for the path to your home directory. 124 | 125 | `@solution` 126 | ```{shell} 127 | cp seasonal/s* ~ 128 | 129 | ``` 130 | 131 | `@sct` 132 | ```{python} 133 | msg="Have you used `cp seasonal/s* ~` to copy the required files to your home directory?" 134 | Ex().multi( 135 | has_cwd('/home/repl'), 136 | check_file('/home/repl/spring.csv', missing_msg=msg).\ 137 | has_code(r'2017-01-25,wisdom', incorrect_msg=msg), 138 | check_file('/home/repl/summer.csv', missing_msg=msg).\ 139 | has_code(r'2017-01-11,canine', incorrect_msg=msg) 140 | ) 141 | Ex().success_msg("Remarkable record-keeping! If you mistyped any commands, you can always use `nano` to clean up the saved history file afterwards.") 142 | 143 | ``` 144 | 145 | *** 146 | 147 | ```yaml 148 | type: ConsoleExercise 149 | key: 09a432e4df 150 | xp: 35 151 | ``` 152 | 153 | `@instructions` 154 | Use `grep` with the `-h` flag (to stop it from printing filenames) 155 | and `-v Tooth` (to select lines that *don't* match the header line) 156 | to select the data records from `spring.csv` and `summer.csv` in that order 157 | and redirect the output to `temp.csv`.
158 | 159 | `@hint` 160 | Put the flags before the filenames. 161 | 162 | `@solution` 163 | ```{shell} 164 | grep -h -v Tooth spring.csv summer.csv > temp.csv 165 | 166 | ``` 167 | 168 | `@sct` 169 | ```{python} 170 | msg1 = "Make sure you redirect the output of the `grep` command to `temp.csv` with `>`!" 171 | msg2 = "Have you used `grep -h -v ___ ___ ___` (fill in the blanks) to populate `temp.csv`?" 172 | Ex().multi( 173 | has_cwd('/home/repl'), 174 | check_file('/home/repl/temp.csv', missing_msg=msg1).multi( 175 | has_code(r'2017-08-04,canine', incorrect_msg=msg2), 176 | has_code(r'2017-03-14,incisor', incorrect_msg=msg2), 177 | has_code(r'2017-03-12,wisdom', incorrect_msg=msg2) 178 | ) 179 | ) 180 | 181 | ``` 182 | 183 | *** 184 | 185 | ```yaml 186 | type: ConsoleExercise 187 | key: c40348c1e5 188 | xp: 30 189 | ``` 190 | 191 | `@instructions` 192 | Pipe `history` into `tail -n 3` 193 | and redirect the output to `steps.txt` 194 | to save the last three commands in a file. 195 | (You need to save three instead of just two 196 | because the `history` command itself will be in the list.) 197 | 198 | `@hint` 199 | Remember that redirection with `>` comes at the end of the sequence of piped commands. 200 | 201 | `@solution` 202 | ```{shell} 203 | history | tail -n 3 > steps.txt 204 | 205 | ``` 206 | 207 | `@sct` 208 | ```{python} 209 | msg1="Make sure to redirect the output of your command to `steps.txt`." 210 | msg2="Have you used `history | tail ___ ___` (fill in the blanks) to populate `steps.txt`?" 211 | Ex().multi( 212 | has_cwd('/home/repl'), 213 | # When run by the validator, solution3 doesn't pass, so including a has_code for that 214 | check_or( 215 | check_file('/home/repl/steps.txt', missing_msg=msg1).multi( 216 | has_code(r'\s+1\s+', incorrect_msg=msg2), 217 | has_code(r'\s+3\s+history', incorrect_msg=msg2) 218 | ), 219 | has_code(r'history\s+|\s+tail\s+-n\s+4\s+>\s+steps\.txt') 220 | ) 221 | ) 222 | Ex().success_msg("Well done! 
Let's step it up!") 223 | 224 | ``` 225 | 226 | --- 227 | 228 | ## How can I save commands to re-run later? 229 | 230 | ```yaml 231 | type: BulletConsoleExercise 232 | key: 4507a0dbd8 233 | xp: 100 234 | ``` 235 | 236 | You have been using the shell interactively so far. 237 | But since the commands you type in are just text, 238 | you can store them in files for the shell to run over and over again. 239 | To start exploring this powerful capability, 240 | put the following command in a file called `headers.sh`: 241 | 242 | ```{shell} 243 | head -n 1 seasonal/*.csv 244 | ``` 245 | 246 | This command selects the first row from each of the CSV files in the `seasonal` directory. 247 | Once you have created this file, 248 | you can run it by typing: 249 | 250 | ```{shell} 251 | bash headers.sh 252 | ``` 253 | 254 | This tells the shell (which is just a program called `bash`) 255 | to run the commands contained in the file `headers.sh`, 256 | which produces the same output as running the commands directly. 257 | 258 | `@pre_exercise_code` 259 | ```{python} 260 | 261 | ``` 262 | 263 | *** 264 | 265 | ```yaml 266 | type: ConsoleExercise 267 | key: 316ad2fec6 268 | xp: 50 269 | ``` 270 | 271 | `@instructions` 272 | Use `nano dates.sh` to create a file called `dates.sh` 273 | that contains this command: 274 | 275 | ```{shell} 276 | cut -d , -f 1 seasonal/*.csv 277 | ``` 278 | 279 | to extract the first column from all of the CSV files in `seasonal`. 280 | 281 | `@hint` 282 | Put the commands shown into the file without extra blank lines or spaces. 283 | 284 | `@solution` 285 | ```{shell} 286 | # This solution uses `cp` instead of `nano` 287 | # because our automated tests can't edit files interactively. 288 | cp /solutions/dates.sh ~ 289 | 290 | ``` 291 | 292 | `@sct` 293 | ```{python} 294 | msg = "Have you included the line `cut -d , -f 1 seasonal/*.csv` in the `dates.sh` file? Use `nano dates.sh` again to update your file. 
Use `Ctrl` + `O` to save and `Ctrl` + `X` to exit." 295 | Ex().multi( 296 | has_cwd('/home/repl'), 297 | check_file('/home/repl/dates.sh').\ 298 | has_code('cut -d *, *-f +1 +seasonal\/\*\.csv', incorrect_msg=msg) 299 | ) 300 | 301 | ``` 302 | 303 | *** 304 | 305 | ```yaml 306 | type: ConsoleExercise 307 | key: 30a8fa953e 308 | xp: 50 309 | ``` 310 | 311 | `@instructions` 312 | Use `bash` to run the file `dates.sh`. 313 | 314 | `@hint` 315 | Use `bash filename` to run the file. 316 | 317 | `@solution` 318 | ```{shell} 319 | bash dates.sh 320 | 321 | ``` 322 | 323 | `@sct` 324 | ```{python} 325 | Ex().multi( 326 | has_cwd('/home/repl'), 327 | check_correct( 328 | has_expr_output(), 329 | multi( 330 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 331 | has_code("dates.sh", incorrect_msg = 'Did you specify the `dates.sh` file?') 332 | ) 333 | ) 334 | ) 335 | 336 | ``` 337 | 338 | --- 339 | 340 | ## How can I re-use pipes? 341 | 342 | ```yaml 343 | type: BulletConsoleExercise 344 | key: da13667750 345 | xp: 100 346 | ``` 347 | 348 | A file full of shell commands is called a **shell script**, 349 | or sometimes just a "script" for short. Scripts don't have to have names ending in `.sh`, 350 | but this lesson will use that convention 351 | to help you keep track of which files are scripts. 352 | 353 | Scripts can also contain pipes. 354 | For example, 355 | if `all-dates.sh` contains this line: 356 | 357 | ```{shell} 358 | cut -d , -f 1 seasonal/*.csv | grep -v Date | sort | uniq 359 | ``` 360 | 361 | then: 362 | 363 | ```{shell} 364 | bash all-dates.sh > dates.out 365 | ``` 366 | 367 | will extract the unique dates from the seasonal data files 368 | and save them in `dates.out`.
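A self-contained sketch of the same workflow, assuming bash is available. The scratch directory, the two data files (`a.csv` and `b.csv`), and their contents are invented for this illustration; only the pipeline is taken from the text above. The here-document (`<<'EOF'`) is just a way to create the script without opening an editor.

```shell
# Set up a throwaway directory with two small sample files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir seasonal
printf 'Date,Tooth\n2017-01-05,canine\n2017-01-03,molar\n' > seasonal/a.csv
printf 'Date,Tooth\n2017-01-05,wisdom\n' > seasonal/b.csv

# Write the pipeline into a script file, exactly as typed.
cat > all-dates.sh <<'EOF'
cut -d , -f 1 seasonal/*.csv | grep -v Date | sort | uniq
EOF

# Run the script and save its output, then inspect the result.
bash all-dates.sh > dates.out
cat dates.out
```

With this sample data, `dates.out` ends up holding each date once, in sorted order, even though `2017-01-05` appears in both files.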
369 | 370 | `@pre_exercise_code` 371 | ```{python} 372 | import shutil 373 | shutil.copyfile('/solutions/teeth-start.sh', 'teeth.sh') 374 | ``` 375 | 376 | *** 377 | 378 | ```yaml 379 | type: ConsoleExercise 380 | key: 6fae90f320 381 | xp: 35 382 | ``` 383 | 384 | `@instructions` 385 | A file `teeth.sh` in your home directory has been prepared for you, but contains some blanks. 386 | Use Nano to edit the file and replace the two `____` placeholders 387 | with `seasonal/*.csv` and `-c` so that this script prints a count of the 388 | number of times each tooth name appears in the CSV files in the `seasonal` directory. 389 | 390 | `@hint` 391 | Use `nano teeth.sh` to edit the file. 392 | 393 | `@solution` 394 | ```{shell} 395 | # This solution uses `cp` instead of `nano` 396 | # because our automated tests can't edit files interactively. 397 | cp /solutions/teeth.sh ~ 398 | 399 | ``` 400 | 401 | `@sct` 402 | ```{python} 403 | msg="Have you replaced the blanks properly so the command in `teeth.sh` reads `cut -d , -f 2 seasonal/*.csv | grep -v Tooth | sort | uniq -c`? Use `nano teeth.sh` again to make the required changes." 404 | Ex().multi( 405 | has_cwd('/home/repl'), 406 | check_file('/home/repl/teeth.sh').\ 407 | has_code(r'cut\s+-d\s+,\s+-f\s+2\s+seasonal/\*\.csv\s+\|\s+grep\s+-v\s+Tooth\s+\|\s+sort\s+\|\s+uniq\s+-c', incorrect_msg=msg) 408 | ) 409 | 410 | ``` 411 | 412 | *** 413 | 414 | ```yaml 415 | type: ConsoleExercise 416 | key: dcfccb51e2 417 | xp: 35 418 | ``` 419 | 420 | `@instructions` 421 | Use `bash` to run `teeth.sh` and `>` to redirect its output to `teeth.out`. 422 | 423 | `@hint` 424 | Remember that `> teeth.out` must come *after* the command that is producing output. 425 | 426 | `@solution` 427 | ```{shell} 428 | # We need to use 'cp' below to satisfy our automated tests. 429 | # You should only use the last line that runs 'bash'. 430 | cp /solutions/teeth.sh .
431 | bash teeth.sh > teeth.out 432 | 433 | ``` 434 | 435 | `@sct` 436 | ```{python} 437 | msg="Have you correctly redirected the result of `bash teeth.sh` to `teeth.out` with the `>`?" 438 | Ex().multi( 439 | has_cwd('/home/repl'), 440 | check_correct( 441 | check_file('/home/repl/teeth.out').multi( 442 | has_code(r'31 canine', incorrect_msg=msg), 443 | has_code(r'17 wisdom', incorrect_msg=msg) 444 | ), 445 | multi( 446 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 447 | has_code("bash\s+teeth.sh", incorrect_msg = 'Did you run the `teeth.sh` file?'), 448 | has_code(">\s+teeth.out", incorrect_msg = 'Did you redirect to the `teeth.out` file?') 449 | ) 450 | ) 451 | ) 452 | 453 | ``` 454 | 455 | *** 456 | 457 | ```yaml 458 | type: ConsoleExercise 459 | key: c8c9a11e3c 460 | xp: 30 461 | ``` 462 | 463 | `@instructions` 464 | Run `cat teeth.out` to inspect your results. 465 | 466 | `@hint` 467 | Remember, you can type the first few characters of a filename and then press the tab key to auto-complete. 468 | 469 | `@solution` 470 | ```{shell} 471 | cat teeth.out 472 | 473 | ``` 474 | 475 | `@sct` 476 | ```{python} 477 | Ex().multi( 478 | has_cwd('/home/repl'), 479 | check_correct( 480 | has_expr_output(), 481 | multi( 482 | has_code("cat", incorrect_msg = 'Did you call `cat`?'), 483 | has_code("teeth.out", incorrect_msg = 'Did you specify the `teeth.out` file?') 484 | ) 485 | ) 486 | ) 487 | Ex().success_msg("Nice! This all may feel contrived at first, but the nice thing is that you are automating parts of your workflow step by step. Something that comes in really handy as a data scientist!") 488 | 489 | ``` 490 | 491 | --- 492 | 493 | ## How can I pass filenames to scripts? 494 | 495 | ```yaml 496 | type: BulletConsoleExercise 497 | key: c2623b9c14 498 | xp: 100 499 | ``` 500 | 501 | A script that processes specific files is useful as a record of what you did, but one that allows you to process any files you want is more useful. 
502 | To support this, 503 | you can use the special expression `$@` (dollar sign immediately followed by at-sign) 504 | to mean "all of the command-line parameters given to the script". 505 | 506 | For example, if `unique-lines.sh` contains `sort $@ | uniq`, when you run: 507 | 508 | ```{shell} 509 | bash unique-lines.sh seasonal/summer.csv 510 | ``` 511 | 512 | the shell replaces `$@` with `seasonal/summer.csv` and processes one file. If you run this: 513 | 514 | ```{shell} 515 | bash unique-lines.sh seasonal/summer.csv seasonal/autumn.csv 516 | ``` 517 | 518 | it processes two data files, and so on. 519 | 520 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._ 521 | 522 | `@pre_exercise_code` 523 | ```{python} 524 | import shutil 525 | shutil.copyfile('/solutions/count-records-start.sh', 'count-records.sh') 526 | ``` 527 | 528 | *** 529 | 530 | ```yaml 531 | type: ConsoleExercise 532 | key: 7a893623af 533 | xp: 50 534 | ``` 535 | 536 | `@instructions` 537 | Edit the script `count-records.sh` with Nano and fill in the two `____` placeholders 538 | with `$@` and `-l` (_the letter_) respectively so that it counts the number of lines in one or more files, 539 | excluding the first line of each. 540 | 541 | `@hint` 542 | * Use `nano count-records.sh` to edit the file. 543 | * Make sure you are specifying the _letter_ `-l`, and not the number one. 544 | 545 | `@solution` 546 | ```{shell} 547 | # This solution uses `cp` instead of `nano` 548 | # because our automated tests can't edit files interactively. 549 | cp /solutions/count-records.sh ~ 550 | 551 | ``` 552 | 553 | `@sct` 554 | ```{python} 555 | msg="Have you replaced the blanks properly so the command in `count-records.sh` reads `tail -q -n +2 $@ | wc -l`? Use `nano count-records.sh` again to make the required changes."
556 | Ex().multi( 557 | has_cwd('/home/repl'), 558 | check_file('/home/repl/count-records.sh').\ 559 | has_code('tail\s+-q\s+-n\s+\+2\s+\$\@\s+\|\s+wc\s+-l', incorrect_msg=msg) 560 | ) 561 | 562 | ``` 563 | 564 | *** 565 | 566 | ```yaml 567 | type: ConsoleExercise 568 | key: d0da324516 569 | xp: 50 570 | ``` 571 | 572 | `@instructions` 573 | Run `count-records.sh` on `seasonal/*.csv` 574 | and redirect the output to `num-records.out` using `>`. 575 | 576 | `@hint` 577 | Use `>` to redirect the output. 578 | 579 | `@solution` 580 | ```{shell} 581 | bash count-records.sh seasonal/*.csv > num-records.out 582 | 583 | ``` 584 | 585 | `@sct` 586 | ```{python} 587 | Ex().multi( 588 | has_cwd('/home/repl'), 589 | check_correct( 590 | check_file('/home/repl/num-records.out').has_code(r'92'), 591 | multi( 592 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 593 | has_code("bash\s+count-records.sh", incorrect_msg = 'Did you run the `count-records.sh` file?'), 594 | has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'), 595 | has_code(">\s+num-records.out", incorrect_msg = 'Did you redirect to the `num-records.out` file?') 596 | ) 597 | ) 598 | ) 599 | Ex().success_msg("A job well done! Your shell power is ever-expanding!") 600 | 601 | ``` 602 | 603 | --- 604 | 605 | ## How can I process a single argument? 606 | 607 | ```yaml 608 | type: PureMultipleChoiceExercise 609 | key: 4092cb4cda 610 | xp: 50 611 | ``` 612 | 613 | As well as `$@`, 614 | the shell lets you use `$1`, `$2`, and so on to refer to specific command-line parameters. 615 | You can use this to write commands that feel simpler or more natural than the shell's. 
616 | For example, 617 | you can create a script called `column.sh` that selects a single column from a CSV file 618 | when the user provides the filename as the first parameter and the column as the second: 619 | 620 | ```{shell} 621 | cut -d , -f $2 $1 622 | ``` 623 | 624 | and then run it using: 625 | 626 | ```{shell} 627 | bash column.sh seasonal/autumn.csv 1 628 | ``` 629 | 630 | Notice how the script uses the two parameters in reverse order. 631 | 632 |
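To see the parameter substitution in action, here is a small sketch you could run in a scratch directory. It assumes a made-up file called `sample.csv` (a stand-in, not part of the course data) and recreates `column.sh` as described above.

```shell
# Work in a throwaway directory with a tiny stand-in data file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Date,Tooth\n2017-01-05,canine\n2017-01-17,wisdom\n' > sample.csv

# column.sh: $1 is the filename, $2 is the column number.
# Single quotes keep $1 and $2 literal, so they are written into the script
# instead of being expanded by the current shell.
echo 'cut -d , -f $2 $1' > column.sh

# Select column 2 (the Tooth column) from sample.csv.
teeth=$(bash column.sh sample.csv 2)
echo "$teeth"
```

When the script runs, `$1` becomes `sample.csv` and `$2` becomes `2`, so the command that actually executes is `cut -d , -f 2 sample.csv`.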
633 | 634 | The script `get-field.sh` is supposed to take a filename, 635 | the number of the row to select, 636 | the number of the column to select, 637 | and print just that field from a CSV file. 638 | For example: 639 | 640 | ``` 641 | bash get-field.sh seasonal/summer.csv 4 2 642 | ``` 643 | 644 | should select the second field from line 4 of `seasonal/summer.csv`. 645 | Which of the following commands should be put in `get-field.sh` to do that? 646 | 647 | `@hint` 648 | Remember that command-line parameters are numbered left to right. 649 | 650 | `@possible_answers` 651 | - `head -n $1 $2 | tail -n 1 | cut -d , -f $3` 652 | - [`head -n $2 $1 | tail -n 1 | cut -d , -f $3`] 653 | - `head -n $3 $1 | tail -n 1 | cut -d , -f $2` 654 | - `head -n $2 $3 | tail -n 1 | cut -d , -f $1` 655 | 656 | `@feedback` 657 | - No: that will try to use the filename as the number of lines to select with `head`. 658 | - Correct! 659 | - No: that will try to use the column number as the line number and vice versa. 660 | - No: that will use the field number as the filename and vice versa. 661 | 662 | --- 663 | 664 | ## How can one shell script do many things? 665 | 666 | ```yaml 667 | type: TabConsoleExercise 668 | key: 846bc70e9d 669 | xp: 100 670 | ``` 671 | 672 | Our shell scripts so far have had a single command or pipe, but a script can contain many lines of commands. For example, you can create one that tells you how many records are in the shortest and longest of your data files, i.e., the range of your datasets' lengths. 673 | 674 | Note that in Nano, "copy and paste" is achieved by navigating to the line you want to copy, pressing `Ctrl` + `K` to cut the line, then `Ctrl` + `U` twice to paste two copies of it.
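As a warm-up for multi-line scripts, here is a minimal sketch of a two-command script. The script name `ends.sh` and the data file `sample.csv` are made up for illustration; the point is only that the shell runs each line of the script in order.

```shell
# Work in a throwaway directory with a tiny stand-in data file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'Date,Tooth\n2017-01-05,canine\n2017-01-17,wisdom\n' > sample.csv

# ends.sh (a hypothetical name) holds two commands, one per line.
# The quoted 'EOF' stops the current shell from expanding $1 while writing.
cat > ends.sh <<'EOF'
head -n 1 $1
tail -n 1 $1
EOF

# The first command prints the first line of the file, the second the last.
ends=$(bash ends.sh sample.csv)
echo "$ends"
```

The output of the script is simply the output of its first command followed by the output of its second, which is why a script like this can be redirected or piped just like a single command.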
675 | 676 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._ 677 | 678 | `@pre_exercise_code` 679 | ```{python} 680 | import shutil 681 | shutil.copyfile('/solutions/range-start-1.sh', 'range.sh') 682 | ``` 683 | 684 | *** 685 | 686 | ```yaml 687 | type: ConsoleExercise 688 | key: a1e55487fb 689 | xp: 25 690 | ``` 691 | 692 | `@instructions` 693 | Use Nano to edit the script `range.sh` 694 | and replace the two `____` placeholders 695 | with `$@` and `-v` 696 | so that it lists the names and number of lines in all of the files given on the command line 697 | *without* showing the total number of lines in all files. 698 | (Do not try to subtract the column header lines from the files.) 699 | 700 | `@hint` 701 | Use `wc -l $@` to count lines in all the files given on the command line. 702 | 703 | `@solution` 704 | ```{shell} 705 | # This solution uses `cp` instead of `nano` 706 | # because our automated tests can't edit files interactively. 707 | cp /solutions/range-1.sh range.sh 708 | 709 | ``` 710 | 711 | `@sct` 712 | ```{python} 713 | msg="Have you replaced the blanks properly so the command in `range.sh` reads `wc -l $@ | grep -v total`? Use `nano range.sh` again to make the required changes." 714 | Ex().multi( 715 | has_cwd('/home/repl'), 716 | check_file('/home/repl/range.sh').\ 717 | has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total', incorrect_msg=msg) 718 | ) 719 | 720 | ``` 721 | 722 | *** 723 | 724 | ```yaml 725 | type: ConsoleExercise 726 | key: e8ece27fe7 727 | xp: 25 728 | ``` 729 | 730 | `@instructions` 731 | Use Nano again to add `sort -n` and `head -n 1` in that order 732 | to the pipeline in `range.sh` 733 | to display the name and line count of the shortest file given to it.
734 | 735 | `@hint` 736 | 737 | 738 | `@solution` 739 | ```{shell} 740 | # This solution uses `cp` instead of `nano` 741 | # because our automated tests can't edit files interactively. 742 | cp /solutions/range-2.sh range.sh 743 | 744 | ``` 745 | 746 | `@sct` 747 | ```{python} 748 | msg="Have you added `sort -n` and `head -n 1` with pipes to the `range.sh` file? Use `nano range.sh` again to make the required changes." 749 | Ex().multi( 750 | has_cwd('/home/repl'), 751 | check_file('/home/repl/range.sh').\ 752 | has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total\s+\|\s+sort\s+-n\s+\|\s+head\s+-n\s+1', incorrect_msg=msg) 753 | ) 754 | 755 | ``` 756 | 757 | *** 758 | 759 | ```yaml 760 | type: ConsoleExercise 761 | key: a3b36a746e 762 | xp: 25 763 | ``` 764 | 765 | `@instructions` 766 | Again using Nano, add a second line to `range.sh` to print the name and record count of 767 | the *longest* file in the directory *as well as* the shortest. 768 | This line should be a duplicate of the one you have already written, 769 | but with `sort -n -r` rather than `sort -n`. 770 | 771 | `@hint` 772 | Copy the first line and modify the sorting order. 773 | 774 | `@solution` 775 | ```{shell} 776 | # This solution uses `cp` instead of `nano` 777 | # because our automated tests can't edit files interactively. 778 | cp /solutions/range-3.sh range.sh 779 | 780 | ``` 781 | 782 | `@sct` 783 | ```{python} 784 | msg1="Keep the first line in the `range.sh` file: `wc -l $@ | grep -v total | sort -n | head -n 1`" 785 | msg2="Have you duplicated the first line in `range.sh` and made a small change? `sort -n -r` instead of `sort -n`!"
786 | Ex().multi( 787 | has_cwd('/home/repl'), 788 | check_file('/home/repl/range.sh').multi( 789 | has_code("wc -l $@ | grep -v total | sort -n | head -n 1", fixed=True, incorrect_msg = msg1), 790 | has_code(r'wc\s+-l\s+\$@\s+\|\s+grep\s+-v\s+total\s+\|\s+sort\s+-n\s+-r\s+\|\s+head\s+-n\s+1', incorrect_msg=msg2) 791 | ) 792 | ) 793 | 794 | ``` 795 | 796 | *** 797 | 798 | ```yaml 799 | type: ConsoleExercise 800 | key: cba93a77c3 801 | xp: 25 802 | ``` 803 | 804 | `@instructions` 805 | Run the script on the files in the `seasonal` directory 806 | using `seasonal/*.csv` to match all of the files 807 | and redirect the output using `>` 808 | to a file called `range.out` in your home directory. 809 | 810 | `@hint` 811 | Use `bash range.sh` to run your script, `seasonal/*.csv` to specify files, and `> range.out` to redirect the output. 812 | 813 | `@solution` 814 | ```{shell} 815 | bash range.sh seasonal/*.csv > range.out 816 | 817 | ``` 818 | 819 | `@sct` 820 | ```{python} 821 | msg="Have you correctly redirected the result of `bash range.sh seasonal/*.csv` to `range.out` with the `>`?" 822 | Ex().multi( 823 | has_cwd('/home/repl'), 824 | multi( 825 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 826 | has_code("bash\s+range.sh", incorrect_msg = 'Did you run the `range.sh` file?'), 827 | has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'), 828 | has_code(">\s+range.out", incorrect_msg = 'Did you redirect to the `range.out` file?') 829 | ) 830 | ) 831 | 832 | Ex().success_msg("This is going well. Head over to the next exercise to learn about writing loops!") 833 | 834 | ``` 835 | 836 | --- 837 | 838 | ## How can I write loops in a shell script? 839 | 840 | ```yaml 841 | type: BulletConsoleExercise 842 | key: 6be8ca6009 843 | xp: 100 844 | ``` 845 | 846 | Shell scripts can also contain loops.
You can write them using semi-colons, or split them across lines without semi-colons to make them more readable: 847 | 848 | ```{shell} 849 | # Print the first and last data records of each file. 850 | for filename in $@ 851 | do 852 | head -n 2 $filename | tail -n 1 853 | tail -n 1 $filename 854 | done 855 | ``` 856 | 857 | (You don't have to indent the commands inside the loop, but doing so makes things clearer.) 858 | 859 | The first line of this script is a **comment** to tell readers what the script does. Comments start with the `#` character and run to the end of the line. Your future self will thank you for adding brief explanations like the one shown here to every script you write. 860 | 861 | _As a reminder, to save what you have written in Nano, type `Ctrl` + `O` to write the file out, then Enter to confirm the filename, then `Ctrl` + `X` to exit the editor._ 862 | 863 | `@pre_exercise_code` 864 | ```{python} 865 | import shutil 866 | shutil.copyfile('/solutions/date-range-start.sh', '/home/repl/date-range.sh') 867 | ``` 868 | 869 | *** 870 | 871 | ```yaml 872 | type: ConsoleExercise 873 | key: 8ca2adb6c4 874 | xp: 35 875 | ``` 876 | 877 | `@instructions` 878 | Fill in the placeholders in the script `date-range.sh` 879 | with `$filename` (twice), `head`, and `tail` 880 | so that it prints the first and last date from one or more files. 881 | 882 | `@hint` 883 | Remember to use `$filename` to get the current value of the loop variable. 884 | 885 | `@solution` 886 | ```{shell} 887 | # This solution uses `cp` instead of `nano` 888 | # because our automated tests can't edit files interactively. 889 | cp /solutions/date-range.sh date-range.sh 890 | 891 | ``` 892 | 893 | `@sct` 894 | ```{python} 895 | msgpatt="In `date-range.sh`, have you changed the %s line in the loop to be `%s`? Use `nano date-range.sh` to make changes." 
896 | cmdpatt = 'cut -d , -f 1 $filename | grep -v Date | sort | %s -n 1' 897 | msg1=msgpatt%('first', cmdpatt%'head') 898 | msg2=msgpatt%('second', cmdpatt%'tail') 899 | patt='cut\s+-d\s+,\s+-f\s+1\s+\$filename\s+\|\s+grep\s+-v\s+Date\s+\|\s+sort\s+\|\s+%s\s+-n\s+1' 900 | patt1 = patt%'head' 901 | patt2 = patt%'tail' 902 | Ex().multi( 903 | has_cwd('/home/repl'), 904 | check_file('/home/repl/date-range.sh').multi( 905 | has_code(patt1, incorrect_msg=msg1), 906 | has_code(patt2, incorrect_msg=msg2) 907 | ) 908 | ) 909 | 910 | ``` 911 | 912 | *** 913 | 914 | ```yaml 915 | type: ConsoleExercise 916 | key: ec1271356d 917 | xp: 35 918 | ``` 919 | 920 | `@instructions` 921 | Run `date-range.sh` on all four of the seasonal data files 922 | using `seasonal/*.csv` to match their names. 923 | 924 | `@hint` 925 | The wildcard expression should start with the directory name. 926 | 927 | `@solution` 928 | ```{shell} 929 | bash date-range.sh seasonal/*.csv 930 | 931 | ``` 932 | 933 | `@sct` 934 | ```{python} 935 | Ex().multi( 936 | has_cwd('/home/repl'), 937 | check_correct( 938 | has_expr_output(), 939 | multi( 940 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 941 | has_code("bash\s+date-range.sh", incorrect_msg = 'Did you run the `date-range.sh` file?'), 942 | has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?') 943 | ) 944 | ) 945 | ) 946 | 947 | ``` 948 | 949 | *** 950 | 951 | ```yaml 952 | type: ConsoleExercise 953 | key: 0323c7d68d 954 | xp: 30 955 | ``` 956 | 957 | `@instructions` 958 | Run `date-range.sh` on all four of the seasonal data files using `seasonal/*.csv` to match their names, 959 | and pipe its output to `sort` to see that your scripts can be used just like Unix's built-in commands. 960 | 961 | `@hint` 962 | Use the same wildcard expression you used earlier. 
963 | 964 | `@solution` 965 | ```{shell} 966 | bash date-range.sh seasonal/*.csv | sort 967 | 968 | ``` 969 | 970 | `@sct` 971 | ```{python} 972 | Ex().multi( 973 | has_cwd('/home/repl'), 974 | check_correct( 975 | has_expr_output(), 976 | multi( 977 | has_code("bash", incorrect_msg = 'Did you call `bash`?'), 978 | has_code("bash\s+date-range.sh", incorrect_msg = 'Did you run the `date-range.sh` file?'), 979 | has_code("seasonal/\*", incorrect_msg = 'Did you specify the files to process with `seasonal/*`?'), 980 | has_code("|", fixed=True, incorrect_msg = 'Did you pipe from the script output to `sort`?'), 981 | has_code("sort", incorrect_msg = 'Did you call `sort`?') 982 | ) 983 | ) 984 | ) 985 | Ex().success_msg("Magic! Notice how composable all the things we've learned are.") 986 | 987 | ``` 988 | 989 | --- 990 | 991 | ## What happens when I don't provide filenames? 992 | 993 | ```yaml 994 | type: MultipleChoiceExercise 995 | key: 8a162c4d54 996 | xp: 50 997 | ``` 998 | 999 | A common mistake in shell scripts (and interactive commands) is to put filenames in the wrong place. 1000 | If you type: 1001 | 1002 | ```{shell} 1003 | tail -n 3 1004 | ``` 1005 | 1006 | then since `tail` hasn't been given any filenames, 1007 | it waits to read input from your keyboard. 1008 | This means that if you type: 1009 | 1010 | ```{shell} 1011 | head -n 5 | tail -n 3 somefile.txt 1012 | ``` 1013 | 1014 | then `tail` goes ahead and prints the last three lines of `somefile.txt`, 1015 | but `head` waits forever for keyboard input, 1016 | since it wasn't given a filename and there isn't anything ahead of it in the pipeline. 1017 | 1018 |
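For contrast, here is a minimal sketch of the same pipeline with the filename in the right place. It assumes a small made-up `somefile.txt` created in a throwaway directory; the contents are invented for illustration.

```shell
# Work in a throwaway directory with a six-line stand-in file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'one\ntwo\nthree\nfour\nfive\nsix\n' > somefile.txt

# The filename goes to the first command in the pipeline.
# `tail` is given no filename, so it reads whatever `head` sends through the pipe.
middle=$(head -n 5 somefile.txt | tail -n 3)
echo "$middle"
```

Here `head` keeps the first five lines and `tail` keeps the last three of those, so nothing is left waiting for keyboard input.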
1019 | 1020 | Suppose you do accidentally type: 1021 | 1022 | ```{shell} 1023 | head -n 5 | tail -n 3 somefile.txt 1024 | ``` 1025 | 1026 | What should you do next? 1027 | 1028 | `@possible_answers` 1029 | - Wait 10 seconds for `head` to time out. 1030 | - Type `somefile.txt` and press Enter to give `head` some input. 1031 | - Use `Ctrl` + `C` to stop the running `head` program. 1032 | 1033 | `@hint` 1034 | What does `head` do if it doesn't have a filename and nothing is upstream from it? 1035 | 1036 | `@pre_exercise_code` 1037 | ```{python} 1038 | 1039 | ``` 1040 | 1041 | `@sct` 1042 | ```{python} 1043 | a1 = 'No, commands will not time out.' 1044 | a2 = 'No, that will give `head` the text `somefile.txt` to process, but then it will keep waiting for still more input.' 1045 | a3 = "Yes! You should use `Ctrl` + `C` to stop a running program. This concludes this introductory course! If you're interested in learning more command-line tools, we thoroughly recommend taking our free intro to Git course!" 1046 | Ex().has_chosen(3, [a1, a2, a3]) 1047 | ``` 1048 | -------------------------------------------------------------------------------- /course.yml: -------------------------------------------------------------------------------- 1 | title: Introduction to Shell 2 | description: >- 3 | The Unix command line has survived and thrived for almost 50 years because it 4 | lets people do complex things with just a few keystrokes. Sometimes called 5 | "the universal glue of programming," it helps users combine existing programs 6 | in new ways, automate repetitive tasks, and run programs on clusters and 7 | clouds that may be halfway around the world. This course will introduce its 8 | key elements and show you how to use them efficiently.
9 | programming_language: shell 10 | from: 'shell-base-prod:v1.1.3' 11 | -------------------------------------------------------------------------------- /datasets/filesys.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/datasets/filesys.zip -------------------------------------------------------------------------------- /datasets/solutions.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/datasets/solutions.zip -------------------------------------------------------------------------------- /design/concept.dot: -------------------------------------------------------------------------------- 1 | digraph conda_concepts { 2 | node [shape = rectangle]; 3 | history 4 | pipe 5 | loop 6 | command 7 | shell 8 | program 9 | script 10 | file 11 | directory 12 | filesystem 13 | path 14 | wildcard 15 | 16 | { 17 | rank=same 18 | rankdir=LR 19 | pipe 20 | loop 21 | script 22 | program 23 | filesystem 24 | } 25 | 26 | i01 [shape=point, width=0, height=0] 27 | i02 [shape=point, width=0, height=0] 28 | 29 | loop -> script [style="invis"] 30 | 31 | shell -> command [label="runs"] 32 | shell -> program [label="runs"] 33 | shell -> history [label="records"] 34 | shell -> wildcard [label="expands"] 35 | wildcard -> path [label="matches"] 36 | history -> command [label="records"] 37 | command -> pipe [label="combined\nusing"] 38 | command -> loop [label="repeated\nusing"] 39 | command -> history [label="repeated\nusing"] 40 | command -> script [label="stored\nin"] 41 | script -> program [label="is a"] 42 | program -> i01 [dir="none", label="manipulates"] 43 | i01 -> file 44 | i01 -> directory 45 | filesystem -> i02 [dir="none", label="contains"] 46 | i02 -> file 47 | i02 -> 
directory 48 | path -> filesystem [label="identifies\nparts\nof"] 49 | } 50 | -------------------------------------------------------------------------------- /design/concept.svg: -------------------------------------------------------------------------------- 1 | 2 | 4 | 6 | 7 | 9 | 10 | conda_concepts 11 | 12 | 13 | 14 | history 15 | 16 | history 17 | 18 | 19 | 20 | command 21 | 22 | command 23 | 24 | 25 | 26 | history->command 27 | 28 | 29 | records 30 | 31 | 32 | 33 | pipe 34 | 35 | pipe 36 | 37 | 38 | 39 | loop 40 | 41 | loop 42 | 43 | 44 | 45 | script 46 | 47 | script 48 | 49 | 50 | 51 | 52 | command->history 53 | 54 | 55 | repeated 56 | using 57 | 58 | 59 | 60 | command->pipe 61 | 62 | 63 | combined 64 | using 65 | 66 | 67 | 68 | command->loop 69 | 70 | 71 | repeated 72 | using 73 | 74 | 75 | 76 | command->script 77 | 78 | 79 | stored 80 | in 81 | 82 | 83 | 84 | shell 85 | 86 | shell 87 | 88 | 89 | 90 | shell->history 91 | 92 | 93 | records 94 | 95 | 96 | 97 | shell->command 98 | 99 | 100 | runs 101 | 102 | 103 | 104 | program 105 | 106 | program 107 | 108 | 109 | 110 | shell->program 111 | 112 | 113 | runs 114 | 115 | 116 | 117 | wildcard 118 | 119 | wildcard 120 | 121 | 122 | 123 | shell->wildcard 124 | 125 | 126 | expands 127 | 128 | 129 | 130 | i01 131 | 132 | 133 | 134 | 135 | program->i01 136 | 137 | manipulates 138 | 139 | 140 | 141 | script->program 142 | 143 | 144 | is a 145 | 146 | 147 | 148 | file 149 | 150 | file 151 | 152 | 153 | 154 | directory 155 | 156 | directory 157 | 158 | 159 | 160 | filesystem 161 | 162 | filesystem 163 | 164 | 165 | 166 | i02 167 | 168 | 169 | 170 | 171 | filesystem->i02 172 | 173 | contains 174 | 175 | 176 | 177 | path 178 | 179 | path 180 | 181 | 182 | 183 | path->filesystem 184 | 185 | 186 | identifies 187 | parts 188 | of 189 | 190 | 191 | 192 | wildcard->path 193 | 194 | 195 | matches 196 | 197 | 198 | 199 | i01->file 200 | 201 | 202 | 203 | 204 | 205 | i01->directory 206 | 207 | 208 | 209 | 210 | 211 | i02->file 
212 | 213 | 214 | 215 | 216 | 217 | i02->directory 218 | 219 | 220 | 221 | 222 | 223 | -------------------------------------------------------------------------------- /filesys/course.txt: -------------------------------------------------------------------------------- 1 | Introduction to the Unix Shell for Data Science 2 | 3 | The Unix command line has survived and thrived for almost fifty years 4 | because it lets people to do complex things with just a few 5 | keystrokes. Sometimes called "the duct tape of programming", it helps 6 | users combine existing programs in new ways, automate repetitive 7 | tasks, and run programs on clusters and clouds that may be halfway 8 | around the world. This lesson will introduce its key elements and show 9 | you how to use them efficiently. 10 | -------------------------------------------------------------------------------- /filesys/people/agarwal.txt: -------------------------------------------------------------------------------- 1 | name: Agarwal, Jasmine 2 | position: RCT2 3 | start: 2017-04-01 4 | benefits: full 5 | -------------------------------------------------------------------------------- /filesys/seasonal/autumn.csv: -------------------------------------------------------------------------------- 1 | Date,Tooth 2 | 2017-01-05,canine 3 | 2017-01-17,wisdom 4 | 2017-01-18,canine 5 | 2017-02-01,molar 6 | 2017-02-22,bicuspid 7 | 2017-03-10,canine 8 | 2017-03-13,canine 9 | 2017-04-30,incisor 10 | 2017-05-02,canine 11 | 2017-05-10,canine 12 | 2017-05-19,bicuspid 13 | 2017-05-25,molar 14 | 2017-06-22,wisdom 15 | 2017-06-25,canine 16 | 2017-07-10,incisor 17 | 2017-07-10,wisdom 18 | 2017-07-20,incisor 19 | 2017-07-21,bicuspid 20 | 2017-08-09,canine 21 | 2017-08-16,canine 22 | -------------------------------------------------------------------------------- /filesys/seasonal/spring.csv: -------------------------------------------------------------------------------- 1 | Date,Tooth 2 | 2017-01-25,wisdom 3 | 2017-02-19,canine 
4 | 2017-02-24,canine 5 | 2017-02-28,wisdom 6 | 2017-03-04,incisor 7 | 2017-03-12,wisdom 8 | 2017-03-14,incisor 9 | 2017-03-21,molar 10 | 2017-04-29,wisdom 11 | 2017-05-08,canine 12 | 2017-05-20,canine 13 | 2017-05-21,canine 14 | 2017-05-25,canine 15 | 2017-06-04,molar 16 | 2017-06-13,bicuspid 17 | 2017-06-14,canine 18 | 2017-07-10,incisor 19 | 2017-07-16,bicuspid 20 | 2017-07-23,bicuspid 21 | 2017-08-13,bicuspid 22 | 2017-08-13,incisor 23 | 2017-08-13,wisdom 24 | 2017-09-07,molar 25 | -------------------------------------------------------------------------------- /filesys/seasonal/summer.csv: -------------------------------------------------------------------------------- 1 | Date,Tooth 2 | 2017-01-11,canine 3 | 2017-01-18,wisdom 4 | 2017-01-21,bicuspid 5 | 2017-02-02,molar 6 | 2017-02-27,wisdom 7 | 2017-02-27,wisdom 8 | 2017-03-07,bicuspid 9 | 2017-03-15,wisdom 10 | 2017-03-20,canine 11 | 2017-03-23,molar 12 | 2017-04-02,bicuspid 13 | 2017-04-22,wisdom 14 | 2017-05-07,canine 15 | 2017-05-09,canine 16 | 2017-05-11,incisor 17 | 2017-05-14,incisor 18 | 2017-05-19,canine 19 | 2017-05-23,incisor 20 | 2017-05-24,incisor 21 | 2017-06-18,incisor 22 | 2017-07-25,canine 23 | 2017-08-02,canine 24 | 2017-08-03,bicuspid 25 | 2017-08-04,canine 26 | -------------------------------------------------------------------------------- /filesys/seasonal/winter.csv: -------------------------------------------------------------------------------- 1 | Date,Tooth 2 | 2017-01-03,bicuspid 3 | 2017-01-05,incisor 4 | 2017-01-21,wisdom 5 | 2017-02-05,molar 6 | 2017-02-17,incisor 7 | 2017-02-25,bicuspid 8 | 2017-03-12,incisor 9 | 2017-03-25,molar 10 | 2017-03-26,incisor 11 | 2017-04-04,canine 12 | 2017-04-18,canine 13 | 2017-04-26,canine 14 | 2017-04-26,molar 15 | 2017-04-26,wisdom 16 | 2017-04-27,canine 17 | 2017-05-08,molar 18 | 2017-05-13,bicuspid 19 | 2017-05-14,wisdom 20 | 2017-06-17,canine 21 | 2017-07-01,incisor 22 | 2017-07-17,canine 23 | 2017-08-10,incisor 24 | 2017-08-11,bicuspid 25 
| 2017-08-11,wisdom 26 | 2017-08-13,canine 27 | -------------------------------------------------------------------------------- /img/shield_image.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datacamp-content/courses-introduction-to-shell/b3fb925221d5f26948c18cf8d8fff89c9e68c957/img/shield_image.png -------------------------------------------------------------------------------- /requirements.sh: -------------------------------------------------------------------------------- 1 | # Definitions. 2 | HOME_DIR=/home/repl 3 | HOME_COPY=/.course_home 4 | USER_GROUP=repl:repl 5 | COURSE_ID=course_5065 6 | FILESYS=filesys.zip 7 | SOLUTIONS=solutions.zip 8 | 9 | # Report start. 10 | echo '' 11 | echo '----------------------------------------' 12 | echo 'START TIME:' $(date) 13 | echo 'HOME_DIR: ' ${HOME_DIR} 14 | echo 'USER_GROUP: ' ${USER_GROUP} 15 | echo 'COURSE_ID: ' ${COURSE_ID} 16 | echo 'FILESYS: ' ${FILESYS} 17 | echo 18 | 19 | # Make sure we're in the home directory. 20 | cd ${HOME_DIR} 21 | 22 | # Get the zip files. 23 | wget https://s3.amazonaws.com/assets.datacamp.com/production/${COURSE_ID}/datasets/${FILESYS} 24 | wget https://s3.amazonaws.com/assets.datacamp.com/production/${COURSE_ID}/datasets/${SOLUTIONS} 25 | 26 | # Make sure we have nano and unzip. 27 | apt-get update 28 | apt-get -y install nano 29 | apt-get -y install unzip 30 | 31 | # Unminimize the docker image so the man command is available 32 | yes | unminimize 33 | 34 | # Unpack to the local directory. 35 | unzip ./${FILESYS} 36 | 37 | # Remove the zip file. 38 | rm -f ./${FILESYS} 39 | 40 | # Make the `backup` and `bin` directories (which start off empty, so are not in Git). 41 | mkdir ./backup 42 | mkdir ./bin 43 | 44 | # Change ownership. 45 | chown -R ${USER_GROUP} . 46 | 47 | # Unpack the solutions (used for file comparison in SCTs). 48 | unzip -d / ./${SOLUTIONS} 49 | rm -f ./${SOLUTIONS} 50 | 51 | # Change prompt. 
52 | echo "export PS1='\$ '" >> ${HOME_DIR}/.bashrc 53 | 54 | # Ensure ~/bin is on the user's path. 55 | echo 'export PATH=$PATH:$HOME/bin' >> ${HOME_DIR}/.bashrc 56 | 57 | # Make copy for resetting exercises. 58 | # Files there will replace /home/repl each exercise. 59 | # IMPORTANT: Trailing slashes after directory names force rsync to do the right thing. 60 | rsync -a ${HOME_DIR}/ ${HOME_COPY}/ 61 | chown -R ${USER_GROUP} ${HOME_COPY} 62 | 63 | # Show what's been done where. 64 | echo 'Installed in home directory:' 65 | ls -R ${HOME_DIR}/* 66 | echo 67 | echo 'Last 10 lines of .bashrc' 68 | tail -n 10 ${HOME_DIR}/.bashrc 69 | 70 | echo 'home backup directory:' 71 | ls -R ${HOME_COPY} 72 | 73 | echo 'solutions directory' 74 | ls -lR /solutions 75 | 76 | # Report end of installation. 77 | echo 78 | echo 'ENDING requirements.sh' 79 | echo '----------------------------------------' 80 | echo '' 81 | -------------------------------------------------------------------------------- /rules.yml: -------------------------------------------------------------------------------- 1 | rules: 2 | course_pct_ex_assignment_lte_reco: 3 | min: 0 4 | max: 1 5 | course_pct_ex_code_sample_lte_reco: 6 | min: 0 7 | max: 1 8 | course_pct_ex_code_solution_lte_reco: 9 | min: 0 10 | max: 1 11 | mce_num_chars_assignment: 12 | min: 0 13 | max: 2000 14 | -------------------------------------------------------------------------------- /solutions/count-records-start.sh: -------------------------------------------------------------------------------- 1 | tail -q -n +2 ____ | wc ____ 2 | -------------------------------------------------------------------------------- /solutions/count-records.sh: -------------------------------------------------------------------------------- 1 | tail -q -n +2 $@ | wc -l 2 | -------------------------------------------------------------------------------- /solutions/current-time.sh: -------------------------------------------------------------------------------- 1 | 
# Print the date and time at one-second intervals until stopped. 2 | while true 3 | do 4 | date 5 | sleep 1 6 | done 7 | -------------------------------------------------------------------------------- /solutions/date-range-start.sh: -------------------------------------------------------------------------------- 1 | # Print the first and last date from each data file. 2 | for filename in $@ 3 | do 4 | cut -d , -f 1 ____ | grep -v Date | sort | ____ -n 1 5 | cut -d , -f 1 ____ | grep -v Date | sort | ____ -n 1 6 | done 7 | -------------------------------------------------------------------------------- /solutions/date-range.sh: -------------------------------------------------------------------------------- 1 | # Print the first and last date from each data file. 2 | for filename in $@ 3 | do 4 | cut -d , -f 1 $filename | grep -v Date | sort | head -n 1 5 | cut -d , -f 1 $filename | grep -v Date | sort | tail -n 1 6 | done 7 | -------------------------------------------------------------------------------- /solutions/dates.sh: -------------------------------------------------------------------------------- 1 | cut -d , -f 1 seasonal/*.csv 2 | -------------------------------------------------------------------------------- /solutions/get-lines-solution.sh: -------------------------------------------------------------------------------- 1 | head -n $2 $1 | tail -n $3 2 | -------------------------------------------------------------------------------- /solutions/get-lines.sh: -------------------------------------------------------------------------------- 1 | head -n ____ ____ | tail -n ____ 2 | -------------------------------------------------------------------------------- /solutions/lines.sh: -------------------------------------------------------------------------------- 1 | wc -l $@ | grep -v total 2 | -------------------------------------------------------------------------------- /solutions/names.txt: 
-------------------------------------------------------------------------------- 1 | Lovelace 2 | Hopper 3 | Johnson 4 | Wilson 5 | -------------------------------------------------------------------------------- /solutions/num-records.out: -------------------------------------------------------------------------------- 1 | 92 2 | -------------------------------------------------------------------------------- /solutions/range-1.sh: -------------------------------------------------------------------------------- 1 | wc -l $@ | grep -v total 2 | -------------------------------------------------------------------------------- /solutions/range-2.sh: -------------------------------------------------------------------------------- 1 | wc -l $@ | grep -v total | sort -n | head -n 1 2 | -------------------------------------------------------------------------------- /solutions/range-3.sh: -------------------------------------------------------------------------------- 1 | wc -l $@ | grep -v total | sort -n | head -n 1 2 | wc -l $@ | grep -v total | sort -n -r | head -n 1 3 | -------------------------------------------------------------------------------- /solutions/range-start-1.sh: -------------------------------------------------------------------------------- 1 | wc -l ____ | grep ____ total 2 | -------------------------------------------------------------------------------- /solutions/teeth-start.sh: -------------------------------------------------------------------------------- 1 | cut -d , -f 2 ____ | grep -v Tooth | sort | uniq ____ 2 | -------------------------------------------------------------------------------- /solutions/teeth.out: -------------------------------------------------------------------------------- 1 | 15 bicuspid 2 | 31 canine 3 | 18 incisor 4 | 11 molar 5 | 17 wisdom 6 | -------------------------------------------------------------------------------- /solutions/teeth.sh: 
-------------------------------------------------------------------------------- 1 | cut -d , -f 2 seasonal/*.csv | grep -v Tooth | sort | uniq -c 2 | -------------------------------------------------------------------------------- /unused/permissions.md: -------------------------------------------------------------------------------- 1 | --- type:MultipleChoiceExercise lang:shell xp:100 skills:1 key:59f0e1cf33 2 | ## How can I get detailed information about a file? 3 | 4 | In order to take the next step with scripting, 5 | you need to know more about how Unix manages files. 6 | First, 7 | Unix stores a set of properties for each file along with its contents. 8 | `ls` with the `-l` flag will display these. 9 | For example, 10 | `ls -l seasonal` displays something like this: 11 | 12 | ``` 13 | -rw-r--r-- 1 repl staff 399 18 Aug 09:27 autumn.csv 14 | -rw-r--r-- 1 repl staff 458 18 Aug 09:27 spring.csv 15 | -rw-r--r-- 1 repl staff 479 18 Aug 09:27 summer.csv 16 | -rw-r--r-- 1 repl staff 497 18 Aug 09:27 winter.csv 17 | ``` 18 | 19 | Ignoring the first two columns for now, 20 | this listing shows that the files are owned by a user named `repl` 21 | who belongs to a group named `staff`, 22 | that they range in size from 399 to 497 bytes, 23 | and that they were last modified on August 18 at 9:27 in the morning. 24 | 25 |
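As a minimal sketch of reading the size column described above, the following creates a scratch file of known length and pulls the fifth column out of `ls -l`. The scratch file and the variable names are assumptions made for illustration; they are not part of the course's file system.

```shell
# Create a scratch file with known contents (6 bytes: "hello" plus a newline).
scratch=$(mktemp)
printf 'hello\n' > "$scratch"

# The fifth whitespace-separated column of `ls -l` is the size in bytes.
size_from_ls=$(ls -l "$scratch" | awk '{print $5}')

# `wc -c` counts bytes directly and should agree with `ls -l`.
size_from_wc=$(wc -c < "$scratch" | tr -d ' ')

echo "ls -l reports $size_from_ls bytes; wc -c reports $size_from_wc bytes"

# Clean up the scratch file.
rm -f "$scratch"
```

Comparing the two numbers is a quick way to confirm that you are looking at the right column.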
26 | 27 | How many bytes are in the file `course.txt`? 28 | 29 | *** =instructions 30 | - 1 31 | - 18 32 | - 485 33 | 34 | *** =hint 35 | 36 | Use the same command shown in the lesson. 37 | 38 | *** =sct 39 | ```{python} 40 | err = "No - you are looking at the wrong column." 41 | correct = "That's correct!" 42 | Ex().has_chosen(3, [err, err, correct]) 43 | ``` 44 | 45 | --- type:MultipleChoiceExercise lang:shell xp:50 skills:1 key:3061b5a818 46 | ## How does Unix control who can do what with a file? 47 | 48 | Unix keeps track of who can do what to files and directories 49 | by storing a set of **permissions** for each one. 50 | The three permissions are *read*, *write*, and *execute* (i.e., run as a program). 51 | These are often written `rwx` with dashes for permissions that are missing, 52 | so `rw-` means "can read and write but not execute" 53 | and `r-x` means "can read and execute but not modify". 54 | 55 | Unix is a multi-user operating system, 56 | so it stores three sets of permissions for each file or directory: 57 | one for the owner, 58 | a second for other people in the owner's group, 59 | and a third for everyone else. 60 | When `ls -l seasonal` displays this: 61 | 62 | ``` 63 | -rw-r--r-- 1 repl staff 399 18 Aug 09:27 autumn.csv 64 | -rw-r--r-- 1 repl staff 458 18 Aug 09:27 spring.csv 65 | -rw-r--r-- 1 repl staff 479 18 Aug 09:27 summer.csv 66 | -rw-r--r-- 1 repl staff 497 18 Aug 09:27 winter.csv 67 | ``` 68 | 69 | it means that each file can be read and written by its owner (the first `rw-`), 70 | read by other people in the `staff` group (`r--`), 71 | and also read by everyone else on the machine (`r--`). 72 | (The first character on each line is "-" for files and "d" for directories.) 73 | 74 |
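The grouping into owner, group, and everyone else can be sketched by slicing a mode string into its three triples. `describe_mode` is a hypothetical helper written for this illustration, not a standard command.

```shell
# Split an `ls -l` mode string such as "-rw-r--r--" into its three triples.
# Character 1 is the file type; characters 2-4, 5-7, and 8-10 are the
# permissions for the owner, the group, and everyone else respectively.
describe_mode() {
  mode=$1
  owner=$(printf '%s' "$mode" | cut -c 2-4)
  group=$(printf '%s' "$mode" | cut -c 5-7)
  other=$(printf '%s' "$mode" | cut -c 8-10)
  echo "owner=$owner group=$group other=$other"
}

describe_mode "-rw-r--r--"   # prints: owner=rw- group=r-- other=r--
describe_mode "drwxr-x---"   # prints: owner=rwx group=r-x other=---
```

Reading the string three characters at a time, after skipping the first, is all the helper does.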
75 | 76 | What can users who *aren't* members of your group do with the file `course.txt`? 77 | 78 | *** =instructions 79 | - Read. 80 | - Read and write. 81 | - Read and execute. 82 | - None of the above. 83 | 84 | *** =hint 85 | 86 | Use `ls -l` and read the permissions in groups of three characters. 87 | 88 | *** =sct 89 | ```{python} 90 | a1 = 'Correct!' 91 | a2 = 'No: the third group of characters does not contain a "w".' 92 | a3 = 'No: the third group of characters does not contain an "x".' 93 | a4 = 'No: the third group of characters contains an "r".' 94 | Ex().has_chosen(1, [a1, a2, a3, a4]) 95 | ``` 96 | 97 | --- type:ConsoleExercise lang:shell xp:100 skills:1 key:f1988ccaf6 98 | ## How can I change a file's permissions? 99 | 100 | You can change a file's permissions using `chmod` 101 | (which stands for "change mode"). 102 | Its first parameter describes what permissions you want the file to have; 103 | the other parameters should be the names of files. 104 | 105 | To describe permissions, 106 | you write an expression like `u=rw` or `g=rwx`. 107 | The first letter is "u" for "user" (i.e., the file's owner, usually you), 108 | "g" for "group" (other people in your group), 109 | or "o" for "other" (everyone else). 110 | The letters after the equals sign specify the permissions you want to give the file. 111 | Thus, 112 | to stop yourself from accidentally editing `course.txt` 113 | you would write: 114 | 115 | ```{shell} 116 | chmod u=r course.txt 117 | ``` 118 | 119 | *** =instructions 120 | 121 | Set the permissions on `people/agarwal.txt` so that you can read it 122 | but not write to it or execute it.
123 | 124 | *** =hint 125 | 126 | *** =pre_exercise_code 127 | ```{python} 128 | import os 129 | os.system('chmod 000 people/agarwal.txt') 130 | ``` 131 | 132 | *** =sample_code 133 | ```{shell} 134 | 135 | ``` 136 | 137 | *** =solution 138 | ```{shell} 139 | chmod u=r people/agarwal.txt 140 | ``` 141 | 142 | *** =sct 143 | ```{shell} 144 | # Ex() >> test_file_perms('people/agarwal.txt', 'r', 'is not readable.') 145 | ``` 146 | 147 | --- type:BulletConsoleExercise key:6445630844 148 | ## How can I use my scripts like other commands? 149 | 150 | As you use the shell to work with data, 151 | you will build up your own toolbox of useful scripts. 152 | Most users put these in a directory called `bin` under their home directory. 153 | If a script is there, 154 | and if it has execute permission, 155 | the shell will run it when you type its name *without* also typing `bash`. 156 | 157 | *** =pre_exercise_code 158 | ```{python} 159 | import shutil 160 | shutil.copyfile('/solutions/lines.sh', 'bin/lines.sh') 161 | ``` 162 | 163 | *** =type1: ConsoleExercise 164 | *** =key1: d0173a85f4 165 | 166 | *** =xp1: 10 167 | 168 | *** =instructions1 169 | 170 | The script `bin/lines.sh` 171 | reports the number of lines in one or more files 172 | without reporting the total number of lines. 173 | Use `chmod` to change its permissions 174 | so that you can read, write, and execute it. 175 | 176 | *** =hint1 177 | 178 | Use `u=rwx` as the permission.
179 | 180 | *** =sample_code1 181 | ```{shell} 182 | ``` 183 | 184 | *** =solution1 185 | ```{shell} 186 | cp /solutions/lines.sh bin 187 | chmod u=rwx bin/lines.sh 188 | ``` 189 | 190 | *** =sct1 191 | ```{python} 192 | #Ex() >> test_file_perms('bin/lines.sh', 'x', 'is not executable (did you forget `chmod`?).') 193 | ``` 194 | 195 | *** =type2: ConsoleExercise 196 | *** =key2: 4925a72bc2 197 | 198 | *** =xp2: 10 199 | 200 | *** =instructions2 201 | 202 | Run the script on `seasonal/*.csv` *without* typing the command `bash` 203 | *or* the word `bin`. 204 | 205 | *** =hint2 206 | 207 | *** =sample_code2 208 | ```{shell} 209 | ``` 210 | 211 | *** =solution2 212 | ```{shell} 213 | cp /solutions/lines.sh bin 214 | chmod u=rwx bin/lines.sh 215 | lines.sh seasonal/*.csv 216 | ``` 217 | 218 | *** =sct2 219 | ```{python} 220 | Ex().has_code(r'\s*lines\.sh\s+seasonal/\*\.csv\s*', 221 | fixed=False, incorrect_msg='Type the name of the script and the wildcard pattern for the files.') 222 | ``` 223 | 224 | --------------------------------------------------------------------------------
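The idea behind the last exercise in `unused/permissions.md`, running a script by name once it is executable and its directory is on the search path, can be sketched in a scratch directory. The directory, the `hello.sh` script, and the variable names are all assumptions made for illustration; the course itself uses `bin/lines.sh` under the home directory.

```shell
# Work in a scratch directory so nothing in the real home directory changes.
workdir=$(mktemp -d)
mkdir "$workdir/bin"

# A tiny stand-in script (hypothetical; the course uses bin/lines.sh instead).
printf '#!/bin/sh\necho "hello from my script"\n' > "$workdir/bin/hello.sh"

# Give the owner read, write, and execute permission.
chmod u=rwx "$workdir/bin/hello.sh"

# Put the scratch bin directory on the search path for this session only.
PATH=$PATH:$workdir/bin

# The script now runs by name, without typing `bash` or the directory.
result=$(hello.sh)
echo "$result"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Both conditions matter: without the execute bit the shell refuses to run the file, and without the `PATH` entry it cannot find the script by name.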