├── README.md ├── class-notes ├── class-1.md ├── class-2.md ├── class-3.md ├── class-4.md ├── class-5.md ├── class-6.md ├── class-7.md └── examples │ ├── class7 │ ├── email.py │ ├── vidbot.py │ └── webapp.py │ ├── natural-language-processing │ ├── classify.py │ ├── manifesto.txt │ ├── part-of-speech.py │ ├── pride.txt │ ├── regexp.py │ ├── similarity.py │ └── translate.py │ └── video │ ├── combine_videos.py │ ├── random_overlay.py │ └── randomize.py ├── reader-01-the-command-line.md └── reader-02-python-basics.md /README.md: -------------------------------------------------------------------------------- 1 | # Scrapism 2 | 3 | (draft syllabus) 4 | 5 | **Instructor:** [Sam Lavigne](http://lav.io) | [splavigne@gmail.com](mailto:splavigne@gmail.com) 6 | **Teaching Assistant:** TBD 7 | **Track:** Code Poetry, Fall 2018 8 | **Location:** [School for Poetic Computation](http://sfpc.io/) | 155 Bank St, New York, NY 10014 9 | **Time:** Tuesdays 10am to 1pm 10 | **Office Hours:** Tuesdays 2pm to 4pm (or by appointment) 11 | **Class Notes:** [link](https://paper.dropbox.com/folder/show/Class-Notes-e.1gg8YzoPEhbTkrhvQwJ2zz3XJBcZkbceseDnY854qf9k5dPQtUC2) 12 | 13 | Scrapism is the artistic practice of web scraping, or of automatically collecting and transforming found digital material. It hinges upon a combination of curatorial practice, reverse engineering, and hoarding mentality. In this class students will learn how to scrape massive quantities of material from the internet with Python, and then use that material to make poetic, satirical, critical, political projects. Each session we will cover a different web scraping technique, with production assignments relating to text, image and video. We will explore surrealist, dadaist, situationist techniques such as detournement, collage, and cut-ups, and apply them to a contemporary digital context. 14 | 15 | ## Schedule 16 | 17 | ### 1. September 18th 18 | 19 | Introductions. Using the terminal. Basic python. Reading lines. 
20 | 21 | #### Readings 22 | * [Intro to the command line](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-01-the-command-line.md) 23 | * [Python basics](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md) 24 | * [Artificial Hells (introduction and chapter 1)](https://selforganizedseminar.files.wordpress.com/2011/08/bishop-claire-artificial-hells-participatory-art-and-politics-spectatorship.pdf) By Claire Bishop 25 | * [A User’s Guide to Détournement](http://www.bopsecrets.org/SI/detourn.htm) 26 | 27 | #### Assignment 28 | * Find three sentences (or phrases) in the wild. Your sentences could come from the internet or the real world, from a book, a store sign, a facebook post, a news article, product packaging, or from a restaurant menu. Anything is fine, but you must not write it yourself. Be prepared to recite what you have found next week in class. 29 | 30 | --- 31 | 32 | ### 2. September 25th 33 | 34 | Python part 2. Manipulating text. Automating writing. 35 | 36 | #### Readings 37 | * Tech reading 2 TBD 38 | * [The Cut Up Method](http://www.writing.upenn.edu/~afilreis/88v/burroughs-cutup.html) by William Burroughs 39 | 40 | #### Assignment 41 | * Transform a non-poetic text into a poetic text using Python. It is up to you to determine how and why a text is poetic or non-poetic. If you are stuck, try techniques like sorting, randomizing, filtering, deleting, or replacing. 42 | 43 | --- 44 | 45 | ### 3. October 2nd 46 | 47 | Web scraping basics. Making big lists. 48 | 49 | #### Readings 50 | * Tech reading 3 51 | * [Uncreative Writing](https://www.chronicle.com/article/Uncreative-Writing/128908) by Kenneth Goldsmith 52 | 53 | --- 54 | 55 | ### 4. October 9th 56 | 57 | Web scraping part 2. APIs. Advanced text manipulation and parsing. 
58 | 59 | #### Readings 60 | * Tech reading 4 61 | * [Digital Divide](https://www.artforum.com/print/201207/digital-divide-contemporary-art-and-new-media-31944) by Claire Bishop 62 | * [Montage](https://lucian.uchicago.edu/blogs/mediatheory/keywords/montage/) by Jared Leibowich 63 | 64 | --- 65 | 66 | ### 5. October 16th 67 | 68 | Automating collage. 69 | 70 | #### Readings 71 | * Tech reading 5 72 | * [Too Much World: Is the Internet Dead?](https://www.e-flux.com/journal/49/60004/too-much-world-is-the-internet-dead/) by Hito Steyerl 73 | 74 | --- 75 | 76 | ### 6. October 23rd 77 | 78 | Automating video. 79 | 80 | #### Readings 81 | * Tech reading 6 82 | * [Surrealism: the Last Snapshot of the European Intelligentsia](https://monoskop.org/images/a/a0/Benjamin_Walter_1929_1978_Surrealism_The_Last_Snapshot_of_the_European_Intelligentsia.pdf) by Walter Benjamin 83 | 84 | --- 85 | 86 | ### 7. October 30th 87 | 88 | Bots and project work. 89 | 90 | --- 91 | 92 | 93 | ## Fun/useful Python Libraries 94 | * [moviepy](http://zulko.github.io/moviepy/) - edit video 95 | * [vidpy](http://antiboredom.github.com/vidpy/) - edit video (my library) 96 | * [videogrep](http://antiboredom.github.com/videogrep/) - make supercuts (my library) 97 | * [youtube-dl](https://rg3.github.io/youtube-dl/) - download videos 98 | * [pillow](https://python-pillow.org/) - edit images 99 | * [flask](http://flask.pocoo.org/) - web server 100 | * [twython](https://github.com/ryanmcgrath/twython) - use the twitter api 101 | * [spacy](https://spacy.io/) - natural language processing 102 | * [requests](http://docs.python-requests.org/en/master/) - easy http requests 103 | * [envelopes](http://tomekwojcik.github.io/envelopes/) - send email 104 | * [opencv](http://opencv.org/) - computer vision 105 | * [asciimatics](https://github.com/peterbrittain/asciimatics) - text-based interfaces and animation 106 | * [colorama](https://github.com/tartley/colorama) - easy color in the terminal 107 
| 108 | -------------------------------------------------------------------------------- /class-notes/class-1.md: -------------------------------------------------------------------------------- 1 | # Sept 18 - The Command Line 2 | **Instructor**: Sam Lavigne | [splavigne@gmail.com](mailto:splavigne@gmail.com) 3 | **Teaching Assistant**: Fernando Ramallo | [fernando.ramallo@gmail.com](mailto:fernando.ramallo@gmail.com) 4 | **Track**: Code Poetry, Fall 2018 5 | **Location**: School for Poetic Computation | 155 Bank St, New York, NY 10014 **Time**: Tuesdays 10am to 1pm 6 | **Office Hours**: Tuesdays 2pm to 4pm (or by appointment) 7 | 8 | Slack channel: #2018-fall-scrapism 9 | Sam’s office hours Sign-up sheet: [+Sam Office Hours](https://paper.dropbox.com/doc/Sam-Office-Hours-gaKmWg2Qo7jnn2FbO7F5b) 10 | Fernando’s office hours sign-up sheet: [+Fernando (TA) Office Hours](https://paper.dropbox.com/doc/Fernando-TA-Office-Hours-p8FxDav0hzpIjrJ4rtfeX) 11 | 12 | 13 | # Reader 14 | - [Intro to the command line](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-01-the-command-line.md) 15 | - [Python basics](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md) 16 | # Notes 17 | 18 | 19 | 20 | - We all introduced ourselves, again! 21 | - We’re gonna assume no technical knowledge, feel free to reach out for questions. 22 | - Sam will record himself giving the class, put it in a private link 23 | 24 | 25 | ## Sam’s work 26 | 27 | http://lav.io/ 28 | 29 | How can we make critical statements without saying specifically what that statement is. 30 | 31 | https://lav.io/projects/white-collar-crime-risk-zones/ 32 | https://lav.io/projects/baabaa/ - An index of selected commodities listed for sale on alibaba.com. Items are arranged by price and minimum order quantity and are search results for terms like “riot gear” and “human labor”. 
33 | https://lav.io/projects/cspan-5/ - most frequently stated phrases turned into a video 34 | 35 | 36 | 37 | ## Scrapism 38 | 39 | Q of this class: how do we make something new by using material that already exists / 40 | What new things are sayable today? .. by means of these tools that wouldn’t be sayable otherwise 41 | 42 | Objectives 43 | 44 | - learn python 45 | - use it to collect material and manipulate it 46 | - use text: how do we create automatic *poetry* 47 | - image: how do we create automatic *collage* 48 | - video: automatic *montage* 49 | 50 | Look at groups and individuals from the past that used rule-based techniques / almost automatically / surrealists, dadaists, situationists 51 | We’re gonna be making critiques, satires, commentaries, poetry. 52 | Process: 53 | 54 | - find a good source material 55 | - figure out how to get that source material (get a lot of it) 56 | - figure out how to parse it and transform it / take something that is a big mess from the internet, take unstructured information / transform it into something you can use 57 | - figure out how to present what you’ve collected to the world / something new 58 | 59 | We’re gonna treat everything *as a text***,** looking at images *as* *if they were* text, e.g. [C-SPAN5 bot](https://twitter.com/cspanfive) (treating video as text that is cut and put together). 60 | 61 | How do these techniques work in a post-Trump environment? 62 | All information is out in the open, does that make this work superfluous? 63 | 64 | I saw a horrible website today! 
65 | https://anti-captcha.com 66 | 67 | 68 | ## Class today 69 | 70 | All the things we’re gonna talk about today are gonna be in these readers: 71 | 72 | - [Intro to the command line](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-01-the-command-line.md) 73 | - [Python basics](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md) 74 | 75 | Every class will have a series of readings (technical and non-technical): 76 | 77 | - Technical readings are what we talked about in the class, for reference / when you forget 78 | - The readings are the ones listed in the [syllabus](https://github.com/antiboredom/sfpc-scrapism), *in the slot for the previous class* 79 | 80 | 81 | 82 | ## The Terminal 83 | 84 | Applications > Utilities > Terminal 85 | Cmd+Space > “Terminal” 86 | 87 | 88 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_DB93935784C30DFE0319F4DADC3823BE454C5CF94C07DCD9BB4B5FA46EC71A23_1537283005399_image.png) 89 | 90 | 91 | The terminal is a text-based way of navigating folders 92 | 93 | **Print the directory you’re in:** 94 | 95 | pwd 96 | 97 | 98 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_DB93935784C30DFE0319F4DADC3823BE454C5CF94C07DCD9BB4B5FA46EC71A23_1537283131928_image.png) 99 | 100 | 101 | 102 | 103 | **See what’s in the folder you’re in** 104 | 105 | ls 106 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_DB93935784C30DFE0319F4DADC3823BE454C5CF94C07DCD9BB4B5FA46EC71A23_1537283236404_image.png) 107 | 108 | 109 | **Change the directory you’re in** 110 | 111 | cd [folder you want to enter] 112 | 113 | cd Desktop 114 | 115 | 116 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_DB93935784C30DFE0319F4DADC3823BE454C5CF94C07DCD9BB4B5FA46EC71A23_1537283219373_image.png) 117 | 118 | 119 | **The terminal doesn’t understand spaces. 
Use quotes (") to access folders and files with spaces.** 120 | 121 | 122 | cd Creative Cloud Files # doesn't work 123 | cd "Creative Cloud Files" 124 | 125 | 126 | 127 | **Going back: To go up one directory: cd ..** 128 | 129 | cd .. # goes up to the parent folder 130 | 131 | 132 | **Making directories: mkdir** 133 | 134 | mkdir [name of the directory] 135 | 136 | mkdir newfolder # makes a folder called 'newfolder' 137 | 138 | 139 | 140 | **Move files and folders and rename them: mv** 141 | 142 | 143 | mv [old name] [new name] 144 | 145 | mv newfolder/ newnamedfolder #renames folder 'newfolder' to 'newnamedfolder' 146 | 147 | the trailing slash means folder, it’s optional 148 | 149 | mv can be used for moving a file, but also for renaming it 150 | 151 | 152 | **Creating new files: touch** 153 | Updates the last-modified date of a file or folder to right now. 154 | If that file doesn’t exist, it **creates that file** 155 | a fast way of making files 156 | 157 | 158 | touch [name of file or folder] 159 | 160 | touch coolfile.txt #makes an empty file called 'coolfile.txt' 161 | 162 | 163 | 164 | **Delete** 165 | 166 | rm [name of file] 167 | 168 | rm coolfile.txt 169 | 170 | 171 | **Hit tab to autocomplete a file or folder** 172 | 173 | cd Des[HIT TAB] # autocompletes to cd Desktop 174 | 175 | 176 | 177 | 178 | ## Manipulating text 179 | 180 | **Use Gutenberg for source text** 181 | A good external source to work with is [Project Gutenberg](http://gutenberg.org/). It offers 57,000 free public-domain eBooks. 
182 | 183 | - Download files in Plain Text format 184 | 185 | Moby Dick text: https://www.gutenberg.org/cache/epub/15/pg15.txt 186 | The Trial by Kafka: https://www.gutenberg.org/cache/epub/7849/pg7849.txt 187 | 188 | Save file as Plain Text Document (or Page Source in Safari) 189 | 190 | **See information about the file** 191 | 192 | file [name of file] 193 | 194 | file mobydick.txt 195 | Output: mobydick.txt: UTF-8 Unicode (with BOM) text, with CRLF line terminators 196 | 197 | **Looking inside the contents of a file** 198 | 199 | cat [name of file] # prints content of the file on the screen 200 | 201 | cat mobydick.txt 202 | # .... will print the entire text 203 | 204 | **Use the ‘more’ command to actually read through the text with scrolling** 205 | 206 | more [name of file] 207 | 208 | more mobydick.txt 209 | # ... scroll through the text 210 | # ... type Q to exit 211 | 212 | 213 | 214 | **Best command: say** 215 | 216 | say hello 217 | 218 | say this is your computer i am going to murder you 219 | 220 | 221 | 222 | All the commands have a structure 223 | ***name of command + argument (usually file or folder)*** 224 | 225 | **But most commands have additional options** 226 | Almost every command has a built-in manual. Access it with the **man** command 227 | 228 | man say 229 | # will go to the manual about the say command, 230 | # exit by typing Q 231 | 232 | e.g. -v to change the voice, -f file, -r rate 233 | there are usually two ways of writing an option, e.g. 234 | 235 | - -r rate 236 | - --rate=rate 237 | say whatever 238 | # says 'whatever' at normal rate 239 | 240 | say -r 500 whatever 241 | # says 'whatever' at the rate of 500 words per minute 242 | 243 | # use -f option to read a file 244 | say -f mobydick.txt 245 | # says the entirety of Moby Dick out loud. Poetic!
246 | 247 | 248 | 249 | 250 | **To stop a command** 251 | 252 | - Ctrl + C: Stops the command 253 | - Cmd + Q (Alt + F4 in windows): Closes the terminal entirely 254 | 255 | 256 | **Use the grep command to print every line of a text file that contains a certain word** 257 | a line ends wherever there’s a line break in the text (every place someone hit enter) 258 | 259 | 260 | grep trial thetrial.txt 261 | # prints all the lines of the text file that contain the word 'trial' 262 | 263 | grep whale mobydick.txt 264 | 265 | # to search for a phrase of more than one word, put it in quotes 266 | grep "the whale" mobydick.txt 267 | 268 | 269 | 270 | 271 | **Sort command** 272 | sorts the lines of a file 273 | 274 | sort thetrial.txt 275 | # prints the lines of The Trial in alphabetical order 276 | 277 | sort -u # only unique lines 278 | sort -r # reverse order 279 | 280 | 281 | 282 | 283 | **Save the output of the command line to a new file, with the > sign** 284 | *this is called a redirect* 285 | 286 | [command] > [file name to save to] 287 | 288 | sort thetrial.txt > thetrial_sorted.txt 289 | # instead of printing, save the output to the thetrial_sorted.txt file 290 | 291 | 292 | 293 | **You can combine commands together** 294 | take the output of one command, pipe it to another command, and chain things together 295 | e.g. do the sort and grep at the same time 296 | 297 | 298 | # use the vertical bar character (pipe) | to chain commands 299 | 300 | grep whale mobydick.txt | sort 301 | # send the lines that grep finds into the sort command, and finally to the screen 302 | 303 | grep whale mobydick.txt | sort > sorted_whales.txt 304 | # make a text file with the lines that include "whale", sorted alphabetically 305 | 306 | 307 | 308 | **Other fun commands** 309 | 310 | **use cut to separate words** 311 | 312 | cut # breaks every line in the file by a delimiter, 313 | # e.g. 
break the lines by spaces, 314 | # -d delimiter 315 | # -f field 316 | 317 | cut -d " " -f 1 mobydick.txt 318 | # split each line of mobydick.txt at the spaces (therefore separating each word) and get the first field (i.e. the first word of every line) 319 | 320 | **use a wildcard to access multiple files** 321 | 322 | ls *.txt 323 | # lists any file that ends with .txt 324 | 325 | 326 | **clear to clear the screen** 327 | 328 | clear 329 | # empties the terminal window 330 | 331 | 332 | 333 | 334 | ## How the file system works 335 | 336 | Files and folders, 337 | Every folder has exactly one parent folder, except the very top (the root) 338 | 339 | The root folder (the hard drive) is written as a forward slash / 340 | 341 | cd / 342 | # goes to the root folder 343 | 344 | Some files and folders are **hidden** 345 | 346 | cd / 347 | ls 348 | # will list all the files and folders in the root, you'll see some that are hidden in the Finder / Folder viewer 349 | 350 | Each file/folder has a unique path 351 | You can go to a specific folder and access a file inside it 352 | 353 | cd /Users/sam/Desktop/ 354 | # go to the desktop 355 | more thetrial.txt 356 | # if there's a file called thetrial.txt in Desktop, it gets printed out 357 | # otherwise, an error 358 | 359 | But you can also access a file by its **unique path**, from any other folder 360 | 361 | cd / 362 | more /Users/sam/Desktop/thetrial.txt 363 | 364 | *Tip: Drag a folder or file from the Finder to the terminal and get its unique path without having to type it* 365 | 366 | cd can be used to navigate the file system easily 367 | 368 | cd 369 | # cd with no argument goes to your home folder (e.g. /Users/sam), not the root 370 | 371 | cd ../Documents 372 | # .. means one level up 373 | # goes one level up, and then down into the Documents folder, if it exists 374 | # can be combined: 375 | cd ../../../Desktop #go three levels up and then into Desktop 376 | 377 | cd ./Desktop 378 | # . 
means the folder we are currently in 379 | 380 | 381 | **open** opens a file in its default application 382 | 383 | open mobydick.txt 384 | # opens the text file in TextEdit or notepad 385 | 386 | open . 387 | # opens the folder we currently are in, in the folder viewer (eg. Finder) 388 | 389 | 390 | **Some tricks to move the typing cursor quickly** 391 | **Shortcuts:** 392 | 393 | - Ctrl + A: brings the cursor to the beginning of the line 394 | - Ctrl + E: brings the cursor to the end of the line 395 | - Tab: for autocomplete of commands 396 | - “gr” + Tab: show all commands that start with gr 397 | - Cmd + D: splits screen to have multiple terminals 398 | - Cmd + N: makes a new terminal window 399 | - Cmd + T: makes a new tab 400 | 401 | **Another terminal program** 402 | 403 | - iTerm 404 | 405 | 406 | 407 | ## Install python + text editor 408 | 409 | **Installing python** 410 | Your computer comes with python, but we need a different version. 411 | 412 | There are tons of ways to install python. 413 | We’re gonna use a tool called **brew** to install stuff: 414 | https://brew.sh/ 415 | 416 | Take the main example line, copy-paste it into a Terminal, hit enter. 417 | 418 | /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" 419 | 420 | “It should just work” 421 | 422 | Once brew is installed, install python from a terminal: 423 | 424 | brew install python3 425 | 426 | 427 | **Installing a text editor** 428 | 429 | It doesn’t matter which text editor you use, but here are a few good ones 430 | 431 | - Sublime https://www.sublimetext.com/ **paid but fast!** 432 | - Visual Studio Code https://code.visualstudio.com/ **free/open source** 433 | - Atom https://atom.io/ **free/open source** 434 | 435 | See [Python basics](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md) for install instructions 436 | 437 | Text editors will color-code a python file to show you different parts.
438 | 439 | You can also edit Python files in an **IDE**, “integrated development environment”, they are full platforms for programming, with lots of features. For the purpose of this class we’ll stick to plain text editors. 440 | 441 | **Using python** 442 | 443 | python is just a command line program (a program that you can use in the Terminal) 444 | 445 | you might have more than one python version, 446 | to use the one we’re using type **python3** 447 | 448 | **Way ONE to use python: without arguments** 449 | 450 | In a terminal window: 451 | 452 | python3 453 | 454 | 455 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_DB93935784C30DFE0319F4DADC3823BE454C5CF94C07DCD9BB4B5FA46EC71A23_1537288255592_image.png) 456 | 457 | 458 | 459 | >>> 2+1 460 | 3 # output 461 | 462 | 463 | To exit the python console, type 464 | 465 | Ctrl + D 466 | 467 | >>> exit() 468 | 469 | 470 | **Way TWO: next week!** 471 | 472 | 473 | 474 | 475 | ## Works to look for / Works we’re basing our work on 476 | 477 | **Allison Parrish** 478 | http://www.decontextualize.com/ 479 | 480 | https://twitter.com/everyword 481 | a twitter bot that tweets every single word of the english language in alphabetical order 482 | 483 | / when you make a work for this, what is it you’re doing? 
484 | / closer to a performance 485 | / the lens of performance can help us understand this work 486 | 487 | not just about the bot itself, about the reactions to the bot 488 | 489 | / related to Claire Bishop’s reading 490 | 491 | Responses to éclair 492 | [https://twitter.com/everyword/status/475170297776447488](https://twitter.com/everyword/status/475170297776447488) 493 | 494 | **Nick Montfort** 495 | one-line terminal commands, 256 characters long, that make poetry 496 | https://nickm.com/poems/ppg256.html 497 | 498 | **Everest Pipkin - Cloud OCR** 499 | http://ifyoulived.org/translations.html 500 | Misusing image conversion / analysis 501 | https://procedural-generation.tumblr.com/ 502 | 503 | what does the cloud say according to the computer 504 | poem 505 | 506 | / it’s broken / a natural lifespan/limit 507 | 508 | **Daniel Temkin - Internet Directory** 509 | http://danieltemkin.com/InternetDirectory 510 | A 37k+ page loose-leaf book containing all 115 million .COM domains in alphabetical order, along with current IP addresses.
511 | 512 | 513 | **Sam’s own - Patent Generator** 514 | http://lav.io/2014/05/transform-any-text-into-a-patent-application/ 515 | Output: https://saaaam.s3.amazonaws.com/communist.pdf 516 | 517 | 518 | **Kate Compton - Tracery** 519 | http://www.tracery.io/ 520 | Text generation 521 | 522 | 523 | / You can make tools 524 | / You can share those tools, see what other people make with them 525 | / You are making a form / with constraints 526 | 527 | 528 | **Kyle McDonald - Keytweeter** 529 | [https://vimeo.com/9922212](https://vimeo.com/9922212) 530 | Tweets everything you type 531 | 532 | 533 | 534 | **Great book for learning python** 535 | Learn Python the Hard Way 536 | https://www.learnpythonthehardway.org/ 537 | 538 | **Other resources for learning python** 539 | Automate the Boring Stuff 540 | https://automatetheboringstuff.com/ 541 | 542 | Python for Everybody 543 | https://books.trinket.io/pfe/ 544 | 545 | 546 | ## Assignment for next week 547 | 548 | Look at python basics 549 | Read 550 | 551 | - [Python basics](https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md) 552 | - [Artificial Hells (introduction and chapter 1)](https://selforganizedseminar.files.wordpress.com/2011/08/bishop-claire-artificial-hells-participatory-art-and-politics-spectatorship.pdf) By Claire Bishop 553 | - [A User’s Guide to Détournement](http://www.bopsecrets.org/SI/detourn.htm) 554 | 555 | 556 | **Find 3 sentences** 557 | You’re gonna assign them to the rest of the class 558 | not too long 559 | they can come from anywhere / internet, real world, facebook post, product packaging, menu 560 | as long as you don’t write them yourself 561 | 562 | Combine them (one after the other) 563 | they can either make sense together or not 564 | putting them together creates new possibilities 565 | 566 | 567 | 568 | 569 | ## WordHack this Thursday! 
570 | 571 | [[link]](https://www.facebook.com/events/713754025655700/?acontext=%7B%22ref%22%3A%2229%22%2C%22ref_notif_type%22%3A%22event_aggregate%22%2C%22action_history%22%3A%22null%22%7D&notif_id=1537184173953188&notif_t=event_aggregate) 572 | WordHack is a monthly evening of performances and talks exploring the intersection of language and technology. Code poetry, digital literature, e-lit, language games, coders interested in the creative side, writers interested in new forms writing can take, all are welcome here. 573 | 574 | This month we will feature talks and performances by: 575 | JOANNE MCNEIL ([http://www.joannemcneil.com/](http://www.joannemcneil.com/)) 576 | MARTIN O'LEARY ([http://mewo2.com/](http://mewo2.com/)) 577 | ESTHER SEYFFARTH ([https://user.phil.hhu.de/~seyffarth/index.html](https://user.phil.hhu.de/~seyffarth/index.html)) 578 | 579 | 580 | ## Synchrony NYC 581 | 582 | 583 | http://synchrony.nyc/2019/index.html 584 | Synchrony is a DEMOPARTY that begins in NEW YORK CITY, continues on an Amtrak train, and concludes in MONTREAL. 585 | 586 | Synchrony is about being creative with computers, and seeing how computers can produce amazing sorts of animation, graphics, music, and other experiences. At the end we have COMPOS (competitions) that are voted on by those who are there at the party. Some people may work on their entries for these compos for months beforehand; some, just on the train ride up. People are welcome to enter remotely, even if they are unable to attend. 587 | 588 | 589 | -------------------------------------------------------------------------------- /class-notes/class-2.md: -------------------------------------------------------------------------------- 1 | # Sept 25 - Python part 2. Manipulating text. 
Automating writing 2 | 3 | **Instructor**: Sam Lavigne | [splavigne@gmail.com](mailto:splavigne@gmail.com) 4 | **Teaching Assistant**: Fernando Ramallo | [fernando.ramallo@gmail.com](mailto:fernando.ramallo@gmail.com) 5 | **Track**: Code Poetry, Fall 2018 6 | **Location**: School for Poetic Computation | 155 Bank St, New York, NY 10014 **Time**: Tuesdays 10am to 1pm 7 | **Office Hours**: Tuesdays 2pm to 4pm (or by appointment) 8 | 9 | Syllabus: http://github.com/antiboredom/sfpc-scrapism 10 | Slack channel: #2018-fall-scrapism 11 | Sam’s office hours Sign-up sheet: [+Sam Office Hours](https://paper.dropbox.com/doc/Sam-Office-Hours-gaKmWg2Qo7jnn2FbO7F5b) 12 | Fernando’s office hours sign-up sheet: [+Fernando (TA) Office Hours](https://paper.dropbox.com/doc/Fernando-TA-Office-Hours-p8FxDav0hzpIjrJ4rtfeX) 13 | 14 | 15 | # Notes 16 | 17 | 18 | 19 | Fernando gave a presentation about his work 20 | 21 | - His website http://byfernando.com/ 22 | - His games https://fernandoramallo.itch.io/ (get in touch if you want a free copy of any) 23 | 24 | We all went through our assignments 25 | 26 | 27 | Get you in the mood of using language that you find around. 28 | Juxtaposition 29 | 30 | 31 | # Readings 32 | 33 | Claire Bishop 34 | 35 | - Good survey of things that have been done around reutilizing existing language 36 | - attempts to create art that are not commodifiable, a characteristic of social art 37 | - frequently dealing with ethical political concerns 38 | - creating a social space, rather than making an object 39 | - social art isn’t held to same standards as normal art, when judged. is it good art? good activism? sometimes neither. important to take note of / be aware of, when making work that’s aesthetic and activist. 40 | - is making an art project the best way to achieve activist goals? 41 | - is doing an activist project the best way to achieve the artistic goals? 
42 | - set your own ideas for how your work is judged / sometimes it’s not quantifiable 43 | 44 | 45 | Detournement 46 | 47 | - different interpretations of it 48 | - what is the source text advocating for / in using it you’re eradicating its context 49 | - you’re renewing its value 50 | - as a practitioner, what would your desired goal be? make a new thing and destroy the old? give new value to the old through that act? 51 | 52 | 53 | 54 | # Python 55 | 56 | 57 | # See Sam’s reader with more examples here: 58 | 59 | https://github.com/antiboredom/sfpc-scrapism/blob/master/reader-02-python-basics.md 60 | 61 | 62 | 63 | ## Using the right version 64 | 65 | when you installed python 3, it didn’t remove your old version 66 | if you type python, sometimes it runs the **older** version that comes with Mac, not the one we’ll use 67 | 68 | Depending on your settings, you might be able to type ***python*** in the terminal and get the right version, but to make sure, type **python3** 69 | 70 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_B93F4A161C5A44CDAEDBB62D1CDA4B91AEE6CCE1A00E6E4521CF0698C91A37EA_1537886949835_image.png) 71 | 72 | 73 | 74 | **To exit the python console, press Ctrl + D** 75 | 76 | 77 | ## Creating a file with the Terminal 78 | 79 | On the terminal: 80 | 81 | 82 | 1. Make a new folder with **mkdir python_lesson_1** 83 | 2. Enter it with **cd python_lesson_1** 84 | 3. Create a file with **touch hello.py** 85 | 1. **touch** updates a file’s modified date if it exists, otherwise it creates it 86 | 4. Open the file with the default editor with **open hello.py** 87 | 1. 
To change the default editor: right click the file in Finder > Get Info > Change the default app in the Open With section 88 | 89 | 90 | 91 | ## Writing our first program 92 | 93 | **Print something to the screen:** 94 | 95 | In the text editor, in hello.py: 96 | 97 | 98 | print("a specter is haunting europe") 99 | 100 | 101 | 102 | Hit save on your editor 103 | 104 | Run the program to see its output. 105 | In the terminal: 106 | 107 | 108 | $ python3 hello.py 109 | 110 | ![](https://d2mxuefqeaa7sj.cloudfront.net/s_B93F4A161C5A44CDAEDBB62D1CDA4B91AEE6CCE1A00E6E4521CF0698C91A37EA_1537887306088_image.png) 111 | 112 | 113 | 114 | **Expressions** 115 | 116 | python evaluates mathematical operations and replaces them with their value 117 | 118 | 119 | 120 | # print mathematical operations 121 | print( 1 + 1 ) # outputs 2 122 | print( 5 / 33 ) 123 | print( 1 + 7 / 25 * 5 ) 124 | 125 | 126 | 127 | You can compare expressions 128 | 129 | 130 | 131 | print( 1 == 2 ) # prints False, because 1 does not equal 2 132 | 133 | print( 1 < 2 ) # less than 134 | print( 1 > 2 ) # greater than 135 | print( 1 >= 2 ) # greater than or equal to 136 | print( 1 <= 2 ) # less than or equal to 137 | print( 1 != 2 ) # not equal 138 | 139 | 140 | 141 | You can **comment** parts of code out with # so they’re in your file but they don’t run 142 | 143 | 144 | 145 | print(1+2) 146 | # print("Hello") 147 | 148 | 149 | Some editors let you comment the code you select with **Ctrl + /** 150 | 151 | 152 | You can save the value of an expression with **variables**, where you assign a name to an expression or value 153 | 154 | 155 | 156 | some_number = 100 157 | 158 | 159 | the value 100 is now stored in the variable some_number. 
We can see its value with print() 160 | 161 | 162 | 163 | print(some_number) 164 | # Output: 100 165 | 166 | 167 | There’s different **kinds** of values: 168 | 169 | - Integer: a whole number (1, 2, 3, 5, 1000) 170 | - Float: a number with decimals (1.55345, 2.0) 171 | - String: a piece of text, defined between quotes (“hello”, “a spectre… “) 172 | - Boolean: True or False 173 | - Lists: a list of items 174 | 175 | some_number = 100 176 | some_float = 10.5 177 | some_string = "a spectre is haunting europe" 178 | some_boolean = False 179 | a_list = [ 1, 100, 20, 25, -305 ] # a list of integers 180 | # You can combine types, not a good idea but.. 181 | another_list = [ "hi", 1, 1.53242, False ] 182 | 183 | 184 | The most important for us is going to be 185 | 186 | 187 | ## Strings 188 | 189 | A string is a series of characters 190 | 191 | we can make a variable that stores a string 192 | 193 | we can combine variables to make new values. 194 | 195 | If we add two strings together, it **concatenates** them 196 | 197 | 198 | 199 | first_name = "Karl" 200 | last_name = "Marx" 201 | 202 | full_name = first_name + last_name 203 | 204 | print(full_name) # Output: KarlMarx 205 | 206 | # To put a space between the values 207 | full_name = first_name + " " + last_name 208 | print(full_name) # Output: Karl Marx 209 | 210 | 211 | Each character in our string has a numerical index 212 | If we want the first letter, we **access it with brackets and an index** (starting from zero) 213 | 214 | 215 | first_letter = full_name[0] 216 | second_letter = full_name[1] 217 | 218 | print(first_letter) #Output: K 219 | 220 | 221 | If we use an **index outside of the length of the string**, we get an error 222 | 223 | 224 | print(full_name[1000]) # Output: IndexError: string index out of range 225 | 226 | 227 | 228 | We can use **indices with negative numbers** to start at the end and walk our way backwards: 229 | 230 | 231 | 232 | # Get the last letter 233 | last_letter = full_name[-1] 234 | 
    second_to_last_letter = full_name[-2]

We can also get **ranges of characters**. This makes python very powerful for our kind of work

    print(full_name[0:3]) # Outputs the first three characters: Kar

We can combine everything we've seen so far:

    print(full_name[4:-1]) # Gets a range from the fifth character up to (but not including) the last one

We can check the length of a string with **len()**

    total_characters = len(full_name)

We can determine if a string contains another string, using the **in** keyword

- my_string in another_string: returns True or False

    sentence = "A spectre is haunting Europe"

    # is "spectre" inside the sentence?
    print("spectre" in sentence) # Output: True

    print("specter" in sentence) # Output: False

    # it's case-sensitive
    print("Spectre" in sentence) # Output: False

    # to make the check case-insensitive, we turn the sentence into lowercase
    print("europe" in sentence) # Output: False
    print("europe" in sentence.lower()) # Output: True
    # Note: lower() doesn't modify sentence

**String methods** let us manipulate strings in interesting ways:

    sentence = "A spectre is haunting Europe"

    # Make every character upper case
    print(sentence.upper()) # Outputs: A SPECTRE IS HAUNTING EUROPE

    # or lower case
    print(sentence.lower()) # Outputs: a spectre is haunting europe

    # capitalize the first letter of each word
    print(sentence.title()) # Outputs: A Spectre Is Haunting Europe

    # Use replace to find a word and replace it with another
    print(sentence.replace("is", "was")) # Outputs: A spectre was haunting Europe

    # We can chain these operations together
    print(sentence.replace("is", "was").upper()) # Outputs: A SPECTRE WAS HAUNTING EUROPE

None of these examples **modify the original value**. If we want to actually change it, we assign the result back to the variable

    sentence.upper() # only returns the upper case sentence, doesn't modify the variable
    sentence = sentence.upper() # assigns the new upper case version to the variable

You can go through **more string methods** here:
https://docs.python.org/3.7/library/stdtypes.html#string-methods
like center

    sentence = sentence.center(30, "*") # pads the sentence with * on both sides until it's 30 characters long

You can also do fun things like **multiplication**

    hello = "Hello" * 100
    print(hello) # Outputs: HelloHelloHelloHelloHello ...

    hello = "hello" + "o" * 100
    print(hello) # Outputs: helloooooooooooooo ...

    hello = "he" + "l" * 1000 + "o"

We can **combine different types**, but there are different ways

The bad way:

    number = 10
    message = "The number is " + number
    # This raises a TypeError (you can't concatenate a str and an int)

The OK way, convert the number to a string:

    message = "The number is " + str(number)

The better way if you have lots of numbers: use format, it'll replace each {} with a value

    # one value
    message = "The number is {}".format(number)

    # two values
    message = "The number is {} and the 2nd number is {}.".format(number, 100)
    print(message) # Outputs: The number is 10 and the 2nd number is 100.

## Make the computer say it

In the terminal

    python3 strings.py | say

## Save the output of a python file to a text file from the terminal

    python3 strings.py > strings.txt

## Lists

Make an empty lists.py file
In the terminal:

    touch lists.py
    open lists.py

A lot of methods from strings apply to lists.
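As a quick illustration of that overlap, here is a small sketch that runs the same operations on a string and on a list. The variable names and values are just examples, not part of the class files.

```python
# The same operations, applied to a string and to a list
word = "spectre"
names = ["Marx", "Trotsky", "Lenin", "Engels"]

print(len(word), len(names))              # length of each
print(word[0], names[0])                  # first item
print(word[-1], names[-1])                # last item
print(word[0:3], names[0:3])              # a slice of the first three items
print("spec" in word, "Marx" in names)    # membership check with "in"
print(word * 2, names * 2)                # repetition
print(word + "!", names + ["Luxemburg"])  # concatenation
```

The main difference to keep in mind: a string can't be changed in place, while a list can (for example with append).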

    # Declaring a list
    names = ["Marx", "Trotsky", "Lenin", "Engels"]

    # Get the length with len(names)
    print("Total names:", len(names)) # Outputs: Total names: 4

    # Add items to a list
    names.append("Stravinsky")

    # We can declare an empty list
    some_list = []

    # You can multiply a list
    print(names * 10) # Outputs a list with the content of the names list 10 times

    # You can add lists together
    print(names + some_list)

    # You can access individual items by their index starting with zero
    print(names[0]) # First item
    print(names[-1]) # Last item
    print(names[0:3]) # A list with the first three items (the item at index 3 is not included)

We can go through every item in our list, called **iteration**, using the **for** keyword

    # declare a variable name first, then the list we're going through second
    # it'll temporarily store each of the values in the variable name
    for name in names:
        print(name)
    # Outputs: calls print for every item, outputs its value

In other languages a *block* is defined with braces { }, but in python it's defined by **white space, using indentation**
Anything that shares the same indentation (e.g.
a Tab), is part of the same block

    for name in names:
        print(name)
        print("is a dead white guy") # also inside the loop

        print("and so is:") # Still inside the loop

    print("That's all the dead white guys in our list") # Outside of the loop

## More

We'll grab Kafka's Metamorphosis from gutenberg
https://www.gutenberg.org/cache/epub/5200/pg5200.txt

Save it to a file next to our python script

We'll read the text file and store it as a variable
In our python script:

    text = open("kafka.txt").read() # the name of the file, relative to where the script is

    print(text) # Outputs: the entire text

Now we can do stuff with it

    print(text.upper())

To read the file line by line, we use readlines() instead of read()

    text = open("kafka.txt").readlines()
    # text is now a list of strings, one for each line of the file

Now we can iterate over the lines

    for line in text:
        print(line) # Outputs each line

The problem: it's putting a blank line in between each line.
This is because each line still ends with an extra character, called a newline character, and print() adds another line break of its own
We can get rid of that with strip()

    for line in text:
        line = line.strip()
        print(line) # Outputs each line without surrounding whitespace or extra line breaks

Each of the lines is a string, so we can print parts of each line

    for line in text:
        line = line.strip()
        print(line[0:4])

    # Output is the first four characters of each line

Or do fun stuff like replacing

    for line in text:
        line = line.strip()
        print(line.replace('e', 'eeeeeee'))

## Processing text

We're gonna use a function called split() to break down a string according to a delimiter character.
You can use split() to turn a string into a list, separated at a given character
You can use join() to join a list back into a string

    for line in text:
        line = line.strip()
        words = line.split(" ") # Separates the line at each space, getting a list of words

        print(words[0]) # Outputs the first word of each line

        # Chain it all together!
        print(words[0].center(30, '~').upper())

We can use the **random** methods to do interesting stuff

Sometimes you have to tell python to add **modules** with the **import** keyword to get functionality you need. Here we'll import the [random module](https://docs.python.org/3.5/library/random.html).

- Use the documentation to find what you can do with a module
- Make sure you're seeing the documentation of the python version you're using (e.g.
3.5)

    # Import the module
    import random

    text = open("kafka.txt").readlines()

    for line in text:
        line = line.strip()
        words = line.split(" ")

        random_word = random.choice(words) # Get a random item from the word list

        random.shuffle(words) # Randomizes the order of the items in the list

We use the join() method to join the randomized word list into a string

    for line in text:
        line = line.strip()
        words = line.split(" ")
        random.shuffle(words)

        new_line = " ".join(words) # Joins the elements of the list by sticking a space in between the words, outputs a string

We can sort with sorted()

    for line in text:
        line = line.strip()
        words = line.split(" ")
        random.shuffle(words)

        words = sorted(words) # Sort the words list alphabetically

        new_line = " ".join(words)

Final script

    # Import the module
    import random

    text = open("kafka.txt").readlines()
    for line in text:
        line = line.strip()
        words = line.split(" ")
        random.shuffle(words)
        words = sorted(words)
        new_line = " ".join(words)
        print(new_line)

## List comprehension

Make a new file comps.py

We can make a list of upper case'd items

    names = ["Trotsky", "Marx", "Lenin", "Engels"]

    uppercase_names = []
    for name in names:
        uppercase_names.append(name.upper())

There's a handier way of doing this in python, called **list comprehension.**
This does the same thing as the example above

    names = ["Trotsky", "Marx", "Lenin", "Engels"]

    uppercase_names = [name.upper() for name in names]

It's saying: for every value in the list **names** temporarily store it as
a variable **name**, make that upper case and store it in a new list called **uppercase_names**

    names = [name.replace('r', 'arrrrr') for name in names]

We can filter too, by adding an **if statement** inside:

    names = [name for name in names if name[0] == "L"]
    # returns the elements of the list whose first letter is L (the check is case-sensitive)

We can add this filtering technique to the words in our previous example

    import random

    text = open("kafka.txt").readlines()
    for line in text:
        line = line.strip()
        words = line.split(" ")

        words = [word for word in words if word.startswith("a")]

        new_line = " ".join(words)

        print(new_line)
        # prints all the words that start with a

Or more:

    words = [word for word in words if len(word) > 5]
    # all the words that have more than 5 characters in them

    words = [word for word in words if word.endswith("ing")]
    # all the words that end in ing

# Assignment for next week

Also available in: https://github.com/antiboredom/sfpc-scrapism

Transform a non-poetic text into a poetic text

- up to you to determine what's poetic

Read some file, or if the text is short you can just put that text directly into python as a variable

If you don't know what to do, try stuff like sorting, randomizing, replacing, deleting things

By taking something that exists and using these methods we can reformat it and rework it. You can use whatever is at your disposal. You're not bound by the command line, so you can take the output and you're welcome to format it into something interesting, put it into openFrameworks, whatever you want to do

Take something that exists, do something that transforms it.
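If you're stuck, here is one minimal sketch of such a transformation, combining the filtering, replacing and randomizing techniques from above. The function name `poeticize`, the word-length cutoff and the sample text are all made up for illustration; treat it as a starting point, not as the expected answer.

```python
import random

def poeticize(text, seed=None):
    """Hypothetical helper: keep only the longer words of each line, in random order."""
    rng = random.Random(seed)  # a seed makes the "random" result repeatable
    poem = []
    for line in text.splitlines():
        # split on whitespace and strip punctuation from the ends of each word
        words = [w.strip(".,;:!?\"'") for w in line.split()]
        # filter: keep words longer than 4 characters, lower-cased
        words = [w.lower() for w in words if len(w) > 4]
        if not words:
            continue  # skip lines with nothing left
        rng.shuffle(words)
        poem.append(" ".join(words))
    return "\n".join(poem)

sample = (
    "Terms of Service: the company reserves the right to change these terms.\n"
    "Please review them regularly."
)
print(poeticize(sample, seed=1))
```

Because the function takes any string, you could also feed it a whole file, e.g. `poeticize(open("kafka.txt").read())`.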

If you're more advanced, you can start to get into using third party libraries to analyze text.
If you're feeling ambitious, make this program so that it can deal with any text. Make this poetic operation so it can work with any text that you feed it.

--------------------------------------------------------------------------------
/class-notes/class-3.md:
--------------------------------------------------------------------------------
# 10/02 - Dictionaries, scraping the web

# Dictionaries

List = collection of items ordered numerically
Dictionary = the items are indexed by a key (usually a String) instead of a numerical position

On the terminal
Make a new file and open it

    $ touch dicts.py
    $ open dicts.py

**Dictionaries are Key and Value pairs**
They're used to represent structures of data

In python, you define dictionaries with curly brackets { }

    person = { } # empty dictionary

    person = { "first_name": "Karl", "last_name": "Marx", "age": 235 }

    # An easier way to look at it:
    person = {
        "first_name": "Karl",
        "last_name": "Marx",
        "age": 235
    }

"first_name" is the **key**, "Karl" is the **value**

The values can be of any type: int, float, boolean, Strings, or even other dictionaries

**Dictionaries can contain any type, including dictionaries and lists**

    person = {
        "first_name": "Karl",
        "last_name": "Marx",
        "age": 235,
        "pet": {
            "name": "Proleterry",
            "species": "parrot",
            "age": 12
        },
        "favorite_books": ["Ethics", "Twilight"]
    }

You'll want to do things with values in the dictionary

## Getting values

**You can get a value from a dictionary using brackets and accessing the key**
The key has to be exactly the name of the key, e.g.
first_name
If it doesn't exist, an error halts the program

    # 1. access the value using brackets by referencing the key
    print( person["first_name!"] ) # Raises KeyError: there is no key named first_name!

    print( person["first_name"] ) # Outputs: Karl

**A safer way is to use the get method.**
It returns None without an error if the key isn't present

    name = person.get("first_name")

Sometimes dictionaries will have nested values, like lists and dictionaries, so you'll **iterate** through the values

    for book in person["favorite_books"]:
        print(book)

## You can iterate through a dictionary

and get all its properties

    for key in person:
        print(key) # prints the key
        print(person[key]) # prints the value stored under that key

## Adding and modifying the dictionary

Accessing a key and assigning to it will override the value for that key:

    # replaces the value for first_name
    person["first_name"] = "Lenin"

If the key doesn't exist, assigning to it creates it

    person["middle_name"] = "Terry"
    # now there's a new key middle_name with value Terry

# Intro to HTML

HTML is the markup language that the web is written in.

## Tags

HTML works as a series of **tags**
A tag looks like

    <p>some stuff</p>

The beginning of the tag, the contents of it, and the closing of the tag

There are different tags for different things

- \<p> paragraph
- \<strong> makes text bold
    - this text is normal and \<strong>this text is bold\</strong>
- \<a> makes a link
    - \<a href="...">go to google\</a>
- \<h1> makes a header
    - \<h1>My Header\</h1>
- \<div> represents a generic division of the page
    - \<div>I'm a div\</div>

## Attributes

Each tag can have a series of attributes, a set of **key** and **value** pairs
The two most important ones for scraping are

- **id** attribute
    - gives a unique identifier to a particular tag
    - \<div id="...">Hi I'm very important\</div>
    - an id can only be applied to one tag
- **class** attribute
    - designates a category of tag, which the author of the page uses to find or group tags
    - you can have multiple tags with the same class
    - \<div class="...">I am somewhat important\</div>
    - \<div class="...">I am also somewhat important\</div>

## Specific attributes

There are some attributes that can only be applied to certain tags

- **href** is only applied to \<a> to indicate where to go when you click on a link
    - \<a href="...">google\</a>
- **src** is only applied to \<img> to indicate which image to show
    - \<img src="...">

## Structure

A web page looks like this

    <html>
    <head>
    <title>My page title</title>
    </head>
    <body>
    <h1>Hello i am header</h1>

    <p>a paragraph</p>

    </body>
    </html>

## CSS

Cascading Style Sheets.
Just know that CSS is used to apply style to a page,
so the HTML stays the same for the content but the CSS indicates text color, sizes, etc.

It's composed of a selector, which references a part of the page, and brackets that contain the style

A CSS style sheet looks like this

    /* this sets all the p tags to have a red border */
    p {
      border: 1px solid red;
    }

Different selectors

    /* style the p tags and all the strong tags */
    p, strong {
    }

    /* style all the <a> tags inside all the <p> tags */
    p a {

    }

    /* style everything with a certain class name, preceded by a period */
    .moderately-important {

    }

    /* style an id, using #. e.g. style this ... */

contains the entire box

![](https://d2mxuefqeaa7sj.cloudfront.net/s_D67961504E4563B95DDF29A2542D190EAEAA0F940919FBD5CB2C6591C6D3326E_1538495625813_image.png)

So we get the box in our python script

    # find the box by using the class we found
    items = r.html.find(".item-content")

    for item in items:
        # now we have the whole item
        print(item) # prints an Element object, showing we're getting an element

        # we find the price class in the item
        price = item.find(".price", first=True).text
        # we find the title
        title = item.find(".title", first=True).text

        print(title + " costs " + price) # prints e.g. Pain Pills Raw... costs $1.00

## Downloading images

https://github.com/antiboredom/detourning-the-web-2018/blob/master/week_04/shutterstock.py

    import requests

    def download_file(url):
        local_filename = url.split('/')[-1]
        # NOTE the stream=True parameter
        r = requests.get(url, stream=True)
        with open(local_filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024):
                if chunk: # filter out keep-alive new chunks
                    f.write(chunk)
                    #f.flush() commented by recommendation from J.F.Sebastian
        return local_filename

## Some websites have barebones HTML without the content

If the HTML is barebones (like facebook), that means the content is loaded AFTER the HTML is downloaded

To help with that we can use the render() function to grab the full text of the page the way you'd see it in Chrome

    r.html.render()

# Homework

Make a big list using this technique

of whatever you want

There should be a reason to make that big list (a poetic, political, satirical, surrealist reason)

You're welcome to manipulate that list in
some way

# Next week

More advanced tools for analyzing language:
natural language processing

If you want a head start, the libraries to look at are:

- TextBlob https://textblob.readthedocs.io/en/dev/
- Spacy https://spacy.io/

# More resources

**Understanding Word Vectors by Allison Parrish**
[https://gist.github.com/aparrish/2f562e3737544cf29aaf1af30362f469](https://gist.github.com/aparrish/2f562e3737544cf29aaf1af30362f469)

--------------------------------------------------------------------------------
/class-notes/class-4.md:
--------------------------------------------------------------------------------
# 10/09 - Natural Language Processing

Raise your hand if you've had problems scraping

Scraping is more art than science

We'll see other ways of scraping you can try

We'll get into the basics of natural language processing, using TextBlob, maybe a bit of SpaCy.

Who wants to share their lists?

- Elizabeth scraped foundation colors and sorted them
- Edgardo showed their Playlist Of The Chilean Dictatorship, scraping Billboard's number one hits during the dictatorship years
- Tomoya scraped Youtube and sorted thousands of thumbnails of recommended videos
- Tim scraped pictures of Plan-B

## Problem: Scraping google images would return HTML without my content

Things to try:

**View source**

**Turn off Javascript**

- Turn off Javascript, then load a page
- for certain sites, you'll sometimes get a legacy page in simple HTML

**Use the Network tools**

- View > Developer Tools > Network
- You can see all the network requests from the browser to the server
- So you can see, e.g.
images that are being loaded

![](https://d2mxuefqeaa7sj.cloudfront.net/s_EF94E54173BB9AD30672FE5689D53FCD672AC2999755231FC9590FA02D6C9AFF_1539096267653_image.png)

- You can get links from there, and content in JSON format
- Sometimes it's easier than parsing the HTML itself

For getting results of a search:

1. Look for the request that looks like it has results, filter by XHR
2. Right click > Copy > Copy link address
    1. sometimes this will give you a link to a JSON with all the results
    - a Chrome extension for beautifying JSON: JSON Formatter
3. You can hit Next, see what the new URL is, see what changed, maybe you can see how to get different pages

**Read the JSON in python:**

    import requests

    r = requests.get("....... crazy URL you got from network")

    # convert it to a python dictionary
    data = r.json()

    # access its elements
    print(data["results"]["total_num_results"])

    # it gets bonkers
    print(data["results"]["cluster"][0]["patent"]["result"][0]["title"])

Network doesn't work in cases like Twitter, where there's no Next button, you just scroll down and it loads automatically

**When the JSON link gives you an error, use cURL**

1. Right click > Copy > Copy as cURL
2. Paste on a Terminal
    1. Now you'll get the actual result of a query

You can paste a curl command into this online tool, and it returns a python requests command:

https://curl.trillworks.com/

# Natural Language Processing

What is it?

Getting computers to understand language, to extract meaning from text.
Converting characters into some kind of data the computer understands and we can do something with.
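To make "converting characters into data" a bit more concrete, here is a rough sketch using only the standard library. The regular expressions are crude assumptions for illustration (they are not how TextBlob or SpaCy work internally), and the sample text is made up.

```python
import re
from collections import Counter

text = "A spectre is haunting Europe. The spectre of communism!"

# Naive sentence splitting: break after ., ! or ? followed by whitespace
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

# Naive word extraction: lowercase everything, keep runs of letters and apostrophes
words = re.findall(r"[a-z']+", text.lower())

# Count how often each word appears
counts = Counter(words)

print(sentences)
print(words)
print(counts.most_common(3))
```

Rules like these break quickly on real text (abbreviations, quotes, other languages), which is part of why we use a library.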

We can get computers to:

- extract sentences
- get words
- for each word figure out what kind of word it is (part of speech)
- understand the sentiment of a text
- classify text
    - here's a bunch of negative sounding sentences, positive sentences, here's a new sentence, is it positive or negative?

This is based on rules, either learned through machine learning or written as if/else statements.
There are a lot of biases

- what the computer determines has some form of ideology, coming from the creator's intention, consciously or not

## TextBlob

We'll use a library called TextBlob
https://textblob.readthedocs.io

- lets us do basic NLP tasks
- easy to use
- the tradeoff is that it's less accurate than other libraries
- another library, better but more annoying to use: SpaCy
    - https://spacy.io/

**Installing**
On the terminal:

    $ pip3 install textblob

Install the data set

    $ python3 -m textblob.download_corpora

**Basic usage**

Breaking into sentences:

    from textblob import TextBlob

    blob = TextBlob("A specter is haunting this classroom. The specter of sleepiness.")

    print(blob.sentences)
    # Outputs a list of Sentence objects

    # Iterate through all the sentences
    for sentence in blob.sentences:
        print(sentence)

    # Get all the words, removing the punctuation
    for word in blob.words:
        print(word)

    # Get the POS / part of speech (nouns, adjectives, etc.)
    for tag in blob.tags:
        print(tag) # Outputs a tuple (a list you can't change) like ('specter', 'NN'): the word, the part of speech
        print(tag[0]) # Word
        print(tag[1]) # POS

    # Get all the nouns
    nouns = []
    for tag in blob.tags:
        if tag[1] == "NN":
            nouns.append(tag[0])
    print(nouns)

Tags for Part of Speech

- Penn Treebank part of speech tagging system
  https://www.clips.uantwerpen.be/pages/mbsp-tags

Pluralize words

    for word in blob.words:
        print(word.pluralize())

Classify sentences as positive or negative, using a simple training set

    from textblob import TextBlob
    from textblob.classifiers import NaiveBayesClassifier
    train = [
        ('i am happy today', 'pos'),
        ('this is a good burger', 'pos'),
        ('you\'re a good boy', 'pos'),
        ('you are doing well', 'pos'),

        ('i do not like you', 'neg'),
        ("don't go there", 'neg'),
        ('this is so frustrating', 'neg'),
        ('things are bad', 'neg')
    ]
    cl = NaiveBayesClassifier(train)
    sentence = "I feel really bad"
    # Classify a sentence
    print(sentence, "is", cl.classify(sentence))
    # Get the probability
    prob = cl.prob_classify("I don't like things")
    print("The probability that this sentence is negative is", prob.prob("neg"))
    print("The probability that this sentence is positive is", prob.prob("pos"))

    # Get all sentences of a certain category
    for sentence in blob.sentences:
        if cl.classify(sentence) == "pos":
            print(sentence)

        # only if they have fewer than three words
        if len(sentence.words) < 3:
            print(sentence)

Get the sentiment (biased, unreliable)

    for sentence in blob.sentences:
        print(sentence.sentiment) # Returns a Sentiment object with a polarity
        # value and a subjectivity value

        # Print only the very positive sentences
        if sentence.sentiment.polarity > 0.8:
            print(sentence)

# Natural Language Processing Examples

https://github.com/antiboredom/sfpc-scrapism/tree/master/class-notes/examples

# More Tools for NLP
## SpaCy
- https://spacy.io/usage/

## ConceptNet
- http://conceptnet.io/
- synonyms, related terms, used for…, types
- Use the same URL but add api. at the front, to see what you'd get with the API
    - http://conceptnet.io/c/en/glass
    - http://api.conceptnet.io/c/en/glass

Using the ConceptNet API

    import requests

    word = "glass"
    r = requests.get("http://api.conceptnet.io/c/en/" + word)
    data = r.json()

Click a header, e.g. used for

http://conceptnet.io/c/en/glass?rel=/r/UsedFor&limit=1000

Get the API URL for all things glass is used for

http://api.conceptnet.io/c/en/glass?rel=/r/UsedFor&limit=1000

*edges* = the ways in which this word relates to other words

- each *edge* has a *start* and an *end* ([start] is used for [end]), we want the *end*

    for edge in data["edges"]:
        if edge["rel"]["label"] == "UsedFor":
            print(edge["end"]["label"])

A project made with this:

- Darius Kazemi's expanding mind bot
- https://twitter.com/expandingbot

# Tips
## how to publish code to github without personal information
- Make a secrets.py that has the logins, passwords or API keys that you're using in your code
- add it to .gitignore so it doesn't get pushed to the server

## a prettier print() statement

    from pprint import pprint

    pprint(data) # data is any dictionary or list, e.g. the result of r.json()

# Homework TBD

--------------------------------------------------------------------------------
/class-notes/class-5.md:
--------------------------------------------------------------------------------
# 10/16 - Photos 📸

# 📒 Agenda
- image manipulation
- getting images
- photo manipulation
- Image Magick
- TLDR
- Subprocess
- [Pillow](https://pillow.readthedocs.io/en/5.3.x/) 🛌 😴
- video manipulation (if we have time!)

# 🤔 Who did homework?!?!
- Ilona has a WIP: who is coming out day for?
- [AI weirdness](http://aiweirdness.com/), training neural networks with a sense of humor
- Tim made a lyric 🎶 generator using TextBlob
- Kanye & Ayn Rand 🙃

# 🖼️ How do we get images off a website?
- [Shutterstock](https://www.shutterstock.com/), [Pexels](https://www.pexels.com/), good image resources
- Check whether the website still loads by toggling javascript on/off
    - If it does, it should be easy to scrape
- When searching, make sure you select only Photos

![](https://d2mxuefqeaa7sj.cloudfront.net/s_760AA603A36DCE03DC9C80E71E2F81C1E2EDCAAED6138F202B76C43EC5789514_1539700121631_image.png)

**Use the requests_html library to help scrape**
Gets all images. Generalizable to any website, but will pull all images.

    from requests_html import HTMLSession
    # requests is the library requests_html is based on, we use it to download images
    import requests
    # subprocess is how python can call other command line tools
    import subprocess
    url = 'https://www.shutterstock.com/search?searchterm=existential+despair&search_source=base_search_form&language=en&page=1&sort=popular&image_type=all&measurement=px&safe=true'

    # searches through all the html tags and grabs specified tags
    session = HTMLSession()
    r = session.get(url)

    # Gets all images on the page
    images = r.html.find('img')
    for img in images:
        print(img)

To get the image source

    for img in images:
        src = img.attrs.get('src') # Gets image source
        title = img.attrs.get('alt') # Gets image title

Get a file name for each image using split, then download it (still inside the loop)

        # Gets end of url for img name
        imgname = src.split('/')[-1]
        imgdata = requests.get(src).content

        # 'wb' is the mode for 'write binary'
        open(imgname, 'wb').write(imgdata)

# Basic photo manipulation in command line
## [IMAGE MAGICK](https://www.imagemagick.org/script/index.php) 🧙 🧙‍♂️ ✨

*For command line photo manipulation!*

Run this command in terminal

    brew install imagemagick

Examples of Image Magick functionality: creating GIFs, converting file types, resizing, etc.
(`<input>` and `<output>` stand for your own file names)

    #Rename and convert file type
    convert <input> <output>

    #Whoa! Maintains the aspect ratio 😱
    convert <input> -resize 1000x1000 <output>

    #Rotate!
78 | convert -rotate 90 79 | 80 | #Invert photo 81 | convert -negate 82 | 83 | #You can combine them 🤝 84 | convert -negate -rotate 90 85 | 86 | #Creating GIF from a folder of images 87 | convert images/*.jpg -delay 0 animation.gif 88 | 89 | #Montage tiles images into a grid 90 | montage images/*.jpg montage.jpg 91 | 92 | 👉 All the Image Magick command line options [**here**](https://imagemagick.org/script/command-line-options.php)❗ 93 | 94 | 95 | ## [TLDR](https://tldr.sh/) 96 | 97 | Shows short, example-based help for common terminal commands 98 | 99 | brew install tldr 100 | tldr 101 | 102 | 103 | ## [Subprocess](https://docs.python.org/2/library/subprocess.html) 104 | 105 | Allows you to run command line tools from Python 106 | 107 | import subprocess 108 | 109 | # Same as: say hi 110 | subprocess.call(["say", "hi"]) 111 | 112 | # Same as: say -r 300 "a specter is haunting this python script" 113 | # No need to double quote things! 114 | subprocess.call(["say", "-r", "300", "a specter is haunting this python script"]) 115 | 116 | **Subprocess example #1:** Says title of each image 117 | 118 | for img in images: 119 | src = img.attrs.get('src') # Gets image source 120 | title = img.attrs.get('alt') # Gets image title 121 | subprocess.call(["say", title]) 122 | 123 | **Subprocess example #2:** Converts each downloaded file to a negative 124 | 125 | subprocess.call(["convert", imgname, "-negate", imgname + ".neg.jpg"]) 126 | 127 | **Subprocess example #3:** Takes all images in folder and makes an animated gif 128 | 129 | subprocess.call(["convert", "*.jpg", "-delay", "0", "animation.gif"]) 130 | 131 | 132 | ## [Pillow](https://pillow.readthedocs.io/en/5.3.x/) 133 | 134 | **Install using pip** 135 | 136 | pip3 install pillow 137 | 138 | **Basic example** 139 | 140 | from PIL import Image, ImageFilter 141 | #If the image and the script are in separate folders, provide the file path 142 | img = Image.open("") 143 | 144 | #Resizing images 145 | #Thumbnail respects original aspect ratio, takes a new size as a 
tuple (width, height). It also doesn't make things bigger than they already are. Changes original image. 146 | img.thumbnail((100, 100)) 147 | img.save("") 148 | 149 | #Resize doesn't respect original aspect ratio. Does not change original image. 150 | img = img.resize((1000, 1000)) 151 | 152 | #Rotates image 153 | img = img.rotate(45) 154 | 155 | #Apply a filter 156 | img = img.filter(ImageFilter.BLUR) 157 | 158 | **Image draw** 159 | 160 | from PIL import Image, ImageFilter, ImageDraw 161 | img = Image.open("") 162 | 163 | #Draws on image! 164 | draw = ImageDraw.Draw(img) 165 | draw.text((10, 10), "HELLO!") 166 | draw.ellipse((0, 0, 500, 500), fill=(255, 255, 255)) 167 | 168 | **Collaging images together** 169 | 170 | from PIL import Image, ImageFilter, ImageDraw 171 | #glob syntax allows you to use /* to easily refer to all items at a path 172 | from glob import glob 173 | import random 174 | 175 | #a list of all the file names 176 | files = glob("images/*.jpg") 177 | 178 | #Takes two parameters: kind of image, and a (width, height) tuple 179 | canvas = Image.new("RGB", (1000, 1000)) 180 | 181 | #Loop through all files and stick them on image 182 | for filename in files: 183 | img = Image.open(filename) 184 | 185 | #generates random location 186 | x = random.randint(-100, 1000) 187 | y = random.randint(-100, 1000) 188 | 189 | #takes an image and pastes it on something else 190 | canvas.paste(img, (x, y)) 191 | 192 | canvas.save("collage.jpg") 193 | 194 | 195 | **Get rid of labels** 196 | *Crops image (crop takes one tuple: left, top, right, bottom)* 197 | 198 | img = img.crop((0, 0, img.size[0], img.size[1]-20)) 199 | 200 | To use transparency, use the RGBA colorspace. 
Can’t combine images in Pillow that don’t have the same colorspace 201 | 202 | canvas = Image.new("RGBA", (1000, 1000)) 203 | img = img.convert("RGBA") 204 | canvas.save("collage.png") 205 | 206 | 207 | ## [Open CV and Python](https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html) 208 | 209 | Installation 210 | 211 | pip3 install opencv-python 212 | 213 | **What is a Haar Cascade?** 214 | The computer is looking for patterns. We can tell Open CV to grab anything: a face, an eyeball, a smile, etc. [**Download XML file**](https://github.com/opencv/opencv/tree/master/data/haarcascades) depending on what you want to detect. 215 | 216 | 1 - Eyes 217 | 218 | 2 - Eyeglasses 219 | 220 | 3 - Front of face 221 | 222 | 4 - Profile 223 | 224 | 5 - Full body 225 | 226 | 6 - Left eye 227 | 228 | 7 - Right eye 229 | 230 | 8 - Lower body 231 | 232 | 233 | import cv2 234 | import numpy as np 235 | 236 | cascade = cv2.CascadeClassifier("eye.xml") 237 | 238 | #makes video object, looks for video camera on 💻 239 | video_capture = cv2.VideoCapture(0) 240 | video_capture.set(3, 1280) 241 | video_capture.set(4, 720) 242 | index = 0 #counts the saved eye images 243 | #creates an infinite loop 244 | while True: 245 | ret, frame = video_capture.read() 246 | gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) 247 | #becomes a list of coordinates with eyeballs 248 | eyeballs = cascade.detectMultiScale(gray) 249 | 250 | for (x,y,w,h) in eyeballs: 251 | #to grab part of the frame (rows are y, columns are x) 252 | eye_img = frame[y:y+h, x:x+w] 253 | #to keep each frame an increasing number 254 | outname = "eye_" + str(index) + ".jpg" 255 | cv2.imwrite(outname, eye_img) 256 | index += 1 257 | #draws rectangle onto image: (image, top-left, bottom-right, color, stroke width) 258 | cv2.rectangle(frame, (x, y), (x+w, y+h),(0,255,0),2) 259 | 260 | #makes window on computer 261 | cv2.imshow("Video", frame) 262 | 263 | #looking for an exit key 264 | if cv2.waitKey(1) & 0xFF == ord('q'): 265 | break 266 | 267 | #stops the camera and closes the window 268 | 
video_capture.release() 269 | cv2.destroyAllWindows() 270 | 271 | Instead of using a live video, you can read an image in. 272 | 273 | from glob import glob 274 | 275 | files = glob('images/*.jpg') 276 | for filename in files: 277 | frame = cv2.imread(filename) 278 | 279 | #everything else same as above! 280 | 281 | detectMultiScale has a few options ([list of params](https://docs.opencv.org/2.4/modules/objdetect/doc/cascade_classification.html)) 282 | 283 | #eyeballs have to be at least 100 x 100 pixels 284 | cascade.detectMultiScale(gray, minSize=(100,100)) 285 | # Libraries & Add-Ons 286 | 287 | [Open CV and Python](https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html) 288 | 289 | - Open CV is easiest to implement 290 | 291 | [Dark Flow Library](https://github.com/thtrieu/darkflow) can find people and other objects (like apples, oranges, etc) 292 | [Image2Text](https://github.com/tensorflow/models/tree/master/research/im2txt) 293 | 294 | -------------------------------------------------------------------------------- /class-notes/class-6.md: -------------------------------------------------------------------------------- 1 | # 10/23 Video 2 | 3 | 4 | https://www.youtube.com/watch?v=KgbSjRMqyjc& 5 | 6 | 7 | Today 8 | 9 | - download videos 10 | - manipulate them 11 | - python library called MoviePy, lets you edit video in python 12 | 13 | 14 | # Example scripts from this class 15 | 16 | https://github.com/antiboredom/sfpc-scrapism/tree/master/class-notes/examples/video 17 | 18 | 19 | 20 | 21 | # Getting material with Youtube-dl 22 | 23 | Scraping the web for video is difficult, so we use **youtube-dl** to download videos from the terminal 24 | https://youtube-dl.org/ 25 | Documentation: 26 | https://github.com/rg3/youtube-dl/blob/master/README.md#output-template-examples 27 | 28 | To install, in terminal: 29 | 30 | $ brew install youtube-dl 31 | 32 | It’s a web scraper just for video. 
For every website they reverse engineered how to get a video file 33 | 34 | 35 | ## To use it 36 | $ youtube-dl [URL] 37 | 38 | 39 | 40 | ## See the possible formats with -F 41 | $ youtube-dl [URL] -F 42 | # outputs the formats and resolutions available for that video 43 | 44 | # you can choose a specific format to download, with -f and the code for that format 45 | $ youtube-dl [URL] -f 22 46 | 47 | 48 | - Some formats let you download only the video, or only the audio 49 | 50 | 51 | 52 | ## Download every single video from a youtube user 53 | 54 | Paste the URL from the **youtube channel** 55 | Get every TED talk: 56 | 57 | 58 | $ youtube-dl https://www.youtube.com/user/TEDtalksDirector 59 | 60 | 61 | Also works for, e.g. playlists 62 | https://www.youtube.com/user/TEDtalksDirector/playlists 63 | 64 | 65 | 66 | ## Decide what the output name will be with -o 67 | 68 | $ youtube-dl [URL] -o [FILENAME] 69 | 70 | 71 | 72 | 73 | ## Try and always save with a specific format (like mp4) 74 | 75 | $ youtube-dl [URL] --merge-output-format mp4 76 | 77 | 78 | 79 | 80 | ## Download subtitles 81 | - add “,cc” to your search, to only get videos with closed captions 82 | 83 | --write-auto-sub 84 | --skip-download Download only subtitles and not the video 85 | 86 | $ youtube-dl [URL] --write-auto-sub --skip-download 87 | 88 | 89 | 90 | - check sam’s tool for parsing subtitle files 91 | ## Get the URLs of the video and audio 92 | 93 | # --get-url 94 | $ youtube-dl [URL] --get-url 95 | 96 | 97 | 98 | 99 | ## Tips 100 | - Avoid URLs with “&” 101 | - Make sure the URL for youtube is just https://www.youtube.com/watch?v=[ID] 102 | - limit max downloads with --max-downloads to save space 103 | - call the program from python with subprocess.call(…) 104 | - Use it for more sites: 105 | - full list: https://rg3.github.io/youtube-dl/supportedsites.html 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | # VLC video player 117 | 118 | Use VLC to play every video format 
imaginable 119 | https://www.videolan.org/vlc/ 120 | 121 | 122 | 123 | 124 | 125 | # Use ffmpeg to convert formats / edit video 126 | 127 | https://ffmpeg.org/ 128 | Documentation: https://ffmpeg.org/ffmpeg.html (insane and confusing, just google stuff) 129 | 130 | Usage: 131 | 132 | 133 | # ffmpeg -i [Some kind of input] [parameters (optional)] [Some kind of output] 134 | 135 | # convert my mp4 video to mov format 136 | $ ffmpeg -i mycatvideo.mp4 mycatvideo.mov 137 | 138 | 139 | 140 | ## Turn things into animated GIFs 141 | 142 | $ ffmpeg -i kitten.mp4 kitten.gif 143 | 144 | # use -r 3 to set it to 3 frames a second 145 | $ ffmpeg -i kitten.mp4 -r 3 kitten.gif 146 | 147 | 148 | 149 | 150 | ## Output frames to images 151 | 152 | 153 | 154 | # 1. set the output format to an image format (eg jpg) 155 | # 2. put %d in the output name, it'll be replaced by the frame number 156 | 157 | $ ffmpeg -i kitten.mp4 kitten_frame_%d.jpg 158 | 159 | 160 | 161 | ## Turn a bunch of images into a video 162 | # 1. set the INPUT format to the image format (eg jpg) 163 | # 2. put %d in the INPUT name, it'll be replaced by the frame number, make sure those files exist 164 | 165 | $ ffmpeg -i kitten_frame_%d.jpg kitten.mp4 166 | 167 | 168 | 169 | ## Cut/trim video (e.g. get rid of the intro) 170 | 171 | # BEFORE the input, use -ss [timestamp] 172 | # AFTER the input, use -t [length in seconds] 173 | 174 | # start at 10 seconds, and go for 10 seconds 175 | $ ffmpeg -ss 00:00:10 -i kitten.mp4 -t 10 output.mp4 176 | 177 | 178 | 179 | ## Get info about a video with ffprobe 180 | 181 | $ ffprobe kitten.mp4 182 | # outputs a bunch of info, like duration, bitrate, etc. 183 | 184 | 185 | 186 | ## Combine all this 187 | 188 | # Start at 10 seconds, get a 10 second video, turn into a GIF 189 | $ ffmpeg -ss 00:00:10 -i kitten.mp4 -t 10 output.gif 190 | 191 | 192 | 193 | ## Use ffmpeg to download a video URL you got from youtube-dl (!!!) 
194 | 195 | # Get only the URL 196 | $ youtube-dl --get-url [YOUTUBE LINK] 197 | 198 | # copy that URL to ffmpeg, get only 5 seconds of it 199 | $ ffmpeg -i "[THE URL WE GOT]" -t 5 short_fan.mp4 200 | 201 | 202 | 203 | ## Stupid filters 204 | 205 | https://ffmpeg.org/ffmpeg-filters.html 206 | 207 | **vflip - flip the video vertically** 208 | 209 | 210 | $ ffmpeg -i in.mp4 -vf "vflip" out.mp4 211 | 212 | 213 | **edgedetect** 214 | 215 | - Add parameters to the filter after the -vf, in quotes 216 | - [option]=[value], separated with **:** 217 | 218 | $ ffmpeg -i in.mp4 -vf "edgedetect=low=0.1:high=0.4" out.mp4 219 | 220 | 221 | 222 | 223 | ## Change the speed of the video 224 | 225 | The command is stupid 226 | 227 | - PTS = Presentation Time Stamp, the time stamp eg. 00:10:00 228 | - you tell it to make the PTS, e.g. twice what it is, therefore it slows down 229 | - to make it faster you multiply the PTS by a decimal number, e.g. twice as fast = 0.5 230 | - ugh just google it 231 | 232 | # slow it down, twice the length 233 | $ ffmpeg -i short_fan.mp4 -vf "setpts=2*PTS" slow_short_fan.mp4 234 | 235 | # speed it up, twice as fast 236 | $ ffmpeg -i short_fan.mp4 -vf "setpts=0.5*PTS" fast_short_fan.mp4 237 | 238 | 239 | **Interpolation** 240 | 241 | - When slowing down you can use motion interpolation (the minterpolate filter) to create the frames in between, to make smoother slow-mo videos 242 | - https://cloudacm.com/?p=3055 243 | 244 | 245 | 246 | ## Use the camera 247 | 248 | $ ffmpeg -f avfoundation -pixel_format yuyv422 -framerate 30 -video_size 1280x720 -i 0:0 recording.mp4 249 | 250 | 251 | Keeps recording until we hit Ctrl+C 252 | 253 | **use -t to only record 5 seconds** 254 | 255 | 256 | $ ffmpeg -t 5 -f avfoundation -pixel_format yuyv422 -framerate 30 -video_size 1280x720 -i 0:0 recording.mp4 257 | 258 | 259 | 260 | 261 | ## Tips 262 | - abort a long operation with Ctrl+C 263 | - when stuck, search for specific uses, eg. 
“ffmpeg make optimized gif” 264 | - delete all jpg files in a folder with the terminal (if we downloaded too many files) 265 | - rm *.jpg 266 | - use youtube-dl on a livestream to capture it 267 | 268 | 269 | 270 | 271 | # Using MoviePy 272 | 273 | We’ll use MoviePy for video editing 274 | https://zulko.github.io/moviepy/ 275 | Documentation: https://zulko.github.io/moviepy/ref/ref.html 276 | 277 | 278 | WIP library from Sam 279 | https://antiboredom.github.io/vidpy/ 280 | 281 | Install 282 | 283 | 284 | $ pip3 install moviepy 285 | 286 | 287 | 288 | 289 | ## Put two videos together 290 | 291 | **By concatenating** 292 | 293 | 294 | # import the editor library but call it mp to make it shorter 295 | import moviepy.editor as mp 296 | # another way is to only import the functions we need, a bit faster 297 | # from moviepy.editor import VideoFileClip, concatenate_videoclips 298 | 299 | # Join videos together 300 | clip1 = mp.VideoFileClip("fan_upside_down.mp4") 301 | clip2 = mp.VideoFileClip("prancercise.mp4") 302 | 303 | # the function takes a list of video objects 304 | final_clip = mp.concatenate_videoclips([clip1, clip2]) 305 | # output the video to a file 306 | final_clip.write_videofile("output.mp4") 307 | 308 | 309 | **By compositing videos together (like layers in photoshop/premiere)** 310 | 311 | 312 | ## Get only a segment of a video clip, with subclip() 313 | # from 10 seconds to 13.5 seconds 314 | clip1 = mp.VideoFileClip("prancercise.mp4").subclip(10, 13.5) 315 | 316 | # also works like this 317 | clip1 = mp.VideoFileClip("prancercise.mp4") 318 | tiny_clip = clip1.subclip(10, 13.5) 319 | 320 | 321 | 322 | ## Resize a video file 323 | - if the videos are of different sizes, the output might be broken, so we resize them 324 | 325 | clip1 = clip1.resize((1280, 720)) 326 | 327 | 328 | 329 | 330 | 331 | ## Get random subclips from a video and combine them together 332 | 333 | 334 | import random 335 | import moviepy.editor as mp 336 | 337 | video = 
mp.VideoFileClip("dance.mp4") 338 | 339 | # get the duration of the video, in seconds 340 | video_duration = video.duration 341 | # define the duration of the subclips we're gonna take 342 | clip_duration = 0.5 343 | # make a list we'll populate with subclips 344 | clips = [] 345 | 346 | for i in range(0,10): # do this 10 times 347 | # get a random start point 348 | # make it so the end can never be past the full video duration 349 | start = random.uniform(0, video_duration - clip_duration) 350 | # the end point is whatever start time plus the subclip duration 351 | end = start + clip_duration 352 | # add a subclip to the list, between our start and end points 353 | clips.append(video.subclip(start,end)) 354 | 355 | # create the final video out of all the clips 356 | final_clip = mp.concatenate_videoclips(clips) 357 | # write it to a file 358 | final_clip.write_videofile("random_dance.mp4") 359 | 360 | 361 | 362 | ## Play a bunch of clips overlaid on top of each other, for reasons 363 | 364 | It’s like what we just did, but with CompositeVideoClip instead of concatenate_videoclips: 365 | 366 | 367 | # use CompositeVideoClip to make a video that's all the clips layered together 368 | final_clip = mp.CompositeVideoClip(clips) 369 | 370 | 371 | 372 | 373 | 374 | import random 375 | import moviepy.editor as mp 376 | 377 | video = mp.VideoFileClip("dance.mp4") 378 | 379 | # get the duration of the video, in seconds 380 | video_duration = video.duration 381 | # define the duration of the subclips we're gonna take 382 | clip_duration = 1.5 383 | # make a list we'll populate with subclips 384 | clips = [] 385 | 386 | for i in range(0,10): # do this 10 times 387 | # get a random start point 388 | # make it so the end can never be past the full video duration 389 | start = random.uniform(0, video_duration - clip_duration) 390 | # the end point is whatever start time plus the subclip duration 391 | end = start + clip_duration 392 | 393 | # store the subclip so we can do things to it 394 | clip = 
video.subclip(start,end) 395 | 396 | # set the position of the clip in our canvas 397 | clip = clip.set_position((random.randint(-100, 800), random.randint(-100, 400))) 398 | # set the start time to half of the loop index (i), just for fun 399 | clip = clip.set_start( i / 2.0 ) 400 | # add to the list 401 | clips.append(clip) 402 | 403 | # make a video that is a composite of all the subclips 404 | final_clip = mp.CompositeVideoClip(clips) 405 | 406 | # write it to a file 407 | final_clip.write_videofile("random_dance.mp4", codec="libx264", temp_audiofile="something.m4a", remove_temp=True, audio_codec="aac") 408 | ## Tips 409 | - Functions on video clips don’t change the original clip, they return a new clip 410 | - so always do new_clip = clip1.resize((1280,720)) 411 | - To export with an audio codec that mac understands: 412 | 413 | write_videofile("random_dance.mp4", codec="libx264", temp_audiofile="something.m4a", remove_temp=True, audio_codec="aac") 414 | 415 | 416 | 417 | 418 | 419 | # Videogrep 420 | 421 | Videogrep is a command line tool that searches through dialog in video files and makes supercuts based on what it finds. 
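The core idea can be sketched in plain Python. This is only an illustration and not videogrep's actual code: the `cues` list and the `find_segments` helper are made-up names, and the cue texts and timings are invented sample data standing in for a parsed subtitle file.

```python
import re

# Hypothetical sample data: (start_seconds, end_seconds, text).
# A real .vtt or .srt file would have to be parsed into this shape first.
cues = [
    (0.0, 2.5, "A specter is haunting Europe"),
    (2.5, 5.0, "the specter of communism"),
    (5.0, 7.5, "All the powers of old Europe"),
]

def find_segments(cues, pattern):
    """Return (start, end) for every cue whose text matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [(start, end) for start, end, text in cues if regex.search(text)]

# every stretch of video where someone says "specter"
print(find_segments(cues, "specter"))
```

Each (start, end) pair could then be handed to MoviePy's subclip() and the pieces joined with concatenate_videoclips, like the random subclips example earlier in these notes.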
422 | 423 | https://antiboredom.github.io/videogrep/ 424 | 425 | 426 | - Needs a video file with a subtitle or transcription file associated with it, with the same name 427 | - most youtube videos have them 428 | - we can use youtube-dl to get them 429 | 430 | 431 | 432 | ## Download a video, then use videogrep to get all the instances of a word and make a supercut 433 | 434 | # get the version 18 that's a smaller video, and download subtitles 435 | $ youtube-dl "[URL]" -f 18 --write-auto-sub 436 | 437 | # if our subtitles are in .vtt format, we add --use-vtt 438 | $ videogrep -i [name_of_the_video_file] --use-vtt --search "Korea" 439 | 440 | 441 | 442 | ## Get only individual words 443 | 444 | # use --search-type word 445 | $ videogrep -i [name_of_the_video_file] --use-vtt --search "Korea" --search-type word 446 | 447 | 448 | 449 | ## Set the output file 450 | 451 | # use -o output_name.mp4 452 | 453 | 454 | 455 | ## Add padding 456 | 457 | # add 300 ms of space between words 458 | # --padding 300 459 | 460 | 461 | 462 | ## Use regular expressions to search for multiple things 463 | 464 | # use the pipe character | to search for either text 465 | # search for "Korea" or any word with "nucl" in it 466 | # "Korea|nucl" 467 | $ videogrep -i [name_of_the_video_file] --use-vtt --search "Korea|nucl" --search-type word 468 | 469 | # the ^ character means the start of the word 470 | # ^a = all the words that begin with the letter a 471 | 472 | # the $ means the end of the word 473 | # ing$ = all the words that end with ing 474 | 475 | # 476 | 477 | 478 | 479 | ## Export n-grams 480 | 481 | # -n 1 482 | # outputs the most used words 483 | 484 | # -n 2 485 | # outputs the most used couplings of words 486 | 487 | 488 | 489 | ## Use sphinx to transcribe videos 490 | 491 | # install sphinx 492 | brew tap watsonbox/cmu-sphinx 493 | brew tap watsonbox/cmu-sphinx 494 | brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase 495 | brew install --HEAD 
watsonbox/cmu-sphinx/cmu-sphinxtrain # optional 496 | brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx 497 | 498 | # transcribe the video 499 | videogrep -i pompeo.mp4 --transcribe 500 | 501 | 502 | 503 | 504 | # Next 505 | 506 | 507 | - Upcoming workshop from hardware class TA 508 | - A servo motor + a camera + control it with python 509 | 510 | 511 | ## Homework 512 | - explore these tools and make a python script that makes a new video every time you run it 513 | 514 | 515 | 516 | -------------------------------------------------------------------------------- /class-notes/class-7.md: -------------------------------------------------------------------------------- 1 | # 10/30 - Bots 2 | 3 | 4 | 5 | # Troubleshooting videogrep 6 | 7 | Some students had problems with videogrep! 8 | 9 | 10 | “The right attitude is… it’s amazing it works at all!” - Sam 11 | 12 | 13 | - **bool is not iterable** might mean it didn’t find the subtitle or video file 14 | 15 | Make sure 16 | 17 | - Subtitle file name is the same as video file name, and it can end in .vtt OR .en.vtt 18 | - Videos might need to be the same size 19 | - Keep all videos the same format and size 20 | - Make sure youtube-dl gets the same format and size by using the -f setting (-f 22 gives you a 1280x720 video on youtube, use capital F (-F) to see what formats there are) 21 | - Convert using ffmpeg: (-vcodec copy means the video doesn’t get reencoded = faster) 22 | - ffmpeg -i myvideo.mkv -vcodec copy myvideo.mp4 23 | - If a video has youtube auto-generated subtitles, it’ll have **word** **by word** timings. If it’s subtitles uploaded by the user it might have **sentence by sentence** timings. 
24 | - look at the subtitle file, look for tags surrounding each word, and a timestamp tag next to it 25 | - you can use sphinx to transcribe the videos so they’ll all have the same format 26 | - **The word i’m searching for is in the vtt, but videogrep doesn’t find it** 27 | - check the vtt, if that word doesn’t have timing tags for only that word, and instead it’s part of a sentence, videogrep might not find it 28 | - use sphinx to transcribe the video 29 | 30 | 31 | 32 | To test a regular expression 33 | https://regex101.com/ 34 | 35 | 36 | 37 | # Bots bots beep boop 38 | 39 | Today: 40 | 41 | 1. How we can write python code we can use in multiple contexts 42 | 1. You want to write code that’s reusable / acts like a tool 43 | 2. We’ll be able to write one file that does something with any kind of input 44 | 3. Can be used from another script 45 | 4. Applicable to other programming languages 46 | 47 | We’ll make an example script, and then we’ll make it modular. 48 | 49 | - Take a video of a sunset 50 | - overlay a word 51 | - the word will be different 52 | 53 | 54 | 55 | ## The script 56 | 57 | We got a video of a sunset from youtube 58 | 59 | https://www.youtube.com/watch?v=Nl3S8VhUxfY& 60 | 61 | 62 | We’ll clip it to start at 01:48 until 01:52 63 | 64 | 65 | # Import VideoFileClip to load a video, 66 | # TextClip to overlay text, and CompositeVideoClip 67 | # to take two or more clips and overlay them as layers 68 | from moviepy.editor import VideoFileClip, TextClip, CompositeVideoClip 69 | 70 | text = "A specter is haunting this sunset" 71 | 72 | # Load the video, get a subclip that's just our small range, in seconds 73 | clip1 = VideoFileClip("sunset.mp4").subclip(108, 112) 74 | # Make a text clip, give it a duration 75 | clip2 = TextClip(text).set_duration(4) 76 | 77 | # Make a composite video, takes a list of clips 78 | composition = CompositeVideoClip( [ clip1, clip2 ] ) 79 | 80 | # Export the video 81 | composition.write_videofile("sunset_words.mp4") 82 
| 83 | 84 | Problems with this code 85 | 86 | - The text is tiny 87 | - We didn’t tell the text what size to make it 88 | - the video is a bit big, it’ll be faster if we resize it 89 | - we want the text clip to be the same size as the video 90 | 91 | Options for TextClip: https://zulko.github.io/moviepy/ref/VideoClip/VideoClip.html?highlight=textclip#textclip 92 | 93 | 94 | ## Making it modular 95 | 96 | We want to reuse this code, so we’ll turn all of this into a **function** with def, putting the code in an indented block 97 | 98 | Also, we want to use arguments from the terminal to **reuse the script with any text we want**, by using sys.argv 99 | 100 | **vidbot.py** 101 | 102 | # Import VideoFileClip to load a video, 103 | # TextClip to overlay text, and 104 | # CompositeVideoClip to take two or more clips and overlay them as layers 105 | from moviepy.editor import VideoFileClip, TextClip, CompositeVideoClip 106 | import sys 107 | def compose(text): 108 | # Load the video, get a subclip that's just our small range, in seconds 109 | # e.g. start time is one minute, 48 seconds = 60 + 48 = 108 110 | # Then resize it to make it faster, resize takes a tuple 111 | clip1 = VideoFileClip("sunset.mp4").subclip(108, 112).resize( (1920/2, 1080/2) ) 112 | # Make a text clip, give it a duration 113 | # We'll also make it the size of the video clip 114 | clip2 = TextClip(text, size=clip1.size).set_duration(4) 115 | 116 | # Make a composite video, takes a list of clips 117 | composition = CompositeVideoClip( [ clip1, clip2 ] ) 118 | 119 | # Export the video 120 | composition.write_videofile("sunset_words.mp4") 121 | 122 | # sys.argv gives us the arguments from the terminal, in a list (first item is the script name) 123 | text = sys.argv[1] 124 | # call our function using the text from the terminal 125 | compose(text) 126 | 127 | 128 | 129 | Now we have a tool to create these videos from the terminal. IT’S AMAZING. 
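One thing to watch for: as written, running vidbot.py without any text raises an IndexError, because sys.argv[1] does not exist. A minimal sketch of a guard follows. The helper name `get_text` is hypothetical and not part of the class code.

```python
import sys

def get_text(argv):
    """Return the text argument, or None if the user did not pass one."""
    # argv[0] is the script name, so the first real argument is argv[1]
    if len(argv) < 2:
        return None
    return argv[1]

text = get_text(sys.argv)
if text is None:
    # no text given: explain how to call the script instead of crashing
    print('usage: python3 vidbot.py "some text"')
else:
    print("would compose a video with:", text)
```

In vidbot.py the last branch would call compose(text) instead of printing.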
130 | 131 | 132 | ## Adding more options 133 | 134 | We want to be able to change the duration 135 | 136 | - We’ll make our function take a duration parameter, 137 | - the start and end of the clip will get calculated based on that 138 | - We’ll make the duration optional 139 | - by adding a default value on the parameter with *duration=4.0* 140 | - by checking that the sys.argv parameter is actually there, to avoid throwing an error if it’s not 141 | 142 | boop 143 | 144 | # import VideoFileClip to load a video, TextClip to overlay text, and CompositeVideoClip to take two or more clips and overlay them as layers 145 | from moviepy.editor import VideoFileClip, TextClip, CompositeVideoClip 146 | import sys 147 | import argparse 148 | 149 | # our function takes a text and duration. duration is optional and defaults to 4.0 150 | def compose(text, duration=4.0): 151 | # define when our video starts (one minute, 48 seconds = 60 + 48 = 108) 152 | start = 108 153 | # and calculate when it ends 154 | end = start + duration 155 | # Load the video, get a subclip that's just our calculated range 156 | # Then resize it to make it faster, resize takes a tuple 157 | clip1 = VideoFileClip("sunset.mp4").subclip(start, end).resize( (1920/2, 1080/2) ) 158 | # Make a text clip, give it the same duration as the video clip 159 | # We'll also make it the size of the video clip 160 | clip2 = TextClip(text, size=clip1.size).set_duration(duration) 161 | 162 | # Make a composite video, takes a list of clips 163 | composition = CompositeVideoClip( [ clip1, clip2 ] ) 164 | 165 | # Export the video 166 | composition.write_videofile("sunset_words.mp4") 167 | 168 | # sys.argv gives us the arguments from the terminal, in a list (first item is the script name) 169 | text = sys.argv[1] 170 | # get the duration from the second parameter, if it's there, otherwise use a default 171 | if len(sys.argv) > 2: 172 | # it comes as a String, we need to turn it into a number with float() 173 | duration = float(sys.argv[2]) 174 | else: 175 | 
duration = 3 176 | # call our function 177 | compose(text, duration) 178 | 179 | 180 | 181 | 182 | ## Tips 183 | - Use the argparse module to simplify parsing sys.argv 184 | - https://docs.python.org/3/library/argparse.html 185 | 186 | 187 | ## Another script that sends the video 188 | 189 | We’ll make a script that sends you the video in an email (?) 190 | 191 | in the terminal: 192 | 193 | pip3 install emails 194 | 195 | python email.py: 196 | 197 | import emails 198 | 199 | # Create our message object 200 | message = emails.html( 201 | html="Hello friend!", 202 | subject="Specter blah blah", 203 | mail_from=("Scrap Ism", "scrapism.sfpc@gmail.com") 204 | ) 205 | # Attach the file 206 | # Read the video file in binary form (using rb mode) 207 | message.attach(data=open("sunset_words.mp4", "rb"), filename="sunset_words.mp4") 208 | 209 | # Send the email 210 | message.send( 211 | to=("Sam", "splavigne@gmail.com"), 212 | # A bunch of email server stuff from google 213 | smtp={ 214 | "host": "smtp.gmail.com", 215 | "port": 465, 216 | "ssl": True, 217 | # You'd use an actual email login info (maybe not your own) 218 | "user": "scrapism.sfpc@gmail.com", 219 | "password": "scrapismscrapism" 220 | } 221 | ) 222 | 223 | SO COOL 224 | 225 | 226 | ## Making the script import the other script 227 | 228 | Using **import vidbot** we can import the functions from the other script 🤯 229 | 230 | - “vidbot” is whatever name of the other script, without .py 231 | 232 | When importing another script, everything in the lowest indentation level will be executed. To avoid this we run the code only if the script is run directly through the terminal. 
233 | we add this hacky python thing to it 234 | 235 | if __name__ == "__main__": 236 | # our code 237 | 238 | So our resulting vidbot.py: 239 | 240 | 241 | from moviepy.editor import VideoFileClip, TextClip, CompositeVideoClip 242 | import sys 243 | import argparse 244 | 245 | def compose(text, duration=4.0): 246 | start = 108 247 | end = start + duration 248 | clip1 = VideoFileClip("sunset.mp4").subclip(start, end).resize( (1920/2, 1080/2) ) 249 | clip2 = TextClip(text, size=clip1.size).set_duration(duration) 250 | composition = CompositeVideoClip( [ clip1, clip2 ] ) 251 | composition.write_videofile("sunset_words.mp4") 252 | 253 | if __name__ == "__main__": 254 | text = sys.argv[1] 255 | if len(sys.argv) > 2: 256 | duration = float(sys.argv[2]) 257 | else: 258 | duration = 3 259 | compose(text, duration) 260 | 261 | 262 | 263 | and in our **email.py** we call the vidbot compose function, by adding 264 | 265 | 266 | import vidbot 267 | 268 | # ... 269 | 270 | vidbot.compose("cool emailz", 1) 271 | 272 | 273 | 274 | We’ll also turn the video into a gif before sending it 275 | 276 | 277 | import subprocess 278 | 279 | subprocess.call(["ffmpeg", "-i", "sunset_words.mp4", "sunset_words.gif"]) 280 | 281 | 282 | 283 | And we’ll take the message from: 284 | 285 | - Corpora https://github.com/dariusk/corpora 286 | 287 | resulting email.py 288 | 289 | import emails 290 | import vidbot 291 | import subprocess 292 | 293 | isms = [ 294 | "abstract expressionism", 295 | "academic", 296 | "action painting", 297 | "aestheticism", 298 | "art deco", 299 | "art nouveau", 300 | # ... 
301 | ] 302 | 303 | # Create our message object 304 | message = emails.html( 305 | html="Hello friend!", 306 | subject="Specter blah blah", 307 | mail_from=("Scrap Ism", "scrapism.sfpc@gmail.com") 308 | ) 309 | # Turn into a gif 310 | subprocess.call(["ffmpeg", "-i", "sunset_words.mp4", "sunset_words.gif"]) 311 | # Attach the file 312 | # Read the video file in binary form (using rb mode) 313 | message.attach(data=open("sunset_words.gif", "rb"), filename="sunset_words.gif") 314 | 315 | # Send the email 316 | message.send( 317 | to=("Sam", "splavigne@gmail.com"), 318 | # A bunch of email server stuff from google 319 | smtp={ 320 | "host": "smtp.gmail.com", 321 | "port": 465, 322 | "ssl": True, 323 | # You'd use an actual email login info (maybe not your own) 324 | "user": "scrapism.sfpc@gmail.com", 325 | "password": "scrapismscrapism" 326 | } 327 | ) 328 | 329 | 330 | 331 | 332 | ## Making a twitter bot 333 | 334 | Now you have to apply to post to twitter 335 | https://developer.twitter.com/en/apply-for-access 336 | 337 | 338 | - install a python library to interface with twitter 339 | - make an application with twitter dev 340 | - get a set of keys 341 | - consumer key 342 | - consumer secret key 343 | - access token 344 | 345 | boop 346 | 347 | from twython import Twython 348 | import vidbot 349 | 350 | # parameters are APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET 351 | twitter = Twython("dslfgjlskdfgjdflsgjdslfgsdffjldisfdf","dslfgjlskdfgjdflsgjdslfgsdffjldisfdf","dslfgjlskdfgjdflsgjdslfgsdffjldisfdf","dslfgjlskdfgjdflsgjdslfgsdffjldisfdf",) 352 | 353 | vidbot.compose("A spectre blah blah", 1) 354 | video = open("sunset_words.mp4", "rb") 355 | response = twitter.upload_video(media=video, media_type="video/mp4") 356 | twitter.update_status(media_ids=[response["media_id"]]) 357 | 358 | 359 | 360 | ## See also: 361 | - instagram picture of plunger 362 | - https://www.instagram.com/samepicofplunger/ 363 | - Post to tumblr using their API 364 | - instagram api 
for python (unofficial)
  - https://github.com/LevPasha/Instagram-API-python


# Turn a python script into a website!


## Running a server

Install flask, a library for making web servers:
http://flask.pocoo.org/

    pip3 install flask

Make a file called webapp.py:

    from flask import Flask
    app = Flask(__name__)

    # In web servers you set up routes:
    # when the user goes to this URL, show them this thing.
    # Routes are declared with a 'decorator'.
    # When the user enters the base URL /, run the function on the next line
    @app.route("/")
    def home():
        return "hello"

    # Start the web server when we run the script directly
    if __name__ == "__main__":
        # run in debug mode to update the server when we change the script
        app.run(debug=True)

When you run the script, a local web server starts and you can open your browser to the address the script prints (http://127.0.0.1:5000 by default).

Normally, when you make a change to your script, you need to **stop and restart the server**.
To make it easier, you can tell flask to restart the server every time the file changes by passing `debug=True`, as in the script above:

    app.run(debug=True)


## Getting data from the URL bar

We can get information about the request the user made from the URL, for example:

- 127.0.0.1:5000/?text=lol

We read it with flask's `request` object, which has to be imported from flask:

    # ... (needs: from flask import Flask, request)
    @app.route("/")
    def home():
        # use "" when the URL has no ?text=..., so the + below doesn't fail
        text = request.args.get("text", "")
        return "hello!! " + text
    # ...
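
To check the query-string handling without opening a browser, here is a minimal sketch that uses Flask's built-in test client. The fallback value `"nobody"` and the variable names `with_text` and `without_text` are assumptions made for this illustration; they are not part of the class's webapp.py.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/")
def home():
    # request.args holds the query string; the second argument is the
    # fallback used when ?text=... is missing from the URL
    # ("nobody" is an arbitrary choice for this sketch)
    text = request.args.get("text", "nobody")
    return "hello!! " + text


# The test client sends requests to the app without starting a server
client = app.test_client()
with_text = client.get("/?text=lol").get_data(as_text=True)
without_text = client.get("/").get_data(as_text=True)

print(with_text)     # hello!! lol
print(without_text)  # hello!! nobody
```

Giving `request.args.get` a fallback matters because it returns `None` when the parameter is absent, and adding `None` to a string raises a `TypeError`.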


## Compositing the video from the website

We need to

- call the compose function of vidbot
  - but we need to make the output file name unique, since there might be more than one user at the same time
- Show the user a preview of the file
  - using flask static server
  - showing the video in a