├── README.md
├── class_notes
    ├── class_notes.docx
    ├── class_notes.md
    └── class_notes.pdf
├── code_examples
    ├── 4.1A_declaring_an_observable.py
    ├── 4.1B_subscribing_to_an_observable.py
    ├── 4.1C_subscribing_with_lambdas.py
    ├── 4.2A_some_basic_operators.py
    ├── 4.2B_range_and_just.py
    ├── 4.2C_observable_empty.py
    ├── 4.3A_creating_observable_from_scratch.py
    ├── 4.3B_interval_observable.py
    ├── 4.3C_unsubscribing.py
    ├── 4.4A_twitter_observable.py
    ├── 4.4B_cold_observable.py
    ├── 5.1A_filter.py
    ├── 5.1B_take.py
    ├── 5.1C_take_while.py
    ├── 5.2A_distinct.py
    ├── 5.2B_distinct_with_mapping.py
    ├── 5.2C_distinct_until_changed.py
    ├── 5.3A_count.py
    ├── 5.3B_reduce.py
    ├── 5.3C_scan.py
    ├── 5.4A_to_list.py
    ├── 5.4B_to_dict.py
    ├── 6.1A_merge.py
    ├── 6.1B_merge_interval.py
    ├── 6.1C_merge_all.py
    ├── 6.1D_merge_all_continued.py
    ├── 6.1E_flat_map.py
    ├── 6.2A_concat.py
    ├── 6.2B_concat_all.py
    ├── 6.2C_zip.py
    ├── 6.3D_spacing_emissions.py
    ├── 6.4A_grouping_into_lists.py
    ├── 6.4B_grouping_length_counts.py
    ├── 7.1A_reading_text_file.py
    ├── 7.1B_reading_web_url.py
    ├── 7.1C_recursive_file_iteration.py
    ├── 7.2A_reading_sql_query.py
    ├── 7.2C_merging_sql_queries.py
    ├── 7.2D_writing_sql_updates.py
    ├── 7.3A_reading_words_from_text_file.py
    ├── 7.3B_counting_word_occurrences.py
    ├── 7.3C_scheduling_reactive_word_counter.py
    ├── 8.1A_connectableobservable.py
    ├── 8.1B_sharing_observable.py
    ├── 8.1C_refcount.py
    ├── 8.2_twitter_feed_for_topics.py
    ├── 9.1A_sequential_long_running_tasks.py
    ├── 9.1B_using_subscribe_on.py
    ├── 9.2_using_observe_on.py
    ├── 9.3_processing_emissions_in_parallel.py
    ├── 9.4_switch_map.py
    └── rexon_metals.db
├── resources
    ├── bbc_news_article.txt
    ├── reactive_python_slides.pptx
    └── rexon_metals.db
├── setting_up_twitter_api.md
└── setting_up_twitter_api.pdf


/README.md:
--------------------------------------------------------------------------------
Resources for the O'Reilly Media online video as well as webcast [_Reactive Python for Data Science_](https://www.safaribooksonline.com/library/view/reactive-python-for/9781491979006/).

[![](http://akamaicovers.oreilly.com/images/0636920064237/lrg.jpg)](https://www.safaribooksonline.com/library/view/reactive-python-for/9781491979006/)


--------------------------------------------------------------------------------
/class_notes/class_notes.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/class_notes/class_notes.docx


--------------------------------------------------------------------------------
/class_notes/class_notes.md:
--------------------------------------------------------------------------------
# Reactive Python for Data
# Part IV - The Observable

## 4.1A - Creating an `Observable`

An `Observable` pushes items. It can push a finite or infinite series of items over time. To create an `Observable` that pushes 5 text strings, you can declare it like this:

```python
from rx import Observable

letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
```

We create an `Observable` using the `from_()` function, and pass it a list of five strings. It will take the list and **emit** (or push) each item from it. The `Observable.from_()` will work with any iterable.

However, running this does nothing more than save an `Observable` to a variable called `letters`. For the items to actually get pushed, we need a `Subscriber`.

## 4.1B - Subscribing to an `Observable`

To receive emissions from an `Observable`, we need to create a `Subscriber` by implementing an `Observer`. An `Observer` implements three functions: `on_next()`, which receives an emission; `on_completed()`, which is called when there are no more items; and `on_error()`, which receives an error in the event one occurs.

Then we can pass an implementation of this `Observer` to the Observable's `subscribe()` function. It will then fire the emissions to our `Subscriber`.

```python
from rx import Observable, Observer

letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])


class MySubscriber(Observer):
    def on_next(self, value):
        print("Received: {0}".format(value))

    def on_completed(self):
        print("Completed!")

    def on_error(self, error):
        print("Error occurred: {0}".format(error))


letters.subscribe(MySubscriber())
```

**OUTPUT:**

```
Received: Alpha
Received: Beta
Received: Gamma
Received: Delta
Received: Epsilon
Completed!
```

## 4.1C - Subscribing Shorthand with Lambdas

Implementing a `Subscriber` is a bit verbose, so we also have the option of passing more concise lambda arguments to the `subscribe()` function. It will then use those lambdas to create the `Subscriber` for us.

```python
from rx import Observable

letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])

letters.subscribe(on_next = lambda value: print(value),
                  on_completed = lambda: print("Completed!"),
                  on_error = lambda error: print("Error occurred: {0}".format(error)))
```

You do not even have to supply all the lambda arguments. You can leave out `on_completed` and `on_error`, but for production code you should try to have an `on_error` so errors are not quietly swallowed.


```python
letters.subscribe(on_next = lambda value: print(value))

# or

letters.subscribe(lambda value: print("Received: {0}".format(value)))
```

We will be using lambdas constantly as we do reactive programming.

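
To make the three-callback contract concrete, here is a minimal plain-Python sketch that does not use RxPy at all. The `emit_all()` helper and the `flaky_numbers()` source are hypothetical names invented for this illustration. The sketch simply re-raises when no `on_error` is supplied, which is an assumption of the sketch and may differ from what RxPy does in that case.

```python
def emit_all(items, on_next, on_completed=None, on_error=None):
    """Push each item to on_next, then signal completion or failure.

    Hypothetical helper for illustration only; it is not part of RxPy.
    """
    try:
        for item in items:
            on_next(item)
    except Exception as error:
        if on_error is None:
            raise  # no handler supplied, so the failure surfaces to the caller
        on_error(error)
        return  # a failed sequence never reports completion
    if on_completed is not None:
        on_completed()


def flaky_numbers():
    """A source that fails after two items."""
    yield 100
    yield 300
    raise ValueError("source failed")


log = []

# Only on_next and on_completed are supplied here
emit_all(["Alpha", "Beta"],
         on_next=lambda value: log.append("Received: {0}".format(value)),
         on_completed=lambda: log.append("Completed!"))

# All three callbacks are supplied, and the source fails part-way through
emit_all(flaky_numbers(),
         on_next=lambda value: log.append("Received: {0}".format(value)),
         on_completed=lambda: log.append("Completed!"),
         on_error=lambda error: log.append("Error occurred: {0}".format(error)))

for line in log:
    print(line)
```

The second call never reaches `on_completed`, which mirrors the idea that a sequence ends with either a completion or an error, not both.
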

## 4.2A - Some Basic Operators

RxPy has approximately 130 operators to powerfully express business logic, transformations, and concurrency behaviors. For now we will start with two basic ones, `map()` and `filter()`, and cover more in the next section.

For instance, we can `map()` each string to its length, and then filter only to lengths that are at least 5.

```python
from rx import Observable

letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])

mapped = letters.map(lambda s: len(s))

filtered = mapped.filter(lambda i: i >= 5)

filtered.subscribe(lambda value: print("Received: {0}".format(value)))
```

**OUTPUT:**

```
Received: 5
Received: 5
Received: 5
Received: 7
```

Each operator yields a new `Observable` emitting that transformation. We can save each one to a variable if we want and then `subscribe()` to the one we want, but oftentimes you will likely want to call them all in a single chain.

```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .map(lambda s: len(s)) \
    .filter(lambda i: i >= 5) \
    .subscribe(lambda value: print(value))
```

> If you are using an IDE like PyCharm, operators like `filter()` and `map()` will unfortunately not be available for auto-complete. The reason is RxPy will add these operators to the `Observable` at runtime. For PyCharm, you may want to disable _Unresolved References_ under _Settings -> Editor -> Inspection -> Python_ so you do not get any warnings.

## 4.2B Using Observable.range() and Observable.just()

There are other ways to create an `Observable`.
For instance, you can emit a range of numbers:

```python
from rx import Observable

source = Observable.range(1,10)

source.subscribe(lambda value: print("Received: {0}".format(value)))
```

**OUTPUT:**

```
Received: 1
Received: 2
Received: 3
Received: 4
Received: 5
Received: 6
Received: 7
Received: 8
Received: 9
Received: 10
```

You can also use `Observable.just()` to emit a single item.

```python
from rx import Observable

greeting = Observable.just("Hello World!")

greeting.subscribe(lambda value: print("Received: {0}".format(value)))
```

**OUTPUT:**

```
Received: Hello World!
```

## 4.2C - Using Observable.empty()

You can also create an `Observable` that emits nothing and calls `on_completed()` immediately via `Observable.empty()`. While this may not seem useful, an empty `Observable` is the reactive equivalent to `None`, `null`, or an empty collection.

```python
from rx import Observable

Observable.empty() \
    .subscribe(on_next= lambda s: print(s),
               on_completed= lambda: print("Done!")
    )
```

**OUTPUT:**

```
Done!
```

## 4.3A - Creating an Observable from Scratch

You can also create an `Observable` source from scratch. Using `Observable.create()`, you can pass a function with an `observer` argument, and call its `on_next()`, `on_completed()`, and `on_error()` to pass items or events to the `Observer` or the next operator in the chain.

```python
from rx import Observable

def push_numbers(observer):
    observer.on_next(100)
    observer.on_next(300)
    observer.on_next(500)
    observer.on_completed()

Observable.create(push_numbers).subscribe(on_next = lambda i: print(i))
```

**OUTPUT:**

```
100
300
500
```


## 4.3B - An Interval Observable

Observables do not have to strictly emit data. They can also emit events. Remember our definition that states _events are data, and data are events_? Events and data are treated the same way in ReactiveX. They both can be pushed through an `Observable`.

For instance, we can use `Observable.interval()` to emit a consecutive integer every 1 second.

```python
from rx import Observable

Observable.interval(1000) \
    .map(lambda i: "{0} Mississippi".format(i)) \
    .subscribe(lambda s: print(s))

# Keep application alive until user presses a key
input("Press any key to quit\r\n")
```


**OUTPUT:**

```
0 Mississippi
1 Mississippi
2 Mississippi
3 Mississippi
4 Mississippi
5 Mississippi
6 Mississippi
7 Mississippi
8 Mississippi
```

Notice how the `Observable` in fact has a notion of time? It is emitting an integer every second, and each emission is both data and an event. Observables can be created to emit button clicks for a UI, server requests, new Tweets, and any other event while representing that event as data.

Note also we had to use `input()` to make the main thread pause until the user presses a key. If we did not do this, the `Observable.interval()` would not have a chance to fire because the application will exit.
The reason for this is the `Observable.interval()` has to operate on a separate thread and create a separate workstream driven by a timer. The Python code will finish and terminate before it has a chance to fire.


## 4.3C - Using Observable.defer() (EXTRA)

A behavior to be aware of with `Observable.from_()` and other functions that create Observables is they may not reflect changes that happen to their sources.

For instance, if you have an `Observable.range()` built off two variables `x` and `y`, and one of the variables changes later, this change will not be captured by the source.

```python
from rx import Observable

x = 1
y = 5

integers = Observable.range(x, y)
integers.subscribe(lambda i: print(i))

print("\nSetting y = 10\n")
y = 10

integers.subscribe(lambda i: print(i))
```

**OUTPUT:**

```
1
2
3
4
5

Setting y = 10

1
2
3
4
5
```

Using `Observable.defer()` allows you to create a new `Observable` from scratch each time it is subscribed to, thereby capturing anything that might have changed about its source. Just supply how to create the `Observable` through a lambda.

```python
from rx import Observable

x = 1
y = 5

integers = Observable.defer(lambda: Observable.range(x, y))
integers.subscribe(lambda i: print(i))

print("\nSetting y = 10\n")
y = 10

integers.subscribe(lambda i: print(i))
```

**OUTPUT:**

```
1
2
3
4
5

Setting y = 10

1
2
3
4
5
6
7
8
9
10
```

The lambda argument ensures the `Observable` source declaration is rebuilt each time it is subscribed to.

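
The same eager-versus-deferred distinction can be seen without RxPy. The sketch below is a plain-Python analogy only, and the names `eager` and `deferred` are made up for this illustration. It assumes nothing about how `Observable.defer()` is implemented. Note that Python's built-in `range()` takes an exclusive stop, hence the `y + 1`.

```python
x = 1
y = 5

# Eager: the values are computed once, when this line runs.
eager = list(range(x, y + 1))


def deferred():
    # Deferred: x and y are looked up again on every call.
    return list(range(x, y + 1))


first_eager, first_deferred = eager, deferred()

y = 10

second_eager, second_deferred = eager, deferred()

print(second_eager)     # still reflects y = 5
print(second_deferred)  # reflects y = 10
```

Calling `deferred()` plays the role of a new subscription: the source is rebuilt from the current state each time, rather than reusing a snapshot taken at declaration.
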

## 4.3D - Unsubscribing from an Observable

When you `subscribe()` to an `Observable` it returns a `Disposable` so you can disconnect the `Subscriber` from the `Observable` at any time.

```python
from rx import Observable
import time

disposable = Observable.interval(1000) \
    .map(lambda i: "{0} Mississippi".format(i)) \
    .subscribe(lambda s: print(s))

# sleep 5 seconds so Observable can fire
time.sleep(5)

# disconnect the Subscriber
print("Unsubscribing!")
disposable.dispose()

# sleep a bit longer to prove no more emissions are coming
time.sleep(5)
```

**OUTPUT:**

```
0 Mississippi
1 Mississippi
2 Mississippi
3 Mississippi
Unsubscribing!
```

Unsubscribing/disposing is usually not necessary for Observables that are finite and quick (they will unsubscribe themselves), but it can be necessary for long-running or infinite Observables.

# 4.4 - An Observable emitting Tweets

Later we will learn how to create Observables that emit Tweets for a given topic, but here is a preview of what's to come. Using Tweepy and `Observable.create()`, we can create a function that yields an `Observable` emitting Tweets for specified topics. For instance, here is how to get a live stream of text bodies from Tweets for "Britain" and "France".


## 4.4A - A Twitter Observable

```python
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import json
from rx import Observable

# Variables that contain the user credentials to access the Twitter API
access_token = "CONFIDENTIAL"
access_token_secret = "CONFIDENTIAL"
consumer_key = "CONFIDENTIAL"
consumer_secret = "CONFIDENTIAL"


def tweets_for(topics):

    def observe_tweets(observer):
        class TweetListener(StreamListener):
            def on_data(self, data):
                observer.on_next(data)
                return True

            def on_error(self, status):
                observer.on_error(status)

        # This handles Twitter authentication and the connection to the Twitter Streaming API
        l = TweetListener()
        auth = OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_token, access_token_secret)
        stream = Stream(auth, l)
        stream.filter(track=topics)

    return Observable.create(observe_tweets).share()


topics = ['Britain','France']

tweets_for(topics).map(lambda d: json.loads(d)) \
    .filter(lambda map: "text" in map) \
    .map(lambda map: map["text"].strip()) \
    .subscribe(lambda s: print(s))


```

**OUTPUT:**

```
RT @YourAnonCentral: The five biggest international arms exports suppliers in 2008–12 were the #US,#Russia, #Germany, #France and #China. ht…
RT @parismarx: Marine Le Pen believes France "will provide the third stage of a global political uprising" following Brexit & Trump https:/…
Attentats du 13-Novembre: des rescapés racontent leur vie un an après https://t.co/VMM5rlsoQu via @RFI
RT @AOLNews: 1 year after the Paris attacks, France's state of emergency remains: https://t.co/PD0U6mXHcN https://t.co/QUHWRSCLxt
おむつは不要、手ぶらで登園。少子化を克服したフランスの保育園事情とは https://t.co/4ImUajYSq2 @HuffPostJapanさんから
RT @CPIF_: #France Interdit cette année, les islamistes tentent de convertir les femmes en faisant l'expérience du voile à…
RT @StewartWood: This week our Government should remember & make clear that Britain's alliances must be based on our values, not our values…
RT @MaxAbrahms: "Britain will spend the next two months trying to convince Mr Trump's team of the need to remove President Assad." https://…
RT @Bassounov: #Trump est devenu présidentiable grâce à 10 ans de #téléPoubelle. En 2022 en France, la présidence se jouera entre #Hanouna…
# Panoramix #Radio #Station
...
```

## 4.4B Cold vs Hot Observables

Observables that emit data typically are **cold Observables**, meaning they will replay emissions to each individual `Subscriber`. For instance, this `Observable` below will emit all five strings to both Subscribers individually.

```python
from rx import Observable

source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])

source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
```

**OUTPUT:**

```
Subscriber 1: Alpha
Subscriber 1: Beta
Subscriber 1: Gamma
Subscriber 1: Delta
Subscriber 1: Epsilon
Subscriber 2: Alpha
Subscriber 2: Beta
Subscriber 2: Gamma
Subscriber 2: Delta
Subscriber 2: Epsilon
```

However, **hot Observables** will not replay emissions for tardy subscribers that come later. Our Twitter `Observable` is an example of a hot `Observable`. If a second `Subscriber` subscribes to a Tweet feed 5 seconds after the first `Subscriber`, it will miss all Tweets that occurred in that window. We will explore this later.


# Part V - Operators

In this section, we will learn some of the 130 operators available in RxPy. Learning these operators can be overwhelming, so the best approach is to seek the right operators out of need. The key to being productive with RxPy and unleashing its potential is to find the key operators that help you with the tasks you encounter. With practice, you will become fluent in composing them together.

The best way to see what operators are available in RxPy is to look through them on GitHub:
https://github.com/ReactiveX/RxPY/tree/master/rx/linq/observable

You can also view the ReactiveX operators page, which has helpful marble diagrams showing each operator's behavior:
http://reactivex.io/documentation/operators.html

You can also explore various operators using the interactive RxMarbles website:
http://rxmarbles.com/


# 5.1 Suppressing Emissions

Here are some operators that can be helpful for suppressing emissions that fail to meet a criterion in some form.

## 5.1A `filter()`

You have already seen `filter()`. It suppresses emissions that fail to meet a condition specified by you. For instance, it can allow only emissions that are at least length 5 to go forward.

```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .filter(lambda s: len(s) >= 5) \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
Alpha
Gamma
Delta
Epsilon
```



## 5.1B `take()`

You can also use `take()` to cut off at a certain number of emissions and call `on_completed()`. For instance, calling `take(2)` like below will only allow the first two emissions coming out of the `filter()` to come through.

```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .filter(lambda s: len(s) >= 5) \
    .take(2) \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
Alpha
Gamma
```

`take()` will not throw an error if it fails to get the number of items it wants. It will just emit what it does capture.
For instance, when `take(10)` only receives 4 emissions (and not 10), it will just emit those 4 emissions.

```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .filter(lambda s: len(s) >= 5) \
    .take(10) \
    .subscribe(on_next = lambda s: print(s), on_error = lambda e: print(e))
```

**OUTPUT:**

```
Alpha
Gamma
Delta
Epsilon
```

## 5.1C `take_while()`

`take_while()` will keep passing emissions based on a condition. For instance, if we have an `Observable` emitting some integers, we can keep taking integers while they are less than 100 using `take_while()`.

```python
from rx import Observable

Observable.from_([2,5,21,5,2,1,5,63,127,12]) \
    .take_while(lambda i: i < 100) \
    .subscribe(on_next = lambda i: print(i), on_completed = lambda: print("Done!"))
```


When the `127` is encountered, the `take_while()` with the condition `i < 100` will trigger `on_completed()` on the `Subscriber`, and unsubscription will prevent any more emissions from occurring.


# 5.2 Distinct Operators

## 5.2A `distinct()`

You can use `distinct()` to suppress redundant emissions. If an item has been emitted before (based on its equality logic via its `__eq__` implementation), it will not be emitted.

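
As a rough mental model of this suppression, here is a plain-Python sketch that remembers what it has already seen. It is not RxPy code: `distinct_by()` is a hypothetical helper written for this illustration, and it assumes the items (or their keys) are hashable so they can be stored in a `set`, which RxPy itself may not require.

```python
def distinct_by(items, key=None):
    """Yield items whose key has not been seen before (hypothetical helper)."""
    seen = set()
    for item in items:
        # With no key function, the item itself decides whether it is a duplicate
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            yield item


words = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]

# Distinct on the mapped values themselves
distinct_lengths = list(distinct_by(len(w) for w in words))

# Distinct on a derived key while still yielding the original item
first_word_per_length = list(distinct_by(words, key=len))

print(distinct_lengths)
print(first_word_per_length)
```

The optional `key` argument corresponds to the idea of deciding uniqueness on a derived attribute while still passing the original item forward.
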

In RxPy, the following chain emits the distinct lengths:

```python
from rx import Observable

Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .map(lambda s: len(s)) \
    .distinct() \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
5
4
7
```


## 5.2B `distinct()` with mapping

You can also pass a lambda specifying what you want to distinct on. If you want to emit the `String` rather than its length, but use distinct logic on its length, you can leverage a lambda argument.

```python
from rx import Observable

Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .distinct(lambda s: len(s)) \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
Alpha
Beta
Epsilon
```


## 5.2C `distinct_until_changed()`

The `distinct_until_changed()` will prevent _consecutive_ duplicates from emitting.

```python
from rx import Observable

Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .map(lambda s: len(s)) \
    .distinct_until_changed() \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
5
4
5
7
```

Just like `distinct()`, you can also provide a lambda to distinct on an attribute.

```python
from rx import Observable

Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .distinct_until_changed(lambda s: len(s)) \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
Alpha
Beta
Gamma
Epsilon
```

# 5.3 Aggregating Operators

When working with data, there will be many instances where we want to consolidate emissions into a single emission to reflect some form of an aggregated result.

One thing to be careful about is that aggregating operators, with the exception of `scan()`, rely on `on_completed()` being called. Infinite Observables will cause an aggregation operator to work forever aggregating an infinite series of emissions.

## 5.3A - `count()`

The simplest aggregation on an `Observable` is to `count()` the number of emissions, and then push that count forward as a single emission once `on_completed()` is called. If we want to count the number of text strings that are not 5 characters, we can achieve it like this:


```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .filter(lambda s: len(s) != 5) \
    .count() \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
2
```

## 5.3B `reduce()`

The `reduce()` allows you to define a custom aggregation operation to "fold" each value into a rolling value. For instance, you can find the sum of numeric emissions (less than 100) using `reduce()` in this manner.

```python
from rx import Observable

Observable.from_([4,76,22,66,881,13,35]) \
    .filter(lambda i: i < 100) \
    .reduce(lambda total, value: total + value) \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
216
```

You can use this to consolidate emissions in your own custom way for most cases. Keep in mind that there are already built in mathematical aggregators like `sum()` (which could replace this `reduce()`) as well as `min()`, `max()`, and `average()`. These only work on numeric emissions, however.


## 5.3C `scan()`

The `scan()` is almost identical to `reduce()`, but it will emit each rolling total for each emission that is received. Therefore, it can work with infinite Observables such as Twitter streams and other events.


```python
from rx import Observable

Observable.from_([4,76,22,66,881,13,35]) \
    .scan(lambda total, value: total + value) \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
4
80
102
168
1049
1062
1097
```

Each accumulation is emitted every time an emission is added to our running total. We start with `4`, then `4` + `76` which is `80`, then `80` + `22` which is `102`, etc...


# 5.4 Collecting Operators

You can consolidate emissions by collecting them into a `List` or `Dict`, and then pushing that collection forward as a single emission.

## 5.4A - `to_list()`

`to_list()` will collect the emissions into a single `List` until `on_completed()` is called, then it will push that `List` forward as a single emission.

```python
from rx import Observable

Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .to_list() \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
['Alpha', 'Beta', 'Gamma', 'Delta', 'Epsilon']
```

Typically you want to avoid excessively collecting things into Lists unless business logic requires it. Prefer to keep emissions flowing forward one-at-a-time in a reactive manner when possible, rather than stopping the flow and collecting emissions into Lists.

## 5.4B - `to_dict()`

The `to_dict()` will collect emissions into a `Dict`, and you specify a lambda that derives the key. For instance, if you wanted to key each String off its first letter and collect them into a `Dict`, do the following:

```python
from rx import Observable

Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .to_dict(lambda s: s[0]) \
    .subscribe(lambda i: print(i))
```

**OUTPUT:**

```
{'B': 'Beta', 'E': 'Epsilon', 'A': 'Alpha', 'G': 'Gamma', 'D': 'Delta'}
```

You can optionally provide a second lambda argument to specify a value other than the emission itself.
If we wanted to map the first letter to the length of the String instead, we can do this:

```python
from rx import Observable

Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .to_dict(lambda s: s[0], lambda s: len(s)) \
    .subscribe(lambda i: print(i))
```



## 6.1A - Observable.merge()


```python
from rx import Observable

source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])

Observable.merge(source1,source2) \
    .subscribe(lambda s: print(s))
```

**OUTPUT:**

```
Alpha
Zeta
Beta
Eta
Gamma
Theta
Delta
Iota
Epsilon
```

Notice that although emissions from both Observables are now a single stream, the emissions are interleaved and jumbled. This is because `Observable.merge()` will fire emissions from all the Observables at once rather than sequentially one-at-a-time.

## 6.1B - Observable.merge() (Continued)

If you want a guarantee of sequential ordering, you will want to use `Observable.concat()`, which is discussed later. But the `Observable.merge()` can be helpful for merging multiple event streams.

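
The difference between sequential and interleaved ordering can be sketched with plain Python iterables. This is only an analogy, not RxPy code: `round_robin()` is a hypothetical helper that makes the sources take turns, which is just one possible interleaving. A real merge depends on when each source happens to emit.

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel marking an exhausted source


def round_robin(*sources):
    """Take one item from each source in turn (hypothetical helper)."""
    for group in zip_longest(*sources, fillvalue=_MISSING):
        for item in group:
            if item is not _MISSING:
                yield item


source1 = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
source2 = ["Zeta", "Eta", "Theta", "Iota"]

sequential = list(chain(source1, source2))         # all of source1, then all of source2
interleaved = list(round_robin(source1, source2))  # the sources take turns

print(sequential)
print(interleaved)
```

With timer-driven sources, as in the interval example that follows, there is no fixed turn order at all: whichever source fires next is emitted next.
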

```python
from rx import Observable

source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))

Observable.merge(source1, source2, source3) \
    .subscribe(lambda s: print(s))

# keep application alive until user presses a key
input("Press any key to quit\n")
```

**OUTPUT:**

```
Source 3: 0
Source 2: 0
Source 3: 1
Source 3: 2
Source 1: 0
Source 2: 1
Source 3: 3
Source 2: 2
Source 3: 4
Source 3: 5
Source 2: 3
Source 1: 1
etc...
```

The three infinite Observables above each emit consecutive integers at different intervals (1000 milliseconds, 500 milliseconds, and 300 milliseconds) and put each integer into a String labeling the source. We then merged these three infinite Observables into one using `Observable.merge()`.

## 6.1C - `merge_all()`

Another way to accomplish this is to make a List containing all three Observables and pass it to `Observable.from_()`. This will make an Observable emitting Observables, and then you can call `merge_all()` to turn each one into its emissions.

```python
from rx import Observable

source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))

Observable.from_([source1,source2,source3]) \
    .merge_all() \
    .subscribe(lambda s: print(s))

# keep application alive until user presses a key
input("Press any key to quit\n")
```

## 6.1D - `merge_all()` (Continued)

If you are creating an `Observable` off each emission on-the-fly, `merge_all()` can be helpful here as well. Say you have a list of Strings containing numbers separated by `/`. You can map each String to be `split()` and then pass those separated values to an `Observable.from_()`. Then you can call `merge_all()` afterwards.

```python
from rx import Observable

items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]

Observable.from_(items) \
    .map(lambda s: Observable.from_(s.split("/"))) \
    .merge_all() \
    .map(lambda s: int(s)) \
    .subscribe(lambda i: print(i))

```

**OUTPUT:**

```
134
34
64
235
22
66
132
98
8
77
112
34
86
778
11
22
12
```

## 6.1E - `flat_map()`

An alternative way of expressing the previous example (6.1D) is using `flat_map()`. It will consolidate mapping to an `Observable` and calling `merge_all()` into a single operator.
903 | 904 | ```python 905 | from rx import Observable 906 | 907 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"] 908 | 909 | Observable.from_(items) \ 910 | .flat_map(lambda s: Observable.from_(s.split("/"))) \ 911 | .map(lambda s: int(s)) \ 912 | .subscribe(lambda i: print(i)) 913 | ``` 914 | 915 | We will prefer `flat_map()` over `map()`/`merge_all()` from now on since it is much more succinct. 916 | 917 | # 6.2 Concat and Zip 918 | 919 | `Observable.concat()` and the `concat_all()` operator are similar to `Observable.merge()` and the `merge_all()` operator. The only difference is they will emit items from each `Observable` _sequentially_. They fire off each `Observable` in order and one-at-a-time. Therefore, this is not something you want to use with infinite Observables, because the first infinite `Observable` will occupy its place in the queue forever and stop the `Observables` behind it from firing. They are helpful for finite data sets though. 920 | 921 | 922 | ## 6.2A - `concat()` 923 | 924 | Our previous `merge()` example can now emit items in order: 925 | 926 | ```python 927 | from rx import Observable 928 | 929 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 930 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"]) 931 | 932 | Observable.concat(source1,source2) \ 933 | .subscribe(lambda s: print(s)) 934 | ``` 935 | 936 | **OUTPUT:** 937 | 938 | ``` 939 | Alpha 940 | Beta 941 | Gamma 942 | Delta 943 | Epsilon 944 | Zeta 945 | Eta 946 | Theta 947 | Iota 948 | ``` 949 | 950 | ## 6.2B - `concat_all()` 951 | 952 | We can make our earlier example splitting Strings ordered using `concat_all()` instead of `merge_all()`.
953 | 954 | ```python 955 | from rx import Observable 956 | 957 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"] 958 | 959 | Observable.from_(items) \ 960 | .map(lambda s: Observable.from_(s.split("/"))) \ 961 | .concat_all() \ 962 | .map(lambda s: int(s)) \ 963 | .subscribe(lambda i: print(i)) 964 | ``` 965 | 966 | **OUTPUT:** 967 | 968 | ``` 969 | 134 970 | 34 971 | 235 972 | 132 973 | 77 974 | 64 975 | 22 976 | 98 977 | 112 978 | 86 979 | 11 980 | 66 981 | 8 982 | 34 983 | 778 984 | 22 985 | 12 986 | ``` 987 | 988 | If you do not care about ordering, it is recommended to use `merge_all()` or `flat_map()`. `concat_all()` can behave unpredictably with certain operators like `group_by()`, which we will cover later. 989 | 990 | 991 | You can also use `concat_map()` in the same spirit as `flat_map()`, preserving the order of the sequence. 992 | 993 | ```python 994 | from rx import Observable 995 | 996 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"] 997 | 998 | Observable.from_(items) \ 999 | .concat_map(lambda s: Observable.from_(s.split("/"))) \ 1000 | .map(lambda s: int(s)) \ 1001 | .subscribe(lambda i: print(i)) 1002 | ``` 1003 | 1004 | 1005 | ## 6.2C - Zip 1006 | 1007 | Zipping pairs emissions from two or more sources and turns them into a single `Observable`. 1008 | 1009 | ```python 1010 | from rx import Observable 1011 | 1012 | letters = Observable.from_(["A","B","C","D","E","F"]) 1013 | numbers = Observable.range(1,5) 1014 | 1015 | Observable.zip(letters,numbers, lambda l,n: "{0}-{1}".format(l,n)) \ 1016 | .subscribe(lambda i: print(i)) 1017 | ``` 1018 | 1019 | **OUTPUT:** 1020 | 1021 | ``` 1022 | A-1 1023 | B-2 1024 | C-3 1025 | D-4 1026 | E-5 1027 | ``` 1028 | 1029 | You can alternatively express this as an operator.
1030 | 1031 | ```python 1032 | letters.zip(numbers, lambda l,n: "{0}-{1}".format(l,n)) \ 1033 | .subscribe(lambda i: print(i)) 1034 | ``` 1035 | 1036 | ## 6.3D - Using Zip to Space Emissions 1037 | 1038 | Zip can also be helpful to space out emissions by zipping an Observable with an `Observable.interval()`. For instance, we can space out five emissions at one-second intervals. 1039 | 1040 | ```python 1041 | from rx import Observable 1042 | 1043 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 1044 | intervals = Observable.interval(1000) 1045 | 1046 | Observable.zip(letters,intervals, lambda s,i: s) \ 1047 | .subscribe(lambda s: print(s)) 1048 | 1049 | input("Press any key to quit\n") 1050 | ``` 1051 | 1052 | Note that `zip()` can get overwhelmed with infinite hot Observables where one produces emissions faster than another. You might want to consider using `combine_latest()` or `with_latest_from()` instead of `zip()`, which will pair with the latest emission from each source. For the sake of brevity, we will not cover these in this course, but you can read more about them in the ReactiveX documentation. 1053 | 1054 | 1055 | # 6.4 - Group By 1056 | 1057 | For the purposes of data science, one of the most powerful operators in ReactiveX is `group_by()`. It will yield an Observable emitting GroupedObservables, where each `GroupedObservable` pushes items with a given key. It behaves just like any other `Observable`, but it has a `key` property which we will leverage in a moment. 1058 | 1059 | But first, let's group some `String` emissions by keying on their lengths. Then let's collect emissions for each grouping into a `List`. Then we can call `flat_map()` to yield all the Lists.
1060 | 1061 | ## 6.4A - Group into Lists 1062 | 1063 | ```python 1064 | from rx import Observable 1065 | 1066 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"] 1067 | 1068 | Observable.from_(items) \ 1069 | .group_by(lambda s: len(s)) \ 1070 | .flat_map(lambda grp: grp.to_list()) \ 1071 | .subscribe(lambda i: print(i)) 1072 | ``` 1073 | 1074 | **OUTPUT:** 1075 | 1076 | ``` 1077 | ['Alpha', 'Gamma', 'Delta'] 1078 | ['Beta'] 1079 | ['Epsilon'] 1080 | ``` 1081 | 1082 | `group_by()` is efficient because it is still 100% reactive and pushes items one-at-a-time through the different GroupedObservables. You can also leverage the `key` property and tuple it up with an aggregated value. This is helpful if you want to create a `Dict` that holds aggregations by key values. 1083 | 1084 | For instance, if you want to find the count of each word length occurrence, you can create a `Dict` like this: 1085 | 1086 | ## 6.4B - Getting Length Counts 1087 | 1088 | ```python 1089 | from rx import Observable 1090 | 1091 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"] 1092 | 1093 | Observable.from_(items) \ 1094 | .group_by(lambda s: len(s)) \ 1095 | .flat_map(lambda grp: 1096 | grp.count().map(lambda ct: (grp.key, ct)) 1097 | ) \ 1098 | .to_dict(lambda key_value: key_value[0], lambda key_value: key_value[1]) \ 1099 | .subscribe(lambda i: print(i)) 1100 | ``` 1101 | 1102 | **OUTPUT:** 1103 | 1104 | ``` 1105 | {4: 1, 5: 3, 7: 1} 1106 | ``` 1107 | 1108 | You can interpret the returned `Dict` above as "for length 4 there is 1 occurrence, for length 5 there are 3 occurrences, etc". 1109 | 1110 | `group_by()` is somewhat abstract but it is a powerful and efficient way to perform aggregations on a given key. It also works with infinite Observables, assuming you use infinite-friendly operators on each `GroupedObservable`. We will use `group_by()` a few more times in this course.
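
To make "infinite-friendly" a little more concrete, here is a minimal plain-Python sketch. It does not use RxPY, and `running_length_counts()` and `endless_words()` are hypothetical helpers invented for this illustration. It keeps a running count per key and yields an update for every item instead of waiting for the source to finish, which is roughly the behavior you would want from a rolling aggregation on each group when the source never completes.

```python
from collections import defaultdict
from itertools import count, islice


def running_length_counts(words):
    """Yield (length, running_count) after every word, never waiting for the end."""
    counts = defaultdict(int)
    for word in words:
        key = len(word)
        counts[key] += 1
        yield key, counts[key]


def endless_words():
    """A hypothetical infinite source that cycles through five words forever."""
    items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
    for i in count():
        yield items[i % len(items)]


# Take only the first seven updates from the infinite source
first_updates = list(islice(running_length_counts(endless_words()), 7))
print(first_updates)
```

An aggregation that only reports when the source completes would never produce anything here, because `endless_words()` never ends.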
1111 | 1112 | 1113 | # Section VII - Reading and Analyzing data 1114 | 1115 | In this chapter we will look over basic ways to reactively read and analyze data from text files, URLs, and SQL. We will also integrate concepts we previously learned to create a reactive word counter that runs on a schedule and detects changes to a file. 1116 | 1117 | One catch with using `Observable.from_()` on a data source iterable is that the iterable can only be consumed once, so any Subscriber after the first will not receive data. To get around this we will use functions to create a new `Observable` each time we need to subscribe to a data source. A slightly more advanced way to solve this issue is to use `Observable.defer()` which we will not cover here, but you can read about it in the Appendix. 1118 | 1119 | It is good to leverage functions that return Observables anyway. You can accept arguments to build the Observable chain that is returned and increase reusability. 1120 | 1121 | ## 7.1A - Reading a Text File 1122 | 1123 | As stated earlier, anything that is iterable can be turned into an `Observable` using `Observable.from_()`. We can emit the lines from a text file in this manner. If I have a raw text file called `bbc_news_article.txt` in my Python project, I can emit the lines like this: 1124 | 1125 | ```python 1126 | from rx import Observable 1127 | 1128 | 1129 | def read_lines(file_name): 1130 | file = open(file_name) 1131 | 1132 | return Observable.from_(file) \ 1133 | .map(lambda l: l.strip()) \ 1134 | .filter(lambda l: l != "") 1135 | 1136 | 1137 | read_lines("bbc_news_article.txt").subscribe(lambda s: print(s)) 1138 | ``` 1139 | 1140 | **OUTPUT:** 1141 | 1142 | ``` 1143 | Giant waves damage S Asia economy 1144 | Governments, aid agencies, insurers and travel firms are among those counting the cost of the massive earthquake and waves that hammered southern Asia.
1145 | The worst-hit areas are Sri Lanka, India, Indonesia and Thailand, with at least 23,000 people killed. Early estimates from the World Bank put the amount of aid needed at about $5bn (£2.6bn), similar to the cash offered Central America after Hurricane Mitch. Mitch killed about 10,000 people and caused damage of about $10bn in 1998. World Bank spokesman Damien 1146 | ... 1147 | ``` 1148 | 1149 | I use the `map()` and `filter()` operators to strip any leading and trailing whitespace from each line, and to drop lines that are empty. 1150 | 1151 | We will use this example for a project at the end of this section. 1152 | 1153 | 1154 | ## 7.1B - Reading a URL 1155 | 1156 | You can also read content from the web in a similar manner. This can be a powerful way to do web scraping and data wrangling, especially if you reactively push multiple URLs or URL arguments and scrape the content off each page. Just be kind and don't tax somebody's system! 1157 | 1158 | I saved a simple raw text page of the 50 U.S. states on a Gist page. You can view it with this URL: https://goo.gl/rIaDyM. 1159 | 1160 | If you want to read the lines off the response, you can do it like this: 1161 | 1162 | ```python 1163 | from rx import Observable 1164 | from urllib.request import urlopen 1165 | 1166 | 1167 | def read_request(link): 1168 | f = urlopen(link) 1169 | 1170 | return Observable.from_(f) \ 1171 | .map(lambda s: s.decode("utf-8").strip()) 1172 | 1173 | read_request("https://goo.gl/rIaDyM") \ 1174 | .subscribe(lambda s: print(s)) 1175 | ``` 1176 | 1177 | **OUTPUT:** 1178 | 1179 | ``` 1180 | Alabama 1181 | Alaska 1182 | Arizona 1183 | Arkansas 1184 | California 1185 | Colorado 1186 | Connecticut 1187 | Delaware 1188 | ... 1189 | ``` 1190 | 1191 | In the `map()` we have to decode the UTF-8 bytes into Strings. We also clean leading and trailing whitespace with `strip()`. Then finally we print each line.
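
The decode-and-strip step can be tried on its own without any network access. The sketch below is plain Python with no RxPY, and `raw_lines` is a hypothetical stand-in for the byte lines you get when iterating an HTTP response.

```python
# Hypothetical raw lines, shaped like what iterating an HTTP response yields
raw_lines = [b"Alabama\n", b"Alaska\r\n", b"  Arizona \n"]

# Decode each bytes object to a str, then trim surrounding whitespace and newlines
decoded = [line.decode("utf-8").strip() for line in raw_lines]
print(decoded)
```

Skipping the `decode()` call would leave you with `bytes` objects, which print with a `b'...'` prefix rather than as clean text.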
1192 | 1193 | ## 7.1C Recursively Iterating Files in Directories (EXTRA) 1194 | 1195 | You can use Rx to build powerful recursive patterns for iterating files. You can download and unzip a BBC article dataset for this example here, with thousands of articles in text file format: http://mlg.ucd.ie/datasets/bbc.html 1196 | 1197 | 1198 | ```python 1199 | from rx import Observable 1200 | import os 1201 | 1202 | 1203 | def recursive_files_in_directory(folder): 1204 | 1205 | def emit_files_recursively(observer): 1206 | for root, directories, filenames in os.walk(folder): 1207 | for directory in directories: 1208 | observer.on_next(os.path.join(root, directory)) 1209 | for filename in filenames: 1210 | observer.on_next(os.path.join(root, filename)) 1211 | 1212 | observer.on_completed() 1213 | 1214 | return Observable.create(emit_files_recursively) 1215 | 1216 | 1217 | recursive_files_in_directory('/home/thomas/Desktop/bbc_data_sets') \ 1218 | .filter(lambda f: f.endswith('.txt')) \ 1219 | .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e)) 1220 | 1221 | ``` 1222 | 1223 | 1224 | You can iterate files through a directory and any nested directories, filter only for files you are interested in (such as .txt files), and then emit the lines from all the files.
1225 | 1226 | 1227 | ```python 1228 | from rx import Observable 1229 | import os 1230 | 1231 | 1232 | def recursive_files_in_directory(folder): 1233 | 1234 | def emit_files_recursively(observer): 1235 | for root, directories, filenames in os.walk(folder): 1236 | for directory in directories: 1237 | observer.on_next(os.path.join(root, directory)) 1238 | for filename in filenames: 1239 | observer.on_next(os.path.join(root, filename)) 1240 | 1241 | observer.on_completed() 1242 | 1243 | return Observable.create(emit_files_recursively) 1244 | 1245 | 1246 | recursive_files_in_directory('/home/thomas/Desktop/bbc') \ 1247 | .filter(lambda f: f.endswith('.txt')) \ 1248 | .flat_map(lambda f: Observable.from_(open(f, encoding="ISO-8859-1"))) \ 1249 | .map(lambda l: l.strip()) \ 1250 | .filter(lambda l: l != "") \ 1251 | .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e)) 1252 | 1253 | ``` 1254 | 1255 | ## 7.2 - Reading a SQL Query 1256 | 1257 | SQLAlchemy is the go-to Python library for SQL querying, and since it is iterable it can easily support Rx. In this example, I am using a SQLite database file which you can download at https://goo.gl/9DYXPS. You can also download it on my [_Getting Started with SQL_ GitHub page](https://github.com/thomasnield/oreilly_getting_started_with_sql). 1258 | 1259 | 1260 | 1261 | ### 7.2A - Emitting a query 1262 | 1263 | When you set up your engine, statement, and connection, you can reactively emit each result (which will be a tuple) from a query using `Observable.from_()`. Since a SQL query result set can only be iterated once, it is easiest to use a function to create a new one and return it in an `Observable` each time. That way multiple subscribers can be accommodated easily. 
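
To see why a fresh result set is needed for each subscription, here is a small sketch of the single-use behavior. It makes several assumptions: it uses the standard library `sqlite3` module rather than SQLAlchemy, and an in-memory, two-column `CUSTOMER` table that is a simplified stand-in and not the real `rexon_metals.db` schema.

```python
import sqlite3

# Hypothetical in-memory database, not the real rexon_metals.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER (NAME) VALUES (?)",
    [("LITE Industrial",), ("Rex Tooling Inc",)],
)


def get_all_customers():
    # Each call runs the query again and returns a brand new result set
    return conn.execute("SELECT * FROM CUSTOMER")


result = get_all_customers()
first_pass = list(result)    # consumes every row
second_pass = list(result)   # the same result set is now exhausted
fresh_pass = list(get_all_customers())

print(first_pass)
print(second_pass)
print(fresh_pass)
```

The second pass over the same result set comes back empty, while calling the function again yields the rows once more. Wrapping the query in a function gives every Subscriber the same guarantee.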
1264 | 1265 | 1266 | ```python 1267 | from sqlalchemy import create_engine, text 1268 | from rx import Observable 1269 | 1270 | engine = create_engine('sqlite:///rexon_metals.db') 1271 | conn = engine.connect() 1272 | 1273 | 1274 | def get_all_customers(): 1275 | stmt = text("SELECT * FROM CUSTOMER") 1276 | return Observable.from_(conn.execute(stmt)) 1277 | 1278 | 1279 | get_all_customers().subscribe(lambda r: print(r)) 1280 | ``` 1281 | 1282 | **OUTPUT:** 1283 | 1284 | ``` 1285 | (1, 'LITE Industrial', 'Southwest', '729 Ravine Way', 'Irving', 'TX', 75014) 1286 | (2, 'Rex Tooling Inc', 'Southwest', '6129 Collie Blvd', 'Dallas', 'TX', 75201) 1287 | (3, 'Re-Barre Construction', 'Southwest', '9043 Windy Dr', 'Irving', 'TX', 75032) 1288 | (4, 'Prairie Construction', 'Southwest', '264 Long Rd', 'Moore', 'OK', 62104) 1289 | (5, 'Marsh Lane Metal Works', 'Southeast', '9143 Marsh Ln', 'Avondale', 'LA', 79782) 1290 | ``` 1291 | 1292 | ### 7.2B - Using Observable.defer() 1293 | 1294 | If you need multiple subscribers for sources that can only be iterated once, use `Observable.defer()` to generate a new iterable object for each subscription. 1295 | 1296 | ```python 1297 | from sqlalchemy import create_engine, text 1298 | from rx import Observable 1299 | 1300 | engine = create_engine('sqlite:///rexon_metals.db') 1301 | conn = engine.connect() 1302 | 1303 | 1304 | def get_all_customers(): 1305 | stmt = text("SELECT * FROM CUSTOMER") 1306 | return Observable.defer(lambda: Observable.from_(conn.execute(stmt))) 1307 | 1308 | my_source = get_all_customers() 1309 | 1310 | my_source.subscribe(lambda r: print(r)) 1311 | my_source.subscribe(lambda r: print(r)) 1312 | ``` 1313 | 1314 | 1315 | 1316 | ### 7.2C - Merging multiple queries 1317 | 1318 | You can create some powerful reactive patterns when working with databases. For instance, say you wanted to query for customers with ID's 1, 3, and 5. 
Of course you can do this in raw SQL like so: 1319 | 1320 | ```sql 1321 | SELECT * FROM CUSTOMER WHERE CUSTOMER_ID in (1,3,5) 1322 | ``` 1323 | 1324 | However, let's leverage Rx to keep our API simple and minimize the number of query functions it needs. 1325 | 1326 | You can create a single `customer_for_id()` function that returns an `Observable` emitting a customer for a given `customer_id`. You can compose it into a reactive chain by using `merge_all()` or `flat_map()`. Do this by emitting the desired ID's, mapping them to the `customer_for_id()`, and then calling `merge_all()` to consolidate the results from all three queries. 1327 | 1328 | ```python 1329 | from sqlalchemy import create_engine, text 1330 | from rx import Observable 1331 | 1332 | engine = create_engine('sqlite:///rexon_metals.db') 1333 | conn = engine.connect() 1334 | 1335 | def get_all_customers(): 1336 | stmt = text("SELECT * FROM CUSTOMER") 1337 | return Observable.from_(conn.execute(stmt)) 1338 | 1339 | def customer_for_id(customer_id): 1340 | stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id") 1341 | return Observable.from_(conn.execute(stmt, id=customer_id)) 1342 | 1343 | # Query customers with IDs 1, 3, and 5 1344 | Observable.from_([1, 3, 5]) \ 1345 | .flat_map(lambda id: customer_for_id(id)) \ 1346 | .subscribe(lambda r: print(r)) 1347 | 1348 | ``` 1349 | 1350 | **OUTPUT:** 1351 | 1352 | ``` 1353 | (1, 'LITE Industrial', 'Southwest', '729 Ravine Way', 'Irving', 'TX', 75014) 1354 | (3, 'Re-Barre Construction', 'Southwest', '9043 Windy Dr', 'Irving', 'TX', 75032) 1355 | (5, 'Marsh Lane Metal Works', 'Southeast', '9143 Marsh Ln', 'Avondale', 'LA', 79782) 1356 | ``` 1357 | 1358 | 1359 | ## 7.2D - Writing Data (EXTRA) 1360 | 1361 | You can also use Rx to write data to a database. One way to do this is to put the writing operations in the Subscriber, but you can get a bit more creative and flexible with Rx. 
For instance, we can create a function called `insert_new_customer()` that accepts the parameters needed to create a new `CUSTOMER` record. But, we can return an `Observable` that emits the automatically assigned PRIMARY KEY value for that record. This allows us to compose writing operations with other operations, such as querying for the record we just created. 1362 | 1363 | ```python 1364 | from sqlalchemy import create_engine, text 1365 | from rx import Observable 1366 | 1367 | 1368 | engine = create_engine('sqlite:///rexon_metals.db') 1369 | conn = engine.connect() 1370 | 1371 | 1372 | def get_all_customers(): 1373 | stmt = text("SELECT * FROM CUSTOMER") 1374 | return Observable.from_(conn.execute(stmt)) 1375 | 1376 | 1377 | def customer_for_id(customer_id): 1378 | stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id") 1379 | return Observable.from_(conn.execute(stmt, id=customer_id)) 1380 | 1381 | 1382 | def insert_new_customer(customer_name, region, street_address, city, state, zip_code): 1383 | stmt = text("INSERT INTO CUSTOMER (NAME, REGION, STREET_ADDRESS, CITY, STATE, ZIP) VALUES (" 1384 | ":customer_name, :region, :street_address, :city, :state, :zip_code)") 1385 | 1386 | result = conn.execute(stmt, customer_name=customer_name, region=region, street_address=street_address, city=city, state=state, zip_code=zip_code) 1387 | return Observable.just(result.lastrowid) 1388 | 1389 | # Create new customer, emit primary key ID, and query that customer 1390 | insert_new_customer('RMS Materials','Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', '02201') \ 1391 | .flat_map(lambda i: customer_for_id(i)) \ 1392 | .subscribe(lambda s: print(s)) 1393 | 1394 | ``` 1395 | 1396 | **OUTPUT:** 1397 | 1398 | ``` 1399 | (6, 'RMS Materials', 'Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', 2201) 1400 | 1401 | ``` 1402 | 1403 | 1404 | ## 7.3 - A Scheduled Reactive Word Counter 1405 | 1406 | Let's apply everything we have learned so far to create a 
reactive word counter process. 1407 | 1408 | 1409 | ### 7.3A - Emitting words from a text file 1410 | Let's start by creating a function that returns an `Observable` emitting and cleaning the words in a text file, stripping punctuation, dropping empty words, and making all words lower case. 1411 | 1412 | ```python 1413 | from rx import Observable 1414 | import re 1415 | 1416 | 1417 | def words_from_file(file_name): 1418 | file = open(file_name) 1419 | 1420 | # parse, clean, and push words in text file 1421 | return Observable.from_(file) \ 1422 | .flat_map(lambda s: Observable.from_(s.split())) \ 1423 | .map(lambda w: re.sub(r'[^\w]', '', w)) \ 1424 | .filter(lambda w: w != "") \ 1425 | .map(lambda w: w.lower()) 1426 | 1427 | article_file = "bbc_news_article.txt" 1428 | words_from_file(article_file).subscribe(lambda w: print(w)) 1429 | ``` 1430 | 1431 | **OUTPUT:** 1432 | 1433 | ``` 1434 | giant 1435 | waves 1436 | damage 1437 | governments 1438 | s 1439 | aid 1440 | asia 1441 | agencies 1442 | the 1443 | economy 1444 | ... 1445 | ``` 1446 | 1447 | 1448 | ### 7.3B - Counting Word Occurrences 1449 | 1450 | Let's create another function called `word_counter()`. It will leverage the existing `words_from_file()`, use `group_by()` to count the word occurrences, and then tuple each word with its count.
1451 | 1452 | ```python 1453 | from rx import Observable 1454 | import re 1455 | 1456 | 1457 | def words_from_file(file_name): 1458 | file = open(file_name) 1459 | 1460 | # parse, clean, and push words in text file 1461 | return Observable.from_(file) \ 1462 | .flat_map(lambda s: Observable.from_(s.split())) \ 1463 | .map(lambda w: re.sub(r'[^\w\s]', '', w)) \ 1464 | .filter(lambda w: w != "") \ 1465 | .map(lambda w: w.lower()) 1466 | 1467 | 1468 | 1469 | def word_counter(file_name): 1470 | 1471 | # count words using `group_by()` 1472 | # tuple the word with the count 1473 | return words_from_file(file_name) \ 1474 | .group_by(lambda word: word) \ 1475 | .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct))) 1476 | 1477 | article_file = "bbc_news_article.txt" 1478 | word_counter(article_file).subscribe(lambda w: print(w)) 1479 | ``` 1480 | 1481 | **OUTPUT:** 1482 | 1483 | ``` 1484 | ('giant', 1) 1485 | ('waves', 3) 1486 | ('damage', 6) 1487 | ('governments', 3) 1488 | ('s', 1) 1489 | ('aid', 10) 1490 | ('asia', 6) 1491 | ('agencies', 3) 1492 | ('the', 78) 1493 | ('economy', 1) 1494 | ... 1495 | ``` 1496 | 1497 | ## 7.3C - Scheduling the Word Count And Notifying of Changes 1498 | 1499 | Finally, let's schedule this word count to occur every 3 seconds and collect the counts into a `Dict`. We can use `distinct_until_changed()` to only emit a `Dict` when it differs from the previous one because the text file was edited.
1500 | 1501 | ```python 1502 | # Schedules a reactive process that counts the words in a text file every three seconds, 1503 | # but only prints it as a dict if it has changed 1504 | 1505 | from rx import Observable 1506 | import re 1507 | 1508 | 1509 | def words_from_file(file_name): 1510 | file = open(file_name) 1511 | 1512 | # parse, clean, and push words in text file 1513 | return Observable.from_(file) \ 1514 | .flat_map(lambda s: Observable.from_(s.split())) \ 1515 | .map(lambda w: re.sub(r'[^\w\s]', '', w)) \ 1516 | .filter(lambda w: w != "") \ 1517 | .map(lambda w: w.lower()) 1518 | 1519 | 1520 | 1521 | def word_counter(file_name): 1522 | 1523 | # count words using `group_by()` 1524 | # tuple the word with the count 1525 | return words_from_file(file_name) \ 1526 | .group_by(lambda word: word) \ 1527 | .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct))) 1528 | 1529 | 1530 | # composes the above word_counter() into a dict 1531 | def word_counter_as_dict(file_name): 1532 | return word_counter(file_name).to_dict(lambda t: t[0], lambda t: t[1]) 1533 | 1534 | 1535 | # Schedule a word count dict of the article every three seconds 1536 | # But only re-print if text is edited and word counts change 1537 | 1538 | article_file = "bbc_news_article.txt" 1539 | 1540 | # create a dict every three seconds, but only push if it changed 1541 | Observable.interval(3000) \ 1542 | .flat_map(lambda i: word_counter_as_dict(article_file)) \ 1543 | .distinct_until_changed() \ 1544 | .subscribe(lambda word_ct_dict: print(word_ct_dict)) 1545 | 1546 | # Keep alive until user presses any key 1547 | input("Starting, press any key to quit\n") 1548 | ``` 1549 | 1550 | **OUTPUT:** 1551 | 1552 | ``` 1553 | Starting, press any key to quit 1554 | {'a': 7, 'governments': 3, 'first': 1, 'getting': 1, 'offered': 1, ... 1555 | ``` 1556 | 1557 | Every time the file is edited and words are added, modified, or removed, it should push a new `Dict` reflecting these changes.
This can be helpful to run a report on a schedule, and you can only emit a new report to an output if the data has changed. 1558 | 1559 | Ideally, it is better to hook onto the change event itself rather than running a potentially expensive process every 3 seconds. We will learn how to do this with Twitter in the next section. 1560 | 1561 | > If you want to see an intensive reactive data analysis example, see my [social media example on Gist](https://goo.gl/NO0Q4P) 1562 | 1563 | 1564 | 1565 | # Section VIII - Hot Observables 1566 | 1567 | In this section we will learn how to create an `Observable` emitting Tweets for a set of topics. We will wrap an `Observable.create()` around the Tweepy API. But first, let's cover multicasting. 1568 | 1569 | ## 8.1A - Creating a `ConnectableObservable` 1570 | 1571 | Remember how cold Observables will replay data to each Subscriber like a music CD? 1572 | 1573 | ```python 1574 | from rx import Observable 1575 | 1576 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 1577 | 1578 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 1579 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 1580 | ``` 1581 | 1582 | **OUTPUT:** 1583 | 1584 | ``` 1585 | Subscriber 1: Alpha 1586 | Subscriber 1: Beta 1587 | Subscriber 1: Gamma 1588 | Subscriber 1: Delta 1589 | Subscriber 1: Epsilon 1590 | Subscriber 2: Alpha 1591 | Subscriber 2: Beta 1592 | Subscriber 2: Gamma 1593 | Subscriber 2: Delta 1594 | Subscriber 2: Epsilon 1595 | ``` 1596 | 1597 | This is often what we want so no data is missed for each Subscriber. But there are times we will want to force cold Observables to become hot Observables. We can do this by calling `publish()` which will return a `ConnectableObservable`. Then we can subscribe our Subscribers to it, then call `connect()` to fire emissions to all Subscribers at once. 
1598 | 1599 | ```python 1600 | from rx import Observable 1601 | 1602 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]).publish() 1603 | 1604 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 1605 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 1606 | 1607 | source.connect() 1608 | ``` 1609 | 1610 | **OUTPUT:** 1611 | 1612 | ``` 1613 | Subscriber 1: Alpha 1614 | Subscriber 2: Alpha 1615 | Subscriber 1: Beta 1616 | Subscriber 2: Beta 1617 | Subscriber 1: Gamma 1618 | Subscriber 2: Gamma 1619 | Subscriber 1: Delta 1620 | Subscriber 2: Delta 1621 | Subscriber 1: Epsilon 1622 | Subscriber 2: Epsilon 1623 | ``` 1624 | 1625 | This is known as multicasting. Notice how the emissions are now interleaved? This is because each emission is going to both subscribers. This is helpful if "replaying" the data is expensive or we just simply want all Subscribers to get the emissions simultaneously. 1626 | 1627 | ## 8.1B - Sharing an Interval Observable (EXTRA) 1628 | 1629 | `Observable.interval()` is actually a cold Observable too. If one Subscriber subscribes to it, and 5 seconds later another Subscriber comes in, that second subscriber will receive its own emissions that "start over". 
1630 | 1631 | ```python 1632 | from rx import Observable 1633 | import time 1634 | 1635 | source = Observable.interval(1000) 1636 | 1637 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 1638 | 1639 | # sleep 5 seconds, then add another subscriber 1640 | time.sleep(5) 1641 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 1642 | 1643 | input("Press any key to exit\n") 1644 | ``` 1645 | 1646 | **OUTPUT:** 1647 | 1648 | ``` 1649 | Subscriber 1: 0 1650 | Subscriber 1: 1 1651 | Subscriber 1: 2 1652 | Subscriber 1: 3 1653 | Press any key to exit 1654 | Subscriber 1: 4 1655 | Subscriber 2: 0 1656 | Subscriber 1: 5 1657 | Subscriber 2: 1 1658 | Subscriber 1: 6 1659 | Subscriber 2: 2 1660 | Subscriber 1: 7 1661 | Subscriber 2: 3 1662 | 1663 | ``` 1664 | 1665 | Subscriber 2 starts at `0` while Subscriber 1 is already at `4`. If we want both to be on the same timer, we can use `publish()` to create a `ConnectableObservable`. 1666 | 1667 | 1668 | ```python 1669 | from rx import Observable 1670 | import time 1671 | 1672 | source = Observable.interval(1000).publish() 1673 | 1674 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 1675 | source.connect() 1676 | 1677 | # sleep 5 seconds, then add another subscriber 1678 | time.sleep(5) 1679 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 1680 | 1681 | input("Press any key to exit\n") 1682 | ``` 1683 | 1684 | **OUTPUT:** 1685 | 1686 | ``` 1687 | Subscriber 1: 0 1688 | Subscriber 1: 1 1689 | Subscriber 1: 2 1690 | Subscriber 1: 3 1691 | Press any key to exit 1692 | Subscriber 1: 4 1693 | Subscriber 2: 4 1694 | Subscriber 1: 5 1695 | Subscriber 2: 5 1696 | ``` 1697 | 1698 | 1699 | ## 8.1C - Autoconnecting 1700 | 1701 | We can have our `ConnectableObservable` automatically `connect()` itself when it gets a Subscriber by calling `auto_connect()` on it. The related `ref_count()` does the same, but also disconnects once its last Subscriber unsubscribes.
1702 | 1703 | 1704 | ```python 1705 | from rx import Observable 1706 | import time 1707 | 1708 | source = Observable.interval(1000).publish().auto_connect() 1709 | 1710 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 1711 | 1712 | # sleep 5 seconds, then add another subscriber 1713 | time.sleep(5) 1714 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 1715 | 1716 | input("Press any key to exit\n") 1717 | ``` 1718 | 1719 | You can also pass `auto_connect()` the number of subscribers to wait for before it starts firing. 1720 | 1721 | ```python 1722 | # wait for two subscribers before connecting 1723 | source = Observable.interval(1000).publish().auto_connect(2) 1723 | ``` 1724 | 1725 | Again, multicasting is helpful when you want all Subscribers to receive the same emissions simultaneously 1726 | and prevent redundant, expensive work for each Subscriber. 1727 | 1728 | 1729 | ## 8.2D - Multicasting Specific Points 1730 | 1731 | The placement of the multicasting matters. For instance, if you map three emissions to three random integers, but multicast _before_ the `map()` operation, two subscribers will both receive separate random integers.
1732 | 1733 | ```python 1734 | from rx import Observable 1735 | from random import randint 1736 | 1737 | 1738 | three_emissions = Observable.range(1, 3).publish() 1739 | 1740 | three_random_ints = three_emissions.map(lambda i: randint(1, 100000)) 1741 | 1742 | three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i))) 1743 | three_random_ints.subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i))) 1744 | 1745 | three_emissions.connect() 1746 | ``` 1747 | 1748 | **OUTPUT:** 1749 | 1750 | ``` 1751 | Subscriber 1 Received: 56976 1752 | Subscriber 1 Received: 882 1753 | Subscriber 1 Received: 59873 1754 | Subscriber 2 Received: 12911 1755 | Subscriber 2 Received: 47631 1756 | Subscriber 2 Received: 84640 1757 | ``` 1758 | 1759 | 1760 | However, if you put the `publish()` _after_ the `map()` operation, both subscribers will receive the same emissions. 1761 | 1762 | ```python 1763 | from rx import Observable 1764 | from random import randint 1765 | 1766 | 1767 | three_emissions = Observable.range(1, 3) 1768 | 1769 | three_random_ints = three_emissions.map(lambda i: randint(1, 100000)).publish() 1770 | 1771 | three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i))) 1772 | three_random_ints.subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i))) 1773 | 1774 | three_random_ints.connect() 1775 | ``` 1776 | 1777 | **OUTPUT:** 1778 | 1779 | ``` 1780 | Subscriber 1 Received: 17500 1781 | Subscriber 2 Received: 17500 1782 | Subscriber 1 Received: 71398 1783 | Subscriber 2 Received: 71398 1784 | Subscriber 1 Received: 90457 1785 | Subscriber 2 Received: 90457 1786 | ``` 1787 | 1788 | Therefore, note that most operators will create a separate stream for each subscriber, even if upstream there is a multicasting operation. Typically, you multicast up to the point where operations are common to both subscribers.
For instance, if one subscriber simply prints each random number while a second subscriber sums them, you multicast after the `map()` but before the summing operation, since that is the point where the two subscribers' operations diverge.

```python
from rx import Observable
from random import randint


three_emissions = Observable.range(1, 3)

three_random_ints = three_emissions.map(lambda i: randint(1, 100000)).publish()

three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i)))

three_random_ints.reduce(lambda total, item: total + item) \
    .subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i)))

three_random_ints.connect()
```

**OUTPUT:**

```
Subscriber 1 Received: 17618
Subscriber 1 Received: 66227
Subscriber 1 Received: 36159
Subscriber 2 Received: 120004
```


## 8.2E - Subjects (EXTRA)

Another way to create a multicasted `Observable` is by declaring a `Subject`. A `Subject` is both an `Observable` and `Observer`, and you can call its `Observer` functions to push items through it and up to any Subscribers at any time. It will push these items to all subscribers.

```python
from rx.subjects import Subject

subject = Subject()

subject.filter(lambda i: i < 100) \
    .map(lambda i: i * 1000) \
    .subscribe(lambda i: print(i))

subject.on_next(10)
subject.on_next(50)
subject.on_next(105)
subject.on_next(87)

subject.on_completed()
```

**OUTPUT:**

```
10000
50000
87000
```

While they seem convenient, Subjects are often discouraged. They can easily encourage antipatterns and are prone to abuse.
They also are difficult to compose against and do not respect `subscribe_on()`. It is better to create Observables that strictly come from one defined source, rather than be openly mutable and have anything push items to them at any time. Use Subjects with discretion.

## 8.2 - Querying Live Twitter Feeds


You can use `Observable.create()` to wrangle and analyze a live Twitter feed.

You will need to create your own application and access keys/tokens at https://apps.twitter.com.

If we want to query a live stream of Tweets pertaining to the topics of "Britain" or "France", we can do it like this:

```python
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
import json
from rx import Observable

# Variables that contain the user credentials to access the Twitter API
access_token = "PUT YOURS HERE"
access_token_secret = "PUT YOURS HERE"
consumer_key = "PUT YOURS HERE"
consumer_secret = "PUT YOURS HERE"


def tweets_for(topics):
    def observe_tweets(observer):
        class TweetListener(StreamListener):
            def on_data(self, data):
                observer.on_next(data)
                return True

            def on_error(self, status):
                observer.on_error(status)

        # This handles Twitter authentication and the connection to the Twitter Streaming API
        listener = TweetListener()
        auth = OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_token, access_token_secret)
        stream = Stream(auth, listener)
        stream.filter(track=topics)

    return Observable.create(observe_tweets).share()


topics = ['Britain', 'France']

tweets_for(topics) \
    .map(lambda d: json.loads(d)) \
    .subscribe(on_next=lambda s: print(s), on_error=lambda e: print(e))
```

# IX - Concurrency

(Refer to slides to cover concurrency concepts).

## 9.1 - Using `subscribe_on()`

## 9.1A - Two Long-Running Processes

We will not dive too deep into concurrency topics, but we will learn enough to make it useful and speed up slow processes. Note also that the [GIL issue in Python](https://stackoverflow.com/questions/1294382/what-is-a-global-interpreter-lock-gil#1294402) can undermine concurrency performance in Python applications, but hopefully you will still get some marginal benefit. Be sure to test your concurrency strategies and measure what brings the best performance.

> Keep in mind your output may be different from mine, because concurrency tends to shuffle emissions of multiple sources. Output is almost never deterministic when multiple threads are doing work simultaneously and being merged.

Below, we create two Observables we will call "Task 1" and "Task 2". The first Observable emits five strings and the other emits numbers in a range. These Observables will fire quickly when subscribed to, but concurrency is more useful and apparent with long-running tasks. To emulate long-running expensive processes, we will need to exaggerate and slow down emissions. We can use an `intense_calculation()` function that sleeps for a short random duration (between 0.5 and 2.0 seconds) before returning the value it was given. Then we can use this in a `map()` operator for each `Observable`.

We will use `current_thread().name` to identify the thread that is calling each `on_next()` in the `Subscriber`. Python will label each thread it creates consecutively as "Thread-1", "Thread-2", "Thread-3", etc.

Before "Task 2" can start, it must wait for "Task 1" to call `on_completed()` because by default both are on the `ImmediateScheduler`. This scheduler uses the same `MainThread` that runs our Python program.


```python
from rx import Observable
from threading import current_thread
import time, random

def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5,20) * .1)
    return value

# Create TASK 1
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 1 done!"))

# Create TASK 2
Observable.range(1,10) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 2 done!"))

input("Press any key to exit\n")
```

**OUTPUT (May not match yours):**

```
TASK 1: MainThread Alpha
TASK 1: MainThread Beta
TASK 1: MainThread Gamma
TASK 1: MainThread Delta
TASK 1: MainThread Epsilon
TASK 1 done!
TASK 2: MainThread 1
TASK 2: MainThread 2
TASK 2: MainThread 3
TASK 2: MainThread 4
TASK 2: MainThread 5
TASK 2: MainThread 6
TASK 2: MainThread 7
TASK 2: MainThread 8
TASK 2: MainThread 9
TASK 2: MainThread 10
TASK 2 done!
```

## 9.1B - Kicking off both processes simultaneously

This would go much faster if we kick off both "Task 1" and "Task 2" simultaneously. We can start the Subscription for "Task 1" and then immediately move on to starting "Task 2", without waiting for "Task 1" to finish.
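
As a rough illustration of that idea outside of RxPY, the sketch below uses only the standard library `threading` module. The `run_task()` helper, the shortened item lists, and the fixed 0.05 second delay are assumptions made for this illustration; it is not how RxPY schedulers are implemented, only a picture of two slow pipelines running side by side.

```python
import threading
import time


def run_task(label, items, results):
    # Stand-in for one Observable chain: process each item slowly, in order
    for item in items:
        time.sleep(0.05)  # simulated "intense calculation"
        results.append((label, threading.current_thread().name, item))


results = []  # list.append is atomic in CPython, so both threads can share it
task1 = threading.Thread(target=run_task, args=("TASK 1", ["Alpha", "Beta", "Gamma"], results))
task2 = threading.Thread(target=run_task, args=("TASK 2", [1, 2, 3, 4], results))

# start() returns immediately, so Task 2 begins without waiting for Task 1
task1.start()
task2.start()

# wait for both tasks to finish before printing what happened
task1.join()
task2.join()

for label, thread_name, item in results:
    print("{0}: {1} {2}".format(label, thread_name, item))
```

Each task still handles its own items in order, but the two tasks typically interleave with each other, and neither runs on `MainThread`.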

In advance, we can create a `ThreadPoolScheduler` that holds a number of threads equaling the _number of CPUs on your computer_ + 1. If your computer has 4 cores, the `ThreadPoolScheduler` will have 5 threads. The reason for the extra thread is to utilize any idle time of the other threads. To make the Observables work on this `ThreadPoolScheduler`, we can pass it to a `subscribe_on()` operator anywhere in the chain. The `subscribe_on()`, no matter where it is in the chain, will instruct the source Observable which thread to push items on.

> You are welcome to experiment and specify your own arbitrary number of threads. Just keep in mind there will be a point of diminishing returns.

The code below will execute all the above:

```python
from rx import Observable
from rx.concurrency import ThreadPoolScheduler
from threading import current_thread
import multiprocessing, time, random


def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5,20) * .1)
    return value

# calculate number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
pool_scheduler = ThreadPoolScheduler(optimal_thread_count)

print("We are using {0} threads".format(optimal_thread_count))

# Create Task 1
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe_on(pool_scheduler) \
    .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 1 done!"))

# Create Task 2
Observable.range(1,10) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe_on(pool_scheduler) \
    .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 2 done!"))

input("Press any key to exit\n")
```

**OUTPUT (May not match yours):**

```
TASK 1: Thread-1 Alpha
TASK 2: Thread-2 1
TASK 1: Thread-1 Beta
TASK 1: Thread-1 Gamma
TASK 2: Thread-2 2
TASK 2: Thread-2 3
TASK 1: Thread-1 Delta
TASK 2: Thread-2 4
TASK 1: Thread-1 Epsilon
TASK 1 done!
TASK 2: Thread-2 5
TASK 2: Thread-2 6
TASK 2: Thread-2 7
TASK 2: Thread-2 8
TASK 2: Thread-2 9
TASK 2: Thread-2 10
TASK 2 done!
```

We use the `input()` function to hold the `MainThread` and keep the application alive until a key is pressed, allowing the Observables to fire. Notice how the emissions of Task 1 and Task 2 are interleaved, indicating they are both working at the same time. If we did not have the `subscribe_on()` calls, "Task 1" would have to finish before "Task 2" could start, because they both would use the default `ImmediateScheduler` as shown earlier.

Notice also that "Task 1" requested a thread from our `ThreadPoolScheduler` and got `Thread-1`, and "Task 2" got `Thread-2`. They both will continue to use these threads until `on_completed()` is called on their Subscribers. Then the threads will be given back to the `ThreadPoolScheduler` so they can be used again later.


## 9.2 - Using `observe_on()` to redirect in the middle of the chain

Not all source Observables will respect a `subscribe_on()` you specify. This is especially true for time-driven sources like `Observable.interval()` which will use the `TimeoutScheduler` and effectively ignore any `subscribe_on()` you try to call.
However, although you cannot instruct the source to emit on a different scheduler, you can specify a different scheduler to be used _at a certain point_ in the `Observable` chain by using `observe_on()`.

Let's create a third process called "Task 3". The source will be an `Observable.interval()` which will emit on the `TimeoutScheduler`. After each emitted number is multiplied by 100, the emission is then moved to the `ThreadPoolScheduler` via the `observe_on()` operator. This means for the remaining operators, the emissions will be passed on the `ThreadPoolScheduler`. Unlike `subscribe_on()`, the placement of `observe_on()` does matter as it will redirect to a different scheduler _at that point_ in the chain.

```python
from rx import Observable
from rx.concurrency import ThreadPoolScheduler
from threading import current_thread
import multiprocessing, time, random

def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5,20) * .1)
    return value

# calculate number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
pool_scheduler = ThreadPoolScheduler(optimal_thread_count)

# Create Task 1
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe_on(pool_scheduler) \
    .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 1 done!"))

# Create Task 2
Observable.range(1,10) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe_on(pool_scheduler) \
    .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 2 done!"))

# Create Task 3, which is infinite
Observable.interval(1000) \
    .map(lambda i: i * 100) \
    .observe_on(pool_scheduler) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe(on_next=lambda i: print("TASK 3: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e))

input("Press any key to exit\n")
```


**OUTPUT (May not match yours):**

```
TASK 2: Thread-2 1
TASK 1: Thread-1 Alpha
TASK 1: Thread-1 Beta
TASK 3: Thread-4 0
TASK 2: Thread-2 2
TASK 1: Thread-1 Gamma
TASK 3: Thread-4 100
TASK 1: Thread-1 Delta
TASK 2: Thread-2 3
TASK 3: Thread-6 200
TASK 1: Thread-1 Epsilon
TASK 1 done!
TASK 3: Thread-13 300
TASK 2: Thread-2 4
TASK 3: Thread-15 400
TASK 2: Thread-2 5
TASK 3: Thread-4 500
TASK 2: Thread-2 6
TASK 3: Thread-4 600
TASK 2: Thread-2 7
TASK 3: Thread-4 700
TASK 2: Thread-2 8
TASK 3: Thread-4 800
TASK 2: Thread-2 9
TASK 3: Thread-4 900
TASK 3: Thread-4 1000
TASK 2: Thread-2 10
TASK 2 done!
TASK 3: Thread-4 1100
TASK 3: Thread-4 1200
TASK 3: Thread-4 1300
TASK 3: Thread-4 1400
...
```

Unlike `subscribe_on()`, the `observe_on()` may use a different thread for each emission rather than reserving one thread for all emissions. You can use as many `observe_on()` calls as you like in an `Observable` chain to redirect emissions to different thread pools at different points in the chain. But you can only have one `subscribe_on()`.

> You can use the `do_action()` to essentially put Subscribers in the middle of the Observable chain, often for debugging purposes.
> This can be helpful to print the current thread at different points in the `Observable` chain. Refer to the Appendix to learn more.


## 9.3 - Parallelization

An `Observable` will only process one item at a time. However, we can use a `subscribe_on()` or an `observe_on()` in a `flat_map()` and do multiple operations in parallel _within_ that `flat_map()`.

For instance, say I have 10 strings I need to process. Because our `intense_calculation()` will take 0.5 to 2.0 seconds to process each emission, this could take up to 20 seconds.

```python
from rx import Observable
from threading import current_thread
import time, random

def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5,20) * .1)
    return value

# Process every item sequentially on a single thread
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \
    .map(lambda s: intense_calculation(s)) \
    .subscribe(on_next=lambda s: print("{0} {1}".format(current_thread().name, s)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 1 done!"))


input("Press any key to exit\n")
```

This would go much faster if we processed multiple emissions at a time rather than one at a time.

My computer has 8 cores, but let's use Python to count the number of cores dynamically. Let's set a `ThreadPoolScheduler` to have that many threads (plus one) according to our rough optimal formula. Rather than process 1 item at a time, I can now process 9 at a time, which will yield a much faster completion.
I just need to make sure the expensive operators happen within a `flat_map()`, starting with that single emission wrapped in an `Observable.just()` and scheduled using `subscribe_on()`.


```python
from rx import Observable
from rx.concurrency import ThreadPoolScheduler
from threading import current_thread
import multiprocessing, time, random

def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5,20) * .1)
    return value

# calculate number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
pool_scheduler = ThreadPoolScheduler(optimal_thread_count)

# Create Parallel Process
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \
    .flat_map(lambda s:
        Observable.just(s).subscribe_on(pool_scheduler).map(lambda s: intense_calculation(s))
    ) \
    .subscribe(on_next=lambda i: print("{0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("PROCESS done!"))


input("Press any key to exit\n")
```

**OUTPUT:**

```
Press any key to exit
Thread-4 Delta
Thread-6 Zeta
Thread-1 Alpha
Thread-2 Beta
Thread-9 Iota
Thread-3 Gamma
Thread-8 Theta
Thread-4 Kappa
Thread-7 Eta
Thread-5 Epsilon
PROCESS done!
```

In this run, the whole process took less than 3 seconds! Of course, the 10 items are now racing each other and complete in a random order. Only 9 threads are available, so the 10th item must wait for one of the first 9 to complete. It looks like this item was `Kappa`, which received `Thread-4` from `Delta` after it was done.
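
The "ten items, nine threads" behavior can also be sketched with the standard library alone. This is only an analogy under stated assumptions, not the RxPY mechanism: `ThreadPoolExecutor` stands in for the `ThreadPoolScheduler`, `as_completed()` plays the role of the merge performed by `flat_map()`, and the worker count of 9 and the fixed 0.05 second delay are example values chosen for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
import time


def intense_calculation(value):
    # short fixed delay standing in for slow work; also report the worker thread
    time.sleep(0.05)
    return value, threading.current_thread().name


letters = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon",
           "Zeta", "Eta", "Theta", "Iota", "Kappa"]

# 9 workers for 10 items: the tenth item has to wait for a free worker
with ThreadPoolExecutor(max_workers=9) as pool:
    futures = [pool.submit(intense_calculation, s) for s in letters]
    # as_completed() yields each result as soon as it is ready, in completion order
    finished = [f.result() for f in as_completed(futures)]

for value, thread_name in finished:
    print("{0} {1}".format(thread_name, value))

# distinct worker threads that actually did the work
worker_names = {name for _, name in finished}
```

Because there are more items than workers, at least one thread name has to appear twice in the printout, just as `Thread-4` handled both `Delta` and `Kappa` above.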

Parallelization using `flat_map()` (or `merge_all()`) can greatly increase performance if each emission must go through an expensive operation. Just wrap that emission into an `Observable.just()`, schedule it with `subscribe_on()` or `observe_on()` (preferably `subscribe_on()` if possible), and then make all the expensive operations happen inside the `flat_map()`.

The reason each emission must be broken into its own `Observable` is that an `Observable` is sequential and cannot be parallelized. But you can take multiple Observables and merge them into a single Observable, even if they are working on different threads. The merged Observable will only push out items on one thread, but the items inside `flat_map()` can process in parallel.

## 9.4 - Redirecting Work with `switch_map()`

Imagine you have an `Observable` and you use `flat_map()` to yield emissions from another `Observable`. However, say you wanted to _only_ pursue the `Observable` for the latest emission, and kill any previous Observables to stop their emissions coming out of `flat_map()`.

You can achieve this with `switch_map()`. It operates much like a `flat_map()`, but will only fire items for the latest emission. All previous Observables derived from previous emissions will be unsubscribed.

This example is slightly contrived, but let's say we have a finite `Observable` emitting strings. We want an `Observable.interval()` to emit every 6 seconds, and have each emission flat map to our `Observable` of strings, which are artificially slowed by `intense_calculation()`. But instead of using `flat_map()`, we can use `switch_map()` to only chase after the latest `Observable` created off each interval emission and unsubscribe previous ones.

We also need to parallelize using `subscribe_on()` so each Observable within the `switch_map()` happens on a different thread.

```python
from rx import Observable
from rx.concurrency import ThreadPoolScheduler
from threading import current_thread
import multiprocessing, time, random


def intense_calculation(value):
    # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation
    time.sleep(random.randint(5, 20) * .1)
    return value

# calculate number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
pool_scheduler = ThreadPoolScheduler(optimal_thread_count)

strings = Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "Kappa"])

Observable.interval(6000) \
    .switch_map(lambda i: strings.map(lambda s: intense_calculation(s)).subscribe_on(pool_scheduler)) \
    .subscribe(on_next = lambda s: print("Received {0} on {1}".format(s, current_thread().name)),
               on_error = lambda e: print(e))


input("Press any key to exit\n")
```


**OUTPUT (May Vary):**

```
Press any key to exit
Received Alpha on Thread-2
Received Beta on Thread-2
Received Gamma on Thread-2
Received Delta on Thread-2
Received Alpha on Thread-4
Received Beta on Thread-4
Received Gamma on Thread-4
Received Alpha on Thread-6
Received Beta on Thread-6
Received Gamma on Thread-6
Received Delta on Thread-6
Received Epsilon on Thread-6
Received Alpha on Thread-2
...
```


Using `switch_map()` is a convenient way to cancel current work when new work comes in, rather than queuing up work. This is desirable if you are only concerned with the latest data or want to cancel obsolete processing.
If you are scraping web data on a schedule using `Observable.interval()`, but a scrape instance takes too long and a new scrape request comes in, you can cancel that scrape and start the next one.


# Appendix

## 1 - Deferred Observables

A behavior to be aware of with `Observable.from_()` and other functions that create Observables is they may not reflect changes that happen to their sources.

For instance, if you have an `Observable.range()` built off two variables `x` and `y`, and one of the variables changes later, this change will not be captured by the source.

```python
from rx import Observable

x = 1
y = 5

# range() reads x and y once, at declaration time
integers = Observable.range(x, y)
integers.subscribe(lambda i: print(i))

print("\nSetting y = 10\n")
y = 10

integers.subscribe(lambda i: print(i))
```

**OUTPUT:**

```
1
2
3
4
5

Setting y = 10

1
2
3
4
5
```

Using `Observable.defer()` allows you to create a new `Observable` from scratch each time it is subscribed to, thereby capturing anything that might have changed about its source. Just supply how to create the `Observable` through a lambda.

```python
from rx import Observable

x = 1
y = 5

integers = Observable.defer(lambda: Observable.range(x, y))
integers.subscribe(lambda i: print(i))

print("\nSetting y = 10\n")
y = 10

integers.subscribe(lambda i: print(i))
```

**OUTPUT:**

```
1
2
3
4
5

Setting y = 10

1
2
3
4
5
6
7
8
9
10
```

The lambda argument ensures the `Observable` source declaration is rebuilt each time it is subscribed to.
This is especially helpful to use with data sources that can only be iterated once, as opposed to calling a helper function for each Subscriber (this was covered in Section VII):

```python
def get_all_customers():
    stmt = text("SELECT * FROM CUSTOMER")
    return Observable.from_(conn.execute(stmt))
```

We can actually create an `Observable` that is truly reusable for multiple Subscribers.

```python
stmt = text("SELECT * FROM CUSTOMER")

# Will support multiple subscribers and coldly replay to each one
all_customers = Observable.defer(lambda: Observable.from_(conn.execute(stmt)))
```


## 2 - Debugging with `do_action()`

A helpful operator that provides insight into any point in the `Observable` chain is the `do_action()`. This essentially allows us to insert a `Subscriber` after any operator we want, and pass one or more of `on_next()`, `on_completed()`, and `on_error()` actions.

```python
from rx import Observable

Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
    .map(lambda s: len(s)) \
    .do_action(on_next=lambda i: print("Receiving {0} from map()".format(i)),
               on_completed=lambda: print("map() is done!")) \
    .to_list() \
    .subscribe(on_next=lambda l: print("Subscriber received {0}".format(l)),
               on_completed=lambda: print("Subscriber done!"))
```

**OUTPUT:**

```
Receiving 5 from map()
Receiving 4 from map()
Receiving 5 from map()
Receiving 5 from map()
Receiving 7 from map()
map() is done!
Subscriber received [5, 4, 5, 5, 7]
Subscriber done!
```

Above, we declare a `do_action()` right after the `map()` operation emitting the lengths. We print each length emission before it goes to the `to_list()`.
Finally, `on_completed` is called and prints a notification that `map()` is not giving any more items. Then it pushes the completion event to the `to_list()`, which then pushes the `List` to the `Subscriber`. Then `to_list()` calls `on_completed()` up to the `Subscriber` _after_ the `List` is emitted.

Use `do_action()` when you need to "peek" inside any point in the `Observable` chain, either for debugging or to quickly call actions at that point.

## 3 - Subjects

Another way to create an `Observable` is by declaring a `Subject`. A `Subject` is both an `Observable` and `Observer`, and you can call its `Observer` functions to push items through it and up to any Subscribers at any time.

```python
from rx.subjects import Subject

subject = Subject()

subject.filter(lambda i: i < 100) \
    .map(lambda i: i * 1000) \
    .subscribe(lambda i: print(i))

subject.on_next(10)
subject.on_next(50)
subject.on_next(105)
subject.on_next(87)

subject.on_completed()
```

**OUTPUT:**

```
10000
50000
87000
```

While they seem convenient, Subjects are often discouraged. They can easily encourage antipatterns and are prone to abuse. They also are difficult to compose against and do not respect `subscribe_on()`. It is better to create Observables that strictly come from one defined source, rather than be openly mutable and have anything push items to them at any time. Use Subjects with discretion.

## 4 - Error Recovery

There are a number of error recovery operators, but we will cover two helpful ones. Say you have an `Observable` operation that will ultimately attempt to divide by zero and therefore throw an error.

```python
from rx import Observable

Observable.from_([5, 6, 2, 0, 1, 35]) \
    .map(lambda i: 5 / i) \
    .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
```

**OUTPUT:**

```
1.0
0.8333333333333334
2.5
division by zero
```

There are multiple ways to handle this. Of course, the best way is to be proactive and use `filter()` to hold back any `0` value emissions. But for the sake of example, let's say we did not expect this error and we want a way to handle any errors we have not considered.

One way is to use `on_error_resume_next()` which will switch to an alternate `Observable` source in the event there is an error. This is somewhat contrived, but if we encounter an error we can switch to emitting an `Observable.range()`.

```python
from rx import Observable

Observable.from_([5, 6, 2, 0, 1, 35]) \
    .map(lambda i: 5 / i) \
    .on_error_resume_next(Observable.range(1,10)) \
    .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
```

**OUTPUT:**

```
1.0
0.8333333333333334
2.5
1
2
3
4
5
6
7
8
9
10
```

It probably would be more realistic to pass an `Observable.empty()` instead to simply stop emissions once an error happens.

```python
from rx import Observable

Observable.from_([5, 6, 2, 0, 1, 35]) \
    .map(lambda i: 5 / i) \
    .on_error_resume_next(Observable.empty()) \
    .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
```

**OUTPUT:**

```
1.0
0.8333333333333334
2.5
```

Although this is not a good example for it, you can also use `retry()` to re-attempt subscribing to the `Observable` and hope the next set of emissions succeeds without error. You typically should pass an integer argument to specify the number of retry attempts before it gives up and lets the error go to the `Subscriber`. If you do not, it will retry an infinite number of times.

```python
from rx import Observable

Observable.from_([5, 6, 2, 0, 1, 35]) \
    .map(lambda i: 5 / i) \
    .retry(3) \
    .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
```

**OUTPUT:**

```
1.0
0.8333333333333334
2.5
1.0
0.8333333333333334
2.5
1.0
0.8333333333333334
2.5
division by zero
```

You can also use this in combination with the `delay()` operator to hold off subscribing for a fixed time period, which can be helpful for intermittent connectivity problems.


## 5 - `combine_latest()`

There is one operation for merging multiple Observables together we did not cover: `combine_latest()`. It behaves much like `zip()` but will only combine the _latest_ emissions from each source whenever one of them emits something. This is especially helpful for hot event sources, such as user inputs in a UI, where you do not care what the previous emissions are.
2542 | 2543 | Below, we have two interval sources put in `combine_latest()`: `source1` emitting every 3 seconds and `source2` every 1 second. Notice that `source2` is going to emit a lot faster, but rather than get queued up like in `zip()` waiting for an emission from `source1`, it is going to pair with only the latest emission from `source1`. It is not going to wait for any emission to be zipped with. Conversely, when `source1` does emit something it is going to pair with the latest emission from `source2`, not wait for an emission. 2544 | 2545 | 2546 | ```python 2547 | from rx import Observable 2548 | 2549 | source1 = Observable.interval(3000).map(lambda i: "SOURCE 1: {0}".format(i)) 2550 | source2 = Observable.interval(1000).map(lambda i: "SOURCE 2: {0}".format(i)) 2551 | 2552 | Observable.combine_latest(source1, source2, lambda s1,s2: "{0}, {1}".format(s1,s2)) \ 2553 | .subscribe(lambda s: print(s)) 2554 | 2555 | input("Press any key to quit\n") 2556 | ``` 2557 | 2558 | **OUTPUT:** 2559 | 2560 | ``` 2561 | Press any key to quit 2562 | SOURCE 1: 0, SOURCE 2: 1 2563 | SOURCE 1: 0, SOURCE 2: 2 2564 | SOURCE 1: 0, SOURCE 2: 3 2565 | SOURCE 1: 0, SOURCE 2: 4 2566 | SOURCE 1: 1, SOURCE 2: 4 2567 | SOURCE 1: 1, SOURCE 2: 5 2568 | SOURCE 1: 1, SOURCE 2: 6 2569 | SOURCE 1: 1, SOURCE 2: 7 2570 | SOURCE 1: 2, SOURCE 2: 7 2571 | SOURCE 1: 2, SOURCE 2: 8 2572 | SOURCE 1: 2, SOURCE 2: 9 2573 | SOURCE 1: 2, SOURCE 2: 10 2574 | SOURCE 1: 3, SOURCE 2: 10 2575 | SOURCE 1: 3, SOURCE 2: 11 2576 | SOURCE 1: 3, SOURCE 2: 12 2577 | SOURCE 1: 3, SOURCE 2: 13 2578 | ``` 2579 | 2580 | Again, this is a helpful alternative for `zip()` if you want to emit the _latest combinations_ from two or more Observables. 
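To make the pairing difference concrete without timers, here is a minimal plain-Python sketch that does not use RxPY at all. It replays a fixed, time-ordered list of emissions through two small simulators. The names `simulate_zip`, `simulate_combine_latest`, and `events` are illustrative assumptions, not part of the RxPY API, and the simulators only approximate the pairing rules described above.

```python
# Plain-Python sketch (no RxPY) contrasting zip() and combine_latest() pairing.
# `events` is a hypothetical, time-ordered list of (source_index, value) tuples.


def simulate_zip(events, source_count=2):
    # one queue of unpaired values per source
    queues = [[] for _ in range(source_count)]
    out = []
    for idx, value in events:
        queues[idx].append(value)
        # emit only once every source has an unpaired value waiting
        if all(queues):
            out.append(tuple(q.pop(0) for q in queues))
    return out


def simulate_combine_latest(events, source_count=2):
    missing = object()  # sentinel meaning "this source has not emitted yet"
    latest = [missing] * source_count
    out = []
    for idx, value in events:
        latest[idx] = value
        # emit on every event once each source has emitted at least once
        if all(v is not missing for v in latest):
            out.append(tuple(latest))
    return out


# source 1 ("B") emits faster than source 0 ("A")
events = [(1, "B0"), (1, "B1"), (0, "A0"), (1, "B2"), (1, "B3"), (0, "A1")]

zipped = simulate_zip(events)
combined = simulate_combine_latest(events)

print(zipped)    # [('A0', 'B0'), ('A1', 'B1')]
print(combined)  # [('A0', 'B1'), ('A0', 'B2'), ('A0', 'B3'), ('A1', 'B3')]
```

In the zip simulation the fast source's values queue up and are paired strictly in order, while the combine-latest simulation emits on every event and always reuses the most recent value from the other source.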
2581 | -------------------------------------------------------------------------------- /class_notes/class_notes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/class_notes/class_notes.pdf -------------------------------------------------------------------------------- /code_examples/4.1A_declaring_an_observable.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | -------------------------------------------------------------------------------- /code_examples/4.1B_subscribing_to_an_observable.py: -------------------------------------------------------------------------------- 1 | from rx import Observable, Observer 2 | 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | 5 | 6 | class MySubscriber(Observer): 7 | def on_next(self, value): 8 | print(value) 9 | 10 | def on_completed(self): 11 | print("Completed!") 12 | 13 | def on_error(self, error): 14 | print("Error occurred: {0}".format(error)) 15 | 16 | 17 | letters.subscribe(MySubscriber()) 18 | -------------------------------------------------------------------------------- /code_examples/4.1C_subscribing_with_lambdas.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | 5 | letters.subscribe(on_next = lambda value: print(value), 6 | on_completed = lambda: print("Completed!"), 7 | on_error = lambda error: print("Error occurred: {0}".format(error))) 8 | 9 | # to use just on_next: 10 | # letters.subscribe(on_next = lambda value: print(value)) 11 | # letters.subscribe(lambda value: print("Received: {0}".format(value))) 12 | 
-------------------------------------------------------------------------------- /code_examples/4.2A_some_basic_operators.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 4 | .map(lambda s: len(s)) \ 5 | .filter(lambda i: i >= 5) \ 6 | .subscribe(lambda value: print(value)) 7 | -------------------------------------------------------------------------------- /code_examples/4.2B_range_and_just.py: -------------------------------------------------------------------------------- 1 | from rx import Observable, Observer 2 | 3 | # Using Observable.range() 4 | letters = Observable.range(1,10) 5 | letters.subscribe(lambda value: print(value)) 6 | 7 | # Using Observable.just() 8 | greeting = Observable.just("Hello World!") 9 | greeting.subscribe(lambda value: print(value)) 10 | -------------------------------------------------------------------------------- /code_examples/4.2C_observable_empty.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.empty() \ 4 | .subscribe(on_next= lambda s: print(s), 5 | on_completed= lambda: print("Done!") 6 | ) 7 | -------------------------------------------------------------------------------- /code_examples/4.3A_creating_observable_from_scratch.py: -------------------------------------------------------------------------------- 1 | from rx import Observable, Observer 2 | 3 | def push_numbers(observer): 4 | observer.on_next(100) 5 | observer.on_next(300) 6 | observer.on_next(500) 7 | observer.on_completed() 8 | 9 | Observable.create(push_numbers).subscribe(on_next = lambda i: print(i)) 10 | -------------------------------------------------------------------------------- /code_examples/4.3B_interval_observable.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 
3 | Observable.interval(1000) \ 4 | .map(lambda i: "{0} Mississippi".format(i)) \ 5 | .subscribe(lambda s: print(s)) 6 | 7 | # Keep application alive until user presses a key 8 | input("Press any key to quit") 9 | -------------------------------------------------------------------------------- /code_examples/4.3C_unsubscribing.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import time 3 | 4 | disposable = Observable.interval(1000) \ 5 | .map(lambda i: "{0} Mississippi".format(i)) \ 6 | .subscribe(lambda s: print(s)) 7 | 8 | # sleep 5 seconds so Observable can fire 9 | time.sleep(5) 10 | 11 | # disconnect the Subscriber 12 | print("Unsubscribing!") 13 | disposable.dispose() 14 | 15 | # sleep a bit longer to prove no more emissions are coming 16 | time.sleep(5) 17 | -------------------------------------------------------------------------------- /code_examples/4.4A_twitter_observable.py: -------------------------------------------------------------------------------- 1 | from tweepy.streaming import StreamListener 2 | from tweepy import OAuthHandler 3 | from tweepy import Stream 4 | import json 5 | from rx import Observable 6 | 7 | # Variables that contain the user credentials to access the Twitter API 8 | access_token = "CONFIDENTIAL" 9 | access_token_secret = "CONFIDENTIAL" 10 | consumer_key = "CONFIDENTIAL" 11 | consumer_secret = "CONFIDENTIAL" 12 | 13 | 14 | def tweets_for(topics): 15 | 16 | def observe_tweets(observer): 17 | class TweetListener(StreamListener): 18 | def on_data(self, data): 19 | observer.on_next(data) 20 | return True 21 | 22 | def on_error(self, status): 23 | observer.on_error(status) 24 | 25 | # This handles Twitter authentication and the connection to the Twitter Streaming API 26 | l = TweetListener() 27 | auth = OAuthHandler(consumer_key, consumer_secret) 28 | auth.set_access_token(access_token, access_token_secret) 29 | stream = Stream(auth, l) 30 | 
stream.filter(track=topics) 31 | 32 | return Observable.create(observe_tweets).share() 33 | 34 | 35 | topics = ['Britain','France'] 36 | 37 | tweets_for(topics).map(lambda d: json.loads(d)) \ 38 | .filter(lambda map: "text" in map) \ 39 | .map(lambda map: map["text"].strip()) \ 40 | .subscribe(lambda s: print(s)) 41 | -------------------------------------------------------------------------------- /code_examples/4.4B_cold_observable.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | 5 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 6 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 7 | -------------------------------------------------------------------------------- /code_examples/5.1A_filter.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 4 | .filter(lambda s: len(s) >= 5) \ 5 | .subscribe(lambda s: print(s)) 6 | -------------------------------------------------------------------------------- /code_examples/5.1B_take.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 4 | .filter(lambda s: len(s) >= 5) \ 5 | .take(2) \ 6 | .subscribe(lambda s: print(s)) 7 | -------------------------------------------------------------------------------- /code_examples/5.1C_take_while.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_([2,5,21,5,2,1,5,63,127,12]) \ 4 | .take_while(lambda i: i < 100) \ 5 | .subscribe(on_next = lambda i: print(i), on_completed = lambda: print("Done!")) 6 | 
-------------------------------------------------------------------------------- /code_examples/5.2A_distinct.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \ 4 | .map(lambda s: len(s)) \ 5 | .distinct() \ 6 | .subscribe(lambda i: print(i)) 7 | -------------------------------------------------------------------------------- /code_examples/5.2B_distinct_with_mapping.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \ 4 | .distinct(lambda s: len(s)) \ 5 | .subscribe(lambda i: print(i)) 6 | -------------------------------------------------------------------------------- /code_examples/5.2C_distinct_until_changed.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \ 4 | .map(lambda s: len(s)) \ 5 | .distinct_until_changed() \ 6 | .subscribe(lambda i: print(i)) 7 | 8 | 9 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \ 10 | .distinct_until_changed(lambda s: len(s)) \ 11 | .subscribe(lambda i: print(i)) 12 | -------------------------------------------------------------------------------- /code_examples/5.3A_count.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 4 | .filter(lambda s: len(s) != 5) \ 5 | .count() \ 6 | .subscribe(lambda i: print(i)) 7 | -------------------------------------------------------------------------------- /code_examples/5.3B_reduce.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 
| Observable.from_([4,76,22,66,881,13,35]) \ 4 | .filter(lambda i: i < 100) \ 5 | .reduce(lambda total, value: total + value) \ 6 | .subscribe(lambda s: print(s)) 7 | -------------------------------------------------------------------------------- /code_examples/5.3C_scan.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_([4,76,22,66,881,13,35]) \ 4 | .scan(lambda total, value: total + value) \ 5 | .subscribe(lambda s: print(s)) 6 | -------------------------------------------------------------------------------- /code_examples/5.4A_to_list.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 4 | .to_list() \ 5 | .subscribe(lambda s: print(s)) 6 | -------------------------------------------------------------------------------- /code_examples/5.4B_to_dict.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \ 4 | .to_dict(lambda s: s[0]) \ 5 | .subscribe(lambda i: print(i)) 6 | 7 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \ 8 | .to_dict(lambda s: s[0], lambda s: len(s)) \ 9 | .subscribe(lambda i: print(i)) 10 | -------------------------------------------------------------------------------- /code_examples/6.1A_merge.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"]) 5 | 6 | Observable.merge(source1,source2) \ 7 | .subscribe(lambda s: print(s)) 8 | -------------------------------------------------------------------------------- /code_examples/6.1B_merge_interval.py: 
-------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i)) 4 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i)) 5 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i)) 6 | 7 | Observable.merge(source1, source2, source3) \ 8 | .subscribe(lambda s: print(s)) 9 | 10 | # keep application alive until user presses a key 11 | input("Press any key to quit\n") 12 | -------------------------------------------------------------------------------- /code_examples/6.1C_merge_all.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i)) 4 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i)) 5 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i)) 6 | 7 | Observable.from_([source1,source2,source3]) \ 8 | .merge_all() \ 9 | .subscribe(lambda s: print(s)) 10 | 11 | # keep application alive until user presses a key 12 | input("Press any key to quit\n") 13 | -------------------------------------------------------------------------------- /code_examples/6.1D_merge_all_continued.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"] 4 | 5 | Observable.from_(items) \ 6 | .map(lambda s: Observable.from_(s.split("/"))) \ 7 | .merge_all() \ 8 | .map(lambda s: int(s)) \ 9 | .subscribe(lambda i: print(i)) 10 | -------------------------------------------------------------------------------- /code_examples/6.1E_flat_map.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | items = ["134/34/235/132/77", 
"64/22/98/112/86/11", "66/08/34/778/22/12"] 4 | 5 | Observable.from_(items) \ 6 | .flat_map(lambda s: Observable.from_(s.split("/"))) \ 7 | .map(lambda s: int(s)) \ 8 | .subscribe(lambda i: print(i)) 9 | -------------------------------------------------------------------------------- /code_examples/6.2A_concat.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"]) 5 | 6 | Observable.concat(source1,source2) \ 7 | .subscribe(lambda s: print(s)) 8 | -------------------------------------------------------------------------------- /code_examples/6.2B_concat_all.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"]) 5 | 6 | Observable.concat(source1,source2) \ 7 | .subscribe(lambda s: print(s)) 8 | -------------------------------------------------------------------------------- /code_examples/6.2C_zip.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | letters = Observable.from_(["A","B","C","D","E","F"]) 4 | numbers = Observable.range(1,5) 5 | 6 | Observable.zip(letters,numbers, lambda l,n: "{0}-{1}".format(l,n)) \ 7 | .subscribe(lambda i: print(i)) 8 | -------------------------------------------------------------------------------- /code_examples/6.3D_spacing_emissions.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) 4 | intervals = Observable.interval(1000) 5 | 6 | Observable.zip(letters,intervals, lambda s,i: s) \ 7 | .subscribe(lambda s: print(s)) 8 | 9 | 
input("Press any key to quit\n") 10 | -------------------------------------------------------------------------------- /code_examples/6.4A_grouping_into_lists.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"] 4 | 5 | Observable.from_(items) \ 6 | .group_by(lambda s: len(s)) \ 7 | .flat_map(lambda grp: grp.to_list()) \ 8 | .subscribe(lambda i: print(i)) 9 | -------------------------------------------------------------------------------- /code_examples/6.4B_grouping_length_counts.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"] 4 | 5 | Observable.from_(items) \ 6 | .group_by(lambda s: len(s)) \ 7 | .flat_map(lambda grp: 8 | grp.count().map(lambda ct: (grp.key, ct)) 9 | ) \ 10 | .to_dict(lambda key_value: key_value[0], lambda key_value: key_value[1]) \ 11 | .subscribe(lambda i: print(i)) 12 | -------------------------------------------------------------------------------- /code_examples/7.1A_reading_text_file.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | 4 | def read_lines(file_name): 5 | file = open(file_name) 6 | 7 | return Observable.from_(file) \ 8 | .map(lambda l: l.strip()) \ 9 | .filter(lambda l: l != "") 10 | 11 | 12 | read_lines("bbc_news_article.txt").subscribe(lambda s: print(s)) 13 | -------------------------------------------------------------------------------- /code_examples/7.1B_reading_web_url.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from urllib.request import urlopen 3 | 4 | 5 | def read_request(link): 6 | f = urlopen(link) 7 | 8 | return Observable.from_(f) \ 9 | .map(lambda s: s.decode("utf-8").strip()) \ 10 | 11 | 
read_request("https://goo.gl/rIaDyM") \ 12 | .subscribe(lambda s: print(s)) 13 | -------------------------------------------------------------------------------- /code_examples/7.1C_recursive_file_iteration.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import os 3 | 4 | 5 | def recursive_files_in_directory(folder): 6 | 7 | def emit_files_recursively(observer): 8 | for root, directories, filenames in os.walk(folder): 9 | for directory in directories: 10 | observer.on_next(os.path.join(root, directory)) 11 | for filename in filenames: 12 | observer.on_next(os.path.join(root, filename)) 13 | 14 | observer.on_completed() 15 | 16 | return Observable.create(emit_files_recursively) 17 | 18 | 19 | recursive_files_in_directory('/home/thomas/Desktop/bbc_data_sets') \ 20 | .filter(lambda f: f.endswith('.txt')) \ 21 | .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e)) 22 | 23 | -------------------------------------------------------------------------------- /code_examples/7.2A_reading_sql_query.py: -------------------------------------------------------------------------------- 1 | from sqlalchemy import create_engine, text 2 | from rx import Observable 3 | 4 | engine = create_engine('sqlite:///rexon_metals.db') 5 | conn = engine.connect() 6 | 7 | 8 | def get_all_customers(): 9 | stmt = text("SELECT * FROM CUSTOMER") 10 | return Observable.from_(conn.execute(stmt)) 11 | 12 | 13 | get_all_customers() \ 14 | .map(lambda r: r[0]) \ 15 | .subscribe(lambda r: print(r)) 16 | -------------------------------------------------------------------------------- /code_examples/7.2C_merging_sql_queries.py: -------------------------------------------------------------------------------- 1 | from sqlalchemy import create_engine, text 2 | from rx import Observable 3 | 4 | engine = create_engine('sqlite:///rexon_metals.db') 5 | conn = engine.connect() 6 | 7 | 8 | def get_all_customers(): 9 | stmt = 
text("SELECT * FROM CUSTOMER") 10 | return Observable.from_(conn.execute(stmt)) 11 | 12 | 13 | def customer_for_id(customer_id): 14 | stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id") 15 | return Observable.from_(conn.execute(stmt, id=customer_id)) 16 | 17 | 18 | # Query customers with IDs 1, 3, and 5 19 | Observable.from_([1, 3, 5]) \ 20 | .flat_map(lambda id: customer_for_id(id)) \ 21 | .subscribe(lambda r: print(r)) 22 | -------------------------------------------------------------------------------- /code_examples/7.2D_writing_sql_updates.py: -------------------------------------------------------------------------------- 1 | from sqlalchemy import create_engine, text 2 | from rx import Observable 3 | 4 | 5 | engine = create_engine('sqlite:///rexon_metals.db') 6 | conn = engine.connect() 7 | 8 | 9 | def get_all_customers(): 10 | stmt = text("SELECT * FROM CUSTOMER") 11 | return Observable.from_(conn.execute(stmt)) 12 | 13 | 14 | def customer_for_id(customer_id): 15 | stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id") 16 | return Observable.from_(conn.execute(stmt, id=customer_id)) 17 | 18 | 19 | def insert_new_customer(customer_name, region, street_address, city, state, zip_code): 20 | stmt = text("INSERT INTO CUSTOMER (NAME, REGION, STREET_ADDRESS, CITY, STATE, ZIP) VALUES (" 21 | ":customer_name, :region, :street_address, :city, :state, :zip_code)") 22 | 23 | result = conn.execute(stmt, customer_name=customer_name, region=region, street_address=street_address, city=city, state=state, zip_code=zip_code) 24 | return Observable.just(result.lastrowid) 25 | 26 | # Create new customer, emit primary key ID, and query that customer 27 | insert_new_customer('RMS Materials','Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', '02201') \ 28 | .flat_map(lambda i: customer_for_id(i)) \ 29 | .subscribe(lambda s: print(s)) 30 | -------------------------------------------------------------------------------- 
/code_examples/7.3A_reading_words_from_text_file.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import re 3 | 4 | 5 | def words_from_file(file_name): 6 | file = open(file_name) 7 | 8 | # parse, clean, and push words in text file 9 | return Observable.from_(file) \ 10 | .flat_map(lambda s: Observable.from_(s.split())) \ 11 | .map(lambda w: re.sub(r'[^\w]', '', w)) \ 12 | .filter(lambda w: w != "") \ 13 | .map(lambda w: w.lower()) 14 | 15 | article_file = "bbc_news_article.txt" 16 | words_from_file(article_file).subscribe(lambda w: print(w)) 17 | -------------------------------------------------------------------------------- /code_examples/7.3B_counting_word_occurrences.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import re 3 | 4 | 5 | def words_from_file(file_name): 6 | file = open(file_name) 7 | 8 | # parse, clean, and push words in text file 9 | return Observable.from_(file) \ 10 | .flat_map(lambda s: Observable.from_(s.split())) \ 11 | .map(lambda w: re.sub(r'[^\w\s]', '', w)) \ 12 | .filter(lambda w: w != "") \ 13 | .map(lambda w: w.lower()) \ 14 | 15 | 16 | 17 | def word_counter(file_name): 18 | 19 | # count words using `group_by()` 20 | # tuple the word with the count 21 | return words_from_file(file_name) \ 22 | .group_by(lambda word: word) \ 23 | .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct))) 24 | 25 | article_file = "bbc_news_article.txt" 26 | word_counter(article_file).subscribe(lambda w: print(w)) 27 | -------------------------------------------------------------------------------- /code_examples/7.3C_scheduling_reactive_word_counter.py: -------------------------------------------------------------------------------- 1 | # Schedules a reactive process that counts the words in a text file every three seconds, 2 | # but only prints it as a dict if it has changed 3 | 4 | from rx import Observable 
5 | import re 6 | 7 | 8 | def words_from_file(file_name): 9 | file = open(file_name) 10 | 11 | # parse, clean, and push words in text file 12 | return Observable.from_(file) \ 13 | .flat_map(lambda s: Observable.from_(s.split())) \ 14 | .map(lambda w: re.sub(r'[^\w\s]', '', w)) \ 15 | .filter(lambda w: w != "") \ 16 | .map(lambda w: w.lower()) 17 | 18 | 19 | 20 | def word_counter(file_name): 21 | 22 | # count words using `group_by()` 23 | # tuple the word with the count 24 | return words_from_file(file_name) \ 25 | .group_by(lambda word: word) \ 26 | .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct))) 27 | 28 | 29 | # composes the above word_counter() into a dict 30 | def word_counter_as_dict(file_name): 31 | return word_counter(file_name).to_dict(lambda t: t[0], lambda t: t[1]) 32 | 33 | 34 | # Schedule a word count dict to be created every three seconds for an article 35 | # But only re-print if text is edited and word counts change 36 | 37 | article_file = "bbc_news_article.txt" 38 | 39 | # create a dict every three seconds, but only push if it changed 40 | Observable.interval(3000) \ 41 | .flat_map(lambda i: word_counter_as_dict(article_file)) \ 42 | .distinct_until_changed() \ 43 | .subscribe(lambda word_ct_dict: print(word_ct_dict)) 44 | 45 | # Keep alive until user presses any key 46 | input("Starting, press any key to quit\n") 47 | -------------------------------------------------------------------------------- /code_examples/8.1A_connectableobservable.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | 3 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]).publish() 4 | 5 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 6 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 7 | 8 | source.connect() 9 | -------------------------------------------------------------------------------- /code_examples/8.1B_sharing_observable.py: 
-------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import time 3 | 4 | source = Observable.interval(1000).publish() 5 | 6 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 7 | source.connect() 8 | 9 | # sleep 5 seconds, then add another subscriber 10 | time.sleep(5) 11 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 12 | 13 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/8.1C_refcount.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | import time 3 | 4 | source = Observable.interval(1000).publish().ref_count() 5 | 6 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s))) 7 | 8 | # sleep 5 seconds, then add another subscriber 9 | time.sleep(5) 10 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s))) 11 | 12 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/8.2_twitter_feed_for_topics.py: -------------------------------------------------------------------------------- 1 | from tweepy.streaming import StreamListener 2 | from tweepy import OAuthHandler 3 | from tweepy import Stream 4 | import json 5 | from rx import Observable 6 | 7 | # Variables that contain the user credentials to access the Twitter API 8 | access_token = "PUT YOURS HERE" 9 | access_token_secret = "PUT YOURS HERE" 10 | consumer_key = "PUT YOURS HERE" 11 | consumer_secret = "PUT YOURS HERE" 12 | 13 | 14 | def tweets_for(topics): 15 | def observe_tweets(observer): 16 | class TweetListener(StreamListener): 17 | def on_data(self, data): 18 | observer.on_next(data) 19 | return True 20 | 21 | def on_error(self, status): 22 | observer.on_error(status) 23 | 24 | # This handles Twitter authentication and the connection to the Twitter Streaming API 25 | l = 
TweetListener() 26 | auth = OAuthHandler(consumer_key, consumer_secret) 27 | auth.set_access_token(access_token, access_token_secret) 28 | stream = Stream(auth, l) 29 | stream.filter(track=topics) 30 | 31 | return Observable.create(observe_tweets).share() 32 | 33 | 34 | topics = ['Britain', 'France'] 35 | 36 | tweets_for(topics) \ 37 | .map(lambda d: json.loads(d)) \ 38 | .subscribe(on_next=lambda s: print(s), on_error=lambda e: print(e)) -------------------------------------------------------------------------------- /code_examples/9.1A_sequential_long_running_tasks.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from threading import current_thread 3 | import multiprocessing, time, random 4 | 5 | 6 | def intense_calculation(value): 7 | # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation 8 | time.sleep(random.randint(5,20) * .1) 9 | return value 10 | 11 | # Create Process 1 12 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 13 | .map(lambda s: intense_calculation(s)) \ 14 | .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)), 15 | on_error=lambda e: print(e), 16 | on_completed=lambda: print("PROCESS 1 done!")) 17 | 18 | # Create Process 2 19 | Observable.range(1,10) \ 20 | .map(lambda s: intense_calculation(s)) \ 21 | .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)), 22 | on_error=lambda e: print(e), 23 | on_completed=lambda: print("PROCESS 2 done!")) 24 | 25 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/9.1B_using_subscribe_on.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from rx.concurrency import ThreadPoolScheduler 3 | from threading import current_thread 4 | import multiprocessing, time, 
random 5 | 6 | 7 | def intense_calculation(value): 8 | # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation 9 | time.sleep(random.randint(5,20) * .1) 10 | return value 11 | 12 | # calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads 13 | optimal_thread_count = multiprocessing.cpu_count() + 1 14 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count) 15 | 16 | print("We are using {0} threads".format(optimal_thread_count)) 17 | 18 | # Create Process 1 19 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 20 | .map(lambda s: intense_calculation(s)) \ 21 | .subscribe_on(pool_scheduler) \ 22 | .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)), 23 | on_error=lambda e: print(e), 24 | on_completed=lambda: print("PROCESS 1 done!")) 25 | 26 | # Create Process 2 27 | Observable.range(1,10) \ 28 | .map(lambda s: intense_calculation(s)) \ 29 | .subscribe_on(pool_scheduler) \ 30 | .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)), 31 | on_error=lambda e: print(e), 32 | on_completed=lambda: print("PROCESS 2 done!")) 33 | 34 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/9.2_using_observe_on.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from rx.concurrency import ThreadPoolScheduler 3 | from threading import current_thread 4 | import multiprocessing, time, random 5 | 6 | def intense_calculation(value): 7 | # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation 8 | time.sleep(random.randint(5,20) * .1) 9 | return value 10 | 11 | # calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads 12 | optimal_thread_count = multiprocessing.cpu_count() + 1 13 | 
pool_scheduler = ThreadPoolScheduler(optimal_thread_count) 14 | 15 | # Create Process 1 16 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \ 17 | .map(lambda s: intense_calculation(s)) \ 18 | .subscribe_on(pool_scheduler) \ 19 | .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)), 20 | on_error=lambda e: print(e), 21 | on_completed=lambda: print("PROCESS 1 done!")) 22 | 23 | # Create Process 2 24 | Observable.range(1,10) \ 25 | .map(lambda s: intense_calculation(s)) \ 26 | .subscribe_on(pool_scheduler) \ 27 | .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)), on_error=lambda e: print(e), on_completed=lambda: print("PROCESS 2 done!")) 28 | 29 | # Create Process 3, which is infinite 30 | Observable.interval(1000) \ 31 | .map(lambda i: i * 100) \ 32 | .observe_on(pool_scheduler) \ 33 | .map(lambda s: intense_calculation(s)) \ 34 | .subscribe(on_next=lambda i: print("PROCESS 3: {0} {1}".format(current_thread().name, i)), on_error=lambda e: print(e)) 35 | 36 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/9.3_processing_emissions_in_parallel.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from rx.concurrency import ThreadPoolScheduler 3 | from threading import current_thread 4 | import multiprocessing, time, random 5 | 6 | 7 | def intense_calculation(value): 8 | # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation 9 | time.sleep(random.randint(5,20) * .1) 10 | return value 11 | 12 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads 13 | optimal_thread_count = multiprocessing.cpu_count() + 1 14 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count) 15 | 16 | # Create Parallel Process 17 | 
Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \ 18 | .flat_map(lambda s: 19 | Observable.just(s).subscribe_on(pool_scheduler).map(lambda s: intense_calculation(s)) 20 | ) \ 21 | .subscribe(on_next=lambda i: print("{0} {1}".format(current_thread().name, i)), 22 | on_error=lambda e: print(e), 23 | on_completed=lambda: print("PROCESS 1 done!")) 24 | 25 | 26 | input("Press any key to exit\n") -------------------------------------------------------------------------------- /code_examples/9.4_switch_map.py: -------------------------------------------------------------------------------- 1 | from rx import Observable 2 | from rx.concurrency import ThreadPoolScheduler 3 | from threading import current_thread 4 | import multiprocessing, time, random 5 | 6 | 7 | def intense_calculation(value): 8 | # sleep for a random short duration between 0.5 and 2.0 seconds to simulate a long-running calculation 9 | time.sleep(random.randint(5, 20) * .1) 10 | return value 11 | 12 | 13 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads 14 | optimal_thread_count = multiprocessing.cpu_count() + 1 15 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count) 16 | 17 | strings = Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "Kappa"]) 18 | 19 | Observable.interval(6000) \ 20 | .switch_map(lambda i: strings.map(lambda s: intense_calculation(s)).subscribe_on(pool_scheduler)) \ 21 | .subscribe(on_next=lambda s: print("Received {0} on {1}".format(s, current_thread().name)), 22 | on_error=lambda e: print(e)) 23 | 24 | input("Press any key to exit\n") 25 | -------------------------------------------------------------------------------- /code_examples/rexon_metals.db: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/code_examples/rexon_metals.db -------------------------------------------------------------------------------- /resources/bbc_news_article.txt: -------------------------------------------------------------------------------- 1 | Giant waves damage S Asia economy 2 | 3 | Governments, aid agencies, insurers and travel firms are among those counting the cost of the massive earthquake and waves that hammered southern Asia. 4 | 5 | The worst-hit areas are Sri Lanka, India, Indonesia and Thailand, with at least 23,000 people killed. Early estimates from the World Bank put the amount of aid needed at about $5bn (£2.6bn), similar to the cash offered Central America after Hurricane Mitch. Mitch killed about 10,000 people and caused damage of about $10bn in 1998. World Bank spokesman Damien Milverton told the Wall Street Journal that he expected an aid package of financing and debt relief. 6 | 7 | Tourism is a vital part of the economies of the stricken countries, providing jobs for 19 million people in the south east Asian region, according to the World Travel and Tourism Council (WTTC). In the Maldives islands, in the Indian ocean, two-thirds of all jobs depend on tourism. 8 | 9 | But the damage covers fishing, farming and businesses too, with hundreds of thousands of buildings and small boats destroyed by the waves. International agencies have pledged their support; most say it is impossible to gauge the extent of the damage yet. The International Monetary Fund (IMF) has promised rapid action to help the governments of the stricken countries cope. 10 | 11 | "The IMF stands ready to do its part to assist these nations with appropriate support in their time of need," said managing director Rodrigo Rato. Only Sri Lanka and Bangladesh currently receive IMF support, while Indonesia, the quake's epicentre, has recently graduated from IMF assistance. 
It is up to governments to decide if they want IMF help. Other agencies, such as the Asian Development Bank, have said that it is too early to comment on the amount of aid needed. There is no underestimating the size of the problem, however. The United Nations' emergency relief coordinator, Jan Egeland, said that "this may be the worst national disaster in recent history because it is affecting so many heavily populated coastal areas... so many vulnerable communities. "Many people will have [had] their livelihoods, their whole future destroyed in a few seconds." He warned that "the longer term effects may be as devastating as the tidal wave or the tsunami itself" because of the risks of epidemics from polluted drinking water. 12 | 13 | Insurers are also struggling to assess the cost of the damage, but several big players believe the final bill is likely to be less than the $27bn cost of the hurricanes that battered the US earlier this year. 14 | 15 | "The region that's affected is very big so we have to check country-by-country what the situation is", said Serge Troeber, deputy head of the natural disasters department at Swiss Re, the world's second biggest reinsurance firm. "I should assume, however, that the overall dimension of insured damages is below the storm damages of the US," he said. Munich Re, the world's biggest reinsurer, said: "This is primarily a human tragedy. It is too early for us to state what our financial burden will be." Allianz has said it sees no significant impact on its profitability. However, a low insurance bill may simply reflect the general poverty of much of the region, rather than the level of economic devastation for those who live there. 16 | 17 | The International Federation of the Red Cross and Red Crescent Societies told the Reuters news agency that it was seeking $6.5m for emergency aid. 
18 | 19 | "The biggest health challenges we face is the spread of waterborne diseases, particularly malaria and diarrhoea," the aid agency was quoted as saying. The European Union has said it will deliver 3m euros (£2.1m; $4.1m) of aid, according to the Wall Street Journal. The EU's Humanitarian Aid Commissioner, Louis Michel, was quoted as saying that it was key to bring aid "in those vital hours and days immediately after the disaster". Other countries also are reported to have pledged cash, while the US State Department said it was examining what aid was needed in the region. Getting companies and business up and running also may play a vital role in helping communities recover from the weekend's events. 20 | 21 | Many of the worst-hit areas, such as Sri Lanka, Thailand's Phuket island and the Maldives, are popular tourist resorts that are key to local economies. 22 | 23 | December and January are two of the busiest months for the travel in southern Asia and the damage will be even more keenly felt as the industry was only just beginning to emerge from a post 9/11 slump. Growth has been rapid in southeast Asia, with the World Tourism Organisation figures showing a 45% increase in tourist revenues in the region during the first 10 months of 2004. In southern Asia that expansion is 23%. "India continues to post excellent results thanks to increased promotion and product development, but also to the upsurge in business travel driven by the rapid economic development of the country," the WTO said. "Arrivals to other destinations such as... Maldives and Sri Lanka also thrived." In Thailand, tourism accounts for about 6% of the country's annual gross domestic product, or about $8bn. In Singapore the figure is close to 5%. Tourism also brings in much needed foreign currency. In the short-term, however, travel companies are cancelling flights and trips. 
That has hit shares across Asia and Europe, with investors saying that earnings and economic growth are likely to slow. 24 | -------------------------------------------------------------------------------- /resources/reactive_python_slides.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/resources/reactive_python_slides.pptx -------------------------------------------------------------------------------- /resources/rexon_metals.db: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/resources/rexon_metals.db -------------------------------------------------------------------------------- /setting_up_twitter_api.md: -------------------------------------------------------------------------------- 1 | # Setting Up Twitter API 2 | 3 | To use Tweepy, which is needed for a project in this course, you will need to set up a Twitter API account. 4 | 5 | First, go to [apps.twitter.com](https://apps.twitter.com). Make sure you have a Twitter account set up and "Sign in". Click the "Create New App" button as shown below (Figure 1). 6 | 7 | ![](http://i.imgur.com/VfohYTN.png) 8 | 9 | **Figure 1:** Twitter Apps Dashboard 10 | 11 | 12 | After that, fill in the form to create an API account. Give any name you like, and provide a "Website" if you have one you own. Otherwise, just use a placeholder as I've done below. It does not have to be an existing site (Figure 2). 13 | 14 | ![](http://i.imgur.com/LojWHv1.png) 15 | 16 | **Figure 2:** Filling out the form to create an API account 17 | 18 | After that, your API account will be set up. You will see the page below. Click the "Manage Keys and Access Tokens" link (Figure 3). 
19 | 20 | ![](http://i.imgur.com/ruDx1Ms.png) 21 | 22 | **Figure 3:** Click "Manage Keys and Access Tokens" 23 | 24 | You will then come to a page to manage your keys and tokens, which you will need to have so Tweepy can use this account. Note the "Consumer Key" and "Consumer Secret" values. You will need those. Then click "Create my access token". 25 | 26 | ![](https://i.imgur.com/V2kILPr.png) 27 | 28 | **Figure 4:** Note the values above, and click "Create my access token" 29 | 30 | A panel will pop up at the bottom with two more values: "Access Token" and "Access Token Secret". Hold on to these two values as well (Figure 5). 31 | 32 | ![](http://i.imgur.com/kiTk8kh.png) 33 | 34 | **Figure 5:** The "Access Token" and "Access Token Secret" 35 | 36 | Do not share these four key/token values as they are used to access your Twitter API account. Be sure to follow the usage agreement so your access is not revoked. 37 | -------------------------------------------------------------------------------- /setting_up_twitter_api.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/setting_up_twitter_api.pdf --------------------------------------------------------------------------------
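The note in `setting_up_twitter_api.md` about not sharing the four key/token values suggests keeping them out of source files. One possible approach, which is not part of the original course material, is to read them from environment variables. The sketch below assumes hypothetical variable names (`TWITTER_CONSUMER_KEY` and so on) and a hypothetical helper `load_twitter_credentials()`; rename them to suit your setup.

```python
import os

# Hypothetical environment variable names (an assumption, not from the course).
ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(environ=None):
    """Return the four key/token values as a dict, or raise if any is missing."""
    # Fall back to the real process environment when no mapping is supplied.
    environ = os.environ if environ is None else environ

    # Collect every variable that is absent or empty so the error names them all.
    missing = [var for var in ENV_NAMES.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    return {field: environ[var] for field, var in ENV_NAMES.items()}
```

With the Tweepy releases contemporary with this course, the returned values would typically be passed to `tweepy.OAuthHandler(consumer_key, consumer_secret)` followed by `auth.set_access_token(access_token, access_token_secret)`; check the documentation for the Tweepy version you have installed, since the authentication classes have changed between releases.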