├── README.md
├── class_notes
    ├── class_notes.docx
    ├── class_notes.md
    └── class_notes.pdf
├── code_examples
    ├── 4.1A_declaring_an_observable.py
    ├── 4.1B_subscribing_to_an_observable.py
    ├── 4.1C_subscribing_with_lambdas.py
    ├── 4.2A_some_basic_operators.py
    ├── 4.2B_range_and_just.py
    ├── 4.2C_observable_empty.py
    ├── 4.3A_creating_observable_from_scratch.py
    ├── 4.3B_interval_observable.py
    ├── 4.3C_unsubscribing.py
    ├── 4.4A_twitter_observable.py
    ├── 4.4B_cold_observable.py
    ├── 5.1A_filter.py
    ├── 5.1B_take.py
    ├── 5.1C_take_while.py
    ├── 5.2A_distinct.py
    ├── 5.2B_distinct_with_mapping.py
    ├── 5.2C_distinct_until_changed.py
    ├── 5.3A_count.py
    ├── 5.3B_reduce.py
    ├── 5.3C_scan.py
    ├── 5.4A_to_list.py
    ├── 5.4B_to_dict.py
    ├── 6.1A_merge.py
    ├── 6.1B_merge_interval.py
    ├── 6.1C_merge_all.py
    ├── 6.1D_merge_all_continued.py
    ├── 6.1E_flat_map.py
    ├── 6.2A_concat.py
    ├── 6.2B_concat_all.py
    ├── 6.2C_zip.py
    ├── 6.3D_spacing_emissions.py
    ├── 6.4A_grouping_into_lists.py
    ├── 6.4B_grouping_length_counts.py
    ├── 7.1A_reading_text_file.py
    ├── 7.1B_reading_web_url.py
    ├── 7.1C_recursive_file_iteration.py
    ├── 7.2A_reading_sql_query.py
    ├── 7.2C_merging_sql_queries.py
    ├── 7.2D_writing_sql_updates.py
    ├── 7.3A_reading_words_from_text_file.py
    ├── 7.3B_counting_word_occurrences.py
    ├── 7.3C_scheduling_reactive_word_counter.py
    ├── 8.1A_connectableobservable.py
    ├── 8.1B_sharing_observable.py
    ├── 8.1C_refcount.py
    ├── 8.2_twitter_feed_for_topics.py
    ├── 9.1A_sequential_long_running_tasks.py
    ├── 9.1B_using_subscribe_on.py
    ├── 9.2_using_observe_on.py
    ├── 9.3_processing_emissions_in_parallel.py
    ├── 9.4_switch_map.py
    └── rexon_metals.db
├── resources
    ├── bbc_news_article.txt
    ├── reactive_python_slides.pptx
    └── rexon_metals.db
├── setting_up_twitter_api.md
└── setting_up_twitter_api.pdf


/README.md:
--------------------------------------------------------------------------------
1 | Resources for the O'Reilly Media online video as well as webcast [_Reactive Python for Data Science_](https://www.safaribooksonline.com/library/view/reactive-python-for/9781491979006/).
2 | 
3 | [![](http://akamaicovers.oreilly.com/images/0636920064237/lrg.jpg)](https://www.safaribooksonline.com/library/view/reactive-python-for/9781491979006/)
4 | 
5 | 
6 | 
7 | 


--------------------------------------------------------------------------------
/class_notes/class_notes.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/class_notes/class_notes.docx


--------------------------------------------------------------------------------
/class_notes/class_notes.md:
--------------------------------------------------------------------------------
   1 | # Reactive Python for Data
   2 | # Part IV - The Observable
   3 | 
   4 | ## 4.1A - Creating an `Observable`
   5 | 
   6 | An `Observable` pushes items. It can push a finite or infinite series of items over time. To create an `Observable` that pushes 5 text strings, you can declare it like this:
   7 | 
   8 | ```python
   9 | from rx import Observable
  10 | 
  11 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
  12 | ```
  13 | 
  14 | We create an `Observable` using the `from_()` function, and pass it a list of five strings. It will take the list and **emit** (or push) each item from it. The `Observable.from_()` will work with any iterable.
  15 | 
  16 | However, running this does nothing more than save an `Observable` to a variable called `letters`. For the items to actually get pushed, we need a `Subscriber`.
  17 | 
  18 | ## 4.1B - Subscribing to an `Observable`
  19 | 
  20 | To receive emissions from an `Observable`, we need to create a `Subscriber` by implementing an `Observer`. An `Observer` implements three functions: `on_next()`, which receives an emission; `on_completed()`, which is called when there are no more items; and `on_error()`, which receives an error in the event one occurs.
  21 | 
  22 | Then we can pass an implementation of this `Observer` to the Observable's `subscribe()` function. It will then fire the emissions to our `Subscriber`.
  23 | 
  24 | ```python
  25 | from rx import Observable, Observer
  26 | 
  27 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
  28 | 
  29 | 
  30 | class MySubscriber(Observer):
  31 |     def on_next(self, value):
  32 |         print("Received: {0}".format(value))
  33 | 
  34 |     def on_completed(self):
  35 |         print("Completed!")
  36 | 
  37 |     def on_error(self, error):
  38 |         print("Error occurred: {0}".format(error))
  39 | 
  40 | 
  41 | letters.subscribe(MySubscriber())
  42 | ```
  43 | 
  44 | **OUTPUT:**
  45 | 
  46 | ```
  47 | Received: Alpha
  48 | Received: Beta
  49 | Received: Gamma
  50 | Received: Delta
  51 | Received: Epsilon
  52 | Completed!
  53 | ```
  54 | 
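To make the `Observer` contract concrete without depending on RxPy, here is a minimal plain-Python sketch of the same push protocol. `CollectingObserver` and `push_items` are hypothetical names invented for this illustration and are not part of the library; the sketch only assumes the three callbacks described above.

```python
class CollectingObserver:
    """Hypothetical observer that records every callback it receives."""

    def __init__(self):
        self.events = []

    def on_next(self, value):
        self.events.append("Received: {0}".format(value))

    def on_completed(self):
        self.events.append("Completed!")

    def on_error(self, error):
        self.events.append("Error occurred: {0}".format(error))


def push_items(items, observer):
    """Push each item; report an exception via on_error, otherwise complete."""
    try:
        for item in items:
            observer.on_next(item)
    except Exception as error:
        observer.on_error(error)
    else:
        observer.on_completed()


observer = CollectingObserver()
push_items(["Alpha", "Beta", "Gamma"], observer)
print("\n".join(observer.events))
```

The source decides when to call `on_next()`, and the observer only reacts. That inversion of control is the essence of "pushing" items.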
  55 | ## 4.1C - Subscribing Shorthand with Lambdas
  56 | 
  57 | Implementing a `Subscriber` is a bit verbose, so we also have the option of passing more concise lambda arguments to the `subscribe()` function. It will then use those lambdas to create the `Subscriber` for us.
  58 | 
  59 | ```python
  60 | from rx import Observable
  61 | 
  62 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
  63 | 
  64 | letters.subscribe(on_next = lambda value: print(value),
  65 |                   on_completed = lambda: print("Completed!"),
  66 |                   on_error = lambda error: print("Error occurred: {0}".format(error)))
  67 | ```
  68 | 
  69 | You do not even have to supply all the lambda arguments. You can leave out the `on_completed` and `on_error`, but for production code you should try to have an `on_error` so errors are not quietly swallowed.
  70 | 
  71 | 
  72 | ```python
  73 | letters.subscribe(on_next = lambda value: print(value))
  74 | 
  75 | # or
  76 | 
  77 | letters.subscribe(lambda value: print("Received: {0}".format(value)))
  78 | ```
  79 | 
  80 | We will be using lambdas constantly as we do reactive programming.
  81 | 
  82 | 
  83 | ## 4.2A - Some Basic Operators
  84 | 
  85 | RxPy has approximately 130 operators to powerfully express business logic, transformations, and concurrency behaviors. For now we will start with two basic ones: `map()` and `filter()` and cover more in the next section.
  86 | 
  87 | For instance, we can `map()` each `String` to its length, and then `filter()` to keep only lengths that are at least 5.
  88 | 
  89 | ```python
  90 | from rx import Observable
  91 | 
  92 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
  93 | 
  94 | mapped = letters.map(lambda s: len(s))
  95 | 
  96 | filtered = mapped.filter(lambda i: i >= 5)
  97 | 
  98 | filtered.subscribe(lambda value: print("Received: {0}".format(value)))
  99 | ```
 100 | 
 101 | **OUTPUT:**
 102 | 
 103 | ```
 104 | Received: 5
 105 | Received: 5
 106 | Received: 5
 107 | Received: 7
 108 | ```
 109 | 
 110 | Each operator yields a new `Observable` emitting that transformation. We can save each one to a variable if we want and then `subscribe()` to the one we want, but oftentimes you will likely want to call them all in a single chain.
 111 | 
 112 | ```python
 113 | from rx import Observable
 114 | 
 115 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 116 |     .map(lambda s: len(s)) \
 117 |     .filter(lambda i: i >= 5) \
 118 |     .subscribe(lambda value: print(value))
 119 | ```
 120 | 
 121 | > If you are using an IDE like PyCharm, operators like `filter()` and `map()` will unfortunately not be available for auto-complete. The reason is RxPy will add these operators to the `Observable` at runtime. For PyCharm, you may want to disable _Unresolved References_ under _Settings -> Editor -> Inspection -> Python_ so you do not get any warnings.
 122 | 
 123 | ## 4.2B Using Observable.range() and Observable.just()
 124 | 
 125 | There are other ways to create an `Observable`. For instance, you can emit a range of numbers:
 126 | 
 127 | ```python
 128 | from rx import Observable
 129 | 
 130 | source = Observable.range(1,10)
 131 | 
 132 | source.subscribe(lambda value: print("Received: {0}".format(value)))
 133 | ```
 134 | 
 135 | **OUTPUT:**
 136 | 
 137 | ```
 138 | Received: 1
 139 | Received: 2
 140 | Received: 3
 141 | Received: 4
 142 | Received: 5
 143 | Received: 6
 144 | Received: 7
 145 | Received: 8
 146 | Received: 9
 147 | Received: 10
 148 | ```
 149 | 
 150 | You can also use `Observable.just()` to emit a single item.
 151 | 
 152 | ```python
 153 | from rx import Observable
 154 | 
 155 | greeting = Observable.just("Hello World!")
 156 | 
 157 | greeting.subscribe(lambda value: print("Received: {0}".format(value)))
 158 | ```
 159 | 
 160 | **OUTPUT:**
 161 | 
 162 | ```
 163 | Received: Hello World!
 164 | ```
 165 | 
 166 | ## 4.2C - Using Observable.empty()
 167 | 
 168 | You can also create an `Observable` that emits nothing and calls `on_completed()` immediately via `Observable.empty()`. While this may not seem useful, an empty `Observable` is the reactive equivalent to `None`, `null`, or an empty collection.
 169 | 
 170 | ```python
 171 | from rx import Observable
 172 | 
 173 | Observable.empty() \
 174 |     .subscribe(on_next= lambda s: print(s),
 175 |                on_completed= lambda: print("Done!")
 176 |                )
 177 | ```
 178 | 
 179 | **OUTPUT:**
 180 | 
 181 | ```
 182 | Done!
 183 | ```
 184 | 
 185 | ## 4.3A - Creating an Observable from Scratch
 186 | 
 187 | You can also create an `Observable` source from scratch. Using `Observable.create()`, you can pass a function with an `observer` argument, and call its `on_next()`, `on_completed()`, and `on_error()` to pass items or events to the `Observer` or the next operator in the chain.
 188 | 
 189 | ```python
 190 | from rx import Observable
 191 | 
 192 | def push_numbers(observer):
 193 |     observer.on_next(100)
 194 |     observer.on_next(300)
 195 |     observer.on_next(500)
 196 |     observer.on_completed()
 197 | 
 198 | Observable.create(push_numbers).subscribe(on_next = lambda i: print(i))
 199 | ```
 200 | 
 201 | **OUTPUT:**
 202 | 
 203 | ```
 204 | 100
 205 | 300
 206 | 500
 207 | ```
 208 | 
 209 | 
 210 | ## 4.3B - An Interval Observable
 211 | 
 212 | Observables do not have to strictly emit data. They can also emit events. Remember our definition that states _events are data, and data are events_? Events and data are treated the same way in ReactiveX. They both can be pushed through an `Observable`.
 213 | 
 214 | For instance, we can use `Observable.interval()` to emit a consecutive integer every 1 second.
 215 | 
 216 | ```python
 217 | from rx import Observable
 218 | 
 219 | Observable.interval(1000) \
 220 |     .map(lambda i: "{0} Mississippi".format(i)) \
 221 |     .subscribe(lambda s: print(s))
 222 | 
 223 | # Keep application alive until user presses a key
 224 | input("Press any key to quit\r\n")
 225 | ```
 226 | 
 227 | 
 228 | **OUTPUT:**
 229 | 
 230 | ```
 231 | 0 Mississippi
 232 | 1 Mississippi
 233 | 2 Mississippi
 234 | 3 Mississippi
 235 | 4 Mississippi
 236 | 5 Mississippi
 237 | 6 Mississippi
 238 | 7 Mississippi
 239 | 8 Mississippi
 240 | ```
 241 | 
 242 | Notice how the `Observable` in fact has a notion of time? It is emitting an integer every second, and each emission is both data and an event. Observables can be created to emit button clicks for a UI, server requests, new Tweets, and any other event while representing that event as data.
 243 | 
 244 | Note also we had to use `input()` to make the main thread pause until the user presses a key. If we did not do this, the `Observable.interval()` would not have a chance to fire because the application will exit. The reason for this is the `Observable.interval()` has to operate on a separate thread and create a separate workstream driven by a timer. The Python code will finish and terminate before it has a chance to fire.
 245 | 
 246 | 
 247 | ## 4.3C - Using Observable.defer() (EXTRA)
 248 | 
 249 | A behavior to be aware of with `Observable.from_()` and other functions that create Observables is they may not reflect changes that happen to their sources.
 250 | 
 251 | For instance, if you have an `Observable.range()` built off two variables `x` and `y`, and one of the variables changes later, this change will not be captured by the source.
 252 | 
 253 | ```python
 254 | from rx import Observable
 255 | 
 256 | x = 1
 257 | y = 5
 258 | 
 259 | integers = Observable.range(x, y)
 260 | integers.subscribe(lambda i: print(i))
 261 | 
 262 | print("\nSetting y = 10\n")
 263 | y = 10
 264 | integers.subscribe(lambda i: print(i))
 265 | ```
 266 | 
 267 | **OUTPUT:**
 268 | 
 269 | ```
 270 | 1
 271 | 2
 272 | 3
 273 | 4
 274 | 5
 275 | 
 276 | Setting y = 10
 277 | 
 278 | 1
 279 | 2
 280 | 3
 281 | 4
 282 | 5
 283 | ```
 284 | 
 285 | Using `Observable.defer()` allows you to create a new `Observable` from scratch each time it is subscribed to, so it captures anything that might have changed about its source. Just supply how to create the `Observable` through a lambda.
 286 | 
 287 | ```python
 288 | from rx import Observable
 289 | 
 290 | x = 1
 291 | y = 5
 292 | 
 293 | integers = Observable.defer(lambda: Observable.range(x, y))
 294 | integers.subscribe(lambda i: print(i))
 295 | 
 296 | print("\nSetting y = 10\n")
 297 | y = 10
 298 | 
 299 | integers.subscribe(lambda i: print(i))
 300 | ```
 301 | 
 302 | **OUTPUT:**
 303 | 
 304 | ```
 305 | 1
 306 | 2
 307 | 3
 308 | 4
 309 | 5
 310 | 
 311 | Setting y = 10
 312 | 
 313 | 1
 314 | 2
 315 | 3
 316 | 4
 317 | 5
 318 | 6
 319 | 7
 320 | 8
 321 | 9
 322 | 10
 323 | ```
 324 | 
 325 | The lambda argument ensures the `Observable` source declaration is rebuilt each time it is subscribed to.
 326 | 
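The same late-binding idea can be seen in plain Python, without RxPy. In this sketch, `eager` is built once and never notices the change to `y`, while `deferred` is a hypothetical factory function that rebuilds its range on every call, much like the lambda passed to `Observable.defer()`. The `(start, count)` convention mirrors the output shown above and is an assumption of the sketch.

```python
x = 1
y = 5

# Built immediately: uses the values of x and y as they are right now.
eager = list(range(x, x + y))


def deferred():
    # Built on demand: x and y are looked up each time this is called.
    return list(range(x, x + y))


first = deferred()
y = 10
second = deferred()

print(eager)   # [1, 2, 3, 4, 5]
print(first)   # [1, 2, 3, 4, 5]
print(second)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Only the second call to `deferred()` reflects the new value of `y`, because the range is constructed at call time rather than at declaration time.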
 327 | ## 4.3D - Unsubscribing from an Observable
 328 | 
 329 | When you `subscribe()` to an `Observable` it returns a `Disposable` so you can disconnect the `Subscriber` from the `Observable` at any time.
 330 | 
 331 | ```python
 332 | from rx import Observable
 333 | import time
 334 | 
 335 | disposable = Observable.interval(1000) \
 336 |     .map(lambda i: "{0} Mississippi".format(i)) \
 337 |     .subscribe(lambda s: print(s))
 338 | 
 339 | # sleep 5 seconds so Observable can fire
 340 | time.sleep(5)
 341 | 
 342 | # disconnect the Subscriber
 343 | print("Unsubscribing!")
 344 | disposable.dispose()
 345 | 
 346 | # sleep a bit longer to prove no more emissions are coming
 347 | time.sleep(5)
 348 | ```
 349 | 
 350 | **OUTPUT:**
 351 | 
 352 | ```
 353 | 0 Mississippi
 354 | 1 Mississippi
 355 | 2 Mississippi
 356 | 3 Mississippi
 357 | Unsubscribing!
 358 | ```
 359 | 
 360 | Unsubscribing/disposing is usually not necessary for Observables that are finite and quick (they will unsubscribe themselves), but it can be necessary for long-running or infinite Observables.
 361 | 
 362 | ## 4.4 - An Observable emitting Tweets
 363 | 
 364 | Later we will learn how to create Observables that emit Tweets for a given topic, but here is a preview of what's to come. Using Tweepy and `Observable.create()`, we can create a function that yields an `Observable` emitting Tweets for specified topics. For instance, here is how to get a live stream of text bodies from Tweets for "Britain" and "France".
 365 | 
 366 | 
 367 | ## 4.4A - A Twitter Observable
 368 | 
 369 | ```python
 370 | from tweepy.streaming import StreamListener
 371 | from tweepy import OAuthHandler
 372 | from tweepy import Stream
 373 | import json
 374 | from rx import Observable
 375 | 
 376 | # Variables that contain the user credentials to access the Twitter API
 377 | access_token = "CONFIDENTIAL"
 378 | access_token_secret = "CONFIDENTIAL"
 379 | consumer_key = "CONFIDENTIAL"
 380 | consumer_secret = "CONFIDENTIAL"
 381 | 
 382 | 
 383 | def tweets_for(topics):
 384 | 
 385 |     def observe_tweets(observer):
 386 |         class TweetListener(StreamListener):
 387 |             def on_data(self, data):
 388 |                 observer.on_next(data)
 389 |                 return True
 390 | 
 391 |             def on_error(self, status):
 392 |                 observer.on_error(status)
 393 | 
 394 |         # This handles Twitter authentication and the connection to the Twitter Streaming API
 395 |         l = TweetListener()
 396 |         auth = OAuthHandler(consumer_key, consumer_secret)
 397 |         auth.set_access_token(access_token, access_token_secret)
 398 |         stream = Stream(auth, l)
 399 |         stream.filter(track=topics)
 400 | 
 401 |     return Observable.create(observe_tweets).share()
 402 | 
 403 | 
 404 | topics = ['Britain','France']
 405 | 
 406 | tweets_for(topics).map(lambda d: json.loads(d)) \
 407 |     .filter(lambda map: "text" in map) \
 408 |     .map(lambda map: map["text"].strip()) \
 409 |     .subscribe(lambda s: print(s))
 410 | 
 411 | 
 412 | ```
 413 | 
 414 | **OUTPUT:**
 415 | 
 416 | ```
 417 | RT @YourAnonCentral: The ﬁve biggest international arms exports suppliers in 2008–12 were the #US,#Russia, #Germany, #France and #China. ht…
 418 | RT @parismarx: Marine Le Pen believes France "will provide the third stage of a global political uprising" following Brexit &amp; Trump https:/…
 419 | Attentats du 13-Novembre: des rescapés racontent leur vie un an après https://t.co/VMM5rlsoQu via @RFI
 420 | RT @AOLNews: 1 year after the Paris attacks, France's state of emergency remains: https://t.co/PD0U6mXHcN https://t.co/QUHWRSCLxt
 421 | おむつは不要、手ぶらで登園。少子化を克服したフランスの保育園事情とは https://t.co/4ImUajYSq2 @HuffPostJapanさんから
 422 | RT @CPIF_: #France Interdit cette année, les islamistes tentent de convertir les femmes en faisant l'expérience du voile à…
 423 | RT @StewartWood: This week our Government should remember &amp; make clear that Britain's alliances must be based on our values, not our values…
 424 | RT @MaxAbrahms: "Britain will spend the next two months trying to convince Mr Trump's team of the need to remove President Assad." https://…
 425 | RT @Bassounov: #Trump est devenu présidentiable grâce à 10 ans de #téléPoubelle. En 2022 en France, la présidence se jouera entre #Hanouna…
 426 | # Panoramix #Radio #Station
 427 | ...
 428 | ```
 429 | 
 430 | ## 4.4B Cold vs Hot Observables
 431 | 
 432 | Observables that emit data typically are **cold Observables**, meaning they will replay emissions to each individual `Subscriber`. For instance, this `Observable` below will emit all five strings to both Subscribers individually.
 433 | 
 434 | ```python
 435 | from rx import Observable
 436 | 
 437 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 438 | 
 439 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
 440 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
 441 | ```
 442 | 
 443 | **OUTPUT:**
 444 | 
 445 | ```
 446 | Subscriber 1: Alpha
 447 | Subscriber 1: Beta
 448 | Subscriber 1: Gamma
 449 | Subscriber 1: Delta
 450 | Subscriber 1: Epsilon
 451 | Subscriber 2: Alpha
 452 | Subscriber 2: Beta
 453 | Subscriber 2: Gamma
 454 | Subscriber 2: Delta
 455 | Subscriber 2: Epsilon
 456 | ```
 457 | 
 458 | However, **hot Observables** will not replay emissions for tardy subscribers that come later. Our Twitter `Observable` is an example of a hot `Observable`. If a second `Subscriber` subscribes to a Tweet feed 5 seconds after the first `Subscriber`, it will miss all Tweets that occurred in that window. We will explore this later.
 459 | 
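The difference can be sketched in plain Python. `HotSource` below is a hypothetical stand-in for a live feed, not an RxPy class: it pushes each value only to the callbacks registered at the moment of emission, so a late subscriber misses everything that came before.

```python
class HotSource:
    """Hypothetical hot source: emits to whoever is subscribed right now."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def emit(self, value):
        for callback in list(self._callbacks):
            callback(value)


source = HotSource()
first_seen, second_seen = [], []

source.subscribe(first_seen.append)
source.emit("Tweet 1")
source.emit("Tweet 2")

# The second subscriber arrives late and never sees the earlier emissions.
source.subscribe(second_seen.append)
source.emit("Tweet 3")

print(first_seen)   # ['Tweet 1', 'Tweet 2', 'Tweet 3']
print(second_seen)  # ['Tweet 3']
```

A cold source, by contrast, would start over from "Tweet 1" for each new subscriber.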
 460 | 
 461 | # Part V - Operators
 462 | 
 463 | In this section, we will learn some of the 130 operators available in RxPy. Learning these operators can be overwhelming, so the best approach is to seek the right operators out of need. The key to being productive with RxPy and unleashing its potential is to find the key operators that help you with the tasks you encounter. With practice, you will become fluent in composing them together.
 464 | 
 465 | The best way to see what operators are available in RxPy is to look through them on GitHub
 466 | https://github.com/ReactiveX/RxPY/tree/master/rx/linq/observable
 467 | 
 468 | You can also view the ReactiveX operators page which has helpful marble diagrams showing each operator's behavior
 469 | http://reactivex.io/documentation/operators.html
 470 | 
 471 | You can also explore various operators using the interactive RxMarbles website
 472 | http://rxmarbles.com/
 473 | 
 474 | 
 475 | ## 5.1 Suppressing Emissions
 476 | 
 477 | Here are some operators that can be helpful for suppressing emissions that fail to meet a criterion in some form.
 478 | 
 479 | ### 5.1A `filter()`
 480 | 
 481 | You have already seen `filter()`. It suppresses emissions that fail to meet a condition specified by you. For instance, we can allow forward only emissions that are at least length 5.
 482 | 
 483 | ```python
 484 | from rx import Observable
 485 | 
 486 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 487 |     .filter(lambda s: len(s) >= 5) \
 488 |     .subscribe(lambda s: print(s))
 489 | ```
 490 | 
 491 | **OUTPUT:**
 492 | 
 493 | ```
 494 | Alpha
 495 | Gamma
 496 | Delta
 497 | Epsilon
 498 | ```
 499 | 
 500 | 
 501 | 
 502 | ### 5.1B `take()`
 503 | 
 504 | You can also use `take()` to cut off at a certain number of emissions and call `on_completed()`. For instance, calling `take(2)` like below will only allow the first two emissions coming out of the `filter()` to come through.
 505 | 
 506 | ```python
 507 | from rx import Observable
 508 | 
 509 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 510 |     .filter(lambda s: len(s) >= 5) \
 511 |     .take(2) \
 512 |     .subscribe(lambda s: print(s))
 513 | ```
 514 | 
 515 | **OUTPUT:**
 516 | 
 517 | ```
 518 | Alpha
 519 | Gamma
 520 | ```
 521 | 
 522 | `take()` will not throw an error if it fails to get the number of items it wants. It will just emit what it does capture. For instance, when `take(10)` only receives 4 emissions (and not 10), it will just emit those 4 emissions.
 523 | 
 524 | ```python
 525 | from rx import Observable
 526 | 
 527 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 528 |     .filter(lambda s: len(s) >= 5) \
 529 |     .take(10) \
 530 |     .subscribe(on_next = lambda s: print(s), on_error = lambda e: print(e))
 531 | ```
 532 | 
 533 | **OUTPUT:**
 534 | 
 535 | ```
 536 | Alpha
 537 | Gamma
 538 | Delta
 539 | Epsilon
 541 | ```
 542 | 
 543 | ### 5.1C `take_while()`
 544 | 
 545 | `take_while()` will keep passing emissions based on a condition. For instance if we have an `Observable` emitting some integers, we can keep taking integers while they are less than 100. We can achieve this using a `take_while()`.
 546 | 
 547 | ```python
 548 | from rx import Observable
 549 | 
 550 | Observable.from_([2,5,21,5,2,1,5,63,127,12]) \
 551 |     .take_while(lambda i: i < 100) \
 552 |     .subscribe(on_next = lambda i: print(i), on_completed = lambda: print("Done!"))
 553 | ```
 554 | 
 555 | 
 556 | When the `127` is encountered, the `take_while()` specified as above with the condition `i < 100` will trigger `on_completed()` to be called to the `Subscriber`, and unsubscription will prevent any more emissions from occurring.
 557 | 
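For comparison, the standard library's `itertools.takewhile` applies the same cut-off rule to an ordinary iterable. This is only an analogy to show which values survive; it is not how RxPy implements the operator.

```python
from itertools import takewhile

numbers = [2, 5, 21, 5, 2, 1, 5, 63, 127, 12]

# Stop at the first value that fails the condition; later values are never inspected.
taken = list(takewhile(lambda i: i < 100, numbers))

print(taken)  # [2, 5, 21, 5, 2, 1, 5, 63]
```

Note that the trailing `12` is dropped even though it is less than 100, because nothing is taken once the condition has failed.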
 558 | 
 559 | ## 5.2 Distinct Operators
 560 | 
 561 | ### 5.2A `distinct()`
 562 | 
 563 | You can use `distinct()` to suppress redundant emissions. If an item has been emitted before (based on its equality logic via its `__eq__` implementation), it will not be emitted.
 564 | 
 565 | This will emit the distinct lengths:
 566 | 
 567 | ```python
 568 | from rx import Observable
 569 | 
 570 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 571 |     .map(lambda s: len(s)) \
 572 |     .distinct() \
 573 |     .subscribe(lambda i: print(i))
 574 | ```
 575 | 
 576 | **OUTPUT:**
 577 | 
 578 | ```
 579 | 5
 580 | 4
 581 | 7
 582 | ```
 583 | 
 584 | 
 585 | ### 5.2B `distinct()` with mapping
 586 | 
 587 | You can also pass a lambda specifying what you want to distinct on. If we want to emit the `String` rather than its length, but use distinct logic on its length, you can leverage a lambda argument.
 588 | 
 589 | ```python
 590 | from rx import Observable
 591 | 
 592 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 593 |     .distinct(lambda s: len(s)) \
 594 |     .subscribe(lambda i: print(i))
 595 | ```
 596 | 
 597 | **OUTPUT:**
 598 | 
 599 | ```
 600 | Alpha
 601 | Beta
 602 | Epsilon
 603 | ```
 604 | 
 605 | 
 606 | ### 5.2C `distinct_until_changed()`
 607 | 
 608 | The `distinct_until_changed()` will prevent _consecutive_ duplicates from emitting.
 609 | 
 610 | ```python
 611 | from rx import Observable
 612 | 
 613 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
 614 |     .map(lambda s: len(s)) \
 615 |     .distinct_until_changed() \
 616 |     .subscribe(lambda i: print(i))
 617 | ```
 618 | 
 619 | **OUTPUT:**
 620 | 
 621 | ```
 622 | 5
 623 | 4
 624 | 5
 625 | 7
 626 | ```
 627 | 
 628 | Just like `distinct()`, you can also provide a lambda to distinct on an attribute.
 629 | 
 630 | ```python
 631 | from rx import Observable
 632 | 
 633 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
 634 |     .distinct_until_changed(lambda s: len(s)) \
 635 |     .subscribe(lambda i: print(i))
 636 | ```
 637 | 
 638 | ```
 639 | Alpha
 640 | Beta
 641 | Gamma
 642 | Epsilon
 643 | ```
 644 | 
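The "consecutive duplicates" rule can be checked with the standard library. This sketch uses `itertools.groupby` as a plain-Python analogy (an assumption for illustration, not RxPy's implementation), keyed on length like the example above.

```python
from itertools import groupby

words = ["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]

# groupby() starts a new group whenever the key changes between neighbors,
# so keeping the first member of each group drops consecutive duplicates.
first_of_each_run = [next(group) for _, group in groupby(words, key=len)]

print(first_of_each_run)  # ['Alpha', 'Beta', 'Gamma', 'Epsilon']
```

`Gamma` is kept even though a length of 5 was seen earlier, because the preceding item (`Beta`) had a different length.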
 645 | ## 5.3 Aggregating Operators
 646 | 
 647 | When working with data, there will be many instances where we want to consolidate emissions into a single emission to reflect some form of an aggregated result.
 648 | 
 649 | One thing to be careful about when aggregating emissions is that, with the exception of `scan()`, these operators rely on `on_completed()` being called. An infinite `Observable` will cause an aggregation operator to keep aggregating forever without ever emitting a result.
 650 | 
 651 | ### 5.3A - `count()`
 652 | 
 653 | The simplest aggregation on an `Observable` is to `count()` the number of emissions, and then push that count forward as a single emission once `on_completed()` is called. If we want to count the number of text strings that are not 5 characters long, we can achieve it like this:
 654 | 
 655 | 
 656 | ```python
 657 | from rx import Observable
 658 | 
 659 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 660 |     .filter(lambda s: len(s) != 5) \
 661 |     .count() \
 662 |     .subscribe(lambda i: print(i))
 663 | ```
 664 | 
 665 | **OUTPUT:**
 666 | 
 667 | ```
 668 | 2
 669 | ```
 670 | 
 671 | ### 5.3B `reduce()`
 672 | 
 673 | The `reduce()` allows you to define a custom aggregation operation to "fold" each value into a rolling value. For instance, you can find the sum of numeric emissions (less than 100) using `reduce()` in this manner.
 674 | 
 675 | ```python
 676 | from rx import Observable
 677 | 
 678 | Observable.from_([4,76,22,66,881,13,35]) \
 679 |     .filter(lambda i: i < 100) \
 680 |     .reduce(lambda total, value: total + value) \
 681 |     .subscribe(lambda s: print(s))
 682 | ```
 683 | 
 684 | **OUTPUT:**
 685 | 
 686 | ```
 687 | 216
 688 | ```
 689 | 
 690 | You can use this to consolidate emissions in your own custom way for most cases. Keep in mind that there are already built in mathematical aggregators like `sum()` (which could replace this `reduce()`) as well as `min()`, `max()`, and `average()`. These only work on numeric emissions, however.
 691 | 
 692 | 
 693 | ### 5.3C `scan()`
 694 | 
 695 | The `scan()` is almost identical to `reduce()`, but it will emit each rolling total for each emission that is received. Therefore, it can work with infinite Observables such as Twitter streams and other events.
 696 | 
 697 | 
 698 | ```python
 699 | from rx import Observable
 700 | 
 701 | Observable.from_([4,76,22,66,881,13,35]) \
 702 |     .scan(lambda total, value: total + value) \
 703 |     .subscribe(lambda s: print(s))
 704 | ```
 705 | 
 706 | **OUTPUT:**
 707 | 
 708 | ```
 709 | 4
 710 | 80
 711 | 102
 712 | 168
 713 | 1049
 714 | 1062
 715 | 1097
 716 | ```
 717 | 
 718 | Each accumulation is emitted every time an emission is added to our running total. We start with `4`, then `4` + `76` which is `80`, then `80` + `22` which is `102`, etc...
 719 | 
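The relationship between `reduce()` and `scan()` mirrors two standard-library functions. As a rough analogy only (not RxPy itself), `functools.reduce` returns just the final total, while `itertools.accumulate` yields every running total along the way.

```python
from functools import reduce
from itertools import accumulate

numbers = [4, 76, 22, 66, 881, 13, 35]

# reduce() folds everything into one final value, so it needs a finite input.
total = reduce(lambda acc, value: acc + value, numbers)

# accumulate() yields every intermediate total, like scan().
running_totals = list(accumulate(numbers, lambda acc, value: acc + value))

print(total)           # 1097
print(running_totals)  # [4, 80, 102, 168, 1049, 1062, 1097]
```

The last running total always equals the reduced value, which is why `scan()` can stand in for `reduce()` on streams that never complete.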
 720 | 
 721 | ## 5.4 Collecting Operators
 722 | 
 723 | You can consolidate emissions by collecting them into a `List` or `Dict`, and then pushing that collection forward as a single emission.
 724 | 
 725 | ### 5.4A - `to_list()`
 726 | 
 727 | `to_list()` will collect the emissions into a single `List` until `on_completed()` is called, then it will push that `List` forward as a single emission.
 728 | 
 729 | ```python
 730 | from rx import Observable
 731 | 
 732 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
 733 |     .to_list() \
 734 |     .subscribe(lambda s: print(s))
 735 | ```
 736 | 
 737 | **OUTPUT:**
 738 | 
 739 | ```
 740 | ['Alpha', 'Beta', 'Gamma', 'Delta', 'Epsilon']
 741 | ```
 742 | 
 743 | Typically you want to avoid excessively collecting things into Lists unless business logic requires it. Prefer to keep emissions flowing forward one-at-a-time in a reactive manner when possible, rather than stopping the flow and collecting emissions into Lists.
 744 | 
 745 | ### 5.4B - `to_dict()`
 746 | 
 747 | The `to_dict()` will collect emissions into a `Dict` and you specify a lambda that derives the key. For instance, if you wanted to key each String off its first letter and collect them into a `Dict`, do the following:
 748 | 
 749 | ```python
 750 | from rx import Observable
 751 | 
 752 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 753 |     .to_dict(lambda s: s[0]) \
 754 |     .subscribe(lambda i: print(i))
 755 | ```
 756 | 
 757 | **OUTPUT:**
 758 | 
 759 | ```
 760 | {'B': 'Beta', 'E': 'Epsilon', 'A': 'Alpha', 'G': 'Gamma', 'D': 'Delta'}
 761 | ```
 762 | 
 763 | You can optionally provide a second lambda argument to specify a value other than the emission itself. If we wanted to map the first letter to the length of the String instead, we can do this:
 764 | 
 765 | ```python
 766 | from rx import Observable
 767 | 
 768 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 769 |     .to_dict(lambda s: s[0], lambda s: len(s)) \
 770 |     .subscribe(lambda i: print(i))
 771 | ```
 772 | 
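No output is listed for this second example. As a plain-Python sketch of the mapping it describes (assuming each key maps to the length of its word, and noting that key order in RxPy's own output may differ), a dict comprehension produces the equivalent result:

```python
words = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]

# Key each word by its first letter and store its length as the value.
lengths_by_initial = {s[0]: len(s) for s in words}

print(lengths_by_initial)  # {'A': 5, 'B': 4, 'G': 5, 'D': 5, 'E': 7}
```

If two words shared a first letter, the later one would overwrite the earlier one in this sketch, so choose a key that is unique across emissions.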
 773 | 
 774 | 
 775 | ## 6.1A - Observable.merge()
 776 | 
 777 | 
 778 | ```python
 779 | from rx import Observable
 780 | 
 781 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 782 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])
 783 | 
 784 | Observable.merge(source1,source2) \
 785 |     .subscribe(lambda s: print(s))
 786 | ```
 787 | 
 788 | **OUTPUT:**
 789 | 
 790 | ```
 791 | Alpha
 792 | Zeta
 793 | Beta
 794 | Eta
 795 | Gamma
 796 | Theta
 797 | Delta
 798 | Iota
 799 | Epsilon
 800 | ```
 801 | 
 802 | Notice that although emissions from both Observables are now a single stream, the emissions are interleaved and jumbled. This is because `Observable.merge()` fires emissions from all the Observables at once rather than sequentially, one at a time.
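
The alternating order printed above can be imitated with plain iterables by taking one item from each source in turn. This is only an illustration of the interleaving seen in the output, not a description of how RxPY schedules emissions; `round_robin` is a hypothetical helper written for this sketch.

```python
from itertools import zip_longest

_MISSING = object()  # marks the gap left by the shorter source


def round_robin(*sources):
    """Yield one item from each source in turn until all are exhausted."""
    for batch in zip_longest(*sources, fillvalue=_MISSING):
        for item in batch:
            if item is not _MISSING:
                yield item


source1 = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
source2 = ["Zeta", "Eta", "Theta", "Iota"]

interleaved = list(round_robin(source1, source2))
print(interleaved)
```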
 803 | 
 804 | ## 6.1B - `Observable.merge()` (Continued)
 805 | 
 806 | If you want a guarantee of sequential ordering, use `Observable.concat()`, which is discussed later. But `Observable.merge()` can be helpful for merging multiple event streams.
 807 | 
 808 | ```python
 809 | from rx import Observable
 810 | 
 811 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
 812 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
 813 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))
 814 | 
 815 | Observable.merge(source1, source2, source3) \
 816 |     .subscribe(lambda s: print(s))
 817 | 
 818 | # keep application alive until user presses a key
 819 | input("Press any key to quit\n")
 820 | ```
 821 | 
 822 | **OUTPUT:**
 823 | 
 824 | ```
 825 | Source 3: 0
 826 | Source 2: 0
 827 | Source 3: 1
 828 | Source 3: 2
 829 | Source 1: 0
 830 | Source 2: 1
 831 | Source 3: 3
 832 | Source 2: 2
 833 | Source 3: 4
 834 | Source 3: 5
 835 | Source 2: 3
 836 | Source 1: 1
 837 | etc...
 838 | ```
 839 | 
 840 | The three infinite Observables above each emit consecutive integers at a different interval (1000 milliseconds, 500 milliseconds, and 300 milliseconds) and put each integer into a String labeling the source. `Observable.merge()` then combines these three infinite Observables into one.
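
To see why the merged output arrives in that order, it can help to lay the emissions on a timeline. The sketch below is a plain-Python simulation, not RxPY code. It assumes that emission `i` of an interval with period `p` fires at `(i + 1) * p` milliseconds, which is consistent with the output shown above. `ticks` is a hypothetical helper, and ties between sources firing at the same instant are broken alphabetically here, whereas the real order at a tie can vary.

```python
import heapq
from itertools import count, islice


def ticks(label, period_ms):
    """Yield (time_ms, text) pairs; emission i is assumed to fire at (i + 1) * period_ms."""
    for i in count():
        yield ((i + 1) * period_ms, "{0}: {1}".format(label, i))


# heapq.merge lazily merges the already time-ordered streams into one timeline
timeline = heapq.merge(
    ticks("Source 1", 1000),
    ticks("Source 2", 500),
    ticks("Source 3", 300),
)

first_events = list(islice(timeline, 6))
for time_ms, text in first_events:
    print(time_ms, text)
```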
 841 | 
 842 | ## 6.1C - `merge_all()`
 843 | 
 844 | Another way to accomplish this is to make a List containing all three Observables and pass it to `Observable.from_()`. This creates an Observable emitting Observables, and you can then call `merge_all()` to turn each one into its emissions.
 845 | 
 846 | ```python
 847 | from rx import Observable
 848 | 
 849 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
 850 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
 851 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))
 852 | 
 853 | Observable.from_([source1,source2,source3]) \
 854 |     .merge_all() \
 855 |     .subscribe(lambda s: print(s))
 856 | 
 857 | # keep application alive until user presses a key
 858 | input("Press any key to quit\n")
 859 | ```
 860 | 
 861 | ## 6.1D - `merge_all()` (Continued)
 862 | 
 863 | If you are creating an `Observable` off each emission on-the-fly, `merge_all()` can be helpful here as well. Say you have a list of Strings containing numbers separated by `/`. You can map each String to be `split()` and then pass those separated values to an `Observable.from_()`. Then you can call `merge_all()` afterwards.
 864 | 
 865 | ```python
 866 | from rx import Observable
 867 | 
 868 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
 869 | 
 870 | Observable.from_(items) \
 871 |     .map(lambda s: Observable.from_(s.split("/"))) \
 872 |     .merge_all() \
 873 |     .map(lambda s: int(s)) \
 874 |     .subscribe(lambda i: print(i))
 875 | 
 876 | ```
 877 | 
 878 | **OUTPUT:**
 879 | 
 880 | ```
 881 | 134
 882 | 34
 883 | 64
 884 | 235
 885 | 22
 886 | 66
 887 | 132
 888 | 98
 889 | 8
 890 | 77
 891 | 112
 892 | 34
 893 | 86
 894 | 778
 895 | 11
 896 | 22
 897 | 12
 898 | ```
 899 | 
 900 | ## 6.1E - `flat_map()`
 901 | 
 902 | An alternative way of expressing the previous example (6.1D) is using `flat_map()`. It consolidates mapping to an `Observable` and calling `merge_all()` into a single operator.
 903 | 
 904 | ```python
 905 | from rx import Observable
 906 | 
 907 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
 908 | 
 909 | Observable.from_(items) \
 910 |     .flat_map(lambda s: Observable.from_(s.split("/"))) \
 911 |     .map(lambda s: int(s)) \
 912 |     .subscribe(lambda i: print(i))
 913 | ```
 914 | 
 915 | We will prefer `flat_map()` over the `map()`/`merge_all()` combination from now on since it is much more succinct.
 916 | 
 917 | # 6.2 Concat and Zip
 918 | 
 919 | `Observable.concat()` and the `concat_all()` operator are similar to `Observable.merge()` and the `merge_all()` operator. The only difference is that they emit items from each `Observable` _sequentially_, firing off each `Observable` in order and one at a time. Therefore, this is not something you want to use with infinite Observables, because the first infinite `Observable` will occupy its place in the queue forever and stop the Observables behind it from firing. They are helpful for finite data sets though.
 920 | 
 921 | 
 922 | ## 6.2A - `concat()`
 923 | 
 924 | Our previous `merge()` example can now emit items in order:
 925 | 
 926 | ```python
 927 | from rx import Observable
 928 | 
 929 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 930 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])
 931 | 
 932 | Observable.concat(source1,source2) \
 933 |     .subscribe(lambda s: print(s))
 934 | ```
 935 | 
 936 | **OUTPUT:**
 937 | 
 938 | ```
 939 | Alpha
 940 | Beta
 941 | Gamma
 942 | Delta
 943 | Epsilon
 944 | Zeta
 945 | Eta
 946 | Theta
 947 | Iota
 948 | ```
 949 | 
 950 | ## 6.2B - `concat_all()`
 951 | 
 952 | We can make our earlier example splitting Strings ordered using `concat_all()` instead of `merge_all()`.
 953 | 
 954 | ```python
 955 | from rx import Observable
 956 | 
 957 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
 958 | 
 959 | Observable.from_(items) \
 960 |     .map(lambda s: Observable.from_(s.split("/"))) \
 961 |     .concat_all() \
 962 |     .map(lambda s: int(s)) \
 963 |     .subscribe(lambda i: print(i))
 964 | ```
 965 | 
 966 | **OUTPUT:**
 967 | 
 968 | ```
 969 | 134
 970 | 34
 971 | 235
 972 | 132
 973 | 77
 974 | 64
 975 | 22
 976 | 98
 977 | 112
 978 | 86
 979 | 11
 980 | 66
 981 | 8
 982 | 34
 983 | 778
 984 | 22
 985 | 12
 986 | ```
 987 | 
 988 | If you do not care about ordering, it is recommended to use `merge_all()` or `flat_map()`. `concat_all()` can behave unpredictably with certain operators like `group_by()`, which we will cover later.
 989 | 
 990 | 
 991 | You can also use `concat_map()` in the same spirit as `flat_map()`, preserving the order of sequence.
 992 | 
 993 | ```python
 994 | from rx import Observable
 995 | 
 996 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
 997 | 
 998 | Observable.from_(items) \
 999 |     .concat_map(lambda s: Observable.from_(s.split("/"))) \
1000 |     .map(lambda s: int(s)) \
1001 |     .subscribe(lambda i: print(i))
1002 | ```
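
For intuition, the ordered flattening described for `concat_all()` and `concat_map()` resembles what `itertools.chain.from_iterable()` does with plain iterables. The following is a plain-Python analogy rather than RxPY code:

```python
from itertools import chain

items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]

# chain.from_iterable drains each split list completely before starting the next,
# which mirrors the sequential behavior described for concat_all()/concat_map()
numbers = [int(s) for s in chain.from_iterable(item.split("/") for item in items)]

print(numbers)
# [134, 34, 235, 132, 77, 64, 22, 98, 112, 86, 11, 66, 8, 34, 778, 22, 12]
```

Note that `int("08")` yields `8`, so the leading zero disappears once the Strings are mapped to integers.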
1003 | 
1004 | 
1005 | ## 6.2C - Zip
1006 | 
1007 | Zipping pairs emissions from two or more sources and turns them into a single `Observable`.
1008 | 
1009 | ```python
1010 | from rx import Observable
1011 | 
1012 | letters = Observable.from_(["A","B","C","D","E","F"])
1013 | numbers = Observable.range(1,5)
1014 | 
1015 | Observable.zip(letters,numbers, lambda l,n: "{0}-{1}".format(l,n)) \
1016 |     .subscribe(lambda i: print(i))
1017 | ```
1018 | 
1019 | **OUTPUT:**
1020 | 
1021 | ```
1022 | A-1
1023 | B-2
1024 | C-3
1025 | D-4
1026 | E-5
1027 | ```
1028 | 
1029 | You can alternatively express this as an operator.
1030 | 
1031 | ```python
1032 | letters.zip(numbers, lambda l,n: "{0}-{1}".format(l,n)) \
1033 |     .subscribe(lambda i: print(i))
1034 | ```
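
Notice that "F" never appears in the output above: `letters` has six emissions but `numbers` only has five, so the sixth letter has nothing to pair with. Python's built-in `zip()` stops at the shortest input in the same way, which makes for a quick plain-Python analogy (not RxPY code):

```python
letters = ["A", "B", "C", "D", "E", "F"]
numbers = range(1, 6)  # 1 through 5, matching the five numbers in the output above

pairs = ["{0}-{1}".format(l, n) for l, n in zip(letters, numbers)]

print(pairs)
# ['A-1', 'B-2', 'C-3', 'D-4', 'E-5']  ("F" has no partner, so it is dropped)
```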
1035 | 
1036 | ## 6.3D - Using Zip to Space Emissions
1037 | 
1038 | Zip can also be helpful to space out emissions by zipping an Observable with an `Observable.interval()`. For instance, we can space out five emissions at one-second intervals.
1039 | 
1040 | ```python
1041 | from rx import Observable
1042 | 
1043 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
1044 | intervals = Observable.interval(1000)
1045 | 
1046 | Observable.zip(letters,intervals, lambda s,i: s) \
1047 |     .subscribe(lambda s: print(s))
1048 | 
1049 | input("Press any key to quit\n")
1050 | ```
1051 | 
1052 | Note that `zip()` can get overwhelmed with infinite hot Observables where one produces emissions faster than another. You might want to consider using `combine_latest()` or `with_latest_from()` instead of `zip()`, which will pair with the latest emission from each source. For the sake of brevity, we will not cover these operators in this course, but you can read more about them in the ReactiveX documentation.
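
The difference between the two pairing rules can be sketched with plain Python over a recorded sequence of events. This is a simplified model of the semantics, not RxPY code; the function names, the `"fast"`/`"slow"` labels, and the sample events are all hypothetical.

```python
from collections import deque


def zip_events(events):
    """Pair the n-th fast value with the n-th slow value, buffering whatever is unpaired."""
    fast, slow, out = deque(), deque(), []
    for source, value in events:
        (fast if source == "fast" else slow).append(value)
        if fast and slow:
            out.append((fast.popleft(), slow.popleft()))
    return out, list(fast)


def combine_latest_events(events):
    """Emit the latest value of each source whenever either one emits."""
    latest, out = {}, []
    for source, value in events:
        latest[source] = value
        if len(latest) == 2:  # wait until both sources have emitted at least once
            out.append((latest["fast"], latest["slow"]))
    return out


events = [("fast", 0), ("fast", 1), ("slow", "a"),
          ("fast", 2), ("fast", 3), ("slow", "b")]

zipped, leftover = zip_events(events)
combined = combine_latest_events(events)

print(zipped)    # [(0, 'a'), (1, 'b')]
print(leftover)  # [2, 3]  <- this buffer keeps growing if "fast" stays faster
print(combined)  # [(1, 'a'), (2, 'a'), (3, 'a'), (3, 'b')]
```

The growing `leftover` buffer is the "overwhelmed" situation described above, while the latest-value approach never queues anything.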
1053 | 
1054 | 
1055 | # 6.4 - Group By
1056 | 
1057 | For the purposes of data science, one of the most powerful operators in ReactiveX is `group_by()`. It will yield an Observable emitting GroupedObservables, where each `GroupedObservable` pushes items with a given key. It behaves just like any other `Observable`, but it has a `key` property which we will leverage in a moment.
1058 | 
1059 | But first, let's group some `String` emissions by keying on their lengths. Then let's collect emissions for each grouping into a `List`. Then we can call `flat_map()` to yield all the Lists.
1060 | 
1061 | ## 6.4A - Group into Lists
1062 | 
1063 | ```python
1064 | from rx import Observable
1065 | 
1066 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
1067 | 
1068 | Observable.from_(items) \
1069 |     .group_by(lambda s: len(s)) \
1070 |     .flat_map(lambda grp: grp.to_list()) \
1071 |     .subscribe(lambda i: print(i))
1072 | ```
1073 | 
1074 | **OUTPUT:**
1075 | 
1076 | ```
1077 | ['Alpha', 'Gamma', 'Delta']
1078 | ['Beta']
1079 | ['Epsilon']
1080 | ```
1081 | 
1082 | `group_by()` is efficient because it is still 100% reactive, pushing items one at a time through the different GroupedObservables. You can also leverage the `key` property and tuple it up with an aggregated value. This is helpful if you want to create a `Dict` that holds aggregations by key values.
1083 | 
1084 | For instance, if you want to find the count of each word length occurrence, you can create a `Dict` like this:
1085 | 
1086 | ## 6.4B - Getting Length Counts
1087 | 
1088 | ```python
1089 | from rx import Observable
1090 | 
1091 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
1092 | 
1093 | Observable.from_(items) \
1094 |     .group_by(lambda s: len(s)) \
1095 |     .flat_map(lambda grp:
1096 |          grp.count().map(lambda ct: (grp.key, ct))
1097 |     ) \
1098 |     .to_dict(lambda key_value: key_value[0], lambda key_value: key_value[1]) \
1099 |     .subscribe(lambda i: print(i))
1100 | ```
1101 | 
1102 | **OUTPUT:**
1103 | 
1104 | ```
1105 | {4: 1, 5: 3, 7: 1}
1106 | ```
1107 | 
1108 | You can interpret the returned `Dict` above as "for length 4 there is 1 occurrence, for length 5 there are 3 occurrences, etc".
1109 | 
1110 | `group_by()` is somewhat abstract but it is a powerful and efficient way to perform aggregations on a given key. It also works with infinite Observables assuming you use infinite-friendly operators on each `GroupedObservable`. We will use `group_by()` a few more times in this course.
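
If `group_by()` feels abstract, it may help to compare the two examples above with their batch equivalents in plain Python. This sketch uses only the standard library and processes the whole list at once, so it illustrates the results rather than the reactive, one-at-a-time behavior.

```python
from collections import Counter, defaultdict

items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]

# group the words by length (compare with 6.4A)
groups = defaultdict(list)
for s in items:
    groups[len(s)].append(s)

# count the words per length (compare with 6.4B)
length_counts = Counter(len(s) for s in items)

print(dict(groups))         # {5: ['Alpha', 'Gamma', 'Delta'], 4: ['Beta'], 7: ['Epsilon']}
print(dict(length_counts))  # {5: 3, 4: 1, 7: 1}
```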
1111 | 
1112 | 
1113 | # Section VII - Reading and Analyzing data
1114 | 
1115 | In this chapter we will look over basic ways to reactively read and analyze data from text files, URLs, and SQL. We will also integrate concepts we previously learned to create a reactive word counter that runs on a schedule and detects changes to a file.
1116 | 
1117 | One catch with using `Observable.from_()` with a data source iterable is that it only iterates once, so any Subscriber after the first receives no data. To get around this we will use functions to create a new `Observable` each time we need to subscribe to a data source. A slightly more advanced way to solve this issue is to use `Observable.defer()` which we will not cover here, but you can read about it in the Appendix.
1118 | 
1119 | It is good to leverage functions that return Observables anyway. You can accept arguments to build the Observable chain that is returned and increase reusability.
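
The one-pass behavior is a property of the underlying Python iterable rather than of Rx itself. The sketch below shows it with plain Python, using `io.StringIO` as a stand-in for a file; `open_source` is a hypothetical factory function in the spirit of the functions used in this section.

```python
import io


def open_source():
    """Hypothetical stand-in for opening a file: returns a fresh one-shot iterable."""
    return io.StringIO("Alpha\nBeta\nGamma\n")


shared = open_source()
first_pass = [line.strip() for line in shared]
second_pass = [line.strip() for line in shared]  # the iterator is already exhausted

fresh_pass = [line.strip() for line in open_source()]  # a new source starts over

print(first_pass)   # ['Alpha', 'Beta', 'Gamma']
print(second_pass)  # []
print(fresh_pass)   # ['Alpha', 'Beta', 'Gamma']
```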
1120 | 
1121 | ## 7.1A - Reading a Text File
1122 | 
1123 | As stated earlier, anything that is iterable can be turned into an `Observable` using `Observable.from_()`. We can emit the lines from a text file in this manner. If I have a raw text file called `bbc_news_article.txt` in my Python project, I can emit the lines like this:
1124 | 
1125 | ```python
1126 | from rx import Observable
1127 | 
1128 | 
1129 | def read_lines(file_name):
1130 |     file = open(file_name)
1131 | 
1132 |     return Observable.from_(file) \
1133 |         .map(lambda l: l.strip()) \
1134 |         .filter(lambda l: l != "")
1135 | 
1136 | 
1137 | read_lines("bbc_news_article.txt").subscribe(lambda s: print(s))
1138 | ```
1139 | 
1140 | **OUTPUT:**
1141 | 
1142 | ```
1143 | Giant waves damage S Asia economy
1144 | Governments, aid agencies, insurers and travel firms are among those counting the cost of the massive earthquake and waves that hammered southern Asia.
1145 | The worst-hit areas are Sri Lanka, India, Indonesia and Thailand, with at least 23,000 people killed. Early estimates from the World Bank put the amount of aid needed at about $5bn (£2.6bn), similar to the cash offered Central America after Hurricane Mitch. Mitch killed about 10,000 people and caused damage of about $10bn in 1998. World Bank spokesman Damien
1146 | ...
1147 | ```
1148 | 
1149 | I use the `map()` and `filter()` operators to strip any leading and trailing whitespace from each line, as well as to remove lines that are empty.
1150 | 
1151 | We will use this example for a project at the end of this section.
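
The `read_lines()` function above never closes the file it opens. One possible variation, sketched here in plain Python, is a generator that does the same cleaning inside a `with` block so the file is closed once iteration finishes. `clean_lines` is a hypothetical name, and the temporary file merely stands in for `bbc_news_article.txt`.

```python
import os
import tempfile


def clean_lines(file_name):
    """Yield stripped, non-empty lines and close the file when iteration ends."""
    with open(file_name) as f:
        for line in f:
            line = line.strip()
            if line != "":
                yield line


# demo with a throwaway file standing in for bbc_news_article.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("  Giant waves damage S Asia economy  \n\n   \nGovernments, aid agencies\n")
    demo_path = tmp.name

cleaned = list(clean_lines(demo_path))
os.remove(demo_path)

print(cleaned)
# ['Giant waves damage S Asia economy', 'Governments, aid agencies']
```

Since a generator is iterable, it should be possible to pass `clean_lines(file_name)` to `Observable.from_()`, although that combination is not tested here.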
1152 | 
1153 | 
1154 | ## 7.1B - Reading a URL
1155 | 
1156 | You can also read content from the web in a similar manner. This can be a powerful way to do web scraping and data wrangling, especially if you reactively push multiple URLs or URL arguments and scrape the content off each page. Just be kind and don't tax somebody's system!
1157 | 
1158 | I saved a simple raw text page of the 50 U.S. states on a Gist page. You can view it with this URL: https://goo.gl/rIaDyM.
1159 | 
1160 | If you want to read the lines off the response, you can do it like this:
1161 | 
1162 | ```python
1163 | from rx import Observable
1164 | from urllib.request import urlopen
1165 | 
1166 | 
1167 | def read_request(link):
1168 |     f = urlopen(link)
1169 | 
1170 |     return Observable.from_(f) \
1171 |         .map(lambda s: s.decode("utf-8").strip())
1172 | 
1173 | read_request("https://goo.gl/rIaDyM") \
1174 |     .subscribe(lambda s: print(s))
1175 | ```
1176 | 
1177 | **OUTPUT:**
1178 | 
1179 | ```
1180 | Alabama
1181 | Alaska
1182 | Arizona
1183 | Arkansas
1184 | California
1185 | Colorado
1186 | Connecticut
1187 | Delaware
1188 | ...
1189 | ```
1190 | 
1191 | In the `map()` we have to decode the bytes into UTF-8 Strings. We also clean leading and trailing whitespace with `strip()`, and then finally we print each line.
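
The decoding step can be tried without a network connection. In this sketch an `io.BytesIO` object plays the role of the response, on the assumption that iterating the real response yields one `bytes` line at a time in the same way; `fake_response` and its contents are made up for illustration.

```python
import io

# hypothetical stand-in for the object returned by urlopen(): iterating yields bytes
fake_response = io.BytesIO(b"Alabama\r\nAlaska\r\nArizona\r\n")

raw_lines = list(fake_response)
states = [b.decode("utf-8").strip() for b in raw_lines]

print(raw_lines[0])  # b'Alabama\r\n'
print(states)        # ['Alabama', 'Alaska', 'Arizona']
```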
1192 | 
1193 | ## 7.1C - Recursively Iterating Files in Directories (EXTRA)
1194 | 
1195 | You can use Rx to build powerful recursion patterns for iterating files. For this example, you can download and unzip a BBC article dataset with thousands of articles in text file format here: http://mlg.ucd.ie/datasets/bbc.html
1196 | 
1197 | 
1198 | ```python
1199 | from rx import Observable
1200 | import os
1201 | 
1202 | 
1203 | def recursive_files_in_directory(folder):
1204 | 
1205 |     def emit_files_recursively(observer):
1206 |         for root, directories, filenames in os.walk(folder):
1207 |             for directory in directories:
1208 |                 observer.on_next(os.path.join(root, directory))
1209 |             for filename in filenames:
1210 |                 observer.on_next(os.path.join(root, filename))
1211 | 
1212 |         observer.on_completed()
1213 | 
1214 |     return Observable.create(emit_files_recursively)
1215 | 
1216 | 
1217 | recursive_files_in_directory('/home/thomas/Desktop/bbc_data_sets') \
1218 |     .filter(lambda f: f.endswith('.txt')) \
1219 |     .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e))
1220 | 
1221 | ```
1222 | 
1223 | 
1224 | You can iterate files through a directory and any nested directories, filter only for files you are interested in (such as .txt files), and then emit the lines from all the files.
1225 | 
1226 | 
1227 | ```python
1228 | from rx import Observable
1229 | import os
1230 | 
1231 | 
1232 | def recursive_files_in_directory(folder):
1233 | 
1234 |     def emit_files_recursively(observer):
1235 |         for root, directories, filenames in os.walk(folder):
1236 |             for directory in directories:
1237 |                 observer.on_next(os.path.join(root, directory))
1238 |             for filename in filenames:
1239 |                 observer.on_next(os.path.join(root, filename))
1240 | 
1241 |         observer.on_completed()
1242 | 
1243 |     return Observable.create(emit_files_recursively)
1244 | 
1245 | 
1246 | recursive_files_in_directory('/home/thomas/Desktop/bbc') \
1247 |     .filter(lambda f: f.endswith('.txt')) \
1248 |     .flat_map(lambda f:  Observable.from_(open(f, encoding="ISO-8859-1"))) \
1249 |     .map(lambda l: l.strip()) \
1250 |     .filter(lambda l: l != "") \
1251 |     .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e))
1252 | 
1253 | ```
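
The function above emits directory paths as well as file paths, and the `.txt` filter then discards the directories. If only files are of interest, the traversal could be written as a generator that skips directories up front. The following is a plain-Python sketch under that assumption; `txt_files` is a hypothetical helper and the temporary folder tree is made up for the demo.

```python
import os
import tempfile


def txt_files(folder):
    """Yield the path of every .txt file under folder, including nested folders."""
    for root, _directories, filenames in os.walk(folder):
        for filename in filenames:
            if filename.endswith(".txt"):
                yield os.path.join(root, filename)


# build a small throwaway tree: one .txt at the top, one nested, one non-text file
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "business"))
    for relative in ["top.txt", os.path.join("business", "001.txt"), "notes.csv"]:
        with open(os.path.join(base, relative), "w") as f:
            f.write("sample\n")

    found = sorted(os.path.relpath(p, base) for p in txt_files(base))

print(found)
```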
1254 | 
1255 | ## 7.2 - Reading a SQL Query
1256 | 
1257 | SQLAlchemy is the go-to Python library for SQL querying, and since its query results are iterable, it can easily support Rx. In this example, I am using a SQLite database file which you can download at https://goo.gl/9DYXPS. You can also download it on my [_Getting Started with SQL_ GitHub page](https://github.com/thomasnield/oreilly_getting_started_with_sql).
1258 | 
1259 | 
1260 | 
1261 | ### 7.2A - Emitting a query
1262 | 
1263 | When you set up your engine, statement, and connection, you can reactively emit each result (which will be a tuple) from a query using `Observable.from_()`. Since a SQL query result set can only be iterated once, it is easiest to use a function to create a new one and return it in an `Observable` each time. That way multiple subscribers can be accommodated easily.
1264 | 
1265 | 
1266 | ```python
1267 | from sqlalchemy import create_engine, text
1268 | from rx import Observable
1269 | 
1270 | engine = create_engine('sqlite:///rexon_metals.db')
1271 | conn = engine.connect()
1272 | 
1273 | 
1274 | def get_all_customers():
1275 |     stmt = text("SELECT * FROM CUSTOMER")
1276 |     return Observable.from_(conn.execute(stmt))
1277 | 
1278 | 
1279 | get_all_customers().subscribe(lambda r: print(r))
1280 | ```
1281 | 
1282 | **OUTPUT:**
1283 | 
1284 | ```
1285 | (1, 'LITE Industrial', 'Southwest', '729 Ravine Way', 'Irving', 'TX', 75014)
1286 | (2, 'Rex Tooling Inc', 'Southwest', '6129 Collie Blvd', 'Dallas', 'TX', 75201)
1287 | (3, 'Re-Barre Construction', 'Southwest', '9043 Windy Dr', 'Irving', 'TX', 75032)
1288 | (4, 'Prairie Construction', 'Southwest', '264 Long Rd', 'Moore', 'OK', 62104)
1289 | (5, 'Marsh Lane Metal Works', 'Southeast', '9143 Marsh Ln', 'Avondale', 'LA', 79782)
1290 | ```
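
The "iterate once" behavior of a result set can be seen without Rx or SQLAlchemy. The sketch below uses Python's built-in `sqlite3` module instead, with an in-memory database and a simplified, hypothetical two-column `CUSTOMER` table standing in for `rexon_metals.db`.

```python
import sqlite3

# in-memory stand-in for rexon_metals.db with a simplified, hypothetical CUSTOMER table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, NAME TEXT)")
conn.executemany(
    "INSERT INTO CUSTOMER (NAME) VALUES (?)",
    [("LITE Industrial",), ("Rex Tooling Inc",)],
)

cursor = conn.execute("SELECT * FROM CUSTOMER")
first_pass = list(cursor)
second_pass = list(cursor)  # the result set was consumed by the first pass


def get_all_customers():
    """Run the query again so every caller gets a fresh result set."""
    return conn.execute("SELECT * FROM CUSTOMER")


fresh_pass = list(get_all_customers())

print(first_pass)   # [(1, 'LITE Industrial'), (2, 'Rex Tooling Inc')]
print(second_pass)  # []
print(fresh_pass)   # [(1, 'LITE Industrial'), (2, 'Rex Tooling Inc')]
```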
1291 | 
1292 | ### 7.2B - Using Observable.defer()
1293 | 
1294 | If you need multiple subscribers for sources that can only be iterated once, use `Observable.defer()` to generate a new iterable object for each subscription.
1295 | 
1296 | ```python
1297 | from sqlalchemy import create_engine, text
1298 | from rx import Observable
1299 | 
1300 | engine = create_engine('sqlite:///rexon_metals.db')
1301 | conn = engine.connect()
1302 | 
1303 | 
1304 | def get_all_customers():
1305 |     stmt = text("SELECT * FROM CUSTOMER")
1306 |     return Observable.defer(lambda: Observable.from_(conn.execute(stmt)))
1307 | 
1308 | my_source = get_all_customers()
1309 | 
1310 | my_source.subscribe(lambda r: print(r))
1311 | my_source.subscribe(lambda r: print(r))
1312 | ```
1313 | 
1314 | 
1315 | 
1316 | ### 7.2C - Merging multiple queries
1317 | 
1318 | You can create some powerful reactive patterns when working with databases. For instance, say you wanted to query for customers with IDs 1, 3, and 5. Of course you can do this in raw SQL like so:
1319 | 
1320 | ```sql
1321 | SELECT * FROM CUSTOMER WHERE CUSTOMER_ID in (1,3,5)
1322 | ```
1323 | 
1324 | However, let's leverage Rx to keep our API simple and minimize the number of query functions it needs.
1325 | 
1326 | You can create a single `customer_for_id()` function that returns an `Observable` emitting a customer for a given `customer_id`. You can compose it into a reactive chain by using `merge_all()` or `flat_map()`. Do this by emitting the desired IDs and mapping each one to `customer_for_id()`, then merging the results of all three queries. The code below does the mapping and merging in one step with `flat_map()`.
1327 | 
1328 | ```python
1329 | from sqlalchemy import create_engine, text
1330 | from rx import Observable
1331 | 
1332 | engine = create_engine('sqlite:///rexon_metals.db')
1333 | conn = engine.connect()
1334 | 
1335 | def get_all_customers():
1336 |     stmt = text("SELECT * FROM CUSTOMER")
1337 |     return Observable.from_(conn.execute(stmt))
1338 | 
1339 | def customer_for_id(customer_id):
1340 |     stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id")
1341 |     return Observable.from_(conn.execute(stmt, id=customer_id))
1342 | 
1343 | # Query customers with IDs 1, 3, and 5
1344 | Observable.from_([1, 3, 5]) \
1345 |     .flat_map(lambda id: customer_for_id(id)) \
1346 |     .subscribe(lambda r: print(r))
1347 | 
1348 | ```
1349 | 
1350 | **OUTPUT:**
1351 | 
1352 | ```
1353 | (1, 'LITE Industrial', 'Southwest', '729 Ravine Way', 'Irving', 'TX', 75014)
1354 | (3, 'Re-Barre Construction', 'Southwest', '9043 Windy Dr', 'Irving', 'TX', 75032)
1355 | (5, 'Marsh Lane Metal Works', 'Southeast', '9143 Marsh Ln', 'Avondale', 'LA', 79782)
1356 | ```
1357 | 
1358 | 
1359 | ### 7.2D - Writing Data (EXTRA)
1360 | 
1361 | You can also use Rx to write data to a database. One way to do this is to put the writing operations in the Subscriber, but you can get a bit more creative and flexible with Rx. For instance, we can create a function called `insert_new_customer()` that accepts the parameters needed to create a new `CUSTOMER` record. But, we can return an `Observable` that emits the automatically assigned PRIMARY KEY value for that record. This allows us to compose writing operations with other operations, such as querying for the record we just created.
1362 | 
1363 | ```python
1364 | from sqlalchemy import create_engine, text
1365 | from rx import Observable
1366 | 
1367 | 
1368 | engine = create_engine('sqlite:///rexon_metals.db')
1369 | conn = engine.connect()
1370 | 
1371 | 
1372 | def get_all_customers():
1373 |     stmt = text("SELECT * FROM CUSTOMER")
1374 |     return Observable.from_(conn.execute(stmt))
1375 | 
1376 | 
1377 | def customer_for_id(customer_id):
1378 |     stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id")
1379 |     return Observable.from_(conn.execute(stmt, id=customer_id))
1380 | 
1381 | 
1382 | def insert_new_customer(customer_name, region, street_address, city, state, zip_code):
1383 |     stmt = text("INSERT INTO CUSTOMER (NAME, REGION, STREET_ADDRESS, CITY, STATE, ZIP) VALUES ("
1384 |                 ":customer_name, :region, :street_address, :city, :state, :zip_code)")
1385 | 
1386 |     result = conn.execute(stmt, customer_name=customer_name, region=region, street_address=street_address, city=city, state=state, zip_code=zip_code)
1387 |     return Observable.just(result.lastrowid)
1388 | 
1389 | # Create new customer, emit primary key ID, and query that customer
1390 | insert_new_customer('RMS Materials','Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', '02201') \
1391 |     .flat_map(lambda i: customer_for_id(i)) \
1392 |     .subscribe(lambda s: print(s))
1393 | 
1394 | ```
1395 | 
1396 | **OUTPUT:**
1397 | 
1398 | ```
1399 | (6, 'RMS Materials', 'Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', 2201)
1400 | 
1401 | ```
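
The core idea here, returning the generated key so that it can feed a follow-up query, can be tried in isolation. This sketch uses the built-in `sqlite3` module rather than SQLAlchemy and Rx, with an in-memory database and a cut-down, hypothetical `CUSTOMER` table, so the function signatures differ from the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUSTOMER_ID INTEGER PRIMARY KEY, NAME TEXT, REGION TEXT)")


def insert_new_customer(customer_name, region):
    """Insert a row and return the primary key that SQLite assigned to it."""
    cursor = conn.execute(
        "INSERT INTO CUSTOMER (NAME, REGION) VALUES (?, ?)",
        (customer_name, region),
    )
    return cursor.lastrowid


def customer_for_id(customer_id):
    """Return the rows matching the given primary key."""
    return conn.execute(
        "SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = ?", (customer_id,)
    ).fetchall()


# create a customer, then query for it using the key that came back
new_id = insert_new_customer("RMS Materials", "Northeast")
rows = customer_for_id(new_id)

print(new_id)  # 1
print(rows)    # [(1, 'RMS Materials', 'Northeast')]
```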
1402 | 
1403 | 
1404 | ## 7.3 - A Scheduled Reactive Word Counter
1405 | 
1406 | Let's apply everything we have learned so far to create a reactive word counter process.
1407 | 
1408 | 
1409 | ### 7.3A - Emitting words from a text file
1410 | Let's start by creating a function that returns an `Observable` emitting the cleaned words in a text file: it strips punctuation, drops empty results, and makes every word lower case.
1411 | 
1412 | ```python
1413 | from rx import Observable
1414 | import re
1415 | 
1416 | 
1417 | def words_from_file(file_name):
1418 |     file = open(file_name)
1419 | 
1420 |     # parse, clean, and push words in text file
1421 |     return Observable.from_(file) \
1422 |         .flat_map(lambda s: Observable.from_(s.split())) \
1423 |         .map(lambda w: re.sub(r'[^\w]', '', w)) \
1424 |         .filter(lambda w: w != "") \
1425 |         .map(lambda w: w.lower())
1426 | 
1427 | article_file = "bbc_news_article.txt"
1428 | words_from_file(article_file).subscribe(lambda w: print(w))
1429 | ```
1430 | 
1431 | **OUTPUT:**
1432 | 
1433 | ```
1434 | giant
1435 | waves
1436 | damage
1437 | governments
1438 | s
1439 | aid
1440 | asia
1441 | agencies
1442 | the
1443 | economy
1444 | ...
1445 | ```
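
To see what the cleaning steps do to individual words, the same split, substitute, filter, and lower-case sequence can be applied to a single line in plain Python. The sample sentence is made up for illustration.

```python
import re

line = "Early estimates put aid - at about $5bn (2.6bn pounds)."

words = []
for token in line.split():
    cleaned = re.sub(r'[^\w]', '', token)  # drop every non-word character
    if cleaned != "":                      # "-" becomes "" and is skipped
        words.append(cleaned.lower())

print(words)
# ['early', 'estimates', 'put', 'aid', 'at', 'about', '5bn', '26bn', 'pounds']
```

Note that removing punctuation also merges `2.6bn` into `26bn`, which may or may not matter for a simple word count.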
1446 | 
1447 | 
1448 | ### 7.3B - Counting Word Occurrences
1449 | 
1450 | Let's create another function called `word_counter()`. It will leverage the existing `words_from_file()`, use `group_by()` to count the word occurrences, and then tuple each word with its count.
1451 | 
1452 | ```python
1453 | from rx import Observable
1454 | import re
1455 | 
1456 | 
1457 | def words_from_file(file_name):
1458 |     file = open(file_name)
1459 | 
1460 |     # parse, clean, and push words in text file
1461 |     return Observable.from_(file) \
1462 |         .flat_map(lambda s: Observable.from_(s.split())) \
1463 |         .map(lambda w: re.sub(r'[^\w\s]', '', w)) \
1464 |         .filter(lambda w: w != "") \
1465 |         .map(lambda w: w.lower())
1466 | 
1467 | 
1468 | 
1469 | def word_counter(file_name):
1470 | 
1471 |     # count words using `group_by()`
1472 |     # tuple the word with the count
1473 |     return words_from_file(file_name) \
1474 |         .group_by(lambda word: word) \
1475 |         .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct)))
1476 | 
1477 | article_file = "bbc_news_article.txt"
1478 | word_counter(article_file).subscribe(lambda w: print(w))
1479 | ```
1480 | 
1481 | **OUTPUT:**
1482 | 
1483 | ```
1484 | ('giant', 1)
1485 | ('waves', 3)
1486 | ('damage', 6)
1487 | ('governments', 3)
1488 | ('s', 1)
1489 | ('aid', 10)
1490 | ('asia', 6)
1491 | ('agencies', 3)
1492 | ('the', 78)
1493 | ('economy', 1)
1494 | ...
1495 | ```
1496 | 
1497 | ### 7.3C - Scheduling the Word Count And Notifying of Changes
1498 | 
1499 | Finally, let's schedule this word count to occur every 3 seconds and collect them into a `Dict`. We can use `distinct_until_changed()` to only emit `Dict` items that have changed due to the text file being edited.
1500 | 
1501 | ```python
1502 | # Schedules a reactive process that counts the words in a text file every three seconds,
1503 | # but only prints it as a dict if it has changed
1504 | 
1505 | from rx import Observable
1506 | import re
1507 | 
1508 | 
1509 | def words_from_file(file_name):
1510 |     file = open(file_name)
1511 | 
1512 |     # parse, clean, and push words in text file
1513 |     return Observable.from_(file) \
1514 |         .flat_map(lambda s: Observable.from_(s.split())) \
1515 |         .map(lambda w: re.sub(r'[^\w\s]', '', w)) \
1516 |         .filter(lambda w: w != "") \
1517 |         .map(lambda w: w.lower())
1518 | 
1519 | 
1520 | 
1521 | def word_counter(file_name):
1522 | 
1523 |     # count words using `group_by()`
1524 |     # tuple the word with the count
1525 |     return words_from_file(file_name) \
1526 |         .group_by(lambda word: word) \
1527 |         .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct)))
1528 | 
1529 | 
1530 | # composes the above word_counter() into a dict
1531 | def word_counter_as_dict(file_name):
1532 |     return word_counter(file_name).to_dict(lambda t: t[0], lambda t: t[1])
1533 | 
1534 | 
1535 | # Schedule to create a word count dict for an article every three seconds
1536 | # But only re-print if text is edited and word counts change
1537 | 
1538 | article_file = "bbc_news_article.txt"
1539 | 
1540 | # create a dict every three seconds, but only push if it changed
1541 | Observable.interval(3000) \
1542 |     .flat_map(lambda i: word_counter_as_dict(article_file)) \
1543 |     .distinct_until_changed() \
1544 |     .subscribe(lambda word_ct_dict: print(word_ct_dict))
1545 | 
1546 | # Keep alive until user presses any key
1547 | input("Starting, press any key to quit\n")
1548 | ```
1549 | 
1550 | **OUTPUT:**
1551 | 
1552 | ```
1553 | Starting, press any key to quit
1554 | {'a': 7, 'governments': 3, 'first': 1, 'getting': 1, 'offered': 1, ...
1555 | ```
1556 | 
1557 | Every time the file is edited and words are added, modified, or removed, it should push a new `Dict` reflecting these changes. This can be helpful for running a report on a schedule while emitting a new report to an output only if the data has changed.
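
The filtering step can be pictured as comparing each new `Dict` with the one emitted just before it. The generator below is a plain-Python sketch of that idea, assuming consecutive emissions are compared for equality; it is not RxPY's implementation, and the snapshot values are invented.

```python
def distinct_until_changed(values):
    """Yield a value only when it differs (by ==) from the one just before it."""
    unset = object()
    previous = unset
    for value in values:
        if previous is unset or value != previous:
            yield value
        previous = value


# hypothetical word-count snapshots taken every three seconds
snapshots = [{"aid": 10}, {"aid": 10}, {"aid": 11}, {"aid": 11}, {"aid": 10}]

changes = list(distinct_until_changed(snapshots))
print(changes)
# [{'aid': 10}, {'aid': 11}, {'aid': 10}]
```

Unlike `distinct()`, a value that was seen earlier (here `{'aid': 10}`) is emitted again as long as it differs from its immediate predecessor.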
1558 | 
1559 | Ideally, it is better to hook onto the change event itself rather than running a potentially expensive process every 3 seconds. We will learn how to do this with Twitter in the next section.
1560 | 
1561 | > If you want to see an intensive reactive data analysis example, see my [social media example on Gist](https://goo.gl/NO0Q4P)
1562 | 
1563 | 
1564 | 
1565 | # Section VIII - Hot Observables
1566 | 
1567 | In this section we will learn how to create an `Observable` emitting Tweets for a set of topics. We will wrap an `Observable.create()` around the Tweepy API. But first, let's cover multicasting.
1568 | 
1569 | ## 8.1A - Creating a `ConnectableObservable`
1570 | 
1571 | Remember how cold Observables will replay data to each Subscriber like a music CD?
1572 | 
1573 | ```python
1574 | from rx import Observable
1575 | 
1576 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
1577 | 
1578 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
1579 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
1580 | ```
1581 | 
1582 | **OUTPUT:**
1583 | 
1584 | ```
1585 | Subscriber 1: Alpha
1586 | Subscriber 1: Beta
1587 | Subscriber 1: Gamma
1588 | Subscriber 1: Delta
1589 | Subscriber 1: Epsilon
1590 | Subscriber 2: Alpha
1591 | Subscriber 2: Beta
1592 | Subscriber 2: Gamma
1593 | Subscriber 2: Delta
1594 | Subscriber 2: Epsilon
1595 | ```
1596 | 
1597 | This is often what we want so no data is missed for each Subscriber. But there are times we will want to force cold Observables to become hot Observables. We can do this by calling `publish()` which will return a `ConnectableObservable`. Then we can subscribe our Subscribers to it, then call `connect()` to fire emissions to all Subscribers at once.
1598 | 
1599 | ```python
1600 | from rx import Observable
1601 | 
1602 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]).publish()
1603 | 
1604 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
1605 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
1606 | 
1607 | source.connect()
1608 | ```
1609 | 
1610 | **OUTPUT:**
1611 | 
1612 | ```
1613 | Subscriber 1: Alpha
1614 | Subscriber 2: Alpha
1615 | Subscriber 1: Beta
1616 | Subscriber 2: Beta
1617 | Subscriber 1: Gamma
1618 | Subscriber 2: Gamma
1619 | Subscriber 1: Delta
1620 | Subscriber 2: Delta
1621 | Subscriber 1: Epsilon
1622 | Subscriber 2: Epsilon
1623 | ```
1624 | 
1625 | This is known as multicasting. Notice how the emissions are now interleaved? This is because each emission is going to both subscribers. This is helpful if "replaying" the data is expensive or we just simply want all Subscribers to get the emissions simultaneously.
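
To make the mechanics concrete, here is a minimal plain-Python sketch of what `publish()` and `connect()` do conceptually. The `TinyConnectable` class and its method names are hypothetical and are not RxPY's implementation; the sketch only shows the source being iterated once, with every item fanned out to all registered callbacks.

```python
class TinyConnectable:
    """Hypothetical stand-in for a ConnectableObservable (illustration only)."""

    def __init__(self, source):
        self._source = source      # any iterable acting as the cold source
        self._subscribers = []     # callbacks registered before connect()

    def subscribe(self, on_next):
        # Nothing is emitted yet; we only remember who is listening
        self._subscribers.append(on_next)

    def connect(self):
        # Iterate the source exactly once and fan each item out to everyone
        for item in self._source:
            for on_next in self._subscribers:
                on_next(item)


received = []
source = TinyConnectable(["Alpha", "Beta", "Gamma"])
source.subscribe(lambda s: received.append("Subscriber 1: {0}".format(s)))
source.subscribe(lambda s: received.append("Subscriber 2: {0}".format(s)))
source.connect()

for line in received:
    print(line)
```

Because the source is walked only once, the two subscribers see each item back to back, which is the interleaving shown in the output above.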
1626 | 
1627 | ## 8.1B - Sharing an Interval Observable (EXTRA)
1628 | 
1629 | `Observable.interval()` is actually a cold Observable too. If one Subscriber subscribes to it, and 5 seconds later another Subscriber comes in, that second subscriber will receive its own emissions that "start over".
1630 | 
1631 | ```python
1632 | from rx import Observable
1633 | import time
1634 | 
1635 | source = Observable.interval(1000)
1636 | 
1637 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
1638 | 
1639 | # sleep 5 seconds, then add another subscriber
1640 | time.sleep(5)
1641 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
1642 | 
1643 | input("Press any key to exit\n")
1644 | ```
1645 | 
1646 | **OUTPUT:**
1647 | 
1648 | ```
1649 | Subscriber 1: 0
1650 | Subscriber 1: 1
1651 | Subscriber 1: 2
1652 | Subscriber 1: 3
1653 | Press any key to exit
1654 | Subscriber 1: 4
1655 | Subscriber 2: 0
1656 | Subscriber 1: 5
1657 | Subscriber 2: 1
1658 | Subscriber 1: 6
1659 | Subscriber 2: 2
1660 | Subscriber 1: 7
1661 | Subscriber 2: 3
1662 | 
1663 | ```
1664 | 
Subscriber 2 starts at `0` while Subscriber 1 is already at `4`. If we want both to be on the same timer, we can use `publish()` to create a `ConnectableObservable`.
1666 | 
1667 | 
1668 | ```python
1669 | from rx import Observable
1670 | import time
1671 | 
1672 | source = Observable.interval(1000).publish()
1673 | 
1674 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
1675 | source.connect()
1676 | 
1677 | # sleep 5 seconds, then add another subscriber
1678 | time.sleep(5)
1679 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
1680 | 
1681 | input("Press any key to exit\n")
1682 | ```
1683 | 
1684 | **OUTPUT:**
1685 | 
1686 | ```
1687 | Subscriber 1: 0
1688 | Subscriber 1: 1
1689 | Subscriber 1: 2
1690 | Subscriber 1: 3
1691 | Press any key to exit
1692 | Subscriber 1: 4
1693 | Subscriber 2: 4
1694 | Subscriber 1: 5
1695 | Subscriber 2: 5
1696 | ```
1697 | 
1698 | 
1699 | ## 8.1C - Autoconnecting
1700 | 
We can have our `ConnectableObservable` automatically `connect()` itself when it gets a Subscriber by calling `auto_connect()` on it.
1702 | 
1703 | 
1704 | ```python
1705 | from rx import Observable
1706 | import time
1707 | 
1708 | source = Observable.interval(1000).publish().auto_connect()
1709 | 
1710 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
1711 | 
1712 | # sleep 5 seconds, then add another subscriber
1713 | time.sleep(5)
1714 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
1715 | 
1716 | input("Press any key to exit\n")
1717 | ```
1718 | 
You can also pass `auto_connect()` the number of Subscribers to wait for before it starts firing.

```python
source = Observable.interval(1000).publish().auto_connect(2)
```
1724 | 
1725 | Again, multicasting is helpful when you want all Subscribers to receive the same emissions simultaneously
1726 | and prevent redundant, expensive work for each Subscriber.
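
The idea of waiting for a number of Subscribers can be sketched in plain Python. The `TinyAutoConnect` class below is hypothetical and is not how RxPY implements `auto_connect()`; it only illustrates firing once the subscriber count reaches a threshold.

```python
class TinyAutoConnect:
    """Hypothetical illustration: fire once `subscriber_count` callbacks are registered."""

    def __init__(self, source, subscriber_count=1):
        self._source = source
        self._needed = subscriber_count
        self._subscribers = []
        self._connected = False

    def subscribe(self, on_next):
        self._subscribers.append(on_next)
        # Connect automatically the moment enough subscribers have arrived
        if not self._connected and len(self._subscribers) >= self._needed:
            self._connected = True
            for item in self._source:
                for callback in self._subscribers:
                    callback(item)


log = []
shared = TinyAutoConnect(range(3), subscriber_count=2)

shared.subscribe(lambda i: log.append(("Subscriber 1", i)))
emitted_after_first = len(log)   # nothing has fired yet with one subscriber

shared.subscribe(lambda i: log.append(("Subscriber 2", i)))
print(emitted_after_first, log)
```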
1727 | 
1728 | 
## 8.1D - Multicasting Specific Points
1730 | 
The placement of the multicasting matters. For instance, if you map three emissions to three random integers, but multicast _before_ the `map()` operation, two subscribers will both receive separate random integers.
1732 | 
1733 | ```python
1734 | from rx import Observable
1735 | from random import randint
1736 | 
1737 | 
1738 | three_emissions = Observable.range(1, 3).publish()
1739 | 
1740 | three_random_ints = three_emissions.map(lambda i: randint(1, 100000))
1741 | 
1742 | three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i)))
1743 | three_random_ints.subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i)))
1744 | 
1745 | three_emissions.connect()
1746 | ```
1747 | 
1748 | **OUTPUT:**
1749 | 
1750 | ```
1751 | Subscriber 1 Received: 56976
1752 | Subscriber 1 Received: 882
1753 | Subscriber 1 Received: 59873
1754 | Subscriber 2 Received: 12911
1755 | Subscriber 2 Received: 47631
1756 | Subscriber 2 Received: 84640
1757 | ```
1758 | 
1759 | 
However, if you put the `publish()` _after_ the `map()` operation, both subscribers will receive the same emissions.
1761 | 
1762 | ```python
1763 | from rx import Observable
1764 | from random import randint
1765 | 
1766 | 
1767 | three_emissions = Observable.range(1, 3)
1768 | 
1769 | three_random_ints = three_emissions.map(lambda i: randint(1, 100000)).publish()
1770 | 
1771 | three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i)))
1772 | three_random_ints.subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i)))
1773 | 
1774 | three_random_ints.connect()
1775 | ```
1776 | 
1777 | **OUTPUT:**
1778 | 
1779 | ```
1780 | Subscriber 1 Received: 17500
1781 | Subscriber 2 Received: 17500
1782 | Subscriber 1 Received: 71398
1783 | Subscriber 2 Received: 71398
1784 | Subscriber 1 Received: 90457
1785 | Subscriber 2 Received: 90457
1786 | ```
1787 | 
Therefore, note that most operators will create a separate stream for each subscriber, even if there is a multicasting operation upstream. Typically, you multicast up to the point where operations are common to both subscribers. For instance, if one subscriber simply printed each random number while the second subscriber performed a sum on them, the multicasting should happen before the summing operation, since that is where the two subscribers' operations diverge.
1789 | 
1790 | ```python
1791 | from rx import Observable
1792 | from random import randint
1793 | 
1794 | 
1795 | three_emissions = Observable.range(1, 3)
1796 | 
1797 | three_random_ints = three_emissions.map(lambda i: randint(1, 100000)).publish()
1798 | 
three_random_ints.subscribe(lambda i: print("Subscriber 1 Received: {0}".format(i)))
1800 | 
1801 | three_random_ints.reduce(lambda total, item: total + item) \
1802 |     .subscribe(lambda i: print("Subscriber 2 Received: {0}".format(i)))
1803 | 
1804 | three_random_ints.connect()
1805 | ```
1806 | 
**OUTPUT:**
1808 | 
1809 | ```
1810 | Subscriber 1 Received: 17618
1811 | Subscriber 1 Received: 66227
1812 | Subscriber 1 Received: 36159
1813 | Subscriber 2 Received: 120004
1814 | ```
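
The cost of multicasting in the wrong place can be counted directly. The following plain-Python sketch is a simplified model rather than RxPY code, and the `random_int()` helper and `calls` counter are hypothetical names. It compares how often the mapping work runs when each subscriber maps on its own versus when the mapped results are shared.

```python
from random import randint

calls = {"before_map": 0, "after_map": 0}


def random_int(key):
    # Count every invocation so we can see how often the work is repeated
    calls[key] += 1
    return randint(1, 100000)


source = [1, 2, 3]

# Multicasting BEFORE the mapping step: each subscriber maps on its own
subscriber_1 = [random_int("before_map") for _ in source]
subscriber_2 = [random_int("before_map") for _ in source]

# Multicasting AFTER the mapping step: map once, then share the results
shared = [random_int("after_map") for _ in source]
subscriber_3, subscriber_4 = list(shared), list(shared)

print(calls)
```

In this model the mapping runs six times in the first case and three times in the second, and only in the second case are both subscribers guaranteed to hold identical values.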
1815 | 
1816 | 
## 8.1E - Subjects (EXTRA)
1818 | 
Another way to create a multicasted `Observable` is by declaring a `Subject`. A `Subject` is both an `Observable` and an `Observer`, and you can call its `Observer` functions to push items through it to any Subscribers at any time. It will push these items to all Subscribers.
1820 | 
1821 | ```python
1822 | from rx.subjects import Subject
1823 | 
1824 | subject = Subject()
1825 | 
1826 | subject.filter(lambda i: i < 100) \
1827 |     .map(lambda i: i * 1000) \
1828 |     .subscribe(lambda i: print(i))
1829 | 
1830 | subject.on_next(10)
1831 | subject.on_next(50)
1832 | subject.on_next(105)
1833 | subject.on_next(87)
1834 | 
1835 | subject.on_completed()
1836 | ```
1837 | 
1838 | **OUTPUT:**
1839 | 
1840 | ```
1841 | 10000
1842 | 50000
1843 | 87000
1844 | ```
1845 | 
While they seem convenient, Subjects are often discouraged. They can easily encourage antipatterns and are prone to abuse. They are also difficult to compose against and do not respect `subscribe_on()`. It is better to create Observables that strictly come from one defined source, rather than ones that are openly mutable and let anything push items to them at any time. Use Subjects with discretion.
1847 | 
1848 | ## 8.2 - Querying Live Twitter Feeds
1849 | 
1850 | 
1851 | You can use `Observable.create()` to wrangle and analyze a live Twitter feed.
1852 | 
1853 | You will need to create your own application and access keys/tokens at https://apps.twitter.com.
1854 | 
1855 | If we want to query a live stream of Tweets pertaining to the topics of "Britain" or "France", we can do it like this:
1856 | 
1857 | ```python
1858 | from tweepy.streaming import StreamListener
1859 | from tweepy import OAuthHandler
1860 | from tweepy import Stream
1861 | import json
1862 | from rx import Observable
1863 | 
1864 | # Variables that contains the user credentials to access Twitter API
1865 | access_token = "PUT YOURS HERE"
1866 | access_token_secret = "PUT YOURS HERE"
1867 | consumer_key = "PUT YOURS HERE"
1868 | consumer_secret = "PUT YOURS HERE"
1869 | 
1870 | 
1871 | def tweets_for(topics):
1872 |     def observe_tweets(observer):
1873 |         class TweetListener(StreamListener):
1874 |             def on_data(self, data):
1875 |                 observer.on_next(data)
1876 |                 return True
1877 | 
1878 |             def on_error(self, status):
1879 |                 observer.on_error(status)
1880 | 
        # This handles Twitter authentication and the connection to the Twitter Streaming API
1882 |         l = TweetListener()
1883 |         auth = OAuthHandler(consumer_key, consumer_secret)
1884 |         auth.set_access_token(access_token, access_token_secret)
1885 |         stream = Stream(auth, l)
1886 |         stream.filter(track=topics)
1887 | 
1888 |     return Observable.create(observe_tweets).share()
1889 | 
1890 | 
1891 | topics = ['Britain', 'France']
1892 | 
1893 | tweets_for(topics) \
1894 |     .map(lambda d: json.loads(d)) \
1895 |     .subscribe(on_next=lambda s: print(s), on_error=lambda e: print(e))
1896 | ```
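
Each `data` item pushed by the listener is a raw JSON string, and the stream may also carry messages that are not Tweets, such as limit or delete notices. As a minimal sketch, a helper like the hypothetical `tweet_text()` below could pull out only the Tweet text. The sample messages are made up for illustration and contain far fewer fields than real payloads.

```python
import json


def tweet_text(raw):
    """Return the 'text' field of a raw JSON message, or None if it has none."""
    try:
        message = json.loads(raw)
    except ValueError:
        # Not valid JSON (for example, an empty keep-alive line)
        return None
    if not isinstance(message, dict):
        return None
    return message.get("text")


# Made-up sample messages, for illustration only
samples = [
    '{"text": "Rainy day in Britain", "lang": "en"}',
    '{"limit": {"track": 12}}',
    '',
]

texts = [t for t in (tweet_text(s) for s in samples) if t is not None]
print(texts)
```

In the chain above, this could stand in for the `json.loads` step, for example `.map(tweet_text).filter(lambda t: t is not None)`.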

# Section IX - Concurrency
1898 | 
1899 | (Refer to slides to cover concurrency concepts).
1900 | 
1901 | ## 9.1 - Using `subscribe_on()`
1902 | 
1903 | ## 9.1A - Two Long-Running Processes
1904 | 
1905 | We will not dive too deep into concurrency topics, but we will learn enough to make it useful and speed up slow processes. Note also the [GIL issue in Python](https://stackoverflow.com/questions/1294382/what-is-a-global-interpreter-lock-gil#1294402) can undermine concurrency performance in Python applications, but hopefully you will still get some marginal benefit. Be sure to test your concurrency strategies and measure what brings the best performance.
1906 | 
1907 | > Keep in mind your output may be different than mine, because concurrency tends to shuffle emissions of multiple sources. Output is almost never deterministic when multiple threads are doing work simultaneously and being merged.
1908 | 
Below, we create two Observables we will call "Task 1" and "Task 2". The first Observable emits five strings and the other emits numbers in a range. These Observables will fire quickly when subscribed to, but concurrency is more useful and apparent with long-running tasks. To emulate long-running expensive processes, we will need to exaggerate and slow down emissions. We can use an `intense_calculation()` function that sleeps for a short random duration (between 0.5 and 2.0 seconds) before returning the value it was given. Then we can use this in a `map()` operator for each `Observable`.
1910 | 
1911 | We will use `current_thread().name` to identify the thread that is calling each `on_next()` in the `Subscriber`. Python will label each thread it creates consecutively as "Thread-1", "Thread-2", "Thread-3", etc.
1912 | 
1913 | Before "Task 2" can start, it must wait for "Task 1" to call `on_completed()` because by default both are on the `ImmediateScheduler`. This scheduler uses the same `MainThread` that runs our Python program.
1914 | 
1915 | 
1916 | ```python
1917 | from rx import Observable
1918 | from threading import current_thread
1919 | import multiprocessing, time, random
1920 | 
1921 | def intense_calculation(value):
1922 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
1923 |     time.sleep(random.randint(5,20) * .1)
1924 |     return value
1925 | 
1926 | # Create TASK 1
1927 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
1928 |     .map(lambda s: intense_calculation(s)) \
1929 |     .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
1930 |                on_error=lambda e: print(e),
1931 |                on_completed=lambda: print("TASK 1 done!"))
1932 | 
1933 | # Create TASK 2
1934 | Observable.range(1,10) \
1935 |     .map(lambda s: intense_calculation(s)) \
1936 |     .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
1937 |                on_error=lambda e: print(e),
1938 |                on_completed=lambda: print("TASK 2 done!"))
1939 | 
1940 | input("Press any key to exit\n")
1941 | ```
1942 | 
1943 | **OUTPUT (May not match yours):**
1944 | 
1945 | ```
1946 | TASK 1: MainThread Alpha
1947 | TASK 1: MainThread Beta
1948 | TASK 1: MainThread Gamma
1949 | TASK 1: MainThread Delta
1950 | TASK 1: MainThread Epsilon
1951 | TASK 1 done!
1952 | TASK 2: MainThread 1
1953 | TASK 2: MainThread 2
1954 | TASK 2: MainThread 3
1955 | TASK 2: MainThread 4
1956 | TASK 2: MainThread 5
1957 | TASK 2: MainThread 6
1958 | TASK 2: MainThread 7
1959 | TASK 2: MainThread 8
1960 | TASK 2: MainThread 9
1961 | TASK 2: MainThread 10
1962 | TASK 2 done!
1963 | ```
1964 | 
1965 | ## 9.1B - Kicking off both processes simultaneously
1966 | 
This would go much faster if we kick off both "Task 1" and "Task 2" simultaneously. We can kick off the Subscription in "Task 1" and then immediately move on to kicking off "Task 2", without waiting for "Task 1" to finish.
1968 | 
In advance, we can create a `ThreadPoolScheduler` that holds a number of threads equaling the _number of CPUs on your computer_ + 1. If your computer has 4 cores, the `ThreadPoolScheduler` will have 5 threads. The reason for the extra thread is to utilize any idle time of the other threads. To make the Observables work on this `ThreadPoolScheduler`, we can pass it to a `subscribe_on()` operator anywhere in the chain. The `subscribe_on()`, no matter where it is in the chain, will instruct the source Observable which thread to push items on.
1970 | 
1971 | > You are welcome to experiment and specify your own arbitrary number of threads. Just keep in mind there will be a point of diminishing return.
1972 | 
1973 | The code below will execute all the above:
1974 | 
1975 | ```python
1976 | from rx import Observable
1977 | from rx.concurrency import ThreadPoolScheduler
1978 | from threading import current_thread
1979 | import multiprocessing, time, random
1980 | 
1981 | 
1982 | def intense_calculation(value):
1983 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
1984 |     time.sleep(random.randint(5,20) * .1)
1985 |     return value
1986 | 
1987 | # calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
1989 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
1990 | 
1991 | print("We are using {0} threads".format(optimal_thread_count))
1992 | 
1993 | # Create Task 1
1994 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
1995 |     .map(lambda s: intense_calculation(s)) \
1996 |     .subscribe_on(pool_scheduler) \
1997 |     .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
1998 |                on_error=lambda e: print(e),
1999 |                on_completed=lambda: print("TASK 1 done!"))
2000 | 
2001 | # Create Task 2
2002 | Observable.range(1,10) \
2003 |     .map(lambda s: intense_calculation(s)) \
2004 |     .subscribe_on(pool_scheduler) \
2005 |     .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
2006 |                on_error=lambda e: print(e),
2007 |                on_completed=lambda: print("TASK 2 done!"))
2008 | 
2009 | input("Press any key to exit\n")
2010 | 
2011 | ```
2012 | 
2013 | **OUTPUT (May not match yours):**
2014 | 
2015 | ```
2016 | TASK 1: Thread-1 Alpha
2017 | TASK 2: Thread-2 1
2018 | TASK 1: Thread-1 Beta
2019 | TASK 1: Thread-1 Gamma
2020 | TASK 2: Thread-2 2
2021 | TASK 2: Thread-2 3
2022 | TASK 1: Thread-1 Delta
2023 | TASK 2: Thread-2 4
2024 | TASK 1: Thread-1 Epsilon
2025 | TASK 1 done!
2026 | TASK 2: Thread-2 5
2027 | TASK 2: Thread-2 6
2028 | TASK 2: Thread-2 7
2029 | TASK 2: Thread-2 8
2030 | TASK 2: Thread-2 9
2031 | TASK 2: Thread-2 10
2032 | TASK 2 done!
2033 | ```
2034 | 
2035 | We use the `input()` function to hold the `MainThread` and keep the application alive until a key is pressed, allowing the Observables to fire. Notice how the emissions between Task 1 and Task 2 are interleaved, indicating they are both working at the same time. If we did not have the `subscribe_on()` calls, "Task 1" would have to finish before "Task 2" can start, because they both would use the default `ImmediateScheduler` as shown earlier.
2036 | 
Notice also that "Task 1" requested a thread from our `ThreadPoolScheduler` and got `Thread-1`, and "Task 2" got `Thread-2`. They both will continue to use these threads until `on_completed()` is called on their Subscribers. Then the threads will be given back to the `ThreadPoolScheduler` so they can be used again later.
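
The same idea can be seen without RxPY. The sketch below uses the standard library's `ThreadPoolExecutor` (not RxPY's `ThreadPoolScheduler`) to run two hypothetical tasks at once and record which thread handled each item. The `run_task()` helper and the shortened sleep time are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import current_thread
import multiprocessing, time


def run_task(label, items):
    # Return (label, thread name, item) for each item, pausing to mimic slow work
    results = []
    for item in items:
        time.sleep(0.01)
        results.append((label, current_thread().name, item))
    return results


# number of CPUs plus one, following the rule of thumb described above
thread_count = multiprocessing.cpu_count() + 1

with ThreadPoolExecutor(max_workers=thread_count) as pool:
    # Both tasks are submitted right away, so neither waits for the other
    task1 = pool.submit(run_task, "TASK 1", ["Alpha", "Beta", "Gamma"])
    task2 = pool.submit(run_task, "TASK 2", [1, 2, 3])
    rows = task1.result() + task2.result()

for label, thread_name, item in rows:
    print("{0}: {1} {2}".format(label, thread_name, item))
```

Each task runs start to finish on a single worker thread, and the two tasks will typically be handled by different workers because the second is submitted while the first is still busy.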
2038 | 
2039 | 
2040 | ## 9.2 - Using `observe_on()` to redirect in the middle of the chain
2041 | 
2042 | Not all source Observables will respect a `subscribe_on()` you specify. This is especially true for time-driven sources like `Observable.interval()` which will use the `TimeoutScheduler` and effectively ignore any `subscribe_on()` you try to call. However, although you cannot instruct the source to emit on a different scheduler, you can specify a different scheduler to be used _at a certain point_ in the `Observable` chain by using `observe_on()`.
2043 | 
2044 | Let's create a third process called "Task 3". The source will be an `Observable.interval()` which will emit on the `TimeoutScheduler`. After each emitted number is multiplied by 100, the emission is then moved to the `ThreadPoolScheduler` via the `observe_on()` operator. This means for the remaining operators, the emissions will be passed on the `ThreadPoolScheduler`. Unlike `subscribe_on()`, the placement of `observe_on()` does matter as it will redirect to a different executor _at that point_ in the chain.
2045 | 
2046 | ```python
2047 | from rx import Observable
2048 | from rx.concurrency import ThreadPoolScheduler
2049 | from threading import current_thread
2050 | import multiprocessing, time, random
2051 | 
2052 | def intense_calculation(value):
2053 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
2054 |     time.sleep(random.randint(5,20) * .1)
2055 |     return value
2056 | 
2057 | # calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads
2058 | optimal_thread_count = multiprocessing.cpu_count() + 1
2059 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
2060 | 
2061 | # Create Task 1
2062 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
2063 |     .map(lambda s: intense_calculation(s)) \
2064 |     .subscribe_on(pool_scheduler) \
2065 |     .subscribe(on_next=lambda s: print("TASK 1: {0} {1}".format(current_thread().name, s)),
2066 |                on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 1 done!"))
2068 | 
2069 | # Create Task 2
2070 | Observable.range(1,10) \
2071 |     .map(lambda s: intense_calculation(s)) \
2072 |     .subscribe_on(pool_scheduler) \
    .subscribe(on_next=lambda i: print("TASK 2: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e),
               on_completed=lambda: print("TASK 2 done!"))
2074 | 
2075 | # Create Task 3, which is infinite
2076 | Observable.interval(1000) \
2077 |     .map(lambda i: i * 100) \
2078 |     .observe_on(pool_scheduler) \
2079 |     .map(lambda s: intense_calculation(s)) \
    .subscribe(on_next=lambda i: print("TASK 3: {0} {1}".format(current_thread().name, i)),
               on_error=lambda e: print(e))
2081 | 
2082 | input("Press any key to exit\n")
2083 | ```
2084 | 
2085 | 
2086 | **OUTPUT (May not match yours):**
2087 | 
2088 | ```
2089 | TASK 2: Thread-2 1
2090 | TASK 1: Thread-1 Alpha
2091 | TASK 1: Thread-1 Beta
2092 | TASK 3: Thread-4 0
2093 | TASK 2: Thread-2 2
2094 | TASK 1: Thread-1 Gamma
2095 | TASK 3: Thread-4 100
2096 | TASK 1: Thread-1 Delta
2097 | TASK 2: Thread-2 3
2098 | TASK 3: Thread-6 200
2099 | TASK 1: Thread-1 Epsilon
2100 | TASK 1 done!
2101 | TASK 3: Thread-13 300
2102 | TASK 2: Thread-2 4
2103 | TASK 3: Thread-15 400
2104 | TASK 2: Thread-2 5
2105 | TASK 3: Thread-4 500
2106 | TASK 2: Thread-2 6
2107 | TASK 3: Thread-4 600
2108 | TASK 2: Thread-2 7
2109 | TASK 3: Thread-4 700
2110 | TASK 2: Thread-2 8
2111 | TASK 3: Thread-4 800
2112 | TASK 2: Thread-2 9
2113 | TASK 3: Thread-4 900
2114 | TASK 3: Thread-4 1000
2115 | TASK 2: Thread-2 10
2116 | TASK 2 done!
2117 | TASK 3: Thread-4 1100
2118 | TASK 3: Thread-4 1200
2119 | TASK 3: Thread-4 1300
2120 | TASK 3: Thread-4 1400
2121 | ...
2122 | ```
2123 | 
2124 | Unlike `subscribe_on()`, the `observe_on()` may use a different thread for each emission rather than reserving one thread for all emissions. You can use as many `observe_on()` calls as you like in an `Observable` chain to redirect emissions to different thread pools at different points in the chain. But you can only have one `subscribe_on()`.
2125 | 
2126 | > You can use the `do_action()` to essentially put Subscribers in the middle of the Observable chain, often for debugging purposes. This can be helpful to print the current thread at different points in the `Observable` chain. Refer to the Appendix to learn more.
2127 | 
2128 | 
## 9.3 - Parallelization
2130 | 
2131 | An `Observable` will only process one item at a time. However, we can use a `subscribe_on()` or an `observe_on()` in a `flat_map()` and do multiple operations in parallel _within_ that `flat_map()`.
2132 | 
2133 | For instance, say I have 10 Strings I need to process. Because our `intense_calculation()`  will take 0.5 to 2.0 seconds to process each emission, this could take up to 20 seconds.
2134 | 
2135 | ```python
2136 | from rx import Observable
2137 | from rx.concurrency import ThreadPoolScheduler
2138 | from threading import current_thread
2139 | import multiprocessing, time, random
2140 | 
2141 | def intense_calculation(value):
2142 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
2143 |     time.sleep(random.randint(5,20) * .1)
2144 |     return value
2145 | 
# Process each string sequentially, one at a time
2147 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \
2148 |     .map(lambda s: intense_calculation(s)) \
2149 |     .subscribe(on_next=lambda s: print("{0} {1}".format(current_thread().name, s)),
2150 |                on_error=lambda e: print(e),
2151 |                on_completed=lambda: print("TASK 1 done!"))
2152 | 
2153 | 
2154 | input("Press any key to exit\n")
2155 | ```
2156 | 
This would go much faster if we processed multiple emissions at a time rather than one at a time.
2158 | 
2159 | My computer has 8 cores, but let's use Python to count the number of cores dynamically. Let's set a `ThreadPoolScheduler` to have that many threads (plus one) according to our rough optimal formula. Rather than process 1 item at a time, I can now process 9 at a time which will yield a much faster completion. I just need to make sure the expensive operators happen within a `flat_map()`, starting with that single emission wrapped in an `Observable.just()` and scheduled using `subscribe_on()`.
2160 | 
2161 | 
2162 | ```python
2163 | from rx import Observable
2164 | from rx.concurrency import ThreadPoolScheduler
2165 | from threading import current_thread
2166 | import multiprocessing, time, random
2167 | 
2168 | def intense_calculation(value):
2169 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
2170 |     time.sleep(random.randint(5,20) * .1)
2171 |     return value
2172 | 
# calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
2175 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
2176 | 
2177 | # Create Parallel Process
2178 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \
2179 |     .flat_map(lambda s:
2180 |         Observable.just(s).subscribe_on(pool_scheduler).map(lambda s: intense_calculation(s))
2181 |     ) \
2182 |     .subscribe(on_next=lambda i: print("{0} {1}".format(current_thread().name, i)),
2183 |                on_error=lambda e: print(e),
               on_completed=lambda: print("PROCESS done!"))
2185 | 
2186 | 
2187 | input("Press any key to exit\n")
2188 | ```
2189 | 
2190 | **OUTPUT:**
2191 | 
2192 | ```
2193 | Press any key to exit
2194 | Thread-4 Delta
2195 | Thread-6 Zeta
2196 | Thread-1 Alpha
2197 | Thread-2 Beta
2198 | Thread-9 Iota
2199 | Thread-3 Gamma
2200 | Thread-8 Theta
2201 | Thread-4 Kappa
2202 | Thread-7 Eta
2203 | Thread-5 Epsilon
2204 | PROCESS done!
2205 | ```
2206 | 
2207 | Now this takes less than 3 seconds! Of course the 10 items are now racing each other and complete in a random order. Only 9 threads are available, thus a 10th item must wait for one of the first 9 to complete. It looks like this item was `Kappa` which received `Thread-4` from `Delta` after it was done.
2208 | 
2209 | Parallelization using `flat_map()` (or `merge_all()`) can greatly increase performance if each emission must go through an expensive operation. Just wrap that emission into an  `Observable.just()`, schedule it with `subscribe_on()` or `observe_on()` (preferably `subscribe_on()` if possible), and then make all the expensive operations happen inside the `flat_map()`.
2210 | 
The reason each emission must be broken into its own `Observable` is that an `Observable` is sequential and cannot be parallelized. But you can take multiple Observables and merge them into a single Observable, even if they are working on different threads. The merged Observable will only push out items on one thread, but the items inside `flat_map()` can process in parallel.
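
For comparison, here is the same fan-out-and-merge idea in plain standard-library Python, without RxPY. Each item is submitted to a `ThreadPoolExecutor` as its own unit of work, and `as_completed()` plays the role of the merge by yielding results in whatever order they finish. The `slow_identity()` function and the shortened sleep range are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import multiprocessing, random, time


def slow_identity(value):
    # Sleep briefly (0.01 to 0.05 seconds) to mimic an expensive calculation
    time.sleep(random.randint(1, 5) * .01)
    return value


items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon",
         "Zeta", "Eta", "Theta", "Iota", "Kappa"]

thread_count = multiprocessing.cpu_count() + 1
finished = []

with ThreadPoolExecutor(max_workers=thread_count) as pool:
    # One future per item, so items can be processed in parallel
    futures = [pool.submit(slow_identity, s) for s in items]
    # Results are collected on the calling thread, in completion order
    for future in as_completed(futures):
        finished.append(future.result())

print(finished)
```

As with the `flat_map()` version, all ten items come out, but their order depends on which calculation happens to finish first.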
2212 | 
## 9.4 - Redirecting Work with `switch_map()`
2214 | 
Imagine you have an `Observable` and you use `flat_map()` to yield emissions from another `Observable`. However, say you wanted to _only_ pursue the `Observable` for the latest emission, and kill any previous Observables to stop their emissions coming out of `flat_map()`.
2216 | 
2217 | You can achieve this with `switch_map()`. It operates much like a `flat_map()`, but will only fire items for the latest emission. All previous Observables derived from previous emissions will be unsubscribed.
2218 | 
2219 | This example is slightly contrived, but let's say we have a finite `Observable` emitting Strings. We want an `Observable.interval()` to emit every 6 seconds, and have each emission flat map to our `Observable` of strings which are artificially slowed by `intense_calculation()`. But instead of using `flat_map()`, we can use `switch_map()` to only chase after the latest `Observable` created off each interval emission and unsubscribe previous ones.
2220 | 
2221 | We also need to parallelize using `subscribe_on()` so each Observable within the `switch_map()` happens on a different thread.
2222 | 
2223 | ```python
2224 | from rx import Observable
2225 | from rx.concurrency import ThreadPoolScheduler
2226 | from threading import current_thread
2227 | import multiprocessing, time, random
2228 | 
2229 | 
2230 | def intense_calculation(value):
2231 |     # sleep for a random short duration between 0.5 to 2.0 seconds to simulate a long-running calculation
2232 |     time.sleep(random.randint(5, 20) * .1)
2233 |     return value
2234 | 
2235 | # calculate number of CPU's and add 1, then create a ThreadPoolScheduler with that number of threads
optimal_thread_count = multiprocessing.cpu_count() + 1
2237 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
2238 | 
2239 | strings = Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "Kappa"])
2240 | 
2241 | Observable.interval(6000) \
2242 |     .switch_map(lambda i: strings.map(lambda s: intense_calculation(s)).subscribe_on(pool_scheduler)) \
2243 |     .subscribe(on_next = lambda s: print("Received {0} on {1}".format(s, current_thread().name)),
2244 |                on_error = lambda e: print(e))
2245 | 
2246 | 
2247 | input("Press any key to exit\n")
2248 | ```
2249 | 
2250 | 
2251 | **OUTPUT (May Vary):**
2252 | 
2253 | ```
2254 | Press any key to exit
2255 | Received Alpha on Thread-2
2256 | Received Beta on Thread-2
2257 | Received Gamma on Thread-2
2258 | Received Delta on Thread-2
2259 | Received Alpha on Thread-4
2260 | Received Beta on Thread-4
2261 | Received Gamma on Thread-4
2262 | Received Alpha on Thread-6
2263 | Received Beta on Thread-6
2264 | Received Gamma on Thread-6
2265 | Received Delta on Thread-6
2266 | Received Epsilon on Thread-6
2267 | Received Alpha on Thread-2
2268 | ...
2269 | ```
2270 | 
2271 | 
Using `switch_map()` is a convenient way to cancel current work when new work comes in, rather than queuing up work. This is desirable if you are only concerned with the latest data or want to cancel obsolete processing. If you are scraping web data on a schedule using `Observable.interval()`, but a scrape instance takes too long and a new scrape request comes in, you can cancel that scrape and start the next one.
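
The cancel-on-new-work behavior can be modeled without threads. The sketch below is a simplified, hypothetical timeline model and not RxPY code. Each outer emission has a start time and an inner list of `(delay, value)` pairs, and an inner value is kept only if it would fire before the next outer emission starts.

```python
def switch_latest(outer):
    """Model of switch semantics on a timeline.

    `outer` is a list of (start_time, inner) pairs sorted by start_time, where
    `inner` is a list of (delay, value) pairs relative to that start time.
    """
    emitted = []
    for index, (start, inner) in enumerate(outer):
        # The next outer emission (if any) cancels this inner sequence
        if index + 1 < len(outer):
            cutoff = outer[index + 1][0]
        else:
            cutoff = float("inf")
        for delay, value in inner:
            fire_time = start + delay
            if fire_time < cutoff:
                emitted.append((fire_time, value))
    return emitted


# Each inner sequence needs 8 seconds, but a new one starts every 6 seconds
inner = [(2, "Alpha"), (4, "Beta"), (6, "Gamma"), (8, "Delta")]
timeline = switch_latest([(0, inner), (6, inner), (12, inner)])

for fire_time, value in timeline:
    print("t={0}s Received {1}".format(fire_time, value))
```

In this model the first two inner sequences are cut off after `Beta`, and only the last one, which nothing interrupts, runs to `Delta`.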
2273 | 
2274 | 
2275 | # Appendix
2276 | 
2277 | ## 1 - Deferred Observables
2278 | 
2279 | A behavior to be aware of with `Observable.from_()` and other functions that create Observables is they may not reflect changes that happen to their sources.
2280 | 
For instance, if you have an `Observable.range()` built off two variables `x` and `y`, and one of the variables changes later, this change will not be captured by the source.
2282 | 
```python
from rx import Observable

x = 1
y = 5

integers = Observable.range(x, y)
integers.subscribe(lambda i: print(i))

print("\nSetting y = 10\n")
y = 10

integers.subscribe(lambda i: print(i))
```

**OUTPUT:**

```
1
2
3
4
5

Setting y = 10

1
2
3
4
5
```
2314 | 
2315 | Using `Observable.defer()` allows you to create a new `Observable` from scratch each time it is subscribed to, thereby capturing anything that might have changed about its source. Just supply a lambda that creates the `Observable`.
2316 | 
2317 | ```python
2318 | from rx import Observable
2319 | 
2320 | x = 1
2321 | y = 5
2322 | 
2323 | integers = Observable.defer(lambda: Observable.range(x, y))
2324 | integers.subscribe(lambda i: print(i))
2325 | 
2326 | print("\nSetting y = 10\n")
2327 | y = 10
2328 | 
2329 | integers.subscribe(lambda i: print(i))
2330 | ```
2331 | 
2332 | **OUTPUT:**
2333 | 
2334 | ```
2335 | 1
2336 | 2
2337 | 3
2338 | 4
2339 | 5
2340 | 
2341 | Setting y = 10
2342 | 
2343 | 1
2344 | 2
2345 | 3
2346 | 4
2347 | 5
2348 | 6
2349 | 7
2350 | 8
2351 | 9
2352 | 10
2353 | ```
2354 | 
2355 | The lambda argument ensures the `Observable` source declaration is rebuilt each time it is subscribed to. This is especially helpful with data sources that can only be iterated once. Without `defer()`, you have to call a helper function to build a fresh `Observable` for each Subscriber (this was covered in Section VII):
2356 | 
2357 | ```python
2358 | 
2359 | def get_all_customers():
2360 |     stmt = text("SELECT * FROM CUSTOMER")
2361 |     return Observable.from_(conn.execute(stmt))
2362 | ```
2363 | 
2364 | With `defer()`, we can create an `Observable` that is truly reusable for multiple Subscribers.
2365 | 
2366 | ```python
2367 | stmt = text("SELECT * FROM CUSTOMER")
2368 | 
2369 | # Will support multiple subscribers and coldly replay to each one
2370 | all_customers = Observable.defer(lambda: Observable.from_(conn.execute(stmt)))
2371 | ```
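The "iterate once" problem can be seen without RxPY at all. The following is a minimal plain-Python sketch, not RxPY code: the generator `one_shot_rows()` is a hypothetical stand-in for a database cursor, and the `make_rows` factory plays the role of the lambda passed to `Observable.defer()`.

```python
def one_shot_rows():
    # Hypothetical stand-in for a query result that can only be consumed once
    yield from ["row 1", "row 2", "row 3"]


# Built once and shared: the second consumer finds it already exhausted
rows = one_shot_rows()
first_pass = list(rows)
second_pass = list(rows)

# Built per consumer through a factory, which is the idea behind defer()
make_rows = lambda: one_shot_rows()
deferred_first = list(make_rows())
deferred_second = list(make_rows())

print(first_pass, second_pass)
print(deferred_first, deferred_second)
```

The shared generator yields nothing on the second pass, while the factory produces a fresh source for every consumer.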
2372 | 
2373 | 
2374 | ## 2 - Debugging with `do_action()`
2375 | 
2376 | A helpful operator that provides insight into any point in the `Observable` chain is `do_action()`. This essentially allows us to insert a `Subscriber` after any operator we want, and pass one or more of the `on_next()`, `on_completed()`, and `on_error()` actions.
2377 | 
2378 | ```python
2379 | from rx import Observable
2380 | 
2381 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
2382 |     .map(lambda s: len(s)) \
2383 |     .do_action(on_next=lambda i: print("Receiving {0} from map()".format(i)),
2384 |                on_completed=lambda: print("map() is done!")) \
2385 |     .to_list() \
2386 |     .subscribe(on_next=lambda l: print("Subscriber received {0}".format(l)),
2387 |                on_completed=lambda: print("Subscriber done!"))
2388 | 
2389 | ```
2390 | 
2391 | **OUTPUT:**
2392 | 
2393 | ```
2394 | Receiving 5 from map()
2395 | Receiving 4 from map()
2396 | Receiving 5 from map()
2397 | Receiving 5 from map()
2398 | Receiving 7 from map()
2399 | map() is done!
2400 | Subscriber received [5, 4, 5, 5, 7]
2401 | Subscriber done!
2402 | ```
2403 | 
2404 | Above, we declare a `do_action` right after the `map()` operation emitting the lengths. We print each length emission before it goes to the `to_list()`. Finally, `on_completed` is called and prints a notification that `map()` is not giving any more items. Then it pushes the completion event to the `to_list()` which then pushes the `List` to the `Subscriber`. Then `to_list()` calls `on_completed()` up to the `Subscriber` _after_ the `List` is emitted.
2405 | 
2406 | Use `do_action()` when you need to "peek" inside any point in the `Observable` chain, either for debugging or to quickly call actions at that point.
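To make the "peek" idea concrete, here is a rough plain-Python analogy using a generator. It is only a sketch of the concept and not how RxPY implements `do_action()`; the `peek()` helper is a hypothetical name introduced for illustration.

```python
def peek(items, on_next):
    # Run a side effect for each item, then pass the item through unchanged
    for item in items:
        on_next(item)
        yield item


seen = []
lengths = (len(s) for s in ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"])

# The side effect observes every item, but the downstream result is unaffected
result = list(peek(lengths, seen.append))

print(seen)
print(result)
```

The side effect sees exactly what flows past that point in the chain, and the values continue downstream unmodified.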
2407 | 
2408 | ## 3 - Subjects
2409 | 
2410 | Another way to create an `Observable` is by declaring a `Subject`. A `Subject` is both an `Observable` and `Observer`, and you can call its `Observer` functions to push items through it and up to any Subscribers at any time.
2411 | 
2412 | ```python
2413 | from rx.subjects import Subject
2414 | 
2415 | subject = Subject()
2416 | 
2417 | subject.filter(lambda i: i < 100) \
2418 |     .map(lambda i: i * 1000) \
2419 |     .subscribe(lambda i: print(i))
2420 | 
2421 | subject.on_next(10)
2422 | subject.on_next(50)
2423 | subject.on_next(105)
2424 | subject.on_next(87)
2425 | 
2426 | subject.on_completed()
2427 | ```
2428 | 
2429 | **OUTPUT:**
2430 | 
2431 | ```
2432 | 10000
2433 | 50000
2434 | 87000
2435 | ```
2436 | 
2437 | While they seem convenient, the use of Subjects is often discouraged. They can easily encourage antipatterns and are prone to abuse. They are also difficult to compose against and do not respect `subscribe_on()`. It is better to create Observables that strictly come from one defined source, rather than ones that are openly mutable and let anything push items to them at any time. Use Subjects with discretion.
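The dual role of a Subject, and why it is so openly mutable, can be sketched in a few lines of plain Python. `MiniSubject` below is a hypothetical toy class, not the RxPY `Subject`; it has no operators, no completion, and no error handling.

```python
class MiniSubject:
    """Toy illustration: something that can be subscribed to and pushed into."""

    def __init__(self):
        self._observers = []

    def subscribe(self, on_next):
        # Observable side: register a callback
        self._observers.append(on_next)

    def on_next(self, value):
        # Observer side: any code holding a reference can push a value
        for observer in list(self._observers):
            observer(value)


subject = MiniSubject()
received = []

# Filter and map by hand, mirroring the filter()/map() chain above
subject.subscribe(lambda i: received.append(i * 1000) if i < 100 else None)

for value in [10, 50, 105, 87]:
    subject.on_next(value)

print(received)
```

Because `on_next()` is public, there is no single defined source of emissions, which is the design concern raised above.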
2438 | 
2439 | ## 4 - Error Recovery
2440 | 
2441 | There are a number of error recovery operators, but we will cover two helpful ones. Say you have an `Observable` operation that will ultimately attempt to divide by zero and therefore throw an error.
2442 | 
2443 | ```python
2444 | from rx import Observable
2445 | 
2446 | Observable.from_([5, 6, 2, 0, 1, 35]) \
2447 |     .map(lambda i: 5 / i) \
2448 |     .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
2449 | ```
2450 | 
2451 | **OUTPUT:**
2452 | 
2453 | ```
2454 | 1.0
2455 | 0.8333333333333334
2456 | 2.5
2457 | division by zero
2458 | ```
2459 | 
2460 | There are multiple ways to handle this. Of course, the best way is to be proactive and use `filter()` to hold back any `0` value emissions. But for the sake of example, let's say we did not expect this error and we want a way to handle any errors we have not considered.
2461 | 
2462 | One way is to use `on_error_resume_next()` which will switch to an alternate `Observable` source in the event there is an error. This is somewhat contrived, but if we encounter an error we can switch to emitting an `Observable.range()`.
2463 | 
2464 | ```python
2465 | from rx import Observable
2466 | 
2467 | Observable.from_([5, 6, 2, 0, 1, 35]) \
2468 |     .map(lambda i: 5 / i) \
2469 |     .on_error_resume_next(Observable.range(1,10)) \
2470 |     .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
2471 | ```
2472 | 
2473 | **OUTPUT:**
2474 | 
2475 | ```
2476 | 1.0
2477 | 0.8333333333333334
2478 | 2.5
2479 | 1
2480 | 2
2481 | 3
2482 | 4
2483 | 5
2484 | 6
2485 | 7
2486 | 8
2487 | 9
2488 | 10
2489 | ```
2490 | 
2491 | It probably would be more realistic to pass an `Observable.empty()` instead to simply stop emissions once an error happens.
2492 | 
2493 | ```python
2494 | from rx import Observable
2495 | 
2496 | Observable.from_([5, 6, 2, 0, 1, 35]) \
2497 |     .map(lambda i: 5 / i) \
2498 |     .on_error_resume_next(Observable.empty()) \
2499 |     .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
2500 | ```
2501 | 
2502 | **OUTPUT:**
2503 | 
2504 | ```
2505 | 1.0
2506 | 0.8333333333333334
2507 | 2.5
2508 | ```
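The switching behavior described above can be approximated in plain Python with generators. This is only a conceptual sketch of "fall back to another source when the first one fails", not the RxPY implementation; `resume_next()` is a hypothetical helper name.

```python
def resume_next(primary, fallback):
    # Emit from the primary source until it raises, then switch to the fallback
    try:
        yield from primary
    except Exception:
        yield from fallback


quotients = (5 / i for i in [5, 6, 2, 0, 1, 35])

with_range = list(resume_next(quotients, range(1, 11)))
with_empty = list(resume_next((5 / i for i in [5, 6, 2, 0, 1, 35]), []))

print(with_range)
print(with_empty)
```

Note that the items after the failing `0` (here `1` and `35`) are never processed in either case; the fallback replaces the rest of the original source.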
2509 | 
2510 | Although this is not a good example to use it, you can also use `retry()` to re-attempt subscribing to the `Observable` and hope the next set of emissions are successful without error. You typically should pass an integer argument to specify the number of retry attempts before it gives up and lets the error go to the `Subscriber`. If you do not, it will retry an infinite number of times.
2511 | 
2512 | ```python
2513 | from rx import Observable
2514 | 
2515 | Observable.from_([5, 6, 2, 0, 1, 35]) \
2516 |     .map(lambda i: 5 / i) \
2517 |     .retry(3) \
2518 |     .subscribe(on_next=lambda i: print(i), on_error=lambda e: print(e))
2519 | ```
2520 | 
2521 | **OUTPUT:**
2522 | 
2523 | ```
2524 | 1.0
2525 | 0.8333333333333334
2526 | 2.5
2527 | 1.0
2528 | 0.8333333333333334
2529 | 2.5
2530 | 1.0
2531 | 0.8333333333333334
2532 | 2.5
2533 | division by zero
2534 | ```
2535 | 
2536 | You can also use this in combination with the `delay()` operator to hold off subscribing for a fixed time period, which can be helpful for intermittent connectivity problems.
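The retry output shown earlier repeats the first three values because each attempt re-subscribes from the start. A minimal plain-Python sketch of that re-subscription idea follows; it is an analogy rather than RxPY code, and `run_with_retry()` and `make_source` are hypothetical names.

```python
def run_with_retry(make_source, attempts):
    # Re-create and re-consume the source until it finishes or attempts run out
    received = []
    error = None
    for _ in range(attempts):
        try:
            for item in make_source():
                received.append(item)
            return received, None
        except ZeroDivisionError as e:
            error = e
    return received, error


make_source = lambda: (5 / i for i in [5, 6, 2, 0, 1, 35])
received, error = run_with_retry(make_source, 3)

print(received)
print(error)
```

Since this source fails deterministically at the same item, retrying cannot help; retries pay off only for transient failures.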
2537 | 
2538 | 
2539 | ## 5 - `combine_latest()`
2540 | 
2541 | There is one operation for merging multiple Observables together we did not cover: `combine_latest()`. It behaves much like `zip()` but, whenever any source emits, it combines that emission with only the _latest_ emission from each of the other sources. This is especially helpful for hot event sources, such as user inputs in a UI, where you do not care what the previous emissions were.
2542 | 
2543 | Below, we have two interval sources put in `combine_latest()`: `source1` emitting every 3 seconds and `source2` every 1 second. Notice that `source2` is going to emit a lot faster, but rather than get queued up like in `zip()` waiting for an emission from `source1`, it is going to pair with only the latest emission from `source1`. It is not going to wait for any emission to be zipped with. Conversely, when `source1` does emit something it is going to pair with the latest emission from `source2`, not wait for an emission.
2544 | 
2545 | 
2546 | ```python
2547 | from rx import Observable
2548 | 
2549 | source1 = Observable.interval(3000).map(lambda i: "SOURCE 1: {0}".format(i))
2550 | source2 = Observable.interval(1000).map(lambda i: "SOURCE 2: {0}".format(i))
2551 | 
2552 | Observable.combine_latest(source1, source2, lambda s1,s2: "{0}, {1}".format(s1,s2)) \
2553 |     .subscribe(lambda s: print(s))
2554 | 
2555 | input("Press any key to quit\n")
2556 | ```
2557 | 
2558 | **OUTPUT:**
2559 | 
2560 | ```
2561 | Press any key to quit
2562 | SOURCE 1: 0, SOURCE 2: 1
2563 | SOURCE 1: 0, SOURCE 2: 2
2564 | SOURCE 1: 0, SOURCE 2: 3
2565 | SOURCE 1: 0, SOURCE 2: 4
2566 | SOURCE 1: 1, SOURCE 2: 4
2567 | SOURCE 1: 1, SOURCE 2: 5
2568 | SOURCE 1: 1, SOURCE 2: 6
2569 | SOURCE 1: 1, SOURCE 2: 7
2570 | SOURCE 1: 2, SOURCE 2: 7
2571 | SOURCE 1: 2, SOURCE 2: 8
2572 | SOURCE 1: 2, SOURCE 2: 9
2573 | SOURCE 1: 2, SOURCE 2: 10
2574 | SOURCE 1: 3, SOURCE 2: 10
2575 | SOURCE 1: 3, SOURCE 2: 11
2576 | SOURCE 1: 3, SOURCE 2: 12
2577 | SOURCE 1: 3, SOURCE 2: 13
2578 | ```
2579 | 
2580 | Again, this is a helpful alternative for `zip()` if you want to emit the _latest combinations_ from two or more Observables.
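The pairing rule can be traced by hand with a small plain-Python model. This is a sketch of the semantics only, assuming a hypothetical pre-recorded timeline of `(source, value)` events in arrival order; it does not use RxPY or real timers.

```python
def combine_latest(events, source_count):
    # Track the most recent value per source; emit once every source has one
    latest = {}
    combined = []
    for source, value in events:
        latest[source] = value
        if len(latest) == source_count:
            combined.append(tuple(latest[key] for key in sorted(latest)))
    return combined


# Source 2 ticks every second, source 1 every three seconds
timeline = [(2, 0), (2, 1), (1, 0), (2, 2), (2, 3), (2, 4), (1, 1), (2, 5)]

pairs = combine_latest(timeline, source_count=2)
print(pairs)
```

Nothing is emitted until both sources have produced a value, and the first two emissions of the faster source are never paired at all, which matches the start of the output above.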
2581 | 


--------------------------------------------------------------------------------
/class_notes/class_notes.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/class_notes/class_notes.pdf


--------------------------------------------------------------------------------
/code_examples/4.1A_declaring_an_observable.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
4 | 


--------------------------------------------------------------------------------
/code_examples/4.1B_subscribing_to_an_observable.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable, Observer
 2 | 
 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 4 | 
 5 | 
 6 | class MySubscriber(Observer):
 7 |     def on_next(self, value):
 8 |         print(value)
 9 | 
10 |     def on_completed(self):
11 |         print("Completed!")
12 | 
13 |     def on_error(self, error):
14 |         print("Error occurred: {0}".format(error))
15 | 
16 | 
17 | letters.subscribe(MySubscriber())
18 | 


--------------------------------------------------------------------------------
/code_examples/4.1C_subscribing_with_lambdas.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 4 | 
 5 | letters.subscribe(on_next = lambda value: print(value),
 6 |                   on_completed = lambda: print("Completed!"),
 7 |                   on_error = lambda error: print("Error occurred: {0}".format(error)))
 8 | 
 9 | # to use just on_next:
10 | # letters.subscribe(on_next = lambda value: print(value))
11 | # letters.subscribe(lambda value: print("Received: {0}".format(value)))
12 | 


--------------------------------------------------------------------------------
/code_examples/4.2A_some_basic_operators.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
4 |     .map(lambda s: len(s)) \
5 |     .filter(lambda i: i >= 5) \
6 |     .subscribe(lambda value: print(value))
7 | 


--------------------------------------------------------------------------------
/code_examples/4.2B_range_and_just.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable, Observer
 2 | 
 3 | # Using Observable.range()
 4 | letters = Observable.range(1,10)
 5 | letters.subscribe(lambda value: print(value))
 6 | 
 7 | # Using Observable.just()
 8 | greeting = Observable.just("Hello World!")
 9 | greeting.subscribe(lambda value: print(value))
10 | 


--------------------------------------------------------------------------------
/code_examples/4.2C_observable_empty.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.empty() \
4 |     .subscribe(on_next= lambda s: print(s),
5 |                on_completed= lambda: print("Done!")
6 |                )
7 | 


--------------------------------------------------------------------------------
/code_examples/4.3A_creating_observable_from_scratch.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable, Observer
 2 | 
 3 | def push_numbers(observer):
 4 |     observer.on_next(100)
 5 |     observer.on_next(300)
 6 |     observer.on_next(500)
 7 |     observer.on_completed()
 8 | 
 9 | Observable.create(push_numbers).subscribe(on_next = lambda i: print(i))
10 | 


--------------------------------------------------------------------------------
/code_examples/4.3B_interval_observable.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.interval(1000) \
4 |     .map(lambda i: "{0} Mississippi".format(i)) \
5 |     .subscribe(lambda s: print(s))
6 | 
7 | # Keep application alive until user presses a key
8 | input("Press any key to quit")
9 | 


--------------------------------------------------------------------------------
/code_examples/4.3C_unsubscribing.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import time
 3 | 
 4 | disposable = Observable.interval(1000) \
 5 |     .map(lambda i: "{0} Mississippi".format(i)) \
 6 |     .subscribe(lambda s: print(s))
 7 | 
 8 | # sleep 5 seconds so Observable can fire
 9 | time.sleep(5)
10 | 
11 | # disconnect the Subscriber
12 | print("Unsubscribing!")
13 | disposable.dispose()
14 | 
15 | # sleep a bit longer to prove no more emissions are coming
16 | time.sleep(5)
17 | 


--------------------------------------------------------------------------------
/code_examples/4.4A_twitter_observable.py:
--------------------------------------------------------------------------------
 1 | from tweepy.streaming import StreamListener
 2 | from tweepy import OAuthHandler
 3 | from tweepy import Stream
 4 | import json
 5 | from rx import Observable
 6 | 
 7 | # Variables that contain the user credentials to access the Twitter API
 8 | access_token = "CONFIDENTIAL"
 9 | access_token_secret = "CONFIDENTIAL"
10 | consumer_key = "CONFIDENTIAL"
11 | consumer_secret = "CONFIDENTIAL"
12 | 
13 | 
14 | def tweets_for(topics):
15 | 
16 |     def observe_tweets(observer):
17 |         class TweetListener(StreamListener):
18 |             def on_data(self, data):
19 |                 observer.on_next(data)
20 |                 return True
21 | 
22 |             def on_error(self, status):
23 |                 observer.on_error(status)
24 | 
25 |         # This handles Twitter authentication and the connection to the Twitter Streaming API
26 |         l = TweetListener()
27 |         auth = OAuthHandler(consumer_key, consumer_secret)
28 |         auth.set_access_token(access_token, access_token_secret)
29 |         stream = Stream(auth, l)
30 |         stream.filter(track=topics)
31 | 
32 |     return Observable.create(observe_tweets).share()
33 | 
34 | 
35 | topics = ['Britain','France']
36 | 
37 | tweets_for(topics).map(lambda d: json.loads(d)) \
38 |     .filter(lambda map: "text" in map) \
39 |     .map(lambda map: map["text"].strip()) \
40 |     .subscribe(lambda s: print(s))
41 | 


--------------------------------------------------------------------------------
/code_examples/4.4B_cold_observable.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
4 | 
5 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
6 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
7 | 


--------------------------------------------------------------------------------
/code_examples/5.1A_filter.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
4 |     .filter(lambda s: len(s) >= 5) \
5 |     .subscribe(lambda s: print(s))
6 | 


--------------------------------------------------------------------------------
/code_examples/5.1B_take.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
4 |     .filter(lambda s: len(s) >= 5) \
5 |     .take(2) \
6 |     .subscribe(lambda s: print(s))
7 | 


--------------------------------------------------------------------------------
/code_examples/5.1C_take_while.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_([2,5,21,5,2,1,5,63,127,12]) \
4 |     .take_while(lambda i: i < 100) \
5 |     .subscribe(on_next = lambda i: print(i), on_completed = lambda: print("Done!"))
6 | 


--------------------------------------------------------------------------------
/code_examples/5.2A_distinct.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
4 |     .map(lambda s: len(s)) \
5 |     .distinct() \
6 |     .subscribe(lambda i: print(i))
7 | 


--------------------------------------------------------------------------------
/code_examples/5.2B_distinct_with_mapping.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
4 |     .distinct(lambda s: len(s)) \
5 |     .subscribe(lambda i: print(i))
6 | 


--------------------------------------------------------------------------------
/code_examples/5.2C_distinct_until_changed.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
 4 |     .map(lambda s: len(s)) \
 5 |     .distinct_until_changed() \
 6 |     .subscribe(lambda i: print(i))
 7 | 
 8 | 
 9 | Observable.from_(["Alpha", "Theta", "Kappa", "Beta", "Gamma", "Delta", "Epsilon"]) \
10 |     .distinct_until_changed(lambda s: len(s)) \
11 |     .subscribe(lambda i: print(i))
12 | 


--------------------------------------------------------------------------------
/code_examples/5.3A_count.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
4 |     .filter(lambda s: len(s) != 5) \
5 |     .count() \
6 |     .subscribe(lambda i: print(i))
7 | 


--------------------------------------------------------------------------------
/code_examples/5.3B_reduce.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_([4,76,22,66,881,13,35]) \
4 |     .filter(lambda i: i < 100) \
5 |     .reduce(lambda total, value: total + value) \
6 |     .subscribe(lambda s: print(s))
7 | 


--------------------------------------------------------------------------------
/code_examples/5.3C_scan.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_([4,76,22,66,881,13,35]) \
4 |     .scan(lambda total, value: total + value) \
5 |     .subscribe(lambda s: print(s))
6 | 


--------------------------------------------------------------------------------
/code_examples/5.4A_to_list.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
4 |     .to_list() \
5 |     .subscribe(lambda s: print(s))
6 | 


--------------------------------------------------------------------------------
/code_examples/5.4B_to_dict.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 4 |     .to_dict(lambda s: s[0]) \
 5 |     .subscribe(lambda i: print(i))
 6 | 
 7 | Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]) \
 8 |     .to_dict(lambda s: s[0], lambda s: len(s)) \
 9 |     .subscribe(lambda i: print(i))
10 | 


--------------------------------------------------------------------------------
/code_examples/6.1A_merge.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])
5 | 
6 | Observable.merge(source1,source2) \
7 |     .subscribe(lambda s: print(s))
8 | 


--------------------------------------------------------------------------------
/code_examples/6.1B_merge_interval.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
 4 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
 5 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))
 6 | 
 7 | Observable.merge(source1, source2, source3) \
 8 |     .subscribe(lambda s: print(s))
 9 | 
10 | # keep application alive until user presses a key
11 | input("Press any key to quit\n")
12 | 


--------------------------------------------------------------------------------
/code_examples/6.1C_merge_all.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | source1 = Observable.interval(1000).map(lambda i: "Source 1: {0}".format(i))
 4 | source2 = Observable.interval(500).map(lambda i: "Source 2: {0}".format(i))
 5 | source3 = Observable.interval(300).map(lambda i: "Source 3: {0}".format(i))
 6 | 
 7 | Observable.from_([source1,source2,source3]) \
 8 |     .merge_all() \
 9 |     .subscribe(lambda s: print(s))
10 | 
11 | # keep application alive until user presses a key
12 | input("Press any key to quit\n")
13 | 


--------------------------------------------------------------------------------
/code_examples/6.1D_merge_all_continued.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
 4 | 
 5 | Observable.from_(items) \
 6 |     .map(lambda s: Observable.from_(s.split("/"))) \
 7 |     .merge_all() \
 8 |     .map(lambda s: int(s)) \
 9 |     .subscribe(lambda i: print(i))
10 | 


--------------------------------------------------------------------------------
/code_examples/6.1E_flat_map.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | items = ["134/34/235/132/77", "64/22/98/112/86/11", "66/08/34/778/22/12"]
4 | 
5 | Observable.from_(items) \
6 |     .flat_map(lambda s: Observable.from_(s.split("/"))) \
7 |     .map(lambda s: int(s)) \
8 |     .subscribe(lambda i: print(i))
9 | 


--------------------------------------------------------------------------------
/code_examples/6.2A_concat.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])
5 | 
6 | Observable.concat(source1,source2) \
7 |     .subscribe(lambda s: print(s))
8 | 


--------------------------------------------------------------------------------
/code_examples/6.2B_concat_all.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | source1 = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
4 | source2 = Observable.from_(["Zeta","Eta","Theta","Iota"])
5 | 
6 | Observable.from_([source1,source2]) \
7 |     .concat_all() \
8 |     .subscribe(lambda s: print(s))


--------------------------------------------------------------------------------
/code_examples/6.2C_zip.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | letters = Observable.from_(["A","B","C","D","E","F"])
4 | numbers = Observable.range(1,5)
5 | 
6 | Observable.zip(letters,numbers, lambda l,n: "{0}-{1}".format(l,n)) \
7 |     .subscribe(lambda i: print(i))
8 | 


--------------------------------------------------------------------------------
/code_examples/6.3D_spacing_emissions.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | letters = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"])
 4 | intervals = Observable.interval(1000)
 5 | 
 6 | Observable.zip(letters,intervals, lambda s,i: s) \
 7 |     .subscribe(lambda s: print(s))
 8 | 
 9 | input("Press any key to quit\n")
10 | 


--------------------------------------------------------------------------------
/code_examples/6.4A_grouping_into_lists.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
4 | 
5 | Observable.from_(items) \
6 |     .group_by(lambda s: len(s)) \
7 |     .flat_map(lambda grp: grp.to_list()) \
8 |     .subscribe(lambda i: print(i))
9 | 


--------------------------------------------------------------------------------
/code_examples/6.4B_grouping_length_counts.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | items = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
 4 | 
 5 | Observable.from_(items) \
 6 |     .group_by(lambda s: len(s)) \
 7 |     .flat_map(lambda grp:
 8 |          grp.count().map(lambda ct: (grp.key, ct))
 9 |     ) \
10 |     .to_dict(lambda key_value: key_value[0], lambda key_value: key_value[1]) \
11 |     .subscribe(lambda i: print(i))
12 | 


--------------------------------------------------------------------------------
/code_examples/7.1A_reading_text_file.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | 
 3 | 
 4 | def read_lines(file_name):
 5 |     file = open(file_name)
 6 | 
 7 |     return Observable.from_(file) \
 8 |         .map(lambda l: l.strip()) \
 9 |         .filter(lambda l: l != "")
10 | 
11 | 
12 | read_lines("bbc_news_article.txt").subscribe(lambda s: print(s))
13 | 


--------------------------------------------------------------------------------
/code_examples/7.1B_reading_web_url.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from urllib.request import urlopen
 3 | 
 4 | 
 5 | def read_request(link):
 6 |     f = urlopen(link)
 7 | 
 8 |     return Observable.from_(f) \
 9 |         .map(lambda s: s.decode("utf-8").strip())
10 | 
11 | read_request("https://goo.gl/rIaDyM") \
12 |     .subscribe(lambda s: print(s))
13 | 


--------------------------------------------------------------------------------
/code_examples/7.1C_recursive_file_iteration.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import os
 3 | 
 4 | 
 5 | def recursive_files_in_directory(folder):
 6 | 
 7 |     def emit_files_recursively(observer):
 8 |         for root, directories, filenames in os.walk(folder):
 9 |             for directory in directories:
10 |                 observer.on_next(os.path.join(root, directory))
11 |             for filename in filenames:
12 |                 observer.on_next(os.path.join(root, filename))
13 | 
14 |         observer.on_completed()
15 | 
16 |     return Observable.create(emit_files_recursively)
17 | 
18 | 
19 | recursive_files_in_directory('/home/thomas/Desktop/bbc_data_sets') \
20 |     .filter(lambda f: f.endswith('.txt')) \
21 |     .subscribe(on_next=lambda l: print(l), on_error=lambda e: print(e))
22 | 
23 | 


--------------------------------------------------------------------------------
/code_examples/7.2A_reading_sql_query.py:
--------------------------------------------------------------------------------
 1 | from sqlalchemy import create_engine, text
 2 | from rx import Observable
 3 | 
 4 | engine = create_engine('sqlite:///rexon_metals.db')
 5 | conn = engine.connect()
 6 | 
 7 | 
 8 | def get_all_customers():
 9 |     stmt = text("SELECT * FROM CUSTOMER")
10 |     return Observable.from_(conn.execute(stmt))
11 | 
12 | 
13 | get_all_customers() \
14 |     .map(lambda r: r[0]) \
15 |     .subscribe(lambda r: print(r))
16 | 


--------------------------------------------------------------------------------
/code_examples/7.2C_merging_sql_queries.py:
--------------------------------------------------------------------------------
 1 | from sqlalchemy import create_engine, text
 2 | from rx import Observable
 3 | 
 4 | engine = create_engine('sqlite:///rexon_metals.db')
 5 | conn = engine.connect()
 6 | 
 7 | 
 8 | def get_all_customers():
 9 |     stmt = text("SELECT * FROM CUSTOMER")
10 |     return Observable.from_(conn.execute(stmt))
11 | 
12 | 
13 | def customer_for_id(customer_id):
14 |     stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id")
15 |     return Observable.from_(conn.execute(stmt, id=customer_id))
16 | 
17 | 
18 | # Query customers with IDs 1, 3, and 5
19 | Observable.from_([1, 3, 5]) \
20 |     .flat_map(lambda id: customer_for_id(id)) \
21 |     .subscribe(lambda r: print(r))
22 | 


--------------------------------------------------------------------------------
/code_examples/7.2D_writing_sql_updates.py:
--------------------------------------------------------------------------------
 1 | from sqlalchemy import create_engine, text
 2 | from rx import Observable
 3 | 
 4 | 
 5 | engine = create_engine('sqlite:///rexon_metals.db')
 6 | conn = engine.connect()
 7 | 
 8 | 
 9 | def get_all_customers():
10 |     stmt = text("SELECT * FROM CUSTOMER")
11 |     return Observable.from_(conn.execute(stmt))
12 | 
13 | 
14 | def customer_for_id(customer_id):
15 |     stmt = text("SELECT * FROM CUSTOMER WHERE CUSTOMER_ID = :id")
16 |     return Observable.from_(conn.execute(stmt, id=customer_id))
17 | 
18 | 
19 | def insert_new_customer(customer_name, region, street_address, city, state, zip_code):
20 |     stmt = text("INSERT INTO CUSTOMER (NAME, REGION, STREET_ADDRESS, CITY, STATE, ZIP) VALUES ("
21 |                 ":customer_name, :region, :street_address, :city, :state, :zip_code)")
22 | 
23 |     result = conn.execute(stmt, customer_name=customer_name, region=region, street_address=street_address, city=city, state=state, zip_code=zip_code)
24 |     return Observable.just(result.lastrowid)
25 | 
26 | # Create new customer, emit primary key ID, and query that customer
27 | insert_new_customer('RMS Materials','Northeast', '5764 Carrier Ln', 'Boston', 'Massachusetts', '02201') \
28 |     .flat_map(lambda i: customer_for_id(i)) \
29 |     .subscribe(lambda s: print(s))
30 | 


--------------------------------------------------------------------------------
/code_examples/7.3A_reading_words_from_text_file.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import re
 3 | 
 4 | 
 5 | def words_from_file(file_name):
 6 |     file = open(file_name)
 7 | 
 8 |     # parse, clean, and push words in text file
 9 |     return Observable.from_(file) \
10 |         .flat_map(lambda s: Observable.from_(s.split())) \
11 |         .map(lambda w: re.sub(r'[^\w]', '', w)) \
12 |         .filter(lambda w: w != "") \
13 |         .map(lambda w: w.lower())
14 | 
15 | article_file = "bbc_news_article.txt"
16 | words_from_file(article_file).subscribe(lambda w: print(w))
17 | 


--------------------------------------------------------------------------------
/code_examples/7.3B_counting_word_occurrences.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import re
 3 | 
 4 | 
 5 | def words_from_file(file_name):
 6 |     file = open(file_name)
 7 | 
 8 |     # parse, clean, and push words in text file
 9 |     return Observable.from_(file) \
10 |         .flat_map(lambda s: Observable.from_(s.split())) \
11 |         .map(lambda w: re.sub(r'[^\w\s]', '', w)) \
12 |         .filter(lambda w: w != "") \
13 |         .map(lambda w: w.lower())
14 | 
15 | 
16 | 
17 | def word_counter(file_name):
18 | 
19 |     # count words using `group_by()`
20 |     # pair each word with its count as a tuple
21 |     return words_from_file(file_name) \
22 |         .group_by(lambda word: word) \
23 |         .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct)))
24 | 
25 | article_file = "bbc_news_article.txt"
26 | word_counter(article_file).subscribe(lambda w: print(w))
27 | 


--------------------------------------------------------------------------------
/code_examples/7.3C_scheduling_reactive_word_counter.py:
--------------------------------------------------------------------------------
 1 | # Schedules a reactive process that counts the words in a text file every three seconds,
 2 | # but only prints it as a dict if it has changed
 3 | 
 4 | from rx import Observable
 5 | import re
 6 | 
 7 | 
 8 | def words_from_file(file_name):
 9 |     file = open(file_name)
10 | 
11 |     # parse, clean, and push words in text file
12 |     return Observable.from_(file) \
13 |         .flat_map(lambda s: Observable.from_(s.split())) \
14 |         .map(lambda w: re.sub(r'[^\w\s]', '', w)) \
15 |         .filter(lambda w: w != "") \
16 |         .map(lambda w: w.lower())
17 | 
18 | 
19 | 
20 | def word_counter(file_name):
21 | 
22 |     # count words using `group_by()`
23 |     # pair each word with its count as a tuple
24 |     return words_from_file(file_name) \
25 |         .group_by(lambda word: word) \
26 |         .flat_map(lambda grp: grp.count().map(lambda ct: (grp.key, ct)))
27 | 
28 | 
29 | # composes the above word_counter() into a dict
30 | def word_counter_as_dict(file_name):
31 |     return word_counter(file_name).to_dict(lambda t: t[0], lambda t: t[1])
32 | 
33 | 
34 | # Build a word count dict for the article every three seconds,
35 | # but only re-print it if the text is edited and the word counts change
36 | 
37 | article_file = "bbc_news_article.txt"
38 | 
39 | # create a dict every three seconds, but only push if it changed
40 | Observable.interval(3000) \
41 |     .flat_map(lambda i: word_counter_as_dict(article_file)) \
42 |     .distinct_until_changed() \
43 |     .subscribe(lambda word_ct_dict: print(word_ct_dict))
44 | 
45 | # Keep alive until the user presses Enter
46 | input("Starting, press Enter to quit\n")
47 | 


--------------------------------------------------------------------------------
/code_examples/8.1A_connectableobservable.py:
--------------------------------------------------------------------------------
1 | from rx import Observable
2 | 
3 | source = Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]).publish()
4 | 
5 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
6 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
7 | 
8 | source.connect()
9 | 


--------------------------------------------------------------------------------
/code_examples/8.1B_sharing_observable.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import time
 3 | 
 4 | source = Observable.interval(1000).publish()
 5 | 
 6 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
 7 | source.connect()
 8 | 
 9 | # sleep 5 seconds, then add another subscriber
10 | time.sleep(5)
11 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
12 | 
13 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/8.1C_refcount.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | import time
 3 | 
 4 | source = Observable.interval(1000).publish().ref_count()
 5 | 
 6 | source.subscribe(lambda s: print("Subscriber 1: {0}".format(s)))
 7 | 
 8 | # sleep 5 seconds, then add another subscriber
 9 | time.sleep(5)
10 | source.subscribe(lambda s: print("Subscriber 2: {0}".format(s)))
11 | 
12 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/8.2_twitter_feed_for_topics.py:
--------------------------------------------------------------------------------
 1 | from tweepy.streaming import StreamListener
 2 | from tweepy import OAuthHandler
 3 | from tweepy import Stream
 4 | import json
 5 | from rx import Observable
 6 | 
 7 | # Variables that contain the user credentials for accessing the Twitter API
 8 | access_token = "PUT YOURS HERE"
 9 | access_token_secret = "PUT YOURS HERE"
10 | consumer_key = "PUT YOURS HERE"
11 | consumer_secret = "PUT YOURS HERE"
12 | 
13 | 
14 | def tweets_for(topics):
15 |     def observe_tweets(observer):
16 |         class TweetListener(StreamListener):
17 |             def on_data(self, data):
18 |                 observer.on_next(data)
19 |                 return True
20 | 
21 |             def on_error(self, status):
22 |                 observer.on_error(status)
23 | 
24 |         # This handles Twitter authentication and the connection to the Twitter Streaming API
25 |         l = TweetListener()
26 |         auth = OAuthHandler(consumer_key, consumer_secret)
27 |         auth.set_access_token(access_token, access_token_secret)
28 |         stream = Stream(auth, l)
29 |         stream.filter(track=topics)
30 | 
31 |     return Observable.create(observe_tweets).share()
32 | 
33 | 
34 | topics = ['Britain', 'France']
35 | 
36 | tweets_for(topics) \
37 |     .map(lambda d: json.loads(d)) \
38 |     .subscribe(on_next=lambda s: print(s), on_error=lambda e: print(e))


--------------------------------------------------------------------------------
/code_examples/9.1A_sequential_long_running_tasks.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from threading import current_thread
 3 | import multiprocessing, time, random
 4 | 
 5 | 
 6 | def intense_calculation(value):
 7 |     # sleep for a random duration between 0.5 and 2.0 seconds to simulate a long-running calculation
 8 |     time.sleep(random.randint(5,20) * .1)
 9 |     return value
10 | 
11 | # Create Process 1
12 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
13 |     .map(lambda s: intense_calculation(s)) \
14 |     .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)),
15 |                on_error=lambda e: print(e),
16 |                on_completed=lambda: print("PROCESS 1 done!"))
17 | 
18 | # Create Process 2
19 | Observable.range(1,10) \
20 |     .map(lambda s: intense_calculation(s)) \
21 |     .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)),
22 |                on_error=lambda e: print(e),
23 |                on_completed=lambda: print("PROCESS 2 done!"))
24 | 
25 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/9.1B_using_subscribe_on.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from rx.concurrency import ThreadPoolScheduler
 3 | from threading import current_thread
 4 | import multiprocessing, time, random
 5 | 
 6 | 
 7 | def intense_calculation(value):
 8 |     # sleep for a random duration between 0.5 and 2.0 seconds to simulate a long-running calculation
 9 |     time.sleep(random.randint(5,20) * .1)
10 |     return value
11 | 
12 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
13 | optimal_thread_count = multiprocessing.cpu_count() + 1
14 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
15 | 
16 | print("We are using {0} threads".format(optimal_thread_count))
17 | 
18 | # Create Process 1
19 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
20 |     .map(lambda s: intense_calculation(s)) \
21 |     .subscribe_on(pool_scheduler) \
22 |     .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)),
23 |                on_error=lambda e: print(e),
24 |                on_completed=lambda: print("PROCESS 1 done!"))
25 | 
26 | # Create Process 2
27 | Observable.range(1,10) \
28 |     .map(lambda s: intense_calculation(s)) \
29 |     .subscribe_on(pool_scheduler) \
30 |     .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)),
31 |                on_error=lambda e: print(e),
32 |                on_completed=lambda: print("PROCESS 2 done!"))
33 | 
34 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/9.2_using_observe_on.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from rx.concurrency import ThreadPoolScheduler
 3 | from threading import current_thread
 4 | import multiprocessing, time, random
 5 | 
 6 | def intense_calculation(value):
 7 |     # sleep for a random duration between 0.5 and 2.0 seconds to simulate a long-running calculation
 8 |     time.sleep(random.randint(5,20) * .1)
 9 |     return value
10 | 
11 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
12 | optimal_thread_count = multiprocessing.cpu_count() + 1
13 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
14 | 
15 | # Create Process 1
16 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon"]) \
17 |     .map(lambda s: intense_calculation(s)) \
18 |     .subscribe_on(pool_scheduler) \
19 |     .subscribe(on_next=lambda s: print("PROCESS 1: {0} {1}".format(current_thread().name, s)),
20 |                on_error=lambda e: print(e),
21 |                on_completed=lambda: print("PROCESS 1 done!"))
22 | 
23 | # Create Process 2
24 | Observable.range(1,10) \
25 |     .map(lambda s: intense_calculation(s)) \
26 |     .subscribe_on(pool_scheduler) \
27 |     .subscribe(on_next=lambda i: print("PROCESS 2: {0} {1}".format(current_thread().name, i)), on_error=lambda e: print(e), on_completed=lambda: print("PROCESS 2 done!"))
28 | 
29 | # Create Process 3, which is infinite
30 | Observable.interval(1000) \
31 |     .map(lambda i: i * 100) \
32 |     .observe_on(pool_scheduler) \
33 |     .map(lambda s: intense_calculation(s)) \
34 |     .subscribe(on_next=lambda i: print("PROCESS 3: {0} {1}".format(current_thread().name, i)), on_error=lambda e: print(e))
35 | 
36 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/9.3_processing_emissions_in_parallel.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from rx.concurrency import ThreadPoolScheduler
 3 | from threading import current_thread
 4 | import multiprocessing, time, random
 5 | 
 6 | 
 7 | def intense_calculation(value):
 8 |     # sleep for a random duration between 0.5 and 2.0 seconds to simulate a long-running calculation
 9 |     time.sleep(random.randint(5,20) * .1)
10 |     return value
11 | 
12 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
13 | optimal_thread_count = multiprocessing.cpu_count() + 1
14 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
15 | 
16 | # Create Parallel Process
17 | Observable.from_(["Alpha","Beta","Gamma","Delta","Epsilon","Zeta","Eta","Theta","Iota","Kappa"]) \
18 |     .flat_map(lambda s:
19 |         Observable.just(s).subscribe_on(pool_scheduler).map(lambda s: intense_calculation(s))
20 |     ) \
21 |     .subscribe(on_next=lambda i: print("{0} {1}".format(current_thread().name, i)),
22 |                on_error=lambda e: print(e),
23 |                on_completed=lambda: print("PROCESS 1 done!"))
24 | 
25 | 
26 | input("Press Enter to exit\n")


--------------------------------------------------------------------------------
/code_examples/9.4_switch_map.py:
--------------------------------------------------------------------------------
 1 | from rx import Observable
 2 | from rx.concurrency import ThreadPoolScheduler
 3 | from threading import current_thread
 4 | import multiprocessing, time, random
 5 | 
 6 | 
 7 | def intense_calculation(value):
 8 |     # sleep for a random duration between 0.5 and 2.0 seconds to simulate a long-running calculation
 9 |     time.sleep(random.randint(5, 20) * .1)
10 |     return value
11 | 
12 | 
13 | # calculate the number of CPUs and add 1, then create a ThreadPoolScheduler with that number of threads
14 | optimal_thread_count = multiprocessing.cpu_count() + 1
15 | pool_scheduler = ThreadPoolScheduler(optimal_thread_count)
16 | 
17 | strings = Observable.from_(["Alpha", "Beta", "Gamma", "Delta", "Epsilon", "Zeta", "Eta", "Theta", "Iota", "Kappa"])
18 | 
19 | Observable.interval(6000) \
20 |     .switch_map(lambda i: strings.map(lambda s: intense_calculation(s)).subscribe_on(pool_scheduler)) \
21 |     .subscribe(on_next=lambda s: print("Received {0} on {1}".format(s, current_thread().name)),
22 |                on_error=lambda e: print(e))
23 | 
24 | input("Press Enter to exit\n")
25 | 


--------------------------------------------------------------------------------
/code_examples/rexon_metals.db:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/code_examples/rexon_metals.db


--------------------------------------------------------------------------------
/resources/bbc_news_article.txt:
--------------------------------------------------------------------------------
 1 | Giant waves damage S Asia economy
 2 | 
 3 | Governments, aid agencies, insurers and travel firms are among those counting the cost of the massive earthquake and waves that hammered southern Asia.
 4 | 
 5 | The worst-hit areas are Sri Lanka, India, Indonesia and Thailand, with at least 23,000 people killed. Early estimates from the World Bank put the amount of aid needed at about $5bn (£2.6bn), similar to the cash offered Central America after Hurricane Mitch. Mitch killed about 10,000 people and caused damage of about $10bn in 1998. World Bank spokesman Damien Milverton told the Wall Street Journal that he expected an aid package of financing and debt relief.
 6 | 
 7 | Tourism is a vital part of the economies of the stricken countries, providing jobs for 19 million people in the south east Asian region, according to the World Travel and Tourism Council (WTTC). In the Maldives islands, in the Indian ocean, two-thirds of all jobs depend on tourism.
 8 | 
 9 | But the damage covers fishing, farming and businesses too, with hundreds of thousands of buildings and small boats destroyed by the waves. International agencies have pledged their support; most say it is impossible to gauge the extent of the damage yet. The International Monetary Fund (IMF) has promised rapid action to help the governments of the stricken countries cope.
10 | 
11 | "The IMF stands ready to do its part to assist these nations with appropriate support in their time of need," said managing director Rodrigo Rato. Only Sri Lanka and Bangladesh currently receive IMF support, while Indonesia, the quake's epicentre, has recently graduated from IMF assistance. It is up to governments to decide if they want IMF help. Other agencies, such as the Asian Development Bank, have said that it is too early to comment on the amount of aid needed. There is no underestimating the size of the problem, however. The United Nations' emergency relief coordinator, Jan Egeland, said that "this may be the worst national disaster in recent history because it is affecting so many heavily populated coastal areas... so many vulnerable communities. "Many people will have [had] their livelihoods, their whole future destroyed in a few seconds." He warned that "the longer term effects many be as devastating as the tidal wave or the tsunami itself" because of the risks of epidemics from polluted drinking water.
12 | 
13 | Insurers are also struggling to assess the cost of the damage, but several big players believe the final bill is likely to be less than the $27bn cost of the hurricanes that battered the US earlier this year.
14 | 
15 | "The region that's affected is very big so we have to check country-by-country what the situation is", said Serge Troeber, deputy head of the natural disasters department at Swiss Re, the world's second biggest reinsurance firm. "I should assume, however, that the overall dimension of insured damages is below the storm damages of the US," he said. Munich Re, the world's biggest reinsurer, said: "This is primarily a human tragedy. It is too early for us to state what our financial burden will be." Allianz has said it sees no significant impact on its profitability. However, a low insurance bill may simply reflect the general poverty of much of the region, rather than the level of economic devastation for those who live there.
16 | 
17 | The International Federation of the Red Cross and Red Crescent Societies told the Reuters news agency that it was seeking $6.5m for emergency aid.
18 | 
19 | "The biggest health challenges we face is the spread of waterborne diseases, particularly malaria and diarrhoea," the aid agency was quoted as saying. The European Union has said it will deliver 3m euros (£2.1m; $4.1m) of aid, according to the Wall Street Journal. The EU's Humanitarian Aid Commissioner, Louis Michel, was quoted as saying that it was key to bring aid "in those vital hours and days immediately after the disaster". Other countries also are reported to have pledged cash, while the US State Department said it was examining what aid was needed in the region. Getting companies and business up and running also may play a vital role in helping communities recover from the weekend's events.
20 | 
21 | Many of the worst-hit areas, such as Sri Lanka, Thailand's Phuket island and the Maldives, are popular tourist resorts that are key to local economies.
22 | 
23 | December and January are two of the busiest months for the travel in southern Asia and the damage will be even more keenly felt as the industry was only just beginning to emerge from a post 9/11 slump. Growth has been rapid in southeast Asia, with the World Tourism Organisation figures showing a 45% increase in tourist revenues in the region during the first 10 months of 2004. In southern Asia that expansion is 23%. "India continues to post excellent results thanks to increased promotion and product development, but also to the upsurge in business travel driven by the rapid economic development of the country," the WTO said. "Arrivals to other destinations such as... Maldives and Sri Lanka also thrived." In Thailand, tourism accounts for about 6% of the country's annual gross domestic product, or about $8bn. In Singapore the figure is close to 5%. Tourism also brings in much needed foreign currency. In the short-term, however, travel companies are cancelling flights and trips. That has hit shares across Asia and Europe, with investors saying that earnings and economic growth are likely to slow.
24 | 


--------------------------------------------------------------------------------
/resources/reactive_python_slides.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/resources/reactive_python_slides.pptx


--------------------------------------------------------------------------------
/resources/rexon_metals.db:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/resources/rexon_metals.db


--------------------------------------------------------------------------------
/setting_up_twitter_api.md:
--------------------------------------------------------------------------------
 1 | # Setting Up Twitter API
 2 | 
 3 | To use Tweepy, which is needed for a project in this course, you will need to set up a Twitter API account.
 4 | 
 5 | First, go to [apps.twitter.com](https://apps.twitter.com). Make sure you have a Twitter account set up and "Sign in". Click the "Create New App" button as shown below (Figure 1).
 6 | 
 7 | ![](http://i.imgur.com/VfohYTN.png)
 8 | 
 9 | **Figure 1:** Twitter Apps Dashboard
10 | 
11 | 
12 | After that, fill in the form to create an API account. Give any name you like, and provide a "Website" if you have one you own. Otherwise, just use a placeholder as I've done below. It does not have to be an existing site (Figure 2).
13 | 
14 | ![](http://i.imgur.com/LojWHv1.png)
15 | 
16 | **Figure 2:** Filling out the form to create an API account
17 | 
18 | After that, your API account will be set up. You will see the page below. Click the "Manage Keys and Access Tokens" link (Figure 3).
19 | 
20 | ![](http://i.imgur.com/ruDx1Ms.png)
21 | 
22 | **Figure 3:** Click "Manage Keys and Access Tokens"
23 | 
24 | You will then come to a page for managing your keys and tokens, which Tweepy needs in order to use this account. Note the "Consumer Key" and "Consumer Secret" values, as you will need them later. Then click "Create my access token" (Figure 4).
25 | 
26 | ![](https://i.imgur.com/V2kILPr.png)
27 | 
28 | **Figure 4:** Note the values above, and click "Create my access token"
29 | 
30 | A panel will pop up at the bottom with two more values: "Access Token" and "Access Token Secret". Hold on to these two values as well (Figure 5).
31 | 
32 | ![](http://i.imgur.com/kiTk8kh.png)
33 | 
34 | **Figure 5:** The "Access Token" and "Access Token Secret" values
35 | 
36 | Do not share these four key/token values as they are used to access your Twitter API account. Be sure to follow the usage agreement so your access is not revoked.
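
One way to keep the four key/token values out of source code is to read them from environment variables at startup. The sketch below is an assumption and not part of the course code: the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_twitter_credentials` are hypothetical, and it uses only the Python standard library.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(environ=os.environ):
    """Return the four key/token values as a dict, or raise if any is missing."""
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

The returned values could then be passed to Tweepy's `OAuthHandler(consumer_key, consumer_secret)` and `set_access_token(access_token, access_token_secret)` in place of the hard-coded placeholder strings used in the course's Twitter example.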
37 | 


--------------------------------------------------------------------------------
/setting_up_twitter_api.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thomasnield/oreilly_reactive_python_for_data/dd223b28bad8ea8f00242e827bd020a9c94e2692/setting_up_twitter_api.pdf


--------------------------------------------------------------------------------