├── .gitignore
├── INSTALL.rst
├── LICENSE
├── README.rst
├── TEACHING.rst
├── binomial-distn.py
├── binomial.png
├── edward
│   ├── bayes-linreg.png
│   ├── bayesian-linear-regression.py
│   ├── edwd-simple-nnet.py
│   └── linear_mixed_effects_models.ipynb
├── poisson-distn.py
├── poisson.png
├── pubs
│   ├── edward-1.pdf
│   └── edward-2.pdf
├── pymc3
│   ├── bayes-lin-reg.png
│   ├── bayesian_neural_network_wiecki.ipynb
│   ├── data
│   │   ├── jester-dataset-v1-dense-first-1000.csv
│   │   ├── jokes.json
│   │   └── txtdata.csv
│   ├── introduction.ipynb
│   ├── linear_regression.ipynb
│   ├── nn-0.png
│   ├── nn-1.png
│   ├── nn-2.png
│   └── probabilistic_matrix_factorization.ipynb
└── slides
    ├── bmh.jpeg
    ├── boxs-loop.png
    ├── coin_flip.png
    ├── conceptual.odg
    ├── data-decisions.pdf
    ├── edward.png
    ├── galvanize-logo.png
    ├── inference-graph.png
    ├── keras.png
    ├── pp.bib
    ├── probabilistic-programming-intro.pdf
    ├── probabilistic-programming-intro.tex
    ├── pymc3.png
    ├── stan.jpeg
    ├── tensorflow.jpeg
    ├── theano.jpeg
    └── torch.jpeg

/.gitignore:
--------------------------------------------------------------------------------
1 | *.py[cod]
2 | 
3 | *~
4 | 
5 | .DS_Store
6 | 
7 | # C extensions
8 | *.so
9 | 
10 | # Packages
11 | *.egg
12 | *.egg-info
13 | dist
14 | build
15 | eggs
16 | parts
17 | bin
18 | var
19 | sdist
20 | develop-eggs
21 | .installed.cfg
22 | lib
23 | lib64
24 | __pycache__
25 | 
26 | # Installer logs
27 | pip-log.txt
28 | 
29 | # Unit test / coverage reports
30 | .coverage
31 | .tox
32 | nosetests.xml
33 | 
34 | # Translations
35 | *.mo
36 | 
37 | # Mr Developer
38 | .mr.developer.cfg
39 | .project
40 | .pydevproject
41 | 
42 | #python
43 | *.pyc
44 | *~
45 | *.ipynb_checkpoints/
46 | 
47 | ## restructured text
48 | *_build
49 | *_static
50 | 
51 | ## latex
52 | *.bak
53 | *.bbl
54 | *.out
55 | *.blg
56 | *.aux
57 | *.log
58 | *.backup
59 | *.bk
60 | *.orig
61 | *.dvi
62 | *.nav
63 | *.snm
64 | *.toc
65 | *.vrb
66 | 
67 | ## R
68 | .Rhistory
69 | 
70 | ## mcmc
71 | */traces/
72 | 
--------------------------------------------------------------------------------
/INSTALL.rst:
--------------------------------------------------------------------------------
1 | ## Under Ubuntu (for GPU support)
2 | sudo apt-get install libcupti-dev
3 | 
4 | ## For OSX and Ubuntu, install Edward
5 | pip install --upgrade tensorflow-gpu
6 | 
7 | or
8 | 
9 | pip install --upgrade tensorflow
10 | pip install --upgrade edward
11 | 
12 | ## For OSX and Ubuntu, install PyMC3
13 | pip install --upgrade pymc3
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 | 
3 | Copyright (c) 2017, Zipfian Academy
4 | All rights reserved.
5 | 
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 | 
9 | * Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 | 
12 | * Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 | 
16 | * Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 | 
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
30 | 
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
1 | *******************************************************
2 | Introduction to probabilistic programming with PyMC3
3 | *******************************************************
4 | 
5 | :Version: 1.0.0
6 | :Authors: Galvanize DSI
7 | :Web site: https://github.com/zipfian/probabilistic-programming-intro
8 | :Copyright: This document has been placed in the public domain.
9 | :License: These materials are released under the BSD-3 license unless otherwise noted
10 | 
11 | What is probabilistic programming all about?
12 | -----------------------------------------------
13 | 
14 | There are three major trends in the machine learning side of data
15 | science: **big data**, **deep learning** and **probabilistic
16 | programming**. There has been a sustained focus on the first two, but
17 | recent advances in the way models are evaluated have brought
18 | attention back to probabilistic programming.
19 | 
20 | At the core of probabilistic programming is the idea that statistical
21 | models are written in code and then evaluated by MCMC
22 | sampling algorithms. New variational inference algorithms have
23 | emerged as a means of scaling these methods to production level.
24 | 
25 | There are three major reasons to consider probabilistic programming,
26 | with the Bayesian paradigm guiding model creation:
27 | 
28 | 1. **Customization** - We can create models that have built-in hypothesis tests
29 | 2. **Propagation of uncertainty** - A degree of belief is associated with each prediction and estimate
30 | 3. **Intuition** - The models are essentially 'white-box', which provides insight into our data
31 | 
32 | Overview
33 | ---------------------
34 | 
35 | Ultimately, we create models to guide the decision-making process. It
36 | is a typical task in data science to build a recommendation system or
37 | make a prediction about a medical diagnosis, but these recommendations
38 | or diagnoses mean so much more when they are accompanied by an
39 | estimated level of uncertainty.
40 | 
41 | This short talk is designed to familiarize Python programmers with the basic
42 | concepts of **probabilistic programming**. We will introduce the
43 | Python package PyMC3 with a tutorial example. Then we will use it to
44 | build a simple recommendation system, and finally we will finish up
45 | with an implementation of a probabilistic neural network.
46 | 
47 | Install
48 | ---------------
49 | 
50 | .. code:: bash
51 | 
52 |    pip install --process-dependency-links git+https://github.com/pymc-devs/pymc3
53 | 
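To check that the install worked, a tiny model can be specified and sampled end to end. The following is only a minimal sketch, assuming the PyMC3 3.x API (where ``pm.sample`` assigns the NUTS sampler automatically):

.. code:: python

    import numpy as np
    import pymc3 as pm

    # ten coin flips, seven of which came up heads
    flips = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])

    with pm.Model() as model:
        p = pm.Beta('p', alpha=1, beta=1)               # prior on the bias of the coin
        obs = pm.Bernoulli('obs', p=p, observed=flips)  # likelihood of the observed flips
        trace = pm.sample(1000)                         # MCMC sampling (NUTS by default)

    # posterior Beta(8, 4) has mean 8/12, so this prints roughly 0.67
    print(trace['p'].mean())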
--------------------------------------------------------------------------------
/TEACHING.rst:
--------------------------------------------------------------------------------
1 | ***************************************
2 | Notes on teaching this repository
3 | ***************************************
4 | 
5 | The intended audience has a working knowledge of Python and a basic
6 | understanding of statistics. It is important to explain up front that
7 | the materials are intended as both an introduction and a reference, so
8 | it is not critical that participants absorb everything.
9 | 
10 | If you are going to teach this workshop, first watch this video:
11 | 
12 | * https://www.youtube.com/watch?v=LlzVlqVzeD8
13 | 
14 | Then read through at least the first 3 chapters of Cam Davidson-Pilon's book to get some perspective:
15 | 
16 | * https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers
17 | 
18 | Then go through the getting started guide to get familiar with PyMC3:
19 | 
20 | * https://pymc-devs.github.io/pymc3/notebooks/getting_started.html
21 | 
22 | From these materials, and from other sources, anyone who expects to teach this content should be able to:
23 | 
24 | * Conceptually explain MCMC
25 | * Talk about recent advances in Hamiltonian Monte Carlo methods like NUTS
26 | * Talk about ADVI and mini-batch training (see the sketch at the end of these notes)
27 | * Explain why building on Theano was a major step forward
28 | 
29 | Here is a reasonable way to break up the content.
30 | 
31 | 1. For about 1 hour go through the introductory examples
32 | 2. Spend 5-10 minutes going through the recommender example (avoid a play-by-play)
33 | 3. Spend 5-10 minutes going through the neural network example (avoid a play-by-play)
34 | 
35 | 
36 | The introductory examples are the main content, and the recommender
37 | and neural network examples are meant to showcase some of the more
38 | advanced possibilities.
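For the MCMC and ADVI bullets above, it helps to have a concrete, side-by-side snippet on screen. This is only a sketch, assuming a PyMC3 3.2-era API (``pm.fit`` and ``pm.Minibatch``):

.. code:: python

    import numpy as np
    import pymc3 as pm

    # toy data: 1,000 draws from a normal with unknown mean
    data = np.random.normal(1.5, 1.0, size=1000)

    with pm.Model() as model:
        mu = pm.Normal('mu', mu=0, sd=10)
        obs = pm.Normal('obs', mu=mu, sd=1.0, observed=data)

        # MCMC: NUTS, a recent Hamiltonian Monte Carlo variant
        trace = pm.sample(1000, tune=500)

        # ADVI: fit an approximate posterior by optimization, then sample it;
        # pm.Minibatch(data, batch_size=100) can replace `observed` to scale up
        approx = pm.fit(n=10000, method='advi')
        vi_trace = approx.sample(1000)

    # both estimates should land near the true mean of 1.5
    print(trace['mu'].mean(), vi_trace['mu'].mean())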
39 | 
--------------------------------------------------------------------------------
/binomial-distn.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | plot the binomial distribution under several parameterizations
4 | """
5 | 
6 | import sys,os
7 | import numpy as np
8 | import matplotlib.pyplot as plt
9 | import scipy.stats as scs
10 | plt.style.use('bmh')
11 | 
12 | ## declare variables
13 | font_size = 11
14 | font_name = 'sans-serif'
15 | n = 10000  # note: overwritten by the loop parameters below
16 | fig = plt.figure(figsize=(10,6),dpi=300)
17 | splot = 0
18 | 
19 | ## loop through parameterizations of the binomial
20 | for n,p in [(5,0.25),(5,0.5),(5,0.75)]:
21 |     splot += 1
22 |     ax = fig.add_subplot(1,3,splot)
23 | 
24 |     x = np.arange(scs.binom.ppf(0.01,n,p),scs.binom.ppf(0.99,n,p))
25 |     ax.plot(x, scs.binom.pmf(x,n,p), 'bo', ms=8, label='pmf')
26 |     ax.vlines(x, 0, scs.binom.pmf(x,n,p), colors='b', lw=5, alpha=0.5)
27 |     rv = scs.binom(n,p)
28 | 
29 |     ax.set_ylim((0,1.0))
30 |     ax.set_xlim((-0.5,4.5))
31 |     ax.set_title("n=%s,p=%s"%(n,p))
32 |     ax.set_aspect(1./ax.get_data_ratio())
33 | 
34 |     for t in ax.get_xticklabels():
35 |         t.set_fontsize(font_size-1)
36 |         t.set_fontname(font_name)
37 |     for t in ax.get_yticklabels():
38 |         t.set_fontsize(font_size-1)
39 |         t.set_fontname(font_name)
40 | 
41 | plt.savefig("binomial.png",dpi=400)
42 | plt.show()
43 | 
--------------------------------------------------------------------------------
/binomial.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/binomial.png
--------------------------------------------------------------------------------
/edward/bayes-linreg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/edward/bayes-linreg.png
--------------------------------------------------------------------------------
/edward/bayesian-linear-regression.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | This example has been modified from:
4 | 
5 | http://edwardlib.org/tutorials/supervised-regression
6 | """
7 | 
8 | import os
9 | import numpy as np
10 | import matplotlib.pyplot as plt
11 | import tensorflow as tf
12 | import edward as ed
13 | plt.style.use('bmh')
14 | from edward.models import Normal
15 | 
16 | 
17 | def buffer_layout(ax,buff=0.01):
18 |     """use the x and y limits to add well-spaced margins"""
19 |     xmin,xmax = ax.get_xlim()
20 |     ymin,ymax = ax.get_ylim()
21 |     xbuff = buff * (xmax - xmin)
22 |     ybuff = buff * (ymax - ymin)
23 |     ax.set_xlim(xmin-xbuff,xmax+xbuff)
24 |     ax.set_ylim(ymin-ybuff,ymax+ybuff)
25 | 
26 | 
27 | def build_toy_dataset(N, w, noise_std=0.1):
28 |     """generate data"""
29 |     D = len(w)
30 |     x = np.random.randn(N, D)
31 |     y = np.dot(x, w) + np.random.normal(0, noise_std, size=N)
32 |     return x, y
33 | 
34 | N = 40  # number of data points
35 | D = 10  # number of features
36 | 
37 | w_true = np.random.randn(D)
38 | X_train, y_train = build_toy_dataset(N, w_true)
39 | X_test, y_test = build_toy_dataset(N, w_true)
40 | 
41 | X = tf.placeholder(tf.float32, [N, D])
42 | w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
43 | b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
44 | y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))
45 | 
46 | qw = Normal(loc=tf.Variable(tf.random_normal([D])),
47 |             scale=tf.nn.softplus(tf.Variable(tf.random_normal([D]))))
48 | qb = Normal(loc=tf.Variable(tf.random_normal([1])),
49 |             scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))
50 | 
51 | 
52 | inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
53 | inference.run(n_samples=5, n_iter=250)
54 | 
55 | y_post = ed.copy(y, {w: qw, b: qb})
56 | 
57 | # This is equivalent to
58 | # y_post = Normal(loc=ed.dot(X, qw) + qb, scale=tf.ones(N))
59 | 
60 | print("Mean squared error on test data:")
61 | print(ed.evaluate('mean_squared_error', data={X: X_test, y_post: y_test}))
62 | 
63 | print("Mean absolute error on test data:")
64 | print(ed.evaluate('mean_absolute_error', data={X: X_test, y_post: y_test}))
65 | 
66 | 
67 | def visualise(X_data, y_data, w, b, ax, title, n_samples=10):
68 |     w_samples = w.sample(n_samples)[:, 0].eval()
69 |     b_samples = b.sample(n_samples).eval()
70 |     ax.scatter(X_data[:, 0], y_data)
71 |     inputs = np.linspace(-8, 8, num=400)
72 |     for ns in range(n_samples):
73 |         output = inputs * w_samples[ns] + b_samples[ns]
74 |         ax.plot(inputs, output)
75 | 
76 |     ax.set_title(title)
77 |     buffer_layout(ax)
78 |     ax.set_aspect(1./ax.get_data_ratio())
79 | 
80 | fig = plt.figure(figsize=(10,10))
81 | ax1 = fig.add_subplot(121)
82 | ax2 = fig.add_subplot(122)
83 | 
84 | # Visualize samples from the prior
85 | visualise(X_train, y_train, w, b, ax1, "Samples from prior")
86 | 
87 | # Visualize samples from the posterior
88 | visualise(X_train, y_train, qw, qb, ax2, "Samples from posterior")
89 | 
90 | 
91 | plt.savefig("bayes-linreg.png",dpi=400,bbox_inches = 'tight', pad_inches = 0)
92 | plt.show()
93 | 
--------------------------------------------------------------------------------
/edward/edwd-simple-nnet.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | 
3 | """
4 | simple neural network example to model cosine data
5 | """
6 | 
7 | import numpy as np
8 | import matplotlib.pyplot as plt
9 | import tensorflow as tf
10 | from edward.models import Normal
11 | import edward as ed
12 | plt.style.use('bmh')
13 | 
14 | def buffer_layout(ax,buff=0.01):
15 |     """use the x and y limits to add well-spaced margins"""
16 |     xmin,xmax = ax.get_xlim()
17 |     ymin,ymax = ax.get_ylim()
18 |     xbuff = buff * (xmax - xmin)
19 |     ybuff = buff * (ymax - ymin)
20 |     ax.set_xlim(xmin-xbuff,xmax+xbuff)
21 |     ax.set_ylim(ymin-ybuff,ymax+ybuff)
22 | 
23 | ## create the data
24 | print("\t ...specify data")
25 | x_train = np.linspace(-3, 3, num=50)
26 | y_train = np.cos(x_train) + np.random.normal(0, 0.1, size=50)
27 | x_train = x_train.astype(np.float32).reshape((50, 1))
28 | y_train = y_train.astype(np.float32).reshape((50, 1))
29 | 
30 | ## plot data
31 | fig = plt.figure(figsize=(8,8))
32 | ax = fig.add_subplot(111)
33 | 
34 | ax.plot(x_train,y_train,color='darkblue',markersize=10,linestyle='none',marker='s')
35 | buffer_layout(ax)
36 | ax.set_aspect(1./ax.get_data_ratio())
37 | 
38 | ## specify the weights and biases of a simple neural network
39 | print("\t ...specify base model")
40 | W_0 = Normal(loc=tf.zeros([1, 2]), scale=tf.ones([1, 2]))
41 | W_1 = Normal(loc=tf.zeros([2, 1]), scale=tf.ones([2, 1]))
42 | b_0 = Normal(loc=tf.zeros(2), scale=tf.ones(2))
43 | b_1 = Normal(loc=tf.zeros(1), scale=tf.ones(1))
44 | 
45 | ## use tanh nonlinearities
46 | x = x_train
47 | y = Normal(loc=tf.matmul(tf.tanh(tf.matmul(x, W_0) + b_0), W_1) + b_1, scale=0.1)
48 | 
49 | ## Specify a normal approximation over the weights and biases (for variational inference)
50 | print("\t ...specify variational approximation")
51 | qW_0 = Normal(loc=tf.Variable(tf.zeros([1, 2])),
52 |               scale=tf.nn.softplus(tf.Variable(tf.zeros([1, 2]))))
53 | qW_1 = Normal(loc=tf.Variable(tf.zeros([2, 1])),
54 |               scale=tf.nn.softplus(tf.Variable(tf.zeros([2, 1]))))
55 | qb_0 = Normal(loc=tf.Variable(tf.zeros(2)),
56 |               scale=tf.nn.softplus(tf.Variable(tf.zeros(2))))
57 | qb_1 = Normal(loc=tf.Variable(tf.zeros(1)),
58 |               scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))
59 | 
60 | ## carry out variational inference
61 | print("\t ...performing inference")
62 | inference = ed.KLqp({W_0: qW_0, b_0: qb_0,
63 |                      W_1: qW_1, b_1: qb_1}, data={y: y_train})
64 | inference.run(n_iter=1000)
65 | 
66 | plt.show()
67 | 
68 | print("done")
69 | 
--------------------------------------------------------------------------------
/poisson-distn.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | plot the Poisson distribution under several parameterizations
4 | """
5 | 
6 | import sys,os
7 | import numpy as np
8 | import matplotlib.pyplot as plt
9 | import scipy.stats as scs
10 | plt.style.use('bmh')
11 | 
12 | ## declare variables
13 | font_size = 11
14 | font_name = 'sans-serif'
15 | n = 10000  # note: unused below
16 | fig = plt.figure(figsize=(10,6),dpi=300)
17 | splot = 0
18 | 
19 | ## loop through parameterizations of the Poisson
20 | for lamb in [1.0,2.0,5.0]:
21 |     splot += 1
22 |     ax = fig.add_subplot(1,3,splot)
23 | 
24 |     x = np.arange(scs.poisson.ppf(0.01, lamb),scs.poisson.ppf(0.99, lamb))
25 |     ax.plot(x, scs.poisson.pmf(x, lamb), 'bo', ms=8, label='pmf')
26 |     ax.vlines(x, 0, scs.poisson.pmf(x, lamb), colors='b', lw=5, alpha=0.5)
27 |     rv = scs.poisson(lamb)
28 | 
29 |     ax.set_xlim((-0.5,10.5))
30 |     ax.set_ylim((0,0.6))
31 |     ax.set_title("lambda=%s"%(lamb))
32 |     ax.set_aspect(1./ax.get_data_ratio())
33 | 
34 |     for t in ax.get_xticklabels():
35 |         t.set_fontsize(font_size-1)
36 |         t.set_fontname(font_name)
37 |     for t in ax.get_yticklabels():
38 |         t.set_fontsize(font_size-1)
39 |         t.set_fontname(font_name)
40 | plt.savefig("poisson.png",dpi=400)
41 | plt.show()
42 | 
--------------------------------------------------------------------------------
/poisson.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/poisson.png
--------------------------------------------------------------------------------
/pubs/edward-1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pubs/edward-1.pdf
--------------------------------------------------------------------------------
/pubs/edward-2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pubs/edward-2.pdf
--------------------------------------------------------------------------------
/pymc3/bayes-lin-reg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pymc3/bayes-lin-reg.png
--------------------------------------------------------------------------------
/pymc3/data/jokes.json:
-------------------------------------------------------------------------------- 1 | {"1": "A man visits the doctor. The doctor says \"I have bad news for you.You have\ncancer and Alzheimer's disease\". \nThe man replies \"Well,thank God I don't have cancer!\"\n", "2": "This couple had an excellent relationship going until one day he came home\nfrom work to find his girlfriend packing. He asked her why she was leaving him\nand she told him that she had heard awful things about him. \n\n\"What could they possibly have said to make you move out?\" \n\n\"They told me that you were a pedophile.\" \n\nHe replied, \"That's an awfully big word for a ten year old.\" \n", "3": "Q. What's 200 feet long and has 4 teeth? \n\nA. The front row at a Willie Nelson Concert.\n", "4": "Q. What's the difference between a man and a toilet? \n\nA. A toilet doesn't follow you around after you use it.\n", "5": "Q.\tWhat's O. J. Simpson's Internet address? \nA.\tSlash, slash, backslash, slash, slash, escape.\n", "6": "Bill & Hillary are on a trip back to Arkansas. They're almost out of gas, so Bill pulls into a service station on the outskirts of\ntown. The attendant runs out of the station to serve them when Hillary realizes it's an old boyfriend from high school. She and\nthe attendant chat as he gases up their car and cleans the windows. Then they all say good-bye. \n\nAs Bill pulls the car onto the road, he turns to Hillary and says, 'Now aren't you glad you married me and not him ? You could've\nbeen the wife of a grease monkey !' \n\nTo which Hillary replied, 'No, Bill. If I would have married him, you'd be pumping gas and he would be the President !' \n", "7": "How many feminists does it take to screw in a light bulb?\nThat's not funny.\n", "8": "Q. Did you hear about the dyslexic devil worshiper? \n\nA. He sold his soul to Santa.\n", "9": "A country guy goes into a city bar that has a dress code, and the maitre\nd' \ndemands he wear a tie. Discouraged, the guy goes to his car to sulk when \ninspiration strikes: He's got jumper cables in the trunk! So he wraps\nthem around his neck, sort of like a string tie (a bulky string tie to be\nsure) and returns to the bar. The maitre d' is reluctant, but says to the\nguy, \"Okay, you're a pretty resourceful fellow, you can come in... but\njust don't start anything\"! \n", "10": "Two cannibals are eating a clown, one turns to other and says: \n\"Does this taste funny to you? \n", "11": "Q. What do a hurricane, a tornado, and a redneck\ndivorce all have in common? \nA. Someone's going to lose their trailer...\n", "12": "A guy stood over his tee shot for what seemed an eternity, looking up, looking down, measuring the distance,\nfiguring the wind direction and speed. Driving his partner nuts.\n\nFinally his exasperated partner says, \"What the hell is taking so long? Hit the goddamn ball!\"\nThe guy answers, \"My wife is up there watching me from the clubhouse. I want to make this a perfect shot.\"\n\"Well, hell, man, you don't stand a snowball's chance in hell of hitting her from here!\" \n", "13": "They asked the Japanese visitor if they have elections in his\ncountry. \n\"Every Morning\" he answers.\n", "14": "The father was very anxious to marry off his only daughter so he wanted \nto impress her date. \"Do you like to screw,\" he says. \"Huh\" replied the \nsurprised first date. \"My daughter she loves to screw and she's good at it, \nyou and her should go screw,\" carefully explained the father. 
Now very \ninterested the boy replied, \"Yes, sir.\" Minutes later the girl came down \nthe stairs, kissed her father goodbye and the couple left. After only a \nfew minutes she reappeared, furious, dress torn, hair a mess and screamed\n\"Dammit, Daddy, it's the TWIST, get it straight!\" \n", "15": "Q: What did the blind person say when given some matzah?\n\nA: Who the hell wrote this?\n", "16": "Q. What is orange and sounds like a parrot? \n\nA. A carrot.\n", "17": "How many men does it take to screw in a light bulb? \n\nOne...men will screw anything. \n", "18": "A dog walks into Western Union and asks the clerk to send a telegram. He fills out a form on which he\nwrites down the telegram he wishes to send: \"Bow wow wow, Bow wow wow.\"\n\nThe clerk says, \"You can add another 'Bow wow' for the same price.\"\n\nThe dog responded, \"Now wouldn't that sound a little silly?\" \n", "19": "Q: If a person who speaks three languages is called \"tri-lingual,\" and\na person who speaks two languages is called \"bi-lingual,\" what do call\na person who only speaks one language?\n\nA: American! \n", "20": "What's the difference between a MacIntosh and an\nEtch-A-Sketch? \n\nYou don't have to shake the Mac to clear the screen. \n", "21": "What's the difference between a used tire and 365 used condoms?\n\nOne's a Goodyear, the other's a great year.\n", "22": "A duck walks into a pharmacy and asks for a condom. The pharmacist says\n\"Would you like me to stick that on your bill?\"\nThe duck says: \n\"What kind of duck do you think I am!\"\n", "23": "Q: What is the Australian word for a boomerang that won't\n come back? \n\nA: A stick\n", "24": "What do you get when you run over a parakeet with a lawnmower? \nShredded tweet.\n", "25": "Two kindergarten girls were talking outside: one said, \"You\nwon't believe what I saw on the patio yesterday--a condom!\"\n\nThe second girl asked, \"What's a patio?\"\n", "26": "A guy walks into a bar and sits down next to an extremely gorgeous \nwoman. The first thing he notices about her though, are her pants. \nThey were skin-tight, high-waisted and had no obvious mechanism \n(zipper, buttons or velcro) for opening them.\n\nAfter several minutes of puzzling over how she got the pants up over \nher hips, he finally worked up the nerve to ask her. \"Excuse me miss, \nbut how do you get into your pants?\"\n\n\"Well,\" she replied, \"you can start by buying me a drink.\"\n", "27": "Clinton returns from a vacation in Arkansas and walks down the\nsteps of Air Force One with two pigs under his arms. At the bottom\nof the steps, he says to the honor guardsman, \"These are genuine\nArkansas Razor-Back Hogs. I got this one for Chelsea and this one for\nHillary.\" \n\nThe guardsman replies, \"Nice trade, Sir.\"\n", "28": "A mechanical, electrical and a software engineer from Microsoft were\ndriving through the desert when the car broke down. The mechanical\nengineer said \"It seems to be a problem with the fuel injection system,\nwhy don't we pop the hood and I'll take a look at it.\" To which the\nelectrical engineer replied, \"No I think it's just a loose ground wire,\nI'll get out and take a look.\" Then, the Microsoft engineer jumps in.\n\"No, no, no. If we just close up all the windows, get out, wait a few\nminutes, get back in, and then reopen the windows everything will work\nfine.\"\n", "29": "An old Scotsmen is sitting with a younger Scottish gentleman and says the boy. \n\"Ah, lad look out that window. 
You see that stone wall there, I built it with\nme own bare hands, placed every stone meself. But do they call me MacGregor the\nwall builder? No! \n\nHe Takes a few sips of his beer then says, \"Aye, and look out on that lake and \neye that beautiful pier. I built it meself, laid every board and hammered each\nnail but do they call me MacGregor the pier builder? No! \n\nHe continues...\"And lad, you see that road? That too I build with me own bare \nhands. Laid every inch of pavement meself, but do they call MacGregor the road\nbuilder? No!\"\n\nAgain he returns to his beer for a few sips, then says, \n\"Agh, but you screw one sheep...\"\n", "30": "Q: What's the difference between a Lawyer and a Plumber? \nA: A Plumber works to unclog the system.\n", "31": " President Clinton looks up from his desk in the Oval Office to see\n one of his aides nervously approach him. \n \"What is it?\" exclaims the President. \n\"It's this Abortion Bill Mr. President, what do you want to do\n about it?\" the aide replies. \n\"Just go ahead and pay it.\" responds the President. \n", "32": "A man arrives at the gates of heaven. St. Peter asks, \"Religion?\" \nThe man says, \"Methodist.\" St. Peter looks down his list, and says, \n\"Go to room 24, but be very quiet as you pass room 8.\" \n\nAnother man arrives at the gates of heaven. \"Religion?\"\n\"Baptist.\" \"Go to room 18, but be very quiet as you pass room 8.\" \n\nA third man arrives at the gates. \"Religion?\" \"Jewish.\"\n\"Go to room 11, but be very quiet as you pass room 8.\" \nThe man says, \"I can understand there being different rooms for different religions, but why\nmust I be quiet when I pass room 8?\" St. Peter tells him, \"Well the Catholics are in room 8, \nand they think they're the only ones here.\n", "33": "What do you call an American in the finals of the world cup?\n\n\"Hey Beer Man!\"\n", "34": "Out in the backwoods of some midwestern state, little Johnny arrives\nat school an hour late.\n\nTeacher: \"Why are you so late, John?\"\nJohny : \"My big brother got shot in the ass.\"\n(the teacher corrects his speech)\nTeacher: \"Rectum.\"\nJohnny : \"Wrecked him!? Hell, It damn near killed him!\" \n", "35": "An explorer in the deepest Amazon suddenly finds himself surrounded\nby a bloodthirsty group of natives. Upon surveying the situation, he\nsays quietly to himself, \"Oh God, I'm screwed.\" \n\nThe sky darkens and a voice booms out, \"No, you are NOT\nscrewed. Pick up that stone at your feet and bash in the head of the\nchief standing in front of you.\" \n\nSo with the stone he bashes the life out of the chief. Standing above\nthe lifeless body, breathing heavily looking at 100 angry natives... \n\nThe voice booms out again, \"Okay ..... NOW you're screwed.\" \n", "36": "A guy walks into a bar, orders a beer and says to the bartender,\n\"Hey, I got this great Polish Joke...\" \n\nThe barkeep glares at him and says in a warning tone of voice:\n\"Before you go telling that joke you better know that I'm Polish, both\nbouncers are Polish and so are most of my customers\"\n\n\"Okay\" says the customer,\"I'll tell it very slowly.\" \n", "37": "A Jewish young man was seeing a psychiatrist for an eating and\nsleeping disorder. \n\n\"I am so obsessed with my mother... As soon as I go to sleep, I start\ndreaming, and everyone in my dream turns into my mother. 
I wake up in\nsuch a state, all I can do is go downstairs and eat a piece of toast.\"\n\nThe psychiatrist replies:\n\n\"What, just one piece of toast, for a big boy like you?\"\n", "38": "\"May I take your order?\" the waiter asked. \n\n\"Yes, how do you prepare your chickens?\" \n\n\"Nothing special sir,\" he replied. \"We just tell them straight out\nthat they're going to die.\"\n", "39": "What is the difference between men and women:\n\n\nA woman wants one man to satisfy her every need.\nA man wants every woman to satisfy his one need.\n", "40": "How many Irishmen does it take to change a lightbulb?\n\nTwo, one to hold the lightbulb and the other to drink until the room spins. \n", "41": "What does an atheist say during an orgasm?\n\"Oh Darwin! Oh Darwin!...\"\n", "42": "Two men are discussing the age old question: who enjoys sex more, the\nman or the woman? A woman walks by and listens in for awhile and then\ninterrupts: \n\"Listen you guys. You know when your ear itches and you put in your \nlittle finger and wiggle it around for awhile? Afterward,\nwhich feels better, your finger or your ear?\"\n", "43": "Arnold Swartzeneger and Sylvester Stallone are making a movie about\nthe lives of the great composers. \nStallone says \"I want to be Mozart.\" \nSwartzeneger says: \"In that case... I'll be Bach.\"\n", "44": "A horse walks into a bar. Bartender says:\n\"So, why the long face?\"\n", "45": "A boy comes home from school and tells his mother that he got a part\nin the school play. \"What part?\" the mother asked. \"I play a Jewish\nhusband,\" the boy replied. \n\"Go back to school and tell your teacher that you want a speaking role!\"\n", "46": "A couple has been married for 75 years. For the husband's 95th\nbirthday, his wife decides to surprise him by hiring a prostitute.\nThat day, the doorbell rings. The husband uses his walker to get to\nthe door and opens it. \nA 21-year-old in a latex outfit smiles and\nsays, \"Hi, I here to give you super sex!\" \nThe old man says, \"I'll take the soup.\"\n", "47": "There was an engineer who had an exceptional gift for fixing all \nthings mechanical. After serving his company loyally for over 30 \nyears, he happily retired. Several years later the company contacted \nhim regarding a seemingly impossible problem they were having with \none of their multi-million dollar machines. They had tried everything and \neveryone else to get the machine fixed, but to no avail. In \ndesperation, they called on the retired engineer who had solved so \nmany of their problems in the past.\nThe engineer reluctantly took the challenge. He spent a day studying \nthe huge machine. At the end of the day, he marked a small \"x\" in \nchalk on a particular component of the machine and proudly stated, \n\"This is where your problem is\".\nThe part was replaced and the machine worked perfectly again. The \ncompany received a bill for $50,000 from the engineer for his \nservice.They demanded an itemized accounting of his charges. The \nengineer responded briefly:\nOne chalk mark $1 \nKnowing where to put it $49,999\nIt was paid in full and the engineer retired again in peace. 
\n", "48": "The graduate with a Science degree asks, \"Why does it work?\"\nThe graduate with an Engineering degree asks, \"How does it work?\"\nThe graduate with an Accounting degree Asks, \"How much will it cost?\" \nThe graduate with a Liberal Arts degree asks, \"Do you want fries \nwith that?\"\n", "49": "Three engineering students were gathered together discussing the\npossible designers of the human body. \nOne said, \"It was a mechanical engineer. Just look at all the joints.\" \nAnother said, \"No, it was an electrical engineer. The nervous systems many thousands of electrical\nconnections.\" \nThe last said, \"Actually it was a civil engineer. Who else would run a toxic waste pipeline through a recreational area?\"\n", "50": "A guy goes into confession and says to the priest, \"Father, I'm 80 years\nold, widower, with 11 grandchildren. Last night I met two beautiful flight\nattendants. They took me home and I made love to both of them. Twice.\"\n\nThe priest said: \"Well, my son, when was the last time you were in\nconfession?\"\n \"Never Father, I'm Jewish.\"\n \"So then, why are you telling me?\"\n \"I'm telling everybody.\"\n", "51": "Did you hear that Clinton has announced there is a new national bird? \nThe spread eagle.\n", "52": "Q: What do Monica Lewinsky and Bob Dole have in common?\nA: They were both upset when Bill finished first.\n", "53": "One Sunday morning William burst into the living room and said,\n\"Dad! Mom! I have some great news for you! I am getting married\nto the most beautiful girl in town. She lives a block away and\nher name is Susan.\"\nAfter dinner, William's dad took him aside. \"Son, I have to talk\nwith you. Your mother and I have been married 30 years.. She's a\nwonderful wife but she has never offered much excitement in the\nbedroom, so I used to fool around with women a lot. Susan is\nactually your half-sister, and I'm afraid you can't marry her.\"\nWilliam was heart-broken. After eight months he eventually\nstarted dating girls again. A year later he came home and very\nproudly announced, \"Dianne said yes! We're getting married in\nJune.\"\nAgain his father insisted on another private conversation and\nbroke the sad news. \"Dianne is your half-sister too, William. I'm\nawfully sorry about this.\"\nWilliam was furious! He finally decided to go to his mother with\nthe news.\n\"Dad has done so much harm.. I guess I'm never going to get\nmarried,\" he complained. \"Every time I fall in love, Dad tells\nme the girl is my half-sister.\"\nHis mother just shook her head. \"Don't pay any attention to what\nhe says, dear. He's not really your father.\"\n", "54": "The Pope dies and, naturally, goes to heaven. He's met by the reception\ncommittee, and after a whirlwind tour he is told that he can enjoy any\nof the myriad of recreations available.\nHe decides that he wants to read all of the ancient original text of the\nHoly Scriptures, so he spends the next eon or so learning languages.\nAfter becoming a linguistic master, he sits down in the library and\nbegins to pour over every version of the Bible, working back from most\nrecent \"Easy Reading\" to the original script.\nAll of a sudden there is a scream in the library. The Angels come\nrunning in only to find the Pope huddled in his chair, crying to himself\nand muttering, \"An 'R'! The scribes left out the 'R'.\" \nA particularly concerned Angel takes him aside, offering comfort, asks\nhim what the problem is and what does he mean. 
\nAfter collecting his\nwits, the Pope sobs again, \"It's the letter 'R'. They left out the 'R'.\nThe word was supposed to be CELEBRATE!\"\n", "55": "A woman has twins, and gives them up for adoption. One of\nthem goes to a family in Egypt and is named \"Amal.\" The other goes to\na family in Spain; they name him \"Juan.\" Years later, Juan sends a\npicture of himself to his mom. Upon receiving the picture, she tells\nher husband that she wishes she also had a picture of Amal. \nHer husband responds, \"But they are twins-if you've seen Juan, you've\nseen Amal.\n", "56": "A man and Cindy Crawford get stranded on a desert island. After a couple\nof days they fall in love and start sleeping together. Time pass the\nman seems frustrated, Cindy asks if there is anything she can do? He\nsays there is one thing, \"Could you put on this baseball cap and go to\nthe other side of the island and answer me when I call you Bob?\" She\nagrees. Next day he is walking on the other side of the island, runs\ninto her and says \"Hi Bob!\" \nShe says \"Hello, what's up?\" \nHe replies: \"Bob you won't believe it: I've been sleeping with Cindy\nCrawford for the past two weeks!!!!\"\n", "57": "Why are there so many Jones's in the phone book?\nBecause they all have phones.\n", "58": "How many teddybears does it take to change a lightbulb?\n\nIt takes only one teddybear, but it takes a whole lot of lightbulbs.\n", "59": "The Chukcha (Russian Eskimo) phones up the Russian Parliament Building. \nA guard answers. \nChukcha: \"What is required to become Parliament member?\"\nGuard: \"What are you, an idiot?\"\nChukcha: \"Is it required?\"\n", "60": "What did the Buddhist say to the hot dog vendor?\nMake me one with everything.\n", "61": "During a recent publicity outing, Hillary sneaked off to visit a\nfortune teller of some local repute. In a dark and hazy room, peering\ninto a crystal ball, the mystic delivered grave news.\n\"There's no easy way to say this, so I'll just be blunt: Prepare\nyourself to be a widow. Your husband will die a violent and horrible\ndeath this year.\"\nVisibly shaken, Hillary stared at the woman's lined face, then at \nthe single flickering candle, then down at her hands. She took a few \ndeep breaths to compose herself. She simply had to know. She met the\nfortune teller's gaze, steadied her voice, and asked her question.\n\"Will I be acquitted?\"\n", "62": "A group of managers were given the assignment to measure the\nheight of a flagpole. So they go out to the flagpole with ladders\nand tape measures, and they're falling off the ladders, dropping\nthe tape measures - the whole thing is just a mess.\nAn engineer comes along and sees what they're trying to do,\nwalks over, pulls the flagpole out of the ground, lays it flat,\nmeasures it from end to end, gives the measurement to one of the\nmanagers and walks away.\nAfter the engineer has gone, one manager turns to another and\nlaughs. \"Isn't that just like an engineer, we're looking for the \nheight and he gives us the length.\"\n", "63": "An engineer, a physicist and a mathematician are sleeping in a\nroom. There is a fire in the room. The engineer wakes up, sees the fire,\npicks up the bucket of water and douses the fire and goes back to\nsleep. \n\nAgain there is fire in the room. This time, the physicist wakes\nup, notices the bucket, fills it with water, calculates the optimal\ntrajectory and douses the fire in minimum amount of water and goes\nback to sleep. \n\nAgain there is fire. This time the mathematician wakes up. 
\nHe looks at the fire, looks at the bucket and the water and\nexclaims, \"A solution exists\" and goes back to sleep.\n", "64": "What is the rallying cry of the International Dyslexic Pride movement?\nDyslexics Untie!\n", "65": "Two Rednecks were seated at the end of a bar when a young lady\nseated a few stools up began to choke on a piece of hamburger. She was\nturning blue and obviously in serious respiratory distress.\nOne said to the other, \"That gal there is having a bad time!\" \nThe other agreed and said \"Think we should go help?\" \"You bet,\" said the\nfirst,and with that, he ran over and said, \"Can you breathe??\" She shook\nher head no. He said, \"Can you speak??\" She again shook her head no.\nWith that, he pulled up her skirt and licked her on the butt.\nShe was so shocked, she coughed up the obstruction and began to\nbreathe-with great relief.\nThe redneck walked back to his\nfriend and said, \"Funny how that hind lick maneuver always works.\"\n", "66": "A lawyer opened the door of his BMW, when suddenly a car came along\nand hit the door, ripping it off completely. When the police arrived\nat the scene, the lawyer was complaining bitterly about the damage to\nhis precious BMW. \n\"Officer, look what they've done to my Beeeeemer!!!\", he whined. \n\"You lawyers are so materialistic, you make\nme sick!!!\" retorted the officer. \"You're so worried about your\nstupid BMW, that you didn't even notice that your left arm was ripped\noff!!!\" \n\"Oh my gaaaad...\", replied the lawyer, finally noticing the\nbloody left shoulder where his arm once was. \"Where's my\nRolex???!!!!\"\n", "67": "Once upon a time, two brooms fell in love and decided to get married.\nBefore the ceremony, the bride broom informed the groom broom that \nshe was expecting a little whiskbroom. The groom broom was aghast!\n\n\"How is this possible?\" he asked. \"We've never swept together!\n", "68": "A man piloting a hot air balloon discovers he has wandered off course and\nis hopelessly lost. He descends to a lower altitude and locates a man\ndown on the ground. He lowers the balloon further and shouts \"Excuse me,\ncan you tell me where I am?\"\n\nThe man below says: \"Yes, you're in a hot air balloon, about 30 feet\nabove this field.\"\n\n\"You must work in Information Technology,\" says the balloonist.\n\n\"Yes I do,\" replies the man. \"And how did you know that?\"\n\n\"Well,\" says the balloonist, \"what you told me is technically correct,\nbut of no use to anyone.\"\n\nThe man below says, \"You must work in management.\"\n\n\"I do,\" replies the balloonist, \"how did you know?\"\n\n\"Well,\" says the man, \"you don't know where you are, or where you're\ngoing, but you expect my immediate help. You're in the same position you\nwere before we met, but now it's my fault!\"\n\n", "69": "This guys wife asks, \"Honey if I died would you remarry?\" and he replies,\n\"Well, after a considerable period of grieving, we all need\ncompanionship, I guess I would.\"\n\nShe then asks, \"If I died and you remarried, would she live in this\nhouse?\" and he replies, \"We've spent a lot of time and money getting this\nhouse just the way we want it. 
I'm not going to get rid of my house, I\nguess she would.\"\n\n\"If I died and you remarried, and she lived in this house, would she\nsleep in our bed?\" and he says, \"That bed is brand new, we just paid two\nthousand dollars for it, it's going to last a long time, I guess she\nwould.\"\n\nSo she asks, \"If I died and you remarried, and she lived in this house,\nand slept in our bed, would she use my golf clubs?\"\n\n\"Oh no, she's left handed.\"\n", "70": "Employer to applicant: \"In this job we need someone who is responsible.\"\n\nApplicant: \"I'm the one you want. On my last job, every time anything\nwent wrong, they said I was responsible.\"\n\n", "71": "At a recent Sacramento PC Users Group meeting,\na company was demonstrating its latest speech-\nrecognition software. A representative from the\ncompany was just about ready to start the\ndemonstration and asked everyone in the room\nto quiet down.\n\nJust then someone in the back of the room yelled,\n*\"Format C: Return.\"*\n\nSomeone else chimed in:\n*\"Yes, Return\"*\n\nUnfortunately, the software worked.\n\n", "72": "On the first day of college, the Dean addressed the students,\npointing out some of the rules:\n\n\"The female dormitory will be out-of-bounds for all male students\nand the male dormitory to the female students. Anybody caught breaking\nthis rule will be finded $20 the first time.\" He continued, \"Anybody \ncaught breaking this rule the second time will be fined $60. Being caught\na third time will cost you a fine of $180. Are there any questions ?\"\n\nAt this point, a male student in the crowd inquired:\n*\"How much for a season pass ?\"*\n", "73": "Q: What is the difference between George Washington, Richard Nixon,\nand Bill Clinton?\n\nA: Washington couldn't tell a lie, Nixon couldn't tell the truth, and\nClinton doesn't know the *difference*.\n", "74": "Q: How many stalkers does it take to change a light bulb?\n\nA: *Two*. One to replace the bulb, and the other to watch it day and night.\n", "75": "Q: Do you know the difference between an intelligent male and the\nSasquatch?\n\nA: *There have been actual reported sightings of the Sasquatch*.\n", "76": "There once was a man and a woman that both got in a terrible car wreck. Both of their vehicles \nwere completely destroyed, buy fortunately, no one was hurt. In thankfulness, the woman said to the \nman, 'We are both okay, so we should celebrate. I have a bottle of wine in my car, let's open it.' \nSo the woman got the bottleout of the car, and handed it to the man. The man took a really big drink, \nand handed the woman the bottle. The woman closed the bottle and put it down. The man asked, \n'Aren't you going to take a drink?' 
\n\nThe woman cleverly replied, \n*'No, I think I'll just wait for the cops to get here.'*\n", "77": "If pro- is the opposite of con- then congress must be the *opposite*\nof progress.\n", "78": "Q: What's the difference between the government and the Mafia?\n\nA: *One of them is organized*.\n", "79": "Q: Ever wonder why the IRS calls it Form 1040?\n\nA: Because for every $50 that you earn, *you get 10 and they get 40*.\n", "80": "Hillary, Bill Clinton and the Pope are sitting together on an airplane.\n\nBill says \"I could throw one thousand dollar bill out of this plane and\nmake one person very happy.\"\n\nHillary says \"I could throw 10 hundred dollar bills out of the plane and\nmake 10 people very happy.\"\n\nThe Pope chips in and says \"*I could throw Bill out of the airplane and make the whole \ncountry happy*.\"\n", "81": "An Asian man goes into a New York CityBank to exchange 10,000 yen for\nAmerican Currency. The teller gives him $72.00. The next month the\nAsian man goes into the same bank with 10,000 yen and receives $62.00.\nHe asks, \"How come? Only $62.00?\" The teller says \"Fluctuations-\nFluctuations!\"\n\nWhereupon the Asian man looks back at the teller and says \"*Fluk you\nAmelicans too*!\"\n", "82": "Q: How do you keep a computer programmer in the \nshower all day long?\n\nA: Give them a shampoo with a label that says\n*\"rinse, lather, repeat\"*.\n", "83": "*What a woman says*:\n\n\"This place is a mess! C'mon,\nYou and I need to clean up,\nYour stuff is lying on the floor and\nyou'll have no clothes to wear,\nif we don't do laundry right now!\"\n\n*What a man hears*:\n\nblah, blah, blah, blah, *C'mon*\nblah, blah, blah, blah, *you and I*\nblah, blah, blah, blah, *on the floor*\nblah, blah, blah, blah, *no clothes*\nblah, blah, blah, blah, *RIGHT NOW!*\n", "84": "Q: What is the difference between Mechanical Engineers and Civil \nEngineers?\n \nA: Mechanical Engineers build *weapons*, Civil Engineers build *targets*.\n", "85": "Q: How many Presidents does it take to screw in a light bulb?\n\nA: *It depends upon your definition of screwing a light bulb*.\n", "86": "A neutron walks into a bar and orders a drink.\n\"How much do I owe you?\" the neutron asks.\n\nThe bartender replies, *\"for you, no charge.\"*\n", "87": "A man, recently completing a routine physical examination receives a\nphone call from his doctor. The doctor says, \"I have some good news and\nsome bad news.\" The man says, \"OK, give me the good news first.\" The\ndoctor says, \"The good news is, you have 24 hours to live.\" The man\nreplies, \"Shit! That's the good news? Then what's the bad news?\"\n\nThe doctor says, \"The bad news is, I forgot to call you *yesterday*.\"\n", "88": "A Czechoslovakian man felt his eyesight was growing steadily worse, and \nfelt it was time to go see an optometrist. \n\nThe doctor started with some simple testing, and showed him a standard eye \nchart with letters of\ndiminishing size: CRKBNWXSKZY. . . \n\n\"Can you read this?\" the doctor asked. \n\n\"Read it?\" the Czech answered. *\"Doc, I know him!\"*\n", "89": "*A radio conversation of a US naval \nship with Canadian authorities ... *\n\nAmericans: Please divert your course 15 degrees to the North to avoid a\ncollision.\n\nCanadians: Recommend you divert YOUR course 15 degrees to the South to \navoid a collision.\n\nAmericans: This is the Captain of a US Navy ship. I say again, divert \nYOUR course.\n\nCanadians: No. 
I say again, you divert YOUR course.\n\nAmericans: This is the aircraft carrier USS LINCOLN, the second largest ship in the United States' Atlantic Fleet. We are accompanied by three destroyers, three cruisers and numerous support vessels. I demand that you change your course 15 degrees north, that's ONE FIVE DEGREES NORTH, or counter-measures will be undertaken to ensure the safety of this ship.\n\nCanadians: *This is a lighthouse. Your call*.\n", "90": "Q: How many programmers does it take to change a lightbulb?\n\nA: *NONE! That's a hardware problem....*\n", "91": "A Panda bear walks into a bar. Sits down at a table and orders a beer \nand a double cheeseburger. After he is finished eating, he pulls out a gun\nand rips the place with gunfire. Patrons scatter and dive under chairs and\ntables as the bear runs out the door. After ensuring that no one is hurt, \nthe bartender races out the door, and calls after the bear \"What the hell did\nyou do that for?\" The bear calls back, \"I'm a Panda bear. Look it up in the\ndictionary.\" \n\nThe bartender returns, pulls out his dictionary.\n\npanda : \\Pan\"da\\, n. (Zo[\"o]l.)\nA small Asiatic mammal (Ailurus fulgens) having fine soft fur.\nIt is related to the bears, and inhabits the mountains of Northern India.\n*Eats shoots and leaves.*\n", "92": "Early one morning a mother went to her sleeping son and woke him up.\n\n\"Wake up, son. It's time to go to school.\" \n\"But why, Mama? I don't want to go to school.\" \n\"Give me two reasons why you don't want to go to school.\" \n\"One, all the children hate me. Two, all the teachers hate me,\" \n\"Oh! that's no reason. Come on, you have to go to school,\" \n\n\"Give me two good reasons WHY I should go to school?\" \n \n\"One, you are *fifty-two* years old. Two, you are the *principal* of the\n school.\"\n", "93": "Reaching the end of a job interview, the human resources person asked a\nyoung engineer fresh out of Stanford,\n\n\"And what starting salary were you looking for?\"\n\nThe engineer said, \"In the neighborhood of $125,000 a year, depending\non the benefits package.\"\n\nThe interviewer said, \"Well, what would you say to a package of 5-weeks \nvacation, 14 paid holidays, full medical and dental, company matching \nretirement fund to 50% of salary, and a company car leased every 2 years - \nsay, a red Corvette?\"\n\nThe Engineer sat up straight and said, \"Wow! Are you *kidding*?\"\n\nAnd the interviewer replied, *\"Yeah, but you started it.\"*\n", "94": "Two atoms are walking down the street when one \natom says to the other \n\"Oh, my! I've lost an electron!\"\n\nThe second atom says\"Are you sure\"\n\nThe first replies *\"I'm positive!\"*\n", "95": "Just a thought ..\n\nBefore criticizing someone, walk a mile in their shoes. \n\nThen when you do criticize them, \n*you will be a mile away and have their shoes !*\n", "96": "Two attorneys went into a diner and ordered two drinks. Then they produced \nsandwiches from their briefcases and started to eat. The owner became\nquite concerned and marched over and told them, *\"You can't eat your own\nsandwiches in here!\"*\n\nThe attorneys looked at each other, shrugged their shoulders and then\n*exchanged* sandwiches.\n", "97": "A teacher is explaining to her class how different languages use \nnegatives differently. She says, \"In all languages, a positive followed\nby a negative or a negative followed by a positive makes a negative. In\nsome languages, two negatives together make a positive, while in others they\nmake a negative. 
But in no language do two positives make a negative.\" \n\nOne of the students puts up his hand and says, *\"Yeah, right.\"*\n", "98": "Age and Womanhood\n\n1. Between the ages of 13 and 18 ...\n *She is like Africa, virgin and unexplored.* \n\n2. Between the ages of 19 and 35 ...\n *She is like Asia, hot and exotic.* \n\n3. Between the ages of 36 and 45 ...\n *She is like America, fully explored, breathtakingly beautiful,and free with her resources.*\n\n4. Between the ages of 46 and 56 ...\n *She is like Europe, exhausted but still has points of interest.* \n\n5. After 56 she is like Australia ...\n *Everybody knows it's down there, but who gives a damn?*\n", "99": "A bus station is where a bus stops.\nA train station is where a train stops.\n\nOn *my* desk I have a *work station...*\n", "100": "Q: Whats the difference between greeting a Queen and greeting the\nPresident of the United States?\n\nA: You only have to get on *one knee* to greet the queen.\n"} -------------------------------------------------------------------------------- /pymc3/data/txtdata.csv: -------------------------------------------------------------------------------- 1 | 1.300000000000000000e+01 2 | 2.400000000000000000e+01 3 | 8.000000000000000000e+00 4 | 2.400000000000000000e+01 5 | 7.000000000000000000e+00 6 | 3.500000000000000000e+01 7 | 1.400000000000000000e+01 8 | 1.100000000000000000e+01 9 | 1.500000000000000000e+01 10 | 1.100000000000000000e+01 11 | 2.200000000000000000e+01 12 | 2.200000000000000000e+01 13 | 1.100000000000000000e+01 14 | 5.700000000000000000e+01 15 | 1.100000000000000000e+01 16 | 1.900000000000000000e+01 17 | 2.900000000000000000e+01 18 | 6.000000000000000000e+00 19 | 1.900000000000000000e+01 20 | 1.200000000000000000e+01 21 | 2.200000000000000000e+01 22 | 1.200000000000000000e+01 23 | 1.800000000000000000e+01 24 | 7.200000000000000000e+01 25 | 3.200000000000000000e+01 26 | 9.000000000000000000e+00 27 | 7.000000000000000000e+00 28 | 1.300000000000000000e+01 29 | 1.900000000000000000e+01 30 | 2.300000000000000000e+01 31 | 2.700000000000000000e+01 32 | 2.000000000000000000e+01 33 | 6.000000000000000000e+00 34 | 1.700000000000000000e+01 35 | 1.300000000000000000e+01 36 | 1.000000000000000000e+01 37 | 1.400000000000000000e+01 38 | 6.000000000000000000e+00 39 | 1.600000000000000000e+01 40 | 1.500000000000000000e+01 41 | 7.000000000000000000e+00 42 | 2.000000000000000000e+00 43 | 1.500000000000000000e+01 44 | 1.500000000000000000e+01 45 | 1.900000000000000000e+01 46 | 7.000000000000000000e+01 47 | 4.900000000000000000e+01 48 | 7.000000000000000000e+00 49 | 5.300000000000000000e+01 50 | 2.200000000000000000e+01 51 | 2.100000000000000000e+01 52 | 3.100000000000000000e+01 53 | 1.900000000000000000e+01 54 | 1.100000000000000000e+01 55 | 1.800000000000000000e+01 56 | 2.000000000000000000e+01 57 | 1.200000000000000000e+01 58 | 3.500000000000000000e+01 59 | 1.700000000000000000e+01 60 | 2.300000000000000000e+01 61 | 1.700000000000000000e+01 62 | 4.000000000000000000e+00 63 | 2.000000000000000000e+00 64 | 3.100000000000000000e+01 65 | 3.000000000000000000e+01 66 | 1.300000000000000000e+01 67 | 2.700000000000000000e+01 68 | 0.000000000000000000e+00 69 | 3.900000000000000000e+01 70 | 3.700000000000000000e+01 71 | 5.000000000000000000e+00 72 | 1.400000000000000000e+01 73 | 1.300000000000000000e+01 74 | 2.200000000000000000e+01 75 | -------------------------------------------------------------------------------- /pymc3/nn-0.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pymc3/nn-0.png -------------------------------------------------------------------------------- /pymc3/nn-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pymc3/nn-1.png -------------------------------------------------------------------------------- /pymc3/nn-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/pymc3/nn-2.png -------------------------------------------------------------------------------- /pymc3/probabilistic_matrix_factorization.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Probabilistic Matrix Factorization for Making Personalized Recommendations" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "The model discussed in this analysis was developed by Ruslan Salakhutdinov and Andriy Mnih. All of the code and supporting text, when not referenced, is the original work of [Mack Sweeney](https://www.linkedin.com/in/macksweeney).\n", 15 | "\n", 16 | "## Motivation\n", 17 | "\n", 18 | "Say I download a handbook of a hundred jokes, and I'd like to know very quickly which ones will be my favorite. So maybe I read a few, I laugh, I read a few more, I stop laughing, and I indicate on a scale of -10 to 10 how funny I thought each joke was. Maybe I do this for 5 jokes out of the 100. Now I go to the back of the book, and there's a little program included for calculating my preferences for all the other jokes. I enter in my preference numbers and shazam! The program spits out a list of all 100 jokes, sorted in the order I'll like them. That certainly would be nice. Today we'll write a program that does exactly this.\n", 19 | "\n", 20 | "We'll start out by getting some intuition for how our model will work. Then we'll formalize our intuition. Afterwards, we'll examine the dataset we are going to use. Once we have some notion of what our data looks like, we'll define some baseline methods for predicting preferences for jokes. Following that, we'll look at Probabilistic Matrix Factorization (PMF), which is a more sophisticated Bayesian method for predicting preferences. Having detailed the PMF model, we'll use PyMC3 for MAP estimation and MCMC inference. Finally, we'll compare the results obtained with PMF to those obtained from our baseline methods and discuss the outcome.\n", 21 | "\n", 22 | "## Intuition\n", 23 | "\n", 24 | "Normally if we want recommendations for something, we try to find people who are similar to us and ask their opinions. If Bob, Alice, and Monty are all similar to me, and they all like knock-knock jokes, I'll probably like knock-knock jokes. Now this isn't always true. It depends on what we consider to be \"similar\". In order to get the best bang for our buck, we really want to look for people who have the most similar sense of humor. Humor being a complex beast, we'd probably like to break it down into something more understandable. We might try to characterize each joke in terms of various factors. 
Perhaps jokes can be dry, sarcastic, crude, sexual, political, etc. Now imagine we go through our handbook of jokes and assign each joke a rating in each of the categories. How dry is it? How sarcastic is it? How much does it use sexual innuendos? Perhaps we use numbers between 0 and 1 for each category. Intuitively, we might call this the joke's humor profile.\n", 25 | "\n", 26 | "Now let's suppose we go back to those 5 jokes we rated. At this point, we can get a richer picture of our own preferences by looking at the humor profiles of each of the jokes we liked and didn't like. Perhaps we take the averages across the 5 humor profiles and call this our ideal type of joke. In other words, we have computed some notion of our inherent _preferences_ for various types of jokes. Suppose Bob, Alice, and Monty all do the same. Now we can compare our preferences and determine how similar each of us really are. I might find that Bob is the most similar and the other two are still more similar than other people, but not as much as Bob. So I want recommendations from all three people, but when I make my final decision, I'm going to put more weight on Bob's recommendation than those I get from Alice and Monty.\n", 27 | "\n", 28 | "While the above procedure sounds fairly effective as is, it also reveals an unexpected additional source of information. If we rated a particular joke highly, and we know its humor profile, we can compare with the profiles of other jokes. If we find one with very close numbers, it is probable we'll also enjoy this joke. Both this approach and the one above are commonly known as _neighborhood approaches_. Techniques that leverage both of these approaches simultaneously are often called _collaborative filtering_ [[1]](http://www2.research.att.com/~volinsky/papers/ieeecomputer.pdf). The first approach we talked about uses user-user similarity, while the second uses item-item similarity. Ideally, we'd like to use both sources of information. The idea is we have a lot of items available to us, and we'd like to work together with others to filter the list of items down to those we'll each like best. My list should have the items I'll like best at the top and those I'll like least at the bottom. Everyone else wants the same. If I get together with a bunch of other people, we all read 5 jokes, and we have some efficient computational process to determine similarity, we can very quickly order the jokes to our liking.\n", 29 | "\n", 30 | "## Formalization\n", 31 | "\n", 32 | "Let's take some time to make the intuitive notions we've been discussing more concrete. We have a set of $M$ jokes, or _items_ ($M = 100$ in our example above). We also have $N$ people, whom we'll call _users_ of our recommender system. For each item, we'd like to find a $D$ dimensional factor composition (humor profile above) to describe the item. Ideally, we'd like to do this without actually going through and manually labeling all of the jokes. Manual labeling would be both slow and error-prone, as different people will likely label jokes differently. So we model each joke as a $D$ dimensional vector, which is its latent factor composition. Furthermore, we expect each user to have some preferences, but without our manual labeling and averaging procedure, we have to rely on the latent factor compositions to learn $D$ dimensional latent preference vectors for each user. The only thing we get to observe is the $N \\times M$ ratings matrix $R$ provided by the users. 
Entry $R_{ij}$ is the rating user $i$ gave to item $j$. Many of these entries may be missing, since most users will not have rated all 100 jokes. Our goal is to fill in the missing values with predicted ratings based on the latent variables $U$ and $V$. We denote the predicted ratings by $R_{ij}^*$. We also define an indicator matrix $I$, with entry $I_{ij} = 0$ if $R_{ij}$ is missing and $I_{ij} = 1$ otherwise.\n", 33 | "\n", 34 | "So we have an $N \\times D$ matrix of user preferences which we'll call $U$ and an $M \\times D$ factor composition matrix we'll call $V$. We also have a $N \\times M$ rating matrix we'll call $R$. We can think of each row $U_i$ as indications of how much each user prefers each of the $D$ latent factors. Each row $V_j$ can be thought of as how much each item can be described by each of the latent factors. In order to make a recommendation, we need a suitable prediction function which maps a user preference vector $U_i$ and an item latent factor vector $V_j$ to a predicted ranking. The choice of this prediction function is an important modeling decision, and a variety of prediction functions have been used. Perhaps the most common is the dot product of the two vectors, $U_i \\cdot V_j$ [[1]](http://www2.research.att.com/~volinsky/papers/ieeecomputer.pdf).\n", 35 | "\n", 36 | "To better understand CF techniques, let us explore a particular example. Imagine we are seeking to recommend jokes using a model which infers five latent factors, $V_j$, for $j = 1,2,3,4,5$. In reality, the latent factors are often unexplainable in a straightforward manner, and most models make no attempt to understand what information is being captured by each factor. However, for the purposes of explanation, let us assume the five latent factors might end up capturing the humor profile we were discussing above. So our five latent factors are: dry, sarcastic, crude, sexual, and political. Then for a particular user $i$, imagine we infer a preference vector $U_i = <0.2, 0.1, 0.3, 0.1, 0.3>$. Also, for a particular item $j$, we infer these values for the latent factors: $V_j = <0.5, 0.5, 0.25, 0.8, 0.9>$. Using the dot product as the prediction function, we would calculate 0.575 as the ranking for that item, which is more or less a neutral preference given our -10 to 10 rating scale.\n", 37 | "\n", 38 | "$$ 0.2 \\times 0.5 + 0.1 \\times 0.5 + 0.3 \\times 0.25 + 0.1 \\times 0.8 + 0.3 \\times 0.9 = 0.575 $$" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "## Data\n", 46 | "\n", 47 | "The [v1 Jester dataset](http://eigentaste.berkeley.edu/dataset/) provides something very much like the handbook of jokes we have been discussing. The original version of this dataset was constructed in conjunction with the development of the [Eigentaste recommender system](http://eigentaste.berkeley.edu/about.html) [[2]](http://goldberg.berkeley.edu/pubs/eigentaste.pdf). At this point in time, v1 contains over 4.1 million continuous ratings in the range [-10, 10] of 100 jokes from 73,421 users. These ratings were collected between Apr. 1999 and May 2003. In order to reduce the training time of the model for illustrative purposes, 1,000 users who have rated all 100 jokes will be selected randomly. We will implement a model that is suitable for collaborative filtering on this data and evaluate it in terms of root mean squared error (RMSE) to validate the results.\n", 48 | "\n", 49 | "Let's begin by exploring our data. 
We want to get a general feel for what it looks like and a sense for what sort of patterns it might contain." 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 5, 55 | "metadata": { 56 | "collapsed": false 57 | }, 58 | "outputs": [ 59 | { 60 | "data": { 61 | "text/html": [ 62 | "
[HTML table rendering of data.head() omitted; the equivalent text/plain output follows]\n",
" 215 | ], 216 | "text/plain": [ 217 | " 1 2 3 4 5 6 7 8 9 10 ... 91 \\\n", 218 | "0 4.08 -0.29 6.36 4.37 -2.38 -9.66 -0.73 -5.34 8.88 9.22 ... 2.82 \n", 219 | "1 -6.17 -3.54 0.44 -8.50 -7.09 -4.32 -8.69 -0.87 -6.65 -1.80 ... -3.54 \n", 220 | "2 6.84 3.16 9.17 -6.21 -8.16 -1.70 9.27 1.41 -5.19 -4.42 ... 7.23 \n", 221 | "3 -3.79 -3.54 -9.42 -6.89 -8.74 -0.29 -5.29 -8.93 -7.86 -1.60 ... 4.37 \n", 222 | "4 1.31 1.80 2.57 -2.38 0.73 0.73 -0.97 5.00 -7.23 -1.36 ... 1.46 \n", 223 | "\n", 224 | " 92 93 94 95 96 97 98 99 100 \n", 225 | "0 -4.95 -0.29 7.86 -0.19 -2.14 3.06 0.34 -4.32 1.07 \n", 226 | "1 -6.89 -0.68 -2.96 -2.18 -3.35 0.05 -9.08 -5.05 -3.45 \n", 227 | "2 -1.12 -0.10 -5.68 -3.16 -3.35 2.14 -0.05 1.31 0.00 \n", 228 | "3 -0.29 4.17 -0.29 -0.29 -0.29 -0.29 -0.29 -3.40 -4.95 \n", 229 | "4 1.70 0.29 -3.30 3.45 5.44 4.08 2.48 4.51 4.66 \n", 230 | "\n", 231 | "[5 rows x 100 columns]" 232 | ] 233 | }, 234 | "execution_count": 5, 235 | "metadata": {}, 236 | "output_type": "execute_result" 237 | } 238 | ], 239 | "source": [ 240 | "% matplotlib inline\n", 241 | "import matplotlib.pyplot as plt\n", 242 | "import pandas as pd\n", 243 | "import numpy as np\n", 244 | "import os\n", 245 | "import shutil\n", 246 | "%precision 4\n", 247 | "plt.style.use('bmh')\n", 248 | "DATA_DIR = './data'\n", 249 | "\n", 250 | "data = pd.read_csv(os.path.join(DATA_DIR, 'jester-dataset-v1-dense-first-1000.csv'))\n", 251 | "data.head()" 252 | ] 253 | }, 254 | { 255 | "cell_type": "code", 256 | "execution_count": 6, 257 | "metadata": { 258 | "collapsed": false 259 | }, 260 | "outputs": [ 261 | { 262 | "data": { 263 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA7UAAAGcCAYAAAAGZYfiAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xd4XdWV9/HvuurF6nKVewObYkpMCQFigylJMCQkIQXj\nhJSZQCaZFJLMMCmkkiEhmSSQzAADKUM6wS8hAYIxJaFjU4wx7pbkIsvqvdz1/nGPZFnItiRL9+hK\nv8/z3Efn7LPPuUs8wNW6e+29zd0RERERERERSUSRsAMQERERERERGSwltSIiIiIiIpKwlNSKiIiI\niIhIwlJSKyIiIiIiIglLSa2IiIiIiIgkLCW1IiIiIiIikrDiltSa2YVmttHMNpvZF/u4nmZmvwmu\nP21mM4L2FDO7y8xeNrMNZvaleMUsIiIiIiIiI1tcklozSwJ+AlwELADeZ2YLenW7Gqh29znAzcCN\nQfu7gTR3Px44Bfh4V8IrIiIiIiIiY1u8RmoXA5vdfau7twG/Bpb36rMcuCs4/j2w1MwMcCDLzJKB\nDKANqItP2CIiIiIiIjKSxSupnQKU9jgvC9r67OPuHUAtUEgswW0EdgM7gZvcvWq4AxYREREREZGR\nLznsAPphMdAJTAbygcfN7G/uvrVnpwceeMB3797dfV5YWEhRUVFcAxURkdGjqampcunSpcVhx5HI\n1qxZ42lpaWGHISIio8ShPpvjldSWA1N7nJcEbX31KQtKjXOB/cD7gb+6eztQYWZ/B04FDkpqc3Jy\nWLx48TCFLyIiY80LL7ywI+wYEl1aWhrHHHNM2GGIiMgocajP5niVHz8LzDWzmWaWClwBrOrVZxVw\nVXB8ObDa3Z1YyfESADPLAk4HXotL1CIiIiIiIjKixSWpDebIXgs8AGwAfuvu683sBjO7JOh2O1Bo\nZpuBzwBd2/78BMg2s/XEkuP/dfeX4hG3iIiIiIiIjGxxm1Pr7vcD9/dq+3KP4xZi2/f0vq+hr3YR\nERERERGReJUfi4iIiIiIiAw5JbUiIiIiIiKSsJTUioiIiIiISMJSUisiIiIiIiIJS0mtiIiIiIiI\nJCwltSIiIiIiIpKwlNSKiIiIiIhIwlJSKyIiIiIiIglLSa2IiIiIiIgkLCW1IiIiIiIikrCU1IqI\niIiIiEjCUlIrIiIiIiIiCUtJrYiIiIiIiCSs5LADEBERERGRxLXstrVD8pwHP3LSkDxHxh6N1IqI\niIiIiEjCUlIrIiIiIiIiCUtJrYiIiIiIiCQsJbUiIiJjhJnNN7N1PV51ZvZpMysws4fMbFPwMz/o\nb2b2X2a22cxeMrOTezzrqqD/JjO7KrzfSkRExjoltSIiImOEu29090Xuvgg4BWgC7gG+CDzs7nOB\nh4NzgIuAucHrY8CtAGZWAHwFOA1YDHylKxEWERGJNyW1IiIiY9NSYIu77wCWA3cF7XcBlwbHy4Gf\ne8xTQJ6ZTQIuAB5y9yp3rwYeAi6Mb/giIiIxSmpFRETGpiuAu4PjCe6+OzjeA0wIjqcApT3uKQva\nDtUuIiISd9qnVkREZIwxs1TgEuBLva+5u5uZD8X7VFZWsmTJku7zFStWsHLlyqF4tIiISDcltSIi\nImPPRcAL7r43ON9rZpPcfXdQXlwRtJcDU3vcVxK0lQPn9mpf0/tNioqKWL169RCHLiIicjCVH4uI\niIw97+NA6THAKqBrBeOrgHt7tK8IVkE+HagNypQfAJaZWX6wQNSyoE1ERCTuNFIrIiIyhphZFnA+\n8PEezd8BfmtmVwM7gPcE7fcDFwObia2U/CEAd68ys68Dzwb9bnD3qjiELyIi8gZKakVERMYQd28E\nCnu17Se2GnLvvg5cc4jn3AHcMRwxioiIDITKj0VERERERCRhKakVERER
[... remainder of base64-encoded PNG omitted: kernel-density plot and histogram of all 100,000 ratings ...]",
264 | "text/plain": [
265 | ""
266 | ]
267 | },
268 | "metadata": {},
269 | "output_type": "display_data"
270 | }
271 | ],
272 | "source": [
273 | "# Extract the ratings from the DataFrame\n",
274 | "all_ratings = np.ndarray.flatten(data.values)\n",
275 | "ratings = pd.Series(all_ratings)\n",
276 | "\n",
277 | "# Plot histogram and density.\n",
278 | "fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 7))\n",
279 | "ratings.plot(kind='density', ax=ax1, grid=False)\n",
280 | "ax1.set_ylim(0, 0.08)\n",
281 | "ax1.set_xlim(-11, 11)\n",
282 | "\n",
283 | "# Plot histogram\n", 284 | "ratings.plot(kind='hist', ax=ax2, bins=20, grid=False)\n", 285 | "ax2.set_xlim(-11, 11)\n", 286 | "plt.show()" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": 7, 292 | "metadata": { 293 | "collapsed": false 294 | }, 295 | "outputs": [ 296 | { 297 | "data": { 298 | "text/plain": [ 299 | "count 100000.000000\n", 300 | "mean 0.996219\n", 301 | "std 5.265215\n", 302 | "min -9.950000\n", 303 | "25% -2.860000\n", 304 | "50% 1.650000\n", 305 | "75% 5.290000\n", 306 | "max 9.420000\n", 307 | "dtype: float64" 308 | ] 309 | }, 310 | "execution_count": 7, 311 | "metadata": {}, 312 | "output_type": "execute_result" 313 | } 314 | ], 315 | "source": [ 316 | "ratings.describe()" 317 | ] 318 | }, 319 | { 320 | "cell_type": "markdown", 321 | "metadata": {}, 322 | "source": [ 323 | "\n", 324 | "This must be a decent batch of jokes. From our exploration above, we know most ratings are in the range -1 to 10, and positive ratings are more likely than negative ratings. Let's look at the means for each joke to see if we have any particularly good (or bad) humor here.\n" 325 | ] 326 | }, 327 | { 328 | "cell_type": "code", 329 | "execution_count": 8, 330 | "metadata": { 331 | "collapsed": false 332 | }, 333 | "outputs": [ 334 | { 335 | "data": { 336 | "text/plain": [ 337 | "" 338 | ] 339 | }, 340 | "execution_count": 8, 341 | "metadata": {}, 342 | "output_type": "execute_result" 343 | }, 344 | { 345 | "data": { 346 | "image/png": "iVBORw0KGgoAAAANSUhEUgAAA50AAAF7CAYAAABRru/8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xm4JFddP/73JwuBEBMhExOUCLKJbCYIBCX+CBPAsAiI\n8IAg4yCRTQgoyiJIXADZBBEQxUAwgCD4ZQ1bgCEiSMKSAElIwhLZFzNElhgFAuf3R9XNdJq79Myt\nuuvr9Tz93O6q06c+XVVdfT91Tp2q1loAAABgDHutdgAAAABsXJJOAAAARiPpBAAAYDSSTgAAAEYj\n6QQAAGA0kk4AAABGI+kEYN2pqldU1XtWO44kqarDq+q9VfU/VbVq9yGrqtOr6qSJ12tmHQ2pqj5f\nVU9Z7TgAmJ2kE2Ad6BOIVlVvmGfePft5l69GbPPpE4PWP/6vqj5bVU+rqqvsZj2/vUAi95gk9x0m\n2mX7kyQ/leSIJNcaayFV9TNV9b2q+mpV7TNQnQ/tE+Zv9tvq6AXKba+qC/vlX1BVD5ynzFFV9R/9\n9v5aVf1VVe29xPI3ZGIMwJVJOgHWjy8muXtVHTo1/WFJvrAK8SzlWemSsBsleVKSE5L82RAVt9a+\n3Vr77yHqGsANk3y4tfaZ1trX97SSqtp3iSIPSXJqkm8l+fU9Xc6U/ZPsSPL4ReK6V5KXJfn7JL+Y\n5KQkp1TVXSbKHJ7k3UkuTPJLSR6Rbr98+kBxArCOSToB1o/PJDkjyfa5CVX1s0nulOTk6cJV9UtV\ndVpVXVpVF1fVG6rqOhPzf66f9tWquqyqzqmqB03VcXpVnVRVf1pVX6+qS6rqlKo6YIZ4L22tfb21\n9sXW2uvTJSW/NlX/06vq/H75X6qqv6+qg/p5xyR5Zf98rtX0Ff3rK7WQzb3uW+6+UFXfqaq3TCfo\nVfXYqvpyv7x3VdWD+nqv3c8/sKpO7j/r9/qYnrfQB+xbYY9N8rtT8V2rql5bVd+qqv/t1+OtJt53\nTF/+blX1gar6vyTHL7KcvdIlna9I8k9JHrr4qp9Na+1vWmtPT/LeRYo9Psm/tNae31q7oLX23CRv\nSPKEiTKPSPKdJA9prZ3XWntTkj9N8uiquvqs8VTnj6rqoqr6flV9rqoeu8R77tiv58dOTLtTVX2w\nX/df6bfpwRPzb9pv/29V1y36/Ol9H4DhSDoB1peXJjm+qqp/fXy6hOFKLZ1VdZMk/5bkQ0lulWRr\nkh8meXdVXbUvdkC6Vq67JLl5X/fJVXWHqWXeJ8k1kxyT5P5J7p4rJxxLqqojkxyd5PtTs/43XQJ1\nk3TJ9DFJ/raf9x9JHtU/v1b/eMwii7l1kjskuVu65PbmSZ47EcO9+9fPSddi95p0rbGTnpbklknu\nma4F835Jzl9kmddKt47/eS6+ftu8KcmN062r2yT5Rrp1v2Xq/X/dx/ALSd66yHLukmS/JO9Il4gf\nW1XXXaT8IKrrDn3rJO+cmvXOJLed6D57uySntdZ+NFVm/yRH7sYiH5nkL5M8M8lN022rZ1bVQxaI\n74FJ3pjk4a21v+mnbU3y5iSvTXKLJPdKct0kb5j43rwmyTeT/Eq6/eQPk6yVlnOADWeQa0IAWDH/\nmuQFSY6pqvcn+d103VYPnCr3+CSnttZOnJtQVb+d7h/r45K8qbV2TpJzJt7zwqq6Y5IHJHnfxPQv\ntNb+oH9+QVX9S5I7pmvJWsyfVtUTk+yb5Crpkt6HTRZorT1t4uXnq+pJSV5bVQ9urX2/qr7dl5ul\n2+r3kmxvrX2v/7x/n2SylexxSV7TWntB//ozVXXjXDmBvk6Ss1trZ/avv5gu+Z1Xa+3rVfX9JP87\nF2NVHZsu0bxpa+1T/bRtST6fLqn6i4kqnt5aWyzZnPPQJK9urV2e5KtVtSPdCYexB9TZku5/hen1\n//V0SfA1k1ycLuH+4Dxlkt27zvWJSV7YWntp//ozVfXzSZ6crovvFarqj9Ltg/dqrU221D41yd+2\n1l44UfZ30p2Y+cUkH0+3nZ83t32SXLQbM
[... remainder of base64-encoded PNG omitted: plot of the per-joke mean ratings; the dump is truncated here ...]
LCYWW6ruCBJJwAAAIs5\nNckBrb9n+6SqOn2pN7umEwAAgNHstdoBAAAAsHFJOgEAABiNpBMAAIDRSDoBAAAYzf8PWXXY+UmY\nUIIAAAAASUVORK5CYII=\n", 347 | "text/plain": [ 348 | "" 349 | ] 350 | }, 351 | "metadata": {}, 352 | "output_type": "display_data" 353 | } 354 | ], 355 | "source": [ 356 | "joke_means = data.mean(axis=0)\n", 357 | "joke_means.plot(kind='bar', grid=False, figsize=(16, 6),\n", 358 | " title=\"Mean Ratings for All 100 Jokes\")" 359 | ] 360 | }, 361 | { 362 | "cell_type": "markdown", 363 | "metadata": {}, 364 | "source": [ 365 | "While the majority of the jokes generally get positive feedback from users, there are definitely a few that stand out as poor humor. Let's take a look at the worst and best joke, just for fun." 366 | ] 367 | }, 368 | { 369 | "cell_type": "code", 370 | "execution_count": 9, 371 | "metadata": { 372 | "collapsed": false 373 | }, 374 | "outputs": [ 375 | { 376 | "name": "stdout", 377 | "output_type": "stream", 378 | "text": [ 379 | "The worst joke:\n", 380 | "---------------\n", 381 | "How many teddybears does it take to change a lightbulb?\n", 382 | "\n", 383 | "It takes only one teddybear, but it takes a whole lot of lightbulbs.\n", 384 | "\n", 385 | "\n", 386 | "The best joke:\n", 387 | "--------------\n", 388 | "*A radio conversation of a US naval \n", 389 | "ship with Canadian authorities ... *\n", 390 | "\n", 391 | "Americans: Please divert your course 15 degrees to the North to avoid a\n", 392 | "collision.\n", 393 | "\n", 394 | "Canadians: Recommend you divert YOUR course 15 degrees to the South to \n", 395 | "avoid a collision.\n", 396 | "\n", 397 | "Americans: This is the Captain of a US Navy ship. I say again, divert \n", 398 | "YOUR course.\n", 399 | "\n", 400 | "Canadians: No. I say again, you divert YOUR course.\n", 401 | "\n", 402 | "Americans: This is the aircraft carrier USS LINCOLN, the second largest ship in the United States' Atlantic Fleet. We are accompanied by three destroyers, three cruisers and numerous support vessels. I demand that you change your course 15 degrees north, that's ONE FIVE DEGREES NORTH, or counter-measures will be undertaken to ensure the safety of this ship.\n", 403 | "\n", 404 | "Canadians: *This is a lighthouse. Your call*.\n", 405 | "\n" 406 | ] 407 | } 408 | ], 409 | "source": [ 410 | "import json\n", 411 | "# Worst and best joke?\n", 412 | "worst_joke_id = joke_means.argmin()\n", 413 | "best_joke_id = joke_means.argmax()\n", 414 | "\n", 415 | "# Let's see for ourselves. Load the jokes.\n", 416 | "with open(os.path.join(DATA_DIR, 'jokes.json')) as buff:\n", 417 | " joke_dict = json.load(buff)\n", 418 | "\n", 419 | "print('The worst joke:\\n---------------\\n%s\\n' % joke_dict[worst_joke_id])\n", 420 | "print('The best joke:\\n--------------\\n%s' % joke_dict[best_joke_id])" 421 | ] 422 | }, 423 | { 424 | "cell_type": "markdown", 425 | "metadata": {}, 426 | "source": [ 427 | "Make sense to me. We now know there are definite popularity differences between the jokes. Some of them are simply funnier than others, and some are downright lousy. Looking at the joke means allowed us to discover these general trends. Perhaps there are similar trends across users. It might be the case that some users are simply more easily humored than others. Let's take a look." 
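 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before plotting all 1,000 of them, a quick numeric summary gives a feel for how spread out the per-user means are (a small sketch; it assumes `data` is the user-by-joke ratings `DataFrame` loaded above):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Five-number summary of the per-user mean ratings.\n", "print(data.mean(axis=1).describe())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now the full per-user picture:"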
428 | ] 429 | }, 430 | { 431 | "cell_type": "code", 432 | "execution_count": 10, 433 | "metadata": { 434 | "collapsed": false 435 | }, 436 | "outputs": [ 437 | { 438 | "data": { 439 | "text/plain": [ 440 | "[]" 441 | ] 442 | }, 443 | "execution_count": 10, 444 | "metadata": {}, 445 | "output_type": "execute_result" 446 | }, 447 | { 448 | "data": { 449 | "image/png": "[... base64 PNG data omitted: bar chart 'Mean Ratings for All 1000 Users' ...]\n", 450 | "text/plain": [ 451 | "" 452 | ] 453 | }, 454 | "metadata": {}, 455 | "output_type": "display_data" 456 | } 457 | ], 458 | "source": [ 459 | "user_means = data.mean(axis=1)\n", 460 | "_, ax = plt.subplots(figsize=(16, 6))\n", 461 | "user_means.plot(kind='bar', grid=False, ax=ax,\n", 462 | "               title=\"Mean Ratings for All 1000 Users\")\n", 463 | "ax.set_xticklabels('')  # 1000 labels is nonsensical" 464 | ] 465 | }, 466 | { 467 | "cell_type": "markdown", 468 | "metadata": {}, 469 | "source": [ 470 | "We see even more significant trends here. Some users rate nearly everything highly, and some (though not as many) rate nearly everything negatively. These observations will come in handy when considering models to use for predicting user preferences on unseen jokes." 471 | ] 472 | }, 473 | { 474 | "cell_type": "markdown", 475 | "metadata": {}, 476 | "source": [ 477 | "## Methods\n", 478 | "\n", 479 | "Having explored the data, we're now ready to dig in and start addressing the problem. We want to predict how much each user is going to like all of the jokes he or she has not yet read.\n", 480 | "\n", 481 | "\n", 482 | "### Baselines\n", 483 | "\n", 484 | "Every good analysis needs some kind of baseline to compare against. It's difficult to claim we've produced good results if we have no reference point for what defines \"good\". We'll define three very simple baseline methods and find the RMSE for each. Our goal will be to obtain lower RMSE scores with whatever model we produce.\n", 485 | "\n", 486 | "#### Uniform Random Baseline\n", 487 | "\n", 488 | "Our first baseline is about as dead stupid as you can get. Every place we see a missing value in $R$, we'll simply fill it with a number drawn uniformly at random in the range [-10, 10]. We expect this method to do the worst by far.\n", 489 | "\n", 490 | "$$R_{ij}^* \\sim \\mathrm{Uniform}(-10, 10)$$\n", 491 | "\n", 492 | "#### Global Mean Baseline\n", 493 | "\n", 494 | "This method is only slightly better than the last. 
Wherever we have a missing value, we'll fill it in with the mean of all observed ratings.\n", 495 | "\n", 496 | "$$\\text{global_mean} = \\frac{1}{N \\times M} \\sum_{i=1}^N \\sum_{j=1}^M I_{ij}(R_{ij})$$\n", 497 | "\n", 498 | "$$R_{ij}^* = \\text{global_mean}$$\n", 499 | "\n", 500 | "#### Mean of Means Baseline\n", 501 | "\n", 502 | "Now we're going to start getting a bit smarter. We imagine some users might be easily amused, and inclined to rate all jokes more highly. Other users might be the opposite. Additionally, some jokes might simply be more witty than others, so all users might rate some jokes more highly than others in general. We can clearly see this in our graph of the joke means above. We'll attempt to capture these general trends through per-user and per-joke rating means. We'll also incorporate the global mean to smooth things out a bit. So if we see a missing value in cell $R_{ij}$, we'll average the global mean with the mean of $U_i$ and the mean of $V_j$ and use that value to fill it in.\n", 503 | "\n", 504 | "$$\\text{user_means} = \\frac{1}{M} \\sum_{j=1}^M I_{ij}(R_{ij})$$\n", 505 | "\n", 506 | "$$\\text{joke_means} = \\frac{1}{N} \\sum_{i=1}^N I_{ij}(R_{ij})$$\n", 507 | "\n", 508 | "$$R_{ij}^* = \\frac{1}{3} \\left(\\text{user_means}_i + \\text{ joke_means}_j + \\text{ global_mean} \\right)$$\n" 509 | ] 510 | }, 511 | { 512 | "cell_type": "code", 513 | "execution_count": 11, 514 | "metadata": { 515 | "collapsed": false 516 | }, 517 | "outputs": [], 518 | "source": [ 519 | "from collections import OrderedDict\n", 520 | "\n", 521 | "# Create a base class with scaffolding for our 3 baselines.\n", 522 | "\n", 523 | "def split_title(title):\n", 524 | " \"\"\"Change \"BaselineMethod\" to \"Baseline Method\".\"\"\"\n", 525 | " words = []\n", 526 | " tmp = [title[0]]\n", 527 | " for c in title[1:]:\n", 528 | " if c.isupper():\n", 529 | " words.append(''.join(tmp))\n", 530 | " tmp = [c]\n", 531 | " else:\n", 532 | " tmp.append(c)\n", 533 | " words.append(''.join(tmp))\n", 534 | " return ' '.join(words)\n", 535 | "\n", 536 | "\n", 537 | "class Baseline(object):\n", 538 | " \"\"\"Calculate baseline predictions.\"\"\"\n", 539 | "\n", 540 | " def __init__(self, train_data):\n", 541 | " \"\"\"Simple heuristic-based transductive learning to fill in missing\n", 542 | " values in data matrix.\"\"\"\n", 543 | " self.predict(train_data.copy())\n", 544 | "\n", 545 | " def predict(self, train_data):\n", 546 | " raise NotImplementedError(\n", 547 | " 'baseline prediction not implemented for base class')\n", 548 | "\n", 549 | " def rmse(self, test_data):\n", 550 | " \"\"\"Calculate root mean squared error for predictions on test data.\"\"\"\n", 551 | " return rmse(test_data, self.predicted)\n", 552 | " \n", 553 | " def __str__(self):\n", 554 | " return split_title(self.__class__.__name__)\n", 555 | " \n", 556 | "\n", 557 | "\n", 558 | "# Implement the 3 baselines.\n", 559 | "\n", 560 | "class UniformRandomBaseline(Baseline):\n", 561 | " \"\"\"Fill missing values with uniform random values.\"\"\"\n", 562 | "\n", 563 | " def predict(self, train_data):\n", 564 | " nan_mask = np.isnan(train_data)\n", 565 | " masked_train = np.ma.masked_array(train_data, nan_mask)\n", 566 | " pmin, pmax = masked_train.min(), masked_train.max()\n", 567 | " N = nan_mask.sum()\n", 568 | " train_data[nan_mask] = np.random.uniform(pmin, pmax, N)\n", 569 | " self.predicted = train_data\n", 570 | "\n", 571 | "\n", 572 | "class GlobalMeanBaseline(Baseline):\n", 573 | " \"\"\"Fill in missing values using the global 
mean.\"\"\"\n", 574 | "\n", 575 | "    def predict(self, train_data):\n", 576 | "        nan_mask = np.isnan(train_data)\n", 577 | "        train_data[nan_mask] = train_data[~nan_mask].mean()\n", 578 | "        self.predicted = train_data\n", 579 | "\n", 580 | "\n", 581 | "class MeanOfMeansBaseline(Baseline):\n", 582 | "    \"\"\"Fill in missing values using mean of user/item/global means.\"\"\"\n", 583 | "\n", 584 | "    def predict(self, train_data):\n", 585 | "        nan_mask = np.isnan(train_data)\n", 586 | "        masked_train = np.ma.masked_array(train_data, nan_mask)\n", 587 | "        global_mean = masked_train.mean()\n", 588 | "        user_means = masked_train.mean(axis=1)\n", 589 | "        item_means = masked_train.mean(axis=0)\n", 590 | "        self.predicted = train_data.copy()\n", 591 | "        n, m = train_data.shape\n", 592 | "        for i in range(n):\n", 593 | "            for j in range(m):\n", 594 | "                if np.ma.isMA(item_means[j]):  # no observed ratings for item j\n", 595 | "                    self.predicted[i,j] = np.mean(\n", 596 | "                        (global_mean, user_means[i]))\n", 597 | "                else:\n", 598 | "                    self.predicted[i,j] = np.mean(\n", 599 | "                        (global_mean, user_means[i], item_means[j]))\n", 600 | "        \n", 601 | "        \n", 602 | "baseline_methods = OrderedDict()\n", 603 | "baseline_methods['ur'] = UniformRandomBaseline\n", 604 | "baseline_methods['gm'] = GlobalMeanBaseline\n", 605 | "baseline_methods['mom'] = MeanOfMeansBaseline" 606 | ] 607 | }, 608 | { 609 | "cell_type": "markdown", 610 | "metadata": {}, 611 | "source": [ 612 | "## Probabilistic Matrix Factorization\n", 613 | "\n", 614 | "[Probabilistic Matrix Factorization (PMF)](http://papers.nips.cc/paper/3208-probabilistic-matrix-factorization.pdf) [3] is a probabilistic approach to the collaborative filtering problem that takes a Bayesian perspective. The ratings $R$ are modeled as draws from a Gaussian distribution. The mean for $R_{ij}$ is $U_i V_j^T$. The precision $\\alpha$ is a fixed parameter that reflects the uncertainty of the estimations; the normal distribution is commonly reparameterized in terms of precision, which is the inverse of the variance. Complexity is controlled by placing zero-mean spherical Gaussian priors on $U$ and $V$. In other words, each row of $U$ is drawn from a multivariate Gaussian with mean $\\mu = 0$ and precision which is some multiple of the identity matrix $I$. Those multiples are $\\alpha_U$ for $U$ and $\\alpha_V$ for $V$. So our model is defined by:\n", 615 | "\n", 616 | "$\\newcommand\\given[1][]{\\:#1\\vert\\:}$\n", 617 | "\n", 618 | "\\begin{equation}\n", 619 | "P(R \\given U, V, \\alpha) = \n", 620 | "    \\prod_{i=1}^N \\prod_{j=1}^M\n", 621 | "        \\left[ \\mathcal{N}(R_{ij} \\given U_i V_j^T, \\alpha^{-1}) \\right]^{I_{ij}}\n", 622 | "\\end{equation}\n", 623 | "\n", 624 | "\\begin{equation}\n", 625 | "P(U \\given \\alpha_U) =\n", 626 | "    \\prod_{i=1}^N \\mathcal{N}(U_i \\given 0, \\alpha_U^{-1} \\boldsymbol{I})\n", 627 | "\\end{equation}\n", 628 | "\n", 629 | "\\begin{equation}\n", 630 | "P(V \\given \\alpha_V) =\n", 631 | "    \\prod_{j=1}^M \\mathcal{N}(V_j \\given 0, \\alpha_V^{-1} \\boldsymbol{I})\n", 632 | "\\end{equation}\n", 633 | "\n", 634 | "Given small precision parameters, the priors on $U$ and $V$ ensure our latent variables do not grow too far from 0. This prevents overly strong user preferences and item factor compositions from being learned. This is commonly known as complexity control, where the complexity of the model here is measured by the magnitude of the latent variables. Controlling complexity like this helps prevent overfitting, which allows the model to generalize better for unseen data. 
We must also choose an appropriate $\\alpha$ value for the normal distribution for $R$. So the challenge becomes choosing appropriate values for $\\alpha_U$, $\\alpha_V$, and $\\alpha$. This challenge can be tackled with the soft weight-sharing methods discussed by [Nowlan and Hinton, 1992](http://www.cs.toronto.edu/~fritz/absps/sunspots.pdf) [4]. However, for the purposes of this analysis, we will stick to using point estimates obtained from our data." 635 | ] 636 | }, 637 | { 638 | "cell_type": "code", 639 | "execution_count": 19, 640 | "metadata": { 641 | "collapsed": false 642 | }, 643 | "outputs": [], 644 | "source": [ 645 | "import time\n", 646 | "import logging\n", 647 | "import pymc3 as pm\n", 648 | "import theano\n", 649 | "import scipy as sp\n", 650 | "\n", 651 | "\n", 652 | "# Enable on-the-fly graph computations, but ignore \n", 653 | "# absence of intermediate test values.\n", 654 | "theano.config.compute_test_value = 'ignore'\n", 655 | "\n", 656 | "# Set up logging.\n", 657 | "logger = logging.getLogger()\n", 658 | "logger.setLevel(logging.INFO)\n", 659 | "\n", 660 | "\n", 661 | "class PMF(object):\n", 662 | "    \"\"\"Probabilistic Matrix Factorization model using pymc3.\"\"\"\n", 663 | "\n", 664 | "    def __init__(self, train, dim, alpha=2, std=0.01, bounds=(-10, 10)):\n", 665 | "        \"\"\"Build the Probabilistic Matrix Factorization model using pymc3.\n", 666 | "\n", 667 | "        :param np.ndarray train: The training data to use for learning the model.\n", 668 | "        :param int dim: Dimensionality of the model; number of latent factors.\n", 669 | "        :param int alpha: Fixed precision for the likelihood function.\n", 670 | "        :param float std: Amount of noise to use for model initialization.\n", 671 | "        :param (tuple of int) bounds: (lower, upper) bound of ratings.\n", 672 | "            These bounds will simply be used to cap the estimates produced for R.\n", 673 | "\n", 674 | "        \"\"\"\n", 675 | "        self.dim = dim\n", 676 | "        self.alpha = alpha\n", 677 | "        self.std = np.sqrt(1.0 / alpha)\n", 678 | "        self.bounds = bounds\n", 679 | "        self.data = train.copy()\n", 680 | "        n, m = self.data.shape\n", 681 | "\n", 682 | "        # Perform mean value imputation\n", 683 | "        nan_mask = np.isnan(self.data)\n", 684 | "        self.data[nan_mask] = self.data[~nan_mask].mean()\n", 685 | "\n", 686 | "        # Low precision reflects uncertainty; prevents overfitting.\n", 687 | "        # Set to the mean variance across users and items.\n", 688 | "        self.alpha_u = 1 / self.data.var(axis=1).mean()\n", 689 | "        self.alpha_v = 1 / self.data.var(axis=0).mean()\n", 690 | "\n", 691 | "        # Specify the model.\n", 692 | "        logging.info('building the PMF model')\n", 693 | "        with pm.Model() as pmf:\n", 694 | "            U = pm.MvNormal(\n", 695 | "                'U', mu=0, tau=self.alpha_u * np.eye(dim),\n", 696 | "                shape=(n, dim), testval=np.random.randn(n, dim) * std)\n", 697 | "            V = pm.MvNormal(\n", 698 | "                'V', mu=0, tau=self.alpha_v * np.eye(dim),\n", 699 | "                shape=(m, dim), testval=np.random.randn(m, dim) * std)\n", 700 | "            R = pm.Normal(\n", 701 | "                'R', mu=theano.tensor.dot(U, V.T), tau=self.alpha * np.ones((n, m)),\n", 702 | "                observed=self.data)\n", 703 | "\n", 704 | "        logging.info('done building the PMF model')\n", 705 | "        self.model = pmf\n", 706 | "\n", 707 | "    def __str__(self):\n", 708 | "        return 'PMF(dim=%d, alpha=%s)' % (self.dim, self.alpha)  # no `name` attribute is ever set, so build one\n", 709 | "    " 710 | ] 711 | }, 712 | { 713 | "cell_type": "markdown", 714 | "metadata": {}, 715 | "source": [ 716 | "We'll also need functions for calculating the MAP and performing sampling on our PMF model. 
When the observation noise precision $\\alpha$ and the prior precisions $\\alpha_U$ and $\\alpha_V$ are all kept fixed, maximizing the log posterior is equivalent to minimizing the sum-of-squared-errors objective function with quadratic regularization terms.\n", 717 | "\n", 718 | "$$ E = \\frac{1}{2} \\sum_{i=1}^N \\sum_{j=1}^M I_{ij} (R_{ij} - U_i V_j^T)^2 + \\frac{\\lambda_U}{2} \\sum_{i=1}^N \\|U_i\\|_{Fro}^2 + \\frac{\\lambda_V}{2} \\sum_{j=1}^M \\|V_j\\|_{Fro}^2, $$\n", 719 | "\n", 720 | "where $\\lambda_U = \\alpha_U / \\alpha$, $\\lambda_V = \\alpha_V / \\alpha$, and $\\|\\cdot\\|_{Fro}$ denotes the Frobenius norm [3]. Minimizing this objective function gives a local minimum, which is essentially a maximum a posteriori (MAP) estimate. While it is possible to use a fast Stochastic Gradient Descent procedure to find this MAP, we'll be finding it using the utilities built into `pymc3`. In particular, we'll use `find_MAP` with Powell optimization (`scipy.optimize.fmin_powell`). Having found this MAP estimate, we can use it as our starting point for MCMC sampling.\n", 721 | "\n", 722 | "Since it is a reasonably complex model, we expect the MAP estimation to take some time. So let's save it after we've found it. Note that we define a function for finding the MAP below, assuming it will receive a namespace with some variables in it. Then we attach that function to the PMF class, where it will have such a namespace after initialization. The PMF class is defined in pieces this way so I can say a few things between each piece to make it clearer." 723 | ] 724 | }, 725 | { 726 | "cell_type": "code", 727 | "execution_count": 20, 728 | "metadata": { 729 | "collapsed": false 730 | }, 731 | "outputs": [ 732 | { 733 | "name": "stdout", 734 | "output_type": "stream", 735 | "text": [ 736 | "ready\n" 737 | ] 738 | } 739 | ], 740 | "source": [ 741 | "try:\n", 742 | "    import ujson as json\n", 743 | "except ImportError:\n", 744 | "    import json\n", 745 | "\n", 746 | "\n", 747 | "# First define functions to save our MAP estimate after it is found.\n", 748 | "# We adapt these from `pymc3`'s `backends` module, where the original\n", 749 | "# code is used to save the traces from MCMC samples.\n", 750 | "def save_np_vars(vars, savedir):\n", 751 | "    \"\"\"Save a dictionary of numpy variables to `savedir`. 
The directory\n", 752 | "    is created if it does not already exist; existing variable files are overwritten.\n", 753 | "    \"\"\"\n", 754 | "    logging.info('writing numpy vars to directory: %s' % savedir)\n", 755 | "    if not os.path.isdir(savedir):\n", 756 | "        os.mkdir(savedir)\n", 757 | "    shapes = {}\n", 758 | "    for varname in vars:\n", 759 | "        data = vars[varname]\n", 760 | "        var_file = os.path.join(savedir, varname + '.txt')\n", 761 | "        np.savetxt(var_file, data.reshape(-1, data.size))\n", 762 | "        shapes[varname] = data.shape\n", 763 | "\n", 764 | "    ## Store shape information for reloading.\n", 765 | "    shape_file = os.path.join(savedir, 'shapes.json')\n", 766 | "    with open(shape_file, 'w') as sfh:\n", 767 | "        json.dump(shapes, sfh)\n", 768 | "    \n", 769 | "    \n", 770 | "def load_np_vars(savedir):\n", 771 | "    \"\"\"Load numpy variables saved with `save_np_vars`.\"\"\"\n", 772 | "    shape_file = os.path.join(savedir, 'shapes.json')\n", 773 | "    with open(shape_file, 'r') as sfh:\n", 774 | "        shapes = json.load(sfh)\n", 775 | "\n", 776 | "    vars = {}\n", 777 | "    for varname, shape in shapes.items():\n", 778 | "        var_file = os.path.join(savedir, varname + '.txt')\n", 779 | "        vars[varname] = np.loadtxt(var_file).reshape(shape)\n", 780 | "    \n", 781 | "    return vars\n", 782 | "\n", 783 | "\n", 784 | "# Now define the MAP estimation infrastructure.\n", 785 | "def _map_dir(self):\n", 786 | "    basename = 'pmf-map-d%d' % self.dim\n", 787 | "    return os.path.join(DATA_DIR, basename)\n", 788 | "\n", 789 | "def _find_map(self):\n", 790 | "    \"\"\"Find mode of posterior using Powell optimization.\"\"\"\n", 791 | "    tstart = time.time()\n", 792 | "    with self.model:\n", 793 | "        logging.info('finding PMF MAP using Powell optimization...')\n", 794 | "        self._map = pm.find_MAP(fmin=sp.optimize.fmin_powell, disp=True)\n", 795 | "\n", 796 | "    elapsed = int(time.time() - tstart)\n", 797 | "    logging.info('found PMF MAP in %d seconds' % elapsed)\n", 798 | "    \n", 799 | "    # This is going to take a good deal of time to find, so let's save it.\n", 800 | "    save_np_vars(self._map, self.map_dir)\n", 801 | "    \n", 802 | "def _load_map(self):\n", 803 | "    self._map = load_np_vars(self.map_dir)\n", 804 | "\n", 805 | "def _map(self):\n", 806 | "    try:\n", 807 | "        return self._map\n", 808 | "    except AttributeError:  # not yet computed or loaded\n", 809 | "        if os.path.isdir(self.map_dir):\n", 810 | "            self.load_map()\n", 811 | "        else:\n", 812 | "            self.find_map()\n", 813 | "        return self._map\n", 814 | "\n", 815 | "    \n", 816 | "# Update our class with the new MAP infrastructure.\n", 817 | "PMF.find_map = _find_map\n", 818 | "PMF.load_map = _load_map\n", 819 | "PMF.map_dir = property(_map_dir)\n", 820 | "PMF.map = property(_map)\n", 821 | "print(\"ready\")" 822 | ] 823 | }, 824 | { 825 | "cell_type": "markdown", 826 | "metadata": {}, 827 | "source": [ 828 | "So now our PMF class has a `map` property which will either be found using Powell optimization or loaded from a previous optimization. Once we have the MAP, we can use it as a starting point for our MCMC sampler. We'll need a sampling function in order to draw MCMC samples to approximate the posterior distribution of the PMF model."
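 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an aside, the lazy, cached property pattern behind `map` is worth seeing in miniature. The toy class below is ours, purely for illustration, and is not part of the PMF model:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Toy version of the cached `map` property: compute on first access,\n", "# serve from the cached attribute afterwards.\n", "class Expensive(object):\n", "    @property\n", "    def value(self):\n", "        try:\n", "            return self._value\n", "        except AttributeError:  # first access: nothing cached yet\n", "            print('computing once...')\n", "            self._value = 42  # stand-in for a slow optimization\n", "            return self._value\n", "\n", "ex = Expensive()\n", "print(ex.value)  # computes and caches\n", "print(ex.value)  # cache hit; no recomputation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is that sampling function:"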
829 | ] 830 | }, 831 | { 832 | "cell_type": "code", 833 | "execution_count": 21, 834 | "metadata": { 835 | "collapsed": false 836 | }, 837 | "outputs": [ 838 | { 839 | "name": "stdout", 840 | "output_type": "stream", 841 | "text": [ 842 | "ready\n" 843 | ] 844 | } 845 | ], 846 | "source": [ 847 | "# Draw MCMC samples.\n", 848 | "def _trace_dir(self):\n", 849 | " basename = 'pmf-mcmc-d%d' % self.dim\n", 850 | " return os.path.join(DATA_DIR, basename)\n", 851 | "\n", 852 | "def _draw_samples(self, nsamples=1000, njobs=2):\n", 853 | " # First make sure the trace_dir does not already exist.\n", 854 | " if os.path.isdir(self.trace_dir):\n", 855 | " shutil.rmtree(self.trace_dir)\n", 856 | "\n", 857 | " with self.model:\n", 858 | " logging.info('drawing %d samples using %d jobs' % (nsamples, njobs))\n", 859 | " backend = pm.backends.Text(self.trace_dir)\n", 860 | " logging.info('backing up trace to directory: %s' % self.trace_dir)\n", 861 | " self.trace = pm.sample(draws=nsamples, init='advi',\n", 862 | " n_init=150000, njobs=njobs, trace=backend)\n", 863 | " \n", 864 | "def _load_trace(self):\n", 865 | " with self.model:\n", 866 | " self.trace = pm.backends.text.load(self.trace_dir)\n", 867 | "\n", 868 | " \n", 869 | "# Update our class with the sampling infrastructure.\n", 870 | "PMF.trace_dir = property(_trace_dir)\n", 871 | "PMF.draw_samples = _draw_samples\n", 872 | "PMF.load_trace = _load_trace\n", 873 | "print(\"ready\")" 874 | ] 875 | }, 876 | { 877 | "cell_type": "markdown", 878 | "metadata": {}, 879 | "source": [ 880 | "We could define some kind of default trace property like we did for the MAP, but that would mean using possibly nonsensical values for `nsamples` and `njobs`. Better to leave it as a non-optional call to `draw_samples`. Finally, we'll need a function to make predictions using our inferred values for $U$ and $V$. For user $i$ and joke $j$, a prediction is generated by drawing from $\\mathcal{N}(U_i V_j^T, \\alpha)$. To generate predictions from the sampler, we generate an $R$ matrix for each $U$ and $V$ sampled, then we combine these by averaging over the $K$ samples.\n", 881 | "\n", 882 | "\\begin{equation}\n", 883 | "P(R_{ij}^* \\given R, \\alpha, \\alpha_U, \\alpha_V) \\approx\n", 884 | " \\frac{1}{K} \\sum_{k=1}^K \\mathcal{N}(U_i V_j^T, \\alpha)\n", 885 | "\\end{equation}\n", 886 | "\n", 887 | "We'll want to inspect the individual $R$ matrices before averaging them for diagnostic purposes. So we'll write code for the averaging piece during evaluation. The function below simply draws an $R$ matrix given a $U$ and $V$ and the fixed $\\alpha$ stored in the PMF object." 
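 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A tiny sketch of that averaging step, with made-up shapes rather than the real sampler, may make the formula concrete:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Combine K sampled R matrices with an elementwise mean (toy data;\n", "# in the real evaluation the matrices come from the MCMC trace).\n", "K, n, m = 5, 3, 4\n", "sample_Rs = [np.random.normal(size=(n, m)) for _ in range(K)]\n", "R_avg = np.mean(sample_Rs, axis=0)  # average over the K samples\n", "print(R_avg.shape)  # still (n, m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And here is the draw function itself:"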
888 | ] 889 | }, 890 | { 891 | "cell_type": "code", 892 | "execution_count": 22, 893 | "metadata": { 894 | "collapsed": false 895 | }, 896 | "outputs": [], 897 | "source": [ 898 | "def _predict(self, U, V):\n", 899 | "    \"\"\"Estimate R from the given values of U and V.\"\"\"\n", 900 | "    R = np.dot(U, V.T)\n", 901 | "    n, m = R.shape\n", 902 | "    sample_R = np.array([\n", 903 | "        [np.random.normal(R[i,j], self.std) for j in range(m)]\n", 904 | "        for i in range(n)\n", 905 | "    ])\n", 906 | "\n", 907 | "    # bound ratings\n", 908 | "    low, high = self.bounds\n", 909 | "    sample_R[sample_R < low] = low\n", 910 | "    sample_R[sample_R > high] = high\n", 911 | "    return sample_R\n", 912 | "\n", 913 | "\n", 914 | "PMF.predict = _predict" 915 | ] 916 | }, 917 | { 918 | "cell_type": "markdown", 919 | "metadata": {}, 920 | "source": [ 921 | "One final thing to note: the dot products in this model are often constrained using a logistic function $g(x) = 1/(1 + \\exp(-x))$, which bounds the predictions to the range [0, 1]. To facilitate this bounding, the ratings are also mapped to the range [0, 1] using $t(x) = (x + \\text{min}) / \\text{range}$. The authors of PMF also introduced a constrained version which performs better on users with fewer ratings [3]. Both models are generally improvements upon the basic model presented here. However, in the interest of time and space, these will not be implemented here." 922 | ] 923 | }, 924 | { 925 | "cell_type": "markdown", 926 | "metadata": {}, 927 | "source": [ 928 | "## Evaluation\n", 929 | "\n", 930 | "### Metrics\n", 931 | "\n", 932 | "In order to understand how effective our models are, we'll need to be able to evaluate them. We'll be evaluating in terms of root mean squared error (RMSE), which looks like this:\n", 933 | "\n", 934 | "\\begin{equation}\n", 935 | "RMSE = \\sqrt{ \\frac{ \\sum_{i=1}^N \\sum_{j=1}^M I_{ij} (R_{ij} - R_{ij}^*)^2 }\n", 936 | "                   { \\sum_{i=1}^N \\sum_{j=1}^M I_{ij} } }\n", 937 | "\\end{equation}\n", 938 | "\n", 939 | "In this case, the RMSE can be thought of as the standard deviation of our predictions from the actual user preferences." 940 | ] 941 | }, 942 | { 943 | "cell_type": "code", 944 | "execution_count": 23, 945 | "metadata": { 946 | "collapsed": false 947 | }, 948 | "outputs": [], 949 | "source": [ 950 | "# Define our evaluation function.\n", 951 | "def rmse(test_data, predicted):\n", 952 | "    \"\"\"Calculate root mean squared error.\n", 953 | "    Ignoring missing values in the test data.\n", 954 | "    \"\"\"\n", 955 | "    I = ~np.isnan(test_data)   # indicator for observed (non-missing) values\n", 956 | "    N = I.sum()                # number of observed values\n", 957 | "    sqerror = abs(test_data - predicted) ** 2  # squared error array\n", 958 | "    mse = sqerror[I].sum() / N                 # mean squared error\n", 959 | "    return np.sqrt(mse)                        # RMSE" 960 | ] 961 | }, 962 | { 963 | "cell_type": "markdown", 964 | "metadata": {}, 965 | "source": [ 966 | "### Training Data vs. Test Data\n", 967 | "\n", 968 | "The next thing we need to do is split our data into a training set and a test set. Matrix factorization techniques use [transductive learning](http://en.wikipedia.org/wiki/Transduction_%28machine_learning%29) rather than inductive learning. So we produce a test set by taking a random sample of the cells in the full $N \\times M$ data matrix. The values selected as test samples are replaced with `nan` values in a copy of the original data matrix to produce the training set. Since we'll be producing random splits, let's also write out the train/test sets generated. 
This will allow us to replicate our results. We'd like to be able to identify which split is which, so we'll take a hash of the indices selected for testing and use that to save the data." 969 | ] 970 | }, 971 | { 972 | "cell_type": "code", 973 | "execution_count": 28, 974 | "metadata": { 975 | "collapsed": false 976 | }, 977 | "outputs": [], 991 | "source": [ 992 | "import hashlib\n", 993 | "\n", 994 | "\n", 995 | "# 
Define a function for splitting train/test data.\n", 996 | "def split_train_test(data, percent_test=10):\n", 997 | "    \"\"\"Split the data into train/test sets.\n", 998 | "    :param int percent_test: Percentage of data to use for testing. Default 10.\n", 999 | "    \"\"\"\n", 1000 | "    n, m = data.shape             # # users, # jokes\n", 1001 | "    N = n * m                     # # cells in matrix\n", 1002 | "    test_size = int(N * percent_test / 100.)  # 10% of the cells by default\n", 1003 | "    train_size = N - test_size    # and remainder for training\n", 1004 | "\n", 1005 | "    # Prepare train/test ndarrays.\n", 1006 | "    train = data.copy().values\n", 1007 | "    test = np.ones(data.shape) * np.nan\n", 1008 | "\n", 1009 | "    # Draw random sample of training data to use for testing.\n", 1010 | "    tosample = np.where(~np.isnan(train))       # ignore nan values in data\n", 1011 | "    \n", 1013 | "    idx_pairs = list(zip(tosample[0], tosample[1]))  # tuples of row/col index pairs\n", 1014 | "    \n", 1015 | "    indices = np.arange(len(idx_pairs))         # indices of index pairs\n", 1016 | "    sample = np.random.choice(indices, replace=False, size=test_size)\n", 1017 | "\n", 1018 | "    # Transfer random sample from train set to test set.\n", 1019 | "    for idx in sample:\n", 1020 | "        idx_pair = idx_pairs[idx]\n", 1021 | "        test[idx_pair] = train[idx_pair]  # transfer to test set\n", 1022 | "        train[idx_pair] = np.nan          # remove from train set\n", 1023 | "\n", 1024 | "    # Verify everything worked properly\n", 1025 | "    assert(np.isnan(train).sum() == test_size)\n", 1026 | "    assert(np.isnan(test).sum() == train_size)\n", 1027 | "    \n", 1028 | "    # Finally, hash the indices and save the train/test sets.\n", 1029 | "    index_string = ''.join(map(str, np.sort(sample)))\n", 1030 | "    name = hashlib.sha1(index_string.encode('utf-8')).hexdigest()  # encode: sha1 needs bytes on Python 3\n", 1031 | "    savedir = os.path.join(DATA_DIR, name)\n", 1032 | "    save_np_vars({'train': train, 'test': test}, savedir)\n", 1033 | "    \n", 1034 | "    # Return train set, test set, and unique hash of indices.\n", 1035 | "    return train, test, name\n", 1036 | "\n", 1037 | "\n", 1038 | "def load_train_test(name):\n", 1039 | "    \"\"\"Load the train/test sets.\"\"\"\n", 1040 | "    savedir = os.path.join(DATA_DIR, name)\n", 1041 | "    vars = load_np_vars(savedir)\n", 1042 | "    return vars['train'], vars['test']\n", 1043 | "\n", 1044 | "train, test, name = split_train_test(data)\n", 1045 | "print(\"done\")" 1046 | ] 1047 | }, 1048 | { 1049 | "cell_type": "markdown", 1050 | "metadata": {}, 1051 | "source": [ 1052 | "In order to facilitate reproducibility, I've produced a train/test split using the code above which we'll now use for all the evaluations below."
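 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A quick sanity check is cheap here (this sketch assumes the `split_train_test` call above ran and left `train` and `test` in the namespace):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# The test set should hold roughly percent_test percent of the ratings.\n", "n_test = (~np.isnan(test)).sum()\n", "n_train = (~np.isnan(train)).sum()\n", "print('test fraction: %.3f' % (n_test / float(n_test + n_train)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we load that split:"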
1053 | ] 1054 | }, 1055 | { 1056 | "cell_type": "code", 1057 | "execution_count": 18, 1058 | "metadata": { 1059 | "collapsed": false 1060 | }, 1061 | "outputs": [ 1062 | { 1063 | "ename": "FileNotFoundError", 1064 | "evalue": "[Errno 2] No such file or directory: './data/2f7be834c19f630bc89d550e41485a5a54d28af7/shapes.json'", 1065 | "output_type": "error", 1066 | "traceback": [ 1067 | "Traceback (most recent call last)", 1068 | "  in load_train_test(name): vars = load_np_vars(savedir)", 1069 | "  in load_np_vars(savedir): with open(shape_file, 'r') as sfh", 1070 | "FileNotFoundError: [Errno 2] No such file or directory: './data/2f7be834c19f630bc89d550e41485a5a54d28af7/shapes.json'" 1073 | ] 1074 | } 1075 | ], 1076 | "source": [ 1077 | "train, test = load_train_test('2f7be834c19f630bc89d550e41485a5a54d28af7')" 1078 | ] 1079 | }, 1080 | { 1081 | "cell_type": 
"markdown", 1082 | "metadata": {}, 1083 | "source": [ 1084 | "## Results" 1085 | ] 1086 | }, 1087 | { 1088 | "cell_type": "code", 1089 | "execution_count": null, 1090 | "metadata": { 1091 | "collapsed": false 1092 | }, 1093 | "outputs": [], 1094 | "source": [ 1095 | "# Let's see the results:\n", 1096 | "baselines = {}\n", 1097 | "for name in baseline_methods:\n", 1098 | " Method = baseline_methods[name]\n", 1099 | " method = Method(train)\n", 1100 | " baselines[name] = method.rmse(test)\n", 1101 | " print('%s RMSE:\\t%.5f' % (method, baselines[name]))\n" 1102 | ] 1103 | }, 1104 | { 1105 | "cell_type": "markdown", 1106 | "metadata": {}, 1107 | "source": [ 1108 | "As expected: the uniform random baseline is the worst by far, the global mean baseline is next best, and the mean of means method is our best baseline. Now let's see how PMF stacks up." 1109 | ] 1110 | }, 1111 | { 1112 | "cell_type": "code", 1113 | "execution_count": null, 1114 | "metadata": { 1115 | "collapsed": false 1116 | }, 1117 | "outputs": [], 1118 | "source": [ 1119 | "# We use a fixed precision for the likelihood.\n", 1120 | "# This reflects uncertainty in the dot product.\n", 1121 | "# We choose 2 in the footsteps Salakhutdinov\n", 1122 | "# Mnihof.\n", 1123 | "ALPHA = 2\n", 1124 | "\n", 1125 | "# The dimensionality D; the number of latent factors.\n", 1126 | "# We can adjust this higher to try to capture more subtle\n", 1127 | "# characteristics of each joke. However, the higher it is,\n", 1128 | "# the more expensive our inference procedures will be.\n", 1129 | "# Specifically, we have D(N + M) latent variables. For our\n", 1130 | "# Jester dataset, this means we have D(1100), so for 5\n", 1131 | "# dimensions, we are sampling 5500 latent variables.\n", 1132 | "DIM = 5\n", 1133 | "\n", 1134 | "\n", 1135 | "pmf = PMF(train, DIM, ALPHA, std=0.05)" 1136 | ] 1137 | }, 1138 | { 1139 | "cell_type": "markdown", 1140 | "metadata": {}, 1141 | "source": [ 1142 | "### Predictions Using MAP" 1143 | ] 1144 | }, 1145 | { 1146 | "cell_type": "code", 1147 | "execution_count": null, 1148 | "metadata": { 1149 | "collapsed": false 1150 | }, 1151 | "outputs": [], 1152 | "source": [ 1153 | "# Find MAP for PMF.\n", 1154 | "pmf.find_map()" 1155 | ] 1156 | }, 1157 | { 1158 | "cell_type": "markdown", 1159 | "metadata": {}, 1160 | "source": [ 1161 | "Excellent. The first thing we want to do is make sure the MAP estimate we obtained is reasonable. We can do this by computing RMSE on the predicted ratings obtained from the MAP values of $U$ and $V$. First we define a function for generating the predicted ratings $R$ from $U$ and $V$. We ensure the actual rating bounds are enforced by setting all values below -10 to -10 and all values above 10 to 10. Finally, we compute RMSE for both the training set and the test set. We expect the test RMSE to be higher. The difference between the two gives some idea of how much we have overfit. Some difference is always expected, but a very low RMSE on the training set with a high RMSE on the test set is a definite sign of overfitting." 
1162 | ] 1163 | }, 1164 | { 1165 | "cell_type": "code", 1166 | "execution_count": null, 1167 | "metadata": { 1168 | "collapsed": false 1169 | }, 1170 | "outputs": [], 1171 | "source": [ 1172 | "def eval_map(pmf_model, train, test):\n", 1173 | "    U = pmf_model.map['U']\n", 1174 | "    V = pmf_model.map['V']\n", 1175 | "    \n", 1176 | "    # Make predictions and calculate RMSE on train & test sets.\n", 1177 | "    predictions = pmf_model.predict(U, V)\n", 1178 | "    train_rmse = rmse(train, predictions)\n", 1179 | "    test_rmse = rmse(test, predictions)\n", 1180 | "    overfit = test_rmse - train_rmse\n", 1181 | "    \n", 1182 | "    # Print report.\n", 1183 | "    print('PMF MAP training RMSE: %.5f' % train_rmse)\n", 1184 | "    print('PMF MAP testing RMSE:  %.5f' % test_rmse)\n", 1185 | "    print('Train/test difference: %.5f' % overfit)\n", 1186 | "    \n", 1187 | "    return test_rmse\n", 1188 | "    \n", 1189 | "\n", 1190 | "# Add eval function to PMF class.\n", 1191 | "PMF.eval_map = eval_map" 1192 | ] 1193 | }, 1194 | { 1195 | "cell_type": "code", 1196 | "execution_count": null, 1197 | "metadata": { 1198 | "collapsed": false 1199 | }, 1200 | "outputs": [], 1201 | "source": [ 1202 | "# Evaluate PMF MAP estimates.\n", 1203 | "pmf_map_rmse = pmf.eval_map(train, test)\n", 1204 | "pmf_improvement = baselines['mom'] - pmf_map_rmse\n", 1205 | "print('PMF MAP Improvement:   %.5f' % pmf_improvement)" 1206 | ] 1207 | }, 1208 | { 1209 | "cell_type": "markdown", 1210 | "metadata": {}, 1211 | "source": [ 1212 | "So we see a pretty nice improvement here when compared to our best baseline, which was the mean of means method. We also have a fairly small difference in the RMSE values between the train and the test sets. This indicates that the point estimates for $\\alpha_U$ and $\\alpha_V$ that we calculated from our data are doing a good job of controlling model complexity. Now let's see if we can improve our estimates by approximating our posterior distribution with MCMC sampling. We'll draw 300 samples and back them up using the `pymc3.backends.Text` backend." 1213 | ] 1214 | }, 1215 | { 1216 | "cell_type": "markdown", 1217 | "metadata": {}, 1218 | "source": [ 1219 | "### Predictions using MCMC" 1220 | ] 1221 | }, 1222 | { 1223 | "cell_type": "code", 1224 | "execution_count": null, 1225 | "metadata": { 1226 | "collapsed": false, 1227 | "scrolled": true 1228 | }, 1229 | "outputs": [], 1230 | "source": [ 1231 | "# Draw MCMC samples.\n", 1232 | "pmf.draw_samples(300)\n", 1233 | "\n", 1234 | "# uncomment to load previous trace rather than drawing new samples.\n", 1235 | "# pmf.load_trace()" 1236 | ] 1237 | }, 1238 | { 1239 | "cell_type": "markdown", 1240 | "metadata": {}, 1241 | "source": [ 1242 | "### Diagnostics and Posterior Predictive Check\n", 1243 | "\n", 1244 | "The next step is to check how many samples we should discard as burn-in. Normally, we'd do this using a traceplot to get some idea of where the sampled variables start to converge. In this case, we have high-dimensional samples, so we need to find a way to approximate them. One way was proposed by [Salakhutdinov and Mnih, p.886](https://www.cs.toronto.edu/~amnih/papers/bpmf.pdf). We can calculate the Frobenius norms of $U$ and $V$ at each step and monitor those for convergence. This essentially gives us some idea when the average magnitude of the latent variables is stabilizing. The equations for the Frobenius norms of $U$ and $V$ are shown below. 
We will use `numpy`'s `linalg` package to calculate these.\n", 1245 | "\n", 1246 | "$$ \|U\|_{Fro} = \sqrt{\sum_{i=1}^N \sum_{d=1}^D |U_{id}|^2}, \hspace{40pt} \|V\|_{Fro} = \sqrt{\sum_{j=1}^M \sum_{d=1}^D |V_{jd}|^2} $$" 1247 | ] 1248 | }, 1249 | { 1250 | "cell_type": "code", 1251 | "execution_count": null, 1252 | "metadata": { 1253 | "collapsed": false 1254 | }, 1255 | "outputs": [], 1256 | "source": [ 1257 | "def _norms(pmf_model, monitor=('U', 'V'), ord='fro'):\n", 1258 | " \"\"\"Return norms of latent variables at each step in the\n", 1259 | " sample trace. These can be used to monitor convergence\n", 1260 | " of the sampler.\n", 1261 | " \"\"\"\n", 1263 | " norms = {var: [] for var in monitor}\n", 1264 | " for sample in pmf_model.trace:\n", 1265 | " for var in monitor:\n", 1266 | " norms[var].append(np.linalg.norm(sample[var], ord))\n", 1267 | " return norms\n", 1268 | "\n", 1269 | "\n", 1270 | "def _traceplot(pmf_model):\n", 1271 | " \"\"\"Plot Frobenius norms of U and V as a function of sample #.\"\"\"\n", 1272 | " trace_norms = pmf_model.norms()\n", 1273 | " u_series = pd.Series(trace_norms['U'])\n", 1274 | " v_series = pd.Series(trace_norms['V'])\n", 1275 | " fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 7))\n", 1276 | " u_series.plot(kind='line', ax=ax1, grid=False,\n", 1277 | " title=\"$\|U\|_{Fro}$ at Each Sample\")\n", 1278 | " v_series.plot(kind='line', ax=ax2, grid=False,\n", 1279 | " title=\"$\|V\|_{Fro}$ at Each Sample\")\n", 1280 | " ax1.set_xlabel(\"Sample Number\")\n", 1281 | " ax2.set_xlabel(\"Sample Number\")\n", 1282 | " \n", 1283 | " \n", 1284 | "PMF.norms = _norms\n", 1285 | "PMF.traceplot = _traceplot" 1286 | ] 1287 | }, 1288 | { 1289 | "cell_type": "code", 1290 | "execution_count": null, 1291 | "metadata": { 1292 | "collapsed": false 1293 | }, 1294 | "outputs": [], 1295 | "source": [ 1296 | "pmf.traceplot()" 1297 | ] 1298 | }, 1299 | { 1300 | "cell_type": "markdown", 1301 | "metadata": {}, 1302 | "source": [ 1303 | "It appears we get convergence of $U$ and $V$ after about 200 samples. When testing for convergence, we also want to see convergence of the particular statistics we are looking for, since different characteristics of the posterior may converge at different rates. Let's also do a traceplot of the RMSE. We'll compute RMSE for both the train and the test set, even though the convergence is indicated by RMSE on the training set alone. In addition, let's compute a running RMSE on the train/test sets to see how aggregate performance improves or decreases as we continue to sample."
1304 | ] 1305 | }, 1306 | { 1307 | "cell_type": "code", 1308 | "execution_count": null, 1309 | "metadata": { 1310 | "collapsed": false 1311 | }, 1312 | "outputs": [], 1313 | "source": [ 1314 | "def _running_rmse(pmf_model, test_data, train_data, burn_in=0, plot=True):\n", 1315 | " \"\"\"Calculate RMSE for each step of the trace to monitor convergence.\n", 1316 | " \"\"\"\n", 1317 | " burn_in = burn_in if len(pmf_model.trace) >= burn_in else 0\n", 1318 | " results = {'per-step-train': [], 'running-train': [],\n", 1319 | " 'per-step-test': [], 'running-test': []}\n", 1320 | " R = np.zeros(test_data.shape)\n", 1321 | " for cnt, sample in enumerate(pmf_model.trace[burn_in:]):\n", 1322 | " sample_R = pmf_model.predict(sample['U'], sample['V'])\n", 1323 | " R += sample_R\n", 1324 | " running_R = R / (cnt + 1)\n", 1325 | " results['per-step-train'].append(rmse(train_data, sample_R))\n", 1326 | " results['running-train'].append(rmse(train_data, running_R))\n", 1327 | " results['per-step-test'].append(rmse(test_data, sample_R))\n", 1328 | " results['running-test'].append(rmse(test_data, running_R))\n", 1329 | " \n", 1330 | " results = pd.DataFrame(results)\n", 1331 | "\n", 1332 | " if plot:\n", 1333 | " results.plot(\n", 1334 | " kind='line', grid=False, figsize=(15, 7),\n", 1335 | " title='Per-step and Running RMSE From Posterior Predictive')\n", 1336 | " \n", 1337 | " # Return the final predictions, and the RMSE calculations\n", 1338 | " return running_R, results\n", 1339 | "\n", 1340 | "\n", 1341 | "PMF.running_rmse = _running_rmse" 1342 | ] 1343 | }, 1344 | { 1345 | "cell_type": "code", 1346 | "execution_count": null, 1347 | "metadata": { 1348 | "collapsed": false 1349 | }, 1350 | "outputs": [], 1351 | "source": [ 1352 | "predicted, results = pmf.running_rmse(test, train, burn_in=200)" 1353 | ] 1354 | }, 1355 | { 1356 | "cell_type": "code", 1357 | "execution_count": null, 1358 | "metadata": { 1359 | "collapsed": false 1360 | }, 1361 | "outputs": [], 1362 | "source": [ 1363 | "# And our final RMSE?\n", 1364 | "final_test_rmse = results['running-test'].values[-1]\n", 1365 | "final_train_rmse = results['running-train'].values[-1]\n", 1366 | "print('Posterior predictive train RMSE: %.5f' % final_train_rmse)\n", 1367 | "print('Posterior predictive test RMSE: %.5f' % final_test_rmse)\n", 1368 | "print('Train/test difference: %.5f' % (final_test_rmse - final_train_rmse))\n", 1369 | "print('Improvement from MAP: %.5f' % (pmf_map_rmse - final_test_rmse))\n", 1370 | "print('Improvement from Mean of Means: %.5f' % (baselines['mom'] - final_test_rmse))" 1371 | ] 1372 | }, 1373 | { 1374 | "cell_type": "markdown", 1375 | "metadata": {}, 1376 | "source": [ 1377 | "We have some interesting results here. As expected, our MCMC sampler provides lower error on the training set. However, it seems it does so at the cost of overfitting the data. This results in an increase in test RMSE as compared to the MAP, even though it is still much better than our best baseline. So why might this be the case? Recall that we used point estimates for our precision parameters $\alpha_U$ and $\alpha_V$ and we chose a fixed precision $\alpha$. It is quite likely that by doing this, we constrained our posterior in a way that biased it towards the training data. In reality, the variance in the user ratings and the joke ratings is unlikely to be equal to the means of sample variances we used. The most reasonable observation precision $\alpha$ is likely different as well."
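]
},
{
 "cell_type": "markdown",
 "metadata": {},
 "source": [
  "(As an illustrative aside, one way to relax these fixed precisions is to place priors on them and let the data inform their values, in the spirit of the BPMF paper referenced below. A minimal sketch of the idea in `pymc3`; the Gamma prior choices here are our own assumptions and not part of this analysis.)"
 ]
},
{
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
  "collapsed": false
 },
 "outputs": [],
 "source": [
  "import pymc3 as pm\n",
  "\n",
  "with pm.Model() as precision_sketch:\n",
  "    # Hyperpriors on the precisions, instead of fixed point estimates.\n",
  "    alpha = pm.Gamma('alpha', alpha=2, beta=1)\n",
  "    alpha_u = pm.Gamma('alpha_u', alpha=2, beta=1)\n",
  "    alpha_v = pm.Gamma('alpha_v', alpha=2, beta=1)"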
1378 | ] 1379 | }, 1380 | { 1381 | "cell_type": "markdown", 1382 | "metadata": {}, 1383 | "source": [ 1384 | "### Summary of Results\n", 1385 | "\n", 1386 | "Let's summarize our results." 1387 | ] 1388 | }, 1389 | { 1390 | "cell_type": "code", 1391 | "execution_count": null, 1392 | "metadata": { 1393 | "collapsed": false 1394 | }, 1395 | "outputs": [], 1396 | "source": [ 1397 | "size = 100 # RMSE doesn't really change after 100th sample anyway.\n", 1398 | "all_results = pd.DataFrame({\n", 1399 | " 'uniform random': np.repeat(baselines['ur'], size),\n", 1400 | " 'global means': np.repeat(baselines['gm'], size),\n", 1401 | " 'mean of means': np.repeat(baselines['mom'], size),\n", 1402 | " 'PMF MAP': np.repeat(pmf_map_rmse, size),\n", 1403 | " 'PMF MCMC': results['running-test'][:size],\n", 1404 | "})\n", 1405 | "fig, ax = plt.subplots(figsize=(10, 5))\n", 1406 | "all_results.plot(kind='line', grid=False, ax=ax,\n", 1407 | " title='RMSE for all methods')\n", 1408 | "ax.set_xlabel(\"Number of Samples\")\n", 1409 | "ax.set_ylabel(\"RMSE\")" 1410 | ] 1411 | }, 1412 | { 1413 | "cell_type": "markdown", 1414 | "metadata": {}, 1415 | "source": [ 1416 | "## Summary\n", 1417 | "\n", 1418 | "We set out to predict user preferences for unseen jokes. First we discussed the intuitive notion behind the user-user and item-item neighborhood approaches to collaborative filtering. Then we formalized our intuitions. With a firm understanding of our problem context, we moved on to exploring our subset of the Jester data. After discovering some general patterns, we defined three baseline methods: uniform random, global mean, and mean of means. With the goal of besting our baseline methods, we implemented the basic version of Probabilistic Matrix Factorization (PMF) using `pymc3`.\n", 1419 | "\n", 1420 | "Our results demonstrate that the mean of means method is our best baseline on our prediction task. As expected, we are able to obtain a significant decrease in RMSE using the PMF MAP estimate obtained via Powell optimization. We illustrated one way to monitor convergence of an MCMC sampler with a high-dimensional sampling space using the Frobenius norms of the sampled variables. The traceplots using this method seem to indicate that our sampler converged to the posterior. Results using this posterior showed that attempting to improve the MAP estimation using MCMC sampling actually overfit the training data and increased test RMSE. This was likely caused by the constraining of the posterior via fixed precision parameters $\alpha$, $\alpha_U$, and $\alpha_V$.\n", 1421 | "\n", 1422 | "As a followup to this analysis, it would be interesting to also implement the logistic and constrained versions of PMF. We expect both models to outperform the basic PMF model. We could also implement the [fully Bayesian version of PMF](https://www.cs.toronto.edu/~amnih/papers/bpmf.pdf) (BPMF), which places hyperpriors on the model parameters to automatically learn ideal mean and precision parameters for $U$ and $V$. This would likely resolve the issue we faced in this analysis. We would expect BPMF to improve upon the MAP estimation produced here by learning more suitable hyperparameters and parameters. For a basic (but working!) implementation of BPMF in `pymc3`, see [this gist](https://gist.github.com/macks22/00a17b1d374dfc267a9a).\n", 1423 | "\n", 1424 | "If you made it this far, then congratulations! You now have some idea of how to build a basic recommender system.
These same ideas and methods can be used on many different recommendation tasks. Items can be movies, products, advertisements, courses, or even other people. Any time you can build yourself a user-item matrix with user preferences in the cells, you can use these types of collaborative filtering algorithms to predict the missing values. If you want to learn more about recommender systems, the first reference is a good place to start." 1425 | ] 1426 | }, 1427 | { 1428 | "cell_type": "markdown", 1429 | "metadata": {}, 1430 | "source": [ 1431 | "## References\n", 1432 | "\n", 1433 | "1. Y. Koren, R. Bell, and C. Volinsky, “Matrix Factorization Techniques for Recommender Systems,” Computer, vol. 42, no. 8, pp. 30–37, Aug. 2009.\n", 1434 | "2. K. Goldberg, T. Roeder, D. Gupta, and C. Perkins, “Eigentaste: A constant time collaborative filtering algorithm,” Information Retrieval, vol. 4, no. 2, pp. 133–151, 2001.\n", 1435 | "3. A. Mnih and R. Salakhutdinov, “Probabilistic matrix factorization,” in Advances in neural information processing systems, 2007, pp. 1257–1264.\n", 1436 | "4. S. J. Nowlan and G. E. Hinton, “Simplifying Neural Networks by Soft Weight-sharing,” Neural Comput., vol. 4, no. 4, pp. 473–493, Jul. 1992.\n", 1437 | "5. R. Salakhutdinov and A. Mnih, “Bayesian Probabilistic Matrix Factorization Using Markov Chain Monte Carlo,” in Proceedings of the 25th International Conference on Machine Learning, New York, NY, USA, 2008, pp. 880–887.\n", 1438 | "\n", 1439 | "\n", 1440 | "\n", 1441 | "\n" 1442 | ] 1443 | }, 1444 | { 1445 | "cell_type": "code", 1446 | "execution_count": null, 1447 | "metadata": { 1448 | "collapsed": true 1449 | }, 1450 | "outputs": [], 1451 | "source": [] 1452 | } 1453 | ], 1454 | "metadata": { 1455 | "anaconda-cloud": {}, 1456 | "kernelspec": { 1457 | "display_name": "Python 3", 1458 | "language": "python", 1459 | "name": "python3" 1460 | }, 1461 | "language_info": { 1462 | "codemirror_mode": { 1463 | "name": "ipython", 1464 | "version": 3 1465 | }, 1466 | "file_extension": ".py", 1467 | "mimetype": "text/x-python", 1468 | "name": "python", 1469 | "nbconvert_exporter": "python", 1470 | "pygments_lexer": "ipython3", 1471 | "version": "3.5.2" 1472 | } 1473 | }, 1474 | "nbformat": 4, 1475 | "nbformat_minor": 1 1476 | } 1477 | -------------------------------------------------------------------------------- /slides/bmh.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/bmh.jpeg -------------------------------------------------------------------------------- /slides/boxs-loop.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/boxs-loop.png -------------------------------------------------------------------------------- /slides/coin_flip.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/coin_flip.png -------------------------------------------------------------------------------- /slides/conceptual.odg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/conceptual.odg -------------------------------------------------------------------------------- /slides/data-decisions.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/data-decisions.pdf -------------------------------------------------------------------------------- /slides/edward.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/edward.png -------------------------------------------------------------------------------- /slides/galvanize-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/galvanize-logo.png -------------------------------------------------------------------------------- /slides/inference-graph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/inference-graph.png -------------------------------------------------------------------------------- /slides/keras.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/keras.png -------------------------------------------------------------------------------- /slides/pp.bib: -------------------------------------------------------------------------------- 1 | % Encoding: ISO-8859-1 2 | 3 | 4 | @Book{Bishop06, 5 | Title = {Pattern Recognition and Machine Learning}, 6 | Author = {C M Bishop}, 7 | Editor = {M I Jordan and J Kleinberg and B Sch\"{o}lkopf}, 8 | Publisher = {Springer}, 9 | Year = {2006}, 10 | Month = {August}, 11 | 12 | HowPublished = {Hardcover}, 13 | ISBN = {0387310738}, 14 | Url = {http://www.amazon.ca/exec/obidos/redirect?tag=citeulike09-20\&path=ASIN/0387310738} 15 | } 16 | 17 | @Article{Rumelhart86, 18 | Title = {Learning representations by back-propagating errors}, 19 | Author = {D Rumelhart and G Hinton and R 
Williams}, 20 | Journal = {Nature}, 21 | Year = {1986}, 22 | Pages = {533--536}, 23 | Volume = {323}, 24 | 25 | Owner = {adam}, 26 | Timestamp = {2017.01.17} 27 | } 28 | 29 | @Book{Goodfellow16, 30 | Title = {Deep Learning}, 31 | Author = {I Goodfellow and Y Bengio and A Courville}, 32 | Publisher = {MIT Press}, 33 | Year = {2016}, 34 | 35 | Url = {http://www.deeplearningbook.org} 36 | } 37 | 38 | @Article{Hoffman14, 39 | Title = {The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo.}, 40 | Author = {M D Hoffman and A Gelman}, 41 | Journal = {Journal of Machine Learning Research}, 42 | Year = {2014}, 43 | Number = {1}, 44 | Pages = {1593--1623}, 45 | Volume = {15} 46 | } 47 | 48 | @InCollection{Salakhutdinov08, 49 | Title = {Probabilistic Matrix Factorization}, 50 | Author = {R Salakhutdinov and A Mnih}, 51 | Booktitle = {Advances in Neural Information Processing Systems 20}, 52 | Publisher = {MIT Press}, 53 | Year = {2008}, 54 | 55 | Address = {Cambridge, MA}, 56 | Editor = {Platt, J. C. and Koller, D. and Singer, Y. and Roweis, S.}, 57 | 58 | Citeulike-article-id = {2147262}, 59 | File = {:Salakhutdinov08.pdf:PDF}, 60 | Priority = {5} 61 | } 62 | 63 | @Article{Salvatier16, 64 | Title = {Probabilistic programming in Python using PyMC3.}, 65 | Author = {J Salvatier and T V Wiecki and C Fonnesbeck}, 66 | Journal = {PeerJ Computer Science}, 67 | Year = {2016}, 68 | Pages = {e55}, 69 | Volume = {2}, 70 | 71 | Keywords = {dblp}, 72 | Timestamp = {2016-06-23T11:38:20.000+0200}, 73 | Url = {http://dblp.uni-trier.de/db/journals/peerj-cs/peerj-cs2.html#SalvatierWF16} 74 | } 75 | 76 | @Book{Murphy12, 77 | title = {Machine learning: A probabilistic perspective}, 78 | publisher = {MIT Press}, 79 | year = {2012}, 80 | author = {Murphy, K. P}, 81 | } 82 | 83 | @Article{Tran16, 84 | author = {Dustin Tran and Alp Kucukelbir and Adji B. Dieng and Maja Rudolph and Dawen Liang and David M. Blei}, 85 | title = {{Edward: A library for probabilistic modeling, inference, and criticism}}, 86 | journal = {arXiv preprint arXiv:1610.09787}, 87 | year = {2016}, 88 | } 89 | 90 | @InProceedings{Tran17, 91 | author = {Dustin Tran and Matthew D. Hoffman and Rif A. Saurous and Eugene Brevdo and Kevin Murphy and David M. 
Blei}, 92 | title = {Deep probabilistic programming}, 93 | booktitle = {International Conference on Learning Representations}, 94 | year = {2017}, 95 | } 96 | 97 | @Article{Blei14, 98 | author = {D M Blei}, 99 | title = {Build, compute, critique, repeat: Data analysis with latent variable models.}, 100 | journal = {Annual Review of Statistics and Its Application}, 101 | year = {2014}, 102 | } 103 | 104 | @Comment{jabref-meta: databaseType:bibtex;} 105 | -------------------------------------------------------------------------------- /slides/probabilistic-programming-intro.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/probabilistic-programming-intro.pdf -------------------------------------------------------------------------------- /slides/probabilistic-programming-intro.tex: -------------------------------------------------------------------------------- 1 | %% beamer packages 2 | % other themes: AnnArbor, Antibes, Bergen, Berkeley, Berlin, Boadilla, boxes, 3 | % CambridgeUS, Darmstadt, Dresden, Frankfurt, Goettingen, Hannover, Ilmenau, 4 | %JuanLesPins, Luebeck, Madrid, Malmoe, Marburg, Montpellier, PaloAlto, 5 | %Pittsburgh, Rochester, Singapore, Szeged, Warsaw 6 | % other colors: albatross, beaver, crane, default, dolphin, dove, fly, lily, 7 | %orchid, rose, seagull, seahorse, sidebartab, structure, whale, wolverine, 8 | %beetle 9 | 10 | %\documentclass[xcolor=dvipsnames]{beamer} 11 | \documentclass[table,dvipsnames]{beamer} 12 | \usepackage{beamerthemesplit} 13 | \usepackage{bm,amsmath,marvosym} 14 | \usepackage{listings,color}%xcolor 15 | \usepackage[ngerman]{babel} 16 | \usepackage{natbib} 17 | \usepackage[utf8]{inputenc} 18 | \definecolor{shadecolor}{rgb}{.9, .9, .9} 19 | \definecolor{darkblue}{rgb}{0.0,0.0,0.5} 20 | \definecolor{myorange}{cmyk}{0,0.7,1,0} 21 | \definecolor{mypurple}{cmyk}{0.3, 0.9, 0.0, 0.2} 22 | 23 | % make a checkmark 24 | \usepackage{tikz} 25 | \def\checkmark{\tikz\fill[scale=0.4](0,.35) -- (.25,0) -- (1,.7) -- (.25,.15) -- cycle;} 26 | 27 | % dot product 28 | \usetikzlibrary{arrows,positioning} 29 | \tikzset{ 30 | %Define standard arrow tip 31 | >=stealth', 32 | % Define arrow style 33 | pil/.style={->,thick} 34 | } 35 | 36 | % math stuff 37 | \newcommand{\argmin}{\operatornamewithlimits{argmin}} 38 | 39 | \lstnewenvironment{code}{ 40 | \lstset{backgroundcolor=\color{shadecolor}, 41 | showstringspaces=false, 42 | language=python, 43 | frame=single, 44 | framerule=0pt, 45 | keepspaces=true, 46 | breaklines=true, 47 | basicstyle=\ttfamily, 48 | keywordstyle=\bfseries, 49 | basicstyle=\ttfamily\scriptsize, 50 | keywordstyle=\color{blue}\ttfamily, 51 | stringstyle=\color{red}\ttfamily, 52 | commentstyle=\color{green}\ttfamily, 53 | columns=fullflexible 54 | } 55 | }{} 56 | 57 | \lstnewenvironment{codeout}{ 58 | \lstset{backgroundcolor=\color{shadecolor}, 59 | frame=single, 60 | framerule=0pt, 61 | breaklines=true, 62 | basicstyle=\ttfamily\scriptsize, 63 | columns=fullflexible 64 | } 65 | }{} 66 | 67 | \hypersetup{colorlinks = true, linkcolor=darkblue, citecolor=darkblue,urlcolor=darkblue} 68 | \hypersetup{pdfauthor={A. 
Richards}, pdftitle={Intro to probabilistic programming}} 69 | 70 | \newcommand{\rd}{\textcolor{red}} 71 | \newcommand{\grn}{\textcolor{green}} 72 | \newcommand{\keywd}{\textcolor{myorange}} 73 | \newcommand{\highlt}{\textcolor{NavyBlue}} 74 | \newcommand{\norm}[1]{\left\lVert#1\right\rVert} 75 | \def\ci{\perp\!\!\!\perp} 76 | % set beamer theme and color 77 | \usetheme{Frankfurt} 78 | %\usetheme{Berkeley} 79 | \usecolortheme{orchid} 80 | %\usecolortheme{seagull} 81 | 82 | %% modify the font 83 | %\usepackage{fontspec} 84 | %setting a font 85 | %\setsansfont{TeX Gyre Adventor} 86 | %\usepackage{newcent} 87 | 88 | %% fix the section title for literature 89 | \renewcommand{\bibsection}{\subsubsection*{\bibname } } 90 | 91 | \title[Probabilistic programming]{Introduction to probabilistic programming \\ (Introducing our new friend Edward)} 92 | \author[A. Richards]{Adam Richards \\ \ \\ \includegraphics[scale=0.05]{galvanize-logo.png}} 93 | \institute{} 94 | \date[]{01.04.2018} 95 | 96 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 97 | \begin{document} 98 | \frame{\titlepage} 99 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 100 | \frame{ 101 | \footnotesize 102 | \tableofcontents 103 | \normalsize 104 | } 105 | 106 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 107 | \section{Probabilistic Programming} 108 | \subsection{} 109 | 110 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 111 | \frame{ 112 | \frametitle{What are we going to cover?} 113 | \begin{block}{} 114 | \begin{itemize} 115 | \item Probabilistic programming intro 116 | \item Box's Loop (Model $\rightarrow$ infer $\rightarrow$ criticize) 117 | \item Bayesian inference and some related tools 118 | \item Examples in both PyMC3 and Edward (model) 119 | \item Discuss the tools used for inference and criticism 120 | \item Switchpoint and multilevel modeling examples 121 | \end{itemize} 122 | \end{block} 123 | } 124 | 125 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 126 | \frame{ 127 | \frametitle{Probabilistic programming} 128 | \footnotesize 129 | \begin{block}{A probabilistic programming language makes it easy to:} 130 | \begin{enumerate} 131 | \item Write out complex probability models 132 | \item Subsequently solve these models automatically 
133 | \end{enumerate} 134 | \end{block} 135 | 136 | \begin{block}{Generally this is accomplished as follows:} 137 | \begin{enumerate} 138 | \item Random variables are handled as a \href{https://en.wikipedia.org/wiki/Language\_primitive}{primitive} 139 | \item Inference is handled behind the scenes 140 | \item Memory and processor management is abstracted away 141 | \end{enumerate} 142 | \end{block} 143 | } 144 | 145 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 146 | \frame{ 147 | \frametitle{The pros and the cons} 148 | \footnotesize 149 | \textbf{Why you might want to use probabilistic programming} 150 | \begin{enumerate} 151 | \item \keywd{Customization} - We can create models that have built-in hypothesis tests 152 | \item \keywd{Propagation of uncertainty} - There is a degree of belief associated with prediction and estimation 153 | \item \keywd{Intuition} - The models are essentially 'white-box', which provides insight into our data 154 | \end{enumerate} 155 | \textbf{Why you might \highlt{NOT} want to use probabilistic programming} 156 | \begin{enumerate} 157 | \item \keywd{Deep dive} - Many of the online examples will assume a fairly deep understanding of statistics 158 | \item \keywd{Overhead} - Computational overhead might make it difficult to be production ready 159 | \item \keywd{Sometimes simple is enough} - The ability to customize models in almost a plug-n-play manner has to come with some cost. 160 | \end{enumerate} 161 | } 162 | 163 | 164 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 165 | \frame{ 166 | \frametitle{Doing data science} 167 | \footnotesize 168 | The taxonomy of what a data scientist does 169 | \begin{columns} 170 | 171 | \begin{column}{6cm} 172 | \begin{enumerate} 173 | \item Obtaining data 174 | \item Scrubbing data 175 | \item Exploring data 176 | \item Modeling data 177 | \item iNterpreting data 178 | \end{enumerate} 179 | \vspace{1cm} 180 | OSEMN is pronounced as \keywd{Awesome}. 181 | \end{column} 182 | \begin{column}{5cm} 183 | \begin{center} 184 | \includegraphics[scale=0.3]{data-decisions.pdf} 185 | \end{center} 186 | \end{column} 187 | \end{columns} 188 | 189 | 190 | \tiny 191 | \begin{flushleft} 192 | See Hilary Mason and Chris Wiggins' blog post \\ 193 | \href{http://www.dataists.com/2010/09/a-taxonomy-of-data-science}{http://www.dataists.com/2010/09/a-taxonomy-of-data-science} 194 | \end{flushleft} 195 | } 196 | 197 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 198 | \frame{ 199 | \frametitle{Probabilistic programming and data science} 200 | \footnotesize 201 | \begin{columns} 202 | \begin{column}{5.5cm} 203 | \begin{center} 204 | \includegraphics[scale=0.3]{data-decisions.pdf} 205 | \end{center} 206 | \end{column} 207 | 208 | \begin{column}{5.5cm} 209 | \begin{center} 210 | \includegraphics[scale=0.35]{boxs-loop.png} 211 | \end{center} 212 | \end{column} 213 | \end{columns} 214 | \tiny 215 | \vspace{1cm} 216 | \citep{Tran16,Blei14} 217 | } 218 | 219 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 220 | \frame{ 221 | \frametitle{Bayesian Inference} 222 | \large \highlt{Degree of belief} \\ \ \\ 223 | \footnotesize 224 | You are a skilled programmer, but bugs still slip into your code. After a particularly difficult implementation of an algorithm, you decide to test your code on a trivial example. It passes. You test the code on a harder problem. It passes once again. 
And it passes the next, \textit{even more difficult}, test too! You are \highlt{starting to believe} that there may be no bugs in this code... 225 | 226 | \begin{flushleft} 227 | \href{https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers}{Bayesian methods for hackers} 228 | \end{flushleft} 229 | 230 | \begin{center} 231 | \includegraphics[scale=0.25]{bmh.jpeg} 232 | \end{center} 233 | } 234 | 235 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 236 | \frame{ 237 | \frametitle{Some terminology} 238 | \footnotesize 239 | 240 | \begin{block}{Probabilistic model} 241 | A joint distribution $p(X, \theta)$ of data $X$ and latent variables $\theta$. 242 | \end{block} 243 | 244 | \begin{equation} 245 | P(\theta|x) = \frac{P(x|\theta)P(\theta)}{P(x)} 246 | \end{equation} 247 | 248 | \begin{itemize} 249 | \item \keywd{Prior} - $P(\theta)$ - one's beliefs about a quantity before being presented with evidence 250 | \item \keywd{Posterior} - $P(\theta|x)$ - probability of the parameters given the evidence 251 | \item \keywd{Likelihood} - $P(x|\theta)$ - probability of the evidence given the parameters 252 | \item \keywd{Normalizing constant} - $P(x)$ 253 | \end{itemize} 254 | 255 | \begin{itemize} 256 | \item $P(\theta)$: This big, complex code likely has a bug in it. 257 | \item $P(\theta|X)$: The code passed all X tests; there still might be a bug, but it is less likely now. 258 | \end{itemize} 259 | } 260 | 261 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 262 | \frame{ 263 | \frametitle{Related libraries and frameworks} 264 | Libraries with a high level of abstraction... 265 | \begin{center} 266 | Edward 267 | \includegraphics[scale=0.1]{edward.png} \hspace{1cm} 268 | \includegraphics[scale=0.3]{keras.png} 269 | \end{center} 270 | Computational framework libraries... 271 | \begin{center} 272 | \includegraphics[scale=0.3]{tensorflow.jpeg} \hspace{1cm} 273 | \includegraphics[scale=0.3]{theano.jpeg} \hspace{1cm} 274 | \includegraphics[scale=0.3]{torch.jpeg} 275 | \end{center} 276 | } 277 | 278 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 279 | \frame{ 280 | \frametitle{The three main probabilistic programming libraries} 281 | \begin{center} 282 | Edward 283 | \includegraphics[scale=0.15]{edward.png} \hspace{1cm} 284 | \includegraphics[scale=0.35]{stan.jpeg} \hspace{1cm} \\ \ \\ 285 | \includegraphics[scale=0.35]{pymc3.png} 286 | \end{center} 287 | } 288 | 289 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 290 | \section{Model, Inference, Criticism} 291 | \subsection{} 292 | 293 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 294 | \begin{frame}[fragile] 295 | \frametitle{PyMC3} 296 | \footnotesize 297 | \begin{code} 298 | import pymc3 as pm 299 | \end{code} 300 | 301 | \begin{itemize} 302 | \item Developed by John Salvatier, Thomas Wiecki, and Christopher Fonnesbeck \citep{Salvatier16} 303 | \item Comes with \href{https://github.com/pymc-devs/pymc3/tree/master/pymc3/examples}{loads of good examples} 304 | \item The API is not backwards compatible with models specified in PyMC2 305 | \item Can still be run in Python 2.7+ 
306 | \end{itemize} 307 | 308 | \highlt{Basic workflow} 309 | \begin{enumerate} 310 | \item Define hyperpriors 311 | \item Open a model context 312 | \item Perform inference 313 | \end{enumerate} 314 | \end{frame} 315 | 316 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 317 | \begin{frame}[fragile] 318 | \frametitle{Edward} 319 | \footnotesize 320 | \begin{block}{} 321 | Edward is named after the statistician \highlt{George Edward Pelham Box}. Its design is based on his philosophy of statistics. 322 | \end{block} 323 | First gather data from some real-world phenomena. \ \\ 324 | Then cycle through \highlt{Box’s loop} 325 | \begin{center} 326 | \includegraphics[scale=0.35]{boxs-loop.png} 327 | \end{center} 328 | \citep{Tran16} 329 | \end{frame} 330 | 331 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 332 | \begin{frame}[fragile] 333 | \frametitle{Generalized procedure for probabilistic programming} 334 | \footnotesize 335 | \begin{block}{Model, Inference, Criticism} 336 | \begin{enumerate} 337 | \item Build a probabilistic model of the phenomena 338 | \item Reason about the phenomena given model and data 339 | \item Criticize the model, revise and repeat 340 | \end{enumerate} 341 | \end{block} 342 | 343 | A child flips a coin ten times... 344 | \begin{code} 345 | [0, 1, 0, 0, 0, 0, 0, 0, 0, 1] 346 | \end{code} 347 | 348 | \begin{enumerate} 349 | \item She is interested in the probability that the coin lands on heads 350 | \item Given that she only sees heads and tails, she reasons that a binomial distribution might be appropriate 351 | \item She finally analyzes whether her model captures the real-world phenomenon of coin flips (she may revise the model 352 | and repeat) 353 | \end{enumerate} 354 | \citep{Tran16} 355 | \end{frame} 356 | 357 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 358 | \frame{ 359 | \begin{center} 360 | \includegraphics[scale=0.32]{coin_flip.png} 361 | \end{center} 362 | } 363 | 364 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 365 | \begin{frame}[fragile] 366 | \frametitle{Coin flip in PyMC3} 367 | \footnotesize 368 | \begin{code} 369 | import pymc3 as pm 370 | 371 | n,h,alpha,beta,niter = 100,61,2,2,1000 372 | 373 | # context management 374 | with pm.Model() as model: 375 | p = pm.Beta('p', alpha=alpha, beta=beta) 376 | y = pm.Binomial('y', n=n, p=p, observed=h) 377 | 378 | start = pm.find_MAP() 379 | step = pm.Metropolis() 380 | trace = pm.sample(niter, step, start) 381 | \end{code} 382 | 383 | Data $\rightarrow$ Model context $\rightarrow$ Priors $\rightarrow$ Likelihood $\rightarrow$ Sampler $\rightarrow$ Inference 384 | \vspace{0.5cm} 385 | \\ \noindent To the notebooks! (A quick sketch of inspecting the posterior follows on the next slide.) 
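\end{frame}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{frame}[fragile]
\frametitle{Peeking at the coin-flip posterior}
\footnotesize
A minimal sketch of how we might summarize the trace from the previous slide (it assumes the coin-flip model above has been run; the burn-in of 200 is an illustrative choice):
\begin{code}
import numpy as np

# discard early samples as burn-in, then summarize p
p_samples = trace['p'][200:]
print('posterior mean of p: %.3f' % np.mean(p_samples))
print('95 percent interval:', np.percentile(p_samples, [2.5, 97.5]))
\end{code}
With 61 heads in 100 flips and a weak Beta(2,2) prior, we expect the posterior to concentrate around 0.6.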
386 | \end{frame} 387 | 388 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 389 | \begin{frame}[fragile] 390 | \frametitle{Bayesian Linear Regression} 391 | \scriptsize 392 | For a set of $N$ data points $(\mathbf{X},\mathbf{y})=\{(\mathbf{x}_n, y_n)\}$, the model can be specified as follows: 393 | 394 | \begin{align} 395 | p(\mathbf{w}) &= Normal(\mathbf{w}|\mathbf{0},\sigma_{w}^{2} \mathbf{I})\\ 396 | p(b) &= Normal(b|0,\sigma^{2}_{b}) \\ 397 | p(\mathbf{y}|\mathbf{w},b,\mathbf{X}) &= \prod^{N}_{n=1} Normal(y_{n}|\mathbf{x}_{n}^{T} \mathbf{w} + b,\sigma^{2}_{y}) 398 | \end{align} 399 | 400 | \begin{itemize} 401 | \item The latent variables are the linear model’s weights $\mathbf{w}$ and intercept $b$ 402 | \item We assume the prior and likelihood variances are known: $\sigma_{w}^{2}$, $\sigma_{b}^{2}$ and $\sigma^{2}_{y}$ 403 | \item The mean of the likelihood is given by a linear transformation of the inputs in $\mathbf{x}_{n}$ 404 | \end{itemize} 405 | \citep{Murphy12} 406 | \end{frame} 407 | 408 | % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 409 | % \frame{ 410 | % \frametitle{Bayes linear regression in Edward} 411 | % \begin{center} 412 | % \includegraphics[scale=0.5]{../edward/bayes-linreg.png} 413 | % \end{center} 414 | % } 415 | 416 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 417 | \frame{ 418 | \frametitle{Bayes linear regression in PyMC3} 419 | \begin{center} 420 | \includegraphics[scale=0.3]{../pymc3/bayes-lin-reg.png} 421 | \end{center} 422 | } 423 | 424 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 425 | \frame{ 426 | \frametitle{Markov chain Monte Carlo (MCMC)} 427 | \footnotesize 428 | \keywd{MCMC} 429 | \begin{itemize} 430 | \item It is a family of algorithms for obtaining a sequence of random samples from a probability distribution for which direct sampling is difficult. 431 | \item The sequence can then be used to approximate the distribution 432 | \item It allows for inference on complex models 433 | \end{itemize} 434 | 435 | A particularly useful class of MCMC, known as Hamiltonian Monte Carlo, requires \href{https://en.wikipedia.org/wiki/Gradient}{gradient} information, which is often not readily available, so PyMC3 uses Theano to get around this problem. Something that has recently made this whole field a lot more interesting is the No-U-Turn Sampler (NUTS), because it brings \highlt{self-tuning strategies} \citep{Hoffman14}. 436 | \\ \ \\ 437 | One of the really nice things about probabilistic programming is that \highlt{you do not have to know how inference is performed}, but it can be useful. 
438 | 439 | \begin{itemize} 440 | \item \href{http://twiecki.github.io/blog/2015/11/10/mcmc-sampling/}{MCMC for Dummies} 441 | \item \href{https://arxiv.org/pdf/1206.1901.pdf}{More on Hamiltonian MCMC (Not for dummies)} 442 | \item \href{http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/}{How to animate MCMC (for everyone)} 443 | \end{itemize} 444 | } 445 | 446 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 447 | \frame{ 448 | \frametitle{There are a lot of tools for inference available} 449 | \footnotesize 450 | \begin{center} 451 | \includegraphics[scale=0.4]{inference-graph.png} 452 | \end{center} 453 | %\vspace{1cm} 454 | \citep{Tran16} 455 | } 456 | 457 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 458 | \begin{frame}[fragile] 459 | \footnotesize 460 | 461 | \begin{block}{Bayes Factor} 462 | It is the \keywd{Bayesian way} to compare models. This is done by computing the marginal likelihood of each model: 463 | \begin{equation} 464 | BF = \frac{p(y|M_{0})}{p(y|M_{1})} 465 | \end{equation} 466 | \highlt{Can be used to replace the p-value} 467 | \end{block} 468 | 469 | \begin{block}{Posterior predictive checks} 470 | Analyze the degree to which data generated from the model deviate from data generated from the true distribution. Can be used for hypothesis testing, model comparison, model selection, model averaging, and to validate test data. 471 | \end{block} 472 | 473 | Links 474 | \begin{itemize} 475 | \item \href{http://docs.pymc.io/notebooks/Bayes\_factor.html}{Bayes Factor in PyMC3} 476 | \item \href{http://docs.pymc.io/notebooks/posterior\_predictive.html}{Posterior predictive checks in PyMC3} 477 | \item \href{https://replicationindex.wordpress.com/2015/04/30/replacing-p-values-with-bayes-factors-a-miracle-cure-for-the-replicability-crisis-in-psychological-science/}{replacing p-values} 478 | \item \href{http://docs.pymc.io/notebooks/BEST.html}{Bayesian estimation supersedes the t-test} 479 | \end{itemize} 480 | \end{frame} 481 | 482 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 483 | \section{Decisions, Decisions} 484 | \subsection{} 485 | 486 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 487 | \begin{frame}[fragile] 488 | \footnotesize 489 | \begin{block}{} 490 | We can never validate whether a model is true. 491 | \\ \ \\ 492 | \keywd{In practice, all models are wrong -George Box}. 493 | \\ \ \\ 494 | However, we can try to uncover where the model goes wrong. Model criticism helps justify the 495 | model as an approximation or point to good directions for revising the model. 496 | \end{block} 497 | \vspace{1cm} 498 | Posterior predictive checks (PPCs) analyze the degree to which data generated from the model deviate 499 | from data generated from the true distribution. They can be used either numerically to quantify this 500 | degree, or graphically to visualize this degree. 501 | \end{frame} 502 | 503 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 504 | \begin{frame}[fragile] 505 | \footnotesize 506 | \frametitle{MSE and MAE} 507 | Model evaluation has become so much easier... 
508 | \vspace{1cm} 509 | \begin{code} 510 | print("Mean squared error on test data:") 511 | print(ed.evaluate('MSE', data={X: X_test, y_post: y_test})) 512 | \end{code} 513 | 514 | \begin{code} 515 | print("Mean absolute error on test data:") 516 | print(ed.evaluate('MAE', data={X: X_test, y_post: y_test})) 517 | \end{code} 518 | \vspace{1cm} 519 | Note: RMSE gives a relatively high weight to large errors. This means the RMSE should be more useful when large errors are particularly undesirable. 520 | \end{frame} 521 | 522 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 523 | \begin{frame}[fragile] 524 | \begin{block}{Switchpoint analysis} 525 | \footnotesize 526 | \begin{itemize} 527 | \item \href{http://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1\_Introduction/Ch1\_Introduction\_PyMC3.ipynb}{Probabilistic Programming and Bayesian Methods for Hackers \\ Text Messages} 528 | \item \href{http://docs.pymc.io/notebooks/getting_started.html}{PyMC3 Documentation \\ Coal Mining Disasters} 529 | \end{itemize} 530 | \end{block} 531 | \end{frame} 532 | 533 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 534 | \begin{frame}[fragile] 535 | \frametitle{Switchpoint analysis applications} 536 | \begin{block}{} 537 | \begin{itemize} 538 | \item Did the change actually happen? 539 | \item Did change X on January 1 affect the number of sales we have seen in our company? 540 | \item Did imposing a new safety policy last summer reduce the number of accidents? 541 | \item After those staffing decisions, how long did it take before revenue was affected? 542 | \item Based on my time-series forecasting model, when will a change actually occur? 543 | \item ... 544 | \end{itemize} 545 | \end{block} 546 | \begin{block}{} 547 | If $p$-values are your tool of choice, you could try Bayes factors to get numeric values on which to base decisions. 548 | \end{block} 549 | \end{frame} 550 | 551 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 552 | \begin{frame}[fragile] 553 | \begin{block}{Multilevel modeling} 554 | Multilevel models (also known as hierarchical linear models, nested data models, mixed models, random coefficient models, random-effects models, random parameter models, or split-plot designs) are statistical models of parameters that vary at more than one level. 555 | \end{block} 556 | \vspace{1cm} 557 | \begin{itemize} 558 | \item \href{http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/}{PyMC3 - Radon example GLM style} 559 | \item \href{http://edwardlib.org/tutorials/linear-mixed-effects-models}{Edward - Instructor evaluation ratings} 560 | \end{itemize} 561 | \vspace{1cm} 562 | \href{https://en.wikipedia.org/wiki/Multilevel\_model}{https://en.wikipedia.org/wiki/Multilevel\_model} 563 | \end{frame} 564 | 565 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 566 | \begin{frame}[fragile] 567 | \frametitle{A note on deep learning...} 568 | \begin{enumerate} 569 | \item Neural Networks compute point estimates 570 | \item They tend to be overly confident about class prediction 571 | \item They are prone to overfitting 572 | \item They have many parameters that may require tuning 573 | \end{enumerate} 574 | 575 | \begin{block}{} 576 | How confident am I in my prediction---can I quantify the uncertainty? 
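\end{block}

\keywd{Bayesian neural network}: a neural network with a prior distribution on its weights.

A rough sketch of the core idea (an illustrative PyMC3-style fragment; the layer sizes and priors are made-up assumptions, not a full working model):
\begin{code}
import pymc3 as pm

n_features, n_hidden = 5, 3  # made-up sizes

with pm.Model() as bnn_sketch:
    # priors on the weights replace the point
    # estimates of a classical network
    w_in = pm.Normal('w_in', mu=0, sd=1,
                     shape=(n_features, n_hidden))
    w_out = pm.Normal('w_out', mu=0, sd=1, shape=n_hidden)
\end{code}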
581 | \tiny 582 | \href{https://www.youtube.com/watch?v=I09QVNrUS3Q}{https://www.youtube.com/watch?v=I09QVNrUS3Q} 583 | \citep{Tran17} 584 | \end{frame} 585 | 586 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 587 | \frame{ 588 | \frametitle{Simple Neural Network using PyMC3} 589 | \footnotesize 590 | \begin{center} 591 | \includegraphics[scale=0.2]{../pymc3/nn-1.png} 592 | \includegraphics[scale=0.2]{../pymc3/nn-2.png} 593 | \end{center} 594 | } 595 | 596 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 597 | \frame{ 598 | \frametitle{Another take on probabilistic programming} 599 | \footnotesize 600 | \begin{block}{} 601 | Another way of thinking about this: unlike a traditional program, which only runs in the forward direction, \highlt{a probabilistic program is run in both the forward and backward direction}. It runs forward to compute the consequences of the assumptions it contains about the world (i.e., the model space it represents), but it also runs backward from the data to constrain the possible explanations. In practice, many probabilistic programming systems will cleverly interleave these forward and backward operations to efficiently home in on the best explanations. 602 | \end{block} 603 | \vspace{1cm} 604 | \href{https://plus.google.com/u/0/107971134877020469960/posts/KpeRdJKR6Z1}{Why Probabilistic Programming matters? (Beau Cronin)} 605 | } 606 | 607 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 608 | \frame{ 609 | \frametitle{What did we cover again?} 610 | \begin{block}{} 611 | \begin{itemize} 612 | \item[\checkmark] Probabilistic programming intro 613 | \item[\checkmark] Box's Loop (Model $\rightarrow$ infer $\rightarrow$ criticize) 614 | \item[\checkmark] Bayesian inference and some related tools 615 | \item[\checkmark] Examples in both PyMC3 and Edward (model) 616 | \item[\checkmark] Discuss the tools used for inference and criticism 617 | \item[\checkmark] Switchpoint and multilevel modeling examples 618 | \end{itemize} 619 | \end{block} 620 | } 621 | 622 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 623 | \frame{ 624 | \frametitle{Where to go from here} 625 | \footnotesize 626 | 627 | Examples, examples, examples... 
628 | \begin{itemize} 629 | \item \href{http://edwardlib.org/tutorials/}{Edward examples} 630 | \item \href{https://github.com/pymc-devs/pymc3}{PyMC3 examples} 631 | \item \href{http://pymc-devs.github.io/pymc3/notebooks/getting_started.html}{PyMC3 Getting started guide} 632 | \item \href{https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers}{Bayesian methods 633 | for hackers} 634 | \item \href{http://twiecki.github.io}{Blog by Thomas Wiecki} 635 | \item \href{http://www.amazon.com/Doing-Bayesian-Analysis-Second-Edition/dp/0124058884/ref=dp_ob_title_bk}{Doing Bayesian Data Analysis by John Kruschke} 636 | \item \href{https://github.com/markdregan/Bayesian-Modelling-in-Python}{Resource by Mark Dregan} 637 | \item This is \href{http://www.kdnuggets.com/2016/12/datascience-introduction-bayesian-inference.html}{a nice intro to Bayesian thinking done on kdnuggets (using PyMC3)} 638 | \end{itemize} 639 | 640 | There is also \href{https://github.com/stan-dev/example-models/wiki}{PyStan} (\href{http://www.stat.columbia.edu/~gelman/research/unpublished/stan-resubmit-JSS1293.pdf}{Stan paper}) 641 | } 642 | 643 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 644 | \frame{ 645 | \frametitle{Galvanize} 646 | \footnotesize 647 | \begin{block}{Upcoming events} 648 | \begin{itemize} 649 | \item \keywd{Essential Mathematical Foundations for Data Science} (January 08-10) 650 | \item \keywd{Intro to Tableau for Data Science} (January 11th) 651 | \item \keywd{Statistics short-course} (January 22-24) 652 | \item \keywd{Intro to Python - Part time Course} (February-March) 653 | \item \keywd{Data Science Immersive} (February-May) 654 | \end{itemize} 655 | \end{block} 656 | } 657 | 658 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 659 | \frame[allowframebreaks]{ 660 | \frametitle{References} 661 | \begin{tiny} \bibliography{pp.bib} 662 | \bibliographystyle{apalike} % Style BST file 663 | \end{tiny} 664 | } 665 | 666 | \end{document} -------------------------------------------------------------------------------- /slides/pymc3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/pymc3.png -------------------------------------------------------------------------------- /slides/stan.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/stan.jpeg -------------------------------------------------------------------------------- /slides/tensorflow.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/tensorflow.jpeg -------------------------------------------------------------------------------- /slides/theano.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/theano.jpeg -------------------------------------------------------------------------------- /slides/torch.jpeg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/GalvanizeOpenSource/probabilistic-programming-intro/b6036e8dcc3aae4a9517e84ef93428a300be918b/slides/torch.jpeg --------------------------------------------------------------------------------