├── Challenges ├── BrakingDAG.JPG ├── Breaking.ipynb ├── BreakingData.csv ├── ChallengeWeekend.pdf ├── CliffWalking.ipynb ├── CliffWalkingDiagram.JPG └── GeneralPolicyInterationDIagram.pdf ├── Homework ├── FiguresHW6.pptx ├── GSPC_Yahoo_Oct_8_19.csv ├── GridWorldFactory.JPG ├── Homework1.ipynb ├── Homework2.ipynb ├── Homework3.ipynb ├── Homework4.ipynb ├── Homework5.ipynb ├── Homework6.ipynb ├── Homework7.ipynb ├── Homework8.ipynb ├── Influence.jpg ├── InfluenceSample.JPG ├── MurderDirected.JPG └── OptimalPlansOnGridWorld.JPG ├── Lesson0_Introduction_Probability ├── IntroductionToProbabalisticProgramming.ipynb ├── Review_of_Probability.ipynb └── img │ ├── AIModel.JPG │ └── AgentModel.JPG ├── Lesson1_Bayes_Networks ├── 1_DirectedGraphicalModels.pdf ├── 1_Introduction.pdf ├── IntroductionToBayesNetworks.ipynb └── img │ ├── BayesBall.JPG │ ├── Dependency.JPG │ ├── GraphTypes.JPG │ ├── Independencies2.JPG │ ├── LetterDAG.JPG │ └── Representation.JPG ├── Lesson2_MarkovNetworks ├── 2_MarkovGraphs.pdf ├── IntroductionToMarkovNetworks.ipynb └── img │ ├── Cliques1.JPG │ ├── Cliques2.JPG │ ├── DAGvsMN.JPG │ ├── Diagrams.pptx │ ├── LetterCliques.JPG │ ├── LetterDAG.JPG │ ├── MarkovBlanket.JPG │ ├── MoralizedGraph.JPG │ ├── MoralizedLetter.JPG │ ├── NoDAG.JPG │ ├── Representation.JPG │ └── Separation.JPG ├── Lesson3_ExactInference ├── 3_ExactInference.pdf ├── ExactInferenceMethods.ipynb └── img │ ├── Chain1.JPG │ ├── CliqueTree.JPG │ ├── Collect.JPG │ ├── DAGFactor.JPG │ ├── Distribute.JPG │ ├── Eliminate1.JPG │ ├── Eliminate2.JPG │ ├── Eliminate3.JPG │ ├── Eliminate4.JPG │ ├── EliminationTree.JPG │ ├── FactorToVar.JPG │ ├── FourCycle.JPG │ ├── Inference.JPG │ ├── LetterDAG.JPG │ ├── Loopy.JPG │ ├── MarkovBlanket.JPG │ ├── MarkovFactor.JPG │ ├── Moralized.JPG │ ├── MultiConnected.JPG │ ├── MultiConnectedMoralized.JPG │ ├── StudentGraph.JPG │ ├── SumProductTree.JPG │ ├── Triangle1.JPG │ ├── Triangle2.JPG │ ├── Triangulated.JPG │ ├── Undirected1.JPG │ └── VarToFactor.JPG ├── Lesson4_Learning ├── 4_LearningInGraphicalModels.pdf ├── IntroductionToLearningModelParameters.ipynb └── img │ ├── Factorizing.JPG │ ├── Learning.JPG │ ├── LetterDAG.JPG │ └── PlateDiagram.JPG ├── Lesson5_Learning_Part2 ├── IntroductiontoLearningModelStructure.ipynb ├── StudentSimulation.py └── img │ ├── FullyConneted.JPG │ ├── Learning.JPG │ └── LetterDAG.JPG ├── Lesson6_Utility_and_Decision_Trees ├── 6_DecisionsUnderUncertainty.pdf ├── IntroductionToUtilityDecisionTrees.ipynb └── img │ ├── AgentEnvironment.JPG │ ├── Bridge.JPG │ ├── RandomVariable.JPG │ └── Utility.JPG ├── Lesson7_MarkovDecisionProcesses ├── 7_MarkovProcesses.pdf ├── IntroductionToMarkovDecisionProcesses.ipynb ├── IntroductionToRL.pdf ├── MarkovDecisionProcesses.pdf └── img │ ├── AgentEnvironment.JPG │ ├── CarRewards.JPG │ └── CarStates.JPG ├── Lesson8_LVM_Variational ├── 8_LatentVariable_VariationalMethods.pdf └── Introduction_LVM_Variational.ipynb ├── Lesson9_IntroductionToDynamicProgramming ├── 9_IntroToDynamicProgramming.pdf ├── Introduction_To_Dynamic_Programming.ipynb └── img │ ├── ActionValueBackup.JPG │ ├── Backups.JPG │ ├── DPAgent.JPG │ ├── GPI.JPG │ ├── GridWorld.JPG │ ├── OptimalPath.JPG │ └── ValueBackup.JPG ├── Lesson_10_Bandit_Problems ├── 10_Bandits.pdf ├── 10_IntroductionToRL.pdf ├── Introduction_to_Bandit_Problems.ipynb └── img │ ├── BanditAgent.JPG │ ├── OneArmedBandit.jpg │ └── multiarmedbandit.jpg ├── Lesson_11_MonteCarloReinforcementLearning ├── 11_MC_RL.pdf ├── IntroToMonteCarloReinforcementLearning.ipynb └── img │ ├── GPI.JPG │ ├── 
GridWorld.JPG │ ├── MC_Backup.JPG │ └── RL_AgentModel.JPG ├── Lesson_12_TDandQLearning ├── 12_TD_Q_Learning.pdf ├── IntroductionToQLearning.ipynb ├── IntroductionToTDLearning.ipynb └── img │ ├── GridWorld.JPG │ ├── Q-Learning.JPG │ ├── RL_AgentModel.JPG │ ├── SARSA.JPG │ ├── SARSAN.JPG │ ├── TD0.JPG │ └── TDN.JPG ├── Lesson_13_DeepFunctionApproximation ├── 13_OverviewOfDL.pdf ├── GettingStartedKeras.ipynb ├── IntrotoDeepFunctionApproximation.ipynb └── img │ ├── Accuracy-Layers.JPG │ ├── Accuracy-Parameters.JPG │ ├── CompGraph1.JPG │ ├── CompGraph2.JPG │ ├── Hidden.JPG │ ├── L2.jpg │ ├── Lesson1Figures.pptx │ ├── LinearNetwork.JPG │ ├── LossGraph.JPG │ ├── MachineIntelligence.JPG │ ├── Preceptron.JPG │ ├── SVD.png │ └── Tikhonov_board.jpg ├── Lesson_14_FunctionApproximation ├── 14_FunctionAppoxForRL.pdf ├── DeepQLearning.ipynb ├── IntroToFunctionApproximationForRL.ipynb └── img │ ├── AgentEnvironment.JPG │ ├── DQN.JPG │ ├── ReplayBuffer.JPG │ ├── Tile1.JPG │ └── Tile2.JPG ├── Lesson_15_PolicyGradient ├── 15_PolicyGradient.pdf └── IntroductionToPolicyGradients.ipynb ├── Models ├── insurance.bif └── mildew.bif └── README.md /Challenges/BrakingDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Challenges/BrakingDAG.JPG -------------------------------------------------------------------------------- /Challenges/Breaking.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Challenge Assignment\n", 8 | "## Autonomous Vehicle Breaking\n", 9 | "\n", 10 | "## CSCI E-82A\n", 11 | "\n", 12 | ">**Make sure** you include your name along with the name of your team and team members in the notebook you submit." 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "**Your name and team name here:** " 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "## Introduction\n", 27 | "\n", 28 | "As is typically the case with robots, autonomous vehicles use multiple sensors. Agent actions rely on integrating the precepts of these sensors. A number of methods are used to integrate uncertain information from sensors. Directed graphical models are a powerful and flexible approach to sensor integration. \n", 29 | "\n", 30 | "Another difficulty for autonomous vehicles, and many other robotics problems, is uncertainty about the environment. In the case of autonomous vehicles the uncertainty can include road conditions, visibility conditions, and the actions of human drivers and pedestrians. \n", 31 | "\n", 32 | "Directed graphical models provide a powerful representation for reasoning with uncertain sensor data and uncertainty in the environment. In this challenge you will perform learning and inference on a directed graphical model of braking for an autonomous vehicle. The goal of the agent is to control the braking of the autonomous vehicle to avoid collisions.\n", 33 | "\n", 34 | "The control of an actual autonomous vehicle is extremely complicated. Autonomous vehicles use many task-specific sensors. Further, any useful model has a large number of variables, many with complex continuous distribution models (e.g. mixture models). For this challenge, the number of senors and variables has been limited. 
Further, all distributions have simple Binomial posterior distributions.\n", 35 | "\n", 36 | "A practical autonomous vehicle would have a very low probability of collision with an object in its path; e.g. $p(collision) < 10^{-7}$. Since such small probability values are hard to work with, for this challenge, the probabilities of sensor errors and collisions are unrealistically high. " 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "### Description of Problem\n", 44 | "\n", 45 | "A Directed Acyclic Graph (DAG) of the autonomous vehicle breaking decision model is show below. There are 11 variable, two utility function and one decision node. " 46 | ] 47 | }, 48 | { 49 | "cell_type": "markdown", 50 | "metadata": {}, 51 | "source": [ 52 | "\"Drawing\"\n", 53 | "
DAG for the autonomous vehicle braking control problem
" 54 | ] 55 | }, 56 | { 57 | "cell_type": "markdown", 58 | "metadata": {}, 59 | "source": [ 60 | "#### Variables \n", 61 | "\n", 62 | "The variables for the joint probability distribution are:\n", 63 | "\n", 64 | "1. **Road Condition** is the condition of the road surface; 0 = good, 1 = slippery, eg. wet or icy.\n", 65 | "2. **Weather Visibility** is the optical (visual) visibility for the visual sensor; 0 = good, 1 = poor, eg. rain of fog. \n", 66 | "3. **Light Dark** is the lighting conditions for the road ahead; 0 = good, 1 = poor or dark.\n", 67 | "4. **Object** indicates an object in the vehicle's path; 0 = no object, 1 = object in path. \n", 68 | "5. **Road Condition Detection** is the conditional probability of the reading (precept) from a sensor that determines road condition, given the Road Condition variable; 0 = good, 1 = slippery, eg. wet or icy.\n", 69 | "6. **Weather Detection** is the conditional probability of the reading (precept) from a sensor that determines weather visibility, given the Weather Visibility Variable; 0 = good, 1 = poor, eg. rain of fog.\n", 70 | "7. **Visual Sensor Detection** is the conditional probability that the visual sensors see an object in the vehicle's path, or sense a non-existent object (false positive), give the Weather Visibility, Light Dark and Object variables; 0 = no object, 1 = object in path. \n", 71 | "8. **LIDAR Sensor Detection** is the conditional probability that the LIDAR sensor see an object in the vehicle's path, or sense a non-existent object (false positive), give the Object variable; 0 = no object, 1 = object in path. LIDAR uses infrared lasers for imaging and ranging. LIDAR is much less affected by rain and fog, but has lower resolution, when compared to a visual (optical) sensor. \n", 72 | "9. **Sensor Detection** is the integrated posterior distribution of an object being in the vehicle's path, or sensing a non-existent object (false positive), given the precepts of the Weather Detection, Dark Light, Visual Sensor Detection and LIDAR Sensor Detection variables; ; 0 = no object, 1 = object in path.\n", 73 | "10. **Early Breaking** is the conditional probability that the autonomous vehicle should apply breaks early to avoid a collision given the Road Condition Detection and Sensor Detection variables; 0 = normal breaking, 1 = apply early breaking. Early breaking should reduce the chances of collision but incurs a cost in terms of delay of the vehicle and other traffic. \n", 74 | "11. **Collision** is the conditional probability of the vehicle colliding with an object given the Object, Road Condition Detection and Sensor Detection variables; 0 = no collision, 1 = collision. " 75 | ] 76 | }, 77 | { 78 | "cell_type": "markdown", 79 | "metadata": {}, 80 | "source": [ 81 | "#### Utility Functions\n", 82 | "\n", 83 | "The utility functions for this problem are: \n", 84 | "\n", 85 | "- Utility of applying breaking early:\n", 86 | "\n", 87 | "| | No Early Breaking | Early Breaking |\n", 88 | "|----|----|----|\n", 89 | "|Utility | 0 | -1 |\n", 90 | "\n", 91 | "\n", 92 | "- Utility of collision: \n", 93 | "\n", 94 | "| | No Collision | Collision |\n", 95 | "|----|----|----|\n", 96 | "|Utility | 0 | -1000 |\n", 97 | "\n", 98 | "Total utility for this problem is the sum of Early Breaking utility and Collision utility. " 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "#### Decision Node\n", 106 | "\n", 107 | "There is one decision node in this problem, Early Breaking. 
This decision node is implemented as evidence for the Early Breaking variable; 0 = normal breaking, 1 = early breaking. " 108 | ] 109 | }, 110 | { 111 | "cell_type": "markdown", 112 | "metadata": {}, 113 | "source": [ 114 | "## Instructions\n", 115 | "\n", 116 | "In this challenge you and your team will do the following: \n", 117 | "\n", 118 | "### 1. Load Dataset \n", 119 | "Load the required packages and the csv file of 5,000 cases into a Pandas data frame. \n", 120 | "\n", 121 | "> **Hint:** Carefully examine the variable names. You will need to make sure that you use these names to construct your model. " 122 | ] 123 | }, 124 | { 125 | "cell_type": "markdown", 126 | "metadata": {}, 127 | "source": [ 128 | "### 2. Define the Graphical Model \n", 129 | "\n", 130 | "Using the tools in pgmpy define the BayesianModel class object for this problem. " 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "### 3. Factorization of Distribution\n", 138 | "\n", 139 | "Using Markdown write out the joint distribution and the factorization defined in the graphical model. You may use some abbreviations for the long variable names. \n", 140 | "\n", 141 | "How many states are there in the joint distribution of 11 Binomially distributed variables. \n", 142 | "\n", 143 | "How many states are in the factorized distribution?" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "**ANS**: \n", 151 | "\n", 152 | "**ANS**:\n", 153 | "\n", 154 | "**ANS**: " 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "### 4. Verify Independencies\n", 162 | "\n", 163 | "With the skeleton of your DAG defined, you can verify the indepenencies. To simplify this problem use the pgmpy local_independencies method. Recall that the local independencies are independencies within the Markov blanket of each variable. \n", 164 | "\n", 165 | "Are the independencies you find consistent with your factorization of the distribution and why?" 166 | ] 167 | }, 168 | { 169 | "cell_type": "markdown", 170 | "metadata": {}, 171 | "source": [ 172 | "**ANS**:" 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "### 5. Maximum Likelihood Estimation of Model Parameters\n", 180 | "\n", 181 | "Next, use pgmy to perform maximum likelihood estimation of the model parameters using the dataset provided. \n", 182 | "\n", 183 | "Print the CPDs and carefully examine the results. Notice that some of the probabilities for the Collision variable are either 0.0 or 1.0. Is this reasonable and why?" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "metadata": {}, 189 | "source": [ 190 | "**ANS**:" 191 | ] 192 | }, 193 | { 194 | "cell_type": "markdown", 195 | "metadata": {}, 196 | "source": [ 197 | "### 6. Queries\n", 198 | "\n", 199 | "With parameters fit, you are ready to perform queries on your model. Using the pgmpy VariableElimination function, perform the queries specified in the table below, computing the total utility, examine the results, and answer the questions. 
\n", 200 | "\n", 201 | "| Query | Query Variables | Evidence |\n", 202 | "|:----|:----|:----|\n", 203 | "|1 | Collision, Early_Breaking | Road_Condition = 0 |\n", 204 | "|2 | Collision, Early_Breaking | Road_Condition = 1 |\n", 205 | "|3 | Collision, Early_Breaking | Light_Dark = 0, Weather_Visibility = 0 |\n", 206 | "|4 | Collision, Early_Breaking | Light_Dark = 1, Weather_Visibility = 1 |\n", 207 | "|5 | Collision | Early_Breaking = 0, Object = 1, Road_Condition = 1 |\n", 208 | "|6 | Collision | Early_Breaking = 1, Object = 1, Road_Condition = 1 |\n", 209 | "|7 | Collision | Early_Breaking = 0, Object = 1, Light_Dark = 1, Weather_Visibility = 1 |\n", 210 | "|8 | Collision | Early_Breaking = 1, Object = 1, Light_Dark = 1, Weather_Visibility = 1 |\n", 211 | "\n", 212 | "**Q 1:** Compare the probability of Collision, Early Breaking and total utility for the different values of evidence specified in queries 1 and 2. Are these values consistent with what you expect and why? \n", 213 | "\n", 214 | "**Q 2:** Compare the probability of Collision, Early Breaking and total utility for the different values of the evidence variables specified in queries 3 and 4. Are these values significantly different? Given how the sensor data is integrated, do these differences seem reasonable?\n", 215 | "\n", 216 | "**Q 3:** Compare the probability of Collision and total utility for the different values of the evidence variables specified in queries 5 and 6. Are these values consistent with what you expect and why? \n", 217 | "\n", 218 | "**Q 4** Compare the probability of Collision and total utility for the different values of the evidence variables specified in queries 7 and 8. Are these values consistent with what you expect and why? \n", 219 | "\n", 220 | "> **Note:** You cannot perform a query on an evidence variable. When Early_Breaking is evidence, make sure it is not a query variable. " 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "**ANS 1**:\n", 228 | "\n", 229 | "**ANS 2**:\n", 230 | "\n", 231 | "**ANS 3**:\n", 232 | "\n", 233 | "**ANS 4**:" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "### 7. Bayesian Estimation of Model Parameters\n", 241 | "\n", 242 | "Next, use pgmy to perform Bayesian estimation of the model parameters using the dataset provided. The pseudo counts for the prior are given in a cell below. \n", 243 | "\n", 244 | "**Q 1:** The prior is generally weak, with the exception of one variable. Which variable has a strong prior and do you think using this strong prior is reasonable in the interest of improving safety and why? \n", 245 | "\n", 246 | "**Q 2:** Print the CPDs and carefully examine the results. Pay particular attention to the Early_Breaking variable, comparing the values to the values obtained with maximum likelihood estimation. Is this difference reasonable given the prior and why?\n", 247 | "\n", 248 | "**Q 3:** Finally, compare the estimated values for the Collision variable with the ones found using Maximum likelihood estimation. Are these differences expected and why? 
" 249 | ] 250 | }, 251 | { 252 | "cell_type": "markdown", 253 | "metadata": {}, 254 | "source": [ 255 | "**ANS 1:**\n", 256 | "\n", 257 | "**ANS 2:**\n", 258 | "\n", 259 | "**ANS 3:**" 260 | ] 261 | }, 262 | { 263 | "cell_type": "code", 264 | "execution_count": 67, 265 | "metadata": { 266 | "collapsed": true 267 | }, 268 | "outputs": [], 269 | "source": [ 270 | "pseudo_counts = {'Road_Condition':[[5],[5]],\n", 271 | " 'Weather_Visibility':[[5],[5]],\n", 272 | " 'Light_Dark':[[5],[5]],\n", 273 | " 'Object':[[5],[5]],\n", 274 | " 'LIDAR_Sensor':[[5,5],[5,5]],\n", 275 | " 'Weather_Detection':[[5,5],[5,5]],\n", 276 | " 'Sensor_Detection':[[5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5],\n", 277 | " [5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5]],\n", 278 | " 'Road_Condition_Detection':[[5,5],[5,5]],\n", 279 | " 'Visual_Sensor_Detection':[[5,5,5,5,5,5,5,5],\n", 280 | " [5,5,5,5,5,5,5,5]],\n", 281 | " 'Early_Breaking':[[500,500,500,500], \n", 282 | " [5000,5000,5000,5000]],\n", 283 | " 'Collision':[[100,100,100,100,5,5,5,5],[1,1,1,1,5,5,5,5]]}" 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": {}, 289 | "source": [ 290 | "### 8. Queries\n", 291 | "\n", 292 | "With parameters fit, you are ready to perform queries on your model with Bayesian parameter estimates. Using the pgmpy VariableElimination function, perform the queries 1 and 2 from the table shown previously, computing the total utility, examine the results, and answer the questions. \n", 293 | "\n", 294 | "**Q 1:** Compare the probability of Collision and total utility for the different values of evidence specified in queries 1 and 2. Are these values consistent with what you expect and why? " 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": {}, 300 | "source": [ 301 | "**ANS:**" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "# Solution\n", 309 | "\n", 310 | "Create cells below for your solution to the stated problem. Be sure to include some Markdown text and code comments to explain each component of your algorithm. 
" 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": 68, 316 | "metadata": { 317 | "collapsed": true 318 | }, 319 | "outputs": [], 320 | "source": [ 321 | "from pgmpy.models import BayesianModel\n", 322 | "from pgmpy.factors.discrete import TabularCPD\n", 323 | "from pgmpy.inference import VariableElimination, BeliefPropagation\n", 324 | "from pgmpy.estimators.MLE import MaximumLikelihoodEstimator\n", 325 | "from pgmpy.estimators.BayesianEstimator import BayesianEstimator\n", 326 | "import numpy as np\n", 327 | "import numpy.random as nr\n", 328 | "import pandas as pd" 329 | ] 330 | }, 331 | { 332 | "cell_type": "markdown", 333 | "metadata": {}, 334 | "source": [ 335 | "### Import Data File" 336 | ] 337 | }, 338 | { 339 | "cell_type": "code", 340 | "execution_count": 69, 341 | "metadata": { 342 | "collapsed": true 343 | }, 344 | "outputs": [], 345 | "source": [ 346 | "samples = pd.read_csv('BreakingData.csv')" 347 | ] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "execution_count": 70, 352 | "metadata": { 353 | "collapsed": true 354 | }, 355 | "outputs": [], 356 | "source": [] 357 | } 358 | ], 359 | "metadata": { 360 | "kernelspec": { 361 | "display_name": "Python 3", 362 | "language": "python", 363 | "name": "python3" 364 | }, 365 | "language_info": { 366 | "codemirror_mode": { 367 | "name": "ipython", 368 | "version": 3 369 | }, 370 | "file_extension": ".py", 371 | "mimetype": "text/x-python", 372 | "name": "python", 373 | "nbconvert_exporter": "python", 374 | "pygments_lexer": "ipython3", 375 | "version": "3.7.3" 376 | } 377 | }, 378 | "nbformat": 4, 379 | "nbformat_minor": 2 380 | } 381 | -------------------------------------------------------------------------------- /Challenges/ChallengeWeekend.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Challenges/ChallengeWeekend.pdf -------------------------------------------------------------------------------- /Challenges/CliffWalking.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Challenge Assignment\n", 8 | "## Cliff Walking with Reinforcement Learning\n", 9 | "\n", 10 | "## CSCI E-82A\n", 11 | "\n", 12 | ">**Make sure** you include your name along with the name of your team and team members in the notebook you submit." 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "**Your name and team name here:** " 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "## Introduction\n", 27 | "\n", 28 | "In this challenge you will apply Monte Carlo reinforcement learning algorithms to a classic problem in reinforcement learning, known as the **cliff walking problem**. The cliff walking problem is a type of game. The goal is for the agent to find the highest reward (lowest cost) path from a starting state to the goal. \n", 29 | "\n", 30 | "There are a number of versions of the cliff walking problems which have been used as research benchmarks over the years. You can find a short discussion of the cliff walking problem on page 132 of Sutton and Barto, second edition. 
\n", 31 | "\n", 32 | "In the general cliff walking problem the agent starts in one corner of the state-space and must travel to goal, or terminal state, in another corner of the state-space. Between the starting state and goal state there is an area with a **cliff**. If the agent falls off a cliff it is sent back to the starting state. A schematic diagram of the state-space is shown in the diagram below. \n", 33 | "\n", 34 | "\"Drawing\"\n", 35 | "
State-space of the cliff walking problem
\n", 36 | "\n" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "### Problem Description\n", 44 | "\n", 45 | "The agent must learn a policy to navigate from the starting state to the terminal state. The properties this problem are as follows:\n", 46 | "\n", 47 | "1. The state-space has two **continuous variables**, x and y.\n", 48 | "2. The starting state is at $x = 0.0$, $y = 0.0$. \n", 49 | "3. The terminal state has two segments:\n", 50 | " - At $y = 0.0$ is in the range $9.0 \\le x \\le 10.0$. \n", 51 | " - At $x = 10.0$ is in the range $0.0 \\le y \\le 1.0$. \n", 52 | "4. The cliff zone is bounded by:\n", 53 | " - $0.0 \\le y \\le 1.0$ and \n", 54 | " - $1.0 \\le x \\le 9.0$. \n", 55 | "5. An agent entering the cliff zone is returned to the starting state.\n", 56 | "6. The agent moves 1.0 units per time step. \n", 57 | "7. The 8 possible **discrete actions** are moves in the following directions: \n", 58 | " - +x, \n", 59 | " - +x, +y,\n", 60 | " - +y\n", 61 | " - -x, +y,\n", 62 | " - -x,\n", 63 | " - -x, -y,\n", 64 | " - -y, and\n", 65 | " - +x, -y. \n", 66 | "8. The rewards are:\n", 67 | " - -1 for a time step in the state-space,\n", 68 | " - -10 for colliding with an edge (barrier) of the state-space, and the agent returns to the previous state,\n", 69 | " - -100 for falling off the cliff and returning to the starting state, and \n", 70 | " - +1000 for reaching the terminal or goal state. \n", 71 | " \n" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "## Instructions\n", 79 | "\n", 80 | "In this challenge you and your team will do the following. Include commentary on each component of your algorithms. Make sure you answer the questions. " 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "### Environment Simulator \n", 88 | "\n", 89 | "Your reinforcement learning agent cannot contain any information about the environment other that the starting state and the possible actions. Therefore, you must create an environment simulator, with the following input and output:\n", 90 | "- Input: Arguments of state, the $(x,y)$ tuple, and discrete action\n", 91 | "- Output: the new state (s'), reward, and if the new state meets the terminal or goal criteria.\n", 92 | "\n", 93 | "Make sure you test your simulator functions carefully. The test cases must include, steps with each of the actions, falling off the cliff from each edge, hitting the barriers, and reaching the goal (terminal) edges. Errors in the simulator will make the rest of this challenge difficult. \n", 94 | "\n", 95 | "> **Note**: For this problem, coordinate state is represented by a tuple of continuous variables. Make sure that you maintain coordinate state as continuous variables for this problem. " 96 | ] 97 | }, 98 | { 99 | "cell_type": "markdown", 100 | "metadata": {}, 101 | "source": [ 102 | "### Grid Approximation\n", 103 | "\n", 104 | "The state-space of the cliff walking problem is continuous. Therefor, you will need to use a **grid approximation** to construct a policy. The policy is specified as the probability of action for each grid cell. For this problem, use a 10x10 grid. \n", 105 | "\n", 106 | "> **Note:** While the policy uses a grid approximation, state should be represented as continuous variables." 
107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": {}, 112 | "source": [ 113 | "### Initial Policy\n", 114 | "\n", 115 | "Start with a uniform initial policy. A uniform policy has an equal probability of taking any of the 8 possible actions for each cell in the grid representation. \n", 116 | "\n", 117 | "> **Note:** As has already been stated, the coordinate state representation for this problem is a tuple of coordinate values. However, policy, state-values and action-values are represented with a grid approximation. \n", 118 | "\n", 119 | "> **Hint:** You may wish to use a 3-dimensional numpy array to code the policy for this problem. With 8 possible actions, this approach will be easier to work with. \n", 120 | "\n" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | "### Monte Carlo State Value Estimation \n", 128 | "\n", 129 | "For the initial uniform policy, compute the state values using the Monte Carlo RL algorithm:\n", 130 | "1. Compute and print the state values for each grid in the representation. Use at least 1,000 episodes. This will take some time to execute. \n", 131 | "2. Plot the grid of state values, as an image (e.g. matplotlib [imshow](https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.imshow.html)). \n", 132 | "3. Compute the Forbenious norm (Euclidean norm) of the state value array with [numpy.linalg.norm](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html). You will use this figure as a basis to compare your improved policy. \n", 133 | "\n", 134 | "Study your plot to ensure your state values seem correct. Do these state values seem reasonable given the uniform policy and why? Make sure you pay attention to the state values of the cliff zone. \n", 135 | "\n", 136 | "> **Hint:** Careful testing at each stage of your algorithm development will potentially save you considerable time. Test your function(s) to for a single episode to make sure your algorithm converges. Then test for say 10 episodes to ensure the state values update in a reasonable manner at each episode. \n", 137 | "\n", 138 | "> **Note:** The Monte Carlo episodes can be executed in parallel for production systems. The Markov chain of each episode is statistically independent. " 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "ANS:" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "### Monte Carlo State Policy Improvement \n", 153 | "\n", 154 | "Finally, you will perform Monte Carlo RL policy improvement:\n", 155 | "1. Starting with the uniform policy, compute action-values for each grid in the representation. Use at least 1,000 episodes. \n", 156 | "2. Use these action values to find an improved policy.\n", 157 | "3. To evaluate your updated policy compute the state-values for this policy. \n", 158 | "4. Plot the grid of state values for the improved policy, as an image. \n", 159 | "5. Compute the Forbenious norm (Euclidean norm) of the state value array. \n", 160 | "\n", 161 | "Compare the state value plot for the improved policy to the one for the initial uniform policy. Does the improved state values increase generally as distance to the terminal states decreases? Is this what you expect and why? \n", 162 | "\n", 163 | "Compare the norm of the state values with your improved policy to the norm for the uniform policy. Is the increase significant? 
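For the Monte Carlo estimates above, a first-visit evaluation sketch is shown below; it leans on the `cliff_step` simulator sketched earlier, represents `policy` as a 10x10x8 probability array, and bins the continuous coordinates into unit grid cells, all of which are assumptions you may implement differently.

```python
import numpy as np

def mc_state_values(policy, n_episodes=1000, gamma=1.0):
    """First-visit Monte Carlo evaluation on the 10x10 grid approximation.
    Episodes under the uniform policy can be long, so this takes a while."""
    returns_sum = np.zeros((10, 10))
    returns_cnt = np.zeros((10, 10))
    for _ in range(n_episodes):
        state, done, episode = (0.0, 0.0), False, []
        while not done:
            i = min(int(state[1]), 9)            # row index from y
            j = min(int(state[0]), 9)            # column index from x
            action = np.random.choice(8, p=policy[i, j, :])
            state, reward, done = cliff_step(state, action)
            episode.append(((i, j), reward))
        G, first_return = 0.0, {}
        for cell, reward in reversed(episode):   # accumulate returns backward
            G = gamma * G + reward
            first_return[cell] = G               # earliest visit overwrites
        for (i, j), g in first_return.items():
            returns_sum[i, j] += g
            returns_cnt[i, j] += 1
    return returns_sum / np.maximum(returns_cnt, 1.0)

# Example: evaluate the uniform policy.
# values = mc_state_values(np.full((10, 10, 8), 1.0 / 8.0))
```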
\n", 164 | "\n", 165 | "> **Hint:** Careful testing at each stage of your algorithm development will potentially save you considerable time. Test your function(s) to for a single episode to make sure your algorithm converges. Then test for say 10 episodes to ensure the state values update in a reasonable manner at each episode. \n", 166 | "\n", 167 | "> **Note:** You could continue to improve policy using the general policy improvement algorithm (GPI). In the interest of time, you are not required to do so here. " 168 | ] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": {}, 173 | "source": [ 174 | "ANS:\n", 175 | "\n", 176 | "ANS:\n", 177 | "\n", 178 | "ANS:" 179 | ] 180 | }, 181 | { 182 | "cell_type": "markdown", 183 | "metadata": {}, 184 | "source": [ 185 | "## Solution\n", 186 | "\n", 187 | "Create cells below for your solution to the stated problem. Be sure to include some Markdown text and code comments to explain each component of your algorithm. " 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": { 194 | "collapsed": true 195 | }, 196 | "outputs": [], 197 | "source": [] 198 | } 199 | ], 200 | "metadata": { 201 | "kernelspec": { 202 | "display_name": "Python 3", 203 | "language": "python", 204 | "name": "python3" 205 | }, 206 | "language_info": { 207 | "codemirror_mode": { 208 | "name": "ipython", 209 | "version": 3 210 | }, 211 | "file_extension": ".py", 212 | "mimetype": "text/x-python", 213 | "name": "python", 214 | "nbconvert_exporter": "python", 215 | "pygments_lexer": "ipython3", 216 | "version": "3.7.3" 217 | } 218 | }, 219 | "nbformat": 4, 220 | "nbformat_minor": 2 221 | } 222 | -------------------------------------------------------------------------------- /Challenges/CliffWalkingDiagram.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Challenges/CliffWalkingDiagram.JPG -------------------------------------------------------------------------------- /Challenges/GeneralPolicyInterationDIagram.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Challenges/GeneralPolicyInterationDIagram.pdf -------------------------------------------------------------------------------- /Homework/FiguresHW6.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/FiguresHW6.pptx -------------------------------------------------------------------------------- /Homework/GridWorldFactory.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/GridWorldFactory.JPG -------------------------------------------------------------------------------- /Homework/Homework1.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 1\n", 8 | "\n", 9 | "## Directed Graphical Models\n", 10 | "\n", 11 | "## Scenario\n", 12 | "\n", 13 | "Inspector Andrea Markov receives 
a call that Lord Fauntleroy has been murdered in his manor house. She is told that only the cook and butler were present in the house at the time of the crime. Further, she learns that the only possible murder weapons are a knife and poison. \n", 14 | "\n", 15 | "In this exercise you will construct a Directed Bayesian Graphical Model or belief network for the available evidence. \n", 16 | "\n", 17 | "It is a cliche that a criminal must have opportunity, $OP$, means, $W$, and a motive, $MO$: \n", 18 | "- Since both the cook and butler were at the house and had access to the victim when the murder was committed, we can say that $P(OP) = 1.0$. Any model can be simplified by eliminating this variable. \n", 19 | "- The Inspector has already established that the means was either a knife, $K$, or poison, $P$. \n", 20 | "- Upon questioning the suspects, Inspector Markov believe that inheriting part of the fortune of the victim is a likely motive. She suspects that the butler may be due an inheritance, but doubts the cook does. \n", 21 | "\n", 22 | "Before leaving the police station the Inspector instructs Sargent Bernoulli to gather information on the outcomes of similar murder investigations. She tells the Sargent that it is entirely possible that both the butler and the cook could have worked together. \n", 23 | "\n", 24 | "While somewhat unrealistic, we assume the actions of one of the people did not influence the other. Therefore, $p(B\\ |\\ C) = p(B)$ and $p(C\\ |\\ B) = p(C)$.\n", 25 | "\n", 26 | "Further, as Inspector Markov has continued her investigation she has discovered an unexplained set of footprints, evidence that a third person may have been involved in the crime. There is no evidence linking the cook or the butler to any other possible perpetrator, the model can neglect the possibility of collaboration with a third party. In other words, $p(C,B\\ |\\ third\\ party) = 0$, $p(third\\ party\\ |\\ cook = 0)$, and $p(third\\ party\\ |\\ butler = 0)$. \n", 27 | "\n", 28 | "As a first step in creating the belief network, import the packages you will need for this analysis." 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "metadata": {}, 35 | "outputs": [], 36 | "source": [ 37 | "from pgmpy.models import BayesianModel\n", 38 | "from pgmpy.factors.discrete import TabularCPD" 39 | ] 40 | }, 41 | { 42 | "cell_type": "markdown", 43 | "metadata": {}, 44 | "source": [ 45 | "The joint probability distribution is:\n", 46 | "\n", 47 | "$$p(B,C,W,MO,M)$$ \n", 48 | "where the letters indicate the following variables; \n", 49 | "$B = $ unconditional probability that the butler committed the crime, \n", 50 | "$C = $ unconditional probability that the cook committed the crime, \n", 51 | "$W = $ the probability of the weapon, K = knife, P = poison, conditional on B and C. \n", 52 | "$MO = $ the probability of a motive, conditional on C and B. \n", 53 | "$M = $ the probability that the third party, the cook, C, or the butler, B, committed the crime, conditional on B, C, W, and MO. \n", 54 | "\n", 55 | "Given the independencies, this distribution can be factorized in the following manner:\n", 56 | "\n", 57 | "$$p(B,C,W,MO,M) = p(B)\\ p(C)\\ p(W\\ |\\ B, C)\\ p(MO\\ |\\ B,C)\\ p(M\\ |\\ B, C, W, MO)$$\n", 58 | "\n", 59 | "Now you will define the skeleton of the graph. 
Given the independency relationships of the factorized probability distribution define the skeleton of the model (`m_model`) using the `BayesianModel` function.\n", 60 | "\n", 61 | ">**Hint:** Using paper and pencil make a sketch of the graph before you commit your skeleton structure to code. " 62 | ] 63 | }, 64 | { 65 | "cell_type": "code", 66 | "execution_count": null, 67 | "metadata": {}, 68 | "outputs": [], 69 | "source": [ 70 | "## Define the network structure.\n" 71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "metadata": {}, 76 | "source": [ 77 | "Your next step to create you model is to define the conditional probability tables (CPT) for each independent variable using the `TabularCPD` function. The tables for these variables are: \n", 78 | "\n", 79 | "\n", 80 | "$p(B)$ \n", 81 | "\n", 82 | "| Case | p |\n", 83 | "|---|---|\n", 84 | "|$B_0$ | 0.4 |\n", 85 | "|$B_1$ | 0.6 | \n", 86 | "\n", 87 | "$p(C)$ \n", 88 | "\n", 89 | "| Case | p |\n", 90 | "|---|---|\n", 91 | "|$C_0$ | 0.7 |\n", 92 | "|$C_1$ | 0.3 |\n", 93 | "\n", 94 | "\n", 95 | "Using the above tables define and print the CPTs. Make sure you use variable names consistent with your model." 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "metadata": {}, 102 | "outputs": [], 103 | "source": [ 104 | "## Define the independent variables\n" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "metadata": {}, 110 | "source": [ 111 | "Next, define the variables $W$ and $MO$, the conditional probabilities of weapon choice and motive given the butler and the cook. The conditional probability tables for these variables are:\n", 112 | "\n", 113 | "$$p(W)$$\n", 114 | "\n", 115 | "| Case | B0, C0 | B0, C1 | B1, C0 | B1, C1 |\n", 116 | "|---|---|---|--|---|\n", 117 | "|$W_0$ | 0.1 | 0.5 | 0.4 | 0.7 |\n", 118 | "|$W_1$ | 0.9 | 0.5 | 0.6 | 0.3 |\n", 119 | "\n", 120 | "Where $W_0$ is poison and $W_1$ is knife. \n", 121 | "\n", 122 | "$$p(MO)$$\n", 123 | "\n", 124 | "| Case | B0, C0 | B0, C1 | B1, C0 | B1, C1 |\n", 125 | "|---|---|---|--|---|\n", 126 | "|$MO_0$ | 1.0 | 0.7 | 0.1 | 0.3 |\n", 127 | "|$MO_1$ | 0.0 | 0.3 | 0.9 | 0.7 |\n", 128 | "\n", 129 | "Where $MO_0$ is no motive and $MO_1$ is motive.\n", 130 | "\n", 131 | "Give the above tables define and print these CPTs. " 132 | ] 133 | }, 134 | { 135 | "cell_type": "code", 136 | "execution_count": null, 137 | "metadata": {}, 138 | "outputs": [], 139 | "source": [] 140 | }, 141 | { 142 | "cell_type": "code", 143 | "execution_count": null, 144 | "metadata": {}, 145 | "outputs": [], 146 | "source": [ 147 | "\n" 148 | ] 149 | }, 150 | { 151 | "cell_type": "markdown", 152 | "metadata": {}, 153 | "source": [ 154 | "**Question:** If poison is rulled out, $p(Poison) = 0$, how many possible states would each of these CPTs have? " 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "metadata": {}, 160 | "source": [ 161 | "ANS: " 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "Finally, you must define a CPT for the conditional probability of the murderer. The marginal distribution of this CPT will be the probabilities of each of the suspects having committed the crime. The tree cases are coded as follows:\n", 169 | "\n", 170 | "- **M0:** The murder is committed by a third unnamed party, \n", 171 | "- **M1:** the cook is a murderer, and\n", 172 | "- **M2:** the butler is a murderer. \n", 173 | "\n", 174 | "This CPT is conditional on $B$, $C$, $W$, and $MO$. 
Since there are three possible guilty parties (cardinality of 3) there are 48 possible states; $N_{B} * N_{C} * N_{M} * N_W * N_M = 2 * 2 * 2 * 2* 3 = 48$ as shown here:\n", 175 | "\n", 176 | "| | p | p | p | p| p | p | p | p| p | p | p | p| p | p | p | p| \n", 177 | "|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---| \n", 178 | "|| $C_0$ | $C_0$ | $C_0$ | $C_0$| $C_1$ | $C_1$ | $C_1$ | $C_1$| $C_0$ | $C_0$ | $C_0$ | $C_0$| $C_1$ | $C_1$ | $C_1$ | $C_1$| \n", 179 | "|| $B_0$ | $B_0$ | $B_0$ | $B_0$ | $B_0$ | $B_0$ | $B_0$ | $B_0$ | $B_1$ | $B_1$ | $B_1$ | $B_1$ | $B_1$ | $B_1$ | $B_1$ | $B_1$ | \n", 180 | "|| $W_0$ | $W_0$ | $W_1$ | $W_1$ | $W_0$ | $W_0$ | $W_1$ | $W_1$ | $W_0$ | $W_0$ | $W_1$ | $W_1$ | $W_0$ | $W_0$ | $W_1$ | $W_1$ | $W_0$ | $W_0$ | $W_1$ | $W_1$ | \n", 181 | "|| $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | $MO_0$ | $MO_1$ | \n", 182 | "|$M_0$| 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0. | 0.0 | 0.0 | 0.0 | 0.0 | \n", 183 | "|$M_1$| 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.5 | 0.5 | 0.5 | 0.5 | \n", 184 | "|$M_2$| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |0.5 | 0.5 | 0.5 | 0.5 |\n", 185 | "\n", 186 | "Where: \n", 187 | "$M_0$ = The third party \n", 188 | "$M_1$ = The cook \n", 189 | "$M_2$ = The butler \n", 190 | "\n", 191 | "Create and print this CPT. " 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "metadata": {}, 198 | "outputs": [], 199 | "source": [] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "metadata": {}, 204 | "source": [ 205 | "**Question:** If $p(Poison) = 0$ how many possible states would there be in this CPT?" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": {}, 211 | "source": [ 212 | "ANS:" 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "To complete your belief network, use the `add_cpds` method. \n", 220 | "\n", 221 | "> **Hint:** Before going any further make sure you apply the `check_model` method to your complete model. " 222 | ] 223 | }, 224 | { 225 | "cell_type": "code", 226 | "execution_count": null, 227 | "metadata": {}, 228 | "outputs": [], 229 | "source": [] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": [ 235 | "Next investigate the independencies of all the variables in your model using the `local_independencies` method. Be sure to include all the variables in your list. " 236 | ] 237 | }, 238 | { 239 | "cell_type": "code", 240 | "execution_count": null, 241 | "metadata": {}, 242 | "outputs": [], 243 | "source": [] 244 | }, 245 | { 246 | "cell_type": "markdown", 247 | "metadata": {}, 248 | "source": [ 249 | "**Question:** Is this graphical model an I-map of the distribution discussed at the start of this lab and why?" 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "metadata": {}, 255 | "source": [ 256 | "ANS: " 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "metadata": {}, 262 | "source": [ 263 | "**Question:** is the graphical model a perfect map of the distribution, and why? " 264 | ] 265 | }, 266 | { 267 | "cell_type": "markdown", 268 | "metadata": {}, 269 | "source": [ 270 | "ANS:" 271 | ] 272 | }, 273 | { 274 | "cell_type": "markdown", 275 | "metadata": {}, 276 | "source": [ 277 | "Next, you will determine which of all possible trails in the graph are active. 
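As a sketch of the call pattern, assuming your model object is named `murder_model` (a placeholder name) and that your pgmpy release still exposes `is_active_trail` (newer releases rename it `is_dconnected`):

```python
from itertools import combinations

# Check every pair of variables for an active trail, with no evidence.
for start, end in combinations(['B', 'C', 'W', 'MO', 'M'], 2):
    print(start, '-', end, ':', murder_model.is_active_trail(start, end))

# The same check conditioned on evidence, for example observing W:
# murder_model.is_active_trail('B', 'C', observed=['W'])
```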
Create and execute the code using the `is_active_trail` method on the model object. Make sure you account for all possible pairs of variables. " 278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "metadata": {}, 284 | "outputs": [], 285 | "source": [] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": [ 291 | "**Question:** How can you best explain the blocked trails given the independent variables and V-structures in the graph? What are the trials with V-structures that are blocked? **Hint:** Be careful, as there can be several paths between a pair of variables. " 292 | ] 293 | }, 294 | { 295 | "cell_type": "markdown", 296 | "metadata": {}, 297 | "source": [ 298 | "ANS: " 299 | ] 300 | }, 301 | { 302 | "cell_type": "markdown", 303 | "metadata": {}, 304 | "source": [ 305 | "**Question:** Does {W,MO} D-separate {C,B} from {M}, and why? " 306 | ] 307 | }, 308 | { 309 | "cell_type": "markdown", 310 | "metadata": {}, 311 | "source": [ 312 | "ANS:" 313 | ] 314 | }, 315 | { 316 | "cell_type": "markdown", 317 | "metadata": {}, 318 | "source": [ 319 | "**Question:** What is the Markov blanket of the node W and the node MO?" 320 | ] 321 | }, 322 | { 323 | "cell_type": "markdown", 324 | "metadata": {}, 325 | "source": [ 326 | "ANS:" 327 | ] 328 | } 329 | ], 330 | "metadata": { 331 | "kernelspec": { 332 | "display_name": "Python 3", 333 | "language": "python", 334 | "name": "python3" 335 | }, 336 | "language_info": { 337 | "codemirror_mode": { 338 | "name": "ipython", 339 | "version": 3 340 | }, 341 | "file_extension": ".py", 342 | "mimetype": "text/x-python", 343 | "name": "python", 344 | "nbconvert_exporter": "python", 345 | "pygments_lexer": "ipython3", 346 | "version": "3.7.3" 347 | } 348 | }, 349 | "nbformat": 4, 350 | "nbformat_minor": 2 351 | } 352 | -------------------------------------------------------------------------------- /Homework/Homework2.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 2\n", 8 | "## CSCI E-82a\n", 9 | "\n", 10 | "## Properties of Undirected Graphical Models\n", 11 | "\n", 12 | "In the previous homework you created a DAG to represent the probability of variables for a fictional murder investigation. In this exercise you examined the independence properties of the DAG you created. \n", 13 | "\n", 14 | "Now you will investigate the properties of an undirected graphical model, or Markov Random Field (MRF) derived from your DAG. \n", 15 | "\n", 16 | "As a first step, import the packages you will need for this analysis. " 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": null, 22 | "metadata": {}, 23 | "outputs": [], 24 | "source": [ 25 | "from pgmpy.models import BayesianModel\n", 26 | "from pgmpy.factors.discrete import TabularCPD" 27 | ] 28 | }, 29 | { 30 | "cell_type": "markdown", 31 | "metadata": {}, 32 | "source": [ 33 | "In the cell below, recreate the DAG you created in the first homework. You may do so by simply coping the code you previously created. Make sure you apply the `check_model` method to your model object to make sure there are no errors before you move on. \n", 34 | "\n", 35 | "> **Tip:** If you have saved your pickled model you are read it by uncommenting the code below. 
" 36 | ] 37 | }, 38 | { 39 | "cell_type": "code", 40 | "execution_count": null, 41 | "metadata": {}, 42 | "outputs": [], 43 | "source": [ 44 | "#import pickle\n", 45 | "#with open('my_model.pickle', 'rb') as pkl:\n", 46 | "# murder_model = pickle.load(pkl)\n", 47 | "\n", 48 | "#print(murder_model.check_model())" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "Immoralities are an important property of a DAG which is to be transformed to a MRF. Apply the `get_immoralities` method to your DAG object and examine the result. " 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": null, 61 | "metadata": {}, 62 | "outputs": [], 63 | "source": [] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "To answer this and subsequent questions you may wish to draw a diagram of your DAG. Do these immoralities make sense given the number of v-structures in the DAG and why? " 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": {}, 75 | "source": [ 76 | "ANS: Yes, there are actually 3 V-structures in the DAG, B -> W <- C, B -> W <- C, and W -> M <- MO. Thus, the nodes that must be married are ('B','C') and ('MO','W'). " 77 | ] 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "metadata": {}, 82 | "source": [ 83 | "Next, do the following: \n", 84 | "1. Transform your DAG object to a Markov network using the `to_markov_model` method. \n", 85 | "2. Find the factors of your Markov network using the `get_factors` method. " 86 | ] 87 | }, 88 | { 89 | "cell_type": "code", 90 | "execution_count": null, 91 | "metadata": {}, 92 | "outputs": [], 93 | "source": [] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "Examine these factors. Do these factors correspond to cliques of the moralized graph, and why? Are these all maximal cliques or not, and why? " 100 | ] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "ANS: These factors do correspond to cliques of the graph, since a clique can be any collection of neighbors on a graph, including a single node. Not all these cliques are maximal. For example there are single node cliques. " 107 | ] 108 | }, 109 | { 110 | "cell_type": "markdown", 111 | "metadata": {}, 112 | "source": [ 113 | "Finally, print the Markov blankets for nodes W and MO and examine the results. " 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": {}, 120 | "outputs": [], 121 | "source": [] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "metadata": {}, 126 | "source": [ 127 | "Do these Markov blankets appear to be correct, and why?" 128 | ] 129 | }, 130 | { 131 | "cell_type": "markdown", 132 | "metadata": {}, 133 | "source": [ 134 | "ANS: Given the structure of the graph the Markov blanket of each of these nodes include all the other nodes as shown. 
" 135 | ] 136 | } 137 | ], 138 | "metadata": { 139 | "kernelspec": { 140 | "display_name": "Python 3", 141 | "language": "python", 142 | "name": "python3" 143 | }, 144 | "language_info": { 145 | "codemirror_mode": { 146 | "name": "ipython", 147 | "version": 3 148 | }, 149 | "file_extension": ".py", 150 | "mimetype": "text/x-python", 151 | "name": "python", 152 | "nbconvert_exporter": "python", 153 | "pygments_lexer": "ipython3", 154 | "version": "3.7.3" 155 | } 156 | }, 157 | "nbformat": 4, 158 | "nbformat_minor": 2 159 | } 160 | -------------------------------------------------------------------------------- /Homework/Homework3.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 3\n", 8 | "\n", 9 | "## Inference on Graphical Models\n", 10 | "\n", 11 | "In the previous homework, you constructed and examined the properties of a DAG and a MRF to model the variables for a murder mystery problem. Now, you are ready to perform inference on the directed graphical models. \n", 12 | "\n", 13 | "The goal of this analysis is to find the posterior distribution of key variables which will help Inspector Markov determine the most likely, weapon, motive and perpetrator. This process will require **inference** on the models. \n", 14 | "\n", 15 | "\n", 16 | "*******\n", 17 | "In this exercise you will construct a Directed Bayesian Graphical Model or belief network for the available evidence and perform inference on some of the variables. \n", 18 | "\n", 19 | "Inspector Markov has continued her investigation. Additionally Dr. Turing has had time to perform laboratory analysis. Turing reports to the Inspector that there is a chance that a toxic substance may have been used to incapacitate the victim before the stabbing. So, there are now two possible weapons:\n", 20 | "- knife alone, \n", 21 | "- knife with a poison. \n", 22 | "\n", 23 | "Given this evidence, Inspector Markov must update her beliefs. Further, she needs to perform inference to determine if there are likely combinations of suspects and weapons. \n", 24 | "\n" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "Recall that the joint probability distribution is:\n", 32 | "\n", 33 | "$$p(B,C,W,MO,M)$$ \n", 34 | "where the letters indicate the following variables; \n", 35 | "$B = $ unconditional probability that the butler committed the crime, \n", 36 | "$C = $ unconditional probability that the cook committed the crime, \n", 37 | "$W = $ the probability of the weapon, K = knife, P = poison, conditional on B and C. \n", 38 | "$MO = $ the probability of a motive, conditional on C and B. \n", 39 | "$M = $ the probability that the third party, the cook, C, or the butler, B, committed the crime, conditional on B, C, W, and MO. \n", 40 | "\n", 41 | "Given the independencies, this distribution can be factorized in the following manner:\n", 42 | "\n", 43 | "$$p(B,C,W,MO,M) = p(B)\\ p(C)\\ p(W\\ |\\ B, C)\\ p(MO\\ |\\ B,C)\\ p(M\\ |\\ B, C, W, MO)$$\n" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "As a first step in creating the belief network, import the packages you will need for this analysis." 
51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": 1, 56 | "metadata": {}, 57 | "outputs": [], 58 | "source": [ 59 | "from pgmpy.models import BayesianModel\n", 60 | "from pgmpy.factors.discrete import TabularCPD" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "metadata": {}, 66 | "source": [ 67 | "Next, define your directed graphical model in the cell below. You should use the model you defined in Homework 1 and used in Homework 2. Be sure to apply the `check_model` method to you model.\n", 68 | "\n", 69 | "> **Tip:** If you have saved your pickled model you are read it by uncommenting the code below. You will need to change the name `my_model.pickle` to whatever name you have save your model under. " 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 2, 75 | "metadata": {}, 76 | "outputs": [ 77 | { 78 | "name": "stdout", 79 | "output_type": "stream", 80 | "text": [ 81 | "True\n" 82 | ] 83 | } 84 | ], 85 | "source": [ 86 | "#import pickle\n", 87 | "#with open('my_model.pickle', 'rb') as pkl:\n", 88 | "# murder_model = pickle.load(pkl)\n", 89 | "\n", 90 | "#print(murder_model.check_model())" 91 | ] 92 | }, 93 | { 94 | "cell_type": "markdown", 95 | "metadata": {}, 96 | "source": [ 97 | "Now that you have the model define, you are ready to apply **variable elimination** to perform inference. In the cell below create and execute code to do the following: \n", 98 | "1. Create an inference object using the `VariableElimination` function. \n", 99 | "2. Perform a query on the variable W. \n", 100 | "3. print the query for W. \n", 101 | "\n", 102 | "You can find [detailed documentation on the VariableELimination methods in the pgmpy documentation](http://pgmpy.org/inference.html). " 103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": 3, 108 | "metadata": {}, 109 | "outputs": [ 110 | { 111 | "name": "stdout", 112 | "output_type": "stream", 113 | "text": [ 114 | "+-----+----------+\n", 115 | "| W | phi(W) |\n", 116 | "+=====+==========+\n", 117 | "| W_0 | 0.3820 |\n", 118 | "+-----+----------+\n", 119 | "| W_1 | 0.6180 |\n", 120 | "+-----+----------+\n" 121 | ] 122 | }, 123 | { 124 | "name": "stderr", 125 | "output_type": "stream", 126 | "text": [ 127 | "C:\\Users\\StevePC2\\Anaconda3\\lib\\site-packages\\pgmpy\\factors\\discrete\\DiscreteFactor.py:586: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n", 128 | " phi.values = phi.values[slice_]\n", 129 | "C:\\Users\\StevePC2\\Anaconda3\\lib\\site-packages\\pgmpy\\factors\\discrete\\DiscreteFactor.py:598: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n", 130 | " phi1.values = phi1.values[slice_]\n" 131 | ] 132 | } 133 | ], 134 | "source": [] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "Now perform a query on the motive, MO, variable to find the marginal distribution. 
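The solution code cells in this notebook were cleared before publishing (they show outputs but empty sources), so as a sketch only: assuming the DAG object is named `murder_model`, queries of the kind that produce the tables shown can be formed as below. In the pgmpy release used here, `query()` returns a dictionary of factors keyed by variable name; newer releases return a single factor.

```python
from pgmpy.inference import VariableElimination

murder_infer = VariableElimination(murder_model)

# Marginal beliefs for the weapon and the motive, with no evidence.
q = murder_infer.query(variables=['W', 'MO'])
print(q['W'])
print(q['MO'])
```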
" 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": 4, 146 | "metadata": {}, 147 | "outputs": [ 148 | { 149 | "name": "stdout", 150 | "output_type": "stream", 151 | "text": [ 152 | "+------+-----------+\n", 153 | "| MO | phi(MO) |\n", 154 | "+======+===========+\n", 155 | "| MO_0 | 0.4600 |\n", 156 | "+------+-----------+\n", 157 | "| MO_1 | 0.5400 |\n", 158 | "+------+-----------+\n" 159 | ] 160 | } 161 | ], 162 | "source": [] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "These marginal distributions quantify the initial beliefs for these variables. Given these marginal distributions, which weapon and motive appears to be the most likely and why? " 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "metadata": {}, 174 | "source": [ 175 | "ANS: " 176 | ] 177 | }, 178 | { 179 | "cell_type": "markdown", 180 | "metadata": {}, 181 | "source": [ 182 | "What Inspector Markov really needs to know is the probability that each of the suspects committed the murder. In the cell below, create the code to make a query and print the results on the butler, B, and cook, C. " 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": 5, 188 | "metadata": {}, 189 | "outputs": [ 190 | { 191 | "name": "stdout", 192 | "output_type": "stream", 193 | "text": [ 194 | "+-----+----------+\n", 195 | "| B | phi(B) |\n", 196 | "+=====+==========+\n", 197 | "| B_0 | 0.4000 |\n", 198 | "+-----+----------+\n", 199 | "| B_1 | 0.6000 |\n", 200 | "+-----+----------+\n", 201 | "+-----+----------+\n", 202 | "| C | phi(C) |\n", 203 | "+=====+==========+\n", 204 | "| C_0 | 0.7000 |\n", 205 | "+-----+----------+\n", 206 | "| C_1 | 0.3000 |\n", 207 | "+-----+----------+\n" 208 | ] 209 | } 210 | ], 211 | "source": [] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "It is possible that Inspector Markov could discover evidence which would cause the beliefs to be updated. \n", 218 | "\n", 219 | "For example, Sgt Bernoulli telephones the Inspector to say that he has discovered that the butler is indeed due the inheritance (MO = 1). Using this evidence compute the query and print the marginal distribution for the butler, B, and cook, C, are murders. These marginal distributions are the updated belief on who the perpetrator might be. " 220 | ] 221 | }, 222 | { 223 | "cell_type": "code", 224 | "execution_count": 6, 225 | "metadata": {}, 226 | "outputs": [ 227 | { 228 | "name": "stdout", 229 | "output_type": "stream", 230 | "text": [ 231 | "+-----+----------+\n", 232 | "| B | phi(B) |\n", 233 | "+=====+==========+\n", 234 | "| B_0 | 0.0667 |\n", 235 | "+-----+----------+\n", 236 | "| B_1 | 0.9333 |\n", 237 | "+-----+----------+\n", 238 | "+-----+----------+\n", 239 | "| C | phi(C) |\n", 240 | "+=====+==========+\n", 241 | "| C_0 | 0.7000 |\n", 242 | "+-----+----------+\n", 243 | "| C_1 | 0.3000 |\n", 244 | "+-----+----------+\n" 245 | ] 246 | } 247 | ], 248 | "source": [] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "Notice how the marginal belief has changed for one suspect but not the other. How can you best explain this change, given the new evidence?" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": {}, 260 | "source": [ 261 | "ANS: " 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "Next, Dr. 
Turing, the medical examiner, calls to say that she has discovered that poison was definitely used for the murder, or W = 0. Using this evidence compute the updated belief with a query and print the marginal distributions that the butler, B, and cook, C, are the murderer. " 269 | ] 270 | }, 271 | { 272 | "cell_type": "code", 273 | "execution_count": 7, 274 | "metadata": {}, 275 | "outputs": [ 276 | { 277 | "name": "stdout", 278 | "output_type": "stream", 279 | "text": [ 280 | "+-----+----------+\n", 281 | "| B | phi(B) |\n", 282 | "+=====+==========+\n", 283 | "| B_0 | 0.0699 |\n", 284 | "+-----+----------+\n", 285 | "| B_1 | 0.9301 |\n", 286 | "+-----+----------+\n", 287 | "+-----+----------+\n", 288 | "| C | phi(C) |\n", 289 | "+=====+==========+\n", 290 | "| C_0 | 0.5874 |\n", 291 | "+-----+----------+\n", 292 | "| C_1 | 0.4126 |\n", 293 | "+-----+----------+\n" 294 | ] 295 | } 296 | ], 297 | "source": [] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": {}, 302 | "source": [ 303 | "Compare the marginal distributions of the cook and the butler being the murderer given the changes in belief from the new evidence of the weapon. How does this change the belief that the cook or the butler is the murderer? " 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "ANS: " 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "metadata": {}, 316 | "source": [ 317 | "You have performed the foregoing analysis using variable elimination. Now, you will perform analysis using the **junction tree algorithm**. \n", 318 | "\n", 319 | "In the cell below do the following:\n", 320 | "1. Create a belief propagation object using the `BeliefPropagation` function. \n", 321 | "2. Print the `.factors` attribute of the object you created. \n", 322 | "\n", 323 | "Despite the name, the pgmpy `BeliefPropagation` model actually implements the Junction Tree Algorithm. You can see a more complete description of the algorithm by [reading the documentation](http://pgmpy.org/inference.html#belief-propagation). " 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": 8, 329 | "metadata": {}, 330 | "outputs": [ 331 | { 332 | "name": "stdout", 333 | "output_type": "stream", 334 | "text": [ 335 | "defaultdict(<class 'list'>, {'B': [<DiscreteFactor>, <DiscreteFactor>, <DiscreteFactor>, <DiscreteFactor>], 'W': [<DiscreteFactor>, <DiscreteFactor>], 'C': [<DiscreteFactor>, <DiscreteFactor>, <DiscreteFactor>, <DiscreteFactor>], 'MO': [<DiscreteFactor>, <DiscreteFactor>], 'M': [<DiscreteFactor>]})\n" 336 | ] 337 | } 338 | ], 339 | "source": [] 340 | }, 341 | { 342 | "cell_type": "markdown", 343 | "metadata": {}, 344 | "source": [ 345 | "Examine these results, considering the structure of the DAG and its cliques. " 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "Now, compute a query and print the results for the prior belief (belief with no evidence) for the butler, B, and cook, C, being murderers. 
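A hedged sketch of the junction tree workflow, reusing the hypothetical `murder_model` name from the variable elimination sketch above:

```python
from pgmpy.inference import BeliefPropagation

# Build the junction tree (belief propagation) object from the DAG.
murder_bp = BeliefPropagation(murder_model)

# The factors attribute groups the model's factors by the variables in their scope.
print(murder_bp.factors)

# Prior marginals (no evidence) for the butler, B, and the cook, C.
prior_query = murder_bp.query(['B', 'C'])
for var in ['C', 'B']:
    print(prior_query[var])
```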
" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": 9, 358 | "metadata": {}, 359 | "outputs": [ 360 | { 361 | "name": "stdout", 362 | "output_type": "stream", 363 | "text": [ 364 | "+-----+----------+\n", 365 | "| C | phi(C) |\n", 366 | "+=====+==========+\n", 367 | "| C_0 | 0.7000 |\n", 368 | "+-----+----------+\n", 369 | "| C_1 | 0.3000 |\n", 370 | "+-----+----------+\n", 371 | "+-----+----------+\n", 372 | "| B | phi(B) |\n", 373 | "+=====+==========+\n", 374 | "| B_0 | 0.4000 |\n", 375 | "+-----+----------+\n", 376 | "| B_1 | 0.6000 |\n", 377 | "+-----+----------+\n" 378 | ] 379 | } 380 | ], 381 | "source": [] 382 | }, 383 | { 384 | "cell_type": "markdown", 385 | "metadata": {}, 386 | "source": [ 387 | "Now, perform the same queries, but include the evidence that the motive what the inheritance (MO = 1) and the weapon was the poison (W = 0). " 388 | ] 389 | }, 390 | { 391 | "cell_type": "code", 392 | "execution_count": 10, 393 | "metadata": {}, 394 | "outputs": [ 395 | { 396 | "name": "stdout", 397 | "output_type": "stream", 398 | "text": [ 399 | "+-----+----------+\n", 400 | "| C | phi(C) |\n", 401 | "+=====+==========+\n", 402 | "| C_0 | 0.5874 |\n", 403 | "+-----+----------+\n", 404 | "| C_1 | 0.4126 |\n", 405 | "+-----+----------+\n", 406 | "+-----+----------+\n", 407 | "| B | phi(B) |\n", 408 | "+=====+==========+\n", 409 | "| B_0 | 0.0699 |\n", 410 | "+-----+----------+\n", 411 | "| B_1 | 0.9301 |\n", 412 | "+-----+----------+\n" 413 | ] 414 | } 415 | ], 416 | "source": [] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "Compare the results of the two queries you have just performed with the junction tree algorithm with the equivalent queries you performed with the variable elimination algorithm. Are the results different? Does this outcome surprise you?" 423 | ] 424 | }, 425 | { 426 | "cell_type": "markdown", 427 | "metadata": {}, 428 | "source": [ 429 | "ANS: " 430 | ] 431 | }, 432 | { 433 | "cell_type": "markdown", 434 | "metadata": {}, 435 | "source": [ 436 | "Recall, that there was the possibility that a third party, not the butler or the cook, may have committed the murder. Perform and print a query of the prior marginal of the murderer variable, M." 437 | ] 438 | }, 439 | { 440 | "cell_type": "code", 441 | "execution_count": 11, 442 | "metadata": {}, 443 | "outputs": [ 444 | { 445 | "name": "stdout", 446 | "output_type": "stream", 447 | "text": [ 448 | "+-----+----------+\n", 449 | "| M | phi(M) |\n", 450 | "+=====+==========+\n", 451 | "| M_0 | 0.2800 |\n", 452 | "+-----+----------+\n", 453 | "| M_1 | 0.2100 |\n", 454 | "+-----+----------+\n", 455 | "| M_2 | 0.5100 |\n", 456 | "+-----+----------+\n" 457 | ] 458 | } 459 | ], 460 | "source": [] 461 | }, 462 | { 463 | "cell_type": "markdown", 464 | "metadata": {}, 465 | "source": [ 466 | "Next, perform and print the query of the murderer variable, M, but with the evidence, MO = 1 and W = 0. 
" 467 | ] 468 | }, 469 | { 470 | "cell_type": "code", 471 | "execution_count": 15, 472 | "metadata": {}, 473 | "outputs": [ 474 | { 475 | "name": "stdout", 476 | "output_type": "stream", 477 | "text": [ 478 | "+-----+----------+\n", 479 | "| M | phi(M) |\n", 480 | "+=====+==========+\n", 481 | "| M_0 | 0.0000 |\n", 482 | "+-----+----------+\n", 483 | "| M_1 | 0.2413 |\n", 484 | "+-----+----------+\n", 485 | "| M_2 | 0.7587 |\n", 486 | "+-----+----------+\n" 487 | ] 488 | } 489 | ], 490 | "source": [] 491 | }, 492 | { 493 | "cell_type": "markdown", 494 | "metadata": {}, 495 | "source": [ 496 | "Compare the prior marginal to the marginal including evidence. Is the change in the results consistent with the evidence and why? " 497 | ] 498 | }, 499 | { 500 | "cell_type": "markdown", 501 | "metadata": {}, 502 | "source": [ 503 | "ANS: " 504 | ] 505 | }, 506 | { 507 | "cell_type": "markdown", 508 | "metadata": {}, 509 | "source": [ 510 | "There are many other possible queries you can perform on this model. You may wish to explore some more of these. " 511 | ] 512 | } 513 | ], 514 | "metadata": { 515 | "kernelspec": { 516 | "display_name": "Python 3", 517 | "language": "python", 518 | "name": "python3" 519 | }, 520 | "language_info": { 521 | "codemirror_mode": { 522 | "name": "ipython", 523 | "version": 3 524 | }, 525 | "file_extension": ".py", 526 | "mimetype": "text/x-python", 527 | "name": "python", 528 | "nbconvert_exporter": "python", 529 | "pygments_lexer": "ipython3", 530 | "version": "3.7.3" 531 | } 532 | }, 533 | "nbformat": 4, 534 | "nbformat_minor": 2 535 | } 536 | -------------------------------------------------------------------------------- /Homework/Homework5.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 5\n", 8 | "## CSCI E-82A\n", 9 | "\n", 10 | "## Background\n", 11 | "\n", 12 | "Robotics are becoming common place in many business situations. For example in retail sales robotics are used in warehouse management. Amazon is widely known to be a leader in this area. But, according to a recent Wall Street Journal article, UK online grocery retailer, Ocado, is rumored to have even more sophisticated order fulfilment robotics.\n", 13 | "\n", 14 | "https://www.wsj.com/articles/how-robots-and-drones-will-change-retail-forever-1539604800 \n", 15 | "\n", 16 | "Not surprisingly, companies like Amazon and Ocado do not disclose much information on their robotics. Nonetheless, we can be sure that sensor fusion is a significant problem. For example, sensor fusion is a significant issue with self driving cars. See for example:\n", 17 | "\n", 18 | "https://arxiv.org/ftp/arxiv/papers/0709/0709.1099.pdf \n", 19 | "\n", 20 | "These types of complex robots require years, even decades, to develop and perfect. In reality, the processes used in such complex robots are decomposable into a large number of **simple single tasks**. The complete system then operates by integrating the many single tasks into a **multi-task** environment. To avoid this complexity (and finish the course in less than 10 years:) you will address a simple single task problem. " 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "metadata": {}, 26 | "source": [ 27 | "\n", 28 | "## Scenario \n", 29 | "\n", 30 | "Bob's Orchards is a premium seller of apples and pears. Bob's customers pay a substantial premium for superior fruit. 
To satisfy these customers, Bob's must ensure that the fruit delivered is correctly packed and perfectly ripe. \n", 31 | "\n", 32 | "Like many legacy industries requiring specialized human skills, Bob's is facing a talent problem. An expert human inspector will only pass fruit at the perfect ripeness, maximizing customer satisfaction and utility. \n", 33 | "\n", 34 | "However, many of the human inspectors who expertly check each piece of fruit shipped for ripeness are approaching retirement age. Management's attempts to recruit younger people to apprentice as fruit inspectors have been, well, fruitless (oh, sorry!:). Therefore, it has become imperative to find some type of automated system which can reduce the workload on the diminishing number of human inspectors. To address this problem, Bob's is deploying technology from Robots R Us.\n", 35 | "\n", 36 | "The first robotic system to be deployed at Bob's uses a sensor array to determine if the fruit being shipped is at the correct ripeness. There are two sensors, a color vision system that examines the fruit to determine if it is ripe, and a smell sensor that determines if the fruit is not ripe enough or overripe. If either sensor indicates the fruit is bad, it is not shipped. In addition, customers may reject even perfect fruit for no apparent reason, whereas others seem perfectly happy with less than perfect fruit. \n", 37 | "\n", 38 | "The probability distributions, decisions and utilities of this system can be summarized as follows:\n", 39 | "1. The unconditional probability distribution of the ripeness of the fruit being packed is known.\n", 40 | "2. A conditional distribution for the visual color sensor reading conditioned on fruit quality. \n", 41 | "3. A conditional distribution for the smell sensor reading conditioned on fruit quality. \n", 42 | "4. A conditional probability distribution of shipping the fruit conditioned on the readings of both sensors. This can be thought of as classifying the fruit as good or bad based on the sensor results. There is a decision variable associated with this CPD. The fruit is only shipped if both sensors read positive. \n", 43 | "5. There is a conditional probability distribution of a customer accepting an order, or not, conditioned on the CPD for the fruit classified for shipment and the actual fruit quality. \n", 44 | "6. There is a utility of the customer accepting or rejecting an order. " 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "metadata": {}, 50 | "source": [ 51 | "## Instructions\n", 52 | "\n", 53 | "You have been hired as a consultant to determine the optimal decision process for the ripeness testing robot. To perform this analysis you will do the following steps:\n", 54 | "\n", 55 | "1. Draw an influence diagram for the fruit inspection task. \n", 56 | "2. Use a combination of pgmpy and Python with numpy to compute and compare the utility for using robot-aided inspection.\n", 57 | "3. Compare the utility of robot-aided inspection to the current manual inspection process. " 58 | ] 59 | }, 60 | { 61 | "cell_type": "markdown", 62 | "metadata": {}, 63 | "source": [ 64 | "### Influence Diagram\n", 65 | "\n", 66 | "Once you have completed your influence diagram, display it here by replacing the influence.jpg file. \n", 67 | "\n", 68 | "\"Drawing\"\n", 69 | "
Your influence diagram must go here
\n", 70 | "\n", 71 | "\n", 72 | "> **Note:** You can save your figure as a file titled Influence.jpg in the same directory as your notebook. The figure should then be visible in the notebook and in the .html you will download. \n" 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "metadata": {}, 78 | "source": [ 79 | "### Define the DAG \n", 80 | "\n", 81 | "Before you proceed, execute the code in the cell below:" 82 | ] 83 | }, 84 | { 85 | "cell_type": "code", 86 | "execution_count": null, 87 | "metadata": {}, 88 | "outputs": [], 89 | "source": [ 90 | "from pgmpy.models import BayesianModel\n", 91 | "from pgmpy.factors.discrete import TabularCPD\n", 92 | "from pgmpy.inference import VariableElimination\n", 93 | "import numpy as np" 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "metadata": {}, 99 | "source": [ 100 | "As a next step, you will define the CPDs for the DAG. \n", 101 | "\n", 102 | "**Fruit Quality** \n", 103 | "\n", 104 | "The fruit quality arriving from the orchard is characterized by the unconditional probability distribution: \n", 105 | "\n", 106 | "| Fruit Quality | Bad | Good |\n", 107 | "|----|----|----|\n", 108 | "|Probability | 0.3 | 0.7 | \n", 109 | "\n", 110 | "**Color Sensor** \n", 111 | "\n", 112 | "The color sensor determines fruit quality given the actual quality of the fruit. The sensor is not completely reliable. It will detect good fruit as bad and vice versa. The vendor has tuned the sensor to improve accuracy of bad fruit detection at the expense of good fruit detection accuracy. The CPD is as follows: \n", 113 | "\n", 114 | "| sensor reading | bad fruit | good fruit |\n", 115 | "|----|----|----|\n", 116 | "| Sensed as Bad | 0.9 | 0.20 |\n", 117 | "| Sensed as Good | 0.1 | 0.80 |\n", 118 | "\n", 119 | "**Smell Sensor** \n", 120 | "\n", 121 | "The smell sensor determines fruit quality given the actual quality of the fruit. As with the color sensor, this sensor is not completely reliable. It will detect good fruit as bad and vice versa. The vendor has tuned the sensor to improve accuracy of bad fruit detection at the expense of good fruit detection accuracy. Overall, this sensor is less reliable than the color sensor. The CPD is as follows: \n", 122 | "\n", 123 | "| sensor reading | bad fruit | good fruit |\n", 124 | "|----|----|----|\n", 125 | "| Sensed as Bad | 0.8 | 0.30 |\n", 126 | "| Sensed as Good | 0.2 | 0.70 |\n", 127 | "\n", 128 | "**Fruit Classification**\n", 129 | "\n", 130 | "Bob's Fruit is quite particular about the quality of fruit shipped. An order will not be shipped unless both sensors agree that the fruit is good. \n", 131 | "\n", 132 | "You must determine the values of this CPD and the evidence representing the decision process. This variable is conditioned on the two sensor CPDs. Keep in mind that as long as the probabilities in each column of the CPD adds to 1.0, this is a valid distribution.\n", 133 | "\n", 134 | "> **Note:** You can think of this table as a binary or logical operator. The result should only be to ship fruit if the color sensor AND the smell senor agree the fruit is good. \n", 135 | "\n", 136 | "**Customer Satisfaction**\n", 137 | "\n", 138 | "The customer satisfaction is conditional on the actual fruit quality and the sensor classification of the shipment. Some customers will reject good shipments, whereas some customers will accept a bad shipment. 
This CPD is:\n", 139 | "\n", 140 | "| Conditional Variables | Bad sensors - Bad fruit | Bad sensors - Good fruit | Good sensors - Bad fruit | Good sensors - Good fruit | \n", 141 | "|----|----|----|----|----| \n", 142 | "| Not satisfied | 0.8 | 0.1 | 0.8 | 0.1 | \n", 143 | "| Satisfied | 0.2 | 0.9 | 0.2 | 0.9 | \n", 144 | "\n", 145 | "> **Note:** In the above table the notation is as follows:\n", 146 | "> - Bad sensors = the sensors indicate the fruit is bad. \n", 147 | "> - Good sensors = both sensors agree that the fruit is good. \n", 148 | "\n", 149 | "Define these CPDs in the cell below. " 150 | ] 151 | }, 152 | { 153 | "cell_type": "code", 154 | "execution_count": null, 155 | "metadata": {}, 156 | "outputs": [], 157 | "source": [] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "metadata": {}, 162 | "source": [ 163 | "In the cell below do the following:\n", 164 | "\n", 165 | "1. Define the DAG model.\n", 166 | "2. Add the CPDs to the model.\n", 167 | "3. Check the model." 168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "metadata": {}, 174 | "outputs": [], 175 | "source": [] 176 | }, 177 | { 178 | "cell_type": "markdown", 179 | "metadata": {}, 180 | "source": [ 181 | "### Inference and Utility Analysis\n", 182 | "\n", 183 | "Now, you will define the utility function of customer satisfaction. In the cell below define an array for the utility function as shown in the table:\n", 184 | "\n", 185 | "| | Satisfied | Not Satisfied |\n", 186 | "|----|----|----|\n", 187 | "|Utility | 20 | -40 |" 188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": null, 193 | "metadata": {}, 194 | "outputs": [], 195 | "source": [] 196 | }, 197 | { 198 | "cell_type": "markdown", 199 | "metadata": {}, 200 | "source": [ 201 | "As a first step in this analysis, you will create baseline utility figures so that you can compare these to other utilities. Compute the utility for these cases:\n", 202 | "\n", 203 | "1. The fruit is shipped without inspection. The quality of the fruit will be determined by what comes from the orchard. \n", 204 | "2. The fruit is 100% manually inspected by the expert human inspector, so that fruit of perfect quality is shipped, keeping in mind that customers may reject perfectly good fruit. \n", 205 | "\n", 206 | "> **Hint:** You may wish to do these calculations using numpy, rather than pgmpy. " 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": {}, 213 | "outputs": [], 214 | "source": [] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "metadata": {}, 219 | "source": [ 220 | "Does inspection of the fruit significantly improve customer satisfaction? " 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "metadata": {}, 226 | "source": [ 227 | "ANS: " 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "In the cell below define a function to compute the utility given a VariableElimination object, a query variable, the utility function, and an evidence dictionary. " 235 | ] 236 | }, 237 | { 238 | "cell_type": "code", 239 | "execution_count": null, 240 | "metadata": {}, 241 | "outputs": [], 242 | "source": [] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "metadata": {}, 247 | "source": [ 248 | "In the cell below create a VariableElimination object using your model as an argument."
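A minimal sketch of the baseline arithmetic and the requested utility helper follows. The node name 'S' (customer satisfaction) and the model name `fruit_model` are hypothetical stand-ins for your own definitions; state 0 = not satisfied and state 1 = satisfied is an assumed ordering, and the probabilities are taken from the tables above:

```python
import numpy as np
from pgmpy.inference import VariableElimination

# Utility of customer satisfaction: index 0 = not satisfied, index 1 = satisfied.
satisfaction_utility = np.array([-40.0, 20.0])

# Baseline 1: ship all fruit uninspected. P(satisfied) mixes good and bad fruit.
p_good = 0.7                                 # from the fruit quality table
p_sat = p_good * 0.9 + (1.0 - p_good) * 0.2  # from the satisfaction CPD
print(np.sum(np.array([1.0 - p_sat, p_sat]) * satisfaction_utility))  # about 1.4

# Baseline 2: perfect human inspection, so only good fruit is shipped.
print(np.sum(np.array([0.1, 0.9]) * satisfaction_utility))            # about 14.0

def expected_utility(inference, variable, utility, evidence=None):
    """Expected utility: sum over states of P(state | evidence) * utility[state]."""
    marginal = inference.query([variable], evidence=evidence)[variable]
    return np.sum(marginal.values * utility)

# Variable elimination object over the (hypothetical) fruit inspection model.
fruit_infer = VariableElimination(fruit_model)
```

Note the dictionary-style indexing of the `query` result matches older pgmpy releases; newer releases return the factor directly.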
249 | ] 250 | }, 251 | { 252 | "cell_type": "code", 253 | "execution_count": null, 254 | "metadata": {}, 255 | "outputs": [], 256 | "source": [] 257 | }, 258 | { 259 | "cell_type": "markdown", 260 | "metadata": {}, 261 | "source": [ 262 | "Now, you are ready to do a query on your model and to compute the utility of the robotic fruit inspection. In the cell below do the following:\n", 263 | "\n", 264 | "1. Compute and print the results of a query on the customer satisfaction variable, with the decision variable (evidence) set so that only fruit determined good by both sensors is shipped. \n", 265 | "2. Compute and print the utility for customer satisfaction with the decision variable (evidence) set so that only fruit determined good by both sensors is shipped. " 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": null, 271 | "metadata": { 272 | "scrolled": true 273 | }, 274 | "outputs": [], 275 | "source": [] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "Examine the marginal distribution of the query variable. Is this distribution (done by sensors) close to the value of customer satisfaction with perfect fruit inspection (done by expert human inspectors)? \n", 282 | "\n", 283 | "Next, compare the utility value of sensor inspection to the one found for perfect fruit inspection. Are the values similar? " 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": {}, 289 | "source": [ 290 | "ANS 1: \n", 291 | "\n", 292 | "ANS 2: " 293 | ] 294 | }, 295 | { 296 | "cell_type": "markdown", 297 | "metadata": {}, 298 | "source": [ 299 | "The foregoing analysis assumes the cost of either automatic or human fruit inspection is zero. It is more realistic to consider the costs of the inspection when computing and comparing the total utility for the two scenarios. \n", 300 | "\n", 301 | "Assume the following utility functions for human and automated inspection:\n", 302 | "\n", 303 | "| | No Inspection | Inspection |\n", 304 | "|----|----|----|\n", 305 | "|Human Inspection Utility | 0 | -5 | \n", 306 | "\n", 307 | "| | No Inspection | Inspection |\n", 308 | "|----|----|----|\n", 309 | "|Sensor Inspection Utility | 0 | -1 |\n", 310 | "\n", 311 | "Using these utility functions compute and compare the total utility for both scenarios and answer these questions. \n", 312 | "\n", 313 | "- What is the total utility for human inspected fruit?\n", 314 | "- What is the total utility for sensor inspected fruit?\n", 315 | "- Which process produces higher utility? " 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": {}, 321 | "source": [ 322 | "ANS 1: \n", 323 | "ANS 2: \n", 324 | "ANS 3: " 325 | ] 326 | }, 327 | { 328 | "cell_type": "markdown", 329 | "metadata": {}, 330 | "source": [ 331 | "### Single Sensor \n", 332 | "\n", 333 | "Your foregoing analysis is based on a scenario with two sensors with **independent errors**. Now, you will investigate the value of using a single sensor versus using multiple sensors. \n", 334 | "\n", 335 | "The color vision sensor is known to be more accurate, so the scenario is to use only this sensor. The DAG no longer needs a CPD for the smell sensor or a CPD for the classification of fruit quality, which integrated the output of the two sensors. Fruit determined to be good by the single sensor will be shipped. The decision variable (ship or not) is associated only with the single sensor CPD. 
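Before moving to the single-sensor model, a hedged sketch of the cost-adjusted comparison for the two-sensor scenario above, reusing the `expected_utility` helper and `fruit_infer` object from the earlier sketch; the node name 'S' and the decision evidence `{'Ship': 1}` are hypothetical stand-ins for your own encoding:

```python
# Expected satisfaction utility when only fruit passing both sensors is shipped.
u_robot = expected_utility(fruit_infer, 'S', satisfaction_utility,
                           evidence={'Ship': 1})   # hypothetical decision encoding

# Total utility = satisfaction utility minus the inspection cost from the tables.
total_human = 14.0 - 5.0      # perfect human inspection, cost of 5 per unit
total_robot = u_robot - 1.0   # two-sensor inspection, cost of 1 per unit
print(total_human, total_robot)
```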
\n", 336 | "\n", 337 | "In the cell below you will do the following:\n", 338 | "\n", 339 | "1. Create a new CPD for customer satisfaction. The probability distribution is no different. However, this CPD is conditional on the color vision sensor CPD and actual fruit quality. \n", 340 | "2. Define the edges of your DAG model with only the three remaining CPDs. Notice two CPDs are unchanged. \n", 341 | "3. Add the CPDs to you new DAG model. \n", 342 | "4. Check you model!" 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": null, 348 | "metadata": {}, 349 | "outputs": [], 350 | "source": [] 351 | }, 352 | { 353 | "cell_type": "markdown", 354 | "metadata": {}, 355 | "source": [ 356 | "Now, you are ready to do a query on your new model and to compute utility of the single sensor inspection. In the cell below do the following:\n", 357 | "\n", 358 | "1. Compute and print the results of a query on the customer satisfaction variable, with decision variable (evidence) that only fruit determined good by the color vision sensor is shipped, the decision variable. \n", 359 | "2. Compute and print the utility for customer satisfaction with decision variable (evidence) that only fruit determined good by the single sensors is shipped, the decision variable. " 360 | ] 361 | }, 362 | { 363 | "cell_type": "code", 364 | "execution_count": null, 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [] 368 | }, 369 | { 370 | "cell_type": "markdown", 371 | "metadata": {}, 372 | "source": [ 373 | "Examine the marginal distribution of the query variable. Is this distribution close to the value of customer satisfaction for fruit inspected by the two sensor array? \n", 374 | "\n", 375 | "Next, compare the utility value to the one found for fruit inspected by the two sensor array. Are the values significantly different? Which method is superior? " 376 | ] 377 | }, 378 | { 379 | "cell_type": "markdown", 380 | "metadata": {}, 381 | "source": [ 382 | "ANS 1: \n", 383 | "\n", 384 | "ANS 2: " 385 | ] 386 | }, 387 | { 388 | "cell_type": "markdown", 389 | "metadata": {}, 390 | "source": [ 391 | "As with the first comparison of utilities the cost of inspection should be considered. However, this difference may not be that great, since much of fixtures and computer system required are the same as for single sensor inspection. The utility function for single sensor inspection is shown below. \n", 392 | "\n", 393 | "| | No Inspection | Inspection |\n", 394 | "|----|----|----|\n", 395 | "|Sensor Inspection Utility | 0 | -0.8 |\n", 396 | "\n", 397 | "Using this utility function, compute the total utility of single sensor inspection and compare it to the total utility for multi-sensor inspection. \n", 398 | "\n", 399 | "- What is the total utility for single sensor inspected fruit?\n", 400 | "- Which process produces higher total utility? \n", 401 | "- In understanding this difference how important do you think it is that the errors of the two sensors are independent? 
" 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": {}, 407 | "source": [ 408 | "ANS 1: \n", 409 | "ANS 2: \n", 410 | "ANS 3: " 411 | ] 412 | }, 413 | { 414 | "cell_type": "code", 415 | "execution_count": null, 416 | "metadata": {}, 417 | "outputs": [], 418 | "source": [] 419 | } 420 | ], 421 | "metadata": { 422 | "kernelspec": { 423 | "display_name": "Python 3", 424 | "language": "python", 425 | "name": "python3" 426 | }, 427 | "language_info": { 428 | "codemirror_mode": { 429 | "name": "ipython", 430 | "version": 3 431 | }, 432 | "file_extension": ".py", 433 | "mimetype": "text/x-python", 434 | "name": "python", 435 | "nbconvert_exporter": "python", 436 | "pygments_lexer": "ipython3", 437 | "version": "3.7.3" 438 | } 439 | }, 440 | "nbformat": 4, 441 | "nbformat_minor": 2 442 | } 443 | -------------------------------------------------------------------------------- /Homework/Homework6.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 6\n", 8 | "\n", 9 | "## CSCI E-82A\n", 10 | "\n", 11 | "\n", 12 | "In the Dynamic Programming (DP) lesson Jupyter notebook, we constructed a representation of a simple grid world. DP was used to find optimal plans for a robot to navigate from any starting location on the grid to the goal. This problem is an analog for more complex real-world robot navigation problems. \n", 13 | "\n", 14 | "In this homework you will use DP to solve a slightly more complex robotic navigation problem in a grid world. This grid world is a simple version of the problem a material transport robot might encounter in a warehouse. The situation is illustrated in the figure below.\n", 15 | "\n", 16 | "\"Drawing\"\n", 17 | "
Grid World for Factory Navigation Example
" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": {}, 23 | "source": [ 24 | "The goal is for the robot to deliver some material to position (state) 12, shown in blue. Since there is a goal state or **terminal state** this an **episodic task**. \n", 25 | "\n", 26 | "There are some barriers comprised of the states $\\{ 6, 7, 8 \\}$ and $\\{ 16, 17, 18 \\}$, shown with hash marks. In a real warehouse, these positions might be occupied by shelving or equipment. We do not want the robot to hit these barriers. Thus, we say that transitioning to these barrier states is **taboo**.\n", 27 | "\n", 28 | "As before, we do not want the robot to hit the edges of the grid world, which represent the outer walls of the warehouse. \n", 29 | "\n" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "## Representation\n", 37 | "\n", 38 | "As with many such problems, the starting place is creating the **representation**. In the cell below encode your representation for the possible action-state transitions. From each state there are 4 possible actions:\n", 39 | "- up, u\n", 40 | "- down, d,\n", 41 | "- left, l\n", 42 | "- right, r\n", 43 | "\n", 44 | "There are a few special cases you need to consider:\n", 45 | "- Any action transitioning to a state off the grid or into a barrier should keep the state unchanged. \n", 46 | "- Once in the goal state there are no more state transitions. \n", 47 | "- Any transition within the barrier (taboo) states can keep the state unchanged. If you experiment, you will see that other encodings work as well since the value of a barrier states are always zero and there are no actions transitioning into these states. \n", 48 | "\n", 49 | "> **Hint:** It may help you create a pencil and paper sketch of the transitions, rewards, and probabilities or policy. This can help you to keep the bookkeeping correct. \n", 50 | "\n", 51 | "In the cell below define a dictionary where your code can look up the successor state given the current state and action. " 52 | ] 53 | }, 54 | { 55 | "cell_type": "code", 56 | "execution_count": null, 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "## import numpy for latter\n", 61 | "import numpy as np\n", 62 | "\n" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "You need to define the initial policy for the Markov process. Set the probabilities for each transition as a **uniform distribution** leading to random action by the robot. In the subsequent sections of this notebook you will improve this policy. \n", 70 | "\n", 71 | "> **Note:** As these are just starting values, the exact values of the transition probabilities are not actually all that important in terms of solving the DP problem. Also, notice that it does not matter how the taboo state transitions are encoded. The point of the DP algorithm is to learn the transition policy. \n", 72 | "\n", 73 | "In the cell below, define a dictionary with the initial policy. " 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": null, 79 | "metadata": {}, 80 | "outputs": [], 81 | "source": [] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "metadata": {}, 86 | "source": [ 87 | "The robot receives the following rewards:\n", 88 | "- +10 for achieving the goal. \n", 89 | "- -1 for attempting to leave the warehouse or hitting the barriers. In other words, we penalize the robot for hitting the edges of the grid or the barriers. 
\n", 90 | "- -0.1 for all other state transitions, which is the cost for the robot to move from one state to another. \n", 91 | "\n", 92 | "In the code cell below encode a dictionary with your representation of this reward structure. " 93 | ] 94 | }, 95 | { 96 | "cell_type": "code", 97 | "execution_count": null, 98 | "metadata": {}, 99 | "outputs": [], 100 | "source": [] 101 | }, 102 | { 103 | "cell_type": "markdown", 104 | "metadata": {}, 105 | "source": [ 106 | "You will find it useful to create a list of taboo states, which you can encode in the cell below." 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "## Policy Evaluation\n", 121 | "\n", 122 | "With your representation defined, you can now create and test a function for **policy evaluation**. You will need this function for your policy iteration code. \n", 123 | "\n", 124 | "You are welcome to start with the `compute_state_value` function from the DP notebook. However, keep in mind that you must modify this code to correctly treat the taboo states. Specifically, taboo states should have 0 value. " 125 | ] 126 | }, 127 | { 128 | "cell_type": "code", 129 | "execution_count": null, 130 | "metadata": { 131 | "scrolled": false 132 | }, 133 | "outputs": [], 134 | "source": [] 135 | }, 136 | { 137 | "cell_type": "markdown", 138 | "metadata": {}, 139 | "source": [ 140 | "Examine the state values you have computed using a random walk for the robot. Answer the following questions:\n", 141 | "\n", 142 | "1. Are the values of the goal and taboo states zero? \n", 143 | "2. Do the values of the states increase closer to the goal? \n", 144 | "3. Do the goal and barrier states all have zero values? " 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "metadata": {}, 150 | "source": [ 151 | "ANS 1: \n", 152 | "ANS 2: \n", 153 | "ANS 3: " 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "metadata": {}, 159 | "source": [ 160 | "## Policy Iteration\n", 161 | "\n", 162 | "Now that you have your representation and a functions to perform the MC policy evaluation you have everything you need to apply the policy improvement algorithm to create an optimal policy for the robot to reach the goal. \n", 163 | "\n", 164 | "If your policy evaluation functions works correctly, you should be able to use the `policy_iteration` function from the DP notebook with minor modifications. Make sure you print the state values as well as the policy you discovered. " 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": null, 170 | "metadata": { 171 | "scrolled": false 172 | }, 173 | "outputs": [], 174 | "source": [] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "metadata": {}, 179 | "source": [ 180 | "Examine your results. First look at the state values at convergence of the policy iteration algorithm and answer the following questions:\n", 181 | "1. Are non-taboo state values closest to the goal the largest? \n", 182 | "2. Are the non-taboo state values farthest from the goal the smallest? Keep in mind the robot must travel around the barrier. \n", 183 | "3. Are the non-taboo state values symmetric (e.g. same) with respect to distance from the goal? \n", 184 | "4. Do the taboo states have 0 values? \n", 185 | "5. 
How do the state values of the improved policy compare to the state values of the initial policy?\n" 186 | ] 187 | }, 188 | { 189 | "cell_type": "markdown", 190 | "metadata": {}, 191 | "source": [ 192 | "ANS 1: \n", 193 | "ANS 2: \n", 194 | "ANS 3: \n", 195 | "ANS 4: \n", 196 | "ANS 5: " 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "Next, examine the policy you have computed. Do the following:\n", 204 | "- Follow the optimal paths from the 4 corners of the grid to the goal. How do the symmetry and length of these paths make sense in terms of path length and state values? " 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": [ 211 | "ANS: " 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": {}, 217 | "source": [ 218 | "- Imagine that the door for the warehouse is at position (state) 2. Insert an illustration showing the paths of the optimal plans below. You are welcome to start with the PowerPoint illustration in the course Github repository. " 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "**Insert your image here** \n", 226 | " \n", 227 | "\"Drawing\"\n", 228 | "
Grid world optimal plans from state 2 to the goal go here
\n" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "metadata": {}, 234 | "source": [ 235 | "## Value Iteration \n", 236 | "\n", 237 | "Finally, use the value iteration algorithm to compute an optimal policy for the robot reaching the goal. Keep in mind that you will need to maintain a state value of 0 for the taboo states. You may use the 'value_iteration' function from the DP notebook with minor modifications." 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 243 | "metadata": { 244 | "scrolled": false 245 | }, 246 | "outputs": [], 247 | "source": [] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "metadata": {}, 252 | "source": [ 253 | "Compare your results from the value iteration algorithm to your results from the policy iteration algorithm and answer the following questions: \n", 254 | "1. Are the state values identical between the two methods? \n", 255 | "2. Ignoring the taboo states, are the plans computed by the two methods identical? \n", 256 | "3. Ignoring the taboo states, are the final state values computed from both methods the same. " 257 | ] 258 | }, 259 | { 260 | "cell_type": "markdown", 261 | "metadata": {}, 262 | "source": [ 263 | "ANS 1: \n", 264 | "ANS 2: \n", 265 | "ANS 3: " 266 | ] 267 | }, 268 | { 269 | "cell_type": "code", 270 | "execution_count": null, 271 | "metadata": {}, 272 | "outputs": [], 273 | "source": [] 274 | } 275 | ], 276 | "metadata": { 277 | "kernelspec": { 278 | "display_name": "Python 3", 279 | "language": "python", 280 | "name": "python3" 281 | }, 282 | "language_info": { 283 | "codemirror_mode": { 284 | "name": "ipython", 285 | "version": 3 286 | }, 287 | "file_extension": ".py", 288 | "mimetype": "text/x-python", 289 | "name": "python", 290 | "nbconvert_exporter": "python", 291 | "pygments_lexer": "ipython3", 292 | "version": "3.7.3" 293 | } 294 | }, 295 | "nbformat": 4, 296 | "nbformat_minor": 2 297 | } 298 | -------------------------------------------------------------------------------- /Homework/Homework7.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Homework 7\n", 8 | "\n", 9 | "## CSCI E-82A \n", 10 | "\n", 11 | "[**Value at Risk (VaR)**](https://en.wikipedia.org/wiki/Value_at_risk) is a commonly used measure of financial risk. VaR is as an **order statistic**, being defined at a probability quantile $p$. For example, for the 1% var is there is a probability of $0.01$ that the loss will be at the lower 1% quantile of the returns or greater. \n", 12 | "\n", 13 | "Returns of financial assets are well know to be non-Gaussian. This presents significant problems for analysis, including risk analysis. Many approaches have been used to create probability models which better represent the returns of financial assets. \n", 14 | "\n", 15 | "One possibility is to use Gaussian mixture models to model the distribution of financial returns. In this assignment, you will model returns of the [S&P 500 Index](https://en.wikipedia.org/wiki/S%26P_500_Index) using Gaussian mixture models and then compute the VaR. \n", 16 | "\n", 17 | "> **Note:** In this exercise you will work with daily returns and therefore daily VaR. You may not be familiar with the concept of returns. The return is the difference between the value from one period (day) to the next (day). A **positive return** means the asset has increased in value. 
In contrast, a **negative return** means the asset has decreased in value and represents losses. It is these losses the VaR model seeks to quantify. \n", 18 | "\n", 19 | "> More specifically, in this assignment you will work with log returns. Here the log returns are the returns of the log values of the series. " 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "## Load and Examine the Dataset\n", 27 | "\n", 28 | "As a first step you will load a dataset containing approximately 5 years of daily index value data for the S&P 500 index (benchmark indices do not have prices). \n", 29 | "\n", 30 | "Execute the code in the cell below to load the packages you will need for this assignment. " 31 | ] 32 | }, 33 | { 34 | "cell_type": "code", 35 | "execution_count": null, 36 | "metadata": {}, 37 | "outputs": [], 38 | "source": [ 39 | "import pandas as pd\n", 40 | "import numpy as np\n", 41 | "import numpy.random as nr\n", 42 | "from sklearn.linear_model import LinearRegression\n", 43 | "from sklearn.preprocessing import scale\n", 44 | "from sklearn.mixture import GaussianMixture, BayesianGaussianMixture\n", 45 | "from scipy.stats import norm\n", 46 | "import matplotlib.pyplot as plt\n", 47 | "%matplotlib inline" 48 | ] 49 | }, 50 | { 51 | "cell_type": "markdown", 52 | "metadata": {}, 53 | "source": [ 54 | "Execute the code in the cell below to load the S&P 500 daily values, downloaded from Yahoo finance on October 8, 2019. " 55 | ] 56 | }, 57 | { 58 | "cell_type": "code", 59 | "execution_count": null, 60 | "metadata": {}, 61 | "outputs": [], 62 | "source": [ 63 | "SP_Raw = pd.read_csv('GSPC_Yahoo_Oct_8_19.csv')" 64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "metadata": {}, 69 | "source": [ 70 | "Execute the code in the cell below and examine the column names." 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": null, 76 | "metadata": {}, 77 | "outputs": [], 78 | "source": [ 79 | "SP_Raw.columns" 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "metadata": {}, 85 | "source": [ 86 | "There are a number of columns in this dataset. You will be working with the `Adj Close` column. The **adjusted close** is the final price of the day, adjusted for stock splits and dividend payments. \n", 87 | "\n", 88 | "Execute the code in the cell below to add a count of days from the start of the series and plot the time series of adjusted close values." 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": null, 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "SP_Raw['Day_Count'] = [float(i) for i in range(SP_Raw.shape[0])]\n", 98 | "plt.plot(SP_Raw['Day_Count'],SP_Raw['Adj Close'])" 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "There are several points to notice in this time series: \n", 106 | "\n", 107 | "1. There is a long-term trend of the daily values. We will remove this effect so we can examine the day-to-day returns (changes) without being influenced by the overall trend.\n", 108 | "2. There is considerable volatility (variance) in this time series. We want to calculate the VaR of this time series, and, in detail, this variability will affect the results. We will ignore this problem for this assignment." 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "metadata": {}, 114 | "source": [ 115 | "## Remove the Trend\n", 116 | "\n", 117 | "To compute returns the trend must be removed from the time series. 
In this case, we will simply fit and remove a linear trend. \n", 118 | " \n", 119 | " As a first step you must compute two numpy arrays:\n", 120 | " - The feature (independent variable) is the 'Day_Count' column.\n", 121 | " - The label (dependent variable) is the log of the 'Adj Close' column. \n", 122 | " \n", 123 | "Make sure to apply the [numpy ravel](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ravel.html) method to each array to ensure the dimensionality is correct. \n", 124 | " \n", 125 | "> **Note:** It is standard practice in capital markets analytics to work with log returns. The intuition for this is that the price of capital market assets cannot drop below 0, so taking the logarithm transforms the return distribution to be closer to Gaussian. Often the word *return* actually means *log return*. In the rest of this document, we will mostly just use the term *return*, but will mean *log return* unless otherwise noted. " 126 | ] 127 | }, 128 | { 129 | "cell_type": "code", 130 | "execution_count": null, 131 | "metadata": {}, 132 | "outputs": [], 133 | "source": [] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "metadata": {}, 138 | "source": [ 139 | "In the cell below do the following: \n", 140 | "- Scale the feature column with the scale function from sklearn.preprocessing. \n", 141 | "- Define and fit the model using the LinearRegression model and fit method from sklearn.linear_model. You will need to reshape the feature column with [`.reshape(-1,1)`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html), since there is only one column in the feature array. \n", 142 | "- Subtract the values predicted by the model (use the `predict` method) from the log values to find a single column array. You will need to apply `.reshape(number_of_days, 1)` to this result to get the dimensions correct. \n", 143 | "- [Concatenate ](https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html)the following as columns of a numpy array: \n", 144 | " - Unscaled label (dependent variable) values. \n", 145 | " - Scaled feature (independent variable) values. \n", 146 | " - The difference between the log values and predicted values from the regression. \n", 147 | " \n", 148 | "You will need to ensure the dimensions of all these columns are correct before applying the `concatenate` method. " 149 | ] 150 | }, 151 | { 152 | "cell_type": "code", 153 | "execution_count": null, 154 | "metadata": {}, 155 | "outputs": [], 156 | "source": [] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "Now, make a plot of the difference between the log values and predicted values. Ensure that the trend removal process has produced a reasonable result." 163 | ] 164 | }, 165 | { 166 | "cell_type": "code", 167 | "execution_count": null, 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [] 171 | }, 172 | { 173 | "cell_type": "markdown", 174 | "metadata": {}, 175 | "source": [ 176 | "## Compute the Returns\n", 177 | "\n", 178 | "With the trend removed from the daily values, it's time to compute the returns. In the cell below do the following:\n", 179 | "\n", 180 | "- Compute the returns using the numpy [`diff`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.diff.html) method. \n", 181 | "- The length of the difference (return) series will be 1 less than the original series. You will need to add a value of 0.0 to the start of the series, using the numpy `concatenate` method. 
\n", 182 | "- Concatenate the difference value column to the numpy array you created previously. Remember to apply `ravel` and `reshape` to the difference array. " 183 | ] 184 | }, 185 | { 186 | "cell_type": "code", 187 | "execution_count": null, 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [] 191 | }, 192 | { 193 | "cell_type": "markdown", 194 | "metadata": {}, 195 | "source": [ 196 | "In the cell below, plot the time series of the returns." 197 | ] 198 | }, 199 | { 200 | "cell_type": "code", 201 | "execution_count": null, 202 | "metadata": {}, 203 | "outputs": [], 204 | "source": [] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": {}, 209 | "source": [ 210 | "Examine this series. Notice that the volatility (local variance) of these returns change with time. Volatility remains relatively constant for a period, and then changes. This is known as the [clustering of volatility effect](https://en.wikipedia.org/wiki/Volatility_clustering). To model volatility clustering, Prof Robert Engel to develop the [ARCH model](http://www.econ.uiuc.edu/~econ508/Papers/engle82.pdf) which won him the Nobel prize in economics. \n", 211 | "\n", 212 | "In this exercise we will ignore the non-stationarity (clustering of volatility) nature of the return series. This may lead to some less than idea results. " 213 | ] 214 | }, 215 | { 216 | "cell_type": "markdown", 217 | "metadata": {}, 218 | "source": [ 219 | "## Exploration of the Return Series\n", 220 | "\n", 221 | "Now you will explore the probability properties of the return series and compute the empirical VaR. You will then compare this to the VaR computed using a single Gaussian distribution. \n", 222 | "\n", 223 | "In the cell below, use the scale function from sklearn.preprocessing to scale the return series. " 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": null, 229 | "metadata": {}, 230 | "outputs": [], 231 | "source": [] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "metadata": {}, 236 | "source": [ 237 | "Now, plot a histogram of the return values, using 50 bins. " 238 | ] 239 | }, 240 | { 241 | "cell_type": "code", 242 | "execution_count": null, 243 | "metadata": {}, 244 | "outputs": [], 245 | "source": [] 246 | }, 247 | { 248 | "cell_type": "markdown", 249 | "metadata": {}, 250 | "source": [ 251 | "**Question:** Carefully examine this histogram. In particular, look for signs of asymmetry and the 'weight' of the tails. What do you observe in terms of differences of this empirical distribution compared to a textbook Gaussian (Normal) distribution. \n", 252 | "\n", 253 | "> **Hint:**: There are two well know reasons that Gaussian (Normal) distributions do not model the (log) returns of financial assets. The actual returns are asymmetric and have greater 'weight' in the tails than would be modeled with a Gaussian distribution. " 254 | ] 255 | }, 256 | { 257 | "cell_type": "markdown", 258 | "metadata": {}, 259 | "source": [ 260 | "ANS: " 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": {}, 266 | "source": [ 267 | "Now, you will compute the 1% (p = 0.01) **empirical VaR** using the values of the return series. Write a function following this recommended process:\n", 268 | "1. Sort the values with the numpy sort method.\n", 269 | "2. Find the index of the at the 0.01 quantile.\n", 270 | "3. 
Print the value for the index.\n", 271 | "\n", 272 | "Execute your function and examine the results.\n", 273 | "\n", 274 | "> **Note:** *Empirical VaR* is the VaR computed using the log return series. The computation is simply a matter of counting the ordered returns from the lowest value to the $p * series\\ length$ value. " 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": null, 280 | "metadata": {}, 281 | "outputs": [], 282 | "source": [] 283 | }, 284 | { 285 | "cell_type": "markdown", 286 | "metadata": {}, 287 | "source": [ 288 | "Compare the empirical VaR you computed above to the **Normal VaR**. Since the return series has already been standardized, you can compute the Normal 1% VaR using the [norm.ppf](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) function from scipy.stats, with $q = 0.01$. \n", 289 | "\n", 290 | "> **Note:** The Normal or Gaussian VaR is simply the pth quantile of the Normal distribution, given the estimated variance of the 0 mean return series. In this case, the return series is standardized, so the variance is just $\\sigma^2 = 1.0$." 291 | ] 292 | }, 293 | { 294 | "cell_type": "code", 295 | "execution_count": null, 296 | "metadata": {}, 297 | "outputs": [], 298 | "source": [] 299 | }, 300 | { 301 | "cell_type": "markdown", 302 | "metadata": {}, 303 | "source": [ 304 | "Compare the empirical 1% VaR with the Normal 1% VaR. Which VaR measure estimates higher losses for the worst 1% of trading days? Which VaR measure do you think is more realistic and trustworthy? " 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "ANS: " 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "## The Gaussian Mixture Model \n", 319 | "\n", 320 | "Now you are ready to model the return series as a GMM and then examine the properties of the returns simulated from this model. In this case you will compute the GMM model using the maximum likelihood estimation (MLE) EM algorithm with the [GaussianMixture](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html) method from the sklearn.mixture package. \n", 321 | "\n", 322 | "In the cell below do the following:\n", 323 | "1. (Optionally) Create a function to print attributes of your GMM model: .weights_, .means_, .covariances_. Such a function will help you understand the components of your model. \n", 324 | "2. Set a numpy.random seed of 654.\n", 325 | "3. Define a GaussianMixture model with the following arguments:\n", 326 | " - n_components = 3, to fit 3 Gaussian distributions in your mixture,\n", 327 | " - verbose = 2, to see some information on the progress of the EM algorithm,\n", 328 | " - covariance_type = 'spherical', since there is only a variance for each univariate distribution,\n", 329 | " - reg_covar = 0.1, to ensure proper regularization. \n", 330 | "4. Fit the log return series with the .fit method. You will need to apply `.reshape(-1, 1)` so that the dimensionality of the array is correct. \n", 331 | "5. Print the results of your model using the function you have defined. " 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "metadata": {}, 338 | "outputs": [], 339 | "source": [] 340 | }, 341 | { 342 | "cell_type": "markdown", 343 | "metadata": {}, 344 | "source": [ 345 | "Examine the weights, means and covariances (actually the univariate variance) for the model you have created. 
Given these parameters, what does this tell you about the major (dominant) Gaussian vs. the other components of this model? " 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "ANS: " 353 | ] 354 | }, 355 | { 356 | "cell_type": "markdown", 357 | "metadata": {}, 358 | "source": [ 359 | "With your GMM model computed, you will now simulate returns from the mixture distribution and compute the 1% VaR from the simulated values. In the cell below create code to perform the following operations for 10,000 samples: \n", 360 | "1. Create a function to perform the simulation with the following steps:\n", 361 | " - Start with an [empty](https://docs.scipy.org/doc/numpy/reference/generated/numpy.empty.html) numpy array of dimension (0). \n", 362 | " - For each model weight, compute the fraction of the total realizations which use the parameters for that component of the mixture. It may help you when debugging to print this faction. \n", 363 | " - For the fraction of realizations use the [normal](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.random.normal.html) to compute realizations of a univariate Normal (Gaussian) for each component. \n", 364 | " - Concatenate the realizations computed from each component. \n", 365 | "2. Execute your function to compute the simulated return values. \n", 366 | "3. Compute and print the 1% VaR. " 367 | ] 368 | }, 369 | { 370 | "cell_type": "code", 371 | "execution_count": null, 372 | "metadata": {}, 373 | "outputs": [], 374 | "source": [] 375 | }, 376 | { 377 | "cell_type": "markdown", 378 | "metadata": {}, 379 | "source": [ 380 | "Next, plot the histogram of your simulated returns with 200 bins. " 381 | ] 382 | }, 383 | { 384 | "cell_type": "code", 385 | "execution_count": null, 386 | "metadata": {}, 387 | "outputs": [], 388 | "source": [] 389 | }, 390 | { 391 | "cell_type": "markdown", 392 | "metadata": {}, 393 | "source": [ 394 | "Compare this histogram to the histogram of the empirical returns. Paying attention to the downside tail (the left hand tail of the distribution), and keeping in mind that loss events can occur that did not actually occur in the history (empirical data), what is the key difference and do you think this difference might matter?" 395 | ] 396 | }, 397 | { 398 | "cell_type": "markdown", 399 | "metadata": {}, 400 | "source": [ 401 | "ANS: " 402 | ] 403 | }, 404 | { 405 | "cell_type": "markdown", 406 | "metadata": {}, 407 | "source": [ 408 | "Keeping the differences of the two histogrames in mind, compare the 1% VaR you have computed from your GMM with the 1% empirical VaR you computed earlier. How big is this difference, and do you think it might matter and why?" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": {}, 414 | "source": [ 415 | "ANS: " 416 | ] 417 | }, 418 | { 419 | "cell_type": "markdown", 420 | "metadata": {}, 421 | "source": [ 422 | "## Bayesian Gaussian Mixture Model\n", 423 | "\n", 424 | "As a final step you will model the return series as a Bayesian GMM and then examine the properties of the of returns simulated from this model. This model uses the Bayesian variational EM algorithm. In this case you will compute the Bayesian GMM model using maximum likelihood Estimation (MLE) EM algorithm with the [BayesianGaussianMixture](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.BayesianGaussianMixture.html#sklearn.mixture.BayesianGaussianMixture) method from the sklearn.mixture package. 
\n", 425 | "\n", 426 | "In the cell below do the following:\n", 427 | "1. Set a numpy.random seed of 345.\n", 428 | "3. Define a BayesianGaussianMixture model with the following arguments:\n", 429 | " - n_components = 3, to fit 3 Gaussian distributions in your mixture,\n", 430 | " - verbose = 2, to see some information on the progress of the EM algorithm,\n", 431 | " - covariance_type = 'spherical', since there is only a variance for each univariate distribution,\n", 432 | " - weight_concentration_prior = 0.1, as a prior on the component weights, which puts less mass on the center component of the mixture,\n", 433 | " - mean_prior = [0.0], as a prior on the component means, consistent with expected returns typically being close to 0,\n", 434 | " - covariance_prior = 1.0, as the variance prior for the standardized data,\n", 435 | " - max_iter = 500, since convergence seems to be a problem with this model. \n", 436 | "4. Fit the log return series with the .fit method. You will need to apply `.reshape(-1, 1)` so that the dimensionality of the array is correct. \n", 437 | "5. Print the results of your model using the function you have defined previously. " 438 | ] 439 | }, 440 | { 441 | "cell_type": "code", 442 | "execution_count": null, 443 | "metadata": {}, 444 | "outputs": [], 445 | "source": [] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": [ 451 | "Notice that convergence is slow and that one of the components has both small weight and unrealistically small variance. These facts seem to point to the model being over parameterized. \n", 452 | "\n", 453 | "Perhaps a model with 2 components, and fewer parameters, will work better? In the cell below, compute a Bayesian GMM with 2 mixture components, starting with a numpy.random.seed of 567. " 454 | ] 455 | }, 456 | { 457 | "cell_type": "code", 458 | "execution_count": null, 459 | "metadata": {}, 460 | "outputs": [], 461 | "source": [] 462 | }, 463 | { 464 | "cell_type": "markdown", 465 | "metadata": {}, 466 | "source": [ 467 | "Notice that the weights of the 2 components are nearly equal and the means are both close to 0. However, the variances of the two components are quite different. \n", 468 | "\n", 469 | "To investigate this model further, in the cell below create and execute code to do the following:\n", 470 | "- Simulate 10,000 realizations of returns from the 2 component Gaussian GMM. \n", 471 | "- Plot the histogram of these realizations using 200 bins. \n", 472 | "- Compute and print the 1% VaR using these realizations. " 473 | ] 474 | }, 475 | { 476 | "cell_type": "code", 477 | "execution_count": null, 478 | "metadata": { 479 | "scrolled": false 480 | }, 481 | "outputs": [], 482 | "source": [] 483 | }, 484 | { 485 | "cell_type": "markdown", 486 | "metadata": {}, 487 | "source": [ 488 | "Compare this histogram to the histogram of the empirical returns. Do you think this histogram represents a realistic model and why? " 489 | ] 490 | }, 491 | { 492 | "cell_type": "markdown", 493 | "metadata": {}, 494 | "source": [ 495 | "ANS: " 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": {}, 501 | "source": [ 502 | "Given your observations above, do you think the 1% VaR for this model is likely to be a reasonable representation of risk and why?" 
503 | ] 504 | }, 505 | { 506 | "cell_type": "markdown", 507 | "metadata": {}, 508 | "source": [ 509 | "ANS: " 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": {}, 516 | "outputs": [], 517 | "source": [] 518 | } 519 | ], 520 | "metadata": { 521 | "kernelspec": { 522 | "display_name": "Python 3", 523 | "language": "python", 524 | "name": "python3" 525 | }, 526 | "language_info": { 527 | "codemirror_mode": { 528 | "name": "ipython", 529 | "version": 3 530 | }, 531 | "file_extension": ".py", 532 | "mimetype": "text/x-python", 533 | "name": "python", 534 | "nbconvert_exporter": "python", 535 | "pygments_lexer": "ipython3", 536 | "version": "3.7.3" 537 | } 538 | }, 539 | "nbformat": 4, 540 | "nbformat_minor": 2 541 | } 542 | -------------------------------------------------------------------------------- /Homework/Influence.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/Influence.jpg -------------------------------------------------------------------------------- /Homework/InfluenceSample.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/InfluenceSample.JPG -------------------------------------------------------------------------------- /Homework/MurderDirected.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/MurderDirected.JPG -------------------------------------------------------------------------------- /Homework/OptimalPlansOnGridWorld.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Homework/OptimalPlansOnGridWorld.JPG -------------------------------------------------------------------------------- /Lesson0_Introduction_Probability/IntroductionToProbabalisticProgramming.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to Probabilistic Programming\n", 8 | "\n", 9 | "\n", 10 | "\n", 11 | "## CSCI E-83\n", 12 | "## Stephen Elston\n", 13 | "\n", 14 | "Real-world machine intelligence and machine learning operates in an uncertain world. Probabilistic programming encompasses a range of algorithms for making decisions and inferences under uncertainty. Probabilistic programming has multiple uses in machine learning and artificial intelligence. Probabilistic programming methods arise in problems in many areas including, scheduling, robotics, natural language processing and image understanding. \n", 15 | "\n", 16 | "This course will give you a background in the theory and practice of probabilistic programming, including:\n", 17 | "1. Develop the ability to apply probabilistic programming methods to machine intelligence and machine learning applications. \n", 18 | "2. Have an understanding of the theory that connects various probabilistic programming methods. \n", 19 | "3. 
Have hands-on experience applying probabilistic programming algorithms to various machine intelligence and machine learning problems. \n" 20 | ] 21 | }, 22 | { 23 | "cell_type": "markdown", 24 | "metadata": {}, 25 | "source": [ 26 | "## About your instructional staff\n", 27 | "\n", 28 | "**Steve Elston:** \n", 29 | "\n", 30 | "Steve is an experienced big data geek, data scientist, and machine learning and AI engineer with several decades of experience. He creates and presents machine learning and artificial intelligence instructional material in his roles as instructor for the Harvard Extension and University of Washington, O’Reilly Author, Strata speaker, and an edX data science and machine learning instructor. Steve provides data science and machine learning consulting to many companies from global enterprises to startups. He leads analytics projects, takes software products from concept and financing through on-shore and off-shore development, intellectual property protection, large client sales, customer shipment and support. He has founded companies and served in several senior management positions. \n", 31 | "\n", 32 | "His experience includes:\n", 33 | "- Data science author, trainer and instructor.\n", 34 | "- R, Python, Bash.\n", 35 | "- SQL and NoSQL databases, Hadoop. \n", 36 | "- Git and Github.\n", 37 | "- Jupyter. \n", 38 | "- Machine learning and predictive analytics model construction and evaluation.\n", 39 | "- Artificial intelligence engineering.\n", 40 | "- Data exploration and visualization.\n", 41 | "- Scaling and performance tuning of large scale analytics systems. \n", 42 | "- Real-time analytics, streaming analytics, and Complex Event Processing (CEP). \n", 43 | "- Location-based analytics. \n", 44 | "- Large scale optimization. \n", 45 | "- Signal processing, image processing and machine vision. \n", 46 | "- Financial market data analytics, risk models, portfolio optimization, time series models.\n", 47 | "- Electronic payment systems and fraud prevention. \n", 48 | "- Wireless telecom analytics and fraud detection.\n", 49 | "- Customer gift and loyalty systems.\n", 50 | "\n", 51 | "\n", 52 | "**Teaching Fellow:** \n", 53 | "\n", 54 | "**Sarah Assano**" 55 | ] 56 | }, 57 | { 58 | "cell_type": "markdown", 59 | "metadata": {}, 60 | "source": [ 61 | "## Agent model of machine intelligence\n", 62 | "\n", 63 | "A general model for **machine intelligence** or **artificial intelligence** involves an **agent** interacting with the **environment**. The agent uses **sensors** to acquire information on the environment and based on the **state** of the environment takes **actions** to change the state of the environment. A schematic representation of this simple model is shown in Figure 1 below. \n", 64 | "\n", 65 | "\"Drawing\"\n", 66 | "
**Figure 1. General Agent Model**
\n", 67 | "\n", 68 | "At the minimum, the agent must have certain capabilities:\n", 69 | "\n", 70 | "1. A **Representation** of the model used by the agent to represent the relationships in the model. A good representation is often the key to good machine intelligence. A good representation is a mapping of the model and the environment. Graphical models provide an effective representation for many complex problems. Alternatively, so called **deep representations** produce powerful results in many situations. \n", 71 | "2. **Inference or reasoning** is the process of computing actions or decisions from **queries** of the model given the **evidence**. In the simplest form a query returns a mathematical result, such as the **marginal probability distribution** or the **maximum a posteriori** value. Reasoning computes a specific action which is applied to the environment. \n", 72 | "3. The agent performs **learning** using data or **evidence** to update the model. The evidence is observed by **sensors** which provide information to the model on the **state of the environment**. For graphical models, learning involves two distinct steps, learning the the structure of the model and learning parameters of the model. In graphical probabilistic models learning and inference are actually the same process. \n", 73 | "\n", 74 | "A schematic of this simple agent model \n", 75 | "\n", 76 | "\"Drawing\"\n", 77 | "
**Schematic view of an agent**
" 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "metadata": {}, 83 | "source": [ 84 | "## Why probabilistic programming?\n", 85 | "\n", 86 | "\n", 87 | "Machine intelligence must operate in a world with. \n", 88 | "\n", 89 | "1. Data is often noisy or otherwise **uncertain**. A related problem is the possibility of **unreliable** evidence. Machine learning models must be resistant to unexpected and erroneous sensor inputs. In some cases, the uncertain or unreliable evidence are deliberately produced. This results in what is known as an **adversarial** situation. \n", 90 | "2. Unexpected situations occur on a regular basis. Often these 'unexpected situations' are simply cases just outside the set of training cases. Many machine learning models produce erroneous or unexpected results when faced with unexpected inputs; a situation known as **brittleness**. \n", 91 | "3. Many real-world problems have a high degree of complexity which cannot be feasibly modeled explicitly. \n", 92 | "4. Models of many real-world systems include **unobservable** variables. There is no feasible way to obtain direct evidence on these variables. \n", 93 | "5. Optimizing complex processes requires a sequence of decisions. These decisions often must be made with uncertain and incomplete information. \n", 94 | "\n", 95 | "Probabilistic models provides a framework for a wide range of difficult problems including:\n", 96 | "\n", 97 | "1. Dealing gracefully with unexpected or erroneous inputs. Probabilistic models can exhibit robustness to unexpected inputs as each observation or piece of evidence is probability weighted. \n", 98 | "2. In many cases, probabilistic models can effectively represent highly complex problems. Conceptually, a probabilistic model 'smooths' over a certain amount of complexity allowing a relatively simple representation of a complex system. \n", 99 | "3. Inferences can be made on **unobservable** variables with probabilistic models. The **marginal probability distribution** of the unobservable variables can be estimated from the model. This process is known as making a **query** on the model. \n", 100 | "4. Prior information, such as input on **beliefs** from domain experts, can be used to supplement data or **evidence**. This allows models to work in data poor situations, or for modeling rare events. \n", 101 | "5. Probabilistic models can operate as **generative models**. A generative model computes new synthetic cases from an original training set. Ideally, the statistics of the generated data are identical to those generated by the real environment. \n", 102 | "6. The combination of **probabilistic models** and **utility theory** provides a framework for complex decisions under uncertainty. This capability gives " 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "metadata": {}, 108 | "source": [ 109 | "## Decision theory and probability\n", 110 | "\n", 111 | "Many AI problem require a decision to be made. In this case the agent makes an inference based on a probabilistic model. Based on this inference a **decision** to take an action is made by finding the maximum **utility** of the alternatives. The agent's decision process is then a combination of probability theory and utility theory. This decision making process is also referred to as **reasoning**.\n", 112 | "\n", 113 | "In particular, the combination of probabilistic Markov process and a utility function leads to a **Markov decision process** or **MDP**. 
The goal of these problems is to maximize the utility of a series of **decisions** in a system with a probabilistic model. \n", 114 | "\n", 115 | "In many practical situations multiple decisions must be made in a time sequence. These decisions are made using both uncertain and incomplete information. Optimizing such decisions requires the combination of probabilistic models and utility theory found in MDPs. \n", 116 | "\n", 117 | "In many situations, the variable we wish to control is **unobservable**. For example, we may need to control the torque of a robot's arm motor. The torque is unobservable, but other variables like the voltage on the motor, and dynamical information on the arms position, velocity and acceleration are all observable with sensors. We can model the behavior of the arm as a **partially observable Markov decision process** or **POMDP**.\n", 118 | "\n", 119 | "Optimal MDP solutions can be found by a number of algorithms. **Dynamic programming** algorithms are the most widely used methods. However, dynamic programming is limited to cases where a probabilistic model for the system being acted on can be modeled completely. An alternative that has emerged in recent years is **reinforcement learning** which can only requires a **utility function** and does not require an explicit probabilistic model. " 120 | ] 121 | }, 122 | { 123 | "cell_type": "markdown", 124 | "metadata": {}, 125 | "source": [ 126 | "## Learning in probabilistic models\n", 127 | "\n", 128 | "Learning is at the core of probabilistic models. A variety of machine learning algorithms are available for learning with probabilistic models. In general there is a trade-off between computational complexity and accuracy. \n", 129 | "\n", 130 | "### Learning for graphical models\n", 131 | "\n", 132 | "There are two distinct aspects of the specification process of graphical models:\n", 133 | "\n", 134 | "The **quantitative specification** which defines the details of the conditional distributions. For Bayesian graphical models, the quantitative specification is created using a combination of prior information and evidence. Luckily for us, there are computationally efficient exact and approximate algorithms. There are, in fact, two broad classes of learning algorithms for graphical models, exact and approximate. Exact methods are useful for smaller problems with discrete probability distributions. Approximate methods can scale to larger problems and accommodate continuous distributions. \n", 135 | "\n", 136 | "\n", 137 | "### Learning for Markov decision processes\n", 138 | "\n", 139 | "There are two categories of algorithms used for optimizing Markov Decision Processes, or MDPs. The first, and most widely used class, are **dynamic programming** methods. Dynamic programming algorithms require a completely specified probability model. Given such a model, dynamic programming algorithms perform an efficient search for a sequence of optimal decisions. \n", 140 | "\n", 141 | "In contrast to dynamic programming, **reinforcement learning** algorithms do not require a specified probability model. Reinforcement learning is in a class of algorithms know as **function approximation** methods. These algorithms attempt to learn complex functions to relate data inputs to posterior values or outputs. In effect, reinforcement learning algorithms learn the probability model from data. Inherently, these models have a large number of parameters which must be determined in the training process. 
Thus, reinforcement learning algorithms trade-off no requirement for a model for computational complexity. \n", 142 | "\n", 143 | "Reinforcement learning models are trained using a utility function. It is this utility function which must be approximated. However, the relationship between observable variables and the utility function is complex and inherently nonlinear. As a result of the inherent complexity training reinforcement learning models is both computationally intensive and requires vast amounts of data. These facts limits the problems to which reinforcement learning can be applied. Creating more general algorithms for training reinforcement learning models is an active area of research. " 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "#### Copyright 2018, Stephen F Elston. All rights reserved." 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": null, 156 | "metadata": {}, 157 | "outputs": [], 158 | "source": [] 159 | } 160 | ], 161 | "metadata": { 162 | "kernelspec": { 163 | "display_name": "Python 3", 164 | "language": "python", 165 | "name": "python3" 166 | }, 167 | "language_info": { 168 | "codemirror_mode": { 169 | "name": "ipython", 170 | "version": 3 171 | }, 172 | "file_extension": ".py", 173 | "mimetype": "text/x-python", 174 | "name": "python", 175 | "nbconvert_exporter": "python", 176 | "pygments_lexer": "ipython3", 177 | "version": "3.7.3" 178 | } 179 | }, 180 | "nbformat": 4, 181 | "nbformat_minor": 2 182 | } 183 | -------------------------------------------------------------------------------- /Lesson0_Introduction_Probability/img/AIModel.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson0_Introduction_Probability/img/AIModel.JPG -------------------------------------------------------------------------------- /Lesson0_Introduction_Probability/img/AgentModel.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson0_Introduction_Probability/img/AgentModel.JPG -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/1_DirectedGraphicalModels.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/1_DirectedGraphicalModels.pdf -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/1_Introduction.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/1_Introduction.pdf -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/img/BayesBall.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/BayesBall.JPG -------------------------------------------------------------------------------- 
/Lesson1_Bayes_Networks/img/Dependency.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/Dependency.JPG -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/img/GraphTypes.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/GraphTypes.JPG -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/img/Independencies2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/Independencies2.JPG -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/img/LetterDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/LetterDAG.JPG -------------------------------------------------------------------------------- /Lesson1_Bayes_Networks/img/Representation.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson1_Bayes_Networks/img/Representation.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/2_MarkovGraphs.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/2_MarkovGraphs.pdf -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/Cliques1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/Cliques1.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/Cliques2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/Cliques2.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/DAGvsMN.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/DAGvsMN.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/Diagrams.pptx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/Diagrams.pptx -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/LetterCliques.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/LetterCliques.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/LetterDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/LetterDAG.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/MarkovBlanket.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/MarkovBlanket.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/MoralizedGraph.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/MoralizedGraph.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/MoralizedLetter.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/MoralizedLetter.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/NoDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/NoDAG.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/Representation.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/Representation.JPG -------------------------------------------------------------------------------- /Lesson2_MarkovNetworks/img/Separation.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson2_MarkovNetworks/img/Separation.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/3_ExactInference.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/3_ExactInference.pdf -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Chain1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Chain1.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/CliqueTree.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/CliqueTree.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Collect.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Collect.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/DAGFactor.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/DAGFactor.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Distribute.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Distribute.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Eliminate1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Eliminate1.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Eliminate2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Eliminate2.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Eliminate3.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Eliminate3.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Eliminate4.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Eliminate4.JPG 
-------------------------------------------------------------------------------- /Lesson3_ExactInference/img/EliminationTree.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/EliminationTree.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/FactorToVar.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/FactorToVar.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/FourCycle.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/FourCycle.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Inference.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Inference.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/LetterDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/LetterDAG.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Loopy.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Loopy.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/MarkovBlanket.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/MarkovBlanket.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/MarkovFactor.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/MarkovFactor.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Moralized.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Moralized.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/MultiConnected.JPG: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/MultiConnected.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/MultiConnectedMoralized.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/MultiConnectedMoralized.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/StudentGraph.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/StudentGraph.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/SumProductTree.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/SumProductTree.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Triangle1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Triangle1.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Triangle2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Triangle2.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Triangulated.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Triangulated.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/Undirected1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/Undirected1.JPG -------------------------------------------------------------------------------- /Lesson3_ExactInference/img/VarToFactor.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson3_ExactInference/img/VarToFactor.JPG -------------------------------------------------------------------------------- /Lesson4_Learning/4_LearningInGraphicalModels.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson4_Learning/4_LearningInGraphicalModels.pdf -------------------------------------------------------------------------------- /Lesson4_Learning/img/Factorizing.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson4_Learning/img/Factorizing.JPG -------------------------------------------------------------------------------- /Lesson4_Learning/img/Learning.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson4_Learning/img/Learning.JPG -------------------------------------------------------------------------------- /Lesson4_Learning/img/LetterDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson4_Learning/img/LetterDAG.JPG -------------------------------------------------------------------------------- /Lesson4_Learning/img/PlateDiagram.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson4_Learning/img/PlateDiagram.JPG -------------------------------------------------------------------------------- /Lesson5_Learning_Part2/StudentSimulation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import numpy.random as nr 3 | 4 | def sim_bernoulli(p, n = 25): 5 | """ 6 | Function to compute the vectors with probabilities for each 7 | condition (input value) of the dependent variable using the Bernoulli 8 | distribution. 9 | 10 | The arguments are: 11 | p - a vector of probabilites of success for each case. 12 | n - The numer of realizations. 13 | """ 14 | temp = np.zeros(shape = (len(p), n)) 15 | for i in range(len(p)): 16 | temp[i,:] = nr.binomial(1, p[i], n) 17 | return(temp) 18 | 19 | def selec_dist_1(sims, var, lg): 20 | """ 21 | Function to integrate the conditional probabilities for 22 | each of the cases of the parent variable. 23 | 24 | The arguments are: 25 | sims - the array of simulated realizations with one row for each state of the 26 | parent variable. 27 | var - the vector of values of parent variable used to select the value from the 28 | sims array. 29 | lg - vector of states of possible states of the parent variable. These must be 30 | in the same order as for the sims array. 31 | """ 32 | out = sims[0,:] # Copy of values for first parent state 33 | var = np.array(var).ravel() 34 | for i in range(1, sims.shape[0]): # loop over other parent states 35 | out = [x if u == lg[i] else y for x,y,u in zip(sims[i,:], out, var)] 36 | return([int(x) for x in out]) 37 | 38 | def set_class(x): 39 | """ 40 | Function to flatten the array produced by the numpy.random.multinoulli function. 41 | The function tests which binary value of the array of output states is true 42 | and substitutes an integer for that state. This function only works for up to three 43 | output states. 
44 | 45 | Argument: 46 | x - The array produced by the numpy.random.multinoulli function. 47 | """ 48 | out = [] 49 | for i,j in enumerate(x): 50 | if j[0] == 1: out.append(0) 51 | elif j[1] == 1: out.append(1) 52 | else: out.append(2) 53 | return(out) 54 | 55 | 56 | def sim_multinoulli(p, n = 25): 57 | """ 58 | Function to compute the vectors with probabilities for each 59 | condition (input value) of the dependent variable using the multinoulli 60 | distribution. 61 | 62 | The arguments are: 63 | p - an array of probabilites of success for each possible combination 64 | of states of the parent variables. Each row in the array are the 65 | probabilities for each state of the multinoulli distribution for 66 | that combination of parent values. 67 | n - The numer of realizations. 68 | """ 69 | temp = np.zeros(shape = (p.shape[0], n)) 70 | for i in range(p.shape[0]): 71 | ps = p[i,:] 72 | mutlis = nr.multinomial(1, ps, n) 73 | temp[i,:] = set_class(mutlis) 74 | return(temp) 75 | 76 | def selec_dist_2(sims, var1, var2, lg1, lg2): 77 | """ 78 | Function to integrate the conditional probabilities for 79 | each of the cases of two parent variables. 80 | 81 | The arguments are: 82 | sims - the array of simulated realizations with one row for each state of the 83 | union of the parent variables. 84 | var1 - the vector of values of first parent variable used to select the value from the 85 | sims array. 86 | var2 - the vector of values of second parent variable used to select the value from the 87 | sims array. 88 | lg1 - vector of states of possible states of the first parent variable. These must be 89 | in the same order as for the sims array. 90 | lg2 - vector of states of possible states of the second parent variable. These must be 91 | in the same order as for the sims array. 92 | """ 93 | out = sims[0,:] # Copy values for first combination of states for parent variables 94 | ## make sure the parent variables are 1-d numpy arrays. 
95 | var1 = np.array(var1).ravel() 96 | var2 = np.array(var2).ravel() 97 | for i in range(1, sims.shape[0]): # Loop over all comnination of states of the parent variables 98 | out = [x if u == lg1[i] and v == lg2[i] else y for x,y,u,v in zip(sims[i,:], out, var1, var2)] 99 | return([int(x) for x in out]) 100 | 101 | -------------------------------------------------------------------------------- /Lesson5_Learning_Part2/img/FullyConneted.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson5_Learning_Part2/img/FullyConneted.JPG -------------------------------------------------------------------------------- /Lesson5_Learning_Part2/img/Learning.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson5_Learning_Part2/img/Learning.JPG -------------------------------------------------------------------------------- /Lesson5_Learning_Part2/img/LetterDAG.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson5_Learning_Part2/img/LetterDAG.JPG -------------------------------------------------------------------------------- /Lesson6_Utility_and_Decision_Trees/6_DecisionsUnderUncertainty.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson6_Utility_and_Decision_Trees/6_DecisionsUnderUncertainty.pdf -------------------------------------------------------------------------------- /Lesson6_Utility_and_Decision_Trees/img/AgentEnvironment.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson6_Utility_and_Decision_Trees/img/AgentEnvironment.JPG -------------------------------------------------------------------------------- /Lesson6_Utility_and_Decision_Trees/img/Bridge.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson6_Utility_and_Decision_Trees/img/Bridge.JPG -------------------------------------------------------------------------------- /Lesson6_Utility_and_Decision_Trees/img/RandomVariable.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson6_Utility_and_Decision_Trees/img/RandomVariable.JPG -------------------------------------------------------------------------------- /Lesson6_Utility_and_Decision_Trees/img/Utility.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson6_Utility_and_Decision_Trees/img/Utility.JPG -------------------------------------------------------------------------------- 
/Lesson7_MarkovDecisionProcesses/7_MarkovProcesses.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/7_MarkovProcesses.pdf -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/IntroductionToMarkovDecisionProcesses.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Introduction to Markov Decision Processes\n", 8 | "\n", 9 | "## CSCI E-82A\n", 10 | "### Stephen Elston\n", 11 | "\n", 12 | "In this lesson we will introduce **Markov processes**, which are a **representation** of a **memoryless state transition processes**. The diagram below shows the Markov process representation in the intelligent agent, along with the interactions with the environment. \n", 13 | "\n", 14 | "\"Drawing\"\n", 15 | "
**Interaction of agent and environment**
\n", 16 | "\n", 17 | "**Markov decision processes** which are widely used in planning and optimal decision theory. The closely related **Markov reward process** is necessary for planning and optimal decision methods. We will introduce the Markov reward process here. Markov decision processes are addressed in another lesson. \n", 18 | "\n", 19 | "Suggested readings: The following reading is an optional supplement to the material presented here:\n", 20 | "\n", 21 | "Sutton and Barto, second edition, Sections 3.1, 3.2, 3.3, 3.4, 3.5 or\n", 22 | "Russell and Norvig, third edition, Section 7.1, or\n", 23 | "Kochenderfer, Section 4.1." 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "## Markov Processes\n", 31 | "\n", 32 | "A first order **Markov process** is a process where the probability of a transition between a **finite set of states** only depends on the current state. In other words, first order **Markov processes have no memory of past states**. The current state has all the relevant information on the history of states. \n", 33 | "\n", 34 | "For the transition between as a state $S_t$, at time $t$ ,to the next state $S_{t+1}$, at time $t+1$, we can express a Markov process mathematically as follows:\n", 35 | "\n", 36 | "$$p[S_{t+1}\\ |\\ S_1, \\ldots, S_t] = p[S_{t+1}\\ |\\ S_t]$$\n", 37 | "\n", 38 | "For a vector of possible states, $S$, we can create a **state transition probability matrix**. This matrix **represents** the probability of a state transition from $S$ to the next state, $S'$ at the next time step:\n", 39 | "\n", 40 | "$$\\mathcal{P_{ss'}} = \n", 41 | "\\begin{bmatrix}\n", 42 | " P_{11} & \\dots & P_{1n} \\\\\n", 43 | " \\vdots & \\vdots & \\vdots \\\\\n", 44 | " P_{n1} & \\dots & P_{nn}\n", 45 | "\\end{bmatrix}\\\\\n", 46 | "$$\n", 47 | "Where, $\\mathcal{P}_{ij} =$ probability of transition from state $s_i$ to $s_j$. \n", 48 | "\n", 49 | "Let's say we have a vector of probabilities of being in one of n possible states, $S = (s_1, s_2, \\ldots, s_n)$. Using simple matrix multiplication we can write the relationships for the transition to the next state $S'$ as:\n", 50 | "\n", 51 | "$$S' = \\mathcal{P_{ss'}} S\\\\\n", 52 | "or\\\\\n", 53 | "\\begin{bmatrix}\n", 54 | " s_1' \\\\\n", 55 | " \\vdots \\\\\n", 56 | " s_n'\n", 57 | "\\end{bmatrix}\n", 58 | "=\n", 59 | "\\begin{bmatrix}\n", 60 | " P_{11} & \\dots & P_{1n} \\\\\n", 61 | " \\vdots & \\vdots & \\vdots \\\\\n", 62 | " P_{n1} & \\dots & P_{nn}\n", 63 | "\\end{bmatrix}\n", 64 | "\\begin{bmatrix}\n", 65 | " s_1 \\\\\n", 66 | " \\vdots \\\\\n", 67 | " s_n\n", 68 | "\\end{bmatrix}\n", 69 | "\\\\\n", 70 | "$$\n", 71 | "\n", 72 | "\n", 73 | "A **Markov chain** is a sequence of of states of a Markov process. In other words, if we run a Markov process over a number of time steps the result is a Markov chain. \n", 74 | "\n", 75 | "If the transition probability matrix, $ \\mathcal{P_{ss'}}$, does not change over time, we say the Markov chain is **stationary**. Stationary Markov chains have a **convergence property**. If we run the Markov chain for enough time steps, the chain will reach a **steady state**. At steady state the probabilities of being in any state of the Markov process is **unchanged from time step to time step**. \n" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "## Computational Example - Does Steve Need a New Car?\n", 83 | "\n", 84 | "Let's try a computational example to test out the foregoing concepts. 
In this case we will look at the state transitions for the use of an old car vs. a new car. The diagram below shows the states of car ownership and the possible transitions between them. \n", 85 | "\n", 86 | "\"Drawing\"\n", 87 | "
States and possible transitions of car use
" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "metadata": {}, 93 | "source": [ 94 | "The states and the possible transitions are:\n", 95 | "1. Old car, can transition to continue driving the old car, a breakdown, or an accident.\n", 96 | "2. Old car breakdown, can transition to old car or new car.\n", 97 | "3. Old car accident, transitions to new car.\n", 98 | "4. New car, can transition to continue driving the new car, a breakdown, or an accident.\n", 99 | "5. New car breakdown, transitions to new car or to an old car.\n", 100 | "6. New car accident, transitions to new car.\n", 101 | "\n", 102 | "Notice that there are no **terminal states** in this diagram. A terminal state can be entered, but there is no possible transition to another state. An example of a terminal state is the win or loss of game. The game is over, and there will be no more states for playing. Markov processes with terminal states are said to be **periodic** or **finite** since they run for a finite period, after which they must be restarted. Whereas, Markov processes with no terminal state are said to be **infinite**, since in theory they will run for an infinite number of time steps.\n", 103 | "\n", 104 | "Given these transitions, the question is what is the probability that Steve will end up in a new car or keep his old car. \n", 105 | "\n", 106 | "We will start by defining a transition probability matrix and testing that the probabilities in the columns add to 1.0. " 107 | ] 108 | }, 109 | { 110 | "cell_type": "code", 111 | "execution_count": null, 112 | "metadata": {}, 113 | "outputs": [], 114 | "source": [ 115 | "import numpy as np\n", 116 | "import pandas as pd\n", 117 | "T = np.array([[9.8999e-01, 9.90000e-01, 0.00000e-00, 1.00000e-02, 0.00000e+00],\n", 118 | " [1.00000e-02, 0.00000e+00, 0.00000e+00, 0.00000e+00, 0.00000e+00],\n", 119 | " [0.0000,1.00000e-02, 9.98999e-01, 9.90000e-01, 1.00000e+00],\n", 120 | " [0.00000e+00, 0.00000e+00, 1.00000e-03, 0.00000e+00, 0.00000e+00],\n", 121 | " [0.00001e+00, 0.00000e+00, 1.00000e-06, 0.00000e+00, 0.00000e+00]])\n", 122 | "\n", 123 | "print('The transition probability matrix')\n", 124 | "labels = ['OldCar','OldBreak','NewCar','NewBreak','Acident']\n", 125 | "print(pd.DataFrame(T, columns = labels, index = labels))\n", 126 | "\n", 127 | "print('\\nTest that the columns add to 1')\n", 128 | "np.sum(T, axis = 0)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": {}, 134 | "source": [ 135 | "Now we need to define an initial state for the Markov chain. In this case, Steve is driving his old car, so the state vector is defined as shown below. The multiplication by the transition probability matrix gives gives the new states. " 136 | ] 137 | }, 138 | { 139 | "cell_type": "code", 140 | "execution_count": null, 141 | "metadata": {}, 142 | "outputs": [], 143 | "source": [ 144 | "initial_state = np.array([1.0,0.0, 0.0, 0.0, 0.0 ])\n", 145 | "pd.Series(np.matmul(T,initial_state), index = labels)" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "Notice how several states now have non-zero probabilities. Two states still have zero probabilities. \n", 153 | "\n", 154 | "Let's see what happens when we apply a series of state transitions to this initial state. The code in the cell below performs these multiplications to compute the probabilities of the new state given the current state. The function returns a list containing the state vector following each transition. 
 155 | ] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": {}, 161 | "outputs": [], 162 | "source": [ 163 | "def state_transition(T, s, n = 1):\n", 164 | "    s_list = np.reshape(s, (s.shape[0], 1))\n", 165 | "    for _ in range(n):\n", 166 | "        s = np.matmul(T, s) # propagate the state probabilities one step\n", 167 | "        s_list = np.concatenate((s_list, np.reshape(s, (s.shape[0], 1))), axis = 1)\n", 168 | "    return s_list\n", 169 | "states = state_transition(T, initial_state, 10000)\n", 170 | "pd.Series(states[:,-1], index = labels)" 171 | ] 172 | }, 173 | { 174 | "cell_type": "markdown", 175 | "metadata": {}, 176 | "source": [ 177 | "You can see the probabilities of being in each state after a large number of transitions.\n", 178 | "\n", 179 | "Let's make a plot of two of these states. The state in which Steve keeps driving his old car and the state in which he drives a new car are plotted in the chart below. " 180 | ] 181 | }, 182 | { 183 | "cell_type": "code", 184 | "execution_count": null, 185 | "metadata": {}, 186 | "outputs": [], 187 | "source": [ 188 | "import matplotlib.pyplot as plt\n", 189 | "%matplotlib inline\n", 190 | "def plot_states(states, variables = [0,2]):\n", 191 | "    fig = plt.figure(figsize=(6,6)) # define plot area\n", 192 | "    ax = fig.gca() # define axis\n", 193 | "    ax.set_xlabel('Number of time steps')\n", 194 | "    ax.set_ylabel('Probability of being in state')\n", 195 | "    ax.set_title('Probability of states vs. number of time steps')\n", 196 | "    for var in variables:\n", 197 | "        plt.plot(states[var,:], label = labels[var])\n", 198 | "    ax.legend()\n", 199 | "plot_states(states) " 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "metadata": {}, 205 | "source": [ 206 | "Notice that after about 3,000 transitions the state probabilities remain essentially unchanged. This indicates that the Markov chain is in steady state. It seems that in the steady state, Steve does eventually end up with a new car!\n", 207 | "\n", 208 | "Let's plot the state probabilities for two other states, the probabilities of a breakdown for the old and new cars. " 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "metadata": {}, 215 | "outputs": [], 216 | "source": [ 217 | "plot_states(states, [1,3]) " 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "metadata": {}, 223 | "source": [ 224 | "As before, these state probabilities are in steady state after about 3,000 steps. The probability of the old car breaking down approaches zero, since there is a low probability of driving the old car in the steady state. " 225 | ] 226 | }, 227 | { 228 | "cell_type": "markdown", 229 | "metadata": {}, 230 | "source": [ 231 | "## Markov Reward Process\n", 232 | "\n", 233 | "We can define a **reward function**, $\\mathcal{R}$, which gives the **expected reward**, or **change in utility**, at the next time step. The reward function is the expectation over all transitions from the current state $s$ to the possible **successor states** $s'$:\n", 234 | "\n", 235 | "$$R_{t+1} = E \\big[ \\mathcal{R}_{s s'}\\ |\\ S_t = s \\big]$$ \n", 236 | "\n", 237 | "where $\\mathcal{R}_{s s'}$ is the reward for the transition from the current state, $s$, to a successor state, $s'$. " 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "metadata": {}, 245 | "source": [ 246 | "Let's look at an example of a Markov reward process.
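Concretely, this expectation is a probability-weighted sum of the transition rewards, $E \\big[ \\mathcal{R}_{s s'}\\ |\\ S_t = s \\big] = \\sum_{s'} T_{s' s}\\, \\mathcal{R}_{s s'}$, where $T_{s' s}$ is the probability of the transition from $s$ to $s'$, following the column convention of the transition matrix defined above.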
The diagram below shows the rewards for the various state transitions in the auto example. Since owning cars has significant costs, all of the rewards are negative. \n", 247 | "\n", 248 | "\"Drawing\"\n", 249 | "
**Rewards for state transitions of car use**
\n", 250 | "\n", 251 | "Keep in mind that just like utility, **reward need not simply follow economic value**; e.g. money. For example, the reward of the car breaking down must account for the inconvenience of dealing with the repair, or the reward for driving the car must account for intangibles like comfort and safety of the passengers. \n", 252 | "\n", 253 | "What is the relationship between utility and reward in a Markov chain? It is easy to compute utility from rewards, since **rewards are additive**. First, let's consider a finite Markov reward process, which reaches a terminal state after $T$ time steps:\n", 254 | "\n", 255 | "$$U([s_o, s_1, \\ldots, s_T]) = R(s_o) + R(s_1) + \\ldots + R(s_T) = \\sum_{t = 0}^T R(s_t)$$\n", 256 | "\n", 257 | "But, consider what happens with an infinite Markov reward process, which never reaches a terminal state. If we use the above formulation, the utility will grow without bound; e.g. $U(s_t) \\rightarrow \\infty$ as $T \\rightarrow \\infty$. \n", 258 | "\n", 259 | "The solution to keeping utility bounded for infinite Markov reward processes is **discounting**. By discounting we are saying that the value of a reward in the future decreases the further in the future the reward is received. This is a commonly used concept in many fields. For example, an investor will discount expected future returns, preferring immediate payoff. \n", 260 | "\n", 261 | "Using discounting, we can formulate a bounded relationship between utility and reward:\n", 262 | "\n", 263 | "$$U([s_o, s_1, s_2, s_3 \\ldots]) = R(s_o) + \\gamma R(s_1) + \\gamma^2 R(s_2) + \\gamma^3 R(s_3) \\ldots = \\sum_{t = 0}^{\\infty} \\gamma^{t} R(s_t)$$\n", 264 | "\n", 265 | "The choice of the discount parameter, $\\gamma$, will change the outcome for the Markov reward process:\n", 266 | "- As $\\gamma \\rightarrow 0$, the reward process becomes myopic, only counting near term rewards.\n", 267 | "- As $\\gamma \\rightarrow 1$, the reward process becomes far sighted, valuing distant rewards highly. \n", 268 | "\n", 269 | "For infinite Markov reward processes we are interested in the **return** for state transitions starting with the current state. Return is the sum of the rewards for future state transitions and can be expressed as:\n", 270 | "\n", 271 | "$$G_t = R_{t+1} + \\gamma R_{t+2} + \\gamma^2 R_{t+3} \\ldots = \\sum_{k = 0}^{\\infty} \\gamma^{k} R_{t+k+1}$$\n", 272 | "\n", 273 | "We are also interested in the **state value function**. This expression computes the expected future value of being in state $s$:\n", 274 | "\n", 275 | "$$v(s) = E[G_t\\ |\\ S_t = s ]$$" 276 | ] 277 | }, 278 | { 279 | "cell_type": "markdown", 280 | "metadata": {}, 281 | "source": [ 282 | "## Optimal Policy\n", 283 | "\n", 284 | "The agent uses a **policy**, $\\pi$, to determine which action to take. The expected **action value** given the action, $a$, from state, $s$, by the policy is: \n", 285 | "\n", 286 | "$$q_{\\pi}(s,a) = \\mathbb{E}_{\\pi} [G_{t}\\ |\\ S_t = s, A_t = a] $$\n", 287 | "\n", 288 | "Our goal is to find an **optimal policy** which maximizes the expected action value. 
"We say that the optimal policy, $\\pi^*$, achieves the optimal action-value function, $q_{*}(s,a) = q_{\\pi^*}(s,a)$, which gives the highest expected value for the action $a$:\n", 289 | "\n", 290 | "$$q_{\\pi^*}(s,a) = \\mathbb{E}_{\\pi^*} [G_{t}\\ |\\ S_t = s, A_t = a] $$\n", 291 | "\n", 292 | "An optimal policy has an expected action value greater than or equal to that of every other policy:\n", 293 | "\n", 294 | "$$q_{\\pi^*}(s,a) \\ge q_{\\pi}(s,a)\\ \\forall\\ \\pi$$\n", 295 | "\n", 296 | "Note that, by contrast, a bandit model is stateless, so no state is required in its representation." 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "metadata": {}, 302 | "source": [ 303 | "## Computational Example\n", 304 | "\n", 305 | "With the above theory in mind, let's try a computational example for a Markov reward process. We are particularly interested in the convergence properties of the state value function. This convergence is key if we ultimately wish to find an optimal policy for a Markov decision process. \n", 306 | "\n", 307 | "As a first step, we must define a matrix of the rewards for transitions between the states. Execute the code in the cell below and examine the result." 308 | ] 309 | }, 310 | { 311 | "cell_type": "code", 312 | "execution_count": null, 313 | "metadata": {}, 314 | "outputs": [], 315 | "source": [ 316 | "## Define the Markov reward matrix\n", 317 | "R = np.array([[-30.0, -600, 0.0, -3000, 0.0],\n", 318 | "              [-200.0, 0.0, 0.0, 0.0, 0.0],\n", 319 | "              [0.0, -30000, -20, -100, -30000],\n", 320 | "              [0.0, 0.0, -1000, 0.0, 0.0],\n", 321 | "              [-100000, 0.0, -50000, 0.0, 0.0]])\n", 322 | "\n", 323 | "print('The reward matrix for state transitions')\n", 324 | "labels = ['OldCar','OldBreak','NewCar','NewBreak','Accident']\n", 325 | "print(pd.DataFrame(R, columns = labels, index = labels))" 326 | ] 327 | }, 328 | { 329 | "cell_type": "markdown", 330 | "metadata": {}, 331 | "source": [ 332 | "The code in the cell below computes the value of being in some initial state $s$. The code comments explain the steps. " 333 | ] 334 | }, 335 | { 336 | "cell_type": "code", 337 | "execution_count": null, 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "def compute_state_value(T, R, s, n = 1, gamma = 0.9):\n", 342 | "    v_list = [] # a list to hold the values\n", 343 | "    for _ in range(n):\n", 344 | "        s_prime = np.matmul(T, s) # the probabilities of being in the new states\n", 345 | "        delta_s = np.subtract(s, s_prime) # the change in probabilities of the states\n", 346 | "        s = s_prime\n", 347 | "        v_list.append(np.sum(np.matmul(R, delta_s))) # build the list of values\n", 348 | "    \n", 349 | "    state_value = 0.0\n", 350 | "    ## Now loop over the state transitions and compute the discounted value\n", 351 | "    for i in range(n):\n", 352 | "        state_value = state_value + gamma**i * v_list[i]\n", 353 | "    return state_value\n", 354 | "\n", 355 | "compute_state_value(T, R, initial_state, n = 10)" 356 | ] 357 | }, 358 | { 359 | "cell_type": "markdown", 360 | "metadata": {}, 361 | "source": [ 362 | "The above test result looks reasonable, but what about convergence? The code in the cell below computes the discounted value of being in an initial state for an increasing number of time steps and then plots the result.\n",
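"\n",
"Before running it, note a useful sanity check on convergence: if the per-step value contributions are bounded in magnitude by some $r_{max}$, then the discounted sum is bounded by the geometric series, so the computed value cannot diverge for $0 \\le \\gamma < 1$:\n",
"\n",
"$$\\Big| \\sum_{t = 0}^{\\infty} \\gamma^{t} r_t \\Big| \\le \\frac{r_{max}}{1 - \\gamma}$$"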
363 | ] 364 | }, 365 | { 366 | "cell_type": "code", 367 | "execution_count": null, 368 | "metadata": {}, 369 | "outputs": [], 370 | "source": [ 371 | "gamma = 0.9\n", 372 | "def converge_state_value(T, R, s, n = 1, gamma = 0.9):\n", 373 | "    ## Return a list of the value computed for each number of time steps\n", 374 | "    return [compute_state_value(T, R, s, steps, gamma) for steps in range(n)]\n", 375 | "\n", 376 | "values = converge_state_value(T, R, initial_state, n = 200, gamma = gamma)\n", 377 | "\n", 378 | "def plot_values(vals, gamma):\n", 379 | "    fig = plt.figure(figsize=(6,6)) # define plot area\n", 380 | "    ax = fig.gca() # define axis\n", 381 | "    ax.set_xlabel('Number of time steps')\n", 382 | "    ax.set_ylabel('Value of state')\n", 383 | "    ax.set_title('Value of state vs. number of time steps with gamma = ' + str(gamma))\n", 384 | "    plt.plot(vals)\n", 385 | "    \n", 386 | "plot_values(values, gamma) " 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "For a value of $\\gamma = 0.9$ the state value function converges rather quickly, in less than 100 time steps. No doubt, the fact that the Markov process reaches steady state quickly helps. \n", 394 | "\n", 395 | "Let's see what happens for $\\gamma = 0.99$. How does this change in discounting affect the convergence?" 396 | ] 397 | }, 398 | { 399 | "cell_type": "code", 400 | "execution_count": null, 401 | "metadata": {}, 402 | "outputs": [], 403 | "source": [ 404 | "gamma = 0.99\n", 405 | "values = converge_state_value(T, R, initial_state, n = 1000, gamma = gamma)\n", 406 | "plot_values(values, gamma) " 407 | ] 408 | }, 409 | { 410 | "cell_type": "markdown", 411 | "metadata": {}, 412 | "source": [ 413 | "Even with this rather weak discounting the state value function converges in about 600 time steps, consistent with the geometric decay of the discount weights: $0.99^n$ falls below $10^{-3}$ only after roughly $n = \\ln(10^{-3}) / \\ln(0.99) \\approx 690$ steps. This result is encouraging!" 414 | ] 415 | }, 416 | { 417 | "cell_type": "markdown", 418 | "metadata": {}, 419 | "source": [ 420 | "#### Copyright 2018, 2019, Stephen F Elston. All rights reserved. 
" 421 | ] 422 | } 423 | ], 424 | "metadata": { 425 | "kernelspec": { 426 | "display_name": "Python 3", 427 | "language": "python", 428 | "name": "python3" 429 | }, 430 | "language_info": { 431 | "codemirror_mode": { 432 | "name": "ipython", 433 | "version": 3 434 | }, 435 | "file_extension": ".py", 436 | "mimetype": "text/x-python", 437 | "name": "python", 438 | "nbconvert_exporter": "python", 439 | "pygments_lexer": "ipython3", 440 | "version": "3.7.3" 441 | } 442 | }, 443 | "nbformat": 4, 444 | "nbformat_minor": 2 445 | } 446 | -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/IntroductionToRL.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/IntroductionToRL.pdf -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/MarkovDecisionProcesses.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/MarkovDecisionProcesses.pdf -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/img/AgentEnvironment.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/img/AgentEnvironment.JPG -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/img/CarRewards.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/img/CarRewards.JPG -------------------------------------------------------------------------------- /Lesson7_MarkovDecisionProcesses/img/CarStates.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson7_MarkovDecisionProcesses/img/CarStates.JPG -------------------------------------------------------------------------------- /Lesson8_LVM_Variational/8_LatentVariable_VariationalMethods.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson8_LVM_Variational/8_LatentVariable_VariationalMethods.pdf -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/9_IntroToDynamicProgramming.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/9_IntroToDynamicProgramming.pdf -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/ActionValueBackup.JPG: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/ActionValueBackup.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/Backups.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/Backups.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/DPAgent.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/DPAgent.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/GPI.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/GPI.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/GridWorld.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/GridWorld.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/OptimalPath.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/OptimalPath.JPG -------------------------------------------------------------------------------- /Lesson9_IntroductionToDynamicProgramming/img/ValueBackup.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson9_IntroductionToDynamicProgramming/img/ValueBackup.JPG -------------------------------------------------------------------------------- /Lesson_10_Bandit_Problems/10_Bandits.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_10_Bandit_Problems/10_Bandits.pdf -------------------------------------------------------------------------------- /Lesson_10_Bandit_Problems/10_IntroductionToRL.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_10_Bandit_Problems/10_IntroductionToRL.pdf -------------------------------------------------------------------------------- 
/Lesson_10_Bandit_Problems/img/BanditAgent.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_10_Bandit_Problems/img/BanditAgent.JPG -------------------------------------------------------------------------------- /Lesson_10_Bandit_Problems/img/OneArmedBandit.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_10_Bandit_Problems/img/OneArmedBandit.jpg -------------------------------------------------------------------------------- /Lesson_10_Bandit_Problems/img/multiarmedbandit.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_10_Bandit_Problems/img/multiarmedbandit.jpg -------------------------------------------------------------------------------- /Lesson_11_MonteCarloReinforcementLearning/11_MC_RL.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_11_MonteCarloReinforcementLearning/11_MC_RL.pdf -------------------------------------------------------------------------------- /Lesson_11_MonteCarloReinforcementLearning/img/GPI.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_11_MonteCarloReinforcementLearning/img/GPI.JPG -------------------------------------------------------------------------------- /Lesson_11_MonteCarloReinforcementLearning/img/GridWorld.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_11_MonteCarloReinforcementLearning/img/GridWorld.JPG -------------------------------------------------------------------------------- /Lesson_11_MonteCarloReinforcementLearning/img/MC_Backup.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_11_MonteCarloReinforcementLearning/img/MC_Backup.JPG -------------------------------------------------------------------------------- /Lesson_11_MonteCarloReinforcementLearning/img/RL_AgentModel.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_11_MonteCarloReinforcementLearning/img/RL_AgentModel.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/12_TD_Q_Learning.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/12_TD_Q_Learning.pdf 
-------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/GridWorld.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/GridWorld.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/Q-Learning.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/Q-Learning.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/RL_AgentModel.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/RL_AgentModel.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/SARSA.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/SARSA.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/SARSAN.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/SARSAN.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/TD0.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/TD0.JPG -------------------------------------------------------------------------------- /Lesson_12_TDandQLearning/img/TDN.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_12_TDandQLearning/img/TDN.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/13_OverviewOfDL.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/13_OverviewOfDL.pdf -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/Accuracy-Layers.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Accuracy-Layers.JPG -------------------------------------------------------------------------------- 
/Lesson_13_DeepFunctionApproximation/img/Accuracy-Parameters.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Accuracy-Parameters.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/CompGraph1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/CompGraph1.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/CompGraph2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/CompGraph2.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/Hidden.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Hidden.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/L2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/L2.jpg -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/Lesson1Figures.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Lesson1Figures.pptx -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/LinearNetwork.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/LinearNetwork.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/LossGraph.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/LossGraph.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/MachineIntelligence.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/MachineIntelligence.JPG 
-------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/Preceptron.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Preceptron.JPG -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/SVD.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/SVD.png -------------------------------------------------------------------------------- /Lesson_13_DeepFunctionApproximation/img/Tikhonov_board.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_13_DeepFunctionApproximation/img/Tikhonov_board.jpg -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/14_FunctionAppoxForRL.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/14_FunctionAppoxForRL.pdf -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/img/AgentEnvironment.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/img/AgentEnvironment.JPG -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/img/DQN.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/img/DQN.JPG -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/img/ReplayBuffer.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/img/ReplayBuffer.JPG -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/img/Tile1.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/img/Tile1.JPG -------------------------------------------------------------------------------- /Lesson_14_FunctionApproximation/img/Tile2.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_14_FunctionApproximation/img/Tile2.JPG 
-------------------------------------------------------------------------------- /Lesson_15_PolicyGradient/15_PolicyGradient.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/StephenElston/CSCI_E_82A_Probabalistic_Programming/7cd8f34ea1b7e6eac571713502816e436f082854/Lesson_15_PolicyGradient/15_PolicyGradient.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Welcome to CSCI E-82A 2 | 3 | Real-world machine intelligence and machine learning operate in an uncertain world. Probabilistic programming encompasses a range of algorithms for making decisions and inferences under uncertainty. Probabilistic programming has multiple uses in machine learning and artificial intelligence. Probabilistic programming methods arise in problems in many areas including scheduling, robotics, natural language processing and image understanding. 4 | The focus of this course is developing an understanding of the theory and gaining hands-on experience with probabilistic representation, learning and inference methods for planning and classification. Hands-on exercises will be done using Python APIs for several powerful packages. 5 | 6 | The course is built around the three pillars of machine learning and artificial intelligence: representation, learning and inference. This course will survey a number of powerful probabilistic programming methods for representation, learning, and inference: 7 | 1. Review of probability and inference. 8 | 2. Representations for probabilistic models. 9 | 3. Learning in probabilistic models. 10 | 4. Bayesian graphical models. 11 | 5. Markov decision processes and planning. 12 | 6. Partially observable Markov decision processes. 13 | 7. Unsupervised probabilistic models, time permitting. 14 | 8. Reinforcement learning methods. 15 | 16 | Students completing this course will: 17 | 1. Develop the ability to apply probabilistic programming methods to machine intelligence and machine learning applications. 18 | 2. Have an understanding of the theory that connects various probabilistic programming methods. 19 | 3. Have hands-on experience applying probabilistic programming algorithms to various machine intelligence and machine learning problems. 20 | 21 | > **Note:** For specific University policy information please see the course syllabus on the course Canvas page. 22 | 23 | 24 | ## Some course mechanics 25 | 26 | **Meeting time:** Tuesdays, 5:30-7:30 pm US Eastern time, online. Students are expected to attend and participate actively in class sessions. 27 | 28 | **Mandatory On-Campus Weekend Session:** Saturday and Sunday December 8-9, 9am-5pm, Harvard Hall 202. Students must attend the entire weekend session to receive credit for the course. 29 | 30 | **Course materials:** Course lecture material is in the form of Jupyter notebooks in the Github repository at https://github.com/StephenElston/CSCI_E_82A_Probabalistic_Programming . As this is a new course I am still developing the material. 31 | 32 | **Technical Requirements:** You are required to have a computer/laptop and Internet connection capable of performing the course work. Specifically: 33 | - High-speed Internet connection for watching videos. 34 | - An up-to-date web browser for watching videos and working with Microsoft Azure Machine Learning. 35 | - A modern CPU with at least four cores, and ideally a GPU.
36 | - At least 50 GB of free disk space. 37 | - At least 8 GB of RAM, though 16 GB is better. 38 | - Running Windows, macOS, or Linux. 39 | - The ability to install the Python Anaconda stack (https://www.continuum.io/downloads ), including Jupyter notebooks. 40 | 41 | 42 | ## Supplementary reading material 43 | 44 | These texts are sources used for preparing the course. Students may wish to refer to these books for supplementary readings: 45 | 1. Bayesian Reasoning and Machine Learning, David Barber, Cambridge University Press, 2012 – I find this book a useful source for both theory and algorithms for a wide range of topics. 46 | 2. Artificial Intelligence, A Modern Approach, Stuart Russell and Peter Norvig, Prentice Hall, Third edition, 2010 – The go-to introductory AI textbook with introductory treatment of probabilistic models. 47 | 3. Probabilistic Graphical Models, Principles and Techniques, Daphne Koller and Nir Friedman, MIT Press, 2009 – Comprehensive but quite theoretical text. I mostly use this book as a reference. 48 | 4. Reinforcement Learning, An Introduction (Adaptive Computation and Machine Learning), Richard Sutton and Andrew Barto, MIT Press, Second edition, 2018 – Introductory text on reinforcement learning. You can download the draft of the second edition: https://drive.google.com/file/d/1xeUDVGWGUUv1-ccUMAZHJLej2C7aAFWY/view 49 | 5. Decision Theory Under Uncertainty: Theory and Applications, Kochenderfer et al., MIT Press, 2015 – A nice introduction to Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). This book also contains a nice introduction to reinforcement learning. 50 | 6. Machine Learning: A Probabilistic Perspective, Murphy, MIT Press, 2012 – A good reference for the theory of many probabilistic machine learning algorithms, though not the ideal book for learning the material. Make sure you get the latest printing; see the preface to identify the printing. 51 | 7. The Book of Why: The New Science of Cause and Effect, Pearl and Mackenzie, Basic Books, 2018 – An extended essay on causal models written for a broad audience. 52 | 8. Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press, 2016 – The definitive (and more or less only) text on deep learning theory. Most of this material is not within the scope of this course. 53 | 54 | 55 | ## Approximate Schedule 56 | 57 | This preliminary lecture schedule is subject to change as the course progresses: 58 | 1. Week 1 – Sep 4: Review of probability 59 | - Basics of probability 60 | - Conditional probability 61 | - Independence and conditional independence 62 | 2. Week 2 – Sep 11: Probabilistic reasoning 63 | - Conditional probability, priors, likelihood and posterior 64 | - Basic graph concepts 65 | - Introduction to probabilistic graphical models 66 | 3. Week 3 – Sep 18 67 | - Bayesian belief networks 68 | - Representation in belief networks 69 | - Independence and separation in graphical models 70 | 4. Week 4 – Sep 25 71 | - Markov properties and Markov networks 72 | - Efficient inference algorithms for graphical models 73 | - Message algorithms for inference in graphical models 74 | 5. Week 5 – Oct 2 75 | - Overview of approximate inference in graphical models 76 | - MCMC methods 77 | - Variational methods 78 | 6. Week 6 – Oct 9 79 | - Decision making and utility functions 80 | - Markov decision processes 81 | - Decision tree models 82 | 7. Week 7 – Oct 16 83 | - Decisions with partially observable Markov processes 84 | - The EM algorithm 85 | 8. 
Week 8 – Oct 23 86 | - Time-dependent processes 87 | - Hidden Markov models 88 | - Filtering and smoothing with HMMs 89 | - Kalman filters 90 | 9. Week 9 – Oct 30 91 | - The bandit problem 92 | - Dynamic programming 93 | - Limits on tabular MDP models 94 | 10. Week 10 – Nov 6 95 | - Introduction to temporal difference learning 96 | - Introduction to Q learning 97 | - Actor-critic methods 98 | 11. Week 11 – Nov 13 99 | - Introduction to deep learning 100 | - Representation in deep networks 101 | - Backpropagation 102 | Note: No class the week of Nov 20, unless needed for a make-up session. 103 | 12. Week 12 – Nov 27 104 | - Deep Q learning 105 | 106 | 107 | 108 | --------------------------------------------------------------------------------