24 |
25 | ```{r setup, include=FALSE}
26 | knitr::opts_chunk$set(
27 | # cache = TRUE,
28 | # dev = "svglite",
29 | echo = TRUE,
30 | comment = NA,
31 | message = FALSE,
32 | warning = TRUE,
33 | error = TRUE
34 | )
35 |
36 | library(tidyverse)
37 | library(scales)
38 | library(kableExtra)
39 | library(here)
40 |
41 | theme_set(theme_bw())
42 | ```
43 |
44 | ## Part A
45 |
46 | The marginal density of $y$ (a.k.a. the prior predictive distribution) is
47 |
48 | $$
49 | \begin{align}
50 | p(y)
51 | &=
52 | p(y \mid \theta = 1) \cdot p(\theta = 1)
53 | +
54 | p(y \mid \theta = 2) \cdot p(\theta = 2)
55 | \\
56 | &=
57 | \normal(y \mid 1, 2) \cdot 0.5
58 | +
59 | \normal(y \mid 2, 2) \cdot 0.5
60 | \end{align}
61 | $$
62 |
63 | The graph of the prior predictive distribution is
64 |
65 | ```{r}
66 | tibble(y = seq(-5, 8, 0.05)) %>%
67 | mutate(
68 | density = 0.5 * (dnorm(y, 1, 2) + dnorm(y, 2, 2))
69 | ) %>%
70 | ggplot() +
71 | aes(y, density) +
72 | geom_line()
73 | ```
74 |
75 |
76 | This looks fairly normal because the standard deviation (2) is large relative to the distance between the two means (1). However, a weighted average (mixture) of normal probability density functions is not the same as the probability density function of a weighted average of normal random variables. In other words, mixing two normal densities does not generally give a normal density, even though the sum of two independent normal random variables is again normally distributed. This becomes clearer if we plot the same prior predictive distribution but with a very small standard deviation.
77 |
78 |
79 | ```{r}
80 | tibble(y = seq(0, 3, 0.01)) %>%
81 | mutate(
82 | density = 0.5 * (dnorm(y, 1, 0.1) + dnorm(y, 2, 0.1))
83 | ) %>%
84 | ggplot() +
85 | aes(y, density) +
86 | geom_line()
87 | ```
88 |
89 | Clearly, this distribution is not normal. In particular, it is multi-modal.
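
To make the contrast concrete, here is a small simulation sketch (independent draws, reusing the small standard deviation of 0.1 from above): the 50/50 mixture of $N(1, 0.1)$ and $N(2, 0.1)$ is bimodal, while the average of one draw from each component is again normally distributed.

```{r}
n <- 1e5
tibble(
  mixture = if_else(rbinom(n, 1, 0.5) == 1, rnorm(n, 1, 0.1), rnorm(n, 2, 0.1)),
  average = 0.5 * rnorm(n, 1, 0.1) + 0.5 * rnorm(n, 2, 0.1)
) %>%
  pivot_longer(everything()) %>%
  ggplot() +
  aes(value) +
  geom_histogram(bins = 100) +
  facet_wrap(~name, ncol = 1)
```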
90 |
91 | ## Part B
92 |
93 | $$
94 | \begin{align}
95 | \mathbb P(\theta \mid y = 1)
96 | &=
97 | \frac{
98 | \mathbb P(y = 1 \mid \theta) \cdot \mathbb P (\theta)
99 | }{
100 | \mathbb P(y)
101 | }
102 | \\
103 | &=
104 | \frac{
105 | \normal(1 \mid \theta, 2) \cdot 0.5
106 | }{
107 | \normal(1 \mid 1, 2) \cdot 0.5
108 | +
109 | \normal(1 \mid 2, 2) \cdot 0.5
110 | }
111 | \\
112 | &=
113 | \frac{
114 | \normal(1 \mid \theta, 2)
115 | }{
116 | \normal(1 \mid 1, 2)
117 | +
118 | \normal(1 \mid 2, 2)
119 | }
120 | \end{align}
121 | $$
122 |
123 | which is approximately 0.531 when evaluated at $\theta = 1$.
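
As a quick numerical check (using base R's `dnorm` with the same means and standard deviation as above):

```{r}
dnorm(1, 1, 2) / (dnorm(1, 1, 2) + dnorm(1, 2, 2))
```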
124 |
125 | ## Part C
126 |
127 | As $\sigma \rightarrow 0$, the posterior distribution of $\theta$ converges to a point mass at $\theta = 1$. This is because $\normal(1 \mid 2, \sigma)$ becomes negligible relative to $\normal(1 \mid 1, \sigma)$ for sufficiently small $\sigma$.
128 |
129 | As $\sigma \rightarrow \infty$, the posterior probability of $\theta = 1$ converges to 0.5. This is because $\normal(1 \mid 1, \sigma) \approx \normal(1 \mid 2, \sigma)$ for $\sigma \gg 0$.
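
A small sketch of the posterior probability of $\theta = 1$ as a function of $\sigma$ illustrates both limits:

```{r}
posterior_theta1 <- function(sigma) {
  dnorm(1, 1, sigma) / (dnorm(1, 1, sigma) + dnorm(1, 2, sigma))
}

tibble(sigma = 10^seq(-2, 3, 0.1)) %>%
  mutate(posterior = posterior_theta1(sigma)) %>%
  ggplot() +
  aes(sigma, posterior) +
  geom_line() +
  scale_x_log10()
```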
130 |
131 |
--------------------------------------------------------------------------------
/chapter01/exercise02.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 2"
3 | author: "Sören Berg"
4 | date: "11 April 2019"
5 | output: html_document
6 | ---
7 |
8 | # Exercise 2
9 |
10 | ## Part I - equation (1.8) for vectors
11 |
12 | We show that $E(u) = E(E(u|v))$.
13 |
14 | First, note that if $x\in\R^d$ is a vector, the vector-valued integral with $x$ as integrand is defined by $\int x := (\int x_1, \dots, \int x_d)^t$. Therefore, it is sufficient to show
15 |
16 | $$
17 | E(u_i) = \int \int u_i p(u, v) \ du \ dv = \int \int u_i p(u|v) \ du \ p(v) \ dv = \int E(u_i|v) p(v) \ dv = E(E(u_i|v))
18 | $$
19 |
20 | for all $i=1,\dots, d$. This implies that $E(u) = E(E(u|v))$.
21 |
22 |
23 | ## Part II - equation (1.9) for vectors
24 |
25 | We show $Cov(u) = E(Cov(u|v)) + Cov(E(u|v))$.
26 |
27 | For the covariance matrix of a random variable $X=(x_1,\dots, x_d)^t \in\R^d$ with $\mu = E(X)\in\R^d$ it holds
28 |
29 | $$
30 | \begin{aligned}
31 | Cov(X) &= E( (X-\mu)(X-\mu)^t ) \\
32 | &= E\left( \begin{pmatrix} (x_1- \mu_1)^2 & \dots & (x_1-\mu_1)(x_d-\mu_d) \\ \vdots & & \vdots \\ (x_d-\mu_d)(x_1-\mu_1) & \dots & (x_d- \mu_d)^2 \end{pmatrix} \right) \\
33 | &= \begin{pmatrix} Var(x_1) & \dots & Cov(x_1, x_d) \\ \vdots & & \vdots \\ Cov(x_d, x_1) & \dots & Var(x_d) \end{pmatrix}.
34 | \end{aligned}
35 | $$
36 |
37 | Using the identity above and basic rules for expectations, we conclude
38 |
39 | $$
40 | \begin{aligned}
41 | \left(E(Cov(u|v)) + Cov(E(u|v))\right)_{ij} &= E(Cov(u|v)_{ij}) + Cov(E(u|v))_{ij} \\
42 | &= E( Cov(u_i, u_j|v)) + Cov(E(u_i|v), E(u_j|v)) \\
43 | &= E( E(u_iu_j|v) - E(u_i|v)E(u_j|v) ) + E( E(u_i|v)E(u_j|v)) - E( E(u_i|v)) E(E(u_j|v)) \\
44 | &=E( E(u_iu_j|v)) - E( E(u_i|v)) E(E(u_j|v)) \\
45 | &= E(u_iu_j) - E(u_i)E(u_j) \\
46 | &= Cov(u_i, u_j),
47 | \end{aligned}
48 | $$
49 |
50 | for all $i,j \in \{1,\dots,d\}$.
51 |
52 | Therefore, $Cov(u) = E(Cov(u|v)) + Cov(E(u|v))$.
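
As an informal numerical check (a simulation sketch only, not part of the proof; the construction below, with $v$ a fair coin flip and $u = (u_1, u_2)$ a dependent pair whose mean shifts with $v$, is just an illustrative assumption), both sides of the identity can be compared on simulated data:

```{r}
set.seed(1)
n <- 1e5

# v is a coin flip; given v, u = (u1, u2) is a correlated pair whose mean depends on v
v  <- rbinom(n, 1, 0.5)
u1 <- rnorm(n, mean = v, sd = 1)
u2 <- u1 + rnorm(n, mean = 2 * v, sd = 1)
u  <- cbind(u1, u2)

# left-hand side: Cov(u)
cov(u)

# right-hand side: E(Cov(u|v)) + Cov(E(u|v)), where v takes only the values 0 and 1
p1 <- mean(v)
m0 <- colMeans(u[v == 0, ])
m1 <- colMeans(u[v == 1, ])
m  <- (1 - p1) * m0 + p1 * m1
cov_within  <- (1 - p1) * cov(u[v == 0, ]) + p1 * cov(u[v == 1, ])
cov_between <- (1 - p1) * outer(m0 - m, m0 - m) + p1 * outer(m1 - m, m1 - m)
cov_within + cov_between
```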
53 |
--------------------------------------------------------------------------------
/chapter01/exercise03.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 3"
3 | author: "Corrie Bartelheimer"
4 | date: "April 10, 2019"
5 | output:
6 | html_document:
7 | toc: true
8 | ---
9 |
10 | ```{r setup, include=FALSE}
11 | knitr::opts_chunk$set(echo = TRUE)
12 | ```
13 |
14 | # Probability calculation for genetics
15 | Suppose that in each individual of a large population there is a pair of genes, each of which can be either $x$ or $X$, that controls eye color: those with $xx$ have blue eyes, while heterozygotes ($xX$ or $Xx$) and those with $XX$ have brown eyes. The proportion of blue-eyed individuals is $p^2$ and the proportion of heterozygotes is $2p(1-p)$. The proportion of brown-eyed homozygotes is then $(1-p)^2$. Each parent transmits one of its own genes to the child, where the probability of transmitting either gene is $\frac{1}{2}$.
16 |
17 | ## Part 1
18 | Assuming random mating, show that among brown-eyed children of brown-eyed parents, the expected proportion of heterozygotes is $2p/(1+2p)$.
19 |
20 | We write short-hand $Xx$ to mean both heterozygote combinations. We also denote $B$ as brown-eyed.
21 |
22 | We first consider the different gene combinations that a pair of brown-eyed parents can have. There are three gene combinations that make a person brown-eyed, so there are $3 \times 3 = 9$ different combinations for a pair of brown-eyed parents.
23 | Of these 9 combinations, there are
24 |
25 | - 1 combination with both parents homozygote (both are $XX$).
26 | The probability of this happening is $(1-p)^4$.
27 | - 4 combinations with one parent heterozygote and the other homozygote (e.g. $Xx$ and $XX$).
28 | The probability of this combination is $4p(1-p)^3$.
29 | - 4 combinations with both parents heterozygote (e.g. both $Xx$).
30 | The probability is $4p^2(1-p)^2$.
31 |
32 | For each parent combination, there are 4 ways to transmit the genes to a child, each of which is equally likely.
33 |
34 | - If both parents are brown-eyed homozygotes, then all 4 possible transmitted combinations are also homozygote, so $0$ of $4$ are heterozygote.
35 | - If one parent is heterozygote and the other homozygote, we can get the following combinations: $Xx$, $xX$, $XX$, $XX$.
36 | That is, $2$ of $4$ are heterozygote.
37 | - If both parents are heterozygotes, we can get the following combinations: $Xx$, $xX$, $XX$ and $xx$.
38 | Since we already know that both parents and children are brown-eyed, we can omit $xx$ as a possible combination. Thus $2$ of $3$ combinations are heterozygote.
39 |
40 | We can now combine all this information to obtain the desired probability as follows:
41 |
42 |
43 | $$\begin{align*}
44 | Pr(\text{Judy} = Xx | \text{ Judy & parents}= B) &= \frac{0 \times (1-p)^4 + 2 \times 4p(1-p)^3 + 2\times 4p^2(1-p)^2}{4 \times (1-p)^4 + 4 \times 4p(1-p)^3 + 3\times 4p^2(1-p)^2} \\
45 | \\
46 | &= \frac{ (1-p)^2 \left[ 8p(1-p) +8p^2 \right] }{4(1-p)^2 \left[ (1-p)^2 + 4p(1-p) + 3\times p^2 \right]} \\
47 | \\
48 | &= \frac{ 8p(1-p)^2 \left[ 1-p +p \right] }{4(1-p)^2 \left[ 1-2p + p^2 + 4p-4p^2 + 3 p^2 \right]} \\
49 | \\
50 | &= \frac{ 2p }{ 1+2p }
51 | \end{align*}$$
52 |
53 | ## Part 2
54 | Suppose Judy, a brown-eyed child of brown-eyed parents, marries a heterozygote, and they have $n$ children, all brown-eyed. Find the posterior probability that Judy is a heterozygote.
55 |
56 | $$\begin{align*}
57 | Pr(\text{Judy}= Xx | n \text{ children} = B ) &= \frac{ Pr(\text{Judy}= Xx) Pr( n \text{ children} = B | \text{Judy}= Xx) }{Pr( n \text{ children} = B )}
58 | \end{align*}$$
59 |
60 | We calculated above the prior probability that Judy is heterozygote as
61 | $$Pr(\text{Judy}= Xx) = \frac{2p}{1+2p}$$
62 |
63 | To compute the probability of $n$ children, given that Judy is heterozygote, remember from above that when both parents are heterozygote, we have 3 of 4 combinations that result in brown-eyed children. Thus:
64 | $$Pr( n \text{ children} = B |\text{Judy}= Xx ) = \left( \frac{3}{4} \right)^n $$
65 |
66 | Last, we need the probability of the denominator:
67 | $$Pr(n \text{ children} = B ) = Pr(\text{Judy}= Xx) Pr( n \text{ children} = B |\text{Judy}= Xx ) + Pr( \text{Judy}=XX) Pr( n \text{ children} = B |\text{Judy}= XX )$$
68 |
69 | Obviously, $Pr( n \text{ children} = B |\text{Judy}= XX ) = 1$. Also,
70 | $$\begin{align*}
71 | Pr( \text{Judy}=XX) &= 1 - Pr( \text{Judy}=Xx) \\
72 | &= 1 - \frac{2p}{1+2p} \\
73 | &= \frac{1 +2p - 2p}{1+2p} \\
74 | &= \frac{1}{1+2p}
75 | \end{align*}$$
76 |
77 | Plugging all this together, we get:
78 | $$
79 | \begin{align*}
80 | Pr(\text{Judy}= Xx | n \text{ children} = B ) &= \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}}
81 | \end{align*}$$
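
For intuition, here is a small numerical sketch of this posterior as a function of $n$; the allele frequency $p = 0.5$ below is purely illustrative and not given in the exercise:

```{r}
p <- 0.5  # illustrative allele frequency, not given in the exercise
pr_judy_heterozygote <- function(n) {
  ((3/4)^n * 2*p/(1+2*p)) / ((3/4)^n * 2*p/(1+2*p) + 1/(1+2*p))
}

plot(0:10, pr_judy_heterozygote(0:10), type = "b",
     xlab = "n (number of brown-eyed children)",
     ylab = "Pr(Judy heterozygote | data)")
```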
82 |
83 | ## Part 3
84 | Find the probability (given all the information above) that Judy's first grandchild has blue eyes.
85 |
86 | The grandchild can only have blue eyes if Judy's child is heterozygote and the child's random mate is either heterozygote or blue-eyed. For two heterozygote parents, we know the probability of having a blue-eyed kid is $1$ of $4$. If one parent is blue-eyed and the other heterozygote, then $2$ of $4$ combinations result in a blue-eyed kid. So we get:
87 |
88 | $$\begin{align*}
89 | Pr(\text{ Judys grand-kid} = xx) &= Pr(\text{Judys kid} = Xx) \left[ \frac{1}{4} Pr(\text{random mate} = Xx) + \frac{1}{2} Pr(\text{random mate} = xx) \right] \\
90 | &= Pr(\text{Judys kid} = Xx) \left[ \frac{1}{4} 2p(1-p) + \frac{1}{2} p^2 \right] \\
91 | &= Pr(\text{Judys kid} = Xx) \left( \frac{1}{2} p \right)
92 | \end{align*}$$
93 |
94 | To get the probability that Judy's kid is heterozygote, we need to consider the two cases of Judy being heterozygote or homozygote.
95 | We already know the posterior probability that Judy is heterozygote and the probability of a heterozygote kid when both parents are heterozygote. If Judy is homozygote, then the probability of a heterozygote kid is $\frac{1}{2}$.
96 | To get the probability that Judy is homozygote, we only need to change the numerator in the posterior probability:
97 | $$
98 | \begin{align*}
99 | Pr(\text{Judy}= XX | n \text{ children} = B ) &= \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}}
100 | \end{align*}$$
101 |
102 | Thus:
103 | $$
104 | \begin{align*}
105 | Pr(\text{Judys kid} = Xx ) &= \frac{1}{2}Pr(\text{Judy}= XX ) + \frac{2}{3} Pr(\text{Judy}= Xx) \\
106 | \\
107 | &= \frac{1}{2} \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} + \frac{2}{3} \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} \\
108 | \end{align*}$$
109 |
110 |
111 | And therefore we get the following probability for a blue-eyed grand-kid:
112 | $$\begin{align*}
113 | Pr(\text{ Judys grand-kid} = xx) &= Pr(\text{Judys kid} = Xx) \left( \frac{1}{2} p \right) \\
114 | \\
115 | &= \left[ \frac{1}{2} \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} + \frac{2}{3} \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} \right] \left( \frac{1}{2} p \right)
116 | \end{align*}$$
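
Plugging in illustrative values (the $p$ and $n$ below are assumptions made only for this sketch, not part of the exercise), the expression can be evaluated directly:

```{r}
p <- 0.5  # illustrative allele frequency (assumption)
n <- 2    # illustrative number of brown-eyed children (assumption)

pr_judy_Xx <- ((3/4)^n * 2*p/(1+2*p)) / ((3/4)^n * 2*p/(1+2*p) + 1/(1+2*p))
pr_judy_XX <- 1 - pr_judy_Xx
pr_kid_Xx  <- 1/2 * pr_judy_XX + 2/3 * pr_judy_Xx
pr_grandkid_xx <- pr_kid_Xx * (1/2 * p)
pr_grandkid_xx
```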
117 |
--------------------------------------------------------------------------------
/chapter01/exercise08.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Chapter 01 -- Exercise08"
3 | author: "Jan Kraemer"
4 | date: "11 April 2019"
5 | output: html_document
6 | ---
7 |
8 | ```{r setup, include=FALSE}
9 | knitr::opts_chunk$set(echo = TRUE)
10 | ```
11 |
12 | ## The Question
13 |
14 | Subjective probability: discuss the following statement. "The probability of event $E$ is considered 'subjective' if two rational persons $A$ and $B$ can assign unequal probabilities to $E$, $P_A(E)$ and $P_B(E)$. These probabilities can also be interpreted as 'conditional': $P_A(E) = P(E|I_A)$ and $P_B(E) = P(E|I_B)$, where $I_A$ and $I_B$ represent the knowledge available to persons $A$ and $B$ respectively."
15 |
16 | Apply this idea to the following examples:
17 |
18 | a) the probability that a '6' appears when a fair die is rolled, where A observes the outcome of the die roll and B does not.
19 | b) The probability that Brazil wins the next World Cup, where A is ignorant of soccer and B is a knowledgeable sports fan.
20 |
21 | ## An Attempt towards the Answer
22 |
23 | ### Thoughts regarding subjective probability
24 |
25 | The hard part is how to formulate the individual knowledge mathematically.
26 |
27 |
28 | ### Part a)
29 |
30 | #### The Intuitive Approach
31 |
32 | I would posit that for $E=6$: $P(E|I_A)=1$, since she knows the outcome, while $P_B(E) = 1/6$, since B is clueless and just _hoping_ it is a fair die.
33 |
34 | #### Mathematical formalism
35 |
36 | \[
37 | P(E)=\frac{1}{6} \\
38 | I_A: E\in \{6\}; I_B: E \in \{1\dots6\} \\
39 | \\
40 | P_A(E=6)=P(E=6|I_A)=P(E=6|E\in\{6\})=\frac{P(E=6 \cap E\in\{6\})}{P(E\in\{6\})}=\frac{P(E=6)}{P(E\in\{6\})}=1\\
41 | P_B(E=6)=P(E=6|I_B)=P(E=6|E \in \{1\dots6\})=\frac{P(E=6 \cap E \in \{1\dots6\})}{P(E \in \{1\dots6\})} = \frac{P(E=6)}{P(E \in \{1\dots6\})} = \frac{1/6}{1}=\frac{1}{6}
42 | \]
43 |
44 |
45 | ### Part b)
46 |
47 | #### The Intuitive Approach
48 |
49 | This is a hard one since I am rather well modeled by person $A$. So I went to Wikipedia, saw that Brazil has won 5 titles so far, and posit: $P(\text{'Brazil wins next World Cup'}|\text{'wikipedia'})=P(\text{'Brazil won any one World Cup in the past'})=\frac{5}{21}$
50 |
51 | As for $B$: I have no idea how to begin to model that, even intuitively...
52 |
53 | #### Mathematical formalism
54 |
55 | I don't even...
56 |
--------------------------------------------------------------------------------
/chapter01/exercise09.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 9"
3 | author: "Corrie"
4 | output: html_document
5 | ---
6 |
7 | ```{r setup, include=FALSE}
8 | knitr::opts_chunk$set(echo = TRUE, comment=NA)
9 | ```
10 |
11 | ```{r}
12 | library(tidyverse)
13 | ```
14 |
15 |
16 | A clinic has three doctors. Patients come into the clinic at random, starting at 9am, according to a Poisson process with time parameter 10 minutes; that is, the time after opening at which the first patient appears follows an exponential distribution with expectation 10 minutes and then, after each patient arrives, the waiting time until the next patient arrives is independently exponentially distributed, also with expectation 10 minutes. When a patient arrives, s/he waits until a doctor is available. The amount of time spent by each doctor with each patient is a random variable, uniformly distributed between 5 and 20 minutes. The office stops admitting new patients at 4pm and closes when the last patient is through with the doctor.
17 |
18 |
19 | We observe 7 hours (from 9am to 4pm) of patients arriving. That is `r 7*60` minutes. We thus sample patient inter-arrival times (the waiting times between the arrivals of consecutive patients) from $\text{Exp}(\text{rate}=1/10)$ until the cumulative arrival time exceeds 420 minutes. Or, in a vectorized way, we sample a generous number of inter-arrival times (say $n=100$) and discard all arrivals whose cumulative time is above 420.
20 | ```{r}
21 | patients <- rexp(n=100, rate=1/10)
22 | arrival_times <- cumsum(patients)
23 | arrival_times <- arrival_times[arrival_times <= 420 ]
24 |
25 |
26 | ( num_patients <- length(arrival_times) )
27 | ```
28 |
29 | Since each patient will spend between 5 and 20 minutes with the doctor (uniformly distributed), independent of the doctor, we can simulate the time spent with the doctor already at arrival time (how convenient).
30 | ```{r}
31 | consultation_time <- runif(num_patients, min=5, max=20)
32 | ```
33 |
34 | Since we already know in advance how long a doctor will spend with each patient, we also know in advance which doctor will be available next. So every time a new patient arrives, we check which doctor will be free soonest; the patient enters the room at the later of their arrival time and that doctor's free time, and the consultation time is then added to get the time at which that doctor is free again.
35 | ```{r}
36 | simulate <- function(num_doctors=3, mean_wait_time=10,
37 | opening_time=9, closing_time=16,
38 | min_consul_time=5,
39 | max_consul_time=20){
40 |
41 | hours <- closing_time - opening_time
42 | minutes <- hours * 60
43 | patients <- rexp(n=200, rate=1/mean_wait_time)
44 | arrival_times <- cumsum(patients)
45 | arrival_times <- arrival_times[arrival_times <= minutes ]
46 |
47 |
48 | num_patients <- length(arrival_times)
49 |
50 | consultation_time <- runif(num_patients, min=min_consul_time, max=max_consul_time)
51 |
52 | # initiate doctors
53 | doctors <- rep(0, num_doctors)
54 | # waiting time
55 | waiting_times <- c()
56 | num_waiters <- 0
57 | for (i in 1:num_patients) {
58 | # which doctor will be next available?
59 | next_free_doctor <- which.min(doctors)
60 | # what time will the doctor be available?
61 | next_free_doctor_time <- doctors[next_free_doctor]
62 | # does the patient have to wait?
63 | if (next_free_doctor_time > arrival_times[i]) {
64 | enter_room <- next_free_doctor_time
65 | waiting_times[i] <- enter_room - arrival_times[i]
66 | num_waiters <- num_waiters + 1
67 | } else {
68 | enter_room <- arrival_times[i]
69 | waiting_times[i] <- 0
70 | }
71 | leave_room <- enter_room + consultation_time[i]
72 | doctors[next_free_doctor] <- leave_room
73 | }
74 | #num_waiters <- sum( waiting_times > 0 )
75 | office_close <- max(doctors)
76 | avg_wait_all <- mean(waiting_times)
77 | avg_wait <- mean( waiting_times[waiting_times > 0 ])
78 | rel_closing <- office_close - 420
79 | res <- c(num_patients, num_waiters, office_close, rel_closing, avg_wait_all, avg_wait)
80 | names(res) <- c("num_patients", "num_waiters", "office_close", "rel_closing", "avg_wait_all", "avg_wait")
81 | as.data.frame(t(res))
82 | }
83 |
84 | simulate()
85 | ```
86 |
87 | We can now easily simulate the process multiple times:
88 | ```{r}
89 | num_sims <- 10000
90 | sims <- map_dfr(1:num_sims, .f=function(x) {simulate(num_doctors = 3) } )
91 | ```
92 |
93 | From this, we can estimate the median and 50% interval for the number of patients in a day, how many patients have to wait, the closing time of the office and the average waiting time.
94 | ```{r}
95 | summary(sims)
96 | ```
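
To read off the medians and central 50% intervals directly, here is a small sketch using `quantile` on the same simulation results:

```{r}
sims %>%
  replace_na(list(avg_wait = 0)) %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  summarise(
    q25    = quantile(value, 0.25),
    median = median(value),
    q75    = quantile(value, 0.75)
  )
```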
97 |
98 | ```{r, fig.height=8, fig.width=8, dpi=400}
99 | sims %>%
100 | select(-office_close) %>%
101 | replace_na(list(avg_wait = 0)) %>%
102 | pivot_longer(everything()) %>%
103 | ggplot(aes(x=value)) +
104 | geom_histogram(bins=20, fill="#377EB8", col="white") +
105 | facet_wrap(~name, scales="free") +
106 | theme_minimal()
107 |
108 | ```
109 |
110 |
--------------------------------------------------------------------------------
/chapter01/exercise09/Aaron - PoissonPatients.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# Aaron\n",
10 | "import numpy.random as nr\n",
11 | "import itertools\n",
12 | "from collections import namedtuple\n",
13 | "import pandas as pd"
14 | ]
15 | },
16 | {
17 | "cell_type": "code",
18 | "execution_count": 2,
19 | "metadata": {},
20 | "outputs": [],
21 | "source": [
22 | "start_time = 9.0 # 9am\n",
23 | "end_time = 16.0 # 4pm\n",
24 | "hourly_rate = 10/60 # every 10 minutes\n",
25 | "process_time_min_max = (5/60, 20/60) # uniform between 5 and 20 minutes"
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": 3,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "def arrival_times():\n",
35 | " current_time = start_time\n",
36 | " while True:\n",
37 | " current_time += nr.exponential(hourly_rate)\n",
38 | " if current_time > end_time:\n",
39 | " break\n",
40 | " else:\n",
41 | " yield current_time\n",
42 | " \n",
43 | "def process_times():\n",
44 | " while True: yield nr.uniform(*process_time_min_max)\n",
45 | " \n",
46 | "def patients():\n",
47 | " return zip(arrival_times(), process_times())\n",
48 | " "
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 4,
54 | "metadata": {},
55 | "outputs": [
56 | {
57 | "name": "stdout",
58 | "output_type": "stream",
59 | "text": [
60 | "(9.22647697452822, 0.2655797646475507)\n",
61 | "(9.269022429514955, 0.18736198398352416)\n",
62 | "(9.57737637227136, 0.18725573329773443)\n",
63 | "(9.719719861923531, 0.2801266643451028)\n",
64 | "(9.782588453675242, 0.08807369500432473)\n",
65 | "(9.80840073728496, 0.3036979691681516)\n"
66 | ]
67 | }
68 | ],
69 | "source": [
70 | "for a in itertools.islice(patients(), 6):\n",
71 | " print(a)"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 5,
77 | "metadata": {},
78 | "outputs": [],
79 | "source": [
80 | "Patient_record = namedtuple('Patient_record', 'arrival wait process_time')\n",
81 | "def simulate():\n",
82 | " doctors = [start_time for _ in range(3)]\n",
83 | " for arrival_time, process_time in patients():\n",
84 | " # A patient has just arrived\n",
85 | " # Which doctor will see this patient?\n",
86 | " doctors = sorted(doctors)\n",
87 | " first_free_doctor = doctors.pop(0)\n",
88 | " # when will this patient be seen?\n",
89 | " enter_the_room = max(arrival_time, first_free_doctor)\n",
90 | " leave_the_room = enter_the_room + process_time\n",
91 | " # this doctor will be busy until then\n",
92 | " doctors.append(leave_the_room)\n",
93 | " assert len(doctors) == 3\n",
94 | " yield Patient_record(arrival_time, enter_the_room-arrival_time, process_time)"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": 6,
100 | "metadata": {},
101 | "outputs": [
102 | {
103 | "data": {
373 | "text/plain": [
374 | " arrival wait process_time\n",
375 | "0 9.057364 0.000000 0.314053\n",
376 | "1 9.133635 0.000000 0.144422\n",
377 | "2 9.245335 0.000000 0.282666\n",
378 | "3 9.615008 0.000000 0.153809\n",
379 | "4 9.835923 0.000000 0.206641\n",
380 | "5 9.995190 0.000000 0.318543\n",
381 | "6 10.202111 0.000000 0.106085\n",
382 | "7 10.374703 0.000000 0.210055\n",
383 | "8 10.428493 0.000000 0.324764\n",
384 | "9 10.582662 0.000000 0.283858\n",
385 | "10 10.701231 0.000000 0.311638\n",
386 | "11 10.728006 0.025251 0.228080\n",
387 | "12 10.731422 0.135098 0.278255\n",
388 | "13 11.096205 0.000000 0.140842\n",
389 | "14 11.100451 0.000000 0.263788\n",
390 | "15 11.456833 0.000000 0.103892\n",
391 | "16 11.845074 0.000000 0.272671\n",
392 | "17 12.356279 0.000000 0.143145\n",
393 | "18 12.644723 0.000000 0.122049\n",
394 | "19 12.791327 0.000000 0.282262\n",
395 | "20 12.983407 0.000000 0.146275\n",
396 | "21 13.171120 0.000000 0.092032\n",
397 | "22 13.454217 0.000000 0.285920\n",
398 | "23 13.753393 0.000000 0.192504\n",
399 | "24 13.987684 0.000000 0.199915\n",
400 | "25 14.093300 0.000000 0.210698\n",
401 | "26 14.173222 0.000000 0.205493\n",
402 | "27 14.302395 0.000000 0.306721\n",
403 | "28 14.683552 0.000000 0.220661\n",
404 | "29 14.701095 0.000000 0.279457\n",
405 | "30 14.735536 0.000000 0.105993\n",
406 | "31 14.881566 0.000000 0.186963\n",
407 | "32 14.907767 0.000000 0.199865\n",
408 | "33 15.101449 0.000000 0.145699\n",
409 | "34 15.379239 0.000000 0.131209\n",
410 | "35 15.447321 0.000000 0.102237\n",
411 | "36 15.619611 0.000000 0.110790\n",
412 | "37 15.654948 0.000000 0.222799\n",
413 | "38 15.728213 0.000000 0.155223\n",
414 | "39 15.914905 0.000000 0.257484"
415 | ]
416 | },
417 | "execution_count": 6,
418 | "metadata": {},
419 | "output_type": "execute_result"
420 | }
421 | ],
422 | "source": [
423 | "simulated_data = pd.DataFrame(list(simulate()))\n",
424 | "simulated_data"
425 | ]
426 | },
427 | {
428 | "cell_type": "code",
429 | "execution_count": 7,
430 | "metadata": {},
431 | "outputs": [],
432 | "source": [
433 | "Record_of_one_simulation = namedtuple('Record_of_one_simulation', 'n number_of_waiters average_wait_of_waiters closing_time')\n",
434 | "def one_simulation():\n",
435 | " simulated_data = pd.DataFrame(list(simulate()))\n",
436 | " # How many patients came to the office?\n",
437 | " n = simulated_data.shape[0]\n",
438 | " # How many had to wait for a doctor?\n",
439 | " waiters = simulated_data.query('wait > 0')\n",
440 | " number_of_waiters = waiters.shape[0]\n",
441 | " # What was their average wait?\n",
442 | " average_wait_of_waiters = waiters.wait.mean()\n",
443 | " # When did the office close?\n",
444 | " closing_time = float(simulated_data.iloc[[-1],:].eval('arrival+wait+process_time'))\n",
445 | " rec = Record_of_one_simulation(n, number_of_waiters, average_wait_of_waiters, closing_time)\n",
446 | " return rec"
447 | ]
448 | },
449 | {
450 | "cell_type": "code",
451 | "execution_count": 8,
452 | "metadata": {},
453 | "outputs": [
454 | {
455 | "name": "stdout",
456 | "output_type": "stream",
457 | "text": [
458 | " n number_of_waiters average_wait_of_waiters closing_time\n",
459 | "0 50 11 0.074844 16.068317\n",
460 | "1 45 8 0.077117 16.015736\n",
461 | "2 39 7 0.079502 15.944850\n",
462 | "3 45 2 0.083258 16.120998\n",
463 | "4 44 7 0.077304 16.067006\n",
464 | "5 42 5 0.040500 15.947436\n",
465 | "6 37 7 0.076368 16.000642\n",
466 | "7 36 5 0.114543 16.273268\n",
467 | "8 51 9 0.056909 16.060058\n",
468 | "9 40 7 0.104967 16.289932\n"
469 | ]
470 | }
471 | ],
472 | "source": [
473 | "number_of_simulations = 10\n",
474 | "sims = pd.DataFrame([one_simulation() for sim in range(number_of_simulations)])\n",
475 | "\n",
476 | "print(sims)"
477 | ]
478 | },
479 | {
480 | "cell_type": "code",
481 | "execution_count": null,
482 | "metadata": {},
483 | "outputs": [],
484 | "source": []
485 | }
486 | ],
487 | "metadata": {
488 | "kernelspec": {
489 | "display_name": "Python 3",
490 | "language": "python",
491 | "name": "python3"
492 | },
493 | "language_info": {
494 | "codemirror_mode": {
495 | "name": "ipython",
496 | "version": 3
497 | },
498 | "file_extension": ".py",
499 | "mimetype": "text/x-python",
500 | "name": "python",
501 | "nbconvert_exporter": "python",
502 | "pygments_lexer": "ipython3",
503 | "version": "3.6.7"
504 | }
505 | },
506 | "nbformat": 4,
507 | "nbformat_minor": 2
508 | }
509 |
--------------------------------------------------------------------------------
/chapter02/2008ElectionResult.csv:
--------------------------------------------------------------------------------
1 | state,vote_Obama,vote_Obama_pct,vote_McCain,vote_McCain_pct,electoral_vote_dem,electoral_vote_rep
2 | Alabama,811764,38.8,1264879,60.4,,9
3 | Alaska,105650,37.7,168844,60.2,,3
4 | Arizona,948648,45,1132560,53.8,,10
5 | Arkansas,418049,38.8,632672,58.8,,6
6 | California,7245731,60.9,4434146,37.3,55,
7 | Colorado,1216793,53.5,1020135,44.9,9,
8 | Connecticut,979316,60.5,620210,38.3,7,
9 | Delaware,255394,61.9,152356,37,3,
10 | District of Columbia,210403,92.9,14821,6.5,3,
11 | Florida,4143957,50.9,3939380,48.4,27,
12 | Georgia,1843452,47,2048244,52.2,,15
13 | Hawaii,324918,71.8,120309,26.6,4,
14 | Idaho,235219,36.1,400989,61.5,,4
15 | Illinois,3319237,61.8,1981158,36.9,21,
16 | Indiana,1367264,49.9,1341101,49,11,
17 | Iowa,818240,54,677508,44.7,7,
18 | Kansas,499979,41.4,685541,56.8,,6
19 | Kentucky,746510,41.1,1043264,57.5,,8
20 | Louisiana,780981,39.9,1147603,58.6,,9
21 | Maine,421484,57.6,296195,40.5,4,
22 | Maryland,1579890,61.9,938671,36.8,10,
23 | Massachusetts,1891083,62,1104284,36.2,12,
24 | Michigan,2867680,57.4,2044405,40.9,17,
25 | Minnesota,1573323,54.2,1275400,44,10,
26 | Mississippi,520864,42.8,687266,56.4,,6
27 | Missouri,1439364,49.3,1444352,49.4,,11
28 | Montana,229725,47.2,241816,49.7,,3
29 | Nebraska,324352,41.5,446039,57,1,4
30 | Nevada,531884,55.1,411988,42.7,5,
31 | New Hampshire,384591,54.3,316937,44.8,4,
32 | New Jersey,2085051,56.8,1545495,42.1,15,
33 | New Mexico,464458,56.7,343820,42,5,
34 | New York,4363386,62.2,2576360,36.7,31,
35 | North Carolina,2123390,49.9,2109698,49.5,15,
36 | North Dakota,141113,44.7,168523,53.3,,3
37 | Ohio,2708685,51.2,2501855,47.2,20,
38 | Oklahoma,502294,34.4,959745,65.6,,7
39 | Oregon,978605,57.1,699673,40.8,7,
40 | Pennsylvania,3192316,54.7,2586496,44.3,21,
41 | Rhode Island,281209,63.1,157317,35.3,4,
42 | South Carolina,850121,44.9,1018756,53.8,,8
43 | South Dakota,170886,44.7,203019,53.2,,3
44 | Tennessee,1081074,41.8,1470160,56.9,,11
45 | Texas,3521164,43.8,4467748,55.5,,34
46 | Utah,301771,34.2,555497,62.9,,5
47 | Vermont,219105,67.8,98791,30.6,3,
48 | Virginia,1958370,52.7,1726053,46.4,13,
49 | Washington,1547632,57.5,1097176,40.5,11,
50 | West Virginia,301438,42.6,394278,55.7,,5
51 | Wisconsin,1670474,56.3,1258181,42.4,10,
52 | Wyoming,80496,32.7,160639,65.2,,3
53 |
--------------------------------------------------------------------------------
/chapter02/exercise03.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 3"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | The number of 6's in 1000 rolls of a fair die is $\mathrm{Bin}\left(1000, \frac{1}{6}\right)$ distributed with mean $\frac{1000}{6}$ and standard deviation $\sqrt{1000 \frac{1}{6}\frac{5}{6}}$. We approximate this distribution with a normal distribution with same mean and standard deviation.
11 |
12 | ```{r}
13 | x <- seq(100,220,1)
14 | y <- dnorm(x, 1000/6, sqrt(1000*1/6*5/6))
15 | plot(x, y, type="l")
16 | ```
17 |
18 | ## (b)
19 |
20 | ```{r}
21 | qnorm(0.05, 1000/6, sqrt(1000*1/6*5/6))
22 | qnorm(0.25, 1000/6, sqrt(1000*1/6*5/6))
23 | qnorm(0.5, 1000/6, sqrt(1000*1/6*5/6))
24 | qnorm(0.75, 1000/6, sqrt(1000*1/6*5/6))
25 | qnorm(0.95, 1000/6, sqrt(1000*1/6*5/6))
26 | ```
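
As a sanity check (a quick sketch using base R's `qbinom`), the exact binomial quantiles are close to the normal approximation:

```{r}
qbinom(c(0.05, 0.25, 0.5, 0.75, 0.95), size = 1000, prob = 1/6)
```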
27 |
--------------------------------------------------------------------------------
/chapter02/exercise04.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 4"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | The distribution for $\theta$ is discrete (note that the three probabilities given in the exercise sum up to one).
11 | First, let $\theta=\frac{1}{12}$. The binomial distribution
12 | $\mathrm{Bin}(n,\theta)$ has mean $\frac{1000}{12}$ and standard deviation $\sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}$, and we approximate the conditional distribution $p(y|\theta)$ with $\mathrm{N}\left(\frac{1000}{12}, \sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}\right)$. The cases $\theta=1/6$ and $\theta=1/4$ can be handled analogously so that our approximation of the prior predictive distribution for $y$ is a sum (or mixture) of three normals. More precisely,
13 |
14 | $$
15 | p(y) = \sum_{\theta\in\{1/12,1/6,1/4\}}p(y,\theta) = \sum_{\theta\in\{1/12,1/6,1/4\}} p(\theta) p(y|\theta)
16 | $$
17 |
18 | is approximated with
19 |
20 | $$
21 | \mathrm{Pr}\left(\theta=\frac{1}{12}\right) \cdot \mathrm{N}\left(\frac{1000}{12}, \sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}\right)
22 | + \mathrm{Pr}\left(\theta=\frac{1}{6}\right) \cdot \mathrm{N}\left(\frac{1000}{6}, \sqrt{1000\tfrac{1}{6}\tfrac{5}{6}}\right)
23 | + \mathrm{Pr}\left(\theta=\frac{1}{4}\right) \cdot \mathrm{N}\left(\frac{1000}{4}, \sqrt{1000\tfrac{1}{4}\tfrac{3}{4}}\right)
24 | $$
25 |
26 | Sketch:
27 |
28 | ```{r}
29 | x <- seq(50,300, 1)
30 | y <- 1/4 * dnorm(x, 1000/12, sqrt(1000*1/12*11/12)) + 1/2 * dnorm(x, 1000/6, sqrt(1000*1/6*5/6)) + 1/4 * dnorm(x, 1000/4, sqrt(1000*1/4*3/4))
31 | plot(x, y, type="l")
32 | ```
33 |
34 | ## (b)
35 |
36 | The sketch of the prior predictive distribution (alternatively, one can compare the locations and standard deviations of the three normals analytically) shows that the overlap between the three bell-shaped curves is negligible for approximation purposes. The first (left-most) bell-shaped curve accounts for $1/4$ of the overall mass of the prior predictive distribution. Therefore, the $5$% (or $\frac{1}{20}$) point is approximately the $4\cdot\frac{1}{20}=\frac{1}{5}$ (or $20$%) point of the first normal, which is
37 |
38 | ```{r}
39 | qnorm(1/5, 1000/12, sqrt(1000*1/12*11/12))
40 | ```
41 |
42 | The $25$% point thus lies between the first two bell-shaped curves, at approximately $125$.
43 | Since the last normal also accounts for $1/4$ of the overall mass, the $50$% point is approximately the mean of the second normal (at around $1000/6 = 166.6\ldots$).
44 | The $75$% point lies between the second and third normals, approximately at around $205$, and the $95$% point is approximately the $80$% point of the third normal, at around
45 |
46 | ```{r}
47 | qnorm(4/5, 1000/4, sqrt(1000*1/4*3/4))
48 | ```
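
The same quantiles can also be checked numerically from the mixture itself; a rough sketch using the grid from part (a) and the cumulative sum of the density:

```{r}
x <- seq(50, 300, 1)
y <- 1/4 * dnorm(x, 1000/12, sqrt(1000*1/12*11/12)) +
  1/2 * dnorm(x, 1000/6, sqrt(1000*1/6*5/6)) +
  1/4 * dnorm(x, 1000/4, sqrt(1000*1/4*3/4))
cdf <- cumsum(y) / sum(y)
sapply(c(0.05, 0.25, 0.5, 0.75, 0.95), function(p) x[which.min(abs(cdf - p))])
```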
49 |
--------------------------------------------------------------------------------
/chapter02/exercise07.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 7"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | We show that for the binomial likelihood $y \sim \mathrm{Bin}(n, \theta)$, the prior $p(\theta)\propto \theta^{-1}(1-\theta)^{-1}$ corresponds to a uniform prior distribution on the natural parameter of the exponential family.
11 |
12 | We start by determining the natural parameter
13 |
14 | $$
15 | \begin{aligned}
16 | p(y | \theta) &= \binom{n}{y} \theta^y (1-\theta)^{n-y}
17 | = \binom{n}{y} (1-\theta)^n e^{\log(\theta^y (1-\theta)^{-y})}
18 | = \binom{n}{y} (1-\theta)^n e^{\left(\log\frac{\theta}{1-\theta}\right)y}.
19 | \end{aligned}
20 | $$
21 |
22 | This also verifies that the binomial distribution belongs to the exponential
23 | family with $f(y)=\binom{n}{y}$, $g(\theta)= (1-\theta)^n$, $\phi(\theta)=\mathrm{logit}(\theta)=\log\frac{\theta}{1-\theta}$ and $u(y)=y$, cf. notation on page 36 BDA3.
24 |
25 | Now, let $p$ be a density with $p(\phi(\theta))\propto 1$. Since $\phi$ is a one-to-one transformation we can transform the density as follows. Note that said transformation can be done by equation (2.19) on page 52 BDA3, which is a special case of more general transformation theorems for integrals. First, note that the inverse of the logit function is the sigmoid function $\phi^{-1}(\theta) = \frac{1}{1+e^{-\theta}}$.
26 | We conclude
27 |
28 | $$
29 | \begin{aligned}
30 | p(\theta)
31 | &= p(\phi^{-1}(\phi(\theta)))
32 | = p(\phi(\theta)) \left| \frac{d\phi(\theta)}{d\theta}\right|
33 | = p(\phi(\theta)) \left| \frac{1}{\tfrac{\theta}{1-\theta}} \frac{1}{(1-\theta)^2}\right|\\
34 | &= p(\phi(\theta)) \left| \frac{1}{\theta(1-\theta)}\right|
35 | = p(\phi(\theta)) \frac{1}{\theta(1-\theta)}
36 | \propto \frac{1}{\theta(1-\theta)}=\theta^{-1}(1-\theta)^{-1}.
37 | \end{aligned}
38 | $$
39 |
40 |
41 | ## (b)
42 | We show that if $y=0$ or $y=n$ the resulting posterior is improper.
43 |
44 | The resulting posterior is proportional to the prior times the likelihood. Thus,
45 |
46 | $$
47 | p(\theta|y) \propto p(\theta) p(y|\theta)
48 | = \theta^{-1}(1-\theta)^{-1} \binom{n}{y}\theta^y(1-\theta)^{n-y}
49 | \propto \theta^{y-1}(1-\theta)^{n-y-1}.
50 | $$
51 |
52 | Let $y=0$, then the integral over the posterior is (except some nonzero normalization factor)
53 |
54 | $$
55 | \int_0^1 \theta^{y-1}(1-\theta)^{n-y-1} \ \mathrm{d}\theta
56 | = \int_0^1 \theta^{-1}(1-\theta)^{n-1} \ \mathrm{d}\theta
57 | \geq 2^{-(n-1)} \int_0^{1/2} \theta^{-1} \ \mathrm{d}\theta
58 | = 2^{-(n-1)} \lim_{\varepsilon\searrow 0} \left( \log\tfrac{1}{2} - \log \varepsilon \right) = \infty,
59 | $$
60 | where the inequality follows since $(1-\theta)^{n-1} \geq 2^{-(n-1)}$ for all $\theta\in[0,\tfrac{1}{2}]$.
61 | The case $y=n$ can be tackled analogously.
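
A quick numerical illustration of the divergence (a sketch, assuming $n = 10$ for concreteness): the unnormalised posterior mass on $[\varepsilon, 1]$ keeps growing as $\varepsilon \rightarrow 0$.

```{r}
n <- 10
f <- function(theta) theta^(-1) * (1 - theta)^(n - 1)
sapply(c(1e-2, 1e-4, 1e-6, 1e-8),
       function(eps) integrate(f, lower = eps, upper = 1, subdivisions = 1000)$value)
```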
62 |
--------------------------------------------------------------------------------
/chapter02/exercise10.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 10"
3 | author: "Corrie Bartelheimer"
4 | output:
5 | html_document:
6 | toc: true
7 | ---
8 | ```{r setup, include=FALSE}
9 | knitr::opts_chunk$set(echo = TRUE, comment=NA)
10 | ```
11 | ## Problem
12 | Suppose there are $N$ cable cars in San Francisco, numbered sequentially from 1 to $N$. You see a cable car at random; it is numbered 203. You wish to estimate $N$.
13 |
14 |
15 | ## Some preliminary thoughts
16 | Before starting to solve the exercise, let's think what we would intuitively expect without using any statistical knowledge.
17 | Obviously, having seen a cable car numbered 203, we know that $N$ cannot be lower than 203. So $N$ must be 203 or higher. My first guess after seeing the number 203 would be that $N$ is around 400. My intuitive reasoning is as follows: if $N$ were much larger than 203, the probability of seeing a cable car with a number higher than 203 would also be large. On the other hand, if $N$ were close to 203, the probability of seeing a car numbered below 203 would be large. I expect to see a car with a number smaller than $\lfloor \frac{N}{2}\rfloor$ about half the time and a car with a number larger than $\frac{N}{2}$ the other half, hence the guess of around 400 after seeing car 203.
18 | Let's see if this intuition agrees with the math.
19 |
20 | ## Part (a) - Compute the posterior
21 | Assume your prior distribution on $N$ is geometric with mean 100; that is
22 | $$p(N) = \frac{1}{100} \left(\frac{99}{100}\right)^{N-1}$$
23 | for $N =1, 2,...$.
24 | What is your posterior distribution for $N$?
25 |
26 | For this problem, we can approximate the posterior using grid approximation.
27 | ```{r}
28 | N_seq <- 1:1000000
29 |
30 | prior_mean <- 100
31 | prior <- 1/prior_mean * (1-1/prior_mean)^(N_seq - 1)
32 | ```
33 | We use the following function for the likelihood:
34 | $$P(y \,|\, N) = \begin{cases}
35 | \frac{1}{N} &\text{ if }N \geq y\\
36 | 0 & \text{otherwise} \end{cases}$$
37 | where $y$ denotes our observed data, here $y=203$.
38 |
39 | We then compute the likelihood for all possible $N$ (as big as computationally still possible), multiply the prior with the likelihood and simply standardize the posterior by dividing out its sum:
40 | ```{r}
41 | lkhd <- ifelse( N_seq >= 203, 1/N_seq, 0)
42 |
43 | unstzd.post <- prior * lkhd
44 |
45 | post <- unstzd.post / sum(unstzd.post)
46 |
47 |
48 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
49 | main="Posterior", xlab="N", ylab = "")
50 | ```
51 |
52 | The posterior is highly skewed with very small probability mass on very large values of $N$. By definition of our likelihood, there's no probability mass on values of $N$ below 203. Interestingly, and different from my intuition, most probability mass is on the values just above 203.
53 |
54 | ## Part (b) - Computing posterior summaries
55 | What are the posterior mean and standard deviation of $N$?
56 |
57 | To compute the posterior mean and standard deviation, we sample from our posterior distribution:
58 | ```{r}
59 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
60 | hist(post_sample, breaks = 30, main="Histogram of the posterior sample",
61 | xlab="N")
62 | ```
63 |
64 | Computing posterior summaries is then straight-forward:
65 | ```{r}
66 | mean(post_sample)
67 | ```
68 |
69 | ```{r}
70 | median(post_sample)
71 | ```
72 |
73 |
74 | ```{r}
75 | sd(post_sample)
76 | ```
77 |
78 | ```{r}
79 | Mode <- function(x) {
80 | ux <- unique(x)
81 | ux[which.max(tabulate(match(x, ux)))]
82 | }
83 | Mode(post_sample)
84 | ```
85 |
86 | Our posterior distribution tells us to expect $N$ to be around 280, much smaller than my guess of around 400. One reason for this value is also our prior: Since we only have one observation, our prior has a strong influence. Picking different prior means for the geometric prior distribution yields different posterior means.
87 |
88 | ## Part (c) - Non-informative prior
89 | Choose a reasonable 'non-informative' prior distribution for $N$ and give the resulting posterior distribution, mean and standard deviation for $N$.
90 |
91 | A first idea would be to take a flat uniform prior: $P(N) \propto 1$. However, this leads to an improper posterior:
92 |
93 | $$\begin{align*}
94 | P(N \,|\,y) &\propto P(N) P(y \,|\, N) &\\
95 | &\propto P(y \,|\, N) &\\
96 | &\propto \frac{1}{N} & \text{if } N \geq y
97 | \end{align*}$$
98 | Since $\sum_{N=1}^{\infty} \frac{1}{N} = \infty$, this posterior would be improper. Simulating this shows the problems we get:
99 | ```{r}
100 | N_seq <- 1:100000
101 | prior <- 1
102 |
103 | lkhd <- ifelse( N_seq >= 203, 1/N_seq, 0)
104 |
105 | unstzd.post <- prior * lkhd
106 |
107 | post <- unstzd.post / sum(unstzd.post)
108 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
109 |
110 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
111 | main="Posterior", xlab="N", ylab = "")
112 | ```
113 |
114 | There's too much probability mass on high values; if we try to compute the mean, we get unreasonably high values, and the result changes wildly if we change the support over which we approximate the posterior.
115 |
116 | Another option for the prior is $P(N) \propto \frac{1}{N}$. This is an improper prior (same reasoning as above, it doesn't sum to 1) but it leads to a proper posterior:
117 | $$\begin{align*}
118 | P(N \,|\,y) &\propto P(N) P(y \,|\, N) &\\
119 | &\propto \frac{1}{N} \cdot \frac{1}{N} & \text{if } N \geq y\\
120 | &\propto \frac{1}{N^2} & \text{if } N \geq y \\
121 | &=
122 | c\frac{1}{N^2} & \text{if } N \geq y, \text{ for some } c\\
123 | \end{align*}$$
124 | since the sum $\sum_{N=1}^{\infty} \frac{1}{N^2}$ converges.
125 |
126 | We can compute $c$ as follows:
127 | $$\begin{align*}
128 | 1 &= c\sum_{N=203}^\infty \frac{1}{N^2} \\
129 | \iff \quad \frac{1}{c}&= \sum_{N=203}^\infty \frac{1}{N^2}\\
130 | & = \sum_{N=1}^\infty \frac{1}{N^2} - \sum_{N=1}^{202} \frac{1}{N^2} \\
131 | & = \frac{\pi^2}{6} - \sum_{N=1}^{202} \frac{1}{N^2}
132 | \end{align*}$$
133 |
134 | We compute $c$ numerically:
135 | ```{r}
136 | one_c <- pi^2/6 - sum( 1/(1:202)^2)
137 | c <- 1/one_c
138 | c
139 | ```
140 |
141 | ```{r}
142 | N_seq <- 1:1000000
143 |
144 | post <- ifelse( N_seq >= 203, c/N_seq^2, 0)
145 |
146 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
147 |
148 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
149 | main="Posterior", xlab="N", ylab = "")
150 | ```
151 |
152 | It is straight-forward to see that this posterior does not have a mean:
153 | $$\begin{align*}
154 | E(N\,|\,y) &= \sum_{N=1}^\infty N \cdot P(N\,|\,y) \\
155 | &= \sum_{N=203}^\infty N \frac{c}{N^2} \\
156 | &= \sum_{N=203}^\infty \frac{c}{N} \\
157 | &= \infty
158 | \end{align*}$$
159 |
160 |
--------------------------------------------------------------------------------
/chapter02/exercise12/exercise12.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Chapter 2 - Exercise 12"
3 | author: "Aaron McDaid - aaron.mcdaid@gmail.com"
4 | date: "2 May 2019"
5 | output: html_document
6 | ---
7 |
8 |
18 |
19 | ```{r setup, include=FALSE}
20 | knitr::opts_chunk$set(
21 | # cache = TRUE,
22 | # dev = "svglite",
23 | echo = TRUE,
24 | comment = NA,
25 | message = FALSE,
26 | warning = TRUE,
27 | error = TRUE
28 | )
29 |
30 | library(tidyverse)
31 | library(scales)
32 | library(kableExtra)
33 | library(here)
34 |
35 | theme_set(theme_bw())
36 | ```
37 |
38 |
39 | ### The question
40 | Question 15: Joint distributions: The autoregressive time-series model $y_1$, $y_2$, $\dots$ with mean level $0$,
41 | autocorrelation $0.8$, residual standard deviation $1$, and normal errors can be written as
42 | $$(y_t \mid y_{t-1}, y_{t-2}, \dots) \sim N(0.8y_{t-1}, 1) \qquad \mbox{ for all $t$.}$$
43 |
44 | (a) Prove that the distribution of $y_t$, given the observations at all other integer time points
45 | $t$, depends only on $y_{t-1}$ and $y_{t+1}$.
46 |
47 | (b) What is the distribution of $y_t$ given $y_{t-1}$ and $y_{t+1}$ ?
48 |
49 | _This answer is just for part (a) of question 15. Also, it's very much a work-in-progress as we try to make it clear and correct_
50 |
51 | ### some notation
52 |
53 | First define some notation to make things more concise.
54 | Define $\yy_{a..b}$ as the vector of all $y_i$ with $a \le i \le b$.
55 | This allows us to say:
56 |
57 | $$ \PP{\yy_{1..t}} = \PP{y_t, y_{t-1}, \dots, y_2, y_1} $$
58 |
59 | ### conditional probability
60 |
61 | $$
62 | \PP{A | B, C} = \frac{\PP{A, B, C}}{\PP{B,C}}
63 | $$
64 | Next, note that
65 | $$\PP{B, C} = \int \PP{A,B,C} \dd A
66 | $$
67 | and therefore
68 | $$
69 | \PP{A | B, C} = \frac{\PP{A, B, C}}{\int \PP{A,B,C} \dd A}
70 | $$
71 |
72 | ### part(a)
73 |
74 | The question discusses "all other integer time points".
75 | We define $n$ as an arbitrarily large integer much greater than $t+1$.
76 | We can't easily reason about "all" time points, as there are infinitely many;
77 | but we can consider large $n$ as $n \rightarrow \infty$.
78 |
79 | We begin with the joint density over "all" time points:
80 | $$
81 | \PP{\yy_{1..n}} = \PP{y_n, y_{n-1}, y_{n-2}, \dots, y_{t+1}, y_t, y_{t-1}, \dots, y_1}
82 | $$
83 |
84 | Using the formula above for conditional probability:
85 | $$
86 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{\yy_{1..n}}}{\int \dd y_t ~ \PP{\yy_{1..n}}}
87 | $$
88 |
89 | We can use the gaussian density in the question to expand:
90 | $$
91 | \PP{\yy_{1..n}} = \PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}
92 | $$
93 | Substituting this into
94 | $$
95 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{\yy_{1..n}}}{\int \dd y_t ~ \PP{\yy_{1..n}}}
96 | $$
97 | we get
98 | $$
99 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}{\int \dd y_t ~ \PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}
100 | $$
101 | Factors that don't use $y_t$ can be brought outside the integral:
102 | $$
103 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}{\PP{y_1} ~ \left( \prod_{i=2}^{t-1} \NN{y_i|0.8 y_{i-1}, 1}\right) \left(\prod_{i=t+2}^n \NN{y_i|0.8 y_{i-1}, 1}\right) \left(\int \dd y_t ~ \NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}\right)}
104 | $$
105 | The factors that were just moved out can be cancelled against the same factors in the numerator:
106 | $$
107 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}}{\int \dd y_t ~ \NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}}
108 | $$
109 |
110 | That final expression includes terms for $y_{t+1}$, $y_t$ and $y_{t-1}$, but nothing else.
111 | Therefore part (a) is shown.
112 |
113 | ### part (b)
114 |
115 | For part b, I think we can model $(y_{t+1}, y_t)$ as a bivariate normal,
116 | parameterized in terms of $y_{t-1}$, and use equation (3.14).
117 | TODO: part b!
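
As a numerical sanity check of that plan (a sketch only, not the derivation; the AR coefficient $0.8$ and unit innovation standard deviation come from the question, everything else below is illustrative), we can simulate a long series and regress $y_t$ on its two neighbours; consistent with part (a), only the neighbours matter, and by symmetry the two coefficients should match:

```{r}
set.seed(1)
n_sim <- 1e5
y <- as.numeric(arima.sim(model = list(ar = 0.8), n = n_sim))

d <- tibble(
  y_prev = y[1:(n_sim - 2)],
  y_t    = y[2:(n_sim - 1)],
  y_next = y[3:n_sim]
)

fit <- lm(y_t ~ y_prev + y_next, data = d)
coef(fit)   # the two slope coefficients should be (nearly) equal
sigma(fit)  # residual standard deviation of y_t given its two neighbours
```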
118 |
--------------------------------------------------------------------------------
/meetings/2019-03-28.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * We read sections 1.3 and 1.4, and solved the first exercise of the chapter.
4 |
5 | * These topics in particular came up:
6 |
7 | 1. Why does the last equality of equation 1.4 (page 7) hold? In other words, why is $\tilde y$ independent of $y$ conditional on $\theta$?
8 |
9 | 2. Suppose you have two normal random variables $\theta_i \sim \normal(\mu_i, \sigma_i)$, $i = 1, 2$, with corresponding probability density functions $p_i$. Let $c \in (0, 1)$. What's the difference between the probability density function of $\theta := c \theta_1 + (1 - c) \theta_2$ and the probability density function $p := cp_1 + (1 - c)p_2$?
10 |
11 | 3. What is marginalisation?
12 |
13 | 4. What's the difference between a probability and a probability density?
14 |
15 | * We had some volunteers to prepare exercises for the next session on 11th April:
16 |
17 | * Exercise 2: Sören
18 | * Exercise 3: Corrie
19 | * Exercise 4: Salma
20 | * Exercise 5: Amin
21 | * Exercise 6: Tiago
22 | * Exercise 7: Prasana
23 | * Exercise 8: Janders
24 | * Exercise 9: Ayan
25 |
--------------------------------------------------------------------------------
/meetings/2019-04-11.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Exercise 2: Sören
6 | * Exercise 3: Corrie
7 | * Exercise 4: Salma
8 | * Exercise 5: Amin
9 | * Exercise 6: Tiago
10 | * Exercise 7: Prasana
11 | * Exercise 8: Jan
12 | * Exercise 9: Vasa
13 |
14 | * Exercise 9 caused the most discussion, especially with ideas on how to efficiently implement the simulation.
15 |
16 | * The next session is planned for the 2nd of May.
17 |
18 | * We aim to have read up to and including section 2.6 of chapter 2 before the next session. During the next session we will summarise and discuss the material.
19 |
20 | * The following volunteered to attempt some exercises:
21 |
22 | * Exercise 1: Jan
23 | * Exercise 2: Amin
24 | * Exercise 3: Camile
25 | * Exercise 5: Sven
26 | * Exercise 6: Vasa
27 | * Exercise 7: Sören
28 | * Exercise 8: Brian
29 | * Exercise 9: Santiago
30 | * Exercise 10: Corrie
31 | * Exercise 11: Konrad
32 | * Exercise 12: Aaron
33 | * Exercise 13: Jan
34 |
--------------------------------------------------------------------------------
/meetings/2019-05-02.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Exercise 1: Sven
6 | * Exercise 2: Amin
7 | * Exercise 5: Sven
8 | * Exercise 6: Vasa
9 | * Exercise 7: Sören
10 |
11 | * The following volunteered for next time:
12 |
13 | * Exercise 15: Aaron
14 | * Exercise 19: Salma
15 | * Exercise 20: Brian
16 | * Exercise 21: Corrie
17 |
18 | * It was suggested that somebody present the details of one of the examples and Vasa volunteered to look into the cancer rate example for next time.
19 |
20 | * Some exercises have boring details - it's up to the person presenting the solution what counts as boring.
21 |
22 | * We agreed that biweekly is a good rhythm: weekly is too much to prepare for and less than biweekly leads to forgetting.
23 |
24 | * We agreed to set up a slack poll to decide the day of the week to avoid selection bias of those present.
25 |
--------------------------------------------------------------------------------
/meetings/2019-05-16.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Chapter 2 Exercise 8: Brian
6 | * Chapter 2 Exercise 9: Brian
7 | * Chapter 2 Exercise 10: Corrie
8 | * Chapter 2 Exercise 11: Konrad
9 | * Chapter 2 Exercise 12: Aaron
10 |
11 | * Vasa presented the cancer rate example. The question was raised whether it's better to base your prior directly on the observed rates or only via the prior predictive distribution. Konrad pointed out that you can plot all the observed rates in one plot (as in the book) but the observed counts are conditional on the population size.
12 |
13 | * We decided to attempt to read all of chapter 3 for next time since it's very similar to chapter 2 (just with multiple variables). Next date: 29th May.
14 |
15 | * The following were volunteered for next time:
16 |
17 |     * Chapter 3 bioassay example: Corrie
18 |     * Chapter 3 Exercise 5: Vasa
19 |     * Chapter 3 Exercise 6: Brian
20 |     * Chapter 3 Exercise 11: Vasa
21 |     * Chapter 3 Exercise 12?: Konrad (take your pick)
22 |
23 | * There are also some leftover exercises from chapter 2:
24 |
25 | * Chapter 2 Exercise 15: Aaron
26 | * Chapter 2 Exercise 19: Salma
27 | * Chapter 2 Exercise 20: Brian
28 | * Chapter 2 Exercise 21: Corrie
29 |
30 | * Can somebody link the exercises to the sections so that we can cover everything more efficiently?
31 |
32 | * There was some discussion whether we want to attempt to solve all the exercises. Maybe we assign each person a random exercise? Each person picks a random exercise? We continue picking the exercises that look interesting?
33 |
34 | * Let's give the repo some TLC, i.e. a github project page, so that we can view the html more conveniently.
35 |
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
1 | # Berlin Bayesians Reading Group
2 |
3 | We meet every ~2 weeks to discuss material from Bayesian Data Analysis. See [meetings](meetings) for a summary of the material covered, and the chapter folders for solutions.
4 |
5 | ## Contributing
6 |
7 | To add your solutions, you can submit a pull-request. If you are unfamiliar with pull-requests, see [Hadley Wickham's guide](http://r-pkgs.had.co.nz/git.html#git-pullreq), or get in touch via [slack](https://join.slack.com/t/berlinbayesians/shared_invite/enQtNTMyODA2MjE0MDY0LTk2OTgwNmJlMWIzYzMzMzEyNTlkYmY0MDFlZmUwYTllOWYyMTlkNDU3YzFiNDkzNjQwMmMxYjg1YzQxYTcyNjM).
8 |
--------------------------------------------------------------------------------