24 |
25 | ```{r setup, include=FALSE}
26 | knitr::opts_chunk$set(
27 | # cache = TRUE,
28 | # dev = "svglite",
29 | echo = TRUE,
30 | comment = NA,
31 | message = FALSE,
32 | warning = TRUE,
33 | error = TRUE
34 | )
35 |
36 | library(tidyverse)
37 | library(scales)
38 | library(kableExtra)
39 | library(here)
40 |
41 | theme_set(theme_bw())
42 | ```
43 |
44 | ## Part A
45 |
46 | The marginal density of $y$ (a.k.a. the prior predictive distribution) is
47 |
48 | $$
49 | \begin{align}
50 | p(y)
51 | &=
52 | p(y \mid \theta = 1) \cdot p(\theta = 1)
53 | +
54 | p(y \mid \theta = 2) \cdot p(\theta = 2)
55 | \\
56 | &=
57 | \normal(y \mid 1, 2) \cdot 0.5
58 | +
59 | \normal(y \mid 2, 2) \cdot 0.5
60 | \end{align}
61 | $$
62 |
63 | The graph of the prior predictive distribution is
64 |
65 | ```{r}
66 | tibble(y = seq(-5, 8, 0.05)) %>%
67 | mutate(
68 | density = 0.5 * (dnorm(y, 1, 2) + dnorm(y, 2, 2))
69 | ) %>%
70 | ggplot() +
71 | aes(y, density) +
72 | geom_line()
73 | ```
74 |
75 |
76 | This looks fairly normal because the standard deviation (2) is large relative to the distance between the two means (1). However, a weighted average (mixture) of normal probability density functions is not the same as the probability density function of a weighted average of normal random variables. In other words, mixing two normal densities does not generally give a normal density, even though the sum of two independent normal random variables is again normally distributed. This becomes clearer if we plot the same prior predictive distribution but with a very small standard deviation.
77 |
78 |
79 | ```{r}
80 | tibble(y = seq(0, 3, 0.01)) %>%
81 | mutate(
82 | density = 0.5 * (dnorm(y, 1, 0.1) + dnorm(y, 2, 0.1))
83 | ) %>%
84 | ggplot() +
85 | aes(y, density) +
86 | geom_line()
87 | ```
88 |
89 | Clearly, this distribution is not normal. In particular, it is multi-modal.
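
To make the contrast concrete, here is a small simulation sketch (independent draws, reusing the small standard deviation of 0.1 from above): the 50/50 mixture of $N(1, 0.1)$ and $N(2, 0.1)$ is bimodal, while the average of one draw from each component is again normally distributed.

```{r}
n <- 1e5
tibble(
  mixture = if_else(rbinom(n, 1, 0.5) == 1, rnorm(n, 1, 0.1), rnorm(n, 2, 0.1)),
  average = 0.5 * rnorm(n, 1, 0.1) + 0.5 * rnorm(n, 2, 0.1)
) %>%
  pivot_longer(everything()) %>%
  ggplot() +
  aes(value) +
  geom_histogram(bins = 100) +
  facet_wrap(~name, ncol = 1)
```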
90 |
91 | ## Part B
92 |
93 | $$
94 | \begin{align}
95 | \mathbb P(\theta \mid y = 1)
96 | &=
97 | \frac{
98 | \mathbb P(y = 1 \mid \theta) \cdot \mathbb P (\theta)
99 | }{
100 | \mathbb P(y)
101 | }
102 | \\
103 | &=
104 | \frac{
105 | \normal(1 \mid \theta, 2) \cdot 0.5
106 | }{
107 | \normal(1 \mid 1, 2) \cdot 0.5
108 | +
109 | \normal(1 \mid 2, 2) \cdot 0.5
110 | }
111 | \\
112 | &=
113 | \frac{
114 | \normal(1 \mid \theta, 2)
115 | }{
116 | \normal(1 \mid 1, 2)
117 | +
118 | \normal(1 \mid 2, 2)
119 | }
120 | \end{align}
121 | $$
122 |
123 | which is approximately 0.531 when evaluated at $\theta = 1$.
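
As a quick numerical check (using base R's `dnorm` with the same means and standard deviation as above):

```{r}
dnorm(1, 1, 2) / (dnorm(1, 1, 2) + dnorm(1, 2, 2))
```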
124 |
125 | ## Part C
126 |
127 | As $\sigma \rightarrow 0$, the posterior distribution of $\theta$ converges to a point mass at $\theta = 1$. This is because $\normal(1 \mid 2, \sigma)$ becomes negligible relative to $\normal(1 \mid 1, \sigma)$ for sufficiently small $\sigma$.
128 |
129 | As $\sigma \rightarrow \infty$, the posterior probability of $\theta = 1$ converges to 0.5. This is because $\normal(1 \mid 1, \sigma) \approx \normal(1 \mid 2, \sigma)$ for $\sigma \gg 0$.
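
A small sketch of the posterior probability of $\theta = 1$ as a function of $\sigma$ illustrates both limits:

```{r}
posterior_theta1 <- function(sigma) {
  dnorm(1, 1, sigma) / (dnorm(1, 1, sigma) + dnorm(1, 2, sigma))
}

tibble(sigma = 10^seq(-2, 3, 0.1)) %>%
  mutate(posterior = posterior_theta1(sigma)) %>%
  ggplot() +
  aes(sigma, posterior) +
  geom_line() +
  scale_x_log10()
```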
130 |
131 |
--------------------------------------------------------------------------------
/chapter01/exercise02.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 2"
3 | author: "Sören Berg"
4 | date: "11 April 2019"
5 | output: html_document
6 | ---
7 |
8 | # Exercise 2
9 |
10 | ## Part I - equation (1.8) for vectors
11 |
12 | We show that $E(u) = E(E(u|v))$.
13 |
14 | First, note that if $x\in\R^d$ is a vector, the vector-valued integral with $x$ as integrand is defined by $\int x := (\int x_1, \dots, \int x_d)^t$. Therefore, it is sufficient to show
15 |
16 | $$
17 | E(u_i) = \int \int u_i p(u, v) \ du \ dv = \int \int u_i p(u|v) \ du \ p(v) \ dv = \int E(u_i|v) p(v) \ dv = E(E(u_i|v))
18 | $$
19 |
20 | for all $i=1,\dots, d$. This implies that $E(u) = E(E(u|v))$.
21 |
22 |
23 | ## Part II - equation (1.9) for vectors
24 |
25 | We show $Cov(u) = E(Cov(u|v)) + Cov(E(u|v))$.
26 |
27 | For the covariance matrix of a random variable $X=(x_1,\dots, x_d)^t \in\R^d$ with $\mu = E(X)\in\R^d$ it holds
28 |
29 | $$
30 | \begin{aligned}
31 | Cov(X) &= E( (X-\mu)(X-\mu)^t ) \\
32 | &= E\left( \begin{pmatrix} (x_1- \mu_1)^2 & \dots & (x_1-\mu_1)(x_d-\mu_d) \\ \vdots & & \vdots \\ (x_d-\mu_d)(x_1-\mu_1) & \dots & (x_d- \mu_d)^2 \end{pmatrix} \right) \\
33 | &= \begin{pmatrix} Var(x_1) & \dots & Cov(x_1, x_d) \\ \vdots & & \vdots \\ Cov(x_d, x_1) & \dots & Var(x_d) \end{pmatrix}.
34 | \end{aligned}
35 | $$
36 |
37 | Using the identity above and basic rules for expectations, we conclude
38 |
39 | $$
40 | \begin{aligned}
41 | \left(E(Cov(u|v)) + Cov(E(u|v))\right)_{ij} &= E(Cov(u|v)_{ij}) + Cov(E(u|v))_{ij} \\
42 | &= E( Cov(u_i, u_j|v)) + Cov(E(u_i|v), E(u_j|v)) \\
43 | &= E( E(u_iu_j|v) - E(u_i|v)E(u_j|v) ) + E( E(u_i|v)E(u_j|v)) - E( E(u_i|v)) E(E(u_j|v)) \\
44 | &=E( E(u_iu_j|v)) - E( E(u_i|v)) E(E(u_j|v)) \\
45 | &= E(u_iu_j) - E(u_i)E(u_j) \\
46 | &= Cov(u_i, u_j),
47 | \end{aligned}
48 | $$
49 |
50 | for all $i,j \in \{1,\dots,d\}$.
51 |
52 | Therefore, $Cov(u) = E(Cov(u|v)) + Cov(E(u|v))$.
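
As an informal numerical check (a simulation sketch only, not part of the proof; the construction below, with $v$ a fair coin flip and $u = (u_1, u_2)$ a dependent pair whose mean shifts with $v$, is just an illustrative assumption), both sides of the identity can be compared on simulated data:

```{r}
set.seed(1)
n <- 1e5

# v is a coin flip; given v, u = (u1, u2) is a correlated pair whose mean depends on v
v  <- rbinom(n, 1, 0.5)
u1 <- rnorm(n, mean = v, sd = 1)
u2 <- u1 + rnorm(n, mean = 2 * v, sd = 1)
u  <- cbind(u1, u2)

# left-hand side: Cov(u)
cov(u)

# right-hand side: E(Cov(u|v)) + Cov(E(u|v)), where v takes only the values 0 and 1
p1 <- mean(v)
m0 <- colMeans(u[v == 0, ])
m1 <- colMeans(u[v == 1, ])
m  <- (1 - p1) * m0 + p1 * m1
cov_within  <- (1 - p1) * cov(u[v == 0, ]) + p1 * cov(u[v == 1, ])
cov_between <- (1 - p1) * outer(m0 - m, m0 - m) + p1 * outer(m1 - m, m1 - m)
cov_within + cov_between
```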
53 |
--------------------------------------------------------------------------------
/chapter01/exercise03.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 3"
3 | author: "Corrie Bartelheimer"
4 | date: "April 10, 2019"
5 | output:
6 | html_document:
7 | toc: true
8 | ---
9 |
10 | ```{r setup, include=FALSE}
11 | knitr::opts_chunk$set(echo = TRUE)
12 | ```
13 |
14 | # Probability calculation for genetics
15 | Suppose that in each individual of a large population there is a pair of genes, each of which can be either $x$ or $X$, that controls eye color: those with $xx$ have blue eyes, while heterozygotes ($xX$ or $Xx$) and those with $XX$ have brown eyes. The proportion of blue-eyed individuals is $p^2$ and the proportion of heterozygotes is $2p(1-p)$. The proportion of brown-eyed homozygotes is then $(1-p)^2$. Each parent transmits one of its own genes to the child, where the probability of transmitting either gene is $\frac{1}{2}$.
16 |
17 | ## Part 1
18 | Assuming random mating, show that among brown-eyed children of brown-eyed parents, the expected proportion of heterozygotes is $2p/(1+2p)$.
19 |
20 | We write short-hand $Xx$ to mean both heterozygote combinations. We also denote $B$ as brown-eyed.
21 |
22 | We first consider the different gene combinations that a pair of brown-eyed parents can have. There are three gene combinations that make a person brown-eyed, so there are $3 \times 3 = 9$ different combinations for a pair of brown-eyed parents.
23 | Of these 9 combinations, there are
24 |
25 | - 1 combination with both parents homozygote (both are $XX$).
26 | The probability of this happening is $(1-p)^4$.
27 | - 4 combinations with one parent heterozygote and the other homozygote (e.g. $Xx$ and $XX$).
28 | The probability of this combination is $4p(1-p)^3$.
29 | - 4 combinations with both parents heterozygote (e.g. both $Xx$).
30 | The probability is $4p^2(1-p)^2$.
31 |
32 | For each parent combination, there are 4 ways to transmit the genes to a child, each of which is equally likely.
33 |
34 | - If both parents are brown-eyed homozygotes, then all 4 possible transmitted combinations are also homozygote, so $0$ of $4$ are heterozygote.
35 | - If one parent is heterozygote and the other homozygote, we can get the following combinations: $Xx$, $xX$, $XX$, $XX$.
36 | That is, $2$ of $4$ are heterozygote.
37 | - If both parents are heterozygotes, we can get the following combinations: $Xx$, $xX$, $XX$ and $xx$.
38 | Since we already know that both parents and children are brown-eyed, we can omit $xx$ as a possible combination. Thus $2$ of $3$ combinations are heterozygote.
39 |
40 | We can now combine all this information to obtain the desired probability as follows:
41 |
42 |
43 | $$\begin{align*}
44 | Pr(\text{Judy} = Xx | \text{ Judy & parents}= B) &= \frac{0 \times (1-p)^4 + 2 \times 4p(1-p)^3 + 2\times 4p^2(1-p)^2}{4 \times (1-p)^4 + 4 \times 4p(1-p)^3 + 3\times 4p^2(1-p)^2} \\
45 | \\
46 | &= \frac{ (1-p)^2 \left[ 8p(1-p) +8p^2 \right] }{4(1-p)^2 \left[ (1-p)^2 + 4p(1-p) + 3\times p^2 \right]} \\
47 | \\
48 | &= \frac{ 8p(1-p)^2 \left[ 1-p +p \right] }{4(1-p)^2 \left[ 1-2p + p^2 + 4p-4p^2 + 3 p^2 \right]} \\
49 | \\
50 | &= \frac{ 2p }{ 1+2p }
51 | \end{align*}$$
52 |
53 | ## Part 2
54 | Suppose Judy, a brown-eyed child of brown-eyed parents, marries a heterozygote, and they have $n$ children, all brown-eyed. Find the posterior probability that Judy is a heterozygote.
55 |
56 | $$\begin{align*}
57 | Pr(\text{Judy}= Xx | n \text{ children} = B ) &= \frac{ Pr(\text{Judy}= Xx) Pr( n \text{ children} = B | \text{Judy}= Xx) }{Pr( n \text{ children} = B )}
58 | \end{align*}$$
59 |
60 | We calculated above the prior probability that Judy is heterozygote as
61 | $$Pr(\text{Judy}= Xx) = \frac{2p}{1+2p}$$
62 |
63 | To compute the probability of $n$ children, given that Judy is heterozygote, remember from above that when both parents are heterozygote, we have 3 of 4 combinations that result in brown-eyed children. Thus:
64 | $$Pr( n \text{ children} = B |\text{Judy}= Xx ) = \left( \frac{3}{4} \right)^n $$
65 |
66 | Last, we need the probability of the denominator:
67 | $$Pr(n \text{ children} = B ) = Pr(\text{Judy}= Xx) Pr( n \text{ children} = B |\text{Judy}= Xx ) + Pr( \text{Judy}=XX) Pr( n \text{ children} = B |\text{Judy}= XX )$$
68 |
69 | Obviously, $Pr( n \text{ children} = B |\text{Judy}= XX ) = 1$. Also,
70 | $$\begin{align*}
71 | Pr( \text{Judy}=XX) &= 1 - Pr( \text{Judy}=Xx) \\
72 | &= 1 - \frac{2p}{1+2p} \\
73 | &= \frac{1 +2p - 2p}{1+2p} \\
74 | &= \frac{1}{1+2p}
75 | \end{align*}$$
76 |
77 | Plugging all this together, we get:
78 | $$
79 | \begin{align*}
80 | Pr(\text{Judy}= Xx | n \text{ children} = B ) &= \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}}
81 | \end{align*}$$
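
For intuition, here is a small numerical sketch of this posterior as a function of $n$; the allele frequency $p = 0.5$ below is purely illustrative and not given in the exercise:

```{r}
p <- 0.5  # illustrative allele frequency, not given in the exercise
pr_judy_heterozygote <- function(n) {
  ((3/4)^n * 2*p/(1+2*p)) / ((3/4)^n * 2*p/(1+2*p) + 1/(1+2*p))
}

plot(0:10, pr_judy_heterozygote(0:10), type = "b",
     xlab = "n (number of brown-eyed children)",
     ylab = "Pr(Judy heterozygote | data)")
```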
82 |
83 | ## Part 3
84 | Find the probability (given all the information above) that Judy's first grandchild has blue eyes.
85 |
86 | The grandchild can only have blue eyes if Judy's child is heterozygote and the child's random mate is either heterozygote or blue-eyed. For two heterozygote parents, we know the probability of having a blue-eyed kid is $1$ of $4$. If one parent is blue-eyed and the other heterozygote, then $2$ of $4$ combinations result in a blue-eyed kid. So we get:
87 |
88 | $$\begin{align*}
89 | Pr(\text{ Judys grand-kid} = xx) &= Pr(\text{Judys kid} = Xx) \left[ \frac{1}{4} Pr(\text{random mate} = Xx) + \frac{1}{2} Pr(\text{random mate} = xx) \right] \\
90 | &= Pr(\text{Judys kid} = Xx) \left[ \frac{1}{4} 2p(1-p) + \frac{1}{2} p^2 \right] \\
91 | &= Pr(\text{Judys kid} = Xx) \left( \frac{1}{2} p \right)
92 | \end{align*}$$
93 |
94 | To get the probability that Judy's kid is heterozygote, we need to consider the two cases of Judy being heterozygote or homozygote.
95 | We already know the posterior probability that Judy is heterozygote and the probability of a heterozygote kid when both parents are heterozygote. If Judy is homozygote, then the probability of a heterozygote kid is $\frac{1}{2}$.
96 | To get the probability that Judy is homozygote, we only need to change the numerator in the posterior probability:
97 | $$
98 | \begin{align*}
99 | Pr(\text{Judy}= XX | n \text{ children} = B ) &= \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}}
100 | \end{align*}$$
101 |
102 | Thus:
103 | $$
104 | \begin{align*}
105 | Pr(\text{Judys kid} = Xx ) &= \frac{1}{2}Pr(\text{Judy}= XX ) + \frac{2}{3} Pr(\text{Judy}= Xx) \\
106 | \\
107 | &= \frac{1}{2} \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} + \frac{2}{3} \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} \\
108 | \end{align*}$$
109 |
110 |
111 | And therefore we get the following probability for a blue-eyed grand-kid:
112 | $$\begin{align*}
113 | Pr(\text{ Judys grand-kid} = xx) &= Pr(\text{Judys kid} = Xx) \left( \frac{1}{2} p \right) \\
114 | \\
115 | &= \left[ \frac{1}{2} \frac{ \frac{1}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} + \frac{2}{3} \frac{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} }{ \left( \frac{3}{4} \right)^n \frac{2p}{1+2p} + \frac{1}{1+2p}} \right] \left( \frac{1}{2} p \right)
116 | \end{align*}$$
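
Plugging in illustrative values (the $p$ and $n$ below are assumptions made only for this sketch, not part of the exercise), the expression can be evaluated directly:

```{r}
p <- 0.5  # illustrative allele frequency (assumption)
n <- 2    # illustrative number of brown-eyed children (assumption)

pr_judy_Xx <- ((3/4)^n * 2*p/(1+2*p)) / ((3/4)^n * 2*p/(1+2*p) + 1/(1+2*p))
pr_judy_XX <- 1 - pr_judy_Xx
pr_kid_Xx  <- 1/2 * pr_judy_XX + 2/3 * pr_judy_Xx
pr_grandkid_xx <- pr_kid_Xx * (1/2 * p)
pr_grandkid_xx
```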
117 |
--------------------------------------------------------------------------------
/chapter01/exercise08.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Chapter 01 -- Exercise08"
3 | author: "Jan Kraemer"
4 | date: "11 April 2019"
5 | output: html_document
6 | ---
7 |
8 | ```{r setup, include=FALSE}
9 | knitr::opts_chunk$set(echo = TRUE)
10 | ```
11 |
12 | ## The Question
13 |
14 | Subjective probability: discuss the following statement. "The probability of event $E$ is considered 'subjective' if two rational persons $A$ and $B$ can assign unequal probabilities to $E$, $P_A(E)$ and $P_B(E)$. These probabilities can also be interpreted as 'conditional': $P_A(E) = P(E|I_A)$ and $P_B(E) = P(E|I_B)$, where $I_A$ and $I_B$ represent the knowledge available to persons $A$ and $B$ respectively."
15 |
16 | Apply this idea to the following examples:
17 |
18 | a) the probability that a '6' appears when a fair die is rolled, where A observes the outcome of the die roll and B does not.
19 | b) The probability that Brazil wins the next World Cup, where A is ignorant of soccer and B is a knowledgeable sports fan.
20 |
21 | ## An Attempt towards the Answer
22 |
23 | ### Thoughts regarding subjective probability
24 |
25 | The hard part is how to formulate the individual knowledge mathematically.
26 |
27 |
28 | ### Part a)
29 |
30 | #### The Intuitive Approach
31 |
32 | I would posit that for $E=6$: $P(E|I_A)=1$, since she knows the outcome, while $P_B(E) = 1/6$, since B is clueless and just _hoping_ it is a fair die.
33 |
34 | #### Mathematical formalism
35 |
36 | \[
37 | P(E)=\frac{1}{6} \\
38 | I_A: E\in \{6\}; I_B: E \in \{1\dots6\} \\
39 | \\
40 | P_A(E=6)=P(E=6|I_A)=P(E=6|E\in\{6\})=\frac{P(E=6 \cap E\in\{6\})}{P(E\in\{6\})}=\frac{P(E=6)}{P(E\in\{6\})}=1\\
41 | P_B(E=6)=P(E=6|I_B)=P(E=6|E \in \{1\dots6\})=\frac{P(E=6 \cap E \in \{1\dots6\})}{P(E \in \{1\dots6\})} = \frac{P(E=6)}{P(E \in \{1\dots6\})} = \frac{1/6}{1}=\frac{1}{6}
42 | \]
43 |
44 |
45 | ### Part b)
46 |
47 | #### The Intuitive Approach
48 |
49 | This is a hard one since I am rather well modeled by person $A$. So I went to Wikipedia, saw that Brazil has won 5 titles so far, and posit: $P(\text{'Brazil wins next World Cup'}|\text{'wikipedia'})=P(\text{'Brazil won any one World Cup in the past'})=\frac{5}{21}$
50 |
51 | As for $B$: I have no idea how to begin to model that, even intuitively...
52 |
53 | #### Mathematical formalism
54 |
55 | I don't even...
56 |
--------------------------------------------------------------------------------
/chapter01/exercise09.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 9"
3 | author: "Corrie"
4 | output: html_document
5 | ---
6 |
7 | ```{r setup, include=FALSE}
8 | knitr::opts_chunk$set(echo = TRUE, comment=NA)
9 | ```
10 |
11 | ```{r}
12 | library(tidyverse)
13 | ```
14 |
15 |
16 | A clinic has three doctors. Patients come into the clinic at random, starting at 9am, according to a Poisson process with time parameter 10 minutes; that is, the time after opening at which the first patient appears follows an exponential distribution with expectation 10 minutes and then, after each patient arrives, the waiting time until the next patient arrives is independently exponentially distributed, also with expectation 10 minutes. When a patient arrives, s/he waits until a doctor is available. The amount of time spent by each doctor with each patient is a random variable, uniformly distributed between 5 and 20 minutes. The office stops admitting new patients at 4pm and closes when the last patient is through with the doctor.
17 |
18 |
19 | We observe 7 hours (from 9am to 4pm) of patients arriving. That is `r 7*60` minutes. We thus sample patient inter-arrival times (the waiting times between the arrivals of consecutive patients) from $\text{Exp}(\text{rate}=1/10)$ until the cumulative arrival time exceeds 420 minutes. Or, in a vectorized way, we sample a generous number of inter-arrival times (say $n=100$) and discard all arrivals whose cumulative time is above 420.
20 | ```{r}
21 | patients <- rexp(n=100, rate=1/10)
22 | arrival_times <- cumsum(patients)
23 | arrival_times <- arrival_times[arrival_times <= 420 ]
24 |
25 |
26 | ( num_patients <- length(arrival_times) )
27 | ```
28 |
29 | Since each patient will spend between 5 and 20 minutes with the doctor (uniformly distributed), independent of the doctor, we can simulate the time spent with the doctor already at arrival time (how convenient).
30 | ```{r}
31 | consultation_time <- runif(num_patients, min=5, max=20)
32 | ```
33 |
34 | Since we already know in advance how long a doctor will spend with each patient, we also know in advance which doctor will be available next. So every time a new patient arrives, we check which doctor will be free soonest; the patient enters the room at the later of their arrival time and that doctor's free time, and the consultation time is then added to get the time at which that doctor is free again.
35 | ```{r}
36 | simulate <- function(num_doctors=3, mean_wait_time=10,
37 | opening_time=9, closing_time=16,
38 | min_consul_time=5,
39 | max_consul_time=20){
40 |
41 | hours <- closing_time - opening_time
42 | minutes <- hours * 60
43 | patients <- rexp(n=200, rate=1/mean_wait_time)
44 | arrival_times <- cumsum(patients)
45 | arrival_times <- arrival_times[arrival_times <= minutes ]
46 |
47 |
48 | num_patients <- length(arrival_times)
49 |
50 | consultation_time <- runif(num_patients, min=min_consul_time, max=max_consul_time)
51 |
52 | # initiate doctors
53 | doctors <- rep(0, num_doctors)
54 | # waiting time
55 | waiting_times <- c()
56 | num_waiters <- 0
57 | for (i in 1:num_patients) {
58 | # which doctor will be next available?
59 | next_free_doctor <- which.min(doctors)
60 | # what time will the doctor be available?
61 | next_free_doctor_time <- doctors[next_free_doctor]
62 | # does the patient have to wait?
63 | if (next_free_doctor_time > arrival_times[i]) {
64 | enter_room <- next_free_doctor_time
65 | waiting_times[i] <- enter_room - arrival_times[i]
66 | num_waiters <- num_waiters + 1
67 | } else {
68 | enter_room <- arrival_times[i]
69 | waiting_times[i] <- 0
70 | }
71 | leave_room <- enter_room + consultation_time[i]
72 | doctors[next_free_doctor] <- leave_room
73 | }
74 | #num_waiters <- sum( waiting_times > 0 )
75 | office_close <- max(doctors)
76 | avg_wait_all <- mean(waiting_times)
77 | avg_wait <- mean( waiting_times[waiting_times > 0 ])
78 | rel_closing <- office_close - 420
79 | res <- c(num_patients, num_waiters, office_close, rel_closing, avg_wait_all, avg_wait)
80 | names(res) <- c("num_patients", "num_waiters", "office_close", "rel_closing", "avg_wait_all", "avg_wait")
81 | as.data.frame(t(res))
82 | }
83 |
84 | simulate()
85 | ```
86 |
87 | We can now easily simulate the process multiple times:
88 | ```{r}
89 | num_sims <- 10000
90 | sims <- map_dfr(1:num_sims, .f=function(x) {simulate(num_doctors = 3) } )
91 | ```
92 |
93 | From this, we can estimate the median and 50% interval for the number of patients in a day, how many patients have to wait, the closing time of the office and the average waiting time.
94 | ```{r}
95 | summary(sims)
96 | ```
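
To read off the medians and central 50% intervals directly, here is a small sketch using `quantile` on the same simulation results:

```{r}
sims %>%
  replace_na(list(avg_wait = 0)) %>%
  pivot_longer(everything()) %>%
  group_by(name) %>%
  summarise(
    q25    = quantile(value, 0.25),
    median = median(value),
    q75    = quantile(value, 0.75)
  )
```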
97 |
98 | ```{r, fig.height=8, fig.width=8, dpi=400}
99 | sims %>%
100 | select(-office_close) %>%
101 | replace_na(list(avg_wait = 0)) %>%
102 | pivot_longer(everything()) %>%
103 | ggplot(aes(x=value)) +
104 | geom_histogram(bins=20, fill="#377EB8", col="white") +
105 | facet_wrap(~name, scales="free") +
106 | theme_minimal()
107 |
108 | ```
109 |
110 |
--------------------------------------------------------------------------------
/chapter01/exercise09/Aaron - PoissonPatients.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 1,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# Aaron\n",
10 | "import numpy.random as nr\n",
11 | "import itertools\n",
12 | "from collections import namedtuple\n",
13 | "import pandas as pd"
14 | ]
15 | },
16 | {
17 | "cell_type": "code",
18 | "execution_count": 2,
19 | "metadata": {},
20 | "outputs": [],
21 | "source": [
22 | "start_time = 9.0 # 9am\n",
23 | "end_time = 16.0 # 4pm\n",
24 | "hourly_rate = 10/60 # every 10 minutes\n",
25 | "process_time_min_max = (5/60, 20/60) # uniform between 5 and 20 minutes"
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": 3,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "def arrival_times():\n",
35 | " current_time = start_time\n",
36 | " while True:\n",
37 | " current_time += nr.exponential(hourly_rate)\n",
38 | " if current_time > end_time:\n",
39 | " break\n",
40 | " else:\n",
41 | " yield current_time\n",
42 | " \n",
43 | "def process_times():\n",
44 | " while True: yield nr.uniform(*process_time_min_max)\n",
45 | " \n",
46 | "def patients():\n",
47 | " return zip(arrival_times(), process_times())\n",
48 | " "
49 | ]
50 | },
51 | {
52 | "cell_type": "code",
53 | "execution_count": 4,
54 | "metadata": {},
55 | "outputs": [
56 | {
57 | "name": "stdout",
58 | "output_type": "stream",
59 | "text": [
60 | "(9.22647697452822, 0.2655797646475507)\n",
61 | "(9.269022429514955, 0.18736198398352416)\n",
62 | "(9.57737637227136, 0.18725573329773443)\n",
63 | "(9.719719861923531, 0.2801266643451028)\n",
64 | "(9.782588453675242, 0.08807369500432473)\n",
65 | "(9.80840073728496, 0.3036979691681516)\n"
66 | ]
67 | }
68 | ],
69 | "source": [
70 | "for a in itertools.islice(patients(), 6):\n",
71 | " print(a)"
72 | ]
73 | },
74 | {
75 | "cell_type": "code",
76 | "execution_count": 5,
77 | "metadata": {},
78 | "outputs": [],
79 | "source": [
80 | "Patient_record = namedtuple('Patient_record', 'arrival wait process_time')\n",
81 | "def simulate():\n",
82 | " doctors = [start_time for _ in range(3)]\n",
83 | " for arrival_time, process_time in patients():\n",
84 | " # A patient has just arrived\n",
85 | " # Which doctor will see this patient?\n",
86 | " doctors = sorted(doctors)\n",
87 | " first_free_doctor = doctors.pop(0)\n",
88 | " # when will this patient be seen?\n",
89 | " enter_the_room = max(arrival_time, first_free_doctor)\n",
90 | " leave_the_room = enter_the_room + process_time\n",
91 | " # this doctor will be busy until then\n",
92 | " doctors.append(leave_the_room)\n",
93 | " assert len(doctors) == 3\n",
94 | " yield Patient_record(arrival_time, enter_the_room-arrival_time, process_time)"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": 6,
100 | "metadata": {},
101 | "outputs": [
102 | {
103 | "data": {
373 | "text/plain": [
374 | " arrival wait process_time\n",
375 | "0 9.057364 0.000000 0.314053\n",
376 | "1 9.133635 0.000000 0.144422\n",
377 | "2 9.245335 0.000000 0.282666\n",
378 | "3 9.615008 0.000000 0.153809\n",
379 | "4 9.835923 0.000000 0.206641\n",
380 | "5 9.995190 0.000000 0.318543\n",
381 | "6 10.202111 0.000000 0.106085\n",
382 | "7 10.374703 0.000000 0.210055\n",
383 | "8 10.428493 0.000000 0.324764\n",
384 | "9 10.582662 0.000000 0.283858\n",
385 | "10 10.701231 0.000000 0.311638\n",
386 | "11 10.728006 0.025251 0.228080\n",
387 | "12 10.731422 0.135098 0.278255\n",
388 | "13 11.096205 0.000000 0.140842\n",
389 | "14 11.100451 0.000000 0.263788\n",
390 | "15 11.456833 0.000000 0.103892\n",
391 | "16 11.845074 0.000000 0.272671\n",
392 | "17 12.356279 0.000000 0.143145\n",
393 | "18 12.644723 0.000000 0.122049\n",
394 | "19 12.791327 0.000000 0.282262\n",
395 | "20 12.983407 0.000000 0.146275\n",
396 | "21 13.171120 0.000000 0.092032\n",
397 | "22 13.454217 0.000000 0.285920\n",
398 | "23 13.753393 0.000000 0.192504\n",
399 | "24 13.987684 0.000000 0.199915\n",
400 | "25 14.093300 0.000000 0.210698\n",
401 | "26 14.173222 0.000000 0.205493\n",
402 | "27 14.302395 0.000000 0.306721\n",
403 | "28 14.683552 0.000000 0.220661\n",
404 | "29 14.701095 0.000000 0.279457\n",
405 | "30 14.735536 0.000000 0.105993\n",
406 | "31 14.881566 0.000000 0.186963\n",
407 | "32 14.907767 0.000000 0.199865\n",
408 | "33 15.101449 0.000000 0.145699\n",
409 | "34 15.379239 0.000000 0.131209\n",
410 | "35 15.447321 0.000000 0.102237\n",
411 | "36 15.619611 0.000000 0.110790\n",
412 | "37 15.654948 0.000000 0.222799\n",
413 | "38 15.728213 0.000000 0.155223\n",
414 | "39 15.914905 0.000000 0.257484"
415 | ]
416 | },
417 | "execution_count": 6,
418 | "metadata": {},
419 | "output_type": "execute_result"
420 | }
421 | ],
422 | "source": [
423 | "simulated_data = pd.DataFrame(list(simulate()))\n",
424 | "simulated_data"
425 | ]
426 | },
427 | {
428 | "cell_type": "code",
429 | "execution_count": 7,
430 | "metadata": {},
431 | "outputs": [],
432 | "source": [
433 | "Record_of_one_simulation = namedtuple('Record_of_one_simulation', 'n number_of_waiters average_wait_of_waiters closing_time')\n",
434 | "def one_simulation():\n",
435 | " simulated_data = pd.DataFrame(list(simulate()))\n",
436 | " # How many patients came to the office?\n",
437 | " n = simulated_data.shape[0]\n",
438 | " # How many had to wait for a doctor?\n",
439 | " waiters = simulated_data.query('wait > 0')\n",
440 | " number_of_waiters = waiters.shape[0]\n",
441 | " # What was their average wait?\n",
442 | " average_wait_of_waiters = waiters.wait.mean()\n",
443 | " # When did the office close?\n",
444 | " closing_time = float(simulated_data.iloc[[-1],:].eval('arrival+wait+process_time'))\n",
445 | " rec = Record_of_one_simulation(n, number_of_waiters, average_wait_of_waiters, closing_time)\n",
446 | " return rec"
447 | ]
448 | },
449 | {
450 | "cell_type": "code",
451 | "execution_count": 8,
452 | "metadata": {},
453 | "outputs": [
454 | {
455 | "name": "stdout",
456 | "output_type": "stream",
457 | "text": [
458 | " n number_of_waiters average_wait_of_waiters closing_time\n",
459 | "0 50 11 0.074844 16.068317\n",
460 | "1 45 8 0.077117 16.015736\n",
461 | "2 39 7 0.079502 15.944850\n",
462 | "3 45 2 0.083258 16.120998\n",
463 | "4 44 7 0.077304 16.067006\n",
464 | "5 42 5 0.040500 15.947436\n",
465 | "6 37 7 0.076368 16.000642\n",
466 | "7 36 5 0.114543 16.273268\n",
467 | "8 51 9 0.056909 16.060058\n",
468 | "9 40 7 0.104967 16.289932\n"
469 | ]
470 | }
471 | ],
472 | "source": [
473 | "number_of_simulations = 10\n",
474 | "sims = pd.DataFrame([one_simulation() for sim in range(number_of_simulations)])\n",
475 | "\n",
476 | "print(sims)"
477 | ]
478 | },
479 | {
480 | "cell_type": "code",
481 | "execution_count": null,
482 | "metadata": {},
483 | "outputs": [],
484 | "source": []
485 | }
486 | ],
487 | "metadata": {
488 | "kernelspec": {
489 | "display_name": "Python 3",
490 | "language": "python",
491 | "name": "python3"
492 | },
493 | "language_info": {
494 | "codemirror_mode": {
495 | "name": "ipython",
496 | "version": 3
497 | },
498 | "file_extension": ".py",
499 | "mimetype": "text/x-python",
500 | "name": "python",
501 | "nbconvert_exporter": "python",
502 | "pygments_lexer": "ipython3",
503 | "version": "3.6.7"
504 | }
505 | },
506 | "nbformat": 4,
507 | "nbformat_minor": 2
508 | }
509 |
--------------------------------------------------------------------------------
/chapter02/2008ElectionResult.csv:
--------------------------------------------------------------------------------
1 | state,vote_Obama,vote_Obama_pct,vote_McCain,vote_McCain_pct,electoral_vote_dem,electoral_vote_rep
2 | Alabama,811764,38.8,1264879,60.4,,9
3 | Alaska,105650,37.7,168844,60.2,,3
4 | Arizona,948648,45,1132560,53.8,,10
5 | Arkansas,418049,38.8,632672,58.8,,6
6 | California,7245731,60.9,4434146,37.3,55,
7 | Colorado,1216793,53.5,1020135,44.9,9,
8 | Connecticut,979316,60.5,620210,38.3,7,
9 | Delaware,255394,61.9,152356,37,3,
10 | District of Columbia,210403,92.9,14821,6.5,3,
11 | Florida,4143957,50.9,3939380,48.4,27,
12 | Georgia,1843452,47,2048244,52.2,,15
13 | Hawaii,324918,71.8,120309,26.6,4,
14 | Idaho,235219,36.1,400989,61.5,,4
15 | Illinois,3319237,61.8,1981158,36.9,21,
16 | Indiana,1367264,49.9,1341101,49,11,
17 | Iowa,818240,54,677508,44.7,7,
18 | Kansas,499979,41.4,685541,56.8,,6
19 | Kentucky,746510,41.1,1043264,57.5,,8
20 | Louisiana,780981,39.9,1147603,58.6,,9
21 | Maine,421484,57.6,296195,40.5,4,
22 | Maryland,1579890,61.9,938671,36.8,10,
23 | Massachusetts,1891083,62,1104284,36.2,12,
24 | Michigan,2867680,57.4,2044405,40.9,17,
25 | Minnesota,1573323,54.2,1275400,44,10,
26 | Mississippi,520864,42.8,687266,56.4,,6
27 | Missouri,1439364,49.3,1444352,49.4,,11
28 | Montana,229725,47.2,241816,49.7,,3
29 | Nebraska,324352,41.5,446039,57,1,4
30 | Nevada,531884,55.1,411988,42.7,5,
31 | New Hampshire,384591,54.3,316937,44.8,4,
32 | New Jersey,2085051,56.8,1545495,42.1,15,
33 | New Mexico,464458,56.7,343820,42,5,
34 | New York,4363386,62.2,2576360,36.7,31,
35 | North Carolina,2123390,49.9,2109698,49.5,15,
36 | North Dakota,141113,44.7,168523,53.3,,3
37 | Ohio,2708685,51.2,2501855,47.2,20,
38 | Oklahoma,502294,34.4,959745,65.6,,7
39 | Oregon,978605,57.1,699673,40.8,7,
40 | Pennsylvania,3192316,54.7,2586496,44.3,21,
41 | Rhode Island,281209,63.1,157317,35.3,4,
42 | South Carolina,850121,44.9,1018756,53.8,,8
43 | South Dakota,170886,44.7,203019,53.2,,3
44 | Tennessee,1081074,41.8,1470160,56.9,,11
45 | Texas,3521164,43.8,4467748,55.5,,34
46 | Utah,301771,34.2,555497,62.9,,5
47 | Vermont,219105,67.8,98791,30.6,3,
48 | Virginia,1958370,52.7,1726053,46.4,13,
49 | Washington,1547632,57.5,1097176,40.5,11,
50 | West Virginia,301438,42.6,394278,55.7,,5
51 | Wisconsin,1670474,56.3,1258181,42.4,10,
52 | Wyoming,80496,32.7,160639,65.2,,3
53 |
--------------------------------------------------------------------------------
/chapter02/exercise03.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 3"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | The number of 6's in 1000 rolls of a fair die is $\mathrm{Bin}\left(1000, \frac{1}{6}\right)$ distributed with mean $\frac{1000}{6}$ and standard deviation $\sqrt{1000 \frac{1}{6}\frac{5}{6}}$. We approximate this distribution with a normal distribution with same mean and standard deviation.
11 |
12 | ```{r}
13 | x <- seq(100,220,1)
14 | y <- dnorm(x, 1000/6, sqrt(1000*1/6*5/6))
15 | plot(x, y, type="l")
16 | ```
17 |
18 | ## (b)
19 |
20 | ```{r}
21 | qnorm(0.05, 1000/6, sqrt(1000*1/6*5/6))
22 | qnorm(0.25, 1000/6, sqrt(1000*1/6*5/6))
23 | qnorm(0.5, 1000/6, sqrt(1000*1/6*5/6))
24 | qnorm(0.75, 1000/6, sqrt(1000*1/6*5/6))
25 | qnorm(0.95, 1000/6, sqrt(1000*1/6*5/6))
26 | ```
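
As a sanity check (a quick sketch using base R's `qbinom`), the exact binomial quantiles are close to the normal approximation:

```{r}
qbinom(c(0.05, 0.25, 0.5, 0.75, 0.95), size = 1000, prob = 1/6)
```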
27 |
--------------------------------------------------------------------------------
/chapter02/exercise04.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 4"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | The distribution for $\theta$ is discrete (note that the three probabilities given in the exercise sum up to one).
11 | First, let $\theta=\frac{1}{12}$. The binomial distribution
12 | $\mathrm{Bin}(n,\theta)$ has mean $\frac{1000}{12}$ and standard deviation $\sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}$, and we approximate the conditional distribution $p(y|\theta)$ with $\mathrm{N}\left(\frac{1000}{12}, \sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}\right)$. The cases $\theta=1/6$ and $\theta=1/4$ can be handled analogously so that our approximation of the prior predictive distribution for $y$ is a sum (or mixture) of three normals. More precisely,
13 |
14 | $$
15 | p(y) = \sum_{\theta\in\{1/12,1/6,1/4\}}p(y,\theta) = \sum_{\theta\in\{1/12,1/6,1/4\}} p(\theta) p(y|\theta)
16 | $$
17 |
18 | is approximated with
19 |
20 | $$
21 | \mathrm{Pr}\left(\theta=\frac{1}{12}\right) \cdot \mathrm{N}\left(\frac{1000}{12}, \sqrt{1000\tfrac{1}{12}\tfrac{11}{12}}\right)
22 | + \mathrm{Pr}\left(\theta=\frac{1}{6}\right) \cdot \mathrm{N}\left(\frac{1000}{6}, \sqrt{1000\tfrac{1}{6}\tfrac{5}{6}}\right)
23 | + \mathrm{Pr}\left(\theta=\frac{1}{4}\right) \cdot \mathrm{N}\left(\frac{1000}{4}, \sqrt{1000\tfrac{1}{4}\tfrac{3}{4}}\right)
24 | $$
25 |
26 | Sketch:
27 |
28 | ```{r}
29 | x <- seq(50,300, 1)
30 | y <- 1/4 * dnorm(x, 1000/12, sqrt(1000*1/12*11/12)) + 1/2 * dnorm(x, 1000/6, sqrt(1000*1/6*5/6)) + 1/4 * dnorm(x, 1000/4, sqrt(1000*1/4*3/4))
31 | plot(x, y, type="l")
32 | ```
33 |
34 | ## (b)
35 |
36 | The sketch of the prior predictive distribution (alternatively, one can compare the locations and standard deviations of the three normals analytically) shows that the overlap between the three bell-shaped curves is negligible for approximation purposes. The first (left-most) bell-shaped curve accounts for $1/4$ of the overall mass of the prior predictive distribution. Therefore, the $5$% (or $\frac{1}{20}$) point is approximately the $4\cdot\frac{1}{20}=\frac{1}{5}$ (or $20$%) point of the first normal, which is
37 |
38 | ```{r}
39 | qnorm(1/5, 1000/12, sqrt(1000*1/12*11/12))
40 | ```
41 |
42 | The $25$% point thus lies between the first two bell-shaped curves, at approximately $125$.
43 | Since the last normal also accounts for $1/4$ of the overall mass, the $50$% point is approximately the mean of the second normal (at around $1000/6 = 166.6\ldots$).
44 | The $75$% point lies between the second and third normals, approximately at around $205$, and the $95$% point is approximately the $80$% point of the third normal, at around
45 |
46 | ```{r}
47 | qnorm(4/5, 1000/4, sqrt(1000*1/4*3/4))
48 | ```
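
The same quantiles can also be checked numerically from the mixture itself; a rough sketch using the grid from part (a) and the cumulative sum of the density:

```{r}
x <- seq(50, 300, 1)
y <- 1/4 * dnorm(x, 1000/12, sqrt(1000*1/12*11/12)) +
  1/2 * dnorm(x, 1000/6, sqrt(1000*1/6*5/6)) +
  1/4 * dnorm(x, 1000/4, sqrt(1000*1/4*3/4))
cdf <- cumsum(y) / sum(y)
sapply(c(0.05, 0.25, 0.5, 0.75, 0.95), function(p) x[which.min(abs(cdf - p))])
```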
49 |
--------------------------------------------------------------------------------
/chapter02/exercise07.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 7"
3 | author: "Sören Berg"
4 | date: "01 May 2019"
5 | output: html_document
6 | ---
7 |
8 | ## (a)
9 |
10 | We show that for the binomial likelihood $y \sim \mathrm{Bin}(n, \theta)$, the prior $p(\theta)\propto \theta^{-1}(1-\theta)^{-1}$ corresponds to a uniform prior distribution on the natural parameter of the exponential family.
11 |
12 | We start by determining the natural parameter
13 |
14 | $$
15 | \begin{aligned}
16 | p(y | \theta) &= \binom{n}{y} \theta^y (1-\theta)^{n-y}
17 | = \binom{n}{y} (1-\theta)^n e^{\log(\theta^y (1-\theta)^{-y})}
18 | = \binom{n}{y} (1-\theta)^n e^{\left(\log\frac{\theta}{1-\theta}\right)y}.
19 | \end{aligned}
20 | $$
21 |
22 | This also verifies that the binomial distribution belongs to the exponential
23 | family with $f(y)=\binom{n}{y}$, $g(\theta)= (1-\theta)^n$, $\phi(\theta)=\mathrm{logit}(\theta)=\log\frac{\theta}{1-\theta}$ and $u(y)=y$, cf. notation on page 36 BDA3.
24 |
25 | Now, let $p$ be a density with $p(\phi(\theta))\propto 1$. Since $\phi$ is a one-to-one transformation we can transform the density as follows. Note that said transformation can be done by equation (2.19) on page 52 BDA3, which is a special case of more general transformation theorems for integrals. First, note that the inverse of the logit function is the sigmoid function $\phi^{-1}(\theta) = \frac{1}{1+e^{-\theta}}$.
26 | We conclude
27 |
28 | $$
29 | \begin{aligned}
30 | p(\theta)
31 | &= p(\phi^{-1}(\phi(\theta)))
32 | = p(\phi(\theta)) \left| \frac{d\phi(\theta)}{d\theta}\right|
33 | = p(\phi(\theta)) \left| \frac{1}{\tfrac{\theta}{1-\theta}} \frac{1}{(1-\theta)^2}\right|\\
34 | &= p(\phi(\theta)) \left| \frac{1}{\theta(1-\theta)}\right|
35 | = p(\phi(\theta)) \frac{1}{\theta(1-\theta)}
36 | \propto \frac{1}{\theta(1-\theta)}=\theta^{-1}(1-\theta)^{-1}.
37 | \end{aligned}
38 | $$
39 |
40 |
41 | ## (b)
42 | We show that if $y=0$ or $y=n$ the resulting posterior is improper.
43 |
44 | The resulting posterior is proportional to the prior times the likelihood. Thus,
45 |
46 | $$
47 | p(\theta|y) \propto p(\theta) p(y|\theta)
48 | = \theta^{-1}(1-\theta)^{-1} \binom{n}{y}\theta^y(1-\theta)^{n-y}
49 | \propto \theta^{y-1}(1-\theta)^{n-y-1}.
50 | $$
51 |
52 | Let $y=0$, then the integral over the posterior is (except some nonzero normalization factor)
53 |
54 | $$
55 | \int_0^1 \theta^{y-1}(1-\theta)^{n-y-1} \ \mathrm{d}\theta
56 | = \int_0^1 \theta^{-1}(1-\theta)^{n-1} \ \mathrm{d}\theta
57 | \geq 2^{-(n-1)} \int_0^{1/2} \theta^{-1} \ \mathrm{d}\theta
58 | = 2^{-(n-1)} \lim_{\varepsilon\searrow 0} \left( \log\tfrac{1}{2} - \log \varepsilon \right) = \infty,
59 | $$
60 | where the inequality follows since $(1-\theta)^{n-1} \geq 2^{-(n-1)}$ for all $\theta\in[0,\tfrac{1}{2}]$.
61 | The case $y=n$ can be tackled analogously.
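
A quick numerical illustration of the divergence (a sketch, assuming $n = 10$ for concreteness): the unnormalised posterior mass on $[\varepsilon, 1]$ keeps growing as $\varepsilon \rightarrow 0$.

```{r}
n <- 10
f <- function(theta) theta^(-1) * (1 - theta)^(n - 1)
sapply(c(1e-2, 1e-4, 1e-6, 1e-8),
       function(eps) integrate(f, lower = eps, upper = 1, subdivisions = 1000)$value)
```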
62 |
--------------------------------------------------------------------------------
/chapter02/exercise10.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Exercise 10"
3 | author: "Corrie Bartelheimer"
4 | output:
5 | html_document:
6 | toc: true
7 | ---
8 | ```{r setup, include=FALSE}
9 | knitr::opts_chunk$set(echo = TRUE, comment=NA)
10 | ```
11 | ## Problem
12 | Suppose there are $N$ cable cars in San Francisco, numbered sequentially from 1 to $N$. You see a cable car at random; it is numbered 203. You wish to estimate $N$.
13 |
14 |
15 | ## Some preliminary thoughts
16 | Before starting to solve the exercise, let's think what we would intuitively expect without using any statistical knowledge.
17 | Obviously, having seen a cable car numbered 203, we know that $N$ cannot be lower than 203. So $N$ must be 203 or higher. My first guess after seeing the number 203 would be that $N$ is around 400. My intuitive reasoning is as follows: if $N$ were much larger than 203, the probability of seeing a cable car with a number higher than 203 would also be large. On the other hand, if $N$ were close to 203, the probability of seeing a car numbered below 203 would be large. I expect to see a car with a number smaller than $\lfloor \frac{N}{2}\rfloor$ about half the time and a car with a number larger than $\frac{N}{2}$ the other half, hence the guess of around 400 after seeing car 203.
18 | Let's see if this intuition agrees with the math.
19 |
20 | ## Part (a) - Compute the posterior
21 | Assume your prior distribution on $N$ is geometric with mean 100; that is
22 | $$p(N) = \frac{1}{100} \left(\frac{99}{100}\right)^{N-1}$$
23 | for $N =1, 2,...$.
24 | What is your posterior distribution for $N$?
25 |
26 | For this problem, we can approximate the posterior using grid approximation.
27 | ```{r}
28 | N_seq <- 1:1000000
29 |
30 | prior_mean <- 100
31 | prior <- 1/prior_mean * (1-1/prior_mean)^(N_seq - 1)
32 | ```
33 | We use the following function for the likelihood:
34 | $$P(y \,|\, N) = \begin{cases}
35 | \frac{1}{N} &\text{ if }N \geq y\\
36 | 0 & \text{otherwise} \end{cases}$$
37 | where $y$ denotes our observed data, here $y=203$.
38 |
39 | We then compute the likelihood for all possible $N$ (as big as computationally still possible), multiply the prior with the likelihood and simply standardize the posterior by dividing out its sum:
40 | ```{r}
41 | lkhd <- ifelse( N_seq >= 203, 1/N_seq, 0)
42 |
43 | unstzd.post <- prior * lkhd
44 |
45 | post <- unstzd.post / sum(unstzd.post)
46 |
47 |
48 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
49 | main="Posterior", xlab="N", ylab = "")
50 | ```
51 |
52 | The posterior is highly skewed with very small probability mass on very large values of $N$. By definition of our likelihood, there's no probability mass on values of $N$ below 203. Interestingly, and different from my intuition, most probability mass is on the values just above 203.
53 |
54 | ## Part (b) - Computing posterior summaries
55 | What are the posterior mean and standard deviation of $N$?
56 |
57 | To compute the posterior mean and standard deviation, we sample from our posterior distribution:
58 | ```{r}
59 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
60 | hist(post_sample, breaks = 30, main="Histogram of the posterior sample",
61 | xlab="N")
62 | ```
63 |
64 | Computing posterior summaries is then straight-forward:
65 | ```{r}
66 | mean(post_sample)
67 | ```
68 |
69 | ```{r}
70 | median(post_sample)
71 | ```
72 |
73 |
74 | ```{r}
75 | sd(post_sample)
76 | ```
77 |
78 | ```{r}
79 | Mode <- function(x) {
80 | ux <- unique(x)
81 | ux[which.max(tabulate(match(x, ux)))]
82 | }
83 | Mode(post_sample)
84 | ```
85 |
86 | Our posterior distribution tells us to expect $N$ to be around 280, much smaller than my guess of around 400. One reason for this value is also our prior: Since we only have one observation, our prior has a strong influence. Picking different prior means for the geometric prior distribution yields different posterior means.
87 |
88 | ## Part (c) - Non-informative prior
89 | Choose a reasonable 'non-informative' prior distribution for $N$ and give the resulting posterior distribution, mean and standard deviation for $N$.
90 |
91 | A first idea would be to take a flat uniform prior: $P(N) \propto 1$. However, this leads to an improper posterior:
92 |
93 | $$\begin{align*}
94 | P(N \,|\,y) &\propto P(N) P(y \,|\, N) &\\
95 | &\propto P(y \,|\, N) &\\
96 | &\propto \frac{1}{N} & \text{if } N \geq y
97 | \end{align*}$$
98 | Since $\sum_{N=1}^{\infty} \frac{1}{N} = \infty$, this posterior would be improper. Simulating this shows the problems we get:
99 | ```{r}
100 | N_seq <- 1:100000
101 | prior <- 1
102 |
103 | lkhd <- ifelse( N_seq >= 203, 1/N_seq, 0)
104 |
105 | unstzd.post <- prior * lkhd
106 |
107 | post <- unstzd.post / sum(unstzd.post)
108 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
109 |
110 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
111 | main="Posterior", xlab="N", ylab = "")
112 | ```
113 |
114 | There's too much probability mass on high values; if we try to compute the mean, we get unreasonably high values, and the result changes wildly if we change the support over which we approximate the posterior.
115 |
116 | Another option for the prior is $P(N) \propto \frac{1}{N}$. This is an improper prior (same reasoning as above, it doesn't sum to 1) but it leads to a proper posterior:
117 | $$\begin{align*}
118 | P(N \,|\,y) &\propto P(N) P(y \,|\, N) &\\
119 | &\propto \frac{1}{N} \cdot \frac{1}{N} & \text{if } N \geq y\\
120 | &\propto \frac{1}{N^2} & \text{if } N \geq y \\
121 | &=
122 | c\frac{1}{N^2} & \text{if } N \geq y, \text{ for some } c\\
123 | \end{align*}$$
124 | since the sum $\sum_{N=1}^{\infty} \frac{1}{N^2}$ converges.
125 |
126 | We can compute $c$ as follows:
127 | $$\begin{align*}
128 | 1 &= c\sum_{N=203}^\infty \frac{1}{N^2} \\
129 | \iff \quad \frac{1}{c}&= \sum_{N=203}^\infty \frac{1}{N^2}\\
130 | & = \sum_{N=1}^\infty \frac{1}{N^2} - \sum_{N=1}^{202} \frac{1}{N^2} \\
131 | & = \frac{\pi^2}{6} - \sum_{N=1}^{202} \frac{1}{N^2}
132 | \end{align*}$$
133 |
134 | We compute $c$ numerically:
135 | ```{r}
136 | one_c <- pi^2/6 - sum( 1/(1:202)^2)
137 | c <- 1/one_c
138 | c
139 | ```
140 |
141 | ```{r}
142 | N_seq <- 1:1000000
143 |
144 | post <- ifelse( N_seq >= 203, c/N_seq^2, 0)
145 |
146 | post_sample <- sample(N_seq, size=2000, replace=T, prob=post)
147 |
148 | plot(N_seq, post, type = "l", xlim = c(1, 1000),
149 | main="Posterior", xlab="N", ylab = "")
150 | ```
151 |
152 | It is straight-forward to see that this posterior does not have a mean:
153 | $$\begin{align*}
154 | E(N\,|\,y) &= \sum_{N=1}^\infty N \cdot P(N\,|\,y) \\
155 | &= \sum_{N=203}^\infty N \frac{c}{N^2} \\
156 | &= \sum_{N=203}^\infty \frac{c}{N} \\
157 | &= \infty
158 | \end{align*}$$
159 |
160 |
--------------------------------------------------------------------------------
/chapter02/exercise12/exercise12.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Chapter 2 - Exercise 12"
3 | author: "Aaron McDaid - aaron.mcdaid@gmail.com"
4 | date: "2 May 2019"
5 | output: html_document
6 | ---
7 |
8 |
18 |
19 | ```{r setup, include=FALSE}
20 | knitr::opts_chunk$set(
21 | # cache = TRUE,
22 | # dev = "svglite",
23 | echo = TRUE,
24 | comment = NA,
25 | message = FALSE,
26 | warning = TRUE,
27 | error = TRUE
28 | )
29 |
30 | library(tidyverse)
31 | library(scales)
32 | library(kableExtra)
33 | library(here)
34 |
35 | theme_set(theme_bw())
36 | ```
37 |
38 |
39 | ### The question
40 | Question 15: Joint distributions: The autoregressive time-series model $y_1$, $y_2$, $\dots$ with mean level $0$,
41 | autocorrelation $0.8$, residual standard deviation $1$, and normal errors can be written as
42 | $$(y_t \mid y_{t-1}, y_{t-2}, \dots) \sim N(0.8y_{t-1}, 1) \qquad \mbox{ for all $t$.}$$
43 |
44 | (a) Prove that the distribution of $y_t$, given the observations at all other integer time points
45 | $t$, depends only on $y_{t-1}$ and $y_{t+1}$.
46 |
47 | (b) What is the distribution of $y_t$ given $y_{t-1}$ and $y_{t+1}$ ?
48 |
49 | _This answer is just for part (a) of question 15. Also, it's very much a work-in-progress as we try to make it clear and correct_
50 |
51 | ### some notation
52 |
53 | First define some notation to make things more concise.
54 | Define $\yy_{a..b}$ as the vector of all $y_i$ with $a \le i \le b$.
55 | This allows us to say:
56 |
57 | $$ \PP{\yy_{1..t}} = \PP{y_t, y_{t-1}, \dots, y_2, y_1} $$
58 |
59 | ### conditional probability
60 |
61 | $$
62 | \PP{A | B, C} = \frac{\PP{A, B, C}}{\PP{B,C}}
63 | $$
64 | Next, note that
65 | $$\PP{B, C} = \int \PP{A,B,C} \dd A
66 | $$
67 | and therefore
68 | $$
69 | \PP{A | B, C} = \frac{\PP{A, B, C}}{\int \PP{A,B,C} \dd A}
70 | $$
71 |
72 | ### part(a)
73 |
74 | The question discusses "all other integer time points".
75 | We define $n$ as an arbitrarily large integer much greater than $t+1$.
76 | We can't easily reason about "all" time points, as there are infinitely many;
77 | but we can consider large $n$ as $n \rightarrow \infty$.
78 |
79 | We begin with the joint density over "all" time points:
80 | $$
81 | \PP{\yy_{1..n}} = \PP{y_n, y_{n-1}, y_{n-2}, \dots, y_{t+1}, y_t, y_{t-1}, \dots, y_1}
82 | $$
83 |
84 | Using the formula above for conditional probability:
85 | $$
86 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{\yy_{1..n}}}{\int \dd y_t ~ \PP{\yy_{1..n}}}
87 | $$
88 |
89 | We can use the gaussian density in the question to expand:
90 | $$
91 | \PP{\yy_{1..n}} = \PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}
92 | $$
93 | Substituting this into
94 | $$
95 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{\yy_{1..n}}}{\int \dd y_t ~ \PP{\yy_{1..n}}}
96 | $$
97 | we get
98 | $$
99 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}{\int \dd y_t ~ \PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}
100 | $$
101 | Factors that don't use $y_t$ can be brought outside the integral:
102 | $$
103 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\PP{y_1} \prod_{i=2}^n \NN{y_i|0.8 y_{i-1}, 1}}{\PP{y_1} ~ \left( \prod_{i=2}^{t-1} \NN{y_i|0.8 y_{i-1}, 1}\right) \left(\prod_{i=t+2}^n \NN{y_i|0.8 y_{i-1}, 1}\right) \left(\int \dd y_t ~ \NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}\right)}
104 | $$
105 | The factors that were just moved out can be cancelled against the same factors in the numerator:
106 | $$
107 | \PP{y_t | \yy_{1..t-1}, \yy_{t+1..n}} = \frac{\NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}}{\int \dd y_t ~ \NN{y_t|0.8 y_{t-1}, 1} \NN{y_{t+1}|0.8 y_t, 1}}
108 | $$
109 |
110 | That final expression includes terms for $y_{t+1}$, $y_t$ and $y_{t-1}$, but nothing else.
111 | Therefore part (a) is shown.
112 |
113 | ### part (b)
114 |
115 | For part b, I think we can model $(y_{t+1}, y_t)$ as a bivariate normal,
116 | parameterized in terms of $y_{t-1}$, and use equation (3.14).
117 | TODO: part b!
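
As a numerical sanity check of that plan (a sketch only, not the derivation; the AR coefficient $0.8$ and unit innovation standard deviation come from the question, everything else below is illustrative), we can simulate a long series and regress $y_t$ on its two neighbours; consistent with part (a), only the neighbours matter, and by symmetry the two coefficients should match:

```{r}
set.seed(1)
n_sim <- 1e5
y <- as.numeric(arima.sim(model = list(ar = 0.8), n = n_sim))

d <- tibble(
  y_prev = y[1:(n_sim - 2)],
  y_t    = y[2:(n_sim - 1)],
  y_next = y[3:n_sim]
)

fit <- lm(y_t ~ y_prev + y_next, data = d)
coef(fit)   # the two slope coefficients should be (nearly) equal
sigma(fit)  # residual standard deviation of y_t given its two neighbours
```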
118 |
--------------------------------------------------------------------------------
/meetings/2019-03-28.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * We read sections 1.3 and 1.4, and solved the first exercise of the chapter.
4 |
5 | * These topics in particular came up:
6 |
7 | 1. Why does the last equality of equation 1.4 (page 7) hold? In other words, why is $\tilde y$ independent of $y$ conditional on $\theta$?
8 |
9 | 2. Suppose you have two normal random variables $\theta_i \sim \normal(\mu_i, \sigma_i)$, $i = 1, 2$, with corresponding probability density functions $p_i$. Let $c \in (0, 1)$. What's the difference between the probability density function of $\theta := c \theta_1 + (1 - c) \theta_2$ and the probability density function $p := cp_1 + (1 - c)p_2$?
10 |
11 | 3. What is marginalisation?
12 |
13 | 4. What's the difference between a probability and a probability density?
14 |
15 | * We had some volunteers to prepare exercises for the next session on 11th April:
16 |
17 | * Exercise 2: Sören
18 | * Exercise 3: Corrie
19 | * Exercise 4: Salma
20 | * Exercise 5: Amin
21 | * Exercise 6: Tiago
22 | * Exercise 7: Prasana
23 | * Exercise 8: Janders
24 | * Exercise 9: Ayan
25 |
--------------------------------------------------------------------------------
/meetings/2019-04-11.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Exercise 2: Sören
6 | * Exercise 3: Corrie
7 | * Exercise 4: Salma
8 | * Exercise 5: Amin
9 | * Exercise 6: Tiago
10 | * Exercise 7: Prasana
11 | * Exercise 8: Jan
12 | * Exercise 9: Vasa
13 |
14 | * Exercise 9 caused the most discussion, especially with ideas on how to efficiently implement the simulation.
15 |
16 | * The next session is planned for the 2nd of May.
17 |
18 | * We aim to have read up to and including section 2.6 of chapter 2 before the next session. During the next session we will summarise and discuss the material.
19 |
20 | * The following volunteered to attempt some exercises:
21 |
22 | * Exercise 1: Jan
23 | * Exercise 2: Amin
24 | * Exercise 3: Camile
25 | * Exercise 5: Sven
26 | * Exercise 6: Vasa
27 | * Exercise 7: Sören
28 | * Exercise 8: Brian
29 | * Exercise 9: Santiago
30 | * Exercise 10: Corrie
31 | * Exercise 11: Konrad
32 | * Exercise 12: Aaron
33 | * Exercise 13: Jan
34 |
--------------------------------------------------------------------------------
/meetings/2019-05-02.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Exercise 1: Sven
6 | * Exercise 2: Amin
7 | * Exercise 5: Sven
8 | * Exercise 6: Vasa
9 | * Exercise 7: Sören
10 |
11 | * The following volunteered for next time:
12 |
13 | * Exercise 15: Aaron
14 | * Exercise 19: Salma
15 | * Exercise 20: Brian
16 | * Exercise 21: Corrie
17 |
18 | * It was suggested that somebody present the details of one of the examples and Vasa volunteered to look into the cancer rate example for next time.
19 |
20 | * Some exercises have boring details - it's up to the person presenting the solution what counts as boring.
21 |
22 | * We agreed that biweekly is a good rhythm: weekly is too much to prepare for and less than biweekly leads to forgetting.
23 |
24 | * We agreed to set up a slack poll to decide the day of the week to avoid selection bias of those present.
25 |
--------------------------------------------------------------------------------
/meetings/2019-05-16.md:
--------------------------------------------------------------------------------
1 | # Summary
2 |
3 | * The following solutions were presented and discussed:
4 |
5 | * Chapter 2 Exercise 8: Brian
6 | * Chapter 2 Exercise 9: Brian
7 | * Chapter 2 Exercise 10: Corrie
8 | * Chapter 2 Exercise 11: Konrad
9 | * Chapter 2 Exercise 12: Aaron
10 |
11 | * Vasa presented the cancer rate example. The question was raised whether it's better to base your prior directly on the observed rates or only via the prior predictive distribution. Konrad pointed out that you can plot all the observed rates in one plot (as in the book) but the observed counts are conditional on the population size.
12 |
13 | * We decided to attempt to read all of chapter 3 for next time since it's very similar to chapter 2 (just with multiple variables). Next date: 29th May.
14 |
15 | * The following were volunteered for next time:
16 |
17 |     * Chapter 3 bioassay example: Corrie
18 |     * Chapter 3 Exercise 5: Vasa
19 |     * Chapter 3 Exercise 6: Brian
20 |     * Chapter 3 Exercise 11: Vasa
21 |     * Chapter 3 Exercise 12?: Konrad (take your pick)
22 |
23 | * There are also some leftover exercises from chapter 2:
24 |
25 | * Chapter 2 Exercise 15: Aaron
26 | * Chapter 2 Exercise 19: Salma
27 | * Chapter 2 Exercise 20: Brian
28 | * Chapter 2 Exercise 21: Corrie
29 |
30 | * Can somebody link the exercises to the sections so that we can cover everything more efficiently?
31 |
32 | * There was some discussion whether we want to attempt to solve all the exercises. Maybe we assign each person a random exercise? Each person picks a random exercise? We continue picking the exercises that look interesting?
33 |
34 | * Let's give the repo some TLC, i.e. a github project page, so that we can view the html more conveniently.
35 |
--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
1 | # Berlin Bayesians Reading Group
2 |
3 | We meet every ~2 weeks to discuss material from Bayesian Data Analysis. See [meetings](meetings) for a summary of the material covered, and the chapter folders for solutions.
4 |
5 | ## Contributing
6 |
7 | To add your solutions, you can submit a pull-request. If you are unfamiliar with pull-requests, see [Hadley Wickham's guide](http://r-pkgs.had.co.nz/git.html#git-pullreq), or get in touch via [slack](https://join.slack.com/t/berlinbayesians/shared_invite/enQtNTMyODA2MjE0MDY0LTk2OTgwNmJlMWIzYzMzMzEyNTlkYmY0MDFlZmUwYTllOWYyMTlkNDU3YzFiNDkzNjQwMmMxYjg1YzQxYTcyNjM).
8 |
--------------------------------------------------------------------------------