├── .gitignore
├── README
├── chapter1.tex
├── chapter2.tex
├── chapter3.tex
├── chapter4.tex
├── chapter5.tex
├── chapter6.tex
├── chapter7.tex
├── mitzenmacher-and-upfal-solutions.pdf
└── mitzenmacher-and-upfal-solutions.tex


/.gitignore:
--------------------------------------------------------------------------------
1 | *.aux
2 | *.log
3 | *.toc
4 | *.out
5 | *.synctex.gz
6 | 


--------------------------------------------------------------------------------
/README:
--------------------------------------------------------------------------------
1 | Solutions to problems in the "Probability and Computing" book by Mitzenmacher and Upfal.
2 | 


--------------------------------------------------------------------------------
/chapter1.tex:
--------------------------------------------------------------------------------
  1 | \chapter{Events and Probability}
  2 | 
  3 | \section{Comments on the main text}
  4 | 
  5 | \section{Exercises}
  6 | 
  7 | \subsection*{Exercise 1.1}
  8 | 
  9 | Flip a fair coin ten times.
 10 | 
 11 | \begin{itemize}
 12 | 	\item[(a)] $\P(\text{\#heads = \#tails}) = \frac{\binom {10} 5}{2^{10}} = \frac{63}{256}$. Just choose 5 coins which can be heads.
 13 | 	\item[(b)] $\P(\text{more heads than tails}) = \left(\binom {10}{6} + \binom {10}{7} + \binom {10}{8} +
 14 | 	\binom {10}{9} + \binom {10}{10} \right) / 2^{10}$. Ensure that the number of heads are more than 5.
 15 | 	\item[(c)] $\P(\text{the ith flip and the (11-i)th flip are the same for i = 1...5}) = (1/2)^{5}$. For one $i$, the probability is 1/2.
 16 | 	\item[(d)] \textbf{Method 1: }
 17 | 	
 18 | 	$\P(\text{at least 4 consecutive heads}) = \P(\text{consecutive 4}) + \P(\text{consecutive 5}) + \P(\text{consecutive 6})
 19 | 	+\P(\text{consecutive 7}) + \P(\text{consecutive 8}) + \P(\text{consecutive 9}) + \P(\text{consecutive 10}) = ( (2 \times 2^5 + 5 \times 2^4 - 3 - 2)
 20 | 	+ (2 \times 2^4 + 4 \times 2^3) + (2 \times 2^3 + 3 \times 2^2) + (2 \times 2^2 + 2 \times 2^1) + (2 \times 2^1 + 1 \times 2^0) + 2 + 1)
 21 | 	/1024 = 251/1024 $.Consecutive 5,6,7,8,9,10 are easy to understand. Consecutive 4 is special. There are 2 cases in consecutive 4 which are actually
 22 | 	consecutive 5 (HHHHTHHHHH and HHHHHTHHHH). There are 3 cases in consecutive 4 which are duplicated (HHHHTTHHHH, HHHHTHHHHT, THHHHTHHHH). 
 23 | 	
 24 | 	
 25 | 	\textbf{Method 2, recursion:}
 26 | 	
 27 | 	Let $\P_4(n)$ be the probability that you have at least 4 consecutive heads after n flips.
 28 | 	Clearly $\P_4(0) = \P_4(1) = \P_4(2) = \P_4(3) = 0$ and $\P_4(4)= 2^{-4}$.
 29 | 	
 30 | 	For more flips, either you have achieved 4 consecutive heads in the first a few flips already or you have a string without 
 31 | 	4 consecutive heads followed by `THHHH...'. So for $n > 4$: $\P_4(n) = \P_4 (n - 1) + 2^{-5} (1 - \P_4(n-5))$.
 32 | 	Because, if $(n-1)$ already has 4 consecutive heads, the probability is just $\P_4(n-1)$, if it does
 33 | 	not have, and we want $n$ to have, then we know that the $n$th flip must be H and the last 5 flips
 34 | 	out of $n$ flips must be THHHH. Only in this way can we say that the first $n-1$ does not achieve it but with the last flip it does. 
 35 | 	
 36 | \end{itemize}
 37 | 
 38 | 
 39 | \subsection*{Exercise 1.2}
 40 | 
 41 | Roll two standard dice.
 42 | 
 43 | \begin{itemize}
 44 | 	\item[(a)] Same number: 1/6.
 45 | 	\item[(b)] The number on the 1st dice $>$ the number on the 2nd dice: 15/36, (6 equal cases, the rest are symmetric).
 46 | 	\item[(c)] The sum is even: 1/2.
 47 | 	\item[(d)] The product is a perfect square: (6 + 2)/36. (6 pairs of identical numbers, 1 * 4 and 4 * 1 are also two cases)
 48 | \end{itemize}
 49 | 
 50 | 
 51 | \subsection*{Exercise 1.3}
 52 | 
 53 | Shuffle a standard deck of cards.
 54 | 
 55 | \begin{itemize}
 56 | 	\item[(a)] First two cards include at least one ace: $1 - \frac{48}{52} \cdot \frac{47}{51}$, use 1 minus the probability
 57 | 	that neither of them is ace. Or we can view it as just draw two cards from a deck and they have at least one ace: 
 58 | 	1 - $\binom{48}{2}/\binom{52}{2}$.
 59 | 	\item[(b)] First five cards include at least one ace: 
 60 | 	$1 - \frac{48 \times 47 \times 46 \times 45 \times 44}{52 \times 51 \times 50 \times 49 \times 48}$
 61 | 	\item[(c)] First two cards are a pair of the same rank: 3/51. Whatever the first card is, we look at the probability that the second one is at the same rank as the first one.
 62 | 	\item[(d)] First five cards are all diamonds: $(13 \times 12 \times 11 \times 10 \times 9)/(52 \times 51 \times 50 \times 49 \times 48)$.
 63 | 	\item[(e)] First five cards form a full house ( three of one rank and two of another rank):
 64 | 	in this problem, we see it as drawing 5 random cards from a deck. The sample space is $\binom{52}{5}$. To be a full house, we need to
 65 | 	first pick two numbers $\binom{13}{2}$, then one of them has three cards and the other has two cards $\binom{13}{2}\binom{4}{3} \binom{4}{2} \times 2 = 3744$. The last 2 means that there are two ways to 
 66 | 	So the probability is 3744/$\binom{52}{5}$.
 67 | \end{itemize}
 68 | 
 69 | \subsection*{Exercise 1.4}
 70 | 
 71 | Let $E_k$ be the event that the loser has won $k$ games, where $0 \leq k \leq n-1$.
 72 | 
 73 | If the game stops and the loser won $k$ games, then it means there are $n + k$ games in total. the $k$ games
 74 | won by the loser can happen at any place except the last one. 
 75 | 
 76 | The answer for $k$ is 
 77 | 
 78 | \begin{equation*}
 79 | \P[E_k] = \frac{\binom{n+k-1}{k}}{2^{n+k-1}}
 80 | \end{equation*}
 81 | 
 82 | We want to verify this result. Since $k < n$, we know that for a fixed $n$ we need to prove,
 83 | 
 84 | \begin{equation*}
 85 | \sum_{k = 0}^{n - 1} \P[E_k] = \sum_{k = 0}^{n - 1} \frac{\binom{n+k-1}{k}}{2^{n+k-1}} = 1
 86 | \end{equation*}
 87 | 
 88 | 
 89 | 
 90 | \subsection*{Exercise 1.5}
 91 | 
 92 | (a) (b) (c) Just enumerate?
 93 | 
 94 | \subsection*{Exercise 1.6}
 95 | 
 96 | Show that the number of white balls is equally likely to be any number between 1 and $n-1$.
 97 | Hint: Use mathematical induction.
 98 | 
 99 | Let $W_n$ be the random variable representing the number of white balls when there are
100 | $n$ balls in total.
101 | 
102 | Base case: $n = 3$. $\P[W_3 = 1] = 1/2$ and $\P[W_3 = 2] = 1/2$ because the third
103 | ball can be either white or black with equal probability.
104 | 
105 | Inductive step: Let $X_n$ be the random variable representing the $n$th ball. It is $b$ if it is
106 | black and $w$ if white.
107 | Assuming $\P[W_n = i] = \frac{1}{n-1} \forall i = 1,2...n$, then we can derive that
108 | 
109 | \begin{equation*}
110 | \begin{split}
111 | \P[W_{n+1} = i] & = \P[W_n = i - 1] \cdot \P[X_n = w | W_n = i - 1] \\
112 | &\ \ + \P[W_n = i] \cdot \P[X_n = b | W_n = i] \\
113 | & = \frac{1}{n-1} \frac{i - 1}{n} + \frac{1}{n - 1} \frac{n - i}{n} \\
114 | & = \frac{1}{n}
115 | \end{split}
116 | \end{equation*}
117 | 
118 | \subsection*{Exercise 1.7}
119 | 
120 | \noindent (a) \url{https://proofwiki.org/wiki/Inclusion-Exclusion_Principle}
121 | 
122 | \noindent (b)(c) Bonferroni Inequalities.
123 | 
124 | Define 
125 | 
126 | \begin{equation*}
127 | S_k := \sum_{1\leq i_1 < ... <  i_k \leq n} \P(A_{i_1} \cap ... \cap A_{i_k})
128 | \end{equation*}
129 | 
130 | 
131 | \subsection*{Exercise 1.8}
132 | 
133 | Let $E_1, E_2, E_3$ be the events that the number chosen is divisible by 4,6,9.
134 | 
135 | \begin{equation*}
136 | \begin{split}
137 | \P[E] & = \P[E_1] + \P[E_2] + \P[E_3] - \P[E_1 \cap E_2]
138 | - \P[E_2 \cap E_3] - \P[E_1 \cap E_3] \\
139 | & \ \ \ + \P[E_1\cap E_2 \cap E_3] \\
140 | & =  250000/10^6 + 166666/10^6 + 111111/10^6 - 83333/10^6 \\
141 | & \ \ - 55555/10^6 - 27777/10^6 + 27777/10^6\\
142 | & = 38889/1000000
143 | \end{split}
144 | \end{equation*}
145 | 
146 | \subsection*{Exercise 1.9}
147 | 
148 | Flip a fair coin $n$ times.
149 | For $k>0$, find an upper bound on the probability that there is a sequence of $\log_2 n + k$ consecutive heads.
150 | 
151 | Let's begin by defining the probability space and events we will analyse. First, let
152 | $H_i$ be the event that the $i$th coin comes up heads. Similarly, let
153 | $S_i$ denote the event that $\log_2 n + k$ consecutive coin flips are heads, starting with the
154 | $i$th flip. We can derive an upper bound on the probability $p$ that there is a sequence of 
155 | $\log_2 n + k$ consecutive heads using the union bound.
156 | Note that only a single run of $\log_2 n + k$ heads within $n$
157 | flips is required for success. As a result,
158 | we can apply the union bound to the sequence of events $S_i$
159 | to obtain
160 | 
161 | \begin{equation*}
162 | p = \P\left(\bigcup_{i \in I} S_i \right) \leq \sum_{i \in }\P[S_i]
163 | \end{equation*}
164 | where $I = \{1,2,3... n - \log_2 n - k + 1\}$. Note that the limits of the summation have been selected
165 | to prevent indexing events which do not exist (e.g. $S_0, S_{n - \log_2 n - k + 2}$).
166 | 
167 | At this point, we need to determine $\P[S_i]$. For any given sequence starting at flip
168 | $i$, each coin toss will be independent of the others (i.e., $\{H_i\}$ are mutually independent). As a result, we can
169 | express the desired probability
170 | 
171 | \begin{equation*}
172 | \P[S_1] = \P\left[\bigcap_{i=1}^{\log_2 n + k} H_i\right] = \prod_{i=1}^{\log_2 n + k} \P[H_i]
173 | = (\frac{1}{2})^{\log_2 n + k} = \frac{1}{2^k n}
174 | \end{equation*}
175 | 
176 | Similarly, $\P[S_i]$ must also be $1/2^k n$ because we only care about the coin sequence starting from
177 | the $i$th coin. Substituting into the union bound.
178 | 
179 | \begin{equation*}
180 | p \leq \sum_{i=1}^{n - \log_2 n - k + 1} \frac{1}{2^k n} = \frac{n - \log_2 n - k + 1}{2^k n} \leq 2^{-k}
181 | \end{equation*}
182 | 
183 | \begin{remark}
184 | 	This exercise shows us that to bound a probability we do not have to think about the exact probability
185 | 	very carefully. We can just simplify the procedure by omitting some details. For example, we know that
186 | 	when $S_1$ happens, it can also mean that $S_2$ may happen. However, we do not care about the duplicated
187 | 	cases. We just need to know that the actual probability is smaller than $\P[S_1] + \P[S_2]$.
188 | 	In practice, if this sum is small, then that is already enough for us.
189 | \end{remark}
190 | 
191 | \subsection*{Exercise 1.10}
192 | 
193 | Let $F$ represent the event that I flip a fair coin, let $T$ represent the event that I flip a two-headed coin.
194 | Let $X$ be the random variable representing the result of the flip. 
195 | 
196 | \begin{equation*}
197 | \begin{split}
198 | \P[T | X = H] & = \frac{\P[T \cap (X = H)]}{\P[X = H]} \\
199 | & = \frac{1/2 \cdot 1}{\P[X = H | F]\P[F] + \P[X = H | T]\P[T]} \\
200 | & = \frac{1/2}{1/2 \cdot 1/2 + 1 \cdot 1/2} \\
201 | & = \frac{2}{3}
202 | \end{split}
203 | \end{equation*}
204 | 
205 | \subsection*{Exercise 1.11}
206 | 
207 | \noindent (a) Obviously, only when the bit is flipped even times can we get a correct bit.
208 | 
209 | \noindent (b) We study the probability that after the bit passes through the relays it is flipped.
210 | 
211 | \begin{equation*}
212 | \frac{1-q_1}{2} \cdot (1 - \frac{1-q_2}{2}) + \frac{1-q_2}{2} \cdot (1 - \frac{1-q_1}{2}) = \frac{1-q_1q_2}{2}
213 | \end{equation*}
214 | 
215 | \noindent (c) 
216 | The proof is by induction. For the base case, let $n = 1$. The probability of receiving the correct bit
217 | is given by:
218 | 
219 | \begin{equation*}
220 | \frac{1 + (1 - 2p)}{2} = 1 - p
221 | \end{equation*}
222 | 
223 | The base case now is verified. We want to prove that when $n = k$ is valid, then $n = k+1$ is also valid.
224 | Let $E_n$ be the event that we receive a correct bit after $n$ relays. 
225 | 
226 | \begin{equation*}
227 | \P[E_{n+1}] = \P[E_n]\cdot (1-p) + (1 - \P[E_n]) \cdot p = \frac{1 + (1-2p)^{n+1}}{2}
228 | \end{equation*}
229 | 
230 | \qed
231 | 
232 | \subsection*{Exercise 1.12}
233 | 
234 | Monty Hall Problem.
235 | 
236 | This is a classic problem in probability and the ``counter-intuitive'' result can best be seen by
237 | applying Bayes' Law. To begin our analysis let's enumerate the sample space. Let $O_i$
238 | correspond to the event where Monty opens door $i$. In addition, let $C_i$
239 | be the event that the car is behind door $i$. 
240 | Without loss of generality we can assume that the contestant always initially chooses the first
241 | door and that Monty chooses the second (since we could always permute the door labels to achieve
242 | this condition). Subject to this condition, the sample space $\Omega$ can be enumerated simply by the
243 | position of the car as $\Omega = \{C_1, C_2, C_3\}$.
244 | 
245 | We now know that the condition is that Monty opens the door 2. Therefore, all we need to do is to compare
246 | the probability $\P[C_1 | O_2]$ and $\P[C_3 | O_2]$.
247 | 
248 | \begin{equation*}
249 | \P[C_1 | O_2] = \frac{\P[O_2 | C_1] \cdot \P[C_1]}{\sum_{i=1}^{3} \P[O_2 | C_i]\P[C_i]},
250 | \P[C_3 | O_2] = \frac{\P[O_2 | C_3] \cdot \P[C_3]}{\sum_{i=1}^{3} \P[O_2 | C_i]\P[C_i]}
251 | \end{equation*}
252 | 
253 | To calculate these two probabilities, we need to figure out all quantities here. $\P[C_i] = 1/3$.
254 | $\P[O_2 | C_1] = 1/2, \P[O_2 | C_2] = 0, \P[O_2 | C_3] = 1$. Here, since the contestant
255 | already chose the door 1, if the car is behind the door 1, then Monty would open the door 2 with probability 1/2, if the
256 | car is behind the door 2, Monty would not open it, if the car is behind the door 3, Monty would definitely open
257 | the door 2 because there is no other doors to choose.
258 | 
259 | Therefore, $\P[C_1 | O_2] = 1/3$ and $\P[C_3 | O_2] = 2/3$. So we always change the door and we always get larger
260 | probability. 
261 | 
262 | How to understand this intuitively? One idea is that we think about these two cases, the first case is that the contestant chose
263 | the right one with probability 1/3, and the wrong one with probability 2/3. Monty always opens a door behind which there is no car.
264 | If we fall in the first case, then it means changing would give us a failure. If we fall in the second case, then it means that
265 | changing would give us a success. Since we have larger probability to fall in the second case, we like to choose to change our
266 | option.
267 | 
268 | \subsection*{Exercise 1.13}
269 | 
270 | We need to calculate the probability. Let the random variable $R$ represent the result of the test. It is $P$ if positive
271 | and $N$ if negative. Let $D$ be whether the patient has the disorder. It is $T$ if true, and $F$ if false.
272 | 
273 | \begin{equation*}
274 | \begin{split}
275 | \P[D = T | R = P] & = \frac{\P[R =P | D = T] \cdot \P[D = T]}{\P[R = P]} \\
276 | & = \frac{\P[R = P | D = T] \cdot \P[D = T]}{\P[R = P | D = T] \cdot \P[D = T]
277 | 	+ \P[R = P | D = F] \cdot \P[D = F]} \\
278 | & = \frac{0.999 \cdot 0.02}{0.999 \cdot 0.02 + 0.005 \cdot 0.98} \\
279 | & = 0.80305466237
280 | \end{split}
281 | \end{equation*}
282 | 
283 | \subsection*{Exercise 1.14}
284 | 
285 | Let $M$ represent the event that he wins 3 games and I win 1 game. Let $E_1$ be the event that I am better,
286 | $E_2$ equally good, $E_3$ be the event that he is better. $\P[E_i] = 1/3$.
287 | 
288 | Then we want 
289 | 
290 | \begin{equation*}
291 | \begin{split}
292 | \P[E_3 | M] & = \frac{\P[M | E_3] \P[E_3]}{\sum_{i=1}^{3} \P[M | E_i] \P[E_i]} \\
293 | & = \frac{0.0864}{0.0864 + 0.0625 + 0.0384} \\
294 | & \approx 0.461
295 | \end{split}
296 | \end{equation*}
297 | 
298 | \subsection*{Exercise 1.15}
299 | 
300 | We use the principle of deferred decisions. 
301 | 
302 | It does not matter whether what outcomes are for the previous 9 dice. There is always one outcome of the 10th dice
303 | that can make the result divisible by 6. The probability that we can get this number is just 1/6.
304 | 
305 | \subsection*{Exercise 1.16}
306 | 
307 | \noindent (a) All three show the same number on the first roll: 1/36
308 | 
309 | \noindent (b) Exactly two of them show the same number on the first roll: 5/12
310 | 
311 | \noindent (c) When two of the three dice showing the same number on the first roll, the player can roll the different one
312 | once, if it succeeds, then the game ends, if it does not succeeds, the player can roll it one more time. The probability
313 | should be $1 - (\frac{5}{6})^2 = 11/36$.
314 | 
315 | \noindent (d)
316 | 
317 | The probability that the player wins the game:
318 | 
319 | $p = \frac{1}{36} + \frac{15}{36} \cdot \frac{11}{36} + \frac{20}{36} 
320 | \cdot (\frac{1}{36} + \frac{15}{36} \cdot \frac{1}{6} + \frac{20}{36} \cdot \frac{1}{36}) $
321 | 
322 | 
323 | \subsection*{Exercise 1.17}
324 | 
325 | I don't think the analysis would change...
326 | 
327 | \subsection*{Exercise 1.18}
328 | 
329 | \noindent (a) Choose $x \in \{0,...,n-1\}$ uniformly and let $y = z - x \mod n$. Then outputs $F(z) = F(x) + F(y) \mod m$. 
330 | The result is wrong if either of $x$ and $y$ is corrupted. By union bound, this is $2/5$. 
331 | 
332 | \noindent (b) When we can do 3 times, we take the majority vote. If there is no such thing, then we return the first answer.
333 | 
334 | The algorithm is wrong when at least two outputs are wrong. Assuming the probability that one test gives us wrong answer is
335 | $p_z$. By (a), we know that $p_z \leq 2/5$. The error probability now is $\binom{3}{2} p_z^2(1-p_z) + \binom{3}{3} p_z^3 \leq
336 | 0.352$.
337 | 
338 | Note that the first term is actually  $2p_z^2(1-p_z)$ because if the first test is correct then the algorithm would also give
339 | us a correct answer.
340 | 
341 | \subsection*{Exercise 1.19}
342 | 
343 | $A | B < A, A | B = A, A | B > A$
344 | 
345 | \subsection*{Exercise 1.20}
346 | 
347 | If $E_1, E_2, E_3,...,E_k$ are mutually independent then we know that for any subset $I \subseteq [1,k]$,
348 | 
349 | \begin{equation*}
350 | \P\left[\bigcap_{i\in I}E_i \right] = \prod_{i\in I}\P[E_i]
351 | \end{equation*}
352 | 
353 | We now want to prove that $\overline{E_1}, \overline{E_2}, \overline{E_3}...\overline{E_k}$ are mutually independent.
354 | So we want to prove that for any subset $I \subseteq [1,k]$,
355 | 
356 | \begin{equation*}
357 | \begin{split}
358 | \P\left[\bigcap_{i\in I}\bar{E_i} \right] = \prod_{i\in I}\P[\bar{E_i}] \\
359 | \end{split}
360 | \end{equation*}
361 | 
362 | By de morgan law, we know that $\cap_{i \in I} \overline{E_i} = \overline{\cup_{i \in I} E_i}$.
363 | 
364 | \begin{equation*}
365 | \begin{split}
366 | \P\left[\bigcap_{i\in I}\overline{E_i} \right] & = 1 - \P\left[\bigcup_{i \in I} E_i \right] \\
367 | & = 1 - \Bigg( \sum_{i\in I} \P(E_i) - \sum_{i<j \in I} \P(E_i \cap E_j) \\
368 | & \ \ \ \ + \sum_{i<j<k \in I} \P(E_i \cap E_j \cap E_k) \\
369 | & \ \ \ \ - ... + (-1)^{l+1} \sum_{i_1 < i_2 < ... < i_l \in I} \P( \bigcap_{r=1}^{l} E_{ir}) \Bigg) \\
370 | & = 1 - \Bigg( \sum_{i\in I} \P(E_i) - \sum_{i<j \in I} \P(E_i) \cdot \P(E_j) \\
371 | & \ \ \ \ + \sum_{i<j<k \in I} \P(E_i) \cdot \P(E_j) \cdot \P(E_k) \\
372 | & \ \ \ \ - ... + (-1)^{l+1} \sum_{i_1 < i_2 < ... < i_l \in I} \prod_{r=1}^{l}\P ( E_{ir} ) \Bigg) \\
373 | & = \prod_{i \in I}(1 - \P[E_i])\\
374 | & = \prod_{i\in I}\P[\overline{E_i}] \\ 
375 | \end{split}
376 | \end{equation*}
377 | 
378 | \subsection*{Exercise 1.21}
379 | 
380 | Suppose that we throw a fair four-sided die (you may think of this as a square
381 | die thrown in a two-dimensional universe). We may take $\Omega = \{1, 2,3,4\}$, where each $\omega \in \Omega$
382 | is equally likely to occur. The events $A = \{1, 2\}, B = \{1, 3\}, C = \{1,4\}$ are pairwise
383 | independent but not independent.
384 | 
385 | \subsection*{Exercise 1.22}
386 | 
387 | \noindent (a) Each possible subset has the same probability to be chosen, $1/2^n$. To generate this $X$,
388 | we know that for all elements in $X$, the probability that they got chosen is $1/2^{|X|}$. The probability
389 | that the others in $\{1,...,n\}$ are not chosen is $1/2^{(n-|X|)}$. Then the probability that we generate
390 | this $X$ is just $1/2^{n}$.
391 | 
392 | \noindent (b) 
393 | 
394 | We generate two sets $X$ and $Y$.
395 | 
396 | \begin{equation*}
397 | \P[X \subseteq Y] = \frac{\sum_{i=0}^{n} \binom{n}{i} \cdot 2^{i}}{2^n \cdot 2^{n}} = \frac{(2+1)^n}{4^n} = (\frac{3}{4})^n
398 | \end{equation*}
399 | 
400 | Another perspective is that, for each element, both $X$ and $Y$ have a probability $1/2$ to choose it. We also
401 | know that to make $X \subseteq Y$, either $Y$ chooses it or they both do not choose it. The probability is therefore
402 | $1/2 + 1/4 = 3/4$. There are $n$ elements. So the probability is $(3/4)^n$.
403 | 
404 | Next, we look at the probability that $X$ and $Y$ include all the elements.
405 | 
406 | \begin{equation*}
407 | \P[X \cup Y = \{1,...,n\}] = (3/4)^n
408 | \end{equation*}
409 | 
410 | For one element, it is in either $X$ or $Y$ with probability $3/4$. We require that all elements are
411 | either in $X$ or in $Y$.
412 | 
413 | \begin{equation*}
414 | \P\left[\bigcap_{\omega \in \{1,...,n\}} (\omega \in X \cup \omega \in Y) \right] = (3/4)^n
415 | \end{equation*}
416 | 
417 | Another version of the answer can be found here: \url{http://oak.conncoll.edu/cchung/coursework/hw1.pdf}.
418 | 
419 | \subsection*{Exercise 1.23}
420 | 
421 | By using the randomized min-cut algorithm, we can reduce the graph from $n$ vertices to just $2$ vertices.
422 | We take the edges between these two nodes as our min-cut set. There are at most $\binom{n}{2} = n(n-1)/2$
423 | such pairs of vertices. According to our analysis, some of them may be the min-cut sets we need while the
424 | others are not. Even if all of them are distinct min-cut sets, the number of min-cut sets would not exceed the number
425 | of such pairs. Hence, there are at most $n(n-1)/2$ distinct min-cut sets.
426 | 
427 | \subsection*{Exercise 1.24}
428 | 
429 | We know that to cut the graph into two connected components, we contract edges until there are only
430 | two vertices left. Hence, to cut the graph into $r$ connected components, we contract edges until there
431 | are only $r$ vertices left. 
432 | 
433 | We use the similar way as we did in finding a 2-way cut set. Assuming the size of the $r$-way cut set
434 | is $k$. Then the average degree should be at least $k/(r-1)$ (would be interesting to discuss, but should be easy). 
435 | The number of edges should be at least
436 | $nk/2(r-1)$.
437 | 
438 | 
439 | \begin{equation*}
440 | \P[F_{n-r}] \geq \prod_{i=1}^{n-r} \left(1- \frac{2(r-1)}{n-i+1}\right)
441 | \end{equation*}
442 | 
443 | This should stop at some point. Another similar answer can be found at
444 | \url{https://hkn.eecs.berkeley.edu/~chris/CS174/mt1reviewsoln.pdf}.
445 | 
446 | \subsection*{Exercise 1.25}
447 | 
448 | \noindent (a) The number of edge contractions should be $2(n-2)$. The probability that it finds a min-cut
449 | is $(1- p)^2$ where $p = 2/n(n-1)$.
450 | 
451 | \noindent (b) The number of edge contractions should be $(n-k) + l(k-2)$.
452 | 
453 | \begin{equation*}
454 | p = \frac{k(k-1)}{n(n-1)} \cdot \left(1 - \left(1- \frac{2}{k(k-1)}\right)^l\right) \approx \frac{k^2}{n^2}(1-e^{-\frac{2l}{k^2}})
455 | \end{equation*}
456 | 
457 | \noindent (c) Find optimal values of $k$ and $l$ for the variation in (b) that maximize the probability
458 | of finding a minimum cut while using the same number of edge contractions as running the original algorithm
459 | twice.
460 | 
461 | We know the condition is that the number of edge contractions is $(n-k) + l(k-2) = 2(n-2)$. Hence,
462 | $l = \frac{n+k-4}{k-2}$, and the above expression becomes roughly $\frac{k^2}{n^2}(1 - e^{-\frac{n+k}{k^3}})$.
463 | We claim that, up to constant factors, this is maximized by taking $k = n^{1/3}$. To see this, note that
464 | if $k \geq n^{1/3}$, since $1 - e^{-x} \leq x (x\geq 0)$, this expression is at most $\frac{k^2}{n^2}\frac{n+k}{k^3} = \frac{n+k}{kn^2}$.
465 | Similarly, if $k\leq n^{1/3}$,
466 | then $(1 - e^{-\frac{n+k}{k^3}})$ is at least a constant bounded away from zero, and hence, up to constant factors,
467 | the probability of success is $\Omega(n^{2/3}/n^2) = \Omega(n^{-4/3})$, which is significantly better than the
468 | $O(1/n^2)$ in (a).
469 | 
470 | 
471 | The solution to (c) is from here:
472 | 
473 | \url{theory.stanford.edu/~valiant/teaching/CS265_2015/ps1_sols.pdf}
474 | 
475 | \subsection*{Exercise 1.26}
476 |  
477 | Programming...


--------------------------------------------------------------------------------
/chapter2.tex:
--------------------------------------------------------------------------------
  1 | \chapter{Discrete Random Variables and Expectation}
  2 | 
  3 | \section{Comments on the main text}
  4 | 
  5 | \section{Exercises}
  6 | 
  7 | \subsection*{Exercise 2.1}
  8 | 
  9 | (k+1)/2
 10 | 
 11 | \subsection*{Exercise 2.2}
 12 | 
 13 | The probability that typing ``proof'' is $\frac{1}{26^5}$. Let $X_i$ be the random variable
 14 | that is 1 if the word ``proof'' starts at the position $i$. Therefore, by linearity of 
 15 | expectation
 16 | 
 17 | \begin{equation*}
 18 | \E[X] = \sum_{i=1}^{999996} \E[X_i] = \frac{999996}{26^5}
 19 | \end{equation*}
 20 | 
 21 | \subsection*{Exercise 2.3}
 22 | 
 23 | By Jensen's inequality, if $f$ is convex, then we know that $\E[f(X)] \geq f(\E[X])$.
 24 | If it is a linear function then we know that $\E[f(X)] = f(\E[X])$. If it is concave
 25 | then $\E[f(X)] \leq f(\E[X])$.
 26 | 
 27 | \subsection*{Exercise 2.4}
 28 | 
 29 | By Jensen's inequality, $\E[f(X)] \geq f(\E[X])$ for any convex functions $f$. If
 30 | $f$ is twice differentiable and its second derivative is non-negative, then $f$ is convex. For
 31 | $f(x) = x^k$, the second derivative is $f''(x) = k(k-1)x^{k-2}$. Since $k$ is an even number greater
 32 | than 1, we know that the second derivative is non-negative. Hence, it is convex. 
 33 | 
 34 | \subsection*{Exercise 2.5}
 35 | 
 36 | The probability that $X$ is even is that
 37 | 
 38 | \begin{equation*}
 39 | \begin{split}
 40 | \P[X \text{ is even}] & = \sum_{i=0,2,4...}^{n} \binom{n}{i}\left(\frac{1}{2}\right)^{n} \\
 41 | & = \left(\frac{1}{2}\right)^{n} \sum_{i=0,2,4...}^{n} \binom{n}{i}
 42 | \end{split}
 43 | \end{equation*}
 44 | 
 45 | We know that
 46 | 
 47 | \begin{equation*}
 48 | \begin{split}
 49 | \binom{n}{0} & = \binom{n-1}{0} \\
 50 | \binom{n}{1} & = \binom{n-1}{1} + \binom{n-1}{0} \\
 51 | \binom{n}{2} & = \binom{n-1}{2} + \binom{n-1}{1} \\
 52 | \binom{n}{3} & = \binom{n-1}{3} + \binom{n-1}{2} \\
 53 | ... \\
 54 | \binom{n}{n-2} & = \binom{n-1}{n-2} + \binom{n-1}{n-3} \\
 55 | \binom{n}{n-1} & = \binom{n-1}{n-1} + \binom{n-1}{n-2} \\
 56 | \binom{n}{n} & = \binom{n-1}{n-1} \\
 57 | \end{split}
 58 | \end{equation*}
 59 | 
 60 | We found that when $n$ is even
 61 | 
 62 | \begin{equation*}
 63 | \begin{split}
 64 | \sum_{i=0,2,4...}^{n} \binom{n}{i} & = \binom{n}{0} + \binom{n}{2} + \binom{n}{4} + ... + \binom{n}{n} \\
 65 | & = \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-1}{2} + ... + \binom{n-1}{n-1} \\
 66 | & = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + ... + \binom{n}{n - 1}
 67 | \end{split}
 68 | \end{equation*}
 69 | 
 70 | and when $n$ is odd
 71 | 
 72 | \begin{equation*}
 73 | \begin{split}
 74 | \sum_{i=0,2,4...}^{n} \binom{n}{i} & = \binom{n}{0} + \binom{n}{2} + \binom{n}{4} + ... + \binom{n}{n-1} \\
 75 | & = \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-1}{2} + ... + \binom{n-1}{n-1} \\
 76 | & = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + ... + \binom{n}{n}
 77 | \end{split}
 78 | \end{equation*}
 79 | 
 80 | Hence, $\sum_{i=0,2,4...}^{n} \binom{n}{i} = 2^{n-1}$. $\P[X \text{ is even}] = 1/2$.
 81 | 
 82 | \subsection*{Exercise 2.6}
 83 | 
 84 | Two six-sided dice. $X = X_1 + X_2$.
 85 | 
 86 | (a) $\E[X | X_1 \text{ is even}] = \frac{3}{18} + \frac{4}{18} + \frac{10}{18} + \frac{12}{18} + \frac{21}{18} + \frac{24}{18}
 87 | + \frac{18}{18} + \frac{20}{18} + \frac{11}{18} + \frac{12}{18}$.
 88 | 
 89 | (b) $\E[X | X_1 = X_2] = 7$.
 90 | 
 91 | (c) $\E[X_1 | X = 9] = 9/2$.
 92 | 
 93 | (d) $\E[X_1 - X_2 | X = k] = 0 $ for $k \in [2,12]$.
 94 | 
 95 | 
 96 | \subsection*{Exercise 2.7}
 97 | 
 98 | $X$ and $Y$ are independent geometric random variables, where $X$ has parameter $p$ and $Y$ has parameter $q$.
 99 | 
100 | (a) What is $\P[X = Y]$?
101 | 
102 | $\P[X = Y = n] = (1-p)^{n-1}p(1-q)^{n-1}q$. 
103 | 
104 | $\P[X=Y] = \sum_{n=1}^{\infty} (1-p)^{n-1}p(1-q)^{n-1}q
105 | = \sum_{n=1}^{\infty} pq[(1-p)(1-q)]^{n-1}$.
106 | 
107 | Since we know that for geometric random variables 
108 | $\P[X \geq i] = \sum_{n=i}^{\infty}(1-p)^{n-1}p = (1-p)^{i-1}$. So we have,
109 | 
110 | \begin{equation*}
111 | \begin{split}
112 | S = \P[X = Y] & = \sum_{n=1}^{\infty} pq[(1-p)(1-q)]^{n-1} \\
113 | \frac{(p+q-pq)S}{pq} & = \sum_{n=1}^{\infty}(1-p-q+pq)^{n-1}(p+q-pq)\\
114 | & = (1-p-q+pq)^{1-1} = 1\\
115 | S & = \frac{pq}{p+q-pq}
116 | \end{split}
117 | \end{equation*}
118 | 
119 | Hence, $\P[X = Y] = \frac{pq}{p+q-pq}$.
120 | 
121 | (b) What is $\E[\max(X,Y)]$?
122 | 
123 | From \textbf{MU 2.9} we know that $\E[\max (X,Y)] = \E[X] + \E[Y] - \E[\min(X,Y)]$.
124 | From part (c) we will know that $\min(X,Y)$ is a geometric random variable mean $p+q-pq$. 
125 | Therefore, $\E[\min(X,Y)] = \frac{1}{p+q-pq}$, and we get
126 | 
127 | \begin{equation*}
128 | 	\E[\max(X,Y)] = \frac{1}{p} + \frac{1}{q} - \frac{1}{p+q-pq}
129 | \end{equation*}
130 | 
131 | (c) What is $\P[\min(X,Y) = k]$?
132 | 
133 | We split this event into two disjoint events.
134 | 
135 | \begin{equation*}
136 | \begin{split}
137 | \P[\min(X,Y) = k] & = \P[X = k, Y \geq k] + \P[X > k, Y = k] \\
138 | & = \P[X = k] \cdot \P[Y \geq k] + \P[X > k] \cdot \P[Y = k] \\
139 | & = (1-p)^{k-1}p \cdot \sum_{n=k}^{\infty}(1-q)^{n-1}q + \left(\sum_{m=k+1}^{\infty}(1-p)^{m-1}p\right) \cdot (1-q)^{k-1}q \\
140 | & = (1-p)^{k-1}p \cdot (1-q)^{k-1} + (1-p)^{k} \cdot (1-q)^{k-1}q \\
141 | & = (1-p)^{k-1}(1-q)^{k-1}(p + (1-p)q) \\
142 | & = [(1 - (p+q-pq)]^{k-1}(p+q-pq)
143 | \end{split}
144 | \end{equation*}
145 | 
146 | (d) What is $\E[X | X \leq Y]$?
147 | 
148 | \begin{equation*}
149 | \begin{split}
150 | \E[X | X \leq Y] & = \sum_{x} x \frac{\P[X = x \cap x \leq Y]}{\P[X \leq Y]}\\
151 | \end{split}
152 | \end{equation*}
153 | 
154 | We investigate our denominator,
155 | 
156 | \begin{equation*}
157 | \begin{split}
158 | \P[X \leq Y] & = \sum_{x = 1}^{\infty}\P[Y \geq x] \P[X = x]\\
159 | & = \sum_{x = 1}^{\infty}(1-q)^{x-1} (1-p)^{x-1}p \\
160 | & = p \sum_{x=1}^{\infty}(1-p-q+pq)^{x-1} \\
161 | & = \frac{p}{p+q-pq}
162 | \end{split}
163 | \end{equation*}
164 | 
165 | Now the whole equation is
166 | 
167 | \begin{equation*}
168 | \begin{split}
169 | \E[X | X \leq Y] & = \sum_{x} x \frac{\P[X = x \cap x \leq Y]}{\P[X \leq Y]}\\
170 | & = \frac{p+q-pq}{p} \cdot \sum_x x \P[X = x \cap x \leq Y]\\
171 | & = \frac{p+q-pq}{p} \cdot \sum_x x \P[X = x]\cdot \P[x \leq Y]\\
172 | & = \frac{p+q-pq}{p} \cdot \sum_x x (1-p)^{x-1}p \cdot (1-q)^{x-1}\\
173 | & = \sum_{x} x (1-p-q+pq)^{x-1}(p+q-pq)
174 | \end{split}
175 | \end{equation*}
176 | 
177 | This is equivalent to the expectation of a geometric random variable with mean $p + q - pq$.
178 | Hence, $\E[X | X \leq Y] = \frac{1}{p+q-pq}$.
179 | 
180 | \subsection*{Exercise 2.8}
181 | 
182 | Let $G$ be the random variable that represents the number of girls they have and let $B$ be the number 
183 | of girls they have.
184 | 
185 | (a) $\E[G] = 0 \cdot (\frac{1}{2})^k + 1 \cdot (\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + ... + \frac{1}{2^{k}})
186 | = \frac{2^{k}-1}{2 ^k}$.
187 | 
188 | $\E[B] = \sum_{n = 0}^{k - 1} \frac{n}{2^{n+1}} + \frac{k}{2^k} = \frac{2^k -1}{2^k}$.
189 | 
190 | (b) We know that the expectation of the number of their children is $2$ since now it is a standard
191 | geometric random variable. The expected number of girls they can have is
192 | 
193 | \begin{equation*}
194 | 	\E[G] = \sum_{i = 1}^{\infty}\frac{1}{2^i} = 1
195 | \end{equation*}
196 | 
197 | Therefore, $\E[B] = 1$.
198 | 
199 | \subsection*{Exercise 2.9}
200 | 
201 | (a) 
202 | 
203 | \begin{equation*}
204 | \begin{split}
205 | \E[\max(X_1,X_2)] & = \sum_{x_1}\sum_{x_2}\max (x_1, x_2)(1/k)(1/k) \\
206 | & = 1/k^2 \sum_{x_1}\sum_{x_2 \leq x_1} x_1 + \sum_{x_2 > x_1}x_2
207 | \end{split}
208 | \end{equation*}
209 | 
210 | \begin{equation*}
211 | \begin{split}
212 | \E[\min(X_1,X_2)] & = \sum_{x_1}\sum_{x_2}\min (x_1, x_2)(1/k)(1/k) \\
213 | & = 1/k^2 \sum_{x_1}\sum_{x_2 \leq x_1} x_2 + \sum_{x_2 > x_1}x_1
214 | \end{split}
215 | \end{equation*}
216 | 
217 | (b)
218 | 
219 | \begin{equation*}
220 | \begin{split}
221 | \E[\max(X_1,X_2)] + \E[\min(X_1,X_2)] & = 1/k^2 \sum_{x_1}\sum_{x_2 \leq x_1} x_1 + \sum_{x_2 > x_1}x_2 
222 | + 1/k^2 \sum_{x_1}\sum_{x_2 \leq x_1} x_2 + \sum_{x_2 > x_1}x_1\\
223 | & = 1/k^2 \sum_{x_1} \left( \sum_{x_2}x_2 + \sum_{x_2} x_1 \right) \\
224 | & = 1/k^2 \sum_{x_1}\sum_{x_2}(x_2 + x_1) \\
225 | & = \E[X_1] + \E[X_2]
226 | \end{split}
227 | \end{equation*}
228 | 
229 | (c)
230 | 
231 | By linearity of expectation,
232 | 
233 | \begin{equation*}
234 | \begin{split}
235 | \E[\max(X_1,X_2)] + \E[\min(X_1,X_2)] & = \E[\max(X_1, X_2) + \min(X_1, X_2)] \\
236 | & = \E[X_1 + X_2]\\
237 | & = \E[X_1] + \E[X_2]
238 | \end{split}
239 | \end{equation*}
240 | 
241 | 
242 | \subsection*{Exercise 2.10}
243 | 
244 | (a)
245 | 
246 | When $n = 1$, there is only one $\lambda = 1$.
247 | 
248 | \begin{equation*}
249 | \begin{split}
250 | f(\lambda x) \leq \lambda f(x)
251 | \end{split}
252 | \end{equation*}
253 | 
254 | Inductive step, assuming for $n = k$, $f(\sum_{i=1}^{k}\lambda_i x_i) \leq \sum_{i=1}^{k}\lambda_i f(x_i)$.
255 | Then we have, 
256 | 
257 | \begin{equation*}
258 | \begin{split}
259 | f(\sum_{i=1}^{k+1}\lambda_i x_i) & = f(\sum_{i=1}^{k-1}\lambda_i x_i + \lambda_k x_k + \lambda_{k+1} x_{k+1})\\
260 | & = f(\sum_{i=1}^{k-1}\lambda_i x_i + (\lambda_k + \lambda_{k+1})(\frac{\lambda_k x_k}{\lambda_k + \lambda_{k+1}} 
261 | + \frac{\lambda_{k+1} x_{k+1}}{\lambda_k + \lambda_{k+1}}))\\
262 | & = f(\sum_{i=1}^{k-1}\lambda_i x_i + \lambda_k' x_k') \\
263 | & \leq \sum_{i=1}^{k}\lambda_i f(x_i) \\
264 | & = \sum_{i=1}^{k-1}\lambda_i f(x_i) + \lambda_{k}'f(x_k') \\
265 | & = \sum_{i=1}^{k-1}\lambda_i f(x_i) + (\lambda_k + \lambda_{k+1})f(\frac{\lambda_k x_k}{\lambda_k + \lambda_{k+1}} 
266 | + \frac{\lambda_{k+1} x_{k+1}}{\lambda_k + \lambda_{k+1}}) \\
267 | & \leq \sum_{i=1}^{k-1}\lambda_i f(x_i) + (\lambda_k + \lambda_{k+1})(\frac{\lambda_k}{\lambda_k + \lambda_{k+1}}f(x_k) 
268 | + \frac{\lambda_{k+1}}{\lambda_k + \lambda_{k+1}}f(x_{k+1})) \\
269 | & = \sum_{i=1}^{k-1}\lambda_i f(x_i) + \lambda_k f(x_k) 
270 | + \lambda_{k+1} f(x_{k+1}) \\
271 | & = \sum_{i = 1}^{k+1} \lambda_i f(x_i)
272 | \end{split}
273 | \end{equation*}
274 | 
275 | (b)
276 | 
277 | If $X$ can only take finitely many values, then we can write its probability distribution as $\sum_{i=1}^{n}\lambda_i = 1$
278 | where $\lambda_i$ is the probability that $X$ takes the $i$th value in Im$X$.
279 | 
280 | \subsection*{Exercise 2.11}
281 | 
282 | Prove Lemma 2.6.
283 | 
284 | \subsection*{Exercise 2.12}
285 | 
286 | The expected number of cards we must draw to see all cards is
287 | 
288 | \begin{equation*}
289 | \begin{split}
290 | \E[X] = \sum \E[X_i] = n \sum_{i=1}^{n} \frac{1}{i}
291 | \end{split}
292 | \end{equation*}
293 | 
294 | If we draw $2n$ cards, what is the expected number of cards in the deck that are not chosen at all?
295 | 
296 | The idea is to use indicator random variables. 
297 | The probability that the $i$th card is not chosen is $(\frac{n-1}{n})^{2n}$. 
298 | 
299 | \begin{equation*}
300 | \begin{split}
301 | \E[X] = \sum \E[X_i] = n (\frac{n-1}{n})^{2n}
302 | \end{split}
303 | \end{equation*}
304 | 
305 | Chosen exactly once?
306 | 
307 | The probability that the $i$th card is chosen only once is 
308 | $\binom{2n}{1}(\frac{1}{n})(\frac{n-1}{n})^{2n-1}$.
309 | 
310 | \begin{equation*}
311 | \begin{split}
312 | \E[X] = \sum \E[X_i] = 
313 | n \binom{2n}{1}(\frac{1}{n})(\frac{n-1}{n})^{2n-1}
314 | = 2n (\frac{n-1}{n})^{2n-1}
315 | \end{split}
316 | \end{equation*}
317 | 
318 | \subsection*{Exercise 2.13}
319 | 
320 | (a)
321 | 
322 | This problem is exactly equivalent to the coupon collector’s problem we did; assume that the
323 | relevant pairs of coupons form a label. 
324 | 
325 | (b) No matter what $k$ is, we can always get $n\ln n + \Theta(n)$.
326 | 
327 | \subsection*{Exercise 2.14}
328 | 
329 | The last toss must be head, so for the previous $n-1$ tosses we know there are $k-1$ heads.
330 | Hence, the probability is $\binom{n-1}{k-1} p^k(1-p)^{n-k}$.
331 | 
332 | \subsection*{Exercise 2.15}
333 | 
334 | Let $X_i$ be the number of coin flips for the next head. 
335 | 
336 | \begin{equation*}
337 | \begin{split}
338 | \E[X] & = \E[\sum_{i = 1}^{k} X_i] \\
339 | & = \sum_{i=1}^{k} \E[X_i] \\
340 | & = \sum_{i=1}^{k} 1/p = k/p
341 | \end{split}
342 | \end{equation*}
343 | 
344 | \subsection*{Exercise 2.16}
345 | 
346 | (a)
347 | 
348 | Let $X_i$ be an indicator random variable which gets 1 if there is a streak of $\log n + 1$ starting
349 | from the $i$th flip. Its expectation is $(1/2)^{\log n} = 1/n$ because if a streak starts from the
350 | $i$th flip, the next $\log n$ flips must be the same as the $i$th one. Hence the expectation of the number
351 | of streaks of length $\log n + 1$ is $(n - \log n)(1/n) = 1 - o(1)$.
352 | 
353 | 
354 | \noindent (b)
355 | 
356 | In other words, we want to prove that with high probability there is at least one streak with length
357 | at least $\floor{\log n - 2\log \log n}$.
358 | 
359 | Let $k = \floor{\log n - 2\log \log n}$. We break the sequence into disjoint blocks of $k$ consecutive
360 | flips. There are $\floor{n/k}$ such blocks. For the sequence of $n$ flips to not contain a streak of
361 | $k$ flips (denote this event by $A$) it is necessary that none of the blocks contains a streak of length
362 | $k$ (denote this event by $B$). Thus we have $\P[A] \leq \P[B]$. 
363 | 
364 | The probability that a single block does not contain a streak is $1 - (1/2)^{k-1}$. Since the blocks
365 | are disjoint and independent, the probability that none of the blocks contains a streak is
366 | 
367 | \begin{equation*}
368 | 	\P[B] = \left(1 - \left(\frac{1}{2}\right)^{k-1} \right)^{\floor{n/k}}
369 | \end{equation*}
370 | 
371 | Now, since $k - 1 \leq \log n - 2 \log n \log n$ and $\floor{n/k} \leq n/\log n - 1$, we get
372 | 
373 | \begin{equation*}
374 | \begin{split}
375 | \P[B] & \leq \left(1 - \left(\frac{1}{2}\right)^{\log n - 2\log \log n}  \right)^{n/\log n - 1} \\
376 | & = \left(1 - \frac{\log^2 n}{n}\right)^{n/\log n - 1} \\
377 | & \leq \left(\exp\left(- \frac{\log^2 n}{n}\right)\right)^{n/\log n - 1} \\
378 | & = \exp\left(-\frac{\log^2 n}{n}\left(\frac{n}{\log n} - 1 \right)\right) \\
379 | & = \exp\left(-\log n \left(1 - \frac{\log n}{n}\right)\right)
380 | \end{split}
381 | \end{equation*}
382 | where the second inequality is based on the fact that $1 - x \leq e^{-x}$. Let $g(n) = 1 - \log n/n$.
383 | Since $g(4) = 3/4 > 1/\log e$ and since $g(n)$ is increasing for $n \geq 4$ as the derivative $g'(n) = 
384 | n^{-2}(\log n - 1/(\ln 2)) > 0$ when $n \geq 4$, for $n \geq 4$ we have that $g(n) \geq 1/\log e$ and 
385 | therefore
386 | 
387 | \begin{equation*}
388 | 	\P[B] \leq \exp \left( - \frac{\log n}{\log e} \right) = \exp(-\ln n) = \frac{1}{n}
389 | \end{equation*}
390 | 
391 | Note that here we do not try to find out the exact probability of the event described in the problem($A$).
392 | Instead, we find another event $B$ which has higher probability than $A$. The probability of $B$ is 
393 | easier to bound.  
394 | 
395 | The answer can be also found in \url{https://www.cs.helsinki.fi/u/mkhkoivi/teaching/RA-I/solutions1.pdf}
396 | 
397 | \subsection*{Exercise 2.17}
398 | 
399 | We see that $\E[Y_0] = 1$ and $\E[Y_1] = 2p$. Next for $i \geq 1$, we have
400 | \begin{equation*}
401 | 	\E[Y_i | Y_{i-1} = j] = 2pj
402 | \end{equation*}
403 | so that
404 | 
405 | \begin{equation*}
406 | 	\E[Y_i] = \E[\E[Y_i | Y_{i-1}]] = 
407 | 	\sum_{j} \P[Y_{i-1} = j] 2pj = 2p \E[Y_{i-1}]
408 | \end{equation*}
409 | 
410 | Thus, we have $\E[Y_i] = (2p)^i$. When $p < 1/2$, this probability is bounded.
411 | 
412 | \subsection*{Exercise 2.18}
413 | 
414 | Use Induction. 
415 | 
416 | Let $b_1, b_2,...,b_n$ be the values of the items observed at time $b_t$. We will prove this by
417 | induction. Let $M_t$ be a random variable that takes the value of the item in memory at time $t$.
418 | We need to show that at time $t$, $\P[M_t = b_i] = 1/t$ for all $1 \leq i \leq t$.
419 | 
420 | The base case is when $t = 1$, which is trivially true since $M_t = b_1$ with probability 1. Assume
421 | that at time $t$, $\P[M_t = b_i] = 1/t$ for all $1 \leq i \leq t$. Now we prove that this
422 | property holds for time $t + 1$. At time $t+1$, we set $M_{t+1} = b_{t+1}$ with probability $1/(t+1)$.
423 | Therefore, $\P[M_{t+1} = b_{t+1}] = 1/(t+1)$. For the rest, $1 \leq i \leq t$,
424 | 
425 | \begin{equation*}
426 | \begin{split}
427 | \P[M_{t+1} = b_i] & = \P[\text{no swat at time }t \text{ and } M_t = b_i] \\
428 | & = \P[\text{no swap at time } t] \P[M_t = b_i] \\
429 | & = \frac{t}{t+1} \frac{1}{t} \\
430 | & = \frac{1}{t+1}
431 | \end{split}
432 | \end{equation*}
433 | 
434 | The answer can be found at \url{https://inst.eecs.berkeley.edu/~cs174/fa10/sol2.pdf}.
435 | 
436 | \subsection*{Exercise 2.19}
437 | 
438 | If in 2.18, when the $k$th item appears, it replaces the item in memory with probability $1/2$.
439 | Describe the distribution of the item in memory.
440 | 
441 | In this version of the algorithm, let $X$ be random variable which is equal to $i$ if and only if the item
442 | in memory is the $i$th item.
443 | 
444 | Then the probability distribution should be: for $i = 2,...,n$. $\P[X = i] = (\frac{1}{2})^{n-i+1}$.
445 | For the first item, the probability is $\P[X = 1] = (\frac{1}{2})^{n-1}$.
446 | 
447 | \subsection*{Exercise 2.20}
448 | 
449 | There are $n$ positions, each position can either be a fixed point or not.
450 | 
451 | Let $X_i$ be an indicator showing whether the $i$th position is a fixed point.
452 | The permutation may put $i$ still at $i$ with probability $1/n$ since it has
453 | $n$ choices. So we have $\sum_{i=1}^{n}\E[X_i] = n \cdot 1/n = 1$.
454 | 
455 | The solution can also be found at \url{http://www.cs.nthu.edu.tw/~wkhon/random/assignment/assign1ans.pdf}.
456 | 
457 | \subsection*{Exercise 2.21}
458 | 
459 | \begin{equation*}
460 | \begin{split}
461 | \E[\sum_{i=1}^{n} |a_i -i|] & = \sum_{i=1}^{n} \E[|a_i - i|] \\
462 | & = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{1}{n} |j-i| \\
463 | & = \sum_{i=1}^{n} \frac{1}{n} (\sum_{p = 1}^{i-1} p + \sum_{q=1}^{n-i}q) \\
464 | & = \sum_{i=1}^{n} \frac{2i^2 + n^2 + n - 2i - 2ni}{2n} \\
465 | & = \sum_{i=1}^{n} (\frac{i^2-i}{n} + \frac{n + 1}{2} - i) \\
466 | & = \sum_{i=1}^{n} \frac{i^2 - i}{n} \\
467 | & = \frac{n^2 - 1}{3}
468 | \end{split}
469 | \end{equation*}
470 | 
471 | \subsection*{Exercise 2.22}
472 | 
473 | If $i < j$ and $a_i > a_j$, let $Y_{ij}$ be an indicator random variable.
474 | Let $Y = \sum_{i<j} Y_{ij}$. What we want is just the expectation of $Y$.
475 | 
476 | \begin{equation*}
477 | \begin{split}
478 | \E[Y] & = \sum_{i<j} \E[X_{ij}] \\
479 | & = \sum_{i<j} \P[a_i > a_ j] \\
480 | & = \sum_{i<j} \frac{1}{2} \\
481 | & = \frac{(n-1)n}{4}
482 | \end{split}
483 | \end{equation*}
484 | 
485 | A similar solution can be seen at \url{http://www.cs.ox.ac.uk/people/varun.kanade/teaching/CS174-Fall2012/HW/HW2_sol.pdf}.
486 | 
487 | \subsection*{Exercise 2.23}
488 | 
489 | Let $X_i$ be the number of swaps needed when the $i$th element is inserted.
490 | 
491 | We know that $\E[X_i] = \frac{i - 1}{2}$.
492 | 
493 | \begin{equation*}
494 | \begin{split}
495 | \E[X] & = \sum_{i=1}^{n} \E[X_k] \\
496 | & = \sum_{i=1}^{n} \frac{i - 1}{2} \\
497 | & = \frac{n^2-n}{4}
498 | \end{split}
499 | \end{equation*}
500 | 
501 | \subsection*{Exercise 2.24}
502 | 
503 | We use the memoryless property. This is a very important trick. 
504 | There are two cases for us to discuss: 
505 | \begin{enumerate}
506 | 	\item if the first is not 6, then we forget it
507 | 	and start from the next. The expected number of flips should be $1 + \E[X]$.
508 | 	The corresponding probability for this case is $5/6$.
509 | 	\item if the first is 6, then we look at the second one. If it is not 6, then we forget it
510 | 	and keep going. If it is 6, we count it as 2.
511 | \end{enumerate}
512 | 
513 | Let $A_i$ means the event that $i$th flip is six. $\P[A_i] = 1/6$.
514 | 
515 | \begin{equation*}
516 | \begin{split}
517 | \E[X] & = \P[\overline{A_1}]\E[X | \overline{A_1}]
518 | + \P[A_1]\E[X|A_1] \\
519 | & = \P[\overline{A_1}]\E[X | \overline{A_1}] + 
520 | \P[A_1](\P[\overline{A_2} | A_1] \E[X|A_1,\overline{A_2}] +  
521 | \P[A_2 | A_1] \E[X|A_1,A_2]) \\
522 | & = \P[\overline{A_1}]\E[X | \overline{A_1}] + 
523 | \P[A_1](\P[\overline{A_2}] \E[X|A_1,\overline{A_2}] +  
524 | \P[A_2] \E[X|A_1,A_2]) \\
525 | & = \frac{5}{6}\E[X | \overline{A_1}] + 
526 | \frac{1}{6}(\frac{5}{6} \E[X|A_1,\overline{A_2}] +  
527 | \frac{1}{6} \E[X|A_1,A_2]) \\
528 | \end{split}
529 | \end{equation*}
530 | 
531 | Hence we know that,
532 | 
533 | $\E[X] = \frac{5}{6}(1 + \E[X]) + \frac{1}{6}(\frac{5}{6}(2+\E[X]) + \frac{1}{6}\cdot 2)
534 | = \frac{35}{36}\E[X] + \frac{42}{36}$.
535 | 
536 | \subsection*{Exercise 2.25}
537 | 
538 | \subsection*{Exercise 2.26}
539 | 
540 | These directed arcs will generate disjoint cycles because a node can only be an end point for two
541 | edges. 
542 | 
543 | Let $X_i^k$ be an indicator variable which is 1 if vertex $i$ belongs to a cycle of length $k$.
544 | Let $Y_k = \sum_{i=1}^{n} X_n^k$ be the total number of nodes belonging to a $k$-cycle and let
545 | $N_k$ be the number of $k$-cycles in the graph. By definition we must have $N_k = Y_k/k$
546 | because these cycles are disjoint. Finally
547 | let $N$ be the total number of cycles in the graph, which is $N = \sum_{k=1}^{n}N_k$.
548 | 
549 | We want to know how many permutations $\pi$ there are such that $X_i^k = 1$. If vertex $i$ belongs
550 | to a $k$-cycle, then the next vertex $j = \pi(i)$ can not be vertex $i$, vertex $\pi(j)$ can be neither
551 | $i$ nor $j$, and so on, until the $k$th vertex is again $i$. Thus, we can choose the consecutive $k-1$
552 | vertices on the same cycle in $(n-1)(n-2)...(n-k+1)$ ways. The remaining $n-k$ vertices can be in any
553 | order in $\pi$, so the number of possibilities is $(n-k)(n-k-1)...2\cdot 1$. The total number of permutations
554 | such that $X_i^k = 1$ therefore $(n-1)!$. As there are $n!$ permutations in total, the probablity of such event
555 | is $\P[X_i^k = 1] = (n-1)!/n! = 1/n$.
556 | 
557 | By using the linearity of expectations twice, we can get the expected number of cycles.
558 | 
559 | \begin{equation*}
560 | \E[N] = \sum_{k=1}^{n}\E[N_k] = \sum_{k=1}^{n}\frac{1}{k}\E[Y_k]
561 | = \sum_{k=1}^{n}\frac{1}{k} \sum_{i=1}^{n}\E[X_i^k] = 
562 | \sum_{k=1}^{n}\frac{1}{k} \sum_{i=1}^{n}\frac{1}{n} = \sum_{k=1}^{n}\frac{1}{k} = H(n)
563 | \end{equation*}
564 | 
565 | \subsection*{Exercise 2.27}
566 | 
567 | \begin{equation*}
568 | \begin{split}
569 | \E[X] & = \sum_{x=1}^{\infty} x \P[X = x] \\ 
570 | & = \sum_{x=1}^{\infty} \frac{6}{\pi^2} \frac{1}{x}
571 | \end{split}
572 | \end{equation*}
573 | 
574 | This sum diverges since $\sum \frac{1}{x}$ diverges, so the expectation is not finite.
575 | 
576 | \subsection*{Exercise 2.28}
577 | 
578 | \subsection*{Exercise 2.29}
579 | 
580 | \subsection*{Exercise 2.30}
581 | 
582 | \subsection*{Exercise 2.31}
583 | 
584 | \subsection*{Exercise 2.32}
585 | 
586 | (a)
587 | 
588 | $E_i$s are disjoint events, therefore $\P[E] = \sum_{i=1}^{n} \P[E_i]$. For
589 | $i \leq m$, $\P[E_i] = 0$, since none of them are selected. Now, for $i>m$ two indepedent
590 | events make up $E_i$.
591 | 
592 | \begin{equation*}
593 | \begin{split}
594 | \P[E_i] & = \P[\text{the } i \text{th candidate is the best}] \cdot 
595 | \P[\text{the } i \text{th candidate is chosen}] \\
596 | & = \frac{1}{n} \cdot \P[\text{best of the first } i-1 \text{ candidates is in the first }m \text{ candidates}]\\
597 | & = \frac{1}{n} \frac{m}{i-1}
598 | \end{split}
599 | \end{equation*}
600 | 
601 | Now, putting this all together, we get
602 | 
603 | \begin{equation*}
604 | \begin{split}
605 | \P[E] = \sum_{i=m+1}^{n} \P[E_i] = \frac{m}{n}\sum_{i=m+1}^{n}\frac{1}{i-1}
606 | \end{split}
607 | \end{equation*}
608 | 
609 | (b)
610 | 
611 | Using Lemma 2.10 from the book,
612 | 
613 | \begin{equation*}
614 | 	\P[E] \geq \frac{m}{n}\int_{m+1}^{n+1}\frac{1}{x-1}dx = 
615 | 	\ln(x-1)|_{m+1}^{n+1} = \frac{m}{n}(\ln n - \ln m)
616 | \end{equation*}
617 | 
618 | and
619 | 
620 | \begin{equation*}
621 | \P[E] \leq \frac{m}{n}\int_{m}^{n}\frac{1}{x-1}dx = 
622 | \ln(x-1)|_{m}^{n} = \frac{m}{n}(\ln (n-1) - \ln (m-1))
623 | \end{equation*}
624 | 
625 | 
626 | (c)
627 | 
628 | Since the bound from above is concave, we can take the derivative to find the $m$
629 | so that the function is maximized. 
630 | 
631 | \begin{equation*}
632 | \frac{\mathrm{d}}{\mathrm{d} m}\frac{m}{n}(\ln n - \ln m) = \frac{\ln n}{n} - \frac{\ln m}{n} + \frac{1}{n} = 0
633 | \end{equation*}
634 | 
635 | Then we get $\ln m = \ln n - 1$, which is 
636 | 
637 | \begin{equation*}
638 | m = e^{\ln n -1} = e^{\ln n}e^{-1} = ne^{-1} = \frac{n}{e}
639 | \end{equation*}
640 | 
641 | Substituting this $m$ back into the bound from part (b), we get
642 | 
643 | \begin{equation*}
644 | \P[E] \geq \frac{1}{e} (\ln n - \ln \frac{n}{e}) = 1/e
645 | \end{equation*}
646 | 
647 | 
648 | 
649 | 
650 | 
651 | 
652 | 
653 | 
654 | 
655 | 
656 | 
657 | 
658 | 
659 | 
660 | 
661 | 
662 | 
663 | 
664 | 
665 | 
666 | 
667 | 
668 | 
669 | 
670 | 
671 | 
672 | 
673 | 
674 | 
675 | 
676 | 
677 | 


--------------------------------------------------------------------------------
/chapter3.tex:
--------------------------------------------------------------------------------
1 | \chapter{Moments and Deviations}
2 | 


--------------------------------------------------------------------------------
/chapter4.tex:
--------------------------------------------------------------------------------
1 | \chapter{Chernoff and Hoeffding Bounds}
2 | 


--------------------------------------------------------------------------------
/chapter5.tex:
--------------------------------------------------------------------------------
1 | \chapter{Balls, Bins and Random Graphs}
2 | 


--------------------------------------------------------------------------------
/chapter6.tex:
--------------------------------------------------------------------------------
1 | \chapter{The Probabilistic Method}
2 | 


--------------------------------------------------------------------------------
/chapter7.tex:
--------------------------------------------------------------------------------
1 | \chapter{Markov Chains and Random Walks}
2 | 
3 | 
4 | 
5 | 


--------------------------------------------------------------------------------
/mitzenmacher-and-upfal-solutions.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vkomini/mitzenmacher-upfal-solutions/d70b0af7e74ad5b98c332c178f0b75d9341165ac/mitzenmacher-and-upfal-solutions.pdf


--------------------------------------------------------------------------------
/mitzenmacher-and-upfal-solutions.tex:
--------------------------------------------------------------------------------
 1 | \documentclass{book}
 2 | \usepackage[utf8]{inputenc}
 3 | \usepackage[letterpaper, margin=1in ]{geometry}
 4 | 
 5 | \title{Mitzenmacher \& Upfal (2nd Edition) Solutions}
 6 | \author{Leran Cai \and Hayk Saribekyan}
 7 | 
 8 | 
 9 | \usepackage{natbib}
10 | \usepackage{graphicx}
11 | \usepackage{amssymb}
12 | \usepackage[utf8]{inputenc}
13 | \usepackage[english]{babel}
14 | \usepackage{amsmath}
15 | \usepackage{amsthm}
16 | \usepackage{algorithm}
17 | \usepackage[noend]{algpseudocode}
18 | \usepackage{bm}
19 | \usepackage{array}
20 | \usepackage{tikz}
21 | \usepackage{tikz-qtree}
22 | \usepackage{pgf,tikz}
23 | \usepackage{mathrsfs}
24 | \usetikzlibrary{arrows}
25 | \usepackage{pgfplots}
26 | \usepackage{booktabs,makecell}
27 | \usepackage{rotating}
28 | \usepackage{hyperref}
29 | \usepackage{mathtools}
30 | \DeclarePairedDelimiter\ceil{\lceil}{\rceil}
31 | \DeclarePairedDelimiter\floor{\lfloor}{\rfloor}
32 | 
33 | \newtheorem{theorem}{Theorem}[chapter]
34 | \newtheorem{corollary}{Corollary}[theorem]
35 | \newtheorem{lemma}[theorem]{Lemma}
36 | \theoremstyle{definition}
37 | \newtheorem{definition}{Definition}[chapter]
38 | \theoremstyle{remark}
39 | \newtheorem*{remark}{Remark}
40 | 
41 | \renewcommand{\P}{\mathbb{P}}
42 | \newcommand{\E}{\mathbb{E}}
43 | 
44 | \begin{document}
45 | 
46 | \maketitle
47 | 
48 | \tableofcontents
49 | 
50 | \input{chapter1}
51 | \input{chapter2}
52 | \input{chapter3}
53 | \input{chapter4}
54 | \input{chapter5}
55 | \input{chapter6}
56 | \input{chapter7}
57 | 
58 | \end{document}
59 | 


--------------------------------------------------------------------------------