├── lecture-note ├── README.md ├── 1.2.md ├── 1.7.md └── 1.1.md ├── supplements ├── README.md ├── taylor-linearization.md ├── var-cov-matrix.md └── matrix-multiplication.md ├── exercise-solution ├── README.md ├── 1.4.md ├── 1.3.md ├── 1.1.md └── 1.2.md ├── question-solution ├── README.md ├── 1.3.1.md ├── 1.3.3.md ├── 1.2.2.md ├── 1.7.6.md ├── 1.4.6.md ├── 1.3.5.md ├── 1.2.1.md ├── 1.1.5.md ├── 1.6.1.md ├── 1.2.6.md ├── 1.2.8.md ├── 2.1.4.md ├── 1.6.4.md ├── 1.4.2.md ├── 1.3.2.md ├── 2.2.4.md ├── 1.7.5.md ├── 1.3.6.md ├── 1.4.4.md ├── 1.1.1.md ├── 1.4.1.md ├── 1.3.7.md ├── 1.4.7.md ├── 1.1.2.md ├── 1.2.7.md ├── 1.7.2.md ├── 1.7.3.md ├── 2.2.1.md ├── 2.1.1.md ├── 1.4.3.md ├── 1.1.6.md ├── 2.1.3.md ├── 1.6.2.md ├── 2.2.2.md ├── 1.7.4.md ├── 1.7.7.md ├── 1.6.3.md ├── 2.2.3.md ├── 1.5.5.md ├── 1.4.5.md ├── 1.5.3.md ├── 1.1.3.md ├── 1.2.5.md ├── 1.7.8.md ├── 1.5.1.md ├── 2.1.5.md ├── 1.1.4.md ├── 1.2.9.md ├── 1.5.4.md ├── 1.3.4.md ├── 1.2.4.md ├── 1.7.1.md ├── 1.2.3.md ├── 1.5.2.md └── 2.1.2.md ├── book.json ├── SUMMARY.md └── README.md /lecture-note/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /supplements/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /exercise-solution/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /question-solution/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /book.json: -------------------------------------------------------------------------------- 1 | { 2 | "plugins": ["mathjax"], 3 | "pluginsConfig": { 4 | "mathjax": { 5 | "forceSVG": false 6 | } 7 | } 8 | } -------------------------------------------------------------------------------- /question-solution/1.3.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.1 (Role of the no-multicollinearity assumption) 14 | 15 | In Proposition 1.1 and 1.2, where did we use Assumption 1.3 that $$ \mathrm{rank} ( \mathbf{X} ) = K $$? 16 | 17 | ##### Solution 18 | 19 | We need the no-multicollinearity condition to make sure $$ \mathbf{X}' \mathbf{X} $$ is invertible. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.3 (What Gauss-Markov does not mean) 14 | 15 | Under Assumptions 1.1–1.4, does there exist a linear, but not necessarily unbiased, estimator of $$ \boldsymbol{\beta} $$ that has a variance smaller than that of the OLS estimator? If so, how small can the variance be? 
16 | 17 | ##### Solution 18 | 19 | If an estimator of $$ \boldsymbol{\beta} $$ is a constant, then the estimator is trivially linear in $$ \mathbf{y} $$, and its variance is zero. So the answer is yes: once unbiasedness is not required, the variance can be made as small as zero, although such a constant estimator is biased unless the constant happens to equal $$ \boldsymbol{\beta} $$. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.2 14 | 15 | Verify that $$ \mathbf{X}' \mathbf{X} / n = \frac{1}{n} \sum_i \mathbf{x}_i \mathbf{x}_i' $$ and $$ \mathbf{X}' \mathbf{y} / n = \frac{1}{n} \sum_i \mathbf{x}_i y_i $$. 16 | 17 | ##### Solution 18 | 19 | By the [outer-product form](../supplements/matrix-multiplication.md) of matrix multiplication, and noticing that $$ \mathbf{x}_i $$ is the $$ i $$th column vector of $$ \mathbf{X}' $$, the above equations follow immediately. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.6 14 | 15 | Why is the $$R^2$$ of 0.926 from the unrestricted model (1.7.7) _lower_ than the $$R^2$$ of 0.932 from the restricted model (1.7.8)? 16 | 17 | ##### Solution 18 | 19 | That is because the dependent variable in the restricted regression is different from that in the unrestricted regression. If the dependent variable were the same, the $$R^2$$ of the unrestricted model could not be lower than that of the restricted model. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.6 ($$t$$ vs. $$F$$) 14 | 15 | “It is nonsense to test a hypothesis consisting of a large number of equality restrictions, because the $$t$$-test will most likely reject at least some of the restrictions.” Criticize this statement. 16 | 17 | ##### Solution 18 | 19 | Testing multiple restrictions _jointly_ is different from testing them _separately_. As explained in the text, if the $$t$$-test is applied to each restriction without adjusting the critical value, the probability of rejecting at least one true restriction (the overall significance level) increases with the number of restrictions tested. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ...
12 | 13 | #### Review Question 1.3.5 14 | 15 | Propose an unbiased estimator of $$ \sigma^2 $$ if you had data on $$ \boldsymbol{ \varepsilon } $$. 16 | 17 | ##### Solution 18 | 19 | $$ \boldsymbol{ \varepsilon }' \boldsymbol{ \varepsilon } / n $$ is an unbiased estimator of $$ \sigma^2 $$. This is because 20 | 21 | $$ 22 | \begin{align} 23 | \mathrm{E} ( \boldsymbol{ \varepsilon }' \boldsymbol{ \varepsilon } / n ) 24 | & = 25 | \frac{1}{n} \sum_{i=1}^n \mathrm{E} ( \varepsilon_i^2 ) 26 | \\ & = 27 | \frac{1}{n} \sum_{i=1}^n \mathrm{E} [ \mathrm{E} ( \varepsilon_i^2 \mid \mathbf{X} ) ] 28 | \\ & = 29 | \frac{1}{n} \cdot n \sigma^2 30 | \\ & = 31 | \sigma^2. 32 | \end{align} 33 | $$ 34 | 35 | --- 36 | 37 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.1 14 | 15 | Prove that $$ \mathbf{X}' \mathbf{X} $$ is positive definite if $$ \mathbf{X} $$ is of full column rank. 16 | 17 | ##### Solution 18 | 19 | By the definition of a positive definite matrix, we need to show that $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} > 0 $$ for $$ \mathbf{c} \neq \mathbf{0} $$. Define $$ \mathbf{z} \equiv \mathbf{X} \mathbf{c} $$. Then $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} = \mathbf{z}' \mathbf{z} = \sum_{i=1}^{n} z_i^2 $$. If $$ \mathbf{X} $$ is of full column rank, then the column vectors of $$ \mathbf{X} $$ are linearly independent. This means $$ \mathbf{X} \mathbf{c} = \mathbf{0} $$ if and only if $$ \mathbf{c} = \mathbf{0} $$. Then $$ \mathbf{z} \neq \mathbf{0} $$ for any $$ \mathbf{c} \neq \mathbf{0} $$, and hence $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} = \sum_{i=1}^{n} z_i^2 > 0 $$. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.5 (Multicollinearity for the simple regression model) 14 | 15 | Show that Assumption 1.3 for the simple regression model is that the nonconstant regressor ($$ x_{i2} $$) is really nonconstant (i.e. $$ x_{i2} \neq x_{j2} $$ for some pairs of $$ (i, j) $$, $$ i \neq j $$, with probability one). 16 | 17 | ##### Solution 18 | 19 | The simple regression model is 20 | 21 | $$ 22 | \mathbf{y} = \beta_1 \cdot \mathbf{1} + \beta_2 \mathbf{x}_2 + \boldsymbol{\varepsilon}. 23 | $$ 24 | 25 | Assumption 1.3 requires that $$ \{ \mathbf{1}, \mathbf{x}_2 \} $$ are linearly independent with probability one. This means $$ \mathbf{x}_2 $$ is not proportional to $$ \mathbf{1} $$ with probability one, i.e., $$ x_{i2} \neq x_{j2} $$ for some pairs of $$ (i, j) $$, $$ i \neq j $$, with probability one.
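The rank condition can also be illustrated numerically. The following is a minimal NumPy sketch, not part of the original solution; the sample size and the particular regressor values are arbitrary choices for illustration.

```python
import numpy as np

n = 5
ones = np.ones(n)

# Nonconstant regressor: {1, x2} are linearly independent, so X'X has full rank.
x2 = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
X = np.column_stack([ones, x2])
print(np.linalg.matrix_rank(X.T @ X))              # 2: X'X is invertible

# Constant regressor: x2 is proportional to the vector of ones,
# so rank(X) < K and X'X is singular.
x2_const = np.full(n, 3.0)
X_const = np.column_stack([ones, x2_const])
print(np.linalg.matrix_rank(X_const.T @ X_const))  # 1: X'X is not invertible
```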
26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at Jun 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.4 (Partitioned regression) 12 | 13 | Let $$ \mathbf{X} $$ be partitioned as 14 | 15 | $$ 16 | \underset{ n \times K }{ \mathbf{X} } \equiv 17 | \left[ 18 | \underset{ n \times K_1 }{ \mathbf{X}_1 } \; 19 | \vdots \; 20 | \underset{ n \times K_2 }{ \mathbf{X}_2 } 21 | \right]. 22 | $$ 23 | 24 | Partition $$ \boldsymbol{\beta} $$ accordingly: 25 | 26 | $$ 27 | \boldsymbol{\beta} \equiv 28 | \begin{bmatrix} 29 | \boldsymbol{\beta}_1 \\ \boldsymbol{\beta}_2 30 | \end{bmatrix} 31 | \quad 32 | \begin{array}{l} 33 | \leftarrow K_1 \times 1 34 | \\ 35 | \leftarrow K_2 \times 1 36 | \end{array}. 37 | $$ 38 | 39 | Thus, the regression can be written as 40 | 41 | $$ 42 | \mathbf{y} = 43 | \mathbf{X}_1 \boldsymbol{ \beta }_1 + 44 | \mathbf{X}_2 \boldsymbol{ \beta }_2 + 45 | \boldsymbol{ \varepsilon }. 46 | $$ 47 | 48 | Let ... 49 | 50 | ##### Solution 51 | 52 | Let ... 53 | 54 | --- 55 | 56 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.1 (The no-multicollinearity assumption for the transformed model) 14 | 15 | Assumption 1.3 for the transformed model is that $$ \mathrm{rank} ( \mathbf{C} \mathbf{X} ) = K $$. This is satisfied since $$ \mathbf{C} $$ is nonsingular and $$ \mathbf{X} $$ is of full column rank. Show this. 16 | 17 | ##### Solution 18 | 19 | To show $$ \mathrm{rank} ( \mathbf{C} \mathbf{X} ) = K $$ is equivalent to show $$ \mathbf{C} \mathbf{X} \mathbf{v} \neq \mathbf{0} $$ for any $$ \mathbf{v} \neq \mathbf{0} $$. 20 | 21 | For any $$ \mathbf{v} \neq \mathbf{0} $$, $$ \mathbf{X} \mathbf{v} \neq \mathbf{0} $$ because $$ \mathbf{X} $$ is of full column rank. Let $$ \mathbf{u} \equiv \mathbf{X} \mathbf{v} \neq \mathbf{0} $$, because $$ \mathbf{C} $$ is nonsingular (also full column rank), $$ \mathbf{C} \mathbf{u} = \mathbf{C} \mathbf{X} \mathbf{v} \neq \mathbf{0} $$. Q.E.D. 22 | 23 | --- 24 | 25 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.6 (Change in units and $$ R^2 $$) 14 | 15 | Does a change in the unit of measurement for the dependent variable change $$ R^2 $$? A change in the unit of measurement for the regressors? **Hint:** Check whether the change affects the denominator and the numerator in the definition for $$ R^2 $$. 
16 | 17 | ##### Solution 18 | 19 | By definition, 20 | 21 | $$ 22 | R^2 = \frac{ \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 } 23 | { \sum_{i=1}^n (y_i - \bar{y})^2 }. 24 | $$ 25 | 26 | (1) If we change the unit of measurement of the dependent variable $$ y $$, then $$ y_i $$, $$ \hat{y}_i $$, and $$ \bar{y} $$ simultaneously scales by a factor $$ \alpha $$. Both the denominator and the numerator are scaled by $$ \alpha^2 $$, leaving $$ R^2 $$ unchanged. 27 | 28 | (2) If we change the unit of measurement of the regressors, then $$ y_i $$, $$ \hat{y}_i $$, and $$ \bar{y} $$ donnot change at all. $$ R^2 $$ remains unchanged. 29 | 30 | --- 31 | 32 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.8.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 22, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.8 14 | 15 | Show that 16 | 17 | $$ 18 | R_{uc}^2 = \frac{ \mathbf{y}' \mathbf{P} \mathbf{y} } 19 | { \mathbf{y}' \mathbf{y} }. 20 | $$ 21 | 22 | ##### Solution 23 | 24 | $$ 25 | \begin{align} 26 | R_{uc}^2 & = 1 - \frac{ \mathbf{e}' \mathbf{e} } 27 | { \mathbf{y}' \mathbf{y} } 28 | && 29 | \text{(definition (1.2.16))} 30 | \\ & = 31 | \frac{ \mathbf{y}' \mathbf{y} - \mathbf{e}' \mathbf{e} } 32 | { \mathbf{y}' \mathbf{y} } 33 | \\ & = 34 | \frac{ \mathbf{y}' \mathbf{I} \mathbf{y} - 35 | \mathbf{y}' \mathbf{M} \mathbf{y} } 36 | { \mathbf{y}' \mathbf{y} } 37 | && 38 | \text{($ \mathbf{e} = \mathbf{M} \mathbf{y} $, $\mathbf{M}$ idempotent)} 39 | \\ & = 40 | \frac{ \mathbf{y}' (\mathbf{I} - \mathbf{M}) \mathbf{y} } 41 | { \mathbf{y}' \mathbf{y} } 42 | \\ & = 43 | \frac{ \mathbf{y}' \mathbf{P} \mathbf{y} } 44 | { \mathbf{y}' \mathbf{y} }. 45 | && 46 | (\mathbf{M} \equiv \mathbf{I} - \mathbf{P}) 47 | \end{align} 48 | $$ 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.4 14 | 15 | Suppose $$ \sqrt{n} ( \hat\theta_n - \theta ) \to_d N(0, \sigma^2) $$. Does it follow that $$ \hat\theta_n \to_p \theta $$? 16 | 17 | **Hint**: 18 | 19 | $$ 20 | \hat\theta_n - \theta = \frac{1}{\sqrt{n}} \cdot 21 | \sqrt{n} ( \hat\theta_n - \theta ) 22 | \text{, } 23 | \operatorname*{plim}_{n \to \infty} 24 | \frac{1}{\sqrt{n}} = 0. 25 | $$ 26 | 27 | ##### Solution 28 | 29 | Using the multiply-and-divide strategy, 30 | 31 | $$ 32 | \hat\theta_n - \theta = \frac{1}{\sqrt{n}} \cdot 33 | \sqrt{n} ( \hat\theta_n - \theta ). 34 | \tag{1} 35 | $$ 36 | 37 | Because $$ 1 / \sqrt{n} \to 0 $$, as is shown in [Review Question 2.1.1](2.1.1.md), 38 | 39 | $$ 40 | \frac{1}{\sqrt{n}} \to_p 0. 41 | \tag{2} 42 | $$ 43 | 44 | Because $$ \sqrt{n} ( \hat\theta_n - \theta ) \to_d N(0, \sigma^2) $$, combining (2) into (1) and Lemma 2.4(b), 45 | 46 | $$ 47 | \begin{gather} 48 | \hat\theta_n - \theta \to_p 0, \\ 49 | 50 | \hat\theta_n \to_p \theta. 
51 | \end{gather} 52 | $$ 53 | 54 | --- 55 | 56 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.4 (Sampling error of GLS) 14 | 15 | Show: $$ \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} - \boldsymbol{ \beta } = ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon } $$. 16 | 17 | ##### Solution 18 | 19 | $$ 20 | \begin{align} 21 | \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} 22 | & = 23 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{y} 24 | \tag{1.6.5} 25 | \\ & = 26 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} ( \mathbf{X} \boldsymbol{ \beta } + \boldsymbol{ \varepsilon } ) 27 | \\ & = 28 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \boldsymbol{ \beta } + ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon } 29 | \\ & = 30 | \boldsymbol{ \beta } + ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon }. 31 | \end{align} 32 | $$ 33 | 34 | --- 35 | 36 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 10, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.2 (Computation of test statistics) 14 | 15 | Verify that $$ SE( b_k ) $$ as well as $$ \mathbf{b} $$, $$ SSR $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from the following sample averages: $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 16 | 17 | ##### Solution 18 | 19 | In [review question 1.2.9](1.2.9.md), it has been shown that $$ \mathbf{b} $$, $$ SSR $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from sample averages $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 20 | 21 | Because 22 | 23 | $$ 24 | SE( b_k ) \equiv \sqrt{ s^2 \cdot \left( ( \mathbf{X}' \mathbf{X} )^{-1} \right)_{kk} }, 25 | $$ 26 | 27 | by definition that $$ \mathbf{S}_{\mathbf{xx}} = \frac{1}{n} \mathbf{X}' \mathbf{X} $$, it is obvious that $$ SE( b_k ) $$ can be calculated from sample averages of $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 28 | 29 | --- 30 | 31 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 
12 | 13 | #### Review Question 1.3.2 (Example of a linear estimator) 14 | 15 | For the consumption function example in Example 1.1, propose a linear and unbiased estimator of $$ \beta_2 $$ that is different from the OLS estimator. 16 | 17 | ##### Solution 18 | 19 | We propose an estimator $$ \widehat{\beta}_2 = ( CON_2 - CON_1 ) / ( YD_2 - YD_1 ) $$. 20 | 21 | 1. When $$ YD_1, \ldots, YD_n $$ is known, $$ \widehat{\beta}_2 $$ is a linear combination of $$ CON_1, \ldots, CON_n $$. 22 | 23 | 2. Because 24 | 25 | $$ 26 | \begin{align} 27 | \mathrm{E} ( \widehat{\beta}_2 \mid YD_1, \ldots, YD_n ) 28 | & = 29 | \mathrm{E} \left( \left. 30 | \frac{ ( \beta_1 + \beta_2 YD_2 + \varepsilon_2) - ( \beta_1 + \beta_2 YD_1 + \varepsilon_1 ) } 31 | { YD_2 - YD_1 } 32 | \right| YD_1, \ldots, YD_n 33 | \right) 34 | \\ & = 35 | \mathrm{E} \left( \left. 36 | \frac{ \beta_2 (YD_2 - YD_1) + \varepsilon_2 - \varepsilon_1 } 37 | { YD_2 - YD_1 } 38 | \right| YD_1, \ldots, YD_n 39 | \right) 40 | \\ & = 41 | \beta_2 + 0 - 0 = \beta_2, 42 | \end{align} 43 | $$ 44 | 45 | So $$ \widehat{\beta}_2 $$ proposed here is linear and unbiased. 46 | 47 | --- 48 | 49 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Nov 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.4 14 | 15 | Let $$ \{ x_i \} $$ be a sequence of real numbers that change with $$i$$ and $$ \{ \varepsilon_i \} $$ be a sequence of i.i.d. random variables with mean 0 and finite variance. Is $$ \{ x_i \cdot \varepsilon_i \} $$ i.i.d.? [Answer: No.] Is it serially independent? [Answer: Yes.] An m.d.s? [Answer: Yes.] Stationary? [Answer: No.] 16 | 17 | ##### Solution 18 | 19 | (a) i.i.d.? No. Because $$ \operatorname*{Var} (x_i \cdot \varepsilon_i) = x_i^2 \sigma^2 $$ changes with $$i$$. 20 | 21 | (b) Serially independent? Yes. Because $$ x_i \cdot \varepsilon_i $$ can be considered as a function of $$ \varepsilon_i $$, $$ f_i (\varepsilon_i) = x_i \cdot \varepsilon_i $$, and the functions of two independent random variables are also independent. 22 | 23 | (c) m.d.s.? Yes. Let $$ z_i = x_i \cdot \varepsilon_i $$, $$ \operatorname*{E} ( z_i | z_{i-1}, z_{i-2}, \ldots, z_{1} ) = \operatorname*{E} ( z_i ) = x_i \operatorname*{E} ( \varepsilon_i ) = 0 $$. 24 | 25 | (d) Stationary? No. Because $$ \operatorname*{Var} (x_i \cdot \varepsilon_i) = x_i^2 \sigma^2 $$ changes with $$i$$. 26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 
12 | 13 | #### Review Question 1.7.5 14 | 15 | If you take $$p_{i2}$$ instead of $$p_{i3}$$ and subtract $$ \log (p_{i2}) $$ from both sides of 16 | 17 | $$ 18 | \log ( TC_i ) = \beta_1 + \beta_2 \log ( Q_i ) + \beta_3 \log ( p_{i1} ) + \beta_4 \log ( p_{i2} ) + \beta_5 \log ( p_{i3} ) + \varepsilon_i, 19 | \tag{1.7.4} 20 | $$ 21 | 22 | how does the restricted regression look? Without actually estimating it on Nerlove's data, can you tell from the estimated restricted regression in the text what the restricted OLS estimate of $$ ( \beta_1, \ldots, \beta_5 ) $$ will be? Their standard errors? the $$SSR$$? What about the $$R^2$$? 23 | 24 | ##### Solution 25 | 26 | The new restricted regression will be 27 | 28 | $$ 29 | \log \left( \frac{ TC_i }{ p_{i2} } \right) = \beta_1 + 30 | \beta_2 \log ( Q_i ) + 31 | \beta_3 \log \left( \frac{ p_{i1} }{ p_{i2} } \right) + 32 | \beta_5 \log \left( \frac{ p_{i3} }{ p_{i2} } \right) + 33 | \varepsilon_i. 34 | \tag{1} 35 | $$ 36 | 37 | The OLS estimate from regression on (1) should yield the same point estimate and standard errors. The $$SSR$$ should be the same, but $$R^2$$ should be different. 38 | 39 | --- 40 | 41 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.6 14 | 15 | Prove part (d) of Proposition 1.1, under Assumptions 1.1—1.4, $$ \mathrm{Cov} ( \mathbf{b}, \mathbf{e} \mid \mathbf{X} ) = \mathbf{0} $$, where $$ \mathbf{e} \equiv \mathbf{y} - \mathbf{X} \mathbf{b} $$. 16 | 17 | ##### Solution 18 | 19 | By [definition of covariance](../supplements/var-cov-matrix.md), 20 | 21 | $$ 22 | \begin{align} 23 | \mathrm{Cov} ( \mathbf{b}, \mathbf{e} \mid \mathbf{X} ) 24 | & = 25 | \mathrm{E} \{ [ \mathbf{b} - \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) ][ \mathbf{e} - \mathrm{E} ( \mathbf{e} \mid \mathbf{X} ) ]' \mid \mathbf{X} \} 26 | \\ & = 27 | \mathrm{E} \{ [ \mathbf{A} \boldsymbol{ \varepsilon } ][ \mathbf{M} \boldsymbol{ \varepsilon } ]' \mid \mathbf{X} \} 28 | && 29 | ( \mathbf{A} \equiv (\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}', \mathbf{M} \equiv \mathbf{I} - \mathbf{X} 30 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' ) 31 | \\ & = 32 | \mathbf{A} \mathrm{E} ( \boldsymbol{ \varepsilon } \boldsymbol{\varepsilon}' \mid \mathbf{X} ) \mathbf{M}' 33 | \\ & = 34 | \sigma^2 \mathbf{A} \mathbf{M}' 35 | && 36 | ( \mathbf{M} \mathbf{X} = \mathbf{0} ) 37 | \\ & = \mathbf{0}. 38 | \end{align} 39 | $$ 40 | 41 | --- 42 | 43 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.4 (One-tailed $$t$$-test) 14 | 15 | The $$t$$-test described in the text is the **tow-tailed $$t$$-test** because the significance $$\alpha$$ is equally distributed between both tails of the $$t$$ distribution. 
Suppose the alternative is one-sided and written as $$ \mathrm{H}_1: \beta_k > \bar{\beta}_k $$. Consider the following modification of the decision rule of the $$t$$-test. 16 | 17 | 1. Same as above. 18 | 2. Find the critical value $$t_\alpha$$ such that the area in the $$t$$ distribution to the right of $$t_\alpha$$ is $$\alpha$$. Note the difference from the two-tailed test: the left tail is ignored and the area of $$\alpha$$ is assigned to the upper tail only. 19 | 3. Accept if $$t_k < t_\alpha$$; reject otherwise. 20 | 21 | Show that the size (significance level) of this **one-tailed $$t$$-test** is $$\alpha$$. 22 | 23 | ##### Solution 24 | 25 | The size of a test is the probability of falsely rejecting the null hypothesis when it is true. When $$\mathrm{H}_0$$ is true, the test statistic $$t_k$$ has a $$ t(n-K) $$ distribution. If we reject whenever $$t_k \ge t_\alpha$$, then the probability of rejection is $$\alpha$$ by construction. So the size of this **one-tailed $$t$$-test** is $$\alpha$$. 26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/taylor-linearization.md: -------------------------------------------------------------------------------- 1 | ## Taylor's Linearization 2 | 3 | by Qiang Gao, updated at Mar 9, 2018 4 | 5 | --- 6 | 7 | Any nonlinear differentiable function, i.e., a curve of points $$(y, x)$$, depicted as $$y = f(x)$$, can be _locally_ approximated as a line, called **linearization**, around any point $$(\bar{y}, \bar{x})$$ on the curve. 8 | 9 | #### One Independent Variable Case 10 | 11 | A function of a single variable $$y = f(x)$$ can be approximately linearized around a point $$(\bar{y}, \bar{x})$$, where $$\bar{y} = f(\bar{x})$$, as 12 | 13 | $$ 14 | y - \bar y \approx \frac{d f(\bar{x})}{d x} (x - \bar x). 15 | $$ 16 | 17 | #### Two Independent Variable Case 18 | 19 | A function of two variables $$y = f(x_1, x_2)$$ can be approximately linearized around a point $$(\bar{y}, \bar{x}_1, \bar{x}_2)$$, where $$\bar{y} = f(\bar{x}_1, \bar{x}_2)$$, as 20 | 21 | $$ 22 | y - \bar{y} \approx \frac{\partial f(\bar{x}_1, \bar{x}_2)}{\partial x_1} (x_1 - \bar{x}_1) + \frac{\partial f(\bar{x}_1, \bar{x}_2)}{\partial x_2} (x_2 - \bar{x}_2). 23 | $$ 24 | 25 | #### Many Independent Variable Case 26 | 27 | A function of $$K$$ independent variables $$y = f( \mathbf{x})$$, where $$\mathbf{x} = (x_1, x_2, \ldots, x_K)$$, can be approximately linearized around a point $$(\bar{y}, \bar{ \mathbf{x} })$$, where $$\bar{y} = f( \bar{ \mathbf{x} } )$$, as 28 | 29 | $$ 30 | y - \bar{y} \approx \nabla f(\bar{ \mathbf{x} }) \cdot ( \mathbf{x} - \bar{ \mathbf{x} } ). 31 | $$ 32 | 33 | --- 34 | 35 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.1 (Change in units in the semi-log form) 14 | 15 | In the wage equation 16 | 17 | $$ 18 | \log ( WAGE_i ) = \beta_1 + \beta_2 S_i + \beta_3 TENURE_i + \beta_4 EXPR_i + \varepsilon_i, 19 | \tag{1.1.3} 20 | $$ 21 | 22 | of Example 1.2, if $$ WAGE $$ is measured in cents rather than in dollars, what difference does it make to the equation?
23 | 24 | ##### Solution 25 | 26 | Let $$ WAGE' $$ denote the $$ WAGE $$ variable measured in cents, that is, 27 | 28 | $$ 29 | WAGE' = 100 WAGE, 30 | $$ 31 | 32 | $$ 33 | \log ( WAGE' ) = \log (100) + \log ( WAGE ). 34 | $$ 35 | 36 | Substituting into equation (1.1.3), 37 | 38 | $$ 39 | \log ( WAGE_i') - \log (100) = \beta_1 + \beta_2 S_i + 40 | \beta_3 TENURE_i + \beta_4 EXPR_i + 41 | \varepsilon_i, 42 | $$ 43 | 44 | $$ 45 | \log (\mathit{WAGE}_i') = \log (100) + \beta_1 + \beta_2 S_i + 46 | \beta_3 \mathit{TENURE}_i + \beta_4 \mathit{EXPR}_i + 47 | \varepsilon_i, 48 | $$ 49 | 50 | $$ 51 | \log ( WAGE_i') = \beta_1' + \beta_2 S_i + \beta_3 52 | TENURE_i + \beta_4 EXPR_i + \varepsilon_i, 53 | $$ 54 | 55 | where $$ \beta_1' $$ is defined as $$ \beta_1' = \log (100) + \beta_1 $$. So the _only_ difference is $$ \beta_1 $$ is increased by $$ \log (100)$$. 56 | 57 | --- 58 | 59 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 10, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.1 (Conditionial vs. unconditional distribution) 14 | 15 | (a) Do we know from Assumptions 1.1—1.5 that the marginal (unconditional) distribution of $$ \mathbf{b} $$ is normal? 16 | 17 | (b) Are the statistics $$ z_k $$, $$ t_k $$, and $$ F $$ distributed independently of $$ \mathbf{X} $$? 18 | 19 | ##### Solution 20 | 21 | (a) Under Assumptions 1.1—1.5, 22 | 23 | $$ 24 | \mathbf{b} \mid \mathbf{X} \sim 25 | N( \boldsymbol{\beta}, \sigma^2 \cdot ( \mathbf{X}' \mathbf{X} )^{-1} ). 26 | \tag{1.4.2} 27 | $$ 28 | 29 | Because the variance of $$ \mathbf{b} $$ depends on $$ \mathbf{X} $$, when the marginal distribution of $$ \mathbf{X} $$ is unknown, the marginal distribution of $$ \mathbf{b} $$ is also unknown, so it is not necessarily normal. 30 | 31 | (b) Under Assumptions 1.1—1.5 and the null hypothesis, 32 | 33 | $$ 34 | \begin{align} 35 | z_k \mid \mathbf{X} & \sim N(0, 1), 36 | \tag{1.4.3} 37 | \\ 38 | t_k \mid \mathbf{X} & \sim t(n - K), 39 | \tag{1.4.5} 40 | \\ 41 | F \mid \mathbf{X} & \sim F(\# \mathbf{r}, n - K). 42 | \tag{1.4.9} 43 | \end{align} 44 | $$ 45 | 46 | These _conditional_ distributions does not depend on the value of $$ \mathbf{X} $$, so their _marginal_ distributions are independent of $$ \mathbf{X} $$. 47 | 48 | --- 49 | 50 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.7 14 | 15 | Prove (1.2.21), 16 | 17 | $$ 18 | 0 \le p_i \le 1 \text{ and } \sum_{i=1}^{n} p_i = K, 19 | \tag{1.2.21} 20 | $$ 21 | 22 | where 23 | 24 | $$ 25 | p_i \equiv \mathbf{x}_i' ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{x}_i. 
26 | \tag{1.2.20} 27 | $$ 28 | 29 | ##### Solution 30 | 31 | Because 32 | 33 | $$ 34 | \mathbf{P} \equiv \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }', 35 | $$ 36 | 37 | $$ p_i $$ is the $$i$$-th row and $$i$$-th column of $$ \mathbf{P} $$. Because $$ \mathbf{P} $$ is positive semidefinite, $$ p_i \ge 0 $$. Similarly, because 38 | 39 | $$ 40 | \mathbf{M} \equiv \mathbf{I} - \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }', 41 | $$ 42 | 43 | $$ 1 - p_i $$ is the $$i$$-th row and $$i$$-th column of $$ \mathbf{M} $$. Because $$ \mathbf{M} $$ is positive semidefinite, $$ 1 - p_i \ge 0 $$, $$ p_i \le 1 $$. 44 | 45 | Finally, 46 | 47 | $$ 48 | \begin{align} 49 | \sum_{i=1}^n p_i 50 | & = 51 | \mathrm{trace} ( \mathbf{P} ) 52 | \\ & = 53 | \mathrm{trace} ( \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }' ) 54 | \\ & = 55 | \mathrm{trace} ( ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }' \mathbf{X} ) 56 | \\ & = 57 | \mathrm{trace} ( \mathbf{I}_K ) 58 | \\ & = 59 | K. 60 | \end{align} 61 | $$ 62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.7 (Variance of $$s^2$$) 14 | 15 | Show that, under Assumptions 1.1—1.5, 16 | 17 | $$ 18 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 19 | \frac{ 2 \sigma^4 }{ n - K }. 20 | $$ 21 | 22 | **Hint:** If a random variable is distributed as $$ \chi^2 (m) $$, then its mean is $$m$$ and variance $$2m$$. 23 | 24 | ##### Solution 25 | 26 | Because 27 | 28 | $$ 29 | s^2 \equiv \frac{ \mathbf{e}' \mathbf{e} }{ n - K } = 30 | \frac{ \sigma^2 }{ n - K } 31 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right)' 32 | \mathbf{M} 33 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right), 34 | $$ 35 | 36 | by property of variance, 37 | 38 | $$ 39 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 40 | \frac{ \sigma^4 }{ (n - K)^2 } \mathrm{Var} ( q \mid \mathbf{X} ), 41 | \tag{1} 42 | $$ 43 | 44 | where 45 | 46 | $$ 47 | q \equiv 48 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right)' 49 | \mathbf{M} 50 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right) 51 | $$ 52 | 53 | defined in page 36 in text is distributed as $$ q \mid \mathbf{X} \sim \chi^2 (n - K)$$. So $$ \mathrm{Var} ( q \mid \mathbf{X} ) = 2(n - K) $$, and substituting into (1) 54 | 55 | $$ 56 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 57 | \frac{ 2 \sigma^4 }{ n - K }. 58 | $$ 59 | 60 | --- 61 | 62 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 
12 | 13 | #### Review Question 1.1.2 (Conditional cross-moment of error terms) 14 | 15 | Prove the last equality in (1.1.15), 16 | 17 | $$ 18 | \mathrm{E} (\varepsilon_i \varepsilon_j \mid \mathbf{X}) 19 | = 20 | \mathrm{E} (\varepsilon_i \mid \mathbf{x}_i) 21 | \mathrm{E} (\varepsilon_j \mid \mathbf{x}_j) 22 | \qquad 23 | \text{(for $i \neq j$)}. 24 | \tag{1.1.15} 25 | $$ 26 | 27 | ##### Solution 28 | $$ 29 | \begin{align} 30 | \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X} ) 31 | & = 32 | \mathrm{E} [ \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X}, \varepsilon_j ) \mid \mathbf{X} ] 33 | && 34 | \text{(law of iterated expectations)} 35 | \\ 36 | & = \mathrm{E} [ \varepsilon_j \mathrm{E} ( \varepsilon_i \mid \mathbf{X}, \varepsilon_j ) \mid \mathbf{X} ] 37 | && 38 | \text{($\varepsilon_j$ is constant under condition)} 39 | \\ 40 | & = \mathrm{E} [ \varepsilon_j \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mid \mathbf{X} ] 41 | && 42 | \text{(random sample)} \\ 43 | & = \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{X}) 44 | && 45 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$)} \\ 46 | & = \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{x}_j) 47 | && 48 | \text{(random sample)} 49 | \end{align} 50 | $$ 51 | 52 | --- 53 | 54 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.7 (Relation between $$ R_{uc}^2 $$ and $$ R^2 $$) 14 | 15 | Show that 16 | 17 | $$ 18 | 1 - R^2 = \left( 1 + \frac{ n \cdot \bar{y}^2 }{ \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 19 | (1 - R_{uc}^2). 20 | $$ 21 | 22 | **Hint:** Use the identity $$ \sum_i (y_i - \bar{y})^2 = \sum_i y_i^2 - n \cdot \bar{y}^2 $$. 23 | 24 | ##### Solution 25 | 26 | By definition of $$ R^2 $$, the left side equals 27 | 28 | $$ 29 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n (y_i - \bar{y})^2 }. 30 | $$ 31 | 32 | By definition of $$ R_{uc}^2 $$, the right side equals 33 | 34 | $$ 35 | \begin{align} 36 | & \left( 1 + \frac{ n \cdot \bar{y}^2 } 37 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 38 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 39 | \\ 40 | = & 41 | \left( \frac{ \sum_i y_i^2 - n \cdot \bar{y}^2 42 | + n \cdot \bar{y}^2 } 43 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 44 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 45 | \qquad 46 | \text{(hint)} 47 | \\ 48 | = & 49 | \left( \frac{ \sum_i y_i^2 } 50 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 51 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 52 | \\ 53 | = & 54 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n 55 | (y_i - \bar{y})^2 }. 56 | \end{align} 57 | $$ 58 | 59 | Left side and right side are equal. 
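The identity can also be checked numerically. Below is a minimal sketch, not part of the original solution, that runs OLS with a constant on simulated data; the data-generating process and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# OLS with a constant
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b

R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)   # centered R^2
R2_uc = 1 - (e @ e) / (y @ y)                    # uncentered R^2, definition (1.2.16)

lhs = 1 - R2
rhs = (1 + n * y.mean() ** 2 / np.sum((y - y.mean()) ** 2)) * (1 - R2_uc)
print(np.isclose(lhs, rhs))   # True
```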
60 | 61 | --- 62 | 63 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.2 (Change of units) 14 | 15 | In Nerlove's data, output is measured in kilowatt hours. If output were measured in megawatt hours, how would the estimated restricted regression change? 16 | 17 | ##### Solution 18 | 19 | In the restricted regression 20 | 21 | $$ 22 | \log \left( \frac{ TC_i }{ p_{i3} } \right) = 23 | \beta_1 + \beta_2 \log ( Q_i ) + \beta_3 \log 24 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 25 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 26 | \varepsilon_i, 27 | \tag{1.7.6} 28 | $$ 29 | 30 | if output measured in kilowatt hours ($$Q_i$$) is substituted by output measured in megawatt hours ($$ Q'_i \equiv Q_i / 1000 $$), 31 | 32 | $$ 33 | \begin{align} 34 | \log \left( \frac{ TC_i }{ p_{i3} } \right) & = 35 | \beta_1 + \beta_2 \log ( 1000 \cdot Q'_i ) + \beta_3 \log 36 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 37 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 38 | \varepsilon_i 39 | \\ & = 40 | ( \beta_1 + \beta_2 \log (1000) ) + 41 | \beta_2 \log ( Q'_i ) + \beta_3 \log 42 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 43 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 44 | \varepsilon_i, 45 | \end{align} 46 | $$ 47 | 48 | then the estimated values of slopes ($$ \beta_2 $$, $$ \beta_3 $$ and $$ \beta_4 $$) will not change, but the estimated value of the intercept ($$ \beta_1 $$) will increase by $$ \hat{\beta}_2 \log (1000) $$. 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.3 (Recovering technology parameters from regression coefficients) 14 | 15 | Show that the technology parameters ($$\mu$$, $$\alpha_1$$, $$\alpha_2$$, $$\alpha_3$$) can be determined uniquely from the first four equations in (1.7.5), 16 | 17 | $$ 18 | \begin{align} 19 | \beta_1 & = \mu, \tag{1.7.5a} \\ 20 | \beta_2 & = \frac{1}{r}, \tag{1.7.5b} \\ 21 | \beta_3 & = \frac{ \alpha_1 }{ r }, \tag{1.7.5c} \\ 22 | \beta_4 & = \frac{ \alpha_2 }{ r }, \tag{1.7.5d} 23 | \end{align} 24 | $$ 25 | 26 | and the definition $$r \equiv \alpha_1 + \alpha_2 + \alpha_3$$. (Do not use the fifth equation $$ \beta_5 = \alpha_3 / r $$.) 27 | 28 | ##### Solution 29 | 30 | From (1.7.5a) we have 31 | 32 | $$ 33 | \mu = \beta_1. 34 | \tag{1} 35 | $$ 36 | 37 | Substituting (1.7.5b) into (1.7.5c) we have 38 | 39 | $$ 40 | \alpha_1 = \frac{ \beta_3 }{ \beta_2 }. 41 | \tag{2} 42 | $$ 43 | 44 | Similarly, substituting (1.7.5b) into (1.7.5d) we have 45 | 46 | $$ 47 | \alpha_2 = \frac{ \beta_4 }{ \beta_2 }. 
48 | \tag{3} 49 | $$ 50 | 51 | Finally, using the definition of $$r$$ and (1.7.5b), 52 | 53 | $$ 54 | \alpha_1 + \alpha_2 + \alpha_3 = \frac{1}{\beta_2}, 55 | \tag{4} 56 | $$ 57 | 58 | substituting (2) and (3) into (4) and rearrange terms, 59 | 60 | $$ 61 | \alpha_3 = \frac{1}{\beta_2} - \frac{\beta_3}{\beta_2} - 62 | \frac{\beta_4}{\beta_2} = 63 | \frac{ 1 - \beta_3 - \beta_4 }{ \beta_2 }. 64 | \tag{5} 65 | $$ 66 | 67 | --- 68 | 69 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Oct 31, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.1 14 | 15 | Prove that $$ \boldsymbol{\Gamma}_{-j} = \boldsymbol{\Gamma}_j' $$. 16 | 17 | **Hint**: $$ \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} ) = \operatorname*{E} [ ( \mathbf{z}_i - \boldsymbol{\mu} ) ( \mathbf{z}_{i-j} - \boldsymbol{\mu} )' ] $$ where $$ \boldsymbol{\mu} = \operatorname*{E} ( \mathbf{z}_i ) $$. By covariance-stationarity, $$ \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} ) = \operatorname*{Cov} ( \mathbf{z}_{i+j}, \mathbf{z}_i ) $$. 18 | 19 | ##### Solution 20 | 21 | $$ 22 | \begin{align*} 23 | 24 | \boldsymbol{\Gamma}_{-j} 25 | &= \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i+j} ) 26 | && \text{(definition of $\boldsymbol{\Gamma}$)} 27 | \\ 28 | 29 | &= \operatorname*{E} [ ( \mathbf{z}_i - \boldsymbol{\mu} )( \mathbf{z}_{i+j} - \boldsymbol{\mu} )' ] 30 | && \text{(definition of $\operatorname*{Cov}$)} 31 | \\ 32 | 33 | &= \operatorname*{E} [ ( \mathbf{z}_{i-j} - \boldsymbol{\mu} )( \mathbf{z}_{i} - \boldsymbol{\mu} )' ] 34 | && \text{(covariance-stationarity)} 35 | \\ 36 | 37 | &= \operatorname*{E} [ ( \mathbf{z}_{i} - \boldsymbol{\mu} )( \mathbf{z}_{i-j} - \boldsymbol{\mu} )' ]' 38 | && \text{(transpose)} 39 | \\ 40 | 41 | &= \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} )' 42 | && \text{(definition of $\operatorname*{Cov}$)} 43 | \\ 44 | 45 | &= \boldsymbol{\Gamma}_j'. 46 | && \text{(definition of $\boldsymbol{\Gamma}$)} 47 | \end{align*} 48 | $$ 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 16, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.1 (Usual convergence vs. convergence in probability) 14 | 15 | A sequence of real numbers is a trivial example of a sequence of random variables. Is it true that 16 | 17 | $$ 18 | \lim_{n \to \infty} z_n = \alpha 19 | \implies 20 | \operatorname*{plim}_{n \to \infty} z_n = \alpha ? 21 | $$ 22 | 23 | **Hint**: Look at the definition of $$ \operatorname{plim} $$. Since $$ \lim_{n \to \infty} z_n = \alpha $$, $$ | z_n - \alpha | < \varepsilon $$ for $$ n $$ sufficiently large. 24 | 25 | ##### Solution 26 | 27 | By definition, $$ \lim_{n \to \infty} z_n = \alpha $$ means, for any $$ \varepsilon > 0 $$, for $$n$$ sufficiently large, 28 | 29 | $$ 30 | | z_n - \alpha | < \varepsilon. 
31 | \tag{1} 32 | $$ 33 | 34 | Considering $$ z_n $$ as a trivial random variable, (1) is equivalent to 35 | 36 | $$ 37 | \begin{gather} 38 | \operatorname{Prob} (| z_n - \alpha | < \varepsilon) = 1, \\ 39 | \operatorname{Prob} (| z_n - \alpha | > \varepsilon) = 0, \\ 40 | \lim_{n \to \infty} \operatorname{Prob} (| z_n - \alpha | > \varepsilon) = 0. 41 | \tag{2} 42 | \end{gather} 43 | $$ 44 | 45 | Then $$ \operatorname*{plim}_{n \to \infty} z_n = \alpha $$ by definition. 46 | 47 | ##### Appendix 48 | 49 | A sequence of random scalars $$ {z_n} $$ converges in probability to a constant $$ \alpha $$ if, for any $$ \varepsilon > 0 $$, 50 | 51 | $$ 52 | \lim_{n \to \infty} \operatorname{Prob} ( | z_n - \alpha | > \varepsilon ) = 0. 53 | $$ 54 | 55 | --- 56 | 57 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.3 14 | 15 | For the formula 16 | 17 | $$ 18 | F \equiv 19 | \frac{ ( \mathbf{R} \mathbf{b} - \mathbf{r} )' [ \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' ]^{-1} ( \mathbf{R} \mathbf{b} - \mathbf{r} ) / \# \mathbf{r} } 20 | { s^2 } 21 | \tag{1.4.9} 22 | $$ 23 | 24 | to be well-defined, the matrix $$ \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' $$ must be nonsingular. Prove the stronger result that the matrix is positive definite. 25 | 26 | ##### Solution 27 | 28 | We need to show that 29 | 30 | $$ 31 | \mathbf{z}' \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' \mathbf{z} > 0, 32 | \tag{1} 33 | $$ 34 | 35 | for any nonzero vector $$ \mathbf{z} $$. 36 | 37 | Since $$ \mathbf{R} $$ is of full row rank, for any nonzero $$ \mathbf{z} $$, $$ \mathbf{R}' \mathbf{z} $$ is also nonzero. So equivalently, what we need to show becomes 38 | 39 | $$ 40 | \mathbf{c}' ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{c} > 0, 41 | \tag{2} 42 | $$ 43 | 44 | where $$ \mathbf{c} = \mathbf{R}' \mathbf{z} $$ is nonzero. 45 | 46 | This is equivalent to proving $$ ( \mathbf{X}' \mathbf{X} )^{-1} $$ is positive definite, which is indeed true because in [review question 1.2.1](1.2.1.md), it is already shown that $$ \mathbf{X}' \mathbf{X} $$ is positive definite. (A matrix is positive definite if and only if all its eigenvalues are positive. The eigenvalues of $$ \mathbf{A}^{-1} $$ are the reciprocals of the eigenvalues of $$ \mathbf{A} $$.) 47 | 48 | --- 49 | 50 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 
12 | 13 | #### Review Question 1.1.6 (An exercise in conditional and unconditional expectations) 14 | 15 | Show that Assumptions 1.2 and 1.4 imply 16 | 17 | $$ 18 | \begin{align} 19 | \mathrm{Var} ( \varepsilon_i ) & = \sigma^2 20 | && 21 | (i = 1, 2, \ldots, n) 22 | \\ 23 | \text{and} \qquad 24 | \mathrm{Cov} ( \varepsilon_i, \varepsilon_j ) & = 0 25 | && 26 | (i \neq j; i, j = 1, 2, \ldots, n). \tag{$\ast$} 27 | \end{align} 28 | $$ 29 | 30 | ##### Solution 31 | 32 | Strict exogeneity implies $$ \mathrm{E} (\varepsilon_i) = 0 $$. So $$ (\ast) $$ is equivalent to 33 | 34 | $$ 35 | \begin{align} 36 | \mathrm{E} ( \varepsilon_i^2 ) & = \sigma^2 37 | && 38 | (i = 1, 2, \ldots, n) 39 | \\ 40 | \text{and} \qquad 41 | \mathrm{E} ( \varepsilon_i \varepsilon_j ) & = 0 42 | && 43 | (i \neq j; i, j = 1, 2, \ldots, n). 44 | \end{align} 45 | $$ 46 | 47 | (1) For $$i = 1, 2, \ldots, n$$, 48 | 49 | $$ 50 | \begin{align} 51 | \mathrm{E} (\varepsilon_i^2) 52 | & = 53 | \mathrm{E} [\mathrm{E} (\varepsilon_i^2 \mid \mathbf{X})] 54 | && 55 | \text{(law of total expectations)} 56 | \\ & = 57 | \sigma^2. 58 | && 59 | \text{(Assumption 1.4)} 60 | \end{align} 61 | $$ 62 | 63 | (2) For $$i \neq j; i, j = 1, 2, \ldots, n$$, 64 | 65 | $$ 66 | \begin{align} 67 | \mathrm{E} (\varepsilon_i \varepsilon_j) 68 | & = 69 | \mathrm{E} [ \mathrm{E} (\varepsilon_i \varepsilon_j \mid \mathbf{X}) ] 70 | && 71 | \text{(law of total expectations)} 72 | \\ & = 0. 73 | && 74 | \text{(Assumption 1.4)} 75 | \end{align} 76 | $$ 77 | 78 | --- 79 | 80 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at Jun 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.3 (Deviation-from-the-mean regression) 12 | 13 | Consider a regression model with a constant. Let $$ \mathbf{X} $$ be partitioned as 14 | 15 | $$ 16 | \underset{ n \times K }{ \mathbf{X} } \equiv 17 | \left[ 18 | \underset{ n \times 1 }{ \mathbf{1} } \; 19 | \vdots \; 20 | \underset{ n \times ( K - 1) }{ \mathbf{X}_2 } 21 | \right] 22 | $$ 23 | 24 | so the first regressor is a constant. Partition $$ \boldsymbol{\beta} $$ and $$ \mathbf{b} $$ accordingly: 25 | 26 | $$ 27 | \boldsymbol{\beta} \equiv 28 | \begin{bmatrix} 29 | \beta_1 \\ \boldsymbol{\beta}_2 30 | \end{bmatrix}, 31 | \quad 32 | \mathbf{b} \equiv 33 | \begin{bmatrix} 34 | b_1 \\ \mathbf{b}_2 35 | \end{bmatrix}. 36 | $$ 37 | 38 | Also let $$ \widetilde{ \mathbf{X} }_2 \equiv \mathbf{M}_1 \mathbf{X}_2 $$ and $$ \widetilde{ \mathbf{y} } \equiv \mathbf{M}_1 \mathbf{y} $$ (where $$ \mathbf{M}_1 $$ is defined in [Analytical Exercise 1.2](1.2.md)). They are deviations from the mean for the nonconstant regressors and the dependent variable. Prove the following: 39 | 40 | (a) The $$ K $$ normal equations are 41 | 42 | $$ 43 | \bar{y} = b_1 + \bar{ \mathbf{x} }'_2 \mathbf{b}_2 44 | $$ 45 | 46 | where $$ \bar{ \mathbf{x} }_2 \equiv \mathbf{X}'_2 \mathbf{1} /n $$ and 47 | 48 | $$ 49 | \mathbf{X}'_2 \mathbf{y} = n \cdot b_1 \cdot \bar{ \mathbf{x} }_2 + \mathbf{X}'_2 \mathbf{X}_2 \mathbf{b}_2. 50 | $$ 51 | 52 | (b) $$ \mathbf{b}_2 = ( \widetilde{ \mathbf{X} }'_2 \widetilde{ \mathbf{X} }_2 )^{-1} \widetilde{ \mathbf{X} }'_2 \widetilde{ \mathbf{y} } $$. 
53 | 54 | ##### Solution 55 | 56 | The solution is a special case of the solution to [Analytical Exercise 1.4](1.4.md). 57 | 58 | --- 59 | 60 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.3 14 | 15 | Prove Lemma 2.4(c) 16 | 17 | $$ 18 | \mathbf{x}_n \to_d \mathbf{x} 19 | \text{, } 20 | \mathbf{A}_n \to_p \mathbf{A} 21 | \implies 22 | \mathbf{A}_n \mathbf{x}_n \to_d \mathbf{A} \mathbf{x} 23 | $$ 24 | 25 | from Lemma 2.4(a) 26 | 27 | $$ 28 | \mathbf{x}_n \to_d \mathbf{x} 29 | \text{, } 30 | \mathbf{y}_n \to_p \boldsymbol\alpha 31 | \implies 32 | \mathbf{x}_n + \mathbf{y}_n \to_d 33 | \mathbf{x} + \boldsymbol\alpha 34 | $$ 35 | 36 | and Lemma 2.4(b) 37 | 38 | $$ 39 | \mathbf{x}_n \to_d \mathbf{x} 40 | \text{, } 41 | \mathbf{y}_n \to_p \mathbf{0} 42 | \implies 43 | \mathbf{y}_n' \mathbf{x}_n \to_p \mathbf{0}. 44 | $$ 45 | 46 | **Hint**: $$ \mathbf{A}_n \mathbf{x}_n = ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n + \mathbf{A} \mathbf{x}_n $$. By (b), $$ ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n \to_p \mathbf{0} $$. 47 | 48 | ##### Solution 49 | 50 | Using the add-and-subtract strategy, 51 | 52 | $$ 53 | \mathbf{A}_n \mathbf{x}_n = 54 | ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n + 55 | \mathbf{A} \mathbf{x}_n. 56 | \tag{1} 57 | $$ 58 | 59 | Because $$ \mathbf{A}_n \to_p \mathbf{A} $$, we have $$ \mathbf{A}_n - \mathbf{A} \to_p \mathbf{0} $$. Using $$ \mathbf{x}_n \to_d \mathbf{x} $$ and Lemma 2.4(b), 60 | 61 | $$ 62 | ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n \to_p \mathbf{0}. 63 | \tag{2} 64 | $$ 65 | 66 | Because $$ \mathbf{x}_n \to_d \mathbf{x} $$, by Lemma 2.3(b), 67 | 68 | $$ 69 | \mathbf{A} \mathbf{x}_n \to_d \mathbf{A} \mathbf{x}. 70 | \tag{3} 71 | $$ 72 | 73 | Combining (2) and (3) into (1), using Lemma 2.4(a), 74 | 75 | $$ 76 | \mathbf{A}_n \mathbf{x}_n \to_d \mathbf{A} \mathbf{x}. 77 | $$ 78 | 79 | --- 80 | 81 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.2 (Generalized $$ SSR $$) 14 | 15 | Show that $$ \hat{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } $$ minimizes $$ ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' \mathbf{V}^{-1} ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) $$. 16 | 17 | ##### Solution 18 | 19 | Note that for symmetric $$ \mathbf{V} $$, its inverse $$ \mathbf{V}^{-1} $$ is also symmetric. 
20 | 21 | The objective function is 22 | 23 | $$ 24 | \begin{align} 25 | & ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' \mathbf{V}^{-1} ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 26 | \\ = & 27 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' ( \mathbf{V}^{-1} \mathbf{y} - \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 28 | \\ = & 29 | \mathbf{y}' \mathbf{V}^{-1} \mathbf{y} - 2 \cdot \mathbf{y}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } + \tilde{ \boldsymbol{ \beta } }' \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } }. 30 | \end{align} 31 | $$ 32 | 33 | Taking derivative with respect to $$ \tilde{ \boldsymbol{ \beta } } $$ leads to the first-order condition 34 | 35 | $$ 36 | - 2 \cdot \mathbf{X}' \mathbf{V}^{-1} \mathbf{y} + 2 \cdot \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } = \mathbf{0}, 37 | $$ 38 | 39 | and it reduces to 40 | 41 | $$ 42 | \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } = \mathbf{X}' \mathbf{V}^{-1} \mathbf{y}, 43 | $$ 44 | 45 | which solves that 46 | 47 | $$ 48 | \tilde{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } = ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{y}. 49 | $$ 50 | 51 | --- 52 | 53 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Oct 31, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.2 (Forecasting white noise) 14 | 15 | For the white noise process of Example 2.4, $$ \operatorname*{E} (z_i) = 0 $$. What is $$ \operatorname*{E} (z_i | z_1) $$ for $$ i \ge 2 $$? **Hint**: You should be able to forecast the future exactly if you know the value of $$z_1$$. Is the process an m.d.s? [Answer: No.] 16 | 17 | ##### Solution 18 | 19 | According to the setup of Example 2.4, 20 | 21 | $$ 22 | \begin{align*} 23 | \operatorname*{E} (z_i | z_1) 24 | &= \operatorname*{E} ( \operatorname*{cos} (iw) | \operatorname*{cos} (w) ) 25 | && \text{(definition of $\{z_i\}$)} 26 | \\ 27 | 28 | &= \operatorname*{E} ( \operatorname*{cos} (iw) | w ) 29 | && \text{(same information for $ w \in (0, 2 \pi)$)} 30 | \\ 31 | 32 | &= \operatorname*{cos} (iw), 33 | \end{align*} 34 | $$ 35 | 36 | $$ 37 | \begin{align*} 38 | \operatorname*{Var} (z_i | z_1) 39 | &= \operatorname*{E} (z_i^2 | z_1) - \operatorname*{E} (z_i | z_1)^2 40 | && \text{(formula of $\operatorname*{Var} (\cdot)$)} 41 | \\ 42 | 43 | &= \operatorname*{E} ( \operatorname*{cos} (iw)^2 | \operatorname*{cos} (w) ) - \operatorname*{cos} (iw)^2 44 | && \text{(definition of $\{ z_i \}$)} 45 | \\ 46 | 47 | & 48 | = \operatorname*{E} ( \operatorname*{cos} (iw)^2 | w ) - \operatorname*{cos} (iw)^2 49 | && \text{(same information for $ w \in (0, 2 \pi)$)} 50 | \\ 51 | 52 | &= \operatorname*{cos} (iw)^2 - \operatorname*{cos} (iw)^2 53 | \\ 54 | 55 | &= 0. 56 | \end{align*} 57 | $$ 58 | 59 | This means $$ z_i (i \ge 2) $$ can be forecasted exactly conditional on $$ z_1 $$. 60 | 61 | The process of $$ \{ z_i \} $$ is not an m.d.s, because $$ \operatorname*{E} (z_i | z_{i-1}, \ldots, z_1) \neq 0 $$. 
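A small simulation makes this concrete (a sketch assuming `numpy`; the number of draws is arbitrary). Across draws of $$ w $$ the process looks exactly like white noise, yet the identity $$ \cos(2w) = 2\cos^2(w) - 1 $$ means $$ z_2 = 2 z_1^2 - 1 $$ holds path by path, so $$ \operatorname*{E} (z_2 | z_1) = 2 z_1^2 - 1 \neq 0 $$:

```python
import numpy as np

rng = np.random.default_rng(0)
R, T = 100_000, 5                            # draws of w, forecast horizon
w = rng.uniform(0.0, 2.0 * np.pi, size=R)
i = np.arange(1, T + 1)
z = np.cos(np.outer(w, i))                   # z[r, i-1] = cos(i * w_r)

print(np.round(z.mean(axis=0), 2))           # ~0 for every i: zero mean
print(np.round(np.cov(z, rowvar=False), 2))  # ~0.5 on the diagonal, ~0 off it: white noise

# Perfect forecastability: z_2 is an exact function of z_1 on every path
print(np.allclose(z[:, 1], 2.0 * z[:, 0] ** 2 - 1.0))  # True
```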
62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.4 (Recovering left-out coefficients from restricted OLS) 14 | 15 | Calculate the restricted OLS estimate of $$\beta_5$$ from 16 | 17 | $$ 18 | \log \left( \frac{ TC_i }{ p_{i3} } \right) = 19 | - \underset{ (0.88) }{ 4.7 } + 20 | \underset{ (0.017) }{ 0.72 } \log ( Q_i ) + 21 | \underset{ (0.20) }{ 0.59 } \log \left( 22 | \frac{ p_{i1} }{ p_{i3} } 23 | \right) - 24 | \underset{ (0.19) }{ 0.007 } \log \left( 25 | \frac{ p_{i2} }{ p_{i3} } 26 | \right). 27 | \tag{1.7.8} 28 | $$ 29 | 30 | How do you calculate the standard error of $$ b_5 $$ from the printout of the restricted OLS? 31 | 32 | ##### Solution 33 | 34 | Because of the restriction 35 | 36 | $$ 37 | \beta_3 + \beta_4 + \beta_5 = 1, 38 | $$ 39 | 40 | the restricted OLS estimate of $$ \beta_5 $$ is 41 | 42 | $$ 43 | b_5 = 1 - b_3 - b_4 = 1 - 0.59 - (- 0.007) = 0.417. 44 | $$ 45 | 46 | We can write 47 | 48 | $$ 49 | b_5 = 1 + \mathbf{c}' \mathbf{b}, 50 | $$ 51 | 52 | where 53 | 54 | $$ 55 | \mathbf{c} \equiv 56 | \begin{bmatrix} 57 | 0 \\ 0 \\ -1 \\ -1 58 | \end{bmatrix}, 59 | \qquad 60 | \mathbf{b} \equiv 61 | \begin{bmatrix} 62 | b_1 \\ b_2 \\ b_3 \\ b_4 63 | \end{bmatrix}, 64 | $$ 65 | 66 | then 67 | 68 | $$ 69 | \mathrm{Var} ( b_5 \mid \mathbf{X} ) = 70 | \mathrm{Var} ( 1 + \mathbf{c}' \mathbf{b} | \mathbf{X}) = 71 | \mathbf{c}' \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) \mathbf{c}. 72 | $$ 73 | 74 | From the printout of the restricted OLS regression, we have the estimate $$ \widehat{ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) } $$, then we can calculate the standard error of $$b_5$$ as 75 | 76 | $$ 77 | \mathrm{SE} ( b_5 ) = 78 | \sqrt{ \mathbf{c}' \widehat{ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) } \mathbf{c} }. 79 | $$ 80 | 81 | --- 82 | 83 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /lecture-note/1.2.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at March 26, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Residuals are NOT the error terms 14 | 15 | - $$ \boldsymbol{\beta} $$ is unknown 16 | - overfitting 17 | 18 | #### Basic Algebraic Problems 19 | 20 | ##### Existence (does there exist a solution?) 21 | 22 | - For $$ y = a x^2 + b x + c $$, there exists a solution only if $$ b^2 \geq 4 a c $$. 23 | - For $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, there exists a solution only if $$ \mathbf{y} $$ lies in the **column space** of $$ \mathbf{A} $$. 24 | 25 | ##### Uniqueness (is the solution unique?) 26 | 27 | - For $$ y = a x^2 + b x + c $$, there exists a solution and the solution is unique only if $$ b^2 = 4 a c $$. 28 | - For $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, there exists a solution and the solution is unique only if $$ \mathbf{y} $$ lies in the **column space** of $$ \mathbf{A} $$ and $$ \mathbf{A} $$ has **full column rank**. 
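A tiny numerical illustration of the two matrix bullets above (a sketch assuming `numpy`; the matrices are arbitrary): with full column rank a solution of $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, if it exists, is unique, while a rank-deficient matrix admits many solutions whenever $$ \mathbf{y} $$ lies in its column space.

```python
import numpy as np

# Full column rank (here A is also square, hence invertible): the solution is unique.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([1.0, 1.0])
print(np.linalg.matrix_rank(A))   # 2, full column rank
print(np.linalg.solve(A, y))      # [-1.  1.], the unique solution

# Rank deficient: the second column is twice the first, so rank(B) = 1 < 2.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
z = np.array([1.0, 2.0])          # z lies in the column space of B, so solutions exist
x1 = np.array([1.0, 0.0])
x2 = np.array([-1.0, 1.0])
print(B @ x1, B @ x2)             # both equal [1. 2.]: the solution is not unique
```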
29 | 30 | ##### Analytical Solution (is the solution in closed-form?) 31 | 32 | - If $$ y = a x^2 + b x + c $$ and $$ b^2 \geq 4 a c $$, then the solutions are $$ x = (-b \pm \sqrt{b^2 - 4ac}) /(2a) $$. 33 | 34 | - If $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$ and $$ \mathbf{A} $$ is (square and) invertible, then the unique solution is $$ \mathbf{x} = \mathbf{A}^{-1} \mathbf{y} $$. 35 | 36 | #### Vector Differentiation 37 | 38 | - For the real-valued function $$ y = f( \mathbf{x} ) = \mathbf{a}' \mathbf{x} $$ (inner product form),
$$ 40 | \frac{df(\mathbf{x})}{d\mathbf{x}} = \mathbf{a}. 41 | $$ 42 | 43 | - For the real-valued function $$ y = f( \mathbf{x} ) = \mathbf{x}' \mathbf{A} \mathbf{x} $$ (quadratic form),
$$ 45 | \frac{df(\mathbf{x})}{d\mathbf{x}} = ( \mathbf{A} + \mathbf{A}' ) \mathbf{x}, \quad \text{which equals } 2 \mathbf{A} \mathbf{x} \text{ when } \mathbf{A} \text{ is symmetric}. 46 | $$ 47 | 48 | #### Matrix Multiplication 49 | 50 | There are equivalently _four_ ways of [matrix multiplication](../supplements/matrix-multiplication.md), each of which is important. 51 | 52 | --- 53 | 54 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.7 14 | 15 | A more realistic assumption about the rental price of capital may be that there is an economy-wide capital market so $$p_{i2}$$ is the same across firms. In this case, 16 | 17 | (a) Can we estimate the technology parameters? 18 | 19 | (b) Can we test homogeneity of the cost function in factor prices? 20 | 21 | ##### Solution 22 | 23 | (a) When $$p_{i2}$$ is the same across firms, there is a perfect multicollinearity problem, so that Assumption 1.3 fails. It is then not possible to estimate the 5 parameters $$ ( \beta_1, \ldots, \beta_5 ) $$ _simultaneously_ from the unrestricted regression (1.7.7). 24 | 25 | But recall that $$ ( \beta_1, \ldots, \beta_5 ) $$ are not 5 freely varying parameters; they satisfy 26 | 27 | $$ 28 | \beta_3 + \beta_4 + \beta_5 = 1. 29 | \tag{1} 30 | $$ 31 | 32 | It is therefore safe to disregard $$ \beta_4 $$ and estimate $$ ( \beta_1, \beta_2, \beta_3, \beta_5 ) $$ from the restricted OLS regression 33 | 34 | $$ 35 | \log \left( \frac{ TC_i }{ p_{i2} } \right) = \beta_1 + 36 | \beta_2 \log ( Q_i ) + 37 | \beta_3 \log \left( \frac{ p_{i1} }{ p_{i2} } \right) + 38 | \beta_5 \log \left( \frac{ p_{i3} }{ p_{i2} } \right) + 39 | \varepsilon_i. 40 | \tag{2} 41 | $$ 42 | 43 | Then the estimate of $$\beta_4$$ can be calculated from (1). 44 | 45 | So the answer is _yes_. Even though there is a perfect multicollinearity problem when $$p_{i2}$$ is the same across firms, $$ ( \beta_1, \ldots, \beta_5 ) $$ _can_ be estimated. 46 | 47 | (b) No, because when the price of capital is constant across firms we are forced to use the adding-up restriction $$ \beta_3 + \beta_4 + \beta_5 = 1 $$ to calculate $$ \beta_4 $$ (capital's contribution) from the OLS estimates of $$\beta_3$$ and $$\beta_5$$. After all, we cannot test a restriction that cannot be relaxed.
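To make (a) concrete, here is a simulation sketch (assuming `numpy`; the parameter values and the common capital price are made up, with $$ \beta_3 + \beta_4 + \beta_5 = 1 $$ imposed in the data-generating process). Even though $$ p_{i2} $$ is identical across firms, the restricted regression (2) recovers $$ ( \beta_1, \beta_2, \beta_3, \beta_5 ) $$, and $$ \beta_4 $$ is then backed out from (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
b1, b2, b3, b4, b5 = -4.7, 0.72, 0.45, 0.25, 0.30   # true values, b3 + b4 + b5 = 1
log_Q = rng.normal(size=n)
log_p1 = rng.normal(size=n)
log_p3 = rng.normal(size=n)
log_p2 = np.full(n, 0.7)            # rental price of capital, identical across firms
eps = 0.1 * rng.normal(size=n)
log_TC = b1 + b2 * log_Q + b3 * log_p1 + b4 * log_p2 + b5 * log_p3 + eps

# Restricted regression (2): deflate total cost and the remaining prices by p_{i2}
Y = log_TC - log_p2
X = np.column_stack([np.ones(n), log_Q, log_p1 - log_p2, log_p3 - log_p2])
b1_hat, b2_hat, b3_hat, b5_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
b4_hat = 1.0 - b3_hat - b5_hat      # recovered from the adding-up restriction (1)
print(np.round([b1_hat, b2_hat, b3_hat, b4_hat, b5_hat], 2))  # close to the true values
```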
48 | 49 | --- 50 | 51 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/var-cov-matrix.md: -------------------------------------------------------------------------------- 1 | # Variance-Covariance Matrix of Random Vectors 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | The **variance-covariance matrix** of a random vector $$ \mathbf{x} $$ is defined as 8 | 9 | $$ 10 | \begin{align} 11 | \mathrm{Var} ( \mathbf{x} ) & \equiv 12 | \mathrm{E} [ ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) ) 13 | ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) )' ] 14 | \qquad \text{(the definition)} 15 | \\ & = 16 | \mathrm{E} [ \mathbf{x} \mathbf{x}' - 17 | \mathbf{x} \mathrm{E} ( \mathbf{x} )' - 18 | \mathrm{E} ( \mathbf{x} ) \mathbf{x}' + 19 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' ] 20 | \\ & = 21 | \mathrm{E} ( \mathbf{x} \mathbf{x}' ) - 22 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' - 23 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' + 24 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' 25 | \\ & = 26 | \mathrm{E} ( \mathbf{x} \mathbf{x}' ) - 27 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )'. 28 | \qquad \text{(the formula)} 29 | \end{align} 30 | $$ 31 | 32 | The last equation is the convenient formula for calculating the variance. 33 | 34 | The **covariance matrix** between two random vectors $$ \mathbf{x} $$ and $$ \mathbf{y} $$ is defined as 35 | 36 | $$ 37 | \begin{align} 38 | \mathrm{Cov} ( \mathbf{x}, \mathbf{y} ) & \equiv 39 | \mathrm{E} [ ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) ) 40 | ( \mathbf{y} - \mathrm{E} ( \mathbf{y} ) )' ] 41 | \qquad \text{(the definition)} 42 | \\ & = 43 | \mathrm{E} [ \mathbf{x} \mathbf{y}' - 44 | \mathbf{x} \mathrm{E} ( \mathbf{y} )' - 45 | \mathrm{E} ( \mathbf{x} ) \mathbf{y}' + 46 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' ] 47 | \\ & = 48 | \mathrm{E} ( \mathbf{x} \mathbf{y}' ) - 49 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' - 50 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' + 51 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' 52 | \\ & = 53 | \mathrm{E} ( \mathbf{x} \mathbf{y}' ) - 54 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )'. 55 | \qquad \text{(the formula)} 56 | \end{align} 57 | $$ 58 | 59 | The last equation is the convenient formula for calculating the covariance. 60 | 61 | --- 62 | 63 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.1 (Proof that $$ \mathbf{b} $$ minimizes $$ SSR $$) 12 | 13 | Let $$ \mathbf{b} $$ be the OLS estimator of $$ \boldsymbol{ \beta } $$. Prove that, for any hypothetical estimator $$ \tilde{ \boldsymbol{ \beta } } $$ of $$ \boldsymbol{ \beta } $$, 14 | 15 | $$ 16 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 17 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 18 | \ge 19 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 20 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ).
21 | $$ 22 | 23 | ##### Solution 24 | 25 | $$ 26 | \begin{align} 27 | & 28 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 29 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 30 | \\ = & 31 | [ ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) ]' 32 | [ ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) ] 33 | \\ = & 34 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 35 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + 36 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 37 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) + 38 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 39 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) 40 | \\ & + 41 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 42 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) 43 | \\ = & 44 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 45 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + 46 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 47 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) 48 | \qquad 49 | \text{ ( because $ \mathbf{X}' \mathbf{y} = \mathbf{X}' \mathbf{X} \mathbf{b} $ ) } 50 | \\ \ge & 51 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 52 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ). 53 | \qquad 54 | \text{ (because $ \mathbf{X}' \mathbf{X} $ is positive definite and $ \tilde{ \boldsymbol{ \beta } } $ could $= \mathbf{b}$) } 55 | \end{align} 56 | $$ 57 | 58 | --- 59 | 60 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.3 14 | 15 | Derive the expression for $$ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) $$ for the generalized regression model. What is the relation of it to $$ \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } \mid \mathbf{X} ) $$? Verify that Proposition 1.7(c) (efficiency of GLS) implies 16 | 17 | $$ 18 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \ge ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 19 | $$ 20 | 21 | ##### Solution 22 | 23 | (a) For the generalized regression model, 24 | 25 | $$ 26 | \begin{align} 27 | \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) 28 | & = 29 | \mathrm{Var} ( \mathbf{b} - \boldsymbol{ \beta } \mid \mathbf{X} ) 30 | \\ & = 31 | \mathrm{Var} ( ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \boldsymbol{ \varepsilon } \mid \mathbf{X} ) 32 | \\ & = 33 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathrm{Var} ( \boldsymbol{ \varepsilon } \mid \mathbf{X} ) \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} 34 | \\ & = 35 | \sigma^2 \cdot 36 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1}. 37 | \tag{1} 38 | \end{align} 39 | $$ 40 | 41 | (b) According to the text, 42 | 43 | $$ 44 | \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} \mid \mathbf{X} ) = \sigma^2 \cdot ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 
45 | \tag{1.6.6} 46 | $$ 47 | 48 | Because $$ \mathbf{b} $$ is unbiased, following the Gauss-Markov theorem in Proposition 1.7(c), 49 | 50 | $$ 51 | \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) \ge 52 | \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} \mid \mathbf{X} ) 53 | \tag{2} 54 | $$ 55 | 56 | in the matrix sense. 57 | 58 | (c) Substituting (1) and (1.6.6) into (2), it is easy to derive that 59 | 60 | $$ 61 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \ge ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 62 | $$ 63 | 64 | --- 65 | 66 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Nov 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.3 (No anticipated changes in martingales) 14 | 15 | Suppose $$\{ x_i \}$$ is a martingale with respect to $$\{ \mathbf{z}_i \}$$. Show that $$ \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = x_{i-1} $$ and $$ \operatorname*{E} (x_{i+j+1} - x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = 0 $$ for $$ j = 0, 1, \ldots $$ 16 | 17 | **Hint**: Use the Law of Iterated Expectations. 18 | 19 | ##### Solution 20 | 21 | For $$j = 0, 1, \ldots ,$$ 22 | 23 | $$ 24 | \begin{align*} 25 | \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 26 | &= \operatorname*{E} [ \operatorname*{E} (x_{i+j} | \mathbf{z}_{i+j-1}, \mathbf{z}_{i+j-2}, \ldots, \mathbf{z}_1) | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 27 | \\ 28 | 29 | &= \operatorname*{E} [ x_{i+j-1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 30 | \\ 31 | 32 | &= \operatorname*{E} [ \operatorname*{E} (x_{i+j-1} | \mathbf{z}_{i+j-2}, \mathbf{z}_{i+j-3}, \ldots, \mathbf{z}_1) | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 33 | \\ 34 | 35 | &= \operatorname*{E} [ x_{i+j-2} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 36 | \\ 37 | 38 | &= \cdots 39 | \\ 40 | 41 | &= \operatorname*{E} [ x_{i} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 42 | \\ 43 | 44 | &= x_{i-1}. 45 | \end{align*} 46 | $$ 47 | 48 | Similarly, $$ \operatorname*{E} (x_{i+j+1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = x_{i-1} $$. Combining results, 49 | 50 | $$ 51 | \begin{align*} 52 | \operatorname*{E} (x_{i+j+1} - x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 53 | &= \operatorname*{E} (x_{i+j+1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) - \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 54 | \\ 55 | 56 | &= x_{i-1} - x_{i-1} 57 | \\ 58 | 59 | &= 0. 60 | \end{align*} 61 | $$ 62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 
12 | 13 | #### Review Question 1.5.5 (Likelihood equations for classical regression model) 14 | 15 | We used the two-step procedure to derive the ML estimate for the classical regression model. An alternative way to find the ML estimator is to solve the first-order conditions 16 | 17 | $$ 18 | \begin{align} 19 | \frac{ \partial \log L( \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \boldsymbol{ \beta } } } 20 | & = 21 | \frac{1}{ \tilde{ \gamma } } \mathbf{X}' ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) = \mathbf{0}, 22 | \tag{1.5.13a} 23 | \\ 24 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \gamma } } 25 | & = 26 | -\frac{n}{ 2 \tilde{ \gamma } } + \frac{1}{ 2 \tilde{ \gamma }^2 } ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) = 0. 27 | \tag{1.5.13b} 28 | \end{align} 29 | $$ 30 | 31 | The first-order conditions for the log likelihood are called the **likelihood equations**. Verify that the ML estimator given in Proposition 1.5 solves the likelihood equations. 32 | 33 | ##### Solution 34 | 35 | Proposition 1.5 states that the ML estimator is 36 | 37 | $$ 38 | \begin{align} 39 | \hat{ \boldsymbol{ \beta } } & = \mathbf{b} = ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{y}, 40 | \tag{1} 41 | \\ 42 | \hat{ \gamma } & = \frac{ \mathbf{e}' \mathbf{e} }{n}. 43 | \tag{2} 44 | \end{align} 45 | $$ 46 | 47 | Substituting (1) and (2) for $$ \tilde{ \boldsymbol{ \beta } } $$ and $$ \tilde{ \gamma } $$ in (1.5.13a), we have 48 | 49 | $$ 50 | \frac{n}{ \mathbf{e}' \mathbf{e} } \mathbf{X}' \mathbf{e} = 51 | \mathbf{0}. 52 | \tag{3} 53 | $$ 54 | 55 | Equation (3) holds because $$ \mathbf{X}' \mathbf{e} = \mathbf{0} $$, as stated in (1.2.3'). 56 | 57 | Substituting (1) and (2) for $$ \tilde{ \boldsymbol{ \beta } } $$ and $$ \tilde{ \gamma } $$ in (1.5.13b), we have 58 | 59 | $$ 60 | -\frac{ n^2 }{ 2 \cdot \mathbf{e}' \mathbf{e} } + 61 | \frac{ n^2 }{2 \cdot \mathbf{e}' \mathbf{e} \cdot \mathbf{e}' \mathbf{e}} \mathbf{e}' \mathbf{e} = 0. 62 | \tag{4} 63 | $$ 64 | 65 | Equation (4) holds by cancelling terms. 66 | 67 | --- 68 | 69 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.5 (Relation between $$F(1, n - K)$$ and $$t(n - K)$$) 14 | 15 | Look up the $$t$$ and $$F$$ distribution tables to verify that $$ F_\alpha(1, n - K) = ( t_{\alpha / 2} (n - K) )^2 $$ for degrees of freedom and significance levels of your choice. 16 | 17 | ##### Solution 18 | 19 | We can verify their equality with the following `Python` code.
20 | 21 | ```python 22 | In [1]: import numpy as np 23 | In [2]: from scipy.stats import t # the t-distribution 24 | In [3]: from scipy.stats import f # the F-distribution 25 | In [4]: sz = [0.05, 0.01, 0.005, 0.001] # sizes 26 | In [5]: df = [20, 40, 60, 80, 100, 200, 500, 1000] # degrees of freedom 27 | In [6]: SZ, DF = np.meshgrid(sz, df) # generate meshgrid table 28 | In [7]: f.isf(SZ, 1, DF) # isf = inverse survival function = 1 - cdf 29 | Out[7]: 30 | array([[ 4.3512435 , 8.09595806, 9.94393492, 14.81877555], 31 | [ 4.08474573, 7.31409993, 8.82785886, 12.60935783], 32 | [ 4.00119138, 7.07710579, 8.49461671, 11.97298729], 33 | [ 3.96035242, 6.96268806, 8.33460762, 11.67136163], 34 | [ 3.93614299, 6.89530103, 8.24064017, 11.49543133], 35 | [ 3.88837472, 6.76329947, 8.05715996, 11.15450054], 36 | [ 3.86012404, 6.68583329, 7.94984966, 10.95670343], 37 | [ 3.85077467, 6.66029481, 7.91453232, 10.89186556]]) 38 | In [8]: t.isf(SZ/2, DF)**2 # isf = inverse survival function = 1 - cdf 39 | Out[8]: 40 | array([[ 4.3512435 , 8.09595806, 9.94393492, 14.81877554], 41 | [ 4.0847457 , 7.31409993, 8.82785886, 12.60935783], 42 | [ 4.00119137, 7.07710581, 8.49461671, 11.97298729], 43 | [ 3.96035242, 6.96268805, 8.33460759, 11.67136163], 44 | [ 3.93614299, 6.89530103, 8.24064016, 11.49543139], 45 | [ 3.88837472, 6.76329947, 8.05715996, 11.15450054], 46 | [ 3.86012404, 6.68583329, 7.94984966, 10.95670343], 47 | [ 3.85077467, 6.66029481, 7.91453232, 10.89186556]]) 48 | ``` 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.3 (Concentrated log likelihood with respect to $$ \tilde{ \sigma }^2 $$) 14 | 15 | Writing $$ \tilde{ \sigma }^2 $$ as $$ \tilde{ \gamma } $$, the log likelihood function for the classical regression model is 16 | 17 | $$ 18 | \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \gamma } ) = 19 | - \frac{n}{2} \log ( 2 \pi ) - 20 | \frac{n}{2} \log ( \tilde{ \gamma } ) - 21 | \frac{1}{ 2 \tilde{ \gamma } } 22 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 23 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ). 24 | \tag{1} 25 | $$ 26 | 27 | In the two-step maximization procedure described in the text, we first maximized this function with respect to $$ \tilde{ \boldsymbol{ \beta } } $$. Instead, first maximize with respect to $$ \tilde{ \gamma } $$ given $$ \tilde{ \boldsymbol{ \beta } } $$. Show that the concentrated log likelihood (concentrated with respect to $$ \tilde{ \gamma } \equiv \tilde{ \sigma }^2 $$) is 28 | 29 | $$ 30 | - \frac{n}{2} [ 1 + \log ( 2 \pi ) ] - \frac{n}{2} \log \left( 31 | \frac{ ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 32 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{ n } 33 | \right). 
34 | \tag{2} 35 | $$ 36 | 37 | ##### Solution 38 | 39 | When maximizing (1) with respect to $$ \tilde{ \gamma } $$ given $$ \tilde{ \boldsymbol{ \beta } } $$, the first-order condition is 40 | 41 | $$ 42 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \gamma } ) }{ \partial \tilde{ \gamma } } = 43 | - \frac{n}{2} \frac{1}{ \tilde{ \gamma } } + \frac{ 44 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 45 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{2} 46 | \frac{1}{ \tilde{ \gamma }^2 } = 0, 47 | $$ 48 | 49 | which gives 50 | 51 | $$ 52 | \tilde{ \gamma } = \frac{ 53 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 54 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{n}. 55 | \tag{3} 56 | $$ 57 | 58 | Substituting partial solution (3) into objective function (1), we get the concentrated log likelihood function (2). 59 | 60 | --- 61 | 62 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.3 (Combining linearity and strict exogeneity) 14 | 15 | Show that Assumptions 1.1 and 1.2 imply 16 | 17 | $$ 18 | \mathrm{E} ( y_i \mid \mathbf{X} ) = \mathbf{x}_i' \boldsymbol{\beta} 19 | \qquad 20 | \text{($i = 1, 2, \ldots, n$)} 21 | \tag{1.1.20} 22 | $$ 23 | 24 | Conversely, show that this assumption implies that there exist error terms that satisfy those two assumptions. 25 | 26 | ##### Solution 27 | 28 | (1) Firstly, we prove Assumption 1.1 and 1.2 imply (1.1.20). 29 | 30 | $$ 31 | \begin{align} 32 | \mathrm{E} ( y_i \mid \mathbf{X} ) 33 | & = \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} + \varepsilon_i \mid \mathbf{X} ) 34 | && 35 | \text{(Assumption 1.1)} 36 | \\ & = 37 | \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) + \mathrm{E} ( \varepsilon_i \mid \mathbf{X} ) 38 | && 39 | \text{(linearity of conditional expectatioins)} 40 | \\ & = 41 | \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 42 | && 43 | \text{(Assumption 1.2)} 44 | \\& = 45 | \mathbf{x}_i' \boldsymbol{\beta} 46 | && 47 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$, $\boldsymbol{\beta}$ is constant)} 48 | \end{align} 49 | $$ 50 | 51 | (2) Conversely, we prove (1.1.20) implies there exist error terms that satisfy Assumption 1.1 and 1.2. 52 | 53 | We _define_ the error term as 54 | 55 | $$ 56 | \varepsilon_i = y_i - \mathbf{x}_i' \boldsymbol{\beta}, 57 | $$ 58 | 59 | then Assumption 1.1 is satisfied. 
To prove assumption 1.2, notice that 60 | 61 | $$ 62 | \begin{align} 63 | \mathrm{E} (\varepsilon_i \mid \mathbf{X}) 64 | & = 65 | \mathrm{E} ( y_i - \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X}) 66 | && 67 | \text{(definition of $\varepsilon_i$)} 68 | \\ & = 69 | \mathrm{E} ( y_i \mid \mathbf{X} ) - \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 70 | && 71 | \text{(linearity of conditional expectations)} 72 | \\ & = 73 | \mathbf{x}_i' \boldsymbol{\beta} - \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 74 | && 75 | \text{(1.1.20)} 76 | \\ & = 77 | \mathbf{x}_i' \boldsymbol{\beta} - \mathbf{x}_i' \boldsymbol{\beta} 78 | && 79 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$, $\boldsymbol{\beta}$ is constant)} 80 | \\ & = 0. 81 | \end{align} 82 | $$ 83 | 84 | --- 85 | 86 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.5 (Matrix algebra of fitted values and residuals) 14 | 15 | Show the following: 16 | 17 | (a) $$ \hat{\mathbf{y}} = \mathbf{P} \mathbf{y} $$, $$ \mathbf{e} = \mathbf{M} \mathbf{y} = \mathbf{M} \boldsymbol{\varepsilon} $$. 18 | 19 | (b) $$ \mathrm{SSR} = \boldsymbol{\varepsilon}' \mathbf{M} \boldsymbol{\varepsilon} $$. 20 | 21 | ##### Solution 22 | 23 | (a.1) $$ \hat{\mathbf{y}} = \mathbf{P} \mathbf{y} $$ because 24 | 25 | $$ 26 | \begin{align} 27 | \hat{\mathbf{y}} 28 | & = 29 | \mathbf{X} \mathbf{b} 30 | && 31 | \text{(definition of $\hat{\mathbf{y}}$)} 32 | \\ & = 33 | \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{y} 34 | && 35 | \text{(definition of $\mathbf{b}$)} 36 | \\ & = 37 | \mathbf{P} \mathbf{y}. 
38 | && 39 | \text{(definition of $\mathbf{P}$)} 40 | \end{align} 41 | $$ 42 | 43 | (a.2) $$ \mathbf{e} = \mathbf{M} \mathbf{y} = \mathbf{M} \boldsymbol{\varepsilon} $$ because 44 | 45 | $$ 46 | \begin{align} 47 | \mathbf{e} & = \mathbf{y} - \mathbf{X} \mathbf{b} 48 | && 49 | \text{(definition of $\mathbf{e}$)} 50 | \\ & = 51 | \mathbf{y} - \hat{\mathbf{y}} 52 | && 53 | \text{(definition of $\hat{\mathbf{y}}$)} 54 | \\ & = 55 | \mathbf{y} - \mathbf{P} \mathbf{y} 56 | && 57 | (\hat{\mathbf{y}} = \mathbf{P} \mathbf{y}) 58 | \\ & = 59 | (\mathbf{I} - \mathbf{P}) \mathbf{y} 60 | \\ & = 61 | \mathbf{M} \mathbf{y} 62 | && 63 | \text{(definition of $\mathbf{M}$)} 64 | \\ & = 65 | \mathbf{M} ( \mathbf{X} \boldsymbol{\beta} + \boldsymbol{ \varepsilon } ) 66 | && 67 | \text{(Assumption 1.1)} 68 | \\ & = 69 | \mathbf{M} \mathbf{X} \boldsymbol{\beta} + 70 | \mathbf{M} \boldsymbol{\varepsilon} 71 | \\ & = 72 | \mathbf{M} \boldsymbol{\varepsilon} 73 | && 74 | (\mathbf{M} \mathbf{X} = \mathbf{0}) 75 | \end{align} 76 | $$ 77 | 78 | (b) $$ \mathrm{SSR} = \boldsymbol{\varepsilon}' \mathbf{M} \boldsymbol{\varepsilon} $$ because 79 | 80 | $$ 81 | \begin{align} 82 | \mathrm{SSR} & = \mathbf{e}' \mathbf{e} 83 | && 84 | \text{(definition of $\mathrm{SSR}$)} 85 | \\ & = 86 | (\mathbf{M} \boldsymbol{\varepsilon})' 87 | (\mathbf{M} \boldsymbol{\varepsilon}) 88 | && 89 | ( \mathbf{e} = \mathbf{M} \boldsymbol{\varepsilon} ) 90 | \\ & = 91 | \boldsymbol{\varepsilon}' \mathbf{M} \mathbf{M} 92 | \boldsymbol{\varepsilon} 93 | \\ & = 94 | \boldsymbol{\varepsilon}' \mathbf{M} 95 | \boldsymbol{\varepsilon}. 96 | && 97 | \text{($\mathbf{M}$ is idempotent)} 98 | \end{align} 99 | $$ 100 | 101 | --- 102 | 103 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.8.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.8 14 | 15 | Taking logs of both sides of the production function (1.7.1), one can derive the log-linear relationship 16 | 17 | $$ 18 | \log ( Q_i ) = \alpha_0 + \alpha_1 \log ( x_{i1} ) + 19 | \alpha_2 \log ( x_{i2} ) + \alpha_3 \log ( x_{i3} ) + 20 | \varepsilon_i, 21 | \tag{1} 22 | $$ 23 | 24 | where 25 | 26 | $$ 27 | \alpha_0 \equiv \mathrm{E} [ \log ( A_i ) ], 28 | \qquad 29 | \varepsilon_i \equiv \log ( A_i ) - \mathrm{E} [ \log ( A_i ) ]. 30 | $$ 31 | 32 | Suppose, in addition to total costs, output, and factor prices, we had data on factor inputs. (a) Can we estimate $$ \alpha $$'s by applying OLS to this log-linear relationship? Why or why not? (b) Suggest a different way to estimate $$\alpha$$'s. 33 | 34 | ##### Solution 35 | 36 | (a) The economic interpretation of the error term $$ \varepsilon_i $$ represents the firm's production efficiency relative to the industry's average efficiency. The input choice of the firm $$ ( x_{i1}, x_{i2}, x_{i3} ) $$ can depend on $$ \varepsilon_i $$, making regressors not be orthogonal to the error term. Applying OLS to log-linear relationship (1) will result in biased estimator, so we cannot estimate $$\alpha$$'s using OLS. 37 | 38 | (b) Following microeconomic theory, under the Cobb-Douglas technology, input shares do not depend on factor prices. 39 | 40 | This can be seen as follows. 
From equation (8) and (10) in [solution to review question 1.7.1](1.7.1.md), 41 | 42 | $$ 43 | \begin{align} 44 | \frac{ p_{i2} x_{i2} }{ p_{i1} x_{i1} } & = 45 | \frac{ \alpha_2 }{ \alpha_1 }, 46 | \tag{1} 47 | \\ 48 | \frac{ p_{i3} x_{i3} }{ p_{i1} x_{i1} } & = 49 | \frac{ \alpha_3 }{ \alpha_1 }. 50 | \tag{2} 51 | \end{align} 52 | $$ 53 | 54 | The input shares are calculated as 55 | 56 | $$ 57 | \frac{ p_{i1} x_{i1} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } = 58 | \frac{1}{ 1 + \alpha_2 / \alpha_1 + \alpha_3 / \alpha_1 } = 59 | \frac{ \alpha_1 }{ \alpha_1 + \alpha_2 + \alpha_3 }, 60 | \tag{3} 61 | $$ 62 | 63 | similarly, 64 | 65 | $$ 66 | \begin{align} 67 | \frac{ p_{i2} x_{i2} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } & = 68 | \frac{ \alpha_2 }{ \alpha_1 + \alpha_2 + \alpha_3 }, 69 | \tag{4} 70 | \\ 71 | \frac{ p_{i3} x_{i3} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } & = 72 | \frac{ \alpha_3 }{ \alpha_1 + \alpha_2 + \alpha_3 }. 73 | \tag{5} 74 | \end{align} 75 | $$ 76 | 77 | It is evident that these input shares are determined completely by parameters and do not depend on factor prices. 78 | 79 | Under constant returns to scale, these shares equal to $$\alpha_1$$, $$\alpha_2$$ and $$\alpha_3$$ respectively. So we can estimate these parameters using sample averages of input shares. 80 | 81 | --- 82 | 83 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.1 (Use of regularity conditions) 14 | 15 | Assuming that taking expectations (i.e. taking integrals) and differentiation can be interchanged, prove that the expected value of the score vector, 16 | 17 | $$ 18 | \mathbf{s} ( \tilde{ \boldsymbol{ \theta } } ) \equiv 19 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) } 20 | { \partial \tilde{ \boldsymbol{ \theta } } }, 21 | \tag{1.5.9} 22 | $$ 23 | 24 | if evaluated at the true parameter value $$ \boldsymbol{\theta} $$, is zero. 25 | 26 | ##### Solution 27 | 28 | We start from the identity on pdf, 29 | 30 | $$ 31 | \int f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) \, d \mathbf{z} = 1. 32 | \tag{1} 33 | $$ 34 | 35 | Differentiate both sides of (1) with respect to $$ \tilde{ \boldsymbol{ \theta } } $$ and use the regularity condition, which allows us to interchange integration and differentiation, we obtain 36 | 37 | $$ 38 | \int \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 39 | { \partial \tilde{ \boldsymbol{ \theta } } } 40 | \, d \mathbf{z} = 0. 41 | \tag{2} 42 | $$ 43 | 44 | Dividing $$ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) $$ and multiplying $$ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) $$ on the integrand of (2), 45 | 46 | $$ 47 | \int \frac{1}{ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 48 | \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 49 | { \partial \tilde{ \boldsymbol{ \theta } } } \cdot 50 | f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) 51 | \, d \mathbf{z} = 0. 
52 | \tag{3} 53 | $$ 54 | 55 | From basic calculus, the score vector function (1.5.9) can be written as 56 | 57 | $$ 58 | \mathbf{s} ( \tilde{ \boldsymbol{ \theta } } ) = 59 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) } 60 | { \partial \tilde{ \boldsymbol{ \theta } } } = 61 | \frac{ \partial \log f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } = 62 | \frac{1}{ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 63 | \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 64 | { \partial \tilde{ \boldsymbol{ \theta } } }. 65 | \tag{4} 66 | $$ 67 | 68 | Combining (3) and (4), evaluating at the true parameter value $$\boldsymbol{ \theta }$$, using the definition of expectation, 69 | 70 | $$ 71 | \int \frac{1}{ f( \mathbf{z}; \boldsymbol{ \theta } ) } 72 | \frac{ \partial f ( \mathbf{z}; \boldsymbol{ \theta } ) } 73 | { \partial \boldsymbol{ \theta } } \cdot 74 | f( \mathbf{z}; \boldsymbol{ \theta } ) 75 | \, d \mathbf{z} = 76 | \int \mathbf{s} ( \boldsymbol{ \theta } ) \cdot 77 | f( \mathbf{z}; \boldsymbol{ \theta } ) 78 | \, d \mathbf{z} = 79 | \mathrm{E} [ \mathbf{s} ( \boldsymbol{ \theta } ) ] 80 | = 0. 81 | $$ 82 | 83 | --- 84 | 85 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.5 (Combine Delta method with Lindeberg-Levy) 14 | 15 | Let $$ \{ z_i \} $$ be a sequence of i.i.d. (independently and identically distributed) random variables with $$ \operatorname{E} ( z_i ) = \mu \neq 0 $$ and $$ \operatorname{Var} ( z_i ) = \sigma^2 $$, and let $$ \bar{z}_n $$ be the sample mean. Show that 16 | 17 | $$ 18 | \sqrt{n} 19 | \left( \frac{1}{\bar{z}_n} - \frac{1}{\mu} 20 | \right) 21 | \to_d N 22 | \left( 0, \frac{ \sigma^2 }{ \mu^4 } 23 | \right). 24 | $$ 25 | 26 | **Hint**: In Lemma 2.5, set $$ \boldsymbol{\beta} = \mu $$, $$ \mathbf{a} ( \boldsymbol{\beta} ) = 1 / \mu $$, $$ \mathbf{x}_n = \bar{z}_n $$. 27 | 28 | ##### Solution 29 | 30 | By Linderberg-Levy CLT, 31 | 32 | $$ 33 | \sqrt{n} ( \bar{z}_n - \mu ) \to_d N(0, \sigma^2). 
34 | $$ 35 | 36 | Set $$ \boldsymbol{\beta} = \mu $$, $$ \mathbf{a} ( \boldsymbol{\beta} ) = 1 / \mu $$, $$ \mathbf{x}_n = \bar{z}_n $$, 37 | 38 | $$ 39 | \mathbf{A} ( \boldsymbol{ \beta } ) = 40 | - \frac{ 1 }{ \mu^2 }, 41 | $$ 42 | 43 | using Lemma 2.5, 44 | 45 | $$ 46 | \begin{align} 47 | \sqrt{n} \left( \frac{ 1 }{ \bar{z}_n } - 48 | \frac{ 1 }{ \mu } \right) & = 49 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - 50 | \mathbf{a} ( \boldsymbol{ \beta } ) ] \\ 51 | & \to_d N \left( 0, \left( - \frac{1}{ \mu^2 } \right) \sigma^2 \left( - \frac{1}{ \mu^2 } \right) \right) \\ 52 | & = N \left( 0, \frac{ \sigma^2 } { \mu^4 } \right) 53 | \end{align} 54 | $$ 55 | 56 | ##### Appendix 57 | 58 | **Lemma 2.5 (the “delta method”)**: Suppose $$ \mathbf{x}_n $$ is a sequence of $$K$$-dimensional random vectors such that $$ \mathbf{x}_n \to_p \boldsymbol{\beta} $$ and 59 | 60 | $$ 61 | \sqrt{n} ( \mathbf{x}_n - \boldsymbol{\beta} ) 62 | \to_d \mathbf{z}, 63 | $$ 64 | 65 | and suppose $$ \mathbf{a} (\cdot): \mathbb{R}^K \to \mathbb{R}^r $$ has continuous first derivatives with $$ \mathbf{A} ( \boldsymbol{\beta} ) $$ denoting the $$ r \times K $$ matrix of first derivatives evaluated at $$ \boldsymbol{\beta} $$: 66 | 67 | $$ 68 | \underset{ ( r \times K ) }{ \mathbf{A} ( \boldsymbol{\beta} ) } 69 | \equiv 70 | \frac{ 71 | \partial \mathbf{a} ( \boldsymbol{ \beta } ) } 72 | { \partial \boldsymbol{ \beta }' }. 73 | $$ 74 | 75 | Then 76 | 77 | $$ 78 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - \mathbf{a} ( 79 | \boldsymbol{ \beta }) ] \to_d 80 | \mathbf{A} ( \boldsymbol{ \beta } ) \mathbf{z}. 81 | $$ 82 | 83 | In particular: 84 | 85 | $$ 86 | \sqrt{n} ( \mathbf{x}_n - \boldsymbol{ \beta } ) \to_d 87 | N( \mathbf{0}, \boldsymbol{\Sigma} ) 88 | \implies 89 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - \mathbf{a} 90 | ( \boldsymbol{ \beta } ) ] \to_d N( \mathbf{0}, 91 | \mathbf{A} ( \boldsymbol{ \beta } ) 92 | \boldsymbol{ \Sigma } \mathbf{A} ( \boldsymbol{ \beta } )' ). 93 | $$ 94 | 95 | --- 96 | 97 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.2 (The annihilator associated with the vector of ones) 12 | 13 | Let $$ \mathbf{1} $$ be the $$n$$-dimensional column vector of ones, and let $$ \mathbf{M}_1 \equiv \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' $$. That is, $$ \mathbf{M}_1 $$ is the annihilator associated with $$ \mathbf{1} $$. Prove the following: 14 | 15 | (a) $$ \mathbf{M}_1 $$ is symmetric and idempotent. 16 | 17 | (b) $$ \mathbf{M}_1 \mathbf{1} = \mathbf{0} $$. 18 | 19 | (c) $$ \mathbf{M}_1 \mathbf{y} = \mathbf{y} - \bar{y} \cdot \mathbf{1} $$ where 20 | 21 | $$ 22 | \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i. 23 | $$ 24 | 25 | $$ \mathbf{M}_1 \mathbf{y} $$ is the vector of **deviations from the mean**. 
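Before the formal proof, the three claims are easy to confirm numerically (a quick sketch assuming `numpy`; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
ones = np.ones((n, 1))
M1 = np.eye(n) - ones @ ones.T / n        # M_1 = I_n - 1 (1'1)^{-1} 1', since 1'1 = n
y = rng.normal(size=n)

print(np.allclose(M1, M1.T))              # (a) symmetric
print(np.allclose(M1 @ M1, M1))           # (a) idempotent
print(np.allclose(M1 @ np.ones(n), 0))    # (b) M_1 1 = 0
print(np.allclose(M1 @ y, y - y.mean()))  # (c) deviations from the mean
```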
26 | 27 | ##### Solution 28 | 29 | (a) The symmetry of $$ \mathbf{M}_1 $$ is verified as 30 | 31 | $$ 32 | \begin{align} 33 | \mathbf{M}'_1 & = \mathbf{I}'_n - ( \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' )' 34 | \\ & = 35 | \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 36 | && \text{(Because 37 | $( \mathbf{A} \mathbf{B} )' = \mathbf{B}' \mathbf{A}' $, 38 | $ ( \mathbf{A}^{-1} )' = ( \mathbf{A}' )^{-1} $)} 39 | \\ & = 40 | \mathbf{M}_1. 41 | \end{align} 42 | $$ 43 | 44 | The idempotency of $$ \mathbf{M}_1 $$ is verified as 45 | 46 | $$ 47 | \begin{align} 48 | \mathbf{M}_1 \mathbf{M}_1 & = ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' )( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) 49 | \\ & = 50 | \mathbf{I}_n 51 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 52 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 53 | + \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 54 | \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 55 | \\ & = 56 | \mathbf{I}_n 57 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 58 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 59 | + \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 60 | \\ & = 61 | \mathbf{I}_n 62 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 63 | \\ & = 64 | \mathbf{M}_1. 65 | \end{align} 66 | $$ 67 | 68 | (b) 69 | 70 | $$ 71 | \begin{align} 72 | \mathbf{M}_1 \mathbf{1} & = 73 | ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) \mathbf{1} 74 | \\ & = 75 | \mathbf{1} - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' \mathbf{1} 76 | \\ & = 77 | \mathbf{1} - \mathbf{1} 78 | \\ & = 79 | \mathbf{0}. 80 | \end{align} 81 | $$ 82 | 83 | (c) 84 | 85 | $$ 86 | \begin{align} 87 | \mathbf{M}_{1} \mathbf{y} & = ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) \mathbf{y} 88 | \\ & = 89 | \mathbf{y} - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' \mathbf{y} 90 | \\ & = 91 | \mathbf{y} - \mathbf{1} \cdot \frac{1}{n} \cdot \sum_{i=1}^{n} y_i 92 | \\ & = 93 | \mathbf{y} - \bar{y} \cdot \mathbf{1}. 94 | \end{align} 95 | $$ 96 | 97 | --- 98 | 99 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.4 (Normally distributed ramdom sample) 14 | 15 | Consider a random sample on consumption and disposable income, $$ ( CON_i, YD_i ) $$, $$ ( i = 1, 2, \ldots, n ) $$. Suppose the joint distribution of $$ ( CON_i, YD_i ) $$ (which is the same across $$ i $$ because of the random sample assumption) is normal. Clearly, Assumption 1.3 is satisfied; the rank of $$ \mathbf{X} $$ would be less than $$ K $$ only by pure accident. Show that the other assumptions, Assumptions 1.1, 1.2, and 1.4, are satisfied. **Hint:** if two random variables, $$ y $$ and $$ x $$, are jointly normally distributed, then the conditional expectation is linear in $$ x $$, i.e., 16 | 17 | $$ 18 | \mathrm{E} ( y \mid x ) = \beta_1 + \beta_2 x, 19 | $$ 20 | 21 | and the conditional variance, $$ \mathrm{Var} ( y \mid x ) $$, does not depend on $$ x $$. 
Here, the fact that the distribution is the same across $$ i $$ is important; if the distribution differed across $$ i $$, $$ \beta_1 $$ and $$ \beta_2 $$ could vary across $$ i $$. 22 | 23 | ##### Solution (with flaw) 24 | 25 | (1) We _define_ the error term as 26 | 27 | $$ 28 | \varepsilon_i = CON_i - \beta_1 - \beta_2 YD_i, 29 | $$ 30 | 31 | then Assumption 1.1 is satisfied. 32 | 33 | (2) Because 34 | 35 | $$ 36 | \begin{align} 37 | \mathrm{E} (\varepsilon_i \mid \mathbf{X}) 38 | & = 39 | \mathrm{E} ( CON_i - \beta_1 - \beta_2 YD_i \mid \mathbf{X} ) 40 | && 41 | \text{(definition of $\varepsilon_i$)} 42 | \\ & = 43 | \mathrm{E} ( CON_i \mid YD_i ) - \beta_1 - \beta_2 YD_i 44 | && 45 | \text{(linearity of conditional expectations)} 46 | \\ & = 47 | \beta_1 + \beta_2 \mathit{YD}_i - \beta_1 - \beta_2 \mathit{YD}_i 48 | && 49 | \text{(hint)} 50 | \\ & = 0, 51 | \end{align} 52 | $$ 53 | 54 | Assumption 1.2 holds. 55 | 56 | (3) To prove Assumption 1.4, 57 | 58 | $$ 59 | \begin{align} 60 | \mathrm{E} ( \varepsilon_i^2 \mid \mathbf{X} ) 61 | & = 62 | \mathrm{Var} ( \varepsilon_i \mid \mathbf{X} ) + \mathrm{E} ( \varepsilon_i \mid \mathbf{X} )^2 63 | && 64 | \text{(definition of $\mathrm{Var} (\cdot)$)} 65 | \\ & = 66 | \mathrm{Var} ( \varepsilon_i \mid \mathbf{X} ) 67 | && 68 | \text{(Assumption 1.2)} 69 | \\ & = 70 | \mathrm{Var} ( CON_i - \beta_1 - \beta_2 YD_i \mid YD_i ) 71 | && 72 | \text{(definition of $\varepsilon_i$)} 73 | \\ & = 74 | \mathrm{Var} ( CON_i \mid YD_i ) 75 | && 76 | \text{($\mathrm{Var} (ax + b) = a^2 \mathrm{Var} (x)$ )} 77 | \\ & = \sigma^2 > 0, 78 | && 79 | \text{(hint)} 80 | \end{align} 81 | $$ 82 | 83 | and 84 | 85 | $$ 86 | \begin{align} 87 | \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X} ) 88 | & = 89 | \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{x}_j ) 90 | && 91 | \text{(Review Question 1.1.2)} 92 | \\ & = 0. 93 | && 94 | \text{(Assumption 1.2)} 95 | \end{align} 96 | $$ 97 | 98 | --- 99 | 100 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.9.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.9 (Computation of the statistics) 14 | 15 | Verify that $$ \mathbf{b} $$, $$ \mathrm{SSR} $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from the following sample averages: $$ \mathbf{S}_{ \mathbf{x} \mathbf{x} } $$, $$ \mathbf{s}_{ \mathbf{x} \mathbf{y} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. (If the regressors include a constant, then $$ \bar{y} $$ is the element of $$ \mathbf{s}_{ \mathbf{x} \mathbf{y} } $$ corresponding to the constant.) Therefore, those sample averages need to be computed just once in order to obtain the regression coefficients and related statistics. 16 | 17 | ##### Solution 18 | 19 | (1) According to (1.2.5'), 20 | 21 | $$ 22 | \mathbf{b} = \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 23 | \mathbf{s}_{\mathbf{x} \mathbf{y}}. 
24 | $$ 25 | 26 | (2) 27 | 28 | $$ 29 | \begin{align} 30 | \mathrm{SSR} & = \mathbf{e}' \mathbf{e} 31 | && 32 | \text{(definition (1.2.12))} 33 | \\ & = 34 | (\mathbf{y} - \mathbf{X} \mathbf{b})' 35 | (\mathbf{y} - \mathbf{X} \mathbf{b}) 36 | && 37 | \text{(definition (1.2.4))} 38 | \\ & = 39 | \mathbf{y}'\mathbf{y} - \mathbf{y}' \mathbf{X} 40 | \mathbf{b} - \mathbf{b}' \mathbf{X}' \mathbf{y} + 41 | \mathbf{b}' \mathbf{X}' \mathbf{X} \mathbf{b} 42 | \\ & = 43 | \mathbf{y}'\mathbf{y} - \mathbf{y}' \mathbf{X} 44 | \mathbf{b} - \mathbf{b}' \mathbf{X}' \mathbf{y} + 45 | \mathbf{b}' \mathbf{X}' \mathbf{y} 46 | && 47 | \text{(definition (1.2.5))} 48 | \\ & = 49 | \mathbf{y}'\mathbf{y} - (\mathbf{X}' \mathbf{y})' \mathbf{b} 50 | \\ & = 51 | n \cdot \mathbf{y}' \mathbf{y} / n - n \cdot 52 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 53 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 54 | \mathbf{s}_{\mathbf{x} \mathbf{y}} 55 | \\ & = 56 | n \cdot ( \mathbf{y}' \mathbf{y} / n - 57 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 58 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 59 | \mathbf{s}_{\mathbf{x} \mathbf{y}} ). 60 | \end{align} 61 | $$ 62 | 63 | (3) 64 | 65 | $$ 66 | s^2 = \frac{\mathrm{SSR}}{n - K} = 67 | \frac{n}{n - K} \cdot ( \mathbf{y}' \mathbf{y} / n - 68 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 69 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 70 | \mathbf{s}_{\mathbf{x} \mathbf{y}} ). 71 | $$ 72 | 73 | (4) 74 | 75 | $$ 76 | \begin{align} 77 | R^2 & = 1 - \frac{ \mathrm{SSR} } 78 | { \sum_{i=1}^n (y_i - \bar{y})^2 } 79 | && 80 | \text{(definition (1.2.18))} 81 | \\ & = 82 | 1 - \frac{ \mathrm{SSR} } 83 | { \sum_{i=1}^n (y_i^2 - 2\bar{y}y_i + \bar{y}^2) } 84 | \\ & = 85 | 1 - \frac{ \mathrm{SSR} } 86 | { \sum_{i=1}^n y_i^2 - 2 \bar{y} \sum_{i=1}^n y_i + 87 | \sum_{i=1}^n \bar{y}^2 } 88 | \\ & = 89 | 1 - \frac{ \mathrm{SSR} } 90 | { n \cdot \mathbf{y}' \mathbf{y} / n - 91 | 2n \cdot \bar{y}^2 + n \cdot \bar{y}^2} 92 | \\ & = 93 | 1 - \frac{ \mathrm{SSR} } 94 | { n \cdot \mathbf{y}' \mathbf{y} / n - 95 | n \cdot \bar{y}^2}. 96 | \end{align} 97 | $$ 98 | 99 | --- 100 | 101 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.4 (Information matrix equality for classical regression model) 14 | 15 | Verify the **information matrix equality** 16 | 17 | $$ 18 | \mathbf{I} ( \boldsymbol{ \theta } ) = 19 | - \mathrm{E} \left[ 20 | \frac{ \partial^2 \log L( \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } 21 | \right] 22 | \tag{1.5.11} 23 | $$ 24 | 25 | ~~for the linear regression model~~. 26 | 27 | ##### Solution 28 | 29 | _Comment_: The information matrix equality is an identity, it is not specific to the linear regression model. 30 | 31 | We begin with the identity 32 | 33 | $$ 34 | 1 = \oint_{ \Omega } f( \mathbf{z} ; \boldsymbol{ \theta } ) \, d \mathbf{z}. 
35 | $$ 36 | 37 | Taking derivative with respect to $$ \boldsymbol{ \theta } $$ and interchange differentiation and integration, 38 | 39 | $$ 40 | 0 = \oint_{\Omega} \frac{ \partial f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } \, d \mathbf{z} = 41 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}. 42 | $$ 43 | 44 | Taking derivative with respect to $$ \boldsymbol{ \theta }' $$ and interchange differentiation and integration, 45 | 46 | $$ 47 | \begin{align} 48 | 0 & = \oint_{\Omega} \frac{ \partial^2 \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z} + 49 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 50 | \frac{ \partial f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } \, d \mathbf{z} 51 | \\ & = 52 | \oint_{\Omega} \frac{ \partial^2 \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z} + 53 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 54 | \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}. 55 | \tag{1} 56 | \end{align} 57 | $$ 58 | 59 | Because 60 | 61 | $$ 62 | \begin{align} 63 | \mathbf{I} ( \boldsymbol{ \theta } ) & \equiv 64 | \mathrm{E} [ \mathbf{s} ( \boldsymbol{ \theta } ) \mathbf{s} ( \boldsymbol{ \theta } )' ] 65 | \tag{1.5.10} 66 | \\ & = 67 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 68 | \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}, 69 | \end{align} 70 | $$ 71 | 72 | and $$ L( \boldsymbol{ \theta } ) \equiv f( \mathbf{z}; \boldsymbol{ \theta } ) $$, substituting into (1) and rearranging terms, we get (1.5.11). 73 | 74 | --- 75 | 76 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/matrix-multiplication.md: -------------------------------------------------------------------------------- 1 | ## Matrix Multiplication Rules 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | Matrix multiplication can be defined equivalently in _four_ different ways as following, where vector $$ \mathbf{a} $$ is _column_ vector by default and $$ \mathbf{a}^\intercal $$ means corresponding _row_ vector, the transpose of $$ \mathbf{a} $$. 
8 | 9 | #### Definition (inner-way multiplication) 10 | 11 | $$ 12 | \begin{align} 13 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 14 | & = 15 | \begin{bmatrix} 16 | — & \boldsymbol{\alpha}_1^\intercal & — \\ 17 | & \vdots & \\ 18 | — & \boldsymbol{\alpha}_m^\intercal & — 19 | \end{bmatrix} 20 | \begin{bmatrix} 21 | | & & | \\ 22 | \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ 23 | | & & | 24 | \end{bmatrix} \\ 25 | & = 26 | \begin{bmatrix} 27 | \boldsymbol{\alpha}_1^\intercal \mathbf{b}_1 28 | & \cdots 29 | & \boldsymbol{\alpha}_1^\intercal \mathbf{b}_p \\ 30 | \vdots & \ddots & \vdots \\ 31 | \boldsymbol{\alpha}_m^\intercal \mathbf{b}_1 32 | & \cdots 33 | & \boldsymbol{\alpha}_m^\intercal \mathbf{b}_p 34 | \end{bmatrix} 35 | \end{align} 36 | $$ 37 | 38 | #### Definition (outer-way multiplication) 39 | 40 | $$ 41 | \begin{align} 42 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 43 | & = 44 | \begin{bmatrix} 45 | | & & | \\ 46 | \mathbf{a}_1 & \cdots & \mathbf{a}_n \\ 47 | | & & | 48 | \end{bmatrix} 49 | \begin{bmatrix} 50 | — & \boldsymbol{\beta}_1^\intercal & — \\ 51 | & \vdots & \\ 52 | — & \boldsymbol{\beta}_n^\intercal & — 53 | \end{bmatrix} \\ 54 | & = 55 | \sum_{i=1}^n \mathbf{a}_i \boldsymbol{\beta}_i^\intercal 56 | \end{align} 57 | $$ 58 | 59 | #### Definition (column-way multiplication) 60 | 61 | $$ 62 | \begin{align} 63 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 64 | & = 65 | \begin{bmatrix} 66 | | & & | \\ 67 | \mathbf{a}_1 & \cdots & \mathbf{a}_n \\ 68 | | & & | 69 | \end{bmatrix} 70 | \begin{bmatrix} 71 | | & & | \\ 72 | \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ 73 | | & & | 74 | \end{bmatrix} \\ 75 | & = 76 | \begin{bmatrix} 77 | | & & | \\ 78 | \sum_{i=1}^n b_{i1} \mathbf{a}_i 79 | & \cdots 80 | & \sum_{i=1}^n 81 | b_{ip} \mathbf{a}_i \\ 82 | | & & | 83 | \end{bmatrix} 84 | \end{align} 85 | $$ 86 | 87 | #### Definition (row-way multiplication) 88 | 89 | $$ 90 | \begin{align} 91 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 92 | & = 93 | \begin{bmatrix} 94 | — & \boldsymbol{\alpha}_1^\intercal & — \\ 95 | & \vdots & \\ 96 | — & \boldsymbol{\alpha}_m^\intercal & — 97 | \end{bmatrix} 98 | \begin{bmatrix} 99 | — & \boldsymbol{\beta}_1^\intercal & — \\ 100 | & \vdots & \\ 101 | — & \boldsymbol{\beta}_n^\intercal & — 102 | \end{bmatrix} \\ 103 | & = 104 | \begin{bmatrix} 105 | — & \sum_{i=1}^n a_{1i} 106 | \boldsymbol{\beta}_i^\intercal & — \\ 107 | & \vdots & \\ 108 | — & \sum_{i=1}^n a_{mi} 109 | \boldsymbol{\beta}_i^\intercal & — 110 | \end{bmatrix} 111 | \end{align} 112 | $$ 113 | 114 | ##### Note 115 | 116 | - If you are concerning _each element_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the inner-way multiplication. 117 | 118 | - If you want the express $$ \mathbf{A} \mathbf{B} $$ as a _sum_, then you should use the outer-way multiplication. 119 | 120 | - If you are concerning _each column_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the column-way multiplication. 121 | 122 | - If you are concerning _each row_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the row-way multiplication. 
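
As a quick numerical sanity check of the four definitions above, the following short sketch (an illustration added here, assuming Python with NumPy is available) builds the product $$ \mathbf{A} \mathbf{B} $$ in each of the four ways and verifies that all of them coincide with the usual matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # A is m x n with m = 3, n = 4
B = rng.standard_normal((4, 2))   # B is n x p with p = 2
m, n = A.shape
p = B.shape[1]

# inner-way: element (i, j) is the inner product of row i of A and column j of B
inner = np.array([[A[i, :] @ B[:, j] for j in range(p)] for i in range(m)])

# outer-way: the sum of n outer products, column k of A times row k of B
outer = sum(np.outer(A[:, k], B[k, :]) for k in range(n))

# column-way: column j of AB is a linear combination of the columns of A
column = np.column_stack([A @ B[:, j] for j in range(p)])

# row-way: row i of AB is a linear combination of the rows of B
row = np.vstack([A[i, :] @ B for i in range(m)])

for result in (inner, outer, column, row):
    assert np.allclose(result, A @ B)
```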
123 | 124 | --- 125 | 126 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.4 (Gauss-Markov for unconditional variance) 14 | 15 | (a) Prove: $$ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} ) = \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] $$. 16 | 17 | (b) Prove (1.3.3) in textbook, 18 | 19 | $$ 20 | \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ) \ge 21 | \mathrm{Var} ( \mathbf{b} ), 22 | \tag{1.3.3} 23 | $$ 24 | 25 | where $$ \widehat{ \boldsymbol{ \beta } } $$ is any linear unbiased estimator. 26 | 27 | ##### Solution 28 | 29 | (a) By the [formula of variance](supplements/var-cov-matrix.md), 30 | 31 | $$ 32 | \begin{align} 33 | \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) 34 | & = 35 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' \mid \mathbf{X} ) - 36 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) 37 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )'. 38 | \end{align} 39 | $$ 40 | 41 | Taking expectations, 42 | 43 | $$ 44 | \begin{align} 45 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 46 | & = 47 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' \mid \mathbf{X} ) ] - 48 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ] 49 | \\ & = 50 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' ) - 51 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ]. 52 | \tag{1} 53 | \end{align} 54 | $$ 55 | 56 | Also by the [formula of variance](supplements/var-cov-matrix.md), 57 | 58 | $$ 59 | \begin{align} 60 | \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 61 | & = 62 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ] - 63 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } ) 64 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } )'. 65 | \tag{2} 66 | \end{align} 67 | $$ 68 | 69 | Combining equations (1) and (2), and again by the [formula of variance](supplements/var-cov-matrix.md), 70 | 71 | $$ 72 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + 73 | \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 74 | = 75 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' ) - 76 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } ) 77 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } )' 78 | = 79 | \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ). 
80 | $$ 81 | 82 | (b) 83 | 84 | $$ 85 | \begin{align} 86 | & \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ) - 87 | \mathrm{Var} ( \mathbf{b} ) 88 | \\ = & 89 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] - 90 | \mathrm{E} [ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] - \mathrm{Var} [ \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) ] 91 | && 92 | \text{(by part (a))} 93 | \\ = & 94 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] - 95 | \mathrm{E} [ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] 96 | && 97 | \text{($ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) = \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) = \boldsymbol{ \beta } $)} 98 | \\ = & 99 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) - \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] 100 | && 101 | \text{(linearity of expectations)} 102 | \\ = & 103 | \text{postive semidefinite matrix}. 104 | && 105 | \text{(Gauss-Markov Theorem)} 106 | \end{align} 107 | $$ 108 | 109 | --- 110 | 111 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # 目录 2 | 3 | * [简介](README.md) 4 | 5 | ## Chapter 1 Finite-Sample Properties of OLS 6 | 7 | ### Section 1 The Classical Linear Regression Model 8 | 9 | * [Lecture Note 1.1](lecture-note/1.1.md) 10 | * [Review Question 1.1.1](question-solution/1.1.1.md) 11 | * [Review Question 1.1.2](question-solution/1.1.2.md) 12 | * [Review Question 1.1.3](question-solution/1.1.3.md) 13 | * [Review Question 1.1.4](question-solution/1.1.4.md) 14 | * [Review Question 1.1.5](question-solution/1.1.5.md) 15 | * [Review Question 1.1.6](question-solution/1.1.6.md) 16 | 17 | ### Section 2 The Algebra of Least Squares 18 | 19 | * [Lecture Note 1.2](lecture-note/1.2.md) 20 | * [Review Question 1.2.1](question-solution/1.2.1.md) 21 | * [Review Question 1.2.2](question-solution/1.2.2.md) 22 | * [Review Question 1.2.3](question-solution/1.2.3.md) 23 | * [Review Question 1.2.4](question-solution/1.2.4.md) 24 | * [Review Question 1.2.5](question-solution/1.2.5.md) 25 | * [Review Question 1.2.6](question-solution/1.2.6.md) 26 | * [Review Question 1.2.7](question-solution/1.2.7.md) 27 | * [Review Question 1.2.8](question-solution/1.2.8.md) 28 | * [Review Question 1.2.9](question-solution/1.2.9.md) 29 | 30 | ### Section 3 Finite-Sample Properties of OLS 31 | 32 | * [Review Question 1.3.1](question-solution/1.3.1.md) 33 | * [Review Question 1.3.2](question-solution/1.3.2.md) 34 | * [Review Question 1.3.3](question-solution/1.3.3.md) 35 | * [Review Question 1.3.4](question-solution/1.3.4.md) 36 | * [Review Question 1.3.5](question-solution/1.3.5.md) 37 | * [Review Question 1.3.6](question-solution/1.3.6.md) 38 | * [Review Question 1.3.7](question-solution/1.3.7.md) 39 | 40 | ### Section 4 Hypothesis Testing under Normality 41 | 42 | * [Review Question 1.4.1](question-solution/1.4.1.md) 43 | * [Review Question 1.4.2](question-solution/1.4.2.md) 44 | * [Review Question 1.4.3](question-solution/1.4.3.md) 45 | * [Review Question 1.4.4](question-solution/1.4.4.md) 46 | * [Review Question 1.4.5](question-solution/1.4.5.md) 47 | * [Review Question 1.4.6](question-solution/1.4.6.md) 48 | * [Review Question 1.4.7](question-solution/1.4.7.md) 49 | 50 | ### Section 5 Relation to Maximum Likelihood 51 | 52 | * 
[Review Question 1.5.1](question-solution/1.5.1.md) 53 | * [Review Question 1.5.2](question-solution/1.5.2.md) 54 | * [Review Question 1.5.3](question-solution/1.5.3.md) 55 | * [Review Question 1.5.4](question-solution/1.5.4.md) 56 | * [Review Question 1.5.5](question-solution/1.5.5.md) 57 | 58 | ### Section 6 Generalized Least Squares (GLS) 59 | 60 | * [Review Question 1.6.1](question-solution/1.6.1.md) 61 | * [Review Question 1.6.2](question-solution/1.6.2.md) 62 | * [Review Question 1.6.3](question-solution/1.6.3.md) 63 | * [Review Question 1.6.4](question-solution/1.6.4.md) 64 | 65 | ### Section 7 Application: Returns to Scale in Electricity Supply 66 | 67 | * [Lecture Note](lecture-note/1.7.md) 68 | * [Review Question 1.7.1](question-solution/1.7.1.md) 69 | * [Review Question 1.7.2](question-solution/1.7.2.md) 70 | * [Review Question 1.7.3](question-solution/1.7.3.md) 71 | * [Review Question 1.7.4](question-solution/1.7.4.md) 72 | * [Review Question 1.7.5](question-solution/1.7.5.md) 73 | * [Review Question 1.7.6](question-solution/1.7.6.md) 74 | * [Review Question 1.7.7](question-solution/1.7.7.md) 75 | * [Review Question 1.7.8](question-solution/1.7.8.md) 76 | 77 | ### Analytical Exercises 78 | 79 | * [Analytical Exercise 1.1](exercise-solution/1.1.md) 80 | * [Analytical Exercise 1.2](exercise-solution/1.2.md) 81 | 82 | ## Chapter 2 Large-Sample Theory 83 | 84 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 85 | 86 | * [Review Question 2.1.1](question-solution/2.1.1.md) 87 | * [Review Question 2.1.2](question-solution/2.1.2.md) 88 | * [Review Question 2.1.3](question-solution/2.1.3.md) 89 | * [Review Question 2.1.4](question-solution/2.1.4.md) 90 | * [Review Question 2.1.5](question-solution/2.1.5.md) 91 | 92 | ## Supplements 93 | 94 | * [Taylor's linearization](supplements/taylor-linearization.md) 95 | * [Variance-Covariance Matrix](supplements/var-cov-matrix.md) 96 | * [Four Ways of Matrix Multiplication](supplements/matrix-multiplication.md) 97 | -------------------------------------------------------------------------------- /lecture-note/1.7.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### How to derive the cost function (1.7.2) 14 | 15 | Assume the firms are engaged in **cost minimization**, 16 | 17 | $$ 18 | \begin{gather} 19 | \min \quad p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} 20 | \tag{1} 21 | \\ 22 | \text{s.t.} \qquad 23 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = Q_i. 24 | \tag{2} 25 | \end{gather} 26 | $$ 27 | 28 | Then the **Lagrangian** is 29 | 30 | $$ 31 | \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) = 32 | p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} - 33 | \lambda ( A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i ), 34 | \tag{3} 35 | $$ 36 | 37 | where $$ \lambda > 0 $$ is the **shadow price** of the **exogenous variable** $$ Q_i $$. 
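
Although the derivation below does not use this fact, the multiplier has a familiar interpretation: by the envelope theorem, at the cost-minimizing choice of inputs

$$
\lambda = \frac{ \partial TC_i }{ \partial Q_i },
$$

that is, $$ \lambda $$ equals marginal cost, which is why it is interpreted as the shadow price of the output requirement $$ Q_i $$.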
38 | 39 | The **first-order conditions** are 40 | 41 | $$ 42 | \begin{align} 43 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 44 | { \partial x_{i1} } = 45 | p_{i1} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1 - 1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = 0, 46 | \tag{4} 47 | \\ 48 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 49 | { \partial x_{i2} } = 50 | p_{i2} - \lambda \alpha_2 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2 - 1} x_{i3}^{\alpha_3} = 0, 51 | \tag{5} 52 | \\ 53 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 54 | { \partial x_{i3} } = 55 | p_{i3} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3 - 1} = 0. 56 | \tag{6} 57 | \end{align} 58 | $$ 59 | 60 | Dividing (5) over (4) to eliminate the shadow price $$ \lambda $$, 61 | 62 | $$ 63 | \begin{gather} 64 | \frac{ p_{i2} }{ p_{i1} } = 65 | \frac{ \alpha_2 }{ \alpha_1 } 66 | \frac{ x_{i1} }{ x_{i2} }, 67 | \tag{7} 68 | \\ 69 | x_{i2} = p_{i1} p_{i2}^{-1} \alpha_1^{-1} \alpha_2 x_{i1}. 70 | \tag{8} 71 | \end{gather} 72 | $$ 73 | 74 | Similarly, dividing (6) over (4), 75 | 76 | $$ 77 | \begin{gather} 78 | \frac{ p_{i3} }{ p_{i1} } = 79 | \frac{ \alpha_3 }{ \alpha_1 } 80 | \frac{ x_{i1} }{ x_{i3} }, 81 | \tag{9} 82 | \\ 83 | x_{i3} = p_{i1} p_{i3}^{-1} \alpha_1^{-1} \alpha_3 x_{i1}. 84 | \tag{10} 85 | \end{gather} 86 | $$ 87 | 88 | Substituting (8) and (10) into (2), 89 | 90 | $$ 91 | A_i x_{i1}^{ \alpha_1 + \alpha_2 + \alpha_3 } 92 | p_{i1}^{ \alpha_2 + \alpha_3 } 93 | p_{i2}^{ - \alpha_2 } 94 | p_{i3}^{ - \alpha_3 } 95 | \alpha_1^{ - \alpha_2 - \alpha_3} 96 | \alpha_2^{ \alpha_2 } 97 | \alpha_3^{ \alpha_3 } 98 | = Q_i. 99 | \tag{11} 100 | $$ 101 | 102 | Using $$ r \equiv \alpha_1 + \alpha_2 + \alpha_3 $$, we solved from (11) that 103 | 104 | $$ 105 | \begin{align} 106 | x_{i1} & = A_i^{-1/r} Q_i^{1/r} 107 | p_{i1}^{-1 + \alpha_1/r} 108 | p_{i2}^{ \alpha_2/r } 109 | p_{i3}^{ \alpha_3/r } 110 | \alpha_1^{1 - \alpha_1/r } 111 | \alpha_2^{- \alpha_2/r} 112 | \alpha_3^{- \alpha_3/r} 113 | \\ & = 114 | \alpha_1 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 115 | Q_i^{1/r} 116 | \frac{1}{ p_{i1} } 117 | p_{i1}^{ \alpha_1/r} 118 | p_{i2}^{ \alpha_2/r } 119 | p_{i3}^{ \alpha_3/r }. 120 | \tag{12} 121 | \end{align} 122 | $$ 123 | 124 | Similarly, because of the symmetry of $$x_{i1}$$, $$x_{i2}$$, and $$ x_{i3} $$, we can also solve that 125 | 126 | $$ 127 | \begin{align} 128 | x_{i2} & = \alpha_2 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 129 | Q_i^{1/r} 130 | \frac{1}{ p_{i2} } 131 | p_{i1}^{ \alpha_1/r} 132 | p_{i2}^{ \alpha_2/r } 133 | p_{i3}^{ \alpha_3/r }, 134 | \tag{13} 135 | \\ 136 | x_{i3} & = \alpha_3 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 137 | Q_i^{1/r} 138 | \frac{1}{ p_{i3} } 139 | p_{i1}^{ \alpha_1/r} 140 | p_{i2}^{ \alpha_2/r } 141 | p_{i3}^{ \alpha_3/r }. 142 | \tag{14} 143 | \end{align} 144 | $$ 145 | 146 | Substituting solutions (12), (13) and (14) of the **endogenous variables** into the **objective function** (1), we get 147 | 148 | $$ 149 | TC_i = r \cdot ( A_i \alpha_1^{ \alpha_1 } \alpha_2^{ \alpha_2 } \alpha_3^{ \alpha_3 } )^{-1/r} Q_i^{1/r} p_{i1}^{\alpha_1 / r} p_{i2}^{\alpha_2 / r} p_{i3}^{\alpha_3 / r}. 
150 | \tag{1.7.2} 151 | $$ 152 | 153 | --- 154 | 155 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 2, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.4 14 | 15 | Prove that 16 | 17 | $$ 18 | \begin{align} 19 | & 20 | \text{ Both $ \mathbf{P} $ and $ \mathbf{M} $ are symmetric and idempotent, } 21 | \tag{1.2.9} 22 | \\ & 23 | \mathbf{P} \mathbf{X} = \mathbf{X} \quad 24 | \text{(hence the term projection matrix),} 25 | \tag{1.2.10} 26 | \\ & 27 | \mathbf{M} \mathbf{X} = \mathbf{0} \quad 28 | \text{(hence the term annihilator).}\tag{1.2.11} 29 | \end{align} 30 | $$ 31 | 32 | ##### Solution 33 | 34 | (1) $$ \mathbf{P} $$ is symmetric because 35 | 36 | $$ 37 | \begin{align} 38 | \mathbf{P}^\intercal 39 | & = 40 | [\mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 41 | \mathbf{X}^\intercal]^\intercal 42 | && 43 | \text{(definition (1.2.7))} 44 | \\ & = 45 | (\mathbf{X}^\intercal)^\intercal 46 | [(\mathbf{X}^\intercal \mathbf{X})^{-1}]^\intercal 47 | \mathbf{X}^\intercal 48 | && 49 | ((\mathbf{A} \mathbf{B})^\intercal = 50 | \mathbf{B}^\intercal \mathbf{A}^\intercal) 51 | \\ & = 52 | ( \mathbf{X}^\intercal)^\intercal 53 | [(\mathbf{X}^\intercal \mathbf{X})^\intercal]^{-1} 54 | \mathbf{X}^\intercal 55 | && 56 | ((\mathbf{A}^{-1})^\intercal = 57 | (\mathbf{A}^\intercal)^{-1}) 58 | \\ & = 59 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 60 | \mathbf{X}^\intercal 61 | && 62 | ((\mathbf{A}^\intercal)^\intercal = \mathbf{A}) 63 | \\ & = 64 | \mathbf{P}. 65 | && 66 | \text{(definition (1.2.7))} 67 | \end{align} 68 | $$ 69 | 70 | (2) $$ \mathbf{M} $$ is symmetric because 71 | 72 | $$ 73 | \begin{align} 74 | \mathbf{M}^\intercal 75 | & = 76 | ( \mathbf{I} - \mathbf{P} )^\intercal 77 | && 78 | \text{(definition (1.2.8))} 79 | \\ & = 80 | \mathbf{I}^\intercal - \mathbf{P}^\intercal 81 | && 82 | ((\mathbf{A} - \mathbf{B})^\intercal = 83 | \mathbf{A}^\intercal - \mathbf{B}^\intercal) 84 | \\ & = 85 | \mathbf{I} - \mathbf{P} 86 | && 87 | \text{($\mathbf{I}$ and $\mathbf{P}$ are symmetric)} 88 | \\ & = 89 | \mathbf{M}. 90 | && 91 | \text{(definition (1.2.8))} 92 | \end{align} 93 | $$ 94 | 95 | (3) $$ \mathbf{P} $$ is idempotent because 96 | 97 | $$ 98 | \begin{align} 99 | \mathbf{P}^2 100 | & = 101 | (\mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 102 | \mathbf{X}^\intercal) \cdot (\mathbf{X} 103 | (\mathbf{X}^\intercal \mathbf{X})^{-1} 104 | \mathbf{X}^\intercal) 105 | && 106 | \text{(definition (1.2.7))} 107 | \\ & = 108 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 109 | (\mathbf{X}^\intercal \mathbf{X}) 110 | (\mathbf{X}^\intercal \mathbf{X})^{-1} 111 | \mathbf{X}^\intercal 112 | && 113 | ((\mathbf{A} \mathbf{B}) \mathbf{C} = 114 | \mathbf{A} (\mathbf{B} \mathbf{C})) 115 | \\ & = 116 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 117 | \mathbf{X}^\intercal 118 | && 119 | (\mathbf{A}^{-1} \mathbf{A} = \mathbf{I}) 120 | \\ & = 121 | \mathbf{P}. 
122 | && 123 | \text{(definition (1.2.7))} 124 | \end{align} 125 | $$ 126 | 127 | (4) $$ \mathbf{M} $$ is idempotent because 128 | 129 | $$ 130 | \begin{align} 131 | \mathbf{M}^2 132 | & = 133 | (\mathbf{I} - \mathbf{P})(\mathbf{I} - \mathbf{P}) 134 | && 135 | \text{(definition (1.2.8))} 136 | \\ & = 137 | \mathbf{I} - \mathbf{P} - \mathbf{P} + \mathbf{P}^2 138 | \\ & = 139 | \mathbf{I} - \mathbf{P} - \mathbf{P} + \mathbf{P} 140 | && 141 | \text{($\mathbf{P}$ is idempotent)} 142 | \\ & = 143 | \mathbf{I} - \mathbf{P} 144 | \\ & = 145 | \mathbf{M}. 146 | && 147 | \text{(definition (1.2.8))} 148 | \end{align} 149 | $$ 150 | 151 | (5) $$ \mathbf{P} \mathbf{X} = \mathbf{X} $$ because 152 | 153 | $$ 154 | \begin{align} 155 | \mathbf{P} \mathbf{X} 156 | & = 157 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 158 | \mathbf{X}^\intercal \cdot \mathbf{X} 159 | && 160 | \text{(definition (1.2.7))} 161 | \\ & = 162 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 163 | (\mathbf{X}^\intercal \mathbf{X}) 164 | \\ & = 165 | \mathbf{X}. 166 | \end{align} 167 | $$ 168 | 169 | (6) $$ \mathbf{M} \mathbf{X} = \mathbf{0} $$ because 170 | 171 | $$ 172 | \begin{align} 173 | \mathbf{M} \mathbf{X} 174 | & = 175 | ( \mathbf{I} - \mathbf{P} ) \mathbf{X} 176 | && 177 | \text{(definition (1.2.8))} 178 | \\ & = 179 | \mathbf{X} - \mathbf{P} \mathbf{X} 180 | \\ & = 181 | \mathbf{X} - \mathbf{X} 182 | && 183 | ( \mathbf{P} \mathbf{X} = \mathbf{X} ) 184 | \\ & = \mathbf{0}. 185 | \end{align} 186 | $$ 187 | 188 | --- 189 | 190 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.1 (Review of duality theory) 14 | 15 | Consult your favorite microeconomic textbook to remember how to derive the Cobb-Douglas cost function from the Cobb-Douglas production function. 16 | 17 | ##### Solution 18 | 19 | Assume firm $$i$$ is engaged in **cost minimization**, 20 | 21 | $$ 22 | \begin{gather} 23 | \min \quad p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} 24 | \tag{1} 25 | \\ 26 | \text{s.t.} \qquad 27 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = Q_i. 28 | \tag{2} 29 | \end{gather} 30 | $$ 31 | 32 | Then the **Lagrangian** is 33 | 34 | $$ 35 | \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) = 36 | p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} - 37 | \lambda ( A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i ), 38 | \tag{3} 39 | $$ 40 | 41 | where $$ \lambda > 0 $$ is the **shadow price** of the **exogenous variable** $$ Q_i $$. 
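
Throughout, the input prices $$ p_{i1}, p_{i2}, p_{i3} $$, the efficiency parameter $$ A_i $$, the output level $$ Q_i $$, and the exponents $$ \alpha_1, \alpha_2, \alpha_3 $$ are taken to be strictly positive (an assumption left implicit in the setup), so the cost-minimizing input bundle is interior and is characterized by the first-order conditions below.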
42 | 43 | The **first-order conditions** are 44 | 45 | $$ 46 | \begin{align} 47 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 48 | { \partial x_{i1} } & = 49 | p_{i1} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1 - 1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = 0, 50 | \tag{4} 51 | \\ 52 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 53 | { \partial x_{i2} } & = 54 | p_{i2} - \lambda \alpha_2 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2 - 1} x_{i3}^{\alpha_3} = 0, 55 | \tag{5} 56 | \\ 57 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 58 | { \partial x_{i3} } & = 59 | p_{i3} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3 - 1} = 0. 60 | \tag{6} 61 | \\ 62 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 63 | { \partial \lambda } & = 64 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i = 0. 65 | \tag{7} 66 | \end{align} 67 | $$ 68 | 69 | Dividing (5) over (4) to eliminate the shadow price $$ \lambda $$, 70 | 71 | $$ 72 | \begin{gather} 73 | \frac{ p_{i2} }{ p_{i1} } = 74 | \frac{ \alpha_2 }{ \alpha_1 } 75 | \frac{ x_{i1} }{ x_{i2} }, 76 | \tag{8} 77 | \\ 78 | x_{i2} = p_{i1} p_{i2}^{-1} \alpha_1^{-1} \alpha_2 x_{i1}. 79 | \tag{9} 80 | \end{gather} 81 | $$ 82 | 83 | Similarly, dividing (6) over (4), 84 | 85 | $$ 86 | \begin{gather} 87 | \frac{ p_{i3} }{ p_{i1} } = 88 | \frac{ \alpha_3 }{ \alpha_1 } 89 | \frac{ x_{i1} }{ x_{i3} }, 90 | \tag{10} 91 | \\ 92 | x_{i3} = p_{i1} p_{i3}^{-1} \alpha_1^{-1} \alpha_3 x_{i1}. 93 | \tag{11} 94 | \end{gather} 95 | $$ 96 | 97 | Substituting (9) and (11) into (7), 98 | 99 | $$ 100 | A_i x_{i1}^{ \alpha_1 + \alpha_2 + \alpha_3 } 101 | p_{i1}^{ \alpha_2 + \alpha_3 } 102 | p_{i2}^{ - \alpha_2 } 103 | p_{i3}^{ - \alpha_3 } 104 | \alpha_1^{ - \alpha_2 - \alpha_3} 105 | \alpha_2^{ \alpha_2 } 106 | \alpha_3^{ \alpha_3 } 107 | = Q_i. 108 | \tag{12} 109 | $$ 110 | 111 | Using $$ r \equiv \alpha_1 + \alpha_2 + \alpha_3 $$, we solved from (12) that 112 | 113 | $$ 114 | \begin{align} 115 | x_{i1} & = A_i^{-1/r} Q_i^{1/r} 116 | p_{i1}^{-1 + \alpha_1/r} 117 | p_{i2}^{ \alpha_2/r } 118 | p_{i3}^{ \alpha_3/r } 119 | \alpha_1^{1 - \alpha_1/r } 120 | \alpha_2^{- \alpha_2/r} 121 | \alpha_3^{- \alpha_3/r} 122 | \\ & = 123 | \alpha_1 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 124 | Q_i^{1/r} 125 | \frac{1}{ p_{i1} } 126 | p_{i1}^{ \alpha_1/r} 127 | p_{i2}^{ \alpha_2/r } 128 | p_{i3}^{ \alpha_3/r }. 129 | \tag{13} 130 | \end{align} 131 | $$ 132 | 133 | Similarly, because of the symmetry of $$x_{i1}$$, $$x_{i2}$$, and $$ x_{i3} $$, we can also solve that 134 | 135 | $$ 136 | \begin{align} 137 | x_{i2} & = \alpha_2 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 138 | Q_i^{1/r} 139 | \frac{1}{ p_{i2} } 140 | p_{i1}^{ \alpha_1/r} 141 | p_{i2}^{ \alpha_2/r } 142 | p_{i3}^{ \alpha_3/r }, 143 | \tag{14} 144 | \\ 145 | x_{i3} & = \alpha_3 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 146 | Q_i^{1/r} 147 | \frac{1}{ p_{i3} } 148 | p_{i1}^{ \alpha_1/r} 149 | p_{i2}^{ \alpha_2/r } 150 | p_{i3}^{ \alpha_3/r }. 151 | \tag{15} 152 | \end{align} 153 | $$ 154 | 155 | Substituting solutions of the **endogenous variables** (13), (14) and (15) into the **objective function** (1), we get 156 | 157 | $$ 158 | TC_i = r \cdot ( A_i \alpha_1^{ \alpha_1 } \alpha_2^{ \alpha_2 } \alpha_3^{ \alpha_3 } )^{-1/r} Q_i^{1/r} p_{i1}^{\alpha_1 / r} p_{i2}^{\alpha_2 / r} p_{i3}^{\alpha_3 / r}. 
159 | \tag{1.7.2} 160 | $$ 161 | 162 | --- 163 | 164 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.3 (OLS estimator for the simple regression model) 14 | 15 | In the simple regression model, $$ K = 2 $$ and $$ x_{i1} = 1 $$. Show that 16 | 17 | $$ 18 | \mathbf{S}_{\mathbf{x}\mathbf{x}} = 19 | \begin{bmatrix} 20 | 1 & \bar{x}_2 \\ 21 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 22 | \end{bmatrix}, 23 | \quad 24 | \mathbf{S}_{\mathbf{x}\mathbf{y}} = 25 | \begin{bmatrix} 26 | \bar{y} \\ \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 27 | \end{bmatrix} 28 | $$ 29 | 30 | where 31 | 32 | $$ 33 | \bar{y} \equiv \frac{1}{n} \sum_{i=1}^n y_i, 34 | \quad 35 | \bar{x}_2 \equiv \frac{1}{n} \sum_{i=1}^n x_{i2}. 36 | $$ 37 | 38 | Show that 39 | 40 | $$ 41 | b_2 = \frac{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)(y_i - \bar{y}) }{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 }, 42 | \quad 43 | b_1 = \bar{y} - \bar{x}_2 b_2. 44 | $$ 45 | 46 | ##### Solution 47 | 48 | (1) 49 | 50 | $$ 51 | \begin{align} 52 | \mathbf{S}_{\mathbf{x}\mathbf{x}} 53 | & = \frac{1}{n} \mathbf{X}' \mathbf{X} \\ 54 | & = 55 | \frac{1}{n} 56 | \begin{bmatrix} 57 | 1 & \cdots & 1 \\ 58 | x_{12} & \cdots & x_{n2} 59 | \end{bmatrix} 60 | \begin{bmatrix} 61 | 1 & x_{12} \\ 62 | \vdots & \vdots \\ 63 | 1 & x_{n2} 64 | \end{bmatrix} \\ 65 | & = 66 | \frac{1}{n} 67 | \begin{bmatrix} 68 | n & \sum_{i=1}^n x_{i2} \\ 69 | \sum_{i=1}^n x_{i2} & \sum_{i=1}^n x_{i2}^2 70 | \end{bmatrix} \\ 71 | & = 72 | \begin{bmatrix} 73 | 1 & \bar{x}_2 \\ 74 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 75 | \end{bmatrix} 76 | \end{align} 77 | $$ 78 | 79 | (2) 80 | 81 | $$ 82 | \begin{align} 83 | \mathbf{S}_{\mathbf{x}\mathbf{y}} 84 | & = \frac{1}{n} \mathbf{X}' \mathbf{y} \\ 85 | & = 86 | \frac{1}{n} 87 | \begin{bmatrix} 88 | 1 & \cdots & 1 \\ 89 | x_{12} & \cdots & x_{n2} 90 | \end{bmatrix} 91 | \begin{bmatrix} 92 | y_1 \\ 93 | \vdots \\ 94 | y_n 95 | \end{bmatrix} \\ 96 | & = 97 | \frac{1}{n} 98 | \begin{bmatrix} 99 | \sum_{i=1}^n y_i \\ 100 | \sum_{i=1}^n x_{i2} y_i 101 | \end{bmatrix} \\ 102 | & = 103 | \begin{bmatrix} 104 | \bar{y} \\ 105 | \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 106 | \end{bmatrix} 107 | \end{align} 108 | $$ 109 | 110 | (3) To solve for $$ \mathbf{b} $$ from 111 | 112 | $$ 113 | \mathbf{S}_{\mathbf{x} \mathbf{x}} 114 | \mathbf{b} = 115 | \mathbf{S}_{\mathbf{x} \mathbf{y}}, 116 | $$ 117 | 118 | perform row operations on the following augmented matrix 119 | 120 | $$ 121 | \begin{align} 122 | \begin{bmatrix} 123 | \mathbf{S}_{\mathbf{x} \mathbf{x}} \mid 124 | \mathbf{S}_{\mathbf{x} \mathbf{y}} 125 | \end{bmatrix} 126 | & = 127 | \begin{bmatrix} 128 | 1 & \bar{x}_2 & \bar{y} \\ 129 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 130 | & \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 131 | \end{bmatrix} \\ 132 | & \sim 133 | \begin{bmatrix} 134 | 1 & \bar{x}_2 & \bar{y} \\ 135 | 0 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 - \bar{x}_2^2 136 | & \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 137 | - \bar{x}_2 \bar{y} 138 | \end{bmatrix} \\ 139 | & = 140 | \begin{bmatrix} 141 | 1 & \bar{x}_2 & \bar{y} \\ 142 | 0 & \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 
143 | & \frac{1}{n} \sum_{i=1}^n 144 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) 145 | \end{bmatrix} \\ 146 | & \sim 147 | \begin{bmatrix} 148 | 1 & \bar{x}_2 & \bar{y} \\ 149 | 0 & 1 & 150 | \frac{ \frac{1}{n} \sum_{i=1}^n 151 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 152 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 153 | \end{bmatrix} \\ 154 | & \sim 155 | \begin{bmatrix} 156 | 1 & 0 & \bar{y} - 157 | \bar{x}_2 158 | \frac{ \frac{1}{n} \sum_{i=1}^n 159 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 160 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 161 | \\ 162 | 0 & 1 & 163 | \frac{ \frac{1}{n} \sum_{i=1}^n 164 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 165 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 166 | \end{bmatrix} \\ 167 | & = 168 | \begin{bmatrix} 169 | \mathbf{I} \mid \mathbf{b} 170 | \end{bmatrix}. 171 | \end{align} 172 | $$ 173 | 174 | So 175 | 176 | $$ 177 | b_2 = \frac{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)(y_i - \bar{y}) }{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 }, 178 | \quad 179 | b_1 = \bar{y} - \bar{x}_2 b_2. 180 | $$ 181 | 182 | ##### Note 183 | 184 | $$ 185 | b_2 \stackrel{p} \to \frac{\mathrm{Cov} (x_2, y)}{\mathrm{Var} (x_2)} 186 | $$ 187 | 188 | --- 189 | 190 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao (School of Finance, Capital University of Economics and Business) 4 | 5 | 6 | # 高级计量经济学课件 7 | 8 | 作者:高强(首都经济贸易大学金融学院) 9 | 10 | --- 11 | 12 | 本课件是对日本经济学家林文夫(2000)所著《高级计量经济学》教材[^1]的补充。 13 | 14 | 国内新入学的研究生,本科阶段所接受的数学训练普遍不足。例如,微积分课程缺少对梯度、雅可比矩阵、海赛矩阵、多元泰勒展开的讲解;线性代数课程缺少对向量空间、坐标变换、特征分解、四个基本子空间、奇异值分解、二次型等重要概念的强调;概率论与数理统计两门特别重要的课程,通常被合并成一门课在一个学期讲完,造成学生普遍搞不清楚条件期望、条件方差、方差—协方差矩阵、多元正态分布、$$\chi^2$$分布、$$t$$分布、$$F$$分布、极大似然估计量、信息矩阵等基础概念;更不要说压根不安排实分析、微分方程、随机过程、动态优化等数学课了。 15 | 16 | 正是因为这一问题给面向研究生开设的《高级计量经济学》课程带来了极大的挑战,所以我尝试制作这些课件。这些课件并不能够取代本科阶段应有的数学教育,而是希望在《高级计量经济学》授课过程中,逢山开路,遇水搭桥,以“现学现用”为原则尽可能弥补课本假设学生知道所以没有讲但其实学生根本不知道的那些基础数学知识,作为一个教辅资料,帮助学生扎扎实实地学明白《高级计量经济学》教材中所有的数学推导。 17 | 18 | 练习题原本应该由学生亲自完成,这是学习取得成效的重要环节。但现实情况是,由于学生普遍基础不足,根本没有能力独立完成练习。所以本课件的另外一个组成部分是公布所有练习题的答案供学生参考。如果担心学生因此就有办法作弊,逃避亲自完成练习,那么可以采取随堂闭卷测验的方式考察学生是否确实掌握练习题的解法。希望本课件将练习题答案公布更多是帮助学生,而不是害到学生。 19 | 20 | 最后,本课件正在逐步建设中,许多内容尚处空缺,错误也在所难免,非常欢迎读者的批评指正(本书每一页边距空白处可随意点击“加号”添加评论)。 21 | 22 | ps. 
若网页中数学公式显示不正常,通常刷新一遍网页即可正常显示。这似乎是 `gitbook.com` 平台的技术性问题,或者是从国内网络访问国际网络的普遍性问题造成的,本人无法解决。 23 | 24 | # 目录 25 | 26 | ## Chapter 1 Finite-Sample Properties of OLS 27 | 28 | ### Section 1 The Classical Linear Regression Model 29 | 30 | * [Lecture Note 1.1](lecture-note/1.1.md) 31 | * [Review Question 1.1.1](question-solution/1.1.1.md) 32 | * [Review Question 1.1.2](question-solution/1.1.2.md) 33 | * [Review Question 1.1.3](question-solution/1.1.3.md) 34 | * [Review Question 1.1.4](question-solution/1.1.4.md) 35 | * [Review Question 1.1.5](question-solution/1.1.5.md) 36 | * [Review Question 1.1.6](question-solution/1.1.6.md) 37 | 38 | ### Section 2 The Algebra of Least Squares 39 | 40 | * [Lecture Note 1.2](lecture-note/1.2.md) 41 | * [Review Question 1.2.1](question-solution/1.2.1.md) 42 | * [Review Question 1.2.2](question-solution/1.2.2.md) 43 | * [Review Question 1.2.3](question-solution/1.2.3.md) 44 | * [Review Question 1.2.4](question-solution/1.2.4.md) 45 | * [Review Question 1.2.5](question-solution/1.2.5.md) 46 | * [Review Question 1.2.6](question-solution/1.2.6.md) 47 | * [Review Question 1.2.7](question-solution/1.2.7.md) 48 | * [Review Question 1.2.8](question-solution/1.2.8.md) 49 | * [Review Question 1.2.9](question-solution/1.2.9.md) 50 | 51 | ### Section 3 Finite-Sample Properties of OLS 52 | 53 | * [Review Question 1.3.1](question-solution/1.3.1.md) 54 | * [Review Question 1.3.2](question-solution/1.3.2.md) 55 | * [Review Question 1.3.3](question-solution/1.3.3.md) 56 | * [Review Question 1.3.4](question-solution/1.3.4.md) 57 | * [Review Question 1.3.5](question-solution/1.3.5.md) 58 | * [Review Question 1.3.6](question-solution/1.3.6.md) 59 | * [Review Question 1.3.7](question-solution/1.3.7.md) 60 | 61 | ### Section 4 Hypothesis Testing under Normality 62 | 63 | * [Review Question 1.4.1](question-solution/1.4.1.md) 64 | * [Review Question 1.4.2](question-solution/1.4.2.md) 65 | * [Review Question 1.4.3](question-solution/1.4.3.md) 66 | * [Review Question 1.4.4](question-solution/1.4.4.md) 67 | * [Review Question 1.4.5](question-solution/1.4.5.md) 68 | * [Review Question 1.4.6](question-solution/1.4.6.md) 69 | * [Review Question 1.4.7](question-solution/1.4.7.md) 70 | 71 | ### Section 5 Relation to Maximum Likelihood 72 | 73 | * [Review Question 1.5.1](question-solution/1.5.1.md) 74 | * [Review Question 1.5.2](question-solution/1.5.2.md) 75 | * [Review Question 1.5.3](question-solution/1.5.3.md) 76 | * [Review Question 1.5.4](question-solution/1.5.4.md) 77 | * [Review Question 1.5.5](question-solution/1.5.5.md) 78 | 79 | ### Section 6 Generalized Least Squares (GLS) 80 | 81 | * [Review Question 1.6.1](question-solution/1.6.1.md) 82 | * [Review Question 1.6.2](question-solution/1.6.2.md) 83 | * [Review Question 1.6.3](question-solution/1.6.3.md) 84 | * [Review Question 1.6.4](question-solution/1.6.4.md) 85 | 86 | ### Section 7 Application: Returns to Scale in Electricity Supply 87 | 88 | * [Lecture Note](lecture-note/1.7.md) 89 | * [Review Question 1.7.1](question-solution/1.7.1.md) 90 | * [Review Question 1.7.2](question-solution/1.7.2.md) 91 | * [Review Question 1.7.3](question-solution/1.7.3.md) 92 | * [Review Question 1.7.4](question-solution/1.7.4.md) 93 | * [Review Question 1.7.5](question-solution/1.7.5.md) 94 | * [Review Question 1.7.6](question-solution/1.7.6.md) 95 | * [Review Question 1.7.7](question-solution/1.7.7.md) 96 | * [Review Question 1.7.8](question-solution/1.7.8.md) 97 | 98 | ### Analytical Exercises 99 | 100 | * [Analytical Exercise 
1.1](exercise-solution/1.1.md) 101 | * [Analytical Exercise 1.2](exercise-solution/1.2.md) 102 | 103 | ## Chapter 2 Large-Sample Theory 104 | 105 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 106 | 107 | * [Review Question 2.1.1](question-solution/2.1.1.md) 108 | * [Review Question 2.1.2](question-solution/2.1.2.md) 109 | * [Review Question 2.1.3](question-solution/2.1.3.md) 110 | * [Review Question 2.1.4](question-solution/2.1.4.md) 111 | * [Review Question 2.1.5](question-solution/2.1.5.md) 112 | 113 | ## Supplements 114 | 115 | * [Taylor's linearization](supplements/taylor-linearization.md) 116 | * [Variance-Covariance Matrix](supplements/var-cov-matrix.md) 117 | * [Four Ways of Matrix Multiplication](supplements/matrix-multiplication.md) 118 | 119 | #### 注 120 | 121 | [^1]: Fumio Hayashi, _Econometrics_. Princeton University Press, 2000. (http://press.princeton.edu/titles/6946.html) 122 | 123 | --- 124 | 125 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.2 (Maximizing joint log likelihood) 14 | 15 | Consider maximizing (the log of) the _joint_ likelihood 16 | 17 | $$ 18 | f_{ \mathbf{y}, \mathbf{X} } ( \mathbf{y}, \mathbf{X}; \tilde{ \boldsymbol{ \zeta } } ) = 19 | f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) \cdot 20 | f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) 21 | \tag{1.5.2} 22 | $$ 23 | 24 | for the classical regression model, where $$ \tilde{ \boldsymbol{ \theta } } = ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \sigma }^2 )'$$ and $$ \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) $$ is given by 25 | 26 | $$ 27 | \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \sigma }^2 ) = 28 | - \frac{n}{2} \log ( 2 \pi ) - 29 | \frac{n}{2} \log ( \tilde{ \sigma }^2 ) - 30 | \frac{1}{ 2 \tilde{ \sigma }^2 } 31 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 32 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ). 33 | \tag{1.5.5} 34 | $$ 35 | 36 | You would parameterize the marginal likelihood $$ f( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) $$ and take the log of (1.5.2) to obtain the objective function to be maximized over $$ \boldsymbol{ \zeta } \equiv ( \boldsymbol{ \theta }', \boldsymbol{ \psi }' )' $$. (a) What is the ML estimator of $$ \boldsymbol{ \theta } \equiv ( \boldsymbol{ \beta }', \sigma^2 )' $$? (b) Derive the Cramer-Rao bound for $$ \boldsymbol{ \beta } $$. 37 | 38 | ##### Solution 39 | 40 | (a) Taking log of (1.5.2), 41 | 42 | $$ 43 | \begin{align} 44 | \log f_{ \mathbf{y}, \mathbf{X} } ( \mathbf{y}, \mathbf{X}; \tilde{ \boldsymbol{ \zeta } } ) = 45 | \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) + 46 | \log f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ). 
47 | \tag{1} 48 | \end{align} 49 | $$ 50 | 51 | The ML estimator of $$ \boldsymbol{ \theta } $$ maximizing (1) will be exactly the same as that of maximizing $$ \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) $$, because $$ \tilde{ \boldsymbol{ \theta } } $$ does not appear in $$ \log f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) $$, the first-order conditions of the two maximization will be the same. 52 | 53 | (b) By the information matrix equality, 54 | 55 | $$ 56 | \begin{align} 57 | \mathbf{I} ( \boldsymbol{ \zeta } ) & = 58 | - \mathrm{E} \left[ 59 | \frac{ \partial^2 \log L ( \boldsymbol{ \zeta } ) } 60 | { \partial \tilde{ \boldsymbol{ \zeta } } \, 61 | \partial \tilde{ \boldsymbol{ \zeta } }' } 62 | \right] 63 | \\ & = 64 | - \mathrm{E} \left[ 65 | \begin{matrix} 66 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 67 | { \partial \tilde{ \boldsymbol{ \theta } } \, 68 | \partial \tilde{ \boldsymbol{ \theta } }' } 69 | & 70 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 71 | { \partial \tilde{ \boldsymbol{ \theta } } \, 72 | \partial \tilde{ \boldsymbol{ \psi } }' } 73 | \\ 74 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 75 | { \partial \tilde{ \boldsymbol{ \psi } } \, 76 | \partial \tilde{ \boldsymbol{ \theta } }' } 77 | & 78 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 79 | { \partial \tilde{ \boldsymbol{ \psi } } \, 80 | \partial \tilde{ \boldsymbol{ \psi } }' } 81 | \end{matrix} 82 | \right] 83 | \\ & = 84 | - \mathrm{E} \left[ 85 | \begin{matrix} 86 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 87 | { \partial \tilde{ \boldsymbol{ \theta } } \, 88 | \partial \tilde{ \boldsymbol{ \theta } }' } 89 | & 90 | \mathbf{0} 91 | \\ 92 | \mathbf{0} 93 | & 94 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 95 | { \partial \tilde{ \boldsymbol{ \psi } } \, 96 | \partial \tilde{ \boldsymbol{ \psi } }' } 97 | \end{matrix} 98 | \right], 99 | \end{align} 100 | $$ 101 | 102 | since $$ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) / ( \partial \tilde{ \boldsymbol{ \theta } } \, 103 | \partial \tilde{ \boldsymbol{ \psi } }' ) = \mathbf{0} $$. Thus the information matrix $$ \mathbf{I} ( \boldsymbol{ \zeta } ) $$ is block diagonal with its first block corresponding to $$ \boldsymbol{ \theta } $$ and the second block corresponding to $$ \boldsymbol{ \psi } $$. Its inverse is also block diagonal, with its first block being the inverse of 104 | 105 | $$ 106 | - \mathrm{E} \left[ 107 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 108 | { \partial \tilde{ \boldsymbol{ \theta } } \, 109 | \partial \tilde{ \boldsymbol{ \theta } }' } 110 | \right] 111 | = 112 | - \mathrm{E} \left[ 113 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta } ) } 114 | { \partial \tilde{ \boldsymbol{ \theta } } \, 115 | \partial \tilde{ \boldsymbol{ \theta } }' } 116 | \right]. 117 | $$ 118 | 119 | So the Cramer-Rao bound for $$ \boldsymbol{ \theta } $$ is the negative of the inverse of the expected value of (1.5.12) in the text. The expectation, however, is over $$ \mathbf{y} $$ _and_ $$ \mathbf{X} $$ because here the density is a _joint_ density. Therefore, the Cramer-Rao bound for $$ \boldsymbol{ \beta } $$ is $$ \sigma^2 [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1} $$. 
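
_Comment_: This bound involves $$ [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1} $$ rather than $$ \mathrm{E} [ ( \mathbf{X}' \mathbf{X} )^{-1} ] $$. Provided these expectations exist, a matrix version of Jensen's inequality (the inverse is a convex map on positive definite matrices) gives, under Assumptions 1.1–1.4,

$$
\mathrm{Var} ( \mathbf{b} ) = \sigma^2 \, \mathrm{E} [ ( \mathbf{X}' \mathbf{X} )^{-1} ] \ge \sigma^2 \, [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1}
$$

in the positive semidefinite sense, so the unconditional variance of the OLS estimator lies weakly above this bound, with equality when $$ \mathbf{X}' \mathbf{X} $$ is nonrandom.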
120 | 121 | --- 122 | 123 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /lecture-note/1.1.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at March 26, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### The Big Picture of Econometrics 14 | 15 | Its a bridge between **model** and **data**. Model is some theoretic mathematical formulation. Data is collected/measured from the real world according to the definition of variables. 16 | 17 | The bridge is in two ways: 18 | 19 | 1. (from data to model) Parameter Estimation. 20 | 2. (from model to data) Hypothesis Test. 21 | 22 | #### Assumption 1.1 (linearity) 23 | 24 | ##### 1. It's a tautology 25 | 26 | Because the error term $$ \varepsilon_i $$ is defined as 27 | 28 | $$ 29 | \varepsilon_i = y_i - \mathbf{x}_i \cdot \boldsymbol\beta, 30 | $$ 31 | 32 | the equality in (1.1.1) trivially holds true by definition. 33 | 34 | Equation (1.1.1) only restricts a linear functional relationship between $$y$$ and $$ \mathbf{x} $$, nothing more. 35 | 36 | ##### 2. Nonlinearity can be linearized 37 | 38 | The linearity assumption is not so much restrictive, because any nonlinear function can be easily [linearized](../supplements/taylor-linearization.md). 39 | 40 | ##### 3. The knowns are $$( y_i, \mathbf{x}_i $$) and the unknowns are $$( \boldsymbol\beta, \varepsilon_i )$$ 41 | 42 | ##### 4. $$ \boldsymbol\beta $$ is of primary interest 43 | 44 | $$ \beta $$ means marginal separate effects. 45 | 46 | ##### 5. $$ \varepsilon_i $$ is of primary concern 47 | 48 | $$\varepsilon_i$$ should not depend on $$ \mathbf{x} $$ 49 | 50 | ##### 6. marginal separate effect relies on total differentiation 51 | 52 | - explicit equation 53 | - implicit equation 54 | - differential vs. elasticity 55 | 56 | ##### 7. variables are usually transformed (in log) 57 | 58 | By the rules of differentiation 59 | 60 | $$ 61 | \frac{d \ln x}{dx} = \frac{1}{x}, 62 | $$ 63 | 64 | we can write it in total differential form as 65 | 66 | $$ 67 | d \ln x = \frac{dx}{x}. 68 | $$ 69 | 70 | Similarly, 71 | 72 | $$ 73 | d \ln y = \frac{dy}{y}. 74 | $$ 75 | 76 | So 77 | 78 | $$ 79 | \frac{d \ln y}{d \ln x} = \frac{d y / y}{d x / x} 80 | $$ 81 | 82 | coincides with the definition of elasticity. It is of this reason that 83 | in economics, variables are often expressed in logs rather than in 84 | levels in equations. 85 | 86 | #### Assumption 1.2 (strict exogeneity) 87 | 88 | ##### Joint Distribution 89 | 90 | $$ 91 | f_{Y,X}(y, x) \qquad \oint f_{Y,X}(y, x)\,dx\,dy = 1 92 | $$ 93 | 94 | ##### Marginal Distribution 95 | 96 | $$ 97 | \begin{align} 98 | f_{Y} (y) \equiv \oint f_{Y,X}(y, x) \, dx && \oint f_{Y} (y) \, dy = 1 \\ 99 | f_{X} (x) \equiv \oint f_{Y,X}(y, x) \, dy && \oint f_{X} (x) \, dx = 1 100 | \end{align} 101 | $$ 102 | 103 | ##### Conditional Distribution 104 | 105 | ##### (Unconditional) Expectation 106 | 107 | The (unconditional) expectation $$\mathrm{E}(x)$$ is defined as 108 | 109 | $$ 110 | \mathrm{E}(x) = \int x f(y, x) \, dy dx 111 | $$ 112 | 113 | ##### Conditional Expectation 114 | 115 | If $$(y, x)$$ are jointly distributed random variables, where their joint p.d.f. 
is expressed as $$f(y, x)$$, then $$\mathrm{E} (y | x)$$ is defined 116 | as 117 | 118 | $$ 119 | \mathrm{E} (y|x) = \int_{-\infty}^{+\infty} y \frac{ f(y, x) }{ \int_{-\infty}^{+\infty} f(y, x) dy } dy, 120 | $$ 121 | 122 | where $$\int_{-\infty}^{+\infty} f(y, x) dy$$ is the definition of the marginal distribution of $$x$$. In words, the expectation of $$y$$ conditional on $$x$$ is the weighted average of $$y$$, where the weighting is the conditional probability density. 123 | 124 | ##### Law of Total Expectations 125 | 126 | $$ 127 | \mathrm{E} ( \mathrm{E} (y | x) ) = \mathrm{E} (y). 128 | $$ 129 | 130 | ##### Law of Iterated Expectations 131 | 132 | $$ 133 | \mathrm{E} ( \mathrm{E} (y | x, z) | z ) = \mathrm{E} (y | z). 134 | $$ 135 | 136 | ##### Moment 137 | 138 | The $$k$$-th order moment of a random variable $$x$$ is defined as 139 | 140 | $$ 141 | \mathrm{E}(x^k) 142 | $$ 143 | 144 | ##### Variance 145 | 146 | $$ 147 | \begin{align} 148 | \mathrm{Var}(x) &= \mathrm{E} [ (x - \mathrm{E} (x))^2 ] && \text{(definition)} \\ 149 | &= \mathrm{E}(x^2)- E(x)^2 && \text{(formula)} 150 | \end{align} 151 | $$ 152 | 153 | ##### Covariance 154 | 155 | $$ 156 | \begin{align} 157 | \mathrm{Cov} (x, y) &= \mathrm{E} [ (x - \mathrm{E} (x) )( y - \mathrm{E} (y)) ] && \text{(definition)} \\ 158 | &= \mathrm{E} (xy) - \mathrm{E}(x) \mathrm{E} (y) && \text{(formula)} 159 | \end{align} 160 | $$ 161 | 162 | ##### Correlation Coefficient 163 | 164 | $$ 165 | \rho_{x,y} = \frac{\mathrm{Cov} (x,y)}{\sqrt {\mathrm{Var} (x) \mathrm{Var} (y)}} \in [-1, 1] 166 | $$ 167 | 168 | ##### Linearity of Expectation 169 | 170 | $$ 171 | \mathrm{E} (ax + b) = a \mathrm{E} (x) + b 172 | $$ 173 | 174 | ##### Nonlinearity of Variance 175 | 176 | $$ 177 | \mathrm{Var} (ax + b) = a^2 \mathrm{Var} (x). 178 | $$ 179 | 180 | #### Assumption 1.3 (no multicollinearity) 181 | 182 | - perfect multicollinearity _can_ occur in rare conditions as long as its _measure_ is zero. 183 | 184 | #### Assumption 1.4 (spherical error variance) 185 | 186 | $$ 187 | \mathbf{x} \mathbf{x}' 188 | \equiv 189 | \begin{bmatrix} 190 | x_1^2 & \cdots & x_1 x_n \\ 191 | \vdots & \ddots & \vdots \\ 192 | x_n x_1 & \cdots & x_n^2 193 | \end{bmatrix} 194 | $$ 195 | 196 | $$ 197 | \mathrm{E} 198 | \begin{bmatrix} 199 | a_{11} & \cdots & a_{n1} \\ 200 | \vdots & \ddots & \vdots \\ 201 | a_{m1} & \cdots & a_{mn} 202 | \end{bmatrix} 203 | \equiv 204 | \begin{bmatrix} 205 | \mathrm{E} (a_{11}) & \cdots & \mathrm{E} (a_{n1}) \\ 206 | \vdots & \ddots & \vdots \\ 207 | \mathrm{E} (a_{m1}) & \cdots & \mathrm{E} (a_{mn}) 208 | \end{bmatrix}, 209 | $$ 210 | 211 | $$ 212 | \mathrm{Var} ( \mathbf{x} ) 213 | \equiv 214 | \mathrm{E} [ ( \mathbf{x} - \overline{\mathbf{x}} ) ( \mathbf{x} - \overline{\mathbf{x}} )' ] 215 | $$ 216 | 217 | --- 218 | 219 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at June 8, 2018 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 
12 | 13 | #### Review Question 2.1.2 (Alternative definition of convergence for vector sequences) 14 | 15 | (a) Verify that the definition in the text of “$$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$” is equivalent to 16 | 17 | $$ 18 | \lim_{n \to \infty} \operatorname{E} 19 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] 20 | = 0. 21 | $$ 22 | 23 | **Hint**: $$ \operatorname{E} [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] = 24 | \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 25 | \operatorname{E} [ ( z_{nK} - z_K )^2 ] $$, where $$K$$ is the dimension of $$\mathbf{z}$$. 26 | 27 | (b) Similarly, verify that the definition in the text of “$$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$” is equivalent to 28 | 29 | $$ 30 | \lim_{n \to \infty} \operatorname{Prob} 31 | \left( 32 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 33 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) > 34 | \varepsilon 35 | \right) = 0, 36 | $$ 37 | 38 | for any $$ \varepsilon > 0 $$. 39 | 40 | ##### Solution 41 | 42 | (a) If $$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$, by definition in the text, 43 | 44 | $$ 45 | \lim_{n \to \infty} \operatorname{E} 46 | [ ( z_{nk} - z_k )^2 ] = 0, 47 | \text{ for $k = 1, \ldots, K$}, 48 | $$ 49 | 50 | then we can show 51 | 52 | $$ 53 | \begin{align} 54 | & \color{white}{=} \lim_{n \to \infty} \operatorname{E} 55 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] \\ 56 | & = \lim_{n \to \infty} 57 | \left( 58 | \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 59 | \operatorname{E} [ ( z_{nK} - z_K )^2 ] 60 | \right) \\ 61 | & = \lim_{n \to \infty} \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 62 | \lim_{n \to \infty} \operatorname{E} [ ( z_{nK} - z_K )^2 ] \\ 63 | & = 0. 64 | \end{align} 65 | $$ 66 | 67 | Conversely, if $$ \lim_{n \to \infty} \operatorname{E} 68 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] = 0 $$, then 69 | 70 | $$ 71 | \lim_{n \to \infty} \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 72 | \lim_{n \to \infty} \operatorname{E} [ ( z_{nK} - z_K )^2 ] = 0. 73 | \tag{1} 74 | $$ 75 | 76 | Because each term in (1) is non-negative, (1) implies 77 | 78 | $$ 79 | \lim_{n \to \infty} \operatorname{E} 80 | [ ( z_{nk} - z_k )^2 ] = 0, 81 | \text{ for $k = 1, \ldots, K$}, 82 | $$ 83 | 84 | which means $$ z_{nk} \to_{m.s.} z_k $$ for $$ k = 1, \ldots, K $$, and this implies $$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$ by definition in the text. 
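
As a small numerical illustration of part (a) (an added sketch, not part of the original solution, assuming Python with NumPy), take $$ \mathbf{z}_n = \mathbf{z} + \mathbf{u}_n / \sqrt{n} $$ with $$ \mathbf{u}_n $$ standard normal and $$ \mathbf{z} $$ constant for simplicity: the componentwise mean-square errors and the joint quantity $$ \operatorname{E} [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] $$ shrink to zero together.

```python
import numpy as np

rng = np.random.default_rng(0)
K, reps = 3, 200_000
z = np.array([1.0, -2.0, 0.5])            # limit vector (constant here)

for n in (10, 100, 1000):
    u = rng.standard_normal((reps, K))
    z_n = z + u / np.sqrt(n)              # z_n = z + u_n / sqrt(n)
    sq = (z_n - z) ** 2
    componentwise = sq.mean(axis=0)       # estimates of E[(z_nk - z_k)^2], each about 1/n
    joint = sq.sum(axis=1).mean()         # estimate of E[(z_n - z)'(z_n - z)], about K/n
    print(n, componentwise.round(4), round(joint, 4))
```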
85 | 86 | (b) Similarly, if $$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$, by definition in the text, for any $$ \varepsilon > 0 $$, 87 | 88 | $$ 89 | \operatorname{Prob} 90 | \left( 91 | | z_{nk} - \alpha_k | > \varepsilon 92 | \right) \to 0, 93 | \text{ for $k = 1, \ldots, K$}, 94 | $$ 95 | 96 | then we can show 97 | 98 | $$ 99 | \begin{gather} 100 | \operatorname{Prob} 101 | \left( 102 | ( z_{nk} - \alpha_k )^2 > \varepsilon^2 103 | \right) \to 0, 104 | \text{ for $k = 1, \ldots, K$}, 105 | && \text{(equivalent event)} \\ 106 | 107 | \operatorname{Prob} 108 | \left( 109 | ( z_{nk} - \alpha_k )^2 > \varepsilon 110 | \right) \to 0, 111 | \text{ for $k = 1, \ldots, K$}, 112 | && \text{(any $\varepsilon > 0$)} \\ 113 | 114 | \operatorname{Prob} 115 | \left( 116 | ( z_{n1} - \alpha_1 )^2 > \varepsilon 117 | \right) + \cdots + 118 | \operatorname{Prob} 119 | \left( 120 | ( z_{nK} - \alpha_K )^2 > \varepsilon 121 | \right) \to 0, 122 | && \text{(limit sum)} \\ 123 | 124 | \left( 125 | \not\Rightarrow 126 | \operatorname{Prob} 127 | \left( 128 | (z_{n1} - \alpha_1)^2 + \cdots + (z_{nK} - \alpha_K)^2 > K \varepsilon 129 | \right) \to 0 130 | \right) 131 | && \text{(bigger event)} \\ 132 | 133 | \operatorname{Prob} 134 | \left( 135 | \bigcup_{k=1}^K 136 | \left( 137 | ( z_{nk} - \alpha_k )^2 > \varepsilon 138 | \right) 139 | \right) \to 0, 140 | && \text{(smaller event)} \\ 141 | 142 | \operatorname{Prob} 143 | \left( 144 | \bigcap_{k=1}^K 145 | \left( 146 | ( z_{nk} - \alpha_k )^2 \leq \varepsilon 147 | \right) 148 | \right) \to 1, 149 | && \text{(complement event)} \\ 150 | 151 | \operatorname{Prob} 152 | \left( 153 | ( z_{n1} - \alpha_1 )^2 + \cdots + 154 | ( z_{nK} - \alpha_K )^2 \leq K \varepsilon 155 | \right) \to 1, 156 | && \text{(bigger event)} \\ 157 | 158 | \operatorname{Prob} 159 | \left( 160 | ( z_{n1} - \alpha_1 )^2 + \cdots + 161 | ( z_{nK} - \alpha_K )^2 \leq \varepsilon 162 | \right) \to 1, 163 | && \text{(any $\varepsilon > 0$)} \\ 164 | 165 | \operatorname{Prob} 166 | \left( 167 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 168 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) 169 | \leq \varepsilon 170 | \right) \to 1, 171 | && \text{(matrix notation)} \\ 172 | 173 | \operatorname{Prob} 174 | \left( 175 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 176 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) 177 | > \varepsilon 178 | \right) \to 0. 179 | && \text{(complement event)} 180 | \end{gather} 181 | $$ 182 | 183 | If $$ \operatorname{Prob} \left( ( \mathbf{z}_n - \boldsymbol{\alpha} )' ( \mathbf{z}_n - \boldsymbol{\alpha} ) > \varepsilon \right) \to 0 $$, then we can show 184 | 185 | $$ 186 | \begin{gather} 187 | \operatorname{Prob} 188 | \left( 189 | ( z_{n1} - \alpha_1 )^2 + \cdots + 190 | ( z_{nK} - \alpha_K )^2 > \varepsilon 191 | \right) \to 0, 192 | && \text{(scalar notation)} \\ 193 | 194 | \operatorname{Prob} 195 | \left( 196 | \bigcup_{k=1}^K 197 | \left( 198 | ( z_{nk} - \alpha_k )^2 > \varepsilon 199 | \right) 200 | \right) \to 0, 201 | && \text{(smaller event)} \\ 202 | 203 | \operatorname{Prob} 204 | \left( 205 | ( z_{nk} - \alpha_k )^2 > \varepsilon 206 | \right) \to 0 207 | \text{ for $k = 1, \ldots, K$ }, 208 | && \text{(smaller events)} 209 | \end{gather} 210 | $$ 211 | 212 | which means $$ z_{nk} \to_p \alpha_k $$ for $$ k = 1, \ldots, K $$, and this implies $$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$. 213 | 214 | --- 215 | 216 | Copyright ©2018 by Qiang Gao --------------------------------------------------------------------------------