├── lecture-note ├── README.md ├── 1.2.md ├── 1.7.md └── 1.1.md ├── supplements ├── README.md ├── taylor-linearization.md ├── var-cov-matrix.md └── matrix-multiplication.md ├── exercise-solution ├── README.md ├── 1.4.md ├── 1.3.md ├── 1.1.md └── 1.2.md ├── question-solution ├── README.md ├── 1.3.1.md ├── 1.3.3.md ├── 1.2.2.md ├── 1.7.6.md ├── 1.4.6.md ├── 1.3.5.md ├── 1.2.1.md ├── 1.1.5.md ├── 1.6.1.md ├── 1.2.6.md ├── 1.2.8.md ├── 2.1.4.md ├── 1.6.4.md ├── 1.4.2.md ├── 1.3.2.md ├── 2.2.4.md ├── 1.7.5.md ├── 1.3.6.md ├── 1.4.4.md ├── 1.1.1.md ├── 1.4.1.md ├── 1.3.7.md ├── 1.4.7.md ├── 1.1.2.md ├── 1.2.7.md ├── 1.7.2.md ├── 1.7.3.md ├── 2.2.1.md ├── 2.1.1.md ├── 1.4.3.md ├── 1.1.6.md ├── 2.1.3.md ├── 1.6.2.md ├── 2.2.2.md ├── 1.7.4.md ├── 1.7.7.md ├── 1.6.3.md ├── 2.2.3.md ├── 1.5.5.md ├── 1.4.5.md ├── 1.5.3.md ├── 1.1.3.md ├── 1.2.5.md ├── 1.7.8.md ├── 1.5.1.md ├── 2.1.5.md ├── 1.1.4.md ├── 1.2.9.md ├── 1.5.4.md ├── 1.3.4.md ├── 1.2.4.md ├── 1.7.1.md ├── 1.2.3.md ├── 1.5.2.md └── 2.1.2.md ├── book.json ├── SUMMARY.md └── README.md /lecture-note/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /supplements/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /exercise-solution/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /question-solution/README.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /book.json: -------------------------------------------------------------------------------- 1 | { 2 | "plugins": ["mathjax"], 3 | "pluginsConfig": { 4 | "mathjax": { 5 | "forceSVG": false 6 | } 7 | } 8 | } -------------------------------------------------------------------------------- /question-solution/1.3.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.1 (Role of the no-multicollinearity assumption) 14 | 15 | In Proposition 1.1 and 1.2, where did we use Assumption 1.3 that $$ \mathrm{rank} ( \mathbf{X} ) = K $$? 16 | 17 | ##### Solution 18 | 19 | We need the no-multicollinearity condition to make sure $$ \mathbf{X}' \mathbf{X} $$ is invertible. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.3 (What Gauss-Markov does not mean) 14 | 15 | Under Assumptions 1.1–1.4, does there exist a linear, but not necessarily unbiased, estimator of $$ \boldsymbol{\beta} $$ that has a variance smaller than that of the OLS estimator? If so, how small can the variance be? 
16 | 17 | ##### Solution 18 | 19 | If an estimator of $$ \boldsymbol{\beta} $$ is a constant, then the estimator is trivially linear in $$ \mathbf{y} $$, and its variance is zero. So the answer is yes: once unbiasedness is not required, the variance can be made as small as zero, although such a constant estimator is biased unless the constant happens to equal $$ \boldsymbol{\beta} $$. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.2 14 | 15 | Verify that $$ \mathbf{X}' \mathbf{X} / n = \frac{1}{n} \sum_i \mathbf{x}_i \mathbf{x}_i' $$ and $$ \mathbf{X}' \mathbf{y} / n = \frac{1}{n} \sum_i \mathbf{x}_i y_i $$. 16 | 17 | ##### Solution 18 | 19 | By the [outer-product form](../supplements/matrix-multiplication.md) of matrix multiplication, and noticing that $$ \mathbf{x}_i $$ is the $$ i $$th column vector of $$ \mathbf{X}' $$, the above equations follow immediately. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.6 14 | 15 | Why is the $$R^2$$ of 0.926 from the unrestricted model (1.7.7) _lower_ than the $$R^2$$ of 0.932 from the restricted model (1.7.8)? 16 | 17 | ##### Solution 18 | 19 | That is because the dependent variable in the restricted regression is different from that in the unrestricted regression. If the dependent variable were the same, the $$R^2$$ of the unrestricted model could not be lower than that of the restricted model. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.6 ($$t$$ vs. $$F$$) 14 | 15 | “It is nonsense to test a hypothesis consisting of a large number of equality restrictions, because the $$t$$-test will most likely reject at least some of the restrictions.” Criticize this statement. 16 | 17 | ##### Solution 18 | 19 | Testing multiple restrictions _jointly_ is different from testing them _separately_. As explained in the text, if the $$t$$-test is applied to each restriction without adjusting the critical value, the probability of rejecting at least one true restriction (the overall significance level) increases with the number of restrictions tested. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ...
12 | 13 | #### Review Question 1.3.5 14 | 15 | Propose an unbiased estimator of $$ \sigma^2 $$ if you had data on $$ \boldsymbol{ \varepsilon } $$. 16 | 17 | ##### Solution 18 | 19 | $$ \boldsymbol{ \varepsilon }' \boldsymbol{ \varepsilon } / n $$ is an unbiased estimator of $$ \sigma^2 $$. This is because 20 | 21 | $$ 22 | \begin{align} 23 | \mathrm{E} ( \boldsymbol{ \varepsilon }' \boldsymbol{ \varepsilon } / n ) 24 | & = 25 | \frac{1}{n} \sum_{i=1}^n \mathrm{E} ( \varepsilon_i^2 ) 26 | \\ & = 27 | \frac{1}{n} \sum_{i=1}^n \mathrm{E} [ \mathrm{E} ( \varepsilon_i^2 \mid \mathbf{X} ) ] 28 | \\ & = 29 | \frac{1}{n} \cdot n \sigma^2 30 | \\ & = 31 | \sigma^2. 32 | \end{align} 33 | $$ 34 | 35 | --- 36 | 37 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.1 14 | 15 | Prove that $$ \mathbf{X}' \mathbf{X} $$ is positive definite if $$ \mathbf{X} $$ is of full column rank. 16 | 17 | ##### Solution 18 | 19 | By the definition of a positive definite matrix, we need to show that $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} > 0 $$ for $$ \mathbf{c} \neq \mathbf{0} $$. Define $$ \mathbf{z} \equiv \mathbf{X} \mathbf{c} $$. Then $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} = \mathbf{z}' \mathbf{z} = \sum_{i=1}^{n} z_i^2 $$. If $$ \mathbf{X} $$ is of full column rank, then the column vectors of $$ \mathbf{X} $$ are linearly independent. This means $$ \mathbf{X} \mathbf{c} = \mathbf{0} $$ if and only if $$ \mathbf{c} = \mathbf{0} $$. Then $$ \mathbf{z} \neq \mathbf{0} $$ for any $$ \mathbf{c} \neq \mathbf{0} $$, and hence $$ \mathbf{c}' \mathbf{X}' \mathbf{X} \mathbf{c} = \sum_{i=1}^{n} z_i^2 > 0 $$. 20 | 21 | --- 22 | 23 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.5 (Multicollinearity for the simple regression model) 14 | 15 | Show that Assumption 1.3 for the simple regression model is that the nonconstant regressor ($$ x_{i2} $$) is really nonconstant (i.e. $$ x_{i2} \neq x_{j2} $$ for some pairs of $$ (i, j) $$, $$ i \neq j $$, with probability one). 16 | 17 | ##### Solution 18 | 19 | The simple regression model is 20 | 21 | $$ 22 | \mathbf{y} = \beta_1 \cdot \mathbf{1} + \beta_2 \mathbf{x}_2 + \boldsymbol{\varepsilon}. 23 | $$ 24 | 25 | Assumption 1.3 requires that $$ \{ \mathbf{1}, \mathbf{x}_2 \} $$ are linearly independent with probability one. This means $$ \mathbf{x}_2 $$ is not proportional to $$ \mathbf{1} $$ with probability one, i.e., $$ x_{i2} \neq x_{j2} $$ for some pairs of $$ (i, j) $$, $$ i \neq j $$, with probability one.
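The rank condition can also be illustrated numerically. The following is a minimal NumPy sketch, not part of the original solution; the sample size and the particular regressor values are arbitrary choices for illustration.

```python
import numpy as np

n = 5
ones = np.ones(n)

# Nonconstant regressor: {1, x2} are linearly independent, so X'X has full rank.
x2 = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
X = np.column_stack([ones, x2])
print(np.linalg.matrix_rank(X.T @ X))              # 2: X'X is invertible

# Constant regressor: x2 is proportional to the vector of ones,
# so rank(X) < K and X'X is singular.
x2_const = np.full(n, 3.0)
X_const = np.column_stack([ones, x2_const])
print(np.linalg.matrix_rank(X_const.T @ X_const))  # 1: X'X is not invertible
```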
26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at Jun 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.4 (Partitioned regression) 12 | 13 | Let $$ \mathbf{X} $$ be partitioned as 14 | 15 | $$ 16 | \underset{ n \times K }{ \mathbf{X} } \equiv 17 | \left[ 18 | \underset{ n \times K_1 }{ \mathbf{X}_1 } \; 19 | \vdots \; 20 | \underset{ n \times K_2 }{ \mathbf{X}_2 } 21 | \right]. 22 | $$ 23 | 24 | Partition $$ \boldsymbol{\beta} $$ accordingly: 25 | 26 | $$ 27 | \boldsymbol{\beta} \equiv 28 | \begin{bmatrix} 29 | \boldsymbol{\beta}_1 \\ \boldsymbol{\beta}_2 30 | \end{bmatrix} 31 | \quad 32 | \begin{array}{l} 33 | \leftarrow K_1 \times 1 34 | \\ 35 | \leftarrow K_2 \times 1 36 | \end{array}. 37 | $$ 38 | 39 | Thus, the regression can be written as 40 | 41 | $$ 42 | \mathbf{y} = 43 | \mathbf{X}_1 \boldsymbol{ \beta }_1 + 44 | \mathbf{X}_2 \boldsymbol{ \beta }_2 + 45 | \boldsymbol{ \varepsilon }. 46 | $$ 47 | 48 | Let ... 49 | 50 | ##### Solution 51 | 52 | Let ... 53 | 54 | --- 55 | 56 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.1 (The no-multicollinearity assumption for the transformed model) 14 | 15 | Assumption 1.3 for the transformed model is that $$ \mathrm{rank} ( \mathbf{C} \mathbf{X} ) = K $$. This is satisfied since $$ \mathbf{C} $$ is nonsingular and $$ \mathbf{X} $$ is of full column rank. Show this. 16 | 17 | ##### Solution 18 | 19 | To show $$ \mathrm{rank} ( \mathbf{C} \mathbf{X} ) = K $$ is equivalent to show $$ \mathbf{C} \mathbf{X} \mathbf{v} \neq \mathbf{0} $$ for any $$ \mathbf{v} \neq \mathbf{0} $$. 20 | 21 | For any $$ \mathbf{v} \neq \mathbf{0} $$, $$ \mathbf{X} \mathbf{v} \neq \mathbf{0} $$ because $$ \mathbf{X} $$ is of full column rank. Let $$ \mathbf{u} \equiv \mathbf{X} \mathbf{v} \neq \mathbf{0} $$, because $$ \mathbf{C} $$ is nonsingular (also full column rank), $$ \mathbf{C} \mathbf{u} = \mathbf{C} \mathbf{X} \mathbf{v} \neq \mathbf{0} $$. Q.E.D. 22 | 23 | --- 24 | 25 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.6 (Change in units and $$ R^2 $$) 14 | 15 | Does a change in the unit of measurement for the dependent variable change $$ R^2 $$? A change in the unit of measurement for the regressors? **Hint:** Check whether the change affects the denominator and the numerator in the definition for $$ R^2 $$. 
16 | 17 | ##### Solution 18 | 19 | By definition, 20 | 21 | $$ 22 | R^2 = \frac{ \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 } 23 | { \sum_{i=1}^n (y_i - \bar{y})^2 }. 24 | $$ 25 | 26 | (1) If we change the unit of measurement of the dependent variable $$ y $$, then $$ y_i $$, $$ \hat{y}_i $$, and $$ \bar{y} $$ simultaneously scales by a factor $$ \alpha $$. Both the denominator and the numerator are scaled by $$ \alpha^2 $$, leaving $$ R^2 $$ unchanged. 27 | 28 | (2) If we change the unit of measurement of the regressors, then $$ y_i $$, $$ \hat{y}_i $$, and $$ \bar{y} $$ donnot change at all. $$ R^2 $$ remains unchanged. 29 | 30 | --- 31 | 32 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.8.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 22, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.8 14 | 15 | Show that 16 | 17 | $$ 18 | R_{uc}^2 = \frac{ \mathbf{y}' \mathbf{P} \mathbf{y} } 19 | { \mathbf{y}' \mathbf{y} }. 20 | $$ 21 | 22 | ##### Solution 23 | 24 | $$ 25 | \begin{align} 26 | R_{uc}^2 & = 1 - \frac{ \mathbf{e}' \mathbf{e} } 27 | { \mathbf{y}' \mathbf{y} } 28 | && 29 | \text{(definition (1.2.16))} 30 | \\ & = 31 | \frac{ \mathbf{y}' \mathbf{y} - \mathbf{e}' \mathbf{e} } 32 | { \mathbf{y}' \mathbf{y} } 33 | \\ & = 34 | \frac{ \mathbf{y}' \mathbf{I} \mathbf{y} - 35 | \mathbf{y}' \mathbf{M} \mathbf{y} } 36 | { \mathbf{y}' \mathbf{y} } 37 | && 38 | \text{($ \mathbf{e} = \mathbf{M} \mathbf{y} $, $\mathbf{M}$ idempotent)} 39 | \\ & = 40 | \frac{ \mathbf{y}' (\mathbf{I} - \mathbf{M}) \mathbf{y} } 41 | { \mathbf{y}' \mathbf{y} } 42 | \\ & = 43 | \frac{ \mathbf{y}' \mathbf{P} \mathbf{y} } 44 | { \mathbf{y}' \mathbf{y} }. 45 | && 46 | (\mathbf{M} \equiv \mathbf{I} - \mathbf{P}) 47 | \end{align} 48 | $$ 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.4 14 | 15 | Suppose $$ \sqrt{n} ( \hat\theta_n - \theta ) \to_d N(0, \sigma^2) $$. Does it follow that $$ \hat\theta_n \to_p \theta $$? 16 | 17 | **Hint**: 18 | 19 | $$ 20 | \hat\theta_n - \theta = \frac{1}{\sqrt{n}} \cdot 21 | \sqrt{n} ( \hat\theta_n - \theta ) 22 | \text{, } 23 | \operatorname*{plim}_{n \to \infty} 24 | \frac{1}{\sqrt{n}} = 0. 25 | $$ 26 | 27 | ##### Solution 28 | 29 | Using the multiply-and-divide strategy, 30 | 31 | $$ 32 | \hat\theta_n - \theta = \frac{1}{\sqrt{n}} \cdot 33 | \sqrt{n} ( \hat\theta_n - \theta ). 34 | \tag{1} 35 | $$ 36 | 37 | Because $$ 1 / \sqrt{n} \to 0 $$, as is shown in [Review Question 2.1.1](2.1.1.md), 38 | 39 | $$ 40 | \frac{1}{\sqrt{n}} \to_p 0. 41 | \tag{2} 42 | $$ 43 | 44 | Because $$ \sqrt{n} ( \hat\theta_n - \theta ) \to_d N(0, \sigma^2) $$, combining (2) into (1) and Lemma 2.4(b), 45 | 46 | $$ 47 | \begin{gather} 48 | \hat\theta_n - \theta \to_p 0, \\ 49 | 50 | \hat\theta_n \to_p \theta. 
51 | \end{gather} 52 | $$ 53 | 54 | --- 55 | 56 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.4 (Sampling error of GLS) 14 | 15 | Show: $$ \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} - \boldsymbol{ \beta } = ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon } $$. 16 | 17 | ##### Solution 18 | 19 | $$ 20 | \begin{align} 21 | \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} 22 | & = 23 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{y} 24 | \tag{1.6.5} 25 | \\ & = 26 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} ( \mathbf{X} \boldsymbol{ \beta } + \boldsymbol{ \varepsilon } ) 27 | \\ & = 28 | ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \boldsymbol{ \beta } + ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon } 29 | \\ & = 30 | \boldsymbol{ \beta } + ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \boldsymbol{ \varepsilon }. 31 | \end{align} 32 | $$ 33 | 34 | --- 35 | 36 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 10, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.2 (Computation of test statistics) 14 | 15 | Verify that $$ SE( b_k ) $$ as well as $$ \mathbf{b} $$, $$ SSR $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from the following sample averages: $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 16 | 17 | ##### Solution 18 | 19 | In [review question 1.2.9](1.2.9.md), it has been shown that $$ \mathbf{b} $$, $$ SSR $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from sample averages $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 20 | 21 | Because 22 | 23 | $$ 24 | SE( b_k ) \equiv \sqrt{ s^2 \cdot \left( ( \mathbf{X}' \mathbf{X} )^{-1} \right)_{kk} }, 25 | $$ 26 | 27 | by definition that $$ \mathbf{S}_{\mathbf{xx}} = \frac{1}{n} \mathbf{X}' \mathbf{X} $$, it is obvious that $$ SE( b_k ) $$ can be calculated from sample averages of $$ \mathbf{S}_{ \mathbf{xx} } $$, $$ \mathbf{s}_{ \mathbf{xy} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. 28 | 29 | --- 30 | 31 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 23, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 
12 | 13 | #### Review Question 1.3.2 (Example of a linear estimator) 14 | 15 | For the consumption function example in Example 1.1, propose a linear and unbiased estimator of $$ \beta_2 $$ that is different from the OLS estimator. 16 | 17 | ##### Solution 18 | 19 | We propose an estimator $$ \widehat{\beta}_2 = ( CON_2 - CON_1 ) / ( YD_2 - YD_1 ) $$. 20 | 21 | 1. When $$ YD_1, \ldots, YD_n $$ is known, $$ \widehat{\beta}_2 $$ is a linear combination of $$ CON_1, \ldots, CON_n $$. 22 | 23 | 2. Because 24 | 25 | $$ 26 | \begin{align} 27 | \mathrm{E} ( \widehat{\beta}_2 \mid YD_1, \ldots, YD_n ) 28 | & = 29 | \mathrm{E} \left( \left. 30 | \frac{ ( \beta_1 + \beta_2 YD_2 + \varepsilon_2) - ( \beta_1 + \beta_2 YD_1 + \varepsilon_1 ) } 31 | { YD_2 - YD_1 } 32 | \right| YD_1, \ldots, YD_n 33 | \right) 34 | \\ & = 35 | \mathrm{E} \left( \left. 36 | \frac{ \beta_2 (YD_2 - YD_1) + \varepsilon_2 - \varepsilon_1 } 37 | { YD_2 - YD_1 } 38 | \right| YD_1, \ldots, YD_n 39 | \right) 40 | \\ & = 41 | \beta_2 + 0 - 0 = \beta_2, 42 | \end{align} 43 | $$ 44 | 45 | So $$ \widehat{\beta}_2 $$ proposed here is linear and unbiased. 46 | 47 | --- 48 | 49 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Nov 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.4 14 | 15 | Let $$ \{ x_i \} $$ be a sequence of real numbers that change with $$i$$ and $$ \{ \varepsilon_i \} $$ be a sequence of i.i.d. random variables with mean 0 and finite variance. Is $$ \{ x_i \cdot \varepsilon_i \} $$ i.i.d.? [Answer: No.] Is it serially independent? [Answer: Yes.] An m.d.s? [Answer: Yes.] Stationary? [Answer: No.] 16 | 17 | ##### Solution 18 | 19 | (a) i.i.d.? No. Because $$ \operatorname*{Var} (x_i \cdot \varepsilon_i) = x_i^2 \sigma^2 $$ changes with $$i$$. 20 | 21 | (b) Serially independent? Yes. Because $$ x_i \cdot \varepsilon_i $$ can be considered as a function of $$ \varepsilon_i $$, $$ f_i (\varepsilon_i) = x_i \cdot \varepsilon_i $$, and the functions of two independent random variables are also independent. 22 | 23 | (c) m.d.s.? Yes. Let $$ z_i = x_i \cdot \varepsilon_i $$, $$ \operatorname*{E} ( z_i | z_{i-1}, z_{i-2}, \ldots, z_{1} ) = \operatorname*{E} ( z_i ) = x_i \operatorname*{E} ( \varepsilon_i ) = 0 $$. 24 | 25 | (d) Stationary? No. Because $$ \operatorname*{Var} (x_i \cdot \varepsilon_i) = x_i^2 \sigma^2 $$ changes with $$i$$. 26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 
12 | 13 | #### Review Question 1.7.5 14 | 15 | If you take $$p_{i2}$$ instead of $$p_{i3}$$ and subtract $$ \log (p_{i2}) $$ from both sides of 16 | 17 | $$ 18 | \log ( TC_i ) = \beta_1 + \beta_2 \log ( Q_i ) + \beta_3 \log ( p_{i1} ) + \beta_4 \log ( p_{i2} ) + \beta_5 \log ( p_{i3} ) + \varepsilon_i, 19 | \tag{1.7.4} 20 | $$ 21 | 22 | how does the restricted regression look? Without actually estimating it on Nerlove's data, can you tell from the estimated restricted regression in the text what the restricted OLS estimate of $$ ( \beta_1, \ldots, \beta_5 ) $$ will be? Their standard errors? the $$SSR$$? What about the $$R^2$$? 23 | 24 | ##### Solution 25 | 26 | The new restricted regression will be 27 | 28 | $$ 29 | \log \left( \frac{ TC_i }{ p_{i2} } \right) = \beta_1 + 30 | \beta_2 \log ( Q_i ) + 31 | \beta_3 \log \left( \frac{ p_{i1} }{ p_{i2} } \right) + 32 | \beta_5 \log \left( \frac{ p_{i3} }{ p_{i2} } \right) + 33 | \varepsilon_i. 34 | \tag{1} 35 | $$ 36 | 37 | The OLS estimate from regression on (1) should yield the same point estimate and standard errors. The $$SSR$$ should be the same, but $$R^2$$ should be different. 38 | 39 | --- 40 | 41 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.6 14 | 15 | Prove part (d) of Proposition 1.1, under Assumptions 1.1—1.4, $$ \mathrm{Cov} ( \mathbf{b}, \mathbf{e} \mid \mathbf{X} ) = \mathbf{0} $$, where $$ \mathbf{e} \equiv \mathbf{y} - \mathbf{X} \mathbf{b} $$. 16 | 17 | ##### Solution 18 | 19 | By [definition of covariance](../supplements/var-cov-matrix.md), 20 | 21 | $$ 22 | \begin{align} 23 | \mathrm{Cov} ( \mathbf{b}, \mathbf{e} \mid \mathbf{X} ) 24 | & = 25 | \mathrm{E} \{ [ \mathbf{b} - \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) ][ \mathbf{e} - \mathrm{E} ( \mathbf{e} \mid \mathbf{X} ) ]' \mid \mathbf{X} \} 26 | \\ & = 27 | \mathrm{E} \{ [ \mathbf{A} \boldsymbol{ \varepsilon } ][ \mathbf{M} \boldsymbol{ \varepsilon } ]' \mid \mathbf{X} \} 28 | && 29 | ( \mathbf{A} \equiv (\mathbf{X}' \mathbf{X})^{-1} \mathbf{X}', \mathbf{M} \equiv \mathbf{I} - \mathbf{X} 30 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' ) 31 | \\ & = 32 | \mathbf{A} \mathrm{E} ( \boldsymbol{ \varepsilon } \boldsymbol{\varepsilon}' \mid \mathbf{X} ) \mathbf{M}' 33 | \\ & = 34 | \sigma^2 \mathbf{A} \mathbf{M}' 35 | && 36 | ( \mathbf{M} \mathbf{X} = \mathbf{0} ) 37 | \\ & = \mathbf{0}. 38 | \end{align} 39 | $$ 40 | 41 | --- 42 | 43 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.4 (One-tailed $$t$$-test) 14 | 15 | The $$t$$-test described in the text is the **tow-tailed $$t$$-test** because the significance $$\alpha$$ is equally distributed between both tails of the $$t$$ distribution. 
Suppose the alternative is one-sided and written as $$ \mathrm{H}_1: \beta_k > \bar{\beta}_k $$. Consider the following modification of the decision rule of the $$t$$-test. 16 | 17 | 1. Same as above. 18 | 2. Find the critical value $$t_\alpha$$ such that the area in the $$t$$ distribution to the right of $$t_\alpha$$ is $$\alpha$$. Note the difference from the two-tailed test: the left tail is ignored and the area of $$\alpha$$ is assigned to the upper tail only. 19 | 3. Accept if $$t_k < t_\alpha$$; reject otherwise. 20 | 21 | Show that the size (significance level) of this **one-tailed $$t$$-test** is $$\alpha$$. 22 | 23 | ##### Solution 24 | 25 | The size of a test is the probability of falsely rejecting the null hypothesis when it is true. When $$\mathrm{H}_0$$ is true, the test statistic $$t_k$$ has a $$ t(n-K) $$ distribution. If we reject whenever $$t_k \ge t_\alpha$$, then the probability of rejection is $$\alpha$$ by construction. So the size of this **one-tailed $$t$$-test** is $$\alpha$$. 26 | 27 | --- 28 | 29 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/taylor-linearization.md: -------------------------------------------------------------------------------- 1 | ## Taylor's Linearization 2 | 3 | by Qiang Gao, updated at Mar 9, 2018 4 | 5 | --- 6 | 7 | Any nonlinear differentiable function, i.e., a curve of points $$(y, x)$$, depicted as $$y = f(x)$$, can be _locally_ approximated as a line, called **linearization**, around any point $$(\bar{y}, \bar{x})$$ on the curve. 8 | 9 | #### One Independent Variable Case 10 | 11 | A function of a single variable $$y = f(x)$$ can be approximately linearized around a point $$(\bar{y}, \bar{x})$$, where $$\bar{y} = f(\bar{x})$$, as 12 | 13 | $$ 14 | y - \bar y \approx \frac{d f(\bar{x})}{d x} (x - \bar x). 15 | $$ 16 | 17 | #### Two Independent Variable Case 18 | 19 | A function of two variables $$y = f(x_1, x_2)$$ can be approximately linearized around a point $$(\bar{y}, \bar{x}_1, \bar{x}_2)$$, where $$\bar{y} = f(\bar{x}_1, \bar{x}_2)$$, as 20 | 21 | $$ 22 | y - \bar{y} \approx \frac{\partial f(\bar{x}_1, \bar{x}_2)}{\partial x_1} (x_1 - \bar{x}_1) + \frac{\partial f(\bar{x}_1, \bar{x}_2)}{\partial x_2} (x_2 - \bar{x}_2). 23 | $$ 24 | 25 | #### Many Independent Variable Case 26 | 27 | A function of $$K$$ independent variables $$y = f( \mathbf{x})$$, where $$\mathbf{x} = (x_1, x_2, \ldots, x_K)$$, can be approximately linearized around a point $$(\bar{y}, \bar{ \mathbf{x} })$$, where $$\bar{y} = f( \bar{ \mathbf{x} } )$$, as 28 | 29 | $$ 30 | y - \bar{y} \approx \nabla f(\bar{ \mathbf{x} }) \cdot ( \mathbf{x} - \bar{ \mathbf{x} } ). 31 | $$ 32 | 33 | --- 34 | 35 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.1 (Change in units in the semi-log form) 14 | 15 | In the wage equation 16 | 17 | $$ 18 | \log ( WAGE_i ) = \beta_1 + \beta_2 S_i + \beta_3 TENURE_i + \beta_4 EXPR_i + \varepsilon_i, 19 | \tag{1.1.3} 20 | $$ 21 | 22 | of Example 1.2, if $$ WAGE $$ is measured in cents rather than in dollars, what difference does it make to the equation?
23 | 24 | ##### Solution 25 | 26 | Let $$ WAGE' $$ denote the $$ WAGE $$ variable measured in cents, that is, 27 | 28 | $$ 29 | WAGE' = 100 WAGE, 30 | $$ 31 | 32 | $$ 33 | \log ( WAGE' ) = \log (100) + \log ( WAGE ). 34 | $$ 35 | 36 | Substituting into equation (1.1.3), 37 | 38 | $$ 39 | \log ( WAGE_i') - \log (100) = \beta_1 + \beta_2 S_i + 40 | \beta_3 TENURE_i + \beta_4 EXPR_i + 41 | \varepsilon_i, 42 | $$ 43 | 44 | $$ 45 | \log (\mathit{WAGE}_i') = \log (100) + \beta_1 + \beta_2 S_i + 46 | \beta_3 \mathit{TENURE}_i + \beta_4 \mathit{EXPR}_i + 47 | \varepsilon_i, 48 | $$ 49 | 50 | $$ 51 | \log ( WAGE_i') = \beta_1' + \beta_2 S_i + \beta_3 52 | TENURE_i + \beta_4 EXPR_i + \varepsilon_i, 53 | $$ 54 | 55 | where $$ \beta_1' $$ is defined as $$ \beta_1' = \log (100) + \beta_1 $$. So the _only_ difference is $$ \beta_1 $$ is increased by $$ \log (100)$$. 56 | 57 | --- 58 | 59 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 10, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.1 (Conditionial vs. unconditional distribution) 14 | 15 | (a) Do we know from Assumptions 1.1—1.5 that the marginal (unconditional) distribution of $$ \mathbf{b} $$ is normal? 16 | 17 | (b) Are the statistics $$ z_k $$, $$ t_k $$, and $$ F $$ distributed independently of $$ \mathbf{X} $$? 18 | 19 | ##### Solution 20 | 21 | (a) Under Assumptions 1.1—1.5, 22 | 23 | $$ 24 | \mathbf{b} \mid \mathbf{X} \sim 25 | N( \boldsymbol{\beta}, \sigma^2 \cdot ( \mathbf{X}' \mathbf{X} )^{-1} ). 26 | \tag{1.4.2} 27 | $$ 28 | 29 | Because the variance of $$ \mathbf{b} $$ depends on $$ \mathbf{X} $$, when the marginal distribution of $$ \mathbf{X} $$ is unknown, the marginal distribution of $$ \mathbf{b} $$ is also unknown, so it is not necessarily normal. 30 | 31 | (b) Under Assumptions 1.1—1.5 and the null hypothesis, 32 | 33 | $$ 34 | \begin{align} 35 | z_k \mid \mathbf{X} & \sim N(0, 1), 36 | \tag{1.4.3} 37 | \\ 38 | t_k \mid \mathbf{X} & \sim t(n - K), 39 | \tag{1.4.5} 40 | \\ 41 | F \mid \mathbf{X} & \sim F(\# \mathbf{r}, n - K). 42 | \tag{1.4.9} 43 | \end{align} 44 | $$ 45 | 46 | These _conditional_ distributions does not depend on the value of $$ \mathbf{X} $$, so their _marginal_ distributions are independent of $$ \mathbf{X} $$. 47 | 48 | --- 49 | 50 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.7 14 | 15 | Prove (1.2.21), 16 | 17 | $$ 18 | 0 \le p_i \le 1 \text{ and } \sum_{i=1}^{n} p_i = K, 19 | \tag{1.2.21} 20 | $$ 21 | 22 | where 23 | 24 | $$ 25 | p_i \equiv \mathbf{x}_i' ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{x}_i. 
26 | \tag{1.2.20} 27 | $$ 28 | 29 | ##### Solution 30 | 31 | Because 32 | 33 | $$ 34 | \mathbf{P} \equiv \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }', 35 | $$ 36 | 37 | $$ p_i $$ is the $$i$$-th row and $$i$$-th column of $$ \mathbf{P} $$. Because $$ \mathbf{P} $$ is positive semidefinite, $$ p_i \ge 0 $$. Similarly, because 38 | 39 | $$ 40 | \mathbf{M} \equiv \mathbf{I} - \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }', 41 | $$ 42 | 43 | $$ 1 - p_i $$ is the $$i$$-th row and $$i$$-th column of $$ \mathbf{M} $$. Because $$ \mathbf{M} $$ is positive semidefinite, $$ 1 - p_i \ge 0 $$, $$ p_i \le 1 $$. 44 | 45 | Finally, 46 | 47 | $$ 48 | \begin{align} 49 | \sum_{i=1}^n p_i 50 | & = 51 | \mathrm{trace} ( \mathbf{P} ) 52 | \\ & = 53 | \mathrm{trace} ( \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }' ) 54 | \\ & = 55 | \mathrm{trace} ( ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{ X }' \mathbf{X} ) 56 | \\ & = 57 | \mathrm{trace} ( \mathbf{I}_K ) 58 | \\ & = 59 | K. 60 | \end{align} 61 | $$ 62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.7 (Variance of $$s^2$$) 14 | 15 | Show that, under Assumptions 1.1—1.5, 16 | 17 | $$ 18 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 19 | \frac{ 2 \sigma^4 }{ n - K }. 20 | $$ 21 | 22 | **Hint:** If a random variable is distributed as $$ \chi^2 (m) $$, then its mean is $$m$$ and variance $$2m$$. 23 | 24 | ##### Solution 25 | 26 | Because 27 | 28 | $$ 29 | s^2 \equiv \frac{ \mathbf{e}' \mathbf{e} }{ n - K } = 30 | \frac{ \sigma^2 }{ n - K } 31 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right)' 32 | \mathbf{M} 33 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right), 34 | $$ 35 | 36 | by property of variance, 37 | 38 | $$ 39 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 40 | \frac{ \sigma^4 }{ (n - K)^2 } \mathrm{Var} ( q \mid \mathbf{X} ), 41 | \tag{1} 42 | $$ 43 | 44 | where 45 | 46 | $$ 47 | q \equiv 48 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right)' 49 | \mathbf{M} 50 | \left( \frac{ \boldsymbol{\varepsilon} }{ \sigma } \right) 51 | $$ 52 | 53 | defined in page 36 in text is distributed as $$ q \mid \mathbf{X} \sim \chi^2 (n - K)$$. So $$ \mathrm{Var} ( q \mid \mathbf{X} ) = 2(n - K) $$, and substituting into (1) 54 | 55 | $$ 56 | \mathrm{Var} ( s^2 \mid \mathbf{X} ) = 57 | \frac{ 2 \sigma^4 }{ n - K }. 58 | $$ 59 | 60 | --- 61 | 62 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 
12 | 13 | #### Review Question 1.1.2 (Conditional cross-moment of error terms) 14 | 15 | Prove the last equality in (1.1.15), 16 | 17 | $$ 18 | \mathrm{E} (\varepsilon_i \varepsilon_j \mid \mathbf{X}) 19 | = 20 | \mathrm{E} (\varepsilon_i \mid \mathbf{x}_i) 21 | \mathrm{E} (\varepsilon_j \mid \mathbf{x}_j) 22 | \qquad 23 | \text{(for $i \neq j$)}. 24 | \tag{1.1.15} 25 | $$ 26 | 27 | ##### Solution 28 | $$ 29 | \begin{align} 30 | \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X} ) 31 | & = 32 | \mathrm{E} [ \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X}, \varepsilon_j ) \mid \mathbf{X} ] 33 | && 34 | \text{(law of iterated expectations)} 35 | \\ 36 | & = \mathrm{E} [ \varepsilon_j \mathrm{E} ( \varepsilon_i \mid \mathbf{X}, \varepsilon_j ) \mid \mathbf{X} ] 37 | && 38 | \text{($\varepsilon_j$ is constant under condition)} 39 | \\ 40 | & = \mathrm{E} [ \varepsilon_j \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mid \mathbf{X} ] 41 | && 42 | \text{(random sample)} \\ 43 | & = \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{X}) 44 | && 45 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$)} \\ 46 | & = \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{x}_j) 47 | && 48 | \text{(random sample)} 49 | \end{align} 50 | $$ 51 | 52 | --- 53 | 54 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.7 (Relation between $$ R_{uc}^2 $$ and $$ R^2 $$) 14 | 15 | Show that 16 | 17 | $$ 18 | 1 - R^2 = \left( 1 + \frac{ n \cdot \bar{y}^2 }{ \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 19 | (1 - R_{uc}^2). 20 | $$ 21 | 22 | **Hint:** Use the identity $$ \sum_i (y_i - \bar{y})^2 = \sum_i y_i^2 - n \cdot \bar{y}^2 $$. 23 | 24 | ##### Solution 25 | 26 | By definition of $$ R^2 $$, the left side equals 27 | 28 | $$ 29 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n (y_i - \bar{y})^2 }. 30 | $$ 31 | 32 | By definition of $$ R_{uc}^2 $$, the right side equals 33 | 34 | $$ 35 | \begin{align} 36 | & \left( 1 + \frac{ n \cdot \bar{y}^2 } 37 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 38 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 39 | \\ 40 | = & 41 | \left( \frac{ \sum_i y_i^2 - n \cdot \bar{y}^2 42 | + n \cdot \bar{y}^2 } 43 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 44 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 45 | \qquad 46 | \text{(hint)} 47 | \\ 48 | = & 49 | \left( \frac{ \sum_i y_i^2 } 50 | { \sum_{i=1}^n (y_i - \bar{y})^2 } \right) 51 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n y_i^2 } 52 | \\ 53 | = & 54 | \frac{ \sum_{i=1}^n e_i^2 }{ \sum_{i=1}^n 55 | (y_i - \bar{y})^2 }. 56 | \end{align} 57 | $$ 58 | 59 | Left side and right side are equal. 
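The identity can also be checked numerically. Below is a minimal sketch, not part of the original solution, that runs OLS with a constant on simulated data; the data-generating process and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# OLS with a constant
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b

R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)   # centered R^2
R2_uc = 1 - (e @ e) / (y @ y)                    # uncentered R^2, definition (1.2.16)

lhs = 1 - R2
rhs = (1 + n * y.mean() ** 2 / np.sum((y - y.mean()) ** 2)) * (1 - R2_uc)
print(np.isclose(lhs, rhs))   # True
```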
60 | 61 | --- 62 | 63 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.2 (Change of units) 14 | 15 | In Nerlove's data, output is measured in kilowatt hours. If output were measured in megawatt hours, how would the estimated restricted regression change? 16 | 17 | ##### Solution 18 | 19 | In the restricted regression 20 | 21 | $$ 22 | \log \left( \frac{ TC_i }{ p_{i3} } \right) = 23 | \beta_1 + \beta_2 \log ( Q_i ) + \beta_3 \log 24 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 25 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 26 | \varepsilon_i, 27 | \tag{1.7.6} 28 | $$ 29 | 30 | if output measured in kilowatt hours ($$Q_i$$) is substituted by output measured in megawatt hours ($$ Q'_i \equiv Q_i / 1000 $$), 31 | 32 | $$ 33 | \begin{align} 34 | \log \left( \frac{ TC_i }{ p_{i3} } \right) & = 35 | \beta_1 + \beta_2 \log ( 1000 \cdot Q'_i ) + \beta_3 \log 36 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 37 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 38 | \varepsilon_i 39 | \\ & = 40 | ( \beta_1 + \beta_2 \log (1000) ) + 41 | \beta_2 \log ( Q'_i ) + \beta_3 \log 42 | \left( \frac{ p_{i1} }{ p_{i3} } \right) + \beta_4 \log 43 | \left( \frac{ p_{i2} }{ p_{i3} } \right) + 44 | \varepsilon_i, 45 | \end{align} 46 | $$ 47 | 48 | then the estimated values of slopes ($$ \beta_2 $$, $$ \beta_3 $$ and $$ \beta_4 $$) will not change, but the estimated value of the intercept ($$ \beta_1 $$) will increase by $$ \hat{\beta}_2 \log (1000) $$. 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.3 (Recovering technology parameters from regression coefficients) 14 | 15 | Show that the technology parameters ($$\mu$$, $$\alpha_1$$, $$\alpha_2$$, $$\alpha_3$$) can be determined uniquely from the first four equations in (1.7.5), 16 | 17 | $$ 18 | \begin{align} 19 | \beta_1 & = \mu, \tag{1.7.5a} \\ 20 | \beta_2 & = \frac{1}{r}, \tag{1.7.5b} \\ 21 | \beta_3 & = \frac{ \alpha_1 }{ r }, \tag{1.7.5c} \\ 22 | \beta_4 & = \frac{ \alpha_2 }{ r }, \tag{1.7.5d} 23 | \end{align} 24 | $$ 25 | 26 | and the definition $$r \equiv \alpha_1 + \alpha_2 + \alpha_3$$. (Do not use the fifth equation $$ \beta_5 = \alpha_3 / r $$.) 27 | 28 | ##### Solution 29 | 30 | From (1.7.5a) we have 31 | 32 | $$ 33 | \mu = \beta_1. 34 | \tag{1} 35 | $$ 36 | 37 | Substituting (1.7.5b) into (1.7.5c) we have 38 | 39 | $$ 40 | \alpha_1 = \frac{ \beta_3 }{ \beta_2 }. 41 | \tag{2} 42 | $$ 43 | 44 | Similarly, substituting (1.7.5b) into (1.7.5d) we have 45 | 46 | $$ 47 | \alpha_2 = \frac{ \beta_4 }{ \beta_2 }. 
48 | \tag{3} 49 | $$ 50 | 51 | Finally, using the definition of $$r$$ and (1.7.5b), 52 | 53 | $$ 54 | \alpha_1 + \alpha_2 + \alpha_3 = \frac{1}{\beta_2}, 55 | \tag{4} 56 | $$ 57 | 58 | substituting (2) and (3) into (4) and rearrange terms, 59 | 60 | $$ 61 | \alpha_3 = \frac{1}{\beta_2} - \frac{\beta_3}{\beta_2} - 62 | \frac{\beta_4}{\beta_2} = 63 | \frac{ 1 - \beta_3 - \beta_4 }{ \beta_2 }. 64 | \tag{5} 65 | $$ 66 | 67 | --- 68 | 69 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Oct 31, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.1 14 | 15 | Prove that $$ \boldsymbol{\Gamma}_{-j} = \boldsymbol{\Gamma}_j' $$. 16 | 17 | **Hint**: $$ \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} ) = \operatorname*{E} [ ( \mathbf{z}_i - \boldsymbol{\mu} ) ( \mathbf{z}_{i-j} - \boldsymbol{\mu} )' ] $$ where $$ \boldsymbol{\mu} = \operatorname*{E} ( \mathbf{z}_i ) $$. By covariance-stationarity, $$ \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} ) = \operatorname*{Cov} ( \mathbf{z}_{i+j}, \mathbf{z}_i ) $$. 18 | 19 | ##### Solution 20 | 21 | $$ 22 | \begin{align*} 23 | 24 | \boldsymbol{\Gamma}_{-j} 25 | &= \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i+j} ) 26 | && \text{(definition of $\boldsymbol{\Gamma}$)} 27 | \\ 28 | 29 | &= \operatorname*{E} [ ( \mathbf{z}_i - \boldsymbol{\mu} )( \mathbf{z}_{i+j} - \boldsymbol{\mu} )' ] 30 | && \text{(definition of $\operatorname*{Cov}$)} 31 | \\ 32 | 33 | &= \operatorname*{E} [ ( \mathbf{z}_{i-j} - \boldsymbol{\mu} )( \mathbf{z}_{i} - \boldsymbol{\mu} )' ] 34 | && \text{(covariance-stationarity)} 35 | \\ 36 | 37 | &= \operatorname*{E} [ ( \mathbf{z}_{i} - \boldsymbol{\mu} )( \mathbf{z}_{i-j} - \boldsymbol{\mu} )' ]' 38 | && \text{(transpose)} 39 | \\ 40 | 41 | &= \operatorname*{Cov} ( \mathbf{z}_i, \mathbf{z}_{i-j} )' 42 | && \text{(definition of $\operatorname*{Cov}$)} 43 | \\ 44 | 45 | &= \boldsymbol{\Gamma}_j'. 46 | && \text{(definition of $\boldsymbol{\Gamma}$)} 47 | \end{align*} 48 | $$ 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 16, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.1 (Usual convergence vs. convergence in probability) 14 | 15 | A sequence of real numbers is a trivial example of a sequence of random variables. Is it true that 16 | 17 | $$ 18 | \lim_{n \to \infty} z_n = \alpha 19 | \implies 20 | \operatorname*{plim}_{n \to \infty} z_n = \alpha ? 21 | $$ 22 | 23 | **Hint**: Look at the definition of $$ \operatorname{plim} $$. Since $$ \lim_{n \to \infty} z_n = \alpha $$, $$ | z_n - \alpha | < \varepsilon $$ for $$ n $$ sufficiently large. 24 | 25 | ##### Solution 26 | 27 | By definition, $$ \lim_{n \to \infty} z_n = \alpha $$ means, for any $$ \varepsilon > 0 $$, for $$n$$ sufficiently large, 28 | 29 | $$ 30 | | z_n - \alpha | < \varepsilon. 
31 | \tag{1} 32 | $$ 33 | 34 | Considering $$ z_n $$ as a trivial random variable, (1) is equivalent to 35 | 36 | $$ 37 | \begin{gather} 38 | \operatorname{Prob} (| z_n - \alpha | < \varepsilon) = 1, \\ 39 | \operatorname{Prob} (| z_n - \alpha | > \varepsilon) = 0, \\ 40 | \lim_{n \to \infty} \operatorname{Prob} (| z_n - \alpha | > \varepsilon) = 0. 41 | \tag{2} 42 | \end{gather} 43 | $$ 44 | 45 | Then $$ \operatorname*{plim}_{n \to \infty} z_n = \alpha $$ by definition. 46 | 47 | ##### Appendix 48 | 49 | A sequence of random scalars $$ {z_n} $$ converges in probability to a constant $$ \alpha $$ if, for any $$ \varepsilon > 0 $$, 50 | 51 | $$ 52 | \lim_{n \to \infty} \operatorname{Prob} ( | z_n - \alpha | > \varepsilon ) = 0. 53 | $$ 54 | 55 | --- 56 | 57 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.3 14 | 15 | For the formula 16 | 17 | $$ 18 | F \equiv 19 | \frac{ ( \mathbf{R} \mathbf{b} - \mathbf{r} )' [ \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' ]^{-1} ( \mathbf{R} \mathbf{b} - \mathbf{r} ) / \# \mathbf{r} } 20 | { s^2 } 21 | \tag{1.4.9} 22 | $$ 23 | 24 | to be well-defined, the matrix $$ \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' $$ must be nonsingular. Prove the stronger result that the matrix is positive definite. 25 | 26 | ##### Solution 27 | 28 | We need to show that 29 | 30 | $$ 31 | \mathbf{z}' \mathbf{R} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{R}' \mathbf{z} > 0, 32 | \tag{1} 33 | $$ 34 | 35 | for any nonzero vector $$ \mathbf{z} $$. 36 | 37 | Since $$ \mathbf{R} $$ is of full row rank, for any nonzero $$ \mathbf{z} $$, $$ \mathbf{R}' \mathbf{z} $$ is also nonzero. So equivalently, what we need to show becomes 38 | 39 | $$ 40 | \mathbf{c}' ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{c} > 0, 41 | \tag{2} 42 | $$ 43 | 44 | where $$ \mathbf{c} = \mathbf{R}' \mathbf{z} $$ is nonzero. 45 | 46 | This is equivalent to proving $$ ( \mathbf{X}' \mathbf{X} )^{-1} $$ is positive definite, which is indeed true because in [review question 1.2.1](1.2.1.md), it is already shown that $$ \mathbf{X}' \mathbf{X} $$ is positive definite. (A matrix is positive definite if and only if all its eigenvalues are positive. The eigenvalues of $$ \mathbf{A}^{-1} $$ are the reciprocals of the eigenvalues of $$ \mathbf{A} $$.) 47 | 48 | --- 49 | 50 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.6.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 
12 | 13 | #### Review Question 1.1.6 (An exercise in conditional and unconditional expectations) 14 | 15 | Show that Assumptions 1.2 and 1.4 imply 16 | 17 | $$ 18 | \begin{align} 19 | \mathrm{Var} ( \varepsilon_i ) & = \sigma^2 20 | && 21 | (i = 1, 2, \ldots, n) 22 | \\ 23 | \text{and} \qquad 24 | \mathrm{Cov} ( \varepsilon_i, \varepsilon_j ) & = 0 25 | && 26 | (i \neq j; i, j = 1, 2, \ldots, n). \tag{$\ast$} 27 | \end{align} 28 | $$ 29 | 30 | ##### Solution 31 | 32 | Strict exogeneity implies $$ \mathrm{E} (\varepsilon_i) = 0 $$. So $$ (\ast) $$ is equivalent to 33 | 34 | $$ 35 | \begin{align} 36 | \mathrm{E} ( \varepsilon_i^2 ) & = \sigma^2 37 | && 38 | (i = 1, 2, \ldots, n) 39 | \\ 40 | \text{and} \qquad 41 | \mathrm{E} ( \varepsilon_i \varepsilon_j ) & = 0 42 | && 43 | (i \neq j; i, j = 1, 2, \ldots, n). 44 | \end{align} 45 | $$ 46 | 47 | (1) For $$i = 1, 2, \ldots, n$$, 48 | 49 | $$ 50 | \begin{align} 51 | \mathrm{E} (\varepsilon_i^2) 52 | & = 53 | \mathrm{E} [\mathrm{E} (\varepsilon_i^2 \mid \mathbf{X})] 54 | && 55 | \text{(law of total expectations)} 56 | \\ & = 57 | \sigma^2. 58 | && 59 | \text{(Assumption 1.4)} 60 | \end{align} 61 | $$ 62 | 63 | (2) For $$i \neq j; i, j = 1, 2, \ldots, n$$, 64 | 65 | $$ 66 | \begin{align} 67 | \mathrm{E} (\varepsilon_i \varepsilon_j) 68 | & = 69 | \mathrm{E} [ \mathrm{E} (\varepsilon_i \varepsilon_j \mid \mathbf{X}) ] 70 | && 71 | \text{(law of total expectations)} 72 | \\ & = 0. 73 | && 74 | \text{(Assumption 1.4)} 75 | \end{align} 76 | $$ 77 | 78 | --- 79 | 80 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at Jun 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.3 (Deviation-from-the-mean regression) 12 | 13 | Consider a regression model with a constant. Let $$ \mathbf{X} $$ be partitioned as 14 | 15 | $$ 16 | \underset{ n \times K }{ \mathbf{X} } \equiv 17 | \left[ 18 | \underset{ n \times 1 }{ \mathbf{1} } \; 19 | \vdots \; 20 | \underset{ n \times ( K - 1) }{ \mathbf{X}_2 } 21 | \right] 22 | $$ 23 | 24 | so the first regressor is a constant. Partition $$ \boldsymbol{\beta} $$ and $$ \mathbf{b} $$ accordingly: 25 | 26 | $$ 27 | \boldsymbol{\beta} \equiv 28 | \begin{bmatrix} 29 | \beta_1 \\ \boldsymbol{\beta}_2 30 | \end{bmatrix}, 31 | \quad 32 | \mathbf{b} \equiv 33 | \begin{bmatrix} 34 | b_1 \\ \mathbf{b}_2 35 | \end{bmatrix}. 36 | $$ 37 | 38 | Also let $$ \widetilde{ \mathbf{X} }_2 \equiv \mathbf{M}_1 \mathbf{X}_2 $$ and $$ \widetilde{ \mathbf{y} } \equiv \mathbf{M}_1 \mathbf{y} $$ (where $$ \mathbf{M}_1 $$ is defined in [Analytical Exercise 1.2](1.2.md)). They are deviations from the mean for the nonconstant regressors and the dependent variable. Prove the following: 39 | 40 | (a) The $$ K $$ normal equations are 41 | 42 | $$ 43 | \bar{y} = b_1 + \bar{ \mathbf{x} }'_2 \mathbf{b}_2 44 | $$ 45 | 46 | where $$ \bar{ \mathbf{x} }_2 \equiv \mathbf{X}'_2 \mathbf{1} /n $$ and 47 | 48 | $$ 49 | \mathbf{X}'_2 \mathbf{y} = n \cdot b_1 \cdot \bar{ \mathbf{x} }_2 + \mathbf{X}'_2 \mathbf{X}_2 \mathbf{b}_2. 50 | $$ 51 | 52 | (b) $$ \mathbf{b}_2 = ( \widetilde{ \mathbf{X} }'_2 \widetilde{ \mathbf{X} }_2 )^{-1} \widetilde{ \mathbf{X} }'_2 \widetilde{ \mathbf{y} } $$. 
53 | 54 | ##### Solution 55 | 56 | The solution is a special case of the solution to [Analytical Exercise 1.4](1.4.md). 57 | 58 | --- 59 | 60 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.3 14 | 15 | Prove Lemma 2.4(c) 16 | 17 | $$ 18 | \mathbf{x}_n \to_d \mathbf{x} 19 | \text{, } 20 | \mathbf{A}_n \to_p \mathbf{A} 21 | \implies 22 | \mathbf{A}_n \mathbf{x}_n \to_d \mathbf{A} \mathbf{x} 23 | $$ 24 | 25 | from Lemma 2.4(a) 26 | 27 | $$ 28 | \mathbf{x}_n \to_d \mathbf{x} 29 | \text{, } 30 | \mathbf{y}_n \to_p \boldsymbol\alpha 31 | \implies 32 | \mathbf{x}_n + \mathbf{y}_n \to_d 33 | \mathbf{x} + \boldsymbol\alpha 34 | $$ 35 | 36 | and Lemma 2.4(b) 37 | 38 | $$ 39 | \mathbf{x}_n \to_d \mathbf{x} 40 | \text{, } 41 | \mathbf{y}_n \to_p \mathbf{0} 42 | \implies 43 | \mathbf{y}_n' \mathbf{x}_n \to_p \mathbf{0}. 44 | $$ 45 | 46 | **Hint**: $$ \mathbf{A}_n \mathbf{x}_n = ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n + \mathbf{A} \mathbf{x}_n $$. By (b), $$ ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n \to_p \mathbf{0} $$. 47 | 48 | ##### Solution 49 | 50 | Using the add-and-subtract strategy, 51 | 52 | $$ 53 | \mathbf{A}_n \mathbf{x}_n = 54 | ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n + 55 | \mathbf{A} \mathbf{x}_n. 56 | \tag{1} 57 | $$ 58 | 59 | Because $$ \mathbf{A}_n \to_p \mathbf{A} $$, we have $$ \mathbf{A}_n - \mathbf{A} \to_p \mathbf{0} $$. Using $$ \mathbf{x}_n \to_d \mathbf{x} $$ and Lemma 2.4(b), 60 | 61 | $$ 62 | ( \mathbf{A}_n - \mathbf{A} ) \mathbf{x}_n \to_p \mathbf{0}. 63 | \tag{2} 64 | $$ 65 | 66 | Because $$ \mathbf{x}_n \to_d \mathbf{x} $$, by Lemma 2.3(b), 67 | 68 | $$ 69 | \mathbf{A} \mathbf{x}_n \to_d \mathbf{A} \mathbf{x}. 70 | \tag{3} 71 | $$ 72 | 73 | Combining (2) and (3) into (1), using Lemma 2.4(a), 74 | 75 | $$ 76 | \mathbf{A}_n \mathbf{x}_n \to_d \mathbf{A} \mathbf{x}. 77 | $$ 78 | 79 | --- 80 | 81 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.2 (Generalized $$ SSR $$) 14 | 15 | Show that $$ \hat{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } $$ minimizes $$ ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' \mathbf{V}^{-1} ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) $$. 16 | 17 | ##### Solution 18 | 19 | Note that for symmetric $$ \mathbf{V} $$, its inverse $$ \mathbf{V}^{-1} $$ is also symmetric. 
20 | 21 | The objective function is 22 | 23 | $$ 24 | \begin{align} 25 | & ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' \mathbf{V}^{-1} ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 26 | \\ = & 27 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' ( \mathbf{V}^{-1} \mathbf{y} - \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 28 | \\ = & 29 | \mathbf{y}' \mathbf{V}^{-1} \mathbf{y} - 2 \cdot \mathbf{y}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } + \tilde{ \boldsymbol{ \beta } }' \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } }. 30 | \end{align} 31 | $$ 32 | 33 | Taking derivative with respect to $$ \tilde{ \boldsymbol{ \beta } } $$ leads to the first-order condition 34 | 35 | $$ 36 | - 2 \cdot \mathbf{X}' \mathbf{V}^{-1} \mathbf{y} + 2 \cdot \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } = \mathbf{0}, 37 | $$ 38 | 39 | and it reduces to 40 | 41 | $$ 42 | \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} \tilde{ \boldsymbol{ \beta } } = \mathbf{X}' \mathbf{V}^{-1} \mathbf{y}, 43 | $$ 44 | 45 | which solves that 46 | 47 | $$ 48 | \tilde{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } = ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V}^{-1} \mathbf{y}. 49 | $$ 50 | 51 | --- 52 | 53 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Oct 31, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.2 (Forecasting white noise) 14 | 15 | For the white noise process of Example 2.4, $$ \operatorname*{E} (z_i) = 0 $$. What is $$ \operatorname*{E} (z_i | z_1) $$ for $$ i \ge 2 $$? **Hint**: You should be able to forecast the future exactly if you know the value of $$z_1$$. Is the process an m.d.s? [Answer: No.] 16 | 17 | ##### Solution 18 | 19 | According to the setup of Example 2.4, 20 | 21 | $$ 22 | \begin{align*} 23 | \operatorname*{E} (z_i | z_1) 24 | &= \operatorname*{E} ( \operatorname*{cos} (iw) | \operatorname*{cos} (w) ) 25 | && \text{(definition of $\{z_i\}$)} 26 | \\ 27 | 28 | &= \operatorname*{E} ( \operatorname*{cos} (iw) | w ) 29 | && \text{(same information for $ w \in (0, 2 \pi)$)} 30 | \\ 31 | 32 | &= \operatorname*{cos} (iw), 33 | \end{align*} 34 | $$ 35 | 36 | $$ 37 | \begin{align*} 38 | \operatorname*{Var} (z_i | z_1) 39 | &= \operatorname*{E} (z_i^2 | z_1) - \operatorname*{E} (z_i | z_1)^2 40 | && \text{(formula of $\operatorname*{Var} (\cdot)$)} 41 | \\ 42 | 43 | &= \operatorname*{E} ( \operatorname*{cos} (iw)^2 | \operatorname*{cos} (w) ) - \operatorname*{cos} (iw)^2 44 | && \text{(definition of $\{ z_i \}$)} 45 | \\ 46 | 47 | & 48 | = \operatorname*{E} ( \operatorname*{cos} (iw)^2 | w ) - \operatorname*{cos} (iw)^2 49 | && \text{(same information for $ w \in (0, 2 \pi)$)} 50 | \\ 51 | 52 | &= \operatorname*{cos} (iw)^2 - \operatorname*{cos} (iw)^2 53 | \\ 54 | 55 | &= 0. 56 | \end{align*} 57 | $$ 58 | 59 | This means $$ z_i (i \ge 2) $$ can be forecasted exactly conditional on $$ z_1 $$. 60 | 61 | The process of $$ \{ z_i \} $$ is not an m.d.s, because $$ \operatorname*{E} (z_i | z_{i-1}, \ldots, z_1) \neq 0 $$. 
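A small simulation makes this concrete (a sketch assuming `numpy`; the number of draws is arbitrary). Across draws of $$ w $$ the process looks exactly like white noise, yet the identity $$ \cos(2w) = 2\cos^2(w) - 1 $$ means $$ z_2 = 2 z_1^2 - 1 $$ holds path by path, so $$ \operatorname*{E} (z_2 | z_1) = 2 z_1^2 - 1 \neq 0 $$:

```python
import numpy as np

rng = np.random.default_rng(0)
R, T = 100_000, 5                            # draws of w, forecast horizon
w = rng.uniform(0.0, 2.0 * np.pi, size=R)
i = np.arange(1, T + 1)
z = np.cos(np.outer(w, i))                   # z[r, i-1] = cos(i * w_r)

print(np.round(z.mean(axis=0), 2))           # ~0 for every i: zero mean
print(np.round(np.cov(z, rowvar=False), 2))  # ~0.5 on the diagonal, ~0 off it: white noise

# Perfect forecastability: z_2 is an exact function of z_1 on every path
print(np.allclose(z[:, 1], 2.0 * z[:, 0] ** 2 - 1.0))  # True
```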
62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.4 (Recovering left-out coefficients from restricted OLS) 14 | 15 | Calculate the restricted OLS estimate of $$\beta_5$$ from 16 | 17 | $$ 18 | \log \left( \frac{ TC_i }{ p_{i3} } \right) = 19 | - \underset{ (0.88) }{ 4.7 } + 20 | \underset{ (0.017) }{ 0.72 } \log ( Q_i ) + 21 | \underset{ (0.20) }{ 0.59 } \log \left( 22 | \frac{ p_{i1} }{ p_{i3} } 23 | \right) - 24 | \underset{ (0.19) }{ 0.007 } \log \left( 25 | \frac{ p_{i2} }{ p_{i3} } 26 | \right). 27 | \tag{1.7.8} 28 | $$ 29 | 30 | How do you calculate the standard error of $$ b_5 $$ from the printout of the restricted OLS? 31 | 32 | ##### Solution 33 | 34 | Because of the restriction 35 | 36 | $$ 37 | \beta_3 + \beta_4 + \beta_5 = 1, 38 | $$ 39 | 40 | the restricted OLS estimate of $$ \beta_5 $$ is 41 | 42 | $$ 43 | b_5 = 1 - b_3 - b_4 = 1 - 0.59 - (- 0.007) = 0.417. 44 | $$ 45 | 46 | We can write 47 | 48 | $$ 49 | b_5 = 1 + \mathbf{c}' \mathbf{b}, 50 | $$ 51 | 52 | where 53 | 54 | $$ 55 | \mathbf{c} \equiv 56 | \begin{bmatrix} 57 | 0 \\ 0 \\ -1 \\ -1 58 | \end{bmatrix}, 59 | \qquad 60 | \mathbf{b} \equiv 61 | \begin{bmatrix} 62 | b_1 \\ b_2 \\ b_3 \\ b_4 63 | \end{bmatrix}, 64 | $$ 65 | 66 | then 67 | 68 | $$ 69 | \mathrm{Var} ( b_5 \mid \mathbf{X} ) = 70 | \mathrm{Var} ( 1 + \mathbf{c}' \mathbf{b} | \mathbf{X}) = 71 | \mathbf{c}' \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) \mathbf{c}. 72 | $$ 73 | 74 | From the printout of the restricted OLS regression, we have the estimate $$ \widehat{ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) } $$, then we can calculate the standard error of $$b_5$$ as 75 | 76 | $$ 77 | \mathrm{SE} ( b_5 ) = 78 | \sqrt{ \mathbf{c}' \widehat{ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) } \mathbf{c} }. 79 | $$ 80 | 81 | --- 82 | 83 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /lecture-note/1.2.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at March 26, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Residuals are NOT the error terms 14 | 15 | - $$ \boldsymbol{\beta} $$ is unknown 16 | - overfitting 17 | 18 | #### Basic Algebraic Problems 19 | 20 | ##### Existence (does there exist a solution?) 21 | 22 | - For $$ y = a x^2 + b x + c $$, there exists a solution only if $$ b^2 \geq 4 a c $$. 23 | - For $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, there exists a solution only if $$ \mathbf{y} $$ lies in the **column space** of $$ \mathbf{A} $$. 24 | 25 | ##### Uniqueness (is the solution unique?) 26 | 27 | - For $$ y = a x^2 + b x + c $$, there exists a solution and the solution is unique only if $$ b^2 = 4 a c $$. 28 | - For $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, there exists a solution and the solution is unique only if $$ \mathbf{y} $$ lies in the **column space** of $$ \mathbf{A} $$ and $$ \mathbf{A} $$ has **full column rank**. 
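A tiny numerical illustration of the two matrix bullets above (a sketch assuming `numpy`; the matrices are arbitrary): with full column rank a solution of $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$, if it exists, is unique, while a rank-deficient matrix admits many solutions whenever $$ \mathbf{y} $$ lies in its column space.

```python
import numpy as np

# Full column rank (here A is also square, hence invertible): the solution is unique.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([1.0, 1.0])
print(np.linalg.matrix_rank(A))   # 2, full column rank
print(np.linalg.solve(A, y))      # [-1.  1.], the unique solution

# Rank deficient: the second column is twice the first, so rank(B) = 1 < 2.
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
z = np.array([1.0, 2.0])          # z lies in the column space of B, so solutions exist
x1 = np.array([1.0, 0.0])
x2 = np.array([-1.0, 1.0])
print(B @ x1, B @ x2)             # both equal [1. 2.]: the solution is not unique
```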
29 | 30 | ##### Analytical Solution (is the solution in closed-form?) 31 | 32 | - If $$ y = a x^2 + b x + c $$ and $$ b^2 \geq 4 a c $$, then the solutions are $$ x = (-b \pm \sqrt{b^2 - 4ac}) /(2a) $$. 33 | 34 | - If $$ \mathbf{y} = \mathbf{A} \mathbf{x} $$ and $$ \mathbf{A} $$ is (square and) invertible, then the unique solution is $$ \mathbf{x} = \mathbf{A}^{-1} \mathbf{y} $$. 35 | 36 | #### Vector Differentiation 37 | 38 | - For the real-valued function $$ y = f( \mathbf{x} ) = \mathbf{a}' \mathbf{x} $$ (inner product form),
$$ 40 | \frac{df(\mathbf{x})}{d\mathbf{x}} = \mathbf{a}. 41 | $$ 42 | 43 | - For the real-valued function $$ y = f( \mathbf{x} ) = \mathbf{x}' \mathbf{A} \mathbf{x} $$ (quadratic form),
$$ 45 | \frac{df(\mathbf{x})}{d\mathbf{x}} = ( \mathbf{A} + \mathbf{A}' ) \mathbf{x}, \quad \text{which equals } 2 \mathbf{A} \mathbf{x} \text{ when } \mathbf{A} \text{ is symmetric}. 46 | $$ 47 | 48 | #### Matrix Multiplication 49 | 50 | There are equivalently _four_ ways of [matrix multiplication](../supplements/matrix-multiplication.md), each of which is important. 51 | 52 | --- 53 | 54 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.7.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.7 14 | 15 | A more realistic assumption about the rental price of capital may be that there is an economy-wide capital market so $$p_{i2}$$ is the same across firms. In this case, 16 | 17 | (a) Can we estimate the technology parameters? 18 | 19 | (b) Can we test homogeneity of the cost function in factor prices? 20 | 21 | ##### Solution 22 | 23 | (a) When $$p_{i2}$$ is the same across firms, there is a perfect multicollinearity problem, so that Assumption 1.3 fails. It is then not possible to estimate the 5 parameters $$ ( \beta_1, \ldots, \beta_5 ) $$ _simultaneously_ from the unrestricted regression (1.7.7). 24 | 25 | But recall that $$ ( \beta_1, \ldots, \beta_5 ) $$ are not 5 freely varying parameters; they satisfy 26 | 27 | $$ 28 | \beta_3 + \beta_4 + \beta_5 = 1. 29 | \tag{1} 30 | $$ 31 | 32 | It is therefore safe to disregard $$ \beta_4 $$ and estimate $$ ( \beta_1, \beta_2, \beta_3, \beta_5 ) $$ from the restricted OLS regression 33 | 34 | $$ 35 | \log \left( \frac{ TC_i }{ p_{i2} } \right) = \beta_1 + 36 | \beta_2 \log ( Q_i ) + 37 | \beta_3 \log \left( \frac{ p_{i1} }{ p_{i2} } \right) + 38 | \beta_5 \log \left( \frac{ p_{i3} }{ p_{i2} } \right) + 39 | \varepsilon_i. 40 | \tag{2} 41 | $$ 42 | 43 | Then the estimate of $$\beta_4$$ can be calculated from (1). 44 | 45 | So the answer is _yes_. Even though there is a perfect multicollinearity problem when $$p_{i2}$$ is the same across firms, $$ ( \beta_1, \ldots, \beta_5 ) $$ _can_ be estimated. 46 | 47 | (b) No, because when the price of capital is constant across firms we are forced to use the adding-up restriction $$ \beta_3 + \beta_4 + \beta_5 = 1 $$ to calculate $$ \beta_4 $$ (capital's contribution) from the OLS estimates of $$\beta_3$$ and $$\beta_5$$. After all, we cannot test a restriction that cannot be relaxed.
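To make (a) concrete, here is a simulation sketch (assuming `numpy`; the parameter values and the common capital price are made up, with $$ \beta_3 + \beta_4 + \beta_5 = 1 $$ imposed in the data-generating process). Even though $$ p_{i2} $$ is identical across firms, the restricted regression (2) recovers $$ ( \beta_1, \beta_2, \beta_3, \beta_5 ) $$, and $$ \beta_4 $$ is then backed out from (1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
b1, b2, b3, b4, b5 = -4.7, 0.72, 0.45, 0.25, 0.30   # true values, b3 + b4 + b5 = 1
log_Q = rng.normal(size=n)
log_p1 = rng.normal(size=n)
log_p3 = rng.normal(size=n)
log_p2 = np.full(n, 0.7)            # rental price of capital, identical across firms
eps = 0.1 * rng.normal(size=n)
log_TC = b1 + b2 * log_Q + b3 * log_p1 + b4 * log_p2 + b5 * log_p3 + eps

# Restricted regression (2): deflate total cost and the remaining prices by p_{i2}
Y = log_TC - log_p2
X = np.column_stack([np.ones(n), log_Q, log_p1 - log_p2, log_p3 - log_p2])
b1_hat, b2_hat, b3_hat, b5_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
b4_hat = 1.0 - b3_hat - b5_hat      # recovered from the adding-up restriction (1)
print(np.round([b1_hat, b2_hat, b3_hat, b4_hat, b5_hat], 2))  # close to the true values
```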
48 | 49 | --- 50 | 51 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/var-cov-matrix.md: -------------------------------------------------------------------------------- 1 | # Variance-Covariance Matrix of Random Vectors 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | The **variance-covariance matrix** of a random vector $$ \mathbf{x} $$ is defined as 8 | 9 | $$ 10 | \begin{align} 11 | \mathrm{Var} ( \mathbf{x} ) & \equiv 12 | \mathrm{E} [ ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) ) 13 | ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) )' ] 14 | \qquad \text{(the definition)} 15 | \\ & = 16 | \mathrm{E} [ \mathbf{x} \mathbf{x}' - 17 | \mathbf{x} \mathrm{E} ( \mathbf{x} )' - 18 | \mathrm{E} ( \mathbf{x} ) \mathbf{x}' + 19 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' ] 20 | \\ & = 21 | \mathrm{E} ( \mathbf{x} \mathbf{x}' ) - 22 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' - 23 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' + 24 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )' 25 | \\ & = 26 | \mathrm{E} ( \mathbf{x} \mathbf{x}' ) - 27 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{x} )'. 28 | \qquad \text{(the formula)} 29 | \end{align} 30 | $$ 31 | 32 | The last equation is the convenient formula for calculating the variance. 33 | 34 | The **covariance matrix** between two random vectors $$ \mathbf{x} $$ and $$ \mathbf{y} $$ is defined as 35 | 36 | $$ 37 | \begin{align} 38 | \mathrm{Cov} ( \mathbf{x}, \mathbf{y} ) & \equiv 39 | \mathrm{E} [ ( \mathbf{x} - \mathrm{E} ( \mathbf{x} ) ) 40 | ( \mathbf{y} - \mathrm{E} ( \mathbf{y} ) )' ] 41 | \qquad \text{(the definition)} 42 | \\ & = 43 | \mathrm{E} [ \mathbf{x} \mathbf{y}' - 44 | \mathbf{x} \mathrm{E} ( \mathbf{y} )' - 45 | \mathrm{E} ( \mathbf{x} ) \mathbf{y}' + 46 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' ] 47 | \\ & = 48 | \mathrm{E} ( \mathbf{x} \mathbf{y}' ) - 49 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' - 50 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' + 51 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )' 52 | \\ & = 53 | \mathrm{E} ( \mathbf{x} \mathbf{y}' ) - 54 | \mathrm{E} ( \mathbf{x} ) \mathrm{E} ( \mathbf{y} )'. 55 | \qquad \text{(the formula)} 56 | \end{align} 57 | $$ 58 | 59 | The last equation is the convenient formula for calculating the covariance. 60 | 61 | --- 62 | 63 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.1 (Proof that $$ \mathbf{b} $$ minimizes $$ SSR $$) 12 | 13 | Let $$ \mathbf{b} $$ be the OLS estimator of $$ \boldsymbol{ \beta } $$. Prove that, for any hypothetical estimator $$ \tilde{ \boldsymbol{ \beta } } $$ of $$ \boldsymbol{ \beta } $$, 14 | 15 | $$ 16 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 17 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 18 | \ge 19 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 20 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ).
21 | $$ 22 | 23 | ##### Solution 24 | 25 | $$ 26 | \begin{align} 27 | & 28 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 29 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) 30 | \\ = & 31 | [ ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) ]' 32 | [ ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) ] 33 | \\ = & 34 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 35 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + 36 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 37 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) + 38 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 39 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) 40 | \\ & + 41 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 42 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) 43 | \\ = & 44 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 45 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ) + 46 | ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } )' \mathbf{X}' 47 | \mathbf{X} ( \mathbf{b} - \tilde{ \boldsymbol{ \beta } } ) 48 | \qquad 49 | \text{ ( because $ \mathbf{X}' \mathbf{y} = \mathbf{X}' \mathbf{X} \mathbf{b} $ ) } 50 | \\ \ge & 51 | ( \mathbf{y} - \mathbf{X} \mathbf{b} )' 52 | ( \mathbf{y} - \mathbf{X} \mathbf{b} ). 53 | \qquad 54 | \text{ (because $ \mathbf{X}' \mathbf{X} $ is positive definite and $ \tilde{ \boldsymbol{ \beta } } $ could $= \mathbf{b}$) } 55 | \end{align} 56 | $$ 57 | 58 | --- 59 | 60 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.6.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 6 Generalized Least Squares (GLS) 10 | 11 | ... 12 | 13 | #### Review Question 1.6.3 14 | 15 | Derive the expression for $$ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) $$ for the generalized regression model. What is the relation of it to $$ \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{ \mathrm{GLS} } \mid \mathbf{X} ) $$? Verify that Proposition 1.7(c) (efficiency of GLS) implies 16 | 17 | $$ 18 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \ge ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 19 | $$ 20 | 21 | ##### Solution 22 | 23 | (a) For the generalized regression model, 24 | 25 | $$ 26 | \begin{align} 27 | \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) 28 | & = 29 | \mathrm{Var} ( \mathbf{b} - \boldsymbol{ \beta } \mid \mathbf{X} ) 30 | \\ & = 31 | \mathrm{Var} ( ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \boldsymbol{ \varepsilon } \mid \mathbf{X} ) 32 | \\ & = 33 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathrm{Var} ( \boldsymbol{ \varepsilon } \mid \mathbf{X} ) \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} 34 | \\ & = 35 | \sigma^2 \cdot 36 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1}. 37 | \tag{1} 38 | \end{align} 39 | $$ 40 | 41 | (b) According to the text, 42 | 43 | $$ 44 | \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} \mid \mathbf{X} ) = \sigma^2 \cdot ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 
45 | \tag{1.6.6} 46 | $$ 47 | 48 | Because $$ \mathbf{b} $$ is unbiased, following the Gauss-Markov theorem in Proposition 1.7(c), 49 | 50 | $$ 51 | \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) \ge 52 | \mathrm{Var} ( \hat{ \boldsymbol{ \beta } }_{\mathrm{GLS}} \mid \mathbf{X} ) 53 | \tag{2} 54 | $$ 55 | 56 | in the matrix sense. 57 | 58 | (c) Substituting (1) and (1.6.6) into (2), it is easy to derive that 59 | 60 | $$ 61 | ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{V} \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \ge ( \mathbf{X}' \mathbf{V}^{-1} \mathbf{X} )^{-1}. 62 | $$ 63 | 64 | --- 65 | 66 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.2.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Nov 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 2 Fundamental Concepts in Time-Series Analysis 10 | 11 | ... 12 | 13 | #### Review Question 2.2.3 (No anticipated changes in martingales) 14 | 15 | Suppose $$\{ x_i \}$$ is a martingale with respect to $$\{ \mathbf{z}_i \}$$. Show that $$ \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = x_{i-1} $$ and $$ \operatorname*{E} (x_{i+j+1} - x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = 0 $$ for $$ j = 0, 1, \ldots $$ 16 | 17 | **Hint**: Use the Law of Iterated Expectations. 18 | 19 | ##### Solution 20 | 21 | For $$j = 0, 1, \ldots ,$$ 22 | 23 | $$ 24 | \begin{align*} 25 | \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 26 | &= \operatorname*{E} [ \operatorname*{E} (x_{i+j} | \mathbf{z}_{i+j-1}, \mathbf{z}_{i+j-2}, \ldots, \mathbf{z}_1) | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 27 | \\ 28 | 29 | &= \operatorname*{E} [ x_{i+j-1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 30 | \\ 31 | 32 | &= \operatorname*{E} [ \operatorname*{E} (x_{i+j-1} | \mathbf{z}_{i+j-2}, \mathbf{z}_{i+j-3}, \ldots, \mathbf{z}_1) | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 33 | \\ 34 | 35 | &= \operatorname*{E} [ x_{i+j-2} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 36 | \\ 37 | 38 | &= \cdots 39 | \\ 40 | 41 | &= \operatorname*{E} [ x_{i} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1] 42 | \\ 43 | 44 | &= x_{i-1}. 45 | \end{align*} 46 | $$ 47 | 48 | Similarly, $$ \operatorname*{E} (x_{i+j+1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) = x_{i-1} $$. Combining results, 49 | 50 | $$ 51 | \begin{align*} 52 | \operatorname*{E} (x_{i+j+1} - x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 53 | &= \operatorname*{E} (x_{i+j+1} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) - \operatorname*{E} (x_{i+j} | \mathbf{z}_{i-1}, \mathbf{z}_{i-2}, \ldots, \mathbf{z}_1) 54 | \\ 55 | 56 | &= x_{i-1} - x_{i-1} 57 | \\ 58 | 59 | &= 0. 60 | \end{align*} 61 | $$ 62 | 63 | --- 64 | 65 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 
12 | 13 | #### Review Question 1.5.5 (Likelihood equations for classical regression model) 14 | 15 | We used the two-step procedure to derive the ML estimate for the classical regression model. An alternative way to find the ML estimator is to solve the first-order conditions 16 | 17 | $$ 18 | \begin{align} 19 | \frac{ \partial \log L( \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \boldsymbol{ \beta } } } 20 | & = 21 | \frac{1}{ \tilde{ \gamma } } \mathbf{X}' ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) = \mathbf{0}, 22 | \tag{1.5.13a} 23 | \\ 24 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \gamma } } 25 | & = 26 | -\frac{n}{ 2 \tilde{ \gamma } } + \frac{1}{ 2 \tilde{ \gamma }^2 } ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) = 0. 27 | \tag{1.5.13b} 28 | \end{align} 29 | $$ 30 | 31 | The first-order conditions for the log likelihood are called the **likelihood equations**. Verify that the ML estimator given in Proposition 1.5 solves the likelihood equations. 32 | 33 | ##### Solution 34 | 35 | Proposition 1.5 states that the ML estimator is 36 | 37 | $$ 38 | \begin{align} 39 | \hat{ \boldsymbol{ \beta } } & = \mathbf{b} = ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{y}, 40 | \tag{1} 41 | \\ 42 | \hat{ \gamma } & = \frac{ \mathbf{e}' \mathbf{e} }{n}. 43 | \tag{2} 44 | \end{align} 45 | $$ 46 | 47 | Substituting (1) and (2) for $$ \tilde{ \boldsymbol{ \beta } } $$ and $$ \tilde{ \gamma } $$ in (1.5.13a), we have 48 | 49 | $$ 50 | \frac{n}{ \mathbf{e}' \mathbf{e} } \mathbf{X}' \mathbf{e} = 51 | \mathbf{0}. 52 | \tag{3} 53 | $$ 54 | 55 | Equation (3) holds because $$ \mathbf{X}' \mathbf{e} = \mathbf{0} $$, as stated in (1.2.3'). 56 | 57 | Substituting (1) and (2) for $$ \tilde{ \boldsymbol{ \beta } } $$ and $$ \tilde{ \gamma } $$ in (1.5.13b), we have 58 | 59 | $$ 60 | -\frac{ n^2 }{ 2 \cdot \mathbf{e}' \mathbf{e} } + 61 | \frac{ n^2 }{2 \cdot \mathbf{e}' \mathbf{e} \cdot \mathbf{e}' \mathbf{e}} \mathbf{e}' \mathbf{e} = 0. 62 | \tag{4} 63 | $$ 64 | 65 | Equation (4) holds by cancelling terms. 66 | 67 | --- 68 | 69 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.4.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 11, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 4 Hypothesis Testing under Normality 10 | 11 | ... 12 | 13 | #### Review Question 1.4.5 (Relation between $$F(1, n - K)$$ and $$t(n - K)$$) 14 | 15 | Look up the $$t$$ and $$F$$ distribution tables to verify that $$ F_\alpha(1, n - K) = ( t_{\alpha / 2} (n - K) )^2 $$ for degrees of freedom and significance levels of your choice. 16 | 17 | ##### Solution 18 | 19 | We can verify their equality with the following `Python` code.
20 | 21 | ```python 22 | In [1]: import numpy as np 23 | In [2]: from scipy.stats import t # the t-distribution 24 | In [3]: from scipy.stats import f # the F-distribution 25 | In [4]: sz = [0.05, 0.01, 0.005, 0.001] # sizes 26 | In [5]: df = [20, 40, 60, 80, 100, 200, 500, 1000] # degrees of freedom 27 | In [6]: SZ, DF = np.meshgrid(sz, df) # generate meshgrid table 28 | In [7]: f.isf(SZ, 1, DF) # isf = inverse survival function = 1 - cdf 29 | Out[7]: 30 | array([[ 4.3512435 , 8.09595806, 9.94393492, 14.81877555], 31 | [ 4.08474573, 7.31409993, 8.82785886, 12.60935783], 32 | [ 4.00119138, 7.07710579, 8.49461671, 11.97298729], 33 | [ 3.96035242, 6.96268806, 8.33460762, 11.67136163], 34 | [ 3.93614299, 6.89530103, 8.24064017, 11.49543133], 35 | [ 3.88837472, 6.76329947, 8.05715996, 11.15450054], 36 | [ 3.86012404, 6.68583329, 7.94984966, 10.95670343], 37 | [ 3.85077467, 6.66029481, 7.91453232, 10.89186556]]) 38 | In [8]: t.isf(SZ/2, DF)**2 # isf = inverse survival function = 1 - cdf 39 | Out[8]: 40 | array([[ 4.3512435 , 8.09595806, 9.94393492, 14.81877554], 41 | [ 4.0847457 , 7.31409993, 8.82785886, 12.60935783], 42 | [ 4.00119137, 7.07710581, 8.49461671, 11.97298729], 43 | [ 3.96035242, 6.96268805, 8.33460759, 11.67136163], 44 | [ 3.93614299, 6.89530103, 8.24064016, 11.49543139], 45 | [ 3.88837472, 6.76329947, 8.05715996, 11.15450054], 46 | [ 3.86012404, 6.68583329, 7.94984966, 10.95670343], 47 | [ 3.85077467, 6.66029481, 7.91453232, 10.89186556]]) 48 | ``` 49 | 50 | --- 51 | 52 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.3 (Concentrated log likelihood with respect to $$ \tilde{ \sigma }^2 $$) 14 | 15 | Writing $$ \tilde{ \sigma }^2 $$ as $$ \tilde{ \gamma } $$, the log likelihood function for the classical regression model is 16 | 17 | $$ 18 | \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \gamma } ) = 19 | - \frac{n}{2} \log ( 2 \pi ) - 20 | \frac{n}{2} \log ( \tilde{ \gamma } ) - 21 | \frac{1}{ 2 \tilde{ \gamma } } 22 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 23 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ). 24 | \tag{1} 25 | $$ 26 | 27 | In the two-step maximization procedure described in the text, we first maximized this function with respect to $$ \tilde{ \boldsymbol{ \beta } } $$. Instead, first maximize with respect to $$ \tilde{ \gamma } $$ given $$ \tilde{ \boldsymbol{ \beta } } $$. Show that the concentrated log likelihood (concentrated with respect to $$ \tilde{ \gamma } \equiv \tilde{ \sigma }^2 $$) is 28 | 29 | $$ 30 | - \frac{n}{2} [ 1 + \log ( 2 \pi ) ] - \frac{n}{2} \log \left( 31 | \frac{ ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 32 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{ n } 33 | \right). 
34 | \tag{2} 35 | $$ 36 | 37 | ##### Solution 38 | 39 | When maximizing (1) with respect to $$ \tilde{ \gamma } $$ given $$ \tilde{ \boldsymbol{ \beta } } $$, the first-order condition is 40 | 41 | $$ 42 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \gamma } ) }{ \partial \tilde{ \gamma } } = 43 | - \frac{n}{2} \frac{1}{ \tilde{ \gamma } } + \frac{ 44 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 45 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{2} 46 | \frac{1}{ \tilde{ \gamma }^2 } = 0, 47 | $$ 48 | 49 | which gives 50 | 51 | $$ 52 | \tilde{ \gamma } = \frac{ 53 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 54 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ) }{n}. 55 | \tag{3} 56 | $$ 57 | 58 | Substituting partial solution (3) into objective function (1), we get the concentrated log likelihood function (2). 59 | 60 | --- 61 | 62 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.3 (Combining linearity and strict exogeneity) 14 | 15 | Show that Assumptions 1.1 and 1.2 imply 16 | 17 | $$ 18 | \mathrm{E} ( y_i \mid \mathbf{X} ) = \mathbf{x}_i' \boldsymbol{\beta} 19 | \qquad 20 | \text{($i = 1, 2, \ldots, n$)} 21 | \tag{1.1.20} 22 | $$ 23 | 24 | Conversely, show that this assumption implies that there exist error terms that satisfy those two assumptions. 25 | 26 | ##### Solution 27 | 28 | (1) Firstly, we prove Assumption 1.1 and 1.2 imply (1.1.20). 29 | 30 | $$ 31 | \begin{align} 32 | \mathrm{E} ( y_i \mid \mathbf{X} ) 33 | & = \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} + \varepsilon_i \mid \mathbf{X} ) 34 | && 35 | \text{(Assumption 1.1)} 36 | \\ & = 37 | \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) + \mathrm{E} ( \varepsilon_i \mid \mathbf{X} ) 38 | && 39 | \text{(linearity of conditional expectatioins)} 40 | \\ & = 41 | \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 42 | && 43 | \text{(Assumption 1.2)} 44 | \\& = 45 | \mathbf{x}_i' \boldsymbol{\beta} 46 | && 47 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$, $\boldsymbol{\beta}$ is constant)} 48 | \end{align} 49 | $$ 50 | 51 | (2) Conversely, we prove (1.1.20) implies there exist error terms that satisfy Assumption 1.1 and 1.2. 52 | 53 | We _define_ the error term as 54 | 55 | $$ 56 | \varepsilon_i = y_i - \mathbf{x}_i' \boldsymbol{\beta}, 57 | $$ 58 | 59 | then Assumption 1.1 is satisfied. 
To prove assumption 1.2, notice that 60 | 61 | $$ 62 | \begin{align} 63 | \mathrm{E} (\varepsilon_i \mid \mathbf{X}) 64 | & = 65 | \mathrm{E} ( y_i - \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X}) 66 | && 67 | \text{(definition of $\varepsilon_i$)} 68 | \\ & = 69 | \mathrm{E} ( y_i \mid \mathbf{X} ) - \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 70 | && 71 | \text{(linearity of conditional expectations)} 72 | \\ & = 73 | \mathbf{x}_i' \boldsymbol{\beta} - \mathrm{E} ( \mathbf{x}_i' \boldsymbol{\beta} \mid \mathbf{X} ) 74 | && 75 | \text{(1.1.20)} 76 | \\ & = 77 | \mathbf{x}_i' \boldsymbol{\beta} - \mathbf{x}_i' \boldsymbol{\beta} 78 | && 79 | \text{($\mathbf{x}_i$ is known conditional on $\mathbf{X}$, $\boldsymbol{\beta}$ is constant)} 80 | \\ & = 0. 81 | \end{align} 82 | $$ 83 | 84 | --- 85 | 86 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.5 (Matrix algebra of fitted values and residuals) 14 | 15 | Show the following: 16 | 17 | (a) $$ \hat{\mathbf{y}} = \mathbf{P} \mathbf{y} $$, $$ \mathbf{e} = \mathbf{M} \mathbf{y} = \mathbf{M} \boldsymbol{\varepsilon} $$. 18 | 19 | (b) $$ \mathrm{SSR} = \boldsymbol{\varepsilon}' \mathbf{M} \boldsymbol{\varepsilon} $$. 20 | 21 | ##### Solution 22 | 23 | (a.1) $$ \hat{\mathbf{y}} = \mathbf{P} \mathbf{y} $$ because 24 | 25 | $$ 26 | \begin{align} 27 | \hat{\mathbf{y}} 28 | & = 29 | \mathbf{X} \mathbf{b} 30 | && 31 | \text{(definition of $\hat{\mathbf{y}}$)} 32 | \\ & = 33 | \mathbf{X} ( \mathbf{X}' \mathbf{X} )^{-1} \mathbf{X}' \mathbf{y} 34 | && 35 | \text{(definition of $\mathbf{b}$)} 36 | \\ & = 37 | \mathbf{P} \mathbf{y}. 
38 | && 39 | \text{(definition of $\mathbf{P}$)} 40 | \end{align} 41 | $$ 42 | 43 | (a.2) $$ \mathbf{e} = \mathbf{M} \mathbf{y} = \mathbf{M} \boldsymbol{\varepsilon} $$ because 44 | 45 | $$ 46 | \begin{align} 47 | \mathbf{e} & = \mathbf{y} - \mathbf{X} \mathbf{b} 48 | && 49 | \text{(definition of $\mathbf{e}$)} 50 | \\ & = 51 | \mathbf{y} - \hat{\mathbf{y}} 52 | && 53 | \text{(definition of $\hat{\mathbf{y}}$)} 54 | \\ & = 55 | \mathbf{y} - \mathbf{P} \mathbf{y} 56 | && 57 | (\hat{\mathbf{y}} = \mathbf{P} \mathbf{y}) 58 | \\ & = 59 | (\mathbf{I} - \mathbf{P}) \mathbf{y} 60 | \\ & = 61 | \mathbf{M} \mathbf{y} 62 | && 63 | \text{(definition of $\mathbf{M}$)} 64 | \\ & = 65 | \mathbf{M} ( \mathbf{X} \boldsymbol{\beta} + \boldsymbol{ \varepsilon } ) 66 | && 67 | \text{(Assumption 1.1)} 68 | \\ & = 69 | \mathbf{M} \mathbf{X} \boldsymbol{\beta} + 70 | \mathbf{M} \boldsymbol{\varepsilon} 71 | \\ & = 72 | \mathbf{M} \boldsymbol{\varepsilon} 73 | && 74 | (\mathbf{M} \mathbf{X} = \mathbf{0}) 75 | \end{align} 76 | $$ 77 | 78 | (b) $$ \mathrm{SSR} = \boldsymbol{\varepsilon}' \mathbf{M} \boldsymbol{\varepsilon} $$ because 79 | 80 | $$ 81 | \begin{align} 82 | \mathrm{SSR} & = \mathbf{e}' \mathbf{e} 83 | && 84 | \text{(definition of $\mathrm{SSR}$)} 85 | \\ & = 86 | (\mathbf{M} \boldsymbol{\varepsilon})' 87 | (\mathbf{M} \boldsymbol{\varepsilon}) 88 | && 89 | ( \mathbf{e} = \mathbf{M} \boldsymbol{\varepsilon} ) 90 | \\ & = 91 | \boldsymbol{\varepsilon}' \mathbf{M} \mathbf{M} 92 | \boldsymbol{\varepsilon} 93 | \\ & = 94 | \boldsymbol{\varepsilon}' \mathbf{M} 95 | \boldsymbol{\varepsilon}. 96 | && 97 | \text{($\mathbf{M}$ is idempotent)} 98 | \end{align} 99 | $$ 100 | 101 | --- 102 | 103 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.8.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.8 14 | 15 | Taking logs of both sides of the production function (1.7.1), one can derive the log-linear relationship 16 | 17 | $$ 18 | \log ( Q_i ) = \alpha_0 + \alpha_1 \log ( x_{i1} ) + 19 | \alpha_2 \log ( x_{i2} ) + \alpha_3 \log ( x_{i3} ) + 20 | \varepsilon_i, 21 | \tag{1} 22 | $$ 23 | 24 | where 25 | 26 | $$ 27 | \alpha_0 \equiv \mathrm{E} [ \log ( A_i ) ], 28 | \qquad 29 | \varepsilon_i \equiv \log ( A_i ) - \mathrm{E} [ \log ( A_i ) ]. 30 | $$ 31 | 32 | Suppose, in addition to total costs, output, and factor prices, we had data on factor inputs. (a) Can we estimate $$ \alpha $$'s by applying OLS to this log-linear relationship? Why or why not? (b) Suggest a different way to estimate $$\alpha$$'s. 33 | 34 | ##### Solution 35 | 36 | (a) The economic interpretation of the error term $$ \varepsilon_i $$ represents the firm's production efficiency relative to the industry's average efficiency. The input choice of the firm $$ ( x_{i1}, x_{i2}, x_{i3} ) $$ can depend on $$ \varepsilon_i $$, making regressors not be orthogonal to the error term. Applying OLS to log-linear relationship (1) will result in biased estimator, so we cannot estimate $$\alpha$$'s using OLS. 37 | 38 | (b) Following microeconomic theory, under the Cobb-Douglas technology, input shares do not depend on factor prices. 39 | 40 | This can be seen as follows. 
From equation (8) and (10) in [solution to review question 1.7.1](1.7.1.md), 41 | 42 | $$ 43 | \begin{align} 44 | \frac{ p_{i2} x_{i2} }{ p_{i1} x_{i1} } & = 45 | \frac{ \alpha_2 }{ \alpha_1 }, 46 | \tag{1} 47 | \\ 48 | \frac{ p_{i3} x_{i3} }{ p_{i1} x_{i1} } & = 49 | \frac{ \alpha_3 }{ \alpha_1 }. 50 | \tag{2} 51 | \end{align} 52 | $$ 53 | 54 | The input shares are calculated as 55 | 56 | $$ 57 | \frac{ p_{i1} x_{i1} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } = 58 | \frac{1}{ 1 + \alpha_2 / \alpha_1 + \alpha_3 / \alpha_1 } = 59 | \frac{ \alpha_1 }{ \alpha_1 + \alpha_2 + \alpha_3 }, 60 | \tag{3} 61 | $$ 62 | 63 | similarly, 64 | 65 | $$ 66 | \begin{align} 67 | \frac{ p_{i2} x_{i2} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } & = 68 | \frac{ \alpha_2 }{ \alpha_1 + \alpha_2 + \alpha_3 }, 69 | \tag{4} 70 | \\ 71 | \frac{ p_{i3} x_{i3} }{ p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} } & = 72 | \frac{ \alpha_3 }{ \alpha_1 + \alpha_2 + \alpha_3 }. 73 | \tag{5} 74 | \end{align} 75 | $$ 76 | 77 | It is evident that these input shares are determined completely by parameters and do not depend on factor prices. 78 | 79 | Under constant returns to scale, these shares equal to $$\alpha_1$$, $$\alpha_2$$ and $$\alpha_3$$ respectively. So we can estimate these parameters using sample averages of input shares. 80 | 81 | --- 82 | 83 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.1 (Use of regularity conditions) 14 | 15 | Assuming that taking expectations (i.e. taking integrals) and differentiation can be interchanged, prove that the expected value of the score vector, 16 | 17 | $$ 18 | \mathbf{s} ( \tilde{ \boldsymbol{ \theta } } ) \equiv 19 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) } 20 | { \partial \tilde{ \boldsymbol{ \theta } } }, 21 | \tag{1.5.9} 22 | $$ 23 | 24 | if evaluated at the true parameter value $$ \boldsymbol{\theta} $$, is zero. 25 | 26 | ##### Solution 27 | 28 | We start from the identity on pdf, 29 | 30 | $$ 31 | \int f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) \, d \mathbf{z} = 1. 32 | \tag{1} 33 | $$ 34 | 35 | Differentiate both sides of (1) with respect to $$ \tilde{ \boldsymbol{ \theta } } $$ and use the regularity condition, which allows us to interchange integration and differentiation, we obtain 36 | 37 | $$ 38 | \int \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 39 | { \partial \tilde{ \boldsymbol{ \theta } } } 40 | \, d \mathbf{z} = 0. 41 | \tag{2} 42 | $$ 43 | 44 | Dividing $$ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) $$ and multiplying $$ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) $$ on the integrand of (2), 45 | 46 | $$ 47 | \int \frac{1}{ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 48 | \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 49 | { \partial \tilde{ \boldsymbol{ \theta } } } \cdot 50 | f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) 51 | \, d \mathbf{z} = 0. 
52 | \tag{3} 53 | $$ 54 | 55 | From basic calculus, the score vector function (1.5.9) can be written as 56 | 57 | $$ 58 | \mathbf{s} ( \tilde{ \boldsymbol{ \theta } } ) = 59 | \frac{ \partial \log L ( \tilde{ \boldsymbol{ \theta } } ) } 60 | { \partial \tilde{ \boldsymbol{ \theta } } } = 61 | \frac{ \partial \log f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } = 62 | \frac{1}{ f( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 63 | \frac{ \partial f ( \mathbf{z}; \tilde{ \boldsymbol{ \theta } } ) } 64 | { \partial \tilde{ \boldsymbol{ \theta } } }. 65 | \tag{4} 66 | $$ 67 | 68 | Combining (3) and (4), evaluating at the true parameter value $$\boldsymbol{ \theta }$$, using the definition of expectation, 69 | 70 | $$ 71 | \int \frac{1}{ f( \mathbf{z}; \boldsymbol{ \theta } ) } 72 | \frac{ \partial f ( \mathbf{z}; \boldsymbol{ \theta } ) } 73 | { \partial \boldsymbol{ \theta } } \cdot 74 | f( \mathbf{z}; \boldsymbol{ \theta } ) 75 | \, d \mathbf{z} = 76 | \int \mathbf{s} ( \boldsymbol{ \theta } ) \cdot 77 | f( \mathbf{z}; \boldsymbol{ \theta } ) 78 | \, d \mathbf{z} = 79 | \mathrm{E} [ \mathbf{s} ( \boldsymbol{ \theta } ) ] 80 | = 0. 81 | $$ 82 | 83 | --- 84 | 85 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.5.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Sep 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 12 | 13 | #### Review Question 2.1.5 (Combine Delta method with Lindeberg-Levy) 14 | 15 | Let $$ \{ z_i \} $$ be a sequence of i.i.d. (independently and identically distributed) random variables with $$ \operatorname{E} ( z_i ) = \mu \neq 0 $$ and $$ \operatorname{Var} ( z_i ) = \sigma^2 $$, and let $$ \bar{z}_n $$ be the sample mean. Show that 16 | 17 | $$ 18 | \sqrt{n} 19 | \left( \frac{1}{\bar{z}_n} - \frac{1}{\mu} 20 | \right) 21 | \to_d N 22 | \left( 0, \frac{ \sigma^2 }{ \mu^4 } 23 | \right). 24 | $$ 25 | 26 | **Hint**: In Lemma 2.5, set $$ \boldsymbol{\beta} = \mu $$, $$ \mathbf{a} ( \boldsymbol{\beta} ) = 1 / \mu $$, $$ \mathbf{x}_n = \bar{z}_n $$. 27 | 28 | ##### Solution 29 | 30 | By Linderberg-Levy CLT, 31 | 32 | $$ 33 | \sqrt{n} ( \bar{z}_n - \mu ) \to_d N(0, \sigma^2). 
34 | $$ 35 | 36 | Set $$ \boldsymbol{\beta} = \mu $$, $$ \mathbf{a} ( \boldsymbol{\beta} ) = 1 / \mu $$, $$ \mathbf{x}_n = \bar{z}_n $$, 37 | 38 | $$ 39 | \mathbf{A} ( \boldsymbol{ \beta } ) = 40 | - \frac{ 1 }{ \mu^2 }, 41 | $$ 42 | 43 | using Lemma 2.5, 44 | 45 | $$ 46 | \begin{align} 47 | \sqrt{n} \left( \frac{ 1 }{ \bar{z}_n } - 48 | \frac{ 1 }{ \mu } \right) & = 49 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - 50 | \mathbf{a} ( \boldsymbol{ \beta } ) ] \\ 51 | & \to_d N \left( 0, \left( - \frac{1}{ \mu^2 } \right) \sigma^2 \left( - \frac{1}{ \mu^2 } \right) \right) \\ 52 | & = N \left( 0, \frac{ \sigma^2 } { \mu^4 } \right) 53 | \end{align} 54 | $$ 55 | 56 | ##### Appendix 57 | 58 | **Lemma 2.5 (the “delta method”)**: Suppose $$ \mathbf{x}_n $$ is a sequence of $$K$$-dimensional random vectors such that $$ \mathbf{x}_n \to_p \boldsymbol{\beta} $$ and 59 | 60 | $$ 61 | \sqrt{n} ( \mathbf{x}_n - \boldsymbol{\beta} ) 62 | \to_d \mathbf{z}, 63 | $$ 64 | 65 | and suppose $$ \mathbf{a} (\cdot): \mathbb{R}^K \to \mathbb{R}^r $$ has continuous first derivatives with $$ \mathbf{A} ( \boldsymbol{\beta} ) $$ denoting the $$ r \times K $$ matrix of first derivatives evaluated at $$ \boldsymbol{\beta} $$: 66 | 67 | $$ 68 | \underset{ ( r \times K ) }{ \mathbf{A} ( \boldsymbol{\beta} ) } 69 | \equiv 70 | \frac{ 71 | \partial \mathbf{a} ( \boldsymbol{ \beta } ) } 72 | { \partial \boldsymbol{ \beta }' }. 73 | $$ 74 | 75 | Then 76 | 77 | $$ 78 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - \mathbf{a} ( 79 | \boldsymbol{ \beta }) ] \to_d 80 | \mathbf{A} ( \boldsymbol{ \beta } ) \mathbf{z}. 81 | $$ 82 | 83 | In particular: 84 | 85 | $$ 86 | \sqrt{n} ( \mathbf{x}_n - \boldsymbol{ \beta } ) \to_d 87 | N( \mathbf{0}, \boldsymbol{\Sigma} ) 88 | \implies 89 | \sqrt{n} [ \mathbf{a} ( \mathbf{x}_n ) - \mathbf{a} 90 | ( \boldsymbol{ \beta } ) ] \to_d N( \mathbf{0}, 91 | \mathbf{A} ( \boldsymbol{ \beta } ) 92 | \boldsymbol{ \Sigma } \mathbf{A} ( \boldsymbol{ \beta } )' ). 93 | $$ 94 | 95 | --- 96 | 97 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /exercise-solution/1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Analytical Exercise 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ... 10 | 11 | #### Analytical Exercise 1.2 (The annihilator associated with the vector of ones) 12 | 13 | Let $$ \mathbf{1} $$ be the $$n$$-dimensional column vector of ones, and let $$ \mathbf{M}_1 \equiv \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' $$. That is, $$ \mathbf{M}_1 $$ is the annihilator associated with $$ \mathbf{1} $$. Prove the following: 14 | 15 | (a) $$ \mathbf{M}_1 $$ is symmetric and idempotent. 16 | 17 | (b) $$ \mathbf{M}_1 \mathbf{1} = \mathbf{0} $$. 18 | 19 | (c) $$ \mathbf{M}_1 \mathbf{y} = \mathbf{y} - \bar{y} \cdot \mathbf{1} $$ where 20 | 21 | $$ 22 | \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i. 23 | $$ 24 | 25 | $$ \mathbf{M}_1 \mathbf{y} $$ is the vector of **deviations from the mean**. 
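Before the formal proof, the three claims are easy to confirm numerically (a quick sketch assuming `numpy`; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
ones = np.ones((n, 1))
M1 = np.eye(n) - ones @ ones.T / n        # M_1 = I_n - 1 (1'1)^{-1} 1', since 1'1 = n
y = rng.normal(size=n)

print(np.allclose(M1, M1.T))              # (a) symmetric
print(np.allclose(M1 @ M1, M1))           # (a) idempotent
print(np.allclose(M1 @ np.ones(n), 0))    # (b) M_1 1 = 0
print(np.allclose(M1 @ y, y - y.mean()))  # (c) deviations from the mean
```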
26 | 27 | ##### Solution 28 | 29 | (a) The symmetry of $$ \mathbf{M}_1 $$ is verified as 30 | 31 | $$ 32 | \begin{align} 33 | \mathbf{M}'_1 & = \mathbf{I}'_n - ( \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' )' 34 | \\ & = 35 | \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 36 | && \text{(Because 37 | $( \mathbf{A} \mathbf{B} )' = \mathbf{B}' \mathbf{A}' $, 38 | $ ( \mathbf{A}^{-1} )' = ( \mathbf{A}' )^{-1} $)} 39 | \\ & = 40 | \mathbf{M}_1. 41 | \end{align} 42 | $$ 43 | 44 | The idempotency of $$ \mathbf{M}_1 $$ is verified as 45 | 46 | $$ 47 | \begin{align} 48 | \mathbf{M}_1 \mathbf{M}_1 & = ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' )( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) 49 | \\ & = 50 | \mathbf{I}_n 51 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 52 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 53 | + \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 54 | \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 55 | \\ & = 56 | \mathbf{I}_n 57 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 58 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 59 | + \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 60 | \\ & = 61 | \mathbf{I}_n 62 | - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' 63 | \\ & = 64 | \mathbf{M}_1. 65 | \end{align} 66 | $$ 67 | 68 | (b) 69 | 70 | $$ 71 | \begin{align} 72 | \mathbf{M}_1 \mathbf{1} & = 73 | ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) \mathbf{1} 74 | \\ & = 75 | \mathbf{1} - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' \mathbf{1} 76 | \\ & = 77 | \mathbf{1} - \mathbf{1} 78 | \\ & = 79 | \mathbf{0}. 80 | \end{align} 81 | $$ 82 | 83 | (c) 84 | 85 | $$ 86 | \begin{align} 87 | \mathbf{M}_{1} \mathbf{y} & = ( \mathbf{I}_n - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' ) \mathbf{y} 88 | \\ & = 89 | \mathbf{y} - \mathbf{1} ( \mathbf{1}' \mathbf{1} )^{-1} \mathbf{1}' \mathbf{y} 90 | \\ & = 91 | \mathbf{y} - \mathbf{1} \cdot \frac{1}{n} \cdot \sum_{i=1}^{n} y_i 92 | \\ & = 93 | \mathbf{y} - \bar{y} \cdot \mathbf{1}. 94 | \end{align} 95 | $$ 96 | 97 | --- 98 | 99 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.1.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 13, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### Review Question 1.1.4 (Normally distributed ramdom sample) 14 | 15 | Consider a random sample on consumption and disposable income, $$ ( CON_i, YD_i ) $$, $$ ( i = 1, 2, \ldots, n ) $$. Suppose the joint distribution of $$ ( CON_i, YD_i ) $$ (which is the same across $$ i $$ because of the random sample assumption) is normal. Clearly, Assumption 1.3 is satisfied; the rank of $$ \mathbf{X} $$ would be less than $$ K $$ only by pure accident. Show that the other assumptions, Assumptions 1.1, 1.2, and 1.4, are satisfied. **Hint:** if two random variables, $$ y $$ and $$ x $$, are jointly normally distributed, then the conditional expectation is linear in $$ x $$, i.e., 16 | 17 | $$ 18 | \mathrm{E} ( y \mid x ) = \beta_1 + \beta_2 x, 19 | $$ 20 | 21 | and the conditional variance, $$ \mathrm{Var} ( y \mid x ) $$, does not depend on $$ x $$. 
Here, the fact that the distribution is the same across $$ i $$ is important; if the distribution differed across $$ i $$, $$ \beta_1 $$ and $$ \beta_2 $$ could vary across $$ i $$. 22 | 23 | ##### Solution (with flaw) 24 | 25 | (1) We _define_ the error term as 26 | 27 | $$ 28 | \varepsilon_i = CON_i - \beta_1 - \beta_2 YD_i, 29 | $$ 30 | 31 | then Assumption 1.1 is satisfied. 32 | 33 | (2) Because 34 | 35 | $$ 36 | \begin{align} 37 | \mathrm{E} (\varepsilon_i \mid \mathbf{X}) 38 | & = 39 | \mathrm{E} ( CON_i - \beta_1 - \beta_2 YD_i \mid \mathbf{X} ) 40 | && 41 | \text{(definition of $\varepsilon_i$)} 42 | \\ & = 43 | \mathrm{E} ( CON_i \mid YD_i ) - \beta_1 - \beta_2 YD_i 44 | && 45 | \text{(linearity of conditional expectations)} 46 | \\ & = 47 | \beta_1 + \beta_2 \mathit{YD}_i - \beta_1 - \beta_2 \mathit{YD}_i 48 | && 49 | \text{(hint)} 50 | \\ & = 0, 51 | \end{align} 52 | $$ 53 | 54 | Assumption 1.2 holds. 55 | 56 | (3) To prove Assumption 1.4, 57 | 58 | $$ 59 | \begin{align} 60 | \mathrm{E} ( \varepsilon_i^2 \mid \mathbf{X} ) 61 | & = 62 | \mathrm{Var} ( \varepsilon_i \mid \mathbf{X} ) + \mathrm{E} ( \varepsilon_i \mid \mathbf{X} )^2 63 | && 64 | \text{(definition of $\mathrm{Var} (\cdot)$)} 65 | \\ & = 66 | \mathrm{Var} ( \varepsilon_i \mid \mathbf{X} ) 67 | && 68 | \text{(Assumption 1.2)} 69 | \\ & = 70 | \mathrm{Var} ( CON_i - \beta_1 - \beta_2 YD_i \mid YD_i ) 71 | && 72 | \text{(definition of $\varepsilon_i$)} 73 | \\ & = 74 | \mathrm{Var} ( CON_i \mid YD_i ) 75 | && 76 | \text{($\mathrm{Var} (ax + b) = a^2 \mathrm{Var} (x)$ )} 77 | \\ & = \sigma^2 > 0, 78 | && 79 | \text{(hint)} 80 | \end{align} 81 | $$ 82 | 83 | and 84 | 85 | $$ 86 | \begin{align} 87 | \mathrm{E} ( \varepsilon_i \varepsilon_j \mid \mathbf{X} ) 88 | & = 89 | \mathrm{E} ( \varepsilon_i \mid \mathbf{x}_i ) \mathrm{E} ( \varepsilon_j \mid \mathbf{x}_j ) 90 | && 91 | \text{(Review Question 1.1.2)} 92 | \\ & = 0. 93 | && 94 | \text{(Assumption 1.2)} 95 | \end{align} 96 | $$ 97 | 98 | --- 99 | 100 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.9.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 26, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.9 (Computation of the statistics) 14 | 15 | Verify that $$ \mathbf{b} $$, $$ \mathrm{SSR} $$, $$ s^2 $$, and $$ R^2 $$ can be calculated from the following sample averages: $$ \mathbf{S}_{ \mathbf{x} \mathbf{x} } $$, $$ \mathbf{s}_{ \mathbf{x} \mathbf{y} } $$, $$ \mathbf{y}' \mathbf{y} / n $$, and $$ \bar{y} $$. (If the regressors include a constant, then $$ \bar{y} $$ is the element of $$ \mathbf{s}_{ \mathbf{x} \mathbf{y} } $$ corresponding to the constant.) Therefore, those sample averages need to be computed just once in order to obtain the regression coefficients and related statistics. 16 | 17 | ##### Solution 18 | 19 | (1) According to (1.2.5'), 20 | 21 | $$ 22 | \mathbf{b} = \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 23 | \mathbf{s}_{\mathbf{x} \mathbf{y}}. 
24 | $$ 25 | 26 | (2) 27 | 28 | $$ 29 | \begin{align} 30 | \mathrm{SSR} & = \mathbf{e}' \mathbf{e} 31 | && 32 | \text{(definition (1.2.12))} 33 | \\ & = 34 | (\mathbf{y} - \mathbf{X} \mathbf{b})' 35 | (\mathbf{y} - \mathbf{X} \mathbf{b}) 36 | && 37 | \text{(definition (1.2.4))} 38 | \\ & = 39 | \mathbf{y}'\mathbf{y} - \mathbf{y}' \mathbf{X} 40 | \mathbf{b} - \mathbf{b}' \mathbf{X}' \mathbf{y} + 41 | \mathbf{b}' \mathbf{X}' \mathbf{X} \mathbf{b} 42 | \\ & = 43 | \mathbf{y}'\mathbf{y} - \mathbf{y}' \mathbf{X} 44 | \mathbf{b} - \mathbf{b}' \mathbf{X}' \mathbf{y} + 45 | \mathbf{b}' \mathbf{X}' \mathbf{y} 46 | && 47 | \text{(definition (1.2.5))} 48 | \\ & = 49 | \mathbf{y}'\mathbf{y} - (\mathbf{X}' \mathbf{y})' \mathbf{b} 50 | \\ & = 51 | n \cdot \mathbf{y}' \mathbf{y} / n - n \cdot 52 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 53 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 54 | \mathbf{s}_{\mathbf{x} \mathbf{y}} 55 | \\ & = 56 | n \cdot ( \mathbf{y}' \mathbf{y} / n - 57 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 58 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 59 | \mathbf{s}_{\mathbf{x} \mathbf{y}} ). 60 | \end{align} 61 | $$ 62 | 63 | (3) 64 | 65 | $$ 66 | s^2 = \frac{\mathrm{SSR}}{n - K} = 67 | \frac{n}{n - K} \cdot ( \mathbf{y}' \mathbf{y} / n - 68 | \mathbf{s}_{\mathbf{x} \mathbf{y}}' 69 | \mathbf{S}_{\mathbf{x} \mathbf{x}}^{-1} 70 | \mathbf{s}_{\mathbf{x} \mathbf{y}} ). 71 | $$ 72 | 73 | (4) 74 | 75 | $$ 76 | \begin{align} 77 | R^2 & = 1 - \frac{ \mathrm{SSR} } 78 | { \sum_{i=1}^n (y_i - \bar{y})^2 } 79 | && 80 | \text{(definition (1.2.18))} 81 | \\ & = 82 | 1 - \frac{ \mathrm{SSR} } 83 | { \sum_{i=1}^n (y_i^2 - 2\bar{y}y_i + \bar{y}^2) } 84 | \\ & = 85 | 1 - \frac{ \mathrm{SSR} } 86 | { \sum_{i=1}^n y_i^2 - 2 \bar{y} \sum_{i=1}^n y_i + 87 | \sum_{i=1}^n \bar{y}^2 } 88 | \\ & = 89 | 1 - \frac{ \mathrm{SSR} } 90 | { n \cdot \mathbf{y}' \mathbf{y} / n - 91 | 2n \cdot \bar{y}^2 + n \cdot \bar{y}^2} 92 | \\ & = 93 | 1 - \frac{ \mathrm{SSR} } 94 | { n \cdot \mathbf{y}' \mathbf{y} / n - 95 | n \cdot \bar{y}^2}. 96 | \end{align} 97 | $$ 98 | 99 | --- 100 | 101 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.4 (Information matrix equality for classical regression model) 14 | 15 | Verify the **information matrix equality** 16 | 17 | $$ 18 | \mathbf{I} ( \boldsymbol{ \theta } ) = 19 | - \mathrm{E} \left[ 20 | \frac{ \partial^2 \log L( \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } 21 | \right] 22 | \tag{1.5.11} 23 | $$ 24 | 25 | ~~for the linear regression model~~. 26 | 27 | ##### Solution 28 | 29 | _Comment_: The information matrix equality is an identity, it is not specific to the linear regression model. 30 | 31 | We begin with the identity 32 | 33 | $$ 34 | 1 = \oint_{ \Omega } f( \mathbf{z} ; \boldsymbol{ \theta } ) \, d \mathbf{z}. 
35 | $$ 36 | 37 | Taking derivative with respect to $$ \boldsymbol{ \theta } $$ and interchange differentiation and integration, 38 | 39 | $$ 40 | 0 = \oint_{\Omega} \frac{ \partial f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } \, d \mathbf{z} = 41 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}. 42 | $$ 43 | 44 | Taking derivative with respect to $$ \boldsymbol{ \theta }' $$ and interchange differentiation and integration, 45 | 46 | $$ 47 | \begin{align} 48 | 0 & = \oint_{\Omega} \frac{ \partial^2 \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z} + 49 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 50 | \frac{ \partial f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } \, d \mathbf{z} 51 | \\ & = 52 | \oint_{\Omega} \frac{ \partial^2 \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } \, \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z} + 53 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 54 | \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}. 55 | \tag{1} 56 | \end{align} 57 | $$ 58 | 59 | Because 60 | 61 | $$ 62 | \begin{align} 63 | \mathbf{I} ( \boldsymbol{ \theta } ) & \equiv 64 | \mathrm{E} [ \mathbf{s} ( \boldsymbol{ \theta } ) \mathbf{s} ( \boldsymbol{ \theta } )' ] 65 | \tag{1.5.10} 66 | \\ & = 67 | \oint_{\Omega} \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } } } 68 | \frac{ \partial \log f( \mathbf{z}; \boldsymbol{ \theta } ) }{ \partial \tilde{ \boldsymbol{ \theta } }' } f( \mathbf{z}; \boldsymbol{ \theta } ) \, d \mathbf{z}, 69 | \end{align} 70 | $$ 71 | 72 | and $$ L( \boldsymbol{ \theta } ) \equiv f( \mathbf{z}; \boldsymbol{ \theta } ) $$, substituting into (1) and rearranging terms, we get (1.5.11). 73 | 74 | --- 75 | 76 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /supplements/matrix-multiplication.md: -------------------------------------------------------------------------------- 1 | ## Matrix Multiplication Rules 2 | 3 | by Qiang Gao, updated at Mar 20, 2017 4 | 5 | --- 6 | 7 | Matrix multiplication can be defined equivalently in _four_ different ways as following, where vector $$ \mathbf{a} $$ is _column_ vector by default and $$ \mathbf{a}^\intercal $$ means corresponding _row_ vector, the transpose of $$ \mathbf{a} $$. 
8 | 9 | #### Definition (inner-way multiplication) 10 | 11 | $$ 12 | \begin{align} 13 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 14 | & = 15 | \begin{bmatrix} 16 | — & \boldsymbol{\alpha}_1^\intercal & — \\ 17 | & \vdots & \\ 18 | — & \boldsymbol{\alpha}_m^\intercal & — 19 | \end{bmatrix} 20 | \begin{bmatrix} 21 | | & & | \\ 22 | \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ 23 | | & & | 24 | \end{bmatrix} \\ 25 | & = 26 | \begin{bmatrix} 27 | \boldsymbol{\alpha}_1^\intercal \mathbf{b}_1 28 | & \cdots 29 | & \boldsymbol{\alpha}_1^\intercal \mathbf{b}_p \\ 30 | \vdots & \ddots & \vdots \\ 31 | \boldsymbol{\alpha}_m^\intercal \mathbf{b}_1 32 | & \cdots 33 | & \boldsymbol{\alpha}_m^\intercal \mathbf{b}_p 34 | \end{bmatrix} 35 | \end{align} 36 | $$ 37 | 38 | #### Definition (outer-way multiplication) 39 | 40 | $$ 41 | \begin{align} 42 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 43 | & = 44 | \begin{bmatrix} 45 | | & & | \\ 46 | \mathbf{a}_1 & \cdots & \mathbf{a}_n \\ 47 | | & & | 48 | \end{bmatrix} 49 | \begin{bmatrix} 50 | — & \boldsymbol{\beta}_1^\intercal & — \\ 51 | & \vdots & \\ 52 | — & \boldsymbol{\beta}_n^\intercal & — 53 | \end{bmatrix} \\ 54 | & = 55 | \sum_{i=1}^n \mathbf{a}_i \boldsymbol{\beta}_i^\intercal 56 | \end{align} 57 | $$ 58 | 59 | #### Definition (column-way multiplication) 60 | 61 | $$ 62 | \begin{align} 63 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 64 | & = 65 | \begin{bmatrix} 66 | | & & | \\ 67 | \mathbf{a}_1 & \cdots & \mathbf{a}_n \\ 68 | | & & | 69 | \end{bmatrix} 70 | \begin{bmatrix} 71 | | & & | \\ 72 | \mathbf{b}_1 & \cdots & \mathbf{b}_p \\ 73 | | & & | 74 | \end{bmatrix} \\ 75 | & = 76 | \begin{bmatrix} 77 | | & & | \\ 78 | \sum_{i=1}^n b_{i1} \mathbf{a}_i 79 | & \cdots 80 | & \sum_{i=1}^n 81 | b_{ip} \mathbf{a}_i \\ 82 | | & & | 83 | \end{bmatrix} 84 | \end{align} 85 | $$ 86 | 87 | #### Definition (row-way multiplication) 88 | 89 | $$ 90 | \begin{align} 91 | \mathbf{A}_{m \times n} \mathbf{B}_{n \times p} 92 | & = 93 | \begin{bmatrix} 94 | — & \boldsymbol{\alpha}_1^\intercal & — \\ 95 | & \vdots & \\ 96 | — & \boldsymbol{\alpha}_m^\intercal & — 97 | \end{bmatrix} 98 | \begin{bmatrix} 99 | — & \boldsymbol{\beta}_1^\intercal & — \\ 100 | & \vdots & \\ 101 | — & \boldsymbol{\beta}_n^\intercal & — 102 | \end{bmatrix} \\ 103 | & = 104 | \begin{bmatrix} 105 | — & \sum_{i=1}^n a_{1i} 106 | \boldsymbol{\beta}_i^\intercal & — \\ 107 | & \vdots & \\ 108 | — & \sum_{i=1}^n a_{mi} 109 | \boldsymbol{\beta}_i^\intercal & — 110 | \end{bmatrix} 111 | \end{align} 112 | $$ 113 | 114 | ##### Note 115 | 116 | - If you are concerning _each element_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the inner-way multiplication. 117 | 118 | - If you want the express $$ \mathbf{A} \mathbf{B} $$ as a _sum_, then you should use the outer-way multiplication. 119 | 120 | - If you are concerning _each column_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the column-way multiplication. 121 | 122 | - If you are concerning _each row_ of the result of $$ \mathbf{A} \mathbf{B} $$, then you should use the row-way multiplication. 
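
As a quick numerical sanity check of the four definitions above, the following short sketch (an illustration added here, assuming Python with NumPy is available) builds the product $$ \mathbf{A} \mathbf{B} $$ in each of the four ways and verifies that all of them coincide with the usual matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # A is m x n with m = 3, n = 4
B = rng.standard_normal((4, 2))   # B is n x p with p = 2
m, n = A.shape
p = B.shape[1]

# inner-way: element (i, j) is the inner product of row i of A and column j of B
inner = np.array([[A[i, :] @ B[:, j] for j in range(p)] for i in range(m)])

# outer-way: the sum of n outer products, column k of A times row k of B
outer = sum(np.outer(A[:, k], B[k, :]) for k in range(n))

# column-way: column j of AB is a linear combination of the columns of A
column = np.column_stack([A @ B[:, j] for j in range(p)])

# row-way: row i of AB is a linear combination of the rows of B
row = np.vstack([A[i, :] @ B for i in range(m)])

for result in (inner, outer, column, row):
    assert np.allclose(result, A @ B)
```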
123 | 124 | --- 125 | 126 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.3.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 8, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 3 Finite-Sample Properties of OLS 10 | 11 | ... 12 | 13 | #### Review Question 1.3.4 (Gauss-Markov for unconditional variance) 14 | 15 | (a) Prove: $$ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} ) = \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] $$. 16 | 17 | (b) Prove (1.3.3) in textbook, 18 | 19 | $$ 20 | \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ) \ge 21 | \mathrm{Var} ( \mathbf{b} ), 22 | \tag{1.3.3} 23 | $$ 24 | 25 | where $$ \widehat{ \boldsymbol{ \beta } } $$ is any linear unbiased estimator. 26 | 27 | ##### Solution 28 | 29 | (a) By the [formula of variance](supplements/var-cov-matrix.md), 30 | 31 | $$ 32 | \begin{align} 33 | \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) 34 | & = 35 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' \mid \mathbf{X} ) - 36 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) 37 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )'. 38 | \end{align} 39 | $$ 40 | 41 | Taking expectations, 42 | 43 | $$ 44 | \begin{align} 45 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 46 | & = 47 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' \mid \mathbf{X} ) ] - 48 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ] 49 | \\ & = 50 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' ) - 51 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ]. 52 | \tag{1} 53 | \end{align} 54 | $$ 55 | 56 | Also by the [formula of variance](supplements/var-cov-matrix.md), 57 | 58 | $$ 59 | \begin{align} 60 | \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 61 | & = 62 | \mathrm{E} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} )' ] - 63 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } ) 64 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } )'. 65 | \tag{2} 66 | \end{align} 67 | $$ 68 | 69 | Combining equations (1) and (2), and again by the [formula of variance](supplements/var-cov-matrix.md), 70 | 71 | $$ 72 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + 73 | \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] 74 | = 75 | \mathrm{E} ( \widehat{\boldsymbol{\beta}} \widehat{\boldsymbol{\beta}}' ) - 76 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } ) 77 | \mathrm{E} ( \widehat{ \boldsymbol{ \beta } } )' 78 | = 79 | \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ). 
80 | $$ 81 | 82 | (b) 83 | 84 | $$ 85 | \begin{align} 86 | & \mathrm{Var} ( \widehat{ \boldsymbol{ \beta } } ) - 87 | \mathrm{Var} ( \mathbf{b} ) 88 | \\ = & 89 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] + \mathrm{Var} [ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] - 90 | \mathrm{E} [ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] - \mathrm{Var} [ \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) ] 91 | && 92 | \text{(by part (a))} 93 | \\ = & 94 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) ] - 95 | \mathrm{E} [ \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] 96 | && 97 | \text{($ \mathrm{E} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) = \mathrm{E} ( \mathbf{b} \mid \mathbf{X} ) = \boldsymbol{ \beta } $)} 98 | \\ = & 99 | \mathrm{E} [ \mathrm{Var} ( \widehat{\boldsymbol{\beta}} \mid \mathbf{X} ) - \mathrm{Var} ( \mathbf{b} \mid \mathbf{X} ) ] 100 | && 101 | \text{(linearity of expectations)} 102 | \\ = & 103 | \text{postive semidefinite matrix}. 104 | && 105 | \text{(Gauss-Markov Theorem)} 106 | \end{align} 107 | $$ 108 | 109 | --- 110 | 111 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /SUMMARY.md: -------------------------------------------------------------------------------- 1 | # 目录 2 | 3 | * [简介](README.md) 4 | 5 | ## Chapter 1 Finite-Sample Properties of OLS 6 | 7 | ### Section 1 The Classical Linear Regression Model 8 | 9 | * [Lecture Note 1.1](lecture-note/1.1.md) 10 | * [Review Question 1.1.1](question-solution/1.1.1.md) 11 | * [Review Question 1.1.2](question-solution/1.1.2.md) 12 | * [Review Question 1.1.3](question-solution/1.1.3.md) 13 | * [Review Question 1.1.4](question-solution/1.1.4.md) 14 | * [Review Question 1.1.5](question-solution/1.1.5.md) 15 | * [Review Question 1.1.6](question-solution/1.1.6.md) 16 | 17 | ### Section 2 The Algebra of Least Squares 18 | 19 | * [Lecture Note 1.2](lecture-note/1.2.md) 20 | * [Review Question 1.2.1](question-solution/1.2.1.md) 21 | * [Review Question 1.2.2](question-solution/1.2.2.md) 22 | * [Review Question 1.2.3](question-solution/1.2.3.md) 23 | * [Review Question 1.2.4](question-solution/1.2.4.md) 24 | * [Review Question 1.2.5](question-solution/1.2.5.md) 25 | * [Review Question 1.2.6](question-solution/1.2.6.md) 26 | * [Review Question 1.2.7](question-solution/1.2.7.md) 27 | * [Review Question 1.2.8](question-solution/1.2.8.md) 28 | * [Review Question 1.2.9](question-solution/1.2.9.md) 29 | 30 | ### Section 3 Finite-Sample Properties of OLS 31 | 32 | * [Review Question 1.3.1](question-solution/1.3.1.md) 33 | * [Review Question 1.3.2](question-solution/1.3.2.md) 34 | * [Review Question 1.3.3](question-solution/1.3.3.md) 35 | * [Review Question 1.3.4](question-solution/1.3.4.md) 36 | * [Review Question 1.3.5](question-solution/1.3.5.md) 37 | * [Review Question 1.3.6](question-solution/1.3.6.md) 38 | * [Review Question 1.3.7](question-solution/1.3.7.md) 39 | 40 | ### Section 4 Hypothesis Testing under Normality 41 | 42 | * [Review Question 1.4.1](question-solution/1.4.1.md) 43 | * [Review Question 1.4.2](question-solution/1.4.2.md) 44 | * [Review Question 1.4.3](question-solution/1.4.3.md) 45 | * [Review Question 1.4.4](question-solution/1.4.4.md) 46 | * [Review Question 1.4.5](question-solution/1.4.5.md) 47 | * [Review Question 1.4.6](question-solution/1.4.6.md) 48 | * [Review Question 1.4.7](question-solution/1.4.7.md) 49 | 50 | ### Section 5 Relation to Maximum Likelihood 51 | 52 | * 
[Review Question 1.5.1](question-solution/1.5.1.md) 53 | * [Review Question 1.5.2](question-solution/1.5.2.md) 54 | * [Review Question 1.5.3](question-solution/1.5.3.md) 55 | * [Review Question 1.5.4](question-solution/1.5.4.md) 56 | * [Review Question 1.5.5](question-solution/1.5.5.md) 57 | 58 | ### Section 6 Generalized Least Squares (GLS) 59 | 60 | * [Review Question 1.6.1](question-solution/1.6.1.md) 61 | * [Review Question 1.6.2](question-solution/1.6.2.md) 62 | * [Review Question 1.6.3](question-solution/1.6.3.md) 63 | * [Review Question 1.6.4](question-solution/1.6.4.md) 64 | 65 | ### Section 7 Application: Returns to Scale in Electricity Supply 66 | 67 | * [Lecture Note](lecture-note/1.7.md) 68 | * [Review Question 1.7.1](question-solution/1.7.1.md) 69 | * [Review Question 1.7.2](question-solution/1.7.2.md) 70 | * [Review Question 1.7.3](question-solution/1.7.3.md) 71 | * [Review Question 1.7.4](question-solution/1.7.4.md) 72 | * [Review Question 1.7.5](question-solution/1.7.5.md) 73 | * [Review Question 1.7.6](question-solution/1.7.6.md) 74 | * [Review Question 1.7.7](question-solution/1.7.7.md) 75 | * [Review Question 1.7.8](question-solution/1.7.8.md) 76 | 77 | ### Analytical Exercises 78 | 79 | * [Analytical Exercise 1.1](exercise-solution/1.1.md) 80 | * [Analytical Exercise 1.2](exercise-solution/1.2.md) 81 | 82 | ## Chapter 2 Large-Sample Theory 83 | 84 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 85 | 86 | * [Review Question 2.1.1](question-solution/2.1.1.md) 87 | * [Review Question 2.1.2](question-solution/2.1.2.md) 88 | * [Review Question 2.1.3](question-solution/2.1.3.md) 89 | * [Review Question 2.1.4](question-solution/2.1.4.md) 90 | * [Review Question 2.1.5](question-solution/2.1.5.md) 91 | 92 | ## Supplements 93 | 94 | * [Taylor's linearization](supplements/taylor-linearization.md) 95 | * [Variance-Covariance Matrix](supplements/var-cov-matrix.md) 96 | * [Four Ways of Matrix Multiplication](supplements/matrix-multiplication.md) 97 | -------------------------------------------------------------------------------- /lecture-note/1.7.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at May 17, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### How to derive the cost function (1.7.2) 14 | 15 | Assume the firms are engaged in **cost minimization**, 16 | 17 | $$ 18 | \begin{gather} 19 | \min \quad p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} 20 | \tag{1} 21 | \\ 22 | \text{s.t.} \qquad 23 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = Q_i. 24 | \tag{2} 25 | \end{gather} 26 | $$ 27 | 28 | Then the **Lagrangian** is 29 | 30 | $$ 31 | \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) = 32 | p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} - 33 | \lambda ( A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i ), 34 | \tag{3} 35 | $$ 36 | 37 | where $$ \lambda > 0 $$ is the **shadow price** of the **exogenous variable** $$ Q_i $$. 
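
Although the derivation below does not use this fact, the multiplier has a familiar interpretation: by the envelope theorem, at the cost-minimizing choice of inputs

$$
\lambda = \frac{ \partial TC_i }{ \partial Q_i },
$$

that is, $$ \lambda $$ equals marginal cost, which is why it is interpreted as the shadow price of the output requirement $$ Q_i $$.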
38 | 39 | The **first-order conditions** are 40 | 41 | $$ 42 | \begin{align} 43 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 44 | { \partial x_{i1} } = 45 | p_{i1} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1 - 1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = 0, 46 | \tag{4} 47 | \\ 48 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 49 | { \partial x_{i2} } = 50 | p_{i2} - \lambda \alpha_2 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2 - 1} x_{i3}^{\alpha_3} = 0, 51 | \tag{5} 52 | \\ 53 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3} ) } 54 | { \partial x_{i3} } = 55 | p_{i3} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3 - 1} = 0. 56 | \tag{6} 57 | \end{align} 58 | $$ 59 | 60 | Dividing (5) over (4) to eliminate the shadow price $$ \lambda $$, 61 | 62 | $$ 63 | \begin{gather} 64 | \frac{ p_{i2} }{ p_{i1} } = 65 | \frac{ \alpha_2 }{ \alpha_1 } 66 | \frac{ x_{i1} }{ x_{i2} }, 67 | \tag{7} 68 | \\ 69 | x_{i2} = p_{i1} p_{i2}^{-1} \alpha_1^{-1} \alpha_2 x_{i1}. 70 | \tag{8} 71 | \end{gather} 72 | $$ 73 | 74 | Similarly, dividing (6) over (4), 75 | 76 | $$ 77 | \begin{gather} 78 | \frac{ p_{i3} }{ p_{i1} } = 79 | \frac{ \alpha_3 }{ \alpha_1 } 80 | \frac{ x_{i1} }{ x_{i3} }, 81 | \tag{9} 82 | \\ 83 | x_{i3} = p_{i1} p_{i3}^{-1} \alpha_1^{-1} \alpha_3 x_{i1}. 84 | \tag{10} 85 | \end{gather} 86 | $$ 87 | 88 | Substituting (8) and (10) into (2), 89 | 90 | $$ 91 | A_i x_{i1}^{ \alpha_1 + \alpha_2 + \alpha_3 } 92 | p_{i1}^{ \alpha_2 + \alpha_3 } 93 | p_{i2}^{ - \alpha_2 } 94 | p_{i3}^{ - \alpha_3 } 95 | \alpha_1^{ - \alpha_2 - \alpha_3} 96 | \alpha_2^{ \alpha_2 } 97 | \alpha_3^{ \alpha_3 } 98 | = Q_i. 99 | \tag{11} 100 | $$ 101 | 102 | Using $$ r \equiv \alpha_1 + \alpha_2 + \alpha_3 $$, we solved from (11) that 103 | 104 | $$ 105 | \begin{align} 106 | x_{i1} & = A_i^{-1/r} Q_i^{1/r} 107 | p_{i1}^{-1 + \alpha_1/r} 108 | p_{i2}^{ \alpha_2/r } 109 | p_{i3}^{ \alpha_3/r } 110 | \alpha_1^{1 - \alpha_1/r } 111 | \alpha_2^{- \alpha_2/r} 112 | \alpha_3^{- \alpha_3/r} 113 | \\ & = 114 | \alpha_1 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 115 | Q_i^{1/r} 116 | \frac{1}{ p_{i1} } 117 | p_{i1}^{ \alpha_1/r} 118 | p_{i2}^{ \alpha_2/r } 119 | p_{i3}^{ \alpha_3/r }. 120 | \tag{12} 121 | \end{align} 122 | $$ 123 | 124 | Similarly, because of the symmetry of $$x_{i1}$$, $$x_{i2}$$, and $$ x_{i3} $$, we can also solve that 125 | 126 | $$ 127 | \begin{align} 128 | x_{i2} & = \alpha_2 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 129 | Q_i^{1/r} 130 | \frac{1}{ p_{i2} } 131 | p_{i1}^{ \alpha_1/r} 132 | p_{i2}^{ \alpha_2/r } 133 | p_{i3}^{ \alpha_3/r }, 134 | \tag{13} 135 | \\ 136 | x_{i3} & = \alpha_3 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 137 | Q_i^{1/r} 138 | \frac{1}{ p_{i3} } 139 | p_{i1}^{ \alpha_1/r} 140 | p_{i2}^{ \alpha_2/r } 141 | p_{i3}^{ \alpha_3/r }. 142 | \tag{14} 143 | \end{align} 144 | $$ 145 | 146 | Substituting solutions (12), (13) and (14) of the **endogenous variables** into the **objective function** (1), we get 147 | 148 | $$ 149 | TC_i = r \cdot ( A_i \alpha_1^{ \alpha_1 } \alpha_2^{ \alpha_2 } \alpha_3^{ \alpha_3 } )^{-1/r} Q_i^{1/r} p_{i1}^{\alpha_1 / r} p_{i2}^{\alpha_2 / r} p_{i3}^{\alpha_3 / r}. 
150 | \tag{1.7.2} 151 | $$ 152 | 153 | --- 154 | 155 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.4.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Apr 2, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.4 14 | 15 | Prove that 16 | 17 | $$ 18 | \begin{align} 19 | & 20 | \text{ Both $ \mathbf{P} $ and $ \mathbf{M} $ are symmetric and idempotent, } 21 | \tag{1.2.9} 22 | \\ & 23 | \mathbf{P} \mathbf{X} = \mathbf{X} \quad 24 | \text{(hence the term projection matrix),} 25 | \tag{1.2.10} 26 | \\ & 27 | \mathbf{M} \mathbf{X} = \mathbf{0} \quad 28 | \text{(hence the term annihilator).}\tag{1.2.11} 29 | \end{align} 30 | $$ 31 | 32 | ##### Solution 33 | 34 | (1) $$ \mathbf{P} $$ is symmetric because 35 | 36 | $$ 37 | \begin{align} 38 | \mathbf{P}^\intercal 39 | & = 40 | [\mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 41 | \mathbf{X}^\intercal]^\intercal 42 | && 43 | \text{(definition (1.2.7))} 44 | \\ & = 45 | (\mathbf{X}^\intercal)^\intercal 46 | [(\mathbf{X}^\intercal \mathbf{X})^{-1}]^\intercal 47 | \mathbf{X}^\intercal 48 | && 49 | ((\mathbf{A} \mathbf{B})^\intercal = 50 | \mathbf{B}^\intercal \mathbf{A}^\intercal) 51 | \\ & = 52 | ( \mathbf{X}^\intercal)^\intercal 53 | [(\mathbf{X}^\intercal \mathbf{X})^\intercal]^{-1} 54 | \mathbf{X}^\intercal 55 | && 56 | ((\mathbf{A}^{-1})^\intercal = 57 | (\mathbf{A}^\intercal)^{-1}) 58 | \\ & = 59 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 60 | \mathbf{X}^\intercal 61 | && 62 | ((\mathbf{A}^\intercal)^\intercal = \mathbf{A}) 63 | \\ & = 64 | \mathbf{P}. 65 | && 66 | \text{(definition (1.2.7))} 67 | \end{align} 68 | $$ 69 | 70 | (2) $$ \mathbf{M} $$ is symmetric because 71 | 72 | $$ 73 | \begin{align} 74 | \mathbf{M}^\intercal 75 | & = 76 | ( \mathbf{I} - \mathbf{P} )^\intercal 77 | && 78 | \text{(definition (1.2.8))} 79 | \\ & = 80 | \mathbf{I}^\intercal - \mathbf{P}^\intercal 81 | && 82 | ((\mathbf{A} - \mathbf{B})^\intercal = 83 | \mathbf{A}^\intercal - \mathbf{B}^\intercal) 84 | \\ & = 85 | \mathbf{I} - \mathbf{P} 86 | && 87 | \text{($\mathbf{I}$ and $\mathbf{P}$ are symmetric)} 88 | \\ & = 89 | \mathbf{M}. 90 | && 91 | \text{(definition (1.2.8))} 92 | \end{align} 93 | $$ 94 | 95 | (3) $$ \mathbf{P} $$ is idempotent because 96 | 97 | $$ 98 | \begin{align} 99 | \mathbf{P}^2 100 | & = 101 | (\mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 102 | \mathbf{X}^\intercal) \cdot (\mathbf{X} 103 | (\mathbf{X}^\intercal \mathbf{X})^{-1} 104 | \mathbf{X}^\intercal) 105 | && 106 | \text{(definition (1.2.7))} 107 | \\ & = 108 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 109 | (\mathbf{X}^\intercal \mathbf{X}) 110 | (\mathbf{X}^\intercal \mathbf{X})^{-1} 111 | \mathbf{X}^\intercal 112 | && 113 | ((\mathbf{A} \mathbf{B}) \mathbf{C} = 114 | \mathbf{A} (\mathbf{B} \mathbf{C})) 115 | \\ & = 116 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 117 | \mathbf{X}^\intercal 118 | && 119 | (\mathbf{A}^{-1} \mathbf{A} = \mathbf{I}) 120 | \\ & = 121 | \mathbf{P}. 
122 | && 123 | \text{(definition (1.2.7))} 124 | \end{align} 125 | $$ 126 | 127 | (4) $$ \mathbf{M} $$ is idempotent because 128 | 129 | $$ 130 | \begin{align} 131 | \mathbf{M}^2 132 | & = 133 | (\mathbf{I} - \mathbf{P})(\mathbf{I} - \mathbf{P}) 134 | && 135 | \text{(definition (1.2.8))} 136 | \\ & = 137 | \mathbf{I} - \mathbf{P} - \mathbf{P} + \mathbf{P}^2 138 | \\ & = 139 | \mathbf{I} - \mathbf{P} - \mathbf{P} + \mathbf{P} 140 | && 141 | \text{($\mathbf{P}$ is idempotent)} 142 | \\ & = 143 | \mathbf{I} - \mathbf{P} 144 | \\ & = 145 | \mathbf{M}. 146 | && 147 | \text{(definition (1.2.8))} 148 | \end{align} 149 | $$ 150 | 151 | (5) $$ \mathbf{P} \mathbf{X} = \mathbf{X} $$ because 152 | 153 | $$ 154 | \begin{align} 155 | \mathbf{P} \mathbf{X} 156 | & = 157 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 158 | \mathbf{X}^\intercal \cdot \mathbf{X} 159 | && 160 | \text{(definition (1.2.7))} 161 | \\ & = 162 | \mathbf{X} (\mathbf{X}^\intercal \mathbf{X})^{-1} 163 | (\mathbf{X}^\intercal \mathbf{X}) 164 | \\ & = 165 | \mathbf{X}. 166 | \end{align} 167 | $$ 168 | 169 | (6) $$ \mathbf{M} \mathbf{X} = \mathbf{0} $$ because 170 | 171 | $$ 172 | \begin{align} 173 | \mathbf{M} \mathbf{X} 174 | & = 175 | ( \mathbf{I} - \mathbf{P} ) \mathbf{X} 176 | && 177 | \text{(definition (1.2.8))} 178 | \\ & = 179 | \mathbf{X} - \mathbf{P} \mathbf{X} 180 | \\ & = 181 | \mathbf{X} - \mathbf{X} 182 | && 183 | ( \mathbf{P} \mathbf{X} = \mathbf{X} ) 184 | \\ & = \mathbf{0}. 185 | \end{align} 186 | $$ 187 | 188 | --- 189 | 190 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.7.1.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 20, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 7 Application: Returns to Scale in Electricity Supply 10 | 11 | ... 12 | 13 | #### Review Question 1.7.1 (Review of duality theory) 14 | 15 | Consult your favorite microeconomic textbook to remember how to derive the Cobb-Douglas cost function from the Cobb-Douglas production function. 16 | 17 | ##### Solution 18 | 19 | Assume firm $$i$$ is engaged in **cost minimization**, 20 | 21 | $$ 22 | \begin{gather} 23 | \min \quad p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} 24 | \tag{1} 25 | \\ 26 | \text{s.t.} \qquad 27 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = Q_i. 28 | \tag{2} 29 | \end{gather} 30 | $$ 31 | 32 | Then the **Lagrangian** is 33 | 34 | $$ 35 | \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) = 36 | p_{i1} x_{i1} + p_{i2} x_{i2} + p_{i3} x_{i3} - 37 | \lambda ( A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i ), 38 | \tag{3} 39 | $$ 40 | 41 | where $$ \lambda > 0 $$ is the **shadow price** of the **exogenous variable** $$ Q_i $$. 
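
Throughout, the input prices $$ p_{i1}, p_{i2}, p_{i3} $$, the efficiency parameter $$ A_i $$, the output level $$ Q_i $$, and the exponents $$ \alpha_1, \alpha_2, \alpha_3 $$ are taken to be strictly positive (an assumption left implicit in the setup), so the cost-minimizing input bundle is interior and is characterized by the first-order conditions below.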
42 | 43 | The **first-order conditions** are 44 | 45 | $$ 46 | \begin{align} 47 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 48 | { \partial x_{i1} } & = 49 | p_{i1} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1 - 1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} = 0, 50 | \tag{4} 51 | \\ 52 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 53 | { \partial x_{i2} } & = 54 | p_{i2} - \lambda \alpha_2 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2 - 1} x_{i3}^{\alpha_3} = 0, 55 | \tag{5} 56 | \\ 57 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 58 | { \partial x_{i3} } & = 59 | p_{i3} - \lambda \alpha_1 A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3 - 1} = 0. 60 | \tag{6} 61 | \\ 62 | \frac{ \partial \mathcal{L} ( x_{i1}, x_{i2}, x_{i3}, \lambda ) } 63 | { \partial \lambda } & = 64 | A_i x_{i1}^{\alpha_1} x_{i2}^{\alpha_2} x_{i3}^{\alpha_3} - Q_i = 0. 65 | \tag{7} 66 | \end{align} 67 | $$ 68 | 69 | Dividing (5) over (4) to eliminate the shadow price $$ \lambda $$, 70 | 71 | $$ 72 | \begin{gather} 73 | \frac{ p_{i2} }{ p_{i1} } = 74 | \frac{ \alpha_2 }{ \alpha_1 } 75 | \frac{ x_{i1} }{ x_{i2} }, 76 | \tag{8} 77 | \\ 78 | x_{i2} = p_{i1} p_{i2}^{-1} \alpha_1^{-1} \alpha_2 x_{i1}. 79 | \tag{9} 80 | \end{gather} 81 | $$ 82 | 83 | Similarly, dividing (6) over (4), 84 | 85 | $$ 86 | \begin{gather} 87 | \frac{ p_{i3} }{ p_{i1} } = 88 | \frac{ \alpha_3 }{ \alpha_1 } 89 | \frac{ x_{i1} }{ x_{i3} }, 90 | \tag{10} 91 | \\ 92 | x_{i3} = p_{i1} p_{i3}^{-1} \alpha_1^{-1} \alpha_3 x_{i1}. 93 | \tag{11} 94 | \end{gather} 95 | $$ 96 | 97 | Substituting (9) and (11) into (7), 98 | 99 | $$ 100 | A_i x_{i1}^{ \alpha_1 + \alpha_2 + \alpha_3 } 101 | p_{i1}^{ \alpha_2 + \alpha_3 } 102 | p_{i2}^{ - \alpha_2 } 103 | p_{i3}^{ - \alpha_3 } 104 | \alpha_1^{ - \alpha_2 - \alpha_3} 105 | \alpha_2^{ \alpha_2 } 106 | \alpha_3^{ \alpha_3 } 107 | = Q_i. 108 | \tag{12} 109 | $$ 110 | 111 | Using $$ r \equiv \alpha_1 + \alpha_2 + \alpha_3 $$, we solved from (12) that 112 | 113 | $$ 114 | \begin{align} 115 | x_{i1} & = A_i^{-1/r} Q_i^{1/r} 116 | p_{i1}^{-1 + \alpha_1/r} 117 | p_{i2}^{ \alpha_2/r } 118 | p_{i3}^{ \alpha_3/r } 119 | \alpha_1^{1 - \alpha_1/r } 120 | \alpha_2^{- \alpha_2/r} 121 | \alpha_3^{- \alpha_3/r} 122 | \\ & = 123 | \alpha_1 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 124 | Q_i^{1/r} 125 | \frac{1}{ p_{i1} } 126 | p_{i1}^{ \alpha_1/r} 127 | p_{i2}^{ \alpha_2/r } 128 | p_{i3}^{ \alpha_3/r }. 129 | \tag{13} 130 | \end{align} 131 | $$ 132 | 133 | Similarly, because of the symmetry of $$x_{i1}$$, $$x_{i2}$$, and $$ x_{i3} $$, we can also solve that 134 | 135 | $$ 136 | \begin{align} 137 | x_{i2} & = \alpha_2 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 138 | Q_i^{1/r} 139 | \frac{1}{ p_{i2} } 140 | p_{i1}^{ \alpha_1/r} 141 | p_{i2}^{ \alpha_2/r } 142 | p_{i3}^{ \alpha_3/r }, 143 | \tag{14} 144 | \\ 145 | x_{i3} & = \alpha_3 \cdot ( A_i \alpha_1^{\alpha_1} \alpha_2^{\alpha_2} \alpha_3^{\alpha^3} )^{-1/r} 146 | Q_i^{1/r} 147 | \frac{1}{ p_{i3} } 148 | p_{i1}^{ \alpha_1/r} 149 | p_{i2}^{ \alpha_2/r } 150 | p_{i3}^{ \alpha_3/r }. 151 | \tag{15} 152 | \end{align} 153 | $$ 154 | 155 | Substituting solutions of the **endogenous variables** (13), (14) and (15) into the **objective function** (1), we get 156 | 157 | $$ 158 | TC_i = r \cdot ( A_i \alpha_1^{ \alpha_1 } \alpha_2^{ \alpha_2 } \alpha_3^{ \alpha_3 } )^{-1/r} Q_i^{1/r} p_{i1}^{\alpha_1 / r} p_{i2}^{\alpha_2 / r} p_{i3}^{\alpha_3 / r}. 
159 | \tag{1.7.2} 160 | $$ 161 | 162 | --- 163 | 164 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.2.3.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at Mar 21, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 2 The Algebra of Least Squares 10 | 11 | ... 12 | 13 | #### Review Question 1.2.3 (OLS estimator for the simple regression model) 14 | 15 | In the simple regression model, $$ K = 2 $$ and $$ x_{i1} = 1 $$. Show that 16 | 17 | $$ 18 | \mathbf{S}_{\mathbf{x}\mathbf{x}} = 19 | \begin{bmatrix} 20 | 1 & \bar{x}_2 \\ 21 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 22 | \end{bmatrix}, 23 | \quad 24 | \mathbf{S}_{\mathbf{x}\mathbf{y}} = 25 | \begin{bmatrix} 26 | \bar{y} \\ \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 27 | \end{bmatrix} 28 | $$ 29 | 30 | where 31 | 32 | $$ 33 | \bar{y} \equiv \frac{1}{n} \sum_{i=1}^n y_i, 34 | \quad 35 | \bar{x}_2 \equiv \frac{1}{n} \sum_{i=1}^n x_{i2}. 36 | $$ 37 | 38 | Show that 39 | 40 | $$ 41 | b_2 = \frac{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)(y_i - \bar{y}) }{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 }, 42 | \quad 43 | b_1 = \bar{y} - \bar{x}_2 b_2. 44 | $$ 45 | 46 | ##### Solution 47 | 48 | (1) 49 | 50 | $$ 51 | \begin{align} 52 | \mathbf{S}_{\mathbf{x}\mathbf{x}} 53 | & = \frac{1}{n} \mathbf{X}' \mathbf{X} \\ 54 | & = 55 | \frac{1}{n} 56 | \begin{bmatrix} 57 | 1 & \cdots & 1 \\ 58 | x_{12} & \cdots & x_{n2} 59 | \end{bmatrix} 60 | \begin{bmatrix} 61 | 1 & x_{12} \\ 62 | \vdots & \vdots \\ 63 | 1 & x_{n2} 64 | \end{bmatrix} \\ 65 | & = 66 | \frac{1}{n} 67 | \begin{bmatrix} 68 | n & \sum_{i=1}^n x_{i2} \\ 69 | \sum_{i=1}^n x_{i2} & \sum_{i=1}^n x_{i2}^2 70 | \end{bmatrix} \\ 71 | & = 72 | \begin{bmatrix} 73 | 1 & \bar{x}_2 \\ 74 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 75 | \end{bmatrix} 76 | \end{align} 77 | $$ 78 | 79 | (2) 80 | 81 | $$ 82 | \begin{align} 83 | \mathbf{S}_{\mathbf{x}\mathbf{y}} 84 | & = \frac{1}{n} \mathbf{X}' \mathbf{y} \\ 85 | & = 86 | \frac{1}{n} 87 | \begin{bmatrix} 88 | 1 & \cdots & 1 \\ 89 | x_{12} & \cdots & x_{n2} 90 | \end{bmatrix} 91 | \begin{bmatrix} 92 | y_1 \\ 93 | \vdots \\ 94 | y_n 95 | \end{bmatrix} \\ 96 | & = 97 | \frac{1}{n} 98 | \begin{bmatrix} 99 | \sum_{i=1}^n y_i \\ 100 | \sum_{i=1}^n x_{i2} y_i 101 | \end{bmatrix} \\ 102 | & = 103 | \begin{bmatrix} 104 | \bar{y} \\ 105 | \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 106 | \end{bmatrix} 107 | \end{align} 108 | $$ 109 | 110 | (3) To solve for $$ \mathbf{b} $$ from 111 | 112 | $$ 113 | \mathbf{S}_{\mathbf{x} \mathbf{x}} 114 | \mathbf{b} = 115 | \mathbf{S}_{\mathbf{x} \mathbf{y}}, 116 | $$ 117 | 118 | perform row operations on the following augmented matrix 119 | 120 | $$ 121 | \begin{align} 122 | \begin{bmatrix} 123 | \mathbf{S}_{\mathbf{x} \mathbf{x}} \mid 124 | \mathbf{S}_{\mathbf{x} \mathbf{y}} 125 | \end{bmatrix} 126 | & = 127 | \begin{bmatrix} 128 | 1 & \bar{x}_2 & \bar{y} \\ 129 | \bar{x}_2 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 130 | & \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 131 | \end{bmatrix} \\ 132 | & \sim 133 | \begin{bmatrix} 134 | 1 & \bar{x}_2 & \bar{y} \\ 135 | 0 & \frac{1}{n} \sum_{i=1}^n x_{i2}^2 - \bar{x}_2^2 136 | & \frac{1}{n} \sum_{i=1}^n x_{i2} y_i 137 | - \bar{x}_2 \bar{y} 138 | \end{bmatrix} \\ 139 | & = 140 | \begin{bmatrix} 141 | 1 & \bar{x}_2 & \bar{y} \\ 142 | 0 & \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 
143 | & \frac{1}{n} \sum_{i=1}^n 144 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) 145 | \end{bmatrix} \\ 146 | & \sim 147 | \begin{bmatrix} 148 | 1 & \bar{x}_2 & \bar{y} \\ 149 | 0 & 1 & 150 | \frac{ \frac{1}{n} \sum_{i=1}^n 151 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 152 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 153 | \end{bmatrix} \\ 154 | & \sim 155 | \begin{bmatrix} 156 | 1 & 0 & \bar{y} - 157 | \bar{x}_2 158 | \frac{ \frac{1}{n} \sum_{i=1}^n 159 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 160 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 161 | \\ 162 | 0 & 1 & 163 | \frac{ \frac{1}{n} \sum_{i=1}^n 164 | (x_{i2} - \bar{x}_2)(y_i - \bar{y}) } 165 | {\frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2} 166 | \end{bmatrix} \\ 167 | & = 168 | \begin{bmatrix} 169 | \mathbf{I} \mid \mathbf{b} 170 | \end{bmatrix}. 171 | \end{align} 172 | $$ 173 | 174 | So 175 | 176 | $$ 177 | b_2 = \frac{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)(y_i - \bar{y}) }{ \frac{1}{n} \sum_{i=1}^n (x_{i2} - \bar{x}_2)^2 }, 178 | \quad 179 | b_1 = \bar{y} - \bar{x}_2 b_2. 180 | $$ 181 | 182 | ##### Note 183 | 184 | $$ 185 | b_2 \stackrel{p} \to \frac{\mathrm{Cov} (x_2, y)}{\mathrm{Var} (x_2)} 186 | $$ 187 | 188 | --- 189 | 190 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao (School of Finance, Capital University of Economics and Business) 4 | 5 | 6 | # 高级计量经济学课件 7 | 8 | 作者:高强(首都经济贸易大学金融学院) 9 | 10 | --- 11 | 12 | 本课件是对日本经济学家林文夫(2000)所著《高级计量经济学》教材[^1]的补充。 13 | 14 | 国内新入学的研究生,本科阶段所接受的数学训练普遍不足。例如,微积分课程缺少对梯度、雅可比矩阵、海赛矩阵、多元泰勒展开的讲解;线性代数课程缺少对向量空间、坐标变换、特征分解、四个基本子空间、奇异值分解、二次型等重要概念的强调;概率论与数理统计两门特别重要的课程,通常被合并成一门课在一个学期讲完,造成学生普遍搞不清楚条件期望、条件方差、方差—协方差矩阵、多元正态分布、$$\chi^2$$分布、$$t$$分布、$$F$$分布、极大似然估计量、信息矩阵等基础概念;更不要说压根不安排实分析、微分方程、随机过程、动态优化等数学课了。 15 | 16 | 正是因为这一问题给面向研究生开设的《高级计量经济学》课程带来了极大的挑战,所以我尝试制作这些课件。这些课件并不能够取代本科阶段应有的数学教育,而是希望在《高级计量经济学》授课过程中,逢山开路,遇水搭桥,以“现学现用”为原则尽可能弥补课本假设学生知道所以没有讲但其实学生根本不知道的那些基础数学知识,作为一个教辅资料,帮助学生扎扎实实地学明白《高级计量经济学》教材中所有的数学推导。 17 | 18 | 练习题原本应该由学生亲自完成,这是学习取得成效的重要环节。但现实情况是,由于学生普遍基础不足,根本没有能力独立完成练习。所以本课件的另外一个组成部分是公布所有练习题的答案供学生参考。如果担心学生因此就有办法作弊,逃避亲自完成练习,那么可以采取随堂闭卷测验的方式考察学生是否确实掌握练习题的解法。希望本课件将练习题答案公布更多是帮助学生,而不是害到学生。 19 | 20 | 最后,本课件正在逐步建设中,许多内容尚处空缺,错误也在所难免,非常欢迎读者的批评指正(本书每一页边距空白处可随意点击“加号”添加评论)。 21 | 22 | ps. 
若网页中数学公式显示不正常,通常刷新一遍网页即可正常显示。这似乎是 `gitbook.com` 平台的技术性问题,或者是从国内网络访问国际网络的普遍性问题造成的,本人无法解决。 23 | 24 | # 目录 25 | 26 | ## Chapter 1 Finite-Sample Properties of OLS 27 | 28 | ### Section 1 The Classical Linear Regression Model 29 | 30 | * [Lecture Note 1.1](lecture-note/1.1.md) 31 | * [Review Question 1.1.1](question-solution/1.1.1.md) 32 | * [Review Question 1.1.2](question-solution/1.1.2.md) 33 | * [Review Question 1.1.3](question-solution/1.1.3.md) 34 | * [Review Question 1.1.4](question-solution/1.1.4.md) 35 | * [Review Question 1.1.5](question-solution/1.1.5.md) 36 | * [Review Question 1.1.6](question-solution/1.1.6.md) 37 | 38 | ### Section 2 The Algebra of Least Squares 39 | 40 | * [Lecture Note 1.2](lecture-note/1.2.md) 41 | * [Review Question 1.2.1](question-solution/1.2.1.md) 42 | * [Review Question 1.2.2](question-solution/1.2.2.md) 43 | * [Review Question 1.2.3](question-solution/1.2.3.md) 44 | * [Review Question 1.2.4](question-solution/1.2.4.md) 45 | * [Review Question 1.2.5](question-solution/1.2.5.md) 46 | * [Review Question 1.2.6](question-solution/1.2.6.md) 47 | * [Review Question 1.2.7](question-solution/1.2.7.md) 48 | * [Review Question 1.2.8](question-solution/1.2.8.md) 49 | * [Review Question 1.2.9](question-solution/1.2.9.md) 50 | 51 | ### Section 3 Finite-Sample Properties of OLS 52 | 53 | * [Review Question 1.3.1](question-solution/1.3.1.md) 54 | * [Review Question 1.3.2](question-solution/1.3.2.md) 55 | * [Review Question 1.3.3](question-solution/1.3.3.md) 56 | * [Review Question 1.3.4](question-solution/1.3.4.md) 57 | * [Review Question 1.3.5](question-solution/1.3.5.md) 58 | * [Review Question 1.3.6](question-solution/1.3.6.md) 59 | * [Review Question 1.3.7](question-solution/1.3.7.md) 60 | 61 | ### Section 4 Hypothesis Testing under Normality 62 | 63 | * [Review Question 1.4.1](question-solution/1.4.1.md) 64 | * [Review Question 1.4.2](question-solution/1.4.2.md) 65 | * [Review Question 1.4.3](question-solution/1.4.3.md) 66 | * [Review Question 1.4.4](question-solution/1.4.4.md) 67 | * [Review Question 1.4.5](question-solution/1.4.5.md) 68 | * [Review Question 1.4.6](question-solution/1.4.6.md) 69 | * [Review Question 1.4.7](question-solution/1.4.7.md) 70 | 71 | ### Section 5 Relation to Maximum Likelihood 72 | 73 | * [Review Question 1.5.1](question-solution/1.5.1.md) 74 | * [Review Question 1.5.2](question-solution/1.5.2.md) 75 | * [Review Question 1.5.3](question-solution/1.5.3.md) 76 | * [Review Question 1.5.4](question-solution/1.5.4.md) 77 | * [Review Question 1.5.5](question-solution/1.5.5.md) 78 | 79 | ### Section 6 Generalized Least Squares (GLS) 80 | 81 | * [Review Question 1.6.1](question-solution/1.6.1.md) 82 | * [Review Question 1.6.2](question-solution/1.6.2.md) 83 | * [Review Question 1.6.3](question-solution/1.6.3.md) 84 | * [Review Question 1.6.4](question-solution/1.6.4.md) 85 | 86 | ### Section 7 Application: Returns to Scale in Electricity Supply 87 | 88 | * [Lecture Note](lecture-note/1.7.md) 89 | * [Review Question 1.7.1](question-solution/1.7.1.md) 90 | * [Review Question 1.7.2](question-solution/1.7.2.md) 91 | * [Review Question 1.7.3](question-solution/1.7.3.md) 92 | * [Review Question 1.7.4](question-solution/1.7.4.md) 93 | * [Review Question 1.7.5](question-solution/1.7.5.md) 94 | * [Review Question 1.7.6](question-solution/1.7.6.md) 95 | * [Review Question 1.7.7](question-solution/1.7.7.md) 96 | * [Review Question 1.7.8](question-solution/1.7.8.md) 97 | 98 | ### Analytical Exercises 99 | 100 | * [Analytical Exercise 
1.1](exercise-solution/1.1.md) 101 | * [Analytical Exercise 1.2](exercise-solution/1.2.md) 102 | 103 | ## Chapter 2 Large-Sample Theory 104 | 105 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 106 | 107 | * [Review Question 2.1.1](question-solution/2.1.1.md) 108 | * [Review Question 2.1.2](question-solution/2.1.2.md) 109 | * [Review Question 2.1.3](question-solution/2.1.3.md) 110 | * [Review Question 2.1.4](question-solution/2.1.4.md) 111 | * [Review Question 2.1.5](question-solution/2.1.5.md) 112 | 113 | ## Supplements 114 | 115 | * [Taylor's linearization](supplements/taylor-linearization.md) 116 | * [Variance-Covariance Matrix](supplements/var-cov-matrix.md) 117 | * [Four Ways of Matrix Multiplication](supplements/matrix-multiplication.md) 118 | 119 | #### 注 120 | 121 | [^1]: Fumio Hayashi, _Econometrics_. Princeton University Press, 2000. (http://press.princeton.edu/titles/6946.html) 122 | 123 | --- 124 | 125 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/1.5.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at May 15, 2017 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 5 Relation to Maximum Likelihood 10 | 11 | ... 12 | 13 | #### Review Question 1.5.2 (Maximizing joint log likelihood) 14 | 15 | Consider maximizing (the log of) the _joint_ likelihood 16 | 17 | $$ 18 | f_{ \mathbf{y}, \mathbf{X} } ( \mathbf{y}, \mathbf{X}; \tilde{ \boldsymbol{ \zeta } } ) = 19 | f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) \cdot 20 | f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) 21 | \tag{1.5.2} 22 | $$ 23 | 24 | for the classical regression model, where $$ \tilde{ \boldsymbol{ \theta } } = ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \sigma }^2 )'$$ and $$ \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) $$ is given by 25 | 26 | $$ 27 | \log L ( \tilde{ \boldsymbol{ \beta } }, \tilde{ \sigma }^2 ) = 28 | - \frac{n}{2} \log ( 2 \pi ) - 29 | \frac{n}{2} \log ( \tilde{ \sigma }^2 ) - 30 | \frac{1}{ 2 \tilde{ \sigma }^2 } 31 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } )' 32 | ( \mathbf{y} - \mathbf{X} \tilde{ \boldsymbol{ \beta } } ). 33 | \tag{1.5.5} 34 | $$ 35 | 36 | You would parameterize the marginal likelihood $$ f( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) $$ and take the log of (1.5.2) to obtain the objective function to be maximized over $$ \boldsymbol{ \zeta } \equiv ( \boldsymbol{ \theta }', \boldsymbol{ \psi }' )' $$. (a) What is the ML estimator of $$ \boldsymbol{ \theta } \equiv ( \boldsymbol{ \beta }', \sigma^2 )' $$? (b) Derive the Cramer-Rao bound for $$ \boldsymbol{ \beta } $$. 37 | 38 | ##### Solution 39 | 40 | (a) Taking log of (1.5.2), 41 | 42 | $$ 43 | \begin{align} 44 | \log f_{ \mathbf{y}, \mathbf{X} } ( \mathbf{y}, \mathbf{X}; \tilde{ \boldsymbol{ \zeta } } ) = 45 | \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) + 46 | \log f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ). 
47 | \tag{1} 48 | \end{align} 49 | $$ 50 | 51 | The ML estimator of $$ \boldsymbol{ \theta } $$ maximizing (1) will be exactly the same as that of maximizing $$ \log f_{ \mathbf{y} \mid \mathbf{X} } ( \mathbf{y} \mid \mathbf{X}; \tilde{ \boldsymbol{ \theta } } ) $$, because $$ \tilde{ \boldsymbol{ \theta } } $$ does not appear in $$ \log f_{ \mathbf{X} } ( \mathbf{X} ; \tilde{ \boldsymbol{ \psi } } ) $$, the first-order conditions of the two maximization will be the same. 52 | 53 | (b) By the information matrix equality, 54 | 55 | $$ 56 | \begin{align} 57 | \mathbf{I} ( \boldsymbol{ \zeta } ) & = 58 | - \mathrm{E} \left[ 59 | \frac{ \partial^2 \log L ( \boldsymbol{ \zeta } ) } 60 | { \partial \tilde{ \boldsymbol{ \zeta } } \, 61 | \partial \tilde{ \boldsymbol{ \zeta } }' } 62 | \right] 63 | \\ & = 64 | - \mathrm{E} \left[ 65 | \begin{matrix} 66 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 67 | { \partial \tilde{ \boldsymbol{ \theta } } \, 68 | \partial \tilde{ \boldsymbol{ \theta } }' } 69 | & 70 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 71 | { \partial \tilde{ \boldsymbol{ \theta } } \, 72 | \partial \tilde{ \boldsymbol{ \psi } }' } 73 | \\ 74 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 75 | { \partial \tilde{ \boldsymbol{ \psi } } \, 76 | \partial \tilde{ \boldsymbol{ \theta } }' } 77 | & 78 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 79 | { \partial \tilde{ \boldsymbol{ \psi } } \, 80 | \partial \tilde{ \boldsymbol{ \psi } }' } 81 | \end{matrix} 82 | \right] 83 | \\ & = 84 | - \mathrm{E} \left[ 85 | \begin{matrix} 86 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 87 | { \partial \tilde{ \boldsymbol{ \theta } } \, 88 | \partial \tilde{ \boldsymbol{ \theta } }' } 89 | & 90 | \mathbf{0} 91 | \\ 92 | \mathbf{0} 93 | & 94 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 95 | { \partial \tilde{ \boldsymbol{ \psi } } \, 96 | \partial \tilde{ \boldsymbol{ \psi } }' } 97 | \end{matrix} 98 | \right], 99 | \end{align} 100 | $$ 101 | 102 | since $$ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) / ( \partial \tilde{ \boldsymbol{ \theta } } \, 103 | \partial \tilde{ \boldsymbol{ \psi } }' ) = \mathbf{0} $$. Thus the information matrix $$ \mathbf{I} ( \boldsymbol{ \zeta } ) $$ is block diagonal with its first block corresponding to $$ \boldsymbol{ \theta } $$ and the second block corresponding to $$ \boldsymbol{ \psi } $$. Its inverse is also block diagonal, with its first block being the inverse of 104 | 105 | $$ 106 | - \mathrm{E} \left[ 107 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta }, \boldsymbol{ \psi } ) } 108 | { \partial \tilde{ \boldsymbol{ \theta } } \, 109 | \partial \tilde{ \boldsymbol{ \theta } }' } 110 | \right] 111 | = 112 | - \mathrm{E} \left[ 113 | \frac{ \partial^2 \log L ( \boldsymbol{ \theta } ) } 114 | { \partial \tilde{ \boldsymbol{ \theta } } \, 115 | \partial \tilde{ \boldsymbol{ \theta } }' } 116 | \right]. 117 | $$ 118 | 119 | So the Cramer-Rao bound for $$ \boldsymbol{ \theta } $$ is the negative of the inverse of the expected value of (1.5.12) in the text. The expectation, however, is over $$ \mathbf{y} $$ _and_ $$ \mathbf{X} $$ because here the density is a _joint_ density. Therefore, the Cramer-Rao bound for $$ \boldsymbol{ \beta } $$ is $$ \sigma^2 [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1} $$. 
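
_Comment_: This bound involves $$ [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1} $$ rather than $$ \mathrm{E} [ ( \mathbf{X}' \mathbf{X} )^{-1} ] $$. Provided these expectations exist, a matrix version of Jensen's inequality (the inverse is a convex map on positive definite matrices) gives, under Assumptions 1.1–1.4,

$$
\mathrm{Var} ( \mathbf{b} ) = \sigma^2 \, \mathrm{E} [ ( \mathbf{X}' \mathbf{X} )^{-1} ] \ge \sigma^2 \, [ \mathrm{E} ( \mathbf{X}' \mathbf{X} ) ]^{-1}
$$

in the positive semidefinite sense, so the unconditional variance of the OLS estimator lies weakly above this bound, with equality when $$ \mathbf{X}' \mathbf{X} $$ is nonrandom.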
120 | 121 | --- 122 | 123 | Copyright ©2017 by Qiang Gao -------------------------------------------------------------------------------- /lecture-note/1.1.md: -------------------------------------------------------------------------------- 1 | # Lecture Notes on Econometrics 2 | 3 | by Qiang Gao, updated at March 26, 2018 4 | 5 | --- 6 | 7 | ## Chapter 1 Finite-Sample Properties of OLS 8 | 9 | ### Section 1 The Classical Linear Regression Model 10 | 11 | ... 12 | 13 | #### The Big Picture of Econometrics 14 | 15 | Its a bridge between **model** and **data**. Model is some theoretic mathematical formulation. Data is collected/measured from the real world according to the definition of variables. 16 | 17 | The bridge is in two ways: 18 | 19 | 1. (from data to model) Parameter Estimation. 20 | 2. (from model to data) Hypothesis Test. 21 | 22 | #### Assumption 1.1 (linearity) 23 | 24 | ##### 1. It's a tautology 25 | 26 | Because the error term $$ \varepsilon_i $$ is defined as 27 | 28 | $$ 29 | \varepsilon_i = y_i - \mathbf{x}_i \cdot \boldsymbol\beta, 30 | $$ 31 | 32 | the equality in (1.1.1) trivially holds true by definition. 33 | 34 | Equation (1.1.1) only restricts a linear functional relationship between $$y$$ and $$ \mathbf{x} $$, nothing more. 35 | 36 | ##### 2. Nonlinearity can be linearized 37 | 38 | The linearity assumption is not so much restrictive, because any nonlinear function can be easily [linearized](../supplements/taylor-linearization.md). 39 | 40 | ##### 3. The knowns are $$( y_i, \mathbf{x}_i $$) and the unknowns are $$( \boldsymbol\beta, \varepsilon_i )$$ 41 | 42 | ##### 4. $$ \boldsymbol\beta $$ is of primary interest 43 | 44 | $$ \beta $$ means marginal separate effects. 45 | 46 | ##### 5. $$ \varepsilon_i $$ is of primary concern 47 | 48 | $$\varepsilon_i$$ should not depend on $$ \mathbf{x} $$ 49 | 50 | ##### 6. marginal separate effect relies on total differentiation 51 | 52 | - explicit equation 53 | - implicit equation 54 | - differential vs. elasticity 55 | 56 | ##### 7. variables are usually transformed (in log) 57 | 58 | By the rules of differentiation 59 | 60 | $$ 61 | \frac{d \ln x}{dx} = \frac{1}{x}, 62 | $$ 63 | 64 | we can write it in total differential form as 65 | 66 | $$ 67 | d \ln x = \frac{dx}{x}. 68 | $$ 69 | 70 | Similarly, 71 | 72 | $$ 73 | d \ln y = \frac{dy}{y}. 74 | $$ 75 | 76 | So 77 | 78 | $$ 79 | \frac{d \ln y}{d \ln x} = \frac{d y / y}{d x / x} 80 | $$ 81 | 82 | coincides with the definition of elasticity. It is of this reason that 83 | in economics, variables are often expressed in logs rather than in 84 | levels in equations. 85 | 86 | #### Assumption 1.2 (strict exogeneity) 87 | 88 | ##### Joint Distribution 89 | 90 | $$ 91 | f_{Y,X}(y, x) \qquad \oint f_{Y,X}(y, x)\,dx\,dy = 1 92 | $$ 93 | 94 | ##### Marginal Distribution 95 | 96 | $$ 97 | \begin{align} 98 | f_{Y} (y) \equiv \oint f_{Y,X}(y, x) \, dx && \oint f_{Y} (y) \, dy = 1 \\ 99 | f_{X} (x) \equiv \oint f_{Y,X}(y, x) \, dy && \oint f_{X} (x) \, dx = 1 100 | \end{align} 101 | $$ 102 | 103 | ##### Conditional Distribution 104 | 105 | ##### (Unconditional) Expectation 106 | 107 | The (unconditional) expectation $$\mathrm{E}(x)$$ is defined as 108 | 109 | $$ 110 | \mathrm{E}(x) = \int x f(y, x) \, dy dx 111 | $$ 112 | 113 | ##### Conditional Expectation 114 | 115 | If $$(y, x)$$ are jointly distributed random variables, where their joint p.d.f. 
is expressed as $$f(y, x)$$, then $$\mathrm{E} (y | x)$$ is defined 116 | as 117 | 118 | $$ 119 | \mathrm{E} (y|x) = \int_{-\infty}^{+\infty} y \frac{ f(y, x) }{ \int_{-\infty}^{+\infty} f(y, x) dy } dy, 120 | $$ 121 | 122 | where $$\int_{-\infty}^{+\infty} f(y, x) dy$$ is the definition of the marginal distribution of $$x$$. In words, the expectation of $$y$$ conditional on $$x$$ is the weighted average of $$y$$, where the weighting is the conditional probability density. 123 | 124 | ##### Law of Total Expectations 125 | 126 | $$ 127 | \mathrm{E} ( \mathrm{E} (y | x) ) = \mathrm{E} (y). 128 | $$ 129 | 130 | ##### Law of Iterated Expectations 131 | 132 | $$ 133 | \mathrm{E} ( \mathrm{E} (y | x, z) | z ) = \mathrm{E} (y | z). 134 | $$ 135 | 136 | ##### Moment 137 | 138 | The $$k$$-th order moment of a random variable $$x$$ is defined as 139 | 140 | $$ 141 | \mathrm{E}(x^k) 142 | $$ 143 | 144 | ##### Variance 145 | 146 | $$ 147 | \begin{align} 148 | \mathrm{Var}(x) &= \mathrm{E} [ (x - \mathrm{E} (x))^2 ] && \text{(definition)} \\ 149 | &= \mathrm{E}(x^2)- E(x)^2 && \text{(formula)} 150 | \end{align} 151 | $$ 152 | 153 | ##### Covariance 154 | 155 | $$ 156 | \begin{align} 157 | \mathrm{Cov} (x, y) &= \mathrm{E} [ (x - \mathrm{E} (x) )( y - \mathrm{E} (y)) ] && \text{(definition)} \\ 158 | &= \mathrm{E} (xy) - \mathrm{E}(x) \mathrm{E} (y) && \text{(formula)} 159 | \end{align} 160 | $$ 161 | 162 | ##### Correlation Coefficient 163 | 164 | $$ 165 | \rho_{x,y} = \frac{\mathrm{Cov} (x,y)}{\sqrt {\mathrm{Var} (x) \mathrm{Var} (y)}} \in [-1, 1] 166 | $$ 167 | 168 | ##### Linearity of Expectation 169 | 170 | $$ 171 | \mathrm{E} (ax + b) = a \mathrm{E} (x) + b 172 | $$ 173 | 174 | ##### Nonlinearity of Variance 175 | 176 | $$ 177 | \mathrm{Var} (ax + b) = a^2 \mathrm{Var} (x). 178 | $$ 179 | 180 | #### Assumption 1.3 (no multicollinearity) 181 | 182 | - perfect multicollinearity _can_ occur in rare conditions as long as its _measure_ is zero. 183 | 184 | #### Assumption 1.4 (spherical error variance) 185 | 186 | $$ 187 | \mathbf{x} \mathbf{x}' 188 | \equiv 189 | \begin{bmatrix} 190 | x_1^2 & \cdots & x_1 x_n \\ 191 | \vdots & \ddots & \vdots \\ 192 | x_n x_1 & \cdots & x_n^2 193 | \end{bmatrix} 194 | $$ 195 | 196 | $$ 197 | \mathrm{E} 198 | \begin{bmatrix} 199 | a_{11} & \cdots & a_{n1} \\ 200 | \vdots & \ddots & \vdots \\ 201 | a_{m1} & \cdots & a_{mn} 202 | \end{bmatrix} 203 | \equiv 204 | \begin{bmatrix} 205 | \mathrm{E} (a_{11}) & \cdots & \mathrm{E} (a_{n1}) \\ 206 | \vdots & \ddots & \vdots \\ 207 | \mathrm{E} (a_{m1}) & \cdots & \mathrm{E} (a_{mn}) 208 | \end{bmatrix}, 209 | $$ 210 | 211 | $$ 212 | \mathrm{Var} ( \mathbf{x} ) 213 | \equiv 214 | \mathrm{E} [ ( \mathbf{x} - \overline{\mathbf{x}} ) ( \mathbf{x} - \overline{\mathbf{x}} )' ] 215 | $$ 216 | 217 | --- 218 | 219 | Copyright ©2018 by Qiang Gao -------------------------------------------------------------------------------- /question-solution/2.1.2.md: -------------------------------------------------------------------------------- 1 | # Solution to Review Question 2 | 3 | by Qiang Gao, updated at June 8, 2018 4 | 5 | --- 6 | 7 | ## Chapter 2 Large-Sample Theory 8 | 9 | ### Section 1 Review of Limit Theorems for Sequences of Random Variables 10 | 11 | ... 
12 | 13 | #### Review Question 2.1.2 (Alternative definition of convergence for vector sequences) 14 | 15 | (a) Verify that the definition in the text of “$$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$” is equivalent to 16 | 17 | $$ 18 | \lim_{n \to \infty} \operatorname{E} 19 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] 20 | = 0. 21 | $$ 22 | 23 | **Hint**: $$ \operatorname{E} [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] = 24 | \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 25 | \operatorname{E} [ ( z_{nK} - z_K )^2 ] $$, where $$K$$ is the dimension of $$\mathbf{z}$$. 26 | 27 | (b) Similarly, verify that the definition in the text of “$$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$” is equivalent to 28 | 29 | $$ 30 | \lim_{n \to \infty} \operatorname{Prob} 31 | \left( 32 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 33 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) > 34 | \varepsilon 35 | \right) = 0, 36 | $$ 37 | 38 | for any $$ \varepsilon > 0 $$. 39 | 40 | ##### Solution 41 | 42 | (a) If $$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$, by definition in the text, 43 | 44 | $$ 45 | \lim_{n \to \infty} \operatorname{E} 46 | [ ( z_{nk} - z_k )^2 ] = 0, 47 | \text{ for $k = 1, \ldots, K$}, 48 | $$ 49 | 50 | then we can show 51 | 52 | $$ 53 | \begin{align} 54 | & \color{white}{=} \lim_{n \to \infty} \operatorname{E} 55 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] \\ 56 | & = \lim_{n \to \infty} 57 | \left( 58 | \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 59 | \operatorname{E} [ ( z_{nK} - z_K )^2 ] 60 | \right) \\ 61 | & = \lim_{n \to \infty} \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 62 | \lim_{n \to \infty} \operatorname{E} [ ( z_{nK} - z_K )^2 ] \\ 63 | & = 0. 64 | \end{align} 65 | $$ 66 | 67 | Conversely, if $$ \lim_{n \to \infty} \operatorname{E} 68 | [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] = 0 $$, then 69 | 70 | $$ 71 | \lim_{n \to \infty} \operatorname{E} [ ( z_{n1} - z_1 )^2 ] + \cdots + 72 | \lim_{n \to \infty} \operatorname{E} [ ( z_{nK} - z_K )^2 ] = 0. 73 | \tag{1} 74 | $$ 75 | 76 | Because each term in (1) is non-negative, (1) implies 77 | 78 | $$ 79 | \lim_{n \to \infty} \operatorname{E} 80 | [ ( z_{nk} - z_k )^2 ] = 0, 81 | \text{ for $k = 1, \ldots, K$}, 82 | $$ 83 | 84 | which means $$ z_{nk} \to_{m.s.} z_k $$ for $$ k = 1, \ldots, K $$, and this implies $$ \mathbf{z}_n \to_{m.s.} \mathbf{z} $$ by definition in the text. 
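
As a small numerical illustration of part (a) (an added sketch, not part of the original solution, assuming Python with NumPy), take $$ \mathbf{z}_n = \mathbf{z} + \mathbf{u}_n / \sqrt{n} $$ with $$ \mathbf{u}_n $$ standard normal and $$ \mathbf{z} $$ constant for simplicity: the componentwise mean-square errors and the joint quantity $$ \operatorname{E} [ ( \mathbf{z}_n - \mathbf{z} )'( \mathbf{z}_n - \mathbf{z} ) ] $$ shrink to zero together.

```python
import numpy as np

rng = np.random.default_rng(0)
K, reps = 3, 200_000
z = np.array([1.0, -2.0, 0.5])            # limit vector (constant here)

for n in (10, 100, 1000):
    u = rng.standard_normal((reps, K))
    z_n = z + u / np.sqrt(n)              # z_n = z + u_n / sqrt(n)
    sq = (z_n - z) ** 2
    componentwise = sq.mean(axis=0)       # estimates of E[(z_nk - z_k)^2], each about 1/n
    joint = sq.sum(axis=1).mean()         # estimate of E[(z_n - z)'(z_n - z)], about K/n
    print(n, componentwise.round(4), round(joint, 4))
```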
85 | 86 | (b) Similarly, if $$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$, by definition in the text, for any $$ \varepsilon > 0 $$, 87 | 88 | $$ 89 | \operatorname{Prob} 90 | \left( 91 | | z_{nk} - \alpha_k | > \varepsilon 92 | \right) \to 0, 93 | \text{ for $k = 1, \ldots, K$}, 94 | $$ 95 | 96 | then we can show 97 | 98 | $$ 99 | \begin{gather} 100 | \operatorname{Prob} 101 | \left( 102 | ( z_{nk} - \alpha_k )^2 > \varepsilon^2 103 | \right) \to 0, 104 | \text{ for $k = 1, \ldots, K$}, 105 | && \text{(equivalent event)} \\ 106 | 107 | \operatorname{Prob} 108 | \left( 109 | ( z_{nk} - \alpha_k )^2 > \varepsilon 110 | \right) \to 0, 111 | \text{ for $k = 1, \ldots, K$}, 112 | && \text{(any $\varepsilon > 0$)} \\ 113 | 114 | \operatorname{Prob} 115 | \left( 116 | ( z_{n1} - \alpha_1 )^2 > \varepsilon 117 | \right) + \cdots + 118 | \operatorname{Prob} 119 | \left( 120 | ( z_{nK} - \alpha_K )^2 > \varepsilon 121 | \right) \to 0, 122 | && \text{(limit sum)} \\ 123 | 124 | \left( 125 | \not\Rightarrow 126 | \operatorname{Prob} 127 | \left( 128 | (z_{n1} - \alpha_1)^2 + \cdots + (z_{nK} - \alpha_K)^2 > K \varepsilon 129 | \right) \to 0 130 | \right) 131 | && \text{(bigger event)} \\ 132 | 133 | \operatorname{Prob} 134 | \left( 135 | \bigcup_{k=1}^K 136 | \left( 137 | ( z_{nk} - \alpha_k )^2 > \varepsilon 138 | \right) 139 | \right) \to 0, 140 | && \text{(smaller event)} \\ 141 | 142 | \operatorname{Prob} 143 | \left( 144 | \bigcap_{k=1}^K 145 | \left( 146 | ( z_{nk} - \alpha_k )^2 \leq \varepsilon 147 | \right) 148 | \right) \to 1, 149 | && \text{(complement event)} \\ 150 | 151 | \operatorname{Prob} 152 | \left( 153 | ( z_{n1} - \alpha_1 )^2 + \cdots + 154 | ( z_{nK} - \alpha_K )^2 \leq K \varepsilon 155 | \right) \to 1, 156 | && \text{(bigger event)} \\ 157 | 158 | \operatorname{Prob} 159 | \left( 160 | ( z_{n1} - \alpha_1 )^2 + \cdots + 161 | ( z_{nK} - \alpha_K )^2 \leq \varepsilon 162 | \right) \to 1, 163 | && \text{(any $\varepsilon > 0$)} \\ 164 | 165 | \operatorname{Prob} 166 | \left( 167 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 168 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) 169 | \leq \varepsilon 170 | \right) \to 1, 171 | && \text{(matrix notation)} \\ 172 | 173 | \operatorname{Prob} 174 | \left( 175 | ( \mathbf{z}_n - \boldsymbol{\alpha} )' 176 | ( \mathbf{z}_n - \boldsymbol{\alpha} ) 177 | > \varepsilon 178 | \right) \to 0. 179 | && \text{(complement event)} 180 | \end{gather} 181 | $$ 182 | 183 | If $$ \operatorname{Prob} \left( ( \mathbf{z}_n - \boldsymbol{\alpha} )' ( \mathbf{z}_n - \boldsymbol{\alpha} ) > \varepsilon \right) \to 0 $$, then we can show 184 | 185 | $$ 186 | \begin{gather} 187 | \operatorname{Prob} 188 | \left( 189 | ( z_{n1} - \alpha_1 )^2 + \cdots + 190 | ( z_{nK} - \alpha_K )^2 > \varepsilon 191 | \right) \to 0, 192 | && \text{(scalar notation)} \\ 193 | 194 | \operatorname{Prob} 195 | \left( 196 | \bigcup_{k=1}^K 197 | \left( 198 | ( z_{nk} - \alpha_k )^2 > \varepsilon 199 | \right) 200 | \right) \to 0, 201 | && \text{(smaller event)} \\ 202 | 203 | \operatorname{Prob} 204 | \left( 205 | ( z_{nk} - \alpha_k )^2 > \varepsilon 206 | \right) \to 0 207 | \text{ for $k = 1, \ldots, K$ }, 208 | && \text{(smaller events)} 209 | \end{gather} 210 | $$ 211 | 212 | which means $$ z_{nk} \to_p \alpha_k $$ for $$ k = 1, \ldots, K $$, and this implies $$ \mathbf{z}_n \to_p \boldsymbol{\alpha} $$. 213 | 214 | --- 215 | 216 | Copyright ©2018 by Qiang Gao --------------------------------------------------------------------------------