We observe equilibrium price: \(p_i = \frac{\alpha_d - \alpha_s + \epsilon_i - \mu_i}{\beta_s - \beta_d}\) which is a priori correlated with \(\epsilon_i\)
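A small simulation can make this correlation concrete. The sketch below assumes a linear demand curve \(q = \alpha_d + \beta_d p + \epsilon\) and a linear supply curve \(q = \alpha_s + \beta_s p + \mu\), which are the forms implied by the equilibrium price above. All numeric parameter values are illustrative assumptions, not taken from the course material.

```r
# Simulated market; all parameter values are illustrative assumptions.
set.seed(42)
n <- 10000
alpha_d <- 10; beta_d <- -1   # demand: q = alpha_d + beta_d * p + eps
alpha_s <- 2;  beta_s <- 1    # supply: q = alpha_s + beta_s * p + mu
eps <- rnorm(n)               # demand shock
mu  <- rnorm(n)               # supply shock

# Equilibrium price (formula from the text) and the implied quantity
p <- (alpha_d - alpha_s + eps - mu) / (beta_s - beta_d)
q <- alpha_d + beta_d * p + eps

# Price is correlated with the demand shock ...
cor_p_eps <- cor(p, eps)
# ... so regressing quantity on price does not recover the demand slope
ols_slope <- unname(coef(lm(q ~ p))["p"])

cat("cor(p, eps):", round(cor_p_eps, 3), "\n")
cat("OLS slope:", round(ols_slope, 3), "| true demand slope:", beta_d, "\n")
```

With equal shock variances and these particular slopes, the OLS slope ends up close to zero, which is neither the demand slope nor the supply slope.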
The estimated model is: \(Y = \alpha_0 + \alpha_1 X_1 + u\), where \(u = \beta_2 X_2 + \epsilon\) and \(X_2\) can be expressed as \(X_2 = \gamma_0 + \gamma_1 X_1 + v\) with \(E[v | X_1] = 0\)
Key insight: \(X_1\) is correlated with the error term of the estimated model: \(u = \beta_2 \gamma_0 + \beta_2 \gamma_1 X_1 + \beta_2 v + \epsilon\) as long as \(\beta_2 \gamma_1 \neq 0\)
Sign of the bias depends on the sign of \(\beta_2 \gamma_1\)
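The size and sign of this bias can be checked on simulated data. In the sketch below, the true model is assumed to be \(Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon\) (the label \(\beta_1\) and all numeric values are assumptions made for illustration). Omitting \(X_2\) should shift the estimated coefficient on \(X_1\) by roughly \(\beta_2 \gamma_1\).

```r
# Omitted variable bias on simulated data; parameter values are illustrative.
set.seed(1)
n <- 10000
beta_1 <- 1; beta_2 <- 2         # true coefficients on X1 and X2
gamma_0 <- 0.3; gamma_1 <- 0.5   # X2 = gamma_0 + gamma_1 * X1 + v

x1 <- rnorm(n)
x2 <- gamma_0 + gamma_1 * x1 + rnorm(n)          # v is independent of X1
y  <- 1 + beta_1 * x1 + beta_2 * x2 + rnorm(n)

short_fit <- lm(y ~ x1)        # estimated model: X2 omitted
long_fit  <- lm(y ~ x1 + x2)   # model including X2

a1_short <- unname(coef(short_fit)["x1"])
b1_long  <- unname(coef(long_fit)["x1"])

cat("Short regression coefficient on x1:", round(a1_short, 3), "\n")
cat("Long regression coefficient on x1: ", round(b1_long, 3), "\n")
cat("beta_1 + beta_2 * gamma_1:         ", beta_1 + beta_2 * gamma_1, "\n")
```

Here \(\beta_2 \gamma_1 > 0\), so the short regression overestimates the effect of \(X_1\); flipping the sign of either parameter would reverse the direction of the bias.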
Estimated coefficient: \(E(\hat{\beta}|\hat{x}) = \frac{Cov(\hat{x},y)}{Var(\hat{x})} = \beta + \frac{Cov(\hat{x},\mu)}{Var(\hat{x})}\), and \(Cov(\hat{x},\mu) = 0\) by construction
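The two stages can be sketched by hand with lm and predict. The example below uses simulated data in which the instrument z, the first-stage coefficients and the true \(\beta\) are all illustrative assumptions; it compares naive OLS with the second-stage estimate and checks that \(\hat{x}\) is (almost) uncorrelated with the structural error in the sample.

```r
# Manual two-stage least squares on simulated data (illustrative values).
set.seed(123)
n <- 10000
beta <- 2
z  <- rnorm(n)                               # instrument, independent of mu
mu <- rnorm(n)                               # structural error
x  <- 0.5 + 1.0 * z + 0.8 * mu + rnorm(n)    # endogenous: depends on mu
y  <- 1 + beta * x + mu

# Naive OLS is biased because Cov(x, mu) != 0
ols_beta <- unname(coef(lm(y ~ x))["x"])

# First stage: project x on the instrument and keep the fitted values
first_stage <- lm(x ~ z)
x_hat <- predict(first_stage)

# Second stage: regress y on the fitted values
second_stage <- lm(y ~ x_hat)
tsls_beta <- unname(coef(second_stage)["x_hat"])

# x_hat only depends on z, so its sample covariance with mu is close to zero
cov_xhat_mu <- cov(x_hat, mu)

cat("OLS estimate: ", round(ols_beta, 3), "\n")
cat("2SLS estimate:", round(tsls_beta, 3), "\n")
cat("Cov(x_hat, mu):", round(cov_xhat_mu, 4), "\n")
```

The point estimate from the second stage is the 2SLS estimate, but the standard errors printed by this second lm call are not the correct 2SLS standard errors, since they ignore that \(\hat{x}\) is itself estimated.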
The ivreg function
The function ivreg from the package AER estimates an instrumental variable regression in one step. The syntax is as follows:
ivreg(y ~ x1 + x2 | z1 + z2, data = dat)
where y is the dependent variable, x1 and x2 are the regressors of the model of interest (some of which may be endogenous), and z1 and z2 are the instruments. Any exogenous regressor in x must also be listed after the | so that it serves as its own instrument.
The function will automatically perform the two stages of the regression and return the estimated coefficients with the appropriate standard errors
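As a rough illustration of this syntax, the sketch below simulates a data set with one endogenous regressor (x1), one exogenous regressor (x2) and one excluded instrument (z1). The data-generating process and all coefficient values are assumptions made for the example. Note that x2 appears on both sides of the |. The call is guarded so that the script still runs if AER is not installed.

```r
# Simulated data: x1 is endogenous, x2 is exogenous, z1 is the instrument.
# All coefficients below are illustrative assumptions.
set.seed(2024)
n  <- 2000
z1 <- rnorm(n)
x2 <- rnorm(n)
u  <- rnorm(n)                                     # structural error
x1 <- 0.8 * z1 + 0.5 * x2 + 0.7 * u + rnorm(n)     # correlated with u
y  <- 1 + 1.5 * x1 - 1 * x2 + u
dat <- data.frame(y, x1, x2, z1)

if (requireNamespace("AER", quietly = TRUE)) {
  # Regressors before the |, instruments after it (x2 instruments itself)
  iv_fit <- AER::ivreg(y ~ x1 + x2 | z1 + x2, data = dat)
  print(summary(iv_fit))
} else {
  message("Package 'AER' is not installed; run install.packages('AER') first.")
}
```

If the fit runs, the coefficient on x1 should be close to the value of 1.5 used in the simulation, whereas lm(y ~ x1 + x2, data = dat) would be biased upwards.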
Replication of Acemoglu et al. (2001)
Summary of the paper
Daron Acemoglu, Simon Johnson and James Robinson received the 2024 Nobel Prize for their work on understanding the differences in prosperity between nations
Their key contribution is to study the role of institutions in economic development, and to show that good institutions are a key driver of economic growth. You can find their Nobel Prize lecture here
Institutions include formal rules (e.g. property rights, rule of law) and informal constraints (e.g. social norms, culture)
In 2001, they published a very influential paper: “The Colonial Origins of Comparative Development” in the American Economic Review
They show that the heterogeneous way former European colonizers set up institutions in their colonies explains a large share of the observed disparity in their current economic performance
OLS regression
As a first step, the authors study the relationship between economic development (log GDP per capita in 1995) and institutional strength (average protection against expropriation risk in 1985-1995) using ordinary least squares (OLS) on a sample of 64 former colonies. \[
\log y_i = \mu + \alpha R_i + \mathbf{X}_i'\gamma + \epsilon_i
\]
Discuss the Gauss-Markov assumptions in that context
Download the dataset and describe the data
Replication of Figure 2: create a scatter plot of log GDP per capita in 1995 against average expropriation risk in 1985-1995
The result of the main OLS specification is given in columns 2, 5 and 6 of Table 2. Interpret the results
Replicate these three regressions and export the results to a LaTeX table using stargazer
IV strategy
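To address the endogeneity of institutions, the authors instrument institutional quality with the mortality rate of settlers at the start of colonization. The full model they have in mind is: \[
\begin{aligned}
\log y_i &= \mu + \alpha R_i + \mathbf{X}_i'\gamma + \epsilon_i \\
R_i &= \lambda_R + \beta_R C_i + \mathbf{X}_i'\gamma_R + \nu_{Ri} \\
C_i &= \lambda_C + \beta_C S_i + \mathbf{X}_i'\gamma_C + \nu_{Ci} \\
S_i &= \lambda_S + \beta_S \log M_i + \mathbf{X}_i'\gamma_S + \nu_{Si}
\end{aligned}
\]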
where \(y_i\) is GDP per capita in 1995, \(R_i\) is the average expropriation risk in 1985-1995, \(C_i\) is a measure of early institutional development, \(S_i\) is a measure of European settlement, and \(M_i\) is the mortality rate of settlers.
The instrument chosen for \(R\) is \(\log(M)\). Why not \(C\) or \(S\)?
What are the assumptions needed for \(\log(M)\) to be a valid instrument?
Replicate Figure 3. Do you think settler mortality is a valid instrument?
Table 4, panel B (column 2) presents this first stage more formally. How do you interpret it?
The result of the full IV is presented in panel A (column 2).
The authors state: “measurement error is likely to be more important than reverse causality and omitted variable bias”. Do you agree with this statement? Why?
Interpret the coefficient in terms of causal effect.
Replicate the first and second stage regressions using the lm and predict functions
Run the full IV regression using ivreg. Compare the results