8 Instrumental Variables (II)

8.1 Lecture review and reading

“We describe our lives, our big choices, in narratives that blend conviction and luck, articulate plan and stumbling, seizing opportunity and being in the wrong place at the wrong time. We may be less than candid with ourselves and others in these narratives, but few would deny a role to luck.”

The quote above, found in Paul Rosenbaum’s excellent “Observation and Experiment” book (p. 258), nicely illustrates how many IV designs come to be. While we may not believe that the treatment that we care about has been randomly assigned, we hope to find an instrument (luck) that means that some part of the variation in that treatment can be thought of as random. In lecture this week we went over a number of examples of both good and bad candidates for plausible instrumental variables, many of which are featured in greater detail on this week’s reading list. In addition to the papers we study below, I suggest that you take a look at this paper, which uses rainfall as an instrument for participation in political protests, and this paper, which uses mayoral elections as an instrument for spending on police officers.

In addition, some of the more mechanical detail on two-stage least-squares estimation that we covered in the lecture can be found on pages 173-186 of MHE, and there is a nice discussion of the exclusion restriction (and the difficulty of testing that assumption empirically) on pages 301-303 of the Morgan and Winship textbook. Finally, if you are looking for a lighter read (and don’t mind spending some money on another causal inference book), I would really recommend the entire chapter on IV in the Rosenbaum book that I mentioned above. It is notable because it features a rare combination of clear explanation, detailed examples, and funny anecdotes. Not something you say every day about an econometrics book.

8.2 Seminar

8.2.1 Data

We will use two datasets this week:

  1. Acemoglu, Johnson, and Robinson (2001)
  2. Dinas et al. (2018) – Although we used data from this paper previously (week 5), you will need to download this alternative dataset now, as I have reformatted it and added an additional variable which we will be using this week to replicate the IV section of their paper.

8.2.2 Do institutions cause growth? – Acemoglu, Johnson, and Robinson (2001)

The Acemoglu, Johnson, and Robinson (hereafter, AJR) paper, “The Colonial Origins of Comparative Development: An Empirical Investigation”, is a classic example of instrumental variables estimation in the social sciences. AJR propose to study the relationship between a country’s institutions (measured with an index relating to the strength of property rights) and that country’s level of economic development (measured as modern-day log GDP per capita) using settler mortality rates from the 17th, 18th, and early 19th centuries as an instrument.1

In terms of the notation we have been using in lecture, we can clarify the intuition behind AJR’s proposed strategy by using the following simplified notation:

  • Instrument (\(Z_i \in \{1,0\}\)) – Settler mortality in the 17th, 18th and early 19th centuries (1 if settler mortality was high, 0 if settler mortality was low)
  • Treatment (\(D_i\)) – Strong modern property institutions (1 if strong, 0 if weak)
  • Outcome (\(Y_i\)) – Modern day wealth (GDP)

In the simplest terms, AJR use instrumental variables to estimate the effect of \(D_i\) on \(Y_i\) by instrumenting \(D_i\) with \(Z_i\). They find that strong modern property rights institutions cause higher GDP per capita.
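As a worked illustration of the simplified binary setup above: with a binary instrument, the IV estimand can be written as a ratio of two differences in means (the Wald estimator). The symbol \(\tau_{IV}\) is notation introduced here for convenience and does not come from AJR.

\[
\tau_{IV} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]}
\]

The numerator is the effect of the instrument on the outcome and the denominator is the effect of the instrument on the treatment. Whether this ratio can be read as a causal effect for compliers depends on the assumptions discussed in Question 1.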

Question 1. Assumptions

Name the four main assumptions underpinning instrumental variables as a strategy for identifying causal effects in the potential outcomes framework (i.e. name the assumptions needed to interpret the IV estimate as a LATE for compliers). Write out each assumption, and in your own words, interpret each assumption with regard to the specific setup of AJR’s study. Finally, discuss the plausibility of each assumption.

Question 2. Replication

ajr.csv is a dataset with observations for 64 countries, including information on the following variables:

  1. GDP – log GDP per capita (adjusted for inflation, in 1995 US dollars)
  2. Exprop – average protection against expropriation risk (a continuous variable, and a proxy for institutions)
  3. Mort – settler mortality (measured as the number of individuals who died per 1000 people)
  4. logMort – log settler mortality (the logged version of the Mort variable)
  5. Latitude – the latitude of the country
  6. Latitude2 – the latitude of the country, squared
  7. Africa – dummy for Africa
  8. Asia – dummy for Asia
  9. Namer – dummy for North America
  10. Samer – dummy for South America
  11. Neo – dummy for Neo-Europe

a. Estimate the effect of Exprop on GDP in two ways using regression. First, run a simple bivariate linear regression of those two variables, not including any other covariates. Second, run a multiple linear regression, including covariates for Africa, Asia, Namer and Samer. Interpret the direction and statistical significance of the estimated coefficient for Exprop from both models. Are these likely to be good estimates of the causal quantity of interest? Why, or why not?

b. Use the same two regression approaches as in the question above, but this time estimate the effect of logMort on GDP. Interpret the direction and statistical significance of the estimate of the causal effect. What does this “reduced form” model estimate? Under what conditions can we interpret this result as causal?
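To fix ideas, the reduced form and the first stage can be written as two linear regressions on the instrument. The coefficient names (\(\rho\), \(\pi\), \(\alpha\), \(\gamma\)) and error terms are notation assumed here for illustration, with \(X_i\) standing for any included covariates.

\[
\begin{aligned}
Y_i &= \alpha_0 + \rho Z_i + X_i'\gamma_0 + \varepsilon_i && \text{(reduced form)} \\
D_i &= \alpha_1 + \pi Z_i + X_i'\gamma_1 + \nu_i && \text{(first stage)}
\end{aligned}
\]

With a single instrument, the IV estimate of the effect of \(D_i\) on \(Y_i\) is the ratio of the two instrument coefficients, \(\hat{\beta}_{IV} = \hat{\rho} / \hat{\pi}\), provided that \(\hat{\pi} \neq 0\).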

c. Estimate the first stage model for this problem. Remember, the first stage model should be a regression of \(D_i\) on \(Z_i\), potentially controlling for \(X_i\) (some covariates). Estimate these first stage models twice: once with and once without covariates (use the covariates that we used in the question above). Conduct an F-test for the first stage, for both models. Are you satisfied that settler mortality is sufficiently strong to serve as an instrument? (Note: you will need to use the waldtest function in the lmtest package to conduct the F-tests. To conduct an F-test of a model without covariates, you just need to run waldtest(model_name). This will produce the F-statistic comparing your model to a model only including the intercept, the “null” model. For the model with covariates, the relevant comparison is instead against the same model with the instrument left out, so that the test isolates the contribution of the instrument.)
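The following is a minimal sketch of the first-stage F-tests, run on simulated data rather than on ajr.csv. The variable names Z, D and X and all numeric values are made up for illustration; the base R anova comparisons are shown alongside the waldtest calls, which only run if lmtest is installed.

```r
# Simulated data only: Z, D and X are stand-ins, not the AJR variables.
set.seed(1)
n <- 200
Z <- rnorm(n)                          # instrument (plays the role of logMort)
X <- rbinom(n, 1, 0.5)                 # one covariate (plays the role of a region dummy)
D <- 1 - 0.6 * Z + 0.5 * X + rnorm(n)  # treatment (plays the role of Exprop)
sim <- data.frame(D, Z, X)

# First stage without covariates: regress the treatment on the instrument.
fs_simple <- lm(D ~ Z, data = sim)

# First stage with covariates, and the same model with the instrument left out.
fs_full <- lm(D ~ Z + X, data = sim)
fs_restricted <- lm(D ~ X, data = sim)

# F-test against the intercept-only ("null") model.
f_simple <- anova(lm(D ~ 1, data = sim), fs_simple)$F[2]

# F-test for the instrument alone, keeping the covariate in both models.
f_partial <- anova(fs_restricted, fs_full)$F[2]

print(c(f_simple = f_simple, f_partial = f_partial))

# The same comparisons using lmtest, if the package is available.
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::waldtest(fs_simple))
  print(lmtest::waldtest(fs_full, fs_restricted))
}
```

With a single instrument, each F-statistic equals the squared t-statistic on the instrument in the corresponding first stage. A commonly cited rule of thumb treats a first-stage F-statistic below 10 as a warning sign of a weak instrument.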

d. Use the ivreg function in the AER package to estimate the local average treatment effect (LATE) for compliers of Exprop on GDP, with logMort used as the instrument. Again, do this twice: once with, and once without, covariates (remember, the same covariates need to be included in both the first- and second-stage models). Interpret the estimated LATE coefficient from both models. Additionally, retrieve the F-statistics for the first stage models estimated by the ivreg function (see the help file for the ivreg summary function (?summary.ivreg) for instructions on how to do this). Do they match the first-stage F-statistics that you calculated above?
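A minimal sketch of the mechanics, again on simulated data rather than ajr.csv. All variable names and values below are assumptions made for illustration (the simulated effect of D on Y is set to 1, and U is an unobserved confounder). The by-hand calculations use base R only; the ivreg call runs only if AER is installed.

```r
# Simulated data only: none of these variables come from the AJR dataset.
set.seed(2)
n <- 500
U <- rnorm(n)                                     # unobserved confounder
Z <- rnorm(n)                                     # instrument
X <- rbinom(n, 1, 0.5)                            # observed covariate
D <- 1 - 0.6 * Z + 0.5 * X + U + rnorm(n)         # treatment
Y <- 2 + 1.0 * D + 0.3 * X - 1.5 * U + rnorm(n)   # outcome
sim <- data.frame(Y, D, Z, X)

# 2SLS by hand: fitted values from the first stage replace D in the second stage.
# This gives the point estimate only; the standard errors from this lm() are not valid.
sim$D_hat <- fitted(lm(D ~ Z + X, data = sim))
tsls_by_hand <- unname(coef(lm(Y ~ D_hat + X, data = sim))["D_hat"])

# Equivalent ratio: reduced-form coefficient over first-stage coefficient.
rf_over_fs <- unname(coef(lm(Y ~ Z + X, data = sim))["Z"] /
                     coef(lm(D ~ Z + X, data = sim))["Z"])

# Without covariates, the IV estimate reduces to cov(Y, Z) / cov(D, Z).
wald_ratio <- cov(sim$Y, sim$Z) / cov(sim$D, sim$Z)

print(c(tsls_by_hand = tsls_by_hand, rf_over_fs = rf_over_fs, wald_ratio = wald_ratio))

# The same model with AER::ivreg; covariates appear on both sides of the "|".
if (requireNamespace("AER", quietly = TRUE)) {
  iv_fit <- AER::ivreg(Y ~ D + X | Z + X, data = sim)
  # diagnostics = TRUE adds the weak-instruments (first-stage F) test to the output.
  print(summary(iv_fit, diagnostics = TRUE))
}
```

The formula passed to ivreg lists the second-stage regressors before the vertical bar and the instrument plus the exogenous covariates after it, which is why X appears twice.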

8.3 Homework

8.3.1 Refugees and support for the far right – Dinas et al. (2018)

In this homework we will revisit the paper by Dinas et al., which investigates the causal effect of the refugee crisis on support for the far-right Golden Dawn party in Greece. In week 5’s seminar, we used a difference-in-differences approach to answer this question, but Dinas et al. also use an IV approach in their paper, where they instrument for the number of refugees in a municipality using the distance of the municipality from the Turkish coast.

As a reminder, the (updated) dinas_golden_dawn.Rdata file contains data on 96 Greek municipalities. In contrast to the data we studied in week 5, here we only have one observation per municipality (i.e. we are focussing only on cross-sectional variation, and ignoring the time dimension that we had last time). The muni data.frame contained within the new file therefore includes the following variables:

  1. treatment – binary (1 if the observation is in the treatment group, i.e. a municipality that received many refugees; 0 otherwise)
  2. trarrprop – continuous (per capita number of refugees arriving in each municipality)
  3. gdvote_change – the outcome of interest. The change in the Golden Dawn’s share of the vote between January and September 2015. (Continuous)
  4. logdist – the logged distance of each municipality from the Turkish coast (originally measured in kilometers)

a. First-stage relationship

The figure below shows graphically the idea behind the IV in this paper. It is clear that municipalities on islands that are situated closer to the Turkish coast were more likely to see inflows of refugees than municipalities on islands that were further away. In some sense, then, this plot is graphical evidence of the first-stage relationship that is required by the instrumental variables assumptions.

Island proximity to the Turkish coast and refugees per capita

Using the variable logdist, calculate the first-stage effect of the instrument on treatment (the binary treatment indicator) and on trarrprop (the continuous treatment measure). Conduct an F-test of each of these models against the null model (i.e. using waldtest). Are you satisfied with the results?


b. IV estimates

Use instrumental variable regressions to estimate the effect of the refugee treatment (binary and continuous) on the change in vote share for the Golden Dawn, using the logdist variable as the instrument. What is the causal effect estimated from each model? How does it compare with the effects that we estimated using the difference-in-differences design in week 5?

c. Exclusion restriction

What is the exclusion restriction for this instrument? Are you persuaded that it holds? What test do the authors of this paper employ to try to strengthen their argument?

  1. The AJR paper has – at the most recent count – over 11 thousand citations, and is heavily debated in a range of disciplines including economics, political science, and history. The questions here are highly stylized, and do not represent a fair characterization of the paper. I strongly encourage you to read the paper and surrounding debates carefully if you want to have a full understanding of this debate.