7 Instrumental Variables (I)
Aside from experiments, all of the strategies covered up to this point rely on the researcher being able to control for confounding factors when estimating causal effects. For the next two weeks, we focus on a strategy – instrumental variables – which can be used to address unobserved confounding factors in the context of cross-sectional data (i.e. when we can’t use the panel data methods discussed in previous weeks). This week, we will motivate instrumental variable (IV) methods, by discussing how this strategy can be useful in the context of experimental data where some units fail to comply with the treatment.
This week, we focus on IV as a strategy for dealing with non-compliance (one-sided and two-sided) in randomized experiments. The clearest exposition of non-compliance is given in two chapters of the Gerber and Green textbook (chapters 5 and 6), although note that they generally avoid any mention of the phrase “instrumental variable” (I don’t really know why; some of the linguistic decisions in that book are a little idiosyncratic). The discussion of IV estimators is also very good (and short) in the Sovey and Green paper. This paper also emphasises that most applications in political science do not use IV to address non-compliance in randomized experiments (the case we discuss this week), but instead use IV to overcome selection bias in cross-sectional observational studies (the case we will discuss next week). The “reader’s checklist” they provide at the end of the paper is particularly recommended, as it gives good, straightforward advice to anyone thinking about using IV as an estimation strategy for causal effects.
The chapter on IV estimation in MHE is good, though much of the material is beyond the level you (or anyone else, to be honest) will need to implement a good IV design. The treatment in the Mastering ’Metrics book is somewhat easier, and also includes some very interesting applications of IV when used to address non-compliance.
7.1 Seminar
7.1.1 Children’s Television and Educational Performance
Can educational television programmes improve children’s learning outcomes? Sesame Street is an American television programme aimed at young children. The creators of Sesame Street decided from the very beginning of the show’s production that a central goal would be to educate as well as entertain its audience. As Malcolm Gladwell argued, “Sesame Street was built around a single, breakthrough insight: that if you can hold the attention of children, you can educate them”. In addition to building the show around a carefully constructed educational curriculum, the show’s producers also worked closely with educational researchers to determine whether the show’s content was effectively improving its young viewers’ numeracy and literacy skills.
The dataset contained in sesame_experiment.dta includes information on 240 children who were randomly assigned to two groups. The treatment of interest here is watching Sesame Street, but clearly it is not possible to force children to watch a TV show or (perhaps even harder) to refrain from watching, and so watching the show cannot be randomized. Instead, in this study, researchers randomized whether children were encouraged to watch the show. More specifically, when the study was run in the 1970s, Sesame Street was on the air each day between 9am and 10am. The parents of children in the treatment group were encouraged to show Sesame Street to their children on a regular basis, while parents of the children in the control group were given no such encouragement. Because it is only encouragement that is randomized here, there is the possibility of non-compliance – i.e. some children will not watch Sesame Street even though they are in the treatment condition, and some children will watch Sesame Street even though they are in the control condition. The data is in .dta format and you can load it as follows:
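A minimal sketch of the loading step. It assumes that the haven package is installed and that sesame_experiment.dta sits in your working directory; the object names data_path and sesame are illustrative choices rather than anything required by the seminar.

```r
# Assumed location: the file is in the current working directory.
data_path <- "sesame_experiment.dta"

if (file.exists(data_path) && requireNamespace("haven", quietly = TRUE)) {
  # read_dta() reads Stata .dta files and returns a data frame (tibble)
  sesame <- haven::read_dta(data_path)
  str(sesame)
} else {
  message("Put ", data_path, " in the working directory and install 'haven' first.")
}
```

If the file lives elsewhere, change data_path or set the working directory with setwd() before running the block.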
The data includes the following variables:
- encour – 1 if the child was encouraged to watch Sesame Street, 0 otherwise
- watched – 1 if the child watched Sesame Street regularly, 0 otherwise
- letters – the score of the child on a literacy test
- age – age of the child (in months)
- female – 1 if the child is female, 0 otherwise
For this seminar, you will also need the AER and ivdesc packages:
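One cautious way to attach the two packages is sketched below. It checks whether they are installed before calling library(), and leaves installation to you; the helper names needed and installed are illustrative.

```r
needed <- c("AER", "ivdesc")

# TRUE for each package that is already installed
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

if (all(installed)) {
  library(AER)
  library(ivdesc)
} else {
  # Install manually, e.g. install.packages("AER"), then re-run this block
  message("Please install: ", paste(needed[!installed], collapse = ", "))
}
```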
1. Compliance and the intention-to-treat
- In the context of this specific example, define the following unit types:
  - Compliers
  - Always-takers
  - Never-takers
  - Defiers
- Calculate the proportion of children in the treatment group who did not watch Sesame Street. Calculate the proportion of children in the control group who did watch Sesame Street. What type of non-compliance occurred in this experiment? Hint: You might find the table() and prop.table() functions helpful here.
- Calculate the proportion of compliers in this experiment. Which assumptions are required for us to identify this quantity?
- Calculate the Intention-to-Treat effect (ITT). What is the interpretation of the ITT here?
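Before turning to the real data, it may help to see the mechanics of this question on simulated data. Everything in the sketch below is made up for illustration: the data frame sim, the variable names z (assignment), d (treatment taken) and y (outcome), the unit-type shares, and the effect size of 5 are assumptions, not features of the Sesame Street study. The simulation also assumes there are no defiers.

```r
set.seed(1)
n <- 2000

# Hypothetical unit types; in real data these are never observed directly
type <- sample(c("complier", "always-taker", "never-taker"), n,
               replace = TRUE, prob = c(0.5, 0.3, 0.2))
z <- rbinom(n, 1, 0.5)  # randomized encouragement
d <- as.integer(type == "always-taker" | (type == "complier" & z == 1))
y <- 20 + 5 * d + rnorm(n, sd = 5)  # made-up outcome, effect of 5
sim <- data.frame(z, d, y)

# Row proportions: treatment take-up within each assignment arm
take_up <- prop.table(table(assigned = sim$z, took = sim$d), margin = 1)
print(take_up)

never_taker_share  <- take_up["1", "0"]  # encouraged, did not take
always_taker_share <- take_up["0", "1"]  # not encouraged, took anyway

# Share of compliers: difference in take-up rates between the two arms
complier_share <- take_up["1", "1"] - take_up["0", "1"]

# ITT: difference in mean outcomes by assignment, ignoring actual take-up
itt <- mean(sim$y[sim$z == 1]) - mean(sim$y[sim$z == 0])
c(complier_share = complier_share, itt = itt)
```

Both off-diagonal shares are non-zero here, which is what two-sided non-compliance looks like in a cross-tabulation.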
2. Local Average Treatment Effect (LATE)
- What does the LATE estimate?
- Estimate the LATE for this example. You should do this in three ways (all of which can be found on the lecture slides!):
  - Using the Wald estimator.
  - Using “manual” two-stage least squares (i.e. you need to specify the regressions yourself).
  - Using ivreg from the AER package.
- You have now estimated two treatment effects: the ITT and LATE. Which is of greater interest to the TV show’s producers?
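The three estimation routes can be sketched on simulated data as follows. The data-generating process, the names z, d and y, and the true effect of 5 are all assumptions made for illustration, and the ivreg step only runs if the AER package is installed.

```r
set.seed(2)
n <- 5000

type <- sample(c("complier", "always-taker", "never-taker"), n,
               replace = TRUE, prob = c(0.5, 0.3, 0.2))
z <- rbinom(n, 1, 0.5)  # randomized assignment (the instrument)
d <- as.integer(type == "always-taker" | (type == "complier" & z == 1))
# Always-takers get a higher baseline outcome, so comparing treated and
# untreated units directly would be confounded by unit type
y <- 20 + 5 * d + 3 * (type == "always-taker") + rnorm(n, sd = 5)
sim <- data.frame(z, d, y)

# 1. Wald estimator: ITT on the outcome divided by ITT on take-up
itt_y <- mean(sim$y[sim$z == 1]) - mean(sim$y[sim$z == 0])
itt_d <- mean(sim$d[sim$z == 1]) - mean(sim$d[sim$z == 0])
wald <- itt_y / itt_d

# 2. Manual 2SLS: regress d on z, then y on the fitted values
first_stage <- lm(d ~ z, data = sim)
sim$d_hat <- fitted(first_stage)
second_stage <- lm(y ~ d_hat, data = sim)
late_2sls <- unname(coef(second_stage)["d_hat"])

c(wald = wald, manual_2sls = late_2sls)

# 3. ivreg from AER: outcome ~ treatment | instrument
if (requireNamespace("AER", quietly = TRUE)) {
  iv_fit <- AER::ivreg(y ~ d | z, data = sim)
  print(summary(iv_fit))
}
```

With a single binary instrument and no covariates, the Wald and manual 2SLS point estimates coincide. The standard errors from the manual second stage should not be reported, though, because they are computed from the wrong residuals.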
3. Exclusion restriction
- What does the assumption of the exclusion restriction mean in this example? Are you convinced that the exclusion restriction holds here?
4. Characterising the compliers
- Use the ivdesc() function from the ivdesc package to evaluate differences between compliers, always-takers, and never-takers in this sample. What is the mean age of compliers? What fraction of compliers are female? Are compliers significantly different from other types of units with respect to these covariates?
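To build intuition for how covariate means of compliers can be recovered even though compliers are never individually identified, here is a sketch on simulated data. It relies on monotonicity (no defiers) and random assignment, and it illustrates the logic only; it is not the ivdesc implementation. All names and numbers (age means of 50, 54 and 46 by type) are invented.

```r
set.seed(3)
n <- 5000

type <- sample(c("complier", "always-taker", "never-taker"), n,
               replace = TRUE, prob = c(0.5, 0.3, 0.2))
z <- rbinom(n, 1, 0.5)
d <- as.integer(type == "always-taker" | (type == "complier" & z == 1))

# Made-up covariate whose mean differs by unit type
type_means <- c("complier" = 50, "always-taker" = 54, "never-taker" = 46)
age <- rnorm(n, mean = type_means[type], sd = 4)

# Type shares implied by take-up in each arm
pi_at <- mean(d[z == 0])      # treated without assignment: always-takers
pi_nt <- mean(1 - d[z == 1])  # untreated despite assignment: never-takers
pi_co <- 1 - pi_at - pi_nt

# Always-takers and never-takers are directly visible in one cell each
mu_at <- mean(age[z == 0 & d == 1])
mu_nt <- mean(age[z == 1 & d == 0])

# The overall mean is a share-weighted average of the three type means,
# so the complier mean can be backed out
mu_co <- (mean(age) - pi_at * mu_at - pi_nt * mu_nt) / pi_co

round(c(compliers = mu_co, always_takers = mu_at, never_takers = mu_nt), 2)
```

With the package installed, the analogous call on these simulated vectors would be ivdesc(X = age, D = d, Z = z), which also reports standard errors for the group means.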
7.1.2 Estimating the Impact of The Hajj
Clingingsmith, Khwaja and Kremer (2009) estimate the impact on pilgrims of performing the Hajj pilgrimage to Mecca using an instrumental variables approach. They compare successful and unsuccessful applicants in a randomized lottery used by Pakistan to allocate Hajj visas and examine the impact of the Hajj pilgrimage on the subsequent beliefs and values of Pakistani Muslims.
You can download the data for this part of the assignment from the top of the page, and load it using the following command:
You will again need the AER package for this problem:
The data object, hajj, includes the following key variables:
- moderacy, an index ranging from 0 to 4 constructed from opinion questions, where higher values indicate more moderate views on Islamic practices, Islamist terrorism, and the status of women
- success = 1 if the respondent won the lottery for a Hajj visa, 0 otherwise
- hajj2006 = 1 if the respondent went on the Hajj, 0 otherwise
- age, measured in years
- literate = 1 if respondent is literate, 0 otherwise
- urban = 1 if respondent lives in an urban area, 0 otherwise
1. Calculating non-compliance
- Calculate (i) the proportion of people who won the lottery and did not go on the Hajj and (ii) the proportion of people who lost the lottery and went on the Hajj. Using these answers, what type of non-compliance occurred in this natural experiment?
- In this study, who are the compliers and who are the always-takers?
2. Calculating the ITT and the LATE
- Calculate the ITT, using moderacy as the outcome variable. What does the ITT represent in this example?
- Calculate the proportion of compliers and the LATE in this example. Interpret your results.
- Calculate the local average treatment effect (LATE) using two-stage least squares and verify that your answer is identical to the LATE you calculated in the previous question. Report its standard error. Is the LATE statistically significant?
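On the standard error: running the second stage by hand gives the right point estimate but the wrong standard error, because lm() builds its residuals from the fitted treatment values rather than the observed treatment. The sketch below shows the correction on simulated data under homoskedasticity. The variables z, d and y and all parameter values are invented for illustration and are unrelated to the Hajj data.

```r
set.seed(4)
n <- 2000

type <- sample(c("complier", "always-taker", "never-taker"), n,
               replace = TRUE, prob = c(0.6, 0.2, 0.2))
z <- rbinom(n, 1, 0.5)  # lottery-style random assignment
d <- as.integer(type == "always-taker" | (type == "complier" & z == 1))
y <- 2 + 0.5 * d + 0.4 * (type == "always-taker") + rnorm(n)

# Manual two-stage least squares
d_hat <- fitted(lm(d ~ z))
second_stage <- lm(y ~ d_hat)
late <- unname(coef(second_stage)["d_hat"])
intercept <- unname(coef(second_stage)["(Intercept)"])

# Wald estimator for comparison
wald <- (mean(y[z == 1]) - mean(y[z == 0])) /
  (mean(d[z == 1]) - mean(d[z == 0]))

# Naive SE: taken straight from the second-stage lm() output
naive_se <- summary(second_stage)$coefficients["d_hat", "Std. Error"]

# Corrected SE: residuals use the observed d, not the fitted d_hat
resid_iv <- y - intercept - late * d
sigma2 <- sum(resid_iv^2) / (n - 2)
correct_se <- sqrt(sigma2 / sum((d_hat - mean(d_hat))^2))

c(late = late, wald = wald, naive_se = naive_se, correct_se = correct_se)

# ivreg from AER reports the corrected standard error directly
if (requireNamespace("AER", quietly = TRUE)) {
  iv_fit <- AER::ivreg(y ~ d | z)
  print(sqrt(diag(vcov(iv_fit)))["d"])
}
```

In practice it is simpler to take the standard error from ivreg, and the by-hand calculation is shown only to make clear where the difference comes from.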