3 Selection on Observables (I)


3.1 Lecture review and reading

For a theoretical discussion of the selection on observables identification strategy, the MHE chapter on regression is very good (especially pages 51 onwards). There is also a very nice exposition of the conditional independence assumption as it specifically relates to matching in this paper by Jasjeet S Sekhon (that paper also has a fantastic title). For practical advice on matching, the best resourse is probably the Elizabeth Stuart paper. This paper gives lots of straightforward recommendations about the different decisions one has to make when implementing different matching estimators. Applied examples are found in Eggers and Hainmueller (2009) and Dehejia and Wahba (1999). There is also a famous example of subclassification in this 1968 paper by Cochran.

Finally, in lecture we discussed the problem of conditioning on post-treatment variables when estimating causal effects. There are many papers on this subject, but two of the more accessible ones I have found are this one by Montgomery et al, which focusses on post-treatment bias in experimental settings; and this one by Acharya et al, which focusses on post-treatment bias in observational settings. Both are worth reading.

3.2 Seminar

3.2.1 Data

We will use three datasets this week, each of which is drawn from one of the papers on the reading list:

  1. Eggers and Hainmueller (2009)
  2. Dehejia and Wahba (1999) observational data
  3. Dehejia and Wahba (1999) experimental data

Download all 3 of these datasets now, and store them in a place alongside the code that you write for today’s problem set. If you have forgotten, look back at the instructions for last week about how to set your working directory.

3.2.2 Installing and loading packages

This week we will be using some additional functions that do not come preinstalled with R. There are many additional packages for R for many different types of quantitative analysis. Today we will be using the Matching package, which provides several helpful functions for implementing various matching strategies. You will also need the foreign package, which allows R to read in data of different formats (such as STATA and SPSS).

To get started, you will need to install these packages on whatever computer you are using. Note: you only need to install a package on a computer once. Do not run this code every time you run your R script!

install.packages("Matching")
install.packages("foreign")

Once installed, you can load the packages using the library() function.

library(Matching)
library(foreign)

Now you will be able to access the functions you need for the seminar and homework.

Finally, as this problem set will involve some random number generation, we will use the set.seed() function at the top of our scripts to ensure that the results remain the same everytime we run them. You specify the function as follows:

set.seed(12345)

where you can have any number in the place of 12345.

3.2.3 MPs for Sale? – Eggers and Hainmueller (2009)

What is the monetary value of serving as an elected politician? It is firmly a part of received wisdom that politicians often use their positions of public office for self-serving purposes. There is much evidence that private firms profit from their connections with influential politicians, but evidence that politicians benefit financially because of their political positions is thin. Eggers and Hainmueller (2009) seek to address this question by asking: what are the financial returns to serving in parliament? They study data from the UK political system, and compare the wealth at the time of death for individuals who ran for office and won (MPs) to individuals who ran for office and lost (candidates) to draw causal inferences about the effects of political office on wealth.

The data from this study is in Rdata format, and can be loaded using the load() function as follows:

load("eggers_mps.Rdata")

The data includes observations of 427 individuals. There are indicators for the main outcome of interest – the (log) wealth at the time of the individual’s death (lnrealgross) – and for the treatment – whether the individual was elected to parliament (treated == 1) or failed to win their election (treated == 0). Crucially, the data also includes information on a rich set of covariates. These include:

  1. labour – binary (1 if the individual was a member of the Labour party, 0 otherwise)
  2. tory – binary (1 if the individual was a member of the Conservative party, 0 otherwise)
  3. yob – year of birth
  4. yod – year of death
  5. female – binary (1 if the individual was female, 0 otherwise)
  6. aristo – binary (1 if the individual held an aristocratic title, 0 otherwise)
  7. Variables pertaining to the secondary education of the individual, all prefixed scat_ (see the paper for details)
  8. Variables pertaining to the university education of the individual, all prefixed ucat_ (see the paper for details)
  9. Variables pertaining to the pre-treatment occupation of the individual, all prefixed oc_ (see the paper for details)

Question 1. Estimate the average treatment effect using either a t.test or a bivariate linear regression. What is it? Is it significantly different from zero? Can we interpret this as an unbiased estimate of the causal effect of elected office on wealth?

Reveal answer

t.test(mps$lnrealgross[mps$treated ==  1],
       mps$lnrealgross[mps$treated ==  0])

    Welch Two Sample t-test

data:  mps$lnrealgross[mps$treated == 1] and mps$lnrealgross[mps$treated == 0]
t = 4.9215, df = 333.28, p-value = 1.353e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.3108102 0.7247022
sample estimates:
mean of x mean of y 
 12.93624  12.41848 
summary(lm(lnrealgross ~ treated, data = mps))

Call:
lm(formula = lnrealgross ~ treated, data = mps)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.9853 -0.4925 -0.0192  0.4244  3.5180 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 12.41848    0.06466  192.07  < 2e-16 ***
treated      0.51776    0.10377    4.99 8.85e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.043 on 423 degrees of freedom
Multiple R-squared:  0.05558,   Adjusted R-squared:  0.05335 
F-statistic:  24.9 on 1 and 423 DF,  p-value: 8.853e-07

The regression suggests that gross log wealth of those in the treated group – those individuals who were elected to parliament – was 0.52 points higher than for those in the control group. In other words, elected candidates were worth, on average, 67% (i.e. (exp(0.517)-1)*100) more at death than candidates who failed to win elections. The difference is significantly different from zero.

Question 2. Using the Match() function from the matching package, carry out exact matching with replacement to estimate the average treatment effect on the treated. Match on gender, aristocratic title, and whether the individual received their secondary education from Eton. What is the ATT? How many matched observations are there? (Note: set the ties argument to be equal to FALSE when using this function. Ties occur when more than one control unit can be matched with a given treatment unit. When this happens, and ties = FALSE R will select at random which control observation to match with the treated observation. Because this process involves some randomness, this is why we used the set.seed() function earlier, as doing so ensures that the results we generate will be the same each time we run the matching. Note that setting ties = TRUE would not make a major difference to the results, it would just mean that R just take a weighted average of the outcomes of all the matching controls.)

Reveal answer

X_vars <- mps[,c("female", "aristo", "scat_eto")]

match_out <- Match(Y = mps$lnrealgross,
                    Tr = mps$treated,
                    X = X_vars,
                    exact = T,
                    estimand = "ATT",
                    M = 1,
                    ties = FALSE)

summary(match_out)

Estimate...  0.5862 
SE.........  0.13515 
T-stat.....  4.3373 
p.val......  1.4423e-05 

Original number of observations..............  425 
Original number of treated obs...............  165 
Matched number of observations...............  163 
Matched number of observations  (unweighted).  163 

Number of obs dropped by 'exact' or 'caliper'  2 

The ATT is equal to 0.5. There are 163 matched observations. This number comes from the fact that we are matching for the treated units, where there are 165 in the original data, and from the fact that two treated units were dropped because they could not be matched on these covariates. (If you are curious, try to find out who these individuals are in the mps data using the index.dropped vector stored in the match_out object.)

Question 3. Evaluate the balance between treated and control observations on those variables before and after matching. You can use t-tests to do this, or you can calculate the standardized difference in means that we discussed in lecture. You can also use the MatchBalance() function, which automates this task very effectively. The object created by Match() contains two vectors, match_out$index.treated and match_out$index.control which give the indexes of the matched treated and control observations. These vectors will be helpful for subsetting purposes.

Reveal answer

Pre-match balance:

t.test(mps$female[mps$treated==1], mps$female[mps$treated==0])
t.test(mps$aristo[mps$treated==1], mps$aristo[mps$treated==0])
t.test(mps$scat_eto[mps$treated==1], mps$scat_eto[mps$treated==0])

    Welch Two Sample t-test

data:  mps$female[mps$treated == 1] and mps$female[mps$treated == 0]
t = -0.68433, df = 386.2, p-value = 0.4942
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.05281431  0.02554158
sample estimates:
 mean of x  mean of y 
0.03636364 0.05000000 


    Welch Two Sample t-test

data:  mps$aristo[mps$treated == 1] and mps$aristo[mps$treated == 0]
t = 2.9165, df = 189.74, p-value = 0.003966
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.01908818 0.09886054
sample estimates:
  mean of x   mean of y 
0.066666667 0.007692308 


    Welch Two Sample t-test

data:  mps$scat_eto[mps$treated == 1] and mps$scat_eto[mps$treated == 0]
t = 3.7689, df = 206.7, p-value = 0.0002139
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.04969061 0.15870100
sample estimates:
 mean of x  mean of y 
0.12727273 0.02307692 

Post-match balance:

T-tests:

t.test(mps$female[match_out$index.treated], mps$female[match_out$index.control])
t.test(mps$aristo[match_out$index.treated], mps$aristo[match_out$index.control])
t.test(mps$scat_eto[match_out$index.treated], mps$scat_eto[match_out$index.control])

    Welch Two Sample t-test

data:  mps$female[match_out$index.treated] and mps$female[match_out$index.control]
t = 0, df = 324, p-value = 1
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.04115937  0.04115937
sample estimates:
 mean of x  mean of y 
0.03680982 0.03680982 


    Welch Two Sample t-test

data:  mps$aristo[match_out$index.treated] and mps$aristo[match_out$index.control]
t = 0, df = 324, p-value = 1
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.04992579  0.04992579
sample estimates:
 mean of x  mean of y 
0.05521472 0.05521472 


    Welch Two Sample t-test

data:  mps$scat_eto[match_out$index.treated] and mps$scat_eto[match_out$index.control]
t = 0, df = 324, p-value = 1
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.07014575  0.07014575
sample estimates:
mean of x mean of y 
0.1165644 0.1165644 

Standardized-bias:

# Define a function to calculate the standardized bias (see Stuart p. 14)
stand_bias <- function(xt, xc, sdt){
  
  (mean(xt) - mean(xc))/sdt
  
}

# Women before and after
stand_bias(xt = mps$female[mps$treated==1], xc = mps$female[mps$treated==0], sdt = var(mps$female[mps$treated==1]))
stand_bias(xt = mps$female[match_out$index.treated], xc = mps$female[match_out$index.control], sdt = var(mps$female[mps$treated==1]))

# Aristocracy before and after
stand_bias(xt = mps$aristo[mps$treated==1], xc = mps$aristo[mps$treated==0], sdt = var(mps$aristo[mps$treated==1]))
stand_bias(xt = mps$aristo[match_out$index.treated], xc = mps$aristo[match_out$index.control], sdt = var(mps$aristo[mps$treated==1]))

# Eton before and after
stand_bias(xt = mps$scat_eto[mps$treated==1], xc = mps$scat_eto[mps$treated==0], sdt = var(mps$scat_eto[mps$treated==1]))
stand_bias(xt = mps$scat_eto[match_out$index.treated], xc = mps$scat_eto[match_out$index.control], sdt = var(mps$scat_eto[mps$treated==1]))
[1] -0.3867925
[1] 0
[1] 0.9420579
[1] 0
[1] 0.9323871
[1] 0
MatchBalance(treated ~  female + aristo + labour, 
             data =  mps, 
             match.out = match_out)

***** (V1) female *****
                       Before Matching       After Matching
mean treatment........   0.036364           0.03681 
mean control..........       0.05           0.03681 
std mean diff.........    -7.2625                 0 

mean raw eQQ diff.....   0.012121                 0 
med  raw eQQ diff.....          0                 0 
max  raw eQQ diff.....          1                 0 

mean eCDF diff........  0.0068182                 0 
med  eCDF diff........  0.0068182                 0 
max  eCDF diff........   0.013636                 0 

var ratio (Tr/Co).....    0.73936                 1 
T-test p-value........    0.49417                 1 


***** (V2) aristo *****
                       Before Matching       After Matching
mean treatment........   0.066667          0.055215 
mean control..........  0.0076923          0.055215 
std mean diff.........     23.571                 0 

mean raw eQQ diff.....   0.060606                 0 
med  raw eQQ diff.....          0                 0 
max  raw eQQ diff.....          1                 0 

mean eCDF diff........   0.029487                 0 
med  eCDF diff........   0.029487                 0 
max  eCDF diff........   0.058974                 0 

var ratio (Tr/Co).....     8.1698                 1 
T-test p-value........  0.0039662                 1 


***** (V3) labour *****
                       Before Matching       After Matching
mean treatment........     0.3697           0.37423 
mean control..........    0.54615           0.52761 
std mean diff.........    -36.444           -31.596 

mean raw eQQ diff.....    0.17576           0.15337 
med  raw eQQ diff.....          0                 0 
max  raw eQQ diff.....          1                 1 

mean eCDF diff........   0.088228          0.076687 
med  eCDF diff........   0.088228          0.076687 
max  eCDF diff........    0.17646           0.15337 

var ratio (Tr/Co).....    0.94219            0.9396 
T-test p-value........ 0.00033894         0.0039664 


Before Matching Minimum p.value: 0.00033894 
Variable Name(s): labour  Number(s): 3 

After Matching Minimum p.value: 0.0039664 
Variable Name(s): labour  Number(s): 3 

In the raw data, there is significant imbalance between treatment and control groups with respect to the aristo and scat_eto variables. Regardless of how we do the calculation after matching, there is no imbalance between matched treatment and control groups with respect to any of these variables. This makes sense, as the exact matching process implies that the distribution of these covariates should be identical in the treatment and control groups.

Question 4. Rematch the data, this time expanding the list of covariates to include all of the the schooling, university and occupation categories. Use exact matching. What is the ATT? How many observations remain in the matched data?

Reveal answer

X_vars <- mps[,c("female", "aristo",  "scat_pub",  "scat_eto", "scat_nm", "scat_sec", "oc_teacherall", "oc_barrister", "oc_solicitor",  "oc_dr", "oc_civil_serv", "oc_local_politics", "oc_business",  "oc_white_collar",  "oc_union_org", "oc_journalist", "oc_miner","ucat_ox", "ucat_deg", "ucat_nm")]

match_out <- Match(Y = mps$lnrealgross,
                    Tr = mps$treated,
                    X = X_vars,
                    exact = T,
                    estimand = "ATT",
                    M = 1,
                    replace = T,
                    ties = FALSE)

summary(match_out)

Estimate...  0.63578 
SE.........  0.15141 
T-stat.....  4.1991 
p.val......  2.6804e-05 

Original number of observations..............  425 
Original number of treated obs...............  165 
Matched number of observations...............  110 
Matched number of observations  (unweighted).  110 

Number of obs dropped by 'exact' or 'caliper'  55 

The ATT has increased to 0.8, and there are now far fewer matched observations (110).

Question 5. In the Egger and Hainmueller paper, they run the matching analysis separately for Labour and Conservative candidates. Do this now and try to replicate the 3rd and 6th columns from table 3 of their paper. As you will see, they do not use exact matching, but rather they use M=1 genetic matching. Nevertheless, you can get quite close to this result using simpler approaches. In the analysis they present they do not only match on the categorical variables we have used so far, they also use the continuous variables for year of birth and year of death (yob,yod). Include these in the new matching procedure. Because of these continuous variables, you will not be able use exact matching here. (You could also implement genetic matching by using the GenMatch() function before the Match() function and using the output of GenMatch() as the input for the Weight.matrix in Match())

Reveal answer

# Set up data
Y = mps$lnrealgross
Tr = mps$treated
X_vars <- mps[,c("female", "aristo",  "scat_pub",  "scat_eto", "scat_nm", "scat_sec", "oc_teacherall", "oc_barrister", "oc_solicitor",  "oc_dr", "oc_civil_serv", "oc_local_politics", "oc_business",  "oc_white_collar",  "oc_union_org", "oc_journalist", "oc_miner","ucat_ox", "ucat_deg", "ucat_nm","yob","yod")]

# Labour matching
lab_obs <- mps$labour == 1

lab_match_out <- Match(Y = Y[lab_obs],
                    Tr = Tr[lab_obs],
                    X = X_vars[lab_obs,],
                    estimand = "ATT",
                    M = 1)

summary(lab_match_out)

# Conservative matching
con_obs <- mps$tory == 1

con_match_out <- Match(Y = Y[con_obs],
                    Tr = Tr[con_obs],
                    X = X_vars[con_obs,],
                    estimand = "ATT",
                    M = 1)

summary(con_match_out)

Estimate...  0.13796 
AI SE......  0.1554 
T-stat.....  0.88774 
p.val......  0.37468 

Original number of observations..............  203 
Original number of treated obs...............  61 
Matched number of observations...............  61 
Matched number of observations  (unweighted).  61 


Estimate...  0.97597 
AI SE......  0.34039 
T-stat.....  2.8672 
p.val......  0.0041412 

Original number of observations..............  222 
Original number of treated obs...............  104 
Matched number of observations...............  104 
Matched number of observations  (unweighted).  106 

The ATT for Labour candidates is 0.13, and is not statistically significant. The ATT for Conservative candidates is 0.98, and is significantly different from zero. Both of these cases are very close to the numbers reported in table 3 of the paper.

For genetic matching you could try the following:

lab_genout <- GenMatch(Tr = Tr[lab_obs],
         X = X_vars[lab_obs,],
         M = 1,
         estimand = "ATT",
         pop.size = 1000)

lab_genmatch_out <- Match(Y = Y[lab_obs],
                    Tr = Tr[lab_obs],
                    X = X_vars[lab_obs,],
                    estimand = "ATT",
                    M = 1,
                    Weight.matrix = lab_genout)

summary(lab_genmatch_out)

3.3 Homework

3.3.1 National Supported Work – Dehejia and Wahba (1999)

The National Supported Work programme was a federal programme in the US which provided work experience to individuals who had faced economic problems in the past. Individuals were randomly assigned to participate in the programme from a pool of applicants between 1975 and 1977, and both treatment and control individuals were interviewed in 1978 and asked about the amount of money they were currently earning. The experimental data from this programme is stored in the nsw_exper.dta file.

The file NSW.dw.obs.dta contains data from from an observational study constructed by LaLonde (1986). The constructed observational study was formed by replacing the randomized control group with a comparison group formed using data from two national public surveys. The idea here is to see whether it is possible to use this observational data to construct unbiased estimates of the effect of the NSW treatment by implementing a matching estimator.

Dehejia and Wahba (1999) used these data to evaluate the potential efficacy of matching. In this homework, you will replicate some of Dehejia and Wahba’s findings. The following variables are included in the NSW observational dataset (NSW.dw.obs.dta):

  • treat (1 = experimental treatment group; 0 = observational comparison group)
  • age (age in years)
  • educ (years of schooling)
  • black (1 if black; 0 otherwise)
  • hisp (1 if Hispanic; 0 otherwise)
  • married (1 if married; 0 otherwise)
  • nodegree (1 if no high school diploma; 0 otherwise)
  • educcat (1 = did not finish high school, 2 = high school, 3 = some college, 4 = graduated from college)
  • re74, re75, re78 (real earnings in 1974, 1975 and 1978)

You will also need the nsw_exper.dta file, which includes the data used in the original experiment. For this file, the only variables of interest for today will be:

  • nsw (1 if the respondent was in the experimental treatment group, 0 if they were in the experimental control group)
  • re78 (real earnings in 1978)

You can load both of the data files, which are in STATA format, using the read.dta() function:

nsw_exp <- read.dta("nsw_exper.dta")
nsw_obs <- read.dta("NSW.dw.obs.dta")

Question 1. Estimate the treatment effect from the experimental data either via a) a simple difference in means between treated (nsw=1) and control units (nsw=0), where nsw is the treatment indicator; or b) a linear regression on the single dummy treatment variable (nsw). What is the ATE as measured from the randomized experiment? How does this compare to a raw difference in group means from the observational data?

Solution

exp_model <- lm(re78 ~ nsw, nsw_exp)
coef(exp_model)[2]
     nsw 
1794.343 

The experimental estimate of the ATE is 1794. That is, those individuals randomly allocated to the National Supported Work programme earned $1794 more on average than those in the experimental control group.

obs_model <- lm(re78 ~ treat, nsw_obs)

coef(obs_model)[2]
    treat 
-9401.156 

The unconditional difference in group means for the observational data is very different from the experimental estimate. People in the control group earn $9401 more on average than those in the treatment group. Eligibility criteria for the NSW programme included conditions regarding previous social and economic problems for the applicants (see p. 1054 in Dehejia and Wahba). This applicant screening necessarily leads to selection bias in comparisons between those selected for the programme and those not selected (in the observational study). The consequence of this bias is that a naive comparison of means between the treatment group and the newly constructed control sample give the misleading impression that participation in the work programme decreased earnings.

Question 2. Estimate the average treatment effect on the treated using the constructed observational data using matching. You will need to decide which distance measure to use; which variables to include; which choice of M to use. Try several specifications and when you are happy with the balance you have acheived, estimate the causal effect (you can use the Match() function without providing the outcome vector, and the function will return a matched dataset that you can use to check balance). How close are your estimates to the experimental benchmark?

Solution

There are many decisions to make here! This is my attempt:

# Match without outcome variables for balance testing (you could do this for many different specifications)

match_out <- Match(Tr = nsw_obs$treat, 
      X = nsw_obs[,c("age", "educ", "black", "married", "nodegree", "re74", "re75", "hisp", "educcat4")],
      M = 1,
      Weight = 2, # Mahalanobis
      exact = F,
      ties = FALSE,
      estimand = "ATT")

#MatchBalance(treat ~ age + educ  + black + married + nodegree + re74 + re75 + hisp + educcat4, data = nsw_obs, match.out = match_out)

# Match WITH outcome variables for effect estimation

match_out <- Match(Y = nsw_obs$re78,
      Tr = nsw_obs$treat, 
      X = nsw_obs[,c("age", "educ", "black", "married", "nodegree", "re74", "re75", "hisp", "educcat4")],
      M = 1,
      Weight = 2,
      exact = F,
      ties = FALSE,
      estimand = "ATT")

summary(match_out)

Estimate...  1060.3 
SE.........  723.15 
T-stat.....  1.4662 
p.val......  0.14259 

Original number of observations..............  18667 
Original number of treated obs...............  185 
Matched number of observations...............  185 
Matched number of observations  (unweighted).  185 

Number of obs dropped by 'exact' or 'caliper'  0 

I get an ATT estimate of 1060, which is not that far away from the experimental estimate, but is still an underestimate of the true treatment effect.

Question 3. Redo the matching exercise, excluding the variables for pre-treatment earnings in 1974 and 1975. How important do the earnings-in-1974 and -1975 variables appear to be in terms of satisfying the selection on observables assumption?

Solution

match_out <- Match(Y = nsw_obs$re78,
      Tr = nsw_obs$treat, 
      X = nsw_obs[,c("age", "educ", "black", "married", "nodegree", "hisp", "educcat4")],
      M = 5,
      Weight = 2,
      exact = F,
      ties = FALSE,
      estimand = "ATT")

summary(match_out)

Estimate...  -3747.8 
SE.........  795.05 
T-stat.....  -4.7139 
p.val......  2.4303e-06 

Original number of observations..............  18667 
Original number of treated obs...............  185 
Matched number of observations...............  185 
Matched number of observations  (unweighted).  925 

Number of obs dropped by 'exact' or 'caliper'  0 

Very important! The estimated effect of the treatment variable in the absence of information about previous earning levels is strongly negative. The 1974 and 1975 earnings variables therefore appear to be crucial for uncovering the true causal effect, as established by the randomised experiment. This should serve as a reminder of the fragility of the selection on observables assumption.