9 Regression Discontinuity Designs


9.1 Lecture review

“Jump around! Jump around! Jump up, jump up and get down! Jump!” – House of Pain

Regression discontinuity designs are appropriate in cases where the probability that a unit is treated jumps at some specific value (\(c\)) of a running variable (\(X_i\)). The RDD is powered by the idea that jumps in the outcome (\(Y\)) that occur at the same value of the running variable can – under certain assumptions – be attributed to the causal effect of the treatment. The basic logic behind this is that we normally expect most outcomes to evolve smoothly, and so as long as we are able to model that smooth evolution to a sufficient degree of accuracy, we can calculate the difference in outcomes that occur at the cutoff and count that as the treatment effect.

There are good treatments (get it?) of the RDD in both the Mostly Harmless book, and the Mastering ’Metrics books, but I think it is easiest to gain intuition about these methods from seeing plenty of examples. The paper we covered during the lecture is this one by Andy Hall, which uses a very common RD setting – election results which lead to one type of candidate being elected rather than another type – but in an innovative way that speaks to some important debates in American Politics. For an example of population-threshold based RDDs, this paper by Andy Eggers is very nice. Eggers uses the fact that municipalities in France use one electoral system (Majoritarian) when they have a population of less than 3500 and another (PR) when they have more than 3500 people. Finally, I really like this paper wher Ferwerda and Miller implement a geographic-RDD to investigate the causal effects of the Vichy regime in World War Two on the strength of the French resistance (you could also read this paper in which Kocher and Monteiro (2016) which challenges the “as if random” interpretation of Ferwerda and Miller’s RDD setting.)

9.2 Seminar

9.2.1 Data and packages

We will use one dataset this week:

  1. Meyersson (2014)

We will also need to use an additional package – the rdd package – this week for estimating the regression discontinuity models, and the associated plotting functions. Install this package now and load it into your environment.

install.packages("rdd")
library(rdd)

9.2.2 Islamic Political Rule and Women’s Empowerment

The islamic_women.csv dataset includes the following variables:

  1. margin – The margin of the Islamic party’s win or loss in the 1994 election (numbers greater than zero imply that the Islamic party won, and numbers less than zero imply that the Islamic party lost. 0 is an exact tie.)
  2. school_men – the secondary school completion rate for men aged 15-20
  3. school_women – the secondary school completion rate for women aged 15-20
  4. log_pop – log of the municipality population in 1994
  5. sex_ratio – sex ratio of the municipality in 1994
  6. log_area – log of the municipality area in 1994

Question 1. Create a treatment variable, islamic_win, that indicates whether or not the Islamic party won the 1994 election. Attach this variable to the islamic data.frame. In how many municipalities did the Islamic party win the election?

Solution

  # Create the treatment variable
  islamic$islamic_win <- islamic$margin > 0
  
  # Tabulate the treatment variable
  table(islamic$islamic_win)

FALSE  TRUE 
 2332   328 

The Islamic party won in 328 municipalities.

Question 2. Calculate the difference in means in secondary school completion rates for women between regions where Islamic parties won and lost in 1994. Do you think this is a credible estimate of the causal effect of Islamic party control? Why or why not?

Solution

  # Create the treatment variable
  diff_in_means <- mean(islamic$school_women[islamic$islamic_win==1],na.rm=T) - 
  mean(islamic$school_women[islamic$islamic_win==0],na.rm=T)

  diff_in_means
[1] -0.02577901

The difference in means is -0.0258. A naive interpretation of this would be that Islamic party control leads to lower female education rates. But this is not a very credible estimate of the causal effect, due to selection bias. The treated and untreated regions are likely to have very different potential outcomes; they are imbalanced. For example, it is likely that places that strongly support Islamic parties are less supportive of female education to begin with.

Question 3. Before estimating the RDD, we need to select an appropriate bandwidth for the analysis. You can select the optimal bandwidth for this data using the Imbens-Kalyanaram procedure, implemented using the IKBandwidth function in the rdd package. You may need to read the help file (?IKbandwidth) to see what the function requires. After you have selected the optimal bandwidth, explain what the bandwidth means in this case.

Solution

 band <- IKbandwidth(islamic$margin, islamic$school_women)

This gives a bandwidth of 0.239. This suggests that the optimal bias-variance tradeoff is achieved for this specific RD analysis by only including municipalities where there vote share margin for the Islamic party was in the interval from -0239 to 0.239.

Finally, using the bandwidth that you just calculated, create a new dataset that includes only those observations that fall within the optimal bandwidth of the running variable. You can use the following code – where band is the saved value of the optimal bandwidth – to acheive this:

 islamic_rdd  <- islamic[islamic$margin < band & islamic$margin > -band, ]

Question 4. Find the Local Average Treatment Effect of Islamic party control on women’s secondary school education at the threshold, using the dataset you created in the question above and a simple linear regression that includes the treatment and running variable. How credible do you think this result is?

Solution

  linear_rdd_model <- lm(school_women ~ margin + islamic_win, data=islamic_rdd)
  summary(linear_rdd_model)  

Call:
lm(formula = school_women ~ margin + islamic_win, data = islamic_rdd)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.17444 -0.07748 -0.00821  0.06402  0.33996 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      0.128457   0.006368  20.172  < 2e-16 ***
margin          -0.220607   0.044521  -4.955 8.46e-07 ***
islamic_winTRUE  0.033075   0.011043   2.995  0.00281 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09329 on 1017 degrees of freedom
  (569 observations deleted due to missingness)
Multiple R-squared:  0.02645,   Adjusted R-squared:  0.02453 
F-statistic: 13.81 on 2 and 1017 DF,  p-value: 1.204e-06

This gives a LATE estimate of 0.033 with a p value of 0.003, indicating that the effect is strongly statistically significant. This result may not be very credible because it requires that the relationship between female education and the running variable is linear and has the same slope on either side of the cutoff. This is a very strong assumption.

Question 5. Use RD estimation to find the Local Average Treatment Effect of Islamic party control on women’s secondary school education at the threshold, using local linear regression estimated with the RDestimate function from the rdd package. Does the estimate differ from question 4?

The RDestimate function takes a number of arguments:

Argument Description
formula The formula of the RDD, which takes the form outcome_variable ~ running_variable
data The name of the data of interest. In our case, we want to use islamic, not islamic_rdd. This is because the RDestimate function will apply the bandwidth restriction for us automatically when we specify the bw option as we do below.
bw The bandwidth at which to estimate the RDD. Here, this argument should be bw = band where band is the optimal bandwidth you calculated above.
cutpoint The cutpoint, i.e. the value of the running variable at which the disconuity is assumed to occur. Here, this argument should be 0.

Solution

  rd_est <- RDestimate(school_women~margin,cutpoint=0, bw=band, data=islamic)
  summary(rd_est)  

Call:
RDestimate(formula = school_women ~ margin, data = islamic, cutpoint = 0, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate  Std. Error  z value
LATE       0.2398     1020          0.02963   0.01241     2.388  
Half-BW    0.1199      589          0.02502   0.01646     1.520  
Double-BW  0.4796     2050          0.02279   0.01009     2.257  
           Pr(>|z|)   
LATE       0.01694   *
Half-BW    0.12859    
Double-BW  0.02399   *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F       Num. DoF  Denom. DoF  p        
LATE        4.990  3         1016        3.865e-03
Half-BW     1.703  3          585        3.304e-01
Double-BW  25.770  3         2046        4.441e-16

This gives a LATE estimate of 0.0296 with a p value of 0.016, indicating that the effect is clearly significant at the 5% level. The estimate is only slightly lower than in the manual version of RDD that we estimated above.

Question 6. Plot the relationship between the running variable and the outcome, with a line representing the local linear regression either side of the cutoff. Use your plot to explain why your results do or do not differ strongly between the linear RDD model you estimated in question 4 and the non-linear RDD model you estimated in question 5. Note: You do not need to produce this plot manually! You can simply call the plot() function directly on the object you created in question 5. You can use the range argument to control the x axis. For more options, look at the help file for ?plot.RD.

Solution

  plot(rd_est, range=c(-0.6,0.6))
  abline(v=0) 

The plot indicates that the relationship between \(Y\) and \(\tilde{X}\) does be approximately linear on either side of the cutoff. That explains why we received similar answers in both questions 4 and 5 above: linearity is a reasonable fit to the data here. That will not always be the case, though, and it is usually good practice to show that you results are insensitive to the functional form specification of the running variable.

9.3 Homework

9.3.1 Assessing the robustness of RD estimates

Question 1. Perform placebo tests to check that the relationship between the running variable and outcome is not subject to discontinuities at values other than zero. To do this, re-estimate the RD, but varying the placebo cutoffs. Try the following values: -0.1, -0.05, 0.05 and 0.1. What do you conclude?

Solution

  summary(RDestimate(school_women ~ margin, bw = band, cutpoint = -0.1, data = islamic))

Call:
RDestimate(formula = school_women ~ margin, data = islamic, cutpoint = -0.1, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate  Std. Error  z value
LATE       0.2398     1345          -0.03411  0.011010    -3.098 
Half-BW    0.1199      722          -0.04179  0.015316    -2.728 
Double-BW  0.4796     2500          -0.03532  0.008328    -4.241 
           Pr(>|z|)      
LATE       1.946e-03  ** 
Half-BW    6.366e-03  ** 
Double-BW  2.228e-05  ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F       Num. DoF  Denom. DoF  p        
LATE       15.256  3         1341        1.831e-09
Half-BW     8.018  3          718        5.835e-05
Double-BW  28.430  3         2496        0.000e+00
  summary(RDestimate(school_women ~ margin, bw = band, cutpoint = -0.05, data = islamic))

Call:
RDestimate(formula = school_women ~ margin, data = islamic, cutpoint = -0.05, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate   Std. Error  z value
LATE       0.2398     1176           0.009464  0.010904     0.8679
Half-BW    0.1199      667           0.020206  0.014474     1.3960
Double-BW  0.4796     2352          -0.001253  0.008842    -0.1418
           Pr(>|z|)   
LATE       0.3854     
Half-BW    0.1627     
Double-BW  0.8873     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F       Num. DoF  Denom. DoF  p        
LATE        7.620  3         1172        9.557e-05
Half-BW     5.125  3          663        3.289e-03
Double-BW  23.216  3         2348        1.665e-14
  summary(RDestimate(school_women ~ margin, bw = band, cutpoint = 0.05, data = islamic))

Call:
RDestimate(formula = school_women ~ margin, data = islamic, cutpoint = 0.05, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate   Std. Error  z value
LATE       0.2398      883           0.008142  0.01388      0.5866
Half-BW    0.1199      474          -0.008542  0.01915     -0.4461
Double-BW  0.4796     1842           0.015935  0.01080      1.4761
           Pr(>|z|)   
LATE       0.5575     
Half-BW    0.6555     
Double-BW  0.1399     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F        Num. DoF  Denom. DoF  p        
LATE        0.9364  3          879        8.449e-01
Half-BW     0.6545  3          470        8.392e-01
Double-BW  20.0031  3         1838        1.861e-12
  summary(RDestimate(school_women ~ margin, bw = band, cutpoint = 0.1, data = islamic))

Call:
RDestimate(formula = school_women ~ margin, data = islamic, cutpoint = 0.1, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate  Std. Error  z value
LATE       0.2398      737          0.01707   0.01647     1.036  
Half-BW    0.1199      335          0.02960   0.02225     1.331  
Double-BW  0.4796     1614          0.02266   0.01270     1.784  
           Pr(>|z|)   
LATE       0.30003    
Half-BW    0.18334    
Double-BW  0.07446   .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F       Num. DoF  Denom. DoF  p        
LATE        1.986  3          733        2.294e-01
Half-BW     1.270  3          331        5.690e-01
Double-BW  16.116  3         1610        5.053e-10

While there are no significant LATE estimates for the -0.05, 0.05 or 0.1 cutoffs, there is a significant (negative) treatment effect at the -0.1 cutoff. That is, the p-values for three of the cutoffs are large, while the p-value for the -0.1 cutoff is just 0.002. This is not hugely encouraging, as it suggests that there are significant discontinuities in the outcome variable for values of the running variable other than 0. This constitutes evidence against a true RD effect because it indicates that the discontinuity we measured at 0 could be only one of many discontinuities in the data. In other words, the effect at 0 could have arisen due to chance alone.

Question 2. An alternative robustness check is to examine whether there are significant treatment effects for pre-treatment covariates that should not be affected by the treatment. Use the three pre-treatment covariates here – log_pop, sex_ratio, log_area – as outcome variables in new RD estimates. Do you find any significant effects of the treatment on these variables? (Note: remember to set the bw argument to be equal to the bandwith that we selected in question 3 for all of these models.)

Solution

  summary(RDestimate(log_pop ~ margin, bw = band, cutpoint = band, data = islamic))

Call:
RDestimate(formula = log_pop ~ margin, data = islamic, cutpoint = band, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate  Std. Error  z value
LATE       0.2398      323          0.50808   0.6402      0.7937 
Half-BW    0.1199       91          2.17410   0.7292      2.9816 
Double-BW  0.4796     1067          0.06513   0.5459      0.1193 
           Pr(>|z|)    
LATE       0.427393    
Half-BW    0.002867  **
Double-BW  0.905030    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F      Num. DoF  Denom. DoF  p      
LATE       2.043  3          319        0.21555
Half-BW    3.997  3           87        0.02045
Double-BW  2.596  3         1063        0.10239
  summary(RDestimate(sex_ratio ~ margin, bw = band, cutpoint = band, data = islamic))

Call:
RDestimate(formula = sex_ratio ~ margin, data = islamic, cutpoint = band, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate   Std. Error  z value 
LATE       0.2398      323          -0.019682  0.02985     -0.65939
Half-BW    0.1199       91          -0.060267  0.04468     -1.34899
Double-BW  0.4796     1067           0.001324  0.02993      0.04425
           Pr(>|z|)   
LATE       0.5096     
Half-BW    0.1773     
Double-BW  0.9647     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F       Num. DoF  Denom. DoF  p     
LATE       0.8566  3          319        0.9278
Half-BW    1.0891  3           87        0.7162
Double-BW  0.7255  3         1063        0.9264
  summary(RDestimate(log_area ~ margin, bw = band, cutpoint = band, data = islamic))

Call:
RDestimate(formula = log_area ~ margin, data = islamic, cutpoint = band, 
    bw = band)

Type:
sharp 

Estimates:
           Bandwidth  Observations  Estimate  Std. Error  z value
LATE       0.2398      317          0.5296    0.6085      0.8703 
Half-BW    0.1199       88          1.8959    0.6786      2.7936 
Double-BW  0.4796     1053          0.2188    0.5174      0.4229 
           Pr(>|z|)    
LATE       0.384145    
Half-BW    0.005212  **
Double-BW  0.672336    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistics:
           F      Num. DoF  Denom. DoF  p      
LATE       2.712  3          313        0.09017
Half-BW    4.303  3           84        0.01424
Double-BW  1.809  3         1049        0.28744

There is no evidence of discontinuities for these any of these variables. They all have small placebo LATE estimates and high p-values. This suggests that, at least based on these observed covariates, there was no sorting around the threshold: local randomisation is likely to hold.

Question 3. Perform a McCrary test (another way to check for sorting at the theshold) using the DCdensity function. Plot and interpret the results.

Solution

DCdensity(islamic$margin, verbose =TRUE)
Assuming cutpoint of zero.
Using calculated bin size:  0.009 
Using calculated bandwidth:  0.165 

Log difference in heights is  -0.095  with SE  0.147 
  this gives a z-stat of  -0.650 
  and a p value of  0.515 
[1] 0.5153874

The p-value for the resultant test statistic is 0.51, meaning that we cannot reject the null hypothesis of no discontinuous jump in the density of the running variable. Visually, you can also see that there is no strong evidence for this in the plot.

Question 4. (Difficult) Examine the sensitivity of the main RD result to the choice of bandwidth by calculating and plotting RD estimates and their associated 95% confidence intervals for a range of bandwidths from 0.05 to 0.6. To what extent do the results depend on the choice of bandwidth?

A good way to start is to create a vector of bandwidths that we loop over in a for loop. On each iteration of the for loop, estimate the RDD using that specific bandwidth, and save the LATE estimates and the associated confidence intervals. Once the for loop is complete, we can simply plot the estimates and confidence intervals over the range of the bandwidths we specify. The code below will help you get started:

# Set the vector of bandwidths
bandwidths <- seq(from = 0.05, to = 0.6, by = 0.005)

# Setup some variables that can be used to store the estimates and confidence intervals
rd_ests <- rep(NA, length(bandwidths))
rd_hi <- rep(NA, length(bandwidths))
rd_lo <- rep(NA, length(bandwidths))

# Loop over the bandwidths

for(i in 1:length(bandwidths)){
  
  ## Estimate the RD model with the appropriate bandwidth here (using bandwidth[i], for instance)
  
  ## Extract the LATE coefficient from the model here (if rd_est is the estimated model, the LATE coefficient can be found in re_est$est[1])
  
  ## Extract the LATE standard error from the model here (if rd_est is the estimated model, the LATE standard error can be found in re_est$se[1])
  
  ## Calculate the upper end of the confidence interval here (remember that the upper end of the confidence interval can be calculated as coefficient + 1.96*standard error)
  
  ## Calculate the lower end of the confidence interval here (remember that the upper end of the confidence interval can be calculated as coefficient - 1.96*standard error)
  
  ## Assign the estimated coefficient to the appropriate position in rd_ests
  
  ## Assign the estimated upper confidence interval to the appropriate position in rd_hi
  
  ## Assign the estimated lower confidence interval to the appropriate position in rd_lo
  
}

Solution

# Set the vector of bandwidths
bandwidths <- seq(from = 0.05, to = 0.6, by = 0.005)

# Setup some variables that can be used to store the estimates and confidence intervals
rd_ests <- rep(NA, length(bandwidths))
rd_hi <- rep(NA, length(bandwidths))
rd_lo <- rep(NA, length(bandwidths))

# Loop over the bandwidths

for(i in 1:length(bandwidths)){
  
  ## Estimate the RD model with the appropriate bandwidth here (using bandwidth[i], for instance)
  
  rd_est <- RDestimate(school_women ~ margin, bw = bandwidths[i], data = islamic)
  
  ## Extract the LATE coefficient from the model here (if rd_est is the estimated model, the LATE coefficient can be found in re_est$est[1])
  
  late_coef <- rd_est$est[1]
  
  ## Extract the LATE standard error from the model here (if rd_est is the estimated model, the LATE standard error can be found in re_est$se[1])
  
  late_se <- rd_est$se[1]
  
  ## Calculate the upper end of the confidence interval here (remember that the upper end of the confidence interval can be calculated as coefficient + 1.96*standard error)
  
  late_hi <- late_coef + 1.96 * late_se
  
  ## Calculate the lower end of the confidence interval here (remember that the upper end of the confidence interval can be calculated as coefficient - 1.96*standard error)
  
  late_lo <- late_coef - 1.96 * late_se
  
  ## Assign the estimated coefficient to the appropriate position in rd_ests
  
  rd_ests[i] <- late_coef
  
  ## Assign the estimated upper confidence interval to the appropriate position in rd_hi
  
  rd_hi[i] <- late_hi
  
  ## Assign the estimated lower confidence interval to the appropriate position in rd_lo
  
  rd_lo[i] <- late_lo
  
}

  plot(x = bandwidths, y = rd_ests, 
       type = "l",
       lwd = 2,
       ylim = c(-0.1,0.1),
       xlab = "Bandwidth", ylab="Estimate")
  abline(h = 0)
  lines(x = bandwidths, y = rd_hi, lty = 3)
  lines(x = bandwidths, y = rd_lo, lty = 3)
  legend("topright", 
         legend = c("RD Estimate","95% Confidence Interval"),
         lty = c(1,3))

The plot shows a fairly typical pattern in RD analysis. At very low bandwidths, very little data is used in estimation, making the variance very high. Unsurprisingly therefore, the confidence interval is very wide at low bandwidths (recall that the optimal IK bandwidth is chosen with the twin goals of reducing both bias and variance). As the threshold (and therefore the effective sample size) increases, the confidence interval decreases.

The size of the RD estimate does not change much as the bandwidth increases, which is reassuring: we would not want our result to be extremely sensitive to the bandwidth choice. Its statistical significance does change somewhat as the threshold changes. It is significant at the 95% level for bandwidths between approximately 0.15 and 0.55, becoming statistically indistinguishable from 0 again at higher bandwidths. Again, this is reasurring as the main result holds across a wide range of bandwidths, and it would not be very defensible in the first place to do RD estimation with a bandwidth as high as 0.55, where we are using data that is very far indeed from the cutoff. Overall, if you saw this plot in a published paper you should feel reasurred that the author’s results are not driven primarily by the choice of threshold.