6 Synthetic Control


6.1 Lecture review and reading

Synthetic control methods are a relatively new addition to the roster of causal inference techniques used in applied political science work. Because of this, there is no standard textbook treatment of these methods, at least that I am aware of. The best place to start reading is therefore this article by Abadie et al. (2015) in the AJPS (you will recognise the Germany example from the lecture). The same authors also wrote a fuller exposition of the method in this paper, which goes into more technical detail on the estimation strategy.

For applications, the one we will focus on throughout the seminar is this paper by David Hope, who is at King's College London. It may be inspirational to know that David started the project that resulted in this paper in a causal inference class very similar to the one you are currently taking! In fact, the initial version of this paper can be found on the course assessment guidelines page as an example of what you might do for a final project on this course. Another nice example of the synthetic control method is this paper by Benjamin Born and coauthors: in this study, the authors use synthetic control to estimate the costs of the Brexit vote to the UK economy.

6.2 Seminar

The seminar this week is devoted to learning how to use the Synth package in R. This package has been developed to make it easier to implement synthetic control designs, though as you will see it does have a somewhat idiosyncratic coding style. You will need to install the package and load it as we have done in previous weeks:

#install.packages("Synth") # Remember that you only need to install the package once
library(Synth)

6.2.1 Data

We will use one main dataset this week:

  1. Hope (2016)

6.2.2 The effect of Economic and Monetary Union on Current Account Balances – Hope (2016)

In early 2008, about a decade after the Euro was first introduced, the European Commission published a document looking back at the currency’s short history and concluded that the European Economic and Monetary Union was a “resounding success”. By the end of 2009 Europe was at the beginning of a multiyear sovereign debt crisis, in which several countries – including a number of Eurozone members – were unable to repay or refinance their government debt or to bail out over-indebted banks. Although the causes of the Eurocrisis were many and varied, one aspect of the pre-crisis era that became particularly damaging after 2008 was the large and persistent current account deficits of many member states. Current account imbalances – which capture the inflows and outflows of both goods and services and investment income – were a marked feature of the post-EMU, pre-crisis era, with many countries in the Eurozone running persistent current account deficits (indicating that they were net borrowers from the rest of the world). Large current account deficits make economies more vulnerable to external economic shocks because of the risk of a sudden stop in the capital used to finance government deficits.

In his recent article, David Hope investigates the extent to which the introduction of the Economic and Monetary Union in 1999 was responsible for the current account imbalances that emerged in the 2000s. Using a synthetic control method, Hope evaluates the causal effect of EMU on current account balances in 11 countries between 1980 and 2010. In this exercise, we will focus on just one country – Spain – and evaluate the causal effect of joining EMU on the Spanish current account balance. Of the \(J\) countries in the sample, therefore, \(j = 1\) is Spain, and \(j=2,...,16\) will represent the “donor” pool of countries. In this case, the donor pool consists of 15 OECD countries that did not join the EMU: Australia, Canada, Chile, Denmark, Hungary, Israel, Japan, Korea, Mexico, New Zealand, Poland, Sweden, Turkey, the UK and the US.

The hope_emu.csv file contains data on these 16 countries across the years 1980 to 2010. The data includes the following variables:

  1. period – the year of observation
  2. country_ID – the country of observation
  3. country_no – a numeric country identifier
  4. CAB – current account balance
  5. GDPPC_PPP – GDP per capita, purchasing power adjusted
  6. invest – Total investment as a % of GDP
  7. gov_debt – Government debt as a % of GDP
  8. openness – trade openness
  9. demand – domestic demand growth
  10. x_price – price level of exports
  11. gov_deficit – Government primary balance as a % of GDP
  12. credit – domestic credit to the private sector as a % of GDP
  13. GDP_gr – GDP growth %

Use the read.csv function to load the downloaded data into R now. For this assignment, we will need the qualitative variables to be stored as character variables, rather than the factor encoding that R used by default before version 4.0.0. For this reason, we will set the stringsAsFactors argument in the read.csv function to FALSE.

emu <- read.csv("hope_emu.csv", stringsAsFactors = FALSE)
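If you would like to see what the stringsAsFactors argument actually changes, the following is a minimal sketch. It does not use hope_emu.csv: the csv_text object and its values are made up for illustration, and the text argument of read.csv is used only so that the example runs without a file on disk.

```r
# Toy CSV text standing in for hope_emu.csv (values are made up for illustration)
csv_text <- "period,country_ID,country_no,CAB
1980,ESP,1,-2.1
1981,ESP,1,-2.4
1980,AUS,2,-2.6
1981,AUS,2,-4.3"

# Read the same text twice, changing only the stringsAsFactors argument
toy_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)
toy_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(toy_chr$country_ID) # "character"
class(toy_fct$country_ID) # "factor"
```

As far as I can tell from its help file, dataprep() expects the unit names to be a character variable, which is why we want the first of these two encodings.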

Question 1. Plotting Spain’s current account balance

Plot the trajectory of the Spanish current account balance over time in red. Add other lines to the plot for the current account balance for 3 other countries (using the lines() function). Plot an additional dashed vertical line in 1999 to mark the introduction of the EMU (use the abline function, setting the v argument to the appropriate number). Would you be happy using any of them on their own as the control group?

Reveal answer

plot(x = emu[emu$country_ID == "ESP",]$period,
     y = emu[emu$country_ID == "ESP",]$CAB,
     type = "l",
     xlab = "Year",
     ylab = "Current Account Balance",
     col = "red",
     lwd = 3,
     frame.plot = FALSE, # Frame.plot tells R whether we want a box around our plot
     ylim = range(emu$CAB)) # Because we are plotting multiple lines, we need to manually set the y-axis limits (here I am just using the range of the entire data)
lines(x = emu[emu$country_ID == "USA",]$period,
      y = emu[emu$country_ID == "USA",]$CAB,
      col = "orange")
lines(x = emu[emu$country_ID == "GBR",]$period,
      y = emu[emu$country_ID == "GBR",]$CAB,
      col = "blue")
lines(x = emu[emu$country_ID == "JPN",]$period,
      y = emu[emu$country_ID == "JPN",]$CAB,
      col = "darkgreen")
abline(v = 1999, 
       lty = 2) # lty specifies the line type (1 is solid, 2 dashed, 3 dotted, etc)
legend("topleft",
       legend = c("ESP","USA", "GBR", "JPN"),
       col = c("red", "orange", "blue", "darkgreen"),
       lty = 1,
       lwd = 2)

None of these individual countries is a perfect approximation to the pre-treatment trend for Spain, although the US and the UK lines are clearly closer than the Japanese line. The goal of the synthetic control analysis is to create a weighting scheme which, when applied to all countries in the donor pool, creates a closer match to the pre-intervention treated unit trend than any of the individual countries do alone.
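To make the idea of a weighting scheme concrete, here is a small sketch with invented numbers. The treated series, the three donors (A, B and C) and the weights are all hypothetical, and the weights are chosen by hand rather than estimated; the point is only that a weighted average of donors can track the treated unit more closely than any single donor does.

```r
# Toy pre-treatment outcomes for the treated unit (made-up numbers)
treated <- c(-1.0, -2.0, -3.0, -2.0)

# Donor outcomes: one column per donor unit, one row per period
donors <- cbind(A = c(0.0, -1.0, -2.0, -1.0),
                B = c(-2.0, -3.0, -4.0, -3.0),
                C = c(3.0, 2.5, 3.5, 3.0))

# Hypothetical weights: non-negative and summing to one
w <- c(A = 0.5, B = 0.5, C = 0)

# The synthetic unit is the weighted average of the donors in each period
synthetic <- as.numeric(donors %*% w)

# Root mean squared gap between two series
rmse_gap <- function(x, y) sqrt(mean((x - y)^2))

fit_synthetic <- rmse_gap(treated, synthetic)           # fit of the weighted combination
fit_donors <- apply(donors, 2, rmse_gap, y = treated)   # fit of each donor on its own

fit_synthetic
fit_donors
```

In the real analysis we do not pick the weights by hand, of course: the synth() function searches for them.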

Question 2. Preparing the synthetic control

The Synth package takes data in a somewhat unusual format. The main function we will use to get our data.frame into the correct shape is the dataprep() function. Look at the help file for this function using ?dataprep. You will see that this function requires us to correctly specify a number of different arguments. I have summarised the main arguments you will need to use in the table below:

Argument | Description
foo | The data.frame that we want to use for the analysis
predictors | A vector of names for the covariates we would like to use to estimate the model. You will need to use the c() function, and enter all the variable names that you will be using.
dependent | The name of the dependent variable in the analysis (here, "CAB")
unit.variable | The name of the variable that identifies each unit (must be numeric; here, "country_no")
unit.names.variable | The name of the variable that contains the name for each unit (here, "country_ID")
time.variable | The name of the variable that identifies each time period (must be numeric; here, "period")
treatment.identifier | The identifying number of the treatment unit (must correspond to the value for the treated unit in unit.variable)
controls.identifier | The identifying numbers of the control units (must correspond to the values for the control units in unit.variable)
time.predictors.prior | A vector indicating the pre-treatment time periods over which the predictors are averaged
time.optimize.ssr | A vector indicating the pre-treatment time periods over which the fit between the outcome of the treated unit and the outcome of the synthetic control is optimised
time.plot | A vector indicating the time periods (before and after the treatment) to be included in the plots

Reveal answer

  dataprep_out <- dataprep(foo = emu,
                           predictors = c("GDPPC_PPP","openness","demand","x_price","GDP_gr", 
                                          "invest", "gov_debt", "gov_deficit", "credit", "CAB"),
                           dependent = "CAB",
                           unit.variable = "country_no",
                           time.variable = "period",
                           treatment.identifier = 1, # 1 is Spain
                           controls.identifier = c(2:16),
                           time.predictors.prior = c(1980:1998),
                           time.optimize.ssr = c(1980:1998),
                           unit.names.variable = "country_ID",
                           time.plot = 1980:2010
                           )
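Because dataprep() is fussy about variable types, it can save some frustration to check the data before passing it in. The sketch below runs these checks on a toy data.frame with made-up values (the toy object and its two-variable predictors vector are stand-ins, not the seminar data); you could run the same lines on emu.

```r
# Toy stand-in for the emu data.frame (made-up values)
toy <- data.frame(period = rep(1980:1981, times = 2),
                  country_ID = rep(c("ESP", "AUS"), each = 2),
                  country_no = rep(1:2, each = 2),
                  CAB = c(-2.1, -2.4, -2.6, -4.3),
                  invest = c(24.0, 23.1, 26.5, 27.0),
                  stringsAsFactors = FALSE)

predictors <- c("invest", "CAB")

# Each of these should return TRUE before the data is passed to dataprep()
is.numeric(toy$country_no)       # unit.variable must be numeric
is.numeric(toy$period)           # time.variable must be numeric
is.character(toy$country_ID)     # unit.names.variable should be character
all(predictors %in% names(toy))  # every predictor must be a column in the data

# It is also worth confirming that there is exactly one row per unit and period
balanced <- all(table(toy$country_no, toy$period) == 1)
balanced
```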

Question 2. Estimating the synthetic control

Fortunately, though getting the data in the prep function correctly can be a pain, estimating the synthetic control is very straightforward. Use the synth() function on the dataprep_out object that you just created, remembering to assign the output to a new object.

Reveal answer

synth_out <- synth(dataprep_out)

X1, X0, Z1, Z0 all come directly from dataprep object.


**************** 
 searching for synthetic control unit  
 

**************** 
**************** 
**************** 

MSPE (LOSS V): 1.343137 

solution.v:
 0.1650376 0.1578517 0.1158127 0.2623553 0.1052685 0.000665803 0.05772119 6.96e-07 0.09755469 0.03773179 

solution.w:
 0.185021 0.002046917 0.003221188 0.00192881 0.00376691 0.001284599 0.1465354 0.002916178 0.1873096 0.004514661 0.04568778 0.0008294113 0.02171683 0.3931764 4.43432e-05 

Question 3. Plotting the results

Use synth’s path.plot() and gaps.plot() functions to produce plots which compare Spain’s actual current account balance trend to that of the synthetic Spain you have just created. These functions take two main arguments, and then some additional styling arguments to make the plot look nice. Look at the help file to figure out what goes where!

Interpret these plots. What do they suggest about the effect of the introduction of EMU on the Spanish current account balance?

Reveal answer

path.plot(synth.res = synth_out,
          dataprep.res = dataprep_out,
          Xlab = "Time",
          Ylab = "Current account balance",
          Legend = c("Spain", "Synthetic Spain"),
          Ylim = c(-10,5),
          tr.intake = 1999)

gaps.plot(synth.res = synth_out,
          dataprep.res = dataprep_out,
          Xlab = "Time",
          Ylab = "Current Account Balance Difference (Real - Synthetic Spain)",
          tr.intake = 1999)

The synthetic version of Spain provides a reasonably good approximation to the pre-treatment trend of Spain, as there are only small differences in the Current Account Balance between real Spain and synthetic Spain before 1999.

In addition, it is clear that the trajectory of Spain and its synthetic control diverge significantly after the EMU is introduced in 1999. In particular, the actual Spanish current account balance deteriorated much more than the current account balances of the synthetic control unit in the post-EMU period. This therefore provides some empirical support for the hypothesis that the introduction of the EMU caused the current account balances of Spain to deteriorate.

Question 3. Interpreting the synthetic control unit

A crucial strength of the synthetic control approach is that it allows us to be very transparent about the comparisons we are making when making causal inferences. In particular, we know that the synthetic Spain that we created in question 2 is a weighted average of the 15 OECD non-EMU countries in our data. Let’s practice some of this transparency now by reporting the estimated vector of country weights in a nice table.

Look in the help file for ?synth, and read the “Value” section of that page. The value section will tell you all of the things that are returned by a function. You can access them by using the dollar sign operator that we have used in the past to extract variables from a data.frame.

What are the top five countries contributing to synthetic Spain?

Reveal answer

# The country weights are stored in the following object
synth_out$solution.w
       w.weight
2  1.850210e-01
3  2.046917e-03
4  3.221188e-03
5  1.928810e-03
6  3.766910e-03
7  1.284599e-03
8  1.465354e-01
9  2.916178e-03
10 1.873096e-01
11 4.514661e-03
12 4.568778e-02
13 8.294113e-04
14 2.171683e-02
15 3.931764e-01
16 4.434319e-05
# The following code can be used to find the names that those codes refer to
unique(emu[,c("country_ID","country_no")])
    country_ID country_no
1          ESP          1
32         AUS          2
63         CAN          3
94         CHL          4
125        DNK          5
156        HUN          6
187        ISR          7
218        JPN          8
249        KOR          9
280        MEX         10
311        NZL         11
342        POL         12
373        SWE         13
404        TUR         14
435        GBR         15
466        USA         16

As the table shows (I have constructed this table using the information from the lines of code above), the main contributors to synthetic Spain are the United Kingdom, Mexico, Australia and Japan, with a smaller contribution from Poland.

Country Weight
GBR 0.393
MEX 0.187
AUS 0.185
JPN 0.147
POL 0.046
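One way to build a table like this without reading the numbers off by eye is sketched below. To keep the example self-contained, the weights and country codes are typed in directly from the output printed above rather than extracted from synth_out and emu; the object names (w, countries, weights_table, top_five) are my own.

```r
# Country weights copied from the synth_out$solution.w output above
w <- c(1.850210e-01, 2.046917e-03, 3.221188e-03, 1.928810e-03, 3.766910e-03,
       1.284599e-03, 1.465354e-01, 2.916178e-03, 1.873096e-01, 4.514661e-03,
       4.568778e-02, 8.294113e-04, 2.171683e-02, 3.931764e-01, 4.434319e-05)

# Country codes for units 2 to 16, copied from the lookup table above
countries <- c("AUS", "CAN", "CHL", "DNK", "HUN", "ISR", "JPN", "KOR",
               "MEX", "NZL", "POL", "SWE", "TUR", "GBR", "USA")

weights_table <- data.frame(country = countries,
                            weight = round(w, 3),
                            stringsAsFactors = FALSE)

# Sort by weight (largest first) and keep the top five rows
top_five <- head(weights_table[order(weights_table$weight, decreasing = TRUE), ], 5)
top_five
```

Note that the weights sum to one (up to rounding), as they should for a weighted average.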

Question 4. Estimate a placebo synthetic control treatment effect

One way to check the validity of the synthetic control is to estimate “placebo” effects – i.e. effects for units that were not exposed to the treatment. Do this now for Australia (which did not join EMU in 1999). Of course, in constructing synthetic Australia, we must exclude Spain – the actual treatment unit – from the analysis. Before you repeat the steps above for Australia, create a new data.frame that doesn’t include the Spanish observations.

What does the estimated treatment effect for Australia tell you about the validity of the design for estimating the treatment effect of the EMU on the Spanish current account balance? Compare the treatment effects from the Australian synthetic control analysis and the Spanish synthetic control analysis in terms of the pre- and post-treatment root mean square error values.

Reveal answer

# Exclude Spanish observations

emu_australia <- emu[emu$country_ID != "ESP",]

# Prepare the data for Australia

dataprep_out_australia <- dataprep(foo = emu_australia,
                           predictors = c("GDPPC_PPP","openness","demand","x_price","GDP_gr", 
                                          "invest", "gov_debt", "gov_deficit", "credit", "CAB"),
                           dependent = "CAB",
                           unit.variable = "country_no",
                           time.variable = "period",
                           treatment.identifier = 2, # 2 is Australia
                           controls.identifier = c(3:16), # Excluding Australia from the donor pool
                           time.predictors.prior = c(1980:1998),
                           time.optimize.ssr = c(1980:1998),
                           unit.names.variable = "country_ID",
                           time.plot = 1980:2010
                           )


# Estimate the new synthetic control
synth_out_australia <- synth(dataprep_out_australia)

X1, X0, Z1, Z0 all come directly from dataprep object.


**************** 
 searching for synthetic control unit  
 

**************** 
**************** 
**************** 

MSPE (LOSS V): 2.447927 

solution.v:
 0.4628641 0.1527368 0.0004101862 0.01080804 0.002944383 0.01435065 0.003140863 0.1457437 0.1452918 0.06170942 

solution.w:
 5.7e-09 3.25e-08 1.44684e-05 0.05506486 2.49e-08 9e-09 7.33e-08 0.04292471 0.5278432 2.023e-07 1.33e-08 3.522e-07 4.01e-08 0.3741521 
# Plot the results

path.plot(synth.res = synth_out_australia,
          dataprep.res = dataprep_out_australia,
          Xlab = "Time",
          Ylab = "Current account balance",
          Legend = c("Australia", "Synthetic Australia"),
          Ylim = c(-10,5),
          tr.intake = 1999)

The placebo test here supports the inferences drawn from the main synthetic control analysis. There is clearly no effect of the introduction of EMU on the current account balance of Australia. Of course, full permutation inference would require re-estimating the synthetic control for every unit in the donor pool, not just Australia, and comparing the distribution of these placebo treatment effects to the treatment effect for Spain. In the homework, you will be asked to complete this analysis.

We can also calculate the root mean squared prediction error (RMSE) for the pre- and post-intervention periods for both Australia and Spain. Recall that the RMSE measures the size of the gap between the outcome of interest in each country and its synthetic counterpart. Large values of the ratio of the post-treatment RMSE to the pre-treatment RMSE provide evidence that the treatment effect is large. (We take the ratio of these measures because a large post-treatment RMSE is not itself sufficient evidence of a large treatment effect: the synthetic control may simply be a poor approximation to the unit of interest. We account for the quality of the synthetic control unit by dividing the post-treatment RMSE by the pre-treatment RMSE.)
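Written out, the statistic looks as follows. The notation here is my own shorthand rather than anything taken from the paper: \(Y_{jt}\) is the observed outcome for unit \(j\) in period \(t\), \(\hat{Y}_{jt}\) is the outcome of its synthetic control, \(T_0\) is the number of pre-treatment periods, and \(T\) is the total number of periods.

\[
r_j = \frac{\sqrt{\frac{1}{T - T_0}\sum_{t = T_0 + 1}^{T}\left(Y_{jt} - \hat{Y}_{jt}\right)^2}}{\sqrt{\frac{1}{T_0}\sum_{t = 1}^{T_0}\left(Y_{jt} - \hat{Y}_{jt}\right)^2}}
\]

In our data, the pre-treatment period runs from 1980 to 1998, so \(T_0 = 19\), and the full period runs from 1980 to 2010, so \(T = 31\).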

# Define function for calculating the RMSE
rmse <- function(x,y){
  sqrt(mean((x - y)^2))
}

# Define vector for pre/post-intervention subsetting

pre_intervention <- c(1980:2010) < 1999

## Spain

# Extract the weights for synthetic Spain
spain_weights <- synth_out$solution.w

# Calculate the outcome for synthetic Spain using matrix multiplication
synthetic_spain <- as.numeric(dataprep_out$Y0plot %*% spain_weights)

# Extract the true outcome for Spain
true_spain <- emu[emu$country_ID == "ESP",]$CAB

# Calculate the RMSE for the pre-intervention period for Spain

pre_rmse_spain <- rmse(x = true_spain[pre_intervention], y = synthetic_spain[pre_intervention])

# Calculate the RMSE for the post-intervention period for Spain

post_rmse_spain <- rmse(x = true_spain[!pre_intervention], y = synthetic_spain[!pre_intervention])

post_rmse_spain/pre_rmse_spain
[1] 3.794214
## Australia

# Extract the weights for synthetic Australia
australia_weights <- synth_out_australia$solution.w

# Calculate the outcome for synthetic Australia using matrix multiplication
synthetic_australia <- as.numeric(dataprep_out_australia$Y0plot %*% australia_weights)

# Extract the true outcome for Australia
true_australia <- emu_australia[emu_australia$country_ID == "AUS",]$CAB

# Calculate the RMSE for the pre-intervention period for Australia

pre_rmse_australia <- rmse(x = true_australia[pre_intervention], y = synthetic_australia[pre_intervention])

# Calculate the RMSE for the post-intervention period for Australia

post_rmse_australia <- rmse(x = true_australia[!pre_intervention], y = synthetic_australia[!pre_intervention])

post_rmse_australia/pre_rmse_australia
[1] 0.7680724

The ratio of the RMSEs is much larger for Spain than for Australia, confirming the insight we took from the plots: the (null) placebo effect we estimated for Australia gives additional strength to our conclusion about the treatment effect we estimated for Spain.

6.3 Homework

In the full paper linked to on the reading list (and above), Hope conducts the synthetic control analysis for several countries, not just for Spain. One particularly interesting case that he evaluates is Austria. For this week’s homework, your task is to replicate the analysis we have just completed but this time using Austria, and not Spain, as the unit of interest. You will notice that the data provided for the seminar does not include any information about Austria, but you can download an additional, part-completed, dataset here:

  1. Hope (2016) additional Austria data

You will also notice that this additional data is missing one crucial variable: the outcome. Because you do not have any outcome variable here for the new treated unit, you will need to collect this yourself. In Hope’s paper, he suggests that the data on each country’s current account balance (measured as a % of GDP) can be found in the IMF World Economic Outlook Database, October 2015. You should be able to find the relevant data at this link. Your first task is to retrieve this information for Austria for the years 1980 to 2010, and to include that information in the Austria data set.

Once you have found the data and entered it into your downloaded csv file, you need to load both the Austria data and the main dataset from the seminar and combine them. To do so, you could use the rbind function that we introduced in last week’s homework.

Solution

emu <- read.csv("hope_emu.csv", stringsAsFactors = FALSE)

austria <- read.csv("hope_emu_austria.csv", stringsAsFactors = FALSE)
## Stack the data.frames on top of one another
emu_new <- rbind(emu, austria)
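Since you are entering the Austrian outcome data by hand, it is worth checking the two data.frames before and after stacking them. The sketch below shows the kind of checks I have in mind; main_toy and austria_toy are tiny stand-ins with made-up values (the real files have many more columns and rows), so treat the numbers as placeholders.

```r
# Toy stand-ins for the two files (made-up values, not the real data)
main_toy <- data.frame(period = c(1980, 1981),
                       country_ID = c("ESP", "ESP"),
                       country_no = c(1, 1),
                       CAB = c(-2.1, -2.4),
                       stringsAsFactors = FALSE)

austria_toy <- data.frame(period = c(1980, 1981),
                          country_ID = c("AUT", "AUT"),
                          country_no = c(17, 17),
                          CAB = c(-3.0, -2.5),
                          stringsAsFactors = FALSE)

# rbind() on data.frames needs matching column names, so check them first
same_columns <- setequal(names(main_toy), names(austria_toy))
same_columns

combined <- rbind(main_toy, austria_toy)

# No missing outcomes should remain once the Austrian CAB has been entered
anyNA(combined$CAB)

# Each country should have the same number of rows
table(combined$country_ID)
```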

Question 1. Synthetic control for Austria

You should now re-estimate the synthetic control method, this time using Austria as the unit of interest. Remember to adjust the various parameters in the dataprep function (you must ensure that you are not using Spain as a part of the donor pool). Then answer the following questions:

  1. Which 5 countries receive the highest weight as a part of synthetic Austria?

Solution

  ## Prepare data

  dataprep_out_austria <- dataprep(foo = emu_new,
                           predictors = c("GDPPC_PPP","openness","demand","x_price","GDP_gr", 
                                          "invest", "gov_debt", "gov_deficit", "credit", "CAB"),
                           dependent = "CAB",
                           unit.variable = "country_no",
                           time.variable = "period",
                           treatment.identifier = 17, # 17 is Austria
                           controls.identifier = c(2:16), # Do not include 1, which is Spain
                           time.predictors.prior = c(1980:1998),
                           time.optimize.ssr = c(1980:1998),
                           unit.names.variable = "country_ID",
                           time.plot = 1980:2010
                           )

  ## Estimate synthetic control

  synth_out_austria <- synth(dataprep_out_austria) 

X1, X0, Z1, Z0 all come directly from dataprep object.


**************** 
 searching for synthetic control unit  
 

**************** 
**************** 
**************** 

MSPE (LOSS V): 1.49337 

solution.v:
 3.8745e-06 0.0001221313 0.01294896 0.001467821 0.05974939 0.4407692 0.03466163 0.0002523227 0.01753072 0.432494 

solution.w:
 0.2134394 4.74e-06 1.374e-06 3.1118e-05 0.167978 1.3167e-06 0.3904238 0.02731565 1.30476e-05 4.591e-06 0.2006311 7.96501e-05 5.9219e-06 6.51496e-05 5.0828e-06 
  # Retrieve country weights
  synth_out_austria$solution.w
       w.weight
2  2.134394e-01
3  4.740009e-06
4  1.373976e-06
5  3.111801e-05
6  1.679780e-01
7  1.316707e-06
8  3.904238e-01
9  2.731565e-02
10 1.304761e-05
11 4.590981e-06
12 2.006311e-01
13 7.965012e-05
14 5.921851e-06
15 6.514957e-05
16 5.082755e-06
  # Retrieve country IDs
  unique(emu_new[,c("country_ID","country_no")])
    country_ID country_no
1          ESP          1
32         AUS          2
63         CAN          3
94         CHL          4
125        DNK          5
156        HUN          6
187        ISR          7
218        JPN          8
249        KOR          9
280        MEX         10
311        NZL         11
342        POL         12
373        SWE         13
404        TUR         14
435        GBR         15
466        USA         16
497        AUT         17
Country Weight
JPN 0.39
AUS 0.21
POL 0.20
HUN 0.17
KOR 0.03

The main contributors to synthetic Austria are Japan, Australia, Poland, Hungary, and Korea.

  2. Produce a plot which compares the current account balance of Austria to that of synthetic Austria. How does this compare to the equivalent plot of Spain?

Solution

  # Plot the results

  gaps.plot(synth.res = synth_out_austria,
          dataprep.res = dataprep_out_austria,
          Xlab = "Time",
          Ylab = "Difference in current account balance (Real - Synthetic)",
          Ylim = c(-5,10),
          tr.intake = 1999)

In contrast to the Spanish case, the Austrian synthetic control demonstrates a worse current account balance than real Austria in the post-EMU period. These results imply that the EMU improved the current account position of Austria and worsened the current account position of Spain.

  3. Retrieve the vector of weights assigned to each of the variables used to construct synthetic Austria. Which variables contribute most to the synthetic control? (Hint: look at the “value” section of the synth() function help file if you do not know where to find these values.)

Solution

  round(sort(synth_out_austria$solution.v, decreasing = TRUE),2)
     invest  CAB GDP_gr gov_debt credit demand x_price gov_deficit
BFGS   0.44 0.43   0.06     0.03   0.02   0.01       0           0
     openness GDPPC_PPP
BFGS        0         0

The variable weights can be found in the solution.v element of the synth_out_austria object. Here, I am using the sort() function to sort the weights in decreasing order of size, and then am rounding the weights to two decimal places to make them more interpretable.

The largest two weights are assigned to the investment predictor and the current account balance predictor, with smaller weights placed on GDP growth, government debt, and domestic credit. Note that it is very common for the pre-treatment values of the dependent variable to be upweighted in the construction of the synthetic control, because the algorithm aims to produce a close fit between the synthetic unit and the treatment unit in the pre-treatment period.

Question 2. (Difficult question) Permutation inference

Conduct full permutation inference by estimating placebo treatment effects for all of the control units, and comparing them to the actual estimated effect for Spain (and/or Austria). You should calculate the RMSE ratios for each unit, and provide a plot summarising these statistics.

Hint: rather than writing many, many lines of code, try using a for() loop, where you iterate over countries with the same lines of code. For example, if I wanted to calculate the mean current account balance for each country in the data, instead of writing the following:

esp_mean <- mean(emu[emu$country_no == 1,]$CAB)
aus_mean <- mean(emu[emu$country_no == 2,]$CAB)
can_mean <- mean(emu[emu$country_no == 3,]$CAB)

and so on, I could instead use:

cab_means <- rep(NA, 16)
for(i in 1:16){

  cab_means[i] <- mean(emu[emu$country_no == i,]$CAB)

}

Where I am creating a vector to store all the means I am calculating, and then I loop over the country identifiers (from 1 to 16), using the i indicator to subset the data on each iteration.

You can use this basic structure to loop over each country in the donor pool, construct the synthetic control for that country on each iteration, calculate the RMSE ratio, and then save the result.

If you manage to successfully complete this task, please email me your plot of the resulting RMSE ratios for each country. I promise to be very impressed!

Solution

## Create a vector to store the RMSE ratios
rmse_ratio_vec <- rep(NA, 16)

# Name the vector with the country names from the dataset
names(rmse_ratio_vec) <- unique(emu$country_ID)

## Assign the Spanish ratio to the vector
rmse_ratio_vec[1] <- post_rmse_spain/pre_rmse_spain

## Exclude Spain from data so that it never enters the donor pool

emu_donors <- emu[emu$country_ID != "ESP",]

## Loop over each country in the donor pool, estimate the synthetic control for that country, calculate the RMSE ratio and assign it to the vector
  for(i in 2:16){

    # Prepare the data for country i
    dataprep_out_i <- dataprep(foo = emu_donors,
                           predictors = c("GDPPC_PPP","openness","demand","x_price","GDP_gr", 
                                          "invest", "gov_debt", "gov_deficit", "credit", "CAB"),
                           dependent = "CAB",
                           unit.variable = "country_no",
                           time.variable = "period",
                           treatment.identifier = i, # Select country i as the treated country
                           controls.identifier = unique(emu_donors$country_no)[unique(emu_donors$country_no)!=i], # Exclude country i from the donor pool
                           time.predictors.prior = c(1980:1998),
                           time.optimize.ssr = c(1980:1998),
                           unit.names.variable = "country_ID",
                           time.plot = 1980:2010
                           )

    # Estimate the new synthetic control for country i
    synth_out_i <- synth(dataprep_out_i)

    # Extract the weights for the synthetic control
    i_weights <- synth_out_i$solution.w

    # Calculate the outcome for the synthetic unit using matrix multiplication
    synthetic_i <- as.numeric(dataprep_out_i$Y0plot %*% i_weights)

    # Extract the true outcome for country i
    true_i <- emu_donors[emu_donors$country_no == i,]$CAB

    # Define the pre and post-intervention periods
    pre_intervention <- c(1980:2010) < 1999
    
    # Calculate the RMSE for the pre-intervention period

    pre_rmse_i <- rmse(x = true_i[pre_intervention], y = synthetic_i[pre_intervention])

    # Calculate the RMSE for the post-intervention period

    post_rmse_i <- rmse(x = true_i[!pre_intervention], y = synthetic_i[!pre_intervention])

    # Assign the RMSE ratios to the vector
    rmse_ratio_vec[i] <- post_rmse_i/pre_rmse_i
    
  }
  # Finally, plot the results!  
  barplot(sort(rmse_ratio_vec), ylab = "Post-intervention RMSE/Pre-intervention RMSE", main = "RMSE ratio",
          las = 2) # las = 2 rotates the country labels so that all 16 fit under the bars

The RMSE ratio is higher when the effect of the EMU on the current account balance is larger. However, the measure also takes into account how well the synthetic control for each country can approximate the pre-EMU trend in current account balances. A large current account balance gap in the post-EMU period is not strong evidence of the EMU having a large effect if the synthetic control unit does not closely match the current account balance of the country of interest in the pre-EMU period. Put another way, a high post-EMU RMSE is not indicative of the EMU having a large effect on the current account balance when the pre-EMU RMSE is also large.

The results clearly show that this test statistic is far larger for Spain than for any of the placebo countries in the donor pool. This implies that the results from the Spanish synthetic control analysis are very unlikely to have been driven by chance.
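A common way to summarise a permutation test like this is with a p-value: the share of units whose RMSE ratio is at least as large as that of the treated unit. The sketch below shows the calculation. The ratios for Spain and Australia are the values we computed in the seminar, but the other four entries are hypothetical placeholders that I have invented so that the example runs on its own; in your own work you would use the full rmse_ratio_vec instead.

```r
# RMSE ratios: ESP and AUS are the values computed in the seminar;
# the remaining entries are hypothetical placeholders, NOT real results
ratios <- c(ESP = 3.794214, AUS = 0.7680724,
            CAN = 1.2, JPN = 0.9, GBR = 1.5, USA = 1.1)

# Share of units whose ratio is at least as large as the treated unit's
p_value <- mean(ratios >= ratios["ESP"])
p_value
```

With the 16 countries in our data, the smallest p-value this procedure can produce is 1/16 = 0.0625, which is what you would obtain if Spain has the largest ratio of all.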