1 Measurement: What, Why and How

Topics: What is measurement? Why is measurement important? Representative versus pragmatic measurement. How are social measurements used and misused?

Required reading:

1.1 Seminar

For this week’s class we will look at existing measures of how “democratic” different countries were in different years (1946–2008). The data set was compiled for a project that combined the information in all of these measures into a single synthetic measure of democracy (Pemstein, Meserve, and Melton 2010). The measures are on different scales, were constructed by different authors according to different coding rules, and cover different sets of countries and years. This assignment is mostly aimed at reminding you how to do data analysis in R.

Remember that .Rdata files are loaded into R with the load() command. You can directly load the data file into R from the web with the following command:

load(url("https://uclspp.github.io/POLS0013/4_data/week-1-democracy.Rdata"))

  1. Before you look at the data set, consider whether you think of countries being democratic as a binary quantity or not. Is it the case that a country is either democratic or not, with no middle ground? Or is it a continuum, with countries varying widely in how democratic they are? Can a country be somewhat democratic?
  1. Load the data file “week-1-democracy.Rdata” into R. The first three variables in the data frame “democracy” are:

If you look at the top of the data file using head(democracy) you will see 12 further variables (in alphabetical order from arat to vanhanen). Each of these corresponds to a different measure of democracy. You might recognise some of the names (e.g. freedomhouse and polity are relatively well known and widely used). The coverage of country-years varies by measure.

For each of the 12 measures, use the data set to calculate the range of scores used for that measure.
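One possible approach is to apply range() to every measure column at once. The sketch below runs on a small made-up data frame called toy, not on the real data, so that it works without a download. The identifier columns id_a and id_b and all the values are hypothetical. It assumes that the measures sit in a contiguous block of columns, as they do in columns 4 to 15 of democracy according to the cor() command used later in this exercise.

```r
# Toy stand-in for the real data: three made-up measures on different scales.
# The identifier columns and all values are hypothetical.
toy <- data.frame(
  id_a = c("A", "A", "B", "B", "C"),       # hypothetical identifier
  id_b = c(2000, 2001, 2000, 2001, 2000),  # hypothetical identifier
  polity = c(-10, -3, 10, 10, NA),
  pacl = c(0, 0, 1, 0, 1),
  freedomhouse = c(1, 2.5, 7, 6, NA)
)

# Keep only the measure columns (columns 3 to 5 in this toy example).
measures <- toy[, 3:5]

# range() returns c(min, max); na.rm = TRUE skips missing observations.
# sapply() applies it column by column and binds the results into a matrix.
score_ranges <- sapply(measures, range, na.rm = TRUE)
rownames(score_ranges) <- c("min", "max")
print(score_ranges)
```

On the real data the same idea would read sapply(democracy[, 4:15], range, na.rm = TRUE). Without na.rm = TRUE, any measure with missing country-years would return NA for both ends of its range.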

  1. In addition to the range of scores, we might want to know whether the distributions of scores look similar for the different measures. Generate histograms of all the scores for each measure. Do they look generally similar or not?
  1. The polity score has integer values from -10 to 10. Calculate the proportion of country-years that are classified as democratic by pacl, among country-years with each value of the polity score.
  1. Plot the results of Q4 by polity score. Describe the association you see between the two scores.
  1. Which country-years are scored 10 by polity but classified as 0 (non-democratic) by pacl? Take the last of these in the data set, and figure out whether the polity or the pacl score changed in the subsequent year. What happened in that country in that year? Hint: There are many ways to achieve this, and it’s always a good idea to get inspiration from the internet.
  1. Plot the trajectory of the polity scores and the pacl scores for the country in question across the full set of years in the data set.
  1. Use the command cor(democracy[,4:15],use = "pairwise.complete.obs") to calculate the correlation table for the 12 measures. You may want to wrap that in a round(x,2) command to make it easier to read. Note that the use = argument is needed because not all the measures are available in all the country-years, so we just calculate correlations between measures for the country-years where both are available. What does a higher correlation mean in this context? What does a low correlation mean in this context?
  1. Overall, do these seem like big disagreements between measures or small disagreements between measures? Are you surprised at how much different measures agree or at how much they disagree?
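Several of the questions above rely on the same few techniques: a proportion within groups, a filter on two conditions, and a correlation table that tolerates missing values. The sketch below illustrates them on a small made-up data frame called toy, whose values are hypothetical and are not taken from the real data. It assumes, as the questions state, that pacl is coded 0/1 and that polity takes integer values.

```r
# Toy stand-in for the real data; all values are made up for illustration.
toy <- data.frame(
  polity = c(-10, -10, 0, 0, 10, 10, 10, NA),
  pacl = c(0, 0, 0, 1, 1, 1, 0, 1),
  freedomhouse = c(1, 2, 4, NA, 7, 6, 7, 5)
)

# Proportion classified as democratic by pacl at each polity value.
# The mean of a 0/1 variable is the proportion of 1s; rows with a
# missing polity score are dropped from the grouping by tapply().
prop_dem <- tapply(toy$pacl, toy$polity, mean, na.rm = TRUE)
print(round(prop_dem, 2))

# Rows where polity gives the top score but pacl says non-democratic.
# which() drops comparisons that evaluate to NA.
disagree <- toy[which(toy$polity == 10 & toy$pacl == 0), ]
print(disagree)

# Pairwise correlations: each pair of measures uses only the rows
# where both measures are observed.
cor_table <- round(cor(toy, use = "pairwise.complete.obs"), 2)
print(cor_table)
```

The names of prop_dem hold the polity values as text, so as.numeric(names(prop_dem)) recovers them for the horizontal axis of a plot. Because each correlation is computed on a different subset of rows, entries in the table are not all based on the same country-years.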

References

Pemstein, Daniel, Stephen A Meserve, and James Melton. 2010. “Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type.” Political Analysis 18 (4): 426–49.