Introduction

Welcome to the course website dedicated to the POLS0013 module Measurement in Data Science! On this site, you will find the lecture slides as well as the seminar tasks. You can navigate to each week’s material by clicking on the specific sidebar menu link. Please do let me know if you are struggling to access anything. Below you will find information related to the organisational aspects of the course. These are also covered in the intro lecture video.

Course Description

This module is designed for third year students in the undergraduate degrees in Philosophy, Politics and Economics, in Geography, in Population Health and in Social Sciences with the focus on Social Data Science. It therefore assumes that you are familiar with the material in the required first and second year modules of those programmes. These preceding modules cover basic quantitative analysis, sampling, linear regression, regression models for binary and categorical outcomes (especially logistic regression), panel data, multilevel models and some quantitative text analysis. The module is now also open to third year students in the undergraduate degree in Politics and International Relations.

This module is fundamentally about the task of connecting the data that we use in quantitative analyses with the social science concepts that we are interested in making claims about. The processes of conceptualisation and measurement are sometimes used to distinguish between two parts of what we will be studying. In his book “Conceptualization and Measurement in the Social Sciences”, Hubert Blalock writes “Conceptualization involves a series of processes by which theoretical constructs, ideas, and concepts are clarified, distinguished, and given definitions that make it possible to reach a reasonable degree of consensus and understanding of the theoretical ideas we are trying to express.” (Blalock 1982, 11) “By measurement, we refer to the general process through which numbers are assigned to objects in such a fashion that it is also understood just what kinds of mathematical operations can legitimately be used, given the nature of the physical operations that have been used to justify or rationalize this assignment of numbers to objects.”

Measurement is important whether we are making causal claims or descriptive claims, at least if we want the evidence we collect to speak to underlying concepts of theoretical interest. It is common, but unfortunate, to have “slippage” between the analysis that researchers have done and how they talk about it. It is always tempting to report that you have shown something about a grand social scientific concept, conveniently forgetting that all you have actually done is demonstrate something about your dubious measure of that concept. A module on causal inference teaches you about a similar sort of slippage, whereby analyses that do not justify causal claims are discussed as if they do.

Note that the material of this module, like that of causal inference modules, does not always involve new estimation techniques. It often just involves careful thinking about which analyses to do, some of which may prove to be very simple. Some of the techniques covered in this module involve applying familiar regression analyses to data in a particular way so as to solve a measurement problem, much like several “causal inference methods” involve applying familiar regression analyses to data in a particular way (e.g. regression discontinuity designs and differences-in-differences). While some of the topics will involve new estimators and models, not all of them will. At several points in the module, we will define a problem, observe that if we had some types of data we could solve the measurement problem with a regression model, but that if we had different types of data, we would need a new method. As with causal inference, a lot of the core intellectual content here is figuring out what is possible and sensible with the data you have, not necessarily learning some fancier model that will magically make the limitations of the data go away.

Teaching Delivery

I recommend that you try to structure your weekly workflow as follows:

  1. Complete the required reading
  2. Attend the lecture
  3. Attempt the seminar assignments
  4. Attend the seminar
  5. Go back through the seminar assignment
  6. Throughout the above, make a note of anything that is still unclear. If it is not clarified by any of the above, send a question via the Moodle Forum and I will seek to answer it in the next lecture.

Lectures

Lectures will take place on Tuesdays, 9-11am. The lecture slides will be available to download before the lecture, on this website in the tab dedicated to the relevant week.

Seminars

The one-hour seminars will take place in person on Tuesday afternoons.

Please try to stick with your assigned seminar slot, so that student numbers stay evenly distributed across the groups. If this is not possible, you can ask the Political Science undergraduate admin team (polsci.ug@ucl.ac.uk) for help. Note that I cannot help you with timetabling issues.

For each seminar, there is a task sheet with questions for you to work on during your seminar. You will find these on this site on the page dedicated to each week’s material. Please try to look at the seminar task ahead of time and attempt to complete it, so you can ask your seminar leader more specific questions. The solutions will be made visible on the day after the seminars.

Student Support and Feedback Hours

Tuesdays 11.30am-12.30pm, Wednesdays 11.30am-12.30pm; Book via this link

Assessment

Students will write a 1500-word “essay” and submit a 1500-word “coursework”. The essay involves finding and critiquing a pre-existing measure of a social science concept. The coursework involves completing a series of prompts that require data analysis on a provided dataset, in the style of the weekly homework assignments.

  • Assessment part I (“essay”) due 10th December 2024
  • Assessment part II (“coursework”) due 14th January 2025

Please remember that plagiarism is taken extremely seriously and can disqualify you from the module. If you are in doubt about what constitutes plagiarism, ask the tutor.

You can download the instructions for the first part of the assessment (research report on a measure) below.

Resources

Readings

This module combines a range of material that is not traditionally taught together (although it should be). As a result, Prof. Ben Lauderdale is in the process of writing a textbook for the module, Pragmatic Social Measurement, which provides most of the required readings for the module. The latest draft version of this book is available here:

You will find it useful to consult additional reference materials, and there is no shortage of possible references for most of the individual topics covered in the module. For each week in the schedule below, I list further readings on the theoretical ideas covered that week as well as applications of the resulting measurement techniques.

As a supplemental resource for several of the topics in the module, I highly recommend “An Introduction to Statistical Learning” (2nd Edition) by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. The PDF of the textbook is freely available for download. We will not cover all the topics in that book, because it aims to provide an introduction to “statistical learning” aka machine learning, and our aim in this module is to study measurement. Nonetheless, the topics in that book that are not covered in this module are well worth your time.

Software

Throughout the course we will use the free and open source statistical analysis software R.

Before the course starts, you can and should download and install (the latest version of) R on your personal computer. You should also download and install RStudio, which is a user interface to R.

Please ensure that both R and RStudio are installed on your personal computer before the first lecture. This is recommended over using the UCL RStudio Server, which is only accessible online. UCL machines, whether virtual or on campus, will already have this software installed.

Academic Freedom and Intellectual Property

Academic freedom is the cornerstone of university research and teaching, so that all university staff, speakers, and students can freely explore questions and ideas and challenge perceived views and opinions, without being censored or harassed by a government, any state authorities, the University, other students, or external pressure groups. As part of the UCL academic community, all staff, speakers, and students share these responsibilities:

  • Everyone must respect freedom of thought and freedom of expression. Your lecturer will not limit what can be discussed in the seminar, as long as it is relevant to the subject. They will not censor any topics, and they will expose you to controversial issues, questions, facts, views, and debates.
    • You may disagree with some facts or views that you read or hear in the classroom. You are encouraged to engage with these facts and views in a respectful manner.
    • Your lecturer will not penalise you merely for expressing views they or other students disagree with. However, they will expect you to present logical arguments supported by evidence.
  • You are explicitly prohibited from recording, publishing, distributing or transferring any class material/content, in whole or in part, in any format, to any individual or entity outside the module, linking to or posting it online (including social media), or making it otherwise available to any person or entity outside the module, unless you have received prior specific written approval from the module leader. You are also explicitly prohibited from aiding or abetting in any of these actions. Similarly, your lecturer will not record, publish or distribute seminar sessions without the explicit consent of the participants.
  • By agreeing to take this module, you agree to abide by these terms. If you do not comply with these terms, you will potentially be subject to disciplinary actions similar to those under violations of the university Student Code of Conduct.

Last Updated: 21 Nov 2024 9:11 AM GMT


References

Blalock, Hubert M. 1982. Conceptualization and Measurement in the Social Sciences.