Assessment Guidelines

Students will be evaluated through a 3000-word research paper applying the methods from the course to a research question chosen by the student. The paper includes two parts.

Part 1 (1000 words, 30%)

For the first part of the assessment of this course, you should write a 1000-word review of an existing text-as-data application in a social science paper of your choice. You are free to select any paper that uses any of the quantitative text analysis methods that we cover on the course, including those listed on the syllabus. You will need to provide a full bibliographic reference to the selected paper so that I can refer to it.

The goal of this part of the assessment is to demonstrate that you can think critically about the methods that we have covered on the course and evaluate their use in real-world applications. Consequently, the content of your review should be focused on how the quantitative text analysis methods in your selected paper are used, whether the assumptions of those methods are likely to be met in the case that you study, whether the methods are well-implemented, and whether the results are appropriately interpreted.

To do well on this part of the assessment, students should:

  1. Accurately describe the goals of the chosen text-analysis method, the methodological approach, and any key implementation details
  2. Discuss any assumptions on which the text analysis strategy rests and evaluate whether those assumptions are met in the selected application
  3. Critically assess the strengths and weaknesses of the selected approach
  4. Describe at least one alternative text analysis strategy that might be used in the selected application, with a focus on how the inferences might differ across the proposed and the original strategies

Although you are free to discuss a paper in Part 1 that uses the same method as you use in Part 2, you are likely to be rewarded for demonstrating both breadth and depth of understanding of the material covered on the course.

Part 2 – Research paper (2000 words, 70%)

In the second part of the assessment, you should write a 2000-word original research paper in which you use a method (or methods) of your choice to answer a social science research question. Again, you can select any topic of interest and use any of the methods that we cover on the course. Which methodological approach you choose should be guided by a) your research question, and b) the type of data that is available to you.

The research paper should follow the basic elements of a novel research project. The paper should address a specific research question, identify the theoretical contribution, and implement a suitable design based on one of the methods that we study in the course. The paper should focus narrowly on a topic of the student’s choice and display a depth of understanding of one of the text-analysis approaches, rather than a survey of all methods.

To do well on this part of the assessment, students should:

  1. Formulate a research question which can be answered using one of the methods that we discuss on the course

  2. Accurately describe the method they intend to use, and briefly discuss the assumptions required for the method to produce valid inferences

  3. Identify an existing corpus of text data suitable for answering their research question of interest, or collect an original corpus of their own

  4. For PG students only: Implement the selected method on the corpus, paying careful attention to any decisions that affect the outcomes of the analysis

  5. For PG students only: Interpret the output of the analysis, commenting on how the results relate to the original research question

  6. Discuss the strengths and weaknesses of the analysis conducted, including commenting on how it compares to other approaches

Additional credit will be given to projects that formulate interesting research questions and to projects that are more ambitious. For instance, collecting an original dataset via webscraping is clearly more ambitious than downloading an existing corpus, and implementing and interpreting an unsupervised topic model is more ambitious than implementing an off-the-shelf dictionary.

Research Paper Suggested Structure

Introduction/Research question statement

This section of the paper should be used to introduce the main topic that you will be addressing, and the central research question that you will be aiming to answer.

Description of methodology

In this section of the paper you should explain the methodological approach you are taking, what assumptions it relies on, how it helps to answer the research question of interest, how it compares to other approaches, and so on.

Description of corpus

You need to clearly and concisely describe the corpus you will use to implement your design. You should say where the corpus comes from (Did you collect it yourself? If so, how?). This section is also the place to discuss the scope of the study: are you focusing on a particular country or set of countries? What time period and which types of documents does the analysis cover? You may also wish to include some descriptive statistics of your corpus.

Description of Implementation

In this section, you need to describe how you implemented your chosen approach. For instance, what were the feature selection decisions you made? Why did you make those decisions? Why did you choose that number of topics? How did you validate your approach? And so on.


Results

For PG students only: Results should be presented in well-formatted tables and beautifully constructed figures. Do not include any raw R output (marks will be deducted). Remember that in addition to presenting the main substantive results of the analysis, you may wish to include empirical evidence that pertains to the assumptions that lie behind the methodological approach (validation!), though some of this evidence might be better placed in the appendix to your paper.


Conclusion

What have we learned from your paper? What are the limitations of the analysis?


  • Please respect the word limits. Word counts include footnotes, but do not include the bibliography, tables, the title page (including abstract), or any appendices.

  • The R code used in the empirical parts of your assignment should be included in the appendix. This does not count towards your word count. Note that your R script file should be neatly presented and easy to follow, and not include everything you have tried out, but only what is included in your final paper. Essentially, it needs to be possible for us to fully understand the analyses you have done and possibly recreate all of the main tables/plots in your paper. You should not include any R code in the main body of your paper.

  • You should include your anonymous answers to your weekly assignments as an appendix to your paper.

  • Plagiarism and use of AI will be taken very seriously. See this page for details on UCL’s plagiarism policy.

Help and Assistance

If you have any questions concerning course material or logistics you can always ask them during lectures and seminars.

I encourage you to book in to see me during Student Support and Feedback hours to discuss your project before the end of term.

You will also have an opportunity to seek feedback in week 6 of term, when you will be able to submit a short (one-paragraph) description of your project for me to read. I will then provide feedback (either written or in person) to help guide you towards completing a successful project.


The research paper will be due on [TBA], 2024 at 2pm. Papers will be submitted online via Turnitin.