What's On This Page?
- My syllabus
- My lecture slides
- The code and real-world datasets used in the exercises in the book
- The questions asked in the additional replication-style exercises we provide instructors using DSS, which can be used as in-class exercises or as take-home problem sets
- Links to interactive graphs to help students develop an intuition about some of the trickier concepts in statistics
- Other resources under development such as review exercises, videos, and additional readings.
What's NOT On This Page?
- The files necessary to produce and change my syllabus, my lecture slides, and the additional replication-style exercises
- The datasets analyzed in the additional replication-style exercises
- The solutions to given exercises.
If you are an instructor using DSS as the main textbook in your course, you can request these materials from Princeton University Press here.
Other Resources:
- DSS Student Resources: This website hosts all the materials I have created specifically for student use, including interactive graphs and review exercises. It excludes instructor-only materials available on this site, such as my syllabus, lecture slides, and additional replication-style exercises.
- The first chapter is available for free here. You can find the book on Amazon here, request an exam copy here, and provide feedback directly to the authors here.
- The GitHub repository with the practice exercises (learnr tutorials) I have created is here. And the one with the interactive graphs (shiny apps) I have created is here.
INSTRUCTOR RESOURCES for DATA ANALYSIS FOR SOCIAL SCIENCE (DSS)
MY COURSE
My course progresses through bite-sized exercises, which students have a chance to practice at least three times: once with the textbook, once with the in-class exercises, and once with the take-home weekly problem sets. All three (textbook, in-class exercises, and problem sets) move in parallel, asking similar questions but using different real-world datasets, so that students get to see the same material in different contexts.
My syllabus is here.
Below you will find the other resources organized by chapter:
- Chapter 1: Introduction
- Chapter 2: Estimating Causal Effects with Randomized Experiments
- Chapter 3: Inferring Population Characteristics via Survey Research
- Chapter 4: Predicting Outcomes Using Linear Regression
- Chapter 5: Estimating Causal Effects with Observational Data
- Chapter 6: Probability
- Chapter 7: Quantifying Uncertainty
Although also provided below chapter-by-chapter, all the code and all the real-world datasets used in the exercises in the book are in a folder named DSS here. We recommend downloading this folder, unzipping it, and saving it directly on your Desktop, which is where the code used throughout the book assumes the DSS folder is located. (Datasets and code are also available in this GitHub repository.)
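As a minimal sketch of that setup, assuming the unzipped DSS folder sits on the Desktop and contains STAR.csv: the path below is an assumption, not the book's code, and it differs across operating systems and user accounts (on Windows, for example, "~" often points to the Documents folder rather than the home folder).

```r
# Assumed location of the unzipped DSS folder; adjust the path for your machine.
dss_dir <- file.path(path.expand("~"), "Desktop", "DSS")

if (dir.exists(dss_dir) && file.exists(file.path(dss_dir, "STAR.csv"))) {
  setwd(dss_dir)                    # make the DSS folder the working directory
  star <- read.csv("STAR.csv")      # load one of the datasets stored in the folder
  print(head(star))                 # show the first six observations
} else {
  message("DSS folder or STAR.csv not found at: ", dss_dir)
}
```

If the message appears, the folder was probably saved somewhere else, and `dss_dir` needs to point to that location instead.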
IMPORTANT NOTES
- There is a lot more material in DSS than what my lecture slides cover. My course skips some of the more advanced-level material in DSS because it is meant for undergraduate students with no prior knowledge of coding or statistics and only minimal knowledge of math.
- My lecture slides are meant to complement DSS, not to substitute for it. They assume students come to class having done the readings and having followed along with the exercises in the book on their own computer. In class, we go over a different replication-style exercise (asking similar questions but analyzing different data) and my slides do not always repeat all the explanations and details given in the book.
- Any errors found in these resources are my own. If you find any, I would really appreciate it if you could let me know by sending me an email at ellaudet@gmail.com. (Updated: 11/02/2023)
CHAPTER 1: INTRODUCTION
In chapter 1, we start from the very beginning by installing and familiarizing ourselves with the two programs we use, R and RStudio, and by laying the groundwork for forthcoming analyses.
Code and Data Used in Book Exercises:
- Introduction.R (R script with the code)
- STAR.csv (CSV file with the data)
Lecture Slides:
- Lecture 1. Course Introduction
- Lecture 2. Introduction to R and RStudio (Readings: 1-1.6 of DSS)
- Lecture 3. Observations and Variables (Readings: 1.7)
- Lecture 4. Computing and Interpreting Means (Readings: 1.8-1.10)
Additional Replication-Style Exercises:
- Estimating the Bias in Self-Reported Turnout - Part I: Loading and Making Sense of Data
- Estimating the Bias in Self-Reported Turnout - Part II: Computing and Interpreting Means
- Effects of A Criminal Record in Labor Market - Part I: Loading and Making Sense of Data
- Effects of A Criminal Record in Labor Market - Part II: Computing and Interpreting Means
- Effects of Female Leaders in India - Part I: Loading and Making Sense of Data
- Effects of Female Leaders in India - Part II: Computing and Interpreting Means
- Effects of Social Pressure Message on Probability of Voting - Part I: Loading and Making Sense of Data, and Computing and Interpreting Means
Review Exercises:
- To access them, run the code in this R script in RStudio
Additional Readings:
CHAPTER 2: ESTIMATING CAUSAL EFFECTS WITH RANDOMIZED EXPERIMENTS
In chapter 2, we learn what causal effects are and how to estimate them using randomized experiments. We analyze data from Project STAR to answer: What is the effect of small classes on student performance?
Code and Data Used in Book Exercises:
- Experimental.R (R script with the code)
- STAR.csv (CSV file with the data)
Lecture Slides:
- Lecture 5. Causal Effects and Randomized Experiments (Readings: 2-2.4, Video: How to Run a Randomized Experiment)
- Lecture 6. Does Social Pressure Affect Turnout? (Readings: 2.5-2.7)
Additional Replication-Style Exercises:
- Effects of Social Pressure Message on Probability of Voting - Part II: Estimating an Average Causal Effect
Review Exercises:
- To access them, run the code in this R script in RStudio
Interactive Graphs:
- Random Treatment Assignment Makes Treatment and Control Groups Comparable When the Sample Size is Large Enough (If link doesn't work, run the code in this R script in RStudio)
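For a quick illustration of the kind of estimate chapter 2 builds toward, here is a minimal sketch of a difference-in-means calculation. It uses simulated data rather than Project STAR, and the variable names (`treated`, `outcome`) and the true effect of 10 are hypothetical choices made only for this example.

```r
set.seed(123)                                 # make the simulation reproducible
n <- 1000                                     # hypothetical number of students
treated <- rbinom(n, size = 1, prob = 0.5)    # random assignment: 1 = treatment, 0 = control

# Simulated outcome: baseline of 600, assumed true effect of 10, plus random noise
outcome <- 600 + 10 * treated + rnorm(n, mean = 0, sd = 30)

# Difference-in-means estimator:
# average outcome of the treatment group minus average outcome of the control group
diff_means <- mean(outcome[treated == 1]) - mean(outcome[treated == 0])
diff_means
```

Because treatment is assigned at random and the sample is large, the estimate should land near the assumed true effect, though not exactly on it.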
CHAPTER 3: INFERRING POPULATION CHARACTERISTICS VIA SURVEY RESEARCH
In chapter 3, we learn about surveys and how to visualize and summarize the distribution of single variables as well as the relationship between two variables. We analyze data on the 2016 British referendum to answer: Who Supported Brexit?
Code and Data Used in Book Exercises:
- Population.R (R script with the code)
- BES.csv & UK_districts.csv (CSV files with the data)
Lecture Slides:
- Lecture 7. Survey Research and Exploring One Variable at a Time (Readings: 3-3.4)
- Lecture 8. Review
- Lecture 9. Exploring the Relationship Between Two Variables (Readings: 3.5-3.7)
Additional Replication-Style Exercises:
- Estimating the Bias in Self-Reported Turnout - Part III: Subsetting Variables and Creating Histograms
- Evidence of Data Fabrication
- Effects of Female Leaders in India - Part IV: Visualizations and Correlations
- Effect of Assassination of Leaders on Level of Democracy - Part I: Visualizations and Correlations
Interactive Graphs:
- Random Sampling Creates a Representative Sample of the Target Population When Sample Size is Large Enough (If link doesn't work, run the code in this R script in RStudio)
- How the Mean and Standard Deviation Change the Distribution of a Variable (If link doesn't work, run the code in this R script in RStudio)
- The Two Characteristics the Correlation Coefficient Captures (If link doesn't work, run the code in this R script in RStudio)
CHAPTER 4: PREDICTING OUTCOMES USING LINEAR REGRESSION
In chapter 4, we learn how to predict outcomes using simple linear regression models. We analyze data from 170 countries to predict GDP growth using night-time light emissions as measured from space.
Code and Data Used in Book Exercises:
- Prediction.R (R script with the code)
- countries.csv (CSV file with the data)
Lecture Slides:
- Lecture 10. Predicting Non-Binary Outcomes with Linear Regression (Readings: 4-4.4.1, Video: How to Fit a Line to Predict Y Based on X)
- Lecture 11. Predicting Binary Outcomes with Linear Regression (Readings: 4.6-4.9)
- Lecture 12. Review
Additional Replication-Style Exercises:
Interactive Graphs:
- The Role the Intercept and the Slope Play in Defining a Line (If link doesn't work, run the code in this R script in RStudio)
- The Least Squares Method (If link doesn't work, run the code in this R script in RStudio)
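If the interactive graphs are unavailable, the idea of fitting a line by least squares can also be shown with a few lines of R. The sketch below uses simulated data, not countries.csv; the intercept of 2 and slope of 0.5 are assumed values chosen for illustration.

```r
set.seed(1)                                   # make the simulation reproducible
x <- runif(200, min = 0, max = 10)            # hypothetical predictor
y <- 2 + 0.5 * x + rnorm(200, sd = 1)         # outcome: assumed intercept 2, slope 0.5, plus noise

fit <- lm(y ~ x)                              # fit the line with the least squares method
coef(fit)                                     # estimated intercept and slope

# Predicted value of y when x equals 4
pred <- predict(fit, newdata = data.frame(x = 4))
pred
```

The estimated coefficients should be close to the assumed intercept and slope, and the prediction is simply the fitted line evaluated at the chosen value of x.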
CHAPTER 5: ESTIMATING CAUSAL EFFECTS WITH OBSERVATIONAL DATA
In chapter 5, we learn how to estimate causal effects using observational data. We analyze survey and electoral data to answer: What was the effect of Russian TV propaganda on the 2014 Ukrainian elections?
Code and Data Used in Book Exercises:
- Observational.R (R script with the code)
- UA_survey.csv & UA_precincts.csv (CSV files with the data)
Lecture Slides:
- Lecture 13. Estimating Causal Effects with Observational Data and the Problem of Confounders (Readings: 5-5.3.1)
- Lecture 14. Controlling for Confounders Using Multiple Linear Regression (Readings: 5.3.2-5.4.2)
- Lecture 15. Internal and External Validity (Readings: 5.5-5.7)
- Lecture 16. Review
Additional Replication-Style Exercises:
- Effect of Black Candidates on Black Turnout
- Effect of Assassination of Leaders on Level of Democracy - Part II: Fitting a Line to Compute the Difference-in-Means Estimator
- Effect of Assassination of Leaders on Level of Democracy - Part III: Controlling for Confounders
- Effect of Political TV Ads on Turnout
CHAPTER 6: PROBABILITY
In chapter 6, we cover basic probability. We learn about random variables and their distributions, the distinction between population parameters and sample statistics, and the two large sample theorems that enable us to measure statistical uncertainty.
Code and Data Used in Book Exercises:
- Probability.R (R script with the code)
Lecture Slides:
- Lecture 19. Probability (Readings: 6-6.8)
Additional Replication-Style Exercises:
Interactive Graphs:
- The Law of Large Numbers (coming soon)
- The Central Limit Theorem (coming soon)
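Until those graphs are available, both theorems can be illustrated with a short simulation. This is a minimal sketch, not code from Probability.R; the binary variable with a population mean of 0.3 and the sample sizes are assumptions made for the example.

```r
set.seed(42)                                  # make the simulation reproducible
p <- 0.3                                      # assumed population mean of a binary variable

# Law of large numbers: the sample mean gets closer to p as the sample size grows
sizes <- c(10, 1000, 100000)
sample_means <- sapply(sizes, function(n) mean(rbinom(n, size = 1, prob = p)))
sample_means

# Central limit theorem: across many samples of size 500, the sample means
# are approximately normally distributed around p
means_500 <- replicate(5000, mean(rbinom(500, size = 1, prob = p)))
hist(means_500)                               # roughly bell-shaped and centered near p

sd(means_500)                                 # spread of the simulated sample means
sqrt(p * (1 - p) / 500)                       # theoretical standard deviation of the sample mean
```

The last two values should be close to each other, which is the link between these theorems and the standard errors used in chapter 7.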
CHAPTER 7: QUANTIFYING UNCERTAINTY
In chapter 7, we learn how to quantify the uncertainty in our empirical findings in order to draw conclusions at the population level. We complete some of the analyses we started in chapters 2 through 5.
Code and Data Used in Book Exercises:
- Uncertainty.R (R script with the code)
- STAR.csv, BES.csv, countries.csv & UA_survey.csv (CSV files with the data)
Lecture Slides:
- Lecture 20. Hypothesis Testing with Estimated Regression Coefficients (Readings: 7-7.1, 7.3-7.6)
- Lecture 21. Do Small Classes Increase Probability of Graduating?
- Lecture 22. Do Women Promote Different Policies Than Men?
- Lecture 23. Does Social Pressure Affect Turnout?
- Lecture 24. Is There Racial Discrimination in the Labor Market?
Additional Replication-Style Exercises:
- Effects of A Criminal Record in Labor Market - Part IV: Focus on White Applicants
- Effects of A Criminal Record in Labor Market - Part V: Focus on Black Applicants
- Effects of Female Leaders in India - Part V: Effect on Drinking Water Facilities
- Effects of Female Leaders in India - Part VI: Effect on Irrigation Facilities
- Predicting Course Grades - Part IV: Quantifying Uncertainty
- Effects of Black-Sounding Names on Call Backs for Job Interviews
- Effect of Small Classes on Student Outcomes
- Effects of Social Pressure Message on Probability of Voting - Part III: Estimate an Average Causal Effect, Determine Statistical Significance, and Discuss Internal and External Validity
Interactive Graphs:
- Hypothesis Testing (coming soon)