SELECTED PRAISE for DATA ANALYSIS FOR SOCIAL SCIENCE
"I love this book. More importantly, my students love this book." — Anna Harvey, NYU
"This is without doubt the best book to get started with data analysis in the social sciences. My students—all of them complete novices—were easily able to conduct their own analyses after working through this book." — Simon Weschle, Syracuse University
"[This book] is a game changer! I have been teaching quantitative methods for 14 years, and I never had such good results and engagement from my students until I adopted this book." — Javier Sajuria, Queen Mary University of London
"[This book] is outstanding—by far the most effective introductory textbook on the topic I have come across in more than a decade of teaching this material. I am amazed by how easy it makes my job as an instructor." — PJ Lamberson, UCLA
"This is the book that I plan to teach from, next time I teach introductory statistics. As it is, I recommend it as a reference for students in more advanced classes such as Applied Regression and Causal Inference, if they want a clean refresher from first principles." — Andrew Gelman, co-author of Regression and Other Stories
"Data science from zero to sixty—gently, expertly, quickly." — Gary King, Harvard University
"This book will transform the way we teach data science in the social sciences. Assuming zero background knowledge, it takes readers step-by-step through the most important concepts of data analysis and coding without sacrificing rigor." — Molly Roberts, UC San Diego
"I have been teaching statistics for twenty-five years and I have never seen a book this well done." — Vanessa Baird, University of Colorado Boulder
"I particularly love its problem-solving approach. While most textbooks teach statistics without offering students a clear motivation, this one teaches statistics as a way to solve real problems with real datasets.” — Guillermo Solovey, University of Buenos Aires
"My favorite feature is that it puts causal inference first, before probability and statistical inference. I have found that this unconventional order is gentler and more engaging for complete beginners than the approach used in many other books. It also allows students with some prior knowledge of statistics to learn something new from the start." — Max Goplerud, University of Pittsburgh
"I especially like that it teaches point estimates and uncertainty separately. In the past, when I taught these concepts together, I found students were overwhelmed. Breaking them up makes the statistics easier to understand. It's a genius idea!" — Christopher Ojeda, UC Merced
"Students liked the clear explanations and relevant real-world examples, and they even found coding in R fun!" — Alicia Cooperman, GW University
"I don't think I've seen a more accessible introduction to R and RStudio—cheat sheets included!" — Didier Ruedin, University of Neuchâtel
"[T]he instructor resources that come with it are the best I've seen provided with a textbook and made adopting the book much easier." — Mark Richardson, Georgetown University
BOOK OVERVIEW
Assuming no prior knowledge of statistics or coding and only minimal knowledge of math, Data Analysis for Social Science (DSS for short) teaches the fundamentals of survey research, predictive models, and causal inference, plus how to analyze real-world data with the free and popular statistical program R.
It progresses by teaching how to solve one kind of problem after another, bringing in methods as needed. It teaches, in this order, how to (1) estimate causal effects with randomized experiments, (2) visualize and summarize data, (3) infer population characteristics, (4) predict outcomes, (5) estimate causal effects with observational data, and (6) generalize from sample to population.
It flips the script of traditional statistics textbooks. It starts by estimating causal effects with randomized experiments and postpones any discussion of probability and statistical inference until the final chapters. This unconventional order engages students by demonstrating from the very beginning how data analysis can be used to answer interesting questions, while reserving more abstract, complex concepts for later chapters.
CHAPTER SUMMARIES
In chapter 1, we start from the very beginning by familiarizing ourselves with RStudio and R and learning to load and make sense of data. (Full chapter available for free here.)
In chapter 2, we learn what causal effects are and how to estimate them using randomized experiments. We analyze data from Project STAR to answer: What is the effect of small classes on student performance?
In chapter 3, we learn about surveys and how to visualize and summarize the distribution of single variables as well as the relationship between two variables. We analyze data on the 2016 British referendum to answer: Who supported Brexit?
In chapter 4, we learn how to predict outcomes using simple linear regression models. We analyze data from 170 countries to predict GDP growth using night-time light emissions as measured from space.
In chapter 5, we learn how to estimate causal effects using observational data. We analyze survey and electoral data to answer: What was the effect of Russian TV propaganda on the 2014 Ukrainian elections?
In chapter 6, we cover basic probability. We learn about random variables and their distributions, the distinction between population parameters and sample statistics, and the two large-sample theorems that enable us to measure statistical uncertainty.
In chapter 7, we learn how to quantify the uncertainty in our empirical findings in order to draw conclusions at the population level. We complete some of the analyses we started in chapters 2 through 5.
FEATURES
Provides a step-by-step guide to analyzing real-world data using the powerful, open-source statistical program R, which is free for everyone to use. The datasets are provided on the book’s official website so that readers can learn how to analyze data by following along with the exercises in the book on their own computer.
Specifically designed to accommodate students with a variety of math backgrounds. It includes supplemental materials for students with minimal knowledge of math and clearly identifies sections with more advanced material so that readers can skip them if they so choose.
Includes CHEATSHEETS of statistical concepts and R.
Includes TIPS with supplemental materials, such as additional explanations, answers to common questions, notes on best practices, and recommendations.
Includes RECALLs reminding you of relevant information mentioned earlier in the book—particularly helpful when the book is read incrementally, such as over a semester.
Whenever a new core concept is introduced, in the margin you will find its definition repeated (displayed in red).
Whenever a new piece of R code is introduced, in the margin you will find a brief overview of how it works with an example (displayed in a cyan-colored frame).
Comes with instructor materials (upon request), including a sample syllabus, lecture slides, additional replication-style exercises with solutions, and real-world datasets. Instructors: Get your exam copy here, preview some materials here, and request the complete set of materials here.
We sincerely hope our book is helpful to you and your students! — Elena and Kosuke