MACMILLAN-CSAP WORKSHOP ON QUANTITATIVE RESEARCH METHODS
Abstract: Texts are increasingly used to make causal inferences, with the document serving as either the treatment or the outcome. We introduce a new conceptual framework for understanding text-based causal inferences, demonstrate fundamental problems that arise when manual or computational approaches are applied to text for causal inference, and provide solutions to the problems we raise. Our work introduces a methodology that connects traditional survey experiments from the social sciences with the A/B tests more common in industry and machine learning. Using this framework, we show that the standard application of text methods leads to an Analyst-Induced SUTVA Violation, and we show how to resolve the problem using a training and test split. Taken together, our work provides a more rigorous foundation for applying text-based methods to causal inference. This is joint work with Naoki Egami, Christian Fong, Justin Grimmer, and Molly Roberts.
Brandon is an assistant professor of Sociology at Princeton University, where he is also affiliated with the Politics department, the Office of Population Research, and the Princeton Institute for Computational Science & Engineering. Before moving to Princeton, Brandon received his PhD in Political Science and a master’s degree in Statistics from Harvard. He publishes on methodology in computational social science, with a particular focus on statistical text analysis.
This workshop series is sponsored by the ISPS Center for the Study of American Politics and The Whitney and Betty MacMillan Center for International and Area Studies at Yale, with support from the Edward J. and Dorothy Clarke Kempf Fund.