C. Adida, A. Lo, M. Platas Izama. “Engendering empathy among Americans can promote inclusionary behavior toward Syrian refugees”, Forthcoming, Proceedings of the National Academy of Sciences. [Data & Code]
We investigate whether American citizens can be persuaded to adopt more inclusionary behavior toward refugees using a minimally invasive online perspective-taking exercise frequently used by refugee advocates in the real world. Using a randomized survey experiment on a representative sample of American citizens, we find that this short and interactive perspective-taking exercise can promote, in the short term, Americans’ willingness to act on behalf of Syrian refugees by writing anonymous letters of support to the White House. This effect, while driven primarily by self-identified Democrats, is also apparent among self-identified Republicans.
Lo, A., Chernoff, H., Zheng, T. and Lo, S.H., 2016. Framework for making better prediction by directly estimating variables’ predictivity, Proceedings of the National Academy of Sciences, 113(50), pp.14277-14282. [pdf]
Good prediction, especially in the context of big data, is important. Common approaches to prediction include using a significance-based criterion to evaluate variables for use in models, and evaluating variables and models simultaneously for prediction using cross-validation or independent test data. The first approach can lead to choosing less predictive variables, as significance does not imply predictivity. The second approach can be improved by treating a variable’s predictivity as a parameter to be estimated. The literature currently lacks measures that do this. We suggest a novel measure that evaluates variables’ abilities to predict, the I-score. It is effective in differentiating between noisy and predictive variables in big data and can be related to a lower bound for the correct prediction rate.
A. Lo, M. Agnes, J. Auerbach, R. Fan, S. Lo, P. Wang, & T. Zheng, 2016. “Network-guided interaction mining for blood pressure phenotype of unrelated individuals in GAW19.” BMC Proceedings. 10 (Suppl 7):13. [pdf]
Lo, A., Chernoff, H., Zheng, T. and Lo, S.H., 2015. Why significant variables aren’t automatically good predictors. Proceedings of the National Academy of Sciences, 112(45), pp.13892-13897. [pdf]
A recent puzzle in the big data scientific literature is that an increase in explanatory variables found to be significantly correlated with an outcome variable does not necessarily lead to improvements in prediction. This problem occurs in both simple and complex data. We offer explanations and statistical insights into why higher significance does not automatically imply stronger predictivity and why variables with strong predictivity sometimes fail to be significant. We suggest shifting the research agenda toward searching for a criterion to locate highly predictive variables rather than highly significant variables. We offer an alternative approach, the partition retention method, which was effective in reducing prediction error from 30% to 8% on a long-studied breast cancer data set.
Adida, C., Combes, N., Lo, A., & Verink, A. The Spousal Bump: Do Cross-Ethnic Marriages Increase Political Support in Multiethnic Democracies? Comparative Political Studies, April 2016; vol. 49, 5: pp. 635–661.
In democratic Africa, where ethnicity is a key driver of vote choice, politicians must attract voters across ethnic lines. This article explores one way politicians can do this: by appealing to a coethnic bond through their spouse. We propose that cross-ethnic spouses can help candidates send credible signals of coalition building before an election. We test this argument with a survey experiment in Benin, where President Yayi has married across ethnic lines. Our results confirm that priming the first lady’s ethnicity increases support for President Yayi among her coethnics. We generalize these results by combining new data on leader-spouse ethnicity with Afrobarometer survey data. Our results suggest that cross-ethnic marriages are one tool leaders can use to shore up support in multiethnic elections.
A. Lo & J.H. Fowler. “The Mathematics of Murder.” Nature. 501:170-171 (12 September 2013)
“The importance of variable selection for high dimensional data”, A. Lo.
High dimensional (HD) data, where the number of covariates and/or interactions of covariates might exceed the number of observations, is increasingly used for prediction in the social sciences. An important question is how to select the predictive covariates from among the full set. Common covariate selection approaches use rules specific to the application, such as simply selecting statistically significant covariates or using machine learning techniques. These can suffer from a lack of objectivity, from choosing some but not all predictive covariates, and from failing selection consistency unless the sample size is larger than the predictor dimension, or when covariates are marginally unrelated but jointly related to the response. Finally, the literature offers few statistics that can be used to directly evaluate covariate predictivity. We address these issues by proposing a variable screening step prior to traditional modeling, in which we screen covariates for their predictivity. We show that the I statistic is directly related to predictivity and can help screen for noisy covariates and covariate interactions. We illustrate how our variable screening approach can remove noisy phrases and rank important ones based directly on their measured predictivity. We also show improvements to out-of-sample forecasting in a state failure application.