Can computational social scientists predict social science phenomena?

Over the past fifteen years, a technological revolution has deepened the intersection between the computational and social sciences. The rise of the internet has created new classes of data for social scientists to analyze, and because these data differ so much from traditional social science data, they have brought with them an influx of computer science practices. A side effect of this intersection, however, is a new tension between the two disciplines’ core values: social science’s emphasis on explanation and computer science’s focus on prediction. Since each goal corresponds to distinct research methods and practices, does one necessarily have an edge over the other?

In a newly published paper, “Integrating explanation and prediction in computational social science,” Lab Director Duncan Watts, together with Jake Hofman of Microsoft Research and several co-authors, offers a new perspective on this tension. They argue that, while explanatory and predictive approaches are each powerful on their own, the current “computational revolution” in social science creates an opportunity for productive synthesis. Integrative modelling practices, though rare in computational social science today, can help to realize this emerging field’s potential.

A framework for integrative modelling

To show how explanation and prediction can complement, rather than compete with, one another, Hofman et al. outline a conceptual framework for classifying empirical modelling practices. They categorize activities along two dimensions: whether their focus is on explanation or prediction, and whether they are observational or intervention-based. This categorization produces four quadrants of activities, each with its own advantages and drawbacks. Quadrants one through three contain practices familiar to both traditional and computational social scientists. In these quadrants, modelling is used, respectively, to describe situations in the past or present, to estimate the effects of changing a situation, and to forecast outcomes for similar situations in the future.

Intriguingly, methods falling into the fourth quadrant, labelled integrative modelling, are rare in the field today. This quadrant challenges the conventional wisdom that explanatory insight comes at the cost of predictive accuracy, and vice versa. Instead, it treats these epistemic values as complements, combining the explanatory and predictive strengths of the other quadrants to predict as-yet-unseen outcomes in terms of causal relationships. The authors consider integrative modelling an exciting but underdeveloped direction for computational social science, and they urge a new blend of traditional and computational approaches.

Encouraging methodological innovation in CSS

Given the current scarcity of integrative modelling practices, Hofman et al. offer three suggestions for methodological innovation in computational social science. First, they encourage researchers to explicitly orient their attention towards the fourth quadrant, for example through cross-domain testing or by alternating between predictive and explanatory modelling. Second, they advocate for a labelling system that more clearly characterizes individual research contributions, identifying both their quadrants and their levels of granularity. Finally, they call on researchers to standardize open science practices across the social and computer sciences, including the pre-registration of analyses that use predictive models and the use of a common task framework for explanatory modelling.

These changes, the authors argue, would pave the way for a more replicable, more cumulative, and more useful social science that takes full advantage of the computational revolution. Taking both explanation and prediction seriously through these practices would lead to deeper understanding and advance work at the intersection of the computational and social sciences.

Read the full paper published in Nature here.