MIS Quarterly, Shmueli & Koppius

Predictive Analytics in Information Systems Research

Galit Shmueli and Otto Koppius

Abstract

This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models but also play an important role alongside explanatory modeling in theory building and theory testing. We describe six roles for predictive analytics: new theory generation, measurement development, comparison of competing theories, improvement of existing models, relevance assessment, and assessment of the predictability of empirical phenomena. Despite the importance of predictive analytics, we find that they are rare in the empirical IS literature. That literature relies nearly exclusively on explanatory statistical modeling, where statistical inference is used to test and evaluate the explanatory power of underlying causal models. However, explanatory power does not imply predictive power, and thus predictive analytics are necessary for assessing predictive power and for building empirical models that predict well. To show the distinction between predictive analytics and explanatory statistical modeling, we present differences that arise in the modeling process of each type. These differences translate into different final models, so that a pure explanatory statistical model is best tuned for testing causal hypotheses and a pure empirical predictive model is best in terms of predictive power. We "convert" a well-known explanatory paper on TAM to a predictive context to illustrate these differences and show how predictive analytics can add theoretical and practical value to IS research.

Keywords: Prediction, causal explanation, theory building, theory testing, statistical model, data mining, modeling process