Regression and Machine Learning Methods to Predict Discrete Outcomes in Accounting Research
Journal of Financial Reporting, accepted
92 Pages. Posted: 11 Mar 2021. Last revised: 9 Mar 2022
Date Written: March 1, 2022
Predictive modeling iteratively tries combinations and transformations of a set of variables to generate a decision rule that predicts outcomes for new observations. Although accounting researchers have shown interest in predictive modeling, we identify a lack of accessible, applied guidance on this topic for accounting settings. This gap has become more salient with the increasing availability of machine learning models that use unfamiliar terminology, are estimated using algorithms, and produce outputs that differ from those of the models used for causal inference. To fill this gap, we provide an overview of how to predict discrete outcomes with logistic regression and with the machine learning models used in recent studies. We also include guidance and a comprehensive example (predicting investigations by the U.S. Securities and Exchange Commission) that illustrates the elements of the prediction process, highlighting the importance of out-of-sample accuracy and the distinctive aspects of presenting a prediction model's results.
Keywords: prediction, machine learning, deep learning, neural networks, random forests, gradient boosting, support vector machines, k-fold cross-validation.
JEL Classification: C10, C25, C45, C53, M48