The Composite Overfit Analysis Framework: Assessing the Out-of-Sample Generalizability of Construct-Based Models Using Predictive Deviance, Deviance Trees, and Unstable Paths

Published Online: https://doi.org/10.1287/mnsc.2023.4705

References

  • Aguinis H, Gottfredson RK, Joo H (2013) Best-practice recommendations for defining, identifying, and handling outliers. Organ. Res. Methods 16(2):270–301.
  • Amrhein V, Greenland S, McShane B (2019) Retire statistical significance. Nature 567(7748):305–307.
  • Arlot S, Celisse A (2010) A survey of cross-validation procedures for model selection. Statist. Surveys 4:40–79.
  • Belsley DA, Kuh E, Welsch RE (2005) Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, vol. 571 (John Wiley & Sons, New York).
  • Burman P (1989) A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76(3):503–514.
  • Chin WW (1998) The partial least squares approach to structural equation modeling. Marcoulides GA, ed. Modern Methods for Business Research (Lawrence Erlbaum Associates Publishers, Mahwah, NJ), 295–336.
  • D’Amour A, Heller K, Moldovan D, Adlam B, Alipanahi B, Beutel A, Chen C, et al. (2020) Underspecification presents challenges for credibility in modern machine learning. Preprint, submitted November 24, https://arxiv.org/abs/2011.03395.
  • Danks NP (2021) The piggy in the middle: The role of mediators in PLS-SEM-based prediction: A research note. ACM SIGMIS Database DATABASE Adv. Inform. Systems 52(SI):24–42.
  • Danks NP, Ray S (2018) Predictions from partial least squares models. Ali F, Rasoolimanesh SM, Cobanoglu C, eds. Applying Partial Least Squares in Tourism and Hospitality Research (Emerald Publishing Limited, Bingley, UK), 35–52.
  • Devroye L, Wagner T (1979) Distribution-free performance bounds for potential function rules. IEEE Trans. Inform. Theory 25(5):601–604.
  • Dijkstra TK (2009) Latent variables and indices: Herman Wold’s basic design and partial least squares. Handbook of Partial Least Squares: Concepts, Methods and Applications (Springer, Berlin, Heidelberg), 23–46.
  • Evermann J, Tate M (2016) Assessing the predictive performance of structural equation model estimators. J. Bus. Res. 69(10):4565–4582.
  • Faul F, Erdfelder E, Buchner A, Lang AG (2009) Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behav. Res. Methods 41(4):1149–1160.
  • Gefen D, Straub DW, Rigdon EE (2011) An update and extension to SEM guidelines for administrative and social science research. MIS Quart. 35(2):iii–xiv.
  • Gibbert M, Nair LB, Weiss M, Hoegl M (2020) Using outliers for theory building. Organ. Res. Methods 24(1):172–181.
  • Gray PH, Cooper WH (2010) Pursuing failure. Organ. Res. Methods 13(4):620–643.
  • Gregor S (2006) The nature of theory in information systems. MIS Quart. 30(3):611–642.
  • Hair J, Hollingsworth CL, Randolph AB, Chong AYL (2017a) An updated and expanded assessment of PLS-SEM in information systems research. Indust. Management Data Systems 117(3):442–458.
  • Hair JF, Howard MC, Nitzl C (2020) Assessing measurement model quality in PLS-SEM using confirmatory composite analysis. J. Bus. Res. 109(2020):101–110.
  • Hair JF, Sarstedt M, Ringle CM (2019) Rethinking some of the rethinking of partial least squares. Eur. J. Marketing 53(4):566–584.
  • Hair JF Jr, Hult GTM, Ringle CM, Sarstedt M (2021) A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM) (Sage Publications, Thousand Oaks, CA).
  • Hair JF, Hult GTM, Ringle CM, Sarstedt M, Thiele KO (2017b) Mirror, mirror on the wall: A comparative evaluation of composite-based structural equation modeling methods. J. Acad. Marketing Sci. 45(5):616–632.
  • Hastie T, Tibshirani R, Friedman JH (2013) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (Springer, New York).
  • Henseler J (2017) Bridging design and behavioral research with variance-based structural equation modeling. J. Advertising 46(1):178–192.
  • Henseler J, Hubona G, Ray PA (2016) Using PLS path modeling in new technology research: Updated guidelines. Indust. Management Data Systems 116(1):2–20.
  • Henseler J, Dijkstra TK, Sarstedt M, Ringle CM, Diamantopoulos A, Straub DW, Ketchen DJ, Hair JF, Hult GTM, Calantone RJ (2014) Common beliefs and reality about PLS: Comments on Rönkkö and Evermann (2013). Organ. Res. Methods 17(2):182–209.
  • Hwang H (2009) Regularized generalized structured component analysis. Psychometrika 74(3):517–530.
  • Hwang H, Takane Y (2004) Generalized structured component analysis. Psychometrika 69(1):81–99.
  • James G, Witten D, Hastie T, Tibshirani R (2013) An Introduction to Statistical Learning (Springer, New York).
  • Jöreskog KG, Sörbom D (1996) LISREL 8: User’s Reference Guide (Scientific Software, Chicago).
  • Junior ML, Godinho Filho M (2010) Variations of the kanban system: Literature review and classification. Internat. J. Production Econom. 125(1):13–21.
  • Kaplan A (1964) The Conduct of Inquiry: Methodology for Behavioral Science (Chandler Publishing, New York).
  • Kock N, Hadaya P (2018) Minimum sample size estimation in PLS-SEM: The inverse square root and gamma-exponential methods. Inform. Systems J. 28(1):227–261.
  • Liu FT, Ting KM, Zhou ZH (2008) Isolation forest. 2008 Eighth IEEE Internat. Conf. Data Mining (IEEE Computer Society, Washington, DC), 413–422.
  • Makridakis SG, Wheelwright SC, Hyndman RJ (1998) Forecasting: Methods and Applications, 3rd ed. (Wiley, New York).
  • McGuire WJ (1983) A contextualist theory of knowledge: Its implications for innovation and reform in psychological research. Adv. Experiment. Soc. Psych. 16(1983):1–47.
  • McGuire WJ (1989) A Perspectivist Approach to the Strategic Planning of Programmatic Scientific Research (Cambridge University Press, Cambridge, UK).
  • McIntosh CN, Edwards JR, Antonakis J (2014) Reflections on partial least squares path modeling. Organ. Res. Methods 17(2):210–251.
  • Montgomery DC, Peck EA, Vining GG (2012) Introduction to Linear Regression Analysis, vol. 821 (John Wiley & Sons, New York).
  • Pascale R, Sternin J, Sternin M (2010) The Power of Positive Deviance: How Unlikely Innovators Solve the World’s Toughest Problems (Harvard Business School Press, Boston).
  • Pek J, MacCallum RC (2011) Sensitivity analysis in structural equation models: Cases and their influence. Multivariate Behav. Res. 46(2):202–228.
  • Quiñonero-Candela J, Sugiyama M, Schwaighofer A, Lawrence ND (2009) Data Set Shift in Machine Learning (MIT Press, Cambridge, MA).
  • Ray S, Danks NP, Calero Valdez A (2022) seminr: Building and Estimating Structural Equation Models. R package version 2.3.2. Accessed February 7, 2023, https://cran.r-project.org/web/packages/seminr/.
  • R Core Team (2022) The R project for statistical computing. R Foundation for Statistical Computing, Vienna. Accessed February 7, 2023, https://www.R-project.org/.
  • Rigdon EE (2012) Rethinking partial least squares path modeling: In praise of simple methods. Long Range Planning 45(5–6):341–358.
  • Ringle CM, Sarstedt M, Straub D (2012) Editor’s comments: A critical look at the use of PLS-SEM in MIS Quarterly. MIS Quart. 36(1):iii–xiv.
  • Rönkkö M, McIntosh CN, Antonakis J, Edwards JR (2016) Partial least squares path modeling: Time for some serious second thoughts. J. Oper. Management 47–48(2016):9–27.
  • Sarstedt M, Hair JF, Ringle CM, Thiele KO, Gudergan SP (2016) Estimation issues with PLS and CBSEM: Where the bias lies! J. Bus. Res. 69(10):3998–4010.
  • Schlittgen R (2019) cbsem: Simulation, estimation and segmentation of composite based structural equation models. R package version 1.0.0. Accessed January 22, 2020, https://cran.r-project.org/web/packages/cbsem/index.html.
  • Schlittgen R, Sarstedt M, Ringle CM (2020) Data generation for composite-based structural equation modeling methods. Adv. Data Anal. Classification 14(2020):747–757.
  • Sharma P, Sarstedt M, Shmueli G, Kim KH, Thiele KO (2019) PLS-based model selection: The role of alternative explanations in information systems research. J. Assoc. Inform. Systems 20(4):4.
  • Sharma PN, Shmueli G, Sarstedt M, Danks N, Ray S (2021) Prediction‐oriented model selection in partial least squares path modeling. Decision Sci. 52(3):567–607.
  • Shmueli G (2010) To explain or to predict? Statist. Sci. 25(3):289–310.
  • Shmueli G, Koppius OR (2011) Predictive analytics in information systems research. MIS Quart. 35(3):553–572.
  • Shmueli G, Ray S, Estrada JMV, Chatla SB (2016) The elephant in the room: Predictive performance of PLS models. J. Bus. Res. 69(10):4552–4564.
  • Shmueli G, Sarstedt M, Hair JF, Cheah JH, Ting H, Vaithilingam S, Ringle CM (2019) Predictive model assessment in PLS-SEM: Guidelines for using PLSpredict. Eur. J. Marketing 53(11):2322–2347.
  • Spreitzer GM, Sonenshein S (2004) Toward the construct definition of positive deviance. Amer. Behav. Sci. 47(6):828–847.
  • Steiger JH (1996) Dispelling some myths about factor indeterminacy. Multivariate Behav. Res. 31(4):539–550.
  • Subbaswamy A, Saria S (2019) From development to deployment: Data set shift, causality, and shift-stable models in health AI. Biostatistics 11(19):345–352.
  • Tenenhaus A, Tenenhaus M (2011) Regularized generalized canonical correlation analysis. Psychometrika 76(2):257–284.
  • Urbach N, Ahlemann F (2010) Structural equation modeling in information systems research using partial least squares. J. Inform. Tech. Theory Appl. 11(2):5–40.
  • Venkatesh V, Davis FD (2000) A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Sci. 46(2):186–204.
  • Venkatesh V, Thong JY, Xu X (2012) Consumer acceptance and use of information technology: Extending the unified theory of acceptance and use of technology. MIS Quart. 36(1):157–178.
  • Walls JL, Hoffman AJ (2012) Exceptional boards: Environmental experience and positive deviance from institutional norms. J. Organ. Behav. 34(2):253–271.
  • Witten IH, Frank E, Hall MA (2017) Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed. (Morgan Kaufmann, Burlington, MA).
  • Wold H (1975) Path models with latent variables: The NIPALS approach. Blalock HM, Aganbegian A, Borodkin FM, Boudon R, Capecchi V, eds. Quantitative Sociology: International Perspective on Mathematical and Statistical Modeling (Academic Press, Cambridge, MA), 307–357.
  • Wold H (1982) Soft modeling: The basic design and some extensions. Jöreskog KG, Wold HOA, eds. Systems Under Indirect Observations: Part II (North-Holland, Amsterdam), 1–54.
  • Zeitlin MF, Ghassemi H, Mansour M, Levine RA, Dillanneva M, Carballo M, Sockalingam S (1990) Positive Deviance in Child Nutrition: With Emphasis on Psychosocial and Behavioural Aspects and Implications for Development (United Nations University, Tokyo).
  • Zhang Y, Yang Y (2015) Cross-validation for selecting a model selection procedure. J. Econometrics 187(1):95–112.