Measuring the Driving Forces of Predictive Performance: Application to Credit Scoring

