Cost-Aware Calibration of Classifiers

Mochen Yang (Corresponding Author)
https://orcid.org/0000-0001-5101-9041
Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455

Xuan Bi
https://orcid.org/0000-0002-4683-1411
Department of Information and Decision Sciences, Carlson School of Management, University of Minnesota, Minneapolis, Minnesota 55455

Published Online: 9 Dec 2024
https://doi.org/10.1287/ijds.2024.0038

INFORMS Journal on Data Science, Volume 4, Issue 2, April-June 2025

Received: May 22, 2024
Accepted: September 18, 2024
Published Online: December 9, 2024

Cite as

Mochen Yang, Xuan Bi (2024) Cost-Aware Calibration of Classifiers. INFORMS Journal on Data Science 4(2):101-113.

https://doi.org/10.1287/ijds.2024.0038
