Improving Reliability Estimation for Individual Numeric Predictions: A Machine Learning Approach
Published Online: 10 Mar 2021
https://doi.org/10.1287/ijoc.2020.1019

