Toward Efficient Ensemble Learning with Structure Constraints: Convergent Algorithms and Applications
Published Online: 19 Aug 2022. https://doi.org/10.1287/ijoc.2022.1224
References
- (2010) An l2 boosting algorithm for estimation of a regression function. IEEE Trans. Inform. Theory 56(3):1417–1429.
- (2008) Approximation and learning by greedy algorithms. Ann. Statist. 36(1):64–94.
- (2007) AdaBoost is consistent. J. Machine Learn. Res. 8:2347–2368.
- (2006) Some theory for generalized boosting algorithms. J. Machine Learn. Res. 7:705–732.
- (2016) Convergence rates for kernel conjugate gradient for random design regression. Anal. Appl. (Singapore) 14(6):763–794.
- (1996) Bagging predictors. Machine Learn. 24(2):123–140.
- (2007) Boosting algorithms: Regularization, prediction and model fitting. Statist. Sci. 22(4):477–505.
- (2007) Optimal rates for the regularized least squares algorithm. Foundations Comput. Math. 7(3):331–368.
- (2008) Decision-tree-based knowledge discovery: Single-vs. multi-decision-tree induction. INFORMS J. Comput. 20(1):46–54.
- (2016) XGBoost: A scalable tree boosting system. Proc. 22nd ACM SIGKDD Internat. Conf. Knowledge Discovery Data Mining (Association for Computing Machinery, New York), 785–794.
- (2004) Support vector machine soft margin classifiers: Error analysis. J. Machine Learn. Res. 5:1143–1175.
- (1998) Spam! Comm. ACM 41(8):74–83.
- (2006) Statistical comparisons of classifiers over multiple data sets. J. Machine Learn. Res. 7(1):1–30.
- (2017) Active learning with multiple localized regression models. INFORMS J. Comput. 29(3):503–522.
- Dua D, Graff C (2019) UCI Machine Learning Repository (School of Information and Computer Science, University of California, Irvine, CA). https://archive.ics.uci.edu/ml/datasets/Spambase.
- (2012) Characterizing l2 boosting. Ann. Statist. 40(2):1074–1101.
- (2006) SentiWordNet: A publicly available lexical resource for opinion mining. Proc. 5th Internat. Conf. on Language Resources and Evaluation, vol. 6, 417–422.
- (2000) Regularization networks and support vector machines. Adv. Comput. Math. 13(1):1–50.
- (1995) Boosting a weak learning algorithm by majority. Inform. Comput. 121(2):256–285.
- (2016) New analysis and results for the Frank-Wolfe method. Math. Programming 155(1–2):199–230.
- (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. System Sci. 55(1):119–139.
- (2001) Greedy function approximation: A gradient boosting machine. Ann. Statist. 29(5):1189–1232.
- (2000) Additive logistic regression: A statistical view of boosting - rejoinder. Ann. Statist. 28(2):400–407.
- (1976) A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2(1):17–40.
- (2017) Ensembles of overfit and overconfident forecasts. Management Sci. 63(4):1110–1130.
- (2002) A Distribution-Free Theory of Nonparametric Regression (Springer, New York).
- (2019) A subjectivity classification framework for sports articles using improved cortical algorithms. Neural Comput. Appl. 31(11):8069–8085.
- (2006) A fast learning algorithm for deep belief nets. Neural Comput. 18(7):1527–1554.
- (2013) Revisiting Frank-Wolfe: Projection-free sparse convex optimization. Proc. 30th Internat. Conf. on Machine Learn., vol. 28.
- (2004) Process consistency for AdaBoost. Ann. Statist. 32(1):13–29.
- (2017) LightGBM: A highly efficient gradient boosting decision tree. Proc. 31st Internat. Conf. on Neural Information Processing Systems, 3149–3157.
- (2020) Deep learning in business analytics and operations research: Models, applications and managerial implications. Eur. J. Oper. Res. 281(3):628–641.
- (2017) ImageNet classification with deep convolutional neural networks. Comm. ACM 60(6):84–90.
- (2018a) Distributed kernel-based gradient descent algorithms. Constructive Approximation 47(2):249–276.
- (2018b) Optimal learning rates for kernel partial least squares. J. Fourier Anal. Appl. 24(3):908–933.
- (2017a) Distributed learning with regularized least squares. J. Machine Learn. Res. 18:1–31.
- (2017b) Learning rates for classification with Gaussian kernels. Neural Comput. 29(12):3353–3380.
- (2013) Optimization of tree ensembles. IEEE Trans. Neural Network Learn. Systems 24(10):1598–1608.
- (2020) Optimization of tree ensembles. Oper. Res. 68(5):1605–1624.
- (2013) The rate of convergence of AdaBoost. J. Machine Learn. Res. 14:2315–2347.
- (2005) Smooth minimization of non-smooth functions. Math. Programming 103(1):127–152.
- (2021) Optimization for l1-norm error fitting via data aggregation. INFORMS J. Comput. 33(1):120–142.
- (2018) CatBoost: Unbiased boosting with categorical features. Proc. 32nd Internat. Conf. on Neural Inform. Processing Systems, 6639–6649.
- (1987) Simplifying decision trees. Internat. J. Human Comput. Stud. 27(3):221–234.
- (2001) The boosting approach to machine learning: An overview. Proc. Workshop on Nonlinear Estimation and Classification.
- (2012) Boosting: Foundations and Algorithms (MIT Press, Cambridge, MA).
- (2011) Concentration estimates for learning with l1-regularizer and data dependent hypothesis spaces. Appl. Comput. Harmon. Anal. 31(2):286–302.
- (2019) Sparse kernel regression with coefficient-based ℓq-regularization. J. Machine Learn. Res. 20(161):1–44.
- (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484–489.
- (2008) Support Vector Machines (Springer, New York).
- (2007) Fast rates for support vector machines using Gaussian kernels. Ann. Statist. 35(2):575–607.
- (2009) Optimal rates for regularized least squares regression. Proc. 22nd Conf. on Learn. Theory.
- (2013) Boosting with the logistic loss is consistent. arXiv preprint arXiv:1305.2648.
- (2008) Greedy approximation. Acta Numerica 17:235–409.
- (2015) Greedy approximation in convex optimization. Constructive Approximation 41(2):269–296.
- (2009) A constraint programming approach for solving a queueing design and control problem. INFORMS J. Comput. 21(4):549–561.
- (2004) Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32(1):135–166.
- (2013) The Nature of Statistical Learning Theory (Springer Science & Business Media, New York).
- (2020) Kernel-based l2-boosting with structure constraints. Preprint, submitted September 16, https://arxiv.org/abs/2009.07558.
- (2019) Rescaled boosting in classification. IEEE Trans. Neural Network Learn. Systems 30(9):2598–2610.
- (2019) Early stopping for kernel boosting algorithms: A general analysis with localized complexities. IEEE Trans. Inform. Theory 65(10):6685–6703.
- (2011) Logistic classification with varying Gaussians. Comput. Math. Appl. 61(2):397–407.
- (2007) On early stopping in gradient descent learning. Constructive Approximation 26(2):289–315.
- (1997) Interior Point Algorithms: Theory and Analysis (Wiley-Interscience, New York).
- (2018) Recent trends in deep learning based natural language processing. IEEE Comput. Intelligence Magazine 13(3):55–75.
- (2017) A new kind of nonparametric test for statistical comparison of multiple classifiers over multiple data sets. IEEE Trans. Cybernetics 47(12):4418–4431.
- (2003) Sequential greedy approximation for certain convex optimization problems. IEEE Trans. Inform. Theory 49(3):682–691.
- (2004) Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32(1):56–85.
- (2005) Boosting with early stopping: Convergence and consistency. Ann. Statist. 33(4):1538–1579.

