Trimmed Statistical Estimation via Variance Reduction

Published Online:https://doi.org/10.1287/moor.2019.0992

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization. For data sets of size n, our approach requires O(n2/3/) gradient evaluations to reach -accuracy, and when a certain error bound holds, the complexity improves to O(κn2/3 log(1/)), where κ is a “condition number.” These rates are n1/3 times better than those achieved by typical, nonstochastic methods.

INFORMS site uses cookies to store information on your computer. Some are essential to make our site work; Others help us improve the user experience. By using this site, you consent to the placement of these cookies. Please read our Privacy Statement to learn more.