Outlier Detection in Regression: Conic Quadratic Formulations

Andrés Gómez (Corresponding Author)
https://orcid.org/0000-0003-3668-0653
Daniel J. Epstein Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, California 90089

José Neto
https://orcid.org/0000-0002-5354-4816
SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris, 91120 Palaiseau, France

Published Online: 2 Dec 2025
https://doi.org/10.1287/ijoc.2025.1215

References

Agulló J (2001) New algorithms for computing the least trimmed squares regression estimator. Comput. Statist. Data Anal. 36(4):425–439.
Aktürk MS, Atamtürk A, Gürel S (2009) A strong conic quadratic reformulation for machine-job assignment with controllable processing times. Oper. Res. Lett. 37(3):187–191.
Atamtürk A, Gómez A (2020) Safe screening rules for L0-regression from perspective relaxations. Daumé H III, Singh A, eds. Proc. 37th Internat. Conf. Machine Learn. (PMLR, Cambridge, MA), 421–430.
Atamtürk A, Gómez A (2025) Rank-one convexification for sparse regression. J. Machine Learn. Res. 26(35):1–50.
Atamtürk A, Gómez A, Han S (2021) Sparse and smooth signal estimation: Convexification of L0-formulations. J. Machine Learn. Res. 22(1):2370–2412.
Ben-Ameur W, Neto J (2022) New bounds for subset selection from conic relaxations. Eur. J. Oper. Res. 298(2):425–438.
Bernholt T (2005a) Robust estimators are hard to compute. Technical report, University of Dortmund, Dortmund, Germany.
Bernholt T (2005b) Computing the least median of squares estimator in time O(n^d). Gervasi O, Gavrilova ML, Kumar V, Laganà A, Lee HP, Mun Y, Taniar D, Tan CJK, eds. Comput. Sci. Its Appl. ICCSA 2005 (Springer, Berlin), 697–706.
Bertsimas D, Mazumder R (2014) Least quantile regression via modern optimization. Ann. Statist. 42(6):2494–2525.
Bertsimas D, Van Parys B (2020) Sparse high-dimensional regression: Exact scalable algorithms and phase transitions. Ann. Statist. 48(1):300–323.
Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Ann. Statist. 44(2):813–852.
Bhatia K, Jain P, Kar P (2015) Robust regression via hard thresholding. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems 28 (NIPS 2015) (Curran Associates, Inc., Red Hook, NY), 721–729.
Cozad A, Sahinidis NV, Miller DC (2014) Learning surrogate models for simulation-based optimization. AIChE J. 60(6):2211–2227.
Cozad A, Sahinidis NV, Miller DC (2015) A combined first-principles and data-driven approach to model building. Comput. Chemical Engrg. 73:116–127.
Dong H, Chen K, Linderoth J (2015) Regularization vs. relaxation: A conic optimization perspective of statistical variable selection. Preprint, submitted October 20, https://arxiv.org/abs/1510.06083.
El Ghaoui L, Lebret H (1997) Robust solutions to least-squares problems with uncertain data. SIAM J. Matrix Anal. Appl. 18(4):1035–1064.
Frangioni A, Gentile C (2006) Perspective cuts for a class of convex 0–1 mixed integer programs. Math. Programming 106:225–236.
Giloni A, Padberg M (2002) Least trimmed squares regression, least median squares regression, and mathematical programming. Math. Comput. Model. 35(9–10):1043–1060.
Gómez A (2021) Outlier detection in time series via mixed-integer conic quadratic optimization. SIAM J. Optim. 31(3):1897–1925.
Gómez A, Neto J (2025) Outlier detection in regression: Conic quadratic formulations. https://doi.org/10.1287/ijoc.2025.1215.cd, https://github.com/INFORMSJoC/2025.1215.
Gómez A, Prokopyev OA (2021) A mixed-integer fractional optimization approach to best subset selection. INFORMS J. Comput. 33(2):551–565.
Günlük O, Linderoth J (2010) Perspective reformulations of mixed integer nonlinear programs with indicator variables. Math. Programming 124:183–205.
Hampel FR (1971) A general qualitative definition of robustness. Ann. Math. Statist. 42(6):1887–1896.
Hawkins DM (1994) The feasible solution algorithm for least trimmed squares regression. Comput. Statist. Data Anal. 17(2):185–196.
Hazimeh H, Mazumder R (2020) Fast best subset selection: Coordinate descent and local combinatorial optimization algorithms. Oper. Res. 68(5):1517–1537.
Hazimeh H, Mazumder R, Nonet T (2022a) L0Learn: A scalable package for sparse learning using L0 regularization. Preprint, submitted February 10, https://arxiv.org/abs/2202.04820.
Hazimeh H, Mazumder R, Saab A (2022b) Sparse regression at scale: Branch-and-bound rooted in first-order optimization. Math. Programming 196(1–2):347–388.
Huber PJ (1973) Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. Statist. 1(5):799–821.
Huber PJ (2011) Robust Statistics (Springer, Berlin).
Insolia L, Kenney A, Chiaromonte F, Felici G (2022) Simultaneous feature selection and outlier detection with optimality guarantees. Biometrics 78(4):1592–1603.
Kimura K, Waki H (2018) Minimization of Akaike’s information criterion in linear regression analysis via mixed integer nonlinear program. Optim. Methods Software 33(3):633–649.
Liu P, Fattahi S, Gómez A, Küçükyavuz S (2022) A graph-based decomposition method for convex quadratic optimization with indicators. Math. Programming 200:669–701.
Manzour H, Küçükyavuz S, Wu H-H, Shojaie A (2021) Integer programming for learning directed acyclic graphs from continuous data. INFORMS J. Optim. 3(1):46–73.
Mazumder R, Radchenko P, Dedieu A (2023) Subset selection with shrinkage: Sparse linear modeling when the SNR is low. Oper. Res. 71(1):129–147.
Miyashiro R, Takano Y (2015) Mixed integer second-order cone programming formulations for variable selection in linear regression. Eur. J. Oper. Res. 247(3):721–731.
Mount DM, Netanyahu NS, Piatko CD, Silverman R, Wu AY (2014) On the least trimmed squares estimator. Algorithmica 69:148–183.
Park YW, Klabjan D (2020) Subset selection for multiple linear regression via optimization. J. Global Optim. 77(3):543–574.
Rousseeuw PJ (1984) Least median of squares regression. J. Amer. Statist. Assoc. 79(388):871–880.
Rousseeuw PJ, Leroy AM (1987) Robust Regression and Outlier Detection (John Wiley & Sons, New York).
Rousseeuw PJ, Van Driessen K (2006) Computing LTS regression for large data sets. Data Mining Knowledge Discovery 12:29–45.
Rousseeuw P, Yohai V (1984) Robust regression by means of S-estimators. Franke J, Härdle W, Martin D, eds. Robust Nonlinear Time Series Anal. (Springer-Verlag, Berlin), 256–272.
Rousseeuw P, Croux C, Todorov V, Ruckstuhl A, Salibian-Barrera M, Verbeke T, Koller M, Maechler M (2009) Robustbase: Basic robust statistics. R package version 0.4-5. Accessed November 8, 2025, http://CRAN.R-project.org/package=robustbase.
She Y, Owen AB (2011) Outlier detection using nonconvex penalized regression. J. Amer. Statist. Assoc. 106(494):626–639.
Shen Y, Sanghavi S (2019a) Iterative least trimmed squares for mixed linear regression. Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R, eds. Adv. Neural Inform. Processing Systems 32 (Curran Associates, Inc., Red Hook, NY), 6078–6088.
Shen Y, Sanghavi S (2019b) Learning with bad training data via iterative trimmed loss minimization. Chaudhuri K, Salakhutdinov R, eds. Proc. 36th Internat. Conf. Machine Learn. (PMLR, Cambridge, MA), 5739–5748.
Wilson ZT, Sahinidis NV (2017) The ALAMO approach to machine learning. Comput. Chemical Engrg. 106:785–795.
Xie W, Deng X (2020) Scalable algorithms for the sparse ridge regression. SIAM J. Optim. 30(4):3359–3386.
Zheng X, Sun X, Li D (2014) Improving the performance of MIQP solvers for quadratic programs with cardinality and minimum threshold constraints: A semidefinite program approach. INFORMS J. Comput. 26(4):690–703.
Zioutas G, Avramidis A (2005) Deleting outliers in robust regression with mixed integer programming. Acta Math. Appl. Sinica 21:323–334.
Zioutas G, Pitsoulis L, Avramidis A (2009) Quadratic mixed integer programming and support vectors for deleting outliers in robust regression. Ann. Oper. Res. 166(1):339–353.

Information

Received: March 11, 2025
Accepted: August 29, 2025
Published Online: December 2, 2025

Cite as

Andrés Gómez, José Neto (2025) Outlier Detection in Regression: Conic Quadratic Formulations. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2025.1215
