ℓ1-Norm Regularized ℓ1-Norm Best-Fit Lines
Abstract
We propose an optimization framework for estimating a sparse outlier-insensitive one-dimensional subspace. Our objective is to minimize both the representation error and a penalty on the loadings using an -norm criterion for each. To our knowledge, computational methods for this pure -norm formulation have not been previously proposed. Given that the problem is NP-hard, we introduce a linear relaxation-based approach. We present a novel fitting procedure, utilizing sortings of simple ratios. The proposed algorithm has a worst-case time complexity of . We show that, under certain circumstances, the method achieves global optimality. Compared with previously proposed methods, the proposed algorithm often finds the subspace with the lowest discordance and offers a smoother tradeoff between sparsity of the estimate and fit. We demonstrate the scalability with a parallel GPU implementation that provides a 16-fold improvement in computational speed for matrices of 2,000 × 2,000 over a serial CPU implementation. Furthermore, this method is distinguished by several advantages, including the lack of a need for initialization and deterministic and replicable steps. An application to human microbiome data demonstrates the effectiveness of the algorithm in deriving meaningful and sparse solutions. By combining sparsity for interpretability with outlier insensitivity, the proposed approach enables more stable extraction of latent factors, thereby improving the reliability of downstream decision models.
History: Yu Ding served as the senior editor for this article.
Funding: X. Ling was supported in part by the Auburn University at Montgomery GIA Program.
Supplemental Material: The data and supplemental material are available at https://doi.org/10.1287/ijds.2025.0086.

