ℓ1-Norm Regularized ℓ1-Norm Best-Fit Lines

Published Online: https://doi.org/10.1287/ijds.2025.0086

Supplemental Material

Online Appendix: ijds.2025.0086.sm1.pdf

Software and Data: ijds.2025.0086.cd.zip


Description of Software and Data

The code and data in the zip file referenced above are a snapshot of the software and data used in the research reported in the paper "ℓ1-Norm Regularized ℓ1-Norm Best-Fit Lines" by Xiao Ling and Paul Brooks.

The goal of this repository is to replicate the numerical experiments in the paper.

Computer and Software Environment

The following describes the computer hardware and software environment on which the authors produced the results reported in the paper.

Dependencies

The code in this repository requires the following dependencies. The dependency version number corresponds to the version of the package with which the code was tested.

Installation

One may access the R function sparsel1 from the pcaL1 package.

To compile the CUDA source file, one needs to install the NVIDIA CUDA Toolkit, version 12 or earlier.

Reproducibility Workflow

The CSV file Figure2.csv is manually computed and is used to generate Figure 2.

Synthetic datasets in the dataset2 folder are used to generate Figure 5. The filename n1000m2000nc0mc0i2 represents a dataset with 1,000 samples and 2,000 columns, with 0 outliers and 0 contaminated columns, generated as replication 2 using seed 2. Its corresponding ground-truth file has the same name with the prefix “v_”, namely v_n1000m2000nc0mc0i2. The README describes how to generate the synthetic data in the dataset4 folder.
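The naming convention above can be decoded mechanically. The following Python sketch is illustrative only and is not part of the repository: the helper names parse_dataset_name and ground_truth_name are hypothetical, and the mapping of the "nc" field to outliers and the "mc" field to contaminated columns is an assumption based on the order in which the description lists them.

```python
import re
from typing import NamedTuple


class DatasetName(NamedTuple):
    """Fields encoded in a synthetic dataset filename (hypothetical helper type)."""
    samples: int                # value after "n"
    columns: int                # value after "m"
    outliers: int               # value after "nc" (assumed mapping)
    contaminated_columns: int   # value after "mc" (assumed mapping)
    replication: int            # value after "i"; also the seed per the description


# Pattern inferred from the single documented example, n1000m2000nc0mc0i2.
_NAME_PATTERN = re.compile(r"^n(\d+)m(\d+)nc(\d+)mc(\d+)i(\d+)$")


def parse_dataset_name(name: str) -> DatasetName:
    """Split a dataset filename into its numeric fields."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized dataset name: {name!r}")
    return DatasetName(*(int(group) for group in match.groups()))


def ground_truth_name(name: str) -> str:
    """Return the ground-truth filename: the dataset name with the prefix 'v_'."""
    parse_dataset_name(name)  # validate the name before deriving the companion file
    return "v_" + name


if __name__ == "__main__":
    print(parse_dataset_name("n1000m2000nc0mc0i2"))
    print(ground_truth_name("n1000m2000nc0mc0i2"))
```

A helper of this kind is mainly useful for looping over the files in a dataset folder and grouping them by size or replication before aggregating results.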

The second row in the table means that when one compiles the CUDA source file sparsel1.cu and runs the executable on one of the datasets in the dataset2 folder with a specified value of λ, the program produces an output file named v_[data replication][lambda], which contains the estimated best-fit line. The execution time for the code on a 3000 × 3000 matrix is roughly 1 minute. The average discordance and ℓ0-norm across 5 replications are then computed using Code_figure6. To illustrate the process, consider the following example:

./sparsel1 n1000m1000nc0mc0i1 22

This command produces a solution file named v_122. The discordance is then computed by comparing v_122 with the ground-truth file v_n1000m1000nc0mc0i1. The aggregated results are stored in the file lbehavior_fixedcolsat1000, located in the latex_zip/csv folder.
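The file naming in this example can be sketched in Python as follows. This is a minimal illustration, not code from the repository: the helper names are hypothetical, and the rule that the output name concatenates the replication index and the value of λ is inferred from the single example given here (replication 1 and λ = 22 yield v_122).

```python
import re
from typing import List


def replication_index(dataset: str) -> str:
    """Extract the replication index, i.e. the digits after the trailing 'i'."""
    match = re.search(r"i(\d+)$", dataset)
    if match is None:
        raise ValueError(f"no replication suffix in: {dataset!r}")
    return match.group(1)


def solution_file(dataset: str, lam: str) -> str:
    """Name of the solution file the executable is described as writing (inferred rule)."""
    return f"v_{replication_index(dataset)}{lam}"


def ground_truth_file(dataset: str) -> str:
    """Name of the ground-truth file: the dataset name with the prefix 'v_'."""
    return f"v_{dataset}"


def run_command(dataset: str, lam: str) -> List[str]:
    """Argument list matching the documented invocation './sparsel1 <dataset> <lambda>'."""
    return ["./sparsel1", dataset, lam]


if __name__ == "__main__":
    dataset, lam = "n1000m1000nc0mc0i1", "22"
    print(" ".join(run_command(dataset, lam)))  # ./sparsel1 n1000m1000nc0mc0i1 22
    print(solution_file(dataset, lam), "vs", ground_truth_file(dataset))
```

The value of λ is kept as a string so that it appears in the file name exactly as it was typed on the command line.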

To reproduce the results in Figure 2
  • Data File: Figure2.csv
  • Code File: None (the plot is produced manually)
  • Output: The plot in Figure 2
  • Run Time at the Above-Specified Computer Conditions: 1 minute
To reproduce the results in Figure 3
  • Data File: Dataset1
  • Code File: Code_figure3.r
  • Output: The plots in Figure 3
  • Run Time at the Above-Specified Computer Conditions: 1 minute
To reproduce the results in Figure 4
  • Data File: Dataset1
  • Code File: Code_figure4.r
  • Output: The plots in Figure 4
  • Run Time at the Above-Specified Computer Conditions: 1 minute
To reproduce the results in Figure 5
  • Data File: Dataset2
  • Code File: Code_figure5.r
  • Output: The plots in Figure 5
  • Run Time at the Above-Specified Computer Conditions: 1 minute
To reproduce the results in Figure 6
  • Data File: Dataset3
  • Code File: Code_figure6.r
  • Output: The plots in Figure 6
  • Run Time at the Above-Specified Computer Conditions: 3 minutes
To reproduce the results in Figure 8
  • Data File: Dataset4
  • Code File: sparsel1.cu
  • Output: The plots in Figure 8
  • Run Time at the Above-Specified Computer Conditions: 3 minutes
To reproduce the results in Figure 9
  • Data File: Dataset4
  • Code File: breakpoints.cu
  • Output: The plots in Figure 9
  • Run Time at the Above-Specified Computer Conditions: 5 minutes

Cite

To cite the contents of this repository, please cite both the paper and this repository using their respective DOIs.

Article: https://doi.org/10.1287/ijds.2025.0086
Software and Data Repository: https://doi.org/10.1287/ijds.2025.0086.cd

License

Copyright (c) 2026 Ling, Brooks

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
