Reproducible Feature Selection for High-Dimensional Measurement Error Models

Published Online: https://doi.org/10.1287/ijoc.2023.0282

References

  • Agarwal A, Shah D, Shen D, Song D (2021) On robustness of principal component regression. J. Amer. Statist. Assoc. 116(536):1731–1745.
  • Barber RF, Candès EJ (2015) Controlling the false discovery rate via knockoffs. Ann. Statist. 43(5):2055–2085.
  • Barber RF, Candès EJ (2019) A knockoff filter for high-dimensional selective inference. Ann. Statist. 47(5):2504–2537.
  • Barber RF, Candès EJ, Samworth RJ (2020) Robust inference with knockoffs. Ann. Statist. 48(3):1409–1431.
  • Bastani H, Bayati M (2020) Online decision making with high-dimensional covariates. Oper. Res. 68(1):276–294.
  • Bates S, Candès EJ, Janson L, Wang W (2021) Metropolized knockoff sampling. J. Amer. Statist. Assoc. 116(535):1413–1427.
  • Belloni A, Chernozhukov V, Kaul A (2017) Confidence bands for coefficients in high dimensional linear models with error-in-variables. Preprint, submitted March 1, https://arxiv.org/abs/1703.00469.
  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. B 57(1):289–300.
  • Byrd M, McGee M (2019) A simple correction procedure for high-dimensional generalized linear models with measurement error. Preprint, submitted December 26, https://arxiv.org/abs/1912.11740.
  • Cai T, Liu W, Luo X (2011) A constrained ℓ1 minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106(494):594–607.
  • Candès EJ, Fan Y, Janson L, Lv J (2018) Panning for gold: ‘Model-X’ knockoffs for high-dimensional controlled variable selection. J. Roy. Statist. Soc. B 80(3):551–577.
  • Cao Y, Sun X, Yao Y (2024) Controlling the false discovery rate in transformational sparsity: Split knockoffs. J. Roy. Statist. Soc. B 86(2):386–410.
  • Cheng Y, Wang X, Xia Y (2021) Supervised t-distributed stochastic neighbor embedding for data visualization and classification. J. Amer. Statist. Assoc. 111(2):394–406.
  • Chudik A, Kapetanios G, Pesaran H (2018) A one covariate at a time, multiple testing approach to variable selection in high-dimensional linear regression models. Econometrica 86(4):1479–1512.
  • Datta A, Zou H (2017) CoCoLasso for high-dimensional error-in-variables regression. Ann. Statist. 45(6):2400–2426.
  • Duchi J, Shalev-Shwartz S, Singer Y, Chandra T (2008) Efficient projections onto the ℓ1-ball for learning in high dimensions. McCallum A, Roweis S, eds. Proc. 25th Internat. Conf. Machine Learn. (Association for Computing Machinery, New York), 272–279.
  • Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J. Roy. Statist. Soc. B 70(5):849–911.
  • Fan Y, Lv J (2016) Innovated scalable efficient estimation in ultra-large Gaussian graphical models. Ann. Statist. 44(5):2098–2126.
  • Fan Y, Demirkaya E, Lv J (2019) Nonuniformity of p-values can occur early in diverging dimensions. J. Machine Learn. Res. 20(77):1–33.
  • Fan Y, Gao L, Lv J (2024) ARK: Robust knockoffs inference with coupling. Preprint, submitted June 4, https://arxiv.org/abs/2307.04400.
  • Fan J, Han X, Gu W (2012) Control of the false discovery rate under arbitrary covariance dependence (with discussion). J. Amer. Statist. Assoc. 107(499):1019–1045.
  • Fan Y, Demirkaya E, Li G, Lv J (2020a) RANK: Large-scale inference with graphical nonlinear knockoffs. J. Amer. Statist. Assoc. 115(529):362–379.
  • Fan Y, Lv J, Sharifvaghefi M, Uematsu Y (2020b) IPAD: Stable interpretable forecasting with knockoffs inference. J. Amer. Statist. Assoc. 115(532):1822–1834.
  • Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441.
  • Huang N, Mojumder P, Sun T, Lv J, Golden JM (2021) Not registered? Please sign-up first: A randomized field experiment on the ex-ante registration request. Inform. Systems Res. 32(3):914–931.
  • Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP (2003) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2):249–264.
  • Jayachandran S, Sharma S, Kaufman P, Raman P (2005) The role of relational information processes and technology use in customer relationship management. J. Marketing 69(4):177–192.
  • Jiang F, Zhou Y, Liu J, Ma Y (2023) On high-dimensional Poisson models with measurement error: Hypothesis testing for nonlinear nonconvex optimization. Ann. Statist. 51(1):233–259.
  • Kallus N, Mao X, Udell M (2018) Causal inference with noisy and missing covariates via matrix factorization. Advances in Neural Information Processing Systems, vol. 31 (Curran Associates, Inc., Red Hook, NY), 6921–6932.
  • Li M, Li R, Ma Y (2021) Inference in high-dimensional linear measurement error models. J. Multivariate Anal. 184:104759.
  • Liang H, Li R (2009) Variable selection for partially linear models with measurement errors. J. Amer. Statist. Assoc. 104(485):234–248.
  • Loh PL, Tan LX (2018) High-dimensional robust precision matrix estimation: Cellwise corruption under ϵ-contamination. Electronic J. Statist. 12(1):1429–1467.
  • Loh PL, Wainwright MJ (2012) High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity. Ann. Statist. 40(3):1637–1664.
  • Paulson C, Luo L, James GM (2018) Efficient large-scale internet media selection optimization for online display advertising. J. Marketing Res. 55(4):489–506.
  • Ravikumar P, Wainwright MJ, Raskutti G, Yu B (2011) High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. Electronic J. Statist. 5:935–980.
  • Reppe S, Refvem H, Gautvik VT, Olstad OK, Høvring PI, Reinholt FP, Holden M, Frigessi A, Jemtland R, Gautvik KM (2010) Eight genes are highly associated with BMD variation in postmenopausal Caucasian women. Bone 46(3):604–612.
  • Rocke DM, Durbin B (2001) A model for measurement error for gene expression arrays. J. Comput. Biol. 8(6):557–569.
  • Romano Y, Sesia M, Candès E (2020) Deep knockoffs. J. Amer. Statist. Assoc. 115(532):1861–1872.
  • Sørensen Ø, Frigessi A, Thoresen M (2015) Measurement error in lasso: Impact and likelihood bias correction. Statist. Sinica 25(2):809–829.
  • Sørensen Ø, Hellton KH, Frigessi A, Thoresen M (2018) Covariate selection in high-dimensional generalized linear models with measurement error. J. Comput. Graph. Statist. 27(4):739–749.
  • Sui C, Wen H, Han J, Chen T, Gao Y, Wang Y, Yang L, Guo L (2023) Decreased gray matter volume in the right middle temporal gyrus associated with cognitive dysfunction in preeclampsia superimposed on chronic hypertension. Frontiers Neurosci. 17:1138952.
  • Tang CY, Fan Y, Kong Y (2020) Precision matrix estimation by inverse principal orthogonal decomposition. Commun. Math. Res. 36(1):68–92.
  • Uematsu Y, Tanaka S (2019) High-dimensional macroeconomic forecasting and variable selection via penalized regression. Econom. J. 22(1):34–56.
  • van de Geer S, Bühlmann P, Ritov YA, Dezeure R (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42(3):1166–1202.
  • Wang Z, Xue L (2019) Inference for high dimensional linear models with error-in-variables. Comm. Statist. Simulation Comput. 13(1):1–10.
  • Wang Y, Wang J, Balakrishnan S, Singh A (2019) Rate optimal estimation and confidence intervals for high-dimensional regression with missing covariates. J. Multivariate Anal. 174:104526.
  • Weiner MW, Veitch DP, Aisen PS, Beckett LA, Cairns NJ, Green RC, Harvey D, et al. (2017) The Alzheimer’s Disease Neuroimaging Initiative 3: Continued innovation for clinical trial improvement. Alzheimer’s Dementia 13(5):561–571.
  • Yang M, Adomavicius G, Burtch G, Ren Y (2018) Mind the gap: Accounting for measurement error and misclassification in variables generated via data mining. Inform. Systems Res. 29(1):4–24.
  • Yuan M (2010) High dimensional inverse covariance matrix estimation via linear programming. J. Machine Learn. Res. 11(79):2261–2286.
  • Zhang CH, Zhang S (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. J. Roy. Statist. Soc. B 76(1):217–242.
  • Zhao J, Zhou Y, Liu Y (2024) Estimation of linear functionals in high-dimensional linear models: From sparsity to nonsparsity. J. Amer. Statist. Assoc. 119(546):1579–1591.
  • Zheng Z, Lv J, Lin W (2021) Nonsparse learning with latent variables. Oper. Res. 69(1):346–359.
  • Zhou X, Li Y, Zheng Z, Wu J, Zhang J (2024) Reproducible feature selection for high-dimensional measurement error models. http://dx.doi.org/10.1287/ijoc.2023.0282.cd, https://github.com/INFORMSJoC/2023.0282.
  • Zhu F, Iansiti M (2012) Entry into platform-based markets. Strategic Management J. 33(1):88–106.