A Reduced Modeling Approach for Making Predictions with Incomplete Data Having Blockwise Missing Patterns
Published Online: 26 Nov 2024. https://doi.org/10.1287/ijds.2022.9016

