Predicting with Proxies: Transfer Learning in High Dimension

Hamsa Bastani
Corresponding Author
[email protected]
https://orcid.org/0000-0002-8793-4732
Operations Information and Decisions, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104

Published Online: 2 Oct 2020
https://doi.org/10.1287/mnsc.2020.3729

Volume 67, Issue 5

May 2021

Pages 2657-3320, iii-iv

Received: December 28, 2018
Accepted: June 02, 2020
Published Online: October 02, 2020

Cite as

Hamsa Bastani (2020) Predicting with Proxies: Transfer Learning in High Dimension. Management Science 67(5):2964-2984.

https://doi.org/10.1287/mnsc.2020.3729
