Predicting with Proxies: Transfer Learning in High Dimension

Published Online: https://doi.org/10.1287/mnsc.2020.3729

References

  • Ahsen ME, Ayvaci MUS, Raghunathan S (2019) When algorithmic predictions use human-generated data: A bias-aware classification algorithm for breast cancer diagnosis. Inform. Systems Res. 30(1):97–116.
  • Anderer A, Bastani H, Silberholz J (2019) Adaptive clinical trial designs with surrogates: When should we bother? Preprint, submitted June 13, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3397464.
  • Axon RN, Williams MV (2011) Hospital readmission as an accountability measure. JAMA 305(5):504–505.
  • Baardman L, Levin I, Perakis G, Singhvi D (2017) Leveraging comparables for new product sales forecasting. Working paper.
  • Bastani H, Bastani O, Kim C (2017) Interpreting predictive models for human-in-the-loop analytics. Working paper, University of Michigan, Ann Arbor.
  • Bastani H, Simchi-Levi D, Zhu R (2019) Meta dynamic pricing: Learning across experiments. Preprint, submitted February 28, 2019, https://arxiv.org/abs/1902.10918.
  • Bayati M, Bhaskar S, Montanari A (2018) Statistical analysis of a low cost method for multiple disease prediction. Statist. Methods Medical Res. 27(8):2312–2328.
  • Belloni A, Chernozhukov V, Hansen C (2014) Inference on treatment effects after selection among high-dimensional controls. Rev. Econom. Stud. 81(2):608–650.
  • Belloni A, Chen D, Chernozhukov V, Hansen C (2012) Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica 80(6):2369–2429.
  • Bickel P, Ritov Y, Tsybakov AB (2009) Simultaneous analysis of LASSO and Dantzig selector. Ann. Statist. 37(4):1705–1732.
  • Brynjolfsson E, Hu YJ, Rahman MS (2013) Competing in the age of omnichannel retailing. MIT Sloan Management Rev. (May 21), https://sloanreview.mit.edu/article/competing-in-the-age-of-omnichannel-retailing/.
  • Bühlmann P, Van De Geer S (2011) Statistics for High-Dimensional Data: Methods, Theory and Applications (Springer Science & Business Media, Heidelberg, Germany).
  • Candes E, Tao T (2007) The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Statist. 35(6):2313–2351.
  • Caruana R (1997) Multitask learning. Machine Learning 28(1):41–75.
  • Chen SS, Donoho DL, Saunders MA (1995) Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1):33–61.
  • CMS (2018) Readmissions reduction program (HRRP). Accessed October 2, 2018, https://www.cms.gov/medicare/medicare-fee-for-service-payment/acuteinpatientpps/readmissions-reduction-program.html.
  • Collobert R, Weston J (2008) A unified architecture for natural language processing: Deep neural networks with multitask learning. Proc. 25th Internat. Conf. Machine Learn. (ACM), 160–167.
  • Cowie CC, Rust KF, Ford ES, Eberhardt MS, Byrd-Holt DD, Li C, Williams DE, et al. (2009) Full accounting of diabetes and pre-diabetes in the US population in 1988–1994 and 2005–2006. Diabetes Care 32(2):287–294.
  • Dzyabura D, Jagabathula S, Muller E (2019) Accounting for discrepancies between online and offline product evaluations. Marketing Sci. 38(1):88–106.
  • Farias VF, Li AA (2019) Learning preferences with side information. Management Sci. 65(7):3131–3149.
  • Friedman J, Hastie T, Tibshirani R (2001) The Elements of Statistical Learning, vol. 1. (Springer, New York).
  • Homrighausen D, McDonald DJ (2014) Leave-one-out cross-validation is risk consistent for LASSO. Machine Learning 97(1-2):65–78.
  • ICDM (2013) Personalized Expedia hotel searches. Accessed December 28, 2018, https://www.kaggle.com/c/expedia-personalized-sort.
  • Jalali A, Sanghavi S, Ruan C, Ravikumar P (2010) A dirty model for multi-task learning. XXX eds. Advances in Neural Information Processing Systems, vol. XX (Curran Associates, Red Hook, NY), 964–972.
  • Läll K, Mägi R, Morris A, Metspalu A, Fischer K (2017) Personalized risk prediction for type 2 diabetes: The potential of genetic risk scores. Genetics Medicine 19(3):322–329.
  • Li K-C (1987) Asymptotic optimality for C_p, C_L, cross-validation and generalized cross-validation: Discrete index set. Ann. Statist. 15(3):958–975.
  • McCullagh P, Nelder JA (1989) Generalized Linear Models, 2nd ed. (Chapman & Hall, London).
  • Meier L, Van De Geer S, Bühlmann P (2008) The group LASSO for logistic regression. J. Roy. Statist. Soc. Series B Statist. Methodology 70(1):53–71.
  • Milstein A (2009) Ending extra payment for “never events”—stronger incentives for patients’ safety. New England J. Medicine 360(23):2388–2390.
  • Mullainathan S, Obermeyer Z (2017) Does machine learning automate moral hazard and error? Amer. Econom. Rev. 107(5):476–480.
  • Negahban S, Yu B, Wainwright MJ, Ravikumar PK (2009) A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. XXX eds. Advances in Neural Information Processing Systems, vol. XX (Curran Associates, Red Hook, NY), 1348–1356.
  • Obermeyer Z, Lee TH (2017) Lost in thought—the limits of the human mind and the future of medicine. New England J. Medicine 377(13):1209–1211.
  • Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans. Knowledge Data Engrg. 22(10):1345–1359.
  • Picard RR, Cook RD (1984) Cross-validation of regression models. J. Amer. Statist. Assoc. 79(387):575–583.
  • Raina R, Ng AY, Koller D (2006) Constructing informative priors using transfer learning. Proc. 23rd Internat. Conf. Machine Learning (ACM), 713–720.
  • Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Series B: Methodological 58(1):267–288.
  • Tsybakov AB (2004) Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32(1):135–166.
  • Tuomilehto J, Schwarz P, Lindström J (2011) Long-term benefits from lifestyle interventions for type 2 diabetes prevention: Time to expand the efforts. Diabetes Care 34(Supplement 2):S210–S214.
  • Wainwright M (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint (Cambridge University Press, Cambridge, UK).
  • Zhang D, Dai H, Dong L, Wu Q, Guo L, Liu X (2018) The value of pop-up stores in driving online engagement in platform retailing: Evidence from a large-scale field experiment with Alibaba. Preprint, submitted March 5, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3129506.