Targeting Prospective Customers: Robustness of Machine-Learning Methods to Typical Data Challenges
Published Online: 15 Nov 2019
https://doi.org/10.1287/mnsc.2019.3308

