A Conditional Gradient Approach for Nonparametric Estimation of Mixing Distributions
Published Online: 21 May 2020 https://doi.org/10.1287/mnsc.2019.3373

