The Information Projection in Moment Inequality Models: Existence, Dual Representation, and Approximation

Published Online: https://doi.org/10.1287/moor.2024.0568

References

  • [1] Alvarez-Mena J, Hernández-Lerma O (2005) Convergence and approximation of optimization problems. SIAM J. Optim. 15(2):527–539.
  • [2] Alwan L, Ebrahimi N, Soofi E (1998) Information theoretic framework for process control. Eur. J. Oper. Res. 111(3):526–542.
  • [3] Andrews DWK, Shi X (2013) Inference based on conditional moment inequalities. Econometrica 81(2):609–666.
  • [4] Artstein Z (1983) Distributions of random sets and random selections. Israel J. Math. 46:313–324.
  • [5] Bajgiran AH, Mardikoraem M, Soofi ES (2021) Maximum entropy distributions with quantile information. Eur. J. Oper. Res. 290(1):196–209.
  • [6] Bhattacharya B (2006) An iterative procedure for general probability measures to obtain I-projections onto intersections of convex sets. Ann. Statist. 34(2):878–902.
  • [7] Bhattacharya B, Dykstra RL (1995) A general duality approach to I-projections. J. Statist. Planning Inference 47(3):203–216.
  • [8] Bhattacharya B, Dykstra RL (1997) On Dykstra’s iterative fitting procedure. Ann. Inst. Statist. Math. 49(3):435–446.
  • [9] Billingsley P (1995) Probability and Measure, 3rd ed. (John Wiley & Sons, New York).
  • [10] Bogachev VI (2007) Measure Theory, vol. 1 (Springer-Verlag, Berlin).
  • [11] Borwein JM, Lewis AS (1991) Duality relationships for entropy-like minimization problems. SIAM J. Control Optim. 29(2):325–338.
  • [12] Borwein JM, Lewis AS (1993) Partially-finite programming in L1 and the existence of maximum entropy estimates. SIAM J. Optim. 3(2):248–267.
  • [13] Boyle P, Feng S, Tian W (2007) Chapter 24 Large deviation techniques and financial applications. Birge JR, Linetsky V, eds. Financial Engineering, Handbooks in Operations Research and Management Science, vol. 15 (Elsevier, Amsterdam), 971–1000.
  • [14] Brockett PL, Charnes A, Cooper WW, Learner D, Phillips FY (1995) Information theory as a unifying statistical approach for use in marketing research. Eur. J. Oper. Res. 84(2):310–329.
  • [15] Canay IA (2010) EL inference for partially identified models: Large deviations optimality and bootstrap validity. J. Econom. 156(2):408–425.
  • [16] Chang YC (2022) A design parameter-free geometric Kullback-Leibler information control chart for monitoring Bernoulli processes. Comput. Indust. Engrg. 169:108150.
  • [17] Chen C (2008) An information-theoretic view of visual analytics. IEEE Comput. Graphics Appl. 28(1):18–23.
  • [18] Csiszár I (1975) I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3(1):146–158.
  • [19] Csiszár I (1984) Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12(3):768–793.
  • [20] Csiszár I, Matus F (2003) Information projections revisited. IEEE Trans. Inform. Theory 49(6):1474–1490.
  • [21] Dal Maso G (1993) An introduction to Γ-convergence. Progress in Nonlinear Differential Equations and Their Applications (Birkhäuser, Boston).
  • [22] Dembo A, Kontoyiannis L (2002) Source coding, large deviations, and approximate pattern matching. IEEE Trans. Inform. Theory 48(6):1590–1615.
  • [23] Dhillon IS, Mallela S, Kumar R (2003) A divisive information-theoretic feature clustering algorithm for text classification. J. Machine Learn. Res. 3:1256–1287.
  • [24] Du J, Hu M, Zhang W (2020) Missing data problem in the monitoring system: A review. IEEE Sensors J. 20(23):13984–13998.
  • [25] Dykstra RL (1985) An iterative procedure for obtaining I-projections onto the intersection of convex sets. Ann. Probab. 13(3):975–984.
  • [26] Dykstra RL, Wollan PC (1987) Algorithm AS 228: Finding I-projections subject to a finite set of linear inequality constraints. J. Roy. Statist. Soc. Ser. C (Appl. Statist.) 36(3):377–383.
  • [27] Floerchinger S, Haas T (2020) Thermodynamics from relative entropy. Phys. Rev. E 102(5–1):052117.
  • [28] Folland GB (1999) Real Analysis: Modern Techniques and Their Applications, 2nd ed. (Wiley Inter-Science, New York).
  • [29] Foster J, Shorrocks A (1988) Poverty orderings. Econometrica 56(1):173–177.
  • [30] Frittelli M (2000) The minimal entropy martingale measure and the valuation problem in incomplete markets. Math. Finance 10(1):39–52.
  • [31] Ganchev K, Graca J, Gillenwater J, Taskar B (2010) Posterior regularization for structured latent variable models. J. Machine Learn. Res. 11(67):2001–2049.
  • [32] Gelfand IM (1936) Sur un lemme de la theorie des espaces lineaires. Comm. de L’Institut Des Sci. Mathématiques et Mécaniques de L’Université de Kharkoff et la Société Mathématique de Kharkoff 13:35–40.
  • [33] Grimmett G, Stirzaker D (2001) Probability and Random Processes, 3rd ed. (Oxford University Press, Oxford).
  • [34] Grünwald P, de Heide R, Koolen W (2024) Safe testing. J. Roy. Statist. Soc. Ser. B (Statist. Methodology) 86(5):1091–1128.
  • [35] Haberman SJ (1984) Adjustment by minimum discriminant information. Ann. Statist. 12(3):971–988.
  • [36] Hanche-Olsen H, Holden H (2010) The Kolmogorov-Riesz compactness theorem. Expositiones Mathematicae 28(4):385–394.
  • [37] Harris TR, Mapp HP (1986) A stochastic dominance comparison of water-conserving irrigation strategies. Amer. J. Agricultural Econom. 68(2):298–305.
  • [38] Hoeffding W (1965) Asymptotically optimal tests for multinomial distributions. Ann. Math. Statist. 36(2):369–401.
  • [39] Kallenberg O (2021) Foundations of Modern Probability, Probability Theory and Stochastic Modelling, 3rd ed., vol. 99 (Springer Nature, Switzerland).
  • [40] Kandasamy K, Krishnamurthy A, Poczos B, Wasserman L, Robins J (2015) Nonparametric von Mises estimators for entropies, divergences and mutual informations. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 28 (Curran Associates, Inc., Red Hook, NY).
  • [41] Kortanek KO (1993) Semi-infinite programming duality for order restricted statistical inference models. Zeitschrift Für Oper. Res. 37(3):285–301.
  • [42] Kullback S (1959) Information Theory and Statistics (Wiley, New York).
  • [43] Kullback S, Leibler RA (1951) On information and sufficiency. Ann. Math. Statist. 22(1):79–86.
  • [44] Lassance N, Vrins F (2023) Portfolio selection: A target-distribution approach. Eur. J. Oper. Res. 310(1):302–314.
  • [45] Léonard C (2012) From the Schrödinger problem to the Monge-Kantorovich problem. J. Functional Anal. 262(4):1879–1920.
  • [46] Luenberger D (1969) Optimization by Vector Space Methods, Series in Decision and Control (John Wiley & Sons, Inc., New York).
  • [47] Manski CF (2005) Partial identification with missing data: Concepts and findings. Internat. J. Approximate Reasoning 39(2–3):151–165.
  • [48] Nutz M, Wiesel J (2022) Entropic optimal transport: Convergence of potentials. Probab. Theory Related Fields 184:401–424.
  • [49] Pettis BJ (1938) On integration in vector spaces. Trans. Amer. Math. Soc. 44:70–74.
  • [50] Peyre G, Cuturi M (2019) Computational optimal transport. Foundations and Trends in Machine Learning 11(5–6):355–607.
  • [51] Post T, Potì V (2017) Portfolio analysis using stochastic dominance, relative entropy, and empirical likelihood. Management Sci. 63(1):153–165.
  • [52] Rudin W (1991) Functional Analysis, International Series in Pure and Applied Mathematics, 2nd ed. (McGraw-Hill, Boston).
  • [53] Sanov IN (1957) On the probability of large deviations of random variables. Matematicheskii Sbornik 42:11–44.
  • [54] Shapiro A, Dentcheva D, Ruszczynski A (2009) Lectures on Stochastic Programming. Modeling and Theory, MPS-SIAM Series on Optimization (SIAM-MPS, Philadelphia).
  • [55] Sheehy A (1988) Kullback-Leibler constrained estimation of probability measures. Technical Report 132, Department of Statistics, University of Washington, Seattle.
  • [56] Soofi E, Retzer J (2002) Information indices: Unification and applications. J. Econom. 107(1):17–40.
  • [57] Steiner SH, MacKay RJ (2001) Monitoring processes with data censored owing to competing risks by using exponentially weighted moving average control charts. J. Roy. Statist. Soc. Ser. C (Appl. Statist.) 50(3):293–302.
  • [58] van der Vaart AW, Wellner J (1996) Weak Convergence and Empirical Processes, Springer Series in Statistics, 1st ed. (Springer, New York).
  • [59] Verteramo Chiu L, Tauer L, Gröhn Y, Smith R (2020) Ranking disease control strategies with stochastic outcomes. Preventive Veterinary Medicine 176:104906.
  • [60] Wayne Patty C (1993) Foundations of Topology (Waveland Press Inc., Prospect Heights).