The Information Projection in Moment Inequality Models: Existence, Dual Representation, and Approximation

Rami Tabri
[email protected]
https://orcid.org/0000-0003-4117-0498
Department of Econometrics and Business Statistics, Monash University, Clayton, Victoria 3800, Australia

Published Online: 15 Sep 2025
https://doi.org/10.1287/moor.2024.0568

References

[1] Alvarez-Mena J, Hernández-Lerma O (2005) Convergence and approximation of optimization problems. SIAM J. Optim. 15(2):527–539.
[2] Alwan L, Ebrahimi N, Soofi E (1998) Information theoretic framework for process control. Eur. J. Oper. Res. 111(3):526–542.
[3] Andrews DWK, Shi X (2013) Inference based on conditional moment inequalities. Econometrica 81(2):609–666.
[4] Artstein Z (1983) Distributions of random sets and random selections. Israel J. Math. 46:313–324.
[5] Bajgiran AH, Mardikoraem M, Soofi ES (2021) Maximum entropy distributions with quantile information. Eur. J. Oper. Res. 290(1):196–209.
[6] Bhattacharya B (2006) An iterative procedure for general probability measures to obtain I-projections onto intersections of convex sets. Ann. Statist. 34(2):878–902.
[7] Bhattacharya B, Dykstra RL (1995) A general duality approach to I-projections. J. Statist. Planning Inference 47(3):203–216.
[8] Bhattacharya B, Dykstra RL (1997) On Dykstra’s iterative fitting procedure. Ann. Inst. Statist. Math. 49(3):435–446.
[9] Billingsley P (1995) Probability and Measure, 3rd ed. (John Wiley & Sons, New York).
[10] Bogachev VI (2007) Measure Theory, vol. 1 (Springer-Verlag, Berlin).
[11] Borwein JM, Lewis AS (1991) Duality relationships for entropy-like minimization problems. SIAM J. Control Optim. 29(2):325–338.
[12] Borwein JM, Lewis AS (1993) Partially-finite programming in L1 and the existence of maximum entropy estimates. SIAM J. Optim. 3(2):248–267.
[13] Boyle P, Feng S, Tian W (2007) Chapter 24 Large deviation techniques and financial applications. Birge JR, Linetsky V, eds. Financial Engineering, Handbooks in Operations Research and Management Science, vol. 15 (Elsevier, Amsterdam), 971–1000.
[14] Brockett PL, Charnes A, Cooper WW, Learner D, Phillips FY (1995) Information theory as a unifying statistical approach for use in marketing research. Eur. J. Oper. Res. 84(2):310–329.
[15] Canay IA (2010) EL inference for partially identified models: Large deviations optimality and bootstrap validity. J. Econom. 156(2):408–425.
[16] Chang YC (2022) A design parameter-free geometric Kullback-Leibler information control chart for monitoring Bernoulli processes. Comput. Indust. Engrg. 169:108150.
[17] Chen C (2008) An information-theoretic view of visual analytics. IEEE Comput. Graphics Appl. 28(1):18–23.
[18] Csiszár I (1975) I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3(1):146–158.
[19] Csiszár I (1984) Sanov property, generalized I-projection and a conditional limit theorem. Ann. Probab. 12(3):768–793.
[20] Csiszár I, Matus F (2003) Information projections revisited. IEEE Trans. Inform. Theory 49(6):1474–1490.
[21] Dal Maso G (1993) An introduction to Γ-convergence. Progress in Nonlinear Differential Equations and Their Applications (Birkhäuser, Boston).
[22] Dembo A, Kontoyiannis I (2002) Source coding, large deviations, and approximate pattern matching. IEEE Trans. Inform. Theory 48(6):1590–1615.
[23] Dhillon IS, Mallela S, Kumar R (2003) A divisive information-theoretic feature clustering algorithm for text classification. J. Machine Learn. Res. 3:1256–1287.
[24] Du J, Hu M, Zhang W (2020) Missing data problem in the monitoring system: A review. IEEE Sensors J. 20(23):13984–13998.
[25] Dykstra RL (1985) An iterative procedure for obtaining I-projections onto the intersection of convex sets. Ann. Probab. 13(3):975–984.
[26] Dykstra RL, Wollan PC (1987) Algorithm AS 228: Finding I-projections subject to a finite set of linear inequality constraints. J. Roy. Statist. Soc. Ser. C (Appl. Statist.) 36(3):377–383.
[27] Floerchinger S, Haas T (2020) Thermodynamics from relative entropy. Phys. Rev. E 102(5–1):052117.
[28] Folland GB (1999) Real Analysis: Modern Techniques and Their Applications, 2nd ed. (Wiley Inter-Science, New York).
[29] Foster J, Shorrocks A (1988) Poverty orderings. Econometrica 56(1):173–177.
[30] Frittelli M (2000) The minimal entropy martingale measure and the valuation problem in incomplete markets. Math. Finance 10(1):39–52.
[31] Ganchev K, Graca J, Gillenwater J, Taskar B (2010) Posterior regularization for structured latent variable models. J. Machine Learn. Res. 11(67):2001–2049.
[32] Gelfand IM (1936) Sur un lemme de la theorie des espaces lineaires. Comm. de L’Institut Des Sci. Mathématiques et Mécaniques de L’Université de Kharkoff et la Société Mathématique de Kharkoff 13:35–40.
[33] Grimmett G, Stirzaker D (2001) Probability and Random Processes, 3rd ed. (Oxford University Press, Oxford).
[34] Grünwald P, de Heide R, Koolen W (2024) Safe testing. J. Roy. Statist. Soc. Ser. B (Statist. Methodology) 86(5):1091–1128.
[35] Haberman SJ (1984) Adjustment by minimum discriminant information. Ann. Statist. 12(3):971–988.
[36] Hanche-Olsen H, Holden H (2010) The Kolmogorov-Riesz compactness theorem. Expositiones Mathematicae 28(4):385–394.
[37] Harris TR, Mapp HP (1986) A stochastic dominance comparison of water-conserving irrigation strategies. Amer. J. Agricultural Econom. 68(2):298–305.
[38] Hoeffding W (1965) Asymptotically optimal tests for multinomial distributions. Ann. Math. Statist. 36(2):369–401.
[39] Kallenberg O (2021) Foundations of Modern Probability, Probability Theory and Stochastic Modelling, 3rd ed., vol. 99 (Springer Nature, Switzerland).
[40] Kandasamy K, Krishnamurthy A, Poczos B, Wasserman L, Robins J (2015) Nonparametric von Mises estimators for entropies, divergences and mutual informations. Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R, eds. Adv. Neural Inform. Processing Systems, vol. 28 (Curran Associates, Inc., Red Hook, NY).
[41] Kortanek KO (1993) Semi-infinite programming duality for order restricted statistical inference models. Zeitschrift Für Oper. Res. 37(3):285–301.
[42] Kullback S (1959) Information Theory and Statistics (Wiley, New York).
[43] Kullback S, Leibler RA (1951) On information and sufficiency. Ann. Math. Statist. 22(1):79–86.
[44] Lassance N, Vrins F (2023) Portfolio selection: A target-distribution approach. Eur. J. Oper. Res. 310(1):302–314.
[45] Léonard C (2012) From the Schrödinger problem to the Monge-Kantorovich problem. J. Functional Anal. 262(4):1879–1920.
[46] Luenberger D (1969) Optimization by Vector Space Methods, Series in Decision and Control (John Wiley & Sons, Inc., New York).
[47] Manski CF (2005) Partial identification with missing data: Concepts and findings. Internat. J. Approximate Reasoning 39(2–3):151–165.
[48] Nutz M, Wiesel J (2022) Entropic optimal transport: Convergence of potentials. Probab. Theory Related Fields 184:401–424.
[49] Pettis BJ (1938) On integration in vector spaces. Trans. Amer. Math. Soc. 44:70–74.
[50] Peyre G, Cuturi M (2019) Computational optimal transport. Foundations and Trends in Machine Learning 11(5–6):355–607.
[51] Post T, Potì V (2017) Portfolio analysis using stochastic dominance, relative entropy, and empirical likelihood. Management Sci. 63(1):153–165.
[52] Rudin W (1991) Functional Analysis, International Series in Pure and Applied Mathematics, 2nd ed. (McGraw-Hill, Boston).
[53] Sanov IN (1957) On the probability of large deviations of random variables. Matematicheskii Sbornik 42:11–44.
[54] Shapiro A, Dentcheva D, Ruszczynski A (2009) Lectures on Stochastic Programming. Modeling and Theory, MPS-SIAM Series on Optimization (SIAM-MPS, Philadelphia).
[55] Sheehy A (1988) Kullback-Leibler constrained estimation of probability measures. Technical Report 132, Department of Statistics, University of Washington, Seattle.
[56] Soofi E, Retzer J (2002) Information indices: Unification and applications. J. Econom. 107(1):17–40.
[57] Steiner SH, MacKay RJ (2001) Monitoring processes with data censored owing to competing risks by using exponentially weighted moving average control charts. J. Roy. Statist. Soc. Ser. C (Appl. Statist.) 50(3):293–302.
[58] van der Vaart AW, Wellner J (1996) Weak Convergence and Empirical Processes, Springer Series in Statistics, 1st ed. (Springer, New York).
[59] Verteramo Chiu L, Tauer L, Gröhn Y, Smith R (2020) Ranking disease control strategies with stochastic outcomes. Preventive Veterinary Medicine 176:104906.
[60] Patty CW (1993) Foundations of Topology (Waveland Press Inc., Prospect Heights).

Mathematics of Operations Research

Articles In Advance

Information

Received: June 17, 2024
Accepted: August 01, 2025
Published Online: September 15, 2025

Cite as

Rami Tabri (2025) The Information Projection in Moment Inequality Models: Existence, Dual Representation, and Approximation. Mathematics of Operations Research 0(0).

https://doi.org/10.1287/moor.2024.0568
