Distributionally Robust Losses for Latent Covariate Mixtures
Published Online: 2 Sep 2022 https://doi.org/10.1287/opre.2022.2363
References
- (2016) FairML: Toolbox for diagnosing bias in predictive modeling. Unpublished master’s thesis, Massachusetts Institute of Technology, Cambridge, MA.
- (2009) A study on similarity and relatedness using distributional and wordnet-based approaches. Proc. North Amer. Chapter Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA).
- (2016) Deep speech 2: End-to-end speech recognition in English and Mandarin. Proc. 33rd Internat. Conf. Machine Learn. (ACM, New York), 173–182.
- (2007) UCI machine learning repository. http://www.ics.uci.edu/mlearn/MLRepository.html.
- (2016) Big data’s disparate impact. California Law Rev. 104(3):671–732.
- (2007) Analysis of representations for domain adaptation. Adv. Neural Inform. Processing Systems 20:137–144.
- (2009) Robust Optimization (Princeton University Press, Princeton, NJ).
- (2013) Robust solutions of optimization problems affected by uncertain probabilities. Management Sci. 59(2):341–357.
- (2004) Reproducing Kernel Hilbert Spaces in Probability and Statistics (Kluwer Academic Publishers, Amsterdam).
- (2018) Data-driven robust optimization. Math. Programming Ser. A 167(2):235–292.
- (2007) Discriminative learning for differing training and test distributions. Proc. 24th Internat. Conf. Machine Learn. (ACM, New York).
- (2019) Robust Wasserstein profile inference and applications to machine learning. J. Appl. Probab. 56(3):830–857.
- (2017) Data-driven optimal transport cost selection for distributionally robust optimization. Preprint, submitted May 19, https://arxiv.org/abs/1705.07152.
- (2016) Demographic dialectal variation in social media: A case study of African-American English. Proc. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 1119–1130.
- (2016) Magging: Maximin aggregation for inhomogeneous large-scale data. Proc. IEEE 104(1):126–135.
- (2017) SemEval-2017 task 1: Semantic textual similarity multilingual and cross-lingual focused evaluation. Proc. 10th Internat. Workshop Semantic Evaluation.
- (2019) Fairness under unawareness: Assessing disparity when protected class is unobserved. Proc. Conf. Fairness Accountability Transparency (ACM, New York), 339–348.
- (2022) How many labelers do you have? A closer look at gold-standard labels. Preprint, submitted June 24, https://arxiv.org/abs/2206.12041.
- (2017) A study of bias in recidivism prediction instruments. Big Data 5(2):153–163.
- Consumer Financial Protection Bureau (2014) Using publicly available information to proxy for unidentified race and ethnicity: A methodology and assessment. https://www.consumerfinance.gov/data-research/research-reports/usingpublicly-available-information-to-proxy-for-unidentified-race-and-ethnicity/.
- (2009) Modeling wine preferences by data mining from physicochemical properties. Decision Support Systems 47(4):547–553.
- (2004) Kernel Methods for Pattern Analysis (Cambridge University Press).
- (2021) Learning models with uniform performance via distributionally robust optimization. Ann. Statist. 49(3):1378–1406.
- (2021) Statistics of robust optimization: A generalized empirical likelihood approach. Math. Oper. Res. 46(3):946–969.
- (2012) Fairness through awareness. Innovations Theoretical Comput. Sci., 214–226.
- (2018) Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Programming Ser. A 171(1–2):115–166.
- (2015) On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Related Fields 162(3–4):707–738.
- (2016) Distributionally robust stochastic optimization with Wasserstein distance. Preprint, submitted April 8, https://arxiv.org/abs/1604.02199.
- (2017) Wasserstein distributional robustness and regularization in statistical learning. Preprint, submitted December 17, https://arxiv.org/abs/1712.06050.
- (2016) Domain adaptation with conditional transferable components. Proc. 33rd Internat. Conf. Machine Learn. (ACM, New York), 2839–2848.
- (2009) Covariate shift by kernel mean matching. Quiñonero-Candela J, Sugiyama M, Schwaighofer A, Lawrence ND, eds. Dataset Shift in Machine Learning (MIT Press, Cambridge, MA), 131–160.
- (2010) Report on the evaluation of 2D still-image face recognition algorithms. NIST Interagency/Internal Report 7709, National Institute of Standards and Technology, Gaithersburg, MD.
- (2016) Equality of opportunity in supervised learning. Adv. Neural Inform. Processing Systems 29.
- (2018) Fairness without demographics in repeated loss minimization. Proc. 35th Internat. Conf. Machine Learn. (ACM, New York).
- (2017) Calibration for the (computationally identifiable) masses. Preprint, submitted November 22, https://arxiv.org/abs/1711.08513.
- (2017) Grouping-by-ID: Guarding against adversarial domain shifts.
- (2015) Tagging performance correlates with author age. Proc. 53rd Annual Meeting Assoc. Comput. Linguistics (Short Papers) (Association for Computational Linguistics, Stroudsburg, PA), vol. 2, 483–488.
- (2018) Does distributionally robust supervised learning give robust classifiers? Proc. 35th Internat. Conf. Machine Learn. (ACM, New York).
- (2007) Correcting sample selection bias by unlabeled data. Adv. Neural Inform. Processing Systems 20:601–608.
- (2015) Causal Inference for Statistics, Social, and Biomedical Sciences (Cambridge University Press, New York).
- (2018) Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. Preprint, submitted November 14, 2017, https://arxiv.org/abs/1711.05144.
- (2017) Avoiding discrimination through causal reasoning. Adv. Neural Inform. Processing Systems 30:656–666.
- (2019) Wasserstein distributionally robust optimization: Theory and applications in machine learning. Operations Research & Management Science in the Age of Analytics (INFORMS), 130–166.
- (2019) Combating conservativeness in data-driven optimization under uncertainty: A solution path approach. Preprint, submitted September 13, https://arxiv.org/abs/1909.06477.
- (2015) Quantifying input uncertainty in stochastic optimization. Proc. 2015 Winter Simulation Conf. (IEEE, Piscataway, NJ).
- (2017) Minimax statistical learning and domain adaptation with Wasserstein distances. Preprint, submitted May 22, https://arxiv.org/abs/1705.07815.
- (2014) Robust classification under sample selection bias. Adv. Neural Inform. Processing Systems 27:37–45.
- (2017) Robust covariate shift prediction with general losses and feature views. Preprint, submitted December 28, https://arxiv.org/abs/1712.10043.
- (1994) Building a large annotated corpus of English: The Penn Treebank. Comput. Linguistics 19(2):313–330.
- (2015) Maximin effects in inhomogeneous large-scale data. Ann. Statist. 43(4):1801–1830.
- (2015) Distributional smoothing with virtual adversarial training. Preprint, submitted July 2, https://arxiv.org/abs/1507.00677.
- (2017) Variance regularization with convex objectives. Adv. Neural Inform. Processing Systems 30:2975–2984.
- (2014) GloVe: Global vectors for word representation. Proc. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA).
- (2016) Causal inference by using invariant prediction: Identification and confidence intervals. J. Roy. Statist. Soc. B 78(5):947–1012.
- (2018) Know what you don’t know: Unanswerable questions for SQuAD. Proc. Annual Meeting Assoc. Comput. Linguistics (Association for Computational Linguistics, Stroudsburg, PA).
- (2000) Optimization of conditional value-at-risk. J. Risk 2(3):21–42.
- (2016) Confidence intervals for maximin effects in inhomogeneous large-scale data. Statistical Analysis for High-Dimensional Data (Springer), 255–277.
- (2018) Anchor regression: Heterogeneous data meets causality. Preprint, submitted January 18, https://arxiv.org/abs/1801.06229.
- (2017) Academic performance prediction in a gender-imbalanced environment. Proc. 11th ACM Conf. Recommender Systems (ACM, New York), vol. 1, 48–51.
- (2015) Distributionally robust logistic regression. Adv. Neural Inform. Processing Systems 28:1576–1584.
- (2017) No classification without representation: Assessing geodiversity issues in open data sets for the developing world. Preprint, submitted November 22, https://arxiv.org/abs/1711.08536.
- (2017) Distributionally robust stochastic programming. SIAM J. Optim. 27(4):2258–2275.
- (2009) Lectures on Stochastic Programming: Modeling and Theory (SIAM and Mathematical Programming Society).
- (2000) Improving predictive inference under covariate shift by weighting the log-likelihood function. J. Statist. Planning Inference 90(2):227–244.
- (2018) Certifying some distributional robustness with principled adversarial training. Proc. Sixth Internat. Conf. Learn. Representations.
- (2019) Distributionally robust optimization and generalization in kernel methods. Adv. Neural Inform. Processing Systems 32:9134–9144.
- (2006) Mixture regression for covariate shift. Adv. Neural Inform. Processing Systems 19:1337–1344.
- (2007) Covariate shift adaptation by importance weighted cross validation. J. Machine Learn. Res. 8(35):985–1005.
- (2017) Gender and dialect bias in YouTube’s automatic captions. Proc. First Workshop Ethics Natural Language Processing, vol. 1 (Association for Computational Linguistics, Stroudsburg, PA), 53–59.
- (2014) Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inform. Theory 60(7):3797–3820.
- (2014) Robust learning under uncertain test distributions: Relating covariate shift to model misspecification. Proc. 31st Internat. Conf. Machine Learn. (ACM, New York), 631–639.

