April 29, 2026 in 2026 INFORMS Analytics+ Conference Wrap Up

Amazing Things Come from Having Many Good Models

Cynthia Rudin’s Analytics+ keynote challenges long-held assumptions about accuracy, interpretability, and how models should be built.

https://doi.org/10.1287/LYTX.2026.02.02n

Cynthia Rudin, keynote speaker at 2026 INFORMS Analytics+ Conference

For decades, analytics has lived with an uncomfortable tradeoff. If you want state-of-the-art accuracy, you must accept complexity and opacity. If you want interpretability, you must settle for weaker performance. In her keynote at the 2026 INFORMS Analytics+ Conference, Cynthia Rudin argued that this tradeoff is not only unnecessary - it is often simply wrong.

Rudin, the Gilbert, Louis, and Edward Lehrman Distinguished Professor of Computer Science and Electrical and Computer Engineering at Duke University, made a case rooted in both theory and practice: in many real-world problems, there are many predictive models that perform equally well. When that happens, accuracy does not force complexity. Instead, it creates freedom - freedom to choose models that are simple, interpretable, aligned with domain knowledge, and usable in high-stakes decisions.

Rudin calls this phenomenon the “Rashomon Effect,” borrowing a classic insight from statistician Leo Breiman: data often support multiple, equally good explanations. Rudin’s keynote explored what happens when the field takes this insight seriously.

When Black Boxes Fail in Practice

Rudin opened her address with a concrete example from early work on the New York City power grid. Machine learning models were built to recommend which underground manholes should be repaired. Highly accurate black box models looked promising, but engineers could not make sense of the recommendations. The models were impossible to troubleshoot, and in practice, they failed.

The turning point came when black boxes were replaced with interpretable models. Engineers could examine them, understand why they made specific recommendations, and identify mistakes. Accuracy did not drop; it improved.

That experience crystallized a broader pattern Rudin had seen repeatedly: when multiple models perform similarly, the difference between success and failure is not predictive accuracy, but whether humans can work with the model.

Simple, Accurate Models

The Rashomon Effect refers to the existence of a large set of models, very different in form, that achieve nearly the same predictive performance. This effect has been observed for decades, but its consequences have been widely misunderstood.
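One common way to make "nearly the same predictive performance" precise is sketched below. The notation is an assumption of this write-up rather than something quoted from the keynote: a model class F, a loss L measured on the data, and a tolerance ε ≥ 0. The Rashomon set then collects every model in the class whose loss is within ε of the best loss the class can achieve.

```latex
\[
  \mathcal{R}(\epsilon)
  = \Bigl\{\, f \in \mathcal{F} \;:\;
      L(f) \le \min_{f' \in \mathcal{F}} L(f') + \epsilon \,\Bigr\}
\]
```

In this notation, "many good models" means that the set stays large even when ε is small.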

One major implication is the existence of simple yet accurate models. Rudin illustrated this with an example of a dataset used by FICO for credit scoring. In a public competition, participants were asked to predict which consumers would default on a loan based on their credit history. They were told to do this by building black box models and then explaining them, based on the assumption that interpretable models could not achieve comparable accuracy.

That assumption turned out to be false. Rudin and her collaborators showed this by producing sparse, intuitive models, including scoring systems that fit on a single slide, that matched black box performance. These models could be built quickly, understood immediately, and deployed responsibly.

Why was this result missed for so long? According to Rudin, earlier conclusions confused the limitations of algorithms with the limitations of model classes. Linear models and greedy tree methods were never optimized well enough to reveal how powerful interpretable models could be. With modern optimization and computing, that barrier has largely disappeared.

Why Noise Creates Simplicity

Rudin then addressed a deeper question: why do simple, accurate models exist in so many problems?

The answer lies in noise. In many domains, including criminal justice (recidivism prediction, for example), healthcare, and lending, the outcomes being predicted are inherently non‑deterministic. Even with identical inputs, different outcomes can occur. That uncertainty creates a large Rashomon set: many models perform similarly because no model can predict perfectly.

When outcomes are noisy:

  • Loss estimates become more variable
  • Complex models generalize poorly
  • Practitioners compensate by simplifying
  • The proportion of good models within simpler model classes increases

The result is a powerful and counterintuitive conclusion: noise induces simplicity. In these settings, optimizing for interpretability does not sacrifice accuracy, because accuracy was never uniquely tied to complexity in the first place.
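A small piece of arithmetic gives the flavor of why noise pulls models together. It is a simplified illustration, not the argument presented in the keynote, and it assumes binary labels that are flipped independently with a fixed probability (symmetric label noise). The two accuracy figures are hypothetical. Under that assumption, a model that is right on a fraction a of clean cases is expected to be right on a(1 - ρ) + (1 - a)ρ of the noisy ones, so any accuracy gap between two models shrinks by a factor of (1 - 2ρ).

```python
def noisy_accuracy(clean_accuracy: float, flip_rate: float) -> float:
    """Expected accuracy when binary labels are flipped with probability flip_rate.

    Assumption: flips are independent of the inputs and of the model's errors.
    """
    # The model scores a hit when it is right and the label is unchanged,
    # or when it is wrong and the label happens to be flipped.
    return clean_accuracy * (1 - flip_rate) + (1 - clean_accuracy) * flip_rate


# Hypothetical clean-data accuracies for a complex and a simple model.
complex_model, simple_model = 0.95, 0.85

gap_without_noise = abs(
    noisy_accuracy(complex_model, 0.0) - noisy_accuracy(simple_model, 0.0)
)
gap_with_noise = abs(
    noisy_accuracy(complex_model, 0.3) - noisy_accuracy(simple_model, 0.3)
)

print(round(gap_without_noise, 3), round(gap_with_noise, 3))
```

With a 30 percent flip rate, the ten-point gap between the two hypothetical models narrows to four points, so a tolerance that once excluded the simpler model would now admit it. The sketch leaves out the variance and generalization effects listed above.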

From One Model to Many

Having many good models creates another opportunity that traditional machine learning workflows are poorly equipped to exploit: most algorithms return a single model, even when countless alternatives exist. When that model violates domain knowledge or ethical constraints, practitioners are forced to tweak hyperparameters and rerun training, which is a slow, frustrating process.

Rudin argued that this is a fundamental flaw in the framework, not just the tools. Instead of asking algorithms to return one model, she proposed returning all sufficiently good models, allowing users to explore, compare, and filter them based on preferences such as fairness, monotonicity, or policy constraints.

Rudin’s group has begun building systems that make this possible. For decision trees, she and her collaborators enumerate entire sets of optimal and near-optimal trees and present them through interactive interfaces in which users can inspect structure, impose constraints, and select models that make sense in context. The bottleneck shifts from optimization to human-machine interaction - a far better problem to have.
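The workflow of returning every sufficiently good model and then filtering can be sketched on a toy scale. The example below is a brute-force stand-in, not the group's algorithm: it enumerates one-split decision stumps rather than full trees, and the dataset, the tolerance `epsilon`, and the "feature 0 is off limits" rule are all hypothetical.

```python
from typing import NamedTuple


class Stump(NamedTuple):
    feature: int      # index of the feature being tested
    threshold: float  # split point
    polarity: int     # +1: predict 1 above the threshold; -1: predict 1 at or below it

    def predict(self, row) -> int:
        above = row[self.feature] > self.threshold
        return int(above) if self.polarity == 1 else int(not above)


def error_rate(model, X, y) -> float:
    return sum(model.predict(r) != t for r, t in zip(X, y)) / len(y)


def enumerate_stumps(X):
    """Yield every stump whose threshold is a midpoint between observed values."""
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            for polarity in (1, -1):
                yield Stump(j, (lo + hi) / 2, polarity)


def good_models(X, y, epsilon):
    """Return (error, model) for every stump within epsilon of the best error."""
    scored = [(error_rate(m, X, y), m) for m in enumerate_stumps(X)]
    best = min(err for err, _ in scored)
    return [(err, m) for err, m in scored if err <= best + epsilon]


# Hypothetical toy data: two features, eight rows, binary outcome.
X = [(1, 1), (2, 2), (3, 3), (4, 6), (5, 4), (6, 5), (7, 7), (8, 8)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = good_models(X, y, epsilon=0.125)

# Hypothetical domain rule: feature 0 may not be used in the deployed model.
allowed = [(err, m) for err, m in candidates if m.feature != 0]

for err, m in allowed:
    print(err, m)
```

On this toy data the set of good models contains four stumps, and the constraint leaves one that uses only the permitted feature at the cost of a single extra error. No retraining is needed to find it, which is the point of returning the whole set.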

Real-world Implications

Perhaps the most striking part of the keynote was Rudin’s case study in medical imaging. A widely used black‑box model demonstrated strong performance in predicting breast cancer risk years in advance from mammograms, but it offered no insight into why a given patient was high risk.

Through interpretability, Rudin’s team discovered that the model relied heavily on left–right breast asymmetry, a subtle signal not previously recognized as a predictive biomarker. By replacing opaque components with explicit symmetry‑based reasoning, they produced a much simpler model that performed nearly as well, and, crucially, revealed where and why a risk signal existed.

Interpretability did more than explain a model; it enabled a new scientific insight, one that clinicians can understand, validate, and potentially act on.

Rudin closed with a clear message for institutions and policymakers: in domains with uncertain outcomes, there is no justification for defaulting to black‑box models when interpretable alternatives perform just as well. Accountability, trust, and responsibility demand transparency.

She also argued for a shift in education. If students only learn about deep networks and transformers, they are ill‑prepared to deploy models responsibly in real systems. Interpretable machine learning is no longer a niche topic - it is central to the future of analytics.

The takeaway from Rudin’s keynote was not incremental; it was foundational. Many of the field’s assumptions about accuracy, complexity, and tradeoffs stem from decades‑old computational constraints. When those constraints fall away, a different picture emerges, one in which accuracy and interpretability are often allies, not enemies.

When many good models exist, amazing things really do follow.
