Reference Aware Delexicalization (RAD) Framework: Theory Driven Artificial Intelligence Modeling for Domain Generalization

Sandeep Suntwal (Corresponding Author)
[email protected]
https://orcid.org/0000-0002-7746-7114
Information Systems Department, College of Business, University of Colorado, Colorado Springs, Colorado 80918

Susan A. Brown
[email protected]
https://orcid.org/0000-0002-9484-4428
Management Information Systems Department, Eller College of Management, University of Arizona, Tucson, Arizona 85721

Published Online: 5 Mar 2026
https://doi.org/10.1287/isre.2023.0457

Abstract

The ability to generalize is critical for machine learning and natural language processing (NLP) models to perform effectively across a wide range of domains. However, state-of-the-art neural models often struggle to maintain performance when tested out-of-domain (OOD), emphasizing their inability to generalize beyond the data distribution used for training. This challenge of oversensitivity to spurious biases in the training data remains an open research problem across various NLP task areas. Although prior work has introduced techniques to improve domain generalization capabilities of neural networks, existing methods are constrained in their ability to identify and mitigate biases that are both subtle and multidimensional within the training data. These biases can include unintended correlations between features or domain-specific terminology that may not transfer well to other contexts. In response to these limitations, we propose the novel reference aware delexicalization (RAD) data augmentation framework designed to improve generalization for inference-based NLP tasks. The RAD framework uses attention weighting to detect biases and extract bias-prone lexical concepts. Grounded in the theory of reference, RAD’s delexicalization method utilizes the principles of reference fixing and borrowing to generate context-aware placeholder mappings to reduce data oversensitivity. We conducted rigorous benchmark evaluations using RAD-augmented data across various transformer architectures (e.g., BERT, RoBERTa) on several natural language inference (NLI), recognizing textual entailment (RTE), and fact verification data sets. Our findings demonstrate consistent improvements in OOD performance, indicating RAD’s ability to improve model generalization for key NLI and RTE tasks. 
Because advancing fundamental NLI and RTE capabilities remains crucial for many downstream NLP applications, this work highlights RAD’s potential for positive impact across areas where inference and OOD robustness are highly valued. We also conducted additional qualitative analysis on large language models using RAD across various case studies, demonstrating RAD’s potential as a complementary framework for improving systematic reasoning.
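The delexicalization idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the bias-prone words have already been identified (the abstract attributes that step to attention weighting, which is not reproduced here), and the function name `delexicalize_pair` and the `CONCEPT_n` placeholder scheme are hypothetical choices made for illustration. The sketch shows only one property, namely that a word receives the same placeholder in both the premise and the hypothesis, so the relation between the two sentences is preserved while the surface lexical item is hidden.

```python
import re


def delexicalize_pair(premise, hypothesis, bias_prone):
    """Replace bias-prone words with placeholders shared across both sentences.

    `bias_prone` is assumed to be supplied by an upstream detection step;
    the CONCEPT_n naming is a hypothetical convention for this sketch.
    """
    targets = {word.lower() for word in bias_prone}
    mapping = {}  # lowercased word -> placeholder, shared by both sentences

    def replace(text):
        out = []
        # Split into word tokens and standalone punctuation marks.
        for token in re.findall(r"\w+|[^\w\s]", text):
            key = token.lower()
            if key in targets:
                # Reuse the placeholder if this word was already seen.
                if key not in mapping:
                    mapping[key] = f"CONCEPT_{len(mapping) + 1}"
                out.append(mapping[key])
            else:
                out.append(token)
        return " ".join(out)

    return replace(premise), replace(hypothesis), mapping


premise, hypothesis, mapping = delexicalize_pair(
    "Curie won the Nobel Prize in 1903",
    "Curie never won a prize",
    ["Curie", "Nobel"],
)
print(premise)
print(hypothesis)
print(mapping)
```

Because the mapping is built once per sentence pair, a model trained on such pairs sees that the same entity is discussed in both sentences without being able to rely on the specific name.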

History: Gautam Pant, Senior Editor; Heng Xu, Associate Editor.

Supplemental Material: The online appendices are available at https://doi.org/10.1287/isre.2023.0457.

Information Systems Research

Articles In Advance

Information

Received: August 01, 2023
Accepted: November 04, 2025
Published Online: March 05, 2026

Cite as

Sandeep Suntwal, Susan A. Brown (2026) Reference Aware Delexicalization (RAD) Framework: Theory Driven Artificial Intelligence Modeling for Domain Generalization. Information Systems Research 0(0).

https://doi.org/10.1287/isre.2023.0457

Acknowledgments

The authors thank the senior editor, associate editor, and the anonymous reviewers for their thoughtful guidance and insights.
