Mitigating Age-Related Bias in Large Language Models: Strategies for Responsible Artificial Intelligence Development

Zhuang Liu
https://orcid.org/0000-0002-4695-6345
School of Fintech, Dongbei University of Finance and Economics, Dalian 116025, China

Shiyao Qian
https://orcid.org/0009-0004-2876-1343
Department of Computer Science, University of Toronto, Toronto, Ontario M5S 1A1, Canada

Shuirong Cao
https://orcid.org/0009-0003-0857-0630
School of Computer Science, Nanjing University, Nanjing 210023, China

Tianyu Shi (Corresponding Author)
https://orcid.org/0009-0001-9119-778X
Department of Computer Science, University of Toronto, Toronto, Ontario M5S 1A1, Canada


Published Online: 21 May 2025
https://doi.org/10.1287/ijoc.2024.0645


INFORMS Journal on Computing

Articles In Advance

Information

Received: February 24, 2024
Accepted: April 23, 2025
Published Online: May 21, 2025

Cite as

Zhuang Liu, Shiyao Qian, Shuirong Cao, Tianyu Shi (2025) Mitigating Age-Related Bias in Large Language Models: Strategies for Responsible Artificial Intelligence Development. INFORMS Journal on Computing 0(0).

https://doi.org/10.1287/ijoc.2024.0645

Acknowledgments

The authors appreciate the editors’ and anonymous reviewers’ valuable comments.
