Inventing with Machines: Generative AI and the Evolving Landscape of IS Research

References

  • Abbasi A, Parsons J, Pant G, Sheng ORL, Sarker S (2024) Pathways for design research on artificial intelligence. Inform. Systems Res. 35(2):441–459.
  • Agent4Science (2025) Open conference of AI agents for science. Retrieved September 24, https://agents4science.org.
  • Baek J, Jauhar SK, Cucerzan S, Hwang SJ (2024) ResearchAgent: Iterative research idea generation over scientific literature with large language models. Preprint, submitted April 11, https://arxiv.org/abs/2404.07738.
  • Bhargava HK, Jenkin T, Kazaz B, Sarker S, Walls M (2025) Guidelines on the Use of AI/Gen AI: Recommendations for INFORMS Journals (INFORMS, Catonsville, MD).
  • Bozkurt A (2024) GenAI et al. Cocreation, authorship, ownership, academic ethics and integrity in a time of generative AI. Open Prax 16(1):1–10.
  • Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, et al. (2020) Language models are few-shot learners. Adv. Neural Inform. Processing Systems, vol. 33 (Curran Associates Inc., Red Hook, NY), 1877–1901.
  • Cemri M, Pan MZ, Yang S, Agrawal LA, Chopra B, Tiwari R, Keutzer K, et al. (2025) Why do multi-agent LLM systems fail? Preprint, submitted March 17, https://arxiv.org/abs/2503.13657.
  • Chen Y, Benton J, Radhakrishnan A, Uesato J, Denison C, Schulman J, Somani A, et al. (2025) Reasoning models don’t always say what they think. Preprint, submitted May 8, https://arxiv.org/abs/2505.05410.
  • Du Y, Li S, Torralba A, Tenenbaum JB, Mordatch I (2023) Improving factuality and reasoning in language models through multiagent debate. Preprint, submitted May 23, https://arxiv.org/abs/2305.14325.
  • Edge D, Trinh H, Cheng N, Bradley J, Chao A, Mody A, Truitt S, et al. (2024) From local to global: A graph RAG approach to query-focused summarization. Preprint, submitted April 24, https://arxiv.org/abs/2404.16130.
  • Gao L, Madaan A, Zhou S, Alon U, Liu P, Yang Y, Callan J, et al. (2023) Pal: Program-aided language models. Proc. Internat. Conf. Machine Learn. (PMLR, New York), 10764–10799.
  • Guo T, Chen X, Wang Y, Chang R, Pei S, Chawla NV, Wiest O, et al. (2024) Large language model based multi-agents: A survey of progress and challenges. Preprint, submitted January 21, https://arxiv.org/abs/2402.01680.
  • Hong S, Zhuge M, Chen J, Zheng X, Cheng Y, Wang J, Zhang C, et al. (2023) MetaGPT: Meta programming for a multi-agent collaborative framework. Preprint, submitted August 1, https://arxiv.org/abs/2308.00352.
  • Hu EJ, Shen Y, Wallis P, Allen-Zhu Z, Li Y, Wang S, Wang L, et al. (2022) Lora: Low-rank adaptation of large language models. Proc. Internat. Conf. Learn. Representation 1(2):3.
  • Izacard G, Grave E (2020) Leveraging passage retrieval with generative models for open domain question answering. Preprint, submitted July 2, https://arxiv.org/abs/2007.01282.
  • Ji Z, Lee N, Frieske R, Yu T, Su D, Xu Y, Ishii E, et al. (2023) Survey of hallucination in natural language generation. ACM Comput. Survey 55(12):1–38.
  • Karpukhin V, Oguz B, Min S, Lewis P, Wu L, Edunov S, Chen D, et al. (2020) Dense passage retrieval for open-domain question answering. Proc. Conf. Empirical Methods Natural Language Processing (Association for Computational Linguistics, Stroudsburg, PA), 6769–6781.
  • Larsen KR, Mueller RM, Bonaretti D, Fischer-Preßler D, Burleson J, Singh N, Parsons J, et al. (2025) The ITEM ontology: A tool to elucidate the anatomy of psychometric indicators. Inform. Systems Res., ePub ahead of print August 13, https://doi.org/10.1287/isre.2023.0257.
  • Lin Y, Tang S, Lyu B, Wu J, Lin H, Yang K, Li J, et al. (2025) Goedel-prover: A frontier model for open-source automated theorem proving. Preprint, submitted February 11, https://arxiv.org/abs/2502.07640.
  • Matton K, Ness RO, Guttag J, Kıcıman E (2025) Walk the talk? Measuring the faithfulness of large language model explanations. Preprint, submitted April 19, https://arxiv.org/abs/2504.14150.
  • Maynez J, Narayan S, Bohnet B, McDonald R (2020) On faithfulness and factuality in abstractive summarization. Proc. 58th Annual Meeting Assoc. Comput. Linguist. (Association for Computational Linguistics, Stroudsburg, PA).
  • Mitchell M, Krakauer DC (2023) The debate over understanding in AI’s large language models. Proc. Natl. Acad. Sci. USA 120(13):e2215907120.
  • Naddaf M (2025) AI is transforming peer review—And many scientists are worried. Nature 639(8056):852–854.
  • Nakano R, Hilton J, Balaji S, Wu J, Ouyang L, Kim C, Hesse C, et al. (2021) WebGPT: Browser-assisted question-answering with human feedback. Preprint, submitted December 17, https://arxiv.org/abs/2112.09332.
  • Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, Zhang C, et al. (2022) Training language models to follow instructions with human feedback. Adv. Neural Inform. Processing Systems (Curran Associates Inc., Red Hook, NY), 27730–27744.
  • Park JS, O’Brien J, Cai CJ, Morris MR, Liang P, Bernstein MS (2023) Generative agents: Interactive simulacra of human behavior. Proc. 36th Annual ACM Sympos. User Interface Software Tech (UIST '23) (Association for Computing Machinery (ACM), New York), 1–22.
  • Peter S, Riemer K, West JD (2025) The benefits and dangers of anthropomorphic conversational agents. Proc. Natl. Acad. Sci. USA 122(22):e2415898122.
  • Rafailov R, Sharma A, Mitchell E, Manning CD, Ermon S, Finn C (2023) Direct preference optimization: Your language model is secretly a reward model. Preprint, submitted May 29, https://arxiv.org/abs/2305.18290.
  • Riemer K, Peter S (2024) Conceptualizing generative AI as style engines: Application archetypes and implications. Internat. J. Inform. Management 79(C):102824.
  • Sarker S, Chatterjee S, Xiao X, Elbanna A (2019) The sociotechnical axis of cohesion for the IS discipline: Its historical legacy and its continued relevance. MIS Quart. 43(3):695–720.
  • Schick T, Dwivedi-Yu J, Dessì R, Raileanu R, Lomeli M, Hambro E, Zettlemoyer L, et al. (2023) Toolformer: Language models can teach themselves to use tools. Adv. Neural Inform. Processing Systems, vol. 36 (Curran Associates Inc., Red Hook, NY), 68539–68551.
  • Shinn N, Cassano F, Gopinath A, Narasimhan K, Yao S (2023) Reflexion: Language agents with verbal reinforcement learning. Preprint, submitted March 20, https://arxiv.org/abs/2303.11366.
  • Shuster K, Xu J, Komeili M, Ju D, Smith EM, Roller S, Ung M, et al. (2022) Blenderbot 3: A deployed conversational agent that continually learns to responsibly engage. Preprint, submitted August 5, https://arxiv.org/abs/2208.03188.
  • Strachan JW, Albergo D, Borghini G, Pansardi O, Scaliti E, Gupta S, Saxena K, et al. (2024) Testing theory of mind in large language models and humans. Nature Human Behav. 8(7):1285–1295.
  • Susarla A, Gopal R, Thatcher JB, Sarker S (2023) The Janus effect of generative AI: Charting the path for responsible conduct of scholarly activities in information systems. Inform. Systems Res. 34(2):399–408.
  • Thoppilan R, De Freitas D, Hall J, Shazeer N, Kulshreshtha A, Cheng HT, Jin A, et al. (2022) Lamda: Language models for dialog applications. Preprint, submitted January 20, https://arxiv.org/abs/2201.08239.
  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, et al. (2017) Attention is all you need. Preprint, submitted June 12, https://arxiv.org/abs/1706.03762.
  • Walters WH, Wilder EI (2023) Fabrication and errors in the bibliographic citations generated by ChatGPT. Sci. Rep. 13(1):14045.
  • Wang G, Xie Y, Jiang Y, Mandlekar A, Xiao C, Zhu Y, Fan L, et al. (2023) Voyager: An open-ended embodied agent with large language models. Preprint, submitted May 25, https://arxiv.org/abs/2305.16291.
  • Wei J, Bosma M, Zhao VY, Guu K, Yu AW, Lester B, Du N, et al. (2022a) Finetuned language models are zero-shot learners. Internat. Conf. Learn. Representation (ICLR 2022) (OpenReview).
  • Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi E, Le QV, et al. (2022b) Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inform. Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY), 24824–24837.
  • Wei J, Yang Y, Zhang X, Chen Y, Zhuang X, Gao Z, Zhou D, et al. (2025) From AI for science to agentic science: A survey on autonomous scientific discovery. Preprint, submitted August 18, https://arxiv.org/abs/2508.14111.
  • Wu J, Zhu J, Liu Y, Xu M, Jin Y (2025) Agentic reasoning: A streamlined framework for enhancing LLM reasoning with agentic tools. Proc. 63rd Annual Meeting Assoc. Comput. Linguist. (Association for Computational Linguistics, Stroudsburg, PA).
  • Wu Q, Bansal G, Zhang J, Wu Y, Li B, Zhu E, Jiang L, et al. (2024) Autogen: Enabling next-gen LLM applications via multi-agent conversations. First Conf. Language Modeling (COLM 2024) (OpenReview).
  • Xu Z, Jain S, Kankanhalli M (2024) Hallucination is inevitable: An innate limitation of large language models. Preprint, submitted January 22, https://arxiv.org/abs/2401.11817.
  • Yang J, Jimenez CE, Wettig A, Lieret K, Yao S, Narasimhan K, Press O (2024) SWE-agent: Agent-computer interfaces enable automated software engineering. Adv. Neural Inform. Processing Systems, vol. 37 (Curran Associates Inc., Red Hook, NY), 50528–50652.
  • Yao S, Chen H, Yang J, Narasimhan K (2022) Webshop: Towards scalable real-world web interaction with grounded language agents. Adv. Neural Inform. Processing Systems, vol. 35 (Curran Associates Inc., Red Hook, NY), 20744–20757.
  • Yao S, Zhao J, Yu D, Du N, Shafran I, Narasimhan K, Cao Y (2023) React: Synergizing reasoning and acting in language models. 11th Internat. Conf. Learn. Representation (ICLR 2023) (OpenReview).
  • Zhang P, Hu X, Huang G, Qi Y, Zhang H, Li X, Song J, et al. (2025) A next-generation open access ecosystem for scientific discovery generated by AI scientists. Preprint, submitted August 20, https://arxiv.org/abs/2508.15126.
  • Zhu K, Du H, Hong Z, Yang X, Guo S, Wang Z, Wang Z, et al. (2025) MultiAgentBench: Evaluating the collaboration and competition of LLM agents. Preprint, submitted March 3, https://arxiv.org/abs/2503.01935.
  • Zhuang Z, Chen J, Xu H, Jiang Y, Lin J (2025) Large language models for automated scholarly paper review: A survey. Preprint, submitted January 17, https://arxiv.org/abs/2501.10326.
  • Ziegler DM, Stiennon N, Wu J, Brown TB, Radford A, Amodei D, Christiano P, et al. (2019) Fine-tuning language models from human preferences. Preprint, submitted September 18, https://arxiv.org/abs/1909.08593.