March 1, 2024 in Generative AI

Chasing Tomorrow: ChatGPT Regulations and Models in a Rapidly Changing Landscape

Patricia Neri

https://doi.org/10.1287/orms.2024.01.08

A quote from Bill Gates underscores how far-reaching the ramifications of ChatGPT may be:

“In my lifetime, I’ve seen two demonstrations of technology that struck me as revolutionary. The first time was in 1980, when I was introduced to a graphical user interface … The second big surprise came just last year … I watched in awe as they asked GPT, their AI model, 60 multiple-choice questions from the AP Bio exam – and it got 59 of them right … I knew I had just seen the most important advance in technology since the graphical user interface” (emphasis added). [1]

As most readers are aware, ChatGPT is a generative artificial intelligence (AI) model. One of my favorite articles on generative AI is the McKinsey report, “The Economic Potential of Generative AI” [2]. McKinsey estimated that generative AI could increase the baseline impact of all artificial intelligence by 15%-40%. According to the report, this estimate would roughly double if we include the impact of embedding generative AI into software that is currently used for other tasks beyond those use cases.

“By combining generative AI with all other technologies, work automation could add 0.2 to 3.3 percentage points annually to productivity growth. However, workers will need support in learning new skills, and some will change occupations” [2].

The benefits derived from using ChatGPT are enormous, but the risks are substantial. We must understand those risks and decide how to reduce them. ChatGPT also has limitations: It can provide incorrect answers (one needs to check its output) and biased answers (ChatGPT was trained on biased data). Can ChatGPT replace humans? Based on my 20+ years of professional experience, my answer is “no.”

Because many of the readers of this article are familiar with mathematical optimization (from operations research, or O.R.), I will mention my experience developing an AI/OR system at UPS. Through the years, I have had similar experiences developing other analytical systems, including deep learning systems.

In 2000, while working at UPS, my team and I developed a state-of-the-art system for the Teamsters schedulers to route and schedule loads all over the U.S. It took several weeks for the schedulers to route all the expected loads; however, as new loads were added to the system, their schedule would break. They wanted the O.R. system, and they also wanted to keep their jobs. As we collaborated and they compared their routes with the ones developed by the system, their requirements became clearer. Initially, they asked me for requirements A, B and C when what they really wanted was B, C, X and Z. Developing a system that the schedulers would use required understanding their concerns (not only translating their business requirements into mathematical language), working in partnership so they would own the system, and facilitating the change management process. Solving this problem required human creativity, empathy and collaborative work in cross-functional teams.

Generative AI models such as ChatGPT (developed by OpenAI) and Gemini (developed by Google) are developing rapidly, and their applications are multiplying quickly. Still, there is no single global regulation to reduce their risks and ensure their beneficial, safe application. Currently, the most comprehensive and binding framework is the European Union AI Act. [Because my personal experience is with ChatGPT, in the rest of this article, I will refer to generative AI models simply as ChatGPT. Please note, Google’s Gemini remains an extremely strong tool.]

One of the reasons ChatGPT has been so well received is that nontechnical users only need everyday language to benefit from using it. ChatGPT is a large language model (LLM), but not all LLMs are ChatGPT, and costs differ from one model to another.

In this brief article, I will discuss generative AI applications, risks and regulations as well as technical aspects of LLMs and finally how to implement ChatGPT in the real world.

ChatGPT Applications

ChatGPT can do some amazing things. It can summarize substantial amounts of text in any field, translate between multiple languages, analyze customer sentiment, and create music and art. It can develop computer code and create O.R. models with relative ease, which can be used as an initial draft or aid for final products.

ChatGPT generates ideas for recipes, vacations and work-related tasks. Marketers create new materials such as blogs, ads, videos and even customized advertising, which, again, they use as a starting point for their final versions. Programmers and software engineers use ChatGPT (e.g., GitHub Copilot) to generate code, which they then fix (remember ChatGPT can give wrong answers) and combine with their own code to develop applications faster than before.

ChatGPT helps team meetings run smoothly by taking the meeting notes (yes, one needs to review them and make sure notes are correct).

Examples of LLM generative AI-based systems include chatbots that provide answers to questions in data applications, such as extracting information from documents (medical, legal, chemical, etc.) and answering questions related to those documents in a systematic way. Chatbots can enhance the customer experience by understanding customers’ sentiments, resolving questions, providing suggestions on how to navigate websites and recommending products.

ChatGPT Risks and Regulations

In March 2023, Sam Altman, CEO of OpenAI, said that ChatGPT could be the “greatest technology humanity has yet developed” [3]. However, he warned that ChatGPT in the wrong hands can have real negative consequences for society (e.g., offensive cyberattacks) and can produce large-scale misinformation that is difficult to distinguish from genuine news.

Fake news is a significant immediate risk because “at least 64 countries (plus the European Union) – representing a combined population of about 49% of the people in the world – are meant to hold national elections …” in 2024 [4].

Teaching institutions, writers, artists and many other professionals are concerned about plagiarism, violation of copyright laws and losing their jobs because of ChatGPT. Teachers wonder how they can detect plagiarism in their students’ assignments but also recognize that ChatGPT is a new technology that their students must learn to use. In my opinion, professionals who know how to use ChatGPT will create better outputs, faster. There is a balance between controlling (i.e., regulating) these risks and using ChatGPT to enhance human tasks.

Also in March 2023, more than 1,000 AI researchers and technology leaders signed “Pause Giant AI Experiments: An Open Letter,” which stated that “Advanced AI could represent a profound change in the history of life on Earth. … Powerful AI systems should be developed only once we are confident that their effects will be positive, and their risks will be manageable” [5]. The letter asks AI labs to pause the training of AI systems more powerful than GPT-4 for at least 6 months. Nevertheless, there has not been a pause. Therefore, governments must step up and regulate the development and use of AI.

There is not yet a single international framework to regulate AI. The most comprehensive and binding framework is the European Union AI Act [6] drafted in February 2024. It classifies AI systems based on their risk and how powerful their AI models are. Systems are classified as Banned applications, High Risk, Limited Risk and Low Risk. However, there is currently no clear metric for determining how powerful a model is. An AI model must satisfy requirements that depend on its risk. Limited-risk systems, such as those that interact with humans (e.g., chatbots), have transparency obligations.

Table 1. Examples of Banned Applications and High-Risk Systems

Banned Applications	High-Risk Systems
Cognitive behavioral manipulation of people or specific vulnerable groups	AI systems that are used in products falling under the EU’s product safety legislation
Social scoring	AI systems that negatively affect safety or fundamental rights
Biometric identification and categorization of people	AI systems falling into specific areas (e.g., management and operation of critical infrastructure, education and vocational training, law enforcement, etc.)
Real-time and remote biometric identification systems, such as facial recognition	AI systems used to influence the outcome of elections and voter behavior

There are more stringent requirements for general-purpose AI systems.
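
The four tiers and the Table 1 examples can be read as a simple lookup. The following is a minimal sketch, not a legal classifier: the short use-case labels, the `classify` function and the `OBLIGATION_NOTES` wording are hypothetical names introduced for illustration, and the entries come only from Table 1 and the text above.

```python
from enum import Enum
from typing import Optional


class RiskTier(Enum):
    """Risk tiers as named in this article's summary of the EU AI Act."""
    BANNED = "Banned application"
    HIGH = "High risk"
    LIMITED = "Limited risk"
    LOW = "Low risk"


# Hypothetical short labels for examples taken from Table 1 and the text.
EXAMPLE_TIERS = {
    "social scoring": RiskTier.BANNED,
    "cognitive behavioral manipulation": RiskTier.BANNED,
    "real-time remote biometric identification": RiskTier.BANNED,
    "critical infrastructure operation": RiskTier.HIGH,
    "education and vocational training": RiskTier.HIGH,
    "election influence": RiskTier.HIGH,
    "customer service chatbot": RiskTier.LIMITED,
}

# Only what the article states; details of the other tiers are not given here.
OBLIGATION_NOTES = {
    RiskTier.BANNED: "not permitted",
    RiskTier.LIMITED: "transparency obligations",
}


def classify(use_case: str) -> Optional[RiskTier]:
    """Return the tier for a listed example, or None if it is not in the table."""
    return EXAMPLE_TIERS.get(use_case.strip().lower())


if __name__ == "__main__":
    for case in ("Social scoring", "customer service chatbot", "spam filter"):
        tier = classify(case)
        note = OBLIGATION_NOTES.get(tier, "requirements depend on risk") if tier else "not listed"
        print(f"{case}: {tier.value if tier else 'unknown'} ({note})")
```

Returning `None` for an unlisted use case, rather than defaulting to Low Risk, keeps the sketch from implying that anything missing from the table is safe.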

The EU AI Act allows innovation while still providing strict rules, enforcement and fines (ranging from 35 million euros or 7% of global turnover to 7.5 million euros or 1.5% of turnover). The Act requires that the public know when they are interacting with AI models (e.g., chatbots), and the public can ask how a model scored them (e.g., when applying for loans and credit). The public must know if news is fake or real.
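
To make the fine ranges concrete, here is a small worked calculation. It is a sketch under one stated assumption: that the larger of the fixed amount and the turnover percentage applies. The article lists both figures without saying which governs, and the company turnover figures are hypothetical.

```python
def fine_ceiling(fixed_eur: float, turnover_pct: float, global_turnover_eur: float) -> float:
    """Return the larger of a fixed amount and a percentage of global turnover.

    Assumption: "whichever is higher" applies; the article does not state this.
    """
    # Multiply before dividing so round percentages stay exact in floating point.
    return max(fixed_eur, turnover_pct * global_turnover_eur / 100)


if __name__ == "__main__":
    large = 2_000_000_000  # hypothetical company: 2 billion euros global turnover
    small = 100_000_000    # hypothetical company: 100 million euros global turnover

    print(fine_ceiling(35_000_000, 7, large))    # 140000000.0 (7% exceeds 35 million)
    print(fine_ceiling(7_500_000, 1.5, large))   # 30000000.0 (1.5% exceeds 7.5 million)
    print(fine_ceiling(35_000_000, 7, small))    # 35000000 (fixed amount exceeds 7%)
```

Under this assumption, the percentage term dominates for large companies, while the fixed amount dominates for smaller ones.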

Around the world, there are a few exceptions to AI-banned applications for law enforcement, military or defense uses.

Brief Review of LLMs

Simple machine learning models, such as regression, random forests, ensemble methods, etc., can solve many analytical problems. Deep learning models solve more complex problems such as natural language processing (NLP), vision recognition, language understanding (e.g., machine translation of text, speech recognition, like Siri), and language generation (generating descriptions from structured data, e.g., chatbots, text summaries). However, deep learning models require a great amount of data and computing power. If a problem can be solved using simpler models, then that would be the best approach.

Since 2017, there have been huge advances in deep learning LLMs. Two key articles were the GPT-2 paper “Language Models Are Unsupervised Multitask Learners” published by OpenAI [7] and “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding” published by Google [8].

Foundational LLMs are trained with vast amounts of GPU compute and text corpora; they have hundreds of millions to billions of parameters. However, once created, they can be customized using a small amount of data for a variety of tasks at a minimal cost. Moreover, there are open-source options that we can use at a minimal cost and achieve state-of-the-art performance (see https://huggingface.co/).

In November 2022, OpenAI released the first version of ChatGPT. Since then, more advanced versions have been released. These models are based on LLMs and are generative AI models. One major difference between nongenerative and generative AI models is that the latter do not require coding, just questions expressed in natural (everyday) language, and their range of applications is in many domains. These models mimic the input data (text, videos, images, etc.) and derive new content.

On December 6, 2023, Google introduced Gemini [9]. Its largest and most capable model for highly complex tasks is Gemini Ultra, which, according to Google, surpasses GPT-4’s state-of-the-art performance in most tasks in the general, reasoning and math areas.

Retrieval-Augmented Generation Architectures

ChatGPT and other LLM-based systems are usually implemented in retrieval-augmented generation (RAG) architectures.

At Underwriters Laboratories (UL) Solutions, we use a wide variety of analytical models: statistical (the simple A/B test), machine learning (random forests, XGBoost) and deep learning (NLP, computer vision and others), as well as ChatGPT in different architectures (RAGs and others), and we are getting amazing results.

Figure 1 illustrates a simple RAG architecture.

A good introduction to RAG architectures with many interesting applications can be found in “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks” [10].

The article “Advanced RAG Techniques: An Illustrated Overview” describes different RAG architectures and includes sources to vast documentation and tutorials [11]. This article also mentions that RAG architectures decrease hallucinations and offer more interpretability and control.

The brain of the RAG architecture is the LLM, which could be a ChatGPT-family model or another LLM, such as the open-source Llama 2 released by Meta, which supports integration in Hugging Face. Llama 2 is a family of pretrained and fine-tuned LLMs [12].
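
The retrieve-then-generate flow can be sketched in a few lines of plain Python. This is a toy illustration, not the architecture in Figure 1 or any production system: it swaps in bag-of-words cosine similarity where real RAG systems typically use embedding vectors and a vector store, and the function names (`retrieve`, `build_prompt`, `answer`, `echo_llm`) are hypothetical. The `llm` argument stands for whatever model is plugged in, such as a ChatGPT-family model or Llama 2.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: List[str], k: int = 2) -> List[str]:
    """Retrieval step: return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Augmentation step: place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer(question: str, documents: List[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Generation step: send the augmented prompt to whichever LLM is plugged in."""
    return llm(build_prompt(question, retrieve(question, documents, k)))


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only reports the prompt size."""
    return f"[model output for a prompt of {len(prompt)} characters]"


if __name__ == "__main__":
    docs = [
        "The EU AI Act classifies AI systems by risk.",
        "Llama 2 is a family of pretrained and fine-tuned LLMs.",
        "Schedulers route and schedule loads all over the U.S.",
    ]
    print(retrieve("What is Llama 2?", docs, k=1))
    print(answer("What is Llama 2?", docs, echo_llm, k=1))
```

Because the model only sees the passages the retriever selected, the answer can be traced back to specific source documents.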

ChatGPT is an innovative technology that will continue to greatly affect our lives. It has massive potential for improving our lives and substantial risks that we must avoid.

The power of ChatGPT is realized when it is used in software applications (not so much by just prompting). Currently, these software applications are based on RAG architectures.

References

[1] Bill Gates, “The Age of AI Has Begun,” Gates Notes, March 21, 2023.
[2] https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier
[3] https://abcnews.go.com/Technology/openai-ceo-sam-altman-ai-reshape-society-acknowledges/story?id=97897122
[4] https://time.com/6550920/world-elections-2024/
[5] https://futureoflife.org/open-letter/pause-giant-ai-experiments/
[6] https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence
[7] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever, 2019, “Language Models are Unsupervised Multitask Learners,” OpenAI.
[8] Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova, 2019, “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,” https://arxiv.org/abs/1810.04805.
[9] https://blog.google/technology/ai/google-gemini-ai/
[10] Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, et al., 2021, “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” https://arxiv.org/abs/2005.11401.
[11] Ivan Ilin, 2023, “Advanced RAG Techniques: An Illustrated Overview,” Towards AI, December 17, https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6.
[12] https://huggingface.co/docs/transformers/model_doc/llama2

Patricia Neri

Patricia Neri, Ph.D., PMP, is a principal decision scientist at Underwriters Laboratories (UL) Solutions. Her responsibilities include growing and leading a team of decision and data scientists to develop innovative applications based on AI in the areas of supply chain and sustainability, and providing leadership, project management skills, analytics and business judgment to support strategic and tactical activities.

In the past, she has led global analytical projects in several verticals working with cross-functional teams.

At INFORMS, she has served as chair (2017-2018) of the prestigious Daniel H. Wagner Prize and judged and coached international teams competing for the Franz Edelman Award. Patricia also served as the INFORMS Practice Section Board Officer (2014-2023) and president of the INFORMS DFW chapter (2009-2010).
