January 13, 2023 in Analyze This!
Introducing My New Co-author, ChatGPT
https://doi.org/10.1287/LYTX.2023.01.07
In early November, I took my first trip to Austin, Texas, since 2007. Austin’s population has increased by more than 30% over the last 15 years, and this growth has clearly put pressure on the city’s infrastructure and housing market. (For years, many of my Austin friends have been telling me about rising property values and rental costs.) Appropriately, my first meeting after arriving in town was with Ben Alamar, an old friend who had recently relocated from the Bay Area.
Ben is a pioneer in the world of sports analytics. In particular, he is known for his work in developing advanced statistical models and algorithms to evaluate player performance and predict game outcomes in a variety of sports. He has also written extensively on the use of analytics in sports, including a recent book entitled “Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers” [1].
I invited Ben to be an early guest on my “Optimize This!” podcast. In particular, based on his previous work as a data scientist with the NBA’s Oklahoma City Thunder and Cleveland Cavaliers, I was interested in his perspective on how to get NBA players to make optimal data-driven decisions in real time. Over breakfast tacos, Ben brought me up to speed on the work of Second Spectrum [2], a leading provider of advanced analytics and tracking data for the NBA.
To capture data from NBA games, Second Spectrum installs a network of high-speed cameras in each arena. These cameras capture video of the game at a rate of up to 240 frames per second and use machine learning algorithms to track the movement of each player and the ball on the court. The cameras also capture data on other game-related events, such as rebounds, passes and shot attempts. Once the data has been captured, Second Spectrum’s analytics team processes and analyzes the data using a variety of statistical and machine learning techniques. They use this analysis to generate advanced metrics and insights that can help teams and coaches make more informed decisions about strategy, player performance and game planning. Very eye-opening.
My next meeting was with a software-as-a-service (SaaS) company with which my colleagues and I have been collaborating for the past two years. Our research has sought to answer a fundamental question: Is this company’s “customer health score” metric predictive of future customer outcomes?
Customer health scores are metrics that are developed and used to assess the overall health and well-being of customer relationships. Like most such health scores, this company’s metric is based on a combination of factors, including customer usage of the product, customer engagement with the company and the customer’s overall satisfaction with both. The purpose of customer health scores is to help SaaS companies with subscription-based business models to identify and address potential issues or challenges with their customer relationships before they become serious problems, thereby helping to improve customer retention and revenue growth.
Understanding the effectiveness of these types of metrics, however, is something of an analytic challenge. In particular, it requires careful tracking of both longitudinal customer health score inputs and downstream customer decisions (churn, retention and/or growth), followed by an examination of the statistical relationship between past periods’ customer health scores and subsequent customer outcomes. Most firms that rely on these types of customer health scores never do the hard work to explore this question. Our work with this Austin-based company, however, has demonstrated that their scores are indeed predictive of future customer outcomes, and it has identified which components of those scores are most strongly correlated with customer retention. Kudos to Dr. Ross Johnson, whose recently completed dissertation was based in part on this project.
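For readers who want to see what such a check might look like, here is a minimal sketch in Python. It is not the analysis from our project: the records are made up, the field layout is hypothetical, and the AUC statistic is simply one common way to ask whether a prior-period score separates customers who later churn from those who stay. A value near 0.5 would suggest that the score carries no signal, while a value near 1.0 would suggest that retained customers reliably had higher scores.

```python
from statistics import mean


def auc(scores_retained, scores_churned):
    """Share of (retained, churned) customer pairs in which the retained
    customer had the higher prior-period health score; ties count as half."""
    wins = 0.0
    for r in scores_retained:
        for c in scores_churned:
            if r > c:
                wins += 1.0
            elif r == c:
                wins += 0.5
    return wins / (len(scores_retained) * len(scores_churned))


# Hypothetical records, invented for illustration only:
# (customer id, health score in period t, churned in period t+1)
records = [
    ("a", 82, False),
    ("b", 75, False),
    ("c", 40, True),
    ("d", 66, False),
    ("e", 35, True),
    ("f", 58, True),
    ("g", 90, False),
    ("h", 52, False),
]

# Split the prior-period scores by the outcome observed one period later.
retained = [score for _, score, did_churn in records if not did_churn]
churned = [score for _, score, did_churn in records if did_churn]

score_auc = auc(retained, churned)
mean_gap = mean(retained) - mean(churned)
print(f"AUC = {score_auc:.2f}, mean score gap = {mean_gap:.1f}")
```

A real study would use many periods of data per customer and would repeat the comparison for each component of the score, but the basic logic of pairing an earlier score with a later outcome is the same.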
A few days after returning from Austin, I met with Jon Petersen and Holger Teichgraeber from the data science team at Archer (NYSE: ACHR). Archer is a young company that is simultaneously trying to design a new type of commuter aircraft and create a new urban transportation service. After hearing about the company earlier this year, I was eager to learn more about the types of questions that their data science team was helping the company address.
But Petersen and Teichgraeber first needed to educate me on electric vertical takeoff and landing (eVTOL) aircraft – vehicles that use electric propulsion to take off and land without the need for a runway. Such aircraft typically have a number of small rotors or propellers that allow them to hover and move vertically as well as horizontally. These aircraft are designed to be environmentally friendly, with low noise levels and zero emissions, and are being developed for a variety of applications, including air transportation in densely packed urban areas.
Petersen and Teichgraeber will surely be terrific guests on a future episode of “Optimize This!” because the type of modeling and analysis work they are engaged in at Archer will be of great interest to our audience. Our conversation about the many analytic problems associated with introducing a brand-new service (no historical data, few hard constraints) was fascinating.
However, I was far more jazzed about seeing Midnight [3], the company’s newly designed eVTOL commuter aircraft that was being shown publicly for the first time that very day. Over the previous few months, my eyesight had deteriorated so significantly that I was unable to drive. As such, my trip from Oakland to Palo Alto and back featured a bus, two different types of trains, a ferry ride and a couple of short stints as a passenger in other people’s cars. Thus, the promise of an eVTOL transportation service was particularly exciting to me (and continues to be, even as recent cataract surgery has enabled me to get back behind the wheel).
Speaking of disruptive new technologies, this is the first column that I have written since the much-celebrated release of ChatGPT on November 30, 2022. ChatGPT is fine-tuned from a model in OpenAI’s GPT-3.5 series, a successor to the popular language generation model called GPT-3. GPT-3 (short for “Generative Pre-trained Transformer 3”) is a language generation model developed by OpenAI that has received significant attention because of its impressive performance on a variety of language tasks. It is one of the largest and most powerful language models currently available, with a total of 175 billion parameters. GPT-3 has been pretrained on a massive data set of text, allowing it to understand the structure and patterns of language. Its large number of parameters allows it to generate coherent and human-like text when given a prompt. ChatGPT is specifically designed for generating human-like text and is able to generate responses that are appropriate for a given context. One of the main advantages of ChatGPT is its ability to understand context and use this understanding to generate relevant responses. In addition to generating text, ChatGPT is also capable of performing tasks such as translation, summarization and question-answering. This makes it a powerful tool that can assist users with a variety of tasks.
As you might have suspected by now, some parts of this column were generated by ChatGPT. In fairness, I did do some work organizing the content, editing computer-generated text and integrating it with content that I had created. However, with the help of ChatGPT, it took far less time to write this column than usual. But in what sense did I actually “write” this column? Is it fair for me to claim to be the “author” of this column?
The answers are not clear. Happy New Year!
References
Vijay Mehrotra is a professor in the Department of Business Analytics and Information Systems at the University of San Francisco’s School of Management and a longtime member of INFORMS.