July 19, 2023 in Machine Learning

To Avoid Wasting Money on Artificial Intelligence, Business Leaders Need More AI Acumen

https://doi.org/10.1287/LYTX.2023.03.10

Key Takeaway: Business leaders need more machine learning acumen to properly translate business objectives into the requirements for a predictive model. Without this, pitfalls abound. In this article, we show why one client’s deployed lead-scoring model is potentially worthless for improving sales team efforts and how the difference between targeting marketing and targeting sales is rarely understood.

With the current hype around artificial intelligence (AI), there is no shortage of enthusiasm for using predictive modeling to optimize business processes. However, subtle details in how the problem is specified can make the difference between a model that contributes real value and one that is ineffective.

Consider Bob, the name we’ll use for our client who manages a sales team for a successful midsize B2B software company. To make more effective use of his team, Bob wanted to flag the leads most worth pursuing. In an attempt to oblige him, one of this article’s authors arranged for a student, Anne, to take on Bob as a client for her work/study project. The goal of the project was to prototype a model with the intention of justifying further investment.

Bob gave Anne his sales data. It conveyed valuable experience from which to learn: previous leads that had converted into sales and others that had not. Anne followed the usual procedure of applying a machine learning method – in this case, a gradient boosted tree – to generate a predictive model from the data. This was a lead-scoring model, capable of calculating the probability that a lead would ultimately convert into a paying customer.

When Anne explained the model’s high performance to Bob, he was thrilled. It predicted well. One metric that conveyed this high performance was lift, which came in at five: Of the 10% of leads flagged by the model as most likely to convert, 25% would do so within the next three months, five times the overall conversion rate of 5%. To say the model has a lift of five is to say that it identifies a group that is five times more likely to exhibit a positive outcome. Taken at face value, this means that if a salesperson focused only on members of this top group, they would land five times as many sales as they would if they instead spread their attention evenly across all leads.
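The lift arithmetic above can be checked with a short sketch. The rates (25% in the flagged group, 5% overall) and the top-10% cutoff come from the article; the total of 10,000 leads is a hypothetical figure chosen only to make the counts concrete, and the function name is illustrative.

```python
def lift_from_counts(top_sales: int, top_leads: int,
                     total_sales: int, total_leads: int) -> float:
    """Lift = conversion rate in the flagged group / overall conversion rate."""
    if top_leads <= 0 or total_leads <= 0 or total_sales <= 0:
        raise ValueError("counts must be positive")
    top_rate = top_sales / top_leads
    overall_rate = total_sales / total_leads
    return top_rate / overall_rate


# Hypothetical list size; the percentages are the ones quoted in the article.
total_leads = 10_000
total_sales = 500      # 5% overall conversion rate
top_leads = 1_000      # the 10% of leads the model scores highest
top_sales = 250        # 25% of the flagged leads convert

model_lift = lift_from_counts(top_sales, top_leads, total_sales, total_leads)
share_captured = top_sales / total_sales

print(f"lift: {model_lift:.1f}")
print(f"share of all conversions in the flagged group: {share_captured:.0%}")
```

Under these assumed counts, the flagged tenth of the list accounts for half of all conversions, which is another way of reading a lift of five at a 10% cutoff.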

Five times the sales for the same salesperson effort is a pretty attractive proposition. Unfortunately, this model was very unlikely to deliver on that lofty promise. The approach we’ve described follows a common fallacy – a fundamental misconception about the data, not an arcane technical detail embedded within the way that machine learning was applied.

To understand the error, let’s start with another closely related application of machine learning: targeting marketing with response modeling. Similar to a lead-scoring model, a response model predicts whether a prospect will buy, to decide whether to apply limited resources. But rather than salesperson time, the resource in question is marketing contact, such as the cost to print and mail a direct marketing brochure. By targeting a marketing campaign to contact only those who are more likely to buy, response modeling improves the marketing response rate and, therefore, the profit of a marketing campaign.

As similar as these two endeavors may sound, there’s a fundamental difference in what the model is actually predicting.

Prediction goal for lead-scoring: Will the individual buy, given how the sales team has previously interacted with them?

Prediction goal for response modeling: Will the individual buy if contacted?

Notice the difference? A response model is trained over data that pertains to customers who were marketed to with a previous campaign (or a test campaign). Everyone on the list – aka the training data – was contacted in the same manner, and then, after a bit of time, it became known which individuals made a purchase and which did not. This provides both positive and negative examples in the data from which to learn; the predictive model is the thing that gets “learned.” And, across the entire list, each individual has been treated uniformly.

But the data used to train a lead-scoring model doesn’t usually have this consistency. Salespeople are, for the most part, free agents. They treat each lead as they see fit, depending on which ones seem to hold the most potential. Some leads may never receive a phone call, whereas others may receive the VIP treatment, including repeated phone calls and various amounts of wining and dining.

Because the data doesn’t track sales that come after one consistent treatment, the model trained over that data doesn’t predict sales from any particular treatment. But with response modeling (for which the data tracks purchases in light of one specific marketing treatment), the model predicts purchases that result from – or at least occur in the aftermath of – that particular treatment.

The response model helps improve marketing, but the lead-scoring model won’t necessarily improve sales activity. For example, if a response model outputs a high score for an individual – say, a probability of 25%, where the overall average is only 5% – that tells a marketer that the individual is more worthwhile to contact than average and that it would be wise to include the individual in the marketing campaign. But if a lead-scoring model outputs the same thing, it doesn’t necessarily help the sales team improve their performance: They may already be doing exactly what the model is suggesting.

To illustrate this potential worthlessness, consider a case in which there’s general agreement among the sales team that the most promising leads are midsize high-tech consumer retailers with at least 80% of their sales at brick-and-mortar locations rather than online. This kind of assumption may well be a self-fulfilling prophecy. Because the team spends most of their efforts on just that kind of customer, that is who winds up making most of the purchases. In this case, a lead-scoring model may only capture what the team is already doing. The model’s scores won’t alter the team’s behavior at all, let alone improve it. The model outputs high scores for leads that will convert – but they’re destined to convert because they’re the kind of leads who are already being actively sold to, even without the model.
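A toy simulation can make the self-fulfilling prophecy concrete. It is a sketch under deliberately extreme assumptions, none of which come from Bob’s data: 10% of leads fit the team’s favored profile, the team works only those leads, and conversion depends solely on whether a lead was worked (25% if worked, 3% if not), not on the profile itself. All names and rates here are hypothetical.

```python
import random

# Hypothetical rates: conversion depends only on sales effort, not on the profile.
P_CONVERT_IF_WORKED = 0.25
P_CONVERT_IF_IGNORED = 0.03
P_FITS_PROFILE = 0.10


def simulate_history(n_leads: int = 100_000, seed: int = 0) -> dict:
    """Simulate past leads where the team works only leads that fit its profile."""
    rng = random.Random(seed)
    counts = {True: {"leads": 0, "sales": 0}, False: {"leads": 0, "sales": 0}}
    for _ in range(n_leads):
        fits_profile = rng.random() < P_FITS_PROFILE
        worked = fits_profile  # assumption: effort follows the team's belief
        p_convert = P_CONVERT_IF_WORKED if worked else P_CONVERT_IF_IGNORED
        counts[fits_profile]["leads"] += 1
        counts[fits_profile]["sales"] += rng.random() < p_convert
    return counts


history = simulate_history()
profile_rate = history[True]["sales"] / history[True]["leads"]
overall_rate = (history[True]["sales"] + history[False]["sales"]) / (
    history[True]["leads"] + history[False]["leads"]
)
observed_lift = profile_rate / overall_rate

print(f"conversion rate, on-profile leads: {profile_rate:.3f}")
print(f"conversion rate, all leads:        {overall_rate:.3f}")
print(f"lift of a profile-based score:     {observed_lift:.2f}")
```

In this setup, scoring leads by profile yields a lift near five on the historical data, yet by construction an off-profile lead would convert at the same 25% rate if it were worked. The lift reflects where the team spent its effort, not which leads are intrinsically better.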

Sales vs. Marketing

Only by testing broader sales activity could you discover new sales opportunities. The purpose of a model is to improve sales by identifying new opportunities not already being pursued. But the data available is too limited for such a discovery to be made. If an as-yet-unknown segment of customers has high potential, the only way to find out would be to try selling to them. After tracking such trials, the data would potentially show the value of the customer segment.

But a sales force isn’t inclined to serve the needs of a well-planned experiment. Marketing projects, on the other hand, have much more potential to cast a wide net and discover new, hot pockets of prospects. After all, they call it mass marketing for a reason. It would be a very different thing – and understandably rare, if not impossible – to command a sales team to experimentally spread their efforts across a wide swath of prospects. Although there are ways to control for this differential treatment without running an experiment, doing so requires both intentionality about what data you collect and a significant degree of sophistication in advanced modeling techniques.

The best practices for targeting marketing don’t translate so easily to targeting sales. As interrelated as they are, sales and marketing are two very different animals.

Anne’s project demonstrated that you could predict which leads were likely to convert, but more work needed to be done to understand how to use the model. Given that the project was intended as a proof of concept, we were surprised by Bob’s response when she handed over the model. With the model in hand – highly predictive but potentially worthless – Bob was chomping at the bit to deploy it to guide his sales team.

We explained the reason his model was likely worthless for improving sales, but there was no talking him down. “Oh no, we’re definitely going to use it – machine learning is the best way to drive decisions with data and we’re dead set to do so.” We encouraged him to go out and collect more data and perform the more sophisticated analysis necessary, but that would be expensive, and from Bob’s perspective, he already had a great model. As far as we know, his team still uses it to target sales efforts today.

Ultimately, business professionals need more data fluency – more machine learning acumen. As much as the hype about machine learning urges the world to adopt it, that excitement should also urge a new degree of semitechnical understanding so that the business objective can be properly translated into the model’s technical requirements. In our story with Bob and Anne, one such piece of background knowledge was critical to ensuring the project would improve operational decisions: the exact prediction goal for a model and the required examples that must populate the training data in order to pursue that prediction goal.

Author's note. This article is a product of Siegel's work as the Bodily Bicentennial Professor in Analytics at UVA Darden School of Business.

Eric Siegel

Eric Siegel, Ph.D., is a leading consultant and former Columbia University professor who helps companies deploy machine learning. He is the founder of the long-running Machine Learning Week conference series, a frequent keynote speaker and executive editor of The Machine Learning Times. Eric authored the forthcoming book "The AI Playbook: Mastering the Rare Art of Machine Learning Deployment" and the bestselling "Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die," which has been used in courses at hundreds of universities. He won the Distinguished Faculty award when he was a professor at Columbia University, where he taught graduate courses in machine learning and AI. Later, he served as a business school professor at UVA Darden. Eric also publishes op-eds on analytics and social justice.

Michael Albert

Michael Albert is an assistant professor of quantitative analysis in Darden’s MBA program and has joint appointments in Systems Engineering and Computer Science in the School of Engineering and Applied Sciences (SEAS) at the University of Virginia. His research focuses on combining machine learning and algorithmic techniques to automate the design of markets. His work has appeared in leading artificial intelligence and machine learning venues, such as the Association for the Advancement of Artificial Intelligence (AAAI) and the International Joint Conference on Artificial Intelligence (IJCAI). Prior to joining Darden in 2018, Albert received his Ph.D. in financial economics at Duke University’s Fuqua School of Business. He has also worked as a visiting assistant professor of finance at The Ohio State University, postdoctoral researcher at the Learning Agents Research Group at the University of Texas at Austin under Peter Stone, and postdoctoral researcher in the artificial intelligence group headed by Vincent Conitzer at Duke University.
