September 9, 2018 in Forum
Neural networks and the case for efficient modeling
https://doi.org/10.1287/LYTX.2018.05.07
High-tech companies first introduced the idea of artificial intelligence (AI) into the mainstream, and now the term is a rapidly growing part of the everyday business lexicon. Within the AI space, deep learning and neural networks are receiving the most attention. Much like “big data,” the past decade’s tech buzzword du jour that left companies scrambling to understand and apply it, neural networks and their cousin, deep learning, are said to be the “new bleeding edge” that can bring companies to technological salvation.
Yet with all the fervor and fanfare, many are still left wondering what a neural network actually is. Neural networks, or at least the idea of them, have been around since the 1940s.
These mathematical models were inspired by our biological understanding of how our brains function, using layers of neurons to process and pass information, and the algorithms that underlie neural networks are designed to mimic these functions. The recent resurgence of interest in neural networks is due to a number of factors, including dramatic advances in fundamental research in the field, exponential increases in computing power, and the emergence of AI in popular consumer products such as Apple’s Siri and Amazon’s Alexa. These factors, along with the recent blast of marketing hype surrounding AI, have driven data scientists to explore how neural networks and deep learning can address more traditional business questions – a new set of tools in the professional data scientist’s arsenal.
Recently, I came across an article discussing how analytical models in general are leveraged in the context of customer science. The article compared several types of models using IBM Watson Telco data to predict customer churn, and the most powerful model described was a logistic regression – a type of statistical model that is commonly used to predict a binary outcome, such as a “yes” or a “no.” I also found another article that leveraged the exact same customer churn data set, but instead used a multilayer perceptron, which is one of the base forms of a deep learning model. The deep learning article was written to show how a simple multilayer perceptron was more powerful than the models from the first article. And in fact, the deep learning model was able to achieve an accuracy of about 82 percent, while the logistic regression was able to achieve an accuracy of 80 percent. Statistical performance had improved. But which model would a business stakeholder choose?
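For readers who have not met a logistic regression before, the idea can be sketched in a few lines of Python. This is only an illustration of how such a model turns customer attributes into a “yes” or “no”: the two features, the weights and the bias below are hypothetical values chosen for the example, not coefficients fitted to the Telco data or taken from either article.

```python
import math

def churn_probability(features, weights, bias):
    """Logistic regression: squash a weighted sum into a probability between 0 and 1."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def predict_churn(features, weights, bias, threshold=0.5):
    """Turn the probability into the binary "yes"/"no" answer."""
    return "yes" if churn_probability(features, weights, bias) >= threshold else "no"

# Hypothetical coefficients (assumptions for illustration only):
# one weight per month of tenure, one per dollar of monthly charge.
WEIGHTS = [-0.04, 0.02]
BIAS = -0.5

new_customer = [2, 90.0]     # 2 months of tenure, $90 per month
loyal_customer = [60, 40.0]  # 60 months of tenure, $40 per month

print(predict_churn(new_customer, WEIGHTS, BIAS))    # short tenure, high charge
print(predict_churn(loyal_customer, WEIGHTS, BIAS))  # long tenure, low charge
```

Each weight can be read directly: in this made-up example, every additional month of tenure lowers the churn score and every additional dollar of monthly charge raises it.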
We can start by asking whether a difference of 2 percentage points is meaningful, given the business context of the model. In fields where accuracy is key, such as the medical field, that difference could be consequential. For a business question like customer churn, however, many would argue the gap matters far less. But statistical accuracy is not the only way to measure models against each other. I’d argue that we need at least two other evaluation criteria: the ease of deploying the model into a production environment, and the ease of explaining the model (“explainability”) and its results to business stakeholders. We need to consider a balance of all three criteria when selecting a “best” model to deploy in a business.
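Before weighing the business context, it can also help to check whether the gap between 82 and 80 percent accuracy is larger than sampling noise. The sketch below uses a simple two-proportion z-score. It rests on two assumptions that are not in either article: a hypothetical holdout set of 1,400 customers per model, and the simplification that the two accuracy estimates are independent (models scored on the same customers would call for a paired test instead).

```python
import math

def accuracy_gap_z(acc_a, acc_b, n_a, n_b):
    """z-score for the difference between two accuracies, treated as independent proportions."""
    standard_error = math.sqrt(acc_a * (1 - acc_a) / n_a + acc_b * (1 - acc_b) / n_b)
    return (acc_a - acc_b) / standard_error

# Accuracies from the two articles; the holdout size is a hypothetical assumption.
HOLDOUT_SIZE = 1400
z = accuracy_gap_z(0.82, 0.80, HOLDOUT_SIZE, HOLDOUT_SIZE)

# A |z| below roughly 1.96 means the gap is within ordinary sampling noise
# at the conventional 5 percent significance level.
print(f"z = {z:.2f}, beyond noise: {abs(z) > 1.96}")
```

Under these assumptions the z-score comes out near 1.35, so a gap of this size on a holdout of this size would not by itself be convincing evidence that one model is more accurate than the other.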
Finding the right combination brings to mind the concept of efficiency in economics – balancing the measures to the point where further improving one would begin to harm another. Perhaps then an appropriate term for this model measurement technique could be “efficient modeling.” A modeler should start with the simplest model appropriate for the business question being asked, and then iteratively improve that model until it has reached “efficiency” – an optimal level of predictive power, explainability and ease of deployment.
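The economic notion of efficiency described above can be made concrete with a small sketch: a model counts as “efficient” if no other candidate is at least as good on every criterion and strictly better on at least one. The accuracy figures below come from the two articles; the explainability and ease-of-deployment scores, the 0-to-1 scale and the third candidate are hypothetical placeholders, not measurements.

```python
def dominates(a, b):
    """True if scores `a` are at least as good as `b` everywhere and better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)

def efficient_models(candidates):
    """Return the names of the candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
    ]

# Accuracy is from the articles; every other number is a hypothetical assumption.
CANDIDATES = {
    "logistic_regression": {"accuracy": 0.80, "explainability": 0.9, "ease_of_deployment": 0.9},
    "multilayer_perceptron": {"accuracy": 0.82, "explainability": 0.4, "ease_of_deployment": 0.5},
    "hypothetical_third_model": {"accuracy": 0.78, "explainability": 0.3, "ease_of_deployment": 0.4},
}

print(efficient_models(CANDIDATES))
```

With these placeholder scores the third model drops out because both of the others beat it on every criterion, while the logistic regression and the multilayer perceptron each remain efficient. Choosing between those two is then a business judgment about how much each criterion matters, not something the arithmetic can settle.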
In the second part of this series on neural networks, we’ll return to our competing models and review them in more depth through our “efficient modeling” lens. We’ve already seen that for this example, predictive power tilts in favor of the deep learning model. But what about ease of deployment? And explainability to business users? We’ll explore the implications of these criteria on model selection in the next article.
Jim Theologes is a data scientist at Elicit where he spends his days building insights from data for a wide variety of industries including retail, software security, short-term rental and aerospace. His foremost goal is distilling simple understanding from complex data to drive actionable insights.