Book Reviews

Published Online: https://doi.org/10.1287/inte.2015.0794

Abstract

In Book Reviews, we review an extensive and diverse range of books. They cover theory and applications in operations research, statistics, management science, econometrics, mathematics, computers, and information systems. In addition, we include books in other fields that emphasize technical applications. The editor will be pleased to receive an email from those willing to review a book, with an indication of specific areas of interest. If you are aware of a specific book that you would like to review, or that you think should be reviewed, please contact the editor.

The following books are reviewed in this issue of Interfaces, 45(3), May–June 2015: Predictive Analytics, Data Mining and Big Data: Myths, Misconceptions and Methods, Steven Finlay; and Modeling Techniques in Predictive Analytics: Business Problems and Solutions with R, Thomas W. Miller.

Predictive Analytics, Data Mining and Big Data: Myths, Misconceptions and Methods

Finlay, Steven. 2014. Predictive Analytics, Data Mining and Big Data: Myths, Misconceptions and Methods. Palgrave Macmillan. 226 pp. $45.00.

If you want an excellent nontechnical overview of the predictive analytics process, this book should meet your needs. Business managers asked to supervise data scientists and (or) their projects and others who need to understand the potential benefits they can expect from predictive analytics, data mining, and Big Data would do well to read this book. Importantly, the author does not jump right into implementing predictive analytics or managing a predictive analytics project within one’s organization, but spends time answering the question “How do I know if predictive analytics is right for my organization at this time?”

Chapter 1, Introduction, contains a good overview of why models fail; reasons given include (1) the organization is not ready for predictive analytics, (2) it uses the wrong model, and (3) its governance of the process and (or) model is weak. The chapter moves next to an explanation of the characteristics of Big Data and the entreaty to “…view Big Data as a philosophy about how to deal with data, rather than [focusing on] how much data is available or what it contains” (p. 14). In this chapter, the author also makes a surprising statement, at least for a consultant: “There are lots of opportunities for Big Data, but Big Data is not the answer to all of the world’s problems and it may not be right for you” (p. 17). Chapter 2, Using Predictive Models, is an important chapter for readers involved with predictive models—as long as they are not the model builders.

In Chapter 3, Analytics, Organization and Culture, Finlay provides the business manager with an excellent overview of the best way to implement predictive analytics within an organization; this is arguably the most useful chapter in the book. As the author states, “…getting a good quality analytics team in place is probably the last thing on the list you need to get right, not the first. You need to create the appetite for predictive analytics before you start recruiting people to do it” (p. 42). The three questions Finlay puts forth to determine “…the likely impact of introducing predictive analytics to an organization and how much disruption it might cause…” (pp. 60–61) are excellent, as are the three options given for “…combin[ing] human and automated prediction at the case level” (p. 63). Along the same lines, in Chapter 4, The Value of Data, he provides an overview of the benefits the reader can, and cannot, expect to achieve with data. The author points out that not all data are good data, and in Table 4.1 does an excellent job of defining data types. Overall, this chapter is a good introduction to the cost-benefit decisions that must be made when considering the specific data to obtain or use.

Chapter 5, Ethics and Legislation, contains a well-written introduction to ethics in general and the ethics of data use in particular. The identification of the three areas of concern in the realm of ethical use of data is helpful, as is the list of the three determinants of the risk of ethically questionable data usage. Table 5.2 nicely identifies how to determine these risks. This chapter might at first seem unnecessary; however, it is useful for the business manager because some data scientists might be better referred to as “data cowboys.”

Chapters 6–8 cover the heart of model building. Chapter 6, Types of Predictive Models, is less technical than this reviewer would have liked, but is nonetheless an excellent overview of the topic. Notably, the author does an admirable job of giving a nontechnical description of support vector machines. The chapter closes with a helpful discussion of ensemble systems and the prospects for better types of predictive models in the future. In Chapter 7, The Predictive Analytics Process, Finlay includes the following warning: “It’s all the other things around model development that need careful consideration and control if predictive analytics is going to be successful. My experience is that the data scientists themselves are not always the best people to oversee the wider task of getting models implemented and in use” (pp. 134–135). Business managers would be wise to heed this advice, coming as it does—surprisingly—from a data scientist. This chapter also contains excellent points on choosing the proper project management process for the type and scale of project and gives some reasons that information technology and predictive analytics projects fail. Chapter 8, How to Build a Predictive Model, is a good overview of the topic for the reader who is not a data scientist. This reviewer is very satisfied with the level of technical detail in this chapter; for example, the author makes an excellent point about model bias.

The audience for this book is the intelligent nontechnical person whom predictive analytics impacts in some way. Finlay, a data scientist with decades of experience, provides an excellent introduction for such readers, equipping them with the knowledge to manage both the implementation and use of predictive analytics models in their organizations. This book represents a commendable effort for a first-time author.

Neil Desnoyers

Decision & System Sciences Department, Saint Joseph’s University, Philadelphia, Pennsylvania

Modeling Techniques in Predictive Analytics: Business Problems and Solutions with R

Miller, Thomas W. 2015. Modeling Techniques in Predictive Analytics: Business Problems and Solutions with R. Pearson. 359 pp. $79.99.

“That’s part of your problem: you haven’t seen enough movies. All of life’s riddles are answered in the movies,” Steve Martin as Davis in the movie Grand Canyon (1991).

Each of the 12 chapters in Miller’s very enjoyable book, Modeling Techniques in Predictive Analytics: Business Problems and Solutions with R, begins with an esoteric movie quote similar to the one above as a prelude to the lesson in the chapter. We find 10 interesting business modeling examples in these chapters; Chapter 1 sets up the book and Chapter 12 is a summation. Each lesson is applicable to a business problem; in addition, the author suggests a resolution (often using R) and employs useful and varied data visualization to bring home the point. Cogent examples show how other business problems could be solved using each of these strategies. He wraps this all up in a highly readable format not often found in a book of this nature.

The author states in the preface that he wishes this to be the rare book that combines strategy, methods, and code. Let us look at each of these three areas in this review.

The first goal the author states is that he wants the book to be a ready resource and reference guide for modeling techniques. He achieves this. The beauty of this book is that each chapter starts with an interesting predictive analytic problem and then launches into techniques to solve it. This is a much better hook than in other books that do this in reverse order. We see the problem, become involved in its analysis, and work with the author toward the resolution. Each chapter explains how we can apply this modeling to other business challenges, not only to the problem at hand.

The author’s second goal is to show programmers how to build upon a foundation of code that works to solve real business problems. He uses a very helpful technique: he presents all the R programs at the end of each chapter. The formatting within the R programs is very organized, allowing the reader to more easily navigate the program. Within each program, annotations guide programmers through the programs with detailed explanations of the coding and commands. This is effective at helping programmers with various skill levels understand what each line of code is accomplishing in the program. By understanding the code, programmers are better able to use it to solve their own analytics problems. At the end of each program, the author presents helpful suggestions for students for future practice of the techniques discussed in the chapter. This can be a useful resource in predictive analytics courses that use R, because it allows programmers to expand upon the R programs given.

The author’s third goal is to translate the results of models into words and pictures that management can understand. The book is extremely effective here. It includes numerous examples of data visualization methods to enhance the points being made. For example, it gives working examples of box plots, spine charts, matrix bubble charts, horizon plots, ribbon plots, flow charts, scatter plots, Poisson distribution charts, negative binomial models, correlation heat maps, tree-structured regression charts, mosaic plots, and parallel coordinate plots. Each visualization is used in the context of explaining the data related to the problem. This gives the reader good working knowledge about using the chart or diagram in a real-life situation.

The appendices are also useful. They explain some of the basic statistical concepts this book employs, describe how to measure data, contain some useful and varied case studies, and include detailed R programs. The ability to download the R programs from the book’s website is helpful for the reader who wants to reproduce the author’s results.

The reader should be forewarned: This book assumes a certain level of statistical knowledge on the reader’s part. For those with this knowledge, the book moves along in a nice flow. Others may need to keep Wikipedia handy to brush up on terms and theories. This review of terminology is relevant and helpful as the book guides the reader from analytic signpost to signpost. Even the business person with little analytic experience will benefit from the examples in this book and the insight into the benefits of using these statistical tools.

Our constructive criticisms of the book are minimal. First, it could refer to the appendices more frequently. For example, it initially references the classical versus Bayesian approaches to variability on p. 5, and then mentions them several times later in the book. Adding a note saying that Appendix A includes an explanation would be helpful to the less-informed reader. Second, in several instances, the book references colors on charts that are printed in black and white, so determining which shading is red and which is blue can be difficult. Third, in two instances, the book states “Figure ?? shows….” None of these criticisms detracts from the book’s value.

This is a good book on analytic modeling. It succeeds in its objective of being a resource for modeling techniques, a handy guide to R in the real world, and a reference for data visualization ideas. Even the less statistically inclined readers will be able to glean useful information and ideas that they can utilize with data miners in their business settings. Those with a statistical background will find a fast-paced, entertaining look at how to use analytics successfully in a predictive manner.

David Wehling

The Toro Company, International Division, Bloomington, Minnesota; Globe University, Minneapolis, Minnesota

Kate Klasen

The Toro Company, International Division, Bloomington, Minnesota