Introducing Prescriptive and Predictive Analytics to MBA Students with Microsoft Excel
Abstract
Managers are increasingly being tasked with overseeing data-driven projects that incorporate prescriptive and predictive models. Furthermore, basic knowledge of the data analytics pipeline is a fundamental requirement in many modern organizations. Given the central importance of analytics in today’s business environment, there is a growing demand for educational pedagogies that give students the opportunity to learn the fundamentals while also familiarizing them with how such tools are applied. However, a tension exists between introducing real-world problems that students can analyze and extract insight from and the prerequisite knowledge of mathematical concepts and programming languages such as Python/R that such problems demand. To address this tension, this paper describes an application-focused course that uses Microsoft Excel and mathematical programming to introduce MBA students with nontechnical backgrounds to tools from both prescriptive and predictive analytics. While students gain proficiency in managing data and creating optimization and machine learning models, they are also exposed to broader business concepts. Teaching evaluations indicate that the course has helped students further develop their practical skills in Microsoft Excel, gain an appreciation of the real-world impact of data analytics, and engage with a discipline they originally believed was best suited to more technically focused professionals.
Supplemental Material: Supplemental materials are available at https://doi.org/10.1287/ited.2023.0286.
1. Introduction
The emergence of business analytics and artificial intelligence (AI) has fundamentally changed the requirements of a management education due to strong industry demand for personnel that have proficiency in business intelligence, quantitative analysis, and data engineering (Zhan et al. 2018, Dean 2020, Mitra et al. 2021). In fact, the field of data science, which did not formally exist in the early 2010s (Press 2013), is projected to grow approximately 28% through 2026 (Schroeder 2021). A nascent role is that of the manager who has enough low-level technical knowledge to understand the array of sophisticated analytics techniques that can be applied to data stores while also having a high-level perspective to see the managerial implications of the resulting analysis (Fitzgerald 2012). Although this seems to be an incredible opportunity for MBA programs, which have recently seen a decline in the number of applications (Thomas 2022), many business schools are struggling with how to incorporate analytics training into their MBA programming (Stine et al. 2019).
Although many management schools offer graduate degrees in data science, incorporating modernized analytics modules into the MBA curriculum is challenging. From a program design perspective, there are no accreditation standards for developing machine learning and business analytics courses (Xu and Babaian 2021). Furthermore, many business-focused AI textbooks put too much emphasis on the application of analytics and do not give sufficient detail as to how the algorithms actually work (Marr and Ward 2019). Alternatively, most academic AI resources are written for technical and/or mathematically oriented learners (Xu and Babaian 2021). Thus, there is tension between creating MBA courses that balance the need to have students understand analytics algorithms with teaching them how to interpret and communicate their findings (Langley 2019). There is also pressure to ensure that other functional areas of business knowledge are well represented in the curriculum (e.g., marketing, accounting, and finance), to offer unique courses in new and promising industry segments (e.g., digital transformation, sustainability, and entrepreneurship), and to incorporate extracurricular and networking opportunities (Kurczy 2018, Kirkpatrick 2020). As a consequence, there may be scant room for offering multiple courses on the topic of business analytics. The previous difficulties are further exacerbated by three main course-level challenges.
1.1. Students Require Prerequisite Knowledge
Prescriptive and predictive analytics are grounded in several quantitative disciplines such as algebra, mathematical programming, numerical optimization, statistics, and data mining. Furthermore, many introductory analytics courses assume students have some programming experience in Python/R or teach students how to code (Nagpal and Gabrani 2019). However, strong quantitative skills are typically lacking in MBA students (Weathers and Aragón 2019), and because programming is difficult to learn, a considerable time investment must be made (Murray 2014, Holzmen and Gooch 2016). Consequently, in some courses geared toward nontechnical learners, exercises and assignments often rely on partially constructed solutions if they contain difficult mathematical concepts or snippets of code that students do not fully comprehend (Langley 2019). Expecting that students have the required technical knowledge to succeed is clearly an impediment to encouraging them to seek out analytics training. Conversely, proficiency with Excel spreadsheets remains a technical area that graduate programs should place more emphasis on (Formby et al. 2017, Seal et al. 2020), even though most MBA programs require applicants to have a working knowledge of Excel upon admission (Questrom School of Business 2022).
1.2. Experiential Education Is Valuable Even for Management Training
Businesses must look beyond the data scientist to develop talent such that managers can contribute to producing data-driven solutions (Miller and Hughes 2017). However, to fully understand what data analysts do, what type of insights are possible, and the many pitfalls that may arise during the data engineering and modeling process, students must be trained to perform prescription/prediction tasks. It has also been shown that business students both enjoy, and gain a deeper understanding of, quantitative material when a hands-on approach is used (Xu and Babaian 2021). Nevertheless, when providing MBA students with opportunities to implement analytics models for difficult data-grounded problems, it is essential to have them also acquire skills related to problem conceptualization, model formulation, and the dissemination of insight (Fry 2008). Thus, the ability to craft application-driven learning experiences is crucial to having MBA students understand how to better manage a team of analytics professionals that work together to make data-informed decisions that improve the efficiency and operations of organizations (Longwood University 2021).
1.3. Modern Analytics Workflow
There are many aspects to data-driven decision making that must be touched on even for management-level knowledge. For instance, the education literature suggests that students should become proficient at assessing data quality, identifying relevant managerial questions, determining whether the data can be used to address the proposed questions, managing data sets, performing quantitative analyses, and interpreting model outputs (Weathers and Aragón 2019). Although these concepts aptly summarize the data acquisition, data preparation, and the model engineering (i.e., train, test, evaluate) pipeline, there is also a fundamental difference between building analytics models that are used to extract managerial insight, and deploying a trained model to generate business value (Visengeriyeva et al. 2020). Thus, MBA students should be aware of how analytics models can be deployed in production environments. Although this involves an introduction to machine learning operations (i.e., ML Ops), it also puts more emphasis on the integration of domain-specific knowledge, both in the implementation of the AI tool and in determining what high-level insights can be derived from its deployment (Langley 2019).
This paper describes an application-driven course that has been developed to introduce MBA students to optimization and machine learning concepts using the unifying language of mathematical programming. Although this lens has been introduced at a PhD level (Bertsimas and Dunn 2019), it is argued that this perspective is also valuable for teaching at the MBA level. As a consequence, students are not required to have prerequisite programming knowledge nor do they need to demonstrate advanced mathematical proficiency. Furthermore, all analytics models can be implemented with Microsoft Excel, which strengthens their competency with the software. The course provides students with many opportunities to apply the tools they learn and repeatedly emphasizes the business insight that can be obtained from rigorous quantitative analysis. Thus, it differs from the literature in that it presents more detail than many courses designed for nontechnical learners (Way et al. 2017) while also focusing on managerial issues such as quantitative storytelling, the model engineering process, and deployment; these topics are typically omitted from graduate-level curricula (Mike et al. 2020). In sum, the course teaches students how to obtain and communicate high-level analytics-based insight for real-world applications using experiential learning activities.
2. Background and Motivation
In the first year of their MBA at the Schulich School of Business, students are required to complete two mandatory courses that relate to data analytics and operations management. The first is entitled Quantitative Methods and teaches business numeracy; this includes topics such as descriptive statistics (measures of location and association, distributions, variability), inference (simple and multiple linear regression), and decision trees. The goal is to cultivate a basic understanding of the methods underlying data-driven decision making, have students gain an appreciation of their use in addressing real-world business problems, and provide opportunities to implement these methods using Microsoft Excel. The second course is entitled Operations Management and presents topics in operational strategy, process analysis, quality management, introduction to supply chains and production systems, and project management. The objective is to teach students foundational knowledge in managing operations. Both are quarter-year courses with six lectures, several assignments/case studies, and at least one individual exam. Although the specific topics may vary from year to year, variants of these courses can be found in many MBA curricula (Herrington 2010).
The elective half-year course, entitled Models and Applications in Operational Research (MAOR), is the primary focus of this teaching article, although several related modules have been created for other courses. MAOR constitutes a survey of topics related to operational research. The emphasis is placed on analyzing practical problems and obtaining business insight rather than on proofs or exploring mathematical properties. The application areas are varied and include, but are not limited to, financial planning and portfolio optimization, inventory and production, marketing analytics, transportation, healthcare, revenue management, supply chains, and economics.
From a technical perspective, MAOR has historically focused on teaching mathematical optimization, decision trees, and Monte Carlo simulation. However, prior to the redesign of the curriculum, enrollment was low (10–15 students) and the examples were somewhat outdated. In fact, there had not been any substantial revision to the course content for many years, and several topics overlapped with existing courses. Moreover, AI and its integration with society emerged as a strategic focus of the Schulich School of Business and the university more broadly (York University 2022a, b). As a consequence, other courses were revised to meet this objective. For instance, the coverage of decision trees was moved to Quantitative Methods and Monte Carlo methods are now taught in Applications of Data Science in Finance. Thus, MAOR needed to be updated and brought in line with the strategic focus of the program, school, and the university.
The philosophical underpinnings behind the revised version of MAOR come from several sources including the author’s own research and educational background, a review of the relevant literature (Warner 2013, Laursen and Thorlund 2016, Carillo 2017), as well as discussions with colleagues within and outside the institution. The following five principles helped guide the redesign:
Data, as opposed to models, are the focus. This perspective is championed in the book The Analytics Edge (Bertsimas et al. 2016) and by many industry experts (e.g., Telenti and Jiang 2020).
Students should be trained to become “data-savvy managers” (Manyika et al. 2011) who contribute to producing data-driven solutions. That is, students who successfully complete the course should have the requisite knowledge to understand the relationship between business problems and the application of analytical tools used to extract insight (Wilder and Ozgur 2015b). Thus, although students should receive hands-on experience implementing prescriptive and predictive algorithms to fully understand the process associated with performing analytics, the goal is to create managers and not graduates who will eventually become implementation experts (e.g., data scientists).
The course should appeal to students without a strong mathematical background. Introducing data analytics to nontechnical learners is difficult due to persistent concerns about the math requirements (Roth and Matherne 2021). By focusing on mathematical programs (prescriptive) and machine learning models formulated as optimization problems (predictive), a unified and modern treatment of optimization and machine learning can be presented (Kopcso and Pachamanova 2018, Bertsimas et al. 2019). Furthermore, only rudimentary mathematical knowledge is required as constrained optimization has been successfully taught in MBA programs for decades (Liberatore and Nydick 1999, Regan 2005). Finally, the approach enables a rich discussion of the difference between prescriptive and predictive models, emphasizes the analytics pipeline and the varying ways in which data are used (Jurney 2017), and allows for the deliberation of the societal concerns associated with AI, such as model interpretability (Bertsimas and Dunn 2017, Bertsimas and Wiberg 2020, Bertsimas et al. 2021), prejudice (Corbett-Davies and Goel 2018, Manyika et al. 2019, Kaushal et al. 2020), and issues with causality (Shah 2019, Cousineau et al. 2023).
Contrary to introductory courses that require some prerequisite knowledge of programming (Boutilier and Chan 2023), the course should appeal to students without any formal coding background. Typically, Microsoft Excel is regarded as software that is exclusively associated with performing descriptive and prescriptive analysis (Paul and MacDonald 2020). However, with the advent of predictive analytics and its widespread usage/appeal, this is slowly changing (Brusco 2022, Naderi et al. 2022). Furthermore, Excel remains one of the primary tools for data analysis (Leong and Cheong 2008, Montoya 2022). Thus, because of its pervasive use in industry and management education (Wilder and Li 2019, Patrick et al. 2019, Huggins et al. 2020), all examples and coursework should be completed within the Microsoft Excel ecosystem.
Because it is an elective within the Operations Management and Information Systems department, the course should highlight the deep connections between analytics and topics within the operations and supply chain management field (LeClair 2018). That is, students should become adept quantitative storytellers: communicators with the skill set to liaise with both data scientists and senior management on operations-focused issues (Olsen-Phillips 2022).
The revised course was first taught in Winter 2019 with 26 students (12 completed course evaluations) and Fall 2021 with 28 students (15 completed course evaluations). Although the course is typically offered in-person, the Fall 2021 version was fully online due to the COVID-19 pandemic.
3. Learning Objectives and Course Design
The purpose of MAOR is to provide students with an introduction to modern analytics through the analysis of real-world problems focusing on operations management. The course discusses data (acquisition, validation, and preparation), model engineering (training, testing, and evaluating prescriptive/predictive models), and model deployment. It consists of a mix of mini-lectures (videos) together with discussions of real-world applications (in-class sessions) focusing on decision situations. The central theme of the course is to have all topics motivated by circumstances that require evidence-based decision making. Thus, students develop a sense of how to approach complex problems of a technical nature. Furthermore, a ubiquitous feature in modern management is the central role that data plays in driving core business processes. Consequently, students learn how to effectively analyze data sets and implement analytical tools using a mathematical programming framework.
The course is organized into two modules. The first module introduces students to optimization technology for prescriptive analytics and includes topics such as linear programming, sensitivity analysis, mixed-integer linear programming, goal programs, and nonlinear programming. Many application areas are covered including inventory modeling, price optimization, production, transportation, process analysis, scheduling, and supply chain management. Several topics in the first half of the course are somewhat standard and are covered in MBA-level operations courses such as decision analysis (Stern School of Business 2021) and spreadsheet modeling (Rotman School of Management 2022). This is purposeful: not only are these topics taught at a level where larger-scale optimization models for more realistic-sized business problems are solved, but they act as a scaffold for creating prediction models using optimization technology.
The second module begins with a discussion of data acquisition, preparation, and wrangling and subsequently takes the view that machine learning is the application of optimization technology on historical data stores. As a result, supervised and unsupervised learning models are formulated as mathematical programming problems (Gambella et al. 2021). Topics include regression (least squares, quantile, least absolute deviation, k-nearest neighbors), regularization (best-subset selection, ridge, lasso), classification (logistic, support vector machines, k-nearest neighbors), unsupervised models (k-means clustering, principal component analysis), machine learning operations, and modern examples of AI systems. Many applications are introduced including human resource management, worker/student performance, revenue management, healthcare process analysis, quality management, and product forecasting. Moreover, the optimization framework is applied to several data sets to obtain managerial insight. The limitations and benefits of using an exact approach are discussed (e.g., certificate of optimality, added expressiveness of the mathematical language), as well as the process of deployment (giving analytics-based recommendations versus deploying a trained model). Finally, the link between prescription and prediction is emphasized (i.e., predict-then-optimize) to connect the perspectives. All computational analysis is performed in Microsoft Excel using the native solver tool and with OpenSolver, an open-source alternative.
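The framing of machine learning as the application of optimization technology is not tied to any particular tool. As a minimal illustration outside the Excel ecosystem (a Python sketch with invented data, not part of the coursework), least absolute deviation regression, one of the Week 9 topics, can be written as a linear program by introducing one auxiliary variable per residual:

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: y is roughly 2 + 3*x, with one deliberate outlier that
# would distort a least-squares fit.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 30)
y = 2 + 3 * x + rng.normal(0, 0.5, 30)
y[5] += 40  # the outlier

N = len(x)
X = np.column_stack([np.ones(N), x])  # design matrix [1, x]
J = X.shape[1]                        # coefficients, including intercept

# Variables: [beta_0, beta_1, u_1..u_N]; minimize sum of the u_i,
# where each u_i bounds the absolute residual of data point i.
c = np.concatenate([np.zeros(J), np.ones(N)])

# u_i >= |y_i - X_i beta|, encoded as two inequalities per data point:
#   X_i beta - u_i <= y_i   and   -X_i beta - u_i <= -y_i
A_ub = np.vstack([
    np.hstack([X, -np.eye(N)]),
    np.hstack([-X, -np.eye(N)]),
])
b_ub = np.concatenate([y, -y])

bounds = [(None, None)] * J + [(0, None)] * N
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
beta = res.x[:J]
print("LAD coefficients:", beta)
```

Because the objective penalizes absolute rather than squared residuals, the fitted coefficients stay close to the true values of 2 and 3 despite the outlier, which mirrors the in-class discussion of how the choice of loss function shapes the model.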
The course objectives are as follows:
To expose students to the main concepts of quantitative reasoning in a business environment.
To provide students with opportunities to improve problem solving and critical thinking skills and become expert communicators of quantitative information, for example, acquire the ability to effectively liaise with data scientists, machine learning engineers, and senior managers alike.
To develop a proficiency with mathematical models and computational tools for prescriptive and predictive analytics with a focus on improving their knowledge of Microsoft Excel.
To understand the relationship between mathematical programming, prescriptive/predictive models, and the role that data plays in problem formulation and model interpretability.
To facilitate an understanding of the business processes that support data-driven decision making and to make students aware of potential issues (e.g., model bias, data leakage).
3.1. Course Materials
There are no required textbooks for this course. This is because of financial considerations, the lens MAOR uses to introduce concepts in business analytics (i.e., the unifying language of mathematical programming), and because, as discussed in the Introduction, AI textbooks are typically too technical for business students or they do not present enough detail with respect to the underlying algorithms. Instead, students are provided with a detailed slide deck (∼100 slides per meeting session), several handouts, and multiple Excel spreadsheets that have worked examples of all problems discussed in class. In addition, a package of completed problems is provided for self-study in preparation for exams. An example slide deck covering mixed-integer linear programming (MILP) models (Week 5) and machine learning regression (Week 9) is included in the supplementary materials, as well as a handout on sensitivity analysis (Week 2) and three Excel files (one example is from Week 5 and two examples are from Week 9).
3.2. Computational Considerations
When solving problems motivated by real-world applications, it is often necessary to formulate large-scale models (e.g., hundreds or thousands of decision variables and/or constraints). For instance, constructing a mathematical program to solve a transportation problem (prescription) with 50 nodes may require thousands of decision variables. Alternatively, creating a quadratic program to solve a soft-margin, support vector machine (prediction) requires there to be at least as many decision variables as the number of data points and features (Gambella et al. 2021).
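To make the variable counts concrete, the following is a hedged Python sketch (all supplies, demands, and costs are invented for illustration; the course itself builds such models in Excel) of a small transportation linear program. The same construction with 50 supply and 50 demand nodes would already require 2,500 shipment variables:

```python
import numpy as np
from scipy.optimize import linprog

# Transportation LP with m supply nodes and n demand nodes, i.e.,
# m * n shipment variables x[i, j]. At m = n = 50 that is already
# 2,500 decision variables, well beyond the native Excel solver limit.
m, n = 3, 4
cost = np.array([[4.0, 6.0, 8.0, 5.0],
                 [7.0, 3.0, 5.0, 9.0],
                 [6.0, 5.0, 4.0, 3.0]])
supply = np.array([40.0, 30.0, 30.0])
demand = np.array([20.0, 25.0, 30.0, 25.0])

c = cost.ravel()                           # minimize total shipping cost
A_supply = np.kron(np.eye(m), np.ones(n))  # sum_j x[i, j] <= supply[i]
A_demand = np.kron(np.ones(m), np.eye(n))  # sum_i x[i, j] == demand[j]

res = linprog(c, A_ub=A_supply, b_ub=supply,
              A_eq=A_demand, b_eq=demand, method="highs")
print("optimal cost:", res.fun)
print("shipments:\n", res.x.reshape(m, n).round(1))
```

The constraint matrices grow linearly in the number of nodes, but the variable vector grows with their product, which is why realistic networks quickly outstrip small solver limits.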
Unfortunately, Microsoft Excel solver is limited to 200 decision variables and 100 constraints. Thus, to solve moderate- to large-scale problems, the open-source add-in OpenSolver is used (Mason 2012). Not only is it easy to install, but it has no limit on the number of variables or constraints that can be incorporated into a model. The interface is similar to the native solver tool (Figure 1) and the advanced version has access to several powerful nonlinear solvers. This is especially important when formulating machine learning models as their objective functions are frequently nonlinear (e.g., logistic regression, ridge regression). Furthermore, its open-source status means that OpenSolver can be used in industry for noncore projects. Finally, there are Excel connectors for industrial solvers (Gurobi Optimization, LLC 2022) which implies that students may see similar interfaces if they are employed in organizations that have licenses for these software tools.
Nevertheless, it is important to note that OpenSolver is not a panacea. Although there have been significant speed improvements in constrained optimization solvers, for linear and quadratic programs in particular, state-of-the-art implementations of machine learning models as optimization problems still only contain hundreds to a few thousand data points (Bertsimas and King 2016). Although this is sufficient for many applications and for an introductory MBA course on analytics (Pachamanova et al. 2022), the result is that the course does not delve into problems where big data sets are required (Deng et al. 2009). Thus, it is recommended that all models, especially assigned material, be run on several machines (MacOS, Windows) to determine how long it takes OpenSolver, or the native solver tool, to find an optimal solution. Then, an instructor can modify the problem (e.g., reduce the size of the data set, use a different solver) if the runtime is too long.
In Appendix A, more detail is provided regarding the use of OpenSolver, features that have been particularly important for the course, and issues that instructors should be aware of.
3.3. Weekly Sessions
The course is delivered over a 13-week semester. There are 12 two-hour meeting sessions, a midterm, and a final exam. An example course outline is provided in Table 1. Prior to each meeting session, students are required to watch a 20- to 40-minute video that contextualizes the upcoming in-class session. The video focuses on introducing a specific prescriptive/predictive technique, the settings under which it is appropriate, what/how data are used, potential pitfalls, and discusses how the model can be implemented in Excel. During the meeting session, one to three real-world applications that use the technique are introduced. Each example is structured similarly: problem definition, identifying relevant data, data preparation and transformation, model engineering, discussion of managerial insights, and the communication of business knowledge.
Table 1. Example Course Outline

| Week | Sample topics |
| --- | --- |
| 1 | Introduction to Mathematical Programming: Visualizing the Solution Approach |
| 2 | Modeling Mathematical Programs with Excel & OpenSolver + Sensitivity Analysis |
| 3 | Solving Medium- to Large-Scale Linear Programming Models with Excel |
| 4 | Integer and Binary Programming: Scheduling and Covering Problems + Logical Constraints |
| 5 | Mixed Integer Linear Programs: Multiple Decision Classes and Linking Constraints |
| 6 | Goal Programming: Hard vs. Soft Constraints and Pricing Deviational Variables |
| 7 | Nonlinear Programming: Quadratic Optimization and Empirical Risk Minimization |
| 8 | Introduction to Prediction: Data, Training, Evaluation + Predict-then-Optimize |
| 9 | Supervised Learning Regression: Least-Squares and Quantile Regression + Regularization |
| 10 | Supervised Learning Classification: Logistic Regression and Support Vector Machines |
| 11 | Unsupervised Learning: k-Means Clustering and Dimensionality Reduction (PCA) |
| 12 | Introduction to Machine Learning Operations: Overview of End-to-End Workflows |
This method of pedagogy is known as the flipped classroom (Bishop and Verleger 2013, Akçayir and Akçayir 2018) and is particularly advantageous in this setting as it allows students to learn at their own pace, makes course content more easily accessible, and frees meeting sessions to be almost entirely devoted to (i) introducing a real-world application; (ii) solving a data-driven analytics problem; and (iii) contextualizing the example within the MBA curriculum to build an understanding of the challenges that may be associated with encountering the problem outside the classroom (e.g., issues with data acquisition, combining multiple competing perspectives). Furthermore, for each problem, a spreadsheet associated with the implementation of the solution is provided. Apart from being extensively used during class, it is a great resource for self-study in that students can review the steps associated with data wrangling, identify the actions needed for implementing the analytics model, and experiment with alternative formulations (e.g., adding/removing constraints, different types of regularization). This helps to improve their Excel proficiency and provides a template for producing spreadsheets with a similar structure (e.g., assignments/exams).
3.3.1. Detailed Outline—Prescriptive Analytics Session: MILP Models (Week 5).
From a technical perspective, the purpose of this week’s module is to have students realize that mathematical programs can include many types of decision variables (continuous, integer, discrete). The implication is that very general models can be created whose decisions only depend on what type of managerial insight one would like to learn. That is, models are not limited to one type of decision variable but, instead, depend on the business decisions that need to be understood/made. The prelecture video touches on this idea, noting that solving such models is computationally no more difficult than solving an integer or binary program (the previous week’s class). What is difficult about this class of problems, however, is that to faithfully represent the real-world situation, the different types of decision variables need to be linked. Thus, a new class of constraints is required. To understand how these linking constraints manifest in prescriptive problems, several real-world examples are introduced during the in-class session.
During the two-hour class discussion, two to three examples are introduced that pertain to some aspect of operations management. They can include make versus buy decisions (Agrawal 2006), building plan design (Ngowtanasuwan 2012), production and transportation, and volume discounts (Terwiesch and Cachon 2019). Each example is derived from a research project and/or a consulting experience of the instructor or their colleagues, which means that there is a rich discussion of the setting that motivates each problem. All examples follow the same rubric: problem description, definition of decision variables and objective function, creation of constraints, introduction of the linking constraints, formulation in Excel, and analysis of the optimal solution.
The in-class examples give students the opportunity to formulate and solve a mathematical programming problem, and thus, they get hands-on experience at model building in a low-stress environment. Furthermore, the problems incorporate linking constraints in different ways which further demonstrates the generality of mathematical programs. Because the size of the models and level of involvement of the examples can be slightly above the typical introductory operations course offered in MBA programs, students expand their quantitative problem-solving skills while also improving their competency with Excel. Finally, each problem requires data for model parametrization which leads to a nuanced discussion regarding how this information can be collected in practice.
From a managerial point of view, the flipped classroom environment allows for significant interaction. Furthermore, after solving each problem, there is a rich conversation about next steps (e.g., how to interpret the results, follow-up analyses that could be performed to demonstrate the robustness of the findings, the ways in which the results can be communicated to management). Finally, operational insights motivated by the solution are considered. For example, in the production and transportation example, a dog food manufacturer must decide which production facilities to operate (binary decision variables) and which customers to serve from which operating plant (continuous decision variables). In the optimal solution, only a small subset of the production facilities is used. Furthermore, the students observe that some customers receive dog food from production facilities in a distant city even though there is an option to rent a facility in the same city, whereas other customers are not served at all. The example highlights how the high fixed cost of renting a production facility drives the model to minimize the number of rented plants. In some cases, the total cost of production is such that it makes sense not to serve a customer. However, if variable costs were higher and fixed costs were lower, the insights might differ. Thus, model interpretability is important as it provides visibility into why an analytics tool produced a particular solution.
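To show the linking logic outside the spreadsheet, here is a hedged Python sketch loosely inspired by the production and transportation example (all plant, customer, and cost figures are invented; they are not the classroom data). Instead of calling a MILP solver, it makes the link between the binary and continuous decisions explicit by enumerating every open/close pattern and solving the resulting shipment LP, giving closed plants zero capacity; OpenSolver performs this search implicitly via branch and bound:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

# Hypothetical data: 3 candidate plants, 4 customers. Renting a plant is
# a binary decision with a fixed cost; shipments are continuous. In the
# MILP the two are tied by linking constraints x[i, j] <= cap[i] * y[i].
fixed = np.array([400.0, 300.0, 350.0])     # rental cost per plant
cap = np.array([60.0, 50.0, 50.0])          # plant capacities
demand = np.array([30.0, 25.0, 20.0, 25.0])
ship = np.array([[4.0, 6.0, 8.0, 5.0],      # per-unit shipping costs
                 [7.0, 3.0, 5.0, 9.0],
                 [6.0, 5.0, 4.0, 3.0]])
penalty = 12.0                              # per-unit cost of not serving

m, n = ship.shape

def serve_cost(open_mask):
    """Shipment LP for a fixed open/close pattern: closed plants get
    zero capacity (the linking logic); unserved demand pays the penalty."""
    # Variables: x[i, j] flattened, plus one 'unserved' slack per customer.
    c = np.concatenate([ship.ravel(), np.full(n, penalty)])
    A_supply = np.hstack([np.kron(np.eye(m), np.ones(n)), np.zeros((m, n))])
    b_supply = cap * open_mask
    A_demand = np.hstack([np.kron(np.ones(m), np.eye(n)), np.eye(n)])
    res = linprog(c, A_ub=A_supply, b_ub=b_supply,
                  A_eq=A_demand, b_eq=demand, method="highs")
    return res.fun

# Brute-force the binary decisions (fine for 3 plants; a real MILP solver
# such as OpenSolver searches this space implicitly).
best = min(
    (fixed @ np.array(y) + serve_cost(np.array(y, dtype=float)), y)
    for y in product([0, 1], repeat=m)
)
print("total cost %.1f with open plants %s" % best)
```

Even on these toy numbers, the high fixed rental cost keeps one plant closed in the optimal solution; raising fixed costs further would eventually leave some demand unserved at the penalty cost, the same trade-off the dog food example surfaces in class.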
For a detailed discussion of the Production and Transportation example, see Appendix B.1 and the corresponding Excel spreadsheet provided in the online appendix.
3.3.2. Detailed Outline—Predictive Analytics Session: Regression (Week 9).
From a technical perspective, the purpose of this week’s module is to have students realize that mathematical programs can be formulated to solve prediction tasks. This concept is introduced in the previous two sessions with a discussion of empirical risk minimization and the presentation of ordinary least squares (OLS) regression as an unconstrained quadratic optimization problem. However, it is expanded on in the prelecture video where mathematical programming formulations of regularized regression models (lasso, ridge, and best subset selection) are introduced.
To be more concrete, suppose that a data set has been collected with N instances and J features. The data have already been cleaned and the features transformed to their final values. Let the outcome variable of interest be yi and define the value of feature j associated with instance i to be xij for i = 1, …, N and j = 1, …, J. Then, the objective function for the regularized models is

\[ \min_{\beta_0, \beta} \; \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{J} \beta_j x_{ij} \Big)^2, \]

subject to a budget constraint on the coefficients: \(\sum_{j=1}^{J} |\beta_j| \le t\) for the lasso, \(\sum_{j=1}^{J} \beta_j^2 \le t\) for ridge regression, and \(\sum_{j=1}^{J} z_j \le k\) with \(|\beta_j| \le M z_j\) and \(z_j \in \{0,1\}\) for best subset selection, where t (or k) is the budget parameter.
In the prelecture video, apart from presenting the mathematical programming formulation for each regression model, insight is given as to the motivation for the constraints, when each model should be applied, and several real-world studies that utilize the different regularization techniques. In the subsequent in-class session, an example is introduced where students apply the models to a historical data set of academic performance associated with high school students in Portugal. The data set must first be cleaned and some of the variables transformed before it is separated into a training and testing set. The final data set contains 33 features and 395 records, which is split using the 80/20 rule. Then, students implement each regularized model in Excel, along with two benchmarks: (i) predicting the mean response of the training set and (ii) OLS regression. After implementation, students evaluate the results and how changes to the budget parameter affect model performance. The quantitative discussion concludes with students having to choose the best-performing prediction model, motivating that choice (e.g., lowest root mean squared error), and identifying the most important features for predicting grades, that is, home life situation, presence of both parents, parental education levels, rural versus urban location, school reputation, whether the student has university aspirations, number of absences, and the amount of study time.
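The model engineering workflow just described (an 80/20 split, a mean-response benchmark, OLS, and a regularized fit) can also be sketched outside of Excel. The following Python snippet is a minimal illustration on synthetic data rather than the Portuguese data set; the variable names are hypothetical, and ridge regression is written in its equivalent penalized (Lagrangian) form rather than the budget-constrained form used in class:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the cleaned student data set (the course uses
# 395 records and 33 features from the Portuguese high school data).
N, J = 200, 5
X = rng.normal(size=(N, J))
beta_true = np.array([2.0, -1.0, 0.0, 0.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.5, size=N)

# 80/20 train/test split.
split = int(0.8 * N)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

def rmse(y_hat, y_obs):
    return np.sqrt(np.mean((y_hat - y_obs) ** 2))

# Benchmark (i): predict the mean response of the training set.
rmse_mean = rmse(np.full(len(y_te), y_tr.mean()), y_te)

# Benchmark (ii): OLS regression (unconstrained least squares).
beta_ols, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
rmse_ols = rmse(X_te @ beta_ols, y_te)

# Ridge regression in penalized form; equivalent to the budget-constrained
# formulation for some value of the penalty weight lam.
lam = 1.0
beta_ridge = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(J), X_tr.T @ y_tr)
rmse_ridge = rmse(X_te @ beta_ridge, y_te)

print(rmse_mean, rmse_ols, rmse_ridge)
```

On data with a strong linear signal, both regression models should beat the mean-response benchmark out of sample, mirroring the comparison students perform in Excel.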
After the quantitative analysis, students must indicate how they would go about presenting these findings to parents and school administration alike. This discussion focuses on what is controllable (e.g., number of absences, study time) versus factors that cannot be easily changed (e.g., home life situation, presence of both parents, parental education levels, rural versus urban location). The classroom conversation eventually leads to a broader dialogue on how to incentivize individuals, such as employees, to achieve desirable performance (Sommer and Loch 2009, Lee and Zenios 2012). The discussion also introduces behavioral nudges (Thaler 2018, Fishbane et al. 2020), a more recent mechanism used to indirectly influence behavior, such as Uber awarding badges to incentivize drivers to work longer hours and Deliveroo using push notifications to nudge its food delivery workers into working faster (Möhlmann 2021).
Two additional models are introduced during the in-class session. The first is the formulation of quantile regression, which is motivated by the desire to predict the median response (e.g., house prices) and to create prediction intervals. The corresponding model is a linear program:

\[ \min_{\beta_0, \beta, u, v} \; \sum_{i=1}^{N} \big( \tau u_i + (1 - \tau) v_i \big) \]

subject to

\[ y_i - \beta_0 - \sum_{j=1}^{J} \beta_j x_{ij} = u_i - v_i, \qquad u_i \ge 0, \quad v_i \ge 0, \qquad i = 1, \dots, N, \]

where the deviational variables ui and vi capture the positive and negative parts of each residual and \(\tau \in (0, 1)\) is the quantile of interest (\(\tau = 0.5\) yields the median).
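To make the linear programming structure of quantile regression concrete, the sketch below solves a median regression (τ = 0.5) on synthetic data with SciPy's linprog; the variable ordering and names are illustrative only:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

# Small synthetic data set standing in for the class example.
N = 60
x = rng.uniform(0, 10, size=N)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=N)

tau = 0.5  # the median
# Variable order: [b0, b1, u_1..u_N, v_1..v_N]; u, v are deviational variables.
c = np.concatenate([[0.0, 0.0], tau * np.ones(N), (1 - tau) * np.ones(N)])

# Equality constraints: b0 + b1*x_i + u_i - v_i = y_i for each instance i.
A_eq = np.hstack([np.ones((N, 1)), x[:, None], np.eye(N), -np.eye(N)])
b_eq = y

# Coefficients are free; deviational variables are non-negative.
bounds = [(None, None), (None, None)] + [(0, None)] * (2 * N)
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
b0, b1 = res.x[:2]
print(b0, b1)
```

The recovered intercept and slope should be close to the true values (3 and 2) used to generate the data, since the median of symmetric noise coincides with its mean.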
Finally, an implicit assumption regarding both quantile and regularized OLS regression is that the approximator is a linear function of the features. Although more complex models can be introduced in advanced classes, such as mathematical programs for decision tree learning (Dunn 2018, Aghaei et al. 2020), their formulation is too complex for the MBA audience. Instead, a nonparametric model (kNN regression) can be formulated where the function class does not need to be specified:

\[ \min_{w} \; \sum_{i=1}^{N} w_i \, d(x_0, x_i) \quad \text{subject to} \quad \sum_{i=1}^{N} w_i = k, \qquad w_i \in \{0, 1\}, \; i = 1, \dots, N, \]

where d(x0, xi) is the distance (e.g., Euclidean) between a query point x0 and training instance xi. The binary variable wi selects the k nearest neighbors, and the prediction is the average response of the selected instances, \(\hat{y}(x_0) = \tfrac{1}{k} \sum_{i=1}^{N} w_i y_i\).
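Because kNN regression amounts to selecting the k training points with the smallest distances to the query point and averaging their responses, it can be evaluated directly without a solver. The following Python sketch (illustrative names, synthetic data) computes a prediction this way:

```python
import numpy as np

def knn_predict(X_train, y_train, x0, k=5):
    """Average the responses of the k training instances closest to x0
    under the Euclidean distance."""
    d = np.linalg.norm(X_train - x0, axis=1)
    nearest = np.argsort(d)[:k]   # indices of the selected neighbors
    return y_train[nearest].mean()

# Tiny illustration with a known relationship y = x1 + x2.
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(100, 2))
y = X[:, 0] + X[:, 1]
pred = knn_predict(X, y, np.array([0.5, 0.5]), k=5)
print(pred)
```

With 100 points in the unit square, the five nearest neighbors of (0.5, 0.5) lie close to the query point, so the prediction lands near the true value of 1.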
The session concludes by presenting an overview of the data wrangling and model engineering process, and contrasts the role of data in creating predictive/prescriptive models. For a detailed discussion of the Regularized Regression and kNN Regression examples, see Appendices B.2 and B.3 and the corresponding Excel spreadsheets provided in the online appendix.
3.3.3. Discussion Sessions (Week 8 and Week 12).
There are two sessions that do not follow the flipped classroom structure. These sessions present high-level content on the data acquisition and preparation process, and model engineering (train, test, evaluate), as well as details associated with deploying analytics models in practice (i.e., user insight vs. the creation of an AI tool). Week 8 acts as a transition session between the presentation of prescriptive (Weeks 1–7) and predictive (Weeks 9–12) models. Students are introduced to noteworthy concepts associated with the process of creating prediction tools such as transforming data, model engineering, assessing out-of-sample performance, data leakage, model bias, and the bias-variance tradeoff. This is especially important for MBA graduates if and when they manage multiple analytics professionals (e.g., data scientists, machine learning engineers) on large AI projects. Furthermore, the final session (Week 12) focuses on the end result of analysis, that is, whether the models are used to obtain managerial insight or are deployed. Concrete suggestions on effective communication strategies are given such as ensuring that recommendations and business insights are clearly communicated, the importance of creating visual aids, the use of outlines to systematically organize complex ideas, and how to balance the amount of detail that is presented (e.g., main text versus appendix). In addition, a few real-world examples of customer-facing AI tools are showcased. The classroom conversation highlights the purpose of each tool and the potential ethical issues associated with its use. A summary of the class content is given in Appendix C.
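The out-of-sample ideas from the Week 8 session can be demonstrated with a minimal, hypothetical experiment: an overly flexible model fits the training data better than a simple one yet can generalize worse, which is the essence of the bias-variance tradeoff. A sketch, assuming a noisy linear ground truth:

```python
import numpy as np

rng = np.random.default_rng(3)

# Noisy linear ground truth; hold out the last 10 points for testing.
x = rng.uniform(-1, 1, size=40)
y = 1.5 * x + rng.normal(scale=0.4, size=40)
x_tr, x_te, y_tr, y_te = x[:30], x[30:], y[:30], y[30:]

def rmse(p, xs, ys):
    return np.sqrt(np.mean((np.polyval(p, xs) - ys) ** 2))

p1 = np.polyfit(x_tr, y_tr, 1)    # simple model
p15 = np.polyfit(x_tr, y_tr, 15)  # overly flexible model

print("train:", rmse(p1, x_tr, y_tr), rmse(p15, x_tr, y_tr))
print("test: ", rmse(p1, x_te, y_te), rmse(p15, x_te, y_te))
```

The degree-15 polynomial always achieves a lower training error than the line (its basis contains the line), but its test error typically exceeds its training error by a wide margin, illustrating why performance must be assessed out of sample.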
3.4. Assignments and Exams
There are five deliverables in MAOR: three assignments worth 12% each and two noncumulative exams worth 32% each. Using Table 1 as a guide, the first assignment covers weeks 2–4, the second covers material from weeks 5–7, and the last assignment focuses on weeks 8–11. Because of time constraints, machine learning operations is not assessed. Finally, the first exam focuses entirely on prescriptive modeling, whereas the second exam covers data engineering and predictive modeling.
All work is submitted individually, although students are encouraged to consult with their peers on assignments. Deliverables have the same format: they contain two to three questions that introduce a modern problem, provide real data, and require both technical and managerial answers. Each question has students perform data preprocessing and wrangling, necessitates the implementation of one or more prescriptive/predictive models, requires the evaluation of the proposed models, and asks several high-level questions about the implications of the results. Deliverables are submitted online, and the only difference between assignments and exams is that the latter has a shorter time interval for completion. That is, all deliverables are take-home, open-book assessments that are completed using Microsoft Excel. For more details, see Appendix B.4 and the online appendix, which includes a sample exam and the corresponding Excel spreadsheet.
3.5. Consulting Project
There is currently no culminating project in MAOR. The reason is that it is a second-year elective course and, during this year, MBA students are required to complete an intense capstone project that liaises with a real company and conducts a comprehensive strategic assessment of all functional organizational areas. Because students analyze many aspects of a business and make recommendations to management in a strategic action plan, adding another consulting project to their workload would be placing an undue burden on their already packed schedules. Furthermore, there may also be conflicts of interest that would greatly increase the logistical complexity of MAOR.
Nevertheless, in many courses, consulting projects with real companies are incorporated into the curriculum (Boutilier and Chan 2023). This can be an extremely valuable learning experience as it exposes students to the analytics pipeline (problem definition, data acquisition, model engineering) while emphasizing soft skills such as the writing of technical reports and presenting analytical findings to management. To facilitate these types of experiences, instructors can collaborate with their department’s career center or use online platforms such as Riipen (https://app.riipen.com), which connects educators to industry partners so that a real-world project can be designed as an in-course assignment (both approaches have been successfully used in other courses taught by the author).
4. Student Feedback and Course Modifications
To determine the effectiveness of the course curriculum, feedback from the online student evaluation system is examined. The program solicits both quantitative and qualitative responses on the efficacy of the instructor, the utility of the course content, and how closely students' experiences match their expectations. Changes to the course, driven by both formal/informal student feedback and structural modifications induced by the COVID-19 pandemic, are also discussed.
4.1. Student Evaluations
To determine the success of the curriculum, a subset of the quantitative course evaluation questions that most directly relate to content design and delivery was selected. This includes the structure of the weekly modules, the effectiveness of the deliverables, and the ability of the instructor to achieve the stated course objectives. The mean course score, department score, and faculty score are recorded, with the best assessment being seven and the worst being one (a seven-point scale is used because focus groups at the school indicated that students desired a more fine-grained set of distinctions than the five-point scale provides). The same set of questions is administered in all courses offered by the business school. Results for the following 10 questions are presented in Table 2.
Q1: The course materials (e.g., course kits, textbooks, readings, audio visual materials, laboratory manuals, websites, etc.) helped me achieve the course objectives.
Q2: The course activities (e.g., lectures, discussions, simulations, assignments, exercises and presentations, etc.) helped me achieve the course objectives.
Q3: The course tests/exams or final paper/essay were directly related to the course objectives.
Q4: The course helped me grow intellectually.
Q5: The course encourages critical thinking skills such as problem-solving definition, decision making, judgment, analysis and evaluation.
Q6: The instructor was organized and well prepared.
Q7: The instructor presented ideas and concepts clearly.
Q8: The instructor showed enthusiasm for the subject.
Q9: The instructor dealt effectively with students’ questions and comments.
Q10: The instructor created and maintained an environment that was conducive to learning.
Table 2. Student Evaluation Scores (Seven-Point Scale) for the 10 Questions Most Directly Related to Content Delivery
| Question | Score | Winter 2019 | Fall 2021 | Average |
|---|---|---|---|---|
| Class size | | 26 | 28 | 27 |
| Responses | | 12 | 15 | 13.5 |
| Q1 | Course | 6.75 | 6.20 | 6.48 |
| | Department | 5.76 | 5.92 | 5.84 |
| | Faculty | 5.94 | 6.04 | 5.99 |
| Q2 | Course | 6.75 | 6.53 | 6.64 |
| | Department | 5.73 | 5.92 | 5.83 |
| | Faculty | 5.97 | 6.06 | 6.02 |
| Q3 | Course | 6.58 | 6.40 | 6.49 |
| | Department | 6.09 | 6.00 | 6.05 |
| | Faculty | 6.13 | 6.18 | 6.16 |
| Q4 | Course | 6.75 | 6.73 | 6.74 |
| | Department | 5.58 | 5.91 | 5.75 |
| | Faculty | 5.96 | 6.06 | 6.01 |
| Q5 | Course | 6.92 | 6.80 | 6.86 |
| | Department | 5.92 | 5.97 | 5.95 |
| | Faculty | 6.10 | 6.12 | 6.11 |
| Q6 | Course | 6.83 | 6.80 | 6.82 |
| | Department | 5.92 | 5.90 | 5.91 |
| | Faculty | 6.19 | 6.15 | 6.17 |
| Q7 | Course | 6.75 | 6.67 | 6.71 |
| | Department | 5.42 | 5.69 | 5.56 |
| | Faculty | 5.95 | 6.01 | 5.98 |
| Q8 | Course | 6.92 | 6.80 | 6.86 |
| | Department | 5.98 | 5.97 | 5.98 |
| | Faculty | 6.37 | 6.36 | 6.37 |
| Q9 | Course | 6.83 | 6.40 | 6.62 |
| | Department | 5.71 | 5.95 | 5.83 |
| | Faculty | 6.10 | 6.16 | 6.13 |
| Q10 | Course | 6.83 | 6.53 | 6.68 |
| | Department | 5.75 | 5.77 | 5.76 |
| | Faculty | 6.12 | 6.13 | 6.13 |
| All | Course | 6.79 | 6.59 | 6.69 |
| | Department | 5.79 | 5.90 | 5.84 |
| | Faculty | 6.08 | 6.13 | 6.11 |
Notes. The mean score for the course, the department score, and the faculty score are presented. The final three rows of the table present the average scores for all 10 questions.
For every question, notably higher evaluations are observed for MAOR than for other courses offered by the department and across the business school. The final three rows of Table 2 present the average scores over all 10 questions. Student t statistics and p values corresponding to a one-sided hypothesis test are computed; the null hypothesis is that the observations from MAOR and the department/faculty courses come from populations with the same mean but unknown and potentially unequal variances. Although the tests are underpowered (the sample size is two), they provide some statistical evidence that the average course score is higher than both the average department score and the average faculty score.
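For readers who wish to reproduce the comparison, the sketch below applies a one-sided Welch-style t test to the aggregate scores in the final three rows of Table 2 (with only two observations per group, the exact t and p values should be interpreted cautiously):

```python
from scipy import stats

# Aggregate scores over all 10 questions (final three rows of Table 2).
course = [6.79, 6.59]       # MAOR, Winter 2019 and Fall 2021
department = [5.79, 5.90]
faculty = [6.08, 6.13]

# One-sided Welch t test: H0 is equal means with potentially unequal variances.
t_dept, p_dept = stats.ttest_ind(course, department,
                                 equal_var=False, alternative="greater")
t_fac, p_fac = stats.ttest_ind(course, faculty,
                               equal_var=False, alternative="greater")
print(t_dept, p_dept)
print(t_fac, p_fac)
```

A better-powered version of the test would use the per-question scores rather than the two yearly aggregates.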
Informal discussions with students in MAOR indicate that the favorable evaluations are because of the perceived relevance of the topics covered in the course and the accessibility of the curriculum (e.g., Excel versus Python, emphasis on obtaining business insight versus exploring mathematical proofs/structure). The scores from Table 2 suggest that students are very much engaged with, and interested in, the material and that from the student perspective, the course design is effective at developing problem-solving and critical thinking skills. Finally, the analysis underscores the fact that although some aspects of the course are novel (in particular, representing machine learning models as mathematical programming problems) and, in many cases, there are few resources on the topic, students are satisfied with the support they have received (lecture slides, videos, and in-class sessions) and are intellectually stimulated rather than overwhelmed by the content.
Although student evaluations can be a biased indicator of quality (Wachtel 1998) and their validity in evaluating teaching effectiveness has been repeatedly questioned (Hornstein 2017), the results do indicate that the course has been successful at achieving the desired objectives. Furthermore, research suggests that high course ratings in student evaluations are correlated with an increase in a course’s overall enrollment in the subsequent year (Brown and Kosovich 2015, Yu et al. 2021). Although MAOR has not been consistently offered, the quantitative evaluations indicate that it may be quite popular once teaching schedules stabilize.
In addition to the quantitative analysis of student course evaluation scores, a thematic analysis of student comments was performed (Braun and Clarke 2006). The relevant question, and student responses to that question, are given in Table 3, where each quote best captures multiple students’ sentiment. The qualitative analysis further suggests that the course has been successful at achieving the learning outcomes and following the five philosophical principles that helped guide the redesign. The results indicate that MAOR has made students more excited about the techniques and applications related to the field of data analytics, and it is apparent that, after taking the course, students still believe that the concepts they learned will be useful in their future careers. A common theme is that students appreciated the balance between the technical and managerial content, enjoyed the opportunity to implement analytics models, and valued the materials that were provided to them. In fact, the ability to complete assignments in Excel was especially important.
Table 3. Qualitative Comments from Student Evaluations

| What aspects of the COURSE did you like or dislike (Winter 2019)? | What aspects of the COURSE did you like or dislike (Fall 2021)? | What teaching and learning activities helped you learn in this course (Fall 2021)? |
|---|---|---|
“I literally felt I learnt a lot and something really applicable to real world...[makes comparison to previous version of course].” | “I like that the course provided an opportunity to explore the OpenSolver capability of Excel.” | “The asynchronous content was useful as was the professors ability to answer questions throughout lectures.” |
“The real examples from different industries. Machine learning.” | “Loved the content and the ability to program these models from scratch. Also enjoyed the method of delivery.” | “The teacher slides, excel examples, assignments were very helpful.” |
“The course was well structured with some real world problems to solve. The mathematical aspect, basic ML and modeling were thoroughly enjoyable and knowledgeable for any operations professional.” | “The course was practical with a lot of exercises and assignments which made the concepts clear.” | “Real world optimization problems. Building Models in class step by step with students.” |
“Loved this course, so useful! And makes you think. It has a good balance of quant and making managerial decisions.” | “Liked the class discussions and that materials were provided prior to class.” | “The [video] lectures and sample excel files were very helpful.” |
4.2. Student-Driven Course Updates
After the Winter 2019 version of the course, a theme emerged that slightly changed the curriculum for Fall 2021. The following comment summarizes the motivation behind these changes:
“I think more time could be spent on predictive analysis (machine learning), as it is more useful and difficult to learn. I feel it needs more time to teach and learn this topic.”
In the Winter 2019 version of MAOR, there was considerably less discussion of empirical risk minimization (Week 7) and machine learning operations (Week 12), whereas dimensionality reduction was not covered at all (Week 11). After adding these topics, the same sentiment was not expressed in the Fall 2021 course evaluations. Nevertheless, additional changes are being considered. In particular, efforts are focused on determining whether goal programming (Week 6) should be replaced. The original motivation for including this topic is that deviational variables reappear in quantile regression (Week 9). However, the discussion of soft constraints can be incorporated into some of the examples associated with linear programming (Weeks 3–5) and reinforced in Week 9.
Two potential candidates have emerged as replacements: stochastic programming and an additional session on data wrangling. With stochastic programming, more emphasis can be placed on how data are used in prescriptive analytics. There are also connections with the discussion of empirical risk minimization and the structure of supervised learning models. Finally, a similar lecture is given to students in a different graduate program with identical technical backgrounds. Alternatively, by including another session on data acquisition/wrangling, more emphasis can be placed on how to determine what data are required to answer complex business problems. Furthermore, additional detail can be given as to how to clean and visualize data for model engineering.
A few technically proficient students expressed their desire to see Python incorporated into the course curriculum. This feedback has not, and will not, be incorporated for two main reasons. First, a majority of students enjoyed the coverage of Excel (see, for example, Table 3). These students indicated that they were happy to improve their proficiency with spreadsheet modeling and, more importantly, would not take the course if coding knowledge became a prerequisite. This suggests that, for the most part, the course is appealing to the desired audience. Second, there are currently two master’s programs offered by the business school that require students to complete more mathematically and computationally challenging analytics courses. Knowledge of Python/R is expected, and MBA students are welcome to take those courses as electives if they desire.
4.3. Case-Based Assignments
Although the course deliverables are motivated by real-world applications, none of them are case based. To further bolster the managerial content of MAOR, assignments and/or exams could be derived from case studies. For instance, there is a plethora of cases at Harvard Business Publishing that require mathematical programming and machine learning methods. In fact, the author is exploring whether future iterations of the course will replace the assignment component with case studies. One reservation associated with making this change is that case studies are typically completed by groups of MBA students. As a consequence, not all students may get hands-on experience preparing and analyzing the data, implementing the prescriptive/predictive models, validating their effectiveness, and reporting the results; this is a serious concern.
4.4. Hybrid Delivery
In the Winter 2019 version of MAOR, there were no prerecorded video lectures. Instead, meeting sessions were three hours long, and all content was presented in a classroom setting. However, halfway through the semester, COVID-19 shut down all in-person lectures. As a result, prelecture videos were introduced, and meeting sessions (held over Zoom) were shortened to two hours. This format remained for the Fall 2021 version of the course, where meeting sessions were also delivered over Zoom. The following comment summarizes the consensus observed with this format:
“Liked the class discussions and that [the] materials were provided prior to class.”
Because of the almost unanimous support of the course structure, a hybrid delivery will be adopted going forward. As discussed in Section 3.3, the videos will cover more technical content, whereas the in-person sessions will be devoted to examples that use the methods in real-world applications.
5. Discussion and Conclusion
Mathematical programming constitutes a fundamental quantitative skill that MBA graduates should possess, and because of this, is taught pervasively in operations management programs throughout North America (Martinez de Albeniz Margalef 2011, Mark 2021). At the same time, data analytics is a nascent but in-demand field that has been shown to produce tangible business benefits (Kelly 2022). This paper introduces a new course geared toward MBA students that teaches them the fundamentals of prescriptive and predictive analytics using the unifying perspective of mathematical optimization. The curriculum is primarily based on a flipped classroom design (Akçayir and Akçayir 2018) where technical content is presented as short prelecture videos and real-world examples are solved collectively during meeting sessions. There are many experiential learning components as students get hands-on opportunities to solve quantitative problems with Excel through in-class examples, assignments, and exams. However, particular focus is placed on ensuring that managerial concepts in operations management are emphasized while also describing how to communicate analytics-driven insight to business managers. Finally, an overview of the analytics pipeline is presented, several AI tools (e.g., fraud detection, revenue management) are introduced, and broader ethical issues associated with their deployment are discussed.
Student feedback indicates that this course has achieved its learning objectives and has appropriately balanced the presentation of technical material with insightful commentary on managerial decision-making. The evaluations also suggest that the flipped classroom format has been successful and that the use of Excel has been well received. Interestingly, not one student commented on feeling mathematically unprepared, which suggests that the course may have strong appeal to nontechnical learners who want to gain a better understanding of the relationship between business problems and the application of analytical tools used for decision-making. Finally, it was observed that students appreciated the emphasis on real-world problems; this allows for rich discussions that emphasize the connections between analytics and modern topics in operations and supply chain management. This emphasis also means that, as compared with typical introductory operations courses offered in many MBA programs, larger and potentially more intricate models can be formulated.
The course aims to produce “data-savvy” managers (Manyika et al. 2011). To do so, MBA students must learn the fundamentals of analytics and receive hands-on training to frame problems and interpret the results while also connecting these quantitative skills to the broader business discipline (Wilder and Ozgur 2015a). In sum, they must relate the low-level implementation of mathematical models and computational algorithms to the high-level processes that govern how AI tools are deployed to extract insight, support data-driven decisions, and interact with an organization’s customers. By unifying the presentation of prescriptive and predictive analytics through the use of mathematical optimization, the course de-emphasizes mathematical knowledge and puts the focus on business-level problem-solving abilities and critical thinking skills. Furthermore, providing different perspectives—from micro (model engineering) to macro (analytics pipeline) processes—and incorporating multiple real-world examples allows MBA students to gain the confidence to contribute to analytics teams. This is imperative when working on large organizational projects that are far more intricate and technically complex. Although many business schools have different viewpoints on how best to prepare MBA students for such a position and an industry that increasingly values analytics and AI (Kurczy 2018), this course presents what are, hopefully, valuable insights on how such a curriculum can be designed and delivered.
I thank Amber Moore whose encouragement directly motivated the creation of this work, Andre Cire for careful review of the original manuscript, and the editor, associate editor, and anonymous reviewers whose comments and suggestions considerably improved the article.
Appendix A. Overview of OpenSolver’s Functionality
OpenSolver is an Excel add-in that extends the native solver tool in two notable ways: (i) there is no limitation on the number of decision variables and constraints that can be included in an optimization model; and (ii) there are several nonlinear solvers. It is also free and open source (Mason 2012). Students in MAOR are required to download the advanced version of OpenSolver to gain access to the linear and nonlinear solvers. After downloading the zip file and extracting the folder, students open the application by double-clicking on OpenSolver.xlam. If a security notice appears, Excel must be given permission to enable macros; the OpenSolver commands will then be visible in Excel’s Data tab on Windows or in the Menu Bar in the MacOS version.
The OpenSolver interface (Figure A.1) is intuitive and similar to the native Excel Solver tool (see the tutorial at https://opensolver.org/using-opensolver/). In fact, many students find it convenient to build larger models using the native solver tool and solve them with OpenSolver’s algorithms. Apart from the OpenSolver Model dialog box which is used to build the model, two key aspects of its functionality are highlighted:
Solver Engine: Several linear and nonlinear solvers are supported (Figure A.2); a summary of each algorithm is presented on the OpenSolver website. Although this is briefly touched on in Week 7, students are encouraged to try multiple algorithms when solving nonlinear problems as it is common for only a subset to work effectively and/or some may take longer to run.
Options: This opens a dialog box (Figure A.3) where users can set, for instance, the numerical precision associated with constraint satisfaction, the percentage tolerance of an integer solution (B&B %), and an option to make unconstrained variable cells non-negative.
As is apparent, using OpenSolver to solve optimization problems in Excel is no harder than learning how to formulate models with the native solver tool. Thus, while considerable time is spent discussing Solver and the reports (e.g., sensitivity) it generates (Week 2), little time is spent reviewing the layout of the OpenSolver Model dialog box. Instead, as previously described, the novel features of the software are highlighted. In addition, there are several unique issues that are reviewed:
There is limited support for the application of Excel functions with nonlinear models. Thus, students are encouraged to copy and paste the values of the cells containing the final data (CTRL + ALT + V; Values) after performing complex transformations. This is particularly important for feature engineering in Weeks 8–12, assignment 3, and the final exam.
All optimization models should refer to data that is on the same sheet as the model itself. Although this is not necessary for the native solver tool, this is a limitation of OpenSolver.
Although OpenSolver claims to work with the most recent MacOS and Windows distributions, issues may arise on certain computers and/or desktop versions. Thus, it is recommended that students have access to a school computer laboratory and/or loaner laptops where OpenSolver is known to work so that any deliverables that utilize the software can be completed. Nevertheless, this was not an issue in the Winter 2019 or Fall 2021 versions of the course, as OpenSolver worked on all students’ computers, a mix of Windows and MacOS machines.
Students may periodically formulate incorrect models that they may still attempt to solve. Thus, it is important that all models be tested prior to being assigned so that the expected runtime can be assessed. One can then provide guidance to the students as follows: “if the model does not solve within 30 seconds, stop it, and assume you have made an error.”
Appendix B. Description of Excel Examples Included in the Supplementary Files
In this section, details regarding the Excel files included as part of the online appendix are discussed. The first three examples are associated with the discussion of course content in Week 5 (see Section 3.3.1) and Week 9 (see Section 3.3.2), whereas the fourth example is associated with the motivation behind the questions on the final exam. For each application, some background is given (i.e., the problem setting, managerial issues, and the data that has been collected), the methodological focus is highlighted, instruction regarding the formulation of the model is provided, the use of Excel/OpenSolver is described, and some in-class discussion questions are posed.
B.1. Production and Transportation Example
Although this problem is fairly typical for MBA-level operations courses, details regarding this particular application are motivated by a personal relationship with an executive at Masters Best Friend. Consequently, some background information regarding the company is given, as well as the motivation behind the particular problem. Some simplifying assumptions have been made (e.g., production cost/capacity are identical across facilities), and all data are fictitious. The managerial decisions represent whether certain production facilities should be rented, whether certain orders should be satisfied, and how many units should be transported from each rented production facility to each customer that is served. Then, there is a brief class discussion on the data that has been obtained, how one would go about collecting this data in practice, and additional information that one may need to procure in order to have the model faithfully represent the real-world setting.
Although a detailed explanation of the steps associated with deriving the optimization model is presented in the supplementary file Week 5-Mixed Integer Linear Programs.pdf, a concise formulation is included here. Let yi equal one if the company rents production facility i and zero otherwise, let zj equal one if the company decides to serve customer j and zero otherwise, and let xij be the amount of dog food (pounds) shipped from production facility i to customer j. The fixed monthly cost for operating any production facility is $60,000. The monthly capacity of each plant is 2,500 pounds, and the production cost at any production facility is $10.25 per pound. After the product is manufactured, it is shipped to customers at a cost of $0.02 per pound per mile. Finally, define Qj to be the amount (in pounds) of dog food requested by customer j, Rj to be the total revenue associated with customer j, and tij to be the distance (miles) between production facility i and customer j. Then, the model is to

maximize Σj Rj zj − 60,000 Σi yi − Σi Σj (10.25 + 0.02 tij) xij (B.1)
subject to Σj xij ≤ 2,500 yi for all i, (B.2)
Σi xij = Qj zj for all j, (B.3)
xij ≥ 0, yi ∈ {0, 1}, zj ∈ {0, 1} for all i and j. (B.4)
From a technical perspective, the purpose of this example is to demonstrate the generality of the type of constraints that can be used to link decision variables in mathematical programming problems. Earlier in the session, an example with Big-M constraints is introduced that uses a single binary variable to toggle a single continuous decision variable to be some positive number or zero. In this example, each binary variable toggles multiple decision variables to be positive or, collectively, all zero. Furthermore, it reinforces the idea that a single model can be created to answer multiple managerial questions by incorporating multiple types/classes of decision variables. However, in order for the model to produce sensible results, the different variables must be linked.
The Excel spreadsheet (Figure B.1) in the online appendix that corresponds to this example is entitled Week 5-Production and Transportation.xlsx. Decision variables are in green-colored cells, constraints are in yellow-colored cells, and components of the objective are colored orange. Cells that do not contain any color hold the data that has been provided. The model contains 80 decision variables (16 are binary, 64 are continuous/integer) and 16 constraints. Because there are mixed variable types, the problem can be classified as a MILP. The constraints in cells I24:J31 (B.2) ensure that if a production facility is used, it cannot produce more than the maximum capacity (2,500 pounds); if the facility is not used, it should not produce any amount. Similarly, the constraints in cells B32:I34 (B.3) ensure that if a customer is served, all of the requested units are delivered; otherwise, the company should not deliver any units to that customer. Non-negativity and binary constraints (B.4) are also required; these can be implemented by selecting the Make unconstrained variable cells non-negative option, highlighting cells B20:I21, and adding a binary constraint. The objective is to maximize profit, which is the revenue less the fixed cost associated with renting a facility and the variable costs associated with production and transportation (see cell I43, which is the difference between the revenue, cell F45, and the three types of costs, cell I40).
The model can be implemented and solved using either the Excel Solver tool or OpenSolver. Figure B.1 presents a screenshot of the Excel spreadsheet where the model has been highlighted using OpenSolver’s Show/Hide Model button. The details of how the model is formulated in OpenSolver are given in Figure B.2. The interface and implementation are similar to those of the native Excel Solver tool; the only difference is that adding constraints does not require an extra dialog box.
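For readers who want to check the logic of the formulation outside Excel, the linking and capacity constraints can be replicated in code. The following is a minimal Python sketch that exhaustively searches a fictitious two-facility, two-customer instance; all data below are invented for illustration (the course example is much larger), and, for tractability, the sketch assumes each served customer is supplied entirely by a single facility.

```python
from itertools import product

# Fictitious mini-instance (invented data): 2 facilities, 2 customers.
FIXED_COST = 60_000          # monthly rent per facility ($)
CAPACITY = 2_500             # production capacity (pounds/month)
PROD_COST = 10.25            # production cost ($/pound)
SHIP_RATE = 0.02             # shipping cost ($/pound/mile)
demand = [1_200, 1_800]      # Q_j: pounds requested by customer j
revenue = [80_000, 95_000]   # R_j: revenue if customer j is served
dist = [[100, 400],          # t_ij: miles from facility i to customer j
        [350, 120]]

def profit(rent, serve, assign):
    """Profit of a rent/serve decision with customer j sourced from assign[j];
    returns None if a linking or capacity constraint is violated."""
    used = [0.0] * len(rent)
    total = -FIXED_COST * sum(rent)
    for j, on in enumerate(serve):
        if not on:
            continue
        i = assign[j]
        if not rent[i]:
            return None              # cannot ship from an unrented facility
        used[i] += demand[j]
        total += revenue[j] - demand[j] * (PROD_COST + SHIP_RATE * dist[i][j])
    if any(u > CAPACITY for u in used):   # capacity constraints (B.2)
        return None
    return total

# Brute force over all rent/serve/single-sourcing combinations.
best = max(
    (p, rent, serve, assign)
    for rent in product([0, 1], repeat=2)
    for serve in product([0, 1], repeat=2)
    for assign in product(range(2), repeat=2)
    if (p := profit(rent, serve, assign)) is not None
)
print(best[:3])   # best profit, facilities rented, customers served
```

Because the search enumerates every combination, it is only viable for tiny instances; the point is to make the linking between the binary and continuous variables concrete, not to replace a MILP solver.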
In addition to examining how Solver came to the optimal solution (interpretability) and the sensitivity of the solution to changes in model parameters (as discussed in Section 3.3.1), student feedback is solicited on how one would go about showing the robustness of these results and how best to present these findings to executives. Additional ideas for classroom discussion include the following:
How can the company automate this monthly decision-making process?
What additional information should be collected to potentially make more informed choices?
What are some threats to the validity of the results from the model (e.g., changes to cost parameters, production capacities)? How can these threats be mitigated?
Why may it be important to create a model such as this for use as a decision-making aid?
B.2. Regularized Regression Example
The setting and data set are derived from the paper by Cortez and Silva (2008). Nevertheless, it is important to emphasize what managerial questions this type of analysis is attempting to answer, how pervasive these types of inquiries are (i.e., predicting student/employee performance given a large data set of features), and how difficult it is to draw causal conclusions from the results. From a technical perspective, the purpose of this example is to demonstrate how to apply optimization models for predictive purposes using a machine learning (i.e., train, test, evaluate) approach. To this end, a detailed explanation of the steps associated with formulating optimization models of ordinary least squares, lasso, ridge, and best-subset selection regression is given in the first half of the supplementary file Week 9-Regression Models.pdf; a summary is given in Section 3.3.2. The corresponding Excel spreadsheet, Week 9-Regularized Regression.xlsx, is also provided and is discussed here.
The Excel file has nine spreadsheets. The first two contain the raw data set and a reference to where the data were obtained. The raw data are split into a training (80%) and a testing (20%) set. In each of these spreadsheets, columns A:AF contain the raw data: there are 315 instances in the Training spreadsheet and 80 instances in the Testing spreadsheet. In both spreadsheets, columns AH:BO contain the feature engineered data, where various transformations have been applied (e.g., normalization, one-hot encoding). There are many possible “correct” transformations, and it is important to discuss the reasoning behind permissible choices. The transformed data in columns AH:BN are used as features to predict the outcome variable (Grade) in column BO.
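The splitting and feature-engineering steps described above can be sketched in a few lines of Python. The helper names below (train_test_split, min_max, one_hot) are illustrative, not part of the course materials, and the transformations shown (min-max normalization and one-hot encoding) are two of the many permissible choices.

```python
import random

def train_test_split(rows, test_frac=0.2, seed=42):
    """Shuffle and split rows, mirroring an 80/20 train/test split."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_frac))
    return rows[:cut], rows[cut:]

def min_max(values):
    """Normalize a numeric column to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """One-hot encode a categorical column into 0/1 indicator columns."""
    levels = sorted(set(values))
    return {lev: [1 if v == lev else 0 for v in values] for lev in levels}

train, test = train_test_split(list(range(100)))
print(len(train), len(test))                 # 80 20
print(min_max([15, 16, 17, 18]))             # [0.0, 0.333..., 0.666..., 1.0]
print(one_hot(["urban", "rural", "urban"]))  # {'rural': [0, 1, 0], 'urban': [1, 0, 1]}
```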
Four different predictive models are trained (see the spreadsheets OLS Model, Ridge Model, Lasso Model, and Subset Model). Decision variables in these spreadsheets are in green-colored cells, constraints are in yellow-colored cells, and the objective function is in an orange-colored cell. In each spreadsheet, the actual outcome from the training set is referenced in column A and is denoted by yi for instance i. The predicted outcome for instance i, denoted ŷi, from applying the regression model is given in column B; it is estimated as ŷi = α + Σk βk xik, where α is the intercept, βk is the regression coefficient of feature k, and xik is the value of feature k for instance i.
The optimization model in OLS Model minimizes the unconstrained sum of squared differences between the predicted and actual outcomes to determine α and β. The other spreadsheets incorporate regularization constraints as outlined in Section 3.3.2. Additional decision variables, γ, and the relevant constraints are introduced in Lasso Model and Subset Model. In the former spreadsheet, the γk are continuous and represent the absolute values of the regression coefficients (i.e., γk ≥ βk and γk ≥ −βk), so that their sum can be bounded by the lasso budget. In the latter spreadsheet, the γk are binary decision variables that toggle whether certain features are included in the model. The ridge constraint is given in cells E7:F7 (Ridge Model), the lasso constraint in cells E13:F13 (Lasso Model), and the constraint that controls the number of features included in the model is given in cells E13:F13 (Subset Model). Both the Excel Solver tool and OpenSolver can be used, although it is important to leave the Make Unconstrained Variables Non-Negative option unchecked and to select a nonlinear solution method (e.g., GRG Nonlinear, COIN-OR Bonmin).
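Outside the spreadsheet environment, the same estimation problems can be solved numerically. The sketch below, which assumes NumPy is available, fits OLS via least squares and ridge regression via its penalized closed form; the synthetic data and the decision to leave the intercept unpenalized are illustrative choices. (The spreadsheets impose ridge as a budget constraint, whereas this sketch uses the equivalent penalty form; lasso and best-subset have no closed forms and would require an iterative or integer-programming solver, as in the spreadsheets.)

```python
import numpy as np

def fit_ols(X, y):
    """OLS: minimize the sum of squared residuals (intercept prepended)."""
    A = np.column_stack([np.ones(len(X)), X])    # intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef                                   # [alpha, beta_1, ..., beta_p]

def fit_ridge(X, y, lam):
    """Ridge regression via its closed form; alpha is left unpenalized."""
    A = np.column_stack([np.ones(len(X)), X])
    P = lam * np.eye(A.shape[1])
    P[0, 0] = 0.0                                 # do not shrink the intercept
    return np.linalg.solve(A.T @ A + P, A.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + 0.01 * rng.normal(size=50)
print(np.round(fit_ols(X, y), 2))         # close to [1, 2, -3]
print(np.round(fit_ridge(X, y, 5.0), 2))  # coefficients shrunk toward zero
```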
Running the optimization models provides estimates of α and β subject to different assumptions about regularization. To assess each model’s performance, it is applied to the test set data; this is done in the Validation spreadsheet. The actual outcome from the test set is referenced in column A and is denoted by yi for instance i. The predicted outcome for instance i, ŷi, is obtained by applying the trained regression model to data from the test set. The value is obtained by adding α to the SUMPRODUCT of the estimated regression coefficients and the values of the features from the test set associated with instance i. The predicted grades for each regression model are given in columns C:F. All models are compared with a naive benchmark that estimates the mean grade from the training set (cell T14) and uses this value to predict grades in the test set. This benchmark provides an objective measure of the value of incorporating the set of features in the prediction model. Then, the absolute value (columns H:L) and squared difference (columns N:R) between yi and ŷi are calculated for each instance to facilitate the computation of various performance metrics, including the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and coefficient of determination (R2). The four prediction models and the benchmark are then compared in cells U5:Y10 (see the screenshot in Figure B.3).
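The performance metrics computed in the Validation spreadsheet can be expressed compactly in Python. The function below is a plain re-implementation of the standard formulas, with a mean-value benchmark included for comparison; for simplicity the benchmark here uses the mean of the test outcomes (so its R2 is exactly zero), whereas the spreadsheet uses the training-set mean. All numbers are invented.

```python
import math

def regression_metrics(actual, predicted):
    """MAE, MSE, RMSE, and R^2 for a vector of predictions."""
    n = len(actual)
    errs = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errs) / n
    mse = sum(e * e for e in errs) / n
    mean_y = sum(actual) / n
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse),
            "R2": 1 - (mse * n) / ss_tot}

actual = [12, 15, 9, 14]                           # invented test grades
naive = [sum(actual) / len(actual)] * len(actual)  # mean-value benchmark
model = [11, 14, 10, 15]                           # invented predictions
print(regression_metrics(actual, naive)["R2"])     # 0.0 by construction
print(regression_metrics(actual, model))
```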
Although the regularized models (either ridge or lasso) exhibit only slightly better performance than OLS, the test set metrics do indicate that the outcome variable can be predicted fairly accurately (e.g., there is a significant difference between the benchmark’s performance metrics (see the Average column) and those of the regularized models). Furthermore, by experimenting with different budget parameters for the regularization constraints (i.e., hyperparameters), one can investigate the tradeoff between bias and variance. In fact, these observations can lead to a discussion of how to choose the best hyperparameter values (e.g., splitting the data set into three groups instead of two, cross-validation). It is also important to identify the most relevant features for predicting grades (e.g., parental education level, urban/rural home location, number of student absences, the amount of study time per week, whether the student has university aspirations) and then to solicit opinions as to what actionable insights can be derived from the analysis. As indicated in Section 3.3.2, a broader dialogue on controllable versus uncontrollable factors, incentivizing individuals, and even behavioral nudges can be undertaken. For instance, some ideas for classroom discussion include the following:
From a qualitative perspective, what do you notice about the independence of some of the most important features used for predicting lower grades? What does this indicate?
How can a model such as this be used to provide support to certain student groups? How can the results be misused or misapplied (e.g., exacerbate existing inequalities)?
How should this information be presented to educators, parents, and school administrators?
What behavioral nudge do you think would work best with this population and why?
Weekly updates (one sentence) about a child’s performance (Kraft and Rogers 2015).
Educating students on the struggles of well-known scientists (Lin-Siegler et al. 2016).
Sending parents weekly text reminders to engage in literary activity (York et al. 2019).
Are there any general takeaways that come to mind if one were to perform a similar analysis with the goal of assessing worker performance in other industries (e.g., manufacturing)?
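The hyperparameter-selection idea raised in the discussion (e.g., cross-validation) can be made concrete with a short sketch. Everything here is illustrative: the k-fold splitter and the cross_validate helper are not part of the course files, and the toy fit/score functions in the usage lines stand in for training a regularized model and measuring its validation error.

```python
import random

def k_fold_indices(n, k=5, seed=1):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def cross_validate(budgets, fit, score, n, k=5):
    """Average the validation error of each hyperparameter; return the best."""
    avg = {b: sum(score(fit(train, b), val)
                  for train, val in k_fold_indices(n, k)) / k
           for b in budgets}
    return min(avg, key=avg.get), avg

# Toy stand-ins: "fitting" just returns the budget, and the "error" is
# minimized at budget 3 -- placeholders for real training and validation.
best, avg = cross_validate([1, 2, 3, 4],
                           fit=lambda train, b: b,
                           score=lambda model, val: (model - 3) ** 2,
                           n=20)
print(best)   # 3
```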
B.3. k-Nearest Neighbor (kNN) Regression Example
The setting and data set are identical to what is described in the Regularized Regression Example (see Appendix B.2). The difference is that instead of specifying a function class whose parameters must be estimated, a data-driven approach is used. That is, the kNN method is a nonparametric model class that uses the data to predict outcomes by averaging observations in the same neighborhood. A detailed explanation of the steps associated with formulating the kNN model is given in the supplementary file Week 9-Regression Models.pdf; a summary is given in Section 3.3.2. The corresponding Excel spreadsheet, Week 9-kNN Regression.xlsx, is also provided and is discussed here.
The Excel file has five spreadsheets. The first two contain the raw data set and references to where the data were obtained. In contrast to the Regularized Regression Example, the Training spreadsheet contains 394 of 395 instances; the goal is to predict the outcome of the single remaining test set instance (row 2 in the Prediction spreadsheet). Similar to the Regularized Regression Example, columns A:AF in the Training and Prediction spreadsheets contain the raw data set values, while columns AH:BO contain the feature engineered data. In the Prediction spreadsheet, the transformed data are copied to cells AH3:BO395, and the Euclidean distance is calculated between each instance of the training set and the single test set instance whose outcome we want to predict (column BP).
The final spreadsheet, KNN Model, presents the optimization model used to calculate the predicted response for the test set instance. Column A references the Euclidean distance from the Prediction spreadsheet, and column B is the grade (outcome variable) associated with training instance i. The decision variables, column C, are in green-colored cells. They represent binary variables that indicate whether training instance i is used to predict the grade associated with the test instance. There are 394 decision variables, which implies that OpenSolver must be used. There are two constraints, both of which are in yellow-colored cells. The first (cell F3) computes the predicted outcome by averaging the grades of instances in the same neighborhood as the test instance. The second (cell F4) determines how many neighbors must be used (cell I1) for predicting the outcome. The objective, the orange cell, is to minimize the sum of Euclidean distances between the test set instance and the neighbors used to predict its outcome (cell F1).
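Because the optimal solution of this model is simply the k training instances closest to the test instance, the same prediction can be computed directly. The following Python sketch, on invented two-feature data (the course example uses 394 instances and many more features), sorts by Euclidean distance and averages the k nearest outcomes.

```python
import math

def knn_predict(train_X, train_y, query, k):
    """Average the outcomes of the k training instances nearest to query."""
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )
    return sum(y for _, y in by_distance[:k]) / k

# Invented toy training data: three nearby instances and one far outlier.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [10, 12, 14, 40]
print(knn_predict(train_X, train_y, (0.4, 0.4), k=3))   # 12.0
print(knn_predict(train_X, train_y, (0.4, 0.4), k=4))   # 19.0 (smoothed by the outlier)
```

Comparing the two calls illustrates the tradeoff discussed in the text: a small k tracks close (possibly noisy) matches, whereas a large k averages in distant instances and smooths the prediction.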
A screenshot from the OpenSolver Model dialog box is presented in Figure B.4. Because the model is linear, the COIN-OR CBC (Linear Solver) engine can be selected. After formulating the model and selecting “Save Model,” the “Solve” button should be pressed (Figure A.1). The prediction for the test set instance is in cell F3 and can be compared with the actual grade (cell I2). Running the model with different values of k (cell I1) and comparing the prediction with the actual outcome illustrates the tradeoff between matching instances based mostly on noisy signals (small values of k) and producing excessively smoothed predictions (large values of k).
The resulting discussion focuses on data (see Section 3.3.2) and the differences between parametric and nonparametric prediction models. Other function classes are briefly introduced (e.g., decision trees, neural networks), and the tradeoffs between these prediction machines, such as capacity versus interpretability, are discussed. For instance, some ideas for classroom discussion include the following:
In what settings would a nonparametric model (e.g., kNN regression) be preferred as compared with a parametric model (e.g., regularized OLS regression)?
What are the benefits/drawbacks to training a prediction model that uses a function class that is interpretable but has low capacity (i.e., may not be able to learn complicated relationships)?
What are the benefits/drawbacks to training a prediction model that has a high capacity (i.e., can learn complicated relationships) but is not interpretable?
B.4. Sample Final Exam
There are three questions associated with this final exam. The answer key (Exam Questions and Answers-Fall 2021.pdf) and the Excel spreadsheet (Exam Excel-Fall 2021.xlsx) are included in the online appendix. Although the answer key is fairly technical, this is for completeness and to ensure that the answers are clear; students are not required to submit mathematical programming formulations to obtain full marks. The models are implemented in Excel; decision variables are in green-colored cells, constraints are in yellow-colored cells, and the objective function is in an orange-colored cell. Finally, the exam is designed such that all Excel models can be solved using either the Solver tool or OpenSolver.
Q1: This question asks students to formulate a nonlinear pricing optimization model which expands on an example introduced in Week 7. The goal is to determine a dynamic pricing strategy for two packs of day-old jelly-filled and regular flavored donuts (there is no mixing of flavors) such that, given their linear price-response functions, total profit is maximized in the final six hours before a donut store closes. The subquestions foster both data-driven insight (e.g., the selling price of freshly baked donuts) and managerial insight (e.g., the pros/cons of the specified revenue management strategy). In particular, they ensure that students see the connection between analytics and operations-focused issues, that is, business-level critical thinking skills are relevant to fully and correctly answer this question. The corresponding Excel model is provided in the Q1-Price Optimization spreadsheet (a screenshot is presented in Figure B.5).
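To illustrate the style of reasoning Q1 requires (without reproducing the exam data), consider a single product with a linear price-response function d(p) = a − b·p. Because day-old inventory is a sunk cost, maximizing profit reduces to maximizing revenue, and the concave revenue curve peaks at p = a/(2b). The coefficients below, and the assumption that willingness to pay declines each hour, are hypothetical.

```python
def revenue(price, a, b):
    """Revenue at a given price under the linear response d(p) = a - b*p."""
    return price * max(a - b * price, 0.0)

# Hypothetical price-response coefficients for one flavor of two-packs.
a, b = 60.0, 12.0                 # demand per hour = 60 - 12 * price
best_price = a / (2 * b)          # vertex of the concave revenue curve
print(best_price, revenue(best_price, a, b))   # 2.5 75.0

# Re-optimizing each hour as willingness to pay declines (dynamic pricing).
total = 0.0
for hour in range(6):
    a_h = a - 5 * hour            # intercept drops by 5 each hour (assumed)
    p_h = a_h / (2 * b)
    total += revenue(p_h, a_h, b)
print(round(total, 2))            # about 291.15
```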
Q2: This question asks students to formulate a ridge logistic classification model to predict patient survival for individuals living with heart failure and who request paramedic service. The goal is to create a machine learning model that can assess whether these patients will be able to make their own way to the hospital in an emergency instead of requiring an ambulance (the problem is motivated by several articles). The example combines the introduction of classification models (Week 10) with regularization constraints (Week 9). The data set is given in the Q2-Raw Data spreadsheet. The raw training data, the corresponding transformed variables, and the formulation of the ridge logistic regression model (Figure B.6) are presented in the Q2-Training (Ridge) spreadsheet. The required transformations for feature engineering (Week 8) are specified on the exam handout. The raw testing data, the corresponding transformed variables, and measures of predictive performance (e.g., confusion matrix, accuracy, precision, recall) are presented in the Q2-Testing (Ridge) spreadsheet. The subquestions give students the opportunity to implement multiple components of the analytics pipeline (data preparation, feature engineering, model implementation). They also encourage students to reflect on the results (performance metrics) in order to make analytics-driven recommendations on whether the trained model should be deployed.
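A ridge logistic model of the kind Q2 asks for can be prototyped outside Excel as well. The sketch below, on a tiny invented data set, minimizes the log-loss plus an L2 penalty by gradient descent (the spreadsheet instead imposes ridge as a constraint handled by Solver) and reports the classification metrics named above; all function names and data are illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, iters=2000):
    """Gradient descent on log-loss + lam * sum(beta^2).
    The intercept w[0] is left unpenalized."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            pred = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += pred - yi
            for j in range(p):
                grad[j + 1] += (pred - yi) * xi[j]
        for j in range(1, p + 1):
            grad[j] += 2 * lam * w[j]          # ridge penalty gradient
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def classify(w, xi, threshold=0.5):
    return int(sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) >= threshold)

def classification_metrics(actual, predicted):
    """Accuracy, precision, and recall from the confusion-matrix counts."""
    tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
    tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))
    fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))
    fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
    return {"accuracy": (tp + tn) / len(actual),
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0}

# Tiny invented example: one feature, binary survival flag.
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [0, 0, 1, 1]
w = fit_ridge_logistic(X, y)
preds = [classify(w, xi) for xi in X]
print(preds, classification_metrics(y, preds))   # [0, 0, 1, 1] ...
```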
Q3: This question asks students to implement a k-means clustering algorithm (nonlinear optimization) that is a direct application of an example introduced in Week 11. In particular, the length, width, height, and weight of 30 packages are given, and the goal is to (i) cluster the packages into exactly one of three groups and (ii) find the centroid values of each group. The raw and transformed data set is given in the Q3-Data spreadsheet, whereas the implementation of the unsupervised learning model (Figure B.7) is formulated in the Q3-Packages spreadsheet. The required transformations for feature engineering (Week 8) are specified on the exam handout. In the Q3-Packages spreadsheet, there are 90 binary decision variables, J2:L31, indicating the cluster to which each instance should be assigned. Continuous variables P5:S7 define the centroid coordinates of each cluster, whereas column M defines constraints that ensure each instance is assigned to exactly one cluster. The objective is to define 12 coordinate values (four for each centroid) and assign each of the 30 instances to one of three clusters such that the sum of squared distances is minimized. The purpose of this question is to recreate a model that was discussed during an in-class session and to expand on that model by altering the constraint set. Thus, it reinforces the benefit of using mathematical programming for algorithmic transparency in predictive modeling (Molnar 2020).
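In practice, the clustering task in Q3 is usually solved heuristically by Lloyd's algorithm rather than by exact optimization; sketching it clarifies what the Solver model is searching for. The package measurements below are invented and pre-normalized, and the naive seeding (first k points) is used only because the toy groups are well separated.

```python
import math

def k_means(points, k=3, iters=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid (mean) updates until the centroids stop moving."""
    centroids = points[:k]        # naive seeding; fine for separated toy data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for pt in points:         # assign each point to its nearest centroid
            i = min(range(k), key=lambda c: math.dist(pt, centroids[c]))
            clusters[i].append(pt)
        new = [tuple(sum(d) / len(cl) for d in zip(*cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]
        if new == centroids:      # converged
            break
        centroids = new
    return centroids, clusters

# Invented, pre-normalized (length, width, height, weight) measurements,
# ordered so the first three points fall in different groups.
pts = [(0.10, 0.10, 0.10, 0.10), (0.50, 0.50, 0.50, 0.50),
       (0.90, 0.90, 0.90, 0.90), (0.15, 0.10, 0.12, 0.10),
       (0.55, 0.50, 0.48, 0.50), (0.85, 0.92, 0.90, 0.88)]
centroids, clusters = k_means(pts)
print(sorted(len(c) for c in clusters))   # [2, 2, 2]
```

Unlike the Solver formulation, Lloyd's algorithm only guarantees a local optimum; contrasting the two is one way to motivate the exam's emphasis on algorithmic transparency.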
Appendix C. Discussion Sessions
C.1. Week 8: Introduction to Prediction
C.1.1. Video.
Defining predictive analytics (the training of a computational model using a historical data set that generalizes a decision rule against a given performance metric by anticipating unseen future inputs), clearly defining the prediction problem (what objective are we serving? what are we trying to do for the end-user? why is the problem important?), acquiring a data set (how do we decide what data to collect and how much is needed?), secure storage (where do we store the data?), and privacy compliance (how should sensitive information be protected?).
C.1.2. Class.
Data wrangling (reformatting particular attributes and correcting errors in the data set, what to do with missing values), feature engineering, irrelevant data and outliers, overview of model engineering (train, test, evaluate), data leakage, the difference between prediction (estimating an outcome based on the association between a dependent variable and set of independent features) and causality (identifying mechanisms through which certain responses are observed), evaluating model performance (e.g., root mean squared error, coefficient of determination, confusion matrix, precision, recall), and modeling bias (Shepperd 2021).
C.2. Week 12: Introduction to Machine Learning Operations
C.2.1. Video.
Overview of the analytics pipeline and the differences between recommendations for business insight versus deploying an AI tool for widespread adoption/interaction.
C.2.2. Class.
How to present/write technical insights for business audiences if the use case is for generating analytics-based recommendations (Hager et al. 1997). Some AI tools are deployed, however, and real-world examples include COMPAS to predict recidivism (Larson et al. 2016, Rudin et al. 2018), recommendation algorithms (Naumov et al. 2019, Anand et al. 2021), AI tools for automating medical diagnoses (Babier et al. 2020, Gomolin et al. 2020), fraud detection (Soviany 2018, Hasan and Rizvi 2022), vehicle routing (Holland et al. 2017, Ellegood et al. 2020), and revenue management and pricing (Carrier and Fiig 2018, Ammirato et al. 2020). Because of time constraints, only two applications are covered during the in-class session. However, because of the high-level presentation associated with these examples, some of this content has been taught in other courses.
For each application, the AI tool and its managerial purpose are described, the concepts of model drift (predictive) and the need for reparameterization (prescriptive) are introduced, and quality management (detecting issues, retraining) is discussed (Luigi 2019). Finally, students are introduced to the ethical issues surrounding AI systems, such as the possibility that decisions can reinforce existing forms of prejudice and amplify inequality (O’Neil 2016).
References
- (2020) Learning optimal classification trees: Strong max-flow formulations. Preprint, submitted May 13, https://arxiv.org/abs/2002.09142.
- (2006) Why, when, and what to outsource, Chapter 11. Schniederjans A, Schniederjans D, Schniederjans M, eds. Outsourcing Management Information Systems (Idea Group Publishing, Hershey, PA).
- (2018) The flipped classroom: A review of its advantages and challenges. Comput. Ed. 126:334–345.
- (2020) A systematic literature review of revenue management in passenger transportation. Meas. Bus. Excell.
- (2021) AI based music recommendation system using deep learning algorithms. Proc. IOP Conf. Series Earth Environment Sci., vol. 785, 012013. https://iopscience.iop.org/article/10.1088/1755-1315/785/1/012013/pdf.
- (2020) Knowledge-based automated planning with three-dimensional generative adversarial networks. Medical Phys. 47(2):297–306.
- (2017) Optimal classification trees. Machine Learn. 106(7):1039–1082.
- (2019) Machine Learning Under a Modern Optimization Lens (Dynamic Ideas, Charlestown, MA).
- (2016) OR forum—An algorithmic approach to linear regression. Oper. Res. 64(1):2–16.
- (2020) Machine learning in oncology: Methods, applications, and challenges. JCO Clinical Cancer Inform. 4:885–894.
- (2016) The Analytics Edge (Dynamic Ideas, Charlestown, MA).
- (2021) Interpretable clustering: An optimization approach. Machine Learn. 110(1):89–138.
- (2019) Machine learning under a modern optimization lens. Accessed April 5, 2023, https://bit.ly/3XRFjbO.
- (2013) The flipped classroom: A survey of the research. Proc. ASEE Annual Conf. & Exposition, 23–1200. https://peer.asee.org/the-flipped-classroom-a-survey-of-the-research.
- (2023) Introducing and integrating machine learning in an operations research curriculum: An application-driven course. INFORMS Trans. Ed. 23(2):64–83.
- (2006) Using thematic analysis in psychology. Qual. Res. Psych. 3(2):77–101.
- (2015) The impact of professor reputation and section attributes on student course selection. Res. Higher Edu. 56(5):496–509.
- Brusco M (2022) Logistic regression via Excel spreadsheets: Mechanics, model selection, and relative predictor importance. INFORMS Trans. Ed. 23(1):1–11.
- (2017) Let’s stop trying to be “sexy”: Preparing managers for the (big) data-driven business era. Bus. Processing Management J. 23(3):598–622.
- Carrier E, Fiig T (2018) Future of airline revenue management. J. Revenue Pricing Management 17:45–47.
- (2018) The measure and mismeasure of fairness: A critical review of fair machine learning. Preprint, submitted August 14, https://arxiv.org/abs/1808.00023.
- Cortez P, Silva A (2008) Using data mining to predict secondary school student performance. Accessed April 5, 2023, http://www3.dsi.uminho.pt/pcortez/student.pdf.
- (2023) Estimating causal effects with optimization-based methods: A review and empirical comparison. Eur. J. Oper. Res. 304(2):367–380.
- (2020) Using the learning assistant model in an undergraduate business analytics course. INFORMS Trans. Ed. 20(3):125–133.
- (2009) ImageNet: A large-scale hierarchical image database. Proc. IEEE Conf. on Comput. Vision and Pattern Recognition (IEEE, Piscataway, NJ), 248–255.
- (2018) Optimal trees for prediction and prescription. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
- (2020) School bus routing problem: Contemporary trends and research directions. Omega 95:102056.
- (2020) Behavioral nudges reduce failure to appear for court. Science 370(6517):eabb6591.
- (2012) MBA in business analytics: Big data are a big deal. Accessed April 5, 2023, https://bit.ly/3PRb7ZL.
- (2017) Microsoft Excel®: Is it an important job skill for college graduates? Inform. Systems Ed. J. 15(3):55.
- (2008) Teaching problem solving techniques to MBA students through supply chain management. INFORMS Trans. Ed. 8(2):65–74.
- (2021) Optimization problems for machine learning: A survey. Eur. J. Oper. Res. 290(3):807–828.
- (2020) Artificial intelligence applications in dermatology: Where do we stand? Frontiers Medicine 7:100.
- Gurobi Optimization LLC (2022) Gurobi optimizer reference manual. Accessed April 5, 2023, https://www.gurobi.com.
- (1997) Designing & Delivering: Scientific, Technical, and Managerial Presentations (John Wiley & Sons, Hoboken, NJ).
- (2022) AI-driven fraud detection and mitigation in e-commerce transactions. Proc. Data Analytics and Management: ICDAM 2021, vol. 1 (Springer, Berlin), 403–414.
- (2010) MBA: Past, present and future. Acad. Ed. Leadership J. 14(1):63.
- (2017) UPS optimizes delivery routes. Interfaces 47(1):8–23.
- (2016) Learning to code at a business school. Accessed April 5, 2023, https://bit.ly/3oKwAYw.
- (2017) Student evaluations of teaching are an inadequate assessment tool for evaluating faculty performance. Cogent Ed. 4(1):1304016.
- (2020) Case article—Converting NFL point spreads into probabilities: A case study for teaching business analytics. INFORMS Trans. Ed. 21(1):45–48.
- (2017) Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark (O’Reilly Media, Newton, MA).
- (2020) Healthcare AI systems are biased. Sci. Amer. 11:17.
- (2022) Data analytics is a hot, sexy sector for career growth. Accessed April 5, 2023, https://bit.ly/3bqV1ar.
- (2020) Is an MBA degree really worth it? Accessed April 5, 2023, https://bit.ly/3cXlNHQ.
- (2018) Case article—Business value in integrating predictive and prescriptive analytics models. INFORMS Trans. Ed. 19(1):36–42.
- Kraft MA, Rogers T (2015) The underutilized potential of teacher-to-parent communication: Evidence from a field experiment. Econom. Ed. Rev. 47:49–63.
- (2018) MBAs who code. Accessed April 5, 2023, https://bit.ly/3d1djj3.
- (2019) An integrative framework for artificial intelligence education. Proc. Conf. AAAI Artificial Intelligence (AAAI Press, Palo Alto, CA), vol. 33, 9670–9677.
- Larson J, Mattu S, Kirchner L, Angwin J (2016) How we analyzed the COMPAS recidivism algorithm. ProPublica 9(1):3.
- (2016) Business Analytics for Managers: Taking Business Intelligence Beyond Reporting (John Wiley & Sons, Hoboken, NJ).
- (2018) Integrating business analytics in the marketing curriculum: Eight recommendations. Marketing Ed. Rev. 28(1):6–13.
- (2012) An evidence-based incentive system for Medicare’s end-stage renal disease program. Management Sci. 58(6):1092–1105.
- (2008) Teaching business modeling using spreadsheets. INFORMS Trans. Ed. 9(1):20–34.
- (1999) The teachers’ forum: Breaking the mold—A new approach to teaching the first MBA course in management science. Interfaces 29(4):99–116.
- (2016) Even Einstein struggled: Effects of learning about great scientists’ struggles on high school students’ motivation to learn science. J. Ed. Psych. 108(3):314.
- Longwood University (2021) What can I do with an MBA in data analytics? Accessed April 5, 2023, https://bit.ly/3QdSKOs.
- Luigi (2019) The ultimate guide to model retraining. Accessed April 5, 2023, https://mlinproduction.com/model-retraining/.
- (2019) What do we do about the biases in AI? Accessed April 5, 2023, https://bit.ly/3oK0KLF.
- 2011) Big data: The next frontier for innovation, competition, and productivity. Accessed April 5, 2023, https://mck.co/3vxARlU.Google Scholar (
- 2021) What is a typical MBA program curriculum? Accessed April 5, 2023, https://bit.ly/3Sl6Tv6.Google Scholar (
- 2019) Artificial Intelligence in Practice (Wiley, New York).Google Scholar (
- 2011) Linear programming basics. Case study, Harvard Bus. Rev. https://store.hbr.org/product/linear-programming-basics/IES442.Google Scholar (
- Mason AJ (2012) Opensolver-an open source add-in to solve linear and integer programmes in excel. Klatte D, Lüthi H-J, Schmedders K, eds. Selected Papers Internat. Conf. Oper. Res. (OR 2011) (Springer-Verlag, Berlin), 401–406.Google Scholar
- (2020) Equalizing data science curriculum for computer science pupils. Proc. 20th Koli Calling Internat. Conf. Comput. Ed. Res. (ACM, New York), 1–5.
- (2017) The Quant Crunch: How the Demand for Data Science Skills Is Disrupting the Job Market (Burning Glass Technologies, Boston).
- (2021) Predictors of choosing business analytics concentration and consequent academic performance. INFORMS Trans. Ed. 21(3):130–144.
- (2021) Algorithmic nudges don’t have to be unethical. Accessed April 5, 2023, https://bit.ly/3oPRky9.
- Molnar C (2020) Interpretable Machine Learning (Leanpub Publishing). https://books.google.ca/books?id=jBm3DwAAQBAJ.
- (2022) MS Excel as a modern data analysis tool? Accessed April 5, 2023, https://bit.ly/3oKpDqm.
- (2014) Why MBA students are learning to code. Accessed April 5, 2023, https://bit.ly/3BATJnU.
- (2022) An interactive spreadsheet model for teaching classification using logistic regression. Working paper. Accessed April 5, 2023, https://bit.ly/3KenIoq.
- Nagpal A, Gabrani G (2019) Python for data analytics, scientific and technical applications. 2019 Amity Internat. Conf. Artificial Intelligence (AICAI) (IEEE, Piscataway, NJ), 140–145.
- (2019) Deep learning recommendation model for personalization and recommendation systems. Preprint, submitted May 31, https://arxiv.org/abs/1906.00091.
- (2012) An application of mathematical model for decision in building plan design. Procedia Soc. Behav. Sci. 68:537–548.
- (2022) What to look for in a business analytics curriculum. Accessed April 5, 2023, https://bit.ly/3BzLVTj.
- O’Neil C (2016) Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (Broadway Books, New York).
- (2022) Case article—Machine learning, ethics, and change management: A data-driven approach to improving hospital observation unit operations. INFORMS Trans. Ed. 22(3):178–187.
- (2019) Automated pathologist scheduling at The Ottawa Hospital. INFORMS J. Appl. Analytics 49(2):93–103.
- (2020) Analytics curriculum for undergraduate and graduate students. Decision Sci. J. Innovative Ed. 18(1):22–58.
- Press G (2013) A very short history of data science. Accessed April 5, 2023, https://bit.ly/3SjkyTG.
- Questrom School of Business (2022) Preparation for the online MBA. Accessed April 5, 2023, https://bit.ly/3zn1NWx.
- (2005) Professional decision modeling: Details of a short MBA practice course. INFORMS Trans. Ed. 6(1):35–52.
- (2021) Using Moneyball to introduce students to data analytics: Illustrating the data analytics life cycle. INFORMS Trans. Ed., ePub ahead of print July 13, https://doi.org/10.1287/ited.2021.0252ca.
- Rotman School of Management (2022) RSM2408H: Modelling and optimization for decision making. Accessed April 5, 2023, https://bit.ly/3PPPuJr.
- (2018) The age of secrecy and unfairness in recidivism prediction. Preprint, submitted November 2, https://arxiv.org/abs/1811.00731.
- (2021) The data analytics profession and employment is exploding—Three trends that matter. Accessed April 5, 2023, https://bit.ly/3SlBOY8.
- (2020) Delivering business analytics competencies and skills: A supply side assessment. INFORMS J. Appl. Analytics 50(4):239–254.
- (2019) Stand up for best practices: Misuse of deep learning in Nature’s earthquake aftershock paper. Accessed April 5, 2023, https://bit.ly/3OLMbBG.
- (2021) CS5702 modern data book. Accessed April 5, 2023, https://bit.ly/3BzXQ3q.
- (2009) Incentive contracts in projects with unforeseeable uncertainty. Production Oper. Management 18(2):185–196.
- Stern School of Business (2021) Syllabus for Decision Models and Analytics (Fall 2021, MBA program). Accessed April 5, 2023, https://bit.ly/3JqVkOS.
- (2018) The benefits of using artificial intelligence in payment fraud detection: A case study. J. Payments Strategy Systems 12(2):102–110.
- (2019) Implications of Artificial Intelligence on Business Schools and Lifelong Learning (Academic Leadership Group, Cambridge, MA).
- (2020) Treating medical data as a durable asset. Nature Genetics 52(10):1005–1010.
- Cachon G, Terwiesch C (2019) Matching Supply with Demand: An Introduction to Operations Management, 4th ed. (McGraw-Hill Education, New York).
- Thaler RH (2018) From cashews to nudges: The evolution of behavioral economics. Amer. Econom. Rev. 108(6):1265–1287.
- (2022) MBA applications at some of the country’s best colleges fell this year. Accessed April 5, 2023, https://on.wsj.com/3zMvbXN.
- (2020) Why you might want to use machine learning. Accessed April 5, 2023, https://ml-ops.org/content/motivation.
- (1998) Student evaluation of college teaching effectiveness: A brief review. Assessment Evaluation Higher Ed. 23(2):191–212.
- Warner J (2013) Business analytics in the MBA curriculum. Proc. Northeast Bus. Econom. Assoc. https://web.p.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=0&sid=e6e875a2-6e06-4800-8bd8-aa5eb949737b%40redis.
- (2017) Machine learning modules for all disciplines. Proc. 2017 ACM Conf. Innovation Tech. Comput. Sci. Ed., 84–85. https://dl.acm.org/doi/pdf/10.1145/3059009.3072979.
- (2019) Integrating analytics into marketing curricula: Challenges and effective practices for developing six critical competencies. Marketing Ed. Rev. 29(4):266–282.
- (2019) Case—You can’t take it with you. INFORMS Trans. Ed. 20(1):37–40.
- (2015a) Business analytics for business analysts in manufacturing. Handbook of Research on Organizational Transformations Through Big Data Analytics (IGI Global, Pennsylvania), 97–105.
- (2015b) Business analytics curriculum for undergraduate majors. INFORMS Trans. Ed. 15(2):180–187.
- (2021) Artificial intelligence in business curriculum: The pedagogy and learning outcomes. Internat. J. Management Ed. 19(3):100550.
- (2019) One step at a time: The effects of an early literacy text-messaging program for parents of preschoolers. J. Human Resources 54(3):537–566.
- York University (2022a) Fostering the future of artificial intelligence: Report from the York University task force on AI & society. Accessed April 5, 2023, https://bit.ly/3vyc80S.
- York University (2022b) York University strategic research plan (2018-202): Toward new heights. Accessed April 5, 2023, https://bit.ly/3vwMJok.
- (2021) How student evaluations of teaching affect course enrollment. Assessment Evaluation Higher Ed. 46(5):779–792.
- (2018) Integrating data analytics into the undergraduate accounting curriculum. Bus. Ed. Innovation J. 10(2).