February 8, 2024 in Principles for Successful Analytics Projects

Project Management

Why Data Science Projects Fail: Part 8

https://doi.org/10.1287/LYTX.2024.01.10

“Nine women cannot have a baby in one month.”

“Brooks’ Law: Adding resources to a late software project only makes it later.”

The Standish Group has published their “CHAOS Report” for nearly 40 years, chronicling the failure of most IT projects to achieve scope, timing, budget and quality goals (all four). As of their 2020 “CHAOS Report: Beyond Infinity,” only 19% of all IT projects achieve all four of these lofty goals, a figure that has not gotten much better in the past 40 years and, frankly, may even be worse. Many projects will, of course, achieve some subset of these goals, e.g., timing and quality but not scope and budget.

Why is this statistic even relevant in the context of data science project failures? Beyond the data analysis and modeling phases, a.k.a. the “fun part,” successful data science modeling projects ultimately evolve to become software and systems engineering projects. We recall Tom Davenport’s “Begin with the end in mind” goal: “Models make the enterprise smarter; models embedded in systems and business processes make the enterprise more economically efficient.”

(I will delve more into the complexities and complications of transforming a model into a turnkey business system for planning or real-time decision-making in a later installment.)

Building a turnkey production version of a model that supports an enterprise business process of more than modest importance and criticality will entail building APIs to multiple data sources, architecting a microservice application around the model, built-in model drift detection, refitting of the model, model assumption revalidation, error handling, fault tolerance, high-availability and high-reliability robustness, and failover capabilities to get to 99+% uptime (mission-critical systems require the elusive “5-9s” or 99.999% uptime). Having been through this process many, many times myself, I have found that strong project management (PM) practices, processes, skills, discipline and judgment are critical to successfully achieving scope, timing, budget and quality goals, or at least some subset of them rather than the null set.
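To put those uptime targets in perspective, here is a minimal sketch that converts an uptime percentage into a yearly downtime budget. The function name and the 365-day year are my own assumptions for illustration, not part of any standard or tool.

```python
# Assumption: a 365-day year (leap years ignored), measured in minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Compare "two nines" through "five nines".
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows {minutes:,.1f} minutes of downtime per year")
```

Under these assumptions, 99% uptime leaves roughly 88 hours of downtime per year, while “5-9s” leaves a little over five minutes, which is why the engineering effort grows so sharply with each added nine.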

The “Perfect” Project Management

PM for modeling initiatives is quite a bit more relaxed in that it is understood to be a “research project”: an application of the scientific method, a voyage of discovery, almost as prone to “puffs of smoke over the lab bench” as Edison inventing the lightbulb. Even so, the four dimensions of the inviolate PM “box” (formerly a triangle until the addition of “quality”) are no less important for setting realistic estimates and expectations with stakeholders as to how long a project is going to take.

Agile/MVP (Minimum Viable Product), along with its precursors extreme programming and rapid prototyping, has provided a marked improvement over Waterfall (Egad!) in creating a more flexible, realistic framework for software and system engineering projects. (According to The Standish Group, software projects using Agile methods are three times more likely to succeed than those using the Waterfall method, and Waterfall (software) projects are two times more likely to fail.)

Although Scrum is more popular for strictly software projects, I tend to prefer and recommend Kanban for modeling projects because modeling is more reactive in nature and data scientists are continually discovering new variables, constraints and data sources with which to integrate, etc. The notion of an MVP, or in this case a Minimum Viable Model, is most appealing because it emphasizes getting a basic model up and running and providing some insightful output ASAP, which builds stakeholder confidence.

Once your model is deemed successful and too important to live without reliably, and a budget is approved to convert it into a production system (mission-critical or not, planning or real-time), you are going to become part of a much larger team consisting of data engineers, software engineers, cloud data center system engineers, test and QA engineers, project managers, business analysts, etc. Most likely, you will no longer be “in control” of the project.

Teams will typically employ Agile Scrum for systems development and SAFe (Scaled Agile Framework) for enterprise-scale systems development. I have used Agile Scrum, Kanban and SAFe, and I highly recommend training in all of them to reduce project management risk.

Project management is at least part science (refer to https://www.pmi.org/certifications/project-management-pmp), and there are many tools and techniques available for managing software projects. That said, project management is also an art form that is based on experience, instinct, judgment and, most importantly, knowledge about your people, processes, technology and business problem domain. The fewer unknowns and “new stuff” on a project, the higher the probability of success. (But refer back to the 2020 CHAOS Report for a reality check and the reasons IT projects fail.)

The Usual Suspects in PM

There is a set of “usual suspects” as to why IT and data science projects fail:

  • Underestimating scope.
  • Overcommitting on scope.
  • Underestimating complexity (e.g., technological, architectural, change management, system integration).
  • Overestimating team capacity and capabilities, especially new staff or newly formed teams (there is always a ramp-up, learning curve period).
  • Overprioritizing too many features (i.e., every feature cannot be Priority Level 1 for Release 1).
  • Unrealistic timeline to address scope and no slack in the timeline for contingencies and unforeseen circumstances.
  • Insufficient quantity of (adequately skilled) resources to address scope.
  • Project manager’s inability to succeed through adversity and make decisions to course-correct when things go awry (and they ultimately will).
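
One common way to avoid a timeline with no slack is a three-point (PERT) estimate, which derives both an expected duration and a contingency buffer from optimistic, most likely and pessimistic guesses. The sketch below is my own illustration, not a method taken from this article: the task names and durations are hypothetical, tasks are assumed to run sequentially and independently, and the two-standard-deviation buffer is an arbitrary choice.

```python
from math import sqrt

# Hypothetical tasks: (name, optimistic, most likely, pessimistic) in working days.
TASKS = [
    ("Data source APIs", 5, 10, 21),
    ("Model microservice", 8, 15, 30),
    ("Drift detection and refit", 4, 8, 20),
]


def pert_estimate(optimistic, most_likely, pessimistic):
    """Return (expected duration, standard deviation) using the PERT formulas."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


def schedule_with_buffer(tasks, sigmas=2.0):
    """Sum expected durations and size a buffer from the combined variance.

    Assumes tasks are sequential and independent, so variances add.
    """
    total_expected = 0.0
    total_variance = 0.0
    for _name, optimistic, most_likely, pessimistic in tasks:
        expected, std_dev = pert_estimate(optimistic, most_likely, pessimistic)
        total_expected += expected
        total_variance += std_dev ** 2
    buffer = sigmas * sqrt(total_variance)
    return total_expected, buffer


if __name__ == "__main__":
    expected_days, buffer_days = schedule_with_buffer(TASKS)
    print(f"Expected: {expected_days:.1f} days, contingency buffer: {buffer_days:.1f} days")
```

The point of the exercise is less the arithmetic than the conversation it forces: writing down a pessimistic case for each task makes the contingency explicit to stakeholders instead of leaving it out of the plan.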

Like most endeavors, e.g., learning to play an instrument or a sport, PM skills are developed through the experience of doing project management, including making mistakes and learning from them, not by reading about it in a book or taking a course. Studying project management may be necessary but is by no means sufficient for developing and honing your skills and becoming a good project manager. There is no substitute for PM experience or learning from other, more experienced project managers. In my experience, it takes years of hands-on intensive experience managing projects of larger and larger scope and greater and greater complexity to develop expertise.

The best advice I can provide is to always err on the side of caution when making project commitments on scope, timeline, resources, budget, quality and complexity, and when in doubt, seek the advice of more experienced project managers. Always be transparent with your team members, stakeholders and constituents, and report bad news and offer solutions as soon as you discover an adverse situation. (Bad news does not get better with age!)

[Figure: Phased data science project life cycle]

Above is a generalized, representative, phased data science project life cycle that I have developed based on decades of real-world implementation experience by professional data scientists. You can use it as a template to help guide your own project planning and expectation setting.

Douglas A. Gray
