Every company needs data science and AI, or so they think, yet most data science projects fail to deliver any lasting value. There is no exact measurement of the percentage of data science projects that fail, but there is broad consensus that the failure rate is high. In 2016, Gartner estimated that 60% of data science projects fail. A year later, Gartner's own analyst, Nick Heudecker, said that estimate was too conservative and that the failure rate was closer to 85%. This 2021 article by Mateusz Kwaśniak traces the source of a seemingly ubiquitous statistic that only 13% of data science projects succeed. And while data science tools and methods grow ever more complex, the primary drivers of project failure remain frustratingly human: executives trusting gut feelings over the data, companies failing to nurture a data-driven culture, and organizations launching big data and analytics projects (sometimes entire departments) with no clear path to creating value. Bringing a data science project to successful completion requires a dedication to process despite the many temptations and biases our human brains bring to the table.
As consultants, we are brought into a wide range of companies and projects, often only after problems or early signs of failure have appeared. Here are some of the ways we have seen data projects fail over the years, in the hope that they help you avoid stepping into the same traps:
- Starting with the tool, not the problem. Fitting the problem to a tool, rather than finding the right tool for the problem, is usually a symptom of skipping proper problem analysis. Selecting a tool based on industry hype is a common mistake: for example, using a Large Language Model for a problem that is really a straightforward data analysis task, just because LLMs are the latest trend.
- Using fancy tools on simple problems. Closely related to the first point: sometimes a basic linear regression or simple statistical method is the best solution. Throwing a neural network at a simple problem wastes money and adds complexity that will come back to bite you (see the first sketch after this list).
- Poor data quality. Garbage in = garbage out. If the data is of poor quality, or simply not accessible, no amount of analysis or transformation will produce value from it. Worse still, you might get answers that look correct but are actually wrong and misleading (see the second sketch after this list).
- The solution goes unused. End users are never properly trained on the solution, or the company's processes never adapt to the new capabilities, so it sits idle even though it is available. Sometimes users do not even know the feature exists.
- End users don't want to use it because the company culture disincentivizes it. Sometimes the bonus structure rewards the old, inferior method instead of the new solution. Or users fear that the new tool will make their roles redundant and eventually cost them their jobs.
- Not having an end-to-end view. The company treats the solution in isolation as enough and never accounts for the software development needed to put it into production for all the users who need it. The model may be sound, but it will never be used.
- The business problem and the specific business outcomes are not thought through. The customer does not want to spend the time to properly define the problem, or to go through the trouble of specifying exactly what they need. This has to happen before anything gets built; otherwise the risk of wasted effort is simply too high.
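To make the "simple tools first" point concrete, here is a minimal sketch of the kind of sanity check worth running before reaching for anything fancy: fit a plain linear-regression baseline and a more complex model side by side and compare them with cross-validation. The Python/scikit-learn tooling and the synthetic data below are our own assumptions for illustration, not a prescription.

```python
# A cheap baseline-versus-complex-model comparison. If the complex model does not
# clearly beat the baseline, the baseline wins on cost, speed, and maintainability.
# Synthetic data is used here purely as a stand-in for a real tabular dataset.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

for name, model in [
    ("linear regression (baseline)", LinearRegression()),
    ("gradient boosting", GradientBoostingRegressor(random_state=0)),
]:
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
    print(f"{name}: mean absolute error ~ {-scores.mean():.2f}")
```

If the gap between the two is negligible, the simpler model is almost always the better business decision.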
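And to illustrate the data-quality point: a few cheap, boring checks run before any modelling catch a surprising share of "garbage in". The sketch below uses pandas on a toy order table; the column names and the specific checks are illustrative assumptions, since the right checks always depend on your own data and domain rules.

```python
# Minimal pre-modelling data-quality checks on a toy table. In a real project the
# data would come from your source systems and the rules from your domain experts.
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2, 2, 4],                # note the duplicated id
    "amount":   [19.99, None, 250.0, -5.0],  # a missing value and a negative amount
})

print("duplicate order ids:", int(df["order_id"].duplicated().sum()))
print("missing values:     ", df.isna().sum().to_dict())
print("negative amounts:   ", int((df["amount"] < 0).sum()))  # domain rule: amount >= 0
```

None of this is sophisticated, and that is the point: if checks like these fail, no model downstream will save the project.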
Executing a successful data science project takes more than the skills of the individual people involved. Even with a great team, the right approach is required to ensure success. Before any data science or software engineering takes place, the business problem must be properly analyzed, the requirements carefully defined, and the desired end results spelled out, including ensuring that end users can actually access the solution being built. A data project that leaves out major portions of the business, its processes, or its people is likely to fail.
Starting a data science project with a robust process and an experienced team greatly improves your chances of success. So does the experience to steer the project toward simpler, well-proven methods over trendy new tools. Insight Softmax brings a deep understanding of these challenges and their solutions to the table. We will guide you through our proven process, which limits your risks and increases your chances of data science success. To learn more, click here.