In most cases and for most projects, highly accurate predictions are generally impossible. Fortunately, they don’t need to be. What I call “predicting better than guessing,” that is, a prediction that is more accurate than what a person would guess, is usually more than sufficient to render mass-scale operations more effective and deliver an impact on the bottom line. This applies to marketing, credit risk management, fraud detection, and so on. In other words, a little prediction goes a long way.
Can you talk about both the power, and the limits, of predictive analytics?
Machine learning, which is the underlying technology that powers predictive analytics, is built expressly to learn from data and make predictions. That’s the power.
Data is intrinsically predictive—it is, essentially, an encoding of your organization’s collective experience. And the predictions generated by predictive models directly inform the way each individual is treated, be those individuals customers, healthcare patients, suspects, automobiles, satellites, or seaworthy vessels.
There are two main limitations to predictive analytics. First, as certifiably valuable as it is, predictive analytics normally cannot achieve high accuracy, as I mentioned earlier. Second, to learn from data to predict, you need plenty of both positive and negative examples of the thing you’re trying to predict, such as customers who defected and others who did not, or borrowers who defaulted on their loans and others who did not. Without this set of learning cases, known as training data, you cannot apply machine learning methods.
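To make the training-data point concrete, here is a minimal sketch in plain Python of learning from labeled positive and negative examples. The churn data, feature names, and the use of simple logistic regression are all hypothetical illustrations, not anything specific to the interview:

```python
import math

# Hypothetical training data: each example is (months_inactive, support_calls),
# labeled 1 if the customer defected (a positive example) or 0 if not (negative).
training_data = [
    ((10.0, 5.0), 1), ((8.0, 4.0), 1), ((9.0, 6.0), 1), ((7.0, 5.0), 1),
    ((1.0, 0.0), 0), ((2.0, 1.0), 0), ((0.0, 2.0), 0), ((3.0, 0.0), 0),
]

def predict(weights, bias, x):
    """Estimated probability of defection (logistic regression)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Fit the model by simple gradient descent. This only works because the
# training data contains both kinds of examples; with all-positive or
# all-negative labels there is nothing to distinguish.
weights, bias, rate = [0.0, 0.0], 0.0, 0.1
for _ in range(1000):
    for x, label in training_data:
        error = predict(weights, bias, x) - label
        weights = [w - rate * error * xi for w, xi in zip(weights, x)]
        bias -= rate * error

risky = predict(weights, bias, (9.0, 5.0))  # resembles the defectors
safe = predict(weights, bias, (1.0, 1.0))   # resembles the loyal customers
print(risky > 0.5, safe < 0.5)
```

Each new customer then gets a per-individual score between 0 and 1, which is exactly the kind of prediction that drives mass-scale operations.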
What do you advise organizations to do when it comes to committing to a plan for executing on the insights from analytics?
With predictive analytics, the greatest value usually comes not from insights per se, but from the per-individual predictions generated by the predictive model. These predictions more effectively drive mass-scale operations such as marketing, fraud detection, credit risk management, and healthcare. It’s an upgrade to all the main operations and activities organizations undertake, across both the private and public sectors—that’s the true value, and that’s what organizations need to keep in mind.
Which emerging trend or technology do you think will have the biggest impact on predictive analytics?
The new wave is distributed solutions, which parallelize machine learning. Instead of one computer, regardless of how powerful it is, crunching a massive amount of numbers, you basically have hundreds, thousands, or tens of thousands of computers each tackling a small part of the problem at the same time. The benefit is that when number crunching takes five minutes rather than 24 hours, the data scientist’s intrinsically iterative process accelerates dramatically. He or she can achieve significantly greater model performance in a shorter amount of time.
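The divide-the-work idea can be sketched in a few lines. This toy example uses Python’s standard thread pool on one machine purely to illustrate the split-crunch-combine pattern; real distributed solutions spread the chunks across many machines, and the function and data here are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(chunk):
    """Each worker crunches only its own slice of the data."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=4):
    # Split the full dataset into one chunk per worker...
    chunks = [data[i::workers] for i in range(workers)]
    # ...crunch all chunks concurrently, then combine the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))

data = list(range(1000))
print(parallel_sum_of_squares(data))  # same answer a single worker would get
```

The combined result is identical to what one machine would compute alone; the point is that each worker only ever touches a fraction of the data, so wall-clock time shrinks as workers are added.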
Beyond that, while core technology and software solutions are evolving in other exciting ways, I’m most excited about the breadth of business applications across sectors. As awareness, understanding, and comfort with deploying predictive models grow, so does their organic integration into more and more processes.
Join the upcoming “Expert Perspectives: Betting on the Future” videocast series with Eric Siegel on September 25 to learn more.