How to Differentiate Machine Learning From Dressed-up BI

Machine learning is the use of computing resources that have the ability to learn without being explicitly programmed; that is, to acquire and apply knowledge and skills that maximize the chance of success. That definition of machine learning, provided courtesy of Rob Clyde, vice-chair of the ISACA board of directors and board director and executive chair of White Cloud Security, is a pretty standard explanation. Or here’s another: machine learning is a cognitive system that has the potential to learn from interactions and then deliver evidence-based answers to a problem.

Dressed Up As Machine Learning

But you’d never know that from the many vendors that purport to offer a machine-learning-based application that is really something else, usually a dressed-up business intelligence solution against which SQL queries are run. Such vendors are prevalent, says Padraig Stapleton, VP of Engineering at Argyle Data. "Machine learning and AI are two terms that are abused a lot. You talk to a vendor and it will tell you that it has machine learning or AI but then you talk further to them about how the application really works and it becomes clear that they do not."

Some executives like to throw around the names of particular machine learning algorithms, such as deep learning, XGBoost, random forest or boosted trees, according to Colin Priest, director of Product Marketing for DataRobot. But deep learning, one of his examples, tends to perform best on images and sounds or voices, he said. It doesn’t perform as well on mainstream business problems.

Priest offered yet another definition for true machine learning: it positively impacts business operations so long as it satisfies three prerequisites: business/domain knowledge, access to data and an understanding of that data. "One of the things we realized early on is that there is a lot of market uncertainty about what AI actually is and what machine learning actually is," he said.

How To Tell The Difference

Mere definitions, though, only go so far in helping companies distinguish the pretenders from the real thing, especially as so many companies have come to believe that any new application worth purchasing must have machine learning as part of its make-up. "I think a lot of companies think that if whatever they buy doesn’t have machine learning then they are buying old, obsolete technology," Stapleton said. They are ready to believe, in other words, when a vendor says its product has machine learning. To help companies tell exactly what they are buying, these experts share some pointers.

It’s all about the data. Understand exactly what it is the application is doing with the data, Stapleton said. "Is your system capable of adapting to the changes in the underlying dataset, or is it subject to a rigid set of rules and thresholds that have been configured to your business?" Be careful to distinguish mere automation from true, intelligent decision-making that detects and responds to patterns in data and adapts to changes over time, says Prasad Chalasani, senior vice president of Data Science at MediaMath. "In general, a system can be said to be ‘learning’ only if its behavior is not explicitly coded, but rather its actions, decisions, recommendations, etc. are able to improve automatically as it is exposed to increasing amounts of data."
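The distinction between configured thresholds and behavior that adapts to the data can be made concrete with a toy sketch. The example below is an illustration only, not a description of any vendor's product: the names fixed_rule and AdaptiveDetector, the limit of 1000 and the factor k = 3 are all assumptions chosen for the example. The first function applies a limit someone typed in; the second derives what counts as unusual from the values it has seen so far, using a running mean and standard deviation.

```python
from dataclasses import dataclass


def fixed_rule(amount: float, limit: float = 1000.0) -> bool:
    """Configured threshold: flags anything above a limit set by hand."""
    return amount > limit


@dataclass
class AdaptiveDetector:
    """Flags values far from a running mean that is updated on every observation."""
    k: float = 3.0     # standard deviations that count as unusual (assumed value)
    n: int = 0         # number of observations seen so far
    mean: float = 0.0  # running mean
    m2: float = 0.0    # running sum of squared deviations (Welford's method)

    def observe(self, x: float) -> bool:
        flagged = False
        if self.n >= 2:
            std = (self.m2 / (self.n - 1)) ** 0.5
            flagged = abs(x - self.mean) > self.k * std
        # Update the running statistics so future decisions reflect this value.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return flagged


detector = AdaptiveDetector()
for value in [98, 102, 100, 99, 101]:
    detector.observe(value)

# 500 is far outside what the detector has seen, but below the hand-set limit.
print(fixed_rule(500), detector.observe(500))
```

The fixed rule never changes unless someone edits the limit, while the detector's notion of "normal" shifts as the underlying data shifts. That is the question Stapleton suggests putting to a vendor, in miniature.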

He offers the example of a real-time bidding system. If that system is able to figure out over time that visitors to the NYTimes.com page in California are between the ages of 30 and 40 and are more likely to click on a Tesla ad, and adjusts bid prices accordingly, then such a system can be said to be truly employing machine learning. There is no explicit hard-coded rule that could prescribe that behavior, he said.
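The bidding example above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how MediaMath or any production bidder works: the class SegmentBidder, the segment tuples, the base bid and the prior of 1 click per 100 impressions are all hypothetical. The point is only that no rule mentions any site, region or ad; the bid for each segment moves as click data accumulates.

```python
class SegmentBidder:
    """Learns a click-through rate per segment and scales bids accordingly."""

    def __init__(self, base_bid: float = 1.0,
                 prior_clicks: float = 1.0, prior_views: float = 100.0):
        self.base_bid = base_bid
        # The prior acts as a starting guess so unseen segments get the base bid.
        self.prior_clicks = prior_clicks
        self.prior_views = prior_views
        self.clicks = {}
        self.views = {}

    def record(self, segment, clicked: bool) -> None:
        """Update the counts for a segment after an ad impression."""
        self.views[segment] = self.views.get(segment, 0) + 1
        if clicked:
            self.clicks[segment] = self.clicks.get(segment, 0) + 1

    def ctr(self, segment) -> float:
        """Smoothed click-through rate estimate for a segment."""
        clicks = self.clicks.get(segment, 0) + self.prior_clicks
        views = self.views.get(segment, 0) + self.prior_views
        return clicks / views

    def bid(self, segment) -> float:
        """Bid in proportion to how the segment compares with the prior rate."""
        baseline = self.prior_clicks / self.prior_views
        return self.base_bid * self.ctr(segment) / baseline


bidder = SegmentBidder()
tesla_ca = ("nytimes.com", "CA", "tesla")   # hypothetical segment key
tesla_other = ("nytimes.com", "other", "tesla")

# Simulated feedback: 10 clicks in 100 impressions vs. 0 clicks in 100.
for i in range(100):
    bidder.record(tesla_ca, clicked=(i < 10))
    bidder.record(tesla_other, clicked=False)

print(bidder.bid(tesla_ca), bidder.bid(tesla_other))
```

After the simulated feedback, the bid for the first segment rises above the base bid and the bid for the second falls below it, without anyone having written a rule about either segment.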

Another obvious example of AI would be self-driving vehicles. "They [self-driving cars] have to navigate new roads every day, under different weather conditions, moving through traffic and pedestrians that they have never seen before and yet they can," says Chris Nicholson, CEO of Skymind.

How much customization is necessary? If an application is truly based on machine learning, there will be a lot of customization involved, Stapleton said. "There is a data component piece, and then choosing of the right algorithm that will be applied to that dataset and adding some relevant features." If the application in question is really dressed-up BI or something similar, then not much customization will be necessary. "It is very tell-tale if a vendor says, ‘We can install our solution out-of-the-box, start processing the data and give you results’ over a short period of time."

Talk to the vendor. Really dig into the company and who its people are, Stapleton says. "Understand who the engineering team is and what their background is. A proper machine learning application is built combining access to data, access to domain expertise and access to the data scientists who developed the algorithms."

Dima Stopel, co-founder and VP of R&D at Twistlock, offered other questions to throw at a vendor:

What is the exact machine learning algorithm that you use?
Why did you choose it?
How large is the learning data set and what are the most significant input features?
How long does your system learn before providing any value?
