Talk to MapR CMO Jack Norris and he’ll make no apologies for the fact that his company’s big data crunching technology is not 100 percent open source.
Though MapR’s architecture leverages Apache Foundation projects like Hadoop and Spark, the company is a commercial software provider at its core, and its “game-changing” offerings are proprietary.
“We’re creating foundational technologies here,” Ted Dunning, the company’s applications architect, told CMSWire, explaining that MapR’s top Forrester-rated software helps customers handle information in a way that would-be rivals Cloudera and Hortonworks simply cannot.
A case in point is MapR’s Converged Data Platform (CDP), which brings together Apache Hadoop and Apache Spark with MapR DB, a top-ranked proprietary database, and MapR Streams.
This morning MapR was awarded a patent for CDP by the United States Patent and Trademark Office. The patent recognizes the company’s fundamental innovation in data architecture that enables real-time and mission-critical application deployments at scale.
“No one else, not Cloudera, not Hortonworks or their partners have this capability,” said Anil Gadre, senior vice president of product management at MapR Technologies.
“MapR has certainly forged its own path and delivered a unique distribution of Hadoop,” Constellation Research analyst Doug Henschen told CMSWire.
“It uses what it likes from the open-source Hadoop ecosystem but selectively adds its own proprietary components where it sees fit in order to deliver higher performance, durability and availability and infrastructure efficiencies over more conventional Hadoop distributions,” he added.
What’s different about MapR’s CDP is that its architecture is data-centric: data does not reside in separate clusters, with one for historical data (Hadoop), one for streaming data, one for the database and so on. CDP answers the question:
“How do you really deliver industrial strength data analysis in real time?” said Gadre. “It’s one thing to know ‘what should I do’ based on retrospective information and quite another to look at what’s happening now, what happened in the past, and answer the question what should I do next?”
Instead of answering the questions one by one and combining the answers, MapR CDP simply provides an answer in real time.
“With MapR Streams the company has knit together capabilities that companies would otherwise have to put together from Kafka, Hadoop and NoSQL,” Henschen told CMSWire when CDP was first announced.
A real-world application of CDP might be in preventing fraud in real time. MapR has been able to help customers detect a racket now, as it is happening, instead of in two weeks, which is how long competitors’ solutions might take to crunch and combine historical data with streaming data.
“Data loses value in time, but it gains value in aggregation,” said Dunning. So, in the aforementioned example, insight around fraud would be gleaned by recognizing patterns in historical data. When streaming data is in the mix, the merchant might be alerted that the person about to commit fraud is standing in front of them.
Another example might be ComScore, which uses CDP to auction off and place ads on the web and, as a result, has better insight with which to make decisions.
Does Open Source Matter?
If there’s a downside to CDP, the most obvious is that it’s not 100 percent open source and it’s not free. As a result, customers have to accept greater dependence on the vendor, and potential lock-in. Henschen has said that not everyone has issues with this.
“Customers like ComScore that I’ve talked to over the years willingly make this choice to gain advantages in performance and other areas including high availability, snapshotting and other features unique to MapR,” said Henschen.
Would-be customers should consider vendor lock-in tradeoffs alongside the performance advantages, warned Henschen. He then added that other Hadoop distributors also employ components that are unique to their distributions (whether they are open source or not), so a commitment to any Hadoop distribution should not be taken lightly.
Industry analyst Robin Bloor, owner of Bloor Research, said the open source versus proprietary conversation is overrated and that the people who write the checks for technology simply choose the technology that gets the job done.
All of that being said, the “I can do anything you can do, better” conversation might be hushed for a while because the US Patent Office has handed MapR a pretty good answer:
“You can’t do this at all.”