Will MapR Streams Help it Leapfrog the Competition?

By Virginia Backaitis

Yesterday was a big day at MapR’s San Jose, Calif. headquarters as the company unveiled what it bills as a groundbreaking new product, MapR Streams. Streams is an event-streaming architecture for gathering and analyzing data in real time.

Coupled with MapR’s existing Hadoop plus Enterprise Storage plus NoSQL plus Interactive SQL platform, Streams creates what will now be called the MapR Converged Data Platform (CDP).

Real Time Data, At Your Fingertips

“It’s the biggest change in enterprise computing architecture in decades,” MapR’s Chief Marketing Officer Jack Norris told CMSWire. “No one else, not Cloudera, not Hortonworks or their partners has this,” he said.

The aforementioned big data crunching vendors are MapR’s closest competitors. “This” refers to MapR’s new ability to connect data producers and consumers in real time on the same cluster while simultaneously gleaning insights based on events over time. 

If that’s too geeky to parse, here’s an example that anyone who has been trying to buy holiday gifts online this month might appreciate. By this time next year, e-tailers that use MapR’s CDP, or something like it, might be able to show you ideas based on your taste and confirm that the items are in stock and ready to ship. That would happen before you invest time researching and picking sizes, model numbers or colors, only to find out that the item isn't available.

“It (Streams) is not a small thing, but a big concept,” Bloor Group analyst Robin Bloor told CMSWire. “It puts real-time messaging right in the database. Machines can communicate in real time, globally, in a way that’s resilient and secure.”

MapR’s CDP is unique in its approach and capabilities because of the way MapR set out to handle big data from the start. Unlike other Hadoop distro providers, it chose to innovate at the data layer, thereby becoming the system of record for data.

Enterprise storage is converged into MapR’s underlying Hadoop platform, which provides strong data protection, availability and disaster recovery capabilities for the price of Hadoop support. It’s also worth noting that MapR supports interfaces for files, databases, JSON and now streams.

Data-in-Motion, Data-at-Rest

Hadoop was originally created for data at rest. At the time, computer hardware had become inexpensive enough that keeping all of your data in a “data lake” became an option. Hadoop promised to crunch that data and help organizations unleash insights unlike any the world had seen before. To a large extent, that is exactly what has happened. New companies, business models and services have been created around the capability: think Spotify, Netflix and your favorite interactive online game.

But the world has changed since Hadoop’s early days. People now want to look at data as it’s coming in and compare it with data that’s already there, according to Gartner analyst Merv Adrian. In a video that hasn’t been released to the public, Adrian noted that any architecture that favors data-at-rest over data-in-motion has a problem.

And that’s where MapR’s Converged Data Platform comes in. Norris said that it can look at data-in-motion and data-at-rest at the same time. The capability creates a competitive advantage for companies because it enables them to respond to threats and opportunities in real time.

Needless to say, MapR isn’t the only Hadoop distro provider addressing data-in-motion and data-at-rest. Constellation Research analyst Doug Henschen told CMSWire that seemingly every vendor has been addressing streaming scenarios.

“But with MapR Streams the company has knit together capabilities that companies would otherwise have to put together from Kafka, Hadoop and NoSQL,” Henschen told CMSWire, noting that the choice of MapR means you’re straying from open source Apache software.

“So in the bargain the customer has to accept greater dependence on the vendor (MapR, and its smaller development community vs. its rivals),” he said.

Even so, he noted, it’s a tradeoff that MapR customers agree to make. 

“Customers like comScore that I’ve talked to over the years willingly make this choice to gain advantages in performance and other areas including high availability, snapshotting and other features unique to MapR,” noted Henschen.

Open Source or No, Just Get the Job Done 

Bloor doesn’t see the fact that MapR’s software isn’t 100 percent open source as much of a deterrent. Though he stopped short of labeling open source evangelists as zealots, Bloor did note that industry insiders and onlookers tend to enter the debate in a way that corporate decision makers do not.

“Companies are interested in results, not philosophical discussions,” he said. “If I am the guy who writes the check, I want the product that can get the job done.” 

Bloor told CMSWire that based on what’s been made available on the market, he hasn’t seen anything comparable to MapR Streams and its Converged Data Platform. “You can’t just bolt Kafka onto Hadoop and get the same,” he said.

That’s in theory, of course, because we won’t actually be able to see Streams or CDP until 2016.