My Hadoop is Better Than Yours: It's MapR's Turn #strata

By Virginia Backaitis

The commercial vendors behind Hadoop don’t actually taunt each other in the press, but they do say things like “we’re the only…”, “we don’t do that…” and “they’re doing this…”

Of course, each of them has to differentiate itself from the next and show potential customers why its branded commercial Hadoop distribution is not only fast but also enterprise-grade, secure and flexible.

A Little Background

MapR makes its claim as the only platform that delivers a general-purpose framework for storage and processing on Hadoop. Drilling down just a bit, as Jack Norris, the company’s chief marketing officer, tells the story, MapR leverages not only HDFS (Hadoop Distributed File System) but also the NFS (Network File System) protocol, which brings its customers benefits that the competition simply can’t deliver.

“We give our customers huge advantage,” says Norris, referring to how data is stored, protected, accessed… And some customers seem to agree.

MapR also says it’s the fastest, and it may be. (Does anyone want to comment?) I have yet to go to a big data conference where MapR isn’t breaking a speed record or showcasing its “world record-holding performance and enterprise-grade reliability for Hadoop.”

And that’s, no doubt, what Norris and his team will be doing later today at O’Reilly’s Strata Conference in Santa Clara, California.

But that’s not all they’ll be doing. They’ll also be talking about three announcements: their latest release, which includes Hadoop 2.2 with YARN; the Vertica Analytics Platform on MapR; and their MapR Sandbox for Hadoop for developers.

MapR’s Twist on YARN

In layman’s terms, YARN is the operating system for open source Apache Hadoop 2.x; every viable Hadoop distro will eventually support it. When MapR competitor Hortonworks first came out of the gate with a Hadoop distro with YARN, it seemed to make its announcement just seconds after the Apache Software Foundation voted on its promotion. Shortly after Hortonworks announced that its HDP was enterprise-ready, competitor Cloudera followed.

Now, months later, MapR is announcing its latest release, which includes Apache Hadoop 2.2 with YARN.

“What took you so long? Aren’t you a bit late to the game?” we asked Norris. And while he didn’t say that MapR wasn’t in any hurry, he did say, “Now seems like the right time for our customers,” whatever that means.

It could mean that MapR took a little longer so that its YARN release could take full advantage of everything the MapR platform has to offer.

Norris says that by combining YARN with MapR’s read-write (R/W) POSIX data platform, MapR enables YARN-based applications to not only run on a Hadoop cluster and share compute resources, but to also read, write and update data in the underlying distributed file system and database tables. The net effect reportedly gives customers the ability to develop and deploy a much broader set of Big Data Hadoop applications.
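To make the read-write claim concrete, here is a minimal sketch of an in-place update using nothing but standard file I/O. It is an illustration under stated assumptions, not MapR's API: the file name `events.log`, the helper `update_in_place` and the directory passed to `demo` are all hypothetical, and a local temporary directory stands in for what would be an NFS-mounted cluster path. Stock HDFS files are written once and can only be appended to, so this kind of overwrite in the middle of a file is what a read-write POSIX layer would add.

```python
import os
import tempfile


def update_in_place(path: str, offset: int, new_bytes: bytes) -> None:
    """Overwrite bytes at `offset` without rewriting the rest of the file."""
    # "r+b" opens an existing file for reading and writing without truncating it.
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(new_bytes)


def demo(directory: str) -> bytes:
    """Write a record, update part of it in place, and return the file contents."""
    # Hypothetical file name; `directory` stands in for an NFS mount point.
    path = os.path.join(directory, "events.log")
    with open(path, "wb") as f:
        f.write(b"status=PENDING\n")
    # Replace the 7-byte value "PENDING" with a 7-byte, space-padded "DONE".
    update_in_place(path, len(b"status="), b"DONE   ")
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(demo(d))
```

The point of the sketch is only that the application uses ordinary open, seek and write calls; whether a given cluster path accepts them depends on the storage layer underneath.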

The HP Vertica Analytics Platform on MapR

Data isn’t worth much without analytics, so it’s little wonder that Hadoop distro vendors feel compelled to empower their customers with SQL-on-Hadoop solutions.

MapR today gives Vertica enthusiasts a reason to be excited: it is announcing the early access release of the new HP Vertica Analytics Platform on MapR. The SQL-on-Hadoop solution tightly integrates HP Vertica’s high performance analytics platform directly on MapR’s enterprise-grade distribution of Hadoop.

Is this a big deal? If you’re a Vertica user who wants to crunch big data it is. Though technologists will argue that there are plenty of cheaper ways to get the job done, as we always point out, the end user just wants to do his job with as little disruption as possible.

MapR for Developers

As MongoDB’s Matt Asay often says, “developers are the kingmakers.” That’s why MapR needs them to fall in love with its product and its brand.

Today, MapR announces its MapR Sandbox for Hadoop Developers. Better late than never, we say. The first on the market was Cloudera Quick Start; Hortonworks Sandbox came next.

What would make aspiring Hadoop developers and administrators choose one over the next? While great tutorials and a super-friendly environment are one answer, what your peers use is another, and what your company uses is still a third.

It’s also worth noting that developers who have worked with Hadoop have typically worked with more than one flavor and, as of now, aren’t more passionate about one than the next. Ditto for many of their employers. I learned this recently when I was recruiting Hadoop engineers and, quite frankly, I was shocked.

Vendors can decide what this finding means to them, but from where I sit, they’ve still got work to do. Outside of Silicon Valley the only “brand” associated with Hadoop is Apache.

Title image by Sura Nualpradid (Shutterstock).