Who Coined the Term 'Enterprise Data Hub' and Does It Matter?

“That’s not fair, it was my idea first ...” Anyone who has kids has heard that line before, and it seems that some people don’t grow out of it.

Now, granted, sometimes there are good reasons for these arguments and accusations. Apple says Samsung "slavishly" copied the iPhone and iPad in its Galaxy line of mobile phones and tablets, and the courts agree that its claim has merit. And the so-called Rockstar consortium, made up of Microsoft, Apple, Sony and others, is suing Google over patents.

But then there’s MapR, the company that develops and sells Apache Hadoop-derived software; a representative sent me an email yesterday. It states:

“There has been a lot of attention lately given to enterprise data hubs or data lakes. Some vendors have suggested that their views are novel and first-to-market. Yet, MapR and others have been talking about enterprise data hub requirements for many months.”

It seems like the next line should be “That’s not fair.” Or “That’s not right.”

Sticks and Stones 

Who are these “some” vendors that MapR is referring to? We asked Jack Norris, the company’s chief marketing officer (CMO), this question, but he deflected it, insisting that the spotlight be on the criteria that a “true” enterprise data hub requires -- long-term storage, high availability, data protection, full backup, full disaster recovery and so on.

Needless to say, these are features that MapR claims to have built into its Hadoop-based offerings. But does that mean it stands alone in providing them?

We decided to leave that question for later and to again ask Norris whom specifically MapR was referring to in the email where it stated, “Some vendors have suggested that their views are novel and first-to-market.”

Plenty of Room in the Data Hub Sandbox

A Google search for “Enterprise Data Hub” points primarily to one vendor -- Cloudera. Yet Cloudera doesn’t seem to claim that the term belongs only to it or that its definition of “Enterprise Data Hub” is the same as everyone else’s.

In fact, at Strata + Hadoop World last month, speaking specifically about data lakes and data hubs, Cloudera founder Mike Olson said that, “This (Enterprise Data Hub) meme is much in the industry right now.”

He then went on to explain that Cloudera’s vision for the data hub is that it takes in diverse data, processes it, and serves it up to a variety of downstream systems. In other words, Cloudera has moved beyond seeing Hadoop as a digital sandbox for data scientists, to something that allows its customers to bring more diverse workloads to their data, beyond just MapReduce.

Matt Brandwein, director of product marketing at Cloudera, explains that it’s with this in mind that Cloudera (long ago) included HBase in its distribution, launched native interactive SQL for Hadoop with Cloudera Impala, provided integrated Search for Hadoop, and so on. He also says that Cloudera was first to market with these features.

He adds that Cloudera saw early on that without comprehensive security and data management -- including access controls, auditing, lineage and discovery -- enterprises that wanted to use data hubs would never adopt the Hadoop platform and fully realize its potential. As a result, the company built Sentry and Cloudera Navigator, and doubled down on enhancements to the core open source platform to deliver rock-solid availability and data protection, so that Hadoop could be trusted as a central data management platform.

In many cases, Cloudera’s products are open source and the company’s competitors have subsequently adopted them.

“We see what real customers need and build what the enterprise requires,” says Brandwein.

There’s no doubt that other Hadoop brands might make similar claims and, for the most part, that’s not a problem for Cloudera. That is partly because the company believes it is ahead of the market (“Other vendors talk about use cases. We have production reference customers”) and partly because what’s good for the marketplace is also good for Cloudera.

As Mike Haro, Hortonworks Director of Communications, has said, “We are way too early in this market to fight now,” meaning that, at this point, the market still has plenty of room to grow.

Gartner analyst Merv Adrian seems to agree.

It should be noted that it was CMSWire.com that approached Cloudera for comment (not the other way around), as we wanted to get their reaction to MapR’s claim.

King of the Data Hub Mountain?

Norris claims that MapR is the only company that develops and sells Apache Hadoop-derived software that provides the same high availability and data protection required of more established data platforms. He believes that MapR’s competitors’ enterprise data hubs are built for transitory data.

We don’t have enough competitive intelligence to make this call, but it seems to contradict what others have told us. We will be reaching out to analysts for comment, so look for an update.

Does it Matter?

Finally, we can find references for “Enterprise Data Hub” as far back as 2011. Big data accelerator MetaScale’s Global Head of Sales and Marketing, Ankur Gupta, doesn’t think it makes much difference who used the term first, since, in his view, it’s an old concept.

In its generic form, “It’s a way to use ETL to get data from different sources into one place,” says Gupta.

What’s new, he says, is adding Hadoop in innovative ways, providing valuable presentation, and, of course, marketing.

As for the fact that more than one vendor seems to be working on enterprise data hubs and using the term, “It’s a good thing,” he says, meaning that a rising tide can lift all boats.

Cloudera concurs. “We encourage the term’s use because it directly correlates to where our customers see the value,” says Brandwein.

So in our view, it’s just fine if everyone uses the term “Enterprise Data Hub” (provided they have something of value to offer), as long as they focus on developing products and services that help customers get the most possible value from their data.

And if there has to be a single winner in the market (and we suspect there will be more than one), let it be the company that meets its customers’ wants and needs rather than one that focuses on its competitors’ claims.

Title image S. Fagan / all rights reserved