Who Coined the Term 'Enterprise Data Hub' and Does It Matter?

“That’s not fair, it was my idea first ...” Anyone who has kids has heard that line before, and it seems that some people don’t grow out of it.

Now granted, sometimes there are good reasons for these arguments and accusations. Apple says Samsung “slavishly” copied the iPhone and iPad in its Galaxy line of mobile phones and tablets, and the courts agree that its claim has merit. And the so-called Rockstar consortium, made up of Microsoft, Apple, Sony and others, is suing Google over patents.

But then there’s MapR, the company that develops and sells Apache Hadoop-derived software; a representative sent me an email yesterday. It states:

“There has been a lot of attention lately given to enterprise data hubs or data lakes. Some vendors have suggested that their views are novel and first-to-market. Yet, MapR and others have been talking about enterprise data hub requirements for many months.”

It seems like the next line should be “That’s not fair.” Or “That’s not right.”

Sticks and Stones 

Who are these “some” vendors that MapR is referring to? We asked Jack Norris, the company’s chief marketing officer (CMO), this question, but he deflected it, insisting that the spotlight be on the criteria that a “true” enterprise data hub requires -- long-term storage, high availability, data protection, full backup, full disaster recovery and so on.

Needless to say, these are features that MapR claims to have built into its Hadoop-based offerings. But does that mean it stands alone in providing them?

We decided to leave that question for later and to again ask Norris who specifically MapR referred to in the email where it stated, “Some vendors have suggested that their views are novel and first to market.”

Plenty of Room in the Data Hub Sandbox

A Google search for “Enterprise Data Hub” points primarily to one vendor -- Cloudera. Yet Cloudera doesn’t seem to claim that the term belongs only to it or that its definition of “Enterprise Data Hub” is the same as everyone else’s.

In fact, at Strata + Hadoop World last month, speaking specifically about data lakes and data hubs, Cloudera founder Mike Olson said, “This [Enterprise Data Hub] meme is much in the industry right now.”

He then went on to explain that Cloudera’s vision for the data hub is that it takes in diverse data, processes it, and serves it up to a variety of downstream systems. In other words, Cloudera has moved beyond seeing Hadoop as a digital sandbox for data scientists, to something that allows its customers to bring more diverse workloads to their data, beyond just MapReduce.

Matt Brandwein, director of product marketing at Cloudera, explains that it’s with this in mind that Cloudera (long ago) included HBase in its distribution, launched native interactive SQL for Hadoop with Cloudera Impala, provided integrated Search for Hadoop, and so on. He also says that Cloudera was first to market with these features.