Ok, everyone knows that data scientists are sexy, but what about local TV reporters who can’t keep themselves from saying words like Hadoop, MapReduce, NoSQL, JSON and petabyte?
They were on my television when I came home from the Strata + Hadoop World conference, I kid you not.
At first I thought that maybe my brain had reached its capacity at the conference and that it was somehow offloading voice files to the DVR, which was then interfering with the news … but that’s a bit too far-fetched, even for me.
Instead, it seems that Big Data actually took New York for one week last month, and that New York, in turn, became taken with Big Data.
And why not? The City is the center of data-dependent industries like Advertising and Marketing, Investment Banking and Insurance, Media and New Media, Medical Research, I could go on … (and in some cases I already have). You can get my take on many of the vendor announcements made at Strata here.
This is the first part of a two-part article about takeaways from the conference program (versus the vendor announcements) as well as presentations that were notable and/or especially interesting -- like acclaimed data scientist Claudia Perlich’s cautionary tale about how data can be deceiving; lessons learned by the data science team at Nordstrom; one man’s take on why women make better data scientists than men and why a few slivers of data may be all we need at any one time.
Without further ado …
Big Data Comes of Age
Flashback to 2008: forget about Hadoop being enterprise ready, no one was even using the term “big data” back then. The first Hadoop World Conference was held in 2009, with 700 attendees. “We were elated (to have that many people show interest),” says Cloudera Strategy Officer Mike Olson.
Last week’s filled-to-capacity conference had approximately 3,400 attendees. Next year it moves to the Javits Center, the same place where the New York Auto Show is held.
Suffice it to say that big data is booming.
And experts say that enterprises have gone far beyond being curious about big data. “We’re no longer being asked, 'What is this (big data) good for?' and 'Is it real?' Today companies are asking us 'How do I solve this problem?'” says Alan Saldich, vice president of marketing at Cloudera.
Brenden Grace, vice president of engineering at marketing agency Collective, says that Big Data and Hadoop are “must haves” at his company and that he suspects this to be the case for most of the Strata conference attendees that he met.
True or False: Hadoop Is Ready for Prime Time
That being said, MapR Vice President of Marketing Jack Norris gave a main stage presentation at Strata in which he named the belief that Hadoop isn't yet ready for prime time as one of the myths around Hadoop. While we agree that that’s a myth -- some of the enterprises that we speak to are already gleaning insights from their big data projects -- many others are still at the proof-of-concept stage. Almost no one questions whether big data is part of their future.
We heard Gartner Analyst Merv Adrian tell SiliconAngle’s theCUBE that when it comes to big data, enterprises have gone from asking why and where, to how and who.
“This conference was all about real deployment. Organizations that were kicking the tires are now prepared for the production readiness, enterprise class features, and more and more governance and security, the issues on the lips and minds of people who were getting ready to make investments,” said Adrian. He believes that big data (and Hadoop) are ready to move into the mainstream.
Saldich, independently of anything Adrian said, echoed the same sentiment: “Big data is no longer a science experiment or something that’s novel and unfamiliar; companies are beginning to look at Hadoop for data management.”
So while there may still be companies that don’t think Hadoop is ready for prime time, give them time; they’ll get there. Our advice to MapR and anyone else trying to convince enterprises that Hadoop is enterprise-ready? It may not be the best way to spend your time -- too many other companies want big data now. Fish where the fish are, we say; there are more and more fish showing up every day.
It’s Not a Data Platform, It’s a Data Hub
In case you haven’t heard, Cloudera introduced the concept of a “data hub” at the conference. The idea is that Hadoop becomes the primary place where enterprise data is stored. This doesn't mean that today’s data warehouses are doomed; as Cloudera explained at the conference, they’ll continue to be used for special use cases.
Want a metaphor around that? Think of your high-end camera: do you use it for most of the pictures you take? If you’re like most people, the answer is “no,” you use your smartphone. Data hub evangelists view data warehouses in much the same way; you’ll use them in special situations. The data hub will be where most of your data is stored.
It’s worth mentioning that shortly after Olson introduced the concept of a data hub at Strata, MapR announced that it had embraced the concept of a data hub too; great minds think alike, no?
As we continue our wrap-up of Strata, we’ll find out if Facebook (which may be the world’s biggest user of Hadoop) thinks that the future is all about Hadoop, and we’ll point you to some presentations that we found to be especially interesting.