It’s hard to believe that only three years ago “big data” seemed like a strange term. I remember sitting in a crowded room at GigaOm’s first New York Structure Conference at Chelsea Piers, listening to a bunch of bigwigs debate what the massive amounts of data we were rapidly accumulating would eventually be called.
“I hope it’s not big data,” one of them said; I think it may have been Om Malik. It seemed as if the term was being used “temporarily” until someone came up with something better.
Needless to say, it stuck.
Later that year at the O’Reilly Strata Conference in New York, most of the audience was curious about what the big data buzz was all about; there wasn’t much talk of it at work. O’Reilly’s Maureen Jennings and I reflected on it last fall. “I think there was more press there than attendees at that conference,” she said. She was kind of kidding. But it amazed both of us that there is now so much interest in big data that the conference will be held at the Javits Center next year (where things like the New York Auto Show are held). And I bet the exhibit hall will be jam-packed with vendors.
So it follows that this week Big Data Bits will be written in two segments. Though we can’t include everyone, here’s what we found to be notable:
EMC’s ViPR Gets Busy with HDFS and the Download is Free
EMC is keen on leading the world into the third era of computing, and part of that plan includes ViPR, a software-defined storage layer that provides an interface to information wherever it is stored, even on competitors’ products such as NetApp.
ViPR now includes the ViPR HDFS Data Service, a Hadoop-compatible file system that enables customers to use their existing storage infrastructure as a big data repository. The company says that it gives organizations the ability to run analytics using well-known industry Hadoop distributions on existing data stored across heterogeneous systems such as VNX, Isilon and NetApp arrays and, in 2014, commodity storage.
Whoa, EMC said commodity.
And that’s not the only thing that’s new. The ViPR download is free (NOTE: it’s intended for non-production purposes), and the company promises that its salespeople won’t be nagging you to buy after the download. Chad Sakac, the company’s Senior Vice President of systems engineering, wrote:
"And, get this … When you download, we leave you alone :-) Yes, we note that your emc.com account downloaded the stuff, but it gets routed to the inside SE team (not the inside sales team), so the follow-up is 'hey, did you get it working alright,' not 'can I sell you something!'"
And Syncplicity users and soon-to-be customers, take note: I predict that ViPR and Syncplicity will have a play.
Cloudera Announces Commercial Support for Spark (Better than MapReduce?)
We’ve already told you that Cloudera is staking its claim to the Enterprise Data Hub. The company has also announced commercial support for Apache Spark, a lightning-fast machine learning and processing environment that is said to run up to 100 times faster than equivalent MapReduce applications and to require two to 10 times less code.
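To get a feel for where the "less code" claim might come from, here is a rough sketch in plain Python. It uses neither Hadoop's nor Spark's actual APIs; the function names (`mapper`, `shuffle`, `reducer`, `word_count_mapreduce`, `word_count_chained`) and the sample lines are made up for illustration. The first version spells out the separate map, shuffle and reduce phases that a MapReduce-style job is built around; the second expresses the same word count as one short pipeline, which is closer in spirit to the chained style that Spark programs tend to use.

```python
from collections import Counter, defaultdict
from itertools import chain

# Hypothetical sample input, standing in for lines read from a file system.
lines = ["big data is big", "data is everywhere"]


# --- MapReduce-style: three explicit phases ---------------------------------

def mapper(line):
    # Emit a (word, 1) pair for every word in the line.
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    # Group emitted values by key, the step a framework performs between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    # Collapse all the counts for one word into a single total.
    return key, sum(values)


def word_count_mapreduce(lines):
    pairs = chain.from_iterable(mapper(line) for line in lines)
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


# --- Chained style: the same job as one pipeline -----------------------------

def word_count_chained(lines):
    # Split every line into words and count them in a single expression.
    return dict(Counter(word for line in lines for word in line.split()))


if __name__ == "__main__":
    print(word_count_mapreduce(lines))
    print(word_count_chained(lines))
```

Both functions return the same counts. This toy comparison says nothing about the speed claim, and real Hadoop jobs carry extra setup and configuration code that is not shown here.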