Say you were a contestant on Jeopardy, and the answer Alex Trebek read was “EMC, Hadoop.” Chances are the question you’d buzz in with would be “What is Pivotal?”, referring to the company EMC spun off less than a year ago.
I suspect EMC would rather not have it this way. After all, big data is a huge component of the company’s strategy, and its Isilon division provides its own offering for deploying Hadoop and analytics. The trouble is not enough people know about it.
EMC could try to change this via its usual tactics, like seizing opportunities around the Formula 1 car it sponsors or treating customers to the lifestyle of the 1 percent at Fenway Park. But in the land of big data, there’s another means that’s more appropriate. Can you say open source and free? It turns out those words are sexy, too.
So it follows that at O'Reilly Strata + Hadoop World 2013 in New York City today, EMC will announce the release of its Hadoop Starter Kit (HSK) 2.0, which is now available for free. According to the company, HSK ups the efficiency of all Hadoop distribution deployments by reducing the time and cost of deploying them with EMC Isilon Scale-Out NAS.
This offering includes several features that will be attractive to enterprises. Isilon’s native HDFS support, for example, brings Hadoop to the data rather than the other way around, which means that data can be processed in place.
Other vendors, such as Teradata, also offer this capability, and those that don’t will likely add it or introduce faster processing by some other means. Though some vendors think the business wins of leveraging big data are so large that price really isn’t a factor, others think speed and cost savings are becoming customer mandates.
A Look at the Benefits
“The benefits of a single file system with seamless multi-protocol support (NFS, CIFS/SMB, HDFS, etc.), which include avoiding the CapEx costs of purchasing a separate infrastructure and faster results in the absence of migrating petabytes of data, cannot be overstated,” Nick Kirsch, chief technology officer of EMC’s Isilon Storage Division, wrote in a blog post today.
If you’re curious to know more about how HSK 2.0 works, here’s the skinny, as provided by EMC:
“HSK enables rapid provisioning by guiding you through the automation process. From the creation of virtual Hadoop nodes to starting up the Hadoop services on the cluster, much of the Hadoop cluster deployment can be automated, and requires little expertise on the user’s part. The automation process allows the Virtual Hadoop clusters to be rapidly deployed and configured as needed. In addition to deploying quickly, there is also a strong need for high availability for certain mission-critical uses of Hadoop. High-availability protection is provided through the virtualization platform to protect the single points of failure in the Hadoop system, such as the NameNode for HDFS and JobTracker for MapReduce.”
If you want more information, you can see HSK 2.0 in action at the Strata conference in New York City this week. If you can’t get there, check out this video.