In case you've been living under a rock, yesterday afternoon Intel announced that it has developed its own distribution of Apache Hadoop, the Intel® Distribution for Apache Hadoop. What’s different about it, compared to the offerings of other Hadoop vendors (Cloudera, Hortonworks, MapR, EMC’s Pivotal HD, HP’s, IBM’s and WANdisco’s -- did I miss any?), is that it’s built from the silicon up, which should give it an edge in delivering performance and scalability.
Intel Relies on Partners to Support Launch
Tuning software to the hardware it runs on is something that most technology vendors want; companies like SAP, for example, work closely with their partners up and down the stack to make sure that their customers get the best performance possible. Needless to say, it gives them a leg up on the competition.
So it follows that Intel was able to announce a whole set of strategic partners supporting the launch. They include 1degreenorth, AMAX, Cisco, Colfax Corporation, Cray, Datameer, Dell, En Pointe, Flytxt, Hadapt, HStreaming, Infosys, LucidWorks, MarkLogic, NextBio, Pentaho, Persistent Systems, RainStor, Red Hat, Revolution Analytics, SAP, SAS, Savvis, a CenturyLink company, Silicon Mechanics, Simba Technologies, SoftNet Solutions, SuperMicro Computer, Inc., Tableau Software, Teradata, T-Systems, Wipro and Zettaset.
Though many pundits say that Intel’s announcement yesterday was a complete surprise, with all those vendors in the mix, I say, really?
Not only that, but when I interviewed Cloudera’s VP of products, Charles Zedlewski, last week about his company’s impending announcement at the Strata conference, Intel was on his list of Hadoop distribution vendors. Contrary to what some are saying, Cloudera was not blindsided by the announcement, and Zedlewski didn’t seem to be worried in the least. (To be clear, the Pivotal HD announcement had not yet been made.)
I say this, in part, to temper the notion that Intel is trying to make a play in the already-too-crowded Hadoop market. I think not; it is instead aiming to be a catalyst. If the Big Data and Analytics market grows, and Intel is seen as part of that scene, its sales will grow too.
The Intel Hadoop Distribution
An important part of making that happen is addressing concerns about security, and about the trade-off between security and performance, which Intel does in its press release:
“The Intel Distribution is the first to provide complete encryption with support of Intel® AES New Instructions (Intel® AES-NI) in the Intel® Xeon® processor. By incorporating silicon-based encryption support of the Hadoop Distributed File System*, organizations can now more securely analyze their data sets without compromising performance.”
Intel also promises that its Hadoop distribution will help companies with processing speed. It claims that its optimizations for the networking and I/O technologies in the Intel Xeon processor platform enable new levels of analytic performance.
With the Intel Distribution in place, they say that analyzing one terabyte of data, which would have previously taken more than 4 hours to fully process, can now be done in 7 minutes. Nice!
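To put those two numbers in perspective, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above and assumes decimal units (1 TB = 10^12 bytes); since the original job took "more than 4 hours," the speedup it computes is a lower bound, and the variable names are mine, not Intel's.

```python
# Rough check of the quoted claim: 1 TB in "more than 4 hours" vs. 7 minutes.
# Assumption: decimal units, 1 TB = 10**12 bytes.
TERABYTE_BYTES = 10**12


def throughput_mb_per_s(data_bytes: int, seconds: float) -> float:
    """Average processing rate in megabytes per second."""
    return data_bytes / seconds / 10**6


before_seconds = 4 * 60 * 60  # 4 hours; the source says "more than", so this is a floor
after_seconds = 7 * 60        # 7 minutes

speedup = before_seconds / after_seconds
before_rate = throughput_mb_per_s(TERABYTE_BYTES, before_seconds)
after_rate = throughput_mb_per_s(TERABYTE_BYTES, after_seconds)

print(f"speedup: at least {speedup:.1f}x")
print(f"before: at most {before_rate:.0f} MB/s, after: about {after_rate:.0f} MB/s")
```

In other words, the claim amounts to a speedup of roughly 34x or better, taking the job from under 70 MB/s to a bit under 2.4 GB/s across the cluster.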
Citing a commonly quoted data growth statistic (“Intel estimates that the world generates 1 petabyte (1,000 terabytes) of data every 11 seconds or the equivalent of 13 years of HD video”), Boyd Davis, the vice president and general manager of Intel’s Datacenter Software Division, says the power of Intel technology opens up the world to even greater possibilities.
If everything works as presented, this is a good thing and not so much a disruption (unless you make ARM chips). In fact, some industry watchers say that if the Big Data and Analytics market grows quickly enough, Intel will bow out of the Hadoop space (if it’s in the way) without too much complaint. What Intel wants, in the long run, is to have Hadoop distributions optimized for its products so that it remains the dominant player when Hadoop is adopted enterprise-wide.