Welcome back my friends to the show that never ends …
Even though we’ve left the Strata + Hadoop World conference held in New York City last week, the notable news we didn’t report keeps buzzing in our ears, reminding us that we need to share it with you.
We’ve picked out what we think is most relevant and interesting to our readers and we’re splitting it between two posts. This one, in traditional Big Data Bits style, covers product/service announcements; the next discusses what we found to be especially noteworthy from the presentations at the conference.
In case you missed last week’s coverage, check out the news made by Cloudera, EMC, Infochimps, MapR, Microsoft and SAP HANA (we covered Hortonworks and Pivotal the week before). With the exception of SAP and Infochimps, the aforementioned news centers on making Hadoop a better platform or data hub, as some are now calling it. The announcements that follow share a common theme: they make Hadoop more palatable and accessible to the Enterprise and the Enterprise user.
Alteryx, Cloudera, Revolution Analytics Set You Up to Deliver Insights Like a Data Scientist
Alteryx COO George Mathew says there are, at best, 200,000 data scientists in the world. Knowing this, the question remains: how will enterprises leverage Big Data to its full potential when the talent required to do so doesn’t (yet) exist?
Some believe that putting aspiring data scientists through Big Data+Analytics “universities” and bootcamps is the answer.
Others, like Alteryx and Revolution Analytics, think that giving the world’s 2.5 million data analysts the tools they need to do the geekiest part of a data scientist’s job may be another.
It’s with this in mind that the companies built technology that analysts and business users can use to easily create and run sophisticated predictive analytics directly on data stored in Cloudera's Distribution Including Apache Hadoop.
Continuuity Makes Big Data App Development Easier with Reactor 2.0 and Rackspace
Continuuity founder Jonathan Gray takes something to heart that a good many big data geeks might not; namely, that before big data application development can go mainstream, it has to get a whole lot easier. After all, not every engineer is a prodigy or has a degree from (or can even be admitted into) Stanford, UC Berkeley, Carnegie Mellon, IIT or MIT.
“Developing applications on top of Hadoop is really, really hard,” Gray told me last year, and he should know: he built real-time services for Hadoop and Hive when he worked at Facebook.
In order to make development easier for others, Gray and the Continuuity team built Reactor, which the company bills as the fastest and easiest way to build and run Hadoop and HBase applications.
At Strata, Gray announced Reactor 2.0, which adds MapReduce Scheduling, High Availability, Resource Isolation and full REST API support. The company also partnered with Rackspace, so Reactor 2.0 is now available on the public cloud, a move that might bolster Rackspace's Hadoop as a Service offering.
Kognitio Delivers Rapid Return On Big Data
Hadoop wasn’t built for speed, but in-memory technologies are. Big data analytics company Kognitio announced the availability of its new Kognitio Analytical Platform 8.1 at Strata. Its in-memory advanced analytical platform enables SQL access on top of Hadoop, delivering rapid return on insight from big data.
The company says its “true” in-memory technology lets firms enhance the insights gained from data extracted from a variety of locations, which the platform then optimizes and pins to multiple terabytes of memory. The result, they say, is richer, faster results from big data sets. Who doesn’t want that?
The company also says that its new platform gives data scientists and business analysts the ability to use industry-standard SQL alongside any standard programming language and interact with data from Hadoop, as well as conventional data systems, in their infrastructure in near real-time.
Hunk Brought to You by Splunk
Remember Splunk, the company that spins machine data exhaust into gold? At Strata, they announced “Hunk,” a full-featured, integrated analytics platform for Hadoop that enables everyone in an organization to interactively explore, analyze and visualize historical data in Hadoop.
What’s special about Hunk is that it gives analysts a way to work with data sets that are too large to move: you point it at a Hadoop cluster rather than shipping masses of data somewhere else. It also lets you run queries on the fly and preview results while MapReduce jobs are still running.
For developers, Hunk provides the ability to build applications on top of Hadoop using a standards-based web framework, a documented REST API and software development kits for C#, Java, JavaScript, Python, PHP and Ruby.
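To give a rough sense of what that SDK route looks like, here is a minimal sketch using the Splunk Java SDK to submit a search and stream back results. The host, credentials, index name and search string are placeholders of ours, and treating Hunk's search endpoints as identical to core Splunk's is an assumption on our part, not something Splunk spelled out at Strata.

```java
// Minimal sketch (our assumptions, not Splunk's docs): submit a search via the
// Splunk Java SDK and poll for results. Host, credentials and the virtual index
// name are placeholders.
import com.splunk.Job;
import com.splunk.JobResultsArgs;
import com.splunk.Service;
import com.splunk.ServiceArgs;

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class HunkSearchSketch {
    public static void main(String[] args) throws Exception {
        // Connect to the management port (placeholder host and credentials).
        ServiceArgs loginArgs = new ServiceArgs();
        loginArgs.setHost("hunk.example.com");
        loginArgs.setPort(8089);
        loginArgs.setUsername("admin");
        loginArgs.setPassword("changeme");
        Service service = Service.connect(loginArgs);

        // Kick off a search against a hypothetical virtual index backed by Hadoop.
        Job job = service.getJobs().create("search index=hadoop_weblogs | stats count by status");

        // Poll until the underlying MapReduce work has finished.
        while (!job.isDone()) {
            Thread.sleep(500);
            job.refresh();
        }

        // Stream the finished results back as JSON.
        JobResultsArgs resultsArgs = new JobResultsArgs();
        resultsArgs.setOutputMode(JobResultsArgs.OutputMode.JSON);
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(job.getResults(resultsArgs)))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The point of the pattern is that the heavy lifting stays inside the Hadoop cluster; the developer only talks to a familiar REST-backed client.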
Put this together and Hunk makes Hadoop more consumable for enterprise users.
Pivotal’s Spring Makes Big Data Application Development Easier
At the risk of sounding like a broken record, Pivotal announced that Spring is now certified for Apache Hadoop 1.2.1 and 2.0.6 alpha, as well as Pivotal HD 1.0 and Hortonworks HDP 1.3. What this brings to enterprises is a way for their Spring-savvy Java developers to access Big Data in Hadoop.
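For a flavor of what those developers get, here is a minimal sketch, assuming Spring for Apache Hadoop's FsShell helper and a placeholder namenode address, that lists an HDFS directory from plain Java; exact property names and wiring vary by Hadoop version and distribution.

```java
// Minimal sketch, with placeholder addresses and paths: browse HDFS from Java
// using Spring for Apache Hadoop's FsShell helper.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.springframework.data.hadoop.fs.FsShell;

public class HdfsBrowseSketch {
    public static void main(String[] args) {
        // Point the Hadoop client at the cluster; the namenode address is a placeholder
        // (Hadoop 1.x uses fs.default.name instead of fs.defaultFS).
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

        // FsShell wraps the familiar "hadoop fs -ls" style operations behind a plain
        // Java API, so Spring developers can work with HDFS without leaving Java.
        FsShell shell = new FsShell(conf);
        for (FileStatus status : shell.ls("/data")) {
            System.out.println(status.getPath() + "\t" + status.getLen());
        }
    }
}
```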
Hadoop Becomes Viable for Enterprises
It’s long been said that all that Hadoop lacks to become Enterprise-ready is governance and security. While the Apache Community and Hadoop Distribution providers have nearly closed that gap, the ecosystem around Hadoop is in the process of creating something equally remarkable: the tools needed to leverage big data and big data insights to the hilt.
It’s no wonder that Strata attendance has grown from 700 to over 4,000 in the past few years; enterprises are showing more and more interest in Hadoop as the platform, and the surrounding ecosystem, become enterprise-ready.