Big Data Bits: The Strata Edition - Beyond Hadoop Distributions

Welcome back my friends to the show that never ends …

Even though we’ve left the Strata + Hadoop World conference held in New York City last week, the notable news we didn’t report keeps buzzing in our ears, reminding us that we need to share it with you.

We’ve picked out what we think is most relevant and interesting to our readers and we’re splitting it between two posts. This one, in traditional Big Data Bits style, covers product/service announcements; the next discusses what we found to be especially noteworthy from the presentations at the conference.

In case you missed last week’s coverage, check out the news made by Cloudera, EMC, Infochimps, MapR, Microsoft and SAP HANA (we covered Hortonworks and Pivotal the week before). With the exception of SAP and Infochimps, the aforementioned news centers around making Hadoop a better platform or data hub, as some will now be calling it. The announcements that follow seem to have a common theme: namely, they make Hadoop more palatable and accessible to the Enterprise and the Enterprise user.

Alteryx, Cloudera, Revolution Analytics Set You Up to Deliver Insights Like a Data Scientist

Alteryx COO George Mathew says that there are, at best, 200,000 data scientists in the world. Given that, the question remains: how will enterprises leverage Big Data to its potential when the talent required to do so doesn’t (yet) exist?

Some believe that putting aspiring data scientists through Big Data+Analytics “universities” and bootcamps is the answer.

Others, like Alteryx and Revolution Analytics, think that giving the 2.5 million data analysts in the world the tools they need to do the geekiest part of a data scientist’s job may be another.

It’s with this in mind that the companies created a technology that analysts and business users can use to easily create and run sophisticated predictive analytics directly on data stored in Cloudera's Distribution Including Apache Hadoop.

Continuuity Makes Big Data App Development Easier with Reactor 2.0 and Rackspace

There’s something Continuuity founder Jonathan Gray takes to heart that a good many big data geeks might not; namely, that before big data application development can go mainstream, it has to get a whole lot easier. After all, not every engineer is a prodigy or has a degree from (or can even be admitted into) Stanford, UC Berkeley, Carnegie Mellon, IIT or MIT.

“Developing applications on top of Hadoop is really, really hard,” Gray told me last year, and he should know: he built real-time services for Hadoop and Hive when he worked at Facebook.

To make development easier for others, Gray and the Continuuity team built Reactor, which the company bills as the fastest and easiest way to build and run Hadoop and HBase applications.

At Strata, Gray announced Reactor 2.0, which includes MapReduce Scheduling, High Availability, Resource Isolation and full REST API support. The company also hooked up with Rackspace, which means that Reactor 2.0 is now available on the public cloud, a move that might bolster Rackspace's Hadoop as a Service offering.

Kognitio Delivers Rapid Return On Big Data

Hadoop wasn’t built for speed, but in-memory technologies are. Big data analytics company Kognitio announced the availability of its new platform, Kognitio Analytical Platform 8.1, at Strata. Its In-Memory advanced analytical platform enables SQL access on top of Hadoop, delivering rapid return on insight from big data.

The company says that its “true” in-memory technology enables firms to enhance the insights gained from data extracted from a variety of locations, which it then optimizes and pins to multiple terabytes of memory. The payoff, they say, is better, faster results from big data sets. Who doesn’t want that?

The company also says that its new platform gives data scientists and business analysts the ability to use industry-standard SQL alongside any standard programming language and interact with data from Hadoop, as well as conventional data systems, in their infrastructure in near real-time.

Hunk Brought to You by Splunk

Remember Splunk, the company that spins machine data exhaust into gold? At Strata, they announced “Hunk,” a full-featured, integrated analytics platform for Hadoop that enables everyone in an organization to interactively explore, analyze and visualize historical data in Hadoop.

What’s special about it is that it gives analysts a way to work with data sets that are too large to move: you point Hunk at a Hadoop cluster rather than trying to move masses of data. It also lets you run queries on the fly and preview results as MapReduce jobs are running.
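To picture what "preview results as MapReduce jobs are running" means, here is a minimal Python sketch. It is not Hunk's API or implementation; the function name `preview_counts` and the sample data are hypothetical, and it only illustrates the general idea of showing a running result after each chunk of input is processed instead of waiting for the whole job to finish.

```python
from collections import Counter
from typing import Iterable, Iterator


def preview_counts(splits: Iterable[list[str]]) -> Iterator[dict[str, int]]:
    """Yield a running word-count snapshot after each input split is processed.

    Hypothetical helper for illustration only; not part of any product's API.
    """
    running: Counter = Counter()
    for split in splits:
        # "Map" step: count the words in this split only.
        partial = Counter(word for line in split for word in line.split())
        # "Reduce" step: fold the partial counts into the running total.
        running.update(partial)
        # Emit a preview without waiting for the remaining splits.
        yield dict(running)


# Made-up sample data: two splits of log-like lines.
splits = [["error warn", "error"], ["info error"]]
previews = list(preview_counts(splits))
```

Each item in `previews` is a snapshot an analyst could look at while later splits are still being processed; the last snapshot is the final answer.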

For developers, Hunk provides the ability to build applications on top of Hadoop using a standards-based web framework, a documented REST API and software development kits for C#, Java, JavaScript, Python, PHP and Ruby.

Put this together and Hunk makes Hadoop more consumable to enterprise users.

Pivotal’s Spring Makes Big Data Application Development Easier

At the risk of sounding like a broken record, Pivotal announced that Spring is now certified for Apache Hadoop 1.2.1 and 2.0.6 alpha, as well as Pivotal HD 1.0 and Hortonworks HDP 1.3. What this brings to enterprises is a way for their Spring-savvy Java developers to access Big Data in Hadoop.

Hadoop Becomes Viable for Enterprises

It’s long been said that all that Hadoop lacks to become Enterprise-ready is governance and security. While the Apache Community and Hadoop Distribution providers have nearly closed that gap, the ecosystem around Hadoop is in the process of creating something equally remarkable: the tools needed to leverage big data and big data insights to the hilt.

It’s no wonder that Strata attendance has grown from 700 to over 4,000 in the past few years; enterprises are showing more and more interest in Hadoop as Hadoop and the surrounding ecosystem become enterprise-ready.