Big data never sleeps, or at least big data vendors don’t. We thought that after we reported and recapped all that went on at Strata + Hadoop World, all would be calm for a while …
That didn’t happen.
Not only was there big, big data news last week, such as Amazon’s announcement of Kinesis, Pivotal’s announcement of Pivotal One and DataStax’s new release, but there were also many other stories we didn’t get to. We can’t let them pass without mention; it’s news you need to know.
Udacity and Cloudera Partner to Democratize Hadoop Training
The impending third era of computing (a.k.a. “the data era”) will require a workforce that knows how to work with big data, and, needless to say, not enough people have the necessary education or training. While a number of colleges and universities are beginning to develop post-graduate degree programs around data science and Hadoop, they won’t be able to graduate as many workers as the market is predicted to demand. Not only that, but admission to these programs is highly selective because seats are limited, and the price of tuition is high, leaving the economy without the skilled workers it needs.
Last week Cloudera and Udacity, an innovative provider of online higher education, announced an initiative aimed at closing the skills gap. They are partnering to deliver Hadoop and data science training via Udacity's accessible online education portal. A course such as Introduction to Hadoop and MapReduce, taught by Cloudera University instructors, costs $105.00 to take (if you pre-register) and takes about a month to complete. And if you don’t meet the prerequisites for the class, Udacity will point you to a class that will provide them.
Splunk Gets Smarter, Kinder and Hunks-Up With Amazon
Splunk couldn’t stop making headlines last week. We already told you about Splunk's partnership with Ford, which shows how a car can be turned into a data platform. What we didn’t tell you is that the company was also called to the White House to moderate a panel of tech leaders who discussed big data innovation for the public good; that it brought in former Continuuity boss Todd Papaioannou as Chief Technology Officer (CTO); and that it introduced Amazon Machine Images (AMIs) for Splunk Enterprise 6 and Hunk: Splunk Analytics for Hadoop on AWS.
JethroData Is How Fast?
JethroData, an index-based SQL analytics engine for Hadoop, promises to deliver SQL queries that power reports, dashboards and ad-hoc requests as much as 100 times faster than current tools (Hive, Impala, HAWQ). The company says it works by combining the scalability of Hadoop with the performance of an analytical database in one system. According to the website, JethroData automatically indexes data as it is written into Hadoop, and queries then use those indexes to access only the data they need instead of performing a full scan of the entire dataset.
Since it’s still in beta, we have yet to hear any success stories, but if it works as promised, it’s certainly worth keeping an eye on.
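To make the index-versus-full-scan idea described above concrete, here is a minimal Python sketch. It is not the vendor's implementation, which the website description does not spell out; the IndexedTable class and its method names are hypothetical and exist only for illustration. It shows the general technique: build an index as each row is written, then answer a query by reading only the rows the index points to.

```python
from collections import defaultdict


class IndexedTable:
    """Toy append-only table that indexes every column value at write time.

    Hypothetical illustration only; not any vendor's actual design.
    """

    def __init__(self):
        self.rows = []
        # column name -> value -> list of row positions holding that value
        self.index = defaultdict(lambda: defaultdict(list))

    def append(self, row):
        """Store a row and update the index for each of its columns."""
        position = len(self.rows)
        self.rows.append(row)
        for column, value in row.items():
            self.index[column][value].append(position)

    def full_scan(self, column, value):
        """Baseline: inspect every row. Returns (matches, rows_read)."""
        matches = [r for r in self.rows if r.get(column) == value]
        return matches, len(self.rows)

    def indexed_lookup(self, column, value):
        """Read only the rows the index points to. Returns (matches, rows_read)."""
        positions = self.index.get(column, {}).get(value, [])
        matches = [self.rows[p] for p in positions]
        return matches, len(positions)


if __name__ == "__main__":
    table = IndexedTable()
    for i in range(1000):
        table.append({"id": i, "country": "FR" if i % 100 == 0 else "US"})

    scan_rows, scan_read = table.full_scan("country", "FR")
    idx_rows, idx_read = table.indexed_lookup("country", "FR")
    print(scan_read, idx_read)  # 1000 10
```

With 1,000 rows of which 10 match, the full scan reads all 1,000 rows while the indexed lookup reads only the 10 it needs. The trade-off, in this toy at least, is extra work and memory every time a row is written.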
WANdisco Unveils Non-Stop Hadoop for Hortonworks
In the big data and Hadoop world, WANdisco says it is alone in providing 100 percent uptime for Hadoop; it does this by eliminating Hadoop’s most problematic single point of failure, the NameNode.
Last week, at the Strata Conference in London, WANdisco announced Non-Stop Hadoop for Hortonworks version 1.5. The company says that this new release includes significant performance and administration enhancements for large multi-data center Hadoop deployments.
According to a press release, WANdisco Non-Stop Hadoop is designed for Hortonworks Data Platform (HDP) 2.0 and provides the ability to schedule maintenance, imports of large data sets and other activities by time zone. The benefit? Global enterprises can now plan for server downtime and maintain performance across data centers that span multiple geographies.
WANdisco’s Non-Stop Hadoop for Hortonworks 1.5 adds support for existing Ambari installations. Apache Ambari is the operational framework for provisioning, managing and monitoring Hadoop clusters. Ambari, which is 100 percent open source, includes an intuitive collection of operator tools and a robust set of APIs that hide the complexity of Hadoop, simplifying the operation of clusters.
Amadeus and Couchbase Partner to Provide Better Shopping Experiences for Travelers
Want to talk about an annoying online shopping experience? How about selecting a trip, finding a price that’s acceptable and then finding out that it has changed (for the worse) when you go to book it?
Some of us might think that the website we’re dealing with is being intentionally deceptive, but that’s not necessarily the case; it could be that the provider’s data management backbone can’t replicate data fast enough to keep prices current.
To solve this problem, Amadeus, which provides technology to the travel industry, is partnering with NoSQL document database provider Couchbase.