It’s not just Big Data that keeps getting bigger, but also the ecosystem around it. Hadoop, now in its seventh year, has spawned more companies, more applications and more ancillary technologies than most of us can name or count. It’s the beauty of Apache’s brand of Open Source.
As Cloudera’s Chief Architect, Doug Cutting, said on the occasion of Hadoop’s seventh birthday, “The Hadoop ecosystem has hundreds of developers working for tens of organizations. Competitors productively collaborate on a daily basis, improving the software we all share.”
We’d differ with Cutting when it comes to the number of organizations in the ecosystem (we think there are many more), but the point remains the same.
Cloudera Search & Real Time Search Now Available for General Use
Speaking of Cloudera, the company has just announced the general availability of Cloudera Search, a fully integrated search engine for the exploration of data stored in the Hadoop Distributed File System (HDFS) and Apache HBase™. It’s free and can be downloaded now.
Cloudera also announced a Real Time Search (RTS) add-on that is available by subscription. The company says that the for-fee service enables customers to more effectively leverage Cloudera Search by providing technical support, legal indemnification and continual influence over the development of the open source project.
Together, Cloudera Search and Cloudera RTS should not only speed enterprise adoption of Hadoop but also make it available to a larger cross-section of users, which is vital for the technology’s growth.
Start-up GraphDive Wins US$ 2 Million to Grow Its Chatter Analysis to Actionable Insight Offering
If you've been following the US Open and posting phrases like “Viva Nadal” on Facebook or photos of the boy-wonder on Instagram, GraphDive knows your type.
It can parse thousands of other social data points to create a clean user interest graph on you and predict your behavior. Are you more likely to shop at the Wilson or Ralph Lauren Boutique? Do you prefer a Tesla over an Aston Martin?
Sure, it’s a bit invasive, but float me the right coupon, and I’ll deal with it.
GraphDive’s investors are certain that advertisers and marketers are yearning for these types of insights, so they’re putting up US$ 2 million to back the growth of the company’s proprietary technology. And given that GraphDive has enjoyed 10x growth in the past year, they may be making a pretty good bet.
Hey Larry, While Your America’s Cup Team Was Busy Defending Itself for Cheating, Cassandra 2.0 - The Next Generation of Big Data Was Released
Oracle’s hold on the database market is slipping. Open Source Big Data databases, like Apache Cassandra, are taking over pieces of its territory each and every day. And interestingly, as this is happening, Larry Ellison’s America’s Cup team is busy coming to terms with the fact that it was caught cheating in the race. What did they do? They added five pounds of illegal weight to one of their 3,086-pound boats.
The developers behind Apache Cassandra, on the other hand, are simply keeping their heads down. According to DataStax VP Jonathan Ellis, Cassandra 2.0’s headlining features include lightweight transactions, CQL enhancements and triggers, as well as many internal optimizations and improvements. What will happen as a result? The technology will likely be more widely deployed and more rapidly adopted. Want to know more? Check out the Planet Cassandra blog.
Twitter Gifts Summingbird to the Open Source Community
No, “Summingbird” is not a typo. It’s a hybrid technology Twitter developed for its own use.
It’s a salve, of sorts, for the difficulties of using Hadoop for batch processing and Storm for stream processing at the same time. Twitter’s explanation of how Summingbird works is a bit techy for even the more technical among us, but if businesses find good use cases and a sponsor adopts it (like DataStax adopted Cassandra), we’ll surely get a more palatable explanation. Until then, we can experience it via Twitter without even knowing it’s there.
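For the curious, here is a rough sketch of the general idea behind running batch and stream processing side by side. It is not Summingbird's actual API (Summingbird itself is written in Scala); it is a plain Python illustration under our own assumptions, and the function names and sample posts are invented for the example. The point is that one piece of counting logic is written once, applied to both historical and live data, and the two partial results are combined by simple addition.

```python
from collections import Counter

def count_words(lines):
    """Shared logic: turn an iterable of text lines into word counts."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts

def merge(batch_counts, stream_counts):
    """Combine two partial results. Addition is associative, so the
    split between batch and stream does not change the total."""
    return batch_counts + stream_counts

# Hypothetical sample data, invented for illustration only.
historical = ["viva nadal", "nadal wins"]   # already stored, processed in bulk
live = ["viva nadal again"]                 # arriving one item at a time

# Stands in for a batch job over stored data.
batch_view = count_words(historical)

# Stands in for a streaming job that updates its view per incoming item.
stream_view = Counter()
for post in live:
    stream_view += count_words([post])

total = merge(batch_view, stream_view)
print(total["nadal"])
```

Because the merge step is just addition, it should not matter which system handled which slice of the data. That is the kind of headache a hybrid approach aims to remove.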
Title image courtesy of argus (Shutterstock)