While much of the drama in the Hadoop world of late has surrounded the Open Data Platform (ODP) initiative, there’s progress outside of it, too.
Today MapR announced that its Hadoop distribution now ships with Apache Drill 1.0, an open source, low-latency SQL query engine for Hadoop and NoSQL.
Its promise is to make it easier for end users to interact with data from both legacy transactional systems and new data sources, such as Internet of Things (IoT) sensors, web click-streams and other semi-structured data, while supporting popular business intelligence (BI) and data visualization tools.
This is Different
Unlike many other SQL-on-Hadoop engines, Drill is schema-free, which means that neither IT nor data scientists need to spend time defining the data’s structure before analysis. In theory, business users will be able to move from data to analysis and insight faster. It’s also possible that they’d be able to leverage more data while they’re at it.
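To make the schema-free idea concrete, here is a minimal Python sketch of querying semi-structured records without declaring a table first. It is an illustration of the general schema-on-read approach only, not Drill's implementation or API; the sample records and the `scan` and `select` helpers are hypothetical names invented for this example.

```python
import json

# Hypothetical click-stream records; note that the fields differ between rows.
RAW_LINES = [
    '{"user": "ann", "page": "/home", "ms": 120}',
    '{"user": "bob", "page": "/cart", "ms": 340, "device": {"os": "ios"}}',
    '{"user": "ann", "page": "/cart", "ms": 95}',
]


def scan(lines):
    """Parse each line as it is read; no table definition is declared up front."""
    for line in lines:
        yield json.loads(line)


def select(records, columns, where=lambda rec: True):
    """Project the requested columns, using None for fields a record lacks."""
    for rec in records:
        if where(rec):
            yield {col: rec.get(col) for col in columns}


# Roughly: SELECT user, ms, device FROM clicks WHERE page = '/cart'
rows = list(
    select(
        scan(RAW_LINES),
        ["user", "ms", "device"],
        where=lambda rec: rec["page"] == "/cart",
    )
)

for row in rows:
    print(row)
```

The structure of each record is discovered at read time, so the new `device` field can be queried as soon as it appears in the data, and older records that lack it simply return an empty value.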
And while that’s the blue-sky version of Drill, it’s worth noting that it’s a new Apache Project that only moved from Incubator to Top Level project six months ago. There will no doubt be critics from competing Hadoop distro providers who will question Drill’s maturity and suitability for the Enterprise at this point in time.
That being said, most of the leadership on the Apache Drill project works at MapR, suggesting that its employees are intimate with the technology and in position to make it enterprise-grade.
And since MapR is the one Hadoop distro provider that positions itself as a software vendor rather than one whose revenue is largely dependent on training and support, it would be hard to believe that it would add something to its product that would create problems for its customers, and therefore itself. (Note: we’re not suggesting that other Hadoop distro providers would.)
Matt Aslett, research director, data platforms and analytics at 451 Research, said, “The availability of Apache Drill in the MapR Distribution is a major milestone for the SQL-on-Hadoop project, which is significant in delivering real-time insights from complex data formats without requiring any data preparation.
“Apache Drill is an example of MapR collaborating with others as part of the Apache development process on new technologies to expand the Hadoop portfolio.”
His latter statement speaks to both the criticism that MapR sometimes receives about being a proprietary Hadoop distro provider and to MapR’s response to it — namely that it packages and “bullet-proofs” the best-in-class, big data-crunching solutions (many of them Open Source) for its customers so that their need for services is next to nil.
It’s worth noting, too, that Apache Drill 1.0, which is now included in MapR’s distro, is free for the taking. So should a competitor like Hortonworks, which has at least one contributor on the project, find it extremely valuable, it can engineer Drill into its own distro as well.