It turns out that some people do, in fact, want an Open Data Platform.
Despite all of the brouhaha last week, first around Pivotal Software’s Data Event and then at Strata and Hadoop World, some of the vendors and companies that have signed onto the Open Data Platform (ODP) initiative are welcoming it. “An answer to our Hadoop prayers” is how Scott Gnau, president of Teradata Labs, described it.
Simon Schmidt, the chief data architect at Union Bank, provided a reason as to why the ODP — a tested reference core of open source Apache Hadoop, Apache Ambari and related Apache source artifacts — was vital for an enterprise like his.
“We can’t maintain an internal staff to do all the testing, compatibility testing and researching of every piece of technology that comes along,” he said, adding that “having some industry people backing these things, giving us the type of indemnification that we require make this (a big data platform) a viable option for us for the long term.”
That statement, perhaps, answers the question that Gartner analyst Nick Heudecker posed when we interviewed him shortly after the ODP announcement: “It’s not clear who’s asking for this.”
An Impetus for Change?
In the same conversation Heudecker also suggested that ODP could be the “catalyst that drives adoption” (of Hadoop).
While there’s been a great deal of hype around how broadly enterprises are embracing Hadoop, Heudecker said that when he talks to customers, he finds that many are still in an experimental phase. This being the case, they’re probably not yet at a stage where they’d be concerned about vendor lock-in, how the technologies surrounding Hadoop might play together and so on. It follows that ODP might not yet be relevant to them, and that it is therefore being received as “a marketing story that makes a lot of noise,” which could, in fact, be the case, according to Heudecker.
Whatever it is, it’s certainly a story that Cloudera’s chief strategy officer, Mike Olson, has taken offense to. He’s on record saying that ODP is “antithetical to the open source model and the Apache way.”
It’s a statement that Adam Kocoloski, CTO of IBM Cloud Data Services and co-founder of Cloudant, would likely take exception to.
While referring to something that Cloudera CTO Amr Awadallah said during his keynote at Strata and Hadoop World last Thursday, Kocoloski, speaking before the same audience, made it clear that ODP is not about “displacing or usurping the Apache Software Foundation’s (ASF) place in the big data ecosystem,” but is instead about trying to eliminate some of the skew so that customers can more easily extract value from their big data platforms regardless of the vendors they choose.
Lingering Doubts
Cloudera’s Olson also all but called Pivotal and Hortonworks (the latter a Cloudera competitor and founding member of ODP) liars in a blog post he published the morning of the ODP announcement:
“Pivotal and Hortonworks claim that the ODP is driven by an industry-wide longing for standardization in the Apache Hadoop ecosystem. I don’t believe them.”
Olson qualified his statement with the fact that, as of last Tuesday, Cloudera’s partner ecosystem included 1,447 companies and that he’s “not hearing from them that they’re confused about building applications on core Hadoop.”
Craig Rubendall, vice president of data platform R&D at SAS, pointed out in a blog post that some confusion does, in fact, exist. And while his company intends to continue to support all five Hadoop distros (Cloudera, Hortonworks, IBM, MapR and Pivotal), he said that ODP could go a long way in “fostering collaboration across all members of the Hadoop and big data ecosystem,” and could “reduce the time spent fixing compatibility issues and boost innovation focused on solving our customers’ biggest challenges” instead.
Teradata’s Gnau, also in a blog post, echoed something similar. Referring to “A Prayer for Hadoop,” an article he wrote for Fast Company more than 18 months ago, he explained that he had made “a heartfelt plea to the vendor community that supports Hadoop or offers their own distributions to unify our cause to help make Hadoop easier and more enterprise-ready.”
To Gnau, this means corralling “the increasingly fragmented ecosystem of Hadoop tools and creating the conditions that allow businesses to extract value from big data with new and innovative data-driven, analytic applications.”
It goes without saying that ODP’s stated goals are to do exactly this.
'Awesome and Obvious'
Altiscale, another member of ODP, said, again via a blog post, that the initiative has the potential to dramatically reduce R&D costs, improve interoperability of the ASF’s Hadoop ecosystem components, reduce customer confusion, and bring the benefits of Hadoop to a broader range of customers than ever before.
Raymie Stata, the CEO and founder of Altiscale, who penned the post, gave good reasons why ODP, or something like it, is critical to the widespread adoption and success of Hadoop in the enterprise. It’s a worthwhile read.
Vishal Sikka, CEO of Infosys and SAP’s former CTO, said, via Twitter, “We will look back on #OpenData 5 years from now as a most awesome & yet obvious initiative.”
And lest we be remiss: Hortonworks, whose Hadoop distro, HDP, is made up wholly of open source components, has much to gain from being the only independent, pure-play Hadoop distro vendor in ODP (the others declined their invitations). It also has quite a bit to offer: 66 committers who have intimate knowledge of the code within Apache Hadoop, HDFS, MapReduce, YARN and Ambari, and who will now certify solutions like Pivotal’s newly open source HAWQ on HDP and work on ODP as well.
Hortonworks said that its commitment to ASF projects won’t waver, and there’s no reason to doubt that, both because its business model depends on it and because its employees strongly identify with ASF projects and the community.
The Bottom Line
What’s in it for Hortonworks? Perhaps better chances at winning support subscriptions from enterprises that go the ODP way. Pivotal has already said that it will recommend Hortonworks for second- and third-level support for its platform. HDP is the Hadoop distro in Teradata’s big data appliance, and, of course, there’s a possibility that ODP members will steer their customers toward Hortonworks because HDP and the services around it are all open.
What do non-members of ODP say? Microsoft vice president, Data Platform, T. K. “Ranga” Rengarajan, told us that “It’s a good thing, overall, for the ecosystem,” noting that HDInsight on Azure was built with Hortonworks support.
He also added that “ODP reinforces HDFS and YARN and it coalesces the base,” explaining that ODP holds the potential to make enterprise Hadoop solutions more plug and play, which would be good for customers and good for vendors.
Time will tell whether the introduction of ODP will be the pivotal point for adoption of Hadoop in the enterprise, but it seems that it, or something like it, is needed.
Scott Carpenter, SVP at Core Logic, drives the point home. “We really believe in ecosystem,” he said. “Your technology can be the best in the world, but if you’re trying to go it alone in the marketplace, that’s a real tough road.”
His point is well made.
It might be that Hadoop’s earliest adopters were pioneers who were willing, and had the talent, time and resources available, to make their way down a rough road. Enterprises typically prefer paved highways, and if the fifteen companies that have signed on to the ODP initiative can provide that, while leveraging the best of what ASF projects have to offer, they might be on to something.