Enterprise technology provider IBM is unveiling new data acceleration and Hadoop technologies it says will make analyzing large volumes of big data faster and easier.

IBM is offering two new technologies -- BLU Acceleration, for improved analytical performance, and PureData System for Hadoop, for faster and easier enterprise deployment of Hadoop. BLU Acceleration lets users load big data into RAM instead of reading it from hard disks, delivering in-memory performance regardless of the size of the data sets. IBM says tests show some data queries are processed up to 1,000 times faster as a result.

In addition, BLU Acceleration offers a capability IBM calls "data skipping," which lets queries skip over data that is not relevant to the question being asked. It also analyzes data in parallel across multiple processors, and does so transparently to the application, without needing a separate data modeling layer. Another new capability, called "actionable compression," removes the need to decompress data before analyzing it.
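To make the "data skipping" idea concrete, here is a minimal sketch in Python of one common way such skipping is implemented -- per-block min/max metadata, often called zone maps. The function names and data are hypothetical illustrations of the general technique, not IBM's actual implementation:

```python
# Minimal sketch of "data skipping" via zone maps: keep (min, max)
# metadata for each block of column values, and skip whole blocks
# that cannot possibly contain a match for the query predicate.

def build_zone_map(blocks):
    """Record the (min, max) of each block of column data."""
    return [(min(b), max(b)) for b in blocks]

def query_greater_than(blocks, zone_map, threshold):
    """Return all values > threshold, skipping blocks whose max rules them out."""
    results, blocks_scanned = [], 0
    for block, (_, hi) in zip(blocks, zone_map):
        if hi <= threshold:  # no value in this block can qualify: skip it
            continue
        blocks_scanned += 1
        results.extend(v for v in block if v > threshold)
    return results, blocks_scanned

blocks = [[1, 5, 3], [2, 4, 6], [90, 95, 99]]
zm = build_zone_map(blocks)
vals, scanned = query_greater_than(blocks, zm, 50)
print(vals, scanned)  # only 1 of the 3 blocks is actually scanned
```

The payoff is that for selective queries, most blocks are eliminated using only their metadata, so the engine touches a fraction of the data.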

Meanwhile, IBM says PureData System for Hadoop offers enhanced, easy-to-use analytic and visualization tools and can reduce the time needed to ramp up an enterprise Hadoop implementation from "weeks to minutes." Hadoop is an open-source software framework that can process large volumes of both structured and unstructured data.

IBM is also releasing new versions of its enterprise InfoSphere BigInsights Hadoop application development tool, InfoSphere “stream computing” software designed to analyze big data volumes in real time, and Informix database software.

IBM Meets Market Demands, Struggles with Challenges

According to a post by analyst Jeff Kelly on the Wikibon blog, IBM has "the broadest and deepest big data product and services portfolio in the industry," but it also faces challenges: its big data solutions are fragmented and expensive. As a result, Kelly says, IBM's big data products and services are used primarily by large enterprises.

While Kelly says the big data capabilities IBM released today do not represent unique approaches, they do show that Big Blue has its "ear to the ground" and is "actively evolving its big data portfolio of offerings to meet the needs of practitioners and the enterprise" -- for example, by extending the value of existing DB2 and Informix deployments with targeted real-time capabilities.

Keeping Up with the Joneses (aka SAP)

It is probably not a coincidence that IBM is making a major announcement about speeding up and simplifying its big data analysis and reporting capabilities a week after enterprise technology competitor SAP unveiled rapid-deployment solutions for the SAP Business Suite powered by SAP HANA. SAP promises that customers can implement these new capabilities, with real-time analytics up and running at their sites, in 12 weeks or less.

Earlier this year, SAP announced that its super-fast in-memory database management system, SAP HANA, could handle analytics and transactions in the same database. That gives enterprise decision makers the ability not only to understand what's happening while it's happening, but also to draw on insights from the customer's past history to get a sense of how that customer might be feeling during the current transaction.

All this is part of SAP's big data transition strategy, which the company promises will disrupt the big data market.

It will take some time to sort out which of these enterprise IT giants has the lead in big data -- Oracle and Microsoft are naturally jockeying for position as well -- and more than likely each will outdo the others in certain key facets of big data management. But IBM is letting it be known that the big data battle has just begun.

All new IBM big data solutions will be available in Q2 2013, except the PureData System for Hadoop, which will start shipping to customers in the second half of 2013.