“Big Data” was all the rage in 2012. Everyone from your company’s president to President Obama to a passerby on Main Street had something to say about it. Whether any of them could actually define the term was another matter; not even the experts seem to agree on what it means.
2012 at a Glance
We spent the year interviewing vendors and early adopters and listening to them talk about Big Data. We knocked on the doors of both brick-and-mortar firms whose businesses aren’t web-based and firms whose businesses are -- LinkedIn, Eventbrite, Kaggle and Match.com are just a few examples of the latter.
And while the latter group spoke of pushing the envelope with Hadoop and machine learning, the others were dazed by all the Hadoopola -- they were excited by Big Data’s possibilities, but not yet ready to implement meaningful enterprise-wide strategies, to invest in the required technologies, to hire data scientists and so on.
It’s worth noting that some of these firms did run Big Data pilots, but that’s as far as they got.
“Big Data is really, really hard,” many of them told us.
That’s why we say that history will show that 2013, not 2012, was the year of Big Data. It will go down as the year in which companies talked less about it and began to adopt and reap real benefits from it.
And though we’re tempted to make bigger but more specific predictions, we’re going to leave them to the experts in the market.
We asked them one simple question, “What’s your prediction for Big Data in 2013?” Here are the best of their largely unedited answers. It’s a lengthy read, but interesting, mind-bending and provocative.
John Schroeder, CEO and CoFounder, MapR
- Revenue-generating use cases (of Big Data) will trump cost-saving applications.
- Hadoop will pull away from the other Big Data analytics alternatives.
- Hadoop expertise is growing rapidly, but a shortage of talent remains.
- SQL-based tools for Hadoop will continue to expand.
- HBase will become a popular platform for BlobStores (BLOB=binary large objects).
- Hadoop will be used more in real-time applications.
- Hardware will become optimized for use with Hadoop.
- HBase will emerge as an attractive platform for lightweight OLTP.
Laura Teller, Chief Strategy Officer, Opera Solutions
Wall Street is going to use Data Equity to Value Companies
As the year goes on, Wall St. is going to increasingly use "data equity" to value companies, much as they have used brand equity in the past. A company's ability to gather and leverage large amounts of exclusive forms of data will form a new axis in computing a company's long-term value. New fortunes will be made with this equation.
Big Data Apps will be a major trend in 2013
Big Data helps us find new and different answers, but with Big Data's impact spreading across many markets, industries and fields of research, the answers also require new ways of asking the right questions.
In 2013, the real money will not be in data management platforms like Hadoop and NoSQL, or in how a business collects and processes data. Because most businesses will be naturally hesitant to rip and replace database and storage infrastructures that took years and often hundreds of millions to build, true innovation won’t take hold there. The real emerging market and money, or innovation in this case, will be in Big Data applications -- custom applications that help quickly answer domain-specific questions.
Herb Cunitz, President, Hortonworks
Emergence of vertically aligned Apache Hadoop “solutions”
In the keynote at Hadoop Summit last year, Geoffrey Moore characterized Apache Hadoop as currently crossing the chasm and said that we would know it has landed on the other side, and is enjoying adoption by the mainstream, when vertical solutions arise.
As more and more companies gain success, we will see patterns and solutions arise that are custom-fit for a challenge found in a particular industry. As the system integrators and consultants become more and more expert on Apache Hadoop, they will wrap solutions in packages and we will see the emergence of these vertical solutions. Facilitating the growth of this ecosystem is a core strategy at Hortonworks.
David Jonker, Head of Big Data Strategy, SAP
In-memory computing will become a cornerstone technology in every Big Data project where timeliness is key; the vendors who have not yet announced an in-memory capability will finally get on the bandwagon. The 'killer use case' will revolve around personalized consumer experiences in the 'bricks and mortar' world as companies look for any and every competitive edge.
Privacy and other social issues related to Big Data will start to get more airtime by the end of 2013 as the public begins to understand how much information enterprises collect and can access about an individual.
I predict the emergence of the 'Data Reservoir' as the big data architecture that leading enterprises will embrace in 2013. The Data Reservoir model is increasingly central to the thinking of many large banks and Internet companies -- rather than do painful manual integration between data warehouse silos, the Data Reservoir allows you to feed a copy of all interesting data into a unified Hadoop-based repository. This is the foundation for much more agile exploration, discovery and analysis in 2013.
Sanjay Mehta, Vice President of Product Marketing, Splunk
The Big Data Conversation will Shift
In 2013, the attention surrounding Big Data will shift from focusing on the enormity of data and infrastructure technologies, to the specific uses of Big Data and new applications/ways to harness it. Basically, we will hear and see more proof to rationalize Big Data software.
Cars.com, for example, increases revenue, defends its website and enhances the user experience by analyzing data across the organization with Splunk Enterprise. Next year, more users like Cars.com will talk about the results of Big Data and the business decisions driven by analyzing an increasing amount of machine-generated big data -- this is what Splunk calls operational intelligence.
Srikanth Velamakanni, Co-founder and CEO, Fractal Analytics
- At least one major IPO and a couple of Instagram/YouTube style acquisitions. Splunk will face pressure due to heightened expectations.
- Talent shortage will become more acute and pose a real threat to some companies' growth prospects.
- Rise of AI within the analytics space. The fields of computer science, AI, machine learning and game theory will play a greater role in Big Data analytics.
- Rise of personal (self) analytics. More and more companies will provide data to consumers in a way they can analyze to control their behavior and personal life.
- Companies will develop clearer privacy policies and give consumers more control of their sharing. A specific segment of consumers will emerge that actively manages what it shares with whom.
- Big Data analytics will see more applications across industries. More companies will exceed Big Data management capabilities and seek external expertise.
- A significant increase in mobile analytics. Mobile push analytics will change how consumers consume information and buy things.
- More intelligent devices and appliances will emerge with a significant degree of embedded analytics.
- More focus on real-time analytics, although I don't expect much progress within the year.
- Analytics product companies that can't handle large data volume, variety or velocity will struggle to survive.
Steve Hillion, Chief Product Officer, Alpine Data Labs
- Commercial distributions of Hadoop will begin to dominate. As more organizations begin to take Hadoop seriously, they will want to pay for fully-supported commercial versions. And as those commercial versions gain maturity, we may see some consolidation, perhaps even an acquisition of Cloudera as the big players vie for leadership in big data.
- Greater maturity in BI. As the traditional BI vendors get their offerings working on the latest SQL interfaces into Hadoop, and as newcomers like Datameer and Platfora push the envelope, there will be a clash that ultimately benefits the consumer. Beyond basic BI, vendors will support more advanced visualizations and novel ways of exploring big data.
- Hadoop will find its identity as the sandbox for data science. This year, Hadoop has struggled to get beyond simple batch processing and to realize its potential as a platform for advanced analytics. For those who want to go beyond batch and beyond basic reporting, organizations such as Think Big Analytics, SAS and Alpine Data Labs will finally allow everyday users to get deep insights from their big data.
- Data warehouses move to the cloud. While larger organizations remain tied to their on-premises warehouses, smaller companies and early adopters will move more and more of their data assets into the cloud.
- Challenges to Hadoop will begin to appear. Users will reach a point of frustration with performance limitations, version chaos, and the myriad different standards and interfaces. Rival technologies and platforms will leverage HDFS as a substrate while moving beyond the performance limitations of Hadoop -- and consequently there will be a push for greater innovation in all of the big data platforms.
Editor's Note: To read more of Virginia's take on the big data beat, read Getting Fuzzy: The Line Between Social, Big Data and Predictive Analytics