Everyone is doing big data, or at least it would appear so from the forums, blog posts and media articles currently appearing on the Web. But big data solutions are expensive and require investment in people and technology that not every enterprise can afford. So how widespread is big data adoption? Open source integration specialist Talend went to the market to find out and came back with some interesting results.
Big Data Problems
The findings are contained in a research report entitled How Big Is Big Data Adoption. Its starting point is that big data will produce a fundamental shift in the way enterprises do business, and even in the very nature of enterprises as we know them today.
The problems around big data stem from the fact that the amount of data globally is growing by 40% per year, and more and more data is becoming accessible to wider audiences as techniques for capturing physical data improve. Talend cites the example of Walmart, which handles more than 1 million customer transactions every hour. These are imported into databases estimated to contain more than 2.5 petabytes of data.
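To put the 40% figure in perspective, here is a minimal sketch of the compound-growth arithmetic behind it. The function names are illustrative only, and the yearly transaction count assumes Walmart's quoted hourly rate holds around the clock, which the report does not state.

```python
import math

ANNUAL_GROWTH = 0.40               # yearly data growth quoted in the report
TRANSACTIONS_PER_HOUR = 1_000_000  # lower bound quoted for Walmart


def doubling_time_years(rate: float) -> float:
    """Years needed for a quantity to double at a fixed compound annual rate."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate: float, years: int) -> float:
    """Multiple of the starting volume reached after the given number of years."""
    return (1 + rate) ** years


def transactions_per_year(per_hour: int) -> int:
    """Yearly total, assuming the hourly rate is constant (an assumption)."""
    return per_hour * 24 * 365


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time_years(ANNUAL_GROWTH):.2f} years")   # 2.06
    print(f"Growth over 5 years: {growth_factor(ANNUAL_GROWTH, 5):.2f}x")     # 5.38
    print(f"Transactions per year: {transactions_per_year(TRANSACTIONS_PER_HOUR):,}")
```

In other words, at 40% annual growth the global volume of data doubles roughly every two years and grows more than fivefold in five.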
What Is Big Data?
Just to be clear from the outset about what exactly we are talking about here: big data consists of complex, diverse sets of structured and unstructured data that contain too much information to be analyzed by traditional data management tools and practices.
The data comes from both inside and outside the enterprise: social media and internet text, financial transactions, or any other business the enterprise conducts.
Big Data Business Information Drivers
Most importantly, in the big data discussions currently preoccupying enterprises everywhere, the data comes from customer interactions. Big data and big data analytics kick in where conventional data management tools are overwhelmed and can no longer manage or analyze the data.
As in other technology spaces, early adopters are driven by the need to secure a competitive edge. They take the biggest risks and deploy the earliest tools so they can build something more sophisticated themselves. Late adopters, in general, are striving for productivity gains, but move only after a large number of tools have been released onto the market and tested. These users take less risk and go with tools that have a verifiable track record.
Big Data Survey
The objective of the survey was to find out who was using which technologies and what enterprises saw as the principal advantages, if any, of making the substantial investment required to deploy them. The conclusions are based on a survey of 231 big data professionals who are either involved in procuring technologies for their companies or responsible for examining the possibilities of deploying such technologies.
The respondents were located in the US (49%) and EMEA (51%), with 60% working in IT departments and 36% holding business titles. The findings are telling and show a market that, while growing at an increasingly rapid rate, still has some way to go. The results showed that:
- 41% have a strategy for dealing with big data
- 48% of big data initiatives are driven by the business, 39% by IT and 13% cross-functionally
- Of those who said they did not have a big data strategy, 76% said they do not distinguish big data from existing corporate data
- Increasing the reach and accuracy of predictive analytics was the principal reason cited for deploying big data technologies in the first place
- 62% indicated that they have achieved big data business benefits, with the primary benefit being business process optimization (28%). Marketing and sales improvements were also cited as a major benefit (24%)
But it’s not all good, with over 10% of companies reporting they had not received any business benefits from their deployments, citing a lack of big data skill sets and governance and management issues as the principal reasons for their problems. In fact, the problem of big data expertise comes up here again, with resource issues around time and budget cited as the principal problem. Generally speaking, the data being fed into these systems comes from web and social media (57%) and sales data (54%).
Finally, to round off the figures, open source Apache Hadoop and Hadoop-based distributions represented over 60% of big data implementation technologies in use or considered for use.