Everyone is doing big data, or at least it would appear so from the forums, blog posts and media articles currently appearing on the Web. But big data solutions are expensive and require investment in human resources and technology, which not every enterprise can afford. So how widespread is big data adoption? Open source integration specialist Talend went to the market to find out and came back with some interesting results.

Big Data Problems

The findings are contained in a research report appropriately entitled How Big Is Big Data Adoption. Its starting point is that big data will produce a fundamental shift in the way enterprises do business, and even in the very nature of enterprises as we know them today.

The problems around big data stem from the fact that the amount of data globally is growing by 40% per year, and more and more data is becoming accessible to wider audiences as techniques for capturing physical data improve. Talend cites the example of Walmart, which handles more than 1 million customer transactions every hour; these are imported into databases estimated to contain more than 2.5 petabytes of data.
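To put the 40% figure in perspective, here is a minimal Python sketch of what that growth rate implies if it simply compounds year after year. The constant-rate assumption and the function names are ours for illustration, not something taken from the Talend report.

```python
import math

# Figures quoted in the article.
ANNUAL_GROWTH = 0.40                   # global data volume grows 40% per year
WALMART_TX_PER_HOUR = 1_000_000        # "more than 1 million" per hour (lower bound)


def growth_factor(years, rate=ANNUAL_GROWTH):
    """Multiple of today's data volume after `years`, assuming the rate stays constant."""
    return (1 + rate) ** years


def doubling_time(rate=ANNUAL_GROWTH):
    """Years needed for the data volume to double at a constant annual rate."""
    return math.log(2) / math.log(1 + rate)


def per_second(per_hour):
    """Convert an hourly count into an average per-second rate."""
    return per_hour / 3600


if __name__ == "__main__":
    for years in (1, 2, 5, 10):
        print(f"after {years:>2} years: {growth_factor(years):.1f}x today's volume")
    print(f"doubling time: {doubling_time():.2f} years")
    print(f"Walmart: at least {per_second(WALMART_TX_PER_HOUR):.0f} transactions per second")
```

Under that assumption, the volume of data roughly doubles every two years, which goes some way to explaining why tools sized for today's data are quickly overwhelmed.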

What is Big Data

Just to be clear from the outset about what exactly we are talking about here: big data consists of complex, diverse sets of structured and unstructured data that group too much information together to be analyzed by traditional data management tools and practices.

The data comes from both inside and outside the enterprise, including social media and internet text, financial transactions or any other enterprise business.

Big data business information drivers

Most importantly, in the current big data discussions preoccupying enterprises everywhere, it comes from customer interactions. Big data and big data analytics kick in at the point where conventional data management tools are overwhelmed by data and can no longer manage or analyze it.

As in other technology spaces, the early adopters are driven by the need to secure a competitive edge. They take the biggest risks and deploy the earliest tools so that they can build something more sophisticated themselves. The late adopters, in general, are striving for productivity gains, but only after a large number of tools have been released onto the market and tested. These users take less risk and go with tools that have a verifiable track record.

Big Data Survey

The objective of the survey was to find out who was using which technologies and what enterprises saw as the principal advantages, if any, of making the substantial investment required to deploy them. The conclusions are based on a survey of 231 big data professionals who were either involved in procuring technologies for their companies or responsible for examining the possibilities of deploying such technologies.

The respondents were located in the US (49%) and EMEA (51%), with 60% of them working in IT departments and 36% holding business titles. The findings of the survey are telling and show a market that, while growing at an increasingly rapid rate, still has some way to go. The results showed that:

  • 41% have a strategy for dealing with big data
  • 48% of big data initiatives are driven by the business, 39% by IT and 13% cross-functionally
  • Of those that said they did not have a big data strategy, 76% said they do not distinguish big data from existing corporate data
  • Increasing the reach and accuracy of predictive analytics was the principal reason cited for deploying big data technologies in the first place
  • 62% indicated that they have achieved big data business benefits with the primary benefit being business process optimization (28%). Marketing and sales improvements were also cited as major drivers (24%)

But it’s not all good: over 10% of companies reported that they had not received any business benefits from their deployments, citing a lack of big data skill sets and governance and management issues as the principal reasons for their problems. The problem of big data expertise comes up again here, alongside resource issues around time and budget, which were cited as the principal problem. Generally speaking, the data being fed in comes from web and social media (57%) and sales data (54%).

Finally, to round off the figures, open source Apache Hadoop and Hadoop-based distributions represented over 60% of big data implementation technologies in use or considered for use.

Big Data Background

According to Talend, big data and the issues around it rose on the technology horizon around ten years ago, when Doug Laney from the Meta Group (now part of Gartner) noted that companies needed to start looking beyond traditional data management approaches to deal with the growing amount of data entering the enterprise. This particularly applied to early adopters like Google and Facebook that need to gather and analyze huge volumes of data as part of their business.

More importantly, Laney said, the business models of these kinds of companies required that they develop big data strategies. Others incorporated the strategies into their wider data management strategies, managing traditional data and big data as parts of a wider whole.

The growth rate here is rapid. In 2011, the report says, only nine companies were offering products around big data. Today, Talend says, there are around 120. For those companies that do have a big data strategy, it is being driven by a number of company functions, which indicates that big data has now moved beyond the early adopter stage. IT, it seems, is still a driving force for big data, through a bottom-up approach that offers the prospect of greater efficiencies in gathering and analyzing large data sets.

But that’s not the entire story. The research also shows that there is considerable business interest in big data, with 48% of those surveyed citing a drive for big data adoption from the business side of the house, with the focus on increased customer satisfaction, better revenues and faster time to market.

Big Data Business Benefits

The research found that the biggest driver for the deployment of big data solutions is enhancing the accuracy of predictive analytics (68%) and the ability to analyze current and historical data to make future predictions.

Big Data business drivers

From this, enterprises expect to enhance their ability to predict emerging trends and to improve revenue, both by optimizing existing revenue and by creating new income streams.

Nearly two thirds (62%) of those that implemented big data projects said that they had achieved business benefits, with business process optimization (28%) and improved marketing and sales (24%) as the principal gains. However, the flip side of the coin is that the remainder, or 38%, said that they had not, or were unaware of any benefits that big data had brought them.

Big Data Integration

So where and how is big data deployed in the enterprise? The most common use cases were marketing campaign analysis, risk management, predictive analytics and recommendation engines. The research also found that IT is integrating existing data warehouses and business intelligence systems with enterprise sources of structured and unstructured data.

The most common applications that are being integrated are financial transactions (48.4%), social media, and clickstream and Internet text (48.4%). Web logs and call detail records are both at 28.4%. Analysis of social media and text is being used to identify super users, that is, users who have the most influence on others in the community. Financial transactions are also being used to analyze how users are spending money and to build complete views of customer buying patterns and behavior.

Implementation Challenges

We have seen in the past that big data implementation challenges are not just technical but also a matter of human skills and expertise, and that companies like IBM are dealing with this problem by setting up campuses with a focus on big data and big data analytics.

It has been hard to quantify the problem, although Talend says that in the US alone 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts, are needed to analyze big data and make decisions that are accessible to the wider business community.

Big data challenges

The problem is so widespread that in this survey, 52% of respondents specifically cited a lack of expertise in-house as their major problem, after budgets and time allocation.

On top of this -- and something that is clearly related even if the research doesn't say so -- 48% report that data quality is still a challenge, with only 11% citing convincing upper management of the necessity of big data analytics as the main problem.