Who Says 'Big Data Needs to Shrink to Grow'?
While most people were busy nursing their New Year’s Eve hangovers or getting started on their resolutions on January 1, the New York Times ran a rather interesting headline: "Big Data Shrinks to Grow." We looked at it and said, really? That’s not our experience, but we continued to read anyway.

The article states that interest in big data is waning; the author bases his claim on the fact that Google searches for the term “big data” are no longer rising. We think that may be a pretty lousy basis for an argument; after all, searches for “big data” may have leveled off simply because many people already know what it is. (As I stated in my year-end big data wrap-up, “If 2012 was the year your grandmother instigated big data conversations at the dinner table (yes, the 'buzz' around it actually was that big) then 2013 will go down in history as the year the enterprise began to make serious plans around it.”)

We also hope that enterprises don’t craft their big data strategies via Google search. But enough about that.

The article also points out that Kaggle (a site that hosts data scientist competitions) has changed its business model from one that spans the marketplace to one that specializes in specific industries, starting with Oil and Gas.

Interesting? Yes, but a trend? We're not so sure. Hopefully data scientists, statisticians or even high school students with a little common sense will point out that one or two companies changing their business strategies does not a market trend make.

The article does bring up more interesting questions like: do data scientists and other professionals who work with big data need to have deep industry insight to deliver discoveries that warrant the costs of wrestling with big data in the enterprise?

It quotes Kaggle founder Anthony Goldbloom as saying:

“We liked to say ‘It’s all about the data,’ but the reality is that you have to understand enough about the domain in order to make a business. What a pharmaceutical company thinks a prediction about a chemical’s toxicity is worth is very different from what Clorox thinks shelf space is worth. There is a lot to learn in each area.”

Goldbloom makes a good point, but does this mean that data scientists who don’t have industry-specific training can’t yield the returns that big data hype suggests?

If so, then the big data business may be in serious trouble, because data scientists are rare enough already. Add another skill to their heavy list of “must-have” requirements, and we’ll have to wait not only for them to finish their postgraduate training but also for them to get jobs and work for five years before they can add value.

The Experts Weigh In

Enough about what we think. We asked four firms that provide products and services to enterprises to comment on questions like “Does big data need to shrink to grow?” and “Do data scientists need to have industry-specific experience to be worthy of their hefty price tags?”

Here’s what they said:

On the question of whether data scientists need domain knowledge to make cost-justifiable contributions, Sandy Steier, CEO and co-founder of 1010data, said:

“To add value to any industry, a person would presumably need a certain amount of both analytical expertise and domain knowledge. An interesting question is, is it better to start with domain knowledge and learn big data analysis from there, or is it better to start with analytical experience and then apply it to a new domain?

“I believe the latter is the easier path to success. I'm sure some domain experts would want to emphasize domain knowledge, but in my 35 years of analytical experience and 14 years of growing 1010data, that's what I have seen pretty consistently. Certainly the normal path is to be schooled in analysis or programming, get a job in a specific industry, and then perhaps even change industries.”

Byron Banks, vice president, database & technology at SAP, offered a different point of view:

“We believe industry and domain expertise is essential for big data initiatives to succeed. Big data, like any new technology or IT trend, will only prosper if there is a direct link to quantifiable business results. Unfortunately there have been a number of cases where the technology is driving the project -- organizations collecting lots of data because they now can, rather than first focusing on the needs of the business and then looking for the best approach, big data or otherwise, to support those goals. This has led to some of the hype-cycle criticisms regarding big data.

“At SAP we are focused on helping customers to 'big data enable' their existing business processes and to go after new business opportunities. We believe the best way to get started in the right direction is to engage data scientists who know the customer’s industry, and can make the link between the business goals, potential data sources, and the IT organization and technologies. SAP has a team of data scientists, hired from industry, not IT, and we also rely on our business partners to ensure deep industry-specific knowledge is brought to every big data project we undertake.”

On the question of whether big data (the industry) must “shrink to grow,” Stefan Andreasen, CTO and co-founder at Kapow Software, a Kofax company, said:

“The answer to this all depends on how you define big data. If you mean that big data is only about data processing and analytics, yes, you might be right that the initial 'hype' is over and there will be a 'shrink' before we see the growth again.

However, if you look at big data as a whole new way to work with data, then it will only grow, not shrink.

For me the essence of big data is the need to work with more and more data sources, each spitting out ever changing data in ever changing formats, to find the right answer. For example, when shopping for an airline ticket, you go to more and more places, over many days, to secure a better price. That way of defining big data will only grow.

Said in other words, thinking of the 3 V’s of big data -- Volume, Velocity and Variety -- yes, it might be true the need to process more volume will shrink, but the need to get real time data from more sources will only grow.”

Michael Collins, head of product marketing at LucidWorks, offered a different answer:

“The concern with big data isn’t the growth, but that organizations are focusing on the size and not working to rationalize how their data-driven applications bring forward to users the relevant information that is needed for useful decision making. Verticalization will not address the many various types of data such as financial data (as an example) that is developed daily. As more organizations realize that the first instinctual human behavior is to search, they will step back and re-examine their big data stack and understand how to capitalize on one option while addressing the main issue that has no capability of shrinking. Analytics and BI are rendered of less value when the data they are working with isn’t relevant -- enter search.”

Something Else to Consider

In my mind there’s still something else to consider. Exactly who are the “big data” experts and data scientists bringing expertise into the industry? As a headhunter, I know that for every nine posers there is only one who actually has the experience needed to deliver results.

To enterprise project owners and C-level executives, I beg the following: look at the resumes of the experts you’re bringing in to work on your big data initiatives (even if they work for consulting firms and went to good schools). What were they doing five years ago? When did they have the time and the opportunity to gain their big data and data science expertise? Are you handing them their first opportunity to analyze big data, or to work with stats and algorithms, since their undergrad days? (Fine if you are, but you should know that that’s what you are doing.)

Just because “experts” have trained themselves (or been trained) to talk a good game doesn’t mean they can play one. Sites like LinkedIn are full of discussions on “Answers to questions you’ll be asked on Hadoop interviews.”

Is There Really Gold in Big Data? Is It Worth Going After?

CEOs who have developed solid relationships with great data scientists and other professionals skilled in working with big data sets will answer those questions “Yes” without hesitating, but the key here may be that they’ve enjoyed solid interactions among the business, the scientists and the big data pros. It’s worth noting that what may be “interesting” to the scientist may not be worthwhile for the business, and vice versa. We need these experts to talk.

What do you think?

Title image by fluidworkshop (Shutterstock)