Rescue Your Data From the Big Data Landfill

These professionals now have to deal with a data landfill -- mountains of data that enterprises are collecting and trying to use to make business decisions. This fast-changing world requires a new strategy that goes beyond the current data preparation methods used by IT.

The Current Challenge of Data Preparation

Data analysts have always faced difficulties in preparing data for analysis. There is almost always a diversity of sources, both internal (such as sales, manufacturing and finance) and external (such as third-party providers, public sources and the Internet). Data also comes from a variety of locations and in various formats, such as Excel, JSON and XML.

To date, analysts have spent the vast majority of their time preparing data and much less time doing actual analysis, a problem that has only been exacerbated by the challenges of getting these rapidly growing data sets into the right form. A recent InformationWeek study on big data reported that 59 percent of respondents said data quality problems are the biggest barrier to successful analytics. More than ever, data preparation is a significant impediment to informed and timely decision-making for marketing departments looking to take advantage of their big data.

A recent article in the Harvard Business Review reported research on best practices for how businesses should use data. The article reported that companies with a culture of “evidence based decision-making” consistently see improvements in their business performance. One of the hallmarks of such companies, the research found, was that they “ensure that all decision makers have performance data at their fingertips every day.”

The implications for data preparation are clear: It cannot be a month-long process handled exclusively by IT professionals. Basing a campaign on data that old might have been acceptable 10 years ago, but today’s business analysts need to conduct analytics quickly if they want to keep up.

More than ever, marketing departments need to be able to perform their analytics ad hoc.

Envisioning a New Method

As more data is collected from more sources, IT departments have begun to create data landfills. These landfills can be rich repositories -- but only if users can reach the “right” data easily and quickly. Having the right data is one of the biggest challenges for envisioning a new method of data preparation.

So what is right data? It’s the information necessary to answer a given business question. Four characteristics mark right data:

  • Complete – having all the information needed to answer the question, no matter how many sources it may come from.
  • Contextual – the ability to look at the data from as many different perspectives as a user may wish, an essential element in ad hoc and what-if analysis.
  • Consumable – having the data in whatever formats are needed so analysts and decision makers can use the BI tools they’re familiar with.
  • Clean – the data is accurate in all respects, both in content and format.

Now let’s look at a hypothetical example of how advanced data preparation methods can help business analysts and marketing departments. Take the case of a global consumer packaged goods company that sells its products through 400 distributors and needs better visibility into the correlation between distributor incentives and sales results across dozens of brands.

The challenge is that the data necessary to create this picture exists in many formats, spread among multiple sources, including purchasing, distribution, marketing and retailers. Moreover, all of this data -- structured and unstructured -- has to be aligned in Excel spreadsheets.

The variety and volume of data are so great that employees at the company have never even attempted to pull all the data together. It is just too daunting a task. Plus, any IT effort to accomplish data unification would be a one-time effort. The end result wouldn’t be flexible enough to accommodate future changes or different kinds of business questions.

In situations like this, there needs to be a method for businesses to complete, contextualize and clean data so it is consumable without having to fully rely on IT, while still giving them transparency and governance where needed.
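To make the idea concrete, here is a minimal sketch in Python of what such a method might look like for the distributor example. Everything in it is an assumption made for illustration: the sample records, the field names (distributor, brand, units, incentive_usd) and the helper functions are hypothetical, not drawn from any real company or product. It cleans two sources in different formats, flags where the combined picture is incomplete, summarizes the result from more than one perspective and writes it out as CSV that a spreadsheet can open.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical inputs: sales exported as JSON, incentives exported as CSV.
SALES_JSON = """[
  {"distributor": " d-001 ", "brand": "Alpha", "units": "120"},
  {"distributor": "D-002", "brand": "alpha", "units": "80"},
  {"distributor": "D-001", "brand": "Beta", "units": "45"},
  {"distributor": "D-003", "brand": "Beta", "units": "60"}
]"""

INCENTIVES_CSV = """distributor,incentive_usd
D-001,500
d-002,250
"""


def clean_id(raw):
    """Clean: trim whitespace and normalize case so IDs match across sources."""
    return raw.strip().upper()


def load_sales(text):
    """Parse the JSON source and normalize each field's content and type."""
    return [
        {
            "distributor": clean_id(rec["distributor"]),
            "brand": rec["brand"].strip().title(),
            "units": int(rec["units"]),
        }
        for rec in json.loads(text)
    ]


def load_incentives(text):
    """Parse the CSV source into a lookup of distributor -> incentive."""
    reader = csv.DictReader(io.StringIO(text))
    return {clean_id(r["distributor"]): float(r["incentive_usd"]) for r in reader}


def unify(sales, incentives):
    """Complete: join the sources and report distributors with no incentive data."""
    unified, missing = [], set()
    for row in sales:
        incentive = incentives.get(row["distributor"])
        if incentive is None:
            missing.add(row["distributor"])
        unified.append({**row, "incentive_usd": incentive})
    return unified, sorted(missing)


def units_by(rows, key):
    """Contextual: total units sold from any perspective (brand, distributor...)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["units"]
    return dict(totals)


def to_csv(rows):
    """Consumable: emit CSV text that Excel or a BI tool can open directly."""
    buf = io.StringIO()
    fields = ["distributor", "brand", "units", "incentive_usd"]
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)  # missing incentives are written as empty cells
    return buf.getvalue()


sales = load_sales(SALES_JSON)
incentives = load_incentives(INCENTIVES_CSV)
unified, missing = unify(sales, incentives)
```

An analyst could then call units_by(unified, "brand") or units_by(unified, "distributor") to answer different questions from the same prepared data, and check the missing list to see which distributors still lack incentive records. A real effort would involve far more sources and messier data, but the four steps stay the same.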

Keeping the Data Out of the Landfill

It’s hard to argue with the research reported in the Harvard Business Review that identified “evidence based decision-making” as a valuable business practice. Who doesn’t agree that real facts are a sound basis for action?

But in many enterprises, this is easier said than done. Rapidly growing data collections are making it increasingly difficult for marketers to capitalize on available information, and it’s simply too hard and too time-consuming to get all the available data into a consistent, reliable form that can be used for ad hoc business analysis.

New criteria for what makes data “right,” along with new methods for producing that data, are needed to give business users much quicker access to diverse sources of data. That will make the promises of big data more realistic and keep organizations and marketers out of the landfill.

Title image by Huguette Roe (Shutterstock)