The Gist

  • More data not always better. Having access to more data does not necessarily mean better insights if the quality of the data is poor.
  • Ensure data quality. Establishing standards for data quality, reimagining holistic data quality, establishing a data control center and eliminating data silos are ways to ensure data quality.
  • The right data gets results. Getting the right data sets on the right audience, having the right data architecture and using customer data platforms can help businesses make more informed decisions, maximize marketing investments and build stronger connections with customers.

Companies have access to more data today than ever before, but if the quality is questionable, so are any inferences from it. As the old — very old — computer science saying goes: “Garbage in, garbage out.”

“In today’s market, the risk of having poor quality data increases as the volume of data an organization ingests increases,” said Dan Lynn, Crux Data senior vice president of product. “One bad dataset can have a detrimental impact on your business, with that risk multiplied every time someone new touches a dataset. Yet the risk of poor data quality is often accepted because this is just the status quo for data operations.”

Ensuring the quality of external data is especially hard because suppliers in the data industry all have unique schemas and languages for transmitting their specific data sets to their customers, according to Lynn.

But Lynn and other data experts pointed to the following ways for organizations to ensure the quality of their data:

How to Define Your Organization's Data Quality Measures

Define your organization's standards for data quality, Lynn recommended. The first step in improving data quality is to know the various data you possess. Each data supply has its own sources, formatting, processing and abilities — no two are the same.

Having a proven set of data quality measures based on supplier and business specifications allows for quality measurement, anomaly detection and analytics-ready data, enabling the organization to make better business decisions.
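
As one illustration of what an anomaly check on a supplier feed might look like, the minimal Python sketch below flags a delivery whose row count strays too far from the feed's historical median. The function name, the 50% tolerance and the sample counts are all assumptions made for the example; they do not come from Lynn or Crux Data.

```python
from statistics import median


def is_row_count_anomalous(history, new_count, tolerance=0.5):
    """Return True if new_count deviates from the historical median
    by more than `tolerance` (expressed as a fraction of the median)."""
    baseline = median(history)
    if baseline == 0:
        # With an empty baseline, any non-empty delivery is unusual.
        return new_count != 0
    return abs(new_count - baseline) / baseline > tolerance


# Hypothetical daily row counts from one supplier feed.
history = [10_000, 10_200, 9_900, 10_100, 10_050]

print(is_row_count_anomalous(history, 10_150))  # close to the usual volume
print(is_row_count_anomalous(history, 3_000))   # far below the usual volume
```

A real check would likely use thresholds agreed with each supplier rather than a single fixed tolerance.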

A Comprehensive Approach to Data Quality

Reimagine holistic data quality across several dimensions, added Amaresh Tripathy, Genpact global leader of analytics. “Ensure complete coverage of traditional data quality, including completeness, conformity, duplicates, activity and timeliness. Include data management process metrics that create and manage foundational data like the right first time, cycle time, new customer set up experience, and the product.”

This holistic approach also establishes the downstream impact of data quality, Tripathy said. “For example, the number of incidents in business processes and their impact or rework due to poor quality of foundational data, compliance to regulations, and future readiness to support strategic initiatives like a CRM implementation or connected planning initiatives.”
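
Several of the traditional dimensions Tripathy lists (completeness, conformity, duplicates and timeliness) can be expressed as simple ratios. The Python sketch below is one possible way to compute them; the field names (customer_id, email, updated), the email pattern and the 30-day freshness window are assumptions chosen for illustration, not Genpact's definitions.

```python
import re
from datetime import date

# Deliberately loose pattern, used here only to illustrate a conformity rule.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def quality_metrics(records, required_fields, as_of, max_age_days):
    """Score a list of record dicts on four data quality dimensions (0.0 to 1.0)."""
    total = len(records)
    if total == 0:
        return {"completeness": 0.0, "conformity": 0.0,
                "duplicate_rate": 0.0, "timeliness": 0.0}

    # Completeness: every required field is present and non-empty.
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    # Conformity: the email value matches the expected format.
    conforming = sum(
        bool(r.get("email")) and bool(EMAIL_PATTERN.match(r["email"]))
        for r in records
    )
    # Duplicates: records that share a customer_id with an earlier record.
    unique_ids = len({r["customer_id"] for r in records})
    # Timeliness: the record was updated within the allowed window.
    timely = sum((as_of - r["updated"]).days <= max_age_days for r in records)

    return {
        "completeness": complete / total,
        "conformity": conforming / total,
        "duplicate_rate": (total - unique_ids) / total,
        "timeliness": timely / total,
    }


# Hypothetical customer records.
records = [
    {"customer_id": 1, "email": "a@example.com", "updated": date(2024, 1, 10)},
    {"customer_id": 2, "email": "not-an-email", "updated": date(2024, 1, 9)},
    {"customer_id": 2, "email": "b@example.com", "updated": date(2023, 6, 1)},
    {"customer_id": 3, "email": "", "updated": date(2024, 1, 8)},
]
print(quality_metrics(records, ["customer_id", "email"], date(2024, 1, 15), 30))
```

Scores like these could feed a dashboard, but which fields and thresholds matter is a business decision for each data domain.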

How to Identify and Tag Data Quality Issues

Supply baseline visibility of the size and scale of data quality issues through an automated dashboard coupled with a central team, enabling an organization to identify and tag data quality issues, Tripathy said.

“Provide a starting point for decision-makers to prioritize investments in improving data quality,” Tripathy added. This includes educating leadership on the composition of the data universe per domain (e.g., volume, distribution across countries, number of attribute groups and their quality), creating a heat map of data quality issues and identifying what needs attention.

“Create a central data dictionary focused on critical data elements for each data domain,” Tripathy said. “Begin with foundational master data like customer, product, finance, vendors and employee with documented business rules and data ownership.”
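
A data dictionary of this kind can start very small. The Python sketch below shows one hypothetical shape for it: each critical data element records its domain, an owner and a documented business rule that can also be run as a check. The element names, owners and rules are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DataElement:
    domain: str             # e.g. "customer", "product"
    name: str               # the critical data element
    owner: str              # team accountable for this element
    rule_description: str   # the business rule in plain language
    rule: Callable[[object], bool]  # the same rule as an executable check


# Hypothetical entries; a real dictionary would be agreed with the business.
DATA_DICTIONARY = [
    DataElement(
        "customer", "country_code", "Sales Operations",
        "Two-letter uppercase code",
        lambda v: isinstance(v, str) and len(v) == 2 and v.isupper(),
    ),
    DataElement(
        "product", "unit_price", "Finance",
        "Non-negative number",
        lambda v: isinstance(v, (int, float)) and v >= 0,
    ),
]


def find_violations(domain, record):
    """Return (element, owner, rule) for each rule the record breaks in a domain."""
    return [
        (e.name, e.owner, e.rule_description)
        for e in DATA_DICTIONARY
        if e.domain == domain and not e.rule(record.get(e.name))
    ]


print(find_violations("customer", {"country_code": "usa"}))
print(find_violations("product", {"unit_price": 9.5}))
```

Keeping the owner next to the rule means a failed check already names who should fix it.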

Eliminate Silos: Connecting Data to Improve Data Quality

Siloed data can’t be accessed by other departments, so those departments often develop their own data, meaning there are “multiple versions of the truth” inside the organization, according to CEO Bob Rogers.

AI and machine learning methods can connect these silos, Rogers said. “I once led a data science team trying to help a healthcare office organize 1.4 million incoming faxes. The incoming information resulted in three separate data silos. One where the raw data was dumped into a processing queue, another where patients’ appointment information was added to an e-health record, and finally a silo for diagnostic charts and scans for each patient.”

The data science team built an algorithm to pull key information from each fax that was then directly connected to the charts of existing patients, or it would create a new record if the patient was new, Rogers explained. “Using ‘shared identifiers,’ all three silos became accessible by all departments. Merging data silos automatically improves data quality, essentially making vital data ‘shareable.’”
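
Rogers does not describe the team's algorithm in detail, so the Python sketch below is only a hypothetical illustration of the linking step. It assumes the key fields, including a shared patient_id, have already been extracted from each fax, and it either attaches the fax to an existing patient record or creates a new one. All field names are invented for the example.

```python
def link_fax(patient_index, fax):
    """Attach one fax's extracted data to the matching patient record.

    patient_index maps a shared identifier (patient_id) to a single record
    that combines what used to live in three separate silos.
    """
    pid = fax["patient_id"]
    # Create a new record if this patient has not been seen before.
    record = patient_index.setdefault(
        pid,
        {"patient_id": pid, "raw_faxes": [], "appointments": [], "charts": []},
    )
    record["raw_faxes"].append(fax["fax_id"])
    if "appointment" in fax:
        record["appointments"].append(fax["appointment"])
    if "chart" in fax:
        record["charts"].append(fax["chart"])
    return record


# Hypothetical extracted faxes.
index = {}
link_fax(index, {"fax_id": "F1", "patient_id": "P100", "appointment": "2024-02-01"})
link_fax(index, {"fax_id": "F2", "patient_id": "P100", "chart": "xray-17"})
link_fax(index, {"fax_id": "F3", "patient_id": "P200"})
print(index["P100"])
```

Because every record hangs off the same identifier, any department can look up a patient once and see the raw fax, the appointment and the chart together.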

Focus on the Right Audience: Getting Useful Data Sets

The first step to data quality is to ensure that you’re getting useful data. That means getting the right data sets on the right audience, said Raymond Velez, Publicis Sapient chief technology officer. “With the right data, a business will have enough inputs to make predictive components of machine learning valuable.”

You also need the right data architecture, one that puts the data into an accessible format and structure. Customer data platforms (CDPs) provide a central “hub” for data and help businesses know their customers: each individual data source plugs into a central source that can scale the customer data across the entire business via the cloud, Velez said.

“CDPs collect customer data from multiple sources, create a unified customer view and extract up-to-date insights to make them available for activation," Velez said. "Understanding customers through data allows the business to make more informed decisions, maximize marketing investments and build stronger connections with customers.”

CDPs connect all relevant omnichannel information about the customer while quickly and accurately generating insights that can be made available in real time to inform other systems and stakeholders at scale, Velez added. “By centralizing what they already know, and then connecting what they learn, companies achieve a unified, 360-degree view of the customer that can inform marketing and broader business decisions.”

Final Thoughts: Other Strategies for Ensuring Data Quality

There are several other ways to vet data quality as well, such as ensuring you have enough data for your results to be statistically significant and relying on first-party data, which tends to be more accurate than third-party data. The better you ensure the quality of your data, the better you ensure that any analysis that relies on it will be accurate.
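
How much data counts as "enough" depends on the question being asked. As a rough illustration, the Python sketch below uses the standard sample-size formula for estimating a proportion, n = z² · p(1 − p) / e². It assumes a simple random sample from a large population; the 95% confidence level (z = 1.96) and the conservative p = 0.5 are assumptions for the example, not figures from the experts quoted here.

```python
import math


def required_sample_size(margin_of_error, z_score=1.96, expected_proportion=0.5):
    """Minimum sample size to estimate a proportion within margin_of_error.

    Applies n = z^2 * p * (1 - p) / e^2 and rounds up to a whole record.
    """
    p = expected_proportion
    n = (z_score ** 2) * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)


print(required_sample_size(0.05))  # plus or minus 5 percentage points
print(required_sample_size(0.03))  # a tighter margin needs more data
```

Under these assumptions, a margin of five percentage points needs 385 records and a margin of three points needs 1,068, which shows how quickly the requirement grows as the margin shrinks.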