It has been argued that “data is the lifeblood of the digital economy,” and even that “data is the lifeblood of capitalism.”

Without addressing the merits of those specific arguments, what is clear is the outsized role data plays in modern business. Unfortunately, while the importance of data is uncontested, and while its application in the form of artificial intelligence (AI) has captured the imaginations of both businesses and the public at large, we have paid comparatively less attention to actually mobilizing data so it delivers actionable insights.

The way to do that is through the four stages of the data mobilization process: data audit, data consolidation, analytics and AI. I will address the data audit and consolidation stages in this post, the first of a two-part series, and then I will discuss analytics and AI in my next post.

Introducing a Data Audit

Before you can use data effectively, you first need to understand what you have. Tracing your customer journey is an important exercise regardless, but it can also be a great way to identify how and where different parts of your data stream are collected and by whom. Performing this kind of data audit gives an organization the opportunity to step back, take a deep breath, inventory the data it collects and assess its quality. This means not only reviewing the “usual suspects,” like customer relationship management (CRM) systems (Salesforce, Microsoft Dynamics, Oracle Sales Cloud, etc.) and marketing automation platforms (HubSpot, Marketo, Pardot, etc.), but also studying the myriad other places in which customer data is captured.

In most organizations, 100 percent adherence to data collection policies is a work in progress. For example, some sales reps may bypass their CRM systems and instead record information in personal Word documents or Excel spreadsheets. Not only does this mean that critical information is left out of analysis, it also creates a regulatory compliance nightmare, because new laws (GDPR, for example) require close scrutiny of customer data.

The problems are not always the rep’s fault. The underlying cause may be that the CRM tool is cumbersome and needs to be streamlined. It may also be that sales reps have been burdened with too many data collection responsibilities, in which case an examination of the customer journey can help you better delegate responsibility for data collection.

A thorough data audit will help you identify gaps and inefficiencies in the data collection process.

Nobody wants to be the bad cop, but almost every organization needs one. A well-thought-out process that results from a careful data audit and includes efficient tools and proper distribution of responsibilities is key, but someone still needs to make sure data collection policies are being followed. It can be a painstaking process, but sales executives will need to monitor the data landscape regularly to make sure everyone is doing their part.

It is also important to have a data governance strategy with clear guidelines on how data is recorded. To use a simple example, if some teams record the state in a customer’s billing address as “AK” and others record it as “Alaska,” that will cause headaches during the consolidation step. Of course, it is possible to standardize data after it has been collected, but that is a wasteful and inefficient way to do things. Instead, consider the 1-10-100 rule: what costs $1 to prevent costs $10 to fix and $100 to clean up once it becomes a mess. It is easier to adopt data governance guidelines and enforce them than it is to go back and fix inconsistencies after the fact.
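To make the “AK” versus “Alaska” example concrete, here is a minimal sketch of enforcing one format at the point of entry. The lookup table and the function name `normalize_state` are hypothetical and deliberately incomplete; a real table would cover every state and territory your organization records.

```python
# Hypothetical, deliberately incomplete lookup table (assumption for illustration).
STATE_ABBREVIATIONS = {
    "alaska": "AK",
    "alabama": "AL",
    "arizona": "AZ",
}


def normalize_state(raw: str) -> str:
    """Return the two-letter code for a state name or abbreviation."""
    cleaned = raw.strip()
    # Already a known abbreviation, possibly in the wrong case.
    if cleaned.upper() in STATE_ABBREVIATIONS.values():
        return cleaned.upper()
    try:
        return STATE_ABBREVIATIONS[cleaned.lower()]
    except KeyError:
        # Reject unknown values at entry instead of cleaning them up later.
        raise ValueError(f"Unrecognized state value: {raw!r}") from None


print(normalize_state("Alaska"))
print(normalize_state(" ak "))
```

Rejecting an unrecognized value when it is entered is the "$1 to prevent" end of the 1-10-100 rule; the same check run months later against a consolidated data set is the expensive end.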

Performing a Data Audit

Rudimentary as it may sound, the first step in a data audit is a series of interviews. The purpose of the interviews is to understand the five W’s (what, why, where, who and when) of your data. The previous step involved understanding the processes and stakeholders that collect data. Now, you need to contact each stakeholder throughout your customer journey and explore the five W’s in depth. In-person conversations are ideal, but meeting with every person one-on-one may not be possible in larger organizations. If that is the case, an alternative is to collect the bulk of the information via questionnaires sent to all stakeholders and then use that information to facilitate in-depth discussions where appropriate.

One of the first questions to answer, and one that will drive subsequent discussions, is the “what” question: What data is being captured by this business process? You will need to answer that question at the field level: name, address, phone number, email address and so on. It’s not enough to generalize and identify “customer demographic” data, because that can mean different things in different contexts.

Once you have identified “what” data your organization is capturing, you can determine “why” that data is being captured. Given the fact that organizations that collect data must deal with increasingly complex regulatory requirements, it’s more critical than ever to have proper justification to capture, process and store customer data. With an understanding of why the data in each field is being collected, an organization can more easily identify which data is no longer needed (and can therefore be deleted) and which data is necessary for business processes and analyses.

The next step is to determine the “where,” which means identifying both the physical and technical location of the data. The goal is to be able to say, for example, “all information regarding a customer’s physical address is collected at the qualification stage by sales development representatives and stored in Salesforce, with the data residing in a Salesforce data center in either Dallas or Chicago.” Having that information helps you comply with both data residency laws and requests to delete personal information.

As for the “who” and “when” of customer data, they are less valuable for analysis, but it’s good practice to document both.

The “who” question refers to who has access to customer data. In other words, you need to be able to answer questions like these: Is customer data visible to all members of your sales organization, or only to specially designated administrators? Can anyone edit or delete customer data, or is that an administrator-only privilege? Which fields are available to be viewed by which personnel? Are passwords required for viewing customer data? What about two-factor authentication? How is data encrypted? Is it only encrypted in transit or is it also encrypted at rest? What security standards does that encryption meet?

Answering the “when” question involves determining how long data is retained. Is every data field held forever? Only some? Is data automatically deleted at set intervals? You need to answer those questions on a regular basis and enforce policies regarding how long data is retained.
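One way to keep the answers to the five W’s usable after the interviews is to record them per field in a simple inventory. The sketch below is only an illustration: the class `FieldInventoryEntry`, its attribute names and the two helper functions are assumptions of mine, not a standard schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FieldInventoryEntry:
    """One row of a hypothetical data inventory, organized by the five W's."""
    field_name: str                # what: the field being captured
    purpose: str                   # why: empty if nobody could justify it
    system: str                    # where: the system that stores the field
    collected_by: str              # where in the journey, and by whom
    access_roles: List[str]        # who: roles allowed to view the field
    retention_days: Optional[int]  # when: None means "kept indefinitely"


def audit_findings(entries: List[FieldInventoryEntry]) -> List[str]:
    """Flag fields with no documented purpose or no retention period."""
    findings = []
    for entry in entries:
        if not entry.purpose.strip():
            findings.append(f"{entry.field_name}: no documented purpose")
        if entry.retention_days is None:
            findings.append(f"{entry.field_name}: no retention period")
    return findings


def is_past_retention(entry: FieldInventoryEntry, collected_on: date, today: date) -> bool:
    """True if a value collected on `collected_on` has outlived its retention period."""
    if entry.retention_days is None:
        return False
    return today > collected_on + timedelta(days=entry.retention_days)


inventory = [
    FieldInventoryEntry("billing_address", "invoicing", "CRM",
                        "sales development rep", ["admin"], 730),
    FieldInventoryEntry("fax_number", "", "spreadsheet",
                        "sales rep", ["everyone"], None),
]
for finding in audit_findings(inventory):
    print(finding)
```

Fields that surface in `audit_findings` are candidates for the follow-up discussions described above: either someone supplies a justification and a retention period, or the field is a candidate for deletion.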

While a full data audit is an arduous task, it’s one that will generate great rewards. By knowing the five W’s of your data, you’ll be in a much better position to squeeze the maximum amount of value from it. You’ll also be in a much better position to withstand regulatory inquiries.

You should conduct full data audits annually in order to ensure that information remains current and that policies are being upheld.

But once you have inventoried your data and put in place a robust process for ongoing collection, how do you bring it all together? Through the data consolidation process.

Data Consolidation

Data identified at the audit stage needs to be tagged, transformed into digital format where necessary and consolidated into a single data lake. Unfortunately, as any practitioner can tell you, raw data is rarely pretty. Your data lake is likely to contain data from a variety of sources and in a variety of formats, making it unfit for reliable analyses. Data consolidation, therefore, typically involves two distinct steps: connecting different data sources and transferring data into a flexible model that reports can be run against.

A clear data governance strategy, along with unique identifiers that allow you to connect pieces of information about customers and products from all sources, is essential for the first step.
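As a rough illustration of why unique identifiers matter, the sketch below merges records from two sources into one profile per customer. The key name `customer_id`, the source labels and the "first source wins" conflict rule are all assumptions made for the example; your governance strategy would define the real ones.

```python
def link_by_customer_id(crm_rows, marketing_rows):
    """Merge rows from two sources into one profile per customer_id."""
    merged = {}
    for source, rows in (("crm", crm_rows), ("marketing", marketing_rows)):
        for row in rows:
            profile = merged.setdefault(
                row["customer_id"],
                {"customer_id": row["customer_id"], "sources": []},
            )
            profile["sources"].append(source)
            for key, value in row.items():
                # Assumed conflict rule: the first source to supply a field wins.
                profile.setdefault(key, value)
    return merged


crm = [{"customer_id": "C-1", "name": "Ada", "state": "AK"}]
marketing = [
    {"customer_id": "C-1", "email": "ada@example.com"},
    {"customer_id": "C-2", "email": "bob@example.com"},
]
profiles = link_by_customer_id(crm, marketing)
print(profiles["C-1"])
```

Without a shared identifier, the same merge has to fall back on fuzzy matching of names or addresses, which is exactly where inconsistent formats start to hurt.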

The second step involves a good ETL (extract, transform and load) process that takes data from transactional data sources and fits it into a data model that can be leveraged within a data warehouse like Amazon Redshift or Oracle Database.
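For readers who have not seen one, a toy ETL pass might look like the following. It uses Python’s built-in `sqlite3` module with an in-memory database purely as a stand-in for a warehouse; the table name `fact_orders`, the column names and the sample rows are hypothetical.

```python
import sqlite3


def extract(source_rows):
    """Extract: in practice this would read from a transactional system."""
    return list(source_rows)


def transform(rows):
    """Transform: trim whitespace, upper-case the state, store amounts as cents."""
    return [
        (
            row["order_id"],
            row["customer_id"].strip(),
            row["state"].strip().upper(),
            round(float(row["amount"]) * 100),
        )
        for row in rows
    ]


def load(conn, records):
    """Load: write the cleaned records into a reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer_id TEXT, "
        "state TEXT, amount_cents INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?, ?)", records)
    conn.commit()


raw_orders = [
    {"order_id": 1, "customer_id": " C-1 ", "state": "ak", "amount": "19.99"},
    {"order_id": 2, "customer_id": "C-2", "state": "AK ", "amount": "5.00"},
]
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(raw_orders)))
print(warehouse.execute(
    "SELECT state, SUM(amount_cents) FROM fact_orders GROUP BY state"
).fetchall())
```

Because the transform step normalizes "ak" and "AK " to the same value, both orders roll up under a single state in the report query, which is the point of fitting data to a model before anyone reports against it.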

Once an organization has done that and has cleansed, transformed and cataloged its data, business users can easily ask the right questions without relying on additional technical help.

It’s important to note that not every piece of data needs to be duplicated and placed into a data warehouse. Taking the time to evaluate what kinds of reports and analyses are important to your organization (and what data sets are required for them) will save you time and money.

Preparing for Analytics and AI

Data audit and consolidation will definitely be the more painful parts of your journey to data mobilization. But they are fundamental and necessary if you want to avoid the dreaded 80/20 data analysis dilemma. You may be familiar with the phrase “garbage in, garbage out.” That’s as true in data science as it is anywhere. It doesn’t matter how powerful your models are if you’re training them on bad data. And thus, data analytics professionals face the 80/20 dilemma: the often-cited observation that data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning and reorganizing huge amounts of data.

Having addressed the audit and consolidation steps, I will focus next on the “fun” parts of data mobilization: analytics and AI.
