New Data Provenance Standards Released: Impact on CX Leaders Explained

The Gist

Enhanced transparency. The Data Provenance Standards ensure clarity on the origins and usage of data in AI applications.
Cross-industry collaboration. Diverse industries collaborate to develop standards that promote responsible data use.
CX implications. New standards aid CX leaders in managing customer data more effectively, enhancing trust and personalization.

Back in November 2023, the Data & Trust Alliance (D&TA) announced eight new standards that bring transparency to dataset origins for data and artificial intelligence (AI) applications.

Now, after testing and validation with more than 50 organizations inside and outside of the Alliance — including IBM, Walmart, Pfizer and others — the D&TA has released version 1.0.0 of its Data Provenance Standards.

Why Introduce Data Standards?

In the race to adopt AI, members of the Alliance, along with other businesses, sought out better rules around data quality.

“AI is all about the data. In fact, data may be the only sustainable source of competitive advantage,” said Rob Thomas, SVP, software and chief commercial officer at IBM and chair of the D&TA Data Provenance Standards initiative.

There is little transparency around the data that trains and feeds AI models. And the consequences, according to the Alliance — such as copyright infringement and questions around privacy and authenticity — could impact the technology’s business value and its acceptance by society.

In fact, according to a recent IBM survey, 61% of CEOs say lack of clarity on data lineage and provenance is a top barrier to adoption of generative AI.

What Are the Data Provenance Standards?

Back in November, the D&TA originally proposed eight standards. Now, after testing, validation and feedback from small- and medium-sized enterprises, version 1.0.0 of the Data Provenance Standards contains 22 metadata fields grouped into three standards. That metadata is intended to travel with the dataset as it’s shared and transformed.

The three standards are:

Source: Identifies the origin of the current dataset, including dataset name, unique URL, dataset issuer and description of the dataset.
Provenance: Concerns the data origin geography, dataset issue date, range of dates for data generation, data format and more.
Use: Covers the intended use of the data, including confidentiality classification, license to use, proprietary data presence and more.
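To make the idea of metadata that travels with a dataset more concrete, here is a minimal Python sketch of what such a record might look like. It is an illustration only: the class names, field names and example values are assumptions based on the descriptions above, not the official D&TA schema, and it covers only a handful of the 22 fields.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


# Hypothetical field names, loosely based on the three standards described above.
@dataclass
class Source:
    dataset_name: str
    dataset_url: str
    issuer: str
    description: str


@dataclass
class Provenance:
    origin_geography: str
    issue_date: date
    generation_start: date
    generation_end: date
    data_format: str


@dataclass
class Use:
    confidentiality: str
    license: str
    contains_proprietary_data: bool


@dataclass
class ProvenanceRecord:
    source: Source
    provenance: Provenance
    use: Use

    def to_json(self) -> str:
        # Dates are not JSON-serializable by default, so emit them as ISO strings.
        return json.dumps(asdict(self), default=lambda d: d.isoformat(), indent=2)


def missing_fields(record: ProvenanceRecord) -> list[str]:
    """Return dotted names of fields left empty (None or an empty string)."""
    gaps = []
    for group, values in asdict(record).items():
        for name, value in values.items():
            if value is None or value == "":
                gaps.append(f"{group}.{name}")
    return gaps


# Made-up example values; the description is left blank on purpose.
record = ProvenanceRecord(
    source=Source(
        dataset_name="Example loyalty survey",
        dataset_url="https://example.com/datasets/loyalty-survey",
        issuer="Example Data Co.",
        description="",
    ),
    provenance=Provenance(
        origin_geography="US",
        issue_date=date(2024, 7, 1),
        generation_start=date(2023, 1, 1),
        generation_end=date(2023, 12, 31),
        data_format="CSV",
    ),
    use=Use(
        confidentiality="internal",
        license="Example commercial license",
        contains_proprietary_data=False,
    ),
)

print(missing_fields(record))
print(record.to_json())
```

A team receiving a dataset could run a check like `missing_fields` before the data enters its systems, and store the JSON alongside the dataset so the record follows it through later sharing and transformation.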

As technology and AI transform industries, organizations need a blueprint for evaluating the data that fuels these algorithms, said Christine Pierce, chief data officer, audience measurement at Nielsen.

“Through the collaboration of experts across multiple industries and disciplines, the D&TA Data Provenance Standards meet this need,” she explained. “The standards promote trust and transparency by surfacing critical metadata elements in a consistent way, helping practitioners make informed decisions about the suitability of data sources and applications.”

Who’s Behind the Data Provenance Standards?

The D&TA Standards were built by a working group of chief data officers, chief information officers, leaders in data strategy and other practitioners across more than 15 industries. These Alliance companies include:

AARP
American Express
Deloitte
Howso
Humana
IBM
Kenvue
Mastercard
Nielsen
Nike
Pfizer
Regions Bank
Transcarent
UPS
Walmart
Warby Parker

“Safe adoption of future AI tools will require trust and transparency in the data powering them,” said Thomas Birchfield, technical program manager at Transcarent. “Cross-industry collaboration toward a universal set of data provenance standards is a key component of leveraging data effectively and responsibly.”

What Do the Data Provenance Standards Mean for CX Leaders?

These new standards aren’t just changing the game for chief data officers and chief information officers. They also introduce new implications for marketing executives and customer experience leaders.

D&TA’s Data Provenance Standards increase transparency into who collects customer data and how it is collected, stored and utilized before it even enters the organization, according to Kristina Podnar, senior policy director at Data & Trust Alliance. “This fosters trust with customers, which is critical for organizations, specifically CMOs and chief customer officers/VPs of contact centers, who are increasingly held accountable for the appropriate handling of customer data.”

Beyond providing visibility into the type of data you’re acquiring, the standards also highlight potential risks, said Podnar.

“For example, if you are acquiring AdTech data for a new product launch, but a large percentage of the data is lookalike or generative synthetic data, it could skew your product targeting strategy,” she explained. “If this data is further ingested into AI models within the enterprise, they could collapse the AI model over time, thereby increasing legal and reputational risks and loss on investment.”

From a tactical perspective, she said, these standards ensure data is appropriately sourced and maintained, allowing marketing and CX leaders to create more personalized, contextually relevant customer experiences. “This, in turn, makes consumers feel understood and valued.”

And, in the contact center, Podnar added, reliable data means more effective call handling and issue resolution, reducing time and resource expenditures.

What’s Next for the Data & Trust Alliance?

Some organizations have already begun using the Data Provenance Standards. IBM, for instance, tested the standards as part of its clearance process for datasets used to train foundational AI models. The result? The company saw gains in both efficiency (time for clearance) and overall data quality.

The next step is to increase adoption among other companies. According to the D&TA, many data suppliers and producers shared their feedback on the standards, and now the Alliance plans to enlist these organizations as partners in adoption. It has the same goal for toolset providers, an effort that could make adoption easier.