Why the Apache Unomi Open-Source Customer Data Platform Is Worth a Look

By John Zimmerer
Developers should take note of the Apache Unomi open-source customer data platform, which recently passed a major milestone.

Customer experience (CX) demands personalization, and personalization requires access to a wide variety of customer data. Today, that data is commonly maintained in separate, siloed systems of record and engagement. However, marketers need a consolidated 360-degree view of customer data to personalize content and make relevant recommendations. Thus was born the customer data platform (CDP), a relatively new approach to master data management for CX data.

Much has been written about CDPs in the 18 months since the category first appeared. A previously published CMSWire article provides a great overview of CDPs, and I encourage you to read it and other articles published by CMSWire on the topic, including this one from Raviv Turner as well as this Econsultancy article.

In this article, I explore the state of the CDP market and highlight Apache Unomi, an open source CDP.

Customer Data Platform Market

The Customer Data Platform Institute, or CDP Institute for short, describes itself as a vendor-neutral organization dedicated to helping marketers manage customer data. The CDP Institute “educates marketers about the issues, methods, and technology used to manage customer data, with a special focus on Customer Data Platforms.”

The CDP Institute has compiled a list of CDPs that tracks commercially available offerings known to the Institute. As of this writing, the list had 87 entries. Note, however, that more CDPs exist in the marketplace than the list shows: for example, it does not currently include the recently announced offerings from Adobe or Salesforce.

The problem with a new category like CDP is that vendors rush to rebrand their products under the new moniker, regardless of how well (or not) those products fit within the category. Recent Winterberry Group research found that fewer than 20 of the more than 100 offerings they considered met the Group’s definition of a CDP (which is very similar to the CDP Institute’s):

Platforms that are able to ingest and integrate customer data from multiple sources; offer customer profile management; support “real-time” customer segmentation; and make customer data accessible to other systems.

Buyer beware, indeed.

An Open Source CDP Alternative

The main arguments against implementing a commercial CDP are the limited extensibility — vendors focus first, and sometimes exclusively, on integrating their own products — and the closed nature of those systems. These solutions often store data in a proprietary format, and vendor licensing and terms-of-use language can cloud the issue of who “owns” the customer data managed by the CDP.

Implementing an open source CDP is an interesting — and now viable — alternative to investing in a vendor’s proprietary offering.

Unpacking Apache Unomi

The Apache Software Foundation recently announced a major milestone in the Unomi project: it is now a Top Level Project, meaning it has graduated from the Apache Incubator; is now fully deployable; and is supported by a sufficiently large community of developers. The long and impressive list of Unomi project team members includes Adobe, Jahia, Red Hat and Talend employees, among others.

A Bit of Data Management History

The term master data management (MDM) has been around since at least 2004. While the overarching goal remains the same — to provide access to data to the systems that need to consume it — the approaches to MDM today are different.

A centralized MDM hub moves master data from source systems into a single data repository, consolidates it, cleanses it to remove errors and inconsistencies, and then distributes it to other systems. A registry-style hub creates an index of the data on source systems, and can do data matching and cleansing, but leaves data in the original systems. Hybrid MDMs combine these approaches: they create a reference to the original data sources but also serve as the primary data source for new applications.
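To make the difference between the first two hub styles concrete, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `CRM` and `SUPPORT` dictionaries stand in for source systems, and the `CentralizedHub` and `RegistryHub` classes are not taken from any MDM product or from Apache Unomi.

```python
from dataclasses import dataclass, field

# Hypothetical source systems, each keyed by its own record ID.
CRM = {"crm-17": {"name": "Ada", "email": "ada@example.com"}}
SUPPORT = {"sup-903": {"open_tickets": 2}}
SOURCES = {"crm": CRM, "support": SUPPORT}


@dataclass
class CentralizedHub:
    """Copies master data out of the sources into one repository."""
    repository: dict = field(default_factory=dict)

    def load(self, customer_id, links):
        merged = {}
        for system, record_id in links.items():
            merged.update(SOURCES[system][record_id])  # values are copied
        self.repository[customer_id] = merged

    def get(self, customer_id):
        return self.repository[customer_id]


@dataclass
class RegistryHub:
    """Stores only an index of where each customer's records live."""
    index: dict = field(default_factory=dict)

    def register(self, customer_id, links):
        self.index[customer_id] = dict(links)  # pointers only, no data

    def get(self, customer_id):
        merged = {}
        for system, record_id in self.index[customer_id].items():
            merged.update(SOURCES[system][record_id])  # read at query time
        return merged


links = {"crm": "crm-17", "support": "sup-903"}
central = CentralizedHub()
central.load("cust-1", links)
registry = RegistryHub()
registry.register("cust-1", links)

# A later change in a source system is visible through the registry,
# while the centralized copy stays as it was until it is reloaded.
SUPPORT["sup-903"]["open_tickets"] = 0
```

The trade-off shown here is the usual one: the centralized hub answers from its own copy and needs a reload to see changes, while the registry always reads current data but depends on the source systems being reachable.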

MDM requires data integration tools to extract data (or link to it) from individual data sources, (optionally) transform it, and then load it into the target system (ETL). ETL integration was commonly done “point to point,” meaning each source is integrated with the ETL tool one at a time. The newest generation of data management tools, including integration platforms-as-a-service (iPaaS) and lighter-weight integration software-as-a-service (iSaaS) offerings, can connect to and federate with multiple data sources to create a virtual database (VDB). They can even link to third-party data sources like social media and data providers such as credit reporting agencies. The integration tool then makes the data available as a service to consuming applications, often using RESTful APIs.

According to Elie Auvray, co-founder and head of business development at Jahia, the vision for Apache Unomi is to be “a hub that integrates with and completes other systems for digital marketing purposes rather than a centralized master storage for all customer data from all systems.”

To be a true registry-style data hub (see A Bit of Data Management History), a platform must let data flow easily through the hub, to and from the connected spokes. Having a standard way of getting and exposing data would make connecting spokes simpler. Apache Unomi is the industry's first reference implementation of the upcoming OASIS Context Server specification (editor's note: recently changed to the Customer Data Platform specification), which intends to provide an open interoperability standard for customer data, just as CMIS does for content stores.

Data privacy, protection and transparency are all hallmarks of Apache Unomi. According to Auvray, the software can aggregate customer profile information without the need for personally identifiable information (PII). Instead, Apache Unomi uses unique identifiers to relate records in the source information systems, for example, CRM ID corresponding to a support database ID. Per Auvray:
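The idea of relating records by identifier rather than by PII can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Apache Unomi's data model: the `ProfileLinks` class, the system names and the record IDs are all hypothetical.

```python
import uuid


class ProfileLinks:
    """Relates source-system record IDs to one opaque profile ID.

    Only identifiers are stored here; names, emails and other PII
    remain in the source systems.
    """

    def __init__(self):
        self._links = {}    # profile_id -> {system: record_id}
        self._reverse = {}  # (system, record_id) -> profile_id

    def link(self, system, record_id, profile_id=None):
        """Attach a source record ID to a profile, creating one if needed."""
        if profile_id is None:
            profile_id = str(uuid.uuid4())  # random, not derived from PII
        self._links.setdefault(profile_id, {})[system] = record_id
        self._reverse[(system, record_id)] = profile_id
        return profile_id

    def profile_for(self, system, record_id):
        """Return the profile ID a source record belongs to, or None."""
        return self._reverse.get((system, record_id))

    def record_ids(self, profile_id):
        """Return a copy of the source record IDs linked to a profile."""
        return dict(self._links[profile_id])


links = ProfileLinks()
pid = links.link("crm", "CRM-42")               # hypothetical CRM ID
links.link("support", "SUP-7", profile_id=pid)  # hypothetical support ID
```

With this mapping, a lookup that starts from the support database ID resolves to the same profile as the CRM ID, without the hub ever holding a name or an email address.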

Apache Unomi has built-in personal data protection capabilities (from the customer’s point of view) such as consent management, data anonymization and right-to-be-forgotten capabilities as required by new regulations (e.g., as defined by European GDPR regulations and in California’s Erasure Law). Using Apache Unomi APIs, developers can build application features and UIs for managing and controlling what data are collected, whether or not visitors must consent to it, and (eventually) to anonymize/delete it.
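As a rough illustration of what consent-gated collection and anonymization mean in practice, consider the following Python sketch. It does not use or reproduce the Apache Unomi APIs: the `Profile` class, the `record_event` and `anonymize` functions, and the list of personal fields are assumptions made for this example.

```python
from dataclasses import dataclass, field

# Assumed list of properties treated as personal data in this sketch.
PERSONAL_FIELDS = {"firstName", "lastName", "email"}


@dataclass
class Profile:
    profile_id: str
    properties: dict = field(default_factory=dict)
    consents: dict = field(default_factory=dict)  # purpose -> True/False


def record_event(profile, purpose, key, value):
    """Store a property only if the visitor consented to this purpose."""
    if not profile.consents.get(purpose, False):
        return False  # no consent on file: collect nothing
    profile.properties[key] = value
    return True


def anonymize(profile):
    """Drop the personal fields and keep the non-identifying data."""
    for name in PERSONAL_FIELDS:
        profile.properties.pop(name, None)


visitor = Profile("p-1", {"email": "ada@example.com", "visits": 3})

# Before consent is granted, nothing is collected.
stored_before = record_event(visitor, "tracking", "lastPage", "/pricing")

# After the visitor grants consent, the same event is stored.
visitor.consents["tracking"] = True
stored_after = record_event(visitor, "tracking", "lastPage", "/pricing")

# A right-to-be-forgotten request removes the personal fields.
anonymize(visitor)
```

A real implementation would also need to propagate the deletion to connected systems and keep an audit trail of consent changes, which this sketch leaves out.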

Buying vs. Building Your CDP

Let's be clear: Apache Unomi is not intended for business users and is not commercial off-the-shelf software. Instead, it is a “headless” CDP, designed for corporate and commercial software developers as an alternative to licensing another vendor’s CDP or building their own, when a CDP is intended to be a layer of a larger, service-oriented (API-driven) digital experience (DX) software platform.

That said, Apache Unomi provides a rich set of CDP functionality that is very attractive to developers. And using an open source CDP means developers can easily understand, improve or extend the CDP without having to wait on a third-party vendor. It also allows developers to leverage their peer community that shares the same willingness to build quality software and which can collectively bear the effort and the cost of that development.

Apache Unomi Use Cases

Jahia was the primary contributor to the Apache Unomi project, and is eating its own dog food, so to speak. Its Digital Experience Manager and Marketing Factory products rely upon Apache Unomi to get the data needed to build better personalization and more efficient content optimization. Other developers can use Apache Unomi and develop custom plug-in extensions to solve for these use cases:

  • Privacy and consent management.
  • Visitor/customer profile management.
  • Audience/persona segmentation.
  • A/B testing of content.
  • User/event/goal tracking.
  • Reporting.

On a simpler scale, DX applications can become consumers of the customer data managed by Apache Unomi. For example, INTOUCH, the cloud-based customer communication management software from my company, Topdown, is engineered to consume customer data via a REST call for use when personalizing communications. While Topdown anticipated the REST call would be published by a data integration tool, INTOUCH could just as easily ingest data from Apache Unomi. That data would almost certainly be richer than what we typically see, and would allow customers to use information about devices, location and other contextual data as:

  • Event-based triggers to send communications (e.g., provide marketing or support information when a certain web page is viewed).
  • Inputs to conditional logic (e.g., determining which channel to format a communication for).
  • Comments to include in the content of a communication (e.g., “we see you’re traveling, so we’ll automatically approve any charges made in |*country*|”).
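The three uses above can be sketched in Python as follows. The payload shape, the field names and the rules are hypothetical; they do not reflect the actual response format of Apache Unomi or the behavior of INTOUCH, and a real application would fetch the profile over REST instead of hard-coding it.

```python
# Hypothetical profile payload, standing in for the result of a REST call.
profile = {
    "properties": {"firstName": "Ada", "homeCountry": "US"},
    "context": {
        "device": "mobile",
        "country": "France",
        "lastPageViewed": "/support/returns",
    },
}

TRAVEL_TEMPLATE = (
    "We see you're traveling, so we'll automatically approve "
    "any charges made in |*country*|."
)


def should_send_support_info(profile):
    """Event-based trigger: fire when a support page was just viewed."""
    return profile["context"].get("lastPageViewed", "").startswith("/support/")


def choose_channel(profile):
    """Conditional logic: pick the channel from the visitor's device."""
    return "sms" if profile["context"].get("device") == "mobile" else "email"


def travel_notice(profile):
    """Content: fill the merge field only when the visitor is abroad."""
    country = profile["context"].get("country")
    if country is None or country == profile["properties"].get("homeCountry"):
        return None  # not traveling, so no notice
    return TRAVEL_TEMPLATE.replace("|*country*|", country)


trigger = should_send_support_info(profile)
channel = choose_channel(profile)
message = travel_notice(profile)
```

Keeping each use as a small function over the same profile payload means the communication logic does not need to know where the data came from, only what shape it has.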

If you’re an in-house developer, or you work for a software vendor, and you need to solve for any of the use cases above, then consider looking into Apache Unomi. Using the open source CDP could put you on a faster path to improving the data privacy, protection and personalization for your customers.

About the author

John Zimmerer

John Zimmerer is the senior director of marketing at Topdown, where he leads market research and outreach efforts for the company's customer communications and customer experience products. Most recently, John has been researching and writing about the future direction of the technologies that power customer experience, and is regarded as a thought leader in this area.