The brands that are succeeding today are the ones that give people the information they want, when they want it, in a format that is meaningful to them. The ability to meet these consumer demands is the driver behind data integration. And the complexity of transforming data so consumers can share and access their personal information across all the channels is growing. In our world, we see projects that pull data from well over 30 different sources.
Cleaning Up the Data
Data integration involves a lot of details to address (systems, transmission, structure, security, etc.), but when it comes to pleasing the consumer, the biggest and most important task is normalizing the data. While there are many transmission and file format standards, standards generally do not exist for the data values themselves.
Let’s look at one small example of the challenges a data integration team faces: One source may store personal titles as Mr or Ms, another may store the same titles as Mr. or Miss, and a third may use mr and ms. If the data isn’t stored in exactly the same way, it requires cleanup.
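To make the title example concrete, here is a minimal Python sketch of that kind of cleanup. The function name normalize_title and the CANONICAL_TITLES mapping are hypothetical, and treating Miss as equivalent to Ms is an assumption that follows the example above; a real project would set those rules from its own business requirements.

```python
import re

# Hypothetical canonical forms. Mapping "miss" to "Ms" is an assumption
# taken from the example above, not a universal rule.
CANONICAL_TITLES = {
    "mr": "Mr",
    "ms": "Ms",
    "miss": "Ms",
}


def normalize_title(raw):
    """Return the canonical title, or None when the value is not recognized."""
    # Lowercase the value and drop everything except letters, so that
    # "Mr.", " mr " and "MR" all reduce to the same lookup key.
    key = re.sub(r"[^a-z]", "", raw.strip().lower())
    return CANONICAL_TITLES.get(key)
```

Returning None for unrecognized values, rather than guessing, lets the integration team route those records to manual review.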
To break down silos of information, you need to be prepared for a fair amount of processing and data scrubbing because each source is typically governed by a different department, and in some cases a different company. Each entity has its own specifications and way of storing the data.
With more than 100 national and international standard-setting bodies and just as many standards out there, picking the right standard(s) to extract and load your data can be a project in and of itself. And picking one doesn’t mean that the other entities serving the customer are selecting the same standards. This is why industry standards, and choosing which standard to follow, are so important.
Picking the Right Standards
How do you pick the right standards? In a dream world, you would select one standard and everyone in your data ecosystem would follow it. However, living in a dream world doesn't usually work. Choosing a standard requires understanding and focusing on the business requirements. What is the task at hand? What data will you be consuming? If you have a lot of data crunching (say, 100 million rows of data), ETL (Extract, Transform, Load) may be the best approach because it is typically easier to process the data locally. If you simply need one row or record, a SOAP or RESTful web service may be the best answer. If you require real-time transactional data such as payments, then web services standards are a better fit.
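As a rough illustration of the batch side of that choice, the following Python sketch runs a tiny extract, transform and load pass using only the standard library. The accounts table, the name and balance columns, and the run_etl function are all hypothetical; a real pipeline at the 100-million-row scale would stream the input in chunks rather than hold it in memory.

```python
import csv
import io
import sqlite3


def run_etl(csv_text, conn):
    """Extract rows from CSV text, clean them, and load them into SQLite."""
    # Extract: parse the source into dictionaries keyed by column name.
    rows = csv.DictReader(io.StringIO(csv_text))
    # Transform: trim stray whitespace and convert balances to numbers.
    cleaned = [(row["name"].strip(), float(row["balance"])) for row in rows]
    # Load: insert every cleaned row in a single transaction.
    conn.execute("CREATE TABLE IF NOT EXISTS accounts (name TEXT, balance REAL)")
    with conn:
        conn.executemany("INSERT INTO accounts VALUES (?, ?)", cleaned)
    return len(cleaned)


# Usage with an in-memory database and two made-up sample rows.
conn = sqlite3.connect(":memory:")
loaded = run_etl("name,balance\n Ana ,10.50\nBen,3\n", conn)
```

By contrast, a single-record lookup would usually be one request to a web service, with no local processing step at all.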
Many real-time APIs, such as those published by Twitter, Flickr and Amazon, employ SOAP or RESTful standards and frameworks. A word of caution: Even when you are using web services standards to build a high-speed system integration with third parties, those third parties can (and in our experience will) negatively impact your response time.
Standards help you get to market faster and create solutions that are more reliable. However, standards are not a magic wand.
There is no such thing as a “standard” integration. Even within the same company, you may find different versions of applications and different uses for the data, which then require using different standards.
The reality is that big data is just as complex as the integration itself. Big data includes far more data points than most businesses need. Accessing it has limited value, and integrating it comes with a large price tag.
From a Big (Data) Problem to a Small (Data) Solution
The good news is small data really holds the majority of the relevant, more personal information that creates a great customer experience. Unlike big data, which requires significant analysis (time) to uncover patterns and insights that may or may not be useful, small data is specific and can be real time.
Small data provides information such as location, whether an item has been opened, payments, balances, website visits, downloads, etc. It is far more actionable. It is also much easier to mine, manage and use.
Because nothing is standard about standards when it comes to data integration, the simplicity of small data may spell business relief in time, money and outcomes.