The Gist

  • Data Industrialization. Modernizing the data stack involves "industrializing data," creating a modular and agile organization that makes data a strategic asset accessible to all who need it, eliminating data silos and proprietary management systems.
  • Data-driven transformation. A modern data stack is foundational for attaining business outcomes from digital transformation and AI, enabling a business to be data-driven, gain insights faster and unlock the value of digital assets to enable innovation.
  • Silos and spaghetti. The biggest challenge for legacy organizations is silos and spaghetti. The proposed remedy is a data mesh that resolves the data mess and overcomes tech debt and data silos, together with a data culture that guides the data stack you build.

There are numerous reasons why organizations pursue a "modern data stack." Some aim to move away from a "silos and spaghetti architecture" described in "Future Ready," while others aim to establish an AI factory and its supporting data marketplace, as described by Marco Iansiti and Karim Lakhani. CMOs are interested in creating omnichannel engagement with customers for their desired products and services. Data practitioners and CIOs are left to ponder the implications in Andreessen Horowitz's 2020 analysis of modern data infrastructures and what the impact will be on businesses. 

Where Do Organizations Need to Modernize Their Data Processes the Most?

The authors of "Future Ready" describe creating a modern data stack as "industrializing data" and see it as crucial in making data a strategic asset that is accessible to all who need it, resulting in a modular and agile organization. A reusable and modular data platform that fixes integration, cleans data and provides a single view of the customer is necessary for success and requires an industrialized data foundation.

As a goal, modern tools should get organizations out of tedious, expensive data governance, mapping and classification work. At the same time, they should eliminate the siloing of data in multiple proprietary data management systems. According to Craig Milroy, former chief data architect for TD Bank, “The data lake architecture on Hadoop was the modern data stack not too long ago; now it is data mesh and data lake houses. While I am all for the decommissioning of Hadoop, I think organizations should think through the business capability enablement in selecting the next data stack.” Miami University CIO David Seidl goes on to suggest, “This all really depends upon the organization, its scale, maturity and business needs. But a good general answer remains data culture. Data culture still is one of the things you need to build to guide the data stack you build.”

What Business Outcomes Do You Hope a Modern Data Stack Will Deliver?

Many CIOs believe that starting with clearly defined business outcomes is essential. A modern data stack is foundational to attaining business outcomes from digital transformation and AI. It is the central nervous system of an organization. Attempting to transform digitally on legacy data stacks will impede the transformation and increase technical debt within the organization. Put simply, a modern data stack should enable a business to be data-driven, to gain insights faster, to unlock the value of digital assets and to innovate. It is the starting point for digital transformation. As Milroy puts it, “a modern data stack should result in less data silos, less tech debt, more data exchange (internal/external), self-service data access, and data governance (understood data including data quality); and exceed business expectations.”

Should a Modern Data Stack Be in the Cloud and Your Data Center as Well?

The Andreessen Horowitz architecture is clearly slanted toward cloud solutions. However, former BusinessWeek CIO Isaac Sacolick says, “There isn't a universal answer to a CIO's data management strategy. That said, in my opinion, most companies will use public clouds for analytics and edge when there's a performance and cost benefit. Large enterprises will shift consistent workloads, ETLs for example to data clouds when it's cheaper. Data platforms are a moving target, and CIOs will always have a spectrum of legacy, operating and emerging data platforms. That's a factor today in how CIOs manage data centers and the cloud, ultimately becoming a multicloud and distributed data strategy.”

Seidl agrees: “For many organizations, the cloud makes business sense. There are some absolutely magical tools that scale incredibly well in the cloud. For really large organizations, it may be more efficient to run the big stuff yourself. For this caliber of organization, this is a large scale, 24x7 thing.”

However, Milroy suggests that larger organizations will in the end move to the cloud as well. He says the cloud-centric modern data stack is the only option. “It will allow future data practitioners to decommission and reshape the errors made at this attempt of a modern data stack; without the on-premise data migration pain.” Constellation Research's VP and Principal Analyst Dion Hinchcliffe adds, “It clearly depends. That said, CIOs need a way to systematically create and manage a data fabric across all clouds, with local variation only when required. Some parts of the data stack may indeed be local. Some global.”

What Is Getting in the Way of Legacy Organizations?

Clearly, it is silos and spaghetti. One CIO calls it integration spaghetti sprawl, and it is a major problem at many organizations. Milroy says, “All of this sounds like a data mess that data mesh is supposed to resolve.” He added that he would be surprised to learn the percentage of people who acknowledge the significance of tech debt and data silos. Fortunately, MIT-CISR has determined that it's most organizations: 51%. Why is it so bad? Seidl says it's because of “habits, territorialism, lack of investment, lack of someone being responsible or having authority to make things happen, and a lack of organizational habits of using data appropriately and thus needing data across silos to force the issue are all common.”

Where Should a Modern Data Stack Be in a CIO’s Priority List?

In the end, building an effective modern data stack is like cardio conditioning for sports teams. It doesn't get the headlines per se, but it is how championship teams win. That is why data should be the number one priority. As a famous commercial once said, "Quality is Job One." You may not delight a customer or employee with good, clean and available data, but there are infinite ways for organizations to lose with bad data or data breaches. No wonder there are so many data quality and observability vendors.

For Seidl, “Everything depends ... but if it's not job one, you probably have pretty serious existential issues or technical debt that are pushing it down in priority.” For this reason, Milroy says, “CIOs or CDOs need to own the data platform and its data product. If a CIO still owns a Hadoop cluster, you might want to accelerate the decommissioning effort and reshape the data estate into the cloud.”

Parting Words on Modern Data Stacks

The rationale for upgrading the data stack extends to recessions as well. As I proposed in my article "Recessions: How CMOs Can Turn Hard Times Into Growth," data is instrumental in enabling sound decision-making and developing data models. Research by Harvard Business Review also demonstrates that organizations that revamp their data systems and decentralize decision-making during recessions tend to perform better. Ultimately, a modern data stack should provide these advantages.
