Recently released research suggests that many organizations are not fully satisfied with the performance and output of their data management and data warehousing initiatives. Published by San Mateo, Calif.-based SnapLogic, which develops a platform-as-a-service (PaaS) offering for integrating cloud data sources, the research indicates that many IT leaders are struggling to manage a growing number of disconnected applications and data sources, outdated legacy systems, and slow, manual data processes, all of which are blocking progress and costing organizations millions of dollars.
The research, which Vanson Bourne carried out for SnapLogic among 500 IT decision makers in the US and UK, looked at why data management projects fail. It found that the average organization has 115 distinct applications and data sources across its enterprise, but almost half of them (49%) are siloed and disconnected from one another. Titled The State of Data Management – Why Data Warehouse Projects Fail, the report showed that:
- 83% of organizations are not fully satisfied with the performance and output of their data management and data warehousing initiatives.
- 89% of IT Decision Makers (ITDMs) from organizations where these apps and systems are not integrated worried that these data silos are holding them back.
- ITDMs confirmed they are losing, on average, more than $1 million annually due to poor data management.
- Over three quarters (76%) of survey respondents indicated that their companies have increased their data budgets over the past year.
- Nearly nine in ten (88%) ITDMs experience challenges trying to load data into data warehouses, with the biggest inhibitors being legacy technology (49%), complex data types and formats (44%), data silos (40%), and data access issues tied to regulatory requirements (40%).
In effect, the research underlines what more and more organizations are finding, especially during the current epidemic: despite years spent trying to develop data management strategies, many are still struggling, which among other things leaves them vulnerable to attack.
Data Management at the Enterprise Core
Heikki Nousiainen, CTO at Finland-based Aiven, a provider of open source database and backend messaging systems, points out that managing data is no longer an academic problem but one with very real implications for the digital workplace. Data and data management are increasingly moving to the very core of modern business. The companies best equipped to use data to focus and drive the business have a great competitive advantage in the market, he told us.
Making data available, in terms of both access and, perhaps even more importantly, discovery, ensures the business can reap the benefits of data-driven decisions and potential innovations based on insights gathered from collected data. Two key elements of data management need to be balanced in every organization: modern data management is as much about keeping data secure as it is about keeping it accessible and usable for business benefit. “Striking the right and thought out balance on protection and access to each dataset is of paramount [importance] to business success,” he said.
Each business should consider data management strategies at an executive level. Data governance models can help define and sharpen data classification schemes from both a data security and a risk management perspective. These efforts should also ensure that all relevant data is catalogued along with its data formats and access policies. Data catalogues are an important tool both for enforcing the right level of protection based on the type of data collected or generated and for facilitating the reuse of existing data points for business benefit.
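As a rough illustration of the catalogue idea, the following Python sketch records each dataset with its format, a classification level and the roles allowed to read it, then uses the same entries for both protection and discovery. The class names, classification levels, dataset names and roles are all hypothetical, chosen for illustration rather than taken from any particular catalogue product.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical sensitivity levels; a real scheme would come from the governance model."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


@dataclass
class CatalogEntry:
    """One catalogued dataset: what it is, how it is stored, who may read it."""
    name: str
    data_format: str
    classification: Classification
    allowed_roles: set = field(default_factory=set)


class DataCatalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def can_read(self, role, dataset):
        """Protection: decide access from the catalogued classification and policy."""
        entry = self._entries.get(dataset)
        if entry is None:
            return False  # uncatalogued data is denied by default (an assumption)
        if entry.classification is Classification.PUBLIC:
            return True
        return role in entry.allowed_roles

    def discover(self, role):
        """Discovery: list the datasets a given role is allowed to reuse."""
        return sorted(name for name in self._entries if self.can_read(role, name))


catalog = DataCatalog()
catalog.register(CatalogEntry("web_traffic", "csv", Classification.PUBLIC))
catalog.register(CatalogEntry("customer_pii", "parquet", Classification.RESTRICTED, {"compliance"}))
catalog.register(CatalogEntry("sales_pipeline", "json", Classification.INTERNAL, {"sales", "analyst"}))

print(catalog.discover("analyst"))  # ['sales_pipeline', 'web_traffic']
```

The point of the sketch is that a single catalogue entry serves both sides of the balance: the same record that blocks an analyst from restricted data also tells that analyst which datasets are available for reuse.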
In sum, data security is only one side of the coin in shaping a company’s data management strategy. Data accessibility is equally important, but it needs to be handled in a controlled manner. As such, data management strategies need to be shaped and agreed upon by executive teams together, capturing both the opportunities and the risks, and simply cannot be left solely as a CISO, CIO or legal responsibility.
Who 'Owns' Data Management?
While data access and data security should be a top priority for any organization, where responsibility for these elements of a data management strategy lies has been a topic of never-ending debate. In the old days, two teams were primarily responsible for data security: database engineering and security. That is no longer the case today, with DevOps, data analytics, cloud architecture and many other teams in the picture, said Manav Mital, cofounder and CEO of Redwood City, Calif.-based Cyral, a cloud security platform provider. While it is easy to blame that tech sprawl for the worsened security posture, there is an even bigger underlying issue at the heart of the problem: the speed at which these teams now operate.
Thanks to cloud and infrastructure as code, data repositories get spun up and provisioned in a matter of minutes. As a result, not only do teams have more repositories to monitor, which is a massive challenge in and of itself, but the repositories themselves are also public by default. To use the example of Snowflake, which went public this summer, the URL of its repository can be accessed from anywhere in the world. This lack of visibility at scale, and the speed at which teams operate, are the reasons data breaches are on the rise. “There are companies that absolutely get how big of a problem that is. What can be seen in these companies is the recognition that data management strategies should be created in the context of these modern software development methodologies like agile and DevOps,” he said.
One of the more promising developments is the evolution of DevOps practices into DevSecOps, which bakes in security at every step of the development lifecycle and enables drastically more secure software development at the speed of agile and DevOps.
Tight collaboration between development and security teams is paramount. With new practices like DevSecOps and Security as Code, the industry is making good progress in that direction.
Key Elements of Data Management
So how do you build such a strategy? Matt Bertram, CEO and SEO strategist of Houston-based EWR Digital, said that, in sum, a good data management strategy needs to implement procedures to collect, prepare, store and distribute data.
Organization leaders need to identify data sources. These may be a combination of internal and external assets. Consider how the data will be collected and whether it will be structured or unstructured.
Organizations need to clean and transform their raw data before it can be analyzed. Guidelines are required for naming data, documenting lineage and adding metadata tags for easy access.
Decide where data will be stored and whether you will use XML, CSV or relational databases for structured data. Consider if you need a data lake for unstructured data.
Devise a process for distributing the data to the teams and departments that need it. Facilitate access and analysis for users and devise communication strategies for data insights.
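The four steps above (collect, prepare, store, distribute) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a recommended architecture: the sample CSV, the column names, the `crm_export` lineage tag and the function names are all invented for the example, and Python's built-in `sqlite3` module simply stands in for whichever relational database an organization actually chooses.

```python
import csv
import io
import sqlite3

# Invented sample data with the kinds of problems raw exports tend to have:
# inconsistent header names, stray whitespace, mixed case, a missing value.
RAW_CSV = """Customer Name,Email ,Region
 Ada Lovelace ,ADA@EXAMPLE.COM,emea
Grace Hopper,grace@example.com,amer
,missing@example.com,apac
"""


def collect(raw_text):
    """Collect: read structured rows from a source (here an in-memory CSV)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def prepare(rows, source):
    """Prepare: apply a naming convention, clean values, tag each row with its source."""
    cleaned = []
    for row in rows:
        # Naming guideline (assumed): lowercase snake_case column names.
        record = {k.strip().lower().replace(" ", "_"): (v or "").strip()
                  for k, v in row.items()}
        if not record["customer_name"]:
            continue  # drop rows missing a required field
        record["email"] = record["email"].lower()
        record["region"] = record["region"].upper()
        record["source"] = source  # minimal lineage metadata
        cleaned.append(record)
    return cleaned


def store(records, conn):
    """Store: persist the prepared records in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(customer_name TEXT, email TEXT, region TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (:customer_name, :email, :region, :source)",
        records,
    )
    conn.commit()


def distribute(conn, region):
    """Distribute: give a team only the slice of data it needs."""
    cursor = conn.execute(
        "SELECT customer_name, email FROM customers "
        "WHERE region = ? ORDER BY customer_name",
        (region,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
store(prepare(collect(RAW_CSV), source="crm_export"), conn)
print(distribute(conn, "EMEA"))  # [('Ada Lovelace', 'ada@example.com')]
```

Keeping each step as its own function makes it easier to assign ownership, for example to the team that collects the data versus the team that stores it.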
Data management should be a responsibility shared across the organization by every team that touches the data. This encompasses the marketers and salespeople who enter the data, the analysts who interpret it, the developers who gather it and the database admins who store it.
“Remember that every touchpoint has the potential to corrupt your data, and that data management must be a collective responsibility,” he said.