Information governance. Are IT, legal, security, compliance and records managers sick of this phrase yet? Do your eyes glaze over when your inbox fills with recommendations from supposedly authoritative strangers on information governance, but the authors forget to put in the details on how exactly to implement it? 

So, what’s this article about? Information governance. And this article won't tell you how either.

Why Do We Need Information Governance?

Ninety percent of businesses don’t know what their data contains. Many organizations tend to keep unstructured content, in some cases forever. Most don’t know what evil lurks in the dungeons of abandoned servers, and couldn’t find it if they tried. And yet they keep it, migrate it and buy more storage to hold it.

Unfortunately, with unstructured content expected to grow by anywhere from 62 percent to a staggering 800 percent over the next five years, petabytes of storage start to add up. One petabyte will cost you $5 million per year, just to keep it plugged in and available. Keeping everything also creates organizational risk, including challenges in e-discovery, litigation, data privacy, security, migration and compliance. Now you’re getting into some very risky business.
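To put rough numbers on that, here is a back-of-the-envelope sketch. It reads the figures above as total growth of 62 to 800 percent over five years and assumes the $5 million per petabyte per year stays flat. Both are assumptions, as is the hypothetical starting volume of one petabyte.

```python
# Figure quoted in the article: annual cost of keeping one petabyte online (USD).
COST_PER_PB_PER_YEAR = 5_000_000


def projected_annual_cost(start_pb: float, growth_pct: float) -> float:
    """Annual storage cost once content has grown by growth_pct percent.

    Assumes a flat cost per petabyte, which is a simplification.
    """
    end_pb = start_pb * (1 + growth_pct / 100)
    return end_pb * COST_PER_PB_PER_YEAR


# Hypothetical starting point: one petabyte today.
low = projected_annual_cost(1.0, 62)    # low end of the quoted growth range
high = projected_annual_cost(1.0, 800)  # high end of the quoted growth range

print(f"Low estimate:  ${low:,.0f} per year")
print(f"High estimate: ${high:,.0f} per year")
```

Under those assumptions, one petabyte today turns into a bill of roughly $8.1 million to $45 million a year in five years' time.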

Why does this happen? Why don’t the majority of organizations solve the problem? Who’s to blame? 

Everyone. 

IT wants to delete it, or at best, archive it. Legal’s situation gets iffy if they have to prove defensible disposal. Business professionals think "maybe I might need that someday," and records managers, who typically own the process, are usually prevented from making the decision. Last to come to the table are those responsible for data privacy, compliance, information security and even marketing. Content remains mismanaged, and dark data grows and grows until no one remembers why it was saved, or that it even exists. That is potentially a serious liability exposure for the organization.

So that’s the crux of the problem — functional groups within the organization can’t seem to come to the table and develop a consensus, processes and policies to delete content that no longer has value. Unfortunately, this increases the likelihood of exactly what they are trying to avoid — higher operating costs, increased organizational risk and unabated growth of content assets that are of no value. 

Without transparency into compliance, legal due diligence and security requirements, IT is left with two choices: treat all content assets as valuable, or risk deleting content that actually is. Over-managing information is a gross waste of capital resources, yet under-managing it is no cheaper: the average cost of a security breach in 2015 was $3.9 million, plus the related costs of remediation and of lost brand integrity and customer confidence.

Distributed locations and environments, including the cloud, have complicated the problem. IT is overwhelmed by tasks, and underwhelmed by resources. The metadata that end users add manually is often missing, subjective or just plain wrong. Which metadata will they pick from a drop-down list? Statistically, the first option. And if end users can’t find what they are looking for, what do they do? They either recreate the content or give up on what they set out to do. Yet 92 percent of organizations depend on their end users to tag content correctly, which leaves that metadata marginally useful, if at all.

Choices and History Lessons

What’s the choice? Fix it or live with it. Many organizations just don’t know how to fix it. Surprisingly, many don’t even understand metadata. 

Allow me a short digression, with a bit of a history lesson. Some 2,300 years ago, a man named Aristotle came up with taxonomy, the practice and science of classification. He had the idea of a taxonomy for plants and animals. Others came along throughout the centuries and made their own contributions to the theory and practice of classification and taxonomies. It wasn’t until the 1700s that Carl Linnaeus, now known as the "Father of Taxonomy," developed an overall framework for classifying all animals and plants.

Taxonomies are still alive; perhaps not quite thriving, but alive. They are not a new breakthrough technology, but a proven one. Machine learning, currently the phrase of the day in the media, is said to be the solution to all our search and findability problems. It has been around for 62 years. Next in line is enterprise search: 55 years old, and it still doesn’t work. The biggest bang for your buck is still auto-classification and taxonomies.

Metadata, auto-classification and taxonomies address lifecycle content management, including unstructured and semi-structured content across the entire organization, regardless of where the content resides. Effective tools include those that perform metadata generation and auto-classification, and that make it easy to develop one or more taxonomies to support the enterprise or functional groups, for example to identify and protect private and confidential data. The tools should be flexible, provide an enterprise framework for reuse and repurposing, and supply metadata to any application that needs it. That means not only inventorying and classifying content, but also search, records management, security, migration, text analytics, e-discovery, litigation support, FOIA and collaboration, all from the same enterprise metadata repository. It’s not a magic bullet, but nowadays, what is?
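For readers who want to see the idea rather than take it on faith, here is a deliberately tiny sketch of auto-classification against a taxonomy. Everything in it is hypothetical: the two-level taxonomy, the regular-expression rules and the classify function are illustrations only, not how any particular product works. Real tools use far richer techniques than pattern matching.

```python
import re

# Hypothetical two-level taxonomy: category -> term -> trigger patterns.
# The patterns are illustrative and far too crude for production use.
TAXONOMY = {
    "Privacy": {
        "US Social Security Number": [r"\b\d{3}-\d{2}-\d{4}\b"],
        "Email Address": [r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"],
    },
    "Legal": {
        "Contract": [r"\bagreement\b", r"\bhereinafter\b"],
    },
}


def classify(text: str) -> dict[str, list[str]]:
    """Generate metadata for a document: category -> matching taxonomy terms."""
    tags: dict[str, list[str]] = {}
    for category, terms in TAXONOMY.items():
        for term, patterns in terms.items():
            # A term applies if any one of its patterns appears in the text.
            if any(re.search(p, text, re.IGNORECASE) for p in patterns):
                tags.setdefault(category, []).append(term)
    return tags


sample = "This Agreement lists employee SSN 123-45-6789."
metadata = classify(sample)
print(metadata)
```

The point of the design is that the generated tags live apart from the document and no end user had to pick anything from a drop-down list. The same metadata could then feed search, records management or a privacy review.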

From a technical point of view, information governance should be an infrastructure component consisting of tools that help the implementation of automated and consistent processes for companies of all sizes and all industries. The most successful organizations are those that believe digital assets are of value and must be proactively managed to achieve quantifiable benefits. 

It isn’t easy, but through a combination of technology, planning and people, as well as taking a step-by-step approach, it is certainly doable and well worth the effort. Will it ever be perfect? No, because people will always find a way around anything. But technology is available that minimizes, and in some cases eliminates, the human factor. This is what paves the road to success.

Title image: Rembrandt's Aristotle with Bust of Homer